dbt Flashcards
(63 cards)
What informs dbt about the context of a project?
A dbt project
A dbt project specifies how to transform data and build datasets.
What is the minimum requirement for a dbt project?
The dbt_project.yml project configuration file
dbt uses this file to recognize a directory as a dbt project.
What does the models resource in a dbt project do?
Transforms raw data into a dataset ready for analytics or serves as an intermediate step
Each model lives in a single file, typically a .sql file containing one SELECT statement.
What are snapshots used for in a dbt project?
Capturing the state of mutable tables for later reference
Snapshots record how rows change over time, implementing type-2 slowly changing dimensions.
What are seeds in a dbt project?
CSV files with static data that can be loaded into the data platform
Seeds provide a way to incorporate static datasets.
What purpose do data tests serve in a dbt project?
SQL queries to test the models and resources in your project
Data tests ensure data quality and integrity.
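As a rough illustration of the card above, dbt's built-in generic data tests can be declared in a properties file next to the model. The model name `orders` and column `order_id` are hypothetical; the `data_tests:` key applies to dbt 1.8 and later, while earlier versions use `tests:`.

```yaml
version: 2

models:
  - name: orders            # hypothetical model name
    columns:
      - name: order_id      # hypothetical column
        data_tests:
          - unique          # fails if any order_id value repeats
          - not_null        # fails if any order_id is NULL
```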
What are macros in a dbt project?
Blocks of code that can be reused multiple times
Macros are written in Jinja and work much like functions in other programming languages, which reduces repeated code.
What is the role of docs in a dbt project?
Documentation for your project that dbt can build into a browsable site
Docs help in understanding and maintaining the project.
What do sources represent in a dbt project?
A way to name and describe the data loaded into your warehouse by Extract and Load tools
Sources provide clarity on data origins.
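A minimal sketch of a source declaration, assuming a hypothetical source called `jaffle_shop` whose tables were loaded into a `raw` database; all names here are illustrative rather than taken from the deck.

```yaml
version: 2

sources:
  - name: jaffle_shop       # hypothetical source name
    description: Raw data loaded by an Extract and Load tool
    database: raw           # hypothetical database
    schema: jaffle_shop     # hypothetical schema
    tables:
      - name: orders
        description: One record per order
      - name: customers
```

Models can then select from these tables with `source('jaffle_shop', 'orders')` instead of hard-coding the table name.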
What are exposures in a dbt project?
A way to define and describe a downstream use of your project
Exposures help track how data is utilized post-transformation.
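An exposure might look like the following sketch. The dashboard name, owner, and the `orders` model are hypothetical.

```yaml
version: 2

exposures:
  - name: weekly_sales_dashboard    # hypothetical exposure name
    type: dashboard
    owner:
      name: Analytics Team          # hypothetical owner
      email: analytics@example.com
    depends_on:
      - ref('orders')               # hypothetical upstream model
```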
What is the purpose of metrics in a dbt project?
A way to define metrics for your project
Metrics provide quantifiable measures for analysis.
What do groups enable in a dbt project?
Collaborative node organization in restricted collections
Each group has an owner, and a model marked private can only be referenced by other members of its group.
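A sketch of a group and a private model assigned to it. The group name `finance`, its owner, and the model name are hypothetical.

```yaml
version: 2

groups:
  - name: finance                   # hypothetical group name
    owner:
      name: Finance Team            # hypothetical owner
      email: finance@example.com

models:
  - name: finance_private_model     # hypothetical model name
    access: private                 # only members of the group may ref it
    config:
      group: finance
```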
What is the analysis resource in a dbt project used for?
A way to organize analytical SQL queries in your project
An example is a general ledger query over QuickBooks data. dbt compiles analyses but does not build them in the warehouse.
What do semantic models define in a dbt project?
Foundational data relationships in MetricFlow and the Semantic Layer
They enable querying metrics using a semantic graph.
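A minimal semantic model could be sketched as below. The model `orders` and its columns are hypothetical, and the exact set of supported keys depends on the dbt and MetricFlow version in use, so treat this as an outline rather than a reference.

```yaml
semantic_models:
  - name: orders                    # hypothetical semantic model name
    model: ref('orders')            # hypothetical dbt model it is built on
    defaults:
      agg_time_dimension: ordered_at
    entities:
      - name: order_id              # join key in the semantic graph
        type: primary
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_total
        agg: sum
```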
What are saved queries in a dbt project?
A way to organize reusable queries by grouping metrics, dimensions, and filters into nodes
Saved queries enhance efficiency by reducing repetitive query writing.
What should be considered when building the structure of a dbt project?
Impacts on workflow such as command execution paths, navigation, and model configuration
These factors influence usability for developers and stakeholders.
What is the purpose of the dbt_project.yml file?
It marks a directory as a dbt project and holds the project-level configurations.
dbt reads this file to learn the project's name, connection profile, and resource paths.
What does the YAML key ‘name’ represent in dbt_project.yml?
Your project’s name in snake case.
This key is used to identify the project uniquely.
What does the YAML key ‘version’ signify in dbt_project.yml?
Version of your project.
This helps track changes and manage project updates.
What is the purpose of the ‘require-dbt-version’ key in dbt_project.yml?
Restricts your project to a range of dbt Core versions.
dbt raises an error if the installed version falls outside that range.
What does the ‘profile’ key in dbt_project.yml specify?
The profile dbt uses to connect to your data platform.
The key only names a profile; in dbt Core the connection details themselves live in profiles.yml.
What information do ‘model-paths’ provide in dbt_project.yml?
Directories where your model and source files live.
The default is a single directory named models.
What is indicated by the ‘seed-paths’ key in dbt_project.yml?
Directories where your seed files live.
Seed files are CSV data files that dbt can load into the database.
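The keys covered in the preceding cards fit together in a dbt_project.yml along these lines. The project and profile name `jaffle_shop` and the version range are hypothetical; the two path values shown are dbt's defaults.

```yaml
name: jaffle_shop                   # hypothetical project name, in snake case
version: "1.0.0"                    # version of the project itself

# Hypothetical range: accept any dbt Core 1.x release
require-dbt-version: [">=1.0.0", "<2.0.0"]

profile: jaffle_shop                # hypothetical profile name used for the connection

model-paths: ["models"]             # where model and source files live
seed-paths: ["seeds"]               # where seed CSV files live
```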