dbt Flashcards

(63 cards)

1
Q

What informs dbt about the context of a project?

A

A dbt project

A dbt project specifies how to transform data and build datasets.

2
Q

What is the minimum requirement for a dbt project?

A

The dbt_project.yml project configuration file

This file is essential for dbt to function correctly.

3
Q

What does the models resource in a dbt project do?

A

Transforms raw data into a dataset ready for analytics or serves as an intermediate step

Each model is contained in a single file.

4
Q

What are snapshots used for in a dbt project?

A

Capturing the state of mutable tables for later reference

Snapshots allow tracking changes over time.
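
A minimal sketch of a YAML-defined snapshot, assuming dbt Core 1.9 or later (earlier versions define snapshots in SQL files). The source `jaffle_shop.orders` and the columns `id` and `updated_at` are hypothetical.

```yaml
# snapshots/orders_snapshot.yml (hypothetical path and names)
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')  # the mutable table to capture
    config:
      unique_key: id          # identifies the same row across runs
      strategy: timestamp     # detect changes through a timestamp column
      updated_at: updated_at  # column recording the last modification
```

Each `dbt snapshot` run then records the rows that changed since the previous run.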

5
Q

What are seeds in a dbt project?

A

CSV files with static data that can be loaded into the data platform

Seeds provide a way to incorporate static datasets.

6
Q

What purpose do data tests serve in a dbt project?

A

SQL queries to test the models and resources in your project

Data tests ensure data quality and integrity.
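
A minimal sketch of generic data tests declared in a properties file, which dbt compiles into SQL queries. The model `customers` and its column are hypothetical, and the `data_tests` key assumes dbt Core 1.8 or later (older versions use `tests`).

```yaml
# models/schema.yml (hypothetical model and column names)
models:
  - name: customers
    columns:
      - name: customer_id
        data_tests:
          - unique     # fails if any value appears more than once
          - not_null   # fails if any value is null
```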

7
Q

What are macros in a dbt project?

A

Blocks of code that can be reused multiple times

Macros promote code efficiency and reduce redundancy.

8
Q

What is the role of docs in a dbt project?

A

Documentation for the project that can be built

Docs help in understanding and maintaining the project.

9
Q

What do sources represent in a dbt project?

A

Naming and describing data loaded into the warehouse by Extract and Load tools

Sources provide clarity on data origins.
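
A minimal sketch of a source definition. The source name, database, schema, and tables are hypothetical.

```yaml
# models/staging/sources.yml (hypothetical names)
sources:
  - name: jaffle_shop
    description: Raw application data loaded by an Extract and Load tool.
    database: raw
    schema: jaffle_shop
    tables:
      - name: orders
        description: One row per order.
      - name: customers
        description: One row per customer.
```

Models can then reference a table with `{{ source('jaffle_shop', 'orders') }}` instead of a hard-coded relation name.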

10
Q

What are exposures in a dbt project?

A

Defining and describing a downstream use of your project

Exposures help track how data is utilized post-transformation.
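
A minimal sketch of an exposure that describes a dashboard built on top of a model. The dashboard, URL, model, and owner are hypothetical.

```yaml
# models/marts/exposures.yml (hypothetical names)
exposures:
  - name: weekly_revenue_dashboard
    type: dashboard
    maturity: high
    url: https://bi.example.com/dashboards/1
    description: Revenue figures reviewed in the weekly finance meeting.
    depends_on:
      - ref('fct_orders')   # the model this dashboard reads from
    owner:
      name: Analytics Team
      email: analytics@example.com
```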

11
Q

What is the purpose of metrics in a dbt project?

A

Defining metrics for your project

Metrics provide quantifiable measures for analysis.

12
Q

What do groups enable in a dbt project?

A

Collaborative node organization in restricted collections

Groups facilitate teamwork and organization.
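
A minimal sketch of a group and a model restricted to it. The group, owner, and model names are hypothetical, and setting `access` under `config` assumes dbt Core 1.7 or later.

```yaml
# models/finance/_finance.yml (hypothetical names)
groups:
  - name: finance
    owner:
      name: Finance Team
      email: finance@example.com

models:
  - name: finance_private_model
    config:
      group: finance    # the model belongs to the finance group
      access: private   # only models in the same group may ref() it
```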

13
Q

What is the analysis resource in a dbt project used for?

A

Organizing analytical SQL queries in the project

This includes queries like the general ledger from QuickBooks.

14
Q

What do semantic models define in a dbt project?

A

Foundational data relationships in MetricFlow and the Semantic Layer

They enable querying metrics using a semantic graph.
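
A minimal sketch of a semantic model with one metric built on it, assuming the MetricFlow YAML spec used by dbt Core 1.6 through 1.9 (later versions may differ). The model `fct_orders` and all column names are hypothetical.

```yaml
# models/marts/sem_orders.yml (hypothetical names)
semantic_models:
  - name: orders
    description: One row per order.
    model: ref('fct_orders')
    defaults:
      agg_time_dimension: ordered_at
    entities:
      - name: order_id
        type: primary     # join key that identifies a row
      - name: customer_id
        type: foreign     # join key to another semantic model
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_total
        agg: sum
        expr: amount

metrics:
  - name: revenue
    label: Revenue
    type: simple
    type_params:
      measure: order_total
```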

15
Q

What are saved queries in a dbt project?

A

Organizing reusable queries by grouping metrics, dimensions, and filters into nodes

Saved queries enhance efficiency by reducing repetitive query writing.

16
Q

What should be considered when building the structure of a dbt project?

A

Impacts on workflow such as command execution paths, navigation, and model configuration

These factors influence usability for developers and stakeholders.

17
Q
A
18
Q

What is the purpose of the dbt_project.yml file?

A

It defines the directory of the dbt project and other project configurations.

This file is essential for managing dbt project settings and ensuring proper functionality.

19
Q

What does the YAML key ‘name’ represent in dbt_project.yml?

A

Your project’s name in snake case.

This key is used to identify the project uniquely.

20
Q

What does the YAML key ‘version’ signify in dbt_project.yml?

A

Version of your project.

This helps track changes and manage project updates.

21
Q

What is the purpose of the ‘require-dbt-version’ key in dbt_project.yml?

A

Restrict your project to only work with a range of dbt Core versions.

This ensures compatibility with specific dbt versions.

22
Q

What does the ‘profile’ key in dbt_project.yml specify?

A

The profile dbt uses to connect to your data platform.

This key names a profile defined in profiles.yml, which holds the connection settings for the data platform.
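
A minimal sketch of the top-level keys covered so far. The project name, version, and dbt Core range are hypothetical values.

```yaml
# dbt_project.yml (hypothetical values)
name: jaffle_shop        # project name in snake case
version: "1.0.0"         # version of the project itself, not of dbt
require-dbt-version: [">=1.7.0", "<2.0.0"]   # accepted dbt Core range
profile: jaffle_shop     # must match a profile name in profiles.yml
```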

23
Q

What information do ‘model-paths’ provide in dbt_project.yml?

A

Directories where your model and source files live.

This structure organizes model files for better management.

24
Q

What is indicated by the ‘seed-paths’ key in dbt_project.yml?

A

Directories where your seed files live.

Seed files are CSV data files that dbt can load into the database.

25
What is the purpose of the 'test-paths' key in dbt_project.yml?
Directories where your test files live.
This allows for organized placement of testing scripts.
26
What do 'analysis-paths' refer to in dbt_project.yml?
Directories where your analyses live.
This structure is used for analytical queries or reports.
27
What is the function of the 'macro-paths' key in dbt_project.yml?
Directories where your macros live.
Macros are reusable blocks of Jinja and SQL that simplify code writing.
28
What does the 'snapshot-paths' key indicate in dbt_project.yml?
Directories where your snapshots live.
Snapshots capture the state of data at a point in time.
29
What is defined by the 'docs-paths' key in dbt_project.yml?
Directories where your docs blocks live.
This helps in organizing documentation associated with the project.
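
A minimal sketch of the path keys as they might appear together in dbt_project.yml. The values mirror commonly used directory names; the `docs` directory in particular is a hypothetical choice.

```yaml
# dbt_project.yml (path keys only; each value is a list of directories)
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
docs-paths: ["docs"]
```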
30
What does the 'vars' key represent in dbt_project.yml?
Project variables you want to use for data compilation.
These variables can be referenced throughout the project for dynamic configurations.
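
A minimal sketch of a `vars` block. The variable names, values, and the project name `jaffle_shop` are hypothetical.

```yaml
# dbt_project.yml (hypothetical variables)
vars:
  start_date: "2024-01-01"   # available to every resource in the project
  jaffle_shop:
    currency: "USD"          # scoped to the jaffle_shop project only
```

A model can read a value with `{{ var('start_date') }}`, and a run can override it from the command line with `--vars`.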
31
Fill in the blank: The dbt_project.yml file is essential for _______.
managing dbt project settings
32
True or False: The 'version' key in dbt_project.yml is optional.
True.
While it is recommended to set a version, it is not strictly required.
33
Which dbt YAML file does not support Jinja?
dependencies.yml
34
In which dbt YAML files can you use vars?
Any YAML file that supports Jinja, like schema.yml and snapshots.yml
35
How do you pass vars to dbt_project.yml, packages.yml, and profiles.yml?
Through the CLI using --vars
36
Why can't dbt_project.yml, packages.yml, and profiles.yml use vars defined in the vars block of dbt_project.yml?
These files are parsed before Jinja is rendered
37
Can you use env_var() in YAML files that support Jinja?
Yes
38
Which dbt YAML files support secure environment variables with DBT_ENV_SECRET_?
profiles.yml and packages.yml
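
A minimal sketch of `env_var()` in profiles.yml, assuming a Postgres adapter. The profile name, environment variable names, and connection values are hypothetical.

```yaml
# profiles.yml (hypothetical profile and variable names)
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      # Variables with the DBT_ENV_SECRET_ prefix are treated as secrets
      password: "{{ env_var('DBT_ENV_SECRET_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: dbt_dev
      threads: 4
```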
39
Which dbt package generates YML and SQL files for models and sources?
dbt_codegen
40
Which dbt package provides macros like date_spine for development?
dbt_utils
41
Which package evaluates your dbt project against best practices?
dbt_project_evaluator
42
Which package offers additional tests beyond dbt's built-in ones?
dbt_expectations
43
Which package helps compare outputs of two queries for refactoring?
dbt_audit_helper
44
Which package tracks dbt run performance over time?
dbt_artifacts
45
Which package ensures your dbt project is tested and documented?
dbt_meta_testing
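
A minimal sketch of installing two of these packages through packages.yml. The version ranges are placeholders, not recommendations; check the package hub for current releases.

```yaml
# packages.yml (version ranges are placeholder assumptions)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: dbt-labs/codegen
    version: [">=0.12.0", "<1.0.0"]
```

Running `dbt deps` downloads the packages listed in this file.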
46
What is a resilient way to select models in dbt?
Use folder structure, e.g., dbt build --select marts.marketing
47
How should you group dbt jobs?
By build cadences and SLAs (hourly, daily, weekly)
48
How can you test a subset of records in dbt?
Use the 'where' config for tests
49
How can you examine failing test records in dbt?
Use 'store_failures'
50
How do you set acceptable failure thresholds for dbt tests?
Use severity thresholds
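
A minimal sketch combining the three test configs from the preceding cards on a single test. The model, column, filter, and thresholds are hypothetical, and the `data_tests` key assumes dbt Core 1.8 or later.

```yaml
# models/schema.yml (hypothetical names and thresholds)
models:
  - name: orders
    columns:
      - name: order_id
        data_tests:
          - unique:
              config:
                where: "order_date >= '2024-01-01'"  # test a subset of records
                store_failures: true                 # save failing rows to a table
                severity: error
                warn_if: ">10"     # warn when more than 10 rows fail
                error_if: ">100"   # error when more than 100 rows fail
```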
51
What config optimizes incremental model behavior?
'incremental_strategy'
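
A minimal sketch of setting the strategy in a properties file. The model and key are hypothetical, and `merge` is only one possible value; the strategies available depend on the adapter.

```yaml
# models/marts/_marts.yml (hypothetical model)
models:
  - name: fct_orders
    config:
      materialized: incremental
      incremental_strategy: merge   # adapter-dependent; an assumption here
      unique_key: order_id          # rows matching this key are updated
```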
52
Where do you set global defaults in dbt?
In dbt_project.yml using vars
53
How can you avoid repetition in Jinja code?
Use for loops
54
What is a better way to apply grants than post-hooks?
Use the grants config
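
A minimal sketch of the grants config. The model and role names are hypothetical.

```yaml
# models/marts/_marts.yml (hypothetical model and roles)
models:
  - name: fct_orders
    config:
      grants:
        select: ["reporter", "bi_user"]   # roles granted SELECT on the model
```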
55
How do you prevent reprocessing already transformed data?
Set source-freshness thresholds
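
A minimal sketch of freshness thresholds on a source. The source, table, column, and thresholds are hypothetical, and the placement of `freshness` directly under the source is an assumption that matches older dbt Core versions (newer versions may expect it under `config`).

```yaml
# models/staging/sources.yml (hypothetical names and thresholds)
sources:
  - name: jaffle_shop
    loaded_at_field: _etl_loaded_at   # timestamp written by the loader
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
```

`dbt source freshness` checks each table against these thresholds.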
56
How do you run a model and its upstream dependencies?
Use '+' on the left, e.g., dbt build --select +model_name
57
How do you run a model and its downstream dependencies?
Use '+' on the right, e.g., dbt build --select model_name+
58
How do you run all models in a directory?
Use the directory name, e.g., dbt build --select dir_name
59
What does the '@' operator do in dbt CI setups?
Runs selection’s parents, children, and children’s parents
60
How do you exclude models from a dbt selection?
Use the --exclude flag
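
The graph operators and `--exclude` from the preceding cards can also be written as YAML selectors. This is a sketch under assumed names: `model_name`, the `models/marts` path, and the `legacy` tag are hypothetical.

```yaml
# selectors.yml (hypothetical selector and model names)
selectors:
  - name: model_with_parents        # like --select +model_name
    definition:
      method: fqn
      value: model_name
      parents: true

  - name: model_with_children       # like --select model_name+
    definition:
      method: fqn
      value: model_name
      children: true

  - name: ci_selection              # like --select @model_name
    definition:
      method: fqn
      value: model_name
      childrens_parents: true

  - name: marts_without_legacy      # like --select with --exclude
    definition:
      union:
        - method: path
          value: models/marts
        - exclude:
            - method: tag
              value: legacy
```

A named selector is run with the `--selector` flag, e.g., dbt build --selector ci_selection.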
61
What flag rebuilds an incremental model completely?
--full-refresh
62
What are dbt seeds used for?
Creating lookup tables from CSVs
63
How can you change logic based on the dbt environment?
Use target.name
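
A minimal sketch of branching on `target.name` in a source definition. The source, target names, and database names are hypothetical.

```yaml
# models/staging/sources.yml (hypothetical names)
sources:
  - name: jaffle_shop
    # Pick the raw database according to the active target
    database: |
      {%- if target.name == "dev" -%} raw_dev
      {%- elif target.name == "prod" -%} raw_prod
      {%- else -%} invalid_database
      {%- endif -%}
    tables:
      - name: orders
```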