Tests Flashcards
(81 cards)
What is the primary purpose of data tests in dbt?
Data tests in dbt are used to make assertions about models and other resources like sources, seeds, and snapshots. They help ensure data integrity by verifying assumptions about the data, such as uniqueness or non-null values.
How does dbt determine if a data test passes or fails?
A data test passes if its corresponding SQL query returns zero failing records. If the query finds records that disprove the assertion, the test fails.
What are some examples of built-in generic data tests in dbt?
dbt provides built-in generic data tests for checking non-null values, uniqueness, referential integrity (foreign key relationships), and values from a specified list.
What is the difference between a singular and a generic data test in dbt?
A singular data test is a custom SQL query saved in a .sql file to find failing records. A generic data test is a reusable, parameterized test defined in a test block and applied via .yml configuration.
Why are generic data tests generally more common than singular data tests in dbt?
Generic data tests are reusable and flexible, allowing consistent testing across multiple models with minimal effort. Their reusability and configurability make them ideal for frequent use.
What does a data test in dbt typically look for in its SQL query?
A data test SQL query looks for failing records—those that violate the test’s assertion. For instance, if testing for uniqueness, the query finds duplicate values.
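The uniqueness case above can be sketched as a query that returns one row per duplicated value. This is a minimal illustration, not dbt's built-in implementation, and the model name orders and the column order_id are hypothetical.

```sql
-- Sketch of a uniqueness check (hypothetical model `orders`, column `order_id`).
-- Every row returned is a value that occurs more than once, i.e. a failing record.
-- Zero rows returned means the assertion holds and the test passes.
select
    order_id,
    count(*) as occurrences
from {{ ref('orders') }}
where order_id is not null
group by order_id
having count(*) > 1
```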
Where are singular data tests typically stored in a dbt project?
Singular data tests are stored as .sql files within the tests directory of the dbt project.
How can you define a generic data test in dbt?
Generic data tests are defined using test blocks, similar to macros. They accept arguments and are referenced by name in .yml files attached to models, columns, or sources.
What happens when a data test fails in dbt?
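When a data test fails, its query has returned the rows that violate the assertion, and dbt reports how many failing records it found. Inspecting those rows (for example, by running dbt test with the --store-failures flag to save them to a table) helps identify exactly what data violated the assertion.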
Can data tests be extended to fit specific business logic in dbt?
Yes, any assertion that can be expressed as a SQL SELECT query can be used as a data test, allowing teams to validate business-specific rules.
How are data tests executed in a dbt workflow?
Data tests are run using the dbt test command, which evaluates all defined tests and reports on their success or failure.
What is the role of schema tests in dbt?
Schema tests, the older name for generic data tests, validate schema-level properties like nullability, uniqueness, and relationships, and are defined declaratively in YAML files.
How does dbt test help prevent regressions in your codebase?
By running consistent tests on models, dbt ensures that data assumptions hold true even as underlying code changes, helping to catch unintended issues early.
What should a data test SQL query return to indicate a successful test?
It should return zero rows—meaning no failing records were found that violate the assertion.
What is a singular data test in dbt?
A singular data test is a custom SQL query stored in a .sql file that returns failing records for a specific assertion. It’s a one-off test created for a unique use case.
Where are singular data tests typically stored in a dbt project?
They are stored in the tests directory defined by the test-paths config. Each file should contain one SELECT statement.
What naming convention does dbt use for singular data tests?
The name of the test is derived from the file name, such as assert_total_payment_amount_is_positive.sql.
Can Jinja be used in singular data tests?
Yes, you can use Jinja syntax, including ref and source, in singular test SQL files just like in models.
What should a singular test SQL query return?
It should return only the failing records. For example, a query that selects rows where total_amount < 0 to test for positive payment totals.
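The positive-payment-total example above could look roughly like the following singular test, saved as assert_total_payment_amount_is_positive.sql. The model name fct_payments and the columns order_id and amount are assumptions made for illustration; only total_amount < 0 comes from the card itself.

```sql
-- Singular test sketch: return only the failing records.
-- Assumed (hypothetical) model `fct_payments` with columns `order_id` and `amount`.
-- A total below zero violates the assertion, so those rows make the test fail.
-- Note: no trailing semicolon.
select
    order_id,
    sum(amount) as total_amount
from {{ ref('fct_payments') }}
group by order_id
having sum(amount) < 0
```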
Why should you omit semicolons in singular data test SQL files?
dbt wraps the test query inside a larger query when it runs the test, so a trailing semicolon can cause a syntax error. Leave it off the final SELECT statement.
How do you add a description to a singular data test?
You add it to a .yml file in the tests directory, specifying the test name and a description under data_tests.
What is a generic data test in dbt?
A generic data test is a parameterized test defined using a test block. It can be reused across models and columns by passing different arguments.
How is a generic test defined in dbt?
Generic tests are defined using a Jinja {% test %} block with parameters like model and column_name. The body contains a SELECT query using those parameters.
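A test block of the kind described above might be sketched as follows. The test name not_negative is hypothetical and chosen only for illustration; model and column_name are the parameters named in the card.

```sql
{# Hypothetical generic test: fails for any row where the column is below zero. #}
{% test not_negative(model, column_name) %}

select *
from {{ model }}
where {{ column_name }} < 0

{% endtest %}
```

Once defined, a test like this would be referenced by name in a .yml file on whichever column it should check, and dbt would supply model and column_name from that context.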
How are generic data tests applied to dbt resources?
They are configured in YAML files under the properties of models, sources, seeds, or snapshots. You specify the test name and any required arguments.