Referential Integrity Flashcards

1
Q

What does referential integrity mean?

A

That lookup columns will always have a value and that value will find a match in the destination table.

(Aug. 2021, 5.3)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

The referential integrity value that gets assigned is determined by the circumstances behind the null or unmatched value and is defined by the ____ _______.

A

Load package

Aug. 2021, 5.3

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Databases that enforce referential integrity have two important properties. What are they?

A
  1. Lookup columns will always have a value even if the source data is null.
  2. Lookup column values will always have a matching value in the destination table/column.

(Aug. 2021, 5.4)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What default value does Caboodle use to identify Unspecified Values?

A

-1

Aug. 2021, 5.5

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Why is a value said to be unspecified?

A

When a lookup column’s value is NULL in the source system.

Aug. 2021, 5.5

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What default value does Caboodle use to identify Not Applicable values?

A

-2

Aug. 2021, 5.5

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Where would you look to discover why a -2 value is in a key column?

A

You should investigate that column’s data lineage and the table’s queries.

(Aug. 2021, 5.6)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What does Caboodle do when a record is hard-deleted in the source system?

A

Caboodle marks the deletion by updating the row with default values that represent the deletion.

(Aug. 2021, 5.6)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Does the primary key of a non-snapshot table change when a record is hard-deleted in the source system?

A

No. The primary key remains unchanged.

Aug. 2021, 5.6

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What value in the _IsDeleted column indicates a deletion?

A

1

Aug. 2021, 5.7

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Does the surrogate key or the durable key change when data is deleted in a SNAPSHOT table?

A

A new surrogate key is generated. The durable key remains unchanged.

(Aug. 2021, 5.8)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are the default values Caboodle uses for a deletion?

A

String columns = *Deleted
Lookup columns = -3
Everything Else = NULL
_IsDeleted = 1

(Aug. 2021, 5.7)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What happens when an entity in a snapshot table is deleted?

A

A new row is added and is populated with the deleted default values.
The other rows pertaining to this entity will be updated to reflect the deletion. Only the snapshot columns will be updated with the deleted default values. The data in non-snapshot columns is now stale.

(Aug. 2021, 5.7-5.8)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How does Caboodle ensure the second principal of referential integrity?

A

Every table in Caboodle has three special rows with primary key values of -1, -2, and -3. These rows do not represent actual entities, they only exist to satisfy referential integrity.

(Aug. 2021, 5.16)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Why do the “Three Special Rows” exist?

A
  1. So that the second principal of referential integrity is satisfied.
  2. So we can join to them if we need to (tidbit from class)

(Aug. 2021, 5.16)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are the values for the “Three Special Rows”?

A

PrimaryKey = -1, -2, and -3
Strings = *Unspecified, *Not Applicable, *Deleted
Lookup Columns = -1, -2, and -3
Everything Else = NULL

17
Q

How would you handle the “Three Special Rows” when querying Caboodle?

A

They would need to be removed via the WHERE clause because they do not represent anything.

(Aug. 2021, 5.17)

18
Q

How would you use the WHERE clause to remove the “Three Special Rows”?

A
  1. Filter on the PrimaryKey column to only show values that are >0
  2. Filter on the Count column
    The Count column stores a 1 for all rows where the surrogate key is greater than 0 and the row does not represent deleted data.
    Count columns are only in fact tables (tidbit from class)

(Aug. 2021, 5.17)

19
Q

What tables have count columns?

A

Fact tables.

Aug. 2021, 5.17

20
Q

Count columns store 1’s and 0’s. What do they represent?

A

1 for all rows where the surrogate key is greater than 0 and the row does not represent deleted data.

0 for all rows where the surrogate key is less than 0 and the row represents deleted data.

(Aug. 2021, 5.17)

21
Q

What happens when a lookup column value does not use one of the default values and also does not have a match in the destination table?

A

Caboodle creates inferred rows.

Aug. 2021, 5.18

22
Q

What are inferred rows also known as?

A

Inferred rows can be thought of as placeholder rows. They are used when data is out of sink. It’s like Caboodle is saying, “I don’t know what it is or who they are yet, but I will.”

(Aug. 2021, 5.18)

23
Q

Do inferred rows have default values? If yes, can you name them?

A
Yes.
PrimaryKey = Generated during ETL
Strings = *Unknown
Lookup Columns = -1
Everything Else = NULL
_IsInferred = 1

(Aug. 2021, 5.18)

24
Q

What column would you use to identify inferred rows?

A

The _IsInferred column.

_IsInferred only equals 1 when it’s actually inferred. Not when it’s updated.

(Aug. 2021, 5.19)

25
Q

What is the difference between an inner join and a left outer join when joining from a lookup column, in Caboodle?

A

Because lookup column values will always have a matching value in the destination table/column, there is no difference between an inner join and a left outer join when joining from a lookup column.

(Aug. 2021, 5.20)

26
Q

Are inner joins and outer joins the same when joining from the primary key?

A

No. Referential integrity ensures that lookup columns are always populated and will always find a matching value in the destination table.

(Aug. 2021, 5.21)

27
Q

Do all inferred rows in Caboodle have a primary key of -1?

A

No.
The primary key for an inferred row is generated during the ETL process.
Lookup columns have a value of -1.

(Aug. 2021, 5.27)

28
Q

A row in a Caboodle table has a primary key value of -2. What value will be stored in columns with a data type of “Date” for this row that are not lookup columns?

A

NULL

Aug. 2021, 5.29

29
Q

If a lookup column is null in Clarity, what will appear in the corresponding column in Caboodle?

A

It depends on if the SSIS package defined the data lineage.
If it did, then a -1 will appear in Caboodle to represent an unspecified value.
If it didn’t, then a -2 will appear in Caboodle to represent a value that is not applicable.

(Aug. 2021, 5.29)