Data Warehousing Flashcards
(33 cards)
What is the process of removing all redundant information for a schema
Normalisation
What are the 4 steps of the dimensional design process?
1) Select the business process
2) Declare the grain in business terms
3) Identify the dimensions
4) Identify the facts
What is Grain?
What a single fact table row represent
What is a dimension?
Provides context for the Fact measures.
Context being What, Where, When, Why, Who and How.
What is a Fact?
Measurements that result from a business process event.
What is a degenerate dimension?
TBC
What are the 3 types of fact measure
1) Additive - Summed across all dimensions
2) Semi Additive - Summed across some dimensions (e.g. Not Date)
3) Non Additive - Cannot be summed
Give an example of a Factless Fact
A fact table recording calendar day, student, teacher, class and location.
What are the 3 main types of fact table
1) Transnational
2) Snapshot
3) Accumulating snapshot
What is a consolidated fact table?
Where 2 or more linked business processes are combined in a single fact table. e.g sales and forecasts
What is the name of the type of key is used to join dimensions and facts?
Surrogate Key or Primary Key / Foreign Key
What is the name of the type of key which is used in operational systems to uniquely identify resources
Natural Key / Business Key
What is a degenerate dimension and how does it occur?
This is a dimension key in a fact table that has no associated dimension. This could happen where all the information from an invoice is expressed in the rows and dimensions of a fact leaving the invoice number which is still useful for analysis.
What name is given to a dimension which is joined to multiple times on a single fact table
Role-playing Dimension e.g. Date
What are conformed dimensions?
Where the same dimensions apply to multiple facts which allows data to be easily combined.
What is a shrunken dimension?
One which works on a subset of rows or attributes. This could be a dimension which is associated with an aggregated fact table
What is value chain?
Natural flow of a businesses primary business e.g. Quote,Application,Completion
What is a Data Warehouse Bus Matrix?
A matrix of rows representing business processes and columns representing the conformed dimensions which relates to these business processes.
What is a Implementation Bus Matrix?
A matrix of rows representing the data warehouse facts (Within each business process) and columns representing the conformed dimensions which relates to these facts
What is a Opportunity / Stakeholder Matrix?
A matrix of rows representing business processes and columns representing the business groups / departments interested in these business processes
What are the 7 slowly changing dimension attribute definitions?
Type 0: Retain original Type 1: Overwrite Type 2: Add new row Type 3: Add new attribute (alt Reality) Type 4: Add mini dimension (rapid change) Type 5: Add mini dimension + Type 1 (outrigger) Type 6: Add Type 1 and Type 2 Type 7: Duel Type 1 and Type 2
What would you do if you have a fact table where a dimension can be multi-valued e.g. Hospital diagnosis?
Use a bridging table which has a row for each diagnosis
What is behavioral tag time series?
a customer is assigned a tag on a periodic basis based on behavioral traits. The current value is stored as well as a string showing how the tags have changed over time
How should you store textual comments in a data warehouse
Not on the fact table but using a text comment dimension