Data Warehouse Design - Chapter 1.5 & 1.6 Flashcards

1
Q

How is data from a data warehouse normally represented?

A

With the multi-dimensional model.

2
Q

Why is the multi-dimensional model used as a paradigm of data warehouse representation?

A
  1. Ease of use and intuitiveness
  2. The widespread use of productivity tools, such as Excel, that adopt the multi-dimensional model as a visualization paradigm.
3
Q

What are the five most important concepts of the multi-dimensional model?

A
  1. Facts
  2. Events
  3. Measures
  4. Dimensions
  5. Attributes
4
Q

What are facts in the multi-dimensional model?

A

Enterprise-specific factors that affect decision-making processes, such as sales, shipments, and surgeries.

5
Q

What are events in the multi-dimensional model?

A

Instances of a fact, such as every single sale.

6
Q

What is a measure in the multi-dimensional model?

A

A numerical property of a fact that quantitatively describes each of its events. For example: sales receipts, amounts shipped, surgery time, etc.

7
Q

What are analysis dimensions in the multi-dimensional model?

A

The axes of the model's space, which define the different perspectives from which events can be singled out.

For example, for the fact "sales", the dimensions could be products, stores, and dates.

8
Q

Why do we use a multi-dimensional model?

A

To easily select the events based on their dimensions by using the visualized model.

9
Q

What is a cube?

A

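The usual representation of the multi-dimensional model: each edge of the cube is an analysis dimension, and each cell holds the measure values of one event.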

10
Q

What is a hypercube?

A

A multi-dimensional model with more than 3 dimensions.

11
Q

What does it mean when a cube is sparse?

A

Not every cube cell is filled in. For example, on a specific date a given product may not be sold in a given store, so that cell holds no event.
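
As a rough illustration of sparsity, the sketch below stores a sales cube as a dictionary keyed by (store, product, date), so only cells with an actual event take up space. The store names, products, dates, and figures are made up for the example.

```python
# Hypothetical sales events: only cells where a sale occurred are stored.
sales_cube = {
    ("StoreA", "Milk", "2024-01-01"): {"quantity": 10, "receipts": 12.5},
    ("StoreA", "Bread", "2024-01-02"): {"quantity": 4, "receipts": 8.0},
    ("StoreB", "Milk", "2024-01-01"): {"quantity": 7, "receipts": 8.75},
}

# Collect the distinct values that appear along each dimension.
stores = {key[0] for key in sales_cube}
products = {key[1] for key in sales_cube}
dates = {key[2] for key in sales_cube}

# The full cube has one cell per combination of dimension values.
total_cells = len(stores) * len(products) * len(dates)
filled_cells = len(sales_cube)

# Looking up an empty cell returns None: no event exists there.
empty_cell = sales_cube.get(("StoreB", "Bread", "2024-01-02"))
```

Here the cube has 8 possible cells but only 3 events, so most cells are empty.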

12
Q

How could you represent the sales cube with the relational model schema?

A

SALES (store, product, date, quantity, receipts)

Underline store, product, date because they form the primary key.
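
One way to try this schema out is with Python's built-in `sqlite3` module. This is only a sketch: the column types and sample rows are assumptions, since the card gives just the attribute names and the key.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Column types are assumed; the composite primary key comes from the card.
conn.execute(
    """
    CREATE TABLE sales (
        store    TEXT NOT NULL,
        product  TEXT NOT NULL,
        date     TEXT NOT NULL,
        quantity INTEGER,
        receipts REAL,
        PRIMARY KEY (store, product, date)
    )
    """
)
conn.execute("INSERT INTO sales VALUES ('StoreA', 'Milk', '2024-01-01', 10, 12.5)")

# A second row with the same (store, product, date) violates the primary key.
try:
    conn.execute("INSERT INTO sales VALUES ('StoreA', 'Milk', '2024-01-01', 3, 3.75)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The rejected duplicate shows that each (store, product, date) combination identifies at most one event.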

13
Q

How could you represent the dependency of the sales cube?

A

store, product, date -> quantity, receipts
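
To make the dependency concrete, here is a small sketch that checks whether a set of rows satisfies it. The helper name `satisfies_dependency` and the sample rows are hypothetical.

```python
def satisfies_dependency(rows, determinant, dependent):
    """Return True if equal determinant values always imply equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = tuple(row[attr] for attr in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"store": "StoreA", "product": "Milk", "date": "2024-01-01", "quantity": 10, "receipts": 12.5},
    {"store": "StoreB", "product": "Milk", "date": "2024-01-01", "quantity": 7, "receipts": 8.75},
]
# Same dimension values as the first row, but a different quantity.
conflicting = rows + [
    {"store": "StoreA", "product": "Milk", "date": "2024-01-01", "quantity": 2, "receipts": 2.5},
]

dims = ["store", "product", "date"]
measures = ["quantity", "receipts"]
holds = satisfies_dependency(rows, dims, measures)
violated = not satisfies_dependency(conflicting, dims, measures)
```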

14
Q

What is a roll-up hierarchy?

A

A hierarchy of aggregation levels associated with a dimension. Each dimension normally has one, for example:

Product -> type -> category

The levels (product, type, category) are dimensional attributes.
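
A hierarchy like this can be sketched as a chain of lookups, one per level. The product, type, and category values below are invented for illustration.

```python
# Each mapping takes a value at one level to its parent at the next level.
product_to_type = {"Milk": "Dairy", "Yogurt": "Dairy", "Bread": "Bakery"}
type_to_category = {"Dairy": "Food", "Bakery": "Food"}


def roll_up(product):
    """Return the (product, type, category) path for a product."""
    product_type = product_to_type[product]
    category = type_to_category[product_type]
    return product, product_type, category


milk_path = roll_up("Milk")
```

Because every product maps to exactly one type and every type to exactly one category, events recorded per product can be grouped at either of the coarser levels.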

15
Q

Why do you need to reduce the quantity of the data and which two ways are there to do so?

A

A multi-dimensional cube holds too much information to be analyzed without relying on automatic tools. The two ways to reduce it are:

  1. Restriction
  2. Aggregation
16
Q

What is meant by restricting data?

A

Separating part of the data from a cube to mark out an analysis field.

Also called selection and/or projection.

17
Q

What are two ways of selection to restrict data (RESTRICTION)?

A
  1. Data slicing.
    You set one or more dimensions to a specific value to pick out the events associated with that value.
  2. Dicing
    A generalization of slicing: you place constraints on dimensional attributes so that only the events satisfying them are selected.
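
The two kinds of selection can be sketched over a plain list of events. The function names `slice_cube` and `dice_cube` and the sample data are hypothetical.

```python
events = [
    {"store": "StoreA", "product": "Milk", "date": "2024-01-01", "quantity": 10},
    {"store": "StoreA", "product": "Bread", "date": "2024-01-01", "quantity": 4},
    {"store": "StoreB", "product": "Milk", "date": "2024-01-02", "quantity": 7},
]


def slice_cube(events, **fixed):
    """Slicing: keep events whose dimensions equal the given fixed values."""
    return [e for e in events if all(e[dim] == val for dim, val in fixed.items())]


def dice_cube(events, predicate):
    """Dicing: keep events that satisfy an arbitrary constraint."""
    return [e for e in events if predicate(e)]


# Slice: fix the date dimension to a single value.
one_day = slice_cube(events, date="2024-01-01")

# Dice: a more general constraint combining a dimension and a measure.
big_milk_sales = dice_cube(events, lambda e: e["product"] == "Milk" and e["quantity"] >= 8)
```

Slicing is the special case of dicing where every constraint is an equality on a dimension.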
18
Q

What is meant by projection in relation to restricting data?

A

Keeping only a subset of the measures for every event and discarding the other measures.
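
A minimal sketch of projection, assuming events are dictionaries that mix dimensions and measures. The helper name `project` and the sample event are made up.

```python
events = [
    {"store": "StoreA", "product": "Milk", "date": "2024-01-01", "quantity": 10, "receipts": 12.5},
    {"store": "StoreB", "product": "Milk", "date": "2024-01-02", "quantity": 7, "receipts": 8.75},
]


def project(events, dimensions, measures):
    """Keep all listed dimensions but only the chosen measures of each event."""
    keep = list(dimensions) + list(measures)
    return [{name: event[name] for name in keep} for event in events]


# Keep the quantity measure and drop receipts; the number of events is unchanged.
quantities_only = project(events, ["store", "product", "date"], ["quantity"])
```

Unlike selection, projection does not remove events; it only narrows what is kept about each one.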

19
Q

What is meta-data?

A

Data used to define other data.

20
Q

How does meta-data play an important role in data warehousing?

A
  1. It specifies source, values, usage and features of data warehouse data.
  2. It defines how data can be changed and processed at every architecture layer.
21
Q

What are the two categories to classify meta-data in?

A
  1. Internal meta-data
    Useful for system administrators; defines sources, policies, constraints, etc.
  2. External meta-data
    Useful for end-users; covers definitions, quality standards, units of measure, etc.
22
Q

What are the 5 things a meta-data management tool should be able to do?

A
  1. Allow administrators to perform system administration operations and manage security.
  2. Allow end-users to navigate and query meta-data.
  3. Use a GUI.
  4. Allow end-users to extend meta-data.
  5. Allow meta-data to be imported/exported from other tools and formats.
23
Q

What is aggregation?

A

Coarsening the granularity of one or more dimensions (for example, from date to month) so that events are grouped together and the number of events decreases.
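
As a sketch of aggregation, the example below groups daily sales events by month and sums the quantity measure. The sample events are invented, and dates are assumed to be ISO strings (YYYY-MM-DD), so the month is the first seven characters.

```python
from collections import defaultdict

events = [
    {"store": "StoreA", "product": "Milk", "date": "2024-01-01", "quantity": 10},
    {"store": "StoreA", "product": "Milk", "date": "2024-01-15", "quantity": 5},
    {"store": "StoreA", "product": "Milk", "date": "2024-02-01", "quantity": 3},
]


def aggregate_by_month(events):
    """Sum quantities per (store, product, month) instead of per day."""
    totals = defaultdict(int)
    for event in events:
        month = event["date"][:7]  # assumes ISO dates, e.g. "2024-01"
        totals[(event["store"], event["product"], month)] += event["quantity"]
    return dict(totals)


monthly = aggregate_by_month(events)
```

Three daily events collapse into two monthly ones, which is the reduction in the number of events that aggregation is meant to achieve.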