Big Data Lecture 14 Data Cubes Flashcards

1
Q

What is OLTP?

A

Online transaction processing, both read and write intensive system in real time, example is relational database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is OLAP?

A

Online analytical processing, we want to agreggate data, can be not fully online, example are Data Cubes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the main differences of OLTP and OLAP in IO?

A

OLTP writes mostly, OLAP reads often.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the difference of OLAP and OLTP in detail of information it provides about data?

A

OLTP has detailed individual records, OLAP historical summarized consolidated data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the difference of OLTP and OLAP in terms of size of transactions?

A

OLTP has small transactions, OLAP has few huge queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the difference of OLAP and OLTP in execution time?

A

OLTP is interactive, OLAP takes its time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the difference of OLAP and OLTP in terms of duplication and denormalization?

A

OLTP is consistent, OLAP has redundant data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What 4 adjectives define OLAP?

A

<ul><li>Subject oriented: usually for one use case (web analytics, sales...),</li><li>time-variant: the data tends to be historical,</li><li>integrated: loads data from many different sources,</li><li>non-volatile: the data is locked once it is loaded.</li></ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the main data shape in OLAP?

A

Data cube! Multidimensional object with many dimensions, which are indexed by members.<br></br>E.g. YEAR x PRODUCT x COUNTRY identifies a unique product sold.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is a fact table?

A

Table that stores for each coordinate combination a value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a measurement?

A

Value stored in the table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is slicing?

A

Taking a slice of the cube over some dimensions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is dicing?

A

Reformatting the table such that the axis of the table become the dimensions, this is called the cross-tabulated view.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How is aggregation from the POV of the cube?

A

It is like squeezing a cube along one dimension.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a roll-up?

A

We combine two categories in aggregation together, to create a lower dimensional view of the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a drill-down?

A

Opposite of roll up. We make the data more granular, by considering more subvariables in each category, making the view more precise, less aggregated.

17
Q

What is the difference of ROLAP and MOLAP?

A

Relational OLAP (built on top of RDBMS) and Multidimensional OLAP (natively deals with multiple dimensions).

18
Q

What is pivoting (and unpivoting)?

A

Pivoting takes dimensions and makes the associated values into a different measurements. Unpivoting reverses that.

19
Q

What is a star schema of tables?

A

We keep one main table with all dimensions, and then we have <i>satellite tables </i>which reference the remaining data.

20
Q

What is snowflake schema?

A

Within each table of the star schema, we further normalize the data into more tables.

21
Q

What is MDX?

A

Stands for Multi-Dimensional eXpressions. The query language for Data Cubes.

22
Q

How to roll up in SQL?

A

Use group by, and the dimensions that you want to keep insert into the groupings.

23
Q

How to cube in SQL?

A

You can ‘GROUP BY GROUPING SETS’ and then denote all the combinations, or ‘GROUP BY CUBE’ and just input the attributes, it will produce the 2^n different sets we need to group on.

24
Q

How to roll up in SQL?

A

You can use ‘GROUP BY ROLLUP’ that produces all the attributes for (n+1) types that we need.

25
How do authorities deal with fancy reports from companies to make the readeable to the system?
They let them produce the reports in XHTML, and make them annotate the important attributes for parsing. Later, data cubes can be used for simple aggregation online.