Chapter 14 Flashcards

1
Q

Why replicate data?

A
• Increase availability
• Improve performance

2
Q

Eager (synchronous) replication

A

A transaction synchronizes with the copies of replicated elements before commit.
Advantages: guarantees one-copy serializable execution; avoids inconsistencies.
Potential problems: update overhead (reduced update performance and increased transaction response time); deadlocks; lack of scalability; cannot be used if nodes are disconnected (e.g., mobile databases) or unavailable.
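
A minimal sketch of the eager idea, assuming an in-memory `Replica` class and an `eager_write` helper (both hypothetical names, not from the chapter). The write reaches every copy before the transaction counts as committed, and it fails as a whole if any copy is unreachable.

```python
class Replica:
    """Hypothetical in-memory copy of the replicated data."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.data = {}


def eager_write(replicas, key, value):
    """Apply the write to all copies before commit, or to none."""
    # Eager replication cannot proceed if any copy is unreachable.
    if not all(r.available for r in replicas):
        return False  # abort: no copy is changed
    for r in replicas:
        r.data[key] = value  # every copy is updated inside the transaction
    return True  # commit: all copies agree


replicas = [Replica("A"), Replica("B"), Replica("C", available=False)]
committed = eager_write(replicas, "x", 1)  # False, because C is unavailable
```

The sketch leaves out locking and two-phase commit, which a real system would need and which are the source of the deadlock and overhead problems listed above.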

3
Q

Lazy (asynchronous) replication

A

Changes introduced at one site are propagated (as separate transactions) to other sites only after commit

Advantages: Minimal update overhead (improved response time over eager replication); also works if sites are disconnected or unavailable
Potential problems: Out-of-date data; conflicting updates on different replicas can cause inconsistencies between copies.
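
A minimal sketch of lazy propagation, assuming a hypothetical `LazySite` class with an outbox of committed changes. The local commit returns at once, and the remote copy stays out of date until propagation runs.

```python
from collections import deque


class LazySite:
    """Hypothetical site that commits locally and propagates later."""

    def __init__(self):
        self.data = {}
        self.outbox = deque()  # committed changes not yet propagated

    def commit(self, key, value):
        self.data[key] = value            # local commit returns immediately
        self.outbox.append((key, value))  # propagation happens afterwards

    def propagate(self, other):
        # Runs later, one change at a time, once `other` is reachable.
        while self.outbox:
            key, value = self.outbox.popleft()
            other.data[key] = value


origin, remote = LazySite(), LazySite()
origin.commit("x", 1)
stale = remote.data.get("x")  # None: the remote copy is out of date
origin.propagate(remote)
fresh = remote.data.get("x")  # 1 after propagation
```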

4
Q

Single-master primary-copy replication

A

One replica serves as the primary copy and the others as secondaries. Either eager or lazy replication can be used with this scheme.

5
Q

What are the options if the primary copy fails?

A

Option 1 - disallow updates until the primary recovers

Option 2 - a secondary takes over as primary

6
Q

Multi-master replication

A

"Update anywhere" model: any replica can accept updates.

Updating nodes "race" each other to propagate their updates to all the other nodes (potential for lost updates).

7
Q

How to detect conflicts in multi-master replication?

A

Based on timestamps.
Each node compares the old object timestamp carried by an incoming replica update with its own object timestamp. If they are the same, the update is accepted. If not, the incoming update transaction is rejected and submitted for reconciliation.
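
The timestamp check can be sketched as follows, assuming a hypothetical `Node` class that stores each object as a `(value, timestamp)` pair. The class and method names are illustrative only.

```python
class Node:
    """Hypothetical replica storing key -> (value, timestamp)."""

    def __init__(self):
        self.store = {}
        self.reconcile = []  # rejected updates awaiting reconciliation

    def apply_remote(self, key, old_ts, new_value, new_ts):
        _, local_ts = self.store.get(key, (None, None))
        if local_ts == old_ts:
            # The sender updated the same version this node holds: accept.
            self.store[key] = (new_value, new_ts)
            return True
        # The versions diverged: reject and hand over to reconciliation.
        self.reconcile.append((key, old_ts, new_value, new_ts))
        return False


node = Node()
node.store["x"] = ("a", 1)
first = node.apply_remote("x", old_ts=1, new_value="b", new_ts=2)   # accepted
second = node.apply_remote("x", old_ts=1, new_value="c", new_ts=3)  # rejected
```

The second update is rejected because it was based on version 1, while the node already holds version 2.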

8
Q

Reconciliation

What are the two approaches?

A

Automatically, based on rules

Manually

9
Q

What are the 2 alternatives for conflict detection?

A

1 - Semantic synchronization: permit commutative transactions (e.g., processing checks at a bank gives the same result regardless of order);
provide acceptance criteria for detecting conflicts (an update must pass the acceptance tests)

2 - Conflict avoidance: implement update strategies in the application:
fragmentation by key
fragmentation by time
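
Both ideas can be sketched in a few lines. The balance, the deltas, and the modulo-based `owner_of` assignment are assumptions chosen for illustration, not taken from the chapter.

```python
from itertools import permutations


def apply_all(balance, deltas):
    """Apply account deltas (checks and deposits) in the given order."""
    for delta in deltas:
        balance += delta
    return balance


# Commutative transactions: every ordering yields the same final balance.
deltas = [-30, 50, -20]
results = {apply_all(100, order) for order in permutations(deltas)}


def owner_of(customer_id, nodes):
    """Fragmentation by key: each key is updated at exactly one node."""
    return nodes[customer_id % len(nodes)]


owner = owner_of(7, ["A", "B", "C"])  # "B"
```

Because each key has a single updating node, two nodes never update the same object concurrently, so no conflict can arise.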

10
Q

What is the goal of data warehousing?

A

Materialized integration of data from numerous heterogeneous sources to enable powerful strategic data analysis

11
Q

OLAP (online analytical processing)

Fact table:

A

Events or objects of interest (e.g., a sales event, with information about the product sold, the store, the sales date, and the price)

12
Q

OLAP dimension table:

A

Objects can often be thought of as arranged in a multi-dimensional space or cube (e.g., sales events have store, product, and time period dimensions)

13
Q

What characterizes a relational OLAP star schema?

A

Dimension tables (linked to the fact table) tend to be small
The fact table tends to be huge
The fact table holds the measures (dependent attributes)
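
A minimal star schema sketch using Python's built-in `sqlite3` module. The table and column names (`store`, `product`, `period`, `sales`, `price`) are assumptions modeled on the sales example, not a schema from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: small, one row per store / product / time period.
    CREATE TABLE store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE period  (period_id  INTEGER PRIMARY KEY, sale_date TEXT);

    -- Fact table: one row per sales event, keyed by the dimensions.
    CREATE TABLE sales (
        store_id   INTEGER REFERENCES store(store_id),
        product_id INTEGER REFERENCES product(product_id),
        period_id  INTEGER REFERENCES period(period_id),
        price      REAL  -- measure (dependent attribute)
    );
""")
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

The fact table sits at the center and references each dimension table directly, which gives the schema its star shape.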

14
Q

What characterizes a snowflake schema?

A

"Normalized" dimensions
Each dimension is split into multiple tables to avoid redundancy
Requires additional joins for OLAP queries

15
Q

What do OLAP queries group by, and what do they compute?

A

OLAP queries usually "group by" the dimensions and compute aggregate values of the measures
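
A small illustration with Python's built-in `sqlite3` module: the query groups by a dimension attribute and aggregates a measure. The tables, rows, and names are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE sales (store_id INTEGER, product TEXT, price REAL);
    INSERT INTO store VALUES (1, 'Berlin'), (2, 'Munich');
    INSERT INTO sales VALUES
        (1, 'pen', 2.0), (1, 'ink', 5.0), (2, 'pen', 3.0);
""")
# Group by a dimension attribute (city) and aggregate the measure (price).
rows = conn.execute("""
    SELECT st.city, SUM(s.price) AS revenue
    FROM sales AS s JOIN store AS st ON st.store_id = s.store_id
    GROUP BY st.city
    ORDER BY st.city
""").fetchall()
# rows == [('Berlin', 7.0), ('Munich', 3.0)]
```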

16
Q

How are the data preparation steps (ETL) conducted in a warehouse?

A

Monitors discover and report changes in the data sources -> extractors select and transport data from the data sources into the staging area -> transformers standardize and integrate the data -> loaders insert the data from the staging area into the main warehouse
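
The extract, transform, and load stages can be sketched as three small functions. The function names, the row layout, and the use of plain lists as staging area and warehouse are assumptions for illustration; the monitoring step is omitted.

```python
def extract(source_rows):
    """Extractor: copy selected rows from a source into the staging area."""
    return [dict(row) for row in source_rows]


def transform(staged):
    """Transformer: standardize field formats so sources can be integrated."""
    return [
        {"product": row["product"].strip().lower(), "price": float(row["price"])}
        for row in staged
    ]


def load(warehouse, rows):
    """Loader: insert the prepared rows into the main warehouse."""
    warehouse.extend(rows)


source = [{"product": " Pen ", "price": "2.50"}]
warehouse = []
load(warehouse, transform(extract(source)))
# warehouse == [{'product': 'pen', 'price': 2.5}]
```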

17
Q

What are the approaches to monitor data changes?

A

Log-based: the DBMS writes information about updates into its transaction log, and this log is analyzed to extract the change data
Trigger-based: database triggers are used to gather change data
Replication middleware: uses the above approaches
Audit columns: timestamp-based
Snapshot differentials: compare the current state of the data source with an earlier snapshot (expensive)
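
A minimal sketch of a snapshot differential, assuming each snapshot is a dictionary keyed by primary key (the `snapshot_diff` name and the sample rows are hypothetical). It has to touch every row of both snapshots, which is why the approach is expensive.

```python
def snapshot_diff(old, new):
    """Compare two snapshots keyed by primary key."""
    inserted = {k: new[k] for k in new.keys() - old.keys()}
    deleted = {k: old[k] for k in old.keys() - new.keys()}
    updated = {k: new[k] for k in old.keys() & new.keys() if old[k] != new[k]}
    return inserted, deleted, updated


old = {1: "pen", 2: "ink"}
new = {2: "blue ink", 3: "paper"}
inserted, deleted, updated = snapshot_diff(old, new)
# inserted == {3: 'paper'}, deleted == {1: 'pen'}, updated == {2: 'blue ink'}
```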