Chapter 3: Replication, CAP Flashcards Preview

Flashcards in Chapter 3: Replication, CAP Deck (20):

Where to Place the Replicas?

Within a cloud data center, replica placement is often done with respect to the network hierarchy
- One replica on another machine
- Guards against individual node failures
- One replica on another rack
- Guards against outages of the rack switch
On a global scale, replicas are often distributed with regard to the client locations
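
A minimal sketch of the hierarchy-aware placement described above, assuming a simple node-to-rack map. The function name `place_replicas` and the node and rack names are hypothetical and serve only as illustration:

```python
from typing import Dict, List


def place_replicas(primary: str, rack_of: Dict[str, str]) -> List[str]:
    """Return the primary plus one same-rack and one other-rack replica node."""
    primary_rack = rack_of[primary]
    same_rack = sorted(n for n, r in rack_of.items()
                       if r == primary_rack and n != primary)
    other_rack = sorted(n for n, r in rack_of.items() if r != primary_rack)

    placement = [primary]
    if same_rack:
        placement.append(same_rack[0])   # guards against individual node failures
    if other_rack:
        placement.append(other_rack[0])  # guards against a rack switch outage
    return placement


# Hypothetical cluster layout: two racks with two nodes each.
RACK_OF = {"n1": "rackA", "n2": "rackA", "n3": "rackB", "n4": "rackB"}
placement = place_replicas("n1", RACK_OF)
print(placement)
```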


Who can create the copies for Replication?

Server-initiated replication
- Copies are created by the server when the popularity of a data item increases
- Mainly used to reduce server load
- Server decides among a set of replica servers
Client-initiated replication
- Also known as client caches
- Replica is created as a result of the client's request
- Server has no control of cached copy anymore
- Stale replicas handled by expiration date
- Traditional examples: Web proxies
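
A client cache with an expiration date could look roughly like the sketch below. The class `ExpiringCache`, its parameters, and the injected clock are assumptions made for illustration, not part of any particular proxy or browser implementation:

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringCache:
    """Client-side replica store; entries become stale after ttl_seconds."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # how to ask the origin server
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo needs no sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]          # cached copy has not expired yet
        value = self._fetch(key)     # expired or missing: ask the server again
        self._entries[key] = (value, now + self._ttl)
        return value


# Demo with a fake origin server (a dict) and a fake clock.
origin = {"page": "v1"}
now = [0.0]
cache = ExpiringCache(fetch=origin.__getitem__, ttl_seconds=10,
                      clock=lambda: now[0])

first = cache.get("page")
origin["page"] = "v2"       # the server changes the data; the cache is not told
stale = cache.get("page")   # the server has no control over the cached copy
now[0] = 11.0               # the expiration date has passed
fresh = cache.get("page")
```

The middle read shows the price of client-initiated replication: until the entry expires, the client keeps seeing the old value.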


What happens to the replicas when the data changes? (Options 1-3)

1. Invalidation protocols
- Inform replica servers that their replica is now invalid
- Good when many updates and few reads
2. Transferring the modified data among the servers
- Each server immediately receives latest version
- Good when few updates and many reads
3. Don't send modified data, but modification commands
- Good when commands substantially smaller than data
- Assumes that servers are able to apply commands
- Beneficial when network bandwidth is scarce
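
The three options can be compared side by side in a small sketch. The `Replica` class and its method names are hypothetical; a real system would send these as network messages:

```python
class Replica:
    """Holds one copy of a data item (here: a single integer)."""

    def __init__(self, value):
        self.value = value
        self.valid = True

    def invalidate(self):
        # Option 1: only learn that the copy is outdated.
        self.valid = False

    def receive_data(self, value):
        # Option 2: receive the complete new version.
        self.value = value
        self.valid = True

    def apply_command(self, command):
        # Option 3: receive the (small) operation and re-execute it locally.
        self.value = command(self.value)
        self.valid = True


r1, r2, r3 = Replica(10), Replica(10), Replica(10)

# The primary changes the item from 10 to 15.
r1.invalidate()                    # tiny message, value fetched later if read
r2.receive_data(15)                # message size grows with the data size
r3.apply_command(lambda v: v + 5)  # message size grows with the command size
```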


Explain Push- and Pull- based updates

Push-based updates (server-based protocols)
- Server pushes updates to replica servers
- Mostly used in server-initiated replica setups
- Used when high degree of consistency is needed
Pull-based updates (client-based protocols)
- Clients request updates from server
- Often used by client caches
- Good when read-to-update ratio is low
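
As a rough sketch of the difference, the server below supports both styles. All class and attribute names are made up for this example; `messages_sent` simply counts one message per push or poll:

```python
class Server:
    def __init__(self):
        self.value, self.version = None, 0
        self.push_targets = []   # push: the server must keep track of replicas
        self.messages_sent = 0

    def write(self, value):
        self.value, self.version = value, self.version + 1
        for replica in self.push_targets:
            replica.store(self.value, self.version)  # push every update
            self.messages_sent += 1

    def handle_poll(self, known_version):
        self.messages_sent += 1
        if known_version < self.version:
            return self.value, self.version
        return None              # the polling replica is already up to date


class Replica:
    def __init__(self):
        self.value, self.version = None, 0

    def store(self, value, version):
        self.value, self.version = value, version

    def pull(self, server):
        update = server.handle_poll(self.version)
        if update is not None:
            self.store(*update)


server = Server()
pushed, pulled = Replica(), Replica()
server.push_targets.append(pushed)

server.write("a")
server.write("b")
pulled_before_poll = pulled.value  # pull replica has seen nothing so far
pulled.pull(server)                # one poll fetches only the latest version
```

After two writes the push replica received two messages, while the pull replica needed a single poll, which is why pulling pays off when updates are frequent relative to reads.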


What are the issues of push- and pull-based updates?

State of server
Messages sent
Response time at client


Which two Views On Consistency exist?

Data-centric consistency models
- Talk about consistency from a global perspective
- Provide guarantees on how a sequence of read/write operations is perceived by multiple clients
Client-centric consistency models
- Talk about consistency from the client's perspective
- Provide guarantees on how a single client perceives the state of a replicated data item


Two groups of Data-Centric Consistency Models

Strong consistency models
- Operations on shared data are synchronized
- Strict consistency (related to time)
- Sequential consistency (what we are used to)
- Causal consistency (maintains only causal relations)
Weak consistency models
- Synchronization only when data is locked/unlocked
- General weak consistency
- Release consistency
- Entry consistency


5 Client-Centric Consistency Models (Client-Centric)

1. Eventual Consistency
2. Monotonic Reads
3. Monotonic Writes
4. Read Your Writes
5. Writes Follow Reads


Characteristics of Eventual Consistency (Client-Centric)

- In the absence of new updates, all replicas will eventually reach the most recent state
- Client can read from any replica
- Most commonly used model among big cloud providers
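
One common way to reach convergence is pairwise state exchange with a last-writer-wins rule. The sketch below assumes that technique (it is not named in the deck) and uses hypothetical names and integer timestamps:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    value: Optional[str] = None
    timestamp: int = 0

    def write(self, value: Optional[str], timestamp: int) -> None:
        # Last-writer-wins: keep only the write with the highest timestamp.
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp


def anti_entropy(a: Replica, b: Replica) -> None:
    """Exchange state so that both replicas hold the newer write."""
    a.write(b.value, b.timestamp)
    b.write(a.value, a.timestamp)


r1, r2, r3 = Replica(), Replica(), Replica()
r1.write("x", 1)
r3.write("y", 2)
stale_read = r2.value        # a client reading r2 now still sees nothing

# Updates stop; background synchronization continues.
anti_entropy(r1, r2)
anti_entropy(r2, r3)
anti_entropy(r1, r2)
values = [r1.value, r2.value, r3.value]
```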


Characteristics of Monotonic Reads (Client-Centric)

The client always reads from servers that have already applied all writes the client has previously read, so successive reads never return an older state.
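
A possible client-side check for this guarantee, assuming every replica exposes a version number that grows with each applied write. The names `MonotonicReadSession` and `last_seen_version` are illustrative only:

```python
class Replica:
    def __init__(self, version, value):
        self.version = version   # number of writes applied so far (assumed)
        self.value = value


class MonotonicReadSession:
    """Remembers the newest version this client has read."""

    def __init__(self):
        self.last_seen_version = 0

    def read(self, replica):
        if replica.version < self.last_seen_version:
            # This replica lacks writes the client has already observed.
            raise RuntimeError("replica is behind this session's earlier reads")
        self.last_seen_version = replica.version
        return replica.value


fresh = Replica(version=2, value="new")
lagging = Replica(version=1, value="old")

session = MonotonicReadSession()
first = session.read(fresh)
try:
    session.read(lagging)
    rejected = False
except RuntimeError:
    rejected = True
```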


Characteristics of Monotonic Writes (Client-Centric)

The client can only write to servers on which its previous writes have already been completed.


Characteristics of Read Your Writes (Client-Centric)

All write operations of a client will always be seen by a successive read operation of the same client.
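
A sketch of how a client session might enforce this, assuming versioned replicas and a session that remembers the version of its last write. All names are hypothetical:

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        if version > self.version:
            self.version, self.value = version, value


class Session:
    """Tracks the client's latest write and skips replicas that lack it."""

    def __init__(self):
        self.last_write_version = 0

    def write(self, replica, value):
        self.last_write_version = replica.version + 1
        replica.apply(self.last_write_version, value)

    def read(self, replicas):
        for replica in replicas:
            if replica.version >= self.last_write_version:
                return replica.value
        raise RuntimeError("no replica has this session's writes yet")


a, b = Replica(), Replica()
session = Session()
session.write(a, "x")

# b has not received the write yet, so the session falls through to a.
before_sync = session.read([b, a])

b.apply(1, "x")                  # replication catches up
after_sync = session.read([b, a])
```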


Characteristics of Writes Follow Reads (Client-Centric)

A write operation that follows a read operation by the same client is only performed on replicas that have already applied all writes observed by that read.


General Remark on Consistency

In general, the stricter the consistency model, the more it impacts the scalability of a system
- More consistency requires more synchronization
- While the data is synchronized, some client requests may not be answered
- Databases of the 80s and 90s put strong emphasis on consistency, lived with limited scalability/availability
- Today's cloud databases often sacrifice consistency in favor of scalability and availability


Explain Brewer's CAP Theorem

In a distributed system, it is impossible to provide all three of the following guarantees at the same time:
- Consistency: Write to one node, read from another node will return something no older than what was written
- Availability: Non-failing node will send proper response (no error or timeout)
- Partition tolerance: Keep promise of either consistency or availability in case of network partition
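
The trade-off during a partition can be illustrated with a toy two-node register. The mode labels "CP" and "AP" and the class below are assumptions for this sketch, not a model of any real database:

```python
class Node:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (illustrative labels)
        self.value = "v1"
        self.peer = None
        self.partitioned = False

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent: refuse what cannot be replicated.
                raise TimeoutError("peer unreachable, write refused")
            self.value = value    # AP: accept locally, the peer goes stale
            return
        self.value = value
        self.peer.value = value   # normal case: replicate synchronously

    def read(self):
        return self.value


def run(mode):
    a, b = Node(mode), Node(mode)
    a.peer, b.peer = b, a
    a.partitioned = b.partitioned = True   # the network link fails
    try:
        a.write("v2")
        accepted = True
    except TimeoutError:
        accepted = False
    return accepted, a.read(), b.read()


cp_result = run("CP")   # write rejected, both nodes still agree
ap_result = run("AP")   # write accepted, the nodes now disagree
```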


What does ACID stand for in the context of databases?

Atomicity - a transaction is executed completely or not at all
Consistency - a transaction preserves the consistency of the database
Isolation - concurrent transactions are separated from each other
Durability - committed changes are permanent


Why Replication and not only Partitioning?

Partitioning helps with scalability but not availability
- Each data item is only stored in one location
- If that machine goes down, the data is unavailable
Replication can improve both scalability and availability!
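
A small sketch of the argument, assuming hash partitioning and replicas placed on the following nodes in a ring. The function names and the key are hypothetical:

```python
import hashlib


def partition_of(key, num_partitions):
    """Map a key to a partition with a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replica_nodes(key, nodes, replication_factor):
    """Home node of the key plus the next nodes in the ring."""
    start = partition_of(key, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def is_available(key, nodes, replication_factor, failed):
    holders = replica_nodes(key, nodes, replication_factor)
    return any(node not in failed for node in holders)


nodes = ["n0", "n1", "n2"]
key = "user:42"
home = replica_nodes(key, nodes, 1)[0]
failed = {home}                      # the key's home node goes down

partition_only = is_available(key, nodes, 1, failed)
with_replication = is_available(key, nodes, 2, failed)
```

With a replication factor of 1 the key is lost as soon as its home node fails; with a factor of 2 a second node can still answer.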


Characteristics of Strict Consistency (Data-Centric)

Any read operation returns the value stored by the most recent write operation.


Characteristics of Sequential Consistency (Data-Centric)

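The result of any execution is the same as if the operations of all clients were executed in some single sequential order, and the operations of each client appear in that sequence in the order issued by the client.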
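
For tiny histories this can be checked by brute force: search for one interleaving that keeps each client's own order and in which every read returns the latest written value. The sketch assumes a single shared data item and a hypothetical encoding of operations as ("w", value) or ("r", value):

```python
def sequentially_consistent(histories, initial=None):
    """Return True if some legal interleaving of the histories exists."""

    def search(positions, current):
        if all(p == len(h) for p, h in zip(positions, histories)):
            return True                      # every operation was scheduled
        for i, history in enumerate(histories):
            p = positions[i]
            if p == len(history):
                continue
            kind, value = history[p]
            if kind == "r" and value != current:
                continue                     # this read cannot come next
            advanced = positions[:i] + (p + 1,) + positions[i + 1:]
            if search(advanced, value if kind == "w" else current):
                return True
        return False

    return search((0,) * len(histories), initial)


# Two writers; both readers see the writes in the same order.
ok = sequentially_consistent([
    [("w", 1)], [("w", 2)],
    [("r", 1), ("r", 2)],
    [("r", 1), ("r", 2)],
])

# The readers disagree on the order of the two writes.
not_ok = sequentially_consistent([
    [("w", 1)], [("w", 2)],
    [("r", 1), ("r", 2)],
    [("r", 2), ("r", 1)],
])
```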


Characteristics of Causal Consistency (Data-Centric)

Writes that are potentially causally related must be seen by all clients in the same order. Concurrent writes may be seen in a different order by different clients.
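
Whether two writes are causally related or concurrent is often decided with vector clocks. The sketch below assumes that technique (the deck does not name one) and represents a clock as a dict from a hypothetical process name to a counter:

```python
def happened_before(a, b):
    """True if the write with clock a causally precedes the one with clock b."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def concurrent(a, b):
    """True if neither write causally precedes the other."""
    keys = set(a) | set(b)
    differ = any(a.get(k, 0) != b.get(k, 0) for k in keys)
    return differ and not happened_before(a, b) and not happened_before(b, a)


w1 = {"p1": 1}             # p1 writes
w2 = {"p1": 1, "p2": 1}    # p2 read w1, then wrote: depends on w1
w3 = {"p3": 1}             # p3 writes without having seen anything

ordered = happened_before(w1, w2)   # every client must see w1 before w2
free = concurrent(w2, w3)           # clients may see w2 and w3 in any order
```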