Foundational Distributed System Concepts -- Availability and Reliability Flashcards

Question

What is Quorum-Based Replication (often part of leaderless or multi-master)?

Answer 1

It is NOT a replication strategy in itself but a consistency mechanism, often employed within leaderless (and sometimes multi-master) systems, to guarantee a chosen level of consistency and fault tolerance.

Answer 2

Key Concepts:
N: Total number of replicas.
W (Write Quorum): Minimum number of replicas that must acknowledge a write operation for it to be considered successful.
R (Read Quorum): Minimum number of replicas that must respond to a read request.
Consistency Guarantees with Quorums: For a read to be guaranteed to see the most recent acknowledged write (the basis for strong guarantees such as read-your-writes), the following condition must hold: W+R>N. This condition ensures that the write quorum and the read quorum always share at least one replica, so any read contacts at least one replica that holds the most recent successful write. Overlap alone does not give full linearizability when writes are concurrent or partially fail; systems typically add versioning and read repair on top.
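
The overlap condition can be checked by brute force for small clusters. The sketch below is illustrative only: the function names `quorums_overlap` and `always_intersect` are made up for this card and do not come from any particular database. It compares the W+R>N rule against an exhaustive check of every possible write and read quorum.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Apply the W + R > N rule from the card."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return w + r > n


def always_intersect(n: int, w: int, r: int) -> bool:
    """Exhaustively check that every write quorum shares a replica with every read quorum."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    for w, r in [(2, 2), (1, 1), (3, 1)]:
        print(f"N=3 W={w} R={r} overlap={quorums_overlap(3, w, r)}")
```

For every small N, the rule and the exhaustive check agree, which is the pigeonhole argument behind the condition.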

Answer 3

Writes: A client sends a write request to multiple nodes. The write is successful if W nodes acknowledge it.
Reads: A client sends a read request to multiple nodes. It collects responses from R nodes and then typically returns the most recent version (often determined by a timestamp or version vector). If older versions are found, a "read repair" mechanism might update the stale replicas in the background.
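
The write and read paths described on this card can be sketched as a tiny in-memory simulation. Everything here is an assumption made for illustration: the `Replica` class, the integer version counter standing in for a timestamp or version vector, and read repair performed inline rather than in the background.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    up: bool = True
    version: int = 0               # simple counter standing in for a timestamp
    value: Optional[str] = None

    def store(self, version: int, value: Optional[str]) -> None:
        # Only accept data that is newer than what this replica already holds.
        if version > self.version:
            self.version, self.value = version, value


def quorum_write(replicas: List[Replica], w: int, version: int, value: str) -> bool:
    """Send the write to every reachable replica; succeed if at least W acknowledge."""
    acks = 0
    for rep in replicas:
        if rep.up:
            rep.store(version, value)
            acks += 1
    # A failed write may still have landed on fewer than W replicas.
    return acks >= w


def quorum_read(replicas: List[Replica], r: int) -> Optional[str]:
    """Collect R responses, return the newest value, and repair stale responders."""
    responders = [rep for rep in replicas if rep.up][:r]
    if len(responders) < r:
        raise RuntimeError("read quorum not reached")
    newest = max(responders, key=lambda rep: rep.version)
    for rep in responders:
        if rep.version < newest.version:
            rep.store(newest.version, newest.value)  # read repair (inline in this sketch)
    return newest.value


if __name__ == "__main__":
    nodes = [Replica(), Replica(), Replica()]
    nodes[2].up = False                       # one replica misses the write
    print(quorum_write(nodes, w=2, version=1, value="a"))
    nodes[2].up, nodes[0].up = True, False    # a different replica is down for the read
    print(quorum_read(nodes, r=2))
```

With N=3, W=2, R=2 the read still finds the written value even though a different node is down, and the stale replica is brought up to date as a side effect.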

Answer 4

Tunable Consistency: By adjusting W and R, you can tune the balance between consistency, availability, and performance.
Strong Consistency: If W+R>N, read and write quorums overlap, which is the basis for strong consistency.
Eventual Consistency: If W+R≤N, you get eventual consistency. For example, with W=1 and R=1 you prioritize availability and low latency, but reads may return stale data until the replicas converge.
High Availability: The system can tolerate up to N−W node failures for writes and N−R node failures for reads while maintaining availability.
Fault Tolerance: No single point of failure.
Scalability: Can scale horizontally by adding more replicas.
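
As a worked example of the tuning arithmetic on this card, the sketch below tabulates a few (N, W, R) settings. The helper name `quorum_profile` and its dictionary keys are hypothetical, chosen only for this illustration.

```python
def quorum_profile(n: int, w: int, r: int) -> dict:
    """Summarize what a given (N, W, R) setting implies, per the card's formulas."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return {
        "overlapping": w + r > n,            # quorums intersect (strong-consistency condition)
        "write_failures_tolerated": n - w,   # nodes that may be down while writes still succeed
        "read_failures_tolerated": n - r,    # nodes that may be down while reads still succeed
    }


if __name__ == "__main__":
    for w, r in [(1, 1), (2, 2), (3, 1)]:
        print(f"N=3 W={w} R={r} ->", quorum_profile(3, w, r))
```

With N=3, the setting W=2, R=2 keeps the quorums overlapping while tolerating one failed node on either path. W=3, R=1 also overlaps, but writes stop as soon as any node is down.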

Answer 5

Increased Latency: Reads and writes can incur higher latency because each operation must wait for responses from multiple nodes.
Complexity: Implementing quorum logic, conflict resolution (if not strongly consistent), and read repair can be complex.
Consistency Tradeoffs: While tunable, achieving strong consistency (e.g., W=N, R=1 or W=majority, R=majority) can reduce availability and performance compared to weaker consistency models.
Conflict Resolution (if W+R≤N): If the quorum condition for strong consistency isn't met, conflicting versions are possible and must be resolved.

Answer 6

Consistency vs. Performance/Availability: Direct and explicit control over this tradeoff by adjusting quorum sizes. Complexity: Higher implementation and operational complexity than simple asynchronous leader-follower.

Answer 7

In leaderless replication (also known as "Dynamo-style replication" after Amazon's Dynamo), there is no designated leader node. Any replica can accept read and write requests directly from clients. Consistency is typically managed using quorum mechanisms and various conflict resolution techniques.

Answer 8

Databases like Cassandra and Riak, which are Dynamo-style leaderless systems. (Amazon's DynamoDB service, despite the name, is generally described as using single-leader replication per partition rather than the original Dynamo design.)
When fine-grained control over consistency guarantees is required.
Highly available systems that can tolerate eventual consistency or where strong consistency is only required for critical paths.
Systems where individual node failures are common and rapid recovery without leader election overhead is desired.

Answer 9

Writes: A client sends a write request to a coordinator node (which is often just the node the client connected to, and acts as a proxy). The coordinator then forwards the write to N replicas (where N is the replication factor). The write is considered successful if W of these replicas acknowledge it.
Reads: A client sends a read request to a coordinator node. The coordinator forwards the read to N replicas, collects responses from R replicas, resolves any conflicts (e.g., using version vectors, timestamps, or last-write-wins (LWW)), and returns the most recent version to the client.
Read Repair: If a read detects inconsistencies (e.g., a replica returning stale data), the coordinator updates the stale replicas in the background.
Hinted Handoff: If a replica is temporarily unavailable, the coordinator might send the write to another healthy node together with a hint naming the intended replica (a "hinted handoff"). That node delivers the write to the original replica once it comes back online.
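
Hinted handoff can be sketched as follows. This is a minimal illustration under stated assumptions, not how any specific database implements it: `Node`, `coordinate_write`, and `replay_hints` are hypothetical names, hints are kept in a plain list, and hinted writes do not count toward W here (some systems do count them, which is usually called a sloppy quorum).

```python
from typing import Dict, List, Tuple


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: Dict[str, str] = {}
        # Writes held on behalf of unavailable replicas: (target name, key, value).
        self.hints: List[Tuple[str, str, str]] = []


def coordinate_write(replicas: List[Node], fallbacks: List[Node],
                     w: int, key: str, value: str) -> bool:
    """Forward a write to all replicas; park a hint on a fallback for each one that is down."""
    acks = 0
    for node in replicas:
        if node.up:
            node.data[key] = value
            acks += 1
        else:
            holder = next((f for f in fallbacks if f.up), None)
            if holder is not None:
                holder.hints.append((node.name, key, value))
    return acks >= w  # assumption: hints do not count toward the write quorum


def replay_hints(holder: Node, nodes_by_name: Dict[str, Node]) -> None:
    """Deliver stored hints to any target that has come back online."""
    remaining = []
    for target_name, key, value in holder.hints:
        target = nodes_by_name[target_name]
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target_name, key, value))
    holder.hints = remaining


if __name__ == "__main__":
    a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
    c.up = False
    print(coordinate_write([a, b, c], [d], w=2, key="k", value="v"))
    c.up = True
    replay_hints(d, {"a": a, "b": b, "c": c})
    print(c.data)
```

The fallback node only holds the write temporarily. Once the intended replica is reachable again, replaying the hint brings it up to date without waiting for a later read repair.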

Answer 10

Extreme High Availability: No single point of failure or bottleneck. The system can continue operating as long as W nodes for writes or R nodes for reads are available.
Excellent Fault Tolerance: Very resilient to node failures and network partitions.
Horizontal Scalability: Easy to scale horizontally by adding more nodes.
Low Latency (for the client, if W and R are small): Clients can write to or read from the nearest available node, potentially minimizing latency, especially if W and R are set to less than N.

Answer 11

Eventual Consistency is Common: While strong consistency can be achieved with W+R>N, it often comes at the cost of higher latency. Most leaderless systems default to eventual consistency for performance and availability.
Complex Consistency Reasoning: Understanding the state of the system and debugging inconsistencies can be challenging for developers.
Conflict Resolution: Requires sophisticated conflict resolution mechanisms (version vectors, LWW, application-level resolution), which add complexity.
Read Repair Overhead: Background read repairs can add load to the cluster.
No Global Order: Without a single leader, it's harder to establish a global, linearizable order of operations, which can be an issue for certain types of applications (e.g., strict financial transactions requiring total ordering).
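
The two conflict-resolution mechanisms named on this card can be sketched briefly. The representations are assumptions made for illustration: a version vector is a plain dict of per-node counters, and an LWW record is a `(timestamp, value)` tuple. Real systems use richer encodings.

```python
from typing import Dict, Tuple


def compare_vectors(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Compare two version vectors; missing entries count as 0."""
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"   # neither descends from the other: a true conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def last_write_wins(a: Tuple[int, str], b: Tuple[int, str]) -> Tuple[int, str]:
    """Pick the record with the larger timestamp; ties fall back to comparing values."""
    return max(a, b)


if __name__ == "__main__":
    print(compare_vectors({"x": 2, "y": 1}, {"x": 1, "y": 1}))
    print(compare_vectors({"x": 2}, {"y": 1}))
    print(last_write_wins((1, "old"), (2, "new")))
```

Version vectors can report that two writes are concurrent and leave the merge to the application, whereas LWW always picks a winner and silently discards the other write.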

Answer 12

Consistency vs. Availability/Performance: Heavily biased towards Availability and Partition Tolerance over strong consistency. In CAP theorem terms, it is typically run as an AP system.
Operational Complexity: Can be complex to operate and monitor, especially when verifying consistency guarantees.

Answer 13

Massively scalable, highly available systems where eventual consistency is acceptable.
Use cases like product catalogs, user profiles, shopping carts, time-series data, or IoT data.
Systems that require writes to always be available, even during network partitions.
When you need to handle extremely high read and write throughput across many nodes.
Examples: Cassandra, Riak, and the original Amazon Dynamo.

Answer 14

CAP Theorem: How each strategy positions itself on the consistency-availability spectrum.
Leader-Follower (Synchronous): Favors C and P.
Leader-Follower (Asynchronous): Favors A and P, with eventual consistency.
Multi-Master: Favors A and P, with eventual consistency and conflict resolution.
Leaderless (Quorum): Usually favors A and P, but consistency is tunable, so a given configuration can lean towards CP or AP.

