CAP theorem & fallacies of distributed computing Flashcards
(11 cards)
What is the CAP theorem? Where does it apply?
A shared-data system can have at most two of the three following properties: Consistency, Availability, and tolerance to network Partitions.
It applies to distributed systems.
CAP: Discuss Consistency.
a system is consistent if an update is applied to all relevant nodes at the same logical time. Among other things, this means that standard database replication is not strongly consistent. As anyone whose read replicas have drifted from the master knows, special logic must be introduced to handle replication lag.
That said, consistency which is both instantaneous and global is impossible. The universe simply does not permit it. So the goal here is to push the time resolutions at which the consistency breaks down to a point where we no longer notice it.
CAP: Discuss Availability.
For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response. That is, any algorithm used by the service must eventually terminate …
CAP: Discuss Partition Tolerance.
if the network stops delivering messages between two sets of servers, will the system continue to work correctly?
For a distributed (i.e., multi-node) system to not require partition-tolerance it would have to run on a network which is guaranteed to never drop messages (or even deliver them late) and whose nodes are guaranteed to never die. You and I do not work with these types of systems because they don’t exist.
CAP: Discuss Choosing Consistency over Availability
If a system chooses to provide Consistency over Availability in the presence of partitions (again, read: failures), it will preserve the guarantees of its atomic reads and writes by refusing to respond to some requests. It may decide to shut down entirely (like the clients of a single-node data store), refuse writes (like Two-Phase Commit), or only respond to reads and writes for pieces of data whose “master” node is inside the partition component (like Membase).
This is perfectly reasonable. There are plenty of things (atomic counters, for one) which are made much easier (or even possible) by strongly consistent systems. They are a perfectly valid type of tool for satisfying a particular set of business requirements.
CAP: Discuss Choosing Availability Over Consistency
If a system chooses to provide Availability over Consistency in the presence of partitions (all together now: failures), it will respond to all requests, potentially returning stale reads and accepting conflicting writes. These inconsistencies are often resolved via causal ordering mechanisms like vector clocks and application-specific conflict resolution procedures. (Dynamo systems usually offer both of these; Cassandra’s hard-coded Last-Writer-Wins conflict resolution being the main exception.)
Again, this is perfectly reasonable. There are plenty of data models which are amenable to conflict resolution and for which stale reads are acceptable (ironically, many of these data models are in the financial industry) and for which unavailability results in massive bottom-line losses. (Amazon’s shopping cart system is the canonical example of a Dynamo model3).
CAP: Discuss Choosing Availability And Consistency
You cannot choose both consistency and availability in a distributed system.
CAP: Discuss Yield and Harvest.
We assume that clients make queries to servers, in which case there are at least two metrics for correct behavior: yield, which is the probability of completing a request, and harvest, which measures the fraction of the data reflected in the response, i.e. the completeness of the answer to the query.
Despite your best efforts, your system will experience enough faults that it will have to make a choice between reducing yield (i.e., stop answering requests) and reducing harvest (i.e., giving answers based on incomplete data). This decision should be based on business requirements.
Give the 8 Fallacies of distributed computing.
1) The network is reliable.
2) Latency is zero.
3) Bandwidth is infinite.
4) The network is secure.
5) Topology doesn’t change.
6) There is one administrator.
7) Transport cost is zero.
8) The network is homogeneous.
What is network topology?
Network topology is the arrangement of the various elements (links, nodes, etc.) of a computer network.
Essentially, it is the topological structure of a network, and may be depicted physically or logically.
Physical topology refers to the placement of the network’s various components, including device location and cable installation, while logical topology shows how data flows within a network, regardless of its physical design.
Distances between nodes, physical interconnections, transmission rates, and/or signal types may differ between two networks, yet their topologies may be identical.
What are the 5 effects of ignoring the fallacies of distributed computing?
1) Ignorance of network latency, and of the packet loss it can cause, induces application- and transport-layer developers to allow unbounded traffic, greatly increasing dropped packets and wasting bandwidth.
2) Complacency regarding network security results in being blindsided by malicious users and programs that continually adapt to security measures.
3) Multiple administrators, as with subnets for rival companies, may institute conflicting policies of which senders of network traffic must be aware in order to complete their desired paths.
4) The “hidden” costs of building and maintaining a network or subnet are non-negligible and must consequently be noted in budgets to avoid vast shortfalls.
5) Ignorance of bandwidth limits on the part of traffic senders can result in bottlenecks over frequency-multiplexed media.