622 final Flashcards
(156 cards)
replication pros
improves scalability
distribute requests among replicated data servers
improves robustness
if one data server fails, others may be available
reduces latency of access
(wide-area systems) place data close to where it’s needed
(same box) place frequently accessed data in fast media
replication cons
hard to maintain replica consistency
expensive in case of frequent updates
deciding on replica placement may not be trivial
harder in the case of dynamic placement
replication comes in many forms
-load balancing
data requests dynamically allocated to co-located replicas
Ex: database servers, web-search engines
-mirroring
requests directed to one of geographically dispersed replicas
Ex: web servers
caching: storing data at a convenient place
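A minimal sketch of the load-balancing form, assuming a simple round-robin policy (the cards do not say which policy is used). The class name `RoundRobinBalancer` and the replica names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out replicas in a fixed rotating order (hypothetical helper)."""

    def __init__(self, replicas):
        replicas = list(replicas)
        if not replicas:
            raise ValueError("need at least one replica")
        # cycle() repeats the replica list forever, one entry per request.
        self._ring = cycle(replicas)

    def pick(self):
        """Return the replica that should serve the next request."""
        return next(self._ring)


# Assumed replica names, for illustration only.
balancer = RoundRobinBalancer(["db-a", "db-b", "db-c"])
assignments = [balancer.pick() for _ in range(5)]
```

Each request goes to the next replica in turn, so load is spread evenly as long as requests cost roughly the same.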
error
unintended state of a component/system
fault
a component/system characteristic or property that may cause errors
kinds of faults
transient
occurs and disappears; unlikely to reoccur
intermittent
recurrently appears and disappears
permanent
persists until fixed
failure
external manifestation of an error; undesired behavior
fail-stop
stops doing anything (“crashes”) on failure & we can reliably detect its failure (e.g., announces failure)
fail arbitrary/byzantine
behaves arbitrarily on failure
fail-silent
crashes on failure, but we can’t distinguish crashes from communication link failures
fail-safe
may crash or produce other results on failure, but failure cannot do any harm
fail noisy
crashes on failure, we can eventually determine that it crashed (detection may be delayed)
information redundancy
adding extra bits so that errors may be detected and possibly recovered
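One possible illustration of information redundancy is a single even-parity bit. This is an assumed example, not taken from the cards; the function names are hypothetical. A single parity bit only detects an odd number of flipped bits; recovery needs more extra bits.

```python
def add_parity(bits):
    """Append one bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """True if the word (data bits plus parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0


word = add_parity([1, 0, 1, 1])   # data bits plus one redundant bit
corrupted = word.copy()
corrupted[2] ^= 1                 # flip one bit to simulate an error
```

`parity_ok(word)` holds for the original word, while `parity_ok(corrupted)` does not, so the single-bit error is detected.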
time redundancy
retry after detecting failure
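A sketch of time redundancy, assuming the fault is transient so that repeating the operation can succeed. The helper `retry` and the simulated operation `flaky` are hypothetical names.

```python
import time


def retry(operation, attempts=3, delay_s=0.0):
    """Call operation() up to `attempts` times; re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as exc:   # failure detected: remember it and retry
            last_error = exc
            time.sleep(delay_s)
    raise last_error


calls = {"n": 0}


def flaky():
    """Simulated operation that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient fault")
    return "ok"


result = retry(flaky)
```

Retrying helps with transient and some intermittent faults; a permanent fault fails on every attempt.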
physical redundancy
providing extra/replacement components
Triple modular redundancy
(LECTURE 7 SLIDE 20)
If A1 fails in fail-arbitrary way, OK
A2 & A3 outvote in each voter
If A2 & A3 fail in fail-silent way, OK
Only A1’s result received by voters, so it’s the only result used by voters
If A1 & V1 fail in fail-arbitrary way, OK
V2 & V3 will be correct (A1 is outvoted by A2 & A3); B1 will produce a bad value but will be outvoted by B2 & B3
V10 fails in any way – SYSTEM FAILURE
A1 & A2 fail-arbitrary – SYSTEM FAILURE POSSIBLE (outvote A3)
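The voter cases above can be sketched as a majority function. This assumes a voter takes the majority among the values it actually receives, with `None` standing for a fail-silent unit; the function name `vote` and the sample values are hypothetical.

```python
from collections import Counter


def vote(inputs):
    """Return the strict-majority value among received inputs, else None.

    None in `inputs` models a unit that produced nothing (fail-silent).
    """
    received = [v for v in inputs if v is not None]
    if not received:
        return None
    value, count = Counter(received).most_common(1)[0]
    # Require a strict majority of the values that arrived.
    return value if count * 2 > len(received) else None


correct = 7
one_arbitrary = vote([99, correct, correct])   # A1 fail-arbitrary
two_silent = vote([correct, None, None])       # A2 & A3 fail-silent
two_arbitrary = vote([99, 99, correct])        # A1 & A2 agree on a bad value
```

One arbitrary input is outvoted, a lone surviving input is used as is, and two arbitrary inputs that agree outvote the correct one.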
Byzantine faults
Byzantine fault = “anything can happen”
CAP (Brewer’s) theorem
It’s impossible for a distributed data store to simultaneously provide more than 2 of these 3 guarantees:
Consistency: Every read receives the most recent write or an error [not same as ACID definition]
Availability: Every request receives a response that is not an error
Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
In presence of (network) partition, must choose between consistency & availability
Consistency > availability:
ACID (Atomicity, Consistency, Isolation, Durability)
Traditional database systems
Availability > consistency:
Eventual consistency / optimistic replication: if no new updates are made to a given data item, eventually all accesses to it will return the last updated value (“converged”)
BASE (Basically Available, Soft state, Eventual consistency) semantics
Many “NoSQL” databases in this category
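A sketch of eventual consistency, assuming last-writer-wins conflict resolution with (logical time, writer id) stamps. That choice is an assumption for illustration, not something the cards specify; `Replica` and its methods are hypothetical names.

```python
class Replica:
    """Holds one data item; newer stamps overwrite older ones."""

    def __init__(self):
        self.value = None
        self.stamp = (0, "")   # (logical time, writer id)

    def write(self, value, time, writer):
        """Accept a local write without contacting other replicas."""
        self.merge(value, (time, writer))

    def merge(self, value, stamp):
        """Keep whichever update has the larger stamp."""
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp

    def sync_with(self, other):
        """Exchange state in both directions (anti-entropy step)."""
        other.merge(self.value, self.stamp)
        self.merge(other.value, other.stamp)


a, b = Replica(), Replica()
a.write("x", 1, "a")
b.write("y", 2, "b")
before = (a.value, b.value)   # replicas disagree while updates are in flight
a.sync_with(b)
after = (a.value, b.value)    # no new updates: both return the last write
```

Both writes are accepted immediately (availability), reads may disagree for a while, and once updates stop and replicas sync they converge.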
PACELC
PACELC extends CAP:
“if there is a partition (P), how does the system trade off availability and consistency (A and C) [per CAP];
else (E), when the system is running normally without partitions, how does the system trade off latency (L) and consistency (C)?”
PA/EL systems
if a partition occurs, prefer availability (giving up consistency), else prefer lower latency (also giving up consistency) – consistency always less important
PC/EC systems
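if a partition occurs, prefer consistency (giving up availability), else prefer consistency (giving up lower latency) – consistency always more important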
PA/EC systems
if a partition occurs, prefer availability (giving up consistency), else prefer consistency (giving up lower latency)
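The PACELC labels can be read as a two-branch decision. A small sketch, assuming the label is written as `"PA/EL"`, `"PC/EC"`, and so on; the function name `preferred` is hypothetical.

```python
def preferred(policy, partitioned):
    """Return the property a system with this PACELC label favors."""
    on_partition, otherwise = policy.split("/")
    if partitioned:
        # P branch: trade off availability (A) against consistency (C).
        return {"PA": "availability", "PC": "consistency"}[on_partition]
    # E branch: trade off latency (L) against consistency (C).
    return {"EL": "latency", "EC": "consistency"}[otherwise]


pa_el_partition = preferred("PA/EL", partitioned=True)
pa_el_normal = preferred("PA/EL", partitioned=False)
pa_ec_normal = preferred("PA/EC", partitioned=False)
```

A PA/EL system never favors consistency in either branch, while a PA/EC system favors it only when there is no partition.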