Lesson 1: Introduction to Distributed Systems Flashcards

Question 1

Q

What is a distributed system?

Answer

A

“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable” - Leslie Lamport

A collection of computing units that interact by exchanging messages via an interconnection network and appear to external users as a single coherent computing facility (e.g., there is a common goal that the system must accomplish and all system components must contribute to it)

Question 2

Q

Describe a system model of a distributed system

Answer

A

two or more nodes
connected via communication channels
send and receive messages
each node contributes to the overall system state

Nodes receive a message from a channel, take some time to act on it, send a message to (one or more) channels.

Actions are triggered in response to messages. The outcome of the actions is that the state at an individual node may change.

Messages spend some time in the channel and are then delivered zero or more times (ideally exactly once).
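The model above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names Channel, Node, on_message, and run_demo are hypothetical, delivery is reliable and in order, and "time in the channel" is represented by the gap between send and deliver.

```python
from collections import deque


class Channel:
    """A one-way channel that holds messages until they are delivered."""

    def __init__(self):
        self.in_flight = deque()

    def send(self, message):
        self.in_flight.append(message)

    def deliver(self):
        # Return the next message, or None if the channel is empty.
        return self.in_flight.popleft() if self.in_flight else None


class Node:
    """A node whose local state changes only in response to messages."""

    def __init__(self, name):
        self.name = name
        self.state = 0        # this node's contribution to the system state
        self.outgoing = []    # channels this node can send on

    def on_message(self, message):
        # Action triggered by a message: update local state, then forward.
        self.state += message
        for channel in self.outgoing:
            channel.send(message)


def run_demo():
    a, b = Node("a"), Node("b")
    a_to_b = Channel()
    a.outgoing.append(a_to_b)

    a.on_message(5)                # an external input arrives at node a
    delivered = a_to_b.deliver()   # the message leaves the channel
    if delivered is not None:
        b.on_message(delivered)
    return a.state, b.state
```

A real channel could also drop or duplicate a message, which is why the model says "zero or more times"; this sketch does not simulate that.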

Question 3

Q

What is unique and what is hard about distributed systems?

Answer

A

Asynchrony
- Instant vs bounded vs unpredictable vs infinite latency
- Implication on system design

Failures
- From “failstop” to transient to Byzantine/can’t tell
- Server/process vs network

Consistency (single up-to-date copy of data or agreement thereof)
- Concurrency, ordering
- Replication, caching
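One way to see why asynchrony and failures are hard together: with unbounded latency, a caller that waits for a fixed timeout cannot tell a slow node from a crashed one. The sketch below is a toy illustration; observe and its parameters are hypothetical names, and reply_latency stands for information the caller never actually has.

```python
def observe(reply_latency, timeout):
    """Return what a caller can conclude after waiting `timeout` seconds.

    reply_latency is the (hidden) time the reply would take, or None if
    the remote node has crashed and will never reply.
    """
    if reply_latency is not None and reply_latency <= timeout:
        return "reply received"
    return "no reply yet"


# A slow node and a crashed node look identical to the caller.
slow_node = observe(reply_latency=5.0, timeout=1.0)
crashed_node = observe(reply_latency=None, timeout=1.0)
```

Both calls produce the same observation, so any design that reacts to the timeout has to be safe in either case.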

Question 4

Q

Why are system models important in distributed systems?

Answer

A

Using system models and analyzing system behavior using models is important because otherwise we’d have to build prototypes and perform experimental validation under all scenarios.

Question 5

Q

What are models characterized by?

Answer

A

Elements and rules (e.g., a system is modeled as a collection of nodes, channels, etc.)
Invariants, aka assumptions (e.g., every message is delivered after some time -> assumes the network will not fail)

Question 6

Q

When is a model good enough?

Answer

A

Accurate: is it possible to learn some truths about the real system using the model?
Tractable: is analysis of a certain problem using the model even possible?

When defining or picking a model, does it:
- Allow us to demonstrate the problem?
- Allow us to prototype a solution to the problem in the context of the model?

Question 7

Q

What are the 8 fallacies of distributed computing?

Answer

A

The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn’t change
There is one administrator
Transport cost is zero
The network is homogeneous

Question 8

Q

What are desirable properties of a distributed system?

Answer

A

Fault-tolerant: it can recover from component failures without performing incorrect actions.

Highly available: it can restore operations, permitting it to resume providing services even when some components have failed.

Recoverable: failed components can restart themselves and rejoin the system, after the cause of failure has been repaired.

Consistent: the system can coordinate actions by multiple components, often in the presence of concurrency and failure. This underlies the ability of a distributed system to act like a non-distributed system.

Scalable: it can operate correctly even as some aspect of the system is scaled to a larger size.

Predictable performance: the ability to provide desired responsiveness in a timely manner.

Secure: the system authenticates access to data and services.

Question 9

Q

What makes a system “correct”?

Answer

A

A distributed system should behave as if it were a single coherent entity.

Given a set of inputs, the output of a distributed system and the output of a single coherent entity should be identical.

The output of a system reflects all nodes.
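As a rough illustration of this notion of correctness, the sketch below compares a single entity that sums its inputs with a hypothetical distributed version that splits the inputs across nodes and combines their partial results. The function names and the ignore_node parameter are assumptions made for the example, not part of the lesson.

```python
def single_entity(inputs):
    """Reference behavior: one entity processes every input."""
    return sum(inputs)


def distributed(inputs, num_nodes, ignore_node=None):
    """Split inputs round-robin across nodes and combine partial sums.

    ignore_node simulates an output that fails to reflect one node.
    """
    partials = [sum(inputs[i::num_nodes]) for i in range(num_nodes)]
    return sum(p for i, p in enumerate(partials) if i != ignore_node)


data = [3, 1, 4, 1, 5, 9]
matches = distributed(data, num_nodes=3) == single_entity(data)
missing_one = distributed(data, num_nodes=3, ignore_node=2)
```

When every node's contribution is included, the two outputs agree; when one node is left out, the distributed output no longer matches the single entity.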

Question 10

Q

What is the CAP Theorem?

Answer

A

The CAP Theorem says a distributed system can deliver only two of three desired characteristics: consistency, availability, and partition tolerance.

The CAP Theorem implies that when there’s a network partition then there’s an availability vs consistency tradeoff.

Slow response == no response
high latency == no availability
availability vs consistency == latency vs consistency
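The tradeoff during a partition can be sketched with a toy replica. This is a minimal sketch under stated assumptions: the Replica class, the "CP" and "AP" mode labels, and demo are hypothetical names, and the scenario assumes a newer value was written elsewhere that this replica never received.

```python
class Replica:
    """A replica that is cut off from the node holding the latest write."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP" (illustrative labels)
        self.value = "v1"           # last value this replica saw
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Prefer consistency: refuse rather than risk a stale answer.
            raise RuntimeError("unavailable: cannot confirm latest value")
        # Prefer availability: answer, even if the value may be stale.
        return self.value


def demo(mode):
    replica = Replica(mode)
    replica.partitioned = True      # assume "v2" was written elsewhere
    try:
        return replica.read()
    except RuntimeError:
        return "unavailable"
```

Here demo("CP") gives up availability to avoid returning stale data, while demo("AP") stays available and returns the possibly stale "v1".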

(10 cards)