Distributed Systems Flashcards

1
Q

What is a distributed system?

A

Software in which components are located on networked computers and can coordinate their actions by sending messages.

2
Q

What is the difference between a parallel system and a distributed system?

A

Parallel systems share memory, while distributed systems share no memory; their components communicate only by passing messages.

3
Q

Pros of a DS?

A

Scalability
Reduced latency
Fault tolerance
Mobility

3
Q

Characteristics of DS?

A

Each entity has its own memory, so distributed state needs to be synchronized
Entities communicate using message passing
Each entity maintains parts of the complete picture
Fault tolerant

4
Q

Challenges with a DS?

A
  1. Partial failure
  2. Unreliable networks
  3. Unreliable time
  4. No single source of truth
5
Q

What is a synchronous system?

A

Process execution speeds and message delivery times are bounded. This makes timed failure detection, time-based coordination, and worst-case performance guarantees possible.

6
Q

What is an asynchronous system?

A

There are no assumptions about process execution speeds or message delivery times.

7
Q

Why is waiting for a response in an asynchronous system ambiguous?

A

Because you cannot tell whether:
The request was lost
The remote node is down
The response was lost

The usual remedy is to set timeouts and retry until success.
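
The timeout-and-retry remedy can be sketched in Python as follows. This is a minimal illustration, not a production client: `call_with_retries` and `make_flaky_request` are hypothetical names, and the "remote call" is simulated locally by a function that times out a fixed number of times before answering.

```python
import time


def call_with_retries(request, attempts=3, timeout=0.5, backoff=0.0):
    """Call request(timeout) until it returns, retrying on TimeoutError."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return request(timeout)
        except TimeoutError as err:
            # The caller cannot tell whether the request was lost, the node
            # is down, or the response was lost, so it simply tries again.
            last_error = err
            time.sleep(backoff * attempt)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


def make_flaky_request(failures):
    """Hypothetical stand-in for a remote call: times out `failures` times."""
    state = {"calls": 0}

    def request(timeout):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError(f"no response within {timeout}s")
        return "ok"

    return request, state


request, state = make_flaky_request(failures=2)
reply = call_with_retries(request, attempts=5)
```

Because a lost response looks the same as a lost request, a retry may execute the same request twice on the remote node, so retried operations should be safe to repeat.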

8
Q

What are the two ways to keep time?

A

Real-time clocks (RTCs), which are kept in sync with centralized servers via NTP.
Monotonic clocks, which only move forward.

9
Q

Why are monotonic clocks useful and who maintains them?

A

They are maintained by the OS, and are helpful for maintaining order within a node.
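
As a small illustration in Python: `time.monotonic()` reads the OS monotonic clock, which never goes backwards, while `time.time()` reads the wall clock, which can jump when the system time is adjusted. The helper name `timed` is hypothetical; the sketch measures a duration within one node, which is the kind of local ordering a monotonic clock is suited for.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.monotonic()           # unaffected by wall-clock adjustments
    result = fn(*args)
    elapsed = time.monotonic() - start  # never negative on a single node
    return result, elapsed


result, elapsed = timed(sum, range(1000))
```

Monotonic readings are only meaningful relative to other readings on the same node; they cannot be compared across machines.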

10
Q

Why are RTCs useful?

A

They can synchronize time across nodes, but only with an accuracy of milliseconds, whereas modern CPUs can perform millions of operations per millisecond.

11
Q

External ordering schemes?

A

Total order: Message rate is globally bounded, synchronized RTCs guarantee order
Causal order: Relies on the happens-before relationship

12
Q

When do we call events concurrent?

A

When they don’t have a happens-before relationship.

13
Q

What are Lamport timestamps?

A

Each process p maintains a counter LT(p)
* p performs action, increments LT(p)
* p sends a message, includes LT(p)
* p receives a message from q: LT(p) = max(LT(p), LT(q)) + 1

For two events a and b, if a -> b then LT(a) < LT(b). The converse does not hold: LT(a) < LT(b) does not imply a -> b.
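
The rules above can be sketched as a small Python class. The class and method names are hypothetical, and treating a send as an action that increments the counter is an assumption of this sketch.

```python
class LamportClock:
    """Counter LT(p) for a single process p."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # p performs an action: increment LT(p).
        self.time += 1
        return self.time

    def send(self):
        # Assumption: sending counts as an action, so tick before attaching LT(p).
        self.time += 1
        return self.time

    def receive(self, sender_time):
        # LT(p) = max(LT(p), LT(q)) + 1
        self.time = max(self.time, sender_time) + 1
        return self.time


p, q, r = LamportClock(), LamportClock(), LamportClock()
a = p.local_event()      # LT = 1 on p
msg = p.send()           # LT = 2 on p, carried in the message
b = q.receive(msg)       # LT = max(0, 2) + 1 = 3 on q
c = r.local_event()      # LT = 1 on r, unrelated to p and q
```

Here a -> b and LT(a) < LT(b), as guaranteed. The event c on r also has a smaller timestamp than b, yet no message connects r to q, which shows why a smaller timestamp alone does not imply happens-before.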

14
Q

What are vector clocks?

A

On a system of N nodes, each node i maintains a vector Vi of size N. As such:
* Vi[i] is the number of events that occurred in node i
* Vi[j] is the number of events that node i knows occurred at node j

15
Q

How do VC updates work?

A
  • Local events at node i increment Vi[i]
  • When node i sends a message to node j, it includes Vi
  • When node j receives Vi, it updates every element of Vj to Vj[x] = max(Vi[x], Vj[x])
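
A minimal Python sketch of these three update rules, assuming nodes are identified by indices 0..N-1. The class and method names are hypothetical. Some formulations also increment the node's own entry on send and on receive; this sketch applies only the rules listed on the card.

```python
class VectorClock:
    """Vector Vi kept by node i in a system of n nodes."""

    def __init__(self, node_id, n):
        self.node_id = node_id
        self.v = [0] * n

    def local_event(self):
        # A local event at node i increments Vi[i].
        self.v[self.node_id] += 1

    def send(self):
        # The message carries a copy of Vi.
        return list(self.v)

    def receive(self, other):
        # Element-wise maximum: Vj[x] = max(Vi[x], Vj[x]).
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, other)]


node0 = VectorClock(0, 3)
node1 = VectorClock(1, 3)
node0.local_event()          # node0.v == [1, 0, 0]
message = node0.send()
node1.local_event()          # node1.v == [0, 1, 0]
node1.receive(message)       # node1.v == [1, 1, 0]
```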
16
Q

What are the guarantees of VCs?

A
  • If a -> b then VC(a) < VC(b)
  • If VC(a) < VC(b) then a -> b
  • If VC(a) < VC(b) then RT(a) < RT(b)
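
The comparison VC(a) < VC(b) is not defined on the card. The Python sketch below assumes the usual definition: every entry of VC(a) is less than or equal to the matching entry of VC(b), and at least one entry is strictly smaller. The function names are hypothetical.

```python
def vc_less(a, b):
    """True if vector a is strictly before vector b (assumed definition)."""
    all_leq = all(x <= y for x, y in zip(a, b))
    return all_leq and list(a) != list(b)


def concurrent(a, b):
    """True if neither vector is before the other and they differ."""
    return list(a) != list(b) and not vc_less(a, b) and not vc_less(b, a)


before = vc_less([1, 0, 0], [1, 1, 0])       # ordered: first happened before second
unordered = concurrent([1, 0, 0], [0, 1, 0])  # neither dominates: concurrent
```

Unlike Lamport timestamps, the comparison works in both directions, so two events whose vectors are incomparable can be identified as concurrent.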