Distributed Architectures Flashcards by Matthew Gilbert

What are the two types of scaling?

Vertical and Horizontal

What is Vertical Scaling?

Increasing the capacity of a single machine, for example by adding CPU, memory, or storage

What is Horizontal Scaling?

Adding more instances or servers to distribute the workload across multiple machines

What is replication?

A copy of the same data is kept on each node

What is partitioning?

Different nodes hold different parts of the data; each node stores a subset

What are the advantages of replication?

Reduced latency, as data can be kept geographically close to users
Increased availability
Increased read performance, as more machines are available to serve queries

What are the disadvantages of replication?

The data has to be small enough to be stored on a single machine.
Write operations involve updating multiple copies of the same data
Updates need to be made on all nodes; if a write fails on some of them, the copies diverge and data reliability can be affected

How are nodes organised?

One node is designated the leader and the rest are followers. The leader has the most recent data, and all write operations are sent to it.

How are read and write operations handled with the leader/follower dynamic?

All write operations are sent to the leader, which applies them and then sends the updates to the followers.

Read queries can be sent to any node, which spreads the read load and can make reads quicker.
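
The write and read paths described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real replication protocol: the `Node` and `Leader` classes and the key `"user:1"` are hypothetical names chosen for the example, and the updates are forwarded immediately with no network or failure handling.

```python
class Node:
    """A replica that holds a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Leader(Node):
    """The single node that accepts writes (hypothetical sketch)."""

    def __init__(self, name, followers):
        super().__init__(name)
        self.followers = followers

    def write(self, key, value):
        # The leader applies the write to its own copy first...
        self.data[key] = value
        # ...then sends the update to every follower.
        for follower in self.followers:
            follower.data[key] = value


followers = [Node("f1"), Node("f2")]
leader = Leader("leader", followers)
leader.write("user:1", "Ada")

# A read can be served by any node, leader or follower.
values = [node.data["user:1"] for node in [leader, *followers]]
```

In this sketch every node returns the same value because the followers are updated before `write` returns; a real system would have to deal with delayed or failed follower updates.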

How is Synchronous Replication used?

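When the leader sends a write to its followers, it waits for confirmation before reporting the write as successful. In practice only a subset of the followers needs to confirm completion (often called semi-synchronous replication). If all were required, a small issue on any one follower could stop the whole system.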
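
The trade-off can be illustrated with a small sketch. The function `write_succeeds` and its `required_acks` parameter are assumptions made for this example, not part of any particular database; each boolean stands for whether a follower confirmed the write.

```python
def write_succeeds(follower_acks, required_acks):
    """Return True if enough followers confirmed the write.

    follower_acks: list of booleans, True if that follower confirmed.
    required_acks: number of confirmations the leader waits for
                   (hypothetical parameter for this sketch).
    """
    confirmed = sum(1 for ack in follower_acks if ack)
    return confirmed >= required_acks


follower_acks = [True, True, False]  # one follower is unreachable

# Requiring every follower: a single failure blocks the write.
all_required = write_succeeds(follower_acks, required_acks=len(follower_acks))

# Requiring only a subset: the write still goes through.
subset_required = write_succeeds(follower_acks, required_acks=1)
```

With one follower down, the write is rejected when all confirmations are required and accepted when only one is.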

What is a partitioned database?

A database that stores different subsets of the data in different nodes.

How does partitioning store data?

A node can have more than one partition but each piece of data will belong to exactly one partition.

What is the goal of partitioning?

The goal is to spread the data and the query load evenly across nodes.

What are Hotspot nodes?

Nodes that handle a disproportionately large share of the data or load. The uneven distribution that causes them is known as skew.

What are three ways of partitioning data?

Randomly scatter the data across nodes
Partitioning by key range
Partitioning by hash of key
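
The last two approaches can be compared with a short sketch. The boundaries `"h"` and `"q"`, the partition count of 3, and the use of SHA-256 are all assumptions chosen for illustration; real systems pick their own boundaries and hash functions.

```python
import bisect
import hashlib

# Hypothetical range boundaries:
#   key < "h"        -> partition 0
#   "h" <= key < "q" -> partition 1
#   key >= "q"       -> partition 2
RANGE_BOUNDARIES = ["h", "q"]
NUM_PARTITIONS = 3


def partition_by_key_range(key):
    # Keys that sort close together land in the same partition,
    # which keeps range scans cheap but can create hot spots.
    return bisect.bisect_right(RANGE_BOUNDARIES, key)


def partition_by_hash(key):
    # A stable hash spreads keys evenly, at the cost of
    # losing the sort order between neighbouring keys.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


range_partitions = [partition_by_key_range(k) for k in ["apple", "house", "zebra"]]
hash_partition = partition_by_hash("apple")
```

A stable hash such as SHA-256 is used here rather than Python's built-in `hash`, because the built-in is randomised per process for strings and would assign keys differently on each run.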

Why are partitions rebalanced?

To share the load fairly between the nodes in the cluster and reduce hotspot nodes.

What are the two types of partitioning?

Fixed number of partitions
Dynamic partitioning

How is fixed number partitioning implemented?

Many more partitions than nodes are created, so each node holds several partitions. When a node is added, it takes over ("steals") a few partitions from each of the existing nodes. Only entire partitions are moved between nodes, and the number of partitions doesn’t change, nor does the assignment of keys to partitions.
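
A minimal sketch of this rebalancing step follows. The `add_node` helper, the node names, and the choice of 12 partitions over 3 nodes are assumptions for illustration; the policy of stealing from the busiest node is one simple option, not a description of any specific system.

```python
from collections import Counter


def add_node(assignment, new_node):
    """Move whole partitions to new_node until it holds a fair share.

    assignment maps partition id -> node name and is modified in place.
    The number of partitions never changes; only their owners do.
    """
    nodes = set(assignment.values()) | {new_node}
    fair_share = len(assignment) // len(nodes)
    for _ in range(fair_share):
        counts = Counter(assignment.values())
        # Steal from whichever node currently owns the most partitions
        # (ties broken by node name so the result is deterministic).
        busiest = max(sorted(counts), key=counts.get)
        victim = min(p for p, n in assignment.items() if n == busiest)
        assignment[victim] = new_node
    return assignment


# 12 partitions over 3 nodes: many more partitions than nodes.
assignment = {p: f"node{p % 3}" for p in range(12)}
add_node(assignment, "node3")
partitions_per_node = Counter(assignment.values())
```

After the new node joins, each of the four nodes owns three partitions, and the partition ids (and therefore the key-to-partition mapping) are untouched.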

How is dynamic partitioning implemented?

If partitions grow beyond a certain size they are split in two. If the data becomes smaller, partitions can be merged.
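
The splitting half of this behaviour can be sketched as follows. The threshold `MAX_KEYS = 4` and the `insert` helper are hypothetical, partitions are modelled as sorted lists of keys, and merging is left out to keep the example short.

```python
import bisect

MAX_KEYS = 4  # hypothetical split threshold for this sketch


def insert(partitions, key):
    """Insert key into a list of sorted partitions, splitting when needed.

    partitions is a list of sorted key lists, ordered by key range.
    """
    # Pick the first partition whose largest key is >= key,
    # falling back to the last partition.
    idx = len(partitions) - 1
    for i, part in enumerate(partitions):
        if part and key <= part[-1]:
            idx = i
            break
    bisect.insort(partitions[idx], key)

    # If the partition grew beyond the threshold, split it in two.
    if len(partitions[idx]) > MAX_KEYS:
        part = partitions[idx]
        mid = len(part) // 2
        partitions[idx:idx + 1] = [part[:mid], part[mid:]]


partitions = [[]]
for key in [10, 20, 30, 40, 50]:
    insert(partitions, key)
```

The fifth insert pushes the single partition past the threshold, so it is split into two halves.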

What does CAP stand for?

Consistency, Availability, Partition Tolerance

What is the CAP theorem?

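A distributed system can guarantee at most two of consistency, availability, and partition tolerance at the same time. Because network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability.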

What is system consistency?

Every read receives the most recent write or an error

What is system availability?

Every request receives a non-error response.

What is partition tolerance?

The system continues to operate even if parts of the network become disconnected.

What are the ACID properties?

Atomicity, Consistency, Isolation, Durability

What is eventual consistency?

At any given moment some nodes may hold an outdated version of the data. Over time the updates propagate, so once writes stop all the nodes converge on the same data.
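
A toy illustration of a stale read followed by convergence is given below. The `Replica` class and its `pending` queue are assumptions made for the example; they stand in for whatever asynchronous replication channel a real system uses.

```python
class Replica:
    """A node with its own copy of the data (hypothetical sketch)."""

    def __init__(self):
        self.data = {}
        self.pending = []  # updates received but not yet applied

    def apply_pending(self):
        # Apply queued updates in the order they arrived.
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


leader, follower = Replica(), Replica()

# The leader applies the write immediately; the follower only queues it.
leader.data["x"] = 1
follower.pending.append(("x", 1))

stale_read = follower.data.get("x")  # the follower has not caught up yet
follower.apply_pending()
fresh_read = follower.data.get("x")  # the follower has converged
```

Between the write and `apply_pending`, a client reading from the follower sees the old state; afterwards both replicas hold the same data.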

What is linearizability?

The idea that, from the user's perspective, the system should appear as if all operations on it are atomic and there is only a single copy of the data.

How is linearizability achieved?

Once one client's read has returned a new value, all subsequent reads, from any client, must return that value or a later one.

What are the characteristics of linearizability?

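If an application requires linearizability and some replicas are disconnected from the others by a network fault, those replicas cannot process requests while they are disconnected. Applications that don't require linearizability can be more tolerant of network faults. Linearizability also has a performance cost, so many systems do not guarantee it and favour performance instead.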

How do network delays affect response times?

If linearizability is desired, the response time of read and write requests is at least proportional to the uncertainty of network delays.

What is consensus?

Getting several nodes to agree on something.

What is Atomicity?

The idea that a transaction that involves multiple nodes or services is treated as one indivisible piece of work. Either all commits are successful or none of the commits are.

What is Consistency?

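Ensures that transactions only change data in predefined, predictable ways, so the database moves from one valid state to another. In a distributed setting the term is also used to mean that all nodes or replicas have the same view of the data at a given time.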

What is a Two-phase commit?

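An algorithm for achieving atomic transaction commits across multiple nodes: a coordinator first asks every node to prepare, then tells them all to commit only if every node voted yes.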
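
The two phases can be sketched as follows. This is a simplified in-memory illustration under stated assumptions: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical, and coordinator crashes, timeouts, and durable logging are not modelled.

```python
class Participant:
    """A node taking part in the transaction (hypothetical sketch)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if this node can guarantee the commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (prepare): the coordinator collects a vote from every node.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit everywhere only if every vote was yes,
    # otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


healthy = [Participant("a"), Participant("b")]
committed = two_phase_commit(healthy)

one_failing = [Participant("a"), Participant("b", can_commit=False)]
aborted = not two_phase_commit(one_failing)
```

When one participant votes no, every participant ends up aborted, which is the all-or-nothing behaviour the atomicity card describes.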

What is Isolation?

When multiple users are reading from and writing to the same table at once, isolation ensures that their concurrent transactions do not interfere with or affect one another.

What is Durability?

Ensures that changes to your data made by successfully executed transactions will be saved, even in the event of system failure.

Distributed Architectures Flashcards

(36 cards)