Distributed Systems Flashcards

Question

Why use RPC over DSM?

Answer 1

- Easier to deal with failure - Method calls should be the same across all components of the system - No need to worry about typical shared memory constructs

Answer 2

Unstructured - styles with no theoretical limits on how entities are coupled Structured - styles that organize entities in ways that limit what entities a given entity can interact with directly

Answer 3

Advantages - simplicity of system's internal organization - ease of access among system's entities Disadvantages - increased difficulty of monitoring and maintaining system operation, due to range of possible interactions among the entities - worst case it can be a fully connected (mesh) network that requires O(n^2) connections for interactions

Answer 4

- schemes that distribute content across objects using a function that's computed from items' keys
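
A minimal sketch of key-based placement, assuming a fixed list of node names and SHA-256 as the key function (both are illustrative choices, not part of the card):

```python
import hashlib

# Hypothetical node names, used only for illustration.
NODES = ["node-a", "node-b", "node-c"]

def node_for_key(key: str, nodes=NODES) -> str:
    """Pick the node responsible for an item from a hash of its key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

# Every caller computes the same placement without consulting a directory.
placement = {key: node_for_key(key) for key in ["alice", "bob", "carol"]}
```

Plain modulo placement remaps most keys when the node list changes; ring-based schemes such as distributed hash tables are designed to limit that churn.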

Answer 5

object-based - unordered collection of components | resource-based - unordered collection of resources

Answer 6

- definitive version of object, including its state, positioned at a single, base node in a distributed network - to use an object, other nodes bind to it, then access it - the bind creates proxy representation of the remote object at local host

Answer 7

a software construct that controls access to other objects; it redirects access to the remote object and relays remote object's responses to local hosts

Answer 8

Advantages - can improve application transparency by blurring distinction between remote and local objects - natural model for session-based interaction with remote objects - natural model for encapsulating data Disadvantages - can't fully mask impact of network on communications (network failure, remote host failure, impracticality of duplicating large amounts of local host state at remote object site) - excessive use of objects can degrade host performance, creating computational bottlenecks

Answer 9

Advantages - simplifies an application's coding - reduces flow of traffic across the network Disadvantages - state information of the remote object must be maintained - lends itself to high duplication of data - increases complexity of the messages (every message must be self-contained so this may lead to reauthentication for every message in session)

Answer 10

- resources are identified via a standard naming scheme - services offer a uniform, message-based interface - messages are self-describing - resources maintain no state on callers

Answer 11

REpresentational State Transfer (REST)

Answer 12

- resources named using Uniform Resource Identifiers (URIs) - offers a CRUD-style, PUT-GET-POST-DELETE protocol for managing state - communication protocol (HTTP/HTTPS) is still stateless
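
A rough sketch of the CRUD-style interface, assuming an in-memory store and hypothetical URI paths. The `handle` function and its status codes are illustrative, not a real web framework API:

```python
import itertools

# In-memory resource store keyed by URI path (hypothetical paths).
resources = {}
_next_id = itertools.count(1)

def handle(method, uri, body=None):
    """Return (status, payload). No per-caller state is kept between calls."""
    if method == "PUT":  # create or replace the resource at a known URI
        created = uri not in resources
        resources[uri] = body
        return (201 if created else 200), body
    if method == "POST":  # create a new resource under a collection URI
        new_uri = f"{uri}/{next(_next_id)}"
        resources[new_uri] = body
        return 201, new_uri
    if method == "GET":
        if uri in resources:
            return 200, resources[uri]
        return 404, None
    if method == "DELETE":
        if uri in resources:
            del resources[uri]
            return 204, None
        return 404, None
    return 405, None
```

Each request names its resource and carries everything needed to process it, which is what lets the service stay stateless with respect to callers.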

Answer 13

Advantages - reduces performance impact on remote object host by eliminating need for host to maintain state - simplifies service implementation by removing need for remote object host to recover from host and network failure Disadvantages - potentially complicates client-side implementation

Answer 14

- Horizontal - Commons-based - Layered - Hierarchical - Other miscellaneous

Answer 15

Advantages - facilitates maintaining system operation due to limiting of direct object interaction - accommodates networks where some indirect interactions are unsupported Disadvantages - restrictions complicate system fabric by introducing need for routing of communications between indirectly connected entities - restrictions increase probability of network partitioning

Answer 16

- roughly, a linked-list representation of objects - can be strict (left and right neighbor) or loose (left and right neighbor with O(1) shortcut to other objects in the architecture) - can be singly- or doubly-linked (one-way or two-way request flow)

Answer 17

Advantages - conceptually and operationally simple Disadvantages - specialized applicability

Answer 18

Linear: strict horizontal style where start and end node are not connected - something like a message parser (decrypt, authenticate, remove duplication, etc.) Ring: loose horizontal style where start and end node are connected with potential chords throughout the structure - distributed hash table

Answer 19

Regular ring
- the ring is made up of lookup nodes
- each lookup node has keys associated with it
- if the target is in the current node's lookup table, jump to it
- if not, go to the next lookup node
Chordal
- each node keeps a finger table that references other nodes on the network
- to look for node n, start at any node and consult its finger table
- if node n is in the finger table, jump there
- if not, visit the largest node in the finger table that does not exceed the desired node
- repeat until the node is found
- if the nodes in the finger table are all larger, hop to the next node (linear search) until the node is found or a usable finger table is found
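
The chordal lookup can be sketched as follows. This assumes the usual Chord layout, where finger i of node n points at the first node at or after n + 2^i. The ring size and node ids are illustrative, and every finger table is computed from a global node list purely to keep the example short:

```python
M = 6                 # identifier bits (assumed), so ids run 0..63
RING = 2 ** M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # illustrative ids

def successor(ident):
    """First node id at or after ident, wrapping around the ring."""
    for n in NODES:
        if n >= ident:
            return n
    return NODES[0]

def finger_table(n):
    """Finger i points at the successor of n + 2**i."""
    return [successor((n + 2 ** i) % RING) for i in range(M)]

def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero

def lookup(start, key):
    """Return (node responsible for key, list of nodes visited)."""
    node, path = start, [start]
    while True:
        fingers = finger_table(node)
        succ = fingers[0]
        if in_interval(key, node, succ):
            path.append(succ)
            return succ, path
        # Jump to the farthest finger that still precedes the key.
        nxt = succ
        for f in reversed(fingers):
            if f != key and in_interval(f, node, key):
                nxt = f
                break
        node = nxt
        path.append(node)

owner, path = lookup(8, 54)
```

Because each jump roughly halves the remaining distance around the ring, lookups take a logarithmic number of hops instead of walking node by node.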

Answer 20

- any style that is referentially decoupled - components interact via a central information hold - the central information center contains information about the computational environment

Answer 21

- referentially coupled requires entities to explicitly name each other when communicating - referentially decoupled entities do not name each other when communicating

Answer 22

Entity-oriented (temporally decoupled) - the commons functions as a shared store - items in the commons can be accessed indefinitely until removed Event-oriented (temporally coupled) - items in the commons have a limited useful lifespan - each item can only be accessed by processes that are active when that item was added to the commons

Answer 23

- components are organized into groups and stacked such that modules in a layer depend on modules beneath them - full/strict/closed (layer n is limited to reference n-1) - partial/open (layer n can reference all layers below)

Answer 24

Advantages - aids understanding when a clear decomposition of components into layers is achievable - aids maintainability by isolating concerns by layer - aids portability and adaptability by allowing for substitutions by layer Disadvantages - functionality can be difficult to layer - layering potentially decreases performance

Answer 25

- generalization of layered style where system's components interact along lines of communication that are structured as trees Advantages - applicable to networks with limited element-to-element connectivity - strikes a balance between component proximity and number of connectors - intuitive - algorithmic techniques for tree manipulation are well-known

Answer 26

strategy for organizing elements and their interactions

Answer 27

Unstructured Peer

Answer 28

- system organized as a graph of random connections | - requests and data flow through system using various ad-hoc techniques

Answer 29

Flooding - each node distributes all messages to all neighboring nodes Random walk - messages follow random pathways through the network Anti-entropy - push-pull on random nodes Gossiping - nodes distribute messages to their neighbors with a given probability, either fixed or based on history (informed gossip)
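
A small simulation of two of these techniques, assuming a hypothetical five-node topology. A forwarding probability of 1.0 behaves as flooding, and lower values give fixed-probability gossip:

```python
import random

# Hypothetical undirected topology: node -> neighbors.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

def spread(graph, source, forward_probability=1.0, rng=None):
    """Return the set of nodes reached by a message that starts at source."""
    rng = rng or random.Random()
    reached = {source}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbor in graph[node]:
            if neighbor in reached:
                continue  # each node handles a message only once
            if rng.random() < forward_probability:
                reached.add(neighbor)
                frontier.append(neighbor)
    return reached

flooded = spread(GRAPH, "a")                          # flooding
gossiped = spread(GRAPH, "a", forward_probability=0.5)  # may miss some nodes
```

Flooding reaches every connected node at the cost of redundant traffic, while gossip trades some coverage for fewer messages.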

Answer 30

Advantages - readily distributes over a network - easy to scale - easy to reuse Disadvantages - difficult to analyze and debug due to relative lack of structure - difficult to provide interoperability in networks of heterogeneous components

Answer 31

Microservice architecture

Answer 32

- graph of small services defined by their APIs | - each microservice has arbitrary connections to the others

Answer 33

Advantages - low coupling, autonomous, flexible - maps well to DevOps - independently scalable - easier deployment per service - system should be more available Disadvantages - end-to-end testability is more difficult - logging and overall performance monitoring is harder - can get very complex if there are many components - performance suffers, due to latencies from repeated message passing

Answer 34

Pipe-and-filter

Answer 35

- upstream components accept data, process it, and pass it downstream - architecture must coordinate data flow (asynchronous and synchronous flows) Example - compilers
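
A minimal sketch using Python generators as the pipes. The three filters are hypothetical stand-ins for real processing stages:

```python
def read_lines(text):
    """Source filter: emit one line at a time."""
    for line in text.splitlines():
        yield line

def strip_blank(lines):
    """Drop empty lines and trim whitespace from the rest."""
    for line in lines:
        if line.strip():
            yield line.strip()

def to_upper(lines):
    """Transform each line and pass it downstream."""
    for line in lines:
        yield line.upper()

def run_pipeline(text):
    # Each filter only knows about the stream it receives, so stages
    # can be reordered or reused without touching the others.
    return list(to_upper(strip_blank(read_lines(text))))

result = run_pipeline("a\n\n b \n")
```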

Answer 36

Advantages - simple to implement - lends itself toward reusable components - easy to maintain - provides a natural basis for concurrency Disadvantages - error handling (no global state, restart is often difficult) - data flow must be uniform along pipes for best performance - forces batch mode processing

Answer 37

Repository and tuple space

Answer 38

- supports communication by multiple processes through a shared, persistent store Examples - files, registry keys, databases

Answer 39

- specialized, distributed repository architecture where processes store, read, and retrieve tuples
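
A single-process, non-blocking sketch of the three operations. The method names `store`, `read`, and `take` and the `None` wildcard are assumptions for illustration, and a real tuple space would be shared across processes:

```python
class TupleSpace:
    """Non-blocking, in-memory stand-in for a distributed tuple space."""

    def __init__(self):
        self._tuples = []

    def store(self, tup):
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard for that field.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple, or None."""
        tup = self.read(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup

space = TupleSpace()
space.store(("job", 1, "pending"))
peeked = space.read(("job", None, None))   # tuple stays in the space
claimed = space.take(("job", None, None))  # tuple is removed
```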

Answer 40

Blackboard, publish-subscribe, service-oriented

Answer 41

- like repository, but with active central store (changes to blackboard trigger events in processes) - can either have central (events trigger in blackboard) or distributed (events trigger in processes) control

Answer 42

- like a hybrid between blackboard and repository patterns - publishers store data to central store (looks like a write-only repository to publishers) - subscribers register for data from server which pushes messages of interest on receipt (looks like a read-only repository to subscribers)
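
A minimal in-process sketch, assuming topic-based subscriptions and callbacks. The class and topic names are hypothetical:

```python
from collections import defaultdict

class MessageBroker:
    """Central store that pushes each message to interested subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to every subscriber of topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)

broker = MessageBroker()
inbox = []
broker.subscribe("alerts", inbox.append)
delivered = broker.publish("alerts", "disk full")
ignored = broker.publish("metrics", "cpu=40%")  # nobody subscribed
```

The publisher never learns who received the message, and the subscriber never learns who sent it.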

Answer 43

- decouples publishers from subscribers (publishers can publish independently of subscribers, and subscribers can obtain content asynchronously) - can support anonymity of both publishers and subscribers

Answer 44

- created to support integration of heterogeneous systems | - allows multiple peer components to interact via a "software backplane"

Answer 45

Client-server, tiered computing, broker, and proxy

Answer 46

- provided by some middleware - allows for dynamic binding between client and service provider at beginning of interaction Interaction sequence - before interaction, the server registers with the broker - a client requests a service by type - broker responds with the address of a server that can provide the requested service - client and server communicate directly after that
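
The interaction sequence can be sketched with an in-memory registry. The service type and address below are made up for illustration:

```python
class ServiceBroker:
    """Maps service types to the addresses of servers that provide them."""

    def __init__(self):
        self._registry = {}

    def register(self, service_type, address):
        # Step 1: a server announces what it can provide.
        self._registry.setdefault(service_type, []).append(address)

    def lookup(self, service_type):
        # Steps 2-3: a client asks by type and gets back one address.
        addresses = self._registry.get(service_type)
        return addresses[0] if addresses else None

broker = ServiceBroker()
broker.register("thumbnailer", "10.0.0.5:9000")  # hypothetical address
address = broker.lookup("thumbnailer")
# Step 4: the client now talks to `address` directly, without the broker.
```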

Answer 47

- all communication occurs via an intermediate process | - shields server(s) from client

Answer 48

Client-server
- vs. broker, proxy: simpler
- vs. broker, proxy: faster (no initial or intermediate communication)
Broker
- vs. client-server: more flexible and scalable (broker can route all requests to any server at the start of a session)
- vs. proxy: faster (less overhead for multi-message session)
Proxy
- vs. client-server: more flexible and scalable (allows proxy to route any requests to any server at any point during a session)
- vs. broker: more secure (hides server(s) from client throughout session)

Answer 49

3 types
- Infrastructure as a Service (IaaS): network-accessible resources that function as machines
- Platform as a Service (PaaS): network-accessible resources that function as operating systems
- Software as a Service (SaaS): network-accessible resources that function as individual services
Management
- On-premises: you manage everything
- IaaS: you manage everything up to the OS
- PaaS: you manage the applications and data
- SaaS: you manage nothing (only use the service)
Standardization
- IaaS: instruction sets (achieved through virtualization)
- PaaS: application environments; Linux, Windows, etc.
- SaaS: applications

Answer 50

- virtual machines do not mirror real hardware | - virtualization mirrors real hardware

Answer 51

- with synchronous communication, the client blocks further processing until the request is handled (wastes cycles but the handling of the request -- success/failure -- is easily known) - with asynchronous communication, the client spawns a thread/process to make the request while continuing other processing (no wasted cycles but the handling of the request is not easily known)
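
A short sketch of the two modes using the standard library's thread pool. `handle_request` is a hypothetical stand-in for a remote call:

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(x):
    """Stand-in for a remote call (hypothetical)."""
    return x * 2

# Synchronous: the caller waits here until the request is handled.
sync_result = handle_request(21)

# Asynchronous: the request runs on a worker thread while the caller continues.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(handle_request, 21)
    other_work = sum(range(10))      # the client keeps processing
    async_result = future.result()   # block only when the outcome is needed
```

The future is what makes the outcome knowable again: the client checks it when it actually needs the result or the error.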

Answer 52

```
Application
Transport
Internet Protocol
Network Access
Physical
```

Answer 53

- the serialization/deserialization of objects/data
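
A minimal round trip, assuming JSON over UTF-8 as the wire format and a hypothetical `Reading` record:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Reading:
    """Hypothetical record exchanged between two hosts."""
    sensor: str
    value: float

def marshal(reading: Reading) -> bytes:
    """Serialize the object into bytes suitable for sending over a network."""
    return json.dumps(asdict(reading)).encode("utf-8")

def unmarshal(payload: bytes) -> Reading:
    """Rebuild the object from the received bytes."""
    return Reading(**json.loads(payload.decode("utf-8")))

wire_bytes = marshal(Reading(sensor="t1", value=21.5))
restored = unmarshal(wire_bytes)
```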

Answer 54

- provides extra support for distributing functionality across a network, typically through providing adapters for relaying messages Example - MPI

Answer 55

- how does the system ensure that the flow of data is timely enough to meet user expectations for presentation quality in real-time

Answer 56

- reduce the overall number of messages - reduce time to send via caching - eliminate needless content

Answer 57

- authentication provides access control to the overall system via passwords, usernames, digital fingerprints, etc. - authorization provides access control to content/resources within the system via user groups

Answer 58

- overlay routing network arranged in the form of a tree - tree is balanced (as much as possible) - to find a node on the network, start at leaves and work up

Answer 59

- uses two keys - public key can be distributed to anyone - private key only known by the holder - to encrypt a message, the sender uses the receiver's public key - to decrypt a message, the receiver uses their private key
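
A toy illustration of the two-key flow using textbook RSA with tiny primes. The numbers are for demonstration only and offer no security. Real systems use a vetted cryptography library with large keys and padding:

```python
# Toy parameters (assumed for illustration): two small primes.
P, Q = 61, 53
N = P * Q                    # modulus, shared by both keys
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent
D = pow(E, -1, PHI)          # private exponent (Python 3.8+)

PUBLIC_KEY = (E, N)          # can be distributed to anyone
PRIVATE_KEY = (D, N)         # known only to the holder

def encrypt(message: int, public_key) -> int:
    """Sender encrypts with the receiver's public key."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key) -> int:
    """Receiver decrypts with their own private key."""
    d, n = private_key
    return pow(ciphertext, d, n)

ciphertext = encrypt(65, PUBLIC_KEY)
plaintext = decrypt(ciphertext, PRIVATE_KEY)
```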

Answer 60

- ensure that relationships between components hold true at specified points in time - two different strategies (centralized and distributed)

Answer 61

Absolute (physical) time - using time to coordinate action Relative (logical) time - using exchanges of messages among cooperating processes to sequence actions
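
One well-known way to build relative time from message exchanges is a Lamport clock. The card does not name it, so treat this as an illustrative choice:

```python
class LamportClock:
    """Logical clock: counts events and merges timestamps from messages."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Jump past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
a.tick()                     # local event on process a
stamp = a.send()             # a sends a message to b
received_at = b.receive(stamp)
```

No wall-clock time is involved; the ordering comes entirely from the messages exchanged.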

Answer 62

- difficult to measure - transmissions of time are subject to jitter - clocks drift, creating skew

Answer 63

- variability in message latency

Answer 64

- time between message send and receipt

Answer 65

- tendency of a clock to run at a different rate from a reference clock

Answer 66

- difference between a reference clock and other clocks

Answer 67

- Cristian's algorithm - Network Time Protocol (NTP), which builds on the same round-trip estimate - Berkeley algorithm - Reference Broadcast Synchronization (RBS) - Google's TrueTime

Answer 68

- host requests UTC from the time server and records the send time as T1 - time server replies with T2 (when the request was received) and T3 (when the response was sent) - T4 is when the host receives the response - host estimates the one-way delay as ((T2 - T1) + (T4 - T3))/2 and sets its clock to T3 plus that delay
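
A worked example of the adjustment, assuming symmetric network delay. The timestamps are made up: the host clock runs 5 seconds behind the server, each direction takes 2 seconds, and the server spends 1 second processing:

```python
def estimate_server_time(t1, t2, t3, t4):
    """Estimate the server's time at the moment the host receives the reply.

    t1, t4 are read from the host clock; t2, t3 from the server clock.
    """
    # (t2 - t1) + (t4 - t3) equals the round trip minus server processing,
    # so the clock offset cancels out and half of it is the one-way delay.
    one_way_delay = ((t2 - t1) + (t4 - t3)) / 2
    return t3 + one_way_delay

# Host sends at 100 (server clock reads 105), server receives at 107,
# replies at 108, and the host receives at 105 on its own clock.
new_time = estimate_server_time(t1=100, t2=107, t3=108, t4=105)
```

Here the estimated delay is 2 seconds and the host sets its clock to 110, which matches the server's clock at the moment of receipt.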

Answer 69

- limiting number of processes that can access a pool of resources at any one time

Answer 70

- token-based: process seeking access acquires a token and keeps access until surrendering token - permission-based: process seeking access obtains permission before acting

Answer 71

- centralized: one coordinating process manages access to resource - decentralized: competing processes coordinate access together

Answer 72

- strict consistency: each read retrieves the most recent write to the resource - eventual consistency: reads are not guaranteed to contain the most recent writes to the resource but they will eventually have the most recent updates over time

(96 cards)