622 Flashcards

(208 cards)

1
Q

Eight pleasant thoughts

A

-The network is reliable
-The network is secure
-The network is homogenous
-The topology does not change
-Latency is zero
-Bandwidth is infinite
-Transport cost is zero
-There is one administrator

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a distributed system?[Tanenbaum]

A

A collection of independent computers that appears to its users as a single coherent system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Three key characteristics [Tanenbaum]

A

Multiple machines are autonomous
Software lets users see a single system
System easy to expand without user noticing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a distributed system?[Webopedia]

A

A type of computing in which different components and objects comprising an application can be located on different computers connected to a network.
Key requirement:set of standards that specify how objects communicate with one another(e.g. CORBA, DCOM, REST, …).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

[Wikipedia] distributed computing:

A

decentralized and parallel computing, using two or more computers communicating over a network to accomplish a common objective or task.

Note:The types of hardware, programming languages, operating systems and other resources may vary drastically. It is similar to computer clustering with the main difference being a wide geographic dispersion of the resources.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Challenges of DS

A

-latency of communication
-coordination
-shared resources and mutual exclusion
-ordering, deadlock and live-lock
-timing
-adaptation to change
-failures, soft faults, and optimization
-service discovery and configuration
-heterogeneity and third-party software
-scalability and evolution
-security and privacy
-trust on machines, software, communications & other users

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Advantages of DS

A
  • processing capacity
  • fault tolerant, evolving, scalable
    -explicit control, preferences
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Replicas

A

Often useful to have same task performed by multiple components so all have to fail for task to fail
What if data must be shared between components?
Often one component is “master” (aka “original” or “authoritative version”)
The other components are copies from the master
This may be apportioned, e.g., a component may be a master for just some portion like “names A-K”
Confusion in counting the number of “replicas”:
Some might not include the “master” in the count of replicas
Some might include the “master” (“replicas of each other”)
Make sure you know if the “master” is included

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

how to solve for number of replicas

A

If we assume independence and:
F = Probability that one replica fails in time period, F≠1
n = (natural) number of components (e.g., replicas including master). Thus F^n is the probability all n will fail simultaneously
G = Goal, permitted probability of total system failure where all n replicas fail (including original)

F^n ≤ G or
n ≥ (log G)/(log F)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Independence assumption

A

These calculations assume independent failures
Reasonable model for many hardware failures
Software failures often not independent
Knight & Leveson [1986] found via experiment that software faults are not independent
Thus “N-version programming” doesn’t lead to the reliability increase you might predict
It can be helpful, but less than you’d think

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Caching: Special case of replication

A

Make cop(ies) of a resource (data)
Often happens on demand
Other replication approaches often planned & executed in advance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Challenges of DSexample: replication has downsides

A

buy more hardware
administration costs
software upgrades
load balancing
performance overhead
more complex software
consistency problems
sometimes tolerable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

hiding access

A

hide differences in data representation and how a resource is accessed

(conversion of complex fortmats. latency vs fidelity of access)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

hiding location

A

hide where a resource is located (trusted hosts, different performance, difference capabilities and network access)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

migration

A

hide that a resource may move to another location (trusted hosts, different performance, difference capabilities and network access)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

relocation

A

hide that a resource may be moved to another location while in use (trusted hosts, different performance, difference capabilities and network access)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

replication

A

hide the fact that several copies of a resource exist (select server based on QoS)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

concurrency

A

hide that a resource may be shared by several competitive users(cannot hide sharing of resources: they’re consumed , data is modified by others)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

failure

A

hide the failure and recovery of a resource(unexplained behavior)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

persistence

A

hide whether a (software) resource is in memory or on disk(someone needs to decide whether an object is persistent and commit it to disk)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

awareness and adaptation

A

separate decisions from (controllable) mechanisms

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Some measures (per Neumann) for system scaleability

A

Size
Users & resources
Geographical
May lie far apart
Administrative
May span many
independent
administrative
organizations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Decentralized algorithms

A

No machine has complete information about the system state.
Machines make decisions based only on local information.
Failure of one machine does not ruin the algorithm.
There is no implicit assumption that a global clock exists

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Asynchronous communication

A

Hiding communication latencies important for geographical scaleability
Max speed is speed of light in vacuum (~3.00×108 m/s)
Information transfer through material normally less
Physical components have other performance latencies
Software takes time to execute once it receives data
Sending information and waiting for reply is synchronous communication
Alternative: Asynchronous – send information, don’t wait for reply

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Moving location of execution
E.G., when checking form inputs, consider two options: Send each input to server & wait for reply – maybe long delay Could move checking code to client Now check response can be immediate (once code is there) Can be special-case (e.g., HTML5 form validators) Can be general (e.g., a general execution engine like Javascript) Beware of security ramifications Often two sides (e.g., client & server) cross trust boundary Security checks must often be redone on server in many cases Server can’t trust client Many checks are done on both client (for speed) and server (for security) Client must often check the data it’s asked to execute (especially if it’s a full language like Javascript) Client can’t trust server
26
Cloud computing
Clouds widely used, often misunderstood Clouds often cheaper (where appropriate), many variations Decisions to use cloud (and how) impact security NIST Definition of Cloud Computing (NIST SP 800-145): Cloud computing is “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” Five essential characteristics: On-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service Virtualization is common, but not required, to be a cloud Book makes this (common) mistake
27
Cloud service models
Infrastructure as a Service (IaaS): “consumer [can] deploy and run arbitrary software [including] operating systems and applications....” Platform as a Service (PaaS): “consumer [can deploy] consumer-created or acquired applications...” [on top of provided platform] Software as a Service (SaaS): “consumer [can] use the provider’s applications running on a cloud infrastructure...”
28
OSI 7 layer model (Open Systems Interconnection)
physical, data link, network, transport, session, presentation, application
29
physical
specifies: pin layout, voltages, modulation does: establish & terminate access to medium, flow control, contention resolution at this level: hubs, repeaters, network adapters
30
data link
specifies: how to transfer data in a LAN does: detect and correct errors at this level: MAC addresses (flat, HW-based)
31
network
specifies: how to transfer data sequences across LANs (e.g., IP) does: routing at this level: hierarchical address scheme, routers, bridges & switches
32
IP Service Model
Connectionless (datagram/packet-based) Best-effort delivery (unreliable service) packets are lost packets are delivered out of order duplicate copies of a packet are delivered packets can be delayed for a long time Datagram format
33
Datagram forwarding
Strategy every datagram contains destination’s address if directly connected to destination network, then forward to host if not directly connected to destination network, then forward to some router forwarding table maps network number into next hop each host has a default router each router maintains a forwarding table
34
Forwarding Tables
Suppose there are n possible destinations, how many bits are needed to represent addresses in a routing table? log2n So, we need to store and search n * log2n bits in routing tables? We’re smarter than that!
35
Global Addresses
Globally unique, hierarchical: network+host Dot Notation 10.3.2.4 128.96.33.81 192.12.69.77
36
Transport
specifies: reliable transference of data (e.g., TCP, UDP) does: flow control, segmentation, error control, retransmission
37
UDP
(User Datagram Protocol) connectionless - sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival each time a datagram is sent, the local and receiving socket address need to be sent as well
38
TCP
(Transmission Control Protocol) connection-oriented - provides a reliable flow of data between two computers: data sent from one end of the connection gets to the other end in the same order in order to communicate using TCP protocol, a connection must first be established between the pair of sockets once two sockets have been connected, they can be used to transmit data in both (or either one of the) directions
39
Overhead
UDP - every time a datagram is sent, the local and receiving socket address need to be sent along with it TCP - a connection must be established before communications between the pair of sockets start (i.e. there is a connection setup time in TCP)
40
Packet size
UDP - there is a size limit of 64 kilobytes per datagram TCP - there is no limit; the pair of sockets behaves like streams
41
reliability
UDP - there is no guarantee that the sent datagrams will be received in the same order by the receiving socket TCP - it is guaranteed that the sent packets will be received in the order in which they were sent
42
which protocol to use?`
TCP - useful when indefinite amount of data need to be transferred ‘in order’ and reliably UDP - useful when data transfer should not be slowed down by the extra overhead of the reliable connection
43
session
specifies: establishing long lived connections does: checkpointing, adjournment, restart
44
presentation
specifies: data formats and transformation (e.g., MIME) does: serialization, compression, encryption, encoding transformation (EBCDIC/ASCII)
45
application
specifies: application-specific protocols (e.g., http, smtp, ftp, telnet) does: support app-specific functionality
46
goal of the OSI
separation of concerns enables good implementation at each level each layer is independent of the ones on top layer n depends on the spec of n-1, but not on its implementation/manufacturer
47
port
Generally, a computer has a single physical connection to the network this connection is identified by the computer’s 32-bit IP address all data destined for a particular computer arrives through this connection TCP and UDP use ports to identify a particular process/application port = abstract destination point at a particular host each port is identified by a positive 16-bit number, in the range 0 - 65,535 port numbers 0 - 1023 are reserved for well-known services (HTTP - 80, telnet – 23)
48
socket
basic abstraction for network communication “end-point of communication” uniquely identified with IP address and port example: Socket MyClient = new Socket("Machine name", PortNumber); gives a file-system like abstraction to the capabilities of the network two end-points communicate by “writing” into and “reading” out of socket there are two types of transport via sockets reliable, byte-stream oriented unreliable datagram
49
socket programming with TCP
Server Side: server runs on a specific computer and has a socket bound to a specific port number server listens to the socket for a client to make a connection request Client Side: client tries to rendezvous with the server on the server's machine and port Server Side: the server accepts the connection by creating a new socket bound to a different port Client Side: if the connection is accepted, the client uses the new socket to communicate with the server
50
Socket programming with UDP
All clients use the same socket to communicate with the server Packets of data (datagrams) are exchanged No new sockets need to be created
51
C- vs. Java- socket programming
Java keeps all the socket complexity “under the cover” It does not expose the full range of socket possibilities But, it enables sockets to be opened/used as easily as a file would be opened/used By using the java.net.Socket class instead of relying on native code, Java programs can communicate over the network in a platform-independent fashion
52
Java socket programming
all classes related to sockets are in java.net package Socket class - implements client sockets (also called just "sockets") ServerSocket class - implements server sockets A server socket waits for requests to come in over the network. It performs some operation based on that request, and then possibly returns a result to the requester. DatagramSocket class - socket for sending and receiving datagram packets DatagramPacket class - represents a datagram packet Datagram packets are used to implement a connectionless packet delivery service. Multiple packets sent from one machine to another might be routed differently, and might arrive in any order. InetAddress class - represents an Internet Protocol (IP) address MulticastSocket class - useful for sending and receiving IP multicast packets. A MulticastSocket is a (UDP) DatagramSocket, with additional capabilities for joining "groups" of other multicast hosts on the internet. A multicast group is specified by a class D IP address.
53
what does middleware offer?
conceptual model for communication
54
different styles have different data sharing assumptions
RPC (address space or memory) RMI object refs (middleware) messages data store -> files/objects persistent store data stream -> and <- data store/source
55
read slide 49-60 in lecture 2
ok
56
RPC is implemented by
sending messages
57
where to send RPC messages?
hardwired for fixed deployment some RPC environments support dynamic binding (more to come during the lecture on Service Discovery)
58
solution to object refs
increase granularity from bytes to objects both local objects and references to remote objects are passed by value (serialization) the result of the called method is also serialized and passed back to the caller
59
difference between RMI and RPC?
RMI: doesn’t try to hide distribution in the language: remote objects are declared “remote” marshalling is simplified by passing by value only (object references can be used in nested RMIs) (in Java) by having JVMs hide platform dependencies in data representation serialization could be much heavier by having to pass the code for the objects with every call, but that can be avoided by passing URLs for downloading the code, rather than the code itself
60
reasons to escape the call return style
no result needs to be returned a server may not be available at the time of the request make the client more responsive to other events/user allow any component to initiate communication
61
some middleware push the envelope
dealing with errors: idempotent, at-least-once, at-most-once… the promised simplicity of procedure calling sometimes hinders more sophisticated solutions
62
when to use call return style
the server is ready to process each request components and network are mostly reliable not many concurrent events in the caller: it is fine to block the caller one component (client) has the initiative, others (servers) wait for requests
63
what is needed for RMI
Java makes RMI (Remote Method Invocation) fairly easy, but there are some extra steps To send a message to a remote “server object,” The “client object” has to find the object Do this by looking it up in a registry The client object then has to marshal the parameters (prepare them for transmission) Java requires Serializable parameters The server object has to unmarshal its parameters, do its computation, and marshal its response The client object has to unmarshal the response
64
remote object
an object on another computer
65
client object
object making the request
66
server object
object receiving the request((can easily trade roles with the client object)
67
rmiregistry
special server that looks up objects by name
68
rmic
special compiler for creating stub (client) and skeleton (server) classes
69
processes for RMI
The Client The Server The Object Registry, rmiregistry, which is like a DNS service for objects You also need TCP/IP
70
interfaces
Interfaces define behavior Classes define implementation Therefore, In order to use a remote object, the client must know its behavior (interface), but does not need to know its implementation (class) In order to provide an object, the server must know both its interface (behavior) and its class (implementation) In short, The interface must be available to both client and server The class should only be on the server
71
remote class
one whose instances can be accessed remotely On the computer where it is defined, instances of this class can be accessed just like any other object On other computers, the remote object can be accessed via object handles
72
serializable class
one whose instances can be marshaled (turned into a linear sequence of bits) Serializable objects can be transmitted from one computer to another
73
conditions for serializability
If an object is to be serialized: The class must be declared as public The class must implement Serializable All fields of the class must be serializable: either primitive types or serializable objects
74
parts of remote class
The interface (used by both client and server): Must be public Must extend the interface java.rmi.Remote Every method in the interface must declare that it throws java.rmi.RemoteException (other exceptions may also be thrown) The class itself (used only by the server): Must implement a Remote interface Should extend java.rmi.server.UnicastRemoteObject May have locally accessible methods that are not in its Remote interface
75
remote object
lives on another computer (like server) You can send messages to a Remote object and get responses back from the object All you need to know about the Remote object is its interface Remote objects don’t pose much of a security issue
76
serializable object
You can transmit a copy of a Serializable object between computers The receiving object needs to know how the object is implemented; it needs the class as well as the interface There is a way to transmit the class definition Accepting classes does pose a security issue
77
server class
The class that defines the server object should extend UnicastRemoteObject This makes a connection with exactly one other computer If you must extend some other class, you can use exportObject() instead Sun does not provide a MulticastRemoteObject class The server class needs to register its server object: String url = "rmi://" + host + ":" + port + "/" + objectName; The default port is 1099 Naming.rebind(url, object); Every remotely available method must throw a RemoteException. Why?
78
rmic
The class that implements the remote object should be compiled as usual Then, it should be compiled with rmic: rmic Hello This will generate files Hello_Stub.class and Hello_Skel.class These classes do the actual communication The “Stub” class must be copied to the client area The “Skel” was needed in SDK 1.1 but is no longer necessary
79
overlay networks
network/transport support for multicast hasn’t worked well in practice overlay networks at application layer! result is a logical network built on top of a (probabaly different) network new network optimized for application but may have performance issues due to underlying network
80
distributed solution
tree network approach function to map topics to nodes identifies unique root node for a given topic follow() request follows tree if node has not seen request become a “forwarder” make sender your “child” for that request forward request to next node in tree if node has seen request make sender your “child” for that request send() request follows topic tree
81
mesh network "epidemic" approach
node states: infected, susceptible, removed goal is to become infected! P picks a random Q push: P updates pushed to Q // not so good… pull: Q updates pulled to P push-pull: both gossip adaptation less interest if receiver already has update
82
removing data
suppose an update is deleted what happens when old copy is found? distributed systems are different! notion of a death certificate “that update doesn’t matter anymore” have to keep that update how long?
83
persistence
once sent, messages endure in the system, regardless of the sender remaining active and the recipient being available
84
synchronicity
the sender continues after sending, or blocks, waiting for the message to be delivered (buffered) to be received (read) to be processed
85
primitives and their meaning
socket: create a new communication endpoint bind: attach a local address to a socket listen: announce willingness to accept connections accept: block caller until a connection request arrives connect: actively attempt to establish a connection send: send some data over the connection receive: receive some data over the connection close: release the connection
86
when to use messaging style
component interactions don’t follow a strict call-return pattern make components more responsive to other events/user (no blocking) allow any component to initiate communication components may not be available to receive/process messages (persistency)
87
when to use data sharing style
(persistent) data plays a central role in the system components don’t need to synchronize control flow other than on data availability or values E.g., modern database management systems, distributed file systems
87
space decoupling
interaction partners do not need to know each other
87
data streaming style is a variant where
data is stored/generated at one place and consumed by one or more clients timeliness of data delivery is crucial asynchronous (unbound, e.g., caching) synchronous (upper bound, e.g., sensors) isochronous (bounded jitter, e.g., media) complex data may be transmitted as separate streams which need to be synchronized: e.g. video + stereo sound streaming is supported by middleware (e.g. RSVP) on top of the data link network layer
87
time decoupling
the interaction partners do not need to be actively participating in the interaction at the same time
88
synchronization decoupling
publishers aren't blocked while producing events, subscribers can get asynchronously notified of the occurrence of an event
89
event filters and patterns
An event filter selects event notifications by specifying a set of attributes and constraints on the values of those attributes A pattern is composed of several filters A subscription can be expressed as a filter or a pattern
90
event matching
A subscription matches an event notification when the notification satisfies all the constraints specified in the subscription
91
publish subscribe and advertise
Nodes publish event notifications to access points Nodes subscribe to access points in order to receive event notifications Specified by filters and/or patterns Advertisements defines the event notifications a node may possibly generate using the same semantics as filters
92
filtering
The goal of filtering only deliver messages “of interest” to nodes, reducing the overall traffic across the network The system is aided by use of advertisements
93
routing algorithms
Server must establish appropriate routing path to ensures that notification published by objects of interest are correctly delivered to all the interested parties that subscribed to them Simplest strategy is to maintain the subscriptions at their access point and broadcast the notification throughout the network Least efficient Consumes lots of bandwidth
94
SIENA
Central idea is to send the notification towards the event servers that have clients that are interested in that notification (possibly using shortest path) Downstream replication Notification should be replicated only downstream and as close as possible to parties interested in it Upstream evaluation Filters are applied and patterns are assembled upstream – as close as possible to the source of notification Subscription forwarding Routing paths for notification are set by subscriptions which are propagated throughout the network so as to form a tree that connects subscribers to all the servers in network
95
review siena routing from reading assignment
ok
96
name resolution
a mapping between: identifier of an entity (app/process, file) address of an access point (network address, stub/proxy, piece of hardware plugged to network) entities connect to the network via one or more access points, which can be reached at an address
97
identifier
refers to at most one entity each entity is referred to by at most one identifier an identifier always refers to the same entity (i.e., not reused)
98
name spaces
names are normally organized into name spaces ex. file names usually organized as a directed, acyclic graph name represented as a path through nodes in the graph absolute path starts from root relative path starts from an arbitrary point paths represented as or /link-1/…/link-n
99
look at slides 10-15 on lecture 4
ok
100
recursive resolutions
could lead to shorter response time, but increases resource requirements in name servers effectiveness of caching depends heavily on stability of addresses (mobility is an issue) many resolution servers support only iterative resolution
101
insight: for many applications entity names are irrelevant
an application may need to find a component with certain capabilities, e.g., a spell checker, or a nearby printer “resolution” should be guided by the capabilities, not the identity (name) of such components for other purposes, the identity (name) is still important (e.g. web servers and email servers) since these components are not typically mobile, conventional name resolution can still be applied
102
service type
type name, e.g., printer, speech recognition Note: service is ambiguously used to designate (a) service instance (b) service type (c) service supplier version interface signature how to request a service from the supplier, e.g., Java interface ontology relations among types difficulty: relations are not always hierarchical example frameworks: DAML/OWL, UDDI, WSDL
103
quality of service
static attributes intrinsic to the supplier not dependent on circumstances or resources e.g., printer supports color and duplex dynamic attributes dependent on usage history and resources e.g., latency (size of Q, available CPU, bandwidth…), accuracy (used algorithms, iterations, quality of data) subject to tradeoffs database query: fast vs. complete language translation: speed/cost vs. accuracy printing: high quality printing with long Q vs. low quality with short Q
104
context
physical characterization of where, when and how the service will be provided static attributes e.g., location of a wall-mounted display dynamic attributes e.g., location of a PDA, printer queue size implications to privacy and security e.g., is the wall-mounted display in a private room or at a lounge with public access context is more of an issue for some kinds of services than others (non-interactive) distinguish computation from presentation of results
105
existing discovery middleware supports different levels of service description
bare bones name of service type, address/stub to reach supplier E.g., RMI some include spec of API signatures e.g., Jini/JNDI (Java interface) a few include QoS Web Services describes generic attributes such as price and reliability, but not service-specific attributes E.g., web services (covered later in this lecture) some research middleware include service-specific QoS, context, privacy and security
106
what if more than one supplier matches the description?
depends on the level of service description bare bones no way to distinguish, just pick one some include spec of API signatures pick one that is compatible trust, QoS & context use a quantitative framework (e.g. utility functions) to evaluate which one is best requires richer description of the service requirements e.g. find a duplex printer < 100 ft away and with < 2 minute wait p1 is 102 ft away, 2s wait p2 is 95 ft away, 1 minute wait p3 is 10 ft away, 3 minute wait
107
which component to talk to?
service discovery: description of capabilities -> address of an access point name resolution: identifier of an entity -> address of an access point
108
how to describe services and on the mechanisms to make discovery work
mechanisms to make it work: description of capabilities -> address of an access point
109
how about the discovery mechanisms?
mechanisms to make it work: description of capabilities -> address of an access point
110
directed discovery
clients are configured with a list of address to go ask for services
111
client-initiated broadcast
clients broadcast service requests on demand
112
supplier-initiated broadcast
suppliers broadcast their capabilities periodically
113
directory-based discovery
suppliers post their capabilities on a directory clients query the directory
114
should the mechanisms enforce boundaries to discovery?
broadcasting-based discovery is bounded by the network policy for broadcast (usually LAN) directed and directory-based offer more control of scalability hard question: how do directories coordinate? how far, which directories, to direct a query?
115
what is a service?
the act of performing helpful or useful labor that does not produce a tangible commodity
116
the notion of service implies_______between supplier and consumer
separation of concerns
117
computing definition of service
the act of performing helpful or useful labor, where the service supplier is developed separately from consumers and may serve many consumers
118
services become first class
suppliers register their capabilities, consumers look for services not specific components
119
two models for activating service supply:
factory & pool
120
stateful suppliers vs stateless
stateful keeps state of conversation while stateless doesn't
121
middleware may alleviate dependency on programming language and OS but
introduces dependency on middleware
122
Middleware solutions have significant challenges
too many “standards” one born every few months: code evolution nightmare integration with legacy systems millions of LOC and billions of $ already invested strategy: wrappers around old code with wrappers latency becomes an issue
123
web services come in as an integration technology
focus on bridging existing technologies key characteristic: middleware for middleware it’s about how to access an application, it is not an implementation technology looser coupling than RPC-based middleware avoid proprietary APIs Simple Object Access Protocol (SOAP) based on sending XML messages over http, no SOAP API or ORB WSDL & SOAP are not widely used today, but it’s important to understand why – complexity is a killer
124
web services introduces its own set of standards
directory (UDDI: universal description, discovery and integration) service description(WSDL: web services description language) messages(SOAP: simple object access protocol) which work on top of: data types(XML Schema) data(XML)
125
Orchestration
an approach for service composition and coordination. A central service coordinates the invocation of other services to achieve the system’s functionality
126
Choreography
another approach for service composition. a decentralized approach to coordination of services, where each service knows how it needs to behave to achieve the system’s functionalities
127
BPEL
Business process execution language. often used for modeling the coordination among services. BPEL engine executes the model and exposes it as a service
128
in-only communication pattern
receives a message but will not respond
129
robust in-only communication pattern
receives a message and may issue a fault message
130
in-out communication pattern
receives a message and may issue a reply or fault message
131
rpc
call-return, only valid for in-only and in-out patterns
132
iri
(International Resource Identifier) message can be serialized as an IRI multipart: … The word “style” here is not the same as what we called “communication style” in this class
133
binding
defines the communication protocol typically SOAP
134
service (java)
associates the interfaces with a URL and protocol (binding)
135
WS descriptions may be posted on UBRs
1. SW companies, standards bodies, and programmers populate the registry with descriptions of different types of services 2. Businesses populate the registry with descriptions of the services they support 3. UBR assigns a programmatically unique identifier to each service and business registration 4. Marketplaces, search engines, and business apps query the registry to discover services at other companies 5. Business uses this data to facilitate easier integration with each other over the Web
136
UDDI business registry
supports business registrations. XML document created by supplier company (or on its behalf) may have multiple service listings
137
Wrap-up of WSDL & SOAP
In spite of the name, SOAP was not simple Pursuit of generality made WSDL & SOAP too complicated for use in “normal cases” Simple things weren’t simple Inadequate security story Today, other approaches far more common, e.g. REST
138
what is REST?
Representational State Transfer (REST) “a software architecture style consisting of guidelines and best practices for creating scalable web services” [Wikipedia, “Representational state transfer”]
139
RESTful API
An API following REST style Intended to be simpler alternative to SOAP and WSDL-based Web services In practice, RESTful systems typically communicate over the HTTP protocol with the same HTTP verbs (GET, POST, PUT, DELETE, etc.)
140
Original Formal REST Design Constraints
Client–server (storage on server) Stateless Cacheable Layered system Client may connect to intermediary Code on demand (optional) Uniform interface Identification of resources (e.g., URIs) Manipulation of resources through these representations Self-descriptive messages (e.g., MIME type) Hypermedia as the engine of application state (HATEOS) – THIS ONE IS CONTROVERSIAL
141
Applying RESTful API (Collection URI)
GET: list the URIS and perhaps other details of the collection's members PUT: replace the entire collection with another collection POST: create a new entry in the collection. the new entry's URI is assigned automatically and is usually returned by the operation DELETE: delete the addressed member of the collection
142
RESTful API (element URI)
GET: retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type. PUT: replace the addressed member of the collection, or if it doesn't exist, create it. POST: not generally used. treat the addressed member as a collection in its own right and create a new entry in it. DELETE: delete the addressed member of the collection
143
Hypermedia as the engine of application state (HATEOS) (1)
Original REST definition requires “Hypermedia as the engine of application state” (HATEOS) One accessing the initial REST URI, client must be able to follow server-provided links to (eventually) discover all the resources Human analogy: from start page can only click In theory, HATEOS = no need for client to hard code information about application structure or dynamics [https://restfulapi.net/hateoas/] Some purists insist that REST requires HATEOS Original definition requires it! However, I don’t accept this claim…
144
Hypermedia as the engine of application state (HATEOS) (2)
In practice, almost all RESTful APIs do not implement HATEOS (this variant sometimes called “Practical REST”) Problems with HATEOS: Increased implementation complexity Bloats every response with many almost-always-unused links In the rare cases that HATEOS links are used, encourages “chatty” (slow & resource-intensive) integration as clients must navigate instead of directly requesting what they need Clients rarely use it – typically clients make a direct request No de facto standard, so clients can’t easily use HATOS info HATEOAS only communicates connection, not meaning, so HATEOS info often doesn’t provide enough info to be useful
145
REST Security
REST builds on HTTP – same general rules For wire confidentiality, use TLS (SSL) To authenticate must use agreed-on authentication method E.G., OAUTH2 (token or key/secret) or basic authentication Typically on login uses cookies to store session key or other session info Requestee determines authorization
146
OpenAPI / Swagger
OpenAPI (originally “Swagger spec”) machine-readable interface files for describing, producing, consuming, and visualizing RESTful Web services Development overseen by Open API Initiative (of the Linux Foundation) Language-agnostic Swagger = common implementation
147
OpenAPI basics
OpenAPI document is a JSON object may be represented in JSON or YAML All field names are case sensitive Primitive data types based on JSON integer is a type optional modifier “format”, e.g., a dateTime represented as type=string format=date-time Defines supported paths, operations (incl. summary & parameters), responses
148
synchronized clocks
strict timing constraints for messages compare timestamps on distributed data
149
Time
Fundamental unit: second Historically, 1/86 400 of a mean solar day But there are irregularities in the rotation of the Earth Also, Earth’s rotation is slowing down Since 1967, a second is based on atomic measures A second is the duration of 9 192 631 770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the cesium 133 atom (in its ground state at a temperature of 0 K)
150
International Atomic Time (TAI)
Weighted average of the time kept by over 400 atomic clocks in over 50 national laboratories worldwide Continuously increasing, accurate, not coordinated with Earth’s rotation
151
Global Positioning System (GPS) time
Each GPS satellite broadcasts its position & local time Receivers determine location & time using transmission delay GPS time was zero at 0h 6-Jan-1980 TAI is always ahead of GPS by 19 seconds
152
Universal Time 1 (UT1)
Conceptually mean solar time at 0°longitude (Greenwich), but actually uses distant quasars, etc. Measures Earth’s rotation (thus coordinated with it), but Earth wobbles, so its length of second varies (!)
153
Coordinated Universal Time (UTC)
Primary time standard by which the world regulates clocks and time Tanenbaum uses term nonstandard“Universal Coordinated Time” Based on TAI, but seconds added/removed to keep within 1 second of UT1 (mean solar time at 0°longitude) Leap seconds occasionally inserted: 58, 59, *60*, 0, 1, … Insertion preference at the end of December and June Leap second inserted on 2015-07-01; TAI-UTC=36s Some want to stop adding leap seconds – this would redefine “day” to be unrelated to the sun and Earth’s rotation I’m pro-leap-seconds; if you want continuous, use TAI or GPS time
154
Local time
Add daylight saving time & timezone to UTC Eastern Standard Time (EST - US) UTC -0500 Eastern Daylight Time (EDT; summer) UTC -0400 India standard time UTC +0530 Nepal standard time UTC +0545
155
clocks drift from UTC by the very nature of UTC and device characteristics
2ρ.Δt (read slide 12 of 05-synch)
156
What is the minimum frequency (in seconds) that two clocks must be resynchronized if you want to guarantee that they are no more than 10^-5 seconds (10 microseconds) apart, where each has a maximum (linear) drift rate of p=10^-6 (1 microsecond/second, a typical rate for a hardware quartz clock)?
See third edition section 6.1. Given maximum clock drift rate p (the difference per unit time from a perfect reference clock), and precision "precision" in seconds (the maximum 2 clocks are allowed to be, even if they drift in worse case in opposite directions), then the clocks must be resynchronized at least every precision / (2p) seconds. Given p=10^-6 (a typical rate for hardware quartz clock), and precision of 10^-5, we have a minimum resynchronization frequency = (precision)/(2p) = (10^-5)/(2*(10^-6)) = every 5 seconds.
157
Order the following process when attempting to get . Presume that DNS iterative resolution is used, that the DNS resolver begins with no cached data other than the DNS root, and that the names "www.cs.gmu.edu" and "cs.gmu.edu" are managed by the same DNS server (i.e., are within the same DNS zone)
The client contacts its local name resolver to implement the name resolution process on www.cs.gmu.edu. The name resolver hands the complete name www.cs.gmu.edu to the root name server ".". The DNS root server resolves www.cs.gmu.edu as far as it can, and since it can only resolve to edu, it will return the address of the name server for "edu.". The client name resolver contacts the name server for "edu." and requests it to resolve www.cs.gmu (.edu). The name server for "edu." resolves www.cs.gmu (.edu) as far as it can, and since it can only resolve to gmu, it returns the address of the name server for "gmu.edu.". The client name resolver contacts the name server for "gmu.edu." and requests it to resolve www.cs (.gmu.edu). The name server for "gmu.edu." resolves www.cs (.gmu.edu) as far as it can, and it returns the address of the name server for "cs.gmu.edu.". The client name resolver contacts the name server for "cs.gmu.edu." and requests it to provide the IP address of www (.cs.gmu.edu). The name server for "cs.gmu.edu." provides the IP address of www.cs.gmu.edu. The client's local name resolver returns the IP address of www.cs.gmu.edu. This IP address can then be used to initiate the HTTPS protocol to perform a GET of "/about/contact-info".
158
An NTP client is polling its NTP servers. It sends one request at client time 345 ms, which is received by the server at server time 1032 ms. The server responds at server time 1184 ms, and the client receives the message at client time 700 ms. What is the computed time offset for this particular request? (If the answer is not an integer use a decimal representation, e.g., "1.5" or "-80.5").
This is just the computed time offset (θ). This is different from the round-trip delay (δ), though that's also calculated in the algorithm because results with lower round-trip delay are preferred. However, since the question only asked for the computed time offset, that's what you should have provided. This computed time offset is computed using ((t1-t0)+(t2-t3))/2. answer is 585.5
159
how about the network propagation time?
radio broadcast ±10ms due to atmospheric fluctuations satellite ±500μs knowing the distance to a geostationary satellite local network tens or hundreds of ms, due to network stack, load on processors internet seconds range, due to routers, queues…
160
Network Time Protocol (NTP)
Network Time Protocol (NTP) is a networking protocol for clock synchronization to “real” time Designed for variable-latency data networks Servers provide time values, clients request time info (it does support peer-to-peer) Hierarchical: “stratum” counts layers from reference clock (prevents cycles). Stratum 0 = reference clock Clients & servers include local clock timestamps in messages More complex algorithm, but gets real time distributed Clients Regularly polls three or more NTP servers on diverse networks Gathers data & determines how to adjust its clock
161
NTP client data processing
Client regularly polls servers, for each computes time offset (θ) and round-trip delay (δ) The values for θ and δ are passed through filters and subjected to statistical analysis (book: θ of smallest δ) Outliers are discarded and an estimate of time offset is derived from the best three remaining candidates Presumes symmetrical nominal delay Many details omitted here!
162
Lamport had a simple idea preserving before-after relationships
definition: if a and b are events, a b denotes that a occurs before b transitivity: if a b and b  c then a c if a and b occur in the same process, and a occurs before b, then a  b holds if a is the event of a message being sent by one process, and b is the event of that message being received by another process then a b holds definition (interleaving): if a and b happen in two processes that do not coordinate, then neither a b and b a holds
163
Lamport’s idea preserving before-after relationships of message sending and receiving events
definition: C(a) is the clock value when event a occurs if a and b occur in the same process, and a occurs before b, then C(a) < C(b) if a is the event of a message being sent by one process, and b is the event of that message being received by another process then make sure C(a) < C(b) interleaving: if a and b happen in two processes that do not coordinate, then we don’t know the relation between C(a) and C(b)
164
lamport clock adjustment
Lamport’s algorithm can be used to ensure that all nodes agree on the ordering of events if (1) Each time stamped message is sent to everyone in the group, (2) Messages sent from the same sender are received in the same order (3) No messages are lost
165
parallelism
Simultaneous execution - execution of process or computation simultaneously Need >1 CPU core (but today that’s the normal case, and it’s always true in a distributed system)
166
concurrency
“concurrency is the property of program, algorithm, or problem decomposability into order-independent or partially-ordered components or units.” [Lamport1978] Several computations are executing during overlapping time periods—concurrently—instead of sequentially (one completing before the next starts) Concurrency doesn’t require parallelism – it can be implemented on a single processor (through interleaving)
167
parallelism != concurrency
but often same solutions apply to both
168
threads
Process Each process has a separate memory area from other processes Thread Executes code, no attempt to isolate memory of a thread from other threads in the same process Thread implementation generally maintains minimum information (e.g., CPU context) Using multiple threads can have higher performance than multiple processes, but using them correctly requires more intellectual effort Easier to get things wrong, and can be difficult to debug because defects often aren’t reproduceable Using threads or processes can improve scaleability Take advantage of those multiple processors you have
169
understanding threads
some advantages of using threads separation of concerns: different activities in different threads one thread remains responsive (e.g. user input) even if others are busy or blocked (e.g. waiting for messages) – decreases overall latency support requests of multiple clients using threads for replicated computation pool: assign a thread when a request comes in more efficient, harder to manage factory: create a thread when a request comes in easier to manage, less efficient threads are supported by a library/VM, the OS, or both making a process-blocking OS call blocks all threads in some library/VM implementations calling exit() in one thread terminates the process
170
threads in distributed systems
Threads provide convenient way to allow blocking system calls without blocking entire process Good for distributed systems – easier to express communication with multiple logical connections at same time Distributed systems can impose significant delays in communication between components – don’t want to block everything
171
data is shared at different granularities
shared memory (address space) distributed shared memory objects distributed object stores files distributed file systems
172
concurrency challenges the predictability of effects
P and Q running concurrently. For the moment we’ll assume + and * are atomic (normally they are not) (see slide 11and 12 lecture 6) x = x + 3 often implemented like this: Load value of x Load constant 3 Add them Store result in x … so even “simple” operations like addition are often not atomic (they can be broken down)
173
blackboard architectures
single data repository (blackboard) all components read/write on blackboard
174
client-server with central database
multiple clients may access same DB records
175
distribution challenges the consistency of observed values
distribution network propagation delays different clocks causes inconsistency given one trace of events, different components may see those events in a different order
176
predictability ≠ consistency
concurrency causes unpredictability cannot tell which trace will occur distribution network propagation delays different clocks causes inconsistency given one trace of events, different components may see those events in a different order
177
ordering consistency policies share the premise of hiding synchronization from programmers
definition ordering (aka data-centric) consistency: all components observe operations on shared data at the same time strict consistency – only possible with shared clock and insignificant propagation delays in the same order linear consistency in an order that “makes sense” sequential, causal, FIFO consistency
178
synchronization
address both predictability and ordering consistency. programmers need to use explicit synchronization techniques anyway, because of unpredictability (due to concurrency) explicit synchronization relieves the middleware/OS from having to assure ordering consistency
179
monitors
used for explicit synchronization Monitor provides a queue with certain entry condition that is used to guarantee only one process operates on the critical section (data) at a time
180
the parking lot start by identifying structure and events
events or actions of interest? arrival and departure identify processes arrivals, departures and CarPark control define structure and interactions
181
Replication
Reasons for replication: Improve reliability – can continue working even if some component fail (see fault tolerance) Performance – can distribute work & put data near where it’s needed Problem: Can lead to consistency problems Challenge to keep replicants consistent
182
Data-centric consistency models
Often focus on shared data aka “data store” May be (distributed) shared database, shared filesystem, shared memory, etc. Consistency model = a contract between processes & data store (if processes do X, data store promises Y) Without a global clock not easy to define “last write” Often can accept some inconsistencies… but need to bound it, & thus need to categorize them
183
Kinds of inconsistencies
Deviation in values between replicas Absolute numerical deviation (“no more than $0.02”) Relative numerical deviations (“no more than 0.5%”) Deviation in staleness (how old) “Data no older than X seconds” Deviation in ordering of update operations This is more complex! Sequential consistency, causal consistency, eventual consistency, …
184
Sequential consistency
“The result of any execution is the same as if the read & write operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.” Any valid interleaving of read & write operations is acceptable, but all processes see the same interleaving Expensive to implement in distributed system
185
Causal consistency
Weakens causal consistency – distinguishes what is potentially causally related “Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.”
186
Eventual consistency
“If no updates take place for a long time, all replicas will gradually become consistent [have exactly the same data]” Updates must eventually propagate to all replicas Problem: write-write conflicts (same data item written with different values) Often the solution is an algorithm that declares one as the “winner” (cancelling the effects of any previous conflict) Often cheap to implement, at a cost of inconsistency for a period of time
187
Client-centric consistency
Give up the idea of a “central data store” Provide consistency guarantees from POV of 1 client, No guarantees about different clients Various consistency models: Monotonic reads: If a process reads the value of a data item x, any successive read of x by that process will always return that same or more recent value (“never read older version”) Monotonic writes: A write by a process on data item x is completed before any later write on x by the same process Read-your-writes: The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process Writes follow reads: A write operation by a process on data item x following a previous read operation on x by the same process is guaranteed to take place on the same or more recent value of x that was read (“See a posting about an article only if saw original article”)
188
Some systems support so-called "ACID" transactions. What does the "A" in "ACID" stand for?
Atomic
189
in a publish/subscribe system, applications can indicate their interest in specific types of message, and the communication middleware will ensure that those types of messages are delivered to the applications interested in that type of message.
true
190
magine an advanced calculator running locally on a computer system. It does not communicate outside that computer system. True or False: This system would typically be considered a distributed system.
false
191
True or false: Caching and replication can lead to consistency problems.
true
192
Which of the following is a valid assumptions to make, in general, when developing a distributed system?
none of them are true: The network topology does not ever change. Network bandwidth is infinite. The network is reliable. Network latency is zero.
193
which of the following is false
In RMI, unlike in RPC, applications send messages to logical contact points.
194
The world-wide web (WWW) consists users who use web browsers (web clients) to talk to a variety of web servers. Users type in URLs to retrieve information from the web.
true
195
An experimental file server is down 20% of the time, due to various problems. If you are to use replication to alleviate this problem, how many replicas would you need to reduce the down time to 0.7%? Assume independent failures. Do NOT include the "master" system in your count of the replicas.
3 System down = all servers are down. Probability the system is down= (probability of each server down)^N Goal ≥ Probability the system is down Probability each server is down = 20% = 20/100 = 0.2 Goal = 0.7% = 0.7/100 = 0.007 Probability the system is down = (0.2)^N N = 2 ==> Probability the system is down = (0.2)*(0.2) = 0.04 N = 3 ==> Probability the system is down = (0.2)*(0.2)*(0.2) = 0.008 N = 4 ==> Probability the system is down = (0.2)*(0.2)*(0.2)*(0.2) = 0.0016 Goal = 0.007 ≥ Probability the system is down = 0.0016 So, the minimum number of servers needed to reduce the down time to 0.7%, including the master, is 4. That means only 3 additional replicas are needed (since the question expressly asked you to not count the master server in the count of replicas).
196
Imagine an e-commerce system consisting of many software components, including client, server, and database objects, that are deployed on different devices and communicate and coordinate with each other to provide the ecommerce functionality. True or False: This system would typically be considered a distributed system.
TRUE
197
What does the term "durable" in ACID mean for transactions?
Once a transaction commits, the changes are permanent.
198
notify
install a handler to be called when a message is put into the specified queue
199
poll
Check a specified queue for messages, and remote the first. Never block
200
In some period of time, the probability of a component successfully working the entire time (without failure) is 70%. If you want the system as a whole to fail with probability less than or equal to 0.02%, what is the minimum number of replicas (INCLUDING a "master") that will be required? Presume component failure is independent.
8 We need "F" (Fail) probability, and all we're given is the probability of success for each component. This is easily resolved: F = 1 - Success n >= (log Goal)/(log F) so find the smallest n (the "ceiling") that meets this inequality. Since "n" has to be an integer (you can't use a *part* of a component), you'll typically have to go up to the smallest integer greater than or equal to this calculation (this is also called the "ceiling"). Finally: Do we include the "master" in this case? As discussed in class, people aren't consistent in their terminology so you have to look at it for each question. In this case, the question *clearly* states that we *do* include the "master" system in this case, so you don't subtract by one.
201
True or False: MOM achieves fault-tolerance by implementing message queues that store messages temporarily on persistent storage. The sender writes the message into the message queue and if the receiver is unavailable due to a failure, the message queue retains the message until the receiver is available again.
true
202
A remote procedure call (RPC) is implemented by a number of steps. Order the steps below (presuming that the RPC must go over a network).
The client procedure calls the client stub in the normal way. The client stub builds the message (e.g., "marshalling" the parameters) and calls the local operating system asking the message to be sent via the network. The client's operating system sends the message to the remote (server) operating system. The remote operating system gives the message to the server stub. The server stub unpacks the parameter(s) (aka "unmarshalls" the parameters) and calls the server. The server does the work and returns the result to the stub. The server stub packs the result into a message (including "marshalling" the parameters) and calls its local operating system. The server's operating system sends the message to the client's operating system. The client's operating system gives the message to the client stub. The client stub unpacks the result and returns it to the caller within the client.
202
Which of the following is the most accurate description of "persistence" as an attribute of a messaging system?
Persistence refers to whether the messages endure in the system (regardless of the sender being active and the recipient being available) or not.
202
Which of the following communication approaches would be most useful if the communicating components are not aware of each other’s locations, and the communicating components may not be available at the same time?
publish-subscribe
202
There are two models for activating service supply: factory & pool. In a factory model, the system instantiates all the instances that might be needed ahead-of-time, and then when a request shows up, one of those pre-created instances is allocated to the request. In a pool model, an instance is created at the time of the request. An advantage of pre-creating instances is that requests can often be handled more quickly, since the instance is already ready to go (this effect is more important if instantiation takes a long time). A disadvantage of pre-creating instances is that the instance typically requires resources (at least storage) even when it is not being used. Is all of the above generally true, or is it false?
false Read carefully! It's true that "There are two models for activating service supply: factory & pool." The later text is also true, because it's true that "An advantage of pre-creating instances is that requests can often be handled more quickly, since the instance is already ready to go (this effect is more important if instantiation takes a long time). A disadvantage of pre-creating instances is that the instance typically requires resources (at least storage) even when it is not being used." However, the definitions of factory and pool are completely reversed. In the pool model, "the system instantiates all the instances that might be needed ahead-of-time, and then when a request shows up, one of those pre-created instances is allocated to the request". In the factory model, an instance is created at the time of the request.