622 Flashcards
Eight pleasant thoughts
-The network is reliable
-The network is secure
-The network is homogeneous
-The topology does not change
-Latency is zero
-Bandwidth is infinite
-Transport cost is zero
-There is one administrator
What is a distributed system? [Tanenbaum]
A collection of independent computers that appears to its users as a single coherent system
Three key characteristics [Tanenbaum]
Multiple machines are autonomous
Software lets users see a single system
System easy to expand without user noticing
What is a distributed system? [Webopedia]
A type of computing in which different components and objects comprising an application can be located on different computers connected to a network.
Key requirement: a set of standards that specify how objects communicate with one another (e.g., CORBA, DCOM, REST, …).
What is distributed computing? [Wikipedia]
Decentralized and parallel computing, using two or more computers communicating over a network to accomplish a common objective or task.
Note: The types of hardware, programming languages, operating systems, and other resources may vary drastically. It is similar to computer clustering, with the main difference being the wide geographic dispersion of the resources.
Challenges of DS
- latency of communication
- coordination
- shared resources and mutual exclusion
- ordering, deadlock, and livelock
- timing
- adaptation to change
- failures, soft faults, and optimization
- service discovery and configuration
- heterogeneity and third-party software
- scalability and evolution
- security and privacy
- trust in machines, software, communications & other users
Advantages of DS
- processing capacity
- fault tolerant, evolving, scalable
- explicit control, preferences
Replicas
Often useful to have the same task performed by multiple components, so that all of them must fail for the task to fail
What if data must be shared between components?
Often one component is the “master” (aka the “original” or “authoritative version”)
The other components are copies of the master
Mastership may be apportioned, e.g., a component may be the master for just some portion like “names A-K” (see the sketch after this card)
Confusion in counting the number of “replicas”:
Some might not include the “master” in the count of replicas
Some might include the “master” (“replicas of each other”)
Make sure you know if the “master” is included
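A minimal sketch of apportioned mastership in Python; the key ranges and node names here are hypothetical:

# Each node is the master (authoritative copy) for one alphabetical
# slice of the names.
PARTITIONS = [("A", "K", "node-1"), ("L", "Z", "node-2")]

def master_for(name):
    first = name[0].upper()
    for low, high, node in PARTITIONS:
        if low <= first <= high:
            return node
    raise KeyError(name)

print(master_for("Alice"))    # -> node-1
print(master_for("Mallory"))  # -> node-2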
How to solve for the number of replicas
If we assume independence and:
F = probability that one replica fails in the time period, 0 < F < 1
n = (natural) number of components (e.g., replicas including the master); thus F^n is the probability that all n fail simultaneously
G = goal, the permitted probability of total system failure where all n replicas fail (including the original)
then we need F^n ≤ G; since log F < 0, dividing by log F flips the inequality:
n ≥ (log G)/(log F)
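A small Python sketch of this calculation (the function name is mine, not from the source):

import math

def replicas_needed(failure_prob, goal):
    # Smallest natural n with failure_prob**n <= goal, assuming
    # independent failures; log F < 0, so the inequality flips.
    assert 0 < failure_prob < 1 and 0 < goal < 1
    return math.ceil(math.log(goal) / math.log(failure_prob))

# Example: each replica fails with probability 0.5 per period, and the
# whole system may fail with probability at most 0.01:
print(replicas_needed(0.5, 0.01))  # -> 7, since 0.5**7 ≈ 0.0078 ≤ 0.01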
Independence assumption
These calculations assume independent failures
Reasonable model for many hardware failures
Software failures often not independent
Knight & Leveson [1986] found via experiment that software faults are not independent
Thus “N-version programming” doesn’t lead to the reliability increase you might predict
It can be helpful, but less than you’d think
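A toy calculation of why correlated faults break the prediction; the probabilities are invented for illustration, not Knight & Leveson's measurements:

# 3 independent versions, each failing with probability 0.01, would all
# fail together with probability 0.01**3, about 1e-6.
predicted = 0.01 ** 3
# If instead some inputs trigger the *same* fault in every version (say
# a common-mode fault hit with probability 1e-3, a made-up figure), the
# ensemble fails at least that often: 1000x worse than predicted.
actual_lower_bound = 1e-3
print(predicted, actual_lower_bound)  # ≈ 1e-06 vs 0.001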
Caching: Special case of replication
Make a copy (or copies) of a resource (data)
Often happens on demand
Other replication approaches are often planned & executed in advance
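A minimal sketch of on-demand copying in Python; fetch_from_master is a hypothetical stand-in for a remote read:

master = {"names A-K": "example data A-K"}

def fetch_from_master(key):
    return master[key]          # stand-in for a (slow) remote read

cache = {}

def read(key):
    if key not in cache:
        cache[key] = fetch_from_master(key)   # copy made on demand
    return cache[key]

print(read("names A-K"))   # first read populates the cache
print(read("names A-K"))   # later reads are served locally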
Challenges of DS example: replication has downsides
buy more hardware
administration costs
software upgrades
load balancing
performance overhead
more complex software
consistency problems (sometimes tolerable)
hiding access
hide differences in data representation and how a resource is accessed
(conversion of complex formats; latency vs. fidelity of access)
hiding location
hide where a resource is located (trusted hosts, different performance, different capabilities and network access)
migration
hide that a resource may move to another location (trusted hosts, different performance, different capabilities and network access)
relocation
hide that a resource may be moved to another location while in use (trusted hosts, different performance, different capabilities and network access)
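A minimal sketch of how a naming indirection provides location, migration, and relocation transparency (Python; the names and addresses are hypothetical):

# Clients use a stable logical name; a directory maps it to the current
# physical address, so the resource can move without clients changing.
directory = {"user-db": "host-a.example.com:5432"}

def connect(logical_name):
    address = directory[logical_name]   # the lookup hides the location
    return f"connected to {address}"

print(connect("user-db"))
directory["user-db"] = "host-b.example.com:5432"   # resource migrated
print(connect("user-db"))                          # same name, new host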
replication
hide the fact that several copies of a resource exist (select server based on QoS)
concurrency
hide that a resource may be shared by several competitive users (cannot hide sharing of resources: they're consumed, data is modified by others)
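A minimal sketch of concurrency transparency via mutual exclusion (Python; the counter stands in for any shared resource):

import threading

counter = 0
lock = threading.Lock()

def deposit(amount):
    global counter
    with lock:             # serializes access: each user sees the
        counter += amount  # resource as if it were unshared

threads = [threading.Thread(target=deposit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 100, no lost updates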
failure
hide the failure and recovery of a resource (unexplained behavior)
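A minimal sketch of masking transient failure with retries (Python; flaky_fetch is a hypothetical unreliable call — the caller sees only extra latency):

import random, time

random.seed(0)   # for reproducibility of the sketch

def flaky_fetch():
    if random.random() < 0.5:
        raise ConnectionError("transient failure")
    return "data"

def fetch(attempts=5):
    for i in range(attempts):
        try:
            return flaky_fetch()
        except ConnectionError:
            time.sleep(0.1 * 2 ** i)   # back off, then retry
    raise ConnectionError("still failing after retries")

print(fetch())  # -> "data", failures hidden unless all attempts fail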
persistence
hide whether a (software) resource is in memory or on disk (someone needs to decide whether an object is persistent and commit it to disk)
awareness and adaptation
separate decisions from (controllable) mechanisms
Some measures (per Neumann) for system scalability
- Size: users & resources
- Geographical: users & resources may lie far apart
- Administrative: may span many independent administrative organizations
Decentralized algorithms
No machine has complete information about the system state.
Machines make decisions based only on local information.
Failure of one machine does not ruin the algorithm.
There is no implicit assumption that a global clock exists.
Asynchronous communication
Hiding communication latency is important for geographical scalability
Max speed is the speed of light in a vacuum (~3.00×10^8 m/s)
Information transfer through a material medium is normally slower
Physical components add other performance latencies
Software takes time to execute once it receives data
Sending information and waiting for the reply is synchronous communication
Alternative: asynchronous – send the information, don't wait for the reply
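A minimal sketch contrasting the two styles (Python; slow_service is a hypothetical stand-in for a remote call):

import time
from concurrent.futures import ThreadPoolExecutor

def slow_service(x):
    time.sleep(1)     # stand-in for network + processing latency
    return x * 2

# Synchronous: send and wait; the caller is blocked for the full latency.
result = slow_service(21)

# Asynchronous: send, keep working, collect the reply later.
with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_service, 21)
    # … do other useful work while the request is in flight …
    result = future.result()   # block only when the reply is needed
print(result)  # -> 42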