Fundamentals Flashcards

1
Q

What is a distributed system?

A
2
Q

What are the parts of a distributed system (or “distributed application”) called that may be arranged and may cooperate in different ways?

A

Processes/Threads

3
Q

In client/server distributed computing, what does the server provide to clients?

A

service(s)

4
Q

When we say “client” or “server” we are often referring to either/both of:

A

The machine providing/using the service and, more accurately, the thread(s)/process(es) running on those machines to provide/use the service

5
Q

In a P2P system, what happens with “servers” and “clients”?

A

In such systems, no machines are identified as “servers” or “clients”; all are supposedly equivalent “peers”

With P2P, each system typically performs both roles (server and client) at the same or, possibly, at different times

– E.g. a P2P file sharing user might be providing MPGs to another user at the same time s/he is downloading other MPGs

• When providing files, the user’s machine is acting as a server but when it is downloading files, it is clearly acting as a client
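
A minimal sketch of the dual role described above, kept entirely in memory with no real networking. The `Peer` class, its methods, and the file names are hypothetical and exist only for illustration:

```python
class Peer:
    """A peer that can both serve files and request them (hypothetical sketch)."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})  # file name -> contents

    def serve(self, file_name):
        # Server role: hand out a file this peer already holds.
        return self.files[file_name]

    def download(self, other, file_name):
        # Client role: ask another peer for a file and keep a copy.
        self.files[file_name] = other.serve(file_name)


alice = Peer("alice", {"a.mpg": b"AAAA"})
bob = Peer("bob", {"b.mpg": b"BBBB"})
alice.download(bob, "b.mpg")   # alice acts as client, bob as server
bob.download(alice, "a.mpg")   # roles reversed: bob is client, alice is server
```

Neither object is permanently a “server” or a “client”; the role depends on which method is being called at that moment.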

6
Q

What are some possible benefits offered by distributed systems?

A

Improved/broader access to the system

  • Data access, and other services may be made conveniently available across multiple machines
    • E.g. a database/e-mail/file directory/… that is physically maintained at an organization’s head office may be accessible from its many branch offices around the world

Enhanced sharing

  • Resources can be easily shared by many users
  • E.g. high-end printers & backup devices are commonly shared

Cost-effectiveness

  • Sharing devices is commonly more cost-effective than supplying each user with their own
    • E.g. share a single high-speed, duplex, colour printer between several users rather than buying many

Less Systems Administration Effort

  • Accounting information can be shared across many machines in a distributed system
    • E.g. the system administrator only has to create and maintain a single userid for many machines

Enhanced Availability

  • Having multiple copies of data can offer improved availability
    • E.g. “replicated” web servers can continue to provide service even when one or more have failed

Better Performance

  • Multiple service providers can offer better service
    • E.g. the web requests sent to a large web server are commonly “distributed” across a number of web servers to provide faster response to each request
      • Content distribution across geographical regions (à la Akamai)
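
The availability and performance points above can be sketched with a toy dispatcher that spreads requests over replicas and skips ones that have failed. This is only one possible policy (round-robin). The `ReplicaPool` class and the replica names are assumptions made for illustration, not part of any real load balancer:

```python
from itertools import cycle


class ReplicaPool:
    """Round-robin dispatch over replicas, skipping ones marked as failed."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.failed = set()
        self._order = cycle(self.replicas)

    def pick(self):
        # Try each replica at most once per call.
        for _ in range(len(self.replicas)):
            replica = next(self._order)
            if replica not in self.failed:
                return replica
        raise RuntimeError("no replica available")


pool = ReplicaPool(["web1", "web2", "web3"])
first_three = [pool.pick() for _ in range(3)]    # load is spread over all three
pool.failed.add("web2")                          # simulate a crashed replica
after_failure = [pool.pick() for _ in range(4)]  # service continues without web2
```

If every replica is marked as failed, `pick()` raises an error, which is the case replication cannot help with.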
7
Q

What are some possible liabilities of using distributed systems?

A

Complexity

  • Building and maintaining distributed systems is harder than building and maintaining centralized ones
    • E.g. how do we know which parts of the system should be on which machines? What happens when one part fails?

Higher operational costs

  • Looking after distributed systems can be expensive
    • E.g. OS upgrades and patches must be done on N smaller machines instead of just one big one, etc.

Security and Trust Issues

  • Whenever sensitive data is moved across a network, security becomes a concern
    • E.g. typing your credit card number and expiration date into a web site – who’s really at the other end, listening on the line?

Decreased Availability

  • Wait a minute!!! Didn’t we list improved availability as a benefit of distributed systems???
  • Yes, but you have to work for it!
    • What if one part of a distributed system goes down and you don’t have replication of that part? – “Extended failure modes”
    • “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable” (Leslie Lamport - paraphrased)
8
Q

For processes and threads, which is lightweight and which is heavyweight?

A

Processes = Heavyweight

Threads = Lightweight

9
Q

What do all distributed systems share in common?

A
  • Network communication
  • How to find things (servers/services, peers, …)
  • The need to deal with a variety of failures
  • How to build systems that can grow (scalability)
10
Q

What is the “network”?

A

Networks provide the “mechanism” by which the parts of a distributed application communicate

What happens inside the network has an impact on how we build distributed systems and what they can be expected to do

Think of the network as a “cloud”: like driving a car, we don’t need to understand how the engine works, we just need to understand how to drive it (the interface)

11
Q

“Useful distributed applications”

What does the term “useful” mean in this context?

A

Useful = how fast, how available/reliable

12
Q

With regard to network communication, what are some key questions one should probably ask?

A
  • What does the communication?
  • How do we identify communicating parties?
  • How do they communicate? (What is the API?)
  • Is communication always reliable?
    • The Internet is a “best effort” service unless you manually account for its unreliability
  • Is communication performance predictable?
  • What happens as networks get big (or small)?
13
Q

What communicates in a distributed system?

A

The users of distributed systems are NOT the communicating entities

  • Though they sometimes do communicate using distributed systems (Skype, email, Facebook)

The communicating entities are actually the “running programs”

  • “processes” or “threads”
  • For now we assume that processes communicate over the network
14
Q

What is the network communication diagram with regard to client machines and server machines?

A
15
Q

How are communicating parties identified?

A

We need at least both a unique machine id and a unique process id on that machine

  • For the Internet, an Internet Protocol (IP) address uniquely identifies each machine (more or less)

But even this is still not enough: process IDs (PIDs) get reused over time on a single machine.

A PID is unique only within the set of processes that currently exist, so PIDs will need to be reused at some point. If we have services that run for a long time, we could possibly run out of PIDs.

  • A port number identifies a service provided by a process running on a given machine
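
A small sketch of the identifiers discussed above, using Python's standard `socket` and `os` modules. Binding to the loopback address and letting the OS choose the port (port 0) are choices made only for this illustration:

```python
import os
import socket

# Bind a UDP socket to the loopback address; port 0 asks the OS to pick a free port.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))

ip, port = sock.getsockname()   # the (IP address, port) pair naming this endpoint
pid = os.getpid()               # unique only among currently running processes

print(f"endpoint {ip}:{port} is served by process {pid}")
sock.close()
```

Other machines would address this service by the IP/port pair, never by the PID, which only has meaning on the local machine.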
16
Q

What does it mean for a process to communicate?

A

Model of “message passing”

Other abstractions can be built on top

This works well because networks provide for the delivery of data in the form of “packets”

Building on packets, “messages” may be sent between two “end points” (processes on specific machines identified using port #’s)

Prof notes: The representation of the data can be affected by process X or process Y

Problems arise where the client and server parts don’t correctly agree on what’s being transferred over the network
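
One way such a disagreement can show up is byte order. The sketch below uses Python's `struct` module. The message layout (a 2-byte request code followed by a 4-byte file size) is a made-up example, not a real protocol:

```python
import struct

# Hypothetical message layout: a 2-byte request code followed by a 4-byte file size.
request_code, file_size = 1, 4096

# The sender encodes in network (big-endian) byte order, marked by "!".
wire_bytes = struct.pack("!HI", request_code, file_size)

# A receiver using the same format recovers the original values.
agreed = struct.unpack("!HI", wire_bytes)

# A receiver that wrongly assumes little-endian ("<") misreads the same bytes.
disagreed = struct.unpack("<HI", wire_bytes)

print(agreed)      # (1, 4096)
print(disagreed)   # (256, 1048576)
```

The bytes on the wire are identical in both cases; only the interpretation differs, which is why both ends must agree on the representation in advance.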

17
Q

What does the intuitive message example look like?

A

“please give file ‘f’” to port 27983 on machine 130.179.28.1

and might be followed by something like:

receive file ‘f’ from port 27983 on machine 130.179.28.1

18
Q

What are the key concepts inside of messages?

A
  • Endpoint identification (port #/IP address pairs)
  • Messages
  • Send call(s)
  • Receive call(s)
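
The four concepts above can be seen together in a short sketch using UDP sockets from Python's standard library. Both endpoints run in one process on the loopback address purely for illustration, and the message texts are invented:

```python
import socket

# "Server" endpoint: a UDP socket on the loopback address, port chosen by the OS.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
server_addr = server.getsockname()          # (IP address, port) pair

# "Client" endpoint.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)

# Send call: takes the message and the destination endpoint.
client.sendto(b"please give file 'f'", server_addr)

# Receive call: returns the message plus the sender's endpoint.
request, client_addr = server.recvfrom(1024)

# Reply to whichever endpoint sent the request.
server.sendto(b"contents of f", client_addr)
reply, _ = client.recvfrom(1024)

client.close()
server.close()
```

In a real system the two endpoints would be separate processes, usually on different machines, and either message could be lost in transit.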
19
Q

What are the two message-based communication strategies that are commonly supported?

A

Connection-Oriented:

Like using the phone: you pick it up and punch in the number, they pick up, and you have a connection until someone hangs up.

Connectionless:

Connectionless is like the postal system: every letter you send has to be addressed. You can’t drop in a letter with no address and expect it to get to your aunt.
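
A rough sketch of the phone-call pattern, using TCP sockets from Python's standard library. Both ends run in one process on the loopback address only to keep the example self-contained:

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
listener.settimeout(2.0)

# "Dial the number": the address is given once, when the connection is set up.
caller = socket.create_connection(listener.getsockname(), timeout=2.0)
callee, _ = listener.accept()   # "they pick up"
callee.settimeout(2.0)

# Once connected, individual messages carry no address.
caller.sendall(b"hello")
caller.sendall(b" again")

expected_length = len(b"hello again")
received = b""
while len(received) < expected_length:
    chunk = callee.recv(1024)
    if not chunk:               # the other side hung up early
        break
    received += chunk

# "Hang up".
caller.close()
callee.close()
listener.close()
```

With a connectionless socket, by contrast, every send would have to name the destination again, like addressing each letter.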

20
Q

Generally, how might the communication look for a file transfer program?

A

In a file transfer program, the client and server would probably exchange many messages containing blocks of data, since the files may be large.

Prof Notes:

If you were to transfer a 4 GB file, it would take a huge number of messages: hundreds of thousands with blocks of a few kilobytes, and far more if sent 64 bits at a time. Connection-oriented communication would make more sense here. You don’t have to say where each individual message is going; you establish the connection and funnel them all through until you are ready to disconnect.
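
The message counts can be checked with a little arithmetic. The block sizes below are arbitrary examples, and “4 GB” is taken as 4 GiB, which is an assumption:

```python
FILE_SIZE = 4 * 1024**3          # 4 GB, taken here as 4 GiB (an assumption)


def messages_needed(file_size, block_size):
    """Number of messages if each carries at most block_size bytes of the file."""
    return (file_size + block_size - 1) // block_size   # integer ceiling division


block_sizes = [("64 bits", 8), ("8 KiB", 8 * 1024), ("64 KiB", 64 * 1024)]
for label, block_size in block_sizes:
    print(f"{label:>7} blocks: {messages_needed(FILE_SIZE, block_size):,} messages")
```

Whatever the block size, every one of those messages would need its own address under a connectionless scheme.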

21
Q

Using connection oriented messaging we might see a pattern like?

A
22
Q

What is an example of a connection-oriented form of communication, and what are its details?

A
23
Q

What is an example of a connectionless form of communication, and what are its details?

A
24
Q

What is an example where one might not be concerned with reliability?

A
25
Q

Why can network performance significantly affect distributed application performance?

A
26
Q

What happens when networks get big?

A
27
Q

What are resources?

A
28
Q

How does resource naming work with DNS as an example?

A
29
Q

In general what is resource discovery?

A
30
Q

What are the 3 standard techniques for resource discovery?

A
31
Q

What are some key differentiators when considering the implementation of resource discovery?

A
32
Q

Distributed systems should be designed from the start with what in mind?

A

Failures.

Just considering the “happy path” during design is unacceptable

33
Q

Additional failure modes become possible in distributed systems because of what two factors?

A
34
Q

Provide an example of one process failing but another still runs with a failure at the NFS server itself

A
35
Q

Provide an example of one process failing but another still runs with a failure at the connection to the NFS server

A
36
Q

What must be provided in all distributed applications?

A

The ability to detect and deal with failures must be provided in all distributed applications (or the networks underlying them)

37
Q

What are the techniques for detecting and recovering from failures

A
  • Timeouts
    • Set a timer when a request is sent, and if it expires before a response is received…
    • At timeout we know SOMETHING is wrong
  • Redundancy
    • Assign the same work to multiple machines and verify their results against each other
    • OR send redundant messages on different network paths (for critical communications only!)
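
Both techniques can be sketched briefly in Python. The function names `request_with_timeout` and `majority_result` are hypothetical, and majority voting is just one possible way to verify redundant results against each other:

```python
import socket
from collections import Counter


def request_with_timeout(sock, message, server_addr, timeout_s):
    """Send a request and wait for a reply; return None if the timer expires."""
    sock.settimeout(timeout_s)
    sock.sendto(message, server_addr)
    try:
        reply, _ = sock.recvfrom(1024)
        return reply
    except socket.timeout:
        # We only know SOMETHING went wrong: a lost request, a lost reply,
        # a crashed server, or a server that is merely slow.
        return None


def majority_result(results):
    """Redundancy: accept the answer that a strict majority of machines agree on."""
    value, votes = Counter(results).most_common(1)[0]
    return value if votes > len(results) / 2 else None


# Demo: a bound socket that never replies stands in for a failed server.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
silent_server.bind(("127.0.0.1", 0))
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
outcome = request_with_timeout(client, b"ping", silent_server.getsockname(), 0.2)
client.close()
silent_server.close()
```

Here `outcome` ends up as `None` because the timer expired. The timeout alone cannot say which of the possible failures occurred.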
38
Q

Provide an example where a failure would be catastrophic if not handled?

A
39
Q

What is a strategy for recovering from communication-related failures?

A

Either TCP does this for you, or you have to do it yourself
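
The card does not name the strategy. One common approach, and the one TCP uses for lost segments, is retransmission. Below is a minimal sketch of doing it yourself, where `send_once` is a hypothetical callable that reports whether an acknowledgement arrived before the timeout:

```python
def send_reliably(send_once, message, max_attempts=3):
    """Retransmit until an acknowledgement arrives or the attempts run out.

    send_once is a hypothetical callable that returns True when the message
    was acknowledged and False when the timeout expired first.
    """
    for attempt in range(1, max_attempts + 1):
        if send_once(message):
            return attempt          # number of tries it took
    raise ConnectionError(f"no acknowledgement after {max_attempts} attempts")


# Simulated unreliable channel: drops the first two transmissions.
outcomes = iter([False, False, True])
attempts_used = send_reliably(lambda msg: next(outcomes), b"block 1")
```

Retransmitting means the receiver may see the same message more than once, so a real protocol also needs a way to detect duplicates (for example, sequence numbers).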

40
Q

What is the definition of scalability

A
41
Q

Provide one way to address this consistency problem

A
42
Q

Since deployment of network upgrades is typically very slow, it is not normally feasible to solve the problem by expecting better/faster networks. What is normally used instead to try to minimize communication?

A
43
Q

How does the size of user base affect scalability?

A