Terms Flashcards

1
Q

HPC (High Performance Computing)

A

A computing model that focuses on processing power over data throughput.
Handles large-scale problems and is maximized for speed.
e.g. Weather Forcasting, Scientific Modelling

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

HTC (High Throughput Computing)

A

A computing model that focuses on maximizing data throughput over processing power.
handles large amounts of smaller, independent processes.
e.g. Web services that handle user requests

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Gustafson-Barsis’s Law

A

as the number of processors increases, the problem size or workload can be increased proportionally to keep the total execution time of the program constant.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Little’s Law

A

States that the average number of tasks in a system is equal to the arrival rate of tasks multiplied by the average time a task spends in the system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Buffer

A

A temporary storage space used to hold data while it’s being moved from one place to another.
Enables efficient data transfer and controls the flow of data between systems

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the requirements for deadlock to occur in concurrent code?

A

Deadlock occurs when two or more threads are blocked because they are waiting on each other to release resources they hold.
The 4 conditions for Deadlocking to occur are mutual exclusion, Hold & Wait, No Preemption and Circular Wait

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Atomocity

A

When a concurrent programme is completing a task, it treats all actions as a single unit of work.
It prevents race conditions, deadlocks and inconsistent data states.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Monitor

A

High-level synchronization construct that provides a way to control access to shared resources in a multi-thread environment.
Uses mutual exclusion

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What does the synchronized keyword in java do and give 3 limitations of this keyword?

A

Java’s synchronized keyword creates monitors that control access to shared resources in a multi-thread environment.
3 limitations of this are it’s potential for deadlock, limited granularity and potential performance overhead.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Give an example of a commercial distributed system

A

Amazon’s Web Services.
(Cloud computing)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Provide two examples of system desgins for distributed systems

A

Client-Server Architecture and Peer-to-Peer architecture

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is Client-Server architecture?

A

The client sends requests to the server, the server processes these requests and sends responses to the client.
Limitation: latency issues if the server goes down
This is a centralized system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is Peer-to-Peer architecture?

A

All nodes act as both clients and servers and they communicate with eachother directly.
Provides a more decentralized
and fault-tolerant system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a centralized system?

A

One where all nodes in the network communicate with a central coordinator node. E.g. Client-Server architecture.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a decentralized system?

A

One where all nodes communicate directly with eachother with no central node.
E.g. Peer-to-Peer architecture, Blockchain

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is Safety in concurrent programming?

A

The property that guarantees that no matter how the concurrent system processes thread interleave, certain aspects of the system will always hold.
e.g. If two owners of bank account attempt to withdraw funds simultaneously, there’s no danger of the system messing up and giving the user’s more than what the account has available.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is liveness in a concurrent system?

A

The property that guarantees that some desirable behavior will eventually happen.
e.g. all incoming requests to a web server should be eventually processed and responded no matter what.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is Dekker’s Algorithm?

A

A mutual exclusion algorithm.
It uses flags to signal that a thread wants to enter a critical section.
it also uses a turn variable to indicate whether it is the threads turn to enter the critical section.
There are two parts to Dekker’s algorithm:
Entry Section –> Processor sets its flag to true and waits until the other processor has its flag set to false, then the processor checks the turn variable to see if it can enter the critical section.
Exit Section –> Processor sets its flag to false when finished in the critical section and the turn variable is set to the other number

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is the difference between monitors and semaphores?

A

Monitors –> use High-level synchronization constructs to control access to shared resources in multi-thread systems.
Semaphores –> Integer variables used to control access to shared resources. (generally in binary)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is the Producer Consumer problem?

A

A synchronization problem where producers produce data items and put them into a buffer, while consumers remove said items from the buffer and process them. The problem arises when producers produce data items faster than the consumer can consume them, leading to buffer overflows and starvation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

How do you solve the producer consumer problem?

A

Use two semaphores: one to represent empty slots available in the buffer and another to represent filled slots in the buffer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Give an example of a data item in Java that is always thread-safe

A

Immutable objects

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

How do you ensure that a class is thread safe?

A

Synchronize critical sections of the code and use atomic / volatile variables

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What are stubs and skeletons in RMI (Remote method Invocation)?

A

Stubs –> Client-side proxy that represents a remote object in the local JVM, responsible for marshalling the method arguments and sending them to the remote object, then unmarshalling the method results.
Skeletons –> Server-side object that acts as a mediator between the remote object and the network.
In simple terms they both work together to enable communication between clients and servers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Define marshalling

A

the process of converting a method argument into a format that can be transmitted over a network.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Define IDLs (Interface Definition Languages)

A

They define a standard way of defining interfaces and data types which can be used accross different programming languages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Name some common services of distributed systems

A

Communication, Security, Resource Sharing, Scalability, Load Balancing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Define RPC (Remote Procedure Call)

A

Used in distributed computing, it allows a programme running on one computer to call a procedure located on another computer over a network.
Stubs and Skeletons hide the details of RPC in RMI.
Mainly used in client-server architectures

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Define Task Partitioning

A

Process of dividing a complex task into smaller sub-tasks to be executed in parallel. Two common types of this are Corse Grain Partitioning and Fine Grain Partitioning

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Define Corse Grain Partitioning

A

Involves dividing a large task into a few large sub-tasks which are executed in parallel by different processors. Easy to implement
E.g. a web app that processes different user requests

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Define Fine Grain Partitioning

A

Divides large tasks into many small sub-tasks which are executed in parallel by different processors.
Fast processing times.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is a decomposition strategy

A

Strategy used in computing to break down a task into smaller more manageable parts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What size buffer would you need to handle up to 7,000 transactions in a web app with a latency of 100ms

A

It has to be at least big enough to handle the 7,000 transactions with additional headroom for traffic bursts , network latency etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Amdahl’s Law?

A

Used in parallel computing, it describes the theoretical maximum speedup that can be achieved by using multiple processors to execute a programme.
Speedup = 1/[(1-P) + (P/N)] where P is the parallel portion of the code as a decimal and N is the number of processors.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What are the correctness properties that apply programmes that are not supposed to terminate\/

A

Safety and Liveness

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Define Latency

A

The time taken for a request to get sent to a server and the corresponding response to be received.
Minimizing latency usually involves using HPC and low-latency networking.

37
Q

Define Throughput

A

The measure of how many requests a server can handle at a given time.
Maximizing throughput usually involves using distributed computing.

38
Q

What are Spinlocks?

A

Low-level synchronization constructs used to control access to shared resources in multi-thread systems.
When the calling thread requests a spin lock that is already held by another thread, the other thread spins in a loop to check of the lock has become available.

39
Q

What are Spinlocks?

A

Low-level synchronization constructs used to control access to shared resources in multi-thread systems.
When the calling thread requests a spin lock that is already held by another thread, the other thread spins in a loop to check of the lock has become available.

40
Q

What are some advantages of Monitors over spinlocks?

A

Monitors are more efficient as they do not waste CPU cycles while waiting for a lock.
Easier to use.

41
Q

What happens when a thread enters a synchronized method?

A

It acquires the monitor associated with the object on which the method is called

42
Q

What does a synchronized block do?

A

It allows a thread to acquire and release a monitor explicitly.
(Allows for Fine-Grain control over the locking mechanism).

43
Q

What is a monitor invariant?

A

Defines the state the monitor is expected to maintain and any operations performed on the monitor should preserve this invariant.
E.g. for a monitor that models a bank account with a balance, the model invariant may be that the balance can never drop below 0.

44
Q

what does it mean when a monitor invariant is relaxed?

A

The monitor’s internal consistency check is temporarily suspended.
This is good for cases like the producer consumer scenario, where the producer may temporarily violate the monitor invariant by adding items to a full buffer.

45
Q

What are race conditions?

A

A situation that can occur when two or more threads access shared resources in an unsynchronized manner.

46
Q

What does the notify() synchronization primitive do?

A

Wakes up a thread that is waiting for an object’s monitor

47
Q

What does the wait() synchronization primitive do?

A

Causes the thread to wait until it’s notified by another thread using the object’s monitor with the notify() method.

48
Q

What is the Epidemic Algorithm?

A

A distributed computing algorithm that enables information to be disseminated and replicated accross a system of nodes in a decentralized and fault-tolerant manner.
Nodes in a network randomly exchange data with eachother, and this randomness ensures that all nodes in the network eventually receive the data even if there are network failures.
The “source node” is the first node to start the exchanges and every node that receives the data becomes a “carrier”

49
Q

Name two types of epidemic algorithms

A

Push model and the Pull model

50
Q

Define a push model

A

A type of epidemic propagation model, a replica sends updates to all other replicas it encounters, whether they need them or not
Not very efficient

51
Q

Define a pull model

A

A type of epidemic propagation model, a replica requests updates from other replicas when they need it.
More efficient than a push model
e.g. Gitlab

52
Q

What is location transparency in distributed computing?

A

The ability to access files regardless of a machines physical location (User doesn’t need to know where the files are stored to access them).
E.g. DNS (Domain Name System), a distributed system that maps domain names to IP addresses. The client doesn’t need to know the location of the DNS server to resolve the domain name to an IP address.

53
Q

What is location independence in distributed computing?

A

Ability to move files around the system without disrupting the user’s ability to access them.
E.g. cloud services like Google Drive.

54
Q

What is a cache?

A

A temporary storage space to speed up access to frequently accessed data.
Reduces latency and network traffic

55
Q

What is the cache-consistency problem?

A

Challenge of ensuring that all copies of a file are consistent and up to date when changes are made to one copy.

56
Q

How do you solve the cache-consistency problem?

A

Write-through Policy
Invalidate Policy
Update Policy
Hybrid Policy

57
Q

Define the write-through policy

A

Any update to a file is written to all copies immediately

58
Q

Define the invalidate policy

A

When an update is made to a file, all copies are invalidated and the next operation will retrieve the updated file.

59
Q

Define the update policy

A

Changes to an updated file are propagated to other copies ASAP.

60
Q

Define the Hybrid Policy

A

A combination of the write-through policy, the invalidate policy and the update policy

61
Q

What is Peterson’s Algorithm?

A

Synchronization algorithm used to solve the critical section of a two processor system, similar to Dekker’s algorithm.

62
Q

What is the Bakery Algorithm?

A

Mutual Exclusion Algorithm designed for n processes.
Works by assigning a unique ticket number to each process, and allowing processes to enter the critical section based on their ticket number

63
Q

Is the Bakery Algorithm suitable for n processes?

A

It depends, this algorithm works well for a small number of threads (e.g. 5), but it’s efficiency worsens for a larger number of processes for two main reasons:
Bust-Waiting –> System resources get used while threads wait in the loop waiting for other processes to finish their work. this can lead to contention for shared resources. e.g. CPU time / Memory
Non-Scalable –> Doesn’t scale well as the number of processes increases. (0(n^2) complexity).

64
Q

What are lock objects in the Java.util.concurrent package?

A

Mechanisms for controlling access to shared resources in multi-thread environments.

65
Q

Give an example of a lock object from the java.util.concurrent package

A

ReentrantLock lock–> Allows a thread to acquire the same lock multiple of times without deadlocking itself.
Used in situations where more control over locking and unlocking is required as it provides greater concurrency than synchronized blocks of code.

Lock.tryLock() –> returns a boolean value indicating whether the current thread has acquired the lock or not

66
Q

What is the Distributed Token Ring Algorithm

A

Mutual Exclusion Algorithm where processes are organised in a logical ring. A token is then passed around the ring and whichever process holds the token can access the shared resource / critical section.

67
Q

What are Clock Synchronization Algorithms?

A

Used to synchronize the clocks of multiple nodes in a distributed system

68
Q

What is the clock of a node in a distributed system?

A

Used to order events and messages that occur within a node, as well as to synchronize the clocks of other nodes in the system

69
Q

What are clock synchronization algorithms?

A

Used to synchronize the clocks of multiple nodes in a distributed system.

70
Q

What is Lamport’s Algorithm?

A

A clock synchronization algorithm, it synchronizes the clocks of nodes in a system.
Works by adding a timestamp to each message sent between processes, based on the sender’s logical clock value, and the receiving node updates its own logical clock based on this timestamp and then processes the message.
Ensures that messages are partially ordered.

71
Q

What is the main problem with Lamport’s algorithm?

A

It assumes that messages are sent in a timely and reliable manner, which isn’t always the case.

72
Q

What is the Bully Algorithm?

A

Used to elect a coordinator in a decentralized distributed system.
It assumes that each process has a unique ID / Priority and the thread with the highest ID is elected as the coordinator.
Does not scale well when used in large scale systems.

73
Q

What is a Java Artifact?

A

A deployable unit of code that is packaged and used for distribution and use.
e.g. Java.util.concurrent

74
Q

What are the three approaches to Java Artifact design?

A

Contract-First approach –> The web service description language (WSDL) contract is first created, and the java artifact is generated from this.
Code-First approach –> Java code is written first, and then the WSDL contract i generated from this java code.
Hybrid Approach –> Java code and WSDL contract are developed simultaneously

75
Q

What is WSDL (web service description language)?

A

XML-based language used to describe the functionality offered by a web service.
Typically consists of parts:
Types
Messages
Port Type
Binding

76
Q

What is Strong Scalability?

A

Ability of the programme to execute a fixed task quicker as more processors work on it.
E.g. If a task takes 10 minutes to complete on one processor but 5 minutes on another processor, then the programme has strong scalability

77
Q

What is Weak Scalability?

A

A programme has weak scalability if when the size of the task increases proportionally to the number of processors working on it, the execution time should remain roughly the same. E.g. if a task running on one processor takes 10 minutes to execute, but the same task that is twice as big also takes roughly 10 minutes to execute on two processors, then the programme has weak scalability.

78
Q

What are 3 major components of a Java application that uses RMI?

A

Remote Interface –> Defines the remote methods that can be called by the clients of the distributed system to access and modify shared resources.
Remote Object –> Object that implements the remote interface and provides the implementation of the methods declared in the interface. Responsible for receiving method calls from the client and invoking the method
Stubs and Skeletons –> respsonsibe for marshalling and unmarshalling remote method arguments and sending them to the remote object over the network

79
Q

What are 3 major components of a Java application that uses RMI?

A

Remote Interface –> Defines the remote methods that can be called by the clients of the distributed system to access and modify shared resources.
Remote Object –> Object that implements the remote interface and provides the implementation of the methods declared in the interface. Responsible for receiving method calls from the client and invoking the method
Stubs and Skeletons –> respsonsibe for marshalling and unmarshalling remote method arguments and sending them to the remote object over the network

80
Q

What is an inceptor for general remote object invocation?

A

Objects that intercept and process incoming and outgoing remote method calls to a remote object.
Necessary to perform tasks like authentication and authorization, performance monitoring and error handling

81
Q

How do Inceptors work in remote object invocation?

A

An inceptor gets registered to a remote object, and if a remote method is called on this object the RMI system calls the inceptor invoke method which allows the inceptor to perform tasks like authentication and authorization checks before the remote method is actually invoked on the object.

82
Q

Name the key differences between a centralized system and a token ring system

A

A centralized system requires fewer messages to be exchanged as all nodes only have to deal with the central server, while in a token ring system all nodes must communicate with eachother as the token gets passed around.
Also, the centralized system is more susceptible to failure than a token ring system due to the single point of potential failure which is the central server.

83
Q

What is MPI (Message Passing Interface)?

A

Used in inter-process communication (IPC) and message passing in parallel computing.
Provides a standard API for developing parallel applications that run on many different architectures.
Advantages:
Portable, scalable

84
Q

What is OpenMP (Open multi-processing)?

A

An API for shared memory multiprocessing in C, it enables developers to write parallel programmes that can run on multi-core processors and shared memory systems.
Advantages:
Easy to use, scalable

85
Q

What are advantages of using both MPI (Message Passing Interface) and OpenMP (open multi-processing)

A

Using both can be advantageous when dealing with apps that require parallelism at both the shared memory and distributed memory levels

86
Q

Describe the Apache Web Server

A

Open-source web server software which serves web pages over the internet.
It receives incoming HTTP requests from the client web browser and passes them to the appropriate handler, the dynamic content generator generates content based on the user’s request.

87
Q

How do you implement monitors in C?

A

Use C’s synchronization primitives mutexes and condition variables

88
Q

What is a Mutex

A

Mutex (standing for mutual exclusion) is a synchronization primitive used in c to allow for mutual exclusion when threads access a shared resource.
Generally used alongside a condition variable.
Represented as pthread_mutex_t
Process must acquire a mutex using pthread_mutex_lock before accessing a shared resource, and then must release it using pthread_mutex_unlock

89
Q

What is a condition variable?

A

a synchronization primitive used in c to allow for mutual exclusion when threads access a shared resource.
Generally used alongside a mutex.
represented as pthread_cond_t
Thread waits for a certain condition to be true using pthread_cond_wait which blocks the thread until it is signaled by another thread that the shared resource is no longer being modified by using pthread_cond_signal.