Part 2 Flashcards

1
Q

Process vs. thread, describe the distinctions. What happens on a process vs. thread context switch?

A

In the toy shop analogy: a process is like an order of toys, and a thread is like a worker in the shop.
A process is an active entity that represents the state of a program while it executes, with its own virtual address space. A thread is an execution context within a process: threads of the same process share the address space, but each has its own stack and registers, so several similar threads can run tasks simultaneously.

During a process context switch, the CPU switches to a different virtual address space, so the virtual-to-physical mappings of the old process (e.g., its TLB entries) become useless and must be rebuilt, which is costly. Since threads share an address space, this remapping is unnecessary, making a thread context switch faster overall.

Rule of thumb: if t_idle > 2 * t_ctx_switch -> context switch to hide the idle time.
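For example (values illustrative): if a context switch costs t_ctx_switch = 10 us, switching only pays off when the thread would otherwise idle for more than 2 * 10 = 20 us.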

2
Q

Describe the states in the lifetime of a process.

A

1) New (being created; the OS allocates and initializes its PCB)
2) Ready (waiting to be scheduled on the CPU)
3) Running (executing on the CPU)
4) Waiting (blocked on I/O or an event)
5) Terminated (finished or killed)

3
Q

Describe the lifetime of a thread.

A

A thread is first created with fork(proc, args) (Birrell's notation, not the UNIX fork() system call) and then executes the given function proc. When it finishes, it returns its result to the parent thread, which retrieves it and frees the child's resources by calling join(thread).
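
A minimal runnable sketch with POSIX threads (the worker function and its argument are illustrative); pthread_create plays the role of fork and pthread_join of join:

#include <pthread.h>
#include <stdio.h>

void *worker(void *arg) {                   /* the function the thread executes */
    int id = *(int *)arg;
    printf("thread %d running\n", id);
    return NULL;                            /* return value is picked up by join */
}

int main(void) {
    pthread_t t;
    int id = 1;
    pthread_create(&t, NULL, worker, &id);  /* "fork": create the thread */
    pthread_join(t, NULL);                  /* "join": wait for it to finish */
    return 0;
}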

4
Q

Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.

A

A process enters the waiting state when it initiates an I/O request or starts waiting on an event. Once the I/O or event completes, the process moves to the ready state. When the scheduler dispatches it, the process transitions to the running state and executes on the CPU.

5
Q

What are the pros-and-cons of message-based vs. shared-memory-based IPC?

A

Message-passing:
+ Managed by the operating system, which provides the channel and a well-defined API
- Significant overhead: every message is copied between user space and kernel buffers

Shared-memory IPC:
+ The OS is out of the way: it sets up the shared region once, after which no kernel-mediated copies are needed
- Because the OS is out of the way, the processes must implement their own communication protocol and synchronization, which is more error-prone
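
A minimal sketch of setting up the shared region with POSIX APIs (the segment name "/demo_shm" and size are illustrative; error handling omitted; link with -lrt on older Linux):

#include <fcntl.h>      /* O_CREAT, O_RDWR */
#include <string.h>
#include <sys/mman.h>   /* shm_open, mmap */
#include <unistd.h>     /* ftruncate */

int main(void) {
    /* the OS is involved once, to create and map the segment */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);
    char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* from here on the OS is out of the way: any process that maps
       "/demo_shm" sees these bytes, and synchronizing access is up to us */
    strcpy(buf, "hello from shared memory");

    munmap(buf, 4096);
    shm_unlink("/demo_shm");   /* remove the segment name when done */
    return 0;
}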

6
Q

What are the benefits of multithreading?
When is it useful to add more threads, when does adding threads lead to pure overhead?
What are the possible sources of overhead associated with multithreading?

A
  • Parallelization -> speed up tasks by running them on multiple CPUs at once
  • Specialization -> each thread works on a smaller, hotter portion of the state (makes use of a hot cache!)
  • Efficiency -> lower memory requirements and cheaper IPC than multiple processes

Adding threads is useful when it removes a bottleneck: when threads block on I/O (others can use the CPU meanwhile) or when idle CPUs are available to run them. It becomes pure overhead when the work is CPU-bound on already-busy cores, or when the threads mostly contend for a shared variable or resource.

Sources of overhead: keeping track of mutexes and condition variables, the synchronization itself (e.g., a single mutex serializing access to multiple independent resources), and wake-up mistakes such as using signal when you should broadcast.

7
Q

Describe the boss-worker multithreading pattern.
If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model?
What are the limiting factors in improving performance with this pattern?

A

In the boss-worker pattern, a boss thread assigns jobs/tasks to worker threads, typically through a shared queue. Throughput is limited by the boss: throughput = 1 / boss_time_per_order.

To improve performance, one could create workers on demand, but that is inefficient; a better approach is a pool of workers whose size can adapt dynamically. A variant creates specialized workers for certain task types, which improves locality but complicates load balancing. A sketch of the basic pattern follows.

Depending on the variant, the limiting factors are the overhead of managing the worker threads, keeping the shared queue synchronized, and determining the right number of workers to maintain efficiency.
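
A minimal C/pthreads sketch (queue size, job values, and the -1 shutdown sentinel are illustrative choices): the boss only enqueues, workers dequeue and process, so throughput is bounded by how fast the boss can hand out work.

#include <pthread.h>
#include <stdio.h>

#define NUM_WORKERS 4
#define QUEUE_SIZE  16

/* bounded work queue shared by the boss and the workers */
static int queue[QUEUE_SIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t m         = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

static void enqueue(int job) {                /* boss side */
    pthread_mutex_lock(&m);
    while (count == QUEUE_SIZE)
        pthread_cond_wait(&not_full, &m);
    queue[tail] = job;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&m);
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&not_empty, &m);
        int job = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&m);
        if (job < 0) break;                   /* sentinel: time to shut down */
        printf("processing job %d\n", job);   /* real work happens outside the lock */
    }
    return NULL;
}

int main(void) {
    pthread_t workers[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, NULL);

    for (int job = 0; job < 100; job++)       /* the boss only enqueues */
        enqueue(job);
    for (int i = 0; i < NUM_WORKERS; i++)
        enqueue(-1);                          /* one sentinel per worker */

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i], NULL);
    return 0;
}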

8
Q

Describe the pipelined multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model? What are the limiting factors in improving performance with this pattern?

A

In the pipeline pattern, the overall task is divided into stages, and each thread (or group of threads) is assigned one subtask/stage.
To improve performance, put a pool of threads on the weakest link, i.e., the slowest stage, since that stage determines the throughput of the whole pipeline (see the worked example below).
The limiting factors for development and performance are the balancing and synchronization overheads: work must be handed between stages (typically via shared queues), and the stages must be kept evenly loaded.
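
A worked example of why the slowest stage dominates (stage times illustrative). With a full pipeline:

time to finish n orders = time for the first order + (n - 1) * time of the slowest stage

With three stages of 10 ms, 10 ms, and 20 ms, 11 orders take (10 + 10 + 20) + 10 * 20 = 240 ms.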

9
Q

What are mutexes and condition variables?

A

A mutex allows only one thread at a time to access (to have exclusive access to) a piece of shared data or state.
A condition variable lets threads wait until a specific condition on that shared state has been met and has been signaled/broadcast by another thread.
Together, these two constructs form the entry and exit code around a thread's critical section. A sketch of the canonical pattern follows.
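
The canonical usage pattern as a C/pthreads sketch (the ready flag and the producer/consumer names are illustrative):

#include <pthread.h>

pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
int ready = 0;                        /* the shared state the predicate tests */

void consumer(void) {
    pthread_mutex_lock(&m);           /* mutex: exclusive access to shared state */
    while (!ready)                    /* recheck the predicate on every wake-up */
        pthread_cond_wait(&cv, &m);   /* atomically releases m while waiting */
    /* ... use the shared state ... */
    pthread_mutex_unlock(&m);
}

void producer(void) {
    pthread_mutex_lock(&m);
    ready = 1;                        /* make the condition true */
    pthread_cond_signal(&cv);         /* or pthread_cond_broadcast() for all waiters */
    pthread_mutex_unlock(&m);
}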

10
Q

Can you quickly write the steps/code for entering/exiting a critical section for problems such as reader/writer, reader/writer with selective priority (e.g., reader priority vs. writer priority)?

A

Swapping read_phase with write_phase (and vice versa) in the code below changes this to a writer-priority version.

// ----- READER -----
Lock(counter_mutex) {
    while (resource_counter == -1)        // a writer holds the resource
        wait(counter_mutex, read_phase);
    resource_counter++;                   // one more active reader
}  // unlock
// ... read ...
Lock(counter_mutex) {
    resource_counter--;
    if (resource_counter == 0)            // last reader out
        signal(write_phase);
}  // unlock

// ----- WRITER -----
Lock(counter_mutex) {
    while (resource_counter != 0)         // wait until no readers and no writer
        wait(counter_mutex, write_phase);
    resource_counter = -1;                // mark "writer active"
}  // unlock
// ... write ...
Lock(counter_mutex) {
    resource_counter = 0;
    broadcast(read_phase);                // all waiting readers may proceed
    signal(write_phase);                  // or one waiting writer
}  // unlock
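
The same scheme as runnable C with pthreads (a sketch; resource_counter == -1 means a writer is active, 0 means free, > 0 counts active readers):

#include <pthread.h>

int resource_counter = 0;
pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  read_phase    = PTHREAD_COND_INITIALIZER;
pthread_cond_t  write_phase   = PTHREAD_COND_INITIALIZER;

void reader_enter(void) {
    pthread_mutex_lock(&counter_mutex);
    while (resource_counter == -1)
        pthread_cond_wait(&read_phase, &counter_mutex);
    resource_counter++;
    pthread_mutex_unlock(&counter_mutex);
}

void reader_exit(void) {
    pthread_mutex_lock(&counter_mutex);
    if (--resource_counter == 0)
        pthread_cond_signal(&write_phase);    /* last reader lets a writer in */
    pthread_mutex_unlock(&counter_mutex);
}

void writer_enter(void) {
    pthread_mutex_lock(&counter_mutex);
    while (resource_counter != 0)
        pthread_cond_wait(&write_phase, &counter_mutex);
    resource_counter = -1;
    pthread_mutex_unlock(&counter_mutex);
}

void writer_exit(void) {
    pthread_mutex_lock(&counter_mutex);
    resource_counter = 0;
    pthread_cond_broadcast(&read_phase);      /* all waiting readers may proceed */
    pthread_cond_signal(&write_phase);
    pthread_mutex_unlock(&counter_mutex);
}
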
11
Q

What are spurious wake-ups, how do you avoid them, and can you always avoid them?

A

Spurious wake-ups happen when we wake a thread that cannot actually proceed, e.g., threads are signaled/broadcast while the mutex they need is still held, so they wake up only to block again on the mutex. This wastes cycles on needless context switches.
You can often avoid them by reordering operations: update the shared state while holding the mutex, but move the signal/broadcast outside the lock (after unlocking). However, this cannot always be done: if the decision to signal depends on state that must be read while holding the lock, moving the signal outside would affect the correctness of the code.
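
For example, in writer-exit code like card 10's (a sketch): broadcasting while still holding counter_mutex wakes readers that immediately block on the mutex; moving the wake-up after the unlock avoids this, and is safe here because the broadcast does not depend on state read under the lock.

/* before: woken readers immediately block trying to acquire counter_mutex */
pthread_mutex_lock(&counter_mutex);
resource_counter = 0;
pthread_cond_broadcast(&read_phase);
pthread_mutex_unlock(&counter_mutex);

/* after: update under the lock, wake waiters once the lock is free */
pthread_mutex_lock(&counter_mutex);
resource_counter = 0;
pthread_mutex_unlock(&counter_mutex);
pthread_cond_broadcast(&read_phase);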

12
Q

Why is using a while() loop for the predicate check in the critical section entry code so important?

A

Being signaled does not guarantee that the predicate still holds when the waiting thread finally re-acquires the mutex: between the signal and the re-acquisition, another thread may have taken the lock and changed the shared state. A while() loop re-checks the predicate on every wake-up, so a thread proceeds only when the condition truly holds (an if would let it proceed on stale state). The while loop also keeps the code correct when a broadcast wakes multiple threads that cannot all proceed.

13
Q

What’s a simple way to prevent deadlocks? Why?

A

A simple way to prevent deadlocks is to maintain a lock order: every thread acquires the mutexes it needs in the same global order. This makes a cycle in the wait-for graph impossible, so some thread can always make progress. A sketch follows.

-- Extra --
Deadlock prevention can be expensive to implement; sometimes a better method is deadlock detection and recovery: detect when a deadlock occurs and roll back as necessary. Another method is the Ostrich Algorithm: do nothing, and if a deadlock occurs, just reboot.
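
A minimal sketch (mutex names illustrative): as long as every thread takes m_a before m_b, the cycle "T1 holds m_a and wants m_b while T2 holds m_b and wants m_a" cannot arise.

#include <pthread.h>

pthread_mutex_t m_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t m_b = PTHREAD_MUTEX_INITIALIZER;

void update_both(void) {
    pthread_mutex_lock(&m_a);     /* everyone locks m_a first... */
    pthread_mutex_lock(&m_b);     /* ...then m_b, so no cycle can form */
    /* ... use both resources ... */
    pthread_mutex_unlock(&m_b);
    pthread_mutex_unlock(&m_a);
}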

14
Q

Can you explain the relationship among kernel vs. user-level threads?
Think through a general m:n scenario (as described in the Solaris papers), and the current Linux model. What happens during scheduling, synchronization and signaling in these cases?

A

For a user-level thread to execute, it must first be associated with a kernel-level thread; the OS scheduler then schedules that kernel-level thread onto a CPU.
There are three models for how user-level threads map onto kernel-level threads: one-to-one, many-to-one, and many-to-many.

In a general m:n scenario, user-level threads are multiplexed over some number of kernel-level threads, so the user-level library scheduler and the kernel scheduler each make decisions the other cannot see. Complications arise when the two levels must coordinate: synchronization can use adaptive mutexes, which choose between spinning and blocking so cycles are not wasted, and special signals and system calls let the kernel and the user-level library keep each other in the loop (e.g., that a thread is about to block, or that more kernel-level threads are needed). In the current Linux one-to-one model, every user-level thread has its own kernel-level thread, so the kernel sees all threads directly and this extra coordination is unnecessary.

15
Q

Can you explain why some of the mechanisms described in the Solaris papers (for configuring the degree of concurrency, for signaling, the use of LWPs…) are not used or necessary in the current threads model in Linux?

A

Because the Solaris model was many-to-many, special signals and system calls were needed to let the kernel and the user-level library interact and coordinate with one another, working around the fact that neither side could see the other's state and scheduling decisions.
In a one-to-one model the kernel sees every thread, so this visibility problem does not exist; as a result, those special signals and system calls are not needed in the current Linux threading model. Kernel-level thread resources have also become cheap enough that multiplexing user-level threads over a limited set of LWPs to configure the degree of concurrency is no longer necessary.

16
Q

What’s an interrupt? What’s a signal? What happens during interrupt or signal handling?

A

Interrupts are events generated externally to the CPU by other components of the physical platform (devices, timers); which interrupts exist depends on the platform, and they occur asynchronously.
Signals are events triggered by software, by the process itself or by the OS on its behalf; they are defined by the operating system and can be delivered either synchronously (e.g., SIGSEGV on a bad memory access) or asynchronously.
When an interrupt or signal arrives, execution jumps to the corresponding handler. For signals, the default actions are: terminate, ignore, terminate and core-dump, stop, or continue; a developer can install custom handlers to override most of these.

17
Q

How does the OS know what to execute in response to an interrupt or signal? Can each process configure its own signal handler? Can each thread have its own signal handler?

A

The OS looks up the handler in a table indexed by the interrupt or signal number: the interrupt handler table is set by the kernel for the platform, while each process has its own signal handler table. Whether the event is actually delivered also depends on the mask bits: interrupt masks on the CPU, and signal masks on each thread.
So yes, a process can configure its own signal handlers (e.g., via signal()/sigaction()) for most signals. Threads cannot have separate handlers, because the handler table is per process; what each thread does have is its own signal mask, so a thread can block signals it does not want, and the kernel delivers a signal to a thread whose mask allows it.
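
A sketch of both mechanisms in C (the handler name is illustrative): sigaction installs a per-process handler, and pthread_sigmask sets the calling thread's per-thread mask.

#include <pthread.h>
#include <signal.h>
#include <string.h>

static void on_sigusr1(int sig) {
    (void)sig;                 /* only async-signal-safe work belongs here */
}

int main(void) {
    /* the handler table is per process: all threads share this handler */
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigaction(SIGUSR1, &sa, NULL);

    /* masks are per thread: this thread blocks SIGUSR1, so the kernel
       will deliver it to some other thread whose mask allows it */
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    return 0;
}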

18
Q

What’s the potential issue if an interrupt or signal handler needs to lock a mutex? What’s the workaround described in the Solaris papers?

A

If the thread that was interrupted or signaled already holds the mutex the handler needs, we have a deadlock: the handler runs on the interrupted thread's stack, so the thread cannot resume and unlock the mutex until the handler routine finishes, yet the handler cannot finish without acquiring that mutex.
The workaround described in the Solaris papers is to control when handler code can run by using interrupt/signal masks: a thread blocks the relevant interrupt/signal before acquiring a mutex the handler might need, and re-enables it after unlocking.

19
Q

Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper.

A

A multiprocess webserver is simple to implement and develop; however, it has high memory usage (many processes running at a time), its context switches are expensive, and it is hard to maintain shared state (such as a cache) across processes.

A multithreaded webserver shares one address space, which makes shared state easier and context switching cheaper. However, the implementation is more complex, there is synchronization overhead, and kernel-level threading support from the OS is required (not a given when the Flash paper was written).

20
Q

What are the benefits of the event-based model described in the Flash paper over MT and MP? What are the limitations?

A

According to the Flash paper, the benefits of the event-based model are: a single address space and a single flow of control, smaller memory requirements, no context switching, and no synchronization.

The limitation is blocking: if a handler blocks (e.g., on disk I/O), the entire process blocks. That is why Flash's AMPED model adds helpers to perform blocking I/O operations asynchronously.

21
Q

Would you convert the AMPED model into an AMTED (async multi-threaded event-driven) one? How do you think an AMTED version of Flash would compare to the AMPED version of Flash?

A

No; converting to a multithreaded event-driven model would add overhead for little improvement over an already efficient model. AMPED shares a single address space, needs no synchronization in the event-driven core, and uses helpers to avoid blocking during I/O operations.

I think the added synchronization and the extra memory needed to support threads would cause an AMTED Flash to perform somewhat worse than AMPED for data set sizes greater than about 100 MB, where the workload becomes disk-bound.

22
Q

There are several sets of experimental results from the Flash paper discussed in the lesson. What was the purpose of each set of experiments (what was the question they wanted to answer)?
Why was the experiment structured in a particular way (why they chose the variables to be varied, the workload parameters, the measured metric…)?

A

To define useful inputs, the Flash paper used realistic workloads that capture the distribution of web page accesses over time. They took a trace-based approach: gather traces from real web servers and replay them, which makes the experiments both realistic and repeatable. Each set of experiments then varied the server architecture and the workload (from a cached single-file best case to large, disk-bound traces) and measured bandwidth and connection rate, so as to answer under which conditions each architecture (MP, MT, SPED, AMPED) performs best.

23
Q

If you ran your server from the class project for two different traces: (i) many requests for a single file, and (ii) many random requests across a very large pool of very large files, what do you think would happen as you add more threads to your server?

A

The approach in the project was a simple MT design with little to no optimization, so the graphs would most likely resemble Apache's results. For trace (i), the single file stays in cache, so adding threads helps until CPU and synchronization costs become the bottleneck. For trace (ii), the workload becomes disk-bound: bandwidth would drop significantly as the request count and file sizes grow, and additional threads would add memory pressure and contention rather than throughput.

24
Q

In the Solaris papers, name the four data structures used.

Name at least two of the fields kept in each structure.

A
  • Process
    1) List of kernel-level threads
    2) Virtual address space
    3) User credentials
    4) Signal handlers
  • LWP (lightweight process)
    1) User-level registers
    2) Resource usage info
    3) Signal mask
    4) System call arguments
  • Kernel-level thread
    1) Kernel-level registers
    2) Scheduling info
    3) Pointer to the associated LWP
    4) Stack pointer
  • CPU
    1) Current thread
    2) List of kernel-level threads
    3) Dispatching and interrupt handling info
25
Q

What are some synchronization-related issues?

A

- If critical sections are short, don't bother blocking: it is faster to spin for a few cycles. This is the idea behind adaptive mutexes, which spin while the lock holder is running and block otherwise (see the sketch below).
- Destroying and re-creating threads is costly; it is better to reuse them (e.g., via a thread pool).
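
glibc exposes this spin-then-block idea as a non-portable mutex type; a sketch (requires _GNU_SOURCE and is Linux/glibc-specific):

#define _GNU_SOURCE
#include <pthread.h>

pthread_mutex_t m;

void init_adaptive_mutex(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* spin briefly if the owner is running, then block */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
}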