What are the key roles of an operating system?
To hide the complexity of the underlying hardware from applications (abstraction), manage the state of that hardware according to predefined policies (arbitrate), and ensure that those applications are isolated and protected from one another.
What is an OS abstraction? Give examples.
OS abstractions are techniques used to hide the underlying details of hardware by providing an API that is simpler to reason about than reasoning about the underlying hardware directly. Examples: process, thread, file, socket, memory page
What is an OS mechanism? Give examples.
Mechanisms are functionality provided by the operating system to do certain things with the hardware that can be used to implement policies. A mechanism is how something can be done. Examples: create, schedule, allocate
What are OS Policies? Give examples.
Policies are the behavior defined for the operating system to have that can be implemented using the operating system’s provided mechanism. A policy is what/which should be done. Examples: least recently used (LRU), earliest deadline first (EDF)
What does the principle of separation of mechanism and policy mean?
This entails a separation between how we specify what should be done and how it is done. This separation is important for flexibility, because policies are likely to change. Mechanisms insensitive to policy changes are more desirable because they allow for policy to change more easily.
What does the principle of the “optimize for the common case” mean?
Operating systems are likely to be used for a specific use case. By knowing more information about this use case, one can design the system in such a way that this use case performs well. There is more benefit in trying to make this use case run as smoothly as possible than trying to write a system that is not aware of this use case and does not take advantage of its details. There is an opportunity to improve performance by knowing how the system will be used and selecting policies accordingly.
What happens during a user-kernel mode crossing?
This crossing switches the CPU from unprivileged user mode to privileged kernel mode. It happens when a user program makes a system call, when the hardware raises an interrupt, or when the program attempts to execute a privileged instruction, which causes a trap.
What are some of the reasons why user-kernel mode crossing happens?
Because privileged instructions are only allowed to the OS. A user process does not know how the hardware resources are managed, and if it could directly access and modify the hardware state, it would most likely affect other processes in bad ways. For this reason such attempts are forbidden, and the process must go through the OS, which does the work on its behalf. This is done via system calls, and system calls require crossing the user-kernel boundary.
What is a kernel trap?
A kernel trap is when a user application is interrupted, and control is switched to the kernel at a specific location.
Why does a kernel trap happen?
A trap is caused by a user program’s attempt to perform a privileged instruction.
What are the steps that take place during a kernel trap?
During a kernel trap, control switches to the OS at a fixed entry point. The OS checks why the trap occurred and determines whether to perform the operation that caused the trap or to terminate the process.
What is a system call?
A system call is an interface provided by the OS for executing privileged operations on behalf of a user program.
How does a system call happen?
To make a system call, an application must write its arguments to a well-defined location and use the specific system call number to make the call.
What are the steps that take place during a system call?
During a system call, the kernel executes a privileged operation on behalf of the user. System calls are expensive because user application data may be evicted from the cache and replaced by kernel data; i.e., the application's cache state is disturbed each time the boundary is crossed.
What are the pros and cons of a monolithic OS design?
Pros: a single codebase allows for compile-time optimizations. Cons: poor customization, portability, and manageability; large memory footprint; performance can suffer.
What are the pros and cons of a modular OS design?
Pros: more maintainable than a monolith, smaller footprint, less resource intensive, easier customization. Cons: indirection can impact performance; maintenance can still be an issue.
What are the pros and cons of a microkernel OS design?
Pros: small, lower overhead, better performance, easy to verify and test Cons: low portability, specialized, leads to software complexity, frequent user/kernel crossing
Describe the distinctions between a process and a thread.
A process is represented by its address space (i.e. code/data/files). The process is also represented by its execution context (i.e. registers/stack) that contains information about the values of the registers, the stack pointer, program counter, etc. A thread is very much like a separate process, except for one difference: threads of the same process share the same address space and thus can access the same data.
Compare a process context switch to a thread context switch
Because threads share an address space, the context switch among them happens faster than processes. So, processes take longer to context switch. Both threads and processes have their execution context described with stack and registers. Because threads share the virtual address space, it is more likely that when multiple threads execute concurrently, the data that's needed by one thread is already in the cache, brought in by another thread. So, they typically result in hotter caches.
What states are there in the lifetime of a process?
Running, Ready, Blocked (Waiting); also New and Terminated
Describe the process state: Running
In the running state, a process is running on a processor. This means it is executing instructions. Being moved from ready to running means the process has been scheduled. Being moved from running to ready means the process has been descheduled.
Describe the process state: Ready
In the ready state, a process is ready to run but for some reason the OS has chosen not to run it at this given moment.
Describe the process state: Blocked (or Waiting)
In the blocked state, a process has performed some kind of operation that makes it not ready to run until some other event takes place. A common example: when a process initiates an I/O request to a disk, it becomes blocked and thus some other process can use the processor.
Describe the process state: New
Process is created. OS performs admission control. If it is determined to be OK, OS allocates and initializes PCB.
Describe the process state: Terminated
The process has completed its work or ends up with an error.
Draw the process lifecycle
Describe the lifetime of a thread
Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.
Once a process has become blocked (e.g., by initiating an I/O operation), the OS will keep it as such until some event occurs (e.g., I/O completion); at that point, the process moves to the ready state again (and potentially immediately to running again, if the OS so decides).
What are the pros and cons of message-based vs. shared-memory-based IPC?
Message-based IPC:
Pros: OS-managed; communication uses the same well-defined APIs and system calls.
Cons: overhead, since all information passed must be copied between user and kernel space.
Shared-memory IPC:
Pros: the OS is not in the path of communication after setup, so there is no per-message OS overhead.
Cons: no fixed, well-defined APIs; more error-prone.
When is it useful to add threads?
As long as the idle time is greater than twice the time to context switch (T_idle > 2 * t_ctx_switch), it makes sense to context switch to another thread and hide the idling time.
When does adding threads lead to pure overhead?
If you have significantly more threads ready to run than there are processors, you will usually find that your performance degrades. This is partly because most thread schedulers are quite slow at making general re-scheduling decisions. If there is a processor idle waiting for your thread, the scheduler can probably get it there quite quickly. But if your thread has to be put on a queue, and later swapped into a processor in place of some other thread, it will be more expensive. A second effect is that if you have lots of threads running they are more likely to conflict over mutexes or over the resources managed by your condition variables.
What are the benefits of multithreading?
By multithreading the OS’s kernel we allow the OS to support multiple execution contexts, and this is particularly useful when there are in fact multiple CPUs, so that the OS context can execute concurrently on different CPUs in a multiprocessor/multicore platform.
Describe the boss-worker multithreading pattern
The boss assigns work to the workers. Each worker performs the entire task. The boss and the workers communicate via shared queue.
If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model?
Throughput of the system is limited by the boss thread. We can use a queue to increase throughput by making the boss add tasks to the queue and the workers retrieve task from the queue which results in lower time per task that the boss needs.
What are the limiting factors in improving performance in the boss-worker pattern?
A negative of this approach is that it ignores locality. The boss doesn’t keep track of what each worker is doing. If we have a situation where a worker just completed a similar type of task or identical type of task, it is more likely that the same worker will be more efficient in performing the exact same task in the future. Or it may be that it already has a tool that is required to build that particular type of toy nearby on its desk. But if the boss does not know that the workers are doing, it has no way to make these kind of optimizations.
Describe the pipeline multithreading pattern
Each thread does some work and passes the partial result to another thread.
If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model?
The ideal is that the stages are equal: this will provide maximum throughput, by utilizing all your processors fully. Achieving this ideal requires hand tuning, and re-tuning as the program changes.
What are the limiting factors in improving performance with the pipeline pattern?
- hand tuning is needed for determining the right number of pipeline stages
- hardwired limit in the degree of parallelization
- overall throughput is limited by slowest stage
What are mutexes?
The simplest way that threads interact is through access to shared memory. In a high-level language, this is usually expressed as access to global variables. Since threads are running in parallel, the programmer must explicitly arrange to avoid errors arising when more than one thread is accessing the shared variables. The simplest tool for doing this is a primitive that offers mutual exclusion (sometimes called critical sections), specifying for a particular region of code that only one thread can execute there at any time.
What are condition variables?
You can view a mutex as a simple kind of resource scheduling mechanism. The resource being scheduled is the shared memory accessed inside the LOCK clause, and the scheduling policy is one thread at a time. But often the programmer needs to express more complicated scheduling policies. This requires use of a mechanism that allows a thread to block until some event happens. This is achieved with a condition variable.
A condition variable is always associated with a particular mutex, and with the data protected by that mutex.
Can you quickly write the steps/code for entering/exiting a critical section for problems such as reader/writer, or reader/writer with selective priority (e.g., reader priority vs. writer priority)?
What are spurious wake-ups, how do you avoid them, and can you always avoid them?
If you keep your use of condition variables very simple, you might introduce the possibility of awakening threads that cannot make useful progress. This can happen if you use “Broadcast” when “Signal” would be sufficient, or if you have threads waiting on a single condition variable for multiple different reasons.
Do you understand the need for using a while() loop for the predicate check in the critical section entry code examples in the lessons?
if() checks the predicate only once, before the wait; while() re-checks it every time the thread wakes up. This is necessary because, between the signal and the moment the awakened thread reacquires the mutex, another thread may have changed the condition, and because spurious wake-ups can occur.
What's a simple way to prevent deadlocks?
Probably the most practical prevention technique (and certainly one that is frequently employed) is to write your locking code such that you never induce a circular wait. The most straightforward way to do that is to provide a total ordering on lock acquisition. For example, if there are only two locks in the system (L1 and L2), you can prevent deadlock by always acquiring L1 before L2.
Explain the relationship between kernel-level threads and user-level threads
To have a multithreaded OS kernel, it must maintain:
- thread abstraction (data structure to rep. threads)
- scheduling, sync
To support threads at the user level, it must have:
- a user-level library that is linked with the application
- The library supports a data structures, scheduling, synchronization and other mechs that's needed to make resource mgmt decisions for the threads
User-level threads can be mapped to underlying kernel-level threads:
To the user level library, kernel level threads look a lot like virtual CPUs
The user threading library keeps track of all the threads that make up a single process, and maintains the relationship between those threads and the PCB that represents the process's address space. For each process, we need to keep track of which kernel-level threads execute on its behalf, and for each kernel-level thread, we need to know the address space within which that thread executes. If the system has multiple CPUs, we also need a data structure to represent each CPU and maintain its relationship with the kernel-level threads.
Describe thread related data structures for a single CPU
- Virtual Address mapping
User Level library data structure for user level thread (ULT):
- UL thread ID
- UL registers
- thread stack
Kernel level thread data structure (KLT):
- register pointer
What are thread interrupts?
- events generated externally by components other than the CPU (I/O devices, timers, other CPUs)
- determined based on the physical platform
- appear asynchronously
- compared to a "snowstorm warning"
What are thread signals?
- events triggered by the CPU & software running on it
- determined based on the operating system
- appear synchronously or asynchronously
- compared to a low battery warning
What is the current threading model in the Linux OS?
Linux uses a one-to-one (1:1) model: each user-level thread created through the Native POSIX Thread Library (NPTL) is backed by its own kernel-level task.
Compare a Hard and Light Process State
Hard process state:
- information relevant for all of the user level threads that execute within the process
- e.g. the virtual address mapping
Light process state:
- information relevant for a subset of user-level threads that are currently associated with a particular kernel level thread
Discuss the rationale for multiple data structures in kernel/user level threads
Describe User Level structures in Solaris 2.0
- On thread creation, the library returns a thread id, which is not a direct pointer to the actual thread data structure
- it's an index in a table of pointers
- table pointers point to per thread data structure
- The thread data structure contains a number of fields, execution context, registers, signal mask priority, stack pointer, thread local storage, stack
- The size of the data structure is known up front at compile time, so these thread data structures can be created and laid out contiguously. This helps achieve locality and makes it easier for the scheduler to find the next thread: it just multiplies the thread index by the size of the data structure.
What is the benefit of a thread id NOT to be a pointer to a thread data structure?
If there were a problem with the thread, if the thread ID were a pointer, then the pointer would point to some corrupt memory. By having a thread ID index into a table entry, we can encode some info into the table entry that can provide meaningful feedback or an error message
What is thread local storage?
This includes the variables that are defined in the thread functions that are known at compile time, so the compiler can allocate private storage on a per-thread basis for each of them
What is a problem with stack growth in user level thread data structures?
- stack growth can be dangerous
- the thread library doesn't really control the stack growth, and the OS itself doesn't know that there are multiple user level threads.
- it is possible that, as one thread's stack grows, it overwrites the data structure of another thread. If that happens, the error is only detected when the overwritten thread runs, and tracking down the offending thread is difficult.
What is a solution to threads writing over other threads' stack information in a user-level data structure?
Create a red zone
A red zone separates information about different threads.
It is a portion of the virtual address space that is not allocated. If a thread is running and its stack is increasing, if it tries to write to an address that falls into the red zone region, the OS will cause a fault.
This makes debugging much easier, because the fault is directly caused by the thread that was executing at the time.
Name the kernel level data structures
- Lightweight Process (LWP)
- Kernel Level Threads
Describe a kernel-level process data structure
- list of kernel-level threads
- virtual address space
- user credentials
- signal handlers
Describe a Light-Weight Process (LWP)
- user level registers
- system call args
- resource usage info
- signal mask
- similar to ULT, but visible to kernel
- not needed when the process is not running
Describe a Kernel-level Thread data structure
- kernel-level registers
- stack pointer
- scheduling info
- pointers to associated LWP, processes, CPU structures
- information needed even when process is not running => NOT SWAPPABLE
Describe a thread CPU data structure
- current thread
- list of kernel-level threads
- dispatching & interrupt handling information
Describe the pluses and minuses of user/kernel thread structures
Kernel-level threads:
+ the OS can see and understand threads, synchronization, scheduling, etc.
- the user must go through the kernel for all thread operations, which is expensive; not portable; policies may be limited to what the kernel supports
User-level threads:
+ portable; do not depend on OS limits and policies
- if the backing kernel thread blocks, the entire process is blocked; the OS has no insight into the process
Hybrid (many-to-many):
+ can be the best of both worlds: a process can have one or multiple kernel-level threads, and user threads can be unbound or bound
- requires coordination between the kernel- and user-level thread managers
True or False:
The user-level library does not know what is happening in the kernel
True: without explicit coordination, the kernel can preempt or block a kernel-level thread without the user-level library's knowledge.
True or False:
The kernel knows what is happening in the user-level library
False: the kernel has no insight into the user-level library's scheduling decisions, mutexes, or which user-level threads are runnable.
How did the Solaris implementation correct for visibility issues between the user-level libraries and kernel
They introduced system calls and special signals to allow the kernel and the ULT library to interact and coordinate.
In the pthreads library, which function sets the concurrency level and what value instructs the implementation to manage the concurrency level as it deems appropriate?
pthread_setconcurrency(); passing a value of 0 instructs the implementation to manage the concurrency level as it deems appropriate.
What does the user-level library see?
- user level threads
- available kernel level threads
What does the kernel see?
- kernel level threads
- kernel level scheduler
What is a "bound thread" and what is its difference with "thread pinning"?
A bound thread is when a user level library requests that one of its user level threads be bound to a kernel-level thread.
If a kernel-level thread is to be permanently associated with a CPU in a multi-CPU system, that thread is "pinned".
When does the UL library scheduler run?
Process jumps to UL library scheduler when:
- ULTs explicitly yield
- timer set by UL library expires
- ULTs call library functions like lock/unlock
- blocked threads become runnable
It also runs on ULT operations, and on signals from a timer or directly from the kernel.
How can one CPU communicate to another CPU to stop running a low-priority thread in favor of a higher-priority thread?
Send a signal to the thread running on the other CPU so that it runs library code locally; that code determines that the low-priority thread must be stopped and the high-priority thread run instead.
If there is a thread T1 holding a mutex on CPU1 and thread, T4 wanting the mutex running on CPU2, should the OS put T4 in the mutex queue?
If the critical section is short, it would be better to spin for a few cycles on the CPU and wait for T1 to release the mutex. If it takes less time for T1 to release the mutex, we're better off spinning than taking the thread and context switching it.
For long critical sections, use the default blocking behavior.
What is a one-shot signal?
- "n signals pending == 1 signal pending" : at least once
- must be explicitly re-enabled
What is a real-time signal?
Real-time signals queue rather than coalesce: if n signals are raised, the handler is called n times.