0. Midterm Questions Flashcards
What is GWA’s special talent?
Haircut detection
Which of the following is a requirement of a critical section? [progress, concurrency, mutual inclusion, idleness]
Progress
Intra-process (within) communication is easier than interprocess (between) communication. True or false?
True.
Threads of a single process share an address space. Process address spaces are generally restricted to only that process, not others.
Which of the following requires communication with the operating system? [Switching between two threads. Inverting a matrix. Recursion. Creating a new process.]
Creating a new process
Which of the following is NOT an example of an operating system mechanism?
[A context switch;
Using timer interrupts to stop a running thread;
Maintaining the running, ready, and waiting queues;
Choosing a thread to run at random]
Choosing a thread to run at random (this is a policy).
The Rotating Staircase Deadline Scheduler is most similar to which other scheduling algorithm? [Lottery scheduling; Multi-level feedback queues; Round-Robin; Random]
Multi-level feedback queues
What would probably be stored in a page table entry? [the physical memory address; the virtual memory address; the process ID; the file name]
The physical memory address
Address translation allows the kernel to implement what abstraction?
[Files, Threads, Processes, Address spaces]
Address spaces
Con Kolivas was particularly interested in improving what aspect of Linux scheduling?
[Overhead; Throughput; Interactive performance; Awesomeness]
Interactive performance
Which is probably NOT a privileged operation?
[Changing the interrupt mask; Loading an entry into the TLB; Modifying the exception handlers; Adding two registers and placing the result in a third register]
Adding two registers and placing the result in a third register
Consider a 32-bit system with 4K pages that uses multi-level page tables as described in class:
- 10 bits for the first-level index.
- 10 bits for the second level index.
- a 12-bit offset
Assume that page table entries are 32-bits and so can be stored directly in the second-level table.
If a process has 10 contiguous code pages (including global variables), 4 contiguous heap pages, and a single 1 page stack, what is the minimum number of pages required for the process’s page table?
What is the maximum?
In both cases, briefly explain your answer.
Each first- or second-level table consumes one page, since it has 1024 (2^10) 4-byte entries. So at minimum, this process’s page table requires two pages: one for the top-level table and a second for a single second-level table, which suffices if the code, heap, and stack are close enough together (within 4MB) to all lie under the same second-level table. (This layout is not common.)
The maximum can be thought of as follows: one page for the top-level table plus three pages for the second-level tables, one covering the 10 contiguous code pages, one covering the 4 contiguous heap pages, and a third covering the single stack page. The true maximum is 6, though, since the code segment and the heap may each straddle a 4MB boundary and so spill onto two second-level tables apiece if they are laid out badly in the address space.
Identify one separation between OS mechanism and policy that we have discussed this semester. Describe the mechanism and one policy.
One example is the separation between the ability to stop and start threads (mechanism) and scheduling (policy). The mechanism is preemptive context switches using the timer to give the kernel a chance to run regularly and interrupt the running thread if needed. One policy is to schedule threads at random.
Another example is the separation between the virtual address translations performed by the TLB or MMU (mechanism) and the mapping between virtual and physical memory set by the kernel and running processes (policy). The mechanism is the TLB’s ability to cache translations and map virtual addresses to physical addresses. One policy is having a 2GB virtual address space where the code starts at 0x10000, the heap starts at 0x10000000 and grows upward, and the stack starts at 0x80000000 and grows down. But any plausible address space layout would earn you credit here.
Identify three system calls that allocate new virtual addresses (i.e., allow the process to use them) and provide a brief description of each.
- exec(): loads content from the ELF file and sets up virtual addresses, mainly ones that point to the process’s code area and any statically-declared variables
- fork(): copies the parent’s entire address space, including all the code pages, the entire heap, and one (or more) thread stacks, allowing the child to use all of these virtual addresses, which will (eventually) have to point to private physical addresses
- sbrk(): extends the “break point”, or the top of the process’s heap, allocating new virtual addresses that point to physical memory. Not usually called by processes directly, but instead by malloc libraries when their subpage allocator runs out of space.
- mmap(): asks the kernel to map a portion of virtual addresses to point to a region of a file
Given a simple MLFQ approach that rewards any thread that sleeps or yields before its quantum expires, first describe a way that a computationally-intensive thread can take advantage of a weakness in this approach to remain in one of the top queues. Second, propose a modification to MLFQ that addresses this problem.
The problem is that if a thread can determine how long its quantum is, even approximately, it can arrange to call yield right before its quantum expires. As a result, it has done almost as much computation as it would have normally, but it is not demoted to a lower-priority queue as it should be.
The fix is simple. Instead of looking at whether a thread blocks or yields before its quantum ends, consider the percentage of its quantum it has used before it sleeps or yields. Either establish a stronger threshold for not being demoted (can’t use more than 10% of the quantum), or use the percentage itself to make a more nuanced decision (if within 10% maintain, if between 10-50% demote one level, if over 50% demote two levels, or whatever).
Another way to fix this problem is simply to not reward threads that yield and only consider those that block. However, this isn’t quite sufficient, since a thread may be able to block on something that it knows will return immediately - like a read from /dev/random - or arrange to have a second thread within the same process release it. Looking at the overall quantum usage is a safer bet.
Long question: We have seen semaphores, spin and sleep locks, condition variables, and reader-writer locks. However, many other useful synchronization primitives exist.
First, describe one additional synchronization primitive. Provide a complete interface for it in C pseudo-code, and describe how to implement it.
Second, provide two different use cases for your new synchronization primitive. Feel free to use pseudo-code as well as English here as well.
See 2016 midterm solution sheet.
All of the following are critical section requirements EXCEPT
[mutual exclusion, concurrency, progress, performance]
Concurrency
All of the following are private to each process thread EXCEPT
[stack; file handles; registers]
File handles
What action does NOT require the kernel?
[Switching between two threads; Reading from a file; Creating a new process; Altering a virtual address mapping]
Switching between two threads
Is this true? Does the scheduler work independently of the kernel?
What part of the address space is NOT initialized using information from the ELF file?
[Code segment; The heap; Uninitialized static data; Initialized static data]
The heap
Which of the following is NOT a part of making a system call?
[Arranging the arguments in registers or on the stack; Loading the system call number into a register; Generating a software interrupt; Allocating memory in the heap using malloc]
Allocating memory in the heap using malloc
What information would probably not be stored in a page table entry?
[The location on disk; Read, write, or execute permissions; The process ID; The physical memory address]
The process ID
Paging using fixed-size pages can suffer from internal fragmentation (true or false)
True
One way to eliminate deadlock when acquiring multiple locks is to eliminate cycles. Describe another way to avoid deadlock and how to implement it.
There isn’t much you can do about protected access to shared resources, which requires waiting - either passive or active. In some cases adding resources to eliminate unnecessary sharing could be an option, and we’ll accept that. (Eliminating sleeping isn’t a solution, since replacing deadlocks with livelocks isn’t a great idea.) But the other two conditions are potential solutions:
- ADD RESOURCES, eliminating the need for sharing. As an example from the dining philosophers problem, adding chopsticks eliminates the need to share. We’ll accept this answer if you were quite specific about this not working in all circumstances, since in certain cases global resources are required for correctness.
- ALLOW RESOURCE PREEMPTION, creating a way for the operating system to forcibly take a resource from a thread that is holding it. One way to implement this is to detect a deadlock by identifying a circular dependency in the wait queue and then interrupt one of the threads holding the locks causing the cycle. This would require some way for the thread to register a handler each time it requests a lock that would be called if that lock was force dropped.
- DISALLOW MULTIPLE INDEPENDENT REQUESTS, preventing a thread from holding some resources while requesting others. You could implement this by checking, each time a thread requests a lock, that it doesn’t hold any other locks. Enforcing this without any new lock acquisition mechanisms would require programmers to change their code to introduce new locks covering these cases. Alternatively, you could create a new locking mechanism allowing multiple locks to be acquired simultaneously.
(See stoplight problem?)
Describe two changes to the OS virtual memory system that you would need or might want to make to accommodate 48-bit virtual and physical addresses. For each, describe how it would affect the computation or memory overhead of virtual to physical translation.
Answer options:
- Page table entry size. You clearly need to do something about your page table entries, since whatever you did to bitpack for 32-bit virtual addresses is going to change with 48-bit virtual addresses. Almost inevitably, you’re going to end up storing more state.
- Address space size. You don’t necessarily have to change the address space size, since you can simply fit more content from a 32-bit wide address space into a system with 48-bit virtual addresses. You may want to change the breakpoint between userspace and the kernel, however, since there is no point wasting any of the 4 GB wide virtual address space on the kernel at this point. Everything from 0x0 to 0xFFFFFFFF should be used for the process, with kernel addresses outside of that range. This doesn’t necessarily have any direct effect on the overheads, although it may affect the size of other data structures (PTEs or page tables).
- Larger pages. You may really want to increase the 4K page size at this point, with the concordant tradeoff between TLB faults and spatial locality. Larger pages may reduce the number of translations required.
- Different page table structures. It could be time to add a third level to the existing two-level page tables, particularly if you decide to expand the address space size. The goal would be to reduce the amount of space required for the page tables for common address space layout patterns.
- MMU changes. Clearly the MMU and TLB need to be modified to help map 48-bit virtual addresses to 48-bit physical addresses. No effect on the kernel overheads here, but any speculation about the effect on hardware (wider entries might simply mean fewer TLB entries) would be accepted.