Midterm Flashcards
Why can threads share the same program code and heap areas of memory, but not the stack area?
Each thread has its own flow of execution, with its own sequence of function calls, so a single stack could not track the call chains of multiple threads at once. Besides, a stack is accessed at its top (frames are pushed and popped as functions are called and return), so a single shared stack could only serve one thread's calls at a time. Each thread therefore gets its own stack, while code and heap have no per-thread state and can be shared.
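A minimal pthreads sketch of this layout (illustrative names, not course code): the global shared is one copy visible to both threads, while each thread's local lives in that thread's private stack.

    #include <pthread.h>
    #include <stdio.h>

    int shared = 0;                  /* data segment: one copy, reachable by all threads */

    void *worker(void *arg) {
        int local = *(int *)arg;     /* 'local' lives on this thread's private stack */
        shared += local;             /* shared is visible to both (unsynchronized here) */
        printf("local=%d shared=%d\n", local, shared);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        int a = 1, b = 2;
        pthread_create(&t1, NULL, worker, &a);
        pthread_create(&t2, NULL, worker, &b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }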
Define Atomicity.
The state of being indivisible. An atomic operation executes as a single, uninterruptible unit: it either happens entirely or not at all, and no other thread can observe it half-finished.
Define Critical section.
A sequence of operations that must execute atomically, with mutual exclusion, to avoid a race condition. A critical section is a piece of code that accesses a shared variable (or more generally, a shared resource) and must not be concurrently executed by more than one thread.
Define Race condition.
Two or more operations that should run sequentially are instead run at the same time. A race condition arises when multiple threads enter a critical section at roughly the same time; both attempt to update the shared data structure, leading to a surprising (and perhaps undesirable) outcome. More generally, a race condition occurs when the outcome of a program depends on the timing or interleaving of events. The classic example is two threads incrementing a shared counter, sketched below.
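A minimal sketch of that counter race (illustrative, not course code): counter++ compiles to a load, an add, and a store, and the interleaving of those steps across threads decides the result.

    #include <pthread.h>
    #include <stdio.h>

    volatile int counter = 0;

    void *increment(void *arg) {
        for (int i = 0; i < 1000000; i++)
            counter++;               /* load, add, store: not atomic -> race */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %d (expected 2000000)\n", counter);
        return 0;
    }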
Define Mutual exclusion.
Only one thread can access a certain resource at a time. Mutual exclusion guarantees that if one thread is executing within the critical section, the others will be prevented from doing so; a lock that provides this guarantee is commonly called a mutex.
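Protecting the counter from the previous card with a pthread mutex makes the increment effectively atomic (a sketch, assuming the same counter):

    #include <pthread.h>

    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    volatile int counter = 0;

    void *increment(void *arg) {
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&lock);    /* at most one thread past this point */
            counter++;                    /* the critical section */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }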
Define Deadlock.
Indefinite waiting for resources due to cyclic dependencies across threads. Deadlock requires four conditions to hold at once: mutual exclusion, hold-and-wait, no preemption, and circular wait (see the gridlock card below). A minimal two-lock example is sketched below.
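A minimal sketch of the classic two-lock deadlock (illustrative names): if the two threads interleave so that each acquires its first lock, neither can ever proceed.

    #include <pthread.h>

    pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

    void *one(void *arg) {
        pthread_mutex_lock(&A);    /* holds A (mutual exclusion)... */
        pthread_mutex_lock(&B);    /* ...while waiting for B (hold-and-wait) */
        pthread_mutex_unlock(&B);
        pthread_mutex_unlock(&A);
        return NULL;
    }

    void *two(void *arg) {
        pthread_mutex_lock(&B);    /* holds B... */
        pthread_mutex_lock(&A);    /* ...while waiting for A: circular wait,
                                      and nothing preempts the held locks */
        pthread_mutex_unlock(&A);
        pthread_mutex_unlock(&B);
        return NULL;
    }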
How can we implement locks in a way that is guaranteed to be correct, efficient, and fair? Do we need assistance from the hardware to do so? What about the operating system?
For a lock to be correct, we need to obtain/release the lock in an atomic way. To do that, we need assistance from the hardware in the form of special instructions (e.g., test-and-set). For it to be efficient, we need to avoid spinning (busy waiting). To do that, we need assistance from the operating system to keep threads that cannot obtain a lock ‘sleeping’ until the lock is available. The operating system can also keep track of the threads waiting for the same lock and enforce some policy (e.g., FIFO) to ensure fairness.
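A sketch of the correct-but-inefficient half of that answer, using GCC's __sync_lock_test_and_set builtin as the hardware test-and-set (names are illustrative); the efficient version would additionally park waiting threads in the OS instead of spinning:

    typedef struct { volatile int flag; } spinlock_t;   /* flag = 1 means held */

    void spin_acquire(spinlock_t *l) {
        /* atomically write 1 and get the old value; old value 0 means we won */
        while (__sync_lock_test_and_set(&l->flag, 1) == 1)
            ;                                /* spin (busy-wait) until free */
    }

    void spin_release(spinlock_t *l) {
        __sync_lock_release(&l->flag);       /* atomically clear the flag */
    }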
Why would the given implementation of a condition not work?
The check of the condition and the act of going to sleep do not occur atomically. A thread may check the variable done, decide to sleep, and then be scheduled out before it actually sleeps. Another thread may then set done and send the wakeup signal while the first thread is not yet asleep; the signal is lost, and when the first thread runs again it goes to sleep forever. The fix is a mutual exclusion lock held around the check and released atomically with going to sleep, which is exactly what a condition variable's wait operation does.
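The standard fix, sketched with pthreads (the usual pattern, not the exact code from the question):

    #include <pthread.h>

    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
    int done = 0;

    void thr_exit(void) {
        pthread_mutex_lock(&m);
        done = 1;
        pthread_cond_signal(&c);         /* wakeup cannot be lost: the waiter */
        pthread_mutex_unlock(&m);        /* holds m while checking done       */
    }

    void thr_join(void) {
        pthread_mutex_lock(&m);
        while (done == 0)
            pthread_cond_wait(&c, &m);   /* atomically releases m and sleeps */
        pthread_mutex_unlock(&m);
    }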
What is the race condition in the given put function for a hash table?
Lines 23 and 24 (e->next = n; and *p = e;). Two insertions can happen at the same location at the same time, so one insert can overwrite the other's pointer update: nodes go missing, or the sorted linked list ends up out of order.
How can you change the put function using a single lock to ensure correct operation of concurrent calls?
We must obtain the lock before Line 17 and release it after Line 24. If we lock Lines 23-24 only, we avoid having missing nodes, but we can still have nodes out of order.
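The numbered listing itself isn't reproduced on these cards; a sketch of how that lock placement might look, assuming a sorted bucket list as the cards describe:

    #include <pthread.h>
    #include <stdlib.h>

    #define NBUCKET 5
    struct entry { int key, value; struct entry *next; };
    struct entry *table[NBUCKET];
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void put(int key, int value) {
        struct entry *e = malloc(sizeof(*e));
        e->key = key;
        e->value = value;

        pthread_mutex_lock(&lock);             /* before the bucket walk */
        struct entry **p = &table[key % NBUCKET];
        struct entry *n = table[key % NBUCKET];
        for (; n != NULL; p = &n->next, n = n->next)
            if (n->key > key)
                break;                         /* keep the bucket sorted */
        e->next = n;                           /* the two racy stores now */
        *p = e;                                /* sit in one critical section */
        pthread_mutex_unlock(&lock);           /* after the final store */
    }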
How can you change the put function using the atomic __compare_and_swap function to avoid a race condition without locks or semaphores?
    for (;;) {
        /* walk the sorted bucket to find the insertion point: *p is the
           pointer to patch, n is the node that will follow e */
        for (p = &table[key % NBUCKET], n = table[key % NBUCKET];
             n != NULL; p = &n->next, n = n->next) {
            if (n->key > key)
                break;
        }
        e->next = n;
        /* publish e only if *p still equals n; if another thread inserted
           here first, the CAS fails and we re-walk the list and retry */
        if (__compare_and_swap(p, n, e))
            break;
    }
What is the difference in performance between the single-lock solution and the compare-and-swap solution for the put function? In which scenario does one outperform the other?
The single-lock solution does not allow any concurrent access to the linked lists for insertion: every put is serialized, regardless of bucket. The compare-and-swap solution allows fully concurrent access and only slows down (by retrying) when two threads try to insert at the same location at the same time. So when insertions are spread across different buckets (few collisions), the compare-and-swap solution far outperforms the single lock; the more threads contend on the same bucket, the smaller its advantage becomes.
In the producer/consumer code with semaphores, what stops 2*MAX producer threads from producing more than MAX elements at a time and overwriting elements in the buffer? How?
The initialization of the 'empty' semaphore. It starts with the value MAX, so the first MAX producers each decrement the semaphore and still see a non-negative value, meaning they can all execute concurrently. But as soon as producer number MAX+1 tries to pass the semaphore, it decrements it to a negative value and sleeps until someone posts the 'empty' semaphore, which only a consumer does (after removing an element). See the sketch below.
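The usual bounded-buffer shape the card refers to (a sketch with assumed names, in the standard semaphore pattern):

    #include <semaphore.h>

    #define MAX 10
    int buffer[MAX];
    int fill = 0, use = 0;
    sem_t empty, full, mutex;
    /* init once: sem_init(&empty, 0, MAX); sem_init(&full, 0, 0);
                  sem_init(&mutex, 0, 1); */

    void produce(int value) {
        sem_wait(&empty);          /* producers MAX+1..2*MAX block here */
        sem_wait(&mutex);
        buffer[fill] = value;
        fill = (fill + 1) % MAX;
        sem_post(&mutex);
        sem_post(&full);           /* wake a waiting consumer */
    }

    int consume(void) {
        sem_wait(&full);
        sem_wait(&mutex);
        int v = buffer[use];
        use = (use + 1) % MAX;
        sem_post(&mutex);
        sem_post(&empty);          /* frees a slot: the only way a producer
                                      sleeping on 'empty' is ever woken */
        return v;
    }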
Explain how gridlock is the same as deadlock in an operating system by showing how each of the four conditions for deadlock hold.
In the gridlock analogy, each line of cars traveling in the same direction is considered a thread.
• Mutual exclusion: Only one line of cars can occupy an intersection at a time.
• Hold-and-wait: One line of cars can occupy an intersection while waiting for another intersection to become available.
• No preemption: When one line of cars is occupying an intersection, others cannot forcefully remove them to free the intersection.
• Circular wait: Because lines of cars pass through intersections in different orders, a cycle can form in which each line holds one intersection while waiting for the intersection held by the next.
What are the disadvantages of paging?
• Internal fragmentation: Page size may not match the size a process needs, leading to wasted memory inside the last page (see the worked example after this list).
• Additional memory reference to page table: Can be very inefficient without a TLB.
• Storage for page tables may be substantial: Especially with linear page tables.
• Page tables must be allocated contiguously in memory: Finding a large contiguous free region for a linear page table gets hard as memory fills.
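To make the first and third points concrete (illustrative numbers): with 4 KB pages, a process that needs 13 KB must be given four pages, i.e. 16 KB, wasting 3 KB to internal fragmentation. And a linear page table covering a full 32-bit address space with 4 KB pages has 2^20 entries; at 4 bytes per PTE that is 4 MB of contiguous memory per process.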
What is a TLB?
TLB stands for Translation Lookaside Buffer. It is a hardware cache of popular virtual-to-physical address translations inside the CPU's Memory Management Unit (MMU); in effect, it caches a small number of recently used page table entries.
Outline the paging translation steps with a TLB.
- Extract the VPN (virtual page number) from the VA (virtual address).
- Check the TLB for the VPN.
- If miss:
  a. Calculate the address of the PTE (page table entry).
  b. Read the PTE from memory.
  c. Replace some entry in the TLB with the new translation.
- Extract the PFN (page frame number).
- Build the PA (physical address) from the PFN and the page offset.
- Read the contents of the PA from memory into a register.
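A C-flavored sketch of those steps (VPN_MASK, SHIFT, PTBR, and the helpers are illustrative pseudocode, not a real API):

    uint64_t translate(uint64_t va) {
        uint64_t vpn = (va & VPN_MASK) >> SHIFT;            /* extract VPN      */
        struct tlb_entry *t = tlb_lookup(vpn);              /* check the TLB    */
        if (t == NULL) {                                    /* miss:            */
            uint64_t pte_addr = PTBR + vpn * sizeof(pte_t); /*  a. PTE address  */
            pte_t pte = read_mem(pte_addr);                 /*  b. read the PTE */
            t = tlb_insert(vpn, pte.pfn);                   /*  c. fill the TLB */
        }
        uint64_t pa = ((uint64_t)t->pfn << SHIFT)           /* extract PFN and  */
                    | (va & OFFSET_MASK);                   /* build the PA     */
        return pa;      /* the load/store then goes to this physical address */
    }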
How can the system improve TLB performance (hit rate) given a fixed number of TLB entries?
Increase page size. Fewer unique page translations are needed to access the same amount of memory.
Define TLB Reach.
TLB reach = number of TLB entries × page size.
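For example (illustrative numbers), a 64-entry TLB with 4 KB pages reaches 64 × 4 KB = 256 KB of memory without a miss; with 2 MB pages the same 64 entries reach 128 MB, which is why larger pages raise the hit rate.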
What access pattern will result in slow TLB performance?
Highly random access with no repeat accesses. Sequential array accesses almost always hit in the TLB and are very fast.
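A sketch of why sequential access behaves so well (assuming 4 KB pages and 4-byte ints):

    int a[2048];                  /* 8 KB array: spans two 4 KB pages */

    int sum(void) {
        int s = 0;
        for (int i = 0; i < 2048; i++)
            s += a[i];            /* 1024 ints fit on one page, so only the
                                     first access to each page can miss:
                                     at most 2 TLB misses in 2048 accesses */
        return s;
    }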
Explain temporal locality.
An instruction or data item that has been recently accessed will likely be re-accessed soon in the future.
Explain spatial locality.
If a program accesses memory at address x, it will likely soon access memory near x. TLBs improve performance due to spatial locality.
What TLB characteristics are best for spatial locality?
Accesses to the same page repeat, so the same VPN→PFN translation is needed again and the same TLB entry is re-used.
What TLB characteristics are best for temporal locality?
An address accessed now is accessed again soon, so its TLB entry is re-used before it is evicted. How near in the future must the re-access be? That depends on how many TLB entries there are (and on the replacement policy).