midterm Flashcards Preview

Flashcards in midterm Deck (36):

What are the key roles of an operating system?

hide hardware complexity (abstraction): read/write file storage, send/recv socket network;
resource management (arbitration): memory management, CPU scheduling;
provide isolation and protection: memory allocation and isolation


Can you make distinction between OS abstractions, mechanisms, policies?

Abstractions hide the underlying details of hardware by providing an API that is simpler to reason about than reasoning about the underlying hardware directly (part of the *how* things can be done); ex: process, thread, file, socket, memory page

Mechanisms provide functionality to do certain things with the hardware that can be used to implement policies (*how* something can be done); ex: create, schedule, open, write, allocate, map to a process

Policies are the defined behaviors for the OS to have that can be implemented using mechanisms (*what* should be done): least recently used (LRU), earliest deadline first (EDF)


What does the principle of separation of mechanism and policy mean?

separation of mechanism and policy:
means there should be a separation between how we specify *what* needs to be done (policy) and *how* it gets done (mechanism); policies can change, so it's good to keep the mechanisms separate from the policies so that the mechanisms can continue to support the policies as the policies change; promotes flexibility!
implement flexible mechanisms to support many policies
e.g. LRU, LFU, random
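
A minimal sketch of the idea, not taken from the lectures: one eviction mechanism (a fixed-size cache) that can be driven by different policies. The names `Cache`, `lru_policy`, and `random_policy` are made up for illustration.

```python
import random
from collections import OrderedDict


def lru_policy(entries):
    """Policy: evict the least recently used key (the oldest entry)."""
    return next(iter(entries))


def random_policy(entries):
    """Policy: evict any key, chosen at random."""
    return random.choice(list(entries))


class Cache:
    """Mechanism: fixed-capacity store that asks a policy which key to evict."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.entries = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # record the use
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            victim = self.policy(self.entries)  # policy decides *what* to evict
            del self.entries[victim]            # mechanism does the eviction
        self.entries[key] = value


cache = Cache(capacity=2, policy=lru_policy)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # under LRU this evicts "b"
```

Swapping `lru_policy` for `random_policy` changes the behavior without touching the `Cache` code, which is the flexibility the principle is after.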


What does the principle optimize for the common case mean?

optimize for the common case:
there'll be a ton of different ways that you *can* implement something but they might not all mesh well together, so you should try to focus your implementation on the common case
things to think about as you try to figure out what the common case actually is:
Where will the OS be used?
What will the user want to execute on that machine?
What are the workload requirements?


What happens during a user-kernel mode crossing?

Happens when a user-level process tries to execute a privileged instruction; causes a trap to occur

When in kernel mode, a special bit is set in the CPU that permits any instruction that directly manipulates hardware to execute.
When in user mode, the bit is not set, so instructions that attempt to perform privileged operations will be forbidden; they will cause a trap, the app will be interrupted, and the hardware switches control to the OS at a spec. location.
Then the OS has the chance to check what caused the trap to occur and decide if it should grant it access or if it should terminate the proc (if it was trying to do something illegal).
Or in addition to the trap method: interactions between app and OS can be via system calls.
Also: signals


What are some of the reasons why user-kernel mode crossing happens?

sometimes the user-level process needs to do privileged stuff with the OS, but only the kernel is allowed to do that
...because the user-level process doesn't see the whole OS picture and isn't necessarily smart enough for the OS to trust it and let it do what it wants

this gets done via a system call; system calls require a user-kernel mode crossing

if application needs the OS to...


What is a kernel trap? Why does it happen? What are the steps that take place during a kernel trap?

kernel trap:
-occurs when an application in user-mode tries to execute something that requires privileged access on the OS
-the OS switches control to the kernel so that it can check why the trap occurred and whether it should execute the instruction that caused it or terminate the process

-when an app in user-mode tries to do something that requires privileged mode, it causes a trap: the app is interrupted, and the hardware switches control to the OS at a specific location. Then the OS has the chance to check what caused the trap to occur and decide if it should grant it access or if it should terminate the proc (if it was trying to do something illegal).


What is a system call? How does it happen? What are the steps that take place during a system call?

system calls:
-an interface provided by the OS that user-level apps can use when they want to do a privileged operation
-user-level app calls syscall with syscall# and args [written to well-defined location]
-OS context switches from user-level thread to a kernel thread to execute the call
-then the OS context switches from kernel thread back to user thread

-a set of operations provided by the OS that apps can explicitly invoke when they want the OS to perform a certain service or privileged access on their behalf; ex: open(file), send(socket), malloc(memory) [strictly, malloc is a library call that uses syscalls like brk/mmap underneath]
1. user proc executing
2. user proc calls system call
3. change OS context from user->kernel
4. pass args that are necessary for syscall op
5. jump somewhere in the kernel mem so it can go thru the instruction sequence for that syscall
6. once syscall completes, it returns the results
7. change context back from kernel->user
8. jump to same location in user proc code where syscall was made from

app must:
-write arguments
-save relevant data at well-defined location
-make system call
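
A small sketch of what this looks like from the app's side, assuming a POSIX-like system. Python's `os.pipe`, `os.write`, `os.read`, and `os.close` are thin wrappers around the system calls of the same names, so each call below involves a user->kernel crossing and a return back to the caller.

```python
import os

# Each call passes its arguments to the kernel, the kernel does the
# privileged work, and the result comes back to the same spot in user code.
read_fd, write_fd = os.pipe()            # kernel creates a pipe, returns two fds
written = os.write(write_fd, b"hello")   # args: fd + buffer; returns bytes written
data = os.read(read_fd, 16)              # args: fd + max size; returns the bytes
os.close(read_fd)
os.close(write_fd)
```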


Contrast the design decisions and performance tradeoffs among monolithic, modular and microkernel-based OS designs.

monolithic OS:
+everything included
+inlining, compile-time optimizations
-customization, portability, manageability...
-memory footprint

modular OS:
+smaller footprint
+less resource needs
-indirection can impact performance
-maintenance can still be an issue

microkernel:
-complexity of software development
-cost of user/kernel crossing

Process vs. thread, describe the distinctions.

process:
-each [single-threaded] process has its own address space and is also represented by its execution context (regs, stack, PC, etc)
--OS represents this in a PCB
-takes longer to context switch
-OS makes sure that there's no overlap of virtual -> physical address space in different processes

thread:
-threads represent multiple independent execution contexts
-all the threads (that belong to the same process) share the same virtual-to-physical address mappings, code, data, and files [and the cache if the CPU is context switching from one thread to another within the same process]
-each thread has its own PC, thread-specific regs, stack pointer, and stack
--OS represents this in a more complex PCB: has all the info that's shared among the threads, and separate info about the execution contexts of all of the threads [that are part of that process]
-can share a virtual address space
-usually result in hotter caches when multiple threads exist (when you context switch btwn threads, the cache doesn't get swapped out)
-since threads share the same virtual->physical address mappings, multiple threads could try to access the same data at the same time and cause data races

both:
-have an execution context (stack and registers)
-make use of some communication mechanisms


What happens on a process vs. thread context switch.

process context switch (ex P1 is running and P2 is idle):
1. P1 is currently running
2. OS interrupts P1
3. OS saves state of P1 into PCB for P1
4. OS restores PCB of P2: has to update CPU regs with values that correspond to those from PCB of P2
5. when P2 is done or when OS interrupts P2: OS saves state of P2 into PCB for P2
6. OS restores PCB of P1: has to update CPU regs with values that correspond to those from PCB of P1
7. now P1 is running again and picked up from the spot it left off before it was interrupted

thread context switch:
(don't need to redo virtual->physical address mapping)
1. T1 is currently running
2. OS saves execution context [just stack and regs] of T1 into T1 part of PCB for process
3. OS restores execution context of T2 (doesn't have to create new virtual->physical address mappings, also keeps the same cache)
4. when T2 is done: OS saves execution context of T2 into T2 part of PCB
5. OS restores execution context of T1
6. now T1 is running again and picked up from the spot it left off before it was interrupted


Describe the states in a lifetime of a process?

new: when process is created; it has successfully passed OS admission control and has a PCB and some memory allocated to it

new -> ready (once admitted, the OS places the proc in the ready Q): ready to start executing but isn't actually running on the CPU yet (waiting for scheduler)

ready -> running (scheduler puts the proc on the CPU): after scheduler gives the CPU to a ready process

running -> ready: gets kicked back into the ready Q when an interrupt occurs; or if the proc's CPU timeslice expired; or if the proc forks a child, once the child executes the parent goes back into the ready Q

running -> waiting: can get sent here from running if the proc needs to initiate a longer operation (ex: reading data from disk) or wait on some input from timer or keyboard

waiting -> ready: goes back here once the event that it was waiting for happens

running -> terminated: when running proc finishes all operations in the program or encounters some error it'll exit (return some kind of exit code, success/error) then proc is terminated
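
The states and transitions above can be sketched as a tiny state machine. This is only an illustration; the `ALLOWED` table and the `Process` class are hypothetical names, not an OS data structure.

```python
# Which state changes the lifecycle permits.
ALLOWED = {
    "new": {"ready"},
    "ready": {"running"},
    "running": {"ready", "waiting", "terminated"},
    "waiting": {"ready"},
    "terminated": set(),
}


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.state = "new"
        self.history = ["new"]

    def move_to(self, new_state):
        # Reject transitions the lifecycle does not allow (ex: ready -> waiting).
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


p = Process(pid=1)
for s in ["ready", "running", "waiting", "ready", "running", "terminated"]:
    p.move_to(s)
```

Note that a waiting process never goes straight to running; it always passes through ready first.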


Describe the lifetime of a thread

1. thread is created by parent thread; parent thread blocks until thread creation is complete
-can be created as detached (from) or joinable (to) the parent
2. child thread starts executing at the start_routine specified when thread was created
3a. if child was created as detached, the child and parent can [return results???] and exit at any time, independently of each other
3b. if child was created as joinable, the child thread must return its results to the parent and exit before the parent can return its result and exit
4. after thread exits, its structure is destroyed and memory is freed (either immediately or it is put on a deathrow and reaped later at a lower load time)

1. thread T1 is created by parent T0 [ex: with Fork(proc, args)]; parent blocks until thread creation is complete
2. T1 starts executing at the instruction pointed to by proc with the args specified in args; T0 continues executing where it left off
3. T0 calls join(T1) and blocks until T1 is complete (if T1 hasn't already completed); once T1 returns its result to T0, T1 is terminated
4. T0 can now exit
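
A rough Python equivalent of the joinable case, using `threading.Thread` in place of Birrell's Fork/Join. The `results` dict and `start_routine` are illustrative names; Python threads have no direct "detached" mode, so only the join path is shown.

```python
import threading

results = {}


def start_routine(name, n):
    # The child starts executing here, with the args given at creation.
    results[name] = sum(range(n))


t1 = threading.Thread(target=start_routine, args=("T1", 5))
t1.start()   # T0 (this thread) continues right after T1 is created
t1.join()    # T0 blocks until T1 has finished (returns at once if it already has)
t1_result = results["T1"]
```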


Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.

1. process is in waiting state, ex: waiting on a keyboard read I/O operation, so the process is in the keyboard's I/O Q
2. the keyboard I/O event happens
3. the OS responds to the process's request with the read keyboard I/O event info
4. the OS (as part of handling the I/O completion interrupt, not the scheduler) moves the process back into the ready Q
5. the scheduler grants the process access to the CPU, moves the proc to the CPU and lets it start executing

1. process is in waiting state
2. the event that the process is waiting on happens
3. process moves to ready state
4. scheduler grants CPU usage to process
5. process moves to running state


What are the pros-and-cons of message-based vs. shared-memory-based IPC.

message-based IPC (OS provides a communication channel, ex: shared buffer, and both processes send/recv info to/from the channel):
+OS manages it
-overheads: every piece of info that we want to pass we have to copy from userspace of P1 into channel in kernel mem, and then back into the addr space of P2

shared-memory IPC (OS establishes a shared channel and maps it into the address space of both processes):
+OS is out of the way!
-process developers have to re-implement code (since they can't use the OS's APIs for msg sharing)

performance depends:
-shared memory: data exchange is cheap but actual operation of memory mapping between processes is expensive;
only makes sense to do shared memory implementation if the setup cost can be amortized across a sufficient number of messages
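
The amortization argument can be made concrete with a back-of-the-envelope calculation. The cost numbers below are invented units purely for illustration, and `messages_to_break_even` is a hypothetical helper.

```python
import math


def messages_to_break_even(shm_setup_cost, msg_copy_cost, shm_exchange_cost):
    """Smallest message count at which shared memory is no more expensive in total.

    message-based total: n * msg_copy_cost
    shared-memory total: shm_setup_cost + n * shm_exchange_cost
    """
    saving_per_msg = msg_copy_cost - shm_exchange_cost
    if saving_per_msg <= 0:
        return None  # shared memory never catches up
    return math.ceil(shm_setup_cost / saving_per_msg)


# Assumed costs: setup 1000 units, 12 per copied message, 2 per shared-memory exchange.
n = messages_to_break_even(shm_setup_cost=1000, msg_copy_cost=12, shm_exchange_cost=2)
```

With these made-up numbers the setup cost is paid back after 100 messages; below that, message passing is cheaper overall.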


What are the benefits of multi threading?

benefits of multi-threading:
+parallelization => speed up (can process input much faster than if only a single thread on a single CPU has to process the entire matrix)
+specialization => hot cache! [ex: have certain threads handle different tasks, we can differentiate how we manage those threads; ex: give higher priority to threads that handle more important tasks;
also: performance depends on how much state can be present in the processor cache: each thread running on a diff processor has access to its own processor cache; if the thread repeatedly executes a smaller portion of the code (just 1 task), then more of that state/program will be present in the cache -> hotter cache -> better performance]
+efficiency => lower memory management and cheaper IPC [with threads (just need 1 address space, and each execution context), you dont have to allocate the address space and execution context for every single process; passing data/synchronizing btwn processes requires more expensive IPC methods, (with threads its just done via shared variables in the same process address space)]

-benefits to applications and to OS itself:
--by multithreading OS kernel threads: allows OS to support multiple execution contexts on multiple CPUs
-OS's threads working on behalf of apps
-OS-level services like daemons or [device] drivers

+hide latency


When is it useful to add more threads?**

It can be useful to add more threads if you are able to structure your application such that it can do multiple things at the same time without interfering with each other. For example, like with a webserver, a separate thread can be used to serve each incoming client request; or the threads can be structured in a pipeline so that each thread performs a different step in the workflow, like an assembly line.

re. if you have more threads than CPUs:
if the amount of time that the CPU would be idle is longer than 2 times the amount of time it would take to do a thread context switch, then the CPU should context switch to another thread to hide latency.
but otherwise, it's better for the CPU to just wait out the idle time instead of context switching.
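
The rule of thumb above, written as a one-line check. The function name and the sample times are made up; the factor of 2 accounts for switching away and switching back.

```python
def should_context_switch(t_idle, t_ctx_switch):
    """True if switching away and back (2 switches) costs less than idling."""
    return t_idle > 2 * t_ctx_switch


worth_it = should_context_switch(t_idle=10, t_ctx_switch=3)      # 10 > 6
not_worth_it = should_context_switch(t_idle=5, t_ctx_switch=3)   # 5 > 6 is false
```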


When does adding threads lead to pure overhead?**


(Birrell:) If you have significantly more threads ready to run than processors.

re. if you have more threads than CPUs:
if the amount of time that the CPU would be idle is longer than 2 times the amount of time it would take to do a thread context switch, then the CPU should context switch to another thread to hide latency.
but otherwise, it's better for the CPU to just wait out the idle time instead of context switching.

?if you dont have much work for the thread to do?


What are the possible sources of overhead associated with multithreading?

passing data among threads thru shared memory queues

If you have significantly more threads ready to run than processors
-because thread schedulers are slow at making general scheduling decisions
-if thread has to be put on a queue and later swapped into a processor in place of some other thread
-if you have lots of threads running that are more likely to conflict over mutexes or the resources managed by your condition vars

-using mutexes, if another thread blocks on the mutex
-the thread scheduler (and lock conflicts with high priority threads)
-spurious wakeups


Describe the boss-worker multithreading pattern.
If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model?
What are the limiting factors in improving performance with this pattern?

-boss assigns work to workers
-worker: performs entire task
-boss-workers communicate via producer/consumer queue
-worker pool: static or dynamic

X thread pool mgmt (inc. synch for shared buffer)
X ignores locality (boss doesnt keep track of what any one worker was doing last)

throughput of the system limited by boss thread => must keep boss efficient

throughput = 1/boss_time_per_order

boss assigns work by:

-directly signaling specific worker
+workers dont need to synchronize
X boss must keep track what each worker is doing
X throughput will go down

-placing work in producer/consumer queue
+boss doesnt need to know details about workers
X queue synchronization
(this still results in a lower time_per_order for the boss, so overall throughput is better)

how many workers in pool?
-add more workers on demand? X may be inefficient though because you have to wait for the worker to arrive
-have a pool of workers thats created up front?
--static vs dynamic: allow size of pool to be dynamic in size

boss-worker variants:
-all workers created equal
-workers specialized for certain tasks [boss has to do a little more work per order now but xtra work is likely offset]
+better locality; QOS mgmt
X load balancing
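
A minimal sketch of the queue-based variant, assuming a static pool and a toy task (squaring a number). `run_boss_worker` and the `None` sentinel are illustrative choices, not something from the lectures.

```python
import queue
import threading


def run_boss_worker(tasks, num_workers=3):
    work_q = queue.Queue()          # producer/consumer queue (synchronized internally)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = work_q.get()
            if task is None:        # sentinel from the boss: no more work
                break
            outcome = task * task   # the worker performs the entire task
            with results_lock:
                results.append(outcome)

    # Static pool: all workers are created up front.
    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for task in tasks:              # boss only enqueues, so time per order stays small
        work_q.put(task)
    for _ in workers:               # one sentinel per worker
        work_q.put(None)
    for w in workers:
        w.join()
    return sorted(results)


squares = run_boss_worker([1, 2, 3, 4])
```

The boss never needs to know which worker picks up which task; the price is the synchronization hidden inside the queue.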


Describe the pipeline multithreading pattern.
If you need to improve a performance metric like throughput or response time, what could you do in a pipeline model?
What are the limiting factors in improving performance with this pattern?

pipeline pattern:
-sequence of stages
-stage == subtask
-each stage == thread pool (can use same method as boss-workers to determine # of threads)
-buffer-based communication
+specialization and locality
X balancing and synchronization overheads

-threads assigned one subtask in the system
-entire tasks == pipeline of threads
-multiple tasks concurrently in the system, in different pipeline stages
-throughput == weakest link
=> pipeline stage == thread pool (longer stage means you should assign more threads to that stage)
-shared-buffer based communication [to pass msgs] between stages
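
A small sketch of the pattern, assuming one thread per stage for simplicity (the notes suggest a thread pool per stage, with more threads for longer stages). `stage`, `run_pipeline`, and the `DONE` sentinel are hypothetical names.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """One pipeline stage: take an item, do this stage's subtask, pass it on."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)   # tell the next stage to shut down too
            break
        outbox.put(func(item))


def run_pipeline(items, funcs):
    # One shared buffer between each pair of neighbouring stages.
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(DONE)
    out = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out


processed = run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10])
```

Several items can be in flight at once, each in a different stage, and overall throughput is bounded by the slowest stage.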


What are mutexes?
Condition variables?

-a synchronization mechanism to grant exclusive access to shared state information /data structure to only one thread at a time
-like a lock that should be used whenever accessing data or shared state that's shared among threads
-data structure needs to contain: status (locked or free), some info about the owner of the mutex (who currently has the lock), and a list of blocked_threads who are currently waiting on the lock

condition variables:
-a synchronization mechanism that allows threads to wait on other threads for a specific condition before proceeding
--ex cond var=list_full, from producer/consumer example:
consumer:
Lock(m) {
while (my_list.not_full())
Wait(m, list_full);
//...consume the list...
} //unlock;
producer:
Lock(m) {
//...insert into my_list...
if my_list.full()
Signal(list_full);
} //unlock;
-condition variable API:
--Condition type
--Wait(mutex, cond)
---mutex is automatically released and reacquired on wait
--Signal(cond)
---notify only one thread waiting on condition
--Broadcast(cond)
---notify all waiting threads
--Condition Variable data structure: mutex ref, waiting threads, ...
--what Wait(mutex, cond) does:
//atomically release mutex
//and go on wait queue
//...wait wait wait...
//remove from queue
//re-acquire mutex
//exit the wait operation
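
A runnable sketch of the list_full producer/consumer example using Python's `threading.Condition`, which pairs a condition variable with a mutex. `CAPACITY` and the `consumed` list are assumptions added for the demo.

```python
import threading

CAPACITY = 3
my_list = []
m = threading.Lock()
list_full = threading.Condition(m)   # condition variable tied to mutex m
consumed = []


def consumer():
    with m:                                # Lock(m)
        while len(my_list) < CAPACITY:     # re-check the predicate after every wakeup
            list_full.wait()               # releases m while waiting, re-acquires before returning
        consumed.extend(my_list)
        my_list.clear()


def producer(thread_id):
    with m:
        my_list.append(thread_id)
        if len(my_list) == CAPACITY:
            list_full.notify()             # Signal(list_full): wake one waiter


c = threading.Thread(target=consumer)
c.start()
producers = [threading.Thread(target=producer, args=(i,)) for i in range(CAPACITY)]
for p in producers:
    p.start()
for p in producers:
    p.join()
c.join()
```

If the consumer happens to start after the list is already full, the `while` check is false and it never waits, so a signal sent before the wait is not a problem.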


Can you quickly write the steps/code for entering/exiting a critical section for problems such as readers/writer, readers/writer with selective priority (i.e. reader priority vs writer priority)?

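generic entry/exit:
Lock(mutex) {
while (!predicate_indicating_access_ok)
wait(mutex, cond_var);
update state => update predicate
signal and/or broadcast(cond_var_with_correct_waiting_threads)
} //unlock;

(resource_counter: 0 = free, -1 = one writer, >0 = number of readers)

readers:
Lock(counter_mutex) {
while(resource_counter == -1)
Wait(counter_mutex, read_phase);
resource_counter++;
} //unlock;
//read data
Lock(counter_mutex) {
resource_counter--;
if(resource_counter == 0)
Signal(write_phase);
} //unlock;

writer:
Lock(counter_mutex) {
while(resource_counter != 0)
Wait(counter_mutex, write_phase);
resource_counter = -1;
} //unlock;
//write data
Lock(counter_mutex) {
resource_counter = 0;
Broadcast(read_phase);
Signal(write_phase);
} //unlock;

generic with proxy variable:
//enter critical section
Lock(mutex) {
while (!predicate_for_access)
wait(mutex, cond_var)
update predicate
} //unlock;
perform critical operation (read/write shared file)
//exit critical section
Lock(mutex) {
update predicate;
signal/broadcast(cond_var)
} //unlock;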
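
A runnable sketch of the basic readers/writer pattern (no selective priority), assuming the usual proxy variable `resource_counter` with 0 = free, -1 = one writer, n > 0 = n readers. `shared_data` and `seen` exist only for the demo.

```python
import threading

counter_mutex = threading.Lock()
read_phase = threading.Condition(counter_mutex)
write_phase = threading.Condition(counter_mutex)
resource_counter = 0           # 0: free, -1: one writer, n > 0: n readers
shared_data = {"value": 0}
seen = []


def reader():
    global resource_counter
    with counter_mutex:                    # enter
        while resource_counter == -1:
            read_phase.wait()
        resource_counter += 1
    value = shared_data["value"]           # read data, outside the mutex
    with counter_mutex:                    # exit
        resource_counter -= 1
        if resource_counter == 0:
            write_phase.notify()           # last reader lets a writer in
    seen.append(value)


def writer(new_value):
    global resource_counter
    with counter_mutex:                    # enter
        while resource_counter != 0:
            write_phase.wait()
        resource_counter = -1
    shared_data["value"] = new_value       # write data, outside the mutex
    with counter_mutex:                    # exit
        resource_counter = 0
        read_phase.notify_all()            # Broadcast(read_phase)
        write_phase.notify()               # Signal(write_phase)


threads = [threading.Thread(target=writer, args=(42,))]
threads += [threading.Thread(target=reader) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The mutex only protects the counter; the actual read/write of the shared data happens outside it, which is what lets multiple readers proceed at once.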


Do you understand the need for using a while loop for the predicate check in the critical section entry code examples in the lessons?

we have to use while() instead of if() because:
-"while" can support multiple consumer threads
-we cannot guarantee access to the mutex once the condition is signalled
-the list can change before the consumer gets access again


What are spurious wake ups?
How do you avoid them?
Can you always avoid them?

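spurious wakeups: when we wake threads up knowing they may not be able to proceed, ex (writer exit code):
Lock(counter_mutex) {
resource_counter = 0;
Broadcast(read_phase);
Signal(write_phase);
} //unlock;
the woken threads were in Wait(counter_mutex, read/write phase); if we (unlock after broadcast/signal), no other thread can get the lock yet! they just move from the cond var's wait queue to the mutex's wait queue

in some cases, we can unlock the mutex before we broadcast/signal:
Lock(counter_mutex) {
resource_counter = 0;
} //unlock;
Broadcast(read_phase);
Signal(write_phase);
but it's not possible to restructure the readers' exit code like this (the signal depends on checking resource_counter, which must happen under the mutex), so we cannot always avoid spurious wakeups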


What’s a simple way to prevent deadlocks? Why?

deadlock: two or more competing threads are waiting on each other to complete, but none of them ever do

simple way to prevent deadlocks: maintain lock order, ex: first m_A, then m_B
+will prevent cycles in wait graphs (deadlocks)
+this is foolproof, guaranteed, but it can be quite complicated in more complex programs

-unlock A before locking B (fine-grained locking), but X threads need both A and B!
-get all the locks up front, then release at the end (use one MEGA lock), but X too restrictive=>limits parallelism, though +for some apps this is OK

...another one: the ostrich algorithm: DO NOTHING! if all else fails, just reboot!
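
A sketch of the lock-ordering approach, assuming a fixed global order "first m_A, then m_B". The `LOCK_ORDER` table and the helpers `acquire_in_order` / `release_all` are hypothetical; the point is that both threads end up locking in the same order no matter how they ask.

```python
import threading

m_A = threading.Lock()
m_B = threading.Lock()
LOCK_ORDER = {id(m_A): 0, id(m_B): 1}   # global order: first m_A, then m_B


def acquire_in_order(*locks):
    # Sort the requested locks by the global order before acquiring any of them.
    ordered = sorted(locks, key=lambda lock: LOCK_ORDER[id(lock)])
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    for lock in reversed(ordered):
        lock.release()


counter = 0


def work(first, second, rounds):
    global counter
    for _ in range(rounds):
        held = acquire_in_order(first, second)   # request order does not matter
        try:
            counter += 1
        finally:
            release_all(held)


t1 = threading.Thread(target=work, args=(m_A, m_B, 1000))
t2 = threading.Thread(target=work, args=(m_B, m_A, 1000))   # asks in the opposite order
t1.start()
t2.start()
t1.join()
t2.join()
```

Without the sort, t1 holding m_A and t2 holding m_B could wait on each other forever; with it, no cycle in the wait graph is possible.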


Can you explain the relationship between kernel vs user-level threads?
Think through a general m*n scenario and in the current Linux model.
What happens during scheduling, synchronization, and signals in these cases?

kernel-level threads:
-imply that the OS itself is multithreaded
-are visible to the kernel
-are managed by kernel-level components like the OS-level scheduler

user-level threads:
-processes themselves are multithreaded (these are the user-level threads)
-for a ULT to execute, it must be assoc with a KLT, then the OS-level scheduler must schedule that KLT onto a CPU

1*1 model:
-each ULT has a KLT assoc with it
-when user proc creates a new ULT, a KLT is associated to it (so new KLT is created or if theres one available it gets associated with it)
+OS can see all of the ULTs and understands that proc is MT and understands what those threads needs (synch, sched, blocking, etc)
X must go to OS for all ops (may be expensive)
X OS may have limits on policies, thread #s
X portability

m*1 model:
-all of the ULTs are mapped onto a single KLT
-at UL, there is a thread mgmt lib that decides which ULT will be mapped onto the KLT at any given time (but that KLT will still only be run when the OS schedules it onto the CPU)
+totally portable, doesnt depend on OS limits and policies (re. sched, synch, blocking, etc)
X OS has no insights into app needs (doesn't even know that it's MT)
X OS may block entire process if one ULT blocks on I/O

m*m model:
-allows some ULTs to be assoc with 1 KLT and others to have a 1-1 mapping with a KLT
+can be best of both worlds (OS knows its MT)
+can have bound or unbound threads
-requires coordination between UL- and KL- thread managers

in general:
-at kernel level: System Scope: system-wide thread mgmt by OS-level thread managers (e.g. CPU scheduler)
-at user level: Process Scope: user-level library manages threads within a single process


Can you explain why some of the mechanisms (for configuring degree of concurrency, for signaling, the use of LWP...) described in the Solaris papers are not used/necessary in the current Linux model?

task struct in current Linux:
-main execution abstraction=>task
--kernel level thread
-single-threaded process =>[has] 1 task
-multi-threaded process => [has] many tasks

each task/process is represented by [a lot of stuff, but specifically] a pid [processID], a list of tasks, ...

linux never had one contiguous PCB; instead process state was always represented by a collection of references to data structures (ex: mem mgmt, file mgmt, etc), referenced via pointers
-makes it easy for tasks in a single process to share some parts of addr space (ex: virt addr mappings, files)

to create a new task, linux uses function "clone(function, stack_ptr, sharing_flags, args)"
-sharing_flags: enable more flexibility when creating new tasks

fork in linux is internally implemented via clone and clearing all the flags;
fork also has diff semantics for MT processes [expect child to be a ST process, replicating a portion of the addrspace thats visible to current parent task] and ST processes [expect child to be full replica of parent process]

current linux threading model: Native POSIX Threads Library (NPTL): 1-1 model
--kernel sees ULT info
--kernel traps are cheaper
--more resources: memory, large range of IDs...
-older model: LinuxThreads (more similar to M-M model, suffered from same complexity as in Solaris papers with signal mgmt etc)


What are the benefits of the event-based model described in the Flash paper over MT and MP?
What are the limitations?
Would you convert the AMPED model into a AMTED (async multi-threaded event-driven)?
How do you think a AMTED version of Flash would compare to the AMPED version of Flash?**

benefits of event-driven model over MP/MT:
* in MT/MP you can only have 1 execution context [request] per thread/process, but in event-driven model you can have a whole bunch of requests in flight in various phases at any given point in time
+ single address space
+single flow of control
+smaller memory management
+no ctx switching
+no synchronization
+(AMPED:) resolves portability limitations of basic event-driven model
+(AMPED:) smaller footprint than regular worker thread

limitations of event-driven model:
-a blocking request/handler will block the entire process
-(AMPED:) applicability to certain classes of
-(AMPED:) event routing on multi-CPU systems

would probably be OK to try AMTED vs AMPED today because from the Flash paper, "Separate processes were chosen instead of kernel threads to implement the helpers, in order to ensure portability of Flash to operating systems that do not (yet) support kernel threads, such as FreeBSD 2.2.6." [1999]

AMTEDFlash vs AMPEDFlash?
I think AMTEDFlash would perform better than AMPEDFlash because it doesn't need to context switch............ [actually i think i need to think about this somemore]
I think AMTED would perform better than AMPED because in AMPED, if a systemcall blocks then the entire process would be blocked. Whereas with AMTED, if a systemcall blocks, then only the thread that issued it would be blocked, so other work could still continue as the other threads could keep going.


What’s the potential issue if a interrupt or signal handler needs to lock a mutex?
What’s the workaround described in the Solaris papers?

interrupts and signals are executed in the context of [on the stack of] the thread that was interrupted; so when an interrupt/signal occurs, the thread's PC points to the 1st instruction of the signal/interrupt handler code and its stack remains the same

There can be an issue if the signal/interrupt handler code requires a mutex m and the originally executing thread already had mutex m.

for example:
1. thread is currently executing; its PC is pointing to whatever code it's currently running and its SP just points to its stack.
2. the thread's normal code requires mutex m; so the thread locks m and starts executing its critical section.
3. a signal/interrupt occurs: the thread's PC now points to the 1st instruction of the signal/interrupt handler code and its stack remains the same
4. the thread starts executing the handler code
5. an instruction in the handler code requires mutex m
HERE we have deadlock because:
-the handler code can't finish its execution until it gets m
-but it'll never get m because the earlier thread code already has m and is waiting for the handler code to finish
-but the handler code will never finish because it doesnt have m

WORKAROUND FROM PAPER: disable signals/interrupts while in a critical section (so also keep critical sections small!)

Interrupts and signals are executed in the context of the thread that was interrupted (i.e. on the thread's stack).

If signal/interrupt handler code needs to access some state that other threads in the system would be accessing, then we have to use mutexes, but if the current thread already has the mutex, then we have deadlock because:
-thread won't give up current mutex until it completes its critical section
-and the signal/interrupt handler won't move forward until it has the mutex.

workaround from papers:
[non solution: 1. keep handling code simple
=>X too restrictive, limits what a handler can do]
2. control interruptions by handler code => use interrupt/signal masks
-disable interrupts/signals (using signal/interrupt mask) while in critical sections
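
A sketch of the signal-mask workaround at user level, assuming a POSIX system and that this runs in the main thread (Python only lets the main thread install handlers). SIGUSR1, the `events` list, and the function names are choices made for the demo; `signal.pthread_sigmask` is the real stdlib call.

```python
import os
import signal
import threading

m = threading.Lock()
events = []


def handler(signum, frame):
    # The handler needs the same mutex as the normal code. If it ran while
    # the thread below still held m, a blocking acquire would deadlock.
    got = m.acquire(blocking=False)
    events.append(("handler", got))
    if got:
        m.release()


def critical_section_with_mask():
    # Disable SIGUSR1 for the duration of the critical section.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        with m:
            events.append(("enter", True))
            os.kill(os.getpid(), signal.SIGUSR1)   # arrives mid-section, stays pending
            events.append(("exit", True))
    finally:
        # Restoring the mask delivers the pending signal, now that m is free.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


signal.signal(signal.SIGUSR1, handler)
critical_section_with_mask()
```

The signal is not lost, only deferred: the handler runs once the mask is restored, which is also why critical sections should stay short.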


Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper.

MP (multiprocess):
+simple programming
-many processes => high memory usage
-costly context switch
-hard/costly to maintain shared state
-tricky port setup

MT (multithreaded):
+shared address space
+shared state
+cheap context switch
-not simple implementation
-requires synchronization
-underlying support for threads (not so much of an issue today)


What is an interrupt? What is a signal?

interrupts:
-an event generated externally by components other than the CPU (ex: I/O devices, timers, other CPUs)
-determined based on the physical platform
-appear asynchronously

signals:
-an event triggered by the CPU and software running on it
-determined based on the OS
-appear synchronously or asynchronously
-2 types
--one-shot signals
---"n signals pending == 1 signal pending": handler is called at least once
---must be explicitly re-enabled
--real-time signals:
---"if n signals raised, then handler is called n times"

both:
-have a unique ID on the hardware [interrupts] or OS [signals]
-can be masked and disabled/suspended via corresponding mask
-per-CPU interrupt mask, per-process signal mask (because signals are delivered to processes) [in a MT process each thread can also have its own signal mask]
-if mask indicates enabled, trigger corresponding handler
--interrupt handler set on entire system by OS
--signal handlers set on per-process basis, by process


What happens during interrupt or signal handling?
How does the OS know what to execute in response to an interrupt or signal?

interrupt:
1. device sends an interrupt message through the interconnect that connects the device to the CPU complex (used to be through wires, but now have INT# [MSI#, message signaled interrupt])
2. based on the pins where the interrupt occurred or the INT/MSI #, we know exactly where the interrupt occurred
3. interrupt interrupts the thread that's executing on the CPU
4. if the interrupt is enabled, then the interrupt handler table is referenced based on the interrupt #; table specifies starting address of interrupt handling routines based on INT#.
5. set the program counter to the starting address of the interrupt handling routine
6. start executing handling routine

signal:
1. (ex:) thread is trying to access mem that doesn't belong to it
2. signal SIGSEGV is generated by the OS
3. if the signal is enabled, then the signal handler table is referenced based on the signal #; table specifies starting address of signal handling routines based on signal #.
4. set the program counter to the starting address of the signal handling routine
5. start executing handling routine

there's an interrupt handler table and a signal handler table that contains the starting address of each handler routine mapped to each interrupt or signal handler #
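
A toy model of the table lookup, not real kernel code: a dict stands in for the handler table, a set stands in for the mask, and calling the function stands in for "set the PC to the handler's starting address". All names are hypothetical, and the number 11 is just an example ID (SIGSEGV's number on Linux).

```python
handler_table = {}   # signal/interrupt number -> handler routine
mask = set()         # numbers that are currently disabled
pending = []         # arrived while disabled, waiting for the mask to clear
log = []


def install(number, handler):
    handler_table[number] = handler


def deliver(number):
    if number in mask:
        pending.append(number)       # stays pending until it is enabled again
        return False
    handler_table[number](number)    # look up the routine and run it
    return True


def unmask(number):
    mask.discard(number)
    for n in [p for p in pending if p == number]:
        pending.remove(n)
        deliver(n)


install(11, lambda n: log.append(f"handled {n}"))
deliver(11)      # enabled: handler runs right away
mask.add(11)
deliver(11)      # disabled: recorded as pending
unmask(11)       # re-enabled: the pending one is handled now
```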


Can each process configure their own signal handler?
Can each thread have their own signal handler?

processes: yes, processes can configure their own signal handlers [or at least, they can choose different options for how to handle the OS's signal other than the default]; for most signals, a process can install its own custom handling routine, but there are some signals that cannot be "caught"

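threads: NO, not their own handler; the handler for a signal is set per process and is shared by all of the threads in it (they all share the same code). What each thread can have is its own signal mask, so a thread can enable/disable specific signals for itself
[with ULTs, the threading lib takes care of getting the signal to a thread that has it enabled]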

interrupt handlers: NO, interrupt handlers are configured by the OS


There are several sets of experimental results from the Flash paper discussed in the lesson.
Do you understand the purpose of each set of experiments (what was the question they wanted to answer)?
Do you understand why the experiment was structured in a particular way (why they chose the variables to be varied, the workload parameters, the measured metric…)?

6.1 Synthetic Workload
-set of clients repeatedly request the same file, file size is varied in each test
-how does each server model perform at its highest capacity?
--how does each server model perform when file is cached?

6.2 Trace-based (v1, v2)
-both: how does each server model perform with more realistic workload?
-v1: each server model is tested against real [smaller-ish?] traces [with fixed dataset sizes?] (Owlnet and CS traces)
--how does each server model perform with real requests for different files [with a fixed dataset size?]?
-v2: each server model is tested against real [larger-ish?] traces and a range of dataset sizes (ECE)
--how does each server model perform with real requests for different files and different dataset sizes? [i.e. with an even more realistic workload]

6.3 Flash Performance Breakdown
-Flash is tested with requests for the same file that is cached [same as 6.1] but with all possible combinations of each of the 3 optimizations applied to it
--how impactful is each of the optimizations that were applied to Flash?

6.4 Performance under WAN conditions
-each server model is tested with the 90MB dataset size of the ECE trace and persistent LAN connections (supposed to emulate WAN connections)
--how much of an impact does the number of concurrent HTTP connections [which is more realistic because most requests are done via WAN instead of LAN] have on each server model's performance?

-counting 6.2's v1 and v2 as separate experiments gives 5 experiments: experiments 1-3 each get a little more realistic as they go on; experiment 4 specifically tests the effectiveness of each of the optimizations they applied to Flash; experiment 5 specifically tests the impact of the number of concurrent connections on server performance
-1st experiment checks that all servers are on a level playing field with cached workloads
-2nd experiment tests a slightly more realistic workload by using requests for different files but fixed dataset size (on 2 traces)
-3rd experiment tests a slightly more realistic workload again by using requests for different files and varying dataset sizes (with 1 trace)
-4th experiment specifically tests the effectiveness of the optimizations they applied to the Flash server [that were also applied to the other models in the other experiments]
-5th experiment specifically tests the effect of the number of concurrent connections on server performance


If you ran your server from the class project for two different traces:
(i) many requests for a single file,
and (ii) many random requests across a very large pool of very large files,
what do you think would happen as you add more threads to your server?
Can you sketch a hypothetical graph?

i. many requests for a single file:
I think the server's performance would increase almost linearly with the number of threads, because each new thread can serve another client request. On the server side the threads only need to read the file, so the only global state they all share and access [along with the boss thread] is the request queue.

ii: many random requests across a very large pool of very large files:
I think the server's performance would still increase with the number of threads, because again, each new thread can serve another client request. I don't think the file size should have much of an impact, except that less of each file fits in the cache; but since the files are all very large anyway, we couldn't have taken much advantage of the cache in the first place.
...after thinking about it some more: all threads share the same cache, and if requests come in for many different files of different sizes, it would be hard to keep any of the file contents in the cache. However, if the server kept just the file header info for all files (or at least the most frequently requested ones) in the cache, that could speed up the threads' ability to respond to the client with the file header. Each thread would still need to go to disk to fetch the file contents, but at least it could get the header info quickly instead of looking it up and calculating it for every new request.
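One hypothetical way to sketch the graph is a toy model in which throughput grows linearly with the number of threads until it hits a shared bottleneck. The ceiling is an assumption added here, not something stated in the answers above, and every number below is invented purely to give the two curves a shape.

```python
# Toy model only: all rates and ceilings are made-up numbers, not measurements.

def throughput(threads, per_thread_rate, ceiling):
    """Requests/sec if scaling is linear up to a shared bottleneck (assumed)."""
    return min(threads * per_thread_rate, ceiling)

def curve(per_thread_rate, ceiling, max_threads=16):
    """Return (threads, throughput) points for 1..max_threads."""
    return [(n, throughput(n, per_thread_rate, ceiling))
            for n in range(1, max_threads + 1)]

# (i) single file: assumed to be served from memory, so each thread is fast
single_file = curve(per_thread_rate=100, ceiling=800)
# (ii) large pool of large files: assumed to be limited by the disk
large_pool = curve(per_thread_rate=5, ceiling=40)

for (n, cached), (_, disk) in zip(single_file, large_pool):
    print(f"{n:2d} threads   single file: {cached:4d} req/s   large pool: {disk:3d} req/s")
```

Under these assumptions both curves rise and then flatten; the single-file curve sits much higher because nothing has to come from disk.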