Chapter 5 Flashcards
In a system with a single CPU core
only one process can run at a time. Others must wait until the CPU's core is free and can be rescheduled.
The objective of multiprogramming is
to have some process running at all times, to maximize CPU utilization. The idea is relatively simple. A process is executed until it must wait, typically for the completion of some I/O request. In a simple computer system, the CPU then just sits idle. All this waiting time is wasted; no useful work is accomplished. With multiprogramming, we try to use this time productively. Several processes are kept in memory at one time. When one process has to wait, the operating system takes the CPU away from that process and gives the CPU to another process. This pattern continues. Every time one process has to wait, another process can take over use of the CPU. On a multicore system, this concept of keeping the CPU busy is extended to all processing cores on the system.
Scheduling of this kind is a fundamental operating-system function.
Almost all computer resources are scheduled before use. The CPU is, of course, one of the primary computer resources. Thus, its scheduling is central to operating-system design.
Process execution consists of a
cycle of CPU execution and I/O wait. Processes alternate between these two states. Process execution begins with a CPU burst.
That is followed by an I/O burst, which is followed by another CPU burst, then another I/O burst, and so on. Eventually, the final CPU burst ends with a system request to terminate execution (Figure 5.1).
The durations of CPU bursts have been measured extensively. Although they vary greatly from process to process and from computer to computer, they tend to have a frequency curve similar to that shown in Figure 5.2. The curve is generally characterized as exponential or hyperexponential, with a large number of short CPU bursts and a small number of long CPU bursts.
An I/O-bound program typically has many short CPU bursts. A CPU-bound program might have a few long CPU bursts. This distribution can be important when implementing a CPU-scheduling algorithm.
CPU Scheduler
Whenever the CPU becomes idle, the operating system must select one of the processes in the ready queue to be executed. The selection process is carried out by the CPU scheduler, which selects a process from the processes in memory that are ready to execute and allocates the CPU to that process.
CPU-scheduling decisions may take place under the following four circumstances:
1. When a process switches from the running state to the waiting state (for example, as the result of an I/O request or an invocation of wait() for the termination of a child process)
2. When a process switches from the running state to the ready state (for example, when an interrupt occurs)
3. When a process switches from the waiting state to the ready state (for example, at completion of I/O)
4. When a process terminates
For situations 1 and 4, there is no choice in terms of scheduling. A new process (if one exists in the ready queue) must be selected for execution. There is a choice, however, for situations 2 and 3.
scheduling scheme is nonpreemptive or cooperative when
- When a process switches from the running state to the waiting state (for example, as the result of an I/O request or an invocation of wait() for the termination of a child process)
- When a process terminates
When scheduling takes place only under circumstances 1 and 4, we say that the scheduling scheme is nonpreemptive or cooperative. Otherwise, it is preemptive. Under nonpreemptive scheduling, once the CPU has been allocated to a process, the process keeps the CPU until it releases it either by terminating or by switching to the waiting state. Virtually all modern operating systems, including Windows, macOS, Linux, and UNIX, use preemptive scheduling algorithms.
Preemptive scheduling
If scheduling can also take place under circumstances 2 and 3, the scheme is preemptive. Under nonpreemptive scheduling, once the CPU has been allocated to a process, the process keeps the CPU until it releases it either by terminating or by switching to the waiting state. Virtually all modern operating systems, including Windows, macOS, Linux, and UNIX, use preemptive scheduling algorithms.
Unfortunately, preemptive scheduling can result in race conditions when data are shared among several processes. Consider the case of two processes that share data. While one process is updating the data, it is preempted so that the second process can run. The second process then tries to read the data, which are in an inconsistent state. This issue will be explored in detail in Chapter 6.
Preemption also affects the design of the operating-system kernel. During the processing of a system call, the kernel may be busy with an activity on behalf of a process. Such activities may involve changing important kernel data (for instance, I/O queues). What happens if the process is preempted in the middle of these changes and the kernel (or the device driver) needs to read or modify the same structure? Chaos ensues. As will be discussed in Section 6.2, operating-system kernels can be designed as either nonpreemptive or preemptive. A nonpreemptive kernel waits for a system call to complete, or for a process to block while waiting for I/O, before doing a context switch. This scheme ensures that the kernel structure is simple, since the kernel will not preempt a process while the kernel data structures are in an inconsistent state. Unfortunately, this kernel-execution model is a poor one for supporting real-time computing, where tasks must complete execution within a given time frame. In Section 5.6, we explore scheduling demands of real-time systems. A preemptive kernel requires mechanisms such as mutex locks to prevent race conditions when accessing shared kernel data structures.
Most modern operating systems are now fully preemptive when running in kernel mode.
Because interrupts can, by definition, occur at any time, and because they cannot always be ignored by the kernel, the sections of code affected by interrupts must be guarded from simultaneous use.
The operating system needs to accept interrupts at almost all times; otherwise, input might be lost or output overwritten. So that these sections of code are not accessed concurrently by several processes, they disable interrupts at entry and reenable interrupts at exit. It is important to note that sections of code that disable interrupts do not occur very often and typically contain few instructions.
Dispatcher
Another component involved in the CPU-scheduling function is the dispatcher.
The dispatcher is the module that gives control of the CPU’s core to the process selected by the CPU scheduler. This function involves the following:
• Switching context from one process to another
• Switching to user mode
• Jumping to the proper location in the user program to resume that program
The dispatcher should be as fast as possible, since it is invoked during every context switch. The time it takes for the dispatcher to stop one process and start another running is known as the dispatch latency.
voluntary and nonvoluntary context switches
A voluntary context switch occurs when a process has given up control of the CPU because it requires a resource that is currently unavailable (such as blocking for I/O). A nonvoluntary context switch occurs when the CPU has been taken away from a process, such as when its time slice has expired or it has been preempted by a higher-priority process.
CPU scheduling algorithms criteria
• CPU utilization. We want to keep the CPU as busy as possible. Conceptually, CPU utilization can range from 0 to 100 percent. In a real system, it should range from 40 percent (for a lightly loaded system) to 90 percent (for a heavily loaded system). (CPU utilization can be obtained by using the top command on Linux, macOS, and UNIX systems.)
• Throughput. If the CPU is busy executing processes, then work is being done. One measure of work is the number of processes that are completed per time unit, called throughput. For long processes, this rate may be one process over several seconds; for short transactions, it may be tens of processes per second.
• Turnaround time. From the point of view of a particular process, the important criterion is how long it takes to execute that process. The interval from the time of submission of a process to the time of completion is the turnaround time. Turnaround time is the sum of the periods spent waiting in the ready queue, executing on the CPU, and doing I/O.
• Waiting time. The CPU-scheduling algorithm does not affect the amount of time during which a process executes or does I/O. It affects only the amount of time that a process spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue.
• Response time. In an interactive system, turnaround time may not be the best criterion. Often, a process can produce some output fairly early and can continue computing new results while previous results are being output to the user. Thus, another measure is the time from the submission of a request until the first response is produced. This measure, called response time, is the time it takes to start responding, not the time it takes to output the response.
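Turnaround and waiting time fall out directly from a finished schedule. A minimal sketch, using hypothetical arrival, burst, and completion times for three processes run in FCFS order:

```python
# Compute turnaround and waiting time per process from a completed schedule.
# All process names and times below are hypothetical.
arrival    = {"P1": 0, "P2": 1, "P3": 2}
burst      = {"P1": 8, "P2": 4, "P3": 1}
completion = {"P1": 8, "P2": 12, "P3": 13}   # FCFS order P1, P2, P3

# Turnaround = completion - submission; waiting = turnaround - CPU time used.
turnaround = {p: completion[p] - arrival[p] for p in arrival}
waiting    = {p: turnaround[p] - burst[p] for p in arrival}

print(turnaround)  # {'P1': 8, 'P2': 11, 'P3': 11}
print(waiting)     # {'P1': 0, 'P2': 7, 'P3': 10}
```

Note that these processes do no I/O here, so waiting time is simply turnaround time minus the burst length.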
It is desirable to maximize CPU utilization and throughput and to minimize turnaround time, waiting time, and response time. In most cases, we optimize the average measure. However, under some circumstances, we prefer to optimize the minimum or maximum values rather than the average. For example, to guarantee that all users get good service, we may want to minimize the maximum response time.
Investigators have suggested that, for interactive systems (such as a PC desktop or laptop system), it is more important to minimize the variance in the response time than to minimize the average response time. A system with reasonable and predictable response time may be considered more desirable than a system that is faster on the average but is highly variable.
First-Come, First-Served Scheduling
By far the simplest CPU-scheduling algorithm is the first-come, first-served (FCFS) scheduling algorithm. With this scheme, the process that requests the CPU first is allocated the CPU first. The implementation of the FCFS policy is easily managed with a FIFO queue. When a process enters the ready queue, its PCB is linked onto the tail of the queue. When the CPU is free, it is allocated to the process at the head of the queue. The running process is then removed from the queue. The code for FCFS scheduling is simple to write and understand.
On the negative side, the average waiting time under the FCFS policy is often quite long.
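To see why, here is a minimal sketch that computes FCFS waiting times for processes that all arrive at time 0; the burst values are hypothetical, chosen to show how arrival order alone drives the average:

```python
def fcfs_waiting_times(bursts):
    """Waiting time of each process; all arrive at time 0 in list order."""
    waits, elapsed = [], 0
    for burst in bursts:
        waits.append(elapsed)   # a process waits for everything ahead of it
        elapsed += burst
    return waits

# One long burst ahead of two short ones: average wait is (0+24+27)/3 = 17.
print(fcfs_waiting_times([24, 3, 3]))   # [0, 24, 27]
# Same work, short bursts first: average wait drops to (0+3+6)/3 = 3.
print(fcfs_waiting_times([3, 3, 24]))   # [0, 3, 6]
```

The same set of bursts yields a very different average wait depending purely on the order of arrival.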
In addition, consider the performance of FCFS scheduling in a dynamic situation. Assume we have one CPU-bound process and many I/O-bound processes. As the processes flow around the system, the following scenario may result. The CPU-bound process will get and hold the CPU. During this time, all the other processes will finish their I/O and will move into the ready queue, waiting for the CPU. While the processes wait in the ready queue, the I/O devices are idle. Eventually, the CPU-bound process finishes its CPU burst and moves to an I/O device. All the I/O-bound processes, which have short CPU bursts, execute quickly and move back to the I/O queues. At this point, the CPU sits idle. The CPU-bound process will then move back to the ready queue and be allocated the CPU. Again, all the I/O processes end up waiting in the ready queue until the CPU-bound process is done. There is a convoy effect as all the other processes wait for the one big process to get off the CPU. This effect results in lower CPU and device utilization than might be possible if the shorter processes were allowed to go first.
Note also that the FCFS scheduling algorithm is nonpreemptive. Once the CPU has been allocated to a process, that process keeps the CPU until it releases the CPU, either by terminating or by requesting I/O. The FCFS algorithm is thus particularly troublesome for interactive systems, where it is important that each process get a share of the CPU at regular intervals. It would be disastrous to allow one process to keep the CPU for an extended period.
Shortest-Job-First Scheduling
A different approach to CPU scheduling is the shortest-job-first (SJF) scheduling algorithm. This algorithm associates with each process the length of the process's next CPU burst. When the CPU is available, it is assigned to the process that has the smallest next CPU burst. If the next CPU bursts of two processes are the same, FCFS scheduling is used to break the tie. Note that a more appropriate term for this scheduling method would be the shortest-next-CPU-burst algorithm, because scheduling depends on the length of the next CPU burst of a process, rather than its total length. We use the term SJF because most people and textbooks use this term to refer to this type of scheduling. The SJF scheduling algorithm is provably optimal, in that it gives the minimum average waiting time for a given set of processes. Moving a short process before a long one decreases the waiting time of the short process more than it increases the waiting time of the long process. Consequently, the average waiting time decreases.
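Nonpreemptive SJF for a set of processes that all arrive at time 0 can be sketched as follows (the burst lengths are hypothetical):

```python
def sjf_waiting_times(bursts):
    """Nonpreemptive SJF waiting times; all processes arrive at time 0."""
    # Run the shortest burst first; sorted() is stable, so equal bursts
    # fall back to FCFS (list) order, as the text describes.
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    waits, elapsed = [0] * len(bursts), 0
    for i in order:
        waits[i] = elapsed          # time spent waiting before first dispatch
        elapsed += bursts[i]
    return waits

bursts = [6, 8, 7, 3]               # hypothetical next-CPU-burst lengths
waits = sjf_waiting_times(bursts)
print(waits)                        # [3, 16, 9, 0]
print(sum(waits) / len(waits))      # 7.0
```

Scheduling the same bursts in FCFS order would give an average wait of (0+6+14+21)/4 = 10.25, illustrating why SJF is optimal on this criterion.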
Although the SJF algorithm is optimal, it cannot be implemented at the level of CPU scheduling, as there is no way to know the length of the next CPU burst.
One approach to this problem is to try to approximate SJF scheduling. We may not know the length of the next CPU burst, but we may be able to predict its value. We expect that the next CPU burst will be similar in length to the previous ones. By computing an approximation of the length of the next CPU burst, we can pick the process with the shortest predicted CPU burst. The next burst is commonly predicted as an exponential average of the measured lengths of previous bursts. The SJF algorithm can be either preemptive or nonpreemptive. The choice arises when a new process arrives at the ready queue while a previous process is still executing. The next CPU burst of the newly arrived process may be shorter than what is left of the currently executing process. A preemptive SJF algorithm will preempt the currently executing process, whereas a nonpreemptive SJF algorithm will allow the currently running process to finish its CPU burst. Preemptive SJF scheduling is sometimes called shortest-remaining-time-first scheduling.
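The exponential average updates a prediction tau as tau_next = alpha * t + (1 - alpha) * tau, where t is the most recent measured burst and 0 <= alpha <= 1 weights recent history against older history. A minimal sketch (alpha and the initial guess tau0 are illustrative values, not prescribed ones):

```python
def predict_next_burst(history, alpha=0.5, tau0=10.0):
    """Predict the next CPU burst by exponential averaging.

    history: measured lengths of past CPU bursts, oldest first.
    alpha weights the latest measurement; tau0 is the initial guess
    (both chosen here purely for illustration).
    """
    tau = tau0
    for t in history:
        tau = alpha * t + (1 - alpha) * tau   # tau_{n+1} = a*t_n + (1-a)*tau_n
    return tau

print(predict_next_burst([6]))      # 0.5*6 + 0.5*10 = 8.0
print(predict_next_burst([6, 4]))   # 0.5*4 + 0.5*8  = 6.0
```

With alpha = 0, recent history is ignored entirely; with alpha = 1, only the last burst matters.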
Round-Robin Scheduling
The round-robin (RR) scheduling algorithm is similar to FCFS scheduling, but preemption is added to enable the system to switch between processes. A small unit of time, called a time quantum or time slice, is defined. A time quantum is generally from 10 to 100 milliseconds in length. The ready queue is treated as a circular queue. The CPU scheduler goes around the ready queue, allocating the CPU to each process for a time interval of up to 1 time quantum.
To implement RR scheduling, we again treat the ready queue as a FIFO queue of processes. New processes are added to the tail of the ready queue.
The CPU scheduler picks the first process from the ready queue, sets a timer to interrupt after 1 time quantum, and dispatches the process.
One of two things will then happen. The process may have a CPU burst of less than 1 time quantum. In this case, the process itself will release the CPU voluntarily. The scheduler will then proceed to the next process in the ready queue. If the CPU burst of the currently running process is longer than 1 time quantum, the timer will go off and will cause an interrupt to the operating system. A context switch will be executed, and the process will be put at the tail of the ready queue. The CPU scheduler will then select the next process in the ready queue.
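The dispatch loop just described can be sketched as a small simulation; it assumes all processes arrive at time 0 and ignores context-switch overhead, and the burst values below are hypothetical:

```python
from collections import deque

def rr_waiting_times(bursts, quantum):
    """Round-robin waiting times; all processes arrive at time 0."""
    remaining = list(bursts)
    finish = [0] * len(bursts)
    ready = deque(range(len(bursts)))   # FIFO ready queue of process indices
    clock = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])   # run at most one quantum
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)                # timer expired: back to the tail
        else:
            finish[i] = clock              # burst shorter than the quantum
    # With arrival at time 0, turnaround equals finish time, so
    # waiting time is turnaround minus the burst itself.
    return [finish[i] - bursts[i] for i in range(len(bursts))]

print(rr_waiting_times([24, 3, 3], quantum=4))   # [6, 4, 7]
```

The average wait here is 17/3, roughly 5.67 time units, versus 17 if the same bursts ran in FCFS order.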
The average waiting time under the RR policy is often long.
The performance of the RR algorithm depends heavily on the size of the time quantum. At one extreme, if the time quantum is extremely large, the RR policy is the same as the FCFS policy. In contrast, if the time quantum is extremely small (say, 1 millisecond), the RR approach can result in a large number of context switches. Assume, for example, that we have only one process of 10 time units. If the quantum is 12 time units, the process finishes in less than 1 time quantum, with no overhead. If the quantum is 6 time units, however, the process requires 2 quanta, resulting in a context switch. If the time quantum is 1 time unit, then nine context switches will occur, slowing the execution of the process accordingly (Figure 5.5).
Thus, we want the time quantum to be large with respect to the context-switch time. If the context-switch time is approximately 10 percent of the time quantum, then about 10 percent of the CPU time will be spent in context switching. In practice, most modern systems have time quanta ranging from 10 to 100 milliseconds. The time required for a context switch is typically less than 10 microseconds; thus, the context-switch time is a small fraction of the time quantum.
Turnaround time also depends on the size of the time quantum. As we can see from Figure 5.6, the average turnaround time of a set of processes does not necessarily improve as the time-quantum size increases. In general, the average turnaround time can be improved if most processes finish their next CPU burst in a single time quantum. For example, given three processes of 10 time units each and a quantum of 1 time unit, the average turnaround time is 29. If the time quantum is 10, however, the average turnaround time drops to 20. If context-switch time is added in, the average turnaround time increases even more for a smaller time quantum, since more context switches are required.
Although the time quantum should be large compared with the context-switch time, it should not be too large. As we pointed out earlier, if the time quantum is too large, RR scheduling degenerates to an FCFS policy. A rule of thumb is that 80 percent of the CPU bursts should be shorter than the time quantum.
Priority Scheduling
The SJF algorithm is a special case of the general priority-scheduling algorithm.
A priority is associated with each process, and the CPU is allocated to the process with the highest priority. Equal-priority processes are scheduled in FCFS order. An SJF algorithm is simply a priority algorithm where the priority (p) is the inverse of the (predicted) next CPU burst. The larger the CPU burst, the lower the priority, and vice versa.
Note that we discuss scheduling in terms of high priority and low priority.
Priorities are generally indicated by some fixed range of numbers, such as 0 to 7 or 0 to 4,095. However, there is no general agreement on whether 0 is the highest or lowest priority. Some systems use low numbers to represent low priority; others use low numbers for high priority. This difference can lead to confusion. In this text, we assume that low numbers represent high priority.
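Under that convention (low number = high priority), nonpreemptive priority scheduling for processes that all arrive at time 0 reduces to a stable sort; the process names, bursts, and priority values below are hypothetical:

```python
def priority_order(procs):
    """Execution order under nonpreemptive priority scheduling.

    procs: list of (name, burst, priority); a LOWER number means a
    HIGHER priority. Python's sorted() is stable, so processes with
    equal priority keep FCFS (list) order, breaking ties as described.
    """
    return [name for name, _burst, prio in sorted(procs, key=lambda p: p[2])]

procs = [("P1", 10, 3), ("P2", 1, 1), ("P3", 2, 4), ("P4", 1, 5), ("P5", 5, 2)]
print(priority_order(procs))   # ['P2', 'P5', 'P1', 'P3', 'P4']
```

A real scheduler would of course maintain a priority queue incrementally rather than re-sorting, but the selection rule is the same.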
Priorities can be defined either internally or externally.
Internally defined priorities use some measurable quantity or quantities to compute the priority of a process. For example, time limits, memory requirements, the number of open files, and the ratio of average I/O burst to average CPU burst have been used in computing priorities. External priorities are set by criteria outside the operating system, such as the importance of the process, the type and amount of funds being paid for computer use, the department sponsoring the work, and other, often political, factors.