
CHAPTER 1: PROCESS MANAGEMENT

Day 1

I- What is a process?

The OS manages a variety of activities, including user programs and system programs. Each of these
activities is encapsulated in a process. A process is not a program but one instance of a program in execution.
Many processes can be running the same program, but each is distinct, with its own state (e.g. several MS Word
instances). A process's state is made up of the following elements:
 The code of the running program
 Static data of the running program
 Space for dynamic data (the heap), and the heap pointer (HP)
 The process's stack, which typically contains temporary data such as subroutine parameters, return
addresses, and temporary variables
 The stack pointer (SP) used to manage the process's stack
 The current value of the Program Counter (PC), indicating the next instruction to be executed
 The contents of the other processor registers
 A set of OS resources in use (e.g. open files)
 The process's execution state (ready, running, …)
and all other information the activity needs to run.

II- Process execution state


As a process executes, its state changes as a result of process actions (e.g. system calls), OS actions
(scheduling), and external actions (interrupts). The state of a process is defined in part by the current
activity of that process. Each process may be in one of the following states:
 New state: the process is being created.
 Ready state: the process is waiting to be assigned to a processor.
 Running state: the process holds the CPU, i.e. it is actually using the CPU at that particular instant.
 Blocked (or waiting) state: the process is waiting for some event to happen (such as the completion
of an I/O operation) before it can proceed.
 Terminated state: the process has finished execution.
Figure 1: Diagram of process states.

The OS manages multiple active processes using state queues.

III- Data structures used by the OS to handle processes

a- Process table and Process Control Blocks


To implement the process model, the OS maintains a table, called the process table, with one entry per process.
Each entry represents a process in the OS and is called its process control block (PCB), or task control
block. The PCB tracks the execution state and location of each process. The OS allocates a new PCB when
each process is created, places it on a state queue, and deallocates the PCB when the process
terminates.
The PCB contains:
 The process state (running, waiting,…)
 Process number
 The contents of the Program counter
 The contents of the stack pointer
 General purpose registers, memory management information
 Owner’s name
 List of open files
 Queue pointers for state queues
 Scheduling information (e.g. priority)
 …
The OS starts executing a ready process by loading the hardware registers (PC, SP, etc.) from its PCB. While a
process is running, the CPU modifies the Program Counter (PC), stack pointer (SP), registers, etc.
When the OS stops a process, it saves the current values of these registers (PC, SP, etc.) into its PCB. This
act of switching the CPU from one process to another (stopping one and starting the next) is called a context
switch.
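
As an illustration, here is a minimal sketch of what a PCB might look like in C. The field names, sizes
and the saved-register set are hypothetical and simplified; a real OS stores much more.

    /* Hypothetical, simplified PCB; field names are illustrative only. */
    typedef enum { NEW, READY, RUNNING, BLOCKED, TERMINATED } proc_state_t;

    typedef struct pcb {
        int            pid;            /* process number                     */
        proc_state_t   state;          /* current execution state            */
        unsigned long  pc;             /* saved program counter              */
        unsigned long  sp;             /* saved stack pointer                */
        unsigned long  regs[16];       /* saved general-purpose registers    */
        void          *page_table;     /* memory-management information      */
        int            priority;       /* scheduling information             */
        int            open_files[16]; /* open-file descriptors              */
        struct pcb    *next;           /* queue pointer for the state queues */
    } pcb_t;

On a context switch, the OS saves the running process's PC, SP and registers into these fields, and
reloads them from the PCB of the process being resumed.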

b- Process state queues


The OS maintains the PCBs of all processes in state queues, placing the PCBs of all processes in the
same execution state in the same queue. When the OS changes the state of a process, its PCB is unlinked
from its current queue and moved to its new state queue. The OS can use a different policy to manage each
queue, and each I/O device has its own wait queue.
Example:
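As a hypothetical illustration (reusing the pcb_t sketch above), state queues can be kept as linked
lists of PCBs, one list per state or per device:

    pcb_t *ready_queue = NULL;     /* PCBs of all READY processes           */
    pcb_t *disk_queue  = NULL;     /* PCBs of processes waiting on the disk */

    /* Append a PCB (already unlinked from its old queue) to a state queue. */
    void enqueue(pcb_t **queue, pcb_t *p) {
        p->next = NULL;
        while (*queue != NULL)
            queue = &(*queue)->next;   /* walk to the tail of the list      */
        *queue = p;
    }
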
IV- Process creation and termination

A new process is generally created by having an existing process (running in user mode or kernel mode)
execute a process-creation system call. This call tells the operating system to create a new process and
indicates the program to run in it. The creator is called the parent and the new process is the child. On
UNIX, the corresponding system call is fork, which creates an exact clone of the calling process (same
memory image, same open files, …). To get its own memory image and run a different program, the
child process has to execute execve or a similar system call.

On Windows, in contrast, a single Win32 function named CreateProcess handles both process creation and
loading of the correct program into the new process.

In all cases, after a process is created, the parent and child have their own distinct address spaces. It is
however possible for the newly created process to share some of its creator's other resources, such as open
files. A parent can either wait for the child to complete, or continue in parallel.
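
To make this concrete, here is a minimal UNIX sketch of the fork/execve/wait pattern just described;
the program run by the child (/bin/ls) and the error handling are illustrative only:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t pid = fork();                  /* clone the calling process    */
        if (pid < 0) { perror("fork"); exit(1); }
        if (pid == 0) {                      /* child: load a new program    */
            char *argv[] = { "ls", "-l", NULL };
            char *envp[] = { NULL };
            execve("/bin/ls", argv, envp);   /* replaces the memory image    */
            perror("execve");                /* reached only if execve fails */
            exit(1);
        }
        waitpid(pid, NULL, 0);               /* parent waits for the child   */
        printf("child %d terminated\n", (int)pid);
        return 0;
    }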

One of the following conditions can cause a process to terminate:


- The process has done its work (normal exit). In this case, the exit system call (on UNIX) or the
ExitProcess function (on Windows) is used by the process to tell the OS that it is finished.
- An error occurred during execution, caused by the process itself (error exit), due to a program bug
for example.
- An error occurred during execution, due to a user action (fatal error due to incorrect parameters, for
example).
- The process has been killed by the operating system as a result of a system call executed by
another process. The call is kill on UNIX and the corresponding Win32 function is
TerminateProcess on Windows.

On process termination, the OS reclaims all resources assigned to the process.

V- Process hierarchy

In some systems, when a process creates another process, the parent and the child continue to be associated
in some way. The child process can itself create more processes, leading to a hierarchy of processes. The
way UNIX initializes itself when it starts illustrates such a structure. The order of appearance
in the hierarchy imposes a certain order among processes.

It should be noted, however, that Windows has no concept of a process hierarchy. The reason is that, when a process
is created, the parent is given a special token that it can use to control the child. But the parent is free to pass
this token to some other process, thus invalidating the hierarchy. So, unlike on UNIX, a process can disinherit its
children on Windows.
Day 2

VI- Inter-process communication (focus on shared memory systems)

Inter-process communication is the ability of two or more processes to exchange data or
information. Within a system, processes may need to communicate in order to perform a given task.
According to the arrangement of processing units and memories within the system, one distinguishes two
models that processes can use to communicate: shared memory and message exchange. The shared
memory model holds for monoprocessor systems (a single processor and a single main memory) and for
multiprocessors with a shared memory; the message exchange model is more suitable for multicomputers
(machines in which each processor has its own private memory, and there is no shared memory) and,
more generally, distributed systems. In this chapter, we focus on shared memory machines.

Communication through shared memory or shared files implies two or more processes reading or
writing the same memory or the same file, sometimes causing the contents of the shared memory to
become inconsistent, which may lead to catastrophic consequences.

Three issues should then be addressed here:

- How can one process pass information to another?


- How can we make sure two or more processes don't get in each other's way?
- How can we ensure proper sequencing when there are dependencies among processes?

To address these issues, race conditions (i.e. situations in which two or more processes operate simultaneously
on a shared memory, a shared file or any other shared resource, and in which the final result depends on who
accesses what, and when) must be avoided, through mutual exclusion (i.e. ensuring that no more than
one process operates on the shared memory at the same time). In addition, we need to ensure proper
sequencing when there are dependencies among processes. So all the solutions designed aim at ensuring
mutual exclusion and proper sequencing (synchronization) among processes communicating through shared
memory.

The part of the program in which the process accesses the shared memory area is called the critical
section or critical region.

The following 4 conditions must hold to ensure correct cooperation among processes and
efficient use of the shared memory:

1- No two processes may be simultaneously inside their critical regions


2- No assumptions may be made about speeds or the number of CPUs
3- No process running outside its critical region may block other processes
4- No process should have to wait forever to enter its critical region.

These 4 conditions will be used to evaluate any proposed solution. A solution is said to be correct if it meets
all 4 conditions.

1. Earliest solutions
 Disabling interrupts

This solution stems from the fact that race conditions generally arise as a result of preemptive process
scheduling. Preemption can be avoided by allowing the user process to disable interrupts when it
enters its critical section and re-enable them when it leaves. By doing so, one makes sure that
the running process cannot be preempted while operating on the shared memory. The solution works more or
less on monoprocessor systems, but not on multiprocessor systems. Why? (Disabling interrupts only affects
the CPU that executed the disable instruction; processes on the other CPUs can still access the shared memory.)
The other drawback of this solution is that it gives too much power to the user process, which is a poor
design choice. Imagine what happens if the user process never re-enables interrupts after leaving its critical section.

 Lock variables

An alternative software solution is to protect the shared area with a lock variable, say
initialized to 0, set to 1 by any process entering the shared area, and reset to 0 when it leaves.
So, to enter the critical section, every process should first check that the lock variable is 0. But with this solution, it
is still possible to have more than one process in its critical section at the same time, as a result of scheduling. Why? (Consider
the printing example given in class.) So something else needs to be done.
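
A minimal sketch of this (broken) idea in C; the comment marks the window in which a context
switch defeats the lock:

    int lock = 0;                  /* shared: 0 = free, 1 = busy            */

    void enter_region(void) {
        while (lock != 0)          /* lock looks busy: busy wait            */
            ;
        /* A context switch right here lets another process also see       */
        /* lock == 0; both then set it and enter their critical regions.   */
        lock = 1;
    }

    void leave_region(void) {
        lock = 0;
    }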

2. Primitives involving busy waiting


 Strict alternation

This solution tries to fix the problem encountered with lock variables by enforcing a strict alternation, so
that no more than one process is active in its critical section at the same time. Here, processes access their critical
sections in turns, so there is a variable representing whose turn it is. Any process wishing to access its
critical region checks the turn variable to see whether it may enter. If not, it enters a loop in which it
continuously tests the turn variable (busy waiting) until it is allowed to enter. Strict alternation is ensured by
having each process, on leaving, set the other process's turn. This solution ensures mutual exclusion, but its
drawback comes from the strict alternation itself. It works only when the CPU bursts are almost the same for all
the processes using the shared area and strictly alternate. If processes don't alternate strictly on the critical
region, we can have cases in which a process blocks another one while not in its critical region, thus violating
condition 3 stated above.

Example
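
The sketch below is the classic two-process version of this idea in C; critical_region() and
noncritical_region() are placeholders for the application code:

    void critical_region(void);     /* placeholder: code using shared data  */
    void noncritical_region(void);  /* placeholder: the rest of the program */

    int turn = 0;                   /* whose turn is it to enter?           */

    void process_0(void) {
        while (1) {
            while (turn != 0) ;     /* busy wait for our turn               */
            critical_region();
            turn = 1;               /* hand the turn over to process 1      */
            noncritical_region();   /* if this phase is long, process 1     */
        }                           /* stays blocked: condition 3 violated  */
    }

    void process_1(void) {
        while (1) {
            while (turn != 1) ;
            critical_region();
            turn = 0;
            noncritical_region();
        }
    }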

 Peterson’s solution

As an alternative to the problem faced with strict alternation, Peterson designed another busy-waiting-based
solution in which processes don't need to strictly alternate. His solution is described as follows: before
entering its critical region, a process calls a procedure, enter_region, to check whether it is safe to
enter; in the same way, it calls another procedure, leave_region, after leaving the region, to
allow blocked processes to enter the section if need be.

In the enter_region procedure, the process declares its interest in the critical section and, in contrast with the
strict alternation solution, every process sets its own turn. A process is allowed to enter its critical section
only if no other process declared its interest before it. A process thus enters a busy-waiting loop whenever
it sets the turn variable and discovers that another process is interested. This ensures that the
process that declared its interest first will be the first to enter its critical section. In the leave_region procedure,
the process just indicates that it is no longer interested in the shared memory.

Example with 2 processes (code written in C)
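
The standard two-process Peterson code follows (note that on modern out-of-order processors it
additionally needs memory barriers to work reliably):

    #define FALSE 0
    #define TRUE  1
    #define N     2                        /* number of processes           */

    int turn;                              /* whose turn is it?             */
    int interested[N];                     /* all values initially FALSE    */

    void enter_region(int process) {       /* process is 0 or 1             */
        int other = 1 - process;           /* number of the other process   */
        interested[process] = TRUE;        /* show that you are interested  */
        turn = process;                    /* set your own turn             */
        while (turn == process && interested[other] == TRUE)
            ;                              /* busy wait                     */
    }

    void leave_region(int process) {
        interested[process] = FALSE;       /* no longer interested          */
    }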


 Solution involving hardware (TSL and XCHG instructions)

Some more efficient busy-waiting solutions involving the hardware have been designed. The idea comes
from the problem encountered with the lock variable: to fix it, the system should allow a set of
actions (such as testing and setting the lock variable) to be performed in an indivisible manner (atomic
actions). Fortunately, some computers have an instruction like TSL register, lock (Test and Set Lock); its
equivalent on Intel x86 CPUs is XCHG register, lock. These instructions allow a number of operations
to be performed in an indivisible manner. They work as follows:

The TSL Reg, lock instruction copies the contents of the memory location 'lock' into the register
'Reg' and stores a non-zero value at the location 'lock', in an indivisible manner.

To use the TSL instruction, we will use a shared variable, lock, to coordinate access to shared memory.
When lock is 0, any process may set it to a non-zero value using the TSL instruction and then read or write
the shared memory. When it is done, the process sets lock back to 0.

The XCHG Reg, lock instruction exchanges the contents of the memory location ‘lock’ and the register
‘Reg’, in an indivisible manner.

These solutions are especially suitable for multiprocessor systems because the CPU executing the instruction
locks the memory bus while the indivisible operations are being performed. These 2 instructions have been
exploited to ensure mutual exclusion by rewriting the enter_region and leave_region procedures as follows (note
the use of assembly language in the original listings):

 Entering and leaving the critical region using TSL instruction
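
The original listing is in assembly; as a sketch of the same idea in C, GCC's
__sync_lock_test_and_set builtin performs an equivalent atomic test-and-set:

    int lock = 0;                          /* 0 = free, nonzero = busy      */

    void enter_region(void) {
        /* Atomically read the old value of lock and store 1 into it.      */
        while (__sync_lock_test_and_set(&lock, 1) != 0)
            ;                              /* it was already set: busy wait */
    }

    void leave_region(void) {
        __sync_lock_release(&lock);        /* store 0 back into lock        */
    }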

 Entering and leaving the critical region using XCHG instruction
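
Likewise, a C sketch of the XCHG variant, using the __atomic_exchange_n builtin (which compiles
to XCHG on x86):

    int lock = 0;

    void enter_region(void) {
        /* Atomically swap 1 into lock and get its old value back.         */
        while (__atomic_exchange_n(&lock, 1, __ATOMIC_ACQUIRE) != 0)
            ;                              /* busy wait                     */
    }

    void leave_region(void) {
        __atomic_store_n(&lock, 0, __ATOMIC_RELEASE);
    }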


3. More elaborate solutions

The problem with Peterson's solution and with the TSL and XCHG instructions is that they waste
CPU time 'loop testing' for changes. Furthermore, in a system with preemptive, priority-based scheduling, a
process might loop forever. Why? (Consider a low-priority process inside its critical region and a high-priority
process that becomes ready and busy-waits to enter: the scheduler always picks the high-priority process, so the
low-priority one never runs again to leave the region. This is the priority inversion problem.) This is why more
elaborate solutions have been devised.

These solutions are tested on the producer-consumer problem (or bounded buffer problem), which
exhibits both the race condition problem and the need for proper sequencing (synchronization). The
problem is stated as follows: there is a shared bounded buffer, as well as a number of processes subdivided
into producers and consumers. Producers insert items into the buffer that serve as inputs
for consumers. A solution should ensure that a producer trying to insert an item when
the buffer is full is blocked and awakened when there is space to insert, and that a consumer
trying to consume an item when the buffer is empty is blocked and awakened only when there
is an item to consume.

Some of the primitives used to solve such problems are presented as follows:

 Sleep and wakeup (let's consider a simple case with one producer and one consumer)

Here, the idea is that, if the buffer is full and the producer is trying to insert an item, it should sleep and be
awakened when there is space to insert. In the same manner, when the buffer is empty and the consumer is
trying to consume an item, it should sleep and be awakened when there is an item to consume. This can be
achieved with the use of 2 system calls, sleep and wakeup.

A solution to the producer-consumer problem using sleep and wakeup is sketched below.
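
In this C sketch, sleep, wakeup, the PRODUCER/CONSUMER identifiers and the buffer operations are
hypothetical placeholders, not real system calls:

    #define N 100                          /* number of slots in the buffer */
    int count = 0;                         /* number of items in the buffer */

    enum { PRODUCER, CONSUMER };           /* hypothetical process ids      */
    void sleep(void);                      /* hypothetical: block caller    */
    void wakeup(int pid);                  /* hypothetical: unblock process */
    int  produce_item(void);  void insert_item(int);
    int  remove_item(void);   void consume_item(int);

    void producer(void) {
        while (1) {
            int item = produce_item();
            if (count == N) sleep();          /* buffer full: go to sleep   */
            insert_item(item);
            count = count + 1;                /* unprotected shared access! */
            if (count == 1) wakeup(CONSUMER); /* buffer was empty           */
        }
    }

    void consumer(void) {
        while (1) {
            if (count == 0) sleep();          /* buffer empty: go to sleep  */
            int item = remove_item();
            count = count - 1;
            if (count == N - 1) wakeup(PRODUCER); /* buffer was full        */
            consume_item(item);
        }
    }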

Problem: there are situations in which both processes may sleep forever. Explain!

This problem is due to the loss of wakeup signals (a wakeup sent to a process that is not yet asleep
is simply lost), so something needs to be done. A viable solution is to make some actions indivisible
and to find a way to record wakeup signals. This is where semaphores come in.

 Semaphores

Semaphores are constructs used to record wakeup signals and transmit them to sleeping processes when
need be. They solve the problem faced with the sleep and wakeup solution, and are implemented
as follows:
A semaphore is an integer variable that can take any non-negative value. Two operations are
performed on it, up and down, described as follows:

- The down operation first tests whether the semaphore is 0; if so, the calling process goes to sleep without
completing the operation. If not, the process just decrements the semaphore's value and continues its work.
Checking the semaphore and sleeping, or decrementing it, is done as a single, indivisible atomic action.
- The up operation increments the value of the semaphore and, if there are processes sleeping on
that semaphore, one of them is chosen by the system and allowed to complete its down
operation. A solution to the producer-consumer problem using semaphores is presented below:
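
A C sketch of this classic solution, with down and up as described above; the semaphore type and
the buffer operations are placeholders:

    #define N 100
    typedef int semaphore;                 /* semaphores are special ints   */
    void down(semaphore *s);               /* as described above            */
    void up(semaphore *s);
    int  produce_item(void);  void insert_item(int);
    int  remove_item(void);   void consume_item(int);

    semaphore mutex = 1;                   /* guards access to the buffer   */
    semaphore empty = N;                   /* counts the empty slots        */
    semaphore full  = 0;                   /* counts the full slots         */

    void producer(void) {
        while (1) {
            int item = produce_item();
            down(&empty);                  /* one less empty slot           */
            down(&mutex);                  /* enter critical region         */
            insert_item(item);
            up(&mutex);                    /* leave critical region         */
            up(&full);                     /* one more full slot            */
        }
    }

    void consumer(void) {
        while (1) {
            down(&full);                   /* one less full slot            */
            down(&mutex);
            int item = remove_item();
            up(&mutex);
            up(&empty);                    /* one more empty slot           */
            consume_item(item);
        }
    }

Here mutex ensures mutual exclusion, while full and empty ensure the proper sequencing (synchronization).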

To have a solution appropriate for multiprocessors, it is necessary to protect each semaphore with the TSL
or XCHG instruction, so that no more than one process operates on it at the same time.

When using semaphores, a strict order of operations must be respected to avoid catastrophic situations. An
example of such a situation is obtained by reversing the order of the two 'down' operations in the producer
(taking mutex before empty), which can lead to deadlock. This makes programming with semaphores quite
constraining. This is why a less constraining primitive has been devised, the monitor, to make programming simpler.

 Monitors (devised by Brinch Hansen and Hoare)

The idea is to ease programming by making the compiler handle mutual exclusion, so that the programmer
focuses mostly on synchronization issues. In fact, a monitor is just a collection of variables, data structures
and procedures grouped together into a sort of package. The key property the compiler uses to ensure
mutual exclusion is that no more than one process can be active in the monitor at any given moment. Thus,
to ensure mutual exclusion, one just needs to write all the critical sections as monitor procedures: while a
process is executing a monitor procedure, any other process trying to enter the same monitor is
simply suspended. To ensure synchronization, monitors may contain condition variables, on which
a process operating on the monitor can block if it cannot proceed. Two operations are possible on
condition variables: wait and signal.

The wait operation blocks the caller on a given condition and releases the monitor, allowing a previously
suspended process to enter.

The signal operation allows the system scheduler to wake up one of the processes blocked on a
given condition. If no process is waiting, the signal is lost. To avoid having more than one active process in
the monitor, Hansen proposed that a process doing a signal must exit the monitor immediately, i.e. the
'signal' statement may appear only as the last statement of a monitor procedure. The problem with monitors is
that they are a programming-language construct, so not all programming languages have them, unlike
semaphores, which can easily be added to any system, since they are basically implemented with system
calls. A skeleton of the solution to the producer-consumer problem with monitors, normally written in an
imaginary language with built-in monitor support, is approximated below.
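
Since the imaginary-language listing is not reproduced here, the following is a rough C approximation,
assuming POSIX threads: a mutex plays the role of the monitor lock and pthread condition variables play
the role of the monitor's condition variables (their semantics differ slightly from Hansen's, hence the
while loops):

    #include <pthread.h>

    #define N 100
    static int buffer[N], count = 0, in = 0, out = 0;

    static pthread_mutex_t monitor  = PTHREAD_MUTEX_INITIALIZER; /* monitor lock */
    static pthread_cond_t  notfull  = PTHREAD_COND_INITIALIZER;  /* condition    */
    static pthread_cond_t  notempty = PTHREAD_COND_INITIALIZER;  /* variables    */

    /* "Monitor procedure": insert an item, blocking while the buffer is full. */
    void insert(int item) {
        pthread_mutex_lock(&monitor);           /* at most one process active */
        while (count == N)
            pthread_cond_wait(&notfull, &monitor);  /* wait releases the lock */
        buffer[in] = item;
        in = (in + 1) % N;
        count++;
        pthread_cond_signal(&notempty);         /* wake a waiting consumer    */
        pthread_mutex_unlock(&monitor);
    }

    /* "Monitor procedure": remove an item, blocking while the buffer is empty. */
    int remove_item(void) {
        pthread_mutex_lock(&monitor);
        while (count == 0)
            pthread_cond_wait(&notempty, &monitor);
        int item = buffer[out];
        out = (out + 1) % N;
        count--;
        pthread_cond_signal(&notfull);          /* wake a waiting producer    */
        pthread_mutex_unlock(&monitor);
        return item;
    }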

Day 3

VII- Process scheduling

Whenever two or more processes are simultaneously in the ready state on a multiprogrammed computer, a
choice has to be made as to which process to run next. The part of the OS making this choice is called the
process scheduler, and the algorithm it uses is called the scheduling algorithm. In this section, we
study the various scheduling algorithms that can be used, according to the type of computer system at hand.

A good scheduling algorithm should ensure good system performance as well as user satisfaction. So, for
every system, scheduling decisions should first of all be based on these 2 requirements. User satisfaction
means the scheduling decisions should meet the expectations of users. Good system performance means the
scheduling decisions should not deteriorate the system's performance, in particular by trying to keep all parts
of the system busy all the time (i.e. ensuring optimal utilization) and by avoiding too many context switches,
because they are expensive (mode switch, saving the state of the current process, saving the memory map,
running the scheduling algorithm, reloading the MMU, starting the new process, …).

To design a good scheduling algorithm, it is necessary to analyze the way processes behave. In fact, nearly
all processes alternate bursts of computing with I/O requests: typically, a process computes for a while without
stopping, then makes a system call to perform an I/O operation; when the system call completes, it
computes again until it needs more data or has to write more data, and so on. From this analysis, it can be
observed that some processes are compute-bound (i.e. they compute much between I/O requests) while others
are I/O-bound (i.e. they don't compute much between I/O requests). Compute-bound processes typically
have long CPU bursts (the duration for which the process needs to actually use the CPU) and infrequent I/O
waits, whereas I/O-bound processes have short CPU bursts and thus frequent I/O waits. This analysis is
very important and should be taken into account when designing a scheduling algorithm. In particular, we
should consider the fact that nowadays CPUs are improving faster than I/O devices, so the latter are becoming
the limiting factor, not the former as in the early days. Think about it! The two situations are illustrated in the
following figure:

1. When to schedule

CPU scheduling decisions may take place under the following four circumstances:
 When a new process is created
 When a process terminates
 When a process blocks on I/O, on a semaphore, or for some other reason
 When an interrupt occurs (from an I/O device, from the hardware clock,…).
Scheduling algorithms that take a scheduling decision at each clock interrupt, or at every kth clock interrupt, are
said to be preemptive, whereas those taking a scheduling decision only when the currently running process
blocks or voluntarily releases the CPU are said to be non-preemptive.

A preemptive scheduling algorithm picks a process and lets it run for a maximum of some fixed time
(generally measured in terms of number of clock interrupts). If the process is still running at the end of the
time interval, it is suspended and the scheduler picks another process to run.

2. Categories of scheduling algorithms

Scheduling algorithms are designed according to the application area in which the computer system at hand
will be used. So, according to the requirements of the environment, one distinguishes batch systems (in which
jobs are supplied in batches), interactive systems (in which users submit jobs and need to interact
with the system from time to time) and real-time systems (in which the execution of jobs has time as a key
parameter). The development of a scheduling algorithm should take the environment's constraints into
account. In order to satisfy the users' needs while optimizing the performance of the system, the following
classification is made, according to the environment:

 On batch systems, since there are no users impatiently waiting at their terminals for a quick
response, non-preemptive algorithms, or preemptive algorithms with a long time period for each process,
are most often acceptable.
 In an interactive environment with many impatient users, preemption is essential to keep
one process from hogging the CPU and denying service to the others. Having this in mind, what remains
is to set a good scheduling policy that preserves the performance of the system.
 In systems with real-time constraints, we should strike a balance between preemptive and
non-preemptive algorithms, according to the deadlines of the various jobs.
3. Scheduling algorithm goals

The following are the goals to consider when designing a scheduling algorithm. Some of these goals are
desirable in all cases (no matter the environment) while others depend on the environment (batch, interactive
or real-time).

a. Goals desirable in all cases:


Fairness: giving each process a fair share of the CPU
Policy enforcement: carrying out the stated policy
Balance: keeping all parts of the system busy
b. Goals to meet for batch systems:
Throughput: maximize the number of jobs executed per hour, for example
Turnaround time: minimize the time between the submission and the termination of jobs
CPU utilization: keep the CPU busy all the time
c. Goals to meet for interactive systems:
Response time: respond to requests quickly
Proportionality: meet users' expectations (don't irritate the users)
d. Goals to meet for real-time systems:
Meeting deadlines: to avoid loss of data, for example
Predictability: behave in a regular, foreseeable way

 Scheduling in batch systems


 First come first served (FCFS)

Here, processes are assigned the CPU in the order they request it. In fact, the first process to enter the ready
queue is started immediately and allowed to run as long as it wants to. If other processes come in, they are
put at the end of the queue. When the running process blocks, the first process in the queue is run next.
When a blocked process becomes ready, it is put at the end of the queue.

Strengths: easy to understand and to program; fair.

Weaknesses: does not minimize the turnaround time and does not keep all parts of the system busy at the
same time (balance).

Consider a situation in which we have one compute-bound process that runs for 1 s at a time and many I/O-
bound processes that use little CPU time but each have to perform 1000 disk reads to complete. The result is
that each I/O-bound process gets to read 1 block per second and takes 1000 seconds to finish. If instead we
preempted the compute-bound process every 10 ms, each I/O-bound process could issue its next read every
10 ms and finish in about 10 seconds, with hardly any slowdown of the compute-bound process.

 Shortest job first (SJF)

How can we minimize the turnaround time and keep all parts of the system busy at the same time? The idea
is to prioritize the I/O-bound processes. Since they generally have short CPU burst times, they will
compute quickly and start their I/O operations and, while they do, the CPU can be allocated to the next
shortest-burst-time process. By doing so, all parts of the system (CPU, I/O devices, …) may be kept busy
almost all the time. Such a scheduling algorithm, in which the CPU is allocated to processes from the
shortest-burst-time process to the longest one, is called shortest job first. The constraint here is that the run
times must be known in advance. Does it actually minimize the turnaround time? Yes! Let's consider an
example. Four processes A, B, C and D enter the ready queue in that order (i.e. A first, B second, C third and
D fourth), with run times of 8, 4, 4 and 4 minutes respectively. Applying FCFS, the turnaround time for A
is 8 minutes, for B 12 minutes, for C 16 minutes and for D 20 minutes. So the average turnaround time is
(8+12+16+20)/4 = 14 minutes.

By applying SJF (running B, C and D before A), the turnaround times become 4, 8, 12 and 20 minutes. The
average becomes 11 minutes, which is better than what we obtained with FCFS. This algorithm is in fact
optimal, as far as turnaround-time minimization is concerned. Why? (Hint: with four jobs of run times a, b, c
and d executed in that order, the mean turnaround time is (4a + 3b + 2c + d)/4; since a weighs the most in this
sum, it should be the shortest job, then b, and so on.)

It should be noted that SJF is optimal only provided that all the processes are present in the ready queue
before we start allocating the CPU. When this constraint is not respected, the algorithm is no longer
optimal. In fact, at a given moment a new process may enter the ready queue with a run
time even shorter than the remaining run time of the currently running process. In this case, the
newcomer has to wait, since SJF isn't preemptive. So in the end, the average turnaround time is no longer
minimized! What can be done?

 Shortest remaining time next (SRTN)

To fix the problem we faced with SJF, it would be better to suspend the currently running process
and allocate the CPU to the newcomer, since it is now the shortest one. This is where shortest
remaining time next comes in; it is simply a preemptive version of SJF. In SRTN, whenever a new process
enters the ready queue, its run time is compared to the remaining run time of the currently running process
and, if it is smaller, the CPU is withdrawn from the currently running process and given to the newcomer.

 Scheduling in interactive systems


 Round robin scheduling

This is one of the oldest, simplest, fairest and most widely used algorithms for interactive systems. Here,
each process is assigned a time interval, called its quantum, during which it is allowed to run. If the process
is still running at the end of the quantum, the CPU is preempted and given to another process. If the process
has blocked or finished before the quantum elapses, the CPU is given to another process. The scheduler just
needs to maintain a list of runnable processes. When a process uses up its quantum, it is put at the end of
the list.

The idea here is to satisfy all the users who are all probably impatiently waiting on their terminals. So, if
every process belongs to a particular user, round-robin is a good option to satisfy all of them.

Example:
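A hypothetical illustration: three processes A, B and C arrive at t=0, needing 6, 4 and 2 ms of CPU time
respectively, and the quantum is 2 ms. The CPU is allocated as A(0-2), B(2-4), C(4-6; C finishes),
A(6-8), B(8-10; B finishes), A(10-12; A finishes): every process gets regular service, and the short
process C leaves quickly.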

The major issue here is the length of the quantum, which can become a bottleneck for the algorithm. In fact,
setting the quantum too short causes too many process switches and lowers CPU efficiency, while setting it
too long may cause poor response to short interactive requests. So a compromise must be found.

 Priority scheduling

In round-robin scheduling, processes are assumed to be equally important, i.e. all users get the same
privilege. But user satisfaction isn't that simple. There are environments (governments, schools,
companies, a PC with foreground and background processes, …) in which some processes are, and should be,
more prioritized than others. A good scheduling algorithm should also take these parameters into account.
This is where priority scheduling comes in. Here, each process is assigned a priority, and the runnable
process with the highest priority is allowed to run first.

On some typical priority scheduling systems, each process is assigned a maximum time quantum during which
it is allowed to run. When this quantum is used up, the next highest-priority process is given a chance to run. To
prevent high-priority processes from hogging the CPU, the scheduler may decrease the priority of the
currently running process at each clock interrupt. Priorities can be assigned statically (i.e. fixed in advance)
or dynamically (i.e. modified at run time). Dynamic priorities are used to achieve certain system
goals (prioritizing I/O-bound processes, for example).

It is often convenient to group processes into priority classes and use priority scheduling among the classes
and round-robin scheduling within each class. But this may lead to starvation.

Example:
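A hypothetical illustration: with four priority classes, the scheduler runs, round-robin, the processes of
class 4 (the highest) as long as that class is non-empty; only when class 4 is empty does it run class 3,
and so on. If priorities are never adjusted, a steady stream of class-4 processes keeps the processes of
classes 1 to 3 waiting forever: starvation.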

 Multiple queues priority scheduling

With priority classes, we still need to find a compromise that improves system performance while satisfying
users (based on priorities). In fact, users owning highly interactive processes are the most likely to be
impatiently waiting at their terminals, so they should get more priority. To satisfy them, the following
analysis is performed:

- If moving on to the next priority class is conditioned solely by the completion of the more prioritized
classes, starvation results (hence poor response times);
- Setting too short a quantum hurts system performance (due to context switches);
- Since long-CPU-burst processes are more likely not to be interactive, it is more efficient to keep
them from hogging the CPU, so that short interactive processes do not starve.

To address these issues, the idea is to find a way to detect compute-bound processes and to prevent highly
interactive processes from starving. Since highly interactive processes are more likely to be short, I/O-bound
processes, they can be prioritized by observing the behavior of the processes in every class at every round and
taking the appropriate decisions: if a process doesn't block or complete before its quantum elapses, it is more
likely to be a compute-bound process, so it is moved down to the next lower priority class but
allowed to run for more quanta there (so that it is not penalized forever). The same is done down to the
last (least prioritized) class. By doing so, we increase the performance of the system while trying at the same
time to satisfy all the users (by privileging the more prioritized and highly interactive processes). One of the
earliest priority schedulers to implement this mechanism was in CTSS, the M.I.T. Compatible Time-Sharing
System that ran on the IBM 7094.

 Other algorithms: shortest process next, lottery scheduling, fair share scheduling…

 Scheduling in real time systems

In both hard and soft real-time systems, the processes' behavior is predictable and known in advance. These
processes typically respond to external events that can be periodic (happening at regular intervals of
time referred to as periods) or aperiodic (happening in an unpredictable way). When an external event is
detected, it is the job of the scheduler to schedule the processes in such a way that all deadlines are met. For
periodic event streams, before even looking for a scheduling algorithm, it is possible to know whether the set
of processes is schedulable at all, i.e. whether all the deadlines can be met. For example, if there are m processes
responding to m periodic events, and event i occurs with period pi and requires Ci seconds of CPU time to
handle each occurrence, then the set of processes is schedulable if

C1/p1 + C2/p2 + … + Cm/pm ≤ 1,

where Ci/pi represents the fraction of the CPU used by process i.
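
For example (hypothetical numbers): three periodic processes with C1 = 25 ms, p1 = 50 ms, C2 = 10 ms,
p2 = 100 ms, and C3 = 30 ms, p3 = 200 ms use 0.50 + 0.10 + 0.15 = 0.75 of the CPU, so the set is schedulable.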

Real-time scheduling algorithms can be static or dynamic. Static scheduling algorithms make their
scheduling decisions before the system starts running, whereas dynamic algorithms make scheduling
decisions at run time. Static scheduling only works when there is perfect information available in advance
about the work to be done and the deadlines to be met; dynamic algorithms don't have this
restriction. On a typical real-time system, multiple processes compete for the CPU, each with its own work
and deadlines.

Case study: Rate Monotonic Scheduling (RMS); Earliest Deadline first (EDF)

 Rate Monotonic Scheduling is a classical static-priority scheduling algorithm for preemptable periodic
processes. For RMS to work, processes should meet the following conditions:
- Each periodic process must complete within its period
- No process is dependent on any other process
- Each process needs the same amount of CPU time on each burst
- Any non-periodic processes have no deadlines
- Process preemption occurs instantaneously and with no overhead

The last condition isn't realistic, but it makes modeling the system much easier. The algorithm works
as follows: each process is assigned a fixed priority equal to the frequency of occurrence of its triggering
event, which is given by the number of times the process must run per second. For example, a process A
that must run every 30 ms (33 times/second) gets priority 33, a process B that must run every 40 ms (25
times/second) gets priority 25, and a process C that must run every 50 ms (20 times/second) gets
priority 20. The priority is thus linear with the rate; this is why the algorithm is called rate monotonic. At run
time, the scheduler always runs the highest-priority ready process, preempting the running process if
need be. The figure below shows how RMS works with our example of three processes A, B and C given
above. A, B and C have static priorities 33, 25 and 20 respectively. This means that whenever A needs to
run, it runs, preempting any other process currently using the CPU. Process B can preempt C, but not A.
Process C has to wait until the CPU is otherwise idle in order to run. As additional information,
process A runs for 10 ms on each burst, B runs for 15 ms on each burst, and C runs for 5 ms on each
burst.

Fig: Rate Monotonic Scheduling for 3 periodic processes A, B and C; Earliest Deadline First
scheduling for the same three processes A, B and C.

On the figure, all three processes are initially ready to run. The highest-priority one, A, is chosen and allowed
to run until it completes at 10 ms, as shown on the RMS line. After it finishes, B and C are run, in that order.
Together, these processes take 30 ms to run, so when C finishes, it is time for A to run again. This rotation
goes on until the end, the system being idle between t=70 and t=80, between t=110 and t=120, … In this
example, all processes meet their deadlines. But there are cases in which some deadlines can be missed:
consider the case in which process A needs 15 ms of CPU time per burst instead of 10 ms, and redo the
example. Are all the deadlines met? No! That is why an additional condition has been devised to make sure
that RMS is guaranteed to work. The condition, due to Liu and Layland, is that the total CPU utilization satisfies

C1/p1 + C2/p2 + … + Cm/pm ≤ m(2^(1/m) − 1),

a bound that tends to ln 2 ≈ 0.693 as the number of processes m grows.
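
Checking our example: 10/30 + 15/40 + 5/50 ≈ 0.333 + 0.375 + 0.100 = 0.808, while the bound for m = 3 is
3(2^(1/3) − 1) ≈ 0.780. The test fails, yet RMS happens to meet all deadlines here: the condition is
sufficient, not necessary. With 15 ms bursts for A, the utilization rises to 0.5 + 0.375 + 0.1 = 0.975, still
under 1 (so EDF still works, as seen below) but well above the RMS bound.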

 Earliest Deadline First is a dynamic-priority scheduling algorithm for preemptable processes. It is not
as restrictive as RMS: it doesn't require processes to be periodic, nor does it require the same
run time per CPU burst. The algorithm works as follows: whenever a process needs CPU time, it
announces its presence and its deadline. The scheduler keeps a list of runnable processes, sorted on
deadlines, and runs the first process on the list, which is the one with the closest deadline.
Whenever a new process becomes ready, the system checks whether its deadline occurs before that of the
currently running process; if so, the new process preempts the current one. Running EDF on
our previous example with three processes A, B and C, the deadline of each process is taken to
be the arrival of the next occurrence of its triggering event. So initially, all three processes are ready
and are run in the order of their deadlines: A must finish by t=30, B must finish by t=40, and C must
finish by t=50, so A has the earliest deadline and thus goes first. Up until t=90, the choices are the same
as with RMS. At t=90, A becomes ready again, and its deadline is t=120, the same as B's deadline; but
considering the cost of a context switch, it is better to let B continue to run. It should be noted that RMS
and EDF don't always give the same result. For instance, running EDF on the case in which
process A needs 15 ms of CPU time on each burst instead of 10 ms, all the deadlines are still met, which
was not the case with RMS. It can be proved that EDF always works for any schedulable set of processes
and can achieve 100% CPU utilization. Its major drawback is simply that the algorithm is more
complex.

Questions:

- Policy vs. mechanism: what are the implications for designing a scheduling algorithm?
