
Operating System

Chapter 1

Presented By:- Dr. Sanjeev Sharma


What is an Operating System
• It is a control program that provides an interface between the
computer hardware and the user.
• Part of this interface includes tools and services for the user.
• From Silberschatz (page 3): “An operating system is a program that acts as an intermediary between a user of a computer and the computer hardware. The purpose of the OS is to provide an environment in which the user can execute programs.
• The primary goal of an OS is thus to make the computer convenient to use.
• A secondary goal is to use the computer hardware in an efficient manner.”
• Ability to evolve: An OS should be constructed in such a way as to permit the effective development, testing, and introduction of new system functions without interfering with service.
Abstract view of the computer System
• Computer Hardware – CPU, memory, I/O devices provide
basic computing resources.
• System and Application Programs – Compilers, database
systems, games, business programs, etc. define the ways the
computing resources are used to solve the user’s problems.
• Operating System – Controls and coordinates the computing
resources among the system and application programs for the
users.
• End User – Views the computer system as a set of
applications. The End User is generally not concerned with
various details of the hardware.
• Programmer – Uses languages, utilities (frequently used functions) and OS services (linkers, assemblers, etc.) to develop applications. This reduces complexity by abstracting the details of machine-dependent calls into APIs and various utilities and OS services.

• OS – Masks the hardware details from the programmer and provides an interface to the system. It manages the computer’s resources. The OS designer has to be familiar with both user requirements and hardware details.
Functions of Operating System

• Memory Management
• Processor Management
• Device Management
• File Management
• Security
• Control over system performance
• Job accounting
• Error detecting aids
• Coordination between other software and users
• Memory management refers to management of Primary
Memory or Main Memory.
• Main memory is a large array of words or bytes where each
word or byte has its own address.
• Main memory provides a fast storage that can be accessed
directly by the CPU.
• For a program to be executed, it must be in main memory.
• An Operating System does the following activities for memory
management:
– Keeps track of primary memory, i.e., which parts are in use, by whom, and which parts are not in use.
– In multiprogramming, the OS decides which process will get
memory when and how much.
– Allocates the memory when a process requests it to do so.
– De-allocates the memory when a process no longer needs it or
has been terminated.
Processor Management

• In multiprogramming environment, the OS decides which


process gets the processor when and for how much time. This
function is called process scheduling. An Operating System
does the following activities for processor management:
– Keeps track of the processor and the status of processes. The program responsible for this task is known as the traffic controller.
– Allocates the processor (CPU) to a process.
– De-allocates the processor when a process no longer requires it.
– Schedules processes and threads on the CPUs.
– Provides mechanisms for process synchronization and process communication.
Device Management
• An Operating System manages device communication via their
respective drivers. It does the following activities for device
management:
– Keeps track of all devices. The program responsible for this task is known as the I/O controller.
– Decides which process gets the device when and for how much
time.
– Allocates the device in the most efficient way.
– De-allocates devices.
File Management
A file is a collection of related information defined by its creator. Commonly,
files represent programs (both source and object forms) and data. Data files
may be numeric, alphabetic, alphanumeric, or binary.
• A file system is normally organized into directories for easy navigation and usage. These directories may contain files and other directories.
• An Operating System does the following activities for file
management:
– Keeps track of information, location, uses, status, etc. The collective facilities are often known as the file system.
– Decides who gets the resources.
– Allocates the resources.
– De-allocates the resources.
– Supports primitives for manipulating files and directories.
– Maps files onto secondary storage.
– Backs up files on stable (nonvolatile) storage media.
Other Important Activities
• Security - By means of password and similar other techniques,
it prevents unauthorized access to programs and data.
• Control over system performance - Recording delays
between request for a service and response from the system.
• Job accounting -- Keeping track of time and resources used
by various jobs and users.
• Error detecting aids - Production of dumps, traces, error
messages, and other debugging and error detecting aids.
• Coordination between other software and users -
Coordination and assignment of compilers, interpreters,
assemblers and other software to the various users of the
computer systems.
The user may need access to a device that is restricted, for example. Operating systems provide various methods to allow privilege escalation.
Operating System

Presented By:- Dr. Sanjeev Sharma


Operating system Evolution
• Let’s see how operating systems have evolved over time.
• This will help us to identify some common features of
operating systems and how and why these systems have been
developed as they are.
• Serial Processing
• Simple Batch Systems (1960)
• Multiprogrammed Batch Systems (1970)
• Time-Sharing and Real-Time Systems (1970)
• Personal/Desktop Systems (1980)
• Multiprocessor Systems (1980)
• Networked/Distributed Systems (1980)
Early System
• Structure
– Single user system.
– Large machines run from console.
– Programmer/User as operator.
– Paper Tape or Punched cards.
– No tapes/disks in computer.

• Significant amount of setup time.


• Low CPU utilization.
• But very secure
Batch Operating System
• The users of a batch operating system do not interact with the
computer directly. Each user prepares his job on an off-line
device like punch cards and submits it to the computer
operator. To speed up processing, jobs with similar needs are
batched together and run as a group. The programmers leave
their programs with the operator and the operator then sorts the
programs with similar requirements into batches.
• The problems with Batch Systems are as follows:
– Lack of interaction between the user and the job.
– CPU is often idle, because the speed of the mechanical I/O
devices is slower than the CPU.
– Difficult to provide the desired priority.

• Here, first the pooled jobs are read and executed by the batch monitor, and then these jobs are grouped, placing the identical jobs (jobs with similar needs) in the same batch. So, in the batch processing system, the batched jobs were executed automatically one after another, saving time by performing activities (like loading the compiler) only once. This resulted in improved system utilization due to reduced turnaround time.
• The operating system (called the resident monitor) manages the execution of each program in the batch.
– Monitor utilities are loaded when needed.
– The resident monitor is always in main memory and available for execution.
– The resident monitor usually has the following parts.
• Control card interpreter – responsible for reading and
carrying out instructions on the cards.
• Loader – loads systems programs and applications
programs into memory.
• Device drivers – know special characteristics and
properties for each of the system’s I/O devices.
• One big problem associated with these operating systems was that the CPU was often idle.
• To overcome this, spooling can be used.
• Uniprogramming Until Now
– I/O operations are exceedingly slow (compared to instruction
execution).
– A program containing even a very small number of I/O
operations will spend most of its time waiting for them.
– Hence: poor CPU usage when only one program is present in
memory.
Memory model for uniprogramming
Multiprogrammed Batch Systems

• Several jobs are kept in main memory at the same time, and
the CPU is multiplexed among them.
• If memory can hold several programs, then CPU can switch to
another one whenever a program is waiting for an I/O to
complete – This is multiprogramming.
Time-sharing Operating Systems

• Time-sharing is a technique which enables many people,


located at various terminals, to use a particular computer
system at the same time.
• Time-sharing or multitasking is a logical extension of
multiprogramming.
• The processor’s time, which is shared among multiple users simultaneously, is termed time-sharing.
• The main difference between Multiprogrammed Batch
Systems and Time-Sharing Systems is that in case of
Multiprogrammed batch systems, the objective is to maximize
processor use, whereas in Time-Sharing Systems, the objective
is to minimize response time.
• Multiple jobs are executed by the CPU by switching between them, but the switches occur so frequently that the user can receive an immediate response. For example, in transaction processing, the processor executes each user program in a short burst or quantum of computation. That is, if n users are present, then each user gets a time quantum. When a user submits a command, the response time is a few seconds at most.
• Advantages of Timesharing operating systems are as follows:
– Provides the advantage of quick response
– Avoids duplication of software
– Reduces CPU idle time

• Disadvantages of Time-sharing operating systems are as


follows:
– Problem of reliability
– Question of security and integrity of user programs and data
– Problem of data communication
Distributed Operating System

• Distribute the computation among several physically separated


processors.
• Loosely coupled system – each processor has its own local
memory; processors communicate with one another through
various communications lines, such as high-speed buses or
telephone lines.
• These processors are referred to as sites, nodes, computers, and so on.

• The advantages of distributed systems are as follows:


– With resource sharing facility, a user at one site may be able to
use the resources available at another.
– Speeds up the exchange of data with one another via electronic mail.
– If one site fails in a distributed system, the remaining sites can
potentially continue operating.
– Better service to the customers.
– Reduction of the load on the host computer.
– Reduction of delays in data processing.
Network Operating System

• A Network Operating System runs on a server and provides


the server the capability to manage data, users, groups,
security, applications, and other networking functions. The
primary purpose of the network operating system is to allow
shared file and printer access among multiple computers in a network, typically a local area network (LAN), a private network, or other networks.

• Examples of network operating systems include Microsoft
Windows Server 2003, Microsoft Windows Server 2008,
UNIX, Linux, Mac OS X, Novell NetWare

• The advantages of network operating systems are as follows:


– Centralized servers are highly stable.
– Security is server managed.
– Upgrades to new technologies and hardware can be easily
integrated into the system.
– Remote access to servers is possible from different locations and
types of systems.

• The disadvantages of network operating systems are as follows:


– High cost of buying and running a server.
– Dependency on a central location for most operations.
– Regular maintenance and updates are required.
Real-Time Operating System

• A real-time system is defined as a data processing system in which the time


interval required to process and respond to inputs is so small that it controls
the environment. The time taken by the system to respond to an input and
display of required updated information is termed the response time. In this method, the response time is much shorter than in online processing.

• Real-time systems are used when there are rigid time requirements on the
operation of a processor or the flow of data and real-time systems can be
used as a control device in a dedicated application. A real-time operating
system must have well-defined, fixed time constraints, otherwise the
system will fail. For example, Scientific experiments, medical imaging
systems, industrial control systems, weapon systems, robots, air traffic
control systems, etc.
• There are two types of real-time operating systems.
• Hard real-time systems
Hard real-time systems guarantee that critical tasks complete on time. In
hard real-time systems, secondary storage is limited or missing and the data
is stored in ROM. In these systems, virtual memory is almost never found.

• Soft real-time systems


• Soft real-time systems are less restrictive. A critical real-time task gets
priority over other tasks and retains the priority until it completes. Soft real-
time systems have more limited utility than hard real-time systems. For example,
multimedia, virtual reality, Advanced Scientific Projects like undersea
exploration and planetary rovers, etc.
Operating System
Concurrent Process and Scheduling
Process Concept
• A process is a program in execution. A process is not the same as the program code; it is much more than that. A process is an 'active' entity, as opposed to a program, which is considered a 'passive' entity. Attributes held by a process include hardware state, memory, CPU, etc.
• To put it in simple terms, we write our computer programs in a
text file and when we execute this program, it becomes a
process which performs all the tasks mentioned in the program.

• When a program is loaded into the memory and it becomes a


process, it can be divided into four sections ─ stack, heap, text
and data.
Process Section
• Stack:- The process Stack contains the temporary data such as
method/function parameters, return address and local
variables.
• Heap:-This is dynamically allocated memory to a process
during its run time.
• Text:- This section is made up of the compiled program code, read in from non-volatile storage when the program is launched.
• Data:-This section contains the global and static variables.
Process Section
Process State
• When a process executes, it passes through different states. These states may differ between operating systems, and their names are also not standardized.
• In general, a process can have one of the following five states at a time.

• New:- This is the initial state when a process is first started/created.

• Ready:- The process is waiting to be assigned to a processor. Ready


processes are waiting to have the processor allocated to them by the
operating system so that they can run. A process may come into this state after the Start state, or while running if it is interrupted by the scheduler so that the CPU can be assigned to some other process.
• Running:-Once the process has been assigned to a processor by the OS
scheduler, the process state is set to running and the processor executes its
instructions.
• Waiting
• Process moves into the waiting state if it needs to wait for a
resource, such as waiting for user input, or waiting for a file to
become available.

• Terminated or Exit
• Once the process finishes its execution, or it is terminated by
the operating system, it is moved to the terminated state where
it waits to be removed from main memory.
Diagram of Process State
Process Control Block (PCB)
A Process Control Block is a data structure maintained by the
Operating System for every process. The PCB is identified by an
integer process ID (PID). A PCB keeps all the information needed to
keep track of a process
Information associated with each process
• Process state:- The current state of the process, i.e., whether it is ready, running, waiting, etc.
• Program counter:- Program Counter is a pointer to the address of
the next instruction to be executed for this process.
• CPU registers:- The contents of the various CPU registers, which must be saved when the process leaves the running state and restored when it resumes execution.
• CPU scheduling information:- Process priority and other
scheduling information which is required to schedule the process.
• Memory-management information:- This includes information such as the page table, memory limits, and segment table, depending on the memory scheme used by the operating system.

• Accounting information:- This includes the amount of CPU time used for process execution, time limits, execution ID, etc.

• I/O status information:-This includes a list of I/O devices


allocated to the process.
• The PCB is maintained for a process throughout its lifetime,
and is deleted once the process terminates.
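The fields above can be pictured as a C structure. This is only an illustrative sketch (field names and sizes are assumptions), not the layout of any real kernel's PCB:

typedef enum { NEW, READY, RUNNING, WAITING, TERMINATED } proc_state_t;

typedef struct pcb {
    int            pid;              /* integer process ID                      */
    proc_state_t   state;            /* current process state                   */
    unsigned long  program_counter;  /* address of the next instruction         */
    unsigned long  registers[16];    /* saved CPU registers (size illustrative) */
    int            priority;         /* CPU-scheduling information              */
    void          *page_table;       /* memory-management information           */
    unsigned long  cpu_time_used;    /* accounting information                  */
    int            open_files[16];   /* I/O status: files/devices allocated     */
    struct pcb    *next;             /* link for the scheduling queue it is on  */
} pcb_t;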
Process Control Block
CPU Switch from Process to Process
Process Scheduling Queues
• The OS maintains all PCBs in Process Scheduling Queues. The OS
maintains a separate queue for each of the process states and PCBs
of all processes in the same execution state are placed in the same
queue. When the state of a process is changed, its PCB is unlinked
from its current queue and moved to its new state queue.
• The Operating System maintains the following important process
scheduling queues −
• Job queue − This queue keeps all the processes in the system.
• Ready queue − This queue keeps a set of all processes residing in
main memory, ready and waiting to execute. A new process is
always put in this queue.
• Device queues − The processes which are blocked due to
unavailability of an I/O device constitute this queue.
• Schedulers are special system software which handle process
scheduling in various ways. Their main task is to select the
jobs to be submitted into the system and to decide which
process to run. Schedulers are of three types −
• Long-Term Scheduler
• Short-Term Scheduler
• Medium-Term Scheduler
Long Term Scheduler
• It is also called a job scheduler. A long-term scheduler
determines which programs are admitted to the system for
processing. It selects processes from the queue and loads them
into memory for execution. Process loads into the memory for
CPU scheduling.
• The primary objective of the job scheduler is to provide a
balanced mix of jobs, such as I/O bound and processor bound.
It also controls the degree of multiprogramming. If the degree
of multiprogramming is stable, then the average rate of process
creation must be equal to the average departure rate of
processes leaving the system.
Short Term Scheduler
• It is also called the CPU scheduler. Its main objective is to increase system performance in accordance with the chosen set of criteria. It carries out the change of a process from the ready state to the running state. The CPU scheduler selects a process from among the processes that are ready to execute and allocates the CPU to it.

• The short-term scheduler, sometimes also called the dispatcher, makes the decision of which process to execute next. Short-term schedulers are faster than long-term schedulers.
Medium Term Scheduler
• Medium-term scheduling is a part of swapping. It removes processes from memory and thereby reduces the degree of multiprogramming. The medium-term scheduler is in charge of handling the swapped-out processes.
• A running process may become suspended if it makes an I/O
request. A suspended process cannot make any progress
towards completion. In this condition, to remove the process
from memory and make space for other processes, the
suspended process is moved to the secondary storage. This
process is called swapping, and the process is said to be
swapped out or rolled out. Swapping may be necessary to
improve the process mix.
Ready Queue vs I/O queue
Representation of Process Scheduling
Action of Medium Term Scheduler
Context Switch
• When CPU switches to another process, the system must save
the state of the old process and load the saved state for the
new process

• Context-switch time is overhead; the system does no useful


work while switching

• Time dependent on hardware support


Inter Process Communication
• A process can be of two types:
• Independent process.
• Co-operating process.
• An independent process is not affected by the execution of other
processes while a co-operating process can be affected by other
executing processes.
• Though one might think that processes running independently execute very efficiently, in practice there are many situations where their co-operative nature can be utilised to increase computational speed, convenience and modularity.
• Inter process communication (IPC) is a mechanism which allows processes to communicate with each other and synchronize their actions.
The communication between these processes can be seen as a
method of co-operation between them.
• There are numerous reasons for providing an environment or situation
which allows process co-operation:
• Information sharing: Since a number of users may be interested in the same piece of information (for example, a shared file), an environment must be provided that allows concurrent access to that information.
• Computation speedup: If you want a particular task to run faster, you must break it into sub-tasks, each of which will execute in parallel with the others. Note that such a speed-up can be attained only when the computer has multiple processing elements, such as CPUs or I/O channels.
• Modularity: You may want to build the system in a modular way by
dividing the system functions into separate processes or threads.
• Convenience: Even a single user may work on many tasks at a time. For
example, a user may be editing, formatting, printing, and compiling in
parallel.
• To work together, multiple processes require an inter process communication (IPC) method which allows them to exchange data and other information. There are two primary models of inter process communication:
– shared memory and
– message passing.
• In the shared-memory model, a region of memory shared by the cooperating processes is established. Processes are then able to exchange information by reading and writing data to the shared region. In the message-passing model, communication takes place by way of messages exchanged among the cooperating processes.
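As a concrete illustration of the message-passing model, here is a minimal sketch using a POSIX pipe between a parent and a child process (assuming a Unix-like system; the message text is arbitrary):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    char buf[32];

    if (pipe(fd) == -1) return 1;        /* fd[0] = read end, fd[1] = write end */

    if (fork() == 0) {                   /* child acts as the sender   */
        close(fd[0]);
        write(fd[1], "hello", 6);        /* send a message             */
        close(fd[1]);
    } else {                             /* parent acts as the receiver */
        close(fd[1]);
        read(fd[0], buf, sizeof buf);    /* receive the message        */
        printf("received: %s\n", buf);
        close(fd[0]);
    }
    return 0;
}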
Process Synchronization
• Concurrent access to shared data may result in data
inconsistency

• Maintaining data consistency requires mechanisms to ensure


the orderly execution of cooperating processes

• Suppose that we wanted to provide a solution to the consumer-


producer problem that fills all the buffers. We can do so by
having an integer counter that keeps track of the number of full
buffers. Initially, counter is set to 0. It is incremented by the
producer after it produces a new buffer and is decremented by
the consumer after it consumes a buffer.
Producer- Consumer Problem solution
using Counter Variable
Code For Producer Process

while (true) {
    /* produce an item and put it in nextProduced */
    while (counter == BUFFER_SIZE)
        ;   // do nothing: buffer is full
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}
Code for Consumer Process

while (true) {
    while (counter == 0)
        ;   // do nothing: buffer is empty
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    /* consume the item in nextConsumed */
}
Race Condition
• A race condition is a special condition that may occur inside a
critical section. A critical section is a section of code that is executed
by multiple threads and where the sequence of execution for the
threads makes a difference in the result of the concurrent execution
of the critical section.
• When the result of multiple threads executing a critical section may
differ depending on the sequence in which the threads execute, the
critical section is said to contain a race condition. The term race
condition stems from the metaphor that the threads are racing
through the critical section, and that the result of that race impacts
the result of executing the critical section.
• This may all sound a bit complicated, so I will elaborate more on
race conditions and critical sections in the following sections.
• To prevent race conditions from occurring you must make sure
that the critical section is executed as an atomic instruction.
That means that once a single thread is executing it, no other
threads can execute it until the first thread has left the critical
section.

• Race conditions can be avoided by proper thread


synchronization in critical sections.
Race Condition
• counter++ could be implemented as
register1 = counter
register1 = register1 + 1
counter = register1

• counter-- could be implemented as


register2 = counter
register2 = register2 - 1
counter = register2
Consider this execution interleaving with “counter= 5” initially:
S0: producer execute register1 = counter {register1 = 5}
S1: producer execute register1 = register1 + 1 {register1 = 6}
S2: consumer execute register2 = counter {register2 = 5}
S3: consumer execute register2 = register2 - 1 {register2 = 4}
S4: producer execute counter = register1 {counter = 6}
S5: consumer execute counter = register2 {counter = 4}
Critical Section
• A critical section is a region of code in which a process uses a
variable (which may be an object or some other data structure)
that is shared with another process (e.g. the “code” that read,
modified, and wrote an account balance in the example you
did.)
• Problems can arise if two processes are in critical sections
accessing the same variable at the same time.
• The critical section problem refers to the problem of how to
ensure that at most one process is executing its critical section
at a given time.
Solution to Critical-Section Problem
1. Mutual Exclusion - If process Pi is executing in its critical
section, then no other processes can be executing in their
critical sections
2. Progress - If no process is executing in its critical section and
there exist some processes that wish to enter their critical
section, then the selection of the processes that will enter the
critical section next cannot be postponed indefinitely
3. Bounded Waiting - A bound must exist on the number of
times that other processes are allowed to enter their critical
sections after a process has made a request to enter its critical
section and before that request is granted
 Assume that each process executes at a nonzero speed
 No assumption concerning relative speed of the N processes
Peterson Solution
• Peterson's Solution is a classic software-based solution to the critical
section problem.
• Peterson's solution is based on two processes, P0 and P1, which
alternate between their critical sections and remainder sections. For
convenience of discussion, "this" process is Pi, and the "other"
process is Pj. ( I.e. j = 1 - i )
• Peterson's solution requires two shared data items:
• int turn - Indicates whose turn it is to enter into the critical section.
If turn = = i, then process i is allowed into their critical section.
• boolean flag[ 2 ] - Indicates when a process wants to enter into their
critical section. When process i wants to enter their critical section,
it sets flag[ i ] to true.
Peterson’s Solution for Process i
• In the entry section, process i first raises a flag indicating a desire to
enter the critical section.

• Then turn is set to j to allow the other process to enter their critical
section if process j so desires.

• The while loop is a busy loop ( notice the semicolon at the end ),
which makes process i wait as long as process j has the turn and
wants to enter the critical section.

• Process i lowers the flag[ i ] in the exit section, allowing process j to


continue if it has been waiting.
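A minimal sketch of Peterson's algorithm for process i follows, here written for two POSIX threads protecting a shared counter. The thread bodies and iteration count are illustrative assumptions; note also that on modern hardware the pure algorithm additionally needs memory fences, which this sketch omits.

#include <pthread.h>
#include <stdio.h>
#include <stdbool.h>

static volatile int turn;            /* whose turn it is to enter          */
static volatile bool flag[2];        /* flag[i]: process i wants to enter  */
static int counter = 0;              /* shared data protected by the lock  */

static void *worker(void *arg) {
    int i = *(int *)arg, j = 1 - i;
    for (int k = 0; k < 100000; k++) {
        flag[i] = true;              /* entry section: announce intent     */
        turn = j;                    /* let the other process go first     */
        while (flag[j] && turn == j)
            ;                        /* busy wait                          */
        counter++;                   /* critical section                   */
        flag[i] = false;             /* exit section                       */
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    int id0 = 0, id1 = 1;
    pthread_create(&t0, NULL, worker, &id0);
    pthread_create(&t1, NULL, worker, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %d\n", counter);   /* 200000 if mutual exclusion held */
    return 0;
}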
• To prove that the solution is correct, we must examine the three conditions
listed above:
– Mutual exclusion - If one process is executing their critical section when the other
wishes to do so, the second process will become blocked by the flag of the first
process. If both processes attempt to enter at the same time, the last process to
execute "turn = j" will be blocked.
– Progress - Each process can only be blocked at the while if the other process wants
to use the critical section ( flag[ j ] = = true ), AND it is the other process's turn to
use the critical section ( turn = = j ). If both of those conditions are true, then the
other process ( j ) will be allowed to enter the critical section, and upon exiting the
critical section, will set flag[ j ] to false, releasing process i. The shared variable turn
assures that only one process at a time can be blocked, and the flag variable allows
one process to release the other when exiting their critical section.
– Bounded Waiting - As each process enters their entry section, they set the turn
variable to be the other process's turn. Since no process ever sets it back to their
own turn, this ensures that each process will have to let the other process go first at
most one time before it becomes their turn again.
• Note that the instruction "turn = j" is atomic, that is, a single machine instruction which cannot be interrupted.
Semaphore
• In 1965, Dijkstra proposed a new and very significant technique for
managing concurrent processes by using the value of a simple integer
variable to synchronize the progress of interacting processes. This integer
variable is called a semaphore. It is basically a synchronizing tool and is accessed only through two standard atomic operations, wait and signal, designated by P() and V() respectively.
• Two standard operations, wait and signal are defined on the semaphore.
Entry to the critical section is controlled by the wait operation and exit
from a critical region is taken care by signal operation.
• The manipulation of a semaphore (S) takes place as follows:
• The wait operation P(S) decrements the semaphore value by 1.
• The signal operation V(S) increments the semaphore value by 1.
• Mutual exclusion on the semaphore is enforced within P(S) and V(S). If a number of processes attempt P(S) simultaneously, only one process will be allowed to proceed and the other processes will be waiting.
Wait and Signal function
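A sketch of the classical busy-waiting definitions, in the same pseudocode style as the rest of these notes (S stands for the integer semaphore value, and both operations must themselves execute atomically):

wait(S) {              /* P(S): entry to the critical section            */
    while (S <= 0)
        ;              /* busy wait until the value becomes positive     */
    S--;
}

signal(S) {            /* V(S): exit from the critical section           */
    S++;
}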
• In practice, semaphores can take on one of two forms:
• Binary semaphores can take on one of two values, 0 or 1. They can
be used to solve the critical section problem as described above, and
are sometimes known as mutexes, because they provide mutual
exclusion.
• Counting semaphores can take on any integer value, and are
usually used to count the number remaining of some limited
resource. The counter is initialized to the number of such resources
available in the system, and whenever the counting semaphore is
greater than zero, then a process can enter a critical section and use
one of the resources. When the counter gets to zero ( or negative in
some implementations ), then the process blocks until another
process frees up a resource and increments the counting semaphore
with a signal call. ( The binary semaphore can be seen as just a
special case where the number of resources initially available is just
one. )
Semaphore Implementation
• Must guarantee that no two processes can execute wait () and
signal () on the same semaphore at the same time

• Thus, implementation becomes the critical section problem


where the wait and signal code are placed in the critical section.
– Could now have busy waiting in critical section implementation
• But implementation code is short
• Little busy waiting if critical section rarely occupied

• Note that applications may spend lots of time in critical


sections and therefore this is not a good solution.
Semaphore Implementation with no Busy waiting

• With each semaphore there is an associated waiting queue.


Each entry in a waiting queue has two data items:
– value (of type integer)
– pointer to next record in the list

• Two operations:
– block – place the process invoking the operation on the
appropriate waiting queue.
– wakeup – remove one of processes in the waiting queue and
place it in the ready queue.
Semaphore Implementation with no Busy waiting (Cont.)

• Implementation of wait:

wait(S) {
    value--;
    if (value < 0) {
        add this process to the waiting queue;
        block();
    }
}

• Implementation of signal:

signal(S) {
    value++;
    if (value <= 0) {
        remove a process P from the waiting queue;
        wakeup(P);
    }
}
Deadlock and Starvation
• Deadlock – two or more processes are waiting indefinitely for
an event that can be caused by only one of the waiting
processes
• Let S and Q be two semaphores initialized to 1
P0:                 P1:
wait(S);            wait(Q);
wait(Q);            wait(S);
...                 ...
signal(S);          signal(Q);
signal(Q);          signal(S);
• Starvation – indefinite blocking. A process may never be
removed from the semaphore queue in which it is suspended.
Classical Problems of Synchronization

• Bounded-Buffer Problem
• Readers and Writers Problem
• Dining-Philosophers Problem
Bounded-Buffer Problem
• This is a generalization of the producer-consumer problem
wherein access is controlled to a shared group of buffers of a
limited size. In this solution, the two counting semaphores
"full" and "empty" keep track of the current number of full and
empty buffers respectively ( and initialized to 0 and N
respectively. )
• The binary semaphore mutex controls access to the critical
section.
• The producer and consumer processes are nearly identical - one can think of the producer as producing full buffers, and the consumer as producing empty buffers.
• Semaphore mutex initialized to the value 1.
• Semaphore full initialized to the value 0.
• Semaphore empty initialized to the value N.
Bounded Buffer Problem (Cont.)
Bounded Buffer Problem (Cont.)
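A sketch of the producer and consumer using the semaphores described above (empty initialized to N, full to 0, mutex to 1); the item type and the produce/consume helpers are assumed for illustration:

/* shared data */
item buffer[N];
int in = 0, out = 0;

/* Producer */
while (true) {
    item next = produce_item();      /* produce an item (assumed helper)  */
    wait(empty);                     /* wait for an empty slot            */
    wait(mutex);                     /* enter critical section            */
    buffer[in] = next;
    in = (in + 1) % N;
    signal(mutex);                   /* leave critical section            */
    signal(full);                    /* one more full slot                */
}

/* Consumer */
while (true) {
    wait(full);                      /* wait for a full slot              */
    wait(mutex);
    item next = buffer[out];
    out = (out + 1) % N;
    signal(mutex);
    signal(empty);                   /* one more empty slot               */
    consume_item(next);              /* consume the item (assumed helper) */
}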
Readers-Writers Problem
• In the readers-writers problem there are some processes ( termed readers ) who only
read the shared data, and never change it, and there are other processes ( termed
writers ) who may change the data in addition to or instead of reading it. There is no
limit to how many readers can access the data simultaneously, but when a writer
accesses the data, it needs exclusive access.
• There are several variations to the readers-writers problem, most centered around
relative priorities of readers versus writers. The first readers-writers problem gives
priority to readers. In this problem, if a reader wants access to the data, and there is
not already a writer accessing it, then access is granted to the reader. A solution to
this problem can lead to starvation of the writers, as there could always be more
readers coming along to access the data. ( A steady stream of readers will jump
ahead of waiting writers as long as there is currently already another reader
accessing the data, because the writer is forced to wait until the data is idle, which
may never happen if there are enough readers. )
• The second readers-writers problem gives priority to the writers. In this problem,
when a writer wants access to the data it jumps to the head of the queue - All
waiting readers are blocked, and the writer gets access to the data as soon as it
becomes available. In this solution the readers may be starved by a steady stream of
writers.
Readers-Writers Problem (Cont.)
• The following code is an example of the first readers-writers
problem, and involves an important counter and two binary
semaphores: readcount is used by the reader processes, to count the
number of readers currently accessing the data.
• mutex is a semaphore used only by the readers for controlled access
to readcount.
• rw_mutex is a semaphore used to block and release the writers. The
first reader to access the data will set this lock and the last reader to
exit will release it; The remaining readers do not touch rw_mutex. (
Eighth edition called this variable wrt. )
• Note that the first reader to come along will block on rw_mutex if
there is currently a writer accessing the data, and that all following
readers will only block on mutex for their turn to increment
readcount.
Readers-Writers Problem (Cont.)
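A sketch of the first readers-writers solution described above, using rw_mutex (initialized to 1), mutex (initialized to 1) and readcount (initialized to 0):

/* Writer */
while (true) {
    wait(rw_mutex);              /* request exclusive access to the data */
    /* ... write the shared data ... */
    signal(rw_mutex);
}

/* Reader */
while (true) {
    wait(mutex);                 /* protect readcount                    */
    readcount++;
    if (readcount == 1)
        wait(rw_mutex);          /* first reader locks out the writers   */
    signal(mutex);

    /* ... read the shared data ... */

    wait(mutex);
    readcount--;
    if (readcount == 0)
        signal(rw_mutex);        /* last reader lets the writers back in */
    signal(mutex);
}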
Dining-Philosophers Problem
• The dining philosophers problem is a classic synchronization
problem involving the allocation of limited resources amongst a
group of processes in a deadlock-free and starvation-free manner:
Consider five philosophers sitting around a table, in which there are
five chopsticks evenly distributed and an endless bowl of rice in the
center, as shown in the diagram below. ( There is exactly one
chopstick between each pair of dining philosophers. )
• These philosophers spend their lives alternating between two
activities: eating and thinking.
• When it is time for a philosopher to eat, it must first acquire two
chopsticks - one from their left and one from their right.
• When a philosopher thinks, it puts down both chopsticks in their
original locations.
Dining-Philosophers Problem (Cont.)

• One possible solution, as shown in the following code section,


is to use a set of five semaphores ( chopsticks[ 5 ] ), and to
have each hungry philosopher first wait on their left chopstick
( chopsticks[ i ] ), and then wait on their right chopstick (
chopsticks[ ( i + 1 ) % 5 ] )
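A sketch of that semaphore solution, with all five entries of chopsticks[] initialized to 1:

semaphore chopsticks[5];                 /* each initialized to 1   */

/* Structure of philosopher i */
while (true) {
    wait(chopsticks[i]);                 /* pick up left chopstick   */
    wait(chopsticks[(i + 1) % 5]);       /* pick up right chopstick  */
    /* ... eat ... */
    signal(chopsticks[i]);               /* put down left chopstick  */
    signal(chopsticks[(i + 1) % 5]);     /* put down right chopstick */
    /* ... think ... */
}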
• But suppose that all five philosophers get hungry at the same
time, and each starts by picking up their left chopstick. They
then look for their right chopstick, but because it is
unavailable, they wait for it, forever, and eventually all the
philosophers starve due to the resulting deadlock.
• Some potential solutions to the problem include: Only allow
four philosophers to dine at the same time. ( Limited
simultaneous processes. )
• Allow philosophers to pick up chopsticks only when both are
available, in a critical section. ( All or nothing allocation of
critical resources. )
• Use an asymmetric solution, in which odd philosophers pick
up their left chopstick first and even philosophers pick up their
right chopstick first. ( Will this solution always work? What if
there are an even number of philosophers? )
CPU Scheduling
• CPU scheduling is the process of allowing one process to use the CPU while the execution of another process is on hold (in the waiting state) due to the unavailability of some resource such as I/O, thereby making full use of the CPU. The aim of CPU scheduling is to make the system efficient, fast and fair.
• Whenever the CPU becomes idle, the operating system must
select one of the processes in the ready queue to be executed.
The selection process is carried out by the short-term scheduler
(or CPU scheduler). The scheduler selects from among the
processes in memory that are ready to execute, and allocates
the CPU to one of them.
CPU-I/O Burst Cycle

• Almost all processes alternate between two states in a


continuing cycle, as shown in Figure 5.1 below :
– A CPU burst of performing calculations, and
– An I/O burst, waiting for data transfer in or out of the system.
• Whenever the CPU becomes idle, it is the job of the CPU
Scheduler ( a.k.a. the short-term scheduler ) to select another
process from the ready queue to run next. The storage structure
for the ready queue and the algorithm used to select the next
process are not necessarily a FIFO queue. There are several
alternatives to choose from, as well as numerous adjustable
parameters for each algorithm
• CPU scheduling decisions take place under one of four conditions:
– When a process switches from the running state to the waiting state,
such as for an I/O request or invocation of the wait( ) system call.
– When a process switches from the running state to the ready state, for
example in response to an interrupt.
– When a process switches from the waiting state to the ready state, say
at completion of I/O or a return from wait( ).
– When a process terminates.
• For conditions 1 and 4 there is no choice - A new process must be
selected. For conditions 2 and 3 there is a choice - To either continue
running the current process, or select a different one. If scheduling
takes place only under conditions 1 and 4, the system is said to be
non-preemptive. Under these conditions, once a process starts
running it keeps running, until it either voluntarily blocks or until it
finishes. Otherwise the system is said to be preemptive.
• Dispatcher
• The dispatcher is the module that gives control of the CPU to
the process selected by the scheduler. This function involves:
– Switching context.
– Switching to user mode.
– Jumping to the proper location in the newly loaded program.
• The dispatcher needs to be as fast as possible, as it is run on
every context switch. The time consumed by the dispatcher is
known as dispatch latency.
Scheduling Criteria

• There are several different criteria to consider when trying to select


the "best" scheduling algorithm for a particular situation and
environment, including:
– CPU utilization - Ideally the CPU would be busy 100% of the time, so
as to waste 0 CPU cycles. On a real system CPU usage should range
from 40% ( lightly loaded ) to 90% ( heavily loaded. )
– Throughput - Number of processes completed per unit time. May
range from 10 / second to 1 / hour depending on the specific processes.
– Turnaround time - Time required for a particular process to complete,
from submission time to completion.
– Waiting time - How much time processes spend in the ready queue
waiting their turn to get on the CPU.
– Response time - The time taken in an interactive program from the issuance of a command to the commencement of a response to that command.
Scheduling Algorithms

• First Come First Serve (FCFS)


• Shortest Job First Scheduling Algorithm (SJF)
• Priority Scheduling
• Round Robin Scheduling
• Multilevel Queue Scheduling
• Multilevel Feedback Queue Scheduling Algorithm
First Come First Serve
• FCFS is very simple - Just a FIFO queue, like customers
waiting in line at the bank or the post office or at a copying
machine.
• Unfortunately, however, FCFS can yield some very long
average wait times, particularly if the first process to get there
takes a long time. For example, consider the following three
processes:
Process Burst Time

P1 24

P2 3

P3 3
• In the first Gantt chart below, process P1 arrives first. The
average waiting time for the three processes is ( 0 + 24 + 27 ) /
3 = 17.0 ms.
• In the second Gantt chart below, the same three processes have
an average wait time of ( 0 + 3 + 6 ) / 3 = 3.0 ms. The total run
time for the three bursts is the same, but in the second case two
of the three finish much quicker, and the other process is only
delayed by a short amount.
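The arithmetic can be replayed with a small runnable C sketch; here for the first ordering (P1, P2, P3 with bursts 24, 3, 3, all arriving at time 0). Reordering the array to {3, 3, 24} reproduces the 3.0 ms figure of the second chart.

#include <stdio.h>

int main(void) {
    int burst[] = {24, 3, 3};            /* P1, P2, P3 in arrival order          */
    int n = 3, start = 0, total_wait = 0;

    for (int i = 0; i < n; i++) {
        total_wait += start;             /* process i waits for all earlier bursts */
        start += burst[i];
    }
    printf("average waiting time = %.1f ms\n", (double)total_wait / n);  /* 17.0 */
    return 0;
}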
• FCFS can also block the system in a busy dynamic system in
another way, known as the convoy effect. When one CPU
intensive process blocks the CPU, a number of I/O intensive
processes can get backed up behind it, leaving the I/O devices
idle. When the CPU hog finally relinquishes the CPU, then the
I/O processes pass through the CPU quickly, leaving the CPU
idle while everyone queues up for I/O, and then the cycle
repeats itself when the CPU intensive process gets back to the
ready queue.
Shortest-Job-First Scheduling, SJF

• The idea behind the SJF algorithm is to pick the quickest, smallest job that needs to be done, get it out of the way first, and then pick the next smallest job to do next.
• ( Technically this algorithm picks a process based on the next
shortest CPU burst, not the overall process time. )
• For example, the Gantt chart below is based upon the
following CPU burst times, ( and the assumption that all jobs
arrive at the same time. )
Process Burst Time
P1 6
P2 8
P3 7
P4 3
• In the case above the average wait time is ( 0 + 3 + 9 + 16 ) / 4
= 7.0 ms, ( as opposed to 10.25 ms for FCFS for the same
processes. )
• SJF can be proven to be the fastest scheduling algorithm, but it
suffers from one important problem: How do you know how
long the next CPU burst is going to be? For long-term batch
jobs this can be done based upon the limits that users set for
their jobs when they submit them, which encourages them to
set low limits, but risks their having to re-submit the job if they
set the limit too low. However that does not work for short-
term CPU scheduling on an interactive system.
• Another option would be to statistically measure the run time
characteristics of jobs, particularly if the same tasks are run
repeatedly and predictably. But once again that really isn't a
viable option for short term CPU scheduling in the real world.
• A more practical approach is to predict the length of the next burst,
based on some historical measurement of recent burst times for this
process. One simple, fast, and relatively accurate method is the
exponential average, which can be defined as follows. ( The book
uses tau and t for their variables, but those are hard to distinguish
from one another and don't work well in HTML. )
• estimate[ i + 1 ] = alpha * burst[ i ] + ( 1.0 - alpha ) * estimate[ i ]
• In this scheme the previous estimate contains the history of all
previous times, and alpha serves as a weighting factor for the
relative importance of recent data versus past history. If alpha is 1.0,
then past history is ignored, and we assume the next burst will be the
same length as the last burst. If alpha is 0.0, then all measured burst
times are ignored, and we just assume a constant burst time. Most
commonly alpha is set at 0.5, as illustrated in Figure 5.3:
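A short runnable sketch of this exponential average; the initial estimate of 10 ms and the sample burst values are assumed values for illustration:

#include <stdio.h>

int main(void) {
    double alpha = 0.5;                         /* weighting factor           */
    double estimate = 10.0;                     /* initial guess (assumption) */
    double burst[] = {6, 4, 6, 4, 13, 13, 13};  /* measured CPU bursts        */

    for (int i = 0; i < 7; i++) {
        printf("predicted %.2f, actual %.1f\n", estimate, burst[i]);
        estimate = alpha * burst[i] + (1.0 - alpha) * estimate;   /* update   */
    }
    return 0;
}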
• SJF can be either preemptive or non-preemptive. Preemption
occurs when a new process arrives in the ready queue that has
a predicted burst time shorter than the time remaining in the
process whose burst is currently on the CPU. Preemptive SJF
is sometimes referred to as shortest remaining time first
scheduling.
• For example, the following Gantt chart is based upon the
following data:
Process Arrival Time Burst Time
P1 0 8
P2 1 4
P3 2 9
p4 3 5
• The average wait time in this case is ( ( 5 - 3 ) + ( 10 - 1 ) + ( 17 - 2 ) + 0 ) / 4 = 26 / 4 = 6.5 ms. ( As opposed to 7.75 ms for non-preemptive SJF or 8.75 ms for FCFS. )
Priority Scheduling

• Priority scheduling is a more general case of SJF, in which


each job is assigned a priority and the job with the highest
priority gets scheduled first. ( SJF uses the inverse of the next
expected burst time as its priority - The smaller the expected
burst, the higher the priority. )
• Note that in practice, priorities are implemented using integers
within a fixed range, but there is no agreed-upon convention as
to whether "high" priorities use large numbers or small
numbers. This book uses low numbers for high priorities, with
0 being the highest possible priority.
• For example, the following Gantt chart is based upon these
process burst times and priorities, and yields an average
waiting time of 8.2 ms:
Process Burst Time Priority
P1 10 3
P2 1 1
P3 2 4
P4 1 5
P5 5 2
• Priorities can be assigned either internally or externally. Internal priorities
are assigned by the OS using criteria such as average burst time, ratio of
CPU to I/O activity, system resource use, and other factors available to the
kernel. External priorities are assigned by users, based on the importance of
the job, fees paid, politics, etc.
• Priority scheduling can be either preemptive or non-preemptive.
• Priority scheduling can suffer from a major problem known as indefinite
blocking, or starvation, in which a low-priority task can wait forever
because there are always some other jobs around that have higher priority.
– If this problem is allowed to occur, then processes will either run
eventually when the system load lightens ( at say 2:00 a.m. ), or will
eventually get lost when the system is shut down or crashes. ( There are
rumors of jobs that have been stuck for years. )
– One common solution to this problem is aging, in which priorities of jobs
increase the longer they wait. Under this scheme a low-priority job will
eventually get its priority raised high enough that it gets run.
Round Robin Scheduling

• Round robin scheduling is similar to FCFS scheduling, except that


CPU bursts are assigned with limits called time quantum.
• When a process is given the CPU, a timer is set for whatever value
has been set for a time quantum.
– If the process finishes its burst before the time quantum timer expires,
then it is swapped out of the CPU just like the normal FCFS algorithm.
– If the timer goes off first, then the process is swapped out of the CPU
and moved to the back end of the ready queue.
• The ready queue is maintained as a circular queue, so when all
processes have had a turn, then the scheduler gives the first process
another turn, and so on.
• RR scheduling can give the effect of all processes sharing the CPU equally, although the average wait time can be longer than with other scheduling algorithms. In the following example the average wait time is 5.66 ms.
Process Burst Time
P1 24
P2 3
P3 3
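The schedule can be replayed with a short C sketch (time quantum 4, all three processes arriving at time 0; the simple in-order loop suffices here because P2 and P3 finish in the first round):

#include <stdio.h>

int main(void) {
    int burst[] = {24, 3, 3}, remaining[] = {24, 3, 3};
    int finish[3] = {0}, n = 3, quantum = 4, time = 0, done = 0;

    while (done < n) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0) continue;
            int slice = remaining[i] < quantum ? remaining[i] : quantum;
            time += slice;                       /* run process i for one slice  */
            remaining[i] -= slice;
            if (remaining[i] == 0) { finish[i] = time; done++; }
        }
    }
    double total_wait = 0;
    for (int i = 0; i < n; i++)
        total_wait += finish[i] - burst[i];      /* waiting = turnaround - burst */
    printf("average waiting time = %.2f ms\n", total_wait / n);  /* 17/3, the 5.66 quoted above */
    return 0;
}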
• The performance of RR is sensitive to the time quantum
selected. If the quantum is large enough, then RR reduces to
the FCFS algorithm; if it is very small, then each process gets 1/nth of the processor time, and all processes effectively share the CPU equally.
• BUT, a real system invokes overhead for every context switch,
and the smaller the time quantum the more context switches
there are. ( See Figure 6.4 below. ) Most modern systems use
time quantum between 10 and 100 milliseconds, and context
switch times on the order of 10 microseconds, so the overhead
is small relative to the time quantum.
• Turnaround time also varies with the time quantum, in a non-obvious manner. Consider, for example, the processes shown in Figure 6.5:
• In general, turnaround time is minimized if most processes
finish their next cpu burst within one time quantum. For
example, with three processes of 10 ms bursts each, the
average turnaround time for 1 ms quantum is 29, and for 10
ms quantum it reduces to 20. However, if it is made too large,
then RR just degenerates to FCFS. A rule of thumb is that 80%
of CPU bursts should be smaller than the time quantum.
Multilevel Queue Scheduling

• When processes can be readily categorized, then multiple


separate queues can be established, each implementing
whatever scheduling algorithm is most appropriate for that
type of job, and/or with different parametric adjustments.
• Scheduling must also be done between queues, that is
scheduling one queue to get time relative to other queues. Two
common options are strict priority ( no job in a lower priority
queue runs until all higher priority queues are empty ) and
round-robin ( each queue gets a time slice in turn, possibly of
different sizes. )
• Note that under this algorithm jobs cannot switch from queue
to queue - Once they are assigned a queue, that is their queue
until they finish.
Multilevel Feedback-Queue Scheduling

• Multilevel feedback queue scheduling is similar to the ordinary multilevel


queue scheduling described above, except jobs may be moved from one
queue to another for a variety of reasons:
– If the characteristics of a job change between CPU-intensive and I/O
intensive, then it may be appropriate to switch a job from one queue to
another.
– Aging can also be incorporated, so that a job that has waited for a long time
can get bumped up into a higher priority queue for a while.
• Multilevel feedback queue scheduling is the most flexible, because it can
be tuned for any situation. But it is also the most complex to implement
because of all the adjustable parameters. Some of the parameters which
define one of these systems include:
– The number of queues.
– The scheduling algorithm for each queue.
– The methods used to upgrade or demote processes from one queue to
another. ( Which may be different. )
– The method used to determine which queue a process enters initially.
• Now let us suppose that queues 1 and 2 follow round robin with time quanta of 4 and 8 respectively, and queue 3 follows FCFS. One implementation of MFQS is given below –
• When a process starts executing, it first enters queue 1.
• In queue 1 a process executes for 4 units; if it completes within these 4 units, or gives up the CPU for an I/O operation within them, its priority does not change, and if it comes back to the ready queue it again starts its execution in queue 1.
• If a process in queue 1 does not complete in 4 units, its priority is reduced and it is shifted to queue 2.
• The two points above also hold for queue 2 processes, but with a time quantum of 8 units. In general, if a process does not complete within its time quantum, it is shifted to the lower-priority queue.
• In the last queue, processes are scheduled in FCFS manner.
• A process in a lower-priority queue can execute only when the higher-priority queues are empty.
• A process running in a lower-priority queue is interrupted by a process arriving in a higher-priority queue.
A CONTROL STRUCTURE FOR
INDICATING PARALLELISM
• Many programming language constructs for indicating parallelism have appeared in the literature. These generally involve pairs of statements as follows:
• One statement indicating that execution is to split into several parallel execution sequences (threads of control).
• One statement indicating that certain parallel execution sequences are to merge and sequential execution is to resume.
• These statements occur in pairs and are commonly called parbegin/parend (for parallel begin and parallel end), in the form suggested by Dijkstra (Fig. 4.1):
parbegin
  statement 1;
  statement 2;
  ...
  statement n
parend
• When a program executing a single thread of control reaches the parbegin, execution splits into several threads of control, one for each statement between parbegin and parend. These may be simple statements, procedure calls, blocks, or combinations of these. Each thread of control eventually terminates and reaches the parend; when all of them have done so, a single thread of control resumes after the parend.
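As an illustration only, the parbegin/parend idea can be approximated with POSIX threads: creating one thread per parallel statement plays the role of parbegin, and joining them all plays the role of parend. The statement bodies below are placeholders.

#include <pthread.h>
#include <stdio.h>

void *stmt1(void *arg) { puts("statement 1"); return NULL; }
void *stmt2(void *arg) { puts("statement 2"); return NULL; }
void *stmt3(void *arg) { puts("statement 3"); return NULL; }

int main(void) {
    pthread_t t[3];
    /* parbegin: split into three threads of control */
    pthread_create(&t[0], NULL, stmt1, NULL);
    pthread_create(&t[1], NULL, stmt2, NULL);
    pthread_create(&t[2], NULL, stmt3, NULL);
    /* parend: wait for every parallel statement to finish */
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    puts("single thread of control resumes here");
    return 0;
}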
Multiple-Processor Scheduling

• When multiple processors are available, then the scheduling


gets more complicated, because now there is more than one
CPU which must be kept busy and in effective use at all times.
• Load sharing revolves around balancing the load between
multiple processors.
• Multi-processor systems may be heterogeneous, ( different
kinds of CPUs ), or homogenous, ( all the same kind of CPU
).
• Issues may be related to:
– which process to run, and on which CPU
– whether processes are unrelated or come in groups.
Approaches to Multiple-Processor Scheduling

• One approach to multi-processor scheduling is asymmetric


multiprocessing, in which one processor is the master, controlling all activities and running all kernel code, while the others run only user code. This approach is relatively simple, as there is no need to share critical system data.

• Another approach is symmetric multiprocessing, SMP, where


each processor schedules its own jobs, either from a common
ready queue or from separate ready queues for each processor.
Multi processor timesharing
• The simplest scheduling strategy is time sharing: maintain a single global ready queue, just as in a uniprocessor system. This provides automatic load balancing, because it can never happen that one CPU is idle while others are overloaded.
• The disadvantages of this approach are contention for the scheduling data structure as the number of CPUs grows, and the usual overhead of a context switch when a process blocks for I/O.
Load Balancing

• Obviously an important goal in a multiprocessor system is to


balance the load between processors, so that one processor won't be
sitting idle while another is overloaded.
• Systems using a common ready queue are naturally self-balancing,
and do not need any special handling. Most systems, however,
maintain separate ready queues for each processor.
• Balancing can be achieved through either push migration or pull
migration:
– Push migration involves a separate process that runs periodically, ( e.g.
every 200 milliseconds ), and moves processes from heavily loaded
processors onto less loaded ones.
– Pull migration involves idle processors taking processes from the
ready queues of other processors.
Affinity Scheduling
• Processor affinity means that a process has an affinity for the processor on which it is currently running.
• When a process runs on a specific processor there are certain effects
on the cache memory. The data most recently accessed by the
process populate the cache for the processor and as a result
successive memory accesses by the process are often satisfied from the
cache memory. Now if the process migrates to another processor, the
contents of the cache memory must be invalidated for the first
processor and the cache for the second processor must be
repopulated. Because of the high cost of invalidating and
repopulating caches, most of the SMP(symmetric multiprocessing)
systems try to avoid migration of processes from one processor to
another and try to keep a process running on the same processor.
This is known as PROCESSOR AFFINITY.
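On Linux, a process can cooperate with processor affinity by pinning itself to a CPU with sched_setaffinity(). This is a Linux-specific sketch for illustration only; nothing in the general SMP discussion above requires it.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);      /* clear the CPU mask */
    CPU_SET(0, &set);    /* allow only CPU 0   */

    /* Pin the calling process (pid 0 = "this process") to CPU 0,
       so its cache contents stay warm on that processor. */
    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("now restricted to CPU 0\n");
    /* ... CPU-bound work would run here ... */
    return 0;
}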
Deadlocks

Chapter : Deadlocks
 System Model
 Deadlock Characterization
 Methods for Handling Deadlocks
 Deadlock Prevention
 Deadlock Avoidance
 Deadlock Detection
 Recovery from Deadlock

Chapter Objectives

 To develop a description of deadlocks, which prevent


sets of concurrent processes from completing their
tasks
 To present a number of different methods for
preventing or avoiding deadlocks in a computer
system

System Model

 System consists of resources to be distributed among a number of competing processes.

 Resource types R1, R2, . . ., Rm


CPU cycles, memory space, and I/O devices are examples of resource types.

 Each resource type Ri has Wi instances.


 Each process utilizes a resource as follows:
 request – The process requests the resource. If the request cannot be granted immediately, then the
requesting process must wait until it can acquire the resource
 use – The process can operate on the resource
 release – The process releases the resource

A process must request a resource before using it and must release the
resource after using it. A process may request as many resources as it requires
to carry out its designated task. Obviously, the number of resources requested
may not exceed the total number of resources available in the system. In other
words, a process cannot request three printers if the system has only two.

Deadlock Characterization
Deadlock can arise if four conditions hold simultaneously.

 Mutual exclusion: only one process at a time can use a


resource. If another process requests that resource, the requesting process must be delayed
until the resource has been released.
 Hold and wait: a process holding at least one resource is
waiting to acquire additional resources held by other
processes
 No preemption: a resource can be released only voluntarily
by the process holding it, after that process has completed
its task
 Circular wait: there exists a set {P0, P1, …, Pn} of waiting
processes such that P0 is waiting for a resource that is held
by P1, P1 is waiting for a resource that is held by P2, …, Pn–1
is waiting for a resource that is held by Pn, and Pn is waiting
for a resource that is held by P0.

Resource-Allocation Graph
Deadlocks can be described more precisely in terms of a directed graph called a
resource-allocation graph.
A set of vertices V and a set of edges E.
 V is partitioned into two types:
 P = {P1, P2, …, Pn}, the set consisting of all the processes
in the system

 R = {R1, R2, …, Rm}, the set consisting of all resource


types in the system

 request edge – directed edge Pi → Rj

 assignment edge – directed edge Rj → Pi

Resource-Allocation Graph (Cont.)
 Process: Pi
 Resource type with 4 instances: Rj
 Pi requests an instance of Rj: request edge Pi → Rj
 Pi is holding an instance of Rj: assignment edge Rj → Pi

Example of a Resource Allocation Graph

Resource Allocation Graph With A Deadlock

Graph With A Cycle But No Deadlock
Basic Facts

 If graph contains no cycles ⇒ no deadlock


 If graph contains a cycle ⇒
 if only one instance per resource type, then deadlock
 if several instances per resource type, possibility of
deadlock

Methods for Handling Deadlocks
Generally speaking, we can deal with the deadlock problem in one of three ways:
We can use a protocol to prevent or avoid deadlocks, ensuring that the
system will never enter a deadlocked state.

 Ensure that the system will never enter a deadlock


state:
 Deadlock prevention
 Deadlock avoidance
 Allow the system to enter a deadlock state and then
recover
 Ignore the problem and pretend that deadlocks never
occur in the system; used by most operating systems,
including UNIX and Windows. It is then up to the application developer to write programs that
handle deadlocks

Deadlock Prevention
It provides a set of methods for ensuring that at least one of the necessary conditions
(Section 7.2.1) cannot hold.
Restrain the ways request can be made

 Mutual Exclusion – not required for sharable resources


(e.g., read-only files); must hold for non-sharable resources
 Hold and Wait – must guarantee that whenever a process
requests a resource, it does not hold any other resources
 Require process to request and be allocated all its
resources before it begins execution, or allow process
to request resources only when the process has none
allocated to it.
 Low resource utilization; starvation possible
Low resource utilization -> resources may be allocated but unused for a long period
Starvation -> a process that needs several popular resources may have to wait indefinitely, because at least one
of the resources that it needs is always allocated to some other process

The mutual-exclusion condition must hold for nonsharable resources. For


example, a printer cannot be simultaneously shared by several processes.
Sharable resources, in contrast, do not require mutually exclusive access and
thus cannot be involved in a deadlock.

Deadlock Prevention (Cont.)
 No Preemption –
 If a process that is holding some resources requests
another resource that cannot be immediately allocated to
it, then all resources currently being held are released
 Preempted resources are added to the list of resources
for which the process is waiting
 Process will be restarted only when it can regain its old
resources, as well as the new ones that it is requesting
 Circular Wait – impose a total ordering of all resource types,
and require that each process requests resources in an
increasing order of enumeration

Deadlock Example
#include <pthread.h>

pthread_mutex_t first_mutex  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t second_mutex = PTHREAD_MUTEX_INITIALIZER;

/* thread one runs in this function: locks first_mutex, then second_mutex */
void *do_work_one(void *param)
{
    pthread_mutex_lock(&first_mutex);
    pthread_mutex_lock(&second_mutex);
    /* do some work */
    pthread_mutex_unlock(&second_mutex);
    pthread_mutex_unlock(&first_mutex);
    pthread_exit(0);
}

/* thread two runs in this function: locks second_mutex, then first_mutex
   (the opposite order -- deadlock is possible if the threads interleave) */
void *do_work_two(void *param)
{
    pthread_mutex_lock(&second_mutex);
    pthread_mutex_lock(&first_mutex);
    /* do some work */
    pthread_mutex_unlock(&first_mutex);
    pthread_mutex_unlock(&second_mutex);
    pthread_exit(0);
}

Deadlock Example with Lock Ordering
void transaction(Account from, Account to, double amount)
{
mutex lock1, lock2;
lock1 = get_lock(from);
lock2 = get_lock(to);
acquire(lock1);
acquire(lock2);
withdraw(from, amount);
deposit(to, amount);
release(lock2);
release(lock1);
}

Transactions 1 and 2 execute concurrently. Transaction 1 transfers $25


from account A to account B, and Transaction 2 transfers $50 from account
B to account A
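One way to prevent this deadlock is to impose a total order on the locks, for example by always acquiring the lock of the account with the smaller id first. The sketch below is an illustration under assumptions made here (the Account structure and its embedded pthread mutex are invented for the example, not the textbook's code):

#include <pthread.h>

/* Hypothetical account type: each account carries its own lock. */
typedef struct {
    int id;
    double balance;
    pthread_mutex_t lock;
} Account;

void transaction(Account *from, Account *to, double amount)
{
    /* Impose a total order: always lock the account with the smaller id
       first, so two concurrent transfers between the same pair of
       accounts cannot wait on each other in a cycle. */
    Account *first  = (from->id < to->id) ? from : to;
    Account *second = (from->id < to->id) ? to   : from;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);

    from->balance -= amount;   /* withdraw(from, amount) */
    to->balance   += amount;   /* deposit(to, amount)    */

    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

With this ordering, Transaction 1 and Transaction 2 both try to lock the same account first, so a circular wait between them cannot arise.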

Deadlock Avoidance
Requires that the system has some additional a priori information
available
 Simplest and most useful model requires that each process
declare the maximum number of resources of each type
that it may need
 The deadlock-avoidance algorithm dynamically examines
the resource-allocation state to ensure that there can never
be a circular-wait condition
 Resource-allocation state is defined by the number of
available and allocated resources, and the maximum
demands of the processes

Safe State

 When a process requests an available resource, system must


decide if immediate allocation leaves the system in a safe state
 System is in safe state if there exists a sequence <P1, P2, …, Pn>
of ALL the processes in the systems such that for each Pi, the
resources that Pi can still request can be satisfied by currently
available resources + resources held by all the Pj, with j < i
 That is:
 If Pi resource needs are not immediately available, then Pi can
wait until all Pj have finished
 When Pj is finished, Pi can obtain needed resources, execute,
return allocated resources, and terminate
 When Pi terminates, Pi +1 can obtain its needed resources, and
so on

Basic Facts

 If a system is in safe state ⇒ no deadlocks

 If a system is in unsafe state ⇒ possibility of deadlock

 Avoidance ⇒ ensure that a system will never enter an


unsafe state.

Safe, Unsafe, Deadlock State
Avoidance Algorithms

 Single instance of a resource type


 Use a resource-allocation graph

 Multiple instances of a resource type


 Use the banker’s algorithm

Resource-Allocation Graph Scheme
 Claim edge Pi → Rj indicates that process Pi may request
resource Rj; represented by a dashed line
 Claim edge converts to request edge when a process requests
a resource
 Request edge converted to an assignment edge when the
resource is allocated to the process
 When a resource is released by a process, assignment edge
reconverts to a claim edge
 Resources must be claimed a priori in the system

Resource-Allocation Graph

Unsafe State In Resource-Allocation Graph
Resource-Allocation Graph Algorithm

 Suppose that process Pi requests a resource Rj


 The request can be granted only if converting the
request edge to an assignment edge does not result
in the formation of a cycle in the resource allocation
graph

Banker’s Algorithm
 Used when there are multiple instances of a resource type

 Each process must a priori claim maximum use

 When a process requests a resource it may have to wait until some other
process releases
enough resources.
 When a process gets all its resources it must return them in a
finite amount of time

A priori claim of maximum use: when a new process enters the system, it must declare the
maximum number of instances of each resource type that it may need. When a user
requests a set of resources, the system must determine whether the allocation of these
resources will leave the system in a safe state.

Data Structures for the Banker’s Algorithm

Let n = number of processes, and m = number of resource types.

 Available: Vector of length m; indicates the number of available resources of
each type. If Available[j] = k, there are k instances of resource type Rj available
 Max: n x m matrix; defines the maximum demand of each process.
If Max[i,j] = k, then process Pi may request at most k instances of resource type Rj
 Allocation: n x m matrix; defines the number of resources of each type currently
allocated to each process. If Allocation[i,j] = k, then Pi is currently allocated k instances of Rj
 Need: n x m matrix; indicates the remaining resource need of each process.
If Need[i,j] = k, then Pi may need k more instances of Rj to complete its task

Need[i,j] = Max[i,j] – Allocation[i,j]

Safety Algorithm
1. Let Work and Finish be vectors of length m and n, respectively.
Initialize:
Work = Available
Finish [i] = false for i = 0, 1, …, n- 1

2. Find an i such that both:


(a) Finish [i] = false
(b) Needi ≤ Work
If no such i exists, go to step 4

3. Work = Work + Allocationi


Finish[i] = true
go to step 2

4. If Finish [i] == true for all i, then the system is in a safe state
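The safety algorithm translates almost directly into code. The following C sketch assumes small fixed-size arrays purely for illustration; n and m would normally come from the system state:

#include <stdbool.h>

#define N 5   /* number of processes      */
#define M 3   /* number of resource types */

/* Returns true if the state described by available[], need[][] and
   allocation[][] is safe, i.e. some order exists in which every
   process can finish. */
bool is_safe(int available[M], int need[N][M], int allocation[N][M])
{
    int work[M];
    bool finish[N] = { false };

    for (int j = 0; j < M; j++)              /* Work = Available */
        work[j] = available[j];

    for (int count = 0; count < N; count++) {
        bool found = false;
        for (int i = 0; i < N && !found; i++) {
            if (finish[i])
                continue;
            bool can_run = true;             /* Need_i <= Work ? */
            for (int j = 0; j < M; j++)
                if (need[i][j] > work[j]) { can_run = false; break; }
            if (can_run) {
                for (int j = 0; j < M; j++)  /* Work = Work + Allocation_i */
                    work[j] += allocation[i][j];
                finish[i] = true;
                found = true;
            }
        }
        if (!found)
            break;                           /* no runnable process left */
    }

    for (int i = 0; i < N; i++)
        if (!finish[i])
            return false;                    /* some process can never finish */
    return true;
}

Applied to the example snapshot shown a few slides later, it reports a safe state, consistent with the sequence <P1, P3, P4, P2, P0>.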

Resource-Request Algorithm for Process Pi

Requesti = request vector for process Pi. If Requesti [j] = k then


process Pi wants k instances of resource type Rj
1. If Requesti ≤ Needi go to step 2. Otherwise, raise error condition,
since process has exceeded its maximum claim
2. If Requesti ≤ Available, go to step 3. Otherwise Pi must wait,
since resources are not available
3. Pretend to allocate requested resources to Pi by modifying the
state as follows:
Available = Available – Requesti;
Allocationi = Allocationi + Requesti;
Needi = Needi – Requesti;
 If safe ⇒ the resources are allocated to Pi
 If unsafe ⇒ Pi must wait, and the old resource-allocation state
is restored

Example of Banker’s Algorithm

 5 processes P0 through P4;


3 resource types:
A (10 instances), B (5 instances), and C (7 instances)
 Snapshot at time T0:
        Allocation    Max      Available
        A B C         A B C    A B C
P0      0 1 0         7 5 3    3 3 2
P1      2 0 0         3 2 2
P2      3 0 2         9 0 2
P3      2 1 1         2 2 2
P4      0 0 2         4 3 3

Example (Cont.)
 The content of the matrix Need is defined to be Max – Allocation

        Need
        A B C
P0      7 4 3
P1      1 2 2
P2      6 0 0
P3      0 1 1
P4      4 3 1

 The system is in a safe state since the sequence < P1, P3, P4, P2, P0>
satisfies safety criteria

Example: P1 Request (1,0,2)
 Check that Request ≤ Available (that is, (1,0,2) ≤ (3,3,2)) ⇒ true
        Allocation    Need     Available
        A B C         A B C    A B C
P0      0 1 0         7 4 3    2 3 0
P1      3 0 2         0 2 0
P2      3 0 2         6 0 0
P3      2 1 1         0 1 1
P4      0 0 2         4 3 1

 Executing safety algorithm shows that sequence < P1, P3, P4, P0, P2>
satisfies safety requirement

 Can request for (3,3,0) by P4 be granted?

 Can request for (0,2,0) by P0 be granted?

Deadlock Detection

 Allow system to enter deadlock state

 Detection algorithm

 Recovery scheme

Single Instance of Each Resource Type

 Maintain wait-for graph


 Nodes are processes
 Pi → Pj if Pi is waiting for Pj

 Periodically invoke an algorithm that searches for a cycle in the


graph. If there is a cycle, there exists a deadlock

 An algorithm to detect a cycle in a graph requires on the order of n²


operations, where n is the number of vertices in the graph
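A cycle search over the wait-for graph is a plain depth-first search. The sketch below assumes an adjacency-matrix representation (an assumption made here for brevity), which gives the n² behaviour mentioned above:

#include <stdbool.h>

#define NPROC 5   /* number of processes (vertices) */

/* wait_for[i][j] is true if Pi is waiting for Pj */
static bool dfs(bool wait_for[NPROC][NPROC], int v,
                bool visited[NPROC], bool on_stack[NPROC])
{
    visited[v] = on_stack[v] = true;
    for (int w = 0; w < NPROC; w++) {
        if (!wait_for[v][w])
            continue;
        if (on_stack[w])                  /* back edge: cycle found */
            return true;
        if (!visited[w] && dfs(wait_for, w, visited, on_stack))
            return true;
    }
    on_stack[v] = false;
    return false;
}

/* Returns true if the wait-for graph contains a cycle, i.e. a deadlock. */
bool has_deadlock(bool wait_for[NPROC][NPROC])
{
    bool visited[NPROC] = { false }, on_stack[NPROC] = { false };
    for (int v = 0; v < NPROC; v++)
        if (!visited[v] && dfs(wait_for, v, visited, on_stack))
            return true;
    return false;
}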

Resource-Allocation Graph and Wait-for Graph

Resource-Allocation Graph          Corresponding wait-for graph
Several Instances of a Resource Type
 Available: A vector of length m indicates the number of
available resources of each type
 Allocation: An n x m matrix defines the number of resources
of each type currently allocated to each process
 Request: An n x m matrix indicates the current request of
each process. If Request [i][j] = k, then process Pi is
requesting k more instances of resource type Rj.

Detection Algorithm

1. Let Work and Finish be vectors of length m and n, respectively


Initialize:
(a) Work = Available
(b) For i = 1,2, …, n, if Allocationi ≠ 0, then
Finish[i] = false; otherwise, Finish[i] = true

2. Find an index i such that both:


(a) Finish[i] == false
(b) Requesti ≤ Work

If no such i exists, go to step 4

Detection Algorithm (Cont.)
3. Work = Work + Allocationi
Finish[i] = true
go to step 2

4. If Finish[i] == false, for some i, 1 ≤ i ≤ n, then the system is in


deadlock state. Moreover, if Finish[i] == false, then Pi is
deadlocked

 The algorithm requires on the order of O(m × n²) operations to detect


whether the system is in deadlocked state

Example of Detection Algorithm
 Five processes P0 through P4; three resource types
A (7 instances), B (2 instances), and C (6 instances)

 Snapshot at time T0:


        Allocation    Request    Available
        A B C         A B C      A B C
P0      0 1 0         0 0 0      0 0 0
P1      2 0 0         2 0 2
P2      3 0 3         0 0 0
P3      2 1 1         1 0 0
P4      0 0 2         0 0 2

 Sequence <P0, P2, P3, P1, P4> will result in Finish[i] = true for all i

Example (Cont.)

 P2 requests an additional instance of type C


        Request
        A B C
P0      0 0 0
P1      2 0 2
P2      0 0 1
P3      1 0 0
P4      0 0 2

 State of system?
 Can reclaim resources held by process P0, but insufficient
resources to fulfill other processes' requests
 Deadlock exists, consisting of processes P1, P2, P3, and P4

Detection-Algorithm Usage
 When, and how often, to invoke depends on:
 How often a deadlock is likely to occur?
 How many processes will need to be rolled back?
 one for each disjoint cycle

 If detection algorithm is invoked arbitrarily, there may be many


cycles in the resource graph and so we would not be able to tell
which of the many deadlocked processes “caused” the
deadlock.

Recovery from Deadlock: Process Termination

 Abort all deadlocked processes
This method clearly will break the deadlock cycle, but at great expense; the deadlocked processes
may have computed for a long time, and the results of these partial computations must be discarded
and probably will have to be recomputed later
 Abort one process at a time until the deadlock cycle is eliminated

 In which order should we choose to abort?


1. Priority of the process
2. How long process has computed, and how much longer to
completion
3. Resources the process has used
4. Resources process needs to complete
5. How many processes will need to be terminated
6. Is process interactive or batch?

Recovery from Deadlock: Resource Preemption
To eliminate deadlocks using resource preemption, we successively preempt
some resources from processes and give these resources to other processes until the
deadlock cycle is broken.
 Selecting a victim – minimize cost
Cost factors may include such parameters as the number of resources a deadlocked process is
holding and the amount of time the process has thus far consumed during its execution.
 Rollback – return to some safe state, restart process for that
state

 Starvation – same process may always be picked as victim;
include number of rollbacks in cost factor
Clearly, we must ensure that a process can be picked as a victim only a (small) finite
number of times. The most common solution is to include the number of rollbacks in the cost
factor

Since, in general, it is difficult to determine what a safe state is, the


simplest solution is a total rollback: abort the process and then restart
it. Although it is more effective to roll back the process only as far as
necessary to break the deadlock, this method requires the system to keep
more information about the state of all running processes.

Main Memory
Background

 Program must be brought (from disk) into memory and


placed within a process for it to be run
 Main memory and registers are only storage CPU can
access directly
 Memory unit only sees a stream of addresses + read
requests, or address + data and write requests
 Register access in one CPU clock (or less)
 Main memory can take many cycles, causing a stall
 Cache sits between main memory and CPU registers
 Protection of memory required to ensure correct operation

Base and Limit Registers
 A pair of base and limit registers define the logical address space
 CPU must check every memory access generated in user mode to
be sure it is between base and limit for that user

Hardware Address Protection

Address Binding
 Programs on disk, ready to be brought into memory to execute form an
input queue
 Without support, must be loaded into address 0000
 Inconvenient to have first user process physical address always at 0000
 How can it not be?
 Further, addresses represented in different ways at different stages of a
program’s life
 Source code addresses usually symbolic
 Compiled code addresses bind to relocatable addresses
 i.e. “14 bytes from beginning of this module”
 Linker or loader will bind relocatable addresses to absolute addresses
 i.e. 74014
 Each binding maps one address space to another

Logical vs. Physical Address Space

 The concept of a logical address space that is bound to a


separate physical address space is central to proper memory
management
 Logical address – generated by the CPU; also referred to
as virtual address
 Physical address – address seen by the memory unit
 Logical address space is the set of all logical addresses
generated by a program
 Physical address space is the set of all physical addresses
generated by a program

Memory-Management Unit (MMU)
 Hardware device that at run time maps virtual to physical
address
 Many methods possible, covered in the rest of this chapter
 To start, consider simple scheme where the value in the
relocation register is added to every address generated by a
user process at the time it is sent to memory
 Base register now called relocation register
 MS-DOS on Intel 80x86 used 4 relocation registers
 The user program deals with logical addresses; it never sees the
real physical addresses
 Execution-time binding occurs when reference is made to
location in memory
 Logical address bound to physical addresses

Dynamic relocation using a relocation register
Dynamic Loading
 Routine is not loaded until it is
called
 Better memory-space utilization;
unused routine is never loaded
 All routines kept on disk in
relocatable load format
 Useful when large amounts of
code are needed to handle
infrequently occurring cases
 No special support from the
operating system is required
 Implemented through program
design
 OS can help by providing libraries
to implement dynamic loading

Swapping
 A process can be swapped temporarily out of memory to a
backing store, and then brought back into memory for continued
execution
 Total physical memory space of processes can exceed
physical memory
 Backing store – fast disk large enough to accommodate copies
of all memory images for all users; must provide direct access to
these memory images
 Roll out, roll in – swapping variant used for priority-based
scheduling algorithms; lower-priority process is swapped out so
higher-priority process can be loaded and executed
 Major part of swap time is transfer time; total transfer time is
directly proportional to the amount of memory swapped
 System maintains a ready queue of ready-to-run processes
which have memory images on disk

Schematic View of Swapping
Context Switch Time including Swapping

 If next processes to be put on CPU is not in memory, need to


swap out a process and swap in target process
 Context switch time can then be very high
 100MB process swapping to hard disk with transfer rate of
50MB/sec
 Swap out time of 2000 ms
 Plus swap in of same sized process
 Total context switch swapping component time of 4000ms
(4 seconds)
 Can reduce if reduce size of memory swapped – by knowing
how much memory really being used
 System calls to inform OS of memory use via
request_memory() and release_memory()

Context Switch Time and Swapping (Cont.)

 Other constraints as well on swapping


 Pending I/O – can’t swap out as I/O would occur to wrong
process
 Or always transfer I/O to kernel space, then to I/O device
 Known as double buffering, adds overhead
 Standard swapping not used in modern operating systems
 But modified version common
 Swap only when free memory extremely low

Contiguous Allocation
 Main memory must support both OS and user processes
 Limited resource, must allocate efficiently
 Contiguous allocation is one early method
 Main memory usually into two partitions:
 Resident operating system, usually held in low memory with
interrupt vector
 User processes then held in high memory
 Each process contained in single contiguous section of
memory

The memory is usually divided into two partitions: one for the resident
operating system and one for the user processes. We can place the operating
system in either low memory or high memory. The major factor affecting this
decision is the location of the interrupt vector. Since the interrupt vector is
often in low memory, programmers usually place the operating system in low
memory as well.

Contiguous Allocation (Cont.)
 Relocation registers used to protect user processes from each
other, and from changing operating-system code and data
 Base register contains value of smallest physical address
 Limit register contains range of logical addresses – each
logical address must be less than the limit register
 MMU maps logical address dynamically
 Can then allow actions such as kernel code being transient
and kernel changing size

Hardware Support for Relocation and Limit Registers
Multiple-partition allocation
One of the simplest methods for allocating memory is to divide memory into several fixed-sized
partitions. Each partition may contain exactly one process.
 Multiple-partition allocation
 Degree of multiprogramming limited by number of partitions
 Variable-partition sizes for efficiency (sized to a given process’ needs)
 Hole – block of available memory; holes of various size are scattered
throughout memory
 When a process arrives, it is allocated memory from a hole large enough to
accommodate it
 Process exiting frees its partition, adjacent free partitions combined
 Operating system maintains information about:
a) allocated partitions b) free partitions (hole)

Dynamic Storage-Allocation Problem
How to satisfy a request of size n from a list of free holes?

 First-fit: Allocate the first hole that is big enough. Searching can start either
at the beginning of the set of holes or at the location where the previous first-fit search ended. We
can stop searching as soon as we find a free hole that is large enough.
 Best-fit: Allocate the smallest hole that is big enough; must
search entire list, unless ordered by size
 Produces the smallest leftover hole

 Worst-fit: Allocate the largest hole; must also search entire list
 Produces the largest leftover hole which may be more useful than the
smaller leftover hole from a best-fit
approach
First-fit and best-fit better than worst-fit in terms of speed and storage
utilization
The system may need to check whether there are processes waiting for memory and whether this
newly freed and recombined memory could satisfy the demands of any of these waiting
processes.
This procedure is a particular instance of the general dynamic storage-allocation problem, which concerns how
to satisfy a request of size n from a list of free holes.
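A first-fit search is only a few lines of code. The sketch below keeps the free holes in a small array, which is an assumption for illustration; real allocators typically keep a linked list of free blocks:

#include <stdio.h>

#define NHOLES 4

/* A free hole: starting address and size, both in bytes. */
struct hole { int start; int size; };

/* First fit: return the index of the first hole that is big enough,
   shrinking it in place; return -1 if no hole can satisfy the request. */
int first_fit(struct hole holes[], int n, int request)
{
    for (int i = 0; i < n; i++) {
        if (holes[i].size >= request) {
            holes[i].start += request;   /* carve the allocation off the front */
            holes[i].size  -= request;
            return i;
        }
    }
    return -1;
}

int main(void)
{
    struct hole holes[NHOLES] = { {100, 50}, {300, 200}, {600, 120}, {900, 80} };
    int idx = first_fit(holes, NHOLES, 150);
    if (idx >= 0)
        printf("allocated 150 bytes from hole %d\n", idx);
    return 0;
}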

Fragmentation
Both the first-fit and best-fit strategies for memory allocation suffer from external fragmentation. As processes are
loaded and removed from memory, the free memory space is broken into little pieces.
 External Fragmentation – total memory space exists to
satisfy a request, but it is not contiguous Storage is fragmented into a large
number of small holes
 Internal Fragmentation – allocated memory may be slightly
larger than requested memory; this size difference is memory
internal to a partition, but not being used
 Analysis of first fit reveals that, given N allocated blocks, another 0.5 N
blocks will be lost to fragmentation
 1/3 may be unusable -> the 50-percent rule
That is, one-third of memory may be unusable! This property is known as the 50-percent rule.

For more detail, see the discussion of fragmentation in the textbook (p. 327).

Fragmentation (Cont.)

 Reduce external fragmentation by compaction


 Shuffle memory contents to place all free memory together
in one large block. Compaction is not always possible, however. If relocation
is static and is done at assembly or load time, compaction cannot be done
 Compaction is possible only if relocation is dynamic and is
done at execution time; relocation then requires only moving the program and data and
changing the base register to reflect the new base address
 I/O problem

 Latch job in memory while it is involved in I/O


 Do I/O only into OS buffers
 Now consider that backing store has same fragmentation
problems
Another possible solution to the external-fragmentation problem is to permit the logical address
space of the processes to be noncontiguous, thus allowing a process to be allocated physical
memory wherever such memory is available. Two complementary techniques achieve this
solution: paging (Section 8.4) and segmentation (Section 8.6).

Segmentation
 Memory-management scheme that supports user view of memory
 A program is a collection of segments
 A segment is a logical unit such as:
main program
procedure
function
method
object
local variables, global variables
common block
stack
symbol table
arrays
A logical address space is a collection of segments. Each
segment has a name and a length. The user therefore specifies each address
by two quantities: a segment name and an offset

User’s View of a Program
Logical View of Segmentation

(figure: segments in user space mapped onto noncontiguous areas of physical memory space)

Segmentation Architecture
For simplicity of implementation, segments are numbered and are referred to by a segment
number, rather than by a segment name.
 Logical address consists of a two tuple:
<segment-number, offset>,
Normally, the user program is compiled, and the compiler automatically constructs segments reflecting
the input program.
 Segment table – maps the two-dimensional user-defined addresses into one-dimensional physical addresses; each
table entry has:
 base – contains the starting physical address where the
segments reside in memory
 limit – specifies the length of the segment

 Segment-table base register (STBR) points to the segment


table’s location in memory

 Segment-table length register (STLR) indicates number of


segments used by a program;
segment number s is legal if s < STLR

Segmentation Architecture (Cont.)
 Protection
 With each entry in segment table associate:
 validation bit = 0 ⇒ illegal segment
 read/write/execute privileges
 Protection bits associated with segments; code sharing
occurs at segment level
 Since segments vary in length, memory allocation is a
dynamic storage-allocation problem
 A segmentation example is shown in the following diagram

Segmentation Hardware
For the full treatment of paging, see the textbook (p. 328).
Paging
 Physical address space of a process can be noncontiguous;
process is allocated physical memory whenever the latter is
available
 Avoids external fragmentation
 Avoids problem of varying sized memory chunks
The basic method for implementing paging involves breaking physical memory into fixed-sized blocks and breaking logical memory into blocks of the same size:
 Divide physical memory into fixed-sized blocks called frames
 Size is a power of 2, between 512 bytes and 16 Mbytes
 Divide logical memory into blocks of the same size, called pages
 Keep track of all free frames
 To run a program of size N pages, need to find N free frames and
load program
 Set up a page table to translate logical to physical addresses
 Backing store likewise split into pages
 Still have Internal fragmentation

8.26
Address Translation Scheme
 Address generated by CPU is divided into:
 Page number (p) – used as an index into a page table which
contains base address of each page in physical memory
 Page offset (d) – combined with base address to define the
physical memory address that is sent to the memory unit

page number (p): m – n bits      page offset (d): n bits

 For a given logical address space of size 2^m and page size 2^n
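Because the page size is a power of two, the split is just a division (or shift) and a mask. The 4 KB page size and the sample address below are illustrative assumptions:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE   4096u               /* 2^12 bytes, so n = 12 */
#define OFFSET_MASK (PAGE_SIZE - 1)

int main(void)
{
    uint32_t logical = 0x3A7F;                      /* example logical address */
    uint32_t page    = logical / PAGE_SIZE;         /* high-order m - n bits    */
    uint32_t offset  = logical & OFFSET_MASK;       /* low-order n bits         */

    /* The page table (not shown) would map 'page' to a frame number f;
       the physical address is then f * PAGE_SIZE + offset. */
    printf("page = %u, offset = %u\n", page, offset);
    return 0;
}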


Paging is a memory-management scheme that permits the physical address space a process to be
noncontiguous. Paging avoids external fragmentation and the need for compaction. It also solves the
considerable problem of fitting memory chunks of varying sizes onto the backing store; most memory
management schemes used before the introduction of paging suffered from this problem. Because of its
advantages over earlier methods, paging in its various forms is used in most operating systems

Traditionally, support for paging has been handled by hardware. However, recent designs have implemented
paging by closely integrating the hardware and operating system, especially on 64-bit microprocessors.

Paging Hardware

Paging Model of Logical and Physical Memory
Paging Example

n = 2 and m = 4; 32-byte memory and 4-byte pages

Paging (Cont.)

 Calculating internal fragmentation


 Page size = 2,048 bytes
 Process size = 72,766 bytes
 35 pages + 1,086 bytes
 Internal fragmentation of 2,048 - 1,086 = 962 bytes
 Worst case fragmentation = 1 frame – 1 byte
 On average fragmentation = 1 / 2 frame size
 So small frame sizes desirable?
 But each page table entry takes memory to track
 Page sizes growing over time
 Solaris supports two page sizes – 8 KB and 4 MB
 Process view and physical memory now very different
 By implementation process can only access its own memory

Free Frames

Before allocation After allocation

Implementation of Page Table
 Page table is kept in main memory
 Page-table base register (PTBR) points to the page table
 Page-table length register (PTLR) indicates size of the page
table
 In this scheme every data/instruction access requires two
memory accesses
 One for the page table and one for the data / instruction
 The two memory access problem can be solved by the use of
a special fast-lookup hardware cache called associative
memory or translation look-aside buffers (TLBs)

Implementation of Page Table (Cont.)
 Some TLBs store address-space identifiers (ASIDs) in each
TLB entry – uniquely identifies each process to provide
address-space protection for that process
 Otherwise need to flush at every context switch
 TLBs typically small (64 to 1,024 entries)
 On a TLB miss, value is loaded into the TLB for faster access
next time
 Replacement policies must be considered
 Some entries can be wired down for permanent fast
access

Associative Memory

 Associative memory – parallel search

Page # Frame #

 Address translation (p, d)


 If p is in associative register, get frame # out
 Otherwise get frame # from page table in memory

Paging Hardware With TLB
Effective Access Time
 Associative lookup = ε time units
 Can be < 10% of memory access time
 Hit ratio = α
 Hit ratio – percentage of times that a page number is found in the
associative registers; ratio related to number of associative
registers
 Consider α = 80%, ε = 20ns for TLB search, 100ns for memory access
 EAT = 0.80 x 100 + 0.20 x 200 = 120ns
 Consider a more realistic hit ratio -> α = 99%, ε = 20ns for TLB search,
100ns for memory access
 EAT = 0.99 x 100 + 0.01 x 200 = 101ns

Memory Protection
Memory protection in a paged environment is accomplished by protection bits
associated with each frame. Normally, these bits are kept in the page table
 Memory protection implemented by associating protection bit
with each frame to indicate if read-only or read-write access is
allowed
 Can also add more bits to indicate page execute-only, and
so on
 Valid-invalid bit attached to each entry in the page table:
 “valid” indicates that the associated page is in the
process’ logical address space, and is thus a legal page
 “invalid” indicates that the page is not in the process’
logical address space
 Or use page-table length register (PTLR): this value is checked against every
logical address to verify that the address is in the valid range for the process
 Any violations result in a trap to the kernel

Valid (v) or Invalid (i) Bit In A Page Table
Virtual Memory
Background (Cont.)
 Virtual memory – separation of user logical memory from
physical memory
 Only part of the program needs to be in memory for execution
 Logical address space can therefore be much larger than physical
address space
 Allows address spaces to be shared by several processes
 Allows for more efficient process creation
 More programs running concurrently
 Less I/O needed to load or swap processes
Virtual memory is a technique that allows the execution of processes that are not completely in memory. One major advantage of this scheme is
that programs can be larger than physical memory. Further, virtual memory abstracts main memory into an extremely large, uniform array of storage,
separating logical memory as viewed by the user from physical memory. This technique frees programmers from the concerns of memory-storage
limitations. Virtual memory also allows processes to share files easily and to implement shared memory. In addition, it provides an efficient
mechanism for process creation. Virtual memory is not easy to implement, however, and
may substantially decrease performance if it is used carelessly
Background (Cont.)
 Virtual address space – logical view of how a process is
stored in memory. Typically, this view is that a process
begins at a certain logical address (say, address 0) and exists in contiguous
memory
 Usually starts at address 0, with contiguous addresses until end of
space
 Meanwhile, physical memory is organized in page frames, and the physical page
frames assigned to a process may not be contiguous.
 The memory-management unit (MMU) must map logical pages to physical page frames in memory.

 Virtual memory can be implemented via:
 Demand paging
 Demand segmentation
Virtual Memory That is Larger Than Physical Memory
Demand Paging
 Could bring entire process into memory at load time
Loading the entire program into memory results in loading the executable code for all options,
regardless of whether an option is ultimately selected by the user or not
 Or bring a page into memory only when
it is needed
 Less I/O needed, no unnecessary
I/O
 Less memory needed
 Faster response
 More users
 Similar to paging system with swapping
(diagram on right)
 Page is needed ⇒ reference to it
 invalid reference ⇒ abort
 not-in-memory ⇒ bring to memory
 Lazy swapper – never swaps a page
into memory unless page will be needed
 Swapper that deals with pages is a
pager; a swapper manipulates entire processes, whereas a pager is concerned with
the individual pages of a process.
A demand-paging system is similar to a paging system with swapping, where processes reside in secondary memory
(usually a disk). When we want to execute a process, we swap it into memory. With demand-paged virtual memory,
pages are loaded only when they are demanded during program execution; pages that are never accessed are thus
never loaded into physical memory.
Basic Concepts
 With swapping, the pager guesses which pages will be used before the process is
swapped out again
 Instead, the pager brings only those pages into memory. Thus, it avoids reading
into memory pages that will not be used anyway, decreasing the swap time
and the amount of physical memory needed.
 How to determine that set of pages?
 Need new MMU functionality to implement demand paging

 If pages needed are already memory resident


 No difference from non demand-paging
 If page needed and not memory resident
 Need to detect and load the page into memory from storage
 Without changing program behavior
 Without programmer needing to change code
Valid-Invalid Bit
 With each page table entry a valid–invalid bit is associated
(v ⇒ in-memory – memory resident, i ⇒ not-in-memory)
 Initially valid–invalid bit is set to i on all entries
 Example of a page table snapshot:

 During MMU address translation, if valid–invalid bit in page table


entry is i ⇒ page fault
Page Table When Some Pages Are Not in Main Memory
Page Fault
What happens if the process tries to access a page that was not brought
into memory? Access to a page marked invalid causes a page fault.

 If there is a reference to a page, the first reference to that page will
trap to the operating system: a page fault. This trap is the
result of the operating system's failure to bring the desired page into memory.
1. Operating system looks at another table to decide:
 Invalid reference  abort
 Just not in memory
2. Find free frame
3. Swap page into frame via scheduled disk operation
4. Reset tables to indicate page now in memory
Set validation bit = v
5. Restart the instruction that caused the page fault
Steps in Handling a Page Fault
Aspects of Demand Paging
 Extreme case – start executing a process with no pages in memory
 OS sets instruction pointer to first instruction of process, which is non-
memory-resident -> page fault; the process immediately faults for the page
 And page faults occur for every other page on first access
 Pure demand paging – never bring a page into memory until it is
required
 Actually, a given instruction could access multiple pages -> multiple
page faults
 Consider fetch and decode of an instruction which adds 2 numbers
from memory and stores the result back to memory
 Pain decreased because of locality of reference, which results in
reasonable performance from demand paging
 Hardware support needed for demand paging
 Page table with valid / invalid bit
 Secondary memory (swap device with swap space): this memory holds those pages that are not present
in main memory. The secondary memory is usually a high-speed disk. It is
known as the swap device, and the section of disk used for this purpose is
known as swap space.
 Instruction restart
Instruction Restart
 Consider an instruction that could access several different locations
 block move

 auto increment/decrement location


 Restart the whole operation?
 What if source and destination overlap?
Performance of Demand Paging
 Stages in Demand Paging (worse case)
1. Trap to the operating system
2. Save the user registers and process state
3. Determine that the interrupt was a page fault
4. Check that the page reference was legal and determine the location of the page on the disk
5. Issue a read from the disk to a free frame:
1. Wait in a queue for this device until the read request is serviced
2. Wait for the device seek and/or latency time
3. Begin the transfer of the page to a free frame
6. While waiting, allocate the CPU to some other user
7. Receive an interrupt from the disk I/O subsystem (I/O completed)
8. Save the registers and process state for the other user
9. Determine that the interrupt was from the disk
10. Correct the page table and other tables to show page is now in memory
11. Wait for the CPU to be allocated to this process again
12. Restore the user registers, process state, and new page table, and then resume the
interrupted instruction
Performance of Demand Paging (Cont.)
 Three major activities
 Service the interrupt – careful coding means just several hundred
instructions needed
 Read the page – lots of time
 Restart the process – again just a small amount of time
 Page Fault Rate: 0 ≤ p ≤ 1
 if p = 0 no page faults
 if p = 1, every reference is a fault
 Effective Access Time (EAT)
EAT = (1 – p) x memory access
+ p (page fault overhead
+ swap page out
+ swap page in )
Demand Paging Example
 Memory access time = 200 nanoseconds
 Average page-fault service time = 8 milliseconds
 EAT = (1 – p) x 200 + p x (8 milliseconds)
= (1 – p) x 200 + p x 8,000,000
= 200 + p x 7,999,800
 If one access out of 1,000 causes a page fault, then
EAT = 8.2 microseconds.
This is a slowdown by a factor of 40!!
 If want performance degradation < 10 percent
 220 > 200 + 7,999,800 x p
20 > 7,999,800 x p
 p < .0000025
 < one page fault in every 400,000 memory accesses
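The arithmetic above is easy to re-check in code; this small sketch re-derives the 8.2-microsecond figure and the page-fault rate needed to stay under 10 percent degradation:

#include <stdio.h>

int main(void)
{
    double mem_access = 200.0;          /* ns           */
    double fault_time = 8e6;            /* 8 ms, in ns  */

    double p   = 1.0 / 1000.0;          /* one fault per 1,000 accesses */
    double eat = (1.0 - p) * mem_access + p * fault_time;
    printf("EAT = %.1f ns (slowdown about x%.0f)\n", eat, eat / mem_access);

    /* For less than 10% degradation: EAT < 1.1 * mem_access */
    double p_max = (1.1 * mem_access - mem_access) / (fault_time - mem_access);
    printf("need p < %.7f (about 1 fault per %.0f accesses)\n",
           p_max, 1.0 / p_max);
    return 0;
}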
What Happens if There is no Free Frame?

 Used up by process pages


 Also in demand from the kernel, I/O buffers, etc
 How much to allocate to each?
 Page replacement – find some page in memory, but not really in
use, page it out
 Algorithm – terminate? swap out? replace the page?
 Performance – want an algorithm which will result in minimum
number of page faults
 Same page may be brought into memory several times
Page Replacement

 Prevent over-allocation of memory by modifying page-


fault service routine to include page replacement
 Use modify (dirty) bit to reduce overhead of page
transfers – only modified pages are written to disk
 Page replacement completes separation between logical
memory and physical memory – large virtual memory can
be provided on a smaller physical memory
Need For Page Replacement
Basic Page Replacement
1. Find the location of the desired page on disk

2. Find a free frame:


- If there is a free frame, use it
- If there is no free frame, use a page replacement algorithm to
select a victim frame
- Write victim frame to disk if dirty

3. Bring the desired page into the (newly) free frame; update the page
and frame tables

4. Continue the process by restarting the instruction that caused the trap

Note now potentially 2 page transfers for page fault – increasing EAT
Page Replacement
Page and Frame Replacement Algorithms

 Frame-allocation algorithm determines


 How many frames to give each process
 Which frames to replace
 Page-replacement algorithm
 Want lowest page-fault rate on both first access and re-access
 Evaluate algorithm by running it on a particular string of memory
references (reference string) and computing the number of page
faults on that string
 String is just page numbers, not full addresses
 Repeated access to the same page does not cause a page fault
 Results depend on number of frames available
 In all our examples, the reference string of referenced page
numbers is
7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
Graph of Page Faults Versus The Number of Frames
First-In-First-Out (FIFO) Algorithm
 Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
 3 frames (3 pages can be in memory at a time per process)

15 page faults
 Can vary by reference string: consider 1,2,3,4,1,2,5,1,2,3,4,5
 Adding more frames can cause more page faults!
 Belady’s Anomaly
 How to track ages of pages?
 Just use a FIFO queue
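A short simulation counts FIFO faults for any reference string and frame count; it reproduces the 15 faults quoted above and, on the second string, Belady's anomaly. This is a sketch, not tied to any particular operating system:

#include <stdio.h>

/* Count page faults for FIFO replacement with 'nframes' frames. */
int fifo_faults(const int *refs, int nrefs, int nframes)
{
    int frames[16];                     /* assumes nframes <= 16 */
    int next = 0, used = 0, faults = 0;

    for (int i = 0; i < nrefs; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (!hit) {
            faults++;
            if (used < nframes)
                frames[used++] = refs[i];   /* free frame available */
            else {
                frames[next] = refs[i];     /* evict the oldest page */
                next = (next + 1) % nframes;
            }
        }
    }
    return faults;
}

int main(void)
{
    int lecture[] = {7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1};
    int belady[]  = {1,2,3,4,1,2,5,1,2,3,4,5};
    int nl = (int)(sizeof lecture / sizeof lecture[0]);
    int nb = (int)(sizeof belady  / sizeof belady[0]);

    printf("lecture string, 3 frames: %d faults\n", fifo_faults(lecture, nl, 3)); /* 15 */
    printf("belady string,  3 frames: %d faults\n", fifo_faults(belady, nb, 3));  /* 9  */
    printf("belady string,  4 frames: %d faults\n", fifo_faults(belady, nb, 4));  /* 10: more frames, more faults */
    return 0;
}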
FIFO Illustrating Belady’s Anomaly
Optimal Algorithm
 Replace page that will not be used for longest period of time
 9 page faults is optimal for the example
 How do you know this?
 Can’t read the future
 Used for measuring how well your algorithm performs
Least Recently Used (LRU) Algorithm
 Use past knowledge rather than future
 Replace page that has not been used in the most amount of time
 Associate time of last use with each page

 12 faults – better than FIFO but worse than OPT


 Generally good algorithm and frequently used
 But how to implement?
LRU Algorithm (Cont.)
 Counter implementation
 Every page entry has a counter; every time page is referenced
through this entry, copy the clock into the counter
 When a page needs to be changed, look at the counters to find
smallest value
 Search through table needed
 Stack implementation
 Keep a stack of page numbers in a double link form:
 Page referenced:
 move it to the top
 requires 6 pointers to be changed
 But each update more expensive
 No search for replacement
 LRU and OPT are cases of stack algorithms that don’t have
Belady’s Anomaly
Use Of A Stack to Record Most Recent Page References
LRU Approximation Algorithms
 LRU needs special hardware and still slow
 Reference bit
 With each page associate a bit, initially = 0
 When page is referenced bit set to 1
 Replace any with reference bit = 0 (if one exists)
 We do not know the order, however
 Second-chance algorithm
 Generally FIFO, plus hardware-provided reference bit
 Clock replacement
 If page to be replaced has
 Reference bit = 0 -> replace it
 reference bit = 1 then:
– set reference bit 0, leave page in memory
– replace next page, subject to same rules
Second-Chance (clock) Page-Replacement Algorithm
Counting Algorithms

 Keep a counter of the number of references that have been made


to each page
 Not common

 Least Frequently Used (LFU) Algorithm: replaces page with


smallest count

 Most Frequently Used (MFU) Algorithm: based on the argument


that the page with the smallest count was probably just brought in
and has yet to be used
Applications and Page Replacement

 All of these algorithms have OS guessing about future page


access
 Some applications have better knowledge – i.e. databases
 Memory intensive applications can cause double buffering
 OS keeps copy of page in memory as I/O buffer
 Application keeps page in memory for its own work
 Operating system can give applications direct access to the disk, getting out
of the way of the applications
 Raw disk mode
 Bypasses buffering, locking, etc
Allocation of Frames
 Each process needs minimum number of frames
 Example: IBM 370 – 6 pages to handle SS MOVE instruction:
 instruction is 6 bytes, might span 2 pages
 2 pages to handle from
 2 pages to handle to
 Maximum of course is total frames in the system
 Two major allocation schemes
 fixed allocation
 priority allocation
 Many variations
Fixed Allocation
 Equal allocation – For example, if there are 100 frames (after
allocating frames for the OS) and 5 processes, give each process
20 frames
 Keep some as free frame buffer pool

 Proportional allocation – Allocate according to the size of process


 Dynamic as degree of multiprogramming, process sizes
change
si = size of process pi
S = Σ si
m = total number of frames
ai = allocation for pi = (si / S) × m

Example: with m = 62 free frames, s1 = 10 and s2 = 127:
a1 = (10 / 137) × 62 ≈ 4
a2 = (127 / 137) × 62 ≈ 57
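The same proportional calculation in code; truncating to an integer frame count, as the ≈ values above do, is a policy choice assumed here:

#include <stdio.h>

int main(void)
{
    int m = 62;                        /* free frames to distribute */
    int size[] = { 10, 127 };          /* s1, s2 */
    int nproc  = 2;

    int total = 0;
    for (int i = 0; i < nproc; i++)
        total += size[i];              /* S = sum of process sizes = 137 */

    for (int i = 0; i < nproc; i++) {
        int frames = size[i] * m / total;   /* a_i = (s_i / S) * m */
        printf("process %d gets %d frames\n", i + 1, frames);
    }
    return 0;
}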
Priority Allocation

 Use a proportional allocation scheme using priorities rather


than size

 If process Pi generates a page fault,


 select for replacement one of its frames
 select for replacement a frame from a process with lower
priority number
Global vs. Local Allocation
 Global replacement – process selects a replacement frame
from the set of all frames; one process can take a frame from
another
 But then process execution time can vary greatly
 But greater throughput so more common
a process cannot control its own page-fault rate.

 Local replacement – each process selects from only its own


set of allocated frames
 More consistent per-process performance
 But possibly underutilized memory
Under local replacement, the set of pages in memory for a process is affected by the
paging behavior of only that process. Local replacement might hinder a process,
however, by not making available to it other, less used pages of memory.

In fact, look at any process that does not have "enough" frames. If the
process does not have the number of frames it needs to support pages in
active use, it will quickly page-fault. At this point, it must replace some page.
However, since all its pages are in active use, it must replace a page that will
be needed again right away. Consequently, it quickly faults again, and again,
and again, replacing pages that it must bring back in immediately.
Thrashing
 If a process does not have “enough” pages, the page-fault rate is
very high
 Page fault to get page
 Replace existing frame
 But quickly need replaced frame back
 This leads to:
 Low CPU utilization
 Operating system thinking that it needs to increase the
degree of multiprogramming
 Another process added to the system

 Thrashing ⇒ a process is busy swapping pages in and out
This high paging activity is called thrashing. A process is thrashing if it is
spending more time paging than executing.
Thrashing (Cont.)
Demand Paging and Thrashing
 Why does demand paging work?
Locality model
 Process migrates from one locality to another
 Localities may overlap

 Why does thrashing occur?


 size of locality > total memory size
 Limit effects by using local or priority page replacement
Locality In A Memory-Reference Pattern
Working-Set Model
 Δ ≡ working-set window ≡ a fixed number of page references
Example: 10,000 instructions
 WSSi (working set of Process Pi) =
total number of pages referenced in the most recent Δ (varies in time)
 if Δ too small, will not encompass entire locality
 if Δ too large, will encompass several localities
 if Δ = ∞, will encompass entire program
 D = Σ WSSi ≡ total demand frames
 Approximation of locality
 if D > m ⇒ thrashing
 Policy: if D > m, then suspend or swap out one of the processes
Keeping Track of the Working Set
 Approximate with interval timer + a reference bit
 Example:  = 10,000
 Timer interrupts after every 5000 time units
 Keep in memory 2 bits for each page
 Whenever a timer interrupts copy and sets the values of all
reference bits to 0
 If one of the bits in memory = 1 ⇒ page in working set
 Why is this not completely accurate?
 Improvement = 10 bits and interrupt every 1000 time units
Page-Fault Frequency
 More direct approach than WSS
 Establish “acceptable” page-fault frequency (PFF) rate
and use local replacement policy
 If actual rate too low, process loses frame
 If actual rate too high, process gains frame
Working Sets and Page Fault Rates
 Direct relationship between working set of a process and its
page-fault rate
 Working set changes over time
 Peaks and valleys over time
Disk Scheduling

Disk scheduling refers to the method or algorithm used by operating systems to determine the order in which
read and write requests to a disk are processed. Since multiple requests can come from different applications,
the goal of disk scheduling is to optimize disk performance, minimize disk head movement, reduce seek time,
and ultimately improve system efficiency and responsiveness.
Overview of Mass Storage Structure
 Magnetic disks provide bulk of secondary storage of modern computers
 Drives rotate at 60 to 250 times per second
 Transfer rate is rate at which data flow between drive and computer
 Positioning time (random-access time) is time to move disk arm to
desired cylinder (seek time) and time for desired sector to rotate
under the disk head (rotational latency)
 Head crash results from disk head making contact with the disk
surface -- That’s bad
 Disks can be removable
 Drive attached to computer via I/O bus
 Busses vary, including EIDE, ATA, SATA, USB, Fibre Channel,
SCSI, SAS, Firewire
 Host controller in computer uses bus to talk to disk controller built
into drive or storage array
Moving-head Disk Mechanism

Seek Time:
This is the time it takes to move the disk arm (also known as the actuator arm) to the desired cylinder or track. The disk arm moves the
read/write head across the platters to reach the correct position for data access.
Rotational Latency:
This is the time it takes for the desired sector to rotate under the read/write head once the disk arm has reached the correct track. It
depends on the rotational speed of the disk, measured in revolutions per minute (RPM).
Disk Scheduling
 The operating system is responsible for using hardware
efficiently — for the disk drives, this means having a fast
access time and disk bandwidth
 Minimize seek time
 Seek time ≈ seek distance
 Disk bandwidth is the total number of bytes transferred,
divided by the total time between the first request for service
and the completion of the last transfer
Disk Scheduling (Cont.)
 There are many sources of disk I/O request
 OS
 System processes
 User processes
 I/O request includes input or output mode, disk address, memory
address, number of sectors to transfer
 OS maintains queue of requests, per disk or device
 Idle disk can immediately work on I/O request, busy disk means
work must queue
 Optimization algorithms only make sense when a queue exists
Disk Scheduling (Cont.)
 Note that drive controllers have small buffers and can manage a
queue of I/O requests (of varying “depth”)
 Several algorithms exist to schedule the servicing of disk I/O
requests
 The analysis is true for one or many platters
 We illustrate scheduling algorithms with a request queue (0-199)

98, 183, 37, 122, 14, 124, 65, 67


Head pointer 53
FCFS
This algorithm is intrinsically fair, but it generally does not provide the fastest service.
With a different ordering of the requests, the total head movement could be decreased substantially, and performance could thereby be improved.
Illustration shows total head movement of 640 cylinders
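The 640-cylinder figure is simply the sum of absolute seek distances along the FCFS order. The sketch below computes it for the request queue above; an SSTF or SCAN order could be checked the same way by reordering the array:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int queue[] = { 98, 183, 37, 122, 14, 124, 65, 67 };  /* FCFS service order */
    int n = (int)(sizeof queue / sizeof queue[0]);
    int head = 53;                      /* initial head position */
    int total = 0;

    for (int i = 0; i < n; i++) {
        total += abs(queue[i] - head);  /* seek distance for this request */
        head = queue[i];
    }
    printf("total head movement = %d cylinders\n", total);   /* prints 640 */
    return 0;
}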
SSTF
 Shortest Seek Time First selects the request with the minimum
seek time from the current head position
 SSTF scheduling is a form of SJF scheduling; may cause
starvation of some requests
 Illustration shows total head movement of 236 cylinders
SCAN

 The disk arm starts at one end of the disk, and moves toward the
other end, servicing requests until it gets to the other end of the
disk, where the head movement is reversed and servicing
continues.
 The SCAN algorithm is sometimes called the elevator algorithm
 Illustration shows total head movement of 236 cylinders
 But note that if requests are uniformly dense, largest density at
other end of disk and those wait the longest
SCAN (Cont.)
C-SCAN
 Provides a more uniform wait time than SCAN
 The head moves from one end of the disk to the other, servicing
requests as it goes
 When it reaches the other end, however, it immediately
returns to the beginning of the disk, without servicing any
requests on the return trip
 Treats the cylinders as a circular list that wraps around from the
last cylinder to the first one
 Total number of cylinders?
C-SCAN (Cont.)
C-LOOK
 LOOK a version of SCAN, C-LOOK a version of C-SCAN
 Arm only goes as far as the last request in each direction,
then reverses direction immediately, without first going all
the way to the end of the disk
 Total number of cylinders?
C-LOOK (Cont.)
