Operating System
Second Edition
Rohit Khurana
Founder and CEO
ITLESL, Delhi
VIKAS® PUBLISHING HOUSE PVT LTD
E-28, Sector-8, Noida-201301 (UP) India
Phone: +91-120-4078900 • Fax: +91-120-4078999
Registered Office: 576, Masjid Road, Jangpura, New Delhi-110014. India
E-mail: helpline@vikaspublishing.com
• Website: www.vikaspublishing.com
• Ahmedabad : 305, Grand Monarch, 100 ft Shyamal Road, Near Seema Hall, Ahmedabad-380051 • Ph. +91-79-65254204, +91-9898294208
• Bengaluru : First Floor, N S Bhawan, 4th Cross, 4th Main, Gandhi Nagar, Bengaluru-560009 • Ph. +91-80-22281254, 22204639
• Chennai : E-12, Nelson Chambers, 115, Nelson Manickam Road, Aminjikarai, Chennai-600029 • Ph. +91-44-23744547, 23746090
• Hyderabad : Aashray Mansion, Flat-G (G.F.), 3-6-361/8, Street No. 20, Himayath Nagar, Hyderabad-500029 • Ph. +91-40-23269992 • Fax. +91-40-23269993
• Kolkata : 82, Park Street, Kolkata-700017 • Ph. +91-33-22837880
• Mumbai : 67/68, 3rd Floor, Aditya Industrial Estate, Chincholi Bunder, Behind Balaji International School & Evershine Mall, Malad (West), Mumbai-400064 • Ph. +91-22-28772545, 28768301
• Patna : Flat No. 101, Sri Ram Tower, Beside Chiraiyatand Over Bridge, Kankarbagh Main Rd., Kankarbagh, Patna-800020 • Ph. +91-612-2351147
Operating System
ISBN: 978-93259-7563-7
Key Additions
Though several enhancements have been made to the text, the following are the key additions in this new edition.
• Chapter 1 introduces two more types of operating systems: Personal Computer (PC) Operating System and Mobile Operating System.
• Chapter 2 now also includes the methods used for
communication in client-server systems: Socket, Remote
Procedure Call (RPC) and Remote Method Invocation (RMI).
• The topics Thread Library and Thread Scheduling have been
added in Chapters 3 and 4, respectively.
• A few topics, including Principles of Concurrency, Precedence
Graph, Concurrency Conditions and Sleeping Barber Problem
have been added in Chapter 5.
• Chapter 7 includes an additional topic, Structure of Page Tables.
• Chapter 8 introduces topics like Demand Segmentation and
Cache Memory Organization.
• Chapter 9 now covers the concept of STREAMS. It also throws light on how I/O requests from users are transformed into hardware operations.
• Chapters 10 and 11 add topics such as Disk Attachment, Stable
Storage, Tertiary Storage, Record Blocking and File Sharing.
• The text of Chapter 13 has been completely overhauled and includes new topics, such as Goals and Principles of Protection, Access Control Matrix and its Implementation, Revocation of Access Rights, Cryptography, Trusted Systems, and Firewalls.
Chapter Organization
The text is organized into 17 chapters.
• Chapter 1 introduces the operating system, its services, and its structure. It also provides an insight into the organization of a computer system.
• Chapter 2 deals essentially with basic concepts of processes,
such as process scheduling, operations on processes and
communication between processes. It also introduces the
methods used for communication in client-server systems.
• Chapter 3 helps to understand the need and advantages of
threads, various multithreading models as well as threading
issues. It also introduces the concept of thread libraries and
discusses various operations on the threads of Pthread library.
• Chapter 4 spells out the scheduling criteria and different types of
scheduling algorithms. It also discusses several issues regarding
scheduling in multiprocessor and real-time systems.
• Chapter 5 throws light on several methods used for achieving
synchronization among cooperating processes.
• Chapter 6 describes the deadlock situation and the conditions
that lead to deadlock. It also provides methods for handling
deadlock.
• Chapter 7 familiarizes the reader with the various memory management strategies used for contiguous and non-contiguous memory allocation.
• Chapter 8 introduces the concept of virtual memory. It also
discusses how virtual memory is implemented using demand
paging and demand segmentation.
• Chapter 9 discusses system I/O in detail, including the I/O system design, interfaces, and functions. It also explains the STREAMS mechanism of UNIX System V and the transformation of I/O requests into hardware operations.
• Chapter 10 explains disk scheduling algorithms, disk
management, swap-space management and RAID. It also
introduces the concept of stable and tertiary storage.
• Chapter 11 acquaints the readers with basic concepts of files
including file types, attributes, operations, structure, and access
methods. It also describes the concepts of file-system mounting,
file sharing, record blocking and protection.
• Chapter 12 explores how the files and directories are
implemented. Management of free space on the disk is explained
as well.
• Chapter 13 apprises the reader of the need for security and protection in computer systems. It also explains the methods used to implement them.
• Chapter 14 sheds light on multiprocessor and distributed systems
including their types, architecture and benefits. It also describes
the distributed file system.
• Chapter 15 covers the UNIX operating system including its
development and structure. It discusses how processes,
memory, I/O, files, and directories are managed in UNIX. It also
introduces elementary shell programming.
• Chapter 16 presents an in-depth examination of the Linux
operating system. It describes how theoretical concepts of
operating system relate to one another and to practice.
• Chapter 17 expounds on the implementation of various operating system concepts in the Windows 2000 operating system.
Acknowledgement
Rohit Khurana
Founder and CEO
ITLESL, Delhi
Contents
2. Process Management
2.1 Introduction
2.2 Process Concept
2.2.1 The Process
2.2.2 Process States
2.2.3 Process Control Block (PCB)
2.3 Process Scheduling
2.4 Operations on Processes
2.4.1 Process Creation
2.4.2 Process Termination
2.5 Cooperating Processes
2.6 Inter-process Communication
2.6.1 Shared Memory Systems
2.6.2 Message Passing Systems
2.7 Communication in Client-Server Systems
2.7.1 Socket
2.7.2 Remote Procedure Call (RPC)
2.7.3 Remote Method Invocation (RMI)
Let us Summarize
Exercises
3. Threads
3.1 Introduction
3.2 Thread Concept
3.2.1 Advantages of Threads
3.2.2 Implementation of Threads
3.3 Multithreading Models
3.3.1 Many-to-One (M:1) Model
3.3.2 One-to-One (1:1) Model
3.3.3 Many-to-Many (M:M) Model
3.4 Threading Issues
3.4.1 fork() and exec() System Calls
3.4.2 Thread Cancellation
3.4.3 Thread-specific Data
3.5 Thread Libraries
3.5.1 Pthreads Library
Let us Summarize
Exercises
4. CPU Scheduling
4.1 Introduction
4.2 Scheduling Concepts
4.2.1 Process Behaviour
4.2.2 When to Schedule
4.2.3 Dispatcher
4.3 Scheduling Criteria
4.4 Scheduling Algorithms
4.4.1 First-Come First-Served (FCFS) Scheduling
4.4.2 Shortest Job First (SJF) Scheduling
4.4.3 Shortest Remaining Time Next (SRTN) Scheduling
4.4.4 Priority-based Scheduling
4.4.5 Highest Response Ratio Next (HRN) Scheduling
4.4.6 Round Robin (RR) Scheduling
4.4.7 Multilevel Queue Scheduling
4.4.8 Multilevel Feedback Queue Scheduling
4.5 Multiple Processor Scheduling
4.6 Real-time Scheduling
4.6.1 Hard Real-time Systems
4.6.2 Soft Real-time Systems
4.7 Algorithm Evaluation
4.8 Thread Scheduling
Let us Summarize
Exercises
5. Process Synchronization
5.1 Introduction
5.2 Principles of Concurrency
5.3 Precedence Graph
5.4 Critical Regions
5.4.1 Critical-Section Problem
5.5 Synchronization: Software Approaches
5.5.1 Strict Alternation: Attempt for Two-Process Solution
5.5.2 Dekker’s Algorithm: Two-Process Solution
5.5.3 Peterson’s Algorithm: Two-Process Solution
5.5.4 Bakery Algorithm: Multiple-Process Solution
5.6 Synchronization Hardware
5.7 Semaphores
5.8 Classical Problems of Synchronization
5.8.1 Producer-Consumer Problem
5.8.2 Readers-Writers Problem
5.8.3 Dining-Philosophers Problem
5.8.4 Sleeping Barber Problem
5.9 Monitors
5.10 Message Passing
Let us Summarize
Exercises
6. Deadlock
6.1 Introduction
6.2 System Model
6.3 Deadlock Characterization
6.3.1 Deadlock Conditions
6.3.2 Resource Allocation Graph
6.4 Methods for Handling Deadlocks
6.5 Deadlock Prevention
6.6 Deadlock Avoidance
6.6.1 Resource Allocation Graph Algorithm
6.6.2 Banker’s Algorithm
6.7 Deadlock Detection
6.7.1 Single Instance of Each Resource Type
6.7.2 Multiple Instances of a Resource Type
6.8 Deadlock Recovery
6.8.1 Terminating the Processes
6.8.2 Preempting the Resources
Let us Summarize
Exercises
8. Virtual Memory
8.1 Introduction
8.2 Background
8.3 Demand Paging
8.3.1 Performance of Demand Paging
8.4 Process Creation
8.4.1 Copy-on-Write
8.4.2 Memory-Mapped Files
8.5 Page Replacement
8.5.1 FIFO Page Replacement
8.5.2 Optimal Page Replacement
8.5.3 LRU Page Replacement
8.5.4 Second Chance Page Replacement
8.5.5 Counting-Based Page Replacement Algorithm
8.6 Allocation of Frames
8.7 Thrashing
8.7.1 Locality
8.7.2 Working Set Model
8.7.3 Page-fault Frequency (PFF)
8.8 Demand Segmentation
8.9 Cache Memory Organization
8.9.1 Terminologies Related to Cache
8.9.2 Impact on Performance
8.9.3 Advantages and Disadvantages of Cache Memory
Let us Summarize
Exercises
9. I/O Systems
9.1 Introduction
9.2 I/O Hardware
9.3 I/O Techniques
9.3.1 Polling
9.3.2 Interrupt-driven I/O
9.3.3 Direct Memory Access (DMA)
9.4 Application I/O Interface
9.5 Kernel I/O Subsystem
9.5.1 I/O Scheduling
9.5.2 Buffering
9.5.3 Caching
9.5.4 Spooling
9.5.5 Error Handling
9.6 Transforming I/O Requests to Hardware Operations
9.7 Streams
9.8 Performance
Let us Summarize
Exercises
Glossary
chapter 1
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Define the term operating system along with its objectives and
functions.
⟡ Understand different views of an operating system.
⟡ Explore how operating systems have evolved.
⟡ Discuss different types of operating systems and compare them.
⟡ Describe the basic computer system organization and architecture.
⟡ Describe operations of an operating system.
⟡ Understand the components of an operating system.
⟡ List the services provided by an operating system.
⟡ Describe the two types of user interface.
⟡ Explain different types of system calls.
⟡ List different categories of system programs.
⟡ Discuss various ways of structuring an operating system.
⟡ Understand the concept of virtual machines.
1.1 INTRODUCTION
A computer system consists of two main components: the hardware
and the software. The hardware components include the central
processing unit (CPU), the memory, and the input/output (I/O)
devices. The software part comprises the system and application
programs such as compilers, text editors, word processors,
spreadsheets, database systems, etc. An application program is
developed by an application programmer in some programming
language. The application programmers and the end users (users
who interact with the application programs to solve their problems)
are generally not concerned with the details of the computer
hardware, and hence do not directly interact with it. Thus, to use the
various hardware components, the application programs and the
users need an intermediate layer that provides a convenient interface
to use the system. This layer is referred to as an operating system
(OS).
User View
In a stand-alone environment, where a single user sits in front of a personal computer, the operating system is designed primarily for ease of use, with some attention also paid to system performance. Since such a system is designed for a single user who monopolizes its resources, there is no sharing of hardware and software among multiple users, and therefore no attention is paid to resource utilization.
In a networked environment, where multiple users share
resources and may exchange information, the operating system is
designed for resource utilization. In this case, the operating system
ensures that the available processor time, memory, and I/O devices
are used efficiently, and no individual user tries to monopolize the
system resources. When the users are connected to a mainframe or a minicomputer via their terminals, no attention is paid to the usability of individual systems. However, when the users are connected to servers via their workstations, a compromise between individual usability and resource utilization is made while designing the operating system.
In handheld systems, the operating system is basically designed
for individual usability as these systems are mostly stand-alone units
for individual users. Finally, for the computers that have little or no
user view such as embedded systems, the operating system is
basically designed to ensure that these systems will run without user
intervention.
System View
As discussed earlier, the computer system consists of many
resources such as CPU time, memory, and I/O devices, which are
required to solve a computing problem. It is the responsibility of the
operating system to manage these resources, and allocate them to
various programs and users in a way such that the computer system
can be operated in an efficient and fair manner. Thus, from the
system’s point of view, the operating system primarily acts as a
resource allocator.
The operating system also acts as a control program that
manages the execution of user programs to avoid errors and
improper use of computer system. It also controls the I/O devices and
their operations.
1.4 EVOLUTION OF OPERATING SYSTEMS
The operating system may process its work serially or concurrently. That is, it can dedicate all the computer resources to a single program until the program finishes, or it can dynamically assign the resources to various currently active programs. The execution of multiple programs in an interleaved manner is known as multiprogramming. In this section, we will discuss how operating systems have evolved from serial processing to multiprogramming systems.
1.4.3 Multiprogramming
Though the batch processing system attempted to utilize computer resources such as the CPU and I/O devices efficiently, it still dedicated all resources to a single job at a time. The execution of a single job cannot keep the CPU and I/O devices busy at all times because, during execution, a job sometimes requires the CPU and sometimes the I/O devices, but not both at the same time. Hence, when the CPU is busy, the I/O devices have to wait, and when the I/O devices are busy, the CPU remains idle.
For example, consider two jobs P1 and P2, both of which require CPU time and I/O time alternately. The serial execution of P1
and P2 is shown in Figure 1.4 (a). The shaded boxes show the CPU
activity of the jobs, and white boxes show their I/O activity. It is clear
from the figure that when P1 is busy in its I/O activity, the CPU is idle
even if P2 is ready for execution.
The idle time of CPU and I/O devices can be reduced by using
multiprogramming that allows multiple jobs to reside in the main
memory at the same time. If one job is busy with I/O devices, CPU
can pick another job and start executing it. To implement
multiprogramming, the memory is divided into several partitions,
where each partition can hold only one job. The jobs are organized in
such a way that the CPU always has one job to execute. This
increases the CPU utilization by minimizing the CPU idle time.
The basic idea behind multiprogramming is that the operating
system loads multiple jobs into the memory from the job pool on the
disk. It then picks up one job among them and starts executing it.
When this job needs to perform the I/O activity, the operating system
simply picks up another job, and starts executing it. Again when this
job requires the I/O activity, the operating system switches to the third
job, and so on. When a job finishes its I/O activity, it gets the CPU back. Therefore, as long as there is at least one job to
execute, the CPU will never remain idle. The memory layout for a
multiprogramming batched system is shown in Figure 1.3.
Process Management
The basic concept supported by almost all the operating systems is
the process. A process is a program under execution or we can say
an executing set of machine instructions. A program by itself does
nothing; it is a passive entity. In contrast, a process is an active entity
that executes the instructions specified in the program. It is
associated with a program counter that specifies the instruction to be
executed next. The instructions of a process are executed
sequentially, that is, one after another until the process terminates. It
is not necessary for a program to have only a single process; rather, it may be associated with many processes. However, the different processes associated with the same program are treated as separate execution sequences. Furthermore, a process may spawn
several other processes (known as child processes) during
execution. These child processes may in turn create other child
processes, resulting in a process tree.
A process is intended to perform a specific task. To do its
intended task, each process uses some resources during its lifetime,
such as memory, files, CPU time, and I/O devices. These resources
can be allocated to the process either at the time of its creation or
during its execution. In addition to resources, some processes may
also need certain input when they are created. For example, if a
process is created to open a specific file, then it is required to provide
the desired file name as input to the process so that the process
could execute the appropriate instructions and system calls to
accomplish its task. After the process is terminated, the reusable resources (if any) are reclaimed by the operating system.
A process can be either a system process executing the system’s
code or a user process executing the user’s code. Usually, a system
contains a collection of system and user processes. These processes
can be made to execute concurrently by switching the CPU among
them. In relation to process management, the responsibilities of an
operating system are as follows:
• to create and delete processes (including both user and system
processes),
• to suspend the execution of a process temporarily and later
resume it,
• to facilitate communication among processes by providing
communication mechanisms, and
• to provide mechanisms for dealing with deadlock.
Note: All the concepts related to process management are discussed
in Chapters 2 through 6.
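To make the idea of process creation concrete, here is a minimal sketch using the POSIX fork() and wait() calls (an illustrative example, not taken from this text; process creation itself is covered in detail in Chapter 2). A parent process spawns a child and waits for it to terminate:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();               /* create a child process */
    if (pid < 0) {                    /* fork failed */
        perror("fork");
        exit(1);
    } else if (pid == 0) {            /* executed by the child */
        printf("Child: my PID is %d\n", (int)getpid());
        exit(0);
    } else {                          /* executed by the parent */
        waitpid(pid, NULL, 0);        /* wait for the child to terminate */
        printf("Parent: child %d has terminated\n", (int)pid);
    }
    return 0;
}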
Memory Management
A computer system usually uses main memory and secondary
storage. The main memory is central to the operation of a computer
system. It is a huge collection of words or bytes, which may range
from hundreds of thousands to billions in size, and each byte or word
has a unique address. It holds the instructions and data currently
being processed by the CPU, the result of intermediate calculations,
and the recently processed data. It is the only storage that is directly
accessible to the CPU.
Whenever a program is to be executed, it is allocated space into
the main memory. As it executes, it accesses the data and
instructions from the main memory. After the program has been
executed, the memory space allocated to it is de-allocated and
declared available for some other process. Keeping only a single process in memory in this manner leads to inefficient memory utilization. To improve the utilization, multiprogramming is used, in which multiple
programs are allowed to reside in the main memory at the same time.
However, in this case, the operating system needs more
sophisticated memory management techniques as compared to the
single-user environment. It is the responsibility of the memory
manager to manage memory between multiple programs in an
efficient way. In relation to memory management, the responsibilities
of an operating system are as follows:
• to allocate and deallocate memory space as and when required,
• to make a decision on which of the processes (ready for
execution) should be allocated memory when it is available, and
• to keep track of the parts of memory that have been allocated and
to which processes.
Usually, the capacity of the main memory is limited and not
enough to accommodate all data and programs in a typical computer.
Moreover, all the data stored is lost when power is lost or switched
off. Thus, it is required to use secondary storage in a computer
system to back up main memory. The most commonly used
secondary storage in computer systems is the disk. It stores most programs, such as compilers, sort routines, assemblers, etc. These
programs are loaded into the main memory when needed and
otherwise kept stored on the disk. Thus, it is important for a computer system that the disk storage be used efficiently. In relation to
disk management, the responsibilities of an operating system are as
follows:
• to allocate space on disk,
• to manage the unused (free) space available on disk, and
• to perform disk scheduling.
Note: Various memory management strategies are discussed in
Chapters 7 and 8. The disk management techniques are discussed in
Chapter 10.
File Management
Another important component of all the operating systems is file
management, which deals with the management and organization of
various files in the system. As a computer system
consists of various storage devices such as hard disks, floppy disks,
compact discs, and so on, the operating system provides an abstract
view of these devices by hiding their internal structure so that the
users can directly access the data (on physical devices) without
exactly knowing where and how the data is actually stored.
The operating system defines a logical storage unit known as a
file, and all the data is stored in the form of files. Each file is
associated with some attributes such as its name, size, type, location,
date and time, etc. The users can perform various operations on files
such as create, delete, read, write, open, seek, rename, append, and
close. The operating system handles these operations with the help of system calls.
To organize the files in a systematic manner, the operating system
provides the concept of directories. A directory can be defined as a
way of grouping files together. The directories are organized in a
hierarchical manner, which allows users to have subdirectories under
their directories, thus making the file system more logical and
organized. In relation to file management, the responsibilities of an
operating system are as follows:
• to create and delete files and directories,
• to back up the files onto some stable storage media,
• to map files onto secondary storage, and
• to provide primitives that enable one to manipulate the contents of
files and directories.
Note: The file management techniques are discussed in Chapters 11
and 12.
I/O Management
A computer system consists of several I/O devices such as keyboard,
monitor, printer, and so on. It is the responsibility of an operating
system to control and manage these devices. The operating system
also provides a device-independent interface between the devices
and the users so that the users can issue commands to use these
devices without actually knowing how their commands are being
executed. To hide the details of different devices, operating system designers let the kernel use device-driver modules, which present
a uniform device-access interface. The I/O management is discussed
in Chapter 9.
Command-line Interface
As mentioned, this interface enables users to interact with the
operating system by typing commands. These commands are then
interpreted and executed in order to provide a response to the user. MS-DOS is the most commonly used operating system that provides a command-line interface. Figure 1.10 shows the MS-DOS command-line interface. Some operating systems provide more than one command-line interface; on such systems, the command-line interfaces are called shells. For example, UNIX provides the C shell, Bourne shell, Korn shell, etc.
Fig. 1.10 MS-DOS Command-Line Interface
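To illustrate what a command-line interpreter does internally, the following is a minimal UNIX-style sketch (an illustrative assumption, not the actual code of MS-DOS or of any UNIX shell). It repeatedly reads a command name (without arguments, for brevity), runs it in a child process, and waits for it to finish:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    char line[256];
    for (;;) {
        printf("> ");                          /* display the prompt */
        fflush(stdout);
        if (fgets(line, sizeof line, stdin) == NULL)
            break;                             /* end of input */
        line[strcspn(line, "\n")] = '\0';      /* strip the newline */
        if (strcmp(line, "exit") == 0)
            break;
        pid_t pid = fork();                    /* child executes the command */
        if (pid == 0) {
            execlp(line, line, (char *)NULL);  /* replace the child's image */
            perror("exec");                    /* reached only if exec fails */
            exit(1);
        }
        waitpid(pid, NULL, 0);                 /* interpreter waits for completion */
    }
    return 0;
}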
Simple Structure
Early operating systems were developed with an elementary
approach without much concern about the structure. In this approach,
the structure of the operating systems was not well-defined. The
operating systems were monolithic, written as a collection of
procedures where each procedure is free to call any other procedure.
An example of an operating system designed with this approach is MS-DOS. Initially, MS-DOS was designed as a small and simple system with limited scope, but it grew beyond its scope with time.
It was designed with the idea of providing more functionality within
less space; therefore, it was not carefully divided into modules. Figure
1.12 shows the structure of the MS-DOS system.
MS-DOS has limited structuring; there is no clear separation between its different interfaces and levels of functionality.
For example, application programs can directly call the basic I/O
routines to read/write data on disk instead of going through a series
of interfaces. This exposure makes the MS-DOS system susceptible to malicious programs that may lead to system crashes. Moreover, due
to the lack of hardware protection and dual-mode operation in the
Intel 8088 system (for which MS-DOS system was developed), the
base hardware was directly accessible to the application programs.
Fig. 1.12 Structure of the MS-DOS System
Layered Approach
In the layered approach, the operating system is organized as a
hierarchy of layers with each layer built on the top of the layer below
it. The topmost layer is the user interface, while the bottommost layer
is the hardware. Each layer has a well-defined function and
comprises data structures and a set of routines. The layers are
constructed in such a manner that a typical layer (say, layer n) is able
to invoke operations on its lower layers and the operations of layer n
can be invoked by its higher layers.
Fig. 1.13 Layers in THE System
Microkernels
Initially, the size of the kernel was small; the era of large monolithic kernels began with Berkeley UNIX (BSD). A monolithic kernel runs every
basic system service like scheduling, inter-process communication,
file management, process and memory management, device
management, etc., in the kernel space itself. The inclusion of all basic
services in kernel space increased the size of the kernel. In addition,
these kernels were difficult to extend and maintain. The addition of
new features required the recompilation of the whole kernel, which
was time and resource consuming.
To overcome these problems, an approach called microkernel was developed that emphasized modularizing the kernel. The idea is to remove the less essential components from the kernel and keep only a subset of the mechanisms typically included in a kernel, thereby reducing both its size and the number of system calls. The
components moved outside the kernel are implemented either as
system- or user-level programs. The Mach system and Mac OS X are examples of operating systems designed with the microkernel approach.
The main advantage of the microkernel approach is that the
operating system can be extended easily; the addition of new
services in the user space does not cause any changes at the kernel
level. In addition, microkernel offers high security and reliability as
most services run as user processes rather than kernel processes.
Thus, if any of the running services fail, the rest of the system
remains unaffected.
Note: Though in the microkernel approach the size of the kernel is reduced, there remains the issue of deciding which services are to be included in the kernel and which are to be implemented at the user level.
Modules
The module approach employs object-oriented programming
techniques to design a modular kernel. In this approach, the
operating system is organized around a core kernel and other
loadable modules that can be linked dynamically with the kernel
either at boot time or at run time. The idea is to have the kernel provide only core services, while other services can be added dynamically. An example of a module-based operating system is
Solaris, which consists of core kernel and seven loadable kernel
modules: scheduling classes, file systems, loadable system calls,
executable formats, streams modules, miscellaneous, and device and
bus drivers.
The modular approach is similar to the layered approach in the
sense that each kernel module has well-defined interfaces. However,
it is more flexible than the layered approach as each module is free to
call any other module.
LET US SUMMARIZE
1. The operating system is defined as a program that is running at all times
on the computer (usually called the kernel). It acts as an interface
between the computer users and the computer hardware.
2. An operating system performs two basically unrelated functions: extending
the machine and managing resources. It can be described as an
extended machine that hides the hardware details from the user and
makes the computer system convenient and easy to use. It can also be
described as a resource manager that manages all the computer
resources efficiently.
3. The role of an operating system can be more fully understood by exploring
it from two different viewpoints: the user point of view and the system
point of view.
4. In serial processing, all the users are allowed to access the computer
sequentially (one after the other).
5. In the batch processing system, the operator used to batch together the
jobs with similar requirements, and run these batches one by one.
6. Multiprogramming allows multiple jobs to reside in the main memory at the
same time. If one job is busy with I/O devices, the CPU can pick another
job and start executing it. Thus, jobs are organized in such a way that the
CPU always has one to execute. This increases the CPU utilization by
minimizing the CPU idle time.
7. The number of jobs competing to get the system resources in a
multiprogramming environment is known as degree of multiprogramming.
8. An extension of multiprogramming systems is time-sharing systems (or
multitasking) in which multiple users are allowed to interact with the
system through their terminals.
9. The CPU in time-sharing systems switches so rapidly from one user to
another that each user gets the impression that only he or she is working
on the system, even though the system is being shared among multiple
users.
10. Different types of operating systems include batch operating systems,
multiprogramming operating systems, time-sharing systems, real-time
operating system, distributed operating system, PC operating systems,
and mobile operating systems.
11. A computer system basically consists of one or more processors (CPUs),
several device controllers, and the memory. All these components are
connected through a common bus that provides access to shared
memory. Each device controller acts as an interface between a particular
I/O device and the operating system.
12. When the system boots up, the initial program that runs on the system is
known as bootstrap program.
13. The event notification is done with the help of an interrupt that is fired
either by the hardware or the software.
14. Whenever an interrupt is fired, the CPU stops executing the current task,
and jumps to a predefined location in the kernel’s address space, which
contains the starting address of the service routine for the interrupt
(known as interrupt handler).
15. Whenever a program needs to be executed, it must be first loaded into the
main memory (called random-access memory or RAM). Two instructions,
namely, load and store are used to interact with the memory.
16. Secondary storage is nonvolatile in nature, that is, the data is permanently
stored and survives power failure and system crashes. Magnetic disk
(generally called disk) is the primary form of secondary storage that
enables the storage of enormous amount of data.
17. Handling I/O devices is one of the main functions of an operating system.
A significant portion of code of operating system is dedicated to manage
I/O.
18. Single-processor systems consist of one main CPU that can execute a
general-purpose instruction set, which includes instructions from user
processes. Other than the one main CPU, most systems also have some
special-purpose processors.
19. The multiprocessor systems (also known as parallel systems or tightly
coupled systems) consist of multiple processors in close communication
in a sense that they share the computer bus and even the system clock,
memory, and peripheral devices.
20. A clustered system is another type of system with multiple CPUs. In
clustered systems, two or more individual systems (called nodes) are
grouped together to form a cluster that can share storage and are closely
linked via high-speed LAN (local area network).
21. The part of the operating system called interrupt service routine (ISR)
executes the appropriate code segment to deal with the interrupt.
22. In order to ensure the proper functioning of the computer system, the operating system and all other programs and their data must be protected against incorrect programs. To achieve this protection, two
modes of operations, namely, user mode and monitor mode (also known
as supervisor mode, system mode, kernel mode, or privileged mode) are
specified.
23. It is necessary to prevent a user program from gaining control of the system for an infinite time. For this, a timer is maintained, which interrupts
the system after a specified period.
24. Though the structure of all systems may differ, the common goal of most
systems is to support the system components including process
management, memory management, file management, I/O management,
and protection and security.
25. One of the major responsibilities of the operating system is to provide an
environment for an efficient execution of user programs. For this, it
provides certain services to the programs and the users. These services
are divided into two sets. One set of services exists for the convenience of
users, and another set of services ensures the efficient operations of the
system in a multiprogramming environment.
26. Providing an interface to interact with the users is essential for an
operating system. There are two types of user interface: command-line
interface and graphical user interface (GUI).
27. All the system calls provided by an operating system can be roughly grouped into the following five major categories: process management, file management, device management, information maintenance, and communication.
28. The system programs act as an interface between the operating system
and the application programs. They provide an environment in which
application programs can be developed and executed in a convenient
manner.
29. Every operating system has its own internal structure in terms of file
arrangement, memory management, storage management, etc., and the
entire performance of the system depends on its structure. Various
system structures have evolved with time including simple structure,
layered structure, microkernel, and modules.
30. A virtual machine is an identical copy of the bare hardware including CPU, disks, I/O devices, interrupts, etc. It allows each user to run the operating system or software packages of his or her choice on a single machine, thereby creating the illusion that each user has his or her own machine.
EXERCISES
Fill in the Blanks
1. To achieve protection, some of the machine instructions that may cause harm are designated as _____________.
2. When the system gets started, it is in _____________ mode.
3. The memory-resident portion of the batch operating system is known as
_____________ .
4. The number of jobs competing to get the system resources in
multiprogramming environment is known as _____________ .
5. The lowest level layer of a computer system is _____________ .
Descriptive Questions
1. What is an operating system? Give the view of an OS as a resource
manager.
2. How does a computer system handle interrupts? Discuss how interrupts can be handled quickly.
3. Discuss the storage structure of a computer system.
4. How is an I/O operation handled by the system?
5. What do you mean by parallel clustering?
6. Describe briefly how operating systems have evolved from serial processing to multiprogramming systems.
7. Write short notes on the following:
(a) Multiprogramming
(b) Time-sharing systems
(c) Dual-mode operation
(d) Command-line interface
(e) Microkernel
(f) Virtual machines
8. Compare and contrast the different types of operating systems.
9. Discuss the various services that the operating system should provide.
10. Why is maintaining a timer important?
11. How is a GUI better than a command-line interface?
12. What are system calls? Describe the use of system calls with the help of
an example.
13. Explain various categories of system programs.
14. Discuss various system structures that have evolved with time.
chapter 2
Process Management
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the basic concepts of processes.
⟡ Discuss various states of a process and the transition between these
states.
⟡ Define the term process scheduling.
⟡ Explain various operations that can be performed on processes.
⟡ Understand the concept of cooperating process.
⟡ Provide an overview of inter-process communication.
⟡ Explain different mechanisms used for communication in client-
server environment.
2.1 INTRODUCTION
Earlier, only one program could be loaded into the main memory for execution at a time. This program had access to all the resources of the computer, such as memory, CPU time, I/O devices, and so on. As time went by, newer techniques incorporated a variety of novel and powerful features that dramatically improved the efficiency and functionality of the overall system. Modern computer systems support multiprogramming, which allows a number of programs to reside in the main memory at the same time. These programs can be executed concurrently, thereby requiring the system resources to be shared among them. Multiprogrammed systems need to distinguish among the multiple executing programs, and this is accomplished with the concept of a process (also called a task on some systems).
When multiple processes run on a system concurrently and more than one process requires the CPU at the same time, it becomes essential to select one process to which the CPU can be allocated. To serve this purpose, scheduling is required. Moreover, the multiple processes running on a system may also need to communicate with one another in order to exchange data or information. Such communication between processes is referred to as inter-process communication (IPC).
Scheduling Queues
For scheduling purposes, there exist different queues in the system;
these are as follows:
• Job queue: As the processes enter the system for execution, they are kept in a queue called the job queue (or input queue) on a mass storage device such as a hard disk.
• Ready queue: From the job queue, the processes which are
ready for the execution are shifted into the main memory. In the
main memory, these processes are kept into a queue called ready
queue. In other words, the ready queue contains all those
processes that are waiting for the CPU.
• Device queue: For each I/O device in the system, a separate
queue is maintained which is called device queue. The process
that needs to perform I/O during its execution is kept into the
queue of that specific I/O device; it waits there until it is served by
the device.
Generally, both the ready queue and device queue are maintained
as linked lists that contain PCBs of the processes in the queue as
their nodes. Each PCB includes a pointer to the PCB of the next
process in the queue (see Figure 2.2). In addition, the header node of
the queue contains pointers to the PCBs of the first and the last
process in the queue.
Fig. 2.2 Ready Queue and Device Queue Maintained as Linked List
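The following is a simplified sketch of a ready queue maintained as a linked list of PCBs, as described above; the structure and field names are illustrative assumptions rather than those of any particular operating system:

#include <stddef.h>
#include <stdio.h>

/* An illustrative PCB carrying only a few of its many fields. */
struct pcb {
    int pid;                 /* process identifier */
    int state;               /* e.g., ready, running, waiting */
    struct pcb *next;        /* pointer to the PCB of the next process */
};

/* Header node with pointers to the first and last PCBs in the queue. */
struct queue {
    struct pcb *head;
    struct pcb *tail;
};

/* Append a PCB at the tail of the queue. */
void enqueue(struct queue *q, struct pcb *p)
{
    p->next = NULL;
    if (q->tail == NULL)
        q->head = q->tail = p;       /* queue was empty */
    else {
        q->tail->next = p;
        q->tail = p;
    }
}

/* Remove and return the PCB at the head (NULL if the queue is empty). */
struct pcb *dequeue(struct queue *q)
{
    struct pcb *p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return p;
}

int main(void)
{
    struct pcb a = {101, 0, NULL}, b = {102, 0, NULL};
    struct queue ready = {NULL, NULL};
    enqueue(&ready, &a);
    enqueue(&ready, &b);
    printf("Next process to run: PID %d\n", dequeue(&ready)->pid);
    return 0;
}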
Types of Schedulers
The following types of schedulers (see Figure 2.4) may coexist in a
complex operating system.
• Long-term scheduler, also known as job scheduler or
admission scheduler, works with the job queue. It selects the
next process to be executed from the job queue and loads it into
the main memory for execution. The long-term scheduler must
select the processes in such a way that some of the processes
are CPU-bound while others are I/O-bound. This is because if all
the processes are CPU-bound, then the devices will remain
unused most of the time. On the other hand, if all the processes
are I/O-bound then the CPU will remain idle most of the time.
Thus, to achieve the best performance, a balanced mix of CPU-
bound and I/O-bound processes must be selected. The main
objective of this scheduler is to control the degree of
multiprogramming (that is, the number of processes in the ready
queue) in order to keep the processor utilization at the desired
level. For this, the long-term scheduler may admit new processes
in the ready queue in the case of poor processor utilization or
may reduce the rate of admission of processes in the ready
queue in case the processor utilization is high. In addition, the
long-term scheduler is generally invoked only when a process
exits from the system. Thus, the frequency of invocation of long-
term scheduler depends on the system and workload, and is much lower than that of the other two types of schedulers.
• Short-term scheduler, also known as CPU scheduler or
process scheduler, selects a process from the ready queue and
allocates the CPU to it. This scheduler is invoked much more frequently than the long-term scheduler. This is
because generally a process executes for a short period and then
it may have to wait either for I/O or for something else. At that
time, CPU scheduler must select some other process and
allocate CPU to it. Thus, the CPU scheduler must be fast in order
to provide the least time gap between executions.
Context Switch
Transferring the control of CPU from one process to another
demands saving the context of the currently running process and
loading the context of another ready process. This mechanism of
saving and restoring the context is known as context switch. The
portion of the process control block including the process state,
memory management information, and CPU scheduling information
together constitute the context (also called state information) of a
process. Context switch may occur due to a number of reasons some
of which are as follows:
• The current process terminates and exits from the system.
• The time slice of the current process expires.
• The process has to wait for I/O or some other resource.
• Some higher priority process enters the system.
• The process relinquishes the CPU by invoking some system call.
Context switching is performed in two steps, which are as follows:
1. Save context: In this step, the kernel saves the context of the currently executing process in its PCB so that this context may be restored later and the execution of the suspended process can be resumed.
2. Restore context: In this step, the kernel loads the saved context
of a different process that is to be executed next. Note that if the
process to be executed is newly created and the CPU has not yet
been allocated to it, there will be no saved context. In this case,
the kernel loads the context of the new process. However, if the
process to be executed was in waiting state due to I/O or some
other reason, there will be saved context that can be restored.
One of the major drawbacks of context switching is that it incurs a significant cost to the system in terms of real time and CPU cycles, because the system does not perform any productive work during the switch. Therefore, context switching should be kept to a minimum. Figure 2.5 shows context switching between two processes P1
and P2.
Fig. 2.5 Context Switching between Processes P1 and P2
Types of Communication
Processes may communicate with each other directly or indirectly.
In direct communication, processes address each other by the PID assigned to them by the operating system. For example, if a process P1 wants to send a message to process P2, the system calls send() and receive() will be defined as follows:
• send(PID2, message)
• receive(PID1, message)
This addressing is symmetric: both the sender and the receiver must name each other to communicate. In an asymmetric variant, only the sender names the recipient, while the receiver can accept a message from any process:
• receive(id, message)
Here, the variable id is set to the PID of the process from which the message was received.
Synchronization
Messages can be sent or received either synchronously or asynchronously, also called blocking or non-blocking, respectively. Various design options for implementing send() and receive() calls are as follows:
• Blocking send: The sender is blocked until its message is received by the receiving process or mailbox.
• Non-blocking send: The sender sends the message and resumes its operation immediately.
• Blocking receive: The receiver blocks until a message is available.
• Non-blocking receive: The receiver retrieves either a valid message or a null without waiting.
Buffering
As discussed earlier, the messages sent by a process are stored temporarily in a queue (also called a buffer) by the operating system before being delivered to the recipient. This buffer can be
implemented in a variety of ways, which are as follows:
• No buffering: The capacity of the buffer is zero, that is, no messages may wait in the queue. This implies that the sender process has to wait until the message is received by the receiver process.
• Bounded buffer: The capacity of the buffer is fixed, say m, that is, at most m messages may wait in the queue at a time. When there are fewer than m messages waiting in the queue and a new message arrives, it is added to the queue. The sender process need not wait and can resume its operation. However, if the queue is full, the sender process is blocked until some space becomes available in the queue.
• Unbounded buffer: The buffer has an unlimited capacity, that is,
an infinite number of messages can be stored in the queue. In
this case, the sender process never gets blocked.
• Double buffering: Two buffers are shared between the sender
and receiver process. In case one buffer fills up, the second one
is used. When the second buffer fills up, the first might have been
emptied. This way the buffers are used turn by turn, thus avoiding
the blocking of one process because of another.
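As a concrete illustration of a bounded buffer, the sketch below uses POSIX message queues (available on Linux and most UNIX systems): the mq_maxmsg attribute fixes the capacity of the queue, and mq_send blocks once that many messages are waiting. The queue name /demo_mq is an arbitrary assumption:

#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct mq_attr attr;
    attr.mq_flags = 0;
    attr.mq_maxmsg = 8;           /* bounded capacity: at most 8 messages */
    attr.mq_msgsize = 64;         /* maximum size of each message */
    attr.mq_curmsgs = 0;

    /* Create (or open) a bounded message queue named "/demo_mq". */
    mqd_t mq = mq_open("/demo_mq", O_CREAT | O_WRONLY, 0644, &attr);
    if (mq == (mqd_t)-1) {
        perror("mq_open");
        return 1;
    }

    const char *msg = "hello";
    /* Blocks if eight messages are already waiting in the queue. */
    if (mq_send(mq, msg, strlen(msg) + 1, 0) == -1)
        perror("mq_send");

    mq_close(mq);
    return 0;
}

On Linux, this program is linked with the real-time library (cc demo.c -lrt); a receiver process would open the same queue with O_RDONLY and call mq_receive().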
2.7.1 Socket
A socket is defined as an end-point of the communication path between two processes. Each of the communicating processes creates a socket, and these sockets must be connected to enable communication. A socket is identified by a combination of an IP address and a port number. The IP address identifies the machine on the network, and the port number identifies the desired service on that machine.
Usually, a machine provides a variety of services such as electronic mail, Telnet, FTP, etc. To differentiate among these services, each service is assigned a unique port number. To avail a specific service on a machine, it is first required to connect to the machine and then to the port assigned for that service.
Note that the port numbers less than 1024 are considered well-known
and are reserved for standard services. For example, the port number
used for Telnet is 23.
Sockets employ client-server architecture. The server listens on a socket bound to a specific port, waiting for a client to make a connection request. Whenever a client process requests a connection, it is assigned a port number (greater than 1024) by the host computer
(say M). Using this port number and the IP address of host M, the
client socket is created. For example, if the client on host M having IP
address (125.61.15.7) wants to connect to Telnet server (listening to
port number 23) having IP address (112.56.71.8), it may be assigned
a port number 1345. Thus, the client socket and server socket used
for communication will be (125.61.15.7:1345) and (112.56.71.8:23)
respectively as shown in the Figure 2.8.
Note that each connection between the client and the server
employs a unique pair of sockets. That is, if another client on host M
wants to connect to Telnet server, it must be assigned a port number
different from 1345 (but greater than 1024).
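A minimal sketch of the client side of this example using the Berkeley sockets API (the IP address and port are the illustrative values used above): the client creates a socket and connects it to the server socket at 112.56.71.8:23, while the host assigns the client its own port number automatically:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* Create an end-point of communication (a TCP socket). */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    /* Describe the server socket: IP address 112.56.71.8, port 23. */
    struct sockaddr_in server;
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_port = htons(23);                    /* well-known Telnet port */
    inet_pton(AF_INET, "112.56.71.8", &server.sin_addr);

    /* Connect; the host assigns this client an ephemeral port (> 1024). */
    if (connect(fd, (struct sockaddr *)&server, sizeof server) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }

    printf("Connected to the server\n");
    close(fd);
    return 0;
}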
LET US SUMMARIZE
1. A process is a program under execution or an executing set of machine instructions. It can be either a system process executing the system’s code or a user process executing the user’s code.
2. A process comprises not only the program code (known as text section)
but also a set of global variables (known as data section) and the process
control block (PCB).
3. The processes that involve more computation than I/O operations thereby
demanding greater use of CPU than I/O devices during their lifetime are
called CPU-bound or compute-bound processes.
4. The processes that involve a lot of I/O operations as compared to
computation during their lifetime are called I/O-bound processes.
5. Each process is labeled with a ‘state’ variable—an integer value that helps
the operating system to decide what to do with the process. It indicates
the nature of the current activity in a process.
6. Various possible states for a process are new, ready, running, waiting, and
terminated.
7. The change in state of a process is known as state transition of a process
and is caused by the occurrence of some event in the system.
8. To keep track of all the processes in the system, the operating system
maintains a table called process table that includes an entry for each
process. This entry is called process control block (PCB).
9. A process control block stores descriptive information pertaining to a
process such as its state, program counter, memory management
information, information about its scheduling, allocated resources,
accounting information, etc., that is required to control the process.
10. The procedure of determining the next process to be executed on the CPU
is called process scheduling and the module of the operating system that
makes this decision is called scheduler.
11. As the processes enter the system for execution, they are kept into a
queue called job queue (or input queue).
12. From the job queue, the processes which are ready for the execution are
brought into the main memory. In the main memory, these processes are
kept into a queue called ready queue.
13. For each I/O device in the system, a separate queue called device queue
is maintained. The process that needs to perform I/O during its execution
is kept into the queue of that specific I/O device and waits there until it is
served by the device.
14. The long-term scheduler, also known as job scheduler or admission
scheduler, selects the next process to be executed from the job queue
and loads it into the main memory for execution.
15. The short-term scheduler, also known as CPU scheduler or process
scheduler, selects a process from the ready queue and allocates the CPU
to it.
16. The medium-term scheduler, also known as swapper, selects a process
among the partially executed or unexecuted swapped-out processes and
swaps it in the main memory.
17. Transferring the control of CPU from one process to another demands
saving the context of the currently running process and loading the
context of another ready process. This task of saving and restoring the
context is known as context switch.
18. The portion of the process control block including the process state,
memory management information and CPU scheduling information
together constitute the context (also called state information) of a process.
19. A user process may create one or more processes during its execution by
invoking the process creation system call.
20. The task of creating a new process on the request of some other process
is called process spawning. The process that spawns a new process is
called parent process whereas the spawned process is called child
process.
21. When a process is terminated, all the resources held by the process are
de-allocated, the process returns output data (if any) to its parent, and
finally the process is removed from the memory by deleting its PCB from
the process table.
22. The processes that coexist in the memory at some time are called
concurrent processes. Concurrent processes may either be independent
or cooperating.
23. Independent processes (also called competitors), as the name implies, do
not share any kind of information or data with each other.
24. Cooperating (also called interacting) processes, on the other hand, need
to exchange data or information with each other.
25. The cooperating processes require some mechanism to communicate with
each other. One such mechanism is inter-process communication (IPC)—
a facility provided by the operating system.
26. Two basic communication models for providing IPC are ‘shared memory
systems’ and ‘message passing systems’.
27. A process running on a system can communicate with another process
running on remote system connected via network with the help of
communication mechanisms, including sockets, remote procedure call
(RPC), and remote method invocation (RMI).
EXERCISES
Fill in the Blanks
1. A process comprises _____________, _____________, and
_____________.
2. Context switching is performed in two steps, which are _____________
and _____________.
3. The processes that coexist in the memory at some time are called
_____________.
4. The two basic communication models for providing IPC are
_____________ and _____________.
5. A process that no longer exists but its PCB is still not removed from the
process table is known as a _____________.
Descriptive Questions
1. What does a process control block contain?
2. Distinguish between CPU-bound and I/O-bound process.
3. Discuss various states of a process.
4. Describe the events under which state transitions between ready, running
and waiting take place.
5. What is the difference between symmetric and asymmetric direct
communication?
6. List three important fields stored in a process control block.
7. Distinguish among long-term, short-term and medium-term scheduler.
8. What is context switching? How is it performed, and what is its
disadvantage?
9. Describe the different models used for inter-process communication.
Which one is better?
10. In message passing systems, the processes can communicate directly or
indirectly. Compare both the ways.
11. Write short notes on the following:
a. Remote method invocation (RMI)
b. Cooperating processes
c. Scheduling queues
12. Consider the indirect communication method where mailboxes are used.
What will be the sequence of execution of send() and receive() calls in
the following two cases?
a. Suppose a process P wants to wait for two messages, one from
mailbox M and other from mailbox N.
b. Suppose P wants to wait for one message from mailbox M or from
mailbox N (or from both).
13. Explain the remote procedure call (RPC) method of communication in
client-server systems.
chapter 3
Threads
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the basic concepts of threads.
⟡ List various advantages of threads.
⟡ Describe the ways of implementing threads.
⟡ Discuss various models of multithreading.
⟡ Understand various threading issues.
⟡ Discuss the Pthreads library and its functions.
3.1 INTRODUCTION
In conventional operating systems, each process has a single thread
of control, that is, the process is able to perform only one task at a
time. To implement multiprogramming, multiple processes with each
having a separate address space may be created and the CPU may
be switched back and forth among them to create the illusion that the
processes are running in parallel. But as discussed in the previous
chapter, process creation and switching are time-consuming and resource-intensive, and thus incur an overhead on the system.
Therefore, many modern operating systems employ multithreading
that allows a process to have multiple threads of control within the
same address space. These threads may run in parallel thereby
enabling the process to perform multiple tasks at a time.
3.2 THREAD CONCEPT
A thread is defined as the fundamental unit of CPU utilization. A
traditional process comprises a single thread of control, that is, it can
execute one task at a time and thus, is referred to as a single-
threaded process. However, to make the process perform several
tasks simultaneously, multiple threads of a single process can be
created with each thread having its own ID, stack and a set of
registers. In addition, all the threads of the same process share with
each other the code section, data section, and other resources
including list of open files, child processes, signals, etc., of the
process. A process with multiple threads of control is referred to as a
multithreaded process. Figure 3.1 shows the structure of a single-threaded process and a multithreaded process with four threads (indicated by wavy lines) of control.
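Since thread libraries are introduced later in this chapter, a minimal Pthreads sketch of a multithreaded process is given below (the function and variable names are illustrative). The main thread creates four threads; all of them share the process's global data section, while each has its own ID and stack:

#include <pthread.h>
#include <stdio.h>

int shared = 0;                          /* data section shared by all threads */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    long id = (long)arg;                 /* each thread gets its own argument */
    pthread_mutex_lock(&lock);
    shared++;                            /* all threads update the same variable */
    printf("Thread %ld: shared = %d\n", id, shared);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t tid[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (int i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);      /* wait for all four threads */
    return 0;
}

The program is compiled with the -pthread option (e.g., cc -pthread demo.c).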
Kernel-level Threads
Kernel-level threads are implemented by the kernel, which is
responsible for creating, scheduling, and managing threads within the
kernel space. It maintains a thread table in addition to the process
table that holds the program counter, stack pointer, registers, state,
etc., of each thread in the system. Whenever a process wishes to
create a new thread or terminate an existing one, it initiates a
system call to the kernel. In response, the kernel creates or terminates the thread by
modifying the thread table. Many modern operating systems including
Solaris 2, Windows 2000, and Windows NT provide support for
kernel-level threads.
Advantages
• In a multiprocessor environment, multiple kernel-level threads
belonging to a process can be scheduled to run simultaneously
on different CPUs thereby resulting in computation speedup.
• As the threads are managed directly by the kernel, if one thread
issues a system call that blocks it, the kernel can choose another
thread to run either from the same process (to which the blocked
thread belongs) or from some different process.
Disadvantages
• The cost of creating and destroying threads in the kernel is
relatively greater than that of user-level threads.
• The kernel performs switching between the threads, which incurs
overhead to the system.
User-level Threads
User-level threads are implemented by a thread library associated
with the code of a process. The thread library provides support for
creating, scheduling, and managing threads within the user space
without any involvement from the kernel. Thus, the kernel is unaware
of the existence of threads in a process; it is concerned only with
managing single-threaded processes. Whenever a process wishes to
create or terminate a thread, it can do so by calling an appropriate
function from the thread library without the need of kernel
intervention. Moreover, each process maintains its own thread table
that keeps track of the threads belonging to that process and the
kernel maintains only the process table. POSIX Pthreads, Solaris 2
UI-threads, and Mach C-threads are some of the user-thread
libraries.
Advantages
• The user-level threads can be created and managed at a faster
speed as compared to kernel-level threads.
• The thread switching overhead is smaller as it is performed by the
thread library and there is no need to issue the system call.
• The thread library can schedule threads within a process using a
scheduling policy that best suits the process’s nature. For
example, for a real-time process, a priority-based scheduling
policy can be used. On the other hand, for a multithreaded Web
server, round-robin scheduling can be used.
Disadvantages
• At most one user-level thread can be in execution at a time, which
limits the degree of parallelism.
• If one user-level thread issues a blocking system call, the kernel
blocks the whole process to which the thread belongs even if
there is some other thread that is ready to run. This is because
the kernel does not know the difference between a thread and a
process; it simply treats a thread like a process.
Many-to-One Model
Advantages
• It incurs a low switching overhead as kernel is not involved while
switching between threads.
Disadvantages
• If one user-level thread issues a blocking system call, the kernel
blocks the whole parent process.
• As the kernel-level thread can be accessed by only one user-level
thread at a time, multiple user-level threads cannot run in parallel
on multiple CPUs thereby resulting in low concurrency.
One-to-One Model
Advantages
• Multiple threads can run in parallel on multiple CPUs in a
multiprocessor environment and thus, greater concurrency is
achieved.
Disadvantages
• It results in high switching overhead due to the involvement of
kernel in switching.
• Most implementations of this model restrict the number of
threads that can be created in a process. This is because
whenever a user-level thread is created in a process, a
corresponding kernel-level thread is also required to be created.
The creation of many kernel-level threads incurs an overhead to
the system, thereby degrading the performance.
Many-to-Many Model
Advantages
• Many user-level threads can be made to run in parallel on different
CPUs by mapping each user-level thread to a different kernel-
level thread.
• Blocking of one user-level thread does not result in the blockage
of other user-level threads that are mapped into different kernel-
level threads.
• Switching between user-level threads associated with the same
kernel-level thread does not incur much overhead.
• There is no restriction on the number of user-level threads that
can be created in a process; as many user-level threads as
required can be created.
Disadvantages
• The implementation of this model is very complex.
Creating a Pthread
A process can create a Pthread by calling the pthread_create()
function. The syntax of this function is as follows:
pthread_create (ptr_id, attr, start_routine, arg);
where
ptr_id is a pointer to the memory location where the ID of Pthread
will be stored.
attr specifies an attributes object that defines the attributes to be
used in Pthread creation.
start_routine is the routine to be executed by the newly-created
Pthread.
arg is the single argument that is passed to the Pthread during its
creation.
Once a Pthread has been created, it starts executing the
start_routine function within the environment of the process that has
created it.
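For instance, a minimal C sketch using the actual POSIX interface may
look as shown below. Here, NULL is passed for attr to accept the
default attributes, and the names worker and msg are illustrative only.

#include <pthread.h>
#include <stdio.h>

/* routine to be executed by the newly created Pthread */
void *worker(void *arg) {
    printf("Hello from thread: %s\n", (char *)arg);
    return NULL;
}

int main(void) {
    pthread_t tid;                /* will hold the ID of the Pthread */
    char *msg = "created by main";
    pthread_create(&tid, NULL, worker, msg);
    pthread_join(tid, NULL);      /* wait for the thread to terminate */
    return 0;
}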
Terminating a Pthread
A Pthread can get terminated under any of the following
circumstances.
• When it calls the pthread_exit(status_code) function.
• When it returns from its start_routine, because then the
pthread_exit() function is called implicitly.
Waiting for a Pthread
A Pthread can wait for the termination of another Pthread by calling
the pthread_join() function. The syntax of this function is as follows:
pthread_join (<pthread_id>, adr(x));
where
<pthread_id> is the ID of the Pthread whose termination is awaited.
adr(x) is the address of the variable x in which the status of the target
Pthread is to be stored.
Following points should be kept in mind while using the
pthread_join() function.
• The target Pthread must be joinable; a Pthread that has been
detached cannot be joined.
• Only one Pthread should wait for the termination of a given
Pthread; if several Pthreads simultaneously try to join with the
same Pthread, the results are undefined.
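As an illustration, the following C sketch shows a Pthread
terminating with an explicit status code and the creating Pthread
retrieving that status through pthread_join(); the names task and
status are illustrative only.

#include <pthread.h>
#include <stdio.h>

void *task(void *arg) {
    (void)arg;                    /* the argument is not used here */
    pthread_exit((void *)42);     /* terminate with a status code */
}

int main(void) {
    pthread_t tid;
    void *status;            /* plays the role of the variable x above */
    pthread_create(&tid, NULL, task, NULL);
    pthread_join(tid, &status);   /* &status corresponds to adr(x) */
    printf("Thread exited with status %ld\n", (long)status);
    return 0;
}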
EXERCISES
Fill in the Blanks
1. Many modern operating systems employ _____________ that allows a
process to have multiple threads of control within the same address
space.
2. Two methods for implementing threads are _____________ and
_____________ threads.
3. _____________ allows a process to have multiple threads of control
within the same address space.
4. Whenever a process invokes the _____________ system call, a new
(child) process is created that is the exact duplicate of its parent process.
5. The procedure of terminating a thread before it completes its execution is
known as _____________ and the thread that is to be cancelled is known
as _____________.
chapter 4
CPU Scheduling
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the basic concepts of scheduling.
⟡ Discuss the criteria for scheduling.
⟡ Describe various scheduling algorithms.
⟡ Discuss scheduling for multiprocessor systems.
⟡ Describe real-time scheduling.
⟡ Evaluate various scheduling algorithms.
⟡ Discuss thread scheduling.
4.1 INTRODUCTION
As discussed in Chapter 2, CPU scheduling is the procedure
employed for deciding to which of the ready processes, the CPU
should be allocated. CPU scheduling plays a pivotal role in the basic
framework of the operating system owing to the fact that the CPU is
one of the primary resources of the computer system. The algorithm
used by the scheduler to carry out the selection of a process for
execution is known as scheduling algorithm. A number of
scheduling algorithms are available for CPU scheduling. Each
scheduling algorithm influences the resource utilization, overall
system performance, and quality of service provided to the user.
Therefore, a number of criteria must be considered while selecting an
algorithm for a particular system.
4.2.3 Dispatcher
The CPU scheduler only selects a process to be executed next on
the CPU but it cannot assign CPU to the selected process. The
function of setting up the execution of the selected process on the
CPU is performed by another module of the operating system, known
as dispatcher. The dispatcher performs this function in the following
three steps.
1. Context switching is performed. The kernel saves the context of
currently running process and restores the saved state of the
process selected by the CPU scheduler. In case the process
selected by the short-term scheduler is new, the kernel loads its
context.
2. The system switches from the kernel mode to user mode as a
user process is to be executed.
3. The execution of the user process selected by the CPU
scheduler is started by transferring the control either to the
instruction that was supposed to be executed at the time the
process was interrupted, or to the first instruction if the process
is going to be executed for the first time after its creation.
Note: The amount of time required by the dispatcher to suspend
execution of one process and resume execution of another process is
known as dispatch latency. Low dispatch latency implies faster start
of process execution.
Advantages
• It is easy to understand and implement as processes are simply
added at the end and removed from the front of the queue.
No process from in between the queue is required to be
accessed.
• It is well suited for batch systems where the longer time periods
for each process are often acceptable.
Disadvantages
• The average waiting time is not minimal. Therefore, this
scheduling algorithm is never recommended where performance
is a major issue.
• It reduces the CPU and I/O devices utilization under some
circumstances. For example, assume that there is one long CPU-
bound process and many short I/O-bound processes in the ready
queue. Now, it may happen that while the CPU-bound process is
executing, the I/O-bound processes complete their I/O and come
to the ready queue for execution. There they have to wait for the
CPU-bound process to release the CPU and the I/O devices also
remain idle during this time. When the CPU-bound process needs
to perform I/O, it comes to the device queue and the CPU is
allocated to I/O-bound processes. As the I/O-bound processes
require a little CPU burst, they execute quickly and come back to
the device queue thereby leaving the CPU idle. Then the CPU-
bound process enters the ready queue and is allocated the CPU,
which again makes the I/O-bound processes wait in the ready
queue. This happens again and again until the CPU-
bound process is done, which results in low utilization of CPU and
I/O devices.
• It is not suitable for time sharing systems where each process
should get the same amount of CPU time.
Advantages
• It minimizes the average waiting time of processes. In fact, it
is optimal with respect to average waiting time if all processes
are available at the same time. This is due to the fact that short
processes are made to run before the long ones which decreases
the waiting time for short processes and increases the waiting
time for long processes. However, the reduction in waiting time is
more than the increase and thus, the average waiting time
decreases.
Disadvantages
• It is difficult to implement as it needs to know the length of CPU
burst of processes in advance. In practice, having the prior
knowledge of the required processing time of processes is
difficult. Many systems expect users to provide estimates of CPU
burst of processes which may not always be correct.
• It does not favour the processes having longer CPU burst. This is
because as long as the short processes continue to enter the
ready queue, the long processes will not be allowed to get the
CPU. This results in starvation of long processes.
What is the average turnaround time and average waiting time for
these processes with SRTN algorithm?
Initially, P1 enters the ready queue at t = 0.0 and gets the CPU as
there are no other processes in the queue. While it is executing, at
time t = 0.4, P2 with CPU burst of 4 ms enters the queue. At that time
the remaining CPU burst of P1 is 7.6 ms which is greater than that of
P2. Therefore, the CPU is taken back from P1 and allocated to P2.
During execution of P2, P3 enters at t = 1.0 with a CPU burst of 1 ms.
Again CPU is switched from P2 to P3 as the remaining CPU burst of P2
at t = 1.0 is 3.4 ms which is greater than that of P3. When P3
completes at t = 2.0, the CPU is allocated to P2 because at that time
the remaining CPU burst of P2 (which is 3.4 ms) is shorter than that of
P1 (which is 7.6 ms). Finally, when P2 completes its execution at t = 5.4
ms, the CPU is allocated to P1 which completes its execution at t =
13.
Since Turnaround time = Exit time – Entry time,
Turnaround time for P1= (13 – 0) = 13 ms
Turnaround time for P2= (5.4 – 0.4) = 5 ms
Turnaround time for P3= (2 – 1) = 1 ms
Average turnaround time = (13 + 5 + 1)/3 = 6.33 ms
Since Waiting time = Turnaround time – Processing time,
Waiting time for P1= (13 – 8) = 5 ms
Waiting time for P2= (5 – 4) = 1 ms
Waiting time for P3= (1 – 1) = 0 ms
Average waiting time = (5 + 1 + 0)/3 = 2 ms
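The arithmetic of this example can also be verified mechanically. The
following C sketch (an illustration, not part of the scheduling
algorithm itself) computes the turnaround and waiting times from the
entry, exit and burst times given above:

#include <stdio.h>

int main(void) {
    /* entry, exit and CPU burst times (in ms) from the SRTN example */
    double entry[]  = {0.0, 0.4, 1.0};
    double exit_t[] = {13.0, 5.4, 2.0};
    double burst[]  = {8.0, 4.0, 1.0};
    double sum_tat = 0, sum_wt = 0;
    for (int i = 0; i < 3; i++) {
        double tat = exit_t[i] - entry[i];   /* turnaround time */
        double wt  = tat - burst[i];         /* waiting time */
        printf("P%d: turnaround = %.1f ms, waiting = %.1f ms\n",
               i + 1, tat, wt);
        sum_tat += tat;
        sum_wt  += wt;
    }
    printf("Average turnaround = %.2f ms, average waiting = %.2f ms\n",
           sum_tat / 3, sum_wt / 3);
    return 0;
}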
Example 6 Consider the same set of processes, their arrival times
and CPU burst as shown in Example 3. How will these processes be
scheduled according to SRTN scheduling algorithm? Compute the
average waiting time and average turnaround time.
Solution The processes will be scheduled as depicted in the
following Gantt chart.
Advantages
• A long process that is near to its completion may be favored over
the short processes entering the system. This results in an
improvement in the turnaround time of the long process.
Disadvantages
• Like SJF, it also requires an estimate of the next CPU burst of a
process in advance.
• Favoring a long process nearing its completion over the several
short processes entering the system may affect the turnaround
times of short processes.
• It favors only those long processes that are just about to complete
and not those who have just started their operation. Thus,
starvation of long processes still may occur.
Assuming that the lower priority number means the higher priority,
how will these processes be scheduled according to non-preemptive
as well as preemptive priority scheduling algorithm? Compute the
average waiting time and average turnaround time in both cases.
Solution
Non-preemptive priority scheduling algorithm
The processes will be scheduled as depicted in the following Gantt
chart.
Advantages
• Important processes are never made to wait because of the
execution of less important processes.
Disadvantages
• It suffers from the problem of starvation of lower priority
processes, since the continuous arrival of higher priority
processes will prevent lower priority processes indefinitely from
acquiring the CPU. One possible solution to this problem is aging
which is a process of gradually increasing the priority of a low
priority process with increase in its waiting time. If the priority of a
low priority process is increased after each fixed time of interval, it
is ensured that at some time it will become a highest priority
process and get executed.
Example 10 Consider four processes P1, P2, P3, and P4 with their
arrival times and required CPU burst (in milliseconds) as shown in the
following table.
Advantages
• It favors short processes. This is because with increase in waiting
time, the response ratio of short processes increases speedily as
compared to long processes. Thus, they are scheduled earlier
than long processes.
• Unlike SJF, starvation does not occur since with increase in
waiting time, the response ratio of long processes also increases
and eventually they are scheduled.
Disadvantages
• Like SJF and SRTN, it also requires an estimate of the expected
service time (CPU burst) of a process.
Initially, P1 enters the ready queue at t = 0 and gets the CPU for 3
ms. While it executes, P2 and P3 enter the queue at t = 1 and t = 3,
respectively. Since P1 does not complete within 3 ms, an interrupt
occurs when the time slice expires. P1 is preempted (with remaining
CPU burst of 7 ms) and put back in the queue after P3 (P4 has not
entered yet), and the CPU is allocated to P2. During execution of P2, P4
enters the queue at t = 4 and is put at the end of the queue after P1.
When P2 times out, it is preempted (with remaining CPU burst of 2
ms) and put back at the end of queue after P4. The CPU is allocated
to the next process in the queue, that is, to P3 and it executes
completely before the time slice expires. Thus, the CPU is allocated
to the next process in the queue which is P1. P1 again executes for 3
ms, then preempted (with remaining CPU burst of 4 ms) and put back
at the end of the queue after P2 and the CPU is allocated to P4. P4
executes completely within the time slice and the CPU is allocated to
next process in the queue, that is, P2. As P2 completes before the time
out occurs, the CPU is switched to P1 at t = 16 for another 3 ms.
When the time slice expires, CPU is again allocated to P1 as it is the
only process in the queue.
Since Turnaround time = Exit time – Entry time,
Turnaround time for P1= (20 – 0) = 20 ms
Turnaround time for P2= (16 – 1) = 15 ms
Turnaround time for P3= (8 – 3) = 5 ms
Turnaround time for P4= (14 – 4) = 10 ms
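Hence, Average turnaround time = (20 + 15 + 5 + 10)/4 = 12.5 ms.
Further, since the CPU bursts implied by the above walkthrough are 10,
5, 2 and 3 ms for P1, P2, P3 and P4, respectively, the waiting times
(Turnaround time – Processing time) come out to be 10, 10, 3 and 7 ms,
giving Average waiting time = (10 + 10 + 3 + 7)/4 = 7.5 ms.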
Advantages
• It is efficient for time sharing systems where the CPU time is
divided among the competing processes.
• It increases the fairness among the processes.
Disadvantages
• The processes (even the short processes) may take a long time to
execute. This decreases the system throughput.
• It requires some extra hardware support such as a timer to cause
interrupt after each time out.
Note: Ideally, the size of time quantum should be such that 80% of
the processes could complete their execution within one time
quantum.
Advantages
• Processes are permanently assigned to their respective queues
and do not move between queues. This results in low scheduling
overhead.
Disadvantages
• The processes in lower priority queues may have to starve for
CPU in case processes are continuously arriving in higher priority
queues. One possible way to prevent starvation is to time slice
among the queues. Each queue gets a certain share of CPU time
which it schedules among the processes in it. Note that the time
slice of different priority queues may differ.
Example 13 Consider four processes P1, P2, P3, and P4 with their
arrival times and required CPU burst (in milliseconds) as shown in the
following table.
Assume that there are three ready queues Q1, Q2 and Q3. The CPU
time slice for Q1 and Q2 is 5 ms and 10 ms, respectively, and in Q3,
processes are scheduled on FCFS basis. How will these processes
be scheduled according to multilevel feedback queue scheduling
algorithm? Compute the average waiting time and average
turnaround time.
Solution The processes will be scheduled as depicted in the
following Gantt chart.
Initially, P1 enters the system at t = 0, placed in Q1 and allocated
the CPU for 5 ms. Since, it does not execute completely, it is moved
to Q2 at t = 5. Now Q1 is empty so the scheduler picks up the process
from the head of Q2. Since, P1 is the only process in Q2, it is again
allocated the CPU for 10 ms. But during its execution, P2 enters Q1 at
t = 12, therefore P1 is preempted and P2 starts executing. At t = 17, P2
is moved to Q2 and placed after P1. The CPU is allocated to the first
process in Q2, that is, P1. While P1 is executing, P3 enters Q1 at t = 25 so
P1 is preempted, placed after P2 in Q2 and P3 starts executing. As P3
executes completely within time slice, the scheduler picks up the first
process in Q2 which is P2 at t = 29. While P2 is executing, P4 enters Q1
at t = 32 because of which P2 is preempted and placed after P1 in Q2.
The CPU is assigned to P4 for 5 ms and at t = 37, P4 is moved to Q2
and placed after P2. At the same time, the CPU is allocated to P1(first
process in Q2). When it completes at t = 42, the next process in Q2
which is P2, starts executing. When it completes, the last process in
Q2, that is, P4 is executed.
Advantages
• It is fair to I/O-bound (short) processes as these processes need
not wait too long and are executed quickly.
• It prevents starvation by moving a lower priority process to a
higher priority queue if it has been waiting for too long.
Disadvantages
• It is the most complex scheduling algorithm.
• Moving the processes between queues causes a number of
context switches which results in an increased overhead.
• The turnaround time for long processes may increase significantly.
Scheduling Approaches
The next issue is how to schedule the processes from the ready
queue to multiple processors. For this, one of following scheduling
approaches may be used.
• Symmetric multiprocessing (SMP): In this approach, each
processor is self-scheduling. For each processor, the scheduler
selects a process for execution from the ready queue. Since,
multiple processors need to access common data structure, this
approach necessitates synchronization among multiple
processors. This is required so that no two processors could
select the same process and no process is lost from the ready
queue.
• Asymmetric multiprocessing: This approach is based on the
master-slave structure among the processors. The responsibility
of making scheduling decisions, I/O processing and other system
activities is up to only one processor (called master), and other
processors (called slaves) simply execute the user’s code.
Whenever some processor becomes available, the master
processor examines the ready queue and selects a process for it.
This approach is easier to implement than symmetric
multiprocessing as only one processor has access to the system
data structures. But at the same time, this approach is inefficient
because a number of processes may block on the master
processor.
Load Balancing
On SMP systems having a private ready queue for each processor, it
might happen at a certain moment of time that one or more
processors are sitting idle while others are overloaded with a number
of processes waiting for them. Thus, in order to achieve the better
utilization of multiple processors, load balancing is required which
means to keep the workload evenly distributed among multiple
processors. There are two techniques to perform load balancing,
namely, push migration and pull migration.
In push migration technique, the load is balanced by periodically
checking the load of each processor and shifting the processes from
the ready queues of overloaded processors to that of less overloaded
or idle processors. On the other hand, in pull migration technique,
the idle processor itself pulls a waiting process from a busy
processor.
Note: Load balancing is often unnecessary on SMP systems with a
single shared ready queue.
Processor Affinity
Processor affinity means an effort to make a process run on the
same processor on which it was executed last time. Whenever a process
executes on a processor, the data most recently accessed by it is
kept in the cache memory of that processor. Next time, if the process
is run on the same processor, then most of its memory accesses are
satisfied in the cache memory only and as a result, the process
execution speeds up. However, if the process is run on some different
processor next time, the cache of the older processor becomes
invalid and the cache of the new processor is to be re-populated. As
a result, the process execution is delayed. Thus, an attempt should
be made by the operating system to run a process on the same
processor each time instead of migrating it to another processor.
When an operating system tries to make a process run on the
same processor but does not guarantee to always do so, it is referred
to as soft affinity. On the other hand, when an operating system
provides system calls that force a process to run on the same
processor, it is referred to as hard affinity. In soft affinity, there is a
possibility of process migration from one processor to another
whereas in hard affinity, the process is never migrated to another
processor.
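On Linux, for instance, hard affinity can be requested through the
sched_setaffinity() system call. The following C sketch binds the
calling process to CPU 0; error handling is omitted for brevity.

#define _GNU_SOURCE
#include <sched.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);       /* clear the CPU set */
    CPU_SET(0, &set);     /* allow execution on CPU 0 only */
    /* a pid of 0 refers to the calling process */
    sched_setaffinity(0, sizeof(set), &set);
    /* from here on, the process runs only on CPU 0 */
    return 0;
}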
Deterministic Modeling
Deterministic modeling is the simplest and most direct method used to
compare the performance of different scheduling algorithms on the
basis of some specific criteria. It takes into account the pre-specified
system workload and measures the performance of each scheduling
algorithm for that workload.
For example, consider a system with workload as shown below.
Suppose we have to select, out of FCFS, SJF, and RR (with time
slice 8 ms), the algorithm that results in the minimum average waiting time.
Queuing Models
Generally, there is no fixed set of processes that run on systems;
thus, it is not possible to measure the exact processing requirements
of processes. However, we can measure the distributions of CPU
bursts and I/O bursts during the life time of processes and derive a
mathematical formula that identifies the probability of a specific CPU
burst. Similarly, the arrival rate of processes in the system can also
be approximated.
The use of mathematical models for evaluating performance of
various systems led to the development of queuing theory, a branch
of mathematics. The fundamental model of queuing theory maps directly
onto the computer system model: each computer system is represented
as a set of servers (such as the CPU, I/O devices, etc.) with each server
having its own queue. For example, CPU has a ready queue and an
I/O device has a device queue associated with itself. By having
knowledge of arrival rates of processes in each queue and service
rates of processes, we can find out the average length of queue,
average waiting time of processes in the queue, etc.
For example, consider that L denotes the average queue length, W
denotes the average waiting time of a process in the queue, and a
denotes the average arrival rate of processes in the queue. The
relationship between L, W, and a can be expressed by the Little’s
formula, as given below:
L = a × W
This formula is based on the facts discussed next:
• During the time a process waits in the queue (W), (a × W) new
processes enter the queue.
• The system is in steady state, that is, the number of processes
exiting from the queue is equal to the number of processes
entering the queue.
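For example, if processes arrive at an average rate of a = 7 processes
per second and each process waits in the queue for W = 2 seconds on
average, then by Little’s formula the queue contains L = 7 × 2 = 14
processes on average.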
Note: The performance evaluation using the queuing theory is known
as queuing analysis.
In spite of the fact that queuing analysis provides a mathematical
formula to evaluate the performance of scheduling algorithms, it
suffers from a few limitations. We can use queuing analysis for only
limited classes of scheduling algorithms, not for all. Moreover, it is
based on approximations; therefore, the accuracy of calculated
results depends on how closely the approximations match with the
real system.
Simulations
Simulation is a more accurate method of evaluating scheduling
algorithms; it mimics the dynamic behaviour of a real computer system
over time. The computer system model is programmed, and all the major
components of the system are represented by data structures. The
simulator employs a variable representing a clock. As the clock is
incremented, the current system state is modified to reflect the
activities of the processes, the scheduler, the I/O devices, and so on. While the
simulation executes, the system parameters that affect the
performance of scheduling algorithms such as CPU burst, I/O burst,
and so on are gathered and recorded.
The data to drive the simulation can be generated using the trace
tapes, which are created by monitoring the system under study and
recording the events taking place. The sequence of recorded events
is then used to drive the simulation. Although trace tapes are an easy
way to compare the performance of two different scheduling
algorithms for the same set of real inputs, they need a vast amount of
storage space. Moreover, simulation requires a lot of computer time;
this makes it an expensive method.
The Pthreads library provides the pthread_attr_setscope() and
pthread_attr_getscope() functions to set and retrieve the contention
scope, respectively. Their syntax is as follows:
pthread_attr_setscope (pthread_attr_t *attr, int scope);
pthread_attr_getscope (pthread_attr_t *attr, int *scope);
where,
• pthread_attr_t *attr specifies a pointer to the attribute set for the
thread.
• int scope specifies how the contention scope is to be set. The
value of this parameter could be either PTHREAD_SCOPE_PROCESS or
PTHREAD_SCOPE_SYSTEM.
• int *scope gives a pointer to the int value which is set to the
current value of the contention scope.
Note: Both the functions return nonzero values, in case an error
occurs.
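For instance, the following C sketch initializes an attributes
object, requests the system contention scope and reads the value
back; error checking is omitted.

#include <pthread.h>
#include <stdio.h>

int main(void) {
    pthread_attr_t attr;
    int scope;
    pthread_attr_init(&attr);
    /* request system contention scope (SCS scheduling) */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    /* read the value back to confirm it */
    pthread_attr_getscope(&attr, &scope);
    if (scope == PTHREAD_SCOPE_SYSTEM)
        printf("system contention scope set\n");
    pthread_attr_destroy(&attr);
    return 0;
}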
LET US SUMMARIZE
1. The algorithm used by the scheduler to carry out the selection of a
process for execution is known as scheduling algorithm.
2. The time period elapsed in processing before performing the next I/O
operation is known as CPU burst.
3. The time period elapsed in performing I/O before the next CPU burst is
known as I/O burst.
4. The module of the operating system that performs the function of setting
up the execution of the selected process on the CPU is known as
dispatcher.
5. For scheduling purposes, the scheduler may consider some performance
measures and optimization criteria which include fairness, CPU utilization,
balanced utilization, throughput, waiting time, turnaround time and
response time.
6. A wide variety of algorithms are used for the CPU scheduling. These
scheduling algorithms fall into two categories, namely, non-preemptive
and preemptive.
7. In non-preemptive scheduling algorithms, once the CPU is allocated to a
process, it cannot be taken back until the process voluntarily releases it or
the process terminates.
8. In preemptive scheduling algorithms, the CPU can be forcibly taken back
from the currently running process before its completion and allocated to
some other process.
9. FCFS is one of the simplest non-preemptive scheduling algorithms in
which the processes are executed in the order of their arrival in the ready
queue.
10. The shortest job first also known as shortest process next or shortest
request next is a non-preemptive scheduling algorithm that schedules the
processes according to the length of CPU burst they require.
11. The shortest remaining time next also known as shortest time to go is a
preemptive version of the SJF scheduling algorithm. It takes into account
the length of remaining CPU burst of the processes rather than the whole
length in order to schedule them.
12. In priority-based scheduling algorithm, each process is assigned a priority
and the higher priority processes are scheduled before the lower priority
processes.
13. The highest response ratio next scheduling is a non-preemptive
scheduling algorithm that schedules the processes according to their
response ratio. Whenever CPU becomes available, the process having
the highest value of response ratio among all the ready processes is
scheduled next.
14. The round robin scheduling is one of the most widely used preemptive
scheduling algorithms in which each process in the ready queue gets a
fixed amount of CPU time (generally from 10 to 100 milliseconds) known
as time slice or time quantum for its execution.
15. The multilevel queue scheduling is designed for the environments where
the processes can be categorized into different groups on the basis of
their different response time requirements or different scheduling needs.
16. The multilevel feedback queue scheduling also known as multilevel
adaptive scheduling is an improved version of multilevel queue scheduling
algorithm. In this scheduling algorithm, processes are not permanently
assigned to queues; instead they are allowed to move between the
queues.
17. In multiprocessor systems, the ready queue can be implemented in two
ways. Either there may be a separate ready queue for each processor or
there may be a single shared ready queue for all the processors.
18. In symmetric multiprocessing scheduling approach, each processor is self-
scheduling. For each processor, the scheduler selects a process for
execution from the ready queue.
19. In asymmetric multiprocessing scheduling approach, the responsibility of
making scheduling decisions, I/O processing and other system activities is
up to only one processor (called master), and other processors (called
slaves) simply execute the user’s code.
20. A real-time system has well-defined, fixed time constraints which if not
met, may lead to system failure even though the output produced is
correct. It is of two types: hard real-time system and soft real-time system.
21. In hard-real time systems, the scheduler requires a process to declare its
deadline requirements before entering into the system. Then it employs a
technique known as admission control algorithm to decide whether the
process should be admitted.
22. In soft real-time systems, the requirements are less strict; it is not
mandatory to meet the deadline. A real-time process always gets the
priority over other tasks, and retains the priority until its completion. If the
deadline could not be met due to any reason, then it is possible to
reschedule the task and complete it.
23. To select a scheduling algorithm for a particular system, we need to
evaluate the performance of different scheduling algorithms under given
system workload and find out the most suitable one for the system. Some
of the commonly used evaluation methods are deterministic modeling,
queuing models, and simulations.
24. There are two types of threads: user level threads and kernel level
threads. There exist two schemes to schedule these threads: process
contention scope and system contention scope. In process contention
scope (PCS) scheme, the thread library schedules user-level threads to
be mapped on a single available LWP. On the other hand, the system
contention scope (SCS) scheme involves deciding which kernel thread to
schedule onto CPU.
EXERCISES
Fill in the Blanks
1. The time period elapsed in performing I/O before the next CPU burst is
known as _____________.
2. In _____________ scheduling, once the CPU is allocated to a process, it
cannot be taken back until the process voluntarily releases it or the
process terminates.
3. _____________ is a non-preemptive scheduling algorithm in which the
processes are executed in the order of their arrival in the ready queue.
4. The _____________ is a preemptive scheduling algorithm in which each
process in the ready queue gets a fixed amount of CPU time.
5. The two schemes to schedule threads are _____________ and
_____________.
Descriptive Questions
1. Distinguish between non-preemptive and preemptive scheduling
algorithms.
2. Define throughput, turnaround time, waiting time, and response time.
3. List the situations that may require the scheduler to make scheduling
decisions.
4. Which non-preemptive scheduling algorithms suffer from starvation and
under what conditions?
5. Describe scheduling in soft real-time systems.
6. Explain the relation (if any) between the following pairs of scheduling
algorithms.
(a) Round robin and FCFS
(b) Multilevel feedback queue and FCFS
(c) SJF and SRTN
(d) SRTN and priority-based
7. Consider five processes P1, P2, P3, P4 and P5 with their arrival times,
required CPU burst (in milliseconds), and priorities as shown in the
following table.
Assume that the lower priority number means the higher priority.
Compute the average waiting time and average turnaround time
of processes for each of the following scheduling algorithms. Also
determine which of the following scheduling algorithms result in
minimum waiting time.
(a) FCFS
(b) SJF
(c) HRN
(d) Non-preemptive priority-based
8. Consider three processes P1, P2, and P3 with the same arrival time t = 0. Their
required CPU burst (in milliseconds) is shown in the following table.
Assuming that the time slice is 4 ms, how will these processes be
scheduled according to round robin scheduling algorithm? Compute the
average waiting time and average turnaround time.
9. Consider the same set of processes as shown in Question 7. Compute the
average waiting time and average turnaround time of processes for each
of the following scheduling algorithms.
(a) SRTN
(b) Preemptive priority-based
(c) Round robin (if CPU time slice is 2 ms)
Compare the performance of these scheduling algorithms with each other.
10. Which of the following scheduling algorithms favor the I/O-bound
processes and how?
(a) Multilevel feedback queue
(b) SJF
(c) HRN
11. Write short notes on the following.
(a) Thread scheduling
(b) Soft affinity vs. hard affinity
(c) Dispatcher
(d) Scheduling approaches for multiprocessor scheduling
12. Differentiate between multilevel queue and multilevel feedback queue
scheduling.
13. Consider a scheduling algorithm that prefers to schedule those processes
first which have consumed the least amount of CPU time. How will this
algorithm treat the I/O-bound and CPU-bound processes? Is there any
chance of starvation?
14. Explain various methods used for evaluating performance of scheduling
algorithms.
chapter 5
Process Synchronization
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the principles of concurrency.
⟡ Describe how to implement control synchronization using
precedence graph.
⟡ Define the critical-section problem.
⟡ Explain the software solutions to critical-section problem, including
strict alternation, Dekker’s algorithm, Peterson’s algorithm and
Bakery algorithm.
⟡ Discuss the hardware-supported solutions for critical-section
problem.
⟡ Define semaphores.
⟡ Discuss various classical synchronization problems and their
solutions using semaphores.
⟡ Understand the concept of monitors and message passing.
5.1 INTRODUCTION
Operating systems that support multiprogramming allow multiple
processes to execute concurrently in a system even with a single
processor. The concurrent processes may interact with one another
by sharing data or exchanging messages and control signals in order
to coordinate their actions with respect to one another. To implement
the interactions among cooperating (or interacting) processes, the
operating system must provide a means of synchronization. Based on
the nature of interactions among cooperating processes, two kinds of
synchronization have been identified: control synchronization and
data access synchronization.
Control synchronization is needed when cooperating processes
need to coordinate their execution with respect to one another. For
example, a process cannot perform a certain action in its execution
until some other process or processes have been executed up to a
specific point in their execution. Control synchronization is
implemented with the help of precedence graph of cooperating
processes. On the other hand, data access synchronization is
needed when cooperating processes access shared data. The use of
shared data by cooperating processes may lead to unexpected
results because of race conditions (discussed in the next section).
The data access synchronization mechanisms provide mutually
exclusive access to shared data, thus ensuring that the race
conditions do not occur. This chapter discusses various mechanisms
that have been developed to provide synchronization among
cooperating processes.
Race Condition
As mentioned above, unordered execution of cooperating processes
may result in data inconsistency. To understand the concept, consider
two cooperating processes P1 and P2 that update the balance of an
account in a bank. The code segment for the processes is given in
Table 5.1.
Table 5.1 Code Segment for Processes P1 and P2
Process P1                        Process P2
Read Balance                      Read Balance
Balance = Balance + 1000          Balance = Balance – 400
Suppose that the balance is initially 5000, then after the execution
of both P1 and P2, it should be 5600. The correct result is achieved if
P1 and P2 execute one by one in any order either P1 followed by P2 or
P2 followed by P1. However, if the instructions of P1 and P2 are
interleaved arbitrarily, the balance may not be 5600 after the
execution of both P1 and P2. One possible interleaving sequence for
the execution of instructions of P1 and P2 is given in Table 5.2.
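The same race can be reproduced with threads. In the C sketch below,
two threads update a shared balance without any synchronization;
depending on how their read–modify–write steps interleave, the final
balance may or may not be 5600. The function names are illustrative
only.

#include <pthread.h>
#include <stdio.h>

long balance = 5000;            /* shared data */

void *deposit(void *arg) {
    (void)arg;
    long b = balance;           /* Read Balance */
    b = b + 1000;               /* Balance = Balance + 1000 */
    balance = b;                /* write back */
    return NULL;
}

void *withdraw(void *arg) {
    (void)arg;
    long b = balance;           /* Read Balance */
    b = b - 400;                /* Balance = Balance - 400 */
    balance = b;                /* write back */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, deposit, NULL);
    pthread_create(&t2, NULL, withdraw, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("Final balance: %ld\n", balance);  /* 5600 is not guaranteed */
    return 0;
}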
Mutual Exclusion
To avoid race conditions or inconsistent results, some form of
synchronization among the processes is required which ensures that
only one process is manipulating the shared data at a time. In other
words, we need to ensure mutual exclusion. That means if a
process P1 is manipulating shared data, no other cooperating process
should be allowed to manipulate it until P1 finishes with it. In the
previous example, since mutual exclusion was not ensured while
accessing the shared data, it resulted in inconsistent result.
Concurrency Conditions
To determine whether two statements (say Si and Sj) can be
executed concurrently while still producing valid outputs, the
following three conditions (called Bernstein’s conditions) must hold:
1. R(Si) ∩ W(Sj) = { }
2. W(Si) ∩ R(Sj) = { }
3. W(Si) ∩ W(Sj) = { }
Here, R(Si) is the read set of Si, which includes the variable(s)
whose value has been referenced in Si during execution and W(Si) is
the write set of Si, which includes the variable(s) whose value is to be
modified upon the execution of Si. For example, the read sets and
write sets of the given statements are as follows.
Initially, the flag of both processes (that is, flag[0] and flag[1])
are set to false. When a process wishes to enter in its critical section,
it must set its corresponding flag to true in order to announce to the
other process that it is attempting to enter in its critical section. In
addition, the turn variable (initialized to either 0 or 1) is used to avoid
the livelock—a situation that arises when both processes prevent
each other indefinitely from entering into the critical sections.
Suppose we have two processes, say P0 and P1. The general
structure of the code segment for process P0 is as follows.
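A C-style sketch of this structure, consistent with the working of the
algorithm described below, is:

flag[0] = true;
while (flag[1]) {
    if (turn == 1) {
        flag[0] = false;        /* give way to P1 */
        while (turn == 1);      /* wait until turn becomes 0 */
        flag[0] = true;
    }
}
/* critical section */
turn = 1;                       /* transfer the right to enter to P1 */
flag[0] = false;
/* remainder section */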
The general structure of the code segment for process P1 is as
follows.
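A corresponding sketch for P1, symmetric to the above with the roles
of the flags and the values of turn interchanged, is:

flag[1] = true;
while (flag[0]) {
    if (turn == 0) {
        flag[1] = false;        /* give way to P0 */
        while (turn == 0);      /* wait until turn becomes 1 */
        flag[1] = true;
    }
}
/* critical section */
turn = 0;                       /* transfer the right to enter to P0 */
flag[1] = false;
/* remainder section */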
To understand how Dekker’s algorithm works, suppose initially,
the process P0 wants to enter its critical section. Therefore, P0 sets
flag[0] to true. It then examines the value of flag[1]. If it is found
false, P0 immediately enters its critical section; otherwise, it checks
the value of turn. If turn=1, then P0 understands that it is the turn of P1
and so, sets flag[0] to false and continues to wait until turn becomes
0. On the other hand, if P0 finds turn=0, then it understands that it is
its turn and thus, periodically checks flag of P1, that is, flag[1]. When
at some point of time, P1 sets flag[1] to false, P0 proceeds. After the
process P0 has exited its critical section, it sets flag[0] to false and
turn to 1 to transfer the right to enter the critical section to P1.
In case both processes wish to enter their critical sections at the
same time which implies that both flag[0] and flag[1] are set to
true, the value of turn variable decides which process can enter into
its critical section. If turn=0, P1 sets flag[1] to false, thus allowing P0
to enter into its critical section and vice versa if turn=1. This ensures
the mutual exclusion requirement. Observe that one of the competing
processes is allowed to enter the critical section at the time when
both attempt to enter at the same time depending on the value of turn
variable, while the other one can enter after the first one has exited
from its critical section. Thus, a process will eventually always run
when both attempt to enter at the same time. This satisfies the
bounded waiting requirement, ensuring freedom from deadlock and
livelock conditions.
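In Peterson’s algorithm, each process first raises its flag and then
yields the turn to the other process. A C-style sketch of the code
segment for process P0, consistent with the verification discussed
next, is given below; the structure for P1 is symmetric, with turn set
to 0 and flag[0] tested.

flag[0] = true;
turn = 1;                        /* give priority to P1 */
while (flag[1] && turn == 1);    /* busy wait */
/* critical section */
flag[0] = false;
/* remainder section */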
Now, consider the case when both P0 and P1 wish to enter their
critical sections at the same time. In this case, both the elements of
the flag will be set to true, and turn will be assigned by both
processes one after the other, but only the later assignment is
retained. Since both the flags are true, the value of turn alone
decides which process enters its critical section first; the other
process has to wait. This implies that mutual exclusion is preserved.
To verify that the algorithm also satisfies the other two
requirements, observe that the process P0 can be prevented from
entering its critical section if flag[1] is true and turn is 1. If P1 does
not wish to enter its critical section, then P0 finds flag[1] as false and
can enter its critical section. However, when both processes wish to
enter their critical section at the same time the variable turn plays its
role and allows one process to enter its critical section. Suppose turn
is 1, then P1 is allowed first and P0 is stuck in the loop. Now, when P1
exits from its critical section, it sets flag[1] to false to indicate that it
is not in its critical section now. This allows P0 to enter its critical
section. It means P0 enters its critical section after at most one entry
by P1, satisfying both progress and bounded-waiting requirements.
All the elements of the arrays, that is, choosing and number are
initialized to false and 0, respectively.
The algorithm assigns a number to each process and serves the
process with the lowest number first. The algorithm cannot ensure
that two processes do not receive the same number. Thus, if two
processes, say Pi and Pj, receive the same number, then Pi is served
first if i<j. The general structure of the code segment for process, say
Pi, is as follows.
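A C-style sketch of this structure for n processes, consistent with
the description above, is:

choosing[i] = true;
number[i] = 1 + MAX(number);    /* take the next token number */
choosing[i] = false;
for (j = 0; j < n; j++) {
    /* wait if Pj is currently choosing its number */
    while (choosing[j]);
    /* wait while Pj holds a smaller number, or an equal
       number with a smaller process index */
    while (number[j] != 0 &&
           (number[j] < number[i] ||
            (number[j] == number[i] && j < i)));
}
/* critical section */
number[i] = 0;
/* remainder section */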
Note: For simplicity, the notation MAX(number) is used to retrieve the
maximum element in the array number.
To verify that mutual exclusion is preserved, suppose a process P0
is executing in its critical section and another process, say P1,
attempts to enter the critical section. For j=0, the process P1 is not
blocked in the first while loop because P0 had set choosing[0] to
false in the entry section. However, in the second while loop for j=0,
P1 finds that number[0] is nonzero and that either number[0] is
smaller than number[1] or the two numbers are equal (in which case P0
wins the tie, since 0 < 1). Hence, P1 keeps looping until P0 exits its
critical section and sets number[0] back to 0, thereby preserving
mutual exclusion.
Disabling Interrupts
On a system with a single-processor, only one process executes at a
time. The other processes can gain control of processor through
interrupts. Therefore, to solve the critical-section problem, it must be
ensured that when a process is executing in its critical section,
interrupt should not occur. A process can achieve this by disabling
interrupts before entering in its critical section. Note that the process
must enable the interrupts after finishing execution in its critical
section.
This method is simple, but it has certain disadvantages. First, it is
feasible in a single-processor environment only because disabling
interrupts in a multiprocessor environment takes time as message is
passed to all the processors. This message passing delays
processes from entering into their critical sections, thus, decreasing
the system efficiency. Second, it may affect the scheduling goals,
since the processor cannot be preempted from a process executing
in its critical section.
The variable lock and all the elements of array waiting are
initialized to false. Each process also has a local Boolean variable,
say key. The general structure of the code segment for process, say
Pi, is as follows.
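A C-style sketch of this structure, consistent with the variables
described above, is given below; TestAndSet(&lock) is assumed to
atomically return the old value of lock and set it to true.

do {
    waiting[i] = true;
    key = true;
    while (waiting[i] && key)
        key = TestAndSet(&lock);   /* spin until the lock is acquired */
    waiting[i] = false;
    /* critical section */
    /* pass the turn to the next waiting process, if any */
    j = (i + 1) % n;
    while (j != i && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = false;              /* no process is waiting */
    else
        waiting[j] = false;        /* let Pj enter next */
    /* remainder section */
} while (true);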
5.7 SEMAPHORES
In 1965, Dijkstra suggested using an abstract data type called a
semaphore for controlling synchronization. A semaphore S is an
integer variable which is used to provide a general-purpose solution
to critical-section problem. In his proposal, two standard atomic
operations are defined on S, namely, wait and signal, and after
initialization, S is accessed only through these two operations. The
definition of wait and signal operation in pseudocode is as follows.
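A sketch of these definitions, in the busy-waiting form implied by
the discussion that follows, is:

wait(S) {
    while (S <= 0);   /* busy wait until S becomes positive */
    S = S - 1;
}

signal(S) {
    S = S + 1;
}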
The solution of critical-section problem for N processes is
implemented by allowing the processes to share a semaphore S,
which is initialized to 1. The general structure of the code segment for
process, say Pi, is as follows.
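A sketch of this structure, with the semaphore S initialized to 1, is:

do {
    wait(S);          /* entry section */
    /* critical section */
    signal(S);        /* exit section */
    /* remainder section */
} while (true);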
Note that all the solutions presented so far for the critical-section
problem, including the solution using semaphore, require busy
waiting. It means if a process is executing in its critical section, all
other processes that attempt to enter their critical sections must loop
continuously in the entry section. Executing a loop continuously
wastes CPU cycles, and is considered a major problem in
multiprogramming systems with one processor.
To overcome the busy waiting problem, the definition of
semaphore is modified to hold an integer value and a list of
processes, and the wait and signal operations are also modified. In
the modified wait operation, when a process finds that the value of
the semaphore is negative, it blocks itself instead of busy waiting.
Blocking a process means it is inserted in the queue associated with
the semaphore and the state of the process is switched to the waiting
state. The signal operation is modified to remove a process, if any,
from the queue associated with the semaphore and restart it. The
modified definition of semaphore, the wait operation, and the signal
operation is as follows.
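A C-style sketch of the modified definitions, consistent with the
description above, is given below; the queue manipulations are
indicated by comments.

typedef struct {
    int value;
    struct process *list;    /* queue of processes waiting on S */
} semaphore;

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        /* add this process to S->list */
        block();         /* switch this process to the waiting state */
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        /* remove a process P from S->list */
        wakeup(P);       /* move P to the ready state */
    }
}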
Note: The block() operation and wakeup() operation are provided by
the operating system as basic system calls.
An important requirement is that both the wait and signal
operations must be treated as atomic instructions. It means no two
processes can execute wait and signal operations on the same
semaphore at the same time. We can view this as a critical-section
problem, where the critical section consists of wait and signal
operations. This problem can be solved by employing any of the
solutions presented earlier.
In this way though, we have not completely eliminated the busy
waiting but limited the busy waiting to only the critical sections
consisting of wait and signal operations. Since these two operations
are very short, busy waiting occurs rarely and for a very short time
only.
The semaphore presented above is known as counting
semaphore or general semaphore, since its integer value can range
over an unrestricted domain. Another type of semaphore is binary
semaphore whose integer value can range only between 0 and 1.
Binary semaphore is simpler to implement than general semaphore.
The wait and signal operations for a binary semaphore S, initialized
to 1, are as follows.
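A sketch of these operations, in the same busy-waiting style as
above, is:

wait(S) {
    while (S == 0);   /* busy wait until S becomes 1 */
    S = 0;
}

signal(S) {
    S = 1;
}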
Initially, all the philosophers are in the thinking stage and while
thinking they do not interact with each other. As time goes on,
philosophers might feel hungry. When a philosopher feels hungry, he
attempts to pick up the two chopsticks closest to him (those lying
between him and his left and right neighbours). If the
philosophers on his left and right are not eating, he successfully gets
the two chopsticks. With the two chopsticks in his hand, he starts
eating. After eating is finished, he puts the chopsticks back on the
table and starts thinking again. On the other hand, if the philosopher
on his left or right is already eating, then he is unable to successfully
grab the two chopsticks at the same time, and thus, must wait. Note
that this situation is similar to the one that occurs in the system to
allocate resources among several processes. Each process should
get the required resources to finish its task without being deadlocked
or starved.
A solution to this problem is to represent each chopstick as a
semaphore, and philosophers must grab or release chopsticks by
executing wait operation or signal operation, respectively, on the
appropriate semaphores. We use an array chopstick of size 5 where
each element is initialized to 1. The general structure of the code
segment for philosopher i is as follows.
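A sketch of this structure, with each element of the array chopstick
initialized to 1 as stated above, is given below. Note that this
simple solution can deadlock if all five philosophers pick up their
left chopsticks at the same time.

do {
    wait(chopstick[i]);               /* pick up the left chopstick */
    wait(chopstick[(i + 1) % 5]);     /* pick up the right chopstick */
    /* eat */
    signal(chopstick[i]);             /* put down the left chopstick */
    signal(chopstick[(i + 1) % 5]);   /* put down the right chopstick */
    /* think */
} while (true);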
5.9 MONITORS
A monitor is a programming language construct which is also used to
provide mutually exclusive access to critical sections. The
programmer defines monitor type which consists of declaration of
shared data (or variables), procedures or functions that access these
variables, and initialization code. The general syntax of declaring a
monitor type is as follows.
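A sketch of this general syntax, in the usual pseudocode form, is:

monitor monitor_name
{
    /* declarations of shared variables */

    procedure P1 (. . .) {
        . . .
    }
    procedure P2 (. . .) {
        . . .
    }
    . . .
    initialization_code (. . .) {
        . . .
    }
}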
The variables defined inside a monitor can only be accessed by
the functions defined within the monitor, and no process is allowed to
directly access these variables. Thus, processes can access these
variables only through the execution of the functions defined inside
the monitor. Further, the monitor construct ensures that only one
process may be executing within the monitor at a time. If a process is
executing within the monitor, then other requesting processes are
blocked and placed on an entry queue.
Though the monitor construct ensures mutual exclusion for
processes, programmers may sometimes find it insufficient to
represent certain synchronization schemes. For such situations, a
programmer needs to define his own synchronization mechanisms.
He can do so by defining variables of condition
type on which only two operations can be invoked: wait and signal.
Suppose a programmer defines a variable C of condition type; then
execution of the operation C.wait() by a process, say Pi, suspends
the execution of Pi, and places it in a queue associated with the
condition variable C. On the other hand, the execution of the
operation C.signal() by a process, say Pi, resumes the execution of
exactly one suspended process Pj, if any. It means that the execution
of the signal operation by Pi allows other suspended process Pj to
execute within the monitor. However, only one process is allowed to
execute within the monitor at one time. Thus, the monitor construct
must prevent Pj from resuming while Pi is still executing in the
monitor. The following possibilities exist to handle this situation.
• The process Pi must be suspended to allow Pj to resume and wait
until Pj leaves the monitor.
• The process Pj must remain suspended until Pi leaves the
monitor.
• The process Pi must execute the signal operation as its last
statement in the monitor so that Pj can resume immediately.
EXERCISES
Fill in the Blanks
1. The portion of the code of a process in which it accesses or changes the
shared data is known as _____________.
2. Unordered execution of cooperating processes may result in
_____________.
3. The two standard atomic operations defined on monitor are
_____________ and _____________.
4. The semaphore whose integer value can range over an unrestricted
domain is known as _____________.
5. Two types of synchronization that may be needed among the cooperating
processes include_____________ and _____________.
Descriptive Questions
1. Explain with an example why some form of synchronization among
processes is required.
2. Define critical-section problem. Also explain all the requirements that a
solution to critical-section problem must meet.
3. Describe bakery algorithm to solve critical-section problem.
4. Give TestAndSet instruction. Also give the algorithm that uses TestAndSet
instruction to solve the critical-section problem and meets all the
requirements of the solution for the critical-section problem.
5. Write short notes on the following.
(a) Semaphore
(b) Swap instruction
(c) Entry and exit section
(d) Critical section
6. What is busy waiting? How is a semaphore used to overcome the busy
waiting problem?
7. Explain the use of semaphore in developing a solution to bounded buffer
producer-consumer problem.
8. Describe the dining-philosophers problem. Give a solution to the dining-
philosopher problem with the use of monitors.
9. Can semaphores and monitors be used in distributed systems? Why or
why not?
10. Explain the bounded buffer producer-consumer problem and provide a
solution to this using message passing.
11. Discuss the Dekker’s algorithm and prove that this algorithm satisfies all
the three requirements for solution to critical-section problem.
12. Describe the use of precedence graph. What information does it provide?
chapter 6
Deadlock
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Define system model.
⟡ Discuss the features that characterize the deadlock.
⟡ Discuss different methods of handling deadlock.
⟡ Explain how a deadlock can be prevented by eliminating one of the
four conditions of a deadlock.
⟡ Understand the concept of safe and unsafe state.
⟡ Explain various deadlock avoidance algorithms.
⟡ Discuss different deadlock detection methodologies.
⟡ List the ways to recover from a deadlock.
6.1 INTRODUCTION
Deadlock occurs when every process in a set of processes is in a
simultaneous wait state and each of them is waiting for the release of
a resource held exclusively by one of the waiting processes in the
set. None of the processes can proceed until at least one of the
waiting processes releases the acquired resource. Deadlocks may
occur on a single system or across several machines. This chapter
discusses the different ways in which these deadlocks can be
handled.
6.2 SYSTEM MODEL
A system consists of various types of resources like input/output
devices, memory space, processors, disks, etc. For some resource
types, several instances may be available. For example, a system
may have two printers. When several instances of a resource type
are available, any one of them can be used to satisfy the request for
that resource type.
A process may need multiple resource types to accomplish its
task. However, to use any resource type, it must follow some steps
which are discussed next.
1. Request for the required resource.
2. Use the allocated resource.
3. Release the resource after completing the task.
If the requested resource is not available, the requesting process
enters a waiting state until it acquires the resource. Consider a
system with a printer and a disk drive on which two processes P1 and
P2 are executing simultaneously. During execution, process P1
requests the printer and process P2 requests the disk drive, and both
requests are granted. Later, process P2 requests the printer held by
process P1, and process P1 requests the disk drive held by process P2.
Now both processes will enter a waiting state. Since each process is
waiting for the release of the resource held by the other, they will
remain in the waiting state forever. This situation is called deadlock.
Note: When two processes are inadvertently waiting for the
resources held by each other, this situation is referred to as a deadly
embrace.
Fig. 6.2 Resource Allocation Graph for Multiple Instances of a Resource Type
This resource allocation graph has the following indications.
1. Process P1 is waiting for the allocation of resource R1 held by the process
P2.
2. Process P2 is waiting for the allocation of instance (R22) of resource type
R2.
3. Process P1 is holding an instance (R21) of resource type R2.
It can be observed that the graph forms a cycle, but still the
processes are not deadlocked. Process P2 can acquire the second
instance (R22) of the resource type R2 and complete its execution.
After completing its execution, it can release the resource R1, which
can then be used by process P1. Since no process remains in a
waiting state, there is no deadlock.
From this discussion, it is clear that if each resource type has
exactly one instance, a cycle in the resource allocation graph
indicates a deadlock. If each resource type has several instances, a
cycle in the resource allocation graph does not necessarily imply a
deadlock. Thus, it can be concluded that if the graph contains no
cycle, the set of processes is not deadlocked; however, if there is a
cycle, then a deadlock may exist.
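To make the cycle test concrete, the following is a minimal C sketch of deadlock detection for single-instance resources, assuming the wait-for relation is stored as a small adjacency matrix; the names wait_for, n_proc and the MAX_PROC limit are illustrative, not part of any standard API.

    #include <stdbool.h>

    #define MAX_PROC 16

    /* wait_for[i][j] is true when process i waits for a resource held
       by process j (each resource type has a single instance). */
    static bool wait_for[MAX_PROC][MAX_PROC];
    static int  n_proc;

    /* Depth-first search: returns true if a cycle is reachable from i. */
    static bool dfs(int i, bool on_path[], bool visited[])
    {
        if (on_path[i]) return true;    /* back edge: cycle, hence deadlock */
        if (visited[i]) return false;   /* already explored, no cycle here  */
        visited[i] = on_path[i] = true;
        for (int j = 0; j < n_proc; j++)
            if (wait_for[i][j] && dfs(j, on_path, visited))
                return true;
        on_path[i] = false;
        return false;
    }

    /* True if the wait-for graph contains a cycle. */
    bool deadlocked(void)
    {
        bool on_path[MAX_PROC] = { false };
        bool visited[MAX_PROC] = { false };
        for (int i = 0; i < n_proc; i++)
            if (dfs(i, on_path, visited))
                return true;
        return false;
    }

As discussed above, a cycle found this way proves deadlock only when every resource type has exactly one instance.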
Eliminating No Preemption
Elimination of this condition means that a process can be made to
release the resources held by it. If a process requests a resource held
by some other process, then instead of making it wait, all the
resources currently held by the requesting process can be preempted.
The process is restarted only when it can be allocated the requested
resources as well as the preempted ones. Note that only those
resources whose current working state can be saved and later
restored can be preempted. For example, resources like printers and
disk drives cannot be preempted.
Fig. 6.3 Relationship between Safe State, Unsafe State and Deadlock
Fig. 6.4 Safe Sequence of Execution of Processes
On the basis of the available information, it can be easily observed
that the resource requirement of process P2 can be satisfied.
Therefore, resources are allocated to process P2 and it is allowed to
execute till its completion. After the execution of process P2, all the
resources held by it are released. The number of resources now
available is not enough to be allocated to process P1; however, it is
enough for process P3. Therefore, resources are allocated to process
P3 and it is allowed to execute till its completion. The resources
available after the execution of process P3 can now easily be
allocated to process P1. Hence, the execution of the processes in the
sequence P2, P3, P1 is safe (see Figure 6.4).
Now consider the sequence P2, P1, P3. In this sequence, after the
execution of process P2, the number of available resources is 6, all of
which are allocated to process P1. Even after the allocation of all the
available resources, process P1 is still short of one resource for its
complete execution. As a result, process P1 enters a waiting state and
waits for process P3 to release the resource held by it, while P3 in turn
is waiting for its remaining resources to be allocated. Now processes
P1 and P3 are waiting for each other to release resources, leading to a
deadlock. Hence, this sequence of process execution is unsafe.
Note that a safe state is a deadlock-free state, whereas an unsafe
state may or may not result in a deadlock. That is, an unsafe state
can lead to a deadlock, but not always.
Safety Algorithm
This algorithm is used for determining whether or not a system is in
a safe state. To understand how this algorithm works, consider a vector
Complete of size p. Following are the steps of the algorithm.
1. Initialize Complete[i]=False for all i=1, 2, 3, ..., p. Complete[i]=False
indicates that the ith process has not yet completed.
2. Search for an i such that Complete[i]=False and Ri ≤ A, that is, the
resources still required by this process are less than or equal to the
available resources. If no such process exists, go to step 4.
3. Allocate the required resources to the process and let it finish its
execution. Set Complete[i]=True for that process and add all its resources
to the vector A. Go to step 2.
4. If Complete[i]=True for all i, the system is in a safe state. Otherwise, there
exists a process for which Complete[i]=False and whose remaining
resource requirement exceeds the available resources. Hence, that
process waits endlessly, leading to an unsafe state.
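The steps above translate directly into code. The following is a minimal C sketch of the safety algorithm, assuming small fixed-size matrices and the R/C/A notation used in this chapter; the function name is_safe and the dimensions P and Q are illustrative.

    #include <stdbool.h>

    #define P 3   /* number of processes      */
    #define Q 3   /* number of resource types */

    /* Safety algorithm: R holds each process's remaining requirement,
       C its current allocation, and A the available resources. */
    bool is_safe(int R[P][Q], int C[P][Q], int A[Q])
    {
        int  avail[Q];
        bool complete[P] = { false };
        int  done = 0;

        for (int j = 0; j < Q; j++)
            avail[j] = A[j];

        while (done < P) {
            bool progressed = false;
            for (int i = 0; i < P; i++) {
                if (complete[i])
                    continue;
                bool can_run = true;             /* step 2: is Ri <= A? */
                for (int j = 0; j < Q; j++)
                    if (R[i][j] > avail[j]) { can_run = false; break; }
                if (can_run) {                   /* step 3: let it finish */
                    for (int j = 0; j < Q; j++)
                        avail[j] += C[i][j];     /* release its resources */
                    complete[i] = true;
                    progressed = true;
                    done++;
                }
            }
            if (!progressed)
                return false;                    /* step 4: unsafe state */
        }
        return true;                             /* all complete: safe   */
    }

Each pass of the outer loop performs steps 2 and 3; if a full pass makes no progress, step 4 declares the state unsafe.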
Resource-request Algorithm
Once it is confirmed that the system is in a safe state, an algorithm
called the resource-request algorithm is used for determining whether a
request by a process can be safely granted. To understand this
algorithm, let Req be a matrix of order p×q, indicating the number
of resources of each type requested by each process at any given
point of time. That is, Req[i][j] indicates the number of resources of
the jth type requested by the ith process at any given point of time.
Following are the steps of this algorithm.
1. If Req[i][j] ≤ R[i][j], go to step 2; otherwise, an error occurs, as the
process is requesting more resources than the maximum number of
resources required by it.
2. If Req[i][j] ≤ A[j], go to step 3; otherwise, the process Pi must wait
until the required resources are available.
3. Allocate the resources and make the following changes in the data
structures: A = A - Reqi, Ci = Ci + Reqi, and Ri = Ri - Reqi. If the
resulting state is safe (as determined by the safety algorithm), the
allocation is retained; otherwise, the changes are rolled back and the
process Pi must wait.
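Building on the is_safe sketch above, the resource-request algorithm can be sketched in C as follows; the provisional-grant-then-rollback structure mirrors steps 1 to 3, and the function name request_resources is illustrative.

    /* Resource-request algorithm, reusing is_safe() from the sketch
       above. Returns true when the request req[] of process i can be
       granted immediately. */
    bool request_resources(int i, const int req[Q],
                           int R[P][Q], int C[P][Q], int A[Q])
    {
        for (int j = 0; j < Q; j++)
            if (req[j] > R[i][j])
                return false;    /* step 1: exceeds maximum claim (error) */
        for (int j = 0; j < Q; j++)
            if (req[j] > A[j])
                return false;    /* step 2: not available, Pi must wait   */

        for (int j = 0; j < Q; j++) {          /* step 3: provisional grant */
            A[j]    -= req[j];
            C[i][j] += req[j];
            R[i][j] -= req[j];
        }
        if (is_safe(R, C, A))
            return true;                       /* new state safe: keep it   */

        for (int j = 0; j < Q; j++) {          /* unsafe: roll back, wait   */
            A[j]    += req[j];
            C[i][j] -= req[j];
            R[i][j] += req[j];
        }
        return false;
    }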
An Example
Consider a system with three processes (P1, P2 and P3) and three
resource types (X, Y and Z). There are 10 instances of resource type
X, 5 of Y and 7 of Z. The matrix M for the maximum number of
resources required by each process, the matrix C for the number of
resources currently allocated to each process, and the vector A for the
currently available resources are shown in Figure 6.6 (a), (b) and (c),
respectively. Now, the matrix R representing the number of remaining
resources required by each process can be obtained by the formula
R = M - C, which is shown in Figure 6.6 (d).
LET US SUMMARIZE
1. Deadlock occurs when every process in a set of processes is in a
simultaneous wait state, each waiting for the release of a resource held
exclusively by one of the other waiting processes in the set.
2. A system consists of various types of resources like input/output devices,
memory space, processors, disks, etc. For some resource types, several
instances may be available. When several instances of a resource type
are available, any one of them can be used to satisfy the request for that
resource type.
3. Four necessary conditions for a deadlock are mutual exclusion, hold and
wait, no preemption and circular wait.
4. A deadlock can be depicted with the help of a directed graph known as
resource allocation graph.
5. If each resource type has exactly one instance, a cycle in the resource
allocation graph indicates a deadlock. If each resource type has several
instances, a cycle in the resource allocation graph does not necessarily
imply a deadlock.
6. Deadlock prevention or deadlock avoidance techniques can be used to
ensure that deadlocks never occur in a system.
7. A deadlock can be prevented by not allowing all four conditions to be
satisfied simultaneously, that is, by making sure that at least one of the
four conditions does not hold.
8. A deadlock can be avoided by never allowing allocation of a resource to a
process if it leads to a deadlock. This can be achieved when some
additional information is available about how the processes are going to
request for resources in future.
9. A state is said to be safe if the allocation of resources to processes does
not lead to a deadlock. More precisely, a system is in a safe state only if
there is a safe sequence. A safe sequence is a sequence of process
execution such that each and every process executes till its completion. If
no such sequence of process execution exists, the state of the system is
said to be unsafe.
10. There is a possibility of deadlock if neither deadlock prevention nor
deadlock avoidance method is applied in a system. In such a situation, an
algorithm must be provided for detecting the occurrence of deadlock in a
system.
11. When only a single resource of each type is available, a deadlock can be
detected by using a variation of the resource allocation graph known as
the wait-for graph.
12. When multiple instances of a resource type exist, the wait-for graph
becomes insufficient for detecting deadlock in the system. For such
systems, another algorithm, which uses data structures similar to those
used in the banker's algorithm, is applied.
13. Once the deadlock is detected, a methodology must be provided for the
recovery of the system from the deadlock.
14. Two different ways in which a deadlock can be broken and the system
recovered are: terminating one or more processes to break the circular-
wait condition, and preempting resources from the processes involved in
the deadlock.
EXERCISES
Fill in the Blanks
1. If the requested resource is not available, the requesting process enters a
_____________ until it acquires the resource.
2. A deadlock can be depicted with the help of a directed graph known as
_____________.
3. Deadlock can be avoided using an algorithm known as _____________
devised by Dijkstra in 1965.
4. A _____________ is a sequence of process execution such that each and
every process executes till its completion.
5. When only a single resource of each type is available, a deadlock can be
detected by using a variation of the resource allocation graph called
_____________.
Descriptive Questions
1. Explain deadlock with an example.
2. What are the four conditions necessary for the deadlock? Explain them.
3. What are the steps performed by a process to use any resource type?
4. List the various ways to handle a deadlock.
5. How can the circular wait condition be prevented?
6. Mention different ways by which a system can be recovered from a
deadlock.
7. Consider a system having three instances of a resource type and two
processes. Each process needs two resources to complete its execution.
Can deadlock occur? Explain.
8. Consider a system in an unsafe state. Illustrate how the processes may
still complete their execution without entering a deadlock state.
9. Consider a system having six instances of a resource type and m
processes. For which values of m will deadlock not occur?
10. Consider a system consisting of four processes and a single resource. The
current state of the system is given here.
For this state to be safe, what should be the minimum number of
instances of this resource?
11. Consider the following state of a system.
Answer the following questions using the banker’s algorithm for multiple
resources:
(a) What is the content of the matrix Required?
(b) Is the system in a safe state?
(c) Can a request from a process P2 for (1, 0, 2) be granted immediately?
12. What is a resource allocation graph? Explain with an example.
13. Explain the banker’s algorithm for multiple resources with the help of an
example.
chapter 7
Memory Management
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Differentiate between logical and physical addresses.
⟡ Define address binding.
⟡ Understand the memory management in a bare machine.
⟡ Discuss memory management strategies that involve contiguous
memory allocation.
⟡ Explain memory management strategies that involve non-contiguous
memory allocation—for example, paging and segmentation.
⟡ Introduce a memory management scheme which is a combination of
segmentation and paging.
⟡ Discuss swapping.
⟡ Discuss overlays.
7.1 INTRODUCTION
To improve the utilization of the CPU and the speed of the computer’s
response to its users, the system keeps several processes in
memory; that is, several processes share the memory. This sharing of
memory creates the need for memory management. It is the job of
the memory manager, a part of the operating system, to manage
memory among multiple processes in an efficient way. For this, it
keeps track of which parts of the memory are occupied and which
parts are free, and it allocates and de-allocates memory to processes
whenever required. Moreover, it provides a protection mechanism to
protect the memory allocated to each process from being accessed
by other processes.
For managing the memory, the memory manager may choose
from a number of available strategies. Memory allocation to the
processes using these strategies is either contiguous or non-
contiguous. This chapter discusses all the memory management
strategies in detail.
7.2 BACKGROUND
Every byte in the memory has a specific address that may range from
0 to some maximum value as defined by the hardware. This address
is known as the physical address. Whenever a program is brought into
main memory for execution, it occupies a certain number of memory
locations. The set of all physical addresses used by the program is
known as its physical address space. However, before a program can
be executed, it must be compiled to produce the machine code. A
program is compiled to run starting from some fixed address and
accordingly all the variables and procedures used in the source
program are assigned some specific address known as logical
address. Thus, in machine code, all references to data or code are
made by specifying the logical addresses and not by the variable or
procedure names. The range of addresses that user programs can
use is system-defined and the set of all logical addresses used by a
user program is known as its logical address space.
When a user program is brought into main memory for execution,
its logical addresses must be mapped to physical addresses. The
mapping from addresses associated with a program to memory
addresses is known as address binding. The address binding can
take place at one of the following times.
• Compile time: The address binding takes place at compile time if
it is known at compile time which addresses the program will
occupy in main memory. In this case, the program generates
absolute code at compile time itself; that is, the logical addresses
are the same as the physical addresses.
• Load time: The address binding occurs at load time if it is not
known at the compile time which addresses the program will
occupy in the main memory. In this case, the program generates
relocatable code at the compile time which is then converted into
the absolute code at the load time.
• Run time: The address binding occurs at run time if the process is
supposed to move from one memory segment to other during its
execution. In this case also, the program generates relocatable
code at compile time which is then converted into the absolute
code at the run time.
Note: The run time address binding is performed by the hardware
device known as memory-management unit (MMU).
Note: The portion of the operating system residing in the memory is known as
resident monitor.
Hardware Support
Single contiguous memory allocation is a simple memory
management scheme that is usually associated with stand-alone
computers with simple batch operating systems. Thus, no special
hardware support is needed, except for protecting the operating
system from the user process. This hardware protection mechanism
may include a bounds register and two modes of CPU operation
(supervisor and user). The address of the protected area is contained
in the bounds register. Access to the protected area is then controlled
depending upon the mode of the CPU. If the CPU is in user mode, a
hardware check is performed each time a memory reference is made
to ensure that it is not an access to the protected area. If an attempt is
made to access the protected area, an interrupt is generated and
control is transferred to the operating system. On the other hand, if
the CPU is in supervisor mode, the operating system is allowed to
access the protected area as well as to execute the privileged
instructions that alter the content of the bounds register.
Software Support
In this scheme, only one process can execute at a time. Whenever a
process is to be executed, its size is checked. If the size is less than
or equal to the size of memory, the operating system loads it into the
main memory for execution. After the termination of that process, the
operating system waits for another process. If the size of the process
is greater than the size of memory, an error occurs and the next
process is scheduled. The same sequence is performed for each
subsequent process.
Advantages
This scheme is easy to implement. Generally, the operating system
needs to keep track of only the first and the last locations allocated to
the user process. In this case, the first location immediately follows
the operating system and the last location is determined by the
capacity of the memory. No hardware support is needed except for
protecting the operating system from the user process.
Disadvantages
The main drawback of this technique is that since only one user
process can execute at a time, the portion of the memory which is
allocated but not used by the process will get wasted as shown in
Figure 7.1. Thus, memory is not fully utilized. Another disadvantage is
that the process size must be smaller or equal to the size of main
memory; otherwise, the process cannot be executed.
Note: The memory management scheme having a single partition is
used by single-process microcomputer operating systems, such as
CP/M and PC-DOS.
Fragmentation Problem
An important issue related to contiguous multiple partition allocation
scheme is to deal with memory fragmentation. There are two facets
to memory fragmentation: internal and external fragmentation.
Internal fragmentation exists in the case of memory having multiple
fixed partitions when the memory allocated to a process is not fully
used by the process. Internal fragmentation has already been
discussed in detail. So here we focus on external fragmentation.
External fragmentation (also known as checker boarding) occurs
in the case of memory having multiple variable partitions when the
total free memory in the system is large enough to accommodate a
waiting process but it cannot be utilized as it is not contiguous.
To understand the external fragmentation problem, consider the
memory system (map along with free storage list) shown in Figure
7.5. Now, if a request for a partition of size 5M arrives, it cannot be
granted because no single partition is available that is large enough
to satisfy the request [see Figure 7.5 (a)]. However, the combined
free space is sufficient to satisfy the request.
To get rid of the external fragmentation problem, it is desirable to
relocate (or shuffle) some or all portions of the memory in order to
place all the free holes together at one end of memory to form one
large hole. This technique of reforming the storage is termed
compaction. Compaction results in the memory being partitioned into
two contiguous blocks: one of used memory and another of free
memory. Figure 7.9 shows the memory map of Figure 7.5(a) after
compaction has been performed. Compaction may take place the
moment a process frees some memory or when a request for
allocating memory fails, provided the combined free space is enough
to satisfy the request. Since compaction is expensive in terms of CPU
time, it is rarely used.
Relocation
From the earlier discussion, it is clear that different processes run
in different partitions. Now, suppose a process contains an instruction
that requires access to address location 50 in its logical address
space. If this process is loaded into a partition at address 10M, this
instruction will refer to the absolute address 50 in physical memory,
which lies inside the operating system. In this case, it is required to
map the address location 50 in the logical address space to the
address location 10M + 50 in physical memory. Similarly, if the
process is loaded into some other partition, say at address 20M, then
the address should be mapped to location 20M + 50. This problem is
known as the relocation problem. It can be solved by equipping the
system with a hardware register called the relocation register, which
contains the starting address of the partition into which the process is
loaded. Whenever an address is generated during the execution of
the process, the memory management unit adds the content of the
relocation register to the address, resulting in the physical memory
address.
Example 3 Consider that the logical address of an instruction in a
program is 7632 and the content of relocation register is 2500. To
which location in the memory will this address be mapped?
Solution Here, Logical address = 7632,
Content of relocation register = 2500
Since, Physical address = Logical address + Content of relocation
register
Physical address = 7632 + 2500 = 10132
Thus, the logical address 7632 will be mapped to the location 10132
in memory.
Example 4 If a computer system has 16-bit address lines and
supports a 1K page size, what will be the maximum page number
supported by the system?
Solution A computer system having 16-bit address lines implies that
the logical address is of 16 bits. Therefore, the size of the logical
address space is 2^16 and the page size is 1K, that is, 1 * 1024 bytes
= 2^10 bytes.
Thus, the page offset will be of 10 bits and the page number will be of
(16-10) = 6 bits.
Therefore, the maximum page number supported by this system
is 111111 (in binary), that is, 63.
Memory Protection
Using the relocation register, the relocation problem can be solved,
but there remains a possibility that a user process may access the
memory addresses of other processes or of the operating system. To
protect the operating system from being accessed by user processes,
and the processes from one another, another hardware register called
the limit register is used. This register holds the range of logical
addresses. Each logical address of a program is checked against this
register to ensure that the program does not attempt to access a
memory address outside its allocated partition. Figure 7.10 shows the
relocation and protection mechanism using the relocation and limit
registers, respectively.
Fig. 7.10 Relocation and Protection using Relocation and Limit Register
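A minimal C sketch of this address mapping is given below, assuming illustrative values for the two registers; on real hardware the check and the addition are performed by the MMU, and an out-of-range access raises a trap to the operating system rather than calling exit().

    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative register values for one loaded process. */
    static unsigned relocation_reg = 2500;   /* start of the partition */
    static unsigned limit_reg      = 8000;   /* size of the partition  */

    /* Map a logical address the way the MMU would: check it against
       the limit register, then add the relocation register. */
    unsigned map_address(unsigned logical)
    {
        if (logical >= limit_reg) {
            fprintf(stderr, "trap: address %u outside the partition\n",
                    logical);
            exit(EXIT_FAILURE);          /* the OS would abort the process */
        }
        return logical + relocation_reg;
    }

    /* map_address(7632) returns 10132, as in Example 3. */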
7.5 NON-CONTIGUOUS MEMORY ALLOCATION
In non-contiguous allocation approach, parts of a single process can
occupy noncontiguous physical addresses. In this section, we will
discuss memory management schemes based on non-contiguous
allocation of physical memory.
7.5.1 Paging
In paging, the physical memory is divided into fixed-sized blocks
called page frames, and the logical memory is also divided into fixed-
sized blocks, called pages, which are of the same size as the page
frames. When a process is to be executed, its pages can be loaded
into any unallocated frames (not necessarily contiguous ones) from
the disk. Figure 7.11 shows two processes A and B with all their pages
loaded into the memory. In this figure, the page size is 4KB.
Nowadays, systems typically support page sizes between 4KB and
8KB. However, some systems support even larger page sizes.
Basic Operation
In paging, the mapping of logical addresses to physical addresses is
performed at the page level. When CPU generates a logical address,
it is divided into two parts: a page number (p) [high-order bits] and a
page offset (d) [low-order bits] where d specifies the address of the
instruction within the page p. Since the size of the logical address
space is a power of 2, the page size is always chosen as a power of 2
so that the logical address can be split easily into the page number
and page offset. To understand this, consider that the size of the
logical address space is 2^m. Now, if we choose a page size of 2^n
(bytes or words), then n bits will specify the page offset and m-n bits
will specify the page number.
Example 5 Consider a system that generates logical addresses of 16
bits and has a page size of 4KB. How many bits would specify the
page number and the page offset?
Fig. 7.11 Concept of Paging
Note: Some systems like Solaris support multiple page sizes (say 8KB and
4MB) depending on the data stored in the pages.
Solution Here, the logical address is of 16 bits, that is, the size of the
logical address space is 2^16, and the page size is 4KB, that is,
4 * 1024 bytes = 2^12 bytes.
Thus, the page offset will be of 12 bits and the page number will be of
(16-12) = 4 bits.
Now let us see how a logical address is translated into a physical
address. In paging, address translation is performed using a mapping
table, called the page table. The operating system maintains a page
table for each process to keep track of which page frame is allocated
to which page. It stores the frame number allocated to each page, and
the page number is used as an index into the page table (see Figure
7.12).
When CPU generates a logical address, that address is sent to
MMU. The MMU uses the page number to find the corresponding
page frame number in the page table. That page frame number is
attached to the high-order end of the page offset to form the physical
address that is sent to the memory. The mechanism of translation of
logical address into physical address is shown in Figure 7.13.
Note: Since pages and page frames are of the same size, the offsets within
them are identical and need not be mapped.
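The translation described above can be sketched in C as follows, assuming the 16-bit logical addresses and 4KB pages of Example 5; the page_table array and its contents are hypothetical.

    #include <stdint.h>

    #define PAGE_BITS 12                     /* 4KB pages              */
    #define PAGE_SIZE (1u << PAGE_BITS)
    #define NUM_PAGES 16                     /* 16-bit logical space   */

    /* Hypothetical per-process page table: page number -> frame number. */
    static uint32_t page_table[NUM_PAGES];

    /* Translate a 16-bit logical address into a physical address. */
    uint32_t translate(uint16_t logical)
    {
        uint32_t p = logical >> PAGE_BITS;        /* page number (high bits) */
        uint32_t d = logical & (PAGE_SIZE - 1);   /* page offset (low bits)  */
        uint32_t frame = page_table[p];           /* page-table lookup       */
        return (frame << PAGE_BITS) | d;          /* frame number + offset   */
    }

Because the page size is a power of 2, the split into p and d needs only a shift and a mask, which is why page sizes are always chosen this way.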
Advantages
• Since memory is always allocated in fixed-sized units, any free
frame can be allocated to a process. Thus, there is no external
fragmentation.
Disadvantages
• Since memory is allocated in terms of an integral number of page
frames, there may be some internal fragmentation. That is, if the
size of a given process is not a multiple of the page size, the last
frame allocated to the process may not be completely used. To
illustrate this, consider a page size of 4KB and a process that
requires 8195 bytes of memory, that is, 2 pages + 3 bytes. In this
case, a third frame is allocated for just 3 bytes, and almost the
entire frame is wasted, resulting in internal fragmentation.
Page Allocation
Whenever a process requests a page frame, the operating system
first locates a free page frame in the memory and then allocates it to
the requesting process. For this, the operating system must keep
track of the free and allocated page frames in physical memory. One
way to achieve this is to maintain a memory-map table (MMT). An
MMT is structured as a static table in which each entry describes the
status of a page frame, indicating whether it is free or allocated.
Hence, an MMT for a given system contains exactly as many entries
as there are page frames in the physical memory. That is, if the size
of physical memory is m and the page size is p, then
f = m/p, where f is the number of page frames.
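A minimal C sketch of frame allocation using an MMT follows, assuming the MMT is a simple array with one entry (free or allocated) per page frame; the names mmt, allocate_frame and free_frame are illustrative.

    #include <stdbool.h>

    #define NUM_FRAMES 1024   /* f = m/p; e.g. 4MB of memory, 4KB pages */

    /* Memory-map table: one entry per page frame, true when allocated. */
    static bool mmt[NUM_FRAMES];

    /* Find and allocate a free page frame; returns -1 when memory is full. */
    int allocate_frame(void)
    {
        for (int f = 0; f < NUM_FRAMES; f++)
            if (!mmt[f]) {
                mmt[f] = true;
                return f;
            }
        return -1;
    }

    /* Release a page frame so that it can be reallocated. */
    void free_frame(int f)
    {
        mmt[f] = false;
    }

A linked list of free frames, mentioned later in the summary, avoids the linear scan by popping a free frame in constant time.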
The TLB can contain entries for more than one process at the same
time, so there is a possibility that two processes map the same
page number to different frames. To resolve this ambiguity, a
process identifier (PID) can be added to each entry of the TLB. For
each memory access, the PID present in the TLB is matched with
the value in a special register that holds the PID of the currently
executing process. If it matches, the page number is searched to
find the page frame number; otherwise, the access is treated as a
TLB miss.
Hierarchical Paging
In a system where the page table becomes so large that it
occupies a significant amount of physical memory, the page table
itself needs to be broken into pages that can be stored non-
contiguously. For example, consider a system with a 32-bit logical
address space (2^32). Assuming a page size of 4 KB (2^12), the page
table consists of 2^20 (2^32/2^12) entries. If each page table entry
consists of 4 bytes, the total physical memory occupied by the page
table is 4 MB, which cannot be kept in the main memory all the time.
To get around this problem, many systems use a hierarchical (or
multilevel) page table, in which a hierarchy of page tables with
several levels is maintained. This implies that the logical address
space is broken down into multiple page tables at different levels.
The simplest way is to use a two-level paging scheme in which
the top-level page table indexes the second-level page tables. In a
system having one large page table, the 32-bit logical address is
divided into a page number consisting of 20 bits and a page offset
consisting of 12 bits. Now, for a two-level page table, the page
number of 20 bits is further divided into a 10-bit outer page number
(p1) and a 10-bit inner page number (p2), while the 12-bit offset
remains unchanged (see Figure 7.16).
Fig. 7.16 A 32-bit Logical Address with Two Page Table Fields
In this figure, p1 is the index into top-level page table and p2 is the
displacement within the page of the top-level page table. The address
translation scheme for 32-bit paging architecture is shown in Figure
7.17. This scheme is also called forward-mapped page table
because the address translation works from the top-level page table
towards the inner page table.
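A minimal C sketch of the two-level lookup follows, assuming the 10-bit/10-bit/12-bit split described above and ignoring invalid or unallocated entries for brevity; the table names are hypothetical.

    #include <stdint.h>

    /* Split a 32-bit logical address into 10-bit p1, 10-bit p2 and a
       12-bit offset, as in Figure 7.16. */
    #define P1(a)  (((a) >> 22) & 0x3FF)
    #define P2(a)  (((a) >> 12) & 0x3FF)
    #define OFF(a) ((a) & 0xFFF)

    /* Hypothetical top-level page table: each entry points to a second-
       level table of 1024 frame numbers. */
    static uint32_t *top_level[1024];

    uint32_t translate2(uint32_t logical)
    {
        uint32_t *second = top_level[P1(logical)];  /* index the outer table */
        uint32_t frame   = second[P2(logical)];     /* index the inner table */
        return (frame << 12) | OFF(logical);        /* attach the offset     */
    }

The benefit is that second-level tables for unused regions of the address space need never be allocated, so the 4 MB single table is replaced by only the pieces actually in use.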
Note that a hashed page table lookup may require many memory
references to search the desired virtual address and its
corresponding frame number because there is no guarantee on the
number of entries in the linked list.
7.5.2 Segmentation
A user views a program as a collection of segments such as main
program, routines, variables, etc. All of these segments are variable
in size and their size may also vary during execution. Each segment
is identified by a name (or segment number) and the elements within
a segment are identified by their offset from the starting of the
segment. Figure 7.21 shows the user view of a program.
Advantages
• Since a segment contains one type of object, each segment can
have different type of protection. For example, a procedure can
be specified as execute only whereas a char type array can be
specified as read only.
• It allows sharing of data or code between several processes. For
example, a common function or shared library can be shared
between various processes. Instead of having them in address
space of every process, they can be put in a segment and that
segment can be shared.
Example 8 Using the following segment table, compute the physical
address for the logical address consisting of segment and offset as
given below.
(a) segment 2 and offset 247
(b) segment 4 and offset 439
Solution
(a) Here, offset = 247 and segment = 2.
It is clear from the segment table that the limit of segment 2 is 780
and its base is 2200.
Since the offset is less than the segment limit, the physical address is
computed as:
Physical address = Offset + Segment base
= 247 + 2200 = 2447
(b) Here, offset = 439 and segment = 4.
It is clear from the segment table that the limit of segment 4 is 400
and its base is 1650.
Since the offset is greater than the segment limit, an invalid-address
error is generated.
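The check performed in this example can be sketched in C as follows. The entries for segments 2 and 4 use the limit and base values from Example 8; the remaining entries are made-up values, and on real hardware the comparison is done by the MMU, which raises a trap instead of calling exit().

    #include <stdio.h>
    #include <stdlib.h>

    struct segment { unsigned limit, base; };

    /* Hypothetical segment table. */
    static const struct segment seg_table[] = {
        { 500, 1000 },   /* segment 0 (illustrative)      */
        { 300, 5000 },   /* segment 1 (illustrative)      */
        { 780, 2200 },   /* segment 2, as in Example 8(a) */
        { 600, 4000 },   /* segment 3 (illustrative)      */
        { 400, 1650 },   /* segment 4, as in Example 8(b) */
    };

    unsigned translate_seg(unsigned s, unsigned offset)
    {
        if (offset >= seg_table[s].limit) {          /* invalid address */
            fprintf(stderr, "trap: offset %u beyond limit of segment %u\n",
                    offset, s);
            exit(EXIT_FAILURE);
        }
        return seg_table[s].base + offset;           /* base + offset   */
    }

    /* translate_seg(2, 247) returns 2447; translate_seg(4, 439) traps. */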
7.6 SWAPPING
Fig. 7.25 Swapping
LET US SUMMARIZE
1. To improve the utilization of the CPU and the speed of the computer’s
response to its users, the system keeps several processes in memory. It
is the job of memory manager, a part of the operating system, to manage
memory between multiple processes in an efficient way.
2. For managing the memory, the memory manager may choose from a
number of available memory management strategies.
3. All the memory management strategies allocate memory to the processes
using either of two approaches: contiguous memory allocation or non-
contiguous memory allocation.
4. Every byte in the memory has a specific address that may range from 0 to
some maximum value as defined by the hardware. This address is known
as physical address.
5. A program is compiled to run starting from some fixed address and
accordingly all the variables and procedures used in the source program
are assigned some specific address known as logical address.
6. The mapping from addresses associated with a program to memory
addresses is known as address binding. The addresses binding can take
place at compile time, load time or run time.
7. In computer terminology, a bare machine refers to a computer having no
operating system. In such a system, the whole memory is assigned to the
user process, which runs in the kernel mode.
8. In contiguous memory allocation, each process is allocated a single
contiguous part of the memory. The different memory management
schemes that are based on this approach are single partition and multiple
partitions.
9. In single partition technique, main memory is partitioned into two parts.
One of them is permanently allocated to the operating system while the
other part is allocated to the user process.
10. The simple way to achieve multiprogramming is to divide the main memory
into a number of partitions which may be of fixed or variable sizes.
11. There are two alternatives for multiple partition technique—equal-sized
partitions or unequal-sized partitions.
12. In the equal-sized partitions technique, any process can be loaded into
any partition. Regardless of how small a process is, it occupies an entire
partition, which leads to the wastage of memory within the partition. This
phenomenon of wasted memory within a partition is called internal
fragmentation.
13. In unequal-sized partition, whenever a process arrives, it is placed into the
input queue of the smallest partition large enough to hold it. When this
partition becomes free, it is allocated to the process.
14. MVT (Multiprogramming with a Variable number of Tasks) is the
generalization of the fixed partitions technique in which the partitions can
vary in number and size. In this technique, the amount of memory
allocated is exactly the amount of memory a process requires.
15. In MVT, the wastage of the memory space is called external fragmentation
(also known as checker boarding) since the wasted memory is not a part
of any partition.
16. Whenever a process arrives and there are various holes large enough to
accommodate it, the operating system may use one of the algorithms to
select a partition for the process: first fit, best fit, worst fit and quick fit.
17. In multiprogramming environment, multiple processes are executed due to
which two problems can arise which are relocation and protection.
18. The relocation problem can be solved by equipping the system with a
hardware register called relocation register which contains the starting
address of the partition into which the process is to be loaded.
19. To protect the operating system from access by other processes and the
processes from one another, another hardware register called limit
register is used.
20. In non-contiguous allocation approach, parts of a single process can
occupy non-contiguous physical addresses.
21. Paging and segmentation are the memory management techniques based
on the noncontiguous allocation approach.
22. In paging, the physical memory is divided into fixed-sized blocks called
page frames and logical memory is also divided into fixed-size blocks
called pages which are of same size as that of page frames. The address
translation is performed using a mapping table, called page table.
23. To keep track of free and allocated page frames in physical memory, the
operating system maintains a data structure called a memory-map table
(MMT). An MMT is structured in the form of a static table, in which each
entry describes the status of each page frame, indicating whether it is free
or allocated. Another approach to keep track of free frames is to maintain
a list of free frames in the form of a linked list.
24. For structuring page table, there are different techniques, namely,
hierarchical paging, hash page table, and inverted page table.
25. In hierarchical paging technique, a hierarchy of page tables with several
levels is maintained. This implies that the logical address space is broken
down into multiple page tables at different levels.
26. In hashed table technique, a hash table is maintained in which each entry
contains a linked list of elements hashing to the same location. Each
element in linked list contains three fields: virtual page number, the value
of mapped page frame, and a pointer to the next element in the linked list.
27. An inverted page table contains one entry for each page frame of main
memory. Each entry consists of the virtual address of the page stored in
that page frame along with the information about the process that owns
that page. Hence, only one page table is maintained in the system for all
the processes.
28. Segmentation is a memory management scheme that implements the user
view of a program. In this scheme, the entire logical address space is
considered as a collection of segments with each segment having a
number and a length. To keep track of each segment, a segment table is
maintained by the operating system.
29. The idea behind the segmentation with paging is to combine the
advantages of both paging (such as uniform page size) and segmentation
(such as protection and sharing) together into a single scheme. In this
scheme, each segment is divided into a number of pages. To keep track
of these pages, a page table is maintained for each segment.
30. A memory management scheme called swapping can be used to increase
CPU utilization. The process of bringing a process into memory, letting it
run for a while, and then temporarily copying it to disk is known as
swapping.
31. Overlaying is a memory management technique that allows a process to
execute irrespective of the system having insufficient physical memory.
The programmer splits a program into smaller parts called overlays in
such a way that no two overlays are required to be in main memory at the
same time. An overlay is loaded into memory only when it is needed.
EXERCISES
Fill in the Blanks
1. The mapping from addresses associated with a program to memory
addresses is known as _____________.
2. The division of logical memory into fixed size blocks is called
_____________.
3. _____________ is a hardware device situated in the MMU that is used to
implement the page table.
4. Each entry in an _____________ page table consists of the virtual
address of the page stored in that page frame along with the information
about the process that owns that page.
5. The process of bringing a process to memory and after running for a
while, temporarily copying it to disk is known as _____________.
Descriptive Questions
1. Distinguish between the physical address and the logical address.
2. What is address binding? At what times does it take place?
3. Differentiate between internal and external fragmentation.
4. Discuss the basic operation involved in paging technique with the help of
suitable diagram.
5. What do you mean by segmentation?
6. Consider the following memory map with a number of variable size
partitions.
Assume that initially, all the partitions are empty. How would each of the
first fit, best fit and the worst fit partition selection algorithms allocate
memory to the following processes arriving one after another?
(a) P1 of size 2M
(b) P2 of size 2.9M
(c) P3 of size 1.4M
(d) P4 of size 5.4M
Does any of the algorithms result in a process waiting because of
insufficient memory? Also determine which of the algorithms uses the
memory most efficiently.
7. Consider a paged memory system with 2^16 bytes of physical memory,
256 pages of logical address space, and a page size of 2^10 bytes. How
many bytes are in a page frame?
8. Can a process on a paged memory system access memory allocated to
some other process? Why or why not?
9. The operating system makes use of two different approaches for keeping
track of free page frames in the memory. Discuss both of them. Which
one is better in terms of performance?
10. Discuss in detail the memory management strategies involving contiguous
memory allocation. Give suitable diagrams, wherever required.
11. Write short notes on the following.
(i) Swapping
(ii) Overlays
chapter 8
Virtual Memory
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the concept of virtual memory.
⟡ Implement virtual memory using demand paging.
⟡ Evaluate performance of demand paging.
⟡ Discuss how process creation and execution can be made faster
using copy-on-write.
⟡ Explain various page replacement algorithms.
⟡ Discuss allocation of frames to processes.
⟡ Explain thrashing along with its causes and prevention.
⟡ Discuss the demand segmentation, which is another way of
implementing virtual memory.
⟡ Understand the use of cache memory to increase the CPU
performance.
⟡ Explain the organization of cache memory.
8.1 INTRODUCTION
In Chapter 7, we discussed various memory management strategies.
All these strategies require the entire process to be in main memory
before its execution. Thus, the size of the process is limited to the
size of physical memory. To overcome this limitation, a memory
management scheme called overlaying can be used that allows a
process to execute irrespective of the system having insufficient
physical memory. However, this technique suffers from the drawback
that it requires major involvement of the programmer. Moreover,
splitting a program into smaller parts is time consuming.
This resulted in the formulation of another memory management
technique known as virtual memory. Virtual memory gives the
illusion that the system has much larger memory than actually
available memory. The basic idea behind this technique is that the
combined size of code, data and stack may exceed the amount of
physical memory. Thus, virtual memory frees programs from the
constraints of physical memory limitation. Virtual memory can be
implemented by demand paging or demand segmentation. Out of
these two ways, demand paging is commonly used as it is easier to
implement.
8.2 BACKGROUND
Whenever a program needs to be executed, it must reside in the
main memory. The programs with size smaller than the size of the
memory can be fitted entirely in the memory at once for execution.
However, the same is not possible for larger programs. Since, in real
life, many programs are larger than the size of the physical memory,
there must be some way to execute them. By examining real-life
programs, it has been observed that it is rarely the case that the
entire program is required in the memory at once for execution. In
addition, some portions of the program are rarely or never executed.
For example, consider the following cases:
• Most programs contain code segments that are written to handle
error conditions. Such code is executed only when some error
occurs during the execution of the program. If the program runs
without any error, this code is never executed. Thus, keeping such
code segments in the memory is merely a wastage of memory.
• Certain subroutines of a program that provide additional
functionality are rarely used by the user. Keeping such
procedures in the memory also results in the wastage of memory.
A technique called virtual memory tends to avoid such wastage of
main memory. As per this technique, the operating system loads into
the memory only those parts of the program that are currently needed
for the execution of the process. The rest is kept on the disk and is
loaded only when needed. The main advantage of this scheme is that
the programmers get the illusion of much larger memory than
physical memory, thus, the size of the user program would no longer
be constrained by the amount of available physical memory. In
addition, since each user utilizes less physical memory, multiple users
are allowed to keep their programs simultaneously in the memory.
This results in increased utilization and throughput of the CPU.
Figure 8.1 illustrates the concept of virtual memory, where a 64M
program can run on a 32M system by keeping only 32M of it in the
memory at any instant; the parts of the program are swapped
between memory and the disk as needed.
Note: In demand paging system, the process of loading a page in the memory
is known as page-in operation instead of swap-in. It is because the whole
process is not loaded; only some pages are loaded into the memory.
Advantages
• It reduces the swap time since only the required pages are
swapped in instead of swapping the whole process.
• It increases the degree of multiprogramming by reducing the
amount of physical memory required for a process.
• It minimizes the initial disk overhead as initially not all pages are to
be read.
• It does not need extra hardware support.
EAT = (1 - p) * ma + p * tpfh
where
p (0 ≤ p ≤ 1) is the probability of a page fault. If p=0, there is no
page fault, whereas p=1 implies that every reference is a page
fault. We can expect p to be close to zero, that is, there will be
only a few page faults.
ma is the memory access time
tpfh is the page fault handling time
Note that if there are no page faults (that is, p=0), the EAT is
equal to the memory access time, as shown below:
EAT = (1 - 0) * ma + 0 * tpfh = ma
For example, assuming a memory access time of 20 nanoseconds
and a page fault handling time of 8 milliseconds (8,000,000
nanoseconds), the EAT can be calculated as:
EAT = (1 - p) * 20 + p * 8,000,000
= 20 + 7,999,980 * p nanoseconds
Belady’s Anomaly
FIFO page replacement algorithm suffers from Belady’s anomaly—a
situation in which increasing the number of page frames results in
more page faults. To illustrate this, consider the reference string
containing five pages, numbered 2 to 6 (see Figure 8.8). From this
figure, it is clear that with three page frames, a total of nine page
faults occur. On the other hand, with four page frames, a total of ten
page faults occur.
Fig. 8.8 Belady’s Anomaly
Remember that all frames are initially empty, so your first unique
pages will all cost one fault each.
Solution In case of one page frame, each page reference causes a
page fault. As a result, there will be 20 page faults for all the three
replacement algorithms.
Page frames = 2
LRU replacement causes 18 page faults as shown in the figure given
below.
FIFO replacement also causes 18 page faults (see the figure given
below).
Page frames = 4
LRU replacement causes 10 page faults as shown in figure given
next.
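Belady's anomaly is easy to reproduce by simulation. The following self-contained C sketch counts FIFO page faults for a given number of frames; it uses the classic reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (the pages are numbered differently from those of Figure 8.8, but the behaviour is the same), yielding 9 faults with three frames and 10 faults with four.

    #include <stdio.h>

    #define MAX_FRAMES 16

    /* Count page faults under FIFO replacement for a given number of
       frames (frames are initially empty, so every first reference to a
       page costs one fault). */
    int fifo_faults(const int *refs, int n, int nframes)
    {
        int frames[MAX_FRAMES], next = 0, faults = 0;
        for (int i = 0; i < nframes; i++)
            frames[i] = -1;                      /* all frames empty */

        for (int r = 0; r < n; r++) {
            int hit = 0;
            for (int i = 0; i < nframes; i++)
                if (frames[i] == refs[r]) { hit = 1; break; }
            if (!hit) {                          /* fault: evict the oldest */
                frames[next] = refs[r];
                next = (next + 1) % nframes;
                faults++;
            }
        }
        return faults;
    }

    int main(void)
    {
        /* The classic reference string exhibiting Belady's anomaly. */
        int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
        int n = (int)(sizeof refs / sizeof refs[0]);
        printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));  /* 9  */
        printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));  /* 10 */
        return 0;
    }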
Allocation Algorithms
Two common algorithms used to divide free frames among
competing processes are as follows:
• Equal allocation: This algorithm allocates available frames to
processes in such a way that each runnable process gets an
equal share of frames. For example, if p frames are to be
distributed among q processes, then each process will get p/q
frames. Though this algorithm seems to be fair, it does not work
well in all situations. For example, consider two processes P1 and
P2, where the memory requirement of P1 is much higher than that
of P2. Allocating an equal number of frames to both P1 and P2
does not make sense, as it would result in wastage of frames; the
process P2 might be allocated more frames than it actually
needs.
• Proportional allocation: This algorithm allocates frames to each
process in proportion to its total size. To understand this
algorithm, let us consider F as the total number of available
frames and vi as the amount of virtual memory required by a
process pi. Therefore, the overall virtual memory, V, required by
all the running processes and the number of frames (ni) that
should be allocated to a process pi can be calculated as:
V = v1 + v2 + ... + vp
ni = (vi / V) * F
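A minimal C sketch of the proportional allocation computation follows; the function name is illustrative, and the integer division truncates the result, as in Example 3 below.

    /* Proportional allocation: n_i = (v_i / V) * F frames, where V is
       the total virtual-memory demand of all processes. */
    void proportional_allocation(const int v[], int n[], int nproc, int F)
    {
        long V = 0;
        for (int i = 0; i < nproc; i++)
            V += v[i];
        for (int i = 0; i < nproc; i++)
            n[i] = (int)((long)v[i] * F / V);   /* truncating division */
    }

    /* With v = {10, 127} and F = 62 this yields n = {4, 57},
       matching Example 3 below. */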
Example 3 Consider a system with a 1 KB frame size. If a process P1 of
size 10 KB and a process P2 of size 127 KB are running in a system
having 62 free frames, how many frames would be allocated to each
process in the case of
(a) the equal allocation algorithm?
(b) the proportional allocation algorithm?
Which allocation algorithm is more efficient?
Solution Since the frame size is 1 KB, P1 requires 10 frames and P2
requires 127 frames. Thus, a total of 137 frames are required to run
both P1 and P2. As the total number of free frames is 62:
(a) According to the equal allocation algorithm, both P1 and P2 will be
allocated 31 frames each.
(b) According to the proportional allocation algorithm, P1 will be
allocated 4 frames (that is, (10/137) * 62) and P2 will be allocated 57
frames (that is, (127/137) * 62).
Clearly, proportional allocation is more efficient here: under equal
allocation, P1 would receive 31 frames though it needs only 10,
wasting 21 frames.
8.7 THRASHING
When a process has not been allocated as many frames as it needs
to support its pages in active use, it causes a page fault. To handle
this situation, some of its pages should be replaced. But since all its
pages are being actively used, the replaced page will soon be
referenced again thereby causing another page fault. Eventually,
page faults would occur very frequently, replacing pages that would
soon be required to be brought back into memory. As a result, the
system would be mostly busy in performing paging (page-out, page-
in) rather than executing the processes. This high paging activity is
known as thrashing. It results in poor system performance as no
productive work is being performed during thrashing.
The system can detect thrashing by evaluating CPU utilization
against the degree of multiprogramming. Generally, as we increase
the degree of multiprogramming, CPU utilization increases. However,
this does not always hold true. To illustrate this, consider the graph
shown in Figure 8.12 that depicts the behaviour of paging systems.
Initially, the CPU utilization increases with increase in degree of
multiprogramming. It continues to increase until it reaches its
maximum. Now, if the number of running processes is still increased,
CPU utilization drops sharply. To enhance CPU utilization at this
point, the degree of multiprogramming must be reduced.
8.7.1 Locality
Thrashing can be prevented if each process is allocated as much
memory (as many frames) as it requires. But how should the
operating system know the memory requirement (the number of
frames required) of a process? The solution to this problem is
influenced by two opposing factors: over-commitment and under-
commitment of memory. If a process is allocated more frames (over-
commitment) than it requires, only a few page faults would occur. The
process performance would be good; however, the degree of
multiprogramming would be low. As a result, CPU utilization and
system performance would be poor. In contrast, under-commitment of
memory to a process causes a high page-fault rate (as discussed
earlier), which results in poor process performance. Thus, for
better system performance, it is necessary to allocate appropriate
number of frames to each process.
A clue about the number of frames needed by a process can be
obtained using the locality model of process execution. The locality
model states that while a process executes, it moves from locality to
locality. A locality is defined as the set of pages that are actively used
together. It is a dynamic property in the sense that the identity of the
together. It is a dynamic property in the sense that the identity of the
particular pages that form the actively used set varies with time. That
is, the program moves from one locality to another during its
execution.
Note: Localities of a process may coincide partially or wholly.
The principle of locality ensures that not too many page faults
would occur if the pages in the current locality of a process are
present in the memory. However, it does not rule out page faults
totally. Once all pages in the current locality of a process are in the
memory, page fault would not occur until the process changes
locality. On the other hand, if a process has not been allocated
enough frames to accommodate its current locality, thrashing would
result.
LET US SUMMARIZE
1. Virtual memory is a technique that enables the execution of a program
that is only partially in memory. Virtual memory can be implemented by
demand paging or demand segmentation.
2. In demand paging, a page is loaded into the memory only when it is
needed during program execution. Pages that are never accessed are
never loaded into the memory.
3. Whenever a process requests for a page and that page is not in the
memory then MMU raises an interrupt called page fault or a missing page
interrupt.
4. A reference string is an ordered list of memory references made by a
process.
5. A technique made available by virtual memory called copy-on-write makes
the process creation faster and conserves memory.
6. The first-in, first-out (FIFO) is the simplest page replacement algorithm. As
the name suggests, the first page loaded into the memory is the first page
to be replaced.
7. The optimal page replacement (OPT) algorithm is the best possible page
replacement algorithm in which the page to be referenced in the most
distant future is replaced.
8. The least recently used (LRU) algorithm is an approximation to the optimal
algorithm in which the page that has not been referenced for the longest
time is replaced.
9. The second chance page replacement algorithm (sometimes also referred
to as clock algorithm) is a refinement over FIFO algorithm; it replaces the
page that is both the oldest as well as unused, instead of the oldest page
that may be heavily used.
10. The least frequently used (LFU) algorithm replaces the page that is least
frequently used.
11. The most frequently used (MFU) algorithm replaces the page with the
highest reference count, on the argument that the page with the smallest
count has probably just been brought in and is yet to be used.
12. In multiprogramming systems where a number of processes may reside in
the main memory at the same time, the free frames must be divided
among the competing processes. Thus, a decision is to be made on the
number of frames that should be allocated to each process.
13. Two common algorithms used to divide free frames among competing
processes include equal allocation and proportional allocation algorithm.
14. Equal allocation algorithm allocates available frames to the processes in
such a way that each runnable process gets an equal share of frames
while proportional allocation algorithm allocates frames to each process in
proportion to its total size.
15. A situation when the system is mostly busy in performing paging (page-
out, page-in) rather than executing the processes is known as thrashing. It
results in poor performance of the system as no productive work is
performed during thrashing.
16. A clue about the number of frames needed by a process can be obtained
using the locality model of a process execution. The locality model states
that while a process executes, it moves from locality to locality. Locality is
defined as the set of pages that are actively used together.
17. Working set model is an approach used to prevent thrashing, and is based
on the assumption of locality. It uses a parameter (say, n) to define the
working set of a process, which is the set of pages that a process has
referenced in the latest n page references.
18. PFF is another approach to prevent thrashing that takes into account the
page-fault rate of a process. This approach provides an idea of when to
increase or decrease the frame allocation.
19. In demand segmentation, a segment of variable size is brought into the
memory. The working set of segmentation should include at least one
each of code, data, and stack segments.
20. The main advantage of demand segmentation is that it inherits the benefits
of protection and sharing provided by segmentation.
21. Cache is a small but very high-speed memory that aims to speed up the
memory access operation. It is placed between the CPU and the main
memory.
22. The cache organization concerns itself with the transfer of information
between the CPU and the main memory.
EXERCISES
Fill in the Blanks
1. _____________ is a technique that enables the execution of a program
that is only partially in memory.
2. Whenever a process is to be executed, an area on secondary storage
device is allocated to it on which its pages are copied. The area is known
as _____________ of the process.
3. A _____________ is an ordered list of memory references made by a
process.
4. The algorithm based on the interpretation that a page has just been
brought in and has yet to be used is _____________.
5. The system can detect thrashing by evaluating _____________ against
the _____________.
Descriptive Questions
1. Explain the concept of virtual memory.
2. What is demand paging? What are its advantages? Explain how it affects
the performance of a computer system.
3. When does a page fault occur? Mention the steps that are taken to handle
page fault.
4. Discuss the hardware support for demand paging.
5. Explain the algorithm used to minimize the number of page faults.
6. Explain process creation.
7. ‘Copy-on-write technique makes the creation of process faster and
conserves memory.’ Explain.
8. What is the need of page replacement algorithms?
9. What is Belady’s anomaly? Does LRU replacement algorithm suffer from
this anomaly? Justify your answer with an example.
10. What are memory-mapped files?
11. Discuss the advantages and disadvantages of optimal page replacement
algorithm.
12. Compare LRU and FIFO page replacement algorithms.
13. Which algorithm is used as the basis for comparing performance of other
algorithms?
14. Discuss the two algorithms used for allocating physical frames to
processes.
15. Differentiate between global and local allocation.
16. How does the system keep track of modification of pages?
17. Write a short note on demand segmentation. How is it different from
demand paging?
18. Consider the following reference string consisting of 7 pages numbered from 0 to 6. Assuming one, two, three, and four frames, determine how many page faults would occur with each of the following algorithms:
(a) FIFO replacement
(b) Optimal replacement
(c) LRU replacement
19. Consider Figure 8.11(b) and suppose that R bits for the pages are 111001.
Which page will be replaced using second chance replacement
algorithm?
20. What will be the effect of setting the value of parameter n (in working-set
model) either too low or too high on the page-fault rate?
21. What is thrashing? Explain the approaches that can be used to prevent
thrashing.
22. What is cache memory? Explain its organization. Also, list its advantages
and disadvantages.
chapter 9
I/O Systems
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the basics of I/O hardware.
⟡ Explain how various I/O services are embodied in the application I/O
interface.
⟡ Discuss the services provided by the kernel I/O subsystem.
⟡ Describe the STREAMS mechanism of UNIX System V.
⟡ Explain how I/O requests are transformed to hardware operations.
⟡ Understand different factors affecting the performance of I/O.
9.1 INTRODUCTION
I/O and processing are the two main jobs that are performed on a
computer system. In most cases, the user is interested in I/O
operations rather than in processing. For example, while working in MS Word, the user is interested in reading, entering or printing some information, and not in computing an answer. Thus, controlling and
managing the I/O devices and the I/O operations is one of the main
responsibilities of an operating system. The operating system must
issue commands to the devices to work, provide a device-
independent interface between devices and the rest of the system,
handle errors or exceptions, catch interrupts, etc.
Since a variety of I/O devices with varying speeds and
functionality are attached to the system, providing a device-
independent interface is a major challenge for operating system
designers. To meet this challenge, designers use a combination of
hardware and software techniques. The basic I/O hardware elements
include ports, buses, and device controllers that can accommodate a
wide variety of I/O devices. In addition, various device-drivers
modules are provided with the operating system kernel to
encapsulate the details and peculiarities of different devices. This
forms an I/O subsystem of the kernel, which separates the
complexities of managing I/O devices from the rest of the kernel.
9.3.1 Polling
A complete interaction between a host and a controller may be
complex, but the basic abstract model of interaction can be
understood by a simple example. Suppose that a host wishes to
interact with a controller and write some data through an I/O port. It is
quite possible that the controller is busy performing some other task; hence, the host has to wait before starting an interaction with the controller. When the host is in this waiting state, we say the host is busy-waiting or polling.
Note: Controllers are programmed to indicate their status and to understand certain indications. For example, every controller sets a busy bit when it is busy and clears it when it becomes free.
To start the interaction, the host continues to check the busy bit
until the bit becomes clear. When the host finds that the busy bit has
become clear, it writes a byte in the data-out register and sets the
write bit to indicate the write operation. It also sets the command-
ready bit to let the controller take action. When the controller notices
that the ready bit is set, it sets the busy bit and starts interpreting the
command. As it identifies that the write bit is set, it starts reading the
data-out register to get the byte and writes it to the device. After this,
the controller clears the ready bit and busy bit to indicate that it is
ready to take the next instruction. In addition, the controller also
clears an error bit (in the status register) to indicate successful
completion of the I/O operation.
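The handshake described above can be expressed as a small C sketch. Note that the register addresses and bit masks used below are purely illustrative assumptions; real controllers define their own layouts.

#include <stdint.h>

#define STATUS_BUSY 0x01  /* set by the controller while it is busy */
#define CMD_WRITE   0x02  /* write bit in the command register */
#define CMD_READY   0x04  /* command-ready bit */

/* Assumed memory-mapped register addresses (illustrative only). */
volatile uint8_t *status_reg  = (volatile uint8_t *)0x1F0;
volatile uint8_t *command_reg = (volatile uint8_t *)0x1F1;
volatile uint8_t *data_out    = (volatile uint8_t *)0x1F2;

void polled_write_byte(uint8_t byte)
{
    while (*status_reg & STATUS_BUSY)
        ;                        /* busy-wait until the busy bit clears */
    *data_out = byte;            /* place the byte in the data-out register */
    *command_reg |= CMD_WRITE;   /* indicate a write operation */
    *command_reg |= CMD_READY;   /* let the controller take action */
    while (*status_reg & STATUS_BUSY)
        ;                        /* wait until the controller finishes */
}

Observe that both while loops burn CPU cycles doing nothing useful, which is exactly the cost of polling discussed next.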
9.3.2 Interrupt-driven I/O
The above scheme for performing interaction between the host and
controller is not always feasible, since it requires busy-waiting by the host. When either the controller or the device is slow, this waiting time may be long. In that scenario, the host must switch to another task.
However, if the host switches to another task and stops checking the
busy bit, how would it come to know that the controller has become
free?
One solution to this problem is that the host must check the busy
bit periodically and determine the status of the controller. This
solution, however, is not feasible because in many cases the host
must service the device continuously; otherwise the data may be lost.
Another solution is to arrange the hardware with which a controller
can inform the CPU that it has finished the work given to it. This
mechanism of informing the CPU about the completion of a task (rather than the CPU inquiring about its completion) is called an interrupt. The interrupt mechanism eliminates the need for busy-waiting by the processor and hence is considered more efficient than polling.
Now let us understand how the interrupt mechanism works. The
CPU hardware has an interrupt-request line, which the controllers use
to raise an interrupt. The controller asserts a signal on this line when
the I/O device becomes free after completing the assigned task. As
the CPU senses the interrupt-request line after executing every instruction, it promptly comes to know when an interrupt has occurred.
To handle the interrupt, the CPU performs the following steps.
1. It saves the state of current task (at least the program counter)
so that the task can be restarted (later) from where it was
stopped.
2. It switches to the interrupt-handling routine (at some fixed
address in memory) for servicing the interrupt. The interrupt
handler determines the cause of interrupt, does the necessary
processing, and causes the CPU to return to the state prior to
the interrupt.
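These two steps can be summarized in a conceptual C sketch; save_state(), identify_cause(), service(), and restore_state() are hypothetical helpers, since real handlers are architecture-specific.

void save_state(void);       /* save at least the program counter */
int  identify_cause(void);   /* determine the cause of the interrupt */
void service(int cause);     /* do the necessary processing */
void restore_state(void);    /* return the CPU to its prior state */

void interrupt_entry(void)
{
    save_state();                  /* step 1: save the current task's state */
    int cause = identify_cause();  /* step 2: the handler finds the cause, */
    service(cause);                /* performs the required processing,    */
    restore_state();               /* and resumes the interrupted task     */
}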
The above discussed interrupt-handling mechanism is the ideal
one. However, in modern operating systems, the interrupt-handling
mechanism must accommodate the following features.
• High-priority interrupts must be identified and serviced before low-
priority interrupts. If two interrupts occur at the same time, the
interrupt with high-priority must be identified and serviced first.
Also, if one interrupt is being serviced and another high-priority
interrupt occurs, the high-priority interrupt must be serviced
immediately by preempting the low-priority interrupt.
• The CPU must be able to disable the occurrence of interrupts.
This is useful when CPU is going to execute those instructions of
a process that must not be interrupted (like instructions in the
critical section of a process). However, disabling all the interrupts
is not the right decision. This is because interrupts not only indicate the completion of a task by a device but also signal many exceptions, such as an attempt to access a non-existent memory address, a divide-by-zero error, etc. To resolve this, most CPUs have two
interrupt-request lines: maskable and non-maskable interrupts.
Maskable interrupts are used by device controllers and can be
disabled by the CPU whenever required but non-maskable
interrupts handle exceptions and should not be disabled.
Note that when the DMA controller acquires the bus for
transferring data, the CPU has to wait for accessing the bus and the
main memory; though it can access cache. This mechanism is called
cycle stealing and it can slightly slow down the CPU. However, it is important to note that a large amount of data gets transferred with negligible involvement of the CPU. Hence, DMA is a very good approach for freeing the CPU to perform other tasks.
Network Devices
Since the performance and addressing characteristics of network I/O
is different from that of disk I/O, the interface provided for network I/O
is also different. Unlike read(), write() and seek() interface for disks,
the socket interface is provided for network I/O. This interface is
provided in most of the operating systems including UNIX and
Windows NT.
The socket interface consists of various system calls that enable
an application to perform the following tasks:
• To create a socket
• To connect a local socket to a remote address
• To listen to any remote application
• To send and receive packets over the connection.
The socket interface also provides a select() function to provide
information about the sockets. When this function is called, it returns information about which sockets have space to accept a packet to be sent and which sockets have a packet waiting to be received. This eliminates the need for polling and busy-waiting.
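As a brief illustration, the following C sketch creates a socket, connects it to an example address, and uses select() to block until a packet is waiting; error handling is omitted, and the address and port are assumptions.

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int socket_demo(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);           /* create a socket */

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(7);                         /* assumed remote port */
    inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr);    /* example address */
    connect(fd, (struct sockaddr *)&addr, sizeof addr); /* connect to remote */

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    select(fd + 1, &readfds, NULL, NULL, NULL);  /* block until a packet waits */

    char buf[512];
    ssize_t n = read(fd, buf, sizeof buf);       /* receive the data */
    close(fd);
    return (int)n;
}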
9.5.2 Buffering
A buffer is a region of memory used for holding streams of data
during data transfer between an application and a device or between
two devices. Buffering serves the following purposes in a system.
• The speeds of the producer and the consumer of data streams
may differ. If the producer produces items faster than the consumer can consume them (or vice versa), the consumer (or the producer) would be kept waiting most of the time. To cope with this speed mismatch between the producer and consumer, buffering may be used. Both producer
and consumer share a common buffer. The producer produces an
item, places it in the buffer and continues to produce the next item
without having to wait for the consumer. Similarly, the consumer
can consume the items without having to wait for the producer.
However, due to fixed size of the buffer, the producer and
consumer still have to wait in case of full and empty buffer,
respectively. To resolve this, double buffering may be used
which allows sharing of two buffers between the producer and the
consumer thereby relaxing the timing requirements between
them.
• The sender and receiver may have different data transfer sizes. To
cope with such disparities, buffers are used. At the sender’s side,
large data is fragmented into small packets, which are then sent
to the receiver. At the receiver’s side, these packets are placed
into a reassembly buffer to produce the source data.
• Another common use of buffering is to support copy semantics for
application I/O. To understand the meaning of copy semantics,
consider that an application invokes the write() system call for data in the buffer associated with it to be written to the disk. Further, suppose that as soon as the system call returns, the application changes the contents of the buffer. As a result, the version of the data meant to be written to the disk is lost. But with copy semantics, the system can ensure that the appropriate version of the data is written to the disk. To ensure this, a buffer is maintained in the kernel. At the time the application invokes the write() system call, the data is copied to the kernel buffer. Thus, any subsequent changes in the application buffer have no effect (see the sketch below).
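A minimal sketch of this idea in C, assuming a hypothetical kernel-side write path (queue_for_disk() is an invented helper, not a real kernel routine):

#include <stddef.h>
#include <stdlib.h>
#include <string.h>

void queue_for_disk(void *kbuf, size_t len);  /* hypothetical helper */

long do_write(const void *user_buf, size_t len)
{
    void *kbuf = malloc(len);       /* kernel buffer */
    if (kbuf == NULL)
        return -1;
    /* Copy at the time of the call: later changes the application
       makes to user_buf cannot affect what gets written to the disk. */
    memcpy(kbuf, user_buf, len);
    queue_for_disk(kbuf, len);
    return (long)len;
}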
9.5.3 Caching
A cache is an area of very high speed memory, which is used for
holding copies of data. It provides a faster and an efficient means of
accessing data. It is different from the buffer in the sense that a buffer
may store the only existing copy of data (that does not reside
anywhere else) while a cache may store a copy of data that also
resides elsewhere.
Though caching and buffering serve different purposes,
sometimes an area of memory is used for both purposes. For
example, the operating system may maintain a buffer in the main
memory to store disk data for efficient disk I/O and at the same time
can use this buffer as cache to store the file blocks which are being
accessed frequently.
9.5.4 Spooling
SPOOL is an acronym for Simultaneous Peripheral Operation On-
line. Spooling refers to storing jobs in a buffer so that the CPU can
be utilized efficiently. Spooling is useful because devices access data
at different rates. The buffer provides a waiting station where the data
can rest while the slower device catches up. The most common
spooling application is print spooling. In a multiuser environment,
where multiple users can give the print command simultaneously, the
spooler loads the documents into a buffer from where the printer pulls
them off at its own rate. Meanwhile, a user can perform other
operations on the computer while the printing takes place in the
background. Spooling also lets a user place a number of print jobs on
a queue instead of waiting for each one to finish before specifying the
next one. The operating system also manages all requests to read or
write data from the hard disk through spooling.
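Conceptually, a print spooler is just a first-in, first-out queue of jobs. The following minimal C sketch (with invented names and no overflow handling) captures the idea: users submit at their own pace, and the printer pulls jobs at its own rate.

#define MAX_JOBS 100

static const char *spool[MAX_JOBS];  /* buffer holding queued documents */
static int head = 0, tail = 0;

/* Called when a user gives the print command: queue the job and return. */
void spool_submit(const char *document)
{
    spool[tail++ % MAX_JOBS] = document;
}

/* Called by the printer at its own rate: pull the next job off the queue. */
const char *spool_next(void)
{
    if (head == tail)
        return 0;                    /* queue empty */
    return spool[head++ % MAX_JOBS];
}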
9.7 STREAMS
STREAMS is a UNIX System V mechanism that enables
asynchronous I/O between a user and a device. It provides a full-
duplex (two-way communication) connection between a user process
and the device driver of the I/O device. A STREAM consists of a
stream head, driver end, and stream modules (zero or more). The
stream head acts as an interface to the user process and the driver
end controls the device. Between the stream head and the driver
end, are the stream modules that provide the functionality of
STREAMS processing. Each of the stream head, driver end, and
stream modules is associated with two queues: read queue and write
queue. The read queue is used to store the requests for reading from
the device while the write queue is used to store the requests for
writing to the device. Each queue can communicate with its
neighbouring queue via message passing. Figure 9.4 shows the
structure of STREAMS.
Fig. 9.4 Structure of STREAMS
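At the user level, a process typically talks to the stream head through the System V calls putmsg() and getmsg(), as in the following sketch; "/dev/streamdev" is a hypothetical STREAMS device and error handling is omitted.

#include <stropts.h>
#include <fcntl.h>
#include <unistd.h>

int stream_demo(void)
{
    int fd = open("/dev/streamdev", O_RDWR);  /* open the stream head */

    char out[] = "hello";
    struct strbuf data = { 0, sizeof out, out };
    putmsg(fd, NULL, &data, 0);        /* message travels down the write queues */

    char in[64];
    struct strbuf reply = { sizeof in, 0, in };
    int flags = 0;
    getmsg(fd, NULL, &reply, &flags);  /* message arrives via the read queues */

    close(fd);
    return reply.len;                  /* number of bytes received */
}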
9.8 PERFORMANCE
System performance is greatly affected by I/O. As we know, an I/O system call invoked by an application has to pass through a
number of software layers such as kernel, device driver, and device
controller before reaching the physical device. This demands more
CPU time and therefore, performing I/O is costly in terms of the CPU
cycles. Moreover, the layers between the application and the physical
device imply overhead of:
• Context switching while crossing the kernel’s protection boundary.
• Interrupt-handling and signal-handling in the kernel to serve the
I/O device.
• Load on the CPU and memory bus while copying data between
device controller and physical memory and between kernel
buffers and application space.
One cause behind context switching is the occurrence of
interrupts. Whenever an interrupt occurs, the system performs a state
change, executes the appropriate interrupt handler, and then restores
the state. Though modern computers are able to deal with several
thousands of interrupts per second, handling an interrupt is quite an
expensive task.
Another cause of a high context-switching rate is network traffic. To understand how this happens, suppose that a process on one machine wants to log in to a remote machine
connected via the network. Now, the following sequence of steps
takes place for transferring each character from the local machine to
the remote machine.
1. A character is typed on the local machine causing a keyboard
(hardware) interrupt. The system state is saved and the control
is passed to the appropriate interrupt handler.
2. After the interrupt has been handled, the character is passed to
the device driver and from there to the kernel. Finally, the
character is passed from the kernel to the user process. A
context switch occurs as the kernel switches from kernel mode
to user mode.
3. The user process invokes a network I/O system call to pass the
character through the network. A context switch occurs and the
character flows into the local kernel.
4. The character passes through network layers that prepare a
network packet which is transferred to the network device driver
and then to the network controller.
5. The network controller transfers the packet onto the network and
causes an interrupt. The system’s state is saved and the
interrupt is handled.
6. After the interrupt has been handled, a context switch occurs to
indicate the completion of network I/O system call.
7. At the receiving side, the network packet is received by the
network hardware and an interrupt occurs which causes the
state save.
8. The character is unpacked and is passed to the device driver
and from there to the kernel. A context switch occurs and the
character is passed to the appropriate network daemon.
9. The network daemon determines which login session is involved
and passes the character to the network sub-daemon via the
kernel, thereby resulting in two context switches.
Thus, it is clear that passing data through the network involves a
lot of interrupts, state switches, and context switches. Moreover, if the
receiver has to echo the character back to the sender, the work
doubles.
In general, the efficiency of I/O in a system can be improved by:
• reducing the number of context switches.
• reducing the frequency of interrupt generation by employing large
data transfers, smart controllers, and polling.
• reducing the frequency of copying data in the memory during data
transfer between application and device.
• balancing the load of memory bus, CPU, and I/O.
• employing DMA-knowledgeable controllers for increasing
concurrency.
LET US SUMMARIZE
1. Controlling and managing the I/O devices and the I/O operations is one of
the main responsibilities of an operating system. The operating system
must issue commands to the devices to work, provide a device-
independent interface between devices and the rest of the system, handle
errors or exceptions, catch interrupts, and so on.
2. I/O devices are broadly classified into three categories, namely, human-
interface devices, storage devices, and network devices.
3. The basic I/O hardware elements include ports, buses, and device
controllers. A port is a connection point through which a device is
attached to the computer system. It could be a serial port or parallel port.
After being attached, the device communicates with the computer by
sending signals over a bus. A bus is a group of wires that specifies a set
of messages that can be sent over it.
4. A device controller (or adapter) is an electronic component that can
control one or more identical devices depending on the type of device
controller.
5. There are basically three different ways to perform I/O operations
including programmed I/O (polling), interrupt-driven I/O, and direct
memory access (DMA).
6. During programmed I/O, the host may have to wait continuously while the
controller is busy performing some other task. This behaviour is often
called busy-waiting or polling.
7. In interrupt-driven I/O, the CPU is informed of the completion of a task
(rather than CPU inquiring the completion of the task) by means of
interrupts. The interrupt mechanism eliminates the need of busy-waiting of
processor and hence is considered more efficient than programmed I/O.
8. In DMA, the DMA controller interacts with the device without the CPU
being bothered. As a result, the CPU can be utilized for multiple tasks.
9. All I/O devices are grouped under a few general kinds. For each general
kind, a standardized set of functions (called interface) is designed through
which the device can be accessed. The differences among the I/O
devices are encapsulated into the kernel modules called device drivers.
10. A block device stores data in fixed-size blocks with each block having a
specific address. A character device is the one that accepts and produces
a stream of characters. Unlike block devices, character devices are not
addressable.
11. A socket interface is provided in most of the operating systems for network
I/O. It also provides a select() function to provide information about the
sockets.
12. Clocks and timers are used for getting the current time and elapsed time,
and for setting the timer for some operation or interrupt.
13. An operating system may use blocking or non-blocking I/O system calls for
application interface. The blocking I/O system call causes the invoking
process to block until the call is completed. On the other hand, the non-
blocking I/O system calls do not suspend the execution of the invoking
process for a long period; rather they return quickly with a return value
which indicates the number of bytes that have been transferred.
14. A wide variety of methods are used to control the devices attached to the
computer system. These methods altogether form the I/O subsystem of
the kernel. The kernel I/O subsystem is responsible for providing various
I/O-related services, which include scheduling, buffering, caching,
spooling, and error handling.
15. When a user requests for an I/O operation, a number of steps are
performed to transform the I/O request into the hardware operation so as
to service the I/O request.
16. STREAMS is a UNIX System V mechanism that enables asynchronous
(non-blocking) I/O between a user and a device. It provides a full-duplex
(two-way communication) connection between a user process and the
device driver of the I/O device.
17. The efficiency of I/O in a system can be improved by reducing the number
of context switches, reducing the frequency of interrupt generation,
reducing the frequency of copying data in the memory during data transfer
between application and device, balancing the load of memory bus, CPU,
I/O, and employing DMA-knowledgeable controllers for increasing
concurrency.
EXERCISES
Fill in the Blanks
1. A device is attached with the computer system via a connection point
known as _____________.
2. DMA stands for _____________.
3. Applications can interact with the block and character devices through the
_________ and _____________, respectively.
4. _____________ means deciding the order in which the I/O requests
should be executed.
5. A stream consists of a _____________, _____________, and
_____________.
Descriptive Questions
1. Define the following terms:
(i) Port
(ii) Bus
(iii) Device controller
(iv) Spooling
2. Discuss the various categories of I/O devices. How do these devices differ
from each other?
3. What is asynchronous I/O?
4. State the difference between blocking and non-blocking I/O.
5. Describe some services provided by the I/O subsystem of a kernel.
6. List two common uses of buffering.
7. What is busy-waiting? Is it preferred over blocking-wait?
8. How does DMA result in increased system concurrency?
9. Differentiate between STREAMS driver and STREAMS module.
10. Write a short note on the following:
• Interrupt-driven I/O
• Block and character devices
• Kernel I/O structure
11. With the help of flow chart explain the lifecycle of the I/O operation.
12. Discuss the role of socket interface for network devices.
chapter 10
Mass-Storage Structure
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the physical structure of magnetic disks.
⟡ Describe disk scheduling and various algorithms that are used to
optimize disk performance.
⟡ Explain disk management including formatting of disks and
management of boot and damaged blocks.
⟡ Discuss swap-space management.
⟡ Explain RAID and its levels.
⟡ Describe disk attachment.
⟡ Explore stable storage and tertiary storage.
10.1 INTRODUCTION
As discussed in the previous chapter, a computer system consists of
several devices (such as mouse, keyboard, disk, monitor, CD-ROM)
that deal with different I/O activities. Among all these I/O devices, disk
(or some kind of disk) is considered as an essential requirement for
almost all the computers. Other devices like mouse, CD-ROM, or
even keyboard and monitor are optional for some systems such as
servers. This is because servers are usually accessed by other
resources (say, clients) on the network. Therefore, this chapter mainly
focuses on disk related issues such as its physical structure,
algorithms used to optimize its performance, its management, and
reliability.
Note: Some disks have one read/write head for each track of the platter. These disks are termed fixed-head disks, since one head is fixed on each track and is not moveable. On the other hand, disks in which the head moves along the platter surface are termed moveable-head disks.
(a) FCFS algorithm
The total distance moved in serving all the pending requests can be calculated as:
(143 – 86) + (1470 – 86) + (1470 – 913) + (1774 – 913) + (1774
– 948) + (1509 – 948) + (1509 – 1022) + (1750 – 1022) + (1750
– 130)
⇒ 57 + 1384 + 557 + 861 + 826 + 561 + 487 + 728 + 1620
⇒ 7081 cylinders
(b) SSTF algorithm
First of all, the head serves the cylinder 130 as it is closest to its
current position (which is 143). From there, it moves to the cylinder
86, to 913, 948, 1022, 1470, 1509, 1750, and finally, to the cylinder
1774. Figure 10.8 illustrates how the pending requests are scheduled
according to SSTF algorithm.
The total distance moved in serving all the pending requests can
be calculated as:
(143 – 86) + (1774 – 86)
⇒ 57 + 1688
⇒ 1745 cylinders
(c) SCAN algorithm
As the head is currently serving the request at 143 and previously it
was serving at 125, it is clear that the head is moving towards
cylinder 4999. While moving, the head serves the requests at
cylinders which fall on the way, that is, 913, 948, 1022, 1470, 1509,
1750 and 1774 in this order. Then, upon reaching the end, that is, at
cylinder 4999, the head reverses its direction and serves the requests
at cylinders 130 and 86. Figure 10.9 illustrates how the pending
requests are scheduled according to SCAN algorithm.
The total distance moved in serving all the pending requests can be
calculated as:
(4999 – 143) + (4999 – 86)
⇒ 4856 + 4913
⇒ 9769 cylinders
(d) LOOK algorithm
In LOOK algorithm, the head serves the requests in the same manner
as in SCAN algorithm except when it reaches cylinder 1774, it
reverses its direction instead of going to the end of the disk. Figure
10.10 illustrates how the pending requests are scheduled according
to the LOOK algorithm.
The total distance moved in serving all the pending requests can
be calculated as:
(1774 – 143) + (1774 – 86)
⇒ 1631 + 1688
⇒ 3319 cylinders
(e) C-SCAN algorithm
In C-SCAN algorithm, the head moves from its current position towards cylinder 4999, serving the requests that fall on the way. Upon reaching cylinder 4999, it jumps back to cylinder 0 without serving any requests on the return trip and then serves the remaining requests at cylinders 86 and 130.
The total distance moved in serving all the pending requests can be calculated as:
(4999 – 143) + 4999 + 130
⇒ 4856 + 5129
⇒ 9985 cylinders
(f) C-LOOK algorithm
In C-LOOK algorithm, the head serves the requests in the same
manner as in LOOK algorithm. The only difference is that when it
reverses its direction on reaching cylinder 1774, it first serves the
request at cylinder 86 and then at 130. Figure 10.12 illustrates how
the pending requests are scheduled according to C-LOOK algorithm.
Fig. 10.12 Using C-LOOK Algorithm
The total distance moved in serving all the pending requests can
be calculated as:
(913 – 143) + (948 – 913) + (1022 – 948) + (1470 – 1022) +
(1509 – 1470) + (1750 – 1509) + (1774 – 1750) + (1774 – 86) +
(130 – 86)
⇒ 770 + 35 + 74 + 448 + 39 + 241 + 24 + 1688 + 44
⇒ 3363 cylinders
Thus, we see that in this example the SSTF algorithm proves
fastest as the head needs to move only 1745 cylinders, as against
7081, 9769, 3319, 9985, and 3363 in other cases.
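For reference, the following small C program computes the FCFS total for this example; the request queue below is inferred from the calculations shown above, with the head initially at cylinder 143.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int requests[] = {86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130};
    int n = sizeof requests / sizeof requests[0];
    int head = 143;                 /* current head position */
    long total = 0;

    for (int i = 0; i < n; i++) {   /* serve requests in arrival order */
        total += labs((long)requests[i] - head);
        head = requests[i];
    }
    printf("FCFS total head movement: %ld cylinders\n", total);  /* 7081 */
    return 0;
}

The other algorithms differ only in the order in which the queue is traversed, so the same loop applied to a reordered queue reproduces the remaining totals.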
With four data disks, RAID level 4 requires just one check disk. Effective space utilization for our example of four data disks is 80 per cent. As only one check disk is required to hold parity information regardless of the number of data disks, effective space utilization increases with the number of data disks.
• RAID level 5: Instead of placing data across N disks and parity
information in one separate disk, this level distributes the block-
interleaved parity and data among all the N+1 disks. Such
distribution is advantageous in processing read/write requests. All disks can participate in processing read requests, unlike RAID level 4, where the dedicated check disk never participates in read requests. So level 5 can satisfy more read requests in a given amount of time. Since the bottleneck of a single check disk has been eliminated, several write requests can also be processed in parallel. RAID level 5 has the best performance among all the RAID levels with redundancy. In our example of four data disks, a RAID level 5 system has five disks in all; thus, the effective space utilization for level 5 is the same as in levels 3 and 4 (the parity computation common to these levels is sketched after this list).
• RAID level 6: RAID level 6 is an extension of RAID level 5 and
applies P + Q redundancy scheme using Reed-Solomon codes.
Reed-Solomon codes enable RAID level 6 to recover from up to
two simultaneous disk failures. RAID level 6 requires two check
disks; however, like RAID level 5, redundant information is
distributed across all disks using block-level striping.
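The parity used by levels 4 and 5 (and the P part of level 6) is a simple bytewise XOR across the data disks, as the following C sketch illustrates.

#include <stddef.h>

/* Compute the check (parity) block as the bytewise XOR of the
   corresponding blocks on all the data disks. */
void compute_parity(unsigned char *parity, unsigned char *data[],
                    int ndisks, size_t blocksize)
{
    for (size_t i = 0; i < blocksize; i++) {
        unsigned char p = 0;
        for (int d = 0; d < ndisks; d++)
            p ^= data[d][i];
        parity[i] = p;
    }
}

A lost block on any one disk can then be reconstructed by XOR-ing the parity block with the surviving data blocks.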
Optical Disks
An optical disk is a flat, circular, plastic disk coated with a material on
which bits may be stored in the form of highly reflective areas and
significantly less reflective areas, from which the stored data may be
read when illuminated with a narrow-beam source, such as a laser
diode. The optical disk storage system consists of a rotating disk
coated with a thin layer of metal (aluminium, gold, or silver) that acts
as a reflective surface and a laser beam, which is used as a
read/write head for recording data onto the disk. Compact disk (CD)
and digital versatile disk (DVD) are the two forms of optical disks.
Magneto-optical Disk
As implied by the name, these disks use a hybrid of magnetic and
optical technologies. A magneto-optical disk writes magnetically (with
thermal assist) and reads optically using the laser beam. A magneto-
optical disk drive is so designed that an inserted disk will be exposed
to a magnet on the label side and to the light (laser beam) on the
opposite side. The disks, which come in 3½-inch and 5¼-inch
formats, have a special alloy layer that has the property of reflecting
laser light at slightly different angles depending on which way it is
magnetized, and data can be stored on it as north and south
magnetic spots, just like on a hard disk.
While a hard disk can be magnetized at any temperature, the
magnetic coating used on the magneto-optical media is designed to
be extremely stable at room temperature, making the data
unchangeable unless the disk is heated to above a temperature level
called the Curie point (usually around 200º C). Instead of heating the
whole disk, magneto-optical drives use a laser to target and heat only
specific regions of the magnetic particles. This accurate technique
enables magneto-optical media to pack in a lot more information than
the other magnetic devices. Once heated, the magnetic particles can
easily have their direction changed by a magnetic field generated by
the read/write head. Information is read using a less powerful laser,
making use of the Kerr effect, where the polarity of the reflected light
is altered depending on the orientation of the magnetic particles.
Where the laser/magnetic head has not touched the disk, the spot
represents a ‘0’, and the spots where the disk has been heated up
and magnetically written will be seen as data ‘1’. However, this is a
‘two-pass’ process, which coupled with the tendency for magneto-
optical heads to be heavy, resulted in early implementations that were
relatively slow. Nevertheless, magneto-optical disks can offer very
high capacity and cheap media as well as top archival properties,
often being rated with an average life of 30 years, which is far longer
than any magnetic media.
LET US SUMMARIZE
1. A magnetic disk is the most commonly used secondary storage medium. It
offers high storage capacity and reliability. Data is represented as
magnetized spots on a disk. A magnetized spot represents 1 and the
absence of a magnetized spot represents 0.
2. A magnetic disk consists of plate/platter, which is made up of metal or
glass material, and its surface is covered with magnetic material to store
data on its surface.
3. Disk surface of a platter is divided into imaginary tracks and sectors.
Tracks are concentric circles where the data is stored, and are numbered
from the outermost to the innermost ring, starting with zero. A sector is
just like an arc that forms an angle at the center. It is the smallest unit of
information that can be transferred to/from the disk.
4. A disk contains one read/write head for each surface of a platter, which is
used to store and retrieve data from the surface of the platter. All the
heads are attached to a single assembly called a disk arm.
5. Transfer of data between the memory and the disk drive is handled by a
disk controller, which interfaces the disk drive to the computer system.
Some common interfaces used for disk drives on personal computers and
workstations are SCSI (small-computer-system-interface; pronounced
“scuzzy”), ATA (AT attachment) and SATA (serial ATA).
6. The process of accessing data comprises three steps, namely, seek,
rotate, and data transfer. The combined time (seek time, latency time, and
data transfer time) is known as the access time of the disk. Specifically, it
can be described as the period of time that elapses between a request for
information from the disk or memory and the information arriving at the
requesting device.
7. Reliability of the disk is measured in terms of the mean time to failure
(MTTF). It is the time period for which the system can run continuously
without any failure.
8. Several algorithms have been developed for disk scheduling, which are
first-come, first served (FCFS), shortest seek time first (SSTF), SCAN,
LOOK, C-SCAN and C-LOOK algorithms.
9. Before the disk can be used for storing data, all its platters must be
divided into sectors (that disk controller can read and write) using some
software. This process is called low-level (or physical) formatting, which is
usually performed by the manufacturer.
10. After physical formatting, logical (or high-level) formatting is to be
performed for each partition of the disk. During logical formatting, the
operating system stores initial file-system data structures and a boot block
on the disk. After logical formatting, the disk can be used to boot the
system and store the data.
11. Due to manufacturing defects, some sectors of a disk drive may
malfunction during low-level formatting. Some sectors may also become
bad during read or write operations with the disk due to head crash.
12. There are several ways of handling bad sectors. On some simple disks,
bad sectors need to be handled manually by using, for instance, format
command or chkdsk command of MS-DOS. However, in modern disks
with advanced disk controller, other schemes including sector sparing and
sector slipping can be used.
13. Swap-space is used in different ways by different operating systems
depending upon the memory management algorithms. The amount of disk
space required to serve as swap-space may vary from a few megabytes
to the level of gigabytes.
14. A major advancement in secondary storage technology is represented by
the development of RAID (Redundant Arrays of Independent Disks). The
basic idea behind the RAID is to have a large array of small independent
disks. The presence of multiple disks in the system improves the overall
transfer rates, if the disks are operated in parallel.
15. In order to improve the performance of a disk, a concept called data
striping is used which utilizes parallelism. Data striping distributes the data
transparently among N disks, which makes them appear as a single large, fast disk.
16. Several kinds of RAID organization, referred to as RAID levels, have been
proposed which aim at providing redundancy at low cost. These levels
have different cost–performance trade-offs. The RAID levels are classified
into seven levels (from level 0 to level 6).
17. The disk of a computer system contains bulk of data which can be
accessed by the system either directly through I/O ports (host-attached
storage) or through a remote system connected via a network (network-
attached storage).
18. Ideally, a disk should always work without producing any errors. However,
practically, this cannot be achieved. The only achievable thing is a disk subsystem called stable storage, which ensures that whenever a write is performed to the disk, it is performed either completely or not at all.
19. Tertiary storage, also known as tertiary memory, is built from inexpensive
disks and tape drives that use removable media. Due to relatively low
speeds of tertiary storage systems, they are primarily used for storing
data that is to be accessed less frequently. Some examples of tertiary
storage devices include floppy disk, optical disk, magneto-optical disk,
and magnetic tape.
EXERCISES
Descriptive Questions
1. Give hardware description and various features of a magnetic disk. How
do you measure its performance?
2. How does LOOK algorithm differ from SCAN algorithm?
3. In which ways can the swap-space be used by the operating system?
4. Explain why SSTF scheduling tends to favour middle cylinders over the
innermost and outermost cylinders.
5. Consider a disk drive having 200 cylinders, numbered from 0 to 199. The
head is currently positioned at cylinder 53 and moving toward the cylinder
199. The queue of pending I/O requests is: 98, 183, 37, 122, 14, 124, 65,
67.
Starting from the current head position, what is the total head
movement (in cylinders) to service the pending requests for each
of the following disk-scheduling algorithms?
(a) FCFS
(b) SSTF
(c) SCAN
(d) LOOK
(e) C-SCAN
(f) C-LOOK
6. Compare and contrast the sector sparing and sector slipping techniques
for managing bad sectors.
7. Define the following:
(a) Disk latency
(b) Seek time
(c) Head crash
(d) MTTF
8. Define RAID. What is the need of having RAID technology?
9. How can the reliability and performance of disk be improved using RAID?
Explain different RAID levels.
10. How does stable storage ensure consistency on the disk during a failure?
11. Write short notes on the following:
(a) Tertiary storage
(b) Swap-space management
(c) Disk formatting
(d) Disk attachment
chapter 11
File Systems
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the concept of file.
⟡ Discuss the aspects related to files such as file attributes, operations,
structures, access, and so on.
⟡ Discuss various types of directory structures.
⟡ Explain file-system mounting and unmounting.
⟡ Discuss the concept of record blocking.
⟡ Understand the concept of file sharing, and issues related to it.
⟡ Explain the protection mechanisms required for protecting files in
multi-user environment.
11.1 INTRODUCTION
Computer applications require large amounts of data to be stored, in
a way that it can be used as and when required. For this, secondary
storage devices such as magnetic disks, magnetic tapes and optical
discs are used. The storage of data on the secondary storage
devices makes the data persistent, that is, the data is permanently
stored and can survive system failures and reboots. In addition, a
user can access the data on these devices as per his/her
requirement.
The data on the disks are stored in the form of files. To store and
retrieve files on the disk, the operating system provides a mechanism
called file system, which is primarily responsible for the management
and organization of various files in a system. The file system consists
of two main parts, namely, a collection of files and a directory
structure. The directory structure is responsible for providing
information about all the files in the system. In this chapter, we will
discuss various aspects related to the file system.
Sequential Access
When the information in the file is accessed in the order of one record
after the other, it is called sequential access. It is the easiest file
access method. Compilers, multimedia applications, sound files and
editors are the most common examples of the programs using
sequential access.
The most frequent and common operations performed on a file
are read and write. In the case of read operation, the record at the
location pointed by the file pointer is read and the file pointer is then
advanced to the next record. Similarly, in the case of write operation,
the record is written to the end of the file and the pointer is advanced
to the end of new record.
Direct Access
With the advent of disks as a storage medium, large amounts of data
can be stored on them. Sequential access of this data would be a very lengthy and slow process. To overcome this problem, the data on the disk is stored as blocks of data with index numbers, which help to read and write data on the disk in any order (known as random or direct access).
Under direct access, a file is viewed as a sequence of blocks (or
records) which are numbered. The records of a file can be read or
written in any order using this number. For instance, it is possible to
read block 20, then write block 4, and then read block 13. The block
number is a number given by the user, relative to the beginning of the file. This relative number is internally mapped to an actual absolute disk address by the file system. For example, record number 10 may have the actual address 12546 while block number 11 may have the actual address 3450. The user gives the relative
block number for accessing the data without knowing the actual disk
address. Depending on the system, this relative number for a file
starts with either 0 or 1.
In direct access, the system calls for read and write operations
are modified to include the block number as a parameter. For
instance, to perform the read or write operation on a file, the user
gives read n or write n (n is the block number) rather than read next
or write next system calls used in sequential access.
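On POSIX systems, for instance, a relative block number can be mapped to an absolute byte offset with lseek(), as in the brief C sketch below; the block size is an assumption.

#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512               /* assumed block size */

/* Read relative block n: map the block number to an absolute
   byte offset and read one block from there. */
ssize_t read_block(int fd, long n, char *buf)
{
    if (lseek(fd, (off_t)n * BLOCK_SIZE, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, BLOCK_SIZE);
}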
Most applications with large databases require direct access
method for immediate access to large amounts of information. For
example, in a railway reservation system, if a customer requests to
check the status for reservation of the ticket, the system must be able
to access the record of that customer directly without having the need
to access all other customers’ records.
Note that for accessing the files, an operating system may support
either sequential access or direct access, or both. Some systems
require a file to be defined as sequential or direct when it is created so that it can be accessed in the way it is declared.
11.3 DIRECTORIES
As stated earlier, a computer stores numerous data on the disk. To
manage this data, the disk is divided into one or more partitions (also
known as volumes) and each partition contains information about the
files stored in it. This information is stored in a directory (also known
as device directory). In simplest terms, a directory is a flat file that
stores information about files and subdirectories.
In this section, we will discuss some most commonly used
directory structures, including single-level, two-level, and hierarchical
directory as well as various operations that can be performed on
directories.
The main drawback of this system is that no two files can have
the same name. For instance, if one user (say, jojo) creates a file
with name file1 and then another user (say, abc) also creates a file
with the same name, the file created by the user abc will overwrite the
file created by the user jojo. Thus, all the files must have unique
names in a single-level directory structure. With the increase in the
number of files and users on a system, it becomes very difficult to
have unique names for all the files.
In some situations, a user might need to access files other than its
own files. One such situation might occur with system files. The user
might want to use system programs like compilers, assemblers,
loaders, or other utility programs. In such a case, to copy all the files
in every user directory would require a lot of space and thus, would
not be feasible. One possible solution to this is to make a special user
directory and copy system files into it. Now, whenever a filename is
given, it is first searched in the local UFD. If not found then the file is
searched in the special user directory that contains system files.
11.3.3 Hierarchical Directory System
The hierarchical directory, also known as tree of directory or tree-
structured directory, allows users to have subdirectories under their
directories, thus making the file system more logical and organized
for the user. For instance, a user may have directory furniture, which
stores files related to the types of furniture, say wooden, steel, cane,
etc. Further, he may want to define subdirectories which state the kinds of furniture available under each type, say sofa, bed, table, chair, etc.
Under this system, the user has the flexibility to define, group and
organize directories and subdirectories according to his requirements.
Pathnames
Under hierarchical directory system, a user can access files of other
users in addition to its own files. To access such files, the user needs
to specify either the absolute path name or the relative path name.
The absolute path name begins at the root and follows a path down
to the specified file, whereas the relative path name defines a path
from the current working directory. For instance, to access a file under
directory D1, using absolute path name, the user will give the path
\\bin\D8\D1\filename. On the other hand, if the user’s current
working directory is \\bin\D8, the relative path name will be
D1\filename.
11.7 PROTECTION
The information stored in a system needs to be protected from physical damage and unauthorized access. A file system can be
damaged due to various reasons, such as a system breakdown, theft,
fire, lightning or any other extreme condition that is unavoidable and
uncertain. It is very difficult to restore the data back in such
conditions. In some cases, when the physical damage is irreversible,
the data can be lost permanently. Though physical damage to a
system is unavoidable, measures can be taken to safeguard and
protect the data.
In a single-user system, protection can be provided by copying the information stored on the disk either to the disk itself or to some other removable storage medium, such as magnetic tapes and compact discs. If the original data on the disk is erased or overwritten
accidentally, or becomes inaccessible because of its malfunctioning,
the backup copy can be used to restore the lost or damaged data.
Apart from protecting the files from physical damage, the files in a
system also need a protection mechanism to control improper
access.
Password
A password can be assigned to each file and only the users
authorized to access the file are given the password. This scheme
protects the file from unauthorized access. The main drawback of this approach is the large number of passwords (one for each file), which are practically very difficult to remember. On the other hand, if only one password is used for accessing all the files, then once the password is known, all the files become accessible. To balance the number of passwords in a system, some systems follow a scheme where a user can associate a password with a subdirectory. This scheme allows a user to access all the files under a subdirectory with a single password. However, even this scheme is not completely safe. To overcome the drawbacks of these schemes, protection must be provided at a more detailed level by using multiple passwords.
LET US SUMMARIZE
1. Computer applications require large amounts of data to be stored, in a
way that it can be used as and when required.
2. Storing data on the secondary storage devices makes the data persistent,
that is, the data is permanently stored and can survive system failures
and reboots.
3. The data on the disks are stored in the form of files. To store and retrieve
files on the disk, the operating system provides a mechanism called the
file system, which is primarily responsible for the management and
organization of various files in a system.
4. A file is a collection of related data stored as a named unit on the
secondary storage.
5. Each file is associated with some attributes such as its name, size, type,
location, date and time, etc. These are known as file attributes. This
information helps the file system to manage a file within the system.
6. File operations are the functions that can be performed on a file. The
operating system handles the file operations through the use of system
calls.
7. Various operations that can be performed on a file are: create, write, read,
seek, delete, open, append, rename and close.
8. The most common technique to implement a file type is by providing
extension to a file. The file name is divided into two parts, with the two
parts separated by a period (‘.’) symbol, where the first part is the name
and the second part after the period is the file extension.
9. Another way to implement the file type is the use of magic number. A
magic number is a sequence of bits, placed at the beginning of a file to
indicate roughly the type of the file.
10. File structure refers to the internal structure of a file, that is, how a file is
internally stored in the system.
11. The most common file structures recognized and used by different
operating systems are byte sequence, record sequence and tree
structure.
12. The information stored in a file can be accessed in one of the two ways:
sequential access, or direct access.
13. When the information in the file is accessed in the order of one record after
the other, it is called sequential access.
14. When a file is viewed as a sequence of blocks (or records) which are
numbered and can be read or written in any order using this number, it is
called direct access.
15. To manage the data on the disk, the disk is divided into one or more
partitions (also known as volumes) where each partition contains the
information about the files stored in it. This information is stored in a
directory (also known as device directory).
16. Various schemes to define the structure of a directory are: single-level
directory, two-level directory and hierarchical directory.
17. Single-level directory is the simplest directory structure. There is only one
directory that holds all the files. Sometimes, this directory is referred to as
root directory.
18. In a two-level directory structure, a separate directory known as user file
directory (UFD) is created for each user. Whenever a new UFD is created,
an entry is added to the master file directory (MFD) which is at the highest
level in this structure.
19. The hierarchical directory, also known as tree of directory or tree-
structured directory, allows users to have subdirectories under their
directories, thus making the file system more logical and organized for the
user.
20. Mounting a file system means attaching the file system to the directory
structure of the system. The effect of mounting lasts until the file system is
unmounted. Unmounting a file system means detaching a file system from
the system’s directory structure.
21. Whenever a user or an application performs an operation on a file, it is
performed at the record level, whereas I/O is performed on a block basis. Thus, for performing I/O, the records must be organized as blocks.
22. Three methods of record blocking are used depending on the size of the
block, namely, fixed blocking, variable-length spanned blocking, and
variable-length unspanned blocking.
23. File sharing allows a number of people to access the same file
simultaneously. File sharing can be viewed as part of the file systems and
their management.
24. There are mainly two ways in which files can be shared among multiple
users. First, the system by default allows the users to share the files of
other users, and second, the owner of a file explicitly grants access rights
to other users.
25. The owner is the user who has the most control over the file or the
directory. He or she can perform all the operations on the file. The other
users to whom the owner grants access to his or her file are termed as
group members.
26. Remote file systems allow a computer to mount one or more file systems
from one or more remote machines. Thus, in a networked environment,
where file sharing is possible between remote systems, more
sophisticated file sharing methods are needed.
27. Characterization of the system that specifies the semantics of multiple
users accessing a shared file simultaneously is known as consistency
semantics. These semantics specify when the modifications done by one
user should be made visible to the other users accessing the file.
28. In a single-user system or in a system where users are not allowed to
access the files of other users, there is no need for a protection
mechanism. However, in a multi-user system where some user can
access files of other users, the system is prone to improper access, and
hence a protection mechanism is mandatory.
29. To protect the files from improper accesses, the access control mechanism
can follow either of the two approaches: password and access control list.
30. A password can be assigned to each file and only a user knowing the
password can access the file.
31. An access-control list (ACL) is associated with each file and directory,
which stores user names and the type of access allowed to each user.
When a user tries to access a file, the ACL is searched for that particular
file. If that user is listed for the requested access, the access is allowed.
Otherwise, access to the file is denied.
EXERCISES
Fill in the Blanks
1. To store and retrieve files on the disk, the operating system provides a
mechanism called _____________.
2. The additional information that helps the operating system to manage a
file within the file system is called _____________.
3. The data on the disk is kept as blocks of data with an _____________ to
access data directly in a random order.
4. When the information in the file is accessed in the order of one record
after the other, it is called _____________.
5. _____________ is a file system designed for distributed computing
environment.
Descriptive Questions
1. Explain the need for storing data on secondary storage devices.
2. Which system supports double extensions to a file name?
3. What is the difference between absolute path name and relative path
name?
4. Define the role of a file system in organizing and managing different files
in a system.
5. “The operating system gives a logical view of the data to its user”. Justify
this statement.
6. When a user double clicks on a file listed in Windows Explorer, a program
is run and given that file as a parameter. List two different ways in which the operating system could know which program to run.
7. Some systems simply associate a stream of bytes as a structure for a
file’s data, while others associate many types of structures for it. What are
the related advantages and disadvantages of each system?
8. A program has just read the seventh record; it next wants to read the
fifteenth record. How many records must the program read before reading
the fifteenth record?
(a) with direct access
(b) with sequential access
9. Give an example of an application in which data in a file is accessed in the
following order:
(a) Sequentially
(b) Randomly
10. What do you mean by file-system mounting? How is it performed?
11. What is record blocking? Discuss the three methods of blocking.
12. Explain the relative merits and demerits of using the hierarchical directory structure over single-level and two-level directory structures.
13. Discuss how file sharing can be implemented in a multi-user environment
where:
(a) a single file system is used.
(b) multiple file systems are used.
14. Write short notes on the following.
(a) Path name
(b) Magic number
(c) Consistency semantics
(d) Access control list
chapter 12
File System Implementation
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the file system structure.
⟡ Discuss the basic concepts of file system implementation.
⟡ Describe various methods to allocate disk space to files.
⟡ Explain the directory implementation.
⟡ Explore the methods to keep track of free space on the disk.
⟡ Discuss the implementation of shared files.
⟡ Identify the issues related to file system efficiency and performance.
⟡ Understand how to ensure data consistency and recovery in the
event of system failures.
⟡ Understand the concept of log-structured file system.
12.1 INTRODUCTION
In the previous chapter, we have discussed the basic file concepts,
such as how files are named, what operations are allowed on files,
what the directory tree looks like, and other similar issues which help
users to understand the file system. In this chapter, we will discuss
various issues related to file system implementation in which the file
system designers are interested. This involves how files and
directories are implemented and stored, how the disk space is
managed, and how the file system can be made efficient and reliable.
When a process needs to create a new file, it calls the logical file
system. In response, the logical file system allocates a new FCB (or,
in systems where all FCBs are created at the time of file system
creation, takes one from the list of free FCBs). After allocating the
FCB, the next step is to add the new file name and FCB into the
appropriate directory. For this, the system loads the desired directory
into the memory, updates it with the required information and finally,
writes it back onto the disk.
After a file has been created, I/O operations can be performed on
it. However, before a process can perform I/O on the file, the file
needs to be opened. For this, the process executes the open() system
call, which passes the file name to the logical file system. It may so
happen that the given file is already open and is in use by some other
process. To determine this, the given file name is searched in the
system-wide file-open table first. If the file name is found in the table,
an entry is made in the per-process open-file table which points to the
existing system-wide open-file table entry. On the other hand, if the
file name is not found in the system-wide open-file table (that is, the
file is not already open), the file name is searched for in the directory
structure. When the file is found, its FCB is copied into the system-
wide open-file table, and the count is incremented. The value of the
count indicates the number of users who have opened the file
currently. Figure 12.3 shows the in-memory file-system structures
while opening a file.
After updating the system-wide open-file table, an entry is made in
the per-process open-file table. This entry includes a pointer to the
appropriate entry in the system-wide open-file table, a pointer to the
position in the file where the next read or write will occur, and the
mode in which the file is open. The open() call returns a pointer to the
appropriate entry in the per-process file-system table. This pointer is
used to perform all the operations as long as the file is open.
When a process closes the file, the corresponding entry is
removed from the per-process open-file table and the system-wide
entry’s open count is decremented. When the count becomes 0, meaning
that all the users who opened the file have closed it, the updated
file information is copied back to the disk-based structures and the
entry is removed from the system-wide open-file table.
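To make the interplay between the two tables concrete, here is a minimal sketch in C. The structure layouts, field names and table sizes are illustrative assumptions, not those of any particular operating system.

```c
/* A minimal sketch of the system-wide and per-process open-file tables
 * described above; all names and sizes are hypothetical. */
#include <string.h>

#define MAX_OPEN 64

struct fcb { char name[32]; /* ownership, permissions, block pointers ... */ };

struct sys_open_entry {          /* one per open file, system-wide */
    struct fcb fcb;              /* copy of the on-disk FCB */
    int open_count;              /* number of processes that have it open */
};

struct proc_open_entry {         /* one per open file, per process */
    struct sys_open_entry *sys;  /* pointer into the system-wide table */
    long offset;                 /* position of the next read/write */
    int mode;                    /* mode in which the file was opened */
};

struct sys_open_entry sys_table[MAX_OPEN];
struct proc_open_entry proc_table[MAX_OPEN];

/* open: reuse the system-wide entry if the file is already open,
 * otherwise load its FCB from the directory structure (not shown). */
struct proc_open_entry *file_open(const char *name, int mode) {
    struct sys_open_entry *s = NULL;
    for (int i = 0; i < MAX_OPEN; i++)
        if (sys_table[i].open_count > 0 &&
            strcmp(sys_table[i].fcb.name, name) == 0) {
            s = &sys_table[i];           /* file already open somewhere */
            break;
        }
    if (!s) {
        for (int i = 0; i < MAX_OPEN; i++)
            if (sys_table[i].open_count == 0) { s = &sys_table[i]; break; }
        if (!s) return NULL;             /* system-wide table full */
        /* ... search the directory structure, copy the FCB into s->fcb ... */
        strncpy(s->fcb.name, name, sizeof s->fcb.name - 1);
    }
    s->open_count++;                     /* one more user of this file */
    for (int i = 0; i < MAX_OPEN; i++)
        if (proc_table[i].sys == NULL) {
            proc_table[i] = (struct proc_open_entry){ s, 0, mode };
            return &proc_table[i];       /* acts as the "file descriptor" */
        }
    s->open_count--;                     /* per-process table full: undo */
    return NULL;
}
```

Closing a file reverses these steps: the per-process entry is cleared and the open count is decremented, with the FCB written back to disk when the count reaches zero.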
• The second layer is the VFS layer, which resides between the file-
system interface layer and the actual file systems. The VFS layer
is responsible for performing the following two functions.
■ It defines an interface, called the VFS interface, which separates
the generic operations of the file system from their
implementation and thus allows different file systems
mounted locally to be accessed in a transparent manner.
■ It also provides a mechanism to identify a file uniquely across
the network rather than only within a single file system. For
this, it assigns to each file (or directory) a vnode, a
numerical designator that uniquely identifies the file across
the network. For each active node (that is, open file or
directory), the kernel maintains a vnode data structure.
In a nutshell, the VFS separates local files from remote files,
and local files are further distinguished by the type of their file
system. When a request arrives for a local file, the VFS handles it
by activating operations specific to the respective local file
system; for remote requests, it invokes the procedures of the NFS
protocol.
• The third layer of architecture is the layer which actually
implements the file system type or the remote file-system
protocol.
Fig. 12.4 Architecture of the File System Implementation
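The separation the VFS interface achieves is typically realized with a table of function pointers, one implementation per file-system type. The C sketch below is a hypothetical illustration of this idea; the names vfs_ops, vnode_id and vfs_read are ours, not those of any real kernel.

```c
/* Hypothetical sketch: the VFS dispatches generic operations to the
 * concrete file system through a table of function pointers. */
struct vnode;                          /* network-wide unique file handle */

struct vfs_ops {                       /* the "VFS interface" */
    int (*read)(struct vnode *v, void *buf, int n);
    int (*write)(struct vnode *v, const void *buf, int n);
};

struct vnode {
    unsigned long vnode_id;            /* numerical, unique across the network */
    const struct vfs_ops *ops;         /* e.g. a local FS or an NFS client */
};

/* Generic layer: the same call works regardless of where the file lives. */
int vfs_read(struct vnode *v, void *buf, int n) {
    return v->ops->read(v, buf, n);    /* local FS routine or NFS procedure */
}
```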
This figure shows the linked list allocation for a file. A total of four
disk blocks are allocated to the file. The directory entry indicates that
the file starts at block 12. It then continues at block 9, block 2, and
finally ends at block 5.
The simplicity and straightforwardness of this method make it
easy to implement. The linked list allocation results in the optimum
utilization of disk space, as even a single free block between used
blocks can be linked and allocated to a file. This method does not
suffer from external fragmentation; thus, compaction is never
required.
The main disadvantages of using linked list allocation are slow
access speed, disk space consumed by pointers, and low reliability of
the system. As this method provides only sequential access to files,
finding the nth block of a file requires starting at the beginning of the
file and following the pointers until the nth block is reached. For a very
large file, the average turnaround time is therefore high.
In linked list allocation, maintaining pointers in each block requires
some disk space. The total disk space required by all the pointers in a
file becomes substantial, thus the requirement of space by each file
increases. The space required by pointers could otherwise be used to
store the information. To overcome this problem, contiguous blocks
are grouped together as a cluster, and allocation to files takes place
as clusters rather than blocks. Clusters allocated to a file are then
linked together. Having a pointer per cluster rather than per block
reduces the total space needed by all the pointers. This approach
also improves the disk throughput as fewer disk seeks are required.
However, this approach may increase internal fragmentation because
having a partially full cluster wastes more space than having a
partially full block.
The linked list allocation is also not very reliable. Since disk blocks
are linked together by pointers, a single damaged pointer may
prevent us from accessing the file blocks that follow the damaged
link. Some operating systems deal with this problem by creating
special files for storing redundant copies of pointers. One copy of the
file is placed in the main memory to provide faster access to disk
blocks. Other redundant pointer files help in safer recovery.
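The cost of sequential-only access can be seen in a short sketch. Assuming 512-byte blocks in which 4 bytes hold the pointer (so 508 bytes carry data) and a hypothetical disk_read_block() routine, reaching the nth block of a file requires n disk reads:

```c
#define END_OF_FILE (-1)

struct block { int next; char data[508]; };   /* 4-byte pointer per block */

extern void disk_read_block(int block_no, struct block *b);  /* hypothetical */

/* Returns the disk block number of the nth file block (0-based),
 * or END_OF_FILE if the file is shorter than n+1 blocks. */
int nth_block(int first_block, int n) {
    struct block b;
    int cur = first_block;
    while (n-- > 0 && cur != END_OF_FILE) {
        disk_read_block(cur, &b);      /* one disk access per hop */
        cur = b.next;
    }
    return cur;
}
```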
When the user sends a request to create a new file, the directory
is searched to check whether any other file has the same name or
not. If no other file has the same name, memory is allocated
and an entry for the file is added at the end of the directory.
To delete a file, the directory is searched for the file name and if the
file is found, the space allocated to it is released. The delete
operation results in free space that can be reused. To reuse this
space, it can be marked with a used-unused bit, a special name can
be assigned to it, such as all-zeros, or it can be linked to a list of free
directory entries.
When performing the file operations, the directory is searched for
a particular file. The search technique applied greatly influences the
time taken to make the search and in turn the performance and
efficiency of the file system. As discussed, with long directories, a
linear search becomes very slow and takes O(n) comparisons to
locate a given entry, where n is the number of all entries in a
directory. To decrease the search time, the list can be sorted and a
binary search can be applied. Applying a binary search reduces the
average search time, but keeping the list sorted is difficult and
time-consuming, as directory entries have to be moved with every
creation and deletion of a file.
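The two search strategies can be contrasted with a small sketch; the directory-entry layout below is an assumption made only for illustration.

```c
#include <string.h>

struct dir_entry { char name[32]; int fcb_index; };

/* Linear search: O(n) comparisons, but no ordering requirement. */
int dir_find_linear(const struct dir_entry *dir, int n, const char *name) {
    for (int i = 0; i < n; i++)
        if (strcmp(dir[i].name, name) == 0)
            return i;
    return -1;
}

/* Binary search: O(log n) comparisons, but the directory must be kept
 * sorted by name on every file creation and deletion. */
int dir_find_binary(const struct dir_entry *dir, int n, const char *name) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        int cmp = strcmp(name, dir[mid].name);
        if (cmp == 0) return mid;
        if (cmp < 0) hi = mid - 1; else lo = mid + 1;
    }
    return -1;
}
```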
Bit Vector
Bit vector, also known as bit map, is widely used to keep track of the
free blocks on a disk. To track all the free and used blocks on a disk
with total n blocks, a bit map having n bits is required. Each bit in a bit
map represents a disk block, where a 0 in a bit represents an
allocated block and a 1 in a bit represents a free block. Figure 12.11
shows the bit map representation of a disk.
Fig. 12.11 A Bit Map
The bit map method for managing the free-space list is simple.
For instance, if a file requires four free blocks using contiguous
allocation method, free blocks 12, 13, 14, and 15 (the first four free
blocks on the disk that are adjacent to each other) may be allocated.
However, for the same file using linked or indexed allocation, the file
system may use free blocks 2, 4, 6, and 8 for allocation to the file.
The bit map is usually kept in the main memory to optimize the
search for free blocks. However, for systems with larger disks,
keeping the complete bit map in the main memory becomes difficult.
For a 2 GB disk with 512-byte blocks, a bit map of 512 KB would be
needed.
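Scanning a bit map for a free block is straightforward, which is one reason the method is popular. A minimal sketch, following the convention above that a 1 bit marks a free block:

```c
/* Find the first free block in a bit map where 1 = free, 0 = allocated. */
int first_free_block(const unsigned char *bitmap, int n_blocks) {
    for (int i = 0; i < n_blocks; i++) {
        /* bit i of the map corresponds to disk block i */
        if (bitmap[i / 8] & (1u << (i % 8)))
            return i;                  /* bit is 1: block i is free */
    }
    return -1;                         /* disk full */
}
```

A practical implementation would first skip whole bytes (or words) equal to zero, examining individual bits only in the first non-zero byte.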
Linked List
The linked list method creates a linked list of all the free blocks on the
disk. A pointer to the first free block is kept in a special location on the
disk and is cached in the memory. This first block contains a pointer
to the next free block, which contains a pointer to the next free block,
and so on. Figure 12.12 shows the linked list implementation of free
blocks, where block 2 is the first free block on the disk, which points
to block 4, which points to block 5, which points to block 8, which
points to block 9, and so on.
Linked list implementation for managing free-space list requires
additional space. This is because a single entry in linked list requires
more disk space to store a pointer as compared to one bit in bit map
method. In addition, traversing the free-list requires substantial I/O
operations as we have to read each and every block, which takes a
lot of time.
Grouping
Grouping is a modification of the free-list approach: instead of
keeping in each free block a pointer to the next free block, the first
free block stores the addresses of the first n free blocks. The first
n-1 of these blocks are actually free; the nth block contains the
addresses of the next n free blocks, and so on. A major advantage of this
approach is that the addresses of many free disk blocks can be found
with only one disk access.
Counting
When contiguous or clustering approach is used, creation or deletion
of a file allocates or de-allocates multiple contiguous blocks.
Therefore, instead of having addresses of all the free blocks, as in
grouping, we can have a pointer to the first free block and a count of
contiguous free blocks that follow the first free block. With this
approach, the size of each entry in the free-space list increases
because an entry now consists of a disk address and a count, rather
than just a disk address. However, the overall list will be shorter, as
the count is usually greater than 1.
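As a sketch, a counting-based free list is simply a sequence of (first block, count) pairs; the structure and the sample values below are illustrative.

```c
/* Counting representation: each entry records a run of contiguous
 * free blocks as a starting address and a count. */
struct free_run { int first; int count; };

/* Example: blocks 2-5 and 9-11 are free on the disk. */
struct free_run free_list[] = { { 2, 4 }, { 9, 3 } };
```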
12.8.1 Efficiency
The optimum utilization of disk space to store the data in an
organized manner defines the efficiency of a file system. A careful
selection of the disk-allocation and directory-management algorithms
is most important to improve the efficiency of a disk.
Certain operating systems make use of clustering (discussed in
Section 12.4.2) to improve their file-system performance. The size of
clusters depends on the file size. For large files, large clusters are
used, and for small files, small clusters are used. This reduces the
internal fragmentation that otherwise occurs when normal clustering
takes place.
The amount and nature of information kept in the file’s directory
influences the efficiency of the file system. A file’s directory that
stores detailed information about a file is informative but at the same
time it requires more read/write on disks for keeping the information
up to date. Therefore, while designing the file system, due
consideration must be given to the data that should be kept in the
directory.
Another consideration that must be kept in mind while designing the
file system is the size of the pointers used to access data
from files. Most systems use either 16-bit or 32-bit pointers. These
pointer sizes limit file sizes to 2¹⁶ bytes (64 KB) or 2³² bytes (4
GB), respectively. A system that requires larger files can implement
64-bit pointers, which support files of 2⁶⁴ bytes. However,
the greater the size of the pointer, the more disk space is required
to store it. This in turn makes the allocation and free-space
management structures (linked lists, indexes, and so on) use up more
disk space.
For better efficiency and performance of a system, various factors,
such as pointer size, length of directory entry, and table size need to
be considered while designing an operating system.
12.8.2 Performance
The system’s read and write operations with memory are much faster
as compared to the read and write operations with the disk. To reduce
this time difference between disk and memory access, various disk
optimization techniques, such as caching, free-behind and read-
ahead are used.
To reduce disk accesses and improve system
performance, blocks of data from secondary storage are selectively
brought into the main memory (or cache memory) for faster
access. This is termed caching of disk data.
When a user sends a read request, the file system searches the
cache to locate the required block. If the block is found the request is
satisfied without the need for accessing the disk. However, if the
block is not in the cache, it is first brought into the cache, and then
copied to the process that requires it. All the successive requests for
the same block can then be satisfied from the cache.
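A minimal read-through cache sketch makes this read path concrete. The code below uses a hypothetical direct-mapped in-memory cache and a stand-in disk_fetch() routine; a real block cache would use hashing together with a replacement policy such as LRU.

```c
#define CACHE_SLOTS 128
#define BLOCK_SIZE  512

struct cached_block { int block_no; int valid; char data[BLOCK_SIZE]; };
static struct cached_block cache[CACHE_SLOTS];

extern void disk_fetch(int block_no, char *buf);   /* hypothetical driver call */

const char *read_block(int block_no) {
    struct cached_block *slot = &cache[block_no % CACHE_SLOTS];
    if (!slot->valid || slot->block_no != block_no) {   /* cache miss */
        disk_fetch(block_no, slot->data);               /* go to disk once */
        slot->block_no = block_no;
        slot->valid = 1;
    }
    return slot->data;     /* later requests hit without any disk I/O */
}
```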
When a request arrives to read a block from the disk, the disk
access is required to transfer the block from it to the main memory.
With the assumption that the block may be used again in future, it is
kept in a separate section of the main memory. This technique of
caching disk blocks in the memory is called block cache. In some
other systems, file data is cached as pages (using virtual-memory
techniques) rather than as file system oriented blocks. This technique
is called page cache. Caching the file data using virtual addresses is
more efficient as compared to caching through physical disk blocks.
Therefore, some systems use page cache to cache both process
pages and file data. This is called unified virtual memory.
Now, consider two alternatives to access a file from disk: memory
mapped I/O and standard system calls, such as read and write.
Without a unified buffer cache, the standard system calls have to go
through the buffer cache, whereas the memory mapped I/O has to
use two caches, the page cache and the buffer cache (see Figure
12.13). Memory mapped I/O requires double caching. First the disk
blocks are read from the file system into the buffer cache, and then
the contents in the buffer cache are transferred to the page cache.
This is because the virtual memory system cannot interface with the
buffer cache. Double caching has several disadvantages. First, it
wastes memory by storing a copy of the data in both caches.
Second, each time the data is updated in the page cache, the data in
the buffer cache must also be updated to keep the two caches
consistent. This extra movement of data within the memory results in
the wastage of CPU and I/O time.
Fig. 12.13 Input/Output without a Unified Buffer Cache
12.9 RECOVERY
As discussed earlier, a computer stores data in the form of files and
directories on the disk and in the main memory. This data is important
for the users who have created it and also for other users who are
using it or might use it in the future. However, a system failure (or
crash) may result in loss of data and in data inconsistency. This
section discusses how a system can be recovered to a previous
consistent state prior to its failure. Data recovery includes creating full
and incremental backups to restore the system to a previous working
state and checking data consistency using consistency checker.
Fig. 12.15 Full and Incremental Backups between two Distinct System States
Consistency Checking
Consider a situation where due to some reasons (such as power
failure or system crash) the system goes down abruptly. The system
is said to be in the inconsistent state when there is a difference
between the directory information and the actual data on the disk.
The main reason behind the system’s inconsistent state is the use
of main memory to store directory information. As soon as a file
operation occurs, the corresponding information in the directory is
updated in the main memory. However, directory information on the
disk does not necessarily get updated at the same time.
To overcome this problem, most systems use a special program
called a consistency checker, which runs at the time of system boot. It
compares the data in the directory structure with the data blocks on
the disk and tries to fix any inconsistency it finds.
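One classic check (in the spirit of the UNIX fsck utility) counts, for every disk block, how many times it appears in files and how many times it appears on the free list; each block should appear exactly once in total. A hypothetical sketch:

```c
#include <stdio.h>

#define N_BLOCKS 1024

int in_use[N_BLOCKS];   /* filled by walking the directory tree (not shown) */
int in_free[N_BLOCKS];  /* filled by walking the free list (not shown) */

void check_consistency(void) {
    for (int b = 0; b < N_BLOCKS; b++) {
        if (in_use[b] + in_free[b] == 0)
            printf("block %d missing: add it to the free list\n", b);
        else if (in_use[b] + in_free[b] > 1)
            printf("block %d inconsistent: used %d times, free %d times\n",
                   b, in_use[b], in_free[b]);
    }
}
```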
LET US SUMMARIZE
1. Every operating system imposes a file system that helps to organize,
manage and retrieve data on the disk.
2. The design of a file system involves two key issues. The first issue
involves defining a file and its attributes, operations that can be performed
on a file, and the directory structure for organizing files. The second issue
involves creating algorithms and data structures to map the logical file
system onto the physical secondary storage devices.
3. The file system is made up of different layers, where each layer
represents a level.
4. The various file system components are I/O controller, basic file system,
file-organization module and logical file system.
5. The file control block (FCB) stores the information about a file such as
ownership, permissions, and location of the file content.
6. There are several on-disk and in-memory structures that are used to
implement a file system. The on-disk structures include boot control block,
partition control block, directory structure and FCB. The in-memory
structures include in-memory partition table, in-memory directory
structure, system-wide open-file table and per-process open-file table.
7. In order to facilitate the processes and applications to interact with
different file systems at the same time, the operating system offers a
virtual file system (VFS), which is a software layer that hides the
implementation details of any single file type.
8. Every system stores multiple files on the same disk. Thus, an important
function of the file system is to manage the space on the disk. This
includes keeping track of the number of disk blocks allocated to files and
the free blocks available for allocation. Some widely used methods for
allocation of disk space to files (that is, file implementation) include
contiguous, linked and indexed.
9. In contiguous allocation, each file is allocated contiguous blocks on the
disk, that is, one after the other.
10. In the linked list allocation method, each file is stored as a linked list of the
disk blocks. The disk blocks are generally scattered throughout the disk,
and each disk block stores the address of the next block. The directory
entry contains the file name and the address of the first and the last
blocks of the file.
11. In indexed allocation, the blocks of a file are scattered all over the disk in
the same manner as they are in linked allocation. However, here the
pointers to the blocks are brought together at one location known as the
index block.
12. Each file has an index block, which is an array of disk-block pointers
(addresses). The kth entry in the index block points to the kth disk block of
the file.
13. The efficiency, performance, and reliability of a file system are directly
related to the directory-management and directory-allocation algorithms
selected for a file system. The most commonly used directory-
management algorithms are linear list and hash table.
14. The linear list method organizes a directory as a collection of fixed size
entries, where each entry contains a (fixed-length) file name, a fixed
structure to store the file attributes, and pointers to the data blocks.
15. A hash table is a data structure with table entries numbered 0 to n-1,
where n is the total number of entries in the table. It uses a hash function
to compute a hash value (a number between 0 and n-1) based on the file name.
16. In a scenario where two or more users want to work on the same files at
the same time, it would be convenient to store the common files in a
subdirectory and make this subdirectory appear in the directory of each
user. This implies that the shared file or subdirectory will be present in two
or more directories in the file system.
17. The file system maintains a free-space list that indicates the free blocks on
the disk. To create a file, the free-space list is searched for the required
amount of space, and the space is then allocated to the new file.
18. The various methods used to implement free-space list are bit vector,
linked list, grouping, and counting.
19. Optimum utilization of disk space to store the data in an organized manner
defines the efficiency of a file system. A careful selection of the disk-
allocation and directory-management algorithms is most important to
improve the efficiency of a disk.
20. System’s read and write operations with the memory are much faster as
compared to the read and write operations with the disk. To reduce this
time difference between disk and memory access, various disk
optimization techniques, such as caching, free-behind, and read-ahead
are used.
21. A system failure (or crash) may result in loss of data and in data
inconsistency. The data recovery includes creating full and incremental
backups to restore the system to a previous working state and checking
the data consistency using consistency checker.
22. The log-structured file system was introduced to reduce the movements of
the disk head while accessing the disk. It maintains a log file that contains
the metadata and data of all the files in the file system. Whenever the
data is modified or new data is written in a file, this new data is recorded
at the end of the log file.
EXERCISES
Fill in the Blanks
1. A _____________ stores all the information related to a file.
2. The widely used methods for allocating disk space to files are
_____________, _____________, _____________, and _____________.
3. _____________ is a data structure used along with the linear list of
directory entry that reduces the search time considerably.
4. To implement shared files, the tree-structured system is generalized to
form a _____________.
5. A _____________ stores addresses of all the blocks which are free for
allocation.
Descriptive Questions
1. Name the component of the file system that is responsible for transferring
information between the disk drive and the main memory.
2. Explain the role of each layer in a file system.
3. List the advantages of using linked list and indexed allocation methods
over linear list allocation method.
4. Explain the need for having a standard file-system structure attached to
various devices in a system.
5. Explain various on-disk and in-memory structures that are used for
implementing a file system.
6. Compare various schemes used for the management of free space.
7. Discuss the methods used for directory implementation and compare
them.
8. How does cache help in improving performance?
9. What is the difference between caching with and without the unified buffer
cache?
10. Explain how data on a system can be recovered to a previous working
state, without any data inconsistency, after a system failure.
11. Write short notes on the following.
(a) Shared files
(b) Virtual file system
(c) Hash table
12. How does the log-structured file system reduce disk head movements?
chapter 13
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the need for protection and security.
⟡ Describe the goals and principles of protection.
⟡ Explain the various protection mechanisms, including protection
domain and access control matrix.
⟡ Understand the problem of security.
⟡ Identify the types of security violations, methods used in attempting
security violations and various security measure levels.
⟡ Discuss design principles of security.
⟡ Determine various security threats—caused by humans, natural
calamities and by the use of networks.
⟡ Explore the encryption technique used to ensure security.
⟡ Explore different means to authenticate the user, including password,
smart card and biometric techniques.
⟡ Understand the concept of trusted system.
⟡ Describe the use of firewalls in protecting systems and networks.
13.1 INTRODUCTION
Nowadays, most of the organizations serving domains such as
banking, education, finance and telecommunication rely on the use of
computers for their day-to-day activities. These organizations store
huge amount of data in computers. Since the data is highly valuable,
it is important to protect it from unauthorized access. In addition to
data, the protection of computer resources, such as memory and I/O
devices is also necessary.
Since security of data and computer resources is a major
concern, many organizations restrict the entry of unauthorized
persons inside their premises. For this, they place security guards at
the entrance of the building to allow only authorized persons to enter.
Also, servers, network resources, and other sensitive areas (where
file cabinets are placed) are locked and only some authorized
persons are allowed to enter.
Though the terms security and protection are often used
interchangeably, they have different meanings in computer
environment. Security deals with the threats to information caused
by outsiders (non-users), whereas protection deals with the threats
caused by other users (those who are not authorized to do what they
are doing) of the system. This chapter discusses various aspects
related to protection and security of the system.
Domains as Objects
Access matrix can also be used for representing domain switching
among processes. This can be achieved by representing each
domain as an object in the access matrix, and switching among
domains is shown by adding a switch entry at the intersection of the
corresponding row and column. For example, Figure 13.3 shows a
modified access matrix, in which three columns have been added to
represent domains as objects. An entry switch in the row and column
intersection of domains D1 and D2 indicates that domain switching is
possible between them.
Copy Right
The copy right allows copying of an access right from one domain to
another. The access right can only be copied within the same column,
that is, for the same object for which the copy right is defined. The
copy right is denoted by an asterisk (*) appended to the access right.
For example, Figure 13.4 (a) shows that the process executing in
domain D1 has the ability to copy the execute operation into any entry
associated with object O3. Figure 13.4 (b) shows a modified version of
access matrix, where the access right execute* has been copied to
domain D3.
Fig. 13.4 Copy Right and its Variations
It is clear from Figure 13.4 (b) that D1 has propagated both the
access right as well as the copy right to D3. There exist two more
variants of this scheme.
• Limited copy: In this case, only the access right is copied (not the
copy right) from one domain to another. For example, in Figure
13.4 (c) only the access right is copied from D1 to D3—D3 cannot
further copy the access right.
• Transfer: In this case, the right is transferred (not copied) from
one domain to another, that is, it is removed from the original
domain. For example, in Figure 13.4 (d) the access right is
transferred from D1 to D3. Note that it is removed from the domain
D1.
Owner Right
The owner right allows a process to add new rights and remove the
existing rights within the same column for which the owner right is
defined. For example, in Figure 13.5 (a), domain D1 is the owner of
object O1, hence it can grant and revoke the access rights to and from
the other domains for the object O1 [as shown in Figure 13.5 (b)].
Control Right
Control right is applicable to only domain objects. It allows a process
executing in one domain to modify other domains. For example, in
Figure 13.6, a process operating in domain D2 has the right to control
any of the rights in domain D3.
Fig. 13.6 Access Control Matrix with Control Right
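These special rights fit naturally into a bit-mask representation of the access matrix. The following C sketch is illustrative only; the names and the way the copy right is enforced are our assumptions, not a standard API.

```c
/* Access matrix as a 2-D array of right bit masks; the special rights
 * discussed above are simply additional bits. */
enum right {
    R_READ    = 1 << 0,
    R_WRITE   = 1 << 1,
    R_EXECUTE = 1 << 2,
    R_COPY    = 1 << 3,   /* may propagate a right within its column */
    R_OWNER   = 1 << 4,   /* may add/remove rights within its column */
    R_SWITCH  = 1 << 5,   /* for domain objects: may switch to that domain */
    R_CONTROL = 1 << 6    /* for domain objects: may modify that row */
};

#define N_DOMAINS 3
#define N_OBJECTS 6           /* ordinary objects plus the domain objects */

unsigned matrix[N_DOMAINS][N_OBJECTS];

int allowed(int domain, int object, unsigned right) {
    return (matrix[domain][object] & right) != 0;
}

/* A process in domain d1 may copy a right it holds with R_COPY to
 * another domain d2, within the same column (the same object). */
int copy_right(int d1, int d2, int object, unsigned right) {
    if (!allowed(d1, object, right) || !allowed(d1, object, R_COPY))
        return -1;                    /* no such right, or no copy right */
    matrix[d2][object] |= right;      /* limited copy: the right only */
    /* a full copy would also OR in R_COPY; a transfer would in addition
     * clear the right from matrix[d1][object] */
    return 0;
}
```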
Global Table
In this technique, a single (global) table is created for all domains and
objects. The table comprises a set of ordered triples <domain,
object, access-rights-set>. That is, each entry in the table
represents the access rights of a process executing in a specific
domain on a specific object. Whenever a process in domain Di needs
to perform an operation X on an object Oj, then the global table is
searched for a triple <Di, Oj, Rk>, where X ∈ Rk. If a matching entry is
found, the process is allowed to perform the desired operation;
otherwise an exception is raised and the access is denied. This is the
easiest method for implementing the access control matrix. However,
it suffers from some drawbacks. First, it is large in size, thus it cannot
be kept in the main memory. For this reason, an additional I/O is
required to access it from the secondary storage. Secondly, it does
not support any kind of groupings of objects or domains, thus, if any
access right (say, read operation) is applicable to several domains, a
separate entry must be stored for each domain.
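A sketch of the global-table lookup, with hypothetical structure names:

```c
/* Global-table technique: a flat list of <domain, object, rights-set>
 * triples searched on every access request. */
struct triple { int domain; int object; unsigned rights; };

int access_ok(const struct triple *table, int n,
              int domain, int object, unsigned wanted) {
    for (int i = 0; i < n; i++)
        if (table[i].domain == domain && table[i].object == object)
            return (table[i].rights & wanted) == wanted;
    return 0;   /* no matching triple: raise an exception and deny access */
}
```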
A Lock-Key Scheme
In this technique, each object and domain is associated with a list of
unique bit patterns. The bit patterns associated with the objects are
known as locks, and the bit patterns associated with the domains are
known as keys. If a process executing in domain Di wants to perform
an operation X on object Oj, it is allowed to do so only if Di has a key
that matches one of the locks of Oj. Thus, this mechanism can be
considered as a compromise between the two techniques discussed
earlier (access lists and capability lists).
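A sketch of the lock-key check, again with hypothetical structures:

```c
/* Lock-key scheme: access is granted only if one of the domain's keys
 * matches one of the object's locks (both are bit patterns). */
struct domain { unsigned *keys;  int n_keys;  };
struct object { unsigned *locks; int n_locks; };

int lock_key_ok(const struct domain *d, const struct object *o) {
    for (int i = 0; i < d->n_keys; i++)
        for (int j = 0; j < o->n_locks; j++)
            if (d->keys[i] == o->locks[j])
                return 1;       /* matching key/lock pair found */
    return 0;
}
```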
13.6.1 Intruders
Intruders (sometimes also called adversaries) are the attackers who
attempt to breach the security of a network. They attack the
privacy of a network in order to gain unauthorized access. Intruders
are of three types, namely masquerader, misfeasor and clandestine
user.
• Masquerader is an external user who is not authorized to use the
computer and tries to gain privileges to access some legitimate
user’s account. Masquerading is generally done by either using
stolen IDs and passwords or through bypassing authentication
mechanisms.
• Misfeasor is generally a legitimate user who either accesses
some applications or data without any privileges to access them,
or if he/she has privilege to access them, he/she misuses these
privileges. It is generally an internal user.
• Clandestine user is either an internal or external user who gains
admin access to the system and tries to avoid access control and
auditing information.
Trap Doors
Trap doors (also known as backdoors) refer to security holes
purposely left in the software by insiders. Sometimes, while
programming the systems, the programmers embed a code into the
program to bypass some normal protective mechanism. For example,
they can insert a code that circumvents the normal login/ password
authentication procedure of the system, thus providing access to the
system. The main characteristic of trap doors is that they are hidden
in the software and no one knows about them with certainty.
In the computing industry, insertion of trap doors is usually considered
necessary so that programmers can quickly gain access to the
system in an undesirable error condition or when all other ways of
accessing the system have failed. However, a trap door itself may
prove a potential security threat if a hacker comes to know about it.
Trojan Horses
A Trojan horse is a malicious program that appears to be legitimate and
useful but covertly does something unexpected, like destroying
existing programs and files. It does not replicate itself in the computer
system and hence, it is not a virus. However, it usually opens the way
for other malicious programs such as viruses to enter the system. In
addition, it may also allow access to unauthorized users.
Trojan horses spread when users are convinced to open or
download a program because they think it has come from a legitimate
source. They can also be mounted on software that is freely
downloadable. They are usually subtle, especially when used
for espionage. They can be programmed for self-
destruction, without leaving any evidence other than the damage they
have caused. The most famous Trojan horse is a program called
Back Orifice, which is an unsubtle play of words on Microsoft’s Back
Office suite of programs for NT server. This program allows anybody
to have complete control over the computer or server it occupies.
Another activity relating to the Trojan horse is spyware. Spyware
are small programs that install themselves on computers to gather
data secretly about the computer user without his/her consent and
knowledge and report the collected data to interested users or
parties. The information gathered by the spyware may include e-mail
addresses and passwords, net surfing activities, credit card
information, etc. The spyware often gets automatically installed on
your computer when you download a program from the Internet or
click any option from the pop-up window in the browser.
Logic Bombs
Logic bomb is a program or portion of a program, which lies dormant
until a specific part of program logic is activated. The most common
activator for a logic bomb is date. The logic bomb checks the date of
the computer system and does nothing until a pre-programmed date
and time is reached. It could also be programmed to wait for a certain
message from the programmer. When the logic bomb sees the
message, it gets activated and executes the code. A logic bomb can
also be programmed to activate on a wide variety of other variables
such as when a database grows past a certain size or a user’s home
directory is deleted. For example, the well-known logic bomb is
Michelangelo, which has a trigger set for Michelangelo’s birthday. On
the given birth date, it causes system crash or data loss or other
unexpected interactions with the existing code.
Viruses
Virus (sometimes expanded as ‘Vital Information Resources Under
Seize’, though this is a backronym rather than the word’s actual
origin) is a program or small code segment that is designed to
replicate, attach to other programs, and perform unsolicited and
malicious actions. It
enters the computer system from external sources, such as CD, pen
drive, or e-mail and executes when the infected program is executed.
Further, as an infected computer gets in contact with an uninfected
computer (for example, through computer networks), the virus may
pass on to the uninfected system and destroy the data.
Just as flowers are attractive to the bees that pollinate them, virus
host programs are deliberately made attractive to victimize the user.
They become destructive as soon as they enter the system or are
programmed to lie dormant until activated by a trigger. The various
types of virus are discussed as follows.
• Boot sector virus: This virus infects the master boot record of a
computer system. It either moves the boot record to another
sector on the disk or replaces it with the infected one. It then
marks that sector as a bad sector on the disk. This type of virus is
very difficult to detect since the boot sector is the first program
that is loaded when a computer starts. In effect, the boot sector
virus takes full control of the infected computer.
• File-infecting virus: This virus infects files with extension .com
and .exe. This type of virus usually resides inside the memory
and infects most of the executable files on the system. The virus
replicates by attaching a copy of itself to an uninfected executable
program. It then modifies the host programs and subsequently,
when the program is executed, it executes along with it. File-
infecting virus can only gain control of the computer if the user or
the operating system executes a file infected with the virus.
• Polymorphic virus: This virus changes its code as it propagates
from one file to another. Therefore, each copy of virus appears
different from others; however, they are functionally similar. This
makes the polymorphic virus difficult to detect like the stealth
virus (discussed below). The variation in copies is achieved by
placing superfluous instructions in the virus code or by
interchanging the order of instructions that are not dependent.
Another more effective means to achieve variation is to use
encryption. A part of the virus, called the mutation engine,
generates a random key that is used to encrypt the rest portion of
the virus. The random key is kept stored with the virus while the
mutation engine changes by itself. At the time the infected
program is executed, the stored key is used by the virus to
decrypt itself. Each time the virus replicates, the random key
changes.
• Stealth virus: This virus attempts to conceal its presence from
the user. It makes use of compression such that the length of the
infected program is exactly the same as that of the uninfected
version. For example, it may keep the intercept logic in some I/O
routines so that when some other program requests for
information from the suspicious portions of the disk using these
routines, it will present the original uninfected version to the
program. Stoned Monkey is one example of stealth virus. This
virus uses ‘read stealth’ capability and if a user executes a disk
editing utility to examine the main boot record, he/she would not
find any evidence of infection.
• Multipartite virus: This virus infects both boot sectors and
executable files, and uses both mechanisms to spread. It is the
worst virus of all because it can combine some or all of the stealth
techniques along with polymorphism to prevent detection. For
example, if a user runs an application infected with a multipartite
virus, it activates and infects the hard disk’s master boot record.
Moreover, the next time the computer is started, the virus gets
activated again and starts infecting every program that the user
runs. One-half is an example of a multipartite virus, which exhibits
both stealth and polymorphic behaviour.
Worms
Worms are programs constructed to infiltrate legitimate
data processing programs and alter or destroy the data. They often
use network connections to spread from one computer system to
another, thus, worms attack systems that are linked through
communication lines. Once active within a system, worms behave like
a virus and perform a number of disruptive actions. To reproduce
themselves, worms make use of network medium, such as:
• Network mail facility, in which a worm can mail a copy of itself to
other systems.
• Remote execution capability, in which a worm can execute a copy
of itself on another system.
• Remote log in capability, whereby a worm can log into a remote
system as a user and then use commands to copy itself from one
system to another.
Both worms and viruses tend to fill computer memory with useless
data thereby preventing the user from using memory space for legal
applications or programs. In addition, they can destroy or modify data
and programs to produce erroneous results as well as halt the
operation of the computer system or network. The worm’s replication
mechanism can access the system by using any of the three methods
given below.
• It employs password cracking, in which it attempts to log into
systems using different passwords such as words from an online
dictionary.
• It exploits a trap door mechanism in mail programs, which permits
it to send commands to a remote system’s command interpreter.
• It exploits a bug in a network information program, which permits it
to access a remote system’s command interpreter.
13.9.1 Encryption
One of the most important aspects of cryptography is
encryption, which is a means of protecting the confidentiality of data in
an insecure environment, such as while transmitting data over an
insecure communication link. It is used in security and protection
mechanisms to protect the information of users and their resources.
Encryption is accomplished by applying an algorithmic
transformation to the data. The original unencrypted data is referred
to as plaintext, while its encrypted form is referred to as ciphertext.
Thus, encryption is defined as the process of encrypting plaintext so
that ciphertext can be produced. The plaintext is transformed to
ciphertext using the encryption algorithm. The ciphertext needs to be
converted back to plaintext using the opposite process of encryption,
called decryption which uses a decryption algorithm to accomplish
the same.
Both encryption and decryption algorithms make use of a key
(usually, a number or set of numbers) to encrypt or decrypt the data,
respectively (see Figure 13.7). The longer the key, the harder it is for an
opponent to decrypt the message.
During encryption, the encryption algorithm (say, E) uses the
encryption key (say, k) to convert the plaintext (say, P) to ciphertext
(say, C), as shown here.

C = E(P, k)

Similarly, during decryption, the decryption algorithm D uses the
corresponding key to recover the plaintext: P = D(C, k).
Symmetric Encryption
The type of encryption in which the same key is used for both
encryption and decryption of data is called symmetric encryption.
Data Encryption Standard (DES) is a well-known example of a
symmetric encryption algorithm. In 1977, the US government
adopted DES as a standard, and it was widely used by the industry in
security products. The DES algorithm is parameterized by a 56-bit
encryption key. It has a total of 19 distinct stages and encrypts the
plaintext in blocks of 64 bits, producing 64 bits of ciphertext. The first
stage is independent of the key and performs transposition on the 64-
bit plaintext. The last stage is the exact inverse of the first stage
transposition. The stage preceding the last one exchanges the first 32
bits with the next 32 bits. The remaining 16 stages perform encryption
using the parameterized encryption key. Since the algorithm is a
symmetric key encryption, it allows decryption to be done with the
same key as encryption. All the steps of the algorithm are run in the
reverse order to recover the original data.
With increasing speeds of computers, it was feared that a special-
purpose chip could crack DES in under a day by searching all 2⁵⁶ possible
keys. Therefore, NIST created a modified version of the DES, called
triple DES (3-DES), with increased key length thereby making the
DES more secure. As the name implies, 3-DES performs the DES
thrice, including two encryptions and one decryption. There are two
implementations of 3-DES: one with two keys, while the other with
three keys. The former version uses two keys (k1 and k2) of 56 bits
each, that is, the key size is 112 bits. During encryption, the plaintext
is encrypted using DES with key k1 in the first stage, then the output
of first stage is decrypted using DES with key k2 in the second stage,
and finally, in the third stage, the output of second stage is encrypted
using DES with key k1 thereby producing the ciphertext. In contrast,
the latter version of 3-DES uses three keys of 56 bits each and a
different key is used for encryption/decryption in each stage. The use
of three different keys further increases the key length to 168 bits,
making the communication more secured.
After questioning the inadequacy of DES, the NIST adopted a
new symmetric encryption standard called Advanced Encryption
Standard (AES) in 2001. AES supports key lengths of 128, 192, and
256 bits and specifies a block size of 128 bits. Since the key length
is 128 bits, there are 2¹²⁸ possible keys. It is estimated that a fast
computer that can crack DES in 1 second would take trillions of years
to crack a 128-bit AES key. The main problem with a symmetric algorithm
is that the key must be shared among all the authorized users. This
increases the chance of key becoming known to an intruder.
Asymmetric Encryption
In 1976, Diffie and Hellman introduced a new concept of encryption
called asymmetric encryption (or public-key encryption). It is based on
mathematical functions rather than operations on bit patterns. Unlike
DES and AES, it uses two different keys for encryption and
decryption. These are referred to as public key (used for encryption)
and private key (used for decryption). Each authorized user has a
pair of public key and private key. The public key is known to
everyone, whereas the private key is known to its owner only, thus,
avoiding the weakness of DES. Assume that E and D represent the
public encryption key and the private decryption key, respectively. It
must be ensured that deducing D from E should be extremely difficult.
In addition, the plaintext that is encrypted using the public key Ei
requires the private key Di to decrypt the data.
Now suppose that a user A wants to transfer some information to
user B securely. The user A encrypts the data by using public key of B
and sends the encrypted message to B. On receiving the encrypted
message, B decrypts it by using his private key. Since decryption
process requires private key of user B, which is known only to B, the
information is transferred securely.
In 1978, a group at MIT invented a strong method for asymmetric
encryption. It is known as RSA, the name derived from the initials of
the three discoverers Ron Rivest, Adi Shamir, and Len Adleman. It is
now the most widely accepted asymmetric encryption algorithm; in
fact most of the practically implemented security is based on RSA.
For good security, the algorithm requires keys of at least 1024 bits.
This algorithm is based on principles from number theory, in
particular the fact that determining the prime factors of a large
number is extremely difficult. The algorithm follows these steps to
determine the encryption key and decryption key.
1. Take two large distinct prime numbers, say m and n (about 1024 bits).
2. Calculate p= m × n and q=(m–1)×(n–1).
3. Find a number which is relatively prime to q, say D. That number is the
decryption key.
4. Find encryption key E such that E × D=1 mod q.
Using these calculated keys, a block B of plaintext is encrypted as
Te = B^E mod p. To recover the original data, compute B = Te^D mod p.
Note that E and p are needed to perform encryption, whereas D and p
are needed to perform decryption. Thus, the public key consists of
(E, p), and the private key consists of (D, p). An important property of
the RSA algorithm is that the roles of E and D can be interchanged. Since
number theory suggests that it is very hard to find the prime factors of
a large number, it is extremely difficult for an intruder to determine the
decryption key D using just E and p, because doing so requires factoring p.
As an example, suppose we have to encrypt the plaintext 6 using the
RSA encryption algorithm, and that we use the prime numbers 11 and 3
to compute the public key and private key. Here, we have m = 11 and
n = 3. Thus, p and q can be calculated as:
p = m × n = 11 × 3 = 33
q = (m-1)×(n-1) = (11-1) × (3-1) = 10 × 2 = 20
Let us choose D = 3 (a number relatively prime to 20, that is,
gcd(20, 3) = 1).
Now,
E × D = 1 mod q
⇒E × 3 = 1 mod 20
⇒E=7
As we know, the public key consists of (E,p), and the private key
consists of (D,p). Therefore, the public key is (7, 33) and the private
key is (3, 33).
Thus, the plaintext 6 can be converted to ciphertext using the
public key (7, 33) as shown here.
Te = B^E mod p
⇒ 6^7 mod 33
⇒ 30
On applying the private key to the ciphertext 30 to get original
plaintext, we get:
B = Te^D mod p
⇒ 30^3 mod 33
⇒ 6
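The arithmetic of this example is easy to verify with a few lines of C. The sketch below uses the standard square-and-multiply method for modular exponentiation; real RSA operates on numbers of 1024 bits or more and needs a big-integer library, so this toy only checks the numbers above.

```c
#include <stdio.h>

/* Modular exponentiation: computes (base^exp) mod m by repeated
 * squaring, so intermediate values stay small. */
static unsigned long modpow(unsigned long base, unsigned long exp,
                            unsigned long m) {
    unsigned long result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

int main(void) {
    unsigned long p = 33, E = 7, D = 3, B = 6;
    unsigned long Te = modpow(B, E, p);             /* encrypt: 6^7 mod 33 */
    printf("ciphertext = %lu\n", Te);               /* prints 30 */
    printf("plaintext  = %lu\n", modpow(Te, D, p)); /* 30^3 mod 33 = 6 */
    return 0;
}
```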
13.10.1 Passwords
Password is the simplest and most commonly used authentication
scheme. In this scheme, each user is asked to enter a username and
password at the time of logging into the system. The combination of
username and password is then matched against the stored list of
usernames and passwords. If a match is found, the system assumes
that the user is legitimate and allows him/her access to the system;
otherwise the access is denied. Generally, the password is asked for
only once when the user logs into the system, however, this process
can be repeated for each operation when the user tries to access
sensitive data.
Though the password scheme is widely used, it has some
limitations. In this method, the security of the system relies
completely on the password. Thus, the password itself needs to be
secured from unauthorized access. Unfortunately, however, passwords
can be easily guessed, accidentally exposed to an intruder, or passed
illegally from an authorized user to an unauthorized one. Moreover,
they can be exposed to intruders
through visual or electronic monitoring. In visual monitoring, an
intruder looks over the shoulder of the user while he/she types the
password, watching the keyboard. This activity is referred to as
shoulder surfing. On the other hand, in electronic monitoring (or
network monitoring), one having direct access to the network (in
which the system runs) can see the data being transferred on the
network. Such data may consist of user IDs and passwords also. This
activity is referred to as sniffing.
One simple way to secure the password is to store it in an
encrypted form. The system employs a function (say, f(x)) to
encode (encrypt) all the passwords. Whenever a user attempts to log
into the system, the password entered by him/her is first encrypted
using the same function f(x) and then matched against the stored list
of encrypted passwords.
The main advantage of encrypted passwords is that even if the
stored encrypted password is seen, the original password cannot be
determined from it. Thus,
there is no need to keep the password file secret. However, care
should be taken to ensure that the password would never be
displayed on the screen in its decrypted form.
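A sketch of this check, using a deliberately toy stand-in for the encoding function f(x); a real system would use a cryptographic one-way hash (such as those behind crypt(3)), never something this weak.

```c
#include <string.h>

/* Toy one-way function (hypothetical and NOT secure): stands in for
 * the f(x) used to encode passwords in the discussion above. */
static unsigned long f(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

struct pw_entry { char user[32]; unsigned long encrypted; };

/* Only f(password) is stored; the typed password is encoded the same
 * way and the two encoded values are compared. */
int check_login(const struct pw_entry *list, int n,
                const char *user, const char *password) {
    for (int i = 0; i < n; i++)
        if (strcmp(list[i].user, user) == 0)
            return list[i].encrypted == f(password);
    return 0;
}
```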
LET US SUMMARIZE
1. Security deals with threats to information caused by outsiders (non-users),
whereas protection deals with the threats caused by other users (who are
not authorized to do what they are doing) of the system.
2. One of the most important goals of protection is to ensure that each
resource is accessed correctly and only by those processes that are
allowed to do so.
3. Implementing protection requires policies and mechanisms. The policy of
the organization decides which data should be protected from whom and
the mechanism specifies how this policy is to be enforced.
4. Some commonly used protection mechanisms include protection domain,
access control list, and access control matrix.
5. A computer system consists of a set of objects that may be accessed by
the processes. An object can be either a hardware object (such as CPU,
memory segment and printer) or software object (such as file, database,
program and semaphore).
6. A domain is a collection of access rights, where each access right is a pair
of <object_name, rights_set>. The object_name is the name of the object
and rights_set is the set of operations that a process is permitted to
perform on that object.
7. The association between a process and domain may be either static or
dynamic. In the former case, the process is allowed to access only a fixed
set of objects and rights during its lifetime, while in the latter case, the
process may switch from one domain to another during its execution
(termed as domain switching).
8. Access control matrix (or access matrix) is a mechanism that records the
access rights of processes over objects in a computer system. Like
access control list, it is also employed in file systems and is consulted by
the operating system each time an access request is issued.
9. Allowing controlled change in the contents of the access matrix requires
three additional operations, namely, copy, owner, and control. The copy
and owner rights allow a process to modify only the column entries in the
access matrix; however, the control right allows a process to modify the
row entries.
10. To effectively implement an access control matrix, some techniques are
used. These techniques include global table, access lists for objects,
capability lists for domains, and a lock-key scheme.
11. In a dynamic protection system, it may be necessary to revoke the access
rights to objects shared by several users. Revocation of access rights is
much easier with access lists than with capability lists.
12. To implement revocation for capabilities, techniques including re-
acquisition, back-pointers, indirection, and keys are used.
13. Undoubtedly, the protection mechanisms provided by the operating system
enable users to protect their programs and data. But a system is
considered secure only if the users make the intended use of and access
to the computer’s resources, in every situation.
14. Intruders (sometimes also called adversaries) are the attackers who
attempt to breach the security of a network. They attack the privacy of
a network in order to gain unauthorized access. Intruders are of three
types, namely, masquerader, misfeasor and clandestine user.
15. Security violations are of many types, which can be broadly classified
into two categories: accidental and intentional (malicious). Accidental
security violations are easier to protect against than intentional ones.
16. The intruders adopt certain standard methods while making attempts to
breach the security of the system. Some of these methods are
masquerading, replay attack, message modification, man-in-the-middle
attack and session hijacking.
17. There are four levels at which security measures should be applied in
order to protect the system. These include physical, human, operating-
system and network levels.
18. Designing a secure operating system is a crucial task. The major concern
of designers is on the internal security mechanisms that lay the foundation
for implementing security policies.
19. Researchers have identified certain principles that can be followed to
design a secure system. These principles include least privilege, fail-safe
default, complete mediation, user acceptability, economy of mechanism,
least common mechanism, open design and separation of privileges.
20. Security threats continue to evolve around us by finding new ways. Some
of them are caused by humans, some are by nature such as floods,
earthquakes and fire, and some are by the use of Internet such as virus,
Trojan horse, spyware, and so on.
21. Different security threats are classified into two broad categories: program
threats and system and network threats.
22. In simple terms, cryptography is the process of altering messages in a way
that their meaning is hidden from the adversaries who might intercept
them.
23. One of the most important aspects of the parts of cryptography is
encryption which is a means of protecting confidentiality of data in an
insecure environment, such as while transmitting data over an insecure
communication link. There are two categories of encryption algorithms,
namely, symmetric and asymmetric.
24. A process that lets users present their identity to the system so that its
correctness can be confirmed is termed authentication. User authentication
can be based on—user knowledge (such as a username and password),
user possession (such as a card or key) and/ or user attribute (such as
fingerprint, retina pattern or iris design).
25. A computer and operating system that can be relied upon, to a
determined level, to implement a given security policy is referred to as a
trusted system. In other words, a trusted system is one whose failure
may compromise a specified security policy.
26. Firewall is such a mechanism that protects and isolates the internal
network from the outside world. Simply put, a firewall prevents certain
outside connections from entering the network.
EXERCISES
Fill in the Blanks
1. _____________ deals with the threats caused by those users of the
system who are not authorized to do what they are doing.
2. _____________ decides which data should be protected from whom.
3. The person who tries to breach the security and harm a system is referred
to as _____________.
4. The association between a process and domain may be either
_____________ or _____________.
5. _____________ use the unique characteristics (or attributes) of an
individual to authenticate a person’s identity.
Descriptive Questions
1. What is the difference between security and protection?
2. Differentiate between protection policy and protection mechanism.
3. Define the following terms.
(a) Intruder
(b) Phishing
(c) Authentication
4. Discuss the goals and principles of protection.
5. Which factors can affect the security of a computer system and harm it?
6. What are the levels at which security measures should be applied to
protect the system?
7. Discuss various means of authenticating a user.
8. What is the advantage of storing passwords in encrypted form in computer
systems?
9. Describe protection mechanism illustrating use of protection domain and
access control matrix.
10. Describe various techniques used to implement access control matrix.
11. Describe the use of one-time passwords.
12. What is the importance of design principles for security? Explain some of
these principles.
13. Define encryption. Point out the differences between symmetric and
asymmetric encryption.
14. Write short notes on the following.
(a) Firewalls
(b) Trusted systems
(c) Types of security violations
(d) Methods used in security attacks
chapter 14
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the term multiprocessor systems.
⟡ Describe the various interconnection networks used in
multiprocessor systems.
⟡ Describe the architecture of multiprocessor systems.
⟡ Discuss different types of multiprocessor systems.
⟡ Describe the term distributed systems.
⟡ Understand the capabilities of a distributed operating system.
⟡ Describe the various techniques of data distribution over a
distributed system.
⟡ Discuss the various aspects of computer networks that form the
basis for distributed systems.
⟡ Understand the concept of distributed file system.
14.1 INTRODUCTION
Today’s computer applications demand high performance machines
that can process large amounts of data in sophisticated ways. The
applications such as geographical information systems, real-time
decision making, and computer-aided design demand the capability
to manage several hundred gigabytes to terabytes of data. Moreover, with the evolution of the Internet, the number of online users as well as the size of data has increased. This demand has been the driving force behind the emergence of technologies like parallel processing and
data distribution. The systems based on parallelism and data
distribution are called multiprocessor systems and distributed
systems, respectively. This chapter discusses these systems in brief.
Bus
It is the simplest interconnection network in which all the processors
and one or more memory units are connected to a common bus. The
processors can send data on and receive data from this single
communication bus. However, only one processor can communicate
with the memory at a time. This organization is simple and economical to implement; however, it is suitable only for low traffic densities. At medium or high traffic densities, it becomes slow due to
bus contention. A simple bus organization containing four processors
P0, P1, P2 and P3 and a shared memory M is shown in Figure 14.1 (a).
Crossbar Switch
A crossbar switch uses an N × N matrix organization, wherein N
processors are arranged along one dimension and N memory units
are arranged along the other dimension. Each CPU and each memory unit is connected to an independent bus. The intersection of each
horizontal and vertical bus is known as a crosspoint. Each
crosspoint is basically an electric switch that can be opened or closed
depending on whether or not the communication is required between
the processor and the memory. If a processor Pi wants to access the
data stored in memory unit Mj, the switch between them is closed,
which connects the bus of Pi to the bus of Mj.
The crossbar switch eliminates the problem of bus contention as N
processors are allowed to communicate with N different memory units
at the same time. However, the contention problem may occur when
more than one processor attempts to access the same memory unit
at the same time. Figure 14.1 (b) shows a crossbar switch
interconnection containing four processors P0, P1, P2 and P3 and four
memory units M0, M1, M2 and M3. It is clear from the figure that to
completely connect N processors to N memory units, N² crosspoints
are required. For 1000 CPUs and 1000 memory units, a million
crosspoints are required, which makes the crossbar interconnections
more complex and expensive.
Multistage Switch
A multistage switch lies in between a bus and a crossbar switch in
terms of cost and parallelism. It consists of several stages, each
containing 2 × 2 crossbar switches. A 2 × 2 crossbar switch consists
of two inputs and two outputs. These switches can be connected in
several ways to build a large multistage interconnection network
(MIN). In general, for N processors and N memory units, m = log₂N stages are required, where each stage has N/2 crossbar switches, resulting in a total of (N/2)log₂N switches.
Figure 14.1 (c) shows a multistage network having eight
processors and eight memory units. It consists of three stages, with
four switches per stage, resulting in a total of 12 switches. The jth switch in the ith stage is denoted by Sij. This type of interconnection network is termed an 8 × 8 omega network.
Whenever a CPU attempts to access data from a memory unit,
the path to be followed between them is selected using the address
bits of the memory unit to be accessed. Initially, the leftmost bit of the
memory address is used for routing data from the switch at the first
stage to the switch at the second stage. If the address bit is 0, the
upper output of the switch is selected, and if it is 1, the lower output is
selected. At the second stage, the second bit of the address is used
for routing, and so on. The process continues until a switch in the last
stage is encountered, which selects one of the two memory units.
For example, suppose the CPU P2 wants to communicate with the
memory unit M7. The binary address of this memory unit is 111.
Initially, P2 passes the data to switch S12. Since the leftmost bit of the
memory address is 1, S12 routes the data to second stage switch S22
via its lower output. The switch at second stage checks the second bit
of the memory address. Since it is again 1, S22 passes the data to the
third stage switch, which is S34, via its lower output. This switch
checks the third bit, which is again 1. Now, S34 routes the data to the
desired memory unit via its lower output. Consequently, the data
follows the highlighted path as shown in Figure 14.1 (c).
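The stage-by-stage routing decision is simple enough to express in a few lines of code. The following C sketch (illustrative only; the function and variable names are our own) prints the output taken at each switch for a given destination address:

/* Sketch: routing through an omega network with n_stages = log2 N
   stages. At stage s, the sth bit of the destination address (most
   significant bit first) selects the upper (0) or lower (1) output
   of the 2 x 2 switch. */
#include <stdio.h>

static void route(int n_stages, int dest)
{
    for (int s = 0; s < n_stages; s++) {
        int bit = (dest >> (n_stages - 1 - s)) & 1;   /* address bit for stage s */
        printf("stage %d: take the %s output\n", s + 1, bit ? "lower" : "upper");
    }
}

int main(void)
{
    route(3, 7);   /* 8 x 8 network: destination M7, binary 111 */
    return 0;
}

For destination 111, the sketch selects the lower output at all three stages, matching the path traced above.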
The main advantage of multistage switches is that the cost of an
N×N multistage network is much lower than that of an N×N crossbar
switch, because the former network requires a total of (N/2)log₂N switches, which is much lower than N² crosspoints. However, unlike a
crossbar switch, it is a blocking network because it cannot process
every set of requests simultaneously. For example, suppose another
CPU, say P3 wants to communicate with the memory unit M6 at the
same time when P2 is communicating with M7. The binary address of
M6 memory unit is 110. Initially, P3 passes the data to switch S12. Since
the leftmost bit of the memory address is 1, S12 routes the data to
second stage switch S22 via its lower output. Since the lower output
line is already busy, this request cannot be processed until P2
releases the communicating line.
Fig. 14.1 Types of Interconnection Networks
14.2.2 Architecture of Multiprocessor Systems
Each CPU in a multiprocessor system is allowed to access its local
memory as well as the memories of other CPUs (non-local
memories). Depending on the speed with which the CPUs can
access the non-local memories, the architecture of multiprocessor
systems can be categorized into two types, namely, uniform memory
access (UMA) and non-uniform memory access (NUMA).
Note: The Balance system by Sequent and the VAX 8800 by Digital are examples of UMA architecture.
If other caches contain the “clean” copy (same as that of the
memory) of the data to be modified, they simply discard that copy and
allow the processor to fetch the cache block from the memory and
modify it. On the other hand, if any of the caches has the “dirty” or
modified copy (different from that of the memory) of the data, it either
directly transfers it to the processor (that wants to perform write
operation), or it writes it back to the memory before performing the
write operation. In this way, cache coherency can be achieved. The
overhead of maintaining cache coherency increases with the number of processors. Thus, even with this improvement, UMA cannot support more than 64 CPUs at a time.
Another possible design is to let each processor have local private
memories in addition to caches [see Figure 14.2 (b)]. This further
reduces the network traffic, as the compiler places read-only data (such as program code, constants and strings) as well as per-process data (such as stacks and local variables) in the private memories of the processors. The shared memory is used only for writeable shared
variables. However, this design requires active participation of the
compiler.
Unlike UMA, the CPUs in NUMA architecture access the local and non-local memories at different speeds; each CPU can access its
local memory faster than the non-local memories. The remote or non-
local memory can be accessed via LOAD and STORE instructions. This
architecture supports the concept of distributed virtual-memory
architecture, where logically there is a single shared memory, but
physically there are multiple disjoint memory systems.
Like UMA, cache can also be provided to each CPU in each node.
The global ports of each node can also be associated with a cache
for holding data and instructions accessed by the CPUs in that node
from non-local memories. Thus, it is necessary to ensure coherence
between local as well as non-local caches. The system with coherent
caches is known as Cache-Coherent NUMA (CC-NUMA).
Note: The HP AlphaServer and the IBM NUMA-Q are examples of NUMA architecture.
Separate Supervisors
In separate supervisor systems, the memory is divided into as many
partitions as there are CPUs, and each partition contains a copy of
the operating system. Thus, each CPU is assigned its own private
memory and its own private copy of the operating system. Since copying the entire operating system into each partition is not feasible, a better option is to create copies of only the data and allow the CPUs to share the operating system code (see Figure 14.4). Consequently,
n CPUs can operate as n independent computers, still sharing a set
of disks and other I/O devices. This makes it better than having n
independent computers.
The main advantage of this scheme is that it allows the memory to
be shared flexibly. That is, if one CPU needs a larger portion of the
memory to run a large program, the operating system can allocate the required extra memory space to it for the duration of the program's execution. Once the execution is over, the additional memory is de-
allocated. Another benefit of this scheme is that it allows efficient
communication among executing processes through shared memory.
The main drawback of this approach is that it does not allow
sharing of processes. That is, all the processes of a user who has
logged into CPU1 will execute on CPU1 only. They cannot be
assigned to any other CPU, which sometimes results in an imbalanced load distribution; for example, CPU2 may sit idle while CPU1 is heavily loaded with work. Another
problem arises when caching of disk blocks is allowed. If each
operating system maintains its own cache of disk blocks, then
multiple “dirty” copies of a certain disk block may be present at the
same time in multiple caches, which leads to inconsistent results.
Avoiding caches will definitely eliminate this problem, but it will affect
the system performance considerably.
Master-slave Multiprocessors
In master-slave (or asymmetric) multiprocessing systems, one
processor is different from the other processors in a way that it is
dedicated to execute the operating system and hence, known as
master processor. Other processors, known as slave processors,
are identical. They either wait for instructions from the master
processor to perform any task or have predefined tasks.
Symmetric Multiprocessors
In symmetric multiprocessing systems, all the processors perform
identical functions. A single copy of the operating system is kept in
the memory and is shared among all the processors as shown in
Figure 14.6. That is, any processor can execute the operating system
code. Thus, the failure of one CPU does not affect the functioning of
other CPUs.
Though this approach eliminates all the problems associated with
separate supervisors and master-slave systems, it has its own
problems. This approach can result in disaster when two or more processors execute the operating system code at the same time. Imagine a
situation where two or more processors attempt to pick the same
process to execute or claim the same free memory page. This
problem can be resolved by treating the operating system as one big
critical region and associating a mutex variable with it. Now,
whenever a CPU needs to run operating system code, it must first
acquire the mutex, and if the mutex is locked, the CPU should wait
until the mutex becomes free. This approach allows all the CPUs to execute the operating system code, but in a mutually exclusive manner.
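The idea can be illustrated at user level with a POSIX mutex. The sketch below is an analogy only; a real kernel would use its own locking primitives, and the names here are hypothetical:

/* Sketch: the "big lock" idea of symmetric multiprocessing. Every
   CPU must acquire one mutex before running OS code, so the
   operating system executes in a mutually exclusive manner. */
#include <pthread.h>

static pthread_mutex_t os_mutex = PTHREAD_MUTEX_INITIALIZER;

void run_os_service(void (*service)(void))
{
    pthread_mutex_lock(&os_mutex);    /* wait if another CPU is inside the OS */
    service();                        /* the OS code: one big critical region */
    pthread_mutex_unlock(&os_mutex);  /* let the next CPU enter */
}

Treating the whole operating system as one critical region is simple but serializes all operating system work; finer-grained locks relieve this bottleneck at the cost of complexity.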
Bus/Linear Topology
The bus topology uses a common single cable to connect all the
workstations. Each computer performs its task of sending messages without the help of a central server. Whenever a message is to be
transmitted on the network, it is passed back and forth along the
cable from one end of the network to the other. However, only one
workstation can transmit a message at a particular time in the bus
topology.
As the message passes through each workstation, the
workstations check the message’s destination address. If the
destination address does not match the workstation’s address, the
bus carries the message to the next station until the message
reaches its desired workstation. Note that the bus comprises
terminators at both ends. The terminator absorbs the message that
reaches the end of the medium. This type of topology is popular
because many computers can be connected to a single central cable.
Advantages
• It is easy to connect and install.
• The cost of installation is low.
• It can be easily extended.
Disadvantages
• The entire network shuts down if there is a failure in the central
cable.
• Only a single message can travel at a particular time.
• It is difficult to troubleshoot an error.
Ring/Circular Topology
In ring topology, the computers are connected in the form of a ring
without any terminating ends. Every workstation in the ring topology
has exactly two neighbours. The data is accepted from one
workstation and is transmitted to the destination through a ring in the
same direction (clockwise or counter-clockwise) until it reaches its
destination.
Advantages
• It is easy to install.
• The cable length required for installation is not much.
• Every computer is given equal access to the ring.
Disadvantages
• The maximum ring length and the number of nodes are limited.
• A failure in any cable or node breaks the loop and can bring down the entire network.
Star Topology
In star topology, the devices are not directly linked to each other but
are connected through a centralized network component known as
hub or concentrator. Computers connected to the hub by cable
segments send their traffic to the hub that resends the message
either to all the computers or only to the destination computer. The
hub acts as a central controller and if a node wants to send the data
to another node, it boosts the message and sends it to the intended
node. This topology commonly uses twisted pair cable; however, coaxial cable or optical fibre can also be used.
Disadvantages
• It is difficult to expand.
• The cost of the hub and the longer cables makes it expensive over
others.
• In case the hub fails, the entire network fails.
Tree Topology
The tree topology combines the characteristics of the bus and star
topologies. It consists of groups of star-configured workstations
connected to a bus backbone cable. Not every node is directly plugged into the central hub; the majority of nodes are connected to a secondary hub, which in turn is connected to the central hub. Each
secondary hub in this topology functions as the originating point of a
branch to which other nodes connect. This topology is commonly
used where a hierarchical flow of data takes place.
Fig. 14.12 Tree Topology
Advantages
• It eliminates network congestion.
• The network can be easily extended.
• The faulty nodes can easily be isolated from the rest of the
network.
Disadvantages
• It uses large cable length.
• It requires a large amount of hardware components and hence, is
expensive.
• Installation and reconfiguration of the network is very difficult.
Mesh Topology
In mesh topology, each workstation is linked to every other
workstation in the network. That is, every node has a dedicated point-
to-point link to every other node. The messages sent on a mesh
network can take any of the several possible paths from the source to
the destination. A fully connected mesh network with n devices has
n(n-1)/2 physical links. For example, if an organization implementing
the topology has 8 nodes, 8(8-1)/2, that is, 28 links are required. In
addition, routers are used to dynamically select the best path to be
used for transmitting the data.
Advantages
• The availability of a large number of routes eliminates congestion.
• It is fault tolerant, that is, failure of any route or node does not fail
the entire network.
Disadvantages
• It is expensive as it requires extensive cabling.
• It is difficult to install.
Graph Topology
In a graph topology, the nodes are connected randomly in an arbitrary
fashion. There can be multiple links and all the links may or may not
be connected to all the nodes in the network. However, if all the
nodes are linked through one or more links, the layout is known as a
connected graph.
Circuit Switching
In the circuit switching technique, the complete end-to-end transmission path is first established between the source and the destination computers, and then the message is transmitted through that path. The main advantage of this technique is that the dedicated
transmission path provides a guaranteed delivery of the message. It
is mostly used for voice communication such as in the Public
Switched Telephone Network (PSTN): when a telephone call is placed, the switching equipment within the telephone system seeks out a physical path all the way from the caller's telephone to the receiver's telephone.
In circuit switching, the data is transmitted with no delay (except
for negligible propagation delay). In addition, this technique is simple
and requires no special facilities. Hence, it is well suited for low speed
transmission.
Message Switching
In message switching technique, no physical path is established
between the sender and receiver in advance. This technique follows
the store and forward mechanism, where a special device (usually a
computer system with large memory storage) in the network receives
the message from the source computer and stores it in its memory. It
then finds a free route and sends the stored information to the
intended receiver. In this kind of switching, a message is always
delivered to one device where it is stored and then rerouted to its
destination.
Message switching is one of the earliest types of switching
techniques, which was common in the 1960s and 1970s. As delays in
such switching are inherent (time delay in storing and forwarding the
message) and capacity of data storage required is large, this
technique has virtually become obsolete.
Packet Switching
In the packet switching technique, the message is first broken down into small units known as packets. A packet is a discrete block of data of limited length. Apart from data, the packets also
contain a header with the control information such as the destination
address, and priority of the message. The packets are transmitted
from the source to its local Packet Switching Exchange (PSE). The
PSE receives the packet, examines the packet header information
and then passes the packet through a free link over the network. If
the link is not free, the packet is placed in a queue until it becomes
free. The packets travel in different routes to reach the destination. At
the destination, the Packet Assembler and Disassembler (PAD) puts the packets in order and assembles them to retrieve the information.
The benefit of packet switching is that since packets are short,
they are easily transferred over a communication link. Longer
messages require a series of packets to be sent, but do not require
the link to be dedicated between the transmission of each packet.
This also allows packets belonging to other messages to be sent
between the packets of the original message. Hence, packet switching provides a much fairer and more efficient sharing of the resources. Due to these characteristics, packet switching is widely used in data networks like the Internet.
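As an illustration of the control information a header may carry, the following C structure sketches one possible packet layout. The field names and sizes are assumptions for illustration, not those of any particular protocol:

#include <stdint.h>

/* Illustrative packet layout: a header carrying control information,
   followed by the data portion. */
struct packet_header {
    uint32_t dest_addr;    /* destination address */
    uint32_t src_addr;     /* source address */
    uint16_t seq_no;       /* position used by the PAD for reassembly */
    uint8_t  priority;     /* priority of the message */
    uint8_t  flags;
};

struct packet {
    struct packet_header hdr;
    uint8_t payload[1024]; /* data portion (maximum size assumed) */
};

The sequence number is what allows the PAD at the destination to put packets that travelled different routes back into order.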
ISO Protocol
International Standards Organization (ISO) provided an Open
Systems Interconnection (OSI) reference model for communication
between two end users in a network. In 1983, ISO published a
document called ‘The Basic Reference Model for Open Systems
Interconnection’ which visualizes network protocols as a seven-
layered model. The model lays a framework for the design of network
systems that allow for communication across all types of computer
systems. It consists of seven separate but related layers, namely,
Physical, Data Link, Network, Transport, Session, Presentation and
Application.
A layer in the OSI model communicates with two other OSI layers: the one directly above it and the one directly below it. For example, the data
link layer in System X communicates with the network layer and the
physical layer. When a message is sent from one machine to another,
it travels down the layers on one machine and then up the layers on
the other machine. This route is illustrated in Figure 14.18.
As the message travels down the first stack, each layer (except
the physical layer) adds header information to it. These headers contain control information that is read and processed by the corresponding layer on the receiving stack. At the receiving stack, the
process happens in reverse. As the message travels up the other
machine, each layer strips off the header added by its peer layer.
The seven layers of the OSI model are listed here.
• Physical layer: It is the lowest layer of the OSI model that defines
the physical characteristics of the network. This layer
communicates with data link layer and regulates transmission of
stream of bits (0s and 1s) over a physical medium such as
cables, and optical fibers. In this layer, bits are converted into
electromagnetic signals before traveling across physical medium.
• Data link layer: It divides the stream of bits received from the network layer into frames. These frames are then transmitted sequentially to the receiver. The data link layer at the receiver's end detects and corrects any errors in the transmitted data received from the physical layer.
Fig. 14.18 Communication between two machines using OSI Model
TCP/IP Protocol
LET US SUMMARIZE
1. With the evolution of the Internet, the number of online users as well as the size of data has increased. This demand has been the driving force behind the emergence of technologies like parallel processing and data distribution.
The systems based on parallelism and data distribution are called
multiprocessor systems and distributed systems, respectively.
2. Multiprocessor systems (also known as parallel systems or tightly coupled
systems) consist of multiple processors in close communication in a
sense that they share the computer bus, system clock, and sometimes
even memory and peripheral devices.
3. A multiprocessor system provides several advantages over a uniprocessor
system including increased system throughput, faster computation within
an application and graceful degradation.
4. Multiprocessor systems consist of several components such as CPUs,
one or more memory units, disks and I/O devices. All of these
components communicate with each other via an interconnection network.
The three commonly used interconnection networks are bus, crossbar
switch and multistage switch.
5. Bus is the simplest interconnection network in which all the processors
and one or more memory units are connected to a common bus. The
processors can send data on and receive data from this single
communication bus. However, only one processor can communicate with
the memory at a time.
6. A crossbar switch uses an N × N matrix organization, wherein N
processors are arranged along one dimension and N memory units are
arranged along the other dimension. Each CPU and each memory unit is connected to an independent bus. The intersection of each horizontal
and vertical bus is known as a crosspoint.
7. A multistage switch lies in between a bus and a crossbar switch in terms
of cost and parallelism. It consists of several stages, each containing 2 ×
2 crossbar switches.
8. Each CPU in a multiprocessor system is allowed to access its local
memory as well as the memories of other CPUs (non-local memories).
Depending on the speed with which the CPUs can access the non-local
memories, the architecture of multiprocessor systems can be categorized
into two types, namely, uniform memory access (UMA) and non-uniform
memory access (NUMA).
9. In uniform memory access (UMA) architecture, all the processors share
the physical memory uniformly, that is, the time taken to access a memory
location is independent of its position relative to the processor. Several
improvements have been made in this architecture to make it better. One
such improvement is to provide a cache to each CPU. Another possible
design is to let each processor have local private memories in addition to
caches.
10. In the NUMA architecture, the CPUs access the local and non-local memories at different speeds; each CPU can access its local memory faster than the non-local memories.
11. There are basically three types of multiprocessor operating systems,
namely, separate supervisors, master-slave, and symmetric.
12. In separate supervisor systems, the memory is divided into as many
partitions as there are CPUs, and each partition contains a copy of the
operating system. Thus, each CPU is assigned its own private memory
and its own private copy of the operating system.
13. In master-slave (or asymmetric) multiprocessing systems, one processor
is different from the other processors in a way that it is dedicated to
execute the operating system and hence, known as master processor.
Other processors, known as slave processors, are identical. They either
wait for the instructions from the master processor to perform any task or
have predefined tasks.
14. In symmetric multiprocessing systems, all the processors perform identical
functions. A single copy of the operating system is kept in the memory
and is shared among all the processors.
15. A distributed system consists of a set of loosely coupled processors that
do not share memory or system clock, and are connected by a
communication medium.
16. The main advantages of distributed systems are that they allow resource
sharing, enhance availability and reliability of a resource, provide
computation speed-up and better system performance, and allow
incremental growth of the system.
17. A network operating system is the earliest form of operating system used
for distributed systems.
18. A distributed operating system provides an abstract view of the system by
hiding the physical resource distribution from the users. It provides a
uniform interface for resource access regardless of its location.
19. In a distributed system, the data is distributed across several sites. There
are two ways of achieving data distribution, namely, partitioning and
replication.
20. In partitioning (also known as fragmentation), the data is divided into several partitions (or fragments), and each partition can be stored at a different site. On the other hand, in replication, several identical copies or replicas of the data are maintained, and each replica is stored at a different site.
21. There are three ways of accessing the data in a distributed system,
namely, data migration, computation migration and process migration.
22. A computer network can be as small as several personal computers on a
small network or as large as the Internet. Depending on the geographical
area they span, computer networks can be classified into two main
categories, namely, local area networks and wide area networks.
23. A local area network (LAN) is the network restricted to a small area such
as an office or a factory or a building.
24. A wide area network (WAN) spreads over a large geographical area like a
country or a continent. It is much bigger than a LAN and interconnects
various LANs.
25. A network topology refers to the way a network is laid out either physically
or logically. The various network topologies include bus, ring, star, tree,
mesh, and graph.
26. The main aim of networking is transfer of the data or messages between
different computers. The data is transferred using switches that are
connected to communication devices directly or indirectly. A switch is a
device that selects an appropriate path or circuit to send the data from the
source to the destination.
27. The technique of using the switches to route the data is called a switching
technique (also known as connection strategy). There are three types of
switching techniques, namely, circuit switching, message switching and
packet switching.
28. A communication protocol (also known as a network protocol) is a set of
rules that coordinates the exchange of information. The two most popular
types of communication protocols are the ISO protocol and TCP/IP
protocol.
29. The International Standards Organization (ISO) provided an Open
Systems Interconnection (OSI) reference model for communication
between two end users in a network. An OSI model consists of seven
separate but related layers, namely, Physical, Data Link, Network,
Transport, Session, Presentation and Application.
30. The Transmission Control Protocol/Internet Protocol (TCP/IP) is the most
widely adopted protocol over the Internet. It has fewer layers than that of
the ISO protocol. The various layers in the TCP/IP protocol are Link,
Network, Transport and Application.
31. The distributed file system (DFS) provides a way by which users can share files that are stored on different sites of a distributed system. In addition, it allows easy access to files on a distributed system, as the users are unaware of the fact that the files are distributed.
EXERCISES
Fill in the Blanks
1. A _____________ consists of a set of loosely coupled processors that do
not share memory or system clock, and are connected by a
communication medium.
2. A _____________ is the earliest form of operating system used for
distributed systems.
3. There are three ways of accessing the data in a distributed system,
namely, data migration, computation migration and _____________.
4. A _____________ is a device that selects an appropriate path or circuit to
send the data from the source to the destination.
5. The technique of using the switches to route the data is called
_____________.
Descriptive Questions
1. What are the advantages of a multiprocessor system over a uniprocessor
system?
2. Discuss the various types of interconnection networks. Why is a multistage interconnection network considered a blocking network? Explain with the help of an example.
3. Explain the UMA architecture of a multiprocessor system. What are the
different variants of UMA architecture? Give suitable diagrams also.
4. Discuss the NUMA architecture of a multiprocessor system with the help
of suitable diagram. How is it different from UMA architecture?
5. What are the various types of multiprocessor operating systems? Discuss
the advantages and disadvantages of each of them.
6. What are the advantages of distributed systems?
7. What are the drawbacks of a network operating system? How does a
distributed operating system overcome these drawbacks?
8. Discuss various techniques of data distribution in distributed systems.
9. Discuss the advantages and disadvantages of ring topology.
10. Differentiate between the following.
(a) Data migration and computation migration
(b) LAN and WAN
(c) Star and tree topology
(d) Circuit switching and packet switching
11. Explain the OSI model in detail. How is TCP/IP model different from OSI
model?
12. Explain the distributed file system and the terms associated with it.
13. Explain the different types of transparency in distributed file system.
chapter 15
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Discuss the history of UNIX operating system.
⟡ Understand the role, function and architecture of UNIX kernel.
⟡ Explain how process management is done in UNIX.
⟡ Describe how memory is managed in UNIX.
⟡ Explore the file system and directory structure of UNIX.
⟡ Discuss the I/O system of UNIX.
⟡ Learn shell programming.
15.1 INTRODUCTION
The UNIX (officially trademarked as UNIX®) operating system is open-source software, which means its complete source code is available so that one can customize the operating system. UNIX is available on a wide range of different hardware (that is, it is portable), and has a hierarchical file system, device independence and multi-user operation capabilities. It uses virtual memory and paging to support
programs larger than the physical memory. In addition, it has a
number of utility programs like vi editor, shell (csh) and compilers. As
said by Thompson and Ritchie, the designers of UNIX, “The success
of UNIX lies not so much in new inventions but rather in the full
exploitation of a carefully selected set of fertile ideas, and especially
in showing that they can be keys to the implementation of a small and
yet powerful operating system.” UNIX, one of the dominant operating systems on high-end workstations and servers, is these days also used on systems ranging from cell phones to supercomputers. These qualities are the main reasons behind the fast growth and vast development of the UNIX operating system.
Signals
A signal is the most basic communication mechanism that is used to
alert a process to the occurrence of some event such as abnormal
termination or a floating-point exception. It does not carry any information; rather, it simply indicates that an event has occurred. When a process sends a signal to another process, the execution of the receiving process is suspended to handle the signal, as in the case of an interrupt.
UNIX offers a wide range of signals to indicate different events.
The majority of signals are sent from the kernel to user processes
while some can be used by the user processes to communicate with
each other. However, the kernel does not use signals to communicate
with a process running in kernel mode; instead, a wait-queue mechanism is used to enable kernel-mode processes to notify each other of incoming asynchronous events. This mechanism allows
several processes to wait for a single event by maintaining a queue
for each event. Whenever a process needs to wait for the completion
of a particular event, it sleeps in the wait queue associated with that
event. After the event has happened, all the processes in the wait
queue are awakened.
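A minimal user-level sketch of signalling between processes, using the standard POSIX calls signal(), kill() and pause(), is given below:

#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

static void on_event(int sig)
{
    (void)sig;                 /* a signal carries no data; its arrival
                                  itself says the event has occurred */
    write(STDOUT_FILENO, "event occurred\n", 15);
}

int main(void)
{
    signal(SIGUSR1, on_event); /* install the handler (inherited by child) */
    pid_t pid = fork();
    if (pid == 0) {            /* child: suspend until a signal arrives */
        pause();
        _exit(0);
    }
    sleep(1);                  /* crude delay so the child reaches pause() */
    kill(pid, SIGUSR1);        /* parent alerts the child */
    return 0;
}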
Pipes
A pipe is the standard communication mechanism that enables the transfer of data between processes. It provides a means of one-way communication between related processes. Each pipe has a read
end and a write end. The data written at the write end of the pipe can
be seen (read) through the read end. When the writer process writes to the pipe, a stream of bytes is copied to a shared buffer, whereas at the time of reading, bytes are copied from the shared buffer. Though the reader and writer processes may run concurrently, access to the pipe must be synchronized; UNIX must ensure that only one process (either the writer or the reader) is accessing the pipe at a time. To synchronize the processes, UNIX uses locks and wait queues; each end of the pipe is associated with a wait queue.
Whenever the writer process requests to write to the pipe, UNIX locks the pipe for it only if there is enough space and the pipe is not locked for the reader process. Once the writer process gains access to the pipe, bytes are copied into it. However, if the pipe is full or locked for the reader process, the writer process sleeps in the wait queue at the write end and remains there until it is awakened by the reader. After the data has been written to the pipe, the pipe is
unlocked and any sleeping readers in the wait queue at read end are
awakened. A similar process follows at the time of reading from the
pipe.
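The following minimal C sketch shows one-way communication through a pipe between a parent (the writer) and a child (the reader):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    char buf[32];

    if (pipe(fd) == -1) return 1;  /* fd[0] = read end, fd[1] = write end */
    if (fork() == 0) {             /* child: the reader */
        close(fd[1]);              /* close the unused write end */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        buf[n > 0 ? n : 0] = '\0';
        printf("child read: %s\n", buf);
        _exit(0);
    }
    close(fd[0]);                  /* parent: the writer */
    write(fd[1], "hello", 5);      /* bytes go into the shared buffer */
    close(fd[1]);
    return 0;
}

The locking and wait-queue handling described above happen inside the kernel; the processes simply see blocking read() and write() calls.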
A variation of the pipe that UNIX supports is the named pipe, also called FIFO. As the name implies, in FIFOs, data written first to the pipe is read first. They employ the same data structures as pipes and are handled in the same way. However, unlike pipes, they are persistent and exist as directory entries. Moreover, unrelated processes can use them. Any process that wants to use a named pipe must have appropriate access rights. Before use, a FIFO needs to be opened and, similarly, after use it needs to be closed. However, UNIX must ensure that a writer process opens the FIFO before the reader process and that a reader process does not attempt to read from the FIFO before the writer process has written to it.
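A named pipe is created as a directory entry with the standard mkfifo() call, so unrelated processes can open it by path. A minimal writer sketch follows (the path is an arbitrary example):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    mkfifo("/tmp/demo_fifo", 0666);            /* persists as a directory entry */
    int fd = open("/tmp/demo_fifo", O_WRONLY); /* blocks until a reader opens it */
    write(fd, "hi", 2);
    close(fd);
    unlink("/tmp/demo_fifo");                  /* remove the directory entry */
    return 0;
}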
Shared Memory
Shared memory is another means of communication that allows cooperating processes to pass data to each other. This mechanism enables a memory segment to be shared between two or more processes. As discussed in Chapter 2, one process creates the shared memory segment while others can read/write through it by attaching the shared memory segment to their address space. Like the other mechanisms, UNIX must ensure synchronization among the communicating processes so that no two processes access the shared area simultaneously.
Shared memory is a faster means of communication compared to other methods; however, it does not provide synchronization among processes on its own. For this, it has to be used with some other IPC mechanism that offers synchronization.
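A minimal System V shared-memory sketch follows; as noted above, a real program would pair it with a synchronization mechanism such as a semaphore:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600); /* create a segment */
    char *mem = (char *)shmat(id, NULL, 0);  /* attach it to the address space */
    strcpy(mem, "shared data");              /* read/write like ordinary memory */
    printf("%s\n", mem);
    shmdt(mem);                              /* detach */
    shmctl(id, IPC_RMID, NULL);              /* remove the segment */
    return 0;
}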
where
■ Base is the base priority of the user processes; it is the same for all processes
■ Nice is the priority value assigned by a user to its own process. For this, the user can use the nice(priority) system call. The default is 0, but the value can lie between -20 and +20. An ordinary user is allowed to assign only a positive value to the parameter priority in the nice() system call; however, the system administrator is allowed to assign a negative value (between -20 and -1) to the parameter priority.
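For example, an ordinary process can lower its own priority by raising its nice value:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Add 5 to this process's nice value. A positive increment is
       permitted for any user; a negative one needs the administrator. */
    int nv = nice(5);
    printf("new nice value: %d\n", nv);
    return 0;
}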
Swapping
When there exist more processes than can be accommodated in main memory, some of the processes are removed from memory and kept on the disk. This is what we refer to as swapping, as discussed
in Chapter 7. The module of operating system that handles the
movement of processes between disk and memory is called
swapper. Generally, the swapper needs to move processes from
memory to disk when the kernel runs out of free memory which may
happen on the occurrence of any one of the following events.
• A process invokes brk() system call to increase the size of data
segment.
• A process invokes a fork() system call to create a child process.
• A stack runs out of space allocated to it due to larger data.
One more possible reason for swapping is a process that has been on the disk for too long and now has to be brought into memory when there is no free space available there.
Whenever a process is to be swapped out of memory to make
room for new process, the swapper first looks for those processes in
memory that are presently blocked and waiting for some event to
occur. If the swapper finds one or more such processes, it evicts one
of them based on certain criteria. For example, one possible
approach is to remove the process with highest value of priority plus
residence time. On the other hand, if the swapper does not find any blocked processes in memory, one of the ready processes is swapped out based on the same criteria.
Every few seconds, the swapper also examines the swapped-out processes in order to determine whether any of them is ready for execution and can be swapped into memory. If it finds
such a process, then it determines whether it is going to be an easy
or hard swap. The swap is considered easy if there is enough free
space in memory that the chosen process can just be brought into
memory without having to swap out any process from memory. In
contrast, the swap is considered hard if there is no free space in
memory and to swap in the new process, some existing process has
to be swapped out of memory.
Note: In case the swapper finds many processes on the disk that are
ready for execution, it chooses the one to swap into memory that had
been on the disk for the longest time.
The swapper goes on repeating the above process until either of the following two conditions is met.
• The memory is full of processes which had just been brought into
it and there is no room for any other process.
• There are no processes on the disk which are ready to execute.
To keep track of free space in memory and of swap space on the swap device (such as a disk), a linked list of holes is used. Whenever a process is to be swapped into memory or a swapped-out process is to be
stored on disk, the linked list is searched and a hole is selected
following the first-fit algorithm.
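A first-fit search over such a list is straightforward; the sketch below uses types and names of our own choosing:

#include <stddef.h>

struct hole {
    size_t start;          /* first free unit of the hole */
    size_t size;           /* number of free units in the hole */
    struct hole *next;     /* next hole in the linked list */
};

/* Return the first hole large enough to hold 'need' units, or NULL. */
struct hole *first_fit(struct hole *list, size_t need)
{
    for (struct hole *h = list; h != NULL; h = h->next)
        if (h->size >= need)
            return h;
    return NULL;
}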
Paging
Here, we discuss the paging system of the 4BSD UNIX system. The basic
idea behind paging in 4BSD is the same as we described in Chapter
7. That is, there is no need to load the entire process in memory for
its execution; rather, loading only the user structure and page tables of the process into memory is enough to start its execution. In other words,
a process cannot be scheduled to run until the swapper brings the
user structure and page table of that process in memory. Once the
execution of process begins, the pages of its data, text and stack
segments are brought into memory dynamically as and when needed.
The physical (main) memory in 4BSD system is composed of
three parts [see Figure 15.2 (a)]. The first part stores the kernel, the
second part stores the core map, and the third part is divided into
page frames. The kernel and core map are never paged out of
memory, that is, they always remain in memory. Each page frame in memory either contains a data, text or stack page or a page table page, or is on the list of free page frames maintained by the virtual memory handler. The information regarding the contents of each page frame is stored in the core map. The core map contains one entry per page frame; core map entry 0 describes page frame 0, core map entry 1 describes page frame 1, and so on.
Furthermore, each core map entry has various fields as shown in
Figure 15.2 (b). The first two fields in core map entry (that is, index of
previous and next entry) are useful when the respective page frame
is on the list of free page frames. The next three fields, including disk
block number, disk device number and block hash code, are used
when the page frame contains information. These items specify the location on the disk where the page contained in the corresponding page frame is stored and where it will be placed when paged out. The next three fields, including index into proc table, text/data/stack and offset within segment, indicate the process table entry for that page's process, the segment containing that page and the location of the page within that segment. The last field contains certain flags which are used by the paging algorithm.
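Putting the described fields together, a core map entry might be declared as follows. This is an illustrative layout only; the exact names and field widths in 4BSD differ:

/* Illustrative core map entry, one per page frame. */
struct cmap_entry {
    int  free_prev, free_next;  /* links used while on the free-frame list */
    int  disk_block;            /* disk block holding (or to hold) the page */
    int  disk_device;           /* device on which that block resides */
    int  block_hash;            /* hash code for the disk block */
    int  proc_index;            /* index into the proc table */
    int  segment;               /* text, data or stack */
    long seg_offset;            /* offset of the page within the segment */
    unsigned flags;             /* flags used by the paging algorithm */
};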
Inode
Each file in UNIX is associated with a data structure known as inode
(index node). The inode of a specific file contains its attributes, like type, size, ownership, group, information related to protection of the file, and the times of creation, access and modification. In addition, it also
contains the disk addresses of the blocks allocated to the file. Some
of the fields contained in an inode are described as follows.
• File owner identifier: File ownership is distributed amongst the individual owner, the group owner and other users.
• File type: Files can be of different type like a normal file, a
directory, a special file, a character / block device file or pipe.
• File access permission: A file is protected with respect to three classes of users: the owner, the group owner and other users. The rights to read, write and execute can be set individually for each class. Execute permission for a directory allows that directory to be searched for a filename.
• File access time: It is the time of the last modification and the last
access of the file.
The inodes of all files existing in the file system are stored in an
inode table stored on disk. A file can be opened by just bringing its
inode into the main memory and storing it into the inode table
resident in memory. This implies that the blocks of any file can be
found by just having the inode of that file in main memory. This is a major advantage over the FAT scheme (discussed in Chapter 12), as at any instant only the inodes of the open files need to be in main memory, thereby occupying much less space. Moreover, the amount of space reserved in main memory increases only with the number of open files and not with the size of the disk.
File Allocation
In UNIX, the disk space is allocated to a file in units of blocks and that
too dynamically as and when needed. UNIX adopts indexed
allocation method to keep track of files, where a part of index is
stored in the inode of files. Each inode includes a number of direct
pointers that point to the disk blocks containing data and three
indirect pointers, including single, double and triple. For instance, the
inode in the FreeBSD UNIX system contains 15 pointers, of which the first 12 contain the addresses of the first 12 blocks allocated to the file, while the remaining three are indirect pointers (see Figure 15.3). In case the file size is more than 12 blocks, one or more of the following levels of indirection are used as per the requirement; a worked size calculation follows Figure 15.3.
• Single indirection: The 13th pointer in the inode contains the address of a disk block, known as the single indirect block, which contains pointers to the succeeding blocks of the file.
• Double indirection: In case the file contains more blocks, the
14th pointer in inode points to a disk block (referred to as double
indirect block) which contains pointers to additional single
indirect blocks. Each of the single indirect blocks in turn contains
pointers to the blocks of file.
• Triple indirection: In case the double indirect block is also not
enough, the 15th address in the inode points to a block (referred
to as triple indirect block), which points to additional double
indirect blocks. Each of the double indirect blocks contains
pointers to single indirect blocks each of which in turn contains
pointers to the blocks of file.
Fig. 15.3 Structure of Inode in FreeBSD UNIX
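Taken together, the direct and indirect pointers bound the maximum file size. The short computation below assumes 4 KB blocks and 4-byte block pointers (so each indirect block holds 1024 pointers); these figures are assumptions for illustration:

#include <stdio.h>

int main(void)
{
    unsigned long long block = 4096, ppb = block / 4;  /* pointers per block */
    unsigned long long blocks = 12                     /* direct pointers   */
                              + ppb                    /* single indirect   */
                              + ppb * ppb              /* double indirect   */
                              + ppb * ppb * ppb;       /* triple indirect   */
    printf("max file size = %llu bytes (about %llu TB)\n",
           blocks * block, (blocks * block) >> 40);
    return 0;
}

Under these assumptions the maximum file size comes to roughly 4 TB, almost all of it contributed by the triple indirect block.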
Directory Implementation
In UNIX, directories are implemented using a variation of linear list
method (discussed in Chapter 12). Each directory entry consists of two fields: the file (or subdirectory) name, which can be at most 14 bytes, and the inode number, which is a 2-byte integer. The inode number identifies the inode structure that stores the file attributes and the addresses of the file's data blocks (see Figure 15.5). With this approach, the size of a directory entry is very small, which gives the scheme certain advantages over a plain linear-list directory.
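The classic 16-byte entry can be pictured as a simple C structure (illustrative):

#include <stdint.h>

/* One directory entry: 2-byte inode number + 14-byte name = 16 bytes. */
struct dir_entry {
    uint16_t inode_no;   /* index into the on-disk inode table */
    char     name[14];   /* file or subdirectory name */
};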
15.8.4 Re-direction
We use re-direction operators to modify the default input and output conventions of a UNIX command. For example, the following command
will redirect the output of ls command to the file named
list_of_files.
$ ls > list_of_files
$
For re-directing the input, we use the operator < as shown in the
following command.
$ wc -w < file1
8
Here, the contents of file1 become the input for the command.
The command then counts the number of words in the input contents
and displays it. Notice that the name of the file is not displayed in the output because the command does not know from where the input is provided to it.
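Under the hood, the shell implements re-direction by duplicating file descriptors before executing the command. A simplified sketch of what the shell does for ls > list_of_files:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("list_of_files", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    dup2(fd, STDOUT_FILENO);            /* descriptor 1 now refers to the file */
    close(fd);
    execlp("ls", "ls", (char *)NULL);   /* ls writes to "stdout" as usual */
    return 1;                           /* reached only if exec fails */
}

Because ls simply writes to descriptor 1, it never knows where its output is going, which mirrors the observation about wc above.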
15.8.5 Wildcards
Wildcard characters are basically used in pattern matching. The ‘*’ and ‘?’ characters expand to one or more filenames, which become part of the effective command. First the wildcard expressions are resolved, and then the resultant expanded command is interpreted. ‘*’ matches any series of characters within the filenames of the directory, while ‘?’ matches a single character. For example, to list all the files with the .c extension, we can use the following command.
$ ls *.c
hello.c
palindrome.c
15.8.6 Filters
A filter is a program that inputs data from standard input, performs some operation on the data, and outputs the result to standard output. Thus, it can be used in a pipeline between two programs. The commonly used filters in UNIX are sort and grep.
• Sort: This command is used to display the contents of a file in ascending or descending order. By default, the contents of a file are displayed in ascending order. As an example, the following command displays the contents of file1 in ascending order.
$ sort file1
Variables
Like other programming languages, the shell also provides the facility to use variables. Variables are declared simply by assigning a value to them. For example, the following statements declare two variables and assign values to them.
$ val1=10
$ nam=ABC
A value containing spaces must be enclosed in quotes, for example, par="The good old lady"; if the quotes are missing, only the first word will be considered as the value of the variable. To get the value of the par variable, the operator $ is placed before the variable name, as shown below.
$ echo $par
The good old lady
Control Facilities
The shell also provides decision and loop control structures. Though these facilities can be invoked interactively using UNIX commands, their main usage is in the context of shell programming. The for statement has the following syntax.
for variable in value_list
do
commands
done
For example, the following script prints a message for each value in its list (note that 5 is absent from the list):
for k in 1 2 3 4 6 7
do
echo The value of k is $k
done
Output
The value of k is 1
The value of k is 2
The value of k is 3
The value of k is 4
The value of k is 6
The value of k is 7
LET US SUMMARIZE
1. The UNIX (officially trademarked as UNIX®) operating system is open-source software, which means its complete source code is available so that one can customize the operating system.
2. UNIX development was started at AT&T’s Bell Laboratories by Ken
Thompson in 1969.
3. While designing UNIX, two concepts were kept in mind: first, the file system, which occupies ‘physical space’; and second, the process, which is supposed to have ‘life’.
4. In UNIX system, the process is the only active entity. Each process runs
as a single program and has single thread of control.
5. There are three types of processes that can be executed in UNIX: user
processes, daemon processes, and kernel processes.
6. UNIX allows a process to create multiple new processes by invoking the fork() system call. The forking process is called the parent, while the newly created process is termed the child.
7. The UNIX operating system is empowered with various interprocess
communication facilities, some of which include signals, pipes and shared
memory.
8. The processes in UNIX are executed in one of three modes at a particular
point of time. These modes are user mode, kernel non-interruptible mode,
and kernel interruptible mode.
9. UNIX is a time-sharing operating system; it basically follows the round-robin scheduling algorithm with some variation, namely multilevel adaptive scheduling.
10. UNIX employs a simple and straightforward memory model which not only
increases the program’s portability but also enables UNIX to be
implemented on systems with diverse hardware designs.
11. Earlier versions of UNIX system (prior to 3BSD) were based on swapping;
when all the active processes could not be kept in memory, some of them
were moved to the disk in their entirety. Berkeley added paging to UNIX
with 3BSD in order to support larger programs. Virtually, all the current
implementations of UNIX support demand-paged virtual memory system.
12. File system is the most visible part of any operating system. UNIX
provides a simple but elegant file system which uses only a limited
number of system calls.
13. A UNIX file is a stream of zero or more bytes containing arbitrary data. In the UNIX file system, six types of files are identified: regular (or ordinary) file, directory file, special file, named pipe, link and symbolic link.
14. Each file in UNIX is associated with a data structure known as inode
(index node). The inode of a specific file contains attributes and the disk
addresses of the blocks allocated to the file.
15. The inodes of all files existing in the file system are stored in an inode
table stored on disk. A file can be opened by just bringing its inode into
the main memory and storing it into the inode table resident in memory.
16. UNIX adopts indexed allocation method to keep track of files, where a part
of index is stored in the inode of files. Each inode includes a number of
direct pointers that point to the disk blocks containing data and three
indirect pointers, including single, double and triple.
17. A directory is a file with special format where the information about other
files is stored by the system. UNIX uses a hierarchical directory structure
(often referred to as directory tree).
18. In UNIX, directories are implemented using a variation of linear list
method. Each directory entry consists of two fields: the file (or
subdirectory) name which can be maximum 14 bytes and the inode
number which is an integer of 2 bytes.
19. In UNIX, all the I/O devices can be treated as files and can be accessed
through the same system calls (read() and write()) as used for ordinary
files.
20. In order to enable applications to access I/O devices, UNIX integrates I/O devices into the file system as what are called special files.
21. UNIX splits special files into two classes: block special files and character
special files.
22. A block special file corresponds to a block device (such as hard disk,
floppy disk, CD-ROM, DVD, etc.) and comprises a sequence of fixed-size
numbered blocks. On the other hand, a character special file corresponds
to a character device that reads/writes stream of characters such as
mouse, keyboard, printer, etc.
23. Each I/O device in UNIX is uniquely identified by the combination of major
device number and minor device number. The major device number is
used to identify the driver associated with that device while the minor
device number is used to identify the individual device in case the driver
supports multiple devices.
24. UNIX users invoke commands by interacting with the command-language interpreter, also known as the shell. The shell is built outside the kernel and runs as a user process.
25. There are five common shells namely, the Bourne, Korn, TC, Bourne
Again SHell and C shell with the program names sh, ksh, tcsh, bash and
csh, respectively.
26. The shell can be used in one of two ways: interactively or by writing shell scripts.
EXERCISES
Fill in the Blanks
1. The first GUI for UNIX was introduced by _____________.
2. _____________ are the background processes that are responsible for
controlling the computational environment of the system.
3. _____________ system call suspends the execution of the calling process
until it receives a signal.
4. The _____________ segment of the program in memory holds the
environment variables along with the arguments of command line which
was typed to the shell in order to invoke the program.
5. _____________ command searches for a specific pattern of characters in
the input.
Descriptive Questions
1. Write a short note on the history of UNIX.
2. Explain the architecture, role and function of UNIX kernel with the help of
a block diagram.
3. Write short notes on the following.
(a) Process management system calls
(b) Memory management system calls
(c) I/O management in UNIX
4. Describe the types of processes that can be executed in UNIX.
5. Explain the IPC mechanisms used in UNIX.
6. Give the reasons for dynamic variation of process priorities in UNIX.
7. List some standard subdirectories and files contained in them in UNIX.
8. Give the purpose of text, data and stack segment in address space.
9. How is memory managed in UNIX?
10. How are files and directories implemented in UNIX?
11. Explain the ways in which shell can be used.
12. Explain cat, ls and cd shell commands with the help of an example.
13. What is re-direction? Explain with an example how we can redirect input
and output in UNIX.
14. What do the ‘*’ and ‘?’ characters stand for? Explain with the help of an example.
15. What does the wc command do? Explain it along with its switches.
chapter 16
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ List different components of a Linux system.
⟡ Explain how processes and threads are created and terminated in
Linux.
⟡ Discuss how processes are scheduled.
⟡ Explain memory management strategies for physical and virtual
memory.
⟡ Describe the file system in Linux.
⟡ Explore how I/O devices are handled in Linux.
16.1 INTRODUCTION
Linux is a UNIX-like system whose development was started in 1991 by
Linus Torvalds, a Finnish student at the University of Helsinki. It is a
multiuser, multitasking operating system and its kernel is monolithic in
nature. It comes with the GPL (GNU General Public License), which was
devised by Richard Stallman, the founder of the Free Software
Foundation (FSF). According to this license, users may use, copy,
modify and redistribute the Linux source code and binary code freely,
with the restriction that works derived from the Linux kernel may not
be distributed in binary form only; the source code also has to be
shipped together with the product or made available on demand.
16.2 THE LINUX SYSTEM
The Linux system consists of three main components, namely, kernel,
system libraries and system utilities (see Figure 16.1).
The Linux kernel is the core of the Linux system as it provides an
environment for the execution of processes. It also provides various
system services to allow protected access to hardware resources.
The code contained in kernel is always executed in the processor’s
privileged mode (also called kernel mode) and as a consequence,
has complete access to all the physical resources of the system. The
Linux kernel does not contain any user-mode code.
Paging
Similar to UNIX, Linux also relies on paging; that is, transfers
between memory and disk are always done in units of pages. Page
replacement in Linux is also performed using a variation of the
clock algorithm (discussed in Chapter 8). Each page is associated
with an age that indicates how frequently the page is accessed.
Obviously, frequently accessed pages have a higher age value than
less frequently accessed ones. During each pass of the clock, the age
of a page is either increased or decreased depending on its frequency
of usage. Whenever a page is to be replaced, the page replacement
algorithm chooses the page with the least age value.
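The following C sketch illustrates one pass of such an age-based clock and the victim selection; the structure, constants and function names are illustrative and do not reproduce the actual Linux kernel code.

#include <stdio.h>
#include <stddef.h>

#define MAX_AGE  20   /* assumed upper bound on a page's age        */
#define AGE_STEP  3   /* assumed increment for a referenced page    */

struct page {
    int referenced;   /* stands in for the hardware access bit      */
    int age;          /* grows with use, decays with neglect        */
};

/* One pass of the clock over all frames: referenced pages grow older
   (i.e., more valuable), unreferenced pages decay toward eviction. */
void clock_pass(struct page frames[], size_t nframes)
{
    for (size_t i = 0; i < nframes; i++) {
        if (frames[i].referenced) {
            frames[i].age += AGE_STEP;
            if (frames[i].age > MAX_AGE)
                frames[i].age = MAX_AGE;
            frames[i].referenced = 0;   /* reset for the next pass */
        } else if (frames[i].age > 0) {
            frames[i].age--;
        }
    }
}

/* Victim selection: the page with the least age is replaced. */
size_t choose_victim(const struct page frames[], size_t nframes)
{
    size_t victim = 0;
    for (size_t i = 1; i < nframes; i++)
        if (frames[i].age < frames[victim].age)
            victim = i;
    return victim;
}

int main(void)
{
    struct page frames[4] = { {1, 5}, {0, 2}, {1, 0}, {0, 7} };
    clock_pass(frames, 4);
    printf("victim frame: %zu\n", choose_victim(frames, 4));
    return 0;
}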
LET US SUMMARIZE
1. Linux is a UNIX-like system whose development was started in 1991 by
Linus Torvalds, a Finnish student at the University of Helsinki. It is a
multiuser, multitasking operating system and its kernel is monolithic in
nature.
2. The Linux system consists of three main components, namely, kernel,
system libraries and system utilities.
3. The Linux kernel is the core of the Linux system as it provides an
environment for the execution of processes. It also provides various
system services to allow protected access to hardware resources.
4. The system libraries contain all the operating-system-support code that
need not be executed in the kernel mode. They provide a standard set of
functions using which applications can interact with the kernel.
5. The system utilities are a set of user-mode programs with each program
designed to perform an independent, specialized management task.
6. Traditional UNIX systems support only a single thread of execution per
process; however, modern UNIX systems like Linux allow each process to
have multiple kernel-level threads.
7. The information pertaining to a process or thread is maintained in
task_struct data structure.
8. Linux generally uses the term task to refer to a flow of control in a
program. To create a task, Linux supports the fork() system call, whose
functionality is identical to that in UNIX. It also provides the ability to
create a task using the clone() system call (see the sketch following this
list).
9. To terminate a process, Linux provides the exit_group() system call,
which, when invoked, terminates a process along with all its threads.
10. Linux has two separate classes of processes, real-time and non-real-time,
and real-time processes are given priority over non-real-time
processes.
11. In Linux, the real-time processes can be scheduled in two ways: first-come
first-served (FCFS) and round robin (RR).
12. The non-real-time processes in Linux are scheduled in a time-sharing
manner, but the notion of time slice differs from that of the conventional
time-sharing algorithm. In Linux, the time slice of a process varies
according to its priority; higher priority implies a larger time slice.
13. In Linux, due to several hardware limitations, all physical memory regions
cannot be treated in the same manner. So, Linux divides physical
memory into three different zones: ZONE_DMA, ZONE_NORMAL and
ZONE_HIGHMEM.
14. The virtual memory is responsible for creating virtual pages and managing
the transfer of those pages from disk to memory and vice-versa.
15. Linux provides a variety of file systems including ext2fs, ext3fs, and proc
file system. To support these different file systems, the Linux kernel offers
virtual file system (VFS), which hides the differences among the various
file systems from the processes and applications.
16. Like UNIX, Linux also splits special files into two classes: block special
files and character special files. Each I/O device in Linux is also uniquely
identified by the combination of major device number and minor device
number.
17. In addition to minimizing the number of disk accesses during disk I/O,
Linux also aims at minimizing the latency of repetitive disk head
movements. To achieve this objective, it relies on I/O scheduler that
schedules the disk I/O requests in an order that optimizes the disk
access. The basic scheduler of Linux is Linus Elevator that exploits the
order in which I/O requests are added or removed from each device
queue.
18. Modern versions of Linux use a different scheduler named the deadline
scheduler. It works in the same way as the elevator scheduler; however, it
avoids starvation by associating a deadline with each request.
19. In Linux, each character device driver implementing a terminal device is
represented in the kernel with the help of a tty_struct data structure,
which provides buffering for the data stream from the terminal device and
feeds that data to the line discipline.
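As a concrete illustration of points 8 and 9 above, here is a minimal C sketch contrasting fork() with clone(), assuming glibc's clone() wrapper (_GNU_SOURCE); the flag combination shown is just one possible choice.

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static int child_fn(void *arg)
{
    (void)arg;
    printf("clone child: pid=%d, sharing the parent's memory\n", getpid());
    return 0;
}

int main(void)
{
    /* fork(): the child gets a copy of the parent's address space. */
    pid_t pid = fork();
    if (pid == 0) {
        printf("fork child: pid=%d\n", getpid());
        _exit(0);
    }
    wait(NULL);

    /* clone(): the flags choose exactly what parent and child share;
       sharing the address space is how Linux builds threads out of tasks. */
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) { perror("malloc"); return 1; }
    int tid = clone(child_fn, stack + STACK_SIZE,      /* stack grows down */
                    CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);
    if (tid == -1) { perror("clone"); return 1; }
    waitpid(tid, NULL, 0);
    free(stack);
    return 0;
}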
EXERCISES
Fill in the Blanks
1. _____________ are a set of user-mode programs with each program
designed to perform an independent, specialized management task.
2. For each zone, the kernel maintains a separate _____________ to
manage memory individually.
3. Each I/O device in Linux is uniquely identified by a combination of
_____________ and _____________.
4. The _____________ contain all the operating-system-support code that
need not be executed in the kernel mode.
5. In Linux, the information pertaining to a process or thread is maintained in
_____________ data structure.
Descriptive Questions
1. Explain various components of a Linux system.
2. Describe different object types defined by the virtual file system.
3. What do the active array and expired array in Linux contain?
4. Differentiate between logical and physical view of virtual memory.
5. When does the kernel create a new virtual address space?
6. Suppose we have a block of 512 KB available in physical memory. How
would the buddy system of Linux serve the following memory requests
coming in the shown order?
(a) 120 KB
(b) 60 KB
(c) 30 KB
(d) 80 KB Illustrate the memory allocations diagrammatically.
7. “Linux offers soft real-time scheduling rather than hard”. Explain.
8. What does the file object in Linux VFS describe?
9. Write short note on the following.
(a) Linux ext2 file system
(b) Linus Elevator
(c) Deadline scheduler
(d) Journaling
10. What are the objectives of the part of I/O system handling block devices?
11. Explain how character I/O devices are handled in Linux.
12. Explain how processes and threads are created in Linux.
chapter 17
LEARNING OBJECTIVES
After reading this chapter, you will be able to:
⟡ Understand the structure of Windows 2000 operating system.
⟡ Explain various mechanisms used for communication between
processes.
⟡ Discuss how threads are scheduled.
⟡ Explain how memory is managed in Windows 2000.
⟡ Describe the file system in Windows 2000.
⟡ Explore how I/O devices are handled in Windows 2000.
17.1 INTRODUCTION
Microsoft Windows has been the most popular series of operating
systems over the past decade. Windows 95 revolutionized the personal
computer operating system market. Then came Windows 98, Windows
ME, Windows NT, Windows 2000, Windows XP, Windows Vista,
Windows 7 and, the latest, Windows 8. In this chapter, we discuss how
various operating system concepts are implemented in Windows
2000.
Windows 2000 is a 32-bit preemptive multitasking operating
system for Intel Pentium and later microprocessors. Being the
successor of Windows NT 4.0 and having the user interface of Windows
98, Windows 2000 was originally to be named Windows NT 5.0;
however, in 1999, Microsoft renamed it Windows 2000 so that
users of both Windows 98 and NT could see the neutral name as
the next logical step for them. In effect, Windows 2000 is an improved
Windows NT with the user interface of Windows 98. It included various
features that were previously available only in Windows 98, such as
support for the USB bus, plug-and-play devices and power management.
In addition, it introduced some new features, including an X.500-based
directory service, support for smart cards and security using
Kerberos.
Keeping pace with previous versions of Windows NT, Windows
2000 was also released in several versions. Microsoft introduced four
versions of Windows 2000: Windows 2000 Professional, intended for
desktop use, Windows 2000 Server, Windows 2000 Advanced Server
and Windows 2000 Datacenter Server. Despite minor differences
among these versions, the same binary executable file was used for
all of them.
17.2 STRUCTURE
Windows 2000 comprises two major parts: the operating system and
the environmental subsystems. The operating system is organized
as a hierarchy of layers, each layer utilizing the services of the layer
underneath it. The main layers include hardware abstraction layer
(HAL), kernel and executive, each of which runs in protected (kernel)
mode. The environmental subsystems are a collection of user-
mode processes which enable Windows 2000 to execute programs
developed for other operating systems such as Win32, POSIX and
OS/2. Each subsystem provides an operating environment for a
single application. However, the main operating environment of
Windows 2000 is the Win32 subsystem and therefore, Windows
2000 facilitates to use Win32 API (Win32 Application Programming
Interface) calls. In this section, we will discuss only the operating
system’s structure.
17.2.2 Kernel
The role of the kernel is to provide a higher-level abstraction of the
hardware to the executive and environmental subsystems. Other
responsibilities of the kernel include thread scheduling, low-level
processor synchronization, exception handling and recovery after
power failure. The main characteristic of the Windows 2000 kernel is
that it is permanently resident in main memory and its execution is
never preempted.
The kernel of Windows 2000 is object-oriented, that is, it uses a
set of object types to perform its functions. An object type is a
system-defined data type that consists of a set of attributes as well as
a set of methods. Each instance of an object type is referred to as an
object. The kernel supports two classes of objects, namely, control
and dispatcher objects. The attributes of both classes of objects contain
the kernel data, and their methods perform the activities of the kernel.
The control objects are those that control the system, while the
dispatcher objects are those that handle dispatching and
synchronization in the system.
Table 17.1 lists some kernel objects along with their use.
Fig. 17.1 Structure of Windows 2000 System
17.2.3 Executive
The executive offers a variety of services that can be used by the
environmental subsystems. These services are grouped under
several components, some of which include the I/O manager, object
manager, process manager, plug and play (PnP) manager, security
manager, power manager and virtual memory manager. These
components are described as follows.
• I/O manager: The I/O manager includes file system, cache
manager, device drivers and network drivers. Under I/O manager,
the file systems are technically treated as device drivers and in
Windows 2000, two such drivers exist, one for the FAT and
another for NTFS. The I/O manager keeps track of which
installable file systems are loaded. It is also responsible for
controlling the cache manager that deals with caching for the
entire I/O system. Another major responsibility of I/O manager is
to provide generic I/O services to the rest of the system.
Whenever an I/O request arrives, the I/O manager invokes the
appropriate device driver to perform physical I/O, thereby
providing device-independent I/O. In addition, it facilitates one
device driver to call another.
• Object manager: As already described, the object is the basic
component that the Windows 2000 operating system uses for
performing all its functions. It is the responsibility of the object
manager to monitor the usage of all the operating system objects
by keeping track of which processes and threads are accessing
which objects. It provides processes and threads a standard
interface (called handles) to all types of objects (a short handle-based
sketch follows this list of components). Whenever a process or
thread needs to access some object, it calls the open() method of
the object manager, which in turn generates an object handle (an
identifier unique to the process) and returns it to the requesting
process or thread. The object manager is also responsible for
allocating a part of the kernel address space to an object at the
time of its creation and returning it to the free list at the time of
the object's termination.
• Process manager: As the name suggests, Windows 2000
process manager is responsible for process and thread
management including their creation, termination and use.
However, it is not aware of process hierarchies; those details are
known only to the specific environmental subsystem to which the
process belongs. When some application, say Win32 application,
needs to create a new process, it invokes the appropriate system
call. As a result, a message is passed to the corresponding
subsystem (in our case, Win32 subsystem) which then calls the
process manager. The process manager, in turn, calls the object
manager for creating a process object and returns the object
handle (corresponding to the newly created process) generated
by the object manager to the subsystem. Once handle to the new
process has been received, the subsystem again calls the
process manager for creating a thread for the new process. The
same process is repeated and a handle to thread is returned to
the subsystem. The subsystem then passes both the handles to
the requesting application.
• Plug and play (PnP) manager: Whenever a new hardware is
installed in the system or some changes are made to the existing
hardware configuration, there should be some entity that
recognizes these changes and adapts to those changes. Such
entity in operating system is the PnP manager. The PnP manager
automatically recognizes the devices installed in the system and
detects changes (if any) while the system operates. It is also
responsible for locating and loading the appropriate device drivers
to make the devices work. For example, when a USB device is
attached to the system, a message is passed to the PnP
manager, which then finds and loads the appropriate driver.
• Security manager: Windows 2000 employs a security mechanism
that conforms to the U.S. Department of Defense's C2
requirements, specified in the Orange Book. This book lays down
several rules, such as secure login, privileged access control and
per-process address space protection, which an operating system
must follow in order to be classified as a secure system. The
security manager is responsible for ensuring that the system
always functions in conformance with these rules.
• Power manager: The power manager is responsible for
supervising the usage of power in the system and taking the
necessary actions to reduce power consumption and maintain the
integrity of information. For example, whenever the monitor has
been idle for a while, the power manager turns it off to save
energy. Similarly, on laptops, when the battery is about to run dry,
the power manager takes appropriate actions, informing open
applications to save their files and get ready for shutdown.
• Virtual memory manager: The virtual memory manager in
Windows 2000 uses a demand-paged management scheme. It
is responsible for allocating and freeing virtual memory, mapping
virtual addresses to the physical address space, enforcing
protection rules that restrict each process to accessing pages only
in its own address space, and so on. It also facilitates
memory-mapped file I/O with the help of the I/O manager.
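To see handles in action, consider the following minimal Win32 C sketch, in which a process asks the executive to create an event object and then manipulates it purely through the returned handle; the event name used is hypothetical.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Ask the system to create an event object; what comes back is not
       the object itself but a process-local handle to it. */
    HANDLE hEvent = CreateEventA(NULL,   /* default security        */
                                 TRUE,   /* manual-reset            */
                                 FALSE,  /* initially non-signalled */
                                 "DemoEvent");
    if (hEvent == NULL) {
        fprintf(stderr, "CreateEventA failed: %lu\n", GetLastError());
        return 1;
    }

    SetEvent(hEvent);                        /* signal the object        */
    WaitForSingleObject(hEvent, INFINITE);   /* returns immediately here */

    CloseHandle(hEvent);   /* tell the object manager we are done */
    return 0;
}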
Pipes
A pipe is the standard communication mechanism that enables data to
be transferred between processes. In Windows 2000, pipes are
available in two modes: byte mode and message mode. Byte-mode
pipes work in the same way as pipes in Linux (described in the
previous chapter). However, message-mode pipes are a little
different in that they preserve message boundaries. For example, if a
sender sends a 256-byte message in two writes of 128 bytes each,
the receiver will read it as two separate messages rather than a
single message. Like Linux, Windows 2000 also supports named
pipes. Named pipes are similar to regular pipes in the sense that
they are also available in the same two modes (byte and message), but
unlike regular pipes they can be employed for communication in a
networked environment.
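The following server-side C sketch shows how a message-mode named pipe can be created through the Win32 API; the pipe name \\.\pipe\demo_pipe and the buffer sizes are illustrative.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE hPipe = CreateNamedPipeA(
        "\\\\.\\pipe\\demo_pipe",              /* hypothetical name   */
        PIPE_ACCESS_DUPLEX,                    /* read and write      */
        PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
        1,                                     /* one instance        */
        4096, 4096,                            /* out/in buffer sizes */
        0,                                     /* default timeout     */
        NULL);                                 /* default security    */
    if (hPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateNamedPipeA failed: %lu\n", GetLastError());
        return 1;
    }

    if (ConnectNamedPipe(hPipe, NULL)) {       /* wait for a client   */
        char buf[4096];
        DWORD got;
        /* In message mode, two 128-byte client writes arrive here as
           two separate reads, preserving message boundaries. */
        while (ReadFile(hPipe, buf, sizeof(buf), &got, NULL))
            printf("received one %lu-byte message\n", got);
    }
    CloseHandle(hPipe);
    return 0;
}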
Mailslots
A mailslot is a mechanism that enables one-way communication
between processes on the same system or on different systems
connected via a network. A mailslot is a repository of inter-process
messages and it resides in memory. The process that creates and owns
a mailslot is known as the mailslot server, while the processes that
communicate with it by putting messages in its mailslot are known as
mailslot clients. Any process that knows the name of the mailslot can
write messages to it. Each incoming message is appended to the
mailslot and kept stored there until read by the mailslot server. Mailslots
allow a mailslot client to broadcast a message to multiple mailslot
servers located anywhere on the network, provided all the mailslot
servers bear the same mailslot name.
Whenever a server creates a mailslot, it is provided with a mailslot
handle. A mailslot server can read messages from the mailslot only
through this handle. In addition, any process other than the mailslot
server that has obtained the handle to a mailslot can read messages
from it. Note that the same process can act as both mailslot server and
mailslot client; this enables bi-directional communication between
processes using multiple mailslots.
Note: Though mailslots can be employed in a networked
environment, they are unreliable and thus, do not offer guaranteed
delivery of messages.
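Below is a minimal C sketch of a mailslot server using the Win32 API; the slot name is hypothetical. A client would write to the slot by opening the mailslot name with CreateFile() and calling WriteFile().

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* The creating process becomes the mailslot server. */
    HANDLE hSlot = CreateMailslotA("\\\\.\\mailslot\\demo_slot",
                                   0,                 /* no size limit    */
                                   MAILSLOT_WAIT_FOREVER,
                                   NULL);             /* default security */
    if (hSlot == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateMailslotA failed: %lu\n", GetLastError());
        return 1;
    }

    /* Messages from clients queue up in the slot until read here. */
    char buf[512];
    DWORD got;
    if (ReadFile(hSlot, buf, sizeof(buf) - 1, &got, NULL)) {
        buf[got] = '\0';
        printf("mailslot message: %s\n", buf);
    }
    CloseHandle(hSlot);
    return 0;
}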
17.3.2 Scheduling
In Windows 2000, the entities that the operating system schedules are
threads, not processes. Accordingly, each thread has a scheduling state
while processes do not. A thread, at any time, can be in one of six
possible states: ready, standby, running, waiting, transition and
terminated. Some of these states have already been described in
Chapter 2, while the rest are described as follows.
• Standby: A ready thread is said to be in ‘standby’ state if it is the
next one to be executed.
• Transition: A newly created thread is said to be in ‘transition’
state while it is waiting for the resources required for its execution.
Every thread continues to switch among various states during its
lifetime. Figure 17.2 shows the thread state transition diagram.
The Windows 2000 scheduler selects threads regardless of
which process they belong to; it does not even know which processes
own which threads. To determine the order in which ready threads
are to be executed, the scheduler uses 32 priority levels, numbered
from 0 to 31, where a higher number denotes a higher priority. The 32
priority levels are divided into two classes: real-time and variable.
Threads with priorities from 16 to 31 belong to the real-time class, while
those with priorities from 0 to 15 belong to the variable class. Note that
the priorities 16 to 31 are kept reserved by the system and cannot be
assigned to user threads.
Fig. 17.2 Thread State Transition Diagram
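For illustration, the following Win32 C sketch adjusts a user thread's priority; the effective 0 to 31 level is derived by the system from the process's priority class combined with the thread's relative priority, and the particular constants chosen here are just examples.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Choose a variable-class priority for the whole process... */
    if (!SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS))
        fprintf(stderr, "SetPriorityClass failed: %lu\n", GetLastError());

    /* ...then nudge the current thread relative to that class. */
    if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL))
        fprintf(stderr, "SetThreadPriority failed: %lu\n", GetLastError());

    printf("thread priority (relative): %d\n",
           GetThreadPriority(GetCurrentThread()));
    return 0;
}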
17.4.1 Paging
Windows 2000 uses a demand-paged memory management scheme.
Theoretically, page sizes can be any power of two up to 64 KB;
however, on the Pentium, the page size is fixed at 4 KB, while on the
Itanium, it can be 8 KB or 16 KB. The VM manager of Windows 2000
uses a page size of 4 KB.
Note: Windows 2000 does not support segmentation.
The virtual memory of each process is divided into pages of 4 KB
each and, likewise, physical memory is divided into page frames of 4
KB each. Windows 2000 uses two-level paging, in which the page table
itself is also paged. Each process has a page directory (the higher-level
page table) that holds 1024 page directory entries, each of size 4 bytes
(32 bits). Each page directory entry (PDE) points to a page table.
Furthermore, each page table of a process holds 1024 page table
entries, each of size 4 bytes. Each page table entry (PTE) points to a
page frame in physical memory. Figure 17.4 shows the virtual memory
layout of a process.
Fig. 17.4 Virtual Memory Layout of a Process
Note: The maximum size of all page tables of a process can never exceed 4
MB.
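A small C sketch makes the two-level translation concrete: with 4 KB pages, the top 10 bits of a 32-bit virtual address select the PDE, the next 10 bits select the PTE and the low 12 bits give the byte offset within the page. The example address is arbitrary.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t vaddr = 0x7FFE1234;                 /* example address */

    uint32_t pde_index = (vaddr >> 22) & 0x3FF;  /* top 10 bits     */
    uint32_t pte_index = (vaddr >> 12) & 0x3FF;  /* middle 10 bits  */
    uint32_t offset    =  vaddr        & 0xFFF;  /* low 12 bits     */

    printf("PDE %u, PTE %u, offset 0x%03X\n",
           (unsigned)pde_index, (unsigned)pte_index, (unsigned)offset);

    /* Worst case: 1024 page tables x 4 KB each = 4 MB of page tables,
       matching the 4 MB ceiling noted above. */
    printf("max page-table space: %u KB\n", 1024u * 4u);
    return 0;
}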
Attribute: Description
Standard information: Contains information such as flag bits and timestamps.
File name: Contains the file name in Unicode.
Attribute list: Lists the locations of additional MFT records.
Object ID: Represents the file identifier, unique to the volume.
Volume name: Contains the name of the volume; used in the $Volume metadata file.
Volume information: Contains the version of the volume; used in the $Volume metadata file.
Index root: Used to implement directories.
Index allocation: Used to implement very large directories.
Data: Contains stream data.
NTFS associates each file with a unique ID known as a file
reference. It is 64 bits long, where the first 48 bits represent the file
number and the last 16 bits the sequence number. The file number
gives the record number in the MFT containing that file's entry, while
the sequence number counts the number of times that MFT entry has
been reused.
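A short C sketch of packing and unpacking such a file reference follows, assuming (as described above) 48 bits of file number and 16 bits of sequence number; the values used are purely illustrative.

#include <stdio.h>
#include <stdint.h>

/* Pack an MFT record number and a sequence number into a file reference. */
static uint64_t make_file_reference(uint64_t file_number, uint16_t seq)
{
    return (file_number & 0xFFFFFFFFFFFFULL) | ((uint64_t)seq << 48);
}

int main(void)
{
    uint64_t ref = make_file_reference(37, 5);       /* illustrative values */

    uint64_t file_number = ref & 0xFFFFFFFFFFFFULL;  /* 48-bit file number  */
    uint16_t sequence    = (uint16_t)(ref >> 48);    /* 16-bit sequence     */

    printf("MFT record %llu, sequence number %u\n",
           (unsigned long long)file_number, sequence);
    return 0;
}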
Implementation of I/O
The I/O manager provides a general framework in which different I/O
devices can operate. This framework basically consists of two parts:
one is a set of loaded device drivers that contains the device-specific
code required for communicating with the devices, and another is
device-independent code that is required for certain aspects of I/O
such as uniform interfacing for device drivers, buffering, and error
handling.
Device drivers should be written in such a way that they conform
to the Windows Driver Model defined by Microsoft,
which ensures that the device drivers are compatible with the rest of
Windows 2000. Microsoft has also provided a toolkit that helps driver
writers produce conformant drivers. The drivers must meet certain
requirements to conform to the Windows Driver Model. Some of
these requirements are given below.
• The drivers must be able to handle incoming I/O requests that
arrive to them in the form of a standardized packet called an I/O
Request Packet (IRP).
• The drivers must be object based, like the rest of Windows 2000, in
the sense that they must provide a set of procedures that the rest of
the system can call. In addition, they must be able to correctly
deal with other Windows 2000 objects.
• The drivers must completely support plug and play feature, that is,
they must allow devices to be added or removed as and when
required.
• The drivers must be configurable, that is, they must not contain
any built-in assumptions about which I/O ports and interrupt lines
certain devices use. For example, a printer driver must not have
any fixed address of the printer port hard coded into it.
• The drivers must permit power management, wherever required.
Power management is required to reduce the power consumption
of some devices or of the entire system when the system is idle.
Windows 2000 provides several options to conserve power. One
is turning off the monitor and the hard disks automatically when
the system has been idle for a short period. Another option is to
put the system in standby mode when the user is away from the
system for a while. A third option is to put the system in
hibernation mode when the system is idle for a long period, such
as overnight. Both standby and hibernation modes put the entire
system in a low-power state. The system must also wake up when
told to do so.
• The drivers must be capable of being used on a multiprocessor
system because Windows 2000 was basically designed for use
on multiprocessors. This implies that the driver must function
correctly even if the driver code is executed concurrently by two
or more processors.
• The drivers must be portable across Windows 98 and Windows
2000. That is, the drivers must work not only on Windows 2000
but also on Windows 98.
As we have discussed, in Linux the major device number is
used to identify the driver associated with a device. In Windows
2000, a different scheme is followed to identify the driver associated
with each device. Whenever the system is booted, or whenever a
new plug-and-play device is attached to the system, Windows 2000
automatically detects the device and calls the PnP manager. The PnP
manager finds out the manufacturer and model number of the
device and, using this information, looks in a certain directory on
the hard disk to locate the driver. If the driver is not available in that
directory, it prompts the user to insert a floppy disk or CD-ROM that
contains the required driver. Once the driver is located, it is loaded
into memory.
As stated earlier, the drivers must be object based in the sense
that they must provide a set of procedures that the rest of the system
can call to obtain their services. The two basic procedures that a driver
must provide are DriverEntry and AddDevice. Whenever a driver is
loaded into memory, a driver object is created for it. Once the
driver is loaded, the DriverEntry procedure is called to initialize
the driver. During initialization, it may create some tables and data
structures, and fill in some of the fields of the driver
object. These fields basically include pointers to all the other
procedures that the driver must supply. The driver objects are stored in
a special directory, \??.
In addition to the driver object, a device object is also created for the
device controlled by that driver; the device object points to the driver
object. The device object is used to locate the desired driver object in
the directory \??. Once the driver object is located, its procedures can
be called easily.
The AddDevice procedure is called by the PnP manager once for
each device to be added. Once the device has been added, the driver
is called with its first IRP to set up the interrupt vector and
initialize the hardware.
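A heavily simplified C sketch of these two procedures is given below. It is only the skeleton of a WDM-style driver built against the Windows driver headers (<wdm.h>); a real driver also needs IRP dispatch routines and cleanup code, and the name DemoAddDevice is hypothetical.

#include <wdm.h>

DRIVER_ADD_DEVICE DemoAddDevice;

/* Called once at load time: fill in the driver object's fields. */
NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);

    /* Point the driver object at the routines the system may call. */
    DriverObject->DriverExtension->AddDevice = DemoAddDevice;
    return STATUS_SUCCESS;
}

/* Called by the PnP manager once for each device this driver controls. */
NTSTATUS DemoAddDevice(PDRIVER_OBJECT DriverObject, PDEVICE_OBJECT Pdo)
{
    PDEVICE_OBJECT DeviceObject;

    /* Create a device object that points back to this driver object. */
    NTSTATUS status = IoCreateDevice(DriverObject, 0, NULL,
                                     FILE_DEVICE_UNKNOWN, 0, FALSE,
                                     &DeviceObject);
    if (!NT_SUCCESS(status))
        return status;

    /* Join the device stack so IRPs can flow through stacked drivers;
       a real driver would keep the returned lower device object in order
       to pass IRPs down the stack. */
    if (IoAttachDeviceToDeviceStack(DeviceObject, Pdo) == NULL) {
        IoDeleteDevice(DeviceObject);
        return STATUS_NO_SUCH_DEVICE;
    }
    DeviceObject->Flags &= ~DO_DEVICE_INITIALIZING;
    return STATUS_SUCCESS;
}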
Windows 2000 allows a driver either to do all the work by itself
(just like the printer driver) or to divide the work among multiple
drivers. In the former case, the driver is said to be monolithic; in
the latter case, the drivers are said to be stacked, which means
that a request may pass through a sequence of drivers, each doing a
part of the work (see Figure 17.6).
Fig. 17.6 Stacked Drivers
Stacked drivers can be used to separate the complex bus-management
part from the functional work of actually controlling the
device. This helps device writers to write only the device-specific
code without having to know the bus-controlling part. Some device
drivers can also insert filter drivers at the top of the stack, which
perform transformations on the data being transferred to
and from the device. For example, a filter driver could compress
data while writing it to the disk and decompress it while reading.
Note that both the application programs and the true device drivers
are unaware of the presence of filter drivers; the filter driver
performs the data transformations automatically.
LET US SUMMARIZE
1. Microsoft Windows has been the most popular series of operating systems
over the past decade. Windows 95 revolutionized the personal computer
operating system market. Then came Windows 98, Windows ME, Windows
NT, Windows 2000, Windows XP, Windows Vista, Windows 7 and, the
latest, Windows 8.
2. Windows 2000 is a 32-bit preemptive multitasking operating system for
Intel Pentium and later microprocessors.
3. Windows 2000 comprises two major parts: the operating system and the
environmental subsystems.
4. The Windows 2000 operating system is organized as a hierarchy of
layers, each layer utilizing the services of the layer underneath it. The
main layers include hardware abstraction layer (HAL), kernel and
executive, each of which runs in protected (kernel) mode.
5. The environmental subsystems are a collection of user-mode processes
which enable Windows 2000 to execute programs developed for other
operating systems such as Win32, POSIX and OS/2.
6. Each environmental subsystem provides an operating environment for a
single application. However, the main operating environment of Windows
2000 is the Win32 subsystem; therefore, applications on Windows 2000
typically use Win32 API (Win32 Application Programming Interface) calls.
7. The role of HAL is to hide the hardware differences and present the upper
layers with abstract hardware devices. HAL conceals many of the
machine dependencies in it and exports a virtual machine interface which
is used by rest of the operating system and device drivers.
8. The role of kernel is to provide a higher-level abstraction of the hardware
to the executive and environmental subsystems. Other responsibilities of
kernel include thread scheduling, low-level processor synchronization,
exception handling and recovery after power failure.
9. The executive offers a variety of services that can be used by the
environmental subsystems. These services are grouped under several
components, some of which include the I/O manager, object manager,
process manager, plug and play (PnP) manager, security manager, power
manager and virtual memory manager.
10. In Windows 2000, each process comprises one or more threads (the units
of execution scheduled by the kernel) with each thread comprising
multiple fibers (the lightweight threads). Furthermore, the processes to be
handled as a unit can be combined to form a job.
11. Windows 2000 offers a wide variety of mechanisms to let the processes or
threads communicate with each other. Some of the standard mechanisms
include pipes, mailslots, sockets, remote procedure calls (RPC) and
shared memory.
12. In Windows 2000, the entities that the operating system schedules are
threads, not processes. Accordingly, each thread has a scheduling state
while processes do not. A thread, at any time, can be in one of six possible
states: ready, standby, running, waiting, transition and terminated.
13. To determine the order in which ready threads are to be executed, the
scheduler uses 32 priority levels, numbered from 0 to 31, where a higher
number denotes a higher priority.
14. The 32 priority levels are divided into two classes: real-time and variable.
Threads with priorities from 16 to 31 belong to the real-time class, while
those from 0 to 15 belong to the variable class. Note that the priorities 16
to 31 are kept reserved by the system and cannot be assigned to user
threads.
15. Memory management in Windows 2000 primarily involves managing the
virtual memory. The part of operating system that is responsible for
managing the virtual memory is called virtual memory (VM) manager.
16. Unlike the scheduler, the memory manager in Windows 2000 deals
entirely with processes and not with threads. This is because memory is
allocated to a process and not to a thread.
17. Each user process in Windows 2000 is assigned a virtual address space
and, since the VM manager uses 32-bit addresses, the virtual address
space of each process is 4 GB (2^32 bytes) long. Out of the 4 GB, the
lower 2 GB of address space store the code and data of a process while
the upper 2 GB are reserved for the operating system.
18. Windows 2000 uses a demand-paged memory management scheme.
Theoretically, page sizes can be any power of two up to 64 KB; however,
on the Pentium, the page size is fixed at 4 KB, while on the Itanium, it can
be 8 KB or 16 KB. The VM manager of Windows 2000 uses a page size
of 4 KB.
19. Windows 2000 employs two dedicated kernel threads, balance set
manager and working set manager that work in conjunction with each
other to handle page faults.
20. The balance set manager keeps track of free frames on the free list and
the working set manager keeps track of the working set of processes.
21. On a system running Windows 2000, one of three file systems, namely,
FAT16, FAT32 and NTFS (New Technology File System) can be used.
However, NTFS supersedes the FAT file systems and has become the
standard file system of Windows 2000 because of several improvements.
22. The basic entity of NTFS is volume. An NTFS volume can be a logical
partition of the disk or the entire disk. It is organized as a sequence of
clusters where a cluster is a collection of contiguous disk sectors.
23. The cluster is the smallest unit of disk space that can be allocated to a file.
The size of a cluster for a volume varies from 512 bytes to 64 KB
depending on the size of volume. Each cluster starting from the beginning
of the disk to the end is assigned a number known as logical cluster
number (LCN).
24. In order to keep track of information regarding each file on volume, NTFS
maintains a master file table (MFT) that contains at least one record for
each file.
25. In NTFS, the internal information about the data for a volume is stored in
special system files known as metadata files.
26. Windows 2000 has been designed with a general framework to which new
devices can easily be attached as and when required. The I/O manager in
Windows 2000 is closely connected to the plug and play (PnP) manager,
which automatically recognizes new hardware installed on the computer
and makes appropriate changes in the hardware configuration.
27. The I/O manager provides a general framework in which different I/O
devices can operate. This framework basically consists of two parts: one
is a set of loaded device drivers that contains the device-specific code
required for communicating with the devices, and another is device-
independent code that is required for certain aspects of I/O such as
uniform interfacing for device drivers, buffering, and error handling.
28. Windows 2000 allows a driver either to do all the work by itself or to
divide the work among multiple drivers. In the former case, the driver is
said to be monolithic; in the latter case, the drivers are said to be stacked,
which means that a request may pass through a sequence of drivers,
each doing a part of the work.
EXERCISES
Fill in the Blanks
1. The main layers in Windows 2000 operating system include
_____________, kernel and _____________.
2. An _____________ is a system-defined data type that consists of a set of
attributes as well as a set of methods.
3. In Windows 2000, pipes are available in two modes: _____________ and
_____________.
4. Each page directory entry points to a _____________.
5. _____________ may span multiple partitions and even multiple disks.
Descriptive Questions
1. Explain in brief the structure of Windows 2000 operating system.
2. How is a virtual address mapped onto a physical address in Windows 2000?
3. List the various possible states a thread can be in at any time. Explain
thread state transitions with the help of a diagram.
4. Define the following terms.
(a) Mailslot client
(b) Mailslot server
(c) Resident attribute
5. How are threads scheduled in Windows 2000?
6. What functions do filter drivers perform?
7. What do the bits of each PTE indicate?
8. What is the role of plug and play manager?
9. Write short notes on the following.
(a) Mailslots
(b) Socket
(c) MFT
(d) Metadata files
10. List some requirements that a conformant driver must meet.
11. How are page faults handled in Windows 2000?
12. Discuss the basic procedures that a driver in Windows 2000 must provide.
Glossary