
Operating System

Concepts
Tei-Wei Kuo
ktw@csie.ntu.edu.tw
Dept. of Computer Science and
Information Engineering
National Taiwan University

Syllabus
ƒ Instructor: Tei-Wei Kuo @ Room 315, Dept. of CSIE, NTU
ƒ TAs: Jen-Wei Hsieh (d90002@csie.ntu.edu.tw)
Yen-Cheng Lai (yclai@csie.ntu.edu.tw)
Yuan-Hao Chang (d93944006@ntu.edu.tw)
ƒ Class time: Wednesday 14:20-17:20
ƒ Classroom: CSIE Room 103
ƒ Textbook:
Silberschatz, Galvin, and Gagne, "Operating System
Concepts," Sixth Edition, John Wiley & Sons, Inc.,
2002.
ƒ Grading (subject to change):
Midterm (40%), Final (40%), Homework (20%)
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004.

Projects – Nachos 4.0
ƒ Not Another Completely Heuristic Operating
System
ƒ Written by Tom Anderson and his students at
UC Berkeley
http://www.cs.washington.edu/homes/tom/nachos/
ƒ It simulates a MIPS architecture on host
systems (Unix/Linux/Windows/Mac OS X)
ƒ User programs need a cross-compiler (target
MIPS)
ƒ Nachos appears as a single threaded process to
the host operating system.

Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems
Chapter 1. Introduction

Introduction
ƒ What is an Operating System?
ƒ A basis for application programs
ƒ An intermediary between users and
hardware

ƒ Amazing variety
ƒ Mainframe, personal computer (PC),
handheld computer, embedded
computer without any user view
ƒ Convenient vs. efficient
Computer System Components
(layers, top to bottom)
ƒ Users
ƒ Application programs – compilers, word processors,
spreadsheets, browsers, etc.
ƒ Operating system
ƒ Hardware – CPU, I/O devices, memory, etc.

ƒ OS – a government/environment provider

User View
ƒ The user view of the computer varies by
the interface being used!
ƒ Examples:
ƒ Personal computer → ease of use
ƒ Mainframe or minicomputer →
maximization of resource utilization
ƒ Efficiency and fair share
ƒ Workstation → compromise between
individual usability & resource utilization
ƒ Handheld computer → individual usability
ƒ Embedded computer without user view →
runs without user intervention
System View
ƒ A Resource Allocator
ƒ CPU time, Memory Space, File
Storage, I/O Devices, Shared Code,
Data Structures, and more
ƒ A Control Program
ƒ Control execution of user programs
ƒ Prevent errors and misuse
ƒ OS definitions – the U.S. Dept. of Justice
case against Microsoft in 1998
ƒ The stuff shipped by vendors as an OS
ƒ The program that runs at all times (the kernel)

System Goals
ƒ Two Conflicting Goals:
ƒ Convenient for the user!
ƒ Efficient operation of the computer
system!

ƒ We should
ƒ recognize the influences of operating
systems and computer architecture on
each other
ƒ and learn why and how OS's came to be
what they are by tracing their evolution and
predicting what they will become!
UNIX Architecture
(layers, top to bottom, separated by the user interface, the
system-call interface, and the kernel interface to the hardware)
ƒ Users
ƒ Shells, compilers, X, application programs, etc.
ƒ Kernel: CPU scheduling, signal handling, virtual memory,
paging, swapping, file system, disk drivers,
caching/buffering, etc.
ƒ Hardware: terminal controllers, terminals, physical
memory, device controllers, devices such as disks,
memory, etc.

UNIX

Mainframe Systems
ƒ The first used to tackle many
commercial and scientific
applications!

ƒ 0th Generation – 1940s
ƒ A significant amount of set-up time in
the running of a job
ƒ Programmer = operator
ƒ Programmed in binary → assembler
→ (1950) high-level languages

Mainframe – Batch Systems

ƒ Batches sorted and submitted by
the operator
ƒ Simple batch systems
ƒ Off-line processing
~ Replace slow input devices with
faster units → replace card
readers with disks
ƒ Resident monitor
~ Automatically transfer control from
one job to the next
(the resident monitor holds the loader, job sequencing, and
the control-card interpreter; the rest is the user program area)


Mainframe – Batch Systems


ƒ Spooling (Simultaneous Peripheral Operation On-
Line)
~ Replace sequential-access devices with random-access
devices
=> Overlap the I/O of one job with the computation of others
e.g., card → disk, CPU services, disk → printer
ƒ Job Scheduling

(card reader → disk → CPU → disk → printer)
Mainframe –
Multiprogrammed Systems
ƒ Multiprogramming increases CPU
utilization by organizing jobs so
that the CPU always has one to
execute – early 1960s
ƒ Multiprogrammed batch
systems
ƒ Job scheduling and CPU
scheduling
ƒ Goal: efficient use of scarce
resources
(memory holds the monitor and jobs 1-3; job scheduling
picks jobs from the disk, CPU scheduling picks among the
loaded jobs)

Mainframe – Time-Sharing Systems
ƒ Time sharing (or multitasking) is a
logical extension of multiprogramming!
ƒ Started in the 1960s and became
common in the 1970s.
ƒ An interactive (or hands-on)
computer system
ƒ Multics, IBM OS/360
(features: on-line file system, virtual memory, sophisticated
CPU scheduling, job synchronization, protection & security,
and so on)

Desktop Systems
ƒ Personal Computers (PC’s)
ƒ Appeared in the 1970s.
ƒ Goals of operating systems keep
changing
ƒ Less-powerful hardware & isolated
environment → poor features
ƒ Benefited from the development of
mainframe OS's and the dropping of
hardware costs
ƒ Advanced protection features
ƒ User convenience & responsiveness


Parallel Systems
ƒ Tightly coupled: have more than one
processor in close communication sharing
computer bus, clock, and sometimes
memory and peripheral devices
ƒ Loosely coupled: otherwise
ƒ Advantages
ƒ Speedup – Throughput
ƒ Lower cost – Economy of Scale
ƒ More reliable – graceful degradation →
fail-soft (detection, diagnosis, correction)
• A Tandem fault-tolerance solution
Parallel Systems
ƒ Symmetric multiprocessing model: each
processor runs an identical copy of the OS
ƒ Asymmetric multiprocessing model: a master-
slave relationship
~ Dynamically allocate or pre-allocate tasks
~ Commonly seen in extremely large systems
~ Hardware and software make a difference?
ƒ Trend: as microprocessor costs drop,
OS functions are offloaded to slave
processors (back-ends)


Distributed Systems
ƒ Definition: Loosely-Coupled Systems –
processors do not share memory or a clock
ƒ Heterogeneous vs Homogeneous
ƒ Advantages or Reasons
ƒ Resource sharing: computation power,
peripheral devices, specialized hardware
ƒ Computation speedup: distribute the
computation among various sites – load
sharing
ƒ Reliability: redundancy → reliability
ƒ Communication: X-window, email
Distributed Systems
ƒ Distributed systems depend on
networking for their functionality.
ƒ Networks vary by the protocols used.
ƒ TCP/IP, ATM, etc.
ƒ Types – distance
ƒ Local-area network (LAN)
ƒ Wide-area network (WAN)
ƒ Metropolitan-area network (MAN)
ƒ Small-area network – distance of a few feet
ƒ Media – copper wires, fiber strands,
satellite wireless transmission, infrared
communication, etc.

Distributed Systems
ƒ Client-Server Systems
ƒ Compute-server systems
ƒ File-server systems
ƒ Peer-to-Peer Systems
ƒ Network connectivity is an essential
component.
ƒ Network Operating Systems
ƒ Autonomous computers
ƒ A distributed operating system – a
single OS controlling the network.
Clustered Systems
ƒ Definition: Clustered computers which
share storage and are closely linked via
LAN networking.
ƒ Advantages: high availability, performance
improvement, etc.
ƒ Types
ƒ Asymmetric/symmetric clustering
ƒ Parallel clustering – multiple hosts that
access the same data on the shared
storage.
ƒ Global clusters
ƒ Distributed Lock Manager (DLM)

Real-Time Systems
ƒ Definition: A real-time system is a
computer system where a timely
response by the computer to external
stimuli is vital!
ƒ Hard real-time system: The system
has failed if a timing constraint, e.g.
deadline, is not met.
ƒ All delays in the system must be
bounded.
ƒ Many advanced features are absent.
Real-Time Systems
ƒ Soft real-time system: Missing a
timing constraint is serious, but does
not necessarily result in a failure
unless it is excessive
ƒ A critical task has a higher priority.
ƒ Supported in most commercial OS.
ƒ Real-time means on-time instead of
fast


Applications for Real-Time Systems!

Real-Time Systems
ƒ Applications
ƒ Air traffic control
ƒ Space shuttle
ƒ Navigation
ƒ Multimedia systems
ƒ Industrial control systems
ƒ Home appliance controller
ƒ Nuclear power plant
ƒ Virtual reality
ƒ Games
ƒ User interface
ƒ Vision and speech recognition
(approx. 100~200 ms)
ƒ PDA, telephone system
ƒ And more

Handheld Systems
ƒ Handheld Systems
ƒ E.g., Personal Digital Assistant (PDA)
ƒ New Challenges – convenience vs
portability
ƒ Limited Size and Weight
ƒ Small Memory Size
ƒ No Virtual Memory
ƒ Slow Processor
ƒ Battery Power
ƒ Small display screen
ƒ Web-clipping
Feature Migration
ƒ MULTIplexed Information and
Computing Services (MULTICS)
ƒ 1965-1970 at MIT as a utility
ƒ UNIX
ƒ Since 1970 on PDP-11
ƒ Recent OS’s
ƒ MS Windows, IBM OS/2, MacOS X
ƒ OS features being scaled down to fit
PC’s
ƒ Personal Workstations – large PC’s


Computing Environments
ƒ Traditional Computing
ƒ E.g., typical office environment
ƒ Web-Based Computing
ƒ Web Technology
ƒ Portals, network computers, etc.
ƒ Network connectivity
ƒ New categories of devices
ƒ Load balancers
ƒ Embedded Computing
ƒ Car engines, robots, VCR’s, home automation
ƒ Embedded OS’s often have limited features.

Chapter 2
Computer-System Structure

Computer-System Structure
ƒ Objective: General knowledge of the
structure of a computer system.
(figure: a common bus connects the CPU, the memory
controller with memory, the disk controller with disks, the
printer controller with printers, and the tape-drive controller
with tape drives)

ƒ Device controllers: synchronize and manage access to devices.



Booting
ƒ Bootstrap program:
ƒ Initialize all aspects of the system,
e.g., CPU registers, device
controllers, memory, etc.
ƒ Load and run the OS
ƒ Operating system: run init to initialize
system processes, e.g., various
daemons, login processes, after the
kernel has been bootstrapped.
(/etc/rc* & init or /sbin/rc* & init)

Interrupt
ƒ Hardware interrupt, e.g., service
requests of I/O devices
ƒ Software interrupt (trap), e.g., signals,
invalid memory access, division by
zero, system calls, etc.
(process execution is suspended, the interrupt handler runs,
and control then returns to the process)
ƒ Procedures: generic handler or
interrupt vector (MS-DOS, UNIX)

Interrupt Handling Procedure


(the interrupted process's address and registers are saved –
at a fixed location, at a location indexed by the interrupt
type, or on the system stack – before the handler runs)

ƒ Saving of the address of the interrupted
instruction: fixed locations or stacks
ƒ Interrupt disabling or enabling issues: lost
interrupts?!
prioritized interrupts → masking
Interrupt Handling Procedure
ƒ Interrupt Handling
→ Save interrupt information
→ The OS determines the interrupt type (by polling)
→ Call the corresponding handler
→ Return to the interrupted job by restoring
important information (e.g., saved return addr. →
program counter)
(the interrupt vector, indexed by a unique device number,
points to the interrupt handlers, i.e., the interrupt service
routines)
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004. ---

I/O Structure
ƒ Device controllers are responsible for moving
data between the peripheral devices and their
local buffer storage.
(figure: the CPU, memory, DMA controller, printer
controller, and tape-drive controller – each controller with
its own registers and buffers – share the bus; the disk
attaches to the DMA controller)
I/O Structure
ƒ I/O operation
a. The CPU sets up specific registers within
the device controller.
b. Read: device → controller buffer →
memory
Write: memory → controller buffer →
device
c. The controller notifies the CPU of the
completion of the operation by
triggering an interrupt.


I/O Types
a. Synchronous I/O
ƒ Issues: overlapping of computations and I/O
activities, concurrent I/O activities, etc.
ƒ I/O system call, then wait till the completion:
• wait instruction (idle till interrupted)
• looping (polling), e.g., Loop: jmp Loop
• wait for an interrupt
I/O Types
(figure: with synchronous I/O the requesting process waits
while the device driver, interrupt handler, and hardware data
transfer complete; with asynchronous I/O the process resumes
in user mode immediately while the kernel finishes the data
transfer)

I/O Types
b. Asynchronous I/O
ƒ The system call returns without waiting
till the completion of the I/O.
ƒ Synchronization mechanisms, e.g., wait,
are needed when the process must block
for the completion.
* efficiency
I/O Types
ƒ A Device-Status Table Approach
ƒ One entry per device records its type and
state, e.g., card reader 1 (idle), line printer 3
(busy), disk unit 3 (idle).
ƒ A busy device's entry heads a queue of
pending requests, each with its requesting
process, record address, and length (e.g.,
addr. 38596, len. 1372, file xx, file yy).
• Tracking of many I/O requests
• type-ahead service

DMA
ƒ Goal: Release the CPU from handling excessive
interrupts!
ƒ E.g., a 9600-baud terminal:
2-microsecond service per 1000 microseconds
A high-speed device:
2-microsecond service per 4 microseconds
ƒ Procedure
ƒ Execute the device driver to set up the
registers of the DMA controller.
ƒ DMA moves blocks of data between the
memory and its own buffers.
ƒ Transfer from its buffers to its devices.
ƒ Interrupt the CPU when the job is done.
Storage Structure
ƒ Primary storage (volatile)
ƒ registers – access time: a CPU cycle
ƒ cache (HW-managed) – access time: several cycles
ƒ memory (SW-managed) – access time: many cycles
ƒ Secondary storage (nonvolatile)
ƒ magnetic disks
ƒ Tertiary storage (removable media)
ƒ CD-ROMs/DVDs, jukeboxes
* Differences: size, cost, speed, volatility

Memory
ƒ Processor can have direct access!
ƒ Intermediate storage for data in the
registers of device controllers
ƒ Memory-Mapped I/O (PC & Mac)
(1) Frequently used devices
(2) Devices must be fast, such as the video
controller, or special I/O instructions are
used to move data between memory &
device-controller registers
ƒ Programmed I/O – polling
ƒ or interrupt-driven handling
Magnetic Disks
(a disk consists of platters on a spindle; each surface is
divided into tracks and sectors, the tracks at one arm
position form a cylinder, and the r/w heads ride on an
arm assembly)
ƒ Transfer rate
ƒ Random-access time
ƒ Seek time in x ms
ƒ Rotational latency in y ms
ƒ 60~200 rotations/sec

Magnetic Disks
ƒ Disks
ƒ Fixed-head disks:
ƒ More r/w heads vs. fast track switching
ƒ Moving-head disks (hard disks)
ƒ Primary concerns:
ƒ Cost, size, speed
ƒ Computer → host controller → disk controller
→ disk drives (cache ↔ disks)
ƒ Floppy disk
ƒ slow rotation, low capacity, low density, but
less expensive
ƒ Tapes: backup or data transfer between machines
Storage Hierarchy
(pyramid, top to bottom; speed and cost per bit decrease
downward)
ƒ registers
ƒ cache – high hit rate; instruction & data cache, or a
combined cache
ƒ main memory
--- volatile storage above; nonvolatile storage below ---
ƒ electronic disk (alias: RAM disks) – faster than magnetic
disk; nonvolatile?!
ƒ magnetic disk
ƒ optical disk
ƒ magnetic tape – sequential access

Storage Hierarchy
ƒ Caching
ƒ Information is copied to a faster storage
system on a temporary basis
ƒ Assumption: Data will be used again soon.
ƒ Programmable registers, instr. cache, etc.
ƒ Cache Management
ƒ Cache Size and the Replacement Policy
ƒ Movement of Information Between
Hierarchy
ƒ Hardware Design & Controlling Operating
Systems
Storage Hierarchy
ƒ Coherency and Consistency
ƒ Among several storage levels (vertical)
ƒ Multitasking vs. unitasking
ƒ Among units of the same storage level
(horizontal), e.g., cache coherency
ƒ Multiprocessor or distributed systems
(two CPUs, each with its own cache, sharing one memory)

Hardware Protection
ƒ Goal:
ƒ Prevent errors and misuse!
ƒ E.g., input errors of a program in a
simple batch operating system
ƒ E.g., the modifications of data and code
segments of another process or OS
ƒ Dual-Mode Operations – a mode bit
ƒ User-mode executions, except those
after a trap or an interrupt occurs
ƒ Monitor mode (system mode, privileged
mode, supervisor mode)
ƒ Privileged instructions: machine
instructions that may cause harm
Hardware Protection
ƒ System Calls – trap to OS for executing
privileged instructions.
ƒ Resources to protect
ƒ I/O devices, Memory, CPU
ƒ I/O Protection (I/O devices are scarce
resources!)
ƒ I/O instructions are privileged.
ƒ User programs must issue I/O through
OS
ƒ User programs can never gain control
over the computer in the system mode.


Hardware Protection
ƒ Memory Protection
ƒ Goal: Prevent a user program from
modifying the code or data structures
of either the OS or other users!
ƒ Instructions to modify the memory
space of a process are privileged.
(memory layout: kernel, job1, job2, …; the base register
and the limit register delimit a job's space, and hardware
checks every memory address against them)
Hardware Protection
ƒ CPU Protection
ƒ Goal
ƒ Prevent user programs from sucking
up CPU power!
ƒ Use a timer to implement time-sharing
or to compute the current time.
ƒ Instructions that modify timers are
privileged.
ƒ Computer control is turned over to the
OS every time slice!
ƒ Terms: time-sharing, context switch


Network Structure
ƒ Local-Area Network (LAN)
ƒ Characteristics:
ƒ Geographically distributed in a small
area, e.g., an office with different
computers and peripheral devices.
ƒ More reliable and better speed
ƒ High-quality cables, e.g., twisted-pair
cables for 10BaseT Ethernet or fiber-optic
cables for 100BaseFX Ethernet
ƒ Started in 1970s
ƒ Configurations: multiaccess bus, ring,
star networks (with gateways)
Network Structure
ƒ Wide-Area Network (WAN)
ƒ Emerged in late 1960s (Arpanet in
1968)
ƒ World Wide Web (WWW)
ƒ Utilize TCP/IP over
ARPANET/Internet.
• Definition of “Intranet”: roughly speaking for any network under
one authorization, e.g., a company or a school.
• Often in a Local Area Network (LAN), or connected LAN’s.
• Having one (or several) gateway with the outside world.
• In general, it has a higher bandwidth because of a LAN.


Network Structure – WAN


(figure: several intranets connect through gateways and
routers to wide-area backbones such as HINET and TANET)
Network Structure – WAN
ƒ Router
ƒ With a Routing table
ƒ Use some routing protocol, e.g., to maintain
network topology by broadcasting.
ƒ Connecting several subnets (of the same IP-or-
higher-layer protocols) for forwarding packets to
proper subnets.
ƒ Gateway
ƒ Functionality containing that of routers.
ƒ Connecting several subnets (of different or the
same networks, e.g., Bitnet and Internet) for
forwarding packets to proper subnets.

Network Structure – WAN


ƒ Connections between networks
ƒ T1: 1.544 Mbps; T3: 45 Mbps (28 T1's)
ƒ Telephone-system services over T1
ƒ Modems
ƒ Conversion between analog and
digital signals

Network Layers in Linux
(layers, top to bottom: applications; BSD sockets; INET
sockets; TCP and UDP; the network layer – Internet Protocol
(IP) with ARP; and the drivers – PPP, SLIP, Ethernet.
All layers below the applications run inside the kernel.)

TCP/IP
ƒ IP Address:
ƒ 140.123.101.1
ƒ 256*256*256*256 combinations
ƒ 140.123 → network address
ƒ 101.1 → host address
ƒ Subnet:
ƒ 140.123.101 and 140.123.102
ƒ Mapping of IP addresses and host names
ƒ Static assignments: /etc/hosts
ƒ Dynamic acquisition: DNS (Domain Name Server)
ƒ /etc/resolv.conf
ƒ If /etc/hosts is out-of-date, check with DNS!
ƒ Domain name: cs.ccu.edu.tw as a domain name for
140.123.100, 140.123.101, and 140.123.103
TCP/IP
ƒ Transmission Control Protocol (TCP)
ƒ Reliable point-to-point packet
transmissions.
ƒ Applications which communicate over
TCP/IP with one another must provide IP
addresses and port numbers.
ƒ /etc/services
ƒ Port# 80 for a web server.
ƒ User Datagram Protocol (UDP)
ƒ Unreliable point-to-point services.
ƒ Both are over IP.

TCP/IP
ƒ Mapping of Ethernet physical addresses
and IP addresses
ƒ Each Ethernet card has a built-in Ethernet
physical address, e.g., 08-01-2b-00-50-A6.
ƒ Ethernet cards only recognize frames with
their physical addresses.
ƒ Linux uses ARP (Address Resolution
Protocol) to know and maintain the mapping.
ƒ Broadcast requests over Ethernet for IP
address resolution over ARP.
ƒ Machines with the indicated IP addresses
reply with their Ethernet physical addresses.
TCP/IP
ƒ A TCP packet: TCP header + data
ƒ An IP packet: IP header + data (the TCP packet)
ƒ An Ethernet frame: Ethernet header + data (the IP packet)

• Each IP packet has an indicator of which protocol is used, e.g., TCP or UDP


Chapter 3
Operating-System
Structures

Operating-System Structures
ƒ Goals: Provide a way to understand
an operating system
ƒ Services
ƒ Interface
ƒ System Components

ƒ The type of system desired is the


basis for choices among various
algorithms and strategies!

System Components –
Process Management
ƒ Process Management
ƒ Process: An Active Entity
ƒ Physical and Logical Resources
ƒ Memory, I/O buffers, data, etc.
ƒ Data Structures Representing Current
Activities:
program (code) + program counter, stack,
data section, CPU registers, and more

System Components –
Process Management
ƒ Services
ƒ Process creation and deletion
ƒ Process suspension and resumption
ƒ Process synchronization
ƒ Process communication
ƒ Deadlock handling

System Components – Main-
Memory Management
ƒ Memory: a large array of words or bytes,
where each has its own address
ƒ OS must keep several programs in
memory to improve CPU utilization and
user response time
ƒ Management algorithms depend on the
hardware support
ƒ Services
ƒ Memory usage and availability
ƒ Decision of memory assignment
ƒ Memory allocation and deallocation


System Components – File Management
ƒ Goal:
ƒ A uniform logical view of information
storage
ƒ Each medium controlled by a device
ƒ Magnetic tapes, magnetic disks, optical
disks, etc.
ƒ OS provides a logical storage unit: File
ƒ Formats:
ƒ Free form or rigidly formatted
ƒ General Views:
ƒ A sequence of bits, bytes, lines, records
System Components – File
Management
ƒ Services
ƒ File creation and deletion
ƒ Directory creation and deletion
ƒ Primitives for file and directory
manipulation
ƒ Mapping of files onto secondary
storage
ƒ File Backup

* Privileges for file access control



System Components – I/O System Management
ƒ Goal:
ƒ Hide the peculiarities of specific
hardware devices from users
ƒ Components of an I/O System
ƒ A buffering, caching, and spooling
system
ƒ A general device-driver interface
ƒ Drivers

System Components –
Secondary-Storage Management
ƒ Goal:
ƒ On-line storage medium for
programs & data
ƒ Backup of main memory
ƒ Services for Disk Management
ƒ Free-space management
ƒ Storage allocation, e.g., contiguous
allocation
ƒ Disk scheduling, e.g., FCFS


System Components –
Networking
ƒ Issues
ƒ Resources sharing
ƒ Routing & connection strategies
ƒ Contention and security
ƒ Network access is usually
generalized as a form of file access
ƒ World-Wide-Web over file-transfer
protocol (ftp), network file-system
(NFS), and hypertext transfer protocol
(http)

System Components –
Protection System
ƒ Goal
ƒ Resources are only allowed to be
accessed by authorized processes.
ƒ Protected Resources
ƒ Files, CPU, memory space, etc.
ƒ Services
ƒ Detection & controlling mechanisms
ƒ Specification mechanisms
ƒ Remark: Reliability!

System Components –
Command-Interpreter System
ƒ Command Interpreter
ƒ Interface between the user and the
operating system
ƒ Friendly interfaces
ƒ Command-line-based interfaces or
mouse-based window-and-menu
interfaces
ƒ e.g., UNIX shell and command.com in
MS-DOS
(loop: get the next command, then execute it)
User-friendly?
Operating-System Services
ƒ Program Execution
ƒ Loading, running, terminating, etc
ƒ I/O Operations
ƒ General/special operations for devices:
ƒ Efficiency & protection
ƒ File-System Manipulation
ƒ Read, write, create, delete, etc
ƒ Communications
ƒ Intra-processor or inter-processor
communication – shared memory or
message passing

Operating-System Services
ƒ Error Detection
ƒ Possible errors from CPU, memory,
devices, user programs Æ Ensure
correct & consistent computing
ƒ Resource Allocation
ƒ Utilization & efficiency
ƒ Accounting
ƒ Protection & Security
• user convenience or system efficiency!
Operating-System Services
ƒ System calls
ƒ Interface between processes & the OS
ƒ How to make system calls?
ƒ Assembly-language instructions or
subroutine/function calls in a high-level
language such as C or Perl
ƒ Generation of in-line instructions or a
call to a special run-time routine
ƒ Example: read and copy of a file!
ƒ Library Calls vs. System Calls


Operating-System Services
ƒ How does a system call occur?
(figure: the user program puts the call's type and parameters
in a block at address x, loads the address x into a register,
and issues "system call 13"; the kernel locates the code for
call 13 via a table, and that code uses the parameters at x)
ƒ Parameter Passing
ƒ Registers
ƒ Registers pointing to blocks (Linux)
ƒ Stacks

Operating-System Services
ƒ System Calls
ƒ Process Control
ƒ File Management
ƒ Device Management
ƒ Information Maintenance
ƒ Communications


Operating-System Services
ƒ Process & Job Control
ƒ End (normal exit) or abort (abnormal)
ƒ Error level or no
ƒ Load and execute
ƒ How to return control?
ƒ e.g., the shell loads & executes commands
ƒ Creation and/or termination of
processes
ƒ Multiprogramming?
Operating-System Services
ƒ Process & Job Control (continued)
ƒ Process Control
ƒ Get or set attributes of processes
ƒ Wait for a specified amount of time
or an event
ƒ Signal event
ƒ Memory dumping, profiling, tracing,
memory allocation & de-allocation

Operating-System Services
ƒ Examples: MS-DOS & UNIX
(MS-DOS: the kernel, the command interpreter, and free
memory into which a single process is loaded; UNIX: the
kernel plus several processes – e.g., an interpreter process,
process A, and process B – sharing free memory)
Operating-System Services
ƒ File Management
ƒ Create and delete
ƒ Open and close
ƒ Read, write, and reposition (e.g.,
rewinding)
ƒ lseek
ƒ Get or set attributes of files
ƒ Operations for directories

Operating-System Services
ƒ Device management
ƒ Request or release
ƒ Open and close of special files
ƒ Files are abstract or virtual devices.
ƒ Read, write, and reposition (e.g.,
rewinding)
ƒ Get or set file attributes
ƒ Logically attach or detach devices

Operating-System Services
ƒ Information maintenance
ƒ Get or set date or time
ƒ Get or set system data, such as the amount
of free memory
ƒ Communication
ƒ Message Passing
ƒ Open, close, accept connections
ƒ Host ID or process ID
ƒ Send and receive messages
ƒ Transfer status information
ƒ Shared Memory
ƒ Memory mapping & process synchronization

Operation-System Services
ƒ Shared Memory
ƒ Max Speed & Comm Convenience
ƒ Message Passing
ƒ No Access Conflict & Easy Implementation

[Figure: two communication models — message passing:
Process A and Process B exchange messages (M) through
the kernel; shared memory: Process A and Process B
access a shared-memory region set up by the kernel]
System Programs
ƒ Goal:
ƒ Provide a convenient environment for
program development and execution
ƒ Types
ƒ File Management, e.g., rm.
ƒ Status information, e.g., date.
ƒ File Modifications, e.g., editors.
ƒ Program Loading and Executions, e.g.,
loader.
ƒ Programming Language Supports, e.g.,
compilers.
ƒ Communications, e.g., telnet.

System Programs –
Command Interpreter
ƒ Two approaches:
ƒ Contain codes to execute commands
ƒ Fast but the interpreter tends to be
big!
ƒ Painful in revision!

(e.g., MS-DOS internal commands such as del and cd)

System Programs –
Command Interpreter
ƒ Implement commands as system
programs → search for executable
files that correspond to commands
(UNIX)
ƒ Issues
a. Parameter Passing
ƒ Potential Hazard: virtual memory
b. Being Slow
c. Inconsistent Interpretation of
Parameters

System Structure – MS-DOS


ƒ MS-DOS Layer Structure

Application program

Resident system program

MS-DOS device drivers

ROM BIOS device drivers

System Structure – UNIX
ƒ UNIX Layer Structure
users
— user interface —
Shells, compilers, X, application programs, etc.
— system-call interface —
CPU scheduling, signal handling,
virtual memory, paging, swapping,
file system, disk drivers, caching/buffering, etc.
— kernel interface to the hardware —
terminal controllers, terminals,
physical memory, device controllers,
devices such as disks, memory, etc.

System Structure
ƒ A Layered Approach – A Myth

Layer M: new ops (plus hidden ops)
Layer M-1: existing ops

Advantage: Modularity ~ Debugging & Verification
Difficulty: Appropriate layer definitions, less
efficiency due to overheads!
System Structure
ƒ A Layer Definition Example:

L5 User programs
L4 I/O buffering
L3 Operator-console device driver
L2 Memory management
L1 CPU scheduling
L0 Hardware


System Structure – OS/2


ƒ OS/2 Layer Structure
Application Application Application
Application-program Interface
Subsystem Subsystem Subsystem

System kernel: memory management,
task scheduling, device management

Device driver Device driver Device driver


* Some layers of NT were moved from user space to kernel space in NT 4.0
System Structure –
Microkernels
ƒ The concept of microkernels was
proposed at CMU in the mid 1980s (Mach).
ƒ Moving all nonessential components
from the kernel to the user or system
programs!
ƒ No consensus on services in kernel
ƒ Mostly on process and memory
management and communication
ƒ Benefits:
ƒ Ease of OS service extensions →
portability, reliability, security

System Structure –
Microkernels
ƒ Examples
ƒ Microkernels: Tru64 UNIX (Mach
kernel), MacOS X (Mach kernel),
QNX (msg passing, proc scheduling,
HW interrupts, low-level networking)
ƒ Hybrid structures: Windows NT
[Figure: Win32, OS/2, and POSIX applications,
each served by a corresponding server process
(Win32 server, OS/2 server, POSIX server),
on top of the kernel]
Virtual Machine
ƒ Virtual Machines: provide an interface that is
identical to the underlying bare hardware

processes processes processes processes

interface
kernel kernel kernel
kernel
VM1 VM2 VM3

virtual machine implementation


hardware
hardware

Virtual Machine
ƒ Implementation Issues:
ƒ Emulation of Physical Devices
ƒ E.g., Disk Systems
ƒ An IBM minidisk approach
ƒ User/Monitor Modes
ƒ (Physical) Monitor Mode
ƒ Virtual machine software
ƒ (Physical) User Mode
ƒ Virtual monitor mode & Virtual user
mode

Virtual Machine

[Figure: handling a system call from process P1 on VM1 —
P1 issues a system call in virtual user mode; the trap
enters the (physical) monitor mode, where the virtual
machine software simulates the effect of the instruction,
sets the program counter and register contents, and
restarts VM1 in virtual monitor mode; kernel 1 then
services the system call, finishes the service, and VM1
resumes in virtual user mode]

Virtual Machine
ƒ Disadvantages:
ƒ Slow!
ƒ Execute most instructions directly on the
hardware
ƒ No direct sharing of resources
ƒ Physical devices and
communications

* I/O could be slow (interpreted) or fast (spooling)


Virtual Machine
ƒ Advantages:
ƒ Complete Protection – Complete
Isolation!
ƒ OS Research & Development
ƒ System Development Time
ƒ Extensions to Multiple Personalities, such
as Mach (software emulation)
ƒ Emulations of Machines and OS’s, e.g.,
Windows over Linux


Virtual Machine – Java

ƒ Sun Microsystems in late 1995
ƒ Java Language and API Library
ƒ Java Virtual Machine (JVM)
ƒ Class loader (for bytecode .class files)
ƒ Class verifier
ƒ Java interpreter
ƒ An interpreter, a just-in-time (JIT)
compiler, or hardware
[Figure: java .class files → class loader →
verifier → java interpreter → host system]
Virtual Machine – Java

ƒ JVM
ƒ Garbage collection
ƒ Reclaim unused objects
ƒ Implementation being specific for
different systems
ƒ Programs are architecture neutral
and portable
[Figure: java .class files → class loader →
verifier → java interpreter → host system]


System Design &
Implementation
ƒ Design Goals & Specifications:
ƒ User Goals, e.g., ease of use
ƒ System Goals, e.g., reliable
ƒ Rule 1: Separation of Policy & Mechanism
ƒ Policy: What will be done?
ƒ Mechanism: How to do things?
ƒ Example: timer construct and time slice
ƒ Two extreme cases:
microkernel-based OS (policy and mechanism
fully separated) vs. Macintosh OS (intertwined)
System Design &
Implementation
ƒ OS Implementation in High-Level
Languages
ƒ E.g., UNIX, OS/2, MS NT, etc.
ƒ Advantages:
ƒ Being easy to understand & debug
ƒ Being written fast, more compact,
and portable
ƒ Disadvantages:
ƒ Less efficient but more storage for
code
* Tracing for bottleneck identification, exploring of excellent algorithms, etc.

System Generation
ƒ SYSGEN (System Generation)
ƒ Ask and probe for information concerning the
specific configuration of a hardware system
ƒ CPU, memory, device, OS options, etc.

(a spectrum of approaches)
completely table-driven, no recompilation ↔
linking of modules for the selected OS ↔
recompilation of a modified source code
ƒ Issues
ƒ Size, Generality, Ease of modification

Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems

Chapter 4 Processes

Processes
ƒ Objective:
ƒ Process Concept & Definitions

ƒ Process Classification:
ƒ Operating system processes
executing system code
ƒ User processes executing system
code
ƒ User processes executing user code


Processes
ƒ Example: Special Processes in Unix
ƒ PID 0 – Swapper (i.e., the scheduler)
ƒ Kernel process
ƒ No program on disk corresponds to this
process
ƒ PID 1 – init, responsible for bringing up a Unix
system after the kernel has been
bootstrapped (/etc/rc* & init or /sbin/rc* & init)
ƒ User process with superuser privileges
ƒ PID 2 - pagedaemon responsible for paging
ƒ Kernel process
Processes
ƒ Process
ƒ A Basic Unit of Work from the Viewpoint of
OS
ƒ Types:
ƒ Sequential processes: an activity resulted from
the execution of a program by a processor
ƒ Multi-thread processes
ƒ An Active Entity
ƒ Program Code – A Passive Entity
ƒ Stack and Data Segments
ƒ The Current Activity
ƒ PC, Registers , Contents in the Stack and Data
Segments


Processes
ƒ Process State

[State transition diagram:
new —admitted→ ready —scheduled→ running —exit→ terminated;
running —interrupt→ ready;
running —I/O or event wait→ waiting
—I/O or event completion→ ready]

Processes
ƒ Process Control Block (PCB)
ƒ Process State
ƒ Program Counter
ƒ CPU Registers
ƒ CPU Scheduling Information
ƒ Memory Management Information
ƒ Accounting Information
ƒ I/O Status Information


Processes
ƒ PCB: The repository for any information
that may vary from process to process
[Figure: the PCB[] table, entries 0 … NPROC-1;
each PCB holds a pointer, the process state, pc,
registers, …]

Processes
ƒ Process Control Block (PCB) – A Unix
Example
ƒ proc[i]
ƒ Everything the system must know when the
process is swapped out.
ƒ pid, priority, state, timer counters, etc.
ƒ .u
ƒ Things the system should know when process
is running
ƒ signal disposition, statistics accounting, files[], etc.


Processes
ƒ Example: 4.3BSD
[Figure: the proc[i] entry (p_textp, p_addr, p_p0br)
points to the text structure, the page table, and the
per-process .u area with its kernel stack (guarded by
a Red Zone); the user address space holds argv, argc,
the user stack (sp), heap, data segment, and code
segment (PC); u_proc points back to the proc entry]

Processes
ƒ Example: 4.4BSD
[Figure: the proc[i] entry (p_addr, p_p0br) points to
the page table and the per-process .u area with its
kernel stack; the process references its process group,
file descriptors, region lists, and VM space; the user
address space holds argv, argc, the user stack, heap,
data segment, and code segment; u_proc points back
to the proc entry]

Process Scheduling
ƒ The goal of multiprogramming
ƒ Maximize CPU/resource utilization!
ƒ The goal of time sharing
ƒ Allow each user to interact with his/her program!
[Figure: the ready queue and the device queues (disk
unit 0, tape unit 1), each with head and tail pointers
chaining PCBs (PCB1, PCB2, PCB3)]

61
Process Scheduling – A
Queueing Diagram
[Queueing diagram: ready queue —dispatch→ CPU;
from the CPU a process may issue an I/O request
(I/O queue → I/O), have its time slice expire, fork a
child (and wait for the child to execute and terminate),
or wait for an interrupt — each path eventually
returns the process to the ready queue]

Process Scheduling –
Schedulers
ƒ Long-Term (/Job) Scheduler
[Figure: job pool → long-term scheduler → memory/CPU]
ƒ Goal: Select a good mix of I/O-bound and
CPU-bound processes
ƒ Remarks:
1. Control the degree of multiprogramming
2. Can take more time in selecting processes
because of a longer interval between executions
3. May not exist physically

62
Process Scheduling –
Schedulers
ƒ Short-Term (/CPU) Scheduler
ƒ Goal: Efficiently allocate the CPU to
one of the ready processes
according to some criteria.
ƒ Mid-Term Scheduler
ƒ Swap processes in and out of memory to
control the degree of multiprogramming


Process Scheduling –
Context Switches
ƒ Context Switch ~ Pure Overheads
ƒ Save the state of the old process and load the
state of the newly scheduled process.
ƒ The context of a process is usually reflected in
PCB and others, e.g., .u in Unix.

ƒ Issues:
ƒ The cost depends on hardware support
ƒ e.g. processes with multiple register sets or
computers with advanced memory management.
ƒ Threads, i.e., light-weight process (LWP), are
introduced to break this bottleneck!
Operations on Processes
ƒ Process Creation & Termination
ƒ Restrictions on resource usage
ƒ Passing of Information
ƒ Concurrent execution

root

pagedaemon swapper init

user1 user2 user3



Operations on Processes
ƒ Process Duplication
ƒ A copy of parent address space +
context is made for child, except the
returned value from fork():
ƒ Child returns with a value 0
ƒ Parent returns with process id of child
ƒ No shared data structures between
parent and child – Communicate via
shared files, pipes, etc.
ƒ Use execve() to load a new program
ƒ fork() vs vfork() (Unix)
Operations on Processes
ƒ Example:

if ((pid = fork()) == 0) {
/* child process */
execlp("/bin/ls", "ls", NULL);
} else if (pid < 0) {
fprintf(stderr, "Fork Failed");
exit(-1);
} else {
/* parent process */
wait(NULL);
}

Operations on Processes
ƒ Termination of Child Processes
ƒ Reasons:
ƒ Resource usages, needs, etc.
ƒ Kill, exit, wait, abort, signal, etc.
ƒ Cascading Termination

Cooperating Processes
ƒ Cooperating processes can affect or
be affected by other processes
ƒ Independent Processes
ƒ Reasons:
ƒ Information Sharing, e.g., files
ƒ Computation Speedup, e.g.,
parallelism.
ƒ Modularity, e.g., functionality dividing
ƒ Convenience, e.g., multiple work


Cooperating Processes
ƒ A Consumer-Producer Example:
ƒ Bounded buffer or unbounded buffer
ƒ Supported by inter-process
communication (IPC) or by hand coding
[Figure: a circular bounded buffer buffer[0…n-1]
with in and out pointers; initially in = out = 0]

Cooperating Processes
Producer:
while (1) {
/* produce an item nextp */
while (((in+1) % BUFFER_SIZE) == out)
; /* do nothing */
buffer[ in ] = nextp;
in = (in+1) % BUFFER_SIZE;
}


Cooperating Processes
Consumer:
while (1) {
while (in == out)
; /* do nothing */
nextc = buffer[ out ];
out = (out+1) % BUFFER_SIZE ;
/* consume the item in nextc */
}

Interprocess Communication
ƒ Why Inter-Process Communication
(IPC)?
ƒ Exchanging of Data and Control
Information!

ƒ Why Process Synchronization?


ƒ Protect critical sections!
ƒ Ensure the order of executions!


Interprocess Communication
ƒ IPC
ƒ Shared Memory
ƒ Message Passing
ƒ Logical Implementation of Message
Passing
ƒ Fixed/variable msg size,
symmetric/asymmetric communication,
direct/indirect communication,
automatic/explicit buffering, send by
copy or reference, etc.
Interprocess Communication
ƒ Classification of Communication by
Naming
ƒ Processes must have a way to refer
to each other!
ƒ Types
ƒ Direct Communication
ƒ Indirect Communication


Interprocess Communication –
Direct Communication
ƒ Process must explicitly name the
recipient or sender of a communication
ƒ Send(P, msg), Receive(Q, msg)
ƒ Properties of a Link:
a. Communication links are established
automatically.
b. Two processes per link
c. One link per pair of processes
d. Bidirectional or unidirectional
Interprocess Communication –
Direct Communication
ƒ Issue in Addressing:
ƒ Symmetric or asymmetric addressing
Send(P, msg), Receive(id, msg)

ƒ Difficulty:
ƒ Process naming vs modularity


Interprocess Communication –
Indirect Communication
ƒ Two processes can communicate only
if the processes share a mailbox (or port)
send(A, msg) => mailbox A => receive(A, msg)

ƒ Properties:
1. A link is established between a pair of
processes only if they share a mailbox.
2. n processes per link for n >= 1.
3. n links can exist for a pair of processes for
n >=1.
4. Bidirectional or unidirectional
Interprocess Communication –
Indirect Communication
ƒ Issues:
a. Who is the recipient of a message?

[Figure: P1 sends msgs to a mailbox; P2 and
P3 both wait to receive — who gets the message?]
b. Owners vs Users
ƒ Process → owner as the sole recipient?
ƒ OS → Let the creator be the owner?
Privileges can be passed?
Garbage collection is needed?


Interprocess Communication –
Synchronization
ƒ Blocking or Nonblocking
(Synchronous versus Asynchronous)
ƒ Blocking send
ƒ Nonblocking send
ƒ Blocking receive
ƒ Nonblocking receive

ƒ Rendezvous – blocking send & receive

Interprocess Communication –
Buffering
ƒ The Capacity of a Link = the # of messages
could be held in the link.
ƒ Zero capacity (no buffering)
ƒ Msg transfer must be synchronized – rendezvous!
ƒ Bounded capacity
ƒ Sender can continue execution without waiting till the
link is full
ƒ Unbounded capacity
ƒ Sender is never delayed!
ƒ The last two items are for asynchronous
communication and may need acknowledgement

Interprocess Communication –
Buffering
ƒ Special cases:
a. Msgs may be lost if the receiver
cannot catch up with msg sending
→ synchronization
b. Senders are blocked until the
receivers have received msgs and
replied by reply msgs
→ A Remote Procedure Call (RPC)
framework

Interprocess Communication –
Exception Conditions
ƒ Process termination
a. Sender termination → notify or
terminate the receiver!
b. Receiver termination
a. No capacity → sender is blocked.
b. Buffering → messages are
accumulated.


Interprocess Communication –
Exception Conditions
ƒ Ways to Recover Lost Messages (due to
hardware or network failure):
ƒ OS detects & resends messages.
ƒ Sender detects & resends messages.
ƒ OS detects & notifies the sender to handle it.
ƒ Issues:
a. Detecting methods, such as timeout!
b. Distinguish multiple copies if retransmitting is
possible
ƒ Scrambled Messages:
ƒ Usually OS adds checksums, such as CRC, inside
messages & resend them as necessary!
Example - Mach
ƒ Mach – A message-based OS from
the Carnegie Mellon University
ƒ When a task is created, two special
mailboxes, called ports, are also
created.
ƒ The Kernel mailbox is used by the
kernel to communicate with the
task
ƒ The Notify mailbox is used by the
kernel to send notifications of event
occurrences.


Example - Mach
ƒ Three system calls for message
transfer:
ƒ msg_send:
ƒ Options when mailbox is full:
a. Wait indefinitely
b. Return immediately
c. Wait at most for n ms
d. Temporarily cache a message
ƒ One cached message per sending thread
for a mailbox
* One task can either own or receive from a mailbox.
Example - Mach
ƒ msg_receive
ƒ To receive from a mailbox or a set of
mailboxes. Only one task can own &
have a receiving privilege of it
* options when mailbox is empty:
a. Wait indefinitely
b. Return immediately
c. Wait at most for n ms
ƒ msg_rpc
ƒ Remote Procedure Calls

Example - Mach
ƒ port_allocate
ƒ create a mailbox (owner)
ƒ port_status ~ e.g., # of msgs in a link
ƒ All messages have the same priority and are
served in a FIFO fashion.
ƒ Message Size
ƒ A fixed-length head + a variable-length
data + two mailbox names
ƒ Message copying → replaced by
remapping of the address space
ƒ System calls are carried out by messages.

Example – Windows 2000
ƒ Local Procedure Call (LPC) – Message
Passing on the Same Processor
1. The client opens a handle to a
subsystem’s connection port object.
2. The client sends a connection request.
3. The server creates two private
communication ports, and returns the
handle to one of them to the client.
4. The client and server use the
corresponding port handle to send
messages or callbacks and to listen for
replies.

Example – Windows 2000


ƒ Three Types of Message Passing
Techniques
ƒ Small messages
ƒ Message copying
ƒ Large messages – section object
ƒ To avoid memory copy
ƒ Sending and receiving of the pointer
and size information of the object
ƒ A callback mechanism
ƒ When a response cannot be
made immediately.

Communication in Client-
Server Systems
ƒ Socket
ƒ An endpoint for communication
identified by an IP address
concatenated with a port number
ƒ A client-server architecture
[Figure: client socket 146.86.5.2:1652 on Host X
connected to web-server socket 161.25.19.8:80]

* /etc/services: Port # under 1024 ~ 23-telnet, 21-ftp, 80-web server, etc.



Communication in Client-
Server Systems
ƒ Three types of sockets in Java
ƒ Connection-oriented (TCP) – Socket class
ƒ Connectionless (UDP) – DatagramSocket class
ƒ MulticastSocket class – DatagramSocket subclass
Server:
sock = new ServerSocket(5155);
…
client = sock.accept();
pout = new PrintWriter(client.getOutputStream(), true);
…
pout.println(new java.util.Date().toString());
pout.close();
client.close();

Client:
sock = new Socket("127.0.0.1", 5155);
…
in = sock.getInputStream();
bin = new BufferedReader(new InputStreamReader(in));
…
sock.close();
Communication in Client-
Server Systems
ƒ Remote Procedure Call (RPC)
ƒ A way to abstract the procedure-call
mechanism for use between systems with
network connection.
ƒ Needs:
ƒ Ports to listen from the RPC daemon site
and to return results, identifiers of functions
to call, parameters to pack, etc.
ƒ Stubs at the client site
ƒ One for each RPC
ƒ Locate the proper port and marshall parameters.

Communication in Client-
Server Systems
ƒ Needs (continued)
ƒ Stubs at the server site
ƒ Receive the message
ƒ Invoke the procedure and return the results.
ƒ Issues for RPC
ƒ Data representation
ƒ External Data Representation (XDR)
ƒ Parameter marshalling
ƒ Semantics of a call
ƒ History of all messages processed
ƒ Binding of the client and server port
ƒ Matchmaker – a rendezvous mechanism
Communication in Client-
Server Systems
[Figure: RPC via a matchmaker —
1. The client calls the kernel to send an RPC message
to procedure X; the kernel sends a message to the
matchmaker (port: matchmaker, re: address for X);
the matchmaker receives the message.
2. The matchmaker replies to the client with port P
(port: kernel, re: port P for X); the kernel places
port P in the user's RPC message.
3. The kernel sends the RPC (port: P, <contents>);
the daemon listening on port P receives the message.
4. The daemon processes the request and sends the
output (port: kernel, <output>); the kernel receives
the reply and passes it to the user.]

Communication in Client-
Server Systems
ƒ An Example for RPC
ƒ A Distributed File System (DFS)
ƒ A set of RPC daemons and clients
ƒ DFS port on a server on which a file
operation is to take place:
ƒ Disk operations: read, write,
delete, status, etc –
corresponding to usual system
calls

Communication in Client-
Server Systems
ƒ Remote Method Invocation (RMI)
ƒ Allow a thread to invoke a method on a
remote object.
ƒ boolean val = Server.someMethod(A,B)
ƒ Implementation
ƒ Stub – a proxy for the remote object
ƒ Parcel – a method name and its
marshalled parameters, etc.
ƒ Skeleton – for the unmarshalling of
parameters and invocation of the method
and the sending of a parcel back

Communication in Client-
Server Systems
ƒ Parameter Passing
ƒ Local (or Nonremote) Objects
ƒ Pass-by-copy – an object
serialization
ƒ Remote Objects – Reside on a
different Java virtual machine (JVM)
ƒ Pass-by-reference
ƒ Implementation of the interface –
java.io.Serializable

Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems

Chapter 5 Threads

Threads
ƒ Objectives:
ƒ Concepts and issues associated with
multithreaded computer systems.

ƒ Thread – lightweight process (LWP)
ƒ a basic unit of CPU utilization
ƒ A thread ID, program counter, a
register set, and a stack space
ƒ Process – heavyweight process
ƒ A single thread of control


Threads
ƒ Motivation
ƒ A web browser
ƒ Data retrieval
ƒ Text/image displaying
ƒ A word processor
ƒ Displaying
ƒ Keystroke reading
ƒ Spelling and grammar checking
ƒ A web server
ƒ Clients' services
ƒ Request listening
[Figure: a multithreaded process — one code segment,
data segment, and set of open files shared by all
threads; a private stack and register set per thread]
Threads
ƒ Benefits
ƒ Responsiveness
ƒ Resource Sharing
ƒ Economy
ƒ Creation and context switching
ƒ 30 times slower in process creation
in Solaris 2
ƒ 5 times slower in process context
switching in Solaris 2
ƒ Utilization of Multiprocessor
Architectures

User-Level Threads
ƒ User-level threads
are implemented by
a thread library at
the user level.
ƒ Examples: POSIX Pthreads, Mach
C-threads, Solaris 2 UI-threads
ƒ Advantages
ƒ Context switching among them is extremely fast
ƒ Disadvantages
ƒ Blocking of a thread in executing a system call can block the
entire process.
Kernel-Level Threads
ƒ Kernel-level threads
are provided a set of
system calls similar to
those of processes
ƒ Examples: Windows 2000, Solaris 2,
Tru64 UNIX
ƒ Advantage
ƒ Blocking of a thread will not block its entire task.
ƒ Disadvantage
ƒ Context switching cost is a little bit higher because
the kernel must do the switching.

Multithreading Models
ƒ Many-to-One Model
ƒ Many user-level threads to one
kernel thread
ƒ Advantage:
ƒ Efficiency
[Figure: many user-level threads mapped to one kernel thread k]
ƒ Disadvantage:
ƒ One blocking system call blocks all.
ƒ No parallelism for multiple processors
ƒ Example: Green threads for Solaris 2

Multithreading Models
ƒ One-to-One Model
ƒ One user-level thread to one kernel
thread
ƒ Advantage: One system call blocks
one thread.
ƒ Disadvantage: Overheads in creating
a kernel thread.
[Figure: each user-level thread mapped to its own kernel thread k]
ƒ Example: Windows NT, Windows
2000, OS/2


Multithreading Models
ƒ Many-to-Many Model
ƒ Many user-level threads to many
kernel threads
ƒ Advantage:
ƒ A combination of parallelism and
efficiency
[Figure: many user-level threads multiplexed onto several kernel threads k]
ƒ Example: Solaris 2, IRIX, HP-
UX,Tru64 UNIX

Threading Issues

ƒ Fork and Exec System Calls


ƒ Fork: Duplicate all threads or create
a duplicate with one thread?
ƒ Exec: Replace the entire process,
including all threads and LWPs.
ƒ Fork → exec?


Threading Issues
ƒ Thread Cancellation
ƒ Target thread
ƒ Two scenarios:
ƒ Asynchronous cancellation
ƒ Deferred cancellation
ƒ Cancellation points in Pthread.
ƒ Difficulty
ƒ Resources have been allocated to a
cancelled thread.
ƒ A thread is cancelled while it is updating
data.
Threading Issues
ƒ Signal Handling
ƒ Signal
ƒ Synchronous – delivered to the same
process that performed the operation
causing the signal,
ƒ e.g., illegal memory access or division by
zero
ƒ Asynchronous
ƒ e.g., ^C or timer expiration
ƒ Default or user-defined signal handler
ƒ Signal masking

Threading Issues
ƒ Delivery of a Signal
ƒ To the thread to which the signal applies
ƒ e.g., division-by-zero
ƒ To every thread in the process
ƒ e.g., ^C
ƒ To certain threads in the process
ƒ Assign a specific thread to receive all
signals for the process
ƒ Solaris 2
ƒ Asynchronous Procedure Calls (APCs)
ƒ To a particular thread rather than a process
Threading Issues
ƒ Thread Pools
ƒ Motivations
ƒ Dynamic creation of threads
ƒ Limit on the number of active threads
ƒ Awake and pass a request to a thread in
the pool
ƒ Benefits
ƒ Faster for service delivery and limit on the
# of threads
ƒ Dynamic or static thread pools
ƒ Thread-specific data – Win32 & Pthreads

Pthreads
ƒ Pthreads (IEEE 1003.1c)
ƒ API Specification for Thread Creation
and Synchronization
ƒ UNIX-Based Systems, Such As
Solaris 2.
ƒ User-Level Library
ƒ Header File: <pthread.h>
ƒ pthread_attr_init(), pthread_create(),
pthread_exit(), pthread_join(), etc.

Pthreads
#include <pthread.h>
#include <stdlib.h>

void *runner(void *param);

main(int argc, char *argv[]) {
pthread_t tid;
pthread_attr_t attr;
…
pthread_attr_init(&attr);
pthread_create(&tid, &attr, runner, argv[1]);
pthread_join(tid, NULL);
…}

void *runner(void *param) {
int i, upper = atoi(param), sum = 0;
if (upper > 0)
for (i = 1; i <= upper; i++)
sum += i;
pthread_exit(0);
}

Solaris 2
ƒ Implementation of the Pthread API in
addition to supporting kernel threads,
with a library for the creation and
management of user-level threads
[Figure: user-level threads multiplexed onto
lightweight processes (LWPs), which are backed by
kernel threads scheduled on CPUs]
Solaris 2
ƒ Many-to-Many Model
ƒ Each process has at least one LWP
ƒ Each LWP has a kernel-level thread
ƒ User-level threads must be connected
to LWPs to accomplish work.
ƒ A bound user-level thread
ƒ An unbound thread
ƒ Some kernel threads running on the
kernel’s behalf have no associated
LWPs – system threads

Solaris 2
ƒ Processor Allocation:
ƒ Multiprogramming or Pinned
ƒ Switches of user-level threads
among LWPs do not need kernel
intervention.
ƒ If the kernel thread is blocked, so
does the LWP and its user-level
threads.
ƒ Dynamic adjustment of the number
of LWPs
Solaris 2
ƒ Data Structures
ƒ A User-Level Thread
ƒ A thread ID, a register set (including
PC, SP), stack, and priority – in user
space
ƒ A LWP
ƒ A register set for the running
user-level thread – in kernel space
ƒ A Kernel thread
ƒ A copy of the kernel registers, a
pointer to its LWP, priority, scheduling
information
[Figure: a Solaris 2 process — process ID, memory
map, priority, open files, and its LWPs (LWP1, LWP2)]

Windows 2000
ƒ Win32 API
ƒ One-to-One Model
ƒ Fiber Library for the M:M Model
ƒ A Thread Contains
ƒ A Thread ID
ƒ Context: A Register Set, A User
Stack, A Kernel Stack, and A Private
Storage Space

Windows 2000
ƒ Data Structures
ƒ ETHREAD (executive thread block) – in
kernel space
ƒ A ptr to the process, a ptr to the
KTHREAD, and the address of the
starting routine
ƒ KTHREAD (kernel thread block) – in
kernel space
ƒ Scheduling and synchronization
information, a kernel stack, a ptr to the TEB
ƒ TEB (thread environment block) – in
user space
ƒ A user stack and an array for thread-
specific data


Linux
ƒ Threads introduced in Version 2.2
ƒ clone() versus fork()
ƒ The term task refers to both processes & threads
ƒ Several per-process data structures,
e.g., pointers to the same data
structures for open files, signal
handling, virtual memory, etc.
ƒ Flag setting in clone() invocation.
ƒ Pthread implementations

Java
ƒ Thread Support at the Language Level
ƒ Mapping of Java Threads to Kernel
Threads on the Underlying OS?
ƒ Windows 2000: 1:1 Model
ƒ Thread Creation
ƒ Create a new class derived from the
Thread class
ƒ Run its start method
ƒ Allocate memory and initialize a new
thread in the JVM
ƒ start() calls the run method, making the
thread eligible to be run by the JVM.

Java
class Summation extends Thread
{ private int upper;
  public Summation(int n) {
    upper = n; }
  public void run() {
    int sum = 0;
    …}
…}
public class ThreadTester
{…
  Summation thrd = new
    Summation(Integer.parseInt(args[0]));
  thrd.start();
…}
Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems

Chapter 6 CPU Scheduling

CPU Scheduling

ƒ Objective:
ƒ Basic Scheduling Concepts
ƒ CPU Scheduling Algorithms

ƒ Why Multiprogramming?
ƒ Maximize CPU/Resources Utilization
(Based on Some Criteria)


CPU Scheduling
ƒ Process Execution
ƒ CPU-bound programs tend to have a
few very long CPU bursts.
ƒ IO-bound programs tend to have
many very short CPU bursts.

(Figure: process execution alternates between
CPU bursts and I/O bursts, from New to
Terminate.)
CPU Scheduling
ƒ The distribution of CPU-burst durations
can help in selecting an appropriate
CPU-scheduling algorithm
(Figure: histogram of frequency vs. burst
duration in ms (ticks at 8, 16, 24); short
bursts dominate.)

CPU Scheduling
ƒ CPU Scheduler – The Selection of
Process for Execution
ƒ A short-term scheduler

(Figure: process states – New → Ready →
Running → Terminated, with Running →
Waiting → Ready; the scheduler dispatches
a Ready process to Running.)
CPU Scheduling
ƒ Nonpreemptive Scheduling
ƒ A running process keeps the CPU
until it voluntarily releases it
ƒ E.g., I/O or termination
ƒ Advantage
ƒ Easy to implement (at the cost of service
response to other processes)
ƒ E.g., Windows 3.1


CPU Scheduling
ƒ Preemptive Scheduling
ƒ Beside the instances for non-preemptive
scheduling, CPU scheduling occurs
whenever some process becomes
ready or the running process leaves the
running state!
ƒ Issues involved:
ƒ Protection of Resources, such as I/O
queues or shared data, especially for
multiprocessor or real-time systems.
ƒ Synchronization
ƒ E.g., Interrupts and System calls
CPU Scheduling
ƒ Dispatcher
ƒ Functionality:
ƒ Switching context
ƒ Switching to user mode
ƒ Restarting a user program

ƒ Dispatch Latency:
ƒ The time the dispatcher takes to stop
one process and start another – must
be fast

Scheduling Criteria
ƒ Why?
ƒ Different scheduling algorithms may
favor one class of processes over
another!
ƒ Criteria
ƒ CPU Utilization
ƒ Throughput
ƒ Turnaround Time: completion time –
submission time
ƒ Waiting Time: total time spent waiting
in the ready queue
ƒ Response Time: time until the first
response is produced
Scheduling Criteria
ƒ How to Measure the Performance of
CPU Scheduling Algorithms?

ƒ Optimization of what?
ƒ General Consideration
ƒ Average Measure
ƒ Minimum or Maximum Values
ƒ Variance → Predictable Behavior


Scheduling Algorithms

ƒ First-Come, First-Served Scheduling
(FIFO)
ƒ Shortest-Job-First Scheduling (SJF)
ƒ Priority Scheduling
ƒ Round-Robin Scheduling (RR)
ƒ Multilevel Queue Scheduling
ƒ Multilevel Feedback Queue Scheduling
ƒ Multiple-Processor Scheduling

First-Come, First-Served
Scheduling (FCFS)
ƒ The process which requests the
CPU first is allocated the CPU
ƒ Properties:
ƒ Non-preemptive scheduling
ƒ The CPU might be held for an
extended period
(Figure: requests enter a FIFO ready queue
and are dispatched to the CPU in arrival
order.)

First-Come, First-Served
Scheduling (FCFS)
ƒ Example
Process CPU Burst Time
P1 24
P2 3
P3 3

Gantt chart (order P1, P2, P3):
P1 [0–24] P2 [24–27] P3 [27–30]
Average waiting time = (0+24+27)/3 = 17
Gantt chart (order P2, P3, P1):
P2 [0–3] P3 [3–6] P1 [6–30]
Average waiting time = (6+0+3)/3 = 3
* The average waiting time is highly affected by process CPU
burst times!
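The FCFS arithmetic above is easy to check mechanically: each process waits for the sum of all bursts queued ahead of it. A small sketch (the function name and layout are ours, not the slides'):

```c
#include <stddef.h>

/* FCFS: process i waits for the total of the bursts dispatched before it.
   Returns the average waiting time for the given arrival order. */
double fcfs_avg_wait(const int burst[], size_t n) {
    double total_wait = 0.0;
    int elapsed = 0;
    for (size_t i = 0; i < n; i++) {
        total_wait += elapsed;   /* waiting time of process i */
        elapsed += burst[i];
    }
    return n ? total_wait / n : 0.0;
}
```

With bursts {24, 3, 3} this reproduces the 17 above, and with {3, 3, 24} the 3, showing how strongly the arrival order matters under FCFS.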
First-Come, First-Served
Scheduling (FCFS)
CPU I/O device
ƒ Example: Convoy
Effect ready queue idle

ƒ One CPU-bound
process + many
I/O-bound
processes

ready queue
All other processes wait for it
to get off the CPU!

Shortest-Job-First Scheduling
(SJF)
ƒ Non-Preemptive SJF
ƒ Shortest next CPU burst first
process   CPU burst time
P1        6
P2        8
P3        7
P4        3
Gantt chart: P4 [0–3] P1 [3–9] P3 [9–16] P2 [16–24]
Average waiting time = (3+16+9+0)/4 = 7
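With all processes ready at time 0, non-preemptive SJF is just FCFS applied to the bursts sorted in ascending order, which makes the example checkable in a few lines (a sketch with names of our choosing; the fixed-size buffer is an assumption for brevity):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Non-preemptive SJF, all processes ready at time 0:
   sort the bursts ascending, then accumulate FCFS waiting times. */
double sjf_avg_wait(const int burst[], size_t n) {
    int sorted[64];                    /* sketch: assumes n <= 64 */
    memcpy(sorted, burst, n * sizeof(int));
    qsort(sorted, n, sizeof(int), cmp_int);
    double total_wait = 0.0;
    int elapsed = 0;
    for (size_t i = 0; i < n; i++) {
        total_wait += elapsed;
        elapsed += sorted[i];
    }
    return n ? total_wait / n : 0.0;
}
```

On the bursts {6, 8, 7, 3} this yields the slide's average of 7.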

Shortest-Job-First Scheduling
(SJF)
ƒ Nonpreemptive SJF is optimal when
processes are all ready at time 0
ƒ The minimum average waiting time!
ƒ Prediction of the next CPU burst time?
ƒ Long-Term Scheduler
ƒ A specified amount at its submission
time
ƒ Short-Term Scheduler
ƒ Exponential average (0 ≤ α ≤ 1)
τ_{n+1} = α·t_n + (1−α)·τ_n
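One step of the exponential-average predictor is a one-liner; the function name is ours:

```c
/* Exponential average for the next CPU burst:
   tau_next = alpha * t + (1 - alpha) * tau,
   where t is the length of the most recent burst and tau the
   previous prediction. */
double next_burst_estimate(double alpha, double t, double tau) {
    return alpha * t + (1.0 - alpha) * tau;
}
```

For example, with α = 0.5, a previous estimate τ = 10, and an observed burst t = 6, the new estimate is 8 – halfway between history and the latest measurement.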

Shortest-Job-First Scheduling
(SJF)
ƒ Preemptive SJF
ƒ Shortest-remaining-time-first
Process   CPU Burst Time   Arrival Time
P1        8                0
P2        4                1
P3        9                2
P4        5                3
Gantt chart: P1 [0–1] P2 [1–5] P4 [5–10] P1 [10–17] P3 [17–26]
Average Waiting Time = ((10-1) + (1-1) + (17-2) + (5-3))/4
= 26/4 = 6.5
Shortest-Job-First Scheduling
(SJF)
ƒ Preemptive or Non-preemptive?
ƒ Criteria such as AWT (Average
Waiting Time)
ƒ Example: P1 (burst 10) arrives at 0;
P2 (burst 1) arrives at 1
Non-preemptive: P1 [0–10] P2 [10–11]
AWT = (0+(10-1))/2 = 9/2 = 4.5
Preemptive: P1 [0–1] P2 [1–2] P1 [2–11]
AWT = ((2-1)+0)/2 = 0.5
* Context switching cost ~ modeling & analysis

Priority Scheduling

ƒ CPU is assigned to the process
with the highest priority – A
framework for various scheduling
algorithms:
ƒ FCFS: Equal-Priority with Tie-
Breaking by FCFS
ƒ SJF: Priority = 1 / next CPU burst
length

Priority Scheduling
Process CPU Burst Time Priority
P1 10 3
P2 1 1
P3 2 3
P4 1 4
P5 5 2
Gantt chart: P2 [0–1] P5 [1–6] P1 [6–16] P3 [16–18] P4 [18–19]
Average waiting time = (6+0+16+18+1)/5 = 8.2
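Non-preemptive priority scheduling with everyone ready at time 0 can be simulated by repeatedly picking the highest-priority unfinished process (smaller number = higher priority, ties broken by index, matching the table above); a sketch with our own names and a fixed-size buffer for brevity:

```c
#include <stddef.h>

/* Non-preemptive priority scheduling, all processes ready at time 0.
   Smaller prio value = higher priority. Returns the average waiting time. */
double prio_avg_wait(const int burst[], const int prio[], size_t n) {
    int done[64] = {0};                  /* sketch: assumes n <= 64 */
    double total_wait = 0.0;
    int elapsed = 0;
    for (size_t run = 0; run < n; run++) {
        size_t best = n;
        for (size_t i = 0; i < n; i++)   /* pick highest-priority waiting job */
            if (!done[i] && (best == n || prio[i] < prio[best]))
                best = i;
        total_wait += elapsed;           /* the chosen job waited this long */
        elapsed += burst[best];
        done[best] = 1;
    }
    return n ? total_wait / n : 0.0;
}
```

Feeding in the table's bursts {10, 1, 2, 1, 5} and priorities {3, 1, 3, 4, 2} reproduces the schedule P2, P5, P1, P3, P4 and the average of 8.2.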


Priority Scheduling
ƒ Priority Assignment
ƒ Internally defined – use some
measurable quantity, such as the #
of open files or the ratio of average
I/O burst to average CPU burst
ƒ Externally defined – set by criteria
external to the OS, such as the
criticality levels of jobs.

Priority Scheduling
ƒ Preemptive or Non-Preemptive?
ƒ Preemptive scheduling – CPU
scheduling is invoked whenever a
process arrives at the ready queue,
or the running process relinquishes
the CPU.
ƒ Non-preemptive scheduling – CPU
scheduling is invoked only when the
running process relinquishes the
CPU.


Priority Scheduling
ƒ Major Problem
ƒ Indefinite Blocking (/Starvation)
ƒ Low-priority processes could starve
to death!
ƒ A Solution: Aging
ƒ A technique that increases the
priority of processes waiting in the
system for a long time.

Round-Robin Scheduling (RR)
ƒ RR is similar to FCFS except that
preemption is added to switch between
processes.

(Figure: an interrupt at every time quantum (time
slice) moves the running process back to ready.)
ƒ Goal: Fairness – Time Sharing
(Figure: a FIFO ready queue feeds the CPU; new
processes and processes whose quantum is used
up join the tail of the queue.)


Round-Robin Scheduling (RR)


Process   CPU Burst Time
P1        24
P2        3
P3        3
Time slice = 4
Gantt chart: P1 [0–4] P2 [4–7] P3 [7–10]
P1 [10–14] P1 [14–18] P1 [18–22] P1 [22–26] P1 [26–30]
AWT = ((10-4) + (4-0) + (7-0))/3
= 17/3 = 5.66
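With all arrivals at time 0, the round-robin schedule above can be replayed by cycling over the processes and handing out one quantum at a time; waiting time is then turnaround minus burst. A sketch (our names, fixed-size buffers as an assumption):

```c
#include <stddef.h>

/* Round-robin simulation with all processes arriving at time 0.
   Returns the total waiting time across all processes. */
int rr_total_wait(const int burst[], size_t n, int quantum) {
    int rem[64], finish[64];          /* sketch: assumes n <= 64 */
    size_t alive = n;
    int now = 0;
    for (size_t i = 0; i < n; i++) rem[i] = burst[i];
    while (alive > 0) {
        for (size_t i = 0; i < n; i++) {   /* one pass = one RR round */
            if (rem[i] == 0) continue;
            int slice = rem[i] < quantum ? rem[i] : quantum;
            now += slice;
            rem[i] -= slice;
            if (rem[i] == 0) { finish[i] = now; alive--; }
        }
    }
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += finish[i] - burst[i];     /* waiting = turnaround - burst */
    return total;
}
```

On bursts {24, 3, 3} with quantum 4 this gives a total of 17, i.e., the 17/3 average above. (With arrivals at time 0 the simple cyclic order coincides with the true queue order.)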

Round-Robin Scheduling (RR)
ƒ Service Size and Interval
ƒ Time quantum = q → service interval <= (n-
1)*q if n processes are ready.
ƒ If q = ∞, then RR → FCFS.
ƒ If q → 0, then RR → processor sharing, but the
# of context switches increases!
(Figure: a process with burst 10 –
quantum 12 → 0 context switches;
quantum 6 → 1 context switch;
quantum 1 → 9 context switches.)
ƒ If the context-switch cost is 10% of the time
quantum => 1/11 of the CPU is wasted!

Round-Robin Scheduling (RR)


ƒ Turnaround Time
process (10ms each)   quantum = 10        quantum = 1
P1                    completes at 10     completes at 28
P2                    completes at 20     completes at 29
P3                    completes at 30     completes at 30
Average Turnaround Time = (10+20+30)/3 = 20
vs. ATT = (28+29+30)/3 = 29
=> Rule of thumb: 80% of CPU bursts < time slice
Multilevel Queue Scheduling
ƒ Partition the ready queue into
several separate queues =>
Processes can be classified into
different groups and permanently
assigned to one queue.

(Figure: separate queues for system processes,
interactive processes, and batch processes.)


Multilevel Queue Scheduling


ƒ Intra-queue scheduling
ƒ Independent choice of scheduling
algorithms.
ƒ Inter-queue scheduling
a. Fixed-priority preemptive scheduling
e.g., foreground queues always have absolute
priority over the background queues.
b. Time slice between queues
e.g., 80% of CPU time is given to foreground
processes, and 20% to background processes.
c. More??
Multilevel Feedback Queue
Scheduling
ƒ Different from Multilevel Queue
Scheduling by Allowing Processes to
Migrate Among Queues.
ƒ Configurable Parameters:
a. # of queues
b. The scheduling algorithm for each queue
c. The method to determine when to upgrade a
process to a higher priority queue.
d. The method to determine when to demote a
process to a lower priority queue.
e. The method to determine which queue a newly
ready process will enter.
*Inter-queue scheduling: Fixed-priority preemptive?!
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004.

Multilevel Feedback Queue
Scheduling
ƒ Example
(Figure: three levels – quantum = 8,
quantum = 16, then FCFS.)
* Idea: Separate processes with different CPU-burst
characteristics!
Multiple-Processor Scheduling
ƒ CPU scheduling in a system with
multiple CPUs
ƒ A Homogeneous System
ƒ Processes are identical in terms of their
functionality.
→ Can processes run on any processor?
ƒ A Heterogeneous System
ƒ Programs must be compiled for
instructions on proper processors.


Multiple-Processor Scheduling
ƒ Load Sharing – Load Balancing!!
ƒ A queue for each processor
ƒ Self-Scheduling – Symmetric
Multiprocessing
ƒ A common ready queue for all processors
ƒ Needs synchronization to access common
data structures, e.g., queues
ƒ Master-Slave – Asymmetric Multiprocessing
ƒ One processor accesses the system
structures → no need for data sharing
Real-Time Scheduling
ƒ Definition
ƒ Real-time means on-time, instead of
fast!
ƒ Hard real-time systems:
ƒ Failure to meet the timing constraints
(such as deadline) of processes may
result in a catastrophe!
ƒ Soft real-time systems:
ƒ A result delivered after its timing
constraints are missed may still
have some value to the system.


Real-Time Scheduling
ƒ Dispatch Latency
(Figure: dispatch latency = conflict phase +
dispatch phase.)
1. Preemption of the running process
2. Releasing resources needed by the higher-
priority process
3. Context switching to the higher-priority
process
(Steps 1–2 are the conflict phase; step 3 is the
dispatch.)
Real-Time Scheduling
ƒ Minimization of Dispatch Latency?
ƒ Context switching in many OSs, e.g.,
some UNIX versions, can only be done
after a system call completes or an I/O
block occurs
ƒ Solutions:
1. Insert safe preemption points in long-
duration system calls.
2. Protect kernel data by some
synchronization mechanisms to make
the entire kernel preemptible.

Real-Time Scheduling
ƒ Priority Inversion:
ƒ A higher-priority process must wait for
the execution of a lower-priority
process.
(Figure: low-priority P3 holds resource D;
high-priority P1 requests D and blocks;
medium-priority P2 preempts P3, so P3 –
and therefore P1 – also waits for P2 to
complete!)
Real-Time Scheduling
ƒ Priority Inheritance
ƒ The blocking process inherits the
priority of the process that it blocks.
(Figure: when P1 requests D, P3 inherits
P1's priority, so P2 can no longer preempt
P3; P3 releases D promptly and P1
proceeds.)

Real-Time Scheduling
ƒ Earliest Deadline First Scheduling
(EDF)
ƒ Processes with closer deadlines have
higher priorities:
priority(τ_i) ∝ 1/d_i
ƒ An optimal dynamic-priority-driven
scheduling algorithm for periodic and
aperiodic processes!
Liu & Layland [JACM 73] showed that EDF is optimal in the sense that a
periodic process set is schedulable by EDF if its CPU utilization
Σ_i (C_i / P_i) is no larger than 100%.
Real-Time Scheduling – EDF
process   CPU Burst time   Deadline   Initial Arrival Time
P1        4                20         0
P2        5                15         1
P3        6                16         2
Gantt chart: P1 [0–1] P2 [1–6] P3 [6–12] P1 [12–15]
Average waiting time = (11+0+4)/3 = 5
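The EDF trace above can be replayed one time unit at a time: at each tick, run the ready, unfinished process with the earliest deadline. A discrete-time sketch (the struct layout and names are ours):

```c
/* Hypothetical discrete-time EDF sketch: at each time unit, run the
   ready, unfinished task with the earliest deadline. */
typedef struct { int arrival, burst, deadline, remaining, finish; } Task;

static int edf_pick(const Task t[], int n, int now) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (t[i].arrival <= now && t[i].remaining > 0 &&
            (best < 0 || t[i].deadline < t[best].deadline))
            best = i;
    return best;
}

void edf_run(Task t[], int n) {
    int done = 0, now = 0;
    while (done < n) {
        int i = edf_pick(t, n, now);
        if (i >= 0) {
            t[i].remaining--;          /* run the chosen task for one unit */
            now++;
            if (t[i].remaining == 0) { t[i].finish = now; done++; }
        } else
            now++;                     /* idle until the next arrival */
    }
}
```

Initializing the three tasks of the table (with `remaining = burst`) yields finish times 15, 6, and 12, matching the Gantt chart and the waiting times 11, 0, and 4.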

A General Architecture of RTOS’s


ƒ Objectives in the Design of Many
RTOS’s
ƒ Efficient Scheduling Mechanisms
ƒ Good Resource Management Policies
ƒ Predictable Performance
ƒ Common Functionality of Many RTOS’s
ƒ Task Management
ƒ Memory Management
ƒ Resource Control, including devices
ƒ Process Synchronization
A General Architecture

(Figure: user processes in user space issue system
calls, such as I/O requests, which may cause the
releasing of the CPU by a process; the OS top half
serves system calls and the bottom half serves
hardware interrupts; a timer expiry lets the kernel
expire the running process's time quota and keep
the accounting info for each process.)

A General Architecture
ƒ 2-Step Interrupt Services
ƒ Immediate Interrupt Service
ƒ Interrupt priorities > process priorities
ƒ Time (interrupt/ISR latency): completion of
a higher-priority ISR, ISR context switch,
disabling of certain interrupts, starting of
the right ISR (urgent/low-level work, set
events)
ƒ Scheduled Interrupt Service (after the IST
latency)
ƒ Usually done by preemptible threads
ƒ Remark: Reducing of non-preemptible
code, Priority Tracking/Inheritance
(LynxOS), etc.
A General Architecture
ƒ Scheduler
ƒ A central part in the kernel
ƒ The scheduler is usually driven
periodically by a clock interrupt,
except when voluntary context
switches occur – thread quantum?
ƒ Timer Resolution
ƒ Tick size vs Interrupt Frequency
ƒ 10ms? 1ms? 1us? 1ns?
ƒ Fine-Grained hardware clock


A General Architecture
ƒ Memory Management
ƒ No protection for many embedded
systems
ƒ Memory-locking to avoid paging
ƒ Process Synchronization
ƒ Sources of Priority Inversion
ƒ Nonpreemptible code
ƒ Critical sections
ƒ A limited number of priority levels, etc.

Algorithm Evaluation
ƒ A General Procedure
ƒ Select criteria that may include several
measures, e.g., maximize CPU
utilization while confining the maximum
response time to 1 second
ƒ Evaluate various algorithms
ƒ Evaluation Methods:
ƒ Deterministic modeling
ƒ Queuing models
ƒ Simulation
ƒ Implementation

Deterministic Modeling
ƒ A Typical Type of Analytic Evaluation
ƒ Takes a particular predetermined workload
and defines the performance of each
algorithm for that workload
ƒ Properties
ƒ Simple and fast
ƒ Through excessive executions of a number of
examples, trends might be identified
ƒ But it needs exact numbers for inputs, and its
answers only apply to those cases
ƒ Being too specific and requires too exact
knowledge to be useful!
Deterministic Modeling
process   CPU Burst time
P1        10
P2        29
P3        3
P4        7
P5        12

FCFS
P1 [0–10] P2 [10–39] P3 [39–42] P4 [42–49] P5 [49–61]
Average Waiting Time (AWT) = (0+10+39+42+49)/5 = 28

Nonpreemptive Shortest Job First
P3 [0–3] P4 [3–10] P1 [10–20] P5 [20–32] P2 [32–61]
AWT = (10+32+0+3+20)/5 = 13

Round Robin (quantum = 10)
P1 [0–10] P2 [10–20] P3 [20–23] P4 [23–30] P5 [30–40]
P2 [40–50] P5 [50–52] P2 [52–61]
AWT = (0+(10+20+2)+20+23+(30+10))/5 = 23

Queueing Models
ƒ Motivation:
ƒ Workloads vary, and there is no static set
of processes
ƒ Models (~ Queueing-Network Analysis)
ƒ Workload:
a. Arrival rate: the distribution of times when
processes arrive.
b. The distributions of CPU & I/O bursts
ƒ Service rate

Queueing Models
ƒ Model a computer system as a network
of servers. Each server has a queue of
waiting processes
ƒ Compute average queue length, waiting
time, and so on.
ƒ Properties:
ƒ Generally useful but with limited
application to the classes of algorithms &
distributions
ƒ Assumptions are made to make
problems solvable => inaccurate results

Queueing Models
ƒ Example: Little’s formula
n = λ∗w
w steady state!
λ λ

n = # of processes in the queue


λ = arrival rate
ω = average waiting time in the queue
ƒ If n =14 & λ =7 processes/sec, then w =
2 seconds.
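Rearranged, the formula recovers the average waiting time from the queue length and arrival rate; a trivial helper (our name):

```c
/* Little's formula in steady state: n = lambda * w.
   Given queue length n and arrival rate lambda, solve for w. */
double littles_wait(double n, double lambda) {
    return n / lambda;
}
```

With n = 14 and λ = 7 processes/sec this reproduces the 2 seconds above.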
Simulation
ƒ Motivation:
ƒ Get a more accurate evaluation.
ƒ Procedures:
ƒ Program a model of the computer system
ƒ Drive the simulation with various data sets
ƒ Randomly generated according to some
probability distributions
=> inaccuracy occurs because distributions
capture only the occurrence frequency of
events and miss their order & relationships.
ƒ Trace tapes: monitor the real system &
record the sequence of actual events.

Simulation

ƒ Properties:
ƒ Accurate results can be obtained, but it
could be expensive in terms of
computation time and storage space.
ƒ The coding, design, and debugging of
a simulator can be a big job.

Implementation
ƒ Motivation:
ƒ Get more accurate results than a
simulation!

ƒ Procedure:
ƒ Code scheduling algorithms
ƒ Put them in the OS
ƒ Evaluate the real behaviors


Implementation
ƒ Difficulties:
ƒ Cost in coding algorithms and
modifying the OS
ƒ Reaction of users to a constantly
changing OS
ƒ The environment in which algorithms
are used will change
ƒ For example, users may adjust their
behaviors according to the selected
algorithms
=> Separation of the policy and
mechanism!
Process Scheduling Model

ƒ Process Local Scheduling


ƒ E.g., those for user-level threads
ƒ Thread scheduling is done locally to
each application.
ƒ System Global Scheduling
ƒ E.g., those for kernel-level threads
ƒ The kernel decides which thread to
run.


Process Scheduling Model –


Solaris 2
ƒ Priority-Based Process Scheduling
(classes from high to low priority):
ƒ Real-Time
ƒ System
ƒ Kernel-service processes
ƒ Time-Sharing
ƒ A default class
ƒ Interactive
ƒ Each LWP inherits its class from its
parent process
Process Scheduling Model –
Solaris 2
ƒ Real-Time
ƒ A guaranteed response
ƒ System
ƒ The priorities of system processes are
fixed.
ƒ Time-Sharing
ƒ Multilevel feedback queue scheduling
– priorities inversely proportional to
time slices
ƒ Interactive
ƒ Prefers windowing processes

Process Scheduling Model –


Solaris 2
ƒ The selected thread runs until one of
the following occurs:
ƒ It blocks.
ƒ It uses its time slice (if it is not a
system thread).
ƒ It is preempted by a higher-priority
thread.
ƒ RR is used when several threads
have the same priority.
Process Scheduling Model –
Windows 2000
ƒ Priority-Based Preemptive Scheduling
ƒ Priority Class/Relationship: 0..31
ƒ Dispatcher: A process runs until
ƒ It is preempted by a higher-priority process.
ƒ It terminates
ƒ Its time quantum ends
ƒ It calls a blocking system call
ƒ Idle thread
ƒ A queue per priority level


Process Scheduling Model –


Windows 2000
ƒ Each thread has a base priority that
represents a value in the priority range of
its class.
ƒ A typical class – Normal_Priority_Class
ƒ Time quantum & priority per thread:
ƒ Priority is increased after some waiting
(the boost differs for different I/O devices)
ƒ Priority is decreased after some
computation
ƒ The priority is never lowered below the base
priority.
ƒ Favor foreground processes (more time
quantum)
Process Scheduling Model –
Windows 2000
ƒ A Typical Class – base priorities by class
(column) and relative priority (row):

                Real-  High  Above   Normal  Below   Idle
                time         normal          normal  priority
Time-critical    31     15     15      15      15      15
Highest          26     15     12      10       8       6
Above normal     25     14     11       9       7       5
Normal           24     13     10       8       6       4
Below normal     23     12      9       7       5       3
Lowest           22     11      8       6       4       2
Idle             16      1      1       1       1       1

Real-Time Class: 16–31; Variable Classes: 1–15
(The "Normal" row gives the base priority of each class.)

Process Scheduling Model –


Linux
ƒ Three Classes (POSIX.1b)
ƒ Time-Sharing
ƒ Soft Real-Time: FCFS and RR
ƒ Real-Time Scheduling Algorithms
ƒ FCFS & RR always run the highest-
priority process.
ƒ FCFS runs a process until it exits or
blocks.
ƒ No preemption of processes running in
kernel space for conventional Linux
Process Scheduling Model –
Linux
ƒ A Time-Sharing Algorithm for Fairness
ƒ Credits = (credits / 2) + priority
ƒ Recrediting is done when no runnable
process has any credits left.
ƒ A mixture of a process's history and its
priority
ƒ Favors interactive or I/O-bound
processes
ƒ Background processes can be given
lower priorities to receive fewer credits.
ƒ Cf. nice in UNIX
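The recrediting rule quoted above is a one-liner; halving preserves part of a process's scheduling history while the priority term keeps refilling it (the function name is ours):

```c
/* Linux 2.2-era recrediting rule for the time-sharing class:
   new_credits = old_credits / 2 + priority. */
int recredit(int credits, int priority) {
    return credits / 2 + priority;
}
```

For example, a CPU-bound process that exhausted its credits (0) with priority 20 gets 20 back, while an I/O-bound process that still held 10 unused credits gets 25 – hence the bias toward interactive processes.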


Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems
Chapter 7
Process Synchronization

Process Synchronization
ƒ Why Synchronization?
ƒ To ensure data consistency for
concurrent access to shared data!

ƒ Contents:
ƒ Various mechanisms to ensure the
orderly execution of cooperating
processes

Process Synchronization
ƒ A Consumer-Producer Example
ƒ Producer:
while (1) {
  while (counter == BUFFER_SIZE) ;
  produce an item in nextp;
  ….
  buffer[in] = nextp;
  in = (in+1) % BUFFER_SIZE;
  counter++;
}
ƒ Consumer:
while (1) {
  while (counter == 0) ;
  nextc = buffer[out];
  out = (out+1) % BUFFER_SIZE;
  counter--;
  consume an item in nextc;
}

Process Synchronization
ƒ counter++ vs counter--
counter++:  r1 = counter; r1 = r1 + 1; counter = r1
counter--:  r2 = counter; r2 = r2 - 1; counter = r2
ƒ Initially, let counter = 5.
1. P: r1 = counter
2. P: r1 = r1 + 1
3. C: r2 = counter
4. C: r2 = r2 - 1
5. P: counter = r1
6. C: counter = r2
A Race Condition! (counter ends up 4, though
5 is the correct result.)
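One standard remedy – introduced formally later in this chapter – is to make the load-modify-store sequence atomic with a lock; as an illustrative sketch (the names `race_free_count` and `inc_worker` are ours), a Pthreads mutex around `counter++` guarantees no update is lost:

```c
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *inc_worker(void *arg) {
    long iters = (long)arg;
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);    /* entry section */
        counter++;                    /* critical section */
        pthread_mutex_unlock(&lock);  /* exit section */
    }
    return NULL;
}

/* Run two incrementing threads; with the mutex, all updates survive. */
long race_free_count(long iters) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, inc_worker, (void *)iters);
    pthread_create(&b, NULL, inc_worker, (void *)iters);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Without the lock/unlock pair, interleavings like the six-step trace above would intermittently lose increments.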
Process Synchronization
ƒ A Race Condition:
ƒ A situation where the outcome of the
execution depends on the particular
order of process scheduling.
ƒ The Critical-Section Problem:
ƒ Design a protocol that processes can
use to cooperate.
ƒ Each process has a segment of code,
called a critical section, whose
execution must be mutually exclusive.


Process Synchronization
ƒ A General Structure for the Critical-
Section Problem
do {
  entry section;      /* permission request */
  critical section;
  exit section;       /* exit notification */
  remainder section;
} while (1);
The Critical-Section Problem
ƒ Three Requirements
1. Mutual Exclusion
a. Only one process can be in its critical section at a time.
2. Progress
a. Only processes not in their remainder section can
decide which will enter its critical section.
b. The selection cannot be postponed indefinitely.
3. Bounded Waiting
a. A waiting process only waits for a bounded
number of processes to enter their critical
sections.

The Critical-Section Problem –


A Two-Process Solution
ƒ Notation
ƒ Processes Pi and Pj, where j = 1-i
ƒ Assumption
ƒ Every basic machine-language
instruction is atomic.
ƒ Algorithm 1
ƒ Idea: Remember which process is
allowed to enter its critical section;
that is, process Pi can enter its
critical section if turn == i.

do {
  while (turn != i) ;
  critical section
  turn = j;
  remainder section
} while (1);
The Critical-Section Problem –
A Two-Process Solution
ƒ Algorithm 1 fails the progress requirement:
(Figure: P0 enters with turn=0, exits and sets
turn=1; P1 enters, exits and sets turn=0;
P0 then suspends or quits in its remainder
section, so P1 stays blocked on its entry
section even though no process is in its
critical section.)

The Critical-Section Problem –


A Two-Process Solution
ƒ Algorithm 2
ƒ Idea: Remember the state of each
process.
ƒ flag[i]==true → Pi is ready to enter
its critical section.
ƒ Algorithm 2 fails the progress
requirement when
flag[0]==flag[1]==true;
ƒ the exact timing of the two processes?

Initially, flag[0]=flag[1]=false
do {
  flag[i]=true;
  while (flag[j]) ;
  critical section
  flag[i]=false;
  remainder section
} while (1);
* Switching "flag[i]=true" and "while (flag[j]);" would break mutual exclusion.

The Critical-Section Problem –
A Two-Process Solution
ƒ Algorithm 3
ƒ Idea: Combine the ideas of
Algorithms 1 and 2
ƒ When (flag[i] && turn==i), Pj must
wait.
ƒ Initially, flag[0]=flag[1]=false,
and turn = 0 or 1

do {
  flag[i]=true;
  turn=j;
  while (flag[j] && turn==j) ;
  critical section
  flag[i]=false;
  remainder section
} while (1);

The Critical-Section Problem –


A Two-Process Solution
ƒ Properties of Algorithm 3
ƒ Mutual Exclusion
ƒ The eventual value of turn determines
which process enters the critical section.
ƒ Progress
ƒ A process can only be stuck in the while
loop, and the process which can keep it
waiting must be in its critical sections.
ƒ Bounded Waiting
ƒ Each process waits for at most one
entry by the other process.

The Critical-Section Problem –
A Multiple-Process Solution
ƒ Bakery Algorithm
ƒ Originally designed for distributed
systems
ƒ Processes which are ready to enter
their critical section must take a
number and wait till the number
becomes the lowest.
ƒ int number[i]: Pi’s number if it is
nonzero.
ƒ boolean choosing[i]: Pi is taking a
number.

The Critical-Section Problem –


A Multiple-Process Solution
do {
  choosing[i]=true;
  number[i]=max(number[0], …, number[n-1])+1;
  choosing[i]=false;
  for (j=0; j < n; j++) {
    while (choosing[j]) ;
    while (number[j] != 0 && (number[j],j)<(number[i],i)) ;
  }
  critical section
  number[i]=0;
  remainder section
} while (1);

• An observation: If Pi is in its critical section,
and Pk (k != i) has already chosen its number[k],
then (number[i],i) < (number[k],k).

Synchronization Hardware
ƒ Motivation:
ƒ Hardware features make programming
easier and improve system efficiency.
ƒ Approach:
ƒ Disable Interrupts → No Preemption
ƒ Infeasible in multiprocessor environment
where message passing is used.
ƒ Potential impacts on interrupt-driven system
clocks.
ƒ Atomic Hardware Instructions
ƒ Test-and-set, Swap, etc.

Synchronization Hardware
boolean TestAndSet(boolean &target) {
  boolean rv = target;
  target = true;
  return rv;
}

do {
  while (TestAndSet(lock)) ;
  critical section
  lock = false;
  remainder section
} while (1);
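On real hardware the same pattern is available through C11's `atomic_flag`, whose `atomic_flag_test_and_set` is guaranteed atomic; a spinlock sketch (the wrapper names are ours), paired with two threads to show that the critical section really is exclusive:

```c
#include <stdatomic.h>
#include <pthread.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long shared = 0;

static void *worker(void *arg) {
    long iters = (long)arg;
    for (long i = 0; i < iters; i++) {
        while (atomic_flag_test_and_set(&lock_flag))
            ;                                 /* spin: entry section */
        shared++;                             /* critical section */
        atomic_flag_clear(&lock_flag);        /* exit section */
    }
    return NULL;
}

/* Two threads increment `shared` under the spinlock; no update is lost. */
long spinlock_count(long iters) {
    pthread_t a, b;
    shared = 0;
    pthread_create(&a, NULL, worker, (void *)iters);
    pthread_create(&b, NULL, worker, (void *)iters);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared;
}
```

Like the slide's spinlock, this busy-waits, so it only pays off when the critical section is very short.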
Synchronization Hardware
void Swap(boolean &a, boolean &b) {
boolean temp = a;
a=b;
b=temp;
}
do {
key=true;
while (key == true)
Swap(lock, key);
critical section
lock=false;
remainder section
} while (1);

Synchronization Hardware
do {
  waiting[i]=true;
  key=true;
  while (waiting[i] && key)
    key=TestAndSet(lock);
  waiting[i]=false;
  critical section;
  j= (i+1) % n;
  while ((j != i) && !waiting[j])
    j= (j+1) % n;
  if (j == i) lock=false;
  else waiting[j]=false;
  remainder section
} while (1);

ƒ Mutual Exclusion
ƒ Pass if key == F or waiting[i] == F
ƒ Progress
ƒ The exiting process sends one waiting
process in.
ƒ Bounded Waiting
ƒ Wait at most n-1 times
ƒ An atomic TestAndSet is hard to implement
in a multiprocessor environment.

Semaphores
ƒ Motivation:
ƒ A high-level solution for more
complex problems.
ƒ Semaphore
ƒ A variable S only accessible by two
atomic operations:

wait(S) { /* P */
  while (S <= 0) ;
  S--;
}
signal(S) { /* V */
  S++;
}
• Indivisibility holds for "(S<=0)", "S--", and "S++"

Semaphores – Usages

ƒ Critical Sections
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);

ƒ Precedence Enforcement (synch initialized to 0)
P1:
    S1;
    signal(synch);
P2:
    wait(synch);
    S2;
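The precedence pattern can be sketched with POSIX semaphores (function and variable names such as `precedence_demo` are mine; the slides only show the wait/signal skeleton). Because `synch` starts at 0, S2 cannot run before S1 even if P2 is started first:

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t synch;              /* initialized to 0: S2 must wait for S1 */
static int order[2], pos = 0;

static void *p1(void *arg) { (void)arg; order[pos++] = 1; /* S1 */ sem_post(&synch); return NULL; }
static void *p2(void *arg) { (void)arg; sem_wait(&synch); order[pos++] = 2; /* S2 */ return NULL; }

/* Returns 1 iff S1 executed before S2, for any interleaving. */
int precedence_demo(void) {
    pthread_t t1, t2;
    pos = 0;
    sem_init(&synch, 0, 0);
    pthread_create(&t2, NULL, p2, NULL);   /* start P2 first on purpose */
    pthread_create(&t1, NULL, p1, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&synch);
    return order[0] == 1 && order[1] == 2;
}
```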

Semaphores
ƒ Implementation
ƒ Spinlock – A Busy-Waiting Semaphore
ƒ “while (S <= 0)” causes the wasting of
CPU cycles!
ƒ Advantage:
ƒ When locks are held for a short time,
spinlocks are useful since no context
switching is involved.
ƒ Semaphores with Block-Waiting
ƒ No busy waiting from the entry to the
critical section!


Semaphores
ƒ Semaphores with Block Waiting
typedef struct {
    int value;
    struct process *L;
} semaphore;

void wait(semaphore S) {
    S.value--;
    if (S.value < 0) {
        add this process to S.L;
        block();
    }
}
void signal(semaphore S) {
    S.value++;
    if (S.value <= 0) {
        remove a process P from S.L;
        wakeup(P);
    }
}
* |S.value| = the # of waiting processes if S.value < 0.
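The block-waiting semaphore above can be sketched with a pthread mutex and condition variable standing in for `block()`/`wakeup()` (the `semaphore_t` type and `wait_`/`signal_` names are mine). Faithful to the slide, `wait_` uses a single `if`; a production version would loop, since `pthread_cond_wait` permits spurious wakeups:

```c
#include <pthread.h>

/* Blocking semaphore in the style of the slide: value may go negative,
 * and |value| then counts the waiting threads. */
typedef struct {
    int value;
    pthread_mutex_t m;
    pthread_cond_t  q;     /* stands in for the process list S.L */
} semaphore_t;

void sem_init_(semaphore_t *s, int v) {
    s->value = v;
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->q, NULL);
}
void wait_(semaphore_t *s) {
    pthread_mutex_lock(&s->m);
    s->value--;
    if (s->value < 0)
        pthread_cond_wait(&s->q, &s->m);   /* block() until wakeup(P) */
    pthread_mutex_unlock(&s->m);
}
void signal_(semaphore_t *s) {
    pthread_mutex_lock(&s->m);
    s->value++;
    if (s->value <= 0)
        pthread_cond_signal(&s->q);        /* wakeup(P): release one waiter */
    pthread_mutex_unlock(&s->m);
}
```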

Semaphores
ƒ The queueing strategy can be arbitrary,
but there is a restriction for the bounded-
waiting requirement.
ƒ Mutual exclusion in wait() & signal()
ƒ Uniprocessor Environments
ƒ Interrupt Disabling
ƒ TestAndSet, Swap
ƒ Software Methods, e.g., the Bakery Algorithm,
in Section 7.2
ƒ Multiprocessor Environments
ƒ Remarks: Busy-waiting is limited to only
the critical sections of the wait() & signal()!

Deadlocks and Starvation


ƒ Deadlock
ƒ A set of processes is in a deadlock state when
every process in the set is waiting for an event
that can be caused only by another process in the
set.
P0: wait(S); P1: wait(Q);
wait(Q); wait(S);
… …
signal(S); signal(Q);
signal(Q); signal(S);

ƒ Starvation (or Indefinite Blocking)


ƒ E.g., a LIFO queue
Binary Semaphore
ƒ Binary Semaphores versus Counting
Semaphores
ƒ The value ranges from 0 to 1Æ easy
implementation!
wait(S):
    wait(S1);        /* protect C */
    C--;
    if (C < 0) {
        signal(S1);
        wait(S2);
    }
    signal(S1);

signal(S):
    wait(S1);
    C++;
    if (C <= 0)
        signal(S2);  /* wakeup */
    else
        signal(S1);
* S1 & S2: binary semaphores (S1 initialized to 1, S2 to 0); C: the counting semaphore's value.

Classical Synchronization
Problems – The Bounded Buffer
Producer:
do {
    produce an item in nextp;
    …
    wait(empty);   /* control buffer availability; empty initialized to n */
    wait(mutex);   /* mutual exclusion; mutex initialized to 1 */
    …
    add nextp to buffer;
    signal(mutex);
    signal(full);  /* increase the item count; full initialized to 0 */
} while (1);
Classical Synchronization
Problems – The Bounded Buffer
Consumer:
do {
    wait(full);    /* control buffer availability; full initialized to 0 */
    wait(mutex);   /* mutual exclusion; mutex initialized to 1 */
    …
    remove an item from buffer to nextp;
    …
    signal(mutex);
    signal(empty); /* increase the free-slot count; empty initialized to n */
    consume nextp;
} while (1);
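The producer/consumer pair above can be sketched with POSIX semaphores (names like `bounded_buffer_demo`, `empty_slots`, and `ITEMS` are mine). The producer sends the integers 1..1000; the consumer sums them, so a correct run returns 500500:

```c
#include <pthread.h>
#include <semaphore.h>

#define N 8
enum { ITEMS = 1000 };
static int buf[N], in = 0, out = 0;
static sem_t empty_slots, full_slots, mutex;   /* n, 0, 1 as in the slides */
static long consumed_sum = 0;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= ITEMS; i++) {
        sem_wait(&empty_slots);        /* wait(empty) */
        sem_wait(&mutex);              /* wait(mutex) */
        buf[in] = i; in = (in + 1) % N;
        sem_post(&mutex);              /* signal(mutex) */
        sem_post(&full_slots);         /* signal(full) */
    }
    return NULL;
}
static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);
        sem_wait(&mutex);
        consumed_sum += buf[out]; out = (out + 1) % N;
        sem_post(&mutex);
        sem_post(&empty_slots);
    }
    return NULL;
}
long bounded_buffer_demo(void) {
    pthread_t p, c;
    consumed_sum = 0; in = out = 0;
    sem_init(&empty_slots, 0, N);
    sem_init(&full_slots, 0, 0);
    sem_init(&mutex, 0, 1);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumed_sum;               /* 1 + 2 + ... + ITEMS */
}
```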

Classical Synchronization
Problems – Readers and Writers
ƒ The Basic Assumption:
ƒ Readers: shared locks
ƒ Writers: exclusive locks
ƒ The first reader-writers problem
ƒ No readers will be kept waiting unless a
writer has already obtained permission to
use the shared object Æ potential hazard
to writers!
ƒ The second reader-writers problem:
ƒ Once a writer is ready, it performs its write
asap! Æ potential hazard to readers!
Classical Synchronization
Problems – Readers and Writers
First readers-writers solution: semaphore wrt, mutex (both initialized to 1); int readcount = 0;

Reader:
    wait(mutex);
    readcount++;
    if (readcount == 1)
        wait(wrt);
    signal(mutex);
    …… reading ……
    wait(mutex);
    readcount--;
    if (readcount == 0)
        signal(wrt);
    signal(mutex);

Writer:
    wait(wrt);
    …… writing ……
    signal(wrt);
* Queueing mechanism: which waiting process is awakened when signal(wrt) is executed?
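A runnable sketch of this first readers-writers solution with POSIX semaphores (the `rw_demo` harness and thread counts are mine). Readers take the shared lock through `readcount`; writers take `wrt` exclusively:

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t wrt, mutex;          /* both initialized to 1, as in the slide */
static int readcount = 0;
static int shared_data = 0;

static void *reader(void *arg) {
    (void)arg;
    sem_wait(&mutex);
    if (++readcount == 1) sem_wait(&wrt);  /* first reader locks out writers */
    sem_post(&mutex);
    int copy = shared_data;                /* shared lock: reading */
    (void)copy;
    sem_wait(&mutex);
    if (--readcount == 0) sem_post(&wrt);  /* last reader lets writers in */
    sem_post(&mutex);
    return NULL;
}
static void *writer(void *arg) {
    (void)arg;
    sem_wait(&wrt);                        /* exclusive lock */
    shared_data++;
    sem_post(&wrt);
    return NULL;
}
int rw_demo(void) {
    pthread_t r[4], w[2];
    sem_init(&wrt, 0, 1);
    sem_init(&mutex, 0, 1);
    for (int i = 0; i < 2; i++) pthread_create(&w[i], NULL, writer, NULL);
    for (int i = 0; i < 4; i++) pthread_create(&r[i], NULL, reader, NULL);
    for (int i = 0; i < 4; i++) pthread_join(r[i], NULL);
    for (int i = 0; i < 2; i++) pthread_join(w[i], NULL);
    return shared_data;                    /* both writers ran exclusively */
}
```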

Classical Synchronization
Problems – Dining-Philosophers
ƒ Each philosopher must pick up one
chopstick beside him/her at a time
ƒ When two chopsticks are picked up,
the philosopher can eat.
(State diagram: thinking → hungry → eating; a philosopher who never gets to eat starves to death.)
Classical Synchronization
Problems – Dining-Philosophers
semaphore chopstick[5];
do {
wait(chopstick[i]);
wait(chopstick[(i + 1) % 5 ]);
… eat …
signal(chopstick[i]);
signal(chopstick[(i+1) % 5]);
…think …
} while (1);

Classical Synchronization
Problems – Dining-Philosophers
ƒ Deadlock or Starvation?!
ƒ Solutions to Deadlocks:
ƒ At most four philosophers appear.
ƒ Pick up two chopsticks “simultaneously”.
ƒ Order their behaviors, e.g., odds pick up their
right one first, and evens pick up their left one
first.
ƒ Solutions to Starvation:
ƒ No philosopher will starve to death.
ƒ A deadlock could happen??
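The ordering fix can be sketched in C with one pthread mutex per chopstick (the `dining_demo` harness and iteration counts are mine). Every philosopher locks the lower-numbered chopstick first, which imposes a total order on the resources and breaks the circular wait, so the demo always terminates:

```c
#include <pthread.h>

#define NP 5
static pthread_mutex_t chopstick[NP];
static int meals[NP];

/* Break the circular wait by a total order on chopsticks: always pick
 * the lower-numbered chopstick first. */
static void *philosopher(void *arg) {
    int i = (int)(long)arg;
    int first = i, second = (i + 1) % NP;
    if (first > second) { int t = first; first = second; second = t; }
    for (int m = 0; m < 100; m++) {
        pthread_mutex_lock(&chopstick[first]);
        pthread_mutex_lock(&chopstick[second]);
        meals[i]++;                        /* eat */
        pthread_mutex_unlock(&chopstick[second]);
        pthread_mutex_unlock(&chopstick[first]);
    }
    return NULL;
}
int dining_demo(void) {
    pthread_t t[NP];
    int total = 0;
    for (int i = 0; i < NP; i++) pthread_mutex_init(&chopstick[i], NULL);
    for (long i = 0; i < NP; i++) pthread_create(&t[i], NULL, philosopher, (void *)i);
    for (int i = 0; i < NP; i++) { pthread_join(t[i], NULL); total += meals[i]; }
    return total;                          /* all philosophers finish: no deadlock */
}
```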

Critical Regions
ƒ Motivation:
ƒ Various programming errors in using low-level constructs, e.g., semaphores:
ƒ Interchange the order of wait and signal
operations
ƒ Miss some waits or signals
ƒ Replace waits with signals
ƒ etc
ƒ The needs of high-level language
constructs to reduce the possibility of
errors!

Critical Regions
ƒ Region v when B do S;
ƒ v – a variable shared among processes and accessible only inside the region, e.g.,
struct buffer {
    item pool[n];
    int count, in, out;
};
ƒ B – a condition, e.g., count > 0
ƒ S – statements

Example: Mutual Exclusion
region v when (true) S1;
region v when (true) S2;
Critical Regions – Consumer-
Producer
struct buffer {
item pool[n];
int count, in, out;
};
Producer:
region buffer when (count < n) {
    pool[in] = nextp;
    in = (in + 1) % n;
    count++;
}

Consumer:
region buffer when (count > 0) {
    nextc = pool[out];
    out = (out + 1) % n;
    count--;
}

Critical Regions –
Implementation by Semaphores
Region x when B do S;

/* shared declarations */
semaphore mutex;         /* to protect the region, initialized to 1 */
semaphore first-delay;   /* to (re-)test B */
int first-count = 0;
semaphore second-delay;  /* to retest B */
int second-count = 0;

wait(mutex);
while (!B) {                       /* B fails */
    first-count++;
    if (second-count > 0)
        signal(second-delay);      /* try other processes waiting on second-delay */
    else
        signal(mutex);
    wait(first-delay);             /* block itself on first-delay */
    first-count--;
    second-count++;
    if (first-count > 0)
        signal(first-delay);
    else
        signal(second-delay);
    wait(second-delay);            /* block itself on second-delay */
    second-count--;
}
S;
if (first-count > 0)
    signal(first-delay);
else if (second-count > 0)
    signal(second-delay);
else
    signal(mutex);

Monitor
ƒ Components
ƒ Variables – monitor state (shared data)
ƒ Procedures – access only local variables or formal parameters
ƒ Condition variables – tailor-made synchronization: x.wait() or x.signal()
ƒ Processes enter through an entry queue; each condition variable x has its own queue of suspended processes.

monitor name {
    variable declarations
    void proc1(…) {
        ………
    }
    …
    void procn(…) {
        ………
    }
    initialization code
}

Monitor
ƒ Semantics of signal & wait
ƒ x.signal() resumes exactly one suspended process. If there is none, it has no effect.
ƒ Suppose P invokes x.signal() and a suspended process Q exists. Only one of them may be active in the monitor:
ƒ Either P waits until Q leaves the monitor or waits for another condition, or
ƒ Q waits until P leaves the monitor or waits for another condition.


Monitor – Dining-Philosophers
monitor dp {
    enum {thinking, hungry, eating} state[5];
    condition self[5];
    void pickup(int i) {
        state[i] = hungry;
        test(i);
        if (state[i] != eating)
            self[i].wait();
    }
    void putdown(int i) {
        state[i] = thinking;
        test((i + 4) % 5);
        test((i + 1) % 5);
    }

Philosopher i:
    dp.pickup(i);
    … eat …
    dp.putdown(i);

Monitor – Dining-Philosophers
    void test(int i) {
        if (state[(i + 4) % 5] != eating &&
            state[i] == hungry &&
            state[(i + 1) % 5] != eating) {
            state[i] = eating;
            self[i].signal();
        }
    }
    void init() {
        for (int i = 0; i < 5; i++)
            state[i] = thinking;
    }
}
* No deadlock! But starvation could occur!


Monitor – Implementation by
Semaphores
ƒ Semaphores
ƒ mutex – to protect the monitor, initialized to 1
ƒ next – on which signaling processes may suspend themselves, initialized to 0
ƒ next-count – the number of processes suspended on next
ƒ For each external function F
wait(mutex);

body of F;

if (next-count > 0)
signal(next);
else signal(mutex);
Monitor – Implementation by
Semaphores
ƒ For every condition x:
ƒ a semaphore x-sem, initialized to 0
ƒ an integer variable x-count, initialized to 0
ƒ Implementation of x.wait() and x.signal():

x.wait():
    x-count++;
    if (next-count > 0)
        signal(next);
    else
        signal(mutex);
    wait(x-sem);
    x-count--;

x.signal():
    if (x-count > 0) {
        next-count++;
        signal(x-sem);
        wait(next);
        next-count--;
    }
* x.wait() and x.signal() are invoked within a monitor.
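In practice, pthreads offers condition variables directly, but with Mesa ("signal and continue") semantics rather than the Hoare-style handoff via next/x-sem sketched above, so waiters must re-test their condition in a loop. A minimal monitor-like resource (the `acquire`/`release` names and `busy` flag are mine):

```c
#include <pthread.h>

/* A monitor-like one-slot resource. The mutex plays the role of the
 * monitor lock; the condition variable plays the role of x. */
static pthread_mutex_t mon = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  x   = PTHREAD_COND_INITIALIZER;
static int busy = 0;

void acquire(void) {
    pthread_mutex_lock(&mon);        /* enter the monitor */
    while (busy)
        pthread_cond_wait(&x, &mon); /* x.wait(): releases mon, reacquires on wakeup */
    busy = 1;
    pthread_mutex_unlock(&mon);
}
void release(void) {
    pthread_mutex_lock(&mon);
    busy = 0;
    pthread_cond_signal(&x);         /* x.signal(): resume one suspended process */
    pthread_mutex_unlock(&mon);
}
```

The `while (busy)` loop is what distinguishes Mesa semantics from the Hoare semantics above: a signaled thread is not guaranteed to run before anyone else enters the monitor.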

Monitor
ƒ Process-Resumption Order
ƒ Queueing mechanisms for a monitor and its condition variables.
ƒ A solution: x.wait(c); where the expression c is evaluated to determine its process's resumption order.

monitor ResAllc {
    boolean busy;
    condition x;
    void acquire(int time) {
        if (busy)
            x.wait(time);
        busy = true;
    }
    …
}

A user process:
    R.acquire(t);
    access the resource;
    R.release();

Monitor
ƒ Concerns:
ƒ Processes may access resources
without consulting the monitor.
ƒ Processes may never release
resources.
ƒ Processes may release resources
which they never requested.
ƒ Processes may even request resources twice.


Monitor
ƒ Remark: Is the monitor correctly used?
=> Requirements for correct computations
ƒ Processes always make their calls on the
monitor in correct order.
ƒ No uncooperative process can access
resource directly without using the access
protocols.
ƒ Note: Scheduling behavior should consult
the built-in monitor scheduling algorithm if
resource access RPC are built inside the
monitor.

OS Synchronization – Solaris 2
ƒ Semaphores and Condition Variables
ƒ Adaptive Mutex
ƒ Spin-locking if the lock-holding thread
is running; otherwise, blocking is used.
ƒ Readers-Writers Locks
ƒ Expensive in implementations.
ƒ Turnstile
ƒ A queue structure containing threads
blocked on a lock.
ƒ Priority inversion Æ priority inheritance
protocol for kernel threads

OS Synchronization – Windows
2000
ƒ General Mechanism
ƒ Spin-locking for short code segments in
a multiprocessor platform.
ƒ Interrupt disabling when access to
global variables is done in a
uniprocessor platform.
ƒ Dispatcher Object
ƒ State: signaled or non-signaled
ƒ Mutex – select one process from its
waiting queue to the ready queue.
ƒ Events – select all processes waiting
for the event.
Atomic Transactions
ƒ Why Atomic Transactions?
ƒ Critical sections ensure mutual
exclusion in data sharing, but the
relationship between critical sections
might also be meaningful!
Æ Atomic Transactions

ƒ Operating systems can be viewed as


manipulators of data!


Atomic Transactions –
System Model
ƒ Transaction – a logical unit of
computation
ƒ A sequence of read and write operations
followed by a commit or an abort.
ƒ Beyond “critical sections”
1. Atomicity: All or Nothing
ƒ An aborted transaction must be rolled
back.
ƒ The effect of a committed transaction
must persist and be imposed as a logical
unit of operations.

Atomic Transactions –
System Model
2. Serializability:
ƒ The order of transaction executions
must be equivalent to a serial
schedule.
T0        T1
R(A)
W(A)
          R(A)
          W(A)
R(B)
W(B)
          R(B)
          W(B)
* Two operations Oi & Oj conflict if (1) they access the same object and (2) at least one of them is a write.


Atomic Transactions –
System Model
ƒ Conflict Serializable:
ƒ S is conflict serializable if S can be
transformed into a serial schedule by
swapping nonconflicting operations.
Conflict-serializable schedule:        Equivalent serial schedule:
T0        T1                           T0        T1
R(A)                                   R(A)
W(A)                                   W(A)
          R(A)                         R(B)
          W(A)                         W(B)
R(B)                                             R(A)
W(B)                                             W(A)
          R(B)                                   R(B)
          W(B)                                   W(B)
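Conflict serializability can be tested without doing the swaps: a schedule is conflict serializable iff its precedence graph (an edge Ti → Tj for each conflicting pair with Ti's operation first) is acyclic. A sketch (the `Op` record layout and function names are mine):

```c
#define MAXT 8
typedef struct { int tx; char op; char obj; } Op;   /* op: 'R' or 'W' */

/* Depth-first search: is `to` reachable from `from` in graph g? */
static int reach(int g[MAXT][MAXT], int n, int from, int to, int seen[MAXT]) {
    if (from == to) return 1;
    seen[from] = 1;
    for (int k = 0; k < n; k++)
        if (g[from][k] && !seen[k] && reach(g, n, k, to, seen)) return 1;
    return 0;
}

/* Build the precedence graph and report 1 iff it has no cycle. */
int conflict_serializable(const Op *s, int len, int ntx) {
    int g[MAXT][MAXT] = {{0}};
    for (int i = 0; i < len; i++)
        for (int j = i + 1; j < len; j++)
            if (s[i].tx != s[j].tx && s[i].obj == s[j].obj &&
                (s[i].op == 'W' || s[j].op == 'W'))
                g[s[i].tx][s[j].tx] = 1;            /* Ti precedes Tj */
    for (int v = 0; v < ntx; v++)
        for (int k = 0; k < ntx; k++) {
            int seen[MAXT] = {0};
            if (g[v][k] && reach(g, ntx, k, v, seen)) return 0;  /* cycle */
        }
    return 1;
}
```

For the schedule above, the only edge is T0 → T1, so it is conflict serializable; a schedule such as R0(A) W1(A) W0(A) produces edges both ways and fails.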
Atomic Transactions –
Concurrency Control
ƒ Locking Protocols
ƒ Lock modes (A general approach!)
ƒ 1. Shared-Mode: “Reads”.
ƒ 2. Exclusive-Mode: “Reads” & “Writes“
ƒ General Rule
ƒ A transaction must receive a lock of an
appropriate mode of an object before it
accesses the object. The lock may not
be released until the last access of the
object is done.


Atomic Transactions –
Concurrency Control

Lock-request handling:
ƒ If the object is not locked, the lock is granted.
ƒ If the object is locked but the request is compatible with the current lock, the lock is granted.
ƒ Otherwise, the requesting transaction must wait.
Atomic Transactions –
Concurrency Control
ƒ When to release locks without violating serializability, e.g.,
R0(A) W0(A) R1(A) R1(B) R0(B) W0(B)
ƒ Two-Phase Locking Protocol (2PL) – not deadlock-free
ƒ Growing phase: a transaction may obtain locks but may not release any lock.
ƒ Shrinking phase: a transaction may release locks but may not obtain any new lock.
ƒ 2PL schedules ⊂ serializable schedules
ƒ How to improve 2PL?
ƒ Semantics, order of data, access pattern, etc.

Atomic Transactions –
Concurrency Control
ƒ Timestamp-Based Protocols
ƒ A time stamp for each transaction TS(Ti)
ƒ Determine transactions’ order in a
schedule in advance!
ƒ A General Approach:
ƒ TS(Ti) – System Clock or Logical Counter
ƒ Unique?
ƒ Scheduling Scheme – deadlock-free &
serializable
ƒ W-timestamp(Q) = max{ TS(Ti) : Ti has written Q }
ƒ R-timestamp(Q) = max{ TS(Ti) : Ti has read Q }
Atomic Transactions –
Concurrency Control
ƒ R(Q) requested by Ti → check TS(Ti):
ƒ Granted if TS(Ti) ≥ W-timestamp(Q); rejected otherwise.
ƒ W(Q) requested by Ti → check TS(Ti):
ƒ Granted if TS(Ti) ≥ R-timestamp(Q) and TS(Ti) ≥ W-timestamp(Q); rejected otherwise.
ƒ Rejected transactions are rolled back and restarted with a new time stamp.
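These two rules are small enough to code directly (the `Object` record and function names are mine; granted operations also advance the object's timestamps):

```c
/* Basic timestamp-ordering checks: a read by Ti is rejected if
 * TS(Ti) < W-timestamp(Q); a write is rejected if TS(Ti) < R-timestamp(Q)
 * or TS(Ti) < W-timestamp(Q). Returns 1 = granted, 0 = rejected. */
typedef struct { int r_ts, w_ts; } Object;

int ts_read(Object *q, int ts) {
    if (ts < q->w_ts) return 0;          /* would read an overwritten value */
    if (ts > q->r_ts) q->r_ts = ts;
    return 1;
}
int ts_write(Object *q, int ts) {
    if (ts < q->r_ts || ts < q->w_ts) return 0;   /* too late: roll back Ti */
    q->w_ts = ts;
    return 1;
}
```

For example, after T2 reads Q, a write by the older T1 is rejected; after T3 writes Q, a read by T2 is rejected, forcing T2 to restart with a new timestamp.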

Failure Recovery – A Way to


Achieve Atomicity
ƒ Failures of Volatile and Nonvolatile Storages!
ƒ Volatile Storage: Memory and Cache
ƒ Nonvolatile Storage: Disks, Magnetic Tape, etc.
ƒ Stable Storage: storage that never fails (an idealization).
ƒ Log-Based Recovery
ƒ Write-Ahead Logging
ƒ Log Records
< Ti starts >
< Ti commits >
< Ti aborts >
< Ti, Data-Item-Name, Old-Value, New-Value>
Failure Recovery
ƒ Two Basic Recovery Procedures:

ƒ undo(Ti): restores the data updated by Ti to their old values.
ƒ redo(Ti): sets the data updated by Ti to their new values.
ƒ Operations must be idempotent!
ƒ Recover the system when a failure occurs:
ƒ “Redo” transactions whose log contains both <Ti starts> and <Ti commits>, and “undo” transactions whose log contains <Ti starts> but no <Ti commits>.
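The redo/undo pass can be sketched in a few lines (the `LogRec` layout and `recover` name are mine; the record kinds mirror the log entries listed earlier). One scan finds the committed transactions; then updates are redone forward and undone backward:

```c
#include <string.h>

typedef enum { STARTS, COMMITS, UPDATE } Kind;
typedef struct { Kind kind; int tx; int item; int old_v, new_v; } LogRec;

/* Repairs db[] in place from the write-ahead log. Both redo and undo
 * are idempotent: running recover twice leaves db unchanged. */
void recover(int *db, const LogRec *log, int nrec) {
    int committed[16];
    memset(committed, 0, sizeof committed);
    for (int i = 0; i < nrec; i++)
        if (log[i].kind == COMMITS) committed[log[i].tx] = 1;
    for (int i = 0; i < nrec; i++)             /* redo committed, forward */
        if (log[i].kind == UPDATE && committed[log[i].tx])
            db[log[i].item] = log[i].new_v;
    for (int i = nrec - 1; i >= 0; i--)        /* undo uncommitted, backward */
        if (log[i].kind == UPDATE && !committed[log[i].tx])
            db[log[i].item] = log[i].old_v;
}
```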

Failure Recovery
ƒ Why Checkpointing?
ƒ Without checkpoints, recovery must scan the entire log and rerun all committed transactions.
ƒ CheckPoint
ƒ Output all log records, Output DB, and Write
<check point> to stable storage!
ƒ Commit: A Force Write Procedure

ƒ After a crash, only the transactions after the latest checkpoint (plus the one active at the checkpoint) need to be redone or undone.
Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems

Chapter 8 Deadlocks

Deadlocks
ƒ A set of processes is in a deadlock state when every process in the set is waiting for an event that can be caused only by another process in the set.
ƒ A System Model
ƒ Competing processes – distributed?
ƒ Resources:
ƒ Physical Resources, e.g., CPU, printers,
memory, etc.
ƒ Logical Resources, e.g., files,
semaphores, etc.

Deadlocks
ƒ A Normal Sequence
1. Request: Granted or Rejected
2. Use
3. Release
ƒ Remarks
ƒ No request should exceed the
system capacity!
ƒ Deadlock can involve different
resource types!
ƒ Several instances of the same type!
Deadlock Characterization
ƒ Necessary Conditions

(deadlock → conditions, or ¬ conditions → ¬ deadlock)

1. Mutual Exclusion – At least one resource must be held in a non-sharable mode!
2. Hold and Wait – Pi is holding at least
one resource and waiting to acquire
additional resources that are currently
held by other processes!


Deadlock Characterization
3. No Preemption – Resources are
nonpreemptible!
4. Circular Wait – There exists a set {P0, P1, …, Pn} of waiting processes such that P0 waits for P1, P1 waits for P2, …, Pn-1 waits for Pn, and Pn waits for P0.

ƒ Remark:
ƒ Condition 4 implies Condition 2.
ƒ The four conditions are not
completely independent!
Resource Allocation Graph
System Resource-Allocation Graph
(Figure: processes P1, P2, P3 and resource types R1–R4)
Vertices
  Processes: {P1, …, Pn}
  Resource Types: {R1, …, Rm}
Edges
  Request Edge: Pi → Rj
  Assignment Edge: Rj → Pi

Resource Allocation Graph


ƒ Example (no deadlock)
ƒ Vertices
ƒ P = { P1, P2, P3 }
ƒ R = { R1, R2, R3, R4 }
ƒ Edges
ƒ E = { P1→R1, P2→R3, R1→P2, R2→P2, R2→P1, R3→P3 }
ƒ Resource instances
ƒ R1:1, R2:2, R3:1, R4:3
ƒ Adding a request edge P3→R2 results in a deadlock.
Resource Allocation Graph
ƒ Observation: the existence of a cycle
ƒ One instance per resource type → a cycle implies a deadlock!
ƒ Otherwise → a cycle is only a necessary condition!
(Figure: a resource-allocation graph with a cycle but no deadlock, since R2 has multiple instances and a process outside the cycle can release one.)

Methods for Handling Deadlocks


ƒ Solutions:
1. Make sure that the system never
enters a deadlock state!
ƒ Deadlock Prevention: Fail at least one
of the necessary conditions
ƒ Deadlock Avoidance: Processes
provide information regarding their
resource usage. Make sure that the
system always stays at a “safe” state!

Methods for Handling Deadlocks
2. Do recovery if the system is
deadlocked.
ƒ Deadlock Detection
ƒ Recovery
3. Ignore the possibility of deadlock
occurrences!
ƒ Restart the system “manually” if the
system “seems” to be deadlocked or
stops functioning.
ƒ Note that the system may be “frozen”
temporarily!

Deadlock Prevention
ƒ Observation:
ƒ Try to invalidate any one of the necessary conditions!
∵ ¬ (condition 1 ∧ … ∧ condition 4) → ¬ deadlock

ƒ Mutual Exclusion
?? Some resources, such as a printer,
are intrinsically non-sharable??

Deadlock Prevention
ƒ Hold and Wait
ƒ Acquire all needed resources before its
execution.
ƒ Release allocated resources before
request additional resources!
Example: a process copies data from a tape drive to a disk and then prints the results [Tape Drive → Disk] [Disk & Printer]:
ƒ Hold Them All: request the tape drive, disk, and printer together at the start.
ƒ Alternative: first request the tape drive & disk; release them; then request the disk & printer.
ƒ Disadvantage:
ƒ Low resource utilization
ƒ Starvation

Deadlock Prevention
ƒ No Preemption
ƒ Resource preemption causes the release of resources.
ƒ Related protocols are only applied to resources whose
states can be saved and restored, e.g., CPU register &
memory space, instead of printers or tape drives.
ƒ Approach 1: If a process's resource request cannot be satisfied immediately, all resources it currently holds are released, and the process waits.
Deadlock Prevention
ƒ Approach 2: On a resource request:
ƒ If the request can be satisfied, it is granted.
ƒ Otherwise, if the requested resources are held by processes that are themselves waiting, preempt those resources.
ƒ Otherwise, the requesting process waits, and while it waits its allocated resources may be preempted.

Deadlock Prevention
ƒ Circular Wait
A resource-ordering approach:
F:RÆN
Resource requests must be made in
an increasing order of enumeration.

ƒ Type 1 – strictly increasing order of resource requests:
ƒ Initially, request any # of instances of Ri.
ƒ Any following request for instances of Rj must satisfy F(Rj) > F(Ri), and so on.
* A single request must be issued for all needed instances of the same resource.
Deadlock Prevention
ƒ Type 2
ƒ Processes must release all Ri’s when
they request any instance of Rj if F(Ri) ≥
F(Rj)
ƒ F : R Æ N must be defined according to
the normal order of resource usages in a
system, e.g.,
F(tape drive) = 1
F(disk drive) = 5 ?? feasible ??
F(printer) = 12


Deadlock Avoidance
ƒ Motivation:
ƒ Deadlock-prevention algorithms can cause
low device utilization and reduced system
throughput!

→ Acquire additional information about how resources are to be requested and make better resource-allocation decisions!
ƒ Each process declares the maximum number of resources of each type that it may need.
Deadlock Avoidance
ƒ A Simple Model
ƒ A resource-allocation state
<# of available resources,
# of allocated resources,
max demands of processes>
ƒ A deadlock-avoidance algorithm dynamically
examines the resource-allocation state and
make sure that it is safe.
ƒ e.g., the system never satisfies the circular-
wait condition.

Deadlock Avoidance
ƒ Safe Sequence
ƒ A sequence of processes <P1, P2, …, Pn> is a safe sequence if
ƒ ∀ Pi: need(Pi) ≤ Available + Σ_{j<i} allocated(Pj)
ƒ Safe State
ƒ The existence of a safe sequence.
ƒ Deadlocks are avoided if the system can allocate resources to each process (up to its maximum) in some order; if so, the system is in a safe state.
ƒ Unsafe states may, but need not, lead to deadlocks: deadlock states ⊂ unsafe states ⊂ all states.

Deadlock Avoidance
ƒ Example:
       Max Needs   Allocated   Available
P0     10          5           3
P1     4           2
P2     9           2
• A safe sequence <P1, P0, P2> exists.
• If P2 is granted one more, the resulting state ((P0,5), (P1,2), (P2,3), (available,2)) is unsafe.
How to ensure that the system will always remain in a safe state?

Deadlock Avoidance – Resource-


Allocation Graph Algorithm
ƒ One Instance per Resource Type
ƒ Request Edge: Pi → Rj
ƒ Assignment Edge: Rj → Pi (the resource is allocated)
ƒ Claim Edge: Pi ⇢ Rj (dashed; Pi may request Rj in the future)
ƒ A claim edge becomes a request edge when the request is made; when the resource is released, the assignment edge reverts to a claim edge.
Deadlock Avoidance – Resource-
Allocation Graph Algorithm
ƒ Grant a request only if converting the request edge Pi → Rj into an assignment edge creates no cycle (claim edges included).
ƒ Example: if P2's request for R2 were granted, a cycle would be detected → the state would be unsafe!
ƒ Safe state: no cycle; unsafe state: otherwise.
ƒ Cycle detection can be done in O(n²).

Deadlock Avoidance – Banker’s


Algorithm
(n: # of processes, m: # of resource types)
ƒ Available[m]
ƒ If Available[j] = k, there are k instances of resource type Rj available.
ƒ Max[n,m]
ƒ If Max[i,j] = k, process Pi may request at most k instances of resource type Rj.
ƒ Allocation[n,m]
ƒ If Allocation[i,j] = k, process Pi is currently allocated k instances of resource type Rj.
ƒ Need[n,m]
ƒ If Need[i,j] = k, process Pi may need k more instances of resource type Rj.
¾ Need[i,j] = Max[i,j] – Allocation[i,j]
Deadlock Avoidance – Banker’s
Algorithm
ƒ Safety Algorithm – A state is safe??
1. Work := Available; Finish[i] := F for 1 ≤ i ≤ n
2. Find an i such that both
   a. Finish[i] = F
   b. Need[i] ≤ Work
   If no such i exists, go to Step 4.
3. Work := Work + Allocation[i]; Finish[i] := T; go to Step 2.
4. If Finish[i] = T for all i, then the system is in a safe state.
Where Allocation[i] and Need[i] are the i-th rows of Allocation and Need, respectively;
X ≤ Y if X[i] ≤ Y[i] for all i, and X < Y if X ≤ Y and Y ≠ X.
(n: # of processes, m: # of resource types)
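The safety algorithm translates almost line-for-line into C (the `is_safe` name and the fixed 5×3 dimensions, matching the worked example later in this chapter, are mine):

```c
#define NPROC 5
#define NRES  3

/* Banker's safety algorithm: returns 1 if the state is safe and records
 * a safe sequence of process indices in seq[]. */
int is_safe(int avail[NRES], int alloc[NPROC][NRES],
            int need[NPROC][NRES], int seq[NPROC]) {
    int work[NRES], finish[NPROC] = {0}, count = 0, progress = 1;
    for (int j = 0; j < NRES; j++) work[j] = avail[j];   /* Step 1 */
    while (progress) {
        progress = 0;
        for (int i = 0; i < NPROC; i++) {                /* Step 2 */
            if (finish[i]) continue;
            int ok = 1;
            for (int j = 0; j < NRES; j++)
                if (need[i][j] > work[j]) { ok = 0; break; }
            if (ok) {                                    /* Step 3: Pi can finish */
                for (int j = 0; j < NRES; j++) work[j] += alloc[i][j];
                finish[i] = 1;
                seq[count++] = i;
                progress = 1;
            }
        }
    }
    return count == NPROC;                               /* Step 4 */
}
```

Running it on the A/B/C example that follows (Available = (3,3,2)) confirms the state is safe, starting with P1; with no resources available, the same allocation is reported unsafe.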

Deadlock Avoidance – Banker’s


Algorithm
ƒ Resource-Request Algorithm
Requesti[j] = k: Pi requests k instances of resource type Rj
1. If Requesti ≤ Needi, go to Step 2; otherwise, raise an error (Pi has exceeded its maximum claim).
2. If Requesti ≤ Available, go to Step 3; otherwise, Pi must wait.
3. Have the system pretend to have allocated the resources to Pi:
   Available := Available – Requesti;
   Allocationi := Allocationi + Requesti;
   Needi := Needi – Requesti;
   Execute the Safety Algorithm. If the resulting state is safe, the request is granted; otherwise, Pi must wait, and the old resource-allocation state is restored!
Deadlock Avoidance
ƒ An Example

     Allocation   Max     Need    Available
     A B C        A B C   A B C   A B C
P0   0 1 0        7 5 3   7 4 3   3 3 2
P1   2 0 0        3 2 2   1 2 2
P2   3 0 2        9 0 2   6 0 0
P3   2 1 1        2 2 2   0 1 1
P4   0 0 2        4 3 3   4 3 1

• A safe state ∵ <P1, P3, P4, P2, P0> is a safe sequence.

Deadlock Avoidance
Let P1 make a request Request1 = (1,0,2): Request1 ≤ Available, since (1,0,2) ≤ (3,3,2).
     Allocation   Need    Available
     A B C        A B C   A B C
P0   0 1 0        7 4 3   2 3 0
P1   3 0 2        0 2 0
P2   3 0 2        6 0 0
P3   2 1 1        0 1 1
P4   0 0 2        4 3 1

→ Safe ∵ <P1, P3, P4, P0, P2> is a safe sequence!
• If Request4 = (3,3,0) is asked later, it must be rejected (the resources are unavailable).
• Request0 = (0,2,0) must be rejected because it would result in an unsafe state.
Deadlock Detection
ƒ Motivation:
ƒ Have high resource utilization and
“maybe” a lower possibility of deadlock
occurrence.
ƒ Overheads:
ƒ Cost of information maintenance
ƒ Cost of executing a detection algorithm
ƒ Potential loss inherent from a deadlock
recovery


Deadlock Detection – Single


Instance per Resource Type
(Figure: a resource-allocation graph over P1–P5 and the corresponding wait-for graph.)
ƒ Collapse the resource-allocation graph into a wait-for graph: an edge Pi → Rq → Pj becomes Pi → Pj.
ƒ A deadlock exists iff the wait-for graph contains a cycle.
• Detect a cycle in O(n²).
• The system needs to maintain the wait-for graph.
Deadlock Detection – Multiple
Instance per Resource Type
ƒ Data Structures (n: # of processes, m: # of resource types)
ƒ Available[1..m]: # of available resource instances
ƒ Allocation[1..n, 1..m]: current resource allocation to each process
ƒ Request[1..n, 1..m]: the current request of each process
ƒ If Request[i,j] = k, Pi requests k more instances of resource type Rj

Deadlock Detection – Multiple


Instance per Resource Type
1. Work := Available. For i = 1, 2, …, n: if Allocation[i] ≠ 0, then Finish[i] := F; otherwise, Finish[i] := T.
2. Find an i such that both
   a. Finish[i] = F
   b. Request[i] ≤ Work
   If no such i exists, go to Step 4.
3. Work := Work + Allocation[i]; Finish[i] := T; go to Step 2.
4. If Finish[i] = F for some i, then the system is in a deadlock state, and each process Pi with Finish[i] = F is deadlocked.
(Complexity = O(m × n²))
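The detection algorithm differs from the banker's safety check only in using Request instead of Need and in the initialization of Finish. A sketch (the `detect` name and fixed 5×3 dimensions, matching the example that follows, are mine):

```c
#define NP2 5
#define NR2 3

/* Deadlock detection: returns the number of deadlocked processes and
 * marks them in deadlocked[]. */
int detect(int avail[NR2], int alloc[NP2][NR2],
           int req[NP2][NR2], int deadlocked[NP2]) {
    int work[NR2], finish[NP2], progress = 1, n = 0;
    for (int j = 0; j < NR2; j++) work[j] = avail[j];
    for (int i = 0; i < NP2; i++) {
        finish[i] = 1;                   /* a process holding nothing cannot block others */
        for (int j = 0; j < NR2; j++)
            if (alloc[i][j] != 0) { finish[i] = 0; break; }
    }
    while (progress) {
        progress = 0;
        for (int i = 0; i < NP2; i++) {
            if (finish[i]) continue;
            int ok = 1;
            for (int j = 0; j < NR2; j++)
                if (req[i][j] > work[j]) { ok = 0; break; }
            if (ok) {                    /* Pi can finish; reclaim its allocation */
                for (int j = 0; j < NR2; j++) work[j] += alloc[i][j];
                finish[i] = 1;
                progress = 1;
            }
        }
    }
    for (int i = 0; i < NP2; i++) {
        deadlocked[i] = !finish[i];
        n += deadlocked[i];
    }
    return n;
}
```

On the example below it reports no deadlock; changing P2's request to (0,0,1) makes P1–P4 deadlocked.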
Deadlock Detection – Multiple
Instances per Resource Type
ƒ An Example
     Allocation   Request   Available
     A B C        A B C     A B C
P0   0 1 0        0 0 0     0 2 0
P1   2 0 0        2 0 2
P2   3 0 3        0 0 0
P3   2 1 1        1 0 0
P4   0 0 2        0 0 2

→ A sequence <P0, P2, P3, P1, P4> exists such that Finish[i] = T for all i, so the system is not deadlocked.
ƒ If P2 now issues Request2 = (0,0,1), then P1, P2, P3, and P4 become deadlocked.

Deadlock Detection – Algorithm


Usage
ƒ When should we invoke the detection
algorithm?
ƒ How often is a deadlock likely to occur?
ƒ How many processes will be affected by a
deadlock?
ƒ Trade-off: invoking detection on every rejected request minimizes the number of processes affected by a deadlock but maximizes overhead; invoking it infrequently does the opposite.
ƒ Time for Deadlock Detection?
ƒ CPU Threshold? Detection Frequency? …

Deadlock Recovery
ƒ Whose responsibility to deal with
deadlocks?
ƒ Operator deals with the deadlock
manually.
ƒ The system recover from the
deadlock automatically.
ƒ Possible Solutions
ƒ Abort one or more processes to
break the circular wait.
ƒ Preempt some resources from one or
more deadlocked processes.


Deadlock Recovery – Process


Termination
ƒ Process Termination
ƒ Abort all deadlocked processes!
ƒ Simple but costly!
ƒ Abort one process at a time until the deadlock
cycle is broken!
ƒ Overheads for running the detection again and
again.
ƒ The difficulty in selecting a victim!

But, can we abort any process?


Should we compensate any
damage caused by aborting?
Deadlock Recovery – Process
Termination
ƒ What should be considered in choosing
a victim?
ƒ Process priority
ƒ The CPU time consumed and to be
consumed by a process.
ƒ The numbers and types of resources
used and needed by a process
ƒ Process’s characteristics such as
“interactive or batch”
ƒ The number of processes needed to be
aborted.


Deadlock Recovery – Resource


Preemption
ƒ Goal: Preempt some resources from processes
and give them to other processes until the
deadlock cycle is broken!
ƒ Issues
ƒ Selecting a victim:
ƒ It must be cost-effective!
ƒ Roll-Back
ƒ How far should we roll back a process whose resources
were preempted?
ƒ Starvation
ƒ Will we keep picking up the same process as a victim?
ƒ How to control the # of rollbacks per process efficiently?
Deadlock Recovery –
Combined Approaches
ƒ Partition resources into classes that
are hierarchically ordered.
⇒ No deadlock involves more than
one class
ƒ Handle deadlocks in each class
independently


Deadlock Recovery –
Combined Approaches
Examples:
ƒ Internal Resources: Resources used by the
system, e.g., PCB
→ Prevention through resource ordering
ƒ Central Memory: User Memory
→ Prevention through resource preemption
ƒ Job Resources: Assignable devices and files
→ Avoidance ∵ This info may be obtained!
ƒ Swappable Space: Space for each user process
on the backing store
→ Pre-allocation ∵ the maximum need is known!

Chapter 9. Memory Management

Memory Management
ƒ Motivation
ƒ Keep several processes in memory
to improve a system’s performance
ƒ Selection of different memory
management methods
ƒ Application-dependent
ƒ Hardware-dependent
ƒ Memory – A large array of words or
bytes, each with its own address.
ƒ Memory is always too small!

Memory Management
ƒ The Viewpoint of the Memory Unit
ƒ A stream of memory addresses!
ƒ What should be done?
ƒ Track which areas are free or used (and by whom)
ƒ Decide which processes get memory
ƒ Perform allocation and de-allocation
ƒ Remark:
ƒ Interaction between CPU scheduling
and memory allocation!
Background
ƒ Address Binding – the binding of instructions and data to memory addresses
ƒ At compile time: if it is known where the program will reside in memory, absolute code can be generated (e.g., MS-DOS *.COM programs).
ƒ At load time: the compiler generates relocatable code, and binding is delayed until the load module is loaded; the addresses are then fixed while the program runs.
ƒ At execution time: binding may change as the program runs.
[Figure: a source program with symbolic addresses (e.g., x) is compiled into an object module with relocatable addresses, linked with other object modules and system libraries into a load module, and loaded (possibly with dynamically loaded system libraries) into an in-memory binary image with absolute addresses.]
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004.

Background
ƒ Binding at compile time
ƒ A process must execute in a specific space in main memory.
ƒ Binding at load time
ƒ Relocatable code
ƒ A process may move from one memory segment to another → binding is delayed until run time.

Logical Versus Physical Address
[Figure: the CPU issues a logical address (e.g., 346); the MMU adds the relocation register (e.g., 14000) to form the physical address (e.g., 14346) sent to memory.]
ƒ The user program deals with logical addresses – virtual addresses (bound at run time).
ƒ The Memory Management Unit (MMU) is the hardware support for the mapping.

Logical Versus Physical Address


ƒ A logical (physical) address space is the set of logical (physical) addresses generated by a process. The physical addresses of a program are transparent to the process!
ƒ The MMU maps virtual addresses to physical addresses. Different memory-mapping schemes need different MMUs, which are hardware devices. (They may slow down memory access.)
ƒ Compile-time and load-time binding schemes result in logical and physical address spaces that coincide.
Dynamic Loading
ƒ Dynamic Loading
ƒ A routine will not be loaded until it is
called. A relocatable linking loader
must be called to load the desired
routine and change the program’s
address tables.
ƒ Advantage
ƒ Memory space is better utilized.
ƒ Users may use OS-provided
libraries to achieve dynamic loading

Dynamic Linking
ƒ Dynamic Linking
ƒ A small piece of code, called a stub, is used to locate or load the appropriate library routine.
ƒ Advantage: saves memory space by sharing the library code among processes → issues: memory protection & library updates!
ƒ Static Linking
ƒ The language library is combined with the program object module into the binary program image.
ƒ Advantage: simple.
Overlays
ƒ Motivation
ƒ Keep in memory only those instructions and data
needed at any given time.
ƒ Example: Two overlays of a two-pass assembler

[Figure: the symbol table (20KB), common routines (30KB), and the overlay driver (10KB) stay resident; Pass 1 (70KB) and Pass 2 (80KB) are overlaid in the remaining space.]
ƒ Certain relocation & linking algorithms are needed!

Overlays
ƒ Memory space is saved at the cost of
run-time I/O.
ƒ Overlays can be achieved w/o OS
support:
⇒ “absolute-address” code
ƒ However, it is not easy to program an overlay structure properly!
⇒ Need some sort of automatic
techniques that run a large program in a
limited physical memory!
Swapping

[Figure: a process p1 in the user space of main memory is swapped out to the backing store, and process p2 is swapped in.]
ƒ Should a process be put back into the same memory space that it occupied previously?
↔ It depends on the binding scheme!
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004.

Swapping
ƒ A Naive Way
ƒ The dispatcher picks up a process from the ready queue and checks whether it is in memory. If so, the CPU is dispatched to the process; otherwise, the process is swapped in first.
ƒ Potentially high context-switch cost, e.g., for a 1000KB process, a 5000KB/s transfer rate, and an 8ms latency delay:
2 * (1000KB/5000KBps + 8ms) = 416ms (swap out + swap in)
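The 416ms figure above can be reproduced with a small sketch (the function name and structure are mine, not the slides'):

```python
# Swap cost = transfer time + latency, once for the swap out and once
# for the swap in (1000KB process, 5000KB/s disk, 8ms latency).

def swap_time_ms(size_kb, rate_kb_per_s, latency_ms):
    transfer_ms = size_kb / rate_kb_per_s * 1000  # transfer time in ms
    return transfer_ms + latency_ms               # one direction

cost = 2 * swap_time_ms(1000, 5000, 8)  # swap out + swap in
print(cost)  # 416.0 ms, matching the slide
```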
Swapping
ƒ The execution time of each process should be long relative to the swapping time (e.g., 416ms in the last example)!
ƒ Only swap in what is actually used – e.g., swapping 100KB at 1000KB/s takes 100ms per direction instead of moving the full image. ⇒ Users must keep the system informed of memory usage.
ƒ Who should be swapped out?
ƒ "Lower-priority" processes?
ƒ Any constraint? E.g., a process Pi waiting for I/O into its own I/O buffer must not be swapped out while the I/O is pending.
⇒ System design

Swapping
ƒ Separate swapping space from the
file system for efficient usage
ƒ Swapping is disabled whenever possible – in many versions of UNIX, swapping is triggered only if the memory usage passes a threshold and many processes are running!
ƒ In Windows 3.1, a swapped-out
process is not swapped in until the
user selects the process to run.

Contiguous Allocation – Single User
[Figure: memory from address 0000 to 8888 holds the OS, the user area (addresses a to b), and unused space; the relocation register holds a and the limit register holds b.]
ƒ A single user is allocated as much memory as needed.
ƒ Problem: size restriction → overlays (MS-DOS)

Contiguous Allocation – Single User
ƒ Hardware support for memory mapping and protection
[Figure: the CPU's logical address is compared with the limit register; if it is smaller, the relocation register is added to form the physical address, otherwise a trap is raised.]
ƒ Disadvantage: wasting of CPU and resources ∵ no multiprogramming is possible.
Contiguous Allocation – Multiple Users
ƒ Fixed Partitions
ƒ Memory is divided into fixed partitions, e.g., OS/360 MFT.
ƒ A process is allocated an entire partition; the unused part is "fragmentation".
ƒ An OS data structure records the partitions:
#  size   location  status
1  25KB   20k       Used (proc 1)
2  15KB   45k       Used (proc 7)
3  30KB   60k       Used (proc 5)
4  10KB   90k       Free

Contiguous Allocation – Multiple Users

ƒ Hardware Supports
ƒ Bound registers
ƒ Each partition may have a
protection key (corresponding to a
key in the current PSW)
ƒ Disadvantage:
ƒ Fragmentation gives poor memory
utilization !

Contiguous Allocation – Multiple Users
ƒ Dynamic Partitions
ƒ Partitions are dynamically created.
ƒ OS tables record free and used partitions.
[Figure: memory holds the OS (0–20k), Process 1 (20k–40k), a free area (40k–70k), Process 2 (70k–90k), and a free area (90k–110k). The used table records (base=20k, size=20KB, user=1) and (base=70k, size=20KB, user=2); the free table records (base=40k, size=30KB) and (base=90k, size=20KB). The input queue holds P3 with a 40KB memory request!]

Contiguous Allocation – Multiple Users
ƒ Solutions for dynamic storage allocation:
ƒ First Fit – allocate the first hole that is big enough.
ƒ Advantage: fast, and likely to leave large chunks of memory at high memory locations.
ƒ Best Fit – allocate the smallest hole that is big enough. → It might need a lot of search time and creates lots of small fragments!
ƒ Advantage: large chunks of memory remain available.
ƒ Worst Fit – allocate the largest hole and create a new partition out of the leftover!
ƒ Advantage: leaves the largest leftover holes, but with lots of search time!
ƒ First fit and best fit are better than worst fit in both time and storage usage.
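The three placement strategies above can be sketched over a free list of (base, size) holes; the function name, hole list, and request sizes below are made up for illustration:

```python
# First/best/worst fit over a free list of (base, size) holes.

def pick_hole(holes, request, strategy):
    fits = [h for h in holes if h[1] >= request]  # holes big enough
    if not fits:
        return None                               # no hole fits
    if strategy == "first":
        return fits[0]                            # first adequate hole
    if strategy == "best":
        return min(fits, key=lambda h: h[1])      # smallest adequate hole
    if strategy == "worst":
        return max(fits, key=lambda h: h[1])      # largest hole
    raise ValueError(strategy)

holes = [(40, 60), (90, 30), (150, 100)]  # (base, size in KB), in base order
print(pick_hole(holes, 25, "first"))  # (40, 60)
print(pick_hole(holes, 25, "best"))   # (90, 30)
print(pick_hole(holes, 25, "worst"))  # (150, 100)
```

Note how best fit scans all holes while first fit can stop early, matching the search-time remarks on the slide.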
Contiguous Allocation Example – First Fit
(RR scheduler with quantum = 1; 2560KB of memory, with 400KB for the OS)
Job queue – process (memory, time): P1 (600KB, 10), P2 (1000KB, 5), P3 (300KB, 20), P4 (700KB, 8), P5 (500KB, 15)
ƒ Time 0: P1 is loaded at 400k–1000k, P2 at 1000k–2000k, and P3 at 2000k–2300k; 260KB (2300k–2560k) remains free.
ƒ Time 14: P2 terminates & frees its memory; P4 (700KB) is loaded first-fit at 1000k–1700k, leaving a 300KB hole (1700k–2000k).
ƒ P5 (500KB) cannot be loaded yet – the free holes (300KB + 260KB = 560KB) are not contiguous!
ƒ Time 28: P1 terminates; P5 is loaded first-fit at 400k–900k.

Fragmentation – Dynamic Partitions


ƒ External fragmentation occurs as small
chunks of memory accumulate as a by-
product of partitioning due to imperfect fits.
ƒ Statistical analysis of the first-fit algorithm:
ƒ Given N allocated blocks, about 0.5N blocks are lost to fragmentation, i.e., 1/3 of memory is unusable – the 50-percent rule
ƒ Solutions:
a. Merge adjacent free areas.
b. Compaction
- Compact all free areas into one contiguous region
- Requires user processes to be relocatable
Any optimal compaction strategy???
Fragmentation – Dynamic Partitions
[Figure: memory holds the OS (0–300K), P1 (300–500K), P2 (500–600K), a 400KB hole, P3 (1000–1200K), a 300KB hole, P4 (1500–1900K), and a 200KB hole (1900–2100K). Three ways to compact the 900KB of holes into one region: move P3 and P4 down next to P2 (600KB moved); move only P4 into the 400KB hole, leaving 1200–2100K free (400KB moved); or move only P3 to the top of memory, leaving 600–1500K free (200KB moved).]
ƒ Cost of finding an optimal strategy: time complexity O(n!)?!!
ƒ Combination of swapping and compaction
ƒ Dynamic/static relocation

Fragmentation – Dynamic Partitions
ƒ Internal fragmentation: a small chunk of "unused" memory internal to a partition.
ƒ Example: a 20,002-byte hole between P1 and P2; P3 requests 20KB. Should we give P3 20KB and leave a 2-byte free area??
ƒ To reduce the free-space maintenance cost, give all 20,002 bytes to P3 and keep the extra 2 bytes as internal fragmentation!
Fragmentation – Dynamic Partitions
ƒ Dynamic Partitioning:
ƒ Advantage:
⇒ Eliminate fragmentation to some degree
⇒ Can have more partitions and a higher degree
of multiprogramming
ƒ Disadvantage:
ƒ Compaction vs Fragmentation
ƒ The amount of free memory may not be enough for a
process! (contiguous allocation)
ƒ Memory locations may be allocated but never
referenced.
ƒ Relocation Hardware Cost & Slow Down
⇒ Solution: Paged Memory!

Paging
ƒ Objective
ƒ Users see a logically contiguous address
space although its physical addresses are
throughout physical memory
ƒ Units of Memory and Backing Store
ƒ Physical memory is divided into fixed-sized
blocks called frames.
ƒ The logical memory space of each process
is divided into blocks of the same size
called pages.
ƒ The backing store is also divided into
blocks of the same size if used.
Paging – Basic Method
[Figure: the CPU generates a logical address (page number p, page offset d); the page table maps p to a frame number f; the physical address is (f, d), i.e., the base address of frame f plus d.]

Paging – Basic Method
ƒ Address Translation
ƒ An m-bit logical address is split into a page number p (the high-order m-n bits) and a page offset d (the low-order n bits).
ƒ Maximum number of pages: 2^(m-n); logical address space: 2^m; page size: 2^n. (The physical address space is a separate question.)
ƒ A page size tends to be a power of 2 for efficient address translation.
ƒ The actual page size depends on the computer architecture. Today, it ranges from 512B to 16KB.
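The p/d split can be sketched in a few lines; the 4-byte page size and page-table contents below come from the example on the next slide (the function name is mine):

```python
# Paging address translation with 4-byte pages (2 offset bits).

PAGE_SIZE = 4                          # bytes per page (a power of 2)
page_table = {0: 5, 1: 6, 2: 1, 3: 2}  # page -> frame, as in the example

def translate(logical):
    p, d = divmod(logical, PAGE_SIZE)  # split into page number / offset
    f = page_table[p]                  # page-table lookup
    return f * PAGE_SIZE + d           # frame base + offset

print(translate(5))  # page 1, offset 1 -> frame 6 -> 6*4 + 1 = 25
```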
Paging – Basic Method
[Figure: a 16-byte logical memory holds pages 0–3 (A, B, C, D, 4 bytes each). The page table maps page 0→frame 5, 1→6, 2→1, and 3→2, so physical memory holds C at bytes 4–7, D at bytes 8–11, A at bytes 20–23, and B at bytes 24–27.]
ƒ Example: logical address 5 (binary 01 01) is page 1, offset 1 (1*4+1=5); page 1 maps to frame 6, so the physical address is 6*4+1 = 25 (binary 110 01).

Paging – Basic Method
ƒ No External Fragmentation
ƒ Paging is a form of dynamic relocation.
ƒ The average internal fragmentation is about one-half page per process.
ƒ The page size generally grows over time as processes, data sets, and memory have become larger.
ƒ 4-byte page-table entries & 4KB pages → 2^32 frames * 2^12 B = 2^44 B = 16TB of physical memory.
ƒ Trade-offs of a larger page size: better disk I/O efficiency and less page-table maintenance, but more internal fragmentation.
* Example: 8KB or 4MB for Solaris.
Paging – Basic Method
ƒ Page Replacement:
ƒ An executing process has all of its pages
in physical memory.
ƒ Maintenance of the Frame Table
ƒ One entry for each physical frame
ƒ The status of each frame (free or allocated)
and its owner
ƒ The page table of each process must be
saved when the process is preempted. Æ
Paging increases context-switch time!

Paging – Hardware Support
ƒ Page Tables
ƒ Where are they kept: registers or memory?
ƒ Efficiency is the main consideration!
ƒ The use of registers for page tables
ƒ The page table must be small!
ƒ The use of memory for page tables
ƒ A Page-Table Base Register (PTBR) points to the page table in memory.
Paging – Hardware Support
ƒ Page Tables on Memory
ƒ Advantages:
ƒ The size of a page table is unlimited!
ƒ The context switch cost may be low if the
CPU dispatcher merely changes PTBR,
instead of reloading another page table.
ƒ Disadvantages:
ƒ Memory access is slowed by a factor of 2
ƒ Translation Look-aside Buffers (TLB)
ƒ Associative, high-speed memory
ƒ (key/tag, value) pairs – 16 ~ 1024 entries
ƒ Lookup adds less than 10% to the memory access time

Paging – Hardware Support
ƒ Translation Look-aside Buffers (TLB):
ƒ Disadvantages: expensive hardware, and contents must be flushed when switching page tables
ƒ Advantage: fast – constant search time
[Figure: the TLB is searched associatively by key (tag); a matching entry returns its value (the frame number) directly.]
Paging – Hardware Support
[Figure: the CPU's logical address (p, d) is first looked up in the TLB. On a TLB hit, the frame number f comes out directly and the physical address (f, d) goes to physical memory. On a TLB miss, the page table is consulted and the TLB is updated – replacing an entry if the TLB is full.]
* Address-Space Identifiers (ASIDs) in the TLB can distinguish processes; otherwise matching, protection, and flushing on a switch of page tables must be considered.

Paging – Effective Memory Access Time
ƒ Hit Ratio = the percentage of times that a page number is found in the TLB.
ƒ The hit ratio of a TLB largely depends on the size and the replacement strategy of TLB entries!
ƒ Effective Memory Access Time = Hit-Ratio * (TLB lookup + a mapped memory access) + (1 – Hit-Ratio) * (TLB lookup + a page-table lookup + a mapped memory access)
Paging – Effective Memory
Access Time
ƒ An Example
ƒ 20ns per TLB lookup, 100ns per memory
access
ƒ Effective Access Time = 0.8*120ns
+0.2*220ns = 140 ns, when hit ratio = 80%
ƒ Effective access time = 0.98*120ns
+0.02*220ns = 122 ns, when hit ratio = 98%
ƒ Intel 486 has a 32-register TLB and claims a
98 percent hit ratio.
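The two numbers above can be reproduced directly from the formula (the function name and defaults are mine; the 20ns/100ns figures are the slide's):

```python
# Effective memory access time with a TLB: 20ns TLB lookup, 100ns memory.

def eat(hit_ratio, tlb=20, mem=100):
    hit = tlb + mem          # TLB hit: lookup + one memory access
    miss = tlb + mem + mem   # miss: lookup + page-table access + memory access
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(round(eat(0.80), 2))  # 140.0 ns
print(round(eat(0.98), 2))  # 122.0 ns
```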

Paging – Protection & Sharing
ƒ Protection
[Figure: besides the frame number, each page-table entry carries a valid–invalid bit (is this a valid page, and is it in memory?), a dirty bit (modified?), and r/w/e protection bits (e.g., 100 = read-only, 010 = write-only, 110 = read-write).]
ƒ Use a Page-Table Length Register (PTLR) to indicate the size of the page table.
ƒ Unused page-table entries might be ignored during maintenance.
Paging – Protection & Sharing
ƒ Example: a 12,287-byte process with 2KB pages in a 16,384-byte (2^14) logical space
[Figure: pages P0–P5 are valid and mapped to frames 2, 3, 4, 7, 8, 9; page-table entries 6 and 7 are invalid. Addresses 10,468–12,287 lie in page 5, so part of page 5 is internal fragmentation. (How many entries should the PTLR cover?)]
ƒ A logical address such as (p=3, d=11) maps through entry 3 to frame 7.

Paging – Protection & Sharing
[Figure: processes P1 and P2 share the editor pages ed1, ed2, ed3, both mapping them to frames 3, 4, and 6, while each has its own data page (frame 1 for data1, frame 7 for data2).]
ƒ Procedures that are executed often (e.g., an editor) can be divided into procedure + data; a lot of memory can be saved.
ƒ Reentrant procedures can be shared! The non-modified nature of the shared code must be enforced.
ƒ Address referencing inside shared pages could be an issue.
Multilevel Paging
ƒ Motivation
ƒ The logical address space of a process in many modern computer systems is very large, e.g., 2^32 to 2^64 bytes.
ƒ 32-bit addresses with 4KB pages → 2^20 page-table entries → at 4B per entry, a 4MB page table!
→ Even the page table must be divided into pieces to fit in the memory!
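The 4MB figure is simple arithmetic, sketched here with made-up variable names:

```python
# Page-table size for a flat table: 32-bit addresses, 4KB pages, 4B entries.

address_bits = 32
page_size = 4 * 1024   # 4KB -> 12 offset bits
entry_size = 4         # bytes per page-table entry

entries = 2 ** address_bits // page_size  # 2^20 entries
table_bytes = entries * entry_size        # 4MB per process
print(entries, table_bytes)  # 1048576 4194304
```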

Multilevel Paging – Two-Level Paging
[Figure: the logical address is split into (p1, p2, d). The PTBR locates the outer page table; p1 indexes the outer page table to find a page of the page table; p2 indexes that page to find the frame; d is the offset within the frame. This is a forward-mapped page table.]
Multilevel Paging – N-Level Paging
ƒ Motivation: two-level paging is not appropriate for a huge logical address space!
[Figure: the logical address is split into (p1, p2, …, pn, d); the PTBR and p1…pn walk through n pieces of page table to reach the frame.]
ƒ Cost per reference without a TLB: 1 + 1 + … + 1 + 1 = n+1 memory accesses.

Multilevel Paging – N-Level Paging
ƒ Example
ƒ 98% hit ratio, 4-level paging, 20ns TLB access time, 100ns memory access time.
ƒ Effective access time = 0.98 * 120ns + 0.02 * 520ns = 128ns
ƒ SUN SPARC (32-bit addressing) → 3-level paging
ƒ Motorola 68030 (32-bit addressing) → 4-level paging
ƒ VAX (32-bit addressing) → 2-level paging
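Generalizing the earlier TLB formula to an n-level walk gives the 128ns figure (function name and defaults are mine; the numbers are the slide's):

```python
# Effective access time with n-level paging: a miss walks n table levels
# before the final memory access.

def eat_nlevel(hit_ratio, levels, tlb=20, mem=100):
    hit = tlb + mem                   # frame number found in the TLB
    miss = tlb + (levels + 1) * mem   # n table accesses + the data access
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(round(eat_nlevel(0.98, 4), 2))  # 0.98*120 + 0.02*520 = 128.0 ns
```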
Hashed Page Tables
ƒ Objective: to handle large address spaces
ƒ The virtual page number is hashed into the table; each entry chains a linked list of elements (virtual page #, frame #, pointer to the next element).
ƒ Clustered Page Tables
ƒ Each entry contains the mappings for several physical-page frames, e.g., 16.


Inverted Page Table
ƒ Motivation
ƒ A page table tends to be big and does not correspond to the # of pages residing in the physical memory.
ƒ In an inverted page table, each entry corresponds to a physical frame.
ƒ Virtual Address: <Process ID, Page Number, Offset>
[Figure: the CPU issues <pid, p, d>; the inverted page table is searched for an entry matching (pid, p); the index i of the matching entry is the frame number, and the physical address is <i, d>.]
Inverted Page Table
ƒ Each entry contains the virtual address of the page held in the frame.
ƒ Entries are sorted by physical address; one table serves the whole system.
ƒ When no match is found, the page table of the corresponding process must be referenced.
ƒ Example systems: HP Spectrum, IBM RT, PowerPC, SUN UltraSPARC.

Inverted Page Table


ƒ Advantage
ƒ Decrease the amount of memory needed
to store each page table
ƒ Disadvantage
ƒ The inverted page table is sorted by physical address, whereas a page reference is in a logical address.
ƒ A hash table can eliminate the lengthy table lookup: one hash-table access plus one inverted-page-table access per reference.
ƒ An associative memory can hold recently located entries.
ƒ Difficult to implement with shared memory
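The hash-assisted lookup can be sketched as follows; the data structures and names are mine, not a real system's:

```python
# Hashed inverted page table: one entry per frame, plus a hash table
# mapping (pid, page) to its frame so no linear scan is needed.

frames = {}  # frame -> (pid, page): the inverted table itself
lookup = {}  # hash table: (pid, page) -> frame

def map_page(pid, page, frame):
    frames[frame] = (pid, page)
    lookup[(pid, page)] = frame

def translate(pid, page, offset, page_size=4096):
    frame = lookup.get((pid, page))  # one hash access instead of a scan
    if frame is None:
        raise KeyError("no match: consult the process's own page table")
    return frame * page_size + offset

map_page(pid=7, page=3, frame=12)
print(translate(7, 3, 100))  # 12*4096 + 100 = 49252
```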
Segmentation
ƒ Segmentation is a memory-management scheme that supports the user view of memory:
ƒ A logical address space is a collection of segments with variable lengths.
[Figure: a program as its user sees it – main program, subroutine, Sqrt, stack, and symbol-table segments in a logical address space.]

Segmentation
ƒ Why Segmentation?
ƒ Paging separates the user’s view of
memory from the actual physical
memory but does not reflect the logical
units of a process!
ƒ Pages & frames are fixed-sized, but
segments have variable sizes.
ƒ For simplicity of representation, <segment name, offset> → <segment number, offset>

Segmentation – Hardware Support
ƒ Address Mapping
[Figure: the CPU's logical address (s, d) indexes the segment table by s to obtain (limit, base). If d < limit, the physical address is base + d; otherwise a trap (addressing error) is raised.]
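The limit check and base addition can be sketched directly; the segment-table contents below are made up for illustration:

```python
# Segment-table translation: entries are (limit, base) pairs.

segment_table = {0: (1000, 1400), 1: (400, 6300), 2: (1100, 4300)}

def translate(s, d):
    limit, base = segment_table[s]
    if d >= limit:                         # offset outside the segment
        raise MemoryError("trap: addressing error")
    return base + d                        # physical address

print(translate(2, 53))  # 4300 + 53 = 4353
```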

Segmentation – Hardware Support
ƒ Implementation in registers – limited size!
ƒ Implementation in memory
ƒ A segment-table base register (STBR) locates the segment table; a segment-table length register (STLR) gives its size.
ƒ Advantages & disadvantages – as for paging
ƒ Use an associative memory (TLB) to improve the effective memory access time!
ƒ The TLB must be flushed whenever a new segment table is used!
Segmentation – Protection & Sharing
ƒ Advantage:
ƒ Each segment is a semantically defined portion of the program and is likely to have all of its entries be "homogeneous".
ƒ Example: Array, code, stack, data, etc.
Æ Logical units for protection !
ƒ Sharing of code & data improves memory usage.
ƒ Sharing occurs at the segment level.


Segmentation – Protection & Sharing
ƒ Potential Problems
ƒ External Fragmentation
ƒ Segments must occupy contiguous memory.
ƒ Address referencing inside shared segments can be a big issue:
ƒ Should all shared code segments have the same segment number, or should indirect addressing be used?
ƒ How to find the right segment number if the number of users sharing the segments increases! → Avoid references to segment #s.
Segmentation – Fragmentation
ƒ Motivation: segments are of variable lengths!
→ Memory allocation is a dynamic storage-allocation problem.
ƒ best-fit? first-fit? worst-fit?
ƒ External fragmentation will occur!!
ƒ It depends on factors such as the average segment size:
ƒ As segment sizes shrink, external fragmentation decreases – but at one byte per segment, the overhead (base+limit "registers") increases substantially!

Segmentation – Fragmentation

ƒ Remark:
ƒ Its external fragmentation problem is
better than that of the dynamic
partition method because segments
are likely to be smaller than the
entire process.
ƒ Internal Fragmentation??

Segmentation with Paging

ƒ Motivation :
ƒ Segmentation has external fragmentation.
ƒ Paging has internal fragmentation.
ƒ Segments are semantically defined portions of a program.
→ "Page" the segments!


Paged Segmentation – Intel 80386
ƒ 8K Private Segments + 8K Public Segments
ƒ Page Size = 4KB, Max Segment Size = 4GB
ƒ Tables:
ƒ Local Descriptor Table (LDT)
ƒ Global Descriptor Table (GDT)
ƒ 6 microprogram segment registers for caching
ƒ Logical address: selector (s: 13 bits, g: 1 bit, p: 2 bits) + segment offset sd (32 bits)
ƒ Linear address: p1 (10 bits) | p2 (10 bits) | d (12 bits)
Paged Segmentation – Intel 80386
[Figure: the 16-bit selector (s+g+p) indexes a descriptor table to obtain the segment length and base. If the 32-bit offset sd exceeds the segment length, a trap is raised; otherwise base + sd forms the linear address (p1, p2, d). The page directory (located through the page-directory base register) is indexed by p1, the selected page table by p2, and the resulting frame f combined with d gives the physical address.]
* Page tables are limited by the segment lengths of their segments.

Paging and Segmentation


ƒ To overcome disadvantages of paging or
segmentation alone:
ƒ Paged segments – divide each segment further into pages.
ƒ A segment need not be in contiguous memory.
ƒ Segmented paging – segment the page table.
ƒ Useful for variable-size page tables.
ƒ Address translation overheads increase!
ƒ An entire process still needs to be in memory at once!
→ Virtual Memory!!
Paging and Segmentation
ƒ Considerations in Memory Management
ƒ Hardware Support, e.g., STBR, TLB, etc.
ƒ Performance
ƒ Fragmentation
ƒ Multiprogramming Levels
ƒ Relocation Constraints?
ƒ Swapping: +
ƒ Sharing?!
ƒ Protection?!



Chapter 10. Virtual Memory

Virtual Memory
ƒ Virtual Memory
ƒ A technique that allows the execution
of a process that may not be
completely in memory.
ƒ Motivation:
ƒ An entire program in execution may
not all be needed at the same time!
ƒ e.g., error-handling routines, large arrays, certain program features, etc.

Virtual Memory
ƒ Potential Benefits
ƒ Programs can be much larger than the
amount of physical memory. Users can
concentrate on their problem programming.
ƒ The level of multiprogramming increases
because processes occupy less physical
memory.
ƒ Each user program may run faster because
less I/O is needed for loading or swapping
user programs.
ƒ Implementation: demand paging,
demand segmentation (more difficult),etc.

Demand Paging – Lazy Swapping
ƒ A process image may reside on the backing store. Rather than swapping in the entire process image, the lazy swapper swaps in a page only when it is needed!
ƒ Pure Demand Paging – Pager vs Swapper
ƒ A mechanism is required to recover from references to missing, non-resident pages.
ƒ A page fault occurs when a process references a non-memory-resident page.
Demand Paging – Lazy Swapping
[Figure: logical memory holds pages A–F. The valid–invalid bit in the page table marks the memory-resident pages (mapped to their frames in physical memory, e.g., frames 4, 6, and 9) with v; an i entry means the page is either an invalid page or a non-memory-resident page.]

A Procedure to Handle a Page Fault
1. Reference a page whose valid bit is not set.
2. Trap to the OS (a valid but disk-resident page).
3. Issue a "read" instruction & find a free frame.
4. Bring the missing page into the free frame.
5. Reset the page table (set the valid bit).
6. Return and re-execute the instruction.
A Procedure to Handle A
Page Fault
ƒ Pure Demand Paging:
ƒ Never bring in a page into the
memory until it is required!

ƒ Pre-Paging
ƒ Bring into the memory all of the
pages that “will” be needed at one
time!
ƒ Locality of reference


Hardware Support for Demand Paging
ƒ New Bits in the Page Table
ƒ To indicate that a page is now in
memory or not.

ƒ Secondary Storage
ƒ Swap space in the backing store
ƒ A continuous section of space in the
secondary storage for better
performance.

Crucial Issues
ƒ Example 1 – Cost of restarting an instruction
ƒ Assembly Instruction: Add a, b, c
ƒ Only a short job!
ƒ Re-fetch the instruction, decode,
fetch operands, execute, save, etc
ƒ Strategy:
ƒ Get all pages and restart the
instruction from the beginning!


Crucial Issues
ƒ Example 2 – Block-Moving Assembly Instruction
ƒ MVC x, y, 256 (IBM System 360/370)
ƒ Characteristics
ƒ More expensive to restart
ƒ "Self-modifying" when the operands overlap
[Figure: MVC x, y, 4 with overlapping x and y – a page fault in the middle leaves x partially destroyed, so a simple restart would return wrong results.]
ƒ Solutions:
ƒ Pre-load pages
ƒ Pre-save & recover before page-fault services
Crucial Issues
ƒ Example 3 – Addressing Modes with Side Effects
ƒ MOV (R2)+, -(R3): R2 is auto-incremented and R3 auto-decremented. If a page fault occurs after R2 and R3 have been modified, their effects must be undone before the instruction is restarted!

Performance of Demand Paging
ƒ Effective Access Time:
(1 - p) * ma + p * pft
ƒ ma: memory access time for paging
ƒ p: probability of a page fault
ƒ pft: page fault time
Performance of Demand Paging
ƒ Page fault time – major components
ƒ Components 1 & 3 (about 10^3 ns ~ 10^5 ns)
ƒ Service the page-fault interrupt
ƒ Restart the process
ƒ Component 2 (about 25ms)
ƒ Read in the page (multiprogramming complicates this – however, let's get the taste!)
ƒ pft ≈ 25ms = 25,000,000 ns
ƒ Effective access time (when ma = 100ns)
ƒ (1-p) * 100ns + p * 25,000,000 ns
ƒ = 100ns + 24,999,900ns * p

Performance of Demand Paging
ƒ Example (when ma = 100ns)
ƒ p = 1/1000
ƒ Effective access time ≈ 25,000 ns
→ slowed down by 250 times
ƒ How to allow only a 10% slow-down?
110 > 100 * (1-p) + 25,000,000 * p
p < 0.0000004, i.e., p < 1 / 2,500,000
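Both numbers from the example follow from the effective-access-time formula (names are mine; the constants are the slides'):

```python
# Demand-paging effective access time: ma = 100ns, pft = 25ms.

MA, PFT = 100, 25_000_000  # nanoseconds

def eat(p):
    return (1 - p) * MA + p * PFT

print(eat(1 / 1000))  # ≈ 25,100 ns: roughly 250 times slower than 100ns

# Maximum fault rate that keeps the slow-down under 10%:
p_max = (1.10 * MA - MA) / (PFT - MA)
print(p_max)          # ≈ 4e-7, i.e., fewer than 1 fault per 2.5 million accesses
```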
Performance of Demand Paging
ƒ How to keep the page fault rate low?
ƒ Effective Access Time ≈ 100ns +
24,999,900ns * p
ƒ Handling of Swap Space – A Way to
Reduce Page Fault Time (pft)
ƒ Disk I/O to swap space is generally faster
than that to the file system.
ƒ Preload processes into the swap space
before they start up.
ƒ Demand paging from file system but do page
replacement to the swap space. (BSD UNIX)

Process Creation
ƒ Copy-on-Write
ƒ Rapid process creation that reduces the number of new pages for the new process
ƒ fork(); execve()
ƒ Shared pages → copy-on-write pages
ƒ Only the pages that are modified are copied!
[Figure: P1 and P2 share the frames for ed1, ed2, ed3 (frames 3, 4, 6) and, initially, the data page (frame 1); a write to the data page triggers a private copy.]
* Windows 2000, Linux, and Solaris 2 support this feature!

Process Creation
ƒ Copy-on-Write
ƒ zero-fill-on-demand
ƒ Zero-filled pages, e.g., those for the
stack or bss.
ƒ vfork() vs fork() with copy-on-write
ƒ vfork() lets the parent and child processes share the page table and pages.
ƒ Where should the copy-on-write information for pages be kept?


Memory-Mapped Files
ƒ File blocks are mapped into virtual memory, so file writes might not cause any immediate disk write!
ƒ Solaris 2 uses memory-mapped files for open(), read(), write(), etc.
[Figure: blocks 1–6 of a disk file are mapped into the virtual memories of P1 and P2, at possibly different virtual pages, with the underlying physical frames shared.]

Page Replacement
ƒ Demand paging increases the multiprogramming level of a system by "potentially" over-allocating memory.
ƒ Total physical memory = 40 frames
ƒ Run six processes of size equal to ten pages, each actually using only five frames ⇒ 10 spare frames remain!
ƒ Most of the time, the average memory usage is close to the physical memory size if we increase a system's multiprogramming level!

Page Replacement
ƒ Q: Should we run a 7th process?
ƒ What if the six processes start to demand their full shares (6 * 10 > 40 frames)?
ƒ What to do if all memory is in use and more memory is needed?
ƒ Answers
ƒ Kill a user process!
ƒ But paging should be transparent to users?
ƒ Swap out a process!
ƒ Do page replacement!
Page Replacement
ƒ A Page-Fault Service
ƒ Find the desired page on the disk!
ƒ Find a free frame
ƒ Select a victim and write the victim
page out when there is no free
frame!
ƒ Read the desired page into the
selected frame.
ƒ Update the page and frame tables, and
restart the user process.
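The service steps above can be turned into a small simulator; this is an illustrative sketch (the FIFO victim choice and the three-frame setup are assumptions, not part of the slide):

```python
from collections import deque

def access(page, frames, free, resident, order):
    """Simulate one reference; return True if it page-faulted.
    frames:   dict page -> frame number (the 'page table')
    free:     list of free frame numbers
    resident: set of memory-resident pages
    order:    deque used for FIFO victim selection (an assumption)."""
    if page in resident:
        return False                      # no fault
    # 1. The desired page is located on the disk (not modeled here).
    if free:                              # 2. find a free frame
        frame = free.pop()
    else:                                 # 2a. select a victim, write it out
        victim = order.popleft()
        frame = frames.pop(victim)
        resident.remove(victim)
    frames[page] = frame                  # 3. read the page into the frame
    resident.add(page)                    # 4. update page and frame tables
    order.append(page)
    return True                          # restart the faulting instruction

frames, free, resident, order = {}, [0, 1, 2], set(), deque()
faults = sum(access(p, frames, free, resident, order)
             for p in [1, 2, 3, 1, 4, 5])
print(faults)  # 5 faults: pages 4 and 5 replace the oldest pages 1 and 2
```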
Page Replacement
[Figure: the logical memories and page tables of P1 (pages H, Load M, J, M) and P2 (pages A, B, D, E); valid (v) entries map to frames of physical memory, invalid (i) entries fault on reference. Frame 4 (M/B) illustrates replacing victim page M with the desired page B.]
Page Replacement
ƒ Two page transfers per page fault if no frame is available!
ƒ Modify (/Dirty) Bit – set by the hardware automatically; if the victim page is not dirty, its “swap out” can be “eliminated” => reduce I/O time by one-half.
[Figure: a page table whose entries carry a frame number, a valid-invalid bit, and a modify bit.]
Page Replacement
ƒ Two Major Pieces for Demand Paging
ƒ Frame Allocation Algorithms
ƒ How many frames are allocated to a
process?
ƒ Page Replacement Algorithms
ƒ When page replacement is required,
select the frame that is to be
replaced!
ƒ Goal: A low page fault rate!
ƒ Note that a bad replacement choice
does not cause any incorrect execution!
Page Replacement Algorithms
ƒ Evaluation of Algorithms
ƒ Calculate the number of page faults on
strings of memory references, called
reference strings, for a set of algorithms
ƒ Sources of Reference Strings
ƒ Reference strings are generated artificially.
ƒ Reference strings are recorded as system traces.
ƒ How to reduce the amount of data?
Page Replacement Algorithms
ƒ Two Observations to Reduce the Amount of Data:
ƒ Consider only the page numbers if the page size is fixed.
ƒ Reduce memory references into page references:
ƒ If a page p is referenced, any immediately following references to page p will never cause a page fault.
ƒ Reduce consecutive page references of page p into one page reference.
Page Replacement Algorithms
ƒ Example (an address = a 2-digit page # + a 2-digit offset)
Memory references: 0100, 0432, 0101, 0612, 0103, 0104, 0101, 0611
Page references: 01, 04, 01, 06, 01, 01, 01, 06
After collapsing consecutive references: 01, 04, 01, 06, 01, 06
Does the number of page faults decrease when the number of page frames available increases?
FIFO Algorithm
ƒ A FIFO Implementation
1. Each page is given a time stamp when it is brought into memory.
2. Select the oldest page for replacement!

reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
page frames at each of the 15 page faults (3 frames):
7  7  7  2  2  2  4  4  4  0  0  0  7  7  7
   0  0  0  3  3  3  2  2  2  1  1  1  0  0
      1  1  1  0  0  0  3  3  3  2  2  2  1
(the FIFO queue below each column in the original figure lists the resident pages from oldest to newest)
FIFO Algorithm
ƒ The Idea behind FIFO
ƒ The oldest page is unlikely to be used
again.
ƒ But shouldn’t we save a page that will be used again in the near future?
ƒ Belady’s anomaly
ƒ For some page-replacement algorithms, the page fault rate may increase as the number of allocated frames increases.
FIFO Algorithm
Run the FIFO algorithm on the following reference string: 1 2 3 4 1 2 5 1 2 3 4 5

refs:      1  2  3  4  1  2  5  1  2  3  4  5
3 frames:  1  1  1  4  4  4  5  5  5  5  5  5
              2  2  2  1  1  1  1  1  3  3  3
                 3  3  3  2  2  2  2  2  4  4
fault?     *  *  *  *  *  *  *        *  *      => 9 page faults

4 frames:  1  1  1  1  1  1  5  5  5  5  4  4
              2  2  2  2  2  2  1  1  1  1  5
                 3  3  3  3  3  3  2  2  2  2
                    4  4  4  4  4  4  3  3  3
fault?     *  *  *  *        *  *  *  *  *  *  => 10 page faults

More frames, yet more faults – FIFO pushes out pages that will be used later!
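A short simulation (a sketch in Python, not part of the slides) reproduces both counts and thus Belady's anomaly:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    memory, queue, faults = set(), deque(), 0
    for p in refs:
        if p in memory:
            continue                       # hit, no fault
        faults += 1
        if len(memory) == nframes:         # no free frame: evict the oldest
            memory.remove(queue.popleft())
        memory.add(p)
        queue.append(p)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 -- more frames, more faults!
```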
Optimal Algorithm (OPT)
ƒ Optimality
ƒ The one with the lowest page fault rate.
ƒ Replace the page that will not be used for the longest period of time. ÅÆ Future Prediction

reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
page frames at each of the 9 page faults (3 frames):
7  7  7  2  2  2  2  2  7
   0  0  0  0  4  0  0  0
      1  1  3  3  3  1  1
(e.g., on the fault at page 2, victim 7 is chosen because 7 is not needed again until the 18th reference)
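OPT can only be simulated when the whole reference string is known in advance; a minimal sketch of that simulation:

```python
def opt_faults(refs, nframes):
    """Replace the resident page whose next use is farthest in the future."""
    memory, faults = set(), 0
    for i, p in enumerate(refs):
        if p in memory:
            continue
        faults += 1
        if len(memory) == nframes:
            # Victim: the page that will not be used for the longest time.
            def next_use(q):
                try:
                    return refs.index(q, i + 1)
                except ValueError:
                    return float("inf")     # never used again
            memory.remove(max(memory, key=next_use))
        memory.add(p)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(opt_faults(refs, 3))  # 9 page faults, the minimum possible here
```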
Least-Recently-Used Algorithm (LRU)
ƒ The Idea:
ƒ OPT concerns when a page is to be used!
ƒ But we “don’t have knowledge about the future”?!
ƒ Use the history of page referencing in the past to predict the future!
ƒ LRU incurs the same number of page faults on a string S as on its reverse SR.
LRU Algorithm
ƒ Example
reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1
page frames at each of the 12 page faults (3 frames):
7  7  7  2  2  4  4  4  0  1  1  1
   0  0  0  0  0  0  3  3  3  0  0
      1  1  3  3  2  2  2  2  2  7
(the LRU queue orders the resident pages from most to least recently used; evicting page 0 for page 1 turns out to be a wrong prediction – 0 is referenced again two steps later!)
Remark: LRU is like OPT which “looks backward” in time.
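The same style of simulation for LRU (an illustrative sketch) reproduces the 12 faults:

```python
def lru_faults(refs, nframes):
    """Evict the least-recently-used page; recency tracked by a list."""
    stack, faults = [], 0          # stack[-1] = most recently used
    for p in refs:
        if p in stack:
            stack.remove(p)        # move the referenced page to the top
        else:
            faults += 1
            if len(stack) == nframes:
                stack.pop(0)       # bottom of the stack is the LRU page
        stack.append(p)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(refs, 3))  # 12 -- between OPT's 9 and FIFO's 15
```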
LRU Implementation – Counters
[Figure: on each memory reference the CPU issues (p, d); the page table for Pi maps p to frame f (with a valid-invalid bit), a logical clock counter (cnt++) supplies the time, and the “time-of-use” field of the referenced page is updated on every reference.]
LRU Implementation – Counters
ƒ Overheads
ƒ The logical clock is incremented for
every memory reference.
ƒ Update the “time-of-use” field for each
page reference.
ƒ Search the LRU page for replacement.
ƒ Overflow prevention of the clock & the
maintenance of the “time-of-use” field
of each page table.
LRU Implementation – Stack
[Figure: the resident pages are kept in a stack (a doubly linked list with head and tail pointers); a referenced page is moved to the head (MRU), so the tail is always the LRU page.]
Overheads: stack maintenance per memory reference ~ no search for page replacement!
A Stack Algorithm
ƒ Stack property: the set of memory-resident pages with n frames available is a subset (⊆) of that with (n + 1) frames available => no Belady’s anomaly.
ƒ Need hardware support for efficient implementations.
ƒ Note that LRU maintenance needs to be done for every memory reference.
LRU Approximation Algorithms
ƒ Motivation
ƒ No sufficient hardware support
ƒ Most systems provide only a “reference bit”, which indicates whether a page has been used or not – but not the order of use.
ƒ Additional-Reference-Bits Algorithm
ƒ Second-Chance Algorithm
ƒ Enhanced Second-Chance Algorithm
ƒ Counting-Based Page Replacement
Additional-Reference-Bits Algorithm
ƒ Motivation
ƒ Keep a history of reference bits
[Figure: each page has a reference bit and a one-byte history register in memory, e.g., 1|01101101, 0|10100011; at each regular interval, the OS shifts every history register right by one bit, copying the reference bit into the high-order bit and then clearing it.]
Additional-Reference-Bits Algorithm
ƒ History Registers
ƒ LRU (smaller value!): 00000000 – not used in the last 8 periods; 00000001, …
ƒ MRU (larger value!): …, 11111110; 11111111 – used at least once in every period
ƒ But, how many bits per history register should be used?
ƒ Fewer bits are faster and more cost-effective!
ƒ The more bits, the better the approximation is.
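The shifting of history registers can be sketched as follows (the 8-bit registers match the figure; the page names and reference pattern are made up):

```python
def age(history, referenced, bits=8):
    """Shift each page's history register right by one bit, inserting the
    current reference bit as the new high-order bit (as the OS would do
    at each regular timer interval)."""
    mask = (1 << bits) - 1
    return {page: ((referenced.get(page, 0) << (bits - 1)) | (reg >> 1)) & mask
            for page, reg in history.items()}

h = {"A": 0b00000001, "B": 0b10000000}
h = age(h, {"A": 1})          # A was referenced this interval, B was not
print(format(h["A"], "08b"))  # 10000000
print(format(h["B"], "08b"))  # 01000000
# The page with the smaller register value is the better LRU candidate.
```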
Second-Chance (Clock) Algorithm
ƒ Motivation
ƒ Use the reference bit only
ƒ Basic Data Structure
ƒ Circular FIFO Queue
ƒ Basic Mechanism
ƒ When a page is selected
ƒ Take it as a victim if its reference bit = 0
ƒ Otherwise, clear the bit and advance to the next page
[Figure: a circular queue of pages with reference bits; the sweeping hand clears 1-bits until it reaches a page whose reference bit is 0.]
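A minimal sketch of one victim-selection pass (the page names and reference bits are made up for illustration):

```python
def second_chance(pages, refbit, hand):
    """Advance the clock hand until a page with reference bit 0 is found;
    pages with bit 1 get a second chance (their bit is cleared).
    Returns (victim, new hand position)."""
    n = len(pages)
    while True:
        p = pages[hand]
        if refbit[p] == 0:
            return p, (hand + 1) % n   # victim found
        refbit[p] = 0                  # clear the bit and advance
        hand = (hand + 1) % n

pages = ["A", "B", "C", "D"]
refbit = {"A": 1, "B": 1, "C": 0, "D": 1}
victim, hand = second_chance(pages, refbit, 0)
print(victim)  # C -- A and B get their bits cleared on the way
```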
Enhanced Second-Chance Algorithm
ƒ Motivation:
ƒ Consider the cost of “swapping out” pages.
ƒ 4 Classes (reference bit, modify bit)
ƒ (0,0) – not recently used and not “dirty” (low priority)
ƒ (0,1) – not recently used but “dirty”
ƒ (1,0) – recently used but not “dirty”
ƒ (1,1) – recently used and “dirty” (high priority)
Enhanced Second-Chance Algorithm
ƒ Use the second-chance algorithm to replace the first page encountered in the lowest nonempty class.
=> May have to scan the circular queue several times before finding the right page.
ƒ Used by the Macintosh virtual memory management.
Counting-Based Algorithms
ƒ Motivation:
ƒ Count the number of references made to each page, rather than tracking the times of the references.
ƒ Least Frequently Used Algorithm (LFU)
ƒ The LFU page is the least actively used page!
ƒ Potential Hazard: some heavily used pages may no longer be needed!
ƒ A Solution – Aging
ƒ Shift counters right by one bit at each regular interval.
Counting-Based Algorithms
ƒ Most Frequently Used Algorithm (MFU)
ƒ Pages with the smallest number of references were probably just brought in and have yet to be used!
ƒ LFU & MFU replacement schemes can be fairly expensive!
ƒ They do not approximate OPT very well!
Page Buffering
ƒ Basic Idea
a. Systems keep a pool of free frames.
b. A desired page is first “swapped in” to a frame taken from the pool.
c. When the selected page (victim) is later written out, its frame is returned to the pool.
ƒ Variation 1
a. Maintain a list of modified pages.
b. Whenever the paging device is idle, a modified page is written out and its “modify bit” is reset.
Page Buffering
ƒ Variation 2
a. Remember which page was in each frame of
the pool.
b. When a page fault occurs, first check
whether the desired page is there already.
ƒ Pages which were in frames of the pool must
be “clean”.
ƒ “Swapping-in” time is saved!
ƒ VAX/VMS with the FIFO replacement algorithm adopts it to improve the performance of the FIFO algorithm.
Frame Allocation – Single User
ƒ Basic Strategy:
ƒ The user process is allocated any free frame.
ƒ The user process requests free frames from the free-frame list.
ƒ When the free-frame list is exhausted, page replacement takes place.
ƒ All allocated frames are released when the process terminates.
ƒ Variations
ƒ The O.S. can share with users some free frames for special purposes.
ƒ Page Buffering – frames to save “swapping” time
Frame Allocation – Multiple Users
ƒ Fixed Allocation
a. Equal Allocation
m frames, n processes Æ m/n frames per process
b. Proportional Allocation
1. Ratios of Frames ∝ Size
S = Σ Si, Ai ≈ (Si / S) × m, where Σ Ai ≤ m and each Ai ≥ the minimum number of frames required
2. Ratios of Frames ∝ Priority
Si : relative importance
3. Combinations, or others.
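Proportional allocation by size can be computed as below; the process sizes (10 and 127 pages) and the m = 62 frames are illustrative numbers, not from the slides:

```python
def proportional(sizes, m):
    """Allocate Ai ~ (Si / S) * m frames to each process (rounded down)."""
    S = sum(sizes)
    return [s * m // S for s in sizes]

# Two hypothetical processes of 10 and 127 pages sharing 62 frames:
print(proportional([10, 127], 62))  # [4, 57]
```

Rounding down keeps Σ Ai ≤ m; a real allocator would also enforce the per-process minimum and hand out any leftover frames.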
Frame Allocation – Multiple Users
ƒ Dynamic Allocation
a. Allocated frames ∝ the multiprogramming level
b. Allocated frames ∝ others
ƒ The minimum number of frames required for a process is determined by the instruction-set architecture.
ƒ ADD A,B,C Æ 4 frames needed (the instruction + 3 operands)
ƒ ADD (A), (B), (C) Æ 1+2+2+2 = 7 frames, where (A) is an indirect address.
Frame Allocation – Multiple Users
ƒ Minimum Number of Frames (Continued)
ƒ How many levels of indirect addressing should be supported?
ƒ e.g., a 16-bit address word with one flag bit: 0 = direct, 1 = indirect; an unbounded indirect chain may touch every page in the logical address space of a process => virtual memory is collapsing!
ƒ A long instruction may cross a page boundary.
MVC X, Y, 256 Æ 2 + 2 + 2 = 6 frames
ƒ The spanning of the instruction and the operands.
Frame Allocation – Multiple Users
ƒ Global Allocation
ƒ Processes can take frames from others. For example, high-priority processes can increase their frame allocation at the expense of low-priority processes!
ƒ Local Allocation
ƒ Processes can only select frames from their own allocated frames Æ Fixed Allocation
ƒ The set of pages in memory for a process is affected by the paging behavior of only that process.
Frame Allocation – Multiple Users
ƒ Remarks
a. Global replacement generally results in better system throughput.
b. Processes cannot control their own page-fault rates, so processes can easily affect one another.
Thrashing
ƒ Thrashing – A High Paging Activity:
ƒ A process is thrashing if it is spending more time paging than executing.
ƒ Why thrashing?
ƒ Too few frames allocated to a process!
[Figure: CPU utilization vs degree of multiprogramming – utilization climbs with the multiprogramming level until thrashing sets in under a global page-replacement algorithm, then drops sharply; the scheduler reacts to the low CPU utilization by dispatching new processes, which makes things worse.]
Thrashing
ƒ Solutions:
ƒ Decrease the multiprogramming level Æ Swap out processes!
ƒ Use local page-replacement algorithms
ƒ They only limit thrashing effects “locally” –
ƒ page-fault service of the other processes also slows down (the paging-device queue gets longer).
ƒ Give processes as many frames as they need!
ƒ But, how do you know the right number of frames for a process?
Locality Model
localityi = {Pi,1, Pi,2, …, Pi,ni} Æ localityj = {Pj,1, Pj,2, …, Pj,nj} (control flow)
ƒ A program is composed of several different (overlapped) localities.
ƒ Localities are defined by the program structure and its data structures (e.g., an array, a hash table).
ƒ How do we know that we allocate enough frames to a process to accommodate its current locality?
Working-Set Model
Page references:
… 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4
With a working-set window Δ = 10:
working-set(t1) = {1,2,5,6,7}   working-set(t2) = {3,4}
ƒ The working set is an approximation of a program’s locality.
ƒ Δ ranges from the minimum allocation (too small to cover a locality) to ∞ (all touched pages – may cover several localities).
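The working-set computation can be sketched directly from the definition; the index positions chosen for t1 and t2 are assumptions matching the figure:

```python
def working_set(refs, t, delta):
    """Pages referenced in the window of the last delta references
    ending at time t (refs is 0-indexed)."""
    return set(refs[max(0, t - delta + 1): t + 1])

refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1, 6, 2, 3, 4, 1, 2, 3, 4, 4, 4,
        3, 4, 3, 4, 4, 4]
print(sorted(working_set(refs, 9, 10)))   # [1, 2, 5, 6, 7] at t1
print(sorted(working_set(refs, 25, 10)))  # [3, 4] at t2
```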
Working-Set Model
D = Σ working-set-sizei ≤ M
where M is the total number of available frames.
ƒ If D > M: suspend some processes and swap out their pages.
ƒ If D ≤ M: “safe” – extra frames are available; initiate new processes.
Working-Set Model
ƒ The maintenance of working sets is expensive!
ƒ Approximation by a timer and the reference bit
[Figure: at each timer interrupt, the reference bit of every page is shifted (or copied) into an in-memory history and then cleared to 0; pages with a recent 1 in the history are taken as the working set.]
ƒ Accuracy vs. Timeout Interval!
Page-Fault Frequency
ƒ Motivation
ƒ Control thrashing directly by observing the page-fault rate!
[Figure: page-fault rate vs number of allocated frames – when the rate exceeds an upper bound, increase the process’s number of frames; when it falls below a lower bound, decrease the number of frames.]
* Processes are suspended and swapped out if the number of available frames drops below the minimum needed.
OS Examples – NT
ƒ Virtual Memory – Demand Paging with
Clustering
ƒ Clustering brings in more pages
surrounding the faulting page!
ƒ Working Set
ƒ Min and Max bounds for each process
ƒ Local page replacement when the max number of frames is allocated.
ƒ Automatic working-set trimming reduces a process’s allocated frames to its min when the system threshold on available frames is reached.
OS Examples – Solaris
ƒ The pageout process first clears the reference bit of all pages to 0 and later returns the pages whose reference bit is still 0 to the system (the two clock hands are a handspread apart).
ƒ The page scan rate rises from slowscan (100) to fastscan (8192) as free memory falls from lotsfree toward minfree.
ƒ pageout runs 4 times per second, rising to 100 times per second when desfree is reached!
ƒ Swapping starts when the desfree threshold cannot be met for 30s.
ƒ pageout runs for every request for a new page when minfree is reached.
Other Considerations
ƒ Pre-Paging
ƒ Bring into memory at one time all the pages that will be needed!
ƒ e.g., bring back the working set of a swapped-out process when it is resumed. Do pre-paging if the working set is known!
ƒ Issue: Pre-Paging Cost vs Cost of Page-Fault Services
ƒ Not every page in the working set will be used!
Other Considerations
ƒ Page Size (typically 512B (2^9) ~ 16,384B (2^14))
ƒ A small page size: better resolution for locality & less internal fragmentation.
ƒ A large page size: a smaller page table & better I/O efficiency.
ƒ Trends – Large Page Size
∵ The CPU speed and the memory capacity grow much faster than the disk speed!
Other Considerations
ƒ TLB Reach
ƒ TLB-Entry-Number * Page-Size
ƒ Wish
ƒ The working set is stored in the TLB!
ƒ Solutions
ƒ Increase the page size
ƒ Have multiple page sizes –
UltraSparc II (8KB - 4MB) + Solaris 2
(8KB or 4MB)
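The arithmetic is straightforward; the 64-entry TLB below is a made-up size for illustration:

```python
def tlb_reach(entries, page_size):
    """TLB reach = number of TLB entries * page size."""
    return entries * page_size

# A hypothetical 64-entry TLB:
print(tlb_reach(64, 8 * 1024))         # 524288 bytes = 512 KB with 8KB pages
print(tlb_reach(64, 4 * 1024 * 1024))  # 256 MB with 4MB pages
```

With the larger page size, the same number of TLB entries covers a working set 512 times bigger.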
Other Considerations
ƒ Inverted Page Table
ƒ The objective is to reduce the amount of physical memory for page tables; however, the per-process page tables are still needed when a page fault occurs!
ƒ More page faults for page tables may occur!!!
Other Considerations
ƒ Program Structure
ƒ Motivation – improve the system performance by an awareness of the underlying demand paging.
var A: array [1..128,1..128] of integer;
for j := 1 to 128
for i := 1 to 128
A(i,j) := 0
[Figure: A is stored one row per page – page i holds A(i,1) … A(i,128), 128 words per page, 128 pages in total; the loop above changes i fastest, touching a different page on every assignment => 128x128 page faults if the process has fewer than 128 frames!]
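The effect of the loop order can be simulated; this sketch assumes a single allocated frame and one 128-word row per page, as in the figure:

```python
def faults_one_frame(order):
    """Count page faults with a single frame, where the page is the row
    index (storage assumption: one 128-word row per page)."""
    last, faults = None, 0
    for i, j in order:
        if i != last:        # touching a different row means a new page
            faults += 1
            last = i
    return faults

N = 128
column_sweep = [(i, j) for j in range(N) for i in range(N)]  # for j: for i:
row_sweep    = [(i, j) for i in range(N) for j in range(N)]  # for i: for j:
print(faults_one_frame(column_sweep))  # 16384 = 128 x 128
print(faults_one_frame(row_sweep))     # 128
```

Swapping the two loops makes the traversal match the storage layout, cutting the fault count by a factor of 128.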
Other Considerations
ƒ Program Structures:
ƒ Data Structures
ƒ Locality: stack, hash table, etc.
ƒ Search speed, # of memory references, # of
pages touched, etc.
ƒ Programming Language
ƒ Lisp, PASCAL, etc.
ƒ Compiler & Loader
ƒ Separate code and data
ƒ Pack inter-related routines into the same page
ƒ Routine placement (across page boundary?)
I/O Interlock
[Figure: a DMA transfer from a drive into a buffer residing in physical memory.]
ƒ DMA gets the following information of the buffer:
ƒ Base address in memory
ƒ Chunk size
ƒ Could the buffer-residing pages be swapped out?
I/O Interlock
ƒ Solutions
ƒ I/O Device ÅÆ System Memory ÅÆ
User Memory
ƒ Extra Data Copying!!
ƒ Lock pages into memory
ƒ The lock bit of a page-faulting page is set
until the faulting process is dispatched!
ƒ Lock bits might never be turned off!
ƒ Multi-user systems usually take locks as
“hints” only!
Real-Time Processing
ƒ Virtual memory introduces unexpected, long-term delays in the execution of a program and hurts predictable behavior.
ƒ Solution:
ƒ Go beyond locking hints Î Allow privileged users to require pages being locked into memory!
Demand Segmentation
ƒ Motivation
ƒ Segmentation captures better the logical
structure of a process!
ƒ Demand paging needs a significant
amount of hardware!
ƒ Mechanism
ƒ Like demand paging!
ƒ However, compaction may be needed!
ƒ Considerable overheads!