OS Lecture-13
Virtual Memory
and
Page Replacement Algorithms
Virtual Memory
• Virtual memory – separation of user logical memory from physical
memory.
– A section of a hard disk that's set up to emulate the computer's RAM.
– Only part of the program needs to be in memory for execution.
– Logical address space can therefore be much larger than physical
address space.
– Need to allow pages to be swapped in and out.
Virtual Memory (cont.)
Virtual Address Space
Advantages of Virtual Memory
• Multitasking.
• Allocating memory is easy and cheap.
• Sharing of files and memory by multiple processes.
• More efficient swapping.
• A process may even be larger than all of the physical memory.
• Helpful in implementing a multiprogramming environment.
Example: Shared library using virtual memory
Disadvantages of Virtual Memory
• Longer memory access time as HDD is slower.
• Applications run slower if the system is using virtual memory.
• It takes more time to switch between applications.
• Less hard drive space for your use.
Implementation of Virtual Memory
• Virtual memory can be implemented via:
– Demand paging
– Demand segmentation
Demand Paging
Demand Paging
• Similar to a paging system with swapping.
• Bring a page into memory only when it is needed.
– Less I/O needed
– Less memory needed
– Faster response
– More users
• Page is needed ⇒ reference to it
– invalid reference ⇒ abort
– not in memory ⇒ bring to memory
Demand Paging (cont.)
• Lazy swapper - never swaps a
page into memory unless page
will be needed.
– Swapper that deals with pages
is a pager.
• A swapper manipulates the
entire process, whereas a pager
is concerned with the
individual pages of a process.
Demand Paging: Hardware Support
• With demand paging, we need some form of hardware support to
distinguish between those pages that
– are in memory
– and those pages that are on the disk.
• The valid-invalid bit scheme can be used for this purpose.
• When this bit is set to "valid", this value indicates that the associated page
is both legal and in memory.
• If the bit is set to "invalid",
– the page either is not valid (that is, not in the logical address space of the
process),
– or is valid but is currently on the disk (then the entry points to page address
on the disk).
Demand Paging: Hardware Support (cont.)
[Figure: logical memory holds eight pages A–H; only A, C, and F are currently in physical memory.]
Page table (frame #, valid-invalid bit):
page 0 (A): frame 4, v
page 1 (B): –, i
page 2 (C): frame 6, v
page 3 (D): –, i
page 4 (E): –, i
page 5 (F): frame 9, v
page 6 (G): –, i
page 7 (H): –, i
Physical memory (16 frames, 0–15): frame 4 = A, frame 6 = C, frame 9 = F; the remaining pages reside on disk.
Page Fault
• The first reference to a page that is not in memory traps to the
OS ⇒ page fault
• OS looks at another table to decide:
– Invalid reference ⇒ abort.
– Just not in memory ⇒ continue.
• Allocate a free frame in memory.
• Schedule a disk read of the page to the free frame.
• Allocate the CPU to some other process.
• Interrupt from the disk (I/O completed).
• Save registers and state for currently running process.
• Update original process table (the desired page in memory).
• Move original process to ready queue – will wait for CPU allocation.
• Restore registers and state then restart the trapped instruction.
Steps in Handling a Page Fault
[Figure: (1) the running process references a page marked invalid (i) in the page table; (2) the reference traps to the operating system; (3) the page is located on the backing store; (4) the missing page is brought into a free frame; (5) the page table is reset; (6) the instruction is restarted.]
What happens if there is no free frame?
• Page replacement: Are all those pages in memory being
referenced? Choose one to swap back out to disk and make room
to load a new page.
– Algorithm: How you choose a victim.
– Performance: Want an algorithm which will result in minimum
number of page faults.
• Side effect: The same page may be brought in and out of memory
several times.
Performance of Demand Paging
• Page fault rate p, with 0 ≤ p ≤ 1.0:
– if p = 0, no page faults
– if p = 1, every reference is a fault
• Effective Access Time (EAT) is given as:
EAT = (1 – p) × memory access time + p × page fault time
where, page fault time = page fault overhead + swap page out + swap page in
+ restart overhead
• Example: Memory access time = 200 nanoseconds, Average page-fault
service time = 8 milliseconds. Then,
EAT = (1 – p) × 200 ns + p × (8 ms)
= (1 – p) × 200 + p × 8,000,000
= 200 + 7,999,800 × p.
EAT is directly proportional to the page fault rate.
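The EAT formula above can be checked with a short calculation (a minimal Python sketch; the function name is illustrative, not from the slides):

```python
def effective_access_time(p, mem_ns=200, fault_ns=8_000_000):
    """EAT = (1 - p) * memory access time + p * page fault time (in ns)."""
    return (1 - p) * mem_ns + p * fault_ns

print(effective_access_time(0.0))    # 200.0 ns: no page faults
print(effective_access_time(0.001))  # 8199.8 ns: one fault per 1000 accesses
```

Note that even a fault rate of 1 in 1,000 slows the effective access time by a factor of about 40.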
Page Replacement
Page Replacement
• Prevent over-allocation of memory by modifying page-fault service
routine to include page replacement.
• Use modify (dirty) bit to reduce overhead of page transfers – only
modified pages are written to disk.
• Page replacement completes separation between logical memory
and physical memory – large virtual memory can be provided on a
smaller physical memory.
Page Replacement (cont.)
Basic Page Replacement
1. Find the location of the desired page on disk.
2. Find a free frame:
− If there is a free frame, use it.
− If there is no free frame, use a page replacement algorithm to select a victim
frame.
3. Read the desired page into the (newly) free frame. Update the page and
frame tables.
4. Restart the process.
Page-Replacement Algorithms
• Types:
– First-In-First-Out (FIFO)
– Optimal
– Least Recently Used (LRU)
– Counting
• Want lowest page-fault rate.
• Evaluate algorithm by running it on a particular string of memory
references (reference string) and computing the number of page faults on
that string.
• We will use this reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Graph of Page Faults Versus The Number of Frames
First-In-First-Out (FIFO) Algorithm
• 3 frames (3 pages can be in memory at a time per process)
• Reference string: 1 2 3 4 1 2 5 1 2 3 4 5

Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  4  4  4  5  5  5  5  5  5
Frame 2:    2  2  2  1  1  1  1  1  3  3  3
Frame 3:       3  3  3  2  2  2  2  2  4  4
Fault:   F  F  F  F  F  F  F  N  N  F  F  N
# of page faults = 9

• 4 frames (4 pages can be in memory at a time per process)
• Reference string: 1 2 3 4 1 2 5 1 2 3 4 5

Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  1  1  1  5  5  5  5  4  4
Frame 2:    2  2  2  2  2  2  1  1  1  1  5
Frame 3:       3  3  3  3  3  3  2  2  2  2
Frame 4:          4  4  4  4  4  4  3  3  3
Fault:   F  F  F  F  N  N  F  F  F  F  F  F
# of page faults = 10
First-In-First-Out (FIFO) Algorithm (cont.)
• FIFO algorithm suffers from Belady’s anomaly:
more allocated frames can lead to a higher page-fault rate.
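The FIFO traces above, including Belady’s anomaly, can be reproduced with a small simulator (a Python sketch; `fifo_faults` is an illustrative name):

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()  # oldest page at the left
    faults = 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.popleft()  # evict the oldest page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 (Belady's anomaly: more frames, more faults)
```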
Optimal (OPT) Algorithm
• Replace page that will not be used for longest period of time.
• 4 frames (4 pages can be in memory at a time per process)
• Reference string: 1 2 3 4 1 2 5 1 2 3 4 5

Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  1  1  1  1  1  1  1  4  4
Frame 2:    2  2  2  2  2  2  2  2  2  2  2
Frame 3:       3  3  3  3  3  3  3  3  3  3
Frame 4:          4  4  4  5  5  5  5  5  5
Fault:   F  F  F  F  N  N  F  N  N  N  F  N
# of page faults = 6
• OPT algorithm guarantees the lowest possible page-fault rate for a fixed number of
pages.
• But it is difficult to implement, because it requires future knowledge of the
reference string.
• As a result, OPT is used mainly for comparison purposes.
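Because the full reference string is known in a simulation, OPT is easy to sketch offline (a Python sketch; `opt_faults` is an illustrative name):

```python
def opt_faults(refs, n_frames):
    """Optimal replacement: evict the page whose next use is farthest away."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < n_frames:
            frames.append(page)
        else:
            # Distance to the next use of each resident page (inf if never used again).
            def next_use(p):
                future = refs[i + 1:]
                return future.index(p) if p in future else float("inf")
            victim = max(frames, key=next_use)
            frames[frames.index(victim)] = page
    return faults

print(opt_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 4))  # 6
```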
Least Recently Used (LRU) Algorithm
• LRU associates with each page the time of that page’s last use.
• When a page must be replaced, LRU chooses the page that has not been
used for the longest period of time.
• We can think of this strategy as OPT looking backward in time, rather than
forward.
• Reference string: 1 2 3 4 1 2 5 1 2 3 4 5

Ref:     1  2  3  4  1  2  5  1  2  3  4  5
Frame 1: 1  1  1  1  1  1  1  1  1  1  1  5
Frame 2:    2  2  2  2  2  2  2  2  2  2  2
Frame 3:       3  3  3  3  5  5  5  5  4  4
Frame 4:          4  4  4  4  4  4  3  3  3
Fault:   F  F  F  F  N  N  F  N  N  F  F  F
# of page faults = 8
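The LRU trace above can be reproduced by keeping pages ordered by recency of use (a Python sketch using `OrderedDict`; the function name is illustrative):

```python
from collections import OrderedDict

def lru_faults(refs, n_frames):
    """Count page faults under LRU; an OrderedDict tracks recency of use."""
    frames = OrderedDict()  # least recently used page comes first
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.popitem(last=False)  # evict the LRU page
            frames[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 4))  # 8
```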
Implementation of the LRU Policy
• Each page could be tagged (in the page table entry) with the time of
each memory reference.
• The LRU page is the one with the smallest time value (needs to be
searched at each page fault).
• This would require expensive hardware and a great deal of
overhead.
• Consequently very few computer systems provide sufficient
hardware support for true LRU replacement policy.
• Other algorithms are used instead.
LRU Implementation Using Counters
• Add a “time stamp” field to page table entries.
• Add counter register to CPU. Counter incremented on every
page reference.
• When a page is referenced, counter is copied to time stamp
field for that page.
• Victim page has the smallest time stamp value.
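The counter scheme can be sketched in software (a minimal Python sketch of the hardware idea above; the class and field names are illustrative):

```python
class LRUCounterPager:
    """LRU via a global counter copied into a per-page time stamp."""
    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.clock = 0   # models the CPU's counter register
        self.stamp = {}  # page -> time stamp of its last reference
        self.faults = 0

    def reference(self, page):
        self.clock += 1  # counter incremented on every reference
        if page not in self.stamp:
            self.faults += 1
            if len(self.stamp) == self.n_frames:
                # Victim is the page with the smallest time stamp.
                victim = min(self.stamp, key=self.stamp.get)
                del self.stamp[victim]
        self.stamp[page] = self.clock  # counter copied to the stamp field

pager = LRUCounterPager(4)
for p in [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]:
    pager.reference(p)
print(pager.faults)  # 8, matching the LRU trace above
```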
LRU Implementation Using Stack
• Keep a stack of page numbers.
• E.g., Reference string: 4 7 0 7 1 0 1 2 1 2 7
Most recently used page goes on the top; least recently used page is at the bottom.

Stack at a (top → bottom): 2, 1, 0, 7, 4
Stack at b (top → bottom): 7, 2, 1, 0, 4
• No search for replacement: Always take the bottom one.
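The stack scheme can be sketched with a list whose end plays the role of the top (a Python sketch; the function name is illustrative):

```python
def lru_stack(refs):
    """Maintain the LRU stack: MRU at the end of the list, LRU at index 0."""
    stack = []
    snapshots = []
    for page in refs:
        if page in stack:
            stack.remove(page)  # pull the page out of the middle
        stack.append(page)      # and push it on top
        snapshots.append(list(stack))
    return snapshots

snaps = lru_stack([4, 7, 0, 7, 1, 0, 1, 2, 1, 2, 7])
print(snaps[9])   # stack at a: [4, 7, 0, 1, 2] (top = 2)
print(snaps[10])  # stack at b: [4, 0, 1, 2, 7] (top = 7)
```

On a fault the victim is simply the page at index 0, with no search; the cost is moved to every reference instead.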
LRU Approximation Algorithm-1
• Reference Bit:
– With each page associate a bit, initially = 0
– When page is referenced, bit is set to 1.
– Replace the one which is 0 (if one exists) – we do not know the
real order of use, however.
LRU Approximation Algorithm-2
• Second Chance (Clock):
─ The set of frames candidate for replacement is considered as a
circular buffer.
─ When a page is replaced, a pointer is set to point to the next
frame in buffer.
─ A reference bit for each frame is set to 1 whenever:
• a page is first loaded into the frame.
• the corresponding page is referenced.
─ When it is time to replace a page, the first frame encountered
with the reference bit set to 0 is replaced:
• During the search for replacement, each reference bit
set to 1 is changed to 0.
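The clock scheme described above can be sketched as follows (a Python sketch; `clock_faults` is an illustrative name):

```python
def clock_faults(refs, n_frames):
    """Second-chance (clock) replacement with reference bits."""
    pages = [None] * n_frames
    ref_bit = [0] * n_frames
    hand = 0  # pointer into the circular buffer of frames
    faults = 0
    for page in refs:
        if page in pages:
            ref_bit[pages.index(page)] = 1  # referenced: set the bit
            continue
        faults += 1
        # Sweep past frames whose bit is 1, clearing them (second chance).
        while pages[hand] is not None and ref_bit[hand] == 1:
            ref_bit[hand] = 0
            hand = (hand + 1) % n_frames
        pages[hand] = page
        ref_bit[hand] = 1  # bit set when the page is first loaded
        hand = (hand + 1) % n_frames
    return faults

refs = [4, 5, 5, 5, 6, 7, 8, 6, 7, 8, 6, 7, 8, 4, 1, 5, 2, 4, 4, 1]
print(clock_faults(refs, 4))  # 9
```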
Second-Chance (Clock) Page-Replacement Algorithm
Reference string:
4 5 5 5 6 7 8 6 7 8 6 7 8 4 1 5 2 4 4 1

Ref:     4 5 5 5 6 7 8 6 7 8 6 7 8 4 1 5 2 4 4 1
Frame 1: 4 4 4 4 4 4 8 8 8 8 8 8 8 8 8 8 2 2 2 2
Frame 2:   5 5 5 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 4
Frame 3:         6 6 6 6 6 6 6 6 6 6 1 1 1 1 1 1
Frame 4:           7 7 7 7 7 7 7 7 7 7 5 5 5 5 5
Fault:   F F N N F F F N N N N N N F F F F N N N
# of page faults = 9
(Each frame also carries a reference bit: set when the page is loaded and on every hit, cleared as the pointer sweeps past during a replacement.)
Counting Algorithms
• Keep a counter of the number of references that have been made
to each page.
• Two possibilities: Least/Most Frequently Used (LFU/MFU).
• Least frequently Used (LFU) Algorithm: replaces page with smallest
count.
• Most Frequently Used (MFU) Algorithm: replaces the page with the largest
count, based on the argument that the page with the smallest count was
probably just brought in and has yet to be used.
Least Frequently Used (LFU) Algorithm
• Replaces page that is least intensively referenced.
• Based on the heuristic that a page not referenced often is not likely
to be referenced in the future.
• Could easily select wrong page for replacement.
– A page that was referenced heavily in the past may never be referenced again,
but will stay in memory while newer, active pages are replaced.
Reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2

Ref:     7 0 1 2 0 3 0 4 2 3 0 3 2
Frame 1: 7 7 7 2 2 2 2 4 4 3 3 3 3
Frame 2:   0 0 0 0 0 0 0 0 0 0 0 0
Frame 3:     1 1 1 3 3 3 2 2 2 2 2
Fault:   F F F F N F N F F F N N N
# of page faults = 8
(Ties in the reference counts are broken by evicting the longest-resident page.)
Least Frequently Used (LFU) Algorithm (cont.)
• Reference string: 1 2 3 4 1 2 5 1 2 3 4 5

Ref:     1 2 3 4 1 2 5 1 2 3 4 5
Frame 1: 1 1 1 1 1 1 1 1 1 1 1 1
Frame 2:   2 2 2 2 2 2 2 2 2 2 2
Frame 3:     3 3 3 3 5 5 5 5 4 4
Frame 4:       4 4 4 4 4 4 3 3 5
Fault:   F F F F N N F N N F F F
# of page faults = 8
(Count ties again broken in favor of the longest-resident page; note that the heavily used pages 1 and 2 are never evicted.)
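The LFU traces can be reproduced with a reference counter per resident page (a Python sketch; the FIFO tie-break is an assumption noted in the examples above):

```python
from collections import Counter

def lfu_faults(refs, n_frames):
    """LFU replacement; ties broken by evicting the longest-resident page."""
    frames = []  # kept in order of arrival, for FIFO tie-breaking
    counts = Counter()
    faults = 0
    for page in refs:
        counts[page] += 1
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            victim = min(frames, key=lambda p: counts[p])  # smallest count
            frames.remove(victim)
            del counts[victim]  # the count restarts if the page returns later
        frames.append(page)
    return faults

print(lfu_faults([7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2], 3))  # 8
```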
Question
• Given page reference string:
1, 2, 3, 4, 2, 1, 5, 6, 2, 1, 2, 3, 7, 6, 3, 2, 1, 2, 3, 6
Compare the number of page faults for FIFO, LRU, LFU, Second Chance
and Optimal page replacement algorithm assuming that 4 frames are
available in memory.
Allocation of Frames
Allocation of Frames
• Each process needs a minimum number of pages.
• There are two major allocation schemes:
– Fixed allocation
– Priority allocation
Fixed Allocation
• Equal allocation – e.g., if 100 frames and 5 processes, give each
process 20 frames.
• Proportional allocation – Allocate according to the size of the process:
s_i = size of process p_i
S = Σ s_i
m = total number of frames
a_i = allocation for p_i = (s_i / S) × m
• Example: m = 64, s1 = 10, s2 = 127, so S = 137:
a1 = (10 / 137) × 64 ≈ 5
a2 = (127 / 137) × 64 ≈ 59
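The slide’s example can be computed directly (a Python sketch; the function name is illustrative, and simple rounding may not always sum exactly to m):

```python
def proportional_allocation(sizes, m):
    """Allocate m frames in proportion to process sizes: a_i = (s_i / S) * m."""
    S = sum(sizes)
    return [round(s / S * m) for s in sizes]

# Example from the slide: m = 64 frames, s1 = 10, s2 = 127.
print(proportional_allocation([10, 127], 64))  # [5, 59]
```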
Priority Allocation
• Use a proportional allocation scheme using priorities rather
than size.
• If process Pi generates a page fault,
– select for replacement one of its frames.
– select for replacement a frame from a process with lower priority
number.
Global vs. Local Allocation
• Local replacement – each process selects from only its own
set of allocated frames
– More consistent per-process performance
– But possibly underutilized memory
• Global replacement – process selects a replacement frame
from the set of all frames; one process can take a frame from
another
– But then process execution time can vary greatly
– But greater throughput, so more common
• A process cannot control its own page-fault rate
– Depends on the paging behavior of other processes
Thrashing
Thrashing
• If a process does not have “enough” pages, the page-fault rate is
very high. This leads to:
– low CPU utilization
– operating system thinks that it needs to increase the degree of
multiprogramming because of low cpu utilization
– another process added to the system
• Thrashing ≡ a process is busy swapping pages in and out.
• The processor spends most of its time swapping pages rather than
executing user instructions.
Thrashing (cont.)
Demand Paging and Thrashing
• Why does demand paging work?
• Locality model
– Process migrates from one locality to another
– Localities may overlap
• E.g.
for (……) {
computations;
}
…..
for (….. ) {
computations;
}
• Why does thrashing occur?
Σ size of locality > total memory size
Solutions of Thrashing
• Local replacement
– One process cannot steal frames from other processes
• Provide a process as many frames as needed
– Able to load all active pages
– How do we know?
– Working-set model
Working-Set Model
• The pages used by a process within a window of time are called its
working set.
• The working-set model is based on the assumption of locality.
…2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4
t1 t2
• Changes continuously - hard to maintain an accurate number.
• How can the system use this number to give optimum memory to the
process?
Working-Set Model (cont.)
• ∆ ≡ working-set window ≡ fixed number of page references
Example: 10,000 instructions
• If a page is in active use, it will be in the working set.
• If it is no longer in use, it will drop out of the working set.
…2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4
∆ ∆
t1 t2
WS(t1) = {1, 2, 5, 6, 7} WS(t2) = {3, 4}
• Working-set is the approximation of the program’s locality.
• The accuracy of the working-set depends on the selection of the ∆.
Calculation of Working-Set
• Important property of the working set is the size.
• Compute working set size for each process in the system WSSi.
• D = ΣWSSi, where D is the total demand for frames.
• If D > m (where m is the total no. of available frames), thrashing will
occur.
• The operating system monitors the working-set of each process and
allocates them enough frames.
– Policy: if D > m, then suspend one of the processes.
Question
• Given references to the following pages by a program:
0, 9, 0, 1, 8, 1, 8, 7, 8, 7, 1, 2, 8, 2, 7, 8, 2, 3, 8, 3
What is the working set W(t, ∆), with t equal to the time between the 15th
and 16th references, and ∆ equal to 6 references?
Solution:
The 6 references before the 16th reference are 7, 1, 2, 8, 2 and 7. Thus,
W(t, ∆) = {1, 2, 7, 8}.
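The working-set computation in this solution can be sketched as a window over the reference string (a Python sketch; 1-based reference positions as in the slides):

```python
def working_set(refs, t, delta):
    """W(t, Δ): pages referenced in the last Δ references up to time t."""
    return set(refs[max(0, t - delta):t])

refs = [0, 9, 0, 1, 8, 1, 8, 7, 8, 7, 1, 2, 8, 2, 7, 8, 2, 3, 8, 3]
# Between the 15th and 16th references, with Δ = 6:
print(sorted(working_set(refs, 15, 6)))  # [1, 2, 7, 8]
```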
Advantages and Disadvantages of DMA
• Advantages:
– allows a peripheral device to read from/write to memory without
going through the CPU
– allows for faster processing since the processor can be working on
something else while the peripheral can be populating memory.
• Disadvantages:
– requires a DMA controller to carry out the operation, which increases
the cost of the system
– cache coherence problems