
CS 152 Computer Architecture and Engineering

Lecture 10 - Virtual Memory

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley

http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
Last time in Lecture 9
• Protection and translation required for
multiprogramming
– Base and bounds, early simple scheme
• Page-based translation and protection avoids need
for memory compaction, easy allocation by OS
– But need to indirect in large page table on every access
• Address spaces accessed sparsely
– Can use multi-level page table to hold translation/protection
information
• Address spaces accessed with locality
– Can use “translation lookaside buffer” (TLB) to cache address
translations (sometimes known as address translation cache)
– Still have to walk page tables on a TLB miss; the walk can be done in
hardware or software
• Virtual memory uses DRAM as a “cache” of disk
memory, allows very cheap main memory

Modern Virtual Memory Systems
Illusion of a large, private, uniform store

• Protection & Privacy
  – Several users, each with their private address space and one or more shared address spaces
  – page table ≡ name space
• Demand Paging
  – Provides the ability to run programs larger than the primary memory
  – Hides differences in machine configurations
• The price is address translation on each memory reference

[Diagram: the OS and each user_i map their address spaces through page tables; a swapping store backs primary memory; the VA-to-PA mapping is cached in the TLB]
Hierarchical Page Table
Virtual Address (32 bits):
  bits 31-22: p1 (10-bit L1 index)
  bits 21-12: p2 (10-bit L2 index)
  bits 11-0:  offset

[Diagram: the root of the current page table (a processor register) points to the Level 1 page table; entry p1 selects a Level 2 page table; entry p2 selects a data page. A page may be in primary memory, in secondary memory, or nonexistent (invalid PTE).]
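The two-level lookup above can be written out in C. This is a minimal sketch assuming the 10-bit/10-bit/12-bit split shown; the one-word PTE layout, the PTE_VALID bit, and treating a PTE's upper bits as the physical frame address of the next level are illustrative assumptions, not a particular machine's format.

#include <stdint.h>
#include <stdbool.h>

#define PTE_VALID     0x1u                       /* assumed valid bit        */
#define L1_INDEX(va)  (((va) >> 22) & 0x3FFu)    /* p1: VA bits 31-22        */
#define L2_INDEX(va)  (((va) >> 12) & 0x3FFu)    /* p2: VA bits 21-12        */
#define PG_OFFSET(va) ((va) & 0xFFFu)            /* VA bits 11-0             */

typedef uint32_t pte_t;   /* PTE = frame address in upper bits + flag bits */

/* Walk the two-level table whose Level 1 page table is at 'l1' (the
 * "root of the current page table" processor register).  Returns true
 * and fills *pa on success; false means a missing level, i.e. a page
 * fault in a real system.  For the sketch, physical frame addresses
 * are used directly as pointers. */
bool walk(pte_t *l1, uint32_t va, uint32_t *pa)
{
    pte_t l1e = l1[L1_INDEX(va)];
    if (!(l1e & PTE_VALID))
        return false;                            /* no Level 2 table         */

    pte_t *l2 = (pte_t *)(uintptr_t)(l1e & ~0xFFFu);
    pte_t l2e = l2[L2_INDEX(va)];
    if (!(l2e & PTE_VALID))
        return false;                            /* data page not resident   */

    *pa = (l2e & ~0xFFFu) | PG_OFFSET(va);       /* PPN | offset             */
    return true;
}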
Address Translation & Protection

Virtual Address = Virtual Page No. (VPN) | offset
The Kernel/User mode and Read/Write access type feed a Protection Check performed alongside Address Translation; either can raise an exception.
Physical Address = Physical Page No. (PPN) | offset

• Every instruction and data access needs address translation and protection checks
• A good VM design needs to be fast (~ one cycle) and space efficient
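A sketch of the protection check performed alongside translation, using the kernel/user mode and read/write inputs shown above; the permission-bit layout is an assumption for illustration only.

#include <stdbool.h>

/* Permission bits kept with each translation (layout assumed). */
typedef struct {
    bool user_ok;     /* accessible in user mode */
    bool readable;
    bool writable;
} perm_t;

typedef enum { ACCESS_OK, PROT_FAULT } prot_result_t;

/* Check an access against the page's permissions: kernel/user mode and
 * read/write type are the inputs in the figure; a failed check raises a
 * protection exception instead of producing a physical address. */
prot_result_t protection_check(perm_t p, bool user_mode, bool is_write)
{
    if (user_mode && !p.user_ok)  return PROT_FAULT;
    if (is_write  && !p.writable) return PROT_FAULT;
    if (!is_write && !p.readable) return PROT_FAULT;
    return ACCESS_OK;
}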
Translation Lookaside Buffers
Address translation is very expensive!
In a two-level page table, each reference
becomes several memory accesses
Solution: Cache translations in TLB
TLB hit ⇒ Single Cycle Translation
TLB miss ⇒ Page Table Walk to refill

virtual address:  VPN | offset          (VPN = virtual page number)
TLB entry:        V R W D | tag | PPN   (PPN = physical page number)
on a hit:         physical address = PPN | offset


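A fully associative TLB lookup can be sketched as below. The entry count, full associativity, and field layout are assumptions, and real hardware performs the tag comparison in parallel rather than with a loop.

#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64            /* size is an assumption */
#define PAGE_SHIFT  12

/* One TLB entry: valid/read/write/dirty bits, VPN tag, PPN. */
typedef struct {
    bool     valid, read, write, dirty;
    uint32_t vpn;                 /* tag */
    uint32_t ppn;
} tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];

/* Fully associative lookup: on a hit, return the translated physical
 * address; on a miss, the caller must walk the page table. */
bool tlb_lookup(uint32_t va, uint32_t *pa)
{
    uint32_t vpn    = va >> PAGE_SHIFT;
    uint32_t offset = va & ((1u << PAGE_SHIFT) - 1);

    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pa = (tlb[i].ppn << PAGE_SHIFT) | offset;
            return true;          /* single-cycle translation in hardware */
        }
    }
    return false;                 /* TLB miss: page table walk needed */
}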
Handling a TLB Miss

• Software (MIPS, Alpha)
  – TLB miss causes an exception and the operating system walks the page tables and reloads the TLB. A privileged "untranslated" addressing mode is used for the walk.

• Hardware (SPARC v8, x86, PowerPC)
  – A memory management unit (MMU) walks the page tables and reloads the TLB.
  – If a missing (data or PT) page is encountered during the TLB reload, the MMU gives up and signals a Page-Fault exception for the original instruction.

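For the software case, the refill handler amounts to a page-table walk followed by writing a TLB entry. This is a sketch under the assumptions of the earlier snippets (a walk() routine and a tlb[] array); raise_page_fault() is a hypothetical OS entry point, and round-robin replacement is an arbitrary choice.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical pieces assumed from the earlier sketches. */
typedef uint32_t pte_t;
extern bool walk(pte_t *l1, uint32_t va, uint32_t *pa);
extern void raise_page_fault(uint32_t va);

#define TLB_ENTRIES 64
#define PAGE_SHIFT  12
typedef struct { bool valid; uint32_t vpn, ppn; } tlb_entry_t;
extern tlb_entry_t tlb[TLB_ENTRIES];

/* Software TLB refill (MIPS/Alpha style): runs as an exception handler
 * in a privileged, untranslated addressing mode so that its own
 * page-table loads cannot themselves take TLB misses. */
void tlb_refill(pte_t *root, uint32_t faulting_va)
{
    static unsigned next;                 /* round-robin victim index */
    uint32_t pa;

    if (!walk(root, faulting_va, &pa)) {
        raise_page_fault(faulting_va);    /* page not resident */
        return;
    }
    tlb_entry_t *e = &tlb[next++ % TLB_ENTRIES];
    e->valid = true;
    e->vpn   = faulting_va >> PAGE_SHIFT;
    e->ppn   = pa >> PAGE_SHIFT;
    /* Returning from the exception restarts the faulting instruction,
     * which now hits in the TLB. */
}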
Translation for Page Tables
• Can references to page tables cause TLB misses?
• Can this go on forever?

[Diagram: the User PTE Base points to the User Page Table, which lives in virtual space; the System PTE Base points to the System Page Table, which lives in physical space; both ultimately map the Data Pages.]
Variable-Sized Page Support
Virtual Address (32 bits):
  bits 31-22: p1 (10-bit L1 index)
  bits 21-12: p2 (10-bit L2 index)
  bits 11-0:  offset

[Diagram: same two-level structure as before, except that a Level 1 entry may map a large page in primary memory directly, in which case the L2 index and the 12-bit offset together form the offset within the large page. The legend distinguishes pages in primary memory, large pages in primary memory, pages in secondary memory, and PTEs of nonexistent pages.]
Variable-Size Page TLB
Some systems support multiple page sizes.

virtual address:  VPN | offset
TLB entry:        V R W D | Tag | PPN | L     (L selects the page size)
on a hit:         physical address = PPN | offset

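The L bit can be modeled as selecting how many address bits are treated as page offset. Below is a sketch assuming just two page sizes (4 KB and 4 MB) with the VPN and PPN stored in 4 KB units; real TLBs differ in encoding and may support more sizes.

#include <stdint.h>
#include <stdbool.h>

/* TLB entry supporting two page sizes, selected by a "large page" bit
 * L as on the slide (sizes and the single L bit are assumptions). */
typedef struct {
    bool     valid, large;        /* large == the L bit            */
    uint32_t vpn;                 /* tag, in 4 KB-page units       */
    uint32_t ppn;                 /* frame number, in 4 KB units   */
} vtlb_entry_t;

static inline bool vtlb_match(const vtlb_entry_t *e, uint32_t va, uint32_t *pa)
{
    /* A large (4 MB) page ignores the low 10 bits of the VPN/PPN. */
    uint32_t shift = e->large ? 22 : 12;
    if (!e->valid || (e->vpn >> (shift - 12)) != (va >> shift))
        return false;
    uint32_t offset = va & ((1u << shift) - 1);
    *pa = ((e->ppn >> (shift - 12)) << shift) | offset;
    return true;
}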
Address Translation: putting it all together

Virtual Address
  → TLB Lookup (hardware)
      hit  → Protection Check (hardware)
               permitted → Physical Address (to cache)
               denied    → Protection Fault → SEGFAULT (software)
      miss → Page Table Walk (hardware or software)
               page ∈ memory → Update TLB, restart instruction
               page ∉ memory → Page Fault (software: OS loads page), restart instruction
Address Translation in CPU Pipeline

[Pipeline: PC → Inst TLB → Inst Cache → Decode → Execute → Data TLB → Data Cache → Writeback. Both the instruction and the data access can raise a TLB miss, a Page Fault, or a Protection violation.]

• Software handlers need a restartable exception on page fault or protection violation
• Handling a TLB miss needs a hardware or software mechanism to refill the TLB
• Need mechanisms to cope with the additional latency of a TLB:
  – slow down the clock
  – pipeline the TLB and cache access
  – virtual address caches
  – parallel TLB/cache access
Virtual Address Caches

Baseline:    CPU → (VA) → TLB → (PA) → Physical Cache → Primary Memory

Alternative: place the cache before the TLB (StrongARM):
             CPU → (VA) → Virtual Cache → TLB → (PA) → Primary Memory

• one-step process in case of a hit (+)
• cache needs to be flushed on a context switch unless address space identifiers (ASIDs) included in tags (-)
• aliasing problems due to the sharing of pages (-)
• maintaining cache coherence (-) (see later in course)

Aliasing in Virtual-Address Caches

[Diagram: in the page table, two virtual pages VA1 and VA2 share one physical page PA; in the virtual cache, tag VA1 holds a 1st copy of the data at PA and tag VA2 holds a 2nd copy.]

A virtual cache can have two copies of the same physical data. Writes to one copy are not visible to reads of the other!

General Solution: Disallow aliases to coexist in the cache

Software (i.e., OS) solution for a direct-mapped cache: VAs of shared pages must agree in the cache index bits; this ensures all VAs accessing the same PA will conflict in a direct-mapped cache (early SPARCs)
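A sketch of that OS page-coloring rule in C. The geometry is assumed (4 KB pages, 16 KB direct-mapped cache), so cache-index bits VA[13:12] lie above the page offset and form the page "color"; two VAs mapping the same physical page must share a color.

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT   12
#define CACHE_SHIFT  14                       /* 16 KB direct-mapped cache */
#define COLOR_MASK   ((1u << CACHE_SHIFT) - (1u << PAGE_SHIFT))  /* bits 13:12 */

/* True if va1 and va2 would index the same cache sets, i.e. aliases of
 * the same physical page cannot coexist in the cache. */
static inline bool same_color(uint32_t va1, uint32_t va2)
{
    return ((va1 ^ va2) & COLOR_MASK) == 0;
}

/* The OS would check same_color(va1, va2) before creating a second
 * mapping of a shared physical page, or choose va2 so that it holds. */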
CS152 Administrivia
• Tuesday Mar 4, Quiz 2
– Memory hierarchy lectures L6-L8, PS 2, Lab 2
– In class, closed book

Concurrent Access to TLB & Cache

[Diagram: the VA is split as VPN | L | b; the low L + b bits form the virtual index into a direct-mapped cache of 2^L blocks of 2^b bytes each; in parallel the TLB translates the VPN into the PPN (the page offset is k bits); the PPN is compared with the cache's physical tag to generate hit?]

Index L is available without consulting the TLB
⇒ cache and TLB accesses can begin simultaneously
Tag comparison is made after both accesses are completed

Cases: L + b = k,  L + b < k,  L + b > k


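The sizing condition can be checked with simple arithmetic. This sketch uses example numbers (assumptions): if the index plus block-offset bits fit within the page offset, the virtually-indexed cache needs no anti-aliasing measures.

#include <stdio.h>

/* Virtually-indexed, physically-tagged sizing check from the slide:
 * the cache index (L bits) plus block offset (b bits) must fit inside
 * the page offset (k bits) for the index to be usable before the TLB
 * result without alias problems. */
int main(void)
{
    const int k = 12;             /* 4 KB pages: page-offset bits */
    const int b = 5;              /* 32-byte cache blocks         */
    const int L = 7;              /* 2^7 = 128 sets               */

    int bytes_per_way = 1 << (L + b);     /* 4 KB per way with these numbers */

    if (L + b <= k)
        printf("index fits in page offset: %d-byte way, no alias problem\n",
               bytes_per_way);
    else
        printf("L + b > k: need anti-aliasing (OS coloring, L2 check) "
               "or higher associativity\n");
    return 0;
}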
Virtual-Index Physical-Tag Caches: Associative Organization

[Diagram: the VA is split as VPN | a | L = k-b | b; the L-bit index selects a set in each of 2^a direct-mapped ways of 2^L blocks; the TLB supplies the PPN, which is compared against the 2^a physical tags to generate hit? and select the data.]

After the PPN is known, 2^a physical tags are compared

Is this scheme realistic?


Concurrent Access to TLB & Large L1
The problem with L1 > Page size

[Diagram: the VA is split as VPN | a | Page Offset | b; the virtual index into the direct-mapped L1 PA cache includes the a bit(s) above the page offset, so VA1 and VA2 can occupy different sets while both holding entries tagged PPNa with the same Data; the TLB produces PPN | Page Offset and the tag compare generates hit?]

Can VA1 and VA2 both map to PA?

A solution via Second Level Cache

[Diagram: the CPU (with register file RF) accesses an L1 Instruction Cache and an L1 Data Cache; both are backed by a Unified L2 Cache, which is backed by Memory.]

• Usually a common L2 cache backs up both Instruction and Data L1 caches
• L2 is "inclusive" of both Instruction and Data caches

Anti-Aliasing Using L2: MIPS R10000

[Diagram: as on the previous slide, the VA is split as VPN | a | Page Offset | b and the L1 PA cache is virtually indexed and direct-mapped; the a virtual-index bit is also stored into the L2 tag, so each direct-mapped L2 entry holds PA | a1 | Data.]

• Suppose VA1 and VA2 both map to PA and VA1 is already in L1 and L2 (VA1 ≠ VA2)
• After VA2 is resolved to PA, a collision will be detected in L2
• VA1 will be purged from L1 and L2, and VA2 will be loaded ⇒ no aliasing!

Virtually-Addressed L1: Anti-Aliasing using L2

[Diagram: the L1 VA cache is indexed and tagged virtually (entries VA1 → Data, VA2 → Data); the TLB produces PPN | Page Offset; the L2 PA cache is physically indexed and tagged, and each entry also records the "virtual tag" of the L1 copy (PA | VA1 | Data).]

Physically-addressed L2 can also be used to avoid aliases in a virtually-addressed L1: L2 "contains" L1
Page Fault Handler
• When the referenced page is not in DRAM:
  – The missing page is located (or created)
  – It is brought in from disk, and the page table is updated
    Another job may be run on the CPU while the first job waits for the requested page to be read from disk
  – If no free pages are left, a page is swapped out
    Pseudo-LRU replacement policy
• Since it takes a long time to transfer a page (msecs), page faults are handled completely in software by the OS
  – Untranslated addressing mode is essential to allow kernel to access page tables

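Pseudo-LRU is usually approximated rather than computed exactly; one common approximation is the "clock" (second-chance) scan sketched below. The frame count and the reference-bit array are assumptions, and real kernels add many refinements (dirty bits, free lists, and so on).

#include <stdbool.h>
#include <stddef.h>

#define NUM_FRAMES 1024

static bool referenced[NUM_FRAMES];   /* set on each use of the frame */
static size_t hand;                   /* clock hand */

/* Choose a victim frame to swap out when no free frames remain:
 * sweep the frames, clearing reference bits, and evict the first
 * frame found whose bit is already clear (not recently used). */
size_t choose_victim_frame(void)
{
    for (;;) {
        if (!referenced[hand]) {
            size_t victim = hand;
            hand = (hand + 1) % NUM_FRAMES;
            return victim;
        }
        referenced[hand] = false;     /* give it a second chance */
        hand = (hand + 1) % NUM_FRAMES;
    }
}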
Hierarchical Page Table

A program that traverses the page table needs a "no translation" addressing mode.

Virtual Address (32 bits):
  bits 31-22: p1 (10-bit L1 index)
  bits 21-12: p2 (10-bit L2 index)
  bits 11-0:  offset

[Same two-level diagram as before: the root of the current page table (a processor register) points to the Level 1 page table; entry p1 selects a Level 2 page table; entry p2 selects a data page, which may be in primary memory, in secondary memory, or nonexistent.]
Swapping a Page of a Page Table

• A PTE in primary memory contains primary or secondary memory addresses
• A PTE in secondary memory contains only secondary memory addresses
⇒ a page of a PT can be swapped out only if none of its PTEs point to pages in primary memory

Why? __________________________________

Atlas Revisited

[Diagram: an array of PARs, indexed by PPN, each holding a VPN.]

• One PAR (Page Address Register) for each physical page
• PARs contain the VPNs of the pages resident in primary memory
• Advantage: the size is proportional to the size of the primary memory
• What is the disadvantage?

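The PAR organization can be sketched as an inverted lookup: one register per frame, searched for the requested VPN. The frame count is an assumption; the linear scan makes the disadvantage concrete, since real hardware would need an associative (parallel) search to make this fast.

#include <stdint.h>
#include <stdbool.h>

#define NUM_FRAMES 2048               /* assumed frame count */

static uint32_t par[NUM_FRAMES];      /* par[ppn] = resident VPN */
static bool     par_valid[NUM_FRAMES];

/* Translate by searching every PAR for a matching VPN; the search cost
 * grows with physical memory size. */
bool atlas_translate(uint32_t vpn, uint32_t *ppn)
{
    for (uint32_t f = 0; f < NUM_FRAMES; f++) {
        if (par_valid[f] && par[f] == vpn) {
            *ppn = f;
            return true;
        }
    }
    return false;                     /* not resident: page fault */
}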
Hashed Page Table: Approximating Associative Addressing

[Diagram: the Virtual Address is split as VPN | d; hash(VPN, PID) produces an offset that is added to the Base of Table to give the PA of the PTE in primary memory; each entry holds <VPN, PID, PPN> (or <VPN, PID, DPN> for a non-resident page).]

• Hashed Page Table is typically 2 to 3 times larger than the number of PPNs to reduce collision probability
• It can also contain DPNs for some non-resident pages (not common)
• If a translation cannot be resolved in this table, then the software consults a data structure that has an entry for every existing page
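A hashed page table lookup can be sketched as below. The table size, the hash function, and the bounded linear probing on a collision are all illustrative assumptions.

#include <stdint.h>
#include <stdbool.h>

#define HPT_ENTRIES 4096              /* ~2-3x the number of frames */

typedef struct {
    bool     valid;
    uint32_t vpn;
    uint16_t pid;
    uint32_t ppn;
} hpt_entry_t;

static hpt_entry_t hpt[HPT_ENTRIES];

static uint32_t hpt_hash(uint32_t vpn, uint16_t pid)
{
    return ((vpn * 2654435761u) ^ pid) % HPT_ENTRIES;   /* assumed hash */
}

/* Look up <VPN, PID>; on failure the software would consult the full
 * backup structure that has an entry for every existing page. */
bool hpt_lookup(uint32_t vpn, uint16_t pid, uint32_t *ppn)
{
    uint32_t slot = hpt_hash(vpn, pid);
    for (int probe = 0; probe < 8; probe++) {            /* bounded probing */
        hpt_entry_t *e = &hpt[(slot + probe) % HPT_ENTRIES];
        if (e->valid && e->vpn == vpn && e->pid == pid) {
            *ppn = e->ppn;
            return true;
        }
    }
    return false;
}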
Global System Address Space

[Diagram: each user's address space is mapped by a Level A map into a Global System Address Space, which a Level B map translates into Physical Memory.]

• Level A maps users' address spaces into the global space, providing privacy, protection, sharing, etc.
• Level B provides demand-paging for the large global system address space
• Level A and Level B translations may be kept in separate TLBs

Hashed Page Table Walk: PowerPC Two-level, Segmented Addressing

64-bit user VA:    Seg ID (bits 0-35) | Page (36-51) | Offset (52-63)
  hashS(Seg ID) + PA of Seg Table indexes the per-process Hashed Segment Table → Global Seg ID

80-bit system VA:  Global Seg ID (bits 0-51) | Page (52-67) | Offset (68-79)
  hashP + PA of Page Table indexes the system-wide Hashed Page Table → PA

40-bit PA:         PPN (bits 0-27) | Offset (28-39)

[IBM numbers bits with MSB = 0]
PowerPC: Hashed Page Table

[Diagram: the 80-bit VA is split as VPN | d; hash(VPN) added to the Base of Table gives the PA of a slot in the table in primary memory; each slot holds <VPN, PPN> entries.]

• Each hash table slot has 8 PTEs <VPN, PPN> that are searched sequentially
• If the first hash slot fails, an alternate hash function is used to look in another slot
  – All these steps are done in hardware!
• Hashed Table is typically 2 to 3 times larger than the number of physical pages
• The full backup Page Table is a software data structure

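A sketch of that hardware search in C: an 8-entry PTE group searched sequentially, with a secondary (complemented) hash tried if the primary group misses. The table size, the hash functions, and the entry layout are simplified assumptions, not the architected PowerPC HTAB format.

#include <stdint.h>
#include <stdbool.h>

#define SLOTS          1024
#define PTES_PER_SLOT  8

typedef struct { bool valid; uint64_t vpn; uint32_t ppn; } ppc_pte_t;

static ppc_pte_t htab[SLOTS][PTES_PER_SLOT];

/* Sequentially search one 8-entry PTE group for the VPN. */
static bool search_slot(uint32_t slot, uint64_t vpn, uint32_t *ppn)
{
    for (int i = 0; i < PTES_PER_SLOT; i++) {
        if (htab[slot][i].valid && htab[slot][i].vpn == vpn) {
            *ppn = htab[slot][i].ppn;
            return true;
        }
    }
    return false;
}

bool ppc_htab_lookup(uint64_t vpn, uint32_t *ppn)
{
    uint32_t primary   = (uint32_t)(vpn % SLOTS);         /* assumed hash   */
    uint32_t secondary = (~primary) % SLOTS;              /* alternate hash */

    if (search_slot(primary, vpn, ppn))
        return true;
    if (search_slot(secondary, vpn, ppn))
        return true;
    return false;   /* hardware gives up: software consults backup table */
}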
Virtual Memory Use Today - 1

• Desktops/servers have full demand-paged virtual memory
  – Portability between machines with different memory sizes
  – Protection between multiple users or multiple tasks
  – Share small physical memory among active tasks
  – Simplifies implementation of some OS features
• Vector supercomputers have translation and protection but not demand-paging
  (Older Crays: base & bound; Japanese machines & Cray X1/X2: pages)
  – Don't waste expensive CPU time thrashing to disk (make jobs fit in memory)
  – Mostly run in batch mode (run set of jobs that fits in memory)
  – Difficult to implement restartable vector instructions

Virtual Memory Use Today - 2

• Most embedded processors and DSPs provide physical addressing only
  – Can't afford area/speed/power budget for virtual memory support
  – Often there is no secondary storage to swap to!
  – Programs custom written for particular memory configuration in product
  – Difficult to implement restartable instructions for exposed architectures

Acknowledgements
• These slides contain material developed and
copyright by:
– Arvind (MIT)
– Krste Asanovic (MIT/UCB)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)

• MIT material derived from course 6.823


• UCB material derived from course CS252

