
William Stallings

Computer Organization and Architecture

Chapter 4
Cache
Chapter 5
Internal Memory
4.1 Memory system overview
1. Characteristics

• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation
Location

• Internal (computer viewpoint)
e.g. registers, cache, main memory
• External
• Peripheral storage devices
• Accessed by I/O controller
e.g. magnetic disks, optical disks, tapes
Capacity

• Word ‼
The natural unit of memory organisation
e.g. 8, 16 or 32 bits
• Size of word / Word length ‼
Number of bits used to represent a word
• Addressable unit ‼
Smallest location which can be uniquely addressed
Unit of Transfer ‼
• number of bits read/written at a time
• Internal
Usually governed by data bus width (often one word)
• External
Usually a block which is much larger than a word
e.g. a cluster on a disk
Access Methods‼ (1)
• Sequential
• Move from previous location to desired location
• Read through in order/ in sequence
• Access time depends on data location and
previous location
e.g. tape
• Direct
• Individual blocks have unique address
• Access is by jumping to vicinity plus sequential
search
• Access time depends on data location and
previous location
e.g. disk
Access Methods‼ (2)
• Random
• Individual addresses identify locations exactly
• Access time is independent of location or
previous access
e.g. RAM
• Associative
• Data retrieved based on a portion of its contents
rather than its address
• Access time is independent of location or
previous access
e.g. cache
Performance‼
• Access time (latency)
• Time between presenting the address and data being stored (write) or available for use (read)
• Memory Cycle time
• Time may be required for the memory to “recover” before next access
• Cycle time = access time + recovery time
• Transfer Rate
• Rate at which data can be transferred into or
out of a memory unit
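A small worked example of the relations above. The timing values are made-up assumptions for illustration, and the "one word per memory cycle" relation for the transfer rate is the usual one for random-access memory; it is not stated on the slide.

```python
# Illustrative numbers only (assumed, not taken from the slides).
access_time_ns = 60.0    # address presented -> data available
recovery_time_ns = 40.0  # time the memory needs before the next access


def cycle_time_ns(access_ns, recovery_ns):
    """Cycle time = access time + recovery time."""
    return access_ns + recovery_ns


def transfer_rate_words_per_s(cycle_ns):
    """Assumes random-access memory delivering one word per memory cycle."""
    return 1e9 / cycle_ns


cycle = cycle_time_ns(access_time_ns, recovery_time_ns)
print(cycle)                             # 100.0
print(transfer_rate_words_per_s(cycle))  # 10000000.0
```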
Physical Types

• Semiconductor
e.g. RAM
• Magnetic surface memory
e.g. Disk & Tape
• Optical
e.g. CD & DVD
Physical Characteristics

• Decay
• Volatility
• Erasable
Organisation

• Physical arrangement of bits into words

(Covered in detail later)


4.2 Semiconductor Memory
1. Basic element
Memory cell properties
• Two stable (or semi-stable) states to represent binary 1 and 0
• Can be written into (at least once) to set the state (value)
• Can be read to sense the state (value)
2. RAM
• Random Access Memory
• Misnamed, since all semiconductor memory is random access
• Read/Write
• Volatile
• Temporary storage
• Dynamic and static
Dynamic RAM‼

• Bits stored as charge in capacitors
• Charges leak
• Needs refreshing even when powered
• Needs refresh circuits
• Slower
• Simpler construction
• Smaller per bit
• Less expensive (cheaper)
• Used for main memory
[Figure: DRAM cell, with select line (word line), capacitor C, bit line and sense amplifier]
Static RAM‼

• Bits stored as on/off switches


• No charge to leak
• No refreshing needed when powered
• Need no refresh circuits
• Faster
• More complex construction
• Larger per bit
• More expensive
• Cache
3. Read Only Memory (ROM)

• Permanent storage
• Microprogramming
• Library subroutines
• Systems programs (BIOS)
• Function tables
Types of ROM
• Written during manufacturing
• Very expensive for small runs
• Programmable (once)
• PROM
• Needs special equipment to program
• Read “mostly”
• Erasable Programmable (EPROM)
• Erased by UV
• Electrically Erasable (EEPROM)
• Takes much longer to write than read
• Flash memory
• Erase whole memory electrically
4. Organisation in detail‼

• A 16 Mbit chip can be organised as 1M words of 16 bits
• A bit-per-chip system uses 16 chips of 1 Mbit each, with bit 1 of each word in chip 1, and so on
*decoding process
Module Organisation (1) ( 256K*8 )

[Example]
Target: a 256K×8 bit module (256K words, word length 8 bits)
Idea: 8 chips of 256K*1, one chip per bit position
[Figure: the 18-bit address from the memory address register (MAR) goes to all eight chips (Chip #1 to Chip #8, each 256K words of 1 bit); each chip supplies one of the 8 bits of the memory buffer register (MBR)]
Module Organisation (1) ( 256K*8 )

[Figure: Chip #1 to Chip #8 (each 256K words of 1 bit) share the 18 address lines and drive data lines 1 to 8; together they form the 256K*8 module A with 18 address lines and 8 data lines]
How to get 512K*8 with module A? Combine two module A?
How to get 512K*8 with module A?

[Figure: a 512K x8 memory (? address lines, ? data lines) = 256K*8 module A (18 address lines, 8 data lines) + 256K*8 module A (18 address lines, 8 data lines)]
Combine two module A!


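Module Organisation (2) ( 1M*8 )

[Figure: several 256K*8 modules side by side; the upper 2 address bits select the module and the lower 18 bits address a word within it]
Module  Address range (2 bits + 18 bits)
00      00 000000000000000000 ... 00 111111111111111111
01      01 000000000000000000 ... 01 111111111111111111
...     ...
11      11 000000000000000000 ... 11 111111111111111111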
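The address ranges above can be read as a decoding rule. The following is a minimal sketch, assuming four 256K*8 modules and a 20-bit address whose upper 2 bits pick the module; the function name `decode` is hypothetical.

```python
MODULE_ADDR_BITS = 18   # 2**18 = 256K words per module
NUM_MODULES = 4         # 4 x 256K = 1M words (assumed from the 00..11 prefixes)


def decode(address):
    """Split a 20-bit address into (module number, address within module)."""
    if not 0 <= address < (NUM_MODULES << MODULE_ADDR_BITS):
        raise ValueError("address out of range for a 1M x 8 memory")
    module = address >> MODULE_ADDR_BITS               # upper 2 bits
    offset = address & ((1 << MODULE_ADDR_BITS) - 1)   # lower 18 bits
    return module, offset


print(decode(0b01_000000000000000000))  # (1, 0)
print(decode(0b11_111111111111111111))  # (3, 262143)
```

In hardware the upper bits would drive a decoder that enables exactly one module, while the lower 18 bits go to every module in parallel.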
*DRAM Organisation

• A 16 Mbit chip can be organised as a 2048 x 2048 x 4 bit array
• Reduces number of address pins
• Multiplex row address and column address
• 11 pins to address (2^11 = 2048)
• Adding one more pin doubles the range of both row and column values, so x4 capacity
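A sketch of the row/column multiplexing described above. It assumes that the upper 11 bits of a 22-bit cell address are sent as the row and the lower 11 bits as the column; which half is the row is an assumption here, and the function names are hypothetical.

```python
PIN_COUNT = 11  # address pins shared by row and column addresses


def multiplex(address):
    """Return (row, column) for a 22-bit cell address; each half fits on 11 pins."""
    mask = (1 << PIN_COUNT) - 1
    row = (address >> PIN_COUNT) & mask   # sent first
    column = address & mask               # sent second, on the same pins
    return row, column


def capacity(pins):
    """Addressable cells when `pins` pins carry both row and column addresses."""
    return (2 ** pins) ** 2


print(multiplex(2048))               # (1, 0)
print(capacity(11))                  # 4194304, i.e. 4M cells of 4 bits
print(capacity(12) // capacity(11))  # 4
```

The last line shows why one extra pin gives four times the capacity: it doubles the number of rows and the number of columns.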
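*Typical 16 Mb DRAM (4M x 4)

[Figure: block diagram. Control inputs RAS, CAS, WE and OE feed the timing and control logic. Address pins A0 to A10 feed the row address buffer and the column address buffer; a MUX selects between the row address and the refresh counter for the row decoder. The 2048*2048*4 memory array connects through the sense amplifiers and I/O gates, selected by the column decoder, to the data input and data output buffers on pins D0 to D3.]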
5. Error Correction

• Hard Failure
• Permanent defect
• Soft Error
• Random, non-destructive
• No permanent damage to memory
• Detected using Hamming error correcting
code
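Error Correcting Code Function

[Figure: on a write, function f computes K check bits from the M data bits, and both are stored in memory. On a read, f recomputes K check bits from the M data bits read and compares them with the stored K bits. The comparison yields no error, an error that is possible to correct (the corrector fixes Data Out), or an error that is not possible to correct (error signal).]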
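The compare-and-correct idea can be sketched with a small Hamming code. The sizes used here (M = 4 data bits, K = 3 check bits) are an assumption chosen for brevity; real memories use wider words. This sketch corrects single-bit errors only and would miscorrect a double-bit error.

```python
def encode(d):
    """d: list of 4 data bits -> 7-bit codeword (positions 1..7).
    Check bits sit at positions 1, 2 and 4."""
    c = [0] * 8                      # index 0 unused
    c[3], c[5], c[6], c[7] = d
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    return c[1:]


def correct(word):
    """Recompute the check bits and compare with the stored ones.
    A non-zero syndrome is the 1-based position of the flipped bit.
    Returns (data bits, syndrome)."""
    c = [0] + list(word)
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome] ^= 1             # flip the faulty bit back
    return [c[3], c[5], c[6], c[7]], syndrome


data = [1, 0, 1, 1]
stored = encode(data)
stored[4] ^= 1                       # soft error in position 5
print(correct(stored))               # ([1, 0, 1, 1], 5)
```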
4.3 Hierarchy List

• Registers
• L1 Cache
• L2 Cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
Design constraints of memory
• How much? Capacity
• How fast? Time is money
• How expensive? Budget
----------------------- Dilemma ----------------------
• Faster access, higher cost per bit
• Larger capacity, smaller cost per bit
• Larger capacity, slower access

Is it possible to get a fast, cheap and large enough memory? Yes!
Locality of Reference‼

• During the course of the execution of a program, memory references tend to cluster
4.4 Cache
1. Principles
• Small amount of fast memory
• Sits between normal main memory and CPU
• May be located on CPU chip or module

[Figure: CPU <-> Cache (small & fast) by word transfer; Cache <-> Main Memory (big & slow) by block transfer]
Cache/Main-Memory structure :
[Figure: the cache has C lines, numbered 0 to C-1, each holding a tag and a block of K words. Main memory has 2^n addressable words (addresses 0 to 2^n - 1), grouped into blocks of K words (Block 0 ... Block j ...).]
Cache Read Operation

[Figure: flowchart of the cache read operation; RA = read address]
Cache operation - overview

• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block (fixed number of words) from main memory to cache, then deliver target word from cache to CPU
• Cache includes tags to identify which block of main memory is in every cache slot
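The read operation above can be sketched in a few lines. This is a simplified model under stated assumptions: the cache is an unbounded dictionary keyed by block number (so no mapping function or replacement is modelled), the block length of 4 words is assumed, and the memory contents are made up.

```python
BLOCK_WORDS = 4                       # K words per block (assumed value)

main_memory = list(range(100, 164))   # 64 words of made-up contents
cache = {}                            # tag (block number) -> list of K words
stats = {"hit": 0, "miss": 0}


def read(address):
    """Return the word at `address`, loading its whole block on a miss."""
    block, offset = divmod(address, BLOCK_WORDS)
    if block in cache:
        stats["hit"] += 1
    else:
        stats["miss"] += 1
        start = block * BLOCK_WORDS
        cache[block] = main_memory[start:start + BLOCK_WORDS]
    return cache[block][offset]       # word is always delivered from the cache


for a in range(8):                    # sequential references cluster in blocks
    read(a)
print(stats)                          # {'hit': 6, 'miss': 2}
```

Because references cluster, only the first access to each block misses; the other three words of the block are already in the cache.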
Typical Cache Organization

[Figure: the processor connects to the cache through address, control and data lines; an address buffer and a data buffer connect these lines to the system bus]
4.5 Cache Design

• Cache Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Line Size
• Number of Caches
1 Cache Size

• Cost
• More cache is expensive
• Speed
• More cache is faster (up to a point)
• Checking cache for data takes time

L1 Cache used to be 1-512 KB


2 Mapping Function

Example :
• Cache of 64 KBytes
• Cache block of 4 bytes
• i.e. cache is 16K (2^14) lines of 4 bytes: 64K / 4 = 16K = 2^14
• 16 MBytes main memory
• Addressable by a 24 bit address: 2^24 = 16M
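The arithmetic of this example can be checked directly. The variable names below are hypothetical; the sizes are those given in the example.

```python
CACHE_BYTES = 64 * 1024
BLOCK_BYTES = 4
MEMORY_BYTES = 16 * 1024 * 1024

lines = CACHE_BYTES // BLOCK_BYTES             # number of cache lines
line_bits = lines.bit_length() - 1             # log2 of a power of two
address_bits = MEMORY_BYTES.bit_length() - 1
word_bits = BLOCK_BYTES.bit_length() - 1       # selects a byte within a block
block_bits = address_bits - word_bits          # identifies a memory block

print(lines, line_bits)                        # 16384 14
print(address_bits, word_bits, block_bits)     # 24 2 22
```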
(1) Direct Mapping

• Each block of main memory maps to only one cache line (n:1 mapping)
i.e. if a block is in cache, it must be in one specific place
• Address is split into two parts
Least significant w bits identify a unique word
Most significant s bits specify one memory block
• The s most significant bits are split into a cache line field of r bits and a tag of s-r bits
Direct Mapping Example

[Figure: n:1 mapping from the 16MB main memory to the 16 Kword cache. Addresses 000000 to 000003 (binary 0000 0000 0000 0000 0000 0000 to 0000 0000 0000 0000 0000 0011) share the 22-bit block number 0000 0000 0000 0000 0000 00; addresses 160000 to 160003 share the block number 0001 0110 0000 0000 0000 00. Both blocks have the same 14-bit line field, so they map to the same cache line; the highest line number is 11 1111 1111 1111.]
Direct Mapping Address Structure

Field:  Tag (s-r) | Line or Slot (r) | Word (w)
Bits:   8         | 14               | 2
• 24 bit address
• 2 bit word identifier (4 bytes per block/line)
• 22 bit block identifier
• 14 bit slot or line
• 8 bit tag (8=22-14)

• No two blocks that map to the same line have the same tag
• Search cache for block by finding line and checking tag
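A minimal sketch of the 8/14/2 address split described above; the function name `split_direct` is hypothetical.

```python
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2


def split_direct(address):
    """Split a 24-bit address into (tag, line, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word


print([hex(x) for x in split_direct(0x160004)])  # ['0x16', '0x1', '0x0']
print([hex(x) for x in split_direct(0x000004)])  # ['0x0', '0x1', '0x0']
```

Both addresses select line 1; only the tag (16 versus 00) tells the two blocks apart, which is why the tag must be stored with the line and checked on every access.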
Direct Mapping Cache Organization

[Figure: direct mapping cache organization]

Direct Mapping pros & cons

• Simple
• Inexpensive
• Only one comparator is needed
• Only one tag comparison per access
• Fixed location for given block
If a program repeatedly accesses 2 blocks that map to the same line, the cache miss rate is very high
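The drawback in the last point can be demonstrated with a small simulation. This is a sketch that tracks only the tag held in each line (no data), using the 14-bit line and 2-bit word fields of the running example; the access sequences are made up.

```python
LINE_BITS, WORD_BITS = 14, 2
NUM_LINES = 1 << LINE_BITS


def run(addresses):
    """Return the number of misses for a sequence of byte addresses."""
    tags = [None] * NUM_LINES          # tag currently stored in each line
    misses = 0
    for a in addresses:
        line = (a >> WORD_BITS) % NUM_LINES
        tag = a >> (WORD_BITS + LINE_BITS)
        if tags[line] != tag:          # miss: the block has only one possible line
            tags[line] = tag
            misses += 1
    return misses


conflicting = [0x000000, 0x160000] * 10   # both map to line 0
friendly = [0x000000, 0x160004] * 10      # lines 0 and 1
print(run(conflicting))  # 20
print(run(friendly))     # 2
```

With the conflicting pair every access evicts the block the next access needs, so all 20 accesses miss even though the rest of the cache is empty.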
(2) Associative Mapping

• A main memory block can load into any line of cache
• Memory address is interpreted as tag and word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive

Field:  Tag (s) | Word (w)
Bits:   22      | 2
Associative Mapping Example

[Figure: 16Mbyte main memory of 32-bit words and a 16K word cache with line numbers 0000 to 3FFF; each line holds a 22-bit tag and 32 bits of data. The tag is the upper 22 bits of the 24-bit address: addresses 000000 to 000003 share tag bits 0000 0000 0000 0000 0000 00, and addresses 160000 to 160003 share tag bits 0001 0110 0000 0000 0000 00. Tags shown in the cache include 3FFFFF, 058001 and 000000.]
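A sketch of an associative lookup with the 22-bit tag and 2-bit word fields. The two cache entries reuse tag and data values that appear in the example figure; the function names are hypothetical, and the loop stands in for hardware that compares every tag at once.

```python
WORD_BITS = 2


def split_assoc(address):
    """Split a 24-bit address into (22-bit tag, 2-bit word)."""
    return address >> WORD_BITS, address & ((1 << WORD_BITS) - 1)


def lookup(cache_lines, address):
    """cache_lines: list of (tag, data) pairs in any order.
    Returns the line's data on a hit, None on a miss."""
    tag, _word = split_assoc(address)
    for line_tag, data in cache_lines:   # every line's tag is examined
        if line_tag == tag:
            return data
    return None


lines = [(0x3FFFFF, 0x11223344), (0x058001, 0x11235813)]
print(hex(split_assoc(0x160004)[0]))   # 0x58001
print(hex(lookup(lines, 0x160004)))    # 0x11235813
print(lookup(lines, 0x000000))         # None
```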
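Fully Associative Cache Organization

[Figure: the memory address is split into a tag of s bits and a word field of w bits. The tag is compared with the tag of every cache line L0 ... Lj ... Lm-1. On a match (hit in cache) the word field selects one of the words W0 to W3 of that line; otherwise (miss in cache) the block Bj, i.e. words W(4j) to W(4j+3), is fetched from main memory.]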
(3) Set Associative Mapping

• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given set
e.g. Block B can be in any line of set i
e.g. 2 lines per set
• 2 way associative mapping
• A given block can be in one of 2 lines in only one set
Two Way Set Associative Mapping Example
Set Associative Mapping Address Structure

Field:  Tag | Set | Word
Bits:   9   | 13  | 2

• Use set field to determine cache set to look in
• Compare tag field to see if we have a hit
• e.g.
Address     Tag   Data       Set number
1FF 7FFC    1FF   12345678   1FFF
001 7FFC    001   11223344   1FFF
• These two blocks map to the same set. We use the tag to distinguish them.
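The two example addresses can be checked with a short sketch of the 9/13/2 split. It assumes the address notation "1FF 7FFC" means a 9-bit tag followed by the remaining 15 bits; the function name is hypothetical.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2


def split_set_assoc(address):
    """Split a 24-bit address into (tag, set number, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    set_no = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_no, word


a = (0x1FF << 15) | 0x7FFC   # "1FF 7FFC", i.e. 0xFFFFFC
b = (0x001 << 15) | 0x7FFC   # "001 7FFC", i.e. 0x00FFFC
print([hex(x) for x in split_set_assoc(a)])  # ['0x1ff', '0x1fff', '0x0']
print([hex(x) for x in split_set_assoc(b)])  # ['0x1', '0x1fff', '0x0']
```

Both addresses land in set 1FFF; in a 2-way cache they can occupy the two lines of that set at the same time, and the tag distinguishes them.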
K-Way (here Two-Way) Set Associative Cache Organization

[Figure: k-way set associative cache organization]

3 Replacement Algorithms
(1) Direct mapping

• No choice
• Each block only maps to one line
• Replace that line
Replacement Algorithms
(2) Associative & Set Associative
• Hardware implemented algorithm (speed)
• Least recently used (LRU)
• e.g. in 2 way set associative
• Which of the 2 blocks is LRU?
• First in first out (FIFO)
• Replace the block that has been in cache longest
• Least frequently used (LFU)
• Replace the block which has had fewest hits
• Random
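A sketch of LRU replacement within one set. Real caches do this in hardware; this software model, with the hypothetical class name `LRUSet`, only illustrates the policy, and the reference string is made up.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set holding at most `ways` tags, evicting the least recently used."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()           # tag -> None, least recent first

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # now the most recently used
            return True
        if len(self.lines) == self.ways:
            self.lines.popitem(last=False)   # evict the least recently used
        self.lines[tag] = None
        return False


s = LRUSet(ways=2)
results = [s.access(t) for t in ["A", "B", "A", "C", "B", "A"]]
print(results)        # [False, False, True, False, False, False]
print(list(s.lines))  # ['B', 'A']
```

When C arrives the set holds A and B; B was used least recently, so B is replaced. FIFO would instead have replaced A, the block loaded first.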
4 Write Policy

• Must not overwrite a cache block unless main memory is up to date
• Multiple CPUs may have individual caches
• I/O may address main memory directly
(1) Write through

• All writes go to main memory as well as cache
• Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date
• Lots of traffic
• Slows down writes
(2) Write back

• Updates initially made in cache only
• Update bit for cache slot is set when update occurs
• If block is to be replaced, write to main memory only if update bit is set
• Other caches get out of sync
• I/O must access main memory through cache
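The traffic difference between the two write policies can be sketched by counting main-memory writes for a single cache line. The model is a deliberate simplification: it ignores reads and the size difference between a word write and a block write, and the write sequence is made up.

```python
def memory_writes(tags_written, policy):
    """Count main-memory writes caused by a series of CPU writes to one cache line.
    tags_written: tag of the block targeted by each write; a new tag replaces the block."""
    writes = 0
    current, dirty = None, False
    for tag in tags_written:
        if tag != current:                        # block in the line is replaced
            if policy == "write-back" and dirty:
                writes += 1                       # update bit set: flush old block
            current, dirty = tag, False
        if policy == "write-through":
            writes += 1                           # every write also goes to memory
        else:
            dirty = True                          # write-back: only set the update bit
    return writes


seq = ["A"] * 5 + ["B"] * 3
print(memory_writes(seq, "write-through"))  # 8
print(memory_writes(seq, "write-back"))     # 1
```

At the end of the write-back run block B is still modified in the cache and main memory is stale, which is exactly why other caches and I/O can get out of sync.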
