
Memory

The Memory/Processor Performance Gap

[Figure: the memory-processor performance gap.]


The memory hierarchy

[Figure: levels in the memory hierarchy, from Level 1 nearest the CPU through Level 2 to Level n. Distance from the CPU in access time increases with each level; the figure also indicates the size of the memory at each level.]
Fundamentals of Memory Hierarchy
 Locality
 Temporal locality
 Data that are needed now are likely to be needed again in the near future
 Spatial locality
 There is a high probability that other data nearby will be needed soon
Two Processor/Memory Architectures

 Princeton
 Fewer memory wires
 Harvard
 Simultaneous program and data memory access
[Figure: Harvard architecture (processor with separate program memory and data memory) beside Princeton architecture (processor with a single memory for program and data).]
Memory Access Process
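[Figure: memory access process. The CPU sends a virtual address to the TLB. A TLB hit yields the physical address; a TLB miss goes through address translation. The physical address (tag) is presented to the cache: a hit returns data to the CPU, a miss goes to main memory.]

 TLB (translation lookaside buffer): translates a virtual address to a physical address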
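
The translation step can be sketched in a few lines of Python. This is a minimal model, not any particular processor's design: the 4 KiB page size, the contents of `PAGE_TABLE`, and the names `translate` and `tlb` are assumptions made for illustration, and the cache and main memory stages are left out.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB); not given in the slides

# Hypothetical page table: virtual page number -> physical frame number.
PAGE_TABLE = {0x0: 0x5, 0x1: 0x9, 0x2: 0x3}

tlb = {}  # small cache of recent translations (virtual page -> frame)


def translate(vaddr):
    """Return (physical_address, tlb_hit) for a virtual address."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                      # TLB hit: translation is cached
        return tlb[vpn] * PAGE_SIZE + offset, True
    frame = PAGE_TABLE[vpn]             # TLB miss: consult the page table
    tlb[vpn] = frame                    # cache the translation for next time
    return frame * PAGE_SIZE + offset, False


first = translate(0x1234)   # page 0x1, offset 0x234: misses, then fills the TLB
second = translate(0x1238)  # same page: hits in the TLB
print(hex(first[0]), first[1])
print(hex(second[0]), second[1])
```

The second access lands on the same page as the first, so it finds its translation in the TLB. This is the locality argument from earlier applied to address translation.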
Memory management units

 Handles DRAM refresh, bus interface and arbitration
 Takes care of memory sharing among multiple processors
 Translates logical memory addresses from the processor to physical memory addresses
[Figure: the CPU sends a logical address to the memory management unit, which sends a physical address to main memory.]
Memory Data Organization
 Endianness
 Big Endian/Little Endian

 Memory data alignment


Endianness
 The order of bytes (sometimes bits) in memory used to represent data of different types
 Little/Big Endian
 Little Endian:
 Puts the least-significant byte first (at the lowest address)
 e.g., Intel processors
 Big Endian:
 Puts the most-significant byte first (at the lowest address)
 e.g., some PowerPCs, Motorola, MIPS, SPARC
Big/Little Endian Example
 32-bit data 0xFABC0123 at address 0xFF20

Address        0xFF20  0xFF21  0xFF22  0xFF23
Big Endian     0xFA    0xBC    0x01    0x23
Little Endian  0x23    0x01    0xBC    0xFA
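
The table above can be reproduced with Python's built-in `int.to_bytes`. The helper name `byte_layout` is an assumption for this sketch; it simply pairs each address with the byte stored there.

```python
def byte_layout(value, base_addr, byteorder, size=4):
    """Map each address to the byte stored there for a given byte order."""
    data = value.to_bytes(size, byteorder=byteorder)
    return {base_addr + i: b for i, b in enumerate(data)}


big = byte_layout(0xFABC0123, 0xFF20, "big")
little = byte_layout(0xFABC0123, 0xFF20, "little")

for addr in sorted(big):
    print(f"{addr:#06x}  big={big[addr]:#04x}  little={little[addr]:#04x}")
```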
Data Alignment
 Data alignment
 A multi-byte datum needs to be placed at an address that is a multiple of its size
 Examples
 0bxxxxxxxxxx byte (8-bit) aligned
 0bxxxxxxxxx0 half word (16-bit) aligned
 0bxxxxxxxx00 word (32-bit) aligned
 0bxxxxxxx000 double word (64-bit) aligned
 Why aligned?
 Misalignment complicates the implementation and reduces performance
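
The bit patterns above amount to checking that the low-order address bits are zero. A small sketch, assuming sizes that are powers of two; the helper names `is_aligned` and `align_up` are illustrative and not from the slides.

```python
def is_aligned(addr, size):
    """True if addr is a multiple of size (size must be a power of two)."""
    return (addr & (size - 1)) == 0


def align_up(addr, size):
    """Round addr up to the next multiple of size (a power of two)."""
    return (addr + size - 1) & ~(size - 1)


for size, name in [(1, "byte"), (2, "half word"), (4, "word"), (8, "double word")]:
    print(f"{name:12s} 0x1006 aligned: {is_aligned(0x1006, size)}")
print(hex(align_up(0x1006, 8)))
```

Address 0x1006 ends in binary ...110, so it is byte and half-word aligned but not word or double-word aligned.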
Memory device: basic concepts
 Stores large number of bits
 m × n: m words of n bits each
 k = log2(m) address input signals
 or m = 2^k words
 e.g., 4,096 × 8 memory:
 32,768 bits
 12 address input signals
 8 input/output data signals
 Memory access
 r/w: selects read or write
 enable: read or write only when asserted
 multiport: multiple accesses to different locations simultaneously
[Figure: m × n memory, drawn as m words of n bits per word.]
[Figure: memory external view: a 2^k × n read and write memory with inputs r/w and enable, address lines A0 … Ak-1, and data lines Qn-1 … Q0.]
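
The arithmetic for the 4,096 × 8 example can be checked with a short sketch. The function name `memory_geometry` is an assumption for illustration; it rounds the address-line count up when m is not a power of two.

```python
def memory_geometry(m, n):
    """Return (total_bits, address_lines, data_lines) for an m x n memory."""
    k = (m - 1).bit_length()   # smallest k with 2**k >= m, i.e. ceil(log2(m))
    return m * n, k, n


print(memory_geometry(4096, 8))
```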
Memory Types
ROM: “Read-Only” Memory
 Nonvolatile memory
 Can be read from, but not written to, by a processor in an embedded system
 Traditionally written to (“programmed”) before being inserted into the embedded system
 Uses
 Store software program for general-purpose processor
 program instructions can be one or more ROM words
 Store constant data needed by system
 Implement combinational circuit
[Figure: ROM external view: a 2^k × n ROM with an enable input, address lines A0 … Ak-1, and data lines Qn-1 … Q0.]
Example: 8 x 4 ROM
 Horizontal lines = words
 Vertical lines = data
 Lines connected only at circles
 Decoder sets word 2’s line to 1 if address input is 010
 Data lines Q3 and Q1 are set to 1 because there is a “programmed” connection with word 2’s line
 Word 2 is not connected with data lines Q2 and Q0
 Output is 1010
[Figure: 8 × 4 ROM internal view: a 3×8 decoder with enable and address inputs A0, A1, A2 drives the word lines (word 0, word 1, word 2, ...); programmable connections tie word lines to the wired-OR data lines Q3, Q2, Q1, Q0.]
Implementing combinational function
 Any combinational circuit of n functions of the same k variables can be implemented with a 2^k × n ROM

Truth table
Inputs (address)   Outputs
a b c              y z
0 0 0              0 0
0 0 1              0 1
0 1 0              0 1
0 1 1              1 0
1 0 0              1 0
1 0 1              1 1
1 1 0              1 1
1 1 1              1 1
[Figure: 8×2 ROM with enable, address inputs a, b, c, and outputs y, z; word 0 to word 7 hold the output columns of the truth table: 00, 01, 01, 10, 10, 11, 11, 11.]
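
A ROM used this way is just a lookup table indexed by the inputs. The sketch below stores the truth table's output columns as eight 2-bit words. Treating `a` as the most-significant address bit and returning `None` when `enable` is 0 are assumptions of this model, and `rom_read` is an illustrative name.

```python
# Contents of the 8x2 ROM: word i holds the outputs (y, z) for address i = abc.
ROM = [0b00, 0b01, 0b01, 0b10, 0b10, 0b11, 0b11, 0b11]


def rom_read(a, b, c, enable=1):
    """Look up outputs (y, z) for inputs a, b, c; returns None when disabled."""
    if not enable:
        return None
    address = (a << 2) | (b << 1) | c   # a is the most-significant address bit
    word = ROM[address]
    return (word >> 1) & 1, word & 1    # y is the high bit, z the low bit


for address in range(8):
    a, b, c = (address >> 2) & 1, (address >> 1) & 1, address & 1
    print(a, b, c, "->", *rom_read(a, b, c))
```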
Types of ROM
 Written during manufacture
 Programmable (once)
 PROM
 Needs special equipment to program
 Read “mostly”
 Erasable Programmable (EPROM)
 Erased by UV light, which clears the entire chip
 Words can be programmed individually
 Electrically Erasable (EEPROM)
 Takes much longer to write than read
 Can program and erase individual words
 Flash memory
 Large blocks of memory are erased and written at once, rather than one word at a time
 Faster erase
RAM
 Random access memory
 Typically volatile memory
 bits are not held without power supply
 Read and written easily by processor during execution
 Internal structure more complex than ROM
 a word consists of several memory cells, each storing 1 bit
 each input and output data line connects to each cell in its column
 rd/wr connected to every cell
 when row is enabled by decoder, each cell has logic that stores input data bit when rd/wr indicates write or outputs stored bit when rd/wr indicates read
[Figure: RAM external view: a 2^k × n read and write memory with inputs r/w and enable, address lines A0 … Ak-1, and data lines Qn-1 … Q0.]
[Figure: RAM internal view: a 4×4 RAM with data inputs I3, I2, I1, I0, a 2×4 decoder with enable and address inputs A0, A1, rd/wr routed to every memory cell, and data outputs Q3, Q2, Q1, Q0.]
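
The behaviour of the 4×4 RAM can be mimicked with a toy model. This is a behavioural sketch only: the class name `Ram4x4`, the `access` method, and returning `None` when the device is disabled or being written are assumptions, not part of the slides.

```python
class Ram4x4:
    """Toy model of a 4-word x 4-bit RAM with enable and rd/wr inputs."""

    def __init__(self):
        self.cells = [0] * 4            # one 4-bit word per row

    def access(self, addr, write, data_in=0, enable=1):
        """Write data_in when write is true, else return the stored word."""
        if not enable:
            return None                 # no row is selected when disabled
        row = addr & 0b11               # 2x4 decoder: two address bits pick a row
        if write:
            self.cells[row] = data_in & 0b1111
            return None
        return self.cells[row]


ram = Ram4x4()
ram.access(addr=2, write=True, data_in=0b1010)
print(format(ram.access(addr=2, write=False), "04b"))
```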
Types of RAM
 SRAM: Static RAM
 Memory cell uses flip-flop to store bit
 Holds data as long as power supplied

 DRAM: Dynamic RAM


 Memory cell uses transistor and capacitor to store bit
 More compact than SRAM
 “Refresh” required due to capacitor leak
 Slower to access than SRAM
Device Schematic and Time Diagram

Block diagrams (pin numbers)

Signal        HM6264                   27C256
data<7…0>     11-13, 15-19             11-13, 15-19
addr<15...0>  2, 23, 21, 24, 25, 3-10  27, 26, 2, 23, 21, 24, 25, 3-10
/OE           22                       22
/WE           27                       (none)
/CS1          20                       20 (/CS)
CS2           26                       (none)

Device characteristics

Device  Access Time (ns)  Standby Pwr. (mW)  Active Pwr. (mW)  Vcc Voltage (V)
HM6264  85-100            0.01               15                5
27C256  90                0.5                100               5

[Figure: timing diagrams. Read operation: data, addr, OE, /CS1, CS2. Write operation: data, addr, WE, /CS1, CS2.]
Composing memory
 Memory size needed often differs from size of readily available memories
 When available memory is larger, simply ignore unneeded high-order address bits and higher data lines
 When available memory is smaller, compose several smaller memories into one larger memory
 Connect side-by-side to increase width of words
 Connect top to bottom to increase number of words
 added high-order address line selects smaller memory containing desired word using a decoder
 Combine techniques to increase number and width of words
[Figure: increase number of words: two 2^m × n ROMs form a 2^(m+1) × n ROM; address lines A0 … Am-1 go to both ROMs, and Am drives a 1×2 decoder (with enable) that selects one of them; outputs Qn-1 … Q0.]
[Figure: increase width of words: three 2^m × n ROMs side by side form a 2^m × 3n ROM sharing enable and address lines A0 … Am; outputs Q3n-1 … Q2n-1 … Q0.]
[Figure: increase number and width of words: both techniques combined, with address A, enable, and outputs.]
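
Both composition techniques can be sketched with Python lists standing in for small ROMs. The function names `stack` and `widen` and the ROM contents are hypothetical. `stack` lets the extra high-order address bit choose a memory, as the 1×2 decoder does, and `widen` feeds one address to every memory and places the outputs side by side.

```python
def stack(low, high):
    """More words: the added high-order address bit selects one of two memories."""
    m = len(low)                        # both memories hold m words
    def read(addr):
        select, local = divmod(addr, m) # select plays the role of the 1x2 decoder
        return (low, high)[select][local]
    return read


def widen(memories, n):
    """Wider words: same address goes to every memory; outputs sit side by side."""
    def read(addr):
        word = 0
        for mem in memories:            # leftmost memory supplies the high bits
            word = (word << n) | mem[addr]
        return word
    return read


rom_a = [0x1, 0x2, 0x3, 0x4]            # hypothetical 4 x 4 ROM contents
rom_b = [0xA, 0xB, 0xC, 0xD]

taller = stack(rom_a, rom_b)            # behaves as an 8 x 4 ROM
wider = widen([rom_a, rom_b], n=4)      # behaves as a 4 x 8 ROM
print(hex(taller(5)), hex(wider(2)))
```

Applying `stack` to two memories that were each built with `widen` gives the combined case, with more words and wider words at once.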
Summary
 Memory hierarchy
 Memory/processor architecture
 Memory access process
 Endianness
 Data alignment
 Memory data organization
 Memory devices
 Basics
 ROM/RAM
