
Memory organization

Memory categories:
  Internal processor memory
  Main memory
  Secondary memory
  Cache memory
The choice of memory devices used to build a memory system depends on their properties.

Goal of memory organization


To provide high average performance at a low average cost per bit, using:
  A hierarchy of different memory devices
  Automatic storage-allocation methods
  Virtual memory concepts
  An efficient memory interface that provides a higher data-transfer rate

Memory-device characteristics
Cost: c = C/S dollars/bit, where C is the total cost of the memory and S is its capacity in bits.
Access time, tA: the time taken to transfer a data item to the output after a read request is received.
Access modes:
  Random-access memory (RAM)
  Serial access
  Semirandom or direct access

Alterability
Read-only: ROM, PROM, EPROM
Read-write: RAM

Permanence of storage
Destructive readout
Non-destructive readout
Dynamic storage, which requires refreshing
Volatility

Cycle time and data-transfer rate


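Bandwidth: bM = w/tM, where w is the number of bits transferred per cycle and tM is the cycle time.
The access time tA may not be equal to the cycle time tM.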

Physical characteristics
Physical size
Storage density
Energy consumption
Reliability: mean time to failure (MTTF)

Semiconductor RAMs
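Static RAM: data remains stored as long as power is supplied.
Read: the address line is active and the data line is connected to a sense amplifier.
Write: the address line is active, the data line is driven with the data value (low or high), and Vb is connected to V1/2.
[Figure: Bipolar static RAM cell, showing the data line, the address line, two resistors R, and the voltages Va and Vb]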

Dynamic RAM: stored data decays within milliseconds even while power is applied, so periodic refreshing is required.
[Figure: Dynamic MOS cell, showing the address line, the data line, and ground]

RAM organization
Access circuitry has a significant effect on the total cost of any memory unit. To reduce this cost, the organization has two essential features:
  The storage cells are physically arranged in rectangular arrays, which simplifies the layout of the connections between the cells and the access circuitry.
  The memory address is partitioned into d components, so that the address Ai of cell Ci becomes a d-dimensional vector (Ai,1, Ai,2, ..., Ai,d) = Ai. Each of the d parts of an address word goes to a different address decoder and a different set of address drivers. A particular cell is selected by simultaneously activating all d of its address lines.
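The address partitioning described above can be sketched for the two-dimensional case (d = 2). This is only an illustration: the function names `split_address` and `decoder_outputs`, and the choice of sending the high-order bits to the row decoder, are assumptions made for the example and do not come from the slides.

```python
def split_address(address: int, row_bits: int, col_bits: int) -> tuple[int, int]:
    """Split a flat cell address into (row, column) components for a 2-D array."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits              # assumed: high-order bits feed the row decoder
    col = address & ((1 << col_bits) - 1)  # assumed: low-order bits feed the column decoder
    return row, col


def decoder_outputs(select: int, width: int) -> list[int]:
    """One-hot output lines of a decoder with `width` input bits."""
    return [1 if line == select else 0 for line in range(1 << width)]


row, col = split_address(0b1011, row_bits=2, col_bits=2)
print(row, col)                  # 2 3
print(decoder_outputs(row, 2))   # [0, 0, 1, 0]
print(decoder_outputs(col, 2))   # [0, 0, 0, 1]
```

For this 16-cell example the two decoders drive 4 + 4 = 8 select lines, whereas a one-dimensional organization would need 16.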
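[Figure: Structure of a 4 x 2-bit RAM with one-dimensional addressing, showing address inputs A0 and A1, a decoder, control inputs CE and WE, data inputs X0 and X1, and data outputs Z0 and Z1]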

RAM Design

A 2^m x n-bit RAM IC: m is the number of address lines and n is the word size.

[Figure: A 2^m x n RAM IC with an m-bit address bus A, an n-bit data bus D, and control inputs CS, WE, and OE]

RAM design contd.


Increasing word size.
Increasing number of words.
Given N x w-bit RAM ICs, design an N' x w'-bit RAM, where N' > N and/or w' > w:
  Construct a p x q array of the given RAM ICs, where p = ceil(N'/N) and q = ceil(w'/w).
  Each row stores N words.
  Each column stores a fixed set of w bits from every word.
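The array dimensions can be computed directly from the ceiling formulas. The sketch below is illustrative only; `ram_array_shape` and the chip sizes in the example are hypothetical.

```python
def ram_array_shape(target_words: int, target_width: int,
                    ic_words: int, ic_width: int) -> tuple[int, int]:
    """Return (p, q): rows and columns of ICs needed to build the target RAM."""
    p = -(-target_words // ic_words)   # integer ceiling of target_words / ic_words
    q = -(-target_width // ic_width)   # integer ceiling of target_width / ic_width
    return p, q


# Hypothetical example: build a 4096 x 16-bit RAM from 1024 x 8-bit ICs.
p, q = ram_array_shape(4096, 16, 1024, 8)
print(p, q, p * q)   # 4 2 8
```

Here four rows extend the number of words and two columns extend the word size, for eight ICs in total.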

Increasing word size

Increasing number of words


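Two bits are added at the most significant end of the address, giving an (m+2)-bit address.
[Figure: Four 2^m x w RAM ICs, each with CS, OE, and WE inputs, selected by a 1-of-4 decoder driven by the two added bits of the (m+2)-bit address A; external inputs are address A, chip select CS, WE, and OE]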

Structure of a commercial 8M x 8-bit DRAM chip


The 23 address bits are multiplexed onto 13 external address lines.
Page mode.
The refresh cycle time is 64 ms. If a one-row read operation takes 90 ns, the time needed to refresh all 8192 rows of the DRAM is 90 ns x 8192 = 0.737 ms.
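The refresh arithmetic can be checked with a few lines of code. The sketch assumes that 13 address bits select the row (2^13 = 8192 rows) and that the chip is unavailable for normal accesses while rows are refreshed; the function names are hypothetical.

```python
def refresh_time_ms(rows: int, row_read_ns: float) -> float:
    """Time to refresh every row once, in milliseconds."""
    return rows * row_read_ns / 1e6


def refresh_overhead(rows: int, row_read_ns: float, period_ms: float) -> float:
    """Fraction of each refresh period spent refreshing (assumed to block accesses)."""
    return refresh_time_ms(rows, row_read_ns) / period_ms


rows = 2 ** 13                                        # assumed: 13 row-address bits
print(f"{refresh_time_ms(rows, 90):.3f} ms")          # 0.737 ms
print(f"{refresh_overhead(rows, 90, 64):.2%}")        # 1.15%
```

Under these assumptions, refreshing costs a little over 1% of the available memory time.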

Other semiconductor memories


ROMs
ROM
PROM
EPROM: programmed randomly, erased in bulk off-line
EEPROM

Flash memory
Reading is random but writing is in blocks

Fast RAM interface


If a faster external processor must be supplied with individually accessible n-bit words, there are two basic ways to increase the data-transfer rate across the memory's external interface by a factor of S:
  Use a bigger memory word: design the RAM with an internal word size of w = Sn bits, so that Sn bits are accessed as one unit in one memory cycle time TM.
  Access more than one word at a time: partition the RAM into S separate banks M0, M1, ..., MS-1, each covering part of the memory address space and each provided with its own addressing circuitry.
Fast circuits are needed inside the RAM to assemble and disassemble the words being accessed.

Both approaches require fast parallel-to-serial and serial-to-parallel circuits at the memory-processor interface. Normally the S words produced or consumed by the processor have consecutive addresses, and their placement in physical memory uses the interleaving technique.

Address Interleaving
Let Xh, Xh+1, ... be words that are expected to be accessed in sequence. They are normally placed in consecutive memory locations Ai, Ai+1, ... in the RAM.
Assign Ai to bank Mj if j = i (modulo S). If S = 2^p, then the least significant p bits of a memory address immediately identify the bank to which it belongs.
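A minimal sketch of low-order interleaving with S = 2^p banks follows. The function name `bank_and_offset` and the use of the remaining high-order bits as the word position inside the bank are assumptions made for illustration.

```python
def bank_and_offset(address: int, p: int) -> tuple[int, int]:
    """Map an address to (bank, offset) when there are S = 2**p interleaved banks."""
    bank = address & ((1 << p) - 1)   # least significant p bits: address mod S
    offset = address >> p             # assumed: remaining bits index within the bank
    return bank, offset


# With S = 4 banks (p = 2), consecutive addresses fall in consecutive banks.
for address in range(8):
    print(address, bank_and_offset(address, p=2))
```

Because consecutive addresses land in different banks, up to S sequential words can be accessed in overlapped fashion.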

Magnetic surface recording


Surface of the magnetic medium: ferric oxide.
If each track has a fixed capacity of N words and the medium rotates at r revolutions/s, a block of n words can be transferred in n/(rN) s.
The average latency is 1/(2r) s.
If ts is the average seek time, the time needed to access a block of data is
tB = ts + 1/(2r) + n/(rN)
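As a worked example of the block access time tB, the sketch below evaluates the formula for one set of drive parameters. All numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def block_access_time(seek_s: float, rev_per_s: float,
                      words_per_track: int, block_words: int) -> float:
    """tB = ts + 1/(2r) + n/(rN), in seconds."""
    latency = 1 / (2 * rev_per_s)                          # average rotational latency
    transfer = block_words / (rev_per_s * words_per_track) # time to read the block
    return seek_s + latency + transfer


# Hypothetical drive: ts = 8 ms, r = 120 rev/s, N = 100000 words/track, n = 1000 words.
t_b = block_access_time(0.008, 120, 100_000, 1_000)
print(f"{t_b * 1e3:.2f} ms")   # 12.25 ms
```

With these numbers the seek time and rotational latency dominate; the transfer itself takes under 0.1 ms.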

Magnetic disk drive


Platters
Heads
Tracks
Sectors: sector header, inter-sector gap
Cylinders

Magnetic tape
Data is stored in longitudinal tracks. Older tapes had 9 parallel tracks; now about 80 tracks are used. A single head can read or write all tracks simultaneously. Along the tracks, data is stored in blocks. Large gaps are inserted between adjacent blocks so that the tape can be started and stopped between blocks.

Optical memories
CD-ROMs
Bits are stored in 0.1 µm wide pits and lands.
Access time is about 100 ms; the data-transfer rate is 3.6 MB/s (for 24x, where x = 150 KB/s).

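Memory hierarchy
[Figures: example memory hierarchies combining the CPU, a cache (shown as a single cache, as separate instruction (I) and data (D) caches, and as L1 and L2 levels), main memory, and secondary memory, labeled M1, M2, M3 from the level nearest the CPU]

Cache write policies: write-through, write-back

The following relations normally hold between adjacent levels Mi and Mi+1:
ci > ci+1, tAi < tAi+1, si < si+1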

Virtual memory
A memory hierarchy comprising different memory devices appears to the user program as a single, large, directly addressable memory. Virtual memory:
  Provides automatic memory allocation and efficient sharing
  Makes programs independent of the main-memory space
  Achieves a relatively low cost per bit and a low access time

A memory location is addressed by a virtual address V, which must be mapped to the actual physical address R by a function f: V → R.

Locality of references
Over the short term, the addresses generated by a program tend to be localized and are therefore predictable.
Page mode: information is transferred between Mi and Mi+1 as a block of consecutive words.
Spatial locality: references tend to cluster at nearby addresses.
Temporal locality: instructions in a loop are referenced with high frequency.

Address translation
Address assignment and translation are carried out at different stages in the life of a program:
  By the programmer while writing the program
  By the compiler during compilation
  By the loader at initial program-load time
  By run-time memory-management hardware and/or software

Static translation, dynamic translation

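Effective address Aeff, base address B, displacement D:
Aeff = B + D or Aeff = B.D (B concatenated with D)
A block of m words W0, W1, ..., Wm-1 with base address B occupies addresses B, B+1, ..., B+m-1; word Wi is at address B+i.
Memory address table.
Limit address Li: an address Ai in block i must satisfy Bi <= Ai <= Li.
[Figure: Blocks K1, K2, K3 in memory with base addresses B1, B2, B3 and limit addresses L1, L2, L3]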
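Base-plus-displacement translation with a limit check (Bi <= Ai <= Li) can be sketched as follows. The exception `AddressError`, the function `translate`, and the numeric block boundaries are hypothetical and serve only to illustrate the check.

```python
class AddressError(Exception):
    """Raised when an effective address falls outside its block."""


def translate(base: int, limit: int, displacement: int) -> int:
    """Return Aeff = B + D after checking B <= Aeff <= L."""
    effective = base + displacement
    if not base <= effective <= limit:
        raise AddressError(f"address {effective} outside [{base}, {limit}]")
    return effective


# Hypothetical block K1 of m = 100 words: B1 = 1000, L1 = B1 + m - 1 = 1099.
print(translate(1000, 1099, 5))    # 1005
try:
    translate(1000, 1099, 100)     # one word past the limit address
except AddressError as err:
    print("rejected:", err)
```

Relocating the block only requires changing the stored base and limit values; the displacements used by the program stay the same.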

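Dynamic address-translation system


The translation look-aside buffer (TLB) is also referred to as an address cache.
[Figure: Dynamic address translation. The TLB, a translation table containing part of the memory map, maps the base BV of the virtual address AV to the base BR of the real address AR, which is sent to the memory system]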

Segments and pages


Page: the basic unit of memory information for swapping purposes in a multilevel memory system.
Page frame.
Segments: higher-level information blocks corresponding to logical entities such as programs or data sets.
A segment can be translated into one or more pages.
Segment address, displacement.
When a segment is not currently resident in the M1 memory, it is transferred in its entirety from the secondary memory M2.
Segment table.

Burroughs B6500 segmentation


Each program has a segment called its program reference table (PRT), which serves as its segment table.
Segment descriptor.
The Intel 80x86 and Pentium series have four 16-bit segment registers that form a segment table.

Two-stage address translation with segments and pages

Address translation for the Pentium

Advantages of segmentation
Segment boundaries correspond to natural program and data boundaries.
Because of their logical independence, a program segment can be changed or recompiled at any time without affecting other segments.
Access rights and the scopes of program variables are easy to implement.
Stacks and queues are easy to implement, because segments can be of variable length.

Pages
Fixed-length blocks.
Page table.
Page address, displacement.

Page fault.
External fragmentation.
Internal fragmentation.
Segments can be assigned to a noncontiguous area of memory by the use of paging.
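Translation of a (page address, displacement) pair through a page table, including the page-fault case, might look like the sketch below. The page size, the dictionary-based page table, and the names `PageFault` and `translate` are assumptions for illustration, not part of the slides.

```python
PAGE_SIZE = 1024  # assumed page size in words


class PageFault(Exception):
    """Raised when the referenced page has no frame in main memory."""


def translate(virtual_address: int, page_table: dict[int, int],
              page_size: int = PAGE_SIZE) -> int:
    """Map a virtual address to a physical address via the page table."""
    page, displacement = divmod(virtual_address, page_size)
    frame = page_table.get(page)          # page frame holding this page, if resident
    if frame is None:
        raise PageFault(page)
    return frame * page_size + displacement


page_table = {0: 5, 1: 2}                 # hypothetical page -> page-frame mapping
print(translate(1030, page_table))        # page 1, displacement 6 -> 2054
try:
    translate(3000, page_table)           # page 2 is not resident
except PageFault as fault:
    print("page fault on page", fault.args[0])
```

Pages 0 and 1 are adjacent in the virtual address space but occupy frames 5 and 2, which shows how paging lets a segment occupy a noncontiguous area of memory.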

Effect of page size, SP


Page size affects storage utilization and the effective data-transfer rate.
If SS >> SP, the last page assigned to a segment contains SP/2 words on average.
The page table for each segment occupies approximately SS/SP words.
Memory-space overhead associated with each segment:
$$S = \frac{S_P}{2} + \frac{S_S}{S_P}$$
Space utilization:
$$u = \frac{S_S}{S_S + S} = \frac{2 S_S S_P}{S_P^2 + 2 S_S (1 + S_P)}$$
The optimum page size is obtained when S is minimized:
$$\frac{dS}{dS_P} = \frac{1}{2} - \frac{S_S}{S_P^2} = 0, \qquad S_P^{OPT} = \sqrt{2 S_S}$$
Optimum space utilization:
$$u^{OPT} = \frac{1}{1 + \sqrt{2 / S_S}}$$

The effect of page size on hit ratio is complex


When SP is small, H increases with SP. When SP exceeds a certain value, H begins to decrease.
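The space-utilization analysis for page size can be checked numerically. The sketch below takes the per-segment overhead as S = SP/2 + SS/SP and the utilization as u = SS/(SS + S), then compares the optimum against neighboring page sizes; the segment size of 5000 words is a hypothetical value.

```python
import math


def overhead(ss: float, sp: float) -> float:
    """Per-segment overhead S = SP/2 + SS/SP (wasted half page plus page table)."""
    return sp / 2 + ss / sp


def utilization(ss: float, sp: float) -> float:
    """Space utilization u = SS / (SS + S)."""
    return ss / (ss + overhead(ss, sp))


def optimum_page_size(ss: float) -> float:
    """Page size minimizing S, from dS/dSP = 0."""
    return math.sqrt(2 * ss)


ss = 5000                                  # hypothetical average segment size in words
sp_opt = optimum_page_size(ss)
print(sp_opt)                              # 100.0
print(f"{utilization(ss, sp_opt):.4f}")    # 0.9804
print(overhead(ss, 50), overhead(ss, 100), overhead(ss, 200))   # 125.0 100.0 125.0
```

Halving or doubling the page size around the optimum raises the overhead from 100 to 125 words per segment, which suggests the minimum is fairly shallow.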

Memory Allocation
The placement of information blocks in the memory system is called memory allocation.
Demand swapping
Anticipatory swapping
Thrashing
For a two-level memory system, the memory map contains:
  An occupied-memory list for M1
  An available-space list for M1
  A directory for M2

Non-preemptive allocation
Does not overwrite or move existing blocks to make room for incoming blocks.
Algorithm for paged segments.
Algorithms for unpaged segments:
  First fit
  Best fit
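First fit and best fit can be sketched over an available-space list of (start, length) regions. The list layout and the function names are assumptions for illustration; a real allocator would also split the chosen region and update the list.

```python
from typing import Optional

FreeList = list[tuple[int, int]]   # assumed layout: (start address, length in words)


def first_fit(free_regions: FreeList, size: int) -> Optional[int]:
    """Index of the first region large enough for the block, or None."""
    for index, (_start, length) in enumerate(free_regions):
        if length >= size:
            return index
    return None


def best_fit(free_regions: FreeList, size: int) -> Optional[int]:
    """Index of the smallest region large enough for the block, or None."""
    candidates = [(length, index)
                  for index, (_start, length) in enumerate(free_regions)
                  if length >= size]
    return min(candidates)[1] if candidates else None


free_regions = [(0, 300), (500, 120), (900, 150)]   # hypothetical available-space list
print(first_fit(free_regions, 100))   # 0
print(best_fit(free_regions, 100))    # 1
print(first_fit(free_regions, 400))   # None
```

First fit stops at the 300-word region, while best fit scans the whole list and picks the 120-word region, leaving the larger region intact for later requests.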

Preemptive allocation
Reallocation is done in two ways:
  Relocate a block to a different position within M1
  Deallocate a block from the M1 memory using a replacement policy
Dirty blocks, clean blocks

Relocation by compaction
Replacement policies to achieve maximum H

The optimal replacement policy selects for replacement the block that is least likely to be referenced in the near future.
Address trace
Two policies:
  FIFO
  LRU
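The two policies can be compared by replaying an address trace and counting hits. This is a minimal sketch: the trace, the capacity of three blocks, and the function names are hypothetical choices for the example.

```python
from collections import OrderedDict, deque


def fifo_hits(trace: list[int], capacity: int) -> int:
    """Count hits when the block resident longest is replaced (FIFO)."""
    resident: set[int] = set()
    arrival_order: deque[int] = deque()
    hits = 0
    for block in trace:
        if block in resident:
            hits += 1
            continue
        if len(resident) == capacity:
            resident.discard(arrival_order.popleft())   # evict the oldest arrival
        resident.add(block)
        arrival_order.append(block)
    return hits


def lru_hits(trace: list[int], capacity: int) -> int:
    """Count hits when the least recently used block is replaced (LRU)."""
    resident: OrderedDict[int, bool] = OrderedDict()
    hits = 0
    for block in trace:
        if block in resident:
            hits += 1
            resident.move_to_end(block)                 # mark as most recently used
            continue
        if len(resident) == capacity:
            resident.popitem(last=False)                # evict the least recently used
        resident[block] = True
    return hits


trace = [2, 3, 2, 1, 5, 2, 4, 5, 3, 2, 5, 2]            # hypothetical address trace
print(fifo_hits(trace, 3), lru_hits(trace, 3))          # 3 5
```

On this trace the hit ratio H is 3/12 for FIFO and 5/12 for LRU; other traces can rank the two policies differently.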
