Ramon Canal CTD – Master CANS
Slides based on: Introduction to CMOS VLSI Design, D. Harris
Outline
• Memory Arrays
• SRAM Architecture
  – SRAM Cell
  – Decoders
  – Column Circuitry
  – Multiple Ports
• Serial Access Memories

Memory Arrays
• Random Access Memory
  – Read/Write Memory (RAM) (Volatile)
    Static RAM (SRAM)
    Dynamic RAM (DRAM)
  – Read Only Memory (ROM) (Nonvolatile)
    Programmable ROM (PROM)
    Erasable Programmable ROM (EPROM)
    Electrically Erasable Programmable ROM (EEPROM)
• Serial Access Memory
  – Serial In Parallel Out (SIPO)
  – Parallel In Serial Out (PISO)
  – Queues
    First In First Out (FIFO)
    Last In First Out (LIFO)
• Content Addressable Memory (CAM)
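
Array Architecture
• 2^n words of 2^m bits each
• If n >> m, fold by 2^k into fewer rows of more columns
[Figure: array organization. The row decoder drives the wordlines; bitline conditioning sits on the bitlines above the memory cells (2^(n-k) rows x 2^(m+k) columns); the column decoder takes k address bits; the column circuitry delivers 2^m bits.]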
• Good regularity – easy to design • Very high density if good cells are used
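
The folding rule (2^n words of 2^m bits, folded by 2^k) can be checked numerically. The following is a minimal sketch; the function name `fold_array` is an illustrative choice, not something from the slides.

```python
def fold_array(n: int, m: int, k: int) -> tuple[int, int]:
    """Return (rows, columns) for 2**n words of 2**m bits folded by 2**k."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    rows = 2 ** (n - k)       # fewer rows ...
    columns = 2 ** (m + k)    # ... of more columns
    return rows, columns


# 2 kwords x 16 bits (n = 11, m = 4) folded by 2**3 = 8
print(fold_array(11, 4, 3))  # (256, 128)
```

With k = 0 the array stays unfolded at 2048 rows x 16 columns, which is the tall, skinny aspect ratio that folding avoids.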

12T SRAM Cell
• Basic building block: SRAM Cell
  – Holds one bit of information, like a latch
  – Must be read and written
• 12-transistor (12T) SRAM cell
  – Use a simple latch connected to bitline
  – 46 x 75 λ unit cell
[Figure: 12T cell schematic with signals bit, write, write_b, read, read_b.]

6T SRAM Cell
• Cell size accounts for most of array size
  – Reduce cell size at expense of complexity
• 6T SRAM Cell
  – Used in most commercial chips
  – Data stored in cross-coupled inverters
• Read:
  – Precharge bit, bit_b
  – Raise wordline
• Write:
  – Drive data onto bit, bit_b
  – Raise wordline
[Figure: 6T cell schematic with word, bit and bit_b.]

SRAM Read
• Precharge both bitlines high
• Then turn on wordline
• One of the two bitlines will be pulled down by the cell
• Ex: A = 0, A_b = 1
  – bit discharges, bit_b stays high
  – But A bumps up slightly
• Read stability
  – A must not flip
  – N1 >> N2
[Figure: 6T cell schematic (P1, P2, N1 to N4, nodes A and A_b) and simulated read waveforms of word, bit, bit_b, A and A_b; voltage axis 0.0 to 1.5, time axis 0 to 600 ps.]

SRAM Write
• Drive one bitline high, the other low
• Then turn on wordline
• Bitlines overpower cell with new value
• Ex: A = 0, A_b = 1, bit = 1, bit_b = 0
  – Force A_b low, then A rises high
• Writability
  – Must overpower feedback inverter
  – N2 >> P1
[Figure: 6T cell schematic (P1, P2, N1 to N4, nodes A and A_b) and simulated write waveforms of word, bit_b, A and A_b; voltage axis 0.0 to 1.5, time axis 0 to 700 ps.]

SRAM Sizing
• High bitlines must not overpower inverters during reads
• But low bitlines must write new value into cell
[Figure: 6T cell annotated with relative transistor strengths: weak pull-ups, medium access transistors, strong pull-downs.]

SRAM Column Example
[Figure: a read column and a write column. Each has bitline conditioning clocked by φ2, SRAM cells on word_q1 (with "More Cells" above) and bitlines bit_v1f / bit_b_v1f. The read column produces out_v1r / out_b_v1r; the write column is driven by write_q1 and data_s1.]

SRAM Layout
• Cell size is critical: 26 x 45 λ (even smaller in industry)
• Tile cells sharing VDD, GND, bitline contacts
[Figure: cell layout showing the cell boundary and the VDD, GND, BIT, BIT_B and WORD lines.]

Periphery
• Decoders
• Sense Amplifiers
• Input/Output Buffers
• Control / Timing Circuitry

Decoders
• n:2^n decoder consists of 2^n n-input AND gates
  – One needed for each row of memory
  – Build AND from NAND or NOR gates
[Figure: 2:4 decoders in static CMOS and in pseudo-nMOS, with inputs A1, A0, outputs word0 to word3, and transistor widths annotated.]
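
A behavioral sketch of an n:2^n decoder, with one n-input AND gate per wordline, may make the structure concrete. The function name `decode` and the LSB-first bit ordering are assumptions made for this example.

```python
def decode(address):
    """n:2**n decoder: one n-input AND gate per wordline.

    address[0] is A0 (least significant bit). Returns the 2**n wordlines.
    """
    n = len(address)
    wordlines = []
    for row in range(2 ** n):
        # The AND gate for this row sees A_i where bit i of the row number
        # is 1, and the complement of A_i where it is 0.
        literals = [
            address[i] if (row >> i) & 1 else 1 - address[i]
            for i in range(n)
        ]
        wordlines.append(int(all(literals)))
    return wordlines


print(decode([0, 1]))  # A0 = 0, A1 = 1 selects word2: [0, 0, 1, 0]
```

Exactly one wordline is high for any address, which is what lets each row of the array be enabled on its own.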

Decoder Layout
• Decoders must be pitch-matched to SRAM cell
  – Requires very skinny gates
[Figure: decoder layout with a NAND gate and a buffer inverter between VDD and GND; true and complement address lines A3 to A0 run vertically and the output is word.]

Large Decoders
• For n > 4, NAND gates become slow
  – Break large gates into multiple smaller gates
[Figure: 4:16 decoder with inputs A3 to A0 and outputs word0 to word15.]

Predecoding
• Many of these gates are redundant
  – Factor out common gates into predecoder
  – Saves area
  – Same path effort
[Figure: 4:16 decoder in which predecoders turn A3 to A0 into 1-of-4 hot predecoded lines that feed the gates for word0 to word15.]
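
Predecoding can be sketched behaviorally for the 4:16 case: two 2:4 predecoders produce 1-of-4 hot lines, and each wordline needs only a 2-input AND of one line from each group. The helper names `one_hot` and `predecoded_wordlines` are hypothetical, chosen for this example.

```python
def one_hot(bits):
    """Decode a list of bits (LSB first) into 2**len(bits) one-hot lines."""
    value = sum(b << i for i, b in enumerate(bits))
    return [int(i == value) for i in range(2 ** len(bits))]


def predecoded_wordlines(a0, a1, a2, a3):
    """4:16 decoder built from two 2:4 predecoders and 16 two-input ANDs."""
    low = one_hot([a0, a1])    # 1-of-4 hot lines from A1, A0
    high = one_hot([a2, a3])   # 1-of-4 hot lines from A3, A2
    # Wordline `row` ANDs the high line for row // 4 with the low line
    # for row % 4.
    return [high[row >> 2] & low[row & 3] for row in range(16)]


print(predecoded_wordlines(1, 0, 1, 0).index(1))  # address 0b0101 -> word 5
```

The result matches a flat 4-input decode for every address, while the shared predecoder gates are built only once.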

Sense Amplifiers
$$t_p = \frac{C \cdot \Delta V}{I_{av}}$$
• C is large and I_av is small, so make ΔV as small as possible
• Idea: use a sense amplifier
[Figure: a small-transition input enters the sense amplifier (s.a.), which drives the output.]

Sense Amplifiers
• Bitlines have many cells attached
  – Ex: 32-kbit SRAM has 128 rows x 256 cols
  – 128 cells on each bitline
• tpd ∝ (C/I) ΔV
  – Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
  – Discharged slowly through small transistors (small I)
• Sense amplifiers are triggered on small voltage swing (reduce ΔV)
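
The benefit of sensing a small swing follows directly from tpd ∝ (C/I) ΔV. The sketch below plugs in numbers; every value (2 fF per contact, 100 µA cell current, 1.8 V and 0.2 V swings) is an illustrative assumption, not data from the slides.

```python
def bitline_delay(capacitance_f, swing_v, current_a):
    """Estimate t_pd = C * dV / I in seconds."""
    return capacitance_f * swing_v / current_a


# All numbers below are illustrative assumptions, not values from the slides.
C_BITLINE = 64 * 2e-15   # 64 shared contacts at an assumed 2 fF each
I_CELL = 100e-6          # assumed cell read current of 100 uA

full_swing = bitline_delay(C_BITLINE, 1.8, I_CELL)   # wait for a full 1.8 V
small_swing = bitline_delay(C_BITLINE, 0.2, I_CELL)  # sense after 200 mV

print(f"full swing:  {full_swing * 1e9:.2f} ns")
print(f"small swing: {small_swing * 1e9:.2f} ns")
```

Because C and I are fixed by the array, the delay shrinks in proportion to ΔV: here the 9x smaller swing gives a 9x shorter bitline delay.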

Differential Pair Amp
• Differential pair requires no clock
• But always dissipates static power
[Figure: differential pair with transistors P1, P2, N1, N2, N3, inputs bit and bit_b, outputs sense and sense_b.]

Clocked Sense Amp
• Clocked sense amp saves power
• Requires sense_clk after enough bitline swing
• Isolation transistors cut off large bitline capacitance
[Figure: clocked sense amplifier with bit and bit_b entering through isolation transistors, regenerative feedback enabled by sense_clk, outputs sense and sense_b.]

Column Circuitry
• Some circuitry is required for each column
  – Bitline conditioning
  – Column multiplexing

Bitline Conditioning
• Precharge bitlines high before reads
• Equalize bitlines to minimize voltage difference when using sense amplifiers
[Figure: precharge and equalization circuits, each clocked by φ and connected to bit and bit_b.]

Twisted Bitlines
• Sense amplifiers also amplify noise
  – Coupling noise is severe in modern processes
  – Try to couple equally onto bit and bit_b
  – Done by twisting bitlines
[Figure: twisted bitline pairs b0/b0_b, b1/b1_b, b2/b2_b, b3/b3_b.]

Column Multiplexing
• Recall that array may be folded for good aspect ratio
• Ex: 2 kword x 16 folded into 256 rows x 128 columns
  – Must select 16 output bits from the 128 columns
  – Requires 16 8:1 column multiplexers
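
The 2 kword x 16 example can be sketched as address arithmetic. Two things here are assumptions of the sketch rather than facts from the slides: the low 3 address bits drive the column select, and bit b of each word is stored at column b * 8 + column_select (an interleaved layout).

```python
ROWS, COLUMNS, WORD_BITS = 256, 128, 16
WORDS_PER_ROW = COLUMNS // WORD_BITS  # 8, so each output bit needs an 8:1 mux


def split_address(word_address):
    """Split an 11-bit word address into (row, column_select)."""
    return word_address // WORDS_PER_ROW, word_address % WORDS_PER_ROW


def read_word(array, word_address):
    """Select 16 of the 128 columns in the addressed row.

    Assumes bit b of each word sits at column b * 8 + column_select.
    """
    row, col_sel = split_address(word_address)
    return [array[row][b * WORDS_PER_ROW + col_sel] for b in range(WORD_BITS)]


# Fill each cell with its own (row, column) so the selection is visible.
array = [[(r, c) for c in range(COLUMNS)] for r in range(ROWS)]
print(split_address(9))          # (1, 1)
print(read_word(array, 9)[:3])   # [(1, 1), (1, 9), (1, 17)]
```

Each of the 16 list elements corresponds to one 8:1 column multiplexer picking its bit out of a group of 8 adjacent columns.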

Tree Decoder Mux
• Column mux can use pass transistors
  – Use nMOS only, precharge outputs
• One design is to use k series transistors for 2^k:1 mux
  – No external decoder logic needed
[Figure: 8:1 tree mux. Inputs B0 to B7 pass through series transistors gated by the true and complement forms of A0, A1, A2 to the output Y, which goes to the sense amps and write circuits.]
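
A behavioral sketch of the tree mux: each of the k address bits halves the set of candidate lines, so the selected input reaches Y through k series pass transistors and no separate decoder is needed. The function name `tree_mux` and the LSB-first select ordering are assumptions of this example.

```python
def tree_mux(inputs, select):
    """Select one of 2**k inputs using k levels of pass transistors.

    select[0] is A0 (LSB). Each level keeps the half of the remaining lines
    whose series transistor is turned on by A_i or its complement.
    """
    k = len(select)
    if len(inputs) != 2 ** k:
        raise ValueError("need exactly 2**k inputs")
    lines = list(inputs)
    for a in select:
        # Pairs (B0, B1), (B2, B3), ...: A_i low passes the even line,
        # A_i high passes the odd line.
        lines = [lines[i + a] for i in range(0, len(lines), 2)]
    return lines[0]


columns = ["B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7"]
print(tree_mux(columns, [1, 0, 1]))  # A0 = 1, A1 = 0, A2 = 1 -> B5
```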

Single Pass-Gate Mux
• Or eliminate series transistors with separate decoder
[Figure: 4:1 mux in which a decoder on A1, A0 enables one pass gate from B0 to B3 onto Y.]
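
Ex: 2-way Muxed SRAM
[Figure: two SRAM columns (cells on word_q1, with "More Cells" above), each with φ2 bitline conditioning, selected by A0 and its complement; writes use write0_q1 and write1_q1 with shared data_v1.]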

Multiple Ports
• We have considered single-ported SRAM
  – One read or one write on each cycle
• Multiported SRAMs are needed for register files
• Examples:
  – Multicycle MIPS must read two sources or write a result on some cycles
  – Pipelined MIPS must read two sources and write a third result each cycle
  – Superscalar MIPS must read and write many sources and results each cycle

Memory configurations
• Multiported memories
• CAM Memories
• Serial Access, Queues

Dual-Ported SRAM
• Simple dual-ported SRAM
  – Two independent single-ended reads
  – Or one differential write
• Do two reads and one write by time multiplexing
  – Read during ph1, write during ph2
[Figure: cell with wordlines wordA and wordB and bitlines bit and bit_b.]

Multi-Ported SRAM
• Adding more access transistors hurts read stability
• Multiported SRAM isolates reads from state node
• Single-ended design minimizes number of bitlines
[Figure: multi-ported cell with bitlines bA to bG and wordlines wordA to wordG, split between write circuits and read circuits.]

Contents-Addressable Memory
[Figure: CAM block diagram. I/O buffers carry data (64 bits) and commands; the comparand and mask feed the CAM array (2^9 words x 64 bits); control logic handles R/W; a 9-bit address goes through the address decoder; validity bits and a priority encoder sit at the array outputs.]
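
The roles of the comparand, mask, validity bits and priority encoder can be sketched behaviorally. The mask polarity (set bits are compared) and the priority rule (lowest address wins) are assumptions of this sketch; the slides do not specify them.

```python
def cam_search(words, valid, comparand, mask):
    """Return the lowest matching address, or None if nothing matches.

    Bits set in `mask` take part in the comparison; other bits are ignored
    (this mask polarity is an assumption of the sketch).
    """
    matches = [
        v and ((w ^ comparand) & mask) == 0
        for w, v in zip(words, valid)
    ]
    # Priority encoder: report the first (lowest-address) match.
    for address, hit in enumerate(matches):
        if hit:
            return address
    return None


words = [0x12, 0x34, 0x14]
print(cam_search(words, [True, True, True], 0x34, 0xFF))   # 1
print(cam_search(words, [True, False, True], 0x04, 0x0F))  # 2
```

In the second search, entries 1 and 2 both match on the low nibble, but entry 1 is marked invalid, so the priority encoder reports address 2.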

Serial Access Memories
• Serial access memories do not use an address
  – Shift Registers
  – Tapped Delay Lines
  – Serial In Parallel Out (SIPO)
  – Parallel In Serial Out (PISO)
  – Queues (FIFO, LIFO)

Shift Register
• Shift registers store and delay data
• Simple design: cascade of registers
  – Watch your hold times!
[Figure: cascade of registers clocked by clk, with an 8-bit path from Din to Dout.]

Denser Shift Registers
• Flip-flops aren't very area-efficient
• For large shift registers, keep data in SRAM instead
• Move read/write pointers to RAM rather than data
  – Initialize read address to first entry, write to last
  – Increment address on each cycle
[Figure: dual-ported SRAM addressed by two counters (readaddr, writeaddr) clocked by clk; reset loads the counters with 00...00 and 11...11; data enters at Din and leaves at Dout.]
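
The pointer scheme can be sketched as a small behavioral model: the data stays put in the RAM while a read counter and a write counter advance every cycle. The class name `RamShiftRegister`, the zero-initialized RAM and the read-before-write ordering within a cycle are assumptions of this sketch.

```python
class RamShiftRegister:
    """Shift register kept in a RAM of `entries` words with moving pointers."""

    def __init__(self, entries):
        self.ram = [0] * entries
        self.read_addr = 0              # initialized to the first entry
        self.write_addr = entries - 1   # initialized to the last entry

    def clock(self, din):
        """One clock cycle: read at read_addr, then write at write_addr."""
        dout = self.ram[self.read_addr]
        self.ram[self.write_addr] = din
        # Both counters increment on each cycle and wrap around.
        self.read_addr = (self.read_addr + 1) % len(self.ram)
        self.write_addr = (self.write_addr + 1) % len(self.ram)
        return dout


sr = RamShiftRegister(4)
print([sr.clock(d) for d in [1, 2, 3, 4, 5, 6]])  # [0, 0, 0, 1, 2, 3]
```

With these assumptions a 4-entry RAM delays the data by 3 cycles; the delay is set by the spacing between the two pointers, not by moving any stored word.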

Tapped Delay Line
• A tapped delay line is a shift register with a programmable number of stages
• Set number of stages with delay controls to mux
  – Ex: 0 to 63 stages of delay
[Figure: Din passes through shift-register blocks SR32, SR16, SR8, SR4, SR2, SR1 to Dout, all clocked by clk; muxes controlled by delay5 to delay0 either include or bypass each block.]
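
A behavioral sketch of the tapped delay line: six shift-register blocks of 32, 16, 8, 4, 2 and 1 stages, each followed by a mux that either takes the block output or bypasses it. The class name `TappedDelayLine` and the zero-initialized registers are assumptions of this example.

```python
from collections import deque


class TappedDelayLine:
    """Cascade of SR32, SR16, ..., SR1 blocks, each bypassable by a mux."""

    BLOCKS = (32, 16, 8, 4, 2, 1)

    def __init__(self):
        # One fixed-length shift register per block, initially all zeros.
        self.blocks = [deque([0] * n, maxlen=n) for n in self.BLOCKS]

    def clock(self, din, delay):
        """Advance one cycle; `delay` (0..63) is the 6-bit delay control."""
        signal = din
        for n, sr in zip(self.BLOCKS, self.blocks):
            shifted_out = sr[0]     # oldest entry leaves the register
            sr.append(signal)       # every block shifts on every clock
            if delay & n:           # mux selects the delayed path
                signal = shifted_out
        return signal


line = TappedDelayLine()
print([line.clock(d, delay=5) for d in range(1, 11)])
# [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
```

A delay setting of 5 enables SR4 and SR1 only, so each input appears at the output 5 cycles later; the binary weighting of the blocks is what gives every value from 0 to 63.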

Serial In Parallel Out
• 1-bit shift register reads in serial data
  – After N steps, presents N-bit parallel output
[Figure: shift register clocked by clk with serial input Sin and parallel outputs P0 to P3.]

Parallel In Serial Out
• Load all N bits in parallel when shift = 0
  – Then shift one bit out per cycle
[Figure: shift register clocked by clk with parallel inputs P0 to P3, a shift/load control and serial output Sout.]
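
Both converters can be sketched as small behavioral models. The class names, the choice that P0 is the stage nearest Sin, that Sout is taken from the last stage, and that zeros are shifted into the PISO are all assumptions of this example.

```python
class SIPO:
    """Serial in, parallel out: shift one bit in per clock."""

    def __init__(self, n):
        self.bits = [0] * n          # bits[0] is P0, the stage nearest Sin

    def clock(self, sin):
        self.bits = [sin] + self.bits[:-1]
        return list(self.bits)       # parallel outputs P0..P(N-1)


class PISO:
    """Parallel in, serial out: load when shift == 0, otherwise shift."""

    def __init__(self, n):
        self.bits = [0] * n

    def clock(self, shift, parallel=None):
        if shift == 0:
            self.bits = list(parallel)          # load all N bits at once
        else:
            self.bits = [0] + self.bits[:-1]    # move one stage toward Sout
        return self.bits[-1]                    # Sout is the last stage


piso = PISO(4)
serial = [piso.clock(0, [1, 0, 1, 1])] + [piso.clock(1) for _ in range(3)]
print(serial)  # P3 first, then P2, P1, P0: [1, 1, 0, 1]
```

Feeding that serial stream into a 4-bit `SIPO` takes N = 4 clocks before all four parallel outputs hold received data.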

Queues
• Queues allow data to be read and written at different rates
• Read and write each use their own clock and data
• Queue indicates whether it is full or empty
• Build with SRAM and read/write counters (pointers)
[Figure: Queue block with WriteClk, WriteData and FULL on the write side, and ReadClk, ReadData and EMPTY on the read side.]

FIFO, LIFO Queues
• First In First Out (FIFO)
  – Initialize read and write pointers to first element
  – Queue is EMPTY
  – On write, increment write pointer
  – If write almost catches read, Queue is FULL
  – On read, increment read pointer
• Last In First Out (LIFO)
  – Also called a stack
  – Use a single stack pointer for read and write
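
The pointer rules for both queue types can be sketched in software. This is a single-clock behavioral model; the occupancy counter used to derive FULL and EMPTY is a simplification assumed for the sketch (it lets every RAM entry be used), and the class names are hypothetical.

```python
class Fifo:
    """FIFO built from a RAM plus read and write pointers."""

    def __init__(self, depth):
        self.ram = [None] * depth
        self.read_ptr = 0       # both pointers start at the first element
        self.write_ptr = 0
        self.count = 0          # occupancy counter (sketch assumption)

    @property
    def empty(self):
        return self.count == 0

    @property
    def full(self):
        return self.count == len(self.ram)

    def write(self, data):
        if self.full:
            raise OverflowError("queue is FULL")
        self.ram[self.write_ptr] = data
        self.write_ptr = (self.write_ptr + 1) % len(self.ram)
        self.count += 1

    def read(self):
        if self.empty:
            raise IndexError("queue is EMPTY")
        data = self.ram[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % len(self.ram)
        self.count -= 1
        return data


class Lifo:
    """Stack: a single pointer serves both push and pop."""

    def __init__(self, depth):
        self.ram = [None] * depth
        self.sp = 0             # index of the next free entry

    def push(self, data):
        if self.sp == len(self.ram):
            raise OverflowError("stack is full")
        self.ram[self.sp] = data
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            raise IndexError("stack is empty")
        self.sp -= 1
        return self.ram[self.sp]


fifo, lifo = Fifo(4), Lifo(4)
for item in "abc":
    fifo.write(item)
    lifo.push(item)
print(fifo.read(), lifo.pop())  # a c
```

The same three writes come back oldest-first from the FIFO and newest-first from the LIFO, which is the only behavioral difference between the two pointer schemes.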

Other considerations
• Leakage control
• Redundancy
• Flash Memories

Suppressing Leakage in SRAM
[Figure: two schemes. Inserting extra resistance: low-threshold transistors controlled by sleep sit between VDD and the internal supply VDD,int, and between VSS and VSS,int, of the SRAM cells. Reducing the supply voltage: sleep switches the SRAM cell supply between VDD and a lower VDDL.]

Redundancy
[Figure: memory array with redundant rows and redundant columns; the row address enters the row decoder, the column address enters the column decoder, and a fuse bank selects the spare elements.]

Error-Correcting Codes
• Example: Hamming codes
  – e.g. B3 wrong: the check bits read 1 1 0 = 3 (least significant bit first), pointing at bit 3
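
The Hamming-code example (a wrong B3 flagged by the value 3) can be reproduced with a small sketch. The slides do not state the code size, so the Hamming(7,4) layout below, with parity bits at positions 1, 2 and 4, is an assumption, as are the function names.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit codeword, positions 1..7.

    Parity bits sit at positions 1, 2, 4; data bits at 3, 5, 6, 7.
    """
    code = [0] * 8                      # index 0 unused
    code[3], code[5], code[6], code[7] = d
    for p in (1, 2, 4):
        # Parity bit p covers every position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    return code[1:]


def syndrome(codeword):
    """Return the position (1..7) of a single-bit error, or 0 if none."""
    code = [0] + list(codeword)
    s = 0
    for p in (1, 2, 4):
        if sum(code[i] for i in range(1, 8) if i & p) % 2:
            s |= p
    return s


word = hamming74_encode([1, 0, 1, 1])
print(syndrome(word))   # 0: no error
word[2] ^= 1            # corrupt position 3 (B3)
print(syndrome(word))   # 3: checks 1 and 2 fail, check 4 passes
```

The failing checks, read least significant first, are 1 1 0, which is the binary position of the faulty bit; flipping that bit back corrects the word.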

Redundancy and Error Correction

Flash EEPROM
[Figure: cell cross-section with control gate, floating gate and thin tunneling oxide over a p-substrate with n+ source and n+ drain; arrows mark programming and erasure.]
Many other options …

Cross-sections of NVM cells
[Figure: cross-sections of a Flash cell and an EPROM cell. Courtesy Intel.]

Basic Operations in a NOR Flash Memory: Erase
[Figure: a single cell (G, S, D) and a 2 x 2 array (WL0, WL1, BL0, BL1) under erase bias; the labeled levels are 0 V, 12 V and open.]

Basic Operations in a NOR Flash Memory: Write
[Figure: a single cell (G, S, D) and a 2 x 2 array (WL0, WL1, BL0, BL1) under write bias; the labeled levels are 12 V, 6 V and 0 V.]

Basic Operations in a NOR Flash Memory: Read
[Figure: a single cell (G, S, D) and a 2 x 2 array (WL0, WL1, BL0, BL1) under read bias; the labeled levels are 5 V, 1 V and 0 V.]

Conclusions
Memory structure:
• Row decoder: address bits A_K to A_(L-1) select one of 2^(L-K) word lines
• Array: storage cells at the word line / bit line crossings, M·2^K columns
• Sense amplifiers / drivers: amplify swing to rail-to-rail amplitude
• Column decoder: address bits A_0 to A_(K-1) select the appropriate word
• Input-Output (M bits)