Lecture 8 SRAM 2021

Digital Integrated Circuits
(83-313)
Lecture 8:
SRAM
Prof. Adam Teman
2 May 2021
Disclaimer: This course was prepared, in its entirety, by Adam Teman. Many materials were copied from sources freely available on the internet. When possible, these sources have been cited;
however, some references may have been cited incorrectly or overlooked. If you feel that a picture, graph, or code example has been copied from you and either needs to be cited or removed,
please feel free to email adam.teman@biu.ac.il and I will address this as soon as possible.
Lecture Content
2
© Adam Teman,
May 2, 2021
First Look at Memory
3
Why Memory?
Source: Intel
Intel Pentium-M (2001) – 2MB L3 Cache
Source: wccftech.com
Cerebras Wafer Scale Engine 2 (2021) –

Source: Intel 16GB On-Chip Memory
Intel 10th Gen “Comet Lake” (2020) – 20MB L3 Cache © Adam Teman,
May 2, 2021
Memory Hierarchy of a Personal Computer
5 Source: Pavlov, Sachdev, 2008

© Adam Teman,
May 2, 2021
Semiconductor Memory Classification
• Size:
Memory Arrays
• Bits, Bytes, Words
Random Access Memory Serial Access Memory Content Addressable Memory

• Timing Parameters:
(CAM) • Read access, write access, cycle time
Read/Write Memory Read Only Memory
Shift Registers Queues
• Function:
(RAM) (ROM)
(Volatile) (Nonvolatile) • Read Only (ROM) – non-volatile
• Read-Write (RWM) – volatile
Serial In Parallel In First In Last In
Static RAM Dynamic RAM Parallel Out Serial Out First Out First Out • NVRWM – Non-volatile Read Write
• Access Pattern:
(SRAM) (DRAM) (SIPO) (PISO) (FIFO) (LIFO)
• Random Access, FIFO, LIFO,

Electrically
Shift Register, CAM
Mask ROM Programmable Erasable Flash ROM
ROM
(PROM)
Programmable
ROM
Erasable
Programmable
• I/O Architecture:
(EPROM) ROM • Single Port, Multi-port
(EEPROM)
• Application:
• Embedded, External, Secondary
6
© Adam Teman,
May 2, 2021
Random Access Chip Architecture
• Conceptual: linear array
• Each box holds some data
• But this leads to a long and skinny shape
• Let’s say we want to make a 1MB memory:

• 1MB=220 words X 8 bits=223 bits, each word in a separate row
• A decoder would reduce the number of access pins from
220 access pins to 20 address lines.
• We’d fit the pitch of the decoder to the word cells,
so we’d have Word Lines with no area overhead.
• The output lines (=bit lines) would be extremely long,
as would the delay of the huge decoder.
• The array’s height is about 128,000 times larger than its width (220/23).
7
© Adam Teman,
May 2, 2021
Square Ratio
• Instead, let’s make the array square:
• 1MB=223 bits=212 rows X 211 columns.
• There are 4000 rows, so we need a
12-bit row address decoder
(to select a single row)
• There are 2000 columns,
representing 256 8-bit words.
• We need to select only one of the
256 words through a
column address decoder (or multiplexer).
• We call the row lines “Word Lines”
and the column lines “Bit Lines”.
8
© Adam Teman,
May 2, 2021
Special Considerations
• The “core” of the memory array is huge.
It can sometimes take up most of the chip area.
• For this reason, we will try to make the “bitcell” as small as possible.
• A standard Flip Flop uses at least 10 transistors per bit
(usually more than 20). This is very area consuming.
• We will trade-off area for other circuit properties:
• Noise Margins
• Logic Swing
• Speed
• Design Rules
• This requires special peripheral circuitry.
9
© Adam Teman,
May 2, 2021
Memory Architecture
Storage Cell
Bit Line Memory Size: W Words of C bits
=W x C bits
Address bus: A bits
ADDA-1 : ADDM
→W=2A
Row Decoder
Word Line
Number of Words in a Row: 2M

Multiplexing Factor: M
Number of Rows: 2A-M

Number of Columns: C x 2M
C×2M
Sense Amplifiers /Drivers
Row Decoder: A-M → 2A-M
ADDM-1 :
Column Decoder Column Decoder: M → 2M
ADD0
Input/Output
10 (C bits) © Adam Teman,
May 2, 2021
The 6T SRAM Bitcell
11
Basic Static Memory Element
Q Q
12
© Adam Teman,
May 2, 2021
Positive Feedback: Bi-Stability
© Adam Teman,
May 2, 2021
Writing into a Cross-Coupled Pair
• The write operation is ratioed
• The access transistor must overcome the feedback.
En
D Q Q
15
© Adam Teman,
May 2, 2021
How should we write a ‘1’
Option 1: nMOS Access Transistor Option 2: pMOS Access Transistor
Passes a “weak ‘1’”, bad at pulling Passes a “weak ‘0’”, bad at pulling
up against the feedback down against the feedback
Option 3: Transmission Gate Solution: Differential nMOS Write
Writes well, but how do we read?

16
© Adam Teman,
May 2, 2021
Reading from a 6T SRAM Cell
© Adam Teman,
May 2, 2021
6-transistor CMOS SRAM Cell
BL BLB
WL WL
M3 M6
M2 M5
Q QB
M1 M4
18
© Adam Teman,
May 2, 2021
The Computer Hall of Fame
• The machine that introduced the GUI, the mouse,
and Steve Jobs to the mainstream.
• The personal computer that “was designed Source: macworld.co.uk
so easy to use that people could actually use it”.

• Developed based on a 3-day tour of Apple
at Xerox PARC, where they saw the Xerox Alto.
• Introduced in 1984, sold for $2500 with an
8MHz Motorola 6800 processor, 128kB RAM.
• Included MacPaint and MacWrite
• Despite initial success, sales declined and
Steve Jobs was fired from Apple in 1985.
Source: macworld.co.uk
© Adam Teman,
May 2, 2021
6T SRAM Operation
21
SRAM Operation: HOLD
BL BLB
WL WL
M3 M6
M2 M5
Q QB
M1 M4
22
© Adam Teman,
May 2, 2021
SRAM Operation: READ
BL BLB
WL 0 VDD WL
M3 M6
M2 VDD M5
0 BLB
Q QB
QB M1
0 VDD
M4
WL
M5
M3
QB=ΔV
Q=VDD
Q
M4
WL M2
Left Side: Right Side:
BL “nMOS” inverter –
Nothing Changes…
QB voltage rises
23
© Adam Teman,
May 2, 2021
SRAM Operation - Read
BLB
BL
BLB Cell Ratio:
WL M3 M6 WL WL
M5 W4
L4
VDD M2 M5 VDD QB=ΔV CR 
W5
Q=‘1’ M1 Q
M4 L5
CBLB
M4 QB=‘0’
CBL
 VDSat,n 
2
 V 2 
kM5 (VDD − V − VT,n )VDSat,n −  = kM4 (VDD − VT,n ) V − 
 2   2 
VDSat,n + CR (VDD − VT,n ) − V (1 + CR ) + CR (VDD − VT,n )

2 2 2
DSat,n
V =
CR
24
© Adam Teman,
May 2, 2021
Cell Ratio (Read Constraint)
BLB
W4 WL
M5
L4
CR  QB=ΔV
W5
L5 M4
Q
So we need the pull

down transistor to be
much stronger than the
access transistor…
25
© Adam Teman,
May 2, 2021
SRAM Operation: WRITE
BL BLB
WL VDD 0 WL
M3 M6
VDD M2 0 VDD M5 0
Q QB
BL M1 M4
VDD 0
Q
WL
M2 M6
Q=ΔV
QB=VOLmin
M1 WL M5
QB Left Side: Right Side:
BLB
Same as during read – Pseudo nMOS
designed so ΔV<VM inverter!
26
© Adam Teman,
May 2, 2021
SRAM Operation - Write Pull-Up Ratio
W6
BLB
BL L6
PR 
Q W5
M6 L5
WL M3 M6 WL
M2 M5
QB=VOLmin
‘0’
Q=‘0’ M1 WL M5
M4 QB=‘1’
VDD BLB
 2
   2

( )
V
 = kM5 (VDD − VT,n )VQB − 2 
V QB
kM6  VDD − VT,p VDSat,p − DSat,p
 2   
p  2

( )
VDSat,p
(V − VT,n )
2
VQB = VDD − VT,n − − 2 PR  VDD − VT,p VDSat,p − 
DD
n  2 
27
© Adam Teman,
May 2, 2021
Pull Up Ratio – Write Constraint Q
M6
QB=VOLmin
WL M5
BLB
So we need the access

transistor to be much stronger
than the pull up transistor…
W6
L6
PR 
W5
L5
28
© Adam Teman,
May 2, 2021
Summary – SRAM Sizing Constraints
W1 W4
Read Constraint L1 L4 PDN
CR  = =
W2 W5 access
L2 L5
KPDN  Kaccess
KPDN  Kaccess  KPUN

Write Constraint
Kaccess  KPUN
W3 W6
L3 L6 PUN
PR  = =
W2 W5 access
L2 L5
29
© Adam Teman,
May 2, 2021
4T Memory Cell
• Achieve density by removing the PMOS pull-up.
• However, this results in static power dissipation.
30
© Adam Teman,
May 2, 2021
Multi-Port SRAM
Dual Port SRAM Two Port SRAM
31
© Adam Teman,
May 2, 2021
6T SRAM Layout
SRAM Layout - Traditional
• Share Horizontal Routing (WL).
• Share Vertical Routing (BL, BLB).
• Share Power and Ground.
BL BLB
WL WL
M3 M6
M2 M5
Q QB
M1 M4
© Adam Teman,
May 2, 2021
SRAM Layout – Thin Cell BL BLB
• Avoid Bends in Polysilicon and Diffusion WL WL

M3 M6
• Orient all transistors in one direction.
M2 M5
• Minimize Bitline Capacitance. Q QB
M1 M4
34
© Adam Teman,
May 2, 2021
65nm SRAM
• Industrial example from ST/Phillips
35
© Adam Teman,
May 2, 2021
Commercial SRAMs
Intel Design Forum 2009
36
© Adam Teman,
May 2, 2021
And very recent SRAMs Samsung 3nm
TSMC 7nm GAA SRAM Test Chip
SRAM
Source: TSMC
Source: ISSCC 2021
TSMC 5nm SRAM Test Chip

Source: ISSCC 2020 © Adam Teman,
May 2, 2021
SRAM Stability
“Static Noise Margin”
38
Static Noise Margin - Hold
+ +
VN - - VN
39
© Adam Teman,
May 2, 2021
Static Noise Margin - Hold
M3 M6
Q QB Q QB
M1 M4
1. Plot both VTCs on the same graph

2. Find the maximum square that fits
in the VTC.
3. The SNM is defined as the side of
the maximum square.
40
© Adam Teman,
May 2, 2021
Static Noise Margin - Read
• What happens during Read?
• We can’t ignore the access transistors anymore…
BLB
Vout
BL
WL M3 M6 WL
VDD M2 M5 VDD
Q M1 M4 QB
CBL
CBLB
M3 M2 M6 M5
QB Q
Q QB
M1 M4
Vin
41
© Adam Teman,
May 2, 2021
Static Noise Margin - Read
QB
M3 M2
QB Q
M1
SNM
M6 M5
Q
QB
M4
Q
42
© Adam Teman,
May 2, 2021
Static Noise Margin - Write Q
• What happens during Write? M3 M2

• The two sides are now different. QB Q
M1
BLB
BL
QB
WL M3 M6 WL QB
M2 M5 ‘0’
M6
Q=‘0’ M1 M4 QB=‘1’ QB
VDD Q
M4 M5
43
© Adam Teman,
May 2, 2021
Static Noise Margin - Write
Q
M3 M2
QB Q
M1
If there is a stable
point here, the
wrong data is
M6 WSNM written!
QB
Q
M4 M5
QB
44
© Adam Teman,
May 2, 2021
Alternative Write SNM Definition
• Write SNM depends on the cell’s separatrix,
therefore alternative definitions have been proposed.
• For example, add a DC Voltage (VBL) to the 0 bitline Q
and see how high it can be and still flip the cell.
M3 M2 M6 VBL=
V DD
QB
QB Q
Q
VB
L =0
M1 M4 M5
QB
VBL
45
© Adam Teman,
May 2, 2021
Dynamic Stability
46
© Adam Teman,
May 2, 2021
SNM Calculation
47
Simulating SNM
• Problem:
• How can we calculate SNM with SPICE?
• Some options:
• Insert DC sources at Q and QB
• But where exactly do we connect them?
• Draw Butterfly Curves
• But how do we find the largest squares?
• To run Monte Carlo Simulations we should have an easy way of calculation.
49
© Adam Teman,
May 2, 2021
Simulating SNM
• First let’s define the graphical solution:
• The diagonals of all the squares are on lines parallel to Q=QB.
• We need to find the distance
between the points where these
intersect the butterfly plot.
• The largest of these distances
is the diagonal of the maximum
square in each lobe.
• Multiply this by cos45°
and we get the SNM.
• Easy, right?
50
© Adam Teman,
May 2, 2021
Changing Coordinates
• What if we were to turn the graph?
51
© Adam Teman,
May 2, 2021
• If we were to use new axes, we could just subtract the graphs.
• This gives us the distances between the intersections with the Q=QB parallels.
• Now all we have to do is
find the maximum of the
subtraction.
• (Don’t forget to multiply by cos45)
52
© Adam Teman,
May 2, 2021
• The required transformation is: x=
1
u+
1
v
2 2
1 1
y=− u+ v
2 2
• Now let’s define some function as F1
• Substituting y=F1(x) gives:

v = u + 2y =
 1 1 
= u + 2 F1  u+ v
 2 2 
53
© Adam Teman,
May 2, 2021
• What we did is turn some function (F1) 45 degrees
counter clockwise.
• This can easily be implemented with the following circuit:
• What is F1?
• It could be the VTC of
Vin=Q, Vout=QB…
54
© Adam Teman,
May 2, 2021
• But what about the “mirrored” VTC?
• This needs to first be mirrored with respect to the v axis and then transformed
to the (u,v) system.
• If we call the second VTC F2 with x=F2(y) then the operation we need is:
v = −u + 2 x
 v u 
= −u + 2 F2  − 
 2 2
55
© Adam Teman,
May 2, 2021
Final SNM Calculation
VDD GND VDD
• Now we need to:
BL WL BLB
• Make a schematic of our SRAM
cell with two pins: Q and QB. 6T Cell
Q2 Q QB QB2
• Create a coordinate changing circuit VDD GND VDD
for each of the transformations. DC Sweep u v1 v1

BL WL BLB
Transformation 1
Q1 F1(in)
6T CellF1(out) QB1
Q2 Q QB QB2
DC Sweep u v2 v2
Transformation 2
QB2 F2(in) F2(out) Q2
56
© Adam Teman,
May 2, 2021
• Now, connect F1 to Q→QB, and F2 to QB→Q. VDD GND VDD
DC Sweep u v1 v1 BL WL BLB
Transformation 1 6T Cell
Q1 F1(in) F1(out) QB1 Q1 Q QB QB1
VDD GND VDD
DC Sweep u v2 v2
BL WL BLB
Transformation 2
QB2 F2(in) F2(out) Q2 6T Cell
Q2 Q QB QB2
• Run a DC Sweep on u from –VDD/√2 to VDD/√2

• This will present the butterfly curves
turned 45 degrees.
57
© Adam Teman,
May 2, 2021
• Now just:
• Subtract the bottom graph from the top one.
• Find the local maxima for each lobe.
• The smaller of the local maxima
is the diagonal of the largest square.
• Multiply this by cos45° for the SNM
 min max ( v1 − v2 ) , max ( v1 − v2 )

1 
SNM =
2  − 2 u  0 0 u  2
58
© Adam Teman,
May 2, 2021
Read/Write SNM
• How about Read SNM:
• Use the exact same setup.
• Connect BL and BLB to VDD.
• Connect WL to VDD.
• Run the same calculation.
• And Write SNM.

• Now connect one BL to GND.
• This is trickier, so you’ll have to play around with the calculation.
• There are other options for WM calculation.
59
© Adam Teman,
May 2, 2021
Testbench Setup – Read/Write
Read Testbench: Write Testbench:
DC Sweep u v1 v1 DC Sweep u v1 v1
Transformation 1 Transformation 1
Q1 F1(in) F1(out) QB1 Q1 F1(in) F1(out) QB1
DC Sweep u v2 v2
DC Sweep u v2 v2 Transformation 2
Transformation 2 QB2 F2(in) F2(out) Q2
QB2 F2(in) F2(out) Q2
GND VDD VDD
VDD VDD VDD
BL WL BLB
BL WL BLB
6T Cell
6T Cell Q1 Q QB QB1
Q1 Q QB QB1
GND VDD VDD
VDD VDD VDD
BL WL BLB
BL WL BLB
6T Cell 6T Cell
Q2 Q QB QB2 Q2 Q QB QB2
60
© Adam Teman,
May 2, 2021
SRAM Stability under process variations
61
© Adam Teman,
May 2, 2021
Metastability Convergence in Spectre
• Node Sets
• What solution does Virtuoso find with a standard OP?
• To fix this, make sure you use the “Node Set” option.
© Adam Teman,
May 2, 2021
Node Sets vs. Initial Conditions
• SPICE supports two types of conversion aids :
• Node Sets:
• Help SPICE converge by providing it
with an initial guess.
• Used only for DC convergence!
Disregarded for Transient Analysis.
• Initial Conditions:
• Enforce a node voltage at time t=0.
• Used only for Transient analysis!
Disregarded for DC convergence.
63
© Adam Teman,
May 2, 2021
Additional simulation tips
• Work with Design Hierarchy
• Create transformation functions
and DUTs as symbols.
• Create multiple tests in
single ADE-XL view.
• Use variables/parameters to
define initial conditions/node sets.
• Create supply voltages in
separate symbol.
• Use buffers to smooth transitions
and reduce cross cap.
64
© Adam Teman,
May 2, 2021
Further Reading
• Rabaey, et al. “Digital Integrated Circuits” (2nd Edition)
• Elad Alon, Berkeley ee141 (online)
• Weste, Harris, “CMOS VLSI Design (4th Edition)”
• Seevinck, List, Lostroh, “Static Noise Margin Analysis of SRAM Cells”
IEEE Journal of Solid State Circuits, 1987
• Teman and Visotsky. "A fast modular method for true variation-aware separatrix
tracing in nanoscaled SRAMs." IEEE TVLSI, 2014.
65
© Adam Teman,
May 2, 2021

Lecture 8 SRAM 2021

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Lecture 8 SRAM 2021

Uploaded by

Copyright:

Available Formats

Digital Integrated Circuits

Cerebras Wafer Scale Engine 2 (2021) –

5 Source: Pavlov, Sachdev, 2008

Random Access Memory Serial Access Memory Content Addressable Memory

• Random Access, FIFO, LIFO,

• Let’s say we want to make a 1MB memory:

Number of Words in a Row: 2M

Number of Rows: 2A-M

Option 3: Transmission Gate Solution: Differential nMOS Write

Writes well, but how do we read?

• The personal computer that “was designed Source: macworld.co.uk

so easy to use that people could actually use it”.

VDSat,n + CR (VDD − VT,n ) − V (1 + CR ) + CR (VDD − VT,n )

So we need the pull

So we need the access

KPDN  Kaccess  KPUN

Dual Port SRAM Two Port SRAM

• Avoid Bends in Polysilicon and Diffusion WL WL

Intel Design Forum 2009

Source: ISSCC 2021

TSMC 5nm SRAM Test Chip

1. Plot both VTCs on the same graph

• What happens during Write? M3 M2

• To run Monte Carlo Simulations we should have an easy way of calculation.

• Substituting y=F1(x) gives:

• Create a coordinate changing circuit VDD GND VDD

for each of the transformations. DC Sweep u v1 v1

VDD GND VDD

• Run a DC Sweep on u from –VDD/√2 to VDD/√2

 min max ( v1 − v2 ) , max ( v1 − v2 )

• And Write SNM.

You might also like