14 October 2021
Who are we?
Prof. Osnat Keren, Dr. Leonid Yavits
© Adam Teman, October 14, 2021
Outline
• Embedded Memories
• GC-eDRAM
• DRT and Refresh
• Other Designs
• Summary
Embedded Memories
The Computer Memory Hierarchy
The Importance of Embedded Memories
• Memories dominate area and power.
[Figure: die photos showing on-chip cache area. Intel Pentium-M (2001), 2MB L3 cache; Intel 10th Gen “Comet Lake” (2020), 20MB L3 cache; IBM z15 CP and SC chips (2020), 256MB L3 cache and 960MB L4 cache; Cerebras Wafer Scale Engine 2 (2021). Sources: Intel, wccftech]
Static is GOOD!
• A static circuit can restore its state after a disturbance.
[Figure: cross-coupled inverter pair with nodes Q and QB, and its voltage transfer curves (VQB vs. VQ, 0V to 0.8V) marking the stable ‘0’ and ‘1’ states]
• High noise margins!
SRAM is GOOD!
• SRAM is the exclusive solution for embedded memories in most ICs.
[Figure: 6T SRAM bitcell schematic (bitlines BL/BLB, wordline WL, transistors M1 to M6, storage nodes Q/QB), alongside die photos of a TSMC 7nm SRAM, a TSMC 5nm SRAM test chip, and a Samsung 3nm GAA SRAM. Sources: TSMC, ISSCC 2020, ST, ISSCC 2021]
But… Nobody is Perfect
• SRAM is BIG
• 6 Transistors = 1 bit
• SRAM is Leaky
• Several VDD to GND paths
• SRAM is Ratioed
• Fails under voltage scaling
[Figure: 6T SRAM bitcell schematic (BL, BLB, WL, M1 to M6) holding Q = 0V and QB = 1V]
Introducing the “Gain-Cell”
• 1T-1C DRAM uses a single port for reading and writing
• Write: Drive charge through the port onto a storage capacitor.
• Read:
• Precharge bitline and enable charge sharing through the port
• The charge transferred from the storage node changes the bitline voltage
• A large storage capacitor is required to enable sensing this change
• It also destroys the stored data, requiring write-back
[Figure: 1T-1C DRAM cell with a single R/W port (WL, BL)]
[Figure: gain cell with a separate Write Port and Read Port (RBL)]
• We can amplify the stored charge (=“gain”)
• We can separately optimize read and write
• Read becomes non-destructive
• We get two-ported functionality
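The need for a large storage capacitor in 1T-1C DRAM follows from charge conservation during the read. The sketch below assumes ideal charge sharing between the storage capacitor and the bitline capacitance; the function name and all voltage and capacitance values are hypothetical and chosen only for illustration.

```python
def bitline_swing(v_cell, v_precharge, c_storage, c_bitline):
    """Bitline voltage change after ideal charge sharing in a 1T-1C read.

    Charge conservation gives:
        dV = (v_cell - v_precharge) * c_storage / (c_storage + c_bitline)
    """
    return (v_cell - v_precharge) * c_storage / (c_storage + c_bitline)


# Hypothetical numbers: 1 V supply, bitline precharged to VDD/2, 100 fF bitline.
VDD = 1.0
C_BITLINE = 100e-15

for c_storage in (1e-15, 10e-15, 30e-15):
    swing = bitline_swing(VDD, VDD / 2, c_storage, C_BITLINE)
    print(f"C_S = {c_storage * 1e15:4.0f} fF -> bitline swing = {swing * 1e3:6.1f} mV")
```

With a small storage capacitor the swing shrinks toward the noise floor of the sense amplifier, which is the motivation for decoupling the read path with a gain transistor.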
Basic Gain Cell Operation
• All NMOS 2T Gain Cell
• Write Strong ‘0’, Weak ‘1’
• Boosted voltage for strong ‘1’
• Read:
• Precharge RBL, Pulse RWL low
• SN=‘0’ → RBL unchanged
• SN=‘1’ → RBL discharges
• All NMOS 3T Gain Cell
• Write is the same
• Read:
• Precharge RBL, Pulse RWL high
• SN=‘0’ → RBL unchanged
• SN=‘1’ → RBL discharges
[Figure: 2T and 3T gain cell schematics (write transistor MW, read transistor MR, storage node SN, WWL, WBL, RWL, RBL) with write waveforms (WWL boosted to Vboost, WBL at ‘1’/‘0’). Annotations: RWL driven through diffusion; RBL discharge dependent on other cells on row; RBL saturation depends on other cells in column]
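The "weak ‘1’" and the read behavior listed above can be captured in a first-order behavioral model. This is a sketch only: it assumes an ideal NMOS threshold drop (no body effect, no subthreshold conduction), and the function names and voltages are hypothetical.

```python
def written_level(data, vdd, v_wwl, vt_write):
    """Storage-node (SN) voltage right after a write through an NMOS write transistor.

    An NMOS passes a strong '0', but a '1' is limited to (V_WWL - VT),
    unless the write wordline is boosted above VDD + VT.
    """
    if data == 0:
        return 0.0
    return min(vdd, v_wwl - vt_write)


def rbl_discharges(v_sn, vt_read):
    """Read: the precharged RBL discharges only if SN turns on the NMOS read transistor."""
    return v_sn > vt_read


VDD, VT = 1.0, 0.4  # hypothetical supply and threshold voltages

weak_one = written_level(1, VDD, v_wwl=VDD, vt_write=VT)          # no boost
strong_one = written_level(1, VDD, v_wwl=VDD + 0.5, vt_write=VT)  # boosted WWL
print(f"'1' without boost: {weak_one:.2f} V, with boost: {strong_one:.2f} V")
print("SN='1' discharges RBL:", rbl_discharges(weak_one, VT))
print("SN='0' discharges RBL:", rbl_discharges(written_level(0, VDD, VDD, VT), VT))
```

In this model the 2T and 3T cells differ only in how the read is enabled (RWL pulsed low versus high), so a single read predicate covers both.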
GC-eDRAM Advantages
• Compared to SRAM:
• Non-ratioed
• Two-ported
But, charge leaks away…
• Subthreshold conduction
• Exponentially depends on MW’s VT, VGS, and temp
• Depends on voltage difference between SN and WBL
• GIDL and junction leakage
• Asymmetrical between ‘1’ and ‘0’, Increases with temperature
• Gate leakage
• Asymmetrical between ‘1’ and ‘0’, Independent of temperature
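The exponential dependence of subthreshold conduction on VT, VGS, and temperature can be illustrated with the standard first-order subthreshold current expression. This is a textbook approximation, not a model taken from the slides: the prefactor `i0` and slope factor `n` are hypothetical, and the temperature dependence of `i0` and VT is ignored.

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def subthreshold_current(vgs, vds, vt, temp_k=300.0, i0=1e-7, n=1.5):
    """First-order subthreshold drain current (A) of the write transistor MW.

    I = i0 * exp((vgs - vt) / (n * kT/q)) * (1 - exp(-vds / (kT/q)))
    i0 and n are illustrative, technology-dependent fitting parameters.
    """
    v_thermal = K_B_OVER_Q * temp_k
    return i0 * math.exp((vgs - vt) / (n * v_thermal)) * (1.0 - math.exp(-vds / v_thermal))


# MW is off (VGS = 0 V); VDS is the voltage difference between SN and WBL.
for temp_k in (300.0, 358.0):
    for vt in (0.4, 0.3):
        leak = subthreshold_current(vgs=0.0, vds=0.5, vt=vt, temp_k=temp_k)
        print(f"T = {temp_k:.0f} K, VT = {vt:.1f} V -> I_sub = {leak:.3e} A")
```

The `vds` term is what makes the leakage depend on the SN-to-WBL voltage difference: when WBL sits at the same level as SN, the subthreshold current vanishes.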
Write access statistics
• Sub-threshold leakage depends on the relation between SN and WBL
• Scenario 1: Worst-case access
• After writing a cell, WBL is permanently held opposite to the stored data
• Scenario 2: Retention mode
• After the memory array is written, it remains in idle or read states, allowing WBL control → pre-(dis)charge or bias WBL
Data Retention Time Measurement
• Data Retention Time (DRT) is the time from write until you can no longer read out the data.
• Various approaches for measuring:
• Effective data retention time (EDRT)
• Voltage-based data retention time (VDRT)
• Current-based data retention time (IDRT)
Sources: R. Giterman, A. Bonetti, T. Noy, A. Teman, and A. Burg, IEEE Transactions on Circuits and Systems I (TCAS-I), 2020
N. Edri, P. Meinerzhagen, A. Teman, A. Burg, and A. Fish, IEEE Transactions on Circuits and Systems I (TCAS-I), 2016
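As a rough illustration of a voltage-based retention estimate, the sketch below integrates the storage-node voltage of a stored ‘1’ until it falls below a minimum readable level. The exact VDRT definition in the cited papers may differ; the function name, the constant leakage current, and all numeric values here are assumptions for illustration.

```python
def voltage_based_drt(v_written, v_min_readable, c_sn, leak_current, dt=1e-7, t_max=1.0):
    """Estimate the time (s) for a stored '1' to decay below the readable level.

    leak_current(v) returns the leakage current (A) drawn from the storage
    node when it sits at voltage v. Forward-Euler integration with step dt.
    """
    v, t = v_written, 0.0
    while v > v_min_readable and t < t_max:
        v -= leak_current(v) * dt / c_sn
        t += dt
    return t


# Hypothetical cell: 1 fF storage node, 1 pA constant leakage, 0.8 V -> 0.5 V.
drt = voltage_based_drt(0.8, 0.5, 1e-15, lambda v: 1e-12)
print(f"Estimated DRT: {drt * 1e6:.1f} us")
```

For a constant leakage current this reduces to C·ΔV/I; passing a voltage-dependent `leak_current` function captures the fact that leakage changes as the node decays.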
DRT and Refresh
The problems with Data Retention Time
• The main barrier for GC-eDRAM is its limited DRT, which leads to:
• Increased power consumption: $P_{ret} \propto \frac{1}{DRT}$
• Lower availability: $\text{Availability} = 1 - \frac{T_{refresh}}{DRT}$
• This gets worse with transistor scaling, as the parasitic capacitance is reduced: $DRT \propto C_G \propto W \cdot L$
• In addition, DRT is a complex factor, as it is dependent on:
• Written voltage levels (Vboost, CI/CF)
• Read frequency (relation between $I_{RBL}$, $V_{SN}$, and $1/T_{ret}$)
• Write statistics
• Data stored in neighboring cells (for 1T read port)
• Accordingly, a wide range of research has focused on extending the DRT
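The two DRT-driven costs listed above can be put into numbers with a short sketch. It assumes the array is refreshed once per DRT period, that refresh blocks all accesses for a time `t_refresh`, and that retention power scales as 1/DRT; the function names and example values are hypothetical.

```python
def availability(drt, t_refresh):
    """Fraction of time the array is free for accesses: 1 - T_refresh / DRT."""
    return 1.0 - t_refresh / drt


def relative_retention_power(drt, drt_reference):
    """Retention power relative to a reference design, assuming P_ret ~ 1/DRT."""
    return drt_reference / drt


T_REFRESH = 2e-6  # hypothetical time to refresh the whole array, in seconds

for drt in (20e-6, 100e-6, 1e-3):
    print(
        f"DRT = {drt * 1e6:7.1f} us -> availability = {availability(drt, T_REFRESH):.3f}, "
        f"retention power vs. 20 us design = {relative_retention_power(drt, 20e-6):.3f}"
    )
```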
Different Bitcells
• Many combinations of bitcells have been proposed for improving retention time and other circuit characteristics
[Figure: bitcell schematics: All-PMOS 2T (Somasekhar 08, 09); 2T1D and 3T1D (Luk 04, 05; Chang 07); Asymmetric 2T (Chun 12); Boosted 3T (Chun 09, 11); further variants (Luk 06, Harel 21)]
Dealing with CMOS Scaling
• The retention time of classic GC-eDRAM options drops significantly at technology nodes below 65nm
Sources: R. Giterman, A. Fish, N. Geuli, E. Mentovich, A. Burg, and A. Teman, IEEE Journal of Solid State Circuits (JSSC), 2018
R. Giterman, A. Fish, A. Burg, and A. Teman, IEEE Transactions on Circuits and Systems I (TCAS-I), 2017
R. Giterman, A. Teman, P. Meinerzhagen, A. Burg, and A. Fish, US Patent 10,002,660
Different Technologies
• Bulk CMOS technologies suffer from increasing subthreshold leakage
• At 180nm, DRTs were in the ms range; at 65nm they drop to tens of µs
• The reduced leakage of FD-SOI and FinFET technologies provides new opportunities
[Figure: 28nm FD-SOI test chip; 16nm FinFET test chip]
Sources: R. Giterman, A. Fish, A. Burg, and A. Teman, IEEE Transactions on Circuits and Systems I (TCAS-I), 2017
R. Giterman, A. Shalom, A. Burg, A. Fish, and A. Teman, IEEE Solid-State Circuits Letters, 2020
Body Biasing
• In mature processes, body biasing can be applied to lower leakage and extend DRT
• Silicon: 100mV reverse body bias (RBB) → 2.3X DRT boost
Sources: P. Meinerzhagen, A. Teman, A. Fish, and A. Burg, IET Journal of Engineering (JoE), 2013
R. Giterman, A. Bonetti, A. Burg, and A. Teman, IEEE Transactions on Circuits and Systems II (TCAS-II), 2019
J. Narinx, A. Bonetti, N. Frigerio, C. Aprile, A. Burg, and Y. Leblebici, IEEE Asian Solid-State Circuits Conference (A-SSCC), 2019
R. Giterman and A. Teman, US Patent App. 17/257,893, 2021
Refresh Approaches
• Straightforward approach: ordinary periodic refresh (a.k.a., global refresh)
• Sequentially refresh entire array at 1/DRT frequency
[Figure: timeline alternating between Normal Operation and Refresh phases]
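A cycle-level sketch of the periodic (global) refresh scheme is given below. It assumes one row is refreshed per clock cycle, that all rows were written at cycle 0, and that the array is unavailable during a refresh burst; the function name and parameters are hypothetical.

```python
def simulate_global_refresh(num_rows, n_drt, total_cycles):
    """Simulate periodic refresh: every n_drt cycles, refresh all rows back to back.

    Returns (availability, worst_age), where worst_age is the largest number
    of cycles any row waited between two consecutive refreshes.
    """
    if num_rows > n_drt:
        raise ValueError("a full refresh pass must fit inside one retention period")
    last_refresh = [0] * num_rows  # assume every row was written at cycle 0
    busy_cycles = 0
    worst_age = 0
    for cycle in range(total_cycles):
        phase = cycle % n_drt
        if phase < num_rows:  # refresh burst at the start of each period
            row = phase
            worst_age = max(worst_age, cycle - last_refresh[row])
            last_refresh[row] = cycle
            busy_cycles += 1
    return 1.0 - busy_cycles / total_cycles, worst_age


avail, worst_age = simulate_global_refresh(num_rows=64, n_drt=1024, total_cycles=4096)
print(f"availability = {avail:.4f}, worst row age = {worst_age} cycles")
```

Because every row is revisited exactly once per period, no row ever waits longer than the retention time, at the cost of blocking the array for `num_rows` cycles out of every `n_drt`.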
Hidden Refresh Algorithm
• Can we ensure 100% Availability?
• In order to provide a “drop-in” replacement for SRAM, a GC-eDRAM macro must ensure 100% array availability.
• Hide the refresh using COIs (copies of instances)
[Figure: memory subarrays and COIs (invisible to user)]
Source: R. Golman, N. Nachum, T. Cohen, R. Giterman, A. Teman, IEEE Access, 2021
Refreshing FIFOs
• What if the access is strictly ordered, such as in a FIFO? Can we do any better?
• Yes.
• There is an upper bound on the number of interruptions that can occur. A FIFO of size S is guaranteed to be refreshed on time if:
$N_{DRT} \geq (S+1) + 2(S-1) = 3S-1$
($N_{DRT}$ is the retention time in clock cycles)
• So we just need to trigger the refresh early enough to ensure it can finish on time!
• Leads to very significant power savings (often no refresh is needed!)
Sources: T. Noy and A. Teman, IEEE Transactions on Circuits and Systems I (TCAS-I), 2020
T. Noy and A. Teman, US Patent 10,803,920, 2021
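The FIFO bound can be checked directly. The sketch below only evaluates the inequality N_DRT ≥ 3S-1 stated above; it does not implement the refresh controller itself, and the function names are hypothetical.

```python
def min_retention_cycles(fifo_size):
    """Smallest N_DRT (in clock cycles) that satisfies N_DRT >= (S+1) + 2(S-1)."""
    return (fifo_size + 1) + 2 * (fifo_size - 1)


def fifo_refresh_guaranteed(fifo_size, n_drt):
    """True if a FIFO of size S meets the on-time refresh bound for this N_DRT."""
    return n_drt >= min_retention_cycles(fifo_size)


def max_fifo_size(n_drt):
    """Largest FIFO size S with 3S - 1 <= N_DRT."""
    return (n_drt + 1) // 3


for size in (16, 256, 1024):
    print(f"S = {size:5d} -> requires N_DRT >= {min_retention_cycles(size)} cycles")
print("Largest FIFO for N_DRT = 1000 cycles:", max_fifo_size(1000))
```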
Replica Cells
• Use replica cells to track how the data retention time varies with process variations and write statistics
• Silicon: 5X longer DRT, 5X lower refresh power
Source: A. Teman, P. Meinerzhagen, R. Giterman, A. Fish, and A. Burg, IEEE Transactions on Circuits and Systems II (TCAS-II), 2014
Overlapping Refresh
Internal Refresh
Multi-Ported Gain-Cell
Double-pumped Read
Low-leakage Hybrid Memory
• A hybrid SRAM/GC-eDRAM cell can provide ultra-low leakage by:
• Power gating the supply during standby
• Relying on the dynamic storage of GC-eDRAM
• Using the SRAM latch to refresh the data
Sources: R. Giterman, A. Teman, P. Meinerzhagen, IEEE Transactions on Circuits and Systems II (TCAS-II), 2017
R. Giterman, A. Teman, P. Meinerzhagen, IEEE Int. Symp. on Circuits & Systems (ISCAS), 2017
Radiation-Hardened Dynamic Memory
• A conventional 2T gain-cell is susceptible to bit-flips in only one direction
• Combine complementary 2T cells and one will never fail!
• When reading, if both outputs are complementary → No error
• If both outputs are the same (presumably data ‘1’) → an error has occurred
• Add parity to correct the error!
• Can also be used for retention time extension.
Sources: R. Giterman, L. Atias and A. Teman, IEEE Transactions on VLSI (TVLSI), 2016
R. Giterman, L. Atias and A. Teman, US Patent 10,991,421
R. Giterman, R. Golman and A. Teman, IEEE Access, 2019
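The detect-then-correct idea can be sketched behaviorally. The model below is an illustration, not the circuit from the cited papers: it assumes each bit is stored as a complementary pair, that an upset can only force a cell to read as ‘1’, that a single even-parity bit protects each word, and that at most one pair per word is hit. All function names are hypothetical.

```python
from functools import reduce
from operator import xor


def encode(bits):
    """Store each bit (plus one even-parity bit) as a complementary pair (d, not d)."""
    word = list(bits) + [reduce(xor, bits, 0)]
    return [(b, 1 - b) for b in word]


def upset(pairs, index, cell):
    """Model a one-directional upset: the struck cell now reads as 1."""
    hit = list(pairs[index])
    hit[cell] = 1
    return pairs[:index] + [tuple(hit)] + pairs[index + 1:]


def decode(pairs):
    """Flag pairs whose outputs are equal, then repair the flagged bit using parity."""
    flagged = [i for i, (a, b) in enumerate(pairs) if a == b]
    if len(flagged) > 1:
        raise ValueError("more than one faulty pair: not correctable with one parity bit")
    word = [a for a, _ in pairs]
    if flagged:
        i = flagged[0]
        # Even parity: the XOR of all bits (data + parity) is 0,
        # so the missing bit equals the XOR of all the other bits.
        word[i] = reduce(xor, (w for j, w in enumerate(word) if j != i), 0)
    return word[:-1]  # drop the parity bit


data = [1, 0, 1, 1]
stored = encode(data)
corrupted = upset(stored, index=1, cell=0)  # strike the cell holding a '0'
print("read back:", decode(corrupted), "matches original:", decode(corrupted) == data)
```

A strike on a cell that already holds ‘1’ leaves the pair complementary and is harmless, which is why one cell of each pair "never fails" under the one-directional assumption.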
True Approximate Storage
• Approximate computing does not require 100% error-free operation.
• However, this requires “graceful degradation”
• This is an inherent trait of DRT failures
[Figure: 28nm GC-eDRAM with reduced refresh frequency (panels: 1µs, 5µs, 10µs, 50µs); Integrated dynamic and static RAM (iD-SRAM)]
Sources: A. Teman, G. Karakonstantis, R. Giterman, P. Meinerzhagen, and A. Burg, DATE 2015
S. Ganapathy, A. Teman, R. Giterman, A. Burg, and G. Karakonstantis, IEEE NEWCAS, 2015
R. Giterman, A. Fish, N. Geuli, E. Mentovich, A. Burg, and A. Teman, IEEE Journal of Solid State Circuits (JSSC), 2018
A. Kazimirsky, A. Teman, N. Edri, and A. Fish, IEEE Trans. VLSI (TVLSI), 2017
Ternary Bitcells
• Static bitcells are bi-stable
• And therefore, can only store two values (VDD and GND)
• But dynamic circuits can be at intermediate levels
• This provides the capability to implement a multi-level cell
Ternary encoding:
100 → ‘0’ (GND)
010 → ‘1’ (VDD/2)
001 → ‘2’ (VDD)
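The three-level storage scheme can be sketched as a mapping between ternary values, storage-node voltages, and the one-hot codes (100, 010, 001). The decision thresholds at VDD/4 and 3·VDD/4 are an assumption made for illustration, as are the supply value and the function name.

```python
import math

VDD = 1.0  # hypothetical supply voltage

LEVELS = {0: 0.0, 1: VDD / 2, 2: VDD}        # ternary value -> written SN voltage
ONE_HOT = {0: "100", 1: "010", 2: "001"}     # ternary value -> one-hot code


def decode_level(v_sn, vdd=VDD):
    """Map a storage-node voltage back to a ternary value.

    The thresholds at VDD/4 and 3*VDD/4 (midpoints between levels) are assumed.
    """
    if v_sn < vdd / 4:
        return 0
    if v_sn < 3 * vdd / 4:
        return 1
    return 2


for value, voltage in LEVELS.items():
    decayed = voltage * 0.9  # hypothetical 10% droop from leakage
    read = decode_level(decayed)
    print(f"wrote '{value}' at {voltage:.2f} V -> reads '{read}' (code {ONE_HOT[read]})")

print(f"Information per cell: {math.log2(3):.3f} bits")
```

The narrower spacing between levels is the price of the extra density: each level tolerates only about VDD/4 of drift under these thresholds, compared with about VDD/2 for a binary cell.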
Cryogenic GC-eDRAM
• Cryogenic operation is used for certain applications:
• Quantum computing, Infra-red imaging, HPC
• Subthreshold leakage is highly suppressed under these conditions
• Dynamic memories could be a great option!
Source: E. Garzon, Y. Grinblatt, O. Harel, M. Lanuzza, and A. Teman, IEEE Transactions on VLSI (TVLSI), 2021
Summary
A decade of GC-eDRAM research
• I started researching gain cells in 2012
• More than 40 published papers.
• One full-length book.
Test chips:
• GREENBELT1 (2012), 180nm
• GREENBELT2 (2013), 180nm
• CAMEL (2014), 65nm
• dynOR (2015), 28nm FD-SOI
• DAFNA (2016), 28nm
Source: A. Bonetti, R. Golman, R. Giterman, A. Teman, and A. Burg, IEEE Transactions on VLSI (TVLSI), 2020
And the next step: RAAAM
Delivering the Highest Density Volatile
Embedded Memories in Standard CMOS
Reduced Cost | Longer Battery-Life | Better Performance
Dr. Robert Giterman, CEO
Prof. Andreas Burg, CTO
Prof. Alex Fish, Technology Advisor
Prof. Adam Teman, Technology Advisor
Mr. Danny Biran, Business Advisor
Thank you