Professional Documents
Culture Documents
to Meet
Power, Speed, and Area Scaling Trend
LC LU
TSMC
TSMC Fellow/Senior Director, R&D
© 2017 ISPD
ISPD 2017 1
Chip and System Integration Trends
for Better PPA & System Performance
Platform Solutions
Mobile
3D Transistors High
(FinFET, GAA-NWT) Performance
Computing
Automotive
CPU
Voice
3D Stacking
(BSI, Monolithic MEMS)
Camera IoT
Video
MP3
Storage
SYSTEM
SoC 3D Packaging INTEGRATION
(WLP, WoW)
Multiple System performance/
Better performance
Chips Bandwidth
Better Power
on PCB Application-specific
Better form factor
Process technology +
Design Solutions
2D 3D
© 2017 ISPD
ISPD 2017 2
Application-Optimized Design Platforms
Area, Performance and Power
Power
Speed &
Memory Bandwidth
High-
Performance
Functional Safety Computing
Standard Compliance
Mobile
Ultra-Low Power
Automotive
IoT
Speed
© 2017 ISPD
ISPD 2017 3
Semiconductor Trend
Area: process primary dimension scaling slows down
Very challenging to continue Moore’s law economically
Process and design co-optimization to provide enough area
scaling
Performance: dimension scaling grows metal and via
resistance exponentially
Fully automatic and smart via pillar design flow to reduce high
resistance impact
Power: ultra low voltage for ultra low power
Robust design and variation modeling at low voltages
Heterogeneous integration
Low cost and high system performance 3D packaging
High physical design complexity
Machine learning to be applied to physical design
© 2017 ISPD
ISPD 2017 4
Area Scaling
Process primary dimension (metal, gate and
fin pitch) scales less and per generation
cadence takes longer
Process-design co-optimization innovation to
keep area scaling
Fin depopulation to increase cell density
Power plan and cell layout co-optimization to
increase cell placement utilization
EUV to increase routing density
© 2017 ISPD
ISPD 2017 5
Fin Depopulation can Increase Density
and Reduce Process Dimension Scaling Pressure
© 2017 ISPD
ISPD 2017 6
Fin Depopulation can Increase
Speed and Power Efficiency
Higher fins have highest speed
Fewer fins have highest speed@same power and
lowest power@same speed
© 2017 ISPD
ISPD 2017 7
Fin Depopulation Impact on
Logic Density and Cell Utilization
Logic
Density
3.0
2.0
Cell
Utilization
1.0 80%
75%
70%
16nm 10nm 7nm
3~4 fins 3 fins 2 fins
© 2017 ISPD
ISPD 2017 8
Logic Density Improvement (I)
Power Plan Optimization
To improve power plan IR, PG via count is increasing over
technology generations
Shrink PG Pitch to
increase via count
Top Down
Unused
space
Power stub:
More cells placed under PG Stagger pin: 6 access points
© 2017 ISPD
ISPD 2017 11
Logic Density Improvement (IV)
EUV to Increase Routing Density
NXE3300 EUV
Hole type DSA, Line/Space DSA,
Single 16 nm half pitch 12 nm half pitch
Patterning
© 2017 ISPD
ISPD 2017 12
EUV Metal
Metal : Poly pitch = 2 : 3
More Metal resources for logic density improvement
2 library sets are needed for two metal offset.
© 2017 ISPD
ISPD 2017 13
Less Coupling from Metal : Poly = 2 : 3
ND2D1 ND2D1
M1:PO pitch = 1:1 M1:PO pitch = 2:3
© 2017 ISPD
ISPD 2017 14
More Routing from Metal : Poly = 2 : 3
© 2017 ISPD
ISPD 2017 15
Enhanced Logic Density and
Cell Utilization Trends
Logic
Density
3.0
2.0 Optimized
power plan, Cell
pin access, Utilization
1.0 EUV routing 80%
75%
70%
N16 N10 N7
11ML 12ML 13ML
© 2017 ISPD
ISPD 2017 16
Performance Scaling
Process dimension scaling grows metal and
via resistance exponentially
Fully automatic and smart EDA design flow is
needed to effectively solve high resistance
issue for 7nm and beyond
© 2017 ISPD
ISPD 2017 17
Cross Node Metal RC Scaling
75 2
Mx Resistance (vs. 40nm)
Mx R
1.8
60
1.4
30
increase ~3X
1.2
Mx C
15
1
0 0.8
40nm 28nm 16nm 7nm 5nm
© 2017 ISPD
ISPD 2017 18
Cross Node Wire R and Via Impact
50%
BEOL R delay vs. Total delay
40%
30%
20%
10%
0%
40nm 28nm 16nm 7nm 5nm
© 2017 ISPD
ISPD 2017 19
VIA Pillar
M6
M5 M5
M4 M4
M3 M3
M2 M2
© 2017 ISPD
ISPD 2017 20
Via Pillar Insertion
Implies layer promotion for better wire resistance
Effectively reduces source Via resistance
Connection through DPT layer: DPT wire resistance dominates
Metal R Source Via R
Std. Wire
cell (20μm) 55.43X 1X
40%
30%
20%
0%
40nm 28nm 16nm 7nm 5nm
© 2017 ISPD
ISPD 2017 22
Via Pillar Type and Usage
• EM via pillar (EM VP), used to guarantee cell-level EM,
has to be 100% inserted
– One cell master will only have 1 EM VP
• Performance via pillar (Performance VP), which is larger
than EM VP, can reduce more resistance
– One cell master can have multiple performance VP
VP Association I
Cell Performance Performance Performance EM Default
Master VP1 VP2 VP.. VP VP
VP Association II
Cell EM Performance Performance Performance
Master VP VP1 VP2 VP..
© 2017 ISPD
ISPD 2017 23
Via Pillar Automatic Design Flow
•VIA pillar (VP) definition, association
Design Kits
Floorplan
• VP aware placement
Placement • Automatic NDR
• VP optimization
Routing/Opt.
• DEF out option
© 2017 ISPD
ISPD 2017 24
Voltage and Power Scaling
Low voltage design reduces power effectively
Low voltage design challenges
Functionality robustness
Accurate variation modeling
© 2017 ISPD
ISPD 2017 25
Ultra-Low-Voltage Standard Cell Solution
Skew Cells/ D0 or Fine-Grained Cells
Internal Half-Sizing Cells
0.9V-
0.6V
Small High-Stack(3 or 4) Cells
Transmission Gates
Multi-Bit Latch/Flop
Proper Stage Ratio to Maintain Internal Slew
0.5V
and Proper P/N Ratio to Improve Noise Margin
below
New Circuit Structure to Enhance Robustness at Low–Vdd
© 2017 ISPD
ISPD 2017 26
Flop Low Voltage Design Robustness
• High sigma design checks are required to ensure
robust operation at low operating voltage
• For write operation, forward path (red) must be
stronger than feedback path (blue)
QN
Reset
clk_b clk
SI
D Reset
clk clk_b
clk_b
Reset
clk
clk
Reset
clk_b
© 2017 ISPD
ISPD 2017 27
Delay Variation Increases at
Lower VDD
2
Log of scaled delay standard
1.8
Delay variation increases
1.6 exponentially as VDD decreases
1.4
deviation
1.2
0.8
0.6
◄ Non-Gaussian Gaussian ►
0.4 at low voltage at regular voltage
0.2
0
VDD
© 2017 ISPD
ISPD 2017 28
Abnormal Hold-Time Caused by
Low-Vdd Non-Gaussian Behavior
Low VDD
Regular VDD
Hold time
© 2017 ISPD
ISPD 2017 29
Solutions for Low-Vdd Variations (I)
Methodology improvement
– Timing values are models by both “mean” and
“early/late distribution”
– Implement the new models in timing characterization
and STA tools
Original distribution
(Non-Gaussian with
long right tail)
“Early” distribution
to cover short tail “Late” distribution
to cover long tail
© 2017 ISPD
ISPD 2017 30
Solutions for Low-Vdd Variations (II)
By using new timing model and advanced statistical
OCV method, we achieve more accurate STA results
compared to Monte-Carlo simulations
Optimistic
Slack
© 2017 ISPD
ISPD 2017 32
Future Device Possibilities Through
Heterogeneous Technology Integration
TFET RF CMOS Analog
GAA- NWT
eFlash
Source
Gate Drain
3D Low Integrated
Ge FinFET
on Si
Power Specialty
CMOS MEMS
CMOS Technology
FinFET HV
3D-IC
Technology BCD - Power IC
InFO
Cu-TSV
CoWoSTM Vertical Stacking
Si Interposer
© 2017 ISPD
ISPD 2017 33
3D Packaging for Better System
Performance and Form Factor
Analog RF Passive
Logic
Memory
Wafer Bonding
3000
I/O Pin Count
CoWoS
2000
InFO
InFO-M
1000 InFO PoP
WLCSP*
0
0 200 400 600 800 1000 1200 1400
CoWoS
DRAM
Logic Through-
2000 InFO–Via
InFO (TIV)
InFO-PoP
InFO-M
1000 InFO PoP Logic Memory
Homogeneous
partition
3000
I/O Pin Count
CoWoS
High bandwidth memory
2000 integration for high-performance
InFO computing
InFO-M
1000 InFO PoP
High-performance
heterogeneous partitioning
WLCSP*
0
0 200 400 600 800 1000 1200 1400
Die/PKG size (mm2)
© 2017 ISPD
ISPD 2017 37
Advanced Packaging Co-Design –
SoC + InFO + IPD (Integrated Passive Device)
FC-PoP o
InFO-PoP
Rja (17.2 C/W) Rja (15.4 oC/W)
IPD 1 Thermal
L6 SR F-S pad
PM4 dissipation
RDL3
L5
Insulator5
RDL2
PM3 improved by 12%
Insulator4 PM2
L4 RDL1 250μm
RDL0 PM1
Core Molding
590μm Silicon Compounds
L3
Insulator2
L2
B-S pad
Insulator1
L1
SR
InFO DRC/LVS
RLCK Extraction
< 4GHz RC/RLCK Extraction for STA, IR,
SEM, & PI Analysis
InFO RC
Extraction S-parameter Extraction
4GHz – 10GHz S-parameters Extraction
for SI/PI, & EMI analysis
© 2017 ISPD
ISPD 2017 39
InFO Design Solutions
(Dies + InFO)
Inter-die DRC/LVS
Inter-die
Inter-die DRC Inter-die LVS DRC/LVS
InFO EM/IR Analysis
B1 B2 B3 B4
Dynamic
A1 A2 A3 A4 Static IR
EM/IR IR
Top die Analysis
Analysis Analysis
SI/PI Simulation
SI/PI
Simulation Thermal-aware EM/IR Analysis
SI Analysis PI Analysis
© 2017 ISPD
ISPD 2017 40
Applying Machine Learning to
Complex Physical Design Problems
Place and route Machine Learning experiment
Feature extraction and convolutional neural
network data mapping
Clock gating and routing congestion speed
improvement
© 2017 ISPD
ISPD 2017 41
Machine Learning-Based
Design Solution
Eliminate human & fixed-model subjective bias with statistical
important features
Traditional EDA Approach Machine Learning Approach
Placement
Prediction
Placement
Learning
result Routing result
Routing Model Routing
simulation
overheads training overheads
Routing programming Detailed
pattern model routing result
Only know routing results after Able to predict routing behavior and improve
running routing congestion with new recipes to achieve 40Mhz
© 2017 ISPD speed improvement
ISPD 2017 42
ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
TSMC’s proven
ML training
models
© 2017 ISPD
ISPD 2017 43
ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
Customer’s
models
© 2017 ISPD
ISPD 2017 44
Performance Gains Prediction from ML
• Capability to predict and set accurate ARM A72 clock
gating latency to avoid over-design and achieve better
speed from 50 to 150Mhz
• Post-route speed is improved by 40MHz with detour
prediction and early fix
© 2017 ISPD
ISPD 2017 46