Thriuuu

Physical Design Challenges and Innovations
to Meet
Power, Speed, and Area Scaling Trend
LC LU
TSMC
TSMC Fellow/Senior Director, R&D
© 2017 ISPD
ISPD 2017 1
Chip and System Integration Trends
for Better PPA & System Performance
Platform Solutions
Mobile
3D Transistors High
(FinFET, GAA-NWT) Performance
Computing
Automotive
CPU
Voice
3D Stacking
(BSI, Monolithic MEMS)
Camera IoT
Video
MP3
Storage
SYSTEM
SoC 3D Packaging INTEGRATION
(WLP, WoW)
Multiple  System performance/
 Better performance
Chips Bandwidth
 Better Power
on PCB  Application-specific
 Better form factor
 Process technology +
Design Solutions
2D 3D
© 2017 ISPD
ISPD 2017 2
Application-Optimized Design Platforms
Area, Performance and Power
Power
Speed &
Memory Bandwidth
High-
Performance
Functional Safety Computing
Standard Compliance
Mobile
Ultra-Low Power
Automotive
IoT
Speed
© 2017 ISPD
ISPD 2017 3
Semiconductor Trend
 Area: process primary dimension scaling slows down
 Very challenging to continue Moore’s law economically
 Process and design co-optimization to provide enough area
scaling
 Performance: dimension scaling grows metal and via
resistance exponentially
 Fully automatic and smart via pillar design flow to reduce high
resistance impact
 Power: ultra low voltage for ultra low power
 Robust design and variation modeling at low voltages
 Heterogeneous integration
 Low cost and high system performance 3D packaging
 High physical design complexity
 Machine learning to be applied to physical design
© 2017 ISPD
ISPD 2017 4
Area Scaling
 Process primary dimension (metal, gate and
fin pitch) scales less and per generation
cadence takes longer
 Process-design co-optimization innovation to
keep area scaling
 Fin depopulation to increase cell density
 Power plan and cell layout co-optimization to
increase cell placement utilization
 EUV to increase routing density
© 2017 ISPD
ISPD 2017 5
Fin Depopulation can Increase Density
and Reduce Process Dimension Scaling Pressure
4-fin cell 3-fin cell 2-fin cell
© 2017 ISPD
ISPD 2017 6
Fin Depopulation can Increase
Speed and Power Efficiency
 Higher fins have highest speed
 Fewer fins have highest speed@same power and
lowest power@same speed
© 2017 ISPD
ISPD 2017 7
Fin Depopulation Impact on
Logic Density and Cell Utilization
Logic
Density
3.0
2.0
Cell
Utilization
1.0 80%
75%
70%
16nm 10nm 7nm
3~4 fins 3 fins 2 fins
© 2017 ISPD
ISPD 2017 8
Logic Density Improvement (I)
Power Plan Optimization
 To improve power plan IR, PG via count is increasing over
technology generations
Shrink PG Pitch to
increase via count
 However, shrinking PG pitch impacts routing resources.

So different and new PG design approaches evolved
Allows better cell

placement freedom
© 2017 ISPD
ISPD 2017 9
Logic Density Improvement (II)
Power Plan and Cell Layout Co-optimization
Uniform M1 Dual M1
Top Down
PG design has to be friendly to

cell architecture (Uniform  Dual)
PG and Cell Optimized
Co-Design Cell re-Design
Big cell design has to be

Bottom Up redesigned to be compatible with
Dual PG architecture
© 2017 ISPD
ISPD 2017 10
Logic Density Improvement (III)
Power Plan and Cell Layout Push to Limit
Power strap:
No stagger pin: 5 access points
Less cells placed under PG
Unused
space
Power stub:
More cells placed under PG Stagger pin: 6 access points
© 2017 ISPD
ISPD 2017 11
Logic Density Improvement (IV)
EUV to Increase Routing Density
NXE3300 EUV
Hole type DSA, Line/Space DSA,
Single 16 nm half pitch 12 nm half pitch
Patterning
Inverse Multiple EUV Directed Self-

Litho. patterning Assembly
© 2017 ISPD
ISPD 2017 12
EUV Metal
Metal : Poly pitch = 2 : 3
 More Metal resources for logic density improvement
 2 library sets are needed for two metal offset.
Metal:Poly = 1:1 Metal:Poly = 2:3
© 2017 ISPD
ISPD 2017 13
Less Coupling from Metal : Poly = 2 : 3
ND2D1 ND2D1
M1:PO pitch = 1:1 M1:PO pitch = 2:3
© 2017 ISPD
ISPD 2017 14
More Routing from Metal : Poly = 2 : 3
M1:Poly pitch = 1:1 M1:Poly pitch = 2:3
© 2017 ISPD
ISPD 2017 15
Enhanced Logic Density and
Cell Utilization Trends
Logic
Density
3.0
2.0 Optimized
power plan, Cell
pin access, Utilization
1.0 EUV routing 80%
75%
70%
N16 N10 N7
11ML 12ML 13ML
© 2017 ISPD
ISPD 2017 16
Performance Scaling
 Process dimension scaling grows metal and
via resistance exponentially
 Fully automatic and smart EDA design flow is
needed to effectively solve high resistance
issue for 7nm and beyond
© 2017 ISPD
ISPD 2017 17
Cross Node Metal RC Scaling
75 2
Mx Resistance (vs. 40nm)
Mx R
1.8
60
Mx Cap (vs. 40nm)

increase ~3X 1.6
45
1.4
30
increase ~3X
1.2
Mx C
15
1
0 0.8
40nm 28nm 16nm 7nm 5nm
© 2017 ISPD
ISPD 2017 18
Cross Node Wire R and Via Impact
50%
BEOL R delay vs. Total delay
40%
30%
20%
10%
0%
© 2017 ISPD
ISPD 2017 19
VIA Pillar
M6
Single stacking via Via pillar

M6 M6
M5 M5
M4 M4
M3 M3
M2 M2
• Large drivers + upper thick metal + via pillar to reduce

transistor, metal and via resistance
• VIA-Pillar enablement is complex and requires EDA
enablement across all Implementation phases
© 2017 ISPD
ISPD 2017 20
Via Pillar Insertion
 Implies layer promotion for better wire resistance
 Effectively reduces source Via resistance
Connection through DPT layer: DPT wire resistance dominates
Metal R Source Via R
Std. Wire
cell (20μm) 55.43X 1X
Layer promotion to non-DPT layer: Source Via resistance increases

13.23X 3.29X
Via pillar: Best combination of metal and source Via resistance

13.23X 1X
Std. cell Double Patterning Non-Double Patterning

Technology (DTP) Layer Technology (DTP) Layer
© 2017 ISPD
ISPD 2017 21
Cross Node Wire R and Via Impact
50%
BEOL R delay vs. Total delay
40%
30%
20%
with Via Pillar

10%
0%
© 2017 ISPD
ISPD 2017 22
Via Pillar Type and Usage
• EM via pillar (EM VP), used to guarantee cell-level EM,
has to be 100% inserted
– One cell master will only have 1 EM VP
• Performance via pillar (Performance VP), which is larger
than EM VP, can reduce more resistance
– One cell master can have multiple performance VP
VP Association I
Cell Performance Performance Performance EM Default
Master VP1 VP2 VP.. VP VP
VP Association II
Cell EM Performance Performance Performance
Master VP VP1 VP2 VP..
© 2017 ISPD
ISPD 2017 23
Via Pillar Automatic Design Flow
•VIA pillar (VP) definition, association
Design Kits
Floorplan
• VP aware placement
Placement • Automatic NDR
• VP insertion and verification

CTS • CTS with VP
• VP optimization
Routing/Opt.
• DEF out option
© 2017 ISPD
ISPD 2017 24
Voltage and Power Scaling
 Low voltage design reduces power effectively
 Low voltage design challenges
 Functionality robustness
 Accurate variation modeling
© 2017 ISPD
ISPD 2017 25
Ultra-Low-Voltage Standard Cell Solution
Skew Cells/ D0 or Fine-Grained Cells
Internal Half-Sizing Cells
0.9V-
0.6V
Small High-Stack(3 or 4) Cells
Transmission Gates
Multi-Bit Latch/Flop
Proper Stage Ratio to Maintain Internal Slew
0.5V
and Proper P/N Ratio to Improve Noise Margin
below
New Circuit Structure to Enhance Robustness at Low–Vdd
 Improve cell performance by reducing delay

degradation and variation induced by low-Vdd
© 2017 ISPD
ISPD 2017 26
Flop Low Voltage Design Robustness
• High sigma design checks are required to ensure
robust operation at low operating voltage
• For write operation, forward path (red) must be
stronger than feedback path (blue)
QN
Reset
clk_b clk
SI
D Reset
clk clk_b
clk_b
Reset
clk
clk
Reset
clk_b
© 2017 ISPD
ISPD 2017 27
Delay Variation Increases at
Lower VDD
2
Log of scaled delay standard
1.8
Delay variation increases
1.6 exponentially as VDD decreases
1.4
deviation
1.2
0.8
0.6
◄ Non-Gaussian Gaussian ►
0.4 at low voltage at regular voltage
0.2
0
VDD
© 2017 ISPD
ISPD 2017 28
Abnormal Hold-Time Caused by
Low-Vdd Non-Gaussian Behavior
Flop Hold Constraint Behavior

Output
Low VDD
Regular VDD
Hold time
© 2017 ISPD
ISPD 2017 29
Solutions for Low-Vdd Variations (I)
 Methodology improvement
– Timing values are models by both “mean” and
“early/late distribution”
– Implement the new models in timing characterization
and STA tools
Original distribution
(Non-Gaussian with
long right tail)
“Early” distribution
to cover short tail “Late” distribution
to cover long tail
© 2017 ISPD
ISPD 2017 30
Solutions for Low-Vdd Variations (II)
 By using new timing model and advanced statistical
OCV method, we achieve more accurate STA results
compared to Monte-Carlo simulations
Optimistic
Slack
SPICE Monte Carlo

Simple OCV
Advanced Statistical OCV
Path ID
© 2017 ISPD
ISPD 2017 31
Heterogeneous Integration
 Low cost and high system performance
 InFO and CoWoS integration
 Integrated EDA design flow for package and
chips
© 2017 ISPD
ISPD 2017 32
Future Device Possibilities Through
Heterogeneous Technology Integration
TFET RF CMOS Analog
GAA- NWT
eFlash
Source
Gate Drain
III-V FFET Mixed Signal

Si CMOS BSI CIS
3D Low Integrated
Ge FinFET
on Si
Power Specialty
CMOS MEMS
CMOS Technology
FinFET HV
3D-IC
Technology BCD - Power IC
InFO
Cu-TSV
CoWoSTM Vertical Stacking
Si Interposer
© 2017 ISPD
ISPD 2017 33
3D Packaging for Better System
Performance and Form Factor
Analog RF Passive
Logic
Memory
Wafer Bonding
Conventional CoWoS / InFO Vertical

Stacking
Conventional SIP 3D Wafer-based

or MCM System Integration
(Wirebond,
(CoWoS, InFo-PoP, Vertical Stacking, Wafer Bonding)
Flipchip, SMT)
Sources: A-SSCC 2014, IEDM 2014, VLSI 2015

© 2017 ISPD
ISPD 2017 34
Different Systems Require Different
Packaging Solutions
4000
3000
I/O Pin Count
CoWoS
2000
InFO
InFO-M
1000 InFO PoP
WLCSP*
0
0 200 400 600 800 1000 1200 1400
* WLCSP : Wafer Level Chip Scale Package Die/PKG size (mm2)

** InFO : Integrated Fan-Out
*** CoWoS : Chip on Wafer on Substrate
© 2017 ISPD
ISPD 2017 35
Wafer-Level Packaging Technologies:
Integrated Fan-Out (InFO)
4000
Logic
Multi-chips integration
Smallest form-factor InFO
3000 Cost competitive
I/O Pin Count
CoWoS
DRAM 
Logic Through-
2000 InFO–Via
InFO (TIV)
InFO-PoP
InFO-M
1000 InFO PoP Logic Memory
WLCSP* InFO-M (Multi-chip)

0
0 200 400 600 800 1000 1200 1400
Die/PKG size (mm2)
© 2017 ISPD
ISPD 2017 36
Wafer-Level Packaging Technologies:
Chip-on-Wafer-on-Substrate (CoWoS)
4000
Homogeneous
partition
3000
I/O Pin Count
CoWoS
High bandwidth memory
2000 integration for high-performance
InFO computing
InFO-M
1000 InFO PoP
High-performance
heterogeneous partitioning
WLCSP*
0
0 200 400 600 800 1000 1200 1400
Die/PKG size (mm2)
© 2017 ISPD
ISPD 2017 37
Advanced Packaging Co-Design –
SoC + InFO + IPD (Integrated Passive Device)
FC-PoP o
InFO-PoP
Rja (17.2 C/W) Rja (15.4 oC/W)
IPD 1 Thermal
L6 SR F-S pad
PM4 dissipation
RDL3
L5
Insulator5
RDL2
PM3 improved by 12%
Insulator4 PM2
L4 RDL1 250μm
RDL0 PM1
Core Molding
590μm Silicon Compounds
L3
Insulator2
L2
B-S pad
Insulator1
L1
SR
TMV Silicon DRAM 2 Thickness

Molding reduced by 57%
Compounds
DRAM Eye Opening: Eye Opening:

0.785 UI 0.663 UI 3 Eye width
improved by >12%
InFO-PoP W/I IPD
FC-PoP W/O IPD 4 Voltage droop

© 2017 ISPD
ISPD 2017 reduced by 5~10%38
InFO Design Solutions
(InFO Only)
CDL Netlist Auto-Export
InFO Layout
Creation InFO GDS
PKG Layout In-design DRC InFO
PKG Layout
InFO CDL LVS
Netlist
InFO DRC/LVS
RLCK Extraction
< 4GHz RC/RLCK Extraction for STA, IR,
SEM, & PI Analysis
InFO RC
Extraction S-parameter Extraction
4GHz – 10GHz S-parameters Extraction
for SI/PI, & EMI analysis
© 2017 ISPD
ISPD 2017 39
InFO Design Solutions
(Dies + InFO)
Inter-die DRC/LVS
Inter-die
Inter-die DRC Inter-die LVS DRC/LVS
InFO EM/IR Analysis
B1 B2 B3 B4
Dynamic
A1 A2 A3 A4 Static IR
EM/IR IR
Top die Analysis
Analysis Analysis
SI/PI Simulation
SI/PI
Simulation Thermal-aware EM/IR Analysis
SI Analysis PI Analysis
Normal Analysis Thermal-aware

(No Thermal) Analysis
Thermal
Analysis
© 2017 ISPD
ISPD 2017 40
Applying Machine Learning to
Complex Physical Design Problems
 Place and route Machine Learning experiment
 Feature extraction and convolutional neural
network data mapping
 Clock gating and routing congestion speed
improvement
© 2017 ISPD
ISPD 2017 41
Machine Learning-Based
Design Solution
 Eliminate human & fixed-model subjective bias with statistical
important features
Traditional EDA Approach Machine Learning Approach
Placement
Prediction
Placement
Learning
result Routing result
Routing Model Routing
simulation
overheads training overheads
Routing programming Detailed
pattern model routing result
Easy to be biased and Only heuristics Leverage widely used deep

no guaranteed quality are possible learning packages
Traditional (EDA) flow Flow enables by Machine Learning

Pre-route design Congestion map after routing Congestion prediction Congestion map after routing
Only know routing results after Able to predict routing behavior and improve
running routing congestion with new recipes to achieve 40Mhz
© 2017 ISPD speed improvement
ISPD 2017 42
ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
Adopt TSMC ML Platform to predict the design improvement opportunity

RTL Synthesis Place & Route STA Signoff Optimized design
APR TSMC ML Design Enablement Platform

database Script to
Feature Data pre- perform design
Prediction optimization
extractions processing
TSMC’s proven
ML training
models
© 2017 ISPD
ISPD 2017 43
ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
Adopt TSMC ML Platform to train design and build new models

RTL Synthesis Place & Route STA Signoff Optimized design
APR database TSMC ML Design Enablement Platform

for new Script to
Feature Data pre-
learning perform design
extractions processing
optimization
Prediction
Training Model Mgmt.

TSMC’s
models
Customer’s
models
© 2017 ISPD
ISPD 2017 44
Performance Gains Prediction from ML
• Capability to predict and set accurate ARM A72 clock
gating latency to avoid over-design and achieve better
speed from 50 to 150Mhz
• Post-route speed is improved by 40MHz with detour
prediction and early fix
Predicted detours Real post-route detours
© 2017 ISPD Achieved 95% matching on A72

ISPD 2017 45
Conclusion
 Five semiconductor industry trends, issues and
solutions are discussed: area, performance and
power scaling, heterogeneous 3D integration
and Machine Learning
 Physical design and EDA play even more critical
roles to extend Moore’s law and to enable highly
integrated complex 3D SOC chips
© 2017 ISPD
ISPD 2017 46

Thriuuu

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Thriuuu

Uploaded by

Copyright:

Available Formats

Physical Design Challenges and Innovations

4-fin cell 3-fin cell 2-fin cell

 However, shrinking PG pitch impacts routing resources.

Allows better cell

PG design has to be friendly to

Big cell design has to be

Inverse Multiple EUV Directed Self-

Metal:Poly = 1:1 Metal:Poly = 2:3

M1:Poly pitch = 1:1 M1:Poly pitch = 2:3

Mx Cap (vs. 40nm)

Single stacking via Via pillar

• Large drivers + upper thick metal + via pillar to reduce

Layer promotion to non-DPT layer: Source Via resistance increases

Via pillar: Best combination of metal and source Via resistance

Std. cell Double Patterning Non-Double Patterning

with Via Pillar

• VP insertion and verification

 Improve cell performance by reducing delay

Flop Hold Constraint Behavior

SPICE Monte Carlo

III-V FFET Mixed Signal

Conventional CoWoS / InFO Vertical

Conventional SIP 3D Wafer-based

Sources: A-SSCC 2014, IEDM 2014, VLSI 2015

* WLCSP : Wafer Level Chip Scale Package Die/PKG size (mm2)

WLCSP* InFO-M (Multi-chip)

TMV Silicon DRAM 2 Thickness

DRAM Eye Opening: Eye Opening:

InFO-PoP W/I IPD

FC-PoP W/O IPD 4 Voltage droop

Normal Analysis Thermal-aware

Easy to be biased and Only heuristics Leverage widely used deep

Traditional (EDA) flow Flow enables by Machine Learning

Adopt TSMC ML Platform to predict the design improvement opportunity

APR TSMC ML Design Enablement Platform

Adopt TSMC ML Platform to train design and build new models

APR database TSMC ML Design Enablement Platform

Training Model Mgmt.

Predicted detours Real post-route detours

© 2017 ISPD Achieved 95% matching on A72

You might also like