
There is Plenty of Room at the Bottom

Shankar Balachandaran
Dept. of Computer Science and Engineering
Indian Institute of Technology Madras
shankar@cse.iitm.ac.in
Who Said This?

“Why cannot we write the entire 24 volumes of the Encyclopedia Britannica
on the head of a pin?”
Clue: Nobel Laureate Physicist
Richard Feynman
 “There is Plenty of Room at the Bottom” was
a talk delivered to the American Physical Society
at the California Institute of Technology
 In 1959
 Excellent vision
 Predicted nanotechnology just by analyzing what
is theoretically possible
 Build what we want with molecular precision
Today’s Talk
 There is room in VLSI research
 Plenty of it
 How did we reach here?
 Open research problems in VLSI
 Employment Opportunities
How Chips Have Shrunk

 1946 in UPenn
 Measured in cubic ft.
ENIAC on a Chip

 1997  7.44 mm x 5.29 mm


 174,569 Transistors  0.5  technology
Integrated Circuit Revolution

1958: First integrated circuit (germanium)
Built by Jack Kilby at Texas Instruments
Contained five components: transistors, resistors and capacitors

2000: Intel Pentium 4 Processor
Clock speed: 1.5 GHz
# Transistors: 42 million
Technology: 0.18μm CMOS
Costs Over Time
Evolution in IC Complexity
If Transistors are Counted as Seconds
4004 < 1 hr
8080 < 2 hrs
8086 8 hrs
80286 1.5 days
80386 DX 3 days
80486 13 days
Pentium > 1 month
Pentium Pro 2 months
P II 3 months
P III ~1 year
P4 ~ 1.5 years
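The conversion behind this table can be sketched in a few lines: treat each transistor as one second and express the total in a convenient unit. The transistor counts below are approximate, commonly quoted figures and are an assumption of this sketch; they do not appear on the slide.

```python
# Approximate, commonly quoted transistor counts (assumed, not from the slide).
TRANSISTORS = {
    "4004": 2_300,
    "8086": 29_000,
    "80286": 134_000,
    "80486": 1_200_000,
    "P4": 42_000_000,
}

# Units from largest to smallest, each with its length in seconds.
UNITS = [
    ("years", 365 * 24 * 3600),
    ("days", 24 * 3600),
    ("hours", 3600),
    ("minutes", 60),
]


def as_duration(seconds: int) -> str:
    """Express a count of seconds in the largest unit that fits."""
    for name, size in UNITS:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds} seconds"


for chip, count in TRANSISTORS.items():
    print(f"{chip:>6}: {as_duration(count)}")
```

With these assumed counts the results land close to the slide's figures, for example roughly 8 hours for the 8086.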
Comparison of Sizes
How Small Are The Transistors?

[Figure: transistor feature size in microns (2.0 down to 0.03) versus year (1983 to 2007), moving through the micron, sub-micron, deep sub-micron, ultra deep sub-micron and nano regimes, with the 80286, 80386, 486, Pentium, Pentium II, Pentium IV and Itanium marked.]

• Compare that to the diameter of a human hair: 56 μm


Moore’s Law
 Transistors double almost every 2.3 years
 Gordon Moore of Intel
 Visionary prediction
 Observed in practice for more than 3 decades
 Implication
 More functionality
 More complexity
 Cost ??
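As a rough illustration of what a fixed doubling period implies, the sketch below projects a transistor count forward using the 2.3-year figure quoted above. The function name and the starting values in the usage line are assumptions made for illustration only.

```python
def projected_transistors(start_count: float, start_year: float,
                          year: float, doubling_years: float = 2.3) -> float:
    """Project a transistor count assuming one doubling every `doubling_years`."""
    doublings = (year - start_year) / doubling_years
    return start_count * 2 ** doublings


# Hypothetical starting point: 1 million transistors in 1990.
for year in (1990, 1995, 2000, 2005):
    print(year, round(projected_transistors(1_000_000, 1990, year)))
```

Three decades at this rate is about 13 doublings, a factor of roughly 8000, which is why functionality and complexity grow so quickly.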
Processor Frequency Trends

Frequency doubles each generation


Processor Power Trends
Power Density Increase
[Figure: power density (W/cm2, log scale from 1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium and P6, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun's surface.]
Productivity Gap
“How many gates
can I get for $N?”
Complexity
How to Handle Complexity?
VLSI Design Flow

Specifications
X = AB;
Y = CD;
Z = X + Y;
Design Automation
 Tools are used at every step
 Manual intervention is still required
 Tools do not scale up very well
 Many problems are NP-Complete
 Theory vs Practice
Before Tools
 Faggin laid out 4004 by hand
 Drawn on paper and photographed
 Demagnified 500 times
 Seymour Cray relied on humans to build
supercomputers
 Women used to cut wires by hand
 Delicate handwork was necessary
 Almost no verification or validation
 Chips may not function properly
 Market may return products
Design Goals Over Time

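[Figure: design goals by decade (1970's, 1980's, 1990's, 2000's), moving from area, through speed/area and speed + power, to speed/power/reliability, with a parallel track from power through low power to reliable ultra-low power.]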


Issues in UDSM
 Catastrophic Yield
 Number of dies that are good in a wafer
 Good = Functionally Right + Right Frequency
 Signal Integrity
 Crosstalk
 ElectroMigration
 ….
 Manufacturability
 Design rules put wires apart
 Dimensions getting close to sub-wavelength
 Lithography is a serious issue
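The yield definition above (good dies per wafer, where good means functionally right and at the right frequency) can be written down directly. This is a minimal sketch; the `Die` record and the numbers in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Die:
    functional: bool      # passes functional test
    max_freq_mhz: float   # highest frequency at which it still works


def wafer_yield(dies, target_freq_mhz: float) -> float:
    """Fraction of dies that are functionally right AND meet the target frequency."""
    if not dies:
        raise ValueError("wafer has no dies")
    good = sum(1 for d in dies if d.functional and d.max_freq_mhz >= target_freq_mhz)
    return good / len(dies)


# Hypothetical wafer of four dies.
wafer = [Die(True, 1600), Die(True, 1400), Die(False, 1700), Die(True, 1500)]
print(wafer_yield(wafer, target_freq_mhz=1500))
```

Note that a die can work correctly and still count against yield if it misses the frequency target.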
Issues in UDSM Design
 Reliability
 Designs that fail in the field
 Electromigration on power supplies
 Wire self heat affects clocks and signals
 Power
 Leakage becomes significant
 Dynamic Power is still an issue
Are These New?
 Some are, some are not
 Old ones
 Signal Integrity was a problem in board design
 CDC 6600 had good models for taking SI into account
 Dynamic power has been addressed in digital
watches
 Very low power – change battery once in 3 to 4 years
 But very high frequencies now
 Leakage was not a big problem then
Why Now Then?
 Finer geometries
 Greater wire and via resistance
 More metal layers
 9 or more
 Cross coupling is high
 Lower Supply Voltages
 More current for given power
 Lower threshold voltages
 Less noise margins
 Interconnect issues
 Dominates gate delays
 Susceptible to patterning difficulties
 Susceptible to defects
International Technology Roadmap for Semiconductors
 Worldwide consensus on semiconductor
growth
 Predicts main trends 15 years into the future
 Participants include USA, Europe, Korea, Japan
and Taiwan
 Chipmakers, Universities and Research Institutes,
Manufacturers, Materials Suppliers
 Provides a reference for requirements, potential
solutions and their timing
Purpose
 Predictions
 Very important for companies that develop
products, tools and materials
 Guidelines
 Where to put your
 Next few years of research
 Money
 Side Effect
 What skill set must I acquire to be relevant?
Grand Challenges
 Enhancing Performance
 Along with power dissipation in high end
applications
 Leakage power in low power applications
 Cost effective manufacturing
 Reduce manufacturing costs
 Make reliable chips => Increase Yield
Major Roadblocks Predicted

Wire Delay, Variability, Power, Soft Errors


Designer’s Dilemma
[Figure: competing design objectives arranged around a central "?"]
Speed: state-of-the-art microprocessors, communications, etc.
Yield: manufacturability
Power: portable applications (PCS, wireless, etc.)
Area: cost of the die
DAC Panel – Where Should the
R&D Money be Spent?
[Figure: stacked bars (0% to 100%) showing how each panelist (Intel, IBM, Synopsys, TUE, Cadence, STMicro, Magma) would split R&D money across: Variability/Litho/Mask/Fab, Low Power/Leakage, Power Delivery/Integrity, Tool/Flow Enhancements/OA, IP Reuse/Abstraction/SysLevel Design, DSM Analysis, P&R and Opt, Others (Lotto).]
Required Advance in Design System Architecture
[Figure: design system architecture at four technology points: Yesterday (1000nm), Today (180nm), Today (130nm) and Tomorrow (50nm). The flow evolves from separate file-based steps (logic design, functional verification, synthesis + timing analysis, place/wire, testability verification, masks) toward a shared data model and repository, with a "cockpit" and "auto-pilot" driving optimize/analyze loops.]
 Multiple design files are converged into one efficient Data Model
 Disk accesses are eliminated in critical methodology loops
 Verification of Function, Performance, Testability and other design
criteria all move to earlier, higher levels of abstraction followed by
equivalence checking and assertion driven design optimizations
 Industry Standard interfaces for data access and control
 Incremental modular tools for optimization and analysis
Advances in Software
 Compiler can control sequence of instructions
to mitigate power
 Reorder instructions so that the number of bit
changes between successive instructions is
minimized
 Aggressive optimization to prevent Cache
miss
 Accurate branch prediction to avoid pipeline
flush and refill
 Garbage Collection – Efficient turning off
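The instruction-reordering idea above can be sketched as follows: count the bit changes (Hamming distance) between successive instruction encodings, then pick the ordering that minimizes the total. This sketch assumes the instructions are mutually independent, so any order is legal; a real compiler must also respect data dependencies. The 8-bit encodings are made up for illustration.

```python
from itertools import permutations


def bit_changes(a: int, b: int) -> int:
    """Number of bit positions that differ between two encodings."""
    return bin(a ^ b).count("1")


def total_switching(seq) -> int:
    """Sum of bit changes between each pair of successive instructions."""
    return sum(bit_changes(x, y) for x, y in zip(seq, seq[1:]))


def best_order(encodings):
    """Exhaustive search; only practical for a handful of instructions."""
    return min(permutations(encodings), key=total_switching)


# Hypothetical 8-bit instruction encodings, assumed independent of each other.
program = [0b00001111, 0b11110000, 0b00001110, 0b11110001]
print(total_switching(program))              # 23
print(total_switching(best_order(program)))  # 9
```

Exhaustive search is used only to keep the sketch short; for longer sequences a heuristic ordering would be needed.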
Loop Level Power Control
 Instruct hardware to turn itself off when not
needed
[Figure: control-flow diagram of a loop header followed by Loop Body-I and Loop Body-II, with "Turn off" points marked where hardware is not needed.]
Design Automation Problems
 Interconnect delay dominates system
performance
 Consumes 70% of clock cycle
 Multiple clock cycles required to cross chip
 whether 3 or 15 not as important as fact of
“multiple” > 1
 Correct by construction methodology
 Avoid iterative Logic Synthesis, Placement and
Routing loop
 Prevention is better than cure
DA – Complexity Issues
 Silicon Complexity + Design Complexity
 Design convergence – Abstract what’s beneath
 Prevent instead of analyze + verify
 Many issues become first class citizens
 Unify
 Keep the database and models unified
 Tight integration of synthesis with layout issues
 Cost issue
 Reuse of IP blocks
Design Convergence
 What must converge?
 Timing, Logic…………..
 Provide predictable back-end
 Correct by construction – “assume” then “enforce”
 Constraints and assumptions are sent downstream
 Not much goes upstream
 Construct by Correction
 Concurrently Optimize Logic + Layout
 Elimination of concerns
 Reduce degrees of freedom
 Partition the design into globally asynchronous and locally
synchronous modules
Interconnect Complexities
 Blocks cannot grow in size
 Designing them becomes difficult
 Small blocks mean more wires

[Figure: normalized occurrence rate versus wirelength / die size, showing separate populations of local wires and global wires, with ~0.5 marked on the horizontal axis.]
Placement Needs
 More hierarchical than flat
 Support placement of partial designs
 Incremental model
 Construct by correct model
 Characterizable for synthesis
Placement + Synthesis ????

buffering
resizing

cloning
Router Needs
 Hierarchical
 Scalable
 Should not break down for large number of
nets/large area of routing
 Incremental, controllable, well-characterized
 Detunable (e.g., coarse/quick routing), ...
Also……..
 Degrees of freedom
 Wire widths/spacings, shielding/interleaving,
driver/repeater sizing
 Router empowered to perform small logic resyntheses
 Change in search mechanisms
 Iterative ripup/reroute replaced by “atomic topology
synthesis utilities”
 Construct entire topologies to satisfy constraints in arbitrary
contexts
Combinatorics
 Millions of cells
 Millions of moves
 Orientation
 Pin position
 Multiple orthogonal objectives
 Divide and Conquer
 Divide the Problem into Smaller Sub-Problems
 Solve Each of these Separately
 Stitch together the Solutions of the Sub-Problems

10 Million Gate Design => 200 (50k Gate Designs)
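The divide, solve and stitch steps can be sketched as below. Blocks here are formed by simple fixed-size chunking of gate indices, which is an assumption made to keep the example short; real partitioners choose blocks to minimize the wires cut between them. `solve_block` is a hypothetical stand-in for whatever placement or routing step runs on each block.

```python
def split_into_blocks(num_gates: int, block_size: int):
    """Divide: cut the gate index range into blocks of at most block_size."""
    return [range(start, min(start + block_size, num_gates))
            for start in range(0, num_gates, block_size)]


def solve_block(block):
    """Solve: hypothetical placeholder for per-block optimization."""
    return {"gates": len(block)}


def stitch(solutions):
    """Stitch: combine per-block results into one summary."""
    return {"blocks": len(solutions),
            "gates": sum(s["gates"] for s in solutions)}


def divide_and_conquer(num_gates: int, block_size: int):
    blocks = split_into_blocks(num_gates, block_size)
    return stitch([solve_block(b) for b in blocks])


print(divide_and_conquer(10_000_000, 50_000))  # {'blocks': 200, 'gates': 10000000}
```

The hard part in practice is the stitching step, since decisions made inside one block affect its neighbors.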


Low Power Design
Year                    2002  2005  2008  2011  2014
Power supply Vdd (V)    1.5   1.2   0.9   0.7   0.6
Threshold VT (V)        0.4   0.4   0.35  0.3   0.25

 Standby Power: Drain leakage will increase as VT decreases
to maintain noise margins and meet frequency demands,
leading to excessive battery-draining standby power
consumption.
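To get a feel for how fast leakage grows as VT drops, the sketch below uses the standard exponential subthreshold model, in which leakage rises by a factor of 10 for every "subthreshold swing" of VT reduction. The swing of 100 mV/decade is an assumed typical value, not a number from the slide; the VT values come from the table above.

```python
def relative_leakage(vt: float, vt_ref: float,
                     swing_mv_per_decade: float = 100.0) -> float:
    """Leakage at threshold `vt` relative to leakage at `vt_ref`.

    Assumes subthreshold leakage is proportional to 10 ** (-VT / S),
    where S is the subthreshold swing (assumed 100 mV/decade here).
    """
    delta_mv = (vt_ref - vt) * 1000.0
    return 10 ** (delta_mv / swing_mv_per_decade)


# VT roadmap from the table, relative to the 2002 value of 0.4 V.
for year, vt in [(2002, 0.4), (2005, 0.4), (2008, 0.35), (2011, 0.3), (2014, 0.25)]:
    print(year, f"{relative_leakage(vt, vt_ref=0.4):.1f}x")
```

Under this assumption, the 0.15 V drop in VT across the table corresponds to leakage growing by more than an order of magnitude per transistor.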
[Figure: "…and phones leaky!" Standby power as a percentage of total power (0% to 50%) for 2000, 2002, 2004, 2006 and 2008, with data labels 12W, 88W, 400W, 1.7KW and 8KW.]
Power vs Energy
Power is height of curve: lower power design could simply be slower
Energy is area under curve: two approaches require the same energy
[Figure: two plots of Watts versus time, each comparing Approach 1 and Approach 2.]
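The distinction between power and energy can be checked numerically. In this sketch each approach is described as a list of (watts, seconds) segments; the specific numbers are hypothetical and chosen so that the slower approach draws half the power for twice as long.

```python
def peak_power(profile) -> float:
    """Height of the curve: the largest power drawn at any time."""
    return max(watts for watts, _ in profile)


def energy(profile) -> float:
    """Area under the curve: sum of power x duration over all segments."""
    return sum(watts * seconds for watts, seconds in profile)


# Hypothetical profiles: (watts, seconds) segments.
approach_1 = [(2.0, 5.0)]    # fast, high power
approach_2 = [(1.0, 10.0)]   # slow, low power

print(peak_power(approach_1), energy(approach_1))  # 2.0 10.0
print(peak_power(approach_2), energy(approach_2))  # 1.0 10.0
```

Halving the power did nothing for battery life here, because the task takes twice as long.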
Design Space
          Constant Throughput/Latency                        Variable Throughput/Latency
Energy    Design Time                  Non-active Modules    Run Time
Active    Logic Design, Reduced Vdd,   Clock Gating          DFS, DVS (Dynamic Freq,
          Sizing, Multi-Vdd                                  Voltage Scaling)
Leakage   + Multi-VT                   Sleep Transistors,    + Variable VT
                                       Multi-Vdd,
                                       Variable VT
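Why the DFS and DVS entries matter can be seen from the standard CMOS dynamic power relation, P = a * C * Vdd^2 * f. The relation is textbook material rather than something stated on the slide, and the capacitance, activity and frequency values below are placeholders.

```python
def dynamic_power(c_eff: float, vdd: float, freq_hz: float,
                  activity: float = 1.0) -> float:
    """Standard CMOS dynamic power: activity * C * Vdd^2 * f (in watts)."""
    return activity * c_eff * vdd ** 2 * freq_hz


# Placeholder values: 1 nF effective switched capacitance, 1 GHz clock.
nominal = dynamic_power(1e-9, vdd=1.5, freq_hz=1e9)
dfs_only = dynamic_power(1e-9, vdd=1.5, freq_hz=0.5e9)     # halve frequency
dfs_and_dvs = dynamic_power(1e-9, vdd=1.2, freq_hz=0.5e9)  # also lower Vdd

print(dfs_only / nominal)      # 0.5
print(dfs_and_dvs / nominal)   # about 0.32
```

Frequency scaling alone lowers power linearly, while lowering Vdd along with it adds a quadratic saving, which is what also reduces the energy per operation.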
Leakage Control

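[Figure: four leakage control techniques: MTCMOS (low-Vt logic between high-Vt devices on the Vdd and ground sides), Dual Threshold (high-Vt and low-Vt logic), State Assignment (a chosen input state such as 0 1 1 0 1 0), and Variable Vt (substrate or SOI back gate Vt control).]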
Dynamic Thermal Management

Trigger Mechanism: When do we enable DTM techniques?
Initiation Mechanism: How do we enable technique?
Response Mechanism: What technique do we enable?
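One way to picture how the three mechanisms fit together is a small control step. Everything concrete here is an assumption for illustration: the trigger is a temperature threshold, the initiation is a polled function call (it could equally be an interrupt), and the response is clock throttling.

```python
from dataclasses import dataclass


@dataclass
class Core:
    nominal_mhz: float
    freq_mhz: float


def trigger(temp_c: float, limit_c: float) -> bool:
    """Trigger mechanism (when): assumed here to be a temperature threshold."""
    return temp_c >= limit_c


def respond(core: Core, throttle: float = 0.5) -> None:
    """Response mechanism (what): assumed here to be clock throttling."""
    core.freq_mhz = core.nominal_mhz * throttle


def dtm_step(core: Core, temp_c: float, limit_c: float = 85.0) -> float:
    """Initiation mechanism (how): assumed here to be a periodic polled call."""
    if trigger(temp_c, limit_c):
        respond(core)
    else:
        core.freq_mhz = core.nominal_mhz  # restore full speed when cool
    return core.freq_mhz


core = Core(nominal_mhz=1500.0, freq_mhz=1500.0)
print(dtm_step(core, temp_c=90.0))  # 750.0
print(dtm_step(core, temp_c=70.0))  # 1500.0
```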
Crosstalk Induced Errors
 Transition on an adjoining signal causes
unintended logic transition
 Symptom: chip fails (repeatably) on certain
logic operations

[Figure: victim net modeled with drive R, wire R, grounded C and coupling C, with the coupled noise compared against the input noise tolerance of the receiving gate.]


Timing Dependence
 Timing dependence on crosstalk
 timing depends on behavior of adjoining signals
 symptom: timing predictions inaccurate
compared to silicon (effect can be large: 3:1 on
individual nets)
[Figure: a net modeled with wire R, grounded C and coupling C to other logic net(s). Delay at the marked points depends on the behavior of the other nets.]
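The dependence of delay on neighboring nets can be approximated with a lumped RC model in which the coupling capacitance is scaled by a Miller factor: 0 when the neighbor switches in the same direction, 1 when it is quiet, and 2 when it switches in the opposite direction. This is a common first-order model, not necessarily the one behind the slide, and the component values are placeholders.

```python
def net_delay(r_drive: float, r_wire: float, c_ground: float,
              c_couple: float, miller: float) -> float:
    """50% delay of a lumped RC net: 0.69 * R * C_effective.

    miller = 0: neighbor switches the same way (coupling C is invisible)
    miller = 1: neighbor is quiet
    miller = 2: neighbor switches the opposite way (coupling C is doubled)
    """
    c_eff = c_ground + miller * c_couple
    return 0.69 * (r_drive + r_wire) * c_eff


# Placeholder values: coupling C equal to grounded C.
args = dict(r_drive=1000.0, r_wire=200.0, c_ground=50e-15, c_couple=50e-15)
best = net_delay(**args, miller=0)
worst = net_delay(**args, miller=2)
print(worst / best)  # about 3
```

In this simple model, a net whose coupling capacitance equals its grounded capacitance shows a 3:1 spread between worst and best case.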


Delay Uncertainty
[Figure: Relative Delay vs. Relative Risetime for different coupling percentages. Vertical axis: delay relative to the no-crosstalk case (0 to 5). Horizontal axis: risetime of interferer / risetime of victim (0 to 6). Curves: 10%, 50% and 100% coupling, each for fast and slow cases.]
Electrical Optimization
Consider All Issues in SP&R
 Physical … proximity of the signal
 Temporal … does the noise event occur within the timing
window
 Critical … is the path important
 Electrical … driver strength vs pin cap
 Congestion Estimation
 More congested regions will have more crosstalk
IR Drop
• Voltage drop in supply lines from currents drawn
by cells
• Symptom: chip malfunctions on certain vectors
• Biggest problem: what's the worst-case vector?
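For a single supply rail fed from one end, the voltage seen by each cell follows from Ohm's law: every rail segment carries the current of all cells downstream of it. The sketch below takes the cell currents as given, whereas in practice they depend on the input vector, which is exactly the worst-case problem noted above. All values are placeholders.

```python
def rail_voltages(vdd: float, segment_res, tap_currents):
    """Voltage at each cell tap along a supply rail fed from one end.

    segment_res[i] is the resistance of the rail segment leading to tap i;
    tap_currents[i] is the current drawn by the cell at tap i.
    """
    if len(segment_res) != len(tap_currents):
        raise ValueError("need one rail segment per tap")
    voltages = []
    v = vdd
    downstream = sum(tap_currents)
    for r, i in zip(segment_res, tap_currents):
        v -= downstream * r   # this segment carries all remaining current
        voltages.append(v)
        downstream -= i       # current drawn at this tap goes no further
    return voltages


# Placeholder rail: three cells, 0.1 ohm per segment, 0.2 A per cell.
print(rail_voltages(1.5, [0.1, 0.1, 0.1], [0.2, 0.2, 0.2]))
```

The cell farthest from the supply sees the lowest voltage, here roughly 1.38 V instead of 1.5 V.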
Electromigration
 Power supply lines fail due to excessive
current
 Symptom: chip eventually fails in the field
when wire breaks
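A common way to reason about electromigration lifetime is Black's equation, MTTF = A * J^(-n) * exp(Ea / kT). The slides do not name a model, so this is offered only as an illustrative sketch; the exponent n = 2 and activation energy of 0.7 eV are assumed typical values. Working with a ratio avoids needing the process constant A.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5


def relative_mttf(j: float, temp_k: float, j_ref: float, temp_ref_k: float,
                  n: float = 2.0, ea_ev: float = 0.7) -> float:
    """Lifetime relative to a reference condition, using Black's equation.

    MTTF is proportional to J**(-n) * exp(Ea / (k * T)); n and Ea are
    assumed values, not taken from the slides.
    """
    current_term = (j_ref / j) ** n
    thermal_term = math.exp(
        ea_ev / BOLTZMANN_EV_PER_K * (1.0 / temp_k - 1.0 / temp_ref_k)
    )
    return current_term * thermal_term


# Doubling current density at the same temperature.
print(relative_mttf(j=2.0, temp_k=350.0, j_ref=1.0, temp_ref_k=350.0))
# Same current density, 25 K hotter.
print(relative_mttf(j=1.0, temp_k=375.0, j_ref=1.0, temp_ref_k=350.0))
```

Under these assumptions, doubling the current density in a supply line cuts its expected lifetime to a quarter, and higher temperature shortens it further.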
Cause for EM

hillock
voids
Layout Uncertainties
[Figure: layout at the 0.25µ, 0.18µ, 0.13µ, 90-nm and 65-nm technology nodes.]


Data Volume Explosion
[Figure: MEBES Data Volume vs. Technology Node. MEBES data volume (GB, 0 to 350) at the 180nm, 130nm, 90nm and 70nm nodes, shown with the number of design rules per process node.]
Photolithographic Process
Typical operations in a single photolithographic cycle (from [Fullman]):
oxidation, photoresist coating, stepper exposure (through the optical mask), photoresist development, acid etch, spin/rinse/dry, process step, photoresist removal (ashing).
Process Variation Taxonomy
 Spatial scale:
 Die-to-Die or Inter-Die. E.g. focus, etch
 Within-Die or Intra-Die. E.g. lens aberration, diffraction effects
 Nature:
 Random. E.g. batch-to-batch material variation
 Systematic. E.g. diffraction-based proximity effects
 Systematic but difficult-to-model variations are treated as random
Towards 90nm

300mm wafers In a 300mm fab…

UMC Taiwan
Two New Paradigms
 Design for Manufacturability
 Make manufacturer’s life easier
 Correct by construction
 Close interaction with the fabs
 Design for Yield
 Consider various yield issues during design
 Cells with high yield statistics are better for library
 Strict design practices
How Can You Help?
 Broad research area
 Varied backgrounds required
 Computer Science
 Algorithms, Databases, Graphics, Visualization
 Electrical Engineering
 Circuit designers – Digital, Analog, Mixed, RF
 Material Science
 Physics
 Mathematics
 Think Big
 By thinking about small things
Skill Sets Required
 Front end design
 C, C++, Perl
 VHDL or Verilog
 Good Understanding of Digital Circuit
fundamentals
 CAD Tool Designer
 Good programming skills
 Optimization Theory
 Algorithms, Data Structures
Skill Sets Required (contd.)
 ASIC Designers
 VLSI tools
 Digital Circuits
 Design Styles – ASIC vs FPGA vs …
 Circuit Designers
 Spice, Layout
 CMOS, Low power design
 Everyone
 Applications – DSP, Wireless Communication, Data
Communication
 Computer Architecture, Digital Circuits
 Publications in reputed conferences and journals
Potential Employers
 Front end designers and ASIC designers
 Endless List
 Architects
 Intel, AMD, ARM, Xilinx, Altera ………
 CAD Tool Designers
 Synopsys, Cadence, Magma, Mentor Graphics, Synplicity
 Circuit Designers
 TI, Broadcom, Motorola …………
 Material Science
 AMAT, STM, TI …………..
 Academia
For Teachers
 Many skills required
 Not enough time
 Develop courses around these skill sets
 Take summer breaks to work for companies
 Update your knowledge base
 Homework is the best cure
Famous Quotes
 IBM founder T. J. Watson in 1945
 “I don’t think there will ever be a market for more
than 5 computers in this world.”
 Ken Olsen, president of Digital Equipment
Corp., 1977
 “There is no reason anyone would want a
computer in their home.”
There is Plenty of
Room at the Bottom
Questions?
References
 Course Notes of EE6325 CMOS VLSI Design
at UTD
 ITRS 2003
 ICCAD 2000 Tutorial on Current Issues in
CAD
 MUSIC 2005
 Jan Rabaey’s Course Notes
 Other web resources for the historical
aspects of VLSI
Thank You