So use a range of technology scalings
Better chance of covering the correct answer
MAH VLSI Scaling for Architects 4
Technology Scaling
In scaling there are really two issues:
Devices: Can we build smaller devices? What will their performance be?
Wires: Try to avoid the "wet noodle" effect
There is concern about our ability to scale both of these components
Earlier worries: hot electrons, parasitic resistances
Now worries are a little different: oxide tunnel currents, punchthrough, parameter control, parasitic resistances
Transistor Scaling
People are building very short channel devices
Shown are I-V curves for a 15nm L pMOS and a short channel nMOS
The structure is strange (FinFET), but you can make them work
FO4
Measure the delay of an inverter with Cout/Cin = 4
Divide the delay of a circuit by the delay of an FO4 inverter
Get the delay of the circuit measured in FO4 inverters
Metric is pretty stable over process, temperature, and voltage
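As a minimal sketch of the normalization described above, the following Python helper divides a circuit's delay by the FO4 inverter delay measured under the same conditions. The function name and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
def delay_in_fo4(circuit_delay_ps: float, fo4_delay_ps: float) -> float:
    """Express a circuit delay in units of FO4 inverter delays.

    Both delays must be measured at the same process corner,
    temperature, and supply voltage for the ratio to be meaningful.
    """
    if fo4_delay_ps <= 0:
        raise ValueError("FO4 delay must be positive")
    return circuit_delay_ps / fo4_delay_ps


# Hypothetical numbers: an adder measured at 630 ps in a process
# whose FO4 inverter delay is 90 ps.
adder_fo4 = delay_in_fo4(630.0, 90.0)
print(f"adder delay = {adder_fo4:.1f} FO4")
```

Because numerator and denominator track each other across process, temperature, and voltage, the ratio should stay roughly the same even when the absolute delays move.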
Delay Tracking
[Figure: delay tracking across process corners (TT, FF, SS, FS, SF)]
Use an adder as an example
Fastest 64-bit adders run in around 7 FO4 delays
Dynamic adders
Hasn't changed very much recently (HP at ISSCC in 96)
Memory Performance
Can use FO4 delays to look at memory access time too. Delay is log(Size) for an optimized design. Wire delay is important at larger memories, but not a dominant factor. Partition the array, use fewer, thicker wires. Notice access times are 10-20 FO4.
[Figure: memory delay (FO4, 0 to 40) versus size (64Kb to 16Mb). Curves: total delay (no wire res.), decoder delay (no wire res.), output path delay (no wire res.), total delay (with wire res.)]
Circuit Power
Is very much tied to voltage scaling
If the power supply scales with technology (scaling factor α < 1), for a fixed complexity circuit:
Power scales down as α^3 if you run at the same frequency
Power scales down as α^2 if you run it 1/α times faster
Power scaling is a problem because:
Freq has been scaling faster than 1/α
Complexity of machine has been growing
This will continue to be an issue in future chips
Remember, scaling the technology makes a chip lower power!
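The two scaling rates above follow from the standard switching-power relation P = C * V^2 * f. The sketch below is an illustration only: it assumes capacitance and supply voltage both shrink by the same factor `alpha` (a name introduced here), and the value 0.7 is an assumed per-generation shrink, not a figure from the slides.

```python
def dynamic_power(c: float, v: float, f: float) -> float:
    """Switching power P = C * V^2 * f (activity factor folded into C)."""
    return c * v * v * f


def scaled_power_ratio(alpha: float, freq_factor: float) -> float:
    """Power after one generation relative to before.

    Assumes capacitance and supply voltage both scale by alpha
    (alpha < 1) for a fixed-complexity circuit, and that frequency
    is multiplied by freq_factor.
    """
    before = dynamic_power(1.0, 1.0, 1.0)
    after = dynamic_power(alpha, alpha, freq_factor)
    return after / before


alpha = 0.7  # assumed linear shrink per generation
print(scaled_power_ratio(alpha, 1.0))        # same frequency: alpha**3
print(scaled_power_ratio(alpha, 1 / alpha))  # 1/alpha faster: alpha**2
```

Passing a `freq_factor` larger than `1 / alpha` shows why power becomes a problem when frequency outpaces the technology.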
Wire Scaling
More uncertainty than transistor scaling
Many options with complex trade-offs
For each metal layer, need to set H, TT, TB, ε1, ε2, and the conductivity of the metal
[Figure: wire cross-section showing metal width W, spacings SL and SR, dielectric thicknesses TT and TB, and the ILDTop, ILDmiddle, and ILDBot layers]
Wire Capacitance
Capacitance per micron is roughly constant
Simple approx. = fringe (0.07 fF/µm today) + 4 parallel plates
Depends only on the ratio of the parameters
Coupling becomes a large issue with increased H/S ratio
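A rough sketch of the "fringe + 4 parallel plates" approximation follows. The exact form of each plate term, the parameter names, the default relative permittivity of 3.9 (silicon dioxide), and the example geometry are all assumptions made for illustration; only the 0.07 fF/µm fringe figure comes from the slide.

```python
EPS0_FF_PER_UM = 8.854e-3  # vacuum permittivity in fF/um


def wire_cap_per_um(w, h, s_left, s_right, t_top, t_bot,
                    k_horiz=3.9, k_vert=3.9, fringe=0.07):
    """Approximate wire capacitance per micron of length, in fF/um.

    Sum of a constant fringe term and four parallel plates: top,
    bottom, and the two sidewalls to neighboring wires. Dimensions
    are in microns; k_horiz and k_vert are relative permittivities.
    """
    top = k_vert * w / t_top
    bottom = k_vert * w / t_bot
    left = k_horiz * h / s_left
    right = k_horiz * h / s_right
    return fringe + EPS0_FF_PER_UM * (top + bottom + left + right)


# Hypothetical geometry, then the same geometry shrunk by 0.7x.
base = wire_cap_per_um(0.5, 1.0, 0.5, 0.5, 1.0, 1.0)
shrunk = wire_cap_per_um(0.35, 0.7, 0.35, 0.35, 0.7, 0.7)
print(f"{base:.3f} fF/um vs {shrunk:.3f} fF/um")
```

Every plate term is a ratio of two dimensions, so a uniform shrink leaves the result unchanged, while raising H relative to S increases the sidewall (coupling) terms.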
[Figure: wire cross-section with ILDTop, ILDmiddle (metal), and ILDBot layers]
Wire Resistance
Resistance is simpler: R/µm = ρ/(W·H)
Scales up as the technology shrinks
Main reason that wire height has not scaled much
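To illustrate why wire height has been held back, the sketch below compares shrinking both width and height against shrinking width alone. The resistivity, dimensions, and 0.7x shrink are assumed values for illustration, not numbers from the slides.

```python
def wire_res_per_um(rho_ohm_um: float, w_um: float, h_um: float) -> float:
    """Resistance per micron of length: rho / (W * H), in ohm/um."""
    return rho_ohm_um / (w_um * h_um)


RHO = 0.03       # assumed resistivity in ohm*um (about 3 uohm*cm)
W, H = 0.5, 1.0  # hypothetical wire width and height in microns
S = 0.7          # assumed linear shrink per generation

r0 = wire_res_per_um(RHO, W, H)
r_both = wire_res_per_um(RHO, S * W, S * H)  # shrink width and height
r_width = wire_res_per_um(RHO, S * W, H)     # shrink width only

print(f"shrink W and H: {r_both / r0:.2f}x resistance")
print(f"shrink W only:  {r_width / r0:.2f}x resistance")
```

Under these assumptions the resistance per micron roughly doubles when both dimensions shrink, and grows by roughly 1.4x when only the width shrinks.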
[Figure: wire cross-sections (width W1, height H1) with resistance labels 2R and 1.4R]
Wire Layers
Not all wiring layers have the same characteristics
Today there are three types of levels:
M1: Finest pitch, highest resistance, local interconnection in a cell
M2-4?: Normal routing level, most of the wires
M5+: Thick coarse metal, used for global wires
When scaling forces thinner metal, create a new top layer
Noise Issues
Two main noise sources for wires: capacitance coupling and inductive coupling
Capacitance coupling is mostly a nearest-neighbor issue
High aspect ratio wires make this worse
Real push for low-k dielectric between wires
Inductive coupling is much more complex to analyze
Depends on where the return currents flow
[Figure: semi-global wire capacitance, 1 mm long, versus technology Ldrawn (um), 0.25 to 0.035]
[Figure: semi-global wire capacitance, scaled length, versus technology Ldrawn (um), 0.25 to 0.035]
Module Wires
These wires scale fairly well:
[Figure: semi-global wire, scaled length; wire delay / gate delay (0 to 0.5) versus technology (0.25 to 0.035), for aggressive and conservative scaling]
Scaled wire delays stay pretty constant relative to gates
Not a very big change
So what are we missing? Why are people working so hard on developing new tools?
9 modules, 22 exceptions
What can we do? Larger design teams? Longer design times? Better tools that have fewer exceptions per module
THIS is important!
Is this important?
[Figure: plotted against technology generation, 0.25 to 0.035]
Fixed-length wires, relative to gates, worsen by 2x per generation
This is a big problem
Designer Responses
Wire engineering: use wider wires or thicker wires
Making wires wider will improve performance
Resistance = k/W; Capacitance = C0 + b*W
RC = k*C0/W + k*b; b << C0, so increasing W helps quite a bit
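The width trade-off can be checked numerically. This sketch evaluates RC = k*C0/W + k*b for a few widths; the coefficient values are hypothetical and chosen only so that b << C0.

```python
def wire_rc(width: float, k: float, c0: float, b: float) -> float:
    """RC product for a wire of the given width (arbitrary units).

    R = k / W and C = c0 + b * W, so RC = k * c0 / W + k * b.
    """
    resistance = k / width
    capacitance = c0 + b * width
    return resistance * capacitance


# Hypothetical coefficients with b << c0, as the text assumes.
K, C0, B = 1.0, 1.0, 0.05

for w in (1, 2, 4, 8):
    print(f"W = {w}: RC = {wire_rc(w, K, C0, B):.3f}")
```

The first term falls as 1/W while the second stays fixed at k*b, so widening helps a lot at first and then flattens out toward that floor.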
[Figure: delay (ns) for a 1 cm wire on M3 and M5]
Designer Responses
Circuit solution: use repeaters
Break the wire into segments
Delay becomes linear with length
Delay per unit length = k (FO4 · Rw · Cw)^(1/2), with Rw and Cw per unit length; signal velocity is the reciprocal
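The effect of repeaters can be sketched with a deliberately simplified model. It assumes each repeater adds a fixed delay `t_rep`, that each wire segment behaves as a distributed RC line with a 0.38*R*C delay, and that the number of segments can be treated as continuous; it ignores repeater input capacitance and drive resistance. All names and numeric values are hypothetical.

```python
import math

RC_COEFF = 0.38  # 50% delay coefficient of a distributed RC line


def unrepeated_delay(length, r, c):
    """Delay of a bare distributed RC wire; grows as length squared."""
    return RC_COEFF * r * c * length ** 2


def repeated_delay(length, r, c, t_rep):
    """Delay with ideally spaced repeaters (continuous approximation).

    Simplified model: n segments cost n * t_rep for the repeaters plus
    RC_COEFF * r * c * length**2 / n for the wire. Minimizing over n
    gives a delay that is linear in length and proportional to
    sqrt(t_rep * r * c).
    """
    n_opt = length * math.sqrt(RC_COEFF * r * c / t_rep)
    return n_opt * t_rep + RC_COEFF * r * c * length ** 2 / n_opt


# Hypothetical values: r in ohm/mm, c in pF/mm, t_rep in ps.
R, C, T_REP = 100.0, 0.2, 50.0
for mm in (5, 10, 20):
    bare = unrepeated_delay(mm, R, C)
    rep = repeated_delay(mm, R, C, T_REP)
    print(f"{mm} mm: bare {bare:.0f} ps, repeated {rep:.0f} ps")
```

In this model, doubling the length quadruples the bare wire delay but only doubles the repeated wire delay, and the repeated delay per unit length has the same square-root form as the expression in the text.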
[Figure: RC model of a repeated wire segment (Rw, Cw, Cload)]
[Figure: delay versus distance (mm) for M3 and M5 wires, with and without repeaters]
[Figure: M1, M3, and M5 wires under conservative and SIA scaling, versus technology (0.2 to 0.05)]
A Different View
Wires are not as bad as they are painted in the media
If you take a chip and scale it, keeping the same exact design, the ratio of the wire delay to gate delay will change slowly
Currently they both scale by roughly the scaling factor
Wires get a little slower each generation
Not a big issue
Key is that the logical span of the wire remained constant
The problem is if you make the chip more complex
Wires in general need to connect up more gates
Logical span increases
Get slower
Circuit tricks (repeaters) help some, but not enough
Logical chip size
Distance I can go in 1 cycle
New view: a chip looks really big to a wire
How is this going to affect chip design?
Computer Performance
[Figure: SpecInt95 versus year, 1993 to 1999 (log scale)]
How is scaling going to affect computer design?
Use the Hennessy-Patterson formula:
Delay = Number of Instructions * Cycles/Inst * Clock Period
Performance = Clock Frequency / CPI
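The formula can be written out directly. The sketch below is an illustration with hypothetical machine parameters (instruction count, CPI, and clock rates are made up).

```python
def execution_time(instructions: float, cpi: float,
                   clock_period_s: float) -> float:
    """Delay = Number of Instructions * Cycles/Inst * Clock Period."""
    return instructions * cpi * clock_period_s


def relative_performance(clock_hz: float, cpi: float) -> float:
    """Performance for a fixed instruction count: Clock Frequency / CPI."""
    return clock_hz / cpi


# Hypothetical machines running the same 1e9-instruction program.
base = execution_time(1e9, cpi=1.5, clock_period_s=1 / 500e6)
faster_clock = execution_time(1e9, cpi=1.5, clock_period_s=1 / 1000e6)
print(f"speedup from 2x clock at equal CPI: {base / faster_clock:.2f}")
```

Separating the terms this way makes it easy to ask which one a given improvement touches: clock period (technology and circuits) or CPI (architecture and compiler).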
Pipelined: separate hardware for each stage
Super-scalar: multiple port mems, function units
Out-of-order: mega-ports, complex scheduling
Speculation
Each design has more logic to accomplish same task (but faster)
Architecture Scaling
Plot of IPC
Compiler + IPC: 1.5x / generation
Used all known tricks
OOO is an old idea
Uses lots of wires
What next? Wider machines, threads, speculation
Guess answers to create parallelism
[Figure: SpecInt95 / MHz, 0.01 to 0.04]
Need multiple functional units
Need large ported central register files
This will not scale: machines are already starting to use internal clustering
Clock Frequency
[Figure: clock frequency (MHz, log scale, 10 to 1000) versus date, Dec-83 to Dec-98, for the 80386, 80486, Pentium, and Pentium II]
Most of performance comes from clock scaling
Clock frequency doubles each generation
Two factors contribute: technology (1.4x/gen) and circuit design
Number of gates per clock decreases 1.4x each generation
Caused by:
Faster circuit families (dynamic logic)
Better optimization
Better micro-architecture
Better adder/mem arch
All this generally requires more transistors
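The roughly 1.4x figure is what remains when a 2x per-generation clock gain is divided by a 1.4x technology gain. The sketch below works through that arithmetic and projects the cycle time in FO4 delays; the 32 FO4 starting point and the number of generations are hypothetical, chosen only to show the trend.

```python
def logic_depth_factor(freq_factor: float, tech_factor: float) -> float:
    """Per-generation reduction in gate delays per clock cycle.

    If clock frequency grows by freq_factor while gate speed grows by
    tech_factor, the cycle must shrink by their ratio in gate delays.
    """
    return freq_factor / tech_factor


def project_cycle_fo4(start_fo4: float, generations: int,
                      per_gen_factor: float):
    """Cycle time in FO4 delays for each generation, starting at gen 0."""
    return [start_fo4 / per_gen_factor ** g for g in range(generations + 1)]


factor = logic_depth_factor(2.0, 1.4)  # figures quoted in the text
print(f"gates per clock shrink {factor:.2f}x per generation")

# Hypothetical starting point of 32 FO4 per cycle.
for gen, fo4 in enumerate(project_cycle_fo4(32.0, 4, factor)):
    print(f"generation {gen}: {fo4:.1f} FO4 per cycle")
```

Under this assumed starting point, the cycle reaches the low teens of FO4 delays within a few generations, which is why the trend cannot continue for long.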
Latency
Flops are already 10-20% of cycle today
Difficult to generate a clock at less than 8 FO4 gates
Continued scaling of gates/clock will be hard
Guessing slope will change soon
Performance Scaling
Remember processor performance plots used to have two lines: microprocessors (uP) and mainframes
[Figure: performance over time for mainframes and microprocessors (uP)]
Mainframes had maxed out and improved at the technology rate
What will happen with microprocessors?
It does let you design more modules
Continued scaling of uniprocessor performance is getting hard
Machines using global resources run into wire limitations
Faster clock ticks (in FO4) are getting very hard
Power and logical reach issues in going much under 16 FO4s