Interconnects in the Third Dimension: Design Challenges for 3D ICs
Kerry Bernstein, Paul Andry, Jerome Cann*, Phil Emma, David Greenberg, Wilfried Haensch,
Mike Ignatowski, Steve Koester, John Magerlein, Ruchir Puri, Albert Young
IBM T.J. Watson Research Center, Route 134/ P.O. Box 218, Yorktown Heights, NY 10598 USA
*IBM Microelectronics, 1000 River Road, Essex Junction, VT 05452 USA
Email: kbernste@us.ibm.com

ABSTRACT
Despite generation upon generation of scaling, computer chips have until now remained essentially 2-dimensional. Improvements in on-chip wire delay and in the maximum number of I/O per chip have not been able to keep up with transistor performance growth; it has become steadily harder to hide the discrepancy. 3D chip technologies come in a number of flavors, but are expected to enable the extension of CMOS performance. Designing in three dimensions, however, forces the industry to look at formerly-two-dimensional integration issues quite differently, and requires the re-fitting of multiple existing EDA capabilities.

Categories and Subject Descriptors
B.4.3 [Interconnects (Subsystems)]: Parallel I/O; C.1.2 [Multiple Data Stream Architectures]: Interconnection Architectures

General Terms
Performance, Design, Standardization, Theory

Keywords
3D, Interconnect, Through-wafer via, Chip-stack, Silicon Carrier, Bandwidth, Latency, Hierarchical Memory

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DAC 2007, June 4–8, 2007, San Diego, California, USA.
Copyright 2007 ACM 978-1-59593-627-1/07/0006…$5.00.

1. What is 3D Interconnect?
In general, 3D refers to a variety of technologies which provide electrical connectivity between multiple active device planes. The technologies vary in the diameter and pitch of the vertical vias, their impedance, and how the planes are mounted with respect to each other. The first 3D structures appearing in the market were stacks of chips with progressively larger dimensions, which were then wire-bonded. Early integrated 3D structures comprised separate dies connected with conventional C-4 solder bumps attached to large through-wafer vias, mounted either face-to-face or back-to-face. Advanced 3D technologies with vias under 1 micron in diameter are now emerging, connecting wafer-bonded active layers. This range and their applications are shown in Figure 1.

Figure 1 Technology range of vertical interconnects

1.1 Chip Stack / Silicon Carrier Technology
The key enabling technology element required for 3D integration is the through-silicon via (TSV). Per Figure 1, the final application and the 3D process technology used for its implementation will dictate the physical parameters of the TSV, including pitch, diameter or width, and via height. Assuming interconnect density requirements can be met, the electrical requirements of the design must be fulfilled, which include maximum allowable values for via resistance, capacitance, and inductance. A good deal of research has been performed to define the structure and fabrication process of TSVs [1][2]. A common approach employs a cylindrical through-via metallized with Cu electroplating. Copper can be processed using standard back-end-of-the-line (BEOL) tools. However, the dimensions of these through-vias are limited by the capabilities of the physical vapor deposition (PVD) tools used to deposit the liner and Cu seed films necessary for electroplating. In practice, via aspect ratios much beyond 10:1 become very difficult to electroplate using standard techniques. Another approach, known as a "vias last" process, drives the via in from the backside of the wafer once the wafer has been ground to its final thickness. Typically, this process requires some combination of wafer handling, backside lithography, low-temperature insulation, RIE, PVD, and electroplating.
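For intuition about these electrical requirements, the DC resistance of a cylindrical Cu TSV follows directly from its geometry. The sketch below is illustrative only; the via dimensions and room-temperature copper resistivity are assumed values, not figures from this paper:

```python
import math

RHO_CU = 1.7e-8  # assumed Cu bulk resistivity, ohm*m (room temperature)

def tsv_resistance(height_m, diameter_m, rho=RHO_CU):
    """DC resistance of a solid cylindrical via: R = rho * L / A."""
    area = math.pi * (diameter_m / 2) ** 2
    return rho * height_m / area

# Assumed example: a 3 um tall, 5 um diameter Cu TSV
r = tsv_resistance(3e-6, 5e-6)
print(f"{r * 1e3:.2f} mOhm")  # on the order of a few milliohms
```

Resistance scales linearly with via height and inversely with cross-sectional area, which is why the short, wide vertical vias discussed here can stay in the milliohm range.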

For applications requiring modular designs, where heterogeneous integration of multiple die must be interconnected in a reworkable manner, chip-to-chip and chip-to-wafer 3D assemblies are useful. Chip stacking using thin die with TSVs and conventional-pitch C4 interconnects may dramatically increase the density of memory products with only incremental changes to die, package, and final assembly. Further increases in density can be realized by decreasing the volume of the joining metallurgy and using thermo-compression techniques to reduce stacked inter-die spacing down to a few microns [4]. Since the Cu wiring density achieved using standard CMOS BEOL techniques is orders of magnitude higher than that of typical first-level packages, silicon carriers increase in-plane interconnection density between multiple die, assuming the I/O pitch of these die can be reduced simultaneously. Bond and assembly of multiple chips on a silicon carrier have been undertaken using a variety of solder metallurgies on die having dense microbump arrays (25 micron diameter at 50 micron pitch) with excellent yield and reliability [3]. Such techniques may be extended down to pitches of tens of microns. Planarity, solder volume, and chip placement accuracy issues may limit further reductions. Examples of chip stacks and silicon carrier assemblies are shown in Figure 2.

Figure 2 a) a 3D stack comprising a full-thickness chip, a 90 micron thick Si carrier, and a ceramic package, and b) SEM of a 2-chip stack showing TSVs, thin joining metallurgy, and the resulting small inter-die gap

1.2 Wafer Scale Integration
Wafer-scale integration typically involves the joining and interconnection of two wafers that already have active circuits and devices. The use of such wafer-to-wafer joining preserves the physical integrity of the wafer and therefore enables additional conventional semiconductor processing (typically back-end-of-line operations) after the wafer-joining step. In wafer-scale integration, the wafers can be joined in either a face-to-face or face-to-back orientation; however, both types of builds typically require wafer thinning, aligned wafer-to-wafer bonding, and formation of wafer-to-wafer interconnect. These processes are generally performed using materials that are compatible with future back-end-of-line processing.

Wafer-scale integration can be used in combination with through-silicon vias (TSVs). As one example, face-to-face wafer-scale integration can be enabled by copper-to-copper bonding of interconnects (Figure 3). Copper metallurgy can be used to simultaneously form mechanical and electrical connections, and is compatible with additional back-end processing. A full-thickness top wafer is fabricated with both circuits and deep-via (deeper than the final top-wafer thickness target) structures. This wafer is flipped over and aligned to the bottom wafer, so the copper patterns used at the join of the two wafers must incorporate mirroring in the design. After aligned bonding, the top wafer is thinned from the backside to convert the deep-via structures into through-silicon vias, which are used to provide I/O escapes from the bonded stack.

Figure 3 Schematic illustration of face-to-face wafer-scale assembly enabled by copper-to-copper bonding of interconnects

Because wafer-scale integration enables further processing of wafers after the join step, process flows can also be developed which first join wafers mechanically and then electrically at a later step. One example of this in practice today enables very high densities of interlayer interconnections. It uses a glass-handle wafer to stack a silicon-on-insulator (SOI) top-circuit layer onto an underlying bottom-circuit wafer in a face-to-back orientation, as illustrated schematically in Figure 4.

Figure 4 Schematic illustration showing steps used to stack a silicon-on-insulator (SOI) top-circuit layer onto an underlying bottom-circuit wafer in a face-to-back orientation

2. 3D Advantages and Opportunities
Vertical interconnects, in a number of embodiments, have been shown to provide a number of distinct advantages to high performance chips. Each application has a different requirement for via pitch and count, electrical performance, and operating conditions. These requirements dictate which of the 3D schemes is most economically suited to the use.

2.1 Wirelength Reduction
3DI is commonly cited as a means of reducing lateral wire length. With the introduction of multiple active silicon layers, a 2-3 micron tall 3DI vertical via with a resistance of a few milliohms, an inductance of under 1 pH, and a capacitance of a couple of fF replaces a lateral wire of tens to perhaps hundreds of microns. Figure 5 is a histogram of wirelengths from a recent 90nm-generation microprocessor subunit, assuming wirelength reduction factors ranging from 1X to 0.5X. As the wire lengths decrease, so does the demand for wire buffers, also shown in Figure 5. Overall, the power dissipated in interconnects and buffers decreases monotonically with the wire length-reduction factor, as shown in Figure 6.
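To first order, a wire's switching power scales linearly with its length, since wire capacitance is proportional to length and dynamic power is a*C*V^2*f. A toy model with assumed unit values (none of these constants come from the paper) illustrates why the wire-power trend falls monotonically with the length-reduction factor:

```python
# Toy model: dynamic wire power P = a * C * V^2 * f, with C proportional
# to wire length. All absolute values below are assumed, for illustration.
C_PER_UM = 0.2e-15            # assumed wire capacitance per micron, F/um
V, F, ACTIVITY = 1.0, 2e9, 0.15  # assumed supply (V), clock (Hz), activity

def wire_power(length_um, factor=1.0):
    """Dynamic power of one wire after scaling its length by `factor`."""
    c = C_PER_UM * length_um * factor
    return ACTIVITY * c * V ** 2 * F

powers = [wire_power(1000, s) for s in (1.0, 0.9, 0.8, 0.7)]
# Power falls in direct proportion to the length-reduction factor.
assert all(a > b for a, b in zip(powers, powers[1:]))
```

The buffer-power savings compound on top of this, since shorter wires also need fewer repeaters.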

Figure 5 Histogram of wirelengths by length-reduction factor, and Buffer Count by Length-Reduction Factor

Figure 6 Wire, Buffer Power Savings with Length-Reduction Factor

2.2 Integration of Disparate Technologies
Chip processing technologies specific to functions such as RF circuits or memories are often incompatible with the fabrication steps needed to produce high performance logic devices. Past attempts to converge these onto a single monolith in 2D resulted in compromises in device mobility or device density. Converging these functions is a powerful means of reducing delay or power consumption, as the alternative is placing separate components on a system blade or planar. The power and delay associated with the resulting traces can make certain capabilities unfeasible. 3D interconnect enables the independent fabrication of these functions, and their subsequent integration in close electrical proximity.

2.3 Bandwidth and Latency
A more subtle but perhaps very compelling advantage of 3DI is associated with the availability of massive bandwidth.

2.3.1 Architecture
Processor chips have historically achieved a yearly average compound performance growth rate of 50% or more. While much of this performance growth rate was due to faster cycle times and improved core architecture, a growing portion of the performance growth now comes from factors that increase bandwidth requirements at a faster rate. Leading-edge microprocessors are implementing an increasing number of cores per chip, as well as an increasing number of execution threads per core (see Figure 7). The Sun Niagara-2 will have 8 cores, each supporting 8 execution threads, for a total of 64 execution threads on a chip. In addition, virtualization functions allowing multiple operating systems to run together on a single chip are becoming pervasive. Processor chips are also implementing specialized hardware to increase the performance of various common operations. This can take the form of specialized extensions to the core design, such as SSE (Streaming SIMD Extensions) from Intel, 3DNow! from AMD, and AltiVec for PowerPC processors. It can also take the form of specialized accelerator functions separate from the cores; security and network functions are two common operations addressed by specialized hardware. These trends combine to significantly increase the need for larger cache capacity to support the many different threads running on a chip, as well as increasing the overall bandwidth requirements for the cache and memory systems.

Figure 7 Thread growth in high performance processors

2.3.2 Cache Management
Bandwidth is used for moving "cache lines" between main storage and the various levels within the cache hierarchy. A cache line is a contiguous block of data, usually 128 or 256 bytes in servers. When a processor references a datum that is not in its cache, this event is said to be a "cache miss." The system must then retrieve the line containing that datum and deliver it to the processor. The processor stores the newly referenced cache line in its local cache, overwriting a previously held cache line.

Miss events usually cause the processor to stall (at least partly) until it receives the requested datum. The penalty incurred comprises what is called the "Leading Edge" (LE) and the effects of the "Trailing Edge" (TE) of the miss.

The LE comprises the various latencies that are immediately apparent when considering what has to get done: logic cycles to do prioritization, directory lookups, and error checking; time-of-flight wire delays; queuing delay incurred when the required resources are busy servicing a different miss; and the access time associated with the remote cache that sources the data.

The TE is simply the number of cycles (processor cycles) required to move the entire line across the bus. It is purely a measure of bandwidth, and has nothing to do with latency. (Latency is accounted in the LE.)

TE = (Line Size / Bus Width) x (FProcessor / FBus)   (Eq. 1)

The first ratio in Equation 1 is the number of "packets" that are moved across the bus. The TE causes performance problems through five distinct kinds of interactions. For three of these interactions, the penalty induced is directly proportional to the TE. For the other two, the induced penalty is nonlinear: minimal if the bandwidth is sufficient, and precipitously dreadful as certain thresholds are crossed [5].
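To make Equation 1 concrete, here is a small worked example. The machine parameters (128-byte line, 16-byte bus, 4 GHz core, 1 GHz bus) are assumed for illustration and do not come from this paper:

```python
def trailing_edge(line_bytes, bus_bytes, f_proc_hz, f_bus_hz):
    """Eq. 1: TE in processor cycles = packets per line * cycles per packet."""
    packets = line_bytes / bus_bytes          # (Line Size / Bus Width)
    return packets * (f_proc_hz / f_bus_hz)   # x (FProcessor / FBus)

# Assumed example: 128 B line, 16 B bus, 4 GHz core, 1 GHz bus
te = trailing_edge(128, 16, 4e9, 1e9)
print(te)  # 32.0 processor cycles to stream the whole line across the bus
```

Doubling the bus width or the bus frequency halves the TE; the very wide, fast vertical busses described later are exactly that lever.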

A long-known, heuristically observable property of caches is that the miss rate (the rate at which a processor generates misses) is proportional to the reciprocal of a root (frequently, the square root) of the capacity of the cache, and is workload dependent. The bus utilization is the product of the TE (which is the "service time" for a miss) and the miss rate. The nonlinearities in the TE effects arise when the utilization gets pushed too hard.

Cache capacity and bandwidth can be thought of as "mutually fungible" entities. If the cache can be made larger, less bandwidth is required; if more bandwidth is made available, we can live with a smaller cache. It is always better to have more of both, however: big caches with lots of bandwidth. 3D affords us this opportunity.

Obviously, 3D enables more cache capacity directly. Cache "planes" can be readily stacked upon a system footprint. Less obvious is that within the 3D stack, much higher bandwidth (between planes) is achievable as well, much higher than would have been possible had the planes been laid out on a 2D package. To wire between chips on a 2D package requires "Manhattan" x and y wiring having dimensions on the scale of the chip size. These wires can be quite long: several centimeters, or even inches. If areas to be connected between chips are spatially co-located, then when put into 3D the connections can be primarily vertical (in the z dimension). In this case, in particular, busses between adjacent layers in a cache hierarchy have the potential of requiring very little x or y displacement. Short busses like this (now perhaps only tens or hundreds of microns) run much faster, and at lower relative power. In addition, if specific 2D layouts can be anticipated (to minimize the x and y motion required in moving between planes), wiring blockages may be eliminated. This will enable a denser through-via grid, which allows the busses to be very wide as well. This enables removal of much of the Trailing Edge, and its pernicious effects.

2.4 Low Voltage and Power Savings
The performance of many high performance processors is limited not by raw capability but by the ability to supply sufficient power or to remove the consequential heat. One effective means of improving efficiency is the use of reduced-voltage supplies. Since the signal swing in CMOS logic is determined by the supply voltage, reducing this voltage decreases the energy supplied by a logic gate to, and discharged from, the wiring and next-stage input capacitance. This reduces active energy per cycle. At the same time, threshold voltage increases as a consequence of drain-induced barrier lowering, decreasing leakage. Passive energy per cycle thus decreases for some range of decreasing supply voltage [6][7]. Figure 8 illustrates this relationship between both active and passive energy per cycle and supply voltage.

Figure 8 The relationship between CMOS active and passive power

Low-voltage operation comes at the cost of increased gate delay, however, due to the reduced on-current available to charge each gate output node. This increased gate delay degrades performance and allows leakage power to integrate over a longer period, eventually causing passive energy to rise again at sufficiently low supply values, as shown in Figure 8.

Circuit architecture can capture the improved efficiency of low-voltage operation while maintaining performance by compensating for delay. Parallelism can offset delay by increasing the number of circuits performing a given task, dividing the workload [8][9]. Since not all algorithms can be cast into a perfect parallel implementation, the percentage increase in circuit count required to maintain performance is typically larger than the percentage increase in delay, commonly represented by a power-law relationship with coefficient α. In a system characterized by an α value of 1.4, for example, circuit count must increase by a factor of 2.6 for each doubling of delay. The price of power efficiency constrained by constant performance is therefore an increase in chip area. Power-area tradeoffs can be difficult to accept in planar implementations; 3D integration offers options. Specific system blocks such as SRAM caches could be moved to their own layer, freeing area for increased logic count. Alternatively, multiple cores with dedicated local memory could be stacked in multiple layers.

A key obstacle toward realizing robust low-voltage design is process variability, particularly as it impacts threshold voltage. Such variability can be at least partly countered by adjusting threshold voltage dynamically through body bias (in bulk and partially-depleted SOI) or a backgate (in fully-depleted SOI). Implementing threshold-adjusting biases with fine granularity across a chip is a difficult design challenge, however, requiring generation of many individual voltages. 3D integration provides an elegant solution to this problem, enabling individual, adjustable voltage converters to be placed in their own layer directly above the zones of the logic chip where they are required.
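The power-law area tradeoff quoted above is easy to check numerically: if circuit count grows as (delay increase)^α, then α = 1.4 (the value used in the text) gives 2^1.4, or roughly a 2.6x increase in circuit count per doubling of delay. A minimal sketch of that relationship:

```python
def circuit_count_factor(delay_factor, alpha=1.4):
    """Circuit-count increase needed to hold performance constant when
    gate delay grows by `delay_factor`, under a power law with exponent alpha.
    alpha = 1.4 is the example value from the text; the model is the common
    heuristic the text describes, not an exact law."""
    return delay_factor ** alpha

print(round(circuit_count_factor(2.0), 2))  # 2.64, i.e. ~2.6x per doubling of delay
```

A perfectly parallelizable workload would have α = 1 (area grows only linearly with delay); the excess above 1 measures how imperfectly the algorithm parallelizes.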

3. 3D Challenges and Solutions

3.1 Effective Cooling of 3D Assemblies
Thermal management in 3D stacks is critical for maintaining required reliability, performance, and power dissipation targets. Conventional 2D solutions, however, are insufficient to remove the heat associated with stacked or bonded layers.
Cooling of 3D assemblies is difficult, both because the power per unit area is increased and because heat must be conducted through multiple chips, often with poor thermal interfaces. While it is possible to introduce coolants within a thick 3D structure to handle very high power levels [10], such an approach is complex and requires thick cooling structures within the stack to bring in a sufficient fluid volume. Thus every effort should be made to remove heat from the back of a chip stack, as is currently done for single high-power chips.
While calculating chip temperatures accurately requires detailed thermal modeling, rough estimates based on the thermal resistances in Table 1 show that removing heat from the back of a stack is feasible. While a microprocessor with a hot-spot power

density of 2 W/mm² would give a prohibitive temperature drop of over 50 C per chip layer for a "typical" case, improved thermal design can manage this. Nevertheless, care must be taken that hot spots on different chips in a stack do not overlap. For DRAM chips with power densities in the range of 0.01 W/mm², stacks of many chips may be possible through careful thermal design.
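These rough estimates follow directly from the area-normalized resistances in Table 1 below, since the temperature drop across a layer is approximately (power density) x (thermal resistance). A quick sketch using the table's "typical" and "improved" totals:

```python
# Area-normalized thermal resistances (C-mm^2/W), from Table 1
R_TYPICAL = 25    # "typical" stack: PbSn solder balls (table lists >25)
R_IMPROVED = 5    # "improved" stack: Cu balls, thin dielectric (~5)

def delta_t(power_density_w_mm2, r_thermal):
    """Approximate temperature drop across one chip layer: dT = q * R_th."""
    return power_density_w_mm2 * r_thermal

print(delta_t(2.0, R_TYPICAL))    # 50.0 C per layer: prohibitive for a CPU hot spot
print(delta_t(2.0, R_IMPROVED))   # 10.0 C per layer with improved thermal design
print(delta_t(0.01, R_TYPICAL))   # 0.25 C per layer: DRAM-class densities are benign
```

The same arithmetic shows why many-layer stacks are plausible for DRAM but not for overlapping logic hot spots.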
Table 1 Approximate area-normalized thermal resistance for layers in stacked chip structures.

Structure                                      Thermal resistance (C-mm²/W)
PbSn solder balls, 100 µm tall, 15% coverage   16
Cu balls, 20 µm tall, 20% coverage             0.3
200 µm thick Si wafer                          1.6
10 µm thick SiO2 layer                         8
Total "typical" (solder balls)                 >25
Total "improved" (Cu balls, thin dielectric)   ~5

3.2 Test Approaches
The Automated Test Equipment (ATE) required to test advanced 3DI wafers in the future will be no different from that needed for testing standard, non-bonded 2D wafers today. However, for some 3DI bonding processes, the 3DI wafers will be aligned and bonded before metallization of the topmost layer, with the process repeated for each additional layer [11]. Without metallization of each IC layer prior to bonding, testing individual layers prior to bonding will not be possible. Because testing of individual IC layers prior to bonding will not be possible, contact pads, ESD, I/O, and test structures on individual layers will not be necessary, and can be placed on a single dedicated layer, the Peripheral and Test Layer (PTL). The debate remains open as to where the PTL should be placed in the stack, where the contact pads should be located (topmost layer or backside of substrate), and how the signals should be routed through the 3DI assembly. If the PTL is placed first on the stack and the contact pads are located on the backside of the substrate, there will be significant opportunities for developing new test Front-End Hardware (FEH), methodologies, and processes for improving 3DI quality, throughput, and yield. Locating the contact pads on the backside of the substrate will enable continuous monitoring of the bonding process as well as testing of the assembly prior to completion. If the contact pads are located on the topmost layer, as is done with standard, non-bonded 2D wafers today, test will continue to be the last step in the process, and the FEH, methodologies, processes, and value added by test will remain largely unchanged.

3.3 EDA Enablements for 3D
A fundamental shift in the technology has occurred beyond 90nm CMOS, where interconnect resistance has been increasing significantly enough to cause a repeater explosion problem. This problem translates into not only significant area overhead but also power, as repeaters are among the leakiest circuit topologies. 3D technology has the potential of easing the challenge of repeater explosion (Figure 9).

Figure 9 Repeater explosion due to metal resistance increase with CMOS scaling

In order to exploit the full potential of 3D technology, new challenges in the areas of physical design [17][18], thermal analysis [15][16], and system-level design and analysis need to be addressed [13]. 3D interconnects have the potential of reducing critical path delays significantly, which are typically between memory and the interfacing logic.

New tools that consider thermally-aware physical design implementations, most importantly at the architecture and SoC level, are crucial to the success of 3D, as thermal issues are exacerbated in 3D implementations [12]. To justify the cost and complexity overhead of 3D technology, it is essential to study the benefit of 3D early in the design cycle. This requires strong linkage between architecture-level analysis tools and 3D physical planning tools. Most of the advantages of 3D will be realized with new system architectures and physical implementations. Therefore, the tools to aid 3D implementation must also operate at this higher level, in addition to the 3D place-and-route algorithms that have been proposed in the literature before. In fact, in our view, the benefits from 3D place and route will be limited, since current 2D designs do a fairly good job of optimizing the critical path distance. There is a very strong need for 3D architectural and physical planning tools that operate in the domain of thermal, physical, and performance analysis in order to yield an optimized system implementation in 3D technology [19][20][21][22]. Most of the studies reporting huge benefits from 3D for wire length [12] do not adequately consider the physical impact of vertical vias. It is crucial to consider the impact of vertical vias on the physical design of ICs, from area, latency, and thermal points of view.

Figure 10 Sweet spot of 3D partitioning when considering 3D through-via impact

Figure 10 shows that the sweet spot of partitioning for 3D implementation lies at the

unit level (where a unit is a large logical entity such as floating point logic or instruction decode logic, etc.) and beyond, when considering the via impact.

In addition to 3D design and implementation tools, there are important challenging issues in 3D test and yield that must be addressed as well. It is well known that yield has a quadratic dependency on die size, and a linear dependency on chip count at a given die size. 3D designs may incur some yield loss due to vertical vias, and may gain some yield due to density. One of the benefits of 3D is that this technology is compatible with known-good-die practices, a known contributor to cost reduction and test simplification.
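The yield argument above can be sketched with a simple Poisson defect model, a standard textbook assumption rather than one taken from this paper: die yield falls exponentially with defect density times area, and known-good-die (KGD) screening keeps a stack from compounding per-die losses. All numeric values below are assumed:

```python
import math

def die_yield(defect_density_per_mm2, area_mm2):
    """Simple Poisson yield model, Y = exp(-D * A). Assumed for illustration."""
    return math.exp(-defect_density_per_mm2 * area_mm2)

D, A = 0.002, 100.0          # assumed defect density (per mm^2) and per-die area
y_2d = die_yield(D, 2 * A)   # one monolithic 2D die of twice the area

# Two-die stack, no pre-test: both dice and the bond must all be good.
bond_yield = 0.99            # assumed vertical-via / bonding yield
y_stack_blind = die_yield(D, A) ** 2 * bond_yield

# Two-die stack with known-good-die screening: only bonding loss remains
# for the dice that reach assembly.
y_stack_kgd = bond_yield

assert y_stack_blind < y_2d < y_stack_kgd
```

The ordering in the final assertion is the point: blind stacking pays both the die losses and the bond loss, while KGD screening lets the stack beat the large monolithic die whenever bond yield is high.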
4. References
[1] K. Takahashi, et al., "Process Integration of 3D Chip Stack with Vertical Interconnection," Proceedings of the 54th Electronic Components and Technology Conference, Las Vegas, NV, pp. 601-609, 2004.
[2] C.A. Bower, et al., "High Density Vertical Interconnects for 3D Integration of Silicon Integrated Circuits," Proceedings of the 56th Electronic Components and Technology Conference, San Diego, CA, pp. 399-403, 2006.
[3] S.L. Wright, et al., "Characterization of Micro-bump C4 Interconnects for Si-Carrier SOP Applications," Proceedings of the 56th Electronic Components and Technology Conference, San Diego, CA, pp. 633-640, 2006.
[4] K. Sakuma, et al., "3D Chip Stacking Technology with Low-Volume Lead-Free Interconnections," to be published in the Proceedings of the 57th Electronic Components and Technology Conference, Reno, NV, May 29 - June 1, 2007.
[5] P.G. Emma, "How Bandwidth Works in Computers," Chapter 11 in High Performance Energy Efficient Microprocessor Design, edited by V.G. Oklobdzija and R. Krishnamurthy, Springer, Feb. 2006.
[6] S. Hanson, et al., "Ultralow-voltage minimum-energy CMOS," IBM J. Res. and Dev., vol. 50, no. 4/5, pp. 469-488, 2006.
[7] A. Bryant, et al., "Low power CMOS at Vdd=4kT/q," Proceedings of the Device Research Conference (Notre Dame, IN), June 2001, pp. 22-23.
[8] H.P. Hofstee, "Power-constrained microprocessor design," Proc. 2002 IEEE Int. Conf. on Computer Design: VLSI in Computers and Processors (San Jose, CA), Sept. 2002, pp. 14-16.
[9] H.P. Hofstee, "Power efficient processor architecture and the Cell processor," Proc. 11th Int. Symp. on High-Performance Computer Architecture (San Francisco, CA), Feb. 2005, pp. 258-262.
[10] B. Dang, et al., "Integrated Thermal-Fluidic I/O Interconnects for an On-Chip Microchannel Heat Sink," IEEE Electron Device Letters, vol. 27, pp. 117-119, Feb. 2006.
[11] K. Bernstein, et al., "Introduction to 3D Integration," ISSCC '06 Tutorial 3, February 2006.
[12] J. Cong, et al., "An Automated Design Flow for 3D Microarchitecture Evaluation," Proc. of Asia-Pacific DAC 2006, pp. 384-389, 2006.
[13] A. Rahman, et al., "Wire length distribution of 3-D ICs," Proc. of IEEE Intl. Conference on Interconnect Technology, pp. 671-678, 1999.
[14] S. Das, et al., "Design Tools for 3-D ICs," Proc. of Asia-Pacific DAC 2003, pp. 53-56, 2003.
[15] J. Cong, et al., "A thermal-driven floorplanning algorithm for 3-D ICs," Proc. of ICCAD 2004, pp. 306-313, 2004.
[16] J. Cong, et al., "Thermal-driven multi-level routing for 3-D ICs," Proc. of Asia-Pacific DAC 2005, pp. 121-126, 2005.
[17] C. Ababei, et al., "Placement and Routing in 3D ICs," IEEE Design & Test, Nov.-Dec. 2005, pp. 520-531.
[18] S.K. Lim, "Physical Design for 3D System on Package," IEEE Design & Test, Nov.-Dec. 2005, pp. 532-539.
[19] G.L. Loi, et al., "A Thermally Aware Performance Analysis of Vertically Integrated (3D) Processor-Memory Hierarchy," Proceedings of the 43rd Design Automation Conference (DAC), pp. 991-996, 2006.
[20] O. Ozturk, et al., "Optimal Topology Exploration for Application-Specific 3D Architectures," Proc. of Asia-Pacific DAC 2006, pp. 390-395, 2006.
[21] J. Kim, et al., "A Novel Dimensionally-Decomposed Router for On-Chip Communication in 3D Architectures," to appear in Proc. of the International Symposium on Computer Architecture (ISCA) 2007.
[22] W.-L. Hung, et al., "Interconnect and Thermal-aware Floorplanning for 3D Microprocessors," International Symposium on Quality Electronic Design (ISQED), 2006, pp. 98-104.
