Abstract
3nm Gate-All-Around (GAA) technology is introduced as the future of
logic transistors, offering performance, power, and area (PPA)
benefits. However, as with other recent advanced technologies,
GAA technology faces challenges that must be overcome to reach
the optimum PPA. Therefore, Design-Technology Co-Optimization
(DTCO) has become more important than ever to maximize the
technology-to-design benefits of GAA. In this paper, the motivation
for DTCO is presented through successful design examples in
advanced technologies. Then, DTCO-based design techniques for
standard cells and SRAM compilers are proposed to maximize
the benefit of 3nm GAA technology.
2022 IEEE Custom Integrated Circuits Conference (CICC) | 978-1-6654-0756-4/22/$31.00 ©2022 IEEE | DOI: 10.1109/CICC53496.2022.9772784
Introduction
Technology has been developed to meet PPA requirements by
providing solutions for System-on-Chip (SoC) design [1]. Figure 1 (a)
shows the technology pitch scaling and standard cell density over
upcoming years [1]. Contact poly pitch (CPP) and metal (Mx) pitch
scale down, which shrinks the x- and y-dimensions of the standard
cell. Critical design rules that support small x- and y-dimensions
are integrated into the standard cell architecture. Standard cells
are then developed to implement block designs with competitive
PPA. Figure 1 (b) shows CPU throughput at fmax and at constant
power density. CPU throughput is expected to rise steadily over the
years to meet market requirements. Generally, SoC design uses fast
transistors or high-performance design techniques to achieve high
throughput. However, as shown in Figure 1 (b), throughput saturates
at constant power density over the years, since high-performance
technology and design knobs consume correspondingly more power.
Therefore, competitive technology and design solutions are needed
that provide low power and high performance at the same time. In
this paper, various design techniques are introduced to address PPA
challenges in advanced technologies. Then, DTCO is proposed to
sustain the PPA benefit at 3nm GAA technology and beyond.

Fig. 1. (a) Technology pitch scaling and standard cell density over
upcoming years [1]. (b) CPU throughput at fmax and at constant
power density according to IRDS2021 [1].

Fig. 2. SoC area breakdown of standard cell, SRAM, IO, and IP.
Standard cell occupies the largest portion, and SRAM the second
largest of SoC area.
Standard cell and SRAM design in SoC
Standard cells and SRAM are crucial elements of SoC design,
constituting a dominant portion of the transistor count and die
area [2]. Figure 2 shows the area breakdown of an SoC design.
Standard cells occupy the largest portion of the SoC at more than
50%, and SRAM takes the second largest area at about 30%. The area
portions differ according to the specifications of the SoC design,
but standard cells and SRAM clearly occupy most of the SoC area and
therefore drive the overall PPA. For this reason, the DTCO
techniques in this paper focus on standard cell and SRAM design.
The standard cell is the smallest macro that implements the various
functions of an SoC design. Its PPA characteristics are determined
by the technology parameters together with the circuit and layout
architecture. Therefore, standard cells are defined and evaluated
from the very early stage of technology definition to assess the
PPA of the technology. Meanwhile, standard cells play an important
role in providing a design environment in conjunction with
automated EDA tools. For this reason, the competitiveness of a
standard cell is decided not only by its own PPA, but also by how
well it works with EDA tools. The key features of
technology-to-design scaling are discussed in the next section with
standard cell architecture and routing capability. Then SRAM, the
second largest design in the SoC, is discussed by providing DTCO
knobs to optimize area through layout and circuit improvement. In
particular, since SRAM power and performance depend on SRAM assist
techniques, various SRAM assist design techniques are explored in
this paper.

DTCO for standard cell and block scaling
Technology-to-design scaling is characterized by many factors
across several design levels, as shown in Table 1. The Level-I
factor represents the intrinsic standard cell scaling in the x- and
y-dimensions. The Level-II factor covers cell-level architecture
regarding dummy gates inside and outside the cell. Lastly, the
Level-III factor relates to cell-to-block scaling in terms of pin
solutions and routing resources.

Table 1. Technology-to-design scaling level
Scaling levels    Scaling features
Level-I           Gate length, Gate-to-contact space, Mx pitch
Level-II          Gate cross-coupled structure, diffusion break
Level-III         Facilitation of place and routing elements
Level-I*II*III    Block scaling
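The last row of Table 1 can be read as block scaling being the product of the three level factors. The sketch below illustrates that reading only. The pitch values, the Level-II and Level-III gains, and the names `ScalingFactors`, `level_1_factor`, and `block_area_factor` are hypothetical and are not taken from the paper. It also assumes that cell width tracks CPP and cell height tracks Mx pitch, with gate and track counts unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingFactors:
    """Relative area factors (new / old); values below 1.0 mean a shrink."""
    level_1: float  # intrinsic cell scaling from pitch (x * y)
    level_2: float  # cell-architecture gain (e.g. diffusion break)
    level_3: float  # place-and-route gain (pin access, routing resources)


def level_1_factor(cpp_old, cpp_new, mx_old, mx_new):
    """Intrinsic cell area factor, assuming width scales with CPP and
    height scales with Mx pitch (same gate count and track count)."""
    return (cpp_new / cpp_old) * (mx_new / mx_old)


def block_area_factor(f):
    """Block scaling read as the product Level-I * Level-II * Level-III."""
    return f.level_1 * f.level_2 * f.level_3


# Hypothetical pitches in nm and hypothetical Level-II / Level-III gains.
factors = ScalingFactors(
    level_1=level_1_factor(cpp_old=60.0, cpp_new=54.0, mx_old=40.0, mx_new=36.0),
    level_2=0.95,
    level_3=0.90,
)

print(f"Level-I factor:    {factors.level_1:.3f}")
print(f"Block area factor: {block_area_factor(factors):.3f}")
```

With these made-up numbers, pitch scaling alone gives less shrink than the full product, which is the motivation for treating Level-II and Level-III as separate DTCO knobs.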
Fig. 10. SRAM assist scheme to help write and read operation by
changing bitcell power (VDDC), WL voltage, and BL voltage [8].
Since sequential functionality is implemented in SRAM and in the
flip-flops (F/F) of the standard cell library, the VMIN of an SoC
depends on the characteristics of SRAM and F/F. Therefore, DTCO is
applied to improve the VMIN of SRAM and F/F. Figure 9 shows the SRAM
VMIN trend with respect to VOP and VTH over technologies [8]. As
technology advances, VOP has been lowered to save power. However,
VTH limits the design headroom, which is given by VOP - VTH. To
provide additional design headroom, various SRAM assist schemes
have been developed to improve VMIN. Figure 10 shows various SRAM
assist schemes that improve read and write margin. Since SRAM
assist schemes have pros and cons with different PPA
characteristics, various circuit and layout architectures are
developed to provide the optimum PPA in addition to VMIN [7-12].
Since low-power SRAM and F/F are crucial to a high-performance SoC,
SRAM DTCO should be applied to achieve competitive PPA in advanced
technologies.

DTCO for high-performance
For over a decade, FinFET technology has provided a
high-performance solution compared to planar technology, until GAA
technology was proposed recently. Several works have optimized
parasitic resistance and capacitance through DTCO to make the best
use of the transistor [7-9], [11]. Figure 11 (a) shows the
difference between uni- and bi-directional metal patterning. EUV
allows bi-directional metal with patterning flexibility, while ArF
supports uni-directional metal. With bi-directional metal,
sufficient VIAs can be inserted between two metal layers by routing
them in the same direction. Figure 11 (b) compares ring-oscillator
delay according to the additional VIA distance in the power rail of
the standard cell. Drawing the power rail on both M1 and M2 with
more VIAs reduces its resistance, which improves ring-oscillator
performance. A similar DTCO is applied to SRAM design to reduce
metal resistance effectively. Figure 12 shows the dual-write-driver
(DWD) SRAM assist, which is developed to reduce BL resistance [9].
By exploiting the parallel resistance effect, DWD ideally reduces
the BL resistance by 75%. Write time and margin also improve thanks
to the reduced BL resistance. Metal resistance plays a more
important role in GAA technology, since the transistor is faster
and is therefore affected more by metal resistance.

Fig. 12. Dual-write driver (DWD) SRAM assist to reduce the
effective BL resistance [9].

DTCO for 3nm GAA technology
GAA technology is proposed to provide PPA benefits by featuring (1)
small capacitance, (2) large effective channel width (Weff), and (3)
design flexibility. Therefore, DTCO knobs are applied to maximize
those features in 3nm GAA technology. Figure 13 (a) shows the GAA
transistor structure compared to FinFET. FinFET increases
transistor width by adding fins in quantized steps, so performance
and capacitance increase in discrete steps. In contrast, GAA
increases transistor width linearly by widening the nanosheet.
Moreover, thanks to the transistor structure, the overlap
capacitance of the GAA transistor is relatively small, with a
gentler slope than FinFET.
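The contrast between quantized fin counts and continuously adjustable nanosheet width can be illustrated with the usual first-order geometric estimates of effective width. This is a minimal sketch, not the paper's model. The dimensions and the function names `finfet_weff` and `gaa_weff` are hypothetical, and it assumes the nanosheet can be drawn at any of the listed widths.

```python
def finfet_weff(n_fins, fin_height, fin_width):
    """First-order FinFET effective width: each fin conducts on its two
    sidewalls and its top surface."""
    return n_fins * (2.0 * fin_height + fin_width)


def gaa_weff(n_sheets, sheet_width, sheet_thickness):
    """First-order nanosheet GAA effective width: the gate wraps all four
    sides of every stacked sheet."""
    return n_sheets * 2.0 * (sheet_width + sheet_thickness)


# Hypothetical dimensions in nm, chosen only for illustration.
FIN_HEIGHT, FIN_WIDTH = 50.0, 6.0
N_SHEETS, SHEET_THICKNESS = 3, 5.0

# FinFET: width can only change by whole fins.
finfet_options = [finfet_weff(n, FIN_HEIGHT, FIN_WIDTH) for n in (1, 2, 3)]

# GAA: width follows the drawn sheet width (assumed drawable in 5 nm steps).
gaa_options = [gaa_weff(N_SHEETS, w, SHEET_THICKNESS) for w in range(15, 50, 5)]

print("FinFET Weff options (nm):", finfet_options)
print("GAA Weff options (nm):   ", gaa_options)
```

Under these assumptions the FinFET offers only a few widely spaced Weff values, while the GAA device covers a similar range in much finer steps. That finer granularity is one way to picture the design flexibility the text attributes to GAA.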
Fig. 16. (a) GAA SRAM bitcell with various PU, PG, and PD
nanosheet size. (b) Iread and disturb margin comparison for
FinFET and GAA with different nanosheet width [10].
Fig. 17. SRAM bitcell VMIN of GAA and 7NM for HD and HC types.
GAA shows lower VMIN than 7NM, and HC shows much lower VMIN
than HD by sacrificing area [10].