Selt Mann 2009
Rolf Seltmann, Gert Burbach, Anne Parge, Jens Busch, Tino Hertzsch, Andre Poock, Francois
Weisbuch, Andre Holfeld
AMD Fab36 LLC & Co. KG, Wilschdorfer Landstrasse 101, D-01069 Dresden, Germany
Abstract
In this paper we discuss the variation of the patterning process in the context of the overall electrical parameter
variation in an advanced logic Fab. The evaluation is based on both the variation of ring oscillators (ROs) that are
distributed across the chip and the local variation of matched transistor pairs. Looking back to the 130nm technology,
we show how conditions and requirements have changed over time. In particular we focus on the gate layer, where we
make a detailed comparison of the across-chip linewidth variation (ACLV) from the 130nm technology node down to
today’s 45nm node. Within the patterning variation we pay special attention to the mask performance and present a
detailed wafer-mask correlation analysis. In addition to the low-MEEF gate layer, we show the importance of the mask
CD-performance for a typical high-MEEF layer (MEEF: mask error enhancement factor). Finally, we discuss the mask
contribution to the overall overlay error for the most critical overlay, contact to gate. In all of these cases we show
that the mask performance is not the limiter in today’s most advanced technology, as long as we have access to a
world-class mask shop.
1. Introduction
Controlling process variation is one of the major tasks within a wafer Fab. The Fab management as well as the customer
is interested in a product that meets all the electrical specifications committed to in the design manual. Relaxed
electrical specifications result in uncompetitive designs, whereas specifications that are too tight result in
unacceptable yields. For a microprocessor this requires monitoring individual transistor parameters, the matching of
equal transistors in close proximity, the RO speed variation across the die and, finally, the package speed and power
consumption data. These electrical characteristics are influenced by multiple process variables within the Fab, e.g.
random dopant fluctuation, gate oxide thickness variation, variation of the strained layers, annealing temperature
variation and others /1/. Ten years ago, the control strategy within a wafer Fab was based almost exclusively on across
wafer metrology. This may have been adequate for processes like diffusion or layer deposition, but it is not enough for
lithography. Imaging depends on the size of a pattern, its distance to neighboring patterns and the properties of the
medium to long range neighborhood. In addition, and unique to lithography, the performance can be affected by the
properties and the signature of the mask /2/. The gate layer was the first layer for which engineers started to show
interest in the mask properties /3/. At gate, the critical dimension (CD) of the active pattern translates directly into
both the drive current and the leakage current of the transistors and thus into the speed and power consumption of a
microprocessor. Fig.1 shows a 10-year-old snapshot of the CD-variation across the field of a test mask that was exposed
on a first generation KrF-tool. Although this tool was very immature, with many focus and dynamic problems, Fab
engineers recognized that mask quality can matter. Our study, which starts at the 130nm technology node, focuses
primarily on the gate layer. We relate the patterning variation, and in particular the mask variation, to the overall
electrical variation as measured by the ring oscillators. Additionally we cover the local variation of matched
transistor pairs and, in particular, discuss the impact of line width roughness (LWR). It is known that LWR can have a
significant impact on device performance /1/,/4/. We show how things improved over time and how the impact of the
different contributors changed up to today’s 45nm node. Besides the gate layer, we also touch on high-MEEF applications
like the contact layer.
25th European Mask and Lithography Conference, edited by Uwe F. W. Behringer, Proc. of SPIE
Vol. 7470, 747006 · © 2009 SPIE · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.835166
Fig.1. CD-signature across exposure field for a gate-test-reticle; ASML PAS/500 prototype, 1998; left: wafer result,
right: mask CD/4
Besides meeting the electrical parameters of a product, the Fab needs to deliver high yields. High defectivity is known
to be a major yield detractor. However, with shrinking process budgets, overlay control in conjunction with CD-variation
can play an important role for yield as well. Until 2-3 years ago, logic Fabs did not care much about the impact of the
mask, as the mask registration contribution usually was within 10-20% of the overall budget. However, because the
overlay requirements shrink much faster than the minimum feature size (fig.2), the mask registration performance is
increasingly coming to the attention of wafer Fab engineers.
Fig. 2. Overlay shrink factor in comparison to the minimum feature shrink versus logic technology
node
Fig.3 shows the range of the ring oscillator speed variation across chip in relation to the mean speed of the chip (dark
blue), together with a pattern density map of a 130nm µP-design at AMD. In that early design, only three ROs were present
in the die (TL, M, BR). The exposure was done on a 0.7NA KrF scanner. As the variation (dark blue curve) exceeded the
expected range by more than a factor of 3, a detailed root cause analysis was done. Re-measuring the mask at the sites
with the highest variation revealed a clear mask signature that was responsible for about 30% of the variation and that
correlated well with the pattern density of the mask (dark brown). The mask shop found that an inaccuracy in the fogging
correction algorithm (correction of long-range e-beam scattering effects) was responsible for that large variation. After
Fig.3. Speed distribution (blue) versus CD-distribution across die (left) and pattern density of the gate layer (right)
at the 130nm node. TL, M and BR represent three ring oscillators that are located in the corners and in the middle of
the die
Fig.4. Relative contribution to the across die speed variation of mask, wafer and non-patterning related effects at the
130nm node. Left: initial performance; right: after mask and exposure improvement
Fig.5. Relative improvement of the systematic across chip speed variation at the different technology nodes, separated
into patterning and non-patterning related contribution
Thus, the overall across die speed variation went down considerably. Fig.5 shows the improvement of the across die
speed variation from the 130nm node up to the 90nm node, normalized with respect to the 130nm status discussed in
detail above. The overall improvement is separated into the patterning contribution (mask, exposure, etch) and the
non-patterning related variation. Interestingly, the biggest step in patterning improvement was achieved between the
90nm and the 65nm node; this is discussed in detail later. A large part of the non-patterning related improvement was
due to managing the pattern density of the gate and other layers. Fig.6 shows a map of the 45nm pattern density in
comparison to the map at 130nm. Although pattern density management was greatly improved, non-lithographic variation
still dominates. In particular for dies close to the wafer edge it can become very large.
Fig.6. Pattern density at the gate layer; left: 130nm node, AMD Opteron™; right: 45nm node, Quad-Core AMD Opteron™;
same scale
Fig.7 (left) shows the across wafer speed variation at the 90nm node, together with an enlarged picture of a field at the
wafer edge. The large gradient close to the edge is similar to the effect we saw in-die due to the changing pattern
densities:
Fig. 7: Across wafer speed variation for a 90nm-dual core microprocessor and across reticle speed variation for an edge
die before and after optimization. Every dot represents one RO; red means higher speed
Fig.8. Package data of chips at the wafer edge of AMD’s first dual-core microprocessor. Red/gray: before DoseMapper
application; blue/pink: after DoseMapper correction of the in-die speed signature
Matched pairs are pairs of symmetrical transistors that are used in the SRAM and in the input-output circuits of µPs.
The functionality of those circuit parts relies on the critical devices behaving identically. The drive current and the
threshold voltages are not allowed to leave a given “design window”. From a patterning point of view, it is important
that no systematic CD-deltas occur between the two transistors. Historically, designers liked to use different pitches
for different SRAM-lines /5/. Fig.9 shows the probability plot for transistor matching, separated into symmetrical
transistors and strongly asymmetric transistors, for two scanners with different aberration levels. In the case of line
asymmetry (red curves), the maximum matching error depends heavily on the aberration status of the scanner lens; in
particular, asymmetric Zernike terms such as coma and three-foil were responsible for that strong shift. In the case of
symmetric line pairs, no systematic mismatch is seen in either case. We furthermore looked for systematic line
differences due to local variation
Fig.9. Matched pair performance for equally spaced pairs versus asymmetric spaced pairs, shown for two scanners with
different aberration level (probability plot; x-axis: matching [%], -25 to 25)
Fig.10. LWR-induced random CD-variation for the gate layer as a function of the width of the gate (x-axis: active
width/nm, 0 to 1000; curves: litho and etch, each with a fitted trend line)
But although the active width went down from node to node by 70%, the patterning contribution to the matched pair
performance could be kept nearly constant. This was achieved by improvements in the gate patterning module. The switch to
single pitch improved the performance at the 65nm node, the selection of a better resist and a smoother etch process led
to improvements at 45nm, and we are confident that we can get similar improvements with the move to the 32nm node.
However, with shrinking geometries, the increase of variation from other contributors, in particular implant shot noise
at small active areas, is unavoidable. Fig.11a shows the relative matching of transistor pairs for different active cell
areas at the 90nm node. Fig.11b shows the impact of patterning versus non-patterning related variation on the matching of
the pairs from the 90nm to the 45nm node. One might conclude that the importance of patterning improvements, in
particular of LWR, is limited. However, that is not true in general. Besides the impact of LWR/LER on matched pair
performance, a rough, grainy surface can lead to diffusion along the grain boundaries or
Fig.11a. Transistor mismatch for different active areas at the 90nm node (left) and Fig.11b: the increase of relative
mismatch from the 90nm to the 45nm technology node, separated into patterning contribution and other effects (right)
3. ACLV characterization
In the following chapter we will discuss the improvements of the across chip CD-performance (ACLV) of the gate layer
from node to node. Multiple attempts /6/ were made to separate the ACLV-performance into several independent
contributors. We differentiate ACLV into three major components: mask, exposure tool and process impact, although
we recognize that there are several inter-dependencies.
• Mask impact: contains global variation (resist, etch, pattern density impact) as well as local variation (writer)
• Exposure impact, further separated into:
  o Global (focus, dose, aberration control)
  o Pattern density / flare
  o H-V
  o OPC impact
• Process impact, separated into:
  o Topography impact
  o Random errors due to LWR
Fig.12. Gate-ACLV improvement over time. The random errors are related to the smallest active width at each
technology (recoverable legend entries: random errors, topography impact, OPC-inaccuracy, H-V CD-offset, exposure
systematic)
Engineers tend to blame the mask when they see a CD-signature that does not meet their expectations. However, once they
take the mask data and try to correlate it with the data measured on the wafer, they are often disappointed by the weak
correlation. Fig.13 shows a typical mask-wafer correlation plot (brown) that we obtained from a test exposure for a 32nm
gate layer. The correlation appears to be virtually zero. But we need to keep a simple “truth” in mind: the correlation
gets worse the better the mask, the lower the MEEF and the larger the other contributions, in particular random effects.
With all the improvements at state of the art mask shops like AMTC, the mask signature can be lost if the correlation
study is not done accurately. After doing some special “excessive” metrology at the wafer, namely:
• averaging over 5 parallel lines in both the mask and the wafer metrology, at 50 sites across wafer,
• measuring at exactly the same positions on both the mask and the wafer,
we suddenly see some correlation, although it is not great (blue dots in Fig.13).
Fig.13. Wafer to mask CD-correlation; left: CD-map, middle: correlation plot for “normal” metrology (brown) and
excessive metrology (blue); right: center “hot spot” removed from the map (x-axis: mask-CD/4; regression fits as far as
recoverable from the plot: y = 0.7967x + 20.303, R² = 0.181 for the initial metrology and y = 0.7666x + 21.695,
R² = 0.6613 for the optimized metrology)
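The dependence of the correlation on mask quality can be illustrated with a simple model. The following is a minimal
sketch, assuming for illustration only that the wafer CD is a linear function of the mask CD (at wafer scale) with slope
MEEF plus independent random contributions. The function name and all numbers are hypothetical and not taken from the
measurements.

```python
def expected_r_squared(meef, sigma_mask, sigma_other):
    """Expected R^2 of a wafer-vs-mask CD regression.

    Assumed model (illustration only): wafer_cd = meef * mask_cd + noise,
    where mask_cd has standard deviation sigma_mask (nm, wafer scale) and
    the independent noise has standard deviation sigma_other (nm).
    """
    signal_var = (meef * sigma_mask) ** 2
    return signal_var / (signal_var + sigma_other ** 2)


# Hypothetical 1-sigma values in nm; the noise level is held constant
# while the mask signature gets smaller (i.e. the mask gets better).
for sigma_mask in (1.0, 0.5, 0.25):
    r2 = expected_r_squared(meef=1.2, sigma_mask=sigma_mask, sigma_other=0.5)
    print(f"mask sigma {sigma_mask:.2f} nm -> expected R^2 = {r2:.2f}")
```

Under this model the expected R² drops as the mask sigma shrinks relative to the other contributions, although the wafer
data themselves are no worse. The same happens when the MEEF goes down or the random part goes up.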
If we removed the one point with the low CD in the field center, the wafer-mask correlation would become really
questionable (see the contour plots on the right). Again, we see that the correlation depends strongly on the mask
quality. In a next step, we calculated the residuals of the individual points with respect to the regression line. We
obtained 3σ values of 0.45nm for the blue graph and 1.02nm for the initial case. The residual is a measure that is
independent of the mask quality and thus gives a much better picture of the quality of the wafer data than the
correlation factor alone. The CD-residuals for the blue correlation plot are already quite small. Next, we averaged the
CD-signature along the scan direction and plotted the averaged residual across the slit for both the mask and the wafer
(fig.14, left). The deltas between the mask and the wafer residuals are in the range of 0.1nm. But anyway, if we compare
the remaining CD-residual-delta across slit with the corresponding calculated CD-offset due to
Fig.14. Left: CD-residual across slit after averaging along scan; right: residual-delta versus scanner illumination
uniformity induced CD-error across slit (legend: mask mean, wafer; x-axis: slit coordinate; annotation in the plot: “Are
these deltas explainable?”)
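The residual evaluation with respect to the regression line can be sketched as follows. This is a minimal illustration,
assuming an ordinary least-squares line and a residual sigma with n - 2 degrees of freedom; the paper does not state
which estimator was used, and the function name and the sample values are hypothetical.

```python
import math


def regression_residual_3sigma(mask_cd, wafer_cd):
    """Fit wafer_cd = slope * mask_cd + intercept by least squares and
    return (slope, intercept, 3-sigma of the residuals)."""
    n = len(mask_cd)
    mean_x = sum(mask_cd) / n
    mean_y = sum(wafer_cd) / n
    sxx = sum((x - mean_x) ** 2 for x in mask_cd)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mask_cd, wafer_cd))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(mask_cd, wafer_cd)]
    # Two fitted parameters, hence n - 2 degrees of freedom (an assumption).
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, 3.0 * sigma


# Hypothetical CD values in nm (mask CD already divided by 4).
mask = [43.0, 43.2, 43.4, 43.6, 43.8]
wafer = [54.3, 54.6, 54.5, 54.9, 55.0]
slope, intercept, three_sigma = regression_residual_3sigma(mask, wafer)
print(f"slope {slope:.3f}, intercept {intercept:.2f} nm, residual 3-sigma {three_sigma:.2f} nm")
```

Unlike the correlation coefficient, the residual 3σ does not shrink or grow with the size of the mask signature itself,
which is why it is the more useful figure of merit for the quality of the wafer data.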
As written above, the gate layer has very strict design rules that enable optimized illumination schemes and a MEEF as
low as 1 to 1.5. At other layers, that is not possible. As an example we discuss the CD-budget of the 45nm hole layer,
which has a MEEF of up to 3.5. At lithography, a multilayer antireflective coating is used that enables very good
reflectivity control. The across wafer, wafer to wafer and lot to lot variation can all be controlled very accurately.
What remains are mask errors, exposure errors, proximity errors and of course random errors that are related to LER.
Fig.15, left, shows a typical CD-budget at lithography for that layer. Although the MEEF is large, the random part is of
the same order as the combined mask and exposure process part. Things change radically once etch comes into play. Unlike
litho, etch processes cannot be controlled as accurately toward the edge of the wafer. Thus, the across wafer variation
becomes dominant. Furthermore, etch chamber matching cannot be done as accurately as hotplate matching at litho. The
overall CD-statistics become very etch dominated, and the mask contribution becomes negligible. However, as always
during process optimization, the weakest items get optimized first. Fig.15, right, shows the same CD-budget after the
following optimizations:
• across field focus distribution (CD-result before optimization in fig.16a, after optimization in fig.16b),
• illumination fine tuning to improve the OPC-signature,
• DoseMapper application to compensate the etch signature across wafer,
• APC at etch.
Fig.15. Contact layer CD-budget for three different cases: at litho (left), after etch (middle) and after process
optimization at both litho and etch (right) (recoverable legend entries: random, OPC-imperfection, across wafer)
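To see why the mask weighs more heavily at a high-MEEF layer, a mask CD error can be translated to wafer scale. The
following is a minimal sketch, assuming a 4x reduction scanner (consistent with the mask CD/4 scale used in the figures)
and the usual definition of MEEF as the ratio of the wafer CD change to the mask CD change expressed at wafer scale. The
function name and the 2nm mask error are hypothetical.

```python
def wafer_cd_error(mask_cd_error_nm, meef, reduction=4.0):
    """Wafer-level CD error (nm) caused by a mask CD error given at reticle scale (nm)."""
    return meef * mask_cd_error_nm / reduction


mask_error = 2.0  # nm at reticle scale; hypothetical value, not from the paper
for layer, meef in (("gate", 1.5), ("contact hole", 3.5)):
    print(f"{layer}: MEEF {meef} -> {wafer_cd_error(mask_error, meef):.2f} nm on the wafer")
```

With the MEEF values quoted in the text, the same mask error has more than twice the wafer impact at the hole layer than
at the gate layer.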
Optical proximity correction (OPC) is by now a well established process and a major enabler of continued shrinking.
The OPC-flow consists of the following steps:
1. Build a first model that is based on simulation and/or extrapolation from previous generations (model 1).
2. Build a mask with model 1, expose the mask onto the wafer and measure the wafer on certain test and circuit
patterns.
3. Judge the deviation of the CDs of these test and circuit patterns from their individual targets.
4. Create a new model (model 2) based on a lumped parameter model that uses the results of step 3.
5. Build a mask with model 2, expose the mask onto the wafer and measure the wafer on certain test and circuit
patterns.
It becomes obvious that the model quality is determined not just by the accuracy of the model, but also by the
predictability and reproducibility of the exposure process (was the same tool used?) and in particular by the
reproducibility of the mask signature. To keep manufacturing flexible, an OPC-model needs to be valid within the allowed
specification for multiple tools (at least of the same type) and for multiple masks manufactured with the same mask
process. By establishing an accurate methodology to control the OPC-stability at AMD’s primary partner mask shop, we
ensured that the mask remains one of the least varying parameters within the whole OPC-process chain.
To get better visibility of the impact of the mask registration on the wafer overlay result, we started some time ago to
place AIM-targets into the die /7/. Those targets can be measured both at the mask shop and on the wafer in a regular
production environment. Fig.16 shows the contact to gate overlay as measured on the wafer for the same reticles for three
different cases: A: both layers are exposed on one and the same machine (SMO); B: gate was exposed on machine A and PC
was exposed on machine B (MMO). As we can see, although a mask signature is seen in the MMO-case as well, the impact of a
different tool fingerprint is clearly visible. After a careful tool adjustment (C), the MMO-performance is close to the
SMO-performance. How does that intra-field performance compare to the across wafer overlay contribution? Fig.17, left,
shows the typical across wafer overlay performance of tool A and tool B (mixed machine) as we received the tools from the
supplier. A clear mismatch is seen that points to different chuck signatures. Because of this unsatisfactory initial
performance, we built up a dedicated correction and lot/wafer tracking system that ensures that every wafer gets its
optimum correction based on its history at the reference layer. This regime works properly for both mixed machine and
single machine operation if tools with dual chucks are used.
Fig.16. Intra-field overlay residuals for three different cases: A: Single machine, B: matched machine, C: matched
machine after adjustment of the intra-field signature
Fig.17. Mixed chuck/tool overlay performance. Initial (left) and after applying our dedicated tracking and control
regime (right)
Fig.18. Relative overlay performance for different improvement scenarios, normalized with respect to the initial
performance. We assumed a random (RSS) superposition of wafer and field overlay
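The RSS superposition used for this normalization can be written down in a few lines. The following is a minimal sketch
with hypothetical overlay numbers; only the root-sum-square combination of the across wafer and the intra-field part is
taken from the text, the function names and values are assumptions.

```python
import math


def rss(*components):
    """Root-sum-square combination of independent overlay contributions."""
    return math.sqrt(sum(c * c for c in components))


def relative_overlay(wafer, field, wafer_initial, field_initial):
    """Total overlay of a scenario, normalized to the initial total."""
    return rss(wafer, field) / rss(wafer_initial, field_initial)


# Hypothetical contributions in nm: across wafer part 4.0, intra-field part 3.0.
print(relative_overlay(4.0, 3.0, 4.0, 3.0))  # initial state
print(relative_overlay(2.0, 3.0, 4.0, 3.0))  # across wafer part halved
print(relative_overlay(2.0, 1.5, 4.0, 3.0))  # both parts halved
```

Because the terms add in quadrature, improving only one contribution gives diminishing returns as soon as the other one
dominates the total.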
Fig.19. Wafer to mask intra-field residual correlation plot, for an experimental reticle, y-axis (legend: data series,
unity line, trend line; x-axis: reticle overlay)
8. Conclusions
In our study we compared the patterning related variation with the overall electrical variation. For the global across
chip ring oscillator variation, we achieved a 60% improvement from the 130nm node to the 45nm node. The patterning
contribution is not negligible, though non-patterning related effects slightly dominate. Within the patterning variation
(ACLV), the mask contribution was improved significantly over time and is currently not a major concern for the gate
layer at 45nm. Even in high-MEEF applications, the mask contribution is at the lower end of the individual contributors,
with etch CD-control (lot to lot, wafer to wafer and across wafer) being a major challenge. However, any improvement in
the mask performance helps to improve mask inspectability. For overlay, state of the art mask shops deliver excellent
registration capability that fully supports the current and the next technology node. For double patterning
applications further improvements are needed.
9. References
1. Kelin J. Kuhn, Reducing Variation in Advanced Logic Technologies: Approaches to Process and Design for
Manufacturability of Nanoscale CMOS, Proceedings IEDM 2007, p. 471, paper 18.2
2. Moshe Preil, Measurement and Analysis of Reticle and Wafer Level Contributions to Total CD Variation, KLA Yield
Management Solutions, Autumn 2000
3. Rolf Seltmann, Rolf Stephan, Martin Mazur, Christopher Spence, Bruno La Fontaine, Dirk Stankowski, Andre Poock,
Wolfram Grundke, ACLV-analysis in production and its impact on product performance, Optical Microlithography XVI,
Proceedings of the SPIE, Volume 5040, pp. 530-540 (2003)
4. Ji-Young Lee, Jangho Shin, Hyun-Woo Kim, Sang-Gyun Woo, Han-Ku Cho, Woo-Sung Han, Joo-Tae Moon, Effect of line-edge
roughness (LER) and line-width roughness (LWR) on sub-100-nm device performance, Proceedings of the SPIE, Volume 5376,
pp. 426-433 (2004)
5. Arpan Mahorowala, Scott Halle, Allen Gabor, William Chu, Alexandra Barberet, Donald Samuels, Amr Abdo, Len Tsou,
Wendy Yan, Seiji Iseda, Kaushal Patel, Bachir Dirahoui, Asuka Nomura, Ishtiaq Ahsan, Faisal Azam, Gary Berg, Andrew
Brendler, Jeffrey Zimmerman, and Tom Faure, Meeting critical gate linewidth control needs at the 65 nm node, Proc. SPIE
Vol. 6156, 61560M (Mar. 14, 2006)
6. Karla A. Romero, Rolf Seltmann, Gert Burbach, Rolf Stephan, Joerg Paufler, and David Greenlaw, CD analysis of
advanced photolithography and its impact on critical design structures, Proc. SPIE Vol. 6156, 61560D (Mar. 13, 2006)
7. Bernd Schulz, Rolf Seltmann, Joerg Paufler, Philippe Leray, Aviv Frommer, Pavel Izikson, Elyakim Kassel, Mike Adel,
In-chip Overlay Metrology in 90-nm Production, Proc. IEEE, May 2005
The authors would like to thank Thomas Schmidt, AMTC, Paul Ackmann, Anna Tchikoulaeva and Andre Poock for
their input on mask related topics, Marc Staples for supporting the work and for his general input, Rolf Stephan, Karla
Romero, Sarah McGowan, Cyrus Tabery, Chris Spence, Bruno LaFontaine and Norma Rodriguez for their contribution
along the path of ACLV-improvement, and Bernd Schulz for his pioneering work in the area of in-die overlay metrology
and for his permission to use the mask-wafer-correlation overlay plot.