
Invited Paper

Mask parameter variation in the context of the overall variation budget of an advanced logic wafer Fab

Rolf Seltmann, Gert Burbach, Anne Parge, Jens Busch, Tino Hertzsch, Andre Poock, Francois
Weisbuch, Andre Holfeld
AMD Fab36 LLC & Co. KG, Wilschdorfer Landstrasse 101, D-01069 Dresden, Germany

Abstract

In this paper we discuss the variation within the patterning process in the context of the overall
electrical parameter variation in an advanced logic Fab. The evaluation is based both on the variation of ring oscillators
that are distributed across the chip and on the local variation of matched transistor pairs. Starting with a look back at
the 130nm technology, we show how conditions and requirements have changed over time. In particular, we focus on the gate
layer, where we make a detailed ACLV comparison from the 130nm technology node down to today’s 45nm node. Within
the patterning variation we pay special attention to the mask performance, including a detailed wafer-mask
correlation analysis. In addition to the low-MEEF gate layer, we show the importance of the mask CD-performance
for a typical high-MEEF layer. Finally, we discuss the mask contribution to the overall overlay error for
the most critical contact-to-gate overlay. In all of these cases, we show that the mask performance is not the limiter
within today’s most advanced technology, as long as we have access to a world class mask shop.

Keywords: ring oscillator, mask, ACLV, CD-budget, overlay

1. Introduction

Controlling process variation is one of the major tasks within a wafer Fab. The Fab management as well as the customer
is interested in a product that meets all the electrical specifications that are committed in the design manual. Relaxed
electrical specifications result in uncompetitive designs, whereas too tight ones result in unacceptable yields. In the case of a
microprocessor this requires monitoring of individual transistor parameters, of the matching of equal
transistors in close proximity, of RO speed variations across die and finally of the package speed and power consumption data.
These electrical characteristics are influenced by multiple process variables within the Fab, e.g. random dopant
fluctuation, gate oxide thickness variation, variation of the strained layers, annealing temperature variation and others /1/.
Ten years ago, the control strategy within a wafer Fab was based nearly exclusively on across wafer metrology.
This might have been adequate for processes like diffusion or layer deposition, but it is not enough for lithography. Not
only does imaging depend on the size of a pattern, its distance to neighboring patterns and the properties of the medium
to long range neighborhood; uniquely to lithography, the performance can also be affected by the properties and the
signature of the mask /2/. The gate layer was the first layer where engineers started to show interest in the mask
properties /3/. At gate, the critical dimension (CD) of the active pattern directly transfers into both the drive current and the
leakage current of the transistors and thus into the speed and power consumption of a microprocessor. Fig.1 shows a
10-year old snapshot of the CD-variation across field of a test-mask that was exposed on a first generation KrF-tool.
Although this tool was very immature, with a lot of focus and dynamic problems, Fab engineers recognized: mask
quality can matter! In our study, which starts back at the 130nm technology node, we focus primarily on the gate layer.
We relate the patterning variation, and in particular the mask variation, to the overall electrical variation as measured by the ring
oscillators. Additionally, we cover the local variation of matched transistor pairs. In particular, we discuss the impact of
Line Width Roughness (LWR). It is known that LWR can have a significant impact on device performance /1/,/4/. We
show how things improved over time and how the impact of the different contributors changed up to today’s 45nm
node. Besides the gate layer, we also touch on high MEEF applications like the contact layer.

25th European Mask and Lithography Conference, edited by Uwe F. W. Behringer, Proc. of SPIE
Vol. 7470, 747006 · © 2009 SPIE · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.835166

Proc. of SPIE Vol. 7470 747006-1

Downloaded From: http://proceedings.spiedigitallibrary.org/ on 06/27/2016 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx


[Figure: two CD maps over the exposure field (slit axis versus scan axis); CD scale ticks between 136 and 154]

Fig.1. CD-signature across exposure field for a gate-test-reticle; ASML PAS/500 prototype, 1998; left: wafer result,
right: mask CD/4

Besides the electrical parameters of a product, the Fab needs to deliver high yields. High defectivity is known to be a
major yield detractor. However, with shrinking process budgets, overlay control in conjunction with CD-variations
can play an important role for yield as well. Until 2-3 years ago, logic Fabs did not care much about the impact of the
mask, as the mask registration impact usually was within 10-20% of the overall budget. However, because the overlay
requirements shrink much faster than the minimum feature (fig.2), the mask registration performance is increasingly
attracting the attention of wafer Fab engineers.

Fig. 2. Overlay shrink factor in comparison to the minimum feature shrink versus logic technology
node

2. Electrical versus patterning variation

2.1 Global speed variation versus global CD-variation across chip

Fig.3 shows the range of ring oscillator speed variation across chip in relation to the mean speed of the chip (dark blue),
together with a pattern density map of a 130nm µP-design at AMD. In that early design, only three RO’s were present in the
die (TL, M, BR). The exposure was done on a 0.7NA KrF scanner. As the variation (dark blue curve) exceeded the
expected range by more than a factor of 3, a detailed analysis of the root causes was done. Re-measuring the mask at the
sites with the highest variation revealed a clear mask signature that was responsible for about 30% of the variation and that
correlated to the pattern density of the mask pretty well (dark brown). The mask shop found an inaccuracy in the fogging
correction algorithm (correction of long-range e-beam scattering effects) to be responsible for that big variation. After
improving the algorithm, the CD-signature improved tremendously (orange curve). But even with the new mask, a 6nm
delta between the corner and the center of the die was still present. Going further into detail, we recognized a strong exposure
effect that was related to the pattern density as well. After stray-light analysis and modeling of flare effects, we
recognized that contamination of the DUV lenses, in conjunction with the pattern density variation of the gate layer, was
responsible for the large CD-delta within the die. By having such a sensitive vehicle as a µP, we were able to
detect the effect of sulfuric and ammonium based lens contamination before it hurt us in production and before it became an
industry-wide phenomenon. The red curve represents the CD-signature after the new mask and the flare reduction program were
implemented. Although the CD-curve is quite flat, an across die speed variation (light blue) of about 10% remains that cannot be
explained by the CD-variation. After some modeling of heat absorption at the wafer and experiments with an RTA tool
from a different vendor, we recognized that thermal effects at rapid thermal annealing (RTA) were responsible for the
strong speed signature. As for patterning, the electrical signature correlates with the pattern density. It
became obvious that locally different heat absorption led to different dopant diffusion and thus to a different effective channel length
of the transistor, which translated into a pattern density dependent speed distribution. Whereas the mask, followed by flare
effects, was the strongest contributor before the litho fix, non-patterning related thermal variation dominated at the
final stage (fig.4).

[Figure: CD and relative RO speed plotted at the sites TL, M and BR; legend: CD, CD new mask, flare reduction, RO (rel. unit), RO_new]

Fig.3. Speed distribution (blue) versus CD-distribution across die (left) and pattern density of the gate layer (right) at the
130nm node. TL, M and BR represent three ring oscillators that are located in the corners and in the middle of the die

[Figure: relative contributions of mask, exposure and non-patterning effects; left: initial performance; right: after fogging correction and flare control]

Fig.4. Relative contribution to the across die speed variation of mask, wafer and non-patterning related effects at the
130nm node. Left: initial performance; right: after mask and exposure improvement



Based on that learning the following steps were taken:
• Radically improving the pattern density variation across die
• Improving the mask process, in particular the compensation of pattern density related effects (fogging
correction)
• Controlling lens flare at a very low level by the implementation of a very sensitive flare monitor
• Implementation of the DoseMapper functionality that was developed by ASML in a JDP with AMD.

Fig.5. relative improvement of the systematic across chip speed variation at the different technology nodes, separated
into patterning and non-patterning related contribution

Thus, the overall across die speed variation went down considerably. Fig.5 shows the improvement of the across die
speed variation from the 130nm node up to the 90nm node, normalized with respect to the 130nm status discussed in
detail above. The overall improvement is separated into the patterning contribution (mask, exposure, etch) and non-
patterning related variation. Interestingly, the biggest step in patterning improvement could be achieved between the
90nm and the 65nm node. This will be discussed later on in detail. A big part of non-patterning related improvement
was due to managing the pattern density of the gate and other layers. Fig. 6 shows a map of the 45nm pattern density in
comparison to the map at 130nm. Although pattern density management was greatly improved, non-lithographic
variation is still dominating. In particular at dies close to the wafer edge this can become tremendous!

Fig.6. Pattern density at the gate layer: left: 130nm node, AMD Opteron™; right: 45nm node, Quad-Core AMD
Opteron™; same scale

Fig.7 (left) shows the across wafer speed variation at the 90nm node, together with an enlarged picture of a field at the
wafer edge. The large gradient close to the edge is an effect similar to the one we saw within the die due to the changing
pattern densities: at the wafer edge, where a patterned surface meets a non-patterned wafer edge, the heat absorption changes as
a function of the radius. However, due to its excellent control, lithography was used to correct this non-patterning
related systematic variation both across wafer and across reticle field by changing the CDs locally (DoseMapper). This
was particularly useful for the first AMD dual-core µP, as any imbalance in the electrical parameters between the
cores leads to an overall speed loss. Fig.8 shows an impressive improvement of final package data due to the in-die
CD-correction at the edge dies. Correction schemes that address electrical variation rather than CD-variation are still used at
today’s 45nm technology. The extent of correction is much lower due to improvements in all areas, but advanced dose
control still helps to manage the variation.

Fig. 7: Across wafer speed variation for a 90nm-dual core microprocessor and across reticle speed variation for an edge
die before and after optimization. Every dot represents one RO; red means higher speed

[Figure: package data of edge dies; axis label: Fmax]

Fig.8: Package data of chips at the wafer edge of AMD’s first dual-core microprocessor. Red/gray: before DoseMapper
application; blue/pink: after DoseMapper correction of the in-die speed signature

2.2. Local variation of matched pairs

Matched pairs are pairs of symmetrical transistors that are used in the SRAM and in input-output circuits of µPs. The
functionality of those circuit parts relies on the critical devices behaving identically. The drive current and the threshold
voltages are not allowed to exceed a given “design window”. Patterning wise, it is important that no systematic
CD-deltas occur between the two transistors. Historically, design liked to use different pitches for different SRAM-lines
/5/. Fig.9 shows the probability plot for transistor matching, separated into symmetrical transistors and strongly
asymmetric transistors, for two scanners with different aberration levels. In the case of line asymmetry (red curves), the
maximum matching error heavily depends on the aberration status of the scanner lens; in particular, asymmetric
Zernike terms such as coma and three-foil were responsible for that strong shift. In the case of symmetric line pairs, no
systematic mismatch is seen in either case. We furthermore looked for systematic line differences due to local variation
of the mask making process. We found excellent local CD-control of 0.7nm (3σ) at the mask, which contributes less
than 10% to the overall variation. However, fig.9 also reveals that
random errors are dominating, even in the case of the “bad” lens. In a next step, we looked at the contribution of
patterning related random CD-variation due to Line Width Roughness (LWR). LWR depends on the contrast of the
aerial image, but is primarily influenced by the molecular structure of chemically amplified resists and by the etch process.
LWR directly transfers into random CD-variation. Its spatial signature for a gate layer is shown in Fig.10: the
smaller the active width of interest, the larger the gate CD-variation becomes. This has to be taken into account in
technology and design as we go down the scaling path! For geometries as small as 50nm, the random variation becomes
50% larger than for an active width of 150nm.
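The width dependence quoted above can be turned into a rough scaling rule. The following is a minimal sketch, assuming that the random CD-variation follows a power law in the active width (as the trend lines of Fig.10 suggest); the function names are hypothetical and only the two quoted data points (50nm and 150nm, ratio 1.5) are taken from the text.

```python
import math

def power_law_exponent(w_small_nm, w_large_nm, sigma_ratio):
    """Exponent p in sigma(W) ~ W**(-p), given sigma(w_small) / sigma(w_large).

    Assumption: a pure power law between the two widths.
    """
    return math.log(sigma_ratio) / math.log(w_large_nm / w_small_nm)

def scaled_sigma(sigma_ref, w_ref_nm, w_nm, p):
    """Random CD-variation at width w_nm, extrapolated from a reference width."""
    return sigma_ref * (w_ref_nm / w_nm) ** p

# Quoted in the text: 50% larger variation at 50nm than at 150nm.
p = power_law_exponent(50.0, 150.0, 1.5)
print(round(p, 3))
# Relative variation at 50nm, normalized to 1.0 at 150nm.
print(round(scaled_sigma(1.0, 150.0, 50.0, p), 3))
```

Under this assumption the exponent is well below 0.5, the value one would expect if the edge roughness were completely uncorrelated along the gate, which hints at a finite correlation length of the roughness.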

[Figure: probability plot; x-axis: matching in %, from -25 to 25]

Fig.9. Matched pair performance for equally spaced pairs versus asymmetrically spaced pairs, shown for two scanners with
different aberration levels

[Figure: random CD-variation versus active width (0 to 1000nm); series: etch, litho, each with a power-law trend line; the 50nm width is marked]

Fig.10: LWR-induced random CD-variation for the gate layer as a function of the width of the gate

But although the active width went down from node to node by 70%, the patterning contribution to the matched pair
performance could be kept nearly constant. This was achieved by improvements in the gate patterning module.
The switch to single pitch improved the performance at the 65nm node, the selection of a better resist and a smoother etch
process led to improvements at 45nm, and we are confident that we can get similar improvements at the move to the
32nm node. However, with shrinking geometries, the increase of variation related to other contributors, in particular to
implant shot noise at small active areas, is unavoidable. Fig.11a shows the relative matching of transistor pairs for
different active cell areas at the 90nm node. Fig.11b shows the impact of patterning versus non-patterning related variation
on the matching of the pairs from the 90nm to the 45nm node. One might conclude that the importance of patterning
improvements, in particular of LWR, is limited. However, that is not true in general. Besides the impact of
LWR/LER on matched pair performance, a rough, grainy surface can lead to diffusion along the grain boundaries or
to channeling effects during the tilted halo implants, with dramatic impact on the local electrical channel length and thus
on the individual transistor characteristic. Improving the LER therefore remains a major challenge in improving the device
performance.

[Figure: left: mismatch versus SQRT(W*L) with linear trend lines for the electrical and the patterning data; right: relative mismatch at the 90nm, 65nm and 45nm nodes, split into patterning and other contributions]

Fig.11a. Transistor mismatch for different active areas at the 90nm node (left) and 11b: the increase of relative mismatch from the
90nm to the 45nm technology node, separated into patterning contribution and other effects

3. ACLV characterization

In the following chapter we discuss the improvements of the across chip CD-performance (ACLV) of the gate layer
from node to node. Multiple attempts /6/ were made to separate the ACLV-performance into several independent
contributors. We differentiate ACLV into three major components: mask, exposure tool and process impact, although
we recognize that there are several inter-dependencies.
• Mask impact: contains global variation (resist, etch, pattern density impact) as well as local variation (writer)
• Exposure impact, further separated into:
o Global (focus, dose, aberration control)
o Pattern density / flare
o H-V
o OPC impact
• Process impact, separated into:
o Topography impact
o Random errors due to LWR

[Figure: stacked ACLV contributions at the 130nm*, 90nm, 65nm and 45nm nodes; legend: random errors, topography impact, OPC-inaccuracy, H-V CD-offset, pattern density (litho, etch), exposure systematic, mask (global, local, pattern density)]

Fig.12. Gate-ACLV improvement over time. The random errors are related to the smallest active width at each
technology



Fig. 12 shows the improvement of ACLV, separated into the individual contributions, from the 130nm technology node
to the 45nm node. At 130nm, not all terms were quantified thoroughly enough to give a reliable number, thus we
just introduced “best guess” numbers in some categories. The LWR-numbers are always quantified for the smallest
active width, e.g. 150nm for the 130nm node and 60nm for the 45nm node. It is clearly seen that we managed to
improve ACLV from node to node. After fixing the flare and mask issues at 130nm, as discussed above,
OPC-inaccuracy became the largest contributor at 90nm. The biggest improvement step was achieved between 90nm and
65nm by the introduction of design for manufacturing (DFM) methods. By eliminating multiple pitches in the gate layer, by
placing dummy gates on every active gate and by introducing unidirectional gate-orientation, a much more robust
patterning process could be defined that led to a clear improvement of CD-control, in particular of the OPC
contribution. As pointed out above, we managed to keep the random LWR-part about constant from node to node. But
due to all the other improvements, the random LWR-part clearly becomes the dominating contributor in today’s ACLV
budget. As we can see, the mask contribution is in the same range as the other variables, suggesting a well balanced
process.

4. Mask to wafer CD correlation

Engineers always try to blame the mask if they see a CD-signature that does not meet their expectations. However, once
they take the mask data and try to correlate it to the data as measured at the wafer, they are often disappointed by the
weak correlation. Fig.13 shows a typical mask-wafer correlation plot (brown) as we got it from a test-exposure for a 32nm
gate layer. It looks like the correlation is virtually zero. But we need to keep in mind a simple “truth”: the correlation gets worse
the better the mask, the lower the MEEF and the larger the other impacts, in particular random effects, become. And with all
the improvements at state of the art mask shops like AMTC, the mask signature can get lost if the correlation study
is not done accurately. After doing some special “excessive” metrology at the wafer, like:
• Averaging over 5 parallel lines at both the mask and wafer metrology, at 50 sites across wafer,
• Measuring at exactly the same positions at both the mask and the wafer,
we suddenly see some correlation, although it is not great (blue dots in Fig.13).
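The “truth” stated above can be illustrated with a simple variance model. This is a minimal sketch, assuming that the wafer CD is a linear function of the mask CD (at wafer scale, i.e. mask CD/4) with slope MEEF plus independent noise from all other sources; the function name and the example numbers are hypothetical and not taken from the measurements.

```python
def expected_r_squared(meef, sigma_mask, sigma_other):
    """Expected R^2 of a wafer-versus-mask CD regression.

    Assumed model: wafer_cd = offset + meef * mask_cd + noise, where
    sigma_mask is the mask CD spread at wafer scale and sigma_other is the
    spread of all non-mask effects (metrology noise, LWR, scanner, ...).
    """
    signal = (meef * sigma_mask) ** 2
    return signal / (signal + sigma_other ** 2)

# Illustrative numbers only (nm, 1 sigma).
print(expected_r_squared(1.0, 0.3, 0.6))   # good mask, low MEEF
print(expected_r_squared(1.0, 0.3, 0.2))   # same mask, noise reduced by averaging
print(expected_r_squared(3.5, 0.3, 0.6))   # same mask on a high-MEEF layer
```

In this picture, averaging over several lines and measuring at identical positions lowers `sigma_other`, which raises the correlation without any change to the mask itself.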

[Figure: wafer CD versus mask-CD/4; series: initial, metro_optimize; trend-line annotations: y = 0.7967x + 20.303 with R² = 0.181, and y = 0.7666x + 21.695 with R² = 0.6613]

Fig.13. Wafer to mask CD-correlation; left: CD-map, middle: correlation plot for “normal” metrology (brown) and
excessive metrology (blue); right: center “hot spot” removed from the map

If we removed the one point with the low CD in the field center, the wafer-mask correlation would become really
questionable (see the right contour-plots). Again, we see that the correlation strongly depends on the mask quality. In a
next step, we calculated the residuals of the individual points with respect to the regression curve. We obtained
3σ-numbers of 0.45nm for the blue graph and 1.02nm for the initial case. The residual-number is a measure that is
independent of the mask quality and thus gives us a much better picture of the quality of the wafer data than the
pure correlation factor. We recognized that the CD-residuals for the blue correlation plot are already pretty small. In a
next step, we averaged the CD-signature along the scan direction and plotted the averaged residual across slit for both
the mask and the wafer (fig.14, left). The deltas between the mask and the wafer residuals are in the range of 0.1nm.
Nevertheless, if we compare the remaining CD-residual-delta across slit with the corresponding calculated CD-offset due to
the illumination signature of the scanner across slit (fig.14, right), we see a clear similarity. Thus we can conclude that
the correlation between mask and wafer depends on two things:
1. extensive and accurate metrology and
2. how well the exposure tool is known and controlled.
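The residual analysis described in this section can be sketched in a few lines. This is a minimal illustration, assuming an ordinary least-squares line fit of wafer CD against mask CD/4 and a 3σ computed from the fit residuals; the function names and the data in the usage example are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_three_sigma(x, y):
    """3-sigma of the residuals with respect to the regression line."""
    slope, intercept = fit_line(x, y)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    # Two fitted parameters, hence n - 2 degrees of freedom.
    variance = sum(r * r for r in residuals) / (len(residuals) - 2)
    return 3.0 * math.sqrt(variance)

# Hypothetical data: mask CD/4 (x) and wafer CD (y) at four sites, in nm.
mask_cd = [0.0, 1.0, 2.0, 3.0]
wafer_cd = [0.0, 1.0, 0.0, 1.0]
print(fit_line(mask_cd, wafer_cd))
print(residual_three_sigma(mask_cd, wafer_cd))
```

Unlike the correlation coefficient, this residual figure does not shrink when the mask itself gets better, which is why the text prefers it as a measure of the wafer data quality.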

[Figure: CD-residual versus slit coordinate, from -0.4 to 0.4; series: mask mean, wafer; annotation: “Are these deltas explainable?”]

Fig.14: left: CD-residual across slit after averaging along scan and right: residual-delta versus scanner illumination
uniformity induced CD-error across slit

5. CD-budget for a high MEEF- layer

As written above, the gate layer has very strict design rules that enable optimized illumination schemes and a MEEF as
low as 1 to 1.5. At other layers, that is not possible. As an example we discuss the CD-budget of the 45nm hole
layer, which has a MEEF of up to 3.5. At lithography, a multilayer antireflective coating is used that enables a perfect
reflectivity control. The across wafer, wafer to wafer and lot to lot variations can all be controlled very accurately. The
leftovers are mask errors, exposure errors, proximity errors and of course random errors that are related to LER. Fig.15,
left, shows a typical CD-budget at lithography for that layer. Although the MEEF is big, the random part is of the same
order as the overall combined mask and exposure process part. Things change radically if etch comes into the game.
Unlike litho, etch processes cannot be controlled as accurately toward the edge of the wafer. Thus, the across wafer
variation becomes dominant. Furthermore, etch chamber matching cannot be done as accurately as hotplate matching at
litho. The overall CD-statistics become very etch dominated, and the mask contribution gets negligible. However, as
always during process optimization, the weakest items get optimized first. Fig.15, right, shows the same CD-budget after
the following optimizations:
• Optimized across field focus distribution, CD-result seen in fig.16a (before) and 16b (after) optimization,
• Illumination fine tuning to improve the OPC-signature,
• DoseMapper application to compensate the etch signature across wafer,
• APC at etch.



As a result, the CD-variation went down by about 30%, which gives enough margin for the patterning. Now, the overall
budget is very well balanced over all individual contributors. The mask still remains one of the lower end
contributors, but that does not mean that a better control of the mask parameters is not appreciated. One particular weak
point of the mask quality control is the gap between the typical CD-performance and the CD-need on one side and the
inspection accuracy with respect to the CD-deviation of a potential defect on the other side. Thus, even if inspection
does not reveal any issue, there is no 100% confidence that there is no pattern that exceeds the needed
CD-specification of the mask. As CD-deviations are detected in the die-to-die inspection mode, this becomes particularly
dangerous if the “normal” CD-distribution of the mask is wide. Two scenarios are possible:
• the inspection delivers a false alarm if the patterns that are compared are in different areas of the distribution, or
• a pattern outside the allowed range is missed if the reference pattern is on the same side of the
distribution as the defective pattern.
Thus, making the CD-distribution narrow reduces the danger of a CD-related defect that is outside of the allowed range.
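The two scenarios can be made quantitative with a simple statistical sketch. It assumes, purely for illustration, that the CD deviations of the compared patterns are independent and normally distributed with the same sigma, and that the die-to-die inspection flags a defect whenever the CD difference exceeds a fixed threshold; the function names, the threshold and the numbers are hypothetical and do not describe a real inspection tool.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def false_alarm_probability(sigma, threshold):
    """Probability that two good patterns differ by more than the threshold.

    The difference of two independent N(0, sigma^2) deviations has a
    standard deviation of sqrt(2) * sigma.
    """
    return math.erfc(threshold / (2.0 * sigma))

def miss_probability(defect_deviation, sigma, threshold):
    """Probability that a defect is missed.

    A pattern with the given CD deviation is missed when the reference
    pattern deviates far enough in the same direction that the difference
    stays below the threshold.
    """
    upper = (defect_deviation + threshold) / sigma
    lower = (defect_deviation - threshold) / sigma
    return normal_cdf(upper) - normal_cdf(lower)

# Illustrative numbers in nm: threshold 2, defect deviation 3.
for sigma in (1.0, 0.5):
    print(sigma, false_alarm_probability(sigma, 2.0), miss_probability(3.0, sigma, 2.0))
```

With these assumptions, halving the width of the CD-distribution lowers both the false alarm rate and the miss rate, in line with the conclusion of the paragraph above.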

[Figure: stacked CD-budget for the cases litho, after etch, and after etch final; legend: random, OPC-imperfection, across field (scanner), across field (mask), across wafer, lot to lot and wafer to wafer]

Fig.15: Contact layer CD-budget for three different cases: at litho (left), after etch (middle) and after process
optimization at both litho and etch (right)

6. OPC and mask

Optical proximity correction is by now a well established process that is a major enabler for continuous shrinking.
The OPC-flow consists of the following steps:
1. Build a first model that is based on simulation and/or extrapolation from previous generations (model 1),
2. Build a mask with model 1, expose the mask onto the wafer and measure the wafer on certain test and circuit
patterns,
3. Judge the deviation of the CDs of certain test and circuit patterns from their individual targets,
4. Create a new model based on a lumped parameter model that is based on step 3 (model 2),
5. Build a mask with model 2, expose the mask onto the wafer and measure the wafer on certain test and circuit
patterns.

It becomes obvious that the model quality is determined not just by the accuracy of the model, but also by the predictability
and reproducibility of the exposure process (was the same tool used?) and in particular by the reproducibility of the mask
signature. To keep manufacturing flexible, an OPC-model needs to be valid within the allowed specification for
multiple tools (at least of the same type) and multiple masks manufactured with the same mask process. By establishing
an accurate methodology to control the OPC-stability at AMD’s primary partner mask shop, we
managed to keep the mask variation one of the least varying parameters within the whole OPC-process-chain.



7. Overlay and impact of mask registration

To get better visibility of the impact of the mask registration on the wafer overlay result, we started a while ago to place
AIM-targets in the die /7/. Those targets can be measured at both the mask shop and on the wafer in a regular
production environment. Fig.16 shows a contact to gate overlay as measured on the wafer for the same reticles for three
different cases: A: both layers are exposed on one and the same machine (SMO), B: gate was exposed on machine A
and PC was exposed on machine B (MMO). As we can see, although a mask signature is seen in the MMO-case as well,
the impact of a different tool fingerprint is clearly visible. After a careful tool adjustment (C), the MMO-performance is
close to the SMO performance. How does that intra-field performance compare to the across wafer overlay
contribution? Fig.17, left, shows the typical across wafer overlay performance of tool A and tool B (mixed machine) as
we got the tools from the supplier. A clear mismatch is seen that points to different chuck signatures. Because of that
unsatisfactory initial performance, we built up a dedicated correction and lot/wafer tracking system that ensures that every
wafer gets its optimum correction based on its history at the reference layer. This regime works properly for both mixed
machine and single machine operation if tools with dual chucks are used.

[Figure: three intra-field overlay residual vector maps, cases A, B and C]

Fig.16. Intra-field overlay residuals for three different cases: A: Single machine, B: matched machine, C: matched
machine after adjustment of the intra-field signature

[Figure: two across wafer overlay vector maps]

Fig.17. Mixed chuck/tool overlay performance. Initial (left) and after applying our dedicated tracking and control
regime (right)



Fig.17 (right) shows the typical performance after installation of our correction system for mixed machine operation.
If we now compare the overall statistics for the individual cases of the wafer overlay and the intra-field overlay, we
recognize that the intra-field error only becomes noticeable if the tools are badly matched and the chuck signatures still
show a strong mismatch. In all other cases, the wafer result becomes the dominating factor. As in the CD case, we performed
a wafer-to-mask correlation analysis; fig.19 shows an example obtained on a test mask by measuring hundreds
of targets. The correlation looks limited, but again the residuals with respect to the regression curve are so small that we
can conclude that:
We have a very small imaging contribution to the intra-field overlay.
The mask performance is excellent.
Based on all these data we can conclude that the mask contribution to the overall overlay number is still limited, as long
as we get masks from state-of-the-art mask shops such as AMTC. We also took a closer look at further improvement
possibilities by applying non-linear intra-field correction schemes. Based on the variety of production masks we
have measured so far, we conclude that a further improvement of about 30% seems possible, depending on the
magnitude of imaging-related distortions (e.g. mask clamping, pellicle impact, lens distortion stability, lens
aberrations). If we assume some continuous improvement of the exposure process, we can be optimistic that we will
meet the challenges of the last non-double-patterning (DP) node. For DP, with overall overlay requirements of 2-3nm, a
clear reduction of all contributors is essential. For a Litho-Litho-Etch (LLE) process, where the wafer does not have to leave
the track, we can expect a reduction of the across-wafer content per se. Thus the mask contribution, together with the
mask exposure process, will come back into the consideration of the Fab. Based on the great performance today and the
improvements ahead of us (writer, correction schemes at the mask shop), the DP needs seem to be not unrealistic for the
most advanced mask shops.
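
The wafer-to-mask correlation argument above (limited correlation, yet small residuals around the regression line) can be illustrated with a minimal sketch. It is not the analysis tool used for this paper: the function names, the dictionary keys and the sample values are hypothetical, and an ordinary least-squares line is assumed for the regression.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correlation_summary(mask_nm, wafer_nm):
    """Regress wafer overlay on mask registration (both in nm).

    Returns the fitted slope and intercept, the coefficient of
    determination R^2, and the 3-sigma of the residuals around the line.
    """
    slope, intercept = fit_line(mask_nm, wafer_nm)
    residuals = [w - (slope * m + intercept) for m, w in zip(mask_nm, wafer_nm)]
    mean_w = sum(wafer_nm) / len(wafer_nm)
    ss_tot = sum((w - mean_w) ** 2 for w in wafer_nm)
    ss_res = sum(r ** 2 for r in residuals)
    r_squared = 1.0 - ss_res / ss_tot
    # Two fitted parameters, hence n - 2 degrees of freedom.
    sigma_res = sqrt(ss_res / (len(residuals) - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "residual_3sigma": 3.0 * sigma_res,
    }


if __name__ == "__main__":
    # Hypothetical per-target values in nm, for illustration only.
    mask = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    wafer = [-0.8, -0.1, -0.3, 0.6, 0.4, 1.3]
    print(correlation_summary(mask, wafer))
```

When the mask errors themselves span only a small range, R^2 can be modest even though the residual 3-sigma is small in absolute terms, which is why the residuals rather than R^2 alone are the relevant figure here.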

[Bar chart: relative overlay (vertical axis, 0 to 0.9), split into across-wafer and intra-field contributions, for MMO
and SMO cases, initial and after wafer and/or field fixing]

Fig.18. Relative overlay performance for different improvement scenarios, normalized with respect to the initial
performance. We assumed a random (RSS) superposition of wafer and field overlay.
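
The RSS superposition assumed for Fig.18 can be sketched as follows. The helper names and all numerical values are hypothetical and are not the data behind the figure; the sketch only shows how a normalized total follows from the two contributors.

```python
from math import sqrt


def rss(*components):
    """Root-sum-square of independent (random) error contributors."""
    return sqrt(sum(c * c for c in components))


def relative_overlay(wafer, field, wafer_initial, field_initial):
    """Total overlay for a scenario, normalized to the initial total."""
    return rss(wafer, field) / rss(wafer_initial, field_initial)


if __name__ == "__main__":
    # Hypothetical contributions in arbitrary units, for illustration only.
    wafer_0, field_0 = 4.0, 3.0
    scenarios = {
        "initial": (wafer_0, field_0),
        "wafer fixing": (0.5 * wafer_0, field_0),
        "field fixing": (wafer_0, 0.5 * field_0),
        "wafer + field fixing": (0.5 * wafer_0, 0.5 * field_0),
    }
    for name, (wafer, field) in scenarios.items():
        print(name, round(relative_overlay(wafer, field, wafer_0, field_0), 3))
```

Under RSS addition the larger contributor dominates the total, so reducing the smaller one alone changes the normalized result only slightly.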

[Scatter plot "Overlay Y": axis label Ret OVL Y; legend: Series 1, unity slope, trendline; R2 = 0.59]

Fig.19. Wafer to mask intra-field residual correlation plot, for an experimental reticle, y-axis

8. Conclusions

In our study we compared the patterning-related variation with the overall electrical variation. For the global across-
chip ring oscillator variation, we achieved a 60% improvement from the 130nm node to the 45nm node. The
patterning contribution is not negligible, though non-patterning-related effects dominate slightly. Within the
patterning variation (ACLV), the mask contribution has improved significantly over time and is currently not a
major concern for the gate layer at 45nm. Even in high-MEEF applications, the mask contribution is at the lower
end of the individual contributors, with etch CD control, both lot to lot / wafer to wafer and across wafer, being a major
challenge. However, any improvement in mask performance helps to improve mask inspectability. For overlay, state-
of-the-art mask shops deliver excellent registration capability that fully supports the current and next technology
node. For double-patterning applications, further improvements are needed.

9. References

1. Reducing Variation in Advanced Logic Technologies: Approaches to Process and Design for
Manufacturability of Nanoscale CMOS, Kelin J. Kuhn, Proceedings IEDM 2007, p. 471, paper 18.2
2. Measurement and Analysis of Reticle and Wafer Level Contributions to Total CD Variation,
Moshe Preil, KLA Yield Management Solutions, Autumn 2000
3. ACLV-analysis in production and its impact on product performance, Rolf Seltmann, Rolf Stephan, Martin
Mazur, Christopher Spence, Bruno La Fontaine, Dirk Stankowski, Andre Poock, Wolfram Grundke, Optical
Microlithography XVI, Proceedings of the SPIE, Volume 5040, pp. 530-540 (2003)
4. Effect of line-edge roughness (LER) and line-width roughness (LWR) on sub-100-nm device performance,
Ji-Young Lee, Jangho Shin, Hyun-Woo Kim, Sang-Gyun Woo, Han-Ku Cho, Woo-Sung Han, Joo-Tae Moon,
Proceedings of the SPIE, Volume 5376, pp. 426-433 (2004)
5. Meeting critical gate linewidth control needs at the 65 nm node, Arpan Mahorowala, Scott Halle, Allen Gabor,
William Chu, Alexandra Barberet, Donald Samuels, Amr Abdo, Len Tsou, Wendy Yan, Seiji Iseda, Kaushal
Patel, Bachir Dirahoui, Asuka Nomura, Ishtiaq Ahsan, Faisal Azam, Gary Berg, Andrew Brendler, Jeffrey
Zimmerman, and Tom Faure, Proc. SPIE Vol. 6156, 61560M (Mar. 14, 2006)
6. CD analysis of advanced photolithography and its impact on critical design structures,
Karla A. Romero, Rolf Seltmann, Gert Burbach, Rolf Stephan, Joerg Paufler, and David Greenlaw,
Proc. SPIE Vol. 6156, 61560D (Mar. 13, 2006)
7. In-chip Overlay Metrology in 90-nm Production, Bernd Schulz, Rolf Seltmann, Joerg Paufler, Philippe Leray,
Aviv Frommer, Pavel Izikson, Elyakim Kassel, Mike Adel, Proc. IEEE, May 2005

10. Acknowledgements

The authors would like to thank Thomas Schmidt, AMTC, Paul Ackmann, Anna Tchikoulaeva and Andre Poock for
their input on mask-related topics, Marc Staples for supporting the work and for his general input, Rolf Stephan, Karla
Romero, Sarah McGowan, Cyrus Tabery, Chris Spence, Bruno LaFontaine and Norma Rodriguez for their contributions
along the path of ACLV improvement, and Bernd Schulz for his pioneering work in the area of in-die overlay metrology
and for his permission to use the mask-wafer correlation overlay plot.
