
ClassWork 05: 2^k design

Exercise 1 [M]

A router is used to cut locating notches on a printed circuit board. The vibration level at the surface
of the board as it is cut is considered to be a major source of dimensional variation in the notches.
Two factors are thought to influence vibration: bit size (A) and cutting speed (B). Two bit sizes
and two speeds are selected, and four boards are cut at each set of conditions shown below. The
response variable is the vibration measured by accelerometers.
A B I II III IV
- - 18,2 18,9 12,9 14,4
+ - 27,2 24 22,4 22,5
- + 15,9 14,5 15,1 14,2
+ + 41 43,9 36,3 39,9

a) Define an experiment order;


b) Analyze the data from this experiment (α=0.05);
c) What levels of bit size and speed would you recommend for routine operation?

Solution

a) Define an experiment order;


In order to define a run order, we can easily generate a random permutation from the 16! possible
ones and use it to run the experiments.
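As an illustration only, one way to draw such a permutation is sketched below in Python. The seed value and the variable names are assumptions made for this example; any random number generator does the same job.

```python
import random

# Treatment combinations of the 2^2 design (A = bit size, B = cutting speed).
treatments = [("-", "-"), ("+", "-"), ("-", "+"), ("+", "+")]
n_replicates = 4

# One entry per physical run: 4 treatments x 4 replicates = 16 runs.
runs = [t for t in treatments for _ in range(n_replicates)]

# The seed is arbitrary; it is fixed here only so that the order can be reproduced.
rng = random.Random(2024)
run_order = rng.sample(runs, k=len(runs))  # one random permutation of the 16 runs

for position, (a, b) in enumerate(run_order, start=1):
    print(f"run {position:2d}: A={a} B={b}")
```

Randomizing the complete list of 16 runs, rather than randomizing within each replicate, corresponds to a completely randomized design.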

b) Analyze the data from this experiment (α=0.05);


Because the experiment is replicated, we start the analysis by plotting the data with the individual
value plot.

From the analysis of the individual value plot, no evident outliers appear. Factor A seems more
relevant than factor B. In the interaction plot, the two factors show nonparallel lines, indicating a
probable interaction.

Let us do the analysis by hand


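\[
SS_A = \frac{\mathrm{CONTRAST}_A^2}{n\,2^k} = \frac{\left(-(1) + a - b + ab\right)^2}{4 \cdot 2^2} = \frac{(-64.4 + 96.1 - 59.7 + 161.1)^2}{16} = 1107.226
\]
\[
SS_B = \frac{\mathrm{CONTRAST}_B^2}{n\,2^k} = \frac{\left(-(1) - a + b + ab\right)^2}{4 \cdot 2^2} = \frac{(-64.4 - 96.1 + 59.7 + 161.1)^2}{16} = 227.256
\]
\[
SS_{AB} = \frac{\mathrm{CONTRAST}_{AB}^2}{n\,2^k} = \frac{\left(+(1) - a - b + ab\right)^2}{4 \cdot 2^2} = \frac{(+64.4 - 96.1 - 59.7 + 161.1)^2}{16} = 303.631
\]
\[
SS_{TOT} = \sum_{i,j,k} y_{ijk}^2 - \frac{y_{\cdot\cdot\cdot}^2}{n\,2^k} = 10796.7 - \frac{(381.3)^2}{16} = 1709.84
\]
\[
SS_E = SS_{TOT} - SS_A - SS_B - SS_{AB} = 71.72
\]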
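The hand computation can be cross-checked with a few lines of Python. This is a minimal sketch: the dictionary layout and the variable names are choices made for this example, and the data are those of the exercise written with decimal points.

```python
# Vibration data from the exercise, keyed by treatment label.
data = {
    "(1)": [18.2, 18.9, 12.9, 14.4],
    "a":   [27.2, 24.0, 22.4, 22.5],
    "b":   [15.9, 14.5, 15.1, 14.2],
    "ab":  [41.0, 43.9, 36.3, 39.9],
}
n, k = 4, 2
N = n * 2**k  # total number of observations

totals = {label: sum(values) for label, values in data.items()}

# Sign of each treatment total in the contrasts of A, B and AB.
signs = {
    "A":  {"(1)": -1, "a": +1, "b": -1, "ab": +1},
    "B":  {"(1)": -1, "a": -1, "b": +1, "ab": +1},
    "AB": {"(1)": +1, "a": -1, "b": -1, "ab": +1},
}

# Sum of squares of each term: contrast^2 / (n 2^k).
ss = {}
for term, s in signs.items():
    contrast = sum(s[label] * totals[label] for label in totals)
    ss[term] = contrast**2 / N

grand_total = sum(totals.values())
sum_of_squares = sum(y * y for values in data.values() for y in values)
ss_total = sum_of_squares - grand_total**2 / N
ss_error = ss_total - sum(ss.values())

ms_error = ss_error / (2**k * (n - 1))          # 12 degrees of freedom
f0 = {term: ss[term] / ms_error for term in ss}  # F statistics of the ANOVA table

for term in ss:
    print(f"{term:2s}  SS = {ss[term]:9.3f}  F0 = {f0[term]:7.2f}")
print(f"Error SS = {ss_error:.3f}, Total SS = {ss_total:.3f}")
```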

The ANOVA table is:

Source SS df MS F0 F0.017 (1,12)


A 1107.226 1 1107.226 185.23 7.67
B 227.256 1 227.256 38.03 7.67
AB 303.631 1 303.631 50.80 7.67
Error 71.723 12 5.977
Total 1709.84 15

We use Minitab to complete the analysis

General Linear Model: Vibration versus A; B


Method
Factor coding (-1; 0; +1)

Factor Information
Factor Type Levels Values
A Fixed 2 -1; 1
B Fixed 2 -1; 1

Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
A 1 1107,23 1107,23 185,25 0,000
B 1 227,26 227,26 38,02 0,000
A*B 1 303,63 303,63 50,80 0,000
Error 12 71,72 5,98
Total 15 1709,83

Model Summary
S R-sq R-sq(adj) R-sq(pred)
2,44476 95,81% 94,76% 92,54%

Let us check the residual assumptions before drawing the conclusions.

Test for Equal Variances: SRES vs A; B
Multiple comparison intervals for the standard deviation, α = 0,05
Multiple Comparisons: P-Value 0,092
Levene’s Test: P-Value 0,234

Looking at the scatterplot, no outliers appear. The normality hypothesis (p-value=0.867) and the
hypothesis of homogeneous variance (p-value=0.234) cannot be rejected. In conclusion, both
factors are significant as well as their interaction.

c) What levels of bit size and speed would you recommend for routine operation?

A multiple comparison is required because we have to choose the combination of factor levels that
minimizes the response (vibration). We use the GLM command with the Tukey method
(Stat > ANOVA > GLM > Comparison); strictly speaking, the Scheffé and t-Bonferroni constants
should be computed first in order to justify the choice of the Tukey method. The results show that
the combination of parameters to choose is either (A-, B+) or (A-, B-), which are not significantly
different from each other.

Grouping Information Using the Tukey Method and 95% Confidence


A*B N Mean Grouping
1 1 4 40,275 A
1 -1 4 24,025 B
-1 -1 4 16,100 C
-1 1 4 14,925 C
Means that do not share a letter are significantly different.

Exercise 2 [M]

An engineer is interested in the effects of cutting speed (A), tool geometry (B), and cutting angle
(C) on the life (in minutes) of a tool. Two levels of each factor are chosen, and three replicates are
run. The results follow:

A B C I II III
- - - 22 31 25
+ - - 32 43 29
- + - 35 34 50
+ + - 55 47 46
- - + 44 45 38
+ - + 40 37 36
- + + 60 50 54
+ + + 39 41 47

a) Define an experiment order;


b) Estimate the factor effects and analyze the data with α=5%;
c) Write down a regression model for predicting tool life (in minutes) ;
d) Based on the analysis, what factor levels would you recommend using?

Solution

a) Define an experiment order;


In order to define a run order, we have to generate a random permutation among the 24! possible ones.
To set up the data set, we can use the command Stat > DOE > Factorial > Create. This command
generates a worksheet with a set of columns; among them, let us focus on the Standard order column
and on the Run order column, which contains the run order we were looking for.
To have the sign table in standard order, you can either disable randomization (option: No
randomize) or sort the data using the Data > Sort command.

b) Estimate the factor effects and analyze the data with α=5%;

Let us analyze the data graphically:

The individual value plot indicates that no evident outliers appear and the variability among the
factor levels appears uniform.

The factors Tool geometry (B) and Cutting angle (C) are more relevant than Cutting speed (A).
Possible relevant interactions are speed*geometry (AB) and speed* angle (AC).
Note that Tool Geometry is a discrete Factor (just two geometries were tested). Minitab calls it Text
factor.

MINITAB estimates the effects through the command: Stat > DOE > Factorial > Analyze


(The Analyze command performs the analysis too).

Factorial Regression: T versus A; B; C


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 7 1612,67 230,381 7,64 0,000
Linear 3 1051,50 350,500 11,62 0,000
A 1 0,67 0,667 0,02 0,884
B 1 770,67 770,667 25,55 0,000
C 1 280,17 280,167 9,29 0,008
2-Way Interactions 3 533,00 177,667 5,89 0,007
A*B 1 16,67 16,667 0,55 0,468
A*C 1 468,17 468,167 15,52 0,001
B*C 1 48,17 48,167 1,60 0,224
3-Way Interactions 1 28,17 28,167 0,93 0,348
A*B*C 1 28,17 28,167 0,93 0,348
Error 16 482,67 30,167
Total 23 2095,33

Model Summary
S R-sq R-sq(adj) R-sq(pred)
5,49242 76,96% 66,89% 48,17%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 40,83 1,12 36,42 0,000
A 0,33 0,17 1,12 0,15 0,884 1,00
B 11,33 5,67 1,12 5,05 0,000 1,00
C 6,83 3,42 1,12 3,05 0,008 1,00
A*B -1,67 -0,83 1,12 -0,74 0,468 1,00
A*C -8,83 -4,42 1,12 -3,94 0,001 1,00
B*C -2,83 -1,42 1,12 -1,26 0,224 1,00
A*B*C -2,17 -1,08 1,12 -0,97 0,348 1,00
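The Effect column can be reproduced directly from the data with the sign table. The sketch below is only an illustration of the contrast method, effect = contrast / (n 2^(k-1)); the variable names are assumptions of this example and it is not what Minitab runs internally.

```python
from itertools import combinations, product
from math import prod

# Tool life data (three replicates per run) in standard order: A changes fastest.
life = [
    [22, 31, 25],  # (1)
    [32, 43, 29],  # a
    [35, 34, 50],  # b
    [55, 47, 46],  # ab
    [44, 45, 38],  # c
    [40, 37, 36],  # ac
    [60, 50, 54],  # bc
    [39, 41, 47],  # abc
]
n, k = 3, 3
factors = "ABC"

# Sign table in standard order. product() varies the last position fastest,
# so each tuple is reversed to make A the fastest-changing factor.
design = [levels[::-1] for levels in product((-1, 1), repeat=k)]
totals = [sum(row) for row in life]

effects = {}
for size in range(1, k + 1):
    for term in combinations(range(k), size):
        name = "".join(factors[i] for i in term)
        # The sign of a run in the contrast is the product of its factor signs.
        contrast = sum(prod(run[i] for i in term) * t for run, t in zip(design, totals))
        effects[name] = contrast / (n * 2 ** (k - 1))

grand_mean = sum(totals) / (n * 2**k)
print(f"Constant = {grand_mean:.2f}")
for name, value in effects.items():
    print(f"{name:3s} effect = {value:6.2f}, coefficient = {value / 2:6.2f}")
```

Each coded coefficient is half of the corresponding effect, because the coded variable moves by two units between the low and the high level.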

Let us check the residual assumptions.

Tests
Test
Method Statistic P-Value
Multiple comparisons — 0,614
Levene 0,24 0,969

The scatterplots show no evident outliers: all the standardized residuals lie in the interval
(-3;+3). The hypothesis of homogeneous variance cannot be rejected. The normality hypothesis
cannot be rejected either, although its p-value is low. If we reduce the model to Geometry, Angle
and Speed*Angle only, the normality hypothesis is again not rejected, this time with a higher
p-value, as you can see in the next plot.

In conclusion, the significant terms are Geometry, Angle and Speed*Angle.
Even if it is not mandatory, we add the Speed factor to the model in order to have a hierarchical
model.

c) Write down a regression model for predicting tool life (in minutes);
The model to predict tool life is:
\[
\hat{T} = 40.833 + 0.167\,Speed + 5.667\,Geometry + 3.417\,Gamma - 4.417\,Speed \cdot Gamma
\]
Note that Geometry can assume only the values -1 and +1 because it is a discrete factor.

d) Based on the analysis, what factor levels would you recommend using?
From the equation, it is convenient to choose a low level of Speed, a high level of Geometry and a
high level of Angle, which gives a point prediction of the tool life of about 54.17 min.
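The recommendation can be checked by evaluating the fitted equation at the eight corners of the design. This is a small sketch with function and variable names chosen for the example; the coefficients are those of the model given in point c).

```python
from itertools import product

def predicted_life(speed, geometry, gamma):
    """Reduced regression model in coded units (-1 = low level, +1 = high level)."""
    return (40.833 + 0.167 * speed + 5.667 * geometry
            + 3.417 * gamma - 4.417 * speed * gamma)

# Predicted tool life at every corner (Speed, Geometry, Gamma) of the design.
corners = {levels: predicted_life(*levels) for levels in product((-1, 1), repeat=3)}
best = max(corners, key=corners.get)

for levels, value in sorted(corners.items(), key=lambda item: -item[1]):
    print(levels, round(value, 2))
```

Only the corners are examined because the model is linear in each factor, so its maximum over the experimental region is attained at a corner.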

Instead, if we use the multiple comparison approach with the GLM Comparison command and the
Tukey option (note that the confidence level for each set of comparisons is 97,5% in order to have
a family error rate of 5%), the result is

Comparisons for T
Tukey Pairwise Comparisons: Response = T, Term = Geometry

Grouping Information Using the Tukey Method and 97,5% Confidence


Geometry N Mean Grouping
1 12 46,5000 A
-1 12 35,1667 B
Means that do not share a letter are significantly different.

Tukey Simultaneous 97,5% CIs


Tukey Pairwise Comparisons: Response = T, Term = Speed*Gamma
Grouping Information Using the Tukey Method and 97,5% Confidence
Speed*Gamma N Mean Grouping
-1 1 6 48,5000 A
1 -1 6 42,0000 A B
1 1 6 40,0000 A B
-1 -1 6 32,8333 B
Means that do not share a letter are significantly different.

That is:
• The high level of Geometry must definitely be chosen.
• For the Gamma factor we need further evidence to make a choice, even if the combination of high
Gamma and low Speed seems quite promising.

Exercise 3 [M]

Reconsider the experiment described in the Exercise 2 [M]. Suppose the experimenter only
performed the eight trials from the first replicate.
a) Analyze the data with α=0.05. Perform manually the Lenth method;
b) Write down a regression model for predicting the tool life (in minutes). What factor levels
would you recommend using?

Solution

a) Analyze the data (α=0.05). Perform manually the Lenth method.

Factors B (Geometry) and C (Angle) appear more relevant than factor A (Speed). In the interaction
plot, the factors Speed and Angle show nonparallel lines.

Since we do not have replicates, the error variance cannot be estimated directly; we compute the
pseudo standard error (PSE) with the Lenth method instead. Let us do it by hand:

1. The effects are estimated using the MINITAB command: Stat > DOE > Factorial > Analyze
Factorial Regression: T versus A; B; C
Coded Coefficients
Term Effect Coef
Constant 40,88
A 1,2500 0,6250
B 12,750 6,375
C 9,750 4,875
A*B -1,7500 -0,8750
A*C -13,750 -6,875
B*C -5,250 -2,625
A*B*C -6,750 -3,375

2. Let us calculate the absolute values of the effects and sort them.

A AB BC ABC C B AC
1,25 1,75 5,25 6,75 9,75 12,75 13,75

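3. The median of the absolute effects is 6,75, so s0 = 1,5 × median = 10,125 and 2,5 s0 = 25,31. No
absolute effect exceeds 2,5 s0, hence

\[
PSE = 1.5 \cdot \mathrm{median}\left\{ |Eff| : |Eff| < 2.5\, s_0 \right\} = 1.5 \cdot 6.75 = 10.125
\]

4. Let us calculate ME and SME with α=0.10

\[
ME_{\alpha} = t_{\alpha/2}(m/3)\, PSE = t_{0.10/2}(7/3) \times 10.125 = 2.65 \times 10.125 = 26.83
\]
\[
SME_{\alpha} = t_{\gamma}(m/3)\, PSE \quad \text{with} \quad \gamma = \frac{1 - (1 - \alpha_{FAM})^{1/m}}{2} = \frac{1 - (1 - 0.10)^{1/7}}{2} = 0.007469
\]
\[
SME = t_{0.007469}(7/3) \times 10.125 = 6.566 \times 10.125 = 66.48
\]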

Summing up, no effect is significant at 10%. We can confirm this result using Minitab:
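The Lenth procedure applied to these seven effects can be sketched in Python as follows. The function name lenth_pse is an assumption of this example. The t quantiles with m/3 degrees of freedom are not computed: they are copied from the hand calculation (2.65 and 6.566), since the Python standard library has no t distribution.

```python
from statistics import median

def lenth_pse(effects):
    """Lenth's pseudo standard error for a collection of effect estimates."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)
    # Effects that look active (|effect| >= 2.5 s0) are excluded from the median.
    trimmed = [e for e in abs_effects if e < 2.5 * s0]
    return 1.5 * median(trimmed)

effects = {"A": 1.25, "B": 12.75, "C": 9.75, "AB": -1.75,
           "AC": -13.75, "BC": -5.25, "ABC": -6.75}
m = len(effects)
alpha = 0.10

pse = lenth_pse(effects.values())
gamma = (1 - (1 - alpha) ** (1 / m)) / 2  # tail area for the simultaneous margin

# t quantiles with m/3 = 7/3 degrees of freedom, taken from the text (not computed here).
t_me, t_sme = 2.65, 6.566
me = t_me * pse
sme = t_sme * pse

beyond_me = [name for name, e in effects.items() if abs(e) > me]
print(f"PSE = {pse}, gamma = {gamma:.6f}, ME = {me:.2f}, SME = {sme:.2f}")
print("Effects beyond ME:", beyond_me)
```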

[Figure: Normal Plot of the Effects (response is T; α = 0,10); x-axis: Effect, y-axis: Percent; Lenth’s PSE = 10,125]

The MINITAB command is: Stat > DOE > Factorial > Analyze > Graphs: Normal


You can change the α-level by changing the confidence level in the Options:
Option: Confidence level for all intervals 90 to have α=10%

Exercise 4 [M]

Reconsider the experiment described in the Exercise 2[M]. Suppose the experimenter only
performed the eight trials from the first replicate and, in addition, he ran center points and obtained
the following response values:

A B C
0 -1 0 36 40 43 45
0 +1 0 50 54 59 61

a) Analyze the data (α=0.05);


b) Perform manually the test for curvature.

Solution

In order to add the center points to the design, you can create a 2^k factorial design with three
factors and one replicate, and then add the last 8 rows with the center points to the worksheet by hand.

a) Analyze the data (α=0.05).

The center points allow the experimenter to estimate the error variance. Factor B is a
categorical factor, thus it cannot have a center point: the center runs are performed at both levels of B.

[Figure: Main Effects Plot for T and Interaction Plot for T (data means) for factors A, B and C]

The factors B and C seem to affect the tool life. The interaction AC seems relevant.

Factorial Regression: T versus A; B; C; CenterPt


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 8 1726,38 215,797 12,08 0,002
Linear 3 963,31 321,104 17,97 0,001
A 1 3,13 3,125 0,17 0,688
B 1 770,06 770,063 43,10 0,000
C 1 190,12 190,125 10,64 0,014
2-Way Interactions 3 439,37 146,458 8,20 0,011
A*B 1 6,12 6,125 0,34 0,577
A*C 1 378,12 378,125 21,16 0,002
B*C 1 55,13 55,125 3,09 0,122
3-Way Interactions 1 91,12 91,125 5,10 0,058
A*B*C 1 91,12 91,125 5,10 0,058
Curvature 1 232,56 232,563 13,02 0,009
Error 7 125,06 17,866
Lack-of-Fit 1 5,06 5,062 0,25 0,633
Pure Error 6 120,00 20,000
Total 15 1851,44

Model Summary
S R-sq R-sq(adj) R-sq(pred)
4,22683 93,25% 85,53% 54,98%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 40,88 1,49 27,35 0,000
A 1,25 0,63 1,49 0,42 0,688 1,00
B 13,88 6,94 1,06 6,57 0,000 1,00
C 9,75 4,87 1,49 3,26 0,014 1,00
A*B -1,75 -0,88 1,49 -0,59 0,577 1,00
A*C -13,75 -6,88 1,49 -4,60 0,002 1,00
B*C -5,25 -2,63 1,49 -1,76 0,122 1,00
A*B*C -6,75 -3,38 1,49 -2,26 0,058 1,00
Ct Pt 7,63 2,11 3,61 0,009 1,00

Before drawing the conclusions, we have to check the residual assumptions.

[Figure: Scatterplot of SRES1 vs FITS1; A; C]
[Figure: Probability Plot of SRES1 (Normal): Mean 7,216450E-16; StDev 0,8449; N 16; AD 0,575; P-Value 0,115]

We can try to reduce the model to the significant terms (B, C and AC, plus A for hierarchy) to see
if the residuals improve.

Factorial Regression: T versus A; B; C; CenterPt


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 5 1574,00 314,800 11,35 0,001
Linear 3 963,31 321,104 11,57 0,001
A 1 3,12 3,125 0,11 0,744
B 1 770,06 770,063 27,76 0,000
C 1 190,12 190,125 6,85 0,026
2-Way Interactions 1 378,12 378,125 13,63 0,004
A*C 1 378,12 378,125 13,63 0,004
Curvature 1 232,56 232,562 8,38 0,016
Error 10 277,44 27,744
Lack-of-Fit 4 157,44 39,359 1,97 0,219
Pure Error 6 120,00 20,000
Total 15 1851,44

Model Summary
S R-sq R-sq(adj) R-sq(pred)
5,26723 85,02% 77,52% 46,26%

Let us analyze the residuals.


[Figure: Scatterplot of SRES2 vs FITS2; A; C]
[Figure: Probability Plot of SRES2 (Normal): Mean 1,387779E-17; StDev 1,101; N 16; AD 0,178; P-Value 0,902]

No outliers appear in the analysis. The hypothesis of normality cannot be rejected. Notice that the
residuals at the low level of A seem to show a lower variability.

We observe that:
• It is possible to calculate the ANOVA table because we have an estimate of the variance
thanks to the center points;
• An estimate of the curvature is given and it is significant. We suggest adding axial points
in the next experiment in order to fit a second-order model;
• In conclusion, the significant terms are: B, C and AC.

b) Perform manually the test for curvature.

The test for curvature is

\[
F_0 = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{MS_E\,(n_F + n_C)} > F_{\alpha}(1, df_E)
\]
where:
\[
df_E = 2^k (n - 1) + (n_C - 1) + n_{\text{removed terms}} = 0 + 7 + 3 = 10
\]

Then,
\[
\frac{8 \cdot 8 \cdot (40.875 - 48.5)^2}{27.74 \cdot (8 + 8)} = 8.3836
\]
\[
\mathrm{Prob}\{F(1,10) > 8.38\} = 0.016
\]
Summing up, the curvature is significant.
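The statistic can be recomputed from the raw data as a check. The sketch below uses variable names chosen for this example and takes the mean square error of the reduced model (27.744, 10 degrees of freedom) from the ANOVA table instead of refitting the model.

```python
# Responses of the first replicate (2^3 corner runs) and of the center-point runs.
factorial_runs = [22, 32, 35, 55, 44, 40, 60, 39]
center_runs = [36, 40, 43, 45, 50, 54, 59, 61]

n_f, n_c = len(factorial_runs), len(center_runs)
mean_f = sum(factorial_runs) / n_f
mean_c = sum(center_runs) / n_c

# Single-degree-of-freedom sum of squares for curvature.
ss_curvature = n_f * n_c * (mean_f - mean_c) ** 2 / (n_f + n_c)

ms_error = 27.744  # MS_E of the reduced model, read from the ANOVA table
f0 = ss_curvature / ms_error

print(f"mean_F = {mean_f}, mean_C = {mean_c}")
print(f"SS_curvature = {ss_curvature:.4f}, F0 = {f0:.2f}")
```

The curvature sum of squares obtained this way agrees with the Curvature row of the Minitab output (232,56).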

Exercise 5 [M]

A 2^4 factorial design was used to study a nitride etch process on a single-wafer plasma etching tool.
The process used C2F6 as the reactant gas. The design factors are the gap between the electrodes
(A), the pressure of the reaction chamber (B), the gas flow (C), and the RF power applied to the
cathode (D). Each factor is run at two levels, and the design is not replicated.
The response variable is the etch rate for silicon nitride (Å/min). The etch rate data, listed in
standard order, are:
550 669 604 650 633 642 601 635 1037 749 1052 868 1075 860 1063 729

The execution order was:


13 8 12 9 4 15 16 3 1 14 5 10 11 2 7 6

The factors (and the levels) were:


A = Gap (0.80 cm, 1.20 cm);
B = Pressure (450 mTorr, 550 mTorr)
C = C2F6 flow (125 SCCM, 200 SCCM);
D = Power (275 W, 325 W).

a) Analyze the data (α=5%);


b) Using a Bonferroni approach, project the design into a 2^k design in the important factors;
c) Write down the regression model and generate a response surface contour plot of the etch
rate;
d) What operating condition would you recommend if an etch rate equal to 800 angstrom/min
was necessary?

Solution

a) Analyze the data;


First of all we need to order the data according to the sign table (in order to obtain the standard
order and easily insert them into Minitab).

Std Order A B C D Run order Rate


1 − − − − 13 550
2 + − − − 8 669
3 − + − − 12 604
4 + + − − 9 650
5 − − + − 4 633
6 + − + − 15 642
7 − + + − 16 601
8 + + + − 3 635
9 − − − + 1 1037
10 + − − + 14 749
11 − + − + 5 1052
12 + + − + 10 868
13 − − + + 11 1075
14 + − + + 2 860
15 − + + + 7 1063
16 + + + + 6 729

[Figure: Main Effects Plot for rate and Interaction Plot for rate (data means) for factors A, B, C and D]

Factor D seems to influence the response more than the other factors. The most relevant interaction is
AD.

Let us use the Lenth method (at 5%) for a preliminary evaluation of the data.
The MINITAB command is: Stat > DOE > Factorial > Analyze > Graphs: Normal
Remember that each effect, if not relevant, follows a Gaussian distribution with mean zero and a
common variance (if the variance is not known, you can use its estimate).

Let us apply the Lenth method by hand (it is not required by the text of the exercise).
1. The sorted effects (obtained by MINITAB) and their absolute values are:
EFF Effect AbsEff
-0.625 BD 0.625
-1.625 B 1.625
-2.125 CD 2.125
4.125 ABD 4.125
5.625 ACD 5.625
7.375 C 7.375
-7.875 AB 7.875
-15.625 ABC 15.625
-24.875 AC 24.875
-25.375 BCD 25.375
-40.125 ABCD 40.125
-43.875 BC 43.875
-101.625 A 101.625
-153.625 AD 153.625
306.125 D 306.125

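2. The median of the absolute effects is 15.625, so
\[
s_0 = 1.5 \times 15.625 = 23.4375, \qquad 2.5\, s_0 = 58.59
\]
\[
PSE = 1.5 \times \mathrm{median}\left\{ |EFF| : |EFF| < 2.5\, s_0 \right\} = 1.5 \times \frac{7.375 + 7.875}{2} = 11.4375
\]
(as indicated in the Minitab graph)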

3. Let us evaluate ME and SME

\[
ME = t_{\alpha/2}(m/3)\, PSE = t_{0.025}(5)\, PSE = 2.57 \cdot 11.4375 = 29.39
\]
The relevant effects are: ABCD, BC, A, AD, D

\[
SME = t_{\gamma}(m/3)\, PSE, \qquad \gamma = \frac{1 - (1 - \alpha_{FAM})^{1/m}}{2} = 0.0017
\]
\[
SME = t_{0.0017}(5)\, PSE = 5.224 \cdot 11.4375 = 59.75
\]
The relevant effects are: A, AD and D.
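The whole chain, from the sixteen responses in standard order to the lists of relevant effects, can be sketched in Python. Variable names are assumptions of this example, and the two t quantiles with m/3 = 5 degrees of freedom (2.57 and 5.224) are copied from the hand calculation rather than computed.

```python
from itertools import combinations, product
from math import prod
from statistics import median

# Etch rate in standard order (A changes fastest, D slowest).
rate = [550, 669, 604, 650, 633, 642, 601, 635,
        1037, 749, 1052, 868, 1075, 860, 1063, 729]
k = 4
factors = "ABCD"

# Sign table in standard order: reversing each tuple makes A vary fastest.
design = [levels[::-1] for levels in product((-1, 1), repeat=k)]

effects = {}
for size in range(1, k + 1):
    for term in combinations(range(k), size):
        name = "".join(factors[i] for i in term)
        contrast = sum(prod(run[i] for i in term) * y for run, y in zip(design, rate))
        effects[name] = contrast / 2 ** (k - 1)  # unreplicated design: n = 1

# Lenth's pseudo standard error.
abs_effects = [abs(e) for e in effects.values()]
s0 = 1.5 * median(abs_effects)
pse = 1.5 * median([e for e in abs_effects if e < 2.5 * s0])

t_me, t_sme = 2.57, 5.224  # t quantiles with 5 df, taken from the text
me, sme = t_me * pse, t_sme * pse

beyond_me = sorted(name for name, e in effects.items() if abs(e) > me)
beyond_sme = sorted(name for name, e in effects.items() if abs(e) > sme)
print(f"PSE = {pse}, ME = {me:.2f}, SME = {sme:.2f}")
print("Beyond ME: ", beyond_me)
print("Beyond SME:", beyond_sme)
```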

The situation is complicated by the four-factor interaction ABCD, which prevents us from having a
fully hierarchical model (it would be the complete model). Instead, we use the model
y=A+B+C+D+BC+AD+ABCD, where the main effects B and C are added for a hierarchical approach.

Factorial Regression: rate versus A; B; C; D


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 7 524931 74990 92,44 0,000
Linear 4 416389 104097 128,33 0,000
A 1 41311 41311 50,93 0,000
B 1 11 11 0,01 0,912
C 1 218 218 0,27 0,619
D 1 374850 374850 462,10 0,000
2-Way Interactions 2 102103 51051 62,93 0,000
A*D 1 94403 94403 116,38 0,000
B*C 1 7700 7700 9,49 0,015
4-Way Interactions 1 6440 6440 7,94 0,023
A*B*C*D 1 6440 6440 7,94 0,023
Error 8 6489 811
Total 15 531421

Model Summary
S R-sq R-sq(adj) R-sq(pred)
28,4814 98,78% 97,71% 95,12%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 776,06 7,12 108,99 0,000
A -101,62 -50,81 7,12 -7,14 0,000 1,00
B -1,62 -0,81 7,12 -0,11 0,912 1,00
C 7,37 3,69 7,12 0,52 0,619 1,00
D 306,12 153,06 7,12 21,50 0,000 1,00
A*D -153,63 -76,81 7,12 -10,79 0,000 1,00

B*C -43,88 -21,94 7,12 -3,08 0,015 1,00
A*B*C*D -40,13 -20,06 7,12 -2,82 0,023 1,00

Before drawing the conclusions, let us check the residual assumptions.

From the third scatterplot, we observe that factor B seems to influence the dispersion of the data.
All the standardized residuals belong to the interval [-3,+3], thus no outliers appear.
Because the run order is known, the hypothesis of time autocorrelation can be tested. The MINITAB
command is: Stat > Time Series > Autocorrelation. Remember that, before using this command,
you have to sort the data according to the experimental run order.

Lag ACF T LBQ


1 -0,057217 -0,23 0,06
2 -0,085311 -0,34 0,21
3 0,139247 0,55 0,64
4 -0,628450 -2,44 10,12

The coefficient at lag 4 is critical; we should check whether a specific explanation exists. Let us
suppose that there is no explanation and consequently ignore the signal.

In conclusion, factors B and C are not relevant as main effects of the model. Without a Bonferroni
approach, the model would include A, D, AD, BC and ABCD.

b) Using a Bonferroni approach, project the design into a 2^k design in the important factors.

The model, using the Bonferroni approach (α = 0.05/7 = 0.007), is composed of A, D and AD.
Thus, let us project the experiment onto the factors A and D and repeat the analysis. We are dealing
with a 2^2 experiment with 4 replicates (notice that these are not real replicates: they are artificial
replicates due to the projection into a model with a lower number of factors).

Factorial Regression: Rate versus A; D
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 3 510563 170188 97,91 0,000
Linear 2 416161 208080 119,71 0,000
A 1 41311 41311 23,77 0,000
D 1 374850 374850 215,66 0,000
2-Way Interactions 1 94403 94403 54,31 0,000
A*D 1 94403 94403 54,31 0,000
Error 12 20858 1738
Total 15 531421

Model Summary
S R-sq R-sq(adj) R-sq(pred)
41,6911 96,08% 95,09% 93,02%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 776,1 10,4 74,46 0,000
A -101,6 -50,8 10,4 -4,88 0,000 1,00
D 306,1 153,1 10,4 14,69 0,000 1,00
A*D -153,6 -76,8 10,4 -7,37 0,000 1,00

Before drawing the conclusions, let us check the residual assumptions.


[Figure: Scatterplot of SRES2 vs FITS2; A; B; C; D]

Test for Equal Variances: SRES1 vs A; D
Multiple comparison intervals for the standard deviation, α = 0,05
Multiple Comparisons: P-Value 0,007
Levene’s Test: P-Value 0,001

All the standardized residuals belong to the interval [-3;+3]. The hypothesis of normality cannot be
rejected. The homogeneity of variances is not verified, but remember that we do not have real
replicates, thus this result probably depends on the removal of some factors.
With this model (fully significant), the value of R-sq(adj) worsens only from 97.71% to 95.09%.

c) Write down the regression model and generate a response surface contour plot of the etch
rate.

The complete model with factors A, D and AD is chosen.


The equation of the etch rate is:

\[
\widehat{rate} = 776.06 - 50.81\, x_A + 153.06\, x_D - 76.81\, x_A x_D \quad \text{with} \quad \hat{\sigma} = 41.69
\]

The MINITAB command to build the contour plot is Stat > DOE > Factorial > Contour plot
(you first have to fit the chosen model).

[Figure: Contour Plot of rate vs D; A. Contour bands: < 600, 600 to 700, 700 to 800, 800 to 900, 900 to 1000, > 1000; x-axis: A, y-axis: D, both in coded units from -1,0 to 1,0]

d) What operating condition would you recommend if an etch rate equal to 800 angstrom/min
was necessary?

Looking at the contour plot, all the conditions lying on the 800 contour line are feasible.

If we impose rate=800 in the regression equation, we find the relationship between xA and xD.
The coded values must then be converted back into the natural units.
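Setting the fitted equation equal to 800 and solving for xD gives xD = (800 - 776.06 + 50.81 xA) / (153.06 - 76.81 xA). The sketch below evaluates this relation on a few gap settings and decodes the results with the factor levels of the exercise (gap 0.80 to 1.20 cm, power 275 to 325 W). The function names and the chosen grid of xA values are assumptions of this example.

```python
def fitted_rate(x_a, x_d):
    """Fitted etch rate of the projected model, in coded units."""
    return 776.06 - 50.81 * x_a + 153.06 * x_d - 76.81 * x_a * x_d

def x_d_for_rate(x_a, target=800.0):
    """Coded power x_D that gives the target etch rate at the coded gap x_A."""
    return (target - 776.06 + 50.81 * x_a) / (153.06 - 76.81 * x_a)

def decode(x, low, high):
    """Convert a coded level in [-1, +1] into natural units."""
    return (low + high) / 2 + x * (high - low) / 2

for x_a in (-1.0, -0.5, 0.0, 0.5, 1.0):
    x_d = x_d_for_rate(x_a)
    if -1.0 <= x_d <= 1.0:  # keep only points inside the experimental region
        gap_cm = decode(x_a, 0.80, 1.20)
        power_w = decode(x_d, 275.0, 325.0)
        print(f"gap = {gap_cm:.2f} cm, power = {power_w:.1f} W, "
              f"fitted rate = {fitted_rate(x_a, x_d):.1f}")
```

The denominator 153.06 - 76.81 xA stays positive for every xA between -1 and +1, so the relation is well defined over the whole experimental region.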

Exercise 6 [M]

An experiment was run in a semiconductor fabrication plant in an effort to increase yield. Five
factors, each at two levels, were studied. The factors (and the levels) were:
• A = aperture setting (small, large);
• B = exposure time (20% below nominal, 20% above nominal);
• C = development time (30 s, 45 s);
• D = mask dimension (small, large);
• E = etch time (14.5 min, 15.5 min)
The unreplicated data are: (1)=7, d=8, e=8, de=6, a=9, ad=10, ae=12, ade=10, b=34, bd=32, be=35,
bde=30, ab=55, abd=50, abe=52, abde=53, c=16, cd=18, ce=15, cde=15, ac=20, acd=21, ace=22,
acde=20, bc=40, bcd=44, bce=45, bcde=41, abc=60, abcd=61, abce=65, abcde=63

a) Analyze the data (α=5%);


b) Write down the regression model relating yield to the significant process variables;
c) What are your recommendations regarding the process operating conditions?

Solution

a) Analyze the data;

First of all, we have to build the table linking the data to the experimental conditions.

From the main effects plot, factor B seems to influence the response more than the other factors. In
the interaction plot, AB and DE seem to be significant interactions. Because the experiment is not
replicated, the Lenth method at 10% is required to find the relevant factors.

The graph suggests building a model with the terms A, B, C, AB and DE, and adding D and E to the
model for hierarchical reasons.

Factorial Regression: yield versus A; B; C; D; E
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 7 11603,2 1657,60 654,86 0,000
Linear 5 11087,9 2217,58 876,08 0,000
A 1 1116,3 1116,28 441,00 0,000
B 1 9214,0 9214,03 3640,11 0,000
C 1 750,8 750,78 296,60 0,000
D 1 5,3 5,28 2,09 0,162
E 1 1,5 1,53 0,60 0,444
2-Way Interactions 2 515,3 257,66 101,79 0,000
A*B 1 504,0 504,03 199,12 0,000
D*E 1 11,3 11,28 4,46 0,045
Error 24 60,8 2,53
Total 31 11664,0

Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,59099 99,48% 99,33% 99,07%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 30,531 0,281 108,56 0,000
A 11,812 5,906 0,281 21,00 0,000 1,00
B 33,938 16,969 0,281 60,33 0,000 1,00
C 9,687 4,844 0,281 17,22 0,000 1,00
D -0,813 -0,406 0,281 -1,44 0,162 1,00
E 0,437 0,219 0,281 0,78 0,444 1,00
A*B 7,938 3,969 0,281 14,11 0,000 1,00
D*E -1,188 -0,594 0,281 -2,11 0,045 1,00

Before drawing the conclusions, we have to check the residual assumptions.

There are no outliers: all the standardized residuals belong to the interval [-3; +3]. The hypothesis
of normality cannot be rejected. In conclusion, the model contains the effects: A, B, C and AB.

b) Write down the regression model relating yield to the significant process variables;
The regression model is:
\[
\hat{y} = 30.531 + 5.906\, x_A + 16.969\, x_B + 4.844\, x_C + 3.969\, x_A x_B
\]

c) What are your recommendations regarding the process operating conditions?


Since all the coefficients of the model are positive, the yield is maximized by setting A, B and C
at their high levels, with a predicted yield of about 62.2. D and E do not appear in the model.

Exercise 7 [M]

An experiment was conducted on a chemical process that produces a polymer. The four factors
studied were temperature (A) [100°, 200°], catalyst concentration (B) [4%, 8%], time (C) [20 min,
30 min], and pressure (D) [60 psi, 75psi]. Two responses, molecular weight and viscosity, were
observed.

The response data in standard order are:


• Molecular weight: 2400, 2410, 2315, 2510, 2615, 2625, 2400, 2750, 2400, 2390, 2300,
2520, 2625, 2630, 2500, 2710
• Viscosity: 1400, 1500, 1520, 1630, 1380, 1525, 1500, 1620, 1400, 1525, 1500, 1600, 1420,
1490, 1500, 1600
The execution order is: 18, 9, 13, 8, 3, 11, 14, 17, 6, 7, 2, 10, 4, 19, 15, 20

In addition, there are four center points:


• Molecular weight: 2515, 2500, 2400, 2475
• Viscosity: 1500, 1460, 1525, 1500
• Execution order: 1, 5, 16, 12

a) Analyze both response data. Write down a regression model to predict each response as a
function of important variables.
b) What operating condition would you recommend if it is necessary to produce a product with
molecular weight between 2400 and 2500, and the lowest possible viscosity?

Solution

a) Analyze both response data. Write down a regression model to predict each response as a
function of important variables;

In this case we deal with two responses (molecular weight and viscosity). We will analyze them
separately, neglecting the possibility that they are correlated.
Let us look at the scatterplot of the two responses:

The corresponding correlation coefficient is not significant:

Correlations: MW; Viscosity


Pearson correlation of MW and Viscosity = 0,255
p-Value = 0,278

Concerning the viscosity as response, factors A and B seem to influence the response. No
interactions seem particularly relevant.

Concerning the molecular weight as response, factors A and C seem to influence the response. The
interaction between A and B seems relevant.

Because the experiment has the center points, we have the ANOVA table (but we do not have the
standardized residuals).

Factorial Regression: Viscosity versus A; B; C; D; CenterPt


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 16 93455,0 5840,9 8,08 0,055
Linear 4 90562,5 22640,6 31,32 0,009
A 1 47306,2 47306,2 65,44 0,004
B 1 43056,3 43056,3 59,56 0,005
C 1 100,0 100,0 0,14 0,735
D 1 100,0 100,0 0,14 0,735
2-Way Interactions 6 1062,5 177,1 0,24 0,933
A*B 1 6,3 6,3 0,01 0,932
A*C 1 0,0 0,0 0,00 1,000
A*D 1 400,0 400,0 0,55 0,511
B*C 1 25,0 25,0 0,03 0,864
B*D 1 625,0 625,0 0,86 0,421
C*D 1 6,2 6,2 0,01 0,932
3-Way Interactions 4 962,5 240,6 0,33 0,842
A*B*C 1 25,0 25,0 0,03 0,864
A*B*D 1 25,0 25,0 0,03 0,864
A*C*D 1 756,3 756,3 1,05 0,382
B*C*D 1 156,2 156,2 0,22 0,674
4-Way Interactions 1 506,2 506,2 0,70 0,464

A*B*C*D 1 506,2 506,2 0,70 0,464
Curvature 1 361,2 361,2 0,50 0,531
Error 3 2168,8 722,9
Total 19 95623,8

The only significant factors are A and B, so an additive model is selected.

Factorial Regression: Viscosity versus A; B; CenterPt


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 3 90723,8 30241,3 98,75 0,000
Linear 2 90362,5 45181,3 147,53 0,000
A 1 47306,3 47306,3 154,47 0,000
B 1 43056,3 43056,3 140,59 0,000
Curvature 1 361,2 361,2 1,18 0,294
Error 16 4900,0 306,3
Lack-of-Fit 13 2731,3 210,1 0,29 0,951
Pure Error 3 2168,8 722,9
Total 19 95623,8

Model Summary
S R-sq R-sq(adj) R-sq(pred)
17,5 94,88% 93,91% 91,64%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 1506,88 4,38 344,43 0,000
A 108,75 54,38 4,38 12,43 0,000 1,00
B 103,75 51,88 4,38 11,86 0,000 1,00
Ct Pt -10,62 9,78 -1,09 0,294 1,00

Before drawing the conclusion, let us check the residuals of the model:

(Autocorrelation can be evaluated with Stat > Time Series > Autocorrelation.)
Analyzing the scatterplots: all the standardized residuals belong to the interval [-3;3]. The residuals
are not time autocorrelated. The hypotheses of normality and homogeneous variance cannot be
rejected.
In conclusion, the model is additive in A and B. The curvature test and the lack-of-fit (LOF) test
are not significant.

The regression model is:
\[
Viscosity = 1506.88 + 54.37\, x_A + 51.87\, x_B \quad \text{with} \quad \hat{\sigma} = 17.5
\]

Let us analyze the molecular weight:


Factorial Regression: M.W. versus A; B; C; D; CenterPt
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 16 295920 18495 7,09 0,066
Linear 4 223925 55981 21,46 0,015
A 1 61256 61256 23,48 0,017
B 1 506 506 0,19 0,689
C 1 162006 162006 62,11 0,004
D 1 156 156 0,06 0,822
2-Way Interactions 6 63300 10550 4,04 0,139
A*B 1 57600 57600 22,08 0,018
A*C 1 1600 1600 0,61 0,491
A*D 1 1225 1225 0,47 0,542
B*C 1 2025 2025 0,78 0,443
B*D 1 225 225 0,09 0,788
C*D 1 625 625 0,24 0,658
3-Way Interactions 4 3025 756 0,29 0,869
A*B*C 1 1056 1056 0,40 0,570
A*B*D 1 506 506 0,19 0,689
A*C*D 1 1406 1406 0,54 0,516
B*C*D 1 56 56 0,02 0,893
4-Way Interactions 1 2025 2025 0,78 0,443
A*B*C*D 1 2025 2025 0,78 0,443
Curvature 1 3645 3645 1,40 0,322
Error 3 7825 2608
Total 19 303745

The relevant factors are A, C and AB. We have to add B for hierarchical reasons.

Factorial Regression: M.W. versus A; B; C; CenterPt


Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Model 5 285014 57003 42,60 0,000
Linear 3 223769 74590 55,75 0,000
A 1 61256 61256 45,78 0,000
B 1 506 506 0,38 0,548
C 1 162006 162006 121,09 0,000
2-Way Interactions 1 57600 57600 43,05 0,000
A*B 1 57600 57600 43,05 0,000
Curvature 1 3645 3645 2,72 0,121
Error 14 18731 1338
Lack-of-Fit 11 10906 991 0,38 0,898
Pure Error 3 7825 2608
Total 19 303745

Model Summary
S R-sq R-sq(adj) R-sq(pred)
36,5780 93,83% 91,63% 87,82%

Coded Coefficients
Term Effect Coef SE Coef T-Value P-Value VIF
Constant 2506,25 9,14 274,07 0,000
A 123,75 61,88 9,14 6,77 0,000 1,00
B -11,25 -5,63 9,14 -0,62 0,548 1,00
C 201,25 100,63 9,14 11,00 0,000 1,00
A*B 120,00 60,00 9,14 6,56 0,000 1,00
Ct Pt -33,8 20,4 -1,65 0,121 1,00
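The columns of the coefficient table are linked by simple relations: Effect = 2 × Coef and T-Value = Coef / SE Coef. The sketch below also reproduces the standard errors under two assumptions that are inferred rather than stated in the output: the design has 16 factorial runs and 4 center runs (20 runs in total, 3 df of pure error), and the coded columns are orthogonal.

```python
import math

s = 36.578         # S from the Model Summary
n_factorial = 16   # 2^4 factorial runs
n_center = 4       # assumption: 4 center runs (pure error has 3 df)

# Standard error of any coded coefficient in an orthogonal two-level design.
se_coef = s / math.sqrt(n_factorial)
# The center-point term compares the center mean with the factorial mean.
se_center = s * math.sqrt(1 / n_factorial + 1 / n_center)

coefs = {"A": 61.88, "B": -5.63, "C": 100.63, "A*B": 60.00}
for term, coef in coefs.items():
    print(f"{term:4s} effect = {2 * coef:7.2f}  t = {coef / se_coef:6.2f}")

t_center = -33.8 / se_center
print(f"SE Coef = {se_coef:.2f}, SE(Ct Pt) = {se_center:.1f}, t(Ct Pt) = {t_center:.2f}")
```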

Before drawing conclusions, we check the assumptions on the residuals:

[Figure: Autocorrelation Function for SRES2 (with 5% significance limits for the autocorrelations); x-axis: Lag, y-axis: Autocorrelation]

There are no outliers. The standardized residuals are not autocorrelated in time. The hypothesis of
normality cannot be rejected (even though the P-value is low), and the same holds for the test of equal
variances. The assumptions on the residuals are satisfied.
In conclusion, the terms A, C and AB are significant.

The regression equation is:

$\text{M.W.} = 2506.25 + 61.87\,x_A - 5.62\,x_B + 100.62\,x_C + 60\,x_A x_B$ with $\hat{\sigma} = 36.578$

b) What operating condition would you recommend if it is necessary to produce a product
with molecular weight between 2400 and 2500, and the lowest possible viscosity?

In order to choose the process parameters that give the minimum viscosity with 2400 ≤ M.W. ≤ 2500, we need to
solve a constrained optimization problem:

$\min_{x_A, x_B, x_C} \; \text{Viscosity} = 1506.88 + 54.37\,x_A + 51.87\,x_B$

subject to $2400 \le \text{M.W.} \le 2500$ and $-1 \le x_A, x_B, x_C \le 1$, with $\text{M.W.} = 2506.25 + 61.87\,x_A - 5.62\,x_B + 100.62\,x_C + 60\,x_A x_B$

Using the software Mathematica (you can use the software that you prefer), we get:

Minimize [1506.88+54.37A+51.87B,{-1≤A≤1,-1≤B≤1,-1≤C≤1,2400≤2506.25+61.87A-5.62B+100.62C+60.0A*B≤2500},{A,B,C}]

{1400.64, {A→-1,B→-1,C→-1}}, which corresponds to a viscosity equal to 1400.64 and a
molecular weight of 2409.38.
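For readers without Mathematica, the same constrained minimum can be checked with a coarse grid search. This is only a sketch: the 0.1 step in coded units and the helper names `viscosity` and `mol_weight` are illustrative choices, and the coefficients are the rounded ones from the two regression equations.

```python
from itertools import product

def viscosity(a, b):
    """Fitted viscosity model in coded units."""
    return 1506.88 + 54.37 * a + 51.87 * b

def mol_weight(a, b, c):
    """Fitted molecular weight model in coded units."""
    return 2506.25 + 61.87 * a - 5.62 * b + 100.62 * c + 60.0 * a * b

# Coded levels -1.0, -0.9, ..., 1.0 for each factor.
levels = [i / 10 for i in range(-10, 11)]

# Keep only the grid points that satisfy the molecular weight constraint.
feasible = [
    (a, b, c)
    for a, b, c in product(levels, repeat=3)
    if 2400 <= mol_weight(a, b, c) <= 2500
]

# Among the feasible points, pick the one with the lowest viscosity.
best = min(feasible, key=lambda p: viscosity(p[0], p[1]))
print(best, round(viscosity(best[0], best[1]), 2), round(mol_weight(*best), 2))
```

Because the fitted viscosity does not depend on C, any other feasible C at A = B = -1 gives the same viscosity; the search simply reports the first such point it meets.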

Exercise 8

Consider a two-level factorial design with 3 factors and a number of replicates between 2 and 5.
Compute the power of the experiment to detect a difference not smaller than d/σ=1 (where d is the
minimum difference of interest). Use a significance level equal to 0.05.
Compare the Minitab results with the results obtained by using the direct method.

Solution

The MINITAB command is: Stat > Power and Sample Size > 2-Level Factorial Design
Power and Sample Size
2-Level Factorial Design
α = 0,05 Assumed standard deviation = 1 Factors: 3 Base Design: 3; 8 Blocks: none
Center Points  Effect  Reps  Total Runs  Power
0              1       2     16          0,421052
0              1       3     24          0,633406
0              1       4     32          0,774508
0              1       5     40          0,865660

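In order to use the direct method (that is, to compute the power manually), we need the
noncentrality parameter of the noncentral F distribution and the degrees of freedom, according to the
following formulas:

$\delta_{AB..I} = \left(\frac{d_{AB..I}}{\sigma}\right)^2 n\,2^{k-1-i}$

$df_A = df_B = df_C = df_{AB} = df_{AC} = df_{BC} = df_{ABC} = 1$

$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1)$

Here i is the number of factors in the term (1 for a main effect, 2 for a two-factor interaction, 3 for the three-factor interaction), as used in the calculations that follow.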

Main Effect
n=2
• Let us evaluate $F_{\alpha}(df_A, df_E)$
$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1) = 8 \times 1 + 0 + 0 = 8$

Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 8 DF in denominator
P(X<=x) x
0,95 5,31766

• Let us evaluate the non-centrality parameter
$\delta_{ME} = \left(\frac{d}{\sigma}\right)^2 n\,2^{k-1-1} = 4$
• Let us evaluate β
Cumulative Distribution Function
F distribution with 1 DF in numerator and 8 DF in denominator and noncentrality
parameter 4
x P(X<=x)
5,31766 0,578948
• The power is: Power = 1 - β = 1 - 0,578948 = 0,421052

n=3
• Let us evaluate $F_{\alpha}(df_A, df_E)$
$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1) = 8 \times 2 + 0 + 0 = 16$

Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 16 DF in denominator
P(X<=x) x
0,95 4,49400

• Let us evaluate the non-centrality parameter
$\delta_{ME} = \left(\frac{d}{\sigma}\right)^2 n\,2^{k-1-1} = 6$
• Let us evaluate β
Cumulative Distribution Function
F distribution with 1 DF in numerator and 16 DF in denominator and noncentrality
parameter 6
x P(X<=x)
4,494 0,366594
• The power is: Power = 1 - β = 1 - 0,366594 = 0,633406

n=4
• Let us evaluate $F_{\alpha}(df_A, df_E)$
$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1) = 8 \times 3 + 0 + 0 = 24$

Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 24 DF in denominator
P(X<=x) x
0,95 4,25968

• Let us evaluate the non-centrality parameter
$\delta_{ME} = \left(\frac{d}{\sigma}\right)^2 n\,2^{k-1-1} = 8$
• Let us evaluate β
Cumulative Distribution Function
F distribution with 1 DF in numerator and 24 DF in denominator and noncentrality
parameter 8
x P(X<=x)
4,25968 0,225492
• The power is: Power = 1 - β = 1 - 0,225492 = 0,774508

n=5
• Let us evaluate $F_{\alpha}(df_A, df_E)$
$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1) = 8 \times 4 + 0 + 0 = 32$

Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 32 DF in denominator
P(X<=x) x
0,95 4,14910

• Let us evaluate the non-centrality parameter
$\delta_{ME} = \left(\frac{d}{\sigma}\right)^2 n\,2^{k-1-1} = 10$
• Let us evaluate β
Cumulative Distribution Function
F distribution with 1 DF in numerator and 32 DF in denominator and noncentrality
parameter 10
x P(X<=x)
4,1491 0,134340
• The power is: Power = 1 - β = 1 - 0,134340 = 0,865660

The results coincide with the ones given by Minitab.

n  δ   Power (by hand)  Power (Minitab)
2  4   0,421052         0,421052
3  6   0,633406         0,633406
4  8   0,774508         0,774508
5  10  0,865660         0,865660

Let us compute the power for the interaction terms of the model (not given by MINITAB).
2 Factor Interactions
n=2
• Let us evaluate $F_{\alpha}(df_{AB}, df_E)$
$df_E = 2^k (n-1) + n_{\text{Removed Terms}} + (n_c - 1) = 8 \times 1 + 0 + 0 = 8$

Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 8 DF in denominator
P(X<=x) x
0,95 5,31766

• Let us evaluate the non-centrality parameter
$\delta_{2fi} = \left(\frac{d}{\sigma}\right)^2 n\,2^{k-1-2} = 2$
• Let us evaluate β
Cumulative Distribution Function
F distribution with 1 DF in numerator and 8 DF in denominator and noncentrality
parameter 2
x P(X<=x)
5,31766 0,761000
• The power is: Power = 1 - β = 1 - 0,761 = 0,239

n  δ  Power (by hand)
2  2  0,239000
3  3  0,370255
4  4  0,484018
5  5  0,582672

3 Factor Interaction
n  δ    Power (by hand)
2  1    0,143256
3  1,5  0,210513
4  2    0,274019
5  2,5  0,335222
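The direct method used in this exercise can also be scripted without Minitab. The sketch below uses only the Python standard library: it evaluates the noncentral F distribution as a Poisson mixture of regularized incomplete beta functions and finds the critical value by bisection. The function names and the `order` argument (1 for a main effect, 2 or 3 for interactions) are illustrative, not Minitab's.

```python
import math

def _betacf(a, b, x, eps=1e-14, max_iter=500):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_cdf(x, dfn, dfd, nc=0.0):
    """CDF of the F distribution; nc > 0 gives the noncentral F."""
    y = dfn * x / (dfn * x + dfd)
    weight, total = math.exp(-nc / 2.0), 0.0
    for j in range(500):
        total += weight * betainc(dfn / 2.0 + j, dfd / 2.0, y)
        weight *= (nc / 2.0) / (j + 1)
        if weight < 1e-17 and j + 1 > nc / 2.0:
            break
    return total

def f_ppf(p, dfn, dfd):
    """Quantile of the central F distribution, found by bisection."""
    lo, hi = 0.0, 1e4
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, dfn, dfd) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power(n, k=3, order=1, d_over_sigma=1.0, alpha=0.05):
    """Power for a term with `order` factors in a replicated 2^k design."""
    df_e = 2 ** k * (n - 1)                              # no removed terms, no center points
    nc = d_over_sigma ** 2 * n * 2.0 ** (k - 1 - order)  # noncentrality parameter
    f_crit = f_ppf(1.0 - alpha, 1, df_e)
    return 1.0 - f_cdf(f_crit, 1, df_e, nc)

for order in (1, 2, 3):
    for n in range(2, 6):
        print(f"order={order} n={n} power={power(n, order=order):.6f}")
```

Calling `power(n)` corresponds to the main-effect calculation, while `order=2` and `order=3` correspond to the two interaction tables.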

Exercise 9 [June 30th 2009]

Consider a 2-level factorial design with 4 factors (A, B, C, D) and n replications. Develop the
confidence interval for the effect of a 3-factor interaction (any one of them).

Solution

Let us consider the effect of the third order interaction ABC.


The equation to estimate the effect is:

$\widehat{EFF}_{ABC} = \frac{\widehat{CONT}_{ABC}}{n\,2^{k-1}} = \frac{\widehat{CONT}_{ABC}}{n \cdot 2^3} = \frac{\widehat{CONT}_{ABC}}{8n}$
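The estimate is normally distributed because it is a linear combination of observations that, by
hypothesis, are independent and normally distributed.
The expected value is $E(\widehat{EFF}_{ABC}) = EFF_{ABC}$. The contrast is a sum of the $n\,2^4$ observations with coefficients ±1, so $V(\widehat{CONT}_{ABC}) = n\,2^4 \sigma^2$ and the variance of the estimate is:

$V(\widehat{EFF}_{ABC}) = \frac{1}{n^2 2^6} V(\widehat{CONT}_{ABC}) = \frac{n\,2^4 \sigma^2}{n^2 2^6} = \frac{\sigma^2}{n\,2^2}$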

We can state that:

$\frac{\widehat{EFF}_{ABC} - EFF_{ABC}}{\sqrt{V(\widehat{EFF}_{ABC})}} \sim N(0, 1)$
Consequently, if we know the variance, the confidence interval is:

$\widehat{EFF}_{ABC} - z_{\alpha/2} \sqrt{\frac{\sigma^2}{n\,2^2}} \le EFF_{ABC} \le \widehat{EFF}_{ABC} + z_{\alpha/2} \sqrt{\frac{\sigma^2}{n\,2^2}}$

Instead, if we do not know the variance and we estimate it by $MS_E$, the confidence interval is:

$\widehat{EFF}_{ABC} - t_{\alpha/2}(df_E) \sqrt{\frac{MS_E}{n\,2^2}} \le EFF_{ABC} \le \widehat{EFF}_{ABC} + t_{\alpha/2}(df_E) \sqrt{\frac{MS_E}{n\,2^2}}$
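A minimal numerical sketch of the known-variance interval follows. The function name and the input values (estimated effect 3.0, σ² = 4, n = 4) are hypothetical and serve only to illustrate the formula; for the unknown-variance case the normal quantile would be replaced by the Student t quantile with the error degrees of freedom and σ² by the error mean square.

```python
from math import sqrt
from statistics import NormalDist

def effect_ci_known_sigma(effect_hat, sigma2, n, alpha=0.05):
    """CI for an effect in a replicated 2^4 design when sigma^2 is known.

    The variance of the estimated effect is sigma2 / (n * 2**2).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(sigma2 / (n * 2 ** 2))
    return effect_hat - half_width, effect_hat + half_width

# Hypothetical numbers, for illustration only.
low, high = effect_ci_known_sigma(effect_hat=3.0, sigma2=4.0, n=4)
print(f"{low:.3f} <= EFF_ABC <= {high:.3f}")
```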
