
IIT, Bombay

Module 5
Design for Reliability and Quality


















Lecture 4
Approach to Robust Design

Instructional Objectives
The primary objectives of this lecture are to outline the concept of robust design and to introduce
the tools used to achieve it in typical manufacturing processes.

Defining Robust Design
Robust design is an engineering methodology for improving productivity during research and
development so that high-quality products can be produced quickly and at low cost. According to
Dr. Genichi Taguchi, a robust design is one that is created with a system of design tools to
reduce variability in product or process, while simultaneously guiding the performance towards
an optimal setting. A product that is robustly designed will provide customer satisfaction even
when subjected to extreme conditions on the manufacturing floor or in the service environment.

Tools for Robust Design
The Taguchi method, design of experiments and multiple regression analysis are some of the
important tools used in robust design to produce high-quality products quickly and at low cost.

Taguchi Method
The Taguchi method evaluates, through experiments, the sensitivity of a set of response variables
to a set of control parameters (or independent variables). The experiments are arranged in an
orthogonal array with the aim of finding the optimum setting of the control parameters.
Orthogonal arrays provide a well-balanced, minimum set of experiments [1]. Table 5.4.1 shows
eighteen standard orthogonal arrays along with the number of columns at different levels for
these arrays [1]. An array name indicates the number of rows and columns it has, and also the
number of levels in each of the columns. For example, the array $L_4(2^3)$ has four rows and
three 2-level columns. Similarly, the array $L_{18}(2^1 3^7)$ has 18 rows, one 2-level column
and seven 3-level columns. Thus, there are eight columns in the array $L_{18}$. The number of
rows of an orthogonal array represents the requisite number of experiments. The number of rows
must be at least equal to the degrees of freedom associated with the factors, i.e. the control
variables. In general, the number of degrees of freedom associated with a factor (control
variable) is equal to the number of levels for that factor minus one. For example, consider a
case study with one factor (A) at 2 levels and five factors (B, C, D, E, F) each at 3 levels.
Table 5.4.2 depicts the degrees of freedom calculated for this case. The number of columns of an
array represents the maximum number of factors that can be studied using that array.
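
The degrees-of-freedom rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the `ARRAYS` dictionary are hypothetical, the dictionary holds just a subset of Table 5.4.1, and it assumes that every factor needs its own column with exactly its number of levels.

```python
# Sketch: count degrees of freedom and screen a few arrays from Table 5.4.1.
# Assumption: each factor needs its own column with exactly its number of levels.

def required_dof(levels_per_factor):
    """One degree of freedom for the overall mean plus (levels - 1) per factor."""
    return 1 + sum(levels - 1 for levels in levels_per_factor)

# name: (rows, number of 2-level columns, number of 3-level columns)
ARRAYS = {
    "L4": (4, 3, 0), "L8": (8, 7, 0), "L9": (9, 0, 4), "L12": (12, 11, 0),
    "L16": (16, 15, 0), "L18": (18, 1, 7), "L27": (27, 0, 13), "L36": (36, 11, 12),
}

def suitable_arrays(levels_per_factor):
    """Names of arrays with enough rows and enough columns at each level."""
    dof = required_dof(levels_per_factor)
    need2 = levels_per_factor.count(2)
    need3 = levels_per_factor.count(3)
    return [name for name, (rows, c2, c3) in ARRAYS.items()
            if rows >= dof and c2 >= need2 and c3 >= need3]

levels = [2, 3, 3, 3, 3, 3]      # factor A at 2 levels, B to F at 3 levels
print(required_dof(levels))      # 12
print(suitable_arrays(levels))   # ['L18', 'L36']
```

For the case study in the text, twelve degrees of freedom are needed, and the smallest listed array that also offers one 2-level and five 3-level columns is L18.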

Table 5.4.1 Standard orthogonal arrays [1]

Orthogonal   Number    Maximum number   Maximum number of columns at these levels
array        of rows   of factors       2       3       4       5
L4           4         3                3       -       -       -
L8           8         7                7       -       -       -
L9           9         4                -       4       -       -
L12          12        11               11      -       -       -
L16          16        15               15      -       -       -
L'16         16        5                -       -       5       -
L18          18        8                1       7       -       -
L25          25        6                -       -       -       6
L27          27        13               -       13      -       -
L32          32        31               31      -       -       -
L'32         32        10               1       -       9       -
L36          36        23               11      12      -       -
L'36         36        16               3       13      -       -
L50          50        12               1       -       -       11
L54          54        26               1       25      -       -
L64          64        63               63      -       -       -
L'64         64        21               -       -       21      -
L81          81        40               -       40      -       -

The signal-to-noise (S/N) ratios, denoted $\eta$, are log functions of the desired output. They
serve as the objective functions for optimization and help in data analysis and in the prediction
of the optimum results. The Taguchi method treats optimization problems in two categories:
static problems and dynamic problems. For simplicity, only the static problems are explained in
detail in the following text. Next, the complete procedure followed to optimize a typical process
using the Taguchi method is explained with an example.

Table 5.4.2 The degrees of freedom for one factor (A) in 2 levels and five factors
(B, C, D, E, F) in 3 levels

Factors          Degrees of freedom
Overall mean     1
A                2 - 1 = 1
B, C, D, E, F    5 x (3 - 1) = 10
Total            12

Static problems
Generally, a process to be optimized has several control factors (process parameters) which
directly decide the target or desired value of the output. The optimization then involves
determining the best levels of the control factors so that the output is at the target value. Such
a problem is called a "STATIC PROBLEM". This can be best explained using a P-Diagram
(Figure 5.4.1), where "P" stands for Process or Product. The noise is shown to be present in the
process but should have no effect on the output. This is the primary aim of the Taguchi
experiments: to minimize the variations in output even though noise is present in the process.
The process is then said to have become ROBUST.








Figure 5.4.1 P-Diagram for static problems [1].

Signal to Noise (S/N) Ratio
There are three forms of the signal-to-noise (S/N) ratio, $\eta$, that are of common interest for
the optimization of static problems.
[1] Smaller-the-better
This is expressed as

\[ \eta = -10 \log_{10} [\text{mean of sum of squares of measured data}] \qquad (1) \]

This is usually the chosen S/N ratio for all undesirable characteristics, like defects, for which
the ideal value is zero. When the ideal value is finite and its maximum or minimum value is
defined (like the maximum purity is 100% or the maximum temperature is 92 K or the minimum
time for making a telephone connection is 1 sec), the difference between the measured data
and the ideal value is expected to be as small as possible. Thus, the generic form of the S/N
ratio becomes

\[ \eta = -10 \log_{10} [\text{mean of sum of squares of } \{\text{measured} - \text{ideal}\}] \qquad (2) \]

[2] Larger-the-better
This is expressed as

\[ \eta = -10 \log_{10} [\text{mean of sum of squares of reciprocal of measured data}] \qquad (3) \]

This is often converted to smaller-the-better by taking the reciprocal of the measured data and
then taking the S/N ratio as in the smaller-the-better case.

[3] Nominal-the-best
This is expressed as

\[ \eta = 10 \log_{10} \left( \frac{\text{square of mean}}{\text{variance}} \right) \qquad (4) \]

This case arises when a specified value is the most desired, meaning that neither a smaller nor a
larger value is desired.
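
The three S/N ratios in equations (1) to (4) can be written as small Python functions. This is a minimal sketch, not a reference implementation: the function names are hypothetical, and the nominal-the-best form assumes the sample variance (n - 1 divisor), which the text does not specify.

```python
import math
from statistics import mean, variance

def sn_smaller_the_better(data, ideal=0.0):
    """Equations (1) and (2): -10 log10 of the mean squared deviation from the ideal."""
    return -10.0 * math.log10(mean([(y - ideal) ** 2 for y in data]))

def sn_larger_the_better(data):
    """Equation (3): -10 log10 of the mean of the squared reciprocals."""
    return -10.0 * math.log10(mean([1.0 / y ** 2 for y in data]))

def sn_nominal_the_best(data):
    """Equation (4): 10 log10 of (square of mean / variance).
    Assumption: the sample variance (n - 1 divisor) is used."""
    return 10.0 * math.log10(mean(data) ** 2 / variance(data))

# Hypothetical measurements, used only to exercise the functions.
print(sn_smaller_the_better([10, 10, 10]))
print(sn_larger_the_better([10, 10, 10]))
print(sn_nominal_the_best([9, 10, 11]))
```

With `ideal` left at zero the first function reproduces equation (1); passing a finite ideal value gives the generic form of equation (2).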

Example for application of Taguchi Method
Determine the effect of four process parameters, temperature (A), pressure (B), settling time (C)
and cleaning method (D), on the formation of surface defects in a chemical vapor deposition
(CVD) process to produce silicon wafers. Also estimate the optimum setting of the above
process parameters for minimum defects. Table 5.4.3 depicts the factors and their levels.

Table 5.4.3 Factors and their levels

Factor                     Level
                           1            2            3
A. Temperature (°C)        T0 - 25      T0           T0 + 25
B. Pressure (mtorr)        P0 - 200     P0           P0 + 200
C. Settling time (min)     t0           t0 + 8       t0 + 16
D. Cleaning method         None         CM2          CM3


Step 1: Select the design matrix and perform the experiments
The present example involves four factors, each at three levels. Table 5.4.1 indicates that the
most suitable orthogonal array is $L_9$. Table 5.4.4 shows the design matrix for $L_9$. Next,
conduct all nine experiments and observe the surface defect counts per unit area at three
locations on each of three silicon wafers (thin disks of silicon used for making VLSI circuits),
so that there are nine observations in total for each experiment. The summary statistic,
$\eta_i$, for an experiment, i, is given by

\[ \eta_i = -10 \log_{10} C_i \qquad (5) \]

where $C_i$ refers to the mean squared defect count for experiment i, and the mean square refers
to the average of the squares of the nine observations in experiment i. Table 5.4.4 also depicts
the observed value of $\eta_i$ for all nine experiments. This summary statistic $\eta_i$ is
called the signal-to-noise (S/N) ratio.


Table 5.4.4 $L_9$ array matrix experiment table [1].

Expt.   Column number and factor assigned                                        Observation,
No.     1                 2              3                   4                   $\eta_i$ (dB)
        Temperature (A)   Pressure (B)   Settling time (C)   Cleaning method (D)
1       1                 1              1                   1                   -20
2       1                 2              2                   2                   -10
3       1                 3              3                   3                   -30
4       2                 1              2                   3                   -25
5       2                 2              3                   1                   -45
6       2                 3              1                   2                   -65
7       3                 1              3                   2                   -45
8       3                 2              1                   3                   -65
9       3                 3              2                   1                   -70

Step 2: Calculation of factor effects
The effect of a factor level is defined as the deviation it causes from the overall mean. Hence, as
a first step, calculate the overall mean value of $\eta$ for the experimental region defined by the
factor levels in Table 5.4.4 as

\[ m = \frac{1}{9} \sum_{i=1}^{9} \eta_i = \frac{1}{9} (\eta_1 + \eta_2 + \dots + \eta_9) = -41.67 \text{ dB} \qquad (6) \]

The effect of the temperature at level $A_1$ (at experiments 1, 2 and 3) is calculated as the
difference of the average S/N ratio for these experiments ($m_{A1}$) and the overall mean. The
same is given as

\[ \text{The effect of temperature at level } A_1 = m_{A1} - m = \frac{1}{3} (\eta_1 + \eta_2 + \eta_3) - m \qquad (7) \]

Similarly,

\[ \text{The effect of temperature at level } A_2 = m_{A2} - m = \frac{1}{3} (\eta_4 + \eta_5 + \eta_6) - m \qquad (8) \]

\[ \text{The effect of temperature at level } A_3 = m_{A3} - m = \frac{1}{3} (\eta_7 + \eta_8 + \eta_9) - m \qquad (9) \]

Using the S/N ratio data available in Table 5.4.4, the average of $\eta$ for each level of the four
factors is calculated and listed in Table 5.4.5. These average values are shown in Figure 5.4.2.
They are the separate effects of each factor and are commonly called main effects.

Table 5.4.5 Average $\eta$ for different factor levels [1].

Factor                 Level
                       1        2        3
A. Temperature         -20      -45      -60
B. Pressure            -30      -40      -55
C. Settling time       -50      -35      -40
D. Cleaning method     -45      -40      -40















Figure 5.4.2 Plots of factor effects
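
The level averages of Table 5.4.5 can be checked directly from the design matrix and the observed S/N ratios of Table 5.4.4. The following is a small sketch; the variable and function names are illustrative only.

```python
# L9 design matrix (levels of A, B, C, D) and observed S/N ratios from Table 5.4.4.
L9 = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
    (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
    (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1),
]
eta = [-20, -10, -30, -25, -45, -65, -45, -65, -70]

# Overall mean, equation (6).
m = sum(eta) / len(eta)

def level_means(column):
    """Average S/N ratio at levels 1, 2 and 3 of the factor in the given column."""
    means = []
    for level in (1, 2, 3):
        values = [e for row, e in zip(L9, eta) if row[column] == level]
        means.append(sum(values) / len(values))
    return means

main_effects = {name: level_means(col) for col, name in enumerate("ABCD")}
print(round(m, 2))        # -41.67
for name, means in main_effects.items():
    print(name, means)
```

Subtracting `m` from each entry gives the factor-level effects of equations (7) to (9).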


Step 3: Selecting optimum factor levels
Our goal in this experiment is to minimize the surface defect counts to improve the quality of the
silicon wafers produced through the chemical vapor deposition process. Since $-\log$ is a
monotonically decreasing function [equation (5)], we should maximize $\eta$. Hence the optimum
level for a factor is the level that gives the highest value of $\eta$ in the experimental region.
From Figure 5.4.2 and Table 5.4.5, it is observed that the optimum settings of temperature,
pressure, settling time and cleaning method are $A_1$, $B_1$, $C_2$ and $D_2$ or $D_3$. Hence we
can conclude that the settings $A_1B_1C_2D_2$ and $A_1B_1C_2D_3$ can give the highest $\eta$ or
the lowest surface defect count.
Step 4: Developing the additive model for factor effects
The relation between $\eta$ and the process parameters A, B, C and D can be approximated
adequately by the following additive model:

\[ \eta(A_i, B_j, C_k, D_l) = m + a_i + b_j + c_k + d_l + e \qquad (10) \]

where the term m refers to the overall mean (that is, the mean of $\eta$ for the experimental
region). The terms $a_i$, $b_j$, $c_k$ and $d_l$ refer to the deviations from m caused by the
settings $A_i$, $B_j$, $C_k$ and $D_l$ of factors A, B, C and D, respectively. The term e stands
for the error. In the additive model, cross-product terms involving two or more factors are not
allowed. Equation (10) is utilized in predicting the S/N ratio at optimum factor levels.
Step 5: Analysis of Variance (ANOVA)
Different factors affect the formation of surface defects to a different degree. The relative
magnitude of the factor effects is listed in Table 5.4.5. A better feel for the relative effect of
the different factors is obtained by the decomposition of variance, which is commonly called
analysis of variance (ANOVA). This is obtained first by computing the sums of squares.

\[ \text{Grand total sum of squares} = \sum_{i=1}^{9} \eta_i^2 = (-20)^2 + (-10)^2 + \dots + (-70)^2 = 19425 \ (\text{dB})^2 \qquad (11) \]

\[ \text{Sum of squares due to mean} = (\text{number of experiments}) \times m^2 = 9 \times (-41.67)^2 = 15625 \ (\text{dB})^2 \qquad (12) \]

\[ \text{Total sum of squares} = \sum_{i=1}^{9} (\eta_i - m)^2 = 3800 \ (\text{dB})^2 \qquad (13) \]

\[ \text{Sum of squares due to factor A} = [(\text{number of experiments at level } A_1) \times (m_{A1} - m)^2] + [(\text{number of experiments at level } A_2) \times (m_{A2} - m)^2] + [(\text{number of experiments at level } A_3) \times (m_{A3} - m)^2] \qquad (14) \]

\[ = [3 \times (-20 + 41.67)^2] + [3 \times (-45 + 41.67)^2] + [3 \times (-60 + 41.67)^2] = 2450 \ (\text{dB})^2 \]

Similarly, the sums of squares due to factors B, C and D can be computed as 950, 350 and
50 (dB)^2, respectively. All these sums of squares are tabulated in Table 5.4.6. This is called the
ANOVA table.

Table 5.4.6 ANOVA table for $\eta$ [1].

Factor                 Degree of    Sum of      Mean square =                         F
                       freedom      squares     sum of squares/degree of freedom
A. Temperature         2            2450        1225                                  12.25
B. Pressure            2            950         475                                   4.75
C. Settling time       2            350*        175
D. Cleaning method     2            50*         25
Error                  0            0           -
Total                  8            3800
(Error)                (4)          (400)       (100)

*Indicates sum of squares added together to estimate the pooled error sum of squares shown within
parentheses. The F ratio is calculated as the ratio of the factor mean square to the error mean square.

Degrees of freedom:
The degrees of freedom associated with the grand total sum of squares are equal to the
number of rows in the design matrix.
The degree of freedom associated with the sum of squares due to mean is one.

The degrees of freedom associated with the total sum of squares will be equal to the
number of rows in the design matrix minus one.
The degrees of freedom associated with the factor will be equal to the number of levels
minus one.
The degrees of freedom for the error will be equal to the degrees of freedom for the total
sum of squares minus the sum of the degrees of freedom for the various factors.

In the present case-study, the degrees of freedom for the error will be zero. Hence an
approximate estimate of the error sum of squares is obtained by pooling the sum of squares
corresponding to the factors having the lowest mean square. As a rule of thumb, the sum of
squares corresponding to the bottom half of the factors (as defined by lower mean square) are
used to estimate the error sum of squares. In the present example, the factors C and D are used to
estimate the error sum of squares. Together they account for four degrees of freedom and their
sum of squares is 400.
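
The sums of squares, the pooling rule and the F ratios of Table 5.4.6 can be reproduced with a short script. This is a sketch under the assumptions stated in the text (three experiments per factor level, bottom half of the factors pooled into the error); the variable names are illustrative.

```python
# Level averages from Table 5.4.5 and the overall mean from equation (6).
level_means = {
    "A": [-20, -45, -60], "B": [-30, -40, -55],
    "C": [-50, -35, -40], "D": [-45, -40, -40],
}
m = -375 / 9                 # overall mean, about -41.67 dB
runs_per_level = 3           # each level appears in three of the nine experiments

# Equation (14) applied to every factor.
ss = {f: runs_per_level * sum((v - m) ** 2 for v in means)
      for f, means in level_means.items()}
dof = {f: len(means) - 1 for f, means in level_means.items()}
ms = {f: ss[f] / dof[f] for f in ss}

# Rule of thumb from the text: pool the bottom half of the factors (by mean square).
ranked = sorted(ms, key=ms.get)
pooled = ranked[: len(ranked) // 2]
error_ss = sum(ss[f] for f in pooled)
error_dof = sum(dof[f] for f in pooled)
error_ms = error_ss / error_dof

f_ratio = {f: ms[f] / error_ms for f in ss if f not in pooled}
print({f: round(v) for f, v in ss.items()})   # {'A': 2450, 'B': 950, 'C': 350, 'D': 50}
print(sorted(pooled), round(error_ss), error_dof)
print({f: round(v, 2) for f, v in f_ratio.items()})
```

The pooled error has four degrees of freedom and a mean square of 100 (dB)^2, which gives the F values 12.25 and 4.75 for factors A and B.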

Step 6: Interpretation of ANOVA table.
The major inferences from the ANOVA table are given in this section. Referring to the sum of
squares in Table 5.4.6, factor A makes the largest contribution to the total sum of squares
[(2450/3800) x 100 = 64.5%]. Factor B makes the next largest contribution (25%) to the total
sum of squares, whereas factors C and D together contribute only 10.5%. The larger the
contribution of a particular factor to the total sum of squares, the larger the ability of that
factor to influence $\eta$. Moreover, the larger the F-value, the larger the factor effect in
comparison to the error mean square or the error variance.

Step 7: Prediction of $\eta$ under optimum conditions
In the present example, the identified optimum condition or the optimum level of factors is
$A_1B_1C_2D_2$ (step 3). The value of $\eta$ under the optimum condition is predicted using the
additive model [equation (10)] as

\[ \eta_{opt} = m + (m_{A1} - m) + (m_{B1} - m) = -41.67 + (-20 + 41.67) + (-30 + 41.67) = -8.33 \text{ dB} \qquad (15) \]

Since the sums of squares due to the factors C and D are small and are also used to estimate the
error variance, these terms are not included in equation (15). Further, using equations (5) and
(15), the mean square count at the optimum condition is calculated as

\[ y_{opt} = 10^{-\eta_{opt}/10} = 10^{0.833} = 6.8 \ (\text{defects/unit area})^2 \]

The corresponding root-mean-square defect count is $\sqrt{6.8} = 2.6$ defects/unit area.
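
The prediction in equation (15) and the conversion back to a defect count can be verified numerically. A minimal sketch, using the level averages of Table 5.4.5 (variable names are illustrative):

```python
import math

m = -375 / 9          # overall mean of the S/N ratio, about -41.67 dB
m_A1 = -20.0          # average S/N ratio at level A1 (Table 5.4.5)
m_B1 = -30.0          # average S/N ratio at level B1 (Table 5.4.5)

# Equation (15): only the significant factors A and B enter the additive model.
eta_opt = m + (m_A1 - m) + (m_B1 - m)

# Invert equation (5): eta = -10 log10(C)  =>  C = 10 ** (-eta / 10).
mean_square_count = 10 ** (-eta_opt / 10)
rms_count = math.sqrt(mean_square_count)

print(round(eta_opt, 2))            # -8.33
print(round(mean_square_count, 1))  # 6.8
print(round(rms_count, 1))          # 2.6
```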

Design of Experiments
A designed experiment is a test or series of tests in which purposeful changes are made to the
input variables of a process or system so that we may observe and identify the reasons for
changes in the output response. For example, Figure 5.4.3 depicts a process or system under
study. The process parameters $x_1, x_2, x_3, \dots, x_p$ are controllable, whereas other
variables $z_1, z_2, z_3, \dots, z_q$ are uncontrollable. The term y refers to the output
variable. The objectives of the experiment are stated as:
Determining which variables are most influential on the response, y.
Determining where to set the influential x's so that y is almost always near the desired
nominal value.
Determining where to set the influential x's so that variability in y is small.
Determining where to set the influential x's so that the effects of the uncontrollable
$z_1, z_2, \dots, z_q$ are minimized.







Figure 5.4.3 General model of a process or system [2].



Experimental design is an important tool in numerous applications. For instance, it is vital for
improving the performance of a manufacturing process and in engineering design activities. The
use of experimental design in these areas results in products that are easier to manufacture,
products that have enhanced field performance and reliability, lower product cost, and shorter
product design and development time.

Guidelines for designing experiments
Recognition and statement of the problem.
Choice of factors and levels.
Selection of the response variable.
Choice of experimental design.
Performing the experiment.
Data analysis.
Conclusions and recommendations.

Factorial designs
Factorial designs are widely used in experiments involving several factors where it is necessary
to study the joint effect of the factors on a response. For simplicity and easy understanding, in
the present section the design matrix of the $2^2$ factorial design is presented with subsequent
explanation on the calculation of the main effects, interaction effects and the sum of squares.
Two-level design matrices are well known and are used very frequently in everyday engineering
applications.

The $2^2$ design
The $2^2$ design is the first design in the $2^k$ factorial design series. This involves two
factors (A and B), each run at two levels. Table 5.4.7 depicts the $2^2$ design matrix, where
-1 refers to the low level and +1 refers to the high level. These are also called
non-dimensional or coded values of the process parameters. The relation between the actual and
the coded process parameters is given as

\[ x_i = \frac{x - \left( \dfrac{x_{high} + x_{low}}{2} \right)}{\left( \dfrac{x_{high} - x_{low}}{2} \right)} \qquad (16) \]

where $x_i$ is the coded value of the process parameter (x). The term y refers to the response
parameter.

Table 5.4.7 $2^2$ factorial design matrix.

Expt. No.    Factors                    Response
             A        B        AB       y
1            -1       -1       +1       y1
2            +1       -1       -1       y2
3            -1       +1       -1       y3
4            +1       +1       +1       y4
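The main effect of factor A is calculated as the difference between the average response at the
high level of A and the average response at the low level of A:

\[ \bar{y}_{A+} - \bar{y}_{A-} = \frac{y_2 + y_4}{2r} - \frac{y_1 + y_3}{2r} \qquad (17) \]

Here r is taken to be the number of replicates of each experiment, so that $y_1, \dots, y_4$ are
the response totals over the r replicates.

Similarly, the main effect of factor B is calculated as

\[ \bar{y}_{B+} - \bar{y}_{B-} = \frac{y_3 + y_4}{2r} - \frac{y_1 + y_2}{2r} \qquad (18) \]

The interaction effect of AB is calculated as

\[ \bar{y}_{AB+} - \bar{y}_{AB-} = \frac{y_1 + y_4}{2r} - \frac{y_2 + y_3}{2r} \qquad (19) \]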

The next step is to compute the sum of squares of the main and interaction factors. Before doing
that, the contrasts of the factors need to be calculated as follows.

\[ (\text{Contrast})_A = (y_2 + y_4) - (y_1 + y_3) \qquad (20) \]

\[ (\text{Contrast})_B = (y_3 + y_4) - (y_1 + y_2) \qquad (21) \]

\[ (\text{Contrast})_{AB} = (y_1 + y_4) - (y_2 + y_3) \qquad (22) \]


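Further, these contrasts are utilized in the calculation of the sums of squares as follows.

\[ (\text{Sum of squares})_A = SS_A = \frac{[(\text{contrast})_A]^2}{r \times \text{number of rows}} \qquad (23) \]

\[ (\text{Sum of squares})_B = SS_B = \frac{[(\text{contrast})_B]^2}{r \times \text{number of rows}} \qquad (24) \]

\[ (\text{Sum of squares})_{AB} = SS_{AB} = \frac{[(\text{contrast})_{AB}]^2}{r \times \text{number of rows}} \qquad (25) \]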
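\[ \text{Total sum of squares} = SS_T = \sum_{i=1}^{2} \sum_{j=1}^{2} \sum_{k=1}^{r} y_{ijk}^2 - (r \times \text{number of rows}) \, y_{avg}^2 \qquad (26) \]

where $y_{ijk}$ is an individual observation and $y_{avg}$ is the average of all the
(r x number of rows) observations.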


In general, $SS_T$ has [(r x number of rows) - 1] degrees of freedom (dof). Each process
parameter (A, B and the interaction AB) is associated with a single degree of freedom, so the
error sum of squares, with [number of rows x (r - 1)] degrees of freedom, is calculated as

\[ \text{Error sum of squares} = SS_E = SS_T - SS_A - SS_B - SS_{AB} \qquad (27) \]

The complete analysis of variance is summarized in Table 5.4.8. This is called the analysis of
variance (ANOVA) table. The term $F_0$ refers to the F ratio, which is calculated as the ratio of
the factor mean square to the error mean square. The ANOVA table can be interpreted in the same
way as explained in the Taguchi method, step 6.

The main drawback of two-level designs is the failure to capture the non-linear influence of
the process parameters on the response. Three-level designs are used for this purpose. The
explanation of three-level designs is given elsewhere [2].


Table 5.4.8 Analysis of Variance (ANOVA) table.

Source of     Sum of       Degree of     Mean square           F0
variation     squares      freedom
A             SS_A         (dof)_A       SS_A/(dof)_A          (F0)_A
B             SS_B         (dof)_B       SS_B/(dof)_B          (F0)_B
AB            SS_AB        (dof)_AB      SS_AB/(dof)_AB        (F0)_AB
Error         SS_E         (dof)_E       SS_E/(dof)_E
Total         SS_T         (dof)_T
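
The contrasts, effects and sums of squares of equations (17) to (25) follow directly from the signs in Table 5.4.7. The sketch below uses hypothetical response totals and assumes that r is the number of replicates per run; the variable names are illustrative.

```python
# Coded design matrix of Table 5.4.7: (A, B) for experiments 1 to 4.
runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# Hypothetical response totals y1..y4, each summed over r replicates (illustrative data).
totals = [80.0, 100.0, 60.0, 90.0]
r = 3

# Sign columns; the AB column is the product of the A and B columns.
signs = {
    "A": [a for a, b in runs],
    "B": [b for a, b in runs],
    "AB": [a * b for a, b in runs],
}

# Equations (20)-(22): contrast = sum of (sign x total).
contrasts = {k: sum(s * y for s, y in zip(col, totals)) for k, col in signs.items()}

# Equations (17)-(19): effect = contrast / (2 r).
effects = {k: c / (2 * r) for k, c in contrasts.items()}

# Equations (23)-(25): SS = contrast^2 / (r x number of rows).
sum_sq = {k: c ** 2 / (r * len(runs)) for k, c in contrasts.items()}

print(contrasts)   # {'A': 50.0, 'B': -30.0, 'AB': 10.0}
print(effects)
print(sum_sq)
```

Each of these sums of squares carries one degree of freedom, so the mean squares in Table 5.4.8 equal the sums of squares for A, B and AB.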



Central composite rotatable design
Even though three-level designs help in understanding the non-linear influence of the process
parameters on the response, the number of experiments increases tremendously with the increase
in the number of process parameters. For example, the number of experiments involved in
three-level designs with three, four and five factors is twenty seven ($3^3 = 27$), eighty one
($3^4 = 81$) and two hundred and forty three ($3^5 = 243$), respectively. The principle of
central composite rotatable design (CCD) reduces the total number of experiments without a loss
of generality [2]. This is widely used as it can provide a second order multiple regression model
as a function of the independent process parameters with the minimum number of experimental
runs [2].

The principle of central composite rotatable design includes $2^f$ factorial experiments
to estimate the linear and the interaction effects of the independent variables on the responses,
where f is the number of factors or independent process variables. In addition, a number ($n_C$)
of repetitions [$n_C > f$] are made at the center point of the design matrix to calculate the
model-independent estimate of the noise variance, and 2f axial runs are used to facilitate the
incorporation of the quadratic terms into the model. The term rotatable indicates that the
variance of the model prediction would be the same at all points located equidistant from the
center of the design matrix.

The choice of the distance of the axial points ($\alpha$) from the centre of the design is
important to make a central composite design (CCD) rotatable. The value of $\alpha$ for
rotatability of the design scheme is estimated as $\alpha = (2^f)^{1/4}$ [2]. The number of
experiments is estimated as

\[ 2^f + (2f) + n_C \qquad (28) \]

The intermediate coded values are calculated as [2]

\[ x_i = \alpha \, \frac{x - \left( \dfrac{x_{max} + x_{min}}{2} \right)}{\left( \dfrac{x_{max} - x_{min}}{2} \right)} \qquad (29) \]

where $x_i$ is the coded value of a process variable (x) between $x_{max}$ and $x_{min}$, which
correspond to the coded levels $+\alpha$ and $-\alpha$. For example, the number of experiments in
a CCD matrix corresponding to two process variables is calculated as $2^2 + (2 \times 2) + 4 = 12$
and the distance of the axial points from the center is calculated as
$\alpha = (2^2)^{1/4} = 1.414$. Table 5.4.9 depicts the CCD for a two process parameter application.
Table 5.4.9 Central composite design (CCD) for a two process parameter application.

Expt. No.    Process parameters (coded)    Response variable
             x1          x2                y
1            -1          -1                y1
2            +1          -1                y2
3            -1          +1                y3
4            +1          +1                y4
5            -1.414      0                 y5
6            +1.414      0                 y6
7            0           -1.414            y7
8            0           +1.414            y8
9            0           0                 y9
10           0           0                 y10
11           0           0                 y11
12           0           0                 y12
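
A central composite rotatable design such as Table 5.4.9 can be generated programmatically. The following is a minimal sketch in which the function name `ccd_design` is hypothetical; it places the factorial runs first, then the axial runs, then the center runs, as in the table.

```python
from itertools import product

def ccd_design(f, n_center):
    """Coded CCD runs for f factors: 2**f factorial, 2*f axial and n_center center points."""
    alpha = (2 ** f) ** 0.25                      # axial distance for rotatability
    # Reverse each tuple so that the first factor changes fastest, as in Table 5.4.9.
    factorial = [tuple(reversed(p)) for p in product((-1.0, 1.0), repeat=f)]
    axial = []
    for j in range(f):
        for level in (-alpha, alpha):
            point = [0.0] * f
            point[j] = level
            axial.append(tuple(point))
    center = [(0.0,) * f] * n_center
    return alpha, factorial + axial + center

alpha, design = ccd_design(f=2, n_center=4)
print(round(alpha, 3), len(design))   # 1.414 12
for number, run in enumerate(design, start=1):
    print(number, run)
```

Calling `ccd_design(5, 8)` gives 50 runs with an axial distance of about 2.3784, the coded limit that appears later in Table 5.4.11.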




Regression modeling
Regression models are mathematical estimation equations that express a response variable as a
function of the process parameters. These models are developed statistically by utilizing the
measured response variable and the corresponding design matrix. Considering f independent
process parameters, a generalized regression model can be represented as

\[ y_m^* = \beta_0 + \sum_{j=1}^{f} \beta_j x_j^* + \sum_{j=1}^{f} \beta_{jj} (x_j^*)^2 + \left( \sum_{i=1}^{f} \sum_{j=1,\, i<j}^{f} \beta_{ij} x_i^* x_j^* \right) + \varepsilon \qquad (30) \]

where $y_m^*$ is a response variable in non-dimensional form, $x_i^*$ and $x_j^*$ refer to the
independent variables in non-dimensional form, the $\beta$s refer to the regression coefficients
and $\varepsilon$ is the error term.
Calculation of the regression coefficients and ANOVA terms
The coefficients, $\beta$s, in the regression model [equation (30)] are calculated by minimizing
the error between the experimentally measured and the corresponding estimated values of the
response variable. The least square function, S, to be minimized over the u experiments can be
expressed as [3]

\[ S(\beta_0, \beta_1, \dots, \beta_{ij}) = \sum_{s=1}^{u} \left[ y_m^* - \beta_0 - \sum_{j=1}^{f} \beta_j x_j^* - \sum_{j=1}^{f} \beta_{jj} (x_j^*)^2 - \sum_{i=1}^{f} \sum_{j=1,\, i<j}^{f} \beta_{ij} x_i^* x_j^* \right]_s^2 \qquad (31) \]

The estimated second order response surface model is represented as

\[ y_p^* = \beta_0 + \sum_{j=1}^{f} \beta_j x_j^* + \sum_{j=1}^{f} \beta_{jj} (x_j^*)^2 + \sum_{i=1}^{f} \sum_{j=1,\, i<j}^{f} \beta_{ij} x_i^* x_j^* \qquad (32) \]
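
Minimizing the least square function of equation (31) leads to the normal equations, which can be solved for the coefficients of the second order model. The sketch below uses only the Python standard library; the function names, the two-factor CCD points and the "true" coefficients used to generate noise-free responses are all hypothetical and serve only to show that the fit recovers them.

```python
from itertools import combinations

def model_row(x):
    """Terms of the second order model: 1, x_j, x_j^2 and x_i * x_j (i < j)."""
    squares = [v * v for v in x]
    cross = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return [1.0] + list(x) + squares + cross

def solve(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[k][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(aug[k][c] * beta[c] for c in range(k + 1, n))
        beta[k] = (aug[k][n] - tail) / aug[k][k]
    return beta

def fit_second_order(points, responses):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    X = [model_row(p) for p in points]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, responses)) for i in range(k)]
    return solve(xtx, xty)

a = 2 ** 0.5   # axial distance of the two-factor rotatable CCD
points = [(-1, -1), (1, -1), (-1, 1), (1, 1),
          (-a, 0), (a, 0), (0, -a), (0, a)] + [(0, 0)] * 4
true_beta = [2.0, 0.5, -0.3, 0.1, 0.2, -0.4]   # hypothetical coefficients
responses = [sum(b * t for b, t in zip(true_beta, model_row(p))) for p in points]

beta = fit_second_order(points, responses)
print([round(b, 6) for b in beta])
```

The coefficients are returned in the order of equation (30): the constant, the linear terms, the squared terms and then the cross-product terms.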

Further the adequacy of the developed estimation model is tested using Analysis of Variance
(ANOVA) as shown in Table 5.4.10.


Table 5.4.10 Analysis of variance (ANOVA) method for testing the significance of regression
model [3].

Source of variation    Sum of      Degree of freedom    Mean       F-statistic    P-value
                       squares     (dof)                square     (F)
Regression             SS_R        m - 1                MS_R       F_R            P_R
Linear terms           SS_R_L      m - 1 - m'           MS_R_L     F_R_L          P_R_L
Non-linear terms       SS_R_NL     m'                   MS_R_NL    F_R_NL         P_R_NL
Residual               SS_Res      u - m                MS_Res
Lack of fit            SS_LOF      u - m - n_C + 1      MS_LOF
Pure error             SS_PE       n_C - 1              MS_PE
Total                  SS_T        u - 1
R^2_Adj

The terms in the ANOVA table are calculated in the following manner.

\[ SS_R = \sum_{s=1}^{u} (y_p^*)_s^2 - \frac{\left( \sum_{s=1}^{u} (y_m^*)_s \right)^2}{u}; \quad SS_{Res} = \sum_{s=1}^{u} \left[ (y_m^*)_s - (y_p^*)_s \right]^2 \qquad (33, 34) \]

\[ SS_T = \sum_{s=1}^{u} (y_m^*)_s^2 - \frac{\left( \sum_{s=1}^{u} (y_m^*)_s \right)^2}{u}; \quad SS_{R\_L} = \sum_{s=1}^{u} (y_{p\_L}^*)_s^2 - \frac{\left( \sum_{s=1}^{u} (y_m^*)_s \right)^2}{u} \qquad (35, 36) \]

\[ SS_{R\_NL} = SS_R - SS_{R\_L}; \quad SS_{PE} = \sum_{s=43}^{u} (y_m^*)_s^2 - \frac{\left( \sum_{s=43}^{u} (y_m^*)_s \right)^2}{n_C} \qquad (37, 38) \]

where the sums in $SS_{PE}$ run over the $n_C$ repeated center-point runs (s = 43 to u).

\[ SS_{LOF} = SS_{Res} - SS_{PE}; \quad MS_R = \frac{SS_R}{m - 1}; \quad MS_{Res} = \frac{SS_{Res}}{u - m}; \quad MS_{R\_L} = \frac{SS_{R\_L}}{m - 1 - m'} \qquad (39, 40) \]

\[ MS_{R\_NL} = \frac{SS_{R\_NL}}{m'}; \quad MS_{LOF} = \frac{SS_{LOF}}{u - m - n_C + 1}; \quad MS_{PE} = \frac{SS_{PE}}{n_C - 1}; \quad F_R = \frac{MS_R}{MS_{Res}}; \]
\[ F_{R\_L} = \frac{MS_{R\_L}}{MS_{Res}}; \quad F_{R\_NL} = \frac{MS_{R\_NL}}{MS_{Res}}; \quad F_{LOF} = \frac{MS_{LOF}}{MS_{PE}}; \quad R^2_{Adj} = 1 - \frac{\left( \dfrac{SS_{Res}}{u - m} \right)}{\left( \dfrac{SS_T}{u - 1} \right)} \qquad (41-48) \]

where
(a) $SS_R$, $SS_{Res}$ and $SS_T$ refer to the regression sum of squares, residual sum of squares
and total sum of squares with degrees of freedom m - 1 (m is the number of terms in the
regression model), u - m and u - 1, respectively.
(b) $SS_{R\_L}$, $SS_{R\_NL}$, $SS_{PE}$ and $SS_{LOF}$ refer to the regression sum of squares of
the model having only linear terms, regression sum of squares of the model having only
non-linear terms, pure error sum of squares and the lack of fit sum of squares with degrees
of freedom m - 1 - m', m' (number of non-linear terms in the response surface model),
$n_C$ - 1 and u - m - $n_C$ + 1, respectively.
(c) $y_{p\_L}^*$ refers to the regression model with only linear terms.
(d) $MS_R$ and $MS_{Res}$ refer to the regression mean squares and the residual mean squares,
respectively.
(e) $MS_{R\_L}$, $MS_{R\_NL}$, $MS_{PE}$ and $MS_{LOF}$ refer to the regression mean squares of
the model having only linear terms, regression mean squares of the model having only non-linear
terms, pure error mean squares and lack of fit mean squares, respectively.
(f) $F_R$, $F_{R\_L}$, $F_{R\_NL}$ and $F_{LOF}$ refer to the F-statistic required for the
hypothesis testing of the regression model, model with only linear terms, model with only
quadratic terms and the lack of fit of the regression model, respectively.
(g) $P_R$, $P_{R\_L}$, $P_{R\_NL}$ and $P_{LOF}$ refer to the P-value of the regression model,
model with only linear terms, model with only non-linear terms, and the lack of fit of the
second order response surface model, respectively. The term P-value refers to the smallest
significance level at which the data lead to rejection of the null hypothesis. In other
words, if the P-value is less than the level of significance ($\alpha$), then the null
hypothesis is rejected. These values are calculated using the corresponding F-statistic value
and the F-distribution table.
(h) $R^2_{Adj}$ refers to the adjusted coefficient of determination.
Model adequacy checking
The various steps followed to check the adequacy of the regression model are
[1] Step 1
Initially, the lack of fit test is performed to check the lack of fit of the regression model. The
appropriate hypothesis considered for testing is

$H_0$: The regression model is adequate (Null hypothesis) (49)
$H_1$: The regression model is not adequate (Alternate hypothesis) (50)

For a given significance level ($\alpha$), the null hypothesis is rejected if

\[ F_{LOF} > F_{\alpha,\, u-m-n_C+1,\, n_C-1} \quad \text{and} \quad \alpha > P_{LOF} \qquad (51) \]

The terms $F_{\alpha,\, u-m-n_C+1,\, n_C-1}$ and $P_{LOF}$ are calculated from the F-distribution
table. The value of $\alpha$ is considered as 0.1 in the present study [3]. If equation (51) is
not satisfied then the null hypothesis is accepted, implying that there is no evidence of lack of
fit for the regression model and the same model can be used for further analysis.
[2] Step 2
The significance of this quadratic model is checked by conducting hypothesis testing. The
appropriate hypothesis considered for testing is

\[ H_0: \beta_1 = \beta_2 = \dots = \beta_j = \beta_{11} = \beta_{22} = \dots = \beta_{jj} = \beta_{12} = \beta_{13} = \dots = \beta_{ij\,(i<j)} = 0; \ \text{Null hypothesis} \]
\[ H_1: \beta \neq 0 \ \text{for at least one term in the regression model}; \ \text{Alternate hypothesis} \qquad (52) \]

For a given significance level ($\alpha$), the null hypothesis is rejected if

\[ F_R > F_{\alpha,\, m-1,\, u-m} \quad \text{and} \quad P_R\text{-value} < \alpha \qquad (53) \]

The terms $F_{\alpha,\, m-1,\, u-m}$ and $P_R$ are calculated from the F-distribution table. If
equation (53) is satisfied then the null hypothesis is rejected, implying that at least one of the
regressors in the model is non-zero or significant.
[3] Step 3
The contribution of the linear and non-linear terms to the model is tested. For a given
significance level ($\alpha$), the linear terms contribute significantly when

\[ F_{R\_L} > F_{\alpha,\, m-1-m',\, u-m} \quad \text{and corresponding } P_{R\_L}\text{-value} < \alpha; \qquad (54) \]

and the quadratic terms contribute significantly when

\[ F_{R\_NL} > F_{\alpha,\, m',\, u-m} \quad \text{and corresponding } P_{R\_NL}\text{-value} < \alpha \qquad (55) \]

[4] Step 4
The adjusted coefficient of determination $R^2_{Adj}$ is calculated. This represents the
proportion of the variation in the response explained by the regression model. If the value of
$R^2_{Adj}$ is close to 1.0 then most of the variability in the response is explained by the model.

[5] Step 5
The t-statistic and P-value of all the coefficients in the regression model are calculated. If the
P-value of any term in the model is greater than $\alpha$ then that term is insignificant.

[6] Step 6
The significant terms in the regression model are identified using step-wise regression
analysis. Step-wise regression analysis involves multiple steps of regression, where in each step
a single variable having a low P-value (< $\alpha$) is added to the model such that it improves
the adjusted coefficient of determination. The detailed explanation of step-wise regression
analysis is given elsewhere [3]. Further, using the final regression model having only the
significant terms, the ANOVA table is recalculated.

Example
Two wire tandem submerged arc welding is performed on an HSLA steel plate of 12 mm thickness.
The influence of five important process parameters on the weld bead dimensions is studied. The
process parameters are the leading wire current, the trailing wire positive pulse current, the
trailing wire negative pulse current and its time duration, and the welding speed. Table 5.4.11
depicts the working range of the process parameters. The design matrix corresponding to this
experiment and the measured weld bead dimensions at different welding conditions are reported
elsewhere [4]. The independent process variables and the response variables are considered in
non-dimensional form in the present work in the following manner.

\[ I_{LE}^* = 2.3784 \, \frac{[2 I_{LE} - (590 + 300)]}{590 - 300}; \quad I_{TR}^{+*} = 2.3784 \, \frac{[2 I_{TR}^{+} - (401 + 319)]}{401 - 319}; \]
\[ I_{TR}^{-*} = 2.3784 \, \frac{[2 I_{TR}^{-} - (958 + 401)]}{958 - 401}; \quad t_{TR}^{-*} = 2.3784 \, \frac{[2 t_{TR}^{-} - (0.01253 + 0.00835)]}{0.01253 - 0.00835}; \]
\[ v^* = 2.3784 \, \frac{[2 v - (17.45 + 7.0)]}{17.45 - 7.0}; \quad w_m^* = \frac{w}{w_G}; \quad d_m^* = \frac{d}{tp}; \quad h_m^* = \frac{h}{w_G} \qquad (56) \]

where
[1] w, d and h refer to the measured values of weld width, penetration and reinforcement
height, respectively, corresponding to any welding condition,
[2] $w_G$ and tp refer to the width of the V-groove at the surface and the thickness of the base
plate, respectively,
[3] $w_m^*$, $d_m^*$ and $h_m^*$ refer to the measured values of weld width, penetration and
reinforcement height, respectively, in non-dimensional form.

Utilizing the measured values of the weld bead width at different welding conditions [4], develop
the regression model of the weld bead width as a function of the welding conditions.

Table 5.4.11 Process parameters and their limits [4].

Process parameters                   Notation            Factor levels
                                                         -2.3784    -1         0          +1         +2.3784
Leading wire current                 $I_{LE}$ (A)        300        384        445        506        590
Trailing wire +ve pulse current      $I_{TR}^{+}$ (A)    319        343        360        377        401
Trailing wire -ve pulse current      $I_{TR}^{-}$ (A)    401        562        680        797        958
Trailing wire negative pulse time    $t_{TR}^{-}$ (s)    0.00835    0.00956    0.01044    0.01132    0.01253
Welding speed                        v (mm/s)            7          10         12.23      14.45      17.45
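
The coded levels in Table 5.4.11 can be checked against the coding used in equation (56). A minimal sketch follows; the function name `coded_value` is hypothetical and the axial distance 2.3784 is taken from the table.

```python
def coded_value(x, x_min, x_max, alpha=2.3784):
    """Coded value of x on a scale where x_min and x_max map to -alpha and +alpha."""
    return alpha * (2.0 * x - (x_max + x_min)) / (x_max - x_min)

# Leading wire current (A): limits 300 and 590, intermediate levels from Table 5.4.11.
for current in (300, 384, 445, 506, 590):
    print(current, round(coded_value(current, 300, 590), 2))

# Welding speed (mm/s): limits 7 and 17.45.
for speed in (7, 10, 12.23, 14.45, 17.45):
    print(speed, round(coded_value(speed, 7, 17.45), 2))
```

The intermediate levels in the table are rounded physical values, so their coded values come out close to, but not exactly, -1, 0 and +1.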

Solution
Sequentially following the steps explained under the section regression modeling, the weld bead
width regression model as a function of process parameters is developed as

* * * *
* * *
*
* * * *
TR TR TR TR
2 2
TR
2
TR
TR TR TR LE p
t I 043 . 0 I I 0370 . 0
) v ( 0620 . 0 ) I ( 0430 . 0 ) I ( 0320 . 0 v 3900 . 0
t 0710 . 0 I 1570 . 0 I 0450 . 0 I 1200 . 0 655 . 2 w
*
+
+
+

+
+ + + =
(57)
where the term $w_p^*$ refers to the predicted non-dimensional weld bead width. Table 5.4.12
depicts the corresponding ANOVA table. This ANOVA table explains the contribution of the
linear and non-linear terms, and the proportion of variation of the predicted weld bead width
from the measured values.


Table 5.4.12 ANOVA table for the weld bead width regression model.

Source of variation    Sum of      Degree of    Mean square    F-statistic    P-value
                       squares     freedom                     (F)
Regression             9.2087      20           0.4604         36.77          0.00
Linear terms           8.5678      5            1.7136         137.09         0.00
Non-linear terms       0.6409      15           0.0427         3.4181         0.002
Residual               0.3631      29           0.0125
Lack of fit            0.2985      22           0.0136         1.4737         0.292
Pure error             0.0646      7            0.0092
Total                  9.5718      49
$R^2_{Adj}$ = 0.94

The adjusted coefficient of determination ($R^2_{Adj}$) corresponding to equation (57) is
calculated as 0.94 (Table 5.4.12). The adjusted coefficient of determination represents the
proportion of the variation in the response explained by the regression model [3]. It is thus
envisaged that equation (57) can capture 94% of the variation in the measured values of weld
width as a function of the five independent welding conditions within the ranges considered in
the present study.
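
The mean squares, F-statistics and adjusted coefficient of determination in Table 5.4.12 can be recomputed from the tabulated sums of squares and degrees of freedom. A short sketch (variable names are illustrative):

```python
# Sums of squares and degrees of freedom from Table 5.4.12.
ss = {"regression": 9.2087, "residual": 0.3631,
      "lack_of_fit": 0.2985, "pure_error": 0.0646, "total": 9.5718}
dof = {"regression": 20, "residual": 29,
       "lack_of_fit": 22, "pure_error": 7, "total": 49}

# Mean square = sum of squares / degrees of freedom.
ms = {key: ss[key] / dof[key] for key in ss}

f_regression = ms["regression"] / ms["residual"]      # significance of the model
f_lack_of_fit = ms["lack_of_fit"] / ms["pure_error"]  # lack of fit test
r2_adj = 1.0 - ms["residual"] / ms["total"]           # adjusted R^2

print(round(f_regression, 2))   # 36.77
print(round(f_lack_of_fit, 2))  # 1.47
print(round(r2_adj, 2))         # 0.94
```

Small differences from the tabulated lack of fit F-statistic arise because the table works with rounded mean squares.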


Exercise
1. Develop the design matrix for three factors operating at three levels.
2. Develop the regression model for the penetration as a function of process parameters
using the data published in reference [4].

Reference
[1] M. S. Phadke, Quality engineering using robust design, 2nd edition, Pearson, 2009.
[2] D. C. Montgomery, Design and analysis of experiments, 3rd edition, John Wiley and Sons,
1991.
[3] D. C. Montgomery, E. A. Peck and G. G. Vining, Introduction to linear regression
analysis, 4th edition, John Wiley and Sons, 2006.
[4] D. V. Kiran, B. Basu and A. De, Influence of process variables on weld bead quality in two
wire tandem submerged arc welding of HSLA steel, Journal of Materials Processing
Technology, 2010, doi:10.1016/jmatprotec.2012.05.008.
