

Reliability and Maintainability Symposium, San Jose, CA, USA, January 25-28, 2010.

This material is posted here with permission of the IEEE. Such

permission of the IEEE does not in any way imply IEEE

endorsement of any of ReliaSoft Corporation's products or

services. Internal or personal use of this material is permitted.

However, permission to reprint/republish this material for

advertising or promotional purposes or for creating new

collective works for resale or redistribution must be obtained

from the IEEE by writing to pubs-permissions@ieee.org.

By choosing to view this document, you agree to all provisions

of the copyright laws protecting it.

Huairui Guo
ReliaSoft Corporation
1450 S. Eastside Loop
Tucson, AZ 85710 USA
e-mail: Harry.Guo@ReliaSoft.com

Adam Mettas
ReliaSoft Corporation
1450 S. Eastside Loop
Tucson, AZ 85710 USA
e-mail: Adam.Mettas@ReliaSoft.com

Design of Experiments (DOE) is one of the most useful statistical tools in product design and testing. While many

organizations benefit from designed experiments, others obtain data with little useful information and waste resources because their experiments have not been carefully designed. Design of Experiments can be applied in many areas including

but not limited to: design comparisons, variable identification, design optimization, process control and product performance

prediction. Different design types in DOE have been developed for different purposes. Many engineers are confused or even

intimidated by so many options.

This tutorial will focus on how to plan experiments effectively and how to analyze data correctly. Practical and correct

methods for analyzing data from life testing will also be provided.

Huairui Guo is the Director of Theoretical Development at ReliaSoft Corporation. He received his Ph.D. in Systems and

Industrial Engineering from the University of Arizona. He has published numerous papers in the areas of quality engineering, including SPC, ANOVA and DOE, and reliability engineering. His current research interests include repairable system

modeling, accelerated life/degradation testing, warranty data analysis and robust optimization. Dr. Guo is a member of SRE,

IIE and IEEE. He is a Certified Reliability Professional (CRP).

Mr. Mettas is the Vice President of product development at ReliaSoft Corporation. He fills a critical role in the advancement of

ReliaSoft's theoretical research efforts and formulations in the subjects of Life Data Analysis, Accelerated Life Testing, and

System Reliability and Maintainability. He has played a key role in the development of ReliaSoft's software including,

Weibull++, ALTA and BlockSim, and has published numerous papers on various reliability methods. Mr. Mettas holds a B.S.

degree in Mechanical Engineering and an M.S. degree in Reliability Engineering from the University of Arizona. He is a

Certified Reliability Professional (CRP).

Table of Contents

1. Introduction
2. Statistical Background
3. Two Level Factorial Design
4. Response Surface Methods (RSM)
5. DOE for Life Testing
6. Conclusions
7. References
8. Tutorial Visuals

1. INTRODUCTION

One of the most effective ways to improve product quality and reliability is to integrate them in the design and manufacturing process. Design of Experiments (DOE) is a useful tool that can

process. Design of Experiments (DOE) is a useful tool that can

be integrated into the early stages of the development cycle. It

has been successfully adopted by many industries, including

automotive, semiconductor, medical devices, chemical

products, etc. The application of DOE is not limited to

engineering. Many successful stories can be found in other

areas. For example, it has been used to reduce administration

costs, improve the efficiency of surgery processes, and

establish better advertisement strategies.

1.1 Why DOE

DOE will make your life easier. For many engineers,

applying DOE knowledge in their daily work will reduce lots

of trouble. Here are two examples of bad experiments that will

cause trouble.

Example 1: Assume the reliability of a product is affected by

voltage. The usage level is 10 volts. In order to predict the

reliability at the usage level, fifty units are available for

accelerated life testing. An engineer tested all fifty units at a

voltage of 25. Is this a good test?

Example 2: Assume the reliability of a product is affected by

temperature and humidity. The usage level is 40 degrees

Celsius and 50% relative humidity. In order to predict the

reliability at the usage level, fifty units are available for

accelerated life testing. The design is conducted in the

following way:

Table 1.

Number of Units | Temperature (Celsius) | Humidity (%)
25              | 120                   | 95
25              | 85                    | 85

Will the engineer be able to predict the reliability at the usage

level with the failure data from this test?

1.2 What DOE Can Do

DOE can help you design better tests than the above two

examples. Based on the objectives of the experiments, DOE

can be used for the following purposes [1, 2]:

1. Comparisons. When you have multiple design options, or when several materials or suppliers are available, you can design an

experiment to choose the best one. For example, in the

comparison of six different suppliers that provide connectors,

will the components have the same expected life? If they are

different, how are they different and which is the best?

2. Variable Screening. If there are a large number of

variables that can affect the performance of a product or a

system, but only a relatively small number of them are

important, a screening experiment can be conducted to

identify the important variables. For example, the warranty

return is abnormally high after a new product is launched. The return rate may be affected by temperature, duty cycle, humidity and several other factors. DOE can be

used to quickly identify the troublemakers and a follow-up

experiment can provide the guidelines for design modification

to improve the reliability.

3. Transfer Function Exploration. Once a small number of

variables have been identified as important, their effects on the

system performance or response can be further explored. The

relationship between the input variables and output response is

called the transfer function. DOE can be applied to design

efficient experiments to study the linear and quadratic effects

of the variables and some of the interactions between the

variables.

4. System Optimization. The goal of system design is to

improve the system performance, such as to improve the

efficiency, quality, and reliability. If the transfer function

between variables and responses has been identified, the

transfer function can be used for design optimization. DOE

provides an intelligent sequential strategy to quickly move the

experiment to a region containing the optimum settings of the

variables.

5. System Robustness. In addition to optimizing the

response, it is important to make the system robust against

noise, such as environmental factors and uncontrolled

factors. Robust design, one of the DOE techniques, can be

used to achieve this goal.

1.3 Common Design Types

Different designs have been used for different experiment

purposes. The following list gives the commonly used design

types.

1. For comparison

One factor design

2. For variable screening

2 level factorial design

Taguchi orthogonal array

Plackett-Burman design

3. For transfer function identification and optimization

Central composite design

Box-Behnken design

4. For system robustness

Taguchi robust design

The designs used for transfer function identification and

optimization are called Response Surface Method designs. In

this tutorial, we will focus on 2 level factorial design and

response surface method designs. They are the two most

popular and basic designs.

1.4 General Guidelines for Conducting DOE

DOE is not only a collection of statistical techniques that

enable an engineer to conduct better experiments and analyze

data efficiently; it is also a philosophy. In this section, general

guidelines for planning efficient experiments will be given.

The following seven-step procedure should be followed [1, 2].

1. Clarify and State Objective. The objective of the

experiment should be clearly stated. It is helpful to prepare a list of the specific problems that are to be addressed by the experiment.

2. Choose Responses. Responses are the experimental

outcomes. An experiment may have multiple responses based

on the stated objectives. The responses that have been chosen

should be measurable.

3. Choose Factors and Levels. A factor is a variable that is

going to be studied through the experiment in order to

understand its effect on the responses. Once a factor has been

selected, the value range of the factor that will be used in the

experiment should be determined. Two or more values within

the range need to be used. These values are referred to as

levels or settings. Practical constraints of treatments must be

considered, especially when safety is involved. A cause-and-effect diagram, or fishbone diagram, can be utilized to help

identify factors and determine factor levels.

4. Choose Experimental design. According to the objective

of the experiments, the analysts will need to select the number of factors, the number of levels of the factors, and an appropriate

design type. For example, if the objective is to identify

important factors from many potential factors, a screening

design should be used. If the objective is to optimize the

response, designs used to establish the factor-response

function should be planned. In selecting design types, the

available number of test samples should also be considered.

5. Perform the Experiment. A design matrix should be used

as a guide for the experiment. This matrix describes the

experiment in terms of the actual values of factors and the test

sequence of factor combinations. For a hard-to-set factor, its

value should be set first. Within each of this factor's settings,

the combinations of other factors should be tested.

6. Analyze the Data. Statistical methods such as regression

analysis and ANOVA (Analysis of Variance) are the tools for

data analysis. Engineering knowledge should be integrated

into the analysis process. Statistical methods cannot prove that

a factor has a particular effect. They only provide guidelines

for making decisions. Statistical techniques together with good

engineering knowledge and common sense will usually lead to

sound conclusions. Without common sense, pure statistical

models may be misleading. For example, models created by

smart Wall Street scientists did not avoid, and probably

contributed to, the economic crisis in 2008.

7. Draw Conclusions and Make Recommendations. Once the

data have been analyzed, practical conclusions and

recommendations should be made. Graphical methods are

often useful, particularly in presenting the results to others.

Confirmation testing must be performed to validate the

conclusion and recommendations.

The above seven steps are the general guidelines for

performing an experiment. A successful experiment requires

knowledge of the factors, the ranges of these factors and the

appropriate number of levels to use. Generally, this

information is not perfectly known before the experiment.

Therefore, it is suggested to perform experiments iteratively

and sequentially. It is usually a major mistake to design a

single, large, comprehensive experiment at the start of a study.

As a general rule, no more than 25% of the available resources should be invested in the first experiment.

2. STATISTICAL BACKGROUND

Linear regression and ANOVA are the statistical methods used in DOE data analysis. Knowing them will help you have a better understanding of DOE.

2.1 Linear Regression [2]

A general linear model, or multiple regression model, is:

Y = β0 + β1X1 + ... + βpXp + ε    (1)

where Y is the response, also called the output or dependent variable; Xi is the predictor, also called the input or independent variable; and ε is the random error, or noise, which is assumed to be normally distributed with mean 0 and variance σ², usually noted as ε ~ N(0, σ²). Because ε is normally distributed, for a given value of X, Y is also normally distributed and Var(Y) = σ².

From the model, it can be seen that the variation or the

difference of Y consists of two parts. One is the random part ε. The other is the difference caused by the different values of X. For example, consider the data in Table 2:

Observation | X1  | X2 | Y   | Mean
1           | 120 | 90 | 300 | 325
2           | 120 | 90 | 350 |
3           | 85  | 95 | 150 | 170
4           | 85  | 95 | 190 |
5           | 120 | 95 | 400 | 415
6           | 120 | 95 | 430 |

Table 2 has three different combinations of X1 and X2. For each combination, there are two observations. Because of the randomness caused by ε, these two observations are different although they have the same X values. This difference usually is called within-run variation. The mean values of Y at the three combinations are different too. This difference is caused by the difference of X1 and X2 and usually is called between-run variation.
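To make the model of equation (1) concrete, it can be fit to the Table 2 data by ordinary least squares. The sketch below is a Python addition (the tutorial itself contains no code) and assumes only NumPy:

```python
import numpy as np

# The six observations of Table 2: predictors X1, X2 and response Y.
X = np.array([[120, 90], [120, 90], [85, 95],
              [85, 95], [120, 95], [120, 95]], dtype=float)
Y = np.array([300, 350, 150, 190, 400, 430], dtype=float)

# Prepend an intercept column: Y = b0 + b1*X1 + b2*X2 + error.
A = np.column_stack([np.ones(len(Y)), X])

# Ordinary least squares estimate of the coefficients.
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(beta)
```

The fitted coefficients, about -2135, 7 and 18, are the values that appear later in the regression results of Table 4.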

If the between-run variation is significantly larger than the

within-run variation, it means most of the variation of Y is

caused by the difference of X settings. In other words, Xs

significantly affect Y. The difference of Ys caused by the Xs

are much more significant than the difference caused by the

noise.

From Table 2, we have the feeling that the between-run

variation is larger than the within-run variation. To confirm

this, statistical methods should be applied. The amount of the

total variation of Y is defined by the sum of squares:

SST = Σi=1..n (Yi - Ȳ)²    (2)

Since SST is affected by the

number of observations, to eliminate this effect, another metric called mean squares is used to measure the normalized variability of Y:

MST = SST / (n - 1) = (1/(n - 1)) Σi=1..n (Yi - Ȳ)²    (3)

As mentioned before, the total sum of squares can be partitioned into two parts: the within-run variation caused by random noise (called the sum of squares of error, SSE) and the between-run variation caused by different values of the Xs (called the sum of squares of regression, SSR):

SST = SSR + SSE = Σi=1..n (Ŷi - Ȳ)² + Σi=1..n (Yi - Ŷi)²    (4)

where Ŷi is the predicted value for the ith test. For tests with the same X values, the predicted values are the same.

Similar to equation (3), the mean squares of regression and the mean squares of error are calculated by:

MSR = SSR / p = (1/p) Σi=1..n (Ŷi - Ȳ)²    (5)

MSE = SSE / (n - 1 - p) = (1/(n - 1 - p)) Σi=1..n (Yi - Ŷi)²    (6)

When there is more than one input variable, SSR can be further divided into the variation caused by each variable:

SSR = SS_X1 + SS_X2 + ... + SS_Xp    (7)

The mean squares of each input variable, MS_Xi, is compared with MSE to test if the effect of Xi is significantly greater than the effect of noise.

The mean squares of regression, MSR, is used to measure the between-run variance that is caused by the predictor Xs. The mean squares of error, MSE, represents the within-run variance caused by noise. By comparing these two values, we can find out if the variance contributed by the Xs is significantly greater than the variance caused by noise. ANOVA is the method used for making this comparison in a statistical way.
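Before moving on, the decomposition can be verified numerically on the Table 2 data. A Python sketch (our addition):

```python
import numpy as np

# Table 2 data: three X combinations ("runs"), two observations each.
groups = [[300, 350], [150, 190], [400, 430]]
ys = np.concatenate([np.array(g, dtype=float) for g in groups])
grand_mean = ys.mean()

# Total variation, equation (2).
ss_t = ((ys - grand_mean) ** 2).sum()
# Between-run variation: group means around the grand mean.
ss_r = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
# Within-run variation: observations around their own group mean.
ss_e = sum(((np.array(g, dtype=float) - np.mean(g)) ** 2).sum() for g in groups)

print(ss_t, ss_r, ss_e)  # ss_t = ss_r + ss_e, as in equation (4)
```

The three values match the Total (6.39E+04), Model (6.14E+04) and Residual (2500) sums of squares in the ANOVA table of the next section.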

2.2 ANOVA (Analysis of Variance)

The ratio of the two mean squares

F0 = MSR / MSE    (8)

is used to test the following two hypotheses:

H0: There is no difference between the variance caused by the Xs and the variance caused by noise.

H1: The variance caused by the Xs is larger than the variance caused by noise.

Under the null hypothesis, the ratio follows the F distribution with degrees of freedom p and n - 1 - p. For this example, the total sum of squares can also be partitioned by variable:

SST = SSR + SSE = SS_X1 + SS_X2 + SSE    (9)

By applying ANOVA to the data for this example, we get the following ANOVA table (Table 3).

Table 3.

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model               | 2  | 6.14E+04       | 3.07E+04     | 36.86   | 0.0077
X1                  | 1  | 6.00E+04       | 6.00E+04     | 72.03   | 0.0034
X2                  | 1  | 8100           | 8100         | 9.72    | 0.0526
Residual            | 3  | 2500           | 833.3333     |         |
Total               | 5  | 6.39E+04       |              |         |

The third column shows the values for the sum of squares. The fifth column shows the F ratio of each source; all the values are much bigger than 1. The last column is the P value: the smaller the P value, the larger the difference between the variance caused by the corresponding source and the variance caused by noise. Usually, a significance level α, such as 0.05 or 0.1, is used for comparison with the P values. If a P value is less than α, the corresponding source is said to be significant at the significance level of α. From Table 3, we can see that both variables X1 and X2 are significant to the response Y at the significance level of 0.1.
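The Model-row F test of Table 3 can be reproduced in a few lines. This Python sketch (our addition) avoids any statistics library by using the closed-form upper-tail probability of the F distribution, which is available when the numerator degrees of freedom equal 2:

```python
# Sums of squares for the Table 2 data (see Table 3).
ss_r, ss_e = 61433.3333, 2500.0
p, n = 2, 6                      # two predictors, six observations

ms_r = ss_r / p                  # equation (5)
ms_e = ss_e / (n - 1 - p)        # equation (6)
f0 = ms_r / ms_e                 # equation (8)

# P(F > f0) for an F(2, d2) distribution has the closed form
# (1 + 2*f0/d2) ** (-d2/2); here d2 = n - 1 - p = 3.
d2 = n - 1 - p
p_value = (1 + p * f0 / d2) ** (-d2 / 2)
print(round(f0, 2), round(p_value, 4))
```

This reproduces F = 36.86 and P = 0.0077 from the Model row of Table 3.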

Another way to test whether or not a variable is significant is to test whether or not its coefficient is 0 in the regression model. For this example, the linear regression model is:

Y = β0 + β1X1 + β2X2 + ε    (10)

If we want to test whether or not β1 = 0, we can use the following hypothesis:

H0: β1 = 0

Under this null hypothesis, the test statistic follows a t distribution:

T0 = β1 / se(β1)    (11)

where se(β1) is the standard error of the estimated β1. The t test results are given in Table 4.

Table 4.

Term      | Coefficient | Standard Error | T Value | P Value
Intercept | -2135       | 588.7624       | -3.6263 | 0.0361
X1        | 7           | 0.8248         | 8.487   | 0.0034
X2        | 18          | 5.7735         | 3.1177  | 0.0526

Table 3 and Table 4 give the same P values.
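The equivalence of the two tests is easy to check numerically: for a single coefficient, the square of the t statistic equals the F ratio of that variable. A Python sketch (our addition), taking the X1 coefficient as 7, the product of its standard error (0.8248) and T value (8.487):

```python
# X1 row of Table 4: estimated coefficient and its standard error.
beta1_hat, se_beta1 = 7.0, 0.8248

t0 = beta1_hat / se_beta1     # equation (11)
print(round(t0, 3))           # the T Value reported for X1

# t0 squared equals the F ratio of X1 in the ANOVA table (Table 3).
print(round(t0 ** 2, 2))
```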

With linear regression and ANOVA in mind, we can start

discussing DOE now.

3. TWO LEVEL FACTORIAL DESIGN

In order to study the effect of a factor, at least two different settings for this factor are required. This also can be explained from the viewpoint of linear regression: to fit a line, two points are the minimal requirement. Therefore, the engineer in Example 1, who tested all fifty units at a single voltage of 25, cannot predict the life at the usage level of 10 volts. With only one voltage value, the effect of voltage cannot be evaluated.

3.1 Two Level Full Factorial Design

When the number of factors is small and you have the

resources, a full factorial design should be conducted. Here

we will use a simple example to introduce some basic

concepts in DOE.

For an experiment with two factors, the factors usually are

called A and B. Uppercase letters are used for factors. The

first level or the low level is represented by -1, while the

second level or the high level is represented by 1. There are

four combinations of a 2 level 2 factorial design. Each

combination is called a treatment. Treatments are represented

by lowercase letters. The number of test units for each

treatment is called the number of replicates. For example, if

you test two samples at each treatment, the number of

replicates is 2. Since the number of replicates for each factor

combination is the same, this design is also balanced. A two level factorial design with k factors usually is written as a 2^k design; a design with 3 factors, for example, is read as a "2 to the power of 3" or "2 to the 3" design. For a 2^2 design, the design matrix is:

Treatment | A  | B  | Response
(1)       | -1 | -1 | 20
a         | 1  | -1 | 30
b         | -1 | 1  | 25
ab        | 1  | 1  | 35

This design is orthogonal because the sum of the products of the A and B columns is zero: (-1)(-1) + (1)(-1) + (-1)(1) + (1)(1) = 0. An orthogonal design will reduce the estimation uncertainty of the model coefficients.

The following linear regression model is used for the

analysis:

Y = β0 + β1X1 + β2X2 + β12X1X2 + ε    (12)

Where: X 1 is for factor A; X 2 is for factor B; and their

interaction is represented by X 1 X 2 . The effects of A and B

are called main effects. The effects of their interaction are

called two-way interaction effects. These three effects are the

three sources for the variation of Y. Since equation (12) is a

linear regression model, the ANOVA method and the t-test

given in Section 2 can be used to test whether or not one effect

is significant.

For a balanced design, a simple way to calculate the effect

of a factor is to calculate the difference of the mean values of

the response at its high and low setting. For example, the

effect of A can be calculated by:

Effect of A = (30 + 35)/2 - (20 + 25)/2 = 10    (13)
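For a balanced design this calculation is mechanical, so it is easy to script. A Python sketch (our addition), using the treatment responses of the 2^2 example above (20, 25, 30, 35):

```python
import numpy as np

# Treatments of the 2^2 design as (A, B) -> response.
response = {(-1, -1): 20, (1, -1): 30, (-1, 1): 25, (1, 1): 35}

def effect(index):
    """Mean response at the high level minus mean at the low level."""
    high = [y for levels, y in response.items() if levels[index] == 1]
    low = [y for levels, y in response.items() if levels[index] == -1]
    return np.mean(high) - np.mean(low)

print(effect(0))  # effect of A: (30 + 35)/2 - (20 + 25)/2 = 10
print(effect(1))  # effect of B, computed the same way
```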

When you increase the number of factors, the number of

test units will increase quickly. For example, to study 7

factors, 128 units are needed. In reality, responses are affected

by a small number of main effects and lower order

interactions. Higher order interactions are relatively

unimportant. This statement is called the sparsity of effects

principle. According to this principle, fractional factorial

designs are developed. These designs use fewer samples to

estimate main effects and lower order interactions, while the

higher order interactions are considered to have negligible

effects.

Consider a 2^3 design: 8 test units are required for a full factorial design. Assume only 4 test units are available because of the cost of the test. Which 4 of the 8 treatments should you choose? A full design matrix with all the effects for a 2^3 design is:

Order | A  | B  | AB | C  | AC | BC | ABC
1     | -1 | -1 | 1  | -1 | 1  | 1  | -1
2     | 1  | -1 | -1 | -1 | -1 | 1  | 1
3     | -1 | 1  | -1 | -1 | 1  | -1 | 1
4     | 1  | 1  | 1  | -1 | -1 | -1 | -1
5     | -1 | -1 | 1  | 1  | -1 | -1 | 1
6     | 1  | -1 | -1 | 1  | 1  | -1 | -1
7     | -1 | 1  | -1 | 1  | -1 | 1  | -1
8     | 1  | 1  | 1  | 1  | 1  | 1  | 1
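Such a matrix can be generated mechanically: the main-effect columns enumerate all level combinations, and each interaction column is the elementwise product of its parent columns. A Python sketch (our addition):

```python
from itertools import product

# Full 2^3 design in standard order; interaction columns are products
# of the corresponding main-effect columns.
rows = []
for c, b, a in product([-1, 1], repeat=3):   # A varies fastest
    rows.append({"A": a, "B": b, "AB": a * b, "C": c,
                 "AC": a * c, "BC": b * c, "ABC": a * b * c})

for r in rows:
    print(r)
```

Any two distinct columns are orthogonal: the sum of their elementwise products is zero.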

If the effect of ABC can be ignored, the following 4

treatments can be used in the experiment.

Table 7.

Order | A  | B  | AB | C  | AC | BC | ABC
2     | 1  | -1 | -1 | -1 | -1 | 1  | 1
3     | -1 | 1  | -1 | -1 | 1  | -1 | 1
5     | -1 | -1 | 1  | 1  | -1 | -1 | 1
8     | 1  | 1  | 1  | 1  | 1  | 1  | 1

In Table 7, the effect of ABC cannot be estimated from the experiment because it is always at the same level of 1. Since Table 7 uses only half of the treatments of the full factorial design, it is called a half fractional factorial design, or a 2^(3-1) design, read as a "2 to the power of 3 minus 1" or "2 to the 3 minus 1" design.

From Table 7, you will also notice that some columns

have the same values. For example, column AB and C are the

same. Using equation (13) to calculate the effect of AB and C, we will end up with the same procedure and result. Therefore, from this experiment, the effects of AB and C cannot be separated. In DOE, effects that cannot be separated from an experiment are called confounded effects or aliased effects. A list of aliased effects is called the alias structure. For the design of Table 7, the alias structure is:

[I]=I+ABC; [A]=A+BC; [B]=B+AC; [C]=C+AB

Where: I is the effect of the intercept in the regression model,

which represents the mean value of the response. The alias for

I is called the defining relation. For this example, the defining

relation is written as I = ABC. In a design, I may be aliased

with several effects. The order of the shortest effect that is aliased with I is the resolution of the design.

From the alias structure, we can see that main effects are

confounded with 2-way interactions. For example, the

estimated effect for C in fact is the combination effect of C

and AB.

Checking Table 7, we can see it is a full factorial design if

we have only factor A and B. Therefore, A and B usually are

called basic factors and the full factorial design for them is

called the basic design. A fractional factorial design is

generated from its basic design and basic factors. For this

example, the values for factor C are generated from the values

of the basic factors A and B using the relation C=AB. Usually

AB is called the factor generator for C.
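Generating the fraction from the basic design is equally simple: enumerate the full factorial in A and B, then compute C from the generator. A Python sketch (our addition):

```python
from itertools import product

# 2^(3-1) design: basic 2^2 design in A and B, generator C = AB.
runs = [{"A": a, "B": b, "C": a * b}
        for b, a in product([-1, 1], repeat=2)]

for r in runs:
    # With C = AB, the product ABC is +1 on every run: I = ABC.
    print(r, "ABC =", r["A"] * r["B"] * r["C"])
```

Because C = AB, the product ABC is +1 on every run, which is exactly the defining relation I = ABC and the source of the aliasing described above.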

By now, it should be clear that the design given in Table 1

at the beginning of this tutorial is not a good design. If you

check the design in terms of coded value, the answer is

obvious. Table 8 shows the design again.

Number of Units | Temperature (Celsius) | Humidity (%)
25              | 120 (1)               | 95 (1)
25              | 85 (-1)               | 85 (-1)

Table 9. The five factors and their levels.

Factor | Name             | Unit    | Low (-1) | High (1)
A      | Aperture Setting | -       | small    | large
B      | Exposure Time    | minutes | 20       | 40
C      | Develop Time     | seconds | 30       | 45
D      | Mask Dimension   | -       | small    | large
E      | Etch Time        | minutes | 14.5     | 15.5

With five factors, the total number of runs required for a full factorial is 2^5 = 32. Running all 32 combinations is too expensive for the manufacturer. At the initial investigation, only main effects and two-factor interactions are of interest, while higher order interactions are considered to be unimportant. It is decided to carry out the investigation using a 2^(5-1) design, which requires 16 runs. The defining relation is I = ABCDE; in other words, the generator for factor E is E = ABCD. Table 10 gives the experiment data.

Run Order | A: Aperture | B: Exposure Time | C: Develop Time | D: Mask Dimension | E: Etch Time | Yield
1         | Large       | 20               | 30              | Large             | 15.5         | 10
2         | Large       | 20               | 45              | Large             | 14.5         | 21
3         | Small       | 40               | 45              | Small             | 15.5         | 45
4         | Small       | 20               | 45              | Small             | 14.5         | 16
5         | Large       | 40               | 30              | Small             | 15.5         | 52
6         | Large       | 40               | 45              | Small             | 14.5         | 60
7         | Small       | 40               | 30              | Large             | 15.5         | 30
8         | Small       | 20               | 45              | Large             | 15.5         | 15
9         | Large       | 20               | 30              | Small             | 14.5         | 10
10        | Small       | 40               | 30              | Small             | 14.5         | 34
11        | Small       | 20               | 30              | Small             | 15.5         | 12
12        | Large       | 40               | 30              | Large             | 14.5         | 50
13        | Small       | 40               | 45              | Large             | 14.5         | 44
14        | Small       | 20               | 30              | Large             | 14.5         | 15
15        | Large       | 20               | 45              | Small             | 15.5         | 22
16        | Large       | 40               | 45              | Large             | 15.5         | 63


In the design of Table 8, temperature and humidity are confounded. In

fact, to study the effect of two factors, at least three different

settings are required. From the linear regression point of view,

at least three unique settings are needed to solve three

parameters: the intercept, the effect of factor A and the effect

of factor B. If their interaction is also to be estimated, four

different settings should be used. Many DOE software

packages can generate a design matrix for you according to

the number of factors and the level of factors. It will help you

avoid bad designs such as the one given in Table 8.

Assume an engineer wants to identify the factors that affect the yield of a manufacturing process for integrated circuits. By following the DOE guidelines, five factors are brought up and a two level fractional factorial design is decided to be used [1]. The five factors and their levels are given in Table 9.

Since the design has only 16 unique factor combinations, it can be used to estimate only 16 parameters in the linear regression model. If we include all main effects and 2-way interactions in the model, we get the ANOVA results in Table 11.


Table 11.

Source of Variation | DF | Sum of Squares | Mean Squares
Model               | 15 | 5775.4375      | 385.0292
A                   | 1  | 495.0625       | 495.0625
B                   | 1  | 4590.0625      | 4590.0625
C                   | 1  | 473.0625       | 473.0625
D                   | 1  | 3.0625         | 3.0625
E                   | 1  | 1.5625         | 1.5625
AB                  | 1  | 189.0625       | 189.0625
AC                  | 1  | 0.5625         | 0.5625
AD                  | 1  | 5.0625         | 5.0625
AE                  | 1  | 5.0625         | 5.0625
BC                  | 1  | 1.5625         | 1.5625
BD                  | 1  | 0.0625         | 0.0625
BE                  | 1  | 0.0625         | 0.0625
CD                  | 1  | 3.0625         | 3.0625
CE                  | 1  | 0.5625         | 0.5625
DE                  | 1  | 7.5625         | 7.5625
Residual            | 0  | 0              | -
Total               | 15 | 5775.4375      | -

The F ratios and P values cannot be computed because there are no replicates in this experiment when all the effects are considered. Therefore, there is no way to estimate the error term in the regression model. This is why the SSE (Sum of Squares of Error), labeled as Residual in Table 11, is 0. Without SSE, the estimate of the random error, how can we test whether or not an effect is significant compared to random error? Don't panic. Statisticians have already developed methods to deal with this situation. When there is no estimate of error in a screening experiment, Lenth's method can be used to identify significant effects. Lenth's method assumes that all the effects are normally distributed with a mean of 0, under the hypothesis that they are not significant. If any effects are significantly different from 0, they should be considered significant. So we can check the normal probability plot of the effects.
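Lenth's method is easy to implement: the pseudo standard error (PSE) is 1.5 times the median of the absolute effects, after discarding effects larger than 2.5 times a preliminary estimate. The sketch below (our Python addition, not part of the tutorial) reproduces the PSE shown in Figure 1; the effect values are recovered from the sums of squares in Table 11 via effect = sqrt(SS/4), which holds for a 16-run two level design.

```python
import statistics

def lenth_pse(effects):
    """Lenth's pseudo standard error for an unreplicated design."""
    abs_e = [abs(e) for e in effects]
    s0 = 1.5 * statistics.median(abs_e)
    trimmed = [e for e in abs_e if e < 2.5 * s0]
    return 1.5 * statistics.median(trimmed)

# Effects recovered from the sums of squares in Table 11
# (effect = sqrt(SS / 4) for this 16-run design).
effects = [11.125, 33.875, 10.875, 0.875, 0.625,   # A, B, C, D, E
           6.875, 0.375, 1.125, 1.125, 0.625,      # AB, AC, AD, AE, BC
           0.125, 0.125, 0.875, 0.375, 1.375]      # BD, BE, CD, CE, DE
print(lenth_pse(effects))  # 0.9375, matching Figure 1
```

Effects that are large relative to the PSE (here A = 11.125, B = 33.875, C = 10.875 and AB = 6.875) are the ones flagged as significant in Figure 1.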

The reduced-model ANOVA below confirms that these effects are indeed significant because their P values are close to 0. Once the important factors have been identified, follow-up experiments can be conducted to optimize the process. Response Surface Methods are developed for this purpose.

[Figure 1. Normal probability plot of the effects (Alpha = 0.1; Lenth's PSE = 0.9375). The effects A: Aperture Setting, B: Exposure Time, C: Develop Time and the interaction AB fall off the distribution line and are flagged as significant; the remaining effects are non-significant.]

From Figure 1, the main effects A, B and C and the 2-way interaction AB are identified as significant at a significance

level of 0.1. Since the rest of the effects are not significant,

they can be treated as noise and used to estimate the sum of

squares of error. In DOE, it is a common practice to pool nonsignificant effects into error. With only A, B, C and AB in the

model, we get the following ANOVA table.

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio  | P Value
Model               | 4  | 5747.25        | 1436.8125    | 560.7073 | 1.25E-12
A                   | 1  | 495.0625       | 495.0625     | 193.1951 | 2.53E-08
B                   | 1  | 4590.0625      | 4590.0625    | 1791.244 | 1.56E-13
C                   | 1  | 473.0625       | 473.0625     | 184.6098 | 3.21E-08
AB                  | 1  | 189.0625       | 189.0625     | 73.7805  | 3.30E-06
Residual            | 11 | 28.1875        | 2.5625       |          |
Lack of Fit         | 3  | 9.6875         | 3.2292       | 1.3964   | 0.3128
Pure Error          | 8  | 18.5           | 2.3125       |          |
Total               | 15 | 5775.4375      |              |          |

4. RESPONSE SURFACE METHODS (RSM)

Response Surface Methods are used to estimate transfer functions at the optimal region. The estimated function is then used to optimize the responses. The quadratic model is the model used in RSM. Similar to the factorial design, linear regression and ANOVA are the tools for data analysis in RSM. Let's use a simple example to illustrate this type of design.

4.1 Initial Investigation

Assume the yield from a chemical process has been found

to be affected by two factors [1]:

Reaction Temperature

Reaction Time

The current operating conditions of 230 Fahrenheit and 65

minutes give a yield of about 35%. The engineers decide to

explore the region around the current conditions, in the ranges [L=225, H=235]

Fahrenheit and [L=55, H=75] minutes to see how the

temperature and time affect the yield. The design matrix is:

Table 13.

Std. Order | Point Type | A: Temperature | B: Reaction Time | Yield (%)
1          | 1          | -1             | -1               | 33.95
2          | 1          | 1              | -1               | 36.35
3          | 1          | -1             | 1                | 35.00
4          | 1          | 1              | 1                | 37.25
5          | 0          | 0              | 0                | 35.45
6          | 0          | 0              | 0                | 35.75
7          | 0          | 0              | 0                | 36.05
8          | 0          | 0              | 0                | 35.30
9          | 0          | 0              | 0                | 35.90

Table 13 is the design matrix in coded values, where -1 is the lower level and 1 is the higher level. For a given actual value for a numerical factor, its corresponding coded value can be calculated by:

Coded value = (Actual value - Middle value of the range) / (Half of the range between high and low)

In Table 13, there are five replicated runs at the level of (0, 0). These runs are called center points. Center points have the following two uses:

- To estimate random error.
- To check whether or not the curvature is significant.

At least five center points are suggested in an experiment.
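The coded transform can be wrapped in a small helper function. A Python sketch (our addition), using the factor ranges of this example:

```python
def to_coded(actual, low, high):
    """Map an actual factor value to the coded -1..1 scale."""
    mid = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (actual - mid) / half_range

# Temperature range [225, 235] F; reaction time range [55, 75] minutes.
print(to_coded(230, 225, 235))  # 0.0 (center point)
print(to_coded(235, 225, 235))  # 1.0 (high level)
print(to_coded(55, 55, 75))     # -1.0 (low level)
```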

At least five center points are suggested in an experiment.

When we analyze the data in Table 13, we get the ANOVA table shown in Table 14. From Table 14, we know curvature is not significant. Therefore, the linear model is sufficient in the current experiment space. The linear model that includes only significant effects is:

Y = 35.6375 + 1.1625 X1 + 0.4875 X2    (14)

Equation (14) is in terms of coded values. We can see both factors have a positive effect on the yield, since their coefficients are positive. To improve the yield, we should explore the region with factor values larger than the current operating condition. There are many directions in which we can move to increase the yield, but which is the fastest route to the optimal region? By checking the coefficient of each factor in equation (14), we know the direction should be (1.1625, 0.4875). This can also be seen from the contour plot in Figure 2.


Table 14 - ANOVA Table for the Initial Experiment

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model | 4 | 6.368 | 1.592 | 16.4548 | 0.0095
A | 1 | 5.4056 | 5.4056 | 55.8721 | 0.0017
B | 1 | 0.9506 | 0.9506 | 9.8256 | 0.035
AB | 1 | 0.0056 | 0.0056 | 0.0581 | 0.8213
Curvature | 1 | 0.0061 | 0.0061 | 0.0633 | 0.8137
Residual | 4 | 0.387 | 0.0967 | |
Total | 8 | 6.755 | | |


From Figure 2, we know the fastest way to increase yield is to move along the direction perpendicular to the contour lines. This direction is (1.1625, 0.4875), or about (2.4, 1) on a normalized scale. Therefore, if 1 unit of X2 is increased, about 2.4 units of X1 should be used in order to keep moving in the steepest ascent direction. To convert the coded values to actual values, we use a step size of (12 degrees, 10 minutes) for factors A and B. Table 15 gives the results for the experiments conducted along the path of steepest ascent.
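The path itself is mechanical to generate once the direction and step size are fixed. A sketch, assuming the coded step of (2.4, 1) per move and the half-ranges from the initial design (5 degrees, 10 minutes):

```python
# Generate factor settings along the path of steepest ascent.
# Direction: coded (1.1625, 0.4875), rescaled so X2 moves 1 coded unit
# per step; converted to actual units via the half-ranges.
b1, b2 = 1.1625, 0.4875
step_coded = (b1 / b2, 1.0)                 # about (2.4, 1)
half_range = (5.0, 10.0)                    # A: [225, 235], B: [55, 75]
base = (230.0, 65.0)                        # current operating condition

def setting(step):
    # round the temperature increment to the 12-degree step used in the text
    return (base[0] + step * round(step_coded[0] * half_range[0]),
            base[1] + step * step_coded[1] * half_range[1])

# setting(10) -> (350.0, 165.0), the region singled out in Table 15
```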

From Table 15, it can be seen that at step 10 the factor setting is close to the optimal region, because the yield decreases on either side of this step. The region around the setting of (350, 165) requires further investigation. Therefore, the analysts will conduct a 2^2 factorial design with the center point of (350, 165) and the range of [L=345, H=355] for factor A (temperature) and [L=155, H=175] for factor B (reaction time). The design matrix is given in Table 16.

Table 16 - Factorial Design at the Optimal Region

Std. Order | Point Type | A: Temperature (F) | B: Reaction Time (min) | Yield (%)
1 | 1 | 345 | 155 | 89.75
2 | 1 | 355 | 155 | 90.2
3 | 1 | 345 | 175 | 92
4 | 1 | 355 | 175 | 94.25
5 | 0 | 350 | 165 | 94.85
6 | 0 | 350 | 165 | 95.45
7 | 0 | 350 | 165 | 95
8 | 0 | 350 | 165 | 94.55
9 | 0 | 350 | 165 | 94.7

The ANOVA table for this data is given in Table 17.

Figure 2 - Contour Plot of the Initial Experiment

Table 15 - Results Along the Path of Steepest Ascent

Step | Coded A | Coded B | Actual A: Temperature (F) | Actual B: Time (min) | Yield (%)
Current Operation | 0 | 0 | 230 | 65 | 35
1 | 2.4 | 1 | 242 | 75 | 36.5
2 | 4.8 | 2 | 254 | 85 | 39.35
3 | 7.2 | 3 | 266 | 95 | 45.65
4 | 9.6 | 4 | 278 | 105 | 49.55
5 | 12 | 5 | 290 | 115 | 55.7
6 | 14.4 | 6 | 302 | 125 | 64.25
7 | 16.8 | 7 | 314 | 135 | 72.5
8 | 19.2 | 8 | 326 | 145 | 80.6
9 | 21.6 | 9 | 338 | 155 | 91.4
10 | 24 | 10 | 350 | 165 | 95.45
11 | 26.4 | 11 | 362 | 175 | 89.3
12 | 28.8 | 12 | 374 | 185 | 87.65

Table 17 - ANOVA Table for the Factorial Design at the Optimal Region

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model | 4 | 37.643 | 9.4107 | 78.916 | 5E-04
A | 1 | 1.8225 | 1.8225 | 15.283 | 0.017
B | 1 | 9.9225 | 9.9225 | 83.208 | 8E-04
AB | 1 | 0.81 | 0.81 | 6.7925 | 0.06
Curvature | 1 | 25.088 | 25.088 | 210.38 | 1E-04
Residual | 4 | 0.477 | 0.1193 | |
Total | 8 | 38.12 | | |

Table 17 shows that curvature is significant in this experiment region. Therefore, the linear model is not sufficient to describe the relationship between the factors and the response; a quadratic model should be used instead. An experiment design suitable for estimating a quadratic model should be used for further investigation. The Central Composite Design (CCD) is one such design.

4.2 Optimization Using RSM

Table 16 is the 2-level factorial design at the optimal region. A CCD is built based on this factorial design and is used to estimate the parameters of a quadratic model such as:

Y = β0 + β1 X1 + β2 X2 + β11 X1² + β22 X2² + β12 X1 X2 + ε    (15)

In fact, a CCD can be directly augmented from a regular factorial design. The augmentation process is illustrated in Figure 3.

Figure 3 - Augmenting a factorial design to a CCD: the corner points (-1,-1), (1,-1), (-1,1), (1,1) and the center point (0, 0) are augmented with the axial points (-α, 0), (α, 0), (0, -α) and (0, α)

The commonly used method for calculating the axial distance α is:

α = [ 2^(k-f) · (nf / ns) ]^(1/4)

where nf is the number of replicates of the runs in the original factorial design, ns is the number of replicates of the runs at the axial points, and 2^(k-f) represents the original factorial or fractional factorial design.
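As a quick check of the formula, the sketch below computes α for this example (k = 2, f = 0, single replicates, so α = 1.414) and places the axial points in actual units; they land where Table 18 puts them:

```python
# Axial distance for a CCD, per the formula above.
def ccd_alpha(k, f=0, nf=1, ns=1):
    return (2 ** (k - f) * nf / ns) ** 0.25

alpha = ccd_alpha(k=2)                      # 1.414 for a 2^2 base design

# Axial (star) points in actual units, centered at (350 F, 165 min)
center = (350.0, 165.0)
half_range = (5.0, 10.0)
axial = [(center[0] + s * alpha * half_range[0], center[1]) for s in (-1, 1)]
axial += [(center[0], center[1] + s * alpha * half_range[1]) for s in (-1, 1)]
# -> (342.93, 165), (357.07, 165), (350, 150.86), (350, 179.14)
```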

Table 18 - Design Matrix for the CCD

Std. Order | Point Type | A: Temperature (F) | B: Reaction Time (min) | Yield (%)
1 | 1 | 345 | 155 | 89.75
2 | 1 | 355 | 155 | 90.2
3 | 1 | 345 | 175 | 92
4 | 1 | 355 | 175 | 94.25
5 | 0 | 350 | 165 | 94.85
6 | 0 | 350 | 165 | 95.45
7 | 0 | 350 | 165 | 95
8 | 0 | 350 | 165 | 94.55
9 | 0 | 350 | 165 | 94.7
10 | -1 | 342.93 | 165 | 90.5
11 | -1 | 357.07 | 165 | 92.75
12 | -1 | 350 | 150.86 | 88.4
13 | -1 | 350 | 179.14 | 92.6

Table 19 - ANOVA Table for the Quadratic Model

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model | 5 | 65.0867 | 13.0173 | 91.3426 | 3.22E-06
A | 1 | 4.3247 | 4.3247 | 30.3465 | 0.0009
B | 1 | 18.7263 | 18.7263 | 131.4022 | 8.64E-06
AB | 1 | 0.81 | 0.81 | 5.6838 | 0.0486
AA | 1 | 16.0856 | 16.0856 | 112.8724 | 1.43E-05
BB | 1 | 30.1872 | 30.1872 | 211.8234 | 1.73E-06
Residual | 7 | 0.9976 | 0.1425 | |
Lack of Fit | 3 | 0.5206 | 0.1735 | 1.4551 | 0.3527
Pure Error | 4 | 0.477 | 0.1193 | |
Total | 12 | 66.0842 | | |

The points added on the two axes are the axial points, or star points. By adding several center points and axial points, a regular factorial design is augmented to a CCD. In Figure 3, we can see there are five different values for each factor, so a CCD can be used to estimate the quadratic model in equation (15). Several methods have been developed for calculating α so that the designed experiment can better estimate the model parameters or better explore the optimal region.

For this example, α = 1.414. Since we already have five center points in the factorial design, we only need to add the four axial (star) points to obtain a CCD. The complete design matrix for the CCD is shown in Table 18; the last four runs in that table are the points added to the previous factorial design. The fitted quadratic model is:

Y = 94.91 + 0.74 X1 + 1.53 X2 - 1.52 X1² - 2.08 X2² + 0.45 X1 X2    (16)

The ANOVA table for the quadratic model is given in Table 19.

As mentioned before, the model estimated from the CCD is used to optimize the process, so the accuracy of the model is very important. From the Lack of Fit test in Table 19, we see the P value is relatively large, which means the model fits the data well. The Lack of Fit sum of squares estimates the variation of the terms that are not included in the model. If its amount is close to the pure error, which is the within-run variation, it can be treated as part of the noise. Another way to check model accuracy is to examine the residual plots. Through the above diagnostics, we found the model is adequate, and we can use it to identify the optimal factor settings. The optimal settings can be found easily by taking the derivative of equation (16) with respect to each factor and setting it to 0. Many software packages can do this optimization. The following results are from DOE++ from ReliaSoft [3].
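Setting both partial derivatives of the fitted model to zero gives a 2x2 linear system that can be solved by hand (Cramer's rule). The sketch below reproduces the DOE++ solution:

```python
# Stationary point of Y = b0 + b1*X1 + b2*X2 + b11*X1^2 + b22*X2^2 + b12*X1*X2
b0, b1, b2 = 94.91, 0.74, 1.53
b11, b22, b12 = -1.52, -2.08, 0.45

# dY/dX1 = b1 + 2*b11*X1 + b12*X2 = 0
# dY/dX2 = b2 + b12*X1 + 2*b22*X2 = 0   -> solve by Cramer's rule
det = (2 * b11) * (2 * b22) - b12 * b12
x1 = (-b1 * 2 * b22 - (-b2) * b12) / det
x2 = ((2 * b11) * -b2 - b12 * -b1) / det

y = b0 + b1 * x1 + b2 * x2 + b11 * x1**2 + b22 * x2**2 + b12 * x1 * x2

temp_f = 350 + 5 * x1        # about 351.5 F
time_min = 165 + 10 * x2     # about 169.0 min
```

The converted actual settings and the predicted yield of about 95.33% agree with the DOE++ output shown in Figure 4.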

Figure 4 - Factor vs. Response plots for Optimal Solution 1 of the Central Composite Design (A: Temperature X = 351.5045; B: Reaction Time X = 168.9973; Yield maximized at Y = 95.3264)

Figure 4 shows how the response changes with each factor. The red dashed lines point to the optimal value of each factor. We can see that the optimal settings are 351.5 degrees Fahrenheit for temperature and 169 minutes for reaction time. At this setting, the predicted yield is 95.3%, which is much better than the yield at the current condition (about 35%).

5. DOE FOR LIFE TESTS

When DOE is used for life testing, the response is the life or failure time. However, because of cost, time or other constraints, you may not have observed values of life for some test units: they are still functioning when the test ends. The end time of the test is called the suspension time for the units that have not failed. Obviously, this time is not their life. Should the suspension time be treated as the life time in order to analyze the data? In this section, we will discuss the correct statistical method for analyzing data from life testing. First, let's explain some basic concepts in life data analysis.

5.1 Data Type for Life Test

When the response is life, there are two types of data:

- Complete Data
- Censored Data
  o Right Censored (Suspended)
  o Interval Censored

If a test unit fails during the test and the exact failure time is known, the failure time is called complete data. If a test unit fails and you do not know the exact failure time, but instead know that the failure occurred within a time range, this time range is called interval data. If a unit does not fail in the test, the end time of the test for that unit is called right censored data, or suspension data.

Obviously, ignoring the censored observations or treating them as failure times will underestimate the system reliability. However, linear regression and ANOVA require an exact value for each observation. Therefore, engineers have to tweak the censored data in order to use linear regression and ANOVA. A simple way to tweak the data is to use the center point of each interval as the failure time and to treat the suspended units as failed.

Even with this modification of the original data, another issue remains. When linear regression and ANOVA are used, the response is assumed to be normally distributed; the F and T tests are established based on this normality assumption. However, lifetime usually is not normally distributed. Given the above reasons, correct analysis methods for data from life testing are needed.

5.2 Maximum Likelihood Estimation and Likelihood Ratio Test [2]

Maximum Likelihood Estimation (MLE) estimates the model parameters that maximize the probability of occurrence of an observed data set. It has been used successfully to handle different data types, such as complete data, suspensions and interval data. Therefore, we will use MLE to estimate the model parameters for life data from DOE.

Many distributions are used to describe lifetimes. The three most commonly used are [4]:

Weibull distribution, with probability density function (pdf):

f(t) = (β/η) (t/η)^(β-1) exp[-(t/η)^β]    (17)

Lognormal distribution, with pdf:

f(t) = (1/(σ t √(2π))) exp[-(ln(t) - μ)² / (2σ²)]    (18)

Exponential distribution, with pdf:

f(t) = (1/m) exp(-t/m)    (19)
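The three pdfs translate directly into code. A small sketch (with β = 1 the Weibull reduces to the exponential with m = η, a handy sanity check):

```python
import math

def weibull_pdf(t, beta, eta):
    """Weibull pdf, equation (17)."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def lognormal_pdf(t, mu, sigma):
    """Lognormal pdf, equation (18); mu and sigma are for ln(t)."""
    return (1 / (sigma * t * math.sqrt(2 * math.pi))) * \
        math.exp(-((math.log(t) - mu) ** 2) / (2 * sigma ** 2))

def exponential_pdf(t, m):
    """Exponential pdf, equation (19); m is the mean life."""
    return (1 / m) * math.exp(-t / m)
```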

Assume there is only one factor (in the language of DOE), or stress (in the language of accelerated life testing), that affects the lifetime of the product. The life distribution and factor relationship can be described using Figure 5. Figure 5 shows that life decreases when the factor is changed from the low level to the high level. The pdf curves have the same shape; only the scale of the curve changes, and the scale of the pdf is compressed at the high level. This means the failure mode remains the same, and only the time of occurrence decreases at the high level. Instead of considering the entire scale of the pdf, a life characteristic can be chosen to represent the curve and used to investigate the effect of the potential factors on life. The life characteristics for the three commonly used distributions are:

- Weibull distribution: η
- Lognormal distribution: μ
- Exponential distribution: m

The life-factor relationship is studied to see how the factors affect the life characteristic. For example, a linear model can be used as the initial investigation of the relationship:

θ' = β0 + β1 X1 + β2 X2 + ... + β12 X1 X2 + ...    (20)

where θ' = ln(η) for Weibull, θ' = μ for lognormal, and θ' = ln(m) for exponential. Please note that in equation (20) a logarithmic transformation is applied to the life characteristics of the Weibull and exponential distributions. One of the reasons is that η and m can take only positive values.

To test whether or not an effect in equation (20) is significant, the likelihood ratio test is used:

LR(effect k) = -2 ln[ L(effect k removed) / L(full model) ]    (21)

where L(·) is the likelihood value. LR approximately follows a chi-squared distribution if effect k is not significant.
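For a single effect (one degree of freedom), the chi-squared tail probability can be written with the complementary error function, so the whole test fits in a few lines. A sketch using the log-likelihood values reported for factor A in Table 21:

```python
import math

def lr_statistic(loglik_reduced, loglik_full):
    """LR = -2 ln( L(reduced) / L(full) ), per equation (21)."""
    return -2.0 * (loglik_reduced - loglik_full)

def chi2_sf_1df(x):
    """P(X > x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

lr = lr_statistic(-20.7181, -19.1475)   # factor A removed vs. full model
p = chi2_sf_1df(lr)                      # about 0.076, matching Table 21
```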

If effect k is not significant, removing it from the full model of equation (20) will not change the likelihood value much, so the value of LR will be close to 0. Conversely, a very large LR value means effect k is significant.

5.3 Life Test Example

Consider an experiment to improve the reliability of fluorescent lights [2]. Five factors, A-E, are investigated in the experiment. A 2^(5-2) design with factor generators D = AC and E = BC is conducted. The objective is to identify the significant effects that affect the reliability. Two replicates are used for each treatment. The test ends at the 20th day, and inspections are conducted every two days. The experiment results are given in Table 20.

In the failure time columns of Table 20, 20+ means that the test unit was still working at the end of the test, so this experiment has suspension data. (14, 16) means that the failure occurred between the 14th and the 16th day, so this experiment also has interval data. The Weibull model is used as the life distribution of the fluorescent lights. The likelihood ratio test results are given in Table 21.

Table 21 - Likelihood Ratio Test Results

Model | Effect | DF | Ln(LKV) | LR | P Value
Reduced | A | 1 | -20.7181 | 3.1411 | 0.0763
Reduced | B | 1 | -24.6436 | 10.9922 | 0.0009
Reduced | C | 1 | -19.2794 | 0.2638 | 0.6076
Reduced | D | 1 | -25.7594 | 13.2237 | 0.0003
Reduced | E | 1 | -21.0727 | 3.8504 | 0.0497
Full | | | -19.1475 | |

Table 21 has a layout similar to the ANOVA table, which makes it easy to read for engineers who are familiar with ANOVA. From the P value column, we can see that factors A, B, D and E are important to the product life. The estimated factor-life relationship is:

ln(η) = 2.9959 + 0.1052 A - 0.2256 B - 0.0294 C - 0.2477 D + 0.1166 E    (22)

The estimated shape parameter for the Weibull distribution is 7.27.
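Equation (22) turns coded factor settings directly into a predicted Weibull scale parameter. A sketch; the "best" settings below are only an illustration (high level for factors with positive coefficients, low level otherwise), not a recommendation from the source:

```python
import math

coef = {"Intercept": 2.9959, "A": 0.1052, "B": -0.2256,
        "C": -0.0294, "D": -0.2477, "E": 0.1166}

def predicted_eta(x):
    """x maps factor name -> coded level; returns eta per equation (22)."""
    ln_eta = coef["Intercept"] + sum(coef[k] * v for k, v in x.items())
    return math.exp(ln_eta)

# Pick the level with the favorable sign for each factor (illustrative only)
best = {k: (1 if c > 0 else -1) for k, c in coef.items() if k != "Intercept"}
eta_best = predicted_eta(best)
eta_center = predicted_eta({k: 0 for k in best})   # exp(2.9959), about 20
```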

For comparison, the data was also analyzed using traditional linear regression and the ANOVA method. To apply linear regression and ANOVA, the data set was modified by using the center points of the intervals as the failure times and treating the suspensions as failures. Results are given in Table 22.

Table 22 - ANOVA Results Using the Modified Data

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model | 5 | 143.3125 | 28.6625 | 4.2384 | 0.025
A | 1 | 1.5625 | 1.5625 | 0.2311 | 0.6411
B | 1 | 33.0625 | 33.0625 | 4.8891 | 0.0515
C | 1 | 3.0625 | 3.0625 | 0.4529 | 0.5162
D | 1 | 95.0625 | 95.0625 | 14.0573 | 0.0038
E | 1 | 10.5625 | 10.5625 | 1.5619 | 0.2398
Residual | 10 | 67.625 | 6.7625 | |
Total | 15 | 210.9375 | | |

Table 20 - Design matrix (in coded values) and failure times for the fluorescent light experiment. Each of the eight treatments has two replicate observations, recorded either as intervals, such as (8, 10), (12, 14) or (18, 20), or as suspensions, denoted 20+.

At the 0.1 significance level, only factors B and D are significant in Table 22. The estimated linear regression model is:

Y = 16.9375 + 0.3125 A - 1.4375 B + 0.4375 C - 2.4375 D + 0.8125 E    (23)

Comparing the results in Tables 21 and 22, we can see that they are quite different. The linear regression and ANOVA method failed to identify A and E as important factors.


6. CONCLUSION

This tutorial briefly explained the basic concepts in DOE, and guidelines for conducting DOE were given. Three major topics were discussed in detail: 2-level factorial design, RSM and DOE for life tests. Linear regression and ANOVA are the important tools in DOE data analysis, so they were emphasized. For DOE involving censored data, the better approach of MLE and the likelihood ratio test should be used.

DOE involves many different statistical methods. Many useful techniques are not covered in this tutorial [1, 2, 5, 6], such as blocking and randomization, random and mixed effects models, model diagnostics, power and sample size, measurement system studies, RSM with multiple responses, D-optimal designs, Taguchi orthogonal arrays, Taguchi robust design and mixture designs. However, with the basic knowledge provided here, readers should be able to learn most of them easily.

REFERENCES

1. D. C. Montgomery, Design and Analysis of Experiments, 5th edition, John Wiley and Sons, Inc., New York, 2001.
2. C. F. Wu and M. Hamada, Experiments: Planning, Analysis, and Parameter Design Optimization, John Wiley and Sons, Inc., New York, 2000.
3. ReliaSoft, http://www.ReliaSoft.com/doe/index.htm
4. ReliaSoft, Life Data Analysis Reference, ReliaSoft Corporation, Tucson, 2007.
5. R. H. Myers and D. C. Montgomery, Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd edition, John Wiley and Sons, Inc., New York, 2002.
6. ReliaSoft, Experiment Design and Analysis Reference, ReliaSoft Corporation, Tucson, 2008.

Design of Experiments (DOE) and Data Analysis
Adamantios Mettas
January 25-28, 2010

Outline
- Introduction (Why DOE, What DOE Can Do, Common Design Types, General Guidelines for Conducting DOE)
- Statistical Background
- Two Level Factorial Design
- Response Surface Method
- Reliability DOE
- Summary

2010 RAMS Tutorial DOE Guo and Mettas

Why DOE
- Example 1: Test all the 50 units at V = 25v to predict the reliability at the usage voltage V = 10v.
- Example 2: Accelerated life testing to predict reliability at the usage level.

Example 2 test plan:

Num of Units | Temperature (Celsius) | Humidity (%)
25 | 120 | 95
25 | 85 | 85

Other common design types, for comparison:
- Plackett-Burman design
- Taguchi orthogonal array design
- Box-Behnken design

What DOE can do:
- Variable screening
- Transfer function exploration
- System optimization
- System robustness

Model: Y = β0 + β1 X1 + ... + βp Xp + ε

Assumptions: ε is assumed to be normally distributed with mean of 0 and variance of σ².

General guidelines for conducting DOE:
1. Choose responses
2. Choose factors and levels
3. Choose experimental design
4. Perform the experiment
5. Analyze the data
6. Draw conclusions and make recommendations

Under the model, Y is normally distributed and Var(Y) = σ².

Statistical Background

Under a linear regression model, ANOVA can tell you if there is a strong relationship between the independent variables, the Xs, and the response variable, Y. ANOVA can also test whether or not an individual X affects Y significantly.

Linear Regression

Regression analysis is a statistical technique that attempts to explore and model the relationship between two or more variables. A linear regression model attempts to explain the relationship between two or more variables using a straight line (illustrated on the slide with temperature anomaly in degrees F plotted against year, 1880-2000).

Within-run and Between-run Variations

Model: Y = β0 + β1 X1 + β2 X2 + ε
- Within-run: at the same values of X1 and X2, the observed Y values are different. This difference of Y is caused by noise.
- Between-run: at different values of X1 and X2, the observed mean values of Y are different. This difference of Y is caused by the different values of the Xs.

Source of Variation | Degrees of Freedom
Model | 2
X1 | 1
X2 | 1
Residual | 3
Total | 5

Source of Variation | DF | Sum of Squares | Mean Squares | F Ratio | P Value
Model | 2 | 6.14E+04 | 3.07E+04 | 36.86 | 0.0077
X1 | 1 | 6.00E+04 | 6.00E+04 | 72.03 | 0.0034
X2 | 1 | 8100 | 8100 | 9.72 | 0.0526
Residual | 3 | 2500 | 833.3333 | |
Total | 5 | 6.39E+04 | | |

P Value: the smaller the P value, the more significant the corresponding source. The P value is compared with a given significance level, alpha; if a P value is less than alpha, the corresponding source is significant at the significance level of alpha. The commonly used alpha values are 0.01, 0.05 and 0.1. Here, the Model, X1 and X2 are significant at the level of 0.1.

Sum of Squares: the total sum of squares splits into between-run and within-run variation:

SST = Σ_{i=1..n} (Yi - Ȳ)² = Σ_{i=1..n} (Ŷi - Ȳ)² + Σ_{i=1..n} (Yi - Ŷi)² = SSR + SSE

Mean Squares:

MST = SST / (n - 1) = (1/(n - 1)) Σ_{i=1..n} (Yi - Ȳ)²

which is also the unbiased estimate of the variance of Y;

MSR = SSR / p,  MSE = SSE / (n - 1 - p)

where n is the number of observations and p is the number of predictor variables. SSR can further be partitioned by variable:

SSR = SS_X1 + SS_X2 + ... + SS_Xp

If MS_Xi is not significantly larger than MSE, Xi's effect is close to the effect of noise, and Xi does not significantly affect Y.
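The identity SST = SSR + SSE holds exactly for a least-squares fit and is easy to verify numerically. A sketch with a small illustrative data set (the numbers are made up, not from the tutorial):

```python
# Verify SST = SSR + SSE for a simple least-squares fit y = b0 + b1*x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
     sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar
fitted = [b0 + b1 * x for x in xs]

sst = sum((y - ybar) ** 2 for y in ys)               # total variation
ssr = sum((f - ybar) ** 2 for f in fitted)           # model (between-run)
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # error (within-run)
# sst equals ssr + sse, up to floating point
```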

F test for an individual variable Xi:  F0 = MS_Xi / MSE
- H0: the variation caused by Xi is not larger than the variation caused by noise.
- H1: the variation caused by Xi is larger than the variation caused by noise.

F test for the overall model:  F0 = MSR / MSE
- H0: the variation caused by the Xs is not larger than the variation caused by noise.
- H1: the variation caused by the Xs is larger than the variation caused by noise.

The numerators MS_X1, MS_X2, ..., MS_Xp are each compared against MSE.

T-test: used to test whether or not a coefficient is 0. For example:

H0: β1 = 0;  H1: β1 ≠ 0

The test statistic is T0 = β̂1 / se(β̂1), which follows a t distribution with degrees of freedom (df) equal to the df of the error term.

T-test Results

Term | Coefficient | Standard Error | T Value | P Value
Intercept | -2135 | 588.7624 | -3.6263 | 0.0361
X1 | 7 | 0.8248 | 8.487 | 0.0034
X2 | 18 | 5.7735 | 3.1177 | 0.0526

Note: the P values from the T-test are the same as in the F-test.

Replicate: the number of test units under each treatment. In this example, Treatment = 4 and Replicate = 1. Design notation: 2^k, e.g., 2^3.

Pareto Chart (ReliaSoft DOE++): the standardized effects of X1 and X2 both exceed the critical value, so both terms are significant (Alpha = 0.1; Threshold = 2.3534).

To estimate the effect of a factor, at least two different settings are needed: High = 1, Low = -1. From a linear regression viewpoint, you need at least two points to fit a straight line. This answers Example 1: testing all 50 units at the single stress V = 25v cannot predict the reliability at the usage voltage V = 10v, because a one-level test gives no information about the voltage effect.

For 7 factors, a full 2-level factorial requires 2^7 = 128 runs. We can reduce cost and time by using only part of the full design. Sparsity of effects principle: most of the time, responses are affected by only a small number of main effects and lower order interactions. A fractional factorial design therefore focuses on the main effects and lower order interactions.

A design is balanced if each factor has the same number of high and low settings over all the treatments, so the sum of each factor column is 0. A design is orthogonal if, for each pair of columns, the sum of the products of the entries is 0, e.g.:

(1)(-1) + (-1)(1) + (-1)(-1) + (1)(1) = 0

A balanced and orthogonal design can evaluate the effect of each factor more accurately. If you add one more sample for any one of the treatments, the design becomes unbalanced and non-orthogonal.

Std. Order | A | B | AB | C | AC | BC | ABC
2 | 1 | -1 | -1 | -1 | -1 | 1 | 1
3 | -1 | 1 | -1 | -1 | 1 | -1 | 1
5 | -1 | -1 | 1 | 1 | -1 | -1 | 1
8 | 1 | 1 | 1 | 1 | 1 | 1 | 1

In this half fraction, BC and A cannot be distinguished: each pair of columns (AB and C, A and BC, B and AC) has the same pattern. Effects that cannot be distinguished within a design are called confounded effects or aliased effects. The summary of the aliased effects for the design is called the alias structure:

[I] = I + ABC; [A] = A + BC; [B] = B + AC; [C] = C + AB

I = ABC is called the defining relation.

With interactions, the regression model is Y = β0 + β1 X1 + β2 X2 + β12 X1 X2 + ε; the two-way interaction effect AB corresponds to X1X2. The simple effect calculation below is a special case that is only valid for a balanced design. The higher the order of an interaction, the less likely it is to be significant.

Effect of A = ȳ(A high) - ȳ(A low) = (30 + 35)/2 - (20 + 25)/2 = 10
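For a balanced design, the effect calculation above generalizes directly; a sketch:

```python
def main_effect(levels, responses):
    """Average response at the high level minus average at the low level."""
    hi = [y for x, y in zip(levels, responses) if x == 1]
    lo = [y for x, y in zip(levels, responses) if x == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

# The slide's example: responses 30, 35 at A = +1 and 20, 25 at A = -1
effect_a = main_effect([1, 1, -1, -1], [30, 35, 20, 25])   # 10.0
```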

Example 2: accelerated life test to predict reliability at the usage level.

Num of Units | Temperature (Celsius) | Humidity (%)
25 | 120 (+1) | 95 (+1)
25 | 85 (-1) | 85 (-1)

With only these two runs, the temperature and humidity effects are confounded. From the viewpoint of linear regression, at least 3 unique combinations are needed to fit Y = β0 + β1 X1 + β2 X2 + ε.

If the effect ABC can be ignored in the study, then we can select the treatments with ABC = 1 to form a fractional factorial design.


The full factorial for five factors is 2^5 = 32 runs. These runs are too expensive for the manufacturer. Only the main effects and the two-factor interactions are assumed to be important, so it is decided to carry out the investigation using the 2^(5-1) design (requiring 16 runs) with the factor generator E = ABCD.

In the 2^(3-1) design above, A and B are called basic factors, and the full factorial design for A and B is called the basic design. Factor C can be generated from C = AB; C = AB is called the factor generator for C.
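Generating such a fraction is straightforward: enumerate the full factorial in the basic factors and compute the generated column. A sketch for the 2^(5-1) design with generator E = ABCD:

```python
from itertools import product

def fractional_factorial_2_5_1():
    """16-run 2^(5-1) design: full factorial in A-D, with generator E = ABCD."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        runs.append((a, b, c, d, a * b * c * d))
    return runs

design = fractional_factorial_2_5_1()
# Every run satisfies the defining relation I = ABCDE
```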

Source of Variation | DF | Sum of Squares | Mean Squares
Model | 15 | 5775.4375 | 385.0292
A | 1 | 495.0625 | 495.0625
B | 1 | 4590.0625 | 4590.0625
C | 1 | 473.0625 | 473.0625
D | 1 | 3.0625 | 3.0625
E | 1 | 1.5625 | 1.5625
AB | 1 | 189.0625 | 189.0625
AC | 1 | 0.5625 | 0.5625
AD | 1 | 5.0625 | 5.0625
AE | 1 | 5.0625 | 5.0625
BC | 1 | 1.5625 | 1.5625
BD | 1 | 0.0625 | 0.0625
BE | 1 | 0.0625 | 0.0625
CD | 1 | 3.0625 | 3.0625
CE | 1 | 0.5625 | 0.5625
DE | 1 | 7.5625 | 7.5625
Residual | 0 | |
Total | 15 | 5775.4375 |

With all 15 effects in the model there are no residual degrees of freedom, so F ratios and P values cannot be computed.

The yield of an integrated circuit process is thought to be affected by the following five factors*:

Factor | Name | Unit | Low | High
A | Aperture Setting | | small | large
B | Exposure Time | minutes | 20 | 40
C | Develop Time | seconds | 30 | 45
D | Mask Dimension | | small | large
E | Etch Time | minutes | 14.5 | 15.5

The non-significant effects are used to estimate the error. The purpose of the experimentation is to increase the yield by finding the significant effects.

* Montgomery, D. C., 2001

The 2^(5-1) design matrix and responses:

Run Order | A | B | C | D | E | Yield
1 | 1 | -1 | -1 | 1 | 1 | 10
2 | 1 | -1 | 1 | 1 | -1 | 21
3 | -1 | 1 | 1 | -1 | 1 | 45
4 | -1 | -1 | 1 | -1 | -1 | 16
5 | 1 | 1 | -1 | -1 | 1 | 52
6 | 1 | 1 | 1 | -1 | -1 | 60
7 | -1 | 1 | -1 | 1 | 1 | 30
8 | -1 | -1 | 1 | 1 | 1 | 15
9 | 1 | -1 | -1 | -1 | -1 | 9
10 | -1 | 1 | -1 | -1 | -1 | 34
11 | -1 | -1 | -1 | -1 | 1 | 8
12 | 1 | 1 | -1 | 1 | -1 | 50
13 | -1 | 1 | 1 | 1 | -1 | 44
14 | -1 | -1 | -1 | 1 | -1 | 6
15 | 1 | -1 | 1 | -1 | 1 | 22
16 | 1 | 1 | 1 | 1 | 1 | 63

The runs are conducted in a random sequence of the factor combinations.

RSM explores the transfer function at the optimal region. The quadratic model is used in RSM, the estimated function is used for optimization, and linear regression and ANOVA are the tools for data analysis.

Effect probability plot for the yield experiment (ReliaSoft DOE++): B (Exposure Time), A (Aperture Setting), C (Develop Time) and AB lie off the distribution line and are flagged as significant; the remaining effects fall on the line (Alpha = 0.1; Lenth's PSE = 0.9375). Lenth's method assumes that, under the hypothesis that they are not significant, all the effects are normally distributed with a mean of 0.
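Lenth's PSE can be computed from the effect estimates alone. A sketch; the absolute effects below are derived from the per-effect sums of squares of the 16-run yield example via |effect| = sqrt(SS/4), and they reproduce PSE = 0.9375:

```python
import statistics

def lenth_pse(effects):
    """Lenth's pseudo standard error for a set of 2-level effect estimates."""
    abs_e = [abs(e) for e in effects]
    s0 = 1.5 * statistics.median(abs_e)
    trimmed = [e for e in abs_e if e < 2.5 * s0]   # drop likely-real effects
    return 1.5 * statistics.median(trimmed)

# |effect| for A, B, C, D, E, AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
abs_effects = [11.125, 33.875, 10.875, 0.875, 0.625, 6.875, 0.375, 1.125,
               1.125, 0.625, 0.125, 0.125, 0.875, 0.375, 1.375]
pse = lenth_pse(abs_effects)            # 0.9375, matching the plot
```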


ANOVA of the initial experiment leads to the final linear regression model (in coded values) with only significant terms:

Y = 35.6375 + 1.1625 X1 + 0.4875 X2

RSM: Example. The yield from a chemical process was found to be affected by two factors: Reaction Temperature and Reaction Time. The current operating conditions of 230 Fahrenheit and 65 minutes give a yield of about 35%. The engineer decides to explore the current conditions in the range (225, 235) Fahrenheit and (55, 75) minutes so that steps can be taken to achieve the maximum yield.


Coded value = (Actual value - (High + Low)/2) / (Half of the range between high and low). For example, at Temperature = 230, the corresponding coded value is 0. Replicates at center points preserve the orthogonality of the design, and can be used to check whether the linear model is sufficient or whether the curvature is significant.


At the optimal region the curvature is significant, and the simple linear model is not enough any more. A quadratic model is needed:

Y = β0 + β1 X1 + β2 X2 + β11 X1² + β22 X2² + β12 X1 X2 + ε

A design that can estimate the quadratic model is built by augmenting the factorial design; the Central Composite Design (CCD) is one of them.

The direction of steepest ascent is perpendicular to the contour lines. A CCD is formed from the corner points (-1,-1), (1,-1), (-1,1), (1,1), the center point (0, 0), and the axial points (-α, 0), (α, 0), (0, -α), (0, α), also called star points.

From Table 15, the setting at step 10 (Temp = 350, Time = 165) is close to the optimal region, so the region around that setting should be investigated. A factorial design with a center point of (350, 165) was conducted.


Model Diagnostic (ReliaSoft DOE++): the residual probability plot for the yield follows the distribution line (Anderson-Darling = 0.3202; p-value = 0.4906), and the residual vs. run order plot stays within the critical values (Alpha = 0.1; Upper Critical Value = 0.6209; Lower Critical Value = -0.6209), indicating an adequate model.

How to Decide α: several methods have been developed for deciding α, giving the design special properties, such as a design that can better estimate the model parameters or better explore the optimal region. A commonly used choice is

α = [ 2^(k-f) · (nf / ns) ]^(1/4)

where ns is the number of replicates of the runs at the axial points and 2^(k-f) represents the original factorial or fractional factorial design.

Optimization (ReliaSoft DOE++), Optimal Solution 1: A: Temperature X = 351.5045; B: Reaction Time X = 168.9973; Yield (Maximize) Y = 95.3264. The predicted yield at the optimal setting is 95.3%.


Data Types: when the response is life, the data are either complete data or censored data (right censored or interval censored). The failure time usually is not normally distributed, so linear regression and ANOVA are not good for life data. If a unit is still running when the test ends, you only have a suspension time, not a failure time, for this unit. Therefore, Reliability DOE needs: a life distribution, such as the Weibull, lognormal or exponential; a method that can handle suspension times correctly; and a method that can evaluate the significance of each effect, similar to ANOVA.

Right Censored (Suspension) Data

In this scenario, our data set is composed of the times-to-failure of the three units that failed and the running times of the other two units that did not fail.

Interval Censored Data

Suppose the units are not monitored continuously but are inspected every 100 hours. If a unit failed between inspections, we do not know exactly when it failed, but rather that it failed between two inspections. This is also called inspection data.
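Encoding inspection data as interval-censored observations is mechanical; a minimal sketch (the 100-hour period follows the example above):

```python
def inspection_interval(found_failed_at, period=100.0):
    """A failure discovered at an inspection is only known to lie
    between the previous inspection and the current one."""
    return (found_failed_at - period, found_failed_at)

print(inspection_interval(300.0))  # failure somewhere in (200.0, 300.0]
```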

Reliability DOE

The pdfs of the three commonly used life distributions are:

Exponential:  f(t) = (1/m) exp(-t/m)

Weibull:  f(t) = (beta/eta) (t/eta)^(beta-1) exp[ -(t/eta)^beta ]

Lognormal:  f(t) = (1 / (sigma' t sqrt(2 pi))) exp[ -(1/2) ((ln(t) - mu') / sigma')^2 ]
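The Weibull pdf written above can be cross-checked against SciPy's parameterization (`weibull_min` with shape `c = beta` and `scale = eta`); the values here are purely illustrative:

```python
import math
from scipy.stats import weibull_min

beta, eta, t = 2.0, 100.0, 50.0  # illustrative shape, scale, and time

# pdf written out exactly as in the formula above
manual = (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

# scipy's weibull_min uses the same two-parameter form
library = weibull_min.pdf(t, c=beta, scale=eta)
print(manual, library)
```

The two values agree, confirming the formula matches the library's convention.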

Life-Factor Relationship

A life characteristic can be chosen to investigate the effect of potential factors on life. The effect of the factors on life can be expressed as:

    eta' = beta_0 + beta_1 x_1 + beta_2 x_2 + ...

where:
- eta' is the (transformed) life characteristic
- x_j is the jth factor or effect

The life characteristics for the three commonly used distributions are:
- Weibull: eta, with eta' = ln(eta)
- Lognormal: mu', with eta' = mu'
- Exponential: m, with eta' = ln(m)

The logarithmic transformation is applied to the life characteristics of the Weibull and exponential distributions.

MLE and the Likelihood Ratio Test

Life-factor relationship: eta_i' = beta_0 + beta_1 x_i1 + beta_2 x_i2 + ...

The model parameters beta_0, beta_1, beta_2, ... are estimated by MLE, together with the remaining distribution parameter (e.g., sigma' for the lognormal). The likelihood has three parts:

Failure data:     L_F = prod_{i=1..N} f(T_i; eta_i, sigma)

Suspension data:  L_S = prod_{j=1..M} R(S_j; eta_j, sigma)

Interval data:    L_I = prod_{l=1..K} [ F(I_Ul; eta_l, sigma) - F(I_Ll; eta_l, sigma) ]

The total likelihood is L = L_F * L_S * L_I.

Likelihood Ratio Test

To test the significance of effect k:

    LR(effect k) = -2 ln [ L(effect k removed) / L(full model) ]

If LR(effect k) exceeds the chi-square critical value, effect k is significant.
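The three likelihood contributions and the LR statistic can be sketched directly. The Weibull parameters, the data, and the two log-likelihood values below are all hypothetical, chosen only to illustrate the mechanics:

```python
import numpy as np
from scipy.stats import weibull_min, chi2

beta, eta = 1.5, 200.0                      # hypothetical Weibull shape and scale
dist = weibull_min(c=beta, scale=eta)

failures = np.array([120.0, 180.0, 250.0])  # exact failure times
suspensions = np.array([300.0, 300.0])      # right-censored running times
intervals = np.array([[100.0, 200.0]])      # interval data: (I_L, I_U) pairs

# log L = sum ln f(T_i) + sum ln R(S_j) + sum ln[ F(I_U) - F(I_L) ]
loglik = (dist.logpdf(failures).sum()
          + dist.logsf(suspensions).sum()
          + np.log(dist.cdf(intervals[:, 1]) - dist.cdf(intervals[:, 0])).sum())

# likelihood ratio test for one effect (log-likelihoods are hypothetical)
ll_full, ll_reduced = -42.1, -44.8
lr = -2.0 * (ll_reduced - ll_full)          # LR = -2 ln(L_reduced / L_full)
p_value = chi2.sf(lr, df=1)
print(round(loglik, 3), round(lr, 2), round(p_value, 4))
```

In a real fit, the betas and sigma would be found by maximizing `loglik` numerically; here the point is only how each data type enters the likelihood.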

Life-Factor Relationship

[Figure: pdf plots as a factor is changed from the low level to the high level.]

It is seen that the pdf changes in scale only. The scale of the pdf is compressed at the high level. The failure mode remains the same; only the time of occurrence decreases at the high level.

R-DOE: Example

Consider an experiment to improve the reliability of fluorescent lights. Five factors, A-E, are investigated in the experiment. A 2^(5-2) design with factor generators D = AC and E = BC was conducted.*

Objective: To identify the significant factors and adjust them to improve life.

Run |  A |  B |  C |  D |  E | Failure Times
 1  | -1 | -1 | -1 |  1 |  1 | 14~16, 20+
 2  | -1 | -1 |  1 | -1 | -1 | 20+, 20+
 3  | -1 |  1 | -1 |  1 | -1 | 18~20, 12~14
 4  | -1 |  1 |  1 | -1 |  1 | 20+, 20+
 5  |  1 | -1 | -1 | -1 |  1 | 8~10, 16~18
 6  |  1 | -1 |  1 |  1 | -1 | 10~12, 20+
 7  |  1 |  1 | -1 | -1 | -1 | 18~20, 12~14
 8  |  1 |  1 |  1 |  1 |  1 | 20+, 14~16

Inspections were conducted every two hours. The results include interval data and suspensions.
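The 2^(5-2) design above can be regenerated from its factor generators; a quick sketch:

```python
from itertools import product

# full 2^3 factorial in the base factors A, B, C (standard order),
# with the generated columns D = AC and E = BC appended
runs = []
for a, b, c in product((-1, 1), repeat=3):
    runs.append((a, b, c, a * c, b * c))

for run in runs:
    print(run)
```

The first run comes out as (-1, -1, -1, 1, 1), matching row 1 of the design table.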

Traditional DOE Approach

- Assumes failure times are normally distributed.
- Treats suspensions as failures.
- Uses the middle point of the interval data as the failure time.

Problem: The above assumptions and adjustments are incorrect.

Life-Factor Relationship

For the ith observation, the life-factor relationship is written using the factor settings of its run. For example, for an observation in run 1 (A = -1, B = -1, C = -1, D = +1, E = +1):

    eta_1' = beta_0 + beta_1(-1) + beta_2(-1) + beta_3(-1) + beta_4(+1) + beta_5(+1)

The estimated model is:

    ln(eta) = 2.9959 + 0.1052 A - 0.2256 B - 0.0294 C - 0.2477 D + 0.1166 E

Z-test

Each coefficient can be tested for significance. For example, for beta_1:

    H0: beta_1 = 0;  H1: beta_1 != 0

    Z0 = beta_1_hat / se(beta_1_hat)

where se(beta_1_hat) is the standard error of the estimated coefficient.
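A sketch of the Z-test, using the coefficient for D from the fitted model but a made-up standard error (the tutorial does not list the standard errors):

```python
from scipy.stats import norm

coef = -0.2477   # estimated coefficient for D (from the fitted model)
se = 0.0595      # hypothetical standard error
z0 = coef / se

# two-sided test at alpha = 0.10: reject H0 if |Z0| > z_{0.95} = 1.645
p_value = 2 * norm.sf(abs(z0))
significant = abs(z0) > norm.ppf(0.95)
print(round(z0, 3), significant)
```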

Z-test Results

Z-values are the normalized coefficients.

[Figure: Pareto chart of the standardized effects (Z values), from ReliaSoft DOE++. With a critical value of 1.645, the bars for D, B, E and A exceed the critical value (significant); C does not (non-significant).]

Note: The p-values from the Z-test are slightly different from the likelihood ratio test. When the sample size is small, the likelihood ratio test is more accurate than the Z-test.

Model Diagnostic

When using the Weibull distribution for life, the residuals from the life-factor relationship should follow the extreme value distribution with a mean of zero.

[Figure: Residual probability plot and residuals against run order plot.]

From the results, factors A, B, D and E are significant at the risk level of 0.10. Therefore, attention should be paid to these factors.
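For the Weibull case, standardized residuals can be computed as beta_hat * (ln t_i - ln eta_hat_i); under a correct model these follow the standard smallest extreme value distribution (SciPy's `gumbel_l`). A sketch with made-up fitted values:

```python
import numpy as np
from scipy.stats import gumbel_l

beta_hat = 1.8                          # hypothetical estimated Weibull shape
t = np.array([14.5, 18.7, 9.3])         # observed failure times (made up)
eta_hat = np.array([16.0, 20.0, 10.0])  # fitted life characteristics (made up)

# standardized residuals; under a correct model they behave like
# draws from the standard smallest-extreme-value distribution
r = beta_hat * (np.log(t) - np.log(eta_hat))
probs = gumbel_l.cdf(r)                 # probability-plot positions
print(r, probs)
```

Plotting `r` against `gumbel_l` quantiles gives the residual probability plot referred to above.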

Results

MLE Information:

Term | Coefficient
A:A  |  0.1052
B:B  | -0.2256
C:C  | -0.0294
D:D  | -0.2477
E:E  |  0.1166

In order to improve the life, factors A and E (positive coefficients) should be set to the high level, while factors B and D (negative coefficients) should be set to the low level.

Comparison with the Traditional Approach

For comparison, the data were also analyzed using the traditional approach, in which mid-points are used as failure times for interval data and life is assumed to follow the normal distribution.
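The improvement recommendation follows from evaluating the fitted model at the candidate settings; a sketch (the choice C = -1 is arbitrary here, since C is not significant):

```python
import math

# coefficients from the fitted model ln(eta) = 2.9959 + 0.1052 A - ...
coef = {'intercept': 2.9959, 'A': 0.1052, 'B': -0.2256,
        'C': -0.0294, 'D': -0.2477, 'E': 0.1166}

def predicted_eta(A, B, C, D, E):
    ln_eta = (coef['intercept'] + coef['A'] * A + coef['B'] * B
              + coef['C'] * C + coef['D'] * D + coef['E'] * E)
    return math.exp(ln_eta)

# recommended setting: A and E high, B and D low
best = predicted_eta(A=+1, B=-1, C=-1, D=-1, E=+1)
# the opposite setting, for contrast
worst = predicted_eta(A=-1, B=+1, C=-1, D=+1, E=-1)
print(round(best, 2), round(worst, 2))
```

The predicted life characteristic at the recommended setting is several times larger than at the opposite setting, consistent with the signs of the coefficients.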

A, B, D and E were found to be significant using R-DOE. Traditional DOE fails to identify A and E as important factors at a significance level of 0.1.

More Information

http://www.weibull.com/doewebcontents.htm
http://www.itl.nist.gov/div898/handbook/

Summary

Topics covered:
- Introduction and DOE types
- General guidelines for conducting DOE
- Statistical background: linear regression and ANOVA
- 2-level factorial and fractional factorial design
- Response surface method
- Reliability DOE

Topics not covered in this tutorial:
- Blocking
- Power and sample size
- RSM with multiple responses
- RSM: Box-Behnken design
- D-optimal design
- Taguchi robust design
- Taguchi orthogonal array (OA)
- Mixture design
- Random and mixed effect model
- and more

The End
