
LISA Short Course Series

Basics of Design of Experiments

Ana Maria Ortega-Villa


Fall 2014


About me

Home country: Colombia.
5th year PhD student in Statistics.
M.S. in Statistics, Virginia Tech.
M.S. in Operations Research, Universidad de los Andes, Colombia.
Instructor: STAT 4705 Probability and Statistics for Engineers.
Contact: anaorte@vt.edu


Laboratory for
Interdisciplinary Statistical
Analysis
LISA helps VT researchers benefit from the use of Statistics

Collaboration:
Visit our website to request personalized statistical advice and assistance with:

Designing Experiments, Analyzing Data, Interpreting Results, Grant Proposals, Software (R, SAS, JMP, Minitab...)
LISA statistical collaborators aim to explain concepts in ways useful for your research.

Great advice right now: Meet with LISA before collecting your data.
LISA also offers:
Educational Short Courses: Designed to help graduate students apply statistics in their research
Walk-In Consulting: Available Monday-Friday from 1-3 PM in the Old Security Building (OSB) for questions <30 mins. See our website for additional times and locations.

All services are FREE for VT researchers. We assist with research, not class projects or homework.

www.lisa.stat.vt.edu

What are we doing?


1. Introduction to Design of Experiments
2. DOE main principles:
   - Randomization
   - Replication
   - Local control of error
3. Completely Randomized Design
4. Randomized Complete Block Design
5. Introduction to Factorial Designs


Introduction to Design of
Experiments


What is an Experiment?
An experiment can be thought of as a test or series
of tests in which we make controlled changes to the
input variables of a process or a system, in order to
determine how they change the output of interest.


Why do we design experiments?


MAXIMIZE:

Probability of having a successful experiment.


Information gain: the results and conclusions
derived depend on the way information was
collected.
MINIMIZE:

Unwanted effects from other sources of variation.

Cost of the experiment when resources are limited.


What would be an alternative?


Observational study:
The researcher has little to no control over sources of
variation and simply observes what is happening.
The researcher can only learn how the inputs are related to the outputs; we cannot determine causation.
Examples:

Surveys
Weather Patterns
Stock market price
etc.


Designed experiment
The researcher identifies and controls sources of variation that significantly impact the measured response.
The researcher can gather evidence for causation.

Correlation is not causation.

But what are sources of variation?


Sources of variation are anything that could
cause an observation to be different from
another observation.
Two main types:

Those that can be controlled and are of interest are called treatments or treatment factors.
Those that can influence the experimental response but in which we are not directly interested are called nuisance factors.


Rule of Thumb
List all major and minor sources of variation
before collecting the data, classifying them as
either a treatment or a nuisance factor.
We want our design to minimize the impact of minor sources of variation, and to be able to separate the effects of nuisance factors from those of treatment factors.
We want the majority of the variability of the data
to be explained by the treatment factors.


Example: Impact of Exercise Intensity on Resting Heart Rate
Suppose a researcher surveys a sample of
individuals to obtain information about their
intensity of exercise each week and their resting
heart rate.
[Table: Subject | Reported intensity of exercise each week | Resting heart rate, with empty rows for subjects 1, 2 and 3.]

What type of study is this?

An observational study.

How could we make it a designed experiment?

The researcher finds a sample of individuals, enrolls groups in exercise programs of different intensity levels, and then measures their resting heart rate.

[Table: Subject | Intensity level of exercise each week | Resting heart rate, with empty rows for subjects 1, 2 and 3.]

What are our sources of variation?


Treatment factor (major): Exercise intensity
Nuisance factors (major): Medication use; Air temperature & humidity
Nuisance factors (minor): Location of measurement; Body size; Body position

Designing the experiment


Minimum considerations:

Response: Resting heart rate (beats per minute)
Treatment: Exercise Program
o Low intensity
o Moderate intensity
o High intensity


Designing the experiment


Basic Design:

36 participants, 18 male and 18 female, under the conditions listed previously.
Every person is assigned to one of the three 8-week exercise programs.
Resting heart rate is measured at the beginning and end of the 8 weeks.

Fundamentals of Design of
Experiments
An experimental unit (EU) is the material to which treatment
factors are assigned.

For the resting heart rate example, the participants are the
EU.

We want EUs to be as similar as possible, but that isn't always realistic.

A block is a group of EUs similar to each other, and different from other groups.
o In the resting heart rate example, women are physiologically similar to each other and different from men.
A blocking factor is the characteristic used to create the blocks.
o In the resting heart rate example, gender is a blocking factor.

Three Basic Principles of Design of Experiments
Randomization


Randomization
Randomization consists of randomly assigning:

the experimental treatments to experimental units.


the order in which the independent runs will be
performed (when applicable).
Purpose:
Often we assume an independent, random distribution of observations and errors; randomization validates this assumption.
Averages out the effects of extraneous/lurking
variables.
Reduces bias and accusations of bias.


Randomization
The way you randomize depends on your experiment; what is important here is to remember that there are two levels of randomization.

1. Assignment of treatments to experimental units


2. Order of the runs (when applicable).
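For illustration only (not from the slides), here is a minimal Python sketch of the first level of randomization for the resting heart rate example. The balanced 12-per-program split matches the design described earlier, while the seed and labels are arbitrary.

    # Minimal sketch: randomly assign 3 exercise programs to 36 participants.
    # Balanced assignment (12 per program); the seed is fixed only for reproducibility.
    import numpy as np

    rng = np.random.default_rng(seed=2014)
    programs = ["Low", "Moderate", "High"]

    assignment = rng.permutation(np.repeat(programs, 12))

    for participant, program in enumerate(assignment, start=1):
        print(f"Participant {participant}: {program}")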


Randomization RHR Example


1. Assignment of treatments to experimental units.
Participant | Exercise Program
1 | High
2 | High
3 | Low
4 | Intermediate
5 | Low
6 | High

2. Order of the runs. Not applicable in this case since all participants are doing the experiment at the same time.


Three Basic Principles of Design of Experiments

Replication


Replication
Replication consists of independently repeating runs of each
treatment.
Purpose:
Improves precision of effect estimation.
Decreases Variance.
Allows for estimation of experimental error. This error
will later become a unit of measurement to determine
whether observed differences are really statistically
significant.
Note: Try to have the same number of replicates for each treatment assignment.

# Replicates = # EUs / # Treatments (in the RHR example: 36 EUs / 3 treatments = 12 replicates per treatment).

Replication in RHR Example


Participant | Exercise Program
1 | High
2 | High
3 | Low
4 | Intermediate
5 | Low
6 | High

Participants 1, 2 and 6 can be considered as replicates of the High intensity exercise treatment.


Pseudoreplication
What is pseudoreplication?

Occurs when there is more than one observation per EU and they are treated as replicates.
In our RHR example it would be like taking
measurements in different locations (wrist, side
of the neck and foot) of the same person and
treating them as separate observations.


Pseudoreplication
A way to deal with multiple measurements per EU is to average them and work with the new value.
Consequences of treating pseudoreplicates as true replicates:

Underestimation of error
Potential exaggeration of the true treatment differences
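As a minimal sketch with hypothetical readings (not slide data), collapsing multiple measurements per participant to one value in Python could look like this:

    # Hypothetical readings: three measurement sites per participant.
    import pandas as pd

    readings = pd.DataFrame({
        "participant": [1, 1, 1, 2, 2, 2],
        "site": ["wrist", "neck", "foot"] * 2,
        "rhr": [62, 64, 63, 71, 70, 72],
    })

    # Collapse to one row per experimental unit before any analysis.
    per_eu = readings.groupby("participant", as_index=False)["rhr"].mean()
    print(per_eu)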


Three Basic Principles of Design of Experiments
Local Control of Error


Local control of error


Local control of error means taking any measures within the design that improve the accuracy with which treatment effects are measured.
Purpose:
Removes or minimizes nuisance sources of variation.
Improves the precision with which comparisons among factors are made.
Note: There are several ways of doing this. One could control as much as possible all the previously listed sources of variation. Often this is done by the use of blocking or more advanced techniques such as ANCOVA.


RHR Local control of error


We will be monitoring the participants' exercise programs throughout the study (not relying on self-reporting).
We will only consider participants that are not taking any
medication that might alter their heart rate.
We will take all measurements on the same location of
the body: the wrist.
We will take all measurements with the participant in the same position: standing.
We will only accept participants with a body mass index
within the normal range.
We will measure all participants on the same day at the
beginning and the end of the study.


Common Designs:

Completely Randomized Design (CRD)


Completely Randomized Design (CRD)


The CRD is the simplest design. It assumes all EUs
are similar and the only major sources of variation
are the treatments.
In this design all treatment-EU assignments are
randomized for the specified number of treatment
replications.
If you are equally interested in comparisons of all treatments, get as close as possible to equally replicating the treatments (a balanced design).


CRD Example: Plasma Etching Experiment


Etching is a process in which unwanted material
is removed from circuit wafers in order to obtain
circuit patterns, electrical interconnects and
areas in which diffusions or metal depositions
are to be made.

* Example from Montgomery (2009)


CRD Example: Etching Process Simplified

1. Energy is supplied by a generator.
2. A chemical gas mixture is shot at a sample.
3. Plasma is generated in the gap between the electrodes.

CRD Example: Study


An engineer is interested in investigating the relationship
between the generator power setting and the etch rate for the
tool.
Response: Etch rate
Treatment: Generator power setting (4 levels to consider)
Experimental Unit: Circuit Wafer
Possible sources of variation:

Generator power setting


Chemical mixture gas (the gases affect the plasma behavior)
Size of the gap between the electrodes.


CRD Example: Principles of DOE


Replication
We will consider 5 EUs for each treatment level
(generator power setting)

Randomization
Since all EUs are considered to be identical, we will
randomize the running order.

Local control of error


In order to minimize variability we will use the same
chemical mixture (C2F6) and size of gap (0.8 cm) for
all runs of the experiment.


CRD Example: Randomization Scheme


[Table: randomized run order for the 20 runs, each run paired with its assigned power-setting treatment.]
This run order was obtained using a random number generator.
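As an illustration outside JMP, a comparable randomized run order for 4 power settings with 5 replicates each could be generated in Python; the power values and seed below are assumptions, not the slide's settings.

    # Minimal sketch: 4 power settings x 5 replicates = 20 runs in random order.
    import numpy as np

    rng = np.random.default_rng(seed=1)
    power = [160, 180, 200, 220]              # assumed power levels (W)
    run_order = rng.permutation(np.repeat(power, 5))

    for run, setting in enumerate(run_order, start=1):
        print(f"Run {run}: power setting {setting} W")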


CRD Example: What is the question?


We are interested in testing the equality of the treatment means:
H0: μ1 = μ2 = μ3 = μ4 versus Ha: at least one mean differs.

If we reject the null hypothesis, then there is a difference between at least two of the means, which translates to a significant difference between the treatments.


CRD Example: Analysis


We want to enter the data
such
that
each
each
response has its own row,
with
the
corresponding
treatment type.
We then choose Analyze ->
Fit Y by X.


CRD Example: Analysis


We will choose Rate as the Y response and Treatment
as the X factor.


CRD Example: Visual Analysis


From the red triangle: Display Options -> Boxplot

Remarks:

These box plots show that the etch rate increases as the power
setting increases.

From this graphical analysis we suspect:

1. The generator power setting affects the etch rate.


2. Higher power settings result in increased etch rate.


CRD Example: ANOVA Table


From the red triangle select Means/Anova.

ANOVA partitions the total variability into two separate independent pieces:
MSTrt: Variability due to treatment differences.
MSE: Variability due to experimental error.
If MSTrt > MSE, then the treatments likely have different effects.
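The same one-way ANOVA can be computed outside JMP. Below is a minimal Python sketch using scipy; the etch-rate numbers are illustrative placeholders, not the experiment's data.

    # Minimal sketch: one-way ANOVA for 4 power settings, 5 replicates each.
    from scipy import stats

    rate_160 = [575, 542, 530, 539, 570]   # illustrative values only
    rate_180 = [565, 593, 590, 579, 610]
    rate_200 = [600, 651, 610, 637, 629]
    rate_220 = [725, 700, 715, 685, 710]

    f_stat, p_value = stats.f_oneway(rate_160, rate_180, rate_200, rate_220)
    print(f"F = {f_stat:.2f}, p = {p_value:.4g}")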


CRD Example: Contrasts


Red Triangle: Compare Means -> Tukey HSD
At least two treatments are different, which ones?
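A comparable Tukey HSD comparison can be obtained with statsmodels; this sketch reuses the illustrative etch-rate values from the previous snippet.

    # Minimal sketch: all pairwise comparisons with Tukey's HSD.
    import numpy as np
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rate = np.array([575, 542, 530, 539, 570,
                     565, 593, 590, 579, 610,
                     600, 651, 610, 637, 629,
                     725, 700, 715, 685, 710])
    power = np.repeat(["160W", "180W", "200W", "220W"], 5)

    print(pairwise_tukeyhsd(endog=rate, groups=power, alpha=0.05))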


CRD: Summary
CRD has one overall randomization.
Try to equally replicate all the treatments.
Plot your data in a meaningful way to help visualize the analysis.
Use ANOVA to test for an overall difference.
Look at specific contrasts of interest to better
understand the relationship between treatments.


Common Designs:

Randomized Complete Block Design (RCBD)


Randomized Complete Block Design (RCBD)


The RCBD is a design in which there are one or
more nuisance factors that are known and
controllable. This design systematically eliminates
the effect of these nuisance factors on the
statistical comparisons among treatments.
The block size equals the number of treatments.
Basic Idea: Compare treatments within blocks to
account for the source of variation.


RCBD Example: Vascular Graft Experiment


Vascular grafts (artificial veins) are produced by
extruding billets of polytetrafluoroethylene (PTFE) resin
combined with a lubricant into tubes. Sometimes these
tubes contain defects known as flicks. These defects
are cause for rejection of the unit.
The product developer suspects that the extrusion
pressure affects the occurrence of flicks. An engineer
suspects that there may be significant batch-to-batch
variation from the resin.
* Example from Montgomery (2009)


RCBD Example: Study


Response: Percentage of tubes that did not
contain any flick.
Treatment: Extrusion Pressure (4 levels)
Block: Batch of resin (6 batches).


RCBD Example: Principles of DOE


Replication
Each treatment (extrusion pressure) is replicated
once in each block.

Randomization
The treatments (extrusion pressure) are randomized
inside each block.

Local control of error


In order to minimize variability we will use Blocking
and keeping all other possible controllable nuisance
factors controlled.


RCBD Example: What is the question?


We are interested in testing the equality of the treatment means:
H0: μ1 = μ2 = μ3 = μ4 versus Ha: at least one mean differs.

If we reject the null hypothesis, then there is a difference between at least two of the means.



RCBD Example: Analysis JMP


Analysis: Follow the same procedure.
Analyze->Fit Y by X.


RCBD Example: Visual Analysis


Boxplot:

From this graphical analysis we suspect:
1. Extrusion pressure affects the response.
2. Higher pressure settings seem to result in a lower percentage of flick-free tubes.
3. These results could potentially be affected by the resin batch.


RCBD Example: ANOVA Table

According to this analysis, we reject the null hypothesis. This means that there is a significant treatment effect.
Software will give you a p-value for the block, but only use it to gauge how much blocking reduced the experimental error. Do not test the blocks using this p-value.
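Outside JMP, the same RCBD model can be fit with statsmodels by entering pressure and batch as categorical terms; the percentages and level labels below are made-up placeholders.

    # Minimal sketch: RCBD ANOVA with pressure as treatment and batch as block.
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    graft = pd.DataFrame({
        "pressure": ["P1", "P2", "P3", "P4"] * 6,
        "batch": [b for b in range(1, 7) for _ in range(4)],
        "pct_ok": [90.3, 89.2, 98.2, 93.9, 92.5, 89.5, 90.6, 94.7,
                   85.5, 90.8, 89.6, 86.2, 94.5, 88.0, 86.2, 87.4,
                   91.5, 93.4, 96.4, 82.5, 95.8, 90.2, 88.0, 78.9],
    })

    model = smf.ols("pct_ok ~ C(pressure) + C(batch)", data=graft).fit()
    # Report the F-test for pressure; the batch row only gauges error reduction.
    print(sm.stats.anova_lm(model, typ=2))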


RCBD Example: Contrasts

Significant differences between treatments 1 and 4, and between treatments 2 and 4.


Common Designs:

Introduction to Factorial Designs


Factorial Designs
In this type of design we want to study the effect of two or more factors. In each complete trial or replication of the experiment, all possible combinations of the levels of the factors are investigated.
Basic idea: Treatments are combinations of multiple factors, each with different levels (i.e., settings).


Factorial Designs: Main Concepts


The effect of a factor is defined as the change in the response produced by a change in the level of that factor (the main effect).
Interaction between factors is present when the difference in response between the levels of one factor is not the same at all levels of the other factors (i.e., the effect of factor A depends on the level chosen for factor B).
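A tiny worked example (made-up numbers, not from the slides) shows how main effects and an interaction are computed in a 2 x 2 layout:

    # Rows: levels of factor A (low, high); columns: levels of factor B (low, high).
    import numpy as np

    response = np.array([[20.0, 30.0],
                         [40.0, 52.0]])

    main_a = response[1].mean() - response[0].mean()        # A low -> high: 21.0
    main_b = response[:, 1].mean() - response[:, 0].mean()  # B low -> high: 11.0
    # Interaction: does the effect of B change with the level of A?
    interaction = (response[1, 1] - response[1, 0]) - (response[0, 1] - response[0, 0])

    print(main_a, main_b, interaction)                      # 21.0 11.0 2.0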


Factorial Designs Example: Battery Design


An engineer is designing a battery that will be used in
a device that will be subject to extreme variations in
temperature.
She is interested in examining three different materials
for this battery at three different temperatures (15, 70
and 125 F) in order to determine how battery life is
affected by these conditions.

* Example from Montgomery (2009)


Factorial Design Example: Study


Response: Battery life
Treatment: All combinations of the factors:
Material: 3 levels (1, 2 and 3)
Temperature: 3 levels (15, 70 and 125 F)

Factorial Design Example: Principles of DOE


Replication
Each treatment (combination of levels of factors) is
replicated 4 times.

Local control of error


In order to minimize variability we will keep
everything else in the testing lab constant
throughout the experiment.


Factorial Design Example: Randomization


[Table: randomized run order for the 36 runs; each material x temperature combination (3 x 3 cells, 4 replicates each) is listed with its assigned run number.]

Factorial Design Example: Randomization


You can create your own design in JMP:
DOE->Custom Design
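Outside JMP, a full factorial run sheet can be built and randomized in a few lines of Python; the material labels and seed below are arbitrary placeholders.

    # Minimal sketch: enumerate all material x temperature cells, replicate
    # each 4 times, and shuffle the run order.
    import itertools
    import random

    random.seed(7)
    materials = ["M1", "M2", "M3"]
    temps = [15, 70, 125]                               # degrees F

    design = list(itertools.product(materials, temps)) * 4   # 36 runs
    random.shuffle(design)

    for run, (mat, temp) in enumerate(design, start=1):
        print(f"Run {run}: material {mat}, temperature {temp} F")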


Factorial Design Example: Analysis


Analyze->Fit Model


Factorial Design Example: Interaction


Red Triangle: Factor Profiling -> Interaction Plots


Factorial Design Example: ANOVA Theory


Here the ANOVA table is partitioned:
SST = SSModel + SSError
And SSModel is further partitioned:
SSModel = SSTemp + SSMat + SSInt
SSTemp: Compares temperature level means to the overall mean.
SSMat: Compares material level means to the overall mean.
SSInt: Looks at whether the differences between temperature levels change depending on the material (the interaction).
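This partition corresponds to a two-factor ANOVA with interaction, which can be fit outside JMP with statsmodels; the battery-life values below are illustrative placeholders, not the slide data.

    # Minimal sketch: two-factor factorial ANOVA (material x temperature).
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    battery = pd.DataFrame({
        "material": ["M1"] * 6 + ["M2"] * 6 + ["M3"] * 6,
        "temp": ["15F", "70F", "125F"] * 6,
        "life": [130, 74, 20, 155, 80, 70, 150, 159, 126,
                 188, 106, 115, 138, 168, 160, 110, 160, 139],
    })

    model = smf.ols("life ~ C(material) * C(temp)", data=battery).fit()
    # Rows of the table: material, temperature, interaction, residual.
    print(sm.stats.anova_lm(model, typ=2))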

Factorial Design Example: ANOVA


Model adequacy checking


Model Adequacy checking


It is recommended to check the adequacy of the model by examining the residuals (the differences between the observed values and the ones predicted by the model).
These residuals should be structureless, which means they should not contain an obvious pattern.
To save the residuals from Fit Model (not Fit Y by X):
Red triangle: Save Columns -> Residuals


Model Adequacy checking: Assumptions


Residuals should be normally distributed
Can inspect with a normal probability plot:
Analyze-> Distribution.
Red triangle: Normal Quantile plot

Plot residuals vs. fitted values and check for patterns:
In the effect analysis window, red triangle: Row Diagnostics.

Plot residuals by treatment; this can be done with the saved residuals using the Graph Builder.
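The same two diagnostic plots can be produced in Python; this sketch assumes a fitted statsmodels model is available as model (for example, from the earlier sketches).

    # Minimal sketch: normal Q-Q plot and residuals-vs-fitted plot.
    import matplotlib.pyplot as plt
    from scipy import stats

    residuals = model.resid            # residuals from the fitted ols model
    fitted = model.fittedvalues

    fig, axes = plt.subplots(1, 2, figsize=(9, 4))

    stats.probplot(residuals, dist="norm", plot=axes[0])
    axes[0].set_title("Normal Q-Q plot of residuals")

    axes[1].scatter(fitted, residuals)
    axes[1].axhline(0, linestyle="--")
    axes[1].set_xlabel("Fitted values")
    axes[1].set_ylabel("Residuals")
    axes[1].set_title("Residuals vs fitted")

    plt.tight_layout()
    plt.show()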


Model Adequacy checking: Battery


Model Adequacy checking: Plasma Etching


Model Adequacy checking: Vascular Graft


Exercise


Exercise:
A soft drink bottler is interested in obtaining more uniform fill heights in the
bottles produced by his manufacturing process. The process engineer can
control three variables during the filling process:

Percent carbonation
Operating pressure in the filler
Line speed.
The engineer can control carbonation at three different levels (10, 12 and 14%),
two levels for pressure (25 and 30 psi) and two levels for line speed (200 and
250 bpm).
She decides to run two replicates of a factorial design in these factors, with all runs taken in random order. The response variable is the average deviation from the target fill height observed in a production run of bottles at each set of conditions.
How many factors do we have? How many runs would we need to perform?
* Example from Montgomery (2009)


Exercise: Question 1
Suppose you obtain this interaction plot, what
would you interpret?


Exercise: Analysis
Conduct the factorial analysis in JMP, what can
you conclude?


Exercise: Analysis
What can you say about the residuals?


Summary
Remember to randomize!
Randomize run order and treatments.

Remember to replicate!
Use multiple EUs for each treatment; it will help you estimate your effects more accurately.

Remember to block!
In the case where you suspect some inherent quality of your experimental units may be causing variation in your response, arrange your experimental units into groups based on similarity in that quality.

Remember to contact LISA!


For short questions, attend our Walk-in Consulting hours.
For research, come before you collect your data for design help.


Reference

Montgomery, Douglas C. Design and Analysis of Experiments. John Wiley & Sons, 2008.


Please don't forget to fill in the sign-in sheet and to complete the survey that will be sent to you by email.

Thank you!

