
Chapter 1

Research Design Principles

Li-Shya Chen and Tsung-Chi Cheng

Department of Statistics
National Chengchi University
Taipei 11605, Taiwan


• 1.1 The legacy of Sir R. A. Fisher

• 1.2 Planning for research
• 1.3 Experiments, Treatments, and Experimental Units
• 1.4 Research Hypotheses Generate Treatment Designs
• 1.5 Local Control of Experimental Errors
• 1.6 Replication for Valid Experiments
• 1.7 How Many Replications?
• 1.8 Randomization for Valid Inferences
• 1.9 Relative Efficiency of Experiment Designs
• 1.10 From Principles to Practice: A Case Study

• All activities associated with planning and performing

research studies have statistical implications.
• The principles in this chapter form a basis for the
structure of a research study.
• The structure, in turn, defines the study’s function.
• If the structure is faulty, then the study will not function
properly: it will produce either incomplete or misleading answers.
• The statistical principles are those associated with the
collection of observations to gain maximum information
for a research study in an efficient manner.
• They include treatment design, local control of variability,
replication, randomization, and the efficiency of
experiment designs.
Sir Ronald A. Fisher (1890-1962)
• R. A. Fisher developed the analysis of variance and unified
his idea on basic principles of experimental design.
• He joined the staff at Rothamsted Experimental Station
during 1919-1933.
• Fisher (1926) published “The arrangement of field
experiments” and outlined three fundamental components:
• Local control of field conditions for reduction of
experimental error
• Replication as a means to estimate experimental error
• Randomization for valid estimation of experimental error
• Two books
• Statistical Methods for Research Workers (1925)
• The Design of Experiments (1935)
Planning for research

• A research program is an organized effort on the part of a

scientist to acquire knowledge about a natural or
manufactured process.
• Good planning helps the scientist to organize the required
tasks for a research study.
A checklist at the beginning

• The specific objectives of the experiment

• Identification of influential factors and which of those
factors to vary and which to hold constant
• The characteristics to be measured
• The specific procedures for conducting tests or measuring
the characteristics
• The number of repetitions of the basic experiment to conduct
• Available resources and materials
Questions to focus activities

• Questions throughout the design process

• What is my objective?
• What do I want to know?
• Why do I want to know it?
• Productive follow-up questions for each activity in the
process direct our attention to the role of each
activity in the research study.
• How am I going to perform this task?
• Why am I doing this task?
Design of experiments (DOE)

• An experimental design is confined to investigations

that establish a particular set of circumstances under a
specific protocol in order to observe and evaluate the implications
of the resulting observations.
• The investigator establishes and controls the protocols in
an experiment to evaluate and test something that for the
most part is unknown up to that time.
• Comparative experiment
• the establishment of more than one set of circumstances
in the experiment
• the responses resulting from different circumstances will
be compared with one another
(Comparative) experiment

• The design of experiments is specified by:

• Treatments
• the set of circumstances created for the experiment in
response to research hypothesis
• the focus of the investigation
• Ex: diets, cultivars of a crop species, temperatures, soil
types, amounts of a nutrient
• Experiment units (EU)
• the physical entity or subject exposed to the treatment
independently of other units.
• responses or outcomes
• a method to assign treatments to experiment units

• Experimental design is the part of statistics happening

prior to an experiment.
• Always design experiments with particular statistical tests
in mind
• Proper analysis depends on the design and the kinds of
statistical model assumptions we believe are correct and
are willing to assume.
Experimental error

• Experimental error: the variation among identically and

independently treated EUs
• Origins of experimental error
• the natural variation among EUs
• variability in measurement of the response
• inability to reproduce the treatment conditions exactly
• interaction of treatments and EUs
• other extraneous factors influence the response
• A major objective is the attainment of an estimate of the
variance of experimental error
• The variance of experimental error is the variance of
observations on EUs
• The differences among the observations can be attributed
only to experimental error.
• Statistical inferences rely on an estimate of such variance.
Comparative observational study

• An observational study draws inferences about the

possible effect of treatments on subjects, where the
assignment of subjects into treatment groups is outside
the control of the investigator.
• The subjects are either self-selected into identifiable
groups or they simply exist in their particular circumstances.
• The groups or circumstances are used as treatments in the
observational study.
• Practical reasons or ethical considerations
• Observational studies are limited to establishing association
relationships between the responses and treatments.
DOE vs. observational study
• Both designed experiments and observational studies help
researchers to answer questions or evaluate the
effectiveness of a research program.
• Distinctions between observational studies and designed experiments:
• Can treatments be assigned to EUs?
• In designed experiments, treatments are assigned to EUs
according to a planned scheme.
• In observational studies, subjects are either self-selected
into identifiable groups or they simply exist in their
particular circumstances.
• The groups or circumstances are used as treatments in
the observational study.
• What kind of relationships can be established?
• Designed experiments help to establish causal relationships
between the responses and treatments
• Observational studies help to establish association
relationships between responses and treatments
• Section 1.4 describes how research hypotheses generate
treatment designs.
• Why do experiments rather than observational studies?
• For instance, an observational study shows a positive
association between ice cream sales and rates of violent crime.
• What does this mean?

• Correlation or Association ≠ Causation

• Observational studies are prone to be influenced by
confounding variables, which mask or distort the
association between measured variables in a study
• Ex: Ice cream sales and rates of violent crime are positively
correlated to hot weather.
• In an experiment, treatments can be randomly assigned to
units or individuals (randomization) to avoid confounding
Purposes of experiments

• To determine the cause(s) of variation in the response.

• To find conditions under which the optimal (maximum or
minimum) response is achieved
• To compare responses at different levels of controllable variables
• To determine the levels of controllable variables to
minimize the impacts from the uncontrollable variables
• To develop a model for predicting responses
• Etc.
History of design of experiments (I)

• The agricultural origins, 1908 – 1940s

• W. S. Gosset and the t-test (1908)
• R. A. Fisher & his co-workers
• Profound impact on agricultural and biological sciences
• The first industrial era, 1951 – late 1970s
• Box & Wilson, response surfaces
• Process modeling and optimization
• Applications in the chemical & process industries
History of design of experiments (II)

• The second industrial era, late 1970s – 1990

• Quality improvement and variation reduction initiatives in
many companies
• Taguchi and robust parameter design, process robustness
• The modern era, beginning 1990s
• Popular outside statistics, and an indispensable tool in
many scientific/engineering endeavors
• New challenges:
• Large and complex experiments, e.g. screening designs in
the pharmaceutical industry
• Computer experiments: efficient ways to model complex
systems based on computer simulation
• Many others
A systematic approach to experimentation
• State objectives
• Choose responses
• What to measure?
• How to measure?
• How good is the measurement system?

• Choose factors and levels

• Flow chart and cause-and-effect diagram
• Factor experimental range is crucial for success

• Choose experimental plan

• Conduct the experiment
• Analyze the data
• Conclusion and recommendation:
• iterative procedure
• confirmation experiments/follow-up experiments
Research hypotheses generate treatment designs

• Example: survival of seedlings under attack by a soil pathogen

• Research hypothesis: Not all fungicides are equally
effective in controlling the soil pathogen.
• Treatment design: Selecting different types of fungicides
for experiment.
• Treatment design decides treatments to be used, where
treatments can be conditions imposed by researchers or
existing conditions.
• Allowing additional treatments helps to fully evaluate the
consequences of hypotheses (next page)
Control treatments are benchmarks

• No treatment or Placebo or standard treatment

• No treatment helps to reveal the conditions under which
the experiment was conducted.
• A placebo establishes a basis for treatment effectiveness
when the act of administering a treatment alone may produce a response.
• Note: One may include two distinct types of controls.
Multiple-factor treatment designs expand inferences

• A factor is a particular set of treatments.

• The categories of a factor are levels of the factor.
• Temperature: 20°C, 30°C, 40°C
• Practical or ethical reasons may prevent experimentation:
• Ex: Can’t randomly assign gender to subjects
• Ex: Unethical to put any individual in a “no seat belt”
condition in a car-collision experiment
• Ex: Impractical to randomly assign sites to have only pure
mature oak trees or to have mixed mature oak and pine trees
• The investigator may simply lack the requisite influence or
power to conduct randomization.
Factorial arrangement

• The factorial arrangement consists of all possible

combinations of the levels of the treatment factors.
• Ex: Nitrogen production by a special type of bacteria is
affected by the temperature and soil type
• Temperature: 20°C, 30°C, 40°C
• Soil type: normal, saline, sodic
• 3 × 3 factorial treatment combinations (Two-factor
factorial design)
• See Chapters 6 and 7.
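As a quick illustration, the 3 × 3 factorial arrangement above can be enumerated in a few lines of Python (a sketch; the factor levels are the ones from the example):

```python
from itertools import product

# Factor levels from the bacteria example (two factors, three levels each)
temperatures = [20, 30, 40]                  # degrees Celsius
soil_types = ["normal", "saline", "sodic"]

# The factorial arrangement: every temperature-by-soil combination
treatments = list(product(temperatures, soil_types))
print(len(treatments))  # 9 treatment combinations in the 3 x 3 factorial
```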
Local control of experimental errors

• Actions an investigator can employ to reduce or control

experimental error:
• (1) Techniques related to preparing circumstances or
taking measurement:
• Accurate measurement, preparation of media, pipetting of
solutions, calibration of instruments
• (2) Selection of uniform EUs to reduce experimental error.
• (3) Blocking, which groups experimental units into uniform
sets, helps to reduce experimental error.
Local control of experimental errors (II)
Major criteria for blocking:
• Proximity: such as neighboring plots
• Physical characteristics: such as litter, batch, age, or
weight, etc.
• Time: temporal effect on the task, such as each day only
one replication can be run.
• Management: Each technician or observer can be
assigned to one replication of all treatments.
• Matching strategy for grouping subjects or units:
• Pair matching: two approaches based on controlling
• exact value
• caliper value (similar value)
• Nonpair matching: two approaches based on controlling
• frequency: subjects are stratified so the treatment groups
match on the frequency distribution of the matching variable
• mean-based: each treatment group has the same average
value of the matching variable
Local control of experimental errors (III)
• (4) Choice of experiment design to accommodate
treatment designs
• Experiment design decides the selection/arrangement of
EUs and assignment of treatments.
• Ex: Compare 3 gasoline additives.
• Treatment: additive ; 3 treatments
• EU: automobile engine; 6 EUs
• Assignment: 3 additives are randomly allocated to 6
engines, 2 units per additive.
• Design: completely randomized design
• Ex: Compare 3 diets.
• Treatment: diet; 3 treatments.
• EU: mouse; 6 EUs.
• Block : litter ; 2 blocks
• Assignment: 3 treatments were randomly assigned to 3
mice in each litter.
• Design: randomized complete block design
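The two assignment schemes above can be sketched in Python: a completely randomized allocation of additives to engines, and a fresh randomization of diets inside each litter. Unit labels and the seed are illustrative only:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Completely randomized design: 3 additives (A, B, C) randomly
# allocated to 6 engines, 2 engines per additive.
engines = list(range(1, 7))
random.shuffle(engines)
crd = {"A": engines[0:2], "B": engines[2:4], "C": engines[4:6]}
print("CRD:", crd)

# Randomized complete block design: 3 diets randomized independently
# within each of 2 litters (blocks), one mouse per diet per litter.
diets = ["diet1", "diet2", "diet3"]
rcbd = {}
for litter in ("litter1", "litter2"):
    order = diets[:]
    random.shuffle(order)  # a fresh randomization inside every block
    rcbd[litter] = dict(zip(("mouse1", "mouse2", "mouse3"), order))
print("RCBD:", rcbd)
```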
Local control of experimental errors (IV)

• (5) Including covariates to control variation

• Variables that are related to the response variable are
called covariates.
• Analysis of covariance (ANCOVA): a procedure considering
covariates in addition to treatments provides statistical
control of the experimental error variance.
• Ex: Initial body weight would be a covariate in a weight-gain experiment.
• Ex: Pretest scores would be a covariate in a learning experiment.
• Requirement of ANCOVA: covariates are unaffected by the treatments.
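A minimal numpy sketch of the ANCOVA idea: fit a linear model with an intercept, a treatment indicator, and the covariate by least squares. All data here are simulated and every number is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated weight-gain study (all numbers hypothetical): 2 treatments,
# 10 animals each; initial body weight is a covariate that also
# influences the response.
n = 10
trt = np.repeat([0, 1], n)                  # treatment indicator
baseline = rng.normal(50, 5, size=2 * n)    # covariate: initial weight
gain = 2.0 + 1.5 * trt + 0.3 * baseline + rng.normal(0, 1, size=2 * n)

# ANCOVA as a linear model: gain ~ intercept + treatment + covariate.
# Adjusting for the covariate reduces the error variance against which
# the treatment effect is judged.
X = np.column_stack([np.ones(2 * n), trt, baseline])
beta, *_ = np.linalg.lstsq(X, gain, rcond=None)
print("covariate-adjusted treatment effect:", beta[1])
```

The adjusted effect `beta[1]` should land near the simulated true value of 1.5.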
Replication for valid experiments

• Replication implies an independent repetition of the basic experiment.

• Reasons for replicating an experiment:
• demonstrating the results to be reproducible
• providing some insurance against aberrant results due to
unforeseen accidents
• providing an estimate for experimental error variance
• increasing the precision for estimating treatment means

• Observational units vs. experimental units:

• Why might the observational unit ≠ the experimental unit?
• The observational unit may be a sample from the EU.
Example 1.1
• Weight gain data are collected to test the efficacy of 2 rations.
• Layout: Pen 1 – Ration A (6 cows); Pen 2 – Ration B (6 cows).
• Each of 2 rations was assigned to one of 2 pens.
• Each pen had 6 cows.
• EU: pen; number of EUs= 2.
• Observational unit: cow; number of OUs within each
treatment = 6.
• This experiment has only 1 true replication, i.e. pen.
• Differences in ȲA and ȲB may not be attributed to rations
alone, why?
• Besides treatment effects, there may exist pen effects and
treatment-pen interactions.
• Note: When the observational unit (OU) ≠ the experimental unit,
OUs are also called pseudoreplicates.
Example 1.2
• Each of two rations was randomly assigned to 2 pens of 3 cows each.
• Layout: Ration A – Pen 1 (3 cows), Pen 2 (3 cows);
Ration B – Pen 3 (3 cows), Pen 4 (3 cows).
• EU: pen; number of EUs= 4.
• OU: cow; 3 cows per pen.
• Response from EU: the average pen response.

• Ex: Fertilizers are applied to field plots; individual plant

samples from a field plot are observational units.
• Ex: Medications are received by patients; serum
samples from a patient are observational units.
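Collapsing observational units to one response per EU, as in Example 1.2, can be sketched as follows (the weight-gain numbers are made up for illustration):

```python
# Hypothetical weight gains for the 4-pen layout of Example 1.2:
# cows are observational units; the pen mean is the EU response.
pens = {
    "pen1": ("A", [1.2, 1.4, 1.1]),
    "pen2": ("A", [1.3, 1.5, 1.2]),
    "pen3": ("B", [0.9, 1.0, 1.1]),
    "pen4": ("B", [1.0, 0.8, 1.2]),
}

eu_responses = {pen: (ration, sum(gains) / len(gains))
                for pen, (ration, gains) in pens.items()}
for pen, (ration, mean_gain) in sorted(eu_responses.items()):
    print(pen, ration, round(mean_gain, 3))

# The analysis then uses the 4 pen means (2 EUs per ration),
# not the 12 individual cow records.
```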
How many replications?

• More replications are required to detect small treatment differences.
• Factors involved in determining number of replications:
• The error variance σ²
• Size of difference that has physical significance
• The significance level α
• The power of the test, 1 − β
• To test H0 : µ1 = µ2 vs. Ha : µ1 ≠ µ2, an experiment with 2
independent samples is conducted.
• Let α be the significance level.
• When the true difference in means is µ1 − µ2 = δ and the
probability of a type II error is to be no greater than β,
• then the number of replications for each treatment group is at least

r = 2(z_{α/2} + z_β)² σ² / δ²

• If µ1 = 10, µ2 = 11, σ = 0.5, α = 0.05, and β = 0.2, then δ = 1 and

r = 2(1.96 + 0.84)² (0.5)² / 1² = 3.92 → 4
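The sample-size formula can be checked with a short function (a sketch; the z quantiles come from scipy):

```python
from scipy.stats import norm

def replications(delta, sigma, alpha=0.05, beta=0.2):
    """Replications per group for a two-sided, two-sample z-based test."""
    z_a = norm.ppf(1 - alpha / 2)   # 1.96 when alpha = 0.05
    z_b = norm.ppf(1 - beta)        # 0.84 when power = 0.8
    return 2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2

r = replications(delta=1.0, sigma=0.5)
print(r)   # about 3.92, so round up to 4 replications per group
```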
Randomization for valid inferences

• Random assignment of treatments to the EUs.

• According to Fisher, the random allocation of treatments
to EUs simulates the effect of independence and permits
us to proceed as if the observations are independent and
normally distributed.
• He showed that normal-theory tests are good approximations to
randomization tests, provided randomization is implemented
on a reasonably large sample.

• Eliminating biases arising through systematic assignment

of treatments to EUs.

• Trts A and B were randomly assigned to 7 EUs, 4 receiving

A and 3 receiving B.
• Two-sample t test

t = ((ȳ1 − ȳ2) − D0) / (s_p √(1/n1 + 1/n2))
  = ((16.5 − 15) − 0) / √(4.2 (1/4 + 1/3)) = 0.958

s_p² = ((n1 − 1)S1² + (n2 − 1)S2²) / (n1 + n2 − 2)
     = (3 × 2.5166² + 2 × 1²) / (4 + 3 − 2) = 4.2

• p-value = 2 P(t5 > 0.958) = 0.38.
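The pooled-variance t statistic can be verified numerically from the summary statistics alone (a sketch using only the values quoted on the slide):

```python
import math

# Summary statistics from the slide: n1 = 4, n2 = 3,
# ybar1 = 16.5, ybar2 = 15, s1 = 2.5166, s2 = 1.
n1, n2 = 4, 3
ybar1, ybar2 = 16.5, 15.0
s1, s2 = 2.5166, 1.0

# Pooled variance, then the two-sample t statistic with D0 = 0.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t = (ybar1 - ybar2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
print(round(sp2, 1), round(t, 3))   # 4.2 0.958
```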

Example (continued)
• Randomized test
• If there is no difference between the effects of treatments A and B,
then the treatment labels are merely attached to EUs and can be
reallocated among the sample results.
• Hence, there are (n1 + n2)!/(n1! n2!) = 7!/(4! 3!) = 35 possible
arrangements.
• Table 1.2 lists all 35 arrangements and the computed differences
between the two treatments, (ȳ1 − ȳ2)j , j = 1, . . . , 35.
• Order the absolute differences:
|ȳ1 − ȳ2|(1) ≤ |ȳ1 − ȳ2|(2) ≤ · · · ≤ |ȳ1 − ȳ2|(35)
• The observed difference is ȳ1 − ȳ2 = 1.5.
• p-value of the randomization test
= Σ_{i=1}^{35} I(|ȳ1 − ȳ2|(i) ≥ 1.5) / 35 = 13/35 = 0.37
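The randomization test can be carried out by enumerating all C(7, 4) = 35 relabelings. The seven responses below are hypothetical (the slide's raw data are not shown), chosen so the observed difference is 1.5; the resulting p-value may therefore differ from 13/35:

```python
from itertools import combinations

# Hypothetical responses for the 7 EUs; the first 4 values received
# treatment A, the last 3 treatment B.
y = [17.0, 18.0, 15.0, 16.0, 14.0, 15.0, 16.0]
n1 = 4
observed = sum(y[:n1]) / n1 - sum(y[n1:]) / (len(y) - n1)   # 1.5

# Under H0 the labels are arbitrary: enumerate all C(7, 4) = 35
# relabelings and recompute the mean difference for each.
diffs = []
for group_a in combinations(range(len(y)), n1):
    ya = [y[i] for i in group_a]
    yb = [y[i] for i in range(len(y)) if i not in group_a]
    diffs.append(sum(ya) / len(ya) - sum(yb) / len(yb))

# Two-sided randomization p-value: proportion of relabelings with a
# difference at least as extreme as the observed one.
p_value = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
print(len(diffs), observed, p_value)
```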
Three important principles-I

• Randomization
• Randomization is the random assignment of treatments to
units in an experimental study.
• Breaks the association between potential confounding
variables and the explanatory variables
Three important principles-II

• Replication
• Replication implies an independent repetition of the basic experiment.
• Reasons for replicating an experiment:
• demonstrating the results to be reproducible
• providing some insurance against aberrant results due to
unforeseen accidents
• providing an estimate for experimental error variance
• increasing the precision for estimating treatment means
Three important principles-III

• Blocking
• Blocking is the grouping of experimental units that have
similar properties
• Within each block, treatments are randomly assigned to
experimental units.
• Blocking allows us to remove extraneous variation from
the data.
• When do we need to consider blocking and/or randomization?
• There are nuisance factors which are not of interest, but
they do affect the response.
• For the nuisance factors
• Block what you can control
• Randomize what you cannot control
Nuisance factors

• A nuisance factor has an effect on the response, but its effect is not of interest

• If unknown
• Protecting experiment through randomization
• If known (measurable) but uncontrollable
• Analysis of Covariance
• If known and controllable
• Blocking
Textbook (I)

• Completely randomized designs (comparing two

population means with independent samples, Ch. 2, 3)
• Factorial experimental designs (Ch. 6, 7)
• Factorial designs with fixed effects (Ch. 6)
• Factorial designs with random effects (Sec. 7.1)
• Factorial designs with mixed effects (Sec. 7.2)
• Nested factor designs (Sec. 7.3)
• Nested and crossed factors designs (Sec. 7.4)
• Complete block designs
• Randomized block designs (comparing two population
means with matched samples, Sec. 8.2)
• Latin square designs (Sec. 8.3)
• Factorial experiments in complete block design (Sec. 8.4)
Textbook (II)

• Incomplete block designs (Ch. 9, 10, 11)

• Fractional factorial designs (Ch. 12)
• Response surface designs (Ch. 13)
• Split-plot designs (Ch. 14)
• Repeated measures designs (Ch. 15)
• Crossover designs (Ch. 16)
Other techniques and terminology used in DOE
Control group

• A control group is a group of subjects left untreated for

the treatment of interest but otherwise experiencing the
same conditions as the treated subjects
• Example: One group of patients is given a placebo
• Why do we need to consider placebo?
• Because patients treated with placebos, including sugar
pills, may report improvement
• Example: More than 30% of patients with chronic back pain
report improvement even when treated only with a placebo.

• Blinding is the concealment of information from the

participants and/or researchers about which subjects are
receiving which treatments
• Single blind: subjects are unaware of their treatment assignment
• Double blind: subjects and researchers are unaware of
treatment assignments
• Example: testing heart medication
• Two treatments: drug and placebo
• Single blind: Patients don’t know which group they are in,
but doctors do.
• Double blind: Neither patients nor doctors administering the
drug know which group the patients are in. Only a third
party knows the identification before the study is done.

• In a balanced experimental design, all treatments have

equal sample size.
• Balance maximizes power.
• Balance makes tests more robust if assumptions are violated.
Definition from Montgomery

• A test or series of tests in which purposeful changes are

made to the input variables of a process or system so that
we may observe and identify the reasons for changes that
may be observed in the output response.

• Randomization is insurance against two sorts of bias:

• a nonrandomized experiment may give biased parameter estimates
• and a biased estimate of the error variance