
Lecture 1: What are Comparative Experiments?

EXAMPLE 1. The effectiveness of a new brand of toothpaste in reducing tooth
decay is tested using volunteer university students. Each student is given free
toothpaste and is checked regularly by a dentist over a period of a year.

EXAMPLE 2. A food company assesses the nutritional adequacy of a new instant
breakfast product by feeding it to newly weaned male white rats and measuring their
weight gain over a 28–day period. A control group of rats will be fed a standard
diet for comparison. There will be 15 rats in each group. The researcher chooses the
rats for the treatment group by going to the pen and catching 15 rats to allocate to
this group.

Written by Debbie Street, modified by Steve Bush. 35356: Lecture 1

Statistical studies can be broadly classified into two different types, observational
studies and comparative experiments.

A comparative experiment is “observation under conditions deliberately
arranged by the observer” (Bowers [1938]). More specifically, in a comparative
experiment, the experimenter chooses two or more conditions (treatments),
assigns experimental units to them, and takes observations. For example, we
may assign students to two different classes to compare the effectiveness of
the teaching methods in those two classes.

In an observational study, the experimenter has no control over which
conditions apply to a particular experimental unit. The unit is either
self-selected into a particular group or the groups simply describe the
circumstances for a particular unit. For instance, we are only able to observe a
person’s gender, as we do not have the ability to assign gender to a particular
person. This means that the experimental units may differ prior to
receiving the treatment, so we must think about how to allow for this in analysing
data from an observational study.

Some Definitions:
Treatments: the sets of circumstances created for the experiment in response to research
hypotheses. They are the focus of the investigation (Kuehl [2000]).

Experimental Units: the smallest unit to which a treatment can be applied (Bailey
[2008]).

Observational Units: the smallest unit on which a response will be measured (Bailey
[2008]).

Experimental Error: the variation observed amongst identically treated units. Sources
include the natural variation among units, measurement variability, treatments
not being the same from unit to unit, and interactions between units and
treatments.

Experiments are useful in a number of disciplines. In medicine, researchers use
experiments to determine the efficacy of particular drugs or treatment regimes.
In industry, experiments are used to compare the effects of different manufacturing
methods. In marketing, experiments can be used to determine the strategy
that will best attract customers. In agriculture, experiments are used to determine
the best conditions to maximise the yield of particular crops.

Replication, Randomisation and Blocking
There are three very important concepts that help ensure that the experiment
is sound, and the results are robust. These are replication, randomisation, and
blocking (often called the three ‘R’s - perhaps blocking has a silent ‘R’, or there
may be pirates).

Replication
Replication is the independent application of each treatment to more than one
experimental unit. Thus it is repetition of the experiment. Why replicate?

• To demonstrate reproducibility of the results under experimental conditions.

• To provide some insurance against aberrant results.

• To provide an estimate of experimental error.

• To increase the precision of the estimates of the treatment means.

It is important to distinguish between observational units and experimental units.
If we want to compare two methods of teaching students to read and we have
three classes taught using each of the methods, then the classes are the
experimental units (since these are the units to which the treatments were applied)
and the students are the observational units (since these are the units on which
we can observe the responses). So the variance calculated between the students
within a class is a measure of the observational error and not a measure of the
experimental error.
We will look at this question in more detail next week.
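
The distinction can be sketched numerically. The scores below are made up purely for illustration (they are not from the lecture), and the function names are hypothetical. The sketch assumes equal class sizes, so simple averaging of the variances is a reasonable pooling. The variation that matters for comparing the methods is the variation between class means within a method, not the variation between students within a class.

```python
from statistics import mean, variance

# Hypothetical reading scores. Each inner list is one class (experimental unit);
# each number is one student (observational unit).
scores = {
    "method_A": [[62, 65, 70, 63], [71, 74, 69, 70], [58, 61, 60, 65]],
    "method_B": [[75, 78, 72, 79], [68, 66, 70, 72], [80, 77, 82, 81]],
}


def class_means(classes):
    """Reduce each class to one value: one response per experimental unit."""
    return [mean(c) for c in classes]


def experimental_error_variance(scores):
    """Average, over methods, of the variance between class means."""
    per_method = [variance(class_means(classes)) for classes in scores.values()]
    return mean(per_method)


def observational_error_variance(scores):
    """Average, over classes, of the variance between students in a class."""
    per_class = [variance(c) for classes in scores.values() for c in classes]
    return mean(per_class)


print("between classes (experimental error):", experimental_error_variance(scores))
print("between students (observational error):", observational_error_variance(scores))
```

Only three values per method contribute to the experimental error estimate here, however many students are measured in each class.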

Randomisation
Randomisation ‘provides justification for the statistical inference methods of
estimation and test of hypotheses’. It is randomisation, not replication, which guarantees
the validity of the estimates of error variance and treatment effects. Random
assignment of treatments to units simulates independence of the experimental units.
Randomisation tests make no assumptions about the form of the probability
distribution. Since they are time-consuming to compute in even moderately sized
experiments, most statisticians use normal theory tests. Such tests have been shown to be
a good approximation to randomisation tests if treatments are randomly allocated
to units.
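
To make the idea of a randomisation test concrete, here is a minimal sketch of an exact two-sided test for a difference in means between two groups. The weight gains are hypothetical and the function name is illustrative; this is one simple form of the test, not necessarily the one used later in the subject. It enumerates every way the pooled observations could have been split into the two groups and counts how often the difference is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean


def randomisation_test(treated, control):
    """Exact two-sided randomisation test for a difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)

    extreme = 0
    n_allocations = 0
    # Every subset of size n_treated is an equally likely treatment group.
    for idx in combinations(range(len(pooled)), n_treated):
        t_sum = sum(pooled[i] for i in idx)
        diff = abs(t_sum / n_treated - (total - t_sum) / n_control)
        n_allocations += 1
        if diff >= observed - 1e-12:  # small tolerance for rounding
            extreme += 1
    return extreme / n_allocations


# Hypothetical weight gains (grams) for a small two-group experiment.
treated = [52, 58, 61, 64]
control = [41, 45, 49, 50]
p_value = randomisation_test(treated, control)
print(p_value)
```

With four units per group there are only 70 possible allocations. With 15 units per group there are 155,117,520, which is why full enumeration quickly becomes expensive.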

EXAMPLE 3. Consider the “instant breakfast” product experiment. In this
experiment, we need to assign the treatment or the control to each of the thirty
rats. The simplest way to do this is:

1. Number the rats from 1 to 30.

2. Obtain a random permutation of these numbers.

3. The first 15 rats on this list are allocated to the treatment group, and the
remaining 15 are allocated to the control group.

To do this in SAS you need to use

proc plan;
factors treatment=30;
run;

to generate a random permutation of the numbers 1 to 30. Then allocate the first treatment
to the rats in the first 15 positions of the permutation and the second treatment
to the rats in the remaining 15 positions.
Sample output (this run used 24 levels rather than 30):

Tuesday, February 11, 2014 11:08:40 PM

The PLAN Procedure

Factor      Select  Levels  Order
treatment       24      24  Random

treatment
12 15 16 17 6 19 11 23 21 9 14 2 20 10 3 13 18 22 24 1 7 4 8 5
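
The same three steps can be sketched outside SAS. The following Python version is an illustrative alternative, not part of the original notes; the function name `allocate` and the seed value are arbitrary choices made here. Fixing the seed makes the allocation reproducible so that it can be recorded with the experiment.

```python
import random


def allocate(unit_ids, n_treatment, seed=None):
    """Randomly permute the units; the first n_treatment go to the treatment group."""
    rng = random.Random(seed)
    order = list(unit_ids)
    rng.shuffle(order)  # step 2: a random permutation of the unit numbers
    return {
        "treatment": sorted(order[:n_treatment]),  # step 3: first 15 on the list
        "control": sorted(order[n_treatment:]),    # remaining 15
    }


# Step 1: the rats are numbered 1 to 30.
groups = allocate(range(1, 31), n_treatment=15, seed=35356)
print("treatment:", groups["treatment"])
print("control:  ", groups["control"])
```

Unlike catching the first 15 rats in the pen (Example 2), every rat has the same chance of ending up in either group.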

Blocking
Blocking is used to control experimental error by providing local control of the
environment. Units in the same block are supposed to be homogeneous. To choose
blocks use characteristics of the units, not of the treatments. Common blocking
criteria are:

• proximity (plots of land in a field, animals from the same litter, patients in
the same hospital),

• physical characteristics (such as age, sex, weight, height),

• time (day of the week, month of the year),

• management of tasks in the experiment (all plots in the same block must be
harvested on the same day, all components in the same block must go in the
oven at the same time).

Blocking can also be considered to be a restriction on the randomisation process.

EXAMPLE 4. Consider the “instant breakfast” product experiment. Now suppose
that the rats come from two different litters, and the rats from the first litter
are more likely to be large.
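
One way to use litter as a block is to randomise separately within each litter, so that each litter contributes equally to both groups. The sketch below is illustrative only: the notes do not say how many rats come from each litter, so the split of 16 and 14 is an assumption, as are the function name and the seed.

```python
import random


def blocked_allocation(blocks, seed=None):
    """Within each block, randomly assign half the units to each group."""
    rng = random.Random(seed)
    allocation = {}
    for units in blocks.values():
        order = list(units)
        rng.shuffle(order)  # randomisation is restricted to this block
        half = len(order) // 2
        for unit in order[:half]:
            allocation[unit] = "treatment"
        for unit in order[half:]:
            allocation[unit] = "control"
    return allocation


# Assumed litter sizes (hypothetical): rats 1-16 from litter 1, rats 17-30 from litter 2.
litters = {"litter_1": range(1, 17), "litter_2": range(17, 31)}
allocation = blocked_allocation(litters, seed=1)

for name, units in litters.items():
    treated = [u for u in units if allocation[u] == "treatment"]
    print(name, "treatment rats:", sorted(treated))
```

Because the larger rats of the first litter are split evenly between the groups, a difference between litters cannot masquerade as a treatment effect.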

What is an Experimental Design?
An experimental design is the arrangement of experimental units into treatment
groups and blocks, where necessary.
We can compare two designs using their relative efficiency. The
higher the relative efficiency, the fewer the number of replicates required. We use
$\sigma_r^2 = \sigma^2 / r$, the variance of a treatment mean estimated from $r$
replicates, as a measure of the efficiency of the estimated treatment mean in an
experiment. This can be improved (made smaller) by reducing $\sigma^2$ and/or by
increasing $r$. We use strategies like blocking to reduce $\sigma^2$.
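
As a small worked illustration, the sketch below takes one common definition of relative efficiency, the ratio of the two error variances, and uses it to find how many replicates the less efficient design would need. The variances (4 for a blocked design, 10 for an unblocked one) and the function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def variance_of_mean(sigma2, r):
    """Variance of a treatment mean estimated from r replicates: sigma^2 / r."""
    return sigma2 / r


def relative_efficiency(sigma2_a, sigma2_b):
    """Efficiency of design A relative to design B, as a ratio of error variances."""
    return sigma2_b / sigma2_a


def replicates_to_match(sigma2_a, sigma2_b, r_a):
    """Replicates design B needs to match design A's variance of a mean at r_a replicates."""
    return math.ceil(r_a * sigma2_b / sigma2_a)


# Hypothetical error variances: blocking reduces sigma^2 from 10 to 4.
sigma2_blocked, sigma2_unblocked = 4.0, 10.0
r_blocked = 6

print("relative efficiency:", relative_efficiency(sigma2_blocked, sigma2_unblocked))
print("replicates needed without blocking:",
      replicates_to_match(sigma2_blocked, sigma2_unblocked, r_blocked))
```

Under these assumed values, 6 replicates with blocking give the same variance of a treatment mean as 15 replicates without it.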

More Definitions:
Factor: a particular variable, such as temperature, pressure or screen colour.

Levels of a factor: the specific values a factor can take.

Types of factors: qualitative (colour) or quantitative (degrees).

Factorial experiment: one which investigates the effects of several factors
simultaneously on the characteristic(s) of interest.

OFAT (One-Factor-At-a-Time): best avoided, since it involves optimising on one
factor and then fixing that factor at its best level and starting to experiment
on the next factor. Interactions are completely missed in this sort of approach.
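
A toy example shows how OFAT can miss an interaction. The response values below are invented for illustration: each factor alone lowers the response, but the two together raise it. The function names are hypothetical, and the OFAT routine is a simplified sketch of the procedure described above.

```python
from itertools import product

# Two factors, each at two levels (0 = low, 1 = high).
levels = {"A": [0, 1], "B": [0, 1]}

# Hypothetical mean responses with a strong A x B interaction.
response = {(0, 0): 10, (1, 0): 8, (0, 1): 8, (1, 1): 15}


def full_factorial(levels):
    """All combinations of all factor levels."""
    return list(product(*levels.values()))


def ofat_best(levels, response, start):
    """Optimise one factor at a time, fixing each at its best level in turn."""
    current = list(start)
    for position, name in enumerate(levels):
        candidates = []
        for level in levels[name]:
            trial = current.copy()
            trial[position] = level
            candidates.append(tuple(trial))
        current = list(max(candidates, key=response.get))
    return tuple(current)


factorial_best = max(full_factorial(levels), key=response.get)
ofat_result = ofat_best(levels, response, start=(0, 0))

print("factorial finds:", factorial_best, "with response", response[factorial_best])
print("OFAT finds:     ", ofat_result, "with response", response[ofat_result])
```

Starting from both factors low, OFAT never tries the combination with both factors high, so it never sees the best response.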

Design Checklist
Think about why you are carrying out the experiment and what questions you hope
to be able to answer as a result of the experiment.

• What are the specific objectives of the experiment?

• Which factors are likely to be influential?

• Which of these factors will you include in the experiment and allow to take on
different values? What values can they take?

• Which of these factors will you hold constant during the experiment? At what
level for each one?

• What characteristics will you measure?

• Will you measure the experimental units or smaller observational units?

• How will you do the measurements?

• What limits exist in terms of units? time? materials?

Commonly Used Designs
We will be looking at each of these designs in more detail as the semester progresses.

• Completely randomised designs (CRD): Used when all the experimental
units are believed to be exactly the same. The simplest and best design if you
can get enough homogeneous units.

• Block designs: Use attributes of the experimental units to group them into
sets of homogeneous units. Treatments are applied within sets of homogeneous
units.

– Complete blocks (RCBD): One blocking factor (such as litter, day of
the week, distance from the airport), and each block has as many units
as treatments.
– Incomplete blocks: One blocking factor, but each block has fewer units than
treatments. Which treatments should be in which blocks is important.
Consult before using.
– Row–column design: A design with two blocking factors.
– Latin squares: Two blocking factors and each block has as many units
as treatments. A special case of a Row–Column design.
– Graeco-Latin squares: A design with three blocking factors, each with as
many units as treatments.

• Factorial designs: This term describes the treatment structure. Factorial
designs can be used in conjunction with any block design.

– Complete Factorial Design: all possible combinations of all factor
levels are used.
– Incomplete Factorial Design (Fractional Factorial Design): some
combinations of factor levels are missing.
– Response Surface Design: a design where all the factors are quantitative.
Suitably chosen response surface designs can be used to fit models
with quadratic or cubic terms in the factors.
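
As an illustration of one of the block designs above, a Latin square layout can be sketched by starting from a cyclic square and then shuffling its rows, columns and treatment labels. This is a simple construction assumed here for illustration; it does not sample uniformly from all possible Latin squares, and the function name and seed are arbitrary.

```python
import random


def random_latin_square(treatments, seed=None):
    """Build a cyclic Latin square, then randomise rows, columns and labels."""
    rng = random.Random(seed)
    t = len(treatments)

    # Cyclic square: entry (row, col) is (row + col) mod t.
    square = [[(row + col) % t for col in range(t)] for row in range(t)]

    rng.shuffle(square)  # permute the rows (first blocking factor)
    col_order = list(range(t))
    rng.shuffle(col_order)  # permute the columns (second blocking factor)
    square = [[row[c] for c in col_order] for row in square]

    labels = list(treatments)
    rng.shuffle(labels)  # randomly assign treatment names to the symbols
    return [[labels[i] for i in row] for row in square]


square = random_latin_square(["A", "B", "C", "D"], seed=2014)
for row in square:
    print(" ".join(row))
```

Each treatment appears exactly once in every row and every column, which is what allows both blocking factors to be accounted for with as many units per block as treatments.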

EXAMPLE 5. Two chemicals, A and B, can affect the fungal growth on plant
leaves. Suppose that each chemical is applied to a leaf at one of two levels. The age
of the leaf affects its vulnerability to fungal infection. Leaves from a pair are more
similar than leaves from the same plant that are not the two leaves of a pair. Leaves
on different plants may differ substantially.

How would you allocate levels of A and B to leaves in each of the following scenarios?

• You have 8 plants and 4 leaves from each plant. Four different leaf ages are
represented on each plant (same age across plants).

• You have 4 plants and you use 8 leaves (4 old and 4 young) from each plant.

• You have 4 plants and from each plant you choose 4 pairs of leaves (2 old and
2 young).

EXAMPLE 6. (from Kuehl) Three treatments to make material “wrinkle-free”
are to be compared. These are PCA (1-2-3 propane tricarboxylic acid), BTCA (butane
tetracarboxylic acid) and CA (citric acid). Four shirts will be used for each of the
treatments. The treatments are applied to the shirts, which are then subjected to
simulated wear and washing in a simulation machine. The treatments do not
contaminate each other if they are placed in the same simulation machine during the
test. The machine can hold between one and four shirts in a simulation run. At the
end of each run the tear and breaking strength of the fabric is determined and the
“wrinkle-freeness” is assessed.

The comparisons between the treatments can be affected by:

• natural variation from one shirt to another

• measurement errors

• variation in application of the treatment

• variation in the runs of the simulation machine.

Three possible experiments are described below. What are the advantages and
disadvantages of each of the experiments?

Expt 1 The shirts are divided randomly into three groups of four shirts. Each group
receives a treatment as one batch and then each batch is processed in one run
of the simulation machine. There are three runs of the simulation machine.

Expt 2 The shirts are divided randomly into three groups of four shirts. The treat-
ments are applied independently to single shirts. The four shirts receiving
each treatment are processed in one run of the simulation machine. There are
three runs of the simulation machine.

Expt 3 The shirts are divided randomly into three groups of four shirts. The
treatments are applied independently to single shirts. The shirts are grouped into
four sets of three shirts, one from each treatment, and each of these sets
is placed in one run of the simulation machine. There are four runs of the
simulation machine.

Other Issues to be Aware of
Ethical Considerations
What happens to the validity of studies if participants may withdraw consent after
they have been debriefed at the end of a study? The adopted code for psychology
experiments conducted in Canada may be viewed at

http://www.cpa.ca/ethics2000.html

It states “III.29 Give a research participant the option of removing his or her
data...and if the removal of the data will not compromise the validity of the re-
search design and hence diminish the ethical value of the participation of the other
research participants.”

Inadequate literature reviews
Aprotinin is a drug that reduces the need for blood transfusion in heart surgery. By 1992
it had been tested in over 2000 patients in 12 randomised clinical trials. Yet at least
50 further studies involving 5000 patients were carried out. Why?

The papers that reported the later trials typically referred to just 4 of the earlier
studies and the same 4 at that. This suggests that an inadequate literature review
had been carried out. More details in Fergusson (Clinical Trials, 2005).

Some Quotes about Experiments


“. . . even the most sophisticated statistical analysis cannot salvage a badly designed
experiment.”
Box, Hunter and Hunter (1978)
“Most real-life statistical problems have one or more non-standard features. There
are no routine statistical questions; only questionable statistical routines.”
Sir David Cox
“No aphorism is more frequently repeated in connection with field trials, than that
we must ask Nature few questions, or, ideally, one question, at a time. The writer is
convinced that this view is wholly mistaken. Nature, he suggests, will best respond
to a logical and carefully thought out questionnaire”
Sir Ronald Fisher, 1926

Some Good Books
G. Box, W. Hunter and J. S. Hunter, Statistics for Experimenters, Wiley, 1978

D. Cox, Planning of Experiments, Wiley, 1958

R. Kuehl, Design of Experiments: Statistical Principles of Research Design and
Analysis, Duxbury, 2000

R. Mead, The Design of Experiments, Cambridge, 1988

D. C. Montgomery, Design and Analysis of Experiments, Wiley, 2009

A. Dean and D. Voss, Design and Analysis of Experiments, Springer, 1999

R. Bailey, Design of comparative experiments, Cambridge, 2008.

References
R. Bailey. Design of comparative experiments. Cambridge University Press, 2008. ISBN
0521683572.

R.V. Bowers. Conceptual Integration and Social Research. American Sociological
Review, 3(3):307–319, 1938. ISSN 0003-1224.

R.O. Kuehl. Design of experiments: statistical principles of research design and
analysis. Duxbury/Thomson Learning, 2000. ISBN 0534368344.

