

For many true experimental designs, pretest-posttest designs are the preferred
method to compare participant groups and measure the degree of change occurring
as a result of treatments or interventions.

Pretest-posttest designs grew from the simpler posttest only designs, and address
some of the issues arising with assignment bias and the allocation of participants to
groups.

One example is education, where researchers want to monitor the effect of a new
teaching method upon groups of children. Other areas include evaluating the effects
of counseling, testing medical treatments, and measuring psychological constructs.
The only stipulation is that the subjects must be randomly assigned to groups, in a
true experimental design, to properly isolate and nullify any nuisance or confounding
variables.


Pretest-posttest designs are an expansion of the posttest only design with
nonequivalent groups, one of the simplest methods of testing the effectiveness of an
intervention.

In this design, which uses two groups, one group is given the treatment and the
results are gathered at the end. The control group receives no treatment, over the
same period of time, but undergoes exactly the same tests.

Statistical analysis can then determine if the intervention had a significant effect.
One common example of this is in medicine; one group is given a medicine, whereas
the control group is given none, and this allows the researchers to determine if the
drug really works. This type of design, whilst commonly using two groups, can be
slightly more complex. For example, if different dosages of a medicine are tested,
the design can be based around multiple groups.

Whilst this posttest only design does find many uses, it is limited in scope and
contains many threats to validity. It is very poor at guarding against assignment
bias, because the researcher knows nothing about the individual differences within
the control group and how they may have affected the outcome. Even with
randomization of the initial groups, this failure to address assignment bias means
that the statistical power is weak.

The results of such a study will always be limited in scope and, resources permitting,
most researchers use a more robust design, of which pretest-posttest designs are
one. The posttest only design with non-equivalent groups is usually reserved for
experiments performed after the fact, such as a medical researcher wishing to
observe the effect of a medicine that has already been administered.


This is, by far, the simplest and most common of the pretest-posttest designs, and is
a useful way of ensuring that an experiment has a strong level of internal validity.
The principle behind this design is relatively simple, and involves randomly assigning
subjects between two groups, a test group and a control. Both groups are pretested,
and both are posttested, the ultimate difference being that one group was
administered the treatment.
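
The random assignment step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `assign_groups`, the participant IDs and the fixed seed are assumptions made for the example, not part of any particular study protocol.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment group and a control group."""
    rng = random.Random(seed)        # a seed makes the allocation reproducible
    shuffled = list(participants)    # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = assign_groups(participants, seed=42)
print(len(groups["treatment"]), len(groups["control"]))
```

Both groups would then be pretested, only the treatment group would receive the intervention, and both would be posttested.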

This test allows a number of distinct analyses, giving researchers the tools to filter
out experimental noise and confounding variables. The internal validity of this design
is strong, because the pretest lets the researcher check that the groups are
equivalent. The various analyses that can be performed upon a two-group control
group pretest-posttest design are (Fig 1):
1. This design allows researchers to compare the final posttest results between
the two groups, giving them an idea of the overall effectiveness of the
intervention or treatment. (C)
2. The researcher can see how both groups changed from pretest to posttest,
whether one, both or neither improved over time. If the control group also
showed a significant improvement, then the researcher must attempt to
uncover the reasons behind this. (A and A1)
3. The researchers can compare the scores in the two pretest groups, to ensure
that the randomization process was effective. (B)

These checks evaluate the effectiveness of the randomization process and also
determine whether the group given the treatment showed a significant difference.
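
As a rough illustration of these comparisons, the sketch below computes the mean differences behind each one. The scores are invented, and the mapping of A to the treatment group's change and A1 to the control group's change is an assumption, since the figure is not reproduced here. Mean differences are only descriptive; a real analysis would follow them with an appropriate significance test.

```python
from statistics import mean


def compare_two_group_design(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Return the mean differences behind comparisons A, A1, B and C."""
    return {
        # A (assumed label): change within the treatment group
        "A_treatment_change": mean(treat_post) - mean(treat_pre),
        # A1 (assumed label): change within the control group
        "A1_control_change": mean(ctrl_post) - mean(ctrl_pre),
        # B: pretest difference, a check on the randomization
        "B_pretest_difference": mean(treat_pre) - mean(ctrl_pre),
        # C: posttest difference, the overall effect of the treatment
        "C_posttest_difference": mean(treat_post) - mean(ctrl_post),
    }


# Hypothetical test scores, for illustration only.
result = compare_two_group_design(
    treat_pre=[52, 48, 55, 50, 45],
    treat_post=[61, 58, 66, 60, 55],
    ctrl_pre=[51, 49, 53, 50, 47],
    ctrl_post=[53, 50, 56, 52, 49],
)
for name, value in result.items():
    print(name, value)
```

With these made-up scores the pretest difference is zero, so any posttest difference cannot be blamed on unequal starting points.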


The main problem with this design is that it improves internal validity but sacrifices
external validity to do so. There is no way of judging whether the process of
pretesting actually influenced the results because there is no baseline measurement
against groups that were never pretested. For example, children given an
educational pretest may be inspired to try a little harder in their lessons, and both
groups would outperform children not given a pretest, so it becomes difficult to
generalize the results to encompass all children.

The other major problem, which afflicts many sociological and educational research
programs, is that it is impossible and unethical to isolate all of the participants
completely. If two groups of children attend the same school, it is reasonable to
assume that they mix outside of lessons and share ideas, potentially contaminating
the results. On the other hand, if the children are drawn from different schools to
prevent this, the chance of selection bias arises, because randomization is not
possible.

The two-group control group design is an exceptionally useful research method, as
long as its limitations are fully understood. For extensive and particularly important
research, many researchers use the Solomon four group method, a design that is
more costly, but avoids many weaknesses of the simple pretest-posttest designs.

The Solomon four group design is a way of avoiding some of the difficulties
associated with the pretest-posttest design.

This design contains two extra control groups, which serve to reduce the influence of
confounding variables and allow the researcher to test whether the pretest itself has
an effect on the subjects.

Whilst much more complex to set up and analyze, this design type combats many of
the internal validity issues that can plague research. It gives the researcher much
greater control over the variables and makes it possible to check that the
pretest did not influence the results.

The Solomon four group test combines a standard pretest-posttest two-group design
with the posttest only control design. The various combinations of tested and
untested groups with treatment and control groups allow the researcher to ensure
that confounding variables and extraneous factors have not influenced the results.


In the figure, A, A1, B and C are exactly the same as in the standard two group
design. The first two groups of the Solomon four group design are designed and
interpreted in exactly the same way as in the pretest-posttest design, and provide
the same checks upon randomization.

• The comparison between the posttest results of Groups C and D, marked by
line ‘D’, allows the researcher to determine if the actual act of pretesting
influenced the results. If the difference between the posttest results of Groups
C and D differs from the difference between Groups A and B, then the researcher
can assume that the pretest has had some effect upon the results.
• The comparison between the Group B pretest and the Group D posttest allows
the researcher to establish if any external factors have caused a temporal
distortion. In other words, it shows if anything else could have caused the
results shown and is a check upon causality.
• The comparison between the Group A posttest and the Group C posttest allows
the researcher to determine the effect that the pretest has had upon the
treatment. If the posttest results for these two groups differ, then the pretest
has had some effect upon the treatment and the experiment is flawed.
• The comparison between the Group B posttest and the Group D posttest
shows whether the pretest itself has affected behavior, independently of the
treatment. If the results are significantly different, then the act of pretesting
has influenced the overall results and is in need of refinement.
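
The comparisons listed above can be sketched as simple differences of group means. The sketch assumes the layout implied by the text: Groups A and B are pretested, Groups C and D are not, and Groups A and C receive the treatment. The function name and all scores are hypothetical, and a real study would test each difference for significance rather than read the raw values.

```python
from statistics import mean


def solomon_comparisons(a_pre, a_post, b_pre, b_post, c_post, d_post):
    """Mean differences for the Solomon four group checks (descriptive only)."""
    effect_pretested = mean(a_post) - mean(b_post)    # treatment effect with pretest
    effect_unpretested = mean(c_post) - mean(d_post)  # treatment effect without pretest
    return {
        "treatment_effect_pretested": effect_pretested,
        "treatment_effect_unpretested": effect_unpretested,
        # Non-zero: the pretest changed how well the treatment appears to work
        "pretest_by_treatment": effect_pretested - effect_unpretested,
        # Group B pretest vs Group D posttest: change over time with no
        # pretest-posttest pairing and no treatment
        "temporal_drift": mean(d_post) - mean(b_pre),
        # Group A vs Group C posttest: effect of the pretest upon the treatment
        "pretest_effect_with_treatment": mean(a_post) - mean(c_post),
        # Group B vs Group D posttest: effect of the pretest alone
        "pretest_effect_without_treatment": mean(b_post) - mean(d_post),
    }


# Hypothetical scores, for illustration only.
checks = solomon_comparisons(
    a_pre=[48, 50, 52], a_post=[60, 62, 64],
    b_pre=[49, 50, 51], b_post=[52, 53, 54],
    c_post=[58, 60, 62], d_post=[50, 51, 52],
)
for name, value in checks.items():
    print(name, value)
```

In this invented data set the pretest adds two points to both pretested groups, yet the treatment effect is the same with and without it, so the pretest shifts the scores without distorting the apparent effect of the treatment.
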
The Solomon four group design is one of the benchmarks for sociological and
educational research, and combats most of the internal and external validity issues
apparent in lesser designs. Despite its statistical power and easily generalized
results, this design does suffer from one major drawback that prevents it from
becoming a common method of research: its complexity.

A researcher using a Solomon four group design must have the resources and time
to use four research groups, which is not always possible in tightly funded research
departments. Most schools and organizations are not going to allow researchers to
assign four groups randomly because it will disrupt their normal practice. Thus, a
non-random assignment of groups is often unavoidable and this undermines the
strength of the design.

Secondly, the statistics involved are extremely complex, even in the age of computers
and statistical programs. Unless the research is critical or backed by a large budget
and an extensive team of researchers, most experiments use one of the simpler
pretest-posttest research designs. As long as the researcher is fully aware of the
issues with external validity and generalization, these designs are sufficiently
robust and a Solomon four group design is not needed.


A factorial design is often used by scientists wishing to understand the effect of two
or more independent variables upon a single dependent variable.

Traditional research methods generally study the effect of one variable at a time,
because this is statistically easier to analyze. However, in many cases, two factors
may be interdependent, and it is impractical or misleading to attempt to analyze
them in the traditional way.

Social researchers often use factorial designs to assess the effects of educational
methods, whilst taking into account the influence of socio-economic factors and

Agricultural science, with a need for field-testing, often uses factorial designs to test
the effect of variables on crops. In such large-scale studies, it is difficult and
impractical to isolate and test each variable individually.

Factorial experiments allow subtle manipulations of a larger number of
interdependent variables. Whilst the method has limitations, it is a useful method for
streamlining research and letting powerful statistical methods highlight any
correlations.

Imagine an aquaculture research group attempting to test the effects of food
additives upon the growth rate of trout.

A traditional experiment would involve randomly selecting different tanks of fish and
feeding them varying levels of the additive contained within the feed, for example
none or 10%.

However, as any fish farmer knows, the density of stocking is also crucial to fish
growth; if there are not enough fish in a tank, then the wasted capacity costs
money. If the density is too high, then the fish grow at a slower rate.

Rather than the traditional experiment, the researchers could use a factorial design
and co-ordinate the additive trial with different stocking densities, perhaps choosing
four density levels. With two additive levels, the factorial experiment then needs
4 x 2, or eight treatments.

The traditional rules of the scientific method are still in force, so every treatment
must be replicated; here, assume that each one is conducted in triplicate.

This means 24 separate treatment tanks. Of course, the researchers could also test,
for example, 4 levels of concentration for the additive, and this would give 4 x 4 or
16 treatments, meaning 48 tanks in total.
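
The arithmetic above can be checked by enumerating the treatments directly. The following is a minimal sketch: the level labels are invented, since the example fixes only how many levels each factor has, and `factorial_treatments` is a hypothetical helper name.

```python
from itertools import product


def factorial_treatments(factors):
    """Return every combination of factor levels, one dict per treatment."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


# Level labels are hypothetical; only the counts (4 densities, 2 additive
# levels) come from the fish farm example.
factors = {
    "stocking_density": ["low", "medium", "high", "very high"],
    "additive": ["none", "10%"],
}
treatments = factorial_treatments(factors)

replicates = 3  # each treatment run in triplicate
tanks = [(t, r) for t in treatments for r in range(1, replicates + 1)]

print(len(treatments), "treatments,", len(tanks), "tanks")
```

Adding levels to the `additive` list, or adding a third entry to `factors`, shows how quickly the number of tanks grows.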

Each factor is an independent variable, whilst the level is the subdivision of a factor.
Assuming that we are designing an experiment with two factors, a 2 x 2 would mean
two levels for each, whereas a 2 x 4 would mean two subdivisions for one factor and
four for the other. It is possible to test more than two factors, but this becomes
unwieldy very quickly.

In the fish farm example, imagine adding another factor, temperature, with four
levels into the mix. It would then be 4 x 4 x 4, or 64 runs. In triplicate, this would be
192 tanks, a huge undertaking.

There are a few other methods, such as fractional factorial designs, to reduce this,
but they run only a subset of the combinations and so cannot separate every effect
from every other. This lies firmly in the realm of advanced statistics and is a long,
complicated and arduous undertaking.


Factorial designs are extremely useful to psychologists and field scientists as a
preliminary study, allowing them to judge whether there is a link between variables,
whilst reducing the possibility of experimental error and confounding variables.

The factorial design, as well as simplifying the process and making research cheaper,
allows many levels of analysis. As well as highlighting the relationships between
variables, it also allows the effects of manipulating a single variable to be isolated
and analyzed singly.
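
For the simplest case, a 2 x 2 design, the idea of isolating each variable and then examining how the variables relate can be sketched from the four cell means. The numbers below are invented growth figures, and the interaction is expressed as the difference between the two simple effects of the first factor, which is one common convention (some texts halve it).

```python
def two_by_two_effects(cell_means):
    """cell_means[(a, b)] is the mean response at level a of the first factor
    and level b of the second factor, with levels coded 0 and 1."""
    m = cell_means
    # Main effect of the first factor: average change when it goes from 0 to 1
    main_a = ((m[1, 0] + m[1, 1]) - (m[0, 0] + m[0, 1])) / 2
    # Main effect of the second factor, averaged over the first
    main_b = ((m[0, 1] + m[1, 1]) - (m[0, 0] + m[1, 0])) / 2
    # Interaction: does the effect of the first factor depend on the second?
    interaction = (m[1, 1] - m[0, 1]) - (m[1, 0] - m[0, 0])
    return {"main_A": main_a, "main_B": main_b, "interaction": interaction}


# Hypothetical mean growth: first index = additive (0 none, 1 added),
# second index = stocking density (0 low, 1 high).
effects = two_by_two_effects({(0, 0): 10, (1, 0): 14, (0, 1): 8, (1, 1): 9})
print(effects)
```

In this made-up example the additive helps on average, high density hurts, and the negative interaction shows that the additive helps much less in crowded tanks, something a one-variable-at-a-time experiment would miss.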

The main disadvantage is the difficulty of experimenting with more than two factors,
or many levels. A factorial design has to be planned meticulously, as an error in one
of the levels, or in the general operationalization, will jeopardize a great amount of
work.

Other than these slight drawbacks, a factorial design is a mainstay of many scientific
disciplines, delivering great results in the field.