
Y520 Spring 2000

Experimental Method
The best method, indeed the only fully compelling method, of establishing causation is to conduct a carefully
designed experiment in which the effects of possible lurking variables are controlled. To experiment means to
actively change x and to observe the response in y (p. 202).

Moore, D., & McCabe, D. (1993). Introduction to the practice of statistics. New York: Freeman.

The experimental method is the only method of research that can truly test hypotheses concerning cause-and-effect
relationships. It represents the most valid approach to the solution of educational problems, both practical and
theoretical, and to the advancement of education as a science (p. 298).

Gay, L. R. (1992). Educational research (4th ed.). New York: Merrill.

Importance of Good Design: (http://www.tufts.edu/~gdallal/study.htm)

100% of all disasters are failures of design, not analysis. Ron Marks, Toronto, August 16, 1994

To propose that poor design can be corrected by subtle [statistical] analysis techniques is contrary to good scientific
thinking. Stuart Pocock (Controlled Clinical Trials, p 58) regarding the use of retrospective adjustment for trials
with historical controls.

Issues of design always trump issues of analysis. G.E. Dallal, 1999, explaining why it would be wasted effort to
focus on the analysis of data from a study under challenge whose design was fatally flawed.

Unique Features of Experiments:


1. The investigator manipulates a variable directly (the independent variable).
2. Empirical observations based on experiments provide the strongest argument for cause-effect relationships.

Additional features:
1. Problem statement → theory → constructs → operational definitions → variables → hypotheses.
2. The research question (hypothesis) is often stated as the alternative hypothesis to the null hypothesis, which
is used to interpret differences in the empirical data.
3. Random sampling of subjects from the population (ensures the sample is representative of the population).
4. Random assignment of subjects to treatment and control (comparison) groups (ensures equivalency of groups;
i.e., unknown variables that may influence the outcome are equally distributed across groups).
5. Extraneous variables are controlled by 3 & 4 and other procedures if needed.
6. After treatment, performance of subjects (dependent variable) in both groups is compared.

Ways to control extraneous variables:


1. Random assignment of subjects to groups. This is the best way to control extraneous variables in
experimental research. Provides control for subject characteristics, maturation, and statistical regression.
(A minimal sketch of the procedure appears after this list.)
2. Threats that may still exist:
a. Subject mortality (i.e., dropouts due to treatment)
b. Hawthorne effect
c. Fidelity of treatment (manipulation check)
d. Data collector bias (double blind studies)
e. Location, history
3. Additional procedures for controlling extraneous variables (use as needed)
a. Exclude certain variables.
b. Blocking.
c. Matching subjects on certain characteristics.
d. Use subject as own control.
e. Analysis of covariance.
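
A minimal sketch of random assignment in Python; the subject labels and the even split are hypothetical choices for illustration:

    import random

    def randomly_assign(subjects, seed=None):
        # Shuffle, then split in half: every subject has an equal chance
        # of landing in either group, so unknown variables that may
        # influence the outcome are (in expectation) equally distributed.
        rng = random.Random(seed)
        pool = list(subjects)
        rng.shuffle(pool)
        half = len(pool) // 2
        return pool[:half], pool[half:]   # (treatment, comparison)

    treatment, comparison = randomly_assign(
        ["S%02d" % i for i in range(1, 21)], seed=42)
    print("Treatment: ", treatment)
    print("Comparison:", comparison)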


True Experimental Designs


A. Randomized Post-test only Control Group Design

Treatment     R  X1  O
Comparison    R  X2  O

R = random assignment
X = treatment (occurs for X1 only)
O = observation (dependent variable)
This is the best of all designs for experimental research.
Random assignment controls for subject characteristics, maturation, statistical regression.
Potential threats not controlled: subject mortality, Hawthorne effect, fidelity of treatment, data collector bias, unique
features of location, history of subjects.

B. Randomized Pretest Post-test Control Group Design

Treatment     R  O1  X1  O2
Comparison    R  O1  X2  O2

R = random assignment
X = treatment (occurs for X1 only)
O1 = observation (pre-test)
O2 = observation (post-test, dependent variable)

Potential threat: effect of pre-testing.

C. Randomized Solomon Four Group Design

Treatment     R  O1  X1  O2
Comparison    R  O1  X2  O2
Treatment     R      X1  O2
Comparison    R      X2  O2

R = random assignment
X = treatment (occurs for X1 only)
O1 = observation (pre-test)
O2 = observation (post-test, dependent variable)

Random sampling, random assignment.


Best control of threats to internal validity, particularly the threat introduced by pretesting.
Requires a relatively large number of subjects.

D. Randomized Assignment with Matching


1. Randomized (Sampling & Assignment), Matched Ss, Post-test only, Control Group

Treatment     M,R  X1  O
Comparison    M,R  X2  O

M = matched subjects
R = random assignment of matched pairs
X = treatment (X1 only)
O = observation (dependent variable)

Example: An experimenter wants to test the impact of a novel instructional program in formal logic. The investigator
infers from reports in the literature that high-ability students and those with programming, mathematical, or
music backgrounds are likely to excel in formal logic regardless of type of instruction. The experimenter
randomly samples subjects, looks at subjects' SAT scores, matches subjects on the basis of SAT scores, and randomly
assigns matched pairs (one of each pair to each group). The other concomitant variables (previous
programming, mathematical, and music experience) could also be matched.
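
A sketch of this matching-and-assignment procedure in Python, with hypothetical subject IDs and SAT scores: rank subjects on the matching variable, pair adjacent subjects, then randomly assign one member of each pair to each group.

    import random

    def match_and_assign(subjects, seed=None):
        # Rank on the matching variable (here, SAT score), pair adjacent
        # subjects, and flip a coin within each pair. A subject left
        # unpaired at the end of the ranking must be dropped, one of the
        # problems of mechanical matching noted later in these notes.
        rng = random.Random(seed)
        ranked = sorted(subjects, key=lambda s: s[1], reverse=True)
        treatment, comparison = [], []
        for i in range(0, len(ranked) - 1, 2):
            pair = [ranked[i], ranked[i + 1]]
            rng.shuffle(pair)                 # coin flip within the pair
            treatment.append(pair[0])
            comparison.append(pair[1])
        return treatment, comparison

    subjects = [("S1", 1380), ("S2", 1190), ("S3", 1350),
                ("S4", 1210), ("S5", 1290), ("S6", 1300)]
    t, c = match_and_assign(subjects, seed=1)
    print("Treatment: ", t)
    print("Comparison:", c)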

2. Randomized Pretest-Post-test Control Group, Matched Ss

Treatment     O1  M,R  X1  O2
Comparison    O1  M,R  X2  O2

O1 = pre-test
M = matched subjects
R = random assignment of matched pairs
X = treatment (X1 only)
O2 = observation (post-test, dependent variable)

Subjects are matched on the basis of their pretest score and pairs of subjects are randomly assigned to groups.

3. Matching Methods
a. Mechanical matching
1). Rank-order subjects on the matching variable, take the top two, randomly assign the members of that
pair to the groups, and repeat for all pairs.
2). Problems:
Impossible to match on more than one or two variables simultaneously.
May need to eliminate some Ss when no appropriate match exists for one of the groups.
b. Statistical Matching
1). The purpose is to control for factors that cannot be randomized but nonetheless can be measured on (at
least) an interval scale (but in practice we often treat ordinal scales as if they were interval). Statistical
control is achieved by measuring one or more concomitant variables (referred to as the covariate) in
addition to the variable (variate) of primary interest (i.e., the dependent or response variable).
Statistical control can be used in experimental designs, and because no direct manipulation of subjects
or conditions is required, it can also be used in quasi-experimental and non-experimental designs.
2). Analysis of covariance is used to test the main and interaction effects of categorical variables on a
continuous dependent variable, controlling for the effects of selected other continuous variables which
covary with the dependent variable. The control variable is called the covariate.
(http://www2.chass.ncsu.edu/garson/pa765/ancova.htm) (A sketch of the analysis follows the references below.)
3). To control a covariate statistically means the same as to adjust for the covariate, to correct for the
covariate, to hold the covariate constant, or to partial out the covariate.
(http://www.psych.uiuc.edu/~mho/psy307a.html)
4). But see:
Loftin, L., & Madison, S. (1991). The extreme dangers of covariance corrections. In B. Thompson
(Ed.), Advances in educational research: Substantive findings, methodological developments
(Vol. 1, pp. 133-148). Greenwich, CT: JAI Press. (ISBN: 1-55938-316-X)
Thompson, B. (1992). Misuse of ANCOVA and related "statistical control" procedures. Reading
Psychology, 13, iii-xviii.
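
A sketch of an analysis of covariance in Python, using the statsmodels formula interface; the data, group labels, and variable names are hypothetical:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data: post-test scores adjusted for a pre-test covariate.
    df = pd.DataFrame({
        "group":    ["treatment"] * 5 + ["comparison"] * 5,
        "pretest":  [52, 61, 47, 58, 55, 50, 63, 49, 57, 54],
        "posttest": [68, 75, 60, 73, 70, 58, 69, 55, 64, 61],
    })

    # ANCOVA as a linear model: the C(group) term tests the treatment
    # effect while the pretest term holds the covariate constant.
    model = smf.ols("posttest ~ C(group) + pretest", data=df).fit()
    print(model.summary())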

Pre-Experimental Designs
A. One-Shot Case Study

X  O

X = treatment
O = observation (dependent variable)

Problems: No control group; cannot tell if treatment had any effect.


Comments from Campbell and Stanley (1963):
"As has been pointed out (e.g., Boring, 1954; Stouffer, 1949) such studies have such a total absence of
control as to be of almost no scientific value" (p. 6).

"Basic to scientific evidence (and to all knowledge-diagnostic processes including the retina of the eye) is the
process of comparison, of recording differences, or of contrast. Any appearance of absolute knowledge, or
intrinsic knowledge about singular isolated objects, is found to be illusory upon analysis. Securing scientific
evidence involves making at least one comparison" (p. 6).

"It seems well-nigh unethical ... to allow, as theses or dissertations in education, case studies of this nature
(i.e., involving a single group observed at one time only)" (p. 7).

B. One-Group Pretest-Post-test Design

O1  X  O2

O1 = pre-test
X = treatment
O2 = observation (post-test, dependent variable)

Problems: No control group. Changes between pre-test and post-test may be due not to the treatment but to:
history, maturation, instrument decay, data collector characteristics, data collector bias, testing, statistical
regression, attitude of subjects, problems with implementation, etc.

C. Static-Group Comparison Design

Treatment     X  O1
Comparison       O2

X = treatment
O1, O2 = observations (dependent variable)

Intact, existing groups are used. No random selection of subjects; no random assignment to groups. No way to ensure
equivalence of groups.

Comments from Campbell and Stanley (1963):

"Instances of this kind of research include, for example, the comparison of school systems which require the
bachelor's degree of teachers (the X) versus those which do not; the comparison of students in classes given
speed-reading training versus those not given it; the comparison of those who heard a certain TV program with
those who did not, etc." (p. 12).

"There is ... no formal means of certifying that the groups would have been equivalent had it not been for the
X.... If O1 and O2 differ, this difference could well have come through the differential recruitment of persons
making up the groups: the groups might have differed anyway, without the occurrence of X" (p. 12).

Quasi-Experimental Designs
No random sampling of subjects. Intact groups often used.
No random assignment of Ss to groups. Confidence in equivalency of groups is lower.
A. Matching-only Group Design

Treatment     M  X1  O
Control       M  X2  O

M = matched subjects
X = treatment (X1 only)
O = observation (dependent variable)

B. Matching-only Pretest-Post-test Group Design

Treatment     O1  M  X1  O2
Control       O1  M  X2  O2

O1 = pre-test
M = matched subjects
X = treatment (X1 only)
O2 = post-test (dependent variable)

Existing, intact groups.


Subjects matched on one or more variables; can't be certain if groups are equivalent on remaining unmatched
variables.
Matching is never a substitute for random sampling and random assignment to groups.

C. Single Group Time Series Design


"The essence of the time-series design is the presence of a periodic measurement process on some group or
individual and the introduction of an experimental change into this time series of measurements, the results of
which are indicated by a discontinuity in the measurements recorded in the time series" (Campbell & Stanley,
1963, p. 37).

O1 O2 O3 O4 O5  X1  O6 O7 O8 O9 O10

X1 = treatment
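
A small simulation of this design in Python; the baseline level, treatment effect, and noise are all hypothetical. The treatment effect appears as a discontinuity between the pre- and post-treatment observations:

    import random

    random.seed(7)
    baseline, shift = 50.0, 8.0   # hypothetical level and treatment effect

    # O1..O5 before the treatment, O6..O10 after it, each with noise.
    pre  = [baseline + random.gauss(0, 2) for _ in range(5)]
    post = [baseline + shift + random.gauss(0, 2) for _ in range(5)]

    print("Pre-treatment mean:  %.1f" % (sum(pre) / len(pre)))
    print("Post-treatment mean: %.1f" % (sum(post) / len(post)))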

Factorial Designs
Requires, at a minimum, two levels of variable A crossed with two levels of variable B. That is, all levels of A
occur with all levels of B.
Factorial designs enable the investigator to observe an interaction, if one exists. An interaction means that the
effect of one independent variable on the dependent variable differs across the levels of the other independent variable.
Let us suppose that three types of teachers are all, in general, effective (e.g., the spontaneous extemporizers,
the conscientious preparers, and the close supervisors of student work). Similarly, three teaching methods in
general turn out to be equally effective (e.g., group discussion, formal lecture, and tutorial). In such a case...,
teaching methods could plausibly interact strongly with types, the spontaneous extemporizer doing best with
group discussion and poorest with tutorial, and the close supervisor doing best with tutorial and poorest with
group discussion (Campbell & Stanley, 1963, p. 29).
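
A sketch of a crossed two-factor analysis in Python for a 2 x 2 slice of the teacher-type x method example; the scores are hypothetical and chosen so that teacher type and method interact:

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "teacher": ["extemporizer"] * 4 + ["supervisor"] * 4,
        "method":  ["discussion", "discussion", "tutorial", "tutorial"] * 2,
        "score":   [82, 85, 70, 72,    # extemporizers: best with discussion
                    71, 69, 84, 86],   # supervisors: best with tutorial
    })

    # Two-way ANOVA; the C(teacher):C(method) term is the interaction.
    model = smf.ols("score ~ C(teacher) * C(method)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))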

Threats to Internal Validity


Is the investigator's conclusion correct? Are the changes in the independent variable indeed responsible for the observed
variation in the dependent variable? Or might the variation in the dependent variable be attributable to other causes?
This is the question of internal validity. The following list is from Campbell and Stanley (1963) as interpreted by
Kirk (1995):
1. History. Events other than the administration of a treatment level that occur between the time the treatment
level is assigned to subjects and the time the dependent variable is measured may affect the dependent
variable.
2. Maturation. Processes not related to the administration of a treatment level that occur within subjects
simply as a function of the passage of time (growing older, stronger, larger, more experienced, and so on) may
affect the dependent variable.
3. Testing. Repeated testing of subjects may result in familiarity with the testing situation or acquisition of
information that can affect the dependent variable.

4. Instrumentation. Changes in the calibration of a measuring instrument, shifts in the criteria used by
observers and scorers, or unequal intervals in different ranges of a measuring instrument can affect the
measurement of the dependent variable.
5. Statistical regression. When the measurement of the dependent variable is not perfectly reliable, there is a
tendency for extreme scores to regress or move toward the mean. Statistical regression operates to (a) increase
the scores of subjects originally found to score low on a test, (b) decrease the scores of subjects originally
found to score high on a test, and (c) not affect the scores of subjects at the mean of the test. The amount of
statistical regression is inversely related to the reliability of the test. (A small simulation sketch appears
after this list.)
6. Selection. Differences among the dependent-variable means may reflect prior differences among the
subjects assigned to the various levels of the independent variable.
7. Mortality. The loss of subjects in the various treatment conditions may alter the distribution of subject
characteristics across the treatment groups.
8. Interactions with selection. Some of the foregoing threats to internal validity may interact with selection to
produce effects that are confounded with or indistinguishable from treatment effects. Among these are
selection-history effects and selection-maturation effects. For example, selection-maturation effects occur
when subjects with different maturation schedules are assigned to different treatment levels.
9. Ambiguity about the direction of causal influence. In some types of research, for example,
correlational studies, it may be difficult to determine whether X is responsible for the change in Y or vice
versa. This ambiguity is not present when X is known to occur before Y.
10. Diffusion or imitation of treatments. Sometimes the independent variable involves information that is
selectively presented to subjects in the various treatment levels. If the subjects in different levels can
communicate with one another, differences among the treatment levels may be compromised.
11. Compensatory rivalry by respondents receiving less desirable treatments. When subjects in some
treatment levels receive goods or services generally believed to be desirable and this becomes known to
subjects in treatment levels that do not receive those goods and services, social competition may motivate the
subjects in the latter group, the control subjects, to attempt to reverse or reduce the anticipated effects of the
desirable treatment levels. Saretsky (1972) named this the John Henry effect in honor of the steel driver who,
upon learning that his output was being compared with that of a steam drill, worked so hard that he
outperformed the drill and died of overexertion.
12. Resentful demoralization of respondents receiving less desirable treatments. If subjects learn that the
treatment level to which they have been assigned receives less desirable goods or services, they may
experience feelings of resentment and demoralization. Their response may be to perform at an abnormally low
level, thereby increasing the magnitude of the difference between their performance and that of units assigned
to the desirable treatment level.
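
The simulation sketch promised under threat 5, in Python with hypothetical numbers: each observed score is a stable true score plus independent measurement error, so subjects selected for extreme first scores move back toward the mean on an equally reliable retest.

    import random

    random.seed(3)
    N, ERR = 1000, 10.0   # hypothetical sample size and error SD

    true_scores = [random.gauss(100, 15) for _ in range(N)]
    test1 = [t + random.gauss(0, ERR) for t in true_scores]
    test2 = [t + random.gauss(0, ERR) for t in true_scores]

    # Select subjects whose first score was extreme (top decile), then
    # look at the same subjects' scores on the retest.
    cutoff = sorted(test1)[int(0.9 * N)]
    high = [i for i in range(N) if test1[i] >= cutoff]
    print("Top-decile mean, test 1: %.1f" %
          (sum(test1[i] for i in high) / len(high)))
    print("Same subjects, test 2:   %.1f" %
          (sum(test2[i] for i in high) / len(high)))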

Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago, IL:
Rand McNally.

Kirk, R. E. (1995). Experimental design: Procedures for the behavioral sciences. Pacific Grove, CA: Brooks/Cole.

