Investigación Experimental

P A R T 4
Quantitative Research
Methodologies
In Part 4, we begin a more detailed discussion of some of the methodologies that
educational researchers use. We concentrate here on quantitative research, with a
separate chapter devoted to group-comparison experimental research, single-subject
experimental research, correlational research, causal-comparative research, and sur-
vey research. In each chapter, we not only discuss the method in some detail, but we
also provide examples of published studies in which the researchers used one of these
methods. We conclude each chapter with an analysis of a particular study’s strengths
and weaknesses.
fra97851_ch13_263-300.indd 263 12/21/10 6:58 PM

13 Experimental Research
The Uniqueness of
Experimental Research “What is a “This seems
double-blind like more than
Essential Characteristics
study anyway?” just a double!”
of Experimental
Research
Comparison of Groups
Manipulation of the
Independent Variable
Randomization
Control of Extraneous
Variables
Group Designs in
Experimental Research
Poor Experimental Designs
True Experimental Designs
Quasi-Experimental Designs
Factorial Designs
Control of Threats
to Internal Validity:
A Summary
Evaluating the
Likelihood of a Threat
to Internal Validity in
Experimental Studies
Control of Experimental OBJECTIVES Studying this chapter should enable you to:
Treatments
• Describe briefly the purpose of • Explain three ways in which various threats
An Example of experimental research. to internal validity in experimental research
Experimental Research • Describe the basic steps involved in can be controlled.
Analysis of the Study conducting an experiment. • Explain how matching can be used
• Describe two ways in which experimental to equate groups in experimental
Purpose/Justification research differs from other forms of studies.
Prior Research educational research. • Describe briefly the purpose of factorial
Definitions • Explain the difference between random and counterbalanced designs and draw
Hypotheses assignment and random selection and the diagrams of such designs.
Sample
importance of each. • Describe briefly the purpose of a time-
• Explain what is meant by the phrase series design and draw a diagram of this
Instrumentation “manipulation of variables” and describe design.
Procedures/Internal Validity three ways in which such manipulation • Describe briefly how to assess probable
Data Analysis/Results can occur. threats to internal validity in an
Discussion/Interpretation • Distinguish between examples of weak experimental study.
and strong experimental designs and draw • Recognize an experimental study when
diagrams of such designs. you see one in the literature.
• Identify various threats to internal validity
associated with different experimental
designs.
fra97851_ch13_263-300.indd 264 12/21/10 6:58 PM

INTERACTIVE AND APPLIED LEARNING After, or while, reading this chapter:
Go to the Online Learning Center Go to your online Student Mastery
at www.mhhe.com/fraenkel8e to: Activities book to do the following
activities:
• Learn More About What Constitutes an Experiment • Activity 13.1: Group Experimental Research Questions
• Activity 13.2: Designing an Experiment
• Activity 13.3: Characteristics of Experimental Research
• Activity 13.4: Random Selections vs. Random Assignment
D oes team-teaching improve the achievement of students in high school social studies classes? Abigail Johnson,
the principal of a large high school in Minneapolis, Minnesota, having heard encouraging remarks about the
idea at a recent educational conference, wants to find out. Accordingly, she asks some of her eleventh-grade world his-
tory teachers to participate in an experiment. Three teachers are to combine their classes into one large group. These
teachers are to work as a team, sharing the planning, teaching, and evaluation of these students. Three other teachers
are assigned to teach a class in the same subject individually, with the usual arrangement of one teacher per class. The
students selected to participate are similar in ability, and the teachers will teach at the same time, using the same cur-
riculum. All are to use the same standardized tests and other assessment instruments, including written tests prepared
jointly by the six teachers. Periodically during the semester, Mrs. Johnson will compare the scores of the two groups of
students on these tests.
This is an example of an experiment—the comparison of a treatment group with a nontreatment group. In this chapter, you
will learn about various procedures that researchers use to carry out such experiments, as well as how they try to ensure that it is
the experimental treatment rather than some uncontrolled variable that causes the changes in achievement.
Experimental research is one of the most powerful re- effect(s) of at least one independent variable on one or
search methodologies that researchers can use. Of the more dependent variables. The independent variable
many types of research that might be used, the experi- in experimental research is also frequently referred
ment is the best way to establish cause-and-effect re- to as the experimental, or treatment, variable. The
lationships among variables. Yet experiments are not dependent variable, also known as the criterion, or
always easy to conduct. In this chapter, we will show outcome, variable, refers to the results or outcomes
you both the power of, and the problems involved in, of the study.
conducting experiments. The major characteristic of experimental research
that distinguishes it from all other types of research is
that researchers manipulate the independent variable.
The Uniqueness They decide the nature of the treatment (that is, what is
of Experimental Research going to happen to the subjects of the study), to whom
it is to be applied, and to what extent. Independent vari-
Of all the research methodologies described in this ables frequently manipulated in educational research
book, experimental research is unique in two very include methods of instruction, types of assignment,
important respects: It is the only type of research that learning materials, rewards given to students, and types
directly attempts to influence a particular variable, of questions asked by teachers. Dependent variables
and when properly applied, it is the best type for test- that are frequently studied include achievement, interest
ing hypotheses about cause-and-effect relationships. in a subject, attention span, motivation, and attitudes
In an experimental study, researchers look at the toward school.
265
fra97851_ch13_263-300.indd 265 12/21/10 6:58 PM

266 PART 4 Quantitative Research Methodologies www.mhhe.com/fraenkel8e
After the treatment has been administered for an wooden spindles in dry leaves. Much of the success of
appropriate length of time, researchers observe or modern science is due to carefully designed and meticu-
measure the groups receiving different treatments (by lously implemented experiments.
means of a posttest of some sort) to see if they dif- The basic idea underlying all experimental research
fer. Another way of saying this is that researchers want is really quite simple: Try something and systemati-
to see whether the treatment made a difference. If the cally observe what happens. Formal experiments con-
average scores of the groups on the posttest do differ sist of two basic conditions. First, at least two (but often
and researchers cannot find any sensible alternative more) conditions or methods are compared to assess
explanations for this difference, they can conclude that the effect(s) of particular conditions or “treatments”
the treatment did have an effect and is likely the cause (the independent variable). Second, the independent
of the difference. variable is directly manipulated by the researcher.
Experimental research, therefore, enables researchers Change is planned for and deliberately manipulated in
to go beyond description and prediction, beyond the order to study its effect(s) on one or more outcomes
identification of relationships, to at least a partial deter- (the dependent variable). Let us discuss some impor-
mination of what causes them. Correlational studies tant characteristics of experimental research in a bit
may demonstrate a strong relationship between socio- more detail.
economic level and academic achievement, for instance,
but they cannot demonstrate that improving socioeco-
nomic level will necessarily improve achievement. COMPARISON OF GROUPS
Only experimental research has this capability. Some An experiment usually involves two groups of subjects,
actual examples of the kinds of experimental studies an experimental group and a control or a comparison
that have been conducted by educational researchers are group, although it is possible to conduct an experiment
as follows: with only one group (by providing all treatments to
the same subjects) or with three or more groups. The
• The effect of small classes on instruction.1
experimental group receives a treatment of some sort
• The effect of early reading instruction on growth
(such as a new textbook or a different method of teach-
rates of at-risk kindergarteners.2
ing), while the control group receives no treatment (or
• The use of intensive mentoring to help beginning
the comparison group receives a different treatment).
teachers develop balanced instruction.3
The control or the comparison group is crucially im-
• The effect of lotteries on Web survey response rates.4
portant in all experimental research, for it enables the
• Introduction of a course on bullying into preservice
researcher to determine whether the treatment has had
teacher-training curriculum.5
an effect or whether one treatment is more effective
• Using social stories to enhance the interpersonal conflict
than another.
resolution skills of children with learning disabilities.6
Historically, a pure control group is one that re-
• Improving the self-concept of students through the
ceives no treatment at all. While this is often the case
use of hypnosis.7
in medical or psychological research, it is rarely true in
educational research. The control group almost always
receives a different treatment of some sort. Some educa-
tional researchers, therefore, refer to comparison groups
Essential Characteristics rather than to control groups.
of Experimental Research Consider an example. Suppose a researcher wished
to study the effectiveness of a new method of teaching
The word experiment has a long and illustrious history science. He or she would have the students in the experi-
in the annals of research. It has often been hailed as the mental group taught by the new method, but the students
most powerful method that exists for studying cause in the comparison group would continue to be taught by
and effect. Its origins go back to the very beginnings their teacher’s usual method. The researcher would not
of history when, for example, primeval humans first ex- administer the new method to the experimental group
perimented with ways to produce fire. One can imagine and have a control group do nothing. Any method of
countless trial-and-error attempts on their part before instruction would likely be more effective than no
achieving success by sparking rocks or by spinning method at all!
fra97851_ch13_263-300.indd 266 12/21/10 6:58 PM

CHAPTER 13 Experimental Research 267
MANIPULATION OF THE INDEPENDENT certain kinds of experiments in which random assign-

VARIABLE ment is not possible, researchers try to use randomiza-
The second essential characteristic of all experiments is tion whenever feasible. It is a crucial ingredient in the
that the researcher actively manipulates the independent best kinds of experiments. Random assignment is simi-
variables. What does this mean? Simply put, it means lar, but not identical, to the concept of random selection
that the researcher deliberately and directly determines we discussed in Chapter 6. Random assignment means
what forms the independent variable will take and then that every individual who is participating in an experi-
which group will get which form. For example, if the in- ment has an equal chance of being assigned to any of
dependent variable in a study is the amount of enthusi- the experimental or control conditions being compared.
asm an instructor displays, a researcher might train two Random selection, on the other hand, means that every
teachers to display different amounts of enthusiasm as member of a population has an equal chance of being
they teach their classes. selected to be a member of the sample. Under random
Although many independent variables in education assignment, each member of the sample is given a num-
can be manipulated, many others cannot. Examples of ber (arbitrarily), and a table of random numbers (see
independent variables that can be manipulated include Chapter 6) is then used to select the members of the
teaching method, type of counseling, learning activities, experimental and control groups.
assignments given, and materials used; examples of in- Three things should be noted about the random as-
dependent variables that cannot be manipulated include signment of subjects to groups. First, it takes place be-
gender, ethnicity, age, and religious preference. Re- fore the experiment begins. Second, it is a process of
searchers can manipulate the kinds of learning activities assigning or distributing individuals to groups, not a
to which students are exposed in a classroom, but they result of such distribution. This means that you cannot
cannot manipulate, say, religious preference—that is, look at two groups that have already been formed and
students cannot be “made into” Protestants, Catholics, be able to tell, just by looking, whether or not they were
Jews, or Muslims, for example, to serve the purposes formed randomly. Third, the use of random assignment
of a study. To manipulate a variable, researchers must allows the researcher to form groups that, right at the
decide who is to get something and when, where, and beginning of the study, are equivalent—that is, they dif-
how they will get it. fer only by chance in any variables of interest. In other
The independent variable in an experimental study words, random assignment is intended to eliminate the
may be established in several ways—either (1) one threat of extraneous, or additional, variables—not
form of the variable versus another; (2) presence ver- only those of which researchers are aware but also
sus absence of a particular form; or (3) varying degrees those of which they are not aware—that might affect
of the same form. An example of (1) would be a study the outcome of the study. This is the beauty and the
comparing the inquiry method with the lecture method power of random assignment. It is one of the reasons
of instruction in teaching chemistry. An example of why experiments are, in general, more effective than
(2) would be a study comparing the use of PowerPoint other types of research for assessing cause-and-effect
slides versus no PowerPoint slides in teaching statis- relationships.
tics. An example of (3) would be a study comparing the This last statement is tempered, of course, by the
effects of different specified amounts of teacher enthu- realization that groups formed through random assign-
siasm on student attitudes toward mathematics. In both ment may still differ somewhat. Random assignment
(1) and (2), the variable (method) is clearly categorical. ensures only that groups are equivalent (or at least as
In (3), a variable that in actuality is quantitative (degree equivalent as human beings can make them) at the be-
of enthusiasm) is treated as categorical (the effects of ginning of an experiment.
only specified amounts of enthusiasm will be studied) Furthermore, random assignment is no guarantee of
in order for the researcher to manipulate (that is, to con- equivalent groups unless both groups are sufficiently
trol for) the amount of enthusiasm. large. No one would expect random assignment to re-
sult in equivalence if only five subjects were assigned
to each group, for example. There are no rules for deter-
RANDOMIZATION mining how large groups must be, but most researchers
An important aspect of many experiments is the random are uncomfortable relying on random assignment with
assignment of subjects to groups. Although there are fewer than 40 subjects in each group.
fra97851_ch13_263-300.indd 267 12/21/10 6:58 PM

by excluding all males. The variable of gender, in

Control of Extraneous other words, is held constant. However, there is a
cost involved (as there almost always is) for this
Variables control, as the generalizability of the results of the
Researchers in an experimental study have an oppor- study are correspondingly reduced.
tunity to exercise far more control than in most other Building the variable into the design: This solution
forms of research. They determine the treatment (or involves building the variable(s) into the study to
treatments), select the sample, assign individuals to assess their effects. It is the exact opposite of the
groups, decide which group will get the treatment, try previous idea. Using the preceding example, the
to control other factors besides the treatment that might researcher would include both females and males
influence the outcome of the study, and then (finally) (as distinct groups) in the design of the study
observe or measure the effect of the treatment on the and then analyze the effects of both gender and
groups when the treatment is completed. method on outcomes.
In Chapter 9, we introduced the idea of internal va- Matching: Often pairs of subjects can be matched
lidity and discussed several kinds of threats to internal on certain variables of interest. If a researcher felt
validity. It is very important for researchers conducting that age, for example, might affect the outcome of
an experimental study to do their best to control for— a study, he might endeavor to match students ac-
that is, to eliminate or to minimize the possible effect cording to their ages and then assign one member
of—these threats. If researchers are unsure whether an- of each pair (randomly if possible) to each of the
other variable might be the cause of a result observed in comparison groups.
a study, they cannot be sure what the cause really is. For Using subjects as their own controls: When sub-
example, if a researcher attempted to compare the ef- jects are used as their own controls, their perfor-
fects of two different methods of instruction on student mance under both (or all) treatments is compared.
attitudes toward history but did not make sure that the Thus, the same students might be taught algebra
groups involved were equivalent in ability, then ability units first by an inquiry method and later by a lec-
might be a possible alternative explanation (rather than ture method. Another example is the assessment
the difference in methods) for any differences in atti- of an individual’s behavior during a period of time
tudes of the groups found on a posttest. before and after a treatment is implemented to see
In particular, researchers who conduct experimental whether changes in behavior occur.
studies try their best to control any and all subject char- Using analysis of covariance: As mentioned in Chap-
acteristics that might affect the outcome of the study. ter 11, analysis of covariance can be used to equate
They do this by ensuring that the two groups are as groups statistically on the basis of a pretest or other
equivalent as possible on all variables other than the one variables. The posttest scores of the subjects in each
or ones being studied (that is, the independent variables). group are then adjusted accordingly.
How do researchers minimize or eliminate threats We will shortly show you a number of research de-
due to subject characteristics? Many ways exist. Here signs that illustrate how several of the above controls
are some of the most common. can be implemented in an experimental study.
Randomization: As we mentioned before, if sub-
jects can be randomly assigned to the various
groups involved in an experimental study, re- Group Designs in Experimental
searchers can assume that the groups are equiva-
lent. This is the best way to ensure that the effects
Research
of one or more possible extraneous variables have The design of an experiment can take a variety of forms.
been controlled. Some of the designs we present in this section are better
Holding certain variables constant: The idea here than others, however. Why “better”? Because of the var-
is to eliminate the possible effects of a variable by ious threats to internal validity identified in Chapter 9:
removing it from the study. For example, if a re- Good designs control many of these threats, while poor
searcher suspects that gender might influence the designs control only a few. The quality of an experi-
outcomes of a study, she could control for it by ment depends on how well the various threats to internal
restricting the subjects of the study to females and validity are controlled.
fra97851_ch13_263-300.indd 268 12/21/10 6:58 PM

POOR EXPERIMENTAL DESIGNS Thus, he does not know whether the treatment had any
Designs that are “weak” do not have built-in controls for effect at all. It is quite possible that the students who
threats to internal validity. In addition to the indepen- use the new textbook will indicate very favorable at-
dent variable, there are a number of other plausible ex- titudes toward history. But the question remains,
planations for any outcomes that occur. As a result, any were these attitudes produced by the new textbook?
researcher who uses one of these designs has difficulty Unfortunately, the one-shot case study does not help
assessing the effectiveness of the independent variable. us answer this question. To remedy this design, a com-
parison could be made with another group of students
The One-Shot Case Study. In the one-shot case who had the same course content presented in the regular
study design, a single group is exposed to a treatment textbook. (We shall show you just such a design shortly.)
or event and a dependent variable is subsequently ob- Fortunately, the flaws in the one-shot design are so well
served (measured) in order to assess the effect of the known that it is seldom used in educational research.
treatment. A diagram of this design is as follows:
The One-Group Pretest-Posttest Design.
The One-Shot Case Study Design In the one-group pretest-posttest design, a single
X O group is measured or observed not only after being
Treatment Observation exposed to a treatment of some sort, but also before. A
(Dependent variable) diagram of this design is as follows:
The One-Group Pretest-Posttest Design
The symbol X represents exposure of the group to the
treatment of interest, while O refers to observation (mea- O X O
Pretest Treatment Posttest
surement) of the dependent variable. The placement of
the symbols from left to right indicates the order in time
of X and O. As you can see, the treatment, X, comes Consider an example of this design. A principal wants
before observation of the dependent variable, O. to assess the effects of weekly counseling sessions on the
Suppose a researcher wishes to see if a new textbook attitudes of certain “hard-to-reach” students in her school.
increases student interest in history. He uses the text- She asks the counselors in the program to meet once a
book (X) for a semester and then measures student inter- week with these students for a period of 10 weeks, during
est (O) with an attitude scale. A diagram of this example which sessions the students are encouraged to express
is shown in Figure 13.1. their feelings and concerns. She uses a 20-item scale to
The most obvious weakness of this design is its measure student attitudes toward school both immedi-
absence of any control. The researcher has no way of ately before and after the 10-week period. Figure 13.2
knowing if the results obtained at O (as measured by the presents a diagram of the design of the study.
attitude scale) are due to treatment X (the textbook). The This design is better than the one-shot case study (the
design does not provide for any comparison, so the re- researcher at least knows whether any change occurred),
searcher cannot compare the treatment results (as mea- but it is still weak. Nine uncontrolled-for threats to
sured by the attitude scale) with the same group before
using the new textbook, or with those of another group
using a different textbook. Because the group has not O X O
been pretested in any way, the researcher knows noth- Pretest: Treatment: Posttest:
ing about what the group was like before using the text. 20-item 10 weeks of 20-item
attitude scale counseling attitude scale
completed by completed by
X O students students
New textbook Attitude scale
to measure interest (Dependent (Dependent
variable) variable)
(Dependent variable)
Figure 13.2 Example of a One-Group Pretest-Posttest
Figure 13.1 Example of a One-Shot Case Study Design Design
fra97851_ch13_263-300.indd 269 12/21/10 6:58 PM

internal validity exist that might also explain the results

X O
on the posttest. They are history, maturation, instrument
New textbook Attitude scale
decay, data collector characteristics, data collector bias,
to measure interest
testing, statistical regression, attitude of subjects, and
implementation. Any or all of these may influence the O
outcome of the study. The researcher would not know if Regular textbook Attitude scale
any differences between the pretest and the posttest are to measure interest
due to the treatment or to one or more of these threats.
To remedy this, a comparison group, which does not re- Figure 13.3 Example of a Static-Group Comparison
ceive the treatment, could be added. Then if a change in Design
attitude occurs between the pretest and the posttest, the
researcher has reason to believe that it was caused by
more vulnerable not only to mortality and location,† but
the treatment (symbolized by X).
also, more importantly, to the possibility of differential
subject characteristics.
The Static-Group Comparison Design. In
the static-group comparison design, two already exist-
The Static-Group Pretest-Posttest Design.
ing, or intact, groups are used. These are sometimes re-
The static-group pretest-posttest design differs from
ferred to as static groups, hence the name for the design.
the static-group comparison design only in that a pretest
This design is sometimes called a nonequivalent control
is given to both groups. A diagram for this design is as
group design. A diagram of this design is as follows:
follows:
The Static-Group Comparison Design
The Static-Group Pretest-Posttest Design
X O
O X O
O
O O
The dashed line indicates that the two groups being

In analyzing the data, each individual’s pretest score is
compared are already formed—that is, the subjects are
subtracted from his or her posttest score, thus permitting
not randomly assigned to the two groups. X symbolizes
analysis of “gain” or “change.” While this provides better
the experimental treatment. The blank space in the de-
control of the subject characteristics threat (since it is the
sign indicates that the “control” group does not receive
change in each student that is analyzed), the amount of
the experimental treatment; it may receive a different
gain often depends on initial performance; that is, the
treatment or no treatment at all. The two Os are placed
group scoring higher on the pretest is likely to improve
exactly vertical to each other, indicating that the obser-
more (or in some cases less), and thus subject characteris-
vation or measurement of the two groups occurs at the
tics still remains somewhat of a threat. Further, administer-
same time.
ing a pretest raises the possibility of a testing threat. In the
Consider again the example used to illustrate the one-
event that the pretest is used to match groups, this design
shot case study design. We could apply the static-group
becomes the matching-only pretest-posttest control group
comparison design to this example. The researcher
design (see page 275), a much more effective design.
would (1) find two intact groups (two classes), (2) assign
the new textbook (X) to one of the classes but have the
other class use the regular textbook, and then (3) mea- TRUE EXPERIMENTAL DESIGNS
sure the degree of interest of all students in both classes The essential ingredient of a true experimental design
at the same time (for example, at the end of the semes- is that subjects are randomly assigned to treatment
ter). Figure 13.3 presents a diagram of this example. groups. As discussed earlier, random assignment is a
Although this design provides better control over powerful technique for controlling the subject charac-
history, maturation, testing, and regression threats,* it is teristics threat to internal validity, a major consideration
in educational research.
*History and maturation remain possible threats because the re-
searcher cannot be sure that the two groups have been exposed to the †This is because the groups may differ in the number of subjects lost
same extraneous events or have the same maturational processes. and/or in the kinds of resources provided.
fra97851_ch13_263-300.indd 270 12/21/10 6:58 PM

The Randomized Posttest-Only Control this reason, researchers should always report how many
Group Design. The randomized posttest-only con- subjects drop out of each group during an experiment.
trol group design involves two groups, both of which An attitudinal threat is possible. In addition, implemen-
are formed by random assignment. One group receives tation, data collector bias, location, and history threats
the experimental treatment while the other does not, may exist. These threats can sometimes be controlled by
and then both groups are posttested on the dependent appropriate modifications to this design.
variable. A diagram of this design is as follows: As an example of this design, consider a hypotheti-
cal study in which a researcher investigates the effects
The Randomized Posttest-Only of a series of sensitivity training workshops on faculty
Control Group Design morale in a large high school district. The researcher
Treatment group R X O randomly selects a sample of 100 teachers from all the
Control group R C O teachers in the district. The researcher then (1) ran-
domly assigns the teachers in the district to two groups;
As before, the symbol X represents exposure to the treat- (2) exposes one group, but not the other, to the training;
ment and O refers to the measurement of the dependent and then (3) measures the morale of each group using
variable. R represents the random assignment of indi- a questionnaire. Figure 13.4 presents a diagram of this
viduals to groups. C now represents the control group. hypothetical experiment.
In this design, the control of certain threats is excel- Again we stress that it is important to keep clear the
lent. Through the use of random assignment, the threats distinction between random selection and random as-
of subject characteristics, maturation, and statistical re- signment. Both involve the process of randomization,
gression are well controlled for. Because none of the but for a different purpose. Random selection, you will
subjects in the study are measured twice, testing is not a recall, is intended to provide a representative sample.
possible threat. This is perhaps the best of all designs to But it may or may not be accompanied by the random
use in an experimental study, provided there are at least assignment of subjects to groups. Random assignment
40 subjects in each group. is intended to equate groups, and often is not accompa-
There are, unfortunately, some threats to internal va- nied by random selection.
lidity that are not controlled for by this design. The first is
mortality. Because the two groups are similar, we might The Randomized Pretest-Posttest Control
expect an equal dropout rate from each group. However, Group Design. The randomized pretest-posttest
exposure to the treatment may cause more individuals control group design differs from the randomized
in the experimental group to drop out (or stay in) than posttest-only control group design solely in the use of
in the control group. This may result in the two groups a pretest. Two groups of subjects are used, with both
becoming dissimilar in terms of their characteristics, groups being measured or observed twice. The first mea-
which in turn may affect the results on the posttest. For surement serves as the pretest, the second as the posttest.
R X O
Random assignment Treatment: Posttest:
of Sensitivity training Faculty morale
50 teachers to workshops questionnaire
experimental group
100 teachers (Dependent variable)
randomly selected
R C O
Random assignment No treatment: Posttest:
of Do not receive Faculty morale
50 teachers to sensitivity training questionnaire
control group
(Dependent variable)
Figure 13.4 Example of a Randomized Posttest-Only Control Group Design
fra97851_ch13_263-300.indd 271 12/21/10 6:58 PM

Random assignment is used to form the groups. The to four groups, with two of the groups being pretested
measurements or observations are collected at the same and two not. One of the pretested groups and one of the
time for both groups. A diagram of this design follows. unpretested groups is exposed to the experimental treat-
ment. All four groups are then posttested. A diagram of
The Randomized Pretest-Posttest this design is as follows:
Control Group Design
The Randomized Solomon Four-Group Design
Treatment group R O X O
Control group R O C O Treatment group R O X O
Control group R O C O
The use of the pretest raises the possibility of a pre- Treatment group R X O
test treatment interaction threat, since it may “alert” Control group R C O
the members of the experimental group, thereby causing
them to do better (or more poorly) on the posttest than the The randomized Solomon four-group design com-
members of the control group. A trade-off is that it pro- bines the pretest-posttest control group and posttest-only
vides the researcher with a means of checking whether control group designs. The first two groups represent
the two groups are really similar—that is, whether ran- the pretest-posttest control group design, while the last
dom assignment actually succeeded in making the groups two groups represent the posttest-only control group
equivalent. This is particularly desirable if the number in design. Figure 13.6 presents an example of the random-
each group is small (less than 30). If the pretest shows ized Solomon four-group design.
that the groups are not equivalent, the researcher can seek The randomized Solomon four-group design pro-
to make them so by using one of the matching designs vides the best control of the threats to internal validity
we will discuss shortly. A pretest is also necessary if the that we have discussed. A weakness, however, is that
amount of change over time is to be assessed. it requires a large sample because subjects must be as-
Let us illustrate this design by using our previous signed to four groups. Furthermore, conducting a study
example involving the use of sensitivity workshops. involving four groups at the same time requires a con-
Figure 13.5 presents a diagram of how this design siderable amount of energy and effort on the part of the
would be used. researcher.
The Randomized Solomon Four-Group Random Assignment with Matching. In

Design. The randomized Solomon four-group an attempt to increase the likelihood that the groups
design is an attempt to eliminate the possible effect of subjects in an experiment will be equivalent, pairs
of a pretest. It involves random assignment of subjects of individuals may be matched on certain variables.
R O X O
Random Pretest: Treatment: Posttest:
assignment of Faculty morale Sensitivity Faculty morale
50 teachers to questionnaire training questionnaire
experimental group workshops
(Dependent (Dependent
100 teachers variable) variable)
randomly
selected R O C O
assignment of Faculty morale Workshops that Faculty morale
50 teachers to questionnaire do not include questionnaire
control group sensitivity training
(Dependent (Dependent
variable) variable)
Figure 13.5 Example of a Randomized Pretest-Posttest Control Group Design
fra97851_ch13_263-300.indd 272 12/21/10 6:58 PM

R O X O
assignment of Faculty morale Sensitivity Faculty morale
25 teachers to questionnaire training questionnaire
experimental workshops
group (Group I) (Dependent (Dependent
variable) variable)
R O C O
Random Pretest: Workshops Posttest:
assignment of Faculty morale that do not Faculty morale
25 teachers to questionnaire include questionnaire
comparison sensitivity
100 group (Group II) (Dependent training (Dependent
teachers variable) variable)
randomly
selected R X O
Random Treatment: Posttest:
assignment of Sensitivity Faculty morale
25 teachers to training questionnaire
experimental workshops
group (Group III) (Dependent
variable)
R C O
Random Workshops Posttest:
assignment of that do not Faculty morale
25 teachers to include questionnaire
comparison sensitivity
group (Group IV) training (Dependent
variable)
Figure 13.6 Example of a Randomized Solomon Four-Group Design
The choice of variables on which to match is based The Randomized Pretest-Posttest Control
on previous research, theory, and/or the experience of Group Design, Using Matched Subjects
the researcher. The members of each matched pair are Treatment group Mr O X O
then assigned to the experimental and control groups
Control group Mr O C O
at random. This adaptation can be made to both the
posttest-only control group design and the pretest- The symbol Mr refers to the fact that the members of
posttest control group design, although the latter is each matched pair are randomly assigned to the experi-
more common. Diagrams of these designs are pro- mental and control groups.
vided below. Although a pretest of the dependent variable is com-
monly used to provide scores on which to match, a
The Randomized Posttest-Only Control measurement of any variable that shows a substantial rela-
Group Design, Using Matched Subjects tionship to the dependent variable is appropriate. Matching
may be done in either or both of two ways: mechanically
Treatment group Mr X O
or statistically. Both require a score for each subject on
Control group Mr C O each variable on which subjects are to be matched.
fra97851_ch13_263-300.indd 273 12/21/10 6:58 PM

Experimental
group receives
Sample of women randomly coaching in
selected and then matched with mathematics.
regard to GPA.
Effects of
coaching
on GPA
are
compared.
One member of each pair is

randomly assigned to the
experimental group and one
to the comparison group.
Control group receives

no coaching.
Figure 13.7 A Randomized Posttest-Only Control Group Design, Using Matched Subjects
Mechanical matching is a process of pairing two averages (GPA) of low-achieving students in science
persons whose scores on a particular variable are simi- classes. The researcher randomly selects a sample of
lar. Two girls, for example, whose mathematics apti- 60 students from a population of 125 such students in
tude scores and test anxiety scores are similar might be a local elementary school and matches them by pairs
matched on those variables. After the matching is com- on GPA, finding that she can match 40 of the 60. She
pleted for the entire sample, a check should be made then randomly assigns each subject in the resulting
(through the use of frequency polygons) to ensure that 20 pairs to either the experimental or the control group.
the two groups are indeed equivalent on each matching Figure 13.7 presents a similar example.
variable. Unfortunately, two problems limit the useful- Statistical matching,* on the other hand, does
ness of mechanical matching. First, it is very difficult to not necessitate a loss of subjects, nor does it limit the
match on more than two or three variables—people just number of matching variables. Each subject is given a
don’t pair up on more than a few characteristics, making “predicted” score on the dependent variable, based on
it necessary to have a very large initial sample to draw the correlation between the dependent variable and the
from. Second, in order to match, it is almost inevitable variable (or variables) on which the subjects are being
that some subjects must be eliminated from the study matched. The difference between the predicted and ac-
because no “matches” for them can be found. Samples tual scores for each individual is then used to compare
then are no longer random even though they may have experimental and control groups.
been before matching occurred.
As an example of a mechanical matching design with *Statistical equating is a more common term than its synonym,
random assignment, suppose a researcher is interested statistical matching. We believe the meaning for the beginning
in the effects of academic coaching on the grade point student is better conveyed by the term matching.
fra97851_ch13_263-300.indd 274 12/21/10 6:58 PM

When a pretest is used as the matching variable, the It should be emphasized that matching (whether me-
difference between the predicted and actual score is chanical or statistical) is never a substitute for random
called a regressed gain score. This score is preferable assignment. Furthermore, the correlation between the
to the more straightforward gain scores (posttest minus matching variable(s) and the dependent variable should
pretest score for each individual) primarily because it be fairly substantial. (We suggest at least .40.) Realize
is more reliable. We discuss a similar procedure under also that unless it is used in conjunction with random
partial correlation in Chapter 15. assignment, matching controls only for the variable(s)
If mechanical matching is used, one member being matched. Diagrams of each of the matching-only
of each matched pair is randomly assigned to the control group designs follow.
experimental group, the other to the control group. The Matching-Only Posttest-Only
If statistical matching is used, the sample is divided Control Group Design
randomly at the outset, and the statistical adjustments
are made after all data have been collected. Although Treatment group M X O
some researchers advocate the use of statistical over Control group M C O
mechanical matching, statistical matching is not
infallible. Its major weakness is that it assumes that The Matching-Only Pretest-Posttest
the relationship between the dependent variable and Control Group Design
each predictor variable can be properly described by a Treatment group M O X O
straight line rather than a curved line. Whichever pro- Control group M O C O
cedure is used, the researcher must (in this design) rely
on random assignment to equate groups on all other The M in this design means that the subjects in each
variables related to the dependent variable. group have been matched (on certain variables) but not
randomly assigned to the groups.
QUASI-EXPERIMENTAL DESIGNS
Counterbalanced Designs. Counterbalanced
Quasi-experimental designs do not include the use designs represent another technique for equating experi-
of random assignment. Researchers who employ these mental and comparison groups. In this design, each group
designs rely instead on other techniques to control (or is exposed to all treatments, however many there are, but
at least reduce) threats to internal validity. We shall de- in a different order. Any number of treatments may be in-
scribe some of these techniques as we discuss several volved. An example of a diagram for a counterbalanced
quasi-experimental designs. design involving three treatments is as follows:
The Matching-Only Design. The matching-only A Three-Treatment Counterbalanced Design

design differs from random assignment with matching Group I X1 O X2 O X3 O
only in the fact that random assignment is not used. The
Group II X2 O X3 O X1 O
researcher still matches the subjects in the experimental
and control groups on certain variables, but he or she has Group III X3 O X1 O X2 O
no assurance that they are equivalent on others. Why?
Because even though matched, subjects already are in This arrangement involves three groups. Group I re-
intact groups. This is a serious limitation but often is un- ceives treatment 1 and is posttested, then receives treat-
avoidable when random assignment is impossible—that ment 2 and is posttested, and last receives treatment 3
is, when intact groups must be used. When several (say, and is posttested. Group II receives treatment 2 first, then
10 or more) groups are available for a method study treatment 3, and then treatment 1, being posttested after
and the groups can be randomly assigned to different each treatment. Group III receives treatment 3 first, then
treatments, this design offers an alternative to random treatment 1, followed by treatment 2, also being post-
assignment of subjects. After the groups have been ran- tested after each treatment. The order in which the groups
domly assigned to the different treatments, the individu- receive the treatments should be determined randomly.
als receiving one treatment are matched with individuals How do researchers determine the effectiveness
receiving the other treatments. The design shown in of the various treatments? Simply by comparing the
Figure 13.7 is still preferred, however. average scores for all groups on the posttest for each
fra97851_ch13_263-300.indd 275 12/21/10 6:58 PM

Study 1 Study 2
Weeks Weeks Weeks Weeks
1--4 5--8 1--4 5--8
Group I Method X = 12 Method Y = 8 Method X = 10 Method Y = 6
Group II Method Y = 8 Method X = 12 Method Y = 10 Method X = 14
Overall Means: Method X = 12; Method Y = 8 Method X = 12; Method Y = 8
Figure 13.8 Results (Means) from a Study Using a Counterbalanced Design
treatment. In other words, the averaged posttest score essentially the same on the pretests and then consider-
for all groups for treatment 1 can be compared with the ably improves on the posttests, the researcher has more
averaged posttest score for all groups for treatment 2, confidence that the treatment is causing the improve-
and so on, for however many treatments there are. ment than if just one pretest and one posttest were
This design controls well for the subject charac- given. An example might be a teacher who gives a
teristics threat to internal validity but is particularly weekly test to her class for several weeks before giving
vulnerable to multiple-treatment interference—that them a new textbook to use, and then monitors how
is, performance during a particular treatment may be they score on a number of weekly tests after they have
affected by one or more of the previous treatments. used the text. A diagram of the basic time-series design
Consequently, the results of any study in which the re- is as follows:
searcher has used a counterbalanced design must be ex-
amined carefully. Consider the two sets of hypothetical A Basic Time-Series Design
data shown in Figure 13.8. O1 O2 O3 O4 O5 X O6 O7 O8 O9 O10
The interpretation in study 1 is clear: Method X is su-
perior for both groups regardless of sequence and to the The threats to internal validity that endanger use
same degree. The interpretation in study 2, however, is of this design include history (something could hap-
much more complex. Overall, method X appears supe- pen between the last pretest and the first posttest), in-
rior, and by the same amount as in study 1. In both studies, strumentation (if, for some reason, the test being used
the overall mean for X is 12, while for Y it is 8. In study is changed at any time during the study), and testing
2, however, it appears that the difference between X and (due to a practice effect). The possibility of a pretest-
Y depends on previous exposure to the other method. treatment interaction is also increased with the use of
Group I performed much worse on method Y when it several pretests.
was exposed to it following X, and group II performed The effectiveness of the treatment in a time-series de-
much better on X when it was exposed to it after method sign is basically determined by analyzing the pattern of
Y. When either X or Y was given first in the sequence, test scores that results from the several tests. Figure 13.9
there was no difference in performance. It is not clear illustrates several possible outcome patterns that might
that method X is superior in all conditions in study 2, result from the introduction of an experimental variable
whereas this is quite clear in study 1. (X). The vertical line indicates the point at which the
experimental treatment is introduced. In this figure, the
Time-Series Designs. The typical pre- and post- change between time periods 5 and 6 gives the same
test designs examined up to now involve observations kind of data that would be obtained using a one-group
or measurements taken immediately before and after pretest-posttest design. The collection of additional data
treatment. A time-series design, however, involves before and after the introduction of the treatment, how-
repeated measurements or observations over a period ever, shows how misleading a one-group pretest-post-
of time both before and after treatment. It is really an test design can be. In (A), the improvement is shown to
elaboration of the one-group pretest-posttest design be no more than that which occurs from one data col-
presented in Figure 13.2. An extensive amount of lection period to another—regardless of method. You
data is collected on a single group. If the group scores will notice that performance does improve from time to
fra97851_ch13_263-300.indd 276 12/21/10 6:58 PM

Study (with or without random assignment), which permit

(D) the investigation of additional independent variables.
Study
Another value of a factorial design is that it allows a
(C) researcher to study the interaction of an independent
Study
variable with one or more other variables, sometimes
(B) called moderator variables. Moderator variables may
be either treatment variables or subject characteristic
variables. A diagram of a factorial design is as follows:
Factorial Design
Study
Treatment R O X Y1 O
(A) Control R O C Y1 O
Treatment R O X Y2 O
O1 O2 O3 O4 O5 O6 O7 O8 O9 O 10 Control R O C Y2 O
X
This design is a modification of the pretest-posttest
Figure 13.9 Possible Outcome Patterns in a Time-Series control group design. It involves one treatment and one
Design control group, and a moderator variable having two
levels (Y1 and Y2). In this example, two groups would
time, but no trend or overall increase is apparent. In (B), receive the treatment (X) and two would not (C). The
the gain from periods 5 to 6 appears to be part of a trend groups receiving the treatment would differ on Y, how-
already apparent before the treatment was begun (quite ever, as would the two groups not receiving the treat-
possibly an example of maturation). In (D) the higher ment. Because each variable, or factor, has two levels,
score in period 6 is only temporary, as performance the above design is called a 2 by 2 factorial design. This
soon approaches what it was before the treatment was design can also be illustrated as follows:
introduced (suggesting an extraneous event of transient
impact). Only in (C) do we have evidence of a consis- Alternative Illustration of the Above Example
tent effect of the treatment. X C
The time-series design is a strong design, although it
is vulnerable to history (an extraneous event could occur Y1
after period 5) and instrumentation (owing to the sev- Y2
eral test administrations at different points in time). The
extensive amount of data collection required, in fact,
is a likely reason why this design is infrequently used A variation of this design uses two or more different
in educational research. In many studies, especially in treatment groups and no control groups. Consider the
schools, it simply is not feasible to give the same in- example we have used before of a researcher compar-
strument eight to ten times. Even when it is possible, ing the effectiveness of inquiry and lecture methods of
serious questions are raised concerning the validity of instruction on achievement in history. The independent
instrument interpretation with so many administrations. variable in this case (method of instruction) has two
An exception is the use of unobtrusive devices that can levels—inquiry (X1) and lecture (X2). Now imagine the
be applied over many occasions, since interpretations researcher wants to see whether achievement is also in-
based on them should remain valid. fluenced by class size. In that case, Y1 might represent
small classes and Y2 might represent large classes.
As we suggest above, it is possible using a factorial
FACTORIAL DESIGNS design to assess not only the separate effect of each in-
Factorial designs extend the number of relationships dependent variable but also their joint effect. In other
that may be examined in an experimental study. They words, the researcher is able to see how one of the vari-
are essentially modifications of either the posttest-only ables might moderate the other (hence the reason for
control group or pretest-posttest control group designs calling these variables moderator variables).
fra97851_ch13_263-300.indd 277 12/21/10 6:58 PM

Figure 13.11, for example, illustrates two possi-

Method ble outcomes for the 2 by 2 factorial design shown in
Class size Inquiry (X1) Lecture (X2) Figure 13.10. The scores for each group on the posttest
Small (Y1) (a 50-item quiz on American history) are shown in the
boxes (usually called cells) corresponding to each com-
Large (Y2)
bination of method and class size.
In study (a) in Figure 13.11, the inquiry method was
Figure 13.10 Using a Factorial Design to Study Effects shown to be superior in both small and large classes, and
of Method and Class Size on Achievement small classes were superior to large classes for both meth-
ods. Hence no interaction effect is present. In study (b),
Let us continue with the example of the researcher students did better in small than in large classes with
who wished to investigate the effects of method of in- both methods; however, students in small classes did
struction and class size on achievement in history. Fig- better when they were taught by the inquiry method,
ure 13.10 illustrates how various combinations of these but students in large classes did better when they were
variables could be studied in a factorial design. taught by the lecture method. Thus, even though stu-
Factorial designs, therefore, are an efficient way to dents did better in small than in large classes in general,
study several relationships with one set of data. Let us how well they did depended on the teaching method. As
emphasize again, however, that their greatest virtue lies a result, the researcher cannot say that either method
in the fact that they enable a researcher to study interac- was always better; it depended on the size of the class
tions between variables. in which students were taught. There was an interaction,
50
(a) No interaction between class size and method
45 Inquiry
Method
Scores
40
Class size Inquiry (X1) Lecture (X2) Mean
Lecture
Small (Y1) 46 38 42 35
Large (Y2) 40 32 36
30
Mean = 43 35
Small Large
Class size
50
(b) Interaction between class size and method
45
Method Inquiry
Scores
40
Class size Inquiry (X1) Lecture (X2) Mean
Small (Y1) 48 42 45 35 Lecture

Large (Y2) 32 38 35
30
Mean = 40 40
Small Large
Class size
Figure 13.11 Illustration of Interaction and No Interaction in a 2 by 2 Factorial Design
fra97851_ch13_263-300.indd 278 12/21/10 6:58 PM

Figure 13.12 Example

Treatments (X) of a 4 by 2 Factorial
R X1 Y1 O X1 Computer-assisted instruction Design
R X2 Y1 O X2 Programmed text
R X3 Y1 O X3 Televised lecture
R X4 Y1 O X4 Lecture-discussion
Treatments
Moderator (Y) X1 X2 X3 X4
R X1 Y2 O
R X2 Y2 O Y1 High motivation Y1
R X3 Y2 O Y2 Low motivation Y2
R X4 Y2 O
in other words, between class size and method, and this indicate some control (the threat might occur); a minus
in turn affected achievement. (−) to indicate a weak control (the threat is likely to
Suppose a factorial design was not used in study (b). occur); and a question mark (?) to those threats whose
If the researcher simply compared the effect of the two likelihood, owing to the nature of the study, we cannot
methods, without taking class size into account, he determine.
would have concluded that there was no difference in You will notice that these designs are most effec-
their effect on achievement (notice that the means of tive in controlling the threats of subject characteristics,
both groups 5 40). The use of a factorial design en- mortality, history, maturation, and regression. Note that
ables us to see that the effectiveness of the method, in mortality is controlled in several designs because any
this case, depended on the size of the class in which it subject lost is lost to both the experimental and con-
was used. It appears that an interaction existed between trol methods, thus introducing no advantage to either.
method and class size. A location threat is a minor problem in the time-series
A factorial design involving four levels of the in- design because the location where the treatment is ad-
dependent variable and using a modification of the ministered is usually constant throughout the study; the
posttest-only control group design was employed by same is true for data collector characteristics, although
Tuckman.8 In this study, the independent variable was such characteristics may be a problem in other designs
type of instruction, and the moderator was amount of if different collectors are used for different methods.
motivation. It is a 4 by 2 factorial design (Figure 13.12). This is usually easy to control, however. Unfortunately,
Many additional variations are also possible, such as 3 time-series designs do suffer from a strong likelihood of
by 3, 4 by 3, and 3 by 2 by 3 designs. Factorial designs instrument decay and data collector bias, since data (by
can be used to investigate more than two variables, al- means of observations) must be collected over many tri-
though rarely are more than three variables studied in als, and the data collector can hardly be kept in the dark
one design. as to the intent of the study.
Unconscious bias on the part of data collectors is not
controlled by any of these designs, nor is an implemen-
tation effect. Either implementers or data collectors can,
Control of Threats to Internal unintentionally, distort the results of a study. The data
Validity: A Summary collector should be kept ignorant as to who received
which treatment, if this is feasible. It should be verified
Table 13.1 presents our evaluation of the effectiveness of that the treatment is administered and the data collected
each of the preceding designs in controlling the threats as the researcher intended.
to internal validity that we discussed in Chapter 9. As you can see in Table 13.1, a testing threat may be
You should remember that these assessments reflect our present in many of the designs, although its magnitude
judgment; not all researchers would necessarily agree. depends on the nature and frequency of the instrumenta-
We have assigned two pluses (+ +) to indicate a strong tion involved. It can occur only when subjects respond
control (the threat is unlikely to occur); one plus (+) to to an instrument on more than one occasion.
fra97851_ch13_263-300.indd 279 12/21/10 6:58 PM

TABLE 13.1 Effectiveness of Experimental Designs in Controlling Threats to Internal Validity

Threat
Data Atti-
Collec- Data tude
Subject Instru- tor Col- of Imple-
Charac- Mor- Loca- ment Charac- lector Matura- Sub- Regres- men-
Design teristics tality tion Decay teristics Bias Testing History tion jects sion tation
One-shot case
study − − 2 (NA) 2 2 (NA) 2 2 2 2 2
One group
pretest-posttest − ? 2 2 2 2 2 2 2 2 2 2
Static-group
comparison − − 2 1 2 2 1 ? 1 2 2 2
Randomized
posttest-only
control group 11 1 2 1 2 2 11 1 11 2 11 2
Randomized
pretest-posttest
control group 11 1 2 1 2 2 1 1 11 2 11 2
Randomized
Solomon
four-group 11 11 2 1 2 2 11 1 11 2 11 2
Randomized
posttest-only
control group
with matched
subjects 11 1 2 1 2 2 11 1 11 2 11 2
Matching-only
pretest-posttest
control group 1 1 2 1 2 2 1 1 1 2 1 2
Counterbalanced 11 11 2 1 2 2 1 11 11 11 11 2
Time-series 11 2 1 2 1 1 2 2 1 2 11 2
Factorial with
randomization 11 11 2 11 2 2 1 1 11 2 11 2
Factorial without
randomization ? ? 2 11 2 2 1 1 1 2 ? 2
Key: (++) 5 strong control, threat unlikely to occur; (+) 5 some control, threat may possibly occur; (–) 5 weak control, threat likely to occur; (?) 5 can’t
determine; (NA) 5 threat does not apply.
The attitudinal (or demoralization) effect is best con- is that neither the subjects nor the researcher know the
trolled by the counterbalanced design since each subject identity of those receiving each treatment. This is most
receives both (or all) special treatments. In the remain- easily accomplished in medical studies by means of a
ing designs, it can be controlled by providing another placebo (sometimes a sugar pill) that is indistinguish-
“special” experience during the alternative treatment. able from the actual medicine.
Special mention should be made of the double-blind Regression is not likely to be a problem except in
type of experiment. Such studies are common in medi- the one-group pretest-posttest design, since it should
cine but hard to arrange in education. The key element occur equally in experimental and control conditions if it
fra97851_ch13_263-300.indd 280 12/21/10 6:58 PM

CONTROVERSIES IN
RESEARCH
Their report, published in the New England Journal of Medi-
cine in May 2001, showed that “placebos offer no significant
advantage over ‘no treatment’ for dozens of conditions rang-
Do Placebos Work? ing from colds and seasickness to hypertension and Alzheim-
er’s disease. (The exception was pain relief, which sugar pills
seem to bring to about 15 percent of patients.)”* The research-
T he placebo effect—the expectation that some patients will
show improvement if they are given any kind of treatment
at all, even a sugar pill—has long been acknowledged by phy-
ers speculated that one explanation of the placebo effect may
simply have been an unconscious desire by patients to please
sicians and others involved in clinical trials. But does this ef- their doctors.
fect really exist? What do you think? Do some patients try (unconsciously)
Two researchers in Denmark recently suggested it often to please their doctors?
does not. They reviewed 114 clinical trials in which patients
were given real medicine, a placebo, or no treatment at all. *Reported in Time, June 4, 2001, p. 65.
occurs at all. It could, however, possibly occur in a static- The importance of step 2 is illustrated in Fig-
group pretest-posttest control group design, if there are ure 13.13. In each diagram, the thermometers depict
large initial differences between the two groups. the performance of subjects receiving method A com-
pared to those receiving method B. In diagram (a),
subjects receiving method A performed higher on the
posttest but also performed higher on the pretest; thus,
Evaluating the Likelihood the difference in pretest achievement accounts for the
of a Threat to Internal Validity difference on the posttest. In diagram (b), subjects
receiving method A performed higher on the posttest
in Experimental Studies but did not perform higher on the pretest; thus, the
posttest results cannot be explained by, or attributed
An important consideration in planning an experimental
to, different achievement levels prior to receiving the
study or in evaluating the results of a reported study is
methods.
the likelihood of threats to internal validity. As we have
Let us consider an example to illustrate how
shown, a number of possible threats to internal validity
these different steps might be employed. Suppose a
may exist. The question that a researcher must ask is:
researcher wishes to investigate the effects of two
How likely is it that any particular threat exists in this
different teaching methods (for example, lecture
study?
versus inquiry instruction) on critical thinking abil-
To aid in assessing this likelihood, we suggest the
ity of students (as measured by scores on a critical
following procedures.
thinking test). The researcher plans to compare two
Step 1: Ask: What specific factors either are known groups of eleventh-graders, one group being taught by
to affect the dependent variable or may logically an instructor who uses the lecture method, the other
be expected to affect this variable? (Note that re- group being taught by an instructor who uses the in-
searchers need not be concerned with factors un- quiry method. Assume that intact classes will be used
related to what they are studying.) rather than random assignment to groups. Several of
Step 2: Ask: What is the likelihood of the compari- the threats to internal validity discussed in Chapter 9
son groups differing on each of these factors? (A are considered and evaluated using the steps just pre-
difference between groups cannot be explained sented. We would argue that this is the kind of think-
away by a factor that is the same for all groups.) ing researchers should engage in when planning a
Step 3: Evaluate the threats on the basis of how likely research project.
they are to have an effect, and plan to control
for them. If a given threat cannot be controlled, Subject Characteristics. Although many possi-
acknowledge this. ble subject characteristics might affect critical thinking
281
fra97851_ch13_263-300.indd 281 12/21/10 6:58 PM

Investigación Experimental

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Investigación Experimental

Uploaded by

Copyright:

Available Formats

P A R T 4

fra97851_ch13_263-300.indd 263 12/21/10 6:58 PM

fra97851_ch13_263-300.indd 264 12/21/10 6:58 PM

fra97851_ch13_263-300.indd 265 12/21/10 6:58 PM

fra97851_ch13_263-300.indd 266 12/21/10 6:58 PM

MANIPULATION OF THE INDEPENDENT certain kinds of experiments in which random assign-

fra97851_ch13_263-300.indd 267 12/21/10 6:58 PM

by excluding all males. The variable of gender, in

fra97851_ch13_263-300.indd 268 12/21/10 6:58 PM

fra97851_ch13_263-300.indd 269 12/21/10 6:58 PM

internal validity exist that might also explain the results

The dashed line indicates that the two groups being

fra97851_ch13_263-300.indd 270 12/21/10 6:58 PM

Figure 13.4 Example of a Randomized Posttest-Only Control Group Design

fra97851_ch13_263-300.indd 271 12/21/10 6:58 PM

The Randomized Solomon Four-Group Random Assignment with Matching. In

Figure 13.5 Example of a Randomized Pretest-Posttest Control Group Design

fra97851_ch13_263-300.indd 272 12/21/10 6:58 PM

Figure 13.6 Example of a Randomized Solomon Four-Group Design

fra97851_ch13_263-300.indd 273 12/21/10 6:58 PM

One member of each pair is

Control group receives

fra97851_ch13_263-300.indd 274 12/21/10 6:58 PM

The Matching-Only Design. The matching-only A Three-Treatment Counterbalanced Design

fra97851_ch13_263-300.indd 275 12/21/10 6:58 PM

Figure 13.8 Results (Means) from a Study Using a Counterbalanced Design

fra97851_ch13_263-300.indd 276 12/21/10 6:58 PM

Study (with or without random assignment), which permit

fra97851_ch13_263-300.indd 277 12/21/10 6:58 PM

Figure 13.11, for example, illustrates two possi-

Small (Y1) 48 42 45 35 Lecture

fra97851_ch13_263-300.indd 278 12/21/10 6:58 PM

Figure 13.12 Example

fra97851_ch13_263-300.indd 279 12/21/10 6:58 PM

TABLE 13.1 Effectiveness of Experimental Designs in Controlling Threats to Internal Validity

fra97851_ch13_263-300.indd 280 12/21/10 6:58 PM

fra97851_ch13_263-300.indd 281 12/21/10 6:58 PM

You might also like