INTRODUCTION TO RESEARCH
METHODS
Introduction to Design
Causal
– Pertaining to a cause-effect question, hypothesis, or
relationship
– Something is causal if it leads to an outcome or makes
an outcome happen
– Don’t confuse this word with casual!
8.2a Establishing Cause and Effect
in Research Design
Causal relationship
– A cause-effect relationship: for example, when you
evaluate whether your treatment or program causes an
outcome to occur, you are examining a causal
relationship
Three criteria:
– Temporal precedence
– Covariation of the cause and effect
– No plausible alternative explanations
8.2a The Third-Variable Problem
History
– A history threat to internal validity
encompasses the specific events that a study
participant experiences during the course of an
experiment that are not part of the experiment
itself; they are therefore extraneous variables.
8.2b Internal Validity – Single Group Threats
Maturation
– Whereas history involves the experience of
external events, maturation involves bodily changes.
Selection-history threat
Selection-maturation threat
Selection-testing threat
Selection-instrumentation threat
Selection-mortality threat
Selection-regression threat
8.2b Internal Validity – Types of Selection Bias
Selection-history threat
– A selection-history threat is any other event that occurs between pretest and posttest
that the groups experience differently.
– Because this is a selection threat, it means the groups differ in some way.
– Because it’s a ‘history’ threat, it means that the way the groups differ is with respect to
their reactions to history events.
– For example, what if the children in one group differ from those in the other in their
television habits?
– Perhaps the program group children watch Sesame Street more frequently than those in
the control group do.
– Since Sesame Street is a children’s show that presents simple mathematical concepts in
interesting ways, it may be that a higher average posttest math score for the program
group doesn’t indicate the effect of our math tutoring – it’s really an effect of the two
groups differentially experiencing a relevant event – in this case Sesame Street –
between the pretest and posttest.
8.2b Internal Validity – Types of Selection Bias
Selection-maturation threat
– A selection-maturation threat results from differential
rates of normal growth between pretest and posttest
for the groups. In this case, the two groups
differ in their rates of maturation with
respect to math concepts. It’s important to distinguish
between history and maturation threats. In general,
history refers to a discrete event or series of events
whereas maturation implies the normal, ongoing
developmental process that would take place. In any
case, if the groups are maturing at different rates with
respect to the outcome, we cannot assume that
posttest differences are due to our program – they
may be selection-maturation effects.
8.2b Internal Validity – Types of Selection Bias
Selection-testing threat
– A selection-testing threat occurs when taking the
pretest has a differential effect between groups on the
posttest. Perhaps the test “primed” the
children in each group differently or they may have
learned differentially from the pretest. In these cases,
an observed posttest difference can’t be attributed to
the program; it could be the result of selection-testing.
8.2b Internal Validity – Types of Selection Bias
Selection-instrumentation threat
– Selection-instrumentation refers to any differential
change in the test used for each group from pretest
to posttest. In other words, the test changes
differently for the two groups. Perhaps the test consists
of observers who rate the class performance of the
children. What if the program group observers, for
example, get better at doing the observations while,
over time, the comparison group observers get
fatigued and bored? Differences on the posttest could
easily be due to this differential instrumentation –
selection-instrumentation – and not to the program.
8.2b Internal Validity – Types of Selection Bias
Selection-mortality threat
– Selection-mortality arises when there is differential
nonrandom dropout between pretest and posttest.
– In our example, different types of children might drop out of
each group, or more may drop out of one than the other.
Posttest differences might then be due to the different types
of dropouts – the selection-mortality – and not to the
program.
8.2b Internal Validity – Types of Selection Bias
Selection-regression threat
– Finally, selection-regression occurs when there are different rates of
regression to the mean in the two groups. This might happen if one group
is more extreme on the pretest than the other. In the context of our
example, it may be that the program group is getting a disproportionate
number of low math ability children because teachers think they need the
math tutoring more (and the teachers don’t understand the need for
‘comparable’ program and comparison groups!). Since the tutoring group
has the more extreme lower scorers, their mean will regress a greater
distance toward the overall population mean and they will appear to gain
more than their comparison group counterparts. This is not a real program
gain – it’s just a selection-regression artifact.
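The selection-regression artifact described above can be reproduced in a small simulation. The sketch below is illustrative only: the score model (a stable ability plus independent test noise), the sample size, the bottom-quarter assignment rule, and the function name `simulate_selection_regression` are all assumptions, not part of the original example. No program effect is built in, so any difference in gains comes from regression to the mean alone.

```python
import random
import statistics


def simulate_selection_regression(n_children=2000, seed=0):
    """Simulate pretest/posttest math scores with NO program effect."""
    rng = random.Random(seed)
    # Assumed model: observed score = stable ability + independent test noise.
    ability = [rng.gauss(50, 10) for _ in range(n_children)]
    pretest = [a + rng.gauss(0, 10) for a in ability]
    posttest = [a + rng.gauss(0, 10) for a in ability]

    # Nonrandom assignment: the lowest pretest scorers are placed in the
    # "program" group, as teachers might do when picking children for tutoring.
    order = sorted(range(n_children), key=lambda i: pretest[i])
    cutoff = n_children // 4
    program, comparison = order[:cutoff], order[cutoff:]

    def mean_gain(group):
        return statistics.mean(posttest[i] - pretest[i] for i in group)

    return mean_gain(program), mean_gain(comparison)


program_gain, comparison_gain = simulate_selection_regression()
print(f"program group mean gain:    {program_gain:+.2f}")
print(f"comparison group mean gain: {comparison_gain:+.2f}")
```

Under these assumptions the program group shows a positive average gain and the comparison group a negative one, even though neither group received any treatment in the simulation.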
8.2b Internal Validity – Multi-Group Threats
Compensatory rivalry
– Here, the comparison group knows what the program group is getting and develops a
competitive attitude with them.
– The students in the comparison group might see the special math tutoring program the
program group is getting and feel jealous. This could lead them to deciding to compete
with the program group “just to show them” how well they can do. Sometimes, in
contexts like these, the participants are even encouraged by well-meaning teachers or
administrators to compete with each other (while this might make educational sense as
a motivation for the students in both groups to work harder, it works against our ability
to see the effects of the program). If the rivalry between groups affects posttest
performance, it could make it more difficult to detect the effects of the program. As
with diffusion and imitation, this threat generally works in the direction of equalizing
the posttest performance across groups, increasing the chance that you won’t see a
program effect, even if the program is effective.
8.2b Internal Validity – Social-Interaction
Threats
Resentful demoralization
– This is almost the opposite of compensatory rivalry.
Here, students in the comparison group know what the
program group is getting. But here, instead of developing
a rivalry, they get discouraged or angry and they give up
(sometimes referred to as the “screw you” effect!).
Unlike the previous two threats, this one is likely to
exaggerate posttest differences between groups, making
your program look even more effective than it actually is.
Minimizing Threats to Validity
By Argument
– The most straightforward way to rule out a potential threat to validity is to
simply argue that the threat in question is not a reasonable one. Such an
argument may be made either a priori or a posteriori, although the former
will usually be more convincing than the latter.
– For example, depending on the situation, one might argue that an
instrumentation threat is not likely because the same test is used for pretest and
posttest measurements and does not involve observers who might improve,
or other such factors.
– In most cases, ruling out a potential threat to validity by argument alone will
be weaker than the other approaches listed below.
– As a result, the most plausible threats in a study should not, except in
unusual cases, be ruled out by argument only.
Minimizing Threats to Validity
By Measurement or Observation
– In some cases it will be possible to rule out a threat by
measuring it and demonstrating that either it does not occur at
all or occurs so minimally as to not be a strong alternative
explanation for the cause-effect relationship.
– Consider, for example, a study of the effects of an advertising
campaign on subsequent sales of a particular product.
– In such a study, history (i.e., the occurrence of other events
which might lead to an increased desire to purchase the
product) would be a plausible alternative explanation.
– For example, a change in the local economy, the removal of a competing
product from the market, or similar events could cause an increase in
product sales.
– One might attempt to minimize such threats by measuring local
economic indicators and the availability and sales of competing products.
– If there is no change in these measures coincident with the onset of the
advertising campaign, these threats would be considerably minimized.
– Similarly, if one is studying the effects of special mathematics training on
math achievement scores of children, it might be useful to observe
everyday classroom behavior in order to verify that students were not
receiving any math training beyond that provided in the study.
Minimizing Threats to Validity
By Analysis
– There are a number of ways to rule out alternative explanations using
statistical analysis.
– One interesting example is provided by Jurs and Glass (1971).
– They suggest that one could study the plausibility of an attrition or mortality
threat by conducting a two-way analysis of variance. One factor in this study
would be the original treatment group designations (i.e., program vs.
comparison group), while the other factor would be attrition (i.e., dropout vs.
non-dropout group). The dependent measure could be the pretest or other
available pre-program measures. A main effect on the attrition factor would be
indicative of a threat to external validity or generalizability, while an interaction
between group and attrition factors would point to a possible threat to internal
validity. Where both effects occur, it is reasonable to infer that there is a threat
to both internal and external validity.
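The logic of this group-by-attrition analysis can be sketched with cell means. The example below is a simplified illustration, not the full two-way analysis of variance: it computes only the descriptive contrasts behind the two effects and performs no significance test, which a real analysis would obtain from a statistics package. The pretest scores and all variable names are hypothetical.

```python
from statistics import mean

# Hypothetical pretest scores, keyed by (treatment group, attrition status).
pretest = {
    ("program", "stayed"): [52, 55, 49, 58, 54],
    ("program", "dropped"): [41, 38, 44],
    ("comparison", "stayed"): [51, 53, 50, 56, 52],
    ("comparison", "dropped"): [50, 47, 52],
}

# Mean pretest score in each of the four cells of the 2 x 2 layout.
cell = {key: mean(scores) for key, scores in pretest.items()}

# Attrition gap within each group: stayers' mean minus dropouts' mean.
gap_program = cell[("program", "stayed")] - cell[("program", "dropped")]
gap_comparison = cell[("comparison", "stayed")] - cell[("comparison", "dropped")]

# Contrast behind the attrition main effect: do dropouts differ from stayers
# overall? A large value points to a possible external validity threat.
attrition_contrast = (gap_program + gap_comparison) / 2

# Contrast behind the group-by-attrition interaction: do dropouts differ from
# stayers in a different way in each group? A large value points to a
# possible internal validity threat.
interaction_contrast = gap_program - gap_comparison

print(f"attrition contrast:   {attrition_contrast:.2f}")
print(f"interaction contrast: {interaction_contrast:.2f}")
```

In this made-up data the dropouts from the program group scored much lower on the pretest than the dropouts from the comparison group, so the interaction contrast is large. An analysis of variance would then assess whether contrasts of this size exceed what chance would produce.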
Minimizing Threats to Validity
By Preventive Action
– When potential threats are anticipated they can often be ruled out
by some type of preventive action.
– For example, if the program is a desirable one, it is likely that the
comparison group would feel jealous or demoralized. Several
actions can be taken to minimize the effects of these attitudes
including offering the program to the comparison group upon
completion of the study or using program and comparison groups
which have little opportunity for contact and communication. In
addition, auditing methods and quality control can be used to
track potential experimental dropouts or to ensure the
standardization of measurement.
Design Construction
Basic Design Elements.
Program(s) or Treatment(s).
– The presumed cause may be a program or treatment under
the explicit control of the researcher or the occurrence of
some natural event or program not explicitly controlled.
– In design notation we usually depict a presumed cause with
the symbol “X”.
– When multiple programs or treatments are being studied
using the same design, we can keep the programs distinct by
using subscripts such as “X1” or “X2”.
– For a comparison group (i.e., one which does not receive the
program under study) no “X” is used.
Design Construction
Observation(s) or Measure(s).
– Measurements are typically depicted in design notation with
the symbol “O”.
– If the same measurement or observation is taken at every
point in time in a design, then this “O” will be sufficient.
– Similarly, if the same set of measures is given at every point
in time in this study, the “O” can be used to depict the entire
set of measures.
– However, if different measures are given at different times it
is useful to subscript the “O” to indicate which measurement
is being given at which point in time.
Design Construction
Groups or Individuals.
– The final design element consists of the intact groups or the
individuals who participate in various conditions.
– Typically, there will be one or more program and comparison groups.
– In design notation, each group is indicated on a separate line.
Furthermore, the manner in which groups are assigned to the
conditions can be indicated by an appropriate symbol at the
beginning of each line.
– Here, “R” will represent a group which was randomly assigned, “N”
will depict a group which was nonrandomly assigned (i.e., a
nonequivalent group or cohort) and a “C” will indicate that the group
was assigned using a cutoff score on a measurement.
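The three design elements above combine into a compact notation that is easy to generate mechanically. The following sketch is one possible way to do so; the helper name `render_design` and the choice of a randomized pretest-posttest design as the example are assumptions made for illustration.

```python
def render_design(groups):
    """Render design notation, one line per group.

    Each group is a pair (assignment, events). The assignment symbol is
    "R", "N", or "C". The events list is ordered in time and holds "O" for
    a measurement, "X" for the program, or None when nothing happens to
    that group at that point.
    """
    allowed = {"R", "N", "C"}
    lines = []
    for assignment, events in groups:
        if assignment not in allowed:
            raise ValueError(f"unknown assignment symbol: {assignment!r}")
        # A blank cell keeps the columns aligned when a group gets no "X".
        cells = [event if event is not None else " " for event in events]
        lines.append(" ".join([assignment] + cells).rstrip())
    return "\n".join(lines)


# Two randomly assigned groups, each measured before and after;
# only the first group receives the program.
pretest_posttest = [
    ("R", ["O", "X", "O"]),
    ("R", ["O", None, "O"]),
]
print(render_design(pretest_posttest))
```

Replacing "R" with "N" in both tuples gives the same layout for nonequivalent groups, and subscripted symbols such as "X1" or "O2" can be passed as event strings when programs or measures need to be kept distinct.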
The Nature of Good Design
Theory-Grounded.
– Good research strategies reflect the theories which are
being investigated.
– Where specific theoretical expectations can be
hypothesized these are incorporated into the design.
– For example, where theory predicts a specific
treatment effect on one measure but not on another,
the inclusion of both in the design improves
discriminant validity and demonstrates the predictive
power of the theory.
The Nature of Good Design
Situational.
– Good research designs reflect the settings of the
investigation.
– This was illustrated above where a particular need of
teachers and administrators was explicitly addressed in
the design strategy.
– Similarly, intergroup rivalry, demoralization, and
competition might be assessed through the use of
additional comparison groups who are not in direct
contact with the original group.
The Nature of Good Design
Feasible.
– Good designs can be implemented.
– The sequence and timing of events are carefully
thought out.
– Potential problems in measurement, adherence to
assignment, database construction and the like, are
anticipated.
– Where needed, additional groups or measurements are
included in the design to explicitly correct for such
problems.
The Nature of Good Design
Redundant.
– Good research designs have some flexibility built into
them.
– Often, this flexibility results from duplication of
essential design features.
– For example, multiple replications of a treatment help
to ensure that failure to implement the treatment in one
setting will not invalidate the entire study.
The Nature of Good Design
Efficient.
– Good designs strike a balance between redundancy
and the tendency to overdesign. Where it is
reasonable, other, less costly, strategies for ruling out
potential threats to validity are utilized.