Module 3 - Validity in Research Design

1. The document discusses criteria for establishing causal relationships in research: covariance, temporal precedence, and eliminating confounding variables. 2. The covariance criterion requires demonstrating that changes in the independent variable are correlated with changes in the dependent variable. 3. The temporal precedence criterion addresses the directionality problem by requiring the independent variable to change before the dependent variable, not vice versa. 4. Meeting all three criteria—covariance, temporal precedence, and eliminating confounds—allows researchers to reliably infer causal relationships between variables.

Uploaded by

Sanam Patel
Copyright: © All Rights Reserved

Module 3: Validity in Research Design

Learning Outcomes
- Understand the criteria for drawing causal inferences in research
- Apply the concepts of internal validity and external validity to assess the quality of
research designs
- Recognize common threats to internal validity in research designs
- Recognize possible confounds that might be relevant to inferences that you draw from
observations in everyday life

Principles of Causal Inference

- Causal inference: a judgement that changes in one variable cause changes in a second
variable
- Confound: an unmeasured third variable that could potentially be the true causal
influence that accounts for an observed association between two variables that the
researcher is studying
- Covariance: a pattern of association between 2 variables, showing that variance in one
variable predicts variance in the other variable
- Longitudinal study: a study in which the researcher measures the IV and DV repeatedly
across a period of time in the same sample to assess whether changes in the IV precede
and predict subsequent changes in the DV
- Temporal precedence: a pattern of association where changes in one variable reliably
precede and predict subsequent changes in a second variable
Common Hypotheses in Psychology
- One of the principal goals of psychological research is testing hypotheses about how
independent variables causally influence dependent variables
- Developmental psychologists test hypotheses about how different types of
socialization experiences shape a child’s personality or behaviour patterns
- Social psychologists test hypotheses about how social factors influence
people’s attitudes
- Cognitive psychologists test hypotheses about how different types of stimulus
presentations affect people’s perceptions or recall patterns
- Industrial psychologists test hypotheses about how different styles of
leadership influence employees’ workplace motivations
Hume and Causal Inferences
- Given that hypotheses about causal influences are the central focus of much
psychological research, it is important for the research community to agree on a set of
criteria that they can use to assess whether a causal influence exists between an IV and
a DV
- Assessing hypotheses about causality turns out to be a tricky problem because, as the
philosopher David Hume noted, causation is not something concrete that we can
observe directly using our senses
- Causation is something abstract that we need to infer based on the patterns that
we observe in the relations between variables
- Hume argued that this is true even in seemingly straightforward examples of physical
causation
- Observing one billiard ball colliding into a second billiard ball and seeming to
cause the second ball to move following the impact
- Hume argues that we do not directly see that the impact of the first ball causes
the second ball to move - rather, we see a sequence of motions of these objects
- We see the first ball move and strike the second ball and then see the
second ball move
- When we see this sequence of events we readily infer that the impact of
the first ball caused the second ball to move, but this judgment of
causation is still an inference, not something concrete that we can directly
observe in the same way that we can directly observe the balls
themselves and their motions through physical space

Criteria of Causal Influence


- Because causality needs to be inferred from the patterns that are observed
between an IV and DV, it is important for researchers to examine those patterns in a
systematic fashion
- This careful examination allows researchers to draw a reliable inference about whether
the hypothesized IV has an actual causal impact on the hypothesized DV
- Conditions of observation are what researchers set up so that they can
systematically test their hypotheses about causal relations between IVs and DVs
- Thomas Cook and Donald Campbell (1979) drew on this tradition of thought
(centuries-long tradition of philosophical reflection on the principles that justify causal
inferences), especially from John Stuart Mill, David Hume, and other empiricist
philosophers, in order to propose 3 criteria that researchers must meet in order to
establish a causal relation between an IV and a DV
- Cook and Campbell (1979) proposed the following criteria:
1. The covariance criterion: the researcher must demonstrate that the IV and the
DV are reliably associated (i.e. correlated) with each other
2. The temporal precedence criterion: the researcher must demonstrate that
changes in the level of the IV precede changes in the level of the DV
3. The elimination of confounds criterion: the researcher must demonstrate that
the relation between the IV and the DV cannot be attributed to their shared
association with some third variable

1. Covariance Criterion
- As their first criterion, Cook and Campbell (1979) explain that researchers must
demonstrate that the IV and the DV are reliably associated with one another
- This means researchers must provide evidence that variance in the IV is
associated (i.e. correlated) with variance in the DV - when they’re associated this
means that knowing an individual’s value on the IV allows you to predict (better
than chance) the likely value that this same individual will show on the DV
(measures of association including correlation coefficient for continuous
measures and contingency tables for discrete variables)
- Recall: association between two variables could be a positive association in
which high scores on the IV tend to predict high scores on the DV and low scores
on the IV predict low scores on the DV
- Ex. a positive association between number of hours students reported
studying for an exam and their scores on that exam would mean that
students who reported more hours of studying tended to score higher on
the exam and those who reported fewer hours studying tended to score
lower on the exam
- Recall: if the association between two variables is negative then high scores on
the IV predict low scores on the DV and low scores on the IV predict high scores
on the DV
- Ex. a negative association between the amount of alcohol students
reported consuming the night before an exam and their exam score would
mean that students who reported consuming more alcohol tended to
score lower on the exam and those who reported drinking less alcohol
tended to score higher on the exam
- So, demonstrating that the IV and DV are associated, the covariance criterion, is a first
step in establishing that the IV may have a causal influence on the DV
- However, establishing an association between an IV and a DV is insufficient evidence to
demonstrate that there is a causal influence of the IV on the DV because of 2 critical
problems:
- The directionality problem (an association is equally consistent with X causing Y
and with Y causing X) and the third-variable problem
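
The covariance criterion above can be illustrated with a short calculation. The following is a minimal sketch in Python, not part of the original module: the data for six students are invented for illustration, and `pearson_r` is a hypothetical helper written here to compute the correlation coefficient for continuous measures.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six students (invented numbers, not from any study).
hours_studied = [1, 2, 4, 5, 7, 9]
drinks_night_before = [6, 5, 4, 2, 1, 0]
exam_score = [52, 58, 65, 70, 81, 90]

r_study = pearson_r(hours_studied, exam_score)
r_alcohol = pearson_r(drinks_night_before, exam_score)

# A positive r means high IV scores go with high DV scores;
# a negative r means high IV scores go with low DV scores.
print(f"hours studied vs. exam score: r = {r_study:+.2f}")
print(f"drinks vs. exam score:        r = {r_alcohol:+.2f}")
```

Both coefficients only show that the variables covary. Neither says which variable influences the other, or whether some third variable drives both, which is why the remaining two criteria are needed.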
2. Temporal Precedence Criterion
- Cook and Campbell’s second criterion for demonstrating a causal influence of the IV on
the DV addresses the directionality problem
- Directionality problem: arises because either one of the two variables that are correlated
with each other could often be plausibly interpreted to be the independent variable that is
causally responsible for their linkage
- when there is an association or correlation between two variables – Variables A
and B - this could be because Variable A causes some effect on Variable B or it
could be because Variable B causes some effect on Variable A
- To determine whether it is more plausible that Variable A causes a change in Variable B
or that Variable B causes a change in Variable A a researcher will investigate the
timing of the changes in each of these variables in relation to each other
- This involves demonstrating that changes in the level of the hypothesized IV precede
(occur before) changes in the level of the hypothesized DV, rather than vice versa
- Usually involves measuring spontaneously occurring changes in the
hypothesized IV or using some procedure to induce a change in the IV
and then measuring whether these changes in the IV predict subsequent,
downstream changes in the DV
- Establishing that changes in the IV come first and changes in the DV come later helps to
demonstrate that the causal direction runs from the IV to the DV
- This temporal precedence criterion capitalizes on the fact that time is linear and
directional, and thus a cause cannot plausibly occur after its effect. So, if changes in one
variable precede changes in another variable then the variable that changed first is more
plausibly interpreted as the cause and the variable that changed second is more
plausibly interpreted as the effect or outcome
- Researchers can establish temporal precedence in a variety of ways:

- Approach 1: a longitudinal study in which researchers measure the IV and DV repeatedly
across a period of time in the same sample
- The data can then be examined to assess whether changes in the hypothesized IV tend
to precede and predict subsequent changes in the hypothesized DV, whereas changes in
the hypothesized DV do not tend to predict subsequent changes in the IV

- Approach 2: experimental studies that actively manipulate the IV
- In an experiment, researchers induce a change in the level of the IV in a sample of
participants and then measure whether a subsequent change in the level of the DV results
- In this approach the IV is known to have temporal precedence because the researcher
manipulated it first and then measured the change in the DV afterwards
- Addressing the directionality problem is also insufficient to establish a causal relation
between variables because there is still the third-variable problem
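
The longitudinal approach to temporal precedence described above can be sketched as a comparison of two lagged correlations. This is a simplified illustration under stated assumptions, not a method given in the module: the data are simulated, the data-generating process (IV at time 1 influences DV at time 2, but not the reverse) is assumed, and `pearson_r` is a hypothetical helper. Real longitudinal analyses usually also control for earlier levels of each variable.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 500                 # simulated participants measured at two time points

# Time 1 measures of the hypothesized IV and DV (simulated, unrelated at time 1).
iv_t1 = [rng.gauss(0, 1) for _ in range(n)]
dv_t1 = [rng.gauss(0, 1) for _ in range(n)]

# Assumed process: the IV at time 1 shapes the DV at time 2,
# while the DV at time 1 has no influence on the IV at time 2.
iv_t2 = [x + rng.gauss(0, 0.5) for x in iv_t1]
dv_t2 = [0.8 * x + rng.gauss(0, 0.5) for x in iv_t1]

r_iv_then_dv = pearson_r(iv_t1, dv_t2)  # does the earlier IV predict the later DV?
r_dv_then_iv = pearson_r(dv_t1, iv_t2)  # does the earlier DV predict the later IV?

print(f"IV(t1) -> DV(t2): r = {r_iv_then_dv:+.2f}")
print(f"DV(t1) -> IV(t2): r = {r_dv_then_iv:+.2f}")
```

When the first correlation is clearly larger than the second, the pattern is consistent with the causal direction running from the IV to the DV. It still does not rule out a third variable that influences both.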

3. Elimination of Confounds Criterion
- A third variable that is not ruled out in a research design is referred to as a confound
in that research design. Researchers try to set up procedures that will allow them to rule
out potential confounds by isolating the hypothesized IV from any potential third variables
- To the extent that confounds are systematically ruled out in the research design, any
association between the hypothesized IV and DV can be attributed to the causal impact of
the IV on the DV and not the causal impact of some third variable
- A confound is a factor that is associated with the IV and that could potentially exert its
own independent causal influence on the DV. Some factors that can act as confounds in
study designs include:
- Preexisting individual differences that are correlated with the IV
- Uncontrolled or unmeasured changes in other potential causal variables that occur
during the course of the study
- Self-conscious reactions that participants have to features of the research design when
they are aware that their behaviour is being studied
- Recognizing the confounds that may need to be addressed requires extensive knowledge
of the common types of confounds that arise in research designs
Internal and External Validity

- Baseline measures: measures of a dependent variable that occur before some naturally
occurring or experimentally induced change in the IV takes place. Baseline measures
thus are used to establish a reference point for assessing the magnitude of any changes
in the DV that occur in response to the change in the IV
- External Validity: Assesses the degree to which a study's research findings will be able
to be generalized beyond the specific sample and setting in which it was tested.
- Internal Validity: The quality of the evidence that a research design can provide to test
whether a hypothesized IV has an actual causal influence on a hypothesized DV.
- Post-treatment phase of study: Phase of a study that occurs after some naturally
occurring or experimentally induced change in the IV takes place. During this phase
measures of the relevant variables are assessed again to compare them with the
baseline measures to detect any changes that occurred in reaction to the treatment
phase events.
- Pre-treatment phase of study: Phase of a study in which baseline measures of relevant
variables are assessed to establish a baseline for evaluating subsequent changes in
these variables.
- Single group pre-test post-test design: A research design in which researchers measure
levels of the DV in a single sample of participants both before and again after they
expose that sample to some planned intervention that is designed to induce some
change in the level of the IV within that sample.
- Treatment phase of study: Phase of a study when some naturally occurring or
experimentally induced change in the IV takes place.
Internal Validity
- Donald Campbell provided useful concepts for assessing the relative quality of research
designs for testing hypotheses about the causal influence of IVs on DVs
- In a monograph that was co-authored with Julian Stanley, Campbell introduced the
concepts of internal validity and external validity
- When a research design has weak internal validity the nature of the relation between
the IV and the DV is ambiguous and open to alternative interpretations
- When a research design has strong internal validity the researcher will be able to
confidently draw conclusions about whether or not the IV has the hypothesized causal
influence on the DV
- An example of a study with strong internal validity is Ivan Pavlov’s famous classical
conditioning experiment with dogs
- Pavlov showed that the repeated pairing of the sound with a food stimulus (IV)
led to the development of a conditional salivation response (DV) to the sound on
its own
- It has strong internal validity because Pavlov showed that dogs only salivate to
the sound of a metronome after they have been exposed to the experience of this
sound being associated with delivery of food
- This experience of this association is the most plausible explanation of
why Pavlov observed the dogs salivating to the sound following their
exposure to the conditioning trials - there are not any plausible alternative
explanations for why the dogs would salivate to the metronome’s sound
- Research designs that address the directionality problem by establishing temporal
precedence of the IV's effect on the DV and that effectively address the third-variable
problem by ruling out relevant potential confounds are considered to have strong internal
validity
- Studies that only establish that an association exists between two variables, or that only
establish temporal precedence but which fail to rule out relevant confounds are
considered to be weak in internal validity
External Validity
- External validity is critical because researchers seek to discover causal influences that
are important and relevant to a potentially wide variety of contexts
- It would thus not be satisfying to establish that an IV has a causal influence on a
DV if that causal influence only emerges under a very limited and constrained set
of circumstances
- For this reason, researchers must not only establish that a causal influence exists, but
also that it is generalizable beyond the specific circumstances in which it is
observed
- Weak external validity: findings can only be generalized to a fairly narrow population or
a very limited set of circumstances
- High external validity: findings can be generalized to a broad population and a variety of
contexts and settings - researchers can increase a study’s external validity by using
representative sampling methods and representative research designs

- Campbell and Stanley (1963) emphasized that internal validity is a more fundamental
standard for assessing the quality of a research design because a causal effect first
needs to be convincingly demonstrated before it can be generalized to a broader
population and other contexts
Think and Respond

- The hypothetical anti-bullying intervention that was just described represents what is
referred to as a single group pre-test post-test design
- There are a number of fatal flaws that undermine the internal validity of studies that
utilize the single group pre-test post-test design

In the remainder of the module, we will cover seven common confounds that undermine the
validity of studies that utilize the single group pre-test post-test design and we will
illustrate how these confounds might have been present in the hypothetical example of the
anti-bullying intervention that we just described. In discussing each of these seven
confounds we will refer back to features of the anti-bullying intervention example to
illustrate how the single group pre-test post-test design exhibits each of the threats to
internal validity.

Confound 1: Maturation Effects

- Maturation confound: A psychological change in the research participants that occurs
spontaneously with the passage of time and that has the potential to provide an
alternative explanation of any observed changes in the DV
- Single group pre-test post-test design: a research design in which researchers measure
levels of the DV in a single sample of participants both before and again after they
expose that sample to some planned intervention that is designed to induce some
change in the level of the IV within that sample
Changes Over Time Impact Research
- One of the trickiest problems with studying human beings and other complex organisms
is that they are constantly developing and adapting
- Studying their behaviour patterns and psychological characteristics can be like
trying to track a moving target
- If there's a long span of time between the beginning and the end of the study, or if
the study takes place during a period of the lifespan in which there is a high rate
of psychological change, research participants are likely to change in ways that
researchers may have little or no control over
- Maturation confounds have the potential to provide an alternative explanation of any
observed changes in the DV
- Maturation confounds are potentially a serious problem in studies that utilize a single
group pre-test post-test design
- Ex. the anti-bullying intervention: if any maturational changes occur within the
participants during the course of the study, they could provide an alternative explanation
of any changes that the researcher observes in the levels of the dependent measure from
the pre-test phase to the post-test phase

Think and Respond

- In this example, there are potentially numerous psychological changes that the students
may have experienced between the 4th and 6th grades that might explain why rates of
bullying were lower in 6th grade than in 4th grade. For one thing, it's well-known that
children's ability to exercise self-control increases with age. To the extent that bullying
involves failure to control aggressive impulses, 6th graders might be expected to engage
in less bullying than 4th graders simply because they are older and have more inherent
capacity to exercise self-control. If so, then we might have seen the rates of bullying
decrease from the 4th to the 6th grades even if no anti-bullying intervention had been
implemented in these schools. This is just one example of potential maturational
changes that might provide an alternative explanation of the results observed in this
study.
- The maturation confound is one of the most serious threats to the internal validity of
studies that utilize a single group pre-test post-test design
- In this type of design, any changes from pre-test to post-test might be due to
maturational changes that this research design fails to account for
- The plausibility of maturational confounds is heightened when there is more
elapsed time between the pre-test and post-test measures or when the study
occurs during periods when maturational change is heightened
- However, even in short-term studies there is potential for psychological
changes that may not be accounted for and that might be a third variable
that causes the observed changes in the DV from the pre-test to the
post-test phase
Addressing Maturation Confound in Research
- To rule out maturation confound researchers should replace the single-group pre-test
post-test design with a two group pre-test post-test design in which they assign the
manipulated exposure to the independent variable to just one of these groups but not to
the other
- Treatment group: group that receives manipulated exposure to the IV
- Comparison group: group that does not receive the manipulated exposure to the IV
- In this design, any spontaneously occurring maturational changes should be held
constant (i.e. experienced to the same extent) across the two groups
- Thus, if a greater change in the level of the DV from pre-test to post-test is observed in
the treatment group than is observed in the comparison group this change could not be
attributed to maturation because maturation effects should be experienced similarly by
both of these groups
- The presence of the comparison group thus allows the researcher to assess whether
any observed changes are due to maturation alone or due to some extra effectiveness of
their manipulated exposure to the IV
- This alternative design follows the logic of controlled comparison
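
The logic of the two group pre-test post-test design can be worked through with simple arithmetic. The sketch below is illustrative only: the bullying-incident counts are invented, the two groups are assumed to be comparable and equally exposed to shared influences such as maturation, and `mean_change` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical bullying-incident counts per classroom (invented numbers).
treatment_pre = [12, 15, 11, 14]
treatment_post = [6, 8, 5, 7]
comparison_pre = [13, 14, 12, 15]
comparison_post = [10, 11, 9, 12]


def mean_change(pre, post):
    """Average post-test minus pre-test change across paired observations."""
    return mean(after - before for before, after in zip(pre, post))


# What a single group pre-test post-test design would report on its own.
treatment_change = mean_change(treatment_pre, treatment_post)

# Change that occurred without the intervention (e.g. through maturation).
comparison_change = mean_change(comparison_pre, comparison_post)

# Change in the treatment group beyond what both groups experienced.
change_beyond_shared = treatment_change - comparison_change

print(f"treatment group change:  {treatment_change}")
print(f"comparison group change: {comparison_change}")
print(f"change beyond shared influences: {change_beyond_shared}")
```

In these made-up numbers, part of the drop in the treatment group also appears in the comparison group, so attributing the whole drop to the intervention would overstate its effect. Only the difference between the two changes is a candidate for the effect of the IV.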

Confound 2: History Effects

- History Confound: any incident that research participants are exposed to that the
researchers do not have control over and that has the potential to provide an alternative
explanation of any observed changes in the DV
The Influence of Outside Experience on Research
- Another problem that arises in psychological research is the possibility that participants
might be exposed to events that are beyond the control of the researchers and which
could affect their responses to the measure of the dependent variable
- Researchers usually try to carefully control what participants are exposed to within the
confines of the research environment itself - however, if research participants are
exposed to some outside influence in between the pre-test measure and the post-test
measure, then any changes in the post-test measure might be due to the influence of
these external events rather than due to any effect of their exposure to the independent
variable
- In research methodology we refer to external influences that the researcher does
not have control over as history confounds
- While the term “history” may seem to imply that these influences need to be historically
significant incidents, in actuality it is intended to refer to any kind of external incident -
whether great or small - that participants might be exposed to that could potentially affect
their responses to the DV
- A history confound could entail a major national or international incident that
receives widespread attention and that changes how people interpret or respond
to the dependent measures
- Ex. natural disaster that directly or indirectly impacts individuals in the
sample
- Also may refer to a very local, low-profile incident that occurs in the participants’
own immediate community and that has an impact on the sample of participants
being observed in the study even though it might not be newsworthy in a broader
sense
- Ex. construction leading to significant traffic on a main route to the
research lab, causing many participants to arrive at the research lab
feeling irritated
- History confounds are another serious threat to the validity of studies that utilize a single
group pre-test post-test design such as the hypothetical example of the anti-bullying
intervention that we are considering
- Any external incidents that research participants might be exposed to during the
course of the study could provide an alternative explanation of any changes that
the researcher observes in the levels of the dependent measure from the pre-test
phase to the post-test phase
Think and Respond

- There are many possible historical confounds that may arise here. One possible
historical factor that could impact the study results would be an incident of extreme
cyberbullying, which creates a significant scandal in the students' local community. An
incident of this nature would raise the profile of bullying and generate widespread
outrage about bullying among the students and other members of the community. The
increased concern about bullying triggered by this incident may lead to a heightened
policing of bullying in the schools that depresses the levels of bullying observed. This
incident and the reactions that it likely triggers would provide an obvious alternative
explanation for a dramatic reduction in the levels of bullying that the researchers
observed in the 6th grade compared to the 4th grade for this cohort of students. The
observed reduction in bullying thus could plausibly be attributed to the impact of this
historical incident rather than to the effectiveness of the anti-bullying intervention that the
researchers were testing. This is just one example of a possible external incident that
might contaminate the study and provide an alternative explanation of the results
observed in this study.
Addressing History Confound in Research
- Just like the maturation confound, researchers can try to rule out the history confound by
replacing the single group pre-test post-test design with a two group pre-test post-test
design.
- They assign the manipulated exposure to the independent variable just to the
treatment group but not the comparison group
- In this design, exposure to uncontrolled external events should be a constant that is
experienced to the same extent by both the treatment group and the comparison group
- Thus, if a greater change in the level of the DV from pre-test to post-test is observed in
the treatment group than is observed in the comparison group this change could not be
attributed to history effects because history effects should be experienced similarly by
both of these groups
- The presence of the comparison group thus allows the researcher to assess whether
any observed changes are due to history effects alone or due to some extra
effectiveness of their manipulated exposure to the IV

Confound 3: Repeated Testing Effects

- Repeated testing effects: Psychological changes that participants experience due to their
repeated exposure to measures of the DV that may provide an alternative explanation of
any changes in the levels of the DV from pre-test to post-test.
The Influence of Repeated Experience on Research
- When the same measures are assessed repeatedly in a study, such as in the pre-test
and post-test phases of the design that we’ve been considering, then this can introduce
another potential confound referred to as repeated testing effects
- This confound involves psychological changes that participants experience due to their
repeated exposure to measures of the DV that may provide an alternative explanation of
any changes in the levels of the DV from pre-test to post-test
- When participants are exposed to a measure repeatedly their responses to the later
administrations of that measure may differ from their responses to earlier
administrations of that measure
- Here are some common examples of how this confound may have influence over DVs:
- Psychological changes such as fatigue or boredom may influence the DV
causing changes that the researcher may interpret as an effect of the IV that the
research is testing
- If the repeated measure involves some kind of skilled performance such as an
ability test (ex. a test of working memory or verbal fluency) or a reaction time
measure, then the opportunity to repeatedly practice that skill by taking that
measure more than once may cause skill-related learning. This learning could
provide an alternative explanation of any changes in the participants’ levels of
performance from the earlier administrations of the measure to the later
administrations of that same measure
- Repeated exposure to a measure may provide an opportunity for participants to
experience insights that might change their view of themselves or their views of the
phenomenon that the measure is tapping into, and these insights due to repeated
testing may provide an alternative explanation of any changes that are observed in
participants’ responses to this dependent measure from the earlier to the later
administrations of that measure
- Anti-bullying intervention: suppose that researchers asked
participants what strategies they would use to confront bullies in both pre-treatment and
post-treatment phases of the study
- Any change in their answers might not be due to the effectiveness of the intervention
- Another explanation would be that participants are no longer naive to the question the
second time, so they have an answer ready because they thought about it after the
pre-test
Addressing Repeated Testing Confound in Research
- The risk of repeated testing confounds can often be mitigated by increasing
the length of time between the pre-test and post-test measures
- However, increasing the amount of time between these measures can
exacerbate other confounds (maturation or history effects)
- Researchers can rule out confounds due to repeated testing effects by replacing the
single-group pre-test post-test design with a two group pre-test post-test design in which
they assign the manipulated exposure to the independent variable to just the treatment
group but not to the comparison group
- In this design, repeated exposure to the same test of the DV would be a constant that is
experienced to the same extent by both the treatment and the comparison group
- Thus, if a greater change in the level of the DV from pre-test to post-test is observed in
the treatment group than is observed in the comparison group this change could not be
attributed to repeated testing effects because repeated testing was experienced similarly
by both of these groups
- The presence of the comparison group thus allows the researcher to assess whether
any observed changes are due to repeated testing effects alone or due to the actual
effectiveness of their manipulated exposure to the IV

Confound 4: Changes in Measurement

- Intentional changes in measurement: intentional differences in how the researcher
measures the DV in the pre-test phase versus the post-test phase of a study
- Unintentional changes in measurement: unintentional differences in how the researcher
measures the DV in the pre-test phase versus the post-test phase of a study
Intentional Changes in Measurement
- Researchers sometimes seek to avoid confounds that might arise from repeated testing
by changing their operationalization of the dependent measure from the pre-test phase
to the post-test phase
- For example, if the measure involves a skilled performance then alternative forms of
this measure might be administered in the pre-test and post-test, or if the measure
assesses attitudes or personality traits then different response items might be used to
assess these characteristics during the pre-test and post-test phases
Unintentional Changes in Measurement
- Changes in measurement might emerge unintentionally due to changes in how the
measures are administered or scored by the research personnel
- Examples
- if the measures involve research assistants who observe and assess
participants' behaviours then these observers' sensitivity for detecting
relevant behaviours may change as they gain more experience observing
those behaviours

- Changes in measurement thus introduce confounds when over the course of the study
the measures that are used to record levels of the DV change and/or there are changes
in observers' sensitivity for detecting varying levels of the DV
Addressing Changes in Measurement Confound in Research
- Researchers can try to rule out confounds due to changes in measurement by replacing
the single-group pre-test post-test design with a two group pre-test post-test design in
which they assign the manipulated exposure to the independent variable to just the
treatment group but not to the comparison group
- In this modified design, any changes in measurement should be a constant that is
experienced to the same extent by both the treatment and the comparison group
- However, the changed level of the IV would be present for the treatment group but
absent for the comparison group. Thus, if a greater change in the level of the DV from
pre-test to post-test is observed in the treatment group than is observed in the
comparison group this change could not be attributed to measurement changes because
measurement changes should be experienced similarly by both of these groups. The
presence of the comparison group thus allows the researcher to assess whether any
observed changes are due to measurement changes alone or due to the effectiveness of
their manipulated exposure to the IV.
- Ex. you look at a place differently with a changed set of perceptual lenses
- If you last visited the place when you were a child and now you're a fully grown
adult then structures that seemed huge from a child's vantage point may seem
normal-sized from an adult's vantage point

Confound 5: Self-selection

- Self-selection confound: Confound that arises whenever research participants choose
their own levels of exposure to the IV and thus any pre-existing distinctive characteristics
of the participants who chose to expose themselves to the IV might explain any
observed changes in the DV
The Problem of Allowing Participants to Choose their Group
- Anti-bullying intervention: Recall that in this example only a subset of the B.C. schools
volunteered to participate in the study. This raises a concern about whether the schools
that volunteered to participate had characteristic qualities that might predispose them to
show the changes that the researchers attributed to the effects of their intervention
- Perhaps this indicates that these schools had an actively engaged administration that
was staffed by individuals who are effective problem solvers. If so then these schools
might have managed to lower their bullying problems through the attention and efforts of
their engaged administrators and not through changes that were caused by the
intervention itself. In other words, perhaps schools with engaged, attentive
administrators would have shown a reduction in bullying problems over time even if the
intervention was never conducted. This problem is referred to as a self-selection
confound
- Self-selection creates problems with interpreting the effects of the IV because the IV is
confounded with whatever individual differences led participants to select into their
exposure to the IV
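
A rough simulation can make the self-selection problem concrete. Everything in this sketch is an assumption made for illustration: the engagement score, the rule that only highly engaged schools volunteer, and the size of the natural change are all hypothetical, and the intervention is deliberately given no effect at all.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

N_SCHOOLS = 2000
INTERVENTION_EFFECT = 0.0  # assumed: the intervention does nothing at all

volunteer_changes = []
other_changes = []
for _ in range(N_SCHOOLS):
    engagement = random.random()  # hypothetical administrator engagement, 0 to 1
    # Assumed: engaged administrations reduce bullying on their own.
    natural_change = -10 * engagement + random.gauss(0, 2)
    if engagement > 0.7:  # assumed: only highly engaged schools volunteer
        volunteer_changes.append(natural_change + INTERVENTION_EFFECT)
    else:
        other_changes.append(natural_change)

volunteer_change = mean(volunteer_changes)
other_change = mean(other_changes)
print(volunteer_change, other_change)
```

Under these assumptions the volunteering schools show a larger drop in bullying than the other schools even though the intervention contributes nothing, which is the pattern a self-selection confound would produce.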
Addressing Self-selection Confound in Research
- We will discuss random assignment and its utility for ruling out confounds such as
self-selection in detail when we cover the logic of experimental design in a future module

Confound 6: Regression to the Mean

- Random measurement error (or noise): Random, unpredictable discrepancies between
the measured value and the true value.
- Regression towards the mean: A reliable statistical phenomenon wherein if an individual
obtains an exceptional score (i.e., far above or far below the mean) on one measure they are
likely to have a score that is closer to the mean on a second, correlated measure.
Regression to the mean is a reliable statistical pattern that is driven just by the influence
of random, chance factors on performance and measurement. However, because many
people are not aware that this pattern is explained by random chance factors they tend
to invent causal explanations when they observe sequences of outcomes where
exceptional observations on one measure regress to the mean on a second measure.
Extreme Scores Tend to Shift Towards the Mean
- Suppose that instead of recruiting the sample through voluntary self-selection, the
researchers recruited schools for the intervention in a different way. Suppose that after
they measured bullying in grade 4 during the pre-treatment phase in all B.C. schools
they then selected the schools that showed the highest levels of reported bullying
incidents in those pre-treatment scores and they implemented the intervention during the
5th grade just in those schools that had the highest levels of bullying
- This might seem sensible because due to resource limitations we often aim to
focus resources where they are most needed. Why bother expending staff and
other resources to implement an anti-bullying curriculum in schools that are
already showing relatively low levels of bullying? Doesn't it seem sensible to test
the intervention in the schools that need it the most — where bullying is higher
than average?
- There may seem to be sensible justifications for testing the intervention in the schools
that show above-average levels of bullying on the pre-test measures and while this may
avoid the self-selection confound that we previously reviewed, this approach would
actually add a new, equally problematic kind of confound, referred to as regression
towards the mean
- Regression to the mean is a reliable statistical phenomenon that, in this case,
would indicate that schools that were above-average in their levels of bullying on
the pre-test should tend to move closer to the average level of bullying on the
post-test measure, just due to chance factors and not necessarily due to a valid
effect of the intervention
- This is because extreme scores will tend to move closer to the average score on the
post-test measure simply due to chance
- Regression to the mean occurs because two factors contribute to an individual's score
on any given measure:
- the individual's true score, which the measure is trying to capture, and
- random measurement error, which captures randomly occurring fluctuations in
measurement due to imperfections in the measurement process, randomly
occurring situational influences that affect the observed value (i.e. score) on the
measure, and other chance-related factors
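
The two-factor account above (observed score = true score + random measurement error) can be sketched as a simulation. The numbers here are assumptions chosen for illustration: 2000 hypothetical schools, true scores centred on 50, and equal amounts of noise on both tests. No intervention is simulated, so any shift in the selected schools is regression to the mean alone.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

N_SCHOOLS = 2000
# Assumed model: observed score = true score + random measurement error.
true_scores = [random.gauss(50, 10) for _ in range(N_SCHOOLS)]
pre = [t + random.gauss(0, 10) for t in true_scores]
post = [t + random.gauss(0, 10) for t in true_scores]  # no intervention at all

# Select the 10% of schools with the highest pre-test scores.
cutoff = sorted(pre)[int(0.9 * N_SCHOOLS)]
selected = [i for i in range(N_SCHOOLS) if pre[i] >= cutoff]

selected_pre_mean = mean(pre[i] for i in selected)
selected_post_mean = mean(post[i] for i in selected)
overall_pre_mean = mean(pre)
print(selected_pre_mean, selected_post_mean, overall_pre_mean)
```

In this sketch the selected schools score lower on the post-test than on the pre-test while remaining above the overall average. Their true scores did not change. The schools were partly selected for unusually large positive errors on the pre-test, and those errors are not repeated on the post-test.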

Think and Respond

- This pattern is a classic case of potential regression to the mean. We tend to punish
people when their behaviour is exceptionally bad and we tend to reward people when
their behaviour is exceptionally good. By definition exceptional behaviour is discrepant
from the normal, average expectations for people's behaviour. In other words,
exceptional behaviour is improbable. So, the principle of regression to the mean
indicates that we should expect that just by chance exceptionally negative behaviour will
regress in a positive direction towards the mean of normal behaviour and exceptionally
positive behaviour will regress in a negative direction towards the mean of normal
behaviour. This is because a person's behaviour is a function of their own dispositions as
well as random factors (e.g., a fleeting mood) that might sway it in a positive or negative
direction.
- Thus, regression to the mean may make it appear that punishment works to make
people behave less negatively but that rewards backfire by making people behave less
positively. In fact, careful experimental research that rules out the regression confound
shows that rewards tend to be more effective as reinforcers than punishments tend to
be. So, the appearance that punishment is effective while rewards backfire is very likely
a consequence of failing to recognize the principle of regression to the mean.

- Achieving success with a creative work such as a movie, album, or novel is a rare
achievement. Because of the sheer volume of artistic products that are produced it is an
exceptional achievement to break through to achieve popular success. Some of this
success is likely due to genuine artistic qualities in the works that achieve success.
However, some of the success is likely due to luck, e.g., getting noticed by the right
people at the right time. To the extent that some of the initial success was due to luck, it
is unlikely that the same artist will experience the same lucky circumstances in their
subsequent work. So, just due to chance an artist who experienced a breakthrough
achievement is likely to be somewhat less successful in their subsequent work.
Addressing Regression to the Mean Confound in Research
- The risk of regression to the mean confound can be avoided by not selecting only
participants with extreme high or low values on the pre-test measures for inclusion in the
study
- Since extreme scores tend to regress to the mean based on chance then the inclusion of
participants with the full range of pre-test scores will allow researchers to distinguish
changes that may have been caused by the IV from changes that merely reflect
regression to the mean
- If for pragmatic reasons the researchers need to focus their study on participants who
exhibited extreme scores on the pre-test measures, then researchers can try to rule out
confounds due to regression to the mean by replacing the single-group pre-test post-test
design with a two group pre-test post-test design in which they assign the manipulated
exposure to the independent variable to just the treatment group but not to the
comparison group
- In this design, regression to the mean would be a constant that is experienced to
the same extent by both the treatment and the comparison group. However, the
changed level of the IV would be present for the treatment group but absent for
the comparison group. Thus, if a greater change in the level of the DV from
pre-test to post-test is observed in the treatment group than is observed in the
comparison group this change could not be attributed solely to regression to the
mean because regression to the mean should be exhibited similarly by both of
these groups. The presence of the comparison group thus allows the researcher
to assess whether any observed changes are due to regression to the mean
alone or due to the actual effectiveness of their manipulated exposure to the IV
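
The two-group remedy for extreme scorers can also be sketched as a simulation. All of the values are assumptions for illustration: the number of schools, the noise levels, a hypothetical true effect of -5, and a random split of the extreme schools into the two groups (the module covers assignment procedures in a later section).

```python
import random
from statistics import mean

random.seed(3)  # fixed seed so the sketch is repeatable

N_SCHOOLS = 20000
TRUE_EFFECT = -5.0  # assumed reduction in bullying caused by the intervention

true_scores = [random.gauss(50, 10) for _ in range(N_SCHOOLS)]
pre = [t + random.gauss(0, 10) for t in true_scores]

# Keep only the 10% of schools with the highest pre-test scores.
cutoff = sorted(pre)[int(0.9 * N_SCHOOLS)]
extreme = [i for i in range(N_SCHOOLS) if pre[i] >= cutoff]

# Split the extreme schools at random into treatment and comparison groups.
random.shuffle(extreme)
half = len(extreme) // 2
treatment, comparison = extreme[:half], extreme[half:]

def post_score(i, treated):
    """Post-test score: true score, plus the effect if treated, plus new noise."""
    effect = TRUE_EFFECT if treated else 0.0
    return true_scores[i] + effect + random.gauss(0, 10)

treatment_change = mean(post_score(i, True) - pre[i] for i in treatment)
comparison_change = mean(post_score(i, False) - pre[i] for i in comparison)
effect_estimate = treatment_change - comparison_change
print(treatment_change, comparison_change, effect_estimate)
```

Both groups drop from pre-test to post-test because both regress to the mean. Only the difference between the two changes approximates the assumed effect, which is why the treatment group's change on its own would overstate the intervention's impact.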

Confound 7: Selective Attrition

- Selective attrition: When individuals choose to withdraw from the study prior to the
administration of the final measures of the relevant study variables
Individual Differences in Who Chooses to Complete a Study
- While self-selection into treatments of the independent variable introduces individual
difference confounds into research designs at the start of a study, another problem
referred to as selective attrition introduces individual difference confounds into the
design in the middle or end of the study
- To the extent that the individuals who choose to complete the study differ in systematic
ways from those who choose to exit the study early, those individual differences
could provide an alternative explanation of any observed effects that emerge on the DV
- A number of individual differences might influence someone's decision whether or not to
continue in a study through to the end
- Ex. personality traits may affect this decision
- Highly conscientious individuals may be more strongly motivated to follow through on
what they perceive as a commitment to the researchers whereas less conscientious
individuals may be less reluctant to quit the study early if other priorities arise. In addition
to personality differences, differences in participants' life circumstances might affect their
decisions about whether to continue through to the end of a study. Individuals who have
less control over their free time because of work or family obligations may be forced to
discontinue their participation if these other obligations raise urgent demands whereas
individuals who have fewer obligations of this sort may have more freedom to remain in
the study
- As was the case with self-selection, selective attrition can be a problem not only in
studies that use the single group pre-test post-test design but potentially also in studies
that compare a treatment group that was exposed to an IV to a comparison group that
was not exposed to the IV if the attrition from one of these conditions was greater than
attrition from the other condition
- Ex. if a treatment condition involves a lot of demanding work whereas the
comparison condition is mostly a relaxing experience, then people might be more
tempted to drop out of the treatment condition than out of the comparison condition
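
A small simulation can show how selective attrition alone can create an apparent change. The setup is hypothetical: every participant's score is assumed to be perfectly stable, and participants with higher scores are assumed to be more likely to stay in the study. Neither assumption comes from the module.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is repeatable

N = 5000
# Assumed: each participant's DV score is stable, so nobody truly changes.
pre = [random.gauss(50, 10) for _ in range(N)]
post = list(pre)

# Assumed: participants with higher scores are more likely to stay to the end.
completed = [score + random.gauss(0, 10) > 45 for score in pre]

all_pre_mean = mean(pre)
completers_pre_mean = mean(p for p, c in zip(pre, completed) if c)
completers_post_mean = mean(p for p, c in zip(post, completed) if c)

# Comparing everyone's pre-test with only the completers' post-test is biased.
naive_change = completers_post_mean - all_pre_mean
# Comparing the completers with themselves shows no change at all.
within_completers_change = completers_post_mean - completers_pre_mean
print(naive_change, within_completers_change)
```

The naive comparison shows an increase even though no individual changed, because the people who remained differ systematically from those who left.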
Addressing Selective Attrition Confound in Research
- If attrition differences emerge then researchers may need to modify their research
procedures to try to eliminate any features of the conditions that may have led to
different levels of attrition and then replicate the study with the hope that attrition
differences will not recur. We will discuss the problem of selective attrition in more detail
in a future module when we cover confounds that could potentially threaten internal
validity in experimental designs

Summary

- Psychological research aims to test hypotheses about the causal influence of IVs on
DVs. Drawing from centuries of philosophical thought about the epistemic foundations
for causal inferences, researchers adopt three criteria for testing the validity of their
causal hypotheses:
- demonstrating a reliable pattern of association between the hypothesized IV and
DV,
- demonstrating that changes in the IV temporally precede and predict changes in
the DV, and
- demonstrating that the relation between the IV and DV remains after controlling
for factors that might be confounded with the IV. Research designs attempt to
assess the patterns of associations between a hypothesized IV and DV to
determine whether those patterns fulfill these criteria.
- A study has high internal validity to the extent that the study's design can generate
convincing evidence to assess these criteria for establishing a causal relation between
an IV and a DV. In particular, studies are considered to have strong internal validity if
they can establish that there is a directional effect from the IV to the DV and can rule out
relevant confounds that might offer alternative explanations of any observed effects on
the DV. Internal validity is thus an important standard for assessing the strength of a
study's ability to test causal hypotheses.
- Another important standard for assessing the quality of a research design is the standard
of external validity, which refers to the generalizability of the study's results to a broader
population and diverse contexts. Studies have strong external validity to the extent that
they:
- examine samples that are representative of the population that the researcher
intends to generalize the results to, and
- use measures and procedures that are representative of the real world contexts
that the researcher intends to generalize the results to.
- A study that is strong in both internal and external validity provides strong evidence for
assessing the causal impact of the hypothesized IV on the DV and can extend any
conclusions that are drawn from that evidence to broader populations and situations.
- Much of the focus of research design involves trying to enhance a study's internal
validity by ruling out confounds that could provide alternative
explanations of any observed relations between the hypothesized IV and DV. Research
methods training thus involves developing the skill to recognize potential confounds and
utilize methods to address those confounds. To illustrate this process we reviewed in
detail seven common confounds that can threaten the internal validity of research
designs and that researchers thus need to be mindful of. These seven confounds arise
in a variety of research designs but are most usefully illustrated in the single group
pre-test post-test design, which is vulnerable to each of these key confounds. Learning
how these confounds undermine the internal validity of this particular research design
helps to set the stage to understand the value of procedures that are designed to rule
out these confounds such as the randomized control group experimental design, which
we will review in detail in later modules.
