
Validity and Reliability

McGraw-Hill

2006 The McGraw-Hill Companies, Inc. All rights



Chapter Eight


Validity

Validity has been defined as referring to the appropriateness, correctness, meaningfulness, and usefulness of the specific inferences researchers make based on the data they collect.
It is the most important idea to consider when preparing or selecting an instrument.
Validation is the process of collecting and analyzing evidence to support such inferences.


Evidence of Validity

There are 3 types of evidence a researcher might collect:
Content-related evidence of validity: content and format of the instrument
Criterion-related evidence of validity: relationship between scores obtained using the instrument and scores obtained using a second measure (the criterion)
Construct-related evidence of validity: psychological construct being measured by the instrument

Illustration of Types of Evidence of Validity (Figure 8.1)


Content-related Evidence

A key element is the adequacy of the sampling of the domain the instrument is supposed to represent.
The other aspect of content validation is the format of the instrument.
Attempts to obtain evidence that the items measure what they are supposed to measure typify the process of gathering content-related evidence.


Criterion-related Evidence

A criterion is a second test presumed to measure the same variable.
There are two forms of criterion-related validity:
1) Predictive validity: a time interval elapses between administering the instrument and obtaining criterion scores
2) Concurrent validity: instrument data and criterion data are gathered and compared at the same time
A correlation coefficient (r) indicates the degree of relationship that exists between the scores of individuals obtained by two instruments.
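
As an illustration of the correlation coefficient described above, the following is a minimal sketch of the Pearson r computed between instrument scores and criterion scores. The function name `pearson_r` and the score lists are hypothetical and are not taken from the chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one set of scores has no variance")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: the same six individuals measured on both instruments.
instrument_scores = [12, 15, 11, 18, 20, 14]
criterion_scores = [30, 34, 29, 40, 43, 33]

r = pearson_r(instrument_scores, criterion_scores)
print(f"validity coefficient r = {r:.2f}")
```

For predictive validity the criterion scores would be collected after a time interval; for concurrent validity both lists would be gathered at about the same time. The computation is the same in either case.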


Construct-related Evidence

Considered the broadest of the three categories.
There is no single piece of evidence that satisfies construct-related validity.
Researchers attempt to collect a variety of types of evidence, including both content-related and criterion-related evidence.
The more evidence researchers have from different sources, the more confident they become about the interpretation of the instrument.


Reliability

Refers to the consistency of scores or answers provided by an instrument.
Scores obtained can be considered reliable but not valid.
An instrument should be reliable and valid (Figure 8.2), depending on the context in which an instrument is used.


Reliability and Validity (Figure 8.2)


Reliability of Measurement (Figure 8.3)


Errors of Measurement

Because errors of measurement are always present to some degree, variation in test scores is common.
This is due to:
Differences in motivation
Energy
Anxiety
Different testing situations

Reliability Coefficient

Expresses a relationship between scores on the same instrument at two different times, or between parts of the instrument.
The 3 best known methods are:
Test-retest method
Equivalent-forms method
Internal-consistency method

Test-Retest Method

Involves administering the same test twice to the same group after a certain time interval has elapsed.
A reliability coefficient is calculated to indicate the relationship between the two sets of scores.
Reliability coefficients are affected by the lapse of time between the administrations of the test.
An appropriate time interval should be selected.
In educational research, scores that remain stable over a two-month period are considered sufficient evidence of test-retest reliability.


Equivalent-Forms Method

Two different but equivalent (alternate or parallel) forms of an instrument are administered to the same group during the same time period.
A reliability coefficient is then calculated between the two sets of scores.
It is possible to combine the test-retest and equivalent-forms methods by giving two different forms of the test with a time interval between the two administrations.


Internal-Consistency Methods

There are several internal-consistency methods that require only one administration of an instrument.
Split-half Procedure: involves scoring two halves of a test separately for each subject and calculating the correlation coefficient between the two scores.
Kuder-Richardson Approaches (KR20 and KR21): considered the most frequently used method for determining internal consistency. KR21 requires only 3 pieces of information:
Number of items on the test
The mean
The standard deviation
Alpha Coefficient: a general form of the KR20 used to calculate the reliability of items that are not scored right vs. wrong.
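
The three internal-consistency approaches above can be sketched as follows. This is an illustrative sketch using the standard textbook formulas, not code from the chapter. All function names and the response data are hypothetical. The Spearman-Brown step is not mentioned on the slide; it is included as an assumption because a split-half correlation describes a test only half as long as the full instrument.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def split_half(item_scores):
    """Score odd- and even-numbered items separately, then correlate the halves.

    item_scores holds one row per person and one column per item.
    Returns (half-test correlation, Spearman-Brown estimate).
    """
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)


def kr21(n_items, test_mean, test_sd):
    """KR21 from the number of items, the mean, and the standard deviation."""
    k = n_items
    return (k / (k - 1)) * (1 - test_mean * (k - test_mean) / (k * test_sd ** 2))


def cronbach_alpha(item_scores):
    """Alpha coefficient; works for items that are not scored right vs. wrong."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 6 people x 4 items on a 1-5 rating scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 5, 5],
]

r_half, r_full = split_half(responses)
print(f"split-half r = {r_half:.2f}, Spearman-Brown estimate = {r_full:.2f}")
print(f"alpha = {cronbach_alpha(responses):.2f}")
print(f"KR21 for 50 items, mean 40, SD 4 = {kr21(50, 40, 4):.2f}")
```

With items scored 0 or 1, `cronbach_alpha` as written gives the same value as KR20, which is the sense in which alpha is a general form of KR20.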


Standard Error of Measurement

An index that shows the extent to which a measurement would vary under changed circumstances.
Because circumstances can change in many ways, there are many possible standard errors for a given score.
Also known as measurement error; it gives a range of scores that shows the amount of error which can be expected (Appendix D).
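
A minimal sketch of how a standard error of measurement might be computed, assuming the classical test theory formula SEM = SD × sqrt(1 − reliability). The slide does not state this formula; it is brought in here as an assumption, and the function names and numbers are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability coefficient must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed, sd, reliability, n_sem=1):
    """Range of scores within n_sem standard errors of an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - n_sem * sem, observed + n_sem * sem


# Hypothetical instrument: SD of 10 points and a reliability coefficient of .84.
sem = standard_error_of_measurement(10, 0.84)
low, high = score_band(100, 10, 0.84)
print(f"SEM = {sem:.1f}; an observed score of 100 falls in a band of {low:.0f} to {high:.0f}")
```

The lower the reliability coefficient, the wider the band of scores around any observed score.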


Scoring Agreement

Scoring agreement requires a demonstration that independent scorers can achieve satisfactory agreement in their scoring.
Instruments that use direct observations are highly vulnerable to observer differences.
What is desired is a correlation of at least .90 among scorers as an acceptable level of agreement.
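
One way to check the .90 guideline above is to correlate every pair of scorers and confirm that each pair meets the threshold. The sketch below assumes that pairwise Pearson correlations are an acceptable index of agreement. The function names, scorer labels, and ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def scorer_agreement(scores_by_scorer, threshold=0.90):
    """Correlate each pair of scorers; report whether all pairs reach the threshold."""
    pairs = {}
    for (name_a, a), (name_b, b) in combinations(scores_by_scorer.items(), 2):
        pairs[(name_a, name_b)] = pearson_r(a, b)
    return pairs, all(r >= threshold for r in pairs.values())


# Hypothetical ratings: three scorers independently rate the same six observations.
ratings = {
    "scorer_1": [3, 4, 2, 5, 4, 1],
    "scorer_2": [3, 5, 2, 5, 4, 1],
    "scorer_3": [2, 4, 2, 5, 3, 1],
}

pairwise, acceptable = scorer_agreement(ratings)
for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:.2f}")
print("acceptable agreement" if acceptable else "scorers need further training")
```

A correlation shows only that scorers rank the observations similarly; two scorers can correlate highly while one consistently scores higher than the other.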



Internal Validity
Chapter Nine


What is Internal Validity?

Internal validity means that observed differences on the dependent variable are directly related to the independent variable, and not due to some other unintended variable.
In other words, any relationship observed between two or more variables should be unambiguous as to what it means rather than being due to something else.
The something else could be:
Age
Ability
Types of materials used

Threats to Internal Validity

Subject characteristics
Mortality
Location
Instrumentation
Testing
History
Maturation
Attitude of subjects
Regression
Implementation

Mortality Threat to Internal Validity (Figure 9.1)


Location Might Make a Difference (Figure 9.2)


Subject Characteristics

The selection of people may result in differences, either between individuals or groups, that are related to the variables being studied.
This is referred to as selection bias, or a subject characteristics threat.
If not controlled, these variables may explain away whatever differences are found in the study.
There are techniques used to either equalize the differences or control these variables.


Mortality

It is common to lose subjects as a study progresses.
This is known as a mortality threat.
Loss of subjects limits generalizability and can introduce bias.
Mortality is the most difficult threat to internal validity to control.
One way to address the problem is to provide evidence that the subjects lost were similar to those who remained in the study.


Location

The particular locations where data are collected may create alternative explanations for the results; this is known as a location threat.
The best way to control for this is to keep the location consistent for all subjects.
If this is not possible, the researcher should ensure that different locations do not favor or jeopardize the hypothesis.


Instrumentation

The way instruments are used may constitute a threat to the internal validity of a study.
Some examples are as follows:
Instrument decay
Data collector characteristics
Data collector bias

Instrument Decay (Figure 9.3)


A Data Collector Characteristics Threat (Figure 9.4)


Testing

A testing threat occurs when subjects improve because of practice on the test itself (e.g., in a pretest/post-test design).
An interaction between the pretest and the treatment can also cause this: taking the pretest may make subjects more aware of what is being studied, allowing them to be more responsive to the treatment.


A Testing Threat to Internal Validity (Figure 9.5)


History

A history threat occurs when an unforeseen event during the course of the study affects the responses of subjects.
Researchers need to be alert to any such influences that may occur during the course of the study.


A History Threat to Internal Validity (Figure 9.6)


Maturation

Change during an intervention may be due to factors associated with the passing of time rather than the intervention.
Students could change over the course of a study. This is known as a maturation threat.
Maturation is only a threat in studies using pre/post data for the intervention group or in studies that span a number of years.
The best way to control for this is to include a well-selected comparison group in the study.


Could Maturation be at Work Here? (Figure 9.7)


Attitude of Subjects

The way subjects view a study and their participation in it can be a threat to internal validity.
Subjects may perform better based upon a feeling of receiving special attention (the Hawthorne effect).
The opposite may also occur: subjects who receive no treatment at all may perform poorly as a result.
A remedy to this would be to provide both groups with comparable treatments or to make the treatment a regular part of the study.


The Attitude of Subjects Can Make a Difference (Figure 9.8)


Regression

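A regression threat is possible when change is studied in a group that has extremely low or high performance in the pre-intervention stage.
Such a group tends to score closer to the average on a second measurement, whether or not the intervention had any effect.
As with the maturation threat, this can be controlled by the use of an equivalent control or comparison group.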
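
The regression threat can be illustrated with a small simulation. This is a hypothetical sketch, not an analysis from the chapter. It assumes each observed score is a stable true score plus random measurement error, gives no intervention at all, and then looks at the subjects with the lowest pretest scores. The function name, score scale, and sample size are made up.

```python
import random


def simulate_regression(n=2000, cutoff_fraction=0.10, seed=0):
    """Mean pretest and post-test scores of the lowest pretest scorers.

    No intervention is simulated: both scores are the same true score
    plus independent random error.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pre = [t + rng.gauss(0, 10) for t in true_scores]
    post = [t + rng.gauss(0, 10) for t in true_scores]

    # Select the subjects with the most extreme (lowest) pretest scores.
    order = sorted(range(n), key=lambda i: pre[i])
    low_group = order[: int(n * cutoff_fraction)]

    pre_mean = sum(pre[i] for i in low_group) / len(low_group)
    post_mean = sum(post[i] for i in low_group) / len(low_group)
    return pre_mean, post_mean


pre_mean, post_mean = simulate_regression()
print(f"lowest pretest group: pretest mean = {pre_mean:.1f}, post-test mean = {post_mean:.1f}")
```

The low group's post-test mean moves back toward the overall average of 50 even though nothing was done to them, which is why an equivalent comparison group selected the same way is needed to separate regression from a real treatment effect.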


Regression Rears Its Head (Figure 9.9)


Implementation

The experimental group may be treated in ways that are unintended, giving them an advantage.
This is known as an implementation threat.
This can occur in two ways:
1) When different individuals are assigned to implement different methods, and these individuals differ in ways related to the outcome
2) When some individuals have a personal bias in favor of one method over the other

How to Minimize Threats to Internal Validity

There are four alternatives a researcher can use to reduce threats to internal validity:
1) Standardize the conditions under which the study occurs
2) Obtain more information on the subjects of the study
3) Obtain more information on the details of the study
4) Choose an appropriate design

Illustration of Threats to Internal Validity (Figure 9.10)

