
# Measurement Error

When we measure an individual with regard to some psychological construct, we do so with some amount of error
Any observed score for an individual is their true score plus some error of measurement
Here we are concerned with a measure's inability to capture the true response for an individual
Observed Score = True Score + Error of Measurement
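
The decomposition above can be illustrated with a small simulation. This is only a sketch: the normal distributions, the mean of 100, and the standard deviations are assumptions chosen for illustration, and the variance ratio used as "reliability" is the usual classical test theory definition rather than something stated on the slide.

```python
import random
import statistics

def simulate_observed_scores(n, true_sd, error_sd, seed=0):
    """Draw hypothetical true scores and errors, then add them to get observed scores."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, true_sd) for _ in range(n)]
    errors = [rng.gauss(0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed

true_scores, errors, observed = simulate_observed_scores(n=5000, true_sd=15, error_sd=5)

# Classical test theory (an assumption here): reliability is the share of
# observed-score variance that is due to true-score variance.
reliability = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(round(reliability, 2))
```

With these assumed values the ratio should land near 225 / (225 + 25) = 0.9, with some sampling noise.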

## Reliability
Reliability refers to a measure's ability to capture an individual's true score, i.e. to distinguish accurately one person from another
While a reliable measure will be consistent, consistency can
actually be seen as a by-product of reliability, and in a case
where we had perfect consistency (everyone scores the
same and gets the same score repeatedly), reliability
coefficients could not be calculated
No variance/covariance to give a correlation
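
A minimal sketch of why perfect consistency leaves nothing to compute: the Pearson correlation divides by the variability in each set of scores, and that variability is zero when everyone gets the same score. The helper name `pearson_r` and the score lists are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation, or None when either variable has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # No variance means the denominator is zero: the coefficient is undefined.
        return None
    return sxy / math.sqrt(sxx * syy)

# Everyone scores the same, and the same again on retest.
constant_result = pearson_r([10, 10, 10, 10], [10, 10, 10, 10])

# Scores that differ between people give a usable coefficient.
varied_result = pearson_r([8, 10, 12, 14], [9, 10, 13, 14])

print(constant_result, varied_result)
```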

... also the lack of the measure being perfectly reliable

## Reliability
Criteria of reliability
Test-retest
Test components (internal consistency)

## Test-retest reliability
Consistency of measurement for individuals over time
Individuals score similarly, e.g. today and 6 months from now

Issues
Memory
If the testings are too close in time, the correlation between scores may be due to memory of item responses rather than the true score being captured
Chance covariation
In a sample, any two variables will almost always have a non-zero correlation
Reliability is not constant across subsets of a population
General IQ scores have good reliability
IQ scores for college students are less reliable
Restriction of range, fewer individual differences
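
The restriction-of-range point can be sketched with simulated test-retest data. Everything here is an assumption for illustration: the score distribution, the error size, and the cutoff of 115 standing in for a "college" subgroup.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)
true_iq = [rng.gauss(100, 15) for _ in range(5000)]
test = [t + rng.gauss(0, 5) for t in true_iq]    # first testing
retest = [t + rng.gauss(0, 5) for t in true_iq]  # second testing

# Test-retest correlation in the full (hypothetical) population.
r_full = pearson_r(test, retest)

# Hypothetical restricted subgroup: only people scoring above 115 at first testing.
pairs = [(a, b) for a, b in zip(test, retest) if a > 115]
r_restricted = pearson_r([a for a, _ in pairs], [b for _, b in pairs])

print(round(r_full, 2), round(r_restricted, 2))
```

The measurement error is identical in both groups; the restricted group simply has fewer individual differences for the measure to pick up, so the coefficient drops.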

## Internal Consistency
We can get a sort of average correlation among items to assess the reliability of some measure
As one would most likely intuitively assume, having more measures of something is better than having few
It is the case that having more items which correlate with one another will increase the test's reliability
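
One way to see the effect of adding items is the standardized form of Cronbach's alpha (equivalently, the Spearman-Brown formula), which depends only on the number of items and their average inter-item correlation. The slides do not name a specific coefficient, so treating this as the "sort of average correlation" measure is an assumption, as is the average correlation of 0.3.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with average inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Same average inter-item correlation, increasing number of items.
for k in (2, 5, 10, 20):
    print(k, round(standardized_alpha(k, 0.3), 2))
```

Holding the average correlation fixed, the coefficient rises as items are added, though with diminishing returns.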

## What's good reliability?

While we have conventions, it really kind of depends
As mentioned, reliability of a measure may be different for different groups of people

What we may need to do is compare reliability to those
measures which are in place and deemed good as well
as get interval estimates to provide an assessment of
the uncertainty in our reliability estimate
Note also that reliability estimates are biased upwardly
and so are a bit optimistic
Also, many of our techniques do not take into account
the reliability of our measures, and poor reliability can
result in lower statistical power i.e. an increase in type II
error
Though technically, increasing reliability can potentially also lower power
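
Two of the points above lend themselves to a short sketch. The first function gives an approximate interval estimate for a reliability coefficient that is itself a Pearson correlation (e.g. test-retest), using the Fisher z transformation; it is not appropriate for every type of reliability estimate. The second applies Spearman's attenuation formula to show how unreliable measures shrink an observed correlation, which is one route to lower power. The function names and input values are hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% interval for a Pearson correlation r from n pairs."""
    z = math.atanh(r)            # Fisher z transformation
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of both measures."""
    return true_r * math.sqrt(rel_x * rel_y)

# A test-retest estimate of .80 is far less certain with 30 people than with 300.
print(fisher_ci(0.80, 30))
print(fisher_ci(0.80, 300))

# A true correlation of .50 measured with two instruments of reliability .70.
print(attenuated_r(0.50, 0.70, 0.70))
```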

## Replication and Reliability

While reliability implies replicability, assessing reliability does not by itself assess replicability
... themselves, we might even think that in many cases we would not expect consistent research findings
In psychology, many people spend a lot of time debating back and
forth about the merits of some theory, citing cases where it did or
did not replicate
However, the lack of replication could be due to low power, low reliability, problem data, incorrectly carrying out the experiment, etc.
In other words, we didn't replicate because of methodology, not because ...

## Factors affecting the utility of replications

You can't step in the same river twice!
Heraclitus

When
Later replications do not provide as much information; however, they can contribute greatly to the overall assessment of an effect

Meta-analysis
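
As a rough sketch of how replications feed an overall assessment, a fixed-effect meta-analysis pools the study estimates using inverse-variance weights. The slides do not specify a pooling method, so the fixed-effect model is an assumption, and the three effect estimates and standard errors below are made up for illustration.

```python
import math

def pooled_effect(estimates, standard_errors):
    """Fixed-effect (inverse-variance weighted) pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical original study followed by two replications.
estimates = [0.40, 0.25, 0.32]
standard_errors = [0.20, 0.15, 0.10]

pooled, pooled_se = pooled_effect(estimates, standard_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

The pooled standard error is smaller than that of any single study, which is the sense in which each replication still adds to the overall picture.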

How
There is no perfect replication (different people involved, time it takes to conduct, etc.)
Doing an exact replication gives us more confidence in the original finding (should it hold), but may not offer much in the way of generalization
Example: doing a gender difference study at UNT over and over. Does it work for non-college folk? People outside of Texas?

## Factors affecting the utility of replications

By whom
It is well known that those with a vested interest in some idea tend to find confirming evidence more than those that don't
Replications by others are still being done by those with
an interest in that research topic and so may have a
precorrelation inherent in their attempt

Direct: correlation of attributes of persons involved
Indirect: correlation of data to be obtained
... attempts, but must strive to minimize bias
The more independent replication attempts are, the better

## Validity
Validity refers to the question of whether our measurements are actually hitting on the construct we think they are
While we can obtain specific statistics for
reliability (even different types), validity is more of
a global assessment based on the evidence
available
We can have reliable measurements that are
invalid
Classic example: the scale which is consistent and able to distinguish one person from another, but is off by 5 pounds
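
The scale example can be sketched as a simulation: a constant 5-pound bias leaves the agreement between two weighings almost untouched, while every reading is still wrong. The weight range, noise level, and sample size are assumptions for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)
BIAS = 5.0  # the scale reads 5 pounds heavy (hypothetical)

true_weight = [rng.uniform(120, 220) for _ in range(200)]
first = [w + BIAS + rng.gauss(0, 0.5) for w in true_weight]
second = [w + BIAS + rng.gauss(0, 0.5) for w in true_weight]

# Reliability: the two weighings agree and rank people the same way.
consistency = pearson_r(first, second)

# Validity: on average the readings miss the true weight by the bias.
mean_error = sum(f - w for f, w in zip(first, true_weight)) / len(true_weight)

print(round(consistency, 3), round(mean_error, 2))
```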

## Validity Criteria in Psychological Testing

Content validity
Criterion validity
Concurrent
Predictive

Construct-related validity
Convergent
Discriminant

Content validity
Items represent the kinds of material (or content areas) they are supposed to represent
Are the questions worth a flip in the sense that they cover all domains of a given construct?

## Validity Criteria in Psychological Testing

Criterion validity
The degree to which the measure correlates with various outcomes
Does some new personality measure correlate with the Big 5?
Concurrent
Criterion is in the present
Measure of ADHD and current scholastic behavioral problems
Predictive
Criterion is in the future

## Validity Criteria in Psychological Testing

Construct-related validity
How much is it an actual measure of the construct of interest?
Convergent
Correlates well with other measures of the construct
A depression scale correlates well with other depression scales

Discriminant
Is distinguished from related but distinct constructs
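
Convergent and discriminant evidence both come down to a pattern of correlations, which the following sketch simulates. The choice of anxiety as the related but distinct construct, the 0.5 correlation between the two latent constructs, and the noise levels are all assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(11)
n = 2000

# Latent constructs: anxiety is related to depression but distinct from it.
depression = [rng.gauss(0, 1) for _ in range(n)]
anxiety = [0.5 * d + math.sqrt(0.75) * rng.gauss(0, 1) for d in depression]

# Hypothetical scales: each is its construct plus measurement error.
new_scale = [d + rng.gauss(0, 0.5) for d in depression]
old_scale = [d + rng.gauss(0, 0.5) for d in depression]
anxiety_scale = [a + rng.gauss(0, 0.5) for a in anxiety]

convergent_r = pearson_r(new_scale, old_scale)         # same construct
discriminant_r = pearson_r(new_scale, anxiety_scale)   # related but distinct construct

print(round(convergent_r, 2), round(discriminant_r, 2))
```

The pattern to look for is a clearly higher correlation with the other depression scale than with the anxiety scale.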

## Validity Criteria in Experimentation

Statistical conclusion validity
Is there a causal relationship between X and Y?
Correlation is our starting point (i.e. correlation isn't causation, but does ...)

Related to this is the question of whether the study was sufficiently
sensitive to pick up on the correlation

Internal validity
Has the study been conducted so as to rule out other effects which were controllable?
Poor instruments, experimenter bias

External validity
Will the relationship be seen in other settings?

Construct validity
Same concerns as before
Ex. Is reaction time an appropriate measure of learning?

## Summary
Reliability and validity are key concerns in psychological research
Part of the problem in psychology is the lack of reliable measures of the things we are interested in
Assuming that they are valid to begin with, we
must always press for more reliable measures if
we are to progress scientifically
This means letting go of supposed standards when they ... current ones