Reliability
It is the user who must take responsibility for determining whether scores are
sufficiently trustworthy to justify the anticipated uses and interpretations (AERA et al., 1999).
• If a test taker were administered the test at a different time, would she
receive the same score?
• If the test contained a different selection of items, would the test taker get
the same score?
Types of Reliability
Test-Retest Reliability
• Most applicable with tests administered more than once and/or with constructs
that are viewed as stable.
• It is important to consider the length of the interval between the two test
administrations.
• Subject to “carry-over effects”; most appropriate for tests that are not
appreciably affected by them.
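The test-retest idea above is usually quantified as the correlation between the scores from the two administrations. A minimal sketch in plain Python; the score data and the function name `pearson_r` are hypothetical, for illustration only:

```python
# Test-retest reliability: correlate each person's score at time 1
# with the same person's score at time 2 (hypothetical data).

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Total test scores for six test takers at two administrations.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

print(round(pearson_r(time1, time2), 3))
```

A high correlation suggests the scores are stable across occasions; a long interval (maturation, learning) or a short one (memory carry-over) can each distort the estimate.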
➢ Some limitations: requires two test administrations and is subject to
carry-over effects.
Internal Consistency Reliability
• Estimates of reliability that are based on the relationships among items within a
test and are derived from a single administration of the test.
Split-Half Reliability
• Divide the test into two equivalent halves, usually an odd/even split.
• Because this provides an estimate of the reliability of only a half-length test, the
Spearman-Brown formula is used to estimate the reliability of the complete test.
• Also sensitive to the heterogeneity of the test content (i.e., the degree of item
homogeneity).
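The odd/even split and the Spearman-Brown correction described above can be sketched in plain Python. The item data and the helper name `split_half_reliability` are hypothetical; the correction itself is the standard formula r_full = 2r / (1 + r), where r is the correlation between the two half-test scores:

```python
# Split-half reliability with the Spearman-Brown correction.
# Rows are test takers, columns are items scored 0/1 (illustrative data).

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """Odd/even split, then Spearman-Brown correction to full length."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)                   # reliability of a half-test
    return (2 * r_half) / (1 + r_half)              # estimate for the full test

scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_reliability(scores), 3))
```

Note that the corrected value is always at least as large as the half-test correlation, reflecting the general principle that longer tests tend to be more reliable.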
Inter-Rater Reliability
• Reflects differences due to the individuals scoring the test.
• Important when scoring requires subjective judgement by the scorer.
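For subjectively scored tests, agreement between two raters can be indexed in several ways. A minimal sketch, assuming categorical ratings: simple percent agreement, plus Cohen's kappa (not named in the slides above, but a common chance-corrected alternative). The rating data and function names are hypothetical:

```python
# Inter-rater reliability for categorical scores from two raters.
from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of cases on which the two raters gave the same score."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(r1)
    po = percent_agreement(r1, r2)                      # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal category frequencies.
    pe = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    return (po - pe) / (1 - pe)

rater1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater2 = ["pass", "fail", "fail", "pass", "fail", "pass"]

print(round(percent_agreement(rater1, rater2), 3))
print(round(cohens_kappa(rater1, rater2), 3))
```

Kappa is lower than raw agreement here because two raters who mostly assign "pass" would agree fairly often by chance alone.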
Assessment of Learning 1: Reliability
Alternate Forms Reliability
• The correlation between scores on two equivalent versions of the test
administered to the same group.
Improving Reliability