
 Reliability – refers to the consistency of
scores obtained by the same person when
re-examined with the same test on different
occasions, or with different sets of equivalent
items, or under other variable examining
conditions.
 Classical Test Score Theory – this assumes
that each person has a true score that would
be obtained if there were no errors in
measurement.
 Measurement error – the difference between
the observed score and the true score.

E = X - T
(error) = (observed score) - (true score)

 Standard error of measurement – the
standard deviation of the distribution of
errors for each repeated application of the
same test on an individual.
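The two definitions above can be tied together with a small worked example. This is only a sketch: the true score and the five observed scores are made-up values, and in practice the true score is never known directly.

```python
import statistics

# Hypothetical data: one person's true score (unknowable in practice)
# and the observed scores from five repeated administrations.
true_score = 100
observed_scores = [98, 103, 100, 95, 104]

# E = X - T for each administration.
errors = [x - true_score for x in observed_scores]

# Standard error of measurement: the standard deviation of those errors.
# pstdev treats the five errors as the whole distribution (an assumption).
sem = statistics.pstdev(errors)

print(errors)
print(round(sem, 2))
```

A smaller value means the observed scores cluster more tightly around the true score.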
 Factors that contribute to consistency:
 These consist entirely of those stable attributes of
the individual, which the examiner is trying to
measure.

 Factors that contribute to inconsistency:
 These include characteristics of the individual,
test, or situation, which have nothing to do with
the attribute being measured, but which
nonetheless affect test scores.
 Domain Sampling Model
 A problem arises when a limited number of items
is used to represent a larger and more complicated
construct.
A. Item selection
 One source of measurement error is the
instrument itself. A test developer must settle
upon a finite number of items from a potentially
infinite pool of test questions.
B. Test Administration
 General environmental conditions may exert an
untoward influence on the accuracy of
measurement, such as uncomfortable room
temperature, dim lighting, and excessive noise.
C. Test Scoring
 Whenever a psychological test uses a format other
than machine-scored multiple-choice items, some
degree of judgment is required to assign points to
answers.
 In computerized adaptive testing, the item
difficulty is calibrated, with the help of a
computer, to the mental ability of the test taker.
 If you get several easy items correct, the
computer will then move to more difficult
items.
 If you get several difficult items wrong, the
computer moves back to average items.
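The rule described above can be sketched as a small function. This is an illustration only: the three level names, the function name next_level, and the streak length of three answers are assumptions, and stepping down from average to easy is an extension not stated above.

```python
# Hypothetical difficulty levels, ordered from easiest to hardest.
LEVELS = ["easy", "average", "difficult"]


def next_level(current, recent_answers, streak=3):
    """Pick the difficulty level for the next item.

    current        -- the level of the items just answered
    recent_answers -- list of booleans, True for a correct answer
    streak         -- how many answers count as "several" (an assumption)
    """
    if len(recent_answers) < streak:
        return current  # not enough evidence yet

    last = recent_answers[-streak:]
    index = LEVELS.index(current)

    # Several correct answers in a row: move to more difficult items.
    if all(last) and index < len(LEVELS) - 1:
        return LEVELS[index + 1]

    # Several wrong answers in a row: move back one level.
    if not any(last) and index > 0:
        return LEVELS[index - 1]

    return current


print(next_level("easy", [True, True, True]))
print(next_level("difficult", [False, False, False]))
```

Real adaptive tests usually estimate ability with a statistical model rather than a fixed streak, so this shows only the direction of the adjustment.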
 Test-retest reliability is also known as time
sampling reliability.
 This is used when we measure only traits or
characteristics that do not change over time.
 Carryover effect
 Practice effect
 Error variance
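Time sampling reliability is commonly estimated by correlating the scores from two administrations of the same test. The sketch below assumes a Pearson correlation is the index used and runs on made-up scores for five people; the helper name pearson_r is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores of the same five people on two occasions.
first_occasion = [10, 12, 14, 16, 18]
second_occasion = [11, 12, 15, 15, 19]

test_retest_r = pearson_r(first_occasion, second_occasion)
print(round(test_retest_r, 2))
```

A coefficient close to 1 suggests the scores are stable over time. Carryover and practice effects can inflate it, because people may remember items from the first occasion.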
 Parallel forms reliability is established when at
least two different versions of the test yield almost
the same scores.
 It is also known as item sampling reliability or
alternate forms reliability.
 The error variance in this case represents
fluctuations in performance from one set of items to
another, but not fluctuations over time.
 It is one of the most rigorous and burdensome
assessments of reliability, since test
developers have to create two forms of the
same test.
 Practical constraints make it difficult to retest
the same group of individuals.
 Inter-rater reliability is the degree of agreement
between two observers who simultaneously record
measurements of the same behaviors.
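Agreement between two observers can be put into numbers in more than one way. The sketch below shows simple percent agreement and Cohen's kappa, a common chance-corrected index that the text above does not name. The ratings and function names are made up for illustration.

```python
def percent_agreement(rater_a, rater_b):
    """Share of observations on which both raters recorded the same code."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's share of every category.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical codes recorded by two observers for the same eight intervals.
observer_1 = ["on", "on", "off", "on", "off", "off", "on", "on"]
observer_2 = ["on", "off", "off", "on", "off", "on", "on", "on"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(agreement, round(kappa, 2))
```

Kappa is lower than percent agreement because some matches would happen even if both observers coded at random.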
 Split-half reliability is obtained by splitting the
items on a questionnaire or test in half, computing a
separate score for each half, and then calculating the
degree of consistency between the two scores for
a group of participants.
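The procedure above can be sketched as follows. The odd-even split, the use of a Pearson correlation as the measure of consistency, and the response data are all assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: five participants, six items, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

# Split the items in half: items 1, 3, 5 versus items 2, 4, 6.
half_a = [sum(row[0::2]) for row in responses]
half_b = [sum(row[1::2]) for row in responses]

# Degree of consistency between the two half scores.
split_half_r = pearson_r(half_a, half_b)
print(half_a, half_b, round(split_half_r, 2))
```

An odd-even split is used here so that both halves mix early and late items. Splitting into first half and second half is another option.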

The task we must set
for ourselves is not to
feel secure, but to be
able to tolerate
insecurity.

- Erich Fromm
