
Criterion and Norm-Referenced Testing:

Approaches to Educational Measurement


Desalegn Chalchisa*

* Head, Measurement and Evaluation Unit, IER

Educational measurement approaches are either criterion-referenced or norm-referenced. This paper presents a review of the fundamental concepts of these approaches. It contains definitions, elaborations and distinctions of both types of testing.

1. Norm-Referenced Testing (NRT)

Raw scores obtained by counting the number of right answers are hard to interpret. An examinee's performance would be better interpreted if it could be referenced to something outside of the test itself. As a result, tests require the use of some type of scores derived from raw scores in order to facilitate the interpretation of examinees' performance on them. Traditionally, derived scores such as ranks, percentiles, standard scores, and age and grade equivalent scores have been used to report examinees' performance on tests. These types of derived scores indicate the status of the individual with respect to the performance of others. Tests developed to make such derived scores especially useful are called norm-referenced tests (Nitko, 1980). The following operational definition of NRT given by Popham (1981) is commonly used by test specialists.

    NRT is used to ascertain an individual's status with respect to the performance of other individuals on the test.

In NRT, an individual's raw score is interpreted by comparing it to the scores of a defined group, often called the normative group (a representative sample of testees who have been used as the basis for interpreting test scores). Norms are derived scores of the normative groups (Wiersma and Jurs, 1990). The intent is to compare the performance of an examinee on the test with that of the normative group, rather than to determine how proficient a student is in a particular subject or skill (Capper, 1994).

The following example illustrates NRT by the use of letter grading. Suppose that an English teacher in a
college administers a final examination to a total of 180 students. The teacher decides that the distribution of letter grades assigned to the final examination result will be 5 percent A's, 15 percent B's, 60 percent C's, 15 percent D's and 5 percent F's. Suppose that a student obtains a score on the final exam such that 30 students have higher and 149 students have lower scores. His score on the exam will be B, because the top 9 students (5 percent of 180) will receive A's, the second top 27 students (15 percent of 180) will receive B's, the middle 108 students (60 percent of 180) will receive C's, the second bottom 27 students will receive D's and the last 9 students will receive F's. Counting from the top down, the student's score is located 31st, and he will, therefore, receive a grade of B.

In this example, the raw score of the student was not specified. The raw score would have been necessary to determine that the student's score is located 31st in relation to the 180 scores. However, the interpretation of the score was based strictly on the student's relative position in the total group of students. Thus, this kind of interpretation specifies the performance of examinees in relative and not absolute terms.

2. Criterion-Referenced Testing (CRT)

CRT is a relatively new phenomenon in the history of testing. The concept of CRT was first introduced by Glaser in 1963 (Popham, 1981; Wiersma and Jurs, 1990). He came up with the idea of derived scores that directly reflect the kind of performance that an individual can show or do, rather than his or her relative standing in a defined group of examinees. He referred to this notion as CRT (Nitko, 1980). The derived scores obtained from criterion-referenced tests provide information about the degree of competence or mastery attained by a particular examinee along the continuum of achievement, irrespective of the performance of others. A student who does not attain the criterion has not mastered the skill sufficiently to move to the next instructional level.

3. Fundamental Distinctions Between CRT and NRT

3.1. Breadth of the test

A norm-referenced test measures a more general category of behaviours, like arithmetic skills, whereas a criterion-referenced test focuses on a more specific domain of behaviours, such as solving addition problems with two
three-digit numbers or determining multiplication products of one and three digit numbers. CRT tends to focus on sub-skills more than on broad skills (Ebel, 1979; Popham, 1981; Glaser and Nitko, 1971).

3.2. Way of interpretation of test scores

It is the kind of interpretation that basically distinguishes CRT from NRT. Norm-referenced interpretations are based on an individual's standing on the test relative to others. Any interpretation via ranks, percentiles, deciles, quartiles or standard scores (Z-scores, T-scores, deviation IQ scores, stanines) is norm-referenced. CRT does not depend on comparisons with the performance of other examinees; rather, the score of an examinee on a criterion-referenced test must yield direct information about the individual's performance on some criterion of interest, independent of the test scores earned by any other examinee. There is general agreement that criterion-referenced test scores must be referable to a well-defined domain of behaviours (Hartel, 1985; Nitko, 1980; Popham, 1981) and, according to Wiersma and Jurs (1990), to a well-defined domain of behaviours or a defined performance level. How well an examinee has mastered a well-defined behavioural domain or a defined performance level on a criterion-referenced test is interpreted via percentages.

Both NRT and CRT approaches continue to exist and play important roles in accomplishing specific purposes in educational measurement and evaluation (Harney, 1984).

REFERENCES

Capper, J. 1994. Testing to Learn ... Learning to Test: A Policy Maker's Guide to Better Educational Testing.

Ebel, R.L. 1979. Essentials of Educational Measurement (3rd ed.).

Glaser, R. and Nitko, A.J. 1971. 'Measurement in learning and instruction'. In: R.L. Thorndike (Ed.), Educational Measurement (2nd ed.). Washington, DC: American Council on Education.

Gronlund, N.E. 1985. Measurement and Evaluation in Testing (5th ed.). New York: MacMillan Publishing Company.

Harney, W. 1984. 'Testing reasoning and reasoning about tests'. Review of Educational Research, 55, 597-654.

Hartel, E. 1985. 'Construct validity and criterion-referenced testing'. Review of Educational Research, 55, 23-46.

Ministry of Education. 1980. Geography: Grade ~. Addis Ababa: Educational Materials Production and Distribution Agency.

Nitko, A.J. 1980. 'Distinguishing the many varieties of criterion-referenced test'. Review of Educational Research, 50, 461-486.

Payne, D. 1992. Measuring and Evaluating Educational Outcomes. New York: MacMillan.

Popham, W.J. 1981. Modern Educational Measurement. Englewood Cliffs: Prentice Hall.

Wiersma, W. and Jurs, S.G. 1990. Educational Measurement and Testing (2nd ed.). Boston: Allyn and Bacon.
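
Illustrative note. The contrast between the letter-grading example in Section 1 and the percentage interpretation in Section 3.2 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names, the 20-item test and the 80 percent mastery cutoff are hypothetical and do not come from the paper, and tied scores are not handled. The grade quotas are the ones used in the example (5, 15, 60, 15 and 5 percent).

```python
# Illustrative sketch only: the function names and the 80 percent cutoff
# are assumptions made for this example, not taken from the paper.

# Grade quotas from the letter-grading example, best grade first.
QUOTAS = [("A", 5), ("B", 15), ("C", 60), ("D", 15), ("F", 5)]


def norm_referenced_grade(rank, group_size, quotas=QUOTAS):
    """Return a letter grade from a rank (1 = highest score) in the group.

    The grade depends only on relative position; ties are not handled.
    """
    if not 1 <= rank <= group_size:
        raise ValueError("rank must lie between 1 and group_size")
    cumulative_percent = 0
    for letter, percent in quotas:
        cumulative_percent += percent
        # Integer form of: rank / group_size <= cumulative_percent / 100
        if rank * 100 <= group_size * cumulative_percent:
            return letter
    raise ValueError("quotas must sum to 100 percent")


def criterion_referenced_result(items_correct, items_total, cutoff_percent=80):
    """Return (percent correct, mastered?) against a fixed cutoff.

    The result ignores how any other examinee performed.
    """
    percent_correct = 100 * items_correct / items_total
    return percent_correct, percent_correct >= cutoff_percent


if __name__ == "__main__":
    print(norm_referenced_grade(31, 180))       # rank 31 of 180 students
    print(criterion_referenced_result(16, 20))  # 16 of 20 items correct
```

With the figures from the example, a rank of 31 among 180 students falls inside the B band (ranks 10 to 36). The second function, by contrast, needs no information about the group at all, which is the distinction Section 3.2 draws.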


IER Conducted a Workshop on Measurement and Evaluation in
Classroom Learning in Addis Ababa Schools

The Institute of Educational Research held a short-term training workshop from July 25 to 29, 1994 on Measurement and Evaluation in Classroom Learning in Addis Ababa schools. The training was conducted in response to the need for improving teachers' skills in the area of testing. About 25 senior secondary school principals, unit leaders, department heads and teachers participated in the workshop.

[Photograph: Opening Session]

The workshop specifically dealt with the following:

- student performance measurement;
- achievement tests;
- validity and reliability of test scores;
- preparing and administering classroom tests in high schools; and
- the use and application of tests for guidance of students.

The workshop was opened by Dr. Makonnen Yimer, Vice President for Administration and Development of Addis Ababa University. In his opening remarks, the Vice President stated that Addis Ababa University had an increasing role in the development of the system of education through the direct involvement of its graduates and the sustained contributions of its researchers. He added that the training session was in line with the objectives of the university and in particular with those of the Institute of Educational Research.

The workshop was closed by Ato Tamirat Demessie, the former Head of the Education Bureau of Region 14. After giving certificates to the participants, Ato Tamirat made a closing speech in which he emphasized the need for similar workshops for the vast majority of teachers in other schools. The same was expressed by the participants of the workshop.