Norm and Criterion-Referenced Tests

Norm-referenced and criterion-referenced tests serve a variety of purposes due to the array of educational situations that exist in today's schools. Testing can rank students against each other or against some sociocultural norm, or it can be based on performance criteria that assess particular understandings or skill sets. Ideally, a combination of both testing types exists in a way that is valid, reliable, and fair. Given that many classrooms contain students with different socioeconomic and cultural backgrounds, however, testing becomes quite a challenge. Therefore, to ensure that all students receive the most appropriate feedback, a variety of testing techniques is needed so that proper decisions and actions can be made that best suit the learner.

Virtually all students have taken some kind of standardized test by the time they enter high school or college. Moreover, many standardized tests (i.e., high-stakes tests) are used as a condition of graduation, acceptance, or financial aid. Because these tests are used as a way to rank or compare students, they are often referred to as norm-referenced tests (NRTs) (Kubiszyn and Borich, 2007). NRTs are commonly used when stakeholders are interested in the central tendency of the results of a group of students, as when descriptive statistics such as the mean, median, and mode are computed for a particular data set. When tests are used to diagnose or to gauge the aptitude of a student, inferences are made based on how students compare with each other or with some other sample based on a social norm. Since results are "objective" – test items are usually scored in terms of right and wrong answers – and since many tests can be administered at once, NRTs are typically more appropriate for making decisions that are not instruction-based. In addition to NRTs being used externally to rank students (e.g., SAT, ACT), teachers oftentimes use NRTs to test students in the classroom. Multiple-choice, true-false,
matching, and essay questions are common testing types that fall under this same category. Test results are gathered, averaged, and ranked in order for teachers to make their best inference as to what level a student has understood. Subsequently, instructional decisions are often made based on these results, either by reviewing past material that students continue to struggle with or by continuing on with new material that makes up part of the curriculum.

Having framed NRTs first as an external instrument, such as the ACT, and then as an internal instrument used by teachers in their classrooms, one can see a noticeable difference in why they are used in each circumstance. The former is used to make decisions regarding achievement, while the latter is used to make decisions regarding instruction. This distinction is important when talking about a second type of test, one that is based on criteria.

Instead of ranking students against some norm, this second testing method bases student performance on meeting certain criteria. Kubiszyn and Borich (2007) define a criterion-referenced test (CRT) as one that "tells us about a student's level of proficiency in or mastery of some skill or set of skills" (p. 66). In other words, a CRT indicates whether a student has understood the content, obtained the necessary skill set, or developed the intended disposition based on the goals and objectives of the classroom. CRTs can provide teachers with greater insight for instructional decision-making adjustments when student performances are assessed in terms of performance criteria. Wiggins and McTighe (2005) also put forth the notion of promoting the six facets of understanding (explain, interpret, apply, perspective, empathy, and self-knowledge) when testing students regarding what they know and the dispositions they possess. Rubrics are often used to qualitatively assess performances and products. Arter and McTighe (2001) distinguish between holistic and analytical trait rubrics when they state that "a holistic rubric gives a single score or rating for an entire product or performance based on an overall impression of a student's work," whereas "an analytical trait rubric divides a product or performance into essential traits or dimensions so that they can be judged separately – one analyzes a product or performance for essential traits" (p. 18). Communicating these "essential traits" to students provides the basis for what constitutes a "good" or "bad" performance or product, and is essential in setting the expectations between teacher and student.

Regardless of the test being administered, validity, reliability, and "absence-of-bias" (Popham, 2008, p. 73) drive the level of predictability an instrument has in making proper inferences about a student's achievement. The validity of a test pertains to the three Cs: "content, criterion, and construct" (Popham, 2008, p. 53). Content validity addresses how well test items represent the concepts covered in the curriculum. Criterion validity in NRTs deals with how accurately test items predict future behavior (e.g., ACT and SAT scores and subsequent academic success or failure). Similarly, criterion validity in CRTs deals with rubric traits and how valid they are in terms of a student's future performance. Construct validity has to do with how a student's performance over time is gauged in terms of meeting criteria aligned to the curriculum. Finally, absence-of-bias centers on whether test items present information fairly, that is, without leaning toward a certain group of people based on socioeconomic status, gender, ethnic background, race, or sexual orientation. Reliability in NRTs is of high concern, since the many versions of the ACT, for example, are expected to contain test items that measure the same content; the same ACT should yield similar results (i.e., a high correlation coefficient) if students retake the exam without being exposed to a learning intervention in the interim. CRTs, in contrast, are specifically suited for assessing understandings, skills, knowledge, and dispositions in terms of the subsequent inferences made toward instructional decision-making adjustments and adjustments to student learning tactics.

NRTs and CRTs should not be considered dichotomous; they are two different approaches to assessing students in a complementary way. Ranking and comparing students has a purpose when the goal is to measure achievement and to predict future academic success. Conversely, testing understandings, skills, knowledge, and dispositions through performance and product criteria serves a vital role in making inferences that influence instructional decisions and student tactic adjustments. In order for tests to be valid, reliable, and absent of bias, test designers should conduct a variety of reviews to assure that tests measure curricular aims, are reliable within the same and across different versions of an exam, and do not discriminate against minority groups based on age, socioeconomic status, gender, race, ethnic background, or sexual orientation. Tests are the link between the written and taught curriculum, between the ideal and the reality of what schools are for all their stakeholders. Thus, in order to continue developing and improving the feedback that tests provide to all stakeholders, a collaborative effort is needed to bring together a community of practice that addresses these important aspects of testing and assessment.
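The two statistics named above are simple enough to illustrate directly. The following sketch, using only Python's standard library and hypothetical scores for ten students (not data from any actual exam), computes the central-tendency summary used in NRT-style reporting and the test-retest correlation coefficient used to gauge reliability:

```python
import statistics


def describe(scores):
    """Central-tendency summary of a set of test scores (NRT-style reporting)."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two administrations of a test.

    A value near 1.0 indicates high test-retest reliability.
    """
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical scores for ten students on two sittings of the same exam.
first = [71, 85, 62, 90, 78, 66, 88, 74, 85, 69]
second = [73, 84, 60, 92, 75, 68, 90, 76, 83, 67]

print(describe(first))
print(round(pearson_r(first, second), 3))
```

If students genuinely retook the exam with no intervening instruction, a correlation near 1.0 between the two columns would support the reliability claim, while a low value would suggest the two forms are not measuring consistently.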
References

Arter, J., and McTighe, J. (2001). Scoring rubrics in the classroom: Using performance criteria for assessing and improving student performance. Thousand Oaks, CA: Corwin Press.

Kubiszyn, T., and Borich, G. (2007). Educational testing and measurement: Classroom application and practice. Hoboken, NJ: Wiley and Jossey-Bass Education.

Popham, W. J. (2008). Classroom assessment: What teachers need to know. New York: Pearson.

Wiggins, G., and McTighe, J. (2005). Understanding by design. Alexandria, VA: ASCD.