NAEP assessments include multiple-choice items, which are machine-scored by optical mark reflex scanning, and constructed-response items, which are scored by trained scoring staff. These trained scorers ("raters") use an image-based scoring system that routes student responses directly to each rater. Focused, explicit scoring guides are developed to match the criteria emphasized in the assessment frameworks. Consistency of scoring between raters is monitored during the process through ongoing reliability checks and frequent backreading.

Throughout the scoring process, three types of personnel make up individual scoring teams:
Raters are professional scorers who are hired to rate the individual student responses.
Scoring supervisors lead teams of raters throughout the scoring process on a daily basis.
Trainers provide training for the raters and continually monitor the progress of each scoring team.

Team members are required to have, at a minimum, a baccalaureate degree from a four-year college or university. An advanced degree, scoring experience, and/or teaching experience is preferred. Scoring teams use the training process to determine whether each individual rater is sufficiently prepared to score. Following training, each rater is given a pre-scored "qualification set" and is expected to attain 80 percent correct in order to proceed.

All scoring is carried out via image processing. To assign a score, raters click the mouse over a button displayed in a scoring window. Because buttons are included only for valid scores, out-of-range scores cannot be entered and no editing for them is needed. Two significant advantages of the image-scoring system are the ease of regulating the flow of work to raters and the ease of monitoring scoring. The image system provides scoring supervisors with tools to determine rater qualification, to backread raters, to determine rater calibration, to reset trend rescore items, to monitor trend rescore items through t-statistics reports, to monitor interrater reliability, and to gauge the rate at which scoring is being completed.

The scoring supervisors monitor work flow for each item using a status tool that displays the number of responses scored, the number of responses first-scored that still need to be second-scored, the number of responses remaining to be first-scored, and the total number of responses remaining to be scored. This allows the scoring directors and project leads to accurately monitor the rate of scoring and to estimate the time needed for completion of the various phases of scoring.
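
The two numeric checks described in this section, the 80 percent qualification threshold and the four counts shown by the status tool, can be illustrated with a minimal Python sketch. This is not the NAEP scoring system's implementation: the names `passes_qualification`, `Response`, and `item_status` are hypothetical. It also rests on two assumptions, that "percent correct" means exact agreement with the pre-assigned score, and that a response counts as "scored" once it has every score it requires.

```python
from dataclasses import dataclass

# From the text: a rater must attain 80 percent correct on the qualification set.
QUALIFICATION_PERCENT = 80


def passes_qualification(rater_scores, prescored_scores):
    """Return True if the rater matches at least 80 percent of the pre-scored set.

    Assumption: "correct" means the rater's score equals the pre-assigned score.
    """
    if not prescored_scores or len(rater_scores) != len(prescored_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(rater_scores, prescored_scores))
    # Integer comparison avoids floating-point rounding at the threshold.
    return matches * 100 >= QUALIFICATION_PERCENT * len(prescored_scores)


@dataclass
class Response:
    """Hypothetical record of one student response to one item."""
    first_scored: bool = False
    needs_second_score: bool = False
    second_scored: bool = False


def item_status(responses):
    """Tally the four counts the status tool is described as displaying."""
    awaiting_first = sum(not r.first_scored for r in responses)
    awaiting_second = sum(
        r.first_scored and r.needs_second_score and not r.second_scored
        for r in responses
    )
    # Assumption: "scored" means no further scoring is required.
    scored = len(responses) - awaiting_first - awaiting_second
    return {
        "scored": scored,
        "awaiting_second_score": awaiting_second,
        "awaiting_first_score": awaiting_first,
        "total_remaining": awaiting_first + awaiting_second,
    }
```

A supervisor-style report for one item would call `item_status` on that item's responses. Dividing `total_remaining` by an observed scoring rate would give a rough estimate of the time to completion, although the source does not state how that estimate is actually made.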