Moss (1994 Validity Reliability)
American Educational Research Association is collaborating with JSTOR to digitize, preserve and extend
access to Educational Researcher.
Reliability has traditionally been taken for granted as a necessary but insufficient condition for validity in assessment use. My purpose in this article is to illuminate and challenge this presumption by exploring a dialectic between psychometric and hermeneutic approaches to drawing and warranting interpretations of human products or performances. Reliability, as it is typically defined and operationalized in the measurement literature (e.g., American Educational Research Association [AERA], American Psychological Association, & National Council on Measurement in Education, 1985; Feldt & Brennan, 1989), privileges standardized forms of assessment. By considering hermeneutic alternatives for serving the important epistemological and ethical purposes that reliability serves, we expand the range of viable high-stakes assessment practices to include those that honor the purposes that students bring to their work and the contextualized judgments of teachers.

Educational Researcher, Vol. 23, No. 2, pp. 5-12.

Some time ago, I submitted to a journal a manuscript in which my coauthors and I argued for the value of teachers' contextualized judgments in making consequential decisions about individual students and educational programs. Drawing on epistemological strategies typically used by qualitative or interpretive researchers, we offered an example of how teachers' narrative evaluations of their students' collected work, which varied in substance from student to student and classroom to classroom, might be warranted and used for accountability purposes. We based our argument for the value of this sort of contextualized assessment on the unique quality of information it might provide when used in conjunction with more standardized forms of assessment and on the educational benefits it might have for teachers and students. We warranted the narrative evaluation, in part, in critical dialogue among readers about the multidimensional evidence contained in students' folders and, in part, in documentation of evidence allowing subsequent readers of the report to "audit" or confirm the conclusions for themselves.

Reviewer B thought our manuscript was a "superb and important article" and gave it her "highest endorsement." Reviewer A thought the manuscript should not be published in its current form because we had "confused the purpose of assessment with that of instruction" and had "failed to establish reliability" (in this case, adequate consistency among independent readings). She commented that our argument showed a lack of understanding of the essential function of reliability, not only in service of validity, but also "for fairness to the student to prevent the subjectivity and potential bias of an individual teacher." The editor, faced with the dilemma of divergent opinions, wrote that he would be willing to publish an article based on the manuscript if it dealt effectively with the concerns of Reviewer A. He commented, diplomatically, that he feared our position "might be misread as a rejection of a fundamental measurement principle." He noted that "any measurement should have adequate reliability for its purposes, otherwise it is not good measurement, regardless of its positive features."

There is an instructive irony embedded in this anecdote. The process by which a working decision was reached regarding our manuscript was based in an epistemology that more closely resembled the one we had proposed than the one against which our manuscript was evaluated. The editor's decision was not grounded in the consistency among independent readings, which diverged substantially; rather, he made a thoughtful judgment based upon a careful reading of both sets of comments and his own evaluation of the manuscript. I am confident that he was concerned with the validity and fairness of his decision. Of course, I didn't agree with his initial decision, but our dialogue continued through the mail, and the paper improved (and was published) as we strengthened our argument in response to his concerns. I am also confident that both the readers of the journal and I were well served by the written dialogue that accompanied what, for me, was a "high-stakes" decision.

My purpose in this article is to illuminate and challenge the presumption that reliability, as it's typically defined and operationalized in the professional measurement literature (e.g., AERA et al., 1985; Feldt & Brennan, 1989), is essential to sound assessment practice; in doing this, I give particular attention to the context of accountability in public education. I explore a dialectic between two diverse approaches to drawing and warranting interpretations of human products and performances: one based in psychometrics and one in hermeneutics. This task, I believe, honors Messick's (1989) proposed "Singerian" mode of inquiry in validity research, where one inquiring system is observed in terms of another inquiring system "to elucidate the distinctive technical and value assumptions underlying each system application and to integrate the scientific and ethical implications of the inquiry" (p. 32). My point is not to overturn a traditional criterion but rather to suggest that it be treated as only one of several possible strategies of serving important epistemological and ethical purposes. The choice among reliability and its alternatives has conse-

PAMELA A. MOSS is assistant professor, University of Michigan, 4220 School of Education, 610 East University, Ann Arbor, MI 48109-1259. She specializes in educational measurement and evaluation.
EDUCATIONAL RESEARCHER, MARCH 1994
MARCH 1994 7
8 EDUCATIONALRESEARCHER
MARCH 1994 9
10 EDUCATIONALRESEARCHER
MARCH 1994 11
12 EDUCATIONALRESEARCHER