
A Procedure for Writing Content-Fair Essay Examination Topics for Large-Scale Writing

Assessments
Author(s): James Hoetker and Gordon Brossell
Source: College Composition and Communication, Vol. 37, No. 3 (Oct., 1986), pp. 328-335
Published by: National Council of Teachers of English
Stable URL: http://www.jstor.org/stable/358049
Accessed: 21/09/2008 01:52

A Procedure for Writing Content-Fair
Essay Examination Topics for Large-
Scale Writing Assessments

James Hoetker and Gordon Brossell

In the past few years, as performance tests of writing ability have become increasingly prominent parts of large-scale educational assessment programs, researchers have finally begun to study the effects of variations in essay test characteristics on the nature and quality of students' writing samples.1 Several recent studies of essay topics have begun to produce evidence that variations in the wording and content of a topic may affect what and how well a student writes.
Our ability to propose and interpret studies of the effects of the phrasing of essay topics is limited by our post-structuralist awareness that an essay topic is a text which, like any other text, possesses very limited control over what any particular reader will make of it. A reader topics her own topic, just as she poems her own poem, and the reader-constructed topic will resemble what the writer of the test topic intended only to the extent that the writer is able to restrict the possibility of readings that differ too much from what was intended. Probably, until we better understand what is involved, our safest bet, when preparing sets of essay topics for large-scale examination programs, is to minimize opportunities for miscommunication by keeping the topics as brief as possible, as simply phrased as possible, and as similar one to the other as possible, and by field-testing them thoroughly before using them.
There are no similar commonsensical rules of thumb for dealing with the
other major problem, the content or subject that the topic calls on students to
write about. Lacking a topic that designates a subject about which all of the
examinees equally have something to say, the essay examination may become
less a test of writing ability than an expensive way of identifying the students
who do and do not conform to the topic writer's preconception that students
educated to a certain point should be able to discourse about current events,
ethical conundrums, American politics, literary quotations, or whatever.
In this essay we report a study in the development of essay topics that we

James Hoetker is Chair of the Department of Curriculum and Instruction in the College of Education at Florida State University. Gordon Brossell is Associate Dean for Undergraduate Studies in the same College. Both have published extensively on testing and on research in the teaching of English.

328 College Composition and Communication, Vol. 37, No. 3, October 1986


believe represents a new approach to producing sets of examination topics that we may call "content-fair," sets of topics that provide the great majority of examinees with the chance to write on a subject about which they know a good
deal. Topics written according to the procedures we describe seem successful
both in reducing linguistic opportunities for misinterpretation and in enabling
students to identify a subject about which they have something to say.
In 1983 we contracted to supply a set of topics for Florida's College-Level Academic Skills Test (CLAST), an examination that all Florida college sophomores must pass in order to be admitted to junior standing. Unlike other writing tests in use in Florida, the CLAST test requires students to produce a specimen of "academic" writing, rather than, for instance, personal narrative or pure description.
Without at first having any clear idea of what our topics were going to look like, we set ourselves the goal of writing a set of thirty topics that were minimally misreadable, functionally equivalent, and fair to all examinees regardless of their backgrounds or academic interests.
Our own prior experience and research and our reading of the literature on topics for tests of writing ability had convinced us that the topics in our set should share several characteristics which should, we thought, reduce the opportunity for interpretations too different from our intentions. First, the topics (as separate from the instructions accompanying them) should be as brief as possible, since we could think of no surer way to reduce misreadings than to reduce the number of words that had to be read and interpreted. Second, the topics, in the interests of comparability, should be cast in a common syntactic pattern and, to the greatest extent possible, should share semantic elements as well. Third, the instructions, which would be uniform across topics, should also be brief and simple and should honestly and succinctly describe the real rhetorical occasion-i.e., the student is being asked to write a piece of examination prose that several readers (English teachers) are going to evaluate in such and such a way-rather than prescribing fictional voices and audiences.
We began by writing topics that more or less fit these criteria. And then
we tore them up, since the topics we wrote were, despite our intentions, as
variable in some crucial ways as other people's topics that we had criticized.
The major problem, of course, was content, subject matter. On the one hand,
different subjects demanded different vocabularies, which introduced more variety than we wanted in length and syntax. On the other hand, no matter
what content we introduced, it was obvious that some groups of students
would be better equipped to write about it than others.
Our breakthrough came-neither of us can remember just what suggested
this approach-when we realized that the population of examinees had in
common that they were just completing two years of taking college courses,
and that the solution to the problem posed by students' varying familiarity
with any subject we proposed was to allow each student to write on whatever
subject he felt he had learned the most about.

From there we quickly conceived of a "master topic" on the pattern of one of those frame sentences that the structural linguists were fond of, with individual topics in a set being produced by substituting different words or phrases in specified blanks in the master pattern. Getting from that concept to
a set of workable topics proved a great deal more difficult and frustrating than
we had anticipated. Finally, in a handbook one of us was consulting for another purpose entirely, we came across a section on the old-fashioned formal
definition, which proved to be just what we needed.
A formal definition has these parts:

1. A category name (noun or noun phrase)-e.g., "A Flute";


2. A linking verb-"is";
3. A class specification (adjective phrase)-"a keyed woodwind instrument";
4. Differentiating criteria (relative clause)-"which has a side hole over which wind is blown, etc."

Our adaptation of this pattern presented the student with a topic consisting
of the class specification (3) and two differentiating criteria (4), leaving it up
to the student to choose a specific exemplar of the category (1) that completed
the definition. For example, a topic might read: "A novel/ which many students read/ that may affect them significantly." The student then is free to
identify a novel about which he knows enough to write cogently, and free, as
well, to supply his own definition of "significant effects."
At this point we had, in theory at least, what we thought of as a "topic-
generating machine" that could turn out an unlimited number of syntactically
and semantically similar topics. Each topic allowed examinees to supply a
noun phrase naming a subject they knew well enough to write about. The
sample topic, for instance, could be varied in these ways:
"A book/ that manystudentsread/that may affectthem harmfully."
"A book/ that many studentsread/that may affectthem beneficially."
"A course/that many studentstake/ that may affectthem in important
ways"
"A habit or belief/ that many studentsacquirein college/ that affects
their life in importantways."
And so forth. Substitute another noun phrase or other differentiating criteria and you can start cranking out another set of related topics. For instance:
"A commonpractice/in Americanschools/that inhibitslearning."
"A commonpractice/in Americancolleges/that shouldbe changed."
"A common practice/ of teachers/ that does not achieve what it is
intendedto achieve."
"A commonpractice/in Americanelections/that does not servethe in-
terestsof the public."
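The slot-filling procedure just described can be sketched in code. The following is a minimal illustrative sketch, not part of the original study: the frame string, the function name, and the example slot fillers are our own invention, chosen to echo the sample topics above.

```python
# Illustrative sketch of a "topic-generating machine" built on a frame
# sentence with three slots. The frame, function, and fillers here are
# hypothetical; the actual CLAST topics were also screened by hand.

FRAME = "{category}/ {qualifier}/ {criterion}."

def generate_topics(categories, qualifiers, criteria):
    """Return every combination of slot fillers as a candidate topic."""
    return [
        FRAME.format(category=c, qualifier=q, criterion=r)
        for c in categories
        for q in qualifiers
        for r in criteria
    ]

topics = generate_topics(
    categories=["A book", "A course"],
    qualifiers=["that many students read", "that many students take"],
    criteria=["that may affect them beneficially",
              "that may affect them harmfully"],
)
for topic in topics:
    print(topic)
```

Note that exhaustive substitution also produces ill-formed candidates (for example, "A book/ that many students take/ . . ."), which mirrors the authors' caution below: the mechanism does not relieve the test maker of editorial judgment about the topics generated.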
Of course, the existence of this mechanism does not relieve the test maker
of the responsibility for exercising judgment about the relative appropriateness
of topics that are generated. A topic like our first sample, for instance, with
the class specification "novel," might be appropriate for an examination of
English majors, but with a general population it would likely reveal that large
numbers of students, far from being able to discuss a novel, are not quite sure
what a novel is. The field tests of our topics, to be discussed below, suggest
how subtle may be the factors that influence responses to even linguistically
similar topics, so that surprises may lie in wait beyond even the most elaborate arrangement of expert reviews and editorial screens.
In the belief (admittedly intuitive and undocumented) that some students (usually bright and highly verbal ones) will write better about abstract subjects, while others will write more comfortably about people, places, and things, we submitted 15 topics of each sort and recommended that each form of the test should contain two topics-one with an abstract and one with a concrete noun phrase-between which a student could choose. This recommendation was accepted by the test administrators. Figure 1 reproduces the complete writing examination-instructions and two sample topics-as it is presented to students.
Our report and the initial set of 30 topics were submitted for review to a
panel of college composition teachers. They were attracted by the reasoning

DIRECTIONS FOR ESSAY
You will have 50 minutes to plan, write, and proofread an essay on one of the topics below.
TOPICS: 1. A book that many students read that may affect them beneficially.
2. A common practice in American colleges that should be changed.
In your essay, you should introduce your subject and then either
-explain the subject you have chosen, or
-take a position about your subject and support it.
At least two evaluators will read your essay and assign it a score. They will be paying special attention to whether you
-state your thesis clearly,
-develop your thesis logically and in sufficient detail,
-use well-formed sentences and paragraphs,
-use language appropriately and effectively, and
-follow standard practices in spelling, punctuation, and grammar.
Take a few minutes to think about what you want to say before you start writing. Leave yourself a few minutes at the end of the period to proofread and make corrections.
You may cross out and add information as necessary. Although your handwriting will not be scored, you should write as legibly as possible so that evaluators can read your essay easily.
You may use the following page to plan your essay before you begin to write in the answer folder.

Figure 1. Specimen CLAST Essay Examination



underlying our topic-generating procedure, and their discussions of specific topics often resulted in revisions of the wording that moved the topics closer to our stated ideals of conciseness, comparability, and fairness.
We conducted a preliminary field test of the revised topics with a sample of
900 Florida community college and university students. Test conditions approximated as nearly as possible conditions of the actual testing situation.
Each essay was scored holistically by two trained raters on a scale from 1 to 4,
with the sum of the ratings being the essay's final score. Mean and modal scores, as had been the common experience of the State's other tests requiring production of writing, clustered between 4.5 and 5.0 on a scale from 2 to 8; but on this test for the first time the scores were on the whole normally distributed, rather than, as always before, being skewed toward the low end of the scale. There were, that is to say, fewer extremely low scores-especially 2's-than might have been expected. The optimistic interpretation of this result would be that the format of the topics worked to assist weaker writers, who might otherwise have floundered, to find a subject about which they had something to say. But there was no evidence that the topics contributed to an increase in scores in the higher ranges.
All students were asked to complete a brief questionnaire about the topics
when they had finished writing their essays. Student responses to the new-
model topics were decidedly positive, with most students agreeing that the
topics were fair and helpful.
We read the field test essays with special attention to how they opened-
especially to whether writers got to work quickly developing a thesis or spent
a great deal of time floundering. We did not have access to another corpus of
essays written by similar students to topics of another sort, so our impressions
were subjective; but we agreed that we were finding an encouragingly large
number of essays which started off more purposefully than most essays written
for earlier testing programs with which we had been involved. The typical
low-scoring test essays in earlier programs opened with a long paragraph of what can only be called free writing, with the writer finally discovering a thesis (if ever) only midway through the text he or she was producing. But, as we had hoped, an encouragingly large number of the writers in the field test got down to business immediately, naming and narrowing the subject in the very first sentence. We may tentatively offer the hypothesis that these writers took advantage of the support offered them by the format of the topic, often recasting the language of the topic into an effective thesis sentence. It is possible
that the good impression created on raters by businesslike opening sentences
helps to explain the relative infrequency of essays receiving final scores of 2 or
3.
Because the topics were so different from those used in previous CLAST
administrations, the test administrators scheduled another larger field test of
the topics in late 1984, which was coordinated by personnel from Miami-
Dade Community College and directed by Richard Swartz of GED Testing Service.2 Essays were written on 15 of the topics by more than 2,700 students
from a representative sample of sophomores in Florida colleges. Twenty raters who had had prior experience grading CLAST essays were given refresher training and divided into four "tables" of five raters each.
Raters at a table were given all the essays written on one topic to read and
score. (From 70 to 212 students had written on each topic, with the average
number of writers per topic being around 200.) After all the essays had been
holistically scored, the raters held a summary discussion, in which they tried
to reach a consensus about the suitability of the topic under examination. The
raters were specifically urged to support their recommendations by relating
details of the topic to specific features of the essays written in response to it.
The procedure was then repeated for each topic, with the raters at another
table independently scoring the essays and discussing the topic and making
recommendations about its suitability.
The administrators of the field test were concerned with essay scores only to
the extent of inspecting them to determine that rater agreement was satisfactory (it was) and that the score distribution was not unacceptably skewed (as
in the earlier field test, distributions of scores on suitable topics approached
normality). Their emphasis, rather, was, as we think it should be, on using
the special skills of writing teachers to find connections between features of
two sets of texts-the topics and the essays written in response to them.
Now, at this point we would normally give the reader samples of the topics that were finally accepted for CLAST; samples of topics that were rejected, along with the reasons for their rejection; and excerpts from essays written for both our new-model topics and conventional topics, to illustrate our discussion of the differences between essays that we think are attributable to the format of the topics.
Unfortunately, the legal requirements by which the management of the
testing project must abide forbid us from publicly sharing either accepted or
rejected topics and from publishing samples from essays from which the reader
might infer the nature of the topic. We fully intend, when we find the time
and financial support, to collect student essays written on a new set of topics
developed according to the procedures we have described and subject them to
detailed analyses, including comparisons to essays written on topics used in
earlier administrations of CLAST. But we have not wanted to delay publication
of a description of our work until we find the opportunity to conduct that research. We hope, in fact, that this provisional report of our work in topic development will encourage other researchers to design studies evaluating the sorts of topics we describe.
We considered, as a way of compensating for our inability to use real examples, writing some fictional examples to illustrate how the topics affected the students' composition of their essays; but we rejected that idea on the grounds that we would not be convinced by that sort of "evidence" if another researcher presented it for our consideration.
We hope, then, to depend on quoting summary reports from CLAST administrators and consultants on the field tests of the topics in order to make
our case that the topics do indeed accomplish their intention of giving students the chance to choose a subject on which they are qualified to write, while at the same time assisting the students to engage the writing task as effectively as possible.
For a first example, a 15 July 1985 memorandum from the Florida Department of Education to college and university English department chairpersons announced the new topic format and drew on the field test results to give this explanation of the virtues of what had by then been christened the "frame topic":
The frame topics have been developed to deal with the most problematical aspect of large scale essay testing-ensuring that the highly diverse examinees are all familiar with the assigned topics. . . . With this frame format topics can be written without extensive qualifications which might exclude some students from access to the subjects. Also, this format allows students to restrict the topic and to organize as they choose instead of following the organization set up in a prompt.

It is the judgment of the CLASP [College-Level Academic Skills Project] Communications Task Force and of the readers who evaluated the field test essays that the frame topic is advisable for several reasons. One, the demands . . . of this format closely parallel the CLAST writing skills [i.e., the task specifications] as well as the methodology commonly used in teaching students to write an essay. Two, it allows students to demonstrate their ability in all the steps of the composing process except initial choice of topic. Three, it will allow for evaluation of all . . . parts of a student's composing process.
This endorsement has great practical importance, since curriculum is, as they say, test-driven, so that the adoption of the "frame topic" for future administrations of the CLAST writing examination will affect the curriculum in freshman writing courses across the State.
As a second piece of evidence, here is the text of a 14 January 1986 letter to us from Dan Kelly of the University of Florida, the Chief Reader for the CLAST project.
As you know, we used the new topic format for the first time for the Fall, 1985, administration/reading of the CLAST essays. We were more than just curious about the effects the topics would have on the essays, and I am sure you too are quite interested.

We read at three sites, Tallahassee, Fort Lauderdale, and Gainesville. Readers at all sites agreed that:

1. Essays were better developed because writers could address topics that they not only felt they knew more about, but also were more interesting for them.
2. Thesis statements and organization were as well managed as they had been when writers responded to topics used under the old format.
3. Readers were very grateful for a larger variety of different responses which reduced the number of readings of repetitive supporting ideas and details over the two-day period. We believe this factor led to improvements in reader attention and reliability.
Statistical analyses indicate that scores for essays written under the new format were not significantly different from those recorded during previous CLAST scoring sessions.
If we continue to have similar results on subsequent readings-and there is no reason to believe there won't be-we may assume that the new topics, developed by you, are a genuine breakthrough in the art of topic writing and holistically scored essays.
The new-model topics, that is to say, were judged to have some noticeable
good effects and no observable negative effects.
Thus, the feedback we have received so far satisfies us that our procedures
enable a topic writer to produce an indefinitely large pool of topics which (1)
assist students to identify a subject they are equipped to write about and (2)
assist them in finding a focus for their essays. We are also satisfied that these
procedures represent a significant advance toward managing the perennial
problem of comparability among topics.

Notes

1. See, for instance, Gordon Brossell, "Rhetorical Specification in Essay Examination Topics," College English, 45 (February, 1983), 165-173; Gordon Brossell and Barbara Hoetker Ash, "An Experiment with the Wording of Essay Topics," College Composition and Communication, 35 (December, 1984), 423-425; Karen Greenberg, The Effects of Variations in Essay Questions on the Writing of CUNY Freshmen (New York: City University of New York, Instructional Resource Center, 1981); and Leo Ruth and Sandra Murphy, "Designing Topics for Writing Assessment: Problems of Meaning," College Composition and Communication, 35 (December, 1984), 410-422.
2. Richard Swartz, "Report on CLAST Essay Topic Field Test Reading, November, 1984," mimeographed Appendix A to "Final Report for Field Testing Essay Topics . . . ," (Tallahassee, FL: State Department of Education, College-Level Academic Skills Project, January 1985). All subsequent quotations not otherwise attributed are from this report.
