
Assessing Writing

By:
Tania Syafutri
Ririn Pitaloka
LIST OF CONTENTS
The nature of writing ability
The relationship between writing and speaking

1) Permanence
2) Production time
3) Distance
4) Orthography
5) Complexity
6) Formality
7) Vocabulary
Second-language writing

Basic considerations in assessing writing


a. Test purpose
b. Language use and language test
performance
Writing as performance assessment
The term performance assessment is used to describe any
assessment procedure that involves either the observation of
behavior in the real world or a simulation of a real-life activity, i.e. a
performance of the ability being assessed, and the evaluation of
that performance by raters.

Test usefulness
Bachman and Palmer (1996: 17) maintain that 'the most
important consideration in designing and developing a language
test is the use for which it is intended, so that the most important
quality of a test is its usefulness.'
The six qualities of usefulness, particularly as they relate to
writing assessment:
a) Reliability
b) Construct validity
c) Authenticity
d) Interactiveness
e) Impact
f) Practicality
Research in large-scale writing
assessment
a. Task variables
Some of the questions we might ask
about writing tasks are the following:
1) Most generally, on what dimensions
do writing tasks vary, both 'in the
real world' and in testing situations?
2) Of the many ways in which writing
tasks can vary, which are
associated with different levels of
performance, and which are not?
3) Do raters use different criteria in
assigning scores to different task
types?

The task is an overarching term that
includes all relevant dimensions within
the assessment, whether or not they are
explicitly stated, while the prompt refers
specifically to the written instructions to
the test taker.
f. Text variables
An important question in writing assessment research has been the degree to which specific aspects of texts are related to test
scores. There is a fair amount of research in both L1 and L2 that relates specific features of texts to test scores.

g. Rater variables
The study of rater variables in writing assessment has had two main foci: a consideration of which attributes of compositions
raters focus on while evaluating writing, and the investigation of rater background characteristics and their effects on the process
of reading compositions and, ultimately, on the scores that raters assign.

h. Rating scales
Since the rating scale represents the most concrete statement of the construct being measured, it is clearly important to
understand how rating scales influence decisions made about test takers.

i. Context variables
Variability within the rating context includes such factors as ordering of compositions, time of day of the rating session, whether
the rating is done alone or in a group setting, and the type of training received.

j. Test-taker variables
While test takers are, in a very fundamental sense, the most important element of a writing test, surprisingly little research has
been done on the responses of test takers to test tasks. As Ruth and Murphy (1984) note, a writing task as intended by test
writers may not be the same task that is perceived and attempted by writers.
Designing writing assessment tasks

This presents principles of test design for large-scale
writing assessment, that is, testing beyond the level of
the individual classroom.

a) The process of test development


(1) The design stage
• As the outcome of the design stage, Bachman and
Palmer (1996: 88) recommend the development of a
design statement, which is a document containing the
following information:
- a description of the test purposes,
- a description of the domain and task types,
- a description of the target population,
- a definition of the construct,
- a plan for evaluating the qualities of usefulness, and
- an inventory of required and available resources and
a plan for their allocation and management.
(2) The operationalization stage
Specifications should contain, at a minimum, the following elements:
• a description of the test content, including the organization of the test, a description of the number and type of test
tasks, time allotment for each task, and specifications for each test task/item type,
• the criteria for correctness,
• sample tasks/items (Douglas, 2000: 110-113)

(3) The administration stage


The third stage in the test development process is the administration of test tasks to examinees, both on a trial basis
and operationally, and the concurrent collection and analysis of test data and other relevant information about the test
procedures.

b) Considerations in task design


White (1994) notes that most test developers consider at least the following four minimum requirements for writing
tasks: clarity, validity, reliability, and interest.

c) Subject matter
At issue here is the question of what topic (content area) test takers should write about, and what topics should be
avoided.
e) Stimulus material
A writing prompt can include source materials, such as a reading passage, a brief quotation, or a drawing, that provide content for test
takers to write about, or it can simply nominate a topic without any additional stimulus material.

f) Genre
Genre can be defined both in terms of the intended form and the intended function of the writing.

g) Time allotment
Another important issue in writing assessment is deciding how much time test takers will be given to complete each task.

i) Instructions
Bachman and Palmer (1996) provide three guidelines for instructions:
1) they should be simple enough for test takers to understand;
2) they should be short enough not to take up too much of the test administration time; and
3) they should be sufficiently detailed for test takers to know exactly what is expected of them

j) Choice of tasks

k) Transcription mode (handwriting versus word processing)


With the increasing use of computers in education and testing, one important question is whether to ask examinees to write by hand
or to enter their essays on a computer

l) Use of dictionaries and other reference materials


Scoring procedures for writing assessment
Rating scales
The scale that is used in assessing performance tasks such as writing tests represents, implicitly or explicitly,
the theoretical basis upon which the test is founded; that is, it embodies the test (or scale) developer's notion of
what skills or abilities are being measured by the test.

Types of rating scales


a. primary trait scales
b. holistic scales, and
c. analytic scales.

Designing the scoring rubric


Factors to consider in designing a scoring rubric:
a) Who is going to use the scoring rubric?
b) What aspects of writing are most important, and how will they be divided up?
c) How many points, or scoring levels, will be used?
d) How will scores be reported?

Writing scale descriptors


This can be done a priori, by defining in advance the ability being measured and then describing a number of
levels of attainment, from none to complete mastery.
Calculating total scores
Before the scoring rubric can be finalized, decisions need to be made about how the reported score will be calculated.
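As a concrete illustration, a weighted total for an analytic rubric might be calculated as follows. The categories, weights, and the 1-6 scale here are hypothetical examples, not the rubric of any test discussed in this document.

```python
# Hypothetical analytic rubric: three weighted categories, each rated 1-6.
WEIGHTS = {"content": 0.4, "organization": 0.3, "language": 0.3}

def total_score(subscores: dict) -> float:
    """Weighted total, reported on the same 1-6 scale as the sub-scores."""
    return round(sum(WEIGHTS[c] * subscores[c] for c in WEIGHTS), 1)

print(total_score({"content": 5, "organization": 4, "language": 3}))  # 4.1
```

Whether to report such a composite alone or alongside the individual sub-scores is one of the reporting decisions mentioned above.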

The scoring process


Once the scoring rubric has been finalized, the next step is to select raters and design a process for the
operational scoring of scripts.

Procedures for assuring reliability


• Each script must be scored independently by at least two raters,
with a third rater adjudicating in cases of discrepancy.
• Scoring should be done in a controlled reading.
• Checks on the reading in progress should be made by reading leaders (sometimes called Table Leaders).
• Evaluation and record keeping are essential for an ongoing assessment program.
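The two-rater procedure with third-rater adjudication can be sketched in code. The discrepancy threshold (ratings more than one band apart) and the resolution rule are assumptions for illustration; operational programs define their own rules.

```python
# Sketch of two independent ratings with third-rater adjudication.
# Threshold and resolution rule below are illustrative assumptions.

def score_script(r1: int, r2: int, adjudicate=None) -> float:
    """Return a final score from two independent ratings.

    If the ratings are within one band, average them; otherwise call
    the adjudicate() function to obtain a third rating.
    """
    if abs(r1 - r2) <= 1:
        return (r1 + r2) / 2
    if adjudicate is None:
        raise ValueError("Discrepant ratings: third rater required")
    r3 = adjudicate()
    # One common resolution: average the third rating with the
    # original rating closest to it.
    closer = r1 if abs(r1 - r3) <= abs(r2 - r3) else r2
    return (closer + r3) / 2

print(score_script(4, 5))             # 4.5
print(score_script(3, 6, lambda: 5))  # 5.5
```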

Rater training

Special problems in scoring


a. Off-task scripts
b. Memorized scripts
c. Incomplete responses

Evaluating scoring procedures


a. Assessing reliability of scores: intra-rater reliability and inter-rater reliability
b. Assessing validity of scoring procedures. This can be evaluated through questions such as:
1) Do the scoring procedures, in particular the scoring guide, accurately reflect the construct being
measured?
2) Are the scoring procedures being implemented in an appropriate way?
3) Do the scores obtained from the test allow us to make appropriate inferences about writing ability
and thus appropriate decisions about test takers?
c. Evaluating the practicality of scoring procedures
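As a minimal sketch of how inter-rater reliability might be quantified, the following computes an exact-agreement rate and a Pearson correlation over paired ratings. The indices are standard, but the rating data are invented for illustration.

```python
# Two hypothetical raters' scores for the same eight scripts (1-6 scale).
from statistics import mean, pstdev

r1 = [4, 5, 3, 6, 2, 4, 5, 3]
r2 = [4, 4, 3, 5, 2, 5, 5, 3]

# Exact-agreement rate: proportion of scripts given identical scores.
agreement = sum(a == b for a, b in zip(r1, r2)) / len(r1)

# Pearson correlation between the two raters' scores.
m1, m2 = mean(r1), mean(r2)
cov = mean((a - m1) * (b - m2) for a, b in zip(r1, r2))
pearson = cov / (pstdev(r1) * pstdev(r2))

print(f"agreement = {agreement:.2f}, r = {pearson:.2f}")
```

Intra-rater reliability can be computed the same way, with the two lists holding one rater's scores on two occasions.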
Illustrative tests of writing

There are five illustrative writing tests: the TOEFL (Test of English as a
Foreign Language), the FCE (First Certificate in English), the IELTS
(International English Language Testing System), the BEST (Basic English
Skills Test), and the CoWA (Contextualized Writing Assessment).

TOEFL
(Test of English as a Foreign Language)
 Purpose
To evaluate the English proficiency of people whose native language is not English.
TOEFL scores are used primarily in decisions about admission to colleges and
universities and scholarship programs. Writing was assessed by the separate Test of
Written English (TWE) until the computer-based TOEFL introduced a writing section in July 1998.
 Test content
The TOEFL writing test consists of a single essay. According to ETS (2000), the
purpose of the writing test is to demonstrate test takers' ability to write in English.
Examinees are not given a choice of prompt and must write on an assigned prompt
selected randomly by computer. Examinees may write their essays by hand or on the
computer; handwritten essays are scanned into a computer before being scored.
Writing Prompt
 Scoring
The TOEFL writing is scored on a six-point holistic scale.
 Discussion
A compulsory writing test is a relatively new development in the TOEFL, which has its
roots in the American psychometric tradition of discrete-point testing. These factors have
been important in determining the structure and content of the TOEFL writing test.
 Construct
The construct being measured is limited to a narrow focus: the ability to write
argumentative discourse on an impromptu topic.
 Authenticity
The authenticity of the TOEFL writing test is limited by the fact that there is no
opportunity to read about or discuss the assigned topic before writing about it.
FCE
(First Certificate in English)
 Purpose
This examination is used to certify English language proficiency
for a variety of purposes. Examinees are divided among
employment, study, and personal interest.
 Test content
The FCE writing paper consists of two writing tasks: a compulsory
task and an optional task. The total time for the paper is 1 hour 30 minutes.
 Scoring
The FCE writing tasks are scored on a six-band scale.
Example of FCE test
IELTS
(International English Language Testing System)
 Test purpose
The purpose of the IELTS test (IELTS, 2002) is to assess the language ability of candidates
who need to study at the post-secondary or university level, or to work in a professional
capacity, where English is used as the language of communication.
 Test content
Both the General Training and Academic modules consist of two tasks: a shorter task of at least
150 words and a longer task of at least 250 words. Examinees are advised to spend 20
minutes on the first task and 40 minutes on the second. The main differences between the
two modules are the topic areas (general vs. academic) and the complexity of the tasks.
Writing tasks 1 and 2
BEST
(Basic English Skills Test)
 Test purpose
The BEST is intended for use with limited-English-speaking adults for whom
information on attainment of basic functional language skills is needed. The test is designed
to be used for placement in ESL courses, progress testing for development of survival
skills, diagnostic testing, screening for vocational training, and program evaluation.
 Test content
The test consists of an oral interview section, administered individually, and a literacy skills
section containing a number of basic functional literacy tasks:
- Reading: reading calendars, food and clothing labels, telephone directories, train
schedules, short informational passages, etc.
- Writing: filling out personal information, addressing an envelope, and writing notes.
This section consists of two items, each of which asks test takers to write three or
four sentences on a given topic.

 Scoring
The writing samples are scored strictly on the basis of the
amount of communicative information appropriate to the
task conveyed in the writing. The test manual makes it clear
that writing must be on task to be scored; if test takers
write on the wrong topic, the score will be 0.
CoWA
(Contextualized Writing Assessment)
 Test purpose
The CoWA is intended to certify the second-language proficiency of secondary
and post-secondary students. The CoWA is intended for use in situations where a writer's
performance in a second language must meet a minimal criterion for placement, such as
fulfilling a graduation requirement.
 Test content
The CoWA consists of five tasks organized around a single theme, involving
contextualized tasks and a brainstorming task.
 Scoring
The scoring criteria consist of task fulfillment, vocabulary, discourse, and correct formation
of the present time/immediate future.
