

Journal of Second Language Writing 16 (2007) 194–209

Teaching writing teachers about assessment
Sara Cushing Weigle *
Department of Applied Linguistics & ESL, Georgia State University, P.O. Box 4099, Atlanta, GA 30302-4099, USA

Abstract

The assessment of student writing is an essential task for writing teachers, and yet many graduate programs do not require students to take a course in assessment or evaluation, and courses on teaching writing often devote only a limited amount of time to the discussion of assessment. Furthermore, teachers frequently need to prepare their students for externally mandated large-scale writing assessments, and thus they need to have an understanding of the uses and misuses of such tests. This article outlines some of the essential considerations in classroom and large-scale assessments and provides suggestions for how to incorporate considerations about assessment into a course on teaching writing or as a stand-alone course.

© 2007 Elsevier Inc. All rights reserved.
Keywords: Second language writing; Writing assessment; Teacher education

Assessment of student writing is an essential task for writing teachers. Unfortunately, however, many graduate programs in TESOL and rhetoric/composition do not require students to take a course in assessment or evaluation, and courses on teaching writing often devote only a limited amount of time to the discussion of assessment. Moreover, teachers often feel that assessment is a necessary evil rather than a central aspect of teaching that has the potential to be beneficial to both teacher and students. They may believe, rightly or wrongly, that assessment courses focus too much on statistics and large-scale assessment and have little to offer classroom teachers. As a result, teachers sometimes avoid learning about assessment or, worse, delay thinking about how they will assess their students until they are forced to do so, a situation which unfortunately decreases the chances that assessments will be fair and valid. At the same time, writing teachers often find themselves in a position of having to prepare their students for externally imposed assessments such as departmental or university-wide exit examinations or large-scale high-stakes tests such as the Test of English as a Foreign Language

* Tel.: +1 404 413 5192; fax: +1 404 413 5201. E-mail address: sweigle@gsu.edu.
1060-3743/$ – see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.jslw.2007.07.004



(TOEFL). Teachers sometimes feel that such assessments have little to do with the skills they are trying to teach their students; consequently, they may approach these tests with some resistance and, unfortunately, little understanding of how such tests are constructed or scored and whether or not they have been validated for the purpose for which they are being used. It is my belief that writing teachers must be adequately prepared to construct, administer, score, and communicate the results of valid and reliable classroom tests, and that, similarly, they should have an understanding of the uses and misuses of large-scale assessments so that they can be critical users of such tests and effective advocates for their students in the face of mandatory assessments not of their own making.

In this paper, I start by outlining some of the fundamental principles of assessment in general, and then discuss the process of test development, some of the considerations that teachers must think about in designing classroom writing assessments, and some suggestions for how teacher trainers might approach these issues in a course on second language writing or on assessment. Finally, I discuss large-scale assessment and some of the ways in which teachers can be empowered by a deeper understanding of these assessments that affect their students.

Classroom assessment

For any teacher, the ability to design fair and valid ways of assessing their own students' progress and achievement is an essential skill. To do so, teachers need to understand the range of possibilities for assessing students, what the essential qualities of a good assessment instrument are, and how to develop assessments that maximize these qualities within the constraints of time and resources that teachers face. It may be useful at first to clarify some terminology and to outline various types of assessments.
Assessment is a broad term that encompasses all sorts of activities that teachers engage in to evaluate their students' progress, learning needs, and achievements. As Brown (2004) notes, teachers are constantly evaluating their students in informal ways, and these informal evaluations are an important part of assessment, just as more formal tests are. Informal assessments include such things as clarification checks to make sure students understand particular teaching points, eliciting responses to questions on style and usage from students, or circulating among students doing peer response work to ensure that they are on task.

Formal assessments can be defined as "exercises or procedures specifically designed to tap into a storehouse of skills and knowledge" (Brown, 2004, p. 6). For a writing class, formal assessments may include traditional writing tests, for example, an exercise in which students are required to generate one or more pieces of connected discourse in a limited time period, which are then scored on some sort of numerical scale (Hamp-Lyons, 1991a,b), and other activities, in particular, response to and evaluation of artifacts such as portfolios, homework assignments, or out-of-class writing assignments.

It is important for teachers to recognize that all of these activities – informal assessments, and various types of formal assessments, including tests – have a place in a teacher's assessment toolbox, all are appropriate under certain circumstances, and all need to be evaluated according to the most important qualities of effective assessments: in particular, reliability, validity, and practicality. Thorough treatments of these qualities can be found in a variety of sources, including those listed in the Appendix; I include only a brief discussion of them here.

A good test is reliable; that is, it is consistent.
A student should get the same score on a test one day as on the next day (assuming, of course, that no additional learning has taken place in the interim) or from one grader/rater as from another. If there is a choice of topics or tasks, they should be equivalent in difficulty so that a student’s chances of performing optimally do not depend on which topic they choose. Finally, conditions of administration should be as similar as



possible so that factors not related to the skills being assessed do not affect student performance. For example, students should all be given the same amount of time to complete the assessment.

A good test is valid for the purposes for which it is being used. Validity is a complex issue that is discussed at length in many references on assessment (e.g., Bachman, 1990; Bachman & Palmer, 1996; Hamp-Lyons, 1991a,b; Hudson & Brown, 2002; McNamara, 1996). In essence, validity has to do with the appropriateness of decisions that will be made on the basis of the test so that, for example, students who are capable of demonstrating excellent work in class are able to do so on the test, and those who are not as capable are not able to pass the test by other means (for instance, by lucky guessing or by memorizing a response). For most classroom purposes, the most important validity consideration is that the content of the test is representative of the skill(s) and knowledge that are being taught in the course, both in terms of covering the range of skills adequately, and also in terms of not assessing skills that are not being taught in the course.

A good test is practical; that is, it can be developed, administered, and scored within the constraints of available resources, particularly time. For teachers, practicality is an overriding concern; writing teachers in particular know how time-consuming it is to grade papers. Teachers need to have realistic expectations about how much time they can devote to developing assessments, as well as how long it will take to administer and score any assessment of writing.

Reliability, validity, and practicality are not the only considerations for assessment.
For example, Bachman and Palmer (1996) include interactiveness, authenticity, and impact (the effect of an assessment on learners, teachers, and other stakeholders) in their model of test usefulness; for classroom teachers, however, reliability, validity, and practicality are perhaps the most critical qualities to be familiar with.

The test development process

Whether one is writing a test for an individual classroom or for large-scale administration, the essential steps are the same. Many books on language testing provide guidance for test development at the classroom level and for large-scale tests in greater detail than is possible here (see, for example, Alderson, Clapham, & Wall, 1995; Bachman & Palmer, 1996; Weigle, 2002). For any classroom test, four major considerations go into an assessment procedure. These are:

1. specifying measurable objectives,
2. deciding how to assess objectives (formally and informally),
3. setting tasks, and
4. scoring.

Specifying measurable objectives

One of the most fundamental lessons about assessment is that decisions about assessment should not be left until the end of instruction, but rather should be taken into account from the very beginning, preferably in the earliest planning stages for a course. Teachers need to learn how to articulate precisely what it is they hope students will learn in their courses so that they can develop ways of assessing whether their students have, in fact, mastered the course objectives. For this reason, it is helpful to state course objectives in terms of observable behaviors or products so that they can be evaluated appropriately. Many writing course syllabi contain general objective statements such as "students will learn the basics of academic writing" or "after completing this class, you will know how to revise and



edit your writing." The problem with statements that are framed in this way is that they do not provide any guidance for developing assessments that will help teachers judge whether students have met these objectives. As a teacher, how does one know when a student has "learned the basics of academic writing"? Does the writing of a student who has accomplished this objective differ from that of a student who has not? Without a clearer statement of measurable outcomes, it will be impossible for teachers to know whether they have been successful. This problem of vaguely worded objectives is compounded in a multi-level program where students need to progress through two or more levels and different sections of the same course are taught by different teachers. Statements such as those above do not provide useful ways of articulating between levels. It is, therefore, much more helpful to start out by stating objectives in such a way that it is clear when the objectives have been met.

There are many sources in the educational literature on writing clear objectives (see, for example, Gronlund, 2004), but my own inspiration in this area comes from business rather than education. David Allen, in his excellent book Getting Things Done, provides a three-step model for stating outcomes, which can be applied just as easily to teaching as to the business world. The steps are:

1. View the project from beyond the completion date.
2. Envision "WILD SUCCESS."
3. Capture features, aspects, qualities you imagine in place (p. 69).

In terms of teaching writing, these steps can be conceptualized as follows:

1. Imagine the class and the students at the end of the term.
2. Think about the very best piece of writing that could come from this class.
3. Describe its attributes. What does it look like? What makes it stand out? Is it the correct use of verb tenses? Is it the vivid details or the insightful thinking that went into the writing? Is it the use of transitions and other cohesive devices? Has the student revised appropriately in response to instructor and/or peer feedback?

Teachers who can articulate what they imagine their best writers can accomplish at the end of a term are in a good position to begin developing assessments. Furthermore, by defining one's objectives in this way at the beginning of instruction, teachers can begin to plan how they will assist students in reaching these goals, thus allowing concerns about assessment to inform instruction from the very beginning.

One activity that can help teachers articulate their objectives is to have them write an imaginary endnote to a final draft of a writing assignment from their course, as in Figure 1. Note that the questions cover three main areas: what the student has done well (i.e., the student's

Fig. 1. Imaginary endnote to a final draft of a writing assignment.



strengths, whether specifically learned in the course or not), ways in which the student has improved (i.e., what the student has learned from the course), and what the student could focus on for the future (i.e., what the student may not yet have mastered but is ready to learn). The questions are flexible enough to cover linguistic, content, rhetorical, or process dimensions of writing.

Once teachers have determined what a successful paper would look like, they are ready to write outcome statements that contain measurable objectives. One rule of thumb for specifying objectives is to include three characteristics: a description of the performance itself, or what the student is expected to do; the conditions under which the performance will be elicited; and the level of performance that will be deemed acceptable (Mager, 1975, cited in Ferris & Hedgcock, 2005).

Some teachers may object that setting goals in this way is inappropriate for teaching writing, especially those who view personal expression as the main goal of writing instruction (see Raimes, 1991, for an overview of different perspectives on the goals of writing courses). Indeed, one of the dangers of writing objectives in this way is that what is measurable is not always what is essential, so the focus often turns to easily quantifiable traits of essays such as error counts. Teachers need to find a compromise they can live with between too much specificity, which can lead to an unhealthy focus on lower level skills to the detriment of the big picture, and too much generality, which can make it nearly impossible to ascertain how successfully the course objectives have been met.

One example of such a compromise can be found in Figure 2. Note that the outcome statements cover objectives related to the range of written products, the use of language, and the writing process, and are written using verbs that describe observable behavior (uses, writes, etc.).
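The rule of thumb about observable verbs can be expressed as a rough screening heuristic. The sketch below is a toy illustration, not a validated tool: the two verb lists are assumptions chosen for this example, and a real syllabus review would of course rely on human judgment.

```python
# Toy heuristic for flagging objective statements that are probably not
# measurable. The verb lists are illustrative assumptions, not an
# authoritative taxonomy: "measurable" here simply means the verb names
# observable behavior (writes, uses) rather than an internal state
# (know, learn, understand).

VAGUE_VERBS = {"know", "understand", "learn", "appreciate"}
OBSERVABLE_VERBS = {"write", "writes", "use", "uses", "revise", "revises",
                    "edit", "edits", "produce", "produces"}

def looks_measurable(objective: str) -> bool:
    """Return True if the objective uses an observable verb and no vague one."""
    words = objective.lower().replace(",", " ").split()
    if any(v in words for v in VAGUE_VERBS):
        return False
    return any(v in words for v in OBSERVABLE_VERBS)

print(looks_measurable("Students will learn the basics of academic writing"))
# → False (vague verb "learn"; no observable behavior named)
print(looks_measurable(
    "Students write a five-paragraph essay and revise it after feedback"))
# → True (observable verbs "write" and "revise")
```

The point of the sketch is the distinction itself: an objective stated with observable verbs can be checked against student work, while a "students will learn…" statement cannot.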
The benefits of specifying outcomes in this way are numerous. Teachers can use these outcome statements to make teaching decisions and to design rubrics for evaluating writing, and students benefit because what is expected of them becomes much clearer.

One assignment that can help students practice writing clear course objectives is to provide a sample syllabus (see, for example, Ferris & Hedgcock, 2005, pp. 110–118). The students' task is to evaluate the course objectives in terms of whether they are specific and measurable. For those that are not, students rewrite the objectives so that they are specific and measurable. For those objectives that are already specific, students can discuss how they would write assignments that measure those objectives.

Deciding on how to assess objectives

Once teachers have a list of objectives, the next step in the process is to decide which objectives will be assessed informally, which will be assessed formally through tests, and which will be assessed formally through means other than tests. For example, objectives related to critical thinking skills might best be assessed through informal means such as observing participation in class discussions or responding to reading journals, while more specific language-related objectives such as the correct use of verb tenses might be assessed as part of a test, either as a controlled exercise or as part of the evaluation of a timed writing assignment. Teachers need to be aware of the multiplicity of ways in which various objectives might be tested. The bibliography at the end of this article contains numerous resources for designing assessment tasks, testing books in particular. Cohen (1994) and Hughes (2002) contain chapters devoted to various ways of assessing writing skills either holistically or as discrete subskills. In the next section, I will focus on setting tasks for independent writing (either as timed single-draft



essays or as untimed multiple-draft essays) rather than for testing subskills such as grammatical knowledge or the ability to paraphrase. Following this, I describe portfolio assessment as a potentially more valid way of assessing many aspects of writing than can be assessed in a single test. First, however, I will explore the issue of whether one should test writing at all in a writing course; that is, under what circumstances is a writing test appropriate?

In-class versus out-of-class writing

Writing teachers frequently face the dilemma of whether to assess in-class as well as out-of-class writing. Particularly in classes where the writing process is emphasized, many teachers feel that it is counterproductive to assess students on a single draft of a paper, especially on an impromptu topic that students may not have had time to think about before the day of the

Fig. 2. Examples of outcome statements.




assessment. If a final examination for a writing course consists of impromptu writing only, students are given a mixed message about what kind of writing is actually important for them to be able to master. Furthermore, most writing outside of testing situations in the real world is not completed under time pressure. This is particularly true for academic writing. The process of writing involves reflection, discussion, reading, feedback, and revision, and one’s best work is usually not produced in a single draft within 30 or 60 minutes. A final reason for emphasizing out-of-class writing is that some L2 students may have difficulties on timed writing tests even if they are successful in other academic writing tasks



(Byrd & Nelson, 1995; Johns, 1991). Furthermore, English teachers without ESL training may be susceptible to basing their evaluations of NNS writing more on sentence-level concerns than on content or rhetorical concerns (Sweedler-Brown, 1993). NNS may not be able to perform as well under time pressure as their native-speaking peers, and this may be especially noticeable in timed writing.

However, there are at least three important reasons why teachers would want to include some sort of in-class writing assessment as part of their assessment of students' abilities. The first reason is simply a pragmatic one: timed writing tests are a fact of life for many students. Writing tests have become standard on large-scale high-stakes tests such as the TOEFL and the GRE, and such tests can have a profound effect on students' futures. Furthermore, in content courses such as history or psychology – at the undergraduate level, at least – students are frequently expected to write short essays on their examinations (Carson, Chase, Gibson, & Hargrove, 1992). The ability to compose under time pressure is thus critical for many students, and the writing class can be a valuable place to learn strategies for timed writing and to practice this skill.

In addition, while collaboration in writing is often seen as an important component of the writing process, there are times when teachers want to know what students can do on their own without assistance. In out-of-class writing assignments there is always the danger that students have received inappropriate amounts and kinds of help from tutors, friends, or roommates. In particular, second language writers may ask their native speaker friends to proofread their papers and fix sentence-level errors. Teachers are certainly justified in asking students to produce at least some writing in class where they are unable to rely on such outside support.
A third reason for testing writing in class under timed conditions comes from second language acquisition theory. From a psycholinguistic viewpoint, in-class writing can serve as a test of automatized knowledge of English. In general, adults writing in their first language have automatic access to lexical and syntactic resources, while for many second language writers, particularly at lower levels of proficiency, these processes are not yet automatic, so writers need to focus conscious attention on retrieving words and explicit grammar rules from long-term memory. This need to pay attention to word- and sentence-level concerns makes it difficult to focus on macro-level issues such as overall structure and organization and on writing strategies that they may use in their first language (see Weigle, 2005, for a summary of research in this area).

Furthermore, as Ellis (2005) demonstrates, different tasks evoke implicit and explicit knowledge. Ellis found that an untimed grammaticality judgment test evoked explicit or rule-governed knowledge, particularly for those sentences that were ungrammatical, while a timed test evoked implicit knowledge. One might hypothesize on the basis of these results that timed and untimed writing assignments would evoke different knowledge types; therefore, if one is interested in knowing how much linguistic knowledge is implicit and automatized, a timed writing assessment may be an appropriate vehicle for this purpose.

For these reasons, although many writing teachers feel that in-class writing does not allow students to demonstrate their best ability, one can justify assessing both in-class and out-of-class writing as complementary sources of information about student abilities, particularly when it comes to making high-stakes decisions such as passing or not passing a course.
In such cases, as assessment specialists (e.g., Brown & Hudson, 1998) frequently point out, it is particularly critical to use multiple sources of information, as no single test of an ability is without error. In assessing in-class or timed writing, however, classroom teachers have advantages that developers of large-scale tests do not, in that they can modify the timed impromptu essay to take advantage of the extended time they spend with students before a test. Elsewhere (Weigle, 2002), I have presented ways of modifying the timed impromptu essay to fit the classroom environment.



Possibilities include strategies such as discussing a topic in class and doing preliminary brainstorming, allowing students to write an essay outline before writing their drafts in class, and/or writing an in-class draft for a grade, followed by revising it out of class based on teacher or peer feedback for a separate grade. Because of the difficulties that second language writers often have managing both the content and linguistic demands of a writing assignment, giving students the opportunity to prepare the content in advance of the writing may allow them to demonstrate their best writing.

Setting tasks

Whether one is evaluating in-class or out-of-class writing, a useful approach to task development is to begin by drafting test specifications as a way of articulating clearly what one is attempting to assess. Specifications are particularly important in developing large-scale tests, but even for an individual teacher, specifications can be helpful as a tool for planning out an assessment. Specifications can benefit teachers in at least three ways: (1) the process of developing specifications helps to ensure that teachers have considered the specific aspects of writing that they are attempting to assess and how those aspects are operationalized in tasks and scoring procedures; (2) within a given program, teachers can share specifications so that courses at the same level can maintain the same evaluation standards and procedures; and (3) sharing specifications with students allows them to know exactly how they will be assessed, in terms of what sorts of tasks they can be expected to perform and how they will be evaluated (Weigle, 2002). Specifications can take many forms, but one useful format is that described in detail in Davidson and Lynch (2002).
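As a concrete illustration, a classroom writing-test specification can be thought of as structured data. The sketch below is a hypothetical example only: every field name and value is invented for illustration, and it only loosely echoes the general-description / prompt / scoring-guide shape common to published formats such as Davidson and Lynch's.

```python
# Hypothetical specification for a classroom writing test. All content
# is invented for illustration; a real specification would be tailored
# to a specific course and would include far more detail.

spec = {
    "general_description": (
        "Students write a short argumentative essay taking a position "
        "on a familiar topic and supporting it with reasons and examples."
    ),
    "prompt_attributes": {
        "stimulus": "one short statement of opinion (no reading passage)",
        "rhetorical_task": "argument",
        "audience": "general academic reader",
        "time_allowed_minutes": 40,
        "length": "1-2 handwritten pages",
    },
    "scoring_guide": {
        "type": "analytic",
        "categories": ["content", "organization", "language use"],
        "scale": "0-5 per category",
    },
}

# A spec in this form can be shared with colleagues teaching parallel
# sections, or with students, before any individual prompt is written.
for part, detail in spec.items():
    print(part, "->", detail)
```

Even this toy version shows the practical payoff the article describes: once the specification exists, writing several equivalent prompts from it is largely mechanical.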
The main parts of the specification are (a) a general description of the skill(s) or ability(ies) being tested, including a rationale for why these particular skills are important for the given testing context; (b) a description of the prompt, or the instructions to the student about what to write, including a description of any additional stimulus material such as reading passages, pictures, or graphs; and (c) a description of the scoring guide, or rating scale.

In my experience, teachers in training are often skeptical of the value of specifications and sometimes resistant to the notion of spending time on specifications until they actually go through the process of developing a specification and a test. However, they usually find that writing a specification is helpful in clarifying their thinking and anticipating potential difficulties, and that, in the long run, writing a specification saves time. As one student wrote in an online posting for an assessment course:

    When we began discussing test specifications I felt sooo lost and had no clue where to begin. After reading and discussing in class, I thought that writing the specs would be difficult but not impossible. Now, I can just look at my specs and create test items with much more understanding of how the process works. I would just like to advertise for spec writing and say that they really are the blueprints and make things so much clearer when it comes to creating a test that is relevant. I finally see the light even though I still have much to learn and perfect when it comes to writing tests.

As noted above, specifications should include a description of the prompt (instructions to the student) and of the expected response. Useful guidelines for designing prompts can be found in Kroll and Reid (1994). Depending on the goals of the assessment, specifications can include any of the dimensions for writing tasks (from Weigle, 2002) outlined in Table 1. For example, one

Table 1
Dimensions of tasks for writing assessment

Dimension               Examples
Subject matter          Self, family, school, technology, etc.
Stimulus                Text, multiple texts, graph, table
Genre                   Essay, letter, informal note, advertisement
Rhetorical task         Narration, description, exposition, argument
Pattern of exposition   Process, comparison/contrast, cause/effect, classification, definition
Cognitive demands       Reproduce facts/ideas, organize/reorganize information, apply/analyze/synthesize/evaluate
Specification of:
  Audience              Self, teacher, classmates, general public
  Role                  Self/detached observer, other/assumed persona
  Tone, style           Formal, informal
Length                  Less than 1/2 page, 1/2 to 1 page, 2–5 pages
Time allowed            Less than 30 min, 30–59 min, 1–2 h
Prompt wording          Question vs. statement, implicit vs. explicit, amount of context provided
Choice of prompts       Choice vs. no choice
Transcription mode      Hand-written vs. word-processed
Scoring criteria        Primarily content and organization, primarily linguistic accuracy, unspecified

From Weigle (2002). Adapted from Purves, Söter, Takala, and Vähäpassi (1984, pp. 397–398) and Hale et al. (1996).

might specify that students will write a one-page (length) narrative letter (rhetorical task/genre) to a close friend (audience) using a series of picture prompts (stimulus) as input, and so on.

Scoring

One of the most troublesome aspects of assessing writing for many teachers is assigning letter grades or numerical scores to their students' work. One reason for this difficulty is that many teachers feel much more comfortable in the role of supportive coach than of evaluator. Another reason is that teachers sometimes begin their assessment with some idea of how many points a particular assignment is worth, but without a clear notion of how those points should be awarded or the criteria they should use to grade their students' work. For these reasons, among others, teachers need a systematic process for assigning scores to essays or other written work and some sort of written rubric that outlines the criteria for grading. Sources for writing rubrics abound in print and online, so there is little need for a teacher to start from scratch in developing a rubric for grading.

In creating a rubric, teachers need to be familiar with the main types of rubrics. Rubrics vary along two dimensions: whether they are general (to be used across a variety of assignments or writing tasks) or specific to a single assignment, and whether a single score is given (usually referred to as a holistic scale) or separate scores are given for different aspects of writing, such as content, organization, and use of language (an analytic scale). Much has been written about the advantages and disadvantages of different types of scoring rubrics; see in particular Hamp-Lyons (1991a,b) and Weigle (2002, chap. 6).
While arguments can be made for either type of scoring rubric, research suggests that, although holistic scales are faster and more efficient, analytic scales tend to be somewhat more reliable than holistic scales, and they certainly provide more useful feedback to students, as scores on different aspects of writing can tell students where their respective strengths and weaknesses lie.
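To make the holistic/analytic contrast concrete, here is a minimal sketch of how an analytic rubric combines component ratings into a total. The category names and weights are hypothetical, chosen only to loosely echo the proportions of published analytic scales such as Jacobs et al.'s, in which content is weighted most heavily and mechanics least.

```python
# Hypothetical analytic rubric: weighted component ratings combined into
# a 0-100 total. Weights and categories are illustrative assumptions,
# not taken from any published scale.

WEIGHTS = {            # each component is rated 0-10 by the rater
    "content": 0.30,
    "organization": 0.20,
    "vocabulary": 0.20,
    "language_use": 0.25,
    "mechanics": 0.05,
}

def analytic_score(ratings):
    """Combine per-category ratings (0-10) into a single 0-100 score."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the rubric categories")
    return round(sum(WEIGHTS[cat] * r * 10 for cat, r in ratings.items()), 1)

essay = {"content": 8, "organization": 7, "vocabulary": 6,
         "language_use": 6, "mechanics": 9}
print(analytic_score(essay))  # → 69.5
```

Unlike a single holistic score, the component ratings themselves can be reported back to the student, which is exactly the diagnostic advantage of analytic scales described above.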



In training teachers, it is useful to have them try out existing scoring rubrics on a set of essays on a given topic – perhaps one holistic rubric such as the TOEFL writing rubric and one analytic rubric such as that proposed by Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey (1981) – and compare their scores in small groups. Teachers in training usually learn from this experience that (a) without exemplars at different levels, the various descriptors are difficult to interpret consistently; (b) they can usually agree on the best and the worst essays, but the ones in the middle are more difficult to agree on; and (c) different raters read different things into papers and bring their own values and experiences into the rating process, which highlights the importance of rater training to clarify how the scale should be used in a given context so that raters can learn to apply similar standards.

To summarize, there are many things that novice teachers need to learn about developing their own classroom writing assessments: in particular, how to articulate their course objectives clearly so that their assessments match their instruction as closely as possible, how to construct prompts that elicit reliable samples of writing that are valid indicators of their students' ability, and how to score writing reliably and efficiently. This article has only scratched the surface of these issues; the interested reader is referred to the sources listed in the Appendix for additional information in these areas.

Portfolio assessment

Experienced writing teachers and scholars agree that writing tests such as those described in the previous section are quite limited in their usefulness for assessing the complete range of a student's writing ability.
Writing ability is perhaps best conceptualized as the ability to compose texts in a variety of genres that are appropriate for their audience and purpose, and it is difficult, if not impossible, to generalize from a single text on a single topic composed under time constraints to this broader universe of writing. For this reason, many individual teachers and writing programs have adopted portfolio assessment as a (potentially) more valid approach to writing assessment. A complete discussion of portfolio assessment is beyond the scope of this paper; interested readers are referred to Hamp-Lyons and Condon (2000), Mabry (1999), Weigle (2002, chap. 9), and Wolcott and Leggett (1998) for more thorough treatments. Here, I will define portfolio assessment and provide a brief overview of some of its advantages and constraints.

A portfolio is ‘‘a purposeful collection of student works that exhibits to the student (and/or others) the student’s efforts, progress, or achievement in a given area’’ (Northwest Evaluation Association, 1991, p. 4, cited in Wolcott & Leggett, 1998). Portfolios vary greatly depending on the age of students, the purpose of the course, and the learning context, among other variables, but three essential components of a portfolio are collection, reflection, and selection (Hamp-Lyons & Condon, 2000). A portfolio is a collection of written products rather than a single writing sample, but it is the process of selecting and arranging the specific contents through deliberate reflection that distinguishes a portfolio from a pile of papers or a large folder (p. 119). Another important component of most portfolio assessment programs is delayed evaluation, which gives students both the motivation and the time to revise their papers based on feedback and self-reflection before turning them in for a final grade.
Portfolio assessment has several advantages over traditional writing tests as a means for evaluating student growth and achievement in writing. First and foremost, portfolios allow assessment and instruction to be integrated seamlessly, as everything that happens in the writing class contributes directly to the process of assembling the portfolio. Furthermore, portfolio assessment allows students to demonstrate their mastery of different genres and registers, as well as their mastery of different aspects of the writing process, such as the ability to revise one’s writing based on feedback and to edit one’s writing for sentence-level errors. For second language writers in particular, portfolio assessment has the advantage of affording extra time for revision and editing to students who may not perform as well under timed conditions.

Despite these advantages, however, implementing a portfolio assessment program is not without its difficulties. One potentially problematic aspect of portfolio assessment is reliability of scoring: individual portfolios may contain writing samples that vary greatly in quality, which makes it difficult to assign a single score or grade to a portfolio, and the content of portfolios assembled by different students may vary considerably, making it difficult to score consistently across portfolios. Another area of potential difficulty has to do with practicality: setting up and maintaining a successful portfolio program requires a great deal of advance planning and investment of time and effort on the part of teachers, administrators, and students. These difficulties are not insurmountable, however, and many teachers who have successfully implemented portfolio assessment will state unequivocally that the benefits of portfolios far outweigh the difficulties.

What teachers should know about externally mandated assessments

In addition to knowing about classroom assessments, writing teachers need to be aware of many issues related to large-scale assessment. In many programs and institutions, teachers are obligated to prepare their students for large-scale examinations, ranging from locally produced exit examinations to professionally written tests such as the TOEFL. Teachers frequently have one of two attitudes towards these tests.
Some teachers feel mistrustful of standardized tests and the companies that make and administer them. They believe – not completely without justification – that many externally imposed tests are thrust upon them and their students for political reasons, and that such tests are created by people who are out of touch with the world of education. Others, on the other hand, are all too willing to trust the judgments of the ‘‘experts’’ rather than their own expertise. They tend to assume that a test is valid simply because it was written by a professional test writer, and they do not take the time to examine the test closely or look at the match between the test and their own goals and objectives in teaching. As a teacher trainer, I find it important to explore both of these points of view and point out some of the dangers and misconceptions involved in each.

Several scholars have pointed out the inherently political nature of large-scale assessment, since tests can be used as gatekeeping mechanisms that allow or restrict access to educational resources and opportunities (see, for example, Shohamy, 1998; White, Lutz, & Kamusikiri, 1996). Questions about whose agenda is being served by large-scale tests, who has the right to determine what ‘‘good writing’’ means, and what the intended and unintended effects of policy decisions about testing are on students, teachers, and programs need to be asked continually, particularly by teachers, who are among those most affected by large-scale tests and most immediately aware of how these tests affect their students. On the other hand, teachers may too easily adopt the view described by Scharton (1996) of ‘‘right-minded teachers struggl[ing] against ruthless big-company test designers who merely want to sell a test score to administrators interested in a quick fix’’ (p. 56) and may fail to appreciate the professionalism behind the development of large-scale tests.
My own experience as a member of the TOEFL Committee of Examiners for three years helped disabuse me of that particular perspective; I found that the people involved in developing the TOEFL were deeply committed to creating high-quality, fair, and valid assessments and were just as concerned with mitigating negative consequences to students as any teacher.



Of course, most teachers will not have the opportunity to get an up-close look at the inner workings of testing companies, and teachers’ busy schedules make it difficult to devote time to advocacy issues. However, one valuable assignment that I have used with teachers in training to raise their awareness of some of these issues is to critique an existing test from the point of view of reliability, validity, authenticity, practicality, and washback. Students have critiqued large-scale tests and tests that are used in their own institutions, and they are often surprised at what they find. For example, students have discovered that placement and exit tests used in local language schools and community colleges frequently have no handbook or technical manual and no record of how they were developed or validated, and that such tests often use writing prompts that have not been pretested or equated, so that there is no way to determine the effect of particular prompts or prompt types on the scores given. At the same time, students come to appreciate that large testing organizations such as Educational Testing Service, which many are used to thinking of in negative terms because of their dominance in high-stakes testing, in fact take tremendous care in defining constructs, designing valid and reliable assessments, and maintaining a program of research to ensure that their tests are of high quality. As test users and as advocates for their students, teachers have a responsibility to understand the powerful role that tests play in their students’ lives and, where relevant, to challenge misuses of tests—for example, the use of a single essay test to make high-stakes decisions such as exit or admission, or the practice of administering writing prompts that have not been validated or even pretested.
Huot (1996) proposes a set of principles for assessing writing that take into account the needs and concerns of all stakeholders in assessment; these principles can be a useful starting point in evaluating any mandated assessments that are in place at an institution. Huot argues that writing assessment should be site-based, that is, developed in response to a need at a specific site; locally controlled by the institution involved; context-sensitive, taking into account the instructional goals as well as the cultural and social environment of the institution; rhetorically based, adhering to ‘‘recognizable principles integral to the thoughtful expression and reflective interpretation of text’’; and accessible, so that procedures for creating and scoring writing assessments are available to all stakeholders, including the test takers themselves.

Teachers can also be advocates for fair testing by insisting that those who administer and score tests adhere to a code of practice and ethics, such as that promulgated by the International Language Testing Association (http://iltaonline.com/). The ILTA code of ethics consists of nine principles that should guide the professional behavior of language testers, each elaborated upon with a set of annotations. For example, Principle 1 states: ‘‘Language testers shall have respect for the humanity and dignity of each of their test takers. They shall provide them with the best possible professional consideration and shall respect all persons’ needs, values and cultures in the provision of their language testing service.’’ The draft code of practice outlines responsibilities and obligations for test writers, institutions, and users of test results with regard to good testing practices.
Teachers should not hesitate to ask questions about the reliability and validity of the tests that their students are required to take and about how test results will be used, and they should be proactive in bringing issues of questionable testing practices to the attention of administrators.

Conclusion

In this paper, I have briefly touched upon several issues related to assessment that writing teachers should be aware of, both in terms of tests that teachers develop for their own courses and in terms of large-scale tests. Because assessment is such an integral component of teaching, it is



regrettable that many graduate programs in composition and TESOL do not require an assessment course, and thus many teachers enter the classroom without a thorough grounding in assessment issues. Fortunately, teachers are not without resources: there are regional associations of language testing specialists that hold annual conferences; there are, increasingly, assessment-related sessions at major international conferences such as TESOL; and there are many excellent volumes that discuss assessment issues in clear, understandable terms. A solid understanding of assessment issues should be part of every teacher’s knowledge base, and teachers should be encouraged to equip themselves with this knowledge as part of their ongoing professional development.

Acknowledgments

I am indebted to Diane Belcher, Alan Hirvela, and two anonymous reviewers for their helpful suggestions on earlier drafts of this manuscript.

References
Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Allen, D. (2002). Getting things done: The art of stress-free productivity. London: Piatkus Books.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Brown, H. (2004). Language assessment: Principles and classroom practices. White Plains, NJ: Pearson Education.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32, 653–675.
Byrd, P., & Nelson, G. (1995). NNS performance on writing proficiency exams: Focus on students who failed. Journal of Second Language Writing, 4, 273–285.
Carson, J. G., Chase, N. D., Gibson, S. U., & Hargrove, M. (1992). Literacy demands of the undergraduate curriculum. Reading Research and Instruction, 31(4), 25–50.
Cohen, A. D. (1994). Assessing language ability in the classroom. Boston, MA: Heinle and Heinle.
Davidson, F., & Lynch, B. K. (2002). Testcraft: A teacher’s guide to writing and using language test specifications. New Haven, CT: Yale University Press.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study. Studies in Second Language Acquisition, 27, 141–172.
Ferris, D. R., & Hedgcock, J. R. (2005). Teaching ESL composition: Purpose, process, and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Gronlund, N. (2004). Writing instructional objectives for teaching and assessment (7th ed.). Upper Saddle River, NJ: Pearson Education.
Hale, G., Taylor, C., Bridgeman, B., Carson, J., Kroll, B., & Kantor, R. (1996). A study of writing tasks assigned in academic degree programs (TOEFL Research Report No. 54). Princeton, NJ: Educational Testing Service.
Hamp-Lyons, L. (1991a). Assessing second language writing in academic contexts. Norwood, NJ: Ablex.
Hamp-Lyons, L. (1991b). Scoring procedures for ESL contexts. In L. Hamp-Lyons (Ed.), Assessing second language writing in academic contexts (pp. 241–276). Norwood, NJ: Ablex.
Hamp-Lyons, L., & Condon, W. (2000). Assessing the portfolio: Principles for practice, theory, and research. Cresskill, NJ: Hampton Press.
Hudson, T., & Brown, J. D. (2002). Criterion-referenced language testing. Cambridge: Cambridge University Press.
Hughes, A. (2002). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Huot, B. (1996). Toward a new theory of writing assessment. College Composition and Communication, 47, 549–566.
International Language Testing Association. (2000). Code of ethics for ILTA. Retrieved October 3, 2007 from http://www.iltaonline.com/code.pdf.
Jacobs, H. L., Zinkgraf, S. A., Wormuth, D. R., Hartfiel, V. F., & Hughey, J. B. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House.



Johns, A. M. (1991). Interpreting an English competency examination: The frustrations of an ESL science student. Written Communication, 8, 379–401.
Kroll, B., & Reid, J. (1994). Guidelines for designing writing prompts: Clarifications, caveats, and cautions. Journal of Second Language Writing, 3, 231–255.
Mabry, L. (1999). Portfolios plus: A critical guide to alternative assessment. Thousand Oaks, CA: Corwin.
McNamara, T. F. (1996). Measuring second language performance. London: Longman.
Purves, A. C., Söter, A., Takala, S., & Vähäpassi, A. (1984). Towards a domain-referenced system for classifying assignments. Research in the Teaching of English, 18(4), 385–416.
Raimes, A. (1991). Out of the woods: Emerging traditions in the teaching of writing. TESOL Quarterly, 25, 407–430.
Scharton, M. (1996). The politics of validity. In E. M. White, W. D. Lutz, & S. Kamusikiri (Eds.), Assessment of writing: Politics, policies, practices. New York: The Modern Language Association of America.
Shohamy, E. (1998). Critical language testing and beyond. Studies in Educational Evaluation, 24, 331–345.
Sweedler-Brown, C. O. (1993). ESL essay evaluation: The influence of sentence-level and rhetorical features. Journal of Second Language Writing, 2, 3–17.
Weigle, S. (2002). Assessing writing. Cambridge: Cambridge University Press.
Weigle, S. (2005). Second language writing expertise. In K. Johnson (Ed.), Expertise in language learning and teaching (pp. 128–149). Hampshire, England: Palgrave Macmillan.
White, E. M., Lutz, W. D., & Kamusikiri, S. (Eds.). (1996). Assessment of writing: Politics, policies, practices. New York: The Modern Language Association of America.
Wolcott, W., & Leggett, S. M. (1998). An overview of writing assessment: Theory, research, and practice. Urbana, IL: National Council of Teachers of English.

Appendix. Selected references on assessment

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Brown, H. (2004). Language assessment: Principles and classroom practices. White Plains, NJ: Pearson Education.
Brown, J. D., & Hudson, T. (1998). The alternatives in language assessment. TESOL Quarterly, 32, 653–675.
Cohen, A. D. (1994). Assessing language ability in the classroom. Boston, MA: Heinle and Heinle.
Davidson, F., & Lynch, B. K. (2002). Testcraft: A teacher’s guide to writing and using language test specifications. New Haven, CT: Yale University Press.
Hamp-Lyons, L. (1990). Second language writing: Assessment issues. In B. Kroll (Ed.), Second language writing: Research insights for the classroom (pp. 69–87). Cambridge: Cambridge University Press.
Hamp-Lyons, L. (1991). Assessing second language writing in academic contexts. Norwood, NJ: Ablex.
Hamp-Lyons, L., & Kroll, B. (1997). TOEFL 2000-writing: Composition, community, and assessment (TOEFL Monograph Series Report No. 5). Princeton, NJ: Educational Testing Service.
Hudson, T., & Brown, J. D. (2002). Criterion-referenced language testing. Cambridge: Cambridge University Press.
Huot, B. (1990). The literature of direct writing assessment: Major concerns and prevailing trends. Review of Educational Research, 60, 237–263.



Shohamy, E. (2001). The power of tests: A critical perspective on the uses of language tests. London: Longman/Pearson Education.
Weigle, S. (2002). Assessing writing. Cambridge: Cambridge University Press.
White, E. (1994). Teaching and assessing writing: Recent advances in understanding, evaluating and improving student performance (2nd ed.). San Francisco: Jossey-Bass.
Wolcott, W., & Leggett, S. M. (1998). An overview of writing assessment: Theory, research, and practice. Urbana, IL: National Council of Teachers of English.
