

Language Assessment Quarterly, 2017, Vol. 14, No. 4, 312–327
https://doi.org/10.1080/15434303.2017.1393819

A Chinese EFL Teacher’s Classroom Assessment Practices


Xiaoying Wang
Beijing Foreign Studies University, Beijing, China

ABSTRACT
This article reports on a case study of how an experienced EFL teacher assessed
her students in her oral English course at a university in China. Data were
collected over one semester through document analysis, classroom observa-
tion and recording, interviews, and student journals. Analysis revealed that the
teacher assessed her students through both summative assessment (SA) and
formative assessment (FA), with some assessment practices serving both formative and summative
purposes simultaneously (dual-purpose CA). Moreover, the teacher integrated her FA and SA
in a productive way that pushed her
students to make progress in an upward spiral. The article ends with a
recommendation that experienced teachers’ expertise on classroom assessment
should be revealed and described, which may help teachers bridge the
gap between assessment theories and classroom practices.

Introduction
Classroom assessment (CA) refers to the assessment of student learning conducted in classroom settings, as
opposed to external standardized testing (Leung, 2005). While some researchers use CA interchangeably
with formative assessment (FA), other researchers use CA to include both formative and summative
functions of CA (cf. Brookhart, 2004). To avoid confusion, in this article the concept of CA will encompass
both formative and summative functions, and the concept of FA and that of summative assessment (SA)
will follow the classic definitions, that is, FA is typically conducted to improve teaching and learning,
whereas SA is usually administered to categorize students’ performances or for certification (Cizek, 2010).
Stemming from formative evaluation (Scriven, 1967) and criterion-referenced measurement (Glaser,
FA research began to attract increasing attention from the 1990s in the general education field and in the
language assessment field in particular, especially after the publication of Black and Wiliam’s (1998) review
article, which pointed out the importance of FA in promoting effective learning. The Chinese EFL
context has witnessed a similar trend since the new millennium. For one thing, the Chinese government has
since then published educational policies advocating FA or assessment for learning (AfL) to enhance EFL
teaching and support learning (The Higher Education Department of China Ministry of Education, 2007).
For another, an increasing number of articles on the topic have been published (Wu, 2008). It should be
noted that the majority of the published articles have focused on introducing the principles of FA or AfL to
Chinese readers and discussing how such principles can be applied to the Chinese EFL contexts in general
or to specific language skill courses (84.8% of the 79 articles in total). Only a small number of empirical
studies have tried out FA strategies in the researchers’ own language classrooms to examine the effects of FA on such
aspects as learner autonomy, learning strategies, or reading comprehension.
However, research so far shows that Chinese EFL teachers in general have difficulty in translating
FA theories into classroom practices (Berry, 2011; Chen, Kettle, Klenowski, & May, 2013; Chen,
May, Klenowski, & Kettle, 2014) and “there were no concrete ways available to help them use
assessment for teaching and learning purposes” (Berry, 2011, p. 54). Moreover, while Chinese EFL
classrooms have traditionally been dominated by SA in the form of tests and exams (Carless, 2011;
Cheng, 2008; Han & Yang, 2001), few studies so far have paid attention to both FA and SA
simultaneously as well as how everyday FA practices are related to final summative grading. To
address such research gaps, this study made an attempt to explore how an experienced EFL teacher
assessed her students in her oral English course to generate recommendations for EFL professional
development initiatives on CA practices in China.

Literature review
Despite differences in terminology, CA is generally viewed as a process consisting of three
basic steps: collecting information about student learning, making a judgment about the information
collected, and then making use of that judgment (cf. Black & Wiliam, 2009; Cizek, 1997; Cowie
& Bell, 1999; Harlen, 2007; Leung, 2005; McMillan, 2007). It is more than a specific assessment
instrument because it takes place in actual time and space. For example, a test paper is just an
assessment instrument, and only when it is used to collect information about student learning and
then decisions are made about the students based on their test performances can this whole process
be regarded as a CA practice.
While the process of eliciting-interpreting-using is at the very core of CA, a teacher may conduct each
of the three steps in a variety of ways. For example, for the evidence-collecting step, a teacher may use
various methods from traditional multiple-choice tests to alternative methods like portfolios (e.g., Fox,
2008). For the judgment-making step, a teacher may interpret a student’s performance in relation to the
whole group who take the assessment (norm-referenced measurement) or against some prespecified
learning objectives (criterion-referenced measurement) (Glaser, 1963). Besides, a teacher may make the
judgment alone or involve students in doing self-/peer-assessment (Harlen, 2007). For the judgment-
using step, a teacher may use the judgment for summative or formative purposes (Cizek, 2010).
It is notable that CA is a complex phenomenon with various components and variables.
Furthermore, researchers have also found that CA practices vary along the time dimension. Some
CA practices may last just a few seconds while others a few months or even a whole academic year
(Wiliam & Thompson, 2008). Some CA practices tend to occur during the course of a learning
program while others at the end of the program (Cizek, 2010). Therefore, Harlen (2007) suggested
that an investigation of CA practices should not only include descriptions of components and
variables within specific CA practices but also the relationship of different CA practices over time.
So far, in the Chinese EFL context, the great majority of published articles on CA have been
discussion papers (Wu, 2008). For the limited number of empirical research papers, some have
adopted the action research approach to investigate a specific EFL course to examine the effects of an
assessment system the researchers designed for their own courses on student motivation, self-
regulated learning, and academic achievement (Cao, Zhang, & Zhou, 2004; Zhou & Qin, 2005).
Only a few articles have touched on specific aspects of EFL teachers’ CA practices in China.
Cheng and her colleagues have published a series of articles (Cheng, Rogers, & Hu, 2004; Cheng,
Rogers, & Wang, 2008; Cheng & Wang, 2007) to reveal the characteristics of CA practices among
three groups of tertiary-level ESL/EFL teachers: teachers from mainland China, Hong Kong, and
Canada. Through analyzing data from questionnaire surveys and interviews, the researchers com-
pared the three groups of teachers’ CA practices in terms of assessment methods, assessment purposes, grading
and reporting, and follow-up strategies. Major findings concerning the EFL teachers from mainland
China include the following points:

● They made more use of selection items (e.g., multiple-choice and true/false items) and of student
performance assessment methods (e.g., dialogues, oral discussions, or retelling a story), but much
less use of peer assessment, student journals, or student portfolios (Cheng et al., 2004; Cheng
et al., 2008).
● Their major purposes of conducting CA were to help with their instruction and make students
work harder rather than understand their students’ progress better (Cheng et al., 2004).
● They preferred holistic scoring to analytical scoring or rubric scoring due to teacher beliefs and
big classes (Cheng & Wang, 2007).
● They tended to provide feedback to the whole class instead of to individual students, again due
to big classes (Cheng & Wang, 2007).

While Cheng and her colleagues’ studies were valuable in presenting a comprehensive picture about
the characteristics of Chinese EFL teachers’ CA practices, their results mainly depended on teachers’
self-report data. Therefore, it has been pointed out that future studies of assessment practices should
include “both teachers’ perceptions and their actions in the classroom” (Cheng & Wang, 2007, p. 103).
Through an interview study, Chen and her co-researchers investigated how two universities in
China—an urban-based Key University and a regional-based Non-Key University—interpreted and
implemented a national policy on FA in College English teaching (Chen et al., 2013, 2014). They found
that both universities had institutional policies that interpreted FA as process assessment that consisted
of students’ class participation, class attendance, and students’ assignments (sometimes quizzes as
well), with each aspect contributing a certain percentage to the final grading, although the specific
percentages were different between the two universities (Chen et al., 2013). Such an understanding is not a
true reflection of the FA principles (ARG, 2002), because only marked assignments were counted
as FA while instruction-embedded FA practices (Wiliam & Thompson, 2008) went unrecognized by
the two teachers studied, one from each university (Chen et al., 2014). Besides, in this understanding FA was also
used as a tool to discipline and motivate students, a function not mentioned in the ARG (2002) definition.
For the implementation of such policies, classroom observations and interview data showed that
both teachers exercised some level of autonomy in conducting their own
assessment practices; however, they differed in the focus of feedback, the extent of engaging students
in self-/peer-assessment, and reactions to the testing culture. Moreover, students in both universities
were not actively involved in self-assessment and peer-assessment and were skeptical about the
usefulness of peer-assessment (Chen et al., 2014).
While the studies by Chen et al. presented a valuable picture of how EFL teachers in non-English-major
programs[1] assessed their students in class, so far little is known about this issue in English-major
programs,[2] which are also an important EFL learning context in China. In addition, it should be noted that
few studies have examined the relationship between a teacher’s FA and SA practices in actual classrooms.
Therefore, this study made an attempt to explore how an English-major teacher assessed her students in her
language classroom, including how her FA and SA practices were related over one semester. Moreover, for
pedagogical purposes, this study also explored why the teacher had assessed her students the way she did
and how her students perceived the impacts of her CA practices on them.

Methodology
Stiggins and Conklin (1992) found that a typical teacher spends as much as one-third to one-half of
class time on assessment-related activities. While SA practices tend to be formal and stand-alone, FA
practices can be highly embedded within classroom activities (Wiliam & Thompson, 2008). To
capture the complexities and dynamics of a teacher’s CA practices in naturalistic classroom settings
over time, the present study adopted the case study approach for its advantages in allowing a
contextualized, in-depth, and holistic understanding of a complex issue (Dörnyei, 2007; Yin, 2009).
[1] Non-English-major programs offer English teaching to university students who have their own majors, such as Mathematics, Physics, Law, or Medicine. Students are expected to possess minimum English ability so that their future development in their major field will not be hindered.
[2] English-major programs offer English teaching to university students who choose English as their major and try to perfect their English skills and enrich their knowledge about English language and culture during their university life. They are expected to become advanced or even native-like English users upon graduation.

An experienced teacher, Linda,[3] from a top foreign language university in China was chosen to
participate in this study. Her department is generally regarded as one of the best English-major
programs in China because it can recruit top-level high school graduates who are interested in
learning English and its graduates usually possess very good English proficiency and are very
competitive in an English-related job market or further study. Linda had a strong educational
background and was very experienced in EFL teaching. At the time of the study, she was doing a
PhD degree in Applied Linguistics, with a special focus on Second Language Acquisition. She once
took a course on language testing and had frequently worked as a test designer or
oral examiner for standardized tests. Though she could not provide an accurate definition of
FA, she was well versed in the validity and reliability of tests and in the washback effects of tests on
students. With more than 20 years of teaching experience, she had always been regarded as one of
the most-liked teachers at her department. Therefore, she was chosen with the expectation that she
might serve as a role model for other teachers at other English-major programs, and her way of
assessing her students might demonstrate what an experienced EFL teacher can do, which might
offer valuable implications for future professional development on CA in China.
This study was conducted in 2011 in an EFL speaking course where Linda was teaching Public Speaking
to 25 first-year English-major undergraduates. During this course Linda met with her students twice a
week: one hour on Monday morning and two hours on Thursday morning. Consent to participate in this
study was obtained from Linda prior to the commencement of the semester, and consent from the students
was obtained at the end of the first lesson before the start of the data collection process.
Data were collected over one semester using the following procedures. At the beginning of the
semester, a teacher interview and a student questionnaire survey were conducted to obtain baseline
information about the participants. During the semester, four weeks’ lessons (Weeks 1, 5, 9, and 13)
were observed and audio-recorded. A “non-participant” stance was adopted during the classroom
observation (Dörnyei, 2007, p. 177). Seven volunteer students were invited to write journals after
each observed lesson to find out how they had been engaged in the CA practices conducted during
the observed lessons and their perceived impacts of such practices on themselves. In addition, after
the Week 5 lesson, a stimulated retrospective interview[4] was conducted with Linda to capture her
thought processes involved in carrying out the CA practices during that week’s lesson. Each
volunteer student also had a stimulated retrospective interview after one observed lesson to gain
richer data about students’ engagement and perceptions of the identified CA practices. At the end of
the semester, the final oral test was observed; a teacher interview was carried out to find out how
Linda summarized and reflected on her overall teaching experiences, with special attention given to
her CA practices; and volunteer students were also interviewed to find out their overall impressions
about the course, their perceptions of improvement and the contributing factors, and their
understanding and feelings about the CA practices in the course.
Throughout the semester, relevant documents, such as the National Curriculum for English
Majors, the university/department guidelines regarding assessment of student learning, the textbooks
used, and the teacher’s lesson plans and marking schemes were also collected, not only to situate the
present study in context but also to supplement other methods of data collection in this study.
The data analysis process was carried out in three stages. First, to identify Linda’s CA practices,
relevant documents and her interview data were analyzed to locate those CA practices specified in
the documents or regarded as assessment by Linda. A discourse analysis of the transcripts of the
audio-recorded lessons was then conducted to identify the segments containing evidence of the three
assessment steps, even if they were not regarded as CA practices by Linda. These segments usually
have clear boundaries determined by their content as well as discourse markers, such as “Now” and
“OK,” which typically marked the beginning or end of a new topic (cf. Excerpt 1). To enhance the
validity of this step, transcripts of one hour of classroom recording were peer-coded by a Chinese
colleague who has a PhD in Applied Linguistics with a special interest in classroom assessment
and language testing. All disagreements were discussed in detail until an agreement was
reached.
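To make the segmentation step concrete, the following sketch shows one way such boundary detection could be mechanized. It is purely illustrative: the study segmented transcripts manually through discourse analysis, and the turn format, marker list, and function names here are assumptions for illustration only.

```python
# Illustrative sketch only: the study segmented transcripts manually.
# Turns are (speaker, utterance) pairs; a new episode is assumed to start
# whenever a teacher turn opens with a boundary marker such as "Now" or "OK".

BOUNDARY_MARKERS = ("now", "ok", "okay")  # assumed marker list

def segment_episodes(turns):
    """Split (speaker, utterance) turns into episodes at teacher turns
    that open with a boundary marker."""
    episodes, current = [], []
    for speaker, utterance in turns:
        starts_new_topic = (
            speaker == "T" and utterance.lower().startswith(BOUNDARY_MARKERS)
        )
        if starts_new_topic and current:
            episodes.append(current)
            current = []
        current.append((speaker, utterance))
    if current:
        episodes.append(current)
    return episodes

transcript = [
    ("T", "Now let's spend a little time checking on your understanding."),
    ("SS", "Objects."),
    ("T", "The first is objects. And then?"),
    ("T", "OK, let's watch a sample speech and evaluate it."),
]
for i, episode in enumerate(segment_episodes(transcript), start=1):
    print(f"Episode {i}: {len(episode)} turn(s)")
# Episode 1: 3 turn(s)
# Episode 2: 1 turn(s)
```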
The identified CA practices were then analyzed from two perspectives: cross-sectionally and
longitudinally. The cross-sectional analysis started with a preliminary list of coding categories derived from
existing relevant literature that focused on the why, what, and how of an assessment practice. The data
were examined many times to develop an exhaustive and mutually exclusive list of specific codes for each
category. Where uncertainty arose, detailed discussions were held with an experienced researcher to
ensure that the codes were faithful representations of the data.
In addition to the cross-sectional description of the identified CA practices, attention was also
paid to the time the identified CA practices occurred, how deeply they were embedded in
instruction, and how different CA practices were related to each other. Then, an analysis to identify what
ran through and what changed among the different types of CA practices during the semester was
undertaken to reveal the relationship between FA and SA in this course.


Finally, the interview and journal data were analyzed through content analysis (Dörnyei, 2007) to reveal
Linda’s beliefs about CA, her explanations of her CA practices, and the students’ perceptions of the
identified CA practices.

[3] All participants’ names in this article are pseudonyms.
[4] Originally, the researcher conducted stimulated recall (SR) (Gass & Mackey, 2000) but found it too time-consuming to replay the recording for each recorded session. Therefore, after each observed session, the researcher summarized all the activities or episodes that contained CA practices and mainly used these summaries as cues during the interviews, although important episodes were still replayed for the interviewees. Because this practice was not SR in a strict sense, it was renamed stimulated retrospective interviews to avoid confusion.

Findings
Context
At the time of this study, Linda was teaching the Public Speaking course for the third time to one of
eight parallel classes at her department. The course used to be called Oral English (2) and focused on
conversational English, but three years before the study her department had begun a curriculum reform
that crystallized the focus of each course and renamed courses accordingly, in response to
changed expectations from society as well as changes in students’ characteristics and needs.
“The traditional conversational English was not challenging enough because students were only
asked to do role plays and simple conversations” (Linda_BI[5]). As the coordinator of this course,
Linda wrote the course description, including course objectives, syllabus, requirements, class orga-
nization, and ways to assess students. This description was brought to the teaching group for
discussion, and agreement was reached before the start of the semester.

[5] In this article, the following short forms indicate the source of a piece of data: BI: baseline interview; COF28042011: classroom observation field notes collected on April 28, 2011; SRI: stimulated retrospective interview; EoSI: end-of-semester interview; CRT31032011: classroom recording transcript collected on March 31, 2011; J3: journal three.
The Public Speaking course was a 16-week course offered to freshmen in their second semester.
According to the course description, this course was “to cultivate the students’ ability to speak effectively
in public, with a clear sense of purpose, resourceful thinking, and confidence to express ideas.” In the
baseline interview Linda also hoped that in addition to developing students’ public speaking skills, this
course “might also help improve their language quality” (Linda_BI), but she was not confident about that.
Two textbooks were used for this course. The course syllabus was basically arranged around the
public speaking skills explained in the book The Art of Public Speaking (Lucas, 2010), and the old oral
English textbook Contemporary College English: Oral English (2) (Yang, 2005) was used as a “resource
book” from which students could “get some ideas and some language input” (Linda_BI). Not all the
chapters or units from the two books were covered because of the limited time of the semester. In this
respect Linda made sure to include those parts that were most relevant to students’ needs.
Her department required all teachers to follow these guidelines when assessing
students:
● The final test score should account for no more than 40% of the final composite score.
● Teachers should keep a record of students’ class performance and attendance, which, together
with other assignments and/or quizzes, should also contribute to the final composite score.
● Less than one-fourth of a class should get a score above 90.

Her class comprised 25 students (8 males and 17 females), all of whom were new students to her. Ten
students graduated from foreign language (FL) high schools where they had received systematic training in
English, especially in oral communication. They had been exempted from the College Entrance
Examination because they had performed well in an English proficiency test organized by the university
itself in advance. The other 15 students all had very high scores in the College Entrance Examination,
especially in the subject of English, but their oral communication abilities varied, depending on their past
learning experiences. Information about the seven volunteer students can be found in Table 1.

Table 1. Volunteer students’ background information.

Student Name   Gender   En-learning Starting Age   High School Type   Going-abroad Experience   Oral English Ability
Lewis          Male     9–12                       Non-FL             No                        Bottom
Henry          Male     9–12                       FL                 No                        Average
William        Male     Before 8                   FL                 2-week tour abroad        Top
David          Male     Before 8                   FL                 1-month tour abroad       Top
Lily           Female   9–12                       FL                 No                        Top
Karen          Female   Before 8                   Non-FL             1-month tour abroad       Average
Helen          Female   Before 8                   FL                 No                        Top

Assessment structure
According to the course description, students’ final composite scores for this course consisted of four
parts: three prepared-speech assignments (20% each), a final exam (20%), students’ self-assessment
(10%) and peer-assessment (5%), and class attendance and interaction (5%) (Figure 1). The self-
assessment part and the peer-assessment part were attached to the prepared-speech assignments,
because students were required to give peer evaluation and feedback after each classmate’s speech,
and each student had to write a self-critique after each of his or her own three speeches. This
assessment structure was essentially the same as those found in the study by Chen et al. (2013),
which shows the hodgepodge nature of a student’s final composite score. The final score reflects not
only the student’s learning but also his or her learning attitudes and behavior.

Figure 1. Linda’s assessment structure.
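As a concrete illustration of this structure, the following sketch computes a final composite score from the weights in Figure 1 and checks them against the departmental guidelines quoted earlier. It is a hypothetical sketch, not part of the study: the component scores are invented, and the 0–100 scale is an assumption.

```python
# Illustrative only: weights taken from Linda's course description (Figure 1);
# the student scores below are invented for the example.

WEIGHTS = {
    "speech_1": 0.20, "speech_2": 0.20, "speech_3": 0.20,
    "final_exam": 0.20,
    "self_assessment": 0.10, "peer_assessment": 0.05,
    "attendance_interaction": 0.05,
}

assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights cover 100% of the grade
assert WEIGHTS["final_exam"] <= 0.40            # departmental cap on the final test

def composite(scores):
    """Weighted final composite on a 0-100 scale."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# One hypothetical student (all components on a 0-100 scale):
scores = {"speech_1": 85, "speech_2": 88, "speech_3": 92,
          "final_exam": 86, "self_assessment": 100,
          "peer_assessment": 100, "attendance_interaction": 100}
print(round(composite(scores), 1))  # 90.2
```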



However, Linda said she mainly used the prepared-speech assignments, conducted during the second
half of the semester, and the final exam to assess her students in this course. This arrangement was based
on her reflection of her previous round of teaching as well as her own understanding about the functions
of assessment. During the previous year, each student had to give five prepared speeches, once every
three weeks, spread out through the whole semester. All of the five assignments were marked. For the
final test, students were given three speech topics to prepare beforehand and were tested on one of them
chosen by the examiners. Because “the course was quite rushed” during the previous year, Linda decided
to “relax a little bit” this time (Linda_BI). Moreover, Linda believed that the functions of assessment are
“to monitor students’ progress,” “make teaching more systematic,” and let students “know where they
have made bigger progress and where less progress” (Linda_BI). So she had a strong opinion that it is
“totally unfair” to assess such productive skills like speaking and writing only at the end of the semester;
such skills should be assessed systematically throughout a whole semester (Linda_BI).
For the other components of the assessment structure, Linda said she used them mainly for
washback purposes rather than for assessment purposes, because they were designed to motivate
positive learning attitudes and behavior or to enhance student learning rather than to assess their
learning, as can be seen from the following quotes.
Linda thought that to assign 5% for class attendance was “to get across the message that it is a
required course and class attendance is mandatory” (Linda_EoSI). Although this 5% also included
“interaction,” she did not really look at it but intended to tell students that
This is a learning community, in which they need to communicate and learn from each other. They can’t just bring
their ears to the class. We want them to participate in class discussion and really listen and give feedback to their
classmates. And we want them to interact with their peers not only with the class teacher (Linda_EoSI).

For peer evaluation, Linda “didn’t really evaluate the quality of their comments as long as they
participated” (Linda_EoSI). Instead, she wanted it to achieve the following purposes:
One is to ensure that the listeners will provide the speaker with immediate feedback, and second is to help students
consolidate their own understanding of public speaking skills and what is a good speech, and also to provide a chance
for them to learn from each other. And probably it can also be used to foster their critical thinking (Linda_EoSI).

For self-critique, similarly, Linda “didn’t really give them a score for their critique journals” but “just want
to give them a chance to reflect on their own performances. As long as they did that, it would be OK”
(Linda_EoSI).
The reason Linda did not actually grade the quality of student performances on the above-
mentioned aspects was found to be twofold. On the one hand, she was fully aware of the power of
grading for Chinese students. “Particularly in China, the students are kind of conditioned to think
that scores are very important and they would work for the scores. So if you say you are going to get
a score for this, probably they are going to take it more seriously than something when you say no
it’s not going to be scored” (Linda_EoSI). However, she was also concerned about the negative
impacts of grading on students. Linda actually “hated” to score students. She reported:
I think what I care most is if they learn; I don’t care what kind of scores they get, because each person might
start from different levels. So for one student who was not that proficient, maybe this person has made a lot of
progress, but if you use the same standard to judge all the students, this student would still just get an average
score. But actually to this person, he has made a lot of achievement already. And I don’t want them to feel
discouraged when they see their scores (Linda_EoSI).

Obviously, she had found her own way to both motivate the students to work harder and to
minimize the potential negative effects of grading on students.
In addition to the assessment components specified in Figure 1, classroom observation
revealed some unrecognized CA practices that occurred within some of Linda’s classroom
activities, though Linda did not regard them as assessment. In what follows, the profiles of the
prepared-speech tasks and the final exam will be presented before the description of those
unrecognized CA practices.

The prepared-speech assignments


Linda prepared her students for the three assignments both at the beginning of the semester through
her description of the course and one week before each assignment when she specified in class the
marking criteria, how to use PowerPoint presentation, and the sequence in which the students would
deliver their speeches. When the scheduled time came to conduct each assignment, the students took
turns to deliver their speeches, each of which was followed by immediate teacher- and peer-feedback.
After the completion of each assignment, every student was required to submit a written self-critique
before receiving feedback from Linda, which included a score and written comments.
Analysis showed that the three assignments served dual purposes. On the one hand, they were
summative in nature due to the following reasons. First, they were administered during the latter half
of the semester, each at the end of a learning unit, to assess what the students had learned to that
point. Second, the written instruction for each assignment specified that the tasks were an oppor-
tunity for the students to “apply the principles of speech organization, delivery, and persuasion …
covered in your readings and/or class lectures to date,” thus implying that the tasks were to assess
students’ achievement. Moreover, each assignment accounted for 20% of the final grading. All of
these factors showed that they served summative purposes.
On the other hand, the assignments also served formative purposes, because integrated within them
were many learning-enhancing opportunities. First, the students were informed of the three assignments
at the beginning of the semester through Linda’s introduction to the course (Linda_COF28022011). In
turn, the tasks served as learning objectives for the students to achieve. Moreover, attached to each
assignment was the requirement for the students to conduct an in-class peer evaluation as well as write a
self-critique after class. The multiple feedback sources (i.e., peer evaluation, student self-critique, and
teacher feedback) provided students with many opportunities to internalize the features of good speeches
and to understand their own strengths and weaknesses. Finally, the three speeches were sequenced in a
way that allowed for feedback from the first assignment to guide preparation for the second assignment
and feedback from the second assignment to guide preparation for the third assignment. Thus, the
assignment cycle took the form of a spiral: preparation 1 → assessment 1 → feedback 1 → preparation 2 →
assessment 2 → feedback 2 → … → feedback 3. In this way the three assignments helped to push the
students toward a progressively higher level of speech making.
The emergence of this dual-purpose CA category from the data corroborated previous research
findings that one assessment procedure can serve both formative and summative purposes (Carless,
2011; Rea-Dickins, 2001, 2006; Rea-Dickins & Gardner, 2000). What was special about Linda’s
assignments was that they were linked, with the later ones building on the earlier ones.
Classroom observations showed that all students prepared for their speeches carefully and were active
in providing feedback to their classmates. Data from the volunteer students showed that they found this
set of assessment tasks highly beneficial for learning and conducive to boosting their confidence.
For instance, Lewis, whose oral English skills were comparatively poor, felt his first speech was “a
total failure” because he “was just standing there reading the speech.” He remarked: “For my next
speech, I should pay more attention to this, I mean, not to read the speech, but to look at my notes
and then look at my audience” (Lewis_SRI).
Subsequently, during his second speech, he made some eye contact with the audience
(Linda_COF28042011). During the feedback session, Linda commented in a more positive way:
“Well, I felt he had made greater effort this time to maintain eye contact. You remember last time he
was (Linda acted the way he delivered his first speech), always looking down. Yes, he tried to
maintain eye contact this time” (Linda_CRT28042011).
During his third speech, Lewis talked in a more confident and natural manner
(Linda_COF26052011). He wrote in his journal that “I have made some progress this time compared
with the last time. At least I was more confident and paid attention to making eye contact with the
audience” (Lewis_J4). It is evident that Lewis felt he had made progress through this series of
assignments.

Helen, a top student in the class, found that she learned a lot not only from teacher- and peer-
feedback on her own speeches but also from observing others’ speeches and evaluating others’
performances. She commented:
I learned a lot from all of them, ranging from their way of delivering, to the content of their speeches, to the power
and confidence hidden inside their speeches. I tried my best to give evaluations and comments to them on a little
piece of paper, because I think that when commenting on their behavior I was also doing some introspective self-
evaluation. I attempted to learn their advantages and try to avoid the mistakes they made (Helen_J3).

William, a top student from a foreign language middle school, completed every self-critique
carefully (William_SRI, William_EoSI). Although he knew his self-critiques were not marked and
had no word limit, he took each one seriously because he
liked to listen to the recordings of his own speeches so that he could examine his performances in
a more “objective” way.
Overall, the volunteer students all thought the three assignments very helpful owing to the whole
process: “preparing for” them, including “doing research, writing up the speech, and rehearsing
again and again” (Henry_EoSI); the “inspiring and encouraging feedback” after delivering the speech
(David_EoSI); and reflecting on the quality and delivery through self-critique and Linda’s feedback
(William_EoSI). As such, the positive comments from students demonstrated the combined power
of the inherent washback effect of SA on students and the usefulness of FA for student learning.

The final test


For the final test, students had to give an impromptu speech on a given topic rather than a prepared
speech. The topic was given to students only five minutes before their test. Linda explained that the
previous practice of asking students to prepare for three speeches turned out to be “a heavy burden”
on students. In addition, she noticed that some students would “recite” a speech during the test,
which would affect the validity of the final test (Linda_EoSI). Moreover, although impromptu
speeches tended to be more demanding, students had been practicing this kind of speech during
the semester, so it was fair to require students to give an impromptu speech at the final test.
This test served summative purposes solely, because a student’s performance on this test contributed
20% to his or her final grading. Besides, Linda would not teach this class the following
semester, nor would she inform the students of their final test scores or provide them with any
feedback concerning their test performances (Linda_EoSI).
Linda wanted the final test to be another exercise for the students instead of a heavy burden for
them (Linda_EoSI). Data showed that the students did regard the test as another marked exercise
and did not feel particularly pressured (William_EoSI, David_EoSI, Lily_EoSI, Henry_EoSI). They all
undertook a little practice prior to the final test to “put them in the mood of doing this kind of
speech” (Lily_EoSI), but they were not particularly worried about their final test results. As Lily
commented: “Everybody should get a fairly good score” because “we have already got a large
percentage of the final composite score based on our daily classroom performances and the final
test score only accounted for a small percentage” (Lily_EoSI).

Linda’s unrecognized CA practices


Of the 11 hours of observed lessons, four were devoted to the prepared-speech tasks. In the
remaining seven hours, this study found that more than 60% of Linda’s class time was devoted to
assessment-related activities (those marked with a * in Table 2), which she referred to as
class exercises and feedback.

Table 2. Overview of Linda’s classroom activities.

Classroom Activity                              Frequency of Occurrence   Percentage of Total Observed Class Time
*Student making a speech                        5                         37%
Teacher lecturing                               3                         14%
*Q&A                                            4                         11%
*Teacher-guided evaluation of a sample speech   3                         10%
Student news report                             4                         8%
Student conducting an interview                 1                         6%
Game                                            1                         6%
Linda’s instruction-embedded assessment practices were evident in three types of classroom
activities. First, she often conducted a Q&A session before starting a new chapter, especially during
the first half of the semester. Linda said the sessions were to “check if they [the students] had read
the chapter and what they had got from the chapter, and also to find out if there was anything
important they had overlooked” (Linda_SRI). Therefore, the sessions were formative in nature.
These sessions (e.g., Excerpt 1) usually took the traditional initiation-response-feedback (IRF)
sequence (Sinclair & Coulthard, 1975). In most cases Linda’s questions were prepared beforehand, as
they appeared on her PowerPoint slides, which reflects the planned nature of such FA
segments. There were also 10 instances where Linda rephrased a previous question based on her
students’ responses (e.g., turn 11 in Excerpt 1), demonstrating the incidental nature of such
assessment episodes.
Excerpt 1.
(Columns: Turn | Speaker | Transcript | Turn type | Assessment type)
1 T […] Now let’s spend a little time checking on your understanding of the main points Initiation Planned FA
from Chapter 13: Speaking to Inform. Now, very quickly, what are the four types of
informative speech discussed in the chapter? What are they? Four types?
Remember?
2 SS Objects. Response
3 T The first is objects. Objects can include—Ok, first of all, objects. And then? Feedback
[…] ((Students get all four types correct.))
9 T Okay then, why must the speaker not overestimate what the audience already know Initiation
about the topic and what you can do to make sure that your ideas don’t pass over
the heads of your listeners?
10 SS ((No response. Some students look through their textbooks.))
11 T Then what are the suggestions given in the book to prevent this from happening, Initiation Incidental
that is, passing over the heads of the listeners? What can you do? FA
[…] ((Four students provide their answers which Linda acknowledges.))
20 S5 When you put forward a technical term, you can give your listeners an explanation Response
and, if necessary, you can use body language to show the meaning.
21 T Exactly, yes. Remember to explain, particularly when you are preparing. You can Feedback
anticipate that probably at this point I maybe need to provide some explanation.
This is important. Don’t assume that it’s so easy, everyone should know, because
they may not know.

The second type of instruction-embedded assessment activity—the teacher-guided speech-evaluation
activities—was usually conducted immediately after a Q&A session. During this procedure
Linda wanted the students to “use the knowledge and skills they had just learned to evaluate a
speech” and to “see the real power of the speech” (Linda_SRI). In addition, each speech-evaluation
activity also took the form of an IRF sequence, similar to those in the Q&A sessions. While 10 IRF
sequences contained questions specified on Linda’s PowerPoint slides, indicating they were planned
in advance, four incidental segments also emerged when Linda tried to reinforce an idea based on
the students’ actual responses (e.g., Excerpt 2).

Excerpt 2.
(Columns: Turn | Speaker | Transcript | Turn type | Assessment type)
… … […] ((After playing a sample speech twice, Linda organizes a class discussion on the
theme of the speech and ways to make the speech effective. What follows is the last
part of the discussion when they talked about a strategy used by the speaker.))
14 S5 And she is humorous. She told jokes about Chinese football [women’s football team]. Response
15 T Yes, I want to say a little bit more on that. Remember, I heard some responses or Initiation Incidental
noises from you when she said: ‘now you can understand why our women’s football FA
team does so well’. Yeah, I heard some responses from you, which reflects what you
were thinking when hearing that. Can you explain to me what the noise you made
means? Were you impressed by it or what?
16 S6 Maybe the women’s football team was not playing that well. Response
17 S7 Well, I think now the women’s team is not so good, but in 2001 they were a much
stronger team.
18 S8 […] ((Another student gives a similar opinion.))
19 T Yeah. Right. Viewing the speech now, because the women’s football team is no Feedback
longer doing so well, we get different feelings when we hear people mention it
today. The thing I want to say is that: it can be a technique in a speech to relate to
the audience by using a specific example, but you’ve got to do it right. If it is
awkwardly done, it would have opposite effects. So you’ve got to be careful.
In this episode Linda noticed that one student made an important observation (turn 14), so
she prompted the class to reflect on the issue (turn 15), which was in effect scaffolding the students
to further reflect on this aspect. After evaluating three students’ opinions (turns 16–18), Linda
confirmed the responses from S7 and S8 and provided further explanation, pointing out the
importance of relating a chosen example to the audience (turn 19). Clearly, Linda’s scaffolding
question (turn 15) and her confirmative and explanatory feedback (turn 19) were based on her
assessment of the students’ actual performances and were therefore incidental in nature.
Linda also conducted many speech-making activities, often near the end of a lesson, which were found
to be CA practices as well. This is because this group of activities followed the same pattern: Linda
assigned the students a speech-making task, students practiced individually or in pairs/groups, and then
Linda invited a few students to give their speeches before the whole class, after each of which there was
often peer- and teacher-feedback. In Linda’s eyes, such activities were not assessment practices but
classroom exercises, whose major purpose was to promote student learning (Linda_EoSI).
Although these speech-making activities were preplanned (given that the instructions appeared
on Linda’s PowerPoint slides), eight incidental assessment episodes were also identified within these
activities. In these instances Linda provided scaffolding questions to guide the students to see their
own strengths and weaknesses and to identify ways to improve (e.g., Excerpt 3).
Excerpt 3.
(Columns: Turn | Speaker | Transcript | Turn type | Assessment type)
((Linda asks students to work in groups of three and interview their group
members on their attitudes towards fake and shoddy goods. Then she asks
students to give an impromptu speech based on the interview data they have
collected.))
1 S1 ((One student delivers his speech to the whole class.))
2 T Um, that’s a report of the answers you got from your interviewees, right? If we are Feedback & Incidental
going to summarize, can we summarize from their answers? Never mind the initiation FA
number of interviewees because we are doing it in class and you only had two, so
never mind that. Can we summarize in one sentence the attitude of your
interviewees?
3 S1 Yes. Response
4 T Like what? Initiation
5 S1 Maybe not all fake goods should be rejected and some of them are practical. Response
6 T Okay, okay. Then that can be one of your points if you are making a speech, right? Feedback
So that’s why we do research. We get a lot of information, answers from people,
but then we need to summarize it. Okay. ((She indicates another student to give
her speech to the class.))

The above episode was taken from the Thursday lesson of Week 5. Following the delivery of
the first speech (turn 1), Linda recognized a mismatch between her expectation and the
student’s performance (turn 2), which was confirmed in her interview (Linda_SRI). Instead of
telling the student directly that he was wrong, Linda asked the student to synthesize the main
idea of his interview data into one sentence (turns 2 and 4). When she recognized that the
student was able to accomplish this task (turn 5), she further pointed out that this summary
sentence could serve as one key point in his speech (turn 6), thus informing the student how to
improve.
The importance of synthesizing the interviewees’ ideas in the speech was re-emphasized following
the delivery of the second speech, primarily because the student made a similar mistake. The third
student delivered a better speech, most likely due to Linda’s repeated emphasis, and Linda therefore
provided more positive comments afterward:
Um, we can see some summary work done in this report, yeah? It’s not like this is the question, this is the
answer. She did a summary job. And also, it was good that before she ended the speech she kind of restated
it… A conclusive remark is that my interviewee would refuse to buy for two reasons. That was also quite clear.
Good! (Linda_CRT31032011).

Clearly, incidental assessment was involved in this episode, as reflected in the adjustment Linda
made on the basis of the students’ performances.
Of the three types of assessment-involving classroom activities/episodes, the volunteer students
found the speech-making activities, together with the follow-up peer-feedback and teacher-feedback,
most useful. The data included only six comments in total on the Q&A sessions and the
speech-evaluation activities but 17 comments on the speech-making activities. The
students felt that the speech-making activities not only helped them grasp some speech-making
strategies but also enhanced their critical thinking. For example, after the Thursday lesson during
Week 1, Henry reflected:
One of our classmates pointed out that William folded his arms at the outset of his speech which was beyond
my observation. This reminded me of once learning that ‘folding arms’ would send ‘unwelcome’ messages to
others so I bear in mind that such gesture must be excluded in my future speeches (Henry_J1).

Near the end of the semester, Helen realized that impromptu-speech-making activities in class were
very useful in improving her critical thinking. She wrote:
The practice of making impromptu speeches mainly helps us to better organize our thoughts and words in a
very short period of time. It also helps us to try to think things from different perspectives as much and
thoroughly as possible. I find that I could be more quick-witted than before, but I think I still need to practice
more by myself after class in order to be more capable to do a better job (Helen_J4).

Despite the positive comments from the volunteer students on Linda’s FA practices, the end-of-semester
interviews showed that the students felt such practices were not as helpful for their learning as the three
prepared-speech tasks. This was because they were “just classroom activities,” “not that formal,” and “not
graded” (Helen_EoSI, William_EoSI), and the students “did not have to prepare for them”
(Henry_EoSI). This finding reflects the powerful effect of grading on students in this EFL context.
Overall, the fact that over half of Linda’s class time was spent on assessment-involving activities
affirmed Stiggins and Conklin’s (1992) finding that a teacher can spend as much as one-third to one-
half of the class time involved in assessment-related activities. However, the fact that Linda did not
regard such activities as assessment also supported Black and Wiliam’s (1998) finding that teachers
are constantly engaged in CA during their classroom teaching, though sometimes they may not be
fully aware of that.

Relationship between FA and SA


It has been pointed out that for a curriculum to be effective there must be alignment between the
curriculum objectives, classroom instruction, and student assessment (English, 1992). This kind of
alignment is ideally achieved through turning the course learning objectives into assessment constructs
in CA practices (Harlen, 2007). An analysis of the textbook, course objectives, the marking
criteria for the three prepared-speech assignments and the final test, and the feedback sessions
during Linda’s FA practices (especially those attached to the speech-making activities) revealed a
high level of alignment in what was emphasized in the course throughout the semester (Table 3).

Table 3. Alignment between course objectives and assessment constructs.

Course Objective                      Assessment Construct                          When Assessed
(not a stated course objective)       Knowledge about public speaking               FA: Q&A
Ability to evaluate a speech          Ability to evaluate a speech                  FA: speech evaluation; dual-purpose CA:
                                                                                    peer evaluation, self-critique
Ability to deliver a public speech    Ideas (relevance, depth, logic, clarity,      FA: speech making; dual-purpose CA:
                                      credibility); discourse competence;           prepared speeches, peer evaluation,
                                      linguistic competence; paralinguistic         self-critique; SA: final test
                                      competence; rhetorical effectiveness;
                                      use of visual aids
Confidence in giving a speech         Confidence in giving a speech                 FA: speech making; dual-purpose CA:
                                                                                    prepared speeches, self-critique; SA: final test
It is evident from Table 3 that all of the course objectives were transformed into assessment
constructs and that almost all of them were assessed not only during Linda’s dual-purpose CA and
SA practices but also during some of her FA practices. Although “knowledge about public speaking”
was not listed as one of the course objectives, Linda emphasized this through her Q&A sessions
during the first half of the semester. While the Q&A sessions assessed students’ knowledge about the
key points and principles concerning public speaking, the other CA practices assessed students’
ability to apply this knowledge in practice. Thanks to the high level of consistency between the
course objectives and assessment constructs, Linda’s FA and SA practices were closely linked
through repeated emphasis on the same assessment constructs.
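One way to picture this alignment is as a mapping from each assessment construct to the practices that assessed it. The following sketch is illustrative only, with the mapping simplified from Table 3 rather than taken from the study; it checks that every construct was touched by at least one formative-bearing and, where applicable, one summative-bearing practice.

```python
# Illustrative data structure, simplified from Table 3; not part of the study.
PRACTICE_TYPE = {
    "Q&A": "FA", "speech evaluation": "FA", "speech-making": "FA",
    "prepared speeches": "dual", "peer evaluation": "dual", "self-critique": "dual",
    "final test": "SA",
}

ALIGNMENT = {  # construct -> practices that assessed it (simplified)
    "knowledge about public speaking": ["Q&A"],
    "ability to evaluate a speech": ["speech evaluation", "peer evaluation", "self-critique"],
    "speech delivery (ideas, discourse, language, delivery)":
        ["speech-making", "prepared speeches", "peer evaluation", "self-critique", "final test"],
    "confidence in giving a speech": ["speech-making", "prepared speeches", "final test"],
}

for construct, practices in ALIGNMENT.items():
    types = {PRACTICE_TYPE[p] for p in practices}
    formative = "FA" in types or "dual" in types   # dual-purpose CA counts both ways
    summative = "SA" in types or "dual" in types
    print(f"{construct}: formative={formative}, summative={summative}")
```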
More importantly, the way Linda’s CA practices were arranged throughout the semester enabled
her FA and SA practices to work together like an upward spiral that pushed students to achieve the
course learning goals. Specifically, her FA practices during her daily classroom teaching were like
small cycles from checking students’ knowledge about the key points and principles of effective
public speaking, to asking students to use this knowledge to evaluate sample speeches, and finally to
giving students tasks to apply this knowledge in the construction and delivery of their own speeches.
Such cycles were connected through the feedback sessions after speech-making activities that focused
on the key skills and strategies of effective speech making. As the students increased their knowledge
and improved their specific skills, they were presented with opportunities to practice delivering a
speech in a more formal way (the three prepared-speech tasks). As discussed before, the three
assignments required students to demonstrate what they had learned along the way, and the feed-
back provided for each student after each speech could be used to improve later performances.
Finally, the final test served as the summit of the course, giving students the most challenging task:
making an impromptu speech.
Therefore, Linda’s FA and SA practices together displayed a chained spiral pattern and, as such,
both the teaching and learning processes were progress oriented. Harlen (2007) proposed a model
for integrating FA and SA, and Linda’s CA practices offer a possible way to achieve effective synergy
between FA and SA.

Furthermore, the way she graded her students also facilitated the synergy of her FA and SA
practices. In this course 60% of the final grade was based on the three prepared-speech tasks, and
another 15% was derived from the student’s self-assessment and peer-assessment. In so doing Linda
took advantage of the SA function of such assessment practices as a motivation tool, especially in the
Chinese educational context, and the FA function as a teaching and learning tool. Consequently,
students were motivated and worked hard for the grades, but at the same time, their anxiety level
was minimized because each task only accounted for a small percentage of the final grading and they
had many opportunities to improve their performances to gain a better grade.

Conclusion
Notably, Linda’s teaching environment leaned toward the “low structure” end of the continuum
(Wette & Barkhuizen, 2009). On the one hand, she was not “obliged to follow a comprehensive, pre-
specified syllabus as well as a textbook and/or examination prescription” (Wette & Barkhuizen, 2009,
p. 198), and therefore she was far from the “high structure” end. On the other hand, hers was not a
context in which “the curriculum pre-specifications were minimal and flexible, allowing teachers and
learners to negotiate the curriculum” (p. 198), the fully “low structure” end. Nevertheless, her teaching
context was closer to the “low structure” end, because she had the freedom to modify the existing course
syllabus and assessment structure based on her teaching experience of the previous two rounds.
Moreover, the assessment guidelines from the department were not very restrictive, and Linda was
able to make full use of the power of grading on students to help them learn and improve within the
framework of the assessment guidelines.
Data analysis revealed that Linda engaged in a variety of CA practices throughout the semester. In addition to such recognized assessment practices as the final test (the SA practice)
and the prepared-speech tasks (the dual-purpose CA practices), she also conducted a large number
of unrecognized FA practices, which were embedded within her classroom activities. While most of
her CA practices were planned in advance, there were also incidental assessment episodes that were
contingent on students’ actual performances in class. Students found the dual-purpose CA practices
most engaging and beneficial.
More importantly, this study found a high level of synergy between Linda’s FA and SA practices, which worked together like an upward spiral that pushed students to achieve the course learning goals. This synergy was achieved in several ways: by transforming the course objectives into assessment constructs; by taking advantage of the grading process while minimizing its negative effects on students; by sequencing the various CA tasks so that they alternately offered students opportunities to learn and to apply what they had learned; and by emphasizing student involvement and student improvement in her assessments.
Linda’s CA practices provide a counterexample to the general view that Chinese EFL teachers seldom conduct peer assessment (Cheng et al., 2004, 2008) and provide little individualized feedback (Cheng & Wang, 2007); this is consistent with the case study findings of Chen et al. (2014). Moreover, Linda’s CA practices partially supported Cheng et al.’s (2004) finding that Chinese EFL teachers conduct CA mainly to support their instruction and to make students work harder, rather than to better understand their students’ progress. Linda did use her CA practices to motivate students to work harder, but at the same time she also used them to identify and guide her students’ progress.
Although Linda had not been systematically trained in FA, her CA practices demonstrated that she had implicit knowledge about FA and was flexible and capable enough to integrate FA and SA productively in her course. This study shows that experienced teachers’ expertise in CA, even when implicit, can be a valuable resource for teacher training, but it takes researchers to identify and describe such implicit knowledge and strategies. Therefore, more empirical work is needed in diverse institutional and curricular contexts to reveal the wisdom and expertise of experienced teachers’ CA practices, both implicit and explicit, which may help tackle the problem of EFL teachers’ translating
FA theories into classroom practice, as identified in previous studies (Berry, 2011; Chen et al., 2013, 2014).

Acknowledgment
I am sincerely grateful to my PhD supervisors, Prof. John Read and Dr. Rosemary Erlam of the University of Auckland, for their valuable advice during my research; to all the participants for their support and cooperation; and to all my Chinese colleagues, especially Prof. Yan Lin, for their generous help of many kinds.

References
ARG. (2002). Assessment for learning: 10 principles. Cambridge, UK: University of Cambridge, Assessment Reform Group.
Berry, R. (2011). Educational assessment in Mainland China, Hong Kong and Taiwan. In R. Berry, & B. Adamson
(Eds.), Assessment reform in education: Policy and practice (pp. 49–61). Hong Kong, China: Springer.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy &
Practice, 5(1), 7–74.
Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation
and Accountability, 1(1), 5–31.
Brookhart, S. M. (2004). Classroom assessment: Tensions and intersections in theory and practice. Teachers College
Record, 106(3), 429–458.
Cao, R., Zhang, W., & Zhou, Y. (2004). Implementation of formative assessment in an EFL writing course for Chinese
non-English-major undergraduates. Foreign Language Education, 25(5), 82–87.
Carless, D. (2011). From testing to productive student learning: Implementing formative assessment in Confucian-
heritage settings. New York, NY: Routledge.
Chen, Q., Kettle, M., Klenowski, V., & May, L. (2013). Interpretations of formative assessment in the teaching of
English at two Chinese universities: A sociocultural perspective. Assessment & Evaluation in Higher Education, 38(7),
831–846. doi:10.1080/02602938.2012.726963
Chen, Q., May, L., Klenowski, V., & Kettle, M. (2014). The enactment of formative assessment in English language
classrooms in two Chinese universities: Teacher and student responses. Assessment in Education: Principles, Policy
& Practice, 21(3), 271–285. doi:10.1080/0969594X.2013.790308
Cheng, L. (2008). The key to success: English language testing in China. Language Testing, 25(1), 15–38. doi:10.1177/
0265532207083743
Cheng, L., Rogers, T., & Hu, H. (2004). ESL/EFL instructors’ classroom assessment practices: Purposes, methods and
procedures. Language Testing, 21(3), 360–389. doi:10.1191/0265532204lt288oa
Cheng, L., Rogers, T., & Wang, X. (2008). Assessment purposes and procedures in ESL/EFL classrooms. Assessment &
Evaluation in Higher Education, 33(1), 9–32. doi:10.1080/02602930601122555
Cheng, L., & Wang, X. (2007). Grading, feedback, and reporting in ESL/EFL classrooms. Language Assessment
Quarterly, 4(1), 85–107. doi:10.1080/15434300701348409
Cizek, G. J. (1997). Learning, achievement, and assessment: Constructs at a crossroads. In G. D. Phye (Ed.), Handbook
of classroom assessment (pp. 1–32). New York, NY: Academic Press.
Cizek, G. J. (2010). An introduction to formative assessment: History, characteristics, and challenges. In H. L.
Andrade, & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 3–17). New York, NY: Routledge.
Cowie, B., & Bell, B. (1999). A model of formative assessment in science education. Assessment in Education, 6(1), 101–116.
Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford,
UK: Oxford University Press.
English, F. W. (1992). Deciding what to teach and test: Developing, aligning, and auditing the curriculum. Newbury
Park, CA: Corwin Press, Inc.
Fox, J. (2008). Alternative assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and
education (Vol. 7: Language testing and assessment, pp. 97–108). New York, NY: Springer Science+Business Media LLC.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research. Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Glaser, R. (1963). Instructional technology and the measurement of learning outcomes: Some questions. American
Psychologist, 18, 519–521.
Han, M., & Yang, X. (2001). Educational assessment in China: Lessons from history and future prospects. Assessment
in Education, 8(1), 5–10. doi:10.1080/09695940120033216
Harlen, W. (2007). Assessment of learning. Los Angeles, CA: SAGE.
Leung, C. (2005). Classroom teacher assessment of second language development: Construct as practice. In E. Hinkel
(Ed.), Handbook of research in second language teaching and learning (pp. 869–888). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Lucas, S. E. (2010). The art of public speaking (10th ed.). Beijing, China: Foreign Language Teaching and Research Press.
McMillan, J. H. (2007). Classroom assessment: Principles and practices for effective standards-based instruction (4th ed.).
Boston, MA: Pearson Education, Inc.
Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing,
18(4), 429–462. doi:10.1177/026553220101800407
Rea-Dickins, P. (2006). Currents and eddies in the discourse of assessment: A learning-focused interpretation.
International Journal of Applied Linguistics, 16(2), 163–188.
Rea-Dickins, P., & Gardner, S. (2000). Snares and silver bullets: Disentangling the construct of formative assessment.
Language Testing, 17(2), 215–243. doi:10.1177/026553220001700206
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M. Scriven (Eds.), Perspectives on
curriculum evaluation (pp. 39–83). Chicago, IL: Rand McNally.
Sinclair, J. M., & Coulthard, M. (1975). Towards an analysis of discourse. London, UK: Oxford University Press.
Stiggins, R. J., & Conklin, N. F. (1992). In teacher’s hands: Investigating the practices of classroom assessment. Albany,
NY: State University of New York.
The Higher Education Department of China Ministry of Education. (2007). College English Curriculum Requirements
[daxue yingyu kecheng yaoqiu] [in Chinese]. Shanghai, China: Shanghai Foreign Language Education Press.
Wette, R., & Barkhuizen, G. (2009). Teaching the book and educating the person: Challenges for university English
language teachers in China. Asia Pacific Journal of Education, 29(2), 195–212. doi:10.1080/02188790902857180
Wiliam, D., & Thompson, M. (2008). Integrating assessment with learning: What will it take to make it work? In C. A.
Dwyer (Ed.), The future of assessment: Shaping teaching and learning (pp. 53–84). New York, NY: Erlbaum.
Wu, X. (2008). A literature review on the application of formative evaluation to tertiary-level EFL teaching in China.
Foreign Language World (126), 91–96.
Yang, L. (Ed.). (2005). Contemporary College English: Oral English (2). Beijing, China: Foreign Language Teaching and
Research Press.
Yin, R. K. (2009). Case study research: Design and methods (4th ed.). Los Angeles, CA: SAGE.
Zhou, P., & Qin, X. (2005). Applying formative assessment in the Internet teaching of College English. Foreign
Language Audio-Visual Instruction (105), 9–13.
