System 84 (2019) 93–109

Contents lists available at ScienceDirect

System
journal homepage: www.elsevier.com/locate/system

Twenty-five years of research on oral and written corrective feedback in System
Shaofeng Li*, Alyssa Vuono
School of Teacher Education, Florida State University, Tallahassee, FL, 32306, USA

Article info

Article history:
Received 15 May 2019
Accepted 26 May 2019
Available online 4 June 2019

Keywords:
Corrective feedback
Written feedback
Second language acquisition
Form-focused instruction
Task-based language learning and teaching
Second language writing

Abstract

This article provides a comprehensive and critical review of the research on various aspects of oral and written corrective
feedback (CF) based on selected articles published in System over the past 25 years. The review starts with a comparison
between oral and written CF, demonstrating that despite the discrepancies in the characteristics and pedagogical practices of
the two types of CF, they have been examined from similar perspectives in the research. The striking similarity in the
research themes makes it possible to follow the same template in organizing the research synthesis for each CF type. The
synthesis for each CF type comprises three sections. Section 1 provides a taxonomy of the CF type in question and summarizes
the findings of descriptive or observational research regarding how teachers provide CF and how students react to CF.
Section 2 synthesizes the findings of experimental CF research regarding whether CF is effective in facilitating learning
gains and what factors constrain its effectiveness. Section 3 discusses the research on teachers’ and students’ beliefs about
the utility of CF and how it should be implemented in the classroom.
© 2019 Published by Elsevier Ltd.

Contents

1. Introduction
2. Oral corrective feedback
   2.1. Taxonomy and occurrence
   2.2. The effectiveness of oral CF
   2.3. Teacher and student beliefs and attitudes
3. Written corrective feedback
   3.1. Taxonomy and occurrence
   3.2. The effectiveness of written CF
   3.3. Teacher and student beliefs and attitudes
The Synopses of Selected Articles Published in System (*oral feedback; **written feedback)
References

* Corresponding author.
E-mail addresses: sli9@fsu.edu (S. Li), avuono@fsu.edu (A. Vuono).

https://doi.org/10.1016/j.system.2019.05.006
0346-251X/© 2019 Published by Elsevier Ltd.

1. Introduction

Corrective feedback (CF) refers to comments on the appropriateness or correctness of learners' production or compre-
hension of a second language. As one of the most vibrant streams of research in second language acquisition (SLA), CF has
spawned a voluminous body of research in the past two and a half decades. One piece of evidence for the popularity of CF
research can be found in Plonsky and Brown's (2015) review article that reported 18 meta-analyses that synthesized the
empirical studies on various aspects of this instructional device. The momentum of CF research commenced with the pub-
lication of two milestone articles: Lyster and Ranta (1997) and Truscott (1996), which concern oral and written CF respec-
tively. Lyster and Ranta's study provided a clear taxonomy of various corrective strategies teachers utilize in the classroom.
The researchers also introduced the concept of uptake (learners' responses after feedback) as a measure of learners'
engagement with feedback. Lyster and Ranta's taxonomy of oral CF and uptake has provided a convenient coding scheme and
reliable guidance for the empirical examination of CF occurrence and effectiveness. With regard to written CF, Truscott
expressed his strong objection to, and absolute dismissal of, the necessity and utility of written CF. Truscott's denial of written
CF motivated a surge of empirical research on the effects of written CF. In a sense, Truscott's article advanced written CF
research in a “provocative” way. However, as we will show in later sections, while Truscott's claim that written CF is inef-
fective has been proven wrong, some of his concerns have been confirmed in research.
System has stayed abreast of the development of CF research. The articles that have been published on CF in this journal
in the past 25 years accurately represent the developmental trajectory of CF research. The aim of this article is to synthesize
the various strands of CF research represented by selected articles published in System (see the appendix for an annotated
bibliography of selected articles). To conduct the synthesis, we searched every issue of the journal published since the two
milestone articles discussed above (Lyster & Ranta, 1997; Truscott, 1996) and retrieved 20 studies on oral CF and 32 on written
CF. Unlike previous reviews which focus on either oral CF (e.g., Li, 2018) or written CF (e.g., Ferris, 2004), we include both oral
and written CF in our review, with a view to providing a more holistic picture of the research on CF as a unified construct.
However, we also recognize that the two CF types have unique features and have been examined separately in the primary
research. Therefore, we will synthesize the research in separate sections. Despite their different attributes, the two types of CF
have been examined from similar perspectives in the research. The striking similarity in the research themes for the two CF
types allows us to follow the same template in organizing the two separate sections. In what follows, we make a comparison
between them, identifying the different attributes of the two CF types and the commonalities in the themes and questions
examined by researchers.
As Table 1 shows, oral and written CF differ in a number of respects. Oral CF involves the encoding and decoding of aurally
presented information, whereas written CF is typically provided visually. Oral CF is usually provided online during speech
production, while written CF is typically delayed and provided after a written task is completed. Thus, oral CF constitutes
integrated focus on form where linguistic forms are attended to in context and the learned knowledge is applied or proce-
duralized in immediate, subsequent production. Written CF, in contrast, is decontextualized and immediate production of the
targeted structure is not required. Oral CF has a pure focus on language-related errors, which may or may not cause
communication breakdowns, while written CF may target both language and content, that is, the discourse and organizational
aspects of writing. Oral CF can be implicit or explicit depending on whether learners are made aware of the problematic
nature of their speech performance. In contrast, written CF is always explicit because learners have no trouble recognizing the
corrective intention, regardless of how it is provided. Thus, the implicit-explicit distinction does not apply to written CF. Oral
CF can be categorized as input-providing or output-prompting based on whether the correct form is provided or withheld.
The same distinction applies to written CF, but the terms “direct” and “indirect” have been used to refer to feedback that
contains or withholds the correct form, respectively. In the literature on written CF, a distinction has also been made between
focused and unfocused CF, which refers to whether CF targets one or multiple linguistic structures. Although this distinction
may also apply to oral CF, it seems more important for written CF, probably because comprehensive error correction is a
prevalent pedagogical practice in L2 writing classes (Lee, 2018). Finally, while oral CF is usually provided by the teacher, both
teacher CF and peer CF are common in writing classes (Yu & Lee, 2014).

Table 1
A comparison of oral and written CF: differences and similarities.

Differences
Dimension     Oral CF                                     Written CF
Modality      Aural                                       Visual
Spontaneity   Synchronous/immediate                       Asynchronous/delayed
Context       Integrated                                  Isolated
Focus         Language only                               Both language and content
Salience      Explicit or implicit                        Explicit
Taxonomy      Prompt vs. provide; implicit vs. explicit   Direct vs. indirect; focused vs. unfocused
Source        Teacher                                     Teacher or peer

Similarities
Common themes                               Commonly asked questions
Theoretical debate                          Does CF facilitate or impede L2 development?
Pedagogical practice                        Do teachers provide feedback? What types of feedback do they provide? Is feedback
                                            incorporated in students' subsequent production? Is the corrective force of feedback
                                            recognized by feedback receivers?
Effectiveness                               Does CF facilitate L2 learning? Which types of CF are more effective? Are the effects
                                            sustainable? What factors moderate the effects of CF?
Teacher and student attitudes and beliefs   What are teachers' and students' attitudes about CF? Which types of feedback do
                                            teachers and students favor? Are teachers' stated CF beliefs consistent with their CF
                                            practice in the classroom?

Despite the differences in the characteristics and pedagogical practices of oral and written CF, they have been examined
from similar perspectives in the research. From a theoretical perspective, whether to provide CF, be it oral or written CF,
concerns a core debate over whether second language learning relies exclusively on positive evidence (correct linguistic
models) or requires both positive evidence and negative evidence (information about what is unacceptable). In both oral and
written CF research, opponents of CF (e.g. Truscott, 1996) draw on Krashen's (1982) theory, arguing that exposure to authentic
linguistic materials and using language to achieve communicative outcomes is key to learning success. Accordingly, CF is
deemed ineffective or even harmful for L2 development on the grounds that it only caters to explicit knowledge that is not
drawn on in real world oral and written tasks. In this view, the focus of oral and written tasks should be on meaning-making
rather than linguistic accuracy. Proponents of CF (e.g. Long, 2015; Lyster, 2015) argue that while positive evidence is critical for
L2 success, a small dose of form-focused instruction is important, especially in the case of nonsalient and semantically
redundant linguistic features, such as the French gender or the English third person -s, which can be easily ignored by
learners because those features are not meaning distinctive. In written CF research, this debate is translated into the
distinction between learning to write (i.e. how to effectively communicate meaning) and writing to learn the language (i.e.
how to improve linguistic skills via writing) (Manchón, 2011).
For both oral and written CF, one major stream of research centers on teachers' practices in the classroom. Questions
investigated in this stream of research include what types of CF teachers provide, what errors receive CF, whether CF is
noticed, and whether CF is incorporated in learners' later production. This type of research is observational and descriptive in
that it aims to show what happens in the classroom rather than whether it affects learning outcomes, a question that can
only be answered by experimental research (discussed below). The findings of observational research are significant because
they inform us to what extent teachers' CF practices are aligned with the findings of experimental research. For example,
while teachers prefer to provide recasts for learners' speech errors, this type of CF has been found to be less effective for L2
learning. Similarly, while teachers prefer to provide unfocused written feedback (i.e. correcting a wide range of errors),
experimental research shows that unfocused feedback is less effective than focused feedback (feedback provided on one or
two grammatical forms). The findings about learners’ noticing and uptake of CF are equally valuable because noticing and uptake
are hypothesized to facilitate L2 development (Schmidt, 1990).
Experimental CF research investigates (1) whether CF is effective for learning, (2) which types of CF are more effective, (3)
whether the effects are short-lived or can be retained for a long period, and (4) what factors mediate the effects of CF.
Experimental research is characterized by consistent variable manipulation, investigation of causal relationships, use of
pretests and posttests, and inclusion of control or comparison conditions. These studies help us reach conclusions on the role
of CF in L2 learning and resolve debates over the necessity of CF. The findings of experimental CF studies have been aggregated
or synthesized in a number of meta-analyses (e.g., Kang & Han, 2015; Li, 2010; Lyster & Saito, 2010) and narrative reviews (e.g.
Ellis, 2010; Lee, 2018). There have also been attempts to synthesize the methodological aspects of experimental CF research
(Li, 2018; Liu & Brown, 2015).
For both oral and written CF, there has been an active stream of research on teachers' and learners' beliefs about CF. Li
(2017) defined CF beliefs as “the attitudes, views, opinions, or stances learners and teachers hold about the utility of CF in
second language (L2) learning and teaching and how it should be implemented in the classroom” (p. 143). Li pointed out that
it is important to examine CF beliefs because (1) learners' beliefs about CF affect CF effectiveness (Sheen, 2007), (2) mis-
matches between student and teacher beliefs may affect students' satisfaction with the class and their motivation to learn the
language, (3) findings about teachers' and students' beliefs help us determine whether teachers' and students' preferences are
consistent with research findings, and (4) CF beliefs have been found to be separate from beliefs about other aspects of
language learning (Loewen et al., 2009). Typically, in belief studies, teachers and students are asked to respond to Likert-scale
questions regarding their attitudes toward CF or their personal preferences on various issues surrounding CF (Elwood & Bode,
2014). Research in this paradigm has also investigated the congruence and incongruence between teachers' stated beliefs and
their classroom practice and the effects of training on students’ beliefs.
With the framework of the review in place, we will discuss the research on oral and written CF following the same
structure. We will start by providing a taxonomy and reporting the findings of descriptive or observational CF studies. We will
then report the findings and methods of experimental research regarding CF's effectiveness. Finally, we report the findings on
teachers' and students' beliefs and attitudes toward CF.

2. Oral corrective feedback

2.1. Taxonomy and occurrence

Lyster and Ranta's (1997) seminal study identified six major types of oral CF: recasts, explicit correction, metalinguistic
clues, elicitation, repetition, and clarification requests. To exemplify, for the wrong passive use in the sentence, “Three people
killed in a car accident”, the teacher may correct it by:

1) using a recast, that is, reformulating all or part of the sentence with the wrong form replaced and the rest of the message
intact: “Three people were killed.”
2) using explicit correction, which consists of a comment informing the learner of the existence of an error followed by the
provision of the correct form: “Not ‘killed’, ‘were killed’.”
3) providing a metalinguistic clue, namely a comment on the nature of the error without providing the correct form: “You
need passive voice.”
4) eliciting the correct form from the learner: “Three people … ?”
5) repeating the error: “Killed?”
6) making a clarification request: “I'm sorry?”

The six feedback types can be categorized in two ways: implicit vs. explicit and input-providing vs. output-prompting,
with the former distinction based on whether the learner's attention is overtly drawn to the error and the latter on whether
self-repair is encouraged. Based on the implicit-explicit distinction, metalinguistic clues and explicit correction are more
explicit than other CF types. In terms of the provide-prompt distinction, recasts and explicit correction are input-providing,
while the other four CF types are output-prompting. Notwithstanding the clear taxonomy, teachers' CF practice may not
strictly follow the definitions. For example, teachers often conflate different CF types, such as by providing a metalinguistic
clue followed by explicit correction. In experimental research, however, it is typical to provide a certain type of feedback in a
consistent manner.
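
The two-way categorization described above can be written out as a small lookup, which makes it easy to see which feedback types fall into each cell. The following Python sketch is illustrative only: the class name, field names, and helper function are our own assumptions, and the binary `explicit` flag simplifies what the text describes as a matter of degree ("more explicit than other CF types").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackType:
    """One oral CF type, coded on the two dimensions discussed in the text."""
    name: str
    explicit: bool       # is the learner's attention overtly drawn to the error?
    provides_form: bool  # True = input-providing, False = output-prompting


# Coding follows the paragraph above; treating explicitness as binary is a
# simplifying assumption made for this sketch.
TAXONOMY = [
    FeedbackType("recast", explicit=False, provides_form=True),
    FeedbackType("explicit correction", explicit=True, provides_form=True),
    FeedbackType("metalinguistic clue", explicit=True, provides_form=False),
    FeedbackType("elicitation", explicit=False, provides_form=False),
    FeedbackType("repetition", explicit=False, provides_form=False),
    FeedbackType("clarification request", explicit=False, provides_form=False),
]


def names(explicit=None, provides_form=None):
    """Return the sorted names of CF types matching the given attribute values.

    Passing None for an attribute means "do not filter on this attribute".
    """
    return sorted(
        t.name
        for t in TAXONOMY
        if (explicit is None or t.explicit == explicit)
        and (provides_form is None or t.provides_form == provides_form)
    )


if __name__ == "__main__":
    print("Input-providing:", names(provides_form=True))
    print("Output-prompting:", names(provides_form=False))
    print("Explicit:", names(explicit=True))
```

Under this coding, only explicit correction is both explicit and input-providing, and recasts are the only implicit input-providing type.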
In terms of the occurrence of CF in the classroom, one overall pattern is that recasts seem to be teachers' favorite among all
CF types (e.g., Choi & Li, 2012; Zhao & Bitchener, 2007). This is probably due to the advantage of recasts in maintaining the
flow of communication and not undermining students' confidence. These speculations are confirmed by Roothooft (2014),
who reported that teachers favored recasts in their teaching because they were worried about using CF strategies that are
intrusive and that cause anxiety to students. The predominance of recasts is also detected by Brown (2016), who synthesized
all descriptive CF studies. However, Brown found that while recasts occur more frequently in adult and elementary classes, in
high school language classes prompts are more frequent than recasts. It would seem that teachers prefer to use recasts for
children because prompts may lead to management issues given that children's responses may go off the track if they are
encouraged to self-correct. It is also possible that teachers assume that children have limited L2 knowledge and that
prompting them to self-correct may not work. For high school students, or teenagers whose attention span is short, prompts
seem to work better because of their advantage in drawing attention to errors. Teachers seem to switch back to recasts for
adults, probably because adult learners are sensitive to CF and can therefore notice the corrective force even though recasts
are implicit (Vuono & Li, in press). It is also possible that many of the adult classes were from intensive language programs
where teachers felt it more efficient to use recasts than prompts.
Brown's (2016) meta-analysis reported two interesting findings related to teachers. One is that teachers with more L2
training tended to provide more prompts and fewer recasts. It would seem that teachers who are better informed about L2
and CF research are more aware of the benefits of prompts. The other finding is that teachers provided more prompts when
they were informed that the purpose of the research was to investigate CF, but when they were told that the purpose was to
examine interaction, the way they provided CF was unaffected. The finding seems to suggest that teachers believe that
prompts are perceived more positively than recasts, and they tend to go out of their way to meet observers' expectations if
they are made aware that the focus is on CF.
A further question about CF is whether learners engage with it. Engagement can be operationalized as whether the
corrective intention of feedback is noticed by learners or whether CF is followed by uptake. Noticing has been measured
through online or offline verbal reports, that is, learners are required to report in written or oral form what they noticed
during an interactional treatment. Online methods require learners to report their cognitive behavior during ongoing
interaction. Online verbal report can take the form of (1) think-aloud (e.g. Lai, Fei, & Roots, 2008), where learners report what
comes to mind at the moment, (2) online journal, which requires learners to record on a sheet of paper what they have
learned (Al-Surmi, 2012), or (3) immediate recall, in which learners recall what they heard in the immediate past (Ahn, 2012).
Offline methods require learners to make reflections on their cognitive behavior during task performance after the task is
completed. Offline methods include stimulated recall where learners make comments while watching their videotaped task
performance (Bao, Egi, & Han, 2011; Rassaei, 2013) and an exit questionnaire (or uptake chart) asking learners to recall what
they learned at the end of the study (Mackey, 2006).
The studies on noticing almost invariably examined recasts (Al-Surmi, 2012; Bao, Egi, & Han, 2011; Egi, 2010; Kim, Payant, &
Pearson, 2015; Mackey, Gass, & McDonough, 2000; Philp, 2003), probably because recasts are implicit, and explicit CF such as
metalinguistic feedback and explicit correction can be easily noticed. Also, these studies typically involve the dyadic inter-
action between a native speaking interlocutor and a learner, except for Bao et al. (2011), which investigated teacher-fronted
class interaction. The studies (e.g. Bao et al., 2011; Egi, 2010) on noticing have demonstrated that recasts are more likely to be
noticed when they (1) are short, (2) involve fewer changes, and (3) are delivered using a rising intonation. The studies also
show that morphosyntactic recasts are less likely to be noticed than lexical and phonological recasts (e.g. Al-Surmi, 2012;
Mackey et al., 2000). The studies using immediate recall (e.g. Kim et al., 2015; Philp, 2003) as a measure of noticing all re-
ported higher noticing rates than other studies, suggesting the existence of reactivity, which refers to the effects of noticing
activities on task performance. Finally, the likelihood of noticing may also be influenced by other factors such as whether
learners have been taught through traditional form-oriented approaches which may make them very sensitive to CF and
whether the target feature is salient. To date, there has been little research on these constraining factors.
Uptake refers to learners' responses after CF. Uptake can be successful or unsuccessful depending on whether the error is
fixed in the following utterance. The benefits of uptake are justifiable on the following grounds. First, uptake shows that
feedback is noticed or is registered in the learner's short-term memory. However, absence of uptake is not an indicator of
failure to notice because learners often do not have the opportunity to respond to CF, especially in classroom settings (Lyster,
2001). In fact, there has been empirical evidence corroborating this argument. For example, Bao et al. (2011) found that
learners reported more noticing of recasts during a stimulated recall (37%) than the percentage of recasts followed by uptake
(14%). Second, uptake involves real-time production of language, which necessarily facilitates fluency and the procedurali-
zation of L2 knowledge. Third, uptake after prompts is especially useful because it pushes the learner to conduct deep
cognitive processing, thus facilitating L2 development. Uptake after prompts, then, constitutes pushed output, which, ac-
cording to Swain's (2005) Output Hypothesis, has at least three benefits: (1) helping the learner notice the hole, that is, raising
the learner's awareness of his/her lack of linguistic knowledge and prompting the learner to attend to relevant linguistic input
in future learning; (2) providing a forum to conduct hypothesis testing or try out previously learned knowledge; (3)
prompting the learner to make metalinguistic reflections.
Observational CF studies showed that the levels of uptake vary considerably between different instructional settings.
Sheen (2004) compared the incidence of CF and uptake between four instructional settings: New Zealand ESL, Korean EFL,
Canadian immersion, and Canadian ESL. She found that while recasts were the most frequent CF type in all four settings,
learner repair and uptake following recasts were substantially greater in the Korean and New Zealand classes than in the
other two contexts (72.9% and 82.5% vs. 30.7% and 39.8%). Lyster and Mori (2006) compared Canadian French immersion and
Japanese immersion classes. They reported that (1) Japanese immersion students responded to feedback more frequently
than French immersion students, and (2) the greatest proportion of uptake and repair in Japanese immersion settings
followed recasts (61% for uptake and 68% for repair), whereas the greatest proportion of uptake and repair in French
immersion settings followed prompts (62% and 53%). They attributed the disparate findings to the fact that the teachers of
Japanese immersion tended to ask students to repeat recasts. The teachers also regularly integrated choral activities into
their content-based instruction, asking students to repeat the linguistic models provided by their teachers. Based on these
findings, the researchers proposed the so-called counterbalance hypothesis, which states that the provision of CF should
vary according to the overall orientation of the instructional setting and that the best way to provide CF is one that differs
from the overall focus of the instruction. Specifically, in settings that are more meaning-oriented such as French immersion,
where recasts are less likely to lead to uptake and repair, prompts are ideal because they are better at shifting learners'
attention to linguistic forms. Conversely, in more form-oriented classes such as Japanese immersion, where learners are
sensitive to CF, recasts are ideal because they are easily recognizable and they are conducive to orienting learners’ attention
to meaning.
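
The uptake and repair percentages reported in such observational studies are proportions computed over coded feedback episodes. The following Python sketch shows one way this arithmetic could be done. The episode data are invented for illustration, the response labels are our own, and we assume that both rates are taken over all feedback moves of a given type; individual studies may use different denominators.

```python
from collections import Counter

# Hypothetical coded episodes: (feedback type, learner response).
# "repair" = uptake in which the error is fixed, "needs_repair" = uptake
# in which it is not, "no_uptake" = no learner response to the feedback.
EPISODES = [
    ("recast", "repair"),
    ("recast", "no_uptake"),
    ("recast", "no_uptake"),
    ("recast", "needs_repair"),
    ("prompt", "repair"),
    ("prompt", "repair"),
    ("prompt", "needs_repair"),
    ("prompt", "no_uptake"),
]


def uptake_and_repair_rates(episodes, feedback_type):
    """Return (uptake rate, repair rate) for one feedback type.

    Both rates use the number of feedback moves of that type as the
    denominator (an assumption of this sketch).
    """
    responses = [resp for ftype, resp in episodes if ftype == feedback_type]
    if not responses:
        raise ValueError(f"no episodes coded as {feedback_type!r}")
    counts = Counter(responses)
    total = len(responses)
    uptake_rate = (counts["repair"] + counts["needs_repair"]) / total
    repair_rate = counts["repair"] / total
    return uptake_rate, repair_rate


if __name__ == "__main__":
    for ftype in ("recast", "prompt"):
        uptake, repair = uptake_and_repair_rates(EPISODES, ftype)
        print(f"{ftype}: uptake {uptake:.0%}, repair {repair:.0%}")
```

Keeping successful and unsuccessful uptake as separate codes allows both rates to be derived from a single pass over the coded data.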
To conclude this section, we would like to point out that there is a need to distinguish different types of uptake.
Specifically, uptake after prompts and uptake after recasts appear to be qualitatively different. As discussed above, because
prompts withhold the correct form and encourage learner repair, uptake after prompts constitutes pushed output that has a
number of benefits. However, uptake after recasts, which contain the correct form, may represent mere repetition of the
feedback and may therefore be of limited value in facilitating learning. The speculation that prompt-generated uptake and
recast-generated uptake may have differential effects on L2 learning has been borne out empirically. For example, Loewen
and Philp (2006) reported that successful uptake was predictive of learners’ scores on a tailor-made test only when feed-
back contained prompts (elicitation or metalinguistic information). McDonough (2005) found that modified output after
clarification requests (prompts) was the only predictor of ESL question development. Nassaji (2011) showed that while repairs
(successful uptake) after prompts and those after recasts had similar immediate effects reflected on a tailor-made test, the
effects of recast-generated repairs were not as sustainable as prompt-generated repairs. Interestingly, Nassaji further divided
the repairs after recasts into repetition and incorporation, with the former referring to mere repetition of a recast and the
latter to using the information of the recast in new contexts and revised utterances. Nassaji found that most repairs after
recasts were repetitions, but incorporation led to higher test scores than repetition, suggesting that the two types of repairs
represent different kinds of uptake that lead to different levels of learning.

2.2. The effectiveness of oral CF

While observational studies examine teachers' CF practice and students' responses to CF, experimental studies investigate
the effects of CF on L2 development. Whether CF facilitates learning is a primary concern of theorists, researchers, and
teachers, because the ultimate goal of all discussions and research about CF is to see whether CF can enhance L2 learning. In
their seminal study, Lyster and Ranta (1997) commented that 20 years after Hendrickson's (1978) five famous questions
regarding various issues on CF, we were still unable to answer the most fundamental question: whether CF should be
provided. However, we would like to point out that, twenty years after Lyster and Ranta's study, based on the large amount of
research that has been accumulated, we are able to conclude that the answer to that question is positive. The most convincing
evidence comes from the meta-analyses of CF research (e.g. Li, 2010; Lyster & Saito, 2010; Mackey & Goo, 2007), which all
showed that CF has significant effects on L2 learning, with the magnitude of the effects ranging from medium to large. A
meta-analysis aggregates all available research that has examined the construct in question and is thus based on the totality of
the research rather than a set of selected studies. Also, meta-analysis follows rigorous statistical procedures, and therefore the
results are robust. In the following sections, we summarize the findings of the meta-analyses by Li (2010) and Lyster and Saito
(2010) to provide an overview of the effects of CF, followed by a review of some more recent studies. We conclude by drawing
attention to some methodological issues arising from experimental CF research.
Li (2010) and Lyster and Saito (2010) both meta-analyzed CF research, but their foci and analyses are slightly different.
Li's (2010) meta-analysis aggregated the results of both laboratory and classroom studies and both published studies and
unpublished Ph.D. dissertations. The results showed that CF had an overall medium effect on L2 learning, d = 0.61. When
CF was divided into implicit (recasts, clarification, elicitation, and repetition) and explicit (metalinguistic feedback and
explicit correction) according to whether learners' attention is overtly drawn to errors, explicit CF showed larger im-
mediate effects but smaller long-term effects than implicit feedback. Long (2015) interpreted this finding as suggesting
that implicit instruction has a more robust effect on L2 learning than explicit instruction and that recasts, an implicit type
of CF, are an ideal form-focusing strategy. Other major findings include a larger effect in laboratory than classroom studies
and a larger effect for shorter than longer CF treatments. Lyster and Saito (2010) only included classroom studies and
found larger effects for prompts than recasts, suggesting that in classroom settings prompts are likely more effective than
recasts because they are more salient. One striking finding is that younger learners benefited more from CF than older
learners. The authors interpreted this finding as suggesting that oral CF facilitates implicit learning, which matches
children's learning mechanism. However, it is unclear whether there is an interaction between age and CF type, that is,
whether the comparative effects of recasts and prompts are different for adult and child language learners. It is also
worth noting that to date there has been no research examining age as an independent variable. Therefore, age is a niche
to be filled in CF research. Finally, both meta-analyses showed a larger effect for CF on oral tests than on written tests,
which is attributable to the consistency in format between the tests and the treatments, which are provided in the oral
mode.
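For readers less familiar with the metric, the standardized mean difference d reported in these meta-analyses is conventionally defined as shown below. The notation is generic and is an assumption made here for illustration (subscript 1 for the CF group, subscript 2 for the comparison group); it is not taken from Li (2010) or Lyster and Saito (2010), whose exact computational procedures may differ.

```latex
d = \frac{\bar{X}_{1} - \bar{X}_{2}}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_{1} - 1)\,s_{1}^{2} + (n_{2} - 1)\,s_{2}^{2}}{n_{1} + n_{2} - 2}}
```

Under Cohen's commonly cited benchmarks (0.2 small, 0.5 medium, 0.8 large), a value such as d = 0.61 falls in the medium range.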
Having examined the overall effects of CF by summarizing the findings of the two meta-analyses, we now turn to some
other important or interesting findings. One active stream of research has examined the extent to which CF effects are
moderated by individual difference variables, which can be called learner-internal factors. Among individual difference
variables, language aptitude and working memory have received the most attention. Li (2017) conducted a research synthesis
of the studies investigating the associations between learners’ cognitive variation in language aptitude and working memory
on the one hand and the effects of CF on the other. The results showed that language aptitude was more strongly correlated
with the effects of CF than working memory, suggesting that the former is a domain-specific cognitive ability that is
exclusively important for language learning, while the latter is a domain-general cognitive device that is important for all
areas of academic learning. The results also showed that both cognitive variables were more predictive of the effects of
explicit feedback than those of implicit feedback. This finding may suggest that both variables are more important for
conscious learning than unconscious learning. Recently, the concept of implicit language aptitude (the ability to learn a
second language unconsciously) has been introduced into CF research. A study by Yilmaz and Granena (in press) showed that
implicit aptitude measured via a serial reaction time task was significantly correlated with the effects of implicit feedback, but
not with the effects of explicit feedback.
Another individual difference variable that has been examined is anxiety. Rassaei (2015) found that high-anxiety learners
benefited more from recasts than metalinguistic feedback, while low-anxiety learners benefited equally from the two types of
feedback. The author's explanation of the results is that recasts do not arouse anxiety and are therefore particularly useful for
learners with high anxiety. Sheen (2008) showed that less anxious learners benefited more from recasts and that they also
produced more modified output after recasts. These two studies seem to suggest that to accommodate learners with high
anxiety, it is more advisable to provide recasts than metalinguistic feedback. However, because anxiety has a negative impact
even when recasts are provided, teachers should avoid using activities or strategies that may potentially increase learners'
anxiety.
The linguistic target of a CF treatment, which constitutes a learner-external variable, has been found to moderate CF
effectiveness. Varnosfadrani and Basturkmen (2009) examined the effects of three types of CF, namely immediate metalinguistic
correction (provided during task performance), delayed metalinguistic correction (provided after tasks were completed), and
immediate recasts (provided during task performance), on the learning of developmental early and late structures. The
developmental early structures were the English definite article the, irregular past, and plural -s; the late structures were
indefinite articles a and an, regular past, relative clause, passive voice, and third person -s. The selection of developmental
early and late structures was based on the research findings on the sequence of acquisition of some grammatical morphemes
by first and second language learners of English. The results showed that metalinguistic correction worked better for early
structures, and recasts were more effective for late structures. The results, according to the researchers, confirmed Krashen's
(1982) view of second language acquisition, that is, explicit instruction is only effective for simple structures, and complex
structures have to be learned implicitly. However, it is unclear whether it is justified to equate acquisition order with
structural complexity. For example, the third person -s is a very simple structure, and yet it is late-acquired. Also, conflating
CF timing (immediate vs. delayed) and CF explicitness (explicit vs. implicit) seems to be a methodological limitation.
Notwithstanding, this is an original study in that it initiated a promising topic that has much potential for further empirical
investigation.
One obvious pattern, or rather limitation, of CF research is that most research concerns morphosyntax, and there is little on
other aspects of learning. Here we would like to discuss two studies published in System on two less explored
areas: pronunciation and pragmatics. Gooch, Saito, and Lyster (2016) compared the effects of prompts
(clarification + elicitation) and recasts, both preceded by form-focused instruction, on the learning of the English /ɹ/ by L1
Korean learners. Treatment effects were measured via a controlled task where learners read single words and a free pro-
duction task which required learners to describe two pictures. The researchers reported that recasts only facilitated the
learners’ comprehensibility of the sound in controlled tasks whereas prompts were effective for both controlled and free
production. With respect to pragmatics, Takimoto (2006) examined the effects of explicit feedback (explicit
correction + metalinguistic explanation) on the learning of lexical/phrasal and syntactic downgraders. Example downgraders
are: “I am wondering if you could lend me your textbook” (syntactic), and “Could you possibly lend me your textbook?”
(lexical). The participants were divided into three groups: feedback, no feedback, and control. Both experimental groups
performed consciousness raising tasks asking learners to analyze given materials and answer some questions. The teacher
provided feedback on their answers. The study failed to find significant differences between the two treatment groups,
although they both outperformed the control group. The lack of CF effects is probably due to the consciousness raising tasks,
which may have levelled out the effects of CF. Furthermore, in this study, CF was provided in discrete item practice, which
deviates from other CF studies where feedback is embedded in communicative tasks.
Finally, a discussion of some methodological issues is in order. Li (2018) synthesized the methods of 34 CF studies pub-
lished in five top journals including System. The purpose of the synthesis was to inform the reader of how CF studies have
been conducted and what issues may undermine the internal, external, and construct validity of empirical research. Li coded
the 34 studies in terms of CF treatment, CF elicitation task, and the measurement of CF effects. With regard to CF treatment, he
showed the inconsistency in the way CF has been operationalized in the primary research. He discussed the strengths and
limitations of laboratory and classroom studies, proposing ways to address some of the methodological issues such as receivers
vs. non-receivers of CF, peer input, and learner motivation. In terms of CF elicitation tasks, one concern is the use of
mechanical drills in some studies as treatment tasks, which violates the construct validity of CF research. According to the
theories CF studies are couched in, such as the Interaction Hypothesis (Long, 2015), CF must occur in meaning-primary tasks,
not in discrete item practice, which is reminiscent of the traditional Audiolingual approach. Another issue is task validation,
which means that evidence must be collected to validate task complexity (i.e. ensure a complex task is indeed more complex
than a simple task) and contexts of obligatory use of the target structure (i.e. errors that receive feedback are indeed errors).
For the measurement of CF effects, Li described tests of implicit and explicit knowledge and ways treatment effects have been
operationalized including mastery of the target structure, use of a more advanced variant of a structure (staged development),
automatization of existing knowledge, and learners’ overall task performance indexed by the complexity, accuracy, and
fluency of their speech production.

2.3. Teacher and student beliefs and attitudes

Most studies examine CF beliefs together with beliefs about other aspects of language learning such as grammar learning
(e.g. Loewen et al., 2009; Schulz, 1996), and studies focusing exclusively on CF are few and far between (e.g. Martínez Agudo,
2014). Using meta-analysis as well as narrative review, Li (2017) synthesized all the empirical studies investigating CF beliefs,
and the following results were obtained. First, while students were overwhelmingly positive about CF, with an 89% agreement
rate (collapsing “agree” and “strongly agree”) for Likert scale questions, teachers were hesitant, with only a 39% agreement rate.
Also, teachers' and students' beliefs were affected by their experiences. For example, students who had received more
grammar instruction and error correction in their previous learning experience were less positive about CF than those who
had less such experience. Novice teachers were more concerned about the harmful effects of CF than experienced teachers.
Second, in terms of preferences for CF types, there is only one study on students' beliefs, and the participants were international
teaching assistants, not language learners. These participants preferred to receive input-providing feedback, namely
recasts and explicit correction, more than output-prompting feedback such as elicitation. On the teacher's side, novice
teachers favored implicit feedback while experienced teachers advocated a more balanced approach using a variety of
feedback (Rahimi & Zhang, 2015). Third, with regard to the timing of CF, students were in favor of immediate CF while
teachers tended to prefer delayed CF (Roothooft, 2014).
Several studies have examined whether training can change students' and teachers' CF beliefs. Sato (2013) reported that
after a semester-long training where students engaged in communicative tasks and practiced giving and receiving peer
feedback, they were significantly more positive about CF. Vásquez and Harvey (2010) and Busch (2010) found that engaging
teachers and pre-service teachers in hands-on, experiential activities, such as conducting a small-scale CF project, tutoring a
learner, and reflecting on their own beliefs and teaching practices, successfully changed teachers' CF beliefs. On the other
hand, Kamiya and Loewen (2014) found that asking teachers to read journal articles reporting CF studies had no effect on
teachers’ beliefs.
Some small-scale qualitative studies examined the congruence and incongruence between teachers’ stated beliefs about
CF and the way they provided CF in the classroom (Basturkmen, Loewen, & Ellis, 2004; Junqueira & Kim, 2013; Kartchava,
2006). In general, teachers were consistent about recasts: they reported favoring recasts and opposing explicit feedback
and in their teaching they indeed provided more recasts than explicit feedback. However, teachers showed more inconsis-
tency than consistency. For example, some teachers were negative about CF when reporting their CF beliefs but in their
teaching practice they provided lots of CF. Some teachers claimed that they preferred prompts more than recasts but in the
classroom they used more recasts than prompts.

3. Written corrective feedback

3.1. Taxonomy and occurrence

Written CF (WCF) refers to responses and comments on learners' written production in a second language. WCF can be
provided in written or oral form, with the former referring to written comments provided in the learner's written script and
the latter to verbal feedback on the learner's written product during individual conferencing (Erlam, Ellis, & Batstone, 2013) or
during class sessions (Bitchener & Knoch, 2009). Thus, the definition of written CF is based on the modality of L2 production
rather than the modality of feedback. WCF may target both content and language (Ashwell, 2000), although the research has
predominantly focused on language-related errors. Also, although WCF has been primarily operationalized as responses to
issues and errors, one variant is providing a sample or model as a means of directing students to identify their own errors or
areas of improvement in their writing (e.g. Cánovas Guirao, Roca de Larios, & Coyle, 2015). Given that most WCF research has
focused on feedback in the form of teachers' written comments on learners' errors in their language use rather than content,
our review will focus on studies on this type of feedback.
Ellis (2009) provided a typology for the various ways written corrective feedback is provided. He divided feedback into
three broad categories: direct, metalinguistic, and indirect. For instance, in response to the inaccurate use of the present perfect
tense in the sentence, “Nowadays, technology had made it easier for people to communicate”, the teacher may provide
feedback in the following ways.

1. Direct feedback: providing the correct form for the student by replacing “had” with “has”.
2. Metalinguistic feedback: giving the student a clue by identifying the nature of the error in the form of a brief description
such as “use the present perfect”, or using an error code such as T (for tense).
3. Indirect feedback: demonstrating the existence of the error by circling, underlining, or otherwise highlighting “had”
without providing further information about the nature of the error.

Metalinguistic feedback has previously been considered a form of indirect feedback as it identifies the location of the error,
withholds the correct form, and encourages the learner to self-correct (similar to what Lyster and Ranta (1997) referred to as
metalinguistic clue in oral feedback) (Truscott, 1996). Although some researchers have used the two terms interchangeably,
indirect feedback and metalinguistic feedback are fundamentally different. While indirect feedback only indicates that an
error is present, metalinguistic feedback provides a clue to illustrate the cause and nature of the error. While metalinguistic
feedback is typically operationalized as brief comments or error codes on individual errors, and is therefore scattered in a
written text, one variant that has appeared in the literature is providing a handout that contains the rule explanation of the
target structure followed by examples. Students study the information and then correct their errors by applying the rule (Li &
Roshan, 2019; Shintani, Ellis, & Suzuki, 2013). One may question whether rule explanation constitutes feedback because it is
not related to students' written performance. However, if feedback is defined as responses to errors for the purpose of
rectifying errors, it is justified to consider this kind of rule explanation as feedback on the grounds that (1) it is reactive, that is,
it is provided after, not before, errors are committed, and (2) the purpose is to raise the learners’ awareness of, and help them
correct, their errors.
WCF can be classified as focused or unfocused based on the number of error categories or target structures. Focused
feedback refers to feedback that targets a limited number of linguistic structures while unfocused feedback targets errors
relating to multiple structures. Degrees of focus fall on a continuum with the most focused feedback targeting only one error
type or linguistic structure and the least focused being exhaustive, correcting errors relating to all linguistic features. Liu and
Brown (2015) identify feedback falling between the extremes of the focused and unfocused continuum as being mid-focused;
in their classification scheme, feedback targeting 2–6 structures is considered mid-focused. Obviously the focusedness of
WCF stands in a continuum, and in a meta-analysis, it would be possible to investigate focusedness as a continuous variable
and ascertain whether it is predictive of treatment effects. One dimension that is missing from the literature is whether
feedback targets all or selected errors: the former may be termed comprehensive feedback whereas the latter selective
feedback. Thus, the focused vs. unfocused distinction pertains to the number of linguistic structures while the comprehensive
vs. selective distinction helps us distinguish whether all or selected errors received feedback. Therefore, researchers may use
focused feedback to correct errors relating to one linguistic structure, but they still need to face the choice of correcting all or
some errors learners make in using that particular structure.
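The two dimensions described above can be made concrete with a small coding sketch. It assumes that exactly one target structure counts as focused, that 2–6 structures count as mid-focused (following the Liu and Brown scheme as summarized here), and that anything above six is treated as unfocused; that upper cutoff, along with every name in the code, is a hypothetical choice for illustration and not a published coding scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackDesign:
    """Hypothetical record describing one WCF treatment."""
    n_target_structures: int   # number of linguistic structures that receive feedback
    corrects_all_errors: bool  # True if every error on those structures is corrected


def focus_level(n_structures: int) -> str:
    """Code focusedness from the number of targeted structures.

    Assumed cutoffs: 1 = focused, 2-6 = mid-focused, more than 6 = unfocused.
    """
    if n_structures < 1:
        raise ValueError("feedback must target at least one structure")
    if n_structures == 1:
        return "focused"
    if n_structures <= 6:
        return "mid-focused"
    return "unfocused"


def coverage(corrects_all_errors: bool) -> str:
    """Code whether all errors or only selected errors receive feedback."""
    return "comprehensive" if corrects_all_errors else "selective"


def describe(design: FeedbackDesign) -> str:
    """Combine the two independent dimensions into one label."""
    return f"{focus_level(design.n_target_structures)}, {coverage(design.corrects_all_errors)}"


# A treatment that targets one structure but corrects only some of its errors.
print(describe(FeedbackDesign(n_target_structures=1, corrects_all_errors=False)))
```

Keeping the two functions separate mirrors the point made in the text: a study can be focused and still selective, so the number of structures and the share of errors corrected should be coded independently.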
With regard to teachers' WCF practice, the most frequent feedback employed by teachers appears to be unfocused and
comprehensive with a similar ratio of direct and indirect WCF. Lee's (2004) study demonstrated that 67% of EFL teachers
(N = 58) in Hong Kong secondary schools provided feedback on all errors made by their students, and 72% felt they did so.
Fifty-five percent of the teachers who gave feedback on all errors provided direct feedback for the errors they identified. The
teachers only used one type of indirect feedback: locating the error plus providing error codes (metalinguistic feedback). She
elaborated that the majority of the teachers explained that comprehensive feedback is mandated in their schools, indicating
that administrators in Hong Kong schools hold unfocused, comprehensive WCF in high regard.
Ferris's (2006) observations of three ESL teachers in California revealed similar results in terms of the distribution of direct
and indirect feedback in teachers' WCF practice. On the student writing samples collected for the study, the teachers provided
relatively equal quantities of indirect (51.1%) and direct (45.3%) feedback. Further exploration revealed that the teachers were
more likely to provide direct feedback to non-rule-based, or untreatable, errors (65.3%), and indirect feedback to rule-based,
or treatable, errors (58.7%). Ferris further speculated that teachers likely provide indirect feedback for treatable errors,
presuming learners will be able to self-correct based on language related norms and rules. Direct feedback is provided for
untreatable errors with the assumption that learners are incapable of correcting these errors without assistance. Ferris's
findings suggest that teachers' WCF practice is likely constrained by error type.
There is evidence of differences in feedback provided by native and nonnative teachers. Han (2017) observed the feedback
practices of two EFL teachers in China, revealing that the native English-speaking teacher was more likely to provide indirect
feedback on student writing whereas the native Chinese-speaking teacher's WCF practice was more direct. However,
quantities or proportions of feedback by type were not reported so the degree of difference between the feedback practices of
the two teachers is unclear. Hyland and Anan (2006) also explored how native English-speakers from the United Kingdom
and non-native EFL teachers in Japan provided feedback. They found that the Japanese-native teachers were much stricter in
the feedback they provided, rating learner errors more harshly than did the native English-speaking teachers. They also noted
that while the native-English speaking teachers identified learner strengths alongside areas for improvement or revision,
none of the 16 Japanese native teachers provided a positive comment on the text. They also found that Japanese native
teachers reported attempts to provide comprehensive feedback while native English speakers were more likely to be selective
in their feedback, focusing on errors which impeded understanding. Recall that Lee (2004) found similar results, with EFL
teachers in Hong Kong attempting to provide comprehensive feedback. However, whether Lee's sample included native and/
or nonnative English teachers is unclear although, because of the nature of the FL context, it is probable that the majority of
teachers in her sample were non-native English speakers.
Similar to uptake in oral feedback, learners' revisions in response to written feedback can be considered a proxy of
learners' engagement with feedback. In Ferris's (2006) investigation, when students were asked to correct their own writing
without receiving feedback from the teacher, they were able to locate and self-correct 19% of errors present in their writing.
However, when errors were indicated by the teacher through use of WCF, the students were able to make accurate changes in
response to 80.4% of the suggestions with 9.9% of errors incorrectly changed and 9.3% of corrections ignored. Clearly WCF
enhanced noticing of errors and led to a greater quantity of successful revisions. Ferris did not identify the proportion of
corrections made in response to different feedback types. However, presumably incorrect variations or ignored corrections
were linked to indirect WCF because direct corrections would have provided the correct form of the target structure for the
learners to incorporate (Suzuki, Nassaji, & Sato, 2019). This speculation is confirmed by Van Beuningen, De Jong, and Kuiken's
(2012) study showing that whereas direct feedback led to a correction rate of 78% in students' revisions, the correction rate for
indirect feedback (error codes) was 64%. Lee (1997) also reported that accuracy of revisions was considerably lower when
learners responded to indirect feedback or metalinguistic clues. She found that EFL students in Hong Kong were able to make
one-word corrections for an average of 50.5% of errors when responding to indirect WCF or metalinguistic comments. Lee did
not differentiate the percentage of revisions made in response to metalinguistic codes and indirect feedback separately.
However, she elaborated that metalinguistic codes may be ambiguous and hence hinder student corrections.
Lee (2004) reported that the vast majority of the secondary school teachers she investigated reported using metalinguistic
codes. They employed 15 to 26 distinct codes, posing a challenge for students who were unsure what to make of the feedback.
To exacerbate the situation, some of the codes overlapped. For example, when a student is making a revision and sees the
code “P”, they must distinguish whether the code refers to an error of preposition, pronoun, or punctuation use. Liu and
Brown (2015) reported that 44% of empirical studies employing metalinguistic error codes did not disclose whether the
students had access to a key describing what each code entailed. Although keys may be present in the writing classroom, they
are not universal and so learners have to utilize different keys to make revisions in different writing classes.
One contributing factor to the successfulness of students' revision is the accuracy of teachers' feedback. Lee (2004) re-
ported that when teachers attempted to provide exhaustive feedback, “57% of their corrections were accurate, 40% unnec-
essary, and 3% inaccurate” (p. 298). Lee pointed out that the unnecessary feedback that teachers provided to their students
resulted in an increase in errors in the revised texts. Lee's finding raised a practical issue that is prevalent in foreign language
contexts, namely the fact that some teachers are unable to provide accurate feedback. This is also one of the concerns Truscott
(1996) brought up when justifying his objection to WCF, although Truscott's concern is more to do with teachers' lack of
metalinguistic knowledge to accurately describe the nature of learners' errors. However, it must be clarified that teachers'
inability to provide accurate feedback is indicative of a need for teacher training rather than a basis for disregarding the value
of feedback.
Another factor that may influence the uptake of WCF is learners' language proficiency. It is possible that the low English
proficiency of the students in Lee's (1997) study impacted their ability to self-correct in response to indirect feedback. This
speculation has been confirmed by Park, Song, and Shin (2015), who investigated the impact of indirect WCF (error codes and
underlining) on beginner and intermediate L2 Korean students' corrections of their own written errors. The intermediate
learners were able to successfully revise 38% of the errors indicated while beginners self-corrected nearly 32% of their errors.
The results showed that overall intermediate learners corrected a higher percentage of their own errors as a result of the
teacher's feedback, and the difference was statistically significant. Writers with higher L2 proficiency may have greater
metalinguistic awareness and therefore may be more prepared to respond to indirect feedback and respond with more ac-
curate revisions. Lower proficiency learners, on the other hand, have fewer linguistic resources to access when making re-
visions, and may benefit more from direct focused feedback.
More focused feedback may also enhance the number of corrections made in response to WCF. Suzuki et al. (2019)
investigated the impact that different feedback types, with a focus on two target structures, had on revisions. They
randomly assigned 88 Japanese EFL students to four treatment groups that received direct or indirect WCF with or without
metalinguistic explanation. The results showed that the participants in the direct feedback group provided nearly perfect use
of the target forms in their revisions (95%–100%), which is primarily due to the fact that they had access to WCF while making
corrections. The groups that received indirect WCF correctly used the target structures in 69%–82% of the revised drafts.
Although direct feedback led to higher correction rates than indirect feedback, the latter still yielded higher correction rates than
the rate of around 50% reported in Lee's (1997) study, which is probably due to the focused nature of the feedback in Suzuki
et al.'s study.
As a final note to conclude this section, a distinction should be made between uptake, which refers to revisions to the
original draft, and learning, which is measured through learners’ performance in a new writing task. Truscott (1996) warned
that revisions made in response to WCF potentially demonstrate false gains in the form of “pseudolearning” (p. 345) which
could not be replicated in new writing. While revisions may indicate replication of a model (Truscott, 1996), learning entails
transfer of knowledge to a new context (Larsen-Freeman, 2013). In the following section, we review the findings of exper-
imental WCF research that examined the effects of CF on learning gains measured via new writing tasks.

3.2. The effectiveness of written CF

Any discussion of the effectiveness of WCF starts with Truscott (1996) because of his strong opposition to this instructional
technique. He claimed that “grammar correction has no place in writing courses and should be abandoned” (p. 328). He
argued that from a theoretical standpoint error correction does not work because second language acquisition follows a fixed
sequence, and CF may not work for structures that learners are not ready for. Truscott contended that although we have
limited knowledge about the order of acquisition, there has been evidence that there exists a sequence that is impervious to
external intervention such as feedback. He went on to point out that another reason why feedback does not work is that the
knowledge learned through feedback constitutes pseudoknowledge. Truscott did not provide an exact definition of pseu-
doknowledge, but he seemed to refer to (1) knowledge that is superficial and unsystematic, such as the kind represented by
the correction of errors in revised writing, and (2) explicit linguistic knowledge that, according to him, learners do not draw
on in their writing (they resort to implicit knowledge instead). Truscott also brought up some practical issues that pose a
challenge for the effects of WCF: (1) teachers may not recognize errors, (2) teachers may not have the metalinguistic
knowledge to explain the nature of the error, (3) students may not understand teachers' explanations, and (4) students may
not be motivated to attend to teachers' feedback. Truscott further argued that WCF is not only ineffective but also harmful
because (1) it has a negative effect on students' confidence, (2) it affects students’ writing quality because they “shorten and
simplify their writing to avoid corrections” (p. 355), and (3) it is time-consuming, hence the corollary that “it diverts class
resources from more appropriate tasks” (p. 356).
Truscott's complete denial of the necessity and utility of WCF has received much criticism from scholars in writing
research (e.g. Ferris, 2004). Theoretically, in the field of SLA a general view is that while language instruction should be
primarily meaning-focused, a certain dose of form-focused instruction, in the form of explicit instruction and corrective
feedback, is necessary for the learning of nonsalient, communicatively redundant linguistic features (such as French gender
and the English third person -s) that may escape learners' attention (Gass, 2017; Long, 2015). Empirically, as will be
discussed in later sections, the research has demonstrated unequivocally that WCF is facilitative of L2 development, not only
in revised writing but also in new contexts when students are given a different writing prompt after receiving feedback (Kang
& Han, 2015). Pedagogically, providing feedback on students' errors is typical of writing classes, and the research has shown
that teachers, students, and other stake-holders such as school administrators are all positive about WCF (Chen, Nassaji, & Liu,
2016; Lee, 2009). In this section, we will synthesize the findings of experimental WCF research, beginning with summaries of two
meta-analyses, which showed negligible effects (Truscott, 2007) and moderate to large effects (Kang & Han, 2015) for WCF,
respectively. Variables and patterns that were not examined in the meta-analyses will be explored in further detail. This
section concludes with a description of some trends and issues in WCF research methodology.
Truscott (2007) and Kang and Han (2015) conducted meta-analyses of WCF studies, showing quite different results.
Truscott (2007) aggregated the results of 12 empirical studies and concluded that feedback had a small negative impact
(d = 0.20) on learners' written accuracy. Truscott also calculated the within-group effects of WCF based on pretest-posttest
gains and found a “negligible” (p. 267) effect (d = 0.17). However, while meta-analysis requires an exhaustive literature search
(Li & Wang, 2018), Truscott selected his sample from narrative reviews published by himself and Ferris, resulting in an
analysis of 12 empirical studies which Kang and Han (2015) refer to as “notably narrow in scope” (p. 3). Including 22 empirical
studies employing new writing as a measurement of learning in their sample, Kang and Han's meta-analysis found a
moderate to large effect (g = 0.68) for WCF on grammatical accuracy on immediate posttests. Further inspection of the 11
studies employing delayed posttests revealed moderate to large effects (g = 0.68), which are nearly identical to immediate
posttest effects. Kang and Han (2015) also explored the effects of different types of WCF. They found that focused feedback had
a much greater positive effect (g = 0.69) than unfocused feedback (g = 0.33). They also found a greater effect resulting from
direct feedback (g = 0.60) as opposed to indirect feedback (g = 0.36). Although the differences between the compared
feedback types were nonsignificant, they were substantial. Kang and Han attributed the nonsignificant p-values to the small
sample sizes for the related comparisons.
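As background to the effect sizes reported above, the following minimal Python sketch shows how a standardized mean difference (Cohen's d) and its small-sample-corrected counterpart (Hedges' g) are conventionally computed from group summary statistics, together with an approximate 95% confidence interval. The function name and all input values are hypothetical illustrations and are not taken from Truscott (2007) or Kang and Han (2015); the formulas are the standard textbook ones and may differ in detail from the procedures used in either meta-analysis.

```python
import math


def hedges_g_with_ci(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, ci_low, ci_high) for a treatment vs. control comparison.

    Hypothetical helper for illustration only.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation of the two groups.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled          # Cohen's d
    j = 1 - 3 / (4 * df - 1)                   # small-sample correction factor
    g = j * d                                  # Hedges' g
    # Large-sample variance of d, then scaled for g.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    se_g = math.sqrt(j ** 2 * var_d)
    return g, g - 1.96 * se_g, g + 1.96 * se_g


if __name__ == "__main__":
    # Made-up accuracy scores for a feedback group and a control group
    # of ten learners each (assumed values, not data from any study).
    g, low, high = hedges_g_with_ci(7.5, 6.0, 2.0, 2.0, 10, 10)
    print(f"g = {g:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With groups this small, the interval around a sizeable g still spans zero, which is consistent with the observation that substantial differences can fail to reach significance when the number of contributing samples is small.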
Kang and Han's (2015) meta-analysis further revealed that the efficacy of WCF can be impacted by methodological features
of the primary studies. For instance, feedback was found to be more effective in foreign language settings than second
language settings. They speculated that feedback may be more salient in foreign language settings, thereby leading to more
noticing of feedback, greater engagement with feedback, and therefore more learning as a result of feedback. The
meta-analysis also showed that advanced learners benefited more from feedback than intermediate learners (g = 0.73 vs. 0.56),
with the caveats that there were only three effect sizes for intermediate learners, and that it is unclear whether the difference
is significant (a pairwise Q test was not conducted). Also unknown is how proficiency was operationalized and what types of
feedback were provided in the related studies. It is possible that learners at different levels benefit differently from different
types of feedback. For example, direct feedback might be more effective for low-proficiency learners and indirect for advanced
learners (e.g. Hendrickson, 1980; Lee, 1997). However, there has been no research on the interface between feedback type and
proficiency. Recall that to date the only empirical study exploring the relationship between learner proficiency and the effects
of WCF considered revisions or self-corrections rather than new writing as an indication of learning (Park et al., 2015).
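The pairwise Q test mentioned above can be sketched as follows. This is a minimal fixed-effect illustration of a between-groups Q statistic for two subgroup mean effect sizes, assuming each subgroup mean comes with a standard error. The two g values are those reported for advanced and intermediate learners; the standard errors are hypothetical placeholders, since the review does not report them, so the resulting p-value is illustrative only.

```python
import math


def q_between_two(g1, se1, g2, se2):
    """Return (Q_between, p) for two subgroup mean effect sizes.

    Hypothetical helper: inverse-variance weights, fixed-effect model, df = 1.
    """
    w1, w2 = 1 / se1 ** 2, 1 / se2 ** 2
    pooled = (w1 * g1 + w2 * g2) / (w1 + w2)
    # Weighted squared deviations of the subgroup means from the pooled mean.
    q = w1 * (g1 - pooled) ** 2 + w2 * (g2 - pooled) ** 2
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(q / 2))
    return q, p


if __name__ == "__main__":
    # g values from the text; standard errors are ASSUMED for illustration
    # (the larger one mimics a subgroup based on only a few effect sizes).
    q, p = q_between_two(0.73, 0.10, 0.56, 0.25)
    print(f"Q_between = {q:.3f}, p = {p:.3f}")
```

A subgroup built on only three effect sizes will typically have a large standard error, so a difference of this size could easily be nonsignificant; without the actual variances, however, no conclusion can be drawn.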
Another possible contributor to the efficacy of WCF is the linguistic target. Shintani, Ellis, and Suzuki (2014) found that the
WCF they investigated, direct correction and metalinguistic feedback (in the form of a handout), had a significant and positive
effect on learners' accurate use of the hypothetical conditional but not on their use of the indefinite article. The researchers'
explanation is that when multiple demands compete for learners' limited attentional resources, they tend to focus on more
salient and more meaning-distinctive structures such as the hypothetical conditional and ignore non-salient and redundant
structures such as the indefinite article. They backed up their interpretation by referring to a similar study by Shintani and
Ellis (2013) which investigated the effects of direct feedback and metalinguistic feedback on the indefinite article, the
only target structure of the feedback treatment. The study revealed a significant effect for metalinguistic feedback, but not for
direct correction, which was attributed to the learners' inability to extrapolate the rules governing the usage of the indefinite
article based on the direct corrections of their errors. However, a study by Ellis, Sheen, Murakami, and Takashima (2008)
showed that direct correction led to significant gains in the learning of the English definite and indefinite articles. A major
difference between Ellis et al. (2008) on one hand and Shintani et al. (2014) and Shintani and Ellis (2013) on the other lies in
the treatment tasks learners performed. In Ellis et al.'s study, the learners performed three treatment tasks whereas in the
two studies led by Shintani the learners only performed one task. Therefore, the three studies seem to show that salient
structures are more amenable to WCF and that non-salient structures require metalinguistic feedback or a large dose of direct
feedback. It is worth pointing out that there needs to be more research on the influence of the nature of the linguistic target on
WCF effects, and such research requires a justification for examining the target structures and a clear, systematic description
of the differences between the structures.
One distinction related to the nature of the linguistic target of a WCF treatment is between treatable and untreatable
errors. According to Ferris (1999), treatable errors are rule-governed while untreatable errors are item-based such as lexical
errors. Ferris believes that metalinguistic (indirect) feedback works better for treatable errors because there are rules to
follow, whereas direct correction is more effective for untreatable errors. However, Van Beuningen et al.’s (2012) study
showed the opposite: metalinguistic feedback in the form of error codes was more effective for untreatable (called “non-
grammatical” in their study) errors, and direct correction demonstrated superior effects for treatable (grammatical) errors. It
is possible that for untreatable errors, metalinguistic feedback, which withheld the correct forms, prompted learners to
conduct deeper processing of the errors; for treatable errors, metalinguistic feedback did not work because this group of
learners, who learned the second language (Dutch) in content-based classes and who acquired the second language primarily
in a naturalistic setting, did not have much grammar knowledge and were not used to learning from rules. Therefore, the
interface between error type and feedback is more complicated than assumed and needs more empirical investigation. Also,
as Ellis et al. (2008) pointed out, the distinction between treatable and untreatable is not simply one between grammatical
and nongrammatical errors. For example, within the domain of grammar, some errors are more treatable than others.
One promising area that has been underexplored is the role of learners’ individual differences (IDs) in affecting WCF
effectiveness. ID variables refer to characteristics, traits, and dispositions that make learners unique individuals, cause
variation among learners, and are hypothesized to have a direct or indirect impact on learning outcomes. In the field of SLA,
commonly investigated ID variables include language aptitude, anxiety, motivation, learning strategies, and learning styles
(Dörnyei, 2005). While there has been much research on the impact of ID variables on the effects of oral feedback (see the
section on oral feedback), studies on the associations between ID variables and written feedback are few and far between.
Among the various ID variables, only language aptitude has been investigated in the research (Benson & DeKeyser, 2018;
Shintani & Ellis, 2015; Stefanou & Revesz, 2015), and all related studies examined the role of language analytic ability (the
ability to learn the morphosyntactic aspects of a second language) in WCF. These studies all point to the importance of
analytic ability in WCF, but its influence depends on the nature of the feedback treatment and the associated methodological
features. In particular, it has stronger associations with the effects of direct correction than metalinguistic feedback (Benson &
DeKeyser, 2018; Stefanou & Revesz, 2015); it is more predictive of the gains of learners who were required to rewrite after
receiving feedback as opposed to those who were not; and it is more strongly correlated with CF effects measured by new
writing than revisions (Shintani & Ellis, 2015). Shintani and Ellis pointed out that all the factors affecting the role of analytic
ability have to do with the depth of processing, that is, the role of analytic ability is evident when the instruction requires the
learner to conduct deep cognitive processing of the provided feedback.
To conclude this section, a discussion of the methodology of experimental WCF research is in order. Liu and Brown (2015)
synthesized the methods of WCF research and identified a number of issues that need to be addressed in future research. They
found that the majority of WCF research focuses on short-term benefits of feedback measured in a single posttest with long-
term effects largely ignored. The authors found that only 30% of existing WCF studies utilized delayed posttests with few
studies spanning more than a single semester, with the exception of an experiment conducted by Bitchener and Knoch (2009)
that lasted ten months. Liu and Brown also highlighted several primary studies that failed to include control groups, claiming
it would be unethical to withhold feedback from students. In the studies that did utilize control groups, the control condition
was operationalized as no feedback, content-related feedback, and “traditional feedback” with no further elaboration. The
inclusion of a control group was further hindered by the fact that 95% of WCF studies took place in the classroom, which
increases ecological validity at the cost of experimental control.
Liu and Brown (2015) also identified the importance of describing text length in WCF research. They found that half of the
studies investigated failed to report the average length of writing, or mean word count of writing samples collected. They
recommend that future studies include this detail as word count could provide an index for the proficiency, fluency, and
writing abilities of the students in the sample. Bitchener and Ferris (2012) argued that the longer a writing sample is, the
greater opportunity the writer has to commit errors and therefore receive feedback in response to those errors. The quantity
of feedback received on longer compositions can then either promote or impede learning. That is, longer compositions may
receive more feedback, resulting in greater noticing through use of focused feedback, or cognitive overload through
unfocused feedback, when compared to shorter pieces of writing.

3.3. Teacher and student beliefs and attitudes

The research on teachers' and students' beliefs and attitudes toward WCF shows the following patterns:

1) Importance of WCF. Teachers and students have demonstrated an overall positive attitude toward WCF. For example,
Jamoom (2016) reported that university EFL teachers who participated in the study all endorsed the value of WCF. In a
study on students' attitudes, Chen et al. (2016) found that on a five-point scale, the average rating of students' responses to
the question on their attitudes toward WCF was 4.37.
2) Preferred feedback type. Learners seem to prefer direct feedback. For example, Lee (2005) reported that 75.7% of the
secondary school students involved in her study wanted to receive direct feedback or overt correction. Chen et al.'s (2016)
study showed that learners favored feedback that locates the error, explains the nature of the error, and provides the
correct form (with the average ratings being four out of five on a Likert scale), and that they were not in favor of feedback
that only indicated the presence of an error with no elaboration (with a rating of 2.9).
3) Error categories. With regard to which errors should receive feedback, learners' preferences have been found to be affected
by learning settings and learners' proficiency. Hedgecock and Lefkowitz (1994) found a disparity between foreign language
(Spanish, German, and French) and ESL learners: whereas the former (72%) preferred feedback on language-related errors,
the latter thought it necessary to receive feedback on both language and content (50% for each category). Lee (2008)
reported that in response to the question, “Which area do you want your teacher to emphasize more in the future?”,
high-proficiency learners wanted more feedback on content (51.4%) than language (34.3%) and organization (11.4%), while
low-proficiency learners were spread out in their preference, with the learners' responses varying between 20% and 30%
for content, organization, and language. Interestingly, 28% of low-proficiency learners did not want any additional
feedback while all high-proficiency learners expressed interest in receiving more feedback in at least one of the listed domains.
4) CF dose. Lee (2005) showed that 83% of the Hong Kong secondary school students preferred comprehensive
feedback, that is, having all errors corrected. Similarly, Amrhein and Nassaji (2010) reported that 94% of the ESL learners they
surveyed wanted their teachers to mark all errors. However, the study revealed that only 45% of the teacher participants
thought it necessary to provide feedback on all errors.
5) Source of CF. Lee (2004) reported that 60% of 206 university and secondary teachers said it is the teachers' job to provide
corrections, and over 90% said students should learn to locate and correct their own errors. Students seem to believe that
both teachers (45%) and students (55%) should perform error correction (Lee, 2004). When asked to choose between
teacher and non-teacher (peer feedback and self-correction) feedback, 93.8% of students chose teacher feedback (Zhang,
1995). However, asking students to make a choice may not be the most effective way to elicit their attitudes, and it is
possible that while students hold teacher feedback in high regard, they also value peer feedback.
6) Students' responses to CF. It was found that 90% of learners claimed that they would read teachers' feedback and correct the
errors (Chen et al., 2016; Leki, 1991). However, Han (2017) found that although students recognize the value of keeping an
error log, 33% took mental note of errors rather than making revisions on the page.

In addition to teachers' and students' attitudes, WCF researchers have examined the congruence and incongruence
between teachers' stated beliefs about CF and their CF practice. Similar to the research on oral CF, WCF research has shown a
dissonance between teachers’ beliefs and practices. Hyland (2003) interviewed and observed two ESL teachers who stated
that their students were “obsessed with grammar” (p. 222) and who therefore urged students to comment more on genre and
content when providing feedback to their peers; however, the feedback the teachers provided was mostly focused on formal
and grammatical language features (56–72.7%). Thus, while these teachers recognize that writing involves much more than
grammatical knowledge, their feedback practices remain focused on correcting grammatical errors in student writing. On a
larger scale, Lee (2009) analyzed the feedback that 26 secondary teachers provided on 174 student texts and compared the
results with a survey distributed to 206 teachers from the same population. She identified ten mismatches between what the
teachers reported and their practice. For instance, teachers acknowledged that effective writing entails more than
grammatical elements, yet 94.1% of the comments teachers provided were form oriented, indicating that under 6% of feedback was
related to meaning, content, organization, and genre. Teachers preferred selective feedback, yet they provided comprehensive
WCF, averaging one correction for every seven words of student writing. Finally, the teachers Lee surveyed stated that
students should write multiple drafts if they are to learn from WCF, yet they continually assigned single-draft assignments in
their writing classes. The teachers did acknowledge, however, that their practice reflects the conditions of public exams in
Hong Kong.

The Synopses of Selected Articles Published in System (*oral feedback; **written feedback)

*Bao, M., Egi, T., & Han, Y. (2011). Classroom study on noticing and recast features: Capturing learner noticing and uptake with stimulated recall. System,
39, 215–228.

Bao et al. (2011) explored the impact that various features of recasts have on student noticing and uptake in L2 classrooms. Two instructors were trained
to provide consistent recasts. Then, their ESL classes were observed and video-recorded. Recordings were coded based on six recast features and
learner uptake was coded as repaired or needs-repair.
The day after the lessons were recorded, the students watched clips of their teacher's feedback from the previous day's lesson and described what they
were thinking at the time. These stimulated recall sessions, and the interviews that followed, were audio-recorded and coded for noticing.
The results indicate that learners' reported noticing (37.3%) was much more likely to occur than was uptake (14.3%). Bao et al. also noted that recasts with
a rising intonation, similar to question intonation, were the only type of recast that was a significant predictor of noticing (p = 0.021). They elaborated
that all rising intonation recasts were direct (r = .282). However, directness of recast was not a significant predictor of noticing. Finally, of the 35
recasts observed, only five resulted in learner uptake during the class session. The authors concluded by commenting that uptake may not accurately
depict noticing in classroom discourse. Therefore, stimulated recall may be a more effective way to measure noticing feedback than uptake.

**Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing: Two approaches compared. System, 41, 257–268.

Erlam et al. explored the impact that WCF conferences, or oral feedback on written compositions, have on students' writing. Adult ESL students in New
Zealand participated in a text reconstruction task before conferencing with one of the researchers, receiving oral feedback on past tense verb forms
and article use in their writing. In the conferences, students received either graduated or explicit feedback. Graduated feedback was “the most implicit
feedback that enabled the learner to self-correct an error” (p. 260). Alternatively, explicit feedback drew the learner's attention to the location of the
error and the correct form was provided. Ninety percent of explicit feedback was accompanied by metalinguistic explanation. After the conferences,
participants completed another text reconstruction task similar to the first and received feedback a second time.
Erlam et al.'s results indicated that with graduated feedback, learners were better able to correct themselves over time (96.4%). Alternatively, students
who received explicit corrections did not demonstrate uptake consistently (52.4%). Finally, the authors note, concluding which form of feedback is
more effective is not possible from the data collected, nor was it the purpose of their study.

*Gooch, R., Saito, K., & Lyster, R. (2016). Effects of recasts and prompts on L2 pronunciation development: Teaching English /ɹ/ to Korean adult EFL
learners. System, 60, 117–127.

Gooch et al. (2016) tested the impact that feedback has on pronunciation of the English /ɹ/ among Korean EFL students. Twenty-two participants were
assigned to one of three treatment groups. All three groups received form-focused instruction (FFI). However, two groups received FFI paired with
either recasts (n = 7), or prompts (n = 6). The classes were video-recorded to regulate feedback and assess learner uptake. Participants completed the
same pretest and posttest, consisting of both a controlled and sustained production test.
The authors noted that although 93% of prompts resulted in uptake, the repair was frequently a hybrid form of /ɹ/ and /r/ pronunciation. Recasts led to
less uptake (79%) but did not include hybrid forms which Gooch et al. attributed to the learners' ability to mimic the instructor's recasted target-like
pronunciation. Interestingly, relying on their own resources rather than the teacher's model, the prompt group demonstrated greater target-like
performance on posttests in comparison to other treatment groups.
The authors elaborated that while the recast group demonstrated significant gains in controlled production (p = 0.046), the prompt group demonstrated
strengths in controlled (p = 0.017) and spontaneous (p = 0.036) production. Gooch et al. conclude that recasts might be useful for learners with lower
proficiency because recasts provide learners with the exemplars needed to respond to prompts effectively over time.

**Han, Y. (2017). Mediating and being mediated: Learner beliefs and learner engagement with written corrective feedback. System, 69, 133–142.

Han (2017) interviewed six Chinese university students over 16 weeks to explore the learners' beliefs related to WCF and how those beliefs mediate their
engagement with feedback. Participants were selected from two university classes, one taught by a native English-speaker and the other by a native
Chinese-speaker. Han identified that learner beliefs can be categorized as task-related, strategy-related, or related to interpersonal relationships. Her
findings indicate a strong correlation and reciprocal relationship between students' beliefs and their engagement with feedback. For instance, average
and low-achieving students were concerned that conferencing with the teacher would make them seem competitive to their classmates or
demonstrate to the teacher that they are incapable of fixing their own errors in response to teacher feedback. Alternatively, a higher achieving student
attended conferences with the teacher as a result of “unreserved trust in her teacher” (p. 138). Students who were more engaged with the feedback
their teacher provided were more inclined to revise their writing, regardless of whether the syllabus required revisions. Over time, learners expressed
increased awareness of the benefits that WCF can have on language development. Han expounded that teachers should engage with their students'
beliefs, fostering attitudes that enhance and enrich learner engagement with feedback. She also accentuated the importance of providing feedback
consistently and providing students with the opportunity to revise their writing in response to feedback.

**Hyland, K., & Anan, E. (2006). Teachers' perceptions of error: The effects of first language and experience. System, 34, 509–519.

Hyland and Anan (2006) compared the WCF provided by native and non-native English language teachers as well as native English non-teachers in London.
Participants received a 150-word sample text written by a pre-intermediate level Japanese EFL student. The 11 errors present in the passage were not
emphasized in any way. Participants rated the text on a 10-point scale, corrected any errors identified, ranked the three most serious errors, and
completed a beliefs questionnaire. Results indicated that the Japanese native-speaking teachers identified fewer errors but rated the writing more
harshly than either native English-speaking group. Japanese-native teachers were also more likely than English-natives to agree on which errors were
the most serious, focusing on errors of agreement and form. The English-native teachers and non-teachers were more inclined to focus on
intelligibility when making corrections. The native Japanese EFL teachers seemed to hunt for errors, pinpointing two and a half times more errors than
native-English speaking teachers and identifying error correction as a major component of their role as a teacher. None of the Japanese teachers
provided positive comments relating to strengths in the writing. Conversely, the English-native non-teachers were more positive and included
comments on the challenges the learner was overcoming by writing in a foreign language.

**Lee, I. (1997). ESL learners' performance in error correction in writing: Some implications for teaching. System, 25 (4), 465–477.

Lee (1997) examined the impact that WCF has on student revisions and corrections. Her sample included 149 electrical engineering students with low
English proficiency in Hong Kong. Each student corrected an authentic newspaper article into which 20 errors were added. Errors were either
unmarked or marked with direct or indirect prompting. Direct prompting entailed underlining of errors while indirect prompts revealed the line of
text in which an error was present.
The group that received direct prompts made significantly more corrections (p < 0.01) than the other groups and no significant difference was found
between the corrections made by the unmarked and indirect prompt groups. Lee explained that the students were able to correct the errors they
detected, and failure to detect an error led to the inability to correct it. She also found that the learners were significantly more prepared to correct
form-related errors than meaning-related errors across treatment groups (p = 0.000).
Lee concluded her article with pedagogical implications, promoting the use of indirect and metalinguistic feedback. She emphasized that students
should be encouraged to correct their own errors, while the teacher must train learners to interpret error codes if used. Lee finally encouraged
teachers to prioritize corrections, focusing on meaning or forms as determined by the needs of the learners.

*Nassaji, H. (2011). Immediate learner repair and its relationship with learning targeted forms in dyadic interaction. System, 39, 17–29.

Nassaji (2011) explored the relationship between interactional feedback, sources of repair, and longitudinal learning gains. Before interacting with a
teacher, 42 adult ESL students received a series of drawings, organized them in sequence, and described the sequence in writing. The written
descriptions were then collected and the participants described the sequence orally in dyadic interactions with a teacher who provided feedback. The
interactions were audio-recorded and transcribed for comparison with students' written production. Immediately after the oral interaction,
participants received their written descriptions and made any revisions they deemed necessary in response to the feedback they had received during
the interaction. Two weeks later, the students received the type-written and otherwise unamended written description they had created two weeks
prior and corrected any errors they encountered in the text.
Nassaji's findings revealed that students were able to successfully orally correct 42% of errors in response to feedback during the interaction. In writing,
they successfully revised over half (60%) of the errors that they had self-repaired during the interaction and nearly a third (31%) of the errors that
received feedback but were not repaired during the interaction. Furthermore, the effects of self-repair were consistent over time whereas corrections
made in response to recasts reduced over time. However, teacher-generated repair that resulted in incorporation of recasts in new language
production led to considerably more post-interaction corrections (58%) than repetitions (34%) of the teacher's recast.

*Roothooft, H. (2014). The relationship between adult EFL teachers' oral feedback practices and their beliefs. System, 46, 65–79.

Roothooft (2014) investigated the relationship between teachers' beliefs and practices related to oral feedback. She observed and recorded the classes of
ten EFL teachers in Spain, five from a private language academy and five from a university language institute. The teachers then responded to an open-
ended questionnaire describing (1) the observed classes, (2) their pedagogical practices in those classes, and (3) how they would provide feedback to
students in those classes by responding to various scenarios. Corrections provided in the classroom and survey responses were coded as explicit
corrections, recasts, clarification requests, metalinguistic feedback, elicitations, repetitions, or translations.
Roothooft identified a dissonance between the teachers' self-report data from the surveys and their actual classroom practices. For instance, classroom
observations demonstrated an overall preference for recasts (63.5%), with few prompts (9.4%) provided in the classes while survey results revealed
that teachers believe prompting feedback promotes learning more than recasts. However, teachers also indicated concern about students' feelings
and confidence being negatively impacted by feedback which may have led to greater use of recasts in the classroom. Contradictions also existed
within the questionnaire responses with teachers identifying they believe error correction is important but worry about hurting students' feelings or
negatively impacting students' confidence by providing corrective feedback.

**Sheen, Y., Wright, D., & Moldawa, A. (2009). Differential effects of focused and unfocused written correction on the accurate use of grammatical forms
by adult ESL learners. System, 37, 556–569.

Sheen, Wright, and Moldawa (2009) investigated the differential effects of focused and unfocused WCF in an ESL course for intermediate adults who are
interested in using English in their academic or professional careers. Eighty students wrote a narrative essay which served as the pretest and received
feedback on the pretest essay as well as a second narrative essay. Participants were assigned to one of four treatment groups. The students in the
focused group received feedback on definite and indefinite article use while the unfocused group received feedback relating to five grammatical
forms. Feedback was provided on two narratives written over two weeks. A “practice” group completed the same tasks without receiving feedback
whereas the control group only wrote the narratives used for testing and did not receive any feedback. Participants then wrote one narrative for each
of the two posttests. All four groups made significant immediate and longitudinal gains (p < 0.001). Regarding accurate article use, the focused
feedback group significantly outperformed the unfocused WCF group on the immediate posttest (p < 0.05) and had significantly higher scores than
the control group on the immediate (p < 0.05) and delayed posttests (p < 0.01) while the unfocused group did not. The practice group had significantly
higher scores than the control group on the delayed posttest (p < 0.05), indicating that task repetition can enhance written accuracy.

**Suzuki, W., Nassaji, H., & Sato, K., (2019). The effects of feedback explicitness and type of target structure on accuracy in revision and new pieces of
writing. System, 81, 135–145.

Suzuki et al. explored the impact that explicitness of WCF has on accuracy of revisions as well as new writing. Japanese university students of various
majors completed a text reconstruction task. The writing produced by the students received one of four feedback types, namely direct or indirect WCF
with or without metalinguistic explanations. WCF for all groups was focused on indefinite article and past perfect tense use which the authors labeled,
respectively, as simple and complex. One week after completing the text reconstruction, participants reviewed the feedback given based on their
treatment condition and revised their writing. Two weeks later, participants completed a similar text reconstruction task.
All treatment groups demonstrated increased accuracy in use of both target structures in their revisions (p < 0.01). Two weeks later, however, each group
showed a significant increase in accurate use of the past perfect tense (p < 0.01) but significantly reduced accuracy of indefinite article use in new
writing when compared to accuracy of revisions (p < 0.01). No significant effect for group was found in relation to article accuracy (p = 0.06).
However, effects for group were found for accurate past perfect use (p < 0.01). Both direct feedback groups demonstrated significantly greater
accuracy of past perfect tense than the indirect feedback with metalinguistic explanation group whether direct feedback was provided with (p < 0.05)
or without (p < 0.01) a metalinguistic explanation of the grammatical form.

*Yilmaz, Y. (2013). The relative effectiveness of mixed, explicit and implicit feedback in the acquisition of English articles. System, 41, 691–705.

Yilmaz studied the differential effects of various oral feedback types on the acquisition of English articles. University freshmen in Turkey were randomly
assigned to one of five experimental conditions receiving (1) exclusively explicit feedback, (2) exclusively implicit feedback, (3) combined explicit and
implicit feedback, (4) reduced-explicit feedback, or (5) no feedback (control). The experiment employed a pre-immediate-delayed posttest design
with participants receiving treatment twice between the pretest and immediate posttest. Treatment and tests used variations of the same three oral
production tasks.
All feedback groups significantly outperformed the control group on both posttests (d ranging from 1.22 to 14.41), indicating that feedback aids in
acquisition of English articles. Furthermore, students in the explicit feedback group had significantly greater gains than the implicit group for both
definite (p = 0.00) and indefinite (p = 0.01) article use on the immediate posttest, but not on the delayed posttest two months later (p = 0.52 and
p = 0.73 respectively). A significant advantage was also found for mixed feedback in comparison to exclusively implicit feedback on the immediate
posttest (p < 0.05), yet no significant difference was uncovered between the mixed feedback and explicit feedback groups. Yilmaz attributed learning
gains demonstrated by the mixed feedback group to combined effects of explicit and implicit feedback, as opposed to contributions of explicit
feedback alone. In sum, explicit and mixed feedback led to greater immediate and delayed posttest performance than implicit or reduced-explicit
feedback.

References

Articles published in System and included in the annotated bibliography of this manuscript are marked with a single asterisk (*) if related to oral feedback and
double asterisks (**) if related to written feedback.
Ahn, S. (2012). The relationships between grammatical sensitivity, noticing of recasts and learning of Korean object relative clauses. Applied Language
Learning, 22(1 & 2), 47–68.
Al-Surmi, M. (2012). Learners' noticing of recasts of morphosyntactic errors: Recast types and delayed recognition. System, 40(2), 226–236.
Amrhein, R., & Nassaji, H. (2010). Written corrective feedback: What do students and teachers prefer and why? Canadian Journal of Applied Linguistics, 13(2),
95–127.
Ashwell, T. (2000). Patterns of teacher response to student writing in a multiple-draft composition classroom: Is content feedback followed by form
feedback the best method? Journal of Second Language Writing, 9(3), 227–257.
* Bao, M., Egi, T., & Han, Y. (2011). Classroom study on noticing and recast features: Capturing learner noticing with uptake and stimulated recall. System, 39,
215–228.
Basturkmen, H., Loewen, S., & Ellis, R. (2004). Teachers' stated beliefs about incidental focus on form and their classroom practices. Applied Linguistics, 25(2),
243–272.
Benson, S., & DeKeyser, R. (2018). Effects of written corrective feedback and language aptitude on verb tense accuracy. Language Teaching Research, 1–28
(First view).
Bitchener, J., & Ferris, D. R. (2012). Written corrective feedback in second language acquisition and writing. New York, NY: Routledge.
Bitchener, J., & Knoch, U. (2009). The contribution of written corrective feedback to language development: A ten month investigation. Applied Linguistics,
31(2), 193–214.
Brown, D. (2016). The type and linguistic foci of oral corrective feedback in the L2 classroom: A meta-analysis. Language Teaching Research, 20(4), 436–458.
Busch, D. (2010). Pre-service teacher beliefs about language learning: The second language acquisition course as an agent for change. Language Teaching
Research, 14(3), 318–337.
Cánovas Guirao, J., Roca de Larios, J., & Coyle, Y. (2015). The use of models as a written feedback technique with young EFL learners. System, 52, 63–77.
Chen, S., Nassaji, H., & Liu, Q. (2016). EFL learners' perceptions and preferences of written corrective feedback: A case study of university students from
mainland China. Asian-Pacific Journal of Second and Foreign Language Education, 1(5), 1–17.
Choi, S., & Li, S. (2012). Corrective feedback and learner uptake in a child ESOL classroom. RELC Journal, 43(3), 331–351.
Dörnyei, Z. (2005). The Psychology of the Language Learner: Individual Differences in Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.
Egi, T. (2010). Uptake, modified output, and learner perceptions of recasts: Learner responses as language awareness. The Modern Language Journal, 94(1),
1–21.
Ellis, R. (2009). A typology of written corrective feedback types. ELT Journal, 63, 97–107.
Ellis, R. (2010). Epilogue: A framework for investigating oral and written corrective feedback. Studies in Second Language Acquisition, 32(2), 335–349.
Ellis, R., Sheen, Y., Murakami, M., & Takashima, H. (2008). The effects of focused and unfocused written corrective feedback in an English as a foreign
language context. System, 36(3), 353–371.
Elwood, J. A., & Bode, J. (2014). Student preferences vis-à-vis teacher feedback in university EFL writing classes in Japan. System, 42, 333–343.
** Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing: Two approaches compared. System, 41, 257–268.
Ferris, D. (1999). The case for grammar correction in L2 writing classes: A response to Truscott (1996). Journal of Second Language Writing, 8, 1–11.
Ferris, D. R. (2004). The “grammar correction” debate in L2 writing: Where we are, and where do we go from here? (And what do we do in the
meantime…?). Journal of Second Language Writing, 13(1), 49–62.
Ferris, D. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In K. Hyland, & F.
Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 81–104). Cambridge, UK: Cambridge University Press.
* Gooch, R., Saito, K., & Lyster, R. (2016). Effects of recasts and prompts on L2 pronunciation development: Teaching English /ɹ/ to Korean adult EFL learners.
System, 60, 117–127.
** Han, Y. (2017). Mediating and being mediated: Learner beliefs and learner engagement with written corrective feedback. System, 69, 133–142.
Hendrickson, J. M. (1978). Error correction in foreign language teaching: Recent theory, research, and practice. The Modern Language Journal, 62(8),
387–398.
Hendrickson, J. M. (1980). Treatment of error in written work. The Modern Language Journal, 64(2), 216–221.
Hyland, F. (2003). Focusing on form: Student engagement with teacher feedback. System, 31, 217–230.
** Hyland, K., & Anan, E. (2006). Teachers' perceptions of error: The effects of first language and experience. System, 34, 509–519.
Jamoom, O. (2016). Teachers' beliefs and practices of feedback and preferences of students for feedback in university level EFL writing classrooms. Unpublished
Ph.D. dissertation. University of Southampton.
Junqueira, L., & Kim, Y. J. (2013). Exploring the relationship between training, beliefs, and teachers' corrective feedback practices: A case study of a novice
and an experienced ESL teacher. The Canadian Modern Language Review, 69(2), 181–206.
Kamiya, N., & Loewen, S. (2014). The effectiveness of intensive and extensive recasts on L2 acquisition for implicit and explicit knowledge. Innovation in
Language Learning and Teaching, 8(3), 205–213.
Kang, E., & Han, Z. (2015). The efficacy of written corrective feedback in improving L2 written accuracy: A meta-analysis. The Modern Language Journal,
99(1), 1–18.
Kartchava, E. (2006). Corrective feedback: Novice ESL teachers' beliefs and practices. Doctoral dissertation. Concordia University.
Kim, Y., Payant, C., & Pearson, P. (2015). The intersection of task-based interaction, task complexity, and working memory. Studies in Second Language
Acquisition, 37(3), 549–581.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford, England: Oxford University Press.
Lai, C., Fei, F., & Roots, R. (2008). The contingency of recasts and noticing. CALICO Journal, 26(1), 70–90.
Larsen-Freeman, D. (2013). Transfer of learning transformed. Language Learning, 63(1), 107–129.
** Lee, S. (1997). ESL learners' performance in error correction in writing: Some implications for teaching. System, 25(4), 465–477.
Lee, I. (2004). Error correction in L2 secondary writing classrooms: The case of Hong Kong. Journal of Second Language Writing, 13, 285–312.
Lee, I. (2008). Student reactions to teacher feedback in two Hong Kong secondary classrooms. Journal of Second Language Writing, 17, 144–164.
Lee, I. (2009). Ten mismatches between teachers' beliefs and written feedback practice. ELT Journal, 63(1), 13–22.
Lee, I. (2018). Working hard or working smart: Comprehensive versus focused written corrective feedback in L2 academic contexts. In J. Bitchener, N. Storch, &
R. Wette (Eds.), Teaching writing for academic purposes to multilingual students. NY: Routledge.
Lee, I. (2005). Error correction in the L2 writing classroom: What do students think? TESL Canada Journal, 22(2), 1–16.
Leki, I. (1991). The preference of ESL students for error correction in college-level writing classes. Foreign Language Annals, 24, 203–218.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60(2), 309–365.
Li, S. (2017). Student and teacher beliefs and attitudes about oral corrective feedback. In E. Kartchava, & H. Nassaji (Eds.), Corrective feedback in second
language teaching and learning: Research, theory, applications, implications (pp. 143–157). New York, NY: Routledge.
Li, S. (2018). Data collection in the research on the effectiveness of corrective feedback: A synthetic and critical review. In A. Gudmestad, & A. Edmonds
(Eds.), Critical reflections on data in second language acquisition (pp. 33–62). Philadelphia, PA: John Benjamins Publishing Company.
Li, S., & Roshan, S. (2019). The associations between working memory and the effects of four different types of written corrective feedback. Journal of Second
Language Writing, 45, 1–15.
Li, S., & Wang, H. (2018). Traditional literature review and research synthesis. In A. Phakiti, P. De Costa, L. Plonsky, & S. Starfield (Eds.), The Palgrave handbook
of applied linguistics research methodology (pp. 123–144). London: Palgrave Macmillan.
Liu, Q., & Brown, D. (2015). Methodological synthesis of research on the effectiveness of corrective feedback in L2 writing. Journal of Second Language
Writing, 30, 66–81.
Loewen, S., Li, S., Fei, F., Thompson, A., Nakatsukasa, K., Ahn, S., et al. (2009). Second language learners' beliefs about grammar instruction and error
correction. The Modern Language Journal, 93(1), 91–104.
Loewen, S., & Philp, J. (2006). Recasts in the adult English L2 classroom: Characteristics, explicitness, and effectiveness. The Modern Language Journal, 90(4),
536–556.
Long, M. H. (2015). Second language acquisition and task-based language teaching. Malden: Wiley Blackwell.
Lyster, R. (2015). Using form-focused tasks to integrate language across the immersion curriculum. System, 54, 4–13.
Lyster, R. (2001). Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms. Language
Learning, 51(1), 265–301.
Lyster, R., & Mori, H. (2006). Interactional feedback and instructional counterbalance. Studies in Second Language Acquisition, 28(2), 269–300.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake: Negotiation of form in communicative classrooms. Studies in Second Language
Acquisition, 19(1), 37–66.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in Second Language Acquisition, 32(2), 265–302.
Mackey, A. (2006). Feedback, noticing and instructed second language learning. Applied Linguistics, 27(3), 405–430.
Mackey, A., Gass, S., & McDonough, K. (2000). How do learners perceive interactional feedback? Studies in Second Language Acquisition, 22(4), 471–497.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in second
language acquisition (pp. 407–453). Oxford, England: Oxford University Press.
Manchón, R. M. (2011). Learning to write and writing to learn in an additional language. Amsterdam: John Benjamins Publishing Company.
Martínez Agudo, J. (2015). How do Spanish EFL learners perceive grammar instruction and corrective feedback? Southern African Linguistics and Applied
Language Studies, 33(4), 411–425.
McDonough, K. (2005). Identifying the impact of negative feedback and learners' responses on ESL question development. Studies in Second Language
Acquisition, 27(1), 79–103.
* Nassaji, H. (2011). Immediate learner repair and its relationship with learning targeted forms in dyadic interaction. System, 39(1), 17–29.
Park, E. S., Song, S., & Shin, Y. K. (2015). To what extent do learners benefit from indirect written corrective feedback? A study targeting learners of different
proficiency and heritage language status. Language Teaching Research, 20(6), 1–22.
Philp, J. (2003). Constraints on “noticing the gap”: Nonnative speakers' noticing of recasts in NS-NNS interaction. Studies in Second Language Acquisition,
25(1), 99–126.
Plonsky, L., & Brown, D. (2015). Domain definition and search techniques in meta-analyses of L2 research (Or why 18 meta-analyses of feedback have
different results). Second Language Research, 31(2), 267–278.
Rahimi, M., & Zhang, L. J. (2015). Exploring non-native English-speaking teachers' cognitions about corrective feedback in teaching English oral
communication. System, 55, 111–122.
Rassaei, E. (2013). Corrective feedback, learners' perceptions, and second language development. System, 41(1), 472–483.
Rassaei, E. (2015). Oral corrective feedback, foreign language anxiety and L2 development. System, 49, 98–109.
* Roothooft, H. (2014). The relationship between adult EFL teachers' oral feedback practices and their beliefs. System, 46, 65–79.
Sato, M. (2013). Beliefs about peer interaction and peer corrective feedback: Efficacy of classroom intervention. The Modern Language Journal, 97(3),
611–633.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
Schulz, R. A. (1996). Focus on form in the foreign language classroom: Students' and teachers' views on error correction and the role of grammar. Foreign
Language Annals, 29, 343–364.
Sheen, Y. (2004). Corrective feedback and learner uptake in communicative classrooms across instructional settings. Language Teaching Research, 8(3),
263–300.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners' acquisition of articles. TESOL Quarterly, 41(2),
255–283.
Sheen, Y. (2008). Recasts, language anxiety, modified output, and L2 learning. Language Learning, 58(4), 835–874.
** Sheen, Y., Wright, D., & Moldawa, A. (2009). Differential effects of focused and unfocused written correction on the accurate use of grammatical forms by
adult ESL learners. System, 37, 556–569.
Shintani, N., & Ellis, R. (2013). The comparative effect of metalinguistic explanation and direct written corrective feedback on learners' explicit and implicit
knowledge of the English indefinite article. Journal of Second Language Writing, 23, 286–306.
Shintani, N., & Ellis, R. (2015). Does language analytical ability mediate the effect of written feedback on grammatical accuracy in second language writing?
System, 49, 110–119.
Shintani, N., Ellis, R., & Suzuki, W. (2014). Effects of written feedback and revision on learners' accuracy in using two English grammatical structures.
Language Learning, 64(1), 103–131.
Stefanou, C., & Révész, A. (2015). Learner differences, and the acquisition of second language article use for generic and specific plural reference. The Modern
Language Journal, 99, 263–282.
** Suzuki, W., Nassaji, H., & Sato, K. (2019). The effects of feedback explicitness and type of target structure on accuracy in revision and new pieces of
writing. System, 81, 135–145.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning
(pp. 495–508). New York, NY: Routledge.
Takimoto, M. (2006). The effects of explicit feedback on the development of pragmatic proficiency. Language Teaching Research, 10(4), 393–417.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2), 327–369.
Truscott, J. (2007). The effect of error correction on learners' ability to write accurately. Journal of Second Language Writing, 16, 255–272.
Van Beuningen, C. G., De Jong, N. H., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive error correction in second language writing.
Language Learning, 62(1), 1–41.
Varnosfadrani, A. D., & Basturkmen, H. (2009). The effectiveness of implicit and explicit error correction on learners' performance. System, 37, 82–98.
Vásquez, C., & Harvey, J. (2010). Raising teachers' awareness about corrective feedback through research replication. Language Teaching Research, 14(4),
421–443.
Vuono, A., & Li, S. (in press). Age and corrective feedback. In H. Nassaji, & E. Kartchava (Eds.), The Cambridge handbook of corrective feedback in language learning
and teaching. Cambridge University Press.
Yilmaz, Y., & Granena, G. (in press). Corrective feedback and the role of implicit sequence learning ability in L2 online processing. Language Learning.
Yu, S., & Lee, I. (2014). An analysis of Chinese EFL students' use of first and second language in peer feedback of L2 writing. System, 47, 28–38.
Zhang, S. (1995). Re-examining the affective advantages of peer feedback in the ESL writing classroom. Journal of Second Language Writing, 4, 209–222.
Zhao, S. Y., & Bitchener, J. (2007). Incidental focus on form in teacher-learner and learner-learner interactions. System, 35, 431–447.