
THE ROUTLEDGE HANDBOOK OF SECOND LANGUAGE ACQUISITION AND WRITING

This unique state-of-the-art volume offers a comprehensive, systematic discussion of second lan-
guage (L2) writing and L2 learning. Led by experts Rosa Manchón and Charlene Polio, top inter-
national scholars synthesize and contextualize the salient theoretical approaches, methodological
issues, empirical findings, and emerging themes in the connection between L2 writing and L2
learning, and set the future research agenda to move the field forward. This will be an indispensable
resource for scholars and students of second language acquisition (SLA), applied linguistics, edu-
cation, and composition studies.

Rosa M. Manchón is Professor of Applied Linguistics in the Department of English at the University of Murcia.

Charlene Polio is Professor in the Department of Linguistics, Languages, and Cultures at Michigan
State University, where she teaches in the TESOL and Second Language Studies programs.
ROUTLEDGE HANDBOOKS IN SECOND
LANGUAGE ACQUISITION
Susan M. Gass and Alison Mackey, Series Editors
Kimberly L. Geeslin, Associate Editor

The Routledge Handbooks in Second Language Acquisition are a comprehensive, must-have survey of this core sub-discipline of applied linguistics. With a truly global reach and featuring diverse contributing voices, each handbook provides an overview of both the fundamentals and new directions for each topic.

The Routledge Handbook of Second Language Acquisition and Pragmatics


Edited by Naoko Taguchi

The Routledge Handbook of Second Language Acquisition and Corpora


Edited by Nicole Tracy-​Ventura and Magali Paquot

The Routledge Handbook of Second Language Acquisition and Language Testing


Edited by Paula Winke and Tineke Brunfaut

The Routledge Handbook of Second Language Acquisition and Technology


Edited by Nicole Ziegler and Marta González-​Lloret

The Routledge Handbook of Second Language Acquisition and Writing


Edited by Rosa M. Manchón and Charlene Polio

For more information about this series, please visit:
https://www.routledge.com/Second-Language-Acquisition-Research-Series/book-series/RHSLA
THE ROUTLEDGE HANDBOOK
OF SECOND LANGUAGE
ACQUISITION AND WRITING

Edited by Rosa M. Manchón and Charlene Polio


Cover image: Getty
First published 2022
by Routledge
605 Third Avenue, New York, NY 10158
and by Routledge
4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2022 Taylor & Francis
The right of Rosa M. Manchón & Charlene Polio to be identified as the authors
of the editorial material, and of the authors for their individual chapters, has been
asserted in accordance with sections 77 and 78 of the Copyright,
Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised
in any form or by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying and recording, or in any information
storage or retrieval system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks,
and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-​in-​Publication Data
Names: Manchón, Rosa, editor. | Polio, Charlene, editor.
Title: The Routledge handbook of second language acquisition
and writing / edited by Rosa M. Manchón & Charlene Polio.
Description: New York, NY : Routledge, 2022. |
Includes bibliographical references and index. |
Identifiers: LCCN 2021030136 (print) | LCCN 2021030137 (ebook) |
Subjects: LCSH: Second language acquisition. | Composition (Language arts) |
Language and languages–Study and teaching.
Classification: LCC P118.2 .R6857 2022 (print) |
LCC P118.2 (ebook) | DDC 418.0071–dc23
LC record available at https://lccn.loc.gov/2021030136
LC ebook record available at https://lccn.loc.gov/2021030137
ISBN: 978-0-367-18985-3 (hbk)
ISBN: 978-1-032-15478-7 (pbk)
ISBN: 978-0-429-19969-1 (ebk)
DOI: 10.4324/9780429199691
Typeset in Times New Roman
by Newgen Publishing UK
CONTENTS

Contributors ix

1 L2 Writing and Language Learning 1


Rosa M. Manchón and Charlene Polio

PART I
Theoretical Perspectives 7

2 Theoretical Perspectives on L2 Writing, Written Corrective Feedback, and Language Learning in Individual Writing Conditions 9
Ronald P. Leow and Bo-Ram Suh

3 Theoretical Perspectives on L2 Writing and Language Learning in Collaborative Writing and the Collaborative Processing of Written Corrective Feedback 22
Neomy Storch

PART II
Core Issues 35

SECTION 1
Tasks and Writing 37

4 Task Effects Across Modalities 39


Olena Vasylets and Roger Gilabert

5 Task Complexity Studies 52


Mark D. Johnson


SECTION 2
Language Processing 65

6 L2 Writing Processes of Language Learners in Individual and Collaborative Writing Conditions 67
Marije Michel, Laura Stiefenhöfer, Marjolijn Verspoor, and Rosa M. Manchón

7 Learners' Engagement with Written Corrective Feedback in Individual and Collaborative L2 Writing Conditions 81
Julio Roca de Larios and Yvette Coyle

SECTION 3
Language Transfer and Writing 95

8 Transfer, Writing, and SLA: L2 Writing as a Multilingual Event 97


Rob Schoonen and Sanne van Vuuren

9 Multicompetence and L2 Writing 109


Guillaume Gentil

SECTION 4
The Role of Individual Differences 123

10 Age-Related Differences in L2 Written Performance and Written Corrective Feedback Processing and Use 125
Yvette Coyle and Julio Roca de Larios

11 The Role of Cognitive Individual Differences in L2 Writing Performance and Written Corrective Feedback Processing and Use 139
Mohammad Javad Ahmadian and Olena Vasylets

12 The Role of Motivational and Affective Factors in L2 Writing Performance and Written Corrective Feedback Processing and Use 152
Mostafa Papi

SECTION 5
Writing Research, Corrective Feedback, and Language Development 167

13 L2 Writing and Grammar Development 169


Charlene Polio

14 L2 Writing and Vocabulary Development 183


Kristopher Kyle


15 L2 Writing and Formulaic Language: Formulaic Chunks and Lexical Bundles 199
Hyung-​Jo Yoon

16 Written Corrective Feedback: Short-Term and Long-Term Effects on Language Learning 213
Eun Young Kang and ZhaoHong Han

17 The Role of Language in Assessing L2 Writing 226


Lia Plakans and Renka Ohta

SECTION 6
Writing Research in Different Contexts 239

18 Learning and Teaching L2 Writing in Content and Language Integrated Learning (CLIL) Contexts 241
Carmen Pérez-Vidal and David Lasagabaster

19 L2 Writing in Study-​Abroad Contexts 254


Cristi Vallejos and Cristina Sanz

20 L2 Writing and Language Learning in Academic Settings 268


Nigel A. Caplan

21 L2 Writing and Language Learning in Electronic Environments 282


Scott Aubrey and Natsuko Shintani

PART III
Expanding Research Agendas 297

22 Directions for Future Research Agendas on L2 Writing and Feedback as Language Learning from an ISLA Perspective 299
Ronald P. Leow and Rosa M. Manchón

23 Directions for Future Research on Attention and L2 Writing 312


Osamu Hanaoka and Shinichi Izumi

24 Directions for Future Research on SLA, L2 Writing, and Multimodality 325


Jungmin Lim and Matt Kessler

25 Directions for Future Methodologies to Capture the Processing Dimension of L2 Writing and Written Corrective Feedback 339
Andrea Révész, Xiaojun Lu, and Ana Pellicer-Sánchez


26 Directions for Future Use of Existing Corpora in the Study of L2 Writing 356
Shelley Staples, Adriana Picoral, Aleksey Novikov, and Bruna Sommer-Farias

27 Directions for Future Automated Analyses of L2 Written Texts 370


Xiaofei Lu

Coda 383

28 Implications of SLA-​Oriented Research for the Teaching of L2 Writing 385


Dana Ferris

Index 394

CONTRIBUTORS

Mohammad Javad Ahmadian is currently Head of Postgraduate Taught at the School of Education,
University of Leeds.

Scott Aubrey is an assistant professor in the Faculty of Education, Department of Curriculum and
Instruction at the Chinese University of Hong Kong, where he is the deputy coordinator of the BA
(English Studies) and BEd (English Language Education) program.

Nigel A. Caplan is an associate professor at the University of Delaware English Language Institute
in the United States, where he teaches ESL and MA TESL courses.

Yvette Coyle is an associate professor in the Faculty of Education at the University of Murcia in
Spain where she has worked since 1992 training future primary school teachers of English.

Julio Roca de Larios is an associate professor at the Department of Language and Literature in the
Faculty of Education of the University of Murcia (Spain).

Dana Ferris (PhD Applied Linguistics, University of Southern California) is Professor in the
University Writing Program at the University of California, Davis (USA).

Guillaume Gentil is a professor in the School of Linguistics and Language Studies at Carleton
University, Ottawa, Canada and former co-​editor of the Journal of Second Language Writing.

Roger Gilabert is an associate professor and researcher within the Language Acquisition Research
Group (GRAL) at the University of Barcelona.

ZhaoHong Han is Professor of Applied Linguistics at Teachers College, Columbia University.

Osamu Hanaoka is a professor in the School of International Relations at Tokyo International University, where he teaches second language acquisition, English teaching methodology, and English skill courses.


Shinichi Izumi is a professor of English Studies at Sophia University, Tokyo, Japan, where he
teaches in the BA program in English Language Studies and the MA and the PhD programs in
Applied Linguistics and TESOL.

Mark D. Johnson is Associate Professor of TESOL and Applied Linguistics at East Carolina
University, where he teaches classes on linguistics and teaching English to speakers of other
languages.

Eun Young Kang is an assistant professor in the Division of Liberal Arts at Kongju National
University.

Matt Kessler is Assistant Professor of Applied Linguistics in the University of South Florida’s
Department of World Languages, where he serves as a core faculty member for both the MA TESL
program and PhD program in Linguistics and Applied Language Studies.

Kristopher Kyle is an assistant professor in the Linguistics Department and the director of the
Learner Corpus Research and Applied Data Science Lab at the University of Oregon.

David Lasagabaster is Professor of Applied Linguistics at the University of the Basque Country
UPV/​EHU, Spain.

Ronald P. Leow is Professor of Applied Linguistics, Director of Spanish Language Instruction and Spanish Language Institute (Summer), and Director of Graduate Studies – Linguistics in the Department of Spanish and Portuguese at Georgetown University.

Jungmin Lim is an assistant professor in the College of Liberal Arts at Dankook University,
South Korea.

Xiaofei Lu is Professor of Applied Linguistics and Asian Studies at Pennsylvania State University.

Xiaojun Lu is a lecturer in Applied Linguistics at Southeast University (China).

Rosa M. Manchón is Professor of Applied Linguistics in the Department of English at the University of Murcia.

Marije Michel studied Dutch and German Language and Culture at Utrecht University and holds
a PhD in Applied Linguistics from the University of Amsterdam.

Aleksey Novikov is a recent PhD graduate in Second Language Acquisition and Teaching (SLAT)
at the University of Arizona.

Renka Ohta earned her PhD in Foreign Language and ESL Education at the University of Iowa
in 2018.

Mostafa Papi is an assistant professor of Foreign and Second Language Education at Florida State
University, where he has been teaching graduate and undergraduate courses in second language
acquisition, research methods, and teaching methodology over the last five years.

Ana Pellicer-​Sánchez is Associate Professor of Applied Linguistics and TESOL at the UCL
Institute of Education (UK) and a member of the UCL Centre for Applied Linguistics.


Carmen Pérez-​Vidal holds a Chair in English Linguistics and Language Acquisition in the
Department of Translation and Linguistic Sciences at Universitat Pompeu Fabra (UPF) in Barcelona
(Spain).

Adriana Picoral is an assistant professor of Data Science in the University of Arizona’s School of
Information, and an affiliated faculty member in the interdisciplinary graduate program of Second
Language Acquisition and Teaching also at the University of Arizona.

Lia Plakans is a professor in Multilingual Education at the University of Iowa, and departmental
executive officer for Teaching and Learning in the College of Education.

Charlene Polio is a professor in the Department of Linguistics, Languages, and Cultures at Michigan
State University, where she teaches in the TESOL and Second Language Studies programs.

Andrea Révész is a professor of Second Language Acquisition (SLA) at the UCL Institute of
Education, University College London.

Cristina Sanz is Full Professor of Spanish and Linguistics, Language Program Director, and
Director of the Georgetown@Barcelona Summer Program at Georgetown University in the United
States.

Rob Schoonen holds a PhD from the University of Amsterdam and is currently Chair and Professor
of Applied Linguistics at the Department of Language and Communication of Radboud University
Nijmegen.

Natsuko Shintani is a professor in the Faculty of Foreign Language Studies at Kansai University,
Japan.

Bruna Sommer-Farias is an assistant professor on the MA in Foreign Language Teaching program and an affiliated faculty member in the Center for Latin American and Caribbean Studies at Michigan State University.

Shelley Staples is Associate Professor of English/Second Language Acquisition and Teaching at the University of Arizona.

Laura Stiefenhöfer studied English and Spanish at the University of Mannheim, Germany.

Neomy Storch is an associate professor in ESL and Applied Linguistics in the School of Languages
and Linguistics at the University of Melbourne.

Bo-​Ram Suh is Associate Teaching Professor in the Faculty of Liberal Education at Seoul National
University in South Korea.

Cristi Vallejos is Assistant Professor of Spanish at Cedarville University where she also coordinates
the Spanish Education program and teaches courses to prepare future language educators.

Sanne van Vuuren defended her PhD thesis, a corpus study on the pragmatic development of advanced Dutch learners of English in their written production, in March 2017 and has since worked as
an assistant professor at the Department of Modern Languages and Cultures at Radboud University
Nijmegen.


Olena Vasylets is an associate professor at the University of Barcelona, Spain.

Marjolijn Verspoor is Professor Emeritus of English Language at the University of Groningen, Netherlands, and Professor of Applied Linguistics at the University of Pannonia, Hungary.

Hyung-Jo Yoon is an assistant professor in the Department of Linguistics/TESL at California State University, Northridge.

1
L2 WRITING AND LANGUAGE
LEARNING
Rosa M. Manchón and Charlene Polio
University of Murcia and Michigan State University

Introduction
The Routledge Handbook in Second Language Acquisition: Second Language Acquisition and
Writing is intended as a comprehensive compendium of theoretical perspectives and historical and
recent empirical developments on how and why writing in an additional language (L2) can be a site
for language learning. Despite the fact that this area of scholarly interest is a newcomer to second
language acquisition (SLA) studies, several reasons can be adduced to justify the publication of an
entire handbook on the landscape of this research domain.
First, the scholarly interest in the connection between L2 writing and L2 learning has developed
into a vibrant research area with rich, although arguably disparate, scholarly output in the form
of both theorizing and empirical research. Thus, SLA-​based theoretical accounts of the learning
affordances related to L2 text production and of the use of written corrective feedback (WCF)
have been copious (e.g. Bitchener, 2012, 2016, 2019; Manchón, 2020a; Manchón & Williams,
2016; Polio, 2012a; Williams, 2012). Similarly, a fast-​growing body of SLA-​oriented empirical
research has directly or indirectly tested and, in some cases, provided evidence for, the theoretical
predictions of the learning affordances of L2 writing (see, for instance, contributions to Byrnes &
Manchón, 2014a, and to Manchón, 2011, 2020c) and of WCF use (as reviewed in Bitchener, 2019;
Bitchener & Storch, 2016). This notable expansion of theoretical and empirical work hence invites
the field to scrutinize past developments and advance future directions, as attempted in the present
Handbook.
A second, and in our view more powerful, reason for the publication of a comprehensive com-
pendium on SLA-​L2 writing interfaces, and for doing so in the Routledge series Handbooks of SLA,
derives from the contribution that SLA-​oriented L2 writing theory and research can make to current
SLA knowledge, two research domains that until recently have not spoken to each other. Thus, it is
perhaps not particularly controversial to argue that early theoretical accounts of SLA are premised
on the centrality of (oral) input processing (rather than output production, a notable exception being
the Output Hypothesis, cf. Swain, 1985, 2005) and oral (rather than written) output (Byrnes &
Manchón, 2014b for a fuller discussion). This explains, in part, why up until recently writing has
not been a high priority in SLA research agendas, either in general, or in specific research domains.
For instance, work on the role of feedback by mainstream SLA researchers had traditionally focused


almost exclusively on oral feedback (see discussion in Bitchener, 2019 and Ferris, Chapter 28) and
the same applies to theorizing and empirical work on tasks (as discussed in Byrnes & Manchón,
2014b). Although early research on WCF exists (e.g., Semke, 1984; Kepner, 1991), such studies approached WCF only as a pedagogical issue. Truscott's (1996) paper was among the first to relate the-
ories of SLA to written feedback, albeit with a focus only on selected theories.
In her Epilogue to a special issue on SLA-​L2 writing interfaces published in the Journal
of Second Language Writing, Ortega (2012) explained the neglect of writing in SLA agendas
in terms of different disciplinary goals as well as divergent ontological and epistemological
principles in the two disciplinary domains. She noted that literacy is seen by the SLA commu-
nity as “a culture-​dependent, secondary manifestation of human language, a derivation of the
primary oral capacity for language that all healthy individuals of our species share, allegedly
regardless of culture, education, or walk of life” (p. 405). Given that, as Cumming (2013) has
also argued, writing is “highly variable and contingent on education, opportunities for learning,
and needs for use” (p. 1), writing and its contingency and variability make it more intractable for
SLA researchers, who value the generalizability of findings. Writing can also be seen as lacking
spontaneity because, with minor exceptions such as diverse forms of online written commu-
nication, monitoring pervasively characterizes the process of text composing, and this makes
written texts “compromised evidence,” which explains why “written evidence takes a back seat
compared to oral evidence in SLA research programs” (Ortega, 2012, p. 405).
Yet, sometimes language production in the written mode is spontaneous, and the application
of explicit knowledge never materializes, which makes understanding language production while
writing even more challenging. What is more, inspecting L2 writing from an SLA angle is fully
justified on account of the likely learning outcomes that may derive from the monitoring activity
that characterizes written production (before, during, and after the processing of WCF) and, as
a result, from the associated implementation of explicit learning mechanisms while writing and
while WCF processing (see further elaboration in Bitchener, 2019; Williams, 2012. See also
Chapters 2, 3, 6, 7, and 23, this volume). As repeatedly stated in the relevant literature, the possi-
bility of implementing explicit learning processes is in part facilitated by the idiosyncratic, more
expanded time nature of written production, which makes the study of attentional processes while
writing and likely associated learning outcomes worth inspecting, as done, for instance, in studies
of writing processes, of written feedback processing, and in task-​modality studies (see Bitchener,
2019; Leow, 2020; Manchón, 2020b; Manchón & Leow, 2020; Manchón & Vasylets, 2019; Polio,
2012a, b, 2020. See also Chapters 2, 3, 4, 5, 6, and 7, this volume).
From a more applied perspective, additional reasons to make the study of L2 writing more cen-
tral in SLA research agendas and hence to reflect on current developments in the domain stem from
the recognition of the ever-present role that the printed word possesses in instructed second language acquisition (ISLA) settings. This is precisely why Harklau (2002) advocated a modality-sensitive SLA
research agenda almost twenty years ago. On the basis of her own classroom-​oriented research,
Harklau drew attention to the distinctive role of literacy practices in instructed SLA: “my research
in American public schools has convinced me that reading and writing are of relevance to virtu-
ally all classroom-​based research” (Harklau, 2002, p. 330). Accordingly, she argued for a more
central position of the study of the language learning potential of writing in both L2 writing and
instructed SLA research agendas (see Manchón, 2020a for further elaboration of Harklau’s crucial
contribution to the research on writing and language learning). The other side of this coin is the
lack of literacy skills in language learning (e.g., Bigelow & Vinogradov, 2011; Tarone, Hansen, &
Bigelow, 2013) and how such a lack may affect oral language processing and learning. This topic
is not addressed in this volume because of the fairly limited amount of research on the topic (but
see Polio, 2020), although it is nonetheless an important issue for fully understanding the role of
reading and writing in language learning.


Very much in line with Harklau’s arguments, Manchón and Cerezo (2018) have recently
reiterated the relevance of the study of the writing-​L2 learning interface for both L2 writing and
SLA scholarship. They claimed:

Being fully cognizant of the socially situated nature of L2 writing teaching and learning,
L2 writing research cannot afford to ignore the important, and at times even predominant,
instrumental role that writing can have in the learning experience of many L2 learners
across the globe. Similarly, it would be myopic for SLA theory and research to disre-
gard the crucial role that literacy practices play in the learning experience of millions of
instructed L2 learners; hence the relevance of the consideration of writing as a site for
studying and promoting L2 learning in current and future SLA research agendas.
(p. 1, our emphasis)

The study of the language learning affordances of L2 writing has certainly made its way into both
fields. In the case of L2 writing research, it is no coincidence that the new version of Hyland and
Hyland’s (2006) acclaimed book on feedback in L2 writing (Hyland & Hyland, 2019) includes a new
chapter on “The intersection between SLA and feedback research” (Bitchener, 2019). Similarly, the
most comprehensive handbook on L2 writing to date (Manchón & Matsuda, 2016) includes a chapter
on “L2 writing and SLA studies” (Manchón & Williams, 2016), and Hyland’s (2019) revised edition
of Second Language Writing alludes to the “learning-​to-​write/​writing-​to-​learn” dichotomy (Manchón,
2011) as a novel, relevant distinction in understanding the varied contexts in which writing is learned
and taught. Following a similar trend, SLA scholarship is replete with insights about the role that L2
writing can play in the learning of additional languages, and so handbooks and compendia of SLA
published in the last ten years include chapters on L2 writing, whether conceptual pieces (e.g. Manchón
& Vasylets, 2019; Polio, 2012b; Polio & Lee, 2017) or empirical studies (see, for instance, L2 writing
contributions to Leow, 2019).
In short, although still an expanding research domain and hence open to further work to be done the-
oretically and empirically, SLA-​oriented L2 writing scholarship has developed sufficiently for the field
to engage in a retrospective critical analysis of achievements thus far and their wider implications for
central SLA disciplinary discussions, and in an equally important broadening of the agenda through a critical prospective analysis of what lies ahead in terms of theory, research, and, when appropriate, applications.

The Handbook of SLA and Writing: An Overview


The Handbook of SLA and Writing is divided into three parts: Theoretical Perspectives, Core Issues,
and Expanding Research Agendas, and includes this Introduction and a Coda (Chapter 28). Part I
begins with Leow and Suh’s chapter (Chapter 2) that summarizes the theoretical approaches and
models invoked in research examining how writing and written corrective feedback may be related
to language learning in individual writing conditions. The next chapter by Storch (Chapter 3)
also considers the role of writing and corrective feedback in language learning but in collabora-
tive conditions. As such, Leow and Suh summarize various cognitive models while Storch also
highlights the role of sociocultural theory, a framework used in much research on collaborative
writing.
Part II constitutes the bulk of this volume and includes six sections: Tasks and Writing;
Language Processing; Language Transfer and Writing; The Role of Individual Differences;
Writing Research, Corrective Feedback, and Language Development; and Writing Research in
Different Contexts. Each section contains two to four chapters on specific topics all following the
same format. After an introduction, where any necessary terms are defined, the authors offer a
historical perspective on the topic, although in some cases, such as writing and study abroad, the


history may be short. Next, the authors summarize the key issues in the relevant scholarship and
then provide examples of empirical research related to the issues. Each chapter ends with a brief
discussion of the main research methods used in the research in focus, recommendations for prac-
tice, and future research.
In Part II, Tasks and Writing extends task-​based research, which traditionally has focused on oral
tasks, by approaching tasks from the perspective of modality, showing differences in task performance across speaking and writing (Vasylets & Gilabert, Chapter 4). Johnson (Chapter 5), following
up on his 2017 meta-​analysis, outlines theories and critical issues related to the role of task com-
plexity and its effect on learner language.
Language Processing extends the ideas in the two chapters in Part I by focusing on current
research related to writing processes and language learning. Research on writing processes is
not new, but Michel et al. (Chapter 6) highlight recent developments and the new technologies
and research methods that more comprehensively look into writers’ processes. Roca de Larios and
Coyle (Chapter 7) focus on corrective feedback, again, a topic that is not new but is approached
here by delving into research that addresses learners' engagement with feedback, suggesting we
have moved beyond the simple “does written corrective feedback work?” question.
Language Transfer and Writing tackles two complex areas of investigation, namely transfer and
multicompetence. Schoonen and van Vuuren (Chapter 8) explore transfer not simply as L1 influ-
ence on L2 linguistic features but as a more complex interplay of how L1 and L2 linguistic know-
ledge interact with general metacognitive knowledge. They note that transfer can be bidirectional,
a concept elaborated further by Gentil (Chapter 9) in his discussion of multicompetence in writing,
which is the notion that multilingual speakers and writers do not have separate linguistic systems
for each language but that their knowledge of different languages interacts in a variety of ways.
The Role of Individual Differences focuses on age, cognitive factors, and affective factors. These
topics have been addressed for decades but only relatively recently in L2 writing research. The
role of age, in particular, has a very short history in L2 writing research. Coyle and Roca de Larios
(Chapter 10) review the recent research with regard to amount of exposure, explicit instruction, and
engagement with feedback. Although, as they note, this research is only beginning to accumulate,
they argue that it is important to explore populations other than university L2 writers and not take
as a given that younger is better for all language instruction. Ahmadian and Vasylets (Chapter 11)
review research related to working memory and aptitude, while Papi (Chapter 12) teases apart the
very complex issue of motivation in relation to writing.
The section on Writing Research, Corrective Feedback, and Language Development examines
grammar (Polio, Chapter 13), vocabulary (Kyle, Chapter 14), and formulaic chunks (Yoon,
Chapter 15), areas of language development that are not always clearly separable. Reviewed studies
include those that examine how these dimensions develop, how writing might facilitate acquisi-
tion in these areas, and relevant interventions to speed development. In addition, Kang and Han
(Chapter 16) take on the challenging task of succinctly reviewing current research on and central
issues related to written corrective feedback. The section ends with a look at the role of language
in writing assessment (Plakans & Ohta, Chapter 17), explaining how research on written SLA both
diverges and overlaps with research on testing writing.
The last section of Part II focuses on different contexts for studying writing from an SLA per-
spective. Pérez-​Vidal and Lasagabaster (Chapter 18) review research in the rich context of content
and language integrated learning (CLIL), covering four trends in the short history of CLIL and
writing development. Study abroad, a context widely studied but with little emphasis on the devel-
opment of writing (but see Sasaki, 2016), is addressed by Vallejos and Sanz (Chapter 19). Caplan
(Chapter 20) focuses on academic settings, both with regard to development in university-level classes and with regard to the development of academic language. This section ends with a
review by Aubrey and Shintani (Chapter 21) that examines the relatively new affordances of elec-
tronic environments such as text chat, collaborative environments, and tools for feedback.


Complementing the looking-​back view offered in chapters in Part II, Part III provides a forward-​
looking analysis of what lies ahead. The six chapters in Part III provide critical accounts of future
research agendas from the dual angle of empirical questions worth asking and of research methods
needed to answer them. These chapters focus on: writing, language learning, and ISLA (Leow &
Manchón, Chapter 22); attention and writing (Hanaoka & Izumi, Chapter 23); SLA, writing, and
multimodality (Lim & Kessler, Chapter 24); new methodologies for studying the writing process
(Révész, Lu, & Pellicer-​Sánchez, Chapter 25); using existing learner corpora (Staples, Picoral,
Novikov, & Sommer-​Farias, Chapter 26); and automated analyses of written texts (Lu, Chapter 27).
The Handbook ends with a chapter by Dana Ferris (Chapter 28) who pulls together the core themes
of the Handbook and highlights the pedagogical implications of the research analyzed in the various
chapters.
In short, with this collective project our aim is to create a volume that contextualizes the
developments and research practices of the inquiry into the connection between L2 writing and
SLA. The theory and research synthesized in the Handbook of Second Language Acquisition and
Writing will likely not answer the myriad of questions related to how writing facilitates SLA or
how written language is acquired, but it does represent a collection of theories and of syntheses of
empirical research thus far. Importantly, it is hoped that it will lay out detailed and specific research
agendas that will spur further research where needed.

References
Bigelow, M., & Vinogradov, P. (2011). Teaching adult second language learners who are emergent readers.
Annual Review of Applied Linguistics, 31, 120–136.
Bitchener, J. (2012). A reflection on “the language learning potential” of written CF. Journal of Second
Language Writing, 21, 348–​363.
Bitchener, J. (2016). To what extent has the published written CF research aided our understanding of its poten-
tial for L2 development? ITL –​International Journal of Applied Linguistics, 167 (2), 111–​131.
Bitchener, J. (2019). The intersection between SLA and feedback research. In K. Hyland & F. Hyland
(Eds.), Feedback in second language writing. Contexts and issues (pp. 85–​105). Cambridge: Cambridge
University Press.
Bitchener, J., & Storch N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual Matters.
Byrnes, H., & Manchón, R.M. (Eds.) (2014a). Task-​based language learning –​Insights from and for L2
writing. Amsterdam: John Benjamins.
Byrnes, H., & Manchón, R.M. (2014b). Task-​based language learning –​insights from and for L2 writing: An
introduction. In H. Byrnes & R.M. Manchón (Eds.), Task-based language learning – Insights from and
for L2 writing (pp. 1–​23). Amsterdam: John Benjamins.
Cumming, A. (2013). Writing development in second language acquisition. In C. Chapelle (Ed.), Encyclopedia
of applied linguistics. Malden, MA: Wiley-​Blackwell. doi:10.1002/​9781405198431.wbeal1299.
Harklau, L. (2002). The role of writing in classroom second language acquisition. Journal of Second Language
Writing, 11, 329–​350.
Hyland, K. (2019). Second language writing (2nd ed.). Cambridge: Cambridge University Press.
Hyland, K., & Hyland, F. (Eds.) (2006). Feedback in second language writing: Contexts and issues.
New York: Cambridge University Press.
Hyland, K., & Hyland, F. (Eds.) (2019). Feedback in second language writing. Contexts and issues.
Cambridge: Cambridge University Press.
Kepner, C.G. (1991). An experiment in the relationship of types of written feedback to the development of
second language writing skills. Modern Language Journal, 75, 305–​313.
Leow, R. (Ed.) (2019). The Routledge handbook of second language research in classroom learning.
New York: Routledge.
Leow, R. (2020). L2 writing-​to-​learn: Theory, research, and a curricular approach. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 95–​117). Amsterdam: John Benjamins.
Manchón, R.M. (2011). Learning-​to-​write and writing-​to-​learn in an additional language. Amsterdam: John
Benjamins.
Manchón, R.M. (2020a). Writing and language learning. Looking back and moving forward. In R.M.
Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 3–​26). Amsterdam: John
Benjamins.


Manchón, R.M. (2020b). The language learning potential of L2 writing. Moving forward in theory and
research. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 405–​
426). Amsterdam: John Benjamins.
Manchón, R.M. (2020c). L2 writing and L2 learning: Cumming’s Contribution and subsequent developments.
In M. Riazi, L. Shi, & K. Barkaoui (Eds.), Studies and essays on learning, teaching and assessing L2
writing in honour of Alister Cumming (pp. 8–​27). Newcastle upon Tyne: Cambridge Scholars.
Manchón, R.M., & Cerezo, L. (2018). Writing and language learning. In J. Liontas (Ed.), The TESOL
Encyclopedia of English Language Teaching (pp. 1–​6). New York: Wiley. doi:10.1002/​9781118784235.
eelt0530.
Manchón, R.M., & Leow, R. (2020). An ISLA perspective on L2 learning through writing. Implications for
future research agendas. In R.M. Manchón (Ed.), Writing and language learning. Advancing research
agendas (pp. 335–​355). Amsterdam: John Benjamins.
Manchón, R.M., & Matsuda, P.K. (Eds.) (2016). The Handbook of second and foreign language writing.
Berlin: de Gruyter Mouton.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.B. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P. K. Matsuda (Eds.),
The handbook of second and foreign language writing (pp. 567–​586). Boston: de Gruyter.
Ortega, L. (2012). Epilogue: Exploring L2 writing–​SLA interfaces. Journal of Second Language Writing, 21,
404–​415.
Polio, C. (2012a). The relevance of second language acquisition theory to the written error correction debate.
Journal of Second Language Writing, 21, 375–​389.
Polio, C. (2012b). Second language writing. In S. Gass & A. Mackey (Eds.) Handbook of second language
acquisition (pp. 319–​334). New York: Routledge.
Polio, C. (2020). Can writing facilitate the development of grammatical competence? Advancing research
agendas. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 381–​
401). Amsterdam: John Benjamins.
Polio, C., & Lee, J. (2017). Written language learning. In S. Loewen & M. Sato (Eds.), Routledge handbook of
instructed second language acquisition (pp. 299–​317). New York: Routledge.
Sasaki, M. (2016). L2 writers in study-​abroad contexts. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook
of second and foreign language writing (pp. 161–180). Boston: de Gruyter Mouton.
Semke, H. (1984). The effects of the red pen. Foreign Language Annals, 17, 195–​202.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S. Gass & C. Madden (Eds.), Input in second language acquisition (pp. 235–​
253). Rowley, MA: Newbury House.
Swain, M. (2005). The Output Hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in
second language teaching and learning (pp. 471–​483). Mahwah, NJ: Lawrence Erlbaum.
Tarone, E., Hansen, K., & Bigelow, M. (2013). Alphabetic literacy and second language acquisition by older
learners. In J. Herschensohn & M. Young–​Scholten, (Eds.), The Cambridge Handbook of Second Language
Acquisition (pp. 180–​203). Cambridge: Cambridge University Press.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46,
327–​369.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331.

PART I

Theoretical Perspectives
2
THEORETICAL PERSPECTIVES
ON L2 WRITING, WRITTEN
CORRECTIVE FEEDBACK,
AND LANGUAGE LEARNING
IN INDIVIDUAL WRITING
CONDITIONS
Ronald P. Leow and Bo-​Ram Suh
Georgetown University and Seoul National University

Introduction
Language curricula almost invariably include a component for students to produce in writing
some aspect(s) of the second or foreign language (L2) under study, whether as part of an in-​class
or take-​home writing assignment that may or may not be evaluated for a performance grade. This
curricular perspective, logically situated within instructed second language acquisition (ISLA), as
opposed to the more naturalistic SLA (Leow, 2015, 2019a, 2019b; Leow & Cerezo, 2016), may
be viewed from a writing-​to-​learn (or writing as a site for language learning) versus learning-​to-​
write perspective (as in upper-​level writing courses) raised previously by several researchers (e.g.,
Harklau, 2002; Manchón, 2009, 2011a, 2011b, 2011c; Manchón & Williams, 2016; Ortega, 2011).
Researchers have underscored (a) the potential of writing as an integral part of overall language
instruction (Manchón, 2011c), (b) the need to view the role of writing less as a focus on grammar and vocabulary practice and more as an opportunity to learn the L2 (Ortega, 2011), and (c) the need for investigation into the benefits for language development derived from the processes involved in composing (e.g., Manchón & Williams, 2016) and, more recently, in revision based on written corrective feedback (WCF) (e.g., Bitchener & Storch, 2016). In addition, while the typical writing task has been viewed from a cognitive-based individual perspective (the focus of this chapter), there has been an uptick in studies addressing writing tasks from a collaborative perspective (see Chapter 3, this volume;
Storch, 2019 for a recent review), typically situated within Vygotsky’s (1978, 1986) sociocultural
theory (SCT).
Several empirical studies (e.g., Breuer, 2019; Cumming, 1990; López-​Serrano, Roca de Larios,
& Manchón, 2019; Manchón, Roca de Larios, & Murphy, 2009; Révész, Kourtali, & Mazgutova,
2017; Roca de Larios, Murphy, Manchón, & Marín, 2008; see also contributions to Révész &
Michel, 2019) have investigated the linguistic processing involved in composing both in L1 and
L2 via multiple data elicitation procedures (see Chapters 6 and 26, this volume). These studies


have generally reported beneficial effects for potential L2 development derived from some level of
processing or cognitive engagement during the act of composing. Similarly, ISLA writing research
addressing the role of WCF in potential L2 learning has recently begun to probe into the cognitive
processes employed by L2 writers as they process WCF via the use of written languaging, think-
aloud protocols, collaborative dialogues, and noticing charts (see Chapter 7, this volume). These
studies have investigated WCF processing (in either individual or collaborative writing conditions)
to first establish depth of processing (DoP, see Leow, 2015) and then measure the impact of WCF
processing on immediate revisions (e.g., Adrada-​Rafael & Filgueras-​Gómez, 2019; Caras, 2019;
Cerezo, Manchón, & Nicolás-​Conesa, 2019; Park & Kim, 2019; Suzuki, 2016).
The value of investigating L2 written production, especially when compared to oral production,
is more pronounced when the potential for WCF is involved during the subsequent stage of revi-
sion. WCF is a response provided on a written composition by a teacher, a researcher, or a peer in reaction to an error committed by the L2 writer, accompanied by an effort to at least minimally draw the writer's attention to the grammatical, lexical, structural, and/or content error in question (Leow, 2020). What is noteworthy is that WCF appears to be firmly situated within the
instructed setting given that the bulk of studies conducted within this WCF strand of research draw
their sample populations from the formal L2 context (Leow, 2020). In addition, based on the type
of explicit processing typically employed in an instructed setting (Leow & Cerezo, 2016; Leow,
2018, 2019a, 2019b), WCF is also premised on explicit learning grounded in the activation of prior
explicit knowledge (Leow, 2015; Polio, 2012) and in teachers' hope that L2 writers will go beyond merely paying attention to or noticing WCF and will further process adequately (with some DoP or level of awareness) the information provided in the feedback, with the goal of restructuring their incorrect L2 knowledge (Leow, 2020).
Investigating writing and WCF in relation to potential language learning is premised on several
positive affordances due to the differential time and processing dimensions of written versus oral
communication, as well as to the processing of L2 data via WCF when compared to oral feedback.
In both cases (writing and WCF processing), the potential for deeper processing while writing
and while appropriating WCF is promoted by the saliency, visibility, and permanency of both the
written text and the written feedback. Crucially, L2 writers individually control their attentional and
processing resources while writing and their own responses to WCF, so how much linguistic pro-
cessing takes place while writing and how deeply WCF is processed may account for subsequent
L2 learning via the process of restructuring their interlanguage.
There have also been increasing calls not only to study the role of writing processes and the effects of WCF premised on offline data in direct relation to L2 development (e.g., Manchón & Vasylets, 2019; Manchón & Williams, 2016) but also to make a methodological effort to document concurrently the role of cognitive processes during writing and during exposure to WCF in subsequent L2 learning (e.g., Leow, 2020; Manchón & Leow, 2020), as manifested in several recent
studies (see Chapters 6 and 7, this volume). Given (a) the current focus on these cognitive processes
employed during individual writing conditions in ISLA, (b) the role of WCF, and (c) the potential
for language learning from both writing and WCF processing, this chapter provides a succinct
report of theoretical underpinnings postulated to account for the roles of writing and WCF during
the revision stage of the writing process. The theoretical underpinnings are followed by a critical
commentary on their ability to account for the roles that the act of writing and the processing of WCF during revision play in L2 learning.

Theoretical Underpinnings for Writing and WCF


Several theoretical underpinnings1 have been invoked to account for the roles writing and WCF pro-
cessing play in the L2 learning process. From a sociocultural perspective, the major theoretical under-
pinning has been Vygotsky’s (1978, 1986) SCT premised on his concept of the Zone of Proximal


Development (ZPD). ZPD is best viewed as a collective activity between an expert and a novice, with
the former shaping the cognitive development (or levels of knowledge) of the latter. This temporal,
fine-​tuned expert assistance is referred to in the literature as scaffolding (Wood, Bruner, & Ross, 1976).
Cognitive-​based theoretical underpinnings include Focus-​on-​Form (Long & Robinson, 1998),2
the Model of Second Language Acquisition (Gass, 1988), the Noticing Hypothesis (Schmidt,
1990, 1993, 1994, 1995, 2001), the Output Hypothesis (Swain, 2005), Skill Acquisition Theory
(DeKeyser, 2015), and the Model of the L2 Learning Process in ISLA (Leow, 2015).
Given the chapter's focus on individual writing and individual WCF appropriation, in what follows we address only cognitive-based theoretical underpinnings, as these are the ones that provide a more solid foundation for the purported link between the act of writing, WCF processing,
and potential language learning outcomes.

The Noticing Hypothesis


Schmidt’s (1990 and elsewhere) Noticing Hypothesis has been cited in many WCF studies.
According to the Noticing Hypothesis, attention controls access to awareness and is responsible
for the subjective experience of noticing, which Schmidt earlier stated was “the necessary and
sufficient condition for the conversion of input to intake” (Schmidt, 1993, p. 209). Schmidt posits
that focal attention is isomorphic with awareness and, consequently, rejects any learning that may
take place without awareness, especially in relation to abstraction given that abstraction is always
associated with conscious cognitive functions (Schmidt, 1995).
At the same time, Schmidt acknowledges the theoretical and methodological debates associated with
the relatively categorical postulation of the role of noticing in his Noticing Hypothesis. Theoretically,
is the role of noticing indeed necessary and sufficient for subsequent intake in input processing to
take place if one considers the general belief that some aspects of the L2 can be picked up by learners
without being aware of doing so, for example, Tomlin and Villa’s (1994) notion of detection (cognitive
registration) that may occur without awareness? Methodologically, can one establish zero awareness
at the point of noticing or processing the incoming input? To address these questions, Schmidt (1994)
waters down his original postulation of noticing as the necessary and sufficient condition for the con-
version of input into intake by postulating that more noticing leads to more learning, consequently
underscoring the facilitative nature of noticing in the early stages of the L2 learning process.
Schmidt (1990) also postulates a higher level of awareness (at the level of understanding) that is
associated with learners’ ability to analyze, compare, and test hypotheses about the linguistic data
ultimately leading to rule formulation. He distinguishes the two levels of awareness by stating that
noticing is necessary for intake and potential learning to occur while understanding, which may
function as a facilitator for learning, is not necessary. According to Schmidt (1993), the crucial
difference between noticing and understanding is that noticing results in intake and item learning
while understanding leads to restructuring and system learning. Understanding appears to involve
a much higher DoP when compared to noticing although they both allow for storage of linguistic
information in the learner’s internal system.
It appears, then, that WCF, rather than writing itself, provides an opportunity for L2 writers to
notice the gap or mismatch between their output and the linguistic information contained in the
feedback, and this learning process is explicit.

Comments on the Noticing Hypothesis


Many SLA-​ oriented writing and WCF studies have cited Schmidt’s Noticing Hypothesis as
the theoretical underpinning for their findings in addition to the role of attention in this strand
of research. However, writing is a productive activity while Schmidt’s Noticing Hypothesis is
premised on incoming input. Within WCF studies, noticing is typically framed as the L2 writer’s


need to minimally “notice” feedback in order for any potential restructuring or learning to take
place. Given the isomorphic nature of noticing (attention plus a low level of awareness), employing
noticing to explicate findings in this strand of WCF research is logically premised on WCF being
processed explicitly, that is, with a minimal low level of awareness. As pointed out by Leow (2020),
it should be noted that Schmidt’s (e.g., 1990, 1993, 1995, 2001) Noticing Hypothesis was origin-
ally situated at the early stage of the L2 learning process (the input-​to-​intake stage) and addresses
the processing of new linguistic information in the L2 input with the potential of noticed informa-
tion resulting in intake and potential learning. The Noticing Hypothesis, then, has more relevance
for research on WCF processing than to explicate the language learning potential of the act of
writing itself. Situating the Noticing Hypothesis at the output or productive stage also assumes
that, like new linguistic information, the feedback provided at this stage also needs to be minim-
ally noticed. However, a closer comparison between new information being processed for the first
time by an L2 learner and feedback provided after an L2 writer’s production reveals one major
difference: noticing at this output stage has to be logically associated with the L2 writer’s prior
knowledge. In other words, the L2 writer notices the mismatch between the different linguistic
information contained in the WCF and what they have produced (from prior knowledge already
lodged in their internal system). This role of (activation of) prior knowledge in relation to the WCF
provided, which goes beyond mere noticing (low DoP) and necessitates more cognitive engage-
ment (higher DoP) to potentially restructure their prior incorrect knowledge (Leow, 2015), does
not appear to constitute a part of the Noticing Hypothesis. In addition, noticing (attention plus a
low level of awareness) does not guarantee automatic further processing that may require deeper
processing or a higher level of awareness. Previous studies employing concurrent data elicitation
procedures such as eye-tracking (e.g., Godfroid, Boers, & Housen, 2013) and think-alouds (e.g., Leow, 2001) have reported no role for noticing in subsequent performance (recognition
and production), suggesting that intake may need to be further processed for potential learning to
take place (Leow, 2015). In sum, the Noticing Hypothesis may be too coarse-​grained to account
for how attentional and processing resources while writing may lead to language learning or
how the WCF is processed after noticing and what role prior knowledge plays in any interaction
with the WCF noticed. The Noticing Hypothesis also fails to account for the potential of processing
the WCF without awareness.

The Output Hypothesis


Swain’s (1985, 2005) Output Hypothesis was put forward as a direct rebuttal to Krashen’s (1982)
Input Hypothesis that postulates that acquisition takes place during exposure to comprehensible
input during the early stages of the L2 process (the input and intake processing stages), after she
observed that, in spite of tremendous exposure to comprehensible input in content-​based L2 French
classrooms in immersion schools in Canada, students’ L2 productive ability remained quite low
when compared to their comprehension ability. To address this apparent issue, Swain assumes that
students’ productive ability will improve if more opportunities to produce the L2 are provided.
Further, more production will decrease the time spent on semantic processing or processing for
meaning and increase the opportunity to become more consciously and cognitively engaged in
grammatical (e.g., syntactic and morphological) processing.
The Output Hypothesis comprises three major claims, the first two psycholinguistic in nature
and the third sociocultural (not addressed in this chapter). The first claim addresses the actual process
of producing the L2 (in our case, the act of writing itself) and potential cognitive processes involved
during this production (a noticing/​triggering function) while the second is associated with cognitive
processes associated with the potential role of feedback (a hypothesis-​testing function).
In her first claim, Swain views the opportunity to produce the L2 as activating the noticing/​
triggering function that may promote a higher level of awareness of or recognition of potential gaps


between learners’ current interlanguage or current knowledge and the L2. Promoting this cognitive
engagement may result in deeper processing of relevant L2 data due to learners’ effort to solve these
linguistic problems. Swain also claims that this awareness triggers cognitive processes associated
with the process of learning: “ones in which learners generate linguistic knowledge that is new
to them, or that consolidate their current existing knowledge” (Swain, 2005, p. 474). At the same
time, it is also acknowledged that learner awareness may also depend upon several other variables
that may prevent such awareness-​raising. Also included in the claim is the notion of DoP (Craik &
Lockhart, 1972) that is associated with the amount of analysis and elaboration L2 learners perform
while processing L2 data with the belief that deeper processing leads to better retention of L2 know-
ledge. Consequently, the act of producing written output allows L2 writers to control their own atten-
tional and processing resources while writing and, based on DoP, to achieve potential higher levels
of awareness leading to new linguistic knowledge during the writing process. Swain’s basic claim,
then, is that more opportunities to produce the L2 may force L2 learners “to move from semantic
processing to syntactic processing” (Swain, 1993, p. 159), which, in turn, will provide more time for
them to explicitly analyze the linguistic mismatches they notice during output (i.e., while writing).
The second claim, the hypothesis-​testing function, may appeal more to risk takers willing to con-
sciously produce the L2 for feedback given that it is related to the opportunity for learners to experi-
ment with new linguistic forms and structures with the purpose of verifying their L2 accuracy.
According to Swain, learners need to test hypotheses in relation to the feedback received if they are
interested in consolidating their prior knowledge. This increase in consciously processing the feed-
back can potentially lead to modifying or “reprocessing” their output. Feedback, then, consciously
sought and processed at a high DoP by L2 writers, offers the opportunity for these writers to restruc-
ture their prior knowledge via explicit learning.

Comments on the Output Hypothesis


Swain’s Output Hypothesis enjoys a role in many writing and WCF studies given that its loca-
tion is exactly where the production of written text and the provision of WCF occur. It is also
classroom-​based in that it acknowledges the type of explicit processing that predominates in this
formal setting. Interestingly, in spite of this location, the hypothesis claims that “the act of produ-
cing language (speaking or writing) constitutes, under certain circumstances, part of the process of
second language learning” (Swain, 2005, p. 471). In other words, while it is generally assumed that
output or production is a product and for many outside the L2 learning process, Swain includes this
process of producing as part of the learning process, which also appears to involve quite a high DoP
in both claims.
One limitation of the hypothesis-testing function, which underlies the purported benefits of WCF, is that it may be restricted to L2 learners/writers willing to invest the effort to explicitly produce L2 data and seek verification of their production. Compared with the immediacy of feedback in oral production, the delayed provision of WCF may be a drawback with respect to its impact on the hypotheses L2 writers form during the composing phase. In addition, how the feedback is further processed (is noticing sufficient to restructure the interlanguage?) and whether L2 learning rests solely on production are questions that remain unanswered in this hypothesis. Incidentally, as pointed out in Leow (2020), the notion of "reprocessing" the feedback at the early input-to-intake stage arguably weakens the primary tenet of the Output Hypothesis, which is premised on production, that is, on the later stage of the L2 learning process.

Skill Acquisition Theory


DeKeyser’s (2015) application of Skill Acquisition Theory to L2 learning postulates how explicit
L2 knowledge becomes proceduralized over time until it is converted into implicit knowledge. In this theory, knowledge is characterized by type and use, and three principal stages of development are proposed, commonly referred to within SLA as the acquisition of declarative knowledge, proceduralization, and automatization (see Anderson, 1982, 1993, 2007).
The learning process begins with the L2 learner acquiring declarative knowledge about a skill
(language), most often through observation and analysis of expert behavior as well as a verbal
and explicit explanation (DeKeyser, 2015), without attempting to engage in any kind of language
perception or production. Once the learner has acquired enough knowledge about a skill, typically
associated with much cognitive effort, they may begin the process of forming a behavior where
declarative knowledge is transformed into procedural knowledge during proceduralization. To
achieve proceduralized knowledge together with spontaneous and fluid speech, learners need to be
exposed to a great deal of meaningful, contextualized practice. To describe the process of acquisition, DeKeyser posits a qualitative change in knowledge retrieval that resembles a power curve, representing the shift from declarative to procedural knowledge through automatization and showing the change over time as explicitly learned knowledge becomes proceduralized and automatized. Once the learner has proceduralized
their declarative knowledge, specific information can be retrieved with less dependence on declara-
tive knowledge and, consequently, less cognitive effort, resulting in faster responses.
Skill Acquisition Theory does not necessarily claim an evolution from explicit to implicit know-
ledge but rather focuses on how explicitly learned declarative knowledge carries learners through
the proceduralization stage by way of carefully formatted tasks, to more implicit or procedural
knowledge and into the initial stages of automatization. At the same time, under this theory, a
learner cannot reach a practical proficiency level without moving through each stage (DeKeyser,
2015). From a different perspective, feedback may allow L2 learners the opportunity to practice the L2 correctly during the stages of the acquisition of declarative knowledge, proceduralization, and automatization.

Comments on the Skill Acquisition Theory


The Skill Acquisition Theory, like the Output Hypothesis, lies at the output stage of the L2 learning
process where L2 learners practice proceduralizing their declarative knowledge in the internal
system. Therefore, it is a relevant theory to account for the purported language learning potential
that may derive from the practice involved in the act of writing although it may be more pertinent
to upper-​level courses than to language classes that do not systematically practice proceduralizing
declarative knowledge via compositions alone.
While this theory does not itself assign any role to feedback, some researchers clearly see a role for feedback as declarative knowledge is proceduralized into procedural or automatized knowledge. For example, Leeman (2007) suggests that feedback may play a role
at the three stages postulated within the theory (i.e., the acquisition of declarative knowledge,
proceduralization, and automatization). At the initial stage, feedback can promote the development
of declarative knowledge while during the stages of proceduralization and automatization, feed-
back can “indicate the need for greater attention and reliance on declarative knowledge as well as
the need to change the scope of a given rule or procedure. Furthermore, feedback may be useful in
avoiding the automatization of non-​target L2 knowledge” (Leeman, 2007, p. 117). While it may
be proposed that feedback associated with subsequent corrected and consistent practice promotes
some type of L2 knowledge during the three stages, it is noteworthy that Skill Acquisition Theory is
premised on constant practice of declarative knowledge (assumed to be accurate, DeKeyser, 1997)
before engaging in skill-​specific practice activities for proceduralization and automatization to
occur. In addition, practice in Skill Acquisition Theory may not align well time-​wise with that of the
typical writing component in a language curriculum in which WCF is provided several days after
compositions are submitted to the teacher. The claim that Skill Acquisition Theory provides a theoretical underpinning for WCF, which does not appear to form part of its theoretical postulations, thus seems somewhat tenuous and may need to be revisited.

Model of Second Language Acquisition


Gass’s (1988, updated later in Gass, Behney, & Plonsky, 2013) Model of Second Language
Acquisition is especially relevant for explaining any learning that may derive from WCF appropri-
ation. The model is framed within an interactionist perspective and posits five stages in the learning
process from input to output, namely, (1) apperceived input, (2) comprehended input, (3) intake,
(4) integration, and (5) output. Apperception is “an internal cognitive act in which a linguistic form
is related to some bit of existing knowledge (or gap in knowledge)” (Gass, 1997, p. 4). Once the
particular piece of input has been apperceived, the potential for intake to take place depends upon
what Gass calls comprehended input. Comprehended input may be analyzed at different levels of
analysis, for example, global comprehension versus a more linguistic focus, and these analyses
have an impact on what becomes intake, which is controlled by the L2 learner.
The stage of intake is not merely a subset of input but a stage at which psycholinguistic processes take place in relation to internalized grammatical prior knowledge, and it is from this
component that fossilization stems. Major processes at this stage include hypothesis formation and
testing, hypothesis rejection, hypothesis modification, and hypothesis confirmation.
There are at least two outcomes that are derived from the intake stage, both of which are a form
of integration (Stage 4). According to Gass, one is the development per se of a learner’s second lan-
guage grammar, and the other is storage that may or may not be further integrated into the system.
Integration is continuous, and the integration component does not function as an independent unit
given that the model “is dynamic and interactive, with knowledge itself being accumulative and
interactive” (Gass & Selinker, 2001, p. 304). Important variables involved in integration include
different levels of analysis and reanalysis from storage into the grammar, and within the grammar
itself.
The final stage is output, although Gass points out that it is not truly a stage in the acquisition process but rather an overt manifestation of the process. One important role of this stage is that it may serve as a means of confirming or disconfirming prior hypotheses about the L2 via feedback provided by someone else. Not elaborated but clearly depicted in her model is the role output plays in the learner's intake component and, given the interactionist framework within which the model is subsumed, in subsequent negotiation with the L2 speaker and in potential modifications by the L2 speaker based on the learner's output.
Bitchener (2019, 2021) adopts the stages of the L2 learning process postulated in Gass's (1997) model to propose the roles of cognitive processes during the different stages of processing WCF, and he elaborates on each stage in relation to additional moderating variables that could potentially play a role in feedback processing. He also adopts Tomlin and Villa's (1994) Model of
Input Processing in SLA to motivate his claim that, at a pre-​processing stage of WCF, L2 writers
need to be motivated (alerted?) and oriented to form or accuracy before addressing WCF in order
to attend to such feedback. At the stage of attending to WCF, Bitchener postulates that preference
for type of feedback, language proficiency, and prior experiences with previous WCF may play a
role. Following Schmidt’s Noticing Hypothesis, for the noticing the gap between their prior know-
ledge (i.e., their own output) and WCF stage, Bitchener claims that explicitness (i.e., salience and
linguistic marking) of different types of WCF may need to be considered. At the understanding
and comprehension of the WCF stage, levels of information, type of linguistic item (e.g., com-
plexity of targeted linguistic item), and long-​term memory are claimed to moderate feedback pro-
cessing. At the analyzing stage, variables such as working memory processing capacity, long-​term
memory store (prior knowledge), language learning aptitude, and type of WCF may moderate how
the feedback is processed. Finally, at the hypothesis formation/​testing stage, affective factors and
prior experiences are assumed to play a role. These then lead to whether the hypothesis is accepted
or not, which in turn leads to consolidation or a repetition of the episode.
Adopting a modified version of Gass's (1997) Model of Second Language Acquisition to explicate feedback processing is quite appropriate given that her model covers all the stages of the L2 learning process and, unlike Schmidt's (1990) Noticing Hypothesis, does indeed appear to assign an initial role to a potential link between what is noticed and learners' prior knowledge (apperception). It is also laudable that Bitchener attempts to go beyond Gass's model to include several moderating variables assumed to play some role in feedback processing at different stages of the learning process. However, one major limitation may lie in the testability of the proposal. For example,
attempting to assign specific variables to specific individual stages (e.g., WCF type preference at
the attention stage and explicitness of WCF types at the noticing stage) may be methodologically
impossible to test empirically given that, as pointed out by Godfroid, Boers, and Housen (2013),
based on eye-​tracking data, “input processing rapidly blends into intake processing given that it
only takes 50 ms for visual information to be represented in working memory” (p. 510). The same
observation holds true for all the other stages (understanding, analyzing, hypothesis formation/​
testing), which, in the original model (Gass, 1997), all occur within the intake processing stage but do not follow a sequence among themselves; that is, not all of these processes may occur within one episode (see also Leow, 2015, below).

Model of the L2 Learning Process in ISLA


Leow’s (2015) Model of the L2 Learning Process in ISLA posits three major processing stages
(i.e., input processing stage, intake processing stage, knowledge processing stage) during which
L2 linguistic information contained in the input is processed along the L2 learning process. During
all these stages, specific cognitive processes are postulated to play important roles. As the title
indicates, this model is importantly situated within the instructed setting. While attention is central
to the model, DoP, defined as “the relative amount of cognitive effort, level of analysis, elabor-
ation of intake together with the usage of prior knowledge, hypothesis testing and rule forma-
tion employed in decoding and encoding some grammatical or lexical item in the input” (p. 204)
plays an important role at all these processing stages. The three processing stages are succinctly
described below.

Input Processing Stage


The first stage (input processing) occurs between the input and the intake of specific linguistic
information, and what is taken in (intake) is initially stored in working memory. This stage is largely
dependent upon the level of attention (peripheral, selective, or focal) paid to such information by
the learner and may be accompanied by DoP, cognitive registration, and level of awareness. Based
on the role of these variables, intake lodged into working memory may be of three types: attended intake (+ very low DoP, − cognitive registration, − awareness), detected intake (+ low DoP, + cognitive registration, − awareness), and noticed intake (+ low DoP, + cognitive registration, + awareness).
Crucially, any type of intake will be discarded from working memory if not further processed.

Intake Processing Stage


This second stage of further processing occurs between preliminary intake (attended, detected, and
noticed) and the internal or L2 developing system. New preliminary intake (attended, detected,
or noticed) may be processed in one of two ways depending upon depth or level of processing: (1)
minimal data-​driven processing associated with very low DoP that allows the data to be entered into
learners’ L2 developing system encoded as a non-​systemized chunk of language (item learning) or
(2) conceptually-​driven processing or consciously encoding and decoding the linguistic informa-
tion, associated with higher levels of processing and potential higher levels of awareness (system
learning). Activation of old or recent (new exemplars) prior knowledge may also play a role in the
intake processing stage. Dependent upon the DoP and/​or levels of awareness, new linguistic infor-
mation may lead to either implicit or explicit systemized learning of the L2 information, which is
stored in the internal system.

L2 Knowledge Processing Stage


The third and final processing stage takes place between the L2 developing system and learner
output. Knowledge processing would include, for example, assigning phonological features to
the L2 in oral production or monitoring production in relation to learned grammar. Depth of pro-
cessing and possible level of awareness may also be involved in this stage, along with the ability
to activate (appropriate) prior knowledge. Like Swain (2005), Leow also views this knowledge
processing stage as part of the L2 learning process given that learners’ output may also serve as
additional input. Learners may monitor their own output or use potential feedback provided based
on what they have just produced to confirm or disconfirm their L2 output or prior knowledge.
Consequently, feedback may serve as additional L2 input to the L2 learner. Dependent upon DoP
or level of awareness, they may reinforce their present knowledge or restructure their current
interlanguage.
Generally based on his model, Leow (2020) provides a feedback processing framework that
offers a cognitive explanation for the role of feedback, whether oral or written, in subsequent
L2 development in direct relation to how L2 speakers or writers process such feedback. In this
framework, as reported in Leow (2020), feedback is the L2 information that learners need to pro-
cess and feedback processing encompasses how the learner cognitively processes the feedback (if
at all) in relation to the current learner knowledge or interlanguage (in line with Bitchener, 2019,
2021; Gass, 1997). If further processed at this stage, the information in the corrective feedback
allows for restructuring of previously learned knowledge stored in the learner’s internal or L2
developing system. This new restructured information (accurate or still inaccurate) then enters into
and exists alongside or replaces the original knowledge in the internal or L2 system. “It is possible,
then, that the learner still retains the previous inaccurate L2 data and now holds both (accurate
and inaccurate) options in the system” (Leow, 2020). Old output represents a potential absence or
low depth of prior processing of the corrective feedback provided or not much confidence in the
newly restructured knowledge. New or Modified output represents the learner’s production of the
restructured L2, which is assumed to represent the current L2 knowledge present at that point in
time in their Internal system. Current knowledge may be derived from item learning (that is, the corrective feedback was internalized as a chunk of language, as in simply copying or repeating direct corrective feedback without much understanding of the error source) or from system learning, which involves higher DoP and awareness at the level of understanding. To address whether a com-
plete accurate restructuring took place (as in system learning) or whether such restructuring was
temporary, immediate, or reflective of item learning, researchers may need to (or should) include a delayed test in their design (to observe learners' subsequent performance). In other words, a delayed test can reveal cases in which accurate performance was evidenced immediately after the corrective feedback was provided but learners later reverted to their previous inaccurate interlanguage. Whether feedback is indeed processed
by L2 learners may depend on several cognitive processes and variables such as DoP, levels of
awareness, activation of appropriate prior knowledge, hypothesis testing, rule formulation, and/​
or metacognition.


Comments on the Model of the L2 Learning Process in ISLA


Similar to Swain’s (1985, 2005) Output Hypothesis, the knowledge processing stage of Leow’s
model allows L2 writers (and learners) to monitor their production (written or oral). At this stage
of the L2 learning process, Leow provides a cognitive explanation for writing in terms of his
postulated roles of the activation of prior knowledge, DoP, and levels of awareness. Accessing
recent appropriate prior knowledge at the lower levels of language proficiency will involve much
cognitive effort by L2 writers that, in turn and dependent upon DoP, may contribute to a higher
level of awareness (at the level of understanding) and more potential learning. At the same time, as
proficiency rises and with much practice activating old appropriate prior knowledge, learning will
be strengthened and DoP will be reduced during the writing process.
Also similar to Swain (2005), Leow's model views the learning process as including the knowledge processing stage, which allows potential feedback to loop back to the early input processing stage. The model is also situated within the instructed setting and clearly premised on explicit learning taking
place in this setting. According to the feedback processing framework, based on the model, whether
the feedback has been attended to, detected, or noticed once again depends upon the attentional
resources allocated to the feedback by the learner in addition to the DoP and level of awareness
involved to make the connection between the learner’s prior inaccurate knowledge (or output) and
the information in the feedback received. In other words, whether the feedback processed allows for
potential restructuring of the inaccurate knowledge may depend upon whether the feedback is cog-
nitively registered, how deeply the feedback is processed, and/​or the level of awareness in relation to
the mismatch between the learner’s prior knowledge and the feedback. Feedback can be processed
implicitly, that is, without much cognitive effort expended in doing so, but many instances of the
same feedback may be needed for implicit learning or restructuring to take place. However, in the
instructed setting it will most likely be processed explicitly, especially in the written mode.

Summary and Suggestions for Future Research


The writing process, in relation to both the acts of writing and the processing of WCF, is clearly
a cognitive activity that is undertaken with some effort to produce the L2. All the theoretical
underpinnings discussed above underscore in some way the role of cognitive or mental processes
that are assumed to play an important role in both the stages of writing and feedback processing.
Learning in the act of writing and from exposure to WCF appears to be promoted by some level of DoP and awareness, with deeper processing leading to more robust learning (Bitchener,
2019, 2021; DeKeyser, 2015; Leow, 2015, 2020; Schmidt, 1993; Swain, 2005). Such cognitive
processes need to be further investigated to arrive at a clearer understanding of what takes place
during individual writing conditions. There is also the need to continue addressing moderating
variables (both external and internal) that could potentially impact the act of writing and how feed-
back is processed. External variables may include number and length of compositions, type of
grammatical items, L2 level of proficiency, writing-​to-​learn as in language curricula vs. learning-​
to-​write as in upper-​level writing courses, context as in, for example, learners’ academic status
(e.g., university-​vs. non-​university levels) while learner-​internal factors may include, for example,
age (e.g., adults vs. children), working memory, motivation, prior knowledge, language learning
aptitude, and anxiety. Pertaining to WCF, a possible, promising next step may be to empirically
test (different stages of) Leow’s Feedback Processing Framework in an instructed setting, that
is, within the written component of a language curriculum, in which (different types of) WCF is
provided. Findings from such empirical studies will help broaden our understanding of how and
why writing and WCF facilitate L2 development while providing some pedagogical ramifications
for this instructed setting.


Conclusion
This chapter began by situating writing in individual writing conditions, WCF, and language
learning from a cognitive perspective within ISLA as opposed to SLA and, more specifically, within
the language curriculum. It then discussed the processing affordances provided by the written mode
when compared to its oral counterpart and underscored the importance of empirically investigating
the cognitive processes involved during the act of writing and the processing of WCF and language
learning. Based on the current focus on these cognitive processes employed during L2 written pro-
duction in ISLA, the role of WCF, and the potential for learning in an instructed setting, a succinct
report of theoretical cognitive underpinnings postulated to account for the learning process during
the phase of writing and that of addressing WCF was provided. Each theoretical underpinning was then followed by a critical commentary on its ability to account for the roles of writing itself and of WCF during the revision stage of the writing process in L2 development. Finally, a proposed list of both internal and external variables to be combined in future writing and WCF research designs was provided.

Notes
1 We are purposely using the phrase “theoretical underpinnings” to cover hypotheses, models, and theories
under one umbrella given the different premises and theoretical scopes offered by each.
2 Although Focus-​on-​Form has been cited as a theoretical underpinning to account for the role of WCF in L2
development, its principles rest on some periodic focus on grammatical errors during oral communication.
Given that Focus-​on-​Form does not adhere easily to the written mode, we have not included it among the
theoretical underpinnings.

References
Adrada-​Rafael, S., & Filgueras-​Gómez, M. (2019). Reactivity, language of think aloud protocol, and depth
of processing in the processing of reformulated feedback. In R.P. Leow (Ed.), The Routledge handbook of
second language research in classroom learning (pp. 201–​213). New York: Routledge.
Anderson, J.R. (1982). Acquisition of cognitive skill. Psychological Review, 89(4), 369–​406.
Anderson, J.R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum.
Anderson, J.R. (2007). How can a human mind occur in the physical universe? New York: Oxford University
Press.
Bitchener, J. (2019). The interaction between SLA and feedback research. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 85–105). Cambridge: Cambridge University Press.
Bitchener, J. (2021). Written corrective feedback. In H. Nassaji & E. Kartchava (Eds.), The Cambridge handbook of corrective feedback in language learning and teaching (pp. 207–225). Cambridge: Cambridge University Press.
Bitchener, J., & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual Matters.
Breuer, E.O. (2019). Fluency in L1 and FL writing: An analysis of planning, essay writing and final revision. In E. Lindgren & K. Sullivan (Eds.), Observing writing: Insights from keystroke logging and handwriting (pp. 190–211). Leiden: Brill.
Caras, A. (2019). Written corrective feedback in compositions and the role of depth of processing. In R.P.
Leow (Ed.), The Routledge handbook of second language research in classroom learning (pp. 188–​200).
New York: Routledge.
Cerezo, L., Manchón, R.M., & Nicolás-​Conesa, F. (2019). What do learners notice while processing written cor-
rective feedback? A look at depth of processing via written languaging. In R.P. Leow (Ed.), The Routledge
handbook of second language research in classroom learning (pp. 173–​187). New York: Routledge.
Craik, F.I.M., & Lockhart, R.S. (1972). Levels of processing: A framework for memory research. Journal of
Verbal Learning and Verbal Behavior, 11(6), 671–​684.
Cumming, A. (1990). Metalinguistic and ideational thinking in second language composing. Written
Communication, 7(4), 483–511.
DeKeyser, R.M. (1997). Beyond explicit rule learning. Studies in Second Language Acquisition, 19(2),
195–​221.
DeKeyser, R.M. (2015). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second lan-
guage acquisition (pp. 94–​112). London: Routledge.
Gass, S.M. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics,
9(2), 198–​217.
Gass, S.M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Gass, S.M., Behney, J., & Plonsky, L. (2013). Second language acquisition: An introductory course (4th ed.).
New York: Routledge.
Gass, S.M., & Selinker, L. (2001). Second language acquisition: An introductory course (2nd ed.). Mahwah,
NJ: Lawrence Erlbaum Associates.
Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in L2 vocabulary
acquisition by means of eye tracking. Studies in Second Language Acquisition, 35(3), 483–​517.
Harklau, L. (2002). The role of writing in classroom second language acquisition. Journal of Second Language
Writing, 11(4), 329–​350.
Krashen, S.D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon Press.
Leeman, J. (2007). Feedback in L2 learning: Responding to errors during practice. In R. DeKeyser (Ed.),
Practicing in a second language: Perspectives from applied linguistics and cognitive psychology (pp. 111–​
137). New York: Cambridge University Press.
Leow, R.P. (2001). Do learners notice enhanced forms while interacting with the L2?: An online and offline
study of the role of written input enhancement in L2 reading. Hispania, 84(3), 496–​509.
Leow, R.P. (2015). Explicit learning in the L2 classroom: A student-​centered approach. New York: Routledge.
Leow, R.P. (2018). Explicit learning and depth of processing in the instructed setting: Theory, research, and
practice. Studies in English Education, 23(4), 769–​801.
Leow, R.P. (2019a). From SLA > ISLA > ILL: A curricular perspective. In R.P. Leow (Ed.), The Routledge
handbook of second language research in classroom learning (pp. 485–​493). New York: Routledge.
Leow, R.P. (2019b). ISLA: How explicit or how implicit should it be? Theoretical, empirical, and pedagogical/​
curricular issues. Language Teaching Research, 23(4), 476–​493.
Leow, R.P. (2020). L2 writing-​to-​learn: Theory, research, and a curricular approach. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 95–​117). Amsterdam: John Benjamins.
Leow, R.P., & Cerezo, L. (2016). Deconstructing the “I” and “SLA” in ISLA: One curricular approach. Studies
in Second Language Learning and Teaching, 6(1), 43–​63.
Long, M., & Robinson, P. (1998). Focus on form: Theory, research and practice. In C. Doughty & J. Williams
(Eds.), Focus on form in classroom second language acquisition (pp. 15–​41). Cambridge: Cambridge
University Press.
López-​Serrano, S., Roca de Larios, J., & Manchón, R.M. (2019). Language reflection fostered by individual
L2 writing tasks: Developing a theoretically-​motivated and empirically-​based coding system. Studies in
Second Language Acquisition, 41(3), 503–​527.
Manchón, R.M. (2009). Writing in foreign language contexts: Learning, teaching, and research. Clevedon:
Multilingual Matters.
Manchón, R.M. (2011a). The language learning potential of writing in foreign language contexts: Lessons
from research. In T. Cimasko & M. Reichelt (Eds.), Foreign language writing instruction: Principles and
practices (pp. 44–​64). Anderson, SC: Parlor Press.
Manchón, R.M. (Ed.). (2011b). Learning-to-write and writing-to-learn in an additional language. Amsterdam: John Benjamins.
Manchón, R.M. (2011c). Situating the learning-​to-​write and writing-​to-​learn dimensions of L2 writing.
In R.M. Manchón (Ed.), Learning-​to-​write and writing-​to-​learn in an additional language (pp. 3–​14),
Amsterdam: John Benjamins.
Manchón, R.M., & Leow, R.P. (2020). Investigating the language learning potential of L2 writing: Methodological
considerations for future research agendas. In R.M. Manchón (Ed.), Writing and language learning.
Advancing research agendas (pp. 335–​355). Amsterdam: John Benjamins.
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving nature
of foreign language composing: Implications for theory. In R. M. Manchón (Ed.), Writing in foreign lan-
guage contexts: Learning, teaching, and research (pp. 102–​129). Bristol: Multilingual Matters.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P. K. Matsuda (Eds.),
The handbook of second and foreign language writing (pp. 567–586). Berlin: de Gruyter Mouton.
Ortega, L. (2011). Reflections on the learning-​to-​write and writing-​to-​learn dimensions of second language
writing. In R. Manchón (Ed.), Learning-​to-​write and writing-​to-​learn in an additional language (pp. 237–​
250). Amsterdam: John Benjamins.
Park, E.S., & Kim, O.Y. (2019). Learners’ use of indirect written corrective feedback: Depth of processing and
self-​correction. In R.P. Leow (Ed.), The Routledge handbook of second language research in classroom
learning (pp. 214–​228). New York: Routledge.
Polio, C. (2012). The relevance of second language acquisition theory to the written error correction debate.
Journal of Second Language Writing, 21(4), 375–​389.
Révész, A., Kourtali, N., & Mazgutova, D. (2017). Effects of task complexity on L2 writing behaviors and
linguistic complexity. Language Learning, 67(1), 208–​241.
Révész, A., & Michel, M. (Eds.). (2019). Methodological advances in investigating L2 writing processes
[Special Issue]. Studies in Second Language Acquisition, 41(3).
Roca de Larios, J., Murphy, L., Manchón, R.M., & Marín, J. (2008). The foreign language writer’s strategic
behaviour in the allocation of time to writing processes. Journal of Second Language Writing, 17(1), 30–​47.
Schmidt, R.W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2),
129–​158.
Schmidt, R.W. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13,
206–​226.
Schmidt, R.W. (1994). Implicit learning and the cognitive unconscious: Of artificial grammars and SLA. In N.
Ellis (Ed.), Implicit and explicit learning of languages (pp. 165–​209). London: Academic Press.
Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and
awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and
teaching. Second Language Teaching and Curriculum Center Technical Report No. 9 (pp. 1–​ 64).
Honolulu: University of Hawai’i Press.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and Second Language Instruction (pp. 3–​32).
New York: Cambridge University Press.
Storch, N. (2019). Time line on collaborative writing. Language Teaching, 52(1), 40–​59.
Suzuki, W. (2016). The effect of quality of written languaging on second language learning. Writing &
Pedagogy, 8(3), 461–​482.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S. Gass & C. Madden (Eds.), Input in second language acquisition (pp. 235–​
253). Rowley, MA: Newbury House.
Swain, M. (1993). The Output Hypothesis: Just speaking and writing aren’t enough. The Canadian Modern
Language Review, 50(1), 158–​164.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in
second language teaching and learning (pp. 471–​483). Mahwah, NJ: Lawrence Erlbaum.
Tomlin, R.S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in
Second Language Acquisition, 16(2), 183–​203.
Vygotsky, L.S. (1978). Mind in society. The development of higher psychological processes. Cambridge,
MA: Harvard University Press.
Vygotsky, L.S. (1986). Thought and language. Cambridge, MA: MIT Press.
Wood, D., Bruner, J.S., & Ross, G. (1976). The role of tutoring in problem-​solving. Journal of Child Psychology
and Psychiatry, 17(2), 89–​100.

3
THEORETICAL PERSPECTIVES ON L2 WRITING AND LANGUAGE LEARNING IN COLLABORATIVE WRITING AND THE COLLABORATIVE PROCESSING OF WRITTEN CORRECTIVE FEEDBACK
Neomy Storch
University of Melbourne, Australia

Introduction
For many years, scholarship on second language acquisition (SLA) focused on spoken language,
viewing oral interactions as the site for second language (L2) learning (Harklau, 2002). It is only relatively recently that writing has come to be recognized as an equally suitable, and in some ways superior, site for L2 learning. Writing is now perceived as offering learners opportunities not only to develop writing abilities and content knowledge but also to learn the target
language (Manchón, 2011; Manchón & Vasylets, 2019; Williams, 2012). Manchón (2011) refers to
these two types of opportunities as “learning-​to-​write” and “writing-​to-​learn” respectively.
An activity closely associated with writing in instructional contexts is the provision of feedback
on writing, and in particular corrective feedback on errors in language use. Corrective feedback on
writing can be delivered orally (e.g., in one-​on-​one conferences) and/​or in written form. However,
it is written corrective feedback (WCF) that has received much research attention in the field of
L2 writing in the past 25 years or so. This growing body of research (see review in Bitchener,
2019; Bitchener & Storch, 2016) seems to support the stance adopted by recognized experts on
WCF (e.g., Bitchener, 2012, 2017; Ferris & Kurzer, 2019) that WCF has the potential
to contribute to language learning and in particular to the development of grammatical accuracy.
However, there is also an acknowledgment among researchers that whether and how learners attend
to the feedback is, as Han and Hyland (2015, p. 31) put it, “a critical link that connects the provision
of WCF with learning outcomes.”
Writing and corrective feedback processing activities are generally completed by the
learner individually. However, such activities can also be completed collaboratively, in pairs or small
groups. The main aim of this chapter is to critically review the theoretical basis for implementing
writing and feedback processing in the collaborative condition.
The chapter begins with a discussion of what collaborative writing means and how this activity
has been extended to include online collaboration and collaborative processing of expert and peer
feedback. A discussion of the theoretical rationale for implementing these collaborative activities
follows. Although there are a number of theories that attempt to explain L2 learning processes (see
Gass & Mackey, 2012; Ortega, 2009), the two that have informed much research in SLA and that
have been identified as being of most relevance to discussions of writing and WCF are cognitive
and sociocultural perspectives (see Bitchener, 2019; Polio, 2012; Manchón, 2011; Manchón &
Vasylets, 2019; Williams, 2012). However, these discussions focus predominantly on cognitive
perspectives and on individual writing and feedback processing (as discussed in Chapter 2, this
volume). In this chapter, it is sociocultural perspectives and collaborative writing and feedback
processing that receive prominence. Thus, although I begin by discussing key cognitive theories,
or rather hypotheses, this discussion is brief. I note the relevance of these hypotheses to individual
writing and WCF and then discuss their relevance to collaborative writing and WCF processing.
The discussion then moves to sociocultural perspectives, and specifically Vygotsky’s (1978, 1986)
sociocultural theory (SCT), and the key tenets of the theory that are of relevance to collaborative writing and feedback processing activities. This discussion also identifies areas requiring further empirical investigation and constructs in SCT that need further explication.

Collaborative Writing and Feedback Processing


The term collaborative writing describes a writing activity where two or more authors create one
joint text. In mainstream composition classes, the main rationale for implementing joint writing
tasks is to prepare students for the kind of multi-​authored writing projects in the workplace (Ede &
Lunsford, 1990) as well as to expose them to different ways of expressing ideas and different
writing strategies (Louth, McAllister, & McAllister, 1993). In L2 writing classes, particularly uni-
versity level L2 writing for academic or professional purposes, a similar case can be mounted. In
other words, joint L2 writing tasks can be viewed as “learning-​to-​write” activities. However, an
additional rationale for implementing collaborative writing with L2 learners in instructed second
language acquisition (SLA) contexts is that of “writing-​to-​learn.” The argument I have put forward
(see Storch, 2013, 2016, 2017a, b) is that collaborative writing tasks, if appropriately designed,
implemented and monitored to ensure that learners share a sense of collective ownership and
responsibility for the creation of their joint text, may provide L2 learners with more language
learning opportunities than individual writing.
The use of collaborative writing has become more widespread in L2 writing classes, as evidenced
by the growing number of studies reporting on the implementation of such tasks in a range of
L2 learning contexts, with different student populations and languages. Thus, in addition to many
studies conducted with adult learners in university ESL or EFL classes (see review in Storch, 2013,
2016, 2017a, 2019a) there are now also studies reporting on implementing collaborative writing
activities with primary school EFL learners of relatively low L2 proficiency (e.g., Calzada &
García-​Mayo, 2020; García-​Mayo & Imaz Aguirre, 2019) and with learners of languages other than
English (e.g., Masuda & Iwasaki, 2018), including heritage language learners (e.g., Fernández-​
Dobao, 2020). Moreover, collaborative writing has extended beyond face-​to-​face collaboration,
with a growing number of studies reporting on computer-​mediated collaborative writing tasks in
different parts of the world (e.g., Alghasab, Hardman, & Handley, 2019; Aufa & Storch, 2021; Liu,
Liu, & Liu, 2018). This body of research is likely to continue to grow in line with rapid advances
in technology and the ease of using freely available online collaborative writing platforms such as
wikis or Google Docs. These platforms are said to enhance collaboration because they enable the
co-​authors to make direct changes to the evolving joint text (Goodwin-​Jones, 2003).

Collaborative writing can be augmented with collaborative processing of feedback that the
co-​authors receive on their joint text. This feedback can be in the form of direct or indirect WCF
(e.g., Storch & Wigglesworth, 2010) or as a reformulated version of the students’ text (e.g., Brooks
& Swain, 2009; Tocalli-​Beller & Swain, 2005). Although reformulations tend to be provided by an
expert (teacher, researcher), WCF can be provided by experts as well as by fellow L2 peers (e.g.,
Alshuraidah & Storch, 2019).

Theoretical Support for Collaborative Writing and Processing of Feedback


As mentioned earlier, cognitive and sociocultural perspectives are most commonly drawn on to
suggest that the act of writing (e.g., Manchón, 2011; Williams, 2012) and of processing feedback
(e.g., Bitchener & Storch, 2016; Polio, 2012) can result in L2 learning. Cognitive perspectives
include reference to key cognitive processes (e.g., noticing, hypothesis formulation and testing),
mental faculties (e.g., working memory), and models showing the sequence of processes that
account for language learning (e.g., Gass, 1988; Leow, 2015). What tends to be highlighted in
these perspectives is the role and type of attention learners pay to language input, drawing mostly
on Schmidt’s (1990, 1993, 1995) Noticing Hypothesis. Another influential hypothesis is Swain’s
(1985, 1995, 1998) Output Hypothesis and its claims about language production (output) being
a more effective trigger of noticing than exposure to language (input). In what follows, I discuss
these two hypotheses briefly (for a fuller discussion of cognitive perspectives see Chapter 2 by
Leow & Suh) and then suggest that the collaborative condition of writing and feedback processing
may create more conducive conditions for noticing than the individual condition. The discussion
of sociocultural perspectives follows. I consider the main tenets of sociocultural theory and the
two key constructs of most relevance to collaborative writing and feedback processing. These two
constructs are the kind of assistance provided to novices (learners) during interaction, captured
by the constructs of Zone of Proximal Development (ZPD, see below for further elaboration) and
scaffolding, and the role of language as a mediating tool in language learning, captured by the term
languaging (Swain, 2000, 2006, 2010). I explain how each of these constructs underpins collabora-
tive writing and feedback processing.

Cognitive Theoretical Perspectives


A number of models have been put forward to explain the sequence of cognitive processes leading
to L2 acquisition (e.g., see Gass, 1988; Leow, 2015; Leow & Suh, this volume), and these have
then been used to explain how writing, and in particular WCF (see Bitchener, 2019), can lead to L2
learning. The initial stages in these models involve input processing, which entails noticing.

The Noticing Hypothesis


The importance of noticing for language learning first appeared in the work of Schmidt and Frota (1986). In their "noticing the gap" principle, the authors state that "a second language learner will begin to acquire the target-like form if and only if it is present in comprehended input and 'noticed' in the normal sense of the word, that is consciously" (p. 311). In other words, the learner needs
to notice not only the target language form but also the gap or difference between the form in the
target language and in their own interlanguage. In his subsequent work, Schmidt (1990, 1993,
1995) articulated the Noticing Hypothesis, positing that in order to learn any aspect of the L2 (e.g.,
vocabulary, grammar, phonology), the learner needs to notice it in the input. Schmidt (1993, 1995)
also distinguished between noticing and a higher level of awareness (the level of understanding).
The claim made was that noticing is sufficient to convert input into intake for further processing
and may result in item learning (e.g., a new word). On the other hand, it is awareness at the level of
understanding or “recognition of a general principle, rule, or pattern” (Schmidt, 1995, p. 29) that is
required for a restructuring of the learner’s existing knowledge, that is, system learning.
However, it is important to note that for Schmidt (1993, 1995) noticing occurs only when
learners are exposed to input. When composing a text individually, learners are not exposed to input. Instead, they draw on their existing linguistic knowledge, as documented in studies using think-aloud protocols to investigate composing processes (e.g., Cumming, 1990; Manchón, Roca de
Larios, & Murphy, 2009). Thus Schmidt’s Noticing Hypothesis cannot explain how writing per se
can promote L2 learning.
However, the hypothesis can be used to explain the potential contribution of WCF to L2 learning,
because WCF is a form of input. For example, in Bitchener’s (2019, p. 90) model of the cogni-
tive processing stages of WCF, the initial stages include attention to the feedback and the learner
noticing the gap between their output and the WCF input. The model, however, seems to assume
that WCF is in the form of direct feedback (i.e., the correct form is provided), as indirect WCF
signals an error but does not provide target-​like forms. The learner’s ability to resolve an error
identified via indirect WCF is based on the presumption that the learner has the knowledge and
can draw on it to correct the error (unless the learner consults additional resources). The important
role played by the learner’s existing L2 knowledge in L2 learning is discussed in Swain’s Output
Hypothesis.

The Output Hypothesis


Swain’s (1993, 1995, 1998) Output Hypothesis posits that output (as speaking or writing) can pro-
mote noticing because it pushes learners to process language more deeply than exposure to input.
Indeed, originally Swain (1985) termed her hypothesis the Comprehensible Output Hypothesis.
Swain identified three important functions that output can serve. One function is that of “noticing
the gap.” However, the sense in which the term “gap” is used in the Output Hypothesis is different
from the way it was used by Schmidt and Frota (1986). In the Output Hypothesis it refers to the
learner noticing a linguistic inadequacy. Using think-​aloud protocols of grade 8 French immersion
students composing a text, Swain and Lapkin (1995) provided evidence of such inadequacies; that
is, the learners becoming aware of a lack of vocabulary or knowledge of grammatical forms that
they feel are needed to express their intended meaning better or more accurately. Swain and Lapkin
refer to this kind of noticing as “noticing a hole.” The claim made is that this kind of noticing may
encourage the learner to pay more careful attention to subsequent relevant input (Swain, 1998),
but in the absence of such input, it may prompt the learner to draw on their existing L2 know-
ledge and engage in hypothesis formulation and testing. This is the second function of output, and it involves the learner sounding out and evaluating various alternatives, thus testing hypotheses against existing knowledge. The third function of output is metalinguistic: using language to reflect
on language use. Swain (1995, 1998) suggests that such reflections may serve to deepen learners’
understanding of form-​function relationships.
Thus the claims made by the Output Hypothesis are that output is not just a product but also a
process that creates opportunities for L2 learning. The cognitive processes that follow “noticing a
hole” whilst composing may eventually result in the learner generating linguistic knowledge that
is new or that serves to consolidate the learner’s existing knowledge. However, what appears as
new knowledge is reprocessed existing knowledge, and this type of knowledge does not align with
what is considered new knowledge from a cognitive perspective (see VanPatten, 2007). Moreover,
the few studies that have empirically investigated the effect of output on noticing elements in sub-
sequent input (e.g., Adams, 2003; Izumi & Bigelow, 2000; Izumi, Bigelow, Fujiwara, & Fearnow,
1999; Hanaoka, 2007) have yielded some mixed findings.
Although Swain argued that both oral and written output can serve these important language
learning functions, the research she conducted with her colleagues used writing rather than speaking
tasks. Writing facilitates a greater focus on language than speaking (e.g., Adams & Ross-​Feldman,
2008; García-​Mayo & Aguirre, 2019; Niu, 2009). The slower pace and permanence of writing
mean that learners are more likely to notice holes and then have time to draw on their existing lin-
guistic resources, to reflect on their language use, and to pay close attention to subsequent feedback.
Furthermore, although initial studies (e.g., Swain & Lapkin, 1995) used individual writing tasks
and think-​aloud protocols, subsequent studies began to use collaborative writing tasks (e.g., Kowal
& Swain, 1997; Swain, 1998; Swain & Lapkin, 1998; Watanabe & Swain, 2007).

The Output Hypothesis and Collaborative Writing


The rationale Swain (1998) gave for using collaborative writing tasks in her studies was that they
were communicative, providing learners with opportunities to reflect on their writing. However, it became apparent from the data the researchers collected (see Swain, 1998; Swain & Lapkin,
1998) that the advantages of collaborative writing tasks extended beyond opportunities to reflect
on language. The data showed that when writing collaboratively, noticing a hole can elicit imme-
diate feedback from a peer rather than being delayed, as is usually the case with teacher feedback.
Furthermore, when testing hypotheses, learners need not rely solely on their own existing know-
ledge but can draw on a larger pool of knowledge: that of their co-authors. The units of ana-
lysis used by Swain and colleagues in these studies were “Language Related Episodes” (LREs),
defined initially as instances where learners spoke about a language problem they encountered and
attempted to resolve (Swain & Lapkin, 1995). A later definition of LREs took into consideration
the presence of peer feedback in joint writing activities, and thus LREs included instances where
learners self- or other-corrected (Swain, 1998).
Thus, from a cognitive perspective, collaborative writing tasks provide learners with oppor-
tunities to receive exposure to new input during the writing process, when learners identify
holes or differences (gaps) between their respective interlanguage systems (i.e., when another-​
correction is offered), and this may lead to acquiring new knowledge. Any ensuing negotiations
may include explanations which can promote higher levels of noticing –​noticing as awareness with
understanding. Such negotiations have been documented in a number of studies conducted with
adult learners in different instructional contexts (see review in Storch, 2013, 2016, 2017a) as
well as younger L2 learners (e.g., Calzada & García-​Mayo, 2020; Swain & Lapkin, 2002).
Admittedly, lower proficiency learners may not be able to resolve all the concerns that may arise
during the writing phase (e.g., Leeser, 2004; Storch & Aldosari, 2013), nor have the metalanguage
to reflect on language use (e.g., Calzada & García-​Mayo, 2020). This is why it is important to con-
sider task type and the basis for grouping students when implementing collaborative writing tasks
(see Storch, 2013, 2017b for a discussion).
Augmenting a collaborative writing activity with collaborative processing of feedback
received on the text produced provides learners with additional language learning opportunities.
For example, Brooks and Swain’s (2009) study showed the cognitive processes that occur when
learners receive a reformulated version of their joint text, such as comparing their output and the
reformulated input and deliberating together about any gaps that they notice. Other studies provided
evidence of learners discussing the WCF on their joint text, provided by a native speaker expert
(e.g., Storch & Wigglesworth, 2010; Wigglesworth & Storch, 2012) or fellow L2 peers (e.g., Storch
& Alshuraidah, 2020). The studies show that learners deliberate about the source of their errors,
draw on their existing L2 knowledge to consider alternative ways of correcting their errors, and articulate their understanding of grammatical rules and conventions. However, to date there have been no studies that have compared the depth of processing of WCF in the individual and collaborative conditions (although see Manchón, Nicolás-Conesa, Cerezo, & Criado, 2020).
In 2000, Swain abandoned her cognitively-oriented Output Hypothesis, and her work since then has been influenced by sociocultural theory. Sociocultural theory provides a very different perspective on language learning from that of cognitive theories. It views the learner as a social being and focuses on interaction and dialogues, where cognitive development is said to take place.

Sociocultural Theory (SCT)


Sociocultural theory (SCT) is based on the work of Vygotsky (1978, 1986) and in essence it is
a psychological theory of human cognitive development. The theory postulates how biologically
endowed cognitive capacities (e.g., involuntary attention) develop to become uniquely human,
higher order cognitive capacities (e.g., voluntary attention). The development of language (both first
and subsequent languages) is considered a higher order capacity and hence SCT has been employed
by an increasing number of scholars to inform research on L2 learning and testing. Although some
investigations of individual L2 writing and feedback processing (e.g., Haneda, 2007; Swain,
Kinnear, & Steinman, 2015) have been informed by SCT and Activity Theory (a theory that has its
genesis in SCT), most of the research on collaboration in L2 writing is informed by these
sociocultural perspectives (for a review see Storch, 2013, 2016, 2017a, 2019a, b).
The underlying premise of Vygotsky's (1978) SCT is that the development of complex human cognitive functions originates in, and is shaped by, purposeful social interaction and is mediated by material and/or symbolic tools or means. This social interaction occurs between an expert (a
more knowledgeable member of the community) and a novice. In the same way that noticing is
highlighted in cognitive theories of SLA as key to language learning, in SCT the key ingredient that
drives cognitive development is appropriate assistance. Not all forms of assistance are necessarily
effective (see Storch, 2019b for a more detailed discussion).

Effective Assistance: ZPD and Scaffolding


According to Vygotsky (1978), effective assistance needs to be contingently responsive not only
to the novice’s current state of knowledge, but more importantly, to the novice’s potential state of
knowledge, as gauged by the novice’s ability to take advantage of the assistance offered. The dis-
tance between these two states of knowledge is referred to as the Zone of Proximal Development
(ZPD). Effective assistance is also by definition dynamic rather than a “fixed treatment,” aligned
with the novice’s changing developmental needs (Lantolf & Poehner, 2014). What this means is
that the assistance provided by the expert needs to be guided by the actions and utterances of the
novice, with both expert and novice working collectively towards achieving a shared object –​the
cognitive development of the novice. In this sense, ZPD is best viewed as a collective activity.
This temporary, finely tuned form of assistance is referred to in the literature as scaffolding (Wood,
Bruner, & Ross, 1976).
Vygotsky’s SCT and research focused primarily on children’s cognitive development (see dis-
cussion in Lantolf & Thorne, 2006). However, because the theory establishes learning as a funda-
mentally social experience, it can be used as the rationale for the use of interaction in all learning,
including adult L2 learning. These interactions can take place between an expert (e.g., teacher,
native speaker) and the novice learner as well as between novices (i.e., peers). It is the latter that is
of particular relevance here.
As mentioned in my earlier discussion of the Output Hypothesis, one of the advantages of collab-
orative writing is the availability of immediate assistance from peers to address holes in knowledge
learners discover in the process of writing. Studies have shown that this assistance is immediate,
contingently responsive, and developmentally appropriate (see review in Storch, 2019b). In other
words, it has the attributes of scaffolded feedback. Furthermore, in the collaborative condition,
learners can also pool their linguistic knowledge. In L2 classes, although learners may all be
classified on the basis of proficiency tests as being at a certain proficiency level, they may in fact
have different pockets of L2 knowledge and expertise. The process of pooling and building on each
other's knowledge to co-construct new knowledge is a form of mutual assistance, which Donato (1994) labeled "collective scaffolding." The outcome may be the creation of co-constructed
knowledge that is new; that is, knowledge that was not held by the individual learners previously.
Donato also found some evidence that the knowledge learners co-​constructed during interactions
was internalized and used later by the learners in independently completed tasks. Donato’s study
was conducted with adult learners completing oral group tasks. Similar findings were reported in
my own study (Storch, 2002) with adult learners completing collaborative writing tasks. The study
found that the pairs formed distinct patterns of interaction (see Storch, 2001, 2002, 2013) and
that collective scaffolding occurred mainly in pairs that formed a collaborative relationship, with
participants contributing and engaging with each other’s contribution to the task. In other words,
simply assigning learners to work jointly on a writing task does not guarantee that they will collab-
orate, offer appropriate assistance to each other, or engage in collective scaffolding.
A growing number of studies have set out to investigate the factors that may impact on the
relationships learners form when co-​authoring texts, including learner-​related factors, such as
relative proficiency (e.g., Watanabe & Swain, 2007), and context-​related factors, such as mode of
communication (e.g., Bikowski & Vithanage, 2016; Cho, 2017; Rouhshad & Storch, 2016). For
example, Rouhshad and Storch (2016), who compared face-​to-​face and computer-​mediated col-
laborative writing, found that learners are more likely to cooperate rather than collaborate in the
computer-​mediated mode. When cooperating, the process of construction tends to be one where
each author adds to the evolving text, and thus the joint text is an aggregate of individual texts
rather than a co-​constructed text. Furthermore, the computer-​mediated mode impacts on learners’
attention to language and sense of text ownership, and consequently on language learning oppor-
tunities. Learners seem to pay less attention to language use when composing joint texts online than face-to-face, as evidenced by the lower number of LREs generated (Rouhshad & Storch, 2016).
Furthermore, there is less evidence of deliberations and collective scaffolding in the online mode.
Learners seem more likely to amend their own contributions to the joint text than their co-​authors’
contributions (e.g., Cho, 2017; Elola & Oskoz, 2010; Li & Zhu, 2017), suggesting that mode of
communication has an impact on learners’ sense of text ownership (see discussion in Storch, 2017c,
2019b).
A construct which seems to capture these individual and context-​related variables and their
impact on learner behavior is learner agency. Learners’ behavior observed in joint writing tasks and
the relationships they form are examples of learners’ enactment of their agency. Agency is a rela-
tively recent term in SLA and L2 research (see Deters, Gao, Miller, & Vitanova, 2015), yet clearly
one whose importance has been recognized (Douglas Fir Group, 2016). Agency is generally under-
stood as a learner’s ability to act in pursuit of fulfilling individual goals (Duff, 2012). However,
from a sociocultural perspective (see Lantolf & Thorne, 2006; Pavlenko & Lantolf, 2000), this
ability is viewed not as a property of the individual, but as relational (interpersonal) and very much
situated, with power hierarchies, expected norms of behavior, and available tools making only
certain choices possible in a particular context and time. These key elements explaining human
behavior in an activity are captured in Activity Theory. The theory has its roots in Vygotsky’s
(1978) work but has been elaborated by others, and particularly by Engeström (1987, 2001) (for a
more detailed discussion see Bitchener & Storch, 2016). Although the impact of mediating individual and some contextual variables is acknowledged in cognitive perspectives on writing and
feedback processing (e.g., Bitchener, 2019; Kormos, 2012), what distinguishes Activity Theory is
that it provides a framework whereby individual and context-​related variables are treated as a whole
system, rather than a loose set of variables.
A relatively small number of studies have used Activity Theory (e.g., Aufa & Storch, 2021;
Storch, 2004) and the construct of agency (e.g., Li & Zhu, 2017; Pu, 2020) to explore and explain
the nature of learners’ behavior in collaborative writing activities. Furthermore, these studies
have focused mainly on learners’ goals and orientation to the activity. Future investigations of
collaborative writing and feedback processing could explore, for example, how other elements in
the activity, such as the power hierarchies within the classroom and expected norms of behavior in
face-to-face and online collaboration, impinge on learners' ability to exercise their agency in collab-
orative writing activities.
Another important dimension of agency is emotions. Vygotsky (1986) makes innumerable references
to the inseparability of cognition and emotion, as do a number of sociocultural theorists (e.g., Lantolf &
Poehner, 2014; Lantolf & Thorne, 2006; Swain, 2013). The original definition of scaffolding offered by
Wood et al. (1976) also noted that one of the functions of scaffolding is to control feelings such as frus-
tration. Yet, although a small number of studies investigating collaborative writing note that learners
seem to experience a range of emotions during these activities, such as pleasure, antagonism, or fear of
losing face (see Li & Zhu, 2017; Storch, 2004), emotions are conspicuously absent in most discussions
and research on scaffolding and ZPD. Learners’ emotions may also impact on whether new knowledge
provided by peers during an interaction is appropriated (see Imai, 2010). I return to the construct of
agency and emotions when discussing learners’ processing of WCF from a sociocultural perspective.

Language as a Tool: Languaging


What enables experts to scaffold the performance of novices, or peers to engage in collective scaffolding, is language. In SCT, language is viewed as a psychological tool that mediates development. This important role of language is captured in the construct of "languaging," used first by
Swain in 2006 to describe the kind of reflections and dialogues that occur when learners produce
output. Swain (2006, p. 98) defines languaging as “the process of making meaning and shaping
knowledge and experience through language." In other words, languaging is the use of language as a psychological tool that enables learners to think through a problem. It occurs, for example, when
learners encounter a problem such as a hole in their knowledge, a difference between two ways of
expressing an idea (gap) or when they try to understand a complex idea or rule (for an extended
discussion of languaging, see Suzuki & Storch, 2020). In the L2 learning domain, language has a dual role: it mediates learning, and the knowledge acquired is itself new language. Drawing on SCT,
Swain (2006, 2010) distinguished between two forms of languaging: self-​directed (private speech)
and other-​directed (collaborative dialogue).
When completing a language task individually, such as a composition, learners have been found
to engage in the private speech form of languaging, talking themselves through a problem they
encounter. The deliberations captured by think-​aloud protocols, mentioned earlier, when learners
noticed holes in their knowledge during the process of writing, are examples of the self-​directed
form of languaging. Verbalizing thoughts creates an audible artefact that can be analyzed further.
Although self-​directed talk does not guarantee that a problem will be solved (Lantolf, 2005), it may
enable the learners to talk themselves into understanding something that was previously not under-
stood; that is, it may strengthen existing knowledge.
Other-​directed talk or collaborative dialogue, as the name suggests, is languaging that occurs
in interaction with others. During collaborative writing and processing of WCF, both forms of
languaging can emerge. Self-​directed languaging in the presence of others makes the speaker’s
thinking and deliberations accessible to others and can evolve into collaborative dialogue as the
co-​authors offer suggestions, explanations, or confirmations. Furthermore, a sense of joint owner-
ship of the text may encourage collaborative dialogue. For example, if disagreements arise about
language use, learners may feel compelled to offer explanations to justify their suggestions and con-
sider counter-​suggestions. In this process of resolving disagreements or cognitive conflicts (Tocalli-​
Beller & Swain, 2005), learners construct a clearer representation of their own knowledge (van
Lier, 1996).
Similarly, when processing WCF that co-​authors receive on their completed texts from their
teacher or peers, learners have been shown to engage in collaborative dialogue to try to understand
the intended meaning of the feedback, or to compare and evaluate a reformulated form with their
own existing language knowledge (see Storch & Wigglesworth, 2010; Swain & Lapkin, 2002;
Wigglesworth & Storch, 2012). However, these studies have also shown that deeper engagement
with the WCF does not necessarily mean that the feedback will be accepted, and even if accepted
and incorporated in revised drafts, it may not lead to learning. For example, the study by Storch and Wigglesworth (2010) found that in instances where the learners rejected teacher feedback, they often
incorporated that feedback in their revised version, perhaps due to context-​related expectations and
power hierarchies, but they then reverted to the use of the same erroneous structures in their subse-
quent individual writing. Whether learners accept or reject WCF represents an enactment of their
agency.
In line with more recent research on individual learners’ response to teacher WCF that presents a
view of learners not primarily as responders to feedback but as active agents (e.g., Zhang & Hyland,
2018; Zheng & Yu, 2018), including emerging research on the emotions that WCF evokes (e.g.,
Mahfoodh, 2017), research on collaborative processing of WCF also needs to explore agency and emotions more fully. Collaborative processing of feedback cannot be fully understood unless the
co-​authors’ agency and emotions are taken into consideration.
To conclude this theoretical overview, from a sociocultural perspective, collaborative writing and
the processing of feedback may provide optimal conditions for language learning. Unlike solitary
writing and processing of feedback, performing these tasks collaboratively provides learners with
opportunities to engage in collaborative dialogue, during which they receive timely peer assistance
to resolve any difficulties they encounter or gaps that they notice. The collaborative condition also
enables the learners to pool their partial linguistic resources and co-​construct new knowledge or
consolidate existing knowledge. This co-​constructed knowledge can then be internalized by the
learners for their own use.
Internalization is the final stage of development. To date, however, the term internalization
remains contested (see discussion in Lantolf & Thorne, 2006) and its exact meaning rather elu-
sive. This is despite attempts by Lantolf, the leading interpreter of Vygotsky’s work in the West, to
explicate the term. According to Lantolf and Thorne (2006) internalization refers to a process which
enables humans to control and regulate their biologically endowed capacities in order to perform
increasingly more complex functions independently. What enables this development is “the appro-
priation of the regulatory means employed by others" (Lantolf, 2000, p. 14), and the mechanisms that enable this development are imitation as well as self- and other-directed forms of languaging.
However, the exact meaning of appropriation is not clear. It is often used interchangeably with
internalization, even by Lantolf (2000), yet Lantolf and Thorne (2006) argue that the two terms
are fundamentally different. The explanations they offer, however, do not make this distinction
very clear.

Concluding Remarks
Collaborative writing is a challenging activity, more so than individual writing, because learners
need to negotiate a joint text. However, the activity provides opportunities for learners to address
any noticed problems (both holes and gaps) as well as the means to address them via peer feedback
available throughout the writing process. The peer feedback is immediate and contingently respon-
sive to their needs and thus represents effective assistance. There are also opportunities to pool
linguistic resources and co-​construct new knowledge. The collaborative processing of feedback
learners receive on their joint texts provides additional opportunities for deliberations about lan-
guage, and thus L2 learning.
Cognitive theories of SLA, focusing as they do on hypothesized mental processes, tend to ignore
the potential impact of non-​cognitive factors on these processes. However, learners, as Atkinson
and Tardy (2018, p. 91) correctly point out, are not merely “input processors,” but human beings
who are “living-​thinking-​feeling-​valuing” and whose choice of actions is mediated by the context
in which an activity takes place. Learner agency, including the role of emotions, is clearly an area
requiring further investigation, particularly if we are to gain a better understanding of learners’
engagement in joint activities such as collaborative writing and processing of feedback, where
humans interact with each other.

References
Adams, R. (2003). L2 output, reformulation, and noticing. Implications for IL development. Language
Teaching Research, 7(3), 347–​376.
Adams, R. & Ross-​Feldman, L. (2008). Does writing influence learner attention to form? In D. Belcher &
A. Hirvela (Eds.), The oral-​literate connection (pp. 243–​266). Ann Arbor: University of Michigan Press.
Alghasab, M., Hardman, J., & Handley, Z. (2019). Teacher-​student interaction in wikis: Fostering collaborative
learning and writing. Learning, Culture and Social Interaction, 21, 10–​20.
Alshuraidah, A. & Storch, N. (2019). Investigating a collaborative approach to peer feedback. ELT Journal,
73(2), 166–​174. doi.org/​10.1093/​elt/​ccy057
Atkinson, D. & Tardy, C. (2018). Disciplinary dialogues. SLW at the crossroads: Finding a way in the field.
Journal of Second Language Writing, 42, 86–​93. https://​doi.org/​10.1016/​j.jslw.2018.10.011
Aufa, F. & Storch, N. (2021). L2 learner interaction in blended collaborative writing activities. In M.P. García-​
Mayo (Ed.), Working collaboratively in second/​foreign language learning (pp. 13–​34). Berlin: De Gruyter
Mouton.
Bikowski, D. & Vithanage, R. (2016). Effects of web-​based collaborative writing on individual writing devel-
opment. Language Learning & Technology, 20(1), 79–​99.
Bitchener, J. (2012). A reflection on the language learning potential of written CF. Journal of Second Language
Writing, 21, 348–​363.
Bitchener, J. (2017). Why some L2 learners fail to benefit from written corrective feedback. In H. Nassaji
& E. Kartchava (Eds), Corrective feedback in second language teaching and learning. Research, theory,
applications, implications (pp. 129–​140). New York: Routledge.
Bitchener, J. (2019). The intersection between SLA and feedback research. In K. Hyland & F. Hyland (Eds.),
Feedback in second language writing. Contexts and issues (2nd ed.) (pp. 85–​105). Cambridge: Cambridge
University Press.
Bitchener, J. & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Brooks, L. & Swain, M. (2009). Languaging in collaborative writing: Creation and response to expertise. In A.
Mackey & C. Polio (Eds.), Multiple perspectives on interaction in SLA (pp. 58–​89). Mahwah, NJ: Lawrence
Erlbaum.
Calzada, A. & García-​Mayo, M.P. (2020). Child EFL grammar learning through a collaborative writing task.
In W. Suzuki & N. Storch (Eds.), Languaging in language learning and teaching: A collection of empirical
studies (pp. 20–​39). Amsterdam: John Benjamins.
Cho, H. (2017). Synchronous web-​based collaborative writing: Factors mediating interaction among second-​
language writers. Journal of Second Language Writing, 36, 37–​51.
Cumming, A. (1990). Metalinguistic and ideational thinking in second language composing. Written
Communication, 7(4), 482–​511.
Deters, P., Gao, X., Miller, E.R., & Vitanova, G. (Eds.). (2015). Theorizing and analysing agency in second
language learning. Bristol: Multilingual Matters.
Donato, R. (1994). Collective scaffolding in second language learning. In J.P. Lantolf & G. Appel (Eds.),
Vygotskian approaches to second language research (pp. 33–​56). Norwood, NJ: Ablex.
Douglas Fir Group. (2016). A transdisciplinary framework for SLA in a multilingual world. Modern Language
Journal, 100 (Supplement 2016), 19–​47.
Duff, P. (2012). Issues of identity. In A. Mackey & S. Gass (Eds.), The Routledge Handbook of second lan-
guage acquisition (pp. 410–​426). London: Routledge.
Ede, L. & Lunsford, A. (1990). Singular texts/​plural authors. Carbondale: Southern Illinois University Press.
Elola, I. & Oskoz, A. (2010). Collaborative writing: Fostering foreign language writing conventions develop-
ment. Language Learning & Technology, 14(3), 51–​71.
Engeström, Y. (1987). Learning by expanding: An activity theoretical approach to developmental research.
Helsinki: Orienta-Konsultit.
Engeström, Y. (2001). Expansive learning at work: Toward an activity theoretical conceptualization. Journal
of Education and Work, 14, 133–​156.
Fernández-​Dobao, A. (2020). Exploring interaction between heritage and second language learners in the
Spanish language classroom: Opportunities for collaborative dialogue and learning. In W. Suzuki & N.
Storch (Eds.), Languaging in language learning and teaching: A collection of empirical studies (pp. 92–​
110). Amsterdam: John Benjamins.
Ferris, D. & Kurzer, K. (2019). Does error feedback help L2 writers? Latest evidence on the efficacy of written
corrective feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing. Contexts and
issues (2nd ed.) (pp. 106–​124). Cambridge: Cambridge University Press.
García Mayo, M.P. & Imaz Agirre, A. (2019). Task modality and pair formation method: Their impact on
patterns of interaction and LREs among EFL primary school children. System, 80, 165–​175. doi.org/​
10.1016/​j.system.2018.11.011
Gass, S. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics, 9(2),
198–217.
Gass, S. & Mackey, A. (Eds.). (2012). The Routledge handbook of second language acquisition. New York:
Routledge.
Godwin-​Jones, R. (2003). Emerging technologies. Blogs and wikis: Environments for online collaboration.
Language Learning and Technology, 7(2), 12–​16.
Han, Y. & Hyland, F. (2015). Exploring learner engagement with written corrective feedback in a Chinese ter-
tiary EFL classroom. Journal of Second Language Writing, 30, 31–​44.
Hanaoka, O. (2007). Output, noticing, and learning: An investigation into the role of spontaneous attention to
form in a four-​stage writing task. Language Teaching Research, 11(4), 459–​479.
Haneda, M. (2007). Modes of engagement in foreign language writing: An activity theoretical perspective. The
Canadian Modern Language Review, 64(2), 301–​332.
Harklau, L. (2002). The role of writing in classroom second language acquisition. Journal of Second Language
Writing, 11(4), 329–​350.
Imai, Y. (2010). Emotions in SLA: New insights from collaborative learning for an EFL classroom. The
Modern Language Journal, 94(2), 278–​292.
Izumi, S. & Bigelow, M. (2000). Does output promote noticing and second language acquisition? TESOL
Quarterly, 34(2), 239–​278.
Izumi, S., Bigelow, M., Fujiwara, M., & Fearnow, S. (1999). Testing the output hypothesis: Effects of output on
noticing and second language acquisition. Studies in Second Language Acquisition, 21, 421–​452.
Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing,
21, 390–​403.
Kowal, U.M. & Swain, M. (1997). From semantic to syntactic processing: How can we promote it in the
immersion classroom? In R.K. Johnson & M. Swain (Eds.), Immersion education: International perspectives
(pp. 284–​309). Cambridge: Cambridge University Press.
Lantolf, J.P. (2000). Introducing sociocultural theory. In J. Lantolf (Ed.), Sociocultural theory and second lan-
guage learning (pp. 1–​26). Oxford: Oxford University Press.
Lantolf, J.P. (2005). Sociocultural theory and L2 learning: An exegesis. In E. Hinkel (Ed.), Handbook of
research in second language teaching and learning (pp. 335–​354). Mahwah, NJ: Erlbaum.
Lantolf, J.P. & Poehner, M.E. (2014). Sociocultural theory and the pedagogical imperative in L2 education.
Vygotskian praxis and research/​practice divide. New York: Routledge.
Lantolf, J.P. & Thorne, S. (2006). The sociogenesis of second language development. Oxford: Oxford
University Press.
Leeser, M.J. (2004). Learner proficiency and focus on form during collaborative dialogue. Language Teaching
Research, 8(1), 55–​81.
Leow, R.P. (2015). Explicit learning in L2 classroom: A student–​centred approach. New York: Routledge.
Li, M. & Zhu, W. (2017). Explaining dynamic interactions in wiki-​based collaborative writing. Language
Learning & Technology, 21(2), 96–​120.
Liu, M., Liu, L., & Liu, L. (2018). Group awareness increases student engagement in online collaborative
writing. The Internet in Higher Education, 38, 1–​8.
Louth, R., McAllister, C., & McAllister, H.A. (1993). The effects of collaborative writing techniques on
freshmen writing and attitudes. The Journal of Experimental Education, 61(3), 215–​224.
Mahfoodh, O.H.A. (2017). “I feel disappointed”: EFL university students’ emotional response towards teacher
written feedback. Assessing Writing, 31, 53–​72.
Manchón, R. (2011). Writing to learn the language: Issues in theory and research. In R. Manchón (Ed.),
Learning-​to-​write and writing-​to-​learn in an additional language (pp. 61–​84). Amsterdam: John Benjamins.
Manchón, R., Nicolás-​Conesa, F., Cerezo, L., & Criado, R. (2020). L2 writers’ processing of written corrective
feedback: Depth of processing via written languaging. In W. Suzuki & N. Storch (Eds.), Languaging in lan-
guage learning and teaching: A collection of empirical studies (pp. 242–​265). Amsterdam: John Benjamins.
Manchón, R., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem solving nature
of foreign language composing processes. Implications for theory. In R.M. Manchón (Ed.), Writing in
foreign language contexts. Learning, teaching, and research (pp. 102–​129). Clevedon: Multilingual
Matters.
Manchón, R.M. & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Masuda, K., & Iwasaki, N. (2018). Pair-​work dynamics: Stronger learners’ languaging engagement and learning
outcomes for the Japanese polysemous particles ni/​de. Language and Sociocultural Theory, 5, 46–​71.
Niu, R. (2009). Effect of task-​inherent production modes on EFL learners’ focus on form. Language Awareness,
18(3–​4), 384–​402.
Ortega, L. (2009). Understanding second language acquisition. London: Hodder Education.
Pavlenko, A. & Lantolf, J.P. (2000). Second language learning as participation and the (re)construction of selves.
In J.P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 155–​178). Oxford: Oxford
University Press.
Polio, C. (2012). The relevance of second language acquisition theory to the written error correction debate.
Journal of Second Language Writing, 21, 375–​389.
Pu, A.Y-​E. (2020). Student agency in a collaborative writing programme: A sociocognitive perspective.
Unpublished PhD thesis. University of Waikato, New Zealand.
Rouhshad, A. & Storch, N. (2016). A focus on mode: Patterns of interaction in face-​to-​face and computer-​
mediated modes. In S. Ballinger & M. Sato (Eds.), Peer interaction and second language learning (pp.
267–​290). Amsterdam: John Benjamins.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2),
192–​196.
Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13,
206–​226.
Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and
awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 1–
63). Honolulu, HI: National Foreign Language Resource Centre.
Schmidt, R. & Frota, S. (1986). Developing basic conversational ability in a second language: A case study of
an adult learner of Portuguese. In R. Day (Ed.), Talking to learn: Conversation in second language acqui-
sition (pp. 237–​326). Rowley, MA: Newbury House.
Storch, N. (2001). How collaborative is pair work? ESL tertiary students composing in pairs. Language
Teaching Research, 5(1), 29–​53.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52(1), 119–​158.
Storch, N. (2013). Collaborative writing in L2 classrooms. Bristol: Multilingual Matters.
Storch, N. (2016). Collaborative writing. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and
foreign language writing (pp. 387–​406). Boston: de Gruyter Mouton.
Storch, N. (2017a). Sociocultural theory in the L2 classroom. In S. Loewen & M. Sato (Eds.), The Routledge
handbook of instructed second language acquisition (pp. 69–​84). New York: Routledge.
Storch, N. (2017b). Implementing and assessing collaborative writing activities in EAP classes. In J. Bitchener,
N. Storch, & R. Wette (Eds.), Teaching writing for academic purposes to multilingual students: Instructional
approaches (pp. 130–​144). New York: Routledge.
Storch, N. (2017c). Peer corrective feedback in computer-​mediated collaborative writing. In H. Nassaji &
E. Kartchava (Eds.), Corrective feedback in second language teaching and learning. Research, theory,
applications, implications (pp. 66–​79). New York: Routledge.
Storch, N. (2019a). Time line on collaborative writing. Language Teaching, 52(1), 40–59.
Storch, N. (2019b). Collaborative writing as peer feedback. In K. Hyland & F. Hyland (Eds.), Feedback
in second language writing. Contexts and issues (2nd ed.) (pp. 143–162). Cambridge: Cambridge
University Press.
Storch, N. & Aldosari, A. (2013). Pairing learners in pair-​work activity. Language Teaching Research,
17(1), 31–​48.
Storch, N. & Alshuraidah, A. (2020). Languaging when providing and processing peer feedback. In W. Suzuki
& N. Storch (Eds.), Languaging in language learning and teaching: A collection of empirical studies
(pp. 112–​128). Amsterdam: John Benjamins.
Storch, N. & Wigglesworth, G. (2010). Learners’ processing, uptake, and retention of corrective feedback on
writing. Case studies. Studies in Second Language Acquisition 32(2), 303–​334.
Suzuki, W. & Storch, N. (2020). An introduction. In W. Suzuki & N. Storch (Eds.), Languaging in language
learning and teaching: A collection of empirical studies (pp. 2–​15). Amsterdam: John Benjamins.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S.M. Gass & C.G. Madden (Eds.), Input in second language acquisition (pp.
235–​253). Rowley, MA: Newbury House.
Swain, M. (1993). The output hypothesis: Just speaking and writing aren’t enough. The Canadian Modern
Language Review, 50(1), 158–​164.
Swain, M. (1995). Three functions of output in second language learning. In G. Crook and B. Seidlhofer
(Eds.), Principles and practices in applied linguistics: Studies in honour of H. Widdowson (pp. 125–​144).
Oxford: Oxford University Press.
Swain, M. (1998). Focus on form through conscious reflection. In C. Doughty & J. Williams (Eds.), Focus on
form in classroom second language learning (pp. 64–​81). Cambridge: Cambridge University Press.
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative dialogues.
In J. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 97–​114). Oxford: Oxford
University Press.
Swain, M. (2006). Languaging, agency and collaboration in advanced second language learning. In H.
Byrnes (Ed.) Advanced language learning: The contributions of Halliday and Vygotsky (pp. 95–​108).
London: Continuum.
Swain, M. (2010). Talking-​it-​through: Languaging as a source of learning. In R. Batestone (Ed.) Sociocognitive
perspectives on language use and language learning (pp. 112–​130). Oxford: Oxford University Press.
Swain, M. (2013). The inseparability of cognition and emotion in second language learning. Language
Teaching, 46(2), 195–​207.
Swain, M., Kinnear, P., & Steinman, L. (2015). Sociocultural theory in second language education. An intro-
duction through narratives (2nd ed.). Bristol: Multilingual Matters.
Swain, M. & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step towards
second language learning. Applied Linguistics, 16(3), 371–​391.
Swain, M. & Lapkin, S. (1998). Interaction and second language learning: Two adolescent French immersion
students working together. Modern Language Journal, 82, 320–​337.
Swain, M. & Lapkin, S. (2002). Talking it through: Two French immersion students’ response to reformulation.
International Journal of Educational Research, 37(3–​4), 285–​304.
Tocalli-​Beller, A. & Swain, M. (2005). Reformulation: The cognitive conflict and L2 learning it generates.
International Journal of Applied Linguistics, 5(1), 5–​28.
van Lier, L. (1996). Interaction in the language curriculum: Awareness, autonomy and authenticity. London:
Longman.
VanPatten, B. (2007). Input processing in adult second language acquisition. In B. VanPatten & J. Williams
(Eds.), Theories in second language acquisition (pp. 115–​135). Mahwah, NJ: Erlbaum.
Vygotsky, L.S. (1978). Mind in society. The development of higher psychological processes. Cambridge,
MA: Harvard University Press.
Vygotsky, L.S. (1986). Thought and language. Cambridge, MA: MIT Press.
Watanabe, Y. & Swain, M. (2007). Effects of proficiency differences and patterns of pair interaction on second
language learning: Collaborative dialogue between adult ESL learners. Language Teaching Research,
11(2), 121–​142.
Wigglesworth, G. & Storch, N. (2012). Feedback and writing development through collaboration: A socio-
cultural approach. In R.M. Manchón (Ed.), L2 writing development: multiple perspectives (pp. 69–​100).
Berlin: de Gruyter Mouton.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331.
Wood, D., Bruner, J.S. & Ross, G. (1976). The role of tutoring in problem-​solving. Journal of Child Psychology
and Psychiatry, 17(2), 89–​100.
Zhang, Z. & Hyland, K. (2018). Student engagement with teacher and automated feedback on L2 writing.
Assessing Writing, 36, 90–​102.
Zheng, Y. & Yu, S. (2018). Student engagement with teacher written corrective feedback in EFL writing: A case
study of Chinese lower-​proficiency students. Assessing Writing, 37, 13–​24.
PART II

Core Issues
SECTION 1

Tasks and Writing


4
TASK EFFECTS ACROSS
MODALITIES
Olena Vasylets and Roger Gilabert
University of Barcelona

Introduction
In second language learning, L2 learners are typically exposed to a combination of oral and written
tasks, although we can expect the exposure to writing tasks to vary with age, among other factors.
In the last three decades a considerable amount of research has looked into various factors influ-
encing L2 task performance. Driven by frameworks such as Robinson’s Cognition Hypothesis
(Robinson, 2011a) or Skehan’s Limited Capacity Model (Skehan, 2014), researchers explored the
factors related to the inherent cognitive complexity of tasks (e.g., the amount of reasoning required
to perform them or the number of elements involved in the task) and to the nature of the task (e.g.,
monologic vs dialogic or open vs closed), as well as external factors such as planning time or task repe-
tition, which have been extensively researched since the 1990s. The exploration of different task
dimensions and features, however, has been dominated by oral studies (but see Johnson, Chapter 5
this volume, for a review of studies about writing).
Only in the last few years have there been voices (see, for instance, Byrnes & Manchón, 2014a,
b; Gilabert, Manchón, & Vasylets, 2016; Kuiken & Vedder, 2012) advocating that the mode (oral
versus written) in which the task is performed should be taken into consideration for theoretical,
research task design, and practical decision-​making reasons. What role, if any, does mode have
in language learning? Why, if at all, should we be concerned with the role of mode in task-​based
performance? It is a general assumption that both oral and written tasks are necessary for real-​
life performance, yet it is a concern of many task-​based language teaching (TBLT) theorists that
research into tasks has clearly been conceived almost exclusively from the perspective of the oral
mode. But what if the hypotheses and claims made about how L2 learners may perform tasks are
mediated by the mode in which tasks are carried out? This would have important implications for
theory building and would modulate some of the assumptions and claims that can be made about
the effects of design on tasks. In line with previous voices, we would like to claim here that the specific features of both modes should be taken into consideration at the theoretical level (i.e., in theoretical frameworks), in research into task performance (i.e., research that considers mode as a feature in task design), and in practical decision-making in task-based or task-supported
programs, where knowledge about the effects of mode can inform task and syllabus design.
In what follows we provide a short historical perspective that traces where the focus has
been placed in the last four decades of research into tasks. This leads us to the identification and
analysis of some critical issues generated by the specificity of speaking and writing. We then point
out some key facts regarding research into mode, as well as some methodological aspects that need
to be addressed. We finish the chapter with some recommendations for syllabus and task design and some predictions as to the future of mode in TBLT.

Historical Perspectives
Task-​based language learning and teaching has received considerable attention from second lan-
guage (L2) acquisition researchers as well as practitioners (see, for instance, Bygate, 2015, 2016,
for key developments). In the last four decades, there has been a growing interest in the use of tasks
for the development of second and foreign languages worldwide. A widely accepted definition of
task is that of an activity “where meaning is primary; there is some communicative problem to
solve; some sort of relationship with real-​world activities; and the assessment of task is in terms
of a task outcome" (Skehan, 1998, p. 95). Much TBLT-oriented research has taken
a psycholinguistic perspective, which sees tasks as pedagogic devices that can “predispose, even
induce learners to engage in certain types of language use and mental processing that are benefi-
cial for acquisition” (Ellis, 2000, p. 197). Both teachers’ and researchers’ attention has been drawn
to tasks, either as a means to investigate the effects of task design on language performance and
acquisition or for the promotion of language learning in their classes (13,600 hits in Google Scholar
under the search term “task-​based language”).
In the case of researchers, the last few decades have seen an exponential increase in the
number of research studies investigating various aspects of L2 learners’ task performance. Tasks
have proven to be flexible yet robust units for experimentation, and hundreds of studies
have looked into a diversity of features within tasks. As a consequence, considerable advances
have been achieved in our understanding of how tasks may lead to language comprehension,
production, interaction, and learning. As regards teachers, their interest in tasks has been
inspired by the fact that communicative tasks are instantiations of learning-​by-​doing, and that
they are meaning-focused, learner-centered, and holistic, unlike traditional language-learning activities, which tend to be more decontextualized and grammar-focused in nature. Whether in
traditional task-​supported programs (where tasks give support to a traditional grammar-​focused
syllabus, see Müller-​Hartmann & Schocker-​von Ditfurth, 2011), communicative task-​based
programs (where tasks are the units of syllabus organization; see Long, 2016), content-​and-​
language integrated programs (where tasks may be integrated for the teaching of content; see
García-​Mayo, 2015) or immersion programs, teachers and syllabus designers are faced with the
need to design tasks that will serve their specific needs.
The use of tasks in teaching, testing, and research has also grown exponentially (see Bygate,
2016; Bygate, Skehan, & Swain, 2013; Ellis, 2003; Ellis, Skehan, Li, Shintani, & Lambert, 2019;
García Mayo, 2007; Long, 2016; Révész, 2009; Robinson, 2011a, 2011b; Samuda & Bygate, 2008;
Shehadeh & Coombe, 2012; Skehan, 1998; Thomas & Reinders, 2010; Van den Branden, 2006;
also research syntheses like the work of Jackson & Suethanapornkul [2013], as well as the hundreds
of studies presented at the biennial Task-​based Language Teaching Conference). While there are
some general models of task-based language teaching (Long, 2016; Ellis, 2003; Skehan, 2003),
most of what we know about task design research has been inspired by two cognitive-​interactionist
models for task-based teaching and learning: Robinson's Cognition Hypothesis (Robinson, 2001)
and associated Triadic Componential Framework (Robinson, 2001; Robinson & Gilabert, 2007),
and Skehan’s Trade-​off Hypothesis (Skehan, 1998), which have generated prolific research agendas
in the last three decades. The main interest of both models has been to provide an explanation of
how task features may affect the processes and outcomes of L2 use and acquisition. In turn, this
information has been used for the design of tasks and task-​based syllabi in such a way that they
maximize the opportunities for L2 use and development. Such models have looked at task variables
which are cognitive in nature (e.g., the amount of reasoning involved in a task), interactive (e.g.,
how information may be distributed among participants), or linguistic (e.g., how the difficulty of
certain features may affect task performance – see "code complexity" in Skehan, 2003). A wealth of
literature has been generated from the two main hypotheses in relation to task design.
Notwithstanding notable theoretical and empirical achievements, the field of TBLT has an
acknowledged limitation derived from privileging oral over written production (Byrnes & Manchón,
2014a, b; Gilabert et al., 2016). This neglect of the written mode is problematic given that in lit-
erate societies both oral and written modes are indispensable and, accordingly, the command of
both modes constitutes the hallmark of language proficiency (Berman & Ravid, 2009). In addition,
both oral and written language use represent potential language-​learning opportunities, hence the
relevance of looking into language learning associated with both speaking and writing tasks. In
line with these ideas, recent positions in TBLT have stressed the need for more mode-​sensitive and
integrative TBLT research and for a deeper understanding of the unique learning benefits that each
mode can offer separately or in mutual interaction (Gilabert et al., 2016). Although still limited,
some theoretical thinking as well as empirical research has been directed towards the role of mode
in TBLT.

Critical Issues and Topics


Both speech and writing share an important similarity as both modes externalize inner language
by giving it linguistic form. At the same time, there are a number of important features that differentiate oral and written language production and, by extension, define the specific nature of the language-learning potential of the two modes. These features primarily include: (1) the nature
of the motor execution in speech versus writing, (2) the nature of output, and (3) the relationship
with the audience. Speech involves producing language with the help of auditory signals, which is
less effortful and faster than the production of written signs (Horowitz & Samuels, 1987; Williams,
2012). Speakers are thus more fluent than writers in terms of speed fluency, as they can produce
more linguistic material per unit of time. At the same time, the online nature and the evanescent
character of oral output create pressures which can constrain the implementation of language pro-
duction processes, limiting the time available for planning, linguistic formulation, or monitoring.
These pressures could be particularly relevant in L2 production which could be less automatic and
more resource-​demanding than L1 production (Kormos, 2006).
In contrast, writing is offline (with the exception of synchronous computer-​mediated written
communication) and self-​paced, and its output is visible and (relatively) permanent. In writing,
language users do not feel the pressure to prioritize one production process over another, as they
can implement planning, formulation, and revision in a recursive and self-​paced manner (Kellogg,
1996). Written conditions also allow writers to pay attention to both content and linguistic form (Chafe,
1985). Because of the slowness of writing, knowledge representations are activated for a longer
period of time, which can benefit information retrieval from long-​term memory and planning
processes; additionally, visible written text serves as an extension of working memory, which implies a potential relief of the cognitive load in writing, facilitating knowledge recall (Grabowski,
2007). Written conditions can, thus, facilitate retrieval/​generation of the ideational content and
allow for exhaustive linguistic encoding and enhanced monitoring (Chafe & Danielewicz, 1987).
Another relevant aspect is the relationship with the audience (Nystrand, 1987; Ong, 1982). Oral
production typically entails face-to-face contact between the communicator and the recipient of
the message. Thus, during the performance of oral tasks, speaker(s) and listener(s) typically share
the same environment, which makes it possible to employ the modulatory affordances of the voice (pitch, intonation) and also gestures to convey part of the meaning. Face-to-face contact also means
that speakers can receive instant feedback (linguistic or non-​linguistic) about the accuracy or com-
prehensibility of their production, which allows speakers to make quick adjustments to ensure the
correct functioning of the communicative channel (Bereiter & Scardamalia, 1987). The immediate
presence of the listener creates the conditions under which speakers have to avoid unduly long
pauses and maintain a reasonably fluent pace of production. Such pressures can have linguistic
consequences, evident in the simplification of lexis and grammar, use of short idea units, or unin-
tended errors (Chafe & Danielewicz, 1987). In contrast, the writer's audience is typically displaced
in time and space (Ong, 1982). Under such conditions, written messages do not function commu-
nicatively at the time of their creation (Horowitz & Samuels, 1987). Thus, in order to ensure com-
prehensibility and effectiveness of the message, writers have to aim at precision in their lexical and
grammatical choices, and they have to pay special attention to the explicitness and coherence of
their discourse (Chafe, 1985).
The specific conditions of written production, namely, the greater availability of time
stemming from the slowness and the self-​paced nature of writing, the visibility and perman-
ence of output, and also the problem-​solving nature and resulting depth of processing inherent
to some types of writing tasks, are at the core of the language-learning potential of L2 writing
theory, which views composing as a tool to promote general L2 proficiency (Manchón, 2011;
Manchón & Williams, 2016; Manchón & Vasylets, 2019). The prediction is that less constrained
writing conditions offer a favorable environment to employ attentional resources and metacog-
nitive strategies more effectively, to pay more attention to linguistic forms and to deploy explicit
and metalinguistic knowledge in a more efficient and flexible manner (Manchón & Williams,
2016; Williams, 2012). As pointed out by Cumming (1990), “composing might function broadly
as a psycholinguistic output condition wherein learners analyze and consolidate second lan-
guage knowledge that they have previously (but not yet fully) acquired” (p. 483). In a similar
vein, Manchón and Vasylets (2019) stress that visible written text can induce L2 users to set up
more complex goals and produce “pushed” output (Swain, 1995, 2005), characterized by higher
accuracy, cohesion, and complexity. As Williams (2012) puts it, “the cognitive window is open
somewhat wider” and for a longer period of time in writing (p. 328), which allows learners to
engage learning processes to a full extent. The problem-solving character of some complex forms of writing is another relevant feature, which is associated with deeper linguistic processing and the concomitant engagement of relevant learning processes, such as noticing, cognitive comparison, and analysis, as well as metalinguistic reflection and explicit knowledge analysis (see
also, Byrnes & Manchón, 2014a, b; Manchón & Roca de Larios, 2007).
The specificity of language production in the two modes can also account for the fact that written
and oral conditions can interact with other learning-​relevant task variables, such as task complexity
and task repetition. In Robinson's (2001) definition, task complexity refers to the intrinsic cognitive demands posed by a task on the learner's memory, reasoning, and attentional resources. The two most
influential task complexity models advanced by Skehan (2009, 2014) and Robinson (2001, 2011a)
provide differing predictions concerning the effects of task complexity on L2 use and development.
Thus, Skehan’s (2009) Limited Capacity Model posits that, because of the limited nature of cogni-
tive resources, learners have to trade-​off accuracy and complexity when performing complex tasks.
In other words, in tasks with increased task complexity, fluent L2 production can be either accurate
or complex, but not both. On the other hand, Robinson (2001, 2011a), in his Cognition Hypothesis,
predicts that increases in cognitive task demands along the so-​called resource-​directing factors
(such as reasoning demands or the number of elements) can induce learners to produce L2 discourse
which is simultaneously accurate and complex (Robinson, 2011a). There is one aspect, however,
which is common to these two opposing theoretical models. This commonality consists in the fact
that both models have been designed with the oral mode in mind, without specifying explicitly if
and how their theoretical tenets would apply to writing. As pointed out by Tavakoli (2014), "little
is known about whether cognitive complexity affects writing and speaking tasks in similar ways,
or whether it has similar influences on L2 oral and written performance" (p. 217). In this regard, a
number of voices have cautioned against the uncritical application of the speech-​originated theories
to writing, calling for the need to reconceptualize the constructs of task and task complexity in their application to writing, given the idiosyncratic characteristics of written production, such as
visibility of output or more ample planning and revision opportunities (Byrnes & Manchón, 2014b;
Manchón, 2014a; see also Yoon & Polio, 2017).
Interestingly, Johnson (2017) has proposed the possibility that “the effects of cognitive task
complexity manipulation may be more pronounced for L2 written production” (p. 15). The empir-
ical evidence for this claim has been provided by Vasylets, Gilabert, and Manchón (2017), who found
that in written performance task complexity effects were greater and showed a better fit to the theoretical prediction than in oral task performance. On the basis of these findings, Vasylets
et al. (2017) questioned the validity of "fit-all-modes" task complexity models, suggesting that
cognitive task-​based models “should be problematized in an attempt to account in full for the spe-
cificity of production conditions in different modalities” (p. 28).
Task repetition is another task variable that has generated debate concerning its functioning in
writing tasks. With the primary focus on oral task performance, task repetition theorizing posits
that repeating a task can potentially promote L2 acquisition processes and lead to improved L2
performance (Bygate, 2001; Bygate & Samuda, 2008). The main underlying idea is that during the
first encounter with the task, learners have to divide their attentional resources between content
and form. However, when repeating the task, learners are already familiar with the content, which
consequently reduces task demands, frees up cognitive resources, and induces learners to engage in
focus on form processes to a fuller extent (Bygate, 2001). The first encounter with the task is, thus,
regarded as a planning opportunity on which learners can draw in the subsequent iteration(s) of the
task, leading to a more sophisticated and appropriate L2 performance and creating additional L2
development opportunities. Similar to task complexity, theoretical rationale for task repetition was
created with the oral mode in mind. As pointed out by Manchón (2014a, b), the distinctive features
of writing (as discussed previously) might account for the unique qualities of task repetition in
the written mode. Thus, in writing, learners are able to plan their production during the very first
encounter with the task. Additionally, the recursive nature of writing enables learners to continu-
ously shift back and forth among the different writing processes, resulting in what Manchón (2014b)
terms internal task repetition, which is believed to give an additional boost to L2 performance
and development. Visibility of written output also facilitates external task repetition, which consists
in processing the feedback provided and the subsequent revision/repetition of the task.
In sum, because of the specific characteristics of the two modes, the same L2 task can engage L2
processes differently and evoke different types of L2 production depending on whether it is performed orally
or in writing. According to the tenets of the language-learning potential of L2 writing theory, the written mode represents a particularly favorable environment for the deep engagement of L2 learning
processes and knowledge (explicit knowledge, in particular) and for the production of coherent,
complex, and accurate L2 output. The particular characteristics of oral versus written language pro-
duction can also account for the differential interaction of the two modes with other task-​learning
variables, with writing potentially offering particularly beneficial conditions for channeling the
effects of task complexity or task repetition. These predictions, however, still lack empirical con-
firmation, although, as we show in the next section, there are a number of investigations which have
already discovered certain patterns in this regard.

Current Contributions and Research


In this section we will review three general areas of research that are relevant to the question of the
role of mode in task-​based learning: (1) comparison of L2 oral and written production, (2) medi-
ating effects of mode on task complexity, and (3) mediating effects of mode on task repetition.
A sizeable number of investigations have compared the quality and quantity of L2 oral and written discourse, frequently operationalized in terms of complexity, accuracy, and fluency (CAF) measures. Because of the
nature of oral and written performance, speed fluency is, unsurprisingly, expected to be higher in
speech. Thus, the main focus is on the comparison of accuracy and complexity in the two modes.
The studies comparing L2 oral and written production have produced mixed findings, although cer-
tain patterns in the results can be traced. Thus, in an early study, Ellis (1987) compared the accuracy of English verb forms in the oral and written narrative performance of 17 L2 English learners from
different backgrounds. The analysis showed that all three past-​tense forms were more accurate in
the written task than in oral performance. In a later study, Ellis and Yuan (2005) broadened the
scope of measures to fluency, accuracy, and complexity and examined narrative performance of 42
L1 Chinese undergraduate learners of L2 English (aged 18–​20). The overall comparison of oral and
written task performance showed that learners’ production was less fluent but more complex (in grammar and lexis) and more accurate in writing than in speech. These results corroborated Ellis’s (1987) earlier finding of higher accuracy of past-tense forms in written production.
In Granfeldt’s (2008) study, Swedish university learners of L2 French performed two tasks of different genres orally and in writing. In line with Ellis and Yuan (2005), Granfeldt found that lexical
complexity was higher in writing. However, contrary to the initial expectation, writers were less
accurate than speakers and exhibited lower levels of grammatical complexity. Higher accuracy in
speech was also obtained by Ferrari and Nuzzo (2009) with young learners of L2 Italian. Studies
by Yu (2009) and Bulté and Housen (2009) focused exclusively on lexical complexity. Bulté and
Housen (2009) explored the development of lexical complexity in the two modes in the production
of Dutch learners of L2 French. The findings were that the scores on the written tasks were higher
for the measures of lexical diversity and sophistication, but not for the lexical profile measures.
A different pattern of findings was obtained by Yu (2009), who found similar levels of lexical
diversity in written compositions and oral interviews of 25 learners of English from various L1
backgrounds.
More evidence about linguistic differences between L2 oral and written production comes from
Kormos (2014) who compared oral and written narrative production of 44 Hungarian learners of
L2 English. The analysis showed that, in writing, learners used more varied vocabulary, exhibited
higher noun-​phrase complexity, and were more accurate. In a more recent study, Zalbidea (2017)
examined argumentative task performance of 32 intermediate learners of L2 Spanish, finding
that oral production was characterized by more syntactically complex language, while in writing
learners generated more lexically diverse and more accurate output. Vasylets et al. (2017) also
explored argumentative task performance, but with Spanish/Catalan intermediate learners of L2 English (n = 78). The results showed that speakers produced more idea units, while writers achieved higher scores for general syntactic complexity, subordination, and lexical diversity. Writers also
produced a higher ratio of complex ideas and, predictably, spent more time on task as compared
to speakers.
In a more recent large-​scale study, Vasylets, Manchón, and Gilabert (2019) focused on linguistic
(lexical and syntactic) and propositional complexity in L2 oral and written narrations performed
by 290 Spanish/​Catalan intermediate learners of L2 English. The results showed that written per-
formance displayed higher scores on all the sub-​dimensions of syntactic complexity (including
coordination, subordination, and nominal complexity) and lexical complexity (density, variety,
richness, and sophistication). Differences were observed in the way writers and speakers conveyed
the propositional content of the task, with oral production exhibiting longer idea units and written
production containing a higher ratio of complex ideas. On the basis of these findings, the authors
concluded that, due to its characteristic features, such as the slower pace of production and the visibility of output, the written mode offered favorable conditions for producing complex discourse, with concomitant positive implications for the language-learning potential of writing.
Summarizing the available research findings, we can conclude that comparisons of oral and written performance have produced mixed findings for accuracy and syntactic complexity, but there is an evident pattern of higher lexical complexity in writing. On the basis of these findings, we
could suggest that mode selection during task performance can mediate the way L2 learners use
their linguistic knowledge and engage L2 learning processes, with the concomitant consequences
for L2 performance and development.
A number of studies have examined interactions between mode and task complexity in their
influence on L2 performance. Arguments derived from this strand of research have stressed the
possibility that task complexity effects might play out differently in speech and in writing. The
added value of these studies is that they have put to the empirical test the applicability to writing of theoretical predictions originally put forward with the oral mode in mind. Kuiken and Vedder’s (2011) study was one of the first to explore whether the effects of task complexity differed in oral and written production. In this study, Dutch learners of L2 Italian (n = 44 in the oral mode
and n = 91 in writing) performed orally and in writing a simple and a complex version of an argu-
mentative task, which was analyzed in terms of accuracy and linguistic (lexical and syntactic) com-
plexity. The results showed that task complexity effects were largely the same in the two modes,
with learners producing fewer errors in the complex task. On the basis of these results, Kuiken and
Vedder (2011) concluded that “the influence of task complexity on linguistic performance is not
substantially constrained by mode” (p. 103). The same conclusion was drawn in the recent study by
Cho (2018) in which 39 Korean university EFL students performed in both modes four argumen-
tative tasks with varying levels of complexity. Task complexity effects were similar in writing and
in speech: in both modes, the production in the complex tasks was less accurate, more syntactically
complex and faster as compared to the production in the simple tasks. Similarly, Zalbidea (2017)
reported an absence of interactions between mode and task complexity, as she found that increases in
task demands did not result in significant changes in L2 performance in either speech or writing.
On the other hand, a number of studies have found that task complexity effects played out differently in speech and in writing. Kormos and Trebits (2012), for example, found that task demands affected the oral and written task performance of 44 Hungarian secondary school EFL learners differently.
Thus, in the oral mode, learners produced more varied language in the task which taxed conceptu-
alization and more accurate language (ratio of error-​free verbs) in the task taxing linguistic formu-
lation. These findings were not replicated in the written mode, as instead learners produced longer
clauses and more relative clauses in the writing task which taxed conceptualization. Kormos and
Trebits (2012) concluded that “tasks with different cognitive and linguistic demands seem to elicit
different patterns of performance in writing than in speech” (p. 25). Findings in Tavakoli (2014)
also suggested that mode has to be taken into account when interpreting the effects of task com-
plexity on L2 production. In this study, task complexity was operationalized as storyline complexity, with foreground complexity (a single storyline) representing a simple task and background complexity (more than one storyline) representing a complex task. Focusing specifically on syntactic complexity, Tavakoli found that the effects of task complexity were greater in speech,
with a higher index of subordination and longer grammatical units in the complex task.
Interactive effects of task complexity and mode on L2 performance were also reported by Vasylets et al. (2017), who found that, in the written mode, changes in performance from the simple to the complex condition were more noticeable and showed a better fit to the predictions of the Cognition Hypothesis. Thus, in both modes learners obtained higher scores for general syn-
tactic complexity (length of AS-​units), propositional complexity (number of idea units) and lexical
sophistication in the complex task condition. However, only in writing did learners produce fewer errors, generate more complex ideas, and spend more time on the complex task as compared to its simple counterpart. Thus, as compared to speech, the written mode appeared to be an even more favorable environment for learners to take full advantage of the beneficial learning effects stemming from
the increases in task demands – a finding which adds another dimension to the language-learning potential of L2 writing. Plausibly, the less constrained production conditions in writing may allow learners to fully engage in the L2 learning processes induced by the increase in task demands.
Without doubt, more investigation is warranted before definitive conclusions can be drawn about how the oral and written modes channel task complexity effects. To our knowledge, the only investigation available to date which has compared task repetition effects across modes is the study by Sánchez, Manchón, and Gilabert (2020). This study, which involved low-proficiency secondary school learners, explored the effects of task repetition on L2 production and on learning from different types of corrective feedback. Concerning L2 performance, the researchers found differences in the way task repetition effects played out in the two modes. Thus, while there were increases in fluency in the second iteration of the oral task, there were no evident improvements in the second iteration of the written task. Clearly, more research is required to explore the potential differences in task repetition effects in the two modes.

Main Research Methods


The available studies exhibit certain differences in their designs and the research methods employed. Their unifying feature is the predominant use of a picture description task or some other kind of narrative task to elicit performance. The investigations comparing speech and writing also differ in terms of research focus. Some investigations had a narrow focus on a specific
dimension such as accuracy (Ellis, 1987) or complexity (Bulté & Housen, 2009; Vasylets et al.,
2019), while others tapped into various CAF dimensions (Ellis & Yuan, 2005; Kormos, 2014). At
the same time, these studies involved participants from various L1 backgrounds and with different L2s, including English, French, Italian, and Spanish.
On the other hand, the studies exploring task complexity and task repetition effects across modes
were rather homogeneous in terms of participants, involving mainly adult L2 learners of English. The exceptions are the study by Kuiken and Vedder (2011), which involved learners of L2 Italian, and that by Zalbidea (2017), which involved learners of L2 Spanish. Narrative tasks based on picture descriptions and argumentative essays were the most frequently elicited genres. The majority of studies
explored various CAF dimensions, with the exception of Tavakoli (2014) who had a narrow focus
on syntactic complexity.
In terms of the choice of performance measures, previous studies show considerable diversity (see also Johnson, 2017). Among the frequently reported measures of accuracy
are the proportion of error-​free clauses or T-​units, or the number of errors per 100 words. Only a few
studies additionally employed some specific measures of accuracy, such as the ratio of error-​free
verbs (Kormos & Trebits, 2012), or classified the errors according to their gravity or type (Kuiken
& Vedder, 2011). Concerning complexity, the majority of studies looked into lexical complexity
(lexical diversity, in particular) and syntactic complexity (predominantly, subordination). Only a
few studies examined noun-​phrase complexity (Kormos, 2014) or tapped into the area of propos-
itional complexity (Vasylets et al., 2017, 2019). Very few studies included measures of fluency. An
example is the study by Cho (2018) who measured speed fluency (the number of words per minute)
and breakdown fluency (the number of pauses per 100 words).
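To make these operationalizations concrete, the sketch below computes simplified versions of some of the measures mentioned above: errors per 100 words, the proportion of error-free clauses, speed fluency as words per minute, and breakdown fluency as pauses per 100 words. It is an illustration only, written in Python with invented values; it assumes that clauses, errors, and pauses have already been identified by a human coder or a keystroke-logging tool (which is where the real analytical work in CAF research lies), and the data format and function names are ours rather than part of any published coding scheme.

```python
from dataclasses import dataclass

@dataclass
class CodedText:
    """One writing sample after manual coding (hypothetical format)."""
    words: int               # total words produced
    minutes: float           # time on task
    clauses: int             # clauses identified by the coder
    error_free_clauses: int  # clauses containing no errors
    errors: int              # total errors, regardless of type
    pauses: int              # pauses above a chosen threshold (e.g., from keystroke logs)

def errors_per_100_words(t: CodedText) -> float:
    """Accuracy: errors normalized per 100 words."""
    return 100 * t.errors / t.words

def error_free_clause_ratio(t: CodedText) -> float:
    """Accuracy: proportion of clauses containing no errors."""
    return t.error_free_clauses / t.clauses

def words_per_minute(t: CodedText) -> float:
    """Speed fluency: words produced per minute of composing time."""
    return t.words / t.minutes

def pauses_per_100_words(t: CodedText) -> float:
    """Breakdown fluency: pauses normalized per 100 words."""
    return 100 * t.pauses / t.words

# Invented values for a single essay.
essay = CodedText(words=250, minutes=30, clauses=40,
                  error_free_clauses=28, errors=18, pauses=55)
print(round(errors_per_100_words(essay), 1))     # 7.2
print(round(error_free_clause_ratio(essay), 2))  # 0.7
print(round(words_per_minute(essay), 1))         # 8.3
print(round(pauses_per_100_words(essay), 1))     # 22.0
```

Even in this toy form, the sketch makes visible how many analyst decisions (the pause threshold, the error taxonomy, the unit of analysis) sit behind a single reported index, which is one reason why cross-study comparisons of CAF results are difficult.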
In sum, the available investigations are quantitative studies focusing on the impact of mode on L2 outcomes, typically operationalized in terms of CAF measures of L2 performance.
In the SLA literature, CAF measures have been widely used to assess L2 performance and
to gauge L2 development, with higher indices typically indicating better performance and greater progress (Wolfe-Quintero, Inagaki, & Kim, 1998). Although the results of previous mode studies primarily provide insights into the way mode impacts L2 outcomes, inferences can also be drawn concerning the effects of speech versus writing on L2 processes. As highlighted by Housen,
Kuiken, and Vedder (2012), CAF can be seen as “the primary epiphenomena of the psycholinguistic
processes and mechanisms underlying the acquisition, representation and processing of L2 systems”
(p. 2). Theoretically, each of the CAF dimensions can be linked to the L2 developmental processes,
which include (1) internalization of new L2 elements (reflected in greater complexity of L2 produc-
tion, as the interlanguage gains in variety and sophistication); (2) modification of L2 knowledge
(learners restructure their interlanguage which becomes more complex and target-​like, or accurate);
(3) consolidation and proceduralization of L2 knowledge (reflected in higher fluency) (Housen &
Pierrard, 2005). Importantly, although it is possible to establish connections between CAF measures and L2 learning processes, these connections remain inferences. We must also take into account that complexity, accuracy, and fluency are
multidimensional, which diminishes the possibility of a single and straightforward correspondence
between the CAF dimensions and a particular learning mechanism or process (Housen et al., 2012).
In this regard, we consider that future research on the role of mode in SLA would greatly benefit
from employing online methodologies such as eye-tracking or electroencephalography, which have recently gained prominence as means of informing our understanding of the cognitive processes underlying L2 acquisition (Benati & Rastelli, 2018; Conklin & Pellicer-Sánchez, 2016). The major advantage of these online methodologies is that they provide researchers with a more direct window on cognitive processing as it unfolds, allowing the investigation of online cognition in a relatively non-invasive manner. Additionally, online techniques (e.g., eye-tracking) can be fruitfully combined with verbal protocols, which can enhance the informativeness of the data and facilitate the interpretation of the results obtained.

Recommendations for Practice


For decades writing studies have focused on understanding either the product or the processes
of writing. Researchers have studied writing so that they can better understand what aspects
are involved in L2 writing (linguistic, cognitive, social) and how such understanding may help
practitioners assist their students to improve their writing. So, studies in areas such as planning
(Ellis, 2005), feedback (Bitchener, 2012; Bitchener & Storch, 2016), process writing (White &
Caminero, 1995) have proliferated in the past decades and the research agenda on writing has
progressed substantially. As for practitioners, some of the ideas coming out of research studies have
started to permeate teaching practices (e.g., the use of written corrective feedback by teachers or task
repetition). However, a lot of those studies have seen writing as separate from the oral mode, which
does not necessarily correspond to the way learners use hybrid oral and written tasks in actual prac-
tice (Gilabert et al. 2016). Our take in this section is that of how modes may be combined during
task design, with a particular focus on the interaction of mode with task complexity and task repe-
tition. While based on the research findings we have so far, the recommendations included below
are tentative and would need to be tested in systematic classroom studies. Two recommendations
are advanced: 1) turning written material into oral material; 2) creating oral-​to-​written cycles for
consolidation and overall improvement.
In classes that use tasks as their units of organization, it is often the case that the separation of oral and written modes is not so clear cut. One potential strategy is to encourage L2 learners to express in writing what they plan to say verbally. For example, in a task-based lesson learners often
give an oral presentation where they report the outcomes of their task work (e.g., the public pres-
entation of an ad campaign after they have prepared it in small groups). Such public presentations
may happen in groups, but still each participant in the presentation needs to speak monologically
in a one-​way task. This is typically a demanding and face-​threatening phase of the task that can be
assisted by a writing-​to-​speaking cycle. Between the task and its public report, a planning stage may
be included to give learners a chance to prepare their oral report. This may involve oral discussions
and practice, but it may also include writing. Such writing may take the shape of a full text or even
of partial notes that will help learners recall ideas during the oral report. This writing while preparing
for speaking may serve a number of pedagogical purposes: 1) it will facilitate re-​activation and
coding of ideas and schemas during verbal planning and formulation –​two demanding processes
of oral production. As a consequence, task complexity along resource-​dispersing variables (see
Robinson, 2001; Robinson, & Gilabert, 2007) is reduced, and attentional and memory resources
are liberated; 2) it may help transfer some of the advantages of writing to the oral mode (e.g., by
using more complex syntactic constructions and more varied vocabulary –​this transfer has not been
directly tested by research, but it has been thoroughly tested through the idea of task repetition);
3) it can facilitate self-noticing of knowledge gaps and mismatches between what learners wish to
say and what they can actually say; 4) it can also prompt students to request corrective feedback.
Such corrective feedback may help them compare their gaps or erroneous productions with the feedback provided by more competent peers or the teacher; 5) it may help to reduce the anxiety
associated with spontaneous oral task performance and therefore have a positive effect on overall
fluency. Of course, we are not claiming that such a strategy will work across the board, since it may not be useful with all task types (e.g., in more spontaneous, two-way tasks where there is no room for planning how one will react to an interlocutor’s comments). The strategy is, however, in line with previous studies conducted with other learner populations which suggest that students
should write to learn (Amores & Pladevall, 2014; Kim, 2008; Williams, 2008).
The second type of recommendation has to do with task repetition through speaking-​to-​write
cycles. This recommendation is prompted by findings in the task repetition literature that suggest
that task repetition may help improve performance by making outcomes more complex and accurate (Bygate, 2016; Samuda & Bygate, 2008). Through such cycles, L2 learners may be encouraged to repeat in the written mode a task that they performed orally. This can be designed as post-task material in which learners are given the opportunity to repair and fine-tune the outcomes of their oral performances (typically produced under time pressure) by writing them down. This may serve a number of pedagogical purposes: 1) gaps left unfilled during oral performance may be tackled under the more favorable writing conditions; 2) written corrective feedback may be provided by instructors, which may help learners overcome erroneous productions; 3) writing may aid consoli-
dation of knowledge.

Future Directions
A better understanding of mode may have positive consequences for theory, empirical research, and teaching practice. At the theoretical level, we anticipate that mode will come to play a much more central role in theory building, with task-based theoretical frameworks including mode as a crucial moderating variable in task performance. Because of the slow but steady accumulation of studies showing important and systematic differences between the oral and the written mode, it is also our belief that specific hypotheses about task performance will take mode into consideration to a greater degree.
For example, any claims about a potential trade-​off between fluency, accuracy, and complexity
will need to take mode into consideration. The lower time pressure and the better access to cognitive resources afforded by writing tasks may prompt a simultaneous focus on the complexity and accuracy of performance that may differ considerably from the conditions of oral performance. At the empirical level, we predict that mode may feature more prominently in studies, either because it is
considered an important variable in task design for research, or because it is addressed directly
through experimental or quasi-​experimental studies. Our final prediction is that mode will also
influence practical decision-​making in task-​based or task-​supported programs, where knowledge
about the effects of mode can inform task and syllabus design. More systematic designs combining
writing-​to-​speak and speaking-​to-​write cycles (or even more complex cycles where learners write
in preparation for speaking and they write again prompted by that speaking) may be expected from
future teaching practice. Such practices and their effects on overall performance and learning will
need to be tested through both classroom-based and controlled experimental studies in the future.

References
Amores, M., & Pladevall, M. (2014). The effects of written input on young EFL learners’ oral output. Journal
of English Studies, 12, 7–​33.
Benati, A., & Rastelli, S. (2018). The teaching and acquisition interface in neurocognition research. Second
Language Research, 34(1), 3–​7.
Bereiter, C. & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence
Erlbaum Associates, Inc.
Berman, R., & Ravid, D. (2009). Becoming a literate language user: Oral and written text construction
across adolescence. In D. Olson & N. Torrance (Eds.), The Cambridge handbook of literacy (pp. 92–​112).
Cambridge: Cambridge University Press.
Bitchener, J. (2012). A reflection on “the language learning potential” of written CF. Journal of Second
Language Writing, 22, 348–​363.
Bitchener, J., & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Bulté, B., & Housen, A. (2009). The development of lexical proficiency in L2 speaking and writing tasks by
Dutch speaking learners of French in Brussels. Paper presented at the colloquium “Tasks across modal-
ities,” Task Based Language Teaching Conference, Lancaster 2009.
Bygate, M. (2001). Effects of task repetition on the structure and control of oral language. In M. Bygate, P.
Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning, teaching and testing
(pp. 23–​48). London: Longman.
Bygate, M. (Ed.) (2015). Domains and directions in the development of TBLT. Amsterdam: John Benjamins.
Bygate, M. (2016). Sources, developments and directions of task-​based language teaching. The Language
Learning Journal, 44(4), 381–​400.
Bygate, M., Skehan, P., & Swain, M. (2013). Researching pedagogic tasks: Second language learning,
teaching, and testing. Abingdon: Routledge.
Byrnes, H., & Manchón, R.M. (Eds.). (2014a). Task-​based language learning –​Insights from and for L2
writing (Vol. 7). Amsterdam: John Benjamins.
Byrnes, H., & Manchón, R.M. (2014b). Task-​based language learning: Insights from and for L2 writing. An
Introduction. In H. Byrnes & R.M. Manchón (Eds.), Task-​based language learning –​Insights from and for
L2 writing (Vol. 7; pp. 1–​27). Amsterdam: John Benjamins.
Chafe, W. (1985). Linguistic differences produced by differences between speaking and writing. In D. Olson,
N. Torrance, & A. Hildyard (Eds.), Literacy, language, and learning: The nature and consequences of
reading and writing (pp. 105–​123). Cambridge: Cambridge University Press.
Chafe, W., & Danielewicz, J. (1987). Properties of spoken and written language. In R. Horowitz & S. Samuels
(Eds.), Comprehending oral and written language (pp 55–​87). New York: Academic Press.
Cho, M. (2018). Task complexity, modality, and working memory in L2 task performance. System, 72,
85–​98.
Conklin, K., & Pellicer-​Sánchez, A. (2016). Using eye-​tracking in applied linguistics and second language
research. Second Language Research, 32(3), 453–​467.
Cumming, A. (1990). Metalinguistic and ideational thinking in second language composing. Written
Communication, 7(4), 482–511.
Ellis, R. (1987). Interlanguage variability in narrative discourse: Style shifting in the use of the past tense.
Studies in Second Language Acquisition, 9, 12–​20.
Ellis, R. (2000). Task-​based research and language pedagogy. Language Teaching Research, 4(3), 193–​220.
Ellis, R. (2003). Task-​based language learning and teaching. Oxford: Oxford University Press.
Ellis, R. (2005). Planning and task performance: Theory and research. In R. Ellis (Ed.), Planning and task per-
formance in a second language (pp. 3–​37). Amsterdam: John Benjamins.
Ellis, R., & Yuan, F. (2005). The effects of careful within-​task planning on oral and written task performance.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 167–​193). Amsterdam: John
Benjamins.
Ellis, R., Skehan, P., Li, S., Shintani, N., & Lambert, C. (2019). Task-based language teaching: Theory and
practice. Cambridge: Cambridge University Press.
García Mayo, M.P. (Ed.). (2007). Investigating tasks in formal language learning. Clevedon: Multilingual
Matters.
García Mayo, M.P. (2015). The interface between task-based language teaching and content-based instruction.
System, 54, 1–​3.
Gilabert, R., Manchón, R., & Vasylets, O. (2016). Mode in theoretical and empirical TBLT research: Advancing
research agendas. Annual Review of Applied Linguistics, 36, 117–​135.
Grabowski, J. (2007). The writing superiority effect in the verbal recall of knowledge: Sources and determinants.
In G. Rijlaarsdam (Series Ed.), M. Torrance, L. van Waes, and D. Galbraith (Volume Eds.), Writing and
cognition: Research and applications (Studies in Writing) (pp. 165–​179). Amsterdam: Elsevier.
Ferrari, S., & Nuzzo, E. (2009). Meeting the challenge of diversity with TBLT: Connecting speaking and writing in mainstream classrooms. Paper presented at the colloquium “Tasks across modalities,” Task Based Language Teaching Conference, Lancaster 2009.
Granfeldt, J. (2008). Speaking and writing in L2 French: Exploring effects on fluency, accuracy and complexity. In A. Housen, S. van Daele, & M. Pierrard (Eds.), Proceedings of the conference on complexity, accuracy and fluency in second language (pp. 87–98). Brussels: Koninklijke Vlaamse Academie.
Horowitz, R., & Samuels, S.J. (Eds.). (1987). Comprehending oral and written language. San Diego, CA: Academic Press.
Housen, A., Kuiken, F., & Vedder, I. (2012). Complexity, accuracy and fluency: Definitions, measurement and research. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (Vol. 32, pp. 1–21). Amsterdam: John Benjamins.
Housen, A., & Pierrard, M. (Eds.). (2005). Investigations in instructed second language acquisition (Vol. 25).
Berlin: Mouton de Gruyter.
Jackson, D.O., & Suethanapornkul, S. (2013). The cognition hypothesis: A synthesis and meta-​analysis of
research on second language task complexity. Language Learning, 63(2), 330–​367.
Johnson, M.D. (2017). Cognitive task complexity and L2 written syntactic complexity, accuracy, lexical
complexity, and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing,
37, 13–​38.
Kellogg, R.T. (1996). A model of working memory in writing. In C.M. Levy & S. Ransdell (Eds.), The science
of writing: Theories, methods, individual differences and applications (pp. 57–​71). Mahwah, NJ: Erlbaum.
Kim, Y. (2008). The effects of integrated language-based instruction in elementary ESL learning. Modern Language Journal, 92, 432–451.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Erlbaum.
Kormos, J. (2014). Differences across modalities of performance: An investigation of linguistic and discourse complexity in narrative tasks. In H. Byrnes & R. Manchón (Eds.), Task-based language learning – Insights from and for L2 writing (pp. 193–217). Amsterdam: John Benjamins.
Kormos, J., & Trebits, A. (2012). The role of task complexity, modality, and aptitude in narrative task performance. Language Learning, 62, 439–472. doi:10.1111/j.1467-9922.2012.00695
Kuiken, F., & Vedder, I. (2011). Task performance in L2 writing and speaking: The effect of mode. In P.
Robinson (Ed.), Second language task complexity: Researching the Cognition Hypothesis of language
learning and performance (pp. 91–​104). Amsterdam: John Benjamins.
Kuiken, F., & Vedder, I. (2012). Speaking and writing tasks and their effects on second language performance.
In S. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 364–​379).
New York: Routledge.
Long, M.H. (2016). In defense of tasks and TBLT: Nonissues and real issues. Annual Review of Applied
Linguistics, 36, 5–​33.
Manchón, R.M. (2011). Writing to learn the language: Issues in theory and research. In R.M. Manchón
(ed.), Learning-​to-​write and writing-​to-​learn in an additional language (pp. 51–​82). Amsterdam: John
Benjamins.
Manchón, R.M. (2014a). The internal dimension of tasks: The interaction between task factors and learner
factors in bringing about learning through writing. In H. Byrnes & R.M. Manchón (Eds.), Task-​based lan-
guage learning: Insights to and from writing (pp. 27–​52). Amsterdam: John Benjamins.
Manchón, R.M. (2014b) The distinctive nature of task repetition in writing. Implications for theory, research,
and pedagogy. ELIA, 14, 13–​42. doi:http://​dx.doi.org/​10.12795/​elia.2014.i14.02
Manchón, R.M. & Roca de Larios, J. (2007). Writing-​to-​learn in instructed language learning contexts. In
E.A. Soler & M.P.S. Jordá (Eds.), Intercultural language use and language learning (pp. 101–​121).
Berlin: Springer.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Shwieter, & A. Benati (Eds.), The Cambridge Handbook of Language Learning
(pp. 341–​363). Cambridge: Cambridge University Press.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P.K. Matsuda (Eds.),
The handbook of second and foreign language writing (pp. 567–​586). Boston, MA: de Gruyter Mouton.
Müller-Hartmann, A., & Schocker-von Ditfurth, M. (2011). Introduction to English language teaching. Stuttgart: Klett Lerntraining.
Nystrand, M. (1987). The role of context in written communication. In R. Horowitz & S.J. Samuels (Eds.),
Comprehending oral and written language (pp. 197–​212). San Diego, CA: Academic Press.
Ong, W. (1982). Orality and literacy: The technologizing of the word. New York: Methuen.
Révész, A. (2009). Task complexity, focus on form, and second language development. Studies in Second Language Acquisition, 31(3), 437–470.
Robinson, P. (2001). Task complexity, task difficulty, and task production: Exploring interactions in a compo-
nential framework. Applied Linguistics, 22(1), 27–57.
Robinson, P. (Ed.). (2011a). Second language task complexity: Researching the cognition hypothesis of lan-
guage learning and performance (Vol. 2). Amsterdam: John Benjamins Publishing.
Robinson, P. (2011b). Task-​based language learning: A review of issues. Language Learning, 61, 1–​36.
Robinson, P., & Gilabert, R. (2007). Task complexity, the Cognition Hypothesis and second language learning
and performance. IRAL-​International Review of Applied Linguistics in Language Teaching, 45(3), 161–​176.
Samuda, V., & Bygate, M. (2008). Tasks in second language learning. Basingstoke: Palgrave Macmillan.
Sánchez, A., Manchón, R.M., & Gilabert, R. (2020). The effects of task repetition across modalities and pro-
ficiency levels. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp.
121–​143). Amsterdam: John Benjamins.
Shehadeh, A., & Coombe, C.A. (Eds.). (2012). Task-​based language teaching in foreign language contexts:
Research and implementation (Vol. 4). Amsterdam: John Benjamins.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2003). Task-based instruction. Language Teaching, 36(1), 1–14.
Skehan, P. (2009). Modelling second language performance: Integrating complexity, accuracy, fluency, and
lexis. Applied Linguistics, 30, 510–​532.
Skehan, P. (Ed.) (2014). Processing perspectives on task performance. Amsterdam: John Benjamins.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidhofer (Eds.),
For H. G. Widdowson: Principles and practice in the study of language. A Festschrift on the occasion of
his 60th birthday (pp. 125–​144). Oxford: Oxford University Press.
Swain, M. (2005). The Output Hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in
second language teaching and learning (pp. 471–​483). Mahwah, NJ: Erlbaum.
Tavakoli, P. (2014). Storyline complexity and syntactic complexity in writing and speaking tasks. In H. Byrnes
& R.M. Manchón (Eds.), Task-​based language learning –​Insights from and for L2 writing (pp. 217–​236).
Amsterdam: John Benjamins.
Thomas, M., & Reinders, H. (Eds.). (2010). Task-​based language learning and teaching with technology.
London: A&C Black.
Van den Branden, K. (Ed.). (2006). Task-​based language education: From theory to practice. Cambridge:
Cambridge University Press.
Vasylets, O., Gilabert, R., & Manchón, R.M. (2017). The effects of mode and task complexity on second lan-
guage production. Language Learning, 67(2), 394–​430.
Vasylets, O., Gilabert, R., & Manchón, R.M. (2019). Differential contribution of oral and written modes to
lexical, syntactic and propositional complexity in L2 performance in instructed contexts. Instructed Second Language Acquisition, 3(2), 206–227. doi:10.1558/isla.38289
White, A.S., & Caminero, R. (1995). Using process writing as a learning tool in the foreign language class.
Canadian Modern Language Review, 51(2), 323–​329.
Williams, J. (2008). The speaking-​writing connection in second language and academic literacy development.
In D. Belcher and A. Hirvela (Eds.), The oral/​literate connection: Perspectives on L2 speaking, writing,
and other media interactions (pp. 210–​225). Ann Arbor, MI: University of Michigan Press.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21(4), 321–​331.
Wolfe-​Quintero, K., Inagaki, S., & Kim, H.Y. (1998). Second language development in writing: Measures of
fluency, accuracy and complexity. Honolulu: University of Hawai’i.
Yoon, H.J., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51(2), 275–​301.
Yu, G. (2009). Lexical diversity in writing and speaking task performances. Applied Linguistics, 31, 236–​259.
doi:10.1093/​applin/​amp024
Zalbidea, J. (2017). ‘One task fits all’? The roles of task complexity, modality, and working memory capacity
in L2 performance. The Modern Language Journal, 101(2), 335–352. doi:10.1111/modl.12389

5
TASK COMPLEXITY STUDIES
Mark D. Johnson
East Carolina University

Introduction
Second language (L2) writing research on cognitive task complexity (henceforth CTC) has explored
two over-​arching, closely-​related questions: (a) how can we ease the demands of the writing pro-
cess? and (b) how can we promote students’ writing performance in terms of improved complexity,
accuracy, and fluency of language produced? The natural desire for many teachers of L2 writing
is to ease the demands of the writing process in order to promote students’ writing performance.
This is certainly important in a testing situation. However, in pedagogical contexts, it may be in
the learners’ best interests to struggle with the language, as doing so may aid in the development of
general proficiency in the L2 (Robinson, 2011). Task-​based language teaching/​learning (henceforth
TBLT) offers a means to structure and/​or sequence tasks to promote production in the L2 and hence
the development of the L2.
In many academic contexts, common writing assignments certainly meet Skehan’s (1998) cri-
teria for language tasks: (a) the purpose for writing is to convey meaning; (b) writing is a com-
plex, problem-​solving activity; (c) commonly-​used writing tasks are a form of socialization into
the “real world” of an academic community; (d) task completion is paramount; and (e) the final
product –​the completion of the task –​is assessed. In particular, writing allows for intense focus on
the first two task features: communicative purpose and problem solving. Further, compared to oral
language production, writing takes place at a slower pace, which allows learners to direct atten-
tional resources so that they can use their explicit knowledge of the L2 to experiment with partially
acquired structures and/​or to gain greater control over acquired structures (Gilabert, Manchón, &
Vasylets, 2016; Manchón, 2011; Manchón & Vasylets, 2019; Manchón & Williams, 2016; Vasylets,
Gilabert, & Manchón, 2019; Williams, 2008, 2012). Thus, learners must grapple with the linguistic
resources available to them in an effort to communicate cognitively demanding messages.
Research on CTC in L2 writing is informed by three theoretical perspectives: (a) the Limited
Attentional Capacity Model (Foster & Skehan, 1996; Skehan, 1996, 1998; 2009; Skehan & Foster,
2001), (b) the Cognition Hypothesis (Robinson, 2001, 2003, 2005, 2011), and/​or (c) Kellogg’s
(1996; Kellogg, Whiteford, Turner, Cahill, & Mertens, 2013) model of working memory in L1
writing.
The Limited Attentional Capacity Model –​often called the Tradeoff Hypothesis –​proposes that
attentional resources are in limited supply, and that using the L2 is a highly demanding cognitive
process to be managed by working memory/​attentional resources (Kormos, 2006). In Skehan’s
(1998) framework, the complexity of a task can be described along three dimensions: (a) code com-
plexity, or the complexity of the language needed to complete the task; (b) cognitive complexity,
or the degree of the learner’s familiarity with the task and/​or the cognitive processes required by
the task; and (c) communicative stress, typically described in terms of time pressure and/​or an
increased number of interlocutors. When attentional resources are stretched thin by a cognitively
complex task, the learner is predicted to focus on meaning – and therefore on the use of automatized chunks of language – resulting in fluent production at the expense of complex and/or accurate
production (Skehan, 2014). Conversely, less complex tasks are thought to free attentional resources
so that the learner may focus on the rule-​based interlanguage system, resulting in the production of
complex and/​or accurate language. This increase in complexity or accuracy is predicted to come at
the expense of fluency.
In contrast, the Cognition Hypothesis conceives of multiple attentional resources and proposes
CTC as one of three dimensions for consideration when designing and sequencing tasks for lan-
guage learners (see Table 5.1): (a) task complexity, which describes the cognitive demands of the
task; (b) task condition, which describes the interactional demands of the task; and (c) task diffi-
culty, which describes the ability requirements of the task. As can be seen in the first column of
Table 5.1, the framework proposes two dimensions of CTC: (a) a resource-​dispersing dimension of
CTC and (b) a resource-​directing dimension of CTC. Tasks that are complex along the resource-​
dispersing dimension of CTC are thought to affect language production in much the same manner
predicted by the Limited Attentional Capacity Model. In contrast, tasks that are complex along the
resource-​directing dimension of CTC are thought to focus attentional resources on the production
of complex, accurate –​ though less fluent –​ language to meet the demands of the task. So long as
the resource-​dispersing features of a task are not simultaneously increased (Robinson, 2001, 2003),
tasks that are complex along the resource-​directing dimension of CTC are thought to necessitate the
use of complex, accurate language to complete them.

Table 5.1  Resource-​directing and resource-​dispersing features of cognitive task complexity

Task Complexity
  Resource-directing features of task complexity: +/− here-and-now; +/− few elements; +/− spatial reasoning; +/− causal reasoning; +/− intentional reasoning; +/− perspective taking
  Resource-dispersing features of task complexity: +/− planning time; +/− prior knowledge; +/− single task; +/− task structure; +/− few steps; +/− interdependency of steps

Task Condition
  Participation features: +/− open solution; +/− one-way flow; +/− convergent solution; +/− few participants; +/− few contributions needed; +/− negotiation not needed
  Participant features: +/− same proficiency; +/− same gender; +/− familiar; +/− shared content knowledge; +/− equal status and role; +/− shared cultural knowledge

Task Difficulty
  Ability features: h/l working memory; h/l reasoning; h/l task-switching; h/l aptitude; h/l field independence; h/l mind-reading
  Affective features: h/l openness; h/l control of emotion; h/l task motivation; h/l anxiety; h/l willingness to communicate; h/l self-efficacy

Source: Robinson (2011).
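To make the distinction between the two dimensions of CTC more concrete, the sketch below shows a hypothetical coding scheme of our own devising (the labels loosely echo Table 5.1 but are not Robinson’s) that a researcher might use to document how a simple and a complex version of the same writing task differ. It illustrates, in data form, the kind of manipulation the Cognition Hypothesis is concerned with: raising demands along resource-directing features while holding resource-dispersing features constant.

```python
from typing import Dict, List

# Subsets of the feature inventories in Table 5.1 (labels ours, for illustration only).
RESOURCE_DIRECTING = ["reference to here-and-now", "number of elements", "causal reasoning"]
RESOURCE_DISPERSING = ["planning time", "prior knowledge", "single vs. dual task"]

# For each feature we record whether a task version sits at the less demanding or the
# more demanding pole (avoiding Robinson's +/- signs, whose direction varies by feature).
simple_version: Dict[str, str] = {
    feature: "less demanding" for feature in RESOURCE_DIRECTING + RESOURCE_DISPERSING
}

# The complex version raises demands only along resource-directing features, keeping
# resource-dispersing features constant: the configuration predicted to push learners
# toward more complex and accurate (though less fluent) output.
complex_version = dict(simple_version)
for feature in ["number of elements", "causal reasoning"]:
    complex_version[feature] = "more demanding"

def manipulated_features(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """List the features whose setting differs between two task versions."""
    return [feature for feature in a if a[feature] != b[feature]]

print(manipulated_features(simple_version, complex_version))
# ['number of elements', 'causal reasoning']
```

Documenting task versions explicitly in this way also makes it easier to report, and later validate, exactly which demands were manipulated, an issue taken up in the Main Research Methods section below.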


Table 5.2  Writing systems and processes associated with Kellogg’s model of working memory in L1 writing

The Writing Process (systems, processes, and sub-processes)

Formulation
  Translation: the translation of ideas into language
  Planning: attention to one of the following sub-processes
    Idea generation: generation of ideas for potential inclusion in the developing text
    Organization: the organization of ideas
    Goal setting: attention to considerations of audience and purpose

Execution
  The coordination of muscle movements to encode language on the page (or computer monitor)

Monitoring
  The comparison of the developing text to the author’s goals for the text
    Sub-processes: Reading; Editing

Source: Adapted from Kellogg, Whiteford, Turner, Cahill, and Mertens (2013).

The third theoretical perspective is specific to writing. Kellogg’s model of working memory
in L1 writing links components of working memory (Baddeley, 1986, 2007) to writing processes
identified in the early research of Flower and Hayes (1980), each of which is described in Table 5.2.
Kellogg (1996) proposes that, during the writing process, formulation and monitoring are in direct
competition for working memory resources. Thus, channeling the writer’s attention to focus on one
or the other is thought to promote the fluency of writing, the perceived quality of writing, and –​
perhaps –​the linguistic complexity of writing. For example, the results of Kellogg’s early research
(1987a, 1987b, 1988, 1990) suggest that providing L1 writers with time to plan before composing
allows them to focus their attentional resources on translating ideas into language, thus resulting in
increased fluency and holistic quality of writing. Kellogg’s results also suggest –​though indirectly –​
that this focus on the translation process increases the complexity of language produced.
Comparisons of Robinson’s and Kellogg’s models illustrate the exceptional complexity of
writing tasks, particularly when such tasks are completed in a language over which the L2 writer
may have limited command. Not only must L2 writers manage the complexity of the writing task
(i.e., the specific cognitive demands of the writing task presented to them), but they must also
manage the complex performance demands of the written mode (i.e., the processes inherent to com-
posing, see Table 5.2). However, these challenges also present opportunities for educators. Better
understanding the cognitive demands of L2 writing tasks may allow practitioners to design and/​or
sequence tasks that (a) focus learners’ attention on monitoring the developing text –​resulting
in greater accuracy –​or (b) focus learners’ attention on translating ideas into language –​resulting
in greater complexity and/​or fluency.

Critical Issues and Topics


Early examinations of CTC in L2 writing appear to have been concerned primarily with testing
the predictions of the Limited Attentional Capacity Model and/​or the Cognition Hypothesis in the
written modality (Ishikawa, 2007; Kuiken & Vedder, 2007, 2008; Kuiken, Mos, & Vedder, 2005),
focusing on the impact of CTC on the syntactic complexity, accuracy, lexical complexity, and/​or
fluency (henceforth CALF) of written production. Though the results of these studies offer little to
support either hypothesis, several trends suggest some effect of CTC on written L2 production. For
example, each of the above studies examined the effects of CTC and L2 proficiency on the accuracy
of L2 written output and found significant main effects of CTC on the accuracy of participants’
writing. Participants in the more complex task condition produced fewer errors, but there is some
question as to whether the effect of CTC on L2 written accuracy is moderated by L2 proficiency.
Kuiken and Vedder (2007, 2008) and Kuiken, Mos, and Vedder (2005) found consistent positive
effects of L2 proficiency and increased CTC on the accuracy of L2 written production but found no
significant interaction between CTC and L2 proficiency. In contrast, Ishikawa (2007) found signifi-
cant interaction effects of L2 proficiency and CTC. More proficient L2 writers in Ishikawa’s study
produced less accurate writing in the complex task condition.
Regarding the effect of CTC on the syntactic and lexical complexity of L2 written production,
the results are again contradictory. Ellis and Yuan (2004) found that reducing the complexity of a
writing task by providing planning time increased the variety of syntactic forms used among L2
writers. In contrast, Ishikawa (2007) observed a significant effect of increased CTC on the lex-
ical complexity of L2 production, though the effect was moderated by the L2 proficiency of the
participants. Further complicating matters, Kuiken et al. (2005) found no effect of CTC nor L2
proficiency on the syntactic and lexical complexity of L2 written production, whereas Kuiken and
Vedder (2008) found the effect of CTC on lexical complexity to differ depending on the proficiency
and target language of the participants.
Research results on the effects of CTC on the fluency of L2 written production offer no clear
support for either hypothesis. Ellis and Yuan (2004) found reduced CTC –​in the form of planning –​
to be associated with increases in L2 writing fluency, whereas a large-​scale study by Johnson,
Mercado, and Acevedo (2012) found negligible effects of planning on the fluency of L2 written
production. In contrast, Ishikawa (2007) found increasing the resource-​directing features of CTC to
be associated with increased writing fluency.
Though the results of previous research on the effects of CTC on L2 written production offer no
clear support for the Cognition Hypothesis or the Limited Attentional Capacity Model as applied to
writing, they do offer some evidence that the complexity of a writing task facilitates attention to lan-
guage, whether in terms of monitoring the existing interlanguage system or in terms of restructuring
the interlanguage system to meet the communicative demands of the task. Further, due to the unique
performance conditions of the written mode, L2 writers may be able to direct the focus of their attention
to any dimension of the task (e.g., content, discourse-​related concerns, language-​related concerns,
etc.), or to any of the macro writing processes (i.e., planning, monitoring, formulation, or revision; see
Table 5.2) (Ishikawa, 2007; Johnson, 2017). Thus, L2 writing tasks may be designed with complex
features in mind to promote attention to language. However, how that additional attention manifests
itself is subject to how L2 writers –​consciously or unconsciously –​direct their attentional resources.

Current Contributions and Research


As advanced above, recent L2 writing research on CTC suggests that the nature of the written mode
facilitates attention to language: (a) the additional time and slower pace of the writing process
facilitate simultaneous attention to form and meaning (Manchón & Vasylets, 2019; Vasylets et al.,
2019; Williams, 2008, 2012); and (b) the recursive nature of the writing process allows learners to
focus attentional resources on the formulation or monitoring processes throughout the entire task
completion process (Johnson, 2017; Manchón, 2014; Manchón & Vasylets, 2019).
These tenets have been tested in a group of task-​modality studies (cf. Cho, 2015, 2018; Tavakoli,
2014; Vasylets, Gilabert, & Manchón, 2017; Vasylets et al., 2019; Zalbidea, 2017). Overall, these
recent contributions to TBLT research provide empirical evidence of differences in CALF between
the oral and written mode of L2 production. Importantly, these differences have been found to be
moderated by the complexity of the writing task (Cho, 2018; Vasylets et al., 2017; Zalbidea, 2017).
For example, some studies suggest that language produced in the written mode is more complex,
more accurate, but less fluent than language produced in the oral mode (Vasylets et al., 2019).
Additionally, the effects of CTC features appear to interact with mode to increase the lexical com-
plexity and accuracy (Zalbidea, 2017) as well as the propositional complexity –​i.e., the number and
quality of ideas –​of L2 written production (Vasylets et al., 2017).
Part of this recent line of research has explored the role of working memory capacity in CTC
effects (Cho, 2018; Michel, Kormos, Brunfaut, & Ratajczak, 2019; Zalbidea, 2017). For example,
Zalbidea’s (2017) results suggest working memory capacity moderates the effect of CTC on L2
written production, while Michel et al. (2019) suggest that working memory may interact with CTC
features to promote attention to monitoring. In contrast, Cho’s (2018) results suggest no effect
of working memory capacity. In sum, the results of this research area suggest no clear effect of
working memory capacity on L2 written production.
Further research has expanded beyond examinations of CTC to include interaction –​a feature of
task condition in Robinson’s framework (see Table 5.1) –​as well as learner affect –​a feature of task
difficulty in Robinson’s framework –​and their impact on the CALF of L2 written production. The
results of this research suggest positive effects of collaborative planning on the CALF of written
L2 production (Abrams & Byrd, 2017) and suggest that learner affect, in particular writing anxiety
(Rahimi & Zhang, 2019), may moderate the effect of CTC on the CALF of L2 written production.
Given that task sequence is at the heart of both the Cognition Hypothesis and the Limited
Attentional Capacity Model (Choong, 2011), it is perhaps ironic that task sequence is an under-
studied area of TBLT-​informed L2 writing research. Only two studies have examined task sequence
in L2 writing (Allaw & McDonough, 2019; Lambert & Robinson, 2014). Allaw and McDonough (2019) show positive effects of sequencing tasks from less complex to more complex on the lexical complexity and fluency of L2 writing. In contrast, Lambert and Robinson (2014) show minimal
effects of task sequence on L2 written production.
As summarized above, recent developments in TBLT-​informed L2 writing research indicate that
oral and written language differ in CALF but that these differences are moderated by the complexity
of the writing task. This would suggest, as is discussed in the following section on performance
variables, that linguistic complexity may be manifested differently in the two modes. However, the
complexity of the writing task does exert some influence over how the L2 writer responds to the
demands of the task. Some research evidence –​though limited –​suggests that working memory
capacity, at the heart of TBLT-​informed research and cognitive research on L2 writing, may mod-
erate the effect of CTC on L2 writing performance. Additionally, interaction and affect may further
influence the effect of CTC on L2 written output.

Main Research Methods


As Manchón and Vasylets (2019) have noted, L2 writing research on CTC is somewhat limited,
and methodological variation makes direct comparison of –​not to mention generalization from –​
studies difficult at best. This section attempts to summarize L2 writing research methods on CTC to
date, making comparisons among methodologically similar studies where possible.
In terms of research designs, studies reviewed for this chapter (N = 26) were divided nearly
evenly among one-shot designs, repeated-measures designs, and factorial designs. Quasi-
experimental one-​shot designs were limited to studies of planning in L2 writing, likely due to the
nature of writing tasks and the challenges inherent in designing a control, as no writing task is likely
to be free of CTC features. In repeated-​measures designs, the participants were typically asked
to complete two or more tasks varying in complexity within a single session. In other cases, intervals between tasks ranged from one to two days (Ruiz-Funes, 2015) to an entire semester in longitudinal studies (Lambert & Robinson, 2014). Factorial research designs examined main effects of CTC,
certainly, as well as main and/​or interaction effects of task sequence (Allaw & McDonough, 2019),
L2 proficiency (Kuiken & Vedder, 2007, 2008; Kuiken, Mos, & Vedder, 2005), and/​or mode (Cho,
2015, 2018; Vasylets et al., 2017). However, for the purposes of this chapter, the independent vari-
able of interest in these studies is CTC.
With regard to populations studied, the majority of the studies reviewed for this chapter examined
adult EFL learners, typically in university courses or in private language schools. A handful of
studies examined ESL learners in university courses in the United States, Britain, or New Zealand.
Even fewer studies examined L2 writing among young learners and L2 writing in languages other
than English.
The most commonly-​used writing tasks were narrations of picture sequences or argumenta-
tive essays, followed by letters to friends, expository essays, and writing tasks described simply
as essays. Picture sequences were often used to extend TBLT studies of oral language production
(Ellis & Yuan, 2004; Mohammadzadeh Mohammadabadi, Dabaghi, & Tavakoli, 2013; Shajeri &
Izadpanah, 2016; Tavakoli, 2014) or to establish criterion validity through comparisons to tasks
used in proficiency exams (Meraji, 2011). The use of argumentative essays appears to be an effort
to provide face validity for learners of English for academic purposes, whereas the use of letters
appears to be more communicatively authentic. The variety of genres employed in these studies
further highlights the challenges in making direct comparisons among CTC studies of L2 writing.
These challenges are compounded when outcome variables –​particularly metrics of syntactic com-
plexity –​are also considered. As will become clearer in a later discussion of outcome variables,
linguistic complexity is manifested differently, depending on the genre elicited by a writing task.
The most studied feature of CTC was planning, typically some form of pre-task planning operationalized as ten minutes of time prior to composing. A handful of studies manipulated pre-task planning through (a) the use of common planning techniques (e.g., mind-mapping, chronological sequencing), (b) a focus on one or more sub-processes of planning (i.e., idea generation, organization, and/or goal setting), or (c) a focus on content and/or language. Only two studies examined online planning (i.e., planning done during composing), operationally defined as unlimited composing time.
The second most examined CTC feature was the number of elements, followed closely by
reasoning demands. Number of elements was usually operationalized as the number of required
conditions to meet the demands of the task. For example, a series of studies by Kuiken, Mos, and
Vedder (2005) and Kuiken and Vedder (2007, 2008) asked participants to write letters recommending
a vacation destination to a friend. The complex task condition asked the participants to consider
six required conditions (e.g., location, amenities, food options), whereas the less complex task
condition asked participants to consider only three required conditions. Though the writing tasks
differed somewhat, two studies by Cho (2015, 2018) and a study by Frear and Bitchener (2015)
operationalized the number of elements in similar ways.
Operational definitions of reasoning demands were much less consistent, ranging from the
demands of sequencing a jumbled picture narrative (Kormos, 2011; Shajeri & Izadpanah, 2016)
to very careful considerations of causal relationships when retelling events in a video sequence
(Choong, 2014). Other operational definitions suggested that reasoning demands are inherent to
academic writing tasks (Ruiz-​Funes, 2015).
Importantly, very few early studies offered tests of the validity of CTC manipulation, a common
oversight in TBLT-​informed L2 writing research (Révész, 2014). Robinson (2001) suggests a
number of methods to validate the complexity of a language task: (a) successful task comple-
tion, (b) time needed to complete the task, (c) perceptions of a task’s difficulty/​complexity,
(d) physiological responses to a task, and/or (e) interference from competing tasks. However, many L2 writing studies on CTC assume that the task features identified in Robinson's (2011) framework are, in fact, complex. Notable exceptions include efforts to establish the complexity of writing tasks through (a) questionnaires of the participants' perceptions of CTC (Choong, 2014; Révész, Kourtali, & Mazgutova, 2017; Yang, 2014; Zalbidea, 2017), (b) questionnaires of native speakers' perceptions of CTC (Zalbidea, 2017), (c) pilot studies which questioned participants about their perceptions of CTC (Rahimi & Zhang, 2019), (d) reference to previous studies establishing the complexity of the task (Vasylets et al., 2017), or (e) the use of multi-facet Rasch analysis to quantitatively measure CTC on a logit scale (Choong, 2014). These exceptions highlight the importance
of including similar metrics in future TBLT-​informed L2 writing research.
As for outcome measures, true to their orientation in SLA, the preponderance of studies reviewed
for this chapter examined the effect of CTC on some metric of L2 written CALF. Metrics of fluency and accuracy appear relatively straightforward and are similar across studies.
Metrics of syntactic and lexical complexity, in contrast, suggest (a) a movement from complexity
measures more suited to oral language analysis toward complexity measures more suited to written
language analysis and (b) a need for comparable metrics of syntactic and lexical complexity across
studies.
In terms of fluency metrics, writing fluency was most often measured as words per minute or as
total words produced when writing time was held constant. A study by Révész et al. (2017) stands
out for its use of the number of pauses per 100 words and P-​burst –​the number of words produced
between pauses –​ as process-​oriented fluency metrics. Accuracy metrics appear to be evenly split
between metrics of accurate language produced and metrics of errors produced, each normed to the
unit of analysis –​typically the T-​unit or the clause.
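To make these operationalizations concrete, the short sketch below (in Python) illustrates how the fluency and accuracy metrics just described might be computed from hand-coded data. It is purely illustrative rather than drawn from any of the studies reviewed; the variable names, the two-second pause threshold, and the simplified P-burst calculation are assumptions.

# Illustrative sketch only: common fluency and accuracy metrics described above,
# computed from hypothetical, pre-coded writing-process data. All names and
# thresholds are assumptions, not taken from the studies reviewed.

PAUSE_THRESHOLD = 2.0  # seconds; an assumed cut-off for counting a gap as a pause

def fluency_metrics(total_words, minutes, inter_key_intervals):
    """Words per minute, pauses per 100 words, and approximate mean P-burst length."""
    n_pauses = sum(1 for gap in inter_key_intervals if gap >= PAUSE_THRESHOLD)
    return {
        "words_per_minute": total_words / minutes,
        "pauses_per_100_words": 100 * n_pauses / total_words,
        # approximation of P-burst: mean number of words produced between pauses
        "mean_p_burst": total_words / (n_pauses + 1),
    }

def accuracy_metrics(n_clauses, n_error_free_clauses, n_errors, n_t_units):
    """Accuracy normed to the clause and to the T-unit."""
    return {
        "error_free_clause_ratio": n_error_free_clauses / n_clauses,
        "errors_per_t_unit": n_errors / n_t_units,
    }

# Hypothetical example: a 250-word text written in 20 minutes
print(fluency_metrics(250, 20, inter_key_intervals=[0.3, 2.5, 0.8, 3.1, 0.4]))
print(accuracy_metrics(n_clauses=30, n_error_free_clauses=21, n_errors=12, n_t_units=18))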
Metrics of syntactic complexity used in the studies reviewed for this chapter illustrate the early
focus of SLA research on oral language production. Among the studies examined, most relied on
measures of verbal subordination, most commonly the ratio of clauses to T-​unit, as metrics of syn-
tactic complexity. However, recent corpus-​analytic research (Biber & Gray, 2010; Biber, Gray, &
Poonpon, 2011, 2013) confirms Halliday’s (1989) early characterization of complexity in speech
as verbal and complexity in writing as nominal, suggesting that gauging the use of extended noun
phrases and post-​nominal modification may be more appropriate to the analysis of writing. This
is not to say that metrics of subordination have no place in the study of L2 writing development.
Byrnes (2012), Norris and Ortega (2009), and Ryshina-Pankova (2015) note that the use of subordination is indicative of an intermediate stage of L2 writing development as learners gain greater command over the lexico-grammatical features that are typical of written language. These authors recommend that metrics of subordination be used in concert with metrics of phrasal complexity to capture syntactic complexity and its relationship to L2 writing development.
Lexical complexity metrics used in CTC studies typically include some measure of lexical
diversity, most frequently mean-​segmental type-​token ratio or a variation of Guiraud’s index.
Although each of these metrics minimizes the impact of text length on type-​token ratio, each has its
disadvantages (McCarthy & Jarvis, 2007, 2010). Relatively few studies have employed more recent metrics of lexical diversity such as D, vocd-D, or MTLD (Malvern, Richards, Chipere, & Durán, 2004; McCarthy, 2005), which are less affected by increases in text length. More recent studies
have included metrics of lexical sophistication and lexical density, using the relative frequency of
lexis (Johnson et al., 2012; Kormos, 2011; Kuiken & Vedder, 2007; Révész et al., 2017; Vasylets
et al., 2017) or ratio metrics of content words to total words (Yang, 2014).
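As a concrete illustration of these lexical measures, the following sketch (again in Python and purely illustrative, not taken from any study reviewed here) computes simple type-token ratio, Guiraud's index (types divided by the square root of tokens), mean-segmental TTR over fixed-length segments, and lexical density; the segment length and the stopword-based approximation of content words are assumptions.

# Illustrative sketch only: simple lexical complexity metrics discussed above.
# The segment length and the stopword-based notion of "content word" are assumptions.
import math

def type_token_ratio(tokens):
    return len(set(tokens)) / len(tokens)

def guiraud_index(tokens):
    # Guiraud's index: types / sqrt(tokens), reducing the effect of text length on TTR
    return len(set(tokens)) / math.sqrt(len(tokens))

def mean_segmental_ttr(tokens, segment_length=50):
    # Average TTR over successive non-overlapping segments of equal length
    segments = [tokens[i:i + segment_length]
                for i in range(0, len(tokens) - segment_length + 1, segment_length)]
    return sum(type_token_ratio(seg) for seg in segments) / len(segments)

def lexical_density(tokens, function_words):
    # Ratio of content words to total words (cf. the operationalization in Yang, 2014)
    content = [t for t in tokens if t not in function_words]
    return len(content) / len(tokens)

# Hypothetical usage with a short, pre-tokenized, lower-cased learner text
tokens = "the task asked the writers to plan the trip before they wrote the letter".split()
print(type_token_ratio(tokens), guiraud_index(tokens))
print(mean_segmental_ttr(tokens, segment_length=5))
print(lexical_density(tokens, function_words={"the", "to", "before", "they"}))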
Missing from research on CTC and L2 writing is an examination of genre, L2 development, and the interplay of syntactic and lexical complexity. For example, early corpus-analytic research
(Biber, 1988) suggests tradeoffs between lexical complexity and syntactic complexity as a response
to the informational density of academic genres. Further, research informed by Dynamic Systems
Theory (Spoelman & Verspoor, 2010; Verspoor, Lowie, & Van Dijk, 2008; Verspoor, Schmid, &
Xu, 2012) suggests longitudinal tradeoffs between lexical complexity and syntactic complexity
associated with increased proficiency in the L2.
Finally, as previously noted, common metrics are needed across studies to facilitate comparison.
As Bulté and Housen (2012) have remarked, (a) operational definitions of linguistic complexity are often circular, and (b) construct definitions of linguistic complexity are often missing, with operational definitions used in their stead. Thus, principled consideration
of L2 development, writing development, and the genre/​register of the writing task is needed when
selecting metrics of syntactic and lexical complexity. As corpus-​analytic research has consistently
demonstrated (Biber & Gray, 2010; Biber et al., 2011, 2013), metrics of verbal subordination may
best capture the linguistic complexity of interpersonal genres such as narratives and letters, whereas
metrics of phrasal elaboration may best capture the linguistic complexity of informational genres
such as argumentative essays. Furthermore, movement from greater verbal subordination toward
greater phrasal elaboration is also associated with writing development both in the L1 (Staples,
Egbert, Biber, & Gray, 2016) and in the L2 (Byrnes, 2012; Norris & Ortega, 2009; Ryshina-​
Pankova, 2015).

Recommendations for Practice


Though the results of research on CTC and L2 writing offer no clear support for either the Cognition Hypothesis or the Limited Attentional Capacity Model, findings for the most commonly studied CTC features do suggest that CTC can be manipulated to focus learners' attentional resources on language more generally. Instructors may construct tasks to facilitate attention to language. Whether
that focus results in greater syntactic and lexical complexity, accuracy, or fluency appears to be a
function of the “directability of attention” (Ishikawa, 2007, p. 151) inherent to the writing process
(see Table 5.2).
The single most-researched feature of CTC is the provision of pre-task planning. While the results are mixed, they suggest that providing L2 writers with time to plan prior to composing facilitates attention to language. Interestingly, the structure of pre-task planning appears to matter little: simply providing learners with 10 minutes of time prior to composing appears to improve the syntactic complexity and fluency of written L2 production. It would seem, then, that giving L2 writers time to plan their writing prior to composing frees attentional resources such that they may direct those resources to the translation writing process (see Table 5.2), thus resulting in greater syntactic complexity as well as greater fluency.
Though the results of L2 writing studies examining the number of elements required in a writing
task are inconclusive, they do suggest that the synthesis of input material facilitates attention to
language. For example, asking L2 writers to consider multiple options and multiple perspectives
in making a recommendation appears to focus attentional resources on monitoring language pro-
duction (Cho, 2018; Kuiken & Vedder, 2007, 2008; Kuiken, Mos, & Vedder, 2005) and/​or syn-
tactic and lexical retrieval (Cho, 2018; Rahimi & Zhang, 2019). It may also be possible that the
written input inherent in such tasks provides L2 writers with additional models of written language
production. Regardless, asking learners to synthesize input and make critical recommendations/​
judgments based on that input facilitates attention to language, resulting in improved accuracy and/​
or complexity.
Increasing the reasoning demands of a writing task appears to result in greater syntactic and
lexical complexity of language produced. However, there do seem to be tradeoff effects in terms
of reduced accuracy of language produced. Some research evidence suggests that increased com-
plexity of L2 production may be the result of a need to communicate increasingly complex ideas
(Vasylets et al., 2017). In practical terms, however, there is some inconsistency in what constitutes
a reasoning demand. Some have argued that certain academic genres – most commonly argumentative essays – pose reasoning demands on the learner, whereas others have suggested that considerations of audience and rhetorical purpose do so. Regardless,
practice composing in various genres for various audiences allows L2 writers to experiment with
the meaning-​making resources associated with each genre.
Due to the inconclusive –​ sometimes contradictory –​ results of previous research, it is diffi-
cult to make firm recommendations for pedagogical practice. Yet a number of trends suggest that
planning prior to composing, increasing the number of elements required for task completion, and
increasing the reasoning demands of a writing task facilitate L2 writers’ attention to language.
Because writing allows for greater attention to form-​meaning relationships, learners may devote
attentional resources to manipulating the interlanguage system to respond to the increased commu-
nicative demands of the task, resulting in greater complexity of language. In contrast, L2 writers
may devote attentional resources to monitoring the developing text, resulting in greater accuracy
of language.

Future Directions
Recent trends in TBLT-informed research noted above provide a number of avenues for future inquiry, among them (a) the validation of CTC features; (b) the specific role of working memory in L2 writing processes; (c) the relationship(s) among syntactic complexity, lexical complexity, propositional complexity, and genre; and (d) the interplay of general L2 development, L2 writing development, and genre familiarity. Central to each is the role of mode in language production.
First, further research is needed to determine whether theorized features of CTC are, in fact,
complex. As noted previously, validation of CTC features appears to be an oversight in past TBLT-​
informed L2 writing research. Though more recent research has taken steps to establish the complexity of tasks, more research is needed to corroborate these findings.
In order to better understand the allocation of attentional resources during writing tasks, future
TBLT-​informed L2 writing research should examine the role of working memory capacity in (a) the
completion of complex tasks as well as (b) the writing process (whether L1 or L2). While recent
research has examined working memory capacity as a moderating variable (Cho, 2018; Michel
et al., 2019; Zalbidea, 2017), the results of such studies are contradictory at times. For example,
Cho (2018) found working memory capacity to have no effect on the CALF of oral and written pro-
duction. In contrast, Zalbidea (2017) found that participants with greater working memory capacity
produced more complex language in the oral mode (oral vs. written) and more accurate language
in the written mode. More recently, Michel et al. (2019) corroborated Zalbidea’s (2017) results by
finding that increases in working memory capacity were associated with improved performance
on an academic editing task. It would appear that working memory capacity aids in the production
of accurate language in the written mode, but the difference in findings between Zalbidea (2017)
and Cho (2018) points to a need for further research on the relationship(s) among CTC, propos-
itional complexity, and the CALF of L2 written production (Vasylets et al., 2019) and how each
may be influenced by the mode of L2 production as well as the genre of written production (Yoon
& Polio, 2017).
Recent inquiries (Vasylets et al., 2017; Zalbidea, 2017) have found increases in resource-​
directing features of CTC and decreases in resource-​dispersing features of CTC to be associated
with increased propositional complexity among L2 writers. However, the relationship between
propositional complexity and linguistic complexity (whether syntactic or lexical), as well as the
relationship between propositional complexity and L2 written fluency, is not entirely clear. Vasylets
et al. (2017) found increases in lexical and syntactic complexity corresponding to increases in prop-
ositional complexity, a difference associated, perhaps, with the written mode of production (Vasylets
et al., 2019). Thus, more research is needed to better understand (a) whether the complexity of a
task impacts the propositional complexity of L2 writing and (b) whether propositional complexity
in L2 writing is related to linguistic complexity. Further, the relationship(s) among propositional
complexity, syntactic complexity, and lexical complexity –​and the appropriacy of complex forms
to the genre elicited by the writing task –​requires further research.
Understanding the relationships among genre familiarity, general L2 development, general
writing development, and specific L2 writing development and their impact on the complex forms
used by L2 writers is necessary to fully understand the effects of CTC on L2 written production.
Examination of TBLT-informed L2 writing research investigating differences between the oral and written modes of L2 production suggests that the distinction between the two modes may be best described as lying on a continuum (Gilabert et al., 2016). L2 use in each mode is
influenced not only by proficiency in the L2, but also by genre knowledge and considerations of
audience.
Studies examining the differential effects of CTC across productive modes have arrived at sometimes
contradictory results. For example, Cho (2018) and Vasylets et al. (2017) found CTC to increase
the syntactic complexity of L2 production in the written mode. In contrast, Zalbidea (2017) found
no significant effect of CTC in the written or the oral mode. Rather, Zalbidea (2017) notes signifi-
cant effects of mode on the syntactic complexity of production; in the spoken mode, participants
produced more complex language than did participants in the written mode. Cho (2018) also notes
significant effects of mode on the complexity, accuracy, and fluency of language produced; in the
spoken mode, participants produced more fluent, less accurate language. However, Cho’s (2018)
results highlight the difference in how complex language is manifested in the written mode. When
writing, participants in the more complex task condition produced a greater number of complex
nominals per T-​unit –​a metric typically associated with (a) academic prose (Biber, 1988; Biber
& Gray, 2010; Biber et al., 2011, 2013), (b) L1 writing development (Staples et al., 2016), and/​
or (c) general L2 development (Norris & Ortega, 2009). In contrast to Cho (2018), Vasylets et al. (2017) found that participants in the written mode produced more subordinated units – a metric typically associated with oral language production – than participants in the oral mode. The contrast in findings further illustrates the challenges in identifying the relationship(s) among productive mode, general L2 proficiency, writing proficiency, and genre.
Because writing allows for a “directability of attention” (Ishikawa, 2007, p. 151), future research
examining the specific effects of CTC on the writing processes of L2 learners is needed to better
understand how writing tasks can be structured and/​or sequenced to promote development in
the L2. In particular, longitudinal studies are needed to understand how tasks can be sequenced to interact with various writing processes to facilitate experimentation with partially acquired structures as well as to promote the use of explicit L2 knowledge to gain greater control over previously acquired forms (Gilabert et al., 2016; Manchón, 2011; Manchón & Vasylets, 2019; Manchón
& Williams, 2016; Vasylets et al., 2019; Williams, 2008, 2012).
Finally, TBLT-​informed L2 writing research appears to be in its early stages, and for this reason,
the primary recommendation for future research offered here is replication of existing research.
Despite recent meta-analytic evidence suggesting consistent effects of complex task features on the
CALF of written L2 production (Johnson, 2017), the inconclusive results of previous research jus-
tify the pursuit of replication studies.

References
Abrams, Z.I., & Byrd, D.R. (2017). The effects of meaning-​focused pre-​tasks on beginning-​level L2 writing in
German: An exploratory study. Language Teaching Research, 21, 434–​453. doi:10.1177/​1362168815627383
Allaw, E., & McDonough, K. (2019). The effect of task sequencing on second language written lexical com-
plexity, accuracy, and fluency. System, 85, 1–​12. doi:10.1016/​j.system.2019.06.008
Baddeley, A. (1986). Working memory. Oxford: Oxford University Press.
Baddeley, A. (2007). Working memory, thought, and action. Oxford: Oxford University Press.
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Biber, D., & Gray, B. (2010). Challenging stereotypes about academic writing: Complexity, elaboration, expli-
citness. Journal of English for Academic Purposes, 9, 2–​20. doi:10.1016/​j.jeap.2010.01.001
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure gram-
matical complexity in L2 writing development? TESOL Quarterly, 45, 5–​35. doi:10.5054/​tq.2011.244483
Biber, D., Gray, B., & Poonpon, K. (2013). Pay attention to the phrasal structures: Going beyond T-​units –​
A response to WeiWei Yang. TESOL Quarterly, 47, 192–​201. doi:10.1002/​tesq.84
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken, & I.
Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA
(pp. 21–​46). Amsterdam: John Benjamins.
Byrnes, H. (2012). Conceptualizing FL writing in collegiate settings: A genre-​based systemic functional lin-
guistics approach. In R.M. Manchón (Ed.), L2 writing development: Multiple perspectives (pp. 191–​219).
Boston, MA: De Gruyter Mouton.
Cho, M. (2015). Task design features and learner variables in task performance and task experience
(Unpublished doctoral dissertation). University of Hawai‘i at Mānoa, Honolulu, HI.
Cho, M. (2018). Task complexity, modality, and working memory in L2 task performance. System, 72, 85–​98.
doi:10.1016/​j.system.2017.10.010
Choong, K.P. (2011). Task complexity and linguistic complexity: An exploratory study (Working Papers in
TESOL & Applied Linguistics, Vol. 11, No. 1, pp. 1–​28). New York: Teachers College, Columbia University.
Choong, K.P. (2014). Effects of task complexity on written production in L2 English (Unpublished doctoral
dissertation). Teachers College, Columbia University, New York.
Ellis, R., & Yuan, F. (2004). The effects of planning on fluency, complexity, and accuracy in second language
narrative writing. Studies in Second Language Acquisition, 26, 59–​84. doi:10.1017/​S0272263104261034
Flower, L.S., & Hayes, J.R. (1980). The dynamics of composing: Making plans and juggling constraints. In
L.W. Gregg & E.R. Steinberg (Eds.), Cognitive processes in writing (pp. 31–​50). Hillsdale, NJ: Lawrence
Erlbaum.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language performance.
Studies in Second Language Acquisition, 18, 299–​326. doi:10.1017/​S0272263100015047
Frear, M.W., & Bitchener, J. (2015). The effects of cognitive task complexity on writing complexity. Journal
of Second Language Writing, 30, 45–​57. doi:10.1016/​j.jslw.2015.08.009
Gilabert, R., Manchón, R.M., & Vasylets, O. (2016). Mode in theoretical and empirical TBLT research: Advancing
research agendas. Annual Review of Applied Linguistics, 36, 117–​135. doi:10.1017/​S0267190515000112
Halliday, M.A.K. (1989). Spoken and written language. Oxford: Oxford University Press.
Ishikawa, T. (2007). The effect of manipulating task complexity along the (± here-​and-​now) dimension on L2
written narrative discourse. In M.P. García Mayo (Ed.), Investigating tasks in formal language learning (pp.
136–​156). Clevedon: Multilingual Matters.
Johnson, M.D. (2017). Cognitive task complexity and L2 written syntactic complexity, lexical complexity,
accuracy, and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing, 37,
13–​38. doi:10.1016/​j.jslw.2017.06.001
Johnson, M.D., Mercado, L., & Acevedo, A. (2012). The effect of planning sub-​processes on L2 writing flu-
ency, grammatical complexity, and lexical complexity. Journal of Second Language Writing, 21, 264–​282.
doi:10.1016/​j.jslw.2012.05.011
Kellogg, R.T. (1987a). Effects of topic knowledge on the allocation of processing time and cognitive effort to
writing processes. Memory and Cognition, 15, 256–​266. doi:10.3758/​BF03197724
Kellogg, R.T. (1987b). Writing performance: Effects of cognitive strategies. Written Communication, 4, 269–​
298. doi:10.1177/​0741088387004003003
Kellogg, R.T. (1988). Attentional overload and writing performance: Effects of rough draft and outline strat-
egies. Journal of Experimental Psychology, 14, 355–​365. doi:10.1037/​0278-​7393.14.2.355
Kellogg, R.T. (1990). Effectiveness of prewriting strategies as a function of task demands. American Journal
of Psychology, 103, 327–​342. doi:10.2307/​1423213
Kellogg, R.T. (1996). A model of working memory in writing. In C.M. Levy & S. Ransdell (Eds.), The science
of writing: Theories, methods, individual differences, and applications (pp. 57–​71). Mahwah, NJ: Lawrence
Erlbaum Associates.
Kellogg, R.T., Whiteford, A.P., Turner, C.E., Cahill, M., & Mertens, A. (2013). Working memory in written
composition: An evaluation of the 1996 model. Journal of Writing Research, 5, 159–​190. doi:10.17239/​
jowr-​2013.05.02.1
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence Erlbaum
Associates.
Kormos, J. (2011). Task complexity and linguistic and discourse features of narrative writing performance.
Journal of Second Language Writing, 20, 148–​161. doi:10.1016/​j.jslw.2011.02.001
Kuiken, F., Mos, M., & Vedder, I. (2005). Cognitive task complexity and second language writing perform-
ance. EUROSLA Yearbook, 5, 195–​222. doi:10.1075/​eurosla.5.10kui
Kuiken, F., & Vedder, I. (2007). Task complexity and measures of linguistic performance in L2 writing.
IRAL-​International Review of Applied Linguistics in Language Teaching, 45, 261–​284. doi:10.1515/​
IRAL.2007.012
Kuiken, F., & Vedder, I. (2008). Cognitive task complexity and written output in Italian and French as a foreign
language. Journal of Second Language Writing, 17, 48–​60. doi:10.1016/​j.jslw.2007.08.003
Lambert, C., & Robinson, P. (2014). Learning to perform narrative tasks: A semester-​long classroom study of
L2 task sequencing. In M. Baralt, R. Gilabert, & P. Robinson (Eds.), Task sequencing and instructed second
language learning (pp. 207–​230). New York: Bloomsbury.
Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical diversity and language development.
New York: Palgrave Macmillan.
Manchón, R.M. (2011). Writing to learn: Issues in theory and research. In R. M. Manchón (Ed.), Learning-​to-​
write and writing-​to-​learn in an additional language (pp. 61–​82). Amsterdam: John Benjamins.
Manchón, R.M. (2014). The internal dimension of tasks: The interaction between task factors and learner
factors in bringing about learning through writing. In H. Byrnes & R.M. Manchón (Eds.), Task-​based lan-
guage learning: Insights from and for L2 writing (pp. 27–​52). Amsterdam: John Benjamins.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J. W. Schwieter & A. Benati (Eds.). The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P. Matsuda (Eds.),
Handbook of second and foreign language writing (pp. 567–​586). Boston, MA: DeGruyter.
McCarthy, P.M. (2005). An assessment of the range and usefulness of lexical diversity measures and the poten-
tial of the measure of textual, lexical diversity (MTLD) (Doctoral dissertation). Retrieved from Dissertation
Abstracts International. (3199485).
McCarthy, P.M., & Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24, 459–​
488. doi:10.1177/​0265532207080767
McCarthy, P.M., & Jarvis, S. (2010). MTLD, vocd-​D, and HD-​D: A validation study of sophisticated approaches
to lexical diversity assessment. Behavior Research Methods, 42, 381–​392. doi:10.3758/​BRM.42.2.381
Meraji, S.R. (2011). Planning time, strategy use, and written task production in a pedagogic vs. a testing context. Journal of Language Teaching and Research, 2, 338–352. doi:10.4304/jltr.2.2.338-352
Michel, M., Kormos, J., Brunfaut, T., & Ratajczak, M. (2019). The role of working memory in young second language learners' written performances. Journal of Second Language Writing, 45, 31–45. doi:10.1016/j.jslw.2019.03.002
Mohammadzadeh Mohammadabadi, A.R., Dabaghi, A., & Tavakoli, M. (2013). The effects of simultaneous
use of pre-​planning along ± here-​and-​now dimension on fluency, complexity, and accuracy of Iranian EFL
learners’ written performance. International Journal of Research Studies in Language Learning, 2(3), 49–​
65. doi:10.5861/​ijrsll.2012.168
Norris, J.M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The
case of complexity. Applied Linguistics, 30, 555–​578. doi:10.1093/​applin/​amp044
Rahimi, R., & Zhang, L.J. (2019). Writing task complexity, students’ motivational beliefs, anxiety and their
writing production in English as a second language, Reading and Writing, 32, 761–​786. doi:10.1007/​
s11145-​018-​9887-​9
Révész, A. (2014). Towards a fuller assessment of cognitive models of task-​based learning: Investigating
task-​generated cognitive demands and processes. Applied Linguistics, 35, 87–​92. doi:10.1093/​applin/​
amt039
Révész, A., Kourtali, N.E., & Mazgutova, D. (2017). Effects of task complexity on L2 writing behaviors and
linguistic complexity. Language Learning, 67, 208–​241. doi:10.1111/​lang.12205
Robinson, P. (2001). Task complexity, task difficulty, and task production: Exploring interactions in a compo-
nential framework. Applied Linguistics, 22, 27–​57. doi:10.1093/​applin/​22.1.27
Robinson, P. (2003). The cognition hypothesis, task design, and adult task-​based language learning. Second
Language Studies, 21, 45–​105.
Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential framework for
second language task design. International Review of Applied Linguistics in Language Teaching, 43, 1–​33.
doi:10.1515/​iral.2005.43.1.1
Robinson, P. (Ed.). (2011). Second language task complexity: Researching the cognition hypothesis of lan-
guage learning and performance. Amsterdam: John Benjamins.
Ruiz-​Funes, M. (2015). Exploring the potential of second/​foreign language writing for language learning: The
effects of task factors and learner variables. Journal of Second Language Writing, 28, 1–​19. doi:10.1016/​
j.jslw.2015.02.001
Ryshina-​Pankova, M. (2015). A meaning-​based approach to the study of complexity in L2 writing: The case
of grammatical metaphor. Journal of Second Language Writing, 29, 51–​63. doi:10.1016/​j.jslw.2015.06.005
Shajeri, E., & Izadpanah, S. (2016). The impact of task complexity along single task dimension on Iranian
EFL learners’ writing production. Theory and Practice in Language Studies, 6, 935–​945. doi:10.17507/​
tpls.0605.04
Skehan, P. (1996). A framework for the implementation of task-​based instruction. Applied Linguistics, 17,
38–​62. doi:10.1093/​applin/​17.1.38
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2009). Modelling second language performance: Integrating complexity, accuracy, fluency, and
lexis. Applied Linguistics, 30, 510–​532. doi:10.1093/​applin/​amp047
Skehan, P. (2014). The context for researching a processing perspective on task performance. In P. Skehan.
(Ed.), Processing perspectives on task performance (pp. 1–​26). Amsterdam: John Benjamins.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 183–​205). Cambridge: Cambridge University Press.
Spoelman, M., & Verspoor, M. (2010). Dynamic patterns in development of accuracy and complexity: A lon-
gitudinal case study in the acquisition of Finnish. Applied Linguistics, 31, 532–553. doi:10.1093/applin/
amq001
Staples, S., Egbert, J., Biber, D., & Gray, B. (2016). Academic writing development at the university
level: Phrasal and clausal complexity across level of study, discipline, and genre. Written Communication,
33, 149–​183. doi:10.1177/​0741088316631527
Tavakoli, P. (2014). Storyline complexity and syntactic complexity in writing and speaking tasks. In H. Byrnes
& R.M. Manchón (Eds.), Task-​based language learning: Insights from and for L2 writing (pp. 217–​236).
Amsterdam: John Benjamins.
Vasylets, O., Gilabert, R., & Manchón, R.M. (2017). The effects of mode and task complexity on second lan-
guage production. Language Learning, 67, 394–​430. doi:10.1111/​lang.12228
Vasylets, O., Gilabert, R., & Manchón, R.M. (2019). Differential contribution of oral and written modes to
lexical, syntactic and propositional complexity in L2 performance in instructed contexts. Instructed Second
Language Acquisition, 3, 206–​227. doi:10.1558/​isla.38289
Verspoor, M., Lowie, W., & Van Dijk, M. (2008). Variability in second language development from a dynamic
systems perspective. The Modern Language Journal, 92, 214–​231. doi:10.1111/​j.1540-​4781.2008.00715.x
Verspoor, M., Schmid, M.S., & Xu, X. (2012). A dynamic usage based perspective on L2 writing. Journal of
Second Language Writing, 21, 239–​263. doi:10.1016/​j.jslw.2012.03.007
Williams, J. (2008). The speaking-​writing connection in second language and academic literacy development.
In D. Belcher & A. Hirvela (Eds.), The oral-​literate connection (pp. 10–​25). Ann Arbor, MI: University of
Michigan Press.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331. doi:10.1016/​j.jslw.2012.09.007
Yang, W. (2014). Mapping the relationships among the cognitive complexity of independent writing tasks, L2
writing quality, and complexity, accuracy and fluency of L2 writing (Unpublished doctoral dissertation).
Georgia State University, Atlanta, GA.
Yoon, H.J., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51, 275–301. doi:10.1002/tesq.296
Zalbidea, J. (2017). “One task fits all”? The roles of task complexity, modality, and working memory capacity
in L2 performance. The Modern Language Journal, 101, 335–​352. doi:10.1111/​modl.12389
SECTION 2

Language Processing
6
L2 WRITING PROCESSES OF LANGUAGE LEARNERS IN INDIVIDUAL AND COLLABORATIVE WRITING CONDITIONS
Marije Michel, Laura Stiefenhöfer, Marjolijn Verspoor, and Rosa M. Manchón
Groningen University, Lancaster University, Groningen University, and University of Murcia

Introduction
Writing processes − those invisible actions behind the production of written language − have been
an important research area in both first (L1) and second language (L2) writing. Framed primarily
in cognitive theories/​models of writing and cognitive accounts of second language acquisition
(SLA) (see details in Chapters 2 and 3, this volume), research on L2 writing processes investigates
both observable L2 writing behaviors, such as typing speed and pausing patterns, and the nature
and temporal distribution of underlying cognitive operations, like planning, linguistic encoding,
and revising. Originally, research into L2 writing processes was not really related to second lan-
guage acquisition (SLA) but focused more on the composition process drawing on L1 writing pro-
cess models (as reviewed in Roca de Larios, Nicolás-​Conesa, & Coyle, 2016).Yet, a recurrent and
conspicuous finding in this body of work has been the attention that L2 writers appear to pay to
language-​related concerns while producing their L2 texts. Thus, an SLA-​oriented strand on writing
L2 processes has emerged and explores how the processing dimension of L2 writing can be helpful
for L2 learning. This is the strand we review in this chapter, dealing with both individual and col-
laborative writing.
We start with a synthetic, historical overview of the writing and SLA perspectives on L2 writing
processes. This will lead to a more focused discussion of critical issues and topics in the SLA-​
oriented strand, and then to a synthesis of the main contributions of extant research on the pro-
cessing dimension of writing (for the processing dimension of feedback see Chapter 7). On the basis
of these analyses, we shall outline some implications for practice and will formulate suggestions for
future work in the domain.

DOI: 10.4324/9780429199691-10
Historical Perspectives
Research into L2 writing processes goes back to the 1980s (e.g., Cumming, 1989, 1990; Raimes,
1987). Different strands of research have emerged over the years, adopting a more cognitive or a more sociocultural perspective (see review in Roca de Larios et al., 2016): writing processes can be conceived of as those cognitive actions that are behind the production of written language (the focus of this chapter), or refer to actions responsible for the socially-situated production of texts in diverse time- and space-distributed, real-life conditions (see Manchón, 2021, for a recent review).
Cognitively-​oriented research on L2 writing processes has been based on models that describe
L1 writing processes (e.g., Bereiter & Scardamalia, 1987; Flower & Hayes, 1981; Kellogg, 1996;
Galbraith, 2009). We present Kellogg’s (1996) model here, which describes three recursive and
dynamic key phases of writing: (1) formulation, which includes higher-​order processes of planning
(e.g., content and organization of the text) as well as lower-​order processes of lexical retrieval and
linguistic encoding; (2) execution, that is, when the planned text is put on paper or screen using a pencil or a keyboard; and (3) a monitoring phase, in which the written text is reviewed to check whether it is in line with the writer's intentions and, if necessary, revised. Given such a complex
interplay of different processes, these models stress the demands that writing places on the limited
capacity of working memory.
L2 writing places an additional burden on writers in all three phases, affecting all levels of linguistic processing. In particular, lexical retrieval and morphosyntactic encoding are considered to take more time and cognitive control, making the lower-order formulation processes more effortful. Similarly, crosslinguistic differences between the L1 and L2 will impact processing and will increase demands on attention during the execution phase (see Chapters 8 and 23, this volume). Finally, the processes of comparing different linguistic options and deciding on their accuracy and adequacy, as well as revising text in the L2, are likely to require more cognitive effort. In sum,
L2 writing places an additional cognitive load on the writers’ working memory (Galbraith, 2009).
Furthermore, the challenges in lower-​order formulation processes could lead to a disruption of
higher-​order processes (cf. Chenoweth & Hayes, 2003), so less attentional capacity may be avail-
able for planning, monitoring, and goal setting.
Based on the postulated different writing phases and the additional cognitive demands that char-
acterize L2 writing, macro-​writing processes of formulation and monitoring have been investigated
especially in terms of their purported problem-​solving nature and the resulting language learning
potential (e.g., for a review see Roca de Larios et al., 2016) and their temporal distribution while
performing time-​compressed writing tasks (e.g., Gánem-​Gutiérrez & Gilmore, 2018; Roca de
Larios et al., 2008). Others have studied online writing behaviors, especially pausing, revisions,
and fluency (cf. Révész & Michel, 2019a). A further important strand of research has looked into
collaborative writing (e.g., Kessler, Bikowski, & Boggs, 2012; Kim & McDonough, 2011; Kowal
& Swain, 1994; Leeser, 2004; Storch, 2008), where the joint responsibility for a text and the resulting discussions between writers have provided ample insights into writing processes, in this case taking a more socio-cognitive perspective. Collectively, this research has shed light on the recursive and
problem-​solving nature of writing and has provided robust empirical evidence of the intense lin-
guistic processing that characterizes most forms of writing (cf. Cumming, 1990; Manchón, Roca
de Larios, & Murphy, 2009), which Galbraith (2009) argues serves the constitution of new know-
ledge. The common conclusion in studies on the temporal distribution of writing processes − both
in pen-and-paper writing without access to external sources (Roca de Larios et al., 2008) and in digital writing
with access to sources (Gánem-​Gutiérrez & Gilmore, 2018) − is that transforming ideas into lan-
guage is the predominant process while composing. Such rich linguistic processing is assumed
to be beneficial in terms of language development because, as Révész and Michel (2019b) note,
“the act of written production may foster cognitive processes, which are assumed to facilitate L2
development” (pp. 491–​492). Accordingly, they suggest, “L2 writing process research may inform
L2 instruction and assessment by advancing our understanding of both the learning-​to-​write and
writing-​to-​learn dimensions of L2 writing” (pp. 491–​492). This link between writing processes and
language learning is further elaborated in the following sections.

Critical Issues and Topics


The question of why and how processes of L2 writing and those of SLA might interact builds on the
idea that engaging in text composition positively affects language learning. The highly recursive and
dynamic meaning-​making processes during writing (i.e., planning, linguistic encoding, execution,
and monitoring) have been shown to induce heightened attention to language. These processes tie in
with seminal hypotheses in SLA research such as Swain's Output Hypothesis (1985) and Schmidt's Noticing Hypothesis (2001); readers are referred to Chapter 2 (this volume) for a fuller elaboration. Similarly, as more fully discussed in Chapter 3 (this volume), in collaborative writing both
cognitive accounts of SLA (especially interactionist approaches, such as Gass & Mackey, 2007)
and sociocultural theories (see Swain, 2006) provide the rationale for the language learning that
may derive from the joint construction of texts and from the negotiation processes while writing
together, especially in terms of heightened attention to language, often operationalized as language-related episodes (LREs; i.e., episodes in which learners deliberate about language), and the very nature of the interaction among peers.
Regarding individual writing, L2 researchers have stressed the positive effect writing may have
on reducing cognitive load during L2 processing. Compared to L2 speech production, writing has
a relatively slow pace and the fact that the written output remains visible on paper or screen is
assumed to reduce cognitive load. This frees memory so that attentional capacity can be dedicated
to the linguistic form of the message (see Chapter 23 this volume). Cumming (1990) adds another
argument: the mere act of writing, in which the product (i.e., the text) and the mental processes that underlie its production are separate, allows learners to engage with language form more extensively. Indeed, reviews from different perspectives (see Chapter 1, this volume) highlight how this
disjuncture of cognitive processes and product, leaving additional time and space for language, is
particularly helpful for L2 learning (Harklau, 2002; Manchón, 2020a; Williams, 2012). The inherent
problem-​solving nature of writing is also purported to induce focused language production during
text composition, which may create opportunities for monitoring, restructuring, elaboration, and
refinement of the language used (Byrnes & Manchón, 2014; Manchón & Roca de Larios, 2007).
Similarly, collaborative writing shares the opportunities of individual writing and, in addition,
learning opportunities are created through the interaction between writers because interacting with
a peer pushes L2 learners to focus their attention to language form (see Chapter 3, this volume, for
further elaboration). As proposed by Long (1996) and others (Gass & Mackey, 2007), interaction
fosters L2 users to engage during LREs in negotiations of meaning and form, which presumably
leads them to modify their output, formulate and incorporate peer feedback, and adopt a more form-​
focused approach to language (see Loewen & Sato, 2018, for a recent review). In addition, feelings
of joint authorship and responsibility push L2 writers to engage in the articulation of thinking in the
form of collaborative talk, also termed languaging (Swain, 2006) during collaborative writing (see
Chapter 3, this volume).
Yet, as Cumming (2020) has recently pointed out, “despite the many proposed perspectives on
how L2 writing may relate to and prompt L2 learning, causality remains unproven” (p. 40).

Current Contributions
In this section we synthesize the main lines of research that have attempted to advance empirical
knowledge in both individual and collaborative writing processes from the perspective of language
learning.
Studies of Individual Writing


Writing Processes of Language Learners in Laboratory Conditions
A growing body of research investigates L2 learners under laboratory conditions in an attempt
to increase our understanding of how L2 writing processes are mediated by task design and task
implementation variables. Typically, these studies do not directly research L2 learning, although
implications for language learning are usually formulated. For instance, in a series of studies, Ong
(2013, 2014) investigated processes learners engaged in when given different amounts of time for
pre-​task and online planning. Using post-​task questionnaires, she found that while learners used
pre-​task planning time to generate ideas and organize content, they also spent a substantial amount
of time dealing with linguistic aspects of their texts. As a result, they continued to use the writing
time for higher-​order processes, such as planning and organizing content. However, an extended
writing condition without pre-​task planning time resulted in texts of higher quality, possibly due to
the reduced cognitive load of transcription as compared to planning, hence facilitating the retrieval
of idea units (Ong, 2013). In the context of academic writing, Révész, Kourtali, and Mazgutova
(2017) examined the effects of task complexity on cognitive processes underlying L2 writing
behavior. Their stimulated recall data indicated that a more complex task engaged participants more
in planning activities, leaving fewer resources for translation processes. More recently, Michel
et al. (2020) compared writing processes on integrated and independent task types triangulating
eye-​gaze, keystroke, and stimulated recall data. While the integrated task elicited more source con-
sultation processes (i.e., reading the source text), most of the writing processes were fairly similar
across tasks.
Comparing writing tasks across languages, Leijten et al. (2019) drew on keystroke data to
compare 280 students who did source-​based writing tasks in their L1 Dutch and L2 English. The
authors examined pausing behavior (frequency and duration) at different stages of the writing pro-
cess. The data showed hardly any differences between writing in Dutch and English, with the exception that participants spent more time on language-related processes (e.g., searching for lexical synonyms) and used more sources in their L2. Importantly, high-quality texts resulted from substantial source consultation at the beginning, before writing, followed by frequent but short switches to the sources during writing. The final stages elicited mainly revision processes.
In laboratory conditions, research into L2 writing processes has made use of think-​aloud or
stimulated recall data to examine LREs as a window into cognitive processes. Two recent SLA-​
oriented L2 writing studies by López-​Serrano, Roca de Larios, and Manchón (2019, 2020) used
EFL writers’ think-​aloud protocols to analyze different processes (i.e., language-​related reflections)
of individuals when writing in their L2. Their data revealed categories of processes focusing on
linguistic aspects (e.g., spelling); resolution (e.g., of an LRE); strategies applied (e.g., translation,
grammar); orientation (i.e., whether the writer searches for compensatory forms); and depth of pro-
cessing. Their theoretically-​informed and empirically driven coding scheme may serve as a tool for
future research aiming to characterize problem-​solving strategies of L2 writers.
In another laboratory setting, microprocesses of pausing and revision behavior in L2 writing
have been investigated with research methods that include eye tracking and keystroke logging.
Thus, Barkaoui (2019) examined the pausing behavior of 68 L2 writers of English completing the integrated and independent tasks of the TOEFL iBT writing test. It was found that low-proficiency students generally paused more often than high-proficiency writers. Pauses were particularly long at the beginning of writing, which the author interpreted as the time writers needed to plan their text globally. In addition, the independent task elicited more but shorter pauses than the integrated task, which could be related to writers reading the input text in the latter. A tailor-
made tool that allowed combining keystroke logging with eye ​tracking methodology informed
Chukharev-​Hudilainen, Saricaoğlu, Torrance, and Feng’s (2019) study of Turkish native speakers
(N=24), who completed one task in their L1 Turkish and one in L2 English. The data confirmed
findings of earlier work, as participants wrote significantly more slowly in their L2 than in their L1. Finally,
Révész, Michel, and Lee (2017, 2019) triangulated keystroke logging with eye-​gaze and stimulated
recall data when exploring L2 pausing and revision behaviors. Based on data from Chinese writers of English (N=30) who performed IELTS Academic Writing Task 2, the authors confirmed that pauses at lower-level textual units (e.g., within a word) were associated with lower-order writing processes (e.g., lexical retrieval), whereas pauses at higher-level textual units (e.g., sentence boundaries) were longer and more often associated with higher-order writing processes (e.g., content planning) and further look-backs.
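By way of illustration of how such pause-location analyses can be operationalized, the sketch below (in Python, purely illustrative; the log format, the two-second threshold, and the category labels are assumptions rather than any study's actual tooling) classifies pauses from a simple keystroke log according to whether they occur within a word, between words, or at a sentence boundary.

# Illustrative sketch only: classifying pauses from a hypothetical keystroke log
# by textual location, in the spirit of the pause-location analyses described above.

PAUSE_THRESHOLD = 2.0  # seconds; an assumed cut-off for counting a gap as a pause

def classify_pauses(keystrokes):
    """keystrokes: list of (character, seconds_since_previous_keystroke) tuples."""
    pauses = []
    prev_char = None
    for char, gap in keystrokes:
        if gap >= PAUSE_THRESHOLD:
            if prev_char in (".", "!", "?"):
                location = "sentence boundary"
            elif prev_char == " " or char == " ":
                location = "between words"
            else:
                location = "within word"
            pauses.append((location, gap))
        prev_char = char
    return pauses

# Hypothetical log: the writer pauses 2.4 s mid-word and 3.2 s after a period
log = [("I", 0.5), (" ", 0.2), ("a", 0.3), ("g", 2.4), ("r", 0.2), ("e", 0.2),
       ("e", 0.2), (".", 0.4), (" ", 3.2), ("T", 0.3)]
print(classify_pauses(log))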
As stated before, these studies focused on L2 processes during writing and did not directly aim at
shedding light on the connection between writing and language learning, an exception being López-​
Serrano et al.’s (2019, 2020) studies. In any case, these studies do shed light on the task-​related
and learner-​related factors that may mediate cognitive activity while writing and, hence, indirectly,
their insights can be taken as empirical evidence of the purported learning processes activated while
writing (see Manchón, 2020b for a fuller discussion).

Writing Processes in Instructed SLA Contexts


In the past two decades, more and more scholars have followed Harklau’s (2002) call for classroom-​
based research into L2 writing. Only a handful have taken a processing perspective.
An innovative approach to studying writing processes in an instructed setting was used by
Séror (2013; see also Hamel, Séror, & Dion, 2015), who examined writing processes of univer-
sity learners (N=36) longitudinally. Screen-​captures complemented by post-​task (cued) interviews
showed the many facets of academic writing in an L2, including effective use of L1 and dictionary
look-ups. The ethnographic study by Smith et al. (2017), based on screen recordings, field notes, and interviews, provided detailed insights into students' multimodal processes and code-meshing as they drew on text, visuals, and other sources while writing. This study also illustrates a recent interest in studying the writing processes involved in multimodal composing.
Classroom-​based research on writing processes also includes interventionist studies on strategy
instruction. Illustrative of this trend is the study by De Silva and Graham (2015), in which effective writing strategies were taught to a group of L2 writers in the context of a 24-week academic writing course. A control group followed the same writing course for the same duration but without the strategy instruction. Stimulated recall comments were complemented by video recordings, including of the students' facial expressions. Results showed that the intervention had pushed students to use more planning and self-monitoring strategies and, in general, had enabled them to orchestrate different strategies more effectively.
In sum, while the studies reviewed in this section provide ample insights into individual writing
processes and illustrate the rich repertoire of processes learners engage in while composing a text,
the relationship to L2 learning remains implicit. Work on collaborative writing more explicitly
investigates this link as we discuss in the following section.

Processes of Collaborative L2 Writing


In collaborative writing tasks, students are engaged in joint problem solving and knowledge
building (cf. Chapter 3, this volume). As knowledge building is mediated by language use, referred
to as languaging (Swain, 2000, 2006), collaborative writing is thought to support language learning
because it is likely to enhance focus on form, noticing, and uptake (Schmidt, 2001). Several lines
of research, both in face-​to-​face and in computer-​supported modes, have shed light on the language
learning affordances of these collaborative writing processes.
Collaborative Writing in Face-​to-​Face Mode


Research on collaborative writing processes has often focused on the amount of languaging,
operationalized in terms of LREs learners engage in when writing a joint text, which is relevant from
a language learning perspective given that the learning affordances of writing are in part premised
on the linguistic processing that takes place while producing a text. This line of work confirms that
language-​focused tasks elicit a larger number of LREs than meaning-​focused tasks (Alegría de la
Colina & García Mayo, 2007) and hence suggests that the former have greater language learning
potential. More recently, McDonough, Crawford, and Vleeschauwer (2016) compared different meaning-focused tasks and revealed that, in comparison with a summary task, a problem-solution task elicited more LREs. As the problem-solution tasks also led to more deliberation of content,
language, and text organization, which also correlated positively with text quality, they seem to
generate processes that potentially support language learning.
Proficiency level also plays a role in the number of LREs occurring in a collaborative writing
task. While some studies have found more LREs for high-​proficiency writers (Storch & Aldosari,
2013), others (e.g., Watanabe & Swain, 2007) suggest that there is no direct relationship, but argue
that interaction patterns, as described in Storch’s (2002) seminal work (e.g., collaborative; expert-​
novice), are more important. In this respect, Kim and McDonough (2008) found that learners
showed more collaborative patterns when working with advanced learners, and collaborative pairs
engaged in more languaging, resulting in more LREs than non-​collaborative pairs.

Computer-​Supported Collaborative Writing (CSCW)


In collaborative writing tasks, students often use technology and web-​based writing tools (e.g.,
shared Google Docs), referred to as computer-​supported collaborative writing (CSCW). Because
CSCW may affect the nature of interaction during activities, it has received growing attention
(Zheng & Warschauer, 2017) in research into collaborative writing processes.
Rouhshad and Storch (2016) compared face-​to-​face with written chat interactions during CSCW
and found that computer-​mediated interaction resulted in fewer collaborative patterns. Some CSCW
research has identified patterns of interaction in addition to those proposed by Storch (2002): passive/passive (low) and sequential-additive interaction (Abrams, 2016), and facilitator/participant, which exhibits features of both expert/novice and collaborative interaction (Cho, 2017).
For research, web-based tools are useful because they allow for easily retraceable editing of text. Taking
editing as a window into monitoring processes, research has asked two main questions: 1) Whose
text is being edited? and 2) Which kinds of edits are being made? Kessler (2009) found that students
were less inclined to edit their own writing than that of their group partners, while participants in
Arnold, Ducate, and Kost (2012) focused more on their own writing. Generally, students seem
to focus more on meaning than on form (e.g., Kessler, 2009). Form-​focused revisions, however,
appear to be more common in peer-​editing than in self-​editing (Arnold et al., 2012; Kessler, 2009).
In sum, this research shows that the language learning potential of collaborative writing activ-
ities (understood as the promotion of relevant learning processes), be it face-​to-​face or online,
depends to a large extent on pairing and group composition. The findings also highlight that pro-
viding students with relevant training prior to writing activities is likely to promote languaging and
collaborative interaction.

Processes Underlying Digitally-​Mediated Interactive Writing


Interactive writing, that is, quick turn-taking written exchange via text chat using digital tools, has gained ground. Such a context − characterized as interaction in slow motion using
the written modality − has led to exciting avenues for research into the language learning potential


of writing through digitally-​mediated exchange (see Elola & Oskoz, 2017, and Ziegler, 2016, for
comprehensive reviews). However, few published studies have focused on the processes involved.
Early explorations by O’Rourke (2008, 2012) employed eye tracking to look into written
English-​German telecollaboration exchanges. He identified three behavioral patterns of learners
when reading their own text production (seen as a sign of monitoring): (a) simultaneous monitoring,
that is, reading while drafting; (b) pre-​send monitoring, that is, reading after drafting but before
sending the text message; and (c) post-​send monitoring, that is, reading after sending the message.
In addition, scrolling and scanning patterns through the on-screen transcript of the conversation revealed that some writers "browsed" through earlier turns while waiting for their partner's contribution, whereas others dwelt on a specific expression in the already transmitted text.
Recent work by Michel and O'Rourke (2019) furthermore suggests that learners learn language from their partners when interacting via text chat. Further work on reading-while-writing in interactive written dialogue is needed to increase our understanding of how this mode of L2 writing may contribute to language learning.

Main Research Methods

Individual Writing
Most studies investigating cognitive processes underlying L2 writing draw on introspective methodologies, including think-aloud protocols, retrospective cued interviews, and stimulated recalls. The result is detailed categorizations of verbalized thought processes (e.g., López-Serrano et al., 2019), which have been linked to theoretical models of writing (e.g., Révész, Michel, & Lee, 2019). However, think-alouds and stimulated recalls have been criticized for reactivity (think-alouds) and memory decay (stimulated recalls) (see further elaboration in Chapter 25; see also Polio & Friedman, 2017). In contrast, technological tools, such as screen capturing, keystroke logging, and eye tracking, allow a moment-by-moment registration of writing behaviors on screen, which facilitates objective measurement of pausing, revision, and reading-while-writing behavior at the millisecond level. Together with introspective methods, they provide unique insights into writing processes (see recent reviews by Galbraith & Vedder, 2019; Révész & Michel, 2019a, b; see also Chapter 25, this volume).
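To give a concrete sense of what such moment-by-moment registration affords, the minimal Python sketch below illustrates how pause measures might be derived from a generic keystroke log. The log format, field layout, and the 2,000 ms pause threshold are illustrative assumptions, not the specification or output of any particular logging package.

```python
# Minimal, hypothetical sketch: deriving pause measures from a generic keystroke log.
# The event layout and the 2000 ms threshold are assumptions for illustration only.

from statistics import mean

# Each event: (timestamp in milliseconds, key pressed)
log = [
    (0, "T"), (180, "h"), (2950, "e"), (3100, " "),
    (3400, "c"), (3550, "a"), (9100, "t"),
]

PAUSE_THRESHOLD_MS = 2000  # a commonly used, but adjustable, cut-off

# Inter-keystroke intervals: time elapsed between consecutive key presses
intervals = [t2 - t1 for (t1, _), (t2, _) in zip(log, log[1:])]

# Pauses are intervals at or above the threshold
pauses = [iv for iv in intervals if iv >= PAUSE_THRESHOLD_MS]

print(f"Mean inter-keystroke interval: {mean(intervals):.0f} ms")
print(f"Number of pauses >= {PAUSE_THRESHOLD_MS} ms: {len(pauses)}")
print(f"Total pause time: {sum(pauses)} ms")
```

In practice, researchers work with the export formats of dedicated logging tools and adjust such thresholds to the writer, task, and research question.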
Recent developments demonstrate how fruitful data triangulation approaches are. In particular, combining stimulated recall or interview data with screen recordings, eye-gaze, and/or keystroke information allows for more comprehensive perspectives on writing processes and more valid interpretations (see contributions to Révész & Michel, 2019a).
Each of these methods, however, comes with challenges for researchers. Analyzing screen recordings relies on qualitative, manual coding and includes the interpretative step of deciding what to code for and in what ways (e.g., Hamel, Séror, & Dion, 2015). Similarly, eye-tracking data on viewing behavior during writing often require substantial hand-coding before one can make meaningful inferences about writing processes (cf. Gánem-Gutiérrez & Gilmore, 2018; Révész, Michel, & Lee, 2019). Large individual differences in eye-gaze patterns require person-centered data as a baseline for comparisons. The greatest challenge with eye tracking during writing, however, is that not all participants are touch typists. Many watch the keyboard while writing, with major consequences for data quality. More general measures of viewing behavior during writing (e.g., Michel et al., 2020) are informative, but do not allow for detailed analyses of cognitive processes at linguistic levels and, therefore, may limit the conclusions that can be drawn about the connection between the processing dimension of writing and language learning.
In sum, even though these tools can provide valuable and detailed insights into writing behaviors, investigating the relationship between processes and language learning seems to require triangulating such data with more traditional methods, such as concurrent


or retrospective think-aloud protocols, as these are better able to tap the cognitive processes learners engage in (see Manchón & Leow, 2020).

Collaborative Writing

Analysis of Pair Talk


Collaborative writing research builds to a large extent on the analysis of pair talk between learners.
Audio recordings are coded for type and frequency of LREs (e.g., meaning vs. language focus) and
the general focus of student talk (e.g., content, organization, language, task management, off-​task
talk; McDonough et al., 2016; Storch & Wigglesworth, 2007).
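By way of illustration, the brief and purely hypothetical Python sketch below shows how frequencies of this kind might be tallied once pair talk has been manually coded. The category labels and data structure are simplified assumptions, not the coding scheme of any of the studies cited.

```python
# Illustrative sketch only: tallying researcher-assigned codes from a coded pair-talk
# transcript. Labels and structure are hypothetical simplifications.

from collections import Counter

# Each tuple: (speaker, episode code assigned during manual coding)
coded_transcript = [
    ("S1", "LRE-language"), ("S2", "LRE-language"),
    ("S1", "content"), ("S2", "LRE-meaning"),
    ("S1", "task management"), ("S2", "off-task"),
    ("S1", "LRE-language"),
]

frequencies = Counter(code for _, code in coded_transcript)

total = len(coded_transcript)
for code, count in frequencies.most_common():
    print(f"{code}: {count} ({count / total:.0%} of coded episodes)")
```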
In digitally-mediated contexts (e.g., collaborative writing in Google Docs), students' written chat protocols and screen recordings provide information on learners' behavior (Strobl, 2014). Stimulated recall interviews (Cho, 2017), reflection papers (Li & Zhu, 2017a), and questionnaires (Li & Kim, 2016) have also been used to elicit students' perceptions of collaborative processes and peer interaction.

Version History and Text Mining in Digital Writing


Tools like Google Docs and wikis keep track of the version history of the writing product, and the locus and content of edits can afterwards be analyzed to reveal the processes underlying the writers' contributions (Alghasab & Handley, 2017). Text mining tools can convert the version history of digital documents into data that provide insights into which writer contributed what, how much, and at what point in time, including information on revisions to their own and a collaborator's text.
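As a rough illustration of the kind of summary such tools produce, the hypothetical Python sketch below aggregates a toy version history into per-writer contribution and self- versus other-editing counts. The revision records and field names are assumptions for illustration only, not the output format of Google Docs, wikis, or DocuViz.

```python
# Minimal, hypothetical sketch of summarizing a document's version history.
# Revision records, field names, and the self/other distinction are illustrative assumptions.

from collections import defaultdict

# Each revision: who edited, whose text was touched, characters changed, timestamp
revisions = [
    {"editor": "A", "section_author": "A", "chars": 120, "time": "10:02"},
    {"editor": "B", "section_author": "B", "chars": 95,  "time": "10:05"},
    {"editor": "A", "section_author": "B", "chars": 18,  "time": "10:09"},
    {"editor": "B", "section_author": "A", "chars": 40,  "time": "10:12"},
]

contribution = defaultdict(int)   # characters each writer contributed or edited
self_edits = defaultdict(int)     # edits to one's own text
other_edits = defaultdict(int)    # edits to a collaborator's text

for rev in revisions:
    contribution[rev["editor"]] += rev["chars"]
    if rev["editor"] == rev["section_author"]:
        self_edits[rev["editor"]] += 1
    else:
        other_edits[rev["editor"]] += 1

for writer in sorted(contribution):
    print(f"Writer {writer}: {contribution[writer]} chars, "
          f"{self_edits[writer]} self-edits, {other_edits[writer]} edits to partner's text")
```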
As in the case of studies looking at individual processes, more recent work on collaborative writing has started to combine several methods. The triangulation of quantifiable information provided by text mining tools with fine-grained qualitative analyses is likely to promote our understanding of the relationship between collaborative writing processes and writing outcomes, helping us to identify learner behaviors that might prove particularly beneficial for language learning (see Stiefenhöfer & Michel, 2020, for a study in which DocuViz data, eye-gaze methodology, stimulated recall interviews, and chat log analyses were triangulated).

Recommendations for Practice

Individual Writing
Pedagogical interventions should aim at fostering writing processes that are conducive to language learning. The preceding discussion shows that this can be done in part through task characteristics and task implementation procedures (see discussion in Manchón, 2020b). Tasks that engage learners in problem solving and represent a real challenge in ideational and/or linguistic terms are more likely to contribute to learning, in terms of either consolidation or expansion of L2 knowledge (Kormos, 2011; López-Serrano et al., 2019; Manchón & Roca de Larios, 2007; Révész, Kourtali, & Mazgutova, 2017). Providing pre-task and online planning time might also be a key consideration in pedagogical decision-making (Manchón & Roca de Larios, 2007; Manchón & Vasylets, 2019), and the choice of task type is also likely to affect what learners focus their attention on during L2 writing (Michel et al., 2020).

Collaborative Writing
Research on collaborative writing has produced ample evidence for its potential for language
learning (see Storch, Chapter 3 this volume). When planning collaborative writing tasks, teachers


need to pay attention to group formation. Some research suggests that students' familiarity with each other positively influences collaborative behavior (Hassaskhah & Mozaffari, 2015), but teachers might also choose to have students work with different peers over time, using criteria such as shared L1, L2 proficiency, or personality (Storch, 2017).
For low-proficiency students, language-focused tasks lead to more peer deliberation about language, adding to their potential for noticing and focus on form. For more advanced learners, adequate collaborative writing activities include meaning-focused and integrated tasks (Alegría de la Colina & García Mayo, 2007).
Finally, taking into account the influence of attitudes towards collaborative writing on its potential for language learning (Chen & Yu, 2019), it is advisable to make goals and metacognitive strategies explicit to students (Chen & Hapgood, 2019).

Interactive Writing
As with collaborative writing, research has demonstrated that interactive writing during text-chat conversations has the potential to support language learning, for example, via conversational alignment (Michel & O'Rourke, 2019). The classroom-based studies by Michel (2018) in a high-school context and by Michel and Stiefenhöfer (2019) in university classrooms suggest that writing activities building on alignment and priming boost the language learning potential of text chat even further, as learners are inclined to use their partners' input for their own contributions, thereby enlarging their own language repertoire. Teachers may therefore use carefully designed chat activities to foster L2 use and L2 writing in ways that support L2 learning.

Future Directions

Research Interests
Individual Writing
To further our understanding of how writing supports language learning, Manchón (2020b) and Manchón and Leow (2020) (see also Chapter 22, this volume) suggest that future work on writing processes in individual writing conditions should attempt to (a) make digital writing more prominent; (b) go beyond time-constrained writing conditions and look into the time-distributed nature of writing and writing processes in real-life writing tasks; (c) further explore the effects of task conditions on writing processes rather than on writing products (the general tendency in extant research; see Chapters 4 and 5, this volume); and (d) study individual differences in the processing dimension of writing (as more fully discussed in Chapters 11, 12, and 22, this volume). Such explorations will further our insights into which specific processes may support language learning under which conditions.

Collaborative Writing
To increase ecological validity, collaborative writing research should include the study of processes in their authentic contexts. Accordingly, as most collaborative writing takes place online, more work is needed on the different collaboration types and interaction patterns that emerge in digitally-mediated contexts (cf. Cho, 2017). We also need more work that explores collaborative group writing, instead of pair work. From a pedagogic perspective, more research is also needed on the effects of different task types (e.g., Révész, Kourtali, & Mazgutova, 2017), for example, tasks with multimodal input (Lim & Polio, 2020).
Following the dynamic turn, we need studies that account for the fluctuating nature of collaborators' interactions, as well as more advanced statistical approaches to identify varying interaction

patterns at different stages of the process (Zhang, 2019). Studies focusing on the dynamic changes in interaction during writing will increase our understanding of how, when, and why collaborative writing potentially supports language learning. Similarly, the little existing research on interactive writing during text chat that has taken a processing perspective will need to be complemented before interpretations can be made about how the affordances of this medium support language learning.

Research Methods
Several chapters in this Handbook (especially Chapters 22, 23, and 25) discuss needed methodological innovations in the study of writing processes. We will only add that, to date, most work focusing on writing processes in an L2 has relied on laborious transcription and hand-coding of the data. Consequently, few large-scale studies exist (but see Michel et al., 2020, for a rare exception). Without expanding our populations, insights remain local and might be of little value for practice (see also Chapter 22, this volume). Therefore, future research into individual and collaborative writing will benefit from using advanced technological tools, which allow more quantitative approaches that directly measure writing processes and behaviors (Révész & Michel, 2019a). In particular, software packages that combine several methodologies (e.g., the combined keystroke logging and eye tracking of Chukharev-Hudilainen et al., 2019) are likely to substantially expand our knowledge of writing processes, as well as of reading processes in the context of writing.
For digitally-mediated contexts, we need more studies using text mining techniques (cf. Yim & Warschauer, 2017), as they will provide deeper insights into how producing and revising one's own and someone else's text interrelate. To enhance our understanding of collaborative processes that go beyond typing text, more research using eye tracking will be valuable. In particular, the triangulation of instruments and techniques (for instance, text mining, eye tracking, and stimulated recall) will allow a more comprehensive perspective on collaborative writing processes.
Similarly, the collaborative writing processes underlying text-chat interaction (with peers, a tutor, or a chatbot) deserve academic attention, given that how interaction in these environments supports language learning is still an empirical question.
Research into L2 writing processes, be it in individual or collaborative contexts, still has many avenues to explore. Future work will provide further insights into how the cognitive and social process of engaging in text production supports language learning.

References
Abrams, Z. (2016). Exploring collaboratively written L2 texts among first-​year learners of German in
Google Docs. Computer Assisted Language Learning, 29(8), 1259–​1270. https://​doi.org/​10.1080/​09588
221.2016.1270968
Alegría de la Colina, A., & García-​Mayo, M.P. (2007). Attention to form across collaborative tasks by
low-​proficiency learners in an EFL setting. In M.P. García-​Mayo (Ed.), Second language acquisi-
tion: Investigating tasks in formal language learning (pp. 91–​116). Clevedon: Multilingual Matters.
Alghasab, M., & Handley, Z. (2017). Capturing (non-​)collaboration in wiki-​mediated collaborative writing
activities: The need to examine discussion posts and editing acts in tandem. Computer Assisted Language
Learning, 30(7), 664–​691. https://​doi.org/​10.1080/​09588221.2017.1341928
Arnold, N., Ducate, L., & Kost, C. (2012). Collaboration or cooperation? Analyzing group dynamics and revi-
sion processes in wikis. CALICO Journal, 29(3), 431–​448. https://​doi.org/​10.11139/​cj.29.3.431-​448
Barkaoui, K. (2019). What can L2 writers' pausing behavior tell us about their L2 writing processes? Studies in
Second Language Acquisition, 41(3), 529–​554.
Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Mahwah, NJ: Lawrence
Erlbaum.
Byrnes, H., & Manchón, R.M. (Eds.). (2014). Task-​based language learning-​Insights from and for L2 writing.
Amsterdam: John Benjamins.
Chen, W., & Hapgood, S. (2019). Understanding knowledge, participation and learning in L2 collaborative
writing: A metacognitive theory perspective. Language Teaching Research. Advance online publication. https://doi.org/10.1177/1362168819837560


Chen, W., & Yu, S. (2019). A longitudinal case study of changes in students’ attitudes, participation, and
learning in collaborative writing. System, 82, 83–​96. https://​doi.org/​10.1016/​j.system.2019.03.005
Chenoweth, A., & Hayes, J. (2003). The inner voice in writing. Written Communication, 20, 99–​118.
Cho, H. (2017). Synchronous web-​ based collaborative writing: Factors mediating interaction among
second-​language writers. Journal of Second Language Writing, 36, 37–​51. https://​doi.org/​10.1016/​j.jslw.
2017.05.013.
Chukharev-​Hudilainen, E., Saricaoglu, A., Torrance, M., & Feng, H.H. (2019). Combined deployable keystroke
logging and eyetracking for investigating L2 writing fluency. Studies in Second Language Acquisition,
41(3), 583–​604.
Cumming, A. (1989). Writing expertise and second-language proficiency. Language Learning, 39, 81–141.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7, 31–​51.
Cumming, A. (2020). L2 writing and L2 learning: Transfer, self-​regulation, and identities. In R. M. Manchón
(Ed.), Writing and language learning. Advancing research agendas (pp. 30-​ 48). Amsterdam: John
Benjamins.
De Silva, R., & Graham, S. (2015). The effects of strategy instruction on writing strategy use for students of
different proficiency levels. System, 53, 47–​59.
Elola, I., & Oskoz, A. (2017). Writing with 21st century social tools in the L2 classroom: New literacies,
genres, and writing practices. Journal of Second Language Writing, 36, 52–​60.
Flower, L., & Hayes, J.R. (1981). A cognitive process theory of writing. College Composition and
Communication, 32(4), 365−387.
Galbraith, D. (2009). Cognitive models of writing. German as a Foreign Language, 2(3), 7–​22.
Galbraith, D. & Vedder, I. (2019). Methodological advances in investigating L2 writing processes. Challenges
and perspectives. Studies in Second Language Acquisition, 41 (3), 633–​645.
Gánem-​Gutiérrez, A. & Gilmore, A. (2018). Tracking the real-​time evolution of a writing event: Second lan-
guage writers at different proficiency levels. Language Learning, 68(2), 469–​506.
Gass, S.M., & Mackey, A. (2007). Input, interaction, and output in second language acquisition. In B. VanPatten
& J. Williams (Eds.), Theories in second language acquisition (pp. 180–​206). Mahwah, NJ: Lawrence
Erlbaum.
Hamel, M.-​J., Séror, J., & Dion, C. (2015). Writers in action: Modelling and scaffolding second-​language
learners’ writing process. Toronto: Higher Education Quality Council of Ontario.
Harklau, L. (2002). The role of writing in classroom second language acquisition. Journal of Second Language
Writing, 11, 329–​350.
Hayes, J.R. (1996). A new framework for understanding cognition and affect in writing. In C.M. Levy & S.
Ransdell (Eds.), The science of writing (pp. 1–​27). Mahwah, NJ: Lawrence Erlbaum.
Hayes, J.R. (2012). Evidence from language bursts, revision, and transcription for translation and its relation to
other writing processes. In M. Fayol, D. Alamargot, & V. Berninger (Eds.), Translation of thought to written
text while composing (pp. 15–​25). New York: Psychology Press.
Hassaskhah, J., & Mozaffari, H. (2015). The impact of group formation method (student-​selected vs. teacher-​
assigned) on group dynamics and group outcome in EFL creative writing. Journal of Language Teaching
and Research, 6(1), 147. https://​doi.org/​10.17507/​jltr.0601.18.
Kellogg, R. (1996). A model of working memory in writing. In M. Levy & S. Ransdell (Eds.), The science of
writing: Theories, methods, individual differences, and applications (pp. 57–​72). Mahwah, NJ: Lawrence
Erlbaum.
Kessler, G. (2009). Student-​initiated attention to form in wiki-​based collaborative writing. Language Learning
& Technology, 13(1), 79–​95.
Kessler, G., Bikowski, D., & Boggs, J. (2012). Collaborative writing among second language learners in aca-
demic web-​based projects. Language Learning & Technology, 16(1), 91–​109.
Kim, Y., & McDonough, K. (2008). The effect of interlocutor proficiency on the collaborative dialogue
between Korean as a second language learners. Language Teaching Research, 12(2), 211–​234. https://​
doi.org/​10.1177/​1362168807086288
Kim, Y., & McDonough, K. (2011). Using pretask modelling to encourage collaborative learning opportunities.
Language Teaching Research, 15(2), 183–​199. https://​doi.org/​10.1177/​1362168810388711
Kormos, J. (2011). Task complexity and linguistic and discourse features of narrative writing performance. Journal of Second Language Writing, 20(2), 148–161.
Kowal, M., & Swain, M. (1994). Using collaborative language production tasks to promote students’ language
awareness. Language Awareness, 3(1), 73–​93.
Lantolf, J.P. (2000). Introducing sociocultural theory. In J.P. Lantolf (Ed.), Sociocultural theory and second
language learning (pp. 1–​26). Oxford: Oxford University Press.


Leeser, M.J. (2004). Learner proficiency and focus on form during collaborative dialogue. Language Teaching
Research, 8(1), 55–​81. https://​doi.org/​10.1191/​1362168804lr134oa
Li, M., & Kim, D. (2016). One wiki, two groups: Dynamic interactions across ESL collaborative writing tasks.
Journal of Second Language Writing, 31, 25–​42. https://​doi.org/​10.1016/​j.jslw.2016.01.002
Li, M., & Zhu, W. (2017a). Explaining dynamic interactions in wiki-​based collaborative writing. Language
Learning & Technology, 21(2), 96–​120.
Li, M., & Zhu, W. (2017b). Good or bad collaborative wiki writing: Exploring links between group
interactions and writing products. Journal of Second Language Writing, 35, 38–​53. https://​doi.org/​10.1016/​
j.jslw.2017.01.003
Lim, J., & Polio, C. (2020). Multimodal assignments in higher education: Implications for multimodal writing
tasks for L2 writers. Journal of Second Language Writing, 47.
Leijten, M., Van Waes, L., Schrijver, I., Bernolet, S., & Vangehuchten, L. (2019). Mapping master’s students’
use of external sources in source-​based writing in L1 and L2. Studies in Second Language Acquisition,
41(3), 555–​582.
Loewen, S., & Sato, M. (2018). Interaction and instructed second language acquisition. Language Teaching,
51(3), 285–​329.
Long, M.H. (1996). The role of the linguistic environment in second language acquisition. In W.C. Ritchie &
T.K. Bathia (Eds.), Second language acquisition: Vol. 2. Handbook of language acquisition (pp. 413–​468).
New York: Academic Press.
López-​Serrano, S., Roca de Larios, J., & Manchón, R.M. (2019). Language reflection fostered by individual
L2 writing tasks: Developing a theoretically-​motivated and empirically-​based coding system. Studies in
Second Language Acquisition, 41, 503–​527.
López Serrano, S., Roca de Larios, J., & Manchón, R.M. (2020). Reprocessing output during L2 individual
writing tasks: An exploration of depth of processing and the effects of proficiency. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 231–​253). Amsterdam: John Benjamins.
Mak, B., & Coniam, D. (2008). Using wikis to enhance and develop writing skills among secondary school
students in Hong Kong. System, 36(3), 437–​455. https://​doi.org/​10.1016/​j.system.2008.02.004
Manchón, R.M. (2011). Situating the learning-to-write and writing-to-learn dimensions of L2 writing.
In R.M. Manchón (Ed.), Learning-​to-​write and writing-​to-​learn in an additional language (pp. 3–​16).
Amsterdam: John Benjamins.
Manchón, R.M. (Ed.). (2020a). Writing and language learning: Advancing research agendas. Amsterdam: John
Benjamins.
Manchón, R.M. (2020b). The language learning potential of L2 writing: Moving forward in theory and
research. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 405–​
426). Amsterdam: John Benjamins.
Manchón, R.M. (2021). The contribution of ethnographically-​oriented approaches to the study of L2 writing
and text production processes. In A. Bocanegra & I. Guillén (Eds), Ethnographies of academic writing
research: Theory, methods, and interpretation (pp. 83–103). Amsterdam: John Benjamins.
Manchón, R.M. & Leow, R. (2020). An ISLA perspective on L2 learning through writing: Implications for
future research agendas. In R.M. Manchón (Ed.), Writing and language learning. Advancing research
agendas (pp. 336–​355). Amsterdam: John Benjamins.
Manchón, R.M. & Roca de Larios, J. (2007). Writing-​to-​learn in instructed language learning contexts. In
E.A. Soler & M.P.S. Jordá (Eds.), Intercultural language use and language learning (pp. 101–​121).
Berlin: Springer.
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving nature
of foreign language composing processes: Implications for theory. In R.M. Manchón (Ed.), Writing in for-
eign language contexts: Learning, teaching, and research (pp. 102–​129). Bristol: Multilingual Matters.
Manchón, R.M. & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.B. Schwieter, & A. Benati, (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
McDonough, K., Crawford, W., & Vleeschauwer, J. de (2016). Thai EFL learners’ interaction during collabora-
tive writing tasks and its relationship to text quality. In M. Sato & S.G. Ballinger (Eds.), Language learning
& language teaching. Peer interaction and second language learning: Pedagogical potential and research
agenda (pp. 185–​207). Amsterdam: John Benjamins.
Michel, M. (2018). Practising online with your peers: The role of text chat for second language develop-
ment. In C. Jones (Ed.), Practice in second language learning (pp. 164–​174). Cambridge: Cambridge
University Press.
Michel, M., & O’Rourke, B. (2019). What drives alignment during text chat with a peer vs. a tutor? Insights
from cued interviews and eye-​tracking. System, 83, 50–​63. https://​doi.org/​10.1016/​j.system.2019.02.009


Michel, M., Révész, A., Lu, X., Kourtali, N.E., Lee, M., & Borges, L. (2020). Investigating L2 writing
processes across independent and integrated tasks: A mixed-​methods study. Second Language Research,
36(3), 277–​304. https://​doi.org/​10.1177/​0267658320915501
Michel, M., & Stiefenhöfer, L. (2019). Priming Spanish subjunctives during synchronous computer-​mediated
communication: German peers’ classroom-​based and homework interactions. In M. Sato & S. Loewen
(Eds.), Evidence-​based second language pedagogy (pp. 191–​218). New York: Routledge.
Ong, J. (2013). Discovery of ideas in second language writing task environment. System, 41(3), 529–542.
Ong, J. (2014). How do planning time and task conditions affect metacognitive processes of L2 writers? Journal of Second Language Writing, 23, 17–30.
O’Rourke, B. (2008). The other C in CMC: What alternative data sources can tell us about text-​based syn-
chronous computer mediated communication and language learning. Computer Assisted Language
Learning, 21(3), 227–​251.
O’Rourke, B. (2012). Using eye-​tracking to investigate gaze behaviour in synchronous computer-​mediated
communication for language learning. In M. Dooly & R. O’Dowd (Eds.), Researching online foreign lan-
guage interaction and exchange: Theories, methods and challenges (pp. 305–341). Bern: Peter Lang.
Polio, C., & Friedman, D.A. (2017). Understanding, evaluating, and conducting second language writing
research. New York: Routledge.
Raimes, A. (1987). Language proficiency, writing ability, and composing strategies: A study of ESL college
student writers. Language Learning, 37, 439–​468.
Révész, A., Kourtali, N., & Mazgutova, D. (2017). Effects of task complexity on L2 writing behaviors and
linguistic complexity. Language Learning, 67, 208–​241.
Révész, A. & Michel, M. (Eds.) (2019a). Methodological advances in investigating L2 writing processes.
Special Issue, Studies in Second Language Acquisition, 41(3).
Révész, A., & Michel, M. (2019b). Introduction. In A. Révész & M. Michel (Eds.), Methodological advances
in investigating L2 writing processes. Special Issue in Studies in Second Language Acquisition, 41(3),
491–​501.
Révész, A., Michel, M., & Lee, M. (2017). Investigating IELTS Academic Writing Task 2: Relationships
between cognitive writing processes, text quality, and working memory. IELTS Research Reports, 3, 5–​44.
Révész, A., Michel, M., & Lee, M. (2019). Exploring second language writers’ pausing and revision behaviors: A
mixed-​methods study. Studies in Second Language Acquisition, 41(3), 605–​631.
Roca de Larios, J., Manchón, R.M., Murphy, L., & Marín, J. (2008). The foreign language writer’s strategic
behavior in the allocation of time to writing processes. Journal of Second Language Writing, 17, 30–​47.
Roca de Larios, J., Nicolás-​Conesa, F., & Coyle, Y. (2016). Focus on writers: Processes and strategies. In
R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 267–​286).
Berlin: De Gruyter Mouton.
Rouhshad, A., & Storch, N. (2016). A focus on mode: Patterns of interaction in face-​to-​face and computer-​
mediated contexts. In M. Sato & S.G. Ballinger (Eds.), Language learning & language teaching. Peer interaction and second language learning: Pedagogical potential and research agenda (pp. 267–289).
Amsterdam: John Benjamins.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–​32).
Cambridge: Cambridge University Press.
Séror, J. (2013). Screen capture technology: A digital window into students’ writing processes. Canadian
Journal of Learning and Technology, 39(3), 1–​16.
Smith, B. E., Pacheco, M. B., & De Almeida, C. R. (2017). Multimodal codemeshing: Bilingual adolescents’
processes composing across modes and languages. Journal of Second Language Writing, 36, 6−22.
Stiefenhöfer, L., & Michel, M. (2020) Investigating the relationship between peer interaction and writing
processes in computer-​supported collaborative L2 writing: A mixed-​methods study. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 256–​279). Amsterdam: John Benjamins.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52(1), 119–​158. https://​
doi.org/​10.1111/​1467-​9922.00179
Storch, N. (2008). Metatalk in a pair work activity: Level of engagement and implications for language devel-
opment. Language Awareness, 17(2), 95–​114. https://​doi.org/​10.1080/​09658410802146644
Storch, N. (2017). Implementing and assessing collaborative writing activities in EAP classes. In J. Bitchener,
N. Storch, & R. Wette (Eds.), Teaching writing for academic purposes to multilingual students: Instruc-
tional approaches (pp. 130–​142). New York: Routledge.
Storch, N., & Aldosari, A. (2013). Pairing learners in pair work activity. Language Teaching Research, 17(1),
31–​48. https://​doi.org/​10.1177/​1362168812457530
Storch, N., & Wigglesworth, G. (2007). Writing tasks: The effect of collaboration. In M. d. P. García Mayo
(Ed.), Second language acquisition: Investigating tasks in formal language learning (pp. 157–​177).
Clevedon: Multilingual Matters.


Strobl, C. (2014). Affordances of Web 2.0 technologies for collaborative advanced writing in a foreign lan-
guage. CALICO Journal, 31(1), 1–​18. https://​doi.org/​10.11139/​cj.31.1.1-​18
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S. Gass & C. Madden (Eds.), Input in Second Language Acquisition (pp.
235–​253). Rowley, MA: Newbury House.
Swain, M. (1993). The Output Hypothesis: Just speaking and writing aren’t enough. Canadian Modern
Language Review, 158–​164.
Swain, M. (2000). The Output Hypothesis and beyond: Mediating acquisition through collaborative dia-
logue. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 97–114). Oxford: Oxford
University Press.
Swain, M. (2006). Languaging, agency and collaboration in advanced second language learning. In H.
Byrnes (Ed.), Advanced language learning: The contributions of Halliday and Vygotsky (pp. 95–​108).
London: Continuum.
Watanabe, Y., & Swain, M. (2007). Effects of proficiency differences and patterns of pair interaction on second
language learning: Collaborative dialogue between adult ESL learners. Language Teaching Research,
11(2), 121–​142. https://​doi.org/​10.1177/​136216880607074599.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331.
Yim, S., & Warschauer, M. (2017). Web-​based collaborative writing in L2 contexts: Methodological insights
from text mining. Language Learning & Technology, 21(1), 146–​165.
Zhang, M. (2019). Towards a quantitative model of understanding the dynamics of collaboration in collabora-
tive writing. Journal of Second Language Writing, 45, 16–​30. https://​doi.org/​10.1016/​j.jslw.2019.04.001
Zheng, B., & Warschauer, M. (2017). Epilogue: Second language writing in the age of computer-​mediated com-
munication. Journal of Second Language Writing, 36, 61–​67. https://​doi.org/​10.1016/​j.jslw.2017.05.014
Ziegler, N. (2016). Taking technology to task: Technology-​mediated TBLT, performance, and production.
Annual Review of Applied Linguistics, 36, 136–​163.

7
LEARNERS' ENGAGEMENT WITH WRITTEN CORRECTIVE FEEDBACK IN INDIVIDUAL AND COLLABORATIVE L2 WRITING CONDITIONS
Julio Roca de Larios and Yvette Coyle
University of Murcia

Introduction
The instrumental role that individual or collaborative writing and written feedback may play in second language (L2) learning has been highlighted by some scholars on the grounds that the written modality may give learners special opportunities to increase their language awareness (Bitchener, 2019; Bitchener & Storch, 2016; Manchón & Roca de Larios, 2007; Manchón & Williams, 2016). The reduced time constraints involved in writing, and the set of mental processes that may be triggered when writers attempt to match intended messages with their linguistic expression, have been posited as factors that may heighten learners' attention to formal aspects of language and lead, as a result, to more complex and/or accurate performance than is possible on tasks in which these processes are not activated (Roca de Larios, 2013; Manchón & Williams, 2016).
In particular, the multistage structure of written feedback tasks is thought to prompt these processes by combining the benefits for learning associated with output production and input processing. Output production in the initial composition stage is posited to promote syntactic processing combined with self-monitoring (in individual writing) and with self- and other-monitoring (in collaborative writing), which may give learners the opportunity to notice holes in their interlanguages. If input is subsequently available in the form of feedback, this previous noticing may enable learners to compare their problematic output with the target language with more focal attention (noticing the gap, Schmidt, 1990), which will hopefully lead, in the revision stage, to deeper processing with awareness at the level of understanding (Schmidt, 1990), typically associated with subsequent robust learning (Swain, 1995; Uggen, 2012).
Although teachers may respond to many aspects of learners’ written texts, it is written cor-
rective feedback (WCF), globally understood as “any explicit attempt to draw a learner’s attention
to a morpho-​syntactic or lexical error” (Polio, 2012, p. 375), that seems to have received much
attention in recent years. Most research has been conducted as a reaction to Truscott’s arguments
against its validity and usefulness (e.g., Truscott, 2007) and to critical calls for improvements in
research designs (e.g., Guenette, 2007). As a result, research on WCF has verified that the effects of


different types of WCF in terms of explicitness (direct, indirect, metalinguistic explanations, etc.)
are more consistent in some areas (e.g., the accurate production of a limited number of grammatical
forms both immediately and after receiving the feedback) than in others (e.g., the scope of errors
to be corrected) (see Chapter 16, this volume). However, being mostly interested in measuring the accuracy of the final product (taken as a reflection of language learning), research on WCF has until recently overlooked the processes learners engage in when appropriating the feedback given to them (Bitchener & Storch, 2016). Yet, despite this general trend, attention to learners' engagement with feedback, in both individual and collaborative writing conditions, is increasingly becoming a relevant area of research.
According to Han and Hyland (2015), learner engagement with feedback is a broad concept that
may encompass different sub-​constructs, such as quality of processing (Qi & Lapkin, 2001; Storch
& Wigglesworth, 2010), analysis of language and cognitive comparison (Sheen, 2010), willingness
and commitment to reasoning about errors (Evans et al., 2010) or learning strategies for memor-
izing target forms (Hyland, 2003). In consonance with this variety of characterizations, engagement
with feedback is considered in this chapter as a multifaceted and more encompassing construct than
feedback processing, which is understood as that part of engagement mainly related to the depth and
quality of cognitive effort involved in detecting, analyzing, and understanding the feedback (see
Svalberg, 2009; Uscinski, 2017).
This chapter identifies and critically discusses key issues and trends in engagement with
WCF as documented in individual and collaborative writing studies. In what follows, we start by
briefly discussing the relevance of this field of inquiry for SLA studies and how it has evolved
over the years. We then review the relevant research strands in terms of the theoretical assumptions guiding the research, the main issues addressed, and the key findings reported. Finally, we outline the main methods used in the studies reviewed and make some recommendations for practice and for future research.

Historical Perspectives
Historically, several reasons have been adduced for the need to look at learners’ engagement with
written corrective feedback (WCF) and the factors that may influence this process. One reason is
that feedback may only be effective in helping learners develop their L2 if they are willing and
motivated to engage with the WCF so that they can process, act upon, and, ultimately, incorporate
it into their future language use (Wigglesworth & Storch, 2012a, b). In fact, the lack of consistent
findings in WCF studies has been attributed, among other things, to the fact that not all learners
process feedback in the same way or to the same extent (Bitchener, 2012; Kim & Bowles, 2019).
In addition, understanding how learners respond to WCF may also be important for teachers’ peda-
gogical practices (Zheng & Yu, 2018).
This need to analyze learners’ engagement with feedback was initially addressed by a small
number of studies which emerged in the late 1980s and 1990s as an alternative to the dominant trend
in WCF research of using quasi-​experimental and controlled designs to look exclusively at the
product-​based response of learners to feedback (e.g., Cohen, 1987; Hyland, 1998). This orientation
has been followed up over the years by other studies that, for want of a better name, we have iden-
tified as non-​interventionist or naturalistic, since they are usually conducted in intact classes and
make use of case study methodologies to analyze the thought processes engaged in by individual
learners when trying to identify and understand the WCF provided by the teacher. This feedback,
which usually includes all types of comments, is not predetermined by the researcher (e.g., Ferris
et al., 2013; Hyland, 2003).
Also at the beginning of the present century, a set of studies on collaborative writing conducted
by Lapkin, Swain and Smith (2002) and Swain and Lapkin (2002) with French immersion


adolescents in Canada, and by Adams (2003) with L2 Spanish university students in the USA,
started to look at the way learners engaged with reformulations as a specific type of WCF. This
approach contrasted with the previous one and opened up the interventionist strand, which is
composed of studies where predetermined forms of feedback are provided to learners when writing
individually (e.g., Kim & Bowles, 2019) or in collaboration (e.g., Adams, 2003). In these studies, engagement with feedback is usually evaluated through between-group comparisons (e.g., Sachs & Polio, 2007, but see Qi & Lapkin, 2001), sometimes supplemented with case studies (Storch & Wigglesworth, 2010). The present chapter focuses, with a few exceptions, on studies within the interventionist strand, as these are the most relevant from a second language acquisition (SLA) perspective.

Critical Issues and Topics


Within the interventionist research strand, attempts to account for the potential links between engagement with feedback in individual writing and language development have been based on applying to the field of WCF a number of theoretical assumptions found in the cognitively-oriented SLA literature. The simplest of these applications in terms of the mechanisms involved,
which we call the Noticing Approach, is largely premised on Schmidt’s (1990) Noticing Hypothesis
without reference to differences in levels of awareness. From this perspective, feedback processing
is equated with paying focal attention to linguistic forms in the input. This is the approach adopted
in studies in which learners are provided with model texts tailored to their age and proficiency level
as well as to the content and the genre of the writing task at hand (e.g., Hanaoka, 2007). It is also
the approach adopted in studies where models have been compared to reformulations, i.e., teachers’
rewritings of learners’ texts, preserving all the original ideas and making them sound as native-​like
as possible (Hanaoka & Izumi, 2012).
A second application, which we call the Dual Approach, also draws on Schmidt’s Noticing
Hypothesis (1990, and elsewhere), but now makes use of his distinction between awareness at the
level of noticing and awareness at the level of understanding, the latter being related to the ability
to analyze, compare, and test hypotheses about previously noticed input. This approach has been
adopted in studies accounting for learners’ engagement with reformulations (Qi & Lapkin, 2001),
direct corrections (Suzuki, 2012, 2017; Uscinski, 2017), reformulations versus direct corrections
(Sachs & Polio, 2007) or direct versus indirect corrections (Simard, Guénette, & Bergeron, 2015).
Other SLA-​oriented theoretical perspectives informing research on individual writing, which
have emerged as alternatives to the dichotomous approach suggested by Schmidt, make up what we
have collectively categorized as the Beyond the Dual Approach. These perspectives include three
main lines of inquiry. One line has addressed learners’ multiple levels of feedback processing using
some of the feedback techniques found in the previous approaches, such as indirect corrections (Park
& Kim, 2019), direct, indirect, and metalinguistic feedback (e.g., Caras, 2019), reformulations and
direct corrections (Kim & Bowles, 2019), direct and indirect corrections (e.g., Cerezo, Manchón,
& Nicolás-​Conesa, 2019; Suh, 2010), and direct corrections in individual and collaborative writing
conditions (Manchón et al., 2020). A second line, so far comprising only one study (Li, 2017), has used SLA-based input and output processing models (Bitchener, 2016, 2019; Bitchener & Storch, 2016), together with principles of Dynamic Systems Theory (Verspoor, Lowie, & van Dijk, 2008), to account for the potential links between metalinguistic feedback processing and partial develop-
ment of language. Finally, studies in a third line have looked at engagement with feedback in indi-
vidual writing from two different perspectives. First, Shintani and Aubrey (2016) have relied on a
combination of L1 writing process models (Flower & Hayes, 1980; Kellogg, 1996), skill-​learning
theoretical assumptions (DeKeyser, 2007) and sociocultural perspectives on scaffolding (Aljaafreh
& Lantolf, 1994) to account for the different cognitive and social processes involved in learners’
responses to asynchronous and synchronous direct corrections. Second, a set of related studies


have adapted to WCF a general model of corrective feedback suggested by Ellis (2010), with the aim of examining individual engagement as a multidimensional activity comprising cognitive, affective, and behavioral dimensions (Han, 2017; Han & Hyland, 2015; Zheng & Yu, 2018).
Regarding collaborative writing (CW), most research has implicitly or explicitly
conceptualized engagement with feedback as a joint problem-​solving activity mediated by
languaging, i.e., learners’ reflection on language to make sense of the linguistic system and to
learn about language (Bitchener & Storch, 2016). On the basis of this mediational tool, some
sociocultural studies have used collaborative dialogues to document noticing processes in
learners’ engagement with various types of feedback, such as reformulations (e.g., Adams, 2003;
Swain & Lapkin, 2002) or models and reformulations (Yang & Zhang, 2010). These studies
have also tried to capture learners’ reactions to text-​editing and reformulations in the form of
multiple processing levels (Wigglesworth & Storch, 2012a) or through the filter of their own
beliefs about language conventions and their intended meaning (Storch & Wigglesworth, 2010).
Studies with children have investigated their noticing processes with models (Cánovas-​Guirao,
Roca de Larios, & Coyle, 2015) and direct corrections and models (Coyle & Roca de Larios,
2014) by means of note-​taking. Other studies with children have studied dyads’ collaborative
dialogues to ascertain the influence of reformulation on their use of writing strategies (García-​
Hernández, 2017) or to account for the sequential routes or trajectories children follow through
the different stages of the feedback task (Coyle, Cánovas-​Guirao, & Roca de Larios, 2018).
The above discussion shows that research is progressively evolving into a more nuanced and
multifaceted understanding of learners’ engagement with WCF. Initial analyses of feedback pro-
cessing, operationalized as mere surface-​level noticing, have given way to other analyses involving
multiple levels of engagement, SLA-​oriented models of input and output processing, or even multi-
dimensional models of engagement with feedback. These issues, only outlined in this introduction,
will be elaborated further in the next section.

Current Contributions and Research


We first discuss how “engagement with feedback” has been operationalized in the three strands
mentioned above and then present the main contributions reported within each strand.

The “Noticing Approach” Strand


The different ways in which noticing in individual and collaborative writing has been operationalized
in this strand are all related to the idea of learners paying attention to the differences between their
own production and the target form (e.g., Adams, 2003; Cánovas-​Guirao et al., 2015; Hanaoka,
2007; Swain & Lapkin, 2002), attention which is sometimes reflected as overtly or covertly
verbalized problems (Hanaoka & Izumi, 2012) (see also Chapter 2, this volume).
What emerges from these studies is that different feedback techniques (reformulations, models, and direct corrections) may lead to the noticing of different types of linguistic form and that, with some
exceptions, a relationship may be posited between noticing and immediate retention of feedback
(e.g., Coyle & Roca de Larios, 2014; Lapkin et al., 2002; Swain & Lapkin, 2002; Yang & Zhang,
2010). These findings, however, come from research primarily focused on the description of the
number and type of linguistic units underlying learners’ noticing rather than on the processes they
engage in to analyze and interpret the information within the feedback and thus upgrade their levels
of awareness. These processes begin to emerge in the two approaches reviewed next.


The “Dual Approach” Strand


The operationalization of the “noticing and understanding” dichotomy in this strand has adopted
different forms and levels of explicitness as a function of the data collection procedure used (i.e.,
think-aloud, written languaging, or eye tracking). In studies using think-aloud protocols, the dichotomy
has been explicitly operationalized in two ways. One consists of classifying learners’ verbalized
LREs either as perfunctory (when they simply noticed differences between their own text and the
reformulation) or substantive (when they gave reasons for their acceptance of the feedback) (Qi &
Lapkin, 2001). The other operationalization entails identifying the level of awareness involved in
learners’ verbal comments: either noticing, when they read the correction aloud or just mentioned
the error, or understanding, when they used metalanguage and reasoned about the corrections
(Sachs & Polio, 2007). When data have been collected through written languaging, defined either
as learners’ “interpretation of each type of WCF received” (Simard et al., 2015, p. 242) or “explan-
ation about the direct correction for a given error” (Suzuki, 2012, p. 1119), the operationalization
of “noticing” and “understanding” as two distinct processes does not appear as explicit as in the
previous studies. However, the central role played by “explanation” and “interpretation” processes
in those definitions seems to suggest that the emphasis is placed on the “understanding” side of the
equation, with “noticing” being taken for granted. The lowest level of explicitness of the noticing/​
understanding dichotomy is found in those studies that have attempted to capture these dichot-
omous processes through the use of eye-​tracking techniques (e.g., Shintani & Ellis, 2013), since
the duration of eye-​fixations may imply understanding but also signal confusion or even deep pro-
cessing without achieving understanding.
The data collected in these studies provide insights into the relative usefulness of the various WCF techniques used and into the relationship between noticing/understanding and accuracy. Direct correction appeared to prompt rich "languaging" about grammar more often than about lexis (Suzuki,
2012) and led to a greater understanding of the nature of errors than implicit WCF (underlining),
which caused misunderstandings and partial understandings when learners had difficulty in
interpreting the source of the error (Simard et al., 2015). Direct correction, however, did not seem
to be as powerful as metalinguistic explanation in helping learners understand the rules of the
English article system, although no differences between the two feedback techniques were found in the frequency and duration of eye movements for each article error (Shintani & Ellis, 2013). However,
while some studies reported a direct relationship between levels of noticing and accuracy, other
studies limited this relationship to that of a mere association. In the first case, Qi and Lapkin (2001)
found that noticing with a reason (substantive noticing) contributed directly to improvements in
subsequent output more than noticing without providing a reason (perfunctory noticing). Similarly, Suzuki (2012) reported that written languaging mediated improvements in lexical and grammatical
accuracy when learners reflected on the nature of the linguistic problem indicated by the WCF,
and Shintani and Ellis (2013) found metalinguistic explanation to be more effective than direct
correction in helping learners improve accuracy of the English indefinite article on both an imme-
diate post-​test and a delayed post-​test, although the effect wore off over time. In contrast, Sachs
and Polio (2007) claimed that the use of metalanguage and the provision of reasons by learners
were only associated with accurate changes in subsequent output, as improvements in accuracy
on revisions and verbalized awareness did not always correspond. These authors also suggested
that the evidence for substantive versus perfunctory noticing is not always straightforward since
it is at times difficult to determine how deeply a correction has been processed by relying only on
verbal data. These data are sometimes incomplete because the learner may know the reason for the correction without stating it, or because the complexity of the error makes it very difficult to verbalize.


The “Beyond the Dual Approach” Strand


A first group of studies in this strand conceptualizes engagement with WCF as a cognitive activity involving multiple processing levels, which have been operationalized by relying on three theoretical constructs, namely, "awareness," "depth of processing" (DoP), and "engage-
ment in the interaction.” “Awareness” is defined as “a particular state of mind connected to a
subjective experience of some cognitive content or external stimulus” (Tomlin & Villa, 1994,
p. 193), while DoP, as propounded by Leow in his 2015 model of Instructed Second Language
Acquisition (ISLA), has been characterized as “the relative amount of cognitive effort, level of
analysis, and elaboration of intake, together with the use of prior knowledge, hypothesis testing,
and rule formation employed in decoding and encoding some grammatical or lexical item in
the input” (Leow, 2015, p. 204). In contrast with the internal character of these two constructs,
“engagement in the interaction” is taken to be a more external category since it has been used to
account for how learners interact with each other while writing in collaboration, either reading
the feedback and just acknowledging it (limited engagement) or offering suggestions and
counter-​suggestions, explanations or any comments that show meta-​awareness of the feedback
received (extensive engagement) (Storch & Wigglesworth, 2010). The construct of awareness
allowed Suh (2010) and Suzuki (2017) to identify several levels, which ranged from lower
levels of awareness (uncertainty about the purposes of the correction or noticing of surface-​level
phenomena) to higher ones (verbalization of underlying rules, hypotheses, and comparisons).
Drawing on the construct of DoP, Kim and Bowles (2019) and Caras (2019) isolated three DoP levels, which broadly ranged from cases where learners did not show any processing of the target form (low DoP), to cases in which they made comments indicating some processing but without understanding the correction (medium DoP), to still other cases where they engaged in hypothesis testing and rule formation or spent time processing the target form (high DoP). Park and Kim (2019), Cerezo et al. (2019), and Manchón et al. (2020) identified five levels of DoP by combining different types of behavior shown by learners. Finally, Wigglesworth and Storch (2012a), taking "extensive engagement" (see above) as their point of departure, identified three levels in this category (low, medium, or high) using quantitative criteria.
These studies showed that higher levels of awareness were prompted by both comprehen-
sive (Suzuki, 2017) and focused (Suh, 2010) forms of direct correction, although only in the focused condition were these levels found to have a significant relationship with accuracy improvement. Higher DoP levels were found by Caras (2019) to be mainly fostered
by metalinguistic feedback, followed by direct corrections and underlining, with Kim and Bowles
(2019) similarly reporting that higher levels of DoP were more often related to reformulations than to
direct corrections. In contrast, Cerezo et al. (2019) reported a clear advantage of direct over indirect
WCF for prompting deeper levels of processing, and Manchón et al. (2020) did not find any signifi-
cant differences in DoP levels between individual and collaborative writing. As for feedback appro-
priation, hardly any relationship was found between reported DoP levels and accuracy of the target
forms addressed (Caras, 2019) or overall accuracy of the final text (Cerezo et al., 2019; Manchón
et al., 2020), or between cognitive effort and learners' understanding of the errors underlined (Park
& Kim, 2019). In collaborative writing conditions, Wigglesworth and Storch (2012a) reported that
error coding promoted higher levels of engagement in pair interaction than reformulations, although
these higher levels did not directly manifest themselves in greater accuracy of rewritten texts. Rather,
students’ beliefs about the CF they received favored or hindered its incorporation.
In short, the results of this first line of inquiry seem to suggest that the promotion of higher levels
of awareness, DoP, and engagement in the interaction was related to the use of specific feedback
techniques. However, although higher levels of awareness appeared to be related to the improve-
ment of accuracy in certain cases, neither DoP nor extensive engagement in the interaction seemed
to have a clear effect on the level of accuracy achieved in revisions.
Li (2017) represents a different research orientation within the “Beyond the Dual Approach” strand.
On the basis of Bitchener’s model (2016; see also 2019), the study starts from the premise that
engagement with feedback is a developmental process that involves (i) the processing of input that
occurs in initial treatment episodes, where explicit knowledge is internalized by learners through
noticing and understanding processes; and (ii) the modification and consolidation of that knowledge
through its retrieval and use as output in subsequent pieces of writing over time. Following the
basic tenets of Dynamic Systems Theory (Verspoor et al., 2008), Li contends that these modification
and consolidation processes occur within a complex, self-​organizing system (the L2) which, being
sensitive to feedback, undergoes continuous, non-​linear changes that may be subtle and hardly
perceived (the target structure at hand is partially developed) or abrupt and visible (L2 knowledge is
reorganized due to the incorporation of the newly assimilated information into the internal system).
Accordingly, the expectation of a linear relationship between WCF and the effects it may prompt
is unrealistic, since failure to improve accuracy after a WCF treatment does not necessarily mean
that learning has not taken place. It may simply indicate that more WCF is needed to iterate the L2
learning process until changes in the system become visible.
On the basis of these theoretical assumptions, in Li (2017) a group of Chinese EFL learners were
given direct corrections and metalinguistic explanations on the use of the English passive voice, and
their performance was compared to that of a practice group. No differences between groups were
found for the accurate production of the target form, but the metalinguistic explanation group was superior to
the practice group in recognizing the need to use the passive voice on obligatory occasions.
Following Bitchener’s (2016) model, this meant that, after this specific WCF treatment, the learners
had been able to partially develop the target form by prioritizing semantic understanding (recogni-
tion of when to use the passive voice) over structural understanding (its accurate production) as the
necessary precondition to its full acquisition. From the perspective of Dynamic Systems Theory,
recognition of the need to use the passive voice should be understood, according to Li, as a subtle
change in learners’ L2 knowledge that required more nuanced analyses to become exter-
nally visible.
Another group of studies within the “Beyond the Dual Approach” have focused on a broader
variety of issues. In individual writing, these include the conceptualization and analysis of feed-
back processing as a multidimensional activity, and the exploration of the main cognitive and social
processes involved in synchronous and asynchronous WCF. Collaborative studies have looked at
the role played by adult learners’ goals and beliefs in their feedback processing or scrutinized the
effects of WCF on children’s writing strategies and the processing trajectories they follow through
the different stages of the task.
Han and Hyland’s (2015) naturalistic research analyzed how individual writers addressed WCF.
Drawing on Ellis (2010), engagement with feedback was operationalized as a multidimensional
activity involving cognitive (metacognitive and cognitive operations engaged in by learners),
affective (learners’ emotional responses and attitudes toward the feedback) and behavioral (the
revision operations performed on subsequent written texts) components. On the basis of this meth-
odological framework, the authors reported that, although the participants in their study addressed
WCF and their errors by means of cognitive and metacognitive operations, the processing level
achieved by each individual and the correspondence between this processing and accurate pro-
duction depended on their attitudes toward accuracy, motivation and beliefs about the effectiveness
of WCF, and their emotional responses to the feedback provided. Other studies have reported that
additional factors mediating learners’ multidimensional engagement with feedback include person-​
related, task-​related and strategy-​related beliefs (Han, 2017), as well as low proficiency (Zheng &
Yu, 2018). From a different perspective, Waller and Papi (2017) have claimed that learners’ motiv-
ation and orientation toward WCF are influenced by their implicit theories of intelligence.
In individual writing, Shintani and Aubrey (2016) looked at the cognitive and social processes
involved in synchronous (SCF) and asynchronous (ACF) corrective feedback. Drawing on the
theoretical frameworks reported above, the authors speculated that SCF, provided in real operating
conditions, might facilitate the online revision processes that naturally occur in writing. In contrast,
ACF, which is by definition separate from the composition process, was not thought to create a
context where the writer’s revisions could be embedded while producing new text. Socially, the
interactive nature of SCF seemed optimal for teachers to scaffold learners’ gradual development
(Aljaafreh & Lantolf, 1994), which was not the case with ACF since in this condition interaction
between teachers and students is limited. The results of the study confirmed these speculations: SCF
was found to bring about more durable and powerful effects than ACF. The former gave learners
the opportunity to gradually improve accuracy by consulting previous feedback in earlier sentences
when composing new ones, while their ACF counterparts only had the chance to copy corrected
sentences after the writing task had been completed.
Three additional studies in this line of inquiry have looked at engagement with feedback in
collaborative writing from the perspective of adult learners’ personal beliefs, on the one hand,
and children’s approaches to feedback processing, on the other. Storch and Wigglesworth (2010)
analyzed the influence of personal factors on university students’ analysis and interpretation
of reformulations and found that extensive levels of engagement in the interaction (see above)
were associated with high levels of uptake and retention only when the feedback was consistent
with learners’ beliefs from previous courses or with their correction goals. Conversely, limited
engagement accompanied by disapproval of the feedback led to little uptake and low reten-
tion. García-​Hernández (2017), as a response to the absence of research on learners’ strategic
behavior when processing and responding to WCF (Manchón, 2018), analyzed the effects of
reformulation on the strategies used by Spanish EFL 6th graders while writing in collaboration.
The results indicated that the children whose interpretation of and attitudes toward the feedback
were more congruent with the task goals displayed a more frequent upgrading orientation in the
strategies they used. Coyle et al. (2018), in turn, moved beyond the distinction between “noticing”
and “understanding” in the belief that, while this dichotomy may fit well with older learners,
it needs to be revised if children’s engagement with feedback is to be adequately accounted for.
Accordingly, they reported that (i) young EFL learners followed multiple routes or trajectories
during writing, processing of model texts and revision; and that (ii) these trajectories, which
involved a broader array of options than those usually covered in available WCF research with
adult learners, could be classified in relation to their impact on children’s ongoing development
of L2 knowledge.
Collectively considered, the studies in the three approaches so far discussed (Noticing Approach,
Dual Approach, and Beyond the Dual Approach) suggest that the study of engagement with and/​or
processing of WCF is gradually becoming more nuanced and multifaceted in theoretical and empir-
ical terms. Theoretically, this development is evident in the variety and complexity of the models
used to account for the phenomenon and in the conceptions of learning underlying the research.
Earlier accounts of feedback processing merely based on the notion of noticing are giving way to
other approaches based on multistage models of SLA or multidimensional models of WCF where
it is assumed that learners’ engagement with feedback can be understood as a cognitive activity
only in conjunction with other attitudinal and affective factors. The construct of learning underlying
the studies has evolved from conceptualizations in which it is equated with the accurate produc-
tion of subsequent texts, to process-​based accounts which show how L2 knowledge is partially
gained from feedback, and modified and consolidated over time. Empirically, some of the findings
reported also confirm this complex panorama, as they show, among other things, that the relation-
ship between deeper or more extensive levels of engagement with feedback and accuracy is not a
direct one. Rather, it appears to be mediated by the interaction of a number of factors that include, at
a minimum, the feedback type, the number and type of errors treated, and the congruence between
the feedback provided and the learners’ goals, beliefs, and expectations.


Main Research Methods


In the present section we will briefly discuss the research designs and data collection procedures
used in the research outlined above.
The research design most commonly used has been the typical one-shot, three-stage WCF task
(writing of original text, reception and processing of feedback, revision or rewriting of original
or new text), sometimes complemented with mini-​grammar lessons or error-​correction pre-​tests
before the initial writing stage, or with delayed or error-​correction post-​tests after the rewriting
stage. In some cases, more longitudinal designs have been used, with learners writing different texts
and receiving feedback along the way (Simard et al., 2015) or even completing take-​home essays
(e.g., Han, 2017). Time schedules have ranged from one, two, or three weeks to up to four months
or more in a few cases (e.g., Coyle et al., 2018; Simard et al., 2015).
Information on learners’ written performance and engagement with/​processing of WCF has been
gathered mainly in pen-​and-​paper contexts. Data collection procedures have included think-​aloud
(TA), considered a suitable tool to explore learners’ cognitive processes despite potential reactivity
and veridicality problems (Sachs & Polio, 2007), and stimulated recall, regarded as less intrusive than
TA although it may involve memory loss difficulties and prompt additional learning (Adams, 2003).
Written notes, thought to indicate learners’ locus of attention and to be less affected by memory loss
than off-​line measures (Hanaoka, 2007), and audio-​recorded collaborative dialogues, based on the
assumption that processing feedback in pairs can lead learners to engage more deeply with the feed-
back (Wigglesworth & Storch, 2012b), have also been used. In individual studies, written languaging,
viewed not only as an external storage mechanism but also as a tool that places fewer demands on
working memory than oral languaging (Suzuki, 2012); eye-tracking, grounded in the purported
connection between cognitive effort, eye-​fixation location and eye-​movement time (Leow, 2015); and
retrospective verbal reports and interviews, used in combination in naturalistic case studies (e.g., Han
& Hyland, 2015), have also proved useful.
As for the range of WCF techniques used, it should be reiterated that these have mostly been
unfocused in the “Noticing” and “Dual” strands, with focused feedback used by some studies in the
“Beyond the Dual Approach” strand. Also, while studies in the “Noticing” strand have mostly used
discursive feedback techniques, such as reformulations and models, separately or in comparison
(but see Coyle & Roca de Larios, 2014), in the other two strands there has been a shift towards the
analysis of direct error ​correction techniques, either alone or in comparison with other direct (e.g.,
reformulation, metalinguistic explanation) and indirect techniques.

Recommendations for Practice


The main recommendation to be drawn from the available research findings is that getting learners
to engage with WCF cannot properly be achieved if teachers just jump into correcting students’
texts. When possible, they should pay attention to the larger context, the nature and goals of the
course, their own emotions as feedback providers (see Caswell, 2018) and the learners’ individual
characteristics (Ferris et al., 2013). On the basis of this information, and bearing in mind that
different feedback techniques and error types may promote, among other things, different levels of
processing (Kim & Bowles, 2019), teachers might reflect on the most appropriate types, forms and
timing of the feedback to be provided, as well as on the advisability of informing students about
their WCF approaches and strategies (Han & Hyland, 2015).
These general recommendations may be linked to the usefulness of specific techniques for promoting
engagement with feedback. These include, for example, the use of focused and unfocused feedback in
the same text so as to favor appropriate processing; the use of TA or collaborative dialogues as
techniques to interpret and understand WCF, with the aim of raising students’ awareness of the gaps
they may have; the provision of WCF during the completion of the task rather than between tasks,
with a view to increasing the likelihood that it will be adequately processed by the learner; and the
provision of clear guidelines on how learners are expected to use direct feedback, as a way of helping
them engage with it with meta-awareness.

Future Directions
Future research should take stock of the attested fact that, beyond specially-​designed L2 composition
courses, L2 learning through writing mostly takes place over multiple opportunities to engage with
language in curriculum-​based instructional contexts (see Manchón & Leow, 2020). If adopted, this
curricular approach should logically lead to the expansion of forms of writing, first and second lan-
guage combinations, contexts and populations so far investigated. As a result, a number of different
areas should be addressed in future research. For a start, the task already initiated in the field of
producing more nuanced and theoretically-​based understandings of L2 learners’ engagement with/​
processing of WCF must be pursued further. In this respect, attempts to account for the multidimen-
sionality of the phenomenon may require not only the use of the different processing models discussed
above but also the application of new, more curriculum-​oriented, ones (see Leow, 2020). Additionally,
the refining of the learning construct, in line with a process-​oriented view of feedback appropriation,
should continue so as to make it more congruent with the multiplicity of instructional settings where
writing is performed by children, adolescents and adults as part of their L2 curriculum.
Long-​term studies in different writing modalities (collaborative, individual, digital, pen
and paper) are also advocated as a way to move away from one-​shot treatments. Throughout
different units of work in the curriculum, learners should be provided with different WCF cycles
(Wigglesworth & Storch, 2012b) whose impact could be explored by looking at how feedback is
first processed and then gradually incorporated into their linguistic repertoires. For the first purpose,
the combined use of techniques that can capture higher and lower levels of awareness, i.e., think-​
aloud and eye-​tracking (Kim & Bowles, 2019), might allow researchers to conduct fine-​grained
analyses of how learners initially process feedback on linguistic items differing in saliency and/​or
complexity. The analyses of this processing might then be linked to learners’ repeated retrievals and
use of that feedback in subsequent writing tasks within each cycle (Bitchener, 2016). In turn, the
gradual retention of feedback across different WCF cycles could be analyzed by looking at the par-
tial development of learners’ knowledge through DST-​based micro-​developmental approaches that
involve the collection of dense data from a large sample and may help make subtle changes more
visible (Li, 2017). The influence of learner-​external and learner-​internal factors (together with more
and less explicit and informative types of WCF) in moderating these processes within the classroom
should also be examined.
Writing in instructional settings is often so integrated with the other three skills that the target
items in WCF are also frequently covered by other classroom activities or the textbook. Future
studies might control for this overlap or, alternatively, explore the reinforcement of learning
involved in this integration of skills as a function of the target item(s) addressed, the learners’ lan-
guage proficiency, the length of exposure or any other relevant factors. Finally, the role played by
the teacher in providing WCF should also be explored. Future research, in addition to looking at the
factors, contexts, and attitudes constraining such provision, might also triangulate retrospective text-​
based teacher interviews with the reactions of students after receiving and addressing the feedback as
a way to ensure the “right conditions” for WCF to be beneficial (Ferris & Kurzer, 2019).

References
Adams, R. (2003). L2 output, reformulation and noticing: Implications for IL development. Language Teaching
Research, 7, 347–376.
Aljaafreh, A. & Lantolf, J.P. (1994). Negative feedback as regulation and second language learning in the zone
of proximal development. Modern Language Journal, 78, 465–​483.
Bitchener, J. (2012). A reflection on the language learning potential of written CF. Journal of Second Language
Writing, 21, 348–​363.
Bitchener, J. (2016). Why written corrective feedback can contribute to L2 development: A theoretical model.
Paper presented at ALAA Conference, Melbourne, December.
Bitchener, J. (2019). The intersection between SLA and feedback research. In K. Hyland & F. Hyland
(Eds.). Feedback in second language writing. Contexts and issues (pp. 85–​105). Cambridge: Cambridge
University Press.
Bitchener, J. & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Cánovas-​Guirao, J., Roca de Larios, J., & Coyle, Y. (2015). The use of models as a written feedback technique
with young EFL learners. System, 52, 63–​77.
Caras, A.M. (2019). Written corrective feedback in compositions and the role of depth of processing. In R.P.
Leow (Ed.), The Routledge handbook of second language research in classroom learning (pp. 188–​200).
New York: Routledge.
Caswell, N.I. (2018). Affective tensions in response. Journal of Response to Writing, 4, 69–​98.
Cerezo, L., Manchón, R.M., & Nicolás-​Conesa, F. (2019). What do learners notice while processing written cor-
rective feedback? A look at depth of processing via written languaging. In R.P. Leow (Ed.), The Routledge
handbook of second language research in classroom learning (pp. 173–​187). New York: Routledge.
Cohen, A. (1987). Student processing of feedback on their compositions. In A.L. Wenden & J. Rubin (Eds.),
Learner strategies in language learning (pp. 57–​69). Englewood Cliffs, NJ: Prentice-​Hall.
Coyle, Y., Cánovas-​Guirao, J., & Roca de Larios, J. (2018). Identifying the trajectories of young EFL learners
across multi-​stage writing and feedback processing tasks with model texts. Journal of Second Language
Writing, 42, 25–​43.
Coyle, Y. & Roca de Larios, J. (2014). Exploring the role played by error correction and models on children’s
reported noticing and output production in a L2 writing task. Studies in Second Language Acquisition, 36,
451–​485.
DeKeyser, R.M. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second lan-
guage acquisition (pp. 97–113). Mahwah, NJ: Lawrence Erlbaum Associates.
Ellis, R. (2010). Epilogue: A framework for investigating oral and written corrective feedback. Studies in
Second Language Acquisition, 32, 335–​349.
Evans, N.W., Hartshorn, K.J., McCollum, R.M., & Wolfersberger, M. (2010). Contextualizing corrective feed-
back in second language writing pedagogy. Language Teaching Research, 14, 445–​463.
Ferris, D. & Kurzer, K. (2019). Does error feedback help L2 writers? Latest evidence on the efficacy of written
corrective feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing. Contexts and
issues (pp. 106–​124). Cambridge: Cambridge University Press.
Ferris, D., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22, 307–329.
Flower, L.S. & Hayes, J.R. (1980). The dynamics of composing: Making plans and juggling constraints. In
L.W. Gregg & E.R. Steinberg (Eds.), Cognitive processes in writing (pp. 31–50). Hillsdale, NJ: Lawrence
Erlbaum.
García Hernández, J. (2017). Analysis of the effects of reformulation as a written corrective feedback technique
in English with grade six pupils. Doctoral Dissertation submitted to the Faculty of Education. University
of Murcia.
Guénette, D. (2007). Is feedback pedagogically correct? Research design issues in studies of feedback in
writing. Journal of Second Language Writing, 16, 40–​53.
Han, Y. (2017). Mediating and being mediated: Learner beliefs and learner engagement with written corrective
feedback. System, 69, 133–​142.
Han, Y. & Hyland, F. (2015). Exploring learner engagement with written corrective feedback in a Chinese ter-
tiary EFL classroom. Journal of Second Language Writing, 30, 31–​44.
Hanaoka, O. (2007). Output, noticing, and learning: An investigation into the role of spontaneous attention to
form in a four-​stage writing task. Language Teaching Research, 11, 459–​479.
Hanaoka, O. & Izumi, S. (2012). Noticing and uptake: Addressing pre-​articulated covert problems in L2
writing. Journal of Second Language Writing, 21, 332–​347.
Hyland, F. (1998). The impact of teacher written feedback on individual writers. Journal of Second Language
Writing, 7, 255–​286.
Hyland, F. (2003). Focusing on form: Student engagement with teacher feedback. System, 31, 217–​230.
Kellogg, R.T. (1996). A model of working memory in writing. In C.M. Levy & S. Ransdell (Eds.), The science
of writing: Theories, methods, individual differences, and applications (pp. 57–​71). Hillsdale: Lawrence
Erlbaum Associates.
Kim, H.R. & Bowles, M. (2019). How deeply do second language learners process written corrective feed-
back? Insights gained from think-​alouds. TESOL Quarterly, 4, 913–​938.
Lapkin, S., Swain, M., & Smith, M. (2002). Reformulation and the learning of French pronominal verbs in a
Canadian French immersion context. Modern Language Journal, 86, 485–​507.
Leow, R. (2015). Explicit learning in the classroom. A student-​centered perspective. New York: Routledge.
Leow, R. (2020). L2 writing-​to-​learn: Theory, research, and a curricular Approach. In R.M. Manchón (Ed.),
Writing and language learning: Advancing research agendas (pp. 95–​118). Amsterdam: John Benjamins.
Li, S. (2017). The efficacy of written corrective feedback on second language development: the impact of feed-
back type, revision type, learning motivation and strategies. Doctoral Dissertation submitted to the School
of Language and Culture. Auckland University of Technology.
Manchón, R.M. (2018). Past and future research agendas on writing strategies: Conceptualizations, inquiry
methods, and research findings. Studies in Second Language Learning and Teaching, 8, 247–​267.
Manchón, R. M. & Leow, R. (2020). An ISLA perspective on L2 learning through writing: Implications for
future research agendas. In R.M. Manchón (Ed.), Writing and language learning: Advancing research
agendas (pp. 335–​356). Amsterdam: John Benjamins.
Manchón, R.M., Nicolás-​Conesa, F., Cerezo, L., & Criado, R. (2020). L2 writers’ processing of written cor-
rective feedback: Depth of processing via written languaging. In W. Suzuki & N. Storch (Eds.), Languaging
in language learning and teaching (pp. 241–​265). Amsterdam: John Benjamins.
Manchón, R.M. & Roca de Larios, J. (2007). Writing-​to-​learn in instructed language learning contexts. In E.
Alcón-Soler & M.P. Safont-Jordá (Eds.), Intercultural language use and language learning (pp. 101–121).
Dordrecht: Springer.
Manchón, R.M. & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P.K. Matsuda (Eds.),
Handbook of second and foreign language writing (pp. 567–​586). Berlin: de Gruyter Mouton.
Park, E.S. & Kim, O.Y. (2019). Learners’ use of indirect written corrective feedback: Depth of processing and
self-​correction. In R.P. Leow (Ed.), The Routledge handbook of second language research in classroom
learning (pp. 214–​228). New York: Routledge.
Polio, C. (2012). The relevance of second language acquisition theory to the written error correction debate.
Journal of Second Language Writing, 21, 375–​389.
Qi, D. & Lapkin, S. (2001). Exploring the role of noticing in a three-​stage second language writing task.
Journal of Second Language Writing, 10, 277–​303.
Roca de Larios, J. (2013). Second language writing as a psycholinguistic locus for L2 production and learning.
Journal of Second Language Writing, 4, 444–​445.
Sachs, R. & Polio, C. (2007). Learners’ uses of two types of written feedback on an L2 writing revision task.
Studies in Second Language Acquisition, 29, 67–​100.
Schmidt, R. (1990). The role of consciousness in second language learning.
Applied Linguistics, 11, 129–​158.
Sheen, Y. (2010). Differential effects of oral and written corrective feedback in the ESL classroom. Studies in
Second Language Acquisition, 32, 203–​234.
Shintani, N. & Aubrey, S. (2016). The effectiveness of synchronous and asynchronous written corrective feed-
back on grammatical accuracy in a computer-​mediated environment. Modern Language Journal, 100,
296–​319.
Shintani, N. & Ellis, R. (2013). The comparative effects of direct corrective feedback and metalinguistic
explanation on learners’ explicit and implicit knowledge of the English indefinite article. Journal of Second
Language Writing, 22, 286–​306.
Simard, D., Guénette, D., & Bergeron, A. (2015). L2 learners’ interpretation and understanding of written cor-
rective feedback: Insights from their metalinguistic reflections. Language Awareness, 24, 233–​254.
Storch, N. & Wigglesworth, G. (2010). Learners’ processing, uptake, and retention of corrective feedback on
writing: Case studies. Studies in Second Language Acquisition, 32, 303–​334.
Suh, B.R. (2010). Written feedback in second language acquisition: exploring the roles of type of feedback, lin-
guistic targets, awareness and concurrent verbalization (Unpublished doctoral dissertation). Georgetown
University.
Suzuki, W. (2012). Written languaging, direct correction and second language writing revision. Language
Learning, 62, 1110–​1133.
Suzuki, W. (2017). The effect of quality of written languaging on second language learning. Writing &
Pedagogy, 8, 461–​482.
Svalberg, A.M.L. (2009). Engagement with language: Interrogating a construct. Language Awareness, 18,
242–​258.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer (Eds.),
Applied linguistics. Studies in honour of H. G. Widdowson (pp. 125–144). Oxford: Oxford University Press.
Swain, M. & Lapkin, S. (2002). Talking it through: Two French immersion learners’ response to reformulation.
International Journal of Educational Research, 37, 285–304.
Tomlin, R.S. & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in
Second Language Acquisition, 16, 183–​203.
Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second
Language Writing, 16, 255–​272.
Uggen, M. (2012). Reinvestigating the noticing function of output. Language Learning, 62, 506–​540.
Uscinski, I. (2017). L2 learners’ engagement with direct written corrective feedback in first year composition
courses. Journal of Response to Writing, 3, 36–​62.
Verspoor, M., Lowie, W., & van Dijk, M. (2008). Variability in L2 development from a dynamic systems per-
spective. The Modern Language Journal, 92, 214–​231.
Waller, L. & Papi, M. (2017). Motivation and feedback: How implicit theories of intelligence predict L2
writers’ motivation and feedback orientation. Journal of Second Language Writing, 35, 54–​65.
Wigglesworth, G. & Storch, N. (2012a). Feedback and writing development through collaboration: A socio-​
cultural approach. In R. Manchón (Ed.), L2 Writing Development: Multiple Perspectives (pp. 69–​97).
Berlin: Mouton de Gruyter.
Wigglesworth, G. & Storch, N. (2012b). What role for collaboration in writing and writing feedback. Journal
of Second Language Writing, 21, 364–​374.
Yang, L. & Zhang, L. (2010). Exploring the role of reformulations and a model text in EFL students’ writing
performance. Language Teaching Research, 14, 464–​484.
Zheng, Y. & Yu, S. (2018). Students’ engagement with teacher written corrective feedback in EFL writing: A
case study of Chinese lower-​proficiency students. Assessing Writing, 37, 13–​24.

SECTION 3

Language Transfer and Writing


8
TRANSFER, WRITING, AND SLA
L2 Writing as a Multilingual Event

Rob Schoonen and Sanne van Vuuren


Centre for Language Studies, Radboud University, Nijmegen

Introduction
Multilingualism is far more common than we generally think it is, not only because of multilingual
first language (L1) acquisition (Unsworth, 2013) but also as a result of language education. As lan-
guage learners are slowly gaining control over the second or additional language (henceforth: L2),1
they still have the L1 at their disposal. Whenever the language learner misses the right expression
or the appropriate language to convey an intended message in the L2, he or she may deliberately
or unconsciously resort to the L1 or any other languages known. This influence of the L1 on L2
use is part of what is called crosslinguistic influence (CLI) or transfer. Although the two terms
have different histories and may have different connotations, we will use them interchangeably (cf.
Odlin, 2003). In his seminal work, Odlin (1989) describes transfer as “… the influence resulting
from the similarities and differences between the target language and any other language that has
been previously […] acquired” (p. 27). This definition does not specify the number of languages
involved and the status or the acquisition order of the languages. So, transfer can occur from L1
to L2, or L3 or Ln, but also from L2 to L3, or in reverse direction from L2 or L3 to L1 (Forbes &
Fisher, 2020).
Especially in educational settings where students learn a foreign language with relatively little
exposure outside the classroom, transfer is likely to be mostly one-​directional and forward, directed
from the stronger L1 to the L2. However, there are other multilingual settings where bidirectional or
multidirectional transfer or CLI is to be expected (cf. Cenoz & Gorter, 2011). Indeed, many multi-
lingual language users may not have learned to read and write in their first language, or may use
their second (or other) languages more frequently for writing, so that crosslinguistic
influence may be exerted by the L2 on later L1 literacy.
In general terms, it is efficient to learn a skill in one context and to be able to use it in another too.
However, the literature on second language acquisition (SLA) recognizes that deeply internalized
L1 skills sometimes hinder the acquisition and use of L2 skills. L1 influence that results in errors
or inadequate L2 use, known as negative transfer or interference, has been extensively studied
by means of error analyses in the SLA literature (Odlin, 2003; Ortega, 2009). More recently,
researchers have become more interested in the opportunities (positive) transfer of skills and know-
ledge may afford.


In this chapter our main focus will be on an L1-​L2 setting, where language learners learn to write
in an additional target language and may benefit from or may be hindered by previously acquired
linguistic knowledge and experience. We will review the relationship between L1 and L2 writing
from the perspective of the writer and what is known about the cognitive processes and abilities
involved in both L1 and L2 writing, as well as from the perspective of the written product and its
linguistic and structural features. Specifically, we will discuss results of studies that deal with the
role of L1 and L2 language knowledge and cognitive processing in writing and how the L1 and
L2 may interact with each other. The results of this interaction show in texts written by multilin-
gual writers, or at least L2 learners, and investigation of these texts constitutes another strand of
research. Studies into large corpora of texts may reveal subtle differences in language patterns of
different groups of L1 and L2 writers, and thus clarify what types of features are subject to CLI
in groups of L2 writers with different L1 backgrounds, at varying stages of L2 acquisition. Both
language processing and corpus studies, each with their advantages and limitations, can provide
insights relevant not only to learning to write, but also to writing to learn another language.

Historical Perspectives
Historically, the study of SLA and second language writing have different roots. SLA has focused
primarily on oral language use. Theories on SLA, ranging from a formal universal grammar per-
spective from the 1960s onwards to a more communicative or cognitive perspective in later years,
tried to describe how the L2 learner acquires the essential features of the L2, in addition to the
previously acquired L1, often from a language theoretical perspective, addressing questions about
the role of input, about the accessibility of linguistic knowledge and about the competition between
L1 and L2, which suggests CLI (see Hulstijn, 2015; Ortega, 2009). Second language writing
research, on the other hand, largely stems from language teaching research, which in the 1980s was
influenced by (cognitive) studies of writing processes (see Chapter 6, this volume). These latter
studies were initially L1 studies, but they paved the way for (comparative) L2 studies of the cog-
nitive processes involved in L2 writing. The different histories of these subfields prevented much
exchange and interaction. Written language use has rarely, if ever, been considered in testing SLA
theories (Polio, 2012). Ortega (2009) nicely outlines the development of SLA theories, but nowhere
does L2 writing play a significant role in theory development, hypothesis testing or SLA facilita-
tion, other than as a tool for data collection, which was often distrusted for being less spontaneous,
as the writer has ample time to monitor and revise the written output (Ortega, 2012). It was only
recently that the two subfields started to collaborate on common topics and interests that obviously
do exist (see Chapter 1, this volume), CLI being one of them.
Despite the different histories, there are also parallels between both fields. Both presume
the comparison between the L1 and the target language to be predictive of success in learning.
Especially in early SLA studies, contrastive analyses of the features of the L2 and those of the L1
were expected to predict the difficulties a second language learner was to experience. Since the
1990s, the creation of large computer-​based corpora has provided an empirical basis to test the
effect of (lack of) congruence between languages on the interlanguage of learners with a wide var-
iety of L1 backgrounds. However, the problems L2 learners experience cannot solely be explained
by the linguistic differences between their L1 and L2; general linguistic or cognitive factors play
a major role as well. For example, the (beginning) L2 learner (and writer) will opt for simpler
solutions in L2 language production, using mostly simpler, high-​frequency words and preferring
simple verbs over phrasal verbs (Ortega, 2009). One of the critical questions is there-
fore to what extent the features typical of L2 language use are dependent on the languages involved
and their linguistic distance and to what extent general linguistic and cognitive factors play a deter-
mining role.


It was not until recently that L2 writing was considered a vehicle for L2 learning. Studies are
no longer solely concerned with “learning-​to-​write,” but also with “writing-​to-​learn,” with the L2
as one of the possible goals of learning (Hirvela, Hyland, & Manchón, 2016). The analysis of CLI
in the individual writer’s behavior and performance as well as in distributional patterns in large
corpora of aggregated learner writing enhances our understanding of the role of L1 in L2 writing
processes and SLA in general.

Critical Issues and Topics


In the acquisition of writing skills, as in most other learning contexts, transfer is an important phe-
nomenon. Whereas L1 writing studies are primarily concerned with transfer of skills learned in one
writing task to performance on the next, in L2 writing research the major concern is whether skills
and knowledge transfer across languages. The extent to which what is learned in one context can
be applied in a new context and, equally important, to what extent the learner is aware of this trans-
ferability (Kellerman, 1977) and able to apply previously learned skills and knowledge, are critical
issues in L2 writing research. It is the awareness that the L1 (or any other previously learned lan-
guage) might be useful for the target language that will encourage transfer (Ringbom, 2007), but it
may not always be self-​evident to the learner that the same knowledge and skills can be employed
in both languages. Language learners’ intuitions about opportunities for transfer very much depend
on their L2 language proficiency. However, L2 proficiency is certainly not the only factor to play
a role (see Jarvis & Pavlenko, 2008 for an extensive discussion of CLI). The typological distance
between the L1 and the L2 may determine the likelihood of transfer to a large extent, but the L2
writer may not recognize objective similarities (or the lack thereof) and will be guided by perceived
or subjective similarities, which do not necessarily coincide with objective similarities. Second lan-
guage writers may overlook existing similarities and may “see” similarities that do not really exist
(Jarvis & Pavlenko, 2008; Ringbom & Jarvis, 2009). For related languages, linguistic similarities
may exist at the orthographic, lexical, morphosyntactic, and discourse level, while for unrelated
languages orthographic and lexical similarities may be very unlikely, but the languages may still
employ similar morphosyntactic and discourse principles, such as a certain basic word order (e.g.,
SVO or SOV) and use of affixes or adpositions. The effect of language proficiency and language
distance on CLI is further mediated by the educational experience or language learning history of
the L2 writer, and by contextual factors. Jarvis and Pavlenko (2008) group the different factors
involved in CLI into five major categories: 1) linguistic and psycholinguistic factors, 2) cognitive,
attentional, and developmental ones, and factors related to 3) cumulative language experience and
knowledge, 4) the learning environment and 5) language use, which may or may not be formal and
thus have (or not have) strong and explicit conventions that can be violated. Like the early SLA
theories, most of these effects on CLI have been studied in oral communication (Jarvis & Pavlenko,
2008) and are still unexplored territory when it comes to CLI in L2 writing.
One important strand of research focuses on the application of L1 writing skills, writing strat-
egies, and processes to L2 writing. L2 writers, especially at the early stages of development, will
rely on the skills, strategies, and cognitive processes that they tend to employ in their L1 writing
and the L1 may even be used to prime and facilitate the L2 writing process, as demonstrated by
think-​aloud studies (Murphy & Roca de Larios, 2010; Manchón, Roca de Larios, & Murphy,
2007). Another strand of research, related to this cognitive approach, is concerned with the study
of individual differences in writing, investigating the relative contribution of L2 linguistic know-
ledge and skills and general (L1) writing skills to the individual differences observed in L2 writing
(Cumming, 1989; Schoonen et al., 2003). The core questions are to what extent L2 writing is a lan-
guage problem or a writing problem, and which conditions affect transfer from L1 to L2. If we want
to see L2 writing as a writing problem, then the writer can employ his or her L1 writing experience

99
Rob Schoonen and Sanne van Vuuren

and strategies. These writing strategies may not show in L2 written production, but they will show
in the way the L2 writer composes the text. Both of these strands thus take the L2 writer as a point
of departure and try to map the various cognitive components and processes involved in L2 writing.
In the final written product, transfer may affect the degree to which the linguistic and textual
features of the L2 text are aligned with readers’ expectations in the target context. It may result in
violations of L2 linguistic conventions, such as when an L1 Dutch writer uses Dutch spelling in
English L2 writing (succes instead of success) or selects a “false friend” when it comes to word
choice (take the trap instead of stairs, Dutch trap meaning “stairs”). These examples could be
extended with morphosyntactic and textual ones concerning use of tense, word order, and such. In
many of these cases, transfer will show up as easily identifiable instances of interference. However,
transfer may not always lead to violations of explicit conventions of the target language, but rather
to avoidance or underrepresentation of certain structures or lexical expressions, paralleled by the
overrepresentation of others. Therefore, these less obvious forms of CLI may not be immediately
apparent in the target language, but only emerge from comparative analyses that look into the fre-
quency distribution of words and structures in L1 and L2 texts (cf. Hinkel, 2003). Although these
kinds of CLI phenomena may be difficult to detect, they may be a critical source for advanced L2
writing instruction and advanced SLA. While CLI may also have a facilitative effect on (the fluency
of) L2 written production, this positive transfer from the L1 and successful language acquisition
tend to be difficult to tell apart, as Ringbom and Jarvis (2009) point out. Partly for this reason, and
partly because of the pedagogically-​oriented nature of many studies of L2 writing, there has been
considerably more research attention to negative than to positive transfer of L1 linguistic features
in L2 written production. Where there is negative transfer, there may also be room for improvement
and opportunity for instruction.

Current Contributions

Research on Transfer of Writing Processes, Knowledge, and Skills


In their L2 writing process, language learners who have already developed literacy skills in their
L1 will most likely depend on their experience in L1 writing. They may use the L1 for content gen-
eration (and subsequently translate the L1 text into the L2), to review the text written so far, or for
metacognitive deliberations. Van Weijen, Van den Bergh, Rijlaarsdam, and Sanders (2009) demon-
strate the complex interaction of L1 use and L2 writing, task, and proficiency. In a think-​aloud study
of 20 Dutch first-​year English majors, writing four tasks in each language, they found that L1 use
was associated with the task the writers had to perform and with their level of writing and L2 profi-
ciency. L1 use also varied with different writing activities or cognitive processes. Van Weijen et al.
(2009) distinguished between 11 categories, such as self-​instructions, goal-​setting, structuring, gen-
erating ideas, formulating, meta-​comments, and revising. Results showed that although there was
substantial individual variation, L1 use was most prominent in self-​instruction and meta-​comments.
The researchers interpreted this as an indication that L2 writers find it particularly difficult to orches-
trate the L2 writing process and fall back on their L1. However, use of the L1 did not have a clear
relation with L2 text quality; only the use of the L1 in meta-​comments was negatively related to L2
text quality. Furthermore, more expert writers (as indicated by their L1 writing ability) used their
L1 less frequently, confirming earlier results by Wang and Wen (2002) and Woodall (2002), which
suggests that L2 text quality, general writing ability, and L1 use in L2 writing are interrelated. Van
Weijen et al. (2009) showed that L2 language proficiency, as measured by a vocabulary test, was
associated with L2 text quality, but did not seem to affect the use of the L1 for certain categories of
writing activity. The use of the L2 in major cognitive activities correlated positively with L2 text
quality; its use during other (metacognitive) activities, however, was negatively related to L2 text
quality. Murphy and Roca de Larios (2010), who studied seven Spanish learners of English, found
that the learners used their L1 to help them carry out lexical searches. The L1 was used specifically in
generating lexical units, evaluations, and self-​questioning. In sum, resorting to the L1 during L2
writing may prevent cognitive overload and facilitate non-​linguistic processes.
So far, we have discussed the extent of L1 use in the L2 writing process. Yet, L2 writers will also
use their general experience and strategic knowledge acquired in L1 writing while writing in their
L2. The kind of knowledge that seems most transferable is what may be called metacognitive know-
ledge. Metacognitive knowledge is an umbrella term that comprises strategic knowledge related
to, for example, time management during writing, general knowledge of communicative contexts,
knowledge of preferred writing processes and their ordering, and of desirable text characteristics
and effective self-​questioning strategies (cf. Lee & Mak, 2018). These subcomponents vary in the
degree to which they are language-​dependent and thus more or less transferable. Schoonen et al.
(2011) show that metacognitive knowledge, operationalized as knowledge of reading and writing
strategies and text characteristics, was not only a strong predictor of Dutch secondary-​school
students’ L1 writing proficiency (r=.67, using Structural Equation Modeling), but was also strongly
associated with these students’ EFL writing proficiency (r=.71). These findings suggest that even in
relatively young EFL writers, metacognitive knowledge may transfer from L1 to L2, or at least be
a common source for writing in the L1 and additional languages. Moreover, analyses have shown
that the correlation between L1 and L2 writing proficiency can largely be attributed to the metacognitive
knowledge that plays a major role in both L1 and L2 writing. When L1 writing proficiency was
added to the regression model of EFL writing, it turned out to be a major predictor (β=.62), mainly
at the expense of metacognitive knowledge. In other words, the commonality of L1 and L2 writing
can be described in terms of metacognitive knowledge (Schoonen et al., 2011).
For L2 writers to be able to write about the content that they consider to be appropriate to
the task without feeling restricted by limited L2 language resources, it is not just metacognitive
knowledge but also solid linguistic L2 knowledge that they need. This aligns with the findings of
Schoonen et al. (2011) that, next to general metacognitive knowledge, lexical, grammatical, and
orthographical knowledge show substantial correlations with L2 writing proficiency (r=.53, .75,
and .78 respectively). This linguistic knowledge will also affect the writing process itself. Fluency
in writing and the allocation of time and attention vary with the proficiency of the L2 writer; in par-
ticular, the formulation of content and the revision of language absorb time and cognitive resources
(Roca de Larios, Manchón & Murphy, 2006; Stevenson, Schoonen, & De Glopper, 2006).
For practical reasons, we have focused mainly on the CLI of an L1 on an L2, but often writers
have a wider linguistic repertoire and can thus resort to more languages than just a single L1 when
writing in another language (Ln). For instance, Sagasta Errasti (2003) has demonstrated that a
higher degree of bilingualism in Basque and Spanish, i.e., active use of the languages, is associated
with better performance on L3 English writing tasks, in terms of overall production, fluency, gram-
matical and lexical complexity as well as accuracy. The scores for writing in Basque, Spanish, and
English were shown to be highly correlated, suggesting that some level of transfer of writing skills
had taken place. Similarly, in a study of L1 Italian, L2 German, and L3 English writing conducted in
South Tyrol, De Angelis and Jessner (2012) observe that writing performance in each of these three
languages is highly interdependent, with higher proficiency levels in one language being associated
with longer texts and T-​units in the other (see also Cenoz & Gorter, 2011; Forbes & Fisher, 2020,
and Chapter 9 this volume, for additional multilingual examples).

Research on Crosslinguistic Influence as Evidenced in L2 Written Texts


The ultimate goal of writing is the production of an effective text. In order to make sure their texts
are appropriate for the L2 rhetorical setting, L2 writers have to comply with target language norms
at various linguistic levels: lexico-​grammatical, morpho-​syntactical, and discourse-​structural. In
this section we will focus on manifestations of CLI in learner writing at each of these levels.


The ease of automatically retrieving lexical items from large corpora has facilitated a wealth of studies that have investigated the extent to
which learners transfer word form and/​or meaning as well as collocational, morphological, and
syntactic constraints from their L1 or other previously acquired languages into the target language.
Many studies of lexical transfer in L2 writing have investigated learners’ use of formulaicity. While
learners generally appear to use less formulaic language than native speakers, they have been shown
to rely more on a limited set of highly frequent formulaic expressions, especially those that are con-
gruent with L1 forms (Durrant & Schmitt, 2009; Granger, 1998). Nesselhauf (2003) investigated the
use of collocations in the German subcorpus of the International Corpus of Learner English (ICLE,
Granger, 1993). Of 1,072 verb-​noun combinations extracted from the corpus, almost a quarter was
judged to be wrong or questionable. On the basis of the existence of equivalent collocations in the
learners’ L1, German, 45% of these errors were considered likely to have been caused by L1 influ-
ence. The total percentage of errors in congruent collocations, i.e., those with a German equivalent,
was found to be only 11%, compared with 42% for non-​congruent collocations.
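By way of illustration, the following minimal Python sketch shows how verb-object combinations of the kind analyzed in these studies might be retrieved automatically from a dependency-parsed text. It is a hypothetical sketch only, based on the open-source spaCy library and its small English model; the function name and the sample sentence are invented for illustration, and this is not the retrieval procedure used in the studies cited.

import spacy
from collections import Counter

# Load a small English pipeline (assumed to be installed); its parser supplies
# the dependency labels used below.
nlp = spacy.load("en_core_web_sm")

def verb_object_pairs(text):
    # Yield (verb lemma, noun lemma) pairs for direct objects of verbs.
    doc = nlp(text)
    for token in doc:
        if token.dep_ == "dobj" and token.pos_ == "NOUN" and token.head.pos_ == "VERB":
            yield (token.head.lemma_, token.lemma_)

# Invented example sentence; "take a decision" illustrates a combination that a
# researcher might flag for acceptability and check for congruence with the L1.
sample = "The students made an effort and took a decision very quickly."
for (verb, noun), freq in Counter(verb_object_pairs(sample)).most_common():
    print(verb, "+", noun, ":", freq)

Frequency lists obtained in this way would, of course, still require the kind of manual judgment of acceptability and of L1 congruence that Nesselhauf describes.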
In another ICLE-​based study of lexical transfer effects, Paquot (2013) argues that transferability
depends not only on congruence between L1 and L2 word combinations, but that it is L1 fre-
quency that prompts learners to transfer word combinations into their L2 writing. This becomes
apparent when the lexical bundle use of EFL learners with different L1s is compared. Paquot’s
analysis suggests that 58.8% of the recurrent lexical bundles –​ not necessarily errors –​ identified
in the French component of ICLE can be attributed to CLI, because significant differences were
observed between use of these bundles in the French learner texts and texts produced by at least
five of the nine other L1s included in the analysis. While these lexical bundles are congruent with
L1 French forms, some also have translational equivalents in related L1s, such as Spanish, raising
the question of why other learner populations do not transfer the same forms to the same extent.
Closer inspection of the distribution of these forms in L1 Spanish and French reference corpora,
however, reveals considerable distributional differences between both languages, confirming an L1
frequency-​based explanation of the transfer effect observed for the French learners.
In comparison with studies of lexical transfer, studies that investigate to what extent authentic
written learner data are influenced by L1 syntax are relatively rare, likely due to problems
associated with automatic retrieval of such structures. Those that have investigated various
aspects of syntax, have had to either restrict themselves to fairly limited word counts (hence
affecting the generalizability of such studies) or have investigated closed classes, which can be
retrieved by means of a lexical search (Granger, 2015). The increasing availability, however,
of syntactically annotated corpora and sophisticated corpus analysis software have facilitated
studies of other areas of L2 syntax. Lu and Ai (2015), for instance, investigated how learners’
L1 might play a role in 14 syntactic complexity measures which are widely used as develop-
mental measures of proficiency. In an initial comparison of 200 essays derived from the Louvain
Corpus of Native English Speaking Students (LOCNESS) and 1,400 essays derived from the
International Corpus of Learner English (ICLE), there were few significant differences. Once
L1 was factored in, however, substantial differences emerged between LOCNESS and each of
the seven L1 groups represented in their sample from ICLE, which could not be accounted for
by proficiency alone. The limited use of sentential coordination by the Chinese learners, for
example, and the longer production units and high use of sentential and phrasal coordination
by the German learners might point to an L1 transfer effect, although L1-​interlanguage congru-
ence was not investigated. Lu and Ai conclude that syntactic complexity measures previously
reported to be indicative of L2 proficiency might depend on L1 background as well, and suggest
that in order to systematically compare the developmental trajectories of syntactic complexity in
L2 writing, longitudinal studies comprising learners with different L1 backgrounds are needed.
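To give a concrete sense of what such measures involve, the sketch below computes two coarse indices, mean sentence length and an estimate of clauses per sentence, again with spaCy. It is a hypothetical illustration only: the measures used by Lu and Ai are computed by dedicated tools over full parses and T-unit segmentation, whereas the clause count here is a crude heuristic that simply treats every finite verb as marking a clause, and the sample text is invented.

import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline (assumed installed)

def complexity_indices(text):
    # Return two rough syntactic complexity indices for a text.
    doc = nlp(text)
    n_sentences = max(len(list(doc.sents)), 1)
    n_words = sum(1 for token in doc if token.is_alpha)
    # Heuristic clause count: verbs and auxiliaries marked as finite (VerbForm=Fin).
    n_clauses = sum(
        1 for token in doc
        if token.pos_ in ("VERB", "AUX") and "Fin" in token.morph.get("VerbForm")
    )
    return {
        "mean_length_of_sentence": n_words / n_sentences,
        "clauses_per_sentence": n_clauses / n_sentences,
    }

essay = ("Learners who write longer sentences are not always more proficient. "
         "Which measure is chosen clearly matters.")
print(complexity_indices(essay))

Indices of this kind only become interpretable when they are compared across groups, for instance across L1 backgrounds as Lu and Ai do, or across points in time.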
An area in which transfer has been shown to persist even at higher levels of acquisition is the
interface between syntax and information structure (see Lozano & Callies, 2018, for an overview).


Callies (2009) investigated advanced German EFL learners’ use of discourse-​pragmatically motivated
variations of basic word order in English, such as inversion, preposing, and cleft constructions, in
written corpus data derived from ICLE, combined with experimental data including elicited pro-
duction and retrospective interviews. At these advanced stages of L2 acquisition, L1 discourse
structure is shown to affect L2 written production, not in the form of word order violations but in
the overrepresentation of subject-​prominent structures paralleled by the avoidance of structures
without a canonical sentence-​initial subject, such as inversion. This avoidance is attributed to the
unexpected resemblance of these structures to verb-​second word order in German, which serves to
block positive transfer. In a longitudinal corpus study, Van Vuuren and Laskin (2017) also found that
pragmatic differences between Dutch and English that are co-​dependent on syntactic differences
between both languages result in transfer in advanced Dutch EFL learners’ written production,
most notably in the frequent use of clause-​initial adverbials, such as “in this book” or “with this
machine,” which in Dutch are typically used in the pre-verbal slot to “anchor” the sentence in which they occur in the directly preceding discourse. The use of these local anchors in Dutch learners’ L2
English is shown to decline with increasing proficiency.
A large number of studies have detailed in which ways L2 writing is affected by L1 cultural
preferences for rhetorical features, such as inductive patterns of text organization (cf. Kubota,
1998). Where early contrastive rhetoric studies have been criticized for being prescriptive and
assimilationist, and for their “tendency to define the expectations of ‘native speaker or reader’ as
the rhetorical ‘norm’ ” (Kubota & Lehner, 2004, p. 10), more recent work has emphasized writers’
agency in their choice to adopt rhetorical features (Hirose, 2006; Kubota & Lehner, 2004; Matsuda,
1997). It has been argued that L1 and L2 writing instruction and practice allow writers to gradually
develop a repertoire of rhetorical knowledge which is non-​language specific (Rinnert, Kobayashi, &
Katayama, 2015). Indeed, rhetorical transfer has been shown to be bidirectional in nature (Rinnert &
Kobayashi, 2009), e.g., with L2 (EFL) writing instruction resulting in a higher frequency of features
such as counterarguments and refutation in L1 Japanese texts (Rinnert & Kobayashi, 2009). This
implies that both L1 and L2 writing instruction serve to extend the “organizational repertoire” that
learners can draw on (Hirose, 2006, p. 146), regardless of the language in which they write.

Main Research Methods


As discussed above, transfer or CLI may manifest itself in at least two different ways: in the cognitive processes of writing and in the actual written texts. The latter is clearly more readily observable
than the former. To study the role of CLI in the cognitive processes involved in L2 writing, various
research methods are available, each with their advantages and disadvantages (Olive, 2010). The
main sources of data, especially in the first decades of research, consisted of protocols of verbalized
cognitive processes during writing, so-​called think-​aloud protocols (see Chapters 6 and 23, this
volume), which allow the researcher to identify L1 influences in the L2 writing process. Verbal
protocols have yielded rich data, but fast or automatic and simultaneous processes may go unnoticed
by the writer. Less obtrusive is the use of keystroke logging, which traces the actual writing by
logging the keystrokes on the keyboard and use of the computer mouse with a time code (Leijten &
Van Waes, 2013). Keystroke logging has recently been combined with eye-​tracking facilities, thus
adding information about the reading behavior of the writer (see Chapters 6 and 23, this volume).
These research methods may also provide (comparative) data on L1 and L2 writing fluency, but the
actual CLI from L1 to L2 or vice versa is much harder to establish. Writers’ considerations and their reasons for behaving the way they do remain invisible. This is why some researchers conduct retrospective stimulated recalls with their participants (Leijten & Van Waes,
2013). Interview data may reveal CLI in, for example, pausing behavior or revisions made during
writing. Olive (2010) mentions yet another approach to the online study of cognitive processes in writing, which is the dual task paradigm. Within this paradigm, the cognitive load or processing
demands are investigated by, for example, having writers perform a memory task and at the same
time write sentences. The effect of the primary on the secondary task (or the other way around)
points to the processing demands involved. Again, a comparison of the processing demands in L1
and L2 writing may provide interesting insights in its own right, but to establish CLI in processing
demands may require very specific (dual) tasks. The cognitive approaches to the study of the role of
the L1 in L2 writing could be extended with neuro-​imaging techniques to study brain structure and
functions in writing, but studies using these techniques often concentrate on atypical writers and
have not yet focused on CLI (for an overview, see James, Jao, & Berninger, 2016).
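To make the keystroke-logging approach more concrete, the sketch below derives inter-key intervals, pauses, and a simple fluency figure from a toy log of timestamped keystrokes. The data and format are invented for illustration and do not reproduce the output of Inputlog or any other logging tool; the two-second pause threshold is one common convention in writing-process research rather than a fixed standard.

```python
# Simplified keystroke log: (timestamp in milliseconds, key pressed).
# The events below are invented for illustration.
log = [
    (0, "T"), (180, "h"), (320, "e"), (450, " "),
    (3100, "e"), (3260, "s"), (3400, "s"), (3550, "a"), (3700, "y"),
    (9100, " "), (9300, "i"), (9450, "s"),
]

PAUSE_THRESHOLD_MS = 2000  # a commonly used (but adjustable) cut-off

# Inter-key intervals: (onset of the pause, length of the gap in ms)
intervals = [
    (prev_t, cur_t - prev_t)
    for (prev_t, _), (cur_t, _) in zip(log, log[1:])
]
pauses = [(t, gap) for t, gap in intervals if gap >= PAUSE_THRESHOLD_MS]

total_minutes = (log[-1][0] - log[0][0]) / 60000
print(f"keystrokes per minute: {len(log) / total_minutes:.1f}")
print(f"pauses over {PAUSE_THRESHOLD_MS} ms: {pauses}")
```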
The investigation of manifestations of L1 influence in L2 written products, including under-​
and overrepresentation of words, phrases, and structures, requires large representative samples
of texts. In recent decades, the development of computer-​based learner corpora has facilitated
large-​scale quantitative comparisons of authentic data. The International Corpus of Learner
English (Granger, 1993) is a prime example. Comprising 16 L1 backgrounds, it remains one
of the most frequently used corpora of non-​native English writing, as evidenced by some of
the studies mentioned above. More recent developments in corpus compilation include the
collection of developmental learner data, e.g., in the Longitudinal Database of Learner English
(LONGDALE, Meunier, 2015), allowing for the investigation of transfer effects at different
stages of L2 acquisition within the same group of learners.
Many corpus-​based studies of crosslinguistic influence have followed Granger’s (1996)
Integrated Contrastive Model (ICM), in which Contrastive Interlanguage Analysis (CIA) is used
to identify differences between texts produced by native and non-​native speakers or learners
with different L1 backgrounds, while Contrastive Analysis between the learners’ L1 and the
target language is used either to predict or explain these patterns of language use. This com-
parative approach, in combination with the creation of parallel learner corpora, has allowed for the identification of a wide variety of features that are characteristic of learner language even if they do not violate target language norms; such features are difficult to uncover if the learner variety in question is considered in isolation. Those features that appear to be L1-specific have in many
cases been attributed to L1 transfer, although other factors, e.g., differences in teaching methods
and proficiency levels, might clearly also play a role.
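In practice, a comparison-based analysis of this kind typically reduces to comparing normalized frequencies of a candidate feature across corpora and testing whether the observed over- or underuse is statistically reliable. The sketch below illustrates this with the log-likelihood statistic widely used in corpus linguistics; the corpus sizes and the counts for the connector "indeed" are invented placeholders, not data from ICLE or LOCNESS.

```python
import math

def log_likelihood(freq_learner, size_learner, freq_ref, size_ref):
    """Log-likelihood statistic for over-/underuse of one feature.

    freq_*: raw counts of the feature; size_*: total tokens in each corpus.
    Values above 3.84 are conventionally treated as significant at p < .05.
    """
    total_freq = freq_learner + freq_ref
    total_size = size_learner + size_ref
    expected_learner = size_learner * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    if freq_learner > 0:
        ll += freq_learner * math.log(freq_learner / expected_learner)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

# Hypothetical counts: "indeed" in a learner corpus vs. a reference corpus
ll = log_likelihood(freq_learner=120, size_learner=200_000,
                    freq_ref=45, size_ref=300_000)
per_10k_learner = 120 / 200_000 * 10_000
per_10k_ref = 45 / 300_000 * 10_000
print(f"learner: {per_10k_learner:.1f} per 10k tokens, "
      f"reference: {per_10k_ref:.1f} per 10k tokens, LL = {ll:.1f}")
```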
Some corpus studies (e.g., Paquot, 2013, above) also investigate the extent to which learners’
use of a feature in their L2 is paralleled by their use of a similar feature in their L1 (cf. Jarvis 2000,
who refers to this type of evidence as L1-​interlanguage congruity), while others advocate the use
of translation corpora (e.g., Gilquin, 2008) in addition to comparisons between texts produced by
different L1 groups. In spite of differences in emphasis, corpus-​based studies of transfer effects
in L2 writing all rely on the observation of differences and similarities between languages and
language varieties, which combine to clarify the role of transfer in learner language. A different,
and more recent, corpus-​approach to transfer research has been characterized by Jarvis (2010) as
detection-​based, as opposed to comparison-​based. Machine learning techniques previously used for
authorship attribution studies have been applied to automatic L1-​identification of large databases
of written texts produced by learners with a variety of L1-​backgrounds (e.g., Bestgen, Granger,
& Thewissen, 2012; Koppel, Schler, & Zigdon, 2005; Tsur & Rappoport, 2007; Wong & Dras,
2009). Detection-​based approaches do not zoom in on a small selection of pre-​determined features.
Rather, they allow patterns to emerge from the data, often combining large numbers of features to
increase classification accuracy. Koppel et al. (2005) use analysis of function words, letter n-​grams,
rare POS-​bigrams, and various error types to achieve a surprisingly high accuracy rate of 80.2%
in the automatic classification of L1 Russian, Czech, Bulgarian, French, and Spanish EFL texts
included in ICLE. Bestgen et al. (2012) achieve a classification accuracy rate of between 65% and
75% based on error analysis alone. The discriminant analysis procedure they employed appeared
to be most successful for L1-identification of the texts produced by learners with the lowest level of proficiency and highest error frequency. The authors note, however, that the error types that con-
tribute to automatic L1-​identification do not necessarily have to result from L1 influence, although
subsequent qualitative analysis reveals that at least some errors, such as omission of pronouns
and zero article errors by L1-​Spanish learners and overuse and misuse of “indeed” by L1-​French
learners, are likely to be transfer-​related.
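As a concrete, if drastically simplified, illustration of the detection-based logic, the sketch below trains a classifier to predict a writer's L1 from character n-grams and a handful of function words, then reports cross-validated classification accuracy. Everything in it is hypothetical: the six "essays" are invented toy sentences standing in for a real learner corpus such as ICLE, the function-word list is illustrative rather than principled, and the accuracy obtained on such tiny data says nothing about the 65% to 80% rates reported in the studies above; the example (using scikit-learn) only shows how the pieces of such a pipeline fit together.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, make_pipeline

# Invented stand-ins for learner essays and their authors' L1s.
texts = [
    "Indeed, the problem is very important for the society.",
    "Indeed, it is a question which interests all the persons.",
    "Indeed, the money is not the most important thing of the life.",
    "Is necessary to consider that the people have different opinions.",
    "The teacher says that is important to study every day.",
    "In the actuality, the students must to work very hard.",
]
l1_labels = ["French", "French", "French", "Spanish", "Spanish", "Spanish"]

features = FeatureUnion([
    # character n-grams, roughly analogous to the letter n-grams of Koppel et al. (2005)
    ("char_ngrams", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))),
    # a small, purely illustrative set of function words and connectors
    ("function_words", TfidfVectorizer(vocabulary=[
        "the", "a", "of", "in", "on", "at", "indeed", "moreover", "however"])),
])
classifier = make_pipeline(features, LogisticRegression(max_iter=1000))

# Cross-validated L1-identification accuracy (cv=3 only because the toy data are tiny).
scores = cross_val_score(classifier, texts, l1_labels, cv=3)
print(f"mean L1-identification accuracy: {scores.mean():.2f}")
```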

Recommendations for Practice


Writing abilities are usually acquired in an educational context, and (L1) writing education may focus on transfer, but this is usually transfer to the next writing assignment or from controlled writing assignments to free writing. Such L1 writing education does not explicitly prepare learners for L2,
L3, or Ln writing. Nevertheless, there are commonalities between L1 and L2 writing that could
be exploited.
Generalizable metacognitive knowledge (of communicative contexts, of effective writing strategies and processes, and of desirable text characteristics) is particularly readily transferred from the L1 writing context in which it is originally acquired to subsequent writing in an L2. L1 writing has
been shown to be a major predictor of L2 writing due to the commonality of metacognitive know-
ledge (Schoonen et al., 2011). Although metacognitive knowledge that has been developed in the
context of L2 writing can of course also exert an influence on L1 writing, Gentil (2011) suggests that
“it may […] be preferable for multilingual writers to develop [common underlying proficiencies] in
their L1 or stronger language first” (p. 19). For speakers of minority languages it may be especially
important to gain writing expertise in their L1, which will then form a strong basis for L2 writing.
Writing teachers arguably have a responsibility to help their students learn to write texts that
are recognized as genre-​appropriate by the target discourse community. This does not imply that
teachers should resort to the prescriptivism of native-​speaker (mostly English) rhetorical norms
that earlier contrastive rhetoric studies have been criticized for. Rather, writing instructors’ role
might ideally involve identifying previously acquired rhetorical knowledge, enabling learners to extend their rhetorical repertoire, and facilitating adaptive transfer (Belcher, 2014; DePalma & Ringer, 2011).
Beyond metacognitive and rhetorical knowledge, sufficient linguistic knowledge is paramount
in successful writing. If we accept that a focus on correspondences between the linguistic systems
of the L1 and L2 may support the L2 learning process, this has implications for classroom practices
and for how we value L2 writing teachers who might be slightly disparagingly referred to as
non-​native. Odlin (2006) suggests that teachers with knowledge of learners’ L1 and L2, who can
point out the similarities and differences between both, are “especially qualified” to help learners.
Language learners may also benefit from new tools for automatic text evaluation and feedback that
are attuned to L1-​specific areas of difficulty (Burstein et al., 2004). These kinds of tools may be a
useful addition to spelling checkers, online dictionaries, or web sources. The use of more advanced
tools may certainly help to improve the quality of an L2 text, but to what extent it promotes SLA is
still an open question.
Many language learners have to act in a more complex scenario than we have described. They
may have more than one source language. In the case of L3 or additional language acquisition, the language learner has a choice between resorting to the L1 or to the L2 when writing in the target language becomes problematic. It turns out that language learners have a sense of transferability and use the language that is most useful (Kellerman, 1995). Language and writing instruction could
capitalize on this transferability if multiple languages are involved in the learning process. Cenoz
and Gorter (2011) suggest that one way to do so would be to integrate the curricula of the various
languages involved in a multilingual situation so as to facilitate transfer of what is learned in one
language to other languages (p. 17). It appears that learners would benefit from being encouraged
to draw on all of their strategic, rhetorical, and linguistic resources.


Future Directions
New technological developments provide opportunities not only for practice but also for research. Learner corpus research can benefit from automated parsing and refined corpus analysis software that facilitate the investigation of complex syntactic structures produced by learners from different L1 backgrounds and at different levels of proficiency (cf. Lu & Ai, 2015). Although cross-sectional
studies have established proficiency as an important factor affecting transfer, longitudinal studies
investigating transfer effects at different stages of L2 acquisition within the same group of learners
remain rare. Studies investigating learners’ development over time would constitute a valuable con-
tribution to existing research (cf. Manchón, Roca de Larios, & Murphy, 2007).
Because writing is a less pressured process than speaking, it allows L2 learners to consult their linguistic resources carefully, drawing on linguistic knowledge of previously acquired languages wherever this is perceived to be of benefit. At the same time, the product of writing, i.e., the text,
provides many opportunities for (detailed) feedback. One particularly relevant research question
that might be addressed by future research is how the crosslinguistic influences that writing is
subject to could be exploited to facilitate language learning. Intervention studies could evaluate
the effectiveness of tools that provide L1-​specific feedback, for example, and determine to what
extent they promote SLA or merely serve to improve the quality of an L2 text. In order to estab-
lish true SLA, effects should not only show in language learners’ writing, but also in their oral
language skills.
From an experimental, psycholinguistic perspective, an interesting avenue would be to develop specific sets of tasks that might or might not facilitate transfer, in order to investigate to what extent the response to an L1 writing task primes certain choices in a subsequent L2 writing task and whether or not writers are able to use and, if necessary, adapt previously used or learned knowledge (DePalma & Ringer, 2011). Controlled experimental studies and educational intervention investigations, based on learner corpus data, should go hand in hand to unravel the mechanisms and possibilities of CLI in L2 writing development and SLA.

Note
1 For ease of reference, we will often refer to the simple context of L1 speakers who learn a second language
(L2), acknowledging that for most language users their linguistic context is more complex. See Chapter 10
(this volume) for a full discussion on multicompetence.

References
Belcher, D. (2014). What we need and don’t need intercultural rhetoric for: A retrospective and prospective
look at an evolving research area. Journal of Second Language Writing, 25, 59–​67.
Bestgen, Y., Granger, S., & Thewissen, J. (2012). Error patterns and automatic L1 identification. In S.
Jarvis & S.A. Crossley (Eds.), Approaching language transfer through text classification (pp. 127–​153).
Bristol: Multilingual Matters.
Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing
service. AI Magazine, 25(3), 27–35.
Callies, M. (2009). Information highlighting in advanced learner English. The syntax-​pragmatics interface in
second language acquisition. Amsterdam: John Benjamins.
Cenoz, J., & Gorter, D. (2011). Focus on multilingualism: A study of trilingual writing. The Modern Language
Journal, 95(3), 356–​369.
Cumming, A. (1989). Writing expertise and second-​language proficiency. Language Learning, 39(1), 81–​135.
De Angelis, G., & Jessner, U. (2012). Writing across languages in a bilingual context: A dynamic systems
theory approach. In R.M. Manchón (Ed), L2 writing development: Multiple perspectives (pp. 47–​68).
Berlin: De Gruyter Mouton.
DePalma, M.-J., & Ringer, J.M. (2011). Toward a theory of adaptive transfer: Expanding disciplinary
discussions of “transfer” in second-​language writing and composition studies. Journal of Second Language
Writing, 20(2), 134–​147.


Durrant, P., & Schmitt, N. (2009). To what extent do native and non-​native writers make use of collocations?
International Review of Applied Linguistics in Language Teaching, 47(2), 157–​177.
Forbes, K., & Fisher, L. (2020). Strategy development and cross-​linguistic transfer in foreign and first language
writing. Applied Linguistics Review, 11(2), 311–​339.
Gentil, G. (2011). A biliteracy agenda for genre research. Journal of Second Language Writing, 20(1), 6–​23.
Gilquin, G. (2008). Combining contrastive and interlanguage analysis to apprehend transfer: detection, explan-
ation, evaluation. In G. Gilquin, S. Papp, & M.B. Díez-​Bedmar (Eds.), Linking up contrastive and learner
corpus research (pp. 1–​33). Leiden: Brill-​Rodopi.
Granger, S. (1993). The international corpus of learner English. The European English Messenger, 2(1), 34.
Granger, S. (1996). From CA to CIA and back: An integrated approach to computerized bilingual and learner cor-
pora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast. Papers from a symposium
on text-​based cross-​linguistic studies, Lund, 4–​5 March 1994 (pp. 37–​51). Lund: Lund University Press.
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and formulae. In A.P. Cowie
(Ed.), Phraseology: Theory, analysis, and applications (pp. 145–​160). Oxford: Oxford University Press.
Granger, S. (2015). Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus
Research, 1(1), 7–​24.
Hinkel, E. (2003). Simplicity without elegance: Features of sentences in L1 and L2 academic texts. TESOL
Quarterly, 37(2), 275–​301.
Hirose, K. (2006). Pursuing the complexity of the relationship between L1 and L2 writing. Journal of Second
Language Writing, 15(2), 142–​146.
Hirvela, A., Hyland, K., & Manchón, R.M. (2016). Dimensions in L2 writing theory and research: Learning
to write and writing to learn. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign
language writing (pp. 45–​63). Berlin: De Gruyter Mouton.
Hulstijn, J.H. (2015). Language proficiency in native and non-​native speakers: Theory and research.
Amsterdam: John Benjamins.
James, K.H., Jao, R.J., & Berninger, V. (2016). The development of multi-​leveled writing brain systems; Brain
lessons for writing instruction. In C.A. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of writing
research (2nd ed., pp. 116–​129). New York: Guilford.
Jarvis, S. (2000). Methodological rigor in the study of transfer: Identifying L1 influence in the interlanguage lexicon. Language Learning, 50(2), 245–309.
Jarvis, S. (2010). Comparison-​based and detection-​based approaches to transfer research. Eurosla Yearbook,
10(1), 169–​192.
Jarvis, S., & Pavlenko, A. (2008). Crosslinguistic influence in language and cognition. London: Routledge.
Kellerman, E. (1977). Towards a characterisation of the strategy of transfer in second language learning.
Interlanguage Studies Bulletin, 2(1), 58–​145.
Kellerman, E. (1995). Crosslinguistic influence: Transfer to nowhere? Annual Review of Applied Linguistics,
15, 125–​150.
Koppel, M., Schler, J., & Zigdon, K. (2005). Automatically determining an anonymous author’s native lan-
guage. In P. Kantor et al. (Eds.), Intelligence and security informatics (pp. 209–​217). Berlin: Springer.
Kubota, R. (1998). An investigation of Japanese and English L1 essay organization: Differences and similar-
ities. Canadian Modern Language Review, 54(4), 475–​508.
Kubota, R., & Lehner, A. (2004). Toward critical contrastive rhetoric. Journal of Second Language Writing,
13(1), 7–​27.
Lee, I., & Mak, P. (2018). Metacognition and metacognitive instruction in second language writing classrooms.
TESOL Quarterly, 52(4), 1085–​1097.
Leijten, M., & Van Waes, L. (2013). Keystroke logging in writing research: Using Inputlog to analyze and
visualize writing processes. Written Communication, 30(3), 358–​392.
Lozano, C., & Callies, M. (2018). Word order and information structure in advanced SLA. In P.A. Malovrh &
A.G. Benati (Eds.), The handbook of advanced proficiency in second language acquisition (pp. 419–​441).
Oxford: Wiley-​Blackwell.
Lu, X., & Ai, H. (2015). Syntactic complexity in college-​level English writing: Differences among writers with
diverse L1 backgrounds. Journal of Second Language Writing, 29, 16–​27.
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2007). A review of writing strategies: Focus on
conceptualizations and impact of first language. In A.D. Cohen & E. Macaro (Eds.), Language learner
strategies: Thirty years of research and practice (pp. 229–​250). Oxford: Oxford University Press.
Matsuda, P.K. (1997). Contrastive rhetoric in context: A dynamic model of L2 writing. Journal of Second
Language Writing, 6(1), 45–​60.
Meunier, F. (2015). Introduction to the LONGDALE project. In E. Castello, K. Ackerley, & F. Coccetta
(Eds.), Studies in learner corpus linguistics: Research and applications for foreign language teaching and
assessment (pp. 123–​126). Bern: Peter Lang.


Murphy, L., & Roca de Larios, J. (2010). Searching for words: One strategic use of the mother tongue by
advanced Spanish EFL writers. Journal of Second Language Writing, 19(2), 61–​81.
Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for
teaching. Applied Linguistics, 24(2), 223–​242.
Odlin, T. (1989). Language transfer: Cross-​linguistic influence in language learning. Cambridge: Cambridge
University Press.
Odlin, T. (2003). Cross-​linguistic influence. In C.J. Doughty & M.H. Long (Eds.), The handbook of second
language acquisition (pp. 436–​486). Malden, MA: Blackwell.
Odlin, T. (2006). Could a contrastive analysis ever be complete? In J. Arabski (Ed.), Cross-​linguistic Influence
in the second language lexicon (pp. 22–​35). Clevedon: Multilingual Matters.
Olive, T. (2010). Methods, techniques, and tools for the on-​line study of the writing process. In N.L. Mertens
(Ed.), Writing: Processes, tools and techniques (pp. 1–​18). New York: Nova Science Publishers.
Ortega, L. (2009). Understanding second language acquisition. London: Hodder Education.
Ortega, L. (2012). Epilogue: Exploring L2 writing–​SLA interfaces. Journal of Second Language Writing,
21(4), 404–​415.
Paquot, M. (2013). Lexical bundles and L1 transfer effects. International Journal of Corpus Linguistics, 18(3),
391–​417.
Polio, C. (2012). The acquisition of second language writing. In S.M. Gass & A. Mackey (Eds.), The Routledge
handbook of second language acquisition (pp. 337–​352). London: Routledge.
Ringbom, H. (2007). Cross-​linguistic similarity in foreign language learning. Clevedon: Multilingual Matters.
Ringbom, H., & Jarvis, S. (2009). The importance of cross-​linguistic similarity in foreign language learning.
In M.H. Long & C.J. Doughty (Eds.), The handbook of language teaching (pp. 106–​118). Malden,
MA: Wiley-​Blackwell.
Rinnert, C., & Kobayashi, H. (2009). Situated writing practices in foreign language settings: The role of pre-
vious experience and instruction. In R.M. Manchón (Ed.), Writing in foreign language contexts: Learning,
teaching, and research (pp. 23–​48). Bristol: Multilingual Matters.
Rinnert, C., Kobayashi, H., & Katayama, A. (2015). Argumentation text construction by Japanese as a foreign
language writers: A dynamic view of transfer. The Modern Language Journal, 99(2), 213–​245.
Roca de Larios, J., Manchón, R.M., & Murphy, L. (2006). Generating text in native and foreign language
writing: A temporal analysis of problem solving formulation processes. The Modern Language Journal,
90(1), 100–​114.
Sagasta Errasti, M.P. (2003). Acquiring writing skills in a third language: The positive effects of bilingualism.
International Journal of Bilingualism, 7(1), 27–​42.
Schoonen, R., Van Gelderen, A., De Glopper, K., Hulstijn, J., Simis, A., Snellings, P., & Stevenson, M. (2003).
First language and second language writing: The role of linguistic knowledge, speed of processing, and
metacognitive knowledge. Language Learning, 53(1), 165–​202.
Schoonen, R., Van Gelderen, A., Stoel, R.D., Hulstijn, J., & de Glopper, K. (2011). Modeling the development
of L1 and EFL writing proficiency of secondary school students. Language Learning, 61(1), 31–​79.
Stevenson, M., Schoonen, R., & De Glopper, K. (2006). Revising in two languages: A multi-​dimensional com-
parison of online writing revisions in L1 and FL. Journal of Second Language Writing, 15(3), 201–​233.
Tsur, O., & Rappoport, A. (2007). Using classifier features for studying the effect of native language on
the choice of written second language words. Proceedings of the Workshop on Cognitive Aspects of
Computational Language Acquisition (pp. 9–​16).
Unsworth, S. (2013). Current issues in multilingual first language acquisition. Annual Review of Applied
Linguistics, 33, 21–​50.
Van Vuuren, S., & Laskin, L. (2017). Dutch learner English in close-​up: A Bayesian corpus analysis of pre-​
subject adverbials in advanced Dutch EFL writing. International Journal of Learner Corpus Research,
3(1), 1–​35.
Van Weijen, D., Van den Bergh, H., Rijlaarsdam, G., & Sanders, T. (2009). L1 use during L2 writing: An empir-
ical study of a complex phenomenon. Journal of Second Language Writing, 18(4), 235–​250.
Wang, W., & Wen, Q. (2002). L1 use in the L2 composing process: An exploratory study of 16 Chinese EFL
writers. Journal of Second Language Writing, 11(3), 225–​246.
Wong, S.-​M.J., & Dras, M. (2009). Contrastive analysis and native language identification. Proceedings of the
Australasian Language Technology Association Workshop 2009 (pp. 53–​61).
Woodall, B. (2002). Language-​switching: Using the first language while writing in a second language. Journal
of Second Language Writing, 11(1), 7–​28.

9
MULTICOMPETENCE AND
L2 WRITING
Guillaume Gentil
Carleton University

Introduction
The concept of multicompetence in second language (L2) acquisition was introduced by Cook
(1991, 1992). It was first defined as the “compound state of a mind with two grammars” (Cook,
1991, p. 112), using grammar in the Chomskyan sense of linguistic competence, to offer a
reconceptualization of Chomskyan language acquisition theory by taking multilingualism, rather
than monolingualism, as a default condition of humanity. To broaden the scope of the concept
beyond syntax or generative linguistics, the definition of multicompetence was later broadened to
“a person’s knowledge of more than one language system” (Cook, 1992, p. 581), or “the knowledge
of more than one language in the same mind” (Cook, 2003a, p. 10). In a recent review of the development of the concept, Cook (2016a) has proposed a third definition: “the overall system of a mind or
a community that uses more than one language” (p. 3). As the concept developed, multicompetence
has become “more a perspective from which to view the acquisition and use of multiple languages
than a theory or a model” (Cook, 2016a, p. 3).
Throughout the development of the concept, a key underlying premise is that a bilingual is not
the sum of two monolinguals in one. Rather, people who know two or more languages differ from a
monolingual in several respects, not only because knowledge of an L2 differs from a monolingual’s
knowledge of that language, but also because learning an L2 changes one’s knowledge of the first
language (L1). Crosslinguistic influences not only go from the L1 to the L2, but also from the
L2 to the L1 (a phenomenon sometimes referred to as “reverse transfer,” Cook, 2008, p. 20), and
by extension, in trilingual and multilingual acquisition, they are multidirectional among all the
languages a person knows, learns, or uses. Multicompetence is the complex language system or
unique combination that results from these interactions.
A corollary of this idea is that monolingual native speakers should not be used as the norm or
baseline for L2 acquisition or instruction. Rather, because people who know two or more languages
are different from monolinguals, they should be looked at in their own right and not as deficient
monolinguals. Cook (2002a) thus proposes the term L2 user, rather than non-​native speaker, as
“any person who uses another language other than his or her first language, that is to say, the one
learnt as a child” (p. 1), or “who knows and uses a second language at any level” (p. 4). The term
L2 user is contrasted to L2 learner in two main ways: L2 learner evokes a classroom context in
which a second language is learned for future use; L2 learner implies a never-ending process of L2 acquisition, with a monolingual L1 norm as an elusive target, rather than bringing attention to
knowledge and use of the L2 (Cook, 2002a, 2008).
The third definition of multicompetence, proposed in Cook (2016a) and not yet as widely used,
is meant to underscore the complex interplay and restructuring of linguistic, cognitive, and social
systems when more than one language is used and learned by individuals and communities. It
“implies that language is not separate from the rest of the mind” and yet “leaves the concept of
‘system’ and ‘community’ open to interpretation” (Cook, 2016a, p. 3). The idea that learning two or more languages affects the whole mind, that is, all linguistic and cognitive systems, was already
in evidence in Cook’s (1992) earlier argument that L2 users differ from monolinguals not only in
their linguistic knowledge but also in their cognitive processes and metalinguistic awareness. This
argument is based on much research evidence on the cognitive effects of bilingualism, including
improved cognitive flexibility, enhanced language awareness, better communication skills in the
L1, faster L1 literacy development, but also slight cognitive deficits and memory deficiencies (as
reviewed in Cook, 1992, 2002a). Such research evidence has accumulated since the 1970s. However,
more recent research developments on bilingual cognition, allied with the revival of research on
linguistic relativity, have brought linguistic and cognitive interrelations in multicompetence to the
fore (Cook, 2015; Cook & Bassetti, 2011; Cook & Li Wei, 2016). Similarly, the idea that linguistic
competence is open to assessment both at the individual and community levels, also foregrounded
in the latest definition of multicompetence, can be traced back to Saussure’s argument about the
social and psychological faces of language (Cook, 2008).
Multicompetence, as a concept, a framework, and a perspective, calls for a critical re-​examination
of much second language acquisition (SLA) research (Cook, 2008; Ortega, 2016). It is part of a “bi/​
multilingual turn” that invites a reconsideration of the learning of additional languages later in life
as a form of “late bilingualism” (Ortega, 2013, 2016). In SLA-​oriented second language writing
studies, a multicompetence perspective provides a unique vantage point from which to reconsider
the nature of and relationship between language knowledge and writing knowledge in multilingual
writing development. In particular, it casts new light on the crosslinguistic transfer of writing com-
petence (cf. Chapter 8, this volume). It also has major implications for researching, teaching, and
assessing language and writing in multilingual contexts.

Historical Perspectives
Cook (2008) and Cook (2016a) offer two helpful retrospectives of the development of theory
and research on multicompetence since the early 1990s. The idea of multicompetence has
antecedents in bilingualism research, notably Grosjean’s (1982) holistic view on bilingualism,
the view that a bilingual is not two monolinguals in one person but combines knowledge of two
languages in unique and novel ways. Multicompetence has been part of a broader bi/​multilin-
gual turn which has been called for in, but as yet to make “inroads” into, SLA research (Ortega,
2016). While steeped in bilingualism research, Cook (2008) also credits several other influences
on the conceptual development of multicompetence, from linguistics (Chomsky), SLA (e.g.,
Selinker’s interlanguage theory), and sociolinguistics (e.g., Labov’s idea that language diffe-
rence does not entail language deficit).
Multicompetence was first introduced both as a concept and perspective to L2 writing studies
by Ortega and Carson (2010), a landmark article that called upon research at the interface
between writing and SLA to embrace the bilingual and social turns at play in applied linguistics.
Ortega and Carson outlined an agenda for developing theories and research of L2 writing from a
multicompetence lens, highlighting the importance of studying multicompetence via within-​writer
designs in a variety of social contexts. They further underscored the challenge of developing ways
to compare writing development across languages and assess multicompetent writing through a
bilingual, ecologically valid lens. They reviewed research that did not adopt a multicompetence lens

110
Multicompetence and L2 Writing

explicitly and yet that could be seen as foregrounding or reinterpretable within a multicompetence
approach, such as work by Roca de Larios, Manchón, and Murphy (2006), which will be reviewed
in a later section in the chapter.
Ortega and Carson (2010) emphasized the social nature of multicompetence, describing L2 com-
posing as a socially situated, multicompetent act and characterizing multicompetence as multilin-
gual writers’ ability to negotiate cultural and educational influences on their writing development.
They thus saw multicompetence as integral not only to a bilingual turn, but also to a social turn in
language learning studies. This earlier emphasis on a social view of multicompetence helps explain
why, in the field of L2 writing, a multicompetence lens has been associated primarily with socially
oriented research (Cumming, 2016).

Critical Issues and Topics


One key question arising within a multicompetence perspective on language and writing develop-
ment centers on the start and end points of multicompetence. In keeping with a Chomskyan view of
competence, Cook (1992) sees the point of emergence whenever regularity and systematicity in L2
use suggest knowledge of the L2 that is distinctive from knowledge of the L1. What could replace
monolingual native competence as a desirable end goal? Although a theory of development is yet
to be elaborated within a multicompetence perspective, Cook does suggest successful L2 users as
helpful role models (Cook, 2003a). Developing multicompetence as an L2 user/​writer could then
be conceived as a path toward successful L2 use, with success being contingent on the demands of,
and social validation in, specific contexts of communication. Along these lines, and in relation to L2
writing, Byrnes (2014) describes development in the multicompetent learner as a trajectory toward
“flexibility and stability” in using linguistic resources to make meaning “in light of the communi-
cative situation” (p. 89). In other words, it is a trajectory toward greater agency in responding to
multilingual communicative demands, as exemplified by a multilingual scholar’s growing ability
to negotiate her academic writing in multiple languages with colleagues, reviewers, and editors
during her career.
Another central, and still largely unresolved, question concerns the underlying structure of
multicompetence. Cook (2002a, 2003b) considers three logical possibilities: Total separation
of the language systems, interconnection among them to a greater or lesser degree, and total inte-
gration into a single merged system. J. K. Hall, Cheng, and Carlson (2006) and J. K. Hall (2016)
have argued for the latter position. Proposing a usage-​based view of multicompetence, they char-
acterize language knowledge as “dynamic constellations of linguistic resources” (J. K. Hall et al.,
2006, p. 226) that emerge and reorganize themselves through the continual interaction between
the pragmatic demands of language use in specific contexts and the cognitive constraints of the
mind. Differences across users are thus based not on the number of languages they speak, but on
the amount and diversity of their experiences and uses, whether it may be characterized in terms of
languages, discourses, registers, or styles. The very concept of multicompetence then comes under
attack as presuming a qualitative difference between multicompetence and monocompetence, that
is between multilingualism and monolingualism.
The view that languages are merged in the mind despite being socially constructed as separate
labeled languages is shared by proponents of the “strong” version of translanguaging in bilingual
education, who also question the psychological reality of bounded languages (Li Wei, & García,
2017; Otheguy, García, & Reid, 2015). In writing studies, similar ideas surface in the translingual
approach, notably in relation to the concept of codemeshing –​the mixing of linguistic and other
symbolic systems that are drawn upon as a single integrated system with communicative intention-
ality (Canagarajah, 2011a, 2013; Gevers, 2018; J. Hall & Navarro, 2011). The idea that multilin-
gualism and monolingualism are similar in kind also underlies the parallel that is sometimes drawn,
in writing studies and English for Academic Purposes, between learning a new register and learning an additional language, well captured by the observation that “academic English is no one’s first
language” (Hyland, 2016, p. 61).
However, such views do not have unanimous support. Kecskes (2010) reviews empirical
evidence of qualitative differences between monocompetence and multicompetence pointing
to the complex remapping of words, concepts, and grammatical structures in late bilingualism.
Jarvis and Pavlenko (2007) express concerns that J.K. Hall et al.’s (2006) blurring of meaningful
distinctions between monolingual and multilingual users undermines the raison d’être of multi-
lingualism research, encourages the use of monolingual speakers as a model for future research
in SLA and bilingualism, contradicts accumulated evidence about crosslinguistic influences in
multilingual language use and cognition, and ignores the distinct needs of L2 learners and users
(pp. 214–​215).
Cook’s own position in this debate is prudent and nuanced. Cook (2002a, 2003b) proposes an
“integration continuum” between the two endpoints of total language separation and total language
integration, with various possibilities for interconnection in the middle. Reviewing empirical evi-
dence, Cook rules out both total separation and total integration as untenable extreme positions
and suggests instead that the particulars of language interrelationships in the mind may differ
according to areas of language (e.g., syntax, lexis, phonology), stage of development (early/​late
stages, age), context of learning and use (e.g., classroom/​naturalistic contexts), and the closeness
of languages (Cook, 2002a). Cook (2016a) further cautions against equating languages as entities
in the mind and in society. While the effect of language differentiation in society on language
differentiation in the mind remains a fascinating yet largely unresolved question, multilingualism
research points to a multilayered mental architecture with both some degree of differentiation
and yet evolving interconnections at the linguistic and conceptual levels (e.g., de Groot, 2016;
Kecskes, 2010).

Current Contributions and Research


Research on multicompetence and L2 writing is difficult to capture because of the various ways
authors have drawn upon multicompetence theory. Some studies just evoke the concept as a way to
signal a multilingual, non-​deficit orientation to the study of writing development across languages,
or to interpret the findings of previous research on writers’ use of multilingual resources (e.g.,
Kibler, 2010). Other studies can be seen as aligned with a multicompetence perspective and yet do
not refer to multicompetence specifically. For example, in her introduction to an edited collection,
Manchón (2011) sees Leki’s and Canagarajah’s chapters to the book (Canagarajah, 2011b; Leki,
2011) as “perfect examples of the type of research that sees ‘L2 composing as a multicompetent’
(i.e., biliterate and bilingual) act that is situated and understood in its social context” (Ortega &
Carson, 2010, p. 52) and yet these two authors did not frame their research from a multicompetence
lens explicitly. Similarly, Rinnert and Kobayashi (2016) acknowledge work on multilingual writing
development that is allied with a multicompetence perspective without being labeled as such,
including Gentil’s (2011) biliteracy genre approach, Cenoz and Gorter’s (2011) focus on multilin-
gualism, or translingual and translanguaging approaches (see previous section).
The review that follows will focus on selected empirical studies that explicitly adopt a
multicompetence framework as a central focus of study or at least as one of the main sources of
theoretical insight (for a recent and complementary review of research on multicompetence and
multilingual writing, see Rinnert & Kobayashi, 2016).
Some of the earliest references to multicompetence in L2 writing research can be found in the
cognitively-​oriented studies by Manchón, Murphy, and Roca de Larios’ team on the nature of bi/​
multilingual writing processes and the relationship between the development of writing expertise
and language proficiency in bilinguals (e.g., Manchón & Roca de Larios, 2007; Manchón, Roca
de Larios, & Murphy, 2009; Murphy & Roca de Larios, 2010). Even though the authors recognize that multicompetence was not part of their original lens, they point to the explanatory potential of
Cook’s work in interpreting their findings on the bidirectional transfer of writing processes (e.g.,
planning) across languages and the complex interplay between composing and linguistic compe-
tence and educational experience. Cook’s lens also helped them bring to light “the specificity of
multicompetent writers’ strategic behavior,” such as the use of the L1 and broader linguistic, idea-
tional, and textual knowledge resources to facilitate composing (Manchón et al., 2009, p. 119).
Similarly, Kecskes and Papp (2000) draw on a multicompetence lens to interpret the effect of inten-
sive foreign language instruction on enhanced syntactic sophistication in both L1 and L2 writing
among secondary school students in Hungary: students in the immersion or intensive programs
outperformed students with only two or three hours of foreign language study per week, suggesting
that the linguistic and analytical skills and learning strategies developed through intensive foreign
language study help the students use written language in more elaborate, conscious, and abstract
ways in both their L1 and their L2.
Arguably the most well-​known example of L2 writing research that features multicompetence
as central to theoretical framing and empirical focus is the work of Kobayashi and Rinnert (2012,
2013), who expanded Cook’s theory of multicompetence in multilinguals from linguistic to
writing competence. Such broadened scope aligns well with Cook’s latest conceptualization of
multicompetence as an overall mental system that includes both linguistic and cognitive subsystems
and underscores the interplay between them (cf. Cook’s 2016a third definition). Combining cross-​
sectional and longitudinal data and triangulating insight from several sources (mostly text analyses,
interviews, questionnaires), Kobayashi and Rinnert show how the rhetorical and discourse patterns
found in L1, L2, and L3 texts vary with multilingual writers’ writing instruction and experience in
more than one linguistic and educational context. Some text features seem to be specific to a lin-
guistic context, whereas others appear in more than one linguistic context. However, the fact that multilingual writers’ L1, L2, and L3 text features change over time and with experience suggests that their selection of text features is not a function of linguistic constraints imposed by the language of com-
posing. Rather, multicompetent writers appear to make strategic, more or less conscious, decisions
as to which text features may be appropriate for more than one linguistic context, and which may
preferably be restricted to specific linguistic contexts, based on their appreciation of cultural and
situational expectations (see also Gentil, 2011).
Kobayashi and Rinnert (2012, 2013) interpret these findings by proposing a model of text con-
struction whereby the writing knowledge that multilingual writers display in selecting text features
(e.g., topic sentences, metadiscourse markers) is construed as a repertoire with several interacting
components, including linguistic knowledge, meta-​knowledge (e.g., of reader expectations), and
knowledge of writing conventions and rhetorical features. In keeping with Cook’s integration con-
tinuum, Kobayashi and Rinnert see this repertoire of writing knowledge as dynamic and evolving
with linguistic experience. In sequential bi/​multilingual development, as in the Japanese context of
instruction that Kobayashi and Rinnert have investigated, text features first tend to be language-​
specific, suggesting that the underlying L1 and L2 writing knowledge systems may initially develop
separately when acquired through distinct linguistic contexts. Over time, however, multilingual
writers demonstrate much greater flexibility in their discourse choices, suggesting that the L1, L2,
and Ln writing knowledge derived from different linguistic sources (e.g., Japanese, English, and
Chinese) and associated educational contexts (e.g., L1 and L2 writing instruction in Japan and
Australia) becomes more integrated within the developing multilingual repertoire.
Such a model leads to a radical reconceptualization of the crosslinguistic transfer of writing
skills and hence adds to research on transfer in SLA. Cook (2016b) himself questions whether the
image of carrying something over from A to B that is conveyed by the transfer metaphor is most
apt in capturing multilingual development. Within a multicompetence perspective, bits of know-
ledge are not carried over from one linguistic system to another; rather, the whole system and its
subsystems are reorganizing themselves with new linguistic experience as new structures emerge.


This understanding is akin to dynamic systems theory, whose affinities with multicompetence
theory have been underlined (e.g., Cook, 2016a; Hofer, 2017).
As noted by Kobayashi and Rinnert (2012), however, the text features found in the writing
of their research participants were not always deemed to be most appropriate, suggesting that
developing multicompetent writers have yet to expand and learn to apply their knowledge. A recent
study by Forbes and Fisher (2018) shows how writing instruction with an explicit focus on meta-
cognitive knowledge can help student writers make more reflective discourse choices that lead
to improvements in the assessed quality of their writing. Furthermore, Forbes and Fisher (2018)
also show how strategies explicitly taught in one foreign language class (German) become avail-
able for use when writing in both the L1 (English) and another foreign language (French). While
multilingual writers have been shown to develop metalinguistic and metacognitive awareness
spontaneously in some contexts (Hofer, 2017; Tullock & Fernández-Villanueva, 2013), Forbes and
Fisher (2018) suggest that instruction whereby L1 and L2 teachers collaborate to draw explicit
connections in effective writing strategy use across language contexts can facilitate the develop-
ment of multicompetence in writing (see also Chapter 8, this volume).
In short, Cook’s multicompetence lens is part of an expansive body of research on multilingual
approaches to writing instruction and development that points to the strategic ways in which multi-
lingual writers draw on and develop evolving linguistic and writing repertoires as they negotiate
the demands of writing across contexts, and how instruction can help them use and expand these
repertoires.

Main Research Methods


A multicompetence perspective has major implications for research methods. Cook’s (2016a)
three premises of multicompetence provide helpful guidelines for study designs in L2 writing
research. The first premise is that multicompetence “concerns the total system for all languages
(L1, L2, Ln) in a single mind or community and their interrelationships” (p. 7). This premise
invites within-writer designs, that is, research methodologies that allow detailed and contextualized
examinations of individual writers’ language use and knowledge in all the languages that form
their linguistic repertoires. Within-​writer designs have been called for and used in a growing
body of L2 writing research since the 1990s. Recent within-​writer text-​based studies of trilin-
gual writing offer good examples of methodologies that allow insight into the interrelationships
among the language systems drawn upon by multilingual writers. Cenoz and Gorter (2011),
for example, compared the written productions in Basque, English, and Spanish by the same
students in Basque schools, using the same picture description task (on similar topics), rating
scale (Jacobs, Zingraf, Wormuth, Fay Hartfiel, & Hughey, 1981), and linguistic and textual
indicators. This allowed them to show correlations among subscales (e.g., grammar, vocabulary,
mechanics) and similar strategies for text organization across all languages, as well as multidir-
ectional transfer at the level of lexicogrammar. In another study of trilingual writing, Lindgren,
Westum, Outakoski, Sullivan, and Sullivan (2017) asked 15-year-old writers to compose one argumentative and one descriptive text in the three languages they are exposed to in and out of school. They gathered information about the participants’ language repertoires and practices by means of a questionnaire. Using a common framework, Systemic Functional Linguistics, allowed them to
compare the writers’ strategies for meaning making across languages in each genre.
The second premise of multicompetence is that multicompetence “does not depend on the
monolingual native speaker” (Cook, 2016a, p. 11). This premise calls for a critical re-​examination
of research designs based on native vs. non-​native comparisons. Comparing monolingual and
multilingual writers can still be of value in highlighting the distinctive properties of mono-​and
multicompetence, provided that monolingual L1 writing is not regarded as the norm for L2 writing. It is also important to characterize multicompetent writing in its own right, describing what successful
L2 writers do, and understanding the social conditions of their success in specific contexts, rather
than prescribing what they should do. Even in within-​subject comparisons of L1 and L2 writing,
researchers can draw on native speaker judgment to assess texts produced in each language (e.g.,
Kubota, 1998). While this may be justifiable for some purposes (for example in contexts where
writers are expected to write for monolingual audiences, as in Kubota, 1998), in keeping with
a multicompetence perspective, multicompetence in writing may preferably be assessed in all languages by the same multilingual raters, who are themselves multicompetent and familiar with culturally appropriate expectations in the target writing contexts. Schoonen, van Gelderen, Stoel,
Hulstijn, and de Glopper (2011), for example, used bilingual Dutch-​English raters familiar with
the participants’ school context to assess student writing in both Dutch L1 and English L2. It can
also be advantageous to have language and writing development assessed by a multilingual team
of raters bringing complementary linguistic repertoires and cultural perspectives on linguistic and
rhetorical expectations in target contexts, as done by Kobayashi and Rinnert (2013).
The third premise of multicompetence is that multicompetence “affects the whole mind, i.e., all
language and cognitive systems, rather than language alone” (Cook, 2016a, p. 15). This premise
calls for research designs that allow for the linguistic and cognitive components of multicompetence
thus broadly conceived to be teased apart. In L2 writing research, it extends the theory of
multicompetence, as Kobayashi and Rinnert (2012) do, from linguistic competence to writing com-
petence (as a cognitive system) and their interaction. It thus aligns well with SLA-​oriented L2
writing research that seeks to understand the interplay between writing knowledge and language
knowledge in multilingual writing development (for a review, see Manchón, 2013). Kobayashi
and Rinnert’s (2012, 2013) studies illustrate fruitful methodological strategies: supplementing a longitu-
dinal case study of trilingual writing development with cross-​sectional comparisons of multilingual
writers with different levels of experience in the L1 and the L2; and triangulating qualitative and
quantitative data from multiple data sources (interviews, text analyses, observations). Schoonen
et al. (2011) exemplify a more quantitative research design, comparing L1 and L2 writing devel-
opment longitudinally through several linguistic and cognitive measures and structural equation
modeling. Two challenges for this line of research are the operationalization of the linguistic and
cognitive components of multicompetence in writing, and the development of analytic measures
of writing development across languages (Ortega & Carson, 2010; Rinnert & Kobayashi, 2016).
In terms of theoretical frameworks, multicompetence theory can be productively used in com-
bination with other compatible lenses. In addition to the affinities with dynamic system theories
mentioned earlier, it has been combined in L2 writing research with genre theory and identity
theory (Kobayashi and Rinnert, 2013), cognitive models of writing (Manchón et al., 2009; Tullock
& Fernández-​Villanueva, 2013), Vygotsky and Bernstein (Kecskes & Papp, 2000), and systemic
functional linguistics (Lindgren et al., 2017). Genre theory (Cheng, 2018; Gentil, 2011; Tardy,
Sommer-​Farias, & Gevers, 2020) may offer a particularly fruitful complementary lens in multi-
lingual writing research to help characterize multicompetence as a repertoire of genre knowledge
that can be developed, exploited, and negotiated across linguistic and social contexts both at the
individual and community levels.

Recommendations for Practice


A multicompetence perspective has major implications for language teaching; these have been
described in some detail in Cook (1999, 2002b) and Scott (2016), although the actual impact on
classroom practices may yet still be limited (Cook, 2016c). I will focus here on implications for
language and writing instruction, that is, instruction with the dual goal of learning to write in an L2
and writing to learn an L2 (cf. Manchón, 2011).


The first major implication is the adoption of successful L2 use, rather than monolingual native
use, as a desirable goal for instruction, with the understanding that success in writing is a function
of both individual skill and social validation in specific contexts of communication (Gentil, 2011).
A multicompetence perspective thus advocates for an “ethical treatment” of L2 writers (Tardy &
Whittig, 2017), combating language deficit views that inevitably arise when monolingual L1 norms
are set as an unattainable target (Cook, 1999). The goals for instruction should then at least in
part be determined through a careful examination of what may count as successful L2 use in spe-
cific target contexts, and syllabi should include informed descriptions of successful L2 users as
role models. In keeping with Languages-​for-​Specific approaches, instructional goals will vary with
targeted contexts. For university students, end goals may include being able to write papers in an L2
that meet the expectations of their professors in their disciplines. For multilingual scholars, a goal
may be to write a manuscript in an additional language that is deemed acceptable for publication.
And for business L2 writers, a goal may be to negotiate deals and contracts by email, texting, and in
more formal writing. A multicompetence perspective thus aligns well with critical approaches in English
for Academic Purposes (Benesch, 2001) and teaching English as a lingua franca (Jenkins, Cogo, &
Dewey, 2011) in attempting to displace monolingual norms with multicompetent benchmarks open
for negotiation and contestation. Multicompetent L2 writers need to know how to use English as
a lingua franca (Cook, 2008), and monolingual English communities need to open themselves to
multicompetent norms. This entails teaching and learning how to negotiate acceptability and intel-
ligibility in language uses that may be deemed to be deviating from monolingual written English
expectations and standards.
Importantly, however, for Cook (2002b), teaching for multicompetence should not be reduced to
“external goals,” defined as the targeted L2 uses outside the classroom; rather, it should also include
“internal goals,” defined as the “educational aims of the classroom itself” (p. 330), such as pro-
moting cognitive flexibility, improving attitudes toward speakers of other languages, and changing
students’ minds. Such internal goals are well illustrated in Kecskes and Papp’s (2000) finding about
the impact of intensive foreign language study on the development of analytical skills displayed in
L1 writing. In the context of the L2 writing classroom, teaching for multicompetence can include
exploiting the potential of multilingualism and writing as resources for thinking, learning, rhet-
orical flexibility, and cross-​cultural awareness. For example, instruction could focus on how to
draw on multilingual resources to gain a deeper understanding of a topic or a concept (e.g., Gentil,
2019), better tackle a complex writing task (e.g., Murphy & Roca de Larios, 2010), or engage with
multiple readerships. Additionally, when examining what successful L2 writers do as role models to
aspire to in the L2 writing classroom, attention should be paid to practices and skills that are unique
to multicompetent writers and set them apart from monolingual writers. These include crosslingual
practices such as summarizing in one language information gathered in another language, trans-
lating, and mediating or “shuttling” back and forth between linguistic communities (Canagarajah,
2006; Cook, 2002b).
Another major pedagogical implication of a multicompetence perspective relates to use of the
L1 (or other non-target languages) in the L2 classroom. Because multicompetence theory views languages as interrelated in the mind, it supports the development of links between and among language and cognitive systems; it thus jars with teaching approaches that insist on strict language separation in bilingual education and ban L1 use in the L2 classroom. Even when the L1 is made
invisible because of a ban, it is rarely if ever switched off in L2 users’ minds. Multicompetence
therefore calls for a rational re-​evaluation of the ways the L1 could be used in the classroom (Cook,
2001) and aligns well with calls for “rethinking monolingual instructional strategies in multilin-
gual classrooms” (Cummins, 2007, p. 221). In L2 writing research, the frequent uses and enab-
ling functions of the L1 in L2 composing have been well documented (for a review, see Göpferich,
2017; see also review in Chapter 8, this volume), and the growing body of research into trilingual
and multilingual writing further suggests how multilingual writers draw on their entire linguistic repertoire when composing (e.g., Cenoz & Gorter, 2011; Tullock & Fernández-Villanueva, 2013).
There is, however, a lingering debate about the extent to which and ways in which the L1 and
other non-​target languages should be used in the second or foreign language classroom. Although
this debate is not new (e.g., Auerbach, 1993; Polio, 1994), there is still surprisingly little empirical
research comparing the impact of exclusive L2 use vs. principled L1 use on language learning.
A noteworthy study by Tian and Macaro (2012), however, does point to a potential benefit of
teacher codeswitching for L2 vocabulary acquisition.
Central to the debate is the apparent trade-​off between two competing goals: leveraging the L1
as a resource for learning vs. maximizing L2 use for L2 development. In designing instruction at
the interface of writing and language development, a consideration of paramount importance is
thus to identify the primary objective(s): a) learning to write in an additional language; b) writing
to learn a language. In ESL contexts, such as the United States or Britain, the goal tends to be a),
and encouraging multilingual writers to draw on their entire linguistic repertoire is likely to be
helpful. In some foreign language contexts, however, b) tends to be privileged, and maximizing
opportunities for L2 use (without banning L1 use) could be of benefit, especially when access to
the L2 is limited in the environment. In bilingual education contexts, such as content-​and-​language
integrated learning or content-based language instruction, objectives include a combination of a) and b), together with a third goal, c): writing to learn a subject matter in/and a new language (see Chapter 18, this volume). In
keeping with a multicompetence perspective, crosslinguistic pedagogies that encourage students to
make links between languages are likely to facilitate language development, deepen understanding
of the subject matter, and sharpen writing skills. However, in deciding on appropriate language
allocation strategies in these contexts, another key consideration is language status. As Ballinger,
Lyster, Sterzuk, and Genesee (2017) and Lyster (2019) argue in the Canadian and U.S. contexts,
allowing language minority students to draw on their entire language repertoires is likely to facili-
tate learning and literacy in English; but in order to promote dual language development in French
immersion contexts, where most students are English dominant, crosslinguistic pedagogies must
leverage English L1 use in ways that safeguard opportunities for extended L2 use and deeper levels
of processing in French. Otherwise, given the dominance of English in the environment and in
students’ lives, English quickly displaces French as the default language, impeding development in
French. Crosslinguistic pedagogy promoting L1 use for learning and L2 writing must therefore be
“context-​appropriate” (Ballinger et al., 2017, p. 30).
In short, Cook (2016a) most recently defines multicompetence as "the overall system of a mind or a community that uses more than one language" (p. 3, emphasis added). However,
multicompetence is a complex, multicomponent system both of a mind and a community. To enable
it to develop fully, curriculum design should balance out the linguistic, cognitive, and social systems
that constitute it and capitalize on the interactions among them. This entails better understanding
successful L2 writing in given communities and helping L2 writers mediate between communities.

Future Directions
Given the intertwined individual and social nature of multicompetence, J.K. Hall (2016) sees inte-
grating empirical findings from cognitively-oriented and ethnographic research as the next challenge for multicompetence research. Multiple embedded case study designs (Yin, 2018) that nest case studies of individual writers within case studies of multilingual literacy practices in communities would be conducive to exploring the interplay between multicompetence in the mind and in
the community. A related avenue suggested by Jarvis and Pavlenko (2007) is to interrelate lan-
guage use patterns at the individual and group levels by cross-pollinating language contact and SLA
research. In exploring these relations, however, it is important to bear in mind the different time
scales at which language systems may change: months and years in a person’s life span; decades
and centuries in a community's history (Gentil, 2018). As Kecskes (2010) also notes, whereas in L1 acquisition language structures may emerge in the mind through usage and socialization, in late L2
acquisition, especially in instructed foreign language contexts, new language structures are more
likely to develop out of the contact between new and existing language structures. Unravelling the
complex intersection of individual and collective trajectories of language development in today’s
era of globalization may shed light on the intriguing question of the effect of the social construc-
tion of language as separate named languages on bilinguals’ mental representations and cognitive
processing.
Another direction for future research is to investigate further the unique aspects of multicompetent
practice, notably multilingual writers’ translation and crosslingual practices, as when they write in
one language from sources in another, and then switch languages in reading and writing. There
is reason to believe that such practices can be quite common, for example among francophone
university writers who write analyses, popularization pieces, abstracts, and theses in French from
English sources, and then draw on their previous French writings to compose reports and articles in
English (Gentil, 2019). Little is known about the challenges, resources, and skills associated with
such crosslingual work. Ongoing research triangulating insight from interviews, writing samples,
and screencapture recordings of composing processes suggests that multilingual writers develop
crosslingual literacy skills by trial and error (Séror & Gentil, 2019). Part of the reason for this is
that research and instruction are compartmentalized by language (L1 vs. L2) and skill (writing vs.
translation) (Gentil, 2018). A multicompetence lens can help overcome “disciplinary divisions of
labor” (Matsuda, 1999) by encouraging researchers and educators to adopt a holistic perspective
on language and writing development in multilingual contexts, one which includes translation and
crosslingual mediation as a core skill.
In L2 writing research, there has been some interest in comparing the quality of L2 writing
produced by composing directly in the L2 (direct composing) vs. composing in the L1 and trans-
lating into the L2 (composing through translation) (e.g., Cohen & Brooks-​Carson, 2001; Kobayashi
& Rinnert, 1992; for a review on translation and writing competence, see Göpferich, 2017). This line
of research suggests that for some writers, especially at lower levels of L2 proficiency, translated
composing can help enhance the quality of text produced in a second language, allowing writers to
better develop and organize their ideas and pay greater attention to style. This research, however,
was conducted before recent and fast-​paced developments in neural machine translation systems,
which have led to impressive, continuing progress in the performance of free online machine translation tools such as Google Translate and DeepL since 2016 (Bowker & Ciro, 2019). While the output they produce still requires human post-editing, these tools are game changers for multicompetent
writers. Such technological advances call for research at the intersection of multicompetence and
machine translation literacy (Bowker & Ciro, 2019), exploring the ways in which multicompetence
is mediated by machine translation and other technologies that are integral to the L2 writer/​user’s
environment. Distinguishing between learning to write in an L2 vs. writing to learn an L2 is again
essential: Developing L2 writers may well be able to produce better texts in the L2 with machine
translation, but copying and pasting L1 texts into an online translator may not lead to the kind of
deep processing in the L2 that is required for L2 development (Lyster, 2019). Better understanding
the impact of machine translation on language and writing development would help to implement
a balanced approach to multicompetence development in technologically mediated contexts by
designing pedagogical strategies that scaffold the use of translation and translation tools in ways
that promote the development of language, writing, and translation skills.
The details of how multicompetence could be structured in the mind remain open to specu-
lation. One model, based on Cummins’ common underlying proficiency, is that cognitive skills,
including writing expertise, may be common across the various languages through which they
could be expressed or realized. In L2 writing research, this has led to understandings of written
genre knowledge as comprising language-​ dependent and language-​ independent components
(Gentil, 2011; Tardy et al., 2020). Language-dependent aspects include the lexicogrammatical resources that are required for realizing a genre in a language; a language-independent dimension
might include awareness of audience, purpose, and genre as important considerations when writing.
Research in bilingual cognition, however, suggests that bilingualism can enhance metalinguistic
awareness by bringing interrelationships between language and thought into focus. Genre-​based
writing instruction aims to enhance students’ genre awareness by helping them articulate aspects
of their genre knowledge (Cheng, 2018). Genre awareness is thus enhanced by verbalization. This
opens questions about the ways in which genre awareness may in fact be dependent on language
and could be promoted by asking students to think about concepts such as genre, audience, writing,
language, knowledge, awareness, communicative purpose, and rhetorical situation through more
than one linguistic lens.
These are but a few possible directions for future research. Several other avenues suggested by
Rinnert and Kobayashi (2016) and Ortega and Carson (2010) are still relevant today. Ten years after
Ortega and Carson's (2010) call, a multicompetence perspective has come of age and continues to have much to offer to research at the interface between second language acquisition and writing, provided that it is embraced in all its complexity.

References
Auerbach, E.R. (1993). Reexamining English only in the ESL classroom. TESOL Quarterly, 27(1), 9–​32.
Ballinger, S., Lyster, R., Sterzuk, A., & Genesee, F. (2017). Context-​ appropriate crosslinguistic peda-
gogy: Considering the role of language status in immersion education. Journal of Immersion and Content-​
Based Language Education, 5(1), 30–​57.
Benesch, S. (2001). Critical English for academic purposes: Theory, politics, and practice. New York:
Routledge.
Bowker, L., & Ciro, J.B. (2019). Machine translation and global research: Towards improved machine trans-
lation literacy in the scholarly community. Bingley: Emerald.
Byrnes, H. (2014). Theorizing language development at the intersection of “task” and L2 writing: Reconsidering
complexity. In H. Byrnes & R. Manchón (Eds.), Task-​Based language learning: Insights from and for L2
Writing (pp. 79–​103). Amsterdam: John Benjamins.
Canagarajah, S. (2006). Toward a writing pedagogy of shuttling between languages: Learning from multilin-
gual writers. College English, 68(6), 589–​604.
Canagarajah, S. (2011a). Codemeshing in academic writing: Identifying teachable strategies of translanguaging.
The Modern Language Journal, 95(3), 401–​417. doi:10.1111/​j.1540-​4781.2011.01207.x
Canagarajah, S. (2011b). Writing to learn and learning to write by shuttling between languages. In R.M.
Manchón (Ed.), Learning-to-write and writing-to-learn in an additional language (pp. 111–132). Amsterdam: John Benjamins.
Canagarajah, S. (2013). Translingual practice: Global Englishes and cosmopolitan relations. New York:
Routledge.
Cenoz, J., & Gorter, D. (2011). Focus on multilingualism: A study of trilingual writing. The Modern Language
Journal, 95(3), 356–​369.
Cheng, A. (2018). Genre and graduate-​level research writing. Ann Arbor, MI: University of Michigan Press.
Cohen, A.D., & Brooks-​Carson, A. (2001). Research on direct versus translated writing: Students’ strategies
and their results. The Modern Language Journal, 85(2), 169–​188.
Cook, V. (1991). The poverty-​of-​the-​stimulus argument and multicompetence. Second Language Research,
7(2), 103–​117. doi:10.1177/​026765839100700203
Cook, V. (1992). Evidence for multicompetence. Language Learning, 42(4), 557–​591. doi:10.1111/​j.1467-​
1770.1992.tb01044.x
Cook, V. (1999). Going beyond the native speaker in language teaching. TESOL Quarterly, 33(2), 185–​209.
doi:10.2307/​3587717
Cook, V. (2001). Using the first language in the classroom. Canadian Modern Language Review, 57(3),
399–​423.
Cook, V. (2002a). Background to the L2 user. In V. Cook (Ed.), Portraits of the L2 user (pp. 1–​28). Clevedon:
Multilingual Matters.
Cook, V. (2002b). Language teaching methodology and the L2 user perspective. In V. Cook (Ed.), Portraits of
the L2 user (pp. 327–343). Clevedon: Multilingual Matters.
Cook, V. (2003a). Effects of the second language on the first. Clevedon: Multilingual Matters.

Cook, V. (2003b). Introduction: The changing L1 in the L2 user’s mind. In V. Cook (Ed.), Effects of the second
language on the first (pp. 1–​18). Clevedon: Multilingual Matters.
Cook, V. (2008). Multi-​competence: Black hole or wormhole for second language acquisition research? In Z.
Han (Ed.), Understanding second language process (pp. 16–​26). Clevedon: Multilingual Matters.
Cook, V. (2015). Discussing the language and thought of motion in second language speakers. The Modern
Language Journal, 99(S1), 154–​164. doi:10.1111/​j.1540-​4781.2015.12184.x
Cook, V. (2016a). Premises of multi-​competence. In V. Cook & Li Wei (Eds.), Cambridge handbook of lin-
guistic multicompetence (pp. 1–​25). Cambridge: Cambridge University Press.
Cook, V. (2016b). Transfer and the relationship between the languages of multi-​competence. In R. Alonso
Alonso (Ed.), Crosslinguistic influence in second language acquisition (pp. 24–37). Clevedon: Multilingual Matters.
Cook, V. (2016c). Where is the native speaker now? TESOL Quarterly, 50(1), 186–​189. doi:10.1002/​tesq.286
Cook, V., & Bassetti, B. (2011). Language and bilingual cognition. New York: Psychology Press.
Cook, V., & Li Wei (Eds.). (2016). The Cambridge handbook of linguistic multi-​competence. Cambridge:
Cambridge University Press.
Cumming, A. (2016). Theoretical orientations to L2 writing. In R. Manchón & P.K. Matsuda (Eds.), Handbook
of Second and Foreign Language Writing (pp. 65–​88). Berlin: de Gruyter Mouton.
Cummins, J. (2007). Rethinking monolingual instructional strategies in multilingual classrooms. Canadian
Journal of Applied Linguistics /​Revue canadienne de linguistique appliquée, 10(2), 221–​240. Retrieved
from https://​journals.lib.unb.ca/​index.php/​CJAL
de Groot, A. (2016). Language and cognition in bilinguals. In V. Cook & Li Wei (Eds.), The Cambridge hand-
book of linguistic multi-​competence (pp. 248–​275). Cambridge: Cambridge University Press.
Forbes, K., & Fisher, L. (2018). Strategy development and cross-​linguistic transfer in foreign and first language
writing. Applied Linguistics Review, 11(2), 311–339. doi:10.1515/applirev-2018-0008
García, O., & Lin, A. M. (2017). Translanguaging in bilingual education. In J. Cenoz, D. Gorter, & S. May
(Eds.), Encyclopedia of language and education: Bilingual and multilingual education (pp. 117–130).
New York: Springer.
Gentil, G. (2011). A biliteracy agenda for genre research. Journal of Second Language Writing, 20(1), 6–​23.
doi:10.1016/​j.jslw.2010.12.006
Gentil, G. (2018). Modern languages, bilingual education, and translation studies: The next frontiers in WAC/​
WID research and instruction? Across the Disciplines, 15(3), 114–​129.
Gentil, G. (2019). Translanguaging and multilingual academic literacies: How do we translate that into French?
Should we? Pour en faire quoi? (et pourquoi s’en faire?). Cahiers de l’ILOB/​OLBI Working Papers, 3–​41.
doi:10.18192/olbiwp.v10i0.3831
Gevers, J. (2018). Translingualism revisited: Language difference and hybridity in L2 writing. Journal of
Second Language Writing, 40, 73–​83.
Göpferich, S. (2017). Cognitive functions of translation in L2 writing. In J.W. Schwieter & A. Ferreira (Eds.),
The handbook of translation and cognition (pp. 402–​422). Hoboken, NJ: Wiley.
Grosjean, F. (1982). Life with two languages: An introduction to bilingualism. Cambridge, MA: Harvard
University Press.
Hall, J., & Navarro, N. (2011). Lessons for WAC/​WID from language learning research: Multicompetence,
register acquisition, and the college writing student. Across the Disciplines, 8(4). Retrieved from https://​
wac.colostate.edu/​docs/​atd/​ell/​hall-​navarro.pdf
Hall, J.K. (2016). A usage-​based account of multi-​competence. In V. Cook & Li Wei (Eds.), The Cambridge
handbook of linguistic multi-​competence (pp. 183–​205). Cambridge: Cambridge University Press.
Hall, J.K., Cheng, A., & Carlson, M.T. (2006). Reconceptualizing multicompetence as a theory of language
knowledge. Applied Linguistics, 27(2), 220–​240. doi:10.1093/​applin/​aml013
Hofer, B. (2017). Emergent multicompetence at the primary level: A dynamic conception of multicompetence.
Language Awareness, 26(2), 96–​112. doi:10.1080/​09658416.2017.1351981
Hyland, K. (2016). Academic publishing and the myth of linguistic injustice. Journal of Second Language
Writing, 31, 58–​69. doi:10.1016/​j.jslw.2016.01.005
Jacobs, H.L., Zingraf, S., Wormuth, D.R., Fay Hartfiel, V., & Hughey, J. (1981). Teaching ESL composition: A
practical approach. Rowley, MA: Newbury House.
Jarvis, S., & Pavlenko, A. (2007). Crosslinguistic influence in language and cognition. New York: Routledge.
Jenkins, J., Cogo, A., & Dewey, M. (2011). Review of developments in research into English as a lingua franca.
Language Teaching, 44(3), 281–​315.
Kecskes, I. (2010). Dual and multilanguage systems. International Journal of Multilingualism, 7(2), 91–​109.
doi:10.1080/​14790710903288313
Kecskes, I., & Papp, T. (2000). Foreign language and mother tongue. Mahwah, NJ: Lawrence Erlbaum.

Kibler, A. (2010). Writing through two languages: First language expertise in a language minority classroom.
Journal of Second Language Writing, 19(3), 121–​142. doi:10.1016/​j.jslw.2010.04.001
Kobayashi, H., & Rinnert, C. (1992). Effects of first language on second language writing: Translation versus
direct composition. Language Learning, 42(2), 183–​209. doi:10.1111/​j.1467-​1770.1992.tb00707.x
Kobayashi, H., & Rinnert, C. (2012). Understanding L2 writing development from a multicompetence per-
spective: Dynamic repertoires of knowledge and text construction. In R. Manchón (Ed.), L2 writing devel-
opment: Multiple perspectives (pp. 101–​134). Berlin: de Gruyter Mouton.
Kobayashi, H., & Rinnert, C. (2013). L1/​L2/​L3 writing development: Longitudinal case study of a Japanese
multicompetent writer. Journal of Second Language Writing, 22(1), 4–​33. doi:10.1016/​j.jslw.2012.11.001
Kubota, R. (1998). An investigation of L1–​L2 transfer in writing among Japanese university students: Implications
for contrastive rhetoric. Journal of Second Language Writing, 7(1), 69–​100.
Leki, I. (2011). Learning to write in a second language: Multilingual graduates and undergraduates expanding
genre repertoires. In R.M. Manchón (Ed.), Learning-​to-​write and writing-​to-​learn in an additional lan-
guage (pp. 85–​109). Amsterdam: John Benjamins.
Lindgren, E., Westum, A., Outakoski, H., & Sullivan, K.P.H. (2017). Meaning-making across
languages: A case study of three multilingual writers in Sápmi. International Journal of Multilingualism,
14(2), 124–​143. doi:10.1080/​14790718.2016.1155591
Lyster, R. (2019). Translanguaging in immersion: Cognitive support or social prestige? Canadian
Modern Language Review/​La Revue Canadienne des Langues Vivantes, 75(4), 340–​352. doi:10.3138/​
cmlr.2019-​0038
Manchón, R.M. (2011). Learning-​to-​write and writing-​to-​learn in an additional language. Amsterdam: John
Benjamins.
Manchón, R.M. (2013). Writing. In F. Grosjean & P. Li (Eds.), The psycholinguistics of bilingualism (pp. 100–​
115). Hoboken, NJ: Wiley-​Blackwell.
Manchón, R.M., & Roca de Larios, J. (2007). On the temporal nature of planning in L1 and L2 composing.
Language Learning, 57(4), 549–​593. doi:10.1111/​j.1467-​9922.2007.00428.x
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving
nature of foreign language composing processes: Implications for theory. In R.M. Manchón (Ed.), Writing
in foreign language contexts: Learning, teaching, and research (pp. 102–​129). Clevedon: Multilingual
Matters.
Matsuda, P.K. (1999). Composition studies and ESL writing: A disciplinary division of labor. College
Composition and Communication, 50(4), 699–​721. doi:10.2307/​358488
Murphy, L., & Roca de Larios, J. (2010). Searching for words: One strategic use of the mother tongue
by advanced Spanish EFL writers. Journal of Second Language Writing, 19(2), 61–​81. doi:10.1016/​
j.jslw.2010.02.001
Ortega, L. (2013). SLA for the 21st century: Disciplinary progress, transdisciplinary relevance, and the bi/​
multilingual turn. Language Learning, 63(S1), 1–​24.
Ortega, L. (2016). Multi-​competence in second language acquisition: Inroads into the mainstream. In V. Cook &
Li Wei (Eds.), The Cambridge handbook of linguistic multi-​competence (pp. 50–​76). Cambridge: Cambridge
University Press.
Ortega, L., & Carson, J. (2010). Multicompetence, social context, and L2 writing research praxis. In P.K.
Matsuda & T.J. Silva (Eds.), Practicing theory in second language writing (pp. 48–​71). West Lafayette,
IN: Parlor Press.
Polio, C. (1994). Reexamining English-​only in the ESL classroom: A comment on Auerbach. TESOL Quarterly,
28(1), 153–​157.
Rinnert, C., & Kobayashi, H. (2016). Multicompetence and multilingual writing. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 365–386).
Berlin: de Gruyter Mouton.
Roca de Larios, J., Manchón, R.M., & Murphy, L. (2006). Generating text in native and foreign language
writing: A temporal analysis of problem-​solving formulation processes. The Modern Language Journal,
90(1), 100–​114. doi:10.1111/​j.1540-​4781.2006.00387.x
Schoonen, R., van Gelderen, A., Stoel, R.D., Hulstijn, J., & de Glopper, C. (2011). Modeling the develop-
ment of L1 and EFL writing proficiency of secondary school students. Language Learning, 61(1), 31–​79.
doi:10.1111/​j.1467-​9922.2010.00590.x
Scott, J. (2016). Multi-competence and language teaching. In V. Cook & Li Wei (Eds.), The Cambridge handbook of lin-
guistic multi-​competence (pp. 445–​460). Cambridge: Cambridge University Press.
Séror, J., & Gentil, G. (2019). Translanguaging and biliteracy in a bilingual university: Student stances and
ideologies. Paper presented at the Annual Conference of the American Association for Applied Linguistics,
Atlanta, GA.

Tardy, C.M., Sommer-​Farias, B., & Gevers, J. (2020). Teaching and researching genre knowledge: Toward an
enhanced theoretical framework. Written Communication, 37(3), 287–​321. doi:10.1177/​0741088320916554
Tardy, C.M., & Whittig, E. (2017). On the ethical treatment of EAL writers: An update. TESOL Quarterly,
51(4), 920–​930.
Tian, L., & Macaro, E. (2012). Comparing the effect of teacher codeswitching with English-​only explanations
on the vocabulary acquisition of Chinese university students: A lexical focus-​on-​form study. Language
Teaching Research, 16(3), 367–​391.
Tullock, B.D., & Fernández-​Villanueva, M. (2013). The role of previously learned languages in the thought
processes of multilingual writers at the Deutsche Schule Barcelona. Research in the Teaching of English,
47, 420–​441.
Yin, R.K. (2018). Case study research: Designs and methods (6th rev. ed.). London: Sage.

SECTION 4

The Role of Individual Differences


10
AGE-​RELATED DIFFERENCES
IN L2 WRITTEN PERFORMANCE
AND WRITTEN CORRECTIVE
FEEDBACK PROCESSING AND USE
Yvette Coyle and Julio Roca de Larios
University of Murcia

Introduction
The influence of age on the acquisition of a second (L2) or foreign (FL) language is an issue that
has attracted considerable theoretical and empirical attention. Research conducted in naturalistic
and instructed FL settings has produced different findings regarding the effects of maturational
constraints. In naturalistic settings, prolonged and unlimited exposure to the target language (TL)
has generally placed younger learners at an advantage over adults, while the opposite seems to be
true in formal contexts where older learners have consistently been found to be superior in a number
of areas, including writing (Torras, Navés, Celaya, & Pérez Vidal, 2006). However, educational
policy-​makers around the globe have not been dissuaded from embracing prevailing theoretical
assumptions regarding the alleged benefits of early language learning. As a result, instructed FL
learning has been made available to growing numbers of younger learners world-​wide.
At the same time, young school-​aged English language learners (ELLs) from different racial,
ethnic, and linguistic backgrounds are increasingly present in classrooms in the United States
and Australia. This changing scenario presents new challenges for L2 writing scholars who have
traditionally focused their research endeavors on college students in higher education contexts.
Although limited, research on L2 writing and written corrective feedback (WCF) with younger
participants is beginning to accumulate and offers an interesting complement to what is currently
known about the language learning potential of writing and WCF with adult learners. The present
chapter reviews studies carried out with learners of different ages in an attempt to shed light on age-​
related differences between L2 writers.

Historical Perspectives
Traditionally, it has been assumed that children are better L2 learners than adults on the grounds
that they acquire language more rapidly and reach higher levels of native-​like proficiency than older
learners. This alleged advantage is attributed to the widely held notion that there may be a critical
period in late childhood beyond which learning an additional language becomes more difficult and
ultimately less successful, due to either a loss in brain plasticity (Penfield & Roberts, 1959) or to the lateralization of language functions in the left hemisphere (Lenneberg, 1967). However, early
studies that investigated the "younger is better" hypothesis in naturalistic L2 settings produced evidence that challenged this notion, and suggested that adults and teenagers learn at a faster pace than children, largely due to their greater cognitive maturity, metalinguistic knowledge, and test-taking skills (Krashen, Scarcella, & Long, 1979; Snow & Hoefnagel-Höhle, 1978). Subsequent
research went on to claim that the rate advantage identified for adults declines after a few years, so
that children eventually catch up with and overtake older learners in terms of proficiency, especially
in phonological and morphosyntactic competence (Flege, 1987; Johnson & Newport, 1989).
Interestingly, research conducted in instructed foreign language contexts has produced entirely
different findings when comparing the achievements of early (younger) and late (older) starters.
Studies of instructed language learners have shown that (a) older learners learn faster than younger
learners, even after shorter periods of exposure to the TL, due to their superior cognitive skills
(Muñoz, 2006); (b) at best, long-​term effects are found for early starters only on some measures
of linguistic competence such as writing fluency (Torras et al., 2006), phonetic discrimination
(Fullana, 2006), and listening comprehension (Cenoz, 2002); and (c) younger learners have not
shown the overall high levels of linguistic attainment reached by immigrant children despite an
early start to FL learning.
Historically, however, SLA research into L2 writing has received less attention than studies
focusing on learners’ oral development or reading skills. At the same time, L2 writing research
has been more concerned with writers' social, personal, and academic literacy development than with language learning per se. More recently, the emergence of the so-called "writing to
learn” research strand (Manchón, 2011) has successfully raised the profile of L2 writing by making
explicit its potential for contributing to the development of L2 knowledge and skills in different
educational contexts and with learners of different ages. It is from this perspective that the studies
included below are contemplated.

Critical Issues and Topics


Against this background, four issues stand out as relevant to our discussion of age-​related differences
in relation to L2 writing and WCF processing and use. The first concerns the ways in which the
amount, intensity, and quality of learners’ exposure to the L2 at different ages and in different
instructional settings influence their written output. Exploring this line of inquiry is necessary, since
age-related differences and written performance have only rarely been addressed directly (Torras et al., 2006). As a result, insights into the performance of learners of diverse ages can be found in
studies which are primarily concerned with examining the effects of different types and lengths
of instruction on learners’ L2 written output and/​or general L2 competence. This research is not
examined from the perspective of the younger/​older debate but rather on the grounds that it can
indirectly contribute to our understanding of age-​related differences in L2 writing.
A second and more recent issue in which age has been considered is the impact of explicit
instruction on writing. It has long been assumed that children acquire the L2 naturally and
easily in naturalistic settings, and while it is true that they rapidly develop oral communication
skills that enable them to function well in informal social contexts, the same cannot be said
for the specialized language skills that are required to read and write successfully in academic
school disciplines (Schleppegrell, 2013). Young immigrant children have been found to struggle
with more formal or abstract aspects of the L2 and often display an “achievement gap in com-
parison with their native speaker peers” (p. 154). These realizations have prompted the research
addressed below.
The third issue concerns the role of individual cognitive and affective factors on writing.
As a cognitive phenomenon, L2 writing is subject to the dynamic cross-​linguistic influences of
learners' existing L1 metacognitive writing skills and their L2 knowledge, which differ across age groups and proficiency levels. Such influences are apparent, for example, in the dependence of the successful transfer of textual organization skills between languages on the degree of L2 grammatical competence shown by learners (Berman & Slobin, 1994), in the greater reliance of younger and less proficient children on lexical borrowings from the L1 in comparison to older learners (Navés, Miralpeix, & Celaya, 2005), and in the importance attached to the role of learners' linguistic knowledge in L2 writing performance (Schoonen, van Gelderen, Hulstijn, Simis, Snellings, & Stevenson, 2003).
The last area concerns learners’ engagement with WCF. Most research on WCF has been carried
out with university students or adults studying in L2 settings. Younger secondary and primary school
L2 and FL learners are still underrepresented in the field. Recent metanalyses of WCF research
support this claim. In a review of 22 quantitative WCF studies, Kang and Han (2015) include only
one study involving elementary school children and one other study with secondary school learners.
Similarly, in a synthesis of all the research on oral and written corrective feedback published in the last 25 years in
a specific academic journal, only one WCF study investigated primary school learners (Li & Vuono,
2019). In reporting the sampling features of 44 WCF studies, Liu and Brown (2015) also noted that
the most heavily researched populations were post-​secondary and adult learners, while younger
learners were under-​investigated. This is perhaps due to the fact that in FL settings, instruction with
younger learners is often focused on the provision of oral input and communicative language prac-
tice, and writing, although included in the curriculum, is generally afforded less attention. Recent
research taking age into consideration is addressed below.

Current Contributions and Research

The Impact of Age, Amount, and Intensity of Exposure to L2 Input on Learners' Written Output
Most research within this category has been carried out with adolescents, while examinations of
children’s written output still remain relatively scarce. In this respect, findings from the Barcelona
Age Factor (BAF) project (Muñoz, 2006) have been crucial in furthering our understanding of the
complex interrelationship between the initial age at which FL learning begins, the amount of overall
exposure to the FL learners receive, and the effects of both on learning rates and on the development
of different dimensions of writing competence by older and younger learners. In a series of longitu-
dinal studies spanning up to six years (1996–​2002), detailed analyses of various measures of lexical
and grammatical complexity, accuracy, and fluency (CAF) were performed on the texts of young
Spanish-​Catalan EFL learners considered as either early (aged 8) or late (aged 11) starters over
three time periods, when learners were 10/​12, 12/​14 and 16/​18 years old. The results showed that
with the same amount of instructional time, the older, later starters (LS) initially learned faster and
achieved higher levels of attainment in all CAF measures (Celaya, Torras, & Pérez Vidal, 2001),
although over time, the early starters (ES) seemed to catch up in fluency (Navés, Torras, & Celaya,
2003). Also, with fewer hours of instruction, the LS outperformed age-​matched ES on grammatical
accuracy and complexity but not on lexical complexity or fluency (Celaya et al., 2001). Finally,
diverse dimensions of these children’s FL writing developed at different rates and seemed to be
affected by age, with age 12 and beyond constituting a turning point in the production of more
syntactically and lexically complex texts (Torras et al., 2006). The authors attribute this to the
increased cognitive ability associated with the onset of puberty, together with a greater emphasis
on form-​focused instruction in FL classrooms, which could have enhanced learners’ metalinguistic
awareness and thus helped improve the formal features of their writing. In sum, research on L2
writing in an instructed FL context suggests that while age-​related differences diminish as learners
develop cognitively, a lasting advantage for older learners is ultimately maintained in L2 writing
competence.

In recent years, new research in FL instructional settings has produced evidence that begins
to challenge established findings in relation to the role of age in L2 writing. The introduc-
tion of content-​based language teaching (CLIL) in European schools has fueled research into
the impact of this instructional program on the development of learners’ L2 language skills,
including writing. Most research to date has examined the written performance of high school
students enrolled in either traditional EFL only or CLIL plus EFL instruction, while information
on younger children is limited. Merisuo-​Storm and Soinen (2014), however, found that young
nine-​year-​old Finnish children enrolled in CLIL groups, in which several subjects were taught
in English, developed significantly better literacy skills over a six-​year period than learners in
monolingual classes with EFL instruction. More recently, Steinlen (2018) assessed the pro-
gress made by nine and ten-​year-​old L1 and L2 German children in both languages after four
years in a partial immersion program. Standardized writing tests in both languages revealed the
learners’ age-​appropriate development in German and a surprisingly high A2 level in English
(CEFR, Council of Europe, 2001), an achievement usually attributed to older secondary school
leavers. Even though both of these studies focus on the development of age-​matched children,
the findings are still interesting since they suggest that increasing the amount, intensity and con-
tent of children’s target language (TL) exposure seems to accelerate the development of their L2
writing skills, even in the absence of systematic writing instruction.
As for adolescents, several studies have compared the written performance of high school
learners enrolled in EFL and CLIL programs using holistic and/​or CAF measures. Ruiz de Zarobe
(2010) found that two groups of Basque-​ Spanish bilinguals participating in CLIL programs
of differing intensity outperformed peers in less intensive EFL classes on a number of written
measures including textual organization, language use, mechanics, and to a significant extent, on
vocabulary and content. Importantly, the younger CLIL group (aged 14) scored higher than older
learners in an EFL group (aged 17) when instructional time was held constant. Lasagabaster (2008)
also found that a group of CLIL learners (aged 14–​15) outperformed older non-​CLIL peers on an L2
writing test. Both of these studies suggest a positive relationship between the amount and intensity
of exposure to the FL and written performance, which appear to enhance the rate of development in
younger learners allowing them to surpass their older peers in L2 writing.
Subsequent research with high school learners in Spain has produced mixed results when ana-
lyzing the effects of exposure on specific measures of writing performance. In a study of the long-​
term writing development of two groups of 13-​year-​old students in CLIL and non-​CLIL instruction
over a three-​year period, Gené-​Gil, Juan-​Garau, and Salazar-​Noguera (2015) found that, over time,
the CLIL learners produced progressively more complex, accurate, and longer texts, and achieved
significantly higher scores on most of the CAF measures employed, while the EFL group made
significant progress only in lexical complexity and accuracy. Intergroup comparisons showed that
when instructional time was matched, the younger CLIL group generally outperformed the older
EFL group on almost all writing measures, although not significantly, while the latter made greater
progress in lexical complexity.
Roquet and Pérez Vidal (2015) compared the written performance of 13–​14 and 14–​15-​year-​olds
with similar accumulated L2 exposure time, using a combination of CAF and holistic measures,
on commencing and completing one academic year of either traditional EFL or CLIL instruction.
Although the texts of the slightly younger CLIL learners were judged to be lexically richer, better
structured, and more focused than those of their EFL peers, significant differences in favor of the
CLIL group (in contrast to Gené-​Gil et al., 2015) were only found in terms of accuracy. However,
in a subsequent study of older and younger learners in time-matched EFL and CLIL programs, it
was the older EFL learners who performed better than the younger CLIL learners in accuracy and
complexity. In a second group of age-​matched learners, CLIL students with more instructional
time outperformed EFL learners in all qualitative dimensions of writing competence as well as in lexical richness (Artieda, Roquet, & Nicolás Conesa, 2017). In two further studies, Lahuerta (2017a, 2017b) reported that analyses of specific accuracy and complexity measures in the L2
writing of Year 3 (14–​15 years) and Year 4 (15–​16 years) high school CLIL and EFL classes showed
that only the CLIL groups made significant gains in accuracy across grades (and age). Additionally,
the older Year 4 learners in both groups also scored higher on overall text quality and syntactic
complexity than their younger counterparts. Across-group comparisons revealed that third-year CLIL learners outperformed their peers and older non-CLIL learners in text quality and syntactic
complexity. Taken together, the accumulated findings of this research suggest that despite diver-
gence in specific measures of L2 writing, there seems to be a tendency for younger learners in CLIL
programs to outperform their older peers in L2 writing performance, at least holistically.
Additional evidence in support of the benefits of intensive L2 exposure on writing performance
comes from research on academic discourse development, dynamic systems theory, and study abroad
programs. McCabe and Whittaker (2016) and Whittaker, Llinares, and McCabe (2011) carried out
longitudinal analyses of the history essays of 12–​16 year olds enrolled in CLIL classrooms in
Spanish high schools over a period of four years. Results revealed gradual improvements in the
learners’ control of nominal group complexity, appraisal of historical events and figures, and an
increase in lexical variety and accuracy. Verspoor and Smikova (2012) compared the dynamic
development of linguistic chunks in the written texts of two Dutch high school learners selected
from a low input EFL and high input CLIL group over a two-​year period. Although variability was
common to both learners, the high input learner showed a clearer developmental trajectory in the
number and types of chunks which emerged in her writing, with peaks of chunk use, which tended
to stabilize over time, while the low input learner showed random development and limited use of
linguistic chunks. Limited exposure to L2 input and fewer opportunities for L2 use in the traditional
instructional context are likely to have contributed to this learner's weaker written performance.
While both studies highlight the importance of the intensity of L2 exposure for writing devel-
opment in instructional settings, neither focuses expressly on age-​related differences. Llanes and
Muñoz (2013), however, examined the impact of immersion in a naturalistic L2 environment on
the writing skills of older and younger FL learners in comparison to peers who remained at home.
Results showed that the children outperformed the adults in oral performance, and the stay-​at-​
home children in both oral and writing skills, although a lack of writing practice in the natural-
istic environment placed the adults who remained at home at an advantage in writing fluency and
complexity. The findings of this study support those of the BAF research, which found that older
learners outperformed younger learners in L2 writing in school settings. They differ, however, from
the instructed CLIL research described above whereby younger high school learners tended to be
overall better at writing than their older non-​CLIL peers. It should be noted, however, that the age
gap between the children and adults in the study abroad study (at least ten years) and in the BAF research
(up to eight years) is much wider than that of the participants in the CLIL studies, who were all
young adolescents, and whose cognitive maturity was likely to be more similar. This might mitigate
the possible effects of age on writing performance and place greater onus on the intensity and nature
of instruction in enhancing writing development in formal settings.
Collectively considered, the findings of this research strand suggest a positive interaction
between the amount, intensity (and presumably quality) of TL exposure and L2 written perform-
ance, all of which seem decisive in contributing to written language development in formal instruc-
tional contexts. The combination of increased TL exposure together with instruction that integrates
content and language seems to place younger learners at a slight advantage over older learners when
age differences are small. It is interesting that the clearest gains reported in research to date were
those made by the primary school children in Steinlen’s study (2018) whose partial immersion in
the TL (50% of teaching time) was more intense than that of the high school CLIL learners reported above (three or four additional hours per week). This might be attributed to implicit language learning mechanisms which are triggered when children are exposed to sufficient amounts of
L2 input, and when they are afforded ample opportunities for writing practice. Although further
research is required, the evidence gathered so far seems to point to the potential of early content-​
based immersion programs, which combine increased exposure time and subject teaching in the L2,
in enhancing young learners’ writing development in instructed FL settings.

The Role of Explicit Writing Instruction in Enhancing Performance


A growing body of classroom-​based research focusing on the written performance of young
English language learners (ELL) in the United States has provided evidence to show that
becoming a competent writer does not follow effortlessly from exposure alone. Cross-sectional studies
have identified the linguistic and structural features of children’s writing across grades in a var-
iety of genres including science explanations (Avalos, Gómez Zisselberger, Gort, & Secada,
2017), written reports (Brisk, Hodgson-​Drysdale, & O’Connor, 2011) and literary character
analysis (Moore & Schleppegrell, 2014). Case studies of individual learners have also shown
how explicit instruction in the linguistic and structural features of textual genres can help chil-
dren improve their written performance. De Oliveira and Lan (2014) reported on changes in the
ability of a young 4th grade learner to produce a scientific procedural recount following genre-​
based instruction. Chambliss, Christensen, and Parker (2003) showed how combined science and
genre pedagogy fostered 4th graders’ written scientific explanations. Harman (2013) found that
raising children’s awareness of the intertextual and linguistic characteristics of narrative texts
helped support their academic literacy development. Importantly, this research suggests that "the younger the better" interpretations arising from theoretical positions on age-related concerns do
not have immediate implications in instructed ESL settings when it comes to writing, which is
a difficult process for children, even in their first language.
Genre-​based writing instruction has also been found to enhance the writing performance of ado-
lescent and adult learners of German (Byrnes, 2009) and English (Bunch & Willett, 2013; Lee, 2016; Liardét, 2016; Yasuda, 2011) in both ESL and FL contexts. Bunch and Willett (2013) reported
on a genre-​informed intervention with 7th graders in a US high school, in which learners success-
fully integrated information from multiple sources into a written history assignment. Lee (2016)
found that explicit grammar instruction in the processes and structure of narrative texts signifi-
cantly improved the story writing performance of ESL high school learners in Hong Kong. Byrnes
(2009) explored the increasingly sophisticated use of grammatical metaphor in the writing samples
of adult learners following a college German FL program. Yasuda (2011) examined the relation-
ship between writing competence, linguistic knowledge, and genre awareness with Japanese EFL
college students on an email writing course and reported improvements in their knowledge and
productive use of genre-​related linguistic and rhetorical choices. Liardét (2016) examined the
expository texts of Chinese ESL learners enrolled in an instructed genre-​based academic writing
course and identified different developmental trajectories that included instances of incomplete or
erroneous use of grammatical metaphor, indicating the students’ ongoing appropriation of this lin-
guistic resource. Contemporary L2 genre-​based research, then, has shown that where L2 writing is
concerned, learners of all ages, regardless of the learning context, need to be taught to pay attention
to the ways in which language works in texts in order to develop the writing skills necessary to
communicate meanings successfully in academic contexts.

Individual Cognitive and Affective Influences on L2 Writing Development


Evidence from cognitively-​oriented research has provided insights into the role of cognitive and lin-
guistic maturity in L2 writing performance. Schoonen et al. (2003) examined the interrelationship between the cognitive and linguistic dimensions of the writing process with various cohorts of Dutch
high school learners (aged 13–​14). The authors suggest that general metacognitive knowledge and
writing expertise contribute differentially to writing performance in the L1 and L2, since L2 writing is
critically conditioned by the learner’s linguistic knowledge (lexical and grammatical knowledge and
fluency). This led them to propose the inhibition hypothesis, which holds that the linguistic demands
of L2 writing may limit learners’ attention to the conceptual and textual features of written production.
Although unequivocal support for this prediction has yet to be found, the idea coincides with
research by Roca de Larios, Manchón, Murphy, and Marin (2008) who studied the cognitive writing
processes of high school (aged 16–​17), undergraduate (aged 19–​20), and graduate (aged 23–​24)
EFL writers while planning, formulating, and revising L1 and L2 texts. In their analysis of the tem-
poral distribution of writing processes, formulation was found to be the dominant process in L1 and
L2 writing across proficiency levels, especially for the less competent L2 adolescent writers who
frequently struggled to compensate for gaps in their L2 knowledge. In comparison, more proficient
(older) writers focused on higher level ideational and textual concerns in an effort to upgrade the
quality of their essays. As writers gained in FL expertise, they were better able to diversify and
balance their attention across the writing task by focusing recursively on planning and revising
processes as well as on composing. More recently, Michel, Kormos, Brunfaut, and Ratajczak
(2019) examined the role of working memory (WM) and written performance with young 11–​14-​
year-​old Hungarian learners. These authors also found that older learners with high WM functions
performed better on a complex listen and write task involving the coordination of simultaneous
cognitive operations. There is growing evidence to suggest, then, that with increased cognitive
maturity and linguistic competence, older learners progressively gain control over the cognitive
processes which enhance their written performance.
L2 writing has also been contemplated from the perspective of a theory of multicompetence by
exploring how the knowledge and practice learners gain from their experiences in the L1 and L2
merge to create a dynamic repertoire which they draw on for text production. Kobayashi and Rinnert
(2013) investigated the comprehensive development of a multilingual writer in three languages
(Japanese, English, and Chinese) and described the ways in which her personal history, attitudes,
and cultural identity progressively influenced her composing processes and text production across
languages. In a study comparing the writing skills of young bilingual Korean/​American children
(aged 5 to 8) with those of high achieving, age-​matched English-​dominant children, Bae (2007)
found that after three years in a two-​way immersion program, the bilingual children had developed
comparable writing skills to those of their monolingual peers. This led the author to suggest that the
children’s L1 and L2 writing skills and knowledge seem to be mutually dependent, presumably as a
consequence of a common underlying proficiency (Cummins, 1979). With older children, Lindgren
and Stevenson (2013) also revealed how some of the interactional resources used by 11-​year-​old
Swedish learners in a letter-​writing task were not directly connected to their L1. Instead, evidence
of the children’s non-​language specific writing knowledge showed up in their written production.
In this sense, the study expands on the notion of a hybrid L1-​L2 knowledge system by suggesting
that during writing performance, young novice writers may rely not only on the merged competence
acquired from their knowledge of different languages, but on unique, strategic knowledge that is not
directly linked to either language.
Motivational variables have also been linked to L2 writing performance in children and adults. In
a study with beginner writers, Chang, Chang, and Hsu (2008) described the growing achievements
of young six-​year-​old Taiwanese children whose emergent writing was accompanied by enhanced
positive attitudes, showing that even at initial stages of L2 learning, motivation can foster progress
in writing performance. Similarly, Lo and Hyland (2007) found that the introduction of a new
writing program to 10–​11-​year-​old L2 learners in an elementary classroom in Hong Kong enhanced
the children’s engagement with writing and encouraged them to produce longer and conceptually
better texts. The role of motivational factors with older learners was corroborated by Sasaki (2009,
2011) in a series of studies investigating the impact of varying lengths of study abroad on the writing
performance of Japanese EFL students. Sasaki found that learners’ L2 writing scores improved in
accordance with the duration of their overseas stays, which, in turn, fostered their intrinsic motiv-
ation to actively preserve their improved writing performance. Serrano, Tragant, and Llanes (2012)
also highlighted the importance of affective factors in influencing the written performance of
Spanish EFL undergraduates on a study abroad program, since the students only began to improve
their writing performance after at least one semester abroad, and often in connection with positive
attitudes towards English speakers. In addition, then, to the learner-​internal cognitive factors influ-
encing L2 writing, affective variables also play an important role in shaping how older and younger
learners respond to the opportunities they are given both inside and outside the classroom.

Learners’ Engagement with Written Corrective Feedback


A great deal of research on WCF has focused on measuring the accuracy of writers’ texts after
receiving different types of feedback. Findings from studies that selectively targeted discrete gram-
matical structures (e.g., Frear & Chiu, 2015; Shintani, Ellis, & Suzuki, 2014; Suzuki, Nassaji, &
Sato, 2019), as well as those providing comprehensive feedback on a wider range of linguistic
errors (Hartshorn & Evans, 2015; Van Beuningen, de Jong, & Kuiken, 2012) have shown that
improvements in accuracy are often sustained in new pieces of writing. However, a recent study
comparing the effects of direct WCF and metalinguistic explanations on the acquisition of the pre-
sent perfect tense by 9–​11-​year-​old child L2 learners failed to identify any improvements in their
accurate use of the structure (Gorman & Ellis, 2019). The authors suggest that this may be due to
children’s tendency to prioritize meaning over form during communicative tasks. This accords well
with Skehan and Foster’s (2001) limited capacity model, which predicts that younger learners,
unlike adults, may experience attentional constraints during cognitively complex tasks (e.g., the
dictogloss used in this particular study) which occupy their memory resources and draw attention
away from linguistic forms as they focus more on message content.
Process-​oriented WCF studies have focused specifically on the nature of learners’ cognitive
engagement with feedback in an attempt to ascertain how internal processes such as noticing and
metalinguistic awareness might determine the effects of WCF on written output. The upshot of
this research has underscored the critical role played by the quality of learners’ noticing, that is,
whether or not the WCF is fully noticed and understood, in determining uptake. Consequently,
factors related to faulty noticing, a lack of L2 metalinguistic knowledge or even the type of feed-
back provided to learners, can lead to errors going uncorrected or feedback being ignored when
learners fail to recognize or understand the cause of the error or lack the knowledge to solve it
(Cerezo, Manchón, & Nicolás-Conesa, 2019; Uscinski, 2017; Zheng & Yu, 2018). These difficulties have proved to be even more pronounced when it comes to younger learners. Simard, Guénette, and Bergeron (2015) collected questionnaire data on the responses of French immersion students
to direct and indirect feedback and found that the children often misunderstood the intent of the
teacher’s corrections. Coyle and Roca de Larios (2014) suggested that the young EFL primary
school learners in their study misinterpreted linguistic elements that they had noticed but had only
partially understood from a model text, leading them to reproduce their faulty noticing in subse-
quent, ungrammatical language production. Recent research on WCF processing with children has
drawn attention to these important age-​related differences in the processing behavior of younger
EFL learners. In describing the feedback trajectories followed by 10-​and 11-​year-​olds using model
texts, Coyle, Cánovas-​Guirao, and Roca de Larios (2018) highlighted the need to report in more
precise and age-​related ways exactly how children approach the analysis and use of feedback, by
moving beyond the clear-​cut dichotomies used in studies with older learners. As a result, the authors
expanded on the twofold distinctions generally made in adult research between learners’ noticing
(or not) and understanding (or not) to provide a more comprehensive description of children’s
noticing, partial noticing, and unreported noticing from a model text, as well as the range of pos-
sible outcomes that WCF processing might have on their revised texts (deletion, repetition, partial,
or full incorporation). In doing so, they demonstrated that research agendas and tools designed
exclusively for adults cannot simply be transposed onto research with younger learners whose spe-
cific age-​related characteristics require their own unique approaches.
Alongside cognitive factors, Han and Hyland (2015) suggested that an understanding of the
complexity of learners’ responses to feedback can only be achieved through a consideration of their
affective engagement with the feedback received. In a case study of four adult Chinese EFL learners,
the authors identified diverse attitudinal profiles and showed how the learners’ individual learning
goals (e.g., conversational use of the L2, instrumental need to pass a test, etc.), self-​efficacy beliefs
and personal views and preferences impacted on their attitudes and willingness to engage with the
feedback. Similar results were reported in two further studies investigating the emotional responses
of advanced Chinese EFL students to computer-​generated automated writing evaluation (Zhang,
2017; Zhang & Hyland, 2018). Differences in the learners’ attitudes towards the corrections made
to their writing, their strategic reactions and subsequent textual revisions materialized into different
levels of engagement, which, together with L2 proficiency, impacted their writing performance.
Mahfoodh (2017) also analyzed the emotional reactions to WCF of adult EFL students on a univer-
sity writing course. Recurrent emotions, including acceptance or rejection of feedback, disappoint-
ment, frustration, happiness, satisfaction, etc. were associated both with the type and tone of the
teachers’ feedback and with the success or not of the learners’ revisions.
In an early study of younger learners, Fazio (2001) examined the effect of different feedback
conditions on the journal writing accuracy of fifth grade ESL learners in the United States. The
failure to identify accuracy gains in any of the conditions was partially explained by the children’s
unresponsive attitudes towards the provision of feedback by a teacher other than their regular class-
room teachers, as well as their inattention to the corrections. With slightly older children, Tang
and Liu (2018) found that while indirect coded feedback, provided either with or without affective
comments from the teacher, was equally useful at improving the short-​term writing performance
of a group of Chinese EFL 7th graders, the inclusion of affective comments seemed to motivate
and encourage them to invest greater mental effort when processing the error codes. In contrast, García Mayo and Loidi Labandibar (2017) reported that adolescent Basque-Spanish bilingual
learners aged between 13 and 16 had mostly negative attitudes towards the use of model texts as a
WCF technique. The lack of enthusiasm about models was found to be related not so much to the
technique itself, but to additional factors such as the learners’ lack of interest in L2 writing, their
self-​efficacy beliefs, and their prior expectations regarding WCF provision shaped by previous
learning experience. Affective engagement with WCF would, therefore, seem to be a powerful
influence on successful writing performance.

Main Research Methods


The research described above has used a range of designs and analytical tools to explore the
language learning potential of writing and WCF in both FL and L2 settings. Large-​scale cross-​
sectional studies in naturalistic and formal settings have unveiled differences in learners’ written
performance as a function of their age, proficiency levels, cognitive processing behaviors or par-
ticipation in instructional programs. Several longitudinal studies have compared learners’ written
output after varying periods of time, from six months to up to four years in study abroad, high
school and university contexts. These studies have uncovered patterns of non-​linear develop-
ment in L2 writers which are otherwise inaccessible in short-​term experimental studies. Other
research, motivated by dynamic systems and multicompetence theories or genre-​based pedagogy,
has tracked the writing progress of learners of different ages using qualitative case study or mixed
method designs. These small case studies are important in providing rich profiles of particular
writers in different contexts.
Information on learners’ written performance and WCF processing has been gathered in pen and
paper and computer-​mediated environments and in individual and collaborative writing conditions.
Procedures have included text analysis, both in-​depth linguistic analyses, and examination of per-
formance outcomes using CAF and/​or holistic measures, frequently supplemented with additional
sources of information such as observations, interviews, or questionnaires. Systemic functional
linguistics (SFL) has also been adopted to examine writing performance, enabling researchers to
explore the mutual influence of learners’ use of lexicogrammatical features and the development of
textual quality.
Writing and WCF processing data have been obtained via think-​aloud and collaborative dia-
logue protocols, computerized tracking, digital screen capture, eye-​tracking, key-​stroke logging,
or stimulated recall. While these methodological tools are not without their limitations, they have
allowed researchers to access learners’ cognitive activity to differing degrees, thus deepening our
knowledge of the challenges learners face during the writing process. As for tasks, with the exception
of genre-​based research, studies with younger learners have opted for simple picture story narratives
or personal recounts, which necessarily limit the nature of the content elicited. Time-​compressed
writing tasks, mostly personal opinion and argumentative essays, have been widely used in research
with high school learners and adults, allowing for greater freedom of expression but requiring, in
addition, the coherent organization of ideas and opposing views at the level of discourse. Timing
has also varied widely across studies. Research involving children has either allowed unlimited time
(e.g., Coyle et al., 2018) or between 10 and 15 minutes for task completion (e.g., Torras et al., 2006).
In time-​compressed writing, participants have been allocated from 20 up to 50 minutes to complete
their assignments. Evidently, the longer learners have to write, the more likely they are to engage in
planning, formulation, and text revision, which could affect the quality of their writing. This makes
comparisons across studies difficult. Further exploration of these intervening factors, together with
more ecologically valid writing tasks integrated within curricular frameworks, would undoubtedly
shed additional light on the complexities of “writing to learn” in different instructional contexts.
The multiple perspectives on writing and WCF processes investigated in research to date have
been instrumental in shaping current knowledge on how language develops through writing. To
continue to do so, both longitudinal classroom-​based and controlled interventionist studies are
needed to explore and test the affordances of different methodological tools, with tasks of varying
complexity and with a wider range of participants and languages.

Recommendations for Practice


Findings from research on age-​related differences can serve as a useful starting point to inform
classroom practice in educational settings. The importance of the amount and intensity of the input learners are exposed to, together with the optimistic findings beginning to emerge from content-based instruction, highlights the potential of the latter in providing the type of significant exposure necessary for enhancing L2 writing performance. Furthermore, insights obtained from research into writing instruction in school and college settings in the United States suggest that genre-oriented writing
pedagogy might also be profitably extended to other FL learning contexts at primary, secondary, and
tertiary levels, possibly in combination with CLIL instruction. Helping learners to become successful
writers of different genres is a complex process and will require teachers to develop expertise in
identifying the linguistic and structural characteristics of different genres (Schleppegrell, 2013).
Hence, there is also a need to assist teachers in supporting students’ writing development. Such
training might form part of initial and in-​service teacher education courses. Finally, the importance
of affective responses to L2 writing and WCF cannot be neglected. Educators will need to find
new ways to actively foster learners’ engagement with writing and feedback tasks. In this sense,
they might take advantage of technological resources, which have progressively changed the way
we communicate within and beyond the language classroom. Learners of all ages might use web-​
based tools such as blogs, wikis, email, and digital platforms, as well as popular sites like Twitter
or Facebook to interact with others and to share or showcase their work. The use of collaborative
writing tasks in online text chat, or the provision of electronically provided and multimodal feed-
back might also help maintain high levels of motivation with older and younger learners.

Future Research Directions


A number of directions for future research emerge from this review of the age-​related differences
in L2 written performance and WCF processing and use. The first and most immediate concern
stems from the imbalance in the populations studied in the research strands outlined above. The
lack of research on the written performance of primary school learners is in direct contradiction
to the expansion of mandatory early language learning programs around the world. Apart from a
few isolated studies, learners under the age of 13 have largely been ignored. An important area in
need of attention is the exploration of children’s long-​term writing performance and processing and
appropriation of WCF. Specifically, comparisons of older and younger learners with similar and
differing amounts and intensity of L2 exposure time might provide new age-​related evidence on
L2 writing performance. More detailed information on the nature of CLIL interventions might also
provide insights into the potential effects of qualitative differences in the input learners are exposed
to. At the same time, research on the written performance of younger learners would certainly
benefit from the use of measures specifically tailored to match children’s limited ability, in order to
identify even minor developments in their writing, both linguistically and strategically.
As regards writing instruction, research on genre-​based writing with young learners in FL settings
is underdeveloped as an area of inquiry. Insights obtained from current research in US schools
highlight collaboration between classroom teachers and researchers as a worthwhile avenue to
explore. Educating teachers to successfully guide learners of different ages and proficiency levels in
the writing of curriculum and academic genres would also make for interesting analysis. Likewise,
genre-​based intervention studies in FL classrooms, particularly over longer periods of time, might
reveal suitable pedagogical strategies for teaching younger and older writers. Dynamic systems
theory research could also be expanded to examine the variability in different writing skills of
learners of different ages. The careful tracking of emerging patterns in the writing and use of WCF
by younger and older learners in combination with analyses of cognitive, proficiency-​related,
affective and instructional variables, might enable researchers to identify developmental profiles to
account for dissimilarities within and between individuals or groups of learners.
Sites for examining L2 writing and WCF processing could also be expanded to incorporate
new forms of literacy such as multimodal texts and social media or gaming applications, both
in synchronous and asynchronous face-​to-​face and computer-​mediated communication. Similarly,
age-​related differences in learners’ manipulation of textual and online source material for written
assignments and WCF processing might also be examined from a number of perspectives (cog-
nitive, affective, collaborative), together with studies exploring instructional techniques for the
effective incorporation of sources into academic assignments. A future research agenda should aim
to gather in-​depth, grounded evidence on L2 writing performance and WCF processing and use
across a wider range of age groups in a variety of languages and settings.

References
Artieda, G., Roquet, H., & Nicolás-​Conesa, F. (2017). The impact of age and exposure on EFL achievement in
two learning contexts: Formal instruction and formal instruction+content and language integrated learning
(CLIL). International Journal of Bilingual Education and Bilingualism, 20, 1–​24.
Avalos, M.A., Secada, W.G., Zisselsberger, M.G., & Gort, M. (2017). “Hey! Today I will tell you about the
water cycle!”: Variations of language and organizational features in third-​grade science explanation writing.
The Elementary School Journal, 118(1), 149–​176.
Bae, J. (2007). Development of English skills need not suffer as a result of immersion: Grades 1 and 2 writing
assessment in a Korean/​English two-​way immersion program. Language Learning, 57(2), 299–​332.
Berman, R.A., & Slobin, D.I. (1994). Relating events in narrative: A crosslinguistic developmental study.
Hillsdale, NJ: Lawrence Erlbaum.
Brisk, M.E., Hodgson-​Drysdale, T., & O’Connor, C. (2011). A study of a collaborative instructional pro-
ject informed by systemic functional linguistic theory: Report writing in elementary grades. Journal of
Education, 191(1), 1–​12.
Bunch, G.C., & Willett, K. (2013). Writing to mean in middle school: Understanding how second language
writers negotiate textually rich content-​area instruction. Journal of Second Language Writing, 22(2),
141–​160.
Byrnes, H. (2009). Emergent L2 German writing ability in a curricular context: A longitudinal study of gram-
matical metaphor. Linguistics and Education, 20, 50–​66.
Celaya, M.L., Torras, M.R., & Vidal, C.P. (2001). Short and mid-​term effects of an earlier start: An analysis of
EFL written production. In S. Foster-​Cohen & A. Nizegorodcrew (Eds.), EUROSLA Yearbook (Vol. 1) (pp.
195–209). Amsterdam: John Benjamins.
Cenoz, J. (2002). Age differences in foreign language learning. ITL-​ International Journal of Applied
Linguistics, 135(1), 125–​142.
Cerezo, L., Manchón, R.M., & Nicolás-​Conesa, F. (2019). What do learners notice while processing written
corrective feedback? A look at depth of processing via written languaging. In R. Leow (Ed.), SLR handbook
of classroom learning: Processing and processes (pp. 173–179). London: Routledge.
Chambliss, M.J., Christenson, L.A., & Parker, C. (2003). Fourth graders composing scientific explanations
about the effects of pollutants: Writing to understand. Written Communication, 20, 426–​454.
Chang, F., Chang, S., & Hsu, H. (2008). Writing activities as stimuli for integrating the four language skills in
EFL grade-​one classes in Taiwan. English Teaching and Learning, 32(3), 115–​154.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching,
assessment. Cambridge: Cambridge University Press.
Coyle, Y., & Roca de Larios, J. (2014). Exploring the role played by error correction and models on children’s
reported noticing and output production in a L2 writing task. Studies in Second Language Acquisition,
36(3), 451–​485.
Coyle, Y., Cánovas Guirao, J., & Roca de Larios, J. (2018). Identifying the trajectories of young EFL learners
across multi-​stage writing and feedback processing tasks with model texts. Journal of Second Language
Writing, 42, 25–​43.
Cummins, J. (1979). Cognitive/​academic language proficiency, linguistic interdependence, the optimum age
question and some other matters. Working Papers on Bilingualism 19, 121–​129.
De Oliveira, L.C., & Lan, S. (2014). Writing science in an upper elementary classroom: A genre-​based approach
to teaching English language learners. Journal of Second Language Writing, 25, 23–​39.
Fazio, L. (2001). The effect of corrections and commentaries on the journal writing accuracy of minority-​and
majority-​language students. Journal of Second Language Writing, 10(4), 235–​249.
Flege, J.E. (1987). A critical period for learning to pronounce foreign languages? Applied Linguistics, 8,
162–​177.
Frear, D., & Chiu, Y.H. (2015). The effect of focused and unfocused indirect written corrective feedback on
EFL learners’ accuracy in new pieces of writing. System, 53, 24–​34.
Fullana, N. (2006). The development of English (FL) perception and production skills: Starting age
and exposure effects. In C. Muñoz (Ed.), Age and the rate of foreign language learning (pp. 41–64). Clevedon: Multilingual Matters.
García Mayo, M.P., & Loidi Labandibar, U. (2017). The use of models as written corrective feedback in EFL
writing. Annual Review of Applied Linguistics, 37, 110–​127.
Gené-​Gil, M., Juan-​Garau, M., & Salazar-​Noguera, J. (2015). Development of EFL writing over three years
in secondary education: CLIL and non-​CLIL settings. The Language Learning Journal, 43(3), 286–​303.
Gorman, M., & Ellis, R. (2019). The relative effects of metalinguistic explanation and direct written corrective
feedback on children’s grammatical accuracy in new writing. Language Teaching for Young Learners,
1(1), 57–​81.
Han, Y., & Hyland, F. (2015). Exploring learner engagement with written corrective feedback in a Chinese
tertiary EFL classroom. Journal of Second Language Writing, 30, 31–​44.
Harman, R. (2013). Literary intertextuality in genre-​based pedagogies: Building lexical cohesion in fifth-​grade
L2 writing. Journal of Second Language Writing, 22(2), 125–​140.
Hartshorn, K.J., & Evans, N.W. (2015). The effects of dynamic written corrective feedback: A 30-​week study.
Journal of Response to Writing, 1(2), 6–​34.
Johnson, J., & Newport, E. (1989). Critical period effects in second language learning: The influence of matur-
ational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–​99.
Kang, E., & Han, Z. (2015). The efficacy of written corrective feedback in improving L2 written accuracy: A
meta-​analysis. The Modern Language Journal, 99, 1–​18.
Kobayashi, H., & Rinnert, C. (2013). L1/​L2/​L3 writing development: Longitudinal case study of a Japanese
multicompetent writer. Journal of Second Language Writing, 22(1), 4–​33.
Krashen, S., Long, M.H., & Scarcella, R. (1979). Accounting for child–adult differences in second language
rate and attainment. TESOL Quarterly, 13, 573–​582.
Lahuerta, A. (2017a). Analysis of accuracy in the writing of EFL students enrolled on CLIL and non-CLIL
programmes: The impact of grade and gender. The Language Learning Journal, 1–​12. doi:10.1080/​
09571736.2017.1303745.
Lahuerta Martínez, A.C. (2017b). Syntactic complexity in secondary-​level English writing: Differences among
writers enrolled on bilingual and non-​bilingual programmes. Porta Linguarum, 28, 67–​80.
Lasagabaster, D. (2008). Foreign language competence in content and language integrated courses. The Open
Applied Linguistics Journal, 1, 30–​41.
Lee, J.F. (2016). Enriching short stories through processes–​A functional approach. System, 58, 112–​126.
Lenneberg, E.H. (1967). Biological foundations of language. New York: Wiley.
Li, S., & Vuono, A. (2019). Twenty-five years of research on oral and written corrective feedback in System.
System, 84, 93–​109.
Liardét, C.L. (2016). Nominalization and grammatical metaphor: Elaborating the theory. English for Specific
Purposes, 44, 16–​29.
Lindgren, E., & Stevenson, M. (2013). Interactional resources in the letters of young writers in Swedish and
English. Journal of Second Language Writing, 22(4), 390–​405.
Liu, Q., & Brown, D. (2015). Methodological synthesis of research on the effectiveness of corrective feedback
in L2 writing. Journal of Second Language Writing, 30, 66–​81.
Llanes, À., & Muñoz, C. (2013). Age effects in a study abroad context: Children and adults studying abroad
and at home. Language Learning, 63(1), 63–​90.
Lo, J., & Hyland, F. (2007). Enhancing students’ engagement and motivation in writing: The case of primary
students in Hong Kong. Journal of Second Language Writing, 16, 219–​237.
Mahfoodh, O.H.A. (2017). “I feel disappointed”: EFL university students’ emotional responses towards teacher
written feedback. Assessing Writing, 31, 53–​72.
Manchón, R. (2011). Writing to learn the language. In R. Manchón (Ed.), Learning-​to-​write and writing-​to-​
learn in an additional language (pp. 61–82). Amsterdam: John Benjamins.
McCabe, A., & Whittaker, R. (2016). Genre and appraisal in CLIL history texts: Developing the voice of the
historian. In A. Llinares, & T. Morton (Eds.), Applied linguistics perspectives on CLIL (pp. 105–​124).
Amsterdam: John Benjamins.
Merisuo-​Storm, T., & Soininen, M. (2014). Students’ first language skills after six years in bilingual education.
Mediterranean Journal of Social Sciences, 5(22), 72.
Michel, M., Kormos, J., Brunfaut, T., & Ratajczak, M. (2019). The role of working memory in young second
language learners’ written performances. Journal of Second Language Writing, 45, 31–​45.
Moore, J., & Schleppegrell, M. (2014). Using a functional linguistics metalanguage to support academic lan-
guage development in the English Language Arts. Linguistics and Education, 26, 92–​105.
Muñoz, C. (Ed.). (2006). Age and the rate of foreign language learning. Clevedon: Multilingual Matters.
Navés, T., Miralpeix, I., & Celaya, M.L. (2005). Who transfers more …. and what? Crosslinguistic influence
in relation to school grade and language dominance in EFL. International Journal of Multilingualism,
2(2), 113–134.
Navés, T., Torras, M.R., & Celaya, M.L. (2003). Long-​term effects of an earlier start: An analysis of EFL written
production. In S. Foster-​Cohen & S. Pekarek (Eds.), EUROSLA Yearbook, 3(1), 103–​129. Amsterdam: John
Benjamins.
Penfield, W., & Roberts, L. (1959). Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
Roca de Larios, J., Manchón, R.M., Murphy, L., & Marín, J. (2008). The foreign language writer’s strategic
behaviour in the allocation of time to writing processes. Journal of Second Language Writing, 17, 30–​47.
Roquet, H., & Pérez-​Vidal, C. (2015). Do productive skills improve in content and language integrated learning
contexts? The case of writing. Applied Linguistics, 38(4), 489–​511.
Ruiz de Zarobe, Y. (2010). Written production and CLIL: An empirical study. In C. Dalton Puffer, T. Nikula,
& U. Smit (Eds.), Language use and language learning in CLIL classrooms (pp. 191–​210). AALS Series.
Amsterdam: John Benjamins.
Sasaki, M. (2009). Changes in EFL students’ writing over 3.5 years: A socio-​cognitive account. In R.M.
Manchón (Ed.), Writing in foreign language contexts: Learning, teaching, and researching (pp. 49–76).
Clevedon: Multilingual Matters.
Sasaki, M. (2011). Effects of varying lengths of study-​abroad experiences on Japanese EFL students’ L2
writing ability and motivation: A longitudinal study. TESOL Quarterly, 45(1), 81–​105.
Schleppegrell, M.J. (2013). The role of metalanguage in supporting academic language development. Language
Learning, 63, 153–​170.
Schoonen, R., Gelderen, A.V., Glopper, K.D., Hulstijn, J., Simis, A., Snellings, P., & Stevenson, M. (2003).
First language and second language writing: The role of linguistic knowledge, speed of processing, and
metacognitive knowledge. Language Learning, 53(1), 165–​202.
Serrano, R., Tragant, E., & Llanes, À. (2012). A longitudinal analysis of the effects of one year abroad.
Canadian Modern Language Review, 68(2), 138–​163.
Shintani, N., Ellis, R., & Suzuki, W. (2014). Effects of written feedback and revision on learners’ accuracy in
using two English grammatical structures. Language Learning, 64(1), 103–​131.
Simard, D., Guénette, D., & Bergeron, A. (2015). L2 learners’ interpretation and understanding of written
corrective feedback: Insights from their metalinguistic reflections. Language Awareness, 24(3), 233–​254.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 183–​205). Cambridge: Cambridge University Press.
Snow, C., & Hoefnagel-Höhle, M. (1978). The critical period for second language acquisition: Evidence from
second language learning. Child Development, 49, 1112–​1128.
Steinlen, A.K. (2018). The development of German and English writing skills in a bilingual primary school in
Germany. Journal of Second Language Writing, 39, 42–​52.
Suzuki, W., Nassaji, H., & Sato, K. (2019). The effects of feedback explicitness and type of target structure on
accuracy in revision and new pieces of writing. System, 81, 135–​145.
Tang, C., & Liu, Y.T. (2018). Effects of indirect coded corrective feedback with and without short affective
teacher comments on L2 writing performance, learner uptake and motivation. Assessing Writing, 35, 26–​40.
Torras, M.R., Navés, T., Celaya, M.L., & Pérez Vidal, C. (2006). Age and IL development in writing. In C. Muñoz (Ed.), Age and the rate of foreign language learning (pp. 156–182). Clevedon: Multilingual Matters.
Uscinski, I. (2017). L2 learners’ engagement with direct written corrective feedback in first-year composition courses. Journal of Response to Writing, 3(2), 36–62.
Van Beuningen, C.G., De Jong, N.H., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive
error correction in second language writing. Language Learning, 62(1), 1–​41.
Verspoor, M., & Smiskova, H. (2012). Foreign language writing development from a dynamic usage-based perspective. In R.M. Manchón (Ed.), L2 writing development: Multiple perspectives (pp. 17–46). Berlin: De
Gruyter Mouton.
Whittaker, R., Llinares, A., & McCabe, A. (2011). Written discourse development in CLIL at secondary school.
Language Teaching Research, 15(3), 343–​362.
Yasuda, S. (2011). Genre-​based tasks in foreign language writing: Developing writers’ genre awareness, lin-
guistic knowledge, and writing competence. Journal of Second Language Writing, 20(2), 111–​133.
Zhang, Z. (2017). Student engagement with computer-​generated feedback: A case study. ELT Journal, 71(3),
317–​328.
Zhang, Z., & Hyland, K. (2018). Student engagement with teacher and automated feedback on L2 writing.
Assessing Writing, 36, 90–​102.
Zheng, Y., & Yu, S. (2018). Student engagement with teacher written corrective feedback in EFL writing: A
case study of Chinese lower-​proficiency students. Assessing Writing, 37, 13–​24.

11
THE ROLE OF COGNITIVE INDIVIDUAL DIFFERENCES IN L2 WRITING PERFORMANCE AND WRITTEN CORRECTIVE FEEDBACK PROCESSING AND USE
Mohammad Javad Ahmadian and Olena Vasylets
University of Leeds and Universitat de Barcelona

Introduction
Writing is one of the most complex skills that human beings can learn. Not unlike other aspects of second language acquisition and processing, the pace and route of mastering this skill vary from one person to another. Even in the L1, the acquisition of writing skills proceeds at a varied pace and exhibits a wide range of inter-individual and intra-individual variations (Bereiter & Scardamalia, 1987). Although cognitive individual differences (IDs) are pervasive in second language acquisition and processing, their role in L2 writing performance and written corrective feedback (WCF) processing and use has, surprisingly, attracted less attention (Kormos, 2012). This chapter discusses what research has uncovered and what still needs to be elucidated regarding the role of working memory (WM) and language aptitude in L2 writing performance and WCF appropriation.

Historical Perspectives
In the late 1950s, Lee Cronbach called for the unification of universal and differential psychology
(Cronbach, 1957). This was chiefly motivated by the fact that inter-​individual variations would
make it virtually impossible to extrapolate group data to individual functioning (Estes, 1956). For
example, in the instructed second language acquisition (ISLA) literature, part of the inconsisten-
cies and inconclusiveness in research findings may very well be attributed to ignoring the role of
cognitive ID variables. Yet, this state of affairs has dramatically changed since the late 1990s and
investigating the role of cognitive IDs has become a prominent strand of research in ISLA, espe-
cially regarding the role of WM and language aptitude in L2 acquisition and processing. However,
as will become apparent in the following sections, there is much scope for further theoretical and
empirical research on the links between WM/​language aptitude and L2 writing performance/​WCF
processing and use.

Working Memory
WM is a limited capacity mechanism responsible for temporary storage and processing of infor-
mation. The most widely used model of WM in SLA research is the one proposed by Baddeley and
Hitch (1974) and then further developed by Baddeley (2000). The model comprises three buffer
components (i.e., visuo-​spatial sketchpad, phonological loop, and episodic buffer) and an active
and transformative component (i.e., the central executive). The visuo-spatial sketchpad deals with visual and spatial information, while the phonological loop is a rehearsal system that enables temporary storage of
auditory and verbal information. The episodic buffer, which was added to the original model (see
Baddeley, 2000), coordinates and integrates information from a variety of sources and constitutes
a crucial interface between the slave systems (i.e., visuo-​spatial sketchpad and phonological loop)
and long-​term memory. The central executive is thought to be responsible for shifting and regu-
lating competing demands on limited attentional resources.
WM is now considered as a cornerstone of cognitive psychology owing to its substantial rela-
tionship with fluid intelligence, i.e., the ability to think about and solve novel reasoning problems
(Shipstead, Harrison, & Engle, 2015). Individual differences in WM have also been shown to be
implicated in various aspects of L2 acquisition and processing, including the acquisition of formu-
laic language, L2 pragmatics acquisition, or interaction (see Wen, Biedroń, & Skehan, 2017, for
reviews). The observed correlations and effects are usually ascribed to the observation that individ-
uals with greater WM capacities can maintain relevant pieces of information in their WM storage,
and disengage from irrelevant ones more efficiently compared to their low-​WM counterparts
(Shipstead, Harrison, & Engle, 2015). The importance of WM in L2 acquisition and processing is
further underscored by the fact that some researchers consider it as one of the subcomponents of the
broader construct of foreign language aptitude (Linck, Osthus, Koeth, & Bunting, 2013; Robinson,
2005; Skehan, 2012). In following sections, we will discuss the role of working memory in L2
writing performance and written feedback appropriation.

Language Aptitude
Language aptitude refers to a flair or “specific talent for learning a foreign or second language”
(Wen, Biedroń, & Skehan, 2017, p. 1) and is assumed to be one of the most reliable predictors of success in L2 acquisition, second only to age of onset (Long, 2013). Language aptitude is also thought
to be a stable trait impervious to training (cf. Snow, 1992) and entirely distinct from other ID
variables such as motivation and anxiety (Li, 2017). Adopting a componential perspective on lan-
guage aptitude, Skehan (2002) was the first scholar to attempt linking putative SLA processing
stages (e.g., input processing, noticing, pattern identification, extending, complexifying language,
becoming accurate, etc.) to aptitudinal components (e.g., WM, grammatical sensitivity, inductive
learning ability, restructuring capacity, retrieval processes). As Skehan (2002) acknowledges, however, this framework is by and large conjectural and was intended to trigger the inception of a more comprehensive theory of language aptitude. Robinson’s (2005) Aptitude Complexes Hypothesis, which is essentially based
on Richard Snow’s (1994) Aptitude Complexes Hypothesis, extends Skehan’s proposal and offers
a more nuanced analysis of the acquisition processes involved in early stages of L2 acquisition.
Robinson proposes ten primary cognitive abilities (i.e., perceptual speed, pattern recognition,
phonological WM (PWM) capacity, PWM speed, analogies, inference ability, memory and speed
of memory for text, grammatical sensitivity, and rote memory capacity) and he elaborates on the
ways in which these abilities contribute to broader and higher order aptitude factors (e.g., noticing
the gap, memory for contingent text, etc.). These two proposals show that language aptitude is not a
monolithic construct and that it is best construed as a constellation of different abilities that mediate
and moderate virtually all aspects of L2 acquisition and processing, including L2 writing perform-
ance and written corrective feedback processing and use.
Language aptitude has been studied in relation to, inter alia, ultimate L2 attainment (Grañena &
Long, 2013), instructional effectiveness (Erlam, 2005), and written corrective feedback (Benson &
DeKeyser, 2019). Observed correlations and/​or aptitude-​treatment interactions are usually attributed
to learners’ differential analytic ability, sensitivity to grammatical forms/​regularities, and variable
capacities to perform cognitive comparisons between input and their current knowledge. We will
review these studies in the fourth section.

Critical Issues and Topics


In this section we will deal with two main issues: (1) the connection between cognitive IDs and
different writing processes; and (2) the role of cognitive IDs in WCF processing and use.

The Connection between Cognitive IDs and Different Writing Processes


Working Memory
Kellogg’s (1996) model of WM in L1 writing integrates Flower and Hayes’s (1981) cognitive pro-
cess model of writing and Baddeley’s multi-​component model of WM, thereby accounting for the
ways in which various writing processes are supported by WM. Although this model was originally
proposed for L1 writing, given the similarities between L1 and L2 writing processes, it has been
extensively used in L2 writing research as well. In Kellogg’s model, the basic writing processes
are planning (i.e., goal setting), translation (i.e., linguistic encoding of ideas into actual words
and sentences), programming the output of translation for execution (i.e., typing or handwriting),
reading, and editing. In the model, the link between each of these processes and WM components
is clearly elucidated. Whereas visuo-​spatial sketchpad is only drawn upon for planning purposes,
phonological loop is engaged only in translation and reading processes. The central executive is
recruited for virtually all processes except for motor execution. Because Kellogg’s model was originally put forward for L1 writing, one could predict, as posited in Kormos (2012) with respect to L2 writers, that the cognitive demands on (especially low-proficiency) L2 writers would build up as a
result of their likely more limited and less sophisticated lexicon, less efficient lexical access, and
less automatic processes involved in syntactic packaging for translation of thoughts and ideas into
sentences. Based on Kellogg’s model, we can view L2 writing from two perspectives. First, we can
investigate the ways in which L2 writing processes relate to individual differences in slave systems’
functioning. Second, we can look at the role of WM in L2 writing through investigating the ways in
which the central executive manages attentional functions in relation to writing (Olive, 2004). These
perspectives have barely been investigated in the SLA literature despite Kormos’ (2012) pioneering
call to make the study of cognitive IDs more central in SLA-​oriented writing research.
An important issue in L2 writing is to determine the extent to which WM components relate to
L2 writing processes, as noted by Kormos (2012), especially regarding whether WM is more related
to planning processes or to translation processes. However, for this line of research to come to fru-
ition, it would seem necessary to capitalize on L1 writers’ performance as baseline data to be able
to tease apart any effects accruing from general proficiency and WM capacity differences. Another
important topic is to investigate the (differential) contributions of WM and short-​term memory
to L2 writing (see Swanson & Berninger, 1996, for a similar question in relation to L1 writing).
Engle (2002) argues that whereas measures of WM “reflect both memory processes and executive
attention […] traditional measures of short-​term memory reflect primarily memory processes such
as grouping, chunking, and rehearsal” (pp. 20–​21). Although this hypothesis has been confirmed in
a structural equation modeling study by Engle, Tuholski, Laughlin, and Conway (1999), the jury is
still out on the nature of the relationship between WM and short-term memory, which makes investigating this issue in relation to L2 writing even more exciting and necessary.

Language Aptitude
It should be recalled that WM is posited to be one of the principal components of language apti-
tude in contemporary conceptualizations and test batteries of aptitude (Kormos, 2013; Skehan, 2002;
Robinson, 2005; Wen, 2019). Therefore, as proposed by Kormos (2012), there is a strong theoretical
rationale to postulate that ID variations in WM would contribute to ID variations in language aptitude.
However, the impact of language aptitude on L2 writing development and performance is not limited
to the role of WM. In Skehan’s (2016) Macro-SLA aptitude model (cited in Wen et al., 2017), most of
the L2 acquisition processes (i.e., input processing, noticing, pattern recognition, complexification,
handling feedback, and error avoidance) are supported by WM and to that extent Skehan’s model
would align with Wen et al.’s (2019) recent proposal for WM as language aptitude. However, there
are other processes in Skehan’s model (i.e., automatization, creating a repertoire, and lexicalization)
that do not necessarily require WM’s support and are instead sustained by other aptitude constructs,
namely, memory retrieval and chunking. In addition, there are two more aptitude constructs in
Skehan’s framework (phonetic coding ability and language analysis ability) that complement WM’s
functioning. An important issue is thus to theorize and investigate whether and how each of these
aptitude constructs relate to L2 writing processes. As shown in Table 11.1, apart from WM and atten-
tional control mechanisms, which are clearly associated with all writing processes, memory retrieval
could play an important role in all processes, except for programming. Language analytic ability
could be thought to support translation, reading, and editing processes (Kormos, 2012). Assuming
ID variations in language analytic ability, we could then envisage individual differences in the ways
in which L2 writers may approach translation processes (i.e., converting non-​linguistic thoughts into
actual sentences). As Skehan (1998) suggests, while some learners are predisposed to rely on an
exemplar-based system, other learners tend to draw upon their rule-based system and generate new syn-
tactic constructions on the fly. This latter predisposition, which could be assumed to be correlated with
higher language analytic ability, would work in favor of accuracy but could potentially impair fluency
in writing performance. Construing language aptitude as a situated construct (Snow, 1992; Robinson,
2005, 2012) would render it crucial to see how L2 writers with various aptitude profiles benefit from
different instructional conditions and task factors (e.g., task complexity). As Robinson (2012) cogently
argues, in order for pedagogical options and instructional interventions to be optimally effective, they
need to correspond to learners’ differential abilities and cognitive resources. However, such aptitude-​
treatment-interaction research in relation to L2 writing remains largely unexplored.

Table 11.1  Aptitude constructs and basic writing processes

Basic writing processes     Aptitude components (Skehan, 2016)
(Flower & Hayes, 1986)      Working   Attentional   Phonemic          Language           Memory      Chunking
                            memory    control       coding ability    analysis ability   retrieval
Planning                    X         X                                                  X           X
Translation                 X         X             X                 X                  X           X
Programming                 X         X
Execution                   X         X                                                  X
Reading                     X         X             X                 X                  X
Editing                     X         X                               X                  X           X

The Role of Cognitive IDs in Written Corrective Feedback Processing and Use

Investigating the effectiveness of written corrective feedback has been the subject of intensive empirical and theoretical debate since the late 1990s. Yet, research findings are inconclusive (see Chapters 7 and 16, this volume) and there is little consensus as to what types of feedback are
most effective, under what circumstances and for what linguistic features or groups of learners.
This is partly due to the fact that previous empirical studies seem to have been based on the mis-
guided premise that what may work for one group of learners would be as effective for others.
Yet, there are solid theoretical grounds to postulate that IDs in WM and language aptitude can
mediate the ways in which L2 learners benefit from different types of feedback (see Ellis, 2010).
However, the crucial question is what we actually mean by using/​benefiting from feedback when
applied to writing. There are two ways in which WCF appropriation can be conceptualized and
operationalized: on the one hand, L2 writers can be expected to use WCF to merely correct their
errors and improve the quality of their current essays without any effects on L2 learning. On the
other hand, learners can be expected to benefit from WCF in terms of language development in
the short or long term. This binary conceptualization is clearly related to the already established
distinction between “feedback for accuracy” and “feedback for acquisition” (Manchón, 2011).
Arguably, individual variations in language aptitude and WM capacity could prove particularly
consequential for “feedback for acquisition” because, based on Skehan’s (2002) Processing
Stage model (see Table 11.1), the main stages of L2 acquisition (i.e., noticing, pattern identifica-
tion, extending, complexifying, integrating, becoming accurate, avoiding error, creating a reper-
toire, achieving salience, automatization of rule-​based system, achieving access fluidity, and
lexicalization) are overwhelmingly supported by WM and language aptitude components. In par-
ticular, as learners need to attend to the feedback provided on their writing in order for it to lead
to L2 development (Williams, 2012), Kormos (2012) advanced that learners with greater WM
capacity and higher language aptitude are more likely to notice and benefit from WCF in terms
of language learning. Besides, owing to ID variations in WM capacity and language aptitude, not
all feedback types (e.g., implicit, explicit, direct, indirect, metalinguistic, etc.; see Ellis, 2009 for
a taxonomy of written corrective feedback) would be equally beneficial for all learners (Kormos,
2017). Therefore, a major research priority would be to identify who benefits from what type of
feedback.

Current Contributions and Research

Empirical Findings for the Effects of Working Memory


Previous research in L1 writing has convincingly demonstrated that WM (above all the central executive) is substantially implicated in higher order writing processes (Kellogg, 2013). Writers with more efficient WM resources have been found to produce higher quality texts than writers with smaller WM capacity (WMC) (Bergsleithner, 2010; Zalbidea, 2017). It is interesting to observe that the
results from L1 writing resonate with the findings in Linck et al.’s (2014) meta-​analysis, which
reported a robust, positive correlation between WM and L2 outcomes, with the estimated popula-
tion effect size (ρ) of .255. However, the global picture is less clear. Although the extant research
has provided promising initial support for the view that WMC is related to L2 writing and WCF
processing, the findings are conflicting and defy a straightforward interpretation.
Thus, mixed results were obtained by Adams and Guillot (2008) with French/English bilinguals (n = 22, ages: 12–15) who performed a digit recall test (for phonological short-term memory, PSTM), a visuo-spatial span task, a listening recall task (complex WM), a word dictation task to assess spelling, and picture description writing tasks, which were assessed on a 0–10 scale. The only significant result was a correlation between writing
performance in English and PSTM (r = .48, p ≤ 0.05). Findings in Adams and Guillot (2008) res-
onate with the results in Kormos and Sáfár (2008) who also found links with PSTM, but not with
complex WM. In this study, 121 Hungarian secondary school learners of L2 English (age: 15–​16;
pre-​intermediate and beginners) performed a non-​word span test for PSTM and the writing tasks
from three different genres, which were assessed for content and accuracy by two raters. Fifty
participants also performed a backward digit span for complex WM. The results showed a moderate
correlation with PSTM (r = .48, p ≤ 0.05) but only for the beginner learners. No significant results
were found for complex WM. However, it is important to highlight that not all the participants in
this study took a complex WM test.
A number of studies, however, found significant relationships between complex WM and L2
written performance. Thus, in the study by Bergsleithner (2010) with 32 Brazilian learners of
English (age: 20–​40) there was a positive correlation between complex WM (Operation Span test
in L1) and accuracy (r = .62, p < .001) and complexity (subordination) (r = .69, p < .001) of
L2 picture description writing task. The findings in Zalbidea (2017) confirm, albeit partially, the
positive link with accuracy, but not with complexity. In this study, 32 intermediate learners of L2
Spanish (mean age 19.6) took an Operation Span test and performed simple and complex versions
of an argumentative task orally and in writing. The analysis revealed that complex WM negatively
correlated with the number of errors in the nominal domain (gender and number agreement) in the
complex written task, but there were no connections between WM and lexical and syntactic com-
plexity. A more recent study by Zabihi (2018), which involved 232 Iranian upper-​intermediate EFL
students (age: 18–​40), did not add much clarity on the matter. The subjects performed an OSpan
test and timed narrative writing task, whose quality was assessed in terms of the CAF measures.
Similar to Bergsleithner (2010), Zabihi found a positive link between WM and syntactic complexity
(subordination) and also fluency (number of words per T-​unit). However, surprisingly, WM spans
negatively affected accuracy scores (ratio of error-free T-units), a finding which Zabihi attributed to the timed nature of the writing task, which induced learners to prioritize complexity and fluency
to the detriment of accuracy.
Some recent studies found no links between WM and L2 writing. For example, Cho
(2018) reported that there was no significant relationship between complex WM (assessed by the
Rspan and Ospan tests) and any CAF measures of L2 writing (four writing tasks of varying levels
of complexity) in the performance of the upper-​intermediate university learners of EFL in Korea
(n = 39). Similarly, the study by Michel et al. (2019) with 94 young EFL learners in Hungary
(age: 11–​14) found that WM functions (assessed by a number of tests) were not significantly related
to the scores of different task types from the TOEFL Junior Comprehensive test battery, except for
the positive link in the editing task in which the learners had to correct errors. The study by Lu
(2015) involved 136 university students in China (mean age: 20) who completed Operation Span
tasks in L1 (Chinese) and L2 (English) and an argumentative writing task which was assessed by
two raters for Content and Language on a 1–​15 scale. On the basis of their L2 vocabulary scores,
the participants were divided into low and high proficiency groups. No relationship between WM
and L2 writing was detected for either proficiency group. On the other hand, the recent study by
Vasylets and Marín (submitted), which also took into account variations in L2 proficiency, obtained
different results. In this study, 56 EFL learners in Spain (mean age: 19) completed a standardized
L2 proficiency test, a complex verbal WM test (RSpan in L1) and a video-​retelling narrative writing
task, which was assessed holistically and by means of the CAF measures. Moderation analysis
(Hayes, 2012) revealed that for low-​proficiency learners, there was a positive connection between
WM and grammatical accuracy. However, for high proficiency writers, there was a positive rela-
tionship between WM and lexical sophistication. The results from this study demonstrate that WM
effects in L2 writing can be contingent on the level of L2 proficiency and on the dimension of L2
written performance under examination.
Although the overall results are mixed, a sizable number of investigations have provided support for a positive effect of WM. There is also a clear indication that other factors, such as the level
of L2 proficiency or task complexity, may moderate the potential effects of WM in L2 writing.
Concerning WM and WCF, to our knowledge there is only one investigation available to date which
has explored this issue. Li and Roshan (2019) conducted a study in which 79 adult intermediate
EFL learners with L1 Persian performed an RSpan test in their L1 (for complex WM) and a non-word span test (for PSTM), and were assigned to four different WCF treatments designed to improve their knowledge of
English passive voice. The results showed that complex WM was a positive predictor of the effects
of feedback with metalinguistic explanation and the effects of direct corrective feedback plus revi-
sion. On the other hand, PSTM was a negative predictor of the effects of direct corrective feedback
plus revision. On the basis of these findings, the authors concluded that the role of WM may vary
as a function of feedback type, and that complex WM and PSTM may have opposite associations
with the effectiveness of WCF. Clearly, more research on the effects of WM on WCF appropriation
is warranted.

Empirical Findings for the Effects of Language Aptitude


Despite the empirically-​proven relevance of language aptitude (LA) in SLA, to date only a few
L2 writing and WCF studies have focused on LA. The available data are too scarce to make firm
conclusions, but certain patterns concerning the effects of LA in L2 writing domain can already be
discerned. Thus, Kormos and Trebits (2012) investigated the relationship between the components of
LA (assessed by the L1 version of MLAT) and the CAF of L2 oral and written performance in tasks
with different levels of complexity (given storyline vs. designing own plot). The participants were 44
upper-​intermediate secondary school learners in Hungary. The results showed that LA components
were differently related to the CAF measures of oral performance as compared to writing. In the written
mode, learners with higher grammatical sensitivity produced longer clauses, but only in the task which
was considered simple (the task with the pre-​defined plot), suggesting that LA may play different roles
in influencing performance depending on the complexity of the task or the mode in which the task is
performed. In a more recent study, Yang, Sun, Chang, and Li (2019) investigated how LA (assessed
by LLAMA tests; Meara, 2005) and productive and receptive vocabulary size may affect L2 writing
performance. Fifty-​nine Chinese university learners of EFL (age: 20–​23) performed a picture descrip-
tion narrative task; written performances were assessed by means of an automated scoring system
(www.pigai.org), which assigns a holistic score on the basis of an evaluation of the dimensions of content,
organization, sentence structure, vocabulary choice, and coherence. Regression analysis showed that
L2 writing was significantly predicted by vocabulary size and LLAMA E score, which taps into asso-
ciative learning and analytical learning ability (Grañena, 2013). In sum, the available findings point to a positive link between LA and L2 writing performance; importantly, there is also evidence that other factors (e.g., task complexity) may mediate this relationship.
While the available evidence points to positive links between LA and L2 written performance, the picture is less clear for WCF. Thus, Sheen (2007) explored the extent to which learners'
language analytic ability may mediate the effectiveness of two types of WCF (direct correction
vs. direct metalinguistic correction) on the acquisition of English articles. The participants were
91 adult intermediate EFL learners (age: 21–​56) from different L1 backgrounds (mainly Korean,
Hispanic, and Polish). The study employed a pretest-​posttest-​delayed posttest design. To measure
the development of learners’ ability to use articles, a speeded dictation, a narrative writing test and
an error correction test were used. The results revealed that learners with high language analytic
ability benefited more from both types of WCF. However, this advantage was even more salient
in the condition in which metalinguistic comments were provided in addition to indicating and
correcting an error. A rather different pattern of findings was obtained by Stefanou and Révész
(2015) who had a similar goal of exploring the extent to which language analytic ability and know-
ledge of metalanguage may mediate the effectiveness of direct correction versus direct correction
plus metalinguistic explanation on the use of English articles by L1 Greek intermediate learners of
EFL (n = 89; age: 16). Similar to Sheen (2007), the study had a pretest-​posttest-​delayed posttest
design; the progress in learning was measured by means of a text summary and truth value judgment
tests. The findings were that greater meta-​knowledge and higher grammatical sensitivity positively
contributed to the effectiveness of direct WCF. Contrary to the findings in Sheen (2007), gram-
matical sensitivity did not play a role in the metalinguistic condition, which was explained by the
fact that metalinguistic comments could neutralize any advantage potentially afforded by higher
grammatical sensitivity. Shintani and Ellis (2015) examined the mediating role of language analytic ability (measured with the Language Aptitude Battery for the Japanese) in the effects of direct versus metalinguistic WCF on errors with English articles and the past conditional. The results showed that the participants
(118 Japanese university students of English) with higher ability benefited more from both types
of WCF than learners with lower ability. A more recent study by Benson and DeKeyser (2019)
explored the way language analytic ability, as measured by the LLAMA F, mediated the effects of
direct versus metalinguistic WCF on errors in the English simple past and present perfect tense. The
participants were 165 learners of English (low-intermediate to advanced levels) from various L1 backgrounds. In line with Stefanou and Révész (2015), this study found that higher aptitude was an
advantage in learning from direct WCF, but not from metalinguistic WCF.
In sum, the available research indicates that LA (language analytic ability, specifically) may enhance the effectiveness of WCF. However, it is not yet clear for which types of learners (in terms of age, level of L2 proficiency, etc.) and for which types of WCF language aptitude may play a more relevant role. More empirical evidence is also needed concerning the mediating effects of LA on learning from WCF. Although some studies have explored how LA may influence the effects of WCF on the accurate use of selected grammatical structures, it is not yet known if and how LA may mediate the effectiveness of WCF on other aspects of L2 writing, such as coherence or complexity.
The global conclusion to be drawn from the review of the available investigations is that the majority of the empirical studies have obtained significant results, pointing to WM and language aptitude as potential sources of variability in L2 writing performance and in WCF processing. Overall, the studies support the "more is better" hypothesis, with learners with higher abilities showing an advantage in L2 writing scores and in learning from particular types of WCF. More research is, however, needed to deepen our knowledge of how L2 writing performance and learning from WCF may differ in learners with higher versus lower levels of cognitive abilities.

Main Research Methods


There are two methodological approaches which have been traditionally employed to investigate
links between WM and writing: dual-​task methodology and the regression approach (Kellogg,
Whiteford, Turner, Cahill, & Mertens, 2013). In dual-​task experiments, the participants are asked
to focus attention on composing a text, while performing a secondary task, which is designed to
tax a specific component of WM. In L1 writing research, dual-​task methodology has been fruitfully
employed to study the role of WM in writing processes (Ransdell, Levy, & Kellogg, 2002), and it
has also been widely used to explore the specific contribution of the executive functions, as well
as verbal, visual, and spatial WM to writing (Kellogg, Olive, & Piolat, 2007). The regression approach, in turn, consists of examining how differences in the measurement of various capacities of WM correlate with the quality of L2 writing performance, as rated holistically and/or as assessed by means of quantitative measures targeting various dimensions such as accuracy, fluency, or propositional and linguistic complexity. Notably, the regression approach has dominated L2 writing research. As a result, the predominant focus of the available investigations has been
on the role of WM in L2 outcomes. Another common characteristic of WM studies in L2 writing is
the focus on adult learners of L2 English, typically at upper-​intermediate level of L2 proficiency.
The studies, however, greatly differ between themselves in terms of other important features, such
as the L1 of the participants or the way WM and L2 proficiency were assessed. Thus, while the
majority of studies employed complex WM tests, only a few investigations measured both PSTM
and complex WM (Kormos & Sáfár, 2008; Li & Roshan, 2019) or visuo-​spatial WM (Adams &
Guillot, 2008; Michel et al., 2019). Also, there is a great discrepancy in the way L2 proficiency
has been assessed, with some studies employing standardized tests (Cho, 2018; Vasylets & Marín,
2021) and others using vocabulary tests as a proxy of L2 proficiency (Lu, 2015). The available
studies also differ in the way the L2 outcomes were analyzed. While some studies combined hol-
istic assessment and a range of CAF measures to analyze L2 written texts (Vasylets & Marín,
2021), others resorted solely to holistic ratings (Kormos & Sáfár, 2008; Lu, 2015) or focused only
on some particular CAF dimensions (Bergsleithner, 2010). The analysis of previous studies also
reveals certain limitations in measurement practices, specifically in the assessment of the complexity of L2 written texts. As pointed out by Bulté and Housen (2012), L2 complexity of
production is multifaceted and includes various dimensions, such as linguistic (syntactic and lex-
ical) complexity and propositional complexity, assessed in terms of idea units. Thus, the majority
of the studies focused primarily on subordination and lexical diversity, leaving out such important
dimensions as coordination, nominal complexity, lexical sophistication, or idea units (but see
Vasylets & Marín, 2021). Concerning the exploration of the effects of WM on WCF processing,
the only available (to our knowledge) study by Li and Roshan (2019) has resorted to the traditional
approach in WCF research, which includes a control group and experimental groups, and follows
a pretest, treatment, posttest, and delayed posttest design. Other notable methodological decisions in Li and Roshan (2019) were to distribute the experimental sessions over a period of four weeks, require a minimum of 200 words for the written texts, and limit the writing time to 30 minutes. The authors also limited the time during which the participants were allowed to review the corrections to 10 minutes. In addition, in order to foster deeper WCF processing, the participants were not allowed to access the corrections when they rewrote their texts.
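As a schematic illustration of the regression approach described above, the following sketch relates individual WM span scores to a holistic rating and to CAF measures for a simulated group of writers. All data and variable names are invented; an actual study would also report measurement reliability, check model assumptions, and typically control for covariates such as L2 proficiency.

    # Schematic sketch of the regression approach: individual WM scores are
    # related to L2 writing outcome measures. All data below are simulated.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n = 60

    data = pd.DataFrame({
        "ospan": rng.normal(50, 10, n),                    # operation span score
        "holistic": rng.normal(3.5, 0.8, n),               # holistic writing rating
        "clauses_per_tunit": rng.normal(1.6, 0.3, n),      # syntactic complexity
        "errorfree_tunit_ratio": rng.uniform(0.3, 0.9, n), # accuracy
    })

    # Zero-order correlations between WM capacity and each outcome measure.
    print(data.corr()["ospan"])

    # Simple regression of one outcome (accuracy) on WM capacity.
    X = sm.add_constant(data["ospan"])
    model = sm.OLS(data["errorfree_tunit_ratio"], X).fit()
    print(model.params)
    print(model.rsquared)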
Similar to the research on WM, the investigations on the role of aptitude in L2 written per-
formance have employed a regression approach, in which the scores on the LA component(s) are
correlated with the measures of L2 writing performance. Notably, the available investigations differ
in their methodological decisions concerning the assessment of aptitude and the measurement of L2
writing quality. Thus, while Kormos and Trebits (2012) employed the L1 version of the test battery
for the Carrollian concept of language aptitude (1981), the study by Yang et al. (2019) employed
the LLAMA tests (Meara, 2005), an L1-neutral aptitude testing instrument. The quality of L2 writing was also assessed differently, with Kormos and Trebits (2012) employing a range of CAF measures and Yang et al. (2019) using automated holistic scoring.
Studies on aptitude and WCF processing are rather homogeneous in their design, following a
pretest-​treatment-​posttest-​delayed posttest design, and using ESL learners to form experimental
and comparison groups. The analysis involves correlations between the subdimension of LA under exploration and criterion test scores in the posttests and delayed posttests. Another commonality
in previous research on aptitude and WCF processing is the choice of the dependent variable,
which typically represents the learning of some aspect of L2 grammar, such as English articles
or verb tenses. The main methodological difference between the studies lies in the way LA was assessed. Thus, grammatical sensitivity has been measured in different ways, with Benson and DeKeyser (2019), for example, using LLAMA F (Meara, 2005) and Shintani and Ellis (2015) using the Language Aptitude Battery for the Japanese (Sasaki, 1996) for the same purpose.

Recommendations for Practice


As noted above, the most readily discernible pedagogical implication of research on the links between cognitive IDs and L2 writing performance/development and written corrective feedback appropriation is the need to
equalize learning opportunities for all learners irrespective of how low or high their WM capacity
or level of foreign language aptitude may be. In this section, we will draw on the Instructed SLA literature as well as Cognitive Load Theory (as discussed in Price, Catrambone, & Engle, 2007) to
provide some general recommendations for practice. The aim here is not to discuss how students’
cognitive abilities could be boosted but to strive towards matching learners’ abilities to optimal
learning contexts and instructional options (i.e., L2 writing tasks or written corrective feedback)
(Robinson, 2012). The premise therefore is that language classrooms are essentially heterogeneous
in terms of learners’ cognitive IDs and that the onus is on course designers, materials developers,
and teachers to ensure that learners can equally benefit from classroom affordances regardless of
their differential cognitive ability profiles. The Cognitive Load Theory is concerned with the ways
in which limited “cognitive resources are focused and used during learning and problem solving”
(Chandler & Sweller, 1991, p. 294). To the best of our knowledge, this is one of the most compre-
hensive and theoretically supported accounts for dealing with cognitive ID variations in instruc-
tional settings (see also Vatz et al., 2013 for an alternative account). One of the basic assumptions
of the Cognitive Load Theory is that instructional activities impose three kinds of demand on task
performers: (a) intrinsic cognitive load, which refers to the inherent complexity of a task (e.g.,
number of elements and reasoning demand) and is usually affected by one’s familiarity with the
topic/​task; (b) extraneous cognitive load, which derives from poorly designed/​presented materials
and does not contribute to learning; and (c) germane cognitive load, which could be manipulated
by task designers and is salutary in that it promotes further learning (Price, Catrambone, & Engle,
2007). In order for instructional activities (and in this case, L2 writing tasks) to be maximally bene-
ficial for learners with differential cognitive abilities, extraneous load must be reduced as much as
possible. This could be achieved through designing/​choosing tasks which have clear instructions,
are highly structured, and involve several small steps (Kormos, 2017). Using “worked examples,”
i.e., showing learners examples of completed L2 writing tasks where they can see both the task
and the ways in which it has been completed and then moving on to new tasks, is another critical
step towards reducing unfavorable cognitive load on WM capacity (Price, Catrambone, & Engle,
2007). A number of studies in SLA literature have demonstrated that WM capacity matters particu-
larly in teaching situations where metalinguistic information is withheld, and no explicit corrective
feedback is provided (Sanz et al., 2016). For example, Kormos (2017) discusses research findings
from her projects with learners with specific learning difficulties, who usually have reduced WM
capacities, struggle to infer regularities and patterns under implicit instruction conditions and prefer
explicit instruction. This suggests that explicit written corrective feedback has the potential to level
the playing field for L2 learners with various levels of cognitive abilities.

Future Directions
Given the complexity of the constructs of WM and language aptitude, there is an urgent need for
more empirical research on the nature of relationships between L2 writing processes and WM/​lan-
guage aptitude components. As discussed earlier, any theoretical model of the role of L2 writing in
second language acquisition needs to take into account the role of individual differences in working
memory and language aptitude. This is clearly a daunting task. Future studies need to examine
the effects of written corrective feedback on L2 acquisition taking into account the important role
of cognitive individual differences. Therefore, using new conceptualizations of language aptitude
(Kormos, 2013) and drawing on improvements in measuring working memory capacity (Waters &
Caplan, 2003) are necessary for developing this line of research.
Construing language aptitude as a situated construct (Robinson, 2005, 2012) would defin-
itely warrant more aptitude-​treatment-​interaction studies aiming to match task factors, L2 writing
teaching approaches, and written corrective feedback types to differential language abilities. As far
as language pedagogy is concerned, the ultimate goal here would be to discern what types of written
corrective feedback or writing tasks would be most beneficial for the development of L2 writing ability in L2 learners with differential cognitive ability profiles. Future research on cognitive abilities would benefit from more process-oriented studies, tapping into the potential impact of cognitive abilities on the writing and learning processes involved in L2 writing and on the processing of WCF. An example is the recent study by Révész, Michel, and Lee (2017), who explored the links between WM
and L2 writing products and processes by employing eye-​tracking and stimulated recall. To date,
the available investigations have focused on the role of cognitive abilities in the effects of WCF
when acquiring some specific L2 grammar aspects. Future studies should also broaden the scope of
the dependent variables, focusing, inter alia, on the development of L2 writing quality in terms of
CAF, coherence, organization, and communicative adequacy.
In terms of methodological improvements, future research should strive for a more detailed assessment of the independent variables (WM, LA) and a more rigorous control of potential mediating variables (e.g., gender, vocabulary size, L2 proficiency). Also, more uniform measurement
of the dependent variables would be desirable to enhance comparability between the studies.
Additionally, more variety is needed in terms of the participants involved. Thus, there is a need for
more studies with participants whose L2 is a language other than English, participants of different ages (young and third-age learners are particularly under-researched), and participants with low or advanced levels of L2 proficiency.
Future research would also benefit from a fuller control of the effects of potentially mediating
variables (e.g., L2 proficiency) which can influence the nature of the relationship between cogni-
tive abilities and L2 writing processes/​outcomes or processing and learning from WCF. Without
doubt, taking into account the potential interactions between the relevant variables would increase
ecological validity and allow for a more nuanced understanding of the role of cognitive abilities.
There is also a need for more studies with an aptitude-treatment interaction design that test the links between learners' cognitive ability scores and their L2 learning outcomes under different L2 writing or WCF conditions (Vatz, Tare, Jackson, & Doughty, 2013). Future investigations would benefit from incorporating dual-task methodology in explorations of working memory in L2 writing. This methodology would be particularly fruitful for determining the specific contribution of different working memory components to L2 writing, and it would also help elucidate links between working memory and the writing processes of planning, translating, execution, and revision.

References
Adams, A.M., & Guillot, K. (2008). Working memory and writing in bilingual students. International Journal
of Applied Linguistics, 156, 13–​28.
Baddeley, A. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Sciences,
4(11), 417–​423.
Baddeley, A.D., & Hitch, G. (1974). Working memory. Psychology of Learning and Motivation, 8, 47–​89.
Bergsleithner, J.M. (2010). Working memory capacity and L2 writing performance. Ciências & Cognição,
15(2), 2–​20.
Benson, S., & DeKeyser, R. (2019). Effects of written corrective feedback and language aptitude on verb tense
accuracy. Language Teaching Research, 23(6), 702–​726. https://​doi.org/​10.1177/​1362168818770921
Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence
Erlbaum Associates.
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken &
I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA
(pp. 21–​46). Amsterdam: John Benjamins.
Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and
Instruction, 8(4), 293–​332.
Cho, M. (2018). Task complexity, modality, and working memory in L2 task performance. System, 72, 85−98.
Cronbach, L.J. (1957). The two disciplines of scientific psychology. American Psychologist, 12(11), 671.
Doughty, C.J. (2019). Cognitive language aptitude. Language Learning, 69, 101–​126.
Ellis, R. (2009). A typology of written corrective feedback types. ELT Journal, 63(2), 97–​107.
Ellis, R. (2010). Epilogue: A framework for investigating oral and written corrective feedback. Studies in
Second Language Acquisition, 32(2), 335–​349.
Engle, R.W. (2002). Working memory capacity as executive attention. Current Directions in Psychological
Science, 11(1), 19–​23.
Engle, R.W., Tuholski, S.W., Laughlin, J.E., & Conway, A.R. (1999). Working memory, short-​term memory,
and general fluid intelligence: A latent-​variable approach. Journal of Experimental Psychology: General,
128(3), 309.
Erlam, R. (2005). Language aptitude and its relationship to instructional effectiveness in second language
acquisition. Language Teaching Research, 9(2), 147−171.
Estes, W.K. (1956). The problem of inference from curves based on group data. Psychological Bulletin,
53(2), 134.
Flower, L., & Hayes, J.R. (1981). A cognitive process theory of writing. College Composition and
Communication, 32(4), 365–​387.
Kellogg, R.T. (1996). A model of working memory in writing. In C.M. Levy & S. Ransdell (Eds.), The science
of writing: Theories, methods, individual differences, and applications (pp. 57–​71). Hillsdale, NJ: Lawrence
Erlbaum.
Kellogg, R.T., Olive, T., & Piolat, A. (2007). Verbal, visual, and spatial working memory in written language
production. Acta Psychologica, 124(3), 382–​397.
Kellogg, R.T., Whiteford, A.P., Turner, C.E., Cahill, M., & Mertens, A. (2013). Working memory in written
composition: An evaluation of the 1996 model. Journal of Writing Research, 5(2), 159–​190.
Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing,
21(4), 390–​403.
Kormos, J. (2013). New conceptualizations of language aptitude in second language attainment. In G. Granena
& M. Long (Eds.), Sensitive periods, language aptitude, and ultimate L2 attainment (pp. 131–​152).
Amsterdam: John Benjamins.
Kormos, J. (2017). The second language learning processes of students with specific learning difficulties.
New York: Routledge.
Kormos, J., & Sáfár, A. (2008). Phonological short-​term memory, working memory and foreign language per-
formance in intensive language learning, Bilingualism: Language and Cognition, 11, 261–​271.
Kormos, J., & Trebits, A. (2012). The role of task complexity, modality, and aptitude in narrative task perform-
ance. Language Learning, 62(2), 439–​472.
Li, S. (2017). Cognitive differences and ISLA. In S. Loewen & M. Sato (Eds.), The Routledge handbook of
instructed second language acquisition (pp. 396–​417). New York: Routledge.
Li, S., & Roshan, S. (2019). The associations between working memory and the effects of four different types
of written corrective feedback. Journal of Second Language Writing, 45, 1–​15.
Linck, J.A., Osthus, P., Koeth, J.T., & Bunting, M.F. (2014). Working memory and second language compre-
hension and production. Psychonomic Bulletin & Review. doi: 10.3758/s13423-013-0565-2
Long, M. (2013). Maturational constraints on child and adult SLA. In G. Granena & M. Long (Eds.), Sensitive
periods, language aptitude, and ultimate L2 attainment (pp. 3–​41). Amsterdam: John Benjamins.
Lu, Y. (2015). Working memory, cognitive resources and L2 writing performance. In Z.E. Wen, M. Borges Mota, & A. McNeill (Eds.), Working memory in second language acquisition and processing (pp. 175–189).
Bristol: Multilingual Matters.
Manchón, R.M. (2011). The language learning potential of writing in foreign language contexts: Lessons
from research. In T. Cimasko & M. Reichelt (Eds.), Foreign language writing instruction: Principles and
practices (pp. 44–​64). Anderson, SC: Parlor Press.
Meara, P. (2005). LLAMA language aptitude tests: The manual. Swansea: Lognostics.
Michel, M., Kormos, J., Brunfaut, T., & Ratajczak, M. (2019). The role of working memory in young second
language learners’ written performances. Journal of Second Language Writing, 45, 31–​45.
Olive, T. (2004). Working memory in writing: Empirical evidence from the dual-​task technique. European
Psychologist, 9(1), 32–​42. https://​doi.org/​10.1027/​1016-​9040.9.1.32
Price, J.L., Catrambone, R., & Engle, R. (2007). When capacity matters: The role of working memory in
problem solving. In D.H. Jonassen (Ed.), Learning to solve complex scientific problems (pp. 49–​76).
New York: Lawrence Erlbaum.
Ransdell, S., Levy, C.M., & Kellogg, R.T. (2002). The structure of writing processes as revealed by secondary
task demands. L1-​Educational Studies in Language and Literature, 2(2), 141–​163.
Révész, A., Michel, M., & Lee, M. (2017). Investigating IELTS Academic Writing Task 2: Relationships
between cognitive writing processes, text quality, and working memory. IELTS Research Reports Online
Series, 17 (3), 1–​44.
Robinson, P. (2005). Aptitude and second language acquisition. Annual Review of Applied Linguistics,
25, 46–​73.
Robinson, P. (2012). Individual differences, aptitude complexes, SLA processes, and aptitude test develop-
ment. In P. Mirosław (Ed.), New perspectives on individual differences in language learning and teaching
(pp. 57–​75). Berlin: Springer.
Sanz, C., Lin, H.J., Lado, B., Stafford, C.A., & Bowden, H.W. (2014). One size fits all? Learning conditions
and working memory capacity in ab initio language development. Applied Linguistics, 37(5), 669–​692.
Sasaki, M. (1996). Second language proficiency, foreign language aptitude, and intelligence: Quantitative and
qualitative analyses. New York: Peter Lang.
Shintani, N., & Ellis, R. (2015). Does language analytical ability mediate the effect of written feedback on
grammatical accuracy in second language writing? System, 49, 110–​119.
Shipstead, Z., Harrison, T.L., & Engle, R.W. (2015). Working memory capacity and the scope and control of
attention. Attention, Perception, & Psychophysics, 77(6), 1863–​1880.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners’
acquisition of articles. TESOL Quarterly, 41(2), 255–​283.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2002). Theorising and updating aptitude. In P. Robinson (Ed.), Individual differences and instructed
language learning (pp. 69–​93). Amsterdam: John Benjamins.
Skehan, P. (2012). Language aptitude. In S. Gass & A. Mackey (Eds.), The Routledge handbook of second lan-
guage acquisition (pp. 399–​413). New York: Routledge.
Skehan, P. (2016). Foreign language aptitude, acquisitional sequences, and psycholinguistic processes. In
G. Grañena, D. Jackson, & Y. Yilmaz (Eds.), Cognitive individual differences in second language pro-
cessing and acquisition (pp. 17−41). Amsterdam: John Benjamins.
Snow, R.E. (1992). Aptitude theory: Yesterday, today, and tomorrow. Educational Psychologist, 27(1), 5–​32.
Stefanou, C., & Révész, A. (2015). Direct written corrective feedback, learner differences, and the acquisition
of second language article use for generic and specific plural reference. The Modern Language Journal,
99(2), 263–​282.
Swanson, H.L., & Berninger, V.W. (1996). Individual differences in children's working memory and writing
skill. Journal of Experimental Child Psychology, 63(2), 358−385.
Vasylets, O., & Marín, J. (2021). The effects of working memory and L2 proficiency on L2 writing. Manuscript
submitted for publication.
Vatz, K., Tare, M., Jackson, S.R., & Doughty, C.J. (2013). Aptitude-​treatment interaction studies in second
language acquisition. In G. Granena & M. Long (Eds.), Sensitive periods, language aptitude, and ultimate
L2 attainment, (pp. 273–​292). Amsterdam: John Benjamins.
Waters, G.S., & Caplan, D. (2003). The reliability and stability of verbal working memory measures. Behavior
Research Methods, Instruments, & Computers, 35(4), 550–​564.
Wen, Z.E., Biedroń, A., & Skehan, P. (2017). Foreign language aptitude theory: Yesterday, today and tomorrow.
Language Teaching, 50(1), 1–​31.
Wen, Z.E., Skehan, P., Biedroń, A., Li, S., & Sparks, R.L. (Eds.). (2019). Language aptitude: Advancing
theory, testing, research and practice. Abingdon: Routledge.
Williams, J. (2012). The potential role (s) of writing in second language development. Journal of Second
Language Writing, 21(4), 321–​331.
Yang, Y., Sun, Y., Chang, P., & Li, Y. (2019). Exploring the relationship between language aptitude, vocabulary
size, and EFL graduate students’ L2 writing performance. TESOL Quarterly, 53(3), 845–​856.
Zabihi, R. (2018). The role of cognitive and affective factors in measures of L2 writing. Written Communication,
35(1), 32–​57.
Zalbidea, J. (2017). “One task fits all”? The roles of task complexity, modality, and working memory capacity
in L2 performance. The Modern Language Journal, 101(2), 335–​352.

12
THE ROLE OF MOTIVATIONAL AND AFFECTIVE FACTORS IN L2 WRITING PERFORMANCE AND WRITTEN CORRECTIVE FEEDBACK PROCESSING AND USE
Mostafa Papi
Florida State University

Introduction
This chapter aims to highlight the importance of the learner factors of motivation and affect and
their potential contributions to understanding second language (L2) writing and development. For
this purpose, studies conducted on the role of motivational and affective factors in L2 writing and
written corrective feedback (WCF) are reviewed, future research directions are outlined, and peda-
gogical implications are discussed. The main topics covered in this chapter include learner beliefs,
motivational factors, and emotions.

Historical Perspectives
Research on the role of motivational and affective factors in L2 writing has been scarce and
mostly random. This has probably been due to the fact that L2 writing researchers typically
have a background either in second language acquisition (SLA) with a cognitive-​linguistic
perspective, or first language composition with a predominantly social approach (Polio &
Friedman, 2016). The SLA approach has focused on exploring the role of cognitive-​linguistic
factors such as task complexity in L2 writing (e.g., Johnson, 2017; Ong & Zhang, 2010),
whereas the social approach has focused on issues such as contrastive rhetoric, genre theories,
and sociocultural theory (Cumming, 2016). Due to their areas of expertise and interest, the two
groups of researchers have shown minimal interest in the role of motivational and affective
factors in L2 writing. Researchers interested in the motivational and affective factors in SLA,
likewise, have not paid sufficient attention to the role of these factors in the processes involved
in L2 writing and development. The main reason for such a disconnect has been attributed to
the common perspective which focuses on quantitative effects that such learner factors can
have on L2 outcomes rather than qualitative differences in L2 writing performance and devel-
opment (Papi, 2016, 2018). In this chapter, I discuss potential links between motivational and
affective factors on the one hand, and various dimensions of the L2 writing process and devel-
opment on the other.

Critical Issues and Topics


The empirical studies reviewed in the following section highlight the connection between major
motivational and affective learner factors on one hand, and the L2 writing process on the other
hand. First, studies on learner beliefs are discussed. While research on factors such as the role of
teachers and instructional methods, materials, and tasks furthers our understanding of the learning
environment, belief studies give researchers a glimpse of what is going on in a learner’s mind. Over
the course of learning how to write in a second language, students form opinions about different
aspects of their abilities and different aspects of the L2 writing process; these opinions may shape
how learners emotionally relate to and engage in the learning process. Through experience, how-
ever, these opinions seem to change or be reaffirmed (Manchón, 2009), leading to what is commonly
known as beliefs. According to Wenden (1986) beliefs are “opinions which are based on experience
and the opinions of respected others, which influence the way they act” (p. 5). The relationship
between beliefs and educational experiences, therefore, is reciprocal. Beliefs influence students’
cognitive, emotional, and behavioral engagement in L2 writing (e.g., Storch & Wigglesworth, 2010),
and L2 writing experiences influence beliefs (e.g., Manchón, 2009). Research on learner beliefs
marked the beginning of scholarly attention to the role of learners in research on L2 writing and
WCF. These studies included both qualitative and quantitative surveys, mostly conducted from a
pedagogical perspective, as a tool to identify what learners believe to be effective in their writing
development and whether learners would like to receive WCF on their L2 performance. Surveys
conducted on learner beliefs (e.g., Cardelle & Corno, 1981; Cohen, 1987; Ferris, 1995), however,
typically include a limited number of statements that reflect what the researcher considers myths or
false beliefs impeding learner’s L2 writing development. Other studies have looked at the nature
of beliefs and how they are influenced by the learner’s social context and instructional experiences
(e.g., Han, 2017; Manchón, 2009; Wan, 2014). The main goal of such studies is to find ways to
improve instruction.
Second, research on motivation highlights the basic principle that learners are in charge of the
learning process and without their will, nothing fundamental will happen regardless of the teaching
resources and techniques they are exposed to. Motivation studies draw on a wide range of theories,
models, and constructs. These include but are not limited to the integrative motive (e.g., Gardner,
1985), the L2 motivational self system (Dörnyei, 2009), the self-​determination theory (e.g., Noels,
2001), directed motivational currents (Dörnyei, Henry, & Muir, 2015), and more recently, language
mindsets (e.g., Waller & Papi, 2017), regulatory focus (Papi, 2018; Papi, Bondarenko, Mansouri,
Feng, & Jiang, 2019), buoyancy and resilience (Yun, Hiver, & Al-​Hoorie, 2018), and feedback-​seeking
behavior (Papi, Bondarenko, Wawire, Jiang, & Zhou, 2020). These attempts have greatly contributed
to our understanding of different facets of the highly complex notion of motivation. The multipli-
city of the conceptualizations, which shows the complexity and multidimensionality of motivation,
nonetheless, can be confusing. Following Papi and Hiver (2020), I have employed Higgins’s (2012)
classification of different motivational constructs under three broad dimensions: control, value, and
truth. According to Higgins (2012), people want to be effective in their lives, and they do so by being effective in terms of three dimensions: managing what happens (control), having desired outcomes (value), and establishing what is real (truth). In other words, motivation can come from the desire to engage in doing what one enjoys or is good at doing (control), the desire to achieve desirable end-states and avoid undesirable ones (value), and curiosity about discovering and learning the truth
of different matters (truth). In the next section, the motivation models concerned with end-​states in
L2 writing are discussed under the value dimension; those concerned with learner’s interest in the
knowledge of writing are discussed under truth; and those about the experience and process of L2
writing development are discussed under the control dimension.
Finally, research on emotions is discussed. Most studies on the role of emotions have focused on
L2 writing anxiety while other emotions such as enjoyment have received marginal attention, if any.
Even though a wide range of emotions have recently been introduced to and examined in the field of
SLA (e.g., MacIntyre, Gregerson, & Mercer, 2016; Teimouri, 2018), in the field of L2 writing only
anxiety has received attention and other emotions have either been examined only as secondary to
L2 writing anxiety (e.g., enjoyment and self-​confidence) or have been completely ignored (e.g.,
pride, shame, guilt, enthusiasm).

Current Contributions and Research

Beliefs About L2 Writing and WCF


Learner belief studies in the field of L2 writing can be categorized in terms of the content of the
beliefs: beliefs about learning and teaching how to write in a second language, and beliefs about
one’s abilities and potential to learn how to write in a second language (e.g., self-​efficacy beliefs
and mindsets). The former is discussed here and the latter under the topic of motivation.
Only a few studies in the first group have focused on learner beliefs about L2 writing in gen-
eral. Manchón (2009), Wan (2014), and Han (2017) explored such beliefs and found that they
can be malleable and change either over the course of a writing class or through more explicit
interventions. Manchón (2009) found that students’ initial beliefs about the self, the nature of L2
writing, and the L2 writing teacher changed positively by the end of an EAP course which included
instructional activities such as making students aware of the text construction process, training them on
writing strategies, analytic reading of texts, and providing feedback on their writing. Wan (2014)
found that a student-​generated metaphor-​sharing intervention was useful in broadening students’
beliefs and understanding of various aspects of the writing process and improved their writing skills
and practices. Han (2017) found that learner beliefs influenced students’ cognitive, behavioral and
affective engagement with L2 writing and WCF, which, in turn, influenced learners’ original beliefs
about L2 writing and WCF.
Studies on learner beliefs about WCF have been more common. Due to the specificity of beliefs
about WCF, such studies have typically been conducted using researcher-​produced questions. Studies
on WCF beliefs generally show that the majority of students prefer to receive WCF (e.g., Cardelle &
Corno, 1981; Cohen, 1987) especially if it is positive and encouraging (Ferris, 1995), whereas there
are other students who prefer not to receive any WCF. It also seems that high-performing students
tend to believe in the value of WCF and pay attention when they receive it whereas low-​performing
ones do not value it as much and are less interested in receiving or processing it (e.g., Cardelle &
Corno, 1981; Cohen, 1987). In another study, Radecki and Swales (1988) classified 59 ESL learners
into feedback receptors (46%), feedback semi-​resistors (41%), and feedback resistors (13%) based
on their feedback preferences. The receptors and semi-​resistors preferred to receive feedback on
the content and form of their writing and welcomed doing revisions whereas the resistors did not
want such feedback and saw revision assignments as a form of punishment. In sum, these studies
show that the majority of language learners consider WCF to be useful, and would like to receive
such feedback. However, in almost all of the studies reviewed above, there were students who
reported to have ignored teachers’ comments or did not believe in the learning value of WCF. These
studies also show that beliefs are malleable and with proper instructional interventions, students
can improve their belief systems, which, in turn, can contribute to more adaptive learning patterns.
Therefore, until we improve learner beliefs and preferences for WCF, we cannot expect WCF to be
as effective as we want it to be.


Motivation from Control


Bandura (1994) defines perceived self-​efficacy as “people’s beliefs about their capabilities to
produce designated levels of performance that exercise influence over events that affect their
lives” (p. 71). Self-​efficacy beliefs are directly related to the control aspect of motivation. Without
the confidence that one can accomplish a goal, one may not put sufficient effort in that pursuit.
Studies conducted on the construct of self-​efficacy beliefs have furthered our understanding of
the motivational processes underlying second language writing. Positive L2 writing self-​efficacy
beliefs have been found to enhance learner’s self-​regulatory control (Csizér & Tankó, 2015),
decrease their anxiety (Kirmizi & Kirmizi, 2015), improve their engagement with WCF (Ferris,
Liu, Sinha, & Senna, 2013), and contribute to higher levels of L2 writing quality (McCarthy,
Meier, & Rinderer, 1985).
According to Bandura (1997) learners form self-​efficacy beliefs based on information that
they collect from their enactive mastery experiences (in a given skill area), vicarious experiences
(observing a role model's success in the goal pursuit), verbal persuasion (positive and realistic feed-
back from others), and their physiological states regarding their chances of success or failure in
the pursuit. In L2 writing research, these factors have been found to significantly enhance students’
L2 writing self-​efficacy (Manchón, 2009; Sasaki, Mizumoto, & Murakami, 2018). Examples of
enactive mastery experiences include studying abroad, revising practice at the university’s writing
center, writing journals free of concerns about the form of language, and freely choosing the topic
of these journals; vicarious experiences include watching classmates' performance and using students' and peers' work as writing models; and verbal persuasion can be achieved through providing encouragement and positive feedback.
Teacher and peer feedback, which influences learners' self-efficacy beliefs, shows the interaction between the control and truth dimensions of motivation. Whereas self-efficacy concerns control over
the process of goal pursuit, feedback acts as a way for students to establish the truth about their
efficacy and progress in their goal pursuit; in other words, feedback helps learners establish that
their progress towards the goal is real and they are truly able to accomplish the task of L2 writing.
In Bandura’s account of the sources of self-​efficacy beliefs, feedback (corrective or otherwise),
which falls under verbal persuasion, has been shown to influence L2 writing self-​efficacy beliefs
but not always positively. Whereas teacher feedback has been found to give motivated students a
sense of progress (Busse, 2013) and enhance their self-​efficacy, peer feedback has not shown any
positive effects on self-​efficacy probably due to the fact that learners may not need perceive their
peer’s feedback to be as useful as their teacher’s feedback. In fact, Ruegg (2018) found that the
self-​efficacy beliefs of the students who received peer feedback were slightly lower than when they
started their writing course. Feedback can be demotivating when it is vague, not specific enough, or too comprehensive. This is probably because vagueness and lack of specificity detract from the truth dimension of such feedback, which no longer shows learners a clear picture of their weaknesses and strengths.
In addition, when feedback does not contain positive comments or when too much feedback is
provided, it may signal failure to some students and harm their self-​efficacy and motivation (Busse,
2013; Duijnhouwer, Prins, & Stokking, 2012). Negative feedback may be perceived by learners
as a sign of lack of progress or even ability. Interview data from the study by Duijnhouwer et al.
(2012) confirmed that students with low self-​efficacy beliefs interpreted the provision of feedback
as an indication of their teachers’ belief in their lack of competence, leading to poor self-​efficacy
beliefs. However, assuming that learners start every new task without clear self-​efficacy beliefs, a
sense of confidence in one’s abilities to complete a task should not be the only factor leading to
motivation for doing so. Something more than self-​efficacy is needed for someone to embark on
such a long-​term process of trial and error as writing in a second language. Ferris et al. (2013),
for example, found one of the participants to show low levels of L2 writing self-​efficacy but “a
teachable attitude," a mindset that helped her believe that she could learn and grow her L2 writing
ability. Such mindsets are the topic of the next component of the control dimension of motivation.
Dweck’s (1999) notion of mindsets refers to beliefs about the malleability of one’s intelligence.
Those who believe that one’s intelligence and natural talent can always grow through experience
are considered to have a growth mindset. Those who endorse a fixed mindset believe that their intel-
ligence can never change. Mindsets have been examined in the field of second language writing as
well. Waller and Papi (2017) examined 142 US-​based ESL learners’ mindsets about their L2 writing
talents in relation to their preference for receiving WCF and motivation. The study found that ESL
learners who had a growth L2 writing mindset, that is, who believed that their ability to learn how to write in a second language was malleable, showed a preference for receiving WCF and high levels of L2 writing motivation, whereas those who endorsed a fixed mindset, believing that their natural talent for L2 writing was fixed, reported a feedback-avoiding orientation and low motivation.
In another recent study in the foreign language context of the USA, Papi, Bondarenko,
Wawire, Jiang, and Zhou (2020) examined the notion of L2 mindsets (beliefs about the mal-
leability of one’s language-​learning abilities) in relation to student-​writers’ feedback-​seeking
behavior. The researchers collected questionnaire data from students enrolled in foreign lan-
guage writing classes in the United States and found that the learners’ growth L2 mindset signifi-
cantly predicted the perceived value of feedback seeking, which, in turn, predicted the students’
feedback-​seeking behaviors. A fixed mindset, on the other hand, significantly predicted the
perceived self-​presentation cost of feedback seeking (e.g., feeling embarrassed), which itself
negatively predicted feedback-​seeking behaviors. The findings suggest that students with a
growth mindset value feedback and seek it through different means whereas those with a fixed
mindset see feedback seeking largely as a costly behavior to be avoided. Related to the notion
of mindsets are achievement goals.
Achievement goals have a long history of research in the field of motivation (Ames, 1992; Elliott
& Dweck, 1988). According to the original versions of this theory, individuals are motivated to
achieve two types of goals in their pursuits: a performance goal and a mastery (or learning) goal.
Individuals who pursue mastery goals “seek to increase their ability or master new tasks” (Elliott
& Dweck, 1988, p. 5). Those who pursue performance goals, on the other hand, “seek to maintain
positive judgments of their ability and avoid negative judgments by seeking to prove, validate,
or document their ability and not discredit it” (Elliott & Dweck, 1988, p. 5). The achievement
goals have been found to lead to differences in cognitive, emotional, and behavioral patterns in
learning (e.g., Elliott & Dweck, 1988). Learners with a growth mindset tend to follow mastery goals
whereas those with a fixed mindset tend to follow performance goals. The achievement goals can
serve learners in their long-​term pursuit for desirable end-​states. The mastery orientation has been
found to lead to increased use of writing strategies (He, 2005) and to contribute to the complexity (Rahimi & Zhang, 2019) and quality of L2 writing (He, 2005). Performance goals, on the other hand, have negatively predicted writing complexity (Rahimi & Zhang, 2019).
Self-​determination theory (Deci & Ryan, 1985) is based on the motivational principle that the
more self-​determined a learner’s goal is for completing a task the higher the learner’s motivation,
engagement and enjoyment would be during task completion. Whereas extrinsic motives are about
the desire for relatedness to the environment, intrinsic motivation concerns the desire for autonomy
and competence. This theory classifies motivations into five categories. The most self-​determined
type of motivation is intrinsic motivation which represents personal interest in and enjoyment of
an activity; among the more external types of motivations, integrated motivations are integrated
within value and belief systems; identified motivations are personally valued but not integrated;
introjected motivations are partially assimilated within the system of values; and external motiv-
ations are completely external to the person. Lack of any source of regulation and motivation is
labeled amotivation.
Even though self-determination theory is prominent in social and educational psychology, I am
aware of only two applications in L2 writing. Yeşilyurt’s (2008) analysis of self-​report data collected
from students enrolled in EFL writing classes showed that the more self-​determined types of motiv-
ation orientations (i.e., intrinsic motivation and identified regulation) significantly and positively
correlated with writing achievement whereas amotivation had a negative correlation with writing
achievement. In another study, Tsao, Tseng, and Wang (2017) found that learners’ intrinsic motiv-
ation positively predicted their evaluation of both teachers' and peers' WCF, whereas extrinsic
motivation did not.

Motivation from Value


Future selves, outlined within the framework of the L2 Motivational Self System (L2MSS), have
been at center stage of L2 motivation research over the last decade. Dörnyei (2009) drew on the
self-​discrepancy theory (Higgins, 1987) and the possible-​selves theory (Markus & Nurius, 1986) to
establish the theoretical foundation of the model. He posited that the difference between a learner’s
current L2 self and his or her desired L2 selves would create a discomfort that learners want to
minimize by approaching those future selves. There are two desired selves outlined in the L2MSS,
ideal L2 self and ought-​to L2 self. An ideal L2 self is the image of an L2 user that the learner ideally
hopes to achieve in the future. An ought-​to L2 self, on the other hand, represents the obligations and
expectations that the learner thinks she or he has to realize in order to avoid negative consequences.
Even though a large number of studies have been conducted using the notion of selves to examine
general L2 motivation, the role of selves in the area of L2 writing motivation has been explored
only in a few studies.
In the context of South Korea, Jang and Lee (2019) found that ideal L2 self positively predicted
students’ planning, overall writing strategy use, and writing quality whereas the ought-​to L2 self
only predicted the revising strategy. Similarly, Csizér and Tankó (2015) found that English-​major
students with stronger ideal L2 selves reported using more self-​regulatory control strategies in their
advanced academic writing classes. The results of these two studies confirmed that the ideal L2 self,
which has a promotion focus concerned with the presence or absence of positive consequences,
results in an eager tendency to take advantage of more writing strategies, which can in turn result in
the higher quality of writing. The ought-​to L2 self, on the other hand, leads to a vigilant tendency
to avoid mistakes; hence, it increases the use of revising strategies (Papi, Bondarenko, Mansouri,
Feng, & Jiang, 2019; Papi & Khajavi, 2021). Using future-self scales specifically developed for
L2 writing contexts, Tahmouresi and Papi (2021) found that ideal L2 writing self and ought-​to L2
writing self positively predicted L2 writing motivation. However, whereas ideal L2 writing self
positively predicted L2 writing achievement and fluency, ought-​to L2 writing self was a negative
predictor of L2 writing achievement.
According to regulatory focus theory (Higgins, 1997), individuals develop two motivational
systems sensitive to different kinds of end-​states. A promotion system is concerned with growth,
accomplishments, and achievements. Individuals with a strong promotion system approach posi-
tive end-​states, are sensitive to the presence (gain) or absence (non-​gain) of positive outcomes,
and follow an eager strategy in their goal pursuit to take advantage of every opportunity and maxi-
mize their chances of gain. A prevention system, on the other hand, is concerned with security,
responsibility, and calmness. Individuals with a strong prevention system tend to avoid nega-
tive end-​states, are sensitive to the presence (loss) or absence (non-​loss) of negative outcomes,
and follow a vigilant strategy in their goal pursuit to avoid making mistakes and minimize their
chances of loss. Papi (2016, 2018) used framing as a technique to examine how a match between
learners’ dominant regulatory focus (promotion vs. prevention) and the way a task is framed would
affect learners’ engagement and incidental vocabulary learning through an integrated reading and
writing task. He asked 188 ESL students in the United States to complete the task, which was
framed in two conditions, promotion and prevention. Students in the promotion condition received
instructions in gain terms; they were told they would start the essay with zero points and had to obtain
75 points to qualify for a chance to win one of three $100 gift cards through a raffle. Those in
the prevention condition were told they would start the study with 100 points and had to avoid
losing more than 25 points to qualify for the raffle. The results of the study showed that learners
in the gain condition and those with a stronger promotion focus were more engaged in the task
and learned more vocabulary items. In addition, those with a stronger prevention focus learned
more vocabulary items in the loss condition than in the gain condition. Learners with a promotion
focus, however, performed similarly across the two conditions. Papi attributed these asymmetric
results to the promotion nature of writing an argumentative essay, a question that deserves scholarly attention and can open a new avenue of research on the link between motivation and L2
writing performance and development.
Some sources of motivation relate to one’s personality, belief, value, and motive systems
while other sources relate to one’s social and instructional environment, which itself shapes our
belief, value, and motive systems. Regardless of its source, motivation for making a decision
to act or not to act seems to boil down to a final analysis of the costs and benefits involved in
making decisions. In other words, whereas our personal dispositions, beliefs, and values lead to
differential and biased calculations of the cost and value involved in a decision, the cost-​value
analysis, no matter how biased, is the most immediate predictor of our final decisions. Such
analysis has usually been highlighted in the expectancy-value theory of motivation (e.g., Eccles, Wigfield, & Schiefele, 1998), which proposes that individuals' beliefs about the likelihood of
their success in task performance and the extent to which they value the activity explain their
choice and motivation in task performance.
In the field of L2 writing, a few studies have employed this framework. Lin, Cheng, Lin, and
Hsieh (2015) found that learners’ expectancy-​value motivation, constituted by utility value, interest
value, cost, connectedness value, and ability self-concept of L2 writing, significantly predicted
learners’ self-​regulation strategies and abstract-​writing ability. Papi, Bondarenko, Wawire, Jiang,
and Zhou (2020) also used a cost-​value framework and found that the value of feedback posi-
tively and the self-​presentation cost of feedback-​seeking negatively predicted feedback-​seeking
behavior. Even though few studies have used a cost-​value framework to understand the motiv-
ational dynamics of L2 writing, the cost-​value explanation can be extended to interpret the results
of some other studies. For instance, Han and Hyland (2015) found that when given indirect,
selective, and focused WCF on their most frequent errors, learners ignored, misinterpreted, or
misidentified WCF as content feedback. In addition, learners’ negative emotional reactions to the
lack of comprehensible feedback led to their cognitive and behavioral disengagement. From a
cost-​value perspective, it seems that even though the researchers have reduced the cost of WCF
by making it indirect, they have reduced its value at the same time by making it vague and incom-
prehensible. Giving clear, direct, and comprehensible feedback can increase the learning value of
the feedback and result in more positive outcomes especially for learners with adaptive motiv-
ational profiles (Papi, Bondarenko, Wawire, Jiang, & Zhou, 2020; Papi, Wolff, Nakatsukasa, &
Bellwoar, 2021).

Motivation from Truth


The truth aspect of L2 motivation is represented in constructs such as intrinsic interest in learning
foreign languages or positive attitudes towards or curiosity about the target language community
and culture. These have been classic constructs in Gardner's theory of L2 motivation (e.g., Gardner,
1985) but rarely examined in L2 writing. Students’ attitudes towards ESL writing have generally
been found to be positive (e.g., Al-Sobhi, Md Rashid, & Abdullah, 2018; Yoon & Hirvela, 2004)
and positive attitudes towards L2 writing have been related to success in L2 writing (Hashemian &
Heidari, 2013). Learners’ intrinsic knowledge orientation (represented by items such as “I experi-
ence a great pleasure while discovering new techniques of expression of ideas and feelings through
writing”) has also been found to positively predict learners’ positive evaluation of both teacher and
peer WCF (Tsao et al., 2017) and their L2 writing achievement (Yeşilyurt, 2008).

Emotions
L2 writing anxiety negatively affects the quality of individuals’ cognitive and behavioral engage-
ment in the process of L2 writing and has been a topic of research in L2 writing for over two
decades. Early studies on L2 writing anxiety were conducted using Daly and Miller’s (1975)
Writing Apprehension Test (WAT), which was developed to examine anxiety in the context of first
language writing. In response to the needs of the field of L2 writing and the limitations of WAT,
Cheng (2004) developed a questionnaire called the Second Language Writing Anxiety Inventory
(SLWAI) for measuring L2 writing anxiety, which has become the most-​commonly used measure
in this area. The questionnaire has not only shown appropriate psychometric properties but it has
also provided a conceptual framework for understanding L2 writing anxiety as a unitary but multi-
dimensional construct with three subcomponents: cognitive, somatic, and behavioral avoidance.
The cognitive subcomponent of L2 writing anxiety represents the thoughts and worries underlying
the feeling of anxiety. These include but are not limited to negative expectations and concerns
about others’ evaluations (e.g., While writing English compositions, I feel worried and uneasy if
I know they will be evaluated.). The somatic subcomponent concerns the physiological symptoms
of feeling anxious such as tenseness and nervousness (e.g., I freeze up when unexpectedly asked to
write English compositions.). Finally, the behavioral avoidance subcomponent reflects the behav-
ioral consequences of anxiety such as missing classes and avoiding challenging situations (e.g.,
I usually do my best to avoid writing English compositions.). L2 writing anxiety has often been
found to negatively affect different aspects of the L2 writing process and performance (for a meta-​
analysis see Teimouri, 2018). For instance, Cheng (2004) collected data from 421 EFL learners in
Taiwan and found that both global L2 writing anxiety and its components negatively correlated
with L2 writing self-​efficacy, motivation, and performance as well as willingness to take L2 writing
courses. Other studies have found that L2 writing anxiety was negatively associated with learners’
L2 writing self-​efficacy (Kirmizi & Kirmizi, 2015), use of self-​regulatory control strategies in L2
writing (Tsao et al., 2017), perceived value of WCF (Tsao et al., 2017), and writing achievement
(McCarthy et al., 1985; Tahmouresi & Papi, 2021).
The effects of anxiety on L2 writing performance and achievement are not fixed, though, and
seem to vary depending on task complexity. Zabihi, Mousavi, and Salehian (2018) found that for a
simpler narrative task, somatic anxiety and cognitive anxiety negatively correlated with accuracy
whereas for a more complex argumentative writing task, cognitive anxiety negatively correlated
with one measure of accuracy and all the measures of fluency. Similarly, Rahimi and Zhang (2019)
found that the behavioral component of L2 writing anxiety was negatively associated with writing
complexity in a complex but not a simple task. These results suggest that L2 writing anxiety can
play a role in task performance especially when the task has a high level of complexity, supporting
Skehan’s (1998) trade-​off hypothesis that increasing task difficulty results in a trade-​off between
the complexity, accuracy, and fluency measures of L2 production.
Given the negative impact of anxiety, researchers have taken interest in exploring the sources
of L2 writing anxiety and methods to minimize the effects of this unpleasant emotion. Studies on
the sources of L2 writing anxiety, however, have been rare. In one such example, Tahmouresi and
Papi (2021) found that the ideal L2 writing self (representing L2 writing skills one would ideally
like to possess) negatively correlated with L2 writing anxiety whereas the ought-​to L2 writing
self (represented by the L2 writing attributes one has to possess to avoid negative consequences)
positively correlated with L2 writing anxiety, which, in turn, negatively correlated with L2 writing
motivation and achievement. In addition, situational and contextual factors such as time pressure,
fear of negative teacher evaluation, lack of sufficient practice or self-​confidence, linguistic
problems, fear of writing tests, and pressure for delivering perfect work can also be major sources
of L2 writing anxiety among students (Kirmizi & Kirmizi, 2015; Rezaei & Jafari, 2014).
L2 writing enjoyment represents the feeling of joy and pleasure that learners experience while
writing in a second language. Traditionally, enjoyment has been investigated as the emotional
representation of intrinsic motivation. In addition to the studies on intrinsic motivation reviewed
above, other studies have looked at L2 writing enjoyment as an independent emotion. Tahmouresi
and Papi (2021) found that the ideal L2 writing self was positively associated with L2 writing
enjoyment, which, in turn, positively predicted L2 writing motivation and achievement. This makes
theoretical sense from a motivational perspective as it is not hard to imagine that moving towards
one’s desired end-​states could lead to enjoyable experiences.
In instructional experiences where the stakes of the tasks are low, learners also seem to experi-
ence more enjoyment. Students who receive peer feedback report having enjoyed and benefited
from their L2 writing instruction (Kurt & Atay, 2007). The use of technology seems to improve the
L2 writing experience and increase L2 students’ engagement, enjoyment, motivation, and overall
writing performance (Allen et al., 2014; Lo & Hyland, 2007). Students also experience more enjoy-
ment and show higher motivation in L2 writing when teachers respond to their work with empathy
and attention, give suggestions on how to improve, share their own personal experiences, and
encourage further reflection (Liao & Wong, 2010). Lower anxiety can also help students enjoy their
learning process (Tahmouresi & Papi, 2021).

Main Research Methods


Barcelos (2003) classified mainstream SLA studies on student beliefs into three approaches: nor-
mative, metacognitive, and contextual. The normative approach includes quantitative questionnaire
surveys that aim to examine learner beliefs and opinions about language learning and see how those
affect their future learning behavior. The metacognitive approach uses qualitative methods such as
interviews and content analysis to examine beliefs as more stable metacognitive skills and knowledge
that can help learners become more autonomous in learning. Finally, the contextual approach uses
ethnographic methods such as observations, life stories, and metaphors to examine beliefs as embedded
within the context of learning. Whereas the metacognitive and contextual approaches can give an in-depth
account of learner beliefs, questionnaire surveys, which typically include a limited number of belief
statements that the researcher chooses for students to respond to, can restrict the range of beliefs
that could otherwise be freely expressed by students in relation to the specific context in which they are
situated (Sakui & Gaies, 1999; Yang, 1992). Unless a survey of such beliefs and preferences is focused
on specific aspects of L2 writing such as WCF, the use of closed-ended questionnaires can only give us
a narrow understanding of learner beliefs about L2 writing.
Motivational and affective factors are usually explored using descriptive, correlational, and
experimental designs. Descriptive and correlational studies can be conducted using both quantita-
tive and qualitative tools and designs. Whereas interview questions and protocols can be tailored to
the specific purposes of each study, especially for exploratory and hypothesis-generating work
(e.g., Duijnhouwer et al., 2012; Papi & Hiver, 2020), there are questionnaires that have been
developed to measure L2-writing-​specific constructs such as self-​efficacy beliefs (e.g., Csizér &
Tankó, 2015), mindsets and motivation (e.g., Waller & Papi, 2017), intrinsic and extrinsic motives
(Tsao et al., 2017), achievement goals, cost-​value perceptions, and feedback-​seeking behavior
(Papi, Bondarenko, Wawire, Jiang, & Zhou, 2020), and anxiety (Cheng, 2004), which can be
adopted, adapted, and used to test hypotheses in L2 writing contexts. When it comes to learners’
future and long-term goals, however, it is best to combine both a qualitative and a quantitative
method for eliciting items that represent the future L2 writing selves of the specific sample under
study. For instance, Tahmouresi and Papi (2021) used an idiographic method for developing items
that represented their sample’s specific L2 writing self-​guides. In addition, experimental and quasi-​
experimental studies can be developed to test hypotheses formulated within different theoretical
frameworks (Allen et al., 2014; Lo & Hyland, 2007; Papi, 2018).

Recommendations for Practice


Learners come to L2 writing classes bringing with them their own complex sets of personality,
motive, and belief systems, which are beyond the control of L2 writing teachers. Teachers, on the
other hand, have the opportunity to provide positive learning experiences that can influence the
way learners make sense of the writing process as well as their own abilities and future goals,
which can, in turn, lead to their L2 writing development. Based on the findings of the studies
reviewed above, these experiences can include using technology, games, choice in writing
topics, free journal writing, new instructional programs, and study abroad, which have been
found to enhance learners' interest, motivation, and engagement in the writing process and, hence,
their L2 writing development.
Changing a fixed mindset to a growth one can lead to improving learners’ cognitive, emotional,
and behavioral patterns in their L2 writing pursuit (see Lou & Noels, 2016). Using interventions to
enhance L2 writers’ growth mindsets can have remarkable effects on their motivation to write, their
engagement in the writing process, and their feedback-​seeking behavior, among other things. The
adaptability of goal orientations is also good news for teachers who want to enhance such adaptive
learning patterns among their students. A mastery goal orientation can be enhanced, for example,
simply by having students reflect on the feedback they receive (Duijnhouwer et al., 2012).
Instructional techniques can be employed to assist highly anxious students. Studies have
found that compared with teachers’ WCF, peer feedback significantly lowers L2 writing anxiety
and increases learners' self-confidence (Kurt & Atay, 2007), especially when it is computer-mediated
(Zhang, Song, Shen, & Huang, 2014) and asynchronous (Iksan & Halim, 2018) rather than face-​
to-​face and immediate. Dialogue journal writing can also reduce students’ L2 writing anxiety and
contribute to their English writing competency, fluency, and reflections (Liao & Wong, 2010). Low-​
anxiety students prefer certain instructional strategies that help them minimize this undesirable
emotion. These strategies include enhancing background knowledge of writing topics, creating a
safe environment for making mistakes, peer correction, relaxation exercises, good preparation, and
more practice. Teachers have also reported enhancing students’ self-​confidence, adopting positive
attitudes towards making mistakes, using familiar writing topics, and adopting a process teaching
approach as effective instructional strategies (Qashoa, 2014).
Adopting a promotion-focused instructional approach centered on gains, achievements, and
accomplishments could enhance learners' risk-taking tendencies, which can, in turn, lead to
creative, fluent, and even more accurate writing performance. Such a promotion-focused approach
could include enhancing learners' ideal selves (Tahmouresi & Papi, 2021) or the use of promotion-
focused writing tasks (encouraging creativity and risk-taking), framing of instructions, incentives,
management, assessment, and feedback style (Papi, 2018; Papi, Rios, Pelt, & Ozdemir, 2019). Such
an approach would downplay the costs of making mistakes and unleash learners' potential for
writing creatively, effectively, and free of concerns about formal aspects of language.

Future Directions
It seems that it is time to move beyond descriptive studies that exclusively focus on beliefs
and WCF preferences. A more theoretically constructive way of exploring beliefs would be in
connection with learners' motivational and emotional dispositions such as their mindsets (e.g.,
Waller & Papi, 2017), which may underlie learners’ variable preferences in L2 writing and WCF
(see Papi, Wolff, Nakatsukasa, & Bellwoar, 2021). The notion of mindsets is very new in the
field of second language writing and has great potential for research and practice. Likewise,
achievement goals could play an important role in L2 writing motivation and engagement.
Studies on achievement goals in the field of L2 writing, however, have mostly been conducted
using the more traditional version of the model. Newer versions of achievement goals with add-
itional components (e.g., Korn & Elliot, 2016; Elliot, Murayama, & Pekrun, 2011) can further
our understanding of the role of achievement goals in L2 writing.
Learning how to write in a second language is a long-term process that requires more self-determined
forms of motivation grounded in needs for autonomy and competence. Understanding the L2 writing
process through the lens of self-determination theory, therefore, deserves more scholarly attention
as it can contribute to effective writing instruction through, for instance, creating the conditions that
enhance learners’ interest and engagement in the process of L2 writing.
With the introduction of more nuanced conceptualizations and measurement tools for
L2 selves (e.g., Papi, Bondarenko, Mansouri, Feng, & Jiang, 2019; Papi & Khajavy, 2021;
Tahmouresi & Papi, 2021), exploring the role of future selves in L2 writing can be eye opening.
Examining future selves can help us understand not only L2 writers’ motivation and persistence
(Feng & Papi, 2020), but also the process of L2 writing, the quality and quantity of L2 writing
strategies, and learners’ emotional experience during their L2 writing processes.
Exploring learners’ chronic motivational differences such as their regulatory focus and mode
(Teimouri, Papi, & Tahmouresi, 2021), which underlie personality types (Higgins, 2012), can
be another generative and practical direction for future research in L2 writing. Research
studies could be designed to situationally induce regulatory foci and modes as interventions to
increase learners’ orientation towards different dimensions (complexity, accuracy, fluency) of
L2 writing tasks and promote L2 writing development (e.g., Papi, 2016, 2018).
Research on the role of emotions in L2 writing has been largely limited to L2 writing anxiety.
Given the call for conducting more L2 research from a positive psychology perspective (MacIntyre
et al., 2016), it is imperative that the role of positive and constructive emotions, such as enjoyment,
be extensively investigated in the field of L2 writing. Such research not only can further our understanding
of learner engagement in L2 writing, but it can also contribute to L2 writing instruction and further
connect L2 writing research and pedagogy.
Highlighting the lack of attention to the role of the learner in research on WCF, Papi, Bondarenko,
Wawire, Jiang, & Zhou (2020; see also Papi, Rios, Pelt, & Ozdemir, 2019) have recently proposed
the notion of feedback-​seeking behavior (FSB) in L2 writing and learning. Drawing on similar
work from the field of organizational psychology, the authors made a case for paying attention to
the proactive and agentic role of the learner in the feedback process and for viewing WCF as a
learner resource. Such a perspective puts the learner in charge of the learning process. It recasts
WCF as a learning resource and views the success of WCF to be a function of learners’ motivated
and strategic attention to the information that they perceive as feedback. It complements the current
research on the kinds of WCF teachers employ by drawing attention to whether the learner is
seeking feedback to begin with. A new agenda for research on FSB can therefore be set. Such an
agenda should focus on the individual, interpersonal, contextual, and instructional factors that can
be manipulated to enhance learners' FSB and, thereby, the success of WCF. This line of research is still
in its infancy but can make significant contributions to L2 writing research and instruction (see
Papi, Rios, Pelt, & Ozdemir, 2019).

References
Allen, L.K., Crossley, S.A., Snow, E.L., & McNamara, D.S. (2014). L2 writing practice: Game enjoyment as
a key to engagement. Language Learning & Technology, 18(2), 124–150.
Al-​Sobhi, B., Md Rashid, S., & Abdullah, A.N. (2018). Arab ESL secondary school students’ attitude toward
English spelling and writing. SAGE Open, 8(1). https://​doi.org/​10.1177/​2158244018763477
Ames, C. (1992). Achievement goals and the classroom motivational climate. Student Perceptions in the
Classroom, 1, 327–​348.
Bandura, A. (1994). Self-​efficacy. In V.S. Ramachaudran (Ed.), Encyclopedia of Human Behavior (Vol. 4) (pp.
71–​81). New York: Academic Press.
Bandura, A. (1997). Self-​efficacy: The exercise of control. New York: W.H. Freeman and Company.
Barcelos, A.M.F. (2003). Researching beliefs about SLA: A critical review. In P. Kalaja & A. M.F. Barcelos
(Eds.), Beliefs about SLA: New Research Approaches (pp. 7–​33). Dordrecht: Kluwer.
Boroujeni, A.A.J., Roohani, A., & Hasanimanesh, A. (2015). The impact of extroversion and introversion per-
sonality types on EFL learners’ writing ability. Theory and Practice in Language Studies, 5(1), 212–​218.
Busse, V. (2013). How do students of German perceive feedback practices at university? A motivational
exploration. Journal of Second Language Writing, 22, 406–​424.
Cardelle, M., & Corno, L. (1981). Effects on second language learning of variations in written feedback on
homework assignments. TESOL Quarterly, 15(3), 251–​261.
Cheng, Y.S. (2004) A measure of second language writing anxiety: Scale development and preliminary valid-
ation. Journal of Second Language Writing, 13(4), 313–​335.
Cohen, A.D. (1987). Student processing of feedback on their compositions. In A.L. Wenden & J. Rubin (Eds.),
Learner strategies in language learning (pp. 57–​69). Englewood Cliffs, NJ: Prentice-​Hall International.
Csizér, K., & Tankó, G. (2015). English majors’ self-​regulatory control strategy use in academic writing and its
relation to L2 motivation. Applied Linguistics, 38(3), 386–​404.
Cumming, A. (2016). Theoretical orientations to L2 writing. In R.M. Manchón & P.M. Matsuda (Eds.),
Handbook of second and foreign language writing (pp. 65–​90). Berlin: De Gruyter Mouton.
Daly, J.A., & Miller, M.D. (1975). Further studies on writing apprehension: SAT scores, success expectations,
willingness to take advanced courses and sex differences. Research in the Teaching of English, 9(3),
250–​256.
Deci, E.L., & Ryan, R.M. (1985). The general causality orientations scale: Self-​determination in personality.
Journal of Research in Personality, 19(2), 109–​134.
Dörnyei, Z. (2009). The L2 Motivational Self System. In Z. Dörnyei & E. Ushioda (Eds.), Motivation, lan-
guage identity and the L2 self (pp. 9–​42). Bristol: Multilingual Matters.
Dörnyei, Z., Henry, A., & Muir, C. (2015). Motivational currents in language learning: Frameworks for
focused interventions. Abingdon: Routledge.
Duijnhouwer, H., Prins, F.J., & Stokking, K.M. (2012). Feedback providing improvement strategies and reflec-
tion on feedback use: Effects on students’ writing motivation, process, and performance. Learning and
Instruction, 22(3), 171–​184.
Dweck, C.S. (1999). Self-theories: Their role in motivation, personality, and development. Hove: Psychology
Press.
Eccles, J.S., Wigfield, A., & Schiefele, U. (1998). Motivation to succeed. In W. Damon & N. Eisenberg
(Eds.), Handbook of child psychology: Social, emotional, and personality development (pp. 1017–1095).
Hoboken, NJ: Wiley.
Elliot, A.J., Murayama, K., & Pekrun, R. (2011). A 3 × 2 achievement goal model. Journal of Educational
Psychology, 103(3), 632.
Elliott, E.S., & Dweck, C.S. (1988). Goals: An approach to motivation and achievement. Journal of Personality
and Social Psychology, 54(1), 5.
Feng, L., & Papi, M. (2020). Persistence in language learning: The role of grit and future self-guides. Learning
and Individual Differences, 81, 101904.
Ferris, D.R. (1995). Student reactions to teacher response in multiple-​draft composition classrooms. TESOL
Quarterly, 29(1), 33–​53.
Ferris, D.R., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22, 307–​329.
Gardner, R.C. (1985). Social psychology and second language learning: The role of attitudes and motivation.
London: Arnold.
Han, Y. (2017) Mediating and being mediated: Learner beliefs and learner engagement with written corrective
feedback. System, 65, 133–​143.
Han, Y., & Hyland, F. (2015). Exploring learner engagement with written corrective feedback in a Chinese
tertiary EFL classroom. Journal of Second Language Writing, 30, 31–​44.
Hashemian, M., & Heidari, A. (2013). The relationship between L2 learners’ motivation/​attitude and success
in L2 writing. Procedia-​Social and Behavioral Sciences, 70, 476–​489.
He, T. (2005). Effects of mastery and performance goals on the composition strategy use of adult EFL writers.
The Canadian Modern Language Review, 61(3), 407–431.
Higgins, E.T. (1987). Self-​discrepancy theory: What patterns of self-​beliefs cause people to suffer? Advances
in Experimental Social Psychology, 22, 93–136. http://dx.doi.org/10.1016/S0065-2601(08)60306-8.
Higgins, E.T. (1997). Beyond pleasure and pain. American Psychologist, 52(12), 1280.
Higgins, E.T. (2012). Beyond pleasure and pain: How motivation works. Oxford: Oxford University Press.
Iksan, H., & Halim, H.A. (2018). The effect of e-​feedback via wikis on ESL students’ L2 writing anxiety level.
MOJES: Malaysian Online Journal of Educational Sciences, 6(3), 30–​48.
Jang, Y., & Lee, J. (2019). The effects of ideal and ought-​to L2 selves on Korean EFL learners’ writing strategy
use and writing quality. Reading and Writing, 32(5), 1129–​1148.
Johnson, M.D. (2017). Cognitive task complexity and L2 written syntactic complexity, accuracy, lexical
complexity, and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing,
37, 13–​38.
Kirmizi, Ö., & Kirmizi, G.D. (2015). An investigation of L2 learners’ writing self-​efficacy, writing anxiety and
its causes at higher education in Turkey. International Journal of Higher Education, 4(2), 57–​66.
Korn, R.M., & Elliot, A.J. (2016). The 2 × 2 standpoints model of achievement goals. Frontiers in Psychology,
7, 742.
Kurt, G., & Atay, D. (2007). The effects of peer feedback on the writing anxiety of prospective Turkish teachers
of EFL. Journal of Theory and Practice in Education, 3(1), 12–​23.
Liao, M.T., & Wong, C.T. (2010). Effects of dialogue journals on L2 students’ writing fluency, reflections, anx-
iety, and motivation. Reflections on English Language Teaching, 9(2), 139–​170.
Lin, M.C., Cheng, Y.S., Lin, S.H., & Hsieh, P.J. (2015). The role of research-article writing motivation
and self-​regulatory strategies in explaining research-​article abstract writing ability. Perceptual & Motor
Skills: Learning & Memory, 120(2), 397–​415.
Lo, J., & Hyland, F. (2007). Enhancing students’ engagement and motivation in writing: The case of primary
students in Hong Kong. Journal of Second Language Writing, 16(4), 219–​237.
Lou, N.M., & Noels, K.A. (2016). Changing language mindsets: Implications for goal orientations and
responses to failure in and outside the second language classroom. Contemporary Educational Psychology,
46, 22–​33.
MacIntyre, P.D., Gregersen, T., & Mercer, S. (Eds.). (2016). Positive psychology in SLA. Bristol: Multilingual
Matters.
Manchón, R.M. (2009). Individual differences in foreign language learning: The dynamics of beliefs about L2
writing. RESLA, 22, 245–​268.
Markus, H., & Nurius, P. (1986). Possible selves. American Psychologist, 41(9), 954.
McCarthy, P., Meier, S., & Rinderer, R. (1985). Self-​efficacy and writing: A different view of self-​evaluation.
College Composition and Communication, 36(4), 465–​471.
Noels, K.A. (2001). New orientations in language learning motivation: Towards a model of intrinsic, extrinsic,
and integrative orientations and motivation. Motivation and Second Language Acquisition, 23, 43–​68.
Ong, J., & Zhang, L. J. (2010). Effects of task complexity on the fluency and lexical complexity in EFL
students’ argumentative writing. Journal of Second Language Writing, 19(4), 218–​233.
Papi, M. (2016). Motivation and learning interface: How regulatory fit affects incidental vocabulary learning
and task experience. East Lansing: Michigan State University.
Papi, M. (2018). Motivation as quality: Regulatory fit effects on incidental vocabulary learning. Studies in
Second Language Acquisition, 40(4), 707–​730.
Papi, M., Bondarenko, A.V., Mansouri, S., Feng, L., & Jiang, C. (2019). Rethinking L2 motivation research: The
2 × 2 model of L2 self-​guides. Studies in Second Language Acquisition, 41(2), 337–​361.
Papi, M., Bondarenko, A.V., Wawire, B., Jiang, C., & Zhou, S. (2020). Feedback-​seeking behavior in second
language writing: Motivational mechanisms. Reading and Writing, 33(2), 485–​505.
Papi, M., & Hiver, P. (2020). Language learning motivation as a complex dynamic system: A global perspec-
tive of truth, control, and value. The Modern Language Journal, 104(1), 209–​232.
Papi, M., & Khajavy, G.H. (2021). Motivational mechanisms underlying second language achievement: A
regulatory focus perspective. Language Learning, 71(2), 537–572.
Papi, M., Rios, A., Pelt, H., & Ozdemir, E. (2019). Feedback-​seeking behavior in language learning: Basic
components and motivational antecedents. The Modern Language Journal, 103(1), 205–​226.
Papi, M., Wolff, D., Nakatsukasa, K., & Bellwoar, E. (2021). Motivational factors underlying learner
preferences for corrective feedback: Language mindsets and achievement goals. Language Teaching
Research, 25(6), 858–877.
Polio, C., & Friedman, D.A. (2016). Understanding, evaluating, and conducting second language writing
research. Abingdon: Taylor & Francis.
Qashoa, S.H.H. (2014). English writing anxiety: Alleviating strategies. Procedia-​Social and Behavioral
Sciences, 136, 59–65.
Radecki, P.M., & Swales, J.M. (1988). ESL student reaction to written comments on their written work. System,
16(3), 355–​365.
Rahimi, M., & Zhang, L. J. (2019). Writing task complexity, students’ motivational beliefs, anxiety and their
writing production in English as a second language. Reading and Writing, 32(3), 761–​786.
Rezaei, M., & Jafari, M. (2014). Investigating the levels, types, and causes of writing anxiety among Iranian
EFL students: A mixed method design. Procedia-​Social and Behavioral Sciences, 98, 1545–​1554.
Ruegg, R. (2018). The effect of peer and teacher feedback on changes in EFL students’ writing self-​efficacy.
The Language Learning Journal, 46(2), 87–​102.
Sakui, K., & Gaies, S.J. (1999). Investigating Japanese learners’ beliefs about language learning. System, 27(4),
473–​492.
Sasaki, M., Mizumoto, A., & Murakami, A. (2018). Developmental trajectories in L2 writing strategy use: A
self-​regulation perspective. The Modern Language Journal, 102(2), 292–​309.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Storch, N., & Wigglesworth, G. (2010). Learners’ processing, uptake, and retention of corrective feedback on
writing: Case studies. Studies in Second Language Acquisition, 32(2), 303–​334.
Tahmouresi, S., & Papi, M. (under review). Future selves, enjoyment and anxiety as predictors of L2 writing
quality and achievement.
Teimouri, Y. (2018). Differential roles of shame and guilt in L2 learning: How bad is bad? The Modern
Language Journal, 102(4), 632–​652.
Teimouri, Y., Papi, M., & Tahmouresi, S. (2021). Individual differences in how language learners pursue goals:
Regulatory mode perspective. Studies in Second Language Acquisition, 1–26.
Tsao, J.J., Tseng, W.T., & Wang, C. (2017). The effects of writing anxiety and motivation on EFL college
students’ self-​evaluative judgments of corrective feedback. Psychological Reports, 120(2), 219–​241.
Waller, L., & Papi, M. (2017). Motivation and feedback: How implicit theories of intelligence predict L2
writers’ motivation and feedback orientation. Journal of Second Language Writing, 35, 54–​65.
Wan, W. (2014). Constructing and developing ESL students’ beliefs about writing through metaphor: An
exploratory study. Journal of Second Language Writing, 23, 53–​73.
Wenden, A. (1986). What do second-​language learners know about their language learning? A second look at
retrospective accounts. Applied Linguistics, 7(2), 186−205.
Yang, N.D. (1992). Second language learners’ beliefs about language learning and their use of learning strat-
egies: A study of college students of English in Taiwan (Doctoral dissertation). Retrieved from https://​
repositories.lib.utexas.edu/​handle/​2152/​74921
Yeşilyurt, S. (2008). Motivational patterns and achievement in EFL writing courses: An investigation from
self-​determination theory perspective. KKEFD, 18, 135–​154.
Yoon, H., & Hirvela, A. (2004). ESL student attitudes toward corpus use in L2 writing. Journal of Second
Language Writing, 13, 257–​283.
Yun, S., Hiver, P., & Al-​Hoorie, A.H. (2018). Academic buoyancy: Exploring learners’ everyday resilience in
the language classroom. Studies in Second Language Acquisition, 40(4), 805–​830.
Zabihi, R., Mousavi, S.H., & Salehian, A. (2018). The differential role of domain-​specific anxiety in learners’
narrative and argumentative L2 written task performances. Current Psychology, 39(4), 1438–​1444.
Zhang, H., Song, W., Shen, S., & Huang, R. (2014). The effects of blog-​mediated peer feedback on learners’
motivation, collaboration, and course satisfaction in a second language writing course. Australasian
Journal of Educational Technology, 30(6), 670–​685.

SECTION 5

Writing Research, Corrective Feedback, and Language Development
13
L2 WRITING AND GRAMMAR DEVELOPMENT
Charlene Polio
Michigan State University

Introduction
The goal of this chapter is to provide a current state-​of-​the-​art discussion of how writing does, might,
or might not promote grammatical development in second language learning. In Polio (2020), after
synthesizing research suggesting that writing might facilitate grammatical development, I outlined
research tasks that could provide additional and stronger evidence. I noted that many of the studies
completed at the time pointed to the benefits of writing, although few studies directly addressed the
issue. The focus of that chapter was on future research, so the present chapter is, in some sense, an
extensive and current prequel.
Because the focus of this volume is on language learning through writing, as opposed to the
learning of other (related) aspects of writing such as genre or writing strategies, I define writing as
the production of any type of written text beyond the sentence level regardless of whether or not
that text has a real-​life purpose. Thus, I include research that uses controlled writing activities such
as dictoglosses, where writers reconstruct texts as opposed to creating their own meaning (cf. Byrnes,
2020). I also include some studies on text chat because they shed light on the written modality in
areas where research is limited.
In discussing grammar development, I consider any short- or long-term changes in language use
as evidence of learning. Of course, some changes may not necessarily be more target-like; as lan-
guage becomes more complex, new errors may be introduced. In addition, more complex grammar
use does not always mean better written language, but greater linguistic complexity may be
some evidence of learning as writers expand their linguistic repertoire. The studies reviewed in
this chapter often assess learning after an intervention using a wide variety of syntactic complexity,
morphosyntactic accuracy, and fluency (CAF) measures, and I agree that changes in these measures
are important indicators of learning. Fluency is included because faster access to grammatical
structures is an indication of learning. Other studies discussed here do not assess development
per se but rather show how learners attend to grammar through the composing process. I am
assuming here that attention to grammar will generally have a facilitative effect on grammatical
development.
Finally, as a further clarification, I focus on grammar in this chapter, which includes morph-
ology, syntax, word forms, and prepositions, as opposed to the lexicon. This separation of grammar
and vocabulary is complex and, at times, problematic. On the one hand, I agree with various
usage-​based claims (e.g., Römer, 2009) that grammar and lexis cannot be separated. I am also
aware that studies that classify errors can be problematic because it is not clear whether the error
is attributable to a morphosyntactic error or whether a learner does not understand the grammar of
the word (e.g., a transitive/​intransitive problem). Additionally, the tension between the lexicon and
grammar is a theme that appears in some L2 writing research discussed in this chapter. It is also not
clear that writing has equally facilitative effects on vocabulary and grammar. For example, there
is an abundance of research showing the facilitative effects of writing on learning vocabulary
based on, in part, Laufer and Hulstijn’s (2001) task-​induced involvement hypothesis (and see Kyle,
Chapter 13, this volume) suggesting that certain tasks will allow for deeper processing of vocabu-
lary. Less research on depth of processing for grammar is available (but see Sachs & Polio, 2007,
and Leow & Suh, Chapter 2 this volume), so it is helpful to separate vocabulary and grammar when
considering the empirical research.
I begin with a historical perspective on how writing has been suggested as a way to facilitate
grammatical development. This is followed by a discussion of critical issues and a review of recent
empirical studies related to writing and grammatical development in six areas. I end with a brief dis-
cussion of methodological considerations in present and future research and with some pedagogical
implications of the research reviewed.

Historical Perspectives
Cumming (1990) is credited as being the first person to draw attention to the fact that writing can
facilitate acquisition. He wrote: “Composition writing elicits an attention to form-​meaning relations
that may prompt learners to refine their linguistic expression –​ and hence their control over their
linguistic knowledge –​so that it is more accurately representative of their thoughts and of standard
usage” (p. 483). Harklau (2002), soon after, pointed out that sometimes much more learning takes
place through reading and writing. Neither of these studies incited much research on the facilitative
effects of writing on grammar. With the exception of corrective feedback studies (e.g., discussion
in Roca de Larios & Coyle, Chapter 7 this volume) sparked by Truscott (1996), research on writing
as a site for grammar learning never took off. Rather, references to language learning through
writing are found buried in various studies, including Polio, Leder, and Fleck (1998) in their study
of corrective feedback. Some researchers studied second language acquisition using written lan-
guage (e.g., Swain, 1998; Swain & Lapkin, 1995; Bardovi-​Harlig, 2001) but did not mention how
modality might affect grammatical development.
One possible reason for a lack of early research on the topic was Krashen's (1982) argument
that writing simply promoted monitoring of oral language, akin to Truscott's (1996) claim that
written corrective feedback led only to “pseudo-learning” (p. 345) (i.e., explicit knowledge). However,
the application of Skill Acquisition Theory to SLA, promoted by DeKeyser (e.g., 2015), based
on the idea that we first learn about something (declarative knowledge) and then learn how
to do it (procedural knowledge), suggests a role for explicit or declarative knowledge, which
would include grammar learning. DeKeyser (2015) said that through practice, knowledge used
in routine procedures can be accessed more smoothly and more rapidly; procedural know-
ledge becomes gradually automatized. Skill acquisition theory has now been applied in L2
writing research to hypothesize and explain the effects of corrective feedback on learning (e.g.,
Hartshorn et al., 2010; Kurzer, 2018) and, to a lesser extent, other interventions in L2 writing
(Shintani, Aubrey, & Donnellan, 2016).
Writing as a way to learn language is not a new idea (e.g., Manchón, 2011), and Rubin and Kang
(2008) discussed the relationship between writing and speaking and suggested that writing can
serve as a springboard for speaking but did not investigate the matter empirically. Later, Williams
(2012) described exactly how writing might be even better than speaking at facilitating L2 learning
in general, and grammatical development in particular. Specifically, because writing is generally
slower and less fleeting than speaking, learners can better contemplate form-​meaning connections.
This can lead to the internalization of knowledge as well as to the modification and consolidation
of current knowledge. Williams’s discussion gave greater theoretical credence to the facilitative
effects of writing in second language grammatical development both with regard to the creation
of new knowledge and with regard to gaining control over and consolidating current knowledge.

Critical Issues and Topics


The research falling in the area of writing and grammatical development includes points we know
with some certainty as well as claims that most researchers believe but that have not been fully
supported empirically. As I will show, despite Williams’s (2012) well-​argued claims, we must do
some heavy extrapolating to argue that writing facilitates the acquisition of morphosyntax. The
most crucial issue addressed here is the learning potential of writing over speaking, and there are
three ways to consider this, as I have also discussed elsewhere (Polio, 2020). First, research could
potentially show that new grammatical structures emerge first in writing or that written language
is more complex or accurate than oral language among learners. For example, Weissberg (2006)
conducted a multiple case study and found that most structures appeared first in writing for four of
the five adult learners that he followed. But, of course, all learners are different and have different
amounts of exposure to oral and written language, and different types of instruction that may favor
one modality over another. Manchón and Vasylets (2019) noted that some research has shown that
learners produce more complex language in writing but that the evidence is more robust for vocabu-
lary than morphosyntax. As discussed further in the next section, this line of research is problematic
because oral and written language differ inherently in their types of complexity (Biber & Gray,
2010), with oral language relying more on clausal subordination and written language more on phrasal,
noun phrase complexity, suggesting that differences may be related to shifts in register and not ease of use in one particular
modality.
Two other avenues of research for showing the facilitative effects of writing on acquisition would
be, first, to show that structures learned through writing activities (e.g., dictoglosses or other text recon-
struction tasks) are then transferred to oral tasks. Theoretically, this should be possible, particularly
if structures learned through writing become more automatized. A second related approach is to
compare the facilitative effect of the modality of pre-​speaking activities (i.e., oral versus written)
on language production. In other words, does writing, as opposed to speaking, before a speaking task result
in more complex, accurate, or fluent oral language? Recent research in these areas is discussed in
the next section.
Other lines of research less directly related to the facilitative effect of writing on acquisition are
nevertheless relevant and have been more extensively researched. First, much previous research has
shown that writers attend to form as they compose, which might thus result in grammar learning.
This is related to the long history of research hypothesizing that output leads to acquisition (e.g.,
Swain, 1985). Some studies empirically demonstrated the positive effects of oral output (Izumi,
2002; Uggen, 2012), whereas others only partially supported the role of oral output (Morgan-Short &
Bowden, 2006; Izumi, Bigelow, Fujiwara, & Fearnow, 1999). There is copious evidence, however,
that learners pay attention to form as they compose, thus supporting Cumming’s (1990) claim.
Much of this early research on written language (e.g., Swain & Lapkin, 1995) studied collabora-
tive conditions, mostly because collaboration was a way to capture what writers were doing as
they wrote. Although there is still a strong interest in collaborative writing, more recent research,
discussed later, has also used think alouds or stimulated recall to understand what writers pay
attention to as they write.
Second, if writing facilitates grammatical development, we would expect to see some improve-
ment in terms of grammatical complexity and accuracy as students take writing classes. Unfortunately,
many studies have shown virtually no progress in writing classes in terms of grammatical accuracy
or complexity with instruction ranging from one semester to 2.5 years (e.g., Godfrey, Treacy, &
Tarone, 2014; Knoch, Rouhshad, & Storch, 2014; Knoch, Rouhshad, Oon, & Storch, 2015; Roquet
& Pérez-​Vidal, 2015; Serrano, 2011; Storch, 2009; Yoon & Polio, 2017; Polio & Shea, 2014). Yet,
six of these eight studies showed an increase in fluency, so one possible conclusion is that writing
instruction or practice is simply helping learners write more fluently, which should be considered
as some indication of development.
Finally, another indirect approach is to examine how interventions affect morphosyntax, such as
varying the writing prompt or writing conditions or adding some type of instruction
or prewriting. Studies of how prompt or task (including writing as opposed to oral tasks) affect
language do not directly show that writing promotes grammatical development, but they show
how certain tasks might push learning. The research on writing tasks and syntactic complexity is
copious, and Johnson’s (2017) recent meta-​analysis shows just how complex and noncomparable
the studies are. Nevertheless, he was able to conclude that certain task features seem to lead to an
improvement on some CAF measures and that “features of task complexity may promote attention
to the formulation and monitoring systems of the writing process” (p. 13). Although task and prompt
differences can affect grammar with regard to linguistic complexity, both Johnson (2017) and Yoon
and Polio (2017), who found more complex language in argumentative essays than narrative essays,
suggest that these differences could be related to the communicative functions of each genre
and not to cognitive demands. However, no studies have looked at the cumulative effects of such
tasks on grammatical development, so we do not know if tasks or genres that produce complex
language have any developmental outcomes. In addition, research into instruction or prewriting that
might affect grammatical development is shockingly rare, with the exception of
research on corrective feedback.

Current Contributions of Research


This section is organized into six areas with the last being, arguably, the most important. I highlight
a few key studies in each area.

How Learners Pay Attention to Form While Writing


It is extremely difficult to characterize exactly how and when learners pay attention to form when
they write because learner variables (e.g., proficiency) and task variables (e.g., time and task com-
plexity) affect the writing process. The most current research, such as Révész, Michel, and Lee
(2019), focuses on how researchers might access related cognitive processes but also provides some
enlightening information. Révész et al. used keystroke logging, eye-tracking, and stimulated
recall to describe what L2 writers focused on as they wrote. They coded the data for various aspects
of the writing process including syntactic encoding. In a stimulated recall, their 12 participants
were prompted to comment on their thoughts while pausing and mentioned morphosyntax 27
times, as opposed to lexical issues, which were mentioned 71 times. When describing their revision
behaviors, they commented on morphosyntax 74 times and lexical issues 114 times during a 40-​
minute writing task. Two conclusions are possible. First, writers are more likely to pay attention to
form –​or to at least explain that they are doing so –​at the revision stage rather than the formulation
stage. If this is true, it means that the writers may be applying explicit knowledge that they were not
able to access when writing quickly. Second, for some reason, lexical issues were more salient to
the writers. Révész et al. noted, however, that they considered only pauses of two seconds or longer,
so these longer pauses may not have fully captured all the lower-​level processes, such as focusing
on morphosyntax. They also noted that many of the writers looked at the keyboard while writing
and not the screen; this likely limits the amount of editing for language that can be done.
Lim, Tigchelaar, and Polio (in press) conducted a case study of six participants to try to under-
stand language development over the course of a semester. As part of the study, they conducted
stimulated recall sessions where students watched a screen capture of their composing and found
that students could identify errors and draw on explicit knowledge after they wrote. While com-
posing under time pressure, however, they were not able to edit, to the same extent, for language.
Both studies suggest, as we already know, that writing allows for use of explicit know-
ledge. If we assume that access to this knowledge becomes automatized, writing should facili-
tate grammar development. But given that we know for certain that L2 writers pay attention to
form to some extent while writing, we need to seriously consider what, very specifically, about
this attention may help them acquire language and whether or not further descriptions of writers’
behaviors will move the field forward. At this point, it is not completely clear that descriptions
in and of themselves are helpful unless there is some link to writing task, writing conditions, or,
most importantly, future learning.

Grammatical Development During Writing Instruction


As discussed earlier, many studies show no gains in the complexity or accuracy of grammar during
writing instruction, but these studies looked at global measures of complexity (e.g., words per
T-​unit) and accuracy (e.g., percentage of error-​free T-​units). Studies that target specific grammat-
ical structures may be more likely to detect change. For example, Mazgutova and Kormos (2015)
studied lower proficiency students’ written language and found an increase, over a semester, in
noun phrase complexity, a feature of academic writing. Man and Chau (2019) studied students
over 12 weeks examining the structure of evaluative that-​clauses distinguishing among verb
(I think that), noun (There is no doubt that), and adjective (It is unlikely that) clauses. They found
an increase in noun and adjective that-​clauses, which are both more common in academic writing,
but no increase in verb that-​clauses, which are more common in speaking.
These two studies differ from each other in that the Mazgutova and Kormos (2015) study
focused on students who were in an academic class whereas the students in Man and Chau (2019)
were taking multiskill classes. In neither study can we be certain about the effects of writing
instruction on grammatical development, but understanding what specific structures develop
raises some interesting questions. It seems that in both studies learners somehow developed the
features of academic writing. It may be that their actual repertoire of grammatical structures did
not change, but their ability to understand the characteristics of the genre did. Further support
for this can be seen in Friginal and Weigle (2014), who found that clusters of features related to
academic writing changed over the course of a semester despite the fact that studies of global
complexity and accuracy measures using the same data set showed limited change (Bulté &
Housen, 2014; Polio & Shea, 2014).
As mentioned earlier, several studies have shown no linguistic development with regard to
accuracy, but a few studies have shown some development for either children and adolescents or
students studying abroad, including Godfrey, Treacy, and Tarone (2014), Torras and Celaya (2001),
Roquet and Pérez-​Vidal (2015), and Llanes and Muñoz (2013). This suggests that when looking
at the global development of grammar, input and age may be more relevant than writing instruc-
tion. A recent meta-​analysis (Xu, 2019) confirmed an increase in syntactic complexity for students
studying abroad but the increase was small and students had greater improvement in oral skills and
in lexical complexity. These differences are relative to the learners' starting points but
confirm that studying abroad may lead to written grammatical gains.
Taken together, I would like to argue, these studies do not provide any evidence for the facili-
tative effects of writing on grammatical development. The absence of evidence does not mean,
however, that writing is not helpful to language learning, but rather that we do not have empirical
evidence for such a conclusion in this research and that the research on language development is
somewhat disappointing except for specific features of academic writing.

The Effects of Task and Writing Condition Manipulations on Writing


As discussed earlier, simply knowing that writers’ grammar is affected by prompt or task conditions
is interesting and has potential pedagogical implications, but it does not directly address how writing
might affect grammar development. In addition to the research reviewed by Johnson (2017), some
recent studies have focused on task sequencing, which alone does not address the superiority of
writing over speaking. They may, however, give us some insight into how writing tasks might
facilitate development.
Allaw and McDonough (2019) studied low proficiency learners of French and varied a written
three-​task sequence with a simple-​to-​complex condition and a complex-​to-​simple condition. With
regard to the accuracy of relative pronouns, there was improvement in both groups on an immediate
posttest, but the simple-to-complex group maintained the gains on the delayed posttest, whereas the
other group did not. Allaw and McDonough said:

Theoretically, the simple task with its simplified input stabilized the learners’ newly
acquired knowledge of the target lexical and grammatical forms. The medium task
created opportunities for the learners to express similar ideas, but with more independ-
ence due to the lack of the task structure, thereby promoting automatization. The com-
plex task took the learners’ potentials further by providing them with an opportunity to
create new form/​meaning connections by letting them engage in spontaneous writing
on a specific theme.
p. 9

Thus, this study suggests something about how producing new structures may take hold in the
learner’s interlanguage system. What is still not clear is if this can happen faster in writing than in
speaking.
In a recent study exploring task conditions, Kang and Lee (2019) studied individual versus col-
laborative planning with simple and complex tasks. While they found a benefit for collaborative
planning for lexical complexity on a complex task and individual planning on the simple task, they
found no difference for syntactic accuracy or complexity among the tasks and planning conditions.
It is difficult to draw clear conclusions from the study, despite Kang and Lee’s tenuous claim that
collaborative planning may be more beneficial. It does seem that vocabulary learning is more
susceptible to variations by condition and, in fact, Kang and Lee found that in the collaborative
planning condition, students focused on issues at the lexical level.
Using a more rigorous design, McDonough and De Vleeschauwer (2019) found that students
who participated in the collaborative prewriting sessions had greater linguistic accuracy on a posttest
after three writing sessions than students who did individual prewriting. The authors claimed that
their study, compared to others that did not find positive effects for collaborative prewriting on
accuracy, used more structured tasks. They concluded, “Thus, to impact L2 writers’ accuracy, col-
laborative prewriting may require structured materials which help students brainstorm ideas and
organize their ideas into a writing plan, which may free up their attentional resources to focus on
accuracy while writing” (p. 127). Interestingly, students in the individual condition did better on the
overall analytic essay scores, but it is not clear why.
Together, these studies show that scaffolding through task sequencing or prewriting might affect
language development by helping learners focus on linguistic features during prewriting that might
transfer to the written text. One study, however, found an effect on vocabulary only, suggesting, as
with other studies discussed in this chapter, a tension between vocabulary and grammar as they compete for the writers' attention.

Instructional and Prewriting Interventions


Given the more robust effects of planning on vocabulary, we might conclude that grammar-focused interventions are needed if writing is to facilitate grammatical development. The research on the lack of grammatical development in writing classes suggests this as well. As mentioned earlier, research in this area is rare, a rarity also noted by Shintani, Aubrey, and Donnellan (2016), who asked the very straightforward question, "Does a pre-task metalinguistic explanation (ME) affect the accuracy of a subsequent writing task?" (p. 947), and they found that it did. Students received metalinguistic instruction on the past counterfactual either before a text reconstruction task, after it, or not at all. Not only did the pre-task group perform better on the
structure on a posttest, but the results were more pronounced on a delayed posttest. Drawing on
Skill Acquisition Theory, the authors concluded that the pre-​task condition allowed the participants
to establish and then practice their explicit knowledge. In a similar study of the same structure,
Shintani (2017) studied the effect of explicit instruction pre-​, during, and post-​writing. She found
that learners without prior knowledge of a structure benefited from prewriting explicit instruction
whereas learners who already had prior knowledge of the structure benefited from being able to
access instructional materials while and after writing.
Adams, Nik Mohd Alwi, and Newton (2015) conducted a study of the effects of language
support on text chat. Their study was conducted within the framework of Robinson’s Cognition
Hypothesis (Robinson, 2005) with the rationale that the language support pre-​activities (reading a
dialogue seeded with the target structure, a metalinguistic explanation, and a multiple-choice quiz) were
a form of planning. As part of their study, they also included a task complexity variable (low versus
high task structure). They found that the language support group made fewer errors in both task
conditions but that syntactic complexity was not affected, thus showing some but limited benefit of
a language-​focused intervention.
All of these studies suggest that metalinguistic support before writing can reduce errors. This is
not surprising because students can draw on the explicit explanations and then focus their attention
on reducing errors. Furthermore, Shintani et al. (2016) found that the results were maintained on a
delayed posttest, so the effects might have a lasting impact.

Linguistic Differences in Speaking versus Writing of Language Learners


Linguistic differences in speaking versus writing –​or how modality affects performance as
opposed to learning –​might give us some insight into the potential of writing. Research on lin-
guistic differences in learner speech and writing, however, is difficult to interpret because of 1) indi-
vidual differences (e.g., age) and learning conditions (e.g., exposure to input) among learners; and
2) natural differences in oral and written language that may not be related to learning.
Furthermore, there are different opinions as to whether speaking or writing is more taxing. As discussed earlier, the prevailing view is that for L2 learners, writing is likely to be less taxing because it is slower; learners have more time to formulate what they want to say, making their language perhaps more accurate or more complex, with the hope that this will carry over to speaking.
However, there is also a body of research in psychology, focusing on L1 writers, that argues that
writing is more mentally taxing and results in weaker performance and poorer recall (Bourdin &
Fayol, 2002) in part because it puts a greater burden on working memory (with L2 researchers
arguing that writing may serve to extend working memory). Grabowski (2010) argued that writing may be more challenging only for children and concluded from his study that "For adult
university students with largely automated writing skills, no difference between spoken and written
output modes occurred, indicating that both verbal modalities involve low-​level processes with low
cognitive costs" (p. 32). Overall, it seems unwise to compare adult L2 learners to either L1 writers
or children.
In addition to learner differences, a potential problem in comparing spoken and written learner
language is that there are natural differences in spoken and written language, as well as differences
among genres (e.g., much of the work by Biber and colleagues, such as Biber & Conrad, 2009,
and Biber & Gray, 2010). In other words, just as research has shown that learners can vary language
according to genres (e.g., Yoon & Polio, 2017), they can likely vary their language according to
modality. A key study that takes these differences into account is Biber, Gray, and Staples’s (2016)
study of the TOEFL oral and written independent and integrated tasks. They found that among L2
learners, features normally associated with complexity (e.g., passives, premodifying nouns, verb +
that-​clauses, noun + that-​clauses) were in fact more common in writing than speaking. Interestingly,
these features were not associated with higher scores. Rather, they found that groups of features
according to Biber’s multidimensional model better predicted scores. The results are somewhat
difficult to interpret in terms of language development because the study was cross-sectional and based
on learners’ overall scores. The take-​away is that research needs to consider features of target texts,
particularly when examining oral versus written language.
Vasylets, Gilabert, and Manchón (2017) studied language differences of learners completing a
simple and complex task in oral and written modes. Not surprisingly, in written texts, the participants
produced more complex language in terms of length of units (specifically, analysis-​of-​speech, or
AS, units) and greater subordination. This could be due to natural differences in oral versus written
language, but without native speaker data on the same tasks, we cannot be sure. In addition, no
difference was found in terms of grammatical accuracy. In contrast, Zalbidea (2017) found, when
comparing Spanish L2 learners’ performance on written and oral versions of the same task, that
their oral language was more syntactically complex and that their written language was more lexic-
ally complex and more accurate. She posited that this was due to the nature of online planning (i.e.,
a focus on vocabulary and error reduction), which learners have time for in writing. This promising
line of research that relates task variables to modality and potential language learning outcomes is
further expanded on in Manchón (2020).
Additionally, Kormos and Trebits (2012) compared learner performance across oral and written
modalities (and included the relationship to task complexity and aptitude as well). They found
greater accuracy and lexical variety in writing but no difference in complexity. The authors suggest
that participants were more focused on vocabulary. Tavakoli (2014) found that increased task com-
plexity in an oral narrative task resulted in more grammatically complex language but not in a
written version of the task. These findings suggest that processing constraints are different across
the two modalities, but they do not necessarily show that writing can facilitate oral acquisition. It is
also not clear what the lasting effects of one modality over the other are.

How Modality Affects Learning


Some studies have examined how modality affects learning as opposed to performance. How
features learned in writing might transfer to oral language is a crucial issue related to the facilitative
effects of writing.
Much of the research on these topics comes from comparing face-​to-​face (FTF) discussion
versus text chat. On one hand, text chat has features distinct from the extended texts of other genres. On the other, if we are truly comparing only modality, it is quite appropriate to compare FTF discussion and text chat, because text chat is closer to speaking than essay writing is. Ziegler (2016) conducted a
meta-​analysis of 14 studies, eight of which looked at writing, and found a small positive effect for
text chat over FTF discussions on subsequent written language production. This is not surprising
given that one would expect a positive effect for written prewriting on writing. What is more signifi-
cant is that when Ziegler separated outcomes based on modality and considered only oral outcomes,
there was a small advantage for FTF communication. Thus, writing via text chat was not more
beneficial than speaking for subsequent oral production.
Chau (2014), in a study that truly compared advantages of writing versus speaking on subsequent
oral production, compared three conditions prior to an oral narrative: no planning, written rehearsal, and oral rehearsal. He found that students in both rehearsal groups improved their oral narratives in
terms of fluency, complexity, and accuracy but that there was no effect for modality. In analyzing
the rehearsals and follow-​up interviews, Chau concluded that both rehearsal groups spent much of
their time on lexical searches. The written rehearsal group did spend more time on form than the
oral rehearsal group, but this did not translate into an improvement on the CAF measures.
Liao (2018) compared FTF and text chat prewriting on the essays of L2 Chinese learners and
found that students had better Chinese character accuracy after the text chat condition and that there
was no effect on syntactic complexity. Kessler et al. (2020) replicated the study with students at a
somewhat lower level of proficiency using reliable measures of syntactic complexity and character
accuracy. They found that the FTF mode resulted in greater syntactic complexity in writing as well
as greater lexical complexity, concluding that there were no benefits for the text chat group. Kessler
et al. suggested that writing in Chinese is an arduous task for learners, and hence any facilitative effects of writing are eliminated.
Kim and Godfroid (2019) conducted a study of oral versus written input in the development
of implicit and explicit grammar knowledge. They found that written input had a greater effect on the development of implicit knowledge, concluding that "the permanence of visual input may be a critical advantage for beginning learners to develop implicit knowledge of word order rules" (p. 1), an argument similar to the one made for the advantages of writing over speaking. What is critical
about their study is that they conducted two experiments, one comparing aural input to a condition
where learners could read freely at their own pace and one in which participants read as words
appeared on the screen. Kim and Godfroid argued that the advantage for written input occurred only
when learners had control over their reading rate. This is parallel to the notion that writing better
promotes learning because it is slower.
Kim, Jung, and Skalicky (2019) conducted a study comparing face-​to-​face (FTF) and syn-
chronous computer-​mediated communication (SCMC) in an activity that focused on preposition
stranding. Their research was framed as an alignment/​priming study, namely, how learners’ syntax
was affected by the researcher’s language in a communicative task. They found that learners who
participated in the task where the researcher primed them with stranded prepositions were more
likely to use them and do better on a posttest than the control group. Furthermore, the group that
participated in the SCMC condition did better than the FTF groups on the immediate and delayed
posttests. This well-​designed study provides some evidence that the written mode might be better
for retention of grammatical structures.
Unfortunately, there is still not strong evidence that writing is better than speaking at promoting
development, and different studies have yielded different results. This could be due to learner
variables as well as language variables (as in the case of logographic languages). Furthermore,
studies use different measures, and, as discussed below, comparing oral and written conditions can
be problematic.

Main Research Methods


Research on what students do as they write is descriptive, while research in the other areas involves
the manipulation of some independent variable (e.g., prewriting activity or modality) or considers
time as an independent variable. Most research on morphosyntactic development is quantitative.
One theme that is evident in the discussion of studies in the last section is the contradictory findings
and lack of comparability across studies. This is because of both individual differences among learners
and different operationalizations of independent and dependent variables.
Methods for researching the writing process have been described extensively in articles and books (e.g., Polio & Friedman, 2017; Michel et al., Chapter 6; Révész et al., Chapter 25), so I will not detail the tools here but will briefly note that they traditionally include having learners think aloud as they compose or revise, or asking learners about their process after they have written. Thinking aloud,
discussed in detail in Bowles (2010), can be challenging and may interfere with the writing process,
but also has advantages in that it taps into what a writer is doing in real time. Interviews are problem-
atic in that writers may not accurately or completely recall what they were focused on or thinking as
they wrote. Thus, many studies choose to use stimulated recall, in which writers are shown a reminder,
usually a screen capture video, of the writing activity. The writers are then asked to recall their
thoughts while they were writing. The pros and cons and variations of stimulated recall are outlined
in Gass and Mackey (2016). Increasingly, these methods are being combined with eye-tracking or keystroke-logging data that show what writers are focused on as they compose, as in Révész et al.
(2019) discussed earlier. It is important to consider what types of knowledge each method can tap
into. As discussed throughout this chapter, there seems to be tension between lexical and grammat-
ical knowledge, and explicit and implicit knowledge. Most likely, lexical and explicit knowledge are
easier to talk about, so some methods will favor information about what can be verbalized.
Not only is data collection an issue, but data coding can be even more vexing. Dividing think-aloud or stimulated recall data into codable units and then establishing codes that are reliable and valid is a challenge, and different coding systems make cross-study comparisons difficult. López-Serrano,
Roca de Larios, and Manchón (2019) is a useful resource for anyone wishing to study how learners
focus on language while writing. They devised a coding system for language-​related episodes that
is theoretically sound and valid. The coding system includes not only the type of linguistic problems
that writers focus on but how the writers go about solving the problem (e.g., use of a grammatical
rule, cross-​linguistic comparison).
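Because reliability checks of this kind are usually reported as chance-corrected agreement statistics, a minimal sketch in Python is given below; it assumes two coders have independently assigned one of a small set of LRE categories to the same episodes, and the category labels and toy data are invented for illustration rather than taken from any published coding scheme.

```python
# Minimal sketch of an inter-coder agreement check for language-related
# episode (LRE) coding. The category labels and data are invented for
# illustration and do not reproduce any published coding scheme.
from sklearn.metrics import cohen_kappa_score

coder_a = ["lexis", "grammar", "lexis", "mechanics", "grammar", "grammar"]
coder_b = ["lexis", "grammar", "grammar", "mechanics", "grammar", "lexis"]

kappa = cohen_kappa_score(coder_a, coder_b)  # chance-corrected agreement
print(f"Cohen's kappa: {kappa:.2f}")
```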
For experimental research, issues of reliability and validity are key. These matters are
discussed at length with regard to writing research at large in Polio and Friedman (2017) and
include issues such as how one can validly and reliably measure, for example, syntactic com-
plexity. While this issue is challenging for any writing research, it is thornier in studies that
compare oral and written language. For example, Chan et al. (2015) divided oral and written
language into T-​units, a unit intended for writing, and Vasylets et al. (2017) used AS-​units, a unit
intended for oral language. Arguments can be made for both approaches, but these differences
make it difficult to compare studies.
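To make the measurement issue concrete, here is a minimal sketch of one common length-based complexity index, mean length of production unit; it assumes the analyst has already hand-segmented the text into T-units or AS-units (the step on which the two approaches mentioned above actually diverge), and the example units are invented.

```python
# Minimal sketch: mean length of production unit (e.g., T-unit or AS-unit).
# Segmentation into units, the hard and contested step, is assumed to have
# been done by the analyst; the units below are invented examples.
def mean_unit_length(units):
    """Average number of words per pre-segmented unit."""
    lengths = [len(unit.split()) for unit in units]
    return sum(lengths) / len(lengths)

t_units = [
    "The students wrote a short essay about their hometown",
    "and they revised it after receiving feedback from the teacher",
]
print(round(mean_unit_length(t_units), 2))  # words per unit
```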
Specific issues related to the studies discussed here include time-​on-​task, the measurement of
the durability of effects, and problems related to mimicking real-​life writing tasks and conditions
in a laboratory setting. Because writing is slower, studies comparing the effects of writing versus
speaking either need to use different treatment times, thus favoring the writing condition, or equal
amounts of time, possibly favoring the oral condition because students are able to complete the
intervention task. For example, Liao (2018) and Chau (2014) in their studies of modality on writing
and speaking, respectively, had groups of participants complete the pre-​activity in one of the two
modalities. Clearly, writing, especially in Chinese by L2 learners as in Liao’s study, will take longer
and thus learners can plan as extensively as they could in an oral pre-​activity. Durability of the
treatment is an important matter in that interventions in experimental studies might have short-​
term but not long-​term effects. Shintani et al. (2016) addressed this issue using both an imme-
diate and delayed posttest, but not all studies have done so. Finally, most studies discussed here
are laboratory-​based in that the writing tasks are used for research and mimic testing conditions
(e.g., TOEFL or IELTS type writing tasks) and not multi-​stage writing tasks that involve gathering
sources, writing over an extended period of time, and revising after feedback. Studies of L2 writing
that focus on such real-​life writing tasks tend to be less focused on language learning.


Recommendations for Practice


Despite mixed results in the research regarding grammatical development in written language and
the role of writing in grammatical development, some tentative recommendations for teaching can
be made.
First, overall, much of the research points to the need for more language support in writing
instruction. If we consider together the fact that several studies show a lack of grammatical development and the finding that studies such as Shintani et al. (2016) and Shintani (2017) showed a positive effect for prewriting grammatical interventions, such interventions seem warranted. Furthermore, the
research points to careful scaffolding of language support throughout the writing process during
prewriting, writing, and revision. Given that the research shows that learners might be more focused
on vocabulary, explicit attention to grammar is called for throughout the process.
Second, in addition to explicit attention to grammar in the prewriting stage, studies have looked
at the effects of collaborative versus individual and oral versus written planning on language.
McDonough and De Vleeschauwer’s (2019) study suggests that collaborative planning might result
in fewer errors, so it seems that collaboration might help learners focus on, rehearse, or refine their
grammar. But whether that planning should be oral (i.e., face-​to-​face) or written (i.e., via text chat) is
not completely clear.
Third, several of the studies reviewed suggest that learners may not pay as much attention to
grammar while formulating language as opposed to when they revise. It seems that many students
need time to revise explicitly for language. In terms of teaching, strategies for self-​editing might be
helpful, but there are also implications for assessment. Timed writing tasks may not allow learners
to apply their explicit knowledge.
Fourth, it is important to vary the genres that students write. As shown in
the research reviewed in this chapter, students may at times vary their language according to the
demands of the genres, so it may be best not to limit learners to what might be perceived as easier
genres. At the same time, we do not know for certain that pushing students to produce a variety of genres will lead to language learning, but it is possible that limiting genres has a detrimental effect insofar as it limits language production.
Finally, we need to consider whether or not students should simply write more and whether that
writing will help overall grammatical development. Unfortunately, there is not clear evidence that
more writing will promote language learning. Rather, it seems writing needs to be scaffolded with
proper language support.

References
Adams, R., Nik Mohd Alwi, N.A., & Newton, J. (2015). Task complexity effects on the complexity and
accuracy of writing via text chat. Journal of Second Language Writing, 29, 64–​81. https://​doi.org/​10.1016/​
j.jslw.2015.06.002
Allaw, E., & McDonough, K. (2019). The effect of task sequencing on second language written lexical com-
plexity, accuracy, and fluency. System, 85, 102–​104. https://​doi.org/​10.1016/​j.system.2019.06.008
Bardovi-​Harlig, K. (2001). Another piece of the puzzle: The emergence of the present perfect. Language
Learning, 51, 215–​264.
Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge: Cambridge University Press.
Biber, D., & Gray, B. (2010). Challenging stereotypes about academic writing: Complexity, elaboration, expli-
citness. Journal of English for Academic Purposes, 9, 2–​20.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across language exam
task types and proficiency levels. Applied Linguistics, 37(5), 639–​668. https://​doi.org/​10.1093/​applin/​
amu059
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-​term changes in L2 writing complexity.
Journal of Second Language Writing 26, 42–​65.
Bourdin, B., & Fayol, M. (2002). Even in adults, written production is still more costly than oral production.
International Journal of Psychology, 37(4), 219–​227.
Bowles, M.A. (2010). The think-aloud controversy in second language research. New York: Routledge.
Byrnes, H. (2020). Toward an agenda for researching L2 writing and language learning: The educational con-
text of development. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas
(pp. 73–​94). Amsterdam: John Benjamins.
Chan, H.P., Verspoor, M., & Vahtrick, L. (2015). Dynamic development in speaking versus writing in identical
twins. Language Learning, 65(2), 298–​325. https://​doi.org/​10.1111/​lang.12107
Chau, H.T. (2014). The effects of planning with writing on the fluency, complexity, and accuracy of L2 oral
narratives (Unpublished PhD dissertation). Michigan State University.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7, 31–​51.
DeKeyser, R. (2015). Skill acquisition theory. In B. VanPatten, & J. Williams (Eds.), Theories in second lan-
guage acquisition: An introduction (pp. 94–​112). New York: Routledge.
Friginal, E., & Weigle, S. (2014). Exploring multiple profiles of L2 writing using multi-​dimensional analysis.
Journal of Second Language Writing, 26, 80–​95. https://​doi.org/​10.1016/​j.jslw.2014.09.007
Gass, S.M., & Mackey, A. (2016). Stimulated recall methodology in second language research. New York:
Routledge.
Godfrey, L., Treacy, C., & Tarone, E. (2014). Change in French second language writing in study abroad and
domestic contexts. Foreign Language Annals, 47, 48–​65.
Grabowski, J. (2010). Speaking, writing, and memory span in children: Output modality affects cognitive per-
formance. International Journal of Psychology, 45 (1), 28–​39.
Harklau, L. (2002). The role of writing in classroom second language acquisition. Journal of Second Language
Writing, 11, 329–​350.
Hartshorn, K.J., Evans, N.W., Merrill, P.F., Sudweeks, R.R., Strong-​Krause, D., & Anderson, N.J. (2010).
Effects of dynamic corrective feedback on ESL writing accuracy. TESOL Quarterly, 44, 84–​109.
Izumi, S. (2002). Output, input enhancement and the noticing hypothesis: An experimental study on ESL rela-
tivization. Studies in Second Language Acquisition, 24, 541–​577.
Izumi, S., Bigelow, M., Fujiwara, M., & Fearnow, S. (1999). Testing the output hypothesis: Effects of
output on noticing and second language acquisition. Studies in Second Language Acquisition, 21(3),
421–​452.
Johnson, M.D. (2017). Cognitive task complexity and L2 written syntactic complexity, accuracy, lexical
complexity, and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing,
37, 13–​38.
Kang, S., & Lee, J.-​H. (2019). Are two heads always better than one? The effects of collaborative planning on
L2 writing in relation to task complexity. Journal of Second Language Writing, 45, 61–​72. https://​doi.org/​
10.1016/​j.jslw.2019.08.001
Kessler, M., Polio, C., Xu, C. & Hao, X. (2020). The effects of oral discussion and text chat on L2 Chinese
writing. Foreign Language Annals, 53, 666−685.
Kim, K.M., & Godfroid, A. (2019). Should we listen or read? Modality effects in implicit and explicit know-
ledge. The Modern Language Journal, 103(3), 648–​664. https://​doi.org/​10.1111/​modl.12583
Kim, Y., Jung, Y., & Skalicky, S. (2019). Linguistic alignment, learner characteristics, and the production of
stranded prepositions in relative clauses: Comparing FTF and SCMC contexts. Studies in Second Language
Acquisition, 1–​33. https://​doi.org/​10.1017/​S0272263119000093
Knoch, U., Rouhshad, A., Oon, S.P., & Storch, N. (2015). What happens to ESL students’ writing after three
years of study at an English medium university? Journal of Second Language Writing, 28, 38–​52.
Knoch, U., Rouhshad, A., & Storch, N. (2014). Does the writing of undergraduate ESL students develop after
one year of study in an English-​medium university? Assessing Writing, 21, 1–​17.
Kormos, J., & Trebits, A. (2012). The role of task complexity, modality, and aptitude in narrative task perform-
ance. Language Learning, 62(2), 439–​472. https://​doi.org/​10.1111/​j.1467-​9922.2012.00695.x
Krashen S. (1982). Principles and practice in second language acquisition. New York: Prentice-​Hall.
Kurzer, K. (2018). Dynamic written corrective feedback in developmental multilingual writing classes. TESOL
Quarterly, 52, 5–​33.
Laufer, B., & Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-​
induced involvement. Applied Linguistics, 22, 1–​26.
Liao, J. (2018). The impact of face-​to-​face oral discussion and online text-​chat on L2 Chinese writing. Journal
of Second Language Writing, 41, 27–​40.
Lim, J., Tigchelaar, M., & Polio, C. (in press). Understanding written linguistic development through writing
goals and writing behaviors. Language Awareness.
Llanes, A., & Muñoz, C. (2013). Age effects in a study abroad context: Children and adults studying abroad
and at home. Language Learning, 63, 63–90.
López-​Serrano, S., Roca de Larios, J., & Manchón, R. M. (2019). Language reflection fostered by individual
L2 writing tasks: Developing a theoretically motivated and empirically based coding system. Studies in
Second Language Acquisition, 41(3), 503–​527. https://​doi.org/​10.1017/​s0272263119000275
Man, D., & Chau, M. H. (2019). Learning to evaluate through that-​clauses: Evidence from a longitudinal learner
corpus. Journal of English for Academic Purposes, 37, 22–​33. https://​doi.org/​10.1016/​j.jeap.2018.11.007.
Manchón, R.M. (2011). Situating the learning-to-write and writing-to-learn dimensions of L2 writing. In R.M. Manchón (Ed.), Learning-to-write and writing-to-learn in an additional language (pp. 3–14). Amsterdam: John Benjamins.
Manchón, R.M. (2020). The language learning potential of L2 writing: Moving forward in theory and research.
In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 405–​426).
Amsterdam: John Benjamins.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Mazgutova, D., & Kormos, J. (2015). Syntactic and lexical development in an intensive English for Academic
Purposes programme. Journal of Second Language Writing, 29, 3–​15.
McDonough, K., & De Vleeschauwer, J. (2019). Comparing the effect of collaborative and individual prewriting
on EFL learners’ writing development. Journal of Second Language Writing, 44, 123–​130. https://​doi.org/​
10.1016/​j.jslw.2019.04.003
Morgan-​Short, K., & Bowden, H. (2006). Processing instruction and meaningful output-​based instruction.
Studies in Second Language Acquisition, 28, 31–65.
Polio, C., Fleck, C., & Leder, N. (1998). “If I only had more time:” ESL learners’ changes in linguistic accuracy
on essay revisions. Journal of Second Language Writing, 7, 43–​68.
Polio, C., & Friedman, D. (2017). Understanding, evaluating, and conducting second language writing
research. New York: Routledge.
Polio, C., & Shea, M. (2014). Another look at accuracy in second language writing development. Journal of
Second Language Writing, 23, 10–​27.
Polio, C. (2020). Can writing facilitate grammatical development? In R.M. Manchón (Ed.), Writing and language learning: Advancing research agendas (pp. 381–420). Amsterdam: John Benjamins.
Révész, A., Michel, M., & Lee, M. (2019). Exploring second language writers’ pausing and revision behaviors.
Studies in Second Language Acquisition, 41(3), 605–​631. https://​doi.org/​10.1017/​s027226311900024x
Robinson, P. (2005). Aptitude and second language acquisition. Annual Review of Applied Linguistics,
25, 46–​73.
Römer, U. (2009). The inseparability of lexis and grammar: Corpus linguistic perspectives. Annual Review of
Cognitive Linguistics, 7, 140–​162.
Roquet, H., & Pérez-​Vidal, C. (2015). Do productive skills improve in content and language integrated learning
contexts? The case of writing. Applied Linguistics, 38(4), 489–​511.
Rubin, D.L., & Kang, O. (2008). Writing to speak: What goes on across the two-​way street. In D. Belcher &
A. Hirvela (Eds.), The oral/​literate connection: Perspectives on L2 speaking, writing, and other media
interactions (pp. 210–​225). Ann Arbor: University of Michigan Press.
Sachs, R., & Polio, C. (2007). Learners’ uses of two types of written feedback on a L2 writing revision task.
Studies in Second Language Acquisition, 29, 67–​100.
Serrano, R. (2011). The effect of program type and proficiency level on learners’ written production. Revista
Española de Lingüística Aplicada, 24, 211–​226.
Shintani, N. (2017). The effects of the timing of isolated FFI on the explicit knowledge and written accuracy
of learners with different prior knowledge of the linguistic target. Studies in Second Language Acquisition,
39(1), 129–​166. https://​doi.org/​10.1017/​S0272263116000127
Shintani, N., Aubrey, S., & Donnellan, M. (2016). The effects of pretask and posttask metalinguistic
explanations on accuracy in second language writing. TESOL Quarterly, 50, 945–​955.
Storch, N. (2009). The impact of studying in a second language (L2) medium university on the development of
L2 writing. Journal of Second Language Writing, 18, 103–​118.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S.M. Gass & C. Madden (Eds.), Input and second language acquisition (pp.
235–​256). Rowley, MA: Newbury House.
Swain, M. (1998). Focus on form through conscious reflection. In C. Doughty & J. Williams (Eds.), Focus
on form in classroom second language acquisition (pp. 64–​81). Cambridge: Cambridge University Press.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step towards
second language learning. Applied Linguistics, 16(3), 371–​391.
Tavakoli, P. (2014). Storyline complexity and syntactic complexity in writing and speaking tasks. In H. Byrnes & R.M. Manchón (Eds.), Task-based language learning: Insights from and for L2 writing (pp. 217–236).
Amsterdam: John Benjamins.
Torras, R., & Celaya, M.L. (2001) Age-​related differences in the development of written production. An empir-
ical study of EFL school learners. IJES, 1(2), 103–​126.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46,
327–​369.
Uggen, M. (2012). Re-​investigating the noticing function of output. Language Learning, 62, 506–​540.
Vasylets, O., Gilabert, R., & Manchón, R. M. (2017). The effects of mode and task complexity on second lan-
guage production. Language Learning, 67(2), 394–​430. https://​doi.org/​10.1111/​lang.12228
Weissberg, R. (2006). Connecting speaking and writing in second language writing instruction. Ann Arbor,
MI: University of Michigan Press.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331.
Xu, Y. (2019). Changes in interlanguage complexity during study abroad: A meta-​analysis. System, 80, 199–​
211. https://​doi.org/​10.1016/​j.system.2018.11.008
Yoon, H.J., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51, 275–​301.
Zalbidea, J. (2017). “One task fits all”? The roles of task complexity, modality, and working memory capacity
in L2 performance. Modern Language Journal, 101, 335–​352.
Ziegler, N. (2016). Synchronous computer-​mediated communication and interaction: A meta-​analysis. Studies
in Second Language Acquisition, 38(3), 553–​586. https://​doi.org/​10.1017/​S027226311500025X

14
L2 WRITING AND VOCABULARY
DEVELOPMENT
Kristopher Kyle
University of Oregon

Introduction
Large bodies of well-​known vocabulary learning research have centered around the development
of receptive and controlled productive vocabulary knowledge (e.g., Laufer & Nation, 1999; Peters
& Webb, 2018; Webb, 2010) and the organization of the lexicon (e.g., Meara, 2009). As in many
other areas of SLA, research into vocabulary learning (as opposed to vocabulary use) in written
modes is less prevalent. This state of affairs is likely due to the difficulty in conducting controlled
experiments wherein particular words are obligatory in an essay task. Nonetheless, writing is an
important aspect of many second language users’ daily lives, and writing proficiency is an important
factor in academic and professional success (Kellogg & Raulerson, 2007). A growing body of litera-
ture has investigated the relationship between writing and vocabulary use (Crossley & McNamara,
2012; Ferris, 2006; Jarvis, 2013b; Kyle & Crossley, 2015). These studies can be grouped by the
nature of the study and by the types of vocabulary knowledge or use that are examined. Studies
tend to focus on the development of vocabulary use over time, vocabulary use as evidence of lan-
guage learning, and/​or writing as a context for vocabulary learning. Further, they tend to examine
measures of lexical diversity, lexical sophistication, collocations, or word choice. A number of key
terms in the study of writers’ vocabulary are described below.

Lexical Diversity
Lexical diversity refers to the variety of words in a text and may be a reflection of the size of a
language user’s vocabulary. Given a particular language task, more proficient writers are expected
to use a wider range of vocabulary items than less proficient writers. The simplest measure of lex-
ical diversity is the number of types (i.e., unique words) in a text divided by the number of tokens
(i.e., total number of running words) in a text, which is referred to as the type-​token ratio (TTR).
However, as we will see in the third section below, research has outlined serious limitations of TTR as an
index of diversity, and alternative indices such as the measure of textual lexical diversity (MTLD;
McCarthy & Jarvis, 2010) or moving average TTR (MATTR; Covington & McFall, 2010) are pre-
ferred. It should be noted that diversity indices measure the variety of words in a text, but provide
no information about the characteristics (e.g., the difficulty or accuracy) of the words themselves.
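For readers unfamiliar with how these indices are computed, the sketch below implements simple TTR and a moving-average TTR in the spirit of MATTR (Covington & McFall, 2010); it is a simplified illustration rather than a reference implementation, and the window size is an arbitrary choice for the toy example.

```python
# Minimal sketch of two diversity indices: simple TTR and a moving-average
# TTR in the spirit of MATTR (Covington & McFall, 2010). The window size is
# an arbitrary choice for this toy example.
def ttr(tokens):
    """Type-token ratio: unique words divided by total words."""
    return len(set(tokens)) / len(tokens)

def moving_average_ttr(tokens, window=50):
    """Average TTR over all overlapping windows of a fixed length."""
    if len(tokens) <= window:          # fall back to simple TTR for short texts
        return ttr(tokens)
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    return sum(ttr(w) for w in windows) / len(windows)

text = "the cat sat on the mat and the dog sat on the rug".split()
print(round(ttr(text), 3), round(moving_average_ttr(text, window=5), 3))
```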


Lexical Sophistication
Lexical sophistication refers to the proportion of advanced or difficult words in a text (Read,
2000). The advanced or difficult nature of a word can be considered from at least two perspectives,
both from a learning perspective (i.e., what characteristics make a word difficult to learn) and
from the perspective of perception (i.e., what characteristics make a word seem advanced to a
reader or listener). Traditionally, lexical sophistication has been measured with regard to the
relative frequency of a word (usually measured by number of occurrences in a representative
corpus), wherein less frequent (i.e., rarer) words are considered more sophisticated than more
frequent words (Laufer, 1994; Laufer & Nation, 1995). Recent research by Crossley, Kyle, and
colleagues (e.g., Crossley, Salsbury, McNamara, & Jarvis, 2011; Kyle & Crossley, 2015; Kyle,
Crossley, & Berger, 2018), however, has investigated a wide range of word characteristics that
index sophistication (see the third and fourth sections below). Given a particular language task,
more proficient language users are presumed to use a higher proportion of sophisticated words
than those that are less proficient.
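As a rough illustration of the traditional frequency-band approach, the sketch below computes the proportion of tokens falling outside a high-frequency word list, broadly in the spirit of a lexical frequency profile; the tiny word list is a stand-in for a real list derived from a reference corpus, and no published band list is reproduced here.

```python
# Minimal sketch of a frequency-band measure of lexical sophistication: the
# proportion of tokens falling outside a high-frequency list. The tiny list
# below is a stand-in for a real list derived from a reference corpus.
HIGH_FREQUENCY = {"the", "a", "and", "of", "to", "in", "is", "was", "study", "student"}

def beyond_list_proportion(tokens, high_frequency=HIGH_FREQUENCY):
    """Share of tokens not found in the high-frequency list."""
    beyond = [t for t in tokens if t not in high_frequency]
    return len(beyond) / len(tokens)

essay = "the study of incidental vocabulary acquisition is fascinating".split()
print(round(beyond_list_proportion(essay), 2))  # higher = more low-frequency words
```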

Words, Lemmas, Flemmas, and Families


In studies of lexical diversity and sophistication, the definition of a word unit may meaningfully
affect the results (see Kyle, 2020; McLean, 2018). When words are used to define a word unit, each
word form (regardless of inflection or derivation) is counted as a different type. For example, if
words are used, then uses of the words smoke (noun), smokes (noun), smoke (verb), smokes (verb),
smoked (verb), smoking (verb), smoker (noun), and smokers (noun) would all be counted as unique
types in calculations of lexical diversity and in the calculation of corpus frequencies. In studies of
lexical diversity, this means that a written text sample would be considered lexically diverse if it
included a range of inflected and derived forms of smoke (or any other word). In studies of lexical
sophistication, this means that each word form would receive its own frequency value, and less fre-
quent forms of a word would be considered more sophisticated than more frequent forms. Note that
in practice, most operationalizations of words are based on the surface form of the word (e.g., the
noun and the verb variations of smoke and smokes would be counted as the same form) unless a part
of speech (POS) tagger is used.
When lemmas are used to define a word unit, all inflected forms of a word are counted as a single
type. In the case of the forms of smoke outlined above, any instance of the verb smoke (including
smoke, smokes, smoked, and smoking) would be counted as a single type. Accordingly, any inflected
form of the noun smoker (i.e., smoker, smokers) would be counted as a single type (but would be
considered distinct from smoke as a verb). In practice, most studies that purport to use lemmas often
use flemmas (a portmanteau of family and lemma; Pinchbeck, 2017), which conflates word forms
with zero derivation (i.e., the noun smoke and the verb smoke). When word families are used to
define a word unit, all inflected and most derived forms (see Bauer & Nation, 1993) of a word are
counted as a single type. In the case of the forms of smoke outlined above, any use of the inflected or
derived forms of smoke (i.e., smoke, smokes, smoked, smoking, smoker, smokers) would be counted
as the same type.
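The consequences of the word-unit decision can be made concrete with a small sketch that counts types in the same token list twice, once over surface forms and once over lemmas; the lemma lookup table below is a toy, hand-made mapping rather than the output of an actual lemmatizer or family list.

```python
# Minimal sketch showing how the choice of word unit changes type counts.
# The lemma lookup table is a toy, hand-made mapping; real studies would use
# a lemmatizer or an established family list instead.
LEMMA_OF = {"smokes": "smoke", "smoked": "smoke", "smoking": "smoke",
            "smoker": "smoker", "smokers": "smoker"}

tokens = ["smoke", "smokes", "smoked", "smoking", "smoker", "smokers"]

word_types = set(tokens)                              # each surface form counted separately
lemma_types = {LEMMA_OF.get(t, t) for t in tokens}    # inflections collapsed

print(len(word_types), len(lemma_types))  # 6 word types vs. 2 lemma types
```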

Collocation and Word Choice


Collocation refers to the tendency of words to co-​occur with particular words (and not with others;
e.g., Sinclair, 1991). Collocation can be operationalized objectively using statistical measures such
as mutual information (Oakes, 1998), which compare the frequency with which two words occur
together and apart. Word combinations with high MI scores, such as straight-A and student, are more strongly collocated than those with lower MI scores, such as abilities and student.1 In written
corrective feedback (WCF) studies, the concept of collocation is often intuitively indexed as word
choice, wherein failing to use a collocate can result in a word choice error, although it may be
difficult to assess collocational accuracy in a binary manner (see further discussion in Yoon, this
volume).
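To show how such an association measure is computed, the sketch below calculates pointwise mutual information for a two-word combination from corpus counts; the counts are invented for illustration, and a real study would take them from a large reference corpus.

```python
# Minimal sketch of pointwise mutual information (MI) for a two-word
# combination, computed from corpus counts. The counts below are invented
# for illustration; real studies derive them from a reference corpus.
import math

def pmi(count_xy, count_x, count_y, corpus_size):
    """log2 of how much more often x and y co-occur than expected by chance."""
    p_xy = count_xy / corpus_size
    p_x = count_x / corpus_size
    p_y = count_y / corpus_size
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical counts: a strongly associated pair vs. a weakly associated one
print(round(pmi(count_xy=80, count_x=200, count_y=5_000, corpus_size=1_000_000), 2))
print(round(pmi(count_xy=20, count_x=30_000, count_y=5_000, corpus_size=1_000_000), 2))
```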

Historical Perspectives
Early vocabulary-​centered writing research explored the range of vocabulary items used by writers
(lexical diversity). Later, related research also explored the characteristics of the particular words
used by writers (lexical sophistication). Research has also explored the relationship between recep-
tive and productive vocabulary knowledge, the use of collocations, and writing as a site for vocabu-
lary learning. Each of these topics are explored below.

Lexical Diversity
The study of lexical diversity likely began with Thompson and Thompson (1915), who sought
to develop methods of estimating an individual’s productive vocabulary. Thirty years later, Yule
(1944) attempted to address the question of authorship identification by employing word-use statistics (based on type and token counts) to measure the size of a particular author’s vocabulary. By the 1960s, diversity measures were used in second language (L2) research to measure
the relative size of an L2 user’s productive vocabulary (e.g., Guiraud, 1960). In the decades that
have followed, writing research with lexical diversity indices has fallen into two categories,
namely exploring the relationship between indices of lexical diversity and writing development
and/​or proficiency (e.g., Crossley et al., 2011; Engber, 1995; Jarvis, 2013b) and measuring the
reliability of various lexical diversity measures (e.g., Covington & McFall, 2010; McCarthy &
Jarvis, 2010).
Generally speaking, research has indicated that lexical diversity is often a reasonable indicator
of (writing) proficiency. Engber (1995), for example, found moderate correlations between holistic
scores of writing quality and lexical diversity (measured as simple TTR) both when lexical errors
were left in the calculations (r = .45) and when lexical errors were omitted from the calculations (r = .57). Similar
findings have been observed in a number of other cross-​sectional studies with various measures of
lexical diversity (e.g., Crossley et al., 2011; Harley & King, 1989; Linnarud, 1986; Mazgutova &
Kormos, 2015).
In contrast, there are only a few studies that have investigated the longitudinal development
of written lexical diversity. These have indicated that growth in lexical diversity over time is not
necessarily linear or the same for all individuals. Laufer (1991), for example, followed two groups
of students (one for a single semester and another over two semesters). She found that there were
no significant changes in mean lexical diversity scores over either a single semester or over two
semesters. A closer analysis of the data indicated that a number of the participants did increase
their written lexical diversity over time, while others either remained the same or used less diverse
vocabulary over time. Laufer (1994) also examined growth in lexical diversity over time and found significant changes in diversity for the group that was followed over one semester, but not for the
group that was followed over two semesters. More recent studies have failed to find changes over
a single semester (e.g., Bulté & Housen, 2014; Yoon & Polio, 2017). More research is needed to
determine the time scales at which development in lexical diversity can be expected.

Lexical Sophistication
As noted in the introduction, indices of lexical diversity provide insight into the size of a writer’s
vocabulary, but provide no information about the characteristics/qualities of the words used. L2
researchers began to address this gap in the early 1990s through the lens of lexical sophistication
(e.g., Laufer, 1991, 1994; Laufer & Nation, 1995).2 In these studies, Laufer and Nation applied lex-
ical frequency profiles, which had previously been used to estimate reading difficulty (e.g., Hirsh
& Nation, 1992) to the measurement of productive lexical proficiency. Laufer (1991), for example,
measured the proportion of advanced words from Xue and Nation’s (1984) University Word List. She
found a significant increase in the proportion of advanced words over two semesters (but not after
a single semester). In a follow-​up study, Laufer (1994) divided words into two categories: Basic
2000 (which include the 2,000 most frequent word families) and Beyond 2000 (which comprised
any word family not among the most frequent 2,000). She observed significant differences between
the proportion of Basic 2000 and Beyond 2000 words in a text both after a single semester and after
an academic year. As writers developed, they tended to use a lower proportion of Basic 2000 words
and a higher proportion of Beyond 2000 words (i.e., they tended to use more sophisticated words as
a function of time studying English).
A number of subsequent cross-​sectional studies have also investigated the relationship between
lexical sophistication and general proficiency level, holistic writing proficiency scores, analytic
scores of lexical proficiency and/​or vocabulary size. These studies have found a fairly consistent
relationship between the use of lower frequency (i.e., more sophisticated words) and proficiency
level, holistic and analytic essay scores, and vocabulary size. Laufer and Nation (1995), for
example, found that more proficient L2 writers tended to produce essays with a lower propor-
tion of words in the 1K list (the 1,000 most frequent word families) and a higher proportion of
UWL words and Beyond 2000 words. They also found that vocabulary size, based on Nation’s
(1983) Vocabulary Levels Test, was positively correlated with the proportion of UWL and
Beyond 2000 words in an L2 essay and negatively correlated with the proportion of 1K words. In
short, L2 writers with a larger vocabulary tended to use less frequent (more sophisticated) words.
A number of subsequent studies using related (if more complex) methods have found similar results
(e.g., Meara and Bell, 2001). A number of subsequent studies have also successfully used LFPs
and related measures to model lexical proficiency and/​or writing quality (e.g., Crossley, Cobb, &
McNamara, 2013; Kojima & Yamashita, 2014; Morris & Cobb, 2004).
Other scholars have operationalized frequency using the mean word frequency (usually using
words or lemmas) instead of word family-​based band proportion scores (e.g., Beyond 2000, P_​
Lex, S). In most cases, these studies differentiate between the use of content words (nouns, lexical
verbs, adjectives, and adverbs of manner) and function words (grammatical words not included as
content words). Mean frequency studies have consistently found small to medium relationships
between average word frequency scores for content words and holistic scores of writing quality
(e.g., Guo, Crossley, & McNamara, 2013; Kyle & Crossley, 2015), analytic scores related to lex-
ical proficiency (e.g., Crossley et al., 2011; Kyle & Crossley, 2015; Kyle et al., 2018), and profi-
ciency level (Crossley & McNamara, 2012; Guo et al., 2013). Kyle & Crossley (2015), for example,
investigated the relationship between various indices of lexical sophistication (see third section
below), including content word frequency, and analytic lexical proficiency scores. They found a
negative correlation between content word frequency and scores, indicating that more lexically pro-
ficient writers tended to use (on average) less frequent content words.
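The mean-frequency approach can be illustrated with the sketch below, which averages log-transformed reference frequencies over content words only; the frequency table and the function-word filter are invented stand-ins for what would normally come from a reference corpus and a part-of-speech tagger.

```python
# Minimal sketch of mean (log) reference frequency for content words only.
# The frequency table and the function-word filter are invented stand-ins;
# real studies would use a reference corpus and a POS tagger.
import math

FREQ_PER_MILLION = {"argue": 120.0, "evidence": 95.0, "substantial": 18.0,
                    "corroborate": 1.5, "the": 60000.0, "is": 9000.0}
FUNCTION_WORDS = {"the", "is", "a", "of", "and", "to"}

def mean_log_content_frequency(tokens):
    """Average log10 frequency of tokens that are content words and in the table."""
    freqs = [FREQ_PER_MILLION[t] for t in tokens
             if t not in FUNCTION_WORDS and t in FREQ_PER_MILLION]
    return sum(math.log10(f) for f in freqs) / len(freqs)

sample = "the evidence is substantial and may corroborate the argument".split()
# tokens missing from the toy table are simply skipped here
print(round(mean_log_content_frequency(sample), 2))  # lower = rarer content words
```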

Relationship Between Vocabulary Knowledge and Writing Proficiency


Another area of interest has been the relationship between both receptive and productive vocabulary
knowledge (based on controlled tests) and vocabulary use in writing tasks. Research has indicated,
for example, that words that are learned in productive activities are more likely to be used in sub-
sequent writing tasks than words learned in receptive activities (Webb, 2009). Research has also
indicated that individuals with larger receptive and productive vocabularies tend to earn higher
writing quality scores (Johnson, Acevedo, & Mercado, 2016; Yang et al., 2019). Given the limited range of writing tasks that has been examined (i.e., picture descriptions and picture-based narratives), more
research in this area is needed.

Collocation, Word Choice, and Writing to Learn


In addition to using a wide variety of lexical items and using sophisticated lexical items, proficient
writers tend to be sensitive to the lexical contexts in which particular words are used (Bestgen &
Granger, 2014; Garner, Crossley, & Kyle, 2019; Kyle et al., 2018). Bestgen & Granger (2014), for
example, found that higher scoring descriptive essays tended to include more strongly associated
(i.e., collocated) bigrams (contiguous two-​word combinations) than lower scoring narrative essays.
Similar results have also been observed in studies that have examined argumentative essays (e.g.,
Garner et al., 2020, 2019). For related work on frequently occurring (and relatively fixed) multi-​
word utterances see Chapter 27 in this volume.
Work on lexical diversity and lexical sophistication has primarily considered learner-​produced
texts as evidence of what has been learned. Separate, but related, strands of research have explored
writing as a site for language learning itself (e.g., Ferris, 2006; Hulstijn & Laufer, 2001; Kim, 2008;
Manchón, 2011). Written corrective feedback is one research strand that considers writing to be
a site for language learning. Written corrective feedback provides opportunities for learning by
drawing learners’ attention to language issues. Attention (and noticing) are key aspects of cognitive
processes related to language learning (e.g., Schmidt, 2001; Swain, 1985, 1995). Although written
corrective feedback research has tended to focus on issues of grammar, a small number of studies
have also investigated the efficacy of marking errors at the word level (though these errors have
often been combined with other errors). It should be noted that many errors at the word level are
actually grammatical in nature (e.g., agreement errors, tense errors, plural marking errors) or are
related to spelling issues. However, errors related to word choice are likely more closely related
to word knowledge (within the construct of collocation, e.g., Nation, 2001; Sinclair, 1991) than
grammar. Ferris (1999) referred to word choice errors as “untreatable” errors, given that one cannot
refer to a systematic rule for learners to follow in order to address such errors. For this reason, most
written corrective feedback studies that include word choice errors as a variable of interest (e.g.,
Ferris, Liu, Sinha, & Senna, 2013) tend to include direct feedback (i.e., a concrete, direct suggestion
for an alternative word).
A second body of research that explores writing as a site for learning has centered around the
involvement load hypothesis (Hulstijn & Laufer, 2001). The involvement load hypothesis suggests
that learning activities that feature a higher involvement load in terms of need, search, and evalu-
ation will result in higher learning gains than those with a lower involvement load. Research has
indicated, for example, that higher vocabulary learning gains are made when learners are required
to write a paragraph that includes newly introduced words than when other activities with lower
involvement loads are used, such as glosses or fill-in-the-blank activities (e.g., Hulstijn &
Laufer, 2001; Kim, 2008; Zou, 2017).

Critical Issues and Topics


Interpreting research on vocabulary and writing depends heavily on the way in which vocabulary
use or learning is measured and the overall design of the study. Below, critical issues in measuring
lexical diversity and lexical sophistication are outlined, followed by important considerations for
the design of research related to the role of WCF and the validation of the involvement load hypoth-
esis (respectively).
Reliability of Indices of Lexical Diversity


A key issue in the measurement of lexical diversity has been the relationship between various
indices and text length. The well-​known type-​token ratio (TTR) index has been demonstrated to
be inherently contingent on the length of a text (e.g., Jarvis, 2002; McCarthy & Jarvis, 2010).
Due to this issue, some scholars have opted to measure the diversity of a particular section of the
text (e.g., the first 200 words; Treffers-​Daller, Parslow, & Williams, 2018). However, this involves
the loss of learner data, and not all of the words used by a learner are considered. Accordingly, a
great deal of lexical diversity research has attempted to find text-​length independent measures, with
varying success. Many indices of diversity are statistical transformations of TTR, such as Root
TTR (also commonly known as Guiraud’s index; Guiraud, 1960), and log TTR (Herdan, 1964).
However, subsequent research has indicated that these are still affected by text length (Koizumi
& In’nami, 2012; McCarthy & Jarvis, 2007, 2010). Other solutions, such as moving average TTR
(MATTR; Covington & McFall, 2010), measure of textual lexical diversity (MTLD; McCarthy &
Jarvis, 2010), and D (Malvern & Richards, 1997; McCarthy & Jarvis, 2007) appear to be much
more stable. Despite the well-​documented weaknesses of indices such as Guiraud’s index and the
apparent superiority of indices such as MATTR, MTLD, and D, the former is still often reported.
Likely, this is due to the availability of tools to measure the latter indices (see the fifth section
below).
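To give a sense of how one of the more stable indices works, the sketch below implements a simplified, forward-only version of the factor-counting logic behind MTLD (McCarthy & Jarvis, 2010); the published measure also processes the text in reverse and averages the two runs, so this should be read as an illustration of the core idea rather than a faithful implementation.

```python
# Simplified, forward-only sketch of the factor-counting logic behind MTLD
# (McCarthy & Jarvis, 2010). The published measure also runs the procedure
# in reverse and averages the two values; this toy version only illustrates
# how long stretches of text sustain a TTR above the .72 threshold.
def mtld_forward(tokens, threshold=0.72):
    factors, types, count = 0.0, set(), 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count < threshold:   # diversity dropped: close a factor
            factors += 1
            types, count = set(), 0
    if count:                                # credit the leftover partial factor
        remaining_ttr = len(types) / count
        factors += (1 - remaining_ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")

sample = ("the results of the study suggest that the learners produced more "
          "diverse vocabulary in the second essay than in the first essay").split()
print(round(mtld_forward(sample), 1))  # note: MTLD is normally run on much longer texts
```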
A related issue that has not been widely investigated is the reliability of lexical diversity indices
across writing topics/​prompts. As many studies (especially in the realm of learner corpus research)
use essays written on a wide variety of prompts/​topics, this is an important issue deserving of fur-
ther research.

Reliability of Indices of Lexical Sophistication


Reliability is also an important, if under-​researched, issue in the measurement of lexical sophisti-
cation. As in lexical diversity research, a potential issue is related to the relationship between text
length and sophistication scores. Kojima and Yamashita (2014), for example, found that indices
such as Beyond 2000 (Laufer, 1994) and Advanced Guiraud (Daller, Van Hout, & Treffers-​Daller,
2003) were not stable across different text lengths, while indices such as S (which is introduced
in the article) and P_​Lex (Meara & Bell, 2001) were. Another important issue is within-​subjects
reliability. Laufer and Nation (1995), for example, found that lexical frequency profiles were not
significantly different across essays written by the same individuals one week apart (the topics were
also different). Similarly, Meara and Bell (2001) found that P-​Lex was stable across essays written
by the same individuals (but on different topics). More recent research by Kojima and Yamashita
(2014), however, has suggested that longer texts are needed to achieve acceptable test-​retest reli-
ability for most indices, including Beyond 2000 (Laufer, 1994) and P_​Lex (Meara & Bell, 2001).
While the aforementioned studies are informative, the sample sizes are relatively small, limiting
the conclusions that can be made. Few (if any) studies have conducted a wide-​scale analysis of this
sort, leaving an important gap in the research.

Lexical Diversity as a Multifaceted Construct


While much research has investigated the reliability of indices of lexical diversity, recent compel-
ling research by Jarvis (2013b, 2013a, 2017) has investigated the validity of these indices. Jarvis
(2013a) argues that lexical diversity in texts, much like diversity in ecology, is a multifaceted con-
struct consisting of features such as size (number of words), richness (number of different words),
evenness (distribution of different words across a text) and disparity (semantic relatedness of
words), among others. Jarvis (2013b, 2017) demonstrates that most of these features are strongly
related to human judgments of diversity. As Jarvis (2017) notes, the most appropriate method of
measuring each feature still needs further investigation. This issue notwithstanding, researchers
have begun to appreciate the multifaceted nature of lexical diversity (Bulté & Housen, 2014; Ellis,
2017), and this is a rich area for future research.

Lexical Sophistication as a Multifaceted Construct


As highlighted in the second section above, lexical sophistication has traditionally been considered
a uni-​dimensional construct that has been measured using reference corpus frequency. Word fre-
quency is clearly an important factor in various theories of first and second language learning
(e.g., Chomsky, 1986; Ellis, 2002b; Tomasello, 2003) and a number of empirical studies have
demonstrated a meaningful relationship between frequency and writing quality/​proficiency.
However, corpus frequencies are not perfect representations of a particular learner’s language input,
and theories of language learning suggest that factors beyond frequency (e.g., salience) also
affect learning (Ellis, 2002a; Paivio, 1971). Accordingly, recent research has begun to consider
lexical sophistication as a multifaceted construct that can be measured by a collection of variables
related to learning difficulty and human perceptions of learning difficulty (Crossley & McNamara,
2012; Kim, Crossley, & Kyle, 2018; Kyle & Crossley, 2015; Kyle et al., 2018). These studies have
demonstrated that a much larger proportion of the variance in human ratings of lexical proficiency
and/​or writing proficiency can be explained when features beyond frequency (such as concrete-
ness, use of collocations, and contextual diversity) are used. One criticism of these studies is that
they use such a broad range of (sometimes conceptually overlapping) indices that the results can
be difficult to meaningfully interpret. Some efforts have been made to reduce the dimensionality of
these approaches through techniques such as factor analysis (e.g., Kim et al., 2018); however, more
work is needed to refine our understanding of these dimensions and clearly link them to a theory (or
collection of theories) of SLA.

Direct Feedback and Vocabulary Learning


Most WCF research that addresses vocabulary errors belongs to a broader line of studies whose main goal is to determine whether WCF is effective (Ferris, 1999; Truscott, 1996), with vocabulary treated as only one of many error types. Further, Ferris (1999) notes that vocabulary errors are “untreatable” in that
a teacher or learner cannot look to a particular language rule to fix the problem. Accordingly, most
vocabulary errors (such as word choice errors) are given direct feedback (i.e., the instructor provides
the correct word or phrase directly). Correcting such errors requires a low involvement load (with
moderate need, but no search or evaluation), and accordingly will likely not lead to longer-​term
learning. While vocabulary and word choice errors usually cannot be corrected by falling back on a
rule (and are indeed “untreatable” by Ferris’ definition), there are autonomous learning techniques
that could be explored which include higher involvement loads and therefore may lead to increased
learning. Corpus-​based learning, wherein students are taught to use simple corpus tools to explore
language use patterns (e.g., Römer, 2010; Sinclair, 2004; Varley, 2009), for example, is one area that
is rich for more research. Such approaches have the potential to not only make marking more effi-
cient for instructors (by allowing for indirect feedback), but also to increase a learner's involvement
load, which may lead to more learning. This is clearly an area for future research.

Refining the Involvement Load Hypothesis and Time on Task


Hulstijn and Laufer's (2001) formulation of the involvement load hypothesis suggested that vocabu-
lary learning activities differed across three parameters including need (extrinsic motivation is mod-
erate need and intrinsic is high need), search (whether or not learners had to look up the meaning or
form of a word), and evaluation (moderate evaluation includes making comparisons and high evalu-
ation includes creating original word contexts). Although a number of subsequent studies have (at
least partially) supported the involvement load hypothesis (e.g., Kim, 2008; Keating, 2008), at least
two key issues are still being investigated. The first is the degree to which the parameters of need,
search, and evaluation and their operationalizations are appropriate and sufficient (and for what
purposes). Some researchers, such as Nation and Webb (2011), suggest that as many
as five parameters (motivation, noticing, retrieval, generation, and retention) are needed to accur-
ately model the involvement load, particularly when designing pedagogical activities. The second
issue is the degree to which involvement load can be separated from time on task (e.g., Keating,
2008; Webb, 2005). Tasks with higher involvement loads (such as those that require looking up
words or writing sentences) tend to take more time than those with lower involvement loads (such
as those that include glosses and only require writing the target word in a blank). For example,
Keating (2008) compared immediate and delayed learning gains for three tasks that varied with regard
to involvement load. The initial findings generally supported the involvement load hypothesis –​
tasks with higher involvement loads tended to result in higher vocabulary learning gains. However,
when learning gains were converted to words learned per minute (by estimating the time taken to
complete each task), no significant differences in learning gains were found across three activities
with varying involvement loads. More work is needed to further consolidate the parameters of the
involvement load and to address issues of time on task.
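As a rough illustration of how the three parameters can be combined, the sketch below assigns a simple additive involvement index to a task. The numeric weights (none = 0, moderate = 1, high = 2), the function name, and the additive scheme are assumptions made for illustration, not Hulstijn and Laufer's instrument; the task characterizations follow the description of Zou (2017) given later in this chapter.

LEVELS = {"none": 0, "moderate": 1, "high": 2}  # assumed numeric weights

def involvement_index(need, search, evaluation):
    """Sum the three component scores into a single (illustrative) index."""
    return LEVELS[need] + LEVELS[search] + LEVELS[evaluation]

# Task characterizations as described for Zou (2017) later in this chapter.
tasks = {
    "fill-in-the-blank": ("moderate", "none", "moderate"),
    "sentence writing": ("moderate", "none", "high"),
    "composition writing": ("moderate", "none", "high"),
}
for name, components in tasks.items():
    print(name, involvement_index(*components))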

Current Contributions and Research


In this section, three studies are reviewed to provide an example of some recent vocabulary and
writing research. The first study highlights recent work by Jarvis (2013b) on lexical diversity. The
second highlights recent work by Kyle et al., (2018) on lexical sophistication. Finally, the third
study highlights recent work on vocabulary learning through writing by Zou (2017).

Lexical Diversity as a Multidimensional Phenomenon


Jarvis (2013a) argues that the notion of diversity is multifaceted, and the measurement of lex-
ical diversity should also consider multiple dimensions. In a follow-​up study, Jarvis (2013b)
operationalizes a subset of his proposed dimensions and explores the degree to which these
dimensions can be used to model human judgments of lexical diversity in learner texts. Jarvis used
a learner corpus comprised of 50 narrative film descriptions of a section of a silent film (Charlie
Chaplin’s Modern Times) written by L2 learners of English of varying proficiency levels (n = 37)
and L1 speakers aged 10–​15 (n = 13). Each text was scored by at least two raters. Raters were pur-
posefully given very limited instructions on how to rate the texts, as the researcher did not want to
influence their conceptualization of lexical diversity (see Jarvis, 2017 for more information). Jarvis
then calculated scores for six dimensions of diversity, including variability (measured as MTLD),
volume (number of words), evenness (standard deviation of tokens per type), rarity (measured
using frequency ranks from the BNC), dispersion (mean distance between tokens of a particular
type), and disparity (mean number of words with closely related meanings). He found significant
(p < .05) and meaningful (r > .100) correlations between the ratings of diversity and all of the
diversity dimensions, with the exception of rarity. Significant correlations ranged from r = .31
(variability) to r = .67 (volume). However, many of the indices were strongly collinear, suggesting
that some of the dimensions may not be distinct. Volume, for example, was strongly correlated with
both evenness (r = .89) and dispersion (r = .94). Jarvis conducted three multiple regression models
with varying characteristics and found that multivariate models explained between 47% and 49%
of the variance in human judgments of lexical diversity, which is a much greater proportion than if
variability alone (which would have explained 9% of the variance) had been used. In his conclu-
sion, Jarvis notes the potential need to reconsider the operationalization of some of the dimensions
to avoid issues of multicollinearity and to obtain more reliable human ratings (see Jarvis, 2017).
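For readers interested in how such dimensions can be operationalized, the rough Python sketch below computes three of them (volume, evenness, and dispersion) following the verbal definitions given above. The exact formulas, especially the way dispersion is averaged across types, are simplifying assumptions for illustration and are not Jarvis's published computations.

from collections import defaultdict
from statistics import mean, pstdev

def volume(tokens):
    """Volume: total number of word tokens in the text."""
    return len(tokens)

def evenness(tokens):
    """Evenness: standard deviation of the token count per type
    (lower values indicate a more even distribution)."""
    counts = defaultdict(int)
    for token in tokens:
        counts[token] += 1
    return pstdev(counts.values())

def dispersion(tokens):
    """Dispersion: mean distance between successive tokens of the same type,
    averaged over all types occurring more than once (an assumed formula)."""
    positions = defaultdict(list)
    for i, token in enumerate(tokens):
        positions[token].append(i)
    gaps = [mean(later - earlier for earlier, later in zip(p, p[1:]))
            for p in positions.values() if len(p) > 1]
    return mean(gaps) if gaps else 0.0

tokens = "the tramp eats lunch then the tramp runs from the police".split()
print(volume(tokens), evenness(tokens), dispersion(tokens))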

Lexical Sophistication as a Multidimensional Phenomenon


Kyle and Crossley (2015), building on previous work conducted using Coh-metrix (Crossley & McNamara, 2012; Crossley et al., 2011), argue for the conceptualization of lexical sophistication as a multidimensional construct. Kyle et al. (2018) build on these studies by investigating the
degree to which an even wider range of features related to lexical difficulty can explain ratings
of lexical proficiency. They use a corpus of unstructured essays (n = 240) written by L2 users of
varying proficiency who were enrolled in a university English program (n = 180) and undergraduate
L1 speakers (n = 60). Each essay was scored using a holistic lexical proficiency rubric by at least
two raters, and inter-​rater reliability was acceptable (r = .796). See Crossley, Salsbury, et al. (2011)
for more information regarding the corpus. Kyle et al. considered the use of 12 index categories
in explaining the variance in lexical proficiency. The first six categories (with example indices)
included word frequency (mean corpus frequency for content words), word range (mean word
range for content words), psycholinguistic information (mean concreteness score), age of acquisi-
tion/​exposure (grade level at which a word is learned), academic words (proportion of AWL words),
and contextual distinctiveness (the mean number of semantic relations per word in a text). The final
six categories included word recognition norms (mean reaction times for words in a text), semantic
network (mean number of hypernyms for words in a text), n-​gram frequency (mean bigram fre-
quency), n-​gram range (mean bigram range), collocation strength (mean bigram mutual informa-
tion score), and word neighbors (mean number of similarly spelled words per word). Significant
correlations between individual indices were relatively small (with absolute values ranging from
r = .213 to r = .391). However, when used in a complementary multiple regression model, ten
indices explained 58% of the variance in lexical proficiency scores, which is almost four times
better than the strongest index alone (bigram collocation strength, 15.3% of the variance), and is
over ten times more than the strongest word frequency index (5.1% of the variance). Studies such
as Kyle et al. (2018) demonstrate the strength of conceptualizing lexical sophistication as a multi-
dimensional construct (see also Kim et al., 2018; Garner et al., 2019).
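The general analytic logic of such studies, regressing human ratings on a set of lexical indices and comparing the variance explained, can be sketched as follows. The index names, toy values, and use of scikit-learn are assumptions made purely for illustration and do not reproduce the authors' data or analysis pipeline.

import numpy as np
from sklearn.linear_model import LinearRegression

# Rows are essays; columns are hypothetical indices (mean content-word
# frequency, mean concreteness, mean bigram mutual information).
X = np.array([
    [3.2, 2.9, 1.1],
    [3.0, 3.1, 1.4],
    [2.8, 3.4, 1.9],
    [2.7, 3.6, 2.3],
    [2.5, 3.8, 2.6],
])
y = np.array([2.0, 2.5, 3.0, 3.5, 4.0])  # invented holistic lexical proficiency ratings

model = LinearRegression().fit(X, y)
print("R^2 (variance explained):", round(model.score(X, y), 3))
print("coefficients:", model.coef_)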

Learning Vocabulary Through Writing


Hulstjin & Laufer (2001) proposed the involvement load hypothesis and provided empirical evi-
dence that tasks with higher involvement loads (such as paragraph writing) tend to result in higher
vocabulary learning gains that tasks with lower involvement loads (e.g., gap-​filling exercises).
Subsequent studies have provided further empirical support for the hypothesis (e.g., Keating, 2008;
Kim, 2008) and/​or have also attempted to refine parameters used for estimating the involvement
load (e.g., Zou, 2017). Zou (2017) investigates the effectiveness of three vocabulary learning activ-
ities –​ fill in the blank, sentence writing, and composition writing. All treatment groups took a
vocabulary pre-​test. Then, each group was given a reading text that included ten glossed target
vocabulary words. In the fill-​in-​the-​blank treatment, the target words were substituted with blanks
and all target words (and their glosses) were included in the margin of the text and participants
had to decide which word should be placed in each respective blank. This activity included mod-
erate need, no search, and moderate evaluation. In the sentence writing treatment, participants were
given the same reading passage (with glosses but no blanks) and were asked to write a sentence of
at least ten words in length for each target vocabulary word. This activity included moderate need,
no search, and high evaluation. In the composition writing treatment, participants were given the
same reading passage as the sentence writing group, but were asked to write a coherent compos-
ition that used all ten target words. Like the sentence writing activity, this activity included
moderate need, no search, and high evaluation. The results indicated that the composition writing
group had higher immediate and delayed post-​test gains than the sentence writing group, which had
higher immediate and delayed post-test gains than the fill-in-the-blank group. Hulstijn and Laufer's
(2001) parameters would indicate that sentence writing and composition writing carry a similarly
high evaluation load (words had to be used in original contexts). However, Zou argues that appro-
priately connecting the use of vocabulary words in a coherent composition increases the evaluation
load in the composition writing treatment (to “very high,” which was not an option in the original
operationalization), which helps to explain why the composition writing group outperformed the
sentence writing group.

Main Research Methods


Though many possibilities exist for research into vocabulary and writing, three research designs
are commonly used (cross-​sectional, longitudinal, and experimental). Regardless of the chosen
research design, a wide range of tools for evaluating vocabulary knowledge and measuring various
characteristics of the words used in a learner text are widely and freely available. A number of these
are outlined below, including some features and limitations of each.

Research Designs
Vocabulary research in written modes tends to explore differences in vocabulary use across
proficiency levels (cross-​sectional designs), patterns of vocabulary use over time (longitu-
dinal designs), or the relative efficacy of particular pedagogical treatments in controlled studies
(experimental designs). Each of these, which are outlined below, have distinct advantages and
limitations.

Cross-​Sectional
Cross-​sectional research designs explore differences in vocabulary use across various proficiency
levels (e.g., Engber, 1995; Kyle & Crossley, 2015). One advantage of cross-​sectional designs is that
they allow for relatively efficient data collection (a large number of samples could be collected in
a single day or week). Because of this, there are a number of cross-​sectional learner corpora that
are freely available for research purposes (e.g., Ishikawa, 2013). However, while cross-sectional research provides an indication of differences in vocabulary use across proficiency benchmarks, it does not (necessarily) model the paths that learners may take to achieve those benchmarks (e.g.,
Verspoor, Schmid, & Xu, 2012).

Longitudinal
Research has suggested that language learning is dynamic, and that in many cases learners follow
diverse and complex developmental trajectories (e.g., Verspoor et al., 2012). Longitudinal research
helps explore these developmental trajectories through the collection and analysis of multiple texts
from the same individuals over an extended period of time (e.g., Connor-​Linton & Polio, 2014).
Although the benefits of longitudinal designs are clear, the collection of longitudinal datasets can
be challenging due to the time it takes to collect the data (ideally a year or more). Furthermore,
participants often drop out of longitudinal studies, meaning that the initial sample size must be much larger than the desired final sample.

Experimental
Experimental designs are often used to investigate the relative efficacy of particular pedagogical
treatments (see, e.g., Zou, 2017). These tend to involve a pre-​test (to evaluate characteristics of
vocabulary use before the treatment), a treatment (i.e., a learning activity), an immediate post-​test
(to evaluate initial vocabulary gains), and a delayed post-​test (to determine the degree to which
vocabulary gains were sustained). Importantly, researchers attempt to control for all factors that
are not related to the main research questions, which helps the interpretability of the results (e.g.,
which treatment led to the most sustained gains in vocabulary use). One potential downside of
experimental designs is that the highly controlled nature of the tasks may not be reflective of what
may actually happen in a classroom, which can limit the pedagogical implications of the findings.
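The basic pre-test/post-test logic can be illustrated with the short sketch below, which computes immediate and delayed gain scores for a single treatment group; the scores and variable names are invented, and an actual study would add comparison groups and inferential statistics.

from statistics import mean

# One row per participant: (pre-test, immediate post-test, delayed post-test),
# all invented scores out of 10 for a single treatment group.
scores = [(2, 7, 5), (3, 8, 6), (1, 6, 4), (4, 9, 7)]

immediate_gains = [post - pre for pre, post, _ in scores]
delayed_gains = [delayed - pre for pre, _, delayed in scores]

print("mean immediate gain:", mean(immediate_gains))  # initial learning
print("mean delayed gain:", mean(delayed_gains))      # sustained learning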

Vocabulary Size Assessment Tools


A full review of tools that can be used for directly assessing vocabulary size is beyond the scope of
this chapter. However, a number of popular tools are freely available.

Lextutor
Tom Cobb’s website (www.lextutor.ca/​) includes a large repository of computer and paper-​based
vocabulary size tests for English (and one for French). Included are tests such as the classic and
various updated versions of the vocabulary size test (Nation & Beglar, 2007), the word associates
test (Read, 1993), and the productive vocabulary levels test (Laufer & Nation, 1999).

_​lognostics
Paul Meara’s website (www.lognostics.co.uk/​) includes a number of online and downloadable
vocabulary tests. In particular, it includes Lex30 (a test of productive vocabulary), V_​Quint (a test
of vocabulary depth), and Y_​Lex (a vocabulary size test for less frequent words).

Text Analysis Tools


AntWordProfiler, Range, VocabProfile
AntWordProfiler (www.laurenceanthony.net/software/antwordprofiler/, Anthony, 2014) is a freely available tool for calculating word frequency profile statistics for a text. AntWordProfiler is based on the Range (Heatley & Nation, 1994) program, but unlike Range (which only works on Windows machines), AntWordProfiler is compatible with both Mac and Windows systems. Both AntWordProfiler and Range can batch process texts (i.e., can process multiple texts at a time). The only downside of
these programs is that the output requires some amount of cleaning/​formatting before it can be used
for any statistical analysis. For follow-​up investigations, VocabProfile (www.lextutor.ca/​vp/​, Cobb,
2018) is an excellent online resource for investigating the word frequency profiles of individual
texts (including a color-​coded representation of the words in a text by frequency band).
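A highly simplified sketch of the kind of band-based profile these tools produce is given below. The tiny word lists stand in for the real 1K/2K/AWL lists, and the function name and output format are illustrative assumptions rather than the behavior of any of the tools named above.

BANDS = {
    "1K": {"the", "a", "is", "good", "student", "work"},
    "2K": {"notion", "schedule", "appointment"},
    "AWL": {"analyze", "evaluate", "hypothesis"},
}

def frequency_profile(tokens, bands=BANDS):
    """Return the percentage of tokens covered by each band (plus off-list words)."""
    profile = {name: 0 for name in bands}
    profile["off-list"] = 0
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                profile[name] += 1
                break
        else:
            profile["off-list"] += 1
    return {name: 100 * count / len(tokens) for name, count in profile.items()}

print(frequency_profile("the student will evaluate the hypothesis".split()))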

Coh-​metrix
Coh-metrix (Graesser, McNamara, Louwerse, & Cai, 2004; McNamara, Graesser, McCarthy, & Cai, 2014) has been used to investigate lexical diver-
sity and lexical sophistication in a wide number of studies. Coh-​metrix is particularly important
as it was used in a number of early studies that demonstrated the multifaceted nature of lexical
sophistication (and other constructs such as reading difficulty). A simplified version of the tool,
which includes 108 indices, is freely available online (www.cohmetrix.com). Coh-metrix calculates various indices related to word frequency (based on the CELEX corpus), psycholinguistic word information (e.g., concreteness), and lexical diversity (e.g., MTLD), among others. The
main limitation of Coh-​metrix is that texts must be processed one at a time, which limits the size of
the datasets examined.

TAALED
The Tool for the Automatic Analysis of Lexical Diversity (TAALED; Kyle, Jarvis, & Crossley, 2018)
calculates a variety of classic and more recently developed indices of lexical diversity. In addition to TTR
and related transformations, TAALED calculates MTLD (McCarthy & Jarvis, 2010), HD-​D (McCarthy
& Jarvis, 2007), and MATTR (Covington & McFall, 2010). Although TAALED was designed to pro-
cess English texts, diversity indices can also be calculated for other languages provided they are pre-​
tokenized and lemmatized. TAALED is freely available at www.linguisticanalysistools.org, works
on Windows and Mac operating systems, and can batch process texts.
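To clarify what an index such as MTLD is doing under the hood, the following simplified sketch implements a single forward pass of the factor-counting procedure described by McCarthy and Jarvis (2010). TAALED's actual implementation also averages forward and backward passes (and handles tokenization and lemmatization), so this should be read as an approximation rather than a reimplementation of the tool.

def mtld_one_pass(tokens, threshold=0.72):
    """Count 'factors' (stretches over which the running TTR stays above the
    threshold) and divide the number of tokens by the factor count."""
    factors, types, count = 0.0, set(), 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count <= threshold:  # TTR fell to the threshold
            factors += 1
            types, count = set(), 0          # start a new factor
    if count:                                # credit the incomplete final factor
        remaining_ttr = len(types) / count
        factors += (1 - remaining_ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float("nan")

sample = ("the tramp eats lunch then the tramp runs from the police "
          "and the police chase the tramp across the busy street").split()
print(round(mtld_one_pass(sample), 2))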

TAALES
The Tool for the Automatic Analysis of Lexical Sophistication (TAALES; Kyle & Crossley, 2015;
Kyle et al., 2018) calculates over 500 indices of lexical sophistication in 12 categories. TAALES
includes frequency indices for a number of classic corpora such as the Brown Corpus (Kucera &
Francis, 1967), London-​Lund Corpus (Brown, 1984), and the Thorndike-​Lorge corpus (Thorndike
& Lorge, 1944). Also included are frequency indices for more recent corpora such as the British
National Corpus (BNC; BNC Consortium, 2007), the Corpus of Contemporary American English
(COCA; Davies, 2010), and SUBTLEXus (Brysbaert & New, 2009). In addition, TAALES includes
indices related to psycholinguistic word information (such as concreteness), collocation, formal
word characteristics, contextual diversity, and academic language (among others). TAALES
is freely available at www.linguisticanalysistools.org, works on Windows and Mac operating
systems, and can batch process texts.

Recommendations for Practice


Research in lexical sophistication and lexical diversity has indicated that more proficient language learners (and, in particular, learners who write more highly rated essays) tend to use a wider range of vocabulary items, more sophisticated (e.g., less frequent) vocabulary items, and more strongly associated word combinations (i.e., collocations) in their essays. These findings preliminarily
suggest that spending time on vocabulary (and in particular on collocations) in writing classrooms
may be worthwhile. To date, it remains unclear how and when written corrective feedback should be used to address vocabulary concerns. However, research in related areas (e.g., Hulstijn & Laufer, 2001; Kim, 2008; Varley, 2009) indicates that the use of indirect feedback coupled with corpus-based autonomous learning may be fruitful. Additionally, research indicates that including writing activities as part of vocabulary instruction will lead to greater immediate and delayed learning gains than only using methods such as reading comprehension and fill-in-the-blank activities (e.g., Hulstijn & Laufer, 2001; Zou, 2017), though such activities will also likely require more time to complete.

Future Directions
Given what we know (and don’t know) about vocabulary learning for and through writing, there
are a number of directions that would be fruitful for future research. First, a majority of the work
in any of the areas outlined in this chapter has dealt with English as a second language (though
there are exceptions to this). Future research should explore the degree to which the trends found
in English do and do not translate to other language learning environments. For example, the issue
of lemmatization/​familization may need to be reconceptualized for languages with a wide range of
verbal inflections (such as Spanish) or that are more contextual and have fewer attached morphemes
(such as Hawaiian). Second, future research should explore the degree to which observed trends
are stable across writing task types (e.g., narrative, argumentative, reflective, source based) and
across particular prompts (see, e.g., Yoon, 2018). Third, future research should investigate the rela-
tionship between different types of feedback (and the use of corpora as a resource) in studies that
focus on lexical concerns (such as word choice errors). Fourth, research should continue to inves-
tigate the optimal parameters for estimating the involvement load of a learning activity and disentangle involvement load from time on task. Finally, relatively little longitudinal research has been conducted, and the findings of the studies that do exist tend to contrast with those of cross-sectional studies. More research is needed to determine the time scales at which changes in lexical use take
place and the degree to which these trends are similar across participants. In short, there is much
work left to be done.

Notes
1 Based on the Corpus of Contemporary American English (COCA; Davies, 2010). The word straight-​A
occurs with student 142 times, but only occurs in other contexts 55 times (MI = 8.88). The word abilities
occurs with student 127 times, but occurs in other contexts 9,638 times (MI = 3.10).
2 Note that related L2 research began in the 1980s (Carlson et al., 1985; Reid, 1986) but used word length as
an indicator of lexical difficulty/​sophistication. The work by Laufer and Nation (1995) represents the first
attempts to explicitly examine the relationship between vocabulary size and lexical sophistication.

References
Anthony, L. (2014). AntWordProfiler (Version 1.4. 1)[computer software]. Tokyo: Waseda University.
Bauer, L., & Nation, I.S.P. (1993). Word families. International Journal of Lexicography, 6(4), 253–​279.
Bestgen, Y., & Granger, S. (2014). Quantifying the development of phraseological competence in L2 English
writing: An automated approach. Journal of Second Language Writing, 26, 28–​41. https://​doi.org/​10.1016/​
j.jslw.2014.09.004
BNC Consortium. (2007). The British National Corpus, version 3. Retrieved from http://​www.natcorp.
ox.ac.uk/​
Brown, G.D.A. (1984). A frequency count of 190,000 words in the London-​ Lund Corpus of English
Conversation. Behavior Research Methods, Instruments, & Computers, 16(6), 502–​532.
Brysbaert, M., & New, B. (2009). Moving beyond Kucera and Francis: A critical evaluation of current word
frequency norms and the introduction of a new and improved word frequency measure for American
English. Behavior Research Methods, 41(4), 977–​990. https://​doi.org/​10.3758/​BRM.41.4.977
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-​term changes in L2 writing complexity.
Journal of Second Language Writing, 26, 42–​65. https://​doi.org/​10.1016/​j.jslw.2014.09.005
Carlson, S. B., Bridgeman, B., Camp, R., & Waanders, J. (1985). Relationship of admission test scores to writing
performance of native and nonnative speakers of English. ETS Research Report Series, 1985(1), i–​137.
Chomsky, N. (1986). Knowledge of language: Its nature, origin, and use. Greenwood Publishing Group.
Cobb, T. (2018). Web VocabProfile. https://​www.lextutor.ca/​vp/​eng/​
Connor-​Linton, J., & Polio, C. (2014). Comparing perspectives on L2 writing: Multiple analyses of a common
corpus. Journal of Second Language Writing, 26, 1–​9.
Covington, M.A., & McFall, J.D. (2010). Cutting the Gordian knot: The moving-​average type–​token ratio
(MATTR). Journal of Quantitative Linguistics, 17(2), 94–​100.
Crossley, S.A., Cobb, T., & McNamara, D.S. (2013). Comparing count-​based and band-​based indices of word
frequency: Implications for active vocabulary research and pedagogical applications. System, 41(4), 965–​
981. https://​doi.org/​10.1016/​j.system.2013.08.002
Crossley, S.A., & McNamara, D.S. (2012). Predicting second language writing proficiency: The roles of cohe-
sion and linguistic sophistication. Journal of Research in Reading, 35(2), 115–​135.
Crossley, S.A., Salsbury, T., McNamara, D., & Jarvis, S. (2011). Predicting lexical proficiency in language
learner texts using computational indices. Language Testing, 28(4), 561–​580. https://​doi.org/​10.1177/​
0265532210378031
Daller, H., Van Hout, R., & Treffers-​Daller, J. (2003). Lexical richness in the spontaneous speech of bilinguals.
Applied Linguistics, 24(2), 197–​222.
Davies, M. (2010). The Corpus of Contemporary American English as the first reliable monitor corpus of
English. Literary and Linguistic Computing, 25(4), 447–​464. https://​doi.org/​10.1093/​llc/​fqq018
Ellis, N.C. (2002a). Frequency effects in language processing. Studies in Second Language Acquisition, 24(2),
143–​188.
Ellis, N.C. (2002b). Reflections on frequency effects in language processing. Studies in Second Language
Acquisition, 24(2), 297–​339.
Ellis, N.C. (2017). Cognition, corpora, and computing: Triangulating research in usage-​based language
learning. Language Learning, 67(S1), 40–​65.
Engber, C.A. (1995). The relationship of lexical proficiency to the quality of ESL compositions. Journal of
Second Language Writing, 4(2), 139–​155. https://​doi.org/​10.1016/​1060-​3743(95)90004-​7
Ferris, D. (1999). The case for grammar correction in L2 writing classes: A response to Truscott (1996). Journal
of Second Language Writing, 8(1), 1–​11.
Ferris, D. (2006). Does error feedback help student writers? New evidence on the short-​and long-​term effects of
written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing: Contexts
and Issues (pp. 81–​104). Cambridge: Cambridge University Press.
Ferris, D., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22(3), 307–​329.
Garner, J., Crossley, S., & Kyle, K. (2019). N-​gram measures and L2 writing proficiency. System, 80, 176–​187.
https://​doi.org/​10.1016/​j.system.2018.12.001
Garner, J., Crossley, S.A., & Kyle, K. (2020). Beginning and intermediate L2 writer’s use of n-​grams: An asso-
ciation measures study. International Review of Applied Linguistics in Language Teaching, 58(1), 51–​74.
Graesser, A.C., McNamara, D.S., Louwerse, M.M., & Cai, Z. (2004). Coh-​Metrix: Analysis of text on cohesion
and language. Behavior Research Methods, Instruments, & Computers, 36(2), 193–​202. https://​doi.org/​
10.3758/​BF03195564
Guiraud, P. (1960). Problèmes et méthodes de la statistique linguistique. Paris: Presses universitaires de France.
Guo, L., Crossley, S.A., & McNamara, D.S. (2013). Predicting human judgments of essay quality in both
integrated and independent second language writing samples: A comparison study. Assessing Writing,
18(3), 218–​238.
Harley, B., & King, M.L. (1989). Verb lexis in the written compositions of young L2 learners. Studies in
Second Language Acquisition, 11(4), 415–​439.
Heatley, A., & Nation, I.S.P. (1994). Range. Victoria University of Wellington, NZ. [computer program].
Retrieved from www.Vuw.Ac.Nz/​Lals/​
Herdan, G. (1964). Quantitative linguistics. London: Butterworths.
Hirsh, D., & Nation, P. (1992). What vocabulary size is needed to read unsimplified texts for pleasure? Reading
in a Foreign Language, 8, 689–​689.
Hulstijn, J. H., & Laufer, B. (2001). Some empirical evidence for the involvement load hypothesis in vocabu-
lary acquisition. Language Learning, 51(3), 539–​558.
Ishikawa, S.I. (2013). The ICNALE and sophisticated contrastive interlanguage analysis of Asian learners of
English. Learner Corpus Studies in Asia and the World, 1(1), 91–​118.
Jarvis, S. (2002). Short texts, best-​fitting curves and new measures of lexical diversity. Language Testing,
19(1), 57–​84. https://​doi.org/​10.1191/​0265532202lt220oa
Jarvis, S. (2013a). Capturing the diversity in lexical diversity. Language Learning, 63(s1), 87–​106.
Jarvis, S. (2013b). Defining and measuring lexical diversity. Vocabulary knowledge: Human ratings and
automated measures, 47, 13.
Jarvis, S. (2017). Grounding lexical diversity in human judgments. Language Testing, 34(4), 537–​553.
Johnson, M.D., Acevedo, A., & Mercado, L. (2016). Vocabulary knowledge and vocabulary use in second lan-
guage writing. TESOL Journal, 7(3), 700–​715.
Keating, G.D. (2008). Task effectiveness and word learning in a second language: The involvement load
hypothesis on trial. Language Teaching Research, 12(3), 365–​386.
Kellogg, R.T., & Raulerson, B.A. (2007). Improving the writing skills of college students. Psychonomic
Bulletin & Review, 14(2), 237–​242. https://​doi.org/​10.3758/​BF03194058
Kim, M., Crossley, S.A., & Kyle, K. (2018). Lexical sophistication as a multidimensional phenom-
enon: Relations to second language lexical proficiency, development, and writing quality. The Modern
Language Journal, 102(1), 120–​141.
Kim, Y. (2008). The role of task-​induced involvement and learner proficiency in L2 vocabulary acquisition.
Language Learning, 58(2), 285–​325.
Koizumi, R., & In’nami, Y. (2012). Effects of text length on lexical diversity measures: Using short texts with
less than 200 tokens. System, 40(4), 554–​564.
Kojima, M., & Yamashita, J. (2014). Reliability of lexical richness measures based on word lists in short
second language productions. System, 42, 23–​33.
Kucera, H., & Francis, W.N. (1967). Computational analysis of present-​day American English. Providence,
RI: Brown University Press.
Kyle, K. (2020). Measuring lexical richness. In S. Webb (Ed.). The Routledge handbook of vocabulary studies
(pp. 454–​476). New York: Routledge.
Kyle, K., & Crossley, S.A. (2015). Automatically assessing lexical sophistication: Indices, tools, findings, and
application. TESOL Quarterly, 49(4), 757–​786. https://​doi.org/​10.1002/​tesq.194
Kyle, K., Crossley, S.A., & Berger, C.M. (2018). The tool for the automatic analysis of lexical sophistica-
tion (TAALES): Version 2.0. Behavior Research Methods, 50(3), 1030–​1046. https://​doi.org/​10.3758/​
s13428-​017-​0924-​4
Kyle, K., Jarvis, S., & Crossley, S.A. (2018). The tool for the automatic analysis of lexical diversity (TAALED).
Retrieved from www.linguisticanalysistools.org/​TAALED.htm
Laufer, B. (1991). The development of L2 lexis in the expression of the advanced learner. The Modern
Language Journal, 75(4), 440–​448.
Laufer, B. (1994). The lexical profile of second language writing: Does it change over time? RELC Journal,
25(2), 21–​33.
Laufer, B., & Nation, I.S.P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied
Linguistics, 16(3), 307–​322. https://​doi.org/​10.1093/​applin/​16.3.307
Laufer, B., & Nation, I.S.P. (1999). A vocabulary-​size test of controlled productive ability. Language Testing,
16(1), 33–​51.
Linnarud, M. (1986). Lexis in composition: A performance analysis of Swedish learners’ written English.
Lund: CWK Gleerup.
Malvern, D.D., & Richards, B J. (1997). A new measure of lexical diversity. British Studies in Applied
Linguistics, 12, 58–​71.
Manchón, R. M. (2011). Writing to learn the language: Issues in theory and research. In R. M. Manchón
(Ed.), Learning-​to-​write and writing-​to-​learn in an additional language (pp. 61–​82). Amsterdam: John
Benjamins.
Mazgutova, D., & Kormos, J. (2015). Syntactic and lexical development in an intensive English for Academic
Purposes programme. Journal of Second Language Writing, 29, 3–​15.
McCarthy, P.M., & Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24(4),
459–​488.
McCarthy, P.M., & Jarvis, S. (2010). MTLD, vocd-​ D, and HD-​ D: A validation study of sophisticated
approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–​392. https://​doi.org/​
10.3758/​BRM.42.2.381
McLean, S. (2018). Evidence for the adoption of the flemma as an appropriate word counting unit. Applied
Linguistics, 39(6), 823–​845.
McNamara, D.S., Graesser, A.C., McCarthy, P.M., & Cai, Z. (2014). Automated evaluation of text and dis-
course with Coh-Metrix. Cambridge: Cambridge University Press.
Meara, P. (2009). Connected words: Word associations and second language vocabulary acquisition (Vol. 24).
Amsterdam: John Benjamins.
Meara, P., & Bell, H. (2001). P_​Lex: A simple and effective way of describing the lexical characteristics of
short L2 texts. Prospect, 3, 5–​19.
Morris, L., & Cobb, T. (2004). Vocabulary profiles as predictors of the academic performance of Teaching
English as a Second Language trainees. System, 32(1), 75–​87. https://​doi.org/​10.1016/​j.system.2003.
05.001
Nation, I.S.P. (1983). Teaching and testing vocabulary. Guidelines, 4(1), 12–25.
Nation, I.S.P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Nation, I.S.P., & Beglar, D. (2007). A vocabulary size test. The Language Teacher 31(7), 9–​13.
Nation, I.S., & Webb, S.A. (2011). Researching and analyzing vocabulary. Boston, MA: Heinle, Cengage
Learning.
Oakes, M. (1998). Statistics for corpus linguistics. Edinburgh: Edinburgh University Press.
Paivio, A. (1971). Imagery and verbal processes. New York: Holt, Rinehart, and Winston.
Peters, E., & Webb, S. (2018). Incidental vocabulary acquisition through viewing L2 television and factors that
affect learning. Studies in Second Language Acquisition, 40(3), 551–​577.
Pinchbeck, G.G. (2017). Vocabulary use in academic-​track high-​school English Literature diploma exam essay
writing and its relationship to academic achievement (Unpublished PhD thesis). University of Calgary.
Read, J. (1993). The development of a new measure of L2 vocabulary knowledge. Language Testing, 10(3),
355–​371.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Reid, J. (1986). Using the writer’s workbench in composition teaching and testing. In C. Stansfield (Ed.),
Technology and Language Testing (pp. 167–​188). Alexandria, VA: TESOL.
Römer, U. (2010). Using general and specialized corpora in English language teaching: Past, present and
future. In M.C. Campoy-​Cubillo, B. Belles-​Fortuño, & L. Gea-​Valor (Eds.), Corpus-​Based Approaches to
English Language Teaching (pp. 18–​35). London: Continuum.
Schmidt, R.W. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–​32).
Cambridge: Cambridge University Press.
Sinclair, J.M. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Sinclair, J.M. (2004). How to use corpora in language teaching (Vol. 12). Amsterdam: John Benjamins.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S.M. Gass & C. Madden (Eds.), Input in second language acquisition (pp.
235–​253). Rowley, MA: Newbury House.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer (Eds.),
Principles and practice in applied linguistics: Studies in honor of H.G. Widdowson (pp. 125–​144).
Oxford: Oxford University Press.
Thomson, G.H., & Thompson, J.R. (1915). Outlines of a method for the quantitative analysis of writing
vocabularies. British Journal of Psychology, 8(1), 52.
Thorndike, E.L., & Lorge, I. (1944). The teacher’s wordbook of 30,000 words. New York: Columbia University,
Teachers College.
Tomasello, M. (2003). Constructing a language: A usage-​based theory of language acquisition. Cambridge,
MA: Harvard University Press.
Treffers-​Daller, J., Parslow, P., & Williams, S. (2018). Back to basics: How measures of lexical diversity can
help discriminate between CEFR levels. Applied Linguistics, 39(3), 302–​327.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2),
327–​369.
Varley, S. (2009). I’ll just look that up in the concordancer: Integrating corpus consultation into the language
learning environment. Computer Assisted Language Learning, 22(2), 133–​152.
Verspoor, M., Schmid, M. S., & Xu, X. (2012). A dynamic usage based perspective on L2 writing. Journal of
Second Language Writing, 21(3), 239–​263.
Webb, S. (2005). Receptive and productive vocabulary learning: The effects of reading and writing on word
knowledge. Studies in Second Language Acquisition, 27(1), 33–​52.
Webb, S.A. (2009). The effects of pre-​learning vocabulary on reading comprehension and writing. The
Canadian Modern Language Review, 65(3), 441–​470.
Webb, S. (2010). A corpus driven study of the potential for vocabulary learning through watching movies.
International Journal of Corpus Linguistics, 15(4), 497–​519.
Xue, G., & Nation, I.S.P. (1984). A university word list. Language Learning and Communication, 3(2),
215–​229.
Yang, Y., Sun, Y., Chang, P., & Li, Y. (2019). Exploring the relationship between language aptitude, vocabulary
size, and EFL graduate students’ L2 writing performance. TESOL Quarterly, 53(3), 845–​856.
Yoon, H. (2018). The development of ESL writing quality and lexical proficiency: suggestions for assessing
writing achievement. Language Assessment Quarterly, 15(4), 38–​405.
Yoon, H.J., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51(2), 275–​301.
Yule, G.U. (1944). The statistical study of literary vocabulary. Cambridge: Cambridge University Press.
Zou, D. (2017). Vocabulary acquisition through cloze exercises, sentence-​writing and composition-​writing:
Extending the evaluation component of the involvement load hypothesis. Language Teaching Research,
21(1), 54–​75.

15
L2 WRITING AND FORMULAIC
LANGUAGE
Formulaic Chunks and Lexical Bundles

Hyung-​Jo Yoon
California State University, Northridge

Introduction
Using appropriate and natural language is one of the aims that second language (L2) writers have.
Much L2 writing research offering pedagogical implications has focused on how to address lin-
guistic errors produced by L2 writers as error-​free sentences may contribute to achieving the nat-
uralness of learners’ written language. However, it has been noted that the absence of grammatical
and lexical errors does not necessarily result in the use of language that is fully acceptable and
idiomatic in specific settings, indicating the need to transcend the traditional, binary understanding
of accuracy (Polio & Yoon, 2021; Wulff & Gries, 2011). Informed by a usage-​based approach,
this chapter views natural language as high-​frequency, formulaic language perceived as natural
by native speakers (Ellis, 2008; Geluso, 2013). While involving multiple sub-​skills, therefore, L2
learners’ production of natural language is closely associated with their proper use of formulaic lan-
guage. In this regard, Pawley and Syder (1983) argued for the essential role of formulaic language
for L2 users’ nativelike production, which can be seen ideally as language with “lexicogrammatical
correctness, context-​dependent acceptability, and frequency-​based idiomaticity” (Mukherjee, 2005,
p. 15).
Since Sinclair’s (1991) theorization of the availability and prevalence of formulaic language
(widely known as the idiom principle), it has been explored using several different terms that
include chunks (De Cock, 2000), formulaic sequences (Wray, 2000), and multi-​word units (Greaves
& Warren, 2010), among others. With a growing understanding of the processing advantages of
pre-​arranged word combinations in language use (Siyanova-​Chanturia & Martinez, 2015; Wray,
2012), L2 writing scholars have investigated various types of formulaic units including collocations
(e.g., Altenberg & Granger, 2001; Laufer, & Waldman, 2011; Yoon, 2016), lexical bundles (e.g.,
Chen & Baker, 2010; Shin, 2019), n-​grams (e.g., Bestgen & Granger, 2014; Garner, Crossley, &
Kyle, 2019), and phrase frames (p-​frames; e.g., Garner, 2016; O’Donnell, Römer, & Ellis, 2013).
The term collocation has been defined and operationalized in different ways, while seen commonly
as syntactically-​bound sequences composed of lexical words (both continuous and discontinuous;
e.g., heated discussion and arrange an appointment, respectively). Collocations were traditionally
viewed as lexical combinations with limited word substitution (e.g., strong tea counts as a collocation because the noun tea collocates with strong but not with its near-synonym powerful; Laufer & Waldman, 2011). On the other hand, based on a more recent, frequency-based
approach, lexical combinations that occur together as a formulaic unit more frequently than chance
are considered collocations. That is, eat lunch and drink coffee were not seen as collocations (but
simply free combinations) in a traditional sense, but these high-​frequency examples would be
categorized as collocations from a statistical perspective.
Lexical bundles are frequently-​recurring continuous sequences that generally consist of three
or more words, and many of these sequences perform important discourse functions (e.g., in
order to, as a result of, at the end of the) (Biber & Barbieri, 2007; Biber, Conrad, & Cortes,
2004). Lexical bundles have also been operationalized differently across L2 studies in terms of
bundle size (number of words constituting a bundle), dispersion or range (number of different
texts that include a bundle), and frequency (minimum occurrences to be considered a target
bundle) (e.g., 40 times per million words in Biber et al., 2004; 25 times per million in Chen &
Baker, 2010; 20 times per million in Pérez-​Llantada, 2014), making it difficult to generalize
their research findings. N-​grams refer to contiguous sequences of n words, and researchers
have generally been interested in exploring bigrams and trigrams that consist of two and three words,
respectively. While n-​gram research, unlike that on lexical bundles, tends not to include disper-
sion as one of the criteria, n-​grams share their major characteristics with lexical bundles in that
they are contiguous multi-​word sequences and often function as stance or discourse organizing
expressions (e.g., you know, for example, in addition) (Granger, 2018).
Two other types of sequences include p-​frames and phrasal verbs. P-​frames refer to contiguous
word strings with an open slot (e.g., in the context of, in the form of, in the middle of under the
p-​frame of in the * of), but there has been variation in the available position of an open slot for
p-​frames (e.g., only medial, as opposed to no initial or end positions in Römer, 2010; any position
in Cunningham, 2017). The majority of research employing p-​frames has explored research article
section-​ or discipline-​specific features of academic writing (e.g., Cunningham, 2017; Lu, Yoon,
& Kisselev, 2018), and there is less research on this type of formulaic language. Phrasal verbs,
combinations of a verb and a particle that “function together as a single unit both lexically and
syntactically” (Darwin & Gray, 1999, pp. 76–​77), have also been identified as a phraseological unit
worth exploration due to their frequent occurrences in language use and L2 learners’ difficulty using
them properly, which might have arisen from their structural and functional variations (Garnier &
Schmitt, 2015). However, L2 writing scholars have given relatively reduced attention to phrasal
verbs because their occurrences are a prominent feature of informal, spoken language registers
(Siyanova & Schmitt, 2007).
Of these formulaic units, therefore, this chapter will focus on reviewing L2 writing studies that
have discussed collocations or lexical bundles. It will specifically offer a summary of the research
findings related to language development, followed by a discussion of methodological issues. It will
end with pedagogical suggestions and future directions derived from previous research findings.

Historical Perspectives
Formulaic language has long received attention from a wide range of fields, and it was in the 1970s
that linguists began showing much interest in formulaic language research. The study by Pawley
and Syder (1983) first noted that L2 speakers’ fluency and nativelike selection depend on their use
of formulaic language, and Sinclair (1987) led the seminal project that enabled an empirical ana-
lysis of language use patterns by exploring a machine-​readable corpus systematically. Since then,
formulaic language has been investigated in the fields of corpus linguistics and L2 writing research
through different lenses.
The initial focus of phraseology was mainly on the classification of distinct phraseological units
(e.g., pure idioms, figurative idioms, and collocations) and on the compilation of lists using top-​down
linguistic criteria (e.g., structural flexibility and compositionality of meaning) (Nesselhauf, 2005).
More recently, bottom-​up or corpus-​driven approaches have become possible with researchers
examining frequency and co-​occurrence patterns of phraseological units in large collections of
spoken and written texts along with their discourse functions (Paquot & Granger, 2012). As a result,
scholars have broadened the scope of word combinations to be explored based on their frequency of
occurrence and functions. For example, lexical bundles, which were not considered main phraseo-
logical units, have become the focus of much attention lately.
In early L2 writing studies, learners’ phraseological competence tended to be measured by iden-
tifying the proportion of erroneous combinations to all combinations identified using a set of cri-
teria such as native speakers’ judgment or listings in selected collocation dictionaries (e.g., Laufer
& Waldman, 2011; Nesselhauf, 2005). However, it has been noted that the dichotomous distinc-
tion of collocations either as correct or incorrect can be problematic due to the existence of many
collocations that would arguably be positioned in the middle of this continuum as well as contextual
influences (e.g., modality, register, or discipline) on the perceived appropriacy of formulaic lan-
guage (Cortes, 2004; Hyland, 2008; Polio & Yoon, 2021). Many scholars now acknowledge that
there is a need to employ a more fine-​grained approach to formulaic language development.
Accordingly, using the degree of association strength between word constituents of a combin-
ation, coupled with its frequency, has become a valid alternative to binary judgments (e.g., accurate
or inaccurate) (Biber, Conrad, & Reppen, 1998; Simpson-​Vlach & Ellis, 2010). The two widely-​
employed measures of association are T-​score and mutual information (MI) (see Evert, 2009 for a
detailed explanation). Briefly, T-​score indicates the certainty of the association between two words,
and high-​frequency sequences tend to receive higher T-​score values (e.g., good student, have infor-
mation). MI, on the other hand, is the association measure that emphasizes the exclusive occurrence
of a word constituent with the other constituent as a pair. MI tends to be higher with low-​frequency,
strongly associated combinations (e.g., excruciating pain, schedule [an] appointment). Of the two
measures, MI is being increasingly employed as a developmental measure because L2 writers are
found to rely more on sequences with high MI in their language use as their proficiency advances
(e.g., Granger & Bestgen, 2014). However, it has also been noted that different measures are
required for different types of sequences because, for example, MI undervalues combinations with
function words and does not take into account how word constituents are sequenced (see Biber,
2009); thus, a few alternative measures have been proposed (Brezina, McEnery, & Wattam, 2015;
Wei & Li, 2013). In the following section, I will discuss how findings of studies using frequency
or probabilistic methods have informed us of the link between use of formulaic language and L2
writing skills.
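For readers unfamiliar with how these association measures are obtained, the brief sketch below computes MI and T-score for a two-word sequence from its observed corpus frequency, the frequencies of its component words, and the corpus size. The counts are invented for illustration; studies of the kind reviewed below derive them from a reference corpus such as the BNC or COCA.

import math

def expected_frequency(f_word1, f_word2, corpus_size):
    """Expected bigram frequency if the two words co-occurred only by chance."""
    return f_word1 * f_word2 / corpus_size

def mutual_information(observed, f_word1, f_word2, corpus_size):
    """MI: log (base 2) of observed over expected co-occurrence frequency."""
    return math.log2(observed / expected_frequency(f_word1, f_word2, corpus_size))

def t_score(observed, f_word1, f_word2, corpus_size):
    """T-score: (observed - expected) scaled by the square root of the observed frequency."""
    return (observed - expected_frequency(f_word1, f_word2, corpus_size)) / math.sqrt(observed)

# Invented counts: a frequent but weakly exclusive pair versus a rare but
# strongly associated pair (corpus of 100 million tokens).
N = 100_000_000
print("good example         MI:", round(mutual_information(5000, 800_000, 300_000, N), 2),
      " T:", round(t_score(5000, 800_000, 300_000, N), 2))
print("preconceived notions MI:", round(mutual_information(300, 800, 9_000, N), 2),
      " T:", round(t_score(300, 800, 9_000, N), 2))

The output illustrates the contrast described above: the frequent pair receives a much higher T-score, while the rare but exclusive pair receives a much higher MI value.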

Critical Issues and Topics


L2 writing scholars have made effective use of corpus-​linguistic methods in exploring how formu-
laic language is associated with L2 writing development, and as noted earlier, lexical bundles and
collocations have been the most extensively targeted units in formulaic language research. With
regard to learners’ lexical bundle use, studies have made a comparison between L2 writers and L1
counterparts in terms of the total frequency (tokens) and varied use (types) of lexical bundles (e.g.,
Ädel & Erman, 2012; Bychkovska & Lee, 2017; Chen & Baker, 2010; Pérez-​Llantada, 2014; Qin,
2014). Their findings have generally shown that L2 writers tend to rely on a smaller set of lexical
bundles frequently as their safe choice (i.e., reduced types of bundles), offering evidence for the L2
writer’s limited phraseological repertoire.
However, relying on certain frequency and dispersion criteria for lexical bundle identification
(e.g., it must appear at least 40 times per million words in at least five different texts to be identified
as a lexical bundle for a target corpus), studies have yielded mixed findings in terms of the total fre-
quency of lexical bundles. Some studies found greater tokens of lexical bundles in L2 writer texts
than L1 texts (e.g., Bychkovska & Lee, 2017), while others demonstrated the opposite (e.g., Chen
& Baker, 2010). In addition, earlier research using structural analyses has identified that L2 writers
are likely to use more clausal bundles (e.g., is going to be, I would like to) in their writing than
L1 writers who tend to use phrasal bundles (e.g., in the case of, the beginning of the) to a greater
extent, but recent studies have offered evidence that the difference in bundle structures is attributable to
writing proficiency rather than writers’ native-​speaker status (Shin, 2019). Therefore, from a devel-
opmental perspective, L2 researchers began exploring lexical bundles in terms of their varied use
across proficiency levels (Chen & Baker 2016; Staples, Egbert, Biber, & McClair, 2013; Vo, 2019).
Work by Vo (2019), for example, investigated the frequency and structural distribution of lexical
bundles used by L2 learners at different proficiency levels. In this study, four-​word clusters that
appeared at least ten times per 50,000 words in at least five different texts were identified as lexical
bundles for each corpus. As a result, Vo found that the total frequency of bundles used by L2 writers
decreased as proficiency increased. It was also found that lower-​level essays were characterized by
the increased use of bundles containing a verb phrase (VP-​based bundles) and higher-​level essays
by the increased use of bundles with a prepositional phrase (PP-​based bundles). Similarly, Chen
and Baker (2016) examined the structural patterns of lexical bundles used in L2 essays at the three
proficiency levels (CEFR B1, B2, and C1) and revealed that the majority of bundles occurring in
the essays at the lowest level were VP-​based, and the proportion of VP-​based bundles decreased
and that of PP-​based bundles increased with proficiency advancement.
Staples et al. (2013) also investigated the frequency and function of lexical bundles by L2
proficiency. Their finding suggested that the quantity of lexical bundles used by L2 writers
decreased as their proficiency increased, confirming our understanding that lexical bundles are
useful resources for low-​level writers. Interestingly, their additional analysis with the bundles
divided into those directly copied from the writing prompts and those not copied showed that the
greater use of lexical bundles by the lowest group was in fact due to their heavy reliance on the
language expressions at hand (i.e., no statistically significant difference between the lowest and
highest level groups in the quantity of non-​prompt bundles; statistical significance only between
the intermediate and highest groups).
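The identification procedure that underlies these bundle studies can be illustrated with the bare-bones sketch below, which extracts all contiguous four-word sequences from a small set of texts and retains those meeting minimum frequency and dispersion thresholds. The thresholds and the miniature corpus are placeholders; published studies normalize frequency per million words and apply additional filtering.

from collections import Counter, defaultdict

def four_grams(tokens):
    """Return all contiguous four-word sequences in a list of tokens."""
    return [" ".join(tokens[i:i + 4]) for i in range(len(tokens) - 3)]

def extract_bundles(texts, min_freq=2, min_texts=2):
    """Keep 4-word sequences occurring at least min_freq times in the corpus
    and in at least min_texts different texts (placeholder thresholds)."""
    freq, dispersion = Counter(), defaultdict(set)
    for text_id, text in enumerate(texts):
        for gram in four_grams(text.lower().split()):
            freq[gram] += 1
            dispersion[gram].add(text_id)
    return {gram: n for gram, n in freq.items()
            if n >= min_freq and len(dispersion[gram]) >= min_texts}

corpus = [
    "on the other hand the results are clear",
    "on the other hand it is not clear",
    "as a result of the test it is clear",
]
print(extract_bundles(corpus))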
I now turn to the issues covered by collocational strength studies. Durrant and Schmitt (2009)
is one of the early studies examining L2 learners’ use of collocations from a statistical perspective.
Specifically, using T-​score and MI, the authors examined how L2 learners’ use of premodifier-​noun
combinations differed from that of L1 counterpart writers. The major findings of this study include
L2 learners’ overuse of high-​frequency combinations characterized by high T-​score (e.g., good
example, hard work) and underuse of low-​frequency, strongly associated combinations that were
generally assigned high MI (e.g., preconceived notions). Using a similar methodology, Granger
and Bestgen (2014) compared intermediate-​and advanced-​level learners’ use of adjacent two-​
word combinations (noun-​noun, adjective-​noun, adverb-​adjective, and all bigrams) and suggested
a similar pattern of increased use of high MI combinations in the case of advanced learner writing,
giving support to the use of collocational strength as measures of phraseological competence and
L2 writing development.
A few longitudinal studies have been conducted to trace changes in collocational strength
over time (e.g., Bestgen & Granger, 2014; Siyanova-​Chanturia, 2015; Yoon, 2016). Bestgen and
Granger (2014) analyzed the association strength of bigrams (all pairs of contiguous words) in L2
learners' descriptive essays collected over one semester, on the assumption that increased use
of strongly associated bigrams (i.e., high-MI bigrams such as personality traits, traffic jam) would
indicate more developed phraseological performance. They found no significant change in
association strength over the semester, but a significant relationship between MI and
essay quality. This finding suggests that there might be a link between formulaic language use and
perceived writing proficiency of L2 learners.
The longitudinal study by Siyanova-​Chanturia (2015) made some unique contributions to the
field. Unlike the majority of research with intermediate or higher learner participants, this study
explored the development of collocational competence in the essays composed by beginning-​
level L2 learners of Italian. Specifically, Siyanova-​Chanturia examined tokens and types of noun-​
adjective combinations in the learner essays collected at three time points (intervals of seven weeks)
and found that the numbers of the target combinations were comparable across the time points. With
an additional analysis of collocational strength, however, she revealed that the tokens and types
of strongly associated combinations with high MI (i.e., MI equal to or greater than 3) increased
over time.
In a related study, Yoon (2016) explored verb-​noun collocations in terms of the development of
their collocational strength (MI) over time. Specifically, this research involved intermediate-​level
ESL learners who composed three narrative and three argumentative essays over the course of one
semester, as well as L1 writers who wrote one for each genre. The longitudinal finding of this study
showed little change in association strength in L2 learner writing in either genre. A cross-
sectional analysis comparing the L1 and L2 groups found a significant difference in the
association strength of argumentative writing, but no difference in the case of narrative writing.
This finding seems to indicate that L2 learners' phraseological competence is genre specific. In
Yoon's study, the learner participants might have developed collocational ability comparable
to that of their L1 counterparts for narrative writing, but their ability to use the combinations expected
for convincing argumentation (e.g., eliminate distraction, outweigh disadvantage) was not yet fully
developed at this proficiency level.
It can be concluded from these longitudinal studies that L2 writers struggle to develop
phraseological competence noticeably over a short period such as one semester; acquiring formulaic
language in the written modality may therefore require an extended time period or intensive
instruction. Another interpretation of the findings is that the phraseological competence approximated
by association strength develops at L2 learners’ early stages of development, as shown in Siyanova-​
Chanturia (2015), and we may need to vary our approaches to exploring the development of
phraseological competence.
Similar to the efforts made in SLA research to operationalize L2 proficiency and propose valid
measures of language development (Bulté & Housen, 2012; Norris & Ortega, 2009), there have
been some notable endeavors to delve into the multifaceted nature of phraseological competence.
In a recent study by Paquot (2019), for example, a number of measures categorized into phraseo-
logical diversity and sophistication were compared with traditional syntactic and lexical measures
in terms of their sensitivity to L2 texts at different proficiency levels. In investigating two-​word
combinations in a specific syntactic structure (e.g., adjective-​noun, adverb-​verb, verb-​noun), she
used the Academic Collocation List (Ackermann & Chen, 2013) to reveal the proportions of com-
binations that belonged to the list of pedagogically valuable collocations, and also used mean MI
as a target association measure for each combination type. Her findings indicated that MI performed
as a valid index of L2 writing proficiency, and, in particular, mean MI of verb-noun com-
binations (here, nouns functioning syntactically as direct objects) was found to be the most valid
measure.

Current Contributions and Research


This section describes the contributions of current research to our understanding of formulaic lan-
guage instruction, lexical bundle accuracy, and mediating genre effects. As discussed in the previous
section, L2 learners have difficulties improving their formulaic language ability in a short period
of time. For example, a case study by Li and Schmitt (2009) followed a Chinese MA student's
development of lexical phrases (e.g., according to, in addition, there is no consensus) in her written
assignments over the course of one academic year. While her lexical phrase use improved notably
with regard to appropriateness, she could not overcome her tendency to rely excessively on a small
number of phrases. Lexical bundles, which often take the form of structurally varied and incomplete
clusters, cannot be acquired easily through mere exposure to input; their acquisition may require more
focused instruction.
A few recent studies have been conducted on the effect of instruction on formulaic sequences
(e.g., Al Hassan & Wood, 2015; Shin & Kim, 2017) and have commonly found facilitative effects
of focused instruction. For example, a quasi-​experimental study by Al Hassan and Wood (2015)
investigated the effects of focused instruction of lexical bundles and collocations over ten weeks.
As a result, they found that the explicit instruction involving consciousness-​raising tasks such as
matching and sentence rewriting resulted in a significant increase in the frequency of the target for-
mulaic sequences in L2 learner essays. It was also revealed that the learner essays were assigned
higher quality scores after the training period and that essay scores tended to be correlated strongly
with the quantity of formulas in the essays. These results of significant instructional effects and
the association between formulaic sequences and essay quality clearly indicate that formulaic
sequences should be one of the pedagogical foci for L2 writing development.
In their experimental study, Shin and Kim (2017) examined the instructional effect of a peda-
gogic task intended to raise L2 learners’ awareness of the articles embedded in lexical bundles. For
example, the core expression one of was identified as occurring in high-frequency lexical bundles such
as is one of the, one of the main, and one of the most, and the bundles containing the core expression
were taught explicitly to the learners using error correction and sentence completion activities. With
a total of ten target core expressions, Shin and Kim found that L2 learners with the experimental
treatment outperformed those without the treatment in producing discrete sentences containing
the core expressions. Their findings, obtained from learners at two different proficiency
levels, appear to confirm the value of using bundle-based pedagogical tasks for the acquisition of
morphological forms and their appropriate use in various contexts.
Additionally, there has been a shift in the focus of L2 text analysis that is intended to go
beyond the traditional frequency analysis of formulaic sequences; recent writing studies have
paid increased attention to the accurate and inaccurate use of lexical bundles so as to identify
potential learner difficulties (e.g., Bychkovska & Lee, 2017; Huang, 2015; Shin, Cortes, & Yoo,
2018). To explore such qualitative changes in L2 learners’ use of lexical bundles, Huang (2015)
examined the argumentative essays composed by Chinese EFL students at two different levels
(English major students in Years 1 and 2 compared with those in Years 3 and 4). No significant
difference was found between the two groups in terms of the proportion of accurate bundles
(commonly high accuracy over 92%). With the grammatical misuses categorized into inexistent
phrase, agreement error, infinitive error, and article error, Huang found that the most frequent
misuse type for both groups was agreement errors (e.g., one of the newest way) and interpreted
it as a potential consequence of negative L1 transfer. Similarly, Bychkovska and Lee (2017)
explored the accuracy of lexical bundles in L2 essays written by Chinese ESL students. Their
findings, however, contrasted with Huang's in that the proportion of accurate bundles was
only 77%, and over 50% of the misuses were related to the use of articles (e.g., on other hand,
the one of most) and prepositions (e.g., in the same time) rather than agreement. The methodo-
logical difference between the two studies might have resulted in mixed findings (i.e., 3-​ to
5-​word bundles targeted in Huang, 2015; only 4-​word bundles in Bychkovska & Lee, 2017),
but both sets of findings point to cross-linguistic influence on bundle misuse and the need to
address it through explicit instruction.
Another way to arrive at more valid accounts of formulaic language development is to treat register
(or genre) as a mediating factor and to control for it. It has been noted that L2 learners'
phraseological competence is partly an indication of their ability to deploy genre-​specific or
discipline-​specific repertoires of formulaic sequences appropriately (Cortes, 2004; Hyland, 2008).
Writers with greater experience and familiarity with conventions in a discourse community are
expected to demonstrate their competence by using more formulaic sequences generally acceptable
in the community; the absence of such sequences in writing might be evidence of their limited
discourse-​specific repertoires of formulaic language (Haswell, 1991). Some efforts have been made
to analyze lexical bundles frequently occurring in various registers and discourse settings (e.g.,
Biber & Barbieri, 2007; Cortes, 2013; Durrant, 2017), often with a focus on their communicative
functions proposed by the Biber et al. (2004) framework (i.e., stance, discourse organizing, and ref-
erential bundles). Also, while earlier studies tended to use published research articles as a reference
dataset and compare it with L2 learner essays to demonstrate unique characteristics and develop-
ment of learner writing, scholars have begun to conduct their research with sufficient control of
register (e.g., Shin, 2019), with greater awareness of its necessity for developmental research (Yoon
& Polio, 2017). However, little research has been conducted on the effect of register on collocations
using learner corpora (except for Yoon, 2016, in which verb-​noun collocations were explored in
narrative and argumentative writing), indicating the need for further research to offer more concrete
suggestions for genre-​based collocation teaching.

Main Research Methods


L2 studies of writers' phraseological performance have traditionally employed an interlanguage
analysis design that focused primarily on the frequency of a target phraseological unit (Granger,
1996). Major comparisons have been made between native and non-​native language user groups
(e.g., Chen & Baker, 2010; Durrant & Schmitt, 2009), learners with different L1 backgrounds
(e.g., Wang & Shaw, 2008), and learners at different proficiency levels (e.g., Vidakovic & Barker,
2010). Frequency-​based L2 writing studies have been conducted with the main goal of linking
formulaic language use with writing skill development. One of the main corpus-​linguistic methods
was to extract formulaic clusters from learner essays using a set of extraction criteria (e.g., min-
imum frequency and dispersion) and examine their quantitative and qualitative differences in rela-
tion to learner variables such as L2 proficiency (e.g., Chen & Baker, 2016; Staples et al., 2013).
The other method involves the analysis of target formulaic clusters in learner essays using their
frequency and association strength values obtained from a reference corpus, such as the Corpus
of Contemporary American English (e.g., Bestgen & Granger, 2014; Paquot, 2019; Yoon, 2016).
The second method is based on the expectation that a large collection of native-​speaker language
use can serve validly as a model for L2 learners to follow. One limitation of the statistical explor-
ation of formulaic units on the basis of association strength is that it does not capture
learners' creative language use involving metaphor, simile, or humor (e.g., priceless
scene, squirrels stroll).
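As a sketch of the second method, the function below scores a learner text by averaging reference-corpus MI values over its adjacent bigrams. The lookup table `reference_mi`, its format, and the decision to skip unattested bigrams are expository assumptions rather than the procedure of any particular study cited here.

```python
def mean_bigram_mi(tokens, reference_mi):
    """Average reference-corpus MI over the adjacent bigrams of a learner text.

    `reference_mi` maps (word1, word2) pairs to MI values computed in advance
    from a large reference corpus; bigrams unattested there are simply skipped,
    although some studies treat such bigrams as a separate measure.
    """
    scores = [reference_mi[(w1, w2)]
              for w1, w2 in zip(tokens, tokens[1:])
              if (w1, w2) in reference_mi]
    return sum(scores) / len(scores) if scores else None

# Example with a toy lookup table (values are illustrative, not real corpus statistics):
# reference_mi = {("traffic", "jam"): 9.2, ("good", "example"): 3.1}
# mean_bigram_mi("there was a traffic jam".split(), reference_mi)
```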
The majority of studies have used a cross-​sectional design and made comparisons between
groups (e.g., Ädel & Erman, 2012; Bychkovska & Lee, 2017; Chen & Baker, 2010, 2016; Pérez-​
Llantada, 2014; Staples et al., 2013; Vo, 2019), while some studies endeavored to trace changes over
time using a longitudinal research design (e.g., Benevento & Storch, 2011; Crossley & Salsbury,
2011; Li & Schmitt, 2009). In addition to these causal-​comparative studies, there has been some
correlational research that explores the relationship between formulaic language features and essay
quality scores (e.g., Bestgen, 2017; Bestgen & Granger, 2014; Garner et al., 2019). Most of the
correlational studies, as a follow-​up step of their main statistical analysis, compute a regression
analysis to reveal the collective contribution of phraseological features to essay quality. There has
also been intervention research that aimed to investigate the effectiveness of focused instruction of
formulaic language (e.g., Al Hassan & Wood, 2015; Shin & Kim, 2017).
Notably, formulaic language has been explored with different methodological foci
depending on the type of formulaic unit. For example, n-gram (two- or three-word clusters) and
collocation studies tended to use the degree of association between constituents of a combination
(e.g., MI scores) as measures of writing development (e.g., Bestgen & Granger, 2014; Durrant &
Schmitt, 2009; Garner, Crossley, & Kyle, 2019; Yoon, 2016), whereas studies on longer contiguous
words (four-​ or five-​word clusters) tended to have a focus on analyzing their occurrence patterns
with regard to functional and structural characteristics (e.g., Chen & Baker, 2010; Qin, 2014; Shin,
2019). Also, some recent studies endeavored to expand the scope of target phraseological units
by using more sophisticated statistical and query systems. Researchers began to explore formu-
laic units with different degrees of fixedness. For example, Staples et al. (2013) explored how
proportions of fixed and variable slot bundles (e.g., fixed bundles: I agree with the, that it is more;
variable slot bundles: * be able to; * in the past) differed in the learner essays across proficiency
levels. Garner (2016) then explored L2 learners’ use of p-​frames, which are semi-​fixed formulaic
sequences with an open slot surrounded by other word constituents (e.g., the * of the, I am * to; *
indicating an open slot), and found that the patterns of p-​frame use in the learners’ essays tended to
become more varied and complex as L2 proficiency increased.
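To illustrate what a p-frame analysis involves, the sketch below generates four-word frames with one internal open slot from a tokenized corpus and keeps those whose slot is filled by at least two different words. The variability threshold and output format are illustrative assumptions and do not reproduce Garner's (2016) exact procedure.

```python
from collections import defaultdict

def extract_p_frames(texts, n=4, min_fillers=2):
    """Find n-word sequences that share all but one internal word (p-frames).

    Each frame is stored with the set of words ("fillers") that occupy its open
    slot, marked here with '*'; frames attracting at least `min_fillers` distinct
    fillers are returned as candidate semi-fixed formulaic sequences.
    """
    fillers = defaultdict(set)
    for tokens in texts:
        for j in range(len(tokens) - n + 1):
            gram = tokens[j:j + n]
            for slot in range(1, n - 1):  # keep the first and last words fixed
                frame = tuple(gram[:slot] + ["*"] + gram[slot + 1:])
                fillers[frame].add(gram[slot])
    return {" ".join(frame): sorted(words)
            for frame, words in fillers.items() if len(words) >= min_fillers}

# Example:
# extract_p_frames([["the", "end", "of", "the", "day"],
#                   ["the", "rest", "of", "the", "day"]])
# -> {"the * of the": ["end", "rest"]}
```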

Recommendations for Practice


Corpus-based research findings have allowed us to recognize the important roles played by a var-
iety of formulaic units and have prompted a pedagogical shift in focus from single words to
multi-word expressions (Granger, 2018). In the case of L2 writing, learners are expected to benefit from
relying on formulaic language in their writing performance as it frees up attentional resources
for content and rhetorical structure. Given this understanding, our next goal should be to find ways
to facilitate L2 writers’ ability to use varied and appropriate formulaic sequences. With a particular
focus on the effect of task support, for example, task designers can explore how writing can be a
medium for language acquisition by manipulating the condition of bundle support (e.g., provision
of a certain number of bundles to be used to complete a writing task) and studying learners'
retention of those bundles.
It can be inferred that L2 writers at advanced proficiency are likely to have sufficient repertoires
of formulaic expressions for multiple written genres, but many learners are in fact expected to
have differing levels of phraseological competence across different genres. It may be the case that
adult L2 learners have greater familiarity and knowledge of argumentative functions than other
functions such as description and narration, given that argumentative writing is the most prevalent
genre in academic settings (Christie, 1997; Wolfe, 2011). Focusing on narrative and argumenta-
tive genres, Yoon and Polio (2017) explored distinct patterns of formulaic language across the
two genres and found that the argumentative essays composed by intermediate-​level ESL students
were characterized by increased tokens of formulaic sequences (e.g., in my opinion, I think that, for
(determiner) reasons). This finding may be indicative of L2 learners’ greater ability to retrieve for-
mulaic bundles related to argumentative writing than to narrative writing; therefore, L2 learners
may need explicit instruction on how to use multi-word clusters to express a wide range of discourse
functions with which they are less familiar. Furthermore, Yoon's (2016) findings suggest the need for
explicit attention to collocations during instruction (and see Polio, this volume, who makes similar
recommendations regarding morphosyntax).
These pedagogical aims can be pursued by designing L2 writing instruction to facilitate learners’
acquisition of genre-​specific formulaic sequences and flexibility of their use across different func-
tional needs. Similarly, Qin and Uccelli (2020) emphasized the need to move our focus from com-
plexity increase to register flexibility in relation to L2 writing development (i.e., L2 writers’ ability
to deploy their linguistic resources flexibly across registers and genres). Thus far, studies on register
flexibility tended to focus on examining textual features related to syntactic and lexical complexity.
Given the important roles played by formulaic sequences in written discourse as well as the potential
relationship between traditional linguistic dimensions (complexity, accuracy, and fluency) and for-
mulaic units, it would be of great value to explore L2 writers’ register flexibility at the phraseology
level in comparison to that at different levels of language structure (e.g., lexical sophistication and
syntactic complexity); it will inform us of what tasks and techniques to use in classroom settings to
allow L2 students to attain a high level of genre awareness and flexibility.
Finally, for the purpose of fostering the accurate use of formulaic sequences, we can use a con-
cordance tool (e.g., AntConc) to create concordance lines that show some patterns of interest in a
learner corpus containing errors, and then the concordance lines created can be used, as an applica-
tion of data-​driven learning (DDL), to raise students’ awareness of the correct use of frequent multi-​
word sequences and their deviant forms (e.g., Boulton, 2010; Vyatkina, 2016). For the selection of
target formulaic structures and expressions, we can benefit from existing multi-​word lists such as
the Academic Collocation List (Ackermann & Chen, 2013), the Academic Formulas List (Simpson-​
Vlach & Ellis, 2010), the Phrasal Verb Pedagogical List (Garnier & Schmitt, 2015), and the Phrasal
Expressions List (Martínez & Schmitt, 2012). Written corrective feedback, as a supplement, can
also be provided for the students with a particular focus on the accuracy and appropriacy of for-
mulaic language. Then, with increased noticing of the appropriate form of formulaic language, the
students will be able to consolidate their understanding of the form-​function link of a wide range
of formulaic sequences.
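As an illustration of the kind of material such a DDL activity could draw on, the function below produces simple keyword-in-context (KWIC) lines for a target multi-word sequence from a tokenized learner corpus. It is a generic concordancing sketch, not a reproduction of AntConc's interface or output, and the corpus format and context width are assumptions.

```python
def kwic(texts, target, width=5):
    """Return keyword-in-context (KWIC) lines for a multi-word target sequence.

    `texts` is a list of tokenized texts; `target` is the word sequence to search
    for (e.g., a deviant bundle such as ["on", "other", "hand"]); `width` is the
    number of context words displayed on each side of every match.
    """
    lines = []
    n = len(target)
    for tokens in texts:
        for j in range(len(tokens) - n + 1):
            if tokens[j:j + n] == target:
                left = " ".join(tokens[max(0, j - width):j])
                right = " ".join(tokens[j + n:j + n + width])
                lines.append(f"{left:>40}  [{' '.join(target)}]  {right}")
    return lines

# Example (hypothetical learner corpus):
# for line in kwic(learner_corpus, ["on", "other", "hand"]):
#     print(line)
```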

Future Directions
This chapter has reviewed previous findings on formulaic language in L2 writing and some
methodological issues. It has also discussed how these findings offer suggestions for L2 writing
pedagogy. In this closing section, I set forth what could be further pursued as potential phrase-
ology issues in L2 writing research. The review of previous research revealed mixed findings
regarding the contribution of formulaic sequences to essay quality, which may reflect raters'
varying perspectives on how to evaluate an essay that includes many prefabricated
chunks. Given this, SLA researchers and testing experts may need to
delve into this issue of how to view formulaicity in writing testing settings. A discussion on whether
to include natural, formulaic language use as part of the target construct would allow test takers
to understand what they are expected to demonstrate in terms of formulaic expressions for higher
scores and would also give raters a clearer sense of how to rate essays with different proportions and
types of formulaic language (along with other dimensions of writing skills). This phraseological
dimension of language can then be included as a descriptor in scoring rubrics.
While we have advanced our understanding of the roles played by formulaic chunks in L2
writing, it is still difficult to have a comprehensive view of how formulaic language contributes
to L2 writing proficiency and development, mainly due to conflicting findings that have arisen
from different criteria for identifying target phraseological units (e.g., frequency and dispersion
cutoffs; length of multi-​word units; association strength indices) and different text types (e.g., timed
writing, course assignments, research papers). There are also several external factors that would
mediate the relationship between formulaic language and essay quality (e.g., topic, genre, and other
prompt features). However, despite these difficulties, we should explore various types of formulaic
sequences in terms of their predictive power of essay scores so as to offer valuable implications
for automated essay scoring (AES) systems. Composition scholars have expressed concerns about
the use of AES techniques in testing situations due to their primary focus on linguistic features
and inability to capture, for example, the logical flow of a writer’s ideas (e.g., Condon, 2013;
Herrington & Moran, 2001), but there is wide agreement that L2 writing assessments that have
the main goal of measuring language skills would be able to make effective use of AES systems.
Also, AES empirical studies have offered evidence in support of the validity of AES systems whose
scores were comparable to those assigned by trained human raters (e.g., Attali & Burstein, 2006;
Enright & Quinlan, 2010). Research findings on varying contributions of formulaic units to L2
essay quality in relation to task features would help testing experts build more accurate scoring
algorithms.
Taking one step further, SLA scholars can use reference and learner corpora as effective resources
for detecting L2 writer errors automatically and directing their attention to erroneous expressions
for self-​correction. For example, Polio and Yoon (2021) explored how bigrams and trigrams that
are absent or suspiciously rare in a large reference corpus can be seen as linguistic errors, when
they appear in learner essays (e.g., books locate, flower become, obtain humor). They used statis-
tical analyses to offer evidence of how n-​gram measures tap into linguistic accuracy. First, using an
exploratory factor analysis, they showed that n-gram measures (i.e., bigram MI, trigram MI, and propor-
tion of bigrams absent in a large corpus) loaded on one factor together with traditional error-​count
measures (i.e., number of syntactic errors, morphological errors, and prepositional errors). Then,
hierarchical regression analyses were conducted to demonstrate how n-​gram measures predict the
number of errors in the texts as well as the general impression of accuracy using a holistic accuracy
rubric. Their findings indicated that 48% of the variance in total error counts was accounted for by
the three n-gram measures, as was 28% of the variance in holistic accuracy scores.
In addition, Harvey-​Scholes (2018) manually identified a total of 1,310 language errors that
occurred in 90 essays and used a written portion of the British National Corpus (BNC) as the
reference corpus from which to extract word and phrase frequencies. The results showed
that 79% of the manually identified errors could actually be identified by n-​gram measures, and
about two-​thirds of the errors detected by n-​gram measures were errors at the level of bigrams and
trigrams. Focusing on the frequency of bigrams in a reference corpus, Lawley (2015) developed
an automated program that highlights all bigrams in learner texts that do not appear, or rarely appear,
in the reference corpus (e.g., these day, in Sunday, had do). Tested for L2 students' self- and peer
correction, the program was found to facilitate the correction process and to get the students to think
for themselves to produce correct forms, despite some possibility of false alarms (i.e., bigrams that
happened to occur rarely in the reference corpus). Both of these studies are greatly informative,
but one of their common limitations was that they used a relatively small set of language data (the
written portion of the BNC; approximately 80 million words) as their reference corpus. Given the
availability of much larger corpora (e.g., the Corpus of Contemporary American English) and the
importance of corpus size to generate more reliable and accurate inferences about errors, further
research should be pursued with the goal of developing a computer-based system that enables L2
writers to self-​correct their errors with autonomy.
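By way of illustration, a minimal version of such a flagging system might look like the sketch below, which lists every adjacent bigram in a learner text whose frequency in a reference corpus falls below a threshold. The frequency table, threshold, and output format are expository assumptions and do not reproduce Lawley's (2015) program or any particular corpus interface.

```python
def flag_rare_bigrams(tokens, reference_freq, min_freq=5):
    """List adjacent bigrams that are absent or rare in a reference corpus.

    `reference_freq` maps (word1, word2) pairs to their raw frequencies in a
    large reference corpus; any learner bigram below `min_freq` is returned with
    its position so it can be highlighted for the writer to reconsider. False
    alarms are possible, since some correct but infrequent bigrams are flagged too.
    """
    flagged = []
    for j in range(len(tokens) - 1):
        pair = (tokens[j], tokens[j + 1])
        if reference_freq.get(pair, 0) < min_freq:
            flagged.append((j, " ".join(pair)))
    return flagged

# Example (hypothetical frequencies): assuming "in sunday" is rare and the other
# bigrams are frequent in `reference_freq`, the call below returns [(3, "in sunday")].
# flag_rare_bigrams("i met her in sunday".split(), reference_freq)
```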
Additionally, we probably need more replication studies in this area to gain a more compre-
hensive, generalizable view of the link between phraseology and L2 writing ability, and it should
be preceded by an attempt to establish clear definitions of formulaic units extensively targeted
by L2 writing research. For example, Benevento and Storch (2011) and Verspoor, Schmid, and
Xu (2012) used the quantity of chunks as target text measures, but the authors did not offer clear
operationalizations of how they identified chunks in their studies, which might have influenced
their findings and made it difficult to replicate them. Also, we should conduct replication research
with a good understanding of potentially different findings and suggestions as a result of employing
different criteria for target unit extraction (e.g., minimum frequency, dispersion). Recent work by
Lu, Kisselev, Yoon, and Amory (2018) re-​analyzed the same corpus data explored in another study
(O’Donnell et al., 2013) using different criteria (e.g., n-​gram frequency, n-​gram MI threshold) and
emphasized the necessity of criterial consistency in formulaic language research.
The triad of complexity, accuracy, and fluency has long been employed as an index of L2 devel-
opment (Housen & Kuiken, 2009; Ortega, 2015), but its exploration has been pursued mostly inde-
pendently of the formulaicity of learner language. However, it is reasonable to expect that L2 learners'
large repertoires of formulaic expressions would enable them to produce essays with greater fluency. L2
writers’ reliance on PP-​based bundles can result in an increase in phrase-​level syntactic complexity.
It is also likely that L2 writers’ accurate use of lexical bundles would lead to their production of
grammatical forms that are a part of the bundles. While formulaic chunks could still be viewed by
many researchers as the outcome of early L2 development, continued efforts to combine these two
areas of research would enable us to further deepen our understanding of the link between SLA and
writing development.

References
Ackermann, K., & Chen, Y.-H. (2013). Developing the academic collocation list (ACL): A corpus-driven and
expert-​judged approach. Journal of English for Academic Purposes, 12, 235–​247.
Ädel, A., & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-​native
speakers of English: A lexical bundles approach. English for Specific Purposes, 31, 81–​92.
Al Hassan, L., & Wood, D. (2015). The effectiveness of focused instruction of formulaic sequences in
augmenting L2 learners’ academic writing skills: A quantitative research study. Journal of English for
Academic Purposes, 17, 51–​62.
Altenberg, B., & Granger, S. (2001). The grammatical and lexical patterning of MAKE in native and non-​
native student writing. Applied Linguistics, 22, 173–​195.
Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater v. 2.0. The Journal of Technology,
Learning, and Assessment, 4, 1–30.
Benevento, C., & Storch, N. (2011). Investigating writing development in secondary school learners of French.
Assessing Writing, 16, 97–​110.
Bestgen, Y. (2017). Beyond single-​word measures: L2 writing assessment, lexical richness and formulaic com-
petence. System, 69, 65–​78.
Bestgen, Y., & Granger, S. (2014). Quantifying the development of phraseological competence in L2 English
writing: An automated approach. Journal of Second Language Writing, 26, 28–​41.
Biber, D. (2009). A corpus-​driven approach to formulaic language in English: Multi-​word patterns in speech
and writing. International Journal of Corpus Linguistics, 14, 275–​311.
Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific
Purposes, 26, 263–​286.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and
textbooks. Applied Linguistics, 25, 371–​405.
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use.
Cambridge: Cambridge University Press.
Boulton, A. (2010). Data-​driven learning: Taking the computer out of the equation. Language Learning, 60,
534–​572.
Brezina, V., McEnery, T., & Wattam, S. (2015). Collocations in context: A new perspective on collocation
networks. International Journal of Corpus Linguistics, 20, 139–​173.
Bulté, B., & Housen, A. (2012). Defining and operationalizing L2 complexity. In A. Housen, F. Kuiken, & I.
Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA
(pp. 21–​46). Amsterdam: John Benjamins.
Bychkovska, T., & Lee, J. (2017). At the same time: Lexical bundles in L1 and L2 university student argumen-
tative writing. Journal of English for Academic Purposes, 30, 38–​52.
Chen, Y.H., & Baker, P. (2010). Lexical bundles in L1 and L2 academic writing. Language Learning &
Technology, 14, 30–​49.
Chen, Y., & Baker, P. (2016). Investigating critical discourse features across second language develop-
ment: Lexical bundles in rated learner essays, CEFR B1, B2 and C1. Applied Linguistics, 37, 849–​880.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and
biology. English for Specific Purposes, 23, 397–​423.
Cortes, V. (2013). The purpose of this study is to: Connecting lexical bundles and moves in research article
introductions. Journal of English for Academic Purposes, 12, 33–​43.
Christie, F. (1997). Curriculum macrogenres as forms of initiation into a culture. In F. Christie & J.R.
Martin (Eds.), Genre and institutions: Social processes in the workplace and school (pp. 134–​160).
New York: Continuum.
Condon, W. (2013). Large-​ scale assessment, locally-​ developed measures, and automated scoring of
essays: Fishing for red herrings? Assessing Writing, 18, 100–​108.
Crossley, S., & Salsbury, T.L. (2011). The development of lexical bundle accuracy and production in English
second language speakers. IRAL, 49, 1–​26.
Cunningham, K.J. (2017). A phraseological exploration of recent mathematics research articles through key
phrase frames. Journal of English for Academic Purposes, 25, 71–​83.
Darwin, C.M., & Gray, L.S. (1999). Going after the phrasal verb: An alternative approach to classification.
TESOL Quarterly, 33, 65–​83.
De Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and writing. In C. Mair & M.
Hundt (Eds.), Corpus linguistics and linguistic theory (pp. 51–​68). Amsterdam: Rodopi.
Durrant, P. (2017). Lexical bundles and disciplinary variation in university students’ writing: Mapping the ter-
ritories. Applied Linguistics, 38, 165–​193.
Durrant, P., & Schmitt, N. (2009). To what extent do native and non-​native writers make use of collocations?
International Review of Applied Linguistics, 47, 157–​177.
Ellis, N.C. (2008). Phraseology: The periphery and the heart of language. In F. Meunier & S. Granger (Eds.),
Phraseology in language learning and teaching (pp. 1–​13). Amsterdam: John Benjamins.
Enright, M.K., & Quinlan, T. (2010). Complementing human judgment of essays written by English language
learners with e-​rater® scoring. Language Testing, 27, 317–​334.
Evert, S. (2009). Corpora and collocations. In A. Lüdeling & M. Kytö (Eds.), Corpus linguistics: An inter-
national handbook (pp. 1211–​1248). Berlin: Mouton de Gruyter.
Garner, J. (2016). A phrase-​frame approach to investigating phraseology in learner writing across proficiency
levels. International Journal of Learner Corpus Research, 2, 31–​68.
Garner, J., Crossley, S., & Kyle, K. (2019). N-gram measures and L2 writing proficiency. System, 80,
176–​187.
Garnier, M., & Schmitt, N. (2015). The PHaVE list: A pedagogical list of phrasal verbs and their most frequent
meaning senses. Language Teaching Research, 19, 645–​666.
Granger, S. (1996). From CA to CIA and back: An integrated contrastive approach to computerized bilingual
and learner corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast: Text-​based
cross-​linguistic studies (pp. 37–​51). Lund Studies in English 88. Lund: Lund University Press.
Granger, S. (2018). Formulaic sequences in learner corpora: Collocations and lexical bundles. In A. Siyanova-​
Chanturia & A. Pellicer-​Sánchez (Eds.), Understanding formulaic language: A second language acquisi-
tion perspective (pp. 228–​247). New York: Routledge.
Granger, S., & Bestgen, Y. (2014). The use of collocations by intermediate vs. advanced non-native writers: A
bigram-​based study. IRAL, 52, 229–​252.
Geluso, J. (2013). Phraseology and frequency of occurrence on the web: Native speakers' perceptions of
Google-​informed second language writing. Computer Assisted Language Learning, 26, 144–​157.
Greaves, C., & Warren, M. (2010). What can a corpus tell us about multi-​word units? In A. O’Keeffe & M.
McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 212–​227). London: Routledge.
Harvey-​Scholes, C. (2018). Computer-​assisted detection of 90% of EFL student errors. Computer Assisted
Language Learning, 31, 144–​156.
Haswell, R. (1991). Gaining ground in college writing: Tales of development and interpretation. Dallas,
TX: Southern Methodist University Press.
Herrington, A., & Moran, C. (2001). What happens when machines read our students’ writing? College
English, 63, 480–​499.
Housen, A., & Kuiken, F. (2009). Complexity, accuracy, and fluency in second language acquisition. Applied
Linguistics, 30, 461–​473.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes,
27, 4–​21.
Huang, K. (2015). More does not mean better: Frequency and accuracy analysis of lexical bundles in Chinese
EFL learners’ essay writing. System, 53, 13–​23.
Laufer, B., & Waldman, T. (2011). Verb-​noun collocations in second language writing: A corpus analysis of
learners’ English. Language Learning, 61, 647–​672.
Lawley, J. (2015). New software to help EFL students self-​correct their writing. Language Learning &
Technology, 19, 23–​33.
Li, J., & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study.
Journal of Second Language Writing, 18, 85–​102.
Lu, X., Kisselev, O., Yoon, J., & Amory, M.D. (2018). Investigating effects of criterial consistency, the diver-
sity dimension, and threshold variation in formulaic language research: Extending the methodological con-
siderations of O’Donnell et al. (2013). International Journal of Corpus Linguistics, 23, 158–​182.
Lu, X., Yoon, J., & Kisselev, O. (2018). A phrase-​frame list for social science research article introductions.
Journal of English for Academic Purposes, 36, 76–​85.
Martínez, R., & Schmitt, N. (2012). A phrasal expressions list. Applied Linguistics, 33, 299–​320.
Mukherjee, J. (2005). The native speaker is alive and kicking: Linguistic and language-​ pedagogical
perspectives. Anglistik, 16, 7–​23.
Nesselhauf, N. (2005). Collocations in a learner corpus. Amsterdam: John Benjamins.
Norris, J., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case
of complexity. Applied Linguistics, 30, 555–​578.
O’Donnell, M., Römer, U., & Ellis, N.C. (2013). The development of formulaic sequences in first and second
language writing: Investigating effects of frequency, association, and native norm. International Journal of
Corpus Linguistics, 18, 83–​108.
Ortega, L. (2015). Syntactic complexity in L2 writing. Progress and expansion. Journal of Second Language
Writing, 29, 82–​94.
Paquot, M. (2019). The phraseological dimension in interlanguage complexity research. Second Language
Research, 35, 121–​145.
Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics,
32, 130–​149.
Pawley, A., & Syder, F.H. (1983). Two puzzles for linguistic theory: Nativelike selection and nativelike fluency.
In J.C. Richards & R.W. Schmidt (Eds.), Language and communication (pp. 191–​226). New York: Longman.
Pérez-​Llantada, C. (2014). Formulaic language in L1 and L2 expert academic writing: Convergent and diver-
gent usage. Journal of English for Academic Purposes, 14, 84–​94.
Polio, C., & Yoon, H. (2021). Exploring multi-​word combinations as measures of linguistic accuracy in second
language writing. In B.L. Bruyn & M. Paquot (Eds.), Learner corpus research meets second language
acquisition (pp. 96–​121). Cambridge: Cambridge University Press.
Qin, J. (2014). Use of formulaic bundles by non-​native English graduate writers and published authors in
applied linguistics. System, 42, 220–​231.
Qin, W., & Uccelli, P. (2020). Beyond linguistic complexity: Assessing register flexibility in EFL writing
across contexts. Assessing Writing, 45, 1–​14.
Römer, U. (2010). Establishing the phraseological profile of a text type: The construction of meaning in aca-
demic book reviews. English Text Construction, 3, 95–​119.
Shin, Y. (2019). Do native writers always have a head start over nonnative writers? The use of lexical bundles
in college students’ essays. Journal of English for Academic Purposes, 40, 1–​14.
Shin, Y., Cortes, V., & Yoo, I. (2018). Using lexical bundles as a tool to analyze definite article use in L2 aca-
demic writing: An exploratory study. Journal of Second Language Writing, 39, 29–​41.
Shin, Y., & Kim, Y. (2017). Using lexical bundles to teach articles to L2 English learners of different proficien-
cies. System, 69, 79–​91.
Simpson-​Vlach, R., & Ellis, N. (2010). An academic formulas list: New methods in phraseology research.
Applied Linguistics, 31, 487–​512.
Sinclair, J.M. (1987). The nature of the evidence. In J.M. Sinclair (Ed.), Looking up: An account of the
COBUILD project in lexical computing (pp. 150–​159). London: Harper Collins.
Sinclair, J.M. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Siyanova, A., & Schmitt, N. (2007). Native and nonnative use of multi-​word vs. one-​word verbs. International
Review of Applied Linguistics, 45, 109–​139.
Siyanova-​Chanturia, A. (2015). Collocation in beginner learner writing: A longitudinal study. System, 53,
148–​160.
Siyanova-​Chanturia, A., & Martínez, R. (2015). The idiom principle revisited. Applied Linguistics, 36,
549–​569.
Staples, S., Egbert, J., Biber, D., & McClair, A. (2013). Formulaic sequences and EAP writing develop-
ment: Lexical bundles in the TOEFL iBT writing section. Journal of English for Academic Purposes,
12, 214–​225.
Verspoor, M., Schmid, M., & Xu, X. (2012). A dynamic usage based perspective on L2 writing. Journal of
Second Language Writing, 21, 239–​263.
Vidakovic, I., & Barker, F. (2010). Use of words and multi-word units in Skills for Life writing examinations.
IELTS Research Reports 41 (pp. 7–​14). IELTS Australia/​British Council.
Vo, S. (2019). Use of lexical features in non-​native academic writing. Journal of Second Language Writing,
44, 1–​12.
Vyatkina, N. (2016). Data-​driven learning of collocations: Learner performance, proficiency, and perceptions.
Language Learning & Technology, 20, 159–​179.
Wang, Y., & Shaw, P. (2008). Transfer and universality: Collocation use in advanced Chinese and Swedish
learner English. ICAME Journal, 32, 201–​232.
Wei, N., & Li, J. (2013). A new computing method for extracting contiguous phraseological sequences from
academic text corpora. International Journal of Corpus Linguistics, 18, 506–​535.
Wolfe, C.R. (2011). Argumentation across the curriculum. Written Communication, 28, 193–​219.
Wray, A. (2000). Formulaic sequences in second language teaching: Principles and practice. Applied
Linguistics, 21, 463–​489.
Wray, A. (2012). What do we (think we) know about formulaic language? An evaluation of the current state of
play. Annual Review of Applied Linguistics, 32, 231–​254.
Wulff, S., & Gries, S. (2011). Corpus-​driven methods for assessing accuracy in learner production. In P.
Robinson (Ed.), Second language task complexity: Researching the cognition hypothesis of language
learning and performance (pp. 61–​87). Amsterdam: John Benjamins.
Yoon, H. (2016). Association strength of verb-​noun combinations in experienced NS and less experienced
NNS writing: Longitudinal and cross-​sectional findings. Journal of Second Language Writing, 24, 42–​57.
Yoon, H., & Polio, C. (2017). The linguistic development of students of English as a Second Language in two
written genres. TESOL Quarterly, 51, 275–​301.

16
WRITTEN CORRECTIVE
FEEDBACK
Short-​Term and Long-​Term Effects on
Language Learning

Eun Young Kang and ZhaoHong Han


Kongju National University and Columbia University

Introduction
In second language acquisition (SLA) research, written corrective feedback (WCF) refers to any
correction provided by the teacher on lexical and grammatical errors made by learners in writing.
It is different from feedback on content, which concerns the ideas, organization, or text structure of
L2 writing (Ferris, 2010; Hyland & Hyland, 2019). WCF comes in a variety of forms, ranging from
the implicit (e.g., indirect feedback that indicates the existence of an error but without providing the
correct form) to the explicit (e.g., direct feedback that not only indicates the locus of an error but
also provides the corresponding correct form) forms. WCF can also be unfocused (e.g., feedback
on each and every error) or focused (e.g., feedback on selected errors).
While it is a common practice that teachers spend a great deal of their time correcting errors in
their students’ writing, the efficacy of WCF has remained a topic of controversy in the specialized
research. Theoretically, in the field of SLA, those adhering to a nativist view on language
acquisition –​attributing it to the function of innate mechanisms –​have argued that error correction,
oral or written, is unnecessary and even counterproductive (e.g., Krashen, 1985; Schwartz, 1993).
In contrast, those who hold a non-​nativist view (including both behaviorists and cognitivists) deem
error correction necessary, if not critical, for L2 development, especially in the case of older learners
(e.g., Bley-​Vroman, 1986; Sharwood Smith, 1991).
The controversy in the domain of writing came to a head in the mid-​1990s, owing largely to the
publication of a critique of WCF by Truscott (1996) where it was contended that error correction
is unhelpful, if not outright harmful. This view was soon rebuked and rebutted by Ferris (1999,
2004, 2006) and others (e.g., Bitchener, 2008; Bruton, 2009; Chandler, 2004). Meanwhile, empir-
ical studies mushroomed, reporting incongruent findings, though analyses of this research have
yielded positive results (see, e.g., Kang & Han, 2015; Liu & Brown, 2015), suggesting that WCF
can be beneficial to L2 learners.
According to the available research syntheses and meta-​analyses (Kang & Han, 2015; Liu &
Brown, 2015), WCF can lead to short-​term changes in learners’ behavior, as evident in immediate
revisions of writing. The “instant” changes, in Truscott’s view, may, however, very well be ephem-
eral, rather than evidence of true learning. Truscott (1996), therefore, raised the question: Does WCF
help students to improve their accuracy over time? To date, few studies have investigated long-​term
effects (or retention) of WCF on learning, but recent years have seen an increasing number of
studies examining how learners’ accuracy in writing changes over time in response to WCF.
The remainder of this chapter offers an overview of research on the effects of WCF, identifying
key issues and synthesizing major findings. We begin with a historical perspective and proceed then
to highlight critical issues, topics, and current foci in research. After that, we discuss the method-
ology employed in WCF research and close with a number of pedagogical recommendations and
directions for future research.

Historical Perspectives
The study of WCF has spanned several decades. Until the mid-​1990s, empirical research on WCF
had waxed and waned (Ferris, 2010), amid changing theoretical paradigms. In the main, nativists’
views had bred noninterventionist approaches to corrective feedback. Krashen’s (1985) Monitor
Theory can be taken as an example. The theory, comprising five hypotheses including one that
draws a sharp line between learning and acquisition, champions ample exposure to comprehensible
input as the exclusive pathway to L2 development. Accordingly, error correction is considered
neither necessary nor helpful for acquisition. Unlike nativists, non-nativists (e.g., Schmidt,
1990) advocated a strong interventionist approach, viewing the use of WCF as essential to learning.
Studies conducted in the 1980s and early 1990s typically set out to test the role of error correction in
improving accuracy, yielding mixed results. Some studies reported that error correction was beneficial
(e.g., Fathman & Whalley, 1990; Lalande, 1982), while others failed to find a positive effect (e.g., Kepner,
1991; Robb, Ross, & Shortreed, 1986; Semke, 1984). These studies employed diverse research methods in
terms of (1) participants, (2) the genre of writing tasks, (3) requirements of revision, (4) type of feedback,
(5) presence of a control group, or (6) duration of the study. This methodological diversity may have
contributed to the mixed findings obtained. A general methodological criticism leveled at these studies
was that research designs lacked rigor. Among their many flaws, few studies employed a control group
(Ferris, 2004; Guénette, 2007; Truscott, 1996, 2007), an issue to which we return to shortly.
Since the mid-​1990s, interest in WCF research has intensified, largely spurred by Truscott’s
(1996) critical review of extant research on WCF. Truscott argued pointedly that grammar
correction is harmful and should be abandoned entirely. Invoking both theoretical and practical
arguments, he asserted that error correction does not work because corrections are not always in
tune with the learner’s level of grammatical knowledge, given that the structures targeted in the
teacher’s WCF might not necessarily be the ones the student is developmentally ready to acquire.
Consequently, Truscott argued, WCF leads only to “a superficial and possibly transient form of
knowledge” (p. 345), and is not useful to develop L2 writers’ long-​term accuracy. At a practical
level, Truscott noted several problems in teachers’ provisions of WCF: (1) that teachers might not
be aware of errors in their students’ writing; (2) that teachers might not have sufficient metalin-
guistic knowledge to give an adequate explanation of errors; (3) that students might fail to grasp
their teachers’ feedback; (4) that students might simply ignore teachers’ feedback; and last but not
least, (5) that WCF might lower students’ confidence and negatively affect students’ quality of
writing, preventing them from using complex language to avoid corrections.
Truscott’s critical assessment of the efficacy of WCF provided the impetus for much empir-
ical research in the ensuing years. Broadly, two distinct, yet interrelated, lines of work have been
pursued: The study of writing as a second language (L2) literacy skill, and the study of writing
as a source of L2 acquisition. Although the efficacy of WCF has been in focus in both lines of
research, their goals are different (Ferris & Kurzer, 2019). Generally speaking, the goal of L2
writing research on feedback is to examine whether WCF can help learners become efficient writers
and improve the quality of their texts (Ferris, 2010). The goal of the acquisition line of research on
WCF, on the other hand, is to investigate whether WCF can improve the grammatical accuracy of
writing or enable the acquisition of linguistic constructions (Ferris, 2010). The latter is our concern
in this chapter.
SLA-​oriented studies investigating whether WCF can improve the accuracy of the written texts
(see, e.g., Ashwell, 2000; Chandler, 2003; Ferris, 2006; Ferris & Roberts, 2001, and a summary by
Bitchener & Ferris, 2012) have yielded mixed findings. Of note, the studies have mostly examined
L2 learners’ use of WCF in revision, not in subsequent pieces of writing, consequently presenting
evidence only of short-​term learning (uptake), not of long-​term improvement in accuracy (acquisi-
tion). Therefore, it remains unclear whether WCF can trigger long-​lasting changes in learners’ L2
knowledge as well as in their writing ability (Bitchener & Ferris, 2012).
Methodological reviews of this body of work have uncovered several issues, among them:
(a) the absence of a control group to provide a baseline performance for comparison, and (b) the
use of revised writing as a measure of improvement in accuracy as opposed to a new piece of
writing. The lack of a control group is arguably a critical methodological flaw when evaluating
the effectiveness of WCF because without a control group, it would be difficult to ensure that
any observed accuracy improvement comes solely from the WCF provided. Similarly, learners’
improved accuracy as shown in revised texts may not necessarily lead to a reduction of errors in
new pieces of writing (Ferris, 2004). In other words, greater accuracy in the revised texts may not
translate into internalized knowledge that can produce better accuracy in new discourse. Hence, to
test its efficacy, it is crucial to ascertain whether WCF leads learners to improved accuracy in new
pieces of writing (Truscott, 1996).
The awareness of methodological weaknesses in past research has led to improved study
designs. Influenced by oral corrective feedback research, these studies have been mostly framed
in the focus-​on-​form paradigm in SLA (e.g., Doughty & Williams, 1998; Long 1991), tracing the
efficacy of WCF in relation to selected forms, such as the English articles (Ferris, 2010). Not only
have these studies included a control group; they have also tested for immediate and delayed effects
of WCF (see, e.g., Nicolás-​Conesa, Manchón, & Cerezo, 2019; Van Beuningen, 2010). The studies
have revealed that L2 learners receiving corrections have an advantage over those receiving no
WCF. However, some of these studies focus narrowly on L2 learners’ use of a small number of lin-
guistic features for the sake of empirical rigor (Ferris & Kurzer, 2019), which hardly corresponds
to a real classroom context, where students expect feedback on all their linguistic errors to improve
their written accuracy (Ferris, 2010).
Overall, WCF has, in large part, been found to be effective for improving L2 learners’ accuracy
in writing (Ferris, 2010; Kang & Han, 2015; Van Beuningen, 2010; see Bitchener & Storch, 2016,
for a review). In their meta-​analysis of 21 primary studies published between 1980 and 2013, Kang
and Han (2015) found a medium effect size (g = .54) for WCF. All other things being equal, the
theoretical and empirical research in support of the efficacy of WCF outweighs that which does not.

Critical Issues and Topics


Early research on WCF examined whether WCF promotes L2 learners’ accuracy in writing in gen-
eral. Later research went beyond this focus, investigating finer-​grained issues such as what type of
feedback is more effective and what factors modulate its effectiveness. In this section, we briefly
review research related to these specific questions.

Types of WCF
One contentious issue that has received much attention in research is which type of WCF is most
conducive to L2 accuracy. An important group of studies has compared direct and indirect WCF.
Direct WCF involves explicit correction, signaling the presence of an error and supplying the
correct form or relevant rule-​based explanation, whereas indirect WCF does so implicitly by,
for example, underlining, circling, or using special symbols to indicate errors. The fundamental
difference is thought to be that indirect WCF invites learners to actively take part in the feedback
appropriation process by inducing them to self-​correct their own errors (Park & Kim, 2019).
Still, researchers have proffered different views on the relative effectiveness of direct and
indirect WCF. Some consider direct WCF more helpful, while others claim that indirect WCF is
ultimately more beneficial for writing development. Ferris and Roberts (2001), for instance, argued
that direct WCF is superior to indirect WCF because learners can unequivocally learn what the
error is and how it should be corrected. Likewise, SLA researchers contended that direct WCF, by
virtue of its attention to individual linguistic elements, can play a more productive role than indirect
feedback in instructional contexts where acquisition of linguistic elements, not writing skill devel-
opment, is the main goal (Ferris, Liu, Sinha, & Senna, 2013). In a similar vein, Ferris and Hedgcock
(2014) asserted that the explicit information contained in direct WCF is useful for the learner in
testing hypotheses about the target language. Others, however, have argued the opposite, namely,
that indirect feedback is more beneficial as it can encourage learners to reflect on their language use
more deeply when they attempt to self-​correct errors, a cognitive process likely to promote long-​
term acquisition (Park & Kim, 2019). Still others have maintained that in order to self-​correct their
own errors, learners need to have at least a certain level of prior knowledge of the forms in question.
They have insisted that indirect feedback is likely to work only when learners already have, to some
extent, internalized the linguistic forms in question (e.g., Ellis, Sheen, Murakami, & Takashima,
2008; Park, Song, & Shin, 2016).
In summary, diverse claims have been made with regard to the relative effectiveness of direct
and indirect WCF. Research findings as a whole point to the difficulty of determining which feed-
back strategy is more beneficial. This is because the effectiveness is relative; it depends on how it
interacts with a host of other external and internal factors, such as learners’ developmental readiness
and the type of target structure.

Scope of WCF
The scope of WCF, focused or unfocused, has also received much attention in WCF research.
Focused feedback selectively corrects errors (Ellis, 2008), whereas unfocused feedback indiscrim-
inately corrects all errors. Rather than being strictly dichotomous, they are likely to form a con-
tinuum (Li & Vuono, 2019). In a recent methodological synthesis, Liu and Brown (2015) proposed
a hybrid category, mid-​focused feedback, which corrects two to five error types, thus being different
from unfocused feedback, which corrects more than five error types.
In the last decade considerable research has investigated the relative effectiveness of focused
and unfocused feedback. However, the findings are far from consistent. Some studies (e.g., Bitchener
& Knoch, 2008; Ellis et al., 2008) show that focusing on a small number of errors may facilitate
progress toward greater accuracy, whereas others (e.g., Karim & Nassaji, 2020; Van Beuningen,
2010) reveal that unfocused feedback may help reduce error frequency. Conceptually, some
researchers argue for focused feedback, noting that it is less overwhelming to students. From a cog-
nitive perspective, some have stated that focused feedback can lessen learners’ processing burden.
Ellis et al. (2008), for example, claimed that learners may not be able to process a wide range of
linguistic forms at once due to their limited processing capacity and, as a result, focused feedback
can be more effective in that its narrower scope can facilitate learners’ attention to the correction
and may even enable them to learn new forms. Ellis et al. (2008), however, conceded that focused
WCF is likely to promote explicit grammatical knowledge that learners may not be able to transfer
to new and different writing tasks. Considering that the ultimate goal of WCF is to improve the
overall accuracy of L2 written texts, Ellis (2008) argued that focusing only on one specific type of
error would not be sufficient. Further, focused WCF by virtue of its narrow scope might be effective
in the short term, while unfocused WCF addressing a variety of errors would be beneficial in the
long run (Van Beuningen, 2010).
In sum, with regard to the scope of WCF, current understanding seems to be that focusing on
one or two specific linguistic errors could reduce L2 learners’ cognitive load, but it may not have
ecological validity in a real classroom setting where the goal of instruction is to improve students’
overall written accuracy.

Linguistic Target of WCF


Another issue that has been called out relates to the linguistic target of WCF. There have been dia-
metrically opposing views over what linguistic elements are amenable to WCF. Truscott (2007),
for instance, contended that WCF is likely to be effective only for errors that are “relatively simple
and can be treated as discrete items rather than integral parts of a complex system” (p. 258). It
follows that WCF may be helpful for the learning of discrete lexical items but not for syntactic properties, which are inherently relational and, thus, complex. Ferris (1999), in contrast, believed that
WCF would be more effective when it targets rule-​governed errors, such as subject-​verb agreement,
and less effective for lexical items. She considered rule-​governed errors treatable and non-​rule-​
governed errors, such as lexical items, untreatable. Differences aside, both perspectives concur that
the nature of the linguistic target can mediate the effects of WCF.
The Bitchener, Young, and Cameron (2005) study was one of the first to investigate how the
nature of the linguistic target determines the efficacy of WCF. The study, examining the effects of
WCF on errors pertaining to the use of English articles and prepositions, found that WCF was more
effective for rule-​based elements, such as rule-​based use of articles (e.g., a for the first mention and
the for subsequent mentions), than for non-​rule-​based ones, such as prepositions.
Inspired by Bitchener et al. (2005), several studies have similarly examined the effects of WCF
on the two rule-based functions of English articles. These studies have shown that focused WCF
can improve L2 learners’ acquisition of the linguistic features (e.g., Bitchener, 2008; Bitchener &
Knoch, 2008, 2010a, 2010b; Ellis, Sheen, Murakami, & Takashima, 2008; Sheen, 2007), a finding
corroborating Ferris’ (1999) claim that rule-​based linguistic elements are susceptible to WCF.
To summarize, researchers have explored factors that can mediate the efficacy of WCF, in particular, the type and scope of feedback and the nature of the linguistic target. The next section presents a selective review of more recent empirical work addressing and transcending these issues, including exploring the short-term and long-term effects of alternative WCF strategies.

Current Contributions and Research

Alternative WCF Strategies


In recent years research on WCF has seen a gradual shift, away from direct, indirect, focused, or
unfocused types, to other types of WCF strategies. One alternative type of strategy that has lately
received notable attention is the use of model texts, which involves providing learners with samples
of target language texts tailored to "the content and the genre of the writing task at hand" (Coyle & Roca de Larios, 2014, p. 453). In contrast to standard teacher feedback methods, which only present learners
with corrections of erroneous forms, model texts can provide them with a variety of suggestions
with regard to relevant content, grammar, vocabulary, and organizational structures. It is assumed
that learners can readily notice such types of information when juxtaposing their own writing with
the model texts and subsequently incorporate some features from the models into their revised drafts.
This assumption has been tested and partially confirmed by the findings of a number of recent
studies with young and adolescent learners (e.g., Coyle, Cánovas Guirao, & Roca de Larios, 2018; García
Mayo & Loidi Labandibar, 2017; Kang, 2020) wherein learners completed a writing task and were
then provided with model texts. While comparing their writing against the model texts, L2 writers
seemed able to notice solutions to problems they had experienced during the writing process. When
asked to complete the same writing task, the learners demonstrated incorporation of those linguistic
features that they had noticed in the model texts, producing linguistically more accurate drafts. The
findings suggest that model texts can serve as linguistic sources for L2 learners to correct their own
errors (see Han, 2020 for a theoretical discussion).
Another increasingly popular alternative WCF strategy is reformulation, which involves teachers
rewriting an L2 learner's text, "preserving all the learner's ideas, making it sound as native as pos-
sible” (Cohen, 1983, p. 4). Similar to model texts, it is a less explicit feedback method that provides
modeling of writing. By adopting writing-​comparison-​revision tasks, researchers have found that
learners can notice differences between their own writing and the reformulations provided (Adams,
2003; Kim & Bowles, 2019; Yang & Zhang, 2010). This type of noticing promoted by reformulated
texts is associated with improved accuracy in subsequent revisions. In other words, unlike traditional WCF strategies, reformulation does not identify errors for learners; instead, learners are expected to actively look for differences between their interlanguage and the target language, which in turn readily leads to improvements in later drafts.

Target Linguistic Features


As noted earlier, SLA-​oriented WCF research has centered around investigating the impact of
focused WCF on the acquisition of discrete forms to determine if it would result in their accurate
use. A limited number of target linguistic features (e.g. verb forms, prepositions, and articles) have
been explored, with most of the existing studies examining learners’ use of English definite and
indefinite articles before and after focused WCF (e.g., Bitchener, 2008; Bitchener & Knoch, 2008,
2010a, 2010b; Ellis, Sheen, Murakami, & Takashima, 2008; Rassaei, 2019; Reynolds & Kao, 2019;
Sheen, 2007; Sheen, Wright, & Moldawa, 2009; Suzuki et al., 2019), in particular, the two rule-​
based usages, “first mention a” and “anaphoric second mention the.” Overall, their results have
shown that focused WCF can facilitate the acquisition of the target feature. However, it is palpable
that these studies have limited the scope of article use to binary contrasts (e.g., the first mention
vs. second mention), and this narrow focus might have inadvertently misled learners into thinking
that the article system has only two usages. Thus, this reductionist linguistic focus of WCF might
not have effectively helped learners to establish adequate form-function mappings for an otherwise
complex linguistic subsystem.
To address the issue, Ekiert and Di Gennaro (2021) recently replicated Bitchener and Knoch's
(2010) study but extended it by incorporating additional usages of articles as target features. The
study found that focused WCF directed solely at the two rule-​based usages did improve learners’
accurate use of the two functions, as reported by Bitchener and Knoch. However, Ekiert and Di Gennaro's study also showed that such a narrow focus negatively impacted the learning of the other
usages of articles, like the situational the, non-​referential a, and idiomatic uses of a and the. These
more nuanced findings point to the need for a more sophisticated approach to investigating English
article errors in particular, and linguistic subsystems more generally.

Individual Differences and WCF


Another interest of recent research on WCF concerns the way in which individual difference (ID)
variables may mediate WCF effects (see Chapters 11 and 12 this volume). ID variables identified
to influence the efficacy of WCF include L2 proficiency (Park, Song, & Shin, 2016), language
aptitude components, such as language analytic ability and grammatical sensitivity (Benson &
DeKeyser, 2019; Sheen, 2007; Stefanou & Révész, 2015), working memory (Li & Roshan, 2019),
and motivation (Waller & Papi, 2017). Not surprisingly, higher levels of proficiency, language
aptitude, working memory, and motivation have been found to positively correlate with the effi-
cacy of WCF.
Regarding L2 proficiency, many researchers have speculated that WCF may not be as helpful for
less proficient learners given their lack of ability to understand discrete forms (Kang & Han, 2015).
Despite the attention that this claim has received, only a few studies have directly investigated
proficiency’s role in the efficacy of different types of WCF. Results have shown that learners with
higher proficiency benefit more from indirect feedback than do lower proficiency learners, who lack the linguistic knowledge to self-correct (Guo, 2015; Park et al., 2016).
Regarding aptitude components, different studies (e.g., Benson & DeKeyser, 2019; Stefanou &
Révész, 2015) have revealed that language analytic ability, defined as “the ability to analyze lan-
guage by creating and applying rules to new sentences” (Sheen, 2007, p. 259), positively affects the
effectiveness of WCF. It has been claimed that learners with strong analytic abilities may benefit
more from a combination of direct correction and provision of metalinguistic information than from
direct correction alone (Benson & DeKeyser, 2019; Sheen, 2007; Stefanou & Révész, 2015).
Working memory is another ID factor that may affect the effectiveness of WCF. Bitchener and
Storch (2016) suggested that L2 learners with larger working memory capacities can be expected
to attend to and process WCF with greater efficacy. Similarly, Li and Roshan (2019) showed that
learners with greater working memory capacities benefited more than other learners did from
cognitively demanding WCF strategies. The researchers noted that reviewing corrections and
then revising the original draft could pose cognitive demands on learners, tapping the use of
their working memory. Accordingly, Li and Roshan argued that working memory might affect
the effectiveness of WCF strategies, particularly those involving deep cognitive processing.
Motivation is yet another individual factor that has received much scholarly attention in WCF
research (e.g., Bitchener & Storch, 2016; Ferris, 2010; Kormos, 2012). Kormos (2012), for instance,
argued that motivation can influence “learners’ attention paid to feedback and their further develop-
ment in creating text revisions” (p. 399). Hyland (2011) also pointed out that students with higher
motivation are more willing to use and benefit from WCF. Similarly, Waller and Papi (2017) found
that learners who were invested in improving their L2 writing sought more feedback and utilized
it in their revisions.

Short-​and Long-​Term Effects of WCF


Research to date has revealed mainly short-​term positive effects of WCF on learners’ accuracy
in writing (as reviewed in Kang & Han, 2015). As mentioned earlier, little is yet known as to
whether the effects are long lasting or merely episodic. Truscott (2007) claimed that the positive
effects of WCF reported in the existing research are likely to fade within weeks after experimental
treatments.
In order to shed light on this issue, recent studies have overcome methodological shortcomings of
earlier studies by investigating the effects of WCF on learners’ writing of new texts, over time (e.g.,
Hartshorn et al., 2010; Karim & Nassaji, 2020). Some studies have included a delayed posttest in
order to ascertain the longer-​term effects (or retention) of WCF (e.g., Nicolás-​Conesa, Manchón,
& Cerezo, 2019; Shintani & Ellis, 2013). However, only a handful of studies have adopted a lon-
gitudinal design with a delayed posttest and multiple iterations of feedback over time. A notable
example is Rahimi (2019). The study investigated the effects of different types of WCF on the
written accuracy of English as a second language (ESL) learners over a 14-​week period, during
which learners received four rounds of WCF treatments, and their written accuracy was compared
at three different intervals corresponding to the pretest (week 1), immediate posttest (week 4), and delayed posttest (week 14). Improvement in sentence accuracy was observed at both the immediate and delayed posttests. The study thus showed that the positive effects of WCF were
maintained over time.


Main Research Methods


The last decade has seen several research syntheses seeking to capture the state-​of-​the-​art of meth-
odology in WCF research (see, e.g., Liu & Brown, 2015; Karim & Nassaji, 2019). In this section,
we highlight more recent research trends and methodological issues.

Populations
Traditionally, WCF research studied adult learners, typically at college level (see, e.g., Ashwell,
2000; Chandler, 2003; Ferris, 2006; Ferris & Roberts, 2001). More recently, several studies have
looked into child and adolescent learners (e.g., Coyle, Cánovas Guirao, & Roca de Larios, 2018; García
Mayo & Loidi Labandibar, 2017; Kang, 2020; Stefanou & Révész, 2015). Also, mid-​proficiency
learners were once widely investigated (as reviewed in Kang & Han, 2015), but more recent studies
have involved low proficiency learners (e.g., Coyle et al., 2018; Park et al., 2016).

Independent and Dependent Variables


The independent variables in this body of work are the type of feedback provided (focused/unfocused; direct/indirect, and, within the latter, models or reformulations) and, as noted above, a range of individual differences thought to mediate potential effects (L2 proficiency, aptitude, working memory, and motivation). The dependent variables involve several measures of accuracy, including the ratio of errors to the total number of obligatory contexts of use (e.g., Li & Roshan, 2019; Rassaei, 2019; Reynolds & Kao, 2019; Stefanou & Révész, 2015), the ratio of the total number of errors to the total number of words (e.g., Karim & Nassaji, 2020; Rahimi, 2019; Nicolás-Conesa et al., 2019), and the sum of errors (e.g., Park et al., 2016).
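To make these ratio-based indices concrete, the following minimal sketch (in Python; the counts and variable names are purely illustrative and are not drawn from any of the studies cited above) shows how the two most common accuracy measures might be computed for an error-coded learner text.

```python
# Hypothetical illustration of two accuracy indices commonly used as
# dependent variables in WCF studies; the counts below are invented.

def errors_per_obligatory_context(errors: int, obligatory_contexts: int) -> float:
    """Proportion of obligatory contexts for the target form produced incorrectly."""
    return errors / obligatory_contexts

def errors_per_100_words(total_errors: int, total_words: int) -> float:
    """Overall error density, normalized per 100 words of text."""
    return total_errors / total_words * 100

# A learner text with 12 obligatory contexts for English articles, 3 of them
# erroneous, and 9 errors of any kind in 250 words overall.
print(errors_per_obligatory_context(errors=3, obligatory_contexts=12))  # 0.25
print(errors_per_100_words(total_errors=9, total_words=250))            # 3.6
```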

Designs and Tasks


The currently most common design of empirical studies on WCF involves a three-stage procedure: pretest (initial writing), treatment sessions (feedback provision), and rewriting (or revision).
A variety of writing tasks have been used, the most common types being argumentative essays (e.g., Kim & Bowles, 2019), journal writing (e.g., Park et al., 2016), and narrative tasks (e.g., Nicolás-Conesa et al., 2019). These writing tasks are often completed individually (see Coyle et al., 2018, and Manchón et al., 2020, for exceptions) within a time limit.
One notable methodological strength of more recent empirical studies on WCF is the incorp-
oration of a control group and a delayed posttest. As suggested by Ferris (2010), a control group
is necessary to ensure that any observed improvements in accuracy come solely from the WCF
provided. In addition, a delayed posttest is crucial to establish whether the learning that has taken
place is durable. However, despite these improvements, several methodological issues continue to warrant attention.
First, the majority of studies still adopt a single-shot design in which feedback is offered on only a single draft. Relatedly, in almost all of the existing studies, at most three or four feedback
sessions were provided. Evidently, if the goal is to measure long-​term learning from WCF, one-​off
research designs are woefully inadequate. Studies that employ multiple feedback sessions over
an extended period of time would be more helpful to illuminate both the process and the outcome
of WCF.
A lingering methodological concern is the lack of longitudinal studies to investigate accuracy
development over time. In general, the duration of extant longitudinal L2 studies has ranged from three months to six years, though studies at the longer end have been rare (Ortega & Iberri-Shea, 2005).
Similarly, the observational spans in the WCF studies range from a couple of weeks (e.g., Frear &
Chiu, 2015) to 14 weeks (e.g., Rahimi, 2019), hence the relevance of future true longitudinal WCF
studies. While recent studies, published since the late 2000s, have generally included delayed
posttests, the delayed posttests employed were only modestly late –​administered roughly two
weeks after the immediate posttests (e.g., Frear & Chiu, 2015; Karim & Nassaji, 2020; Nicolás-​
Conesa, Manchón, & Cerezo, 2019; Rassaei, 2019; Stefanou & Révész, 2015). Unless the delayed
posttests are substantively extended, Truscott’s (1996) argument against WCF still holds, namely,
that knowledge acquired from WCF can “disappear in a matter of months” (p. 346) or be applied
inconsistently.

Recommendations for Practice


Although conceptual and methodological issues remain in WCF research, it is possible to glean some pedagogical implications. One implication is that WCF can be implemented in a variety of ways (explicitly, implicitly, comprehensively, selectively, or in any combination thereof),
although the choice should not be random. Explicit WCF is likely to lead to instant changes
in the learner’s behavior but the changes may not be long lasting. Implicit WCF, on the other
hand, may take a longer time to influence learning, but the changes can be more durable. More
importantly, explicit WCF places the processing responsibility in the hands of the feedback
giver (i.e., the teacher), while implicit WCF shifts the responsibility to the feedback receiver
(i.e., the learner). In addition, researchers have more recently recommended using more discur-
sive forms of feedback, such as providing model texts (e.g., Cánovas Guirao, Roca de Larios,
& Coyle, 2015). These insights notwithstanding, WCF research to date has made clear that no
one type of strategy is absolutely more efficient than any other. Teachers, therefore, would benefit from becoming acquainted with the various options and ultimately making them part of their own tool kit, so as to be able to adapt their feedback provision practices to diverse learners and
learning conditions.
In this respect, it is important that teachers heed the research insight that the effectiveness of
WCF can be mediated by an array of factors –​learners’ prior knowledge of the grammatical feature
in question, their proficiency in the target language, the nature of the targeted grammatical feature,
learners’ metalinguistic or linguistic analytic ability, to name but a few. Teachers should, therefore,
be prepared to adapt their WCF strategies to individual students, as, again, there is no one-​size-​fits-​
all strategy. Teachers should be mindful, as well, of the fact that too many corrections can over-
whelm the learner, resulting in their lack of appropriation of the WCF received. Selective provision
of WCF, on the other hand, may stand a better chance of facilitating the learner’s attention to the
language, especially for learners at lower levels of proficiency.
Finally, for WCF to lead to learning, learners need the opportunity to not only notice the
presence of WCF but to also make cognitive comparisons between their errors and the available
WCF. As suggested by Storch and Wiggleworth (2010, see also Chapter 7, this volume), the depth
of students’ processing of WCF is an important factor in the efficacy of WCF, which teachers can
enhance through creating opportunities for learners to pay attention to and process WCF more
deeply, for example, by encouraging them to keep learning logs, revise drafts in response to WCF,
or review corrections regularly.

Future Directions
Empirical studies thus far have, on the whole, established that WCF is beneficial for improving accuracy in L2 writing. However, methodological issues continue to plague the design of much of the research, which can undercut the credibility of the findings. It is therefore essential to identify these
issues as objects of future study. Four are highlighted here.


First, research has uncovered various factors that may influence the efficacy of WCF, chief among them learner proficiency, the nature of the linguistic target, the genre of writing, and the additional task of revision (see, e.g., Kang & Han, 2015; Liu & Brown, 2015). Few studies, however,
have delved into the relative contributions of these factors to WCF efficacy, much less into how the variables interact to jointly override or enhance it. Empirical research directly
engaging these variables is warranted to further understand both the potential and the limitations
of WCF.
Second, in spite of the rapidly rising number of empirical studies, the bulk of research attention
has continued to be on focused feedback, and narrowly on discrete morphosyntactic elements.
This, coupled with other limitations such as the lack of measures of retention of learning and the almost
exclusive focus on ostensibly rule-​based elements, further restricts our understanding of the poten-
tial contribution of WCF to learning. For one thing, although some studies have found that the
nature of the targeted linguistic element may moderate WCF effects, that is still a far cry from
knowing why grammatical elements are selectively susceptible to WCF. Hence, we are nowhere
near a coherent and systematic understanding of both why and how grammatical elements interact
with WCF. Studies with WCF targeting clusters of linguistic elements rather than isolated, discrete
ones are likely to yield a more sophisticated and useful understanding of the language learning
potential of WCF (Han, 2008).
Third, recognizing the role individual differences play in language learning, future studies can
delve further into their effects on WCF appropriation. Here, too, there is a need to broaden the
scope of investigation. Existing studies have mainly explored learners’ linguistic analytic ability.
Expanding the scope of inquiry would necessitate (a) the inclusion of other individual difference
variables, including, but not limited to, learning style, anxiety, and learning strategies, and (b) tying
them to short-​term and long-​term learning from WCF.
Last but not least, future research needs to deal with pending methodological concerns. Above
all, the biggest and most persistent shortfall in WCF research has thus far been the lack of true
longitudinal studies. The case for longitudinal research has been repeatedly made in theoretical
terms (e.g., Ortega & Byrnes, 2008). The well-​known U-​shaped learning curve in SLA illustrates
that the developmental trajectory is typically non-​linear and dynamic. The same is likely true for
improvement of grammatical accuracy following WCF. That is, following the initial exposure to
WCF, learners may be able to make surface corrections and appear to use the forms accurately. But
before long, they may regress or “backslide” to their earlier, incorrect use of the forms (Selinker,
1972). In time, with continuous exposure to input, including WCF, they may internalize the forms –​
developing a deeper understanding of the form-​meaning relations. It is, therefore, necessary to take
a longitudinal approach to studying the efficacy of WCF. Only longitudinal studies that employ mul-
tiple rounds of feedback will establish a dynamic and nuanced understanding of the contributions
(or lack thereof) of WCF to L2 learning.

References
Adams, R. (2003). L2 output, reformulation and noticing: Implications for IL development. Language Teaching
Research, 7, 347–​376.
Ashwell, T. (2000). Patterns of teacher response to student writing in a multi-​draft composition classroom: Is
content feedback followed by form feedback the best method? Journal of Second Language Writing, 9,
227–​257.
Benson, S., & DeKeyser, R. (2019). Effects of written corrective feedback and language aptitude on verb tense
accuracy. Language Teaching Research, 23, 702–​726.
Bitchener, J. (2008). Evidence in support of written corrective feedback. Journal of Second Language Writing,
17, 102–​118.
Bitchener, J., & Ferris, D. (2012). Written corrective feedback in second language acquisition and writing.
New York: Routledge.


Bitchener, J., & Knoch, U. (2008). The value of written corrective feedback for migrant and international
students. Language Teaching Research, 12, 409–​431.
Bitchener, J., & Knoch, U. (2010a). The contribution of written corrective feedback to language develop-
ment: A ten-​month investigation. Applied Linguistics, 31, 193–​214.
Bitchener, J., & Knoch, U. (2010b). Raising the linguistic accuracy level of advanced L2 writers with written
corrective feedback. Journal of Second Language Writing, 19, 207–​217.
Bitchener, J., & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Bitchener, J., Young, S., & Cameron, D. (2005). The effect of different types of corrective feedback on ESL student writing. Journal of Second Language Writing, 14, 191–205.
Bley-​Vroman, R. (1986). Hypothesis testing in second-​language acquisition theory. Language Learning, 36,
353–​376.
Bruton, A. (2009). Designing research into the effect of error correction in L2 writing: Not so straightforward.
Journal of Second Language Writing, 18, 136–​140.
Cánovas Guirao, J., Roca de Larios, J., & Coyle, Y. (2015). The use of models as a written feedback technique
with young EFL learners. System, 52, 63–​77.
Chandler, J. (2003). The efficacy of various kinds of error feedback for improvement in the accuracy and flu-
ency of L2 student writing. Journal of Second Language Writing, 12, 267–​296.
Chandler, J. (2004). A response to Truscott. Journal of Second Language Writing, 13, 345–​348.
Cohen, A.D. (1983). Reformulation compositions. TESOL Newsletter, 17, 4–​5.
Coyle, Y., Cánovas Guirao, J., & Roca de Larios, J. (2018). Identifying the trajectories of young EFL learners
across multi-​stage writing and feedback processing tasks with model texts. Journal of Second Language
Writing, 42, 25–​43.
Coyle, Y., & Roca de Larios, J. (2014). Exploring the role played by error correction and models on children’s
reported noticing and output production in a L2 writing task. Studies in Second Language Acquisition,
36(3), 451–​485.
Doughty, C., & Williams, J. (1998). Focus on form in classroom second language acquisition. Cambridge:
Cambridge University Press.
Ekiert, M., & Di Gennaro, K. (2021). Focused written corrective feedback and linguistic target mastery:
Conceptual replication of Bitchener and Knoch (2010). Language Teaching, 54(1), 71–89.
Ellis, R. (2008). A typology of written corrective feedback types. ELT Journal, 63(2), 97–​107.
Ellis, R., Sheen, Y., Murakami, M., & Takashima, H. (2008). The effects of focused and unfocused written cor-
rective feedback in an English as a foreign language context. System, 36, 353–​371.
Fathman, A.K., & Whalley, E. (1990). Teacher response to student writing: Focus on form versus con-
tent. In B. Kroll (Ed.), Second language writing: Research insights for the classroom (pp. 178–190).
Cambridge: Cambridge University Press.
Ferris, D. (1999). The case for grammar correction in L2 writing classes: A response to Truscott (1996). Journal
of Second Language Writing, 8, 1–​11.
Ferris, D. (2004). The “grammar correction” debate in L2 writing: Where are we, and where do we go from
here? (and what do we do in the meantime …?). Journal of Second Language Writing, 13, 49–​62.
Ferris, D. (2006). Does error feedback help student writers? New evidence on the short-​and long-​term effects
of written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts
and issues (pp. 81–​104). Cambridge: Cambridge University Press.
Ferris, D. (2010). Second language writing research and written corrective feedback in SLA. Studies in Second
Language Acquisition, 32, 181–​201.
Ferris, D., & Hedgcock, J. (2014). Teaching L2 composition. New York: Routledge.
Ferris, D., & Kurzer, K. (2019). Does error feedback help L2 writers? Latest evidence on the efficacy of written
corrective feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and
issues (pp. 81–​104). Cambridge: Cambridge University Press.
Ferris, D., & Roberts, B. (2001). Error feedback in L2 writing classes: How explicit does it need to be? Journal
of Second Language Writing, 10, 161–​184.
Ferris, D.R., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22(3), 307–​329.
Frear, D., & Chiu, Y. (2015). The effect of focused and unfocused indirect written corrective feedback on EFL
learners’ accuracy in new pieces of writing. System, 53, 24–​34.
García Mayo, M.P., & Loidi Labandibar, U. (2017). The use of models as written corrective feedback in EFL
writing. Annual Review of Applied Linguistics, 37, 110–​127.
Guénette, D. (2007). Is feedback pedagogically correct?: Research design issues in studies of feedback on
writing. Journal of Second Language Writing, 16(1), 40–​53.


Guo, Q. (2015). The effectiveness of written CF for L2 development: A mixed method study of written
CF types, error categories and proficiency levels. Unpublished doctoral dissertation, AUT University,
Auckland, New Zealand.
Han, Z-​H. (2008). On the role of meaning in focus on form. In Understanding second language process (pp.
45–​79). Clevedon: Multilingual Matters.
Han, Z-​H. (2020). Usage-​based instruction, systems thinking, and the role of language mining in L2 develop-
ment. Language Teaching. Advance online publication. https://doi.org/10.1017/S0261444820000282
Hartshorn, K., Evans, N., Merrill, P., Sudweeks, R., Strong-Krause, D., & Anderson, N. (2010). Effects of dynamic corrective feedback on ESL writing accuracy. TESOL Quarterly, 44, 84–109.
Hyland, F. (2011). The language learning potential of form-​focused feedback on writing: Students’ and
teachers' perceptions. In R. Manchón (Ed.), Learning-to-write and writing-to-learn in an additional lan-
guage (pp. 159–​180). Amsterdam: John Benjamins.
Hyland, K., & Hyland, F. (2019). Feedback in second language writing: Contexts and issues. Cambridge:
Cambridge University Press.
Kang, E. (2020). Using model texts as a form of feedback in L2 writing. System, 89, 102196.
Kang, E., & Han, Z-​H. (2015). The efficacy of written corrective feedback in improving L2 written accuracy: A
meta-​analysis. The Modern Language Journal, 99, 1–​18.
Karim, K., & Nassaji, H. (2019). The effects of written corrective feedback. Instructed Second Language
Acquisition, 3(1), 28–​52.
Karim, K., & Nassaji, H. (2020). The revision and transfer effects of direct and indirect comprehensive cor-
rective feedback on ESL students’ writing. Language Teaching Research, 24, 519–​539.
Kepner, C.G. (1991). An experiment in the relationship of types of written feedback to the development of
second-language writing skills. The Modern Language Journal, 75, 305–313.
Kim, H.R., & Bowles, M. (2019). How deeply do second language learners process written corrective feed-
back? Insights gained from think-​alouds. TESOL Quarterly, 53(4), 913–​938.
Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing,
21(4), 390–​403.
Krashen, S. (1985). The input hypothesis: Issues and implications. Harlow: Longman.
Lalande, J.F. (1982). Reducing composition errors: an experiment. The Modern Language Journal, 66,
140–​149.
Li, S., & Roshan, S. (2019). The associations between working memory and the effects of four different types
of written corrective feedback. Journal of Second Language Writing, 45, 1–​15.
Li, S., & Vuono, A. (2019). Twenty-​five years of research on oral and written corrective feedback in System.
System, 84, 93–​109.
Liu, Q., & Brown, D. (2015). Methodological synthesis of research on the effectiveness of corrective feedback
in L2 writing. Journal of Second Language Writing, 30, 66–81.
Long, M. H. (1991). Focus on form: A design feature in language teaching methodology. In K. DeBot, R.
Ginsberg, & C. Kramsch (Eds.), Foreign language research in crosscultural perspective (pp. 39–52).
Amsterdam: John Benjamins.
Manchón, R.M., Nicolás-​Conesa, F., Cerezo, L., & Criado, R. (2020). L2 writers’ processing of written cor-
rective feedback: Depth of processing via written languaging. In W. Suzuki & N. Storch (Eds.), Languaging
in language learning and teaching (pp. 241–​265). Amsterdam: John Benjamins.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Nicolás-​Conesa, F., Manchón, R.M., & Cerezo, L. (2019). The effect of unfocused direct and indirect written
corrective feedback on rewritten texts and new texts: Looking into feedback for accuracy and feedback for
acquisition. The Modern Language Journal, 103(4), 848–873.
Ortega, L., & Byrnes, H. (2008). The longitudinal study of advanced L2 capacities. New York: Routledge.
Ortega, L., & Iberri-​Shea, G. (2005). Longitudinal research in second language acquisition: Recent trends and
future directions. Annual Review of Applied Linguistics, 25, 26–​45.
Park, E.S., & Kim, O.Y. (2019). Learners' use of indirect corrective feedback: Depth of processing
and self-​correction. In R.P. Leow (Ed.), The Routledge handbook of second language research in classroom
learning (pp. 212–​226). New York: Routledge.
Park, E.S., Song, S., & Shin, Y.K. (2016). To what extent do learners benefit from indirect written corrective
feedback? A study targeting learners of different proficiency and heritage language status. Language
Teaching Research, 20(6), 678–​699.


Rahimi, M. (2019). A comparative study of the impact of focused vs. comprehensive corrective feedback and
revision on ESL learners’ writing accuracy and quality. Language Teaching Research. Advance online pub-
lication. doi:10.1177/1362168819879182
Rassaei, E. (2019). Computer-​mediated text-​based and audio-​based corrective feedback, perceptual style and
L2 development. System, 82, 97–​110.
Reynolds, B.L., & Kao, C.W. (2019). The effects of digital game-​based instruction, teacher instruction, and
direct focused written corrective feedback on the grammatical accuracy of English articles. Computer
Assisted Language Learning, 1–​21. doi: 10.1080/​09588221.2019.1617747
Robb, T., Ross, S., & Shortreed, I. (1986). Salience of feedback on error and its effect on EFL writing quality.
TESOL Quarterly, 20, 83–​95.
Sachs, R., & Polio, C. (2007). Learners’ use of two types of written feedback on a L2 writing revision task.
Studies in Second Language Acquisition, 29, 67–​100.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schwartz, B. D. (1993). On explicit and negative data effecting and affecting competence and linguistic
behavior. Studies in Second Language Acquisition, 15(2), 147–163.
Selinker, L. (1972). Interlanguage. IRAL, 10(2), 209–​231.
Semke, H. (1984). The effects of the red pen. Foreign Language Annals, 17, 195–​202.
Sharwood Smith, M. (1991). Speaking to many minds: On the relevance of different types of language infor-
mation for the L2 learner. Second Language Research, 7(2), 118–133.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners’
acquisition of articles. TESOL Quarterly, 41, 255–​283.
Sheen, Y., Wright, D., & Moldawa, A. (2009). Differential effects of focused and unfocused written correction
on the accurate use of grammatical forms by adult ESL learners. System, 37, 556–​569.
Shintani, N., & Ellis, R. (2013). The comparative effect of direct written corrective feedback and metalin-
guistic explanation on learners’ explicit and implicit knowledge of the English indefinite article. Journal of
Second Language Writing, 22, 286–​306.
Stefanou, C., & Révész, A. (2015). Direct written corrective feedback, learner differences, and the acquisition
of second language article use for generic and specific plural reference. The Modern Language Journal,
99(2), 263–​282.
Storch, N., & Wigglesworth, G. (2010). Learners' processing, uptake and retention of corrective feedback on writing: Case studies. Studies in Second Language Acquisition, 32, 1–32.
Suzuki, W., Nassaji, H., & Sato, K. (2019). The effects of feedback explicitness and type of target structure on
accuracy in revision and new pieces of writing. System, 81, 135–​145.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46,
327–​369.
Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second
Language Writing, 16, 255–​272.
Van Beuningen, C. (2010). Corrective feedback in L2 writing: Theoretical perspectives, empirical insights, and
future directions. International Journal of English Studies, 10(2), 1–​27.
Van Beuningen, C.G., De Jong, N.H., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive
error correction in second language writing. Language Learning, 62(1), 1–​41.
Waller, L., & Papi, M. (2017). Motivation and feedback: How implicit theories of intelligence predict L2
writers’ motivation and feedback orientation. Journal of Second Language Writing, 35, 54–​65.
Yang, L., & Zhang, L. (2010). Exploring the role of reformulations and a model text in EFL students’ writing
performance. Language Teaching Research, 14, 464–​484.

17
THE ROLE OF LANGUAGE
IN ASSESSING L2 WRITING
Lia Plakans and Renka Ohta
University of Iowa

Introduction
An essential question in exploring the role of language in writing assessment is: what is language?
This question has been the work of scholars in linguistics, anthropology, psychology, philosophy,
and education for centuries. Defining language within writing becomes circular as writing is lan-
guage; the two are not separable. Writing can be characterized as having organizational structure
that is grammatical (vocabulary, syntax, and phonology/​graphology) and textual (e.g., cohesion,
rhetorical structure) (Bachman & Palmer, 2010). Language also carries pragmatic intent, as it is
used in certain ways for specific functions or to apply sociolinguistic norms. The role of language in
writing assessment in second language research, however, has had more focus in the former areas –​
grammatical and textual structure –​than in the latter.
In current practice, most writing assessments ask students or test takers to use language by
performing writing. This direct assessment allows researchers or test users to evaluate writing more
authentically. Identifying the key elements of performance assessment is important in understanding
the layered role of language in writing assessment. McNamara (1996) outlined the process of
performance-​based assessments detailing the multiple steps (performance, rating), people (candi-
date, raters), and instruments (task, scale/​criteria). Unlike indirect methods of assessment, such as
multiple-​choice or fill-​in-​the-​blank questions, performance assessment requires test takers to write
on a prompt to complete a task. This prompt elicits underlying abilities used in the performance of
writing; however, the process does not end there. The performance needs to be read, evaluated, and
rated to give feedback or to give a score. This rating targets qualities of performances that relate
to language abilities, knowledge, or skill required in writing often reflected in a rubric or a rating
scale. Rating also introduces another participant in the assessment process –​the instructor or rater.
The foundational concepts of task, rubric, and rater will be developed further in this section, and
all three can be related to the role that language plays in assessment.
With any assessment, including performance assessment, a foundational concept is the idea of a
construct, which is defined early in the assessment planning process. Construct refers to the model
of knowledge, skill, or ability which is being assessed in writing or other forms of assessment
(Messick, 1995). Constructs should reflect the purpose and context of an assessment; they also
should inform the task design and be present in the rubric or scoring. Defining a construct guides the
design of tasks that elicit the features to assess and provides a basis for meaningful interpretation
of a resulting score. For example, Chan et al. (2015) developed analytic rating rubrics in an effort
to redevelop Trinity College London’s Integrated Skills of English (ISE) examinations. At the ini-
tial stage, they defined a theoretical construct for a reading-​into-​writing task and then developed,
validated, and refined the rubrics by conducting analyses of performance by test takers, receiving
feedback from raters, and examining rater reliability. The analysis of test taker performances played
an important role in the construction of rubric descriptors, which are reflections of the construct and
the dimensions of skills elicited by the task.
Over the years, many constructs have been used to define writing for assessment (e.g., Grabe &
Kaplan, 1996; Weigle, 2002). Language always appears in these constructs but with some variation.
For example, it may appear as using lexico-​grammatical knowledge or decoding/​parsing in compre-
hension as foundational aspects of language ability; language may also be reflected in a construct
as functional, sociolinguistic, or pragmatic knowledge for use in specific contexts. In addition to
informing test development, constructs are tied to assessment as the basis to gather evidence for
construct validity. Validity can be defined as the level of assurance that an assessment provides
meaningful interpretations from the score for the intended construct. To illustrate, Yoon (2017)
explored three possible underlying constructs in an essay-​writing task –​syntactic complexity, lex-
ical complexity, and morphological complexity –​ to determine their distinctiveness. The findings
were nuanced; however, using factor analysis to understand the relatedness of measures used with
each kind of complexity, the researcher found that syntactic complexity measures of length loaded
most strongly on two different factors, while lexical and morphological complexity loaded strongly
together on another factor. This kind of research uncovers how a writing assessment elicits the
intended construct.
In developing a prompt for a performance-​based writing assessment, task design is critical.
Test developers often look to tasks in the domains where writers will use the language. If students
are learning a language for future academic studies, then a task analysis of writing from college
courses can inform task design. Articulating the nature of language in writing in these domains is
an important step in this domain analysis. This insight may lead to adopting writing test tasks that
draw on other skills used in academic settings, such as reading and synthesizing texts in writing. In
addition to authenticity, other decisions about designing tasks include choices about genre, topic,
audience, length, supporting materials, and instructions.
In writing assessment, written products require mediation to assign a score, which is then used
to make decisions or inferences about the writers. This mediation often occurs with the use of a
rating scale with score band levels along with descriptors that vary in what aspects of language
use are included as criteria for different levels. A rating scale plays a central role in the validity of
writing assessments by reflecting the construct (Weigle, 2002), as it represents what constitutes
a highly proficient writing performance in the specific context and genre (Fulcher et al., 2011).
The mediation in scoring a performance assessment also requires someone to apply the rubric or
scale –​a teacher, a researcher, or a rater. While constructs and rubrics may have clear direction on
language in writing performances, the way raters or instructors apply and interpret these criteria
can be less transparent (Lumley, 2002). Raters are important when considering the role of language
in writing as they are making judgments about the language in the performance. Rating and raters
are also tied to the critical quality of reliability in language assessment. Reliability and validity are
likely the two most commonly cited concerns in test quality and have an intricate relationship with
each other. Reliability speaks to the consistency and conformity of the assessment procedures and
instruments, with the goal of reducing any variability that is not related to language proficiency. In
terms of rating, inter-​rater reliability, for example, indicates if raters are scoring consistently with
each other. The score from a writing assessment should not reflect the preferences of a rater outside
the criteria on a rating scale or rubric. For this reason, it is critical that we understand how these
judgments occur, as rubrics and raters are instruments for producing scores that are interpreted and used for specific purposes, and the meaningfulness of those interpretations relates to validity.
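As a rough illustration of what an inter-rater reliability check can look like in practice, the sketch below (in Python; the scores are invented, and the Pearson correlation is only one of several indices used in operational rating, alongside agreement measures such as kappa or many-facet Rasch analysis) computes the consistency between two raters' holistic scores on the same set of essays.

```python
# A minimal, hypothetical check of inter-rater reliability: the Pearson
# correlation between two raters' holistic scores on the same essays.
# Scores are invented; operational programs also use agreement indices.

from math import sqrt

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

rater_1 = [4, 3, 5, 2, 4, 3]  # holistic scores on a 1-5 band scale
rater_2 = [4, 3, 4, 2, 5, 3]

print(round(pearson_r(rater_1, rater_2), 2))  # about 0.82 with these invented scores
```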


Historical Perspectives
Writing and language have overlapped in assessments for a long time. Initially, writing was a
vehicle to test a foreign or classical language, rather than a focus of assessment (Weigle, 2002).
Grammatical knowledge was assessed through the ability to translate between two languages,
often through writing. The scoring on translation assessments focused on the accuracy of gram-
matical structures and vocabulary, not including actual writing ability or knowledge of discourse
structures. Eventually, other approaches emerged to assess language in writing that were indirect,
perhaps as educational measurement and larger scale standardized assessments influenced the field
(Crusan, 2010). These included multiple-​choice, matching, or fill-​in-​the-​blank test items. The focus
remained on linguistic accuracy. In the example below, test takers are asked to identify the “error”
in a sentence via multiple-​choice selection.

Identify the error in the following sentence.


The teacher ask the student how to spell a word.
a      b    c

This approach continued to emphasize syntactic and lexical accuracy. While these two indirect
approaches (i.e., asking test takers to translate one language to another and to respond to selected-​
response items) still exist in language testing and language research, the more accepted practice is
to assess writing directly, with a performance assessment task. Over time, the focus of using writing
to assess language has shifted to language appearing as a component in the larger construct of
writing along with other features such as coherence, cohesion, development, style, or voice.
As mentioned in the previous section, the direct approach has created a need for rubrics to score
the writing. In recounting the history of writing assessment and rubrics, Crusan (2010) pointed to
a 1961 Educational Testing Service (ETS) Research Bulletin "Factors in Judgements of Writing
Ability” as a precursor to early rubric dimensions. In the report, factor analysis was used to iden-
tify five factors in writing that reflect writing ability: ideas, form, flavor, mechanics, and wording.
Form, mechanics, and wording relate to the foundational language features, such as grammar and
vocabulary, while flavor likely targeted pragmatic features of language. These five factors became
fairly standard in analytic writing rubrics for English as a first or second language. Also during
these decades, the foundational concept of communicative competence was articulated by anthro-
pologist Hymes (1972) and taken up in the field of language teaching through the model proposed
by Canale and Swain (1980). Communicative competence focused on the use of language in con-
text, establishing the importance of appropriateness and audience in communication and not only
linguistic accuracy. Language teaching and testing adapted this idea to focus more on performance
of language and a construct comprised of four components: linguistic competence, sociolinguistic
competence, discourse competence, and strategic competence.
In the 1970s, ETS continued to be active in research around the rating and scoring of writing,
including the use of holistic rubrics (Crusan, 2010). Another influential rubric in the history of
writing assessment is the ESL Composition Scale, developed in California by Jacobs et al. (1981).
Their analytic scale included dimensions of content, organization, vocabulary, language use, and
mechanics with slightly different weights given to the first four categories, and mechanics being
weighted least heavily. While these assessments emerged in the United States, others had adopted
direct assessment of writing much earlier, such as the Cambridge Assessments in the United
Kingdom.
As second language writing assessment has shifted to direct evaluation of language use, writing
assessment research has attracted considerable attention. One marker of this was the 1994 founding of a journal devoted to disseminating this research, Assessing Writing. In 2019, the journal published a special issue that looked back on its 25 years to frame the future of writing assessment. According to Zheng and Yu's (2019) analysis of over 200 articles published in the journal, validity and reliability have been the most researched topics over the last 20 years. The volume also predicted that future writing assessment research will continue to address validity and reliability, but will also highlight fairness, stakeholders, technology, and multilingualism (White, 2019).

Critical Issues
To understand the role of language in writing, researchers have explored the essential
question: what does “highly proficient” language look like in writing? In other words, how
do we define the quality of language in writing? This question is relevant both to language
testers and to SLA researchers; however, the two fields have somewhat different responses. In
SLA, attention to linguistic features of language has dominated the conversation about profi-
ciency. In language testing, judgments continue to come from rating with rubrics by human
raters or, more recently, enhanced by automated scoring systems. However, the specific lin-
guistic features studied in SLA research on writing performances have been of interest
to language assessment. This is particularly true as technology affords opportunities to score
writing based on highly technical linguistic features of language that human raters cannot attend
to. Researchers in language testing see the potential for these features to reflect a construct for
writing assessment or to inform task design and scoring rubrics. For example, a considerable
amount of theory and research has explored the qualities of complexity (both syntactic and lex-
ical), accuracy, and fluency (these are commonly referred to with the acronyms CAF or CALF)
as language features in SLA and writing (e.g., Johnson, 2017; Yoon & Polio, 2017). Various
formulae are used to calculate these features, which Kyle details in this volume. More work is
needed to define and describe CAF in other languages, with several recent studies addressing this
gap (Abrams, 2019; Jiang, 2013). For the purposes of informing writing assessment, studies in
SLA of language features, such as CAF or CALF have potential applications in the test devel-
opment and validation process (Plakans, Gebril, & Bilki, 2019).
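As a purely illustrative sketch of how CAF-style indices can be operationalized, the Python snippet below computes crude proxies for fluency (total words), lexical complexity (type-token ratio), and syntactic complexity (mean sentence length). Actual studies use more refined units and formulae (e.g., T-unit-based measures and error-coded accuracy), so this is not the procedure of any cited study.

```python
# Deliberately simplified proxies for CAF-style indices; real studies use
# more refined units (e.g., T-units, error-coded accuracy). Illustration only.

import re

def caf_proxies(text: str) -> dict[str, float]:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "fluency_total_words": len(words),                      # simple fluency index
        "lexical_diversity_ttr": len(set(words)) / len(words),  # type-token ratio
        "mean_sentence_length": len(words) / len(sentences),    # rough complexity proxy
    }

sample = "The students wrote an essay. They revised it after feedback. The essay improved."
print(caf_proxies(sample))
```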

Assessing Language in Relation to Writing Tasks


Features of language and writing quality have been explored in relation to writing tasks elicited
by different assessment prompts. Different genres of writing have been compared in writing
assessments as test takers argue, define, compare, narrate, or summarize (Li, 2014; Yoon & Polio,
2017), which may require variation in language. Yoon and Polio (2017) compared narrative and
argumentative tasks from writing assessments to explore the impact on language and language
development. In prior research, argumentative tasks had been found to elicit more complex lan-
guage, which was confirmed by their data, except for clausal level complexity. Argumentation also
led to the use of longer and less frequent words than narrative writing, which had more lexical var-
iety. The difference in accuracy across the two genres did not emerge as statistically significant.
Johnson (2017) conducted a large-​scale review of CALF in writing across tasks with different
cognitive demands and complexity. This approach to tasks attends to the processes writers use to
access the cognitive resources allocated to aspects of composing. This conceptualization includes
genre but extends beyond and relates to writing assessment prompt and task development issues of
planning time, the role of prior knowledge, steps to complete the task and cognitive processes for
certain genres of writing like causation or argumentation. Johnson’s review confirms that task com-
plexity impacts the writers’ process and attention to formulation of ideas and monitoring. Research
in writing assessment around processes with different task demands deserves attention to inform
development, use, and interpretation of tests and test scores.
Another example of task variation is considering how recognizing audience in a prompt affects
summary writing. In a study using Test of English as a Foreign Language (TOEFL) writing prompts
(Cho & Choi, 2018), writers prompted to target a specific audience were found to give more con-
textual background, communicate ideas more accurately, and provide complete summaries. The
researchers found that the effects of audience awareness differed across English proficiency score levels.

Assessing Language with Rubrics


While CAF is the primary focus of linguistic features in writing performances for SLA researchers, the
concept of highly proficient writing from a language testing perspective is determined by the rubrics
that raters use when evaluating performances. Indeed, rubrics define writing performance at different
levels of proficiency; raters then operationalize and interpret the rubric. These processes draw on the
conceptualizations of language in rubric and scales and by raters. There are several types of rating
scales, including: holistic, analytic, primary-​trait, and multi-​trait (Weigle, 2002). Of these scales, hol-
istic and analytic are the most commonly used, with holistic scales providing one overall score and
analytic scales provide separate scores for each criteria on the scale (Li & He, 2015; Ohta et al., 2018;
Turner, 2013; Weigle, 2002). The type of rating scale has an impact on the process raters evaluate L2
writing products (Barkaoui, 2010a). When using such rating scales, research has shown raters pay more
attention to content, comprehensibility, and clarity of writers’ message than accuracy or appropriacy of
language use, leading to scores that may be less informative about linguistic capacities and more related
to rhetorical language and idea delivery (Barkaoui, 2010a; Gebril & Plakans, 2014; Huot, 1993; Li &
He, 2015; Milanovic, Saville, & Shen, 1996; Pula & Huot, 1993; Vaughan, 1991).
Analytic scales, on the other hand, ask raters to assign multiple scores for a single written per-
formance, evaluating as they do the different dimensions of writing separately (Barkaoui, 2010a).
Typical analytic scales assess dimensions that focus on language use, such as form, mechanics,
wording, and vocabulary (Crusan, 2010; Jacobs et al., 1981). The vocabulary dimension in Jacobs et al.'s (1981) ESL Composition Scale, an analytic rubric, includes descriptors related to the
effectiveness of word choice and use, range of vocabulary, errors in form, and whether the meaning
is obscured by such errors. Furthermore, the language use dimension includes descriptors related to
the complexity of sentence structure as well as the amount and type of errors (e.g., the use of tense
and agreement) that affect meaning. All of these elements are taken into account by raters when
evaluating each dimension separately.
As research in language testing has considered the relationship between CAF measures and how humans rate writing quality, the focus has tended to be on construct validity. In fact, research has shown
that human judgment of language use may not always align with the development of CAF measures
(Crossley & McNamara, 2014). This may, potentially, reflect different constructs. In SLA, devel-
opment is found in growth and change over time; in contrast, language assessments are usually
evaluating the test taker’s capacity to perform in one moment in time. It may also reflect raters’
interpretation and application of scale descriptors, which can differ across raters, as each rater may
prioritize different aspects of the descriptors when making scoring decisions (Lumley, 2002).

Current Contributions and Research


Research on language in writing assessment has provided more insight into the question, introduced in the previous section, of what constitutes highly proficient writing by applying CAF to language tests. It has also
investigated how linguistic features relate to scores and how this is impacted by writing proficiency,
validity of rating, types of scales, and the characteristics of raters.
The CAF framework has been utilized in language testing research to understand how scores on writing performances relate to language features. This work connects SLA research with language testing by illuminating rubrics and rating. A large-scale study of the Test of English as a Foreign Language (TOEFL)
utilized CAF to compare new integrated skill assessment tasks, including integrated writing, with
more traditional single skill tasks (Cumming et al., 2006). In another study of integrated skills
writing assessment, researchers considered if measures of CAF were significant predictors of a
score on a writing assessment (Plakans et al., 2019). These studies have shown that, of the three CAF features, fluency is the strongest predictor of a writing score, with accuracy as the second strongest predictor.
Researchers have also reported a positive relationship between accuracy and scores on writing
assessments (Neumann, 2014; Plakans et al., 2019; Polio & Shea, 2014).
Complexity in writing has yet to be consistently found as a strong predictor of score. This
feature has also been difficult to measure, with some scholars proposing phrasal complexity,
rather than clausal, as the more relevant form for writing (Biber et al., 2011), and others arguing
that complexity is multifaceted and research has selected the wrong measures to reflect it (Bulté
& Housen, 2014; Housen et al., 2019). A recent study by Kyle and Crossley (2019) investigated large-grained and fine-grained measures of complexity, finding support for the claim that phrasal complexity measures relate to second language writing proficiency, but also identifying a fine-grained clausal measure that had predictive power.
Rating scales play a central role in performance assessments; indeed, McNamara et al. (2002)
defined such scales as “the de facto test constructs” (p. 229). The development of these rating scales
is particularly challenging and can impact how language is reflected in the scale. Fulcher (2003)
argued that there are two main approaches to developing rating scales: intuitive and empirical
methods. Intuitive-based rating scales are developed a priori by a team of teachers, experts,
and/​or testers by drawing on existing scales, theory, and their impressions of a good performance.
Recent research has determined that scale development is often done through a mix of approaches
rather than following one singularly (Knoch et al., 2021).
In developing empirically-derived rating scales, a team of L2 writing experts or teachers collects sample performances on the writing tasks and examines how language is used in those
sample performances. The process entails identifying key performance features at a specific score
band related to test constructs. In many cases, raters’ think-​alouds, interviews, and questionnaires can
inform scale developers of the criteria that raters use when judging the quality of written products.
For example, in developing Empirically-​derived, Binary choice, Boundary definition (EBB) rating
scales, raters are asked to categorize a batch of written samples into high- and low-level performances, followed by an inquiry into the rationale used to make such judgments, as well as
the creation of yes/​no questions that aid in establishing the boundaries of the scale levels and, by
extension, distinguishing the proficiency levels (Upshur & Turner, 1995). The development of this
type of scale has been documented by several researchers. In the context of an ESL placement test,
for instance, Plakans (2013) documented the development of two EBB rating scales: one to score
writing quality and another for grammar in writing. Plakans reported low rater agreement for the
EBB rating scale for grammar. Ewert and Shin (2015) investigated how experienced ESL teachers
developed the EBB scale for reading-​to-​write tasks used in a university English placement test.
They found that the dimension of language abilities was the most influential factor for “being in or
out of the program at the top or the bottom, and whether to go to the higher two levels or the lower
level” (p. 46). In the context of diagnostic assessment, Knoch (2009) compared intuitive-​based and
empirically-​based analytic scales for the ESL expository writing task with a variety of dimensions,
such as accuracy, fluency, complexity, and content. She found that with the empirically-​based scale,
raters were better able to focus on each of the dimensions, showing clearer discrimination between
test takers, higher reliability, and greater rater preference. Despite the importance of rating scales in
performance assessment, however, research on the development of scales remains limited, particu-
larly in comparison to the more fruitful research that has been conducted on rater cognition and the
application of rating scales.
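To make the boundary-definition logic of EBB scales more concrete, the following is a minimal, hypothetical Python sketch; the yes/no boundary questions and the four levels are invented for illustration and are not taken from Upshur and Turner (1995) or from any operational scale.

# Illustrative EBB-style scoring: each yes/no boundary question separates a
# higher band from a lower one (all questions here are hypothetical).
def ebb_score(script):
    # Boundary 1: is the message comprehensible on a first reading?
    if not script["comprehensible"]:
        return 1
    # Boundary 2: is the response organized around a clear position?
    if not script["clear_position"]:
        return 2
    # Boundary 3: does grammatical control avoid obscuring meaning?
    if not script["meaning_preserving_grammar"]:
        return 3
    return 4

sample = {"comprehensible": True, "clear_position": True,
          "meaning_preserving_grammar": False}
print(ebb_score(sample))  # prints 3

In an operational EBB scale, each boundary question would be derived empirically from raters' sorting of sample performances and the rationales they offer for those judgments, as described above.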
Research has investigated how raters use and interpret rubrics to assign writing scores, finding
that scores are largely affected by the content of scale descriptors as well as by the raters who use them. Harsch
and Martin (2013) emphasized the need to “look at what raters do with the descriptors and how
they apply them to scripts when forming their judgments” (p. 281). Bachman and Palmer (1996)
specified three potential inconsistency issues in raters' use of rating scales, all of which
contribute to the unreliability of scores. First, the interpretation of the rating scale may differ among
different raters or within a single rater across different rating occasions. Second, different raters
may apply differing levels of severity to the same piece of writing, or a single rater's severity may vary across a batch of
written products. Last, the judgment of writing quality may be based on some construct-​irrelevant
aspects of writing performance that are not specified in the rating scale.
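As a minimal sketch, with invented scores, of how the first two of these inconsistency issues might be quantified, the Python snippet below computes a rater's relative severity and two simple consistency indices; the data, the 1-6 scale, and the use of means and correlations are illustrative assumptions rather than a prescribed procedure.

import numpy as np

# Hypothetical holistic scores (1-6 scale) awarded to the same ten scripts.
rater_a_time1 = np.array([4, 5, 3, 4, 2, 5, 4, 3, 5, 4])
rater_a_time2 = np.array([4, 4, 3, 4, 3, 5, 4, 3, 5, 4])  # same rater, later occasion
rater_b_time1 = np.array([3, 4, 2, 3, 2, 4, 3, 2, 4, 3])

# Relative severity: a consistently lower mean suggests a more severe rater.
print(rater_a_time1.mean() - rater_b_time1.mean())

# Intra-rater consistency: correlation of one rater's scores across occasions.
print(np.corrcoef(rater_a_time1, rater_a_time2)[0, 1])

# Inter-rater consistency on the same occasion.
print(np.corrcoef(rater_a_time1, rater_b_time1)[0, 1])

Operational programs rely on more elaborate indices, such as the many-facet Rasch measurement discussed later in this chapter, but the underlying questions about severity and consistency are the same.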
Research on raters has explored how different rater characteristics interact with the task of
scoring. For example, previous studies have explored whether native and non-​native-​speaking
raters of L2 performance generate equally reliable scores (Connor-​Linton, 1995; Hill, 1996; Kim,
2009; Johnson & Lim, 2009; Shi, 2001; Zhang & Elder, 2011). Researchers have found negli-
gible differences in score reliability between raters with different language backgrounds, but
qualitative data illuminated differences in the area of assessment criteria and scoring behavior
(Kim 2009; Zhang & Elder, 2011). In writing assessment, Connor-​Linton (1995) examined
scoring criteria used by American and Japanese teachers of English. Despite the same level of
interrater reliability between the two groups of teachers, English L1 teachers focused more on
the essay structure and development of ideas, but L1 Japanese teachers focused more on accuracy
in content, vocabulary, and grammar. Shi (2001) focused on the Chinese EFL learners’ compos-
ition rating comparing native English speakers with Chinese teachers of English and found no
significant difference in their holistic rating; however, the qualitative data suggested that native-​
speaking raters commented more positively compared to Chinese-​speaking raters, particularly
in criteria for content and language. Similarly, Johnson and Lim’s (2009) study focused on the
ratings for the MELAB writing test and also found no pattern of difference between native and
non-​native English-​speaking raters in terms of bias toward writing performance by specific L1
examinees.
Previous studies have also explored whether raters’ educational and professional background
and rating experience affect rating criteria, severity, and consistency of rating. Cumming (1990)
found that expert raters of ESL compositions attended to a wide range of assessment criteria, and
they “appear to have a much fuller mental representation of ‘the problem’ of evaluating student
composition” (p. 43). Huot (1993) found that criteria used by novice raters (unfamiliar with holistic
rating and untrained) were unstable over different tasks and essay features. Huot summarized that
novice raters have not internalized “a definite guideline on which to base their scoring decisions”
(p. 226). Schoonen et al. (1997) found that expert raters were somewhat more reliable in assessing
usage (i.e., errors in grammar, vocabulary/​idiom, and style) than lay raters (without linguistic or
educational backgrounds), but both groups of raters were equally reliable in assessing content (i.e.,
relevancy and appropriateness of propositions).
Sakyi (2003) compared the rating behavior by newly trained and experienced ESL teachers who
used a holistic scale to score essays for the Test of English as a Foreign Language Test of Written
English. Novice raters tended to make scoring decisions based on “easily discernable aspects of
the writing” (p. 129), such as language (grammar and syntax) and spelling. Unlike novice raters,
experienced counterparts seemed to have internalized scoring criteria and mainly paid attention
to content and format of the essay. Similarly, Şahan and Razı (2020) examined decision-​making
behaviors of raters with low-​, mid-​, and high-​level of rating experience who used an analytic scale
to score university-​level EFL students’ essays. They found that less-​experienced raters’ attention
focused on language features (e.g., grammar, vocabulary, and spelling), particularly when scoring
low-​scoring essays, while more-​experienced raters not only attended to linguistic features but also
rhetorical and ideational issues, such as topic development and supporting details. They concluded,
“…content and discursive features of the texts might become more prominent in raters’ decision-​
making processes as the raters’ level of assessment experience increases” (p. 328). Somewhat
contradictory findings were reported by Barkaoui (2010b) who examined assessment criteria
used by raters with or without ESL composition teaching and rating experience. He found
that experienced raters who used a holistic scale focused on linguistic accuracy (i.e., syntax and
vocabulary) while novice raters attended more to the quality of argumentation or content aspects of
essays. Accordingly, holistic scores assigned by experienced raters were lower than those assigned
by novice raters, suggesting that novice raters’ ratings were more lenient than those given by
experienced raters. In sum, the research suggests that the effect of rater characteristics on scoring
behavior and criteria, including their focus on language, might interact with test takers’ proficiency
levels and types of scales used, in a complex way.

Main Research Methods


Research exploring the role of language in writing assessment has tended toward quantitative
approaches using advanced statistical procedures and large computational data sets (corpora) to
delve into validity, reliability, relationships among variables of interest, and to trace patterns of
development. However, researchers have also used qualitative approaches to investigate the elicit-
ation of constructs, to understand processes, and to observe the use of writing assessments.
In studies described earlier, to explore CAF, researchers have depended on data (corpora)
tagged or coded for language features and analyzed them using inferential statistics, such as t-tests, chi-square tests, ANOVA, correlation, and regression. With large corpora and technology that allows automated tagging, researchers have been able to work with larger data sets, which affords greater statistical power and more complex analyses. To examine the validity of analytic
rating scales, factor analyses, including exploratory (EFA) and confirmatory factor analysis (CFA)
are often employed. In addition, generalizability (G) theory analyses can examine the magnitude
of score variation due to the sampling of tasks and raters (Brennan, 2001). Multifaceted Rasch
measurement (Linacre, 1989) has also been extensively used to investigate the relationship among
various facets of tests, including task difficulty, rater severity, and test taker ability. Rasch is also
widely used in validity research to assure quality of measures used to provide evidence for construct
representation (Aryadoust et al., 2021). Multifaceted Rasch measurement has also been particularly
popular in language testing research as a means to look at bias, such as rater bias, which can be an
indicator of reliability or fairness issues (Knoch & McNamara, 2015).
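As a simple illustration of the quantitative analyses described here, the following Python sketch regresses holistic scores on three CAF indices (fluency as total words, accuracy as the proportion of error-free T-units, and complexity as mean length of T-unit); all values are invented, and the plain least-squares fit stands in for the larger and more carefully specified models used in published studies.

import numpy as np

# Invented per-essay CAF indices and holistic scores for eight test takers.
fluency    = np.array([180, 230, 150, 260, 200, 310, 170, 290], dtype=float)
accuracy   = np.array([0.55, 0.70, 0.40, 0.75, 0.60, 0.85, 0.50, 0.80])
complexity = np.array([9.5, 11.0, 8.0, 12.5, 10.0, 13.0, 9.0, 12.0])
scores     = np.array([3, 4, 2, 5, 3, 6, 3, 5], dtype=float)

# Ordinary least squares: score ~ intercept + fluency + accuracy + complexity.
X = np.column_stack([np.ones_like(scores), fluency, accuracy, complexity])
coefs, *_ = np.linalg.lstsq(X, scores, rcond=None)
print(dict(zip(["intercept", "fluency", "accuracy", "complexity"], np.round(coefs, 3))))

In actual studies the indices are computed from tagged corpora and the models draw on far more observations and controls; the sketch only shows the basic shape of the score-feature relationship being estimated.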
When investigating rater performance, such as rating behavior and decision-making processes, quali-
tative data analyses are more common. Previous research has shown that data obtained from raters’
think-​aloud protocols, post-​rating interviews, group interviews, and retrospective questionnaires
can provide a much deeper insight into raters’ scoring approaches. For example, raters’ verbal
reports have allowed researchers to examine factors that affected scoring decisions (Gebril &
Plakans, 2014; Lumley, 2002; Sakyi, 2003).

Recommendations for Practice


As mentioned earlier in this chapter, the first step in developing a writing assessment is to define the
construct being measured. The construct and purpose for an assessment will determine the role of
language. For example, if a writing assessment is being used to place students into levels of a lan-
guage program, then the construct should have some relationship to the curriculum of the program
and the range of proficiency levels. How language is reflected in the program’s curriculum should
be considered in defining the construct to ensure alignment. This construct should be referenced in
designing tasks and in determining the rubric or scale used to score the performances.
The design of a task in assessing writing will reflect the purpose and construct, but also requires
consideration of features such as genre, topic, and audience (Cho & Choi, 2018; Li, 2014; Yoon & Polio, 2017).


Language appears readily in the design of writing assessment tasks and its role can be explicit or
indirect. In Example Prompt A, the writer is given very clear guidance on what language should be
used in the performance. This approach is more common at early stages in language development
or lower levels of language learning programs.

Example Prompt A: Draw a picture that shows something important to you. Write two
sentences about the drawing using vocabulary learned in class and two new sentence
frames.

On the other hand, in Example Prompt B, instruction in language use is less explicit. This approach
suggests that the writers’ ability to select appropriate language for the task is part of the construct.
This implicit approach would be asked of writers at a more advanced level of language proficiency
and with considerable metalinguistic awareness.

Example Prompt B: Write a persuasive essay to answer the question, what does
courage mean?

While tasks and prompts guide writers to produce certain language forms, the scoring of
performances determines how well the outcome reflects language ability. Language is usually
described overtly in the criteria of a rubric, for example, in the Intermediate Mid-level description for writing published in the proficiency guidelines of the American Council on the Teaching of Foreign Languages (ACTFL, 2012), which are commonly used for scoring performance-based assessment.
This descriptor includes references to language throughout:

Writers at the Intermediate Mid sublevel are able to meet a number of practical writing
needs. They can write short, simple communications, compositions, and requests for infor-
mation in loosely connected texts about personal preferences, daily routines, common
events, and other personal topics. Their writing is framed in present time but may con-
tain references to other time frames. The writing style closely resembles oral discourse.
Writers at the Intermediate Mid sublevel show evidence of control of basic sentence struc-
ture and verb forms. The writing is best defined as a collection of discrete sentences and/​
or questions loosely strung together.

In these criteria, language is described with reference to both syntax (verb tenses) and lexis (choice of topic). Language is also addressed at the functional and discourse levels in phrases such
as “requests for information,” “resembles oral discourse,” and “sentences loosely strung together.”
There are several critical areas of concern in validating rating scales. One critical aspect is to ana-
lyze written products to identify linguistic features that distinguish one level from another. In using
analytic scales, it is also important to investigate whether the analytic components are related to but distinct enough from one another, given that different dimensions represent unique characteristics of
language performance (Sawaki, 2007). Also, when weighted analytic component scores are summed
to make up an overall score, the relationship between the total score and the subcomponent scores
must reflect the relative importance of the aspects of language being tested. Deciding what type of rating scale to use requires consideration of factors such as the purpose and context of assessment (Turner, 2013).
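As a small illustration of how weighted analytic component scores can be summed into an overall score, the sketch below uses hypothetical dimensions, weights, and scores; any operational weighting would need to be justified against the test construct, as noted above.

# Hypothetical analytic dimensions scored 0-10; the weights sum to 1 and are
# illustrative only, not taken from any published scale.
weights = {"content": 0.30, "organization": 0.20, "vocabulary": 0.20,
           "language_use": 0.25, "mechanics": 0.05}
component_scores = {"content": 7, "organization": 6, "vocabulary": 8,
                    "language_use": 5, "mechanics": 9}

overall = sum(weights[dim] * component_scores[dim] for dim in weights)
print(round(overall, 2))  # 6.6 on the same 0-10 metric

Changing the weights changes which dimension of language performance drives the composite, which is why the relationship between subscores and the total score deserves scrutiny during validation.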
To reduce rater variability, rater training is considered a necessary part of performance assessment
(Attali, 2016; Lim, 2011; Lumley, 2002). According to Lumley and McNamara (1995), rater training
is effective in reducing extremely severe ratings, eliminating random judgments, and increasing
self-​consistency within a rater. Rater training generally improves the reliability and validity of
scores. However, Lumley and McNamara (1995) suggest that we accept that both differences and
similarities in raters’ judgments are natural phenomena in performance assessment because “in a
matter of some complexity, no one judgment may be said to be definitive, although there is likely
to be a considerable area of overlap between judgments” (p. 57).
The degree to which these practices are implemented will depend on the purpose and the context
of the assessment. For classroom assessment that is formative, feedback may be more appropriate
than a score and the focus should be on supporting students’ language development. If writing
assessment is used to collect research data, then this feedback or any diagnostic supports would not
be necessary.

Future Directions
As the field of writing assessment continues to expand, two directions will decidedly impact the
role of language in terms of the constructs, tasks, raters, and rubrics used in the direct assessment
of language. One area is the work around multilingual and multi-​skill assessment. The other area
is technology.
With recognition of the complexity of language use and the pervasiveness of multilingualism,
writing assessments are being explored in a more multifaceted way than in the past. Researchers
have been investigating how to apply theories of multilingualism in assessment by including mul-
tiple languages in one assessment (Schissel et al., 2019; Guzman-​Orth et al., 2019; Hofer & Jessner,
2019). This work is in its early stages and has focused on speaking; however, writing should also be considered in designing assessments that tap into multilingual capacity. Another area that is further along, but still in its infancy, is multimodality of language in use, moving away from a four-skills
view of language in assessment. Most relevant to writing is the integration of reading or reading
and speaking with writing and how the field might define the construct to assess discourse that is
synthesized (Cheong et al., 2019). This work has expanded to constructs, tasks, rubrics, and raters
(Gebril & Plakans, 2014; Knoch & Sitajalabhorn, 2013; Plakans, 2009). These approaches reflect
the perspective of language as contextually and socially oriented. This evolving view will impact
the role of language in writing assessment.
Technology integration in assessment has a fairly deep history, going back to early computerized
scoring in the 1960s. However, advances in technology, which increase the level of sophistication, assure higher reliability, and allow more rigorous study of validity, have kept this research at the forefront. One affordance that is not new but has recently become more reliable, accurate,
and robust is the use of artificial intelligence. This can allow customization in the testing process
as well as automatization of scoring. The use of automated scoring of language production may
require considerable rethinking of our constructs (Deane, 2013). It is highly efficient and accurate
in the measurement of certain language features, those that are organizational or textual, but is
less able to score more pragmatic, sociocultural, or rhetorical aspects of writing. In addition to
studying automated scoring in large-​scale assessments, researchers have investigated how using
this automation could provide support in the classroom to add to the teacher’s feedback, and, in this
way, the computer can focus on what it does best, lexico-​grammatical accuracy. The integration of
technology into the testing process can also allow support for writers, such as spelling or grammar
checks, which could impact authenticity as well as the role of language. This use of automated
scoring in formative classroom writing assessment has the potential to introduce new constructs
to the field (Ranalli et al., 2017; Wilson, Roscoe, & Ahmed, 2017) and to illuminate future directions in research, theory, and applications to language learning.
While the role of language in writing assessment has evolved over time, it remains one of the critical components to consider in the development and use of constructs, tasks, rubrics, rating, and research. Although the purposes for assessing writing vary, from education to research, language remains a constant feature of interest.


References
Abrams, Z. (2019). The effects of integrated writing on linguistic complexity in L2 writing and task com-
plexity. System, 81, 110–121. doi.org/10.1016/j.system.2019.01.009
American Council on the Teaching of Foreign Languages. (2012). ACTFL proficiency guidelines. Alexandria,
VA: ACTFL Publications. Retrieved from www.actfl.org/​resources/​actfl-​proficiency-​guidelines-​2012/​
english/​writing
Aryadoust, V., Ng, L.Y., & Sayama, H. (2021). A comprehensive review of Rasch measurement in language
assessment: Recommendations and guidelines for research. Language Testing, 38, 6–​40. doi.org/​10.1177/​
0265532220927487
Attali, Y. (2016). A comparison of newly-​trained and experienced raters on a standardized writing assessment.
Language Testing, 33, 99–​115. doi.org/​10.1177/​0265532215582283
Bachman, L.F., & Palmer, A.S. (1996). Language testing in practice. Oxford: Oxford University Press.
Bachman, L.F., & Palmer, A.S. (2010). Language assessment in practice. Oxford: Oxford University Press.
Barkaoui, K. (2010a). Variability in ESL essay rating processes: The role of the rating scale and rater experi-
ence. Language Assessment Quarterly, 7, 54–​74. doi.org/​10.1080/​15434300903464418
Barkaoui, K. (2010b). Do ESL essays raters’ evaluation criteria change with experience? A mixed methods,
cross-​sectional study. TESOL Quarterly, 44, 31–​57. doi.org/​10.5054/​tq.2010.214047
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammat-
ical complexity in L2 writing development? TESOL Quarterly, 45, 5–35. doi.org/10.5054/tq.2011.244483
Brennan, R.L. (2001). Generalizability theory. New York: Springer.
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-term changes in L2 writing complexity.
Journal of Second Language Writing, 26, 42–​65. doi.org/​10.1016/​j.jslw.2014.09.005
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching
and testing. Applied Linguistics, 1, 1–​47. doi:10.1093/​applin/​1.1.1
Chan, S., Inoue, C., & Taylor, L. (2015). Developing rubrics to assess the reading-​into-​writing skills: A case
study. Assessing Writing, 26, 20–​37. doi.org/​10.1016/​j.asw.2015.07.004
Cheong, C.M., Zhu, X., Li, G.Y., & Wen, H. (2019). Effects of intertextual processing on L2 integrated writing.
Journal of Second Language Writing, 44, 63–​75. doi.org/​10.1016/​j.jslw.2019.03.004
Cho, Y., & Choi, I. (2018) Writing from sources: Does audience matter? Assessing Writing, 37, 25–​38. doi.org/​
10.1016/​j.asw.2018.03.004
Connor-​Linton, J. (1995). Cross-​cultural comparison of writing standards: American ESL and Japanese EFL.
World Englishes, 14, 99–​115. doi.org/​10.1111/​j.1467-​971x.1995.tb00343.x
Crossley, S.A., & McNamara, D.S. (2014). Does writing development equal writing quality? A computational
investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66–​79. doi.
org/​10.1016/​j.jslw.2014.09.006
Crusan, D. (2010). Assessment in the second language writing classroom. Ann Arbor, MI: University of
Michigan Press.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7, 31–​51. doi.
org/​10.1111/​j.1467-​1770.1989.tb00592.x
Cumming, A., Kantor, R., Baba, K., Erdosy, U., Eouanzoui, K., & James, M. (2006). Analysis of discourse
features and verification of scoring levels for independent and integrated tasks for the new TOEFL. (TOEFL
Monograph No. MS-30, RM 05-13). Educational Testing Service.
Deane, P. (2013). On the relation between automated essay scoring and modern views of the writing construct.
Assessing Writing, 18, 7–​24. doi.org/​10.1016/​j.asw.2012.10.002
Ewert, D., & Shin, S. (2015). Examining instructors' conceptualizations and challenges in designing a data-driven rating scale for a reading-to-write task. Assessing Writing, 26, 38–50. doi.org/
10.1016/​j.asw.2015.06.001
Fulcher, G. (2003). Testing second language speaking. London: Longman/​Pearson Education.
Fulcher, G., Davidson, F., & Kemp, J. (2011). Effective rating scale development for speaking tests: Performance
decision trees. Language Testing, 28, 5–​29. doi.org/​10.1177/​0265532209359514
Gebril, A., & Plakans, L. (2014). Assembling validity evidence for assessing academic writing: Rater reactions
to integrated tasks. Assessing Writing, 21, 56–​73. doi.org/​10.1016/​j.asw.2014.03.002
Grabe, W., & Kaplan, R.B. (1996). Theory and practice of writing: An applied linguistic perspective.
Harlow: Longman.
Guzman-​Orth, D.A., Lopez, A.A., & Tolentino, F. (2019). Exploring the use of a dual language assessment
task to assess young English Learners. Language Assessment Quarterly, 16, 447–​463. doi.org/​10.1080/​
15434303.2019.1674314
Harsch, C., & Martin, G. (2013). Comparing holistic and analytic scoring methods: Issues of validity and reli-
ability. Assessment in Education, 20, 281–​307. doi.org/​10.1080/​0969594x.2012.742422
Hill, K. (1996). Who should be the judge? The use of nonnative speakers as raters on a test of English as an
international language. Melbourne Papers in Language Testing, 5, 29–​50.
Hofer, B., & Jessner, U. (2019). Assessing the components of multi-​(lingual) competence in young learners.
Lingua, 232, 1–​13. doi.org/​10.1016/​j.lingua.2019.102747
Housen, A., De Clercq, B., & Kuiken, F. (2019). Multiple approaches to complexity in second language
research. Second Language Research, 35, 3–​21. doi.org/​10.1177/​0267658318809765
Huot, B. (1993). The influence of holistic scoring procedures on reading and rating student essays. In M.M.
Williamson & B.A. Huot (Eds.), Validating holistic scoring for writing assessment: Theoretical and empirical foundations (pp. 206–236). Cresskill, NJ: Hampton Press.
Hymes, D.H. (1972). On communicative competence. In J.B. Pride & J. Holmes (Eds.), Sociolinguistics: Selected readings (pp. 269–293). Harmondsworth: Penguin.
Jacobs, H.L., Zingraf, S.A., Wormuth, D.R., Hartfiel, V.F., & Hughey, J.B. (1981). Testing ESL composition: A
practical approach. Rowley, MA: Newbury House.
Jiang, W. (2013). Measurements of development in L2 written production: The case of Chinese writing.
Applied Linguistics, 34, 1–​24. doi.org/​10.1093/​applin/​ams019
Johnson, J.S., & Lim, G.S. (2009). The influence of rater language background on writing performance
assessment. Language Testing, 26, 485–​505. doi.org/​10.1177/​0265532209340186
Johnson, M. (2017). Cognitive task complexity and written syntactic complexity, accuracy, lexical complexity,
and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing, 37, 13–​38. doi.
org/​10.1016/​j.jslw.2017.06.001
Kim, Y.-​H. (2009). An investigation into native and non-​native teachers’ judgments of oral English perform-
ance: A mixed methods approach. Language Testing, 26, 187–​217. doi.org/​10.1177/​0265532208101010
Knoch, U. (2009). Diagnostic assessment of writing: A comparison of two rating scales. Language Testing,
26(2), 275–​304. doi.org/​10.1177/​0265532208101008
Knoch, U., Deygers, B., & Khamboonruang, A. (2021). Revisiting rating scale development for rater-​mediated
language performance assessments: Modelling construct and contextual choices made by scale developers.
Language Testing. Advance online publication. https://​doi.org/​10.1177/​0265532221994052
Knoch, U., & McNamara, T. (2015). Rasch analysis. In L. Plonksy, (Ed.), Advancing quantitative methods in
second language research (pp. 275–​304). New York: Routledge. https://​doi.org/​10.4324/​9781315870908
Knoch, U., & Sitajalabhorn, W. (2013). A closer look at integrated writing tasks: Towards a more focused def-
inition for assessment purposes. Assessing Writing, 18, 300–​308. doi.org/​10.1016/​j.asw.2013.09.003
Kyle, K., & Crossley, S. (2019). Measuring syntactic complexity in L2 writing using fine-grained clausal and
phrasal indices. Modern Language Journal, 102, 333–​349.
Li, H., & He, L. (2015). A comparison of EFL raters’ essay-​rating processes across two types of rating scales.
Language Assessment Quarterly, 12, 178–​212. doi.org/​10.1080/​15434303.2015.1011738
Li, J. (2014). Examining genre effects on test takers’ summary writing performance. Assessing Writing, 22,
75–​90. doi.org/​10.1016/​j.asw.2014.08.003
Lim, G.S. (2011). The development and maintenance of rating quality in performance writing assessment:
A longitudinal study of new and experienced raters. Language Testing, 28, 543–​560. doi.org/​10.1177/​
0265532211406422
Linacre, J.M. (1989). Many-​facet Rasch measurement. Chicago, IL: MESA Press.
Lumley, T. (2002). Assessment criteria in a large-​scale writing test: What do they really mean to the raters?
Language Testing, 19, 246–​276. doi.org/​10.1191/​0265532202lt230oa
Lumley, T., & McNamara, T.F. (1995). Rater characteristics and rater bias: Implications for training. Language
Testing, 12, 54–​71. doi.org/​10.1177/​026553229501200104
McNamara, T. (1996). Measuring second language performance. Harlow: Addison Wesley Longman.
McNamara, T., Hill, K., & May, L. (2002). Discourse and assessment. Annual Review of Applied Linguistics,
22, 221–​242. doi.org/​10.1017/​s0267190502000120
Messick, S. (1995). Standards of validity and the validity of standards in performance assessment. Educational
Measurement: Issues and Practice, 14, 5–​8. doi.org/​10.1111/​j.1745-​3992.1995.tb00881.x
Milanovic, M., Saville, N., & Shen, S. (1996). A study of the decision-​making behavior of composition
markers. In M. Milanovic & N. Saville (Eds.), Performance testing, cognition and assessment (pp. 92–​
114). Cambridge: Cambridge University Press.
Neumann, H. (2014). Teacher assessment of grammatical ability in second language academic writing: A case
study. Journal of Second Language Writing, 24, 83–​107. doi.org/​10.1016/​j.jslw.2014.04.002
Ohta, R., Plakans, L.M., & Gebril, A. (2018). Integrated writing scores based on holistic and multi-​trait
scales: A generalizability analysis. Assessing Writing, 38, 21–​36. doi.org/​10.1016/​j.asw.2018.08.001
Plakans, L. (2009). Discourse synthesis in integrated second language writing performance. Language Testing,
26, 561–​587. doi.org/​10.1177/​0265532209340192
Plakans, L. (2013). Writing scale development and use in a language program. TESOL Journal, 4, 151–​163.
doi.org/​10.1002/​tesj.66
Plakans, L., Gebril, A., & Bilki, Z. (2019). Shaping a score: Complexity, accuracy, and fluency in integrated
writing performances. Language Testing, 36, 161–179. doi.org/10.1177/0265532216669537
Polio, C., & Shea, M. (2014). An investigation into current measures of linguistic accuracy in second language writing research. Journal of Second Language Writing, 26, 10–27. doi.org/10.1016/j.jslw.2014.09.003
Pula, J.J., & Huot, B. (1993). A model of background influences on holistic raters. In M. Williamson and B.
Huot (Eds.), Validating holistic scoring for writing assessment: Theoretical and empirical foundations (pp.
237–​265). Cresskill, NJ: Hampton Press.
Ranalli, J., Link, S., & Chukharev-Hudilainen, E. (2017). Automated writing evaluation for formative
assessment of second language writing: Investigating the accuracy and usefulness of feedback as part of an
argument-​based validity framework. Educational Psychology, 37, 8–​25. doi.org/​10.1016/​j.asw.2018.03.007
Şahan, Ö., & Razı, S. (2020). Do experience and text quality matter for raters’ decision-​making behaviors?
Language Testing, 37(3), 311–​332. doi.org/​10.1177/​0265532219900228
Sakyi, A.A. (2003). The study of the holistic scoring behaviours of experienced and novice ESL instructors
(Doctoral dissertation). Available from ProQuest Dissertations and Theses. (UMI No. NQ78033).
Sawaki, Y. (2007). Construct validation of analytic rating scales in a speaking assessment: Reporting a score
profile and a composite. Language Testing, 24, 355–​390. doi.org/​10.1177/​0265532207077205
Schissel, J., Leung, C., & Chalhoub-​Deville, M. (2019). The construct of multilingualism in language testing.
Language Assessment Quarterly, 16, 373–​378. doi.org/​10.1080/​15434303.2019.1680679
Schoonen, R., Vergeer, M., & Eiting, M. (1997). The assessment of writing ability: Expert readers versus lay
readers. Language Testing, 14, 157–​184. doi.org/​10.1177/​026553229701400203
Shi, L. (2001). Native- and nonnative-speaking EFL teachers' evaluation of Chinese students' English writing.
Language Testing, 18, 303–​325. doi.org/​10.1191/​026553201680188988
Shohamy, E. (2000). The relationship between language testing and second language acquisition, revisited.
System, 28, 541–​553. doi.org/​10.1016/​s0346-​251x(00)00037-​3
Turner, C.E. (2013). Rating scales for language tests. In C.A. Chapelle (Ed.), The Encyclopedia of Applied
Linguistics (pp. 1–​7). Malden, MA: Wiley-​Blackwell.
Upshur, J.A., & Turner, C.E. (1995). Constructing rating scales for second language tests. ELT Journal, 49,
3–12. doi.org/10.1093/elt/49.1.3
Vaughan, C. (1991). Holistic assessment: What goes on in the raters’ minds? In L. Hamp-​Lyons (Ed.), Assessing
second language writing in academic contexts (pp. 111–​125). Norwood, NJ: Ablex.
Weigle, S.C. (2002). Assessing writing. Cambridge: Cambridge University Press.
White, E. (2019). (Re)visiting twenty-​five years of writing assessment. Assessing Writing, 42, 1–​6. doi.org/​
10.1016/​j.asw.2019.100419
Wilson, J., Roscoe, R., & Ahmed, Y. (2017). Automated formative writing assessment using a levels of lan-
guage framework. Assessing Writing, 34, 16–36. doi.org/10.1016/j.asw.2017.08.002.
Yoon, H-​J. (2017). Linguistic complexity in L2 writing revisited: Issues of topic, proficiency, and construct
multidimensionality. System, 66, 130–​141. https://​doi.org/​10.1016/​j.system.2017.03.007
Yoon, H., & Polio C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51, 275–​301. doi.org/​10.1002/​tesq.296
Zhang, B., & Elder, C. (2011). Judgments of oral proficiency by non-​native and native English speaking
teacher ratings: Competing or complementary constructs? Language Testing, 28, 31–​50. doi.org/​10.1177/​
0265532209360671
Zheng, Y., & Yu, S. (2019). What has been assessed in writing and how? Empirical evidence from Assessing
Writing (2000–​2018). Assessing Writing, 42, 1–​11. doi.org/​10.1016/​j.asw.2019.100421

SECTION 6

Writing Research in Different Contexts


18
LEARNING AND TEACHING
L2 WRITING IN CONTENT
AND LANGUAGE INTEGRATED
LEARNING (CLIL) CONTEXTS
Carmen Pérez-Vidal and David Lasagabaster
Universitat Pompeu Fabra and University of the Basque Country UPV/​EHU

Introduction
The acronym CLIL (Content and Language Integrated Learning) was coined at the end of the 1990s
by a group of European experts and was rapidly adopted in the European context. Since CLIL is
used as an umbrella term, it is not always obvious what it refers to. In order to avoid the diffuse
nature of the term, we first summarize the main features of this approach as understood in this
chapter. This is pertinent because this set of criteria will lay the foundation for the selection of
research studies on teaching and learning writing in CLIL settings analyzed in the chapter.
An initial feature of CLIL is that it is an approach whose dual objective consists in teaching con-
tent through a foreign language, so that students learn both content and language at the same time.
This acronym coexists with others used to include language and content teaching, such as content-​
based instruction, foreign language medium instruction, learning through an additional language,
or content-​enhanced teaching, among many others. Although CLIL is mainly a European label, it is
now also becoming popular with researchers, educational administrators, teachers, and educational
authorities in other contexts such as Australia, Asia, and Latin America.
A second feature of CLIL is that it involves foreign languages or lingua francas, English being
the predominant language in many mainstream educational contexts. In our understanding of the
label for the purpose of this chapter, CLIL does not include minority languages as a medium of
instruction. In the review of the literature we therefore adhere to this fundamental criterion and
review only studies that include foreign languages as the medium of instruction.
A third feature of the term is that school subjects in CLIL contexts are taught alongside foreign
language classes (e.g., English as a foreign language classes), which is why content subjects in CLIL
contexts are usually taught by content teachers, although in some countries, such as the Netherlands,
CLIL teachers may have been trained as both language and content teachers. However, this is not
the case in most European (and non-​European) countries. In fact, CLIL is usually taught by con-
tent teachers in secondary education, although in primary education it is mostly taught by language
specialists in some countries (e.g., Spain), as they are expected to have acquired an adequate com-
petence level in the target language through their degree studies while such competence cannot be
assumed in other degree subjects.




Finally, CLIL aims to teach the language of the disciplines and therefore the development of
language literacies represents one of its main objectives. Accordingly, CLIL research has sought
to investigate “the processes of how language and content are best taught, learnt and assessed in
integration” (Dalton-​Puffer, Llinares, Lorenzo, & Nikula, 2014, p. 216; emphasis in the original)
and this also applies to the skill of writing. Since integration is one of the main features of CLIL
programs at pre-​university level, we have not included university EMI (English-​medium instruc-
tion) programs, as the focus on language has been largely overlooked in teaching at tertiary level
(Lasagabaster, 2018).
The CLIL approach is believed to foster incidental language learning, to boost students’ for-
eign language learning motivation and self-​confidence, to trigger high levels of communication
in the target language, and more importantly, to improve overall language competence without
any detrimental effect on content learning or on L1 development (Coyle, Hood & Marsh, 2010;
Lasagabaster, 2008). The reasons for these positive results lie in the fact that CLIL, in comparison
to foreign language tuition, brings with it not only input-​related qualitative differences (since the
specific subject content –​e.g., astronomy –​is taught in a foreign language), but also output-​related
differences. As a result of the interest sparked in the research community due to the aforementioned
reported benefits of CLIL, in the last two decades there has been a steady increase in the number
of studies examining the linguistic results of this approach. In this chapter we exclusively focus on
those that pertain to the connection between CLIL and writing.

Historical Perspectives
Despite its relatively short history, CLIL has attracted great interest among practitioners and
researchers. It has experienced a rapid growth as an educational approach and we can confi-
dently predict that it will not be short-​lived. CLIL programs have been fueled from two ends,
high-​level policy making and grassroots initiatives. The idea of CLIL as an advantageous peda-
gogical approach was featured in the multilingual language policy initiatives taken in 1995 by the
European Language Council and the European Commission, which heralded change in the domain
of languages in education. That year, the White Paper on Teaching and learning: Towards the
learning society (European Commission, 1995) was issued in order to promote multilingualism
in a continent with 23 languages, and populations with mostly “a monolingual habitus” (Dalton-​
Puffer, 2011, p. 185). In this, “bilingual education” was the term used to refer to what today is
known as CLIL (European Commission, 2017, p. 55). Several goals were behind the initiative
of enhancing instruction via an additional language for future generations of European citizens,
while ensuring content learning: the promotion of linguistic diversity in Europe, while aiming
at mobility within the Schengen area, and internationalization and European citizenship for the
young generations.
This was in fact the “second time around” European institutions had taken a stand for the pro-
motion of languages in contemporary times. In the 1980s, the Council of Europe had commissioned
prominent applied linguists to design and develop what was called the “threshold levels,” the subse-
quent functional levels of language proficiency to be included in language planning across Europe.
Such seminal work paved the way de facto for the proposal of a functional-​semantic perspective
to syllabus design together with a new approach to language teaching, namely communicative lan-
guage teaching (CLT) (see, for example, Johnson, 1982). Communication in the classroom was
seen as the way to learning, and input, output, and interaction (Gass & Mackey, 2015), together
with negotiation of meaning activities (Long, 1996; Gor & Long, 2009) were considered the locus
of learning, in an attempt to mirror first language development in natural settings. Against such a
backdrop, tasks, projects, and, ultimately, immersion in the target language, that is, CLIL, came to
be considered the communicative options par excellence in pedagogical terms, given their focus on
meaning, rather than on linguistic form (Skehan, 1998).


High-level policy meetings attended by member states' educational stakeholders decided to adhere to the idea of CLIL, which was rapidly placed at the heart of educational policies.
From then onwards, CLIL was promoted in compulsory education. To that end, a number of
European-​funded CLIL projects were launched which developed CLIL as a new pedagogical
approach to education. Their output was meant to offer the educational community a specific
philosophy and methodology, teacher education guidelines and, most importantly, new teaching
materials (Pérez-​Vidal, 2015).
As already mentioned, in parallel to such top-down policies, grassroots CLIL initiatives were being undertaken, dovetailing with the aspirations of teachers, parents, institutions, and schools to gain an edge in the competition for employment, which requires better-educated cohorts with an adequate knowledge of languages and an international outlook. A quick look at the European Commission's Eurydice Report on education (2017, p. 58) shows that in all but five of the 28 European member states, CLIL provision was on offer in some schools, and that most
countries have officially endorsed CLIL initiatives, in particular Spain and the Netherlands. In those
same years, the Common European Framework of Reference (Council of Europe, 2001) and the
European Language Portfolio (Little & Perclovà, 2001) were issued by the Council of Europe in
order to homogenize language teaching and assessment throughout Europe. The scene had been set
for CLIL, or bilingual education, to take off in multilingual Europe, and so it did. Meanwhile in the
United States, Content-​Based language teaching had been in place in some schools for a number
of years, as a multi-​faceted educational approach integrating content and language, intended for
learners who were non-native speakers of the instructional language (Snow & Brinton, 1988).

Critical Issues
These days a substantial number of publications on writing in CLIL contexts, including journal art-
icles and books, are hosted in the main academic databases (SCOPUS and MLA have been used
as the main sources of references). In the current section we present and discuss the critical issues
addressed in them, following Polio and Lee’s (2017, p. 300) classification of research on writing,
and adapting it to our discussion of the specificities of CLIL written development. It must first be
noted that these authors adhere to the “writing-​to-​learn” approach advocated by Manchón (2011),
which understands that written production and instruction may facilitate general L2 development.
Within such an approach, Polio and Lee identify the following four most relevant and critical issues
in the study of writing: (a) contribution of writing activity to L2 learning and its interfaces with
other language skills; (b) the impact of corrective feedback on writing; (c) the effects of different
tasks or prompts on written production and learning; and, finally, (d) context effects on writing. It
is also worth mentioning that, as will be seen below, some of the issues tackled within such topic
areas are an integral part of the cognitivist interactionist model of language acquisition, and cog-
nitivist approaches, widely recognized within SLA research in the past decades (Gass & Mackey,
2015; DeKeyser, 2007).

Language Learning Through Writing


Concerning the first of Polio and Lee’s categories, i.e., the analysis of the linguistic benefits
accrued through written practice in CLIL settings, by and large significantly higher benefits are
yielded for CLIL as compared to non-​CLIL learners: they show a wider range of lexical and
morphosyntactic resources, deployed in more elaborate and complex structures; and greater
accuracy and pragmatic awareness at sentence level, but not at the discourse level, something
which one might not have assumed in the first place, given the CLIL focus on meaning (Dalton-​
Puffer, 2011, p. 186). Such analyses of written production have often adopted the complexity,
accuracy, and fluency (CAF) construct (see Gené-​Gil, Juan-​Garau, & Salazar-​Noguera, 2015;
Lahuerta, 2017a; Zydatiß, 2007), or a combination of such a construct and holistic scoring
profiles, such as Friedl and Auer’s (2007).
On the basis of such an advantage for the CLIL groups, a first issue in CLIL writing research
arises, concerning whether such benefits are the result of a focus-​on-​writing approach, or rather
improvement takes place implicitly and incidentally as a result of the work being done in CLIL
classrooms, and not necessarily with a focus on writing (Lyster, 2017). That is, in a focus-​on-​
writing approach, explicit attention is paid to the development of writing abilities, with learner
attention and awareness being drawn to it, a phenomenon described by Lyster (2017, p. 97) as
a proactive approach to CLIL. Indeed, a proactive approach in the form of preparing an out-
line before the task was found to positively and significantly impact writing in 9–10-year-olds in two CLIL programs (De Diezmas, 2016), as did a functional linguistics approach
(Whittaker & Acevedo, 2016), discussed in detail in the following section. As for the interfaces of
writing and other skills, Whittaker and McCabe (2020) report on an increase in lexical density and
in the use of grammatical metaphors typical in History, i.e., nominalizations and abstract nouns,
when analyzing CLIL 7–​10 graders’ compositions longitudinally, following up on work such as
Llinares and Whittaker (2010).

The Effect of Corrective Feedback and Tasks or Prompts on Writing


Studies do not abound regarding the second and third critical issues identified above, both central
in the cognitive interactionist model, and hence presented together here. The few studies that exist
have delved into task effects in the development of writing skills in CLIL settings. They mostly
analyze compositions involving narratives (Gené-​Gil et al., 2015; Roquet & Pérez-​Vidal, 2017).
Others measure progress, or lack thereof, when using non-​creative tasks, such as copying a text
fragment from a course book, writing down ideas presented orally, i.e., answers to closed questions
in a lesson (Vallbona, 2014), or other types of written tasks such as note-​taking, and reading and dis-
cussion (Van Gorp & Van den Branden, 2015). The latter study appeared in a noteworthy special issue
of System (García-​Mayo, 2015), on the interfaces between CLIL and task-​based-​language teaching,
also including an analysis of biliteracy tasks across languages and content areas, in French and
English language arts, social studies, and science classes (Lyster, 2015).

Context Effects on Writing


As for the fourth category proposed in Polio and Lee (2017), i.e., context effects, we find, on the
one hand, studies which compared CLIL with other learning contexts, in particular formal instruc-
tion (FI) and study abroad (Valls-​Ferrer, Roquet, & Pérez Vidal, 2014). On the other hand, research
has also compared different types of CLIL subjects, such as History or Arts and Crafts. Regarding
the former, i.e., the contrasting effects of different contexts, it is worth remembering that, most
typically, the target language adopted as the medium of instruction in CLIL subjects is also taught
in conventional FI (the traditional English as a foreign language classroom) in parallel to the CLIL
course, as an integral part of the curriculum. In this respect, the designs of most research studies
have included a comparison or control group which does not have the CLIL subjects and only
experiences the FI course, either matching them for the number of hours of instruction, to counter-
balance the time advantage, or for age. In all cases, research has examined what, how, when, and
why foreseeable progress occurs in the CLIL groups, as further discussed below.
Regarding the second comparison (i.e., that of the contrasting impact of different CLIL subjects),
it must be underscored that CLIL programs are not a uniform context in terms of intensity, that is,
the number of CLIL subjects offered or the hours of instruction of such subjects in the L2. They
can range from including one or two curricular subjects taught through the learners’ additional
language, to an entire curricular program, as for example the Bachibac, a baccalaureate program
taught entirely through French in some European member states. The differential effects of the intensity of CLIL programs on written development have received very little attention thus far. Moreover, a further distinction should be made regarding
whether subjects are of a very practical nature (i.e., Physical Education, Arts and Crafts) or involve
academic language and conceptualization (i.e., Philosophy, History), an issue discussed in the last
section of this chapter.

Current Contributions of Research


The studies on L2 writing in CLIL contexts can be divided into four main groups: (a) those focused
on L2 development that mainly gauge the three CAF traits of complexity, accuracy, and fluency;
(b) those that explore the impact of CLIL on L1 writing skills in monolingual settings, and on L1
and L2 writing skills in bilingual contexts; (c) those that approach the integration of language and
content from an academic written discourse perspective and that examine CLIL students’ “devel-
opment in uses of language much closer to the concerns of content learning” (Morton & Llinares,
2018, p. 497); and (d) those that scrutinize the processes through which texts are produced rather
than the final written version. We will now proceed to review representative examples of these four
trends.
The studies that fall within the first CAF group are more numerous (which is why more space
is devoted to them) and tend to show that CLIL students outperform their non-​CLIL counterparts,
although different factors such as the students’ age, or the number of years and the intensity of CLIL
programs, as mentioned above, need to be considered. Gené-​Gil et al.’s (2015) longitudinal study
is worth mentioning as it compared Catalan/​Spanish CLIL and non-​CLIL groups of equivalent lan-
guage competence at pre-​test stage (in terms of both within-​group and between-​group differences)
and at four times during three years. As for within-​group differences, non-​CLIL students displayed
significant differences only in accuracy and lexical complexity, whereas CLIL participants’ scores
showed significant differences in six out of eight CAF measures. Between-​group comparisons
were carried out with the same hours of exposure for both groups, so that differences could not
be attributed to more hours of exposure of CLIL students. The results revealed that non-​CLIL
students progressed significantly more in lexical complexity, but CLIL students obtained higher
scores overall.
Other studies within the first group identified above reveal that fluency, accuracy, syntactic,
and lexical complexity progress in tandem and that, whenever younger and older CLIL students
are compared, the older students outperform their younger counterparts in all measures. Different
studies have also confirmed that CLIL secondary education students outscore non-CLIL students who are a year older on different writing measures (Lahuerta, 2017b; Lasagabaster, 2008), which
might indicate that CLIL students’ writing skills develop at a faster pace than those of non-​CLIL
students, although it has to be acknowledged that in these studies the CLIL groups’ exposure
was greater and so was –​more than likely–​their writing practice (although the latter issue is not
mentioned by the authors).
However, not all studies that have looked at CAF measures have come to such positive
conclusions about the effects of CLIL on the development of writing. Roquet and Pérez-​Vidal
(2017) elicited written samples from grade 7 and 8 students at two different times over one aca-
demic year. Their compositions were quantitatively (CAF) and qualitatively (task fulfilment, organ-
ization, grammar and vocabulary) assessed, something which makes this study one of the few that
combines quantitative and qualitative measures. Statistically significant differences were found
only in accuracy measures in favor of CLIL students. However, the lack of differences in the other
measures indicates that one year may not be sufficient for the CLIL students to improve in all the
domains of writing, a conclusion also shared by Merino and Lasagabaster (2018). Similarly, in primary education, Agustín-Llach (2016) found no significant differences between CLIL and non-CLIL groups in a three-year longitudinal study.
Students were invited to complete a letter-​writing task and, although CLIL students performed
slightly better in lexical development, the differences were not statistically significant. The author
concluded that the young age and the low proficiency of the participants may be the reason for the
lack of CLIL advantage that researchers have found in other contexts.
Jexenflicker and Dalton-​Puffer (2010) also found a more limited impact of CLIL on written
measures. They analyzed the written work of CLIL and non-​CLIL groups at two higher technical
colleges in Austria. Although CLIL students outperformed their non-​CLIL peers in overall scores,
on closer inspection results reflected a more complex picture. CLIL students were significantly
better in the area of lexico-​grammar, vocabulary range, and orthography, but no differences were
detected in discourse competence and textual organization.
A point to bear in mind is that the majority of the written tasks used in the studies summarized above were not content specific and did not require specialized vocabulary. Olsson (2015)
investigated the effect of CLIL on English academic vocabulary use in four written assignments
over three years in Swedish upper secondary education (students aged 16–​19). After controlling
for between-​groups comparability, the progression of academic vocabulary use between CLIL and
non-​CLIL groups did not show any significant difference which led the author to conclude that
“attention is not paid to general academic vocabulary in CLIL to such an extent that CLIL students’
productive academic vocabulary develops more compared with students in regular education”
(Olsson, 2015, p. 67). This is an area that undoubtedly deserves further attention.
In short, although there are some contradictory findings, CLIL seems to have a positive impact
on CAF measures.
Now we turn to the second group, that made up of studies on the impact of CLIL on L1 writing
skills. In a six-​year longitudinal study, Merisuo-​Storm and Soininen (2014) analyzed both L1 and
L2 writing skills in CLIL contexts and found that primary education Finnish CLIL students were not
only better in L2 writing skills, but they also outperformed their non-​CLIL counterparts in their L1
spelling skills. In the German context Gebauer, Zaunbauer, and Möller (2012) similarly concluded
that CLIL students’ L1 writing skills were not negatively affected by being schooled in English
in primary education. In officially bilingual contexts such as Galicia (San Isidro & Lasagabaster,
2019) and the Basque Country (Merino & Lasagabaster, 2018) in Spain, results also indicate that
CLIL students’ writing in their L1 and L2 (English being the L3) showed no significant differences
when compared to non-​CLIL counterparts. Therefore, CLIL seems to have no detrimental impact
in educational contexts where two languages (the national or majority language and English) or
even three languages (the majority, the minority and English) are used as the means of instruction.
Within the third group of studies on the CLIL/​writing relationship mentioned above, much
research has been recently carried out in the areas of genre and textual resources. Lorenzo and
Rodríguez (2014) relied on a corpus formed by 244 historical narratives to explore the evolution
of complex syntax and cohesion among CLIL students from grade 9 to grade 12. Lexical growth
was observed at early stages, but syntactic growth was only significant during the final year. Thus,
whereas CLIL students initially produced texts characterized by a lack of dependent clauses, T-​
units and coordinate phrases, their written productions were consolidated in higher grades, “bearing
witness to the acquisition of historical literacy” (Lorenzo & Rodríguez, 2014, p. 70).
Whittaker and McCabe (2020), Whittaker, Llinares, and McCabe (2011) and Llinares, Morton,
and Whittaker (2012) delved into written discourse in secondary education History classes and
did so from a Systemic Functional Linguistics perspective. These studies brought to light CLIL
students’ improvement in the control of textual resources and increase in nominal group complexity
over a four-​year period. The analysis of the students’ written production allowed them to conclude
that CLIL fosters the development of an academic register in the area of textual coherence. These
authors suggest that CLIL settings that focus primarily on the learning of content provide suitable contexts in which to develop written discourse, since the participating students were able to draw
on a solid knowledge base from which to generate their texts. However, they also claim that an
explicit focus on the features of academic written discourse and their functions would help CLIL
students to produce much more developed written texts. We will elaborate on this line of research in
the “Recommendations for practice” section due to its relevance for the connection between writing
and language learning.
The fourth and final group encompasses those studies focused on processes rather than final
written products, whose supporters claim that writing should be studied “as a phenomenon that
happens in and through social interaction” (Jakonen, 2019, p. 429). These studies represent a shift
from examining final products or texts to the processes through which texts are produced as a
result of peer talk. Jakonen (2019) analyzed selected peer interactions as a social process among
students (aged 14–​15 years) participating in a small-​scale CLIL program (two History lessons in
English per week) and observed that the source text played a paramount role when completing
written tasks collaboratively. Students’ L1 (Finnish) was heavily relied on for making sense of
the source text and for their task completion. In their formulations, students strove to get both
language and content right and this integration fostered complex processes that went beyond the
interface of form and meaning, as students’ interactions involved shifts back and forth between
content and language. Basterrechea, García Mayo, and Leeser (2014) focused on both processes
and final products by investigating how written output affected noticing in a secondary education
CLIL context. Present and past tense markers were analyzed while students completed a dictogloss
collaboratively and individually. The number of correct instances of the present form (albeit not of
the past form) increased significantly between the first and the second reconstructions, pointing to
a positive impact of the output-​input-​output cycle and to the need to include writing activities that
draw learners’ attention to specific grammatical features.
In summary, the studies focused on CAF measures (Group 1) seem to indicate that CLIL
produces more positive results in writing in secondary education (Gené-​Gil et al., 2015) than in pri-
mary education (Agustín-​Llach, 2016). However, the time factor needs to be considered as one-​year
longitudinal studies (Roquet & Pérez-​Vidal, 2017) do not show such positive results as three-​year
longitudinal studies (Gené-​Gil et al., 2015), although not all measures benefit to the same extent
(Jexenflicker & Dalton-​Puffer, 2010; Olsson, 2014). The success of CLIL programs may therefore
depend on the number of years and the intensity of such programs, as significant improvement in
CLIL students’ writings has not been observed in a one-​year period (Merino & Lasagabaster, 2018;
Roquet & Pérez-​Vidal, 2017). Group 2 studies concur that the CLIL approach has no negative
effect on the development of writing skills in the L1 in monolingual contexts (Gebauer et al., 2012;
Merisuo-​Storm & Soininen, 2014), and in both the L1 and L2 in bilingual contexts in which the for-
eign/​CLIL language represents the L3 (Merino & Lasagabaster, 2018; San Isidro & Lasagabaster,
2019). Group 3 studies confirm that CLIL students acquire disciplinary literacy (Lorenzo &
Rodríguez, 2014), but that an explicit focus on the features of academic written discourse is needed
and that more attention should be paid to the articulation of connections between the different
school subjects and the learning and teaching of academic writing (Llinares et al., 2012; Whittaker
et al., 2011). Finally, Group 4 studies (Basterrechea et al., 2014; Jakonen, 2019) reveal that learners’
collaboration when performing written tasks triggers complex processes that positively affect their
written texts and relevant language learning processes (e.g., noticing), although the need to draw
learners’ attention to specific grammatical features is underscored by researchers.

Main Research Methods


In our analysis of the main methodological features of CLIL studies, we will once again rely on the
aforementioned four-​group division.


In the case of Group 1 studies (those focused on CAF measures), longitudinal studies predominate, which speaks to the robustness of their findings. However, and despite the increasing number of CLIL programs in primary education (e.g., in Spain), there is an evident skew towards secondary education participants (13 out of the 16 studies reviewed in the previous section). Additionally, these studies do
not provide a fine-​grained analysis of learners’ writing, as CAF measures do not fully characterize
their writing texts. This is why CLIL researchers should complement them with a range of holistic
measures such as content and organization so that we can achieve a more complete understanding
of L2 writing (Roquet & Pérez-​Vidal, 2017). As for instruments and tasks, researchers tend to
collect data through timed compositions about general topics (e.g., write an email to an English
friend), but these tasks may not be best suited to letting CLIL learners display their full potential in the area that should presumably benefit most from this approach, namely disciplinary writing. Researchers should design tasks more closely tuned to the language used in CLIL classes. Finally, statistical analyses are an inherent feature of this type of quantitative research, but many studies do not provide effect size values. If researchers' goal is to determine how effective CLIL is, they must report and interpret effect sizes. Null hypothesis significance testing (e.g., p < .05) is not enough to adequately address CAF-related research questions; effect size measures provide the remedy, since they describe the magnitude of the effect of CLIL on writing.
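As a purely illustrative example (the figures below are hypothetical and not drawn from any of the studies reviewed), a standardized between-group effect size such as Cohen's d is obtained by dividing the difference between the two group means by the pooled standard deviation:

    d = (M_CLIL − M_non-CLIL) / SD_pooled
    e.g., d = (0.72 − 0.65) / 0.11 ≈ 0.64

By conventional benchmarks (roughly 0.2 = small, 0.5 = medium, 0.8 = large), such a difference would represent a medium-sized CLIL advantage, a piece of information that a p value alone cannot provide.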
Group 2 studies, those dealing with L1 writing development in CLIL contexts, broadly share the
methodological features described with regard to Group 1 studies, since they are mainly longitudinal studies carried out in secondary education through timed compositions about general topics. However, some studies do report effect sizes in their analyses (e.g., Merino & Lasagabaster, 2018), which helps to determine the strength of the claims made by researchers. Since the precision of an effect size estimate increases with sample size, readers can be more confident about results obtained from larger samples. Reporting descriptive statistics in detail is especially important when working with small samples, but this information is not always as complete as it should be in Group 1 and Group 2 studies.
Group 3 studies hinge on large corpora of data gathered from students’ written assignments,
mainly in secondary education. Unlike groups 1 and 2, the data collected comes from a number
of tasks that are closely related to the specific CLIL subject syllabus. However, they coincide
with groups 1 and 2 in that the tasks are habitually carried out within a time limit. Strikingly,
much research is focused on History, whereas other CLIL school subjects are underrepresented.
Researchers tend to build on the theoretical perspective provided by Systemic Functional
Linguistics with a view to scrutinizing the development of the academic register required for
successful writing in the school discipline concerned. Research in this group also adopts a longitudinal perspective.
Group 4 studies, those centered on the writing process, are limited in number and scope and
much remains to be done. In contrast to the other three groups, they are cross-​sectional. Participants
are enrolled in secondary education (in our review we did not come across any study undertaken
in primary education). Some of the methodological issues that require further attention include
the following: (a) the role played by learners' L1, since studies have shown that L2 writers find it helpful to resort to their L1 when it comes to verifying lexical choices and completing tasks
(Jakonen, 2019; Manchón, Murphy, & Roca, 2007); (b) the strategies learners use and whether they
are influenced by the specific CLIL disciplinary language; and (c) the investigation of learners’
writing processes through introspective and retrospective methods such as think aloud protocols
and stimulated recall (Polio, 2012).
To conclude, it has to be noted that, of the eight types of methods into which Polio (2012) classifies empirical studies of L2 writing, only four have been used in the CLIL literature (content analysis, text analysis, process research, and classroom observation), while the remaining four (surveys, interviews, meta-analysis, and ethnography) remain unexplored.


Recommendations for Practice


As mentioned above, CLIL teacher education and pedagogy have been at the center of CLIL
research since CLIL initiatives began in the mid-​1990s. In this section we first review a general
pedagogical proposal for the adequate implementation of CLIL programs. This is followed by a more specific proposal concerning the language dimensions at play in CLIL writing pedagogy. Next, we present an approach that advocates teaching and learning the linguistic features, or registers, of written texts in both scientific and humanities curricular subjects. Finally, we focus on teacher collaboration within CLIL programs.
In the early days of CLIL a number of European-​funded projects were devoted to the devel-
opment of a pedagogic proposal for the CLIL approach and delivered reports which included
recommendations for teachers, for teacher training pedagogies, for materials production, and
for stakeholders (the ALPME, the CLIL Compendium, the CLIL Cascade Network, the TIE-​
CLIL, among others). The CLIL Compendium was a case in point, breaking ground for the understanding of the components of the approach: a volume written in five different languages, which identified the five core dimensions that should be included in any CLIL program, with local variations according to the weight given to each dimension: the learner (LENTIX), the language (LANTIX), cognition (CONTIX), the environment (ENTIX), and the culture (CULTIX) (Marsh, Maljers, & Hartiala, 2001; Grenfell, 2002). Within LANTIX, often the dimension carrying the greatest weight in most programs, writing was regarded as a central component.
As years went by, pedagogues came to identify teachers’ needs more realistically and to develop
the CLIL paradigm vis-​à-​vis language in a more sophisticated manner. A recent pedagogical frame-
work proposes a language triptych which distinguishes between: (a) the language of learning,
required by learners to express key aspects of content and to deepen their understanding of concepts
and skills related to a specific topic, such as specialized language, and subject-​related expressions
and grammar, to be used in their writing; (b) the language for learning, needed to participate in
written tasks and activities in a CLIL environment; and (c) the language through learning, needed
to meet the challenge of expressing meanings related to content, in order to build, organize, and
formulate meaning so that language learning takes place in a deeper and more personalized way,
while learners develop their writing skills (Coyle, Hood, & Marsh, 2010).
In an attempt to comprehend the linguistic challenges posed by CLIL pedagogy, a concern for
how to develop subject-specific written abilities came to the fore. Llinares et al.'s (2012) research on CLIL writing, based on a large corpus gathered in different European contexts (Austria, Finland, Spain, and the Netherlands), aimed to put forward strategies that CLIL teachers might employ in
order to develop their students’ L2 writing. These authors propose a framework aimed at presenting
the linguistic features that students need in order to carry out different tasks and to structure texts
both in Science and in History. The framework should help both teachers and learners in the task of
developing writing genres by making use of the appropriate register features for different curricular
subjects. This should enable learners to progress in both content understanding and meaning-​
making as well as to develop more sophisticated language skills. In this way the double focus on
language forms and content learning will actually be reflected in CLIL students' writing.
Finally, empirical research in CLIL pedagogy has focused on some of the features of the imple-
mentation and practice of CLIL programs (Escobar-​Urmeneta, 2019). To start with, it has always
been claimed that CLIL programs require and facilitate cooperation amongst teachers (Escobar-​
Urmeneta, 2010; Vallbona, 2014). Cooperation is envisaged in particular between content teachers and teachers of the target language (Salsbury, 2011). However, CLIL programs may also foster collaboration among teachers in charge of the different languages in the school curriculum, particularly in relation to the more academic literacy skills of reading and, above all, writing, where, minor differences aside, strategy building should be a cross-curricular ability dealt with in collaboration across languages, as discussed in the following section. This is not surprising, as one of its major initial influences was, in fact, the "languages across the curriculum" movement
which emerged in the 1980s and 1990s in the United Kingdom (Grenfell, 2002). The need to boost
language-​teacher cooperation is underpinned by the evident parallels between L1 and L2 academic
writing development (Lorenzo & Dalton-​Puffer, 2016).

Future Directions
To close this chapter, a brief indication of future directions for the CLIL writing research agenda
is offered. There are several key issues around which research is practically non-​existent and more
efforts are needed to tackle them. Some are external to the learner and are related to the architecture
of the CLIL program, whereas others are internal to the individual learner, although also established
by the program. CLIL programs can differ with respect to the following three factors: (a) length in combination with intensity, that is, the number of CLIL subjects included in the program; (b) the curricular CLIL subject chosen and the written component in it; and (c) starting age. In CLIL research on
writing we are still far from being able to make any strong claims about how CLIL intensity affects
the development of students’ writing abilities, whether some CLIL subjects are more beneficial than
others when it comes to writing, whether the implementation of CLIL at an early age does in fact
improve learners’ writing, or whether it is better to start at a later age once students have already
achieved a minimum level of proficiency in the foreign language, but then in a more intensive manner, with a "proactive approach" that takes into account the specificities of different types of content.
Indeed, regarding length and intensity, research needs to focus on how progress in writing is affected by programs that vary in length and in the number of CLIL subjects they include, that is, in intensity. As already summarized, a number of studies thus far have attributed
lack of benefits in written abilities to short length and low intensity experiences, as often just one
CLIL subject is taught on CLIL programs (e.g., Pérez-​Vidal & Roquet, 2015).
As for comparisons of different curricular subjects, their pedagogical approach to writing, and
subsequent written progress, different subjects involve different degrees of abstraction and often
have a lighter written component because they are more practical and hands-on (contrast, say, Physical Education or Arts and Crafts with History or Philosophy). It has been argued that less
abstract subjects, taxing learners with a lower cognitive load, are recommended at the initial stages
in any program, and probably at earlier ages, while subjects such as those in the humanities, with
a more robust written component, should arrive later on; however, empirical evidence on such an
idea and the effect on writing is still missing. Regarding the relation between writing in CLIL and
language learning, CLIL writing pedagogy has made a unique contribution to the development of
disciplinary writing, mainly concentrating on the History curriculum (Llinares et al., 2012). Other
areas need to be examined in order to pin down the specificities of the different written genres.
Concerning learner factors considered in the design of CLIL programs, starting age and its
effects on progress in written abilities have received scarce attention. Artieda, Roquet, and Nicolás-Conesa (2017) found that "the earlier, the better" does not always hold: when matched for hours, older learners in formal instruction are significantly better than younger CLIL learners in writing. When matched for age, hours of exposure have a significantly positive impact. Lorenzo et al. (2009) also found that later-start learners show competences similar to those of early-start learners. However, further evidence is needed
on this issue to be able to generalize. Finally, as Dalton-​Puffer (2011) has claimed, CLIL students’
L1 writing has not been found to surpass L2 writing, something which has led her to question
whether there exists “a general level of writing development that has an impact on how learners
deal with a writing task, independently of whether it is in the L1 or the L2” (p. 187).
In sum, there is no doubt that the analysis of how writing skills develop in CLIL approaches is
an avenue of research well worth pursuing, perhaps not in isolation but rather with writing understood as a cognitively demanding task regardless of language (L1 or L2).


References
Agustín-Llach, M.P. (2016). Age and type of instruction (CLIL vs. traditional EFL) in lexical development.
International Journal of English Studies, 16, 75–​96.
Artieda, G., Roquet, H., & Nicolás-​Conesa, F. (2017). The impact of age and exposure on EFL achievement
in two learning contexts: Formal instruction and formal instruction + content and language integrated
learning (CLIL). International Journal of Bilingual Education and Bilingualism, https://​doi.org/​10.1080/​
13670050.2017.1373059
Basterrechea, M., García-​Mayo, M.P., & Leeser, M.J. (2014). Pushed output and noticing in a dictogloss: Task
implementation in the CLIL Classroom. Porta Linguarum, 22, 7–​22.
European Commission (1995). Teaching and learning: Towards a learning society, 449. White Paper on
Education and Learning. Brussels: DGV.
European Commission (2017). Key data on teaching languages at school in Europe –​2017 Edition. Brussels.
Council of Europe (2001). Common European framework of reference for languages: Learning, teaching,
assessment. Cambridge: Cambridge University Press.
Coyle, D., Hood, P., & Marsh, D. (2010). CLIL –​content and language integrated learning. Cambridge:
Cambridge University Press.
Dalton-​Puffer, C. (2011). Content-​and-​language integrated learning: From practice to principles? Annual
Review of Applied Linguistics, 31, 182–​204.
Dalton-​Puffer, C., Llinares, A., Lorenzo, F., & Nikula, T. (2014). “You can stand under my umbrella”: Immersion,
CLIL and bilingual education. A response to Cenoz, Genesee & Gorter (2013). Applied Linguistics, 35,
213–​218.
De Diezmas, E. (2016). The impact of CLIL on the acquisition of L2 competence skills in primary education.
International Journal of English Studies, 16(2), 81–​101.
DeKeyser, R. (2007). Practice in a second language: Perspectives from applied linguistics and cognitive
psychology. Cambridge: Cambridge University Press.
Escobar Urmeneta, C. (2010). Pre-​ service CLIL teacher-​ education in Catalonia: Expert and novice
practitioners teaching and reflecting together. In D. Lasagabaster & Y. Ruiz de Zarobe (Eds.), CLIL in
Spain: Implementation, results and teacher training (pp. 188–​218). Newcastle upon Tyne: Cambridge
Scholars Publishing.
Escobar Urmeneta, C. (2019). An introduction to content and language integrated learning (CLIL) for teachers
and teacher educators. CLIL Journal of Innovation and Research in Plurilingual and Pluricultural
Education, 2(1), 7–​19.
Friedl, G., & M. Auer. (2007). Rating scale used for assessment of the writing task. Erläuterungen zur
Novellierung der Reifeprüfungsverordnung fur AHS, lebende Fremdsprachen. BIFIE. Wien –​St. Pölten.
(Gültig ab dem Sommertermin 2009). Retrieved from www.bifie.at/​publikationen
Gass, S., & MacKey, A. (2015). Interaction approaches. In B. Van Patten & J. Williams (Eds.), Theories in
second language acquisition (pp. 180–​206). London: Routledge.
García-​Mayo, M. P. (Ed.). (2015). Special Issue: The interface between task-​based language teaching and
content-​based instruction. System, 54.
Gebauer, S., Zaunbauer, A.C.M., & Möller, J. (2012). Erstsprachliche Leistungsentwicklung im
Immersionsunterricht: Vorteile trotz Unterrichts in einer Fremdsprache? (First-​ language performance
development in immersion programs: Advantages despite instruction in a foreign language?) Zeitschrift fur
Pädagogische Psychologie, 26, 183–​196.
Gené-​Gil, M., Juan-​Garau, M., & Salazar-​Noguera, J. (2015). Development of EFL writing over three years in
secondary education: CLIL and non-​CLIL settings. The Language Learning Journal, 43, 286–​303.
Gor, K., & Long, M.H. (2009). Input and second language processing. In W.C. Ritchie & T.K. Bathia (Eds.),
The new handbook of second language acquisition (2nd ed., pp. 445–​467). Bingley: Emerald Group
Publishing Limited.
Grenfell, M. (Ed.). (2002). Modern languages across the curriculum. London: Routledge.
Jakonen, T. (2019). The integration of content and language in students’ task answer production in the bilingual
classroom. International Journal of Bilingual Education andBilingualism, 22, 428–​444.
Jexenflicker, S., & Dalton-​Puffer, C. (2010). The CLIL differential: Comparing the writing of CLIL and non-​
CLIL students in higher colleges of technology. In C. Dalton-​Puffer, T. Nikula, & U. Smit (Eds.), Language
use and language learning in CLIL classrooms (pp. 169–​189). Amsterdam: John Benjamins.
Johnson, K. (1982). Communicative syllabus design and methodology. Oxford: Pergamon.
Lahuerta, A.C. (2017a). Analysis of accuracy in the writing of EFL students enrolled on CLIL and non-​CLIL
programmes: The impact of grade and gender. Language Learning Journal, 8, 1–​12.
Lahuerta, A.C. (2017b). Syntactic complexity in secondary-​level English writing: Differences among writers
enrolled on bilingual and non-​bilingual programmes. Porta Linguarum, 27, 67–​80.


Lasagabaster, D. (2008). Foreign language competence in content and language integrated courses. The Open
Applied Linguistics Journal, 1, 31–​42.
Lasagabaster, D. (2018). Fostering team teaching: Mapping out a research agenda for English-​medium instruc-
tion at university level. Language Teaching, 51, 400–​416.
Little, D., & Perclovà, R. (2001). European language portfolio: Guide for teachers and teacher trainers.
Strasbourg: Council of Europe.
Llinares, A., Morton, T., & Whittaker, R. (2012). The roles of language in CLIL. Cambridge: Cambridge
University Press.
Llinares, A., & Whittaker, R. (2010) Writing and speaking in the history class: A comparative analysis of CLIL
and first language contexts. In C. Dalton-​Puffer, T. Nikula, & U. Smit (Eds.), Language use and language
learning in CLIL classrooms, (pp. 125–​144). Amsterdam: John Benjamins.
Long, M.H. (1996). The role of the linguistic environment in second language acquisition. In W. Ritchie
& T.K. Bhatia (Eds.), Handbook of second language acquisition (Vol. 2) (pp. 413–​468). San Diego,
CA: Academic Press.
Lorenzo, F., Casal, S., & Moore, P. (2009). The effects of content and language integrated learning in European
education: Key findings from the Andalusian bilingual sections evaluation project. Applied Linguistics, 3,
418–​442.
Lorenzo, F., & Dalton-​Puffer, C. (2016) Historical literacy in CLIL: Telling the past in a second language. In
T. Nikula, E. Dafouz, P. Moore, & U. Smit (Eds.), Conceptualising integration in CLIL and multilingual
education (pp. 73–​91). Bristol: Multilingual Matters.
Lorenzo, F., & Rodríguez, L. (2014). Onset and expansion of L2 cognitive academic language proficiency in
bilingual settings: CALP in CLIL. System, 47, 64–​72.
Lyster, R. (2015). Using form-​focused tasks to integrate language across the immersion curriculum, System,
54, 4–​13.
Lyster, R. (2017). Content-​based language learning. In S. Loewen & M. Sato (Eds.), The Routledge handbook
of instructed second language acquisition (pp. 87–​108). New York: Routledge.
Manchón, R. (Ed.). (2011), Learning to write and writing to learn in an additional language. Amsterdam: John
Benjamins.
Manchón, R., Murphy, L., & Roca, J. (2007) Lexical retrieval processes and strategies in second language
writing: A synthesis of empirical research. International Journal of English Studies, 7, 149–​174.
Marsh, D., Maljers, A., & Hartiala, A. (2001). The CLIL compendium. Profiling European CLIL classrooms.
Jyväskylä: University of Jyväskylä.
Merino, J.A., & Lasagabaster, D. (2018). CLIL as a way to multilingualism. International Journal of Bilingual
Education and Bilingualism, 21, 79–​92.
Merisuo-​Storm, T., & Soininen, M. (2014). Students’ first language skills after six years in bilingual education.
Mediterranean Journal of Social Sciences, 22, 72–​81.
Morton, T., & Llinares, A. (2018). Students’ use of evaluative language in L2 English to talk and write about
history in a bilingual education programme. International Journal of Bilingual Education and Bilingualism,
21, 496–​508.
Olsson, E. (2015). Progress in English academic vocabulary use in writing among CLIL and non-​CLIL students
in Sweden. Moderna Språk, 109, 51–74.
Pérez-​Vidal, C. (2015). Languages for all in education: CLIL and ICLHE at the crossroads of multilingualism,
mobility and internationalisation. In M. Juan-​Garau & J. Salazar-​Noguera (Eds.), Content based language
learning in multilingual educational environments (pp. 105–​121). Berlin: Springer.
Pérez-​Vidal, C., & Roquet, H. (2015). CLIL in context: Profiling language abilities. In M. Juan-​Garau & J.
Salazar-​Noguera (Eds.), Content based language learning in multilingual educational environments (pp.
237–​254). Berlin: Springer.
Polio, C. (2012) How to research language writing. In A. Mackey & S. M. Gass (Eds.), Research methods in
second language acquisition (pp. 139–​155). Chichester: Wiley-​Blackwell.
Polio, C., & Lee, J. (2017). Written language learning. In S. Loewen & M. Sato (Eds.), The Routledge hand-
book of instructed second language acquisition (pp. 299–​319). New York: Routledge.
Roquet, H., & Pérez-​Vidal, C. (2017). Do productive skills improve in content and language integrated learning
(CLIL) contexts? The case of writing. Applied Linguistics, 38, 489–​511.
Salsbury, T. (2011). Teaching English through content areas. In H. Puji-​Widodo & A. Cirocki (Eds.), Innovation
and creativity in ELT methodology (pp. 173–​184). New York: Nova Science Publishers.
San Isidro, X., & Lasagabaster, D. (2019) The impact of CLIL on pluriliteracy development and content
learning in a rural multilingual setting: A longitudinal study. Language Teaching Research, 23, 584–​602.
Skehan, P. (1998). Task-​based instruction. Annual Review of Applied Linguistics, 18, 268–​286.


Snow, M.A., & Brinton, D.M. (1988). Content-​based language instruction: Investigating the effectiveness of
the adjunct model. TESOL Quarterly, 22(4), 553–​574.
Vallbona, A. (2014). L2 Competence of young language learners in science and arts CLIL and EFL instruction
context. A longitudinal study (PhD dissertation). Universitat Autònoma, Barcelona.
Valls-​Ferrer, M., Roquet, H., & Pérez-​Vidal, C. (2014). The effects of practice in different contexts of lan-
guage acquisition: Contrasting FI, SA and BLC. In Actas del XXIV Congreso Internacional de AESLA.
Aprendizaje de lenguas, uso del lenguaje y modelación cognitiva: Perspectivas aplicadas entre disciplinas
(pp. 219–​228). Ciudad Real: Ediciones de la Universidad de Castilla-​La Mancha.
Van Gorp, K., & Van den Branden, K. (2015). Teachers, pupils and tasks: The genesis of dynamic learning
opportunities. System, 54, 28–​39.
Whittaker, R., Llinares, A., & McCabe, A. (2011). Written discourse development in CLIL at secondary school.
Language Teaching Research, 15, 343–​362.
Whittaker, R., & McCabe, A. (2020). Writing on history in a content and language integrated learning
(CLIL) context: Development of grammatical metaphor and abstraction as evidence of language learning.
In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 309–​332).
Amsterdam: John Benjamins.
Zydatiß, W. (2007). Deutsch-Englische Züge in Berlin (DEZIBEL). Eine Evaluation des bilingualen Sachfachunterrichts in Gymnasien: Kontext, Kompetenzen, Konsequenzen [German-English streams in Berlin (DEZIBEL). An evaluation of bilingual content teaching in grammar schools: Context, competences, consequences]. Frankfurt am Main: Peter Lang.

19
L2 WRITING IN STUDY-​ABROAD
CONTEXTS
Cristi Vallejos and Cristina Sanz
Cedarville University and Georgetown University

Introduction
It is a commonly held belief that the best way to learn a language is through immersion. While
this may often be the case, both research and experience have shown that merely being in another
country where the language is spoken does not automatically lead to language gains (see review by
Isabelli-​García, Brown, Plews, & Dewey, 2018). The continuous growth in the number of students
enrolling in study abroad (SA) programs worldwide, reaching numbers as high as 5 million in
2014 (University of Oxford, 2017), justifies the need for research on the external and individual
factors that contribute to language development in the study abroad context. Among the gaps in the
research are studies on the development of writing skills during an educational stay abroad.
Study abroad has previously been defined as “a temporary sojourn of pre-​defined duration, under-
taken for educational purposes” (Kinginger, 2009, p. 11) though the recent expansion in programs
offered has led to changes in program characteristics as many students now go abroad for non-​credit
work, internships, volunteering, or research in another country (Institute for International Education,
2018). However, the definition given by Kinginger (2009) is still an appropriate definition to
examine SA as a context, different from the foreign language classroom, where the development of
second language (L2) writing occurs. It warrants mentioning that not all students study abroad with
the purpose of learning another language. In fact, the UNESCO Institute for Statistics (n.d.) identifies three main trends in the flow of students: 1) students who go to a country where their first language (L1) is one of the official languages; 2) students from non-Anglophone countries who study in a country where English is one of the official languages; and 3) students who choose to study in a country where the language is not the same as their L1. Isabelli-García et al. (2018) add a fourth
trend “of heritage seekers, who study abroad because of an ethnic, (ethno)religious, linguistic, or
national connection to a specific ancestral country or region” (p. 441). The focus of the present
chapter will be on students who go to a country where the language spoken is different from their
L1. Due to the prominence of English in the international sphere, most of the interest in L2 writing in SA contexts falls within the study of English as an additional language (e.g., Sasaki, 2004).
SA programs that have the students’ language development as a pedagogical goal can vary
widely in terms of length, courses offered (language courses or content courses), type of matricula-
tion (classes with other foreign students or direct matriculation), living arrangements (host families
or dormitories), and contact with the local community; this variation has the potential to differen-
tially affect L2 gains. While these factors have been somewhat investigated, the focus has been on the development of oral skills (Sanz & Morales-Front, 2018), so little is known about their influence on L2 writing. Of the handful of published studies exploring writing development during or after SA,
the majority compare writing done by students who have participated in SA programs with those
who continue to enroll in classes in the target language (TL) at their home institution (AH) (Freed,
So, & Lazar, 2003; Godfrey, Treacy, & Tarone, 2014; Sasaki, 2004, 2007, 2011) while other studies
follow the same group of writers before and after a time of formal instruction at home, during study
abroad, and upon their return (Pérez-​Vidal & Barquin, 2014). Although small in number, these
studies represent a good start as they point to ways to effectively design both SA and AH programs
to further students’ L2 writing skills; they also suggest areas for further research.

Historical Perspectives
As the numbers of students participating in SA have increased, so have the studies that seek to
explore the different aspects of the SA experience and explain how it contributes to language
learning. Most research has focused on oral production as this is believed to be the skill that
shows the greatest development given the increased opportunities for oral interaction in immersive
contexts compared to the traditional classroom setting. Studies typically find an improvement in
speaking abilities, although it varies based on the measures used. For example, fluency increases
(Freed et al., 2003; Segalowitz & Freed, 2004) but grammatical accuracy does not always improve
(DeKeyser, 2010), although counterevidence exists (e.g., Grey, Cox, Serafini, & Sanz, 2015). Some
of these findings have also been linked to program characteristics, such as length of stay, learner
variables, or proficiency level at the onset of the sojourn abroad (Zalbidea, Issa, Faretta-​Stutenberg,
& Sanz, 2020). Although comprehension has received less attention than oral production, research
suggests that listening comprehension improves in the SA setting (Ginsberg, Robin, & Wheeling,
1992; Llanes & Muñoz, 2009).
In contrast, SA scholars have focused much less on the two academic skills of reading and
writing, even though most SA programs are, after all, academic programs, and an important
number are offered in a foreign language. Only a handful of studies have investigated changes in
reading abilities in SA. Findings have revealed that reading comprehension improves as a result
of SA (Brecht, Davidson, & Ginsberg, 1995; Lapkin, Hart, & Swain, 1995), confidence in reading
ability increases (Dewey, 2004), and that SA leads to significant gains in receptive and productive
vocabulary knowledge (Briggs, 2016). However, the advantage of SA is not conclusive as not
all studies have found significant differences between learners in a regular classroom setting and those in an immersion context (Dewey, 2004; Huebner, 1995); there is room for many more studies on L2
reading in SA.
This is also true for L2 writing in SA, where studies are similarly scarce, although interest has increased in the last few years. While most studies investigating writing do not include descriptions
of the types of writing learners engage in while abroad, personal experience and conversations
with other practitioners have revealed that depending on the type of program, students do engage
in writing, but may not receive explicit instruction on writing or writing strategies. The type of
writing students engage in while abroad is often similar to that AH, and includes academic papers
as well as short essays in exams. It is also becoming more common, especially in programs that include internships or community service abroad, to ask students to keep a reflective journal in the TL. Some programs also connect SA students with members of the local community, and they will often engage in written conversations, frequently in the form of a chat. Teacher feedback varies with the
task: often, papers and essays are graded for content, but grammar and lexical accuracy receive less
attention, especially in upper-​level (content) courses.
Among the studies examining writing gains, few have included factors that could contribute to the observed differences; only a few have begun to scratch the surface of the potential benefits of SA for L2 writing and motivation (e.g., Freed et al., 2003; Sasaki, 2011). These studies will be examined in the section on Current Contributions and Research, but first, some critical issues related to the
study of L2 writing in SA will be considered.

Critical Issues and Topics


In contrast with other areas of SA research, fewer scholars have focused on program length, curricular
design, and learner individual differences when investigating L2 writing in immersive contexts.
Turning first to length of stay, various factors, including financial and logistical, are pushing students
to participate in short-​term programs lasting eight weeks or less (Institute for International Education,
2018); however, most research looks at semester-​long programs. Their results point to the benefits of
longer stays abroad. Some of the current research does compare different program lengths, but only
a few studies have considered lengths of two months or less (e.g., Llanes & Muñoz, 2009; Sasaki,
2011). Clearly, program length is a key variable when considering L2 writing development, as it has
been shown to be an important variable in predicting and maintaining gains (Sasaki, 2011).
Another program characteristic that can affect writing outcomes is curricular design, specifically
the amount of writing that happens within the program. Writing may not always be explicitly taught
in a SA program, but students are often required to engage in different types of writing, such as
writing emails, reflective journals, or academic essays as part of their coursework or internship. In
order to make studies more comparable, it would be beneficial for researchers to collect and report
data related to frequency and types of writing. This practice would ultimately help to identify the
curricular design characteristics most conducive to L2 writing development.
While length of stay and the types of writing required are external factors that influence L2 writing
development, students can also vary widely in terms of their learning goals, aptitude, motivation,
and personality. We need to know more about individual differences that have been shown to affect
language outcomes, including gender, age, cognitive ability, linguistic identity, and personality
traits. Importantly, it is the interaction among individual differences and program characteristics
that will have an impact on writing outcomes; such interactions, rather than isolated variables,
should be the focus of SA research.

Current Contributions and Research


This section shifts focus to reporting on the studies that have investigated writing in SA, with spe-
cial attention to those aspects of writing development most influenced by the immersive context.
Studies are presented according to research design, beginning with between-​subjects studies com-
paring SA and AH groups, followed by within-​subjects design studies.

Between-​Subjects Design: SA vs. AH Comparisons


One of the first studies to examine the effects of SA on writing was Freed et al. (2003), who compared the gains made in writing to oral gains between two groups of students at a US institution (n = 15 AH; n = 15 SA in France). Oral and written data were collected from both
groups at the beginning and end of each semester and were subsequently evaluated by a group
of native speakers of French, and a subset of eight students’ (four SA and four AH) speaking and
writing samples were further analyzed for a more in-​depth comparison. All SA participants were
judged to be more fluent by the native-​speaker raters because they spoke at a faster rate, used
better grammar and vocabulary, and lacked hesitation. However, when it came to writing, it was
the AH group that was judged to be more fluent by the native-​speaker raters, on both the pre-​ and
post-​test as they showed more accurate grammar, more complex vocabulary, and better expression
of thought and organization. A subset of the texts was further analyzed for their "length (number of words, sentences, and T-units), grammatical accuracy (proportion of error-free T-units, correct noun-adjective agreement, subject-verb agreement, and past-tense usage), syntactic complexity
(number of words per T-​unit), and lexical density (proportion of lexical words)” (Freed et al., 2003,
emphasis in original). The SA and AH groups' production was comparable in terms of grammatical
accuracy and complexity (words per T-​unit was similar for both groups), but the SA group did show
an increase in lexical density. The authors conclude that immersion resulted in better oral skills but
did not have the same effect on writing.
Godfrey et al. (2014) also investigated the changes in French writing of eight American uni-
versity students over the course of a semester (n = 4 AH, whose courses included instruction on writing; n = 4 SA in France, who wrote papers in French). Participants completed the American Council on the Teaching
of Foreign Languages (ACTFL) computerized Oral Proficiency Interview and Writing Proficiency
Test in addition to a separate writing task in which they were asked to analyze a picture of what a
typical Parisian family eats in a week. Tasks were completed at the beginning (T1) and end (T2)
of the semester and all writing samples were analyzed using measures of complexity (T-​unit ana-
lysis), accuracy (gendered nouns and gender agreement), and fluency (total number of words per
essay) (CAF) as well as form and function. The form and function analysis was to identify how the
participants “used the visual text as an evidentiary source for making claims or hypotheses” (p. 54);
a type/​token analysis of discourse markers was also included.
On the oral measure (scores on the ACTFL OPI), the AH students showed no changes while all of the SA
students moved up at least one level. For writing (scored on the Writing Proficiency Test), all SA
students also improved by moving up one level, compared to only one AH student; two stayed at
the same level, and one moved down one level. The CAF analysis showed more growth for SA than
for AH participants in fluency and accuracy: all participants increased written fluency, but the SA
students had greater gains; for accuracy, SA participants used more instances of gendered language
(i.e., attempts to use the feminine in noun phrases) at T2 than T1 and were more accurate in their use
of gender agreement while the AH participants showed marginal improvement. In contrast, the AH
group descriptively improved their syntactic complexity more than their SA counterparts. Finally,
in terms of form and function, all participants showed more varied discourse markers to present
their claims at T2, with the AH group showing a modest increase in the density of discourse markers and more frequent use of markers typical of academic discourse, like therefore and because.
In a study of Spanish speakers learning English, Serrano, Llanes, and Tragant (2011) compared
changes in oral and written production in three different contexts, an AH semi-​intensive course
(10 hours/​week for 11 weeks; N = 37), an AH intensive course (25 hours/​week for four-​and-​a-​half
weeks; N = 69), and a SA program in the UK (N = 25). Participants completed pre- and post-test
writing with a descriptive topic and a subset of the AH participants and all of the SA participants
also completed an oral narrative task. Results revealed no significant differences between the SA
and AH intensive students on any of the measures used for oral or written data. When comparing
the SA and AH semi-​intensive groups, significant differences were found in favor of SA for written
fluency (words per T-​unit) and lexical complexity (Guiraud’s Index).
Results from these three studies are mixed. They all show that a semester-long SA program results in gains in oral fluency compared to AH programs; in the case of writing, findings are not
so clear cut: no advantage for SA in Freed et al. (2003), a clear advantage for SA participants
in Godfrey et al. (2014) and in Serrano et al. (2011), whose SA participants improved both their
writing fluency and lexical complexity. In Godfrey et al. (2014), however, the AH group increased their syntactic complexity and discourse markers more than the SA group. It seems that SA is beneficial for some aspects of writing, but there may be other aspects that improve more in domestic
programs that include instruction on L2 writing. The following study also compared written and
oral gains between SA and AH participants but focused on bilinguals learning English and included
the effects of age to unveil possible interactions.


Between-​Subjects Design: Age


Llanes and Muñoz (2013) compared measures of written and oral complexity, accuracy, and flu-
ency in Catalan/​Spanish bilingual children (N = 73) and adults (N = 66) before and after a period
of two to three months of instruction; 39 of the children and 46 of the adults participated in SA in
an English-​speaking country. Participants were asked to write a composition about their life and
complete an oral picture narration task at the beginning and end of the programs. Both lexical and
syntactic complexity were measured using Guiraud’s Index and clauses per T-​unit respectively,
while accuracy was measured as errors per T-​unit. Written fluency was measured using words per
T-​unit while oral fluency was measured using pruned syllables per minute.
A MANCOVA revealed a significant effect for age and context as well as an interaction between
the two. Overall, SA participants had significant gains in oral measures, while those who remained
at home had significant gains in written syntactic complexity. Further, adult participants had the
greatest gains in oral lexical complexity as well as in written fluency and lexical and syntactic
complexity. This study shows that while SA is beneficial for developing oral skills, particularly for
younger learners, it does not appear to have the same benefits for writing, as it was the adults who
remained AH that had the greatest gains in both written fluency and syntactic complexity, followed by the SA adults, SA children, and finally the AH children, possibly due to the nature of feedback.

Between-​Subjects Design: Motivation


The following series of studies examines Japanese learners of English in a longitudinal design.
In addition to measures of writing ability, they also consider participants’ L2 writing motivation.
The first study in the series followed the writing development of 11 (SA n = 6, AH n = 5) Japanese
students over a period of three and a half years (Sasaki, 2004). Throughout this period, participants
completed four argumentative writing prompts which were rated holistically by an English as a
foreign language (EFL) specialist. Results revealed that all participants did improve both their
language proficiency and L2 writing skills, but no significant differences were found between the
two groups even though the SA group improved at each data collection point while the AH group
showed a drop in their scores. In individual interviews with participants in their last semester at
the university, they were asked two questions: 1) "Compared to your freshman year, do you think
that writing in English has become easier for you?” and 2) “Compared with your freshman year, do
you think you have changed your way of thinking about writing in English? Why?” (Sasaki, 2004,
p. 542). Answers to these questions revealed that while all participants felt more confident in their
writing abilities, the SA students were motivated to continue to study English and to write better,
which they did.
In order to further examine this potential difference between the two groups, Sasaki (2007)
completed a partial replication of the previous study with another group of participants (N = 13),
this time following them for only one-and-a-half years. Seven of the participants joined a four- to nine-month SA program in either Canada or the United States while the other six remained
AH. Two writing samples were compared, one produced prior to participation in the SA program,
and one completed one year later, after all of the SA group had returned. An additional interview
was also conducted a few months later, prior to the students’ graduation.
The findings present a complex picture, especially when compared with results from Sasaki’s
previous study. Once again, all students did improve their overall English proficiency, but only the
SA group improved their composition scores and writing fluency. In their interviews, participants
attributed their changes to the time spent abroad, where they had to write frequently for their courses.
They also mentioned increased motivation to write well as a result of their SA experience. In con-
trast, most of the participants in the AH group did not perceive an improvement in their overall language proficiency, even though their scores did increase. The AH group attributed this perception to reduced opportunities to write in the L2. In conclusion, Sasaki (2007)
revealed more differences between the AH and the SA groups compared to Sasaki (2004): beyond
increased motivation to improve their L2 writing and increased use of writing strategies, Sasaki
(2007) shows that the immersive context is conducive to higher development of writing skills, pos-
sibly due to more frequent and more meaningful opportunities for practice.

Between-​Subjects Design: Length of Program


In order to better understand the conditions that affect L2 writing improvement and motivation in
immersive contexts abroad, Sasaki (2009, 2011) conducted a third longitudinal study with a larger
group of students; this time she compared the effects of different lengths of SA programs on writing
outcomes and motivation. Since the same participants from the 2009 study are included in the
2011 one, only the second study is reported here. Writing samples were collected from a total of 37 participants, all British and American Studies majors at a university in Japan, at four different points: near the beginning of their freshman year (T1) and at the midpoint of their sophomore (T2), junior (T3), and senior (T4) years. Participants responded to argumentative writing prompts that
were rated by an EFL specialist; they were also interviewed about the strategies they used while
writing their composition, English classes they had taken, and areas of English composition they
wanted to improve. An additional interview was conducted near the end of the participants’ senior
year to ask them about changes that occurred in their writing over the observation period. Of the
37 participants, nine remained in Japan for the duration of the study and 28 studied abroad (n = 9, 1.5–2 months abroad; n = 7, 4 months abroad; n = 12, 11 months abroad).
L2 proficiency levels were comparable across groups at the onset of the study, but differed
in terms of writing ability, as the four-​month SA group had a significantly higher composition
score at T1 than the other two groups. The development over time showed a slightly different
pattern than the previous studies; only the eight to 11-​month SA group’s mean composition score
improved across all observation points. Across-group comparisons showed that the four-month SA group outperformed the AH and the 1.5–2-month SA groups; no other significant contrasts were identified. Importantly, while all of the SA groups' mean composition scores at T4 were higher than at
T1, the AH group’s mean score at T4 dropped below that of T1. When asked about these changes
in their interviews, many of the AH students felt that their scores had dropped because they took
fewer English classes in their last year at the university, while those in the SA groups attributed their improvement to English writing classes they took in Japan and/or abroad, to maintaining written contact with friends they made while abroad, or to a perception that the final topic was easier than
the previous ones. Many of them also mentioned that being required to write in larger quantities and
frequently was helpful for improving their L2 writing ability. In other words, having more oppor-
tunities to write over a longer period of time and maintaining contact with friends abroad increased
the SA group’s motivation.
In terms of writing development, scores for all groups improved from T1 to T2 before any
participants went abroad, suggesting that L2 writing instruction alone can help to improve L2
writing ability. However, as is seen more clearly in the discussion of the motivation data, instruction
alone may not be enough to maintain gains. The interviews on motivation revealed that although
all groups wanted to improve aspects of their L2 writing (e.g., grammar, vocabulary, fluency), only
those in the SA groups actively sought out resources to help them. The majority of the AH group on
the other hand, did nothing to improve these aspects of their L2 writing. Another difference emerged
at T3 after some students had already returned from their overseas experience; some students began
to form L2 imagined communities that they reported were influential in improving their L2 writing.
An imagined community is "[a] group of people, not immediately tangible and accessible with whom we connect through the power of imagination" (Kanno & Norton, 2003). Qualitatively, there was a difference between those who participated in a one-and-a-half to two-month program and those who participated in a four- or eight- to 11-month program, with the former imagining writing to an email pen pal and the latter imagining writing in a course they took while abroad. It was
also found that only those who participated in the longest programs (eight to 11 months) practiced
writing in the L2 on their own, even when it was not required.
All of these studies taken together suggest that L2 writing instruction in which students are
provided with L2 metaknowledge and different types of writing assignments can contribute to
the development of L2 writing ability. However, for those effects to be maintained over time,
it appears to be important for them to create an L2 imagined community specifically related to
the types of skills and abilities they want to improve. As it was only those who participated in a four- or eight- to 11-month SA program who formed an academically oriented L2 imagined community,
it would also appear that length of stay abroad differentially influences motivation and type of
imagined community.

Within-​Subject Design
While the previous studies adopted a between-subjects comparison to examine the effects of context (SA vs. AH), age (children vs. adults), or length of program on writing development, the following studies use a within-subject design. Serrano et al. (2012) compared changes in CAF between written
and oral data from 14 Spanish speakers learning English over a year-​long study abroad program
and analyzed how learner attitudes interacted with these changes. Data were elicited using a picture
narration task for oral data and descriptive essays for written data. For syntactic complexity they used clauses per T-unit, and for lexical complexity Guiraud's Index. Accuracy was
determined by the number of errors per T-​unit and fluency was measured using syllables per minute
for oral data and words per T-​unit for written. Their results showed a significant increase in fluency
and lexical richness in the first semester but only for oral data. From the beginning of the second
semester to the end of the time abroad, accuracy improved for both speaking and writing, and
writing became more syntactically complex. Considering the full SA experience, all four measures
showed a significant improvement for writing. Learners’ attitudes also played a role: those who
rated English people as more sociable and humble had more gains in written accuracy, while those
who rated the language as more complex made greater gains in lexical complexity for both written
and oral production.
Pérez-Vidal and Barquin (2014) examined 73 Spanish/Catalan bilinguals who directly matriculated at a foreign institution for three months to fulfill a graduation requirement; the majority
of the students were studying Translation and Interpreting with English as their major language.
The writing data collected were part of the larger Study Abroad and Language Acquisition (SALA)
project conducted by Pérez-​Vidal and colleagues (Pérez-​Vidal, 2014). Participants contributed data
at four different points during their time at the university: upon first entering their degree course before receiving formal instruction (FI) in the target language (T1), after 80 hours of FI in the first
year of their university studies prior to participating in SA (T2), after completion of a three-​month
stay at a host university in an English-​speaking country (T3), and 15 months after returning from
SA while enrolled in another three-​month period of FI at the home institution (T4). At each data
collection point, participants were given 30 minutes to respond to an argumentative prompt in
test-​like conditions. The prompt was the same each time and the writing samples were analyzed
using measures of complexity (words per sentence, clauses per sentence, coordination index, and
Guiraud’s Index), accuracy (errors per word), and fluency (words per minute). An additional group
of native speakers (NS) of English completed the writing task while on an SA program in Spain, and the data from the L2 English learners were compared to this sample.
Results revealed that L2 learners’ writing fluency was comparable to NS level at T3, after their
academic sojourn abroad, and was maintained 15 months later, at T4. In terms of accuracy, and des-
pite improvement, learners did not reach NS level, which was close to zero errors. As for complexity, at T1, learners' use of coordination was significantly lower than that of the NSs, but after the period of formal instruction in their first year at university, they began to use a similar amount of coordination to their NS counterparts and maintained it throughout the duration of the study. Measures of lexical complexity showed that learners displayed significantly less lexical diversity at T1 and T2 but became comparable to the NSs at T3, after SA, and maintained their gains at T4.
Taken together, these results show an advantage for FI for syntactic complexity in terms of coordin-
ation because this was seen to increase after the period of FI and to be maintained. However, in all
other domains, SA seems to give an advantage because fluency and lexical complexity only reached
NS level after SA and had not shown any gains after exposure to FI alone.
More recent studies have revealed positive effects on lexical complexity as a result of SA.
Zaytseva, Miralpeix, and Pérez-​Vidal (2019) examined different aspects of vocabulary develop-
ment over a period of formal instruction followed by a three-​month stay abroad using writing
samples collected as part of the SALA project. They measured lexical diversity (Guiraud’s Index and
D measures), lexical density (content words/​total number of words, function words/​total number of
words), lexical sophistication (Lexical Frequency Profile), and lexical accuracy (vocabulary errors/​
word and spelling errors/​word). Results showed that lexical diversity significantly increased, but
only after SA. Adverb density was also found to increase and noun density to decrease, and this
occurred across both settings. Vocabulary accuracy also increased and did not favor one setting over the other. It was found that after SA, but not FI, lexical diversity, use of academic terms, and spelling errors were comparable in native speakers' and learners' writing.
The final study under consideration is that of Xu (2019), a meta-analysis of changes in interlanguage complexity during SA that includes 28 studies with measures of complexity in oral and
written data. Most of the studies were of adults at varying levels of proficiency learning English in
short-​term to mid-​term programs (five weeks to four months). Only studies that used an essay for
the writing task were included; most of the writing tasks were either narrative or argumentative.
Results showed a small effect size for written complexity, with a positive effect for lexical density,
diversity, and sophistication, with lexical diversity showing the greatest changes. Although syntactic complexity did show some improvement, the changes were much smaller. These results
are similar to those in the previous study which also showed an advantage for the development of
lexical complexity during SA.

Discussion
Definite conclusions about changes in L2 writing as a result of immersion abroad are difficult
to draw due to the small number of studies, their small sample sizes, and differences in the measures
used. However, some trends are revealed that can suggest useful principles for program design and
directions for future research. All the studies summarized here suggest that both explicit instruc-
tion in L2 writing and immersion abroad can help to improve L2 writing ability, but the specific
aspects that are influenced by each vary. The studies by Llanes and Muñoz (2013), Godfrey et al.
(2014) and Pérez-​Vidal and Barquin (2014) seem to indicate that explicit instruction provides an
advantage for the development of syntactic complexity. However, lexical complexity appears to
benefit from SA (Pérez-​Vidal & Barquin, 2014; Xu, 2019). When it comes to fluency and accuracy,
the picture is less clear. On the one hand, only SA participants in Sasaki (2007) showed improved
fluency; likewise, in Pérez-​Vidal and Barquin (2014), participants only showed improved fluency
after immersion abroad. On the other hand, all the participants in Godfrey et al. (2014) increased in
fluency with the AH group showing greater gains. Finally, results from these studies make it nearly
impossible to draw any conclusions about accuracy, but there are reasons to remain hopeful. The SA
participants in Godfrey et al. (2014) did use gendered nouns and gender agreement more accurately
after SA, and participants in Pérez-Vidal and Barquin (2014) also decreased their total errors per word after SA, even though they did not reach the same level as the NSs.
The studies by Sasaki (2004, 2007, 2011) make a unique contribution as they include motivation
in the equation. From her studies, it appears that SA positively affects motivation to write well in the
L2, which aids in maintaining gains made as a result of a combination of classroom and immersion
experience. Motivation appears to be related to the creation of L2 imagined communities, which
itself is connected to length of stay abroad; also, the nature of these communities affects the type of
writing (argumentative vs. email) that participants want to improve, and how participants are able
to apply what they have learned at home or abroad to new writing contexts. Other factors, such as attitudes, may also affect gains in lexical diversity and accuracy while abroad (Serrano et al., 2012).
More research is still needed to confirm these observations (see Table 19.1).

Table 19.1  Summary of studies on writing

Freed et al. (2003)
  Languages: L1 English, L2 French
  Writing task: Description/narration
  Measures used: NS judgments; number of words, sentences, and T-units; error-free T-units; subject-verb agreement; past-tense usage; words per T-unit; proportion of lexical words
  Results: No advantage for SA for written fluency; increase in lexical density

Godfrey et al. (2014)
  Languages: L1 English, L2 French
  Writing task: ACTFL Writing Proficiency Test, analysis
  Measures used: Total number of words, gendered nouns, gender agreement, clauses per T-unit, discourse markers
  Results: Increased fluency; more gendered language with fewer errors in gender agreement

Serrano et al. (2011)
  Languages: L1 Spanish, L2 English
  Writing task: Descriptive
  Measures used: Words per T-unit, clauses per T-unit, Guiraud's Index, errors per T-unit
  Results: No sig. differences between AH intensive and SA; SA sig. greater fluency and lexical complexity than AH semi-intensive

Llanes & Muñoz (2013)
  Languages: Spanish/Catalan bilinguals, L3 English
  Writing task: Descriptive
  Measures used: Words per T-unit, Guiraud's Index, clauses per T-unit, errors per T-unit
  Results: SA had greater gains in oral fluency; no clear benefits for writing

Sasaki (2004)
  Languages: L1 Japanese, L2 English
  Writing task: Argumentative
  Measures used: Holistic ratings by EFL specialists
  Results: No significant differences between AH and SA groups; SA more motivated to study English and write better

Sasaki (2007)
  Languages: L1 Japanese, L2 English
  Writing task: Argumentative
  Measures used: Holistic ratings by EFL specialists
  Results: Only SA group improved on holistic rating of texts and writing fluency

Sasaki (2011)
  Languages: L1 Japanese, L2 English
  Writing task: Argumentative
  Measures used: Holistic ratings by EFL specialists
  Results: Longer (4–11 months) SA programs led to significantly higher holistic ratings; also led to development of academic imagined community

Serrano et al. (2012)
  Languages: L1 Spanish, L2 English
  Writing task: Descriptive
  Measures used: Words per T-unit, clauses per T-unit, Guiraud's Index, errors per T-unit
  Results: No sig. changes in writing after one semester; significant improvement on all measures after one year

Pérez-Vidal & Barquin (2014)
  Languages: Spanish/Catalan bilinguals, L3 English
  Writing task: Argumentative
  Measures used: Words per minute, words per sentence, clauses per sentence, coordination index, Guiraud's Index, errors per word
  Results: Fluency, accuracy, and lexical complexity improved after SA; fluency and lexical complexity comparable to NS

Zaytseva et al. (2019)
  Languages: Spanish/Catalan bilinguals, L3 English
  Writing task: Argumentative
  Measures used: Guiraud's Index, D measures, content words/total words, function words/total words, Lexical Frequency Profile, vocabulary errors/word, spelling errors/word
  Results: Lexical diversity sig. increased after SA but not FI; adverb density increased through both FI and SA while noun density decreased; decrease in lexical errors across both contexts; lexical diversity and academic word means comparable to NS after SA

Xu (2019)
  Languages: Multiple; TLs include English, Spanish, Chinese, French
  Writing task: Most narrative or argumentative
  Measures used: Varied; common measures for complexity included Guiraud's Index and clauses per T-unit
  Results: SA positively affected lexical density, diversity, and sophistication; some improvement in syntactic complexity

Main Research Methods


There are several methodological considerations when designing studies on writing in SA.
Some scholars have questioned AH and SA group comparisons (Marijuan & Sanz, 2018) for a number of reasons, the most important being the inability to randomize participants' assignment to groups. As learners self-select, there are likely qualitative differences between SA and AH participants, such as motivation, openness to new experiences, and socioeconomic status, which alone can potentially play a role in the development of writing skills but can also interact with program characteristics to influence outcomes. In contrast, across-program comparisons (Sasaki, 2011) or longitudinal studies (Pérez-Vidal & Barquin, 2014) provide valuable insights into the specific effects of SA on learner development. Other studies compare learners to native speakers as a benchmark, which can also be problematic, as second language speakers will not become monolingual writers in their new language, though they can still reach high levels in the L2 (Pérez-Vidal & Barquin, 2014). In the absence of an appropriate control group, a mixed-methods design is an
adequate option.
When it comes to tracking language development, the quality of the measures used to cap-
ture it is fundamental. Gains in overall proficiency have often been measured using the ACTFL

263
Cristi Vallejos and Cristina Sanz

Oral Proficiency Interview for speaking or Writing Proficiency Test for writing or other holistic
measures of proficiency. The scores provided by these types of ratings may be useful for language
educators but are not sensitive enough to capture changes that characterize writing development.
One way to combat this is to rely on more fine-​grained measures targeting specific linguistic or
grammatical forms, and on writing tasks and dependent variables such as complexity, accuracy, and
fluency, which have been used to investigate L2 writing in classroom contexts. Although there is
much variation in the techniques used to track each of these, combining these measures with more
holistic ratings can better inform researchers and practitioners on the aspects of writing that most
benefit from SA and on specific program characteristics.
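To illustrate how such fine-grained measures are typically operationalized, the following sketch (in Python) computes three of the indices cited repeatedly in the studies above: Guiraud's Index, words per T-unit, and errors per T-unit. The tokenizer, the example T-units, and the error count are hypothetical placeholders introduced for illustration; published studies rely on trained coders and more careful segmentation than this minimal example assumes.

import math
import re

def tokenize(text):
    # Naive word tokenizer used only for this illustration; real studies use
    # trained coders or NLP tools for segmentation and counting.
    return re.findall(r"[a-z']+", text.lower())

def guiraud_index(tokens):
    # Lexical diversity: word types divided by the square root of total tokens.
    return len(set(tokens)) / math.sqrt(len(tokens)) if tokens else 0.0

def words_per_t_unit(t_units):
    # Fluency/complexity proxy: mean number of words per hand-segmented T-unit.
    counts = [len(tokenize(t)) for t in t_units]
    return sum(counts) / len(counts) if counts else 0.0

def errors_per_t_unit(error_count, t_units):
    # Accuracy: hand-coded errors divided by the number of T-units.
    return error_count / len(t_units) if t_units else 0.0

# Hypothetical essay already segmented into T-units, with 3 coder-identified errors.
t_units = [
    "Studying abroad helped me to write more fluently",
    "because I had to send many emails to my professors",
    "and my friends corrected my grammar every week",
]
tokens = [tok for t in t_units for tok in tokenize(t)]
print(round(guiraud_index(tokens), 2),
      round(words_per_t_unit(t_units), 2),
      round(errors_per_t_unit(3, t_units), 2))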
One final consideration is that isolating just one variable to measure its impact on writing
outcomes is hard and even unproductive, as writing can be influenced by a variety of factors at any
given point in time; also, learning to write well is not a linear process. However, attempts can still
be made to determine the effects of certain program or learner characteristics on specific aspects of
writing by having participants complete more than one writing sample or by surveying their motiv-
ation and strategy use for each writing assignment. The topics chosen for writing tasks also have to be carefully considered. Argumentative prompts are generally regarded as more challenging and therefore better able to distinguish between writers at different levels, and they can avoid the ceiling effects that may arise with other genres, such as narratives or descriptions. Prompts can be varied
from one time to the next to control for possible task effects and counterbalanced across participants
to control for differences in topics.

Recommendations for Practice


Considering the findings presented earlier, there are three recommendations that can be made
to help students improve their L2 writing. First, it is important to provide students with explicit
writing instruction as this can and does improve at least some aspects of writing. As shown in Pérez-​
Vidal and Barquin (2014) and Sasaki (2004, 2007, 2011), students do improve their writing after
exposure to FI that includes explicit L2 writing instruction; even students who did not go abroad
but received writing instruction at their home institution improved (Godfrey et al., 2014). Having
metaknowledge about how to write well has also been shown in other studies to lead to better
writing (e.g., Sasaki & Hirose, 1996). This instruction can be provided in classes at the home institution and, when possible, should also be included in classes taken while abroad. A research question
of interest is whether L2 learners need to reach a certain level of writing skills and metaknowledge
to benefit from writing practice during their sojourn abroad.
Providing writing instruction is important and can lead to gains in L2 writing, but without
sustained practice these gains do not seem to be maintained. Therefore, the second recommendation
is to provide frequent opportunities to write and for writing assignments to be varied in type and
length. In the studies by Sasaki (2004, 2007, 2011) the number of writing assignments required as
well as their length was consistently cited by the students who completed SA programs as a factor
that increased their confidence in L2 writing; also, students who spent a longer time abroad and
presumably took more classes and therefore had more writing assignments achieved higher scores
and maintained them longer. In contrast, many of the AH students failed to maintain the gains made
after their first year of L2 writing instruction and felt that their writing had worsened as a result of
less practice.
Motivation appeared to be one of the most important factors in maintaining gains made both
from FI and from time spent abroad. Learners’ reliance on L2 imagined communities that were
academic in nature was a specific type of motivation identified in the research that led to the most
gains and to sustaining them. This evidence leads to the third and final recommendation: to find
a way to help students create an L2 imagined community, especially when it is not possible for them to go abroad. Opportunities for online collaboration between institutions whose students are learning each other's languages (for example, Teletandem; Telles, 2015) are not difficult to organize. Students could create blog posts that participants at both institutions could read
and respond to, or they could also be assigned topics where they have to interact with students
from the other institution in order to learn the information they need to complete the assignment.
This type of activity could be used with students of varying levels of proficiency and adapted to
meet their level. For example, an intermediate language class could be assigned a description of university life in the other country, while a more advanced class could discuss current
issues facing the country and write a paper presenting the different sides. This type of assignment
could help to create an L2 imagined community related to the type of writing that is expected,
which could motivate students to write well in the L2, knowing their writing is going to be read by
their native-​speaker peers.

Future Directions
As seen in the sections on previous and current research, the number of studies investigating SA as
a site for L2 writing development is surprisingly low, considering SA is an academic endeavor. We
need research that looks at the place that the teaching of writing has in study abroad program curricula: how much focus is there on promoting writing for students learning a language abroad? Is
it different from the L2 classroom at home institutions? We also need studies that take a closer look
at how writing is taught and how writing develops. We need more studies conducted with different
language backgrounds and target languages in a variety of curricular designs, as different writing
systems may play a role. Current studies have focused on English speakers learning French (Freed
et al., 2003; Godfrey et al., 2014) and on Japanese speakers and Spanish/Catalan bilinguals learning English
(Pérez-​Vidal & Barquin, 2014; Sasaki, 2004, 2007, 2009, 2011). To increase generalizability, future
studies should also strive to include larger samples that yield more robust results.
In addition, studies should consider the program characteristics and their influence on L2 writing.
Programs can vary in their length, the types of classes they offer, the frequency and type of writing
opportunities their courses provide, and the amount of interaction with the local community they
promote. As was mentioned in the Introduction, not all programs are focused solely on study but
may include internships, volunteer work, or research. If students are participating in these types of
activities in a setting that requires the use of their L2, this is likely to have a different impact on the
development of their writing skills because they may be engaging in different types of writing than
in a classroom setting. The experience of directly participating in the L2 community while abroad
and maybe even once they return home may also impact motivation. None of the studies reviewed
here mentioned these types of activities, so future research could compare the effects that different curricular designs, from traditional SA to programs that immerse students in the community as volunteers or employees, have on L2 writing and motivation.
Related to program characteristics is program length. As more and more students participate in
shorter SA programs, it is worrisome that the current findings show that more time spent abroad
typically leads to greater writing gains. Future research can investigate whether there are certain aspects of L2 writing that benefit from a shorter time abroad, or whether short-term programs can be designed to help participants become effective writers. For example, the students in Sasaki (2009, 2011) who
went abroad for two months maintained email contact with friends they had made, so it is possible
that this type of writing may have shown more gains than those shown in the argumentative essay
included in the study. While email writing may not be the focus of many classes, for many of the
participants email may be the type of L2 writing they continue to engage in beyond college, so it
could be used as a site to improve L2 knowledge and to investigate changes over time. Other types
of digital platforms, such as blogs, e-journals, and social media, can also be used both as teaching
tools and as sources for L2 writing data (e.g., Back, 2013; Stewart, 2010). Consideration of more
varied genres of writing will help to gain a fuller picture of L2 writing.
The times demand that we diversify our samples. Programs are different, but so are students. The
field tends to assume that SA participants are white, middle-​class students, and it has disregarded
gender as a variable. If anything, the pandemic has made it clear that research, whether educational
or medical, has to make an effort to select samples that represent the population, and include in its
design, rather than ignore, variables such as race or gender (Marijuan & Sanz, 2018). In this sense,
recent efforts to focus on language development among heritage speakers in SA programs are a
most welcome development (Sanz, 2021).
Finally, another way to gain a fuller picture would be to triangulate the methods used in research
on L2 writing in immersive contexts. A direct comparison among the studies reviewed here was dif-
ficult because some only used a holistic rating such as a rubric for writing scores, while others used
fine-​grained measures for CAF analyses. Future studies could include both of these methods as well
as interviews and tests to gauge learners’ socio-​cognitive variables to determine if there are spe-
cific aspects of L2 writing that are more susceptible to changes in FI versus SA and how individual
differences relate to these specific changes in addition to overall scores. Other sources of qualitative
data could also be included such as stimulated recalls collected following completion of the writing
task. In addition, when participants are asked to complete the writing tasks on the computer instead
of by hand, researchers can use keystroke logging, which allows for a more fine-​grained analysis of
fluency (Vallejos, 2020). Keystroke logging allows the study of qualitative changes that reflect how
learners are processing the language while on task by measuring the length and frequency of pauses
and where they occur. Finally, investigating key curricular characteristics and including a wider
variety of types of writing would help to fill many of the current gaps and could aid in designing
more effective SA programs.
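As a minimal illustration of the kind of pause analysis that keystroke logs make possible, the sketch below (in Python) counts pauses and computes their mean length from a list of keystroke timestamps. The two-second pause threshold and the sample log are assumptions introduced for the example; they are not taken from Vallejos (2020) or from any particular logging tool.

# Illustrative pause analysis over keystroke timestamps (in seconds from task onset).
# The 2.0-second threshold is assumed here for the example, not drawn from any
# study reviewed above.
PAUSE_THRESHOLD = 2.0

def pause_statistics(timestamps, threshold=PAUSE_THRESHOLD):
    # Inter-keystroke intervals at or above the threshold count as pauses.
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    pauses = [gap for gap in intervals if gap >= threshold]
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return {"pauses": len(pauses), "mean_pause_length": mean_pause}

# Hypothetical log: a writer types steadily, pauses twice (2.5 s and 4.0 s), then resumes.
log = [0.0, 0.3, 0.7, 1.0, 3.5, 3.8, 4.1, 8.1, 8.4]
print(pause_statistics(log))  # {'pauses': 2, 'mean_pause_length': 3.25}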

References
Back, M. (2013). Using Facebook data to analyze learner interaction during study abroad. Foreign Language
Annals, 46(3), 377–​401. doi: 10.1111/​flan.12036.
Brecht, R., Davidson, D., & Ginsberg, R.B. (1995). Predictors of foreign language gain during study abroad.
In B. Freed (Ed.), Second language acquisition in a study abroad context (pp. 37–​66). Philadelphia: John
Benjamins.
Briggs, J.G. (2016). A mixed-​methods study of vocabulary-​related strategic behaviour in informal L2 contact.
Study Abroad Research in Second Language Acquisition and International Education, 1(1), 61–​87.
DeKeyser, R. (2010). Monitoring processes in Spanish as a second language during a study abroad program.
Foreign Language Annals, 43(1), 80–​92. doi:http://​dx.doi.org/​10.1111/​j.1944-​9720.2010.01061.x.
Dewey, D.P. (2004). A comparison of reading development by learners of Japanese in intensive domestic
immersion and study abroad contexts. Studies in Second Language Acquisition, 26(2), 303–​327.
Freed, B.F., So, S., & Lazar, N.A. (2003). Language learning abroad: How do gains in written fluency compare
with gains in oral fluency in French as a second language. ADFL Bulletin, 34(3), 349–​356.
Ginsberg, R.B., Robin, R.M., & Wheeling, P.R. (1992). Listening comprehension before and after study
abroad. Washington, DC: National Foreign Language Center.
Godfrey, L., Treacy, C., & Tarone, E. (2014). Change in French second language writing in study abroad and
domestic contexts. Foreign Language Annals, 47(1), 48–​65. doi:http://​dx.doi.org/​10.1111/​flan.12072.
Grey, S., Cox, J.G., Serafini, E., & Sanz. C. (2015). The role of individual differences in the study abroad con-
text: Cognitive capacity and language development during short-​term intensive language exposure. The
Modern Language Journal, 99(1), 137–​157. doi: 10.1111/​modl.12190.
Huebner, T. (1995). The effects of overseas language programs: Report on a case study of an intensive
Japanese course. In B. Freed (Ed.), Second language acquisition in a study abroad context (pp. 171–​193).
Amsterdam: John Benjamins.
Institute for International Education. (2018). Open doors 2018 fast facts. Retrieved from www.iie.org/​Research-​
and-​Insights/​Open-​Doors/​Fact-​Sheets-​and-​Infographics/​Fast-​Fact.
Isabelli-​García, C., Bown, J., Plews, J.L., & Dewey, D.P. (2018). Language learning and study abroad.
Language Teaching, 51(4), 439–484. doi:10.1017/S026144481800023X.
Kanno, Y., & Norton, B. (2003). Imagined communities and educational possibilities: Introduction. Journal of
Language, Identity & Education, 2(4), 241–​249. doi:10.1207/​S15327701JLIE0204_​1.
Kinginger, C. (2009). Language learning and study abroad: A critical reading of research. Basingstoke: Palgrave
Macmillan.
Lapkin, S., Hart, D., & Swain, M. (1995). A Canadian interprovincial exchange: Evaluating the linguistic
impact of a three-​month stay in Quebec. In B.F. Freed (Ed.), Second language acquisition in a study abroad
context (pp. 67–​94). Philadelphia: John Benjamins. https://​doi.org/​10.1075/​sibil.9.06lap
Llanes, À., & Muñoz, C. (2009). A short stay abroad: Does it make a difference? System, 37(3), 353–​365.
Llanes, À., & Muñoz, C. (2013). Age effects in a study abroad context: Children and adults studying abroad
and at home. Language Learning, 63(1), 63–​90. doi:10.1111/​j.1467-​9922.2012.00731.x.
Marijuan, S., & Sanz. C. (2018). Expanding boundaries: Current and new directions in study abroad research
and practice. Foreign Language Annals, 51(1), 185–​204. http://​dx.doi.org/​10.1111/​flan.12323.
Pérez-​Vidal, C. (2014). Study abroad and formal instruction contrasted. In C. Pérez-​Vidal (Ed.), Language
acquisition in study abroad and formal instruction contexts (pp. 17–​58). Amsterdam: John Benjamins.
Pérez-​Vidal, C., & Barquin, E. (2014). Comparing progress in academic writing after formal instruction
and study abroad. In C. Pérez-​Vidal (Ed.), Language acquisition in study abroad and formal instruction
contexts (pp. 217–​234). Amsterdam: John Benjamins.
Sanz, C. (2021). Afterword: Charting a path forward for study abroad and Spanish as a heritage language
research. In R. Pozzi, T. Quan, & C. Escalante (Eds.), Heritage speakers of Spanish and study abroad (pp.
276–​284). London: Routledge.
Sanz, C., & Morales-​Front, A. (2018). Issues in study abroad research and practice. In C. Sanz and A. Morales-​
Front (Eds.), The Routledge handbook of study abroad research and practice (pp. 1–​16). New York: Taylor
& Francis.
Sasaki, M. (2004). A multiple-​data analysis of the 3.5-​year development of EFL student writers. Language
Learning, 54(3), 525–​582. doi: 10.1111/​j.0023-​8333.2004.00264.x.
Sasaki, M. (2007). Effects of study-abroad experiences on EFL writers: A multiple-data analysis. The Modern
Language Journal, 91(4), 602–​620.
Sasaki, M. (2009). Changes in English as a foreign language students’ writing over 3.5 years: A sociocognitive
account. In R. Manchón (Ed.), Writing in foreign language contexts: Learning, teaching and research (pp.
49–​76). Bristol: Multilingual Matters.
Sasaki, M. (2011). Effects of varying lengths of study-​abroad experiences on Japanese EFL students’ L2 writing
ability and motivation: A longitudinal study. TESOL Quarterly, 45(1), 81–105. https://www.jstor.org/stable/41307617.
Sasaki, M., & Hirose, K. (1996). Explanatory variables for EFL students’ expository writing. Language
Learning, 46(1), 137–​174. doi: 10.1111/​j.1467-​1770.1996.tb00643.x.
Segalowitz, N., & Freed, B.F. (2004). Context, contact, and cognition in oral fluency acquisition: Learning
Spanish in at home and study abroad contexts. Studies in Second Language Acquisition, 26(2),173–​199.
doi:http://​dx.doi.org/​10.1017/​S0272263104262027.
Serrano, R., Llanes, À., & Tragant, E. (2011). Analyzing the effect of context of second language
learning: Domestic intensive and semi-​intensive courses vs. study abroad in Europe. System, 39(2), 133–​
143. doi:10.1016/​j.system.2011.05.002.
Serrano, R., Tragant, E., & Llanes, À. (2012) A longitudinal analysis of the effects of one year abroad. The
Canadian Modern Language Review, 68(2), 138–​163. doi:10.3138/​cmlr.68.2.138.
Stewart, J.A. (2010). Using e-​journals to assess students’ language awareness and social identity during study
abroad. Foreign Language Annals, 43(1), 138–​59. doi:10.1111/​j.1944-​9720.2010.01064.x.
Telles, J.A. (2015). Teletandem and performativity. Revista Brasileira de Linguística Aplicada, 15(1), 1–​30.
doi:10.1590/​1984-​639820155536.
UNESCO Institute of Statistics. (n.d.) Global flow of tertiary-​level students. Retrieved from http://​
uis.unesco.org/​en/​uis-​student-​flow
University of Oxford. (2017). International trends in higher education.
Vallejos, C. (2020). Fluency, working memory and second language proficiency in multicompetent writers.
(Doctoral dissertation, Georgetown University). Proquest.
Xu, Y. (2019). Changes in interlanguage complexity during study abroad: A meta-​analysis. System, 80, 199–​
211. doi:10.1016/​J.SYSTEM.2018.11.008.
Zalbidea, J., Issa, B.I., Faretta-​Stutenberg, M., & Sanz, C. (2020). Initial L2 proficiency and L2 grammar
development during short-​term immersion abroad: Conceptual and methodological insights. Studies in
Second Language Acquisition, 43(2), 239–​267. https://​doi.org/​10.1017/​S0272263120000376
Zaytseva, V., Miralpeix, I., & Pérez-​Vidal, C. (2019). ESL written development at home and abroad: taking
a closer look at vocabulary. International Journal of Bilingual Education and Bilingualism. doi:10.1080/​
13670050.2019.1664392.

20
L2 WRITING AND LANGUAGE
LEARNING IN ACADEMIC
SETTINGS
Nigel A. Caplan
University of Delaware

Introduction
On the one hand, the discursive knowledge-​making practices research cultures develop
over generations to accomplish their knowledge work become normalized, transparent,
invisible, and indeed appear universal to long-term members of research cultures,
rendering writing a non-​question. On the other hand, for newcomers, these very practices
constitute new territory and a vital site of inquiry into how knowledge and researcher iden-
tities are produced and negotiated in these research cultures.
(Starke-​Meyerring, 2011, p. 92)

Writing often functions as both the gatekeeper to academia –​in graded student papers, entrance
examinations, standardized tests, and doctoral dissertations –​and its currency: the journal articles,
chapters, and books in which knowledge is produced, disseminated, and acknowledged. However,
as Starke-​Meyerring (2011) observed, the “paradox” of academic writing is that it is most opaque
to those for whom it has the highest stakes, namely novices and those outside the mainstream schol-
arly society, including L2 learners. At the same time, while there is widespread recognition in style
guides, advice books, and teachers’ feedback that there are forms of language which are more or
less academic, pinning down the language that is to be learned and used in academic settings turns
out to be somewhat elusive (Biber & Gray, 2016).
Writing in academic settings could be tautologically defined as all writing that occurs in edu-
cational contexts, from pre-​school scribbles to scholarly handbooks such as this one. However, this
chapter follows Achugar and Carpenter (2014) in focusing on disciplinary literacy rather than all school
writing, operationalized as “the representation and orientation choices that characterize meaning-​making
practices within the field” (p. 61). That is, academic writing refers to writing that represents, discusses,
and expands the knowledge of particular fields. While there is much to be said about language learning
and writing in elementary and secondary settings and in other languages, this chapter concentrates on the
intersections between second-​language learning in English and writing in tertiary education.
What then is the language that is typical of these academic settings, bearing in mind that not
all language teaching in academic settings focuses on academic language? There is a widely held
assumption that academic registers of English are complex, whereas in fact it is more accurate
to say that they are specialized (Biber & Gray, 2016). For example, Biber et al. (2011) compared
professional academic writing to conversational American English and found that the traditional
definition of syntactic complexity (the extensive use of subordinate clauses) applies best to spoken
English. Academic written English is still complex but in other ways: it is more densely packed,
more reliant on nouns and phrases than verbs and clauses, and less explicit in its logical relations.
These preferences are neither accidental nor, as self-​proclaimed style gurus sometimes claim,
deliberate attempts to “dress up the trivial and obvious with the trappings of scientific sophistica-
tion” (Pinker, 2014). Academic language has evolved to fulfill the needs of its users by creating the
kinds of meanings that do the work of research writing (Halliday & Martin, 1993). As Hyland (2018)
argues, the role of the EAP teacher is to help students see “linguistic forms as not merely arbitrary,
instrumental and autonomous, just something to ‘get right’, but as fundamental to disciplinary com-
munication and to thought itself" (p. 388). A good example of this is nominalization, the process of
turning a verb (nominalize) or clause (scientists nominalize heavily) into a noun (nominalization) or
noun phrase (the heavy use of nominalization). Decried by best-​selling critics of academic style (e.g.,
Sword, 2012; Williams, 1995), nominalization is in fact a valuable resource for packing entire clauses
and processes into nouns that can be summarized, elaborated, challenged, or supported (Halliday,
1999; Halliday & Martin, 1993). Here for example is a sentence filled with nominalizations co-​
authored by one of the champions of common-​sense “clarity” in academic writing:

This fact has influenced many areas of science, including theories about the plasticity of
the young brain, the role of neural maturation in learning, and the modularity of linguistic
abilities.
(Hartshorne et al., 2018; emphasis added)

The nominalizations in this sentence are purposeful and effective in a research article on cogni-
tive linguistics in a major journal. The elaborated noun phrases (e.g., the role of neural maturation
in learning) neatly package complex theories that are presumed to be prior knowledge for readers
of Cognition so that the writers can quickly signal the place of their study in the broader field.
Consequently, academic writing –​Pinker’s (2014) “academese” –​is easy to identify and parody.
And like any other specialized register, the language of the university is “nobody’s first language”
(Bourdieu & Passeron, 1977): it must be acquired through exposure, practice, and often instruc-
tion since it is simply not present in other environments (Hinkel, 2004). No student –​regardless
of their first language(s) –​ learns academic English from participating in conversations, reading
novels, watching movies, or even listening to TED talks because all those registers are systemat-
ically different in their grammar and word choice from academic writing (Biber & Gray, 2016). In
academic settings, writers carefully select lexicogrammatical resources (that is vocabulary, syntax,
and the intersection between lexis and grammar) both within and beyond the clause. Language
thus operates at both the syntactic and discourse-​semantic levels since meaning in academic texts
is not constructed simply in clauses and sentences but in often lengthy stretches of connected text
(Humphrey & Macnaught, 2016). As such, language learning for academic purposes cannot be
reduced to vocabulary lists and grammatical accuracy: it demands attention to ways that language
resources are used to construe meanings across disciplines as well as sensitivity to the linguistic
and rhetorical variations that exist among fields, genres, and even texts belonging to the same genre
(Tardy, 2016). These grammatical choices are, as Starke-​Meyerring (2011) observed, a “new terri-
tory” for most writers and one which disciplinary experts often struggle to map.

Historical Perspectives
There is a surprisingly short history of research into the acquisition of academic written language,
so the historical perspectives on this topic are primarily pedagogical. Hyland’s (2019) introduction
to second-language writing teaching reviews six "orientations" that have arisen, survived, or fallen
since the 1970s: a focus on structure, function, expression, process, content, and genre. All have
engaged to different degrees with language. From a structural orientation, writing is the accurate
arrangement of “words, clauses and sentences structured according to a system of rules” (p. 3); thus,
the eradication of “error” is one of the primary goals of L2 writing instruction, which may be treated
as little more than an opportunity to demonstrate mastery of a language feature. From a functional
perspective, writing involves filling slots in a fixed range of text types, generally producing some-
thing like a five-​paragraph essay (Caplan & Johns, 2019). Language learning is once more viewed
through a deficit approach to reducing errors along with some instruction in decontextualized
patterns thought to support the text form (very often conjunctions and other connectors).
Expressivist approaches to language teaching have had less impact in L2 than L1 writing. By
insisting that the essence of writing is creativity and self-​expression, this approach disadvantages lan-
guage learners by failing to provide any focus on form, leading to what Halliday criticized as an attitude
of “benevolent inertia,” in which learners were largely left to exploit their own linguistic and rhetorical
resources as if learning would take place “by magic” (Halliday & Hasan, 2006, p. 24). In process
approaches, language is similarly separated from writing, which is regarded as a separate skill to be
mastered. Content approaches, meanwhile, often reinforce the idea that language learning is something
that will happen elsewhere (for example, from extensive reading in the content area) and not as part
of writing instruction (Hyland, 2019, p. 16). Finally, in some genre-​based L2 writing pedagogies, lan-
guage is reinstated as the primary semiotic system through which the work of academic writing is done
(e.g., Cheng, 2019; Rose & Martin, 2012; Schleppegrell, 2004), while in others, the role of language is
acknowledged but is seen as tangential to the development of rhetorical moves (Swales, 1990).
Such a historical perspective suggests a cline of attention to language in L2 writing. At one end,
language is defined as accuracy in structure; in the middle it fades out to be replaced by writing or
composition as an independent construct, with language learned somehow somewhere else; and at
the other end of the cline, language is reconceptualized as discourse, the ways in which meaning
is made in and across disciplinary texts. These orientations position the L2 writing teacher differ-
ently: as an enforcer of grammatical norms; as support staff to the real work of content classes; and
finally as a rhetorical and linguistic expert able to teach L2 students the tools for success in multiple,
varied disciplines (Hyland, 2018).

Critical Issues and Topics


Research into the intersection between language learning and L2 academic writing has primarily
asked two questions: what is the language of academic writing, and how do learners develop it?
Since there is no compelling evidence to suggest that most learners can acquire the specialized
registers of academic writing spontaneously, most research and thus the remainder of this chapter
examines instructed language acquisition.

Target Language in Academic Writing


A substantial body of research has investigated the language used by professional or proficient
academic writers through corpus analyses of textbooks and journal articles, which are presumed to
represent the pinnacle of the register and thus the target for academic language development (e.g.,
Biber & Gray, 2016; Coxhead, 2000; Hyland & Tse, 2007). Other investigations have used corpora
or other samples of writing by L1 and/​or L2 students in university classes in an attempt to build a
more realistic definition of student writing, although this in itself raises questions about what counts
as academic writing (e.g., Hinkel, 2002; Staples et al., 2016; Staples & Reppen, 2016).
In the area of vocabulary, much attention has been devoted to identifying the lexical items that
are frequent across a range of academic disciplines (e.g., Coxhead’s, 2000, Academic Word List and
Simpson-Vlach & Ellis's, 2010, Academic Formulas List). However, these lists are derived from texts
that students may read in academic settings (typically a corpus of research articles and textbooks)
and not those that successful students write. Syntactic features identified in English academic texts
include preferences for subordination over coordination and for phrasal modifications and reductions
over finite dependent clauses (Biber, 1992; Biber et al., 1999; Biber & Gray, 2016; Staples et al., 2016;
Staples & Reppen, 2016; Wolfe-​Quintero et al., 1998). Consequently, academic writing uses more
nouns and fewer verbs than other registers, more pre-​and post-​nominal prepositional phrases, more
noun modifiers (e.g., text type), more reduced relative clauses, and more noun complement clauses (e.g.,
the fact that …). Academic writing uses a limited range of verb tenses (present simple and past simple
followed by the present perfect) while also employing the passive voice and existential structures (e.g.,
there is/​are) more frequently (Biber & Gray, 2016; Caplan, 2019a; Hinkel, 2004).
Beyond the clause at the discourse-​semantic level, cohesion is established in academic writing
through a range of lexicogrammatical resources, including conjunctions and transition phrases,
enumerative nouns (e.g., these factors), repetition, synonymy, elision, and nominalization (Halliday
& Hasan, 1976). Proficient academic writers use fewer connectors than many L2 writers think, per-
haps due to dubious advice in their ESL textbooks (Hinkel, 2002, 2004). Instead, academic texts
express logical relationships less explicitly through verb choices (lead to, result in), juxtaposition
of ideas, punctuation such as colons, and other efficient but lexically dense and logically inexplicit
forms (Biber & Gray, 2016; de Oliveira & Schleppegrell, 2015).
Stance is another area of concern that rises above the syntactic level (Hyland, 2004). Stance
may be expressed through a wide range of reporting verbs which, although common overall in
academic writing, vary in their distribution across disciplines (Hyland, 1999). Authors also refer
to themselves in order to establish authority and interact with readers more than L2 students may
realize (Hyland, 2002), but other interactive features such as personal pronouns, exclamations, and interrogatives are infrequent (Biber & Gray, 2016; Hyland & Jiang, 2017). Many expressions
of stance are described as hedging (softening a claim or indication of a lack of commitment to a
particular statement or position) and its opposite, boosting (Hyland, 1998). Biber et al. (1999),
for example, found that the most common modal verbs in academic writing are may and can.
Other categories of hedging and boosting that have been noted and recommended as targets
for L2 writing teaching include quantifiers, conditionals, comparatives, adjectival and adverbial hedges (somewhat, approximate, particularly), and certain reporting verbs (claims, states,
argues) (Caplan, 2019a; Hyland, 1998; Kwon et al., 2018).
One explanation for these preferences is that academic writing tends towards non-​congruent
meanings, whereas everyday registers tend towards congruent meanings (Halliday, 1999). Linguistic
forms are said to be non-​congruent when they express functions that do not align with their formal
properties. For instance, in everyday English, processes are encoded as verbs (analyze, reflect, per-
spire), participants in those processes are nouns (we, light, athletes), and circumstances of time and
place are often adverbs (normally, deeply, quickly) or clauses (after they run, when it is sunny). In
academic English, processes become nouns and thus participants (analysis, reflection, perspiration),
some subjects can be omitted by using passive voice, and circumstances are often encoded as prepos-
itional phrases (in sunshine), non-​finite verbs (when running), or adjectives (the normal reflection).
A broader term for all these lexico-​syntactic choices is “grammatical metaphor” (Halliday, 1999)
because the academic usage involves a metaphorical use of, say, a process as a noun rather than its
more literal realization as a verb. Writers use these and other lexicogrammatical features discussed
in this chapter to paraphrase sources (Walsh Marr, 2019), establish cohesion and coherence, and
reproduce and extend disciplinary knowledge, the primary functions of writing in academic settings.

The Development of L2 Students’ Academic Writing


Longitudinal studies of language development in L2 academic writing are rare and logistically
complex (Connor-Linton & Polio, 2014; Wolfe-Quintero et al., 1998). Therefore, most research has
compared language use in student and professional writing (e.g., Hyland, 2002; Liardét & Black,
2019), L1 and L2 students’ writing (e.g., Hinkel, 2002, 2003), or cross-​sections of student writing
at different levels of language proficiency (e.g., Liardét, 2013; Parkinson & Musgrave, 2014). From
these comparisons, development is imputed but not conclusively demonstrated: presumably expert
writers once wrote like the novices to whom they are compared, but how this development occurred
remains opaque and is often unrelated to the features of academic language identified by corpus
studies. Further complicating the research is the tendency for quantitative studies in particular to
control the writing conditions in order to isolate language development by having students write
five-​paragraph essays (Connor-​Linton & Polio, 2014), a form of writing that has been strongly
criticized at all educational levels (Caplan & Johns, 2019). Therefore, some of the research that
purports to show language development in academic settings is based on a contrived form of writing
that may not even elicit the linguistic structures let alone the content and intellectual engagement
that define academic writing (but see Friginal & Weigle, 2014, who examined register development
in the Connor-​Linton & Polio, 2014, data set).
A different approach is taken by educational linguists working within Halliday’s (1989) Systemic
Functional Linguistics (SFL) framework (Halliday & Matthiessen, 2013). Initially focusing on
elementary and secondary school contexts, SFL scholars have mapped ways in which “learning and
learning new language occur simultaneously” (Schleppegrell, 2004, p. 23) as students encounter
increasingly complex and disciplinary-​specific ways of knowing that are abstracted from a common-​
sense view of the world. The task for students who are not yet fully proficient in the dominant language of the educational system is one of "learning language, learning through language, [and] learning
about language” (Halliday, 1993, p. 113). For example, in the U.S. context, a student might be
learning English, learning science through English, and also learning about the specific discourse
of scientific English; all three areas are interrelated and necessary for academic success. Since SFL
scholars reject the idea that most learners will develop these linguistic resources spontaneously, the
critical issue addressed by this body of research is how to modify writing and content instruction
in order to develop learners’ academic language (e.g., de Oliveira & Lan, 2014;Schleppegrell &
Achugar, 2003; Pessoa et al., 2018).

Current Contributions and Research

Variation in Academic Language


There is a healthy debate within the EAP field as to whether it is possible to identify, teach, and
learn a single register of academic writing. At the surface level, this debate can be seen in the
adoption of word lists. New ways of classifying academic English words and phrases have been
published based on increasingly sophisticated corpus and lexicographical techniques, such as the
Oxford Phrasal Academic Lexicon and the English Vocabulary Profile, which is aligned with the
Common European Framework of Reference. However, while general academic vocabulary lists
such as the Academic Word List (Coxhead, 2000) consistently cover around 10% of academic texts
(Nation, 2001), many of the words are still infrequent, unevenly distributed across disciplines, vari-
able in actual meaning and use, and from a teaching standpoint “may involve considerable learning
effort with little return” (Hyland & Tse, 2007, p. 236). In response, some discipline-​specific vocabu-
lary lists have begun to appear (e.g., Ha & Hyland, 2017; Mudraya, 2006), which may prove more
useful for learners.
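To make the coverage figures cited above concrete, the short sketch below shows how a word-list coverage percentage of the kind reported by Nation (2001) might be computed for a single text. It is an illustration only: the file names are hypothetical, and published coverage studies match word families rather than raw tokens, so this simplified version would give somewhat different numbers.

```python
import re

def load_word_list(path):
    """Read a plain-text word list (one headword per line) into a set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def coverage(text, word_list):
    """Proportion of running words in `text` that appear in `word_list`.

    A crude token-level match: published coverage figures count word
    families (e.g., analyse/analysis/analytical as one item), so this
    simple version will typically give a lower number.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in word_list) / len(tokens)

if __name__ == "__main__":
    # Both file names are hypothetical placeholders for illustration.
    awl = load_word_list("awl_headwords.txt")
    essay = open("student_essay.txt", encoding="utf-8").read()
    print(f"Word-list coverage: {coverage(essay, awl):.1%}")
```

Against a family-based list, academic prose would be expected to yield a figure in the region of the 10% reported above; the point of the sketch is simply to show what such a percentage is a percentage of.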
Variation among disciplines is also evident in syntactic and lexicogrammatical choices, including
preferred reporting verbs (e.g., show vs. claim), the choice of integral vs. non-​integral citations, and
the degree and types of hedging and boosting that occur (Hyland, 1999, 2004). Furthermore, while
Biber and Gray (2016) found that the grammatical features of academic language discussed above
do consistently differentiate academic writing from other registers, they also found systematic and
statistically significant differences among the major subdivisions within academia. In the human-
ities, there is a greater preference for clausal structures that are rare in the sciences (for example,
noun clauses complementing verbs and adverb clauses), while highly modified noun phrases are
far more characteristic of specialist science writing than popular science books or the humanities.
These trends have to be read together: fields such as physics and engineering where citations are
almost exclusively non-​integral have less use for complement noun clauses than the social sciences
and humanities, where sentences such as Biber and Gray (2016) found that … are more frequent.
As a result, tracing students’ academic language acquisition is problematic since academic lan-
guage is a moving target. This has important implications for teaching and learning: should instruc-
tion target language that is generally useful for academic writing or that is specific to a particular
discipline or even sub-​discipline? The situation is further muddied in educational contexts where
students have to read and write in many disciplines, each of which may call for its own register of
academic language (Caplan, 2019b). Consequently, writing considered effective in one class may
be deemed unacceptable in the next.

L2 Students’ Academic Language Development


The underlying premise of much of the developmental research is that lower-​performing L2
students’ writing will improve if they are taught and/​or start to produce the linguistic features
of higher-​performing students’ texts, and that the overall trajectory of linguistic development in
academic contexts should be in the direction of expert (or, problematically, monolingual peers’)
writing. Quantitative research in this area broadly accepts Biber and Gray’s (2016) core linguistic
features of the academic register as its dependent variables because these features reliably differen-
tiate academic writing from other linguistic registers.
Biber et al. (2011) hypothesized a developmental sequence for acquisition of complexity features
in academic English, from finite complement clauses through phrasal embedding, non-​finite
clauses, extraposed clauses (it is clear that …), appositives, and multiple pre-​and post-​nominal
phrasal modifiers. Parkinson and Musgrave (2014) found some limited support for elements of this
sequence by comparing EAP (that is, pre-​matriculation) graduate students’ writing, MA students’
writing, and two corpora of expert academic writing, finding a fairly consistent increase of aca-
demic features and decrease of nonacademic language features across the trajectory. Taguchi et al.’s
(2013) descriptive study of placement-​essay tests written by incoming first-​year L2 undergraduate
students also found that writing with more of the features of academic writing (especially noun-​
phrase elaboration) scored higher than writing with more of the features of other registers (espe-
cially dependent clauses), suggesting that students whose writing is closer to the academic register
are evaluated at a higher level of linguistic development. Using a larger sample and an automated
syntactic analysis tool, Lu (2011) noted a curvilinear relationship between the use of dependent
clauses and level of proficiency across four years of English studies in Chinese universities, largely
as Biber et al.’s sequence expects: that is, L2 students’ use of subordination peaks in the upper-​
intermediate levels and then declines as students develop more advanced language proficiency
and academic writing skills. Meanwhile, the use of relative clauses increases in more proficient L2 writing even as other types of subordination decrease (Crossley & McNamara, 2014), and relative clause use has been shown to predict holistic quality in ESL writing (Ferris, 1994).
These studies indicate that the linguistic differences between more and less proficient L2 writing
in English across a wide range of contexts can be partly explained by the acquisition and use of
the features that are specific to the generic construct of academic writing. It is much less clear
how this acquisition takes place and how learners develop the specific register of their discip-
lines. Research suggests that simply immersing L2 students in advanced academic studies without
language support does not usually lead to development in linguistic proficiency (Storch, 2009).
However, studies that specifically investigate the impact of instruction on the features of academic
language identified above have produced mixed results. Intensive academic language instruction
appears to produce benefits on some measures of syntactic complexity, though not on lexical sophistication or range (Bulté & Housen, 2014) or on accuracy (Polio & Shea, 2014). For example, after
a semester of intensive EAP study, the writers in Bulté and Housen’s (2014) analysis used signifi-
cantly fewer simple sentences and longer noun phrases (thus making their writing more academic)
but significantly more compound sentences (which are less academic). Mazgutova and Kormos
(2015) did find progress among low-​advanced students in a four-​week intensive language program
along the dimensions of noun-​phrase complexity and relative clauses; more advanced students,
however, showed less progress, suggesting that this type of pre-​matriculation instruction may have
a ceiling effect absent engagement in disciplinary content classes. Shin et al. (2018) found that
using corpus-informed materials to teach reporting verbs had a significant impact on first-year L2 undergraduates' literature reviews in just one 45-minute workshop. However, other research
has further complicated the general pattern of results by showing differences among writers from
different L1 backgrounds: for example, English L2 writers with Spanish and Tswana as their L1s
seem to retain a preference for subordination, while Russian L1 writers continue to use coordination
heavily even at advanced levels of English proficiency (Ai & Lu, 2013; Neff et al., 2004; Ortega, 2003).
One plausible explanation for the ambiguous evidence of development is that, as noted in the
introduction to this chapter, the language of academic writing –​and therefore evidence for the
learning of that register –​cannot be defined only by counting certain types of grammar and vocabu-
lary (e.g., Lu, 2011). Complexity and sophistication are also evident in lexicogrammatical choices
that extend beyond the clause (Humphrey & Macnaught, 2016). Furthermore, the presence of
“complex” features alone, even those that are typical of academic writing according to large-​scale
corpora, does not necessarily mean that the language is being used effectively in the particular text
and task (Ryshina-Pankova, 2015): the complexity measures in Bulté and Housen's (2014) corpus
did not correlate well with expert raters’ quality scores, suggesting a mismatch between language
use and the raters’ expectations of effective academic writing. Therefore, it is important to consider
qualitative and mixed-​method research that takes a meaning-​and not just form-​focused approach to
the development of L2 students’ academic language in specific disciplines and genres.
For example, Achugar and Carpenter (2014) conducted a design experiment with high-​school
students writing summaries of source texts in the discipline of history across a semester. Instead
of quantifying a generic set of complexity measures, the authors specifically homed in on the lexicogrammatical and discourse-semantic features that allow writers to "do history" (Coffin, 1997;
Schleppegrell et al., 2004), such as using verbs instead of conjunctions to mark logical relations,
and expressing certain types of evaluation through choices of vocabulary, comparative structures,
and adverbs. As Achugar and Carpenter note, “What constitutes evidence of development is not
just counting the features, but how these bundles of linguistic choices function in a text” (p. 62).
The students –​some but not all of whom were multilingual or designated English learners –​were
taught three language-​focused lessons on reading and analyzing source texts. The students then had
to summarize the opinions of the sources’ authors. A comparison of students’ summary writing at
the start and end of the semester revealed that all groups, including English learners, wrote longer
responses that were somewhat but not significantly more lexically dense and used a wider range
of clause combination resources including subordination and embedding. Overall, the study found
increases in non-​congruent and thus more academic ways of recontextualizing historical knowledge
despite the brevity of the intervention.
In more advanced academic settings, the most widely studied discourse-​level feature is gram-
matical metaphor, which includes the familiar concept of nominalization (Halliday, 1993).
Grammatical metaphor is important because it allows writers to “move from the language of
everyday concrete interactions toward more specialized, abstract knowledge and lexis” (Liardét,
2013, p. 163). Grammatical metaphor explains the preference for expanded noun phrases observed
by many analyses because these resources create rich opportunities to construct precise meanings
and connect clauses. For example, a Japanese ESL student wrote the following in a summary of a
research article:

The author explains that plant genetics is valuable to produce new species of food plants
by special operation. The effective use of plant genetics brings us more varieties of
vegetables, fruits.
Yasuda, 2015, p. 115, emphasis added

The non-​finite clause in the first sentence (“to produce new species”) is repackaged as a noun
phrase in the subject of the second sentence (“the effective use of plant genetics”), reducing redun-
dancy and creating cohesion (see also Liardét, 2013). However, simple nominalizations such
as agnates (to work /​work) and related lexical choices (near /​ proximity) can also contribute to
textual cohesion, even if they are not elaborated in ways that would register in automated analyses
(Ryshina-Pankova, 2015). The resulting texts are, as Biber and Gray (2016) found, more synoptic (noun-based) and non-congruent (the part of speech does not match the meaning of the word) than dynamic (verb-based) and congruent (Halliday, 1993) because they are increasingly trying to represent complex, abstract, disciplinary concepts.
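Automated analyses typically approximate grammatical metaphor by counting candidate nominalizations, most often through derivational suffixes. The sketch below is a deliberately crude proxy of that kind, offered only to illustrate the point; the suffix list is an assumption rather than a validated instrument, and it cannot detect zero-derived nominalizations or judge whether a nominalization is deployed effectively.

```python
import re

# Derivational suffixes often used as a rough proxy for nominalization
# (e.g., "operation", "development", "effectiveness"). The list is an
# assumption for illustration; it over-counts some words and, crucially,
# misses zero-derived nominalizations such as "use".
NOMINAL_SUFFIXES = ("tion", "sion", "ment", "ness", "ity", "ance", "ence", "age")

def candidate_nominalizations(text):
    """Return tokens ending in a typical nominalizing suffix."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if len(t) > 5 and t.endswith(NOMINAL_SUFFIXES)]

def density_per_100_words(text):
    """Candidate nominalizations per 100 running words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return 100 * len(candidate_nominalizations(text)) / max(len(tokens), 1)

if __name__ == "__main__":
    # Adapted from the student example quoted above.
    sample = ("The author explains that plant genetics is valuable to produce "
              "new species of food plants by special operation.")
    print(candidate_nominalizations(sample))  # ['operation']
    print(round(density_per_100_words(sample), 1))
```

Run on the student example, a counter like this catches operation but would miss the zero-derived use in the writer's second sentence, which is precisely the limitation noted by Ryshina-Pankova (2015).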
If grammatical metaphor is a form of semantic complexity that distinguishes certain types of aca-
demic writing just as much as the markers of formal complexity discussed earlier (Ryshina-​Pankova,
2015), then it should emerge as L2 students’ academic writing becomes more sophisticated and profi-
cient, at least in contexts and genres which can reasonably elicit this register of writing. This indeed
seems to be the case. In her descriptive study of 30 Japanese university students learning to write a
summary, Yasuda (2015) found an increase in the use of grammatical metaphor, suggesting that this
resource is amenable to instruction. When grammatical metaphor is not explicitly taught, it still appears
to develop over time but less consistently: while fourth-​year university EFL students in China were
found to use this resource more than first years, they also missed many opportunities to reconstrue
congruent (that is, clausal) structures with the nominalized structures that are more indicative of, or effective in, academic writing, or struggled to deploy nominalization effectively and in the correct
word forms (Liardét, 2013, 2016). The need for explicit instruction is further supported by a study
conducted with undergraduate students taking classes in English as their L2 in Hong Kong (Devrim,
2015). “Frontloaded” instruction in the resources of grammatical metaphor was found to increase
the number of experiential metaphors (that is, nominalization) and logical metaphors (e.g., replacing
the more congruent conjunction so with the metaphorical prepositional phrase as a result). Similarly,
Walsh Marr (2019) and Yasuda (2017) have demonstrated the impact of teaching L2 students to move
between congruent and non-​congruent language forms in order to paraphrase effectively.
Taken together, the research into the development of canonical markers of academic written
language suggests that development does occur, insofar as contrasts can be found between more and less
proficient writers. However, these studies have not shown clear trajectories or mechanisms for this
development. Meanwhile, studies that go beyond clausal syntax by examining discourse semantics
have been somewhat more successful at showing that the development of specific lexicogrammatical
features can occur through targeted instruction.

Development of Stance and Engagement in Academic Writing


As discussed above, another defining characteristic of academic writing is the language used to
encode stance. Research from pragmatic and syntactic perspectives has revealed the preferences
and patterns of hedging, boosting, and other metadiscursive markers of stance in expert aca-
demic writing (Hyland, 2004). It should therefore be expected that L2 students’ linguistic resources
for indicating their attitudes and positions will develop. Indeed, novice L2 writers frequently
underuse these resources, leading to claims that are overstated, exaggerated, or excessively forceful
(Hinkel, 2002).
An alternative approach to Hyland’s widely studied formulation of stance is to use Appraisal
theory, a branch of SFL concerned with the different ways that writers evaluate claims, attitudes,
and sources at the discourse-​semantic level (Hood, 2010; Martin & White, 2005; White, 2003).
In L2 writing, the most active area of Appraisal research is in the Engagement subsystem, the
resources that writers use to bring other sources into their texts and control their alignment to or dis-
tance from those intertextual references. Most research into Engagement has focused on describing
experts’ and students’ –​including L2 students’ –​use of its resources in order to make their texts
“heteroglossic,” a key expectation since most academic writing refers to other texts (e.g., Liardét &
Black, 2019; Miller et al., 2014; Pessoa et al., 2017; White, 2003).
For instance, Liardét and Black (2019) investigated the use of reporting verbs that, in
Engagement terms, expand the dialogical space by opening up the possibility of disagree-
ment and alternative explanations (e.g., the author argues that), or attribute an idea neutrally
(e.g., according to the author, included in this study as an equivalent to a reporting verb)
versus those that shut down any further dialogue (e.g., the author proves that). There were
differences between expert and student writing in the density of citations, the choice of reporting
verbs, and most interestingly the function of the verbs in the Engagement system: writers of
published research articles were more likely to contract the dialogic space and exclude alterna-
tive opinions, while students –​writing in English both as an L1 and L2 –​were more likely to
attribute ideas and open the space for disagreements rather than indicate strong alignment with
sources. Liardét and Black conclude that expert texts “are more balanced, providing greater
insight into the value with which the writers perceive the given propositions” (p. 47).
Research that compares student and expert writing may offer a trajectory for development.
However, there are important differences between published journal articles and first-​year under-
graduate student writing. The target may not be professional academic writing, but the language of
a particular pedagogical genre. For example, Pessoa et al. (2018) designed an intervention in col-
laboration with a history instructor to teach specific language resources for a historical argument,
including using “multi-​voiced resources (heteroglossic propositions) to present information as an
interpretation that needs to be argued for” (p. 85). In this way, students learned what they needed to
do in the assignment (interpret evidence in order to evaluate a position), how to do it with effective
language resources (such as modal verbs, reporting verbs, and the concede-​counter move although
this … that), and why those language forms realize the intended function. This is a more nuanced
approach than the list of reporting verbs often presented in ESL textbooks. Instead, writers learn
how to use language resources meaningfully to complete a challenging task. Over the course of the
semester, students incorporated the new Engagement techniques with increasing success, especially
those who started out with weaker baseline essays, thus narrowing the gap with more experienced
and proficient students. Students who received the intervention were more likely to write arguments
rather than recounts, suggesting that stance, defined here through the Appraisal framework, can and
should be taught through language-​focused interventions if students are to develop disciplinary
literacy.

Research Methods
As can be seen from this chapter, three principal methods have been used to investigate aspects
of academic language and language acquisition in the context of L2 writing. First, corpus-​based
research has provided insights into the nature of written academic English, usually through the
analysis of highly proficient or professional, published writing and less often of writing by successful L1 or
L2 student writers (e.g., Kwon et al., 2018; Staples et al., 2016). Other corpora allow research into
learner language, such as the International Corpus of Learner English (Granger et al., 2020) or the
Chinese Longitudinal Learner Corpus (CLLC), which contains one essay written in four consecu-
tive semesters by the same group of 130 students (Liardét, 2016). Some studies focus on vocabu-
lary (Coxhead, 2000), lexical bundles (Biber, 2006), phrase frames (Lu et al., 2018), or a particular
grammatical structure or function such as reporting verbs (Liardét & Black, 2019), while others
have used multidimensional measures of register (e.g., Biber et al., 2020) or complexity measures
from automated programs such as Coh-​Metrix (e.g., Crossley & McNamara, 2014) or the Syntactic
Complexity Analyzer (e.g., Lu, 2011).
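For readers unfamiliar with how such automated programs arrive at their indices, the sketch below illustrates one way a simple clausal subordination measure could be derived from dependency-parsed text. It is not the algorithm of Coh-Metrix or the Syntactic Complexity Analyzer; the spaCy model name, the set of dependency labels counted, and the sample sentence are all assumptions made for illustration.

```python
import spacy

# Any pipeline with a dependency parser would do; the model name here is an
# assumption (install with: python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")

# Universal Dependencies labels commonly treated as dependent-clause heads.
# The exact set is a choice, not a standard taken from any of the tools cited.
DEPENDENT_CLAUSE_DEPS = {"ccomp", "xcomp", "advcl", "acl", "relcl"}

def dependent_clauses_per_sentence(text):
    """Crude ratio of dependent-clause heads to sentences.

    A measure like this only approximates indices such as dependent clauses
    per T-unit and says nothing about whether the clauses are used effectively.
    """
    doc = nlp(text)
    n_sentences = sum(1 for _ in doc.sents)
    n_clauses = sum(1 for token in doc if token.dep_ in DEPENDENT_CLAUSE_DEPS)
    return n_clauses / max(n_sentences, 1)

if __name__ == "__main__":
    sample = ("Although the study was small, the researchers argue that the "
              "results, which were replicated, support the original claim.")
    print(round(dependent_clauses_per_sentence(sample), 2))
```

The indices produced by the published tools are considerably more elaborate, normalizing by units such as T-units or clauses, but they are built from counts of essentially this kind.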
In the second approach, comparative research designs either look for differences among students
at different proficiency levels or compare L2 writers to monolinguals or expert writers (e.g., Hinkel,
2002; Liardét & Black, 2019). Such between-​group comparisons are valuable but do not directly
speak to individuals’ language acquisition and writing development. Case studies have shed light on
language development over a semester or more for a single student (e.g., Li & Schmitt, 2009), a whole course (e.g., Yasuda, 2011), and even an entire EAP program (Connor-Linton & Polio, 2014). There is a need for more qualitative, quantitative, and mixed-methods lon-
gitudinal studies.
Finally, although less common, experimental studies of interventions can shed light on the effect-
iveness of instruction on the development of academic written language. Interventions include case
studies of the effectiveness of instruction provided through specific courses and assignments (e.g.,
Pessoa et al., 2018), design research in which an intervention is tested and refined over the course of
the study (e.g., Achugar & Carpenter, 2014), and more traditional pre/​posttest intervention research
which seeks to measure development in a treatment group by comparison to a business-​as-​usual
control group (e.g., Shin et al., 2018).

Recommendations for Practice


One of the challenges for L2 writers and writing teachers is that the task can feel insurmount-
able: understanding new genres, managing sources, organizing ideas across the whole text and within
each stage of the text, hedging and boosting claims, evaluating propositions, adopting a discipline-​
specific academic register, choosing the right words, and arranging them into meaningful, accurate,
and cohesive clauses and sentences. And all this must be accomplished using language resources
that the L2 writer is still acquiring as they are “learning language, learning through language, [and]
learning about language” (Halliday, 1993, p. 113). In order to do any of this, L2 writing instruction
needs to take place using relevant genres and disciplinary materials. Furthermore, language must be
seen as integral to the acquisition and practice of academic writing. Decontextualized gap-​filling, sen-
tence writing, and skill-​based writing exercises, including artificial forms such as the five-​paragraph
essay, are likely to be less effective than meaningful engagement in the practices of academic writing,
whether those practices are situated in secondary education, higher education, or professional research
(Caplan & Johns, 2019). The debate between “general” and “specific” models of academic language
instruction will most often be resolved by logistical considerations of the institution. However, regard-
less of the program structure, all writing teachers in all academic settings need to see themselves as
language teachers and develop their awareness of the common underlying features of the academic
register as well as the discipline-​specific variations that pertain to their students’ current and future
needs (de Oliveira & Schleppegrell, 2015; Hyland, 2007; Schleppegrell, 2004).
A powerful heuristic for understanding the language demands of a writing task is the 3×3 matrix,
developed using the SFL framework discussed throughout this chapter (Humphrey, 2013; Humphrey
et al., 2010; Humphrey & Macnaught, 2016). The matrix shows how SFL’s three metafunctions of
meaning (the ideational, the interpersonal, and the textual) are construed in language choices at
three levels: the whole text; the paragraph, phase, or stage; and the clause, word, and sentence.
Versions of the 3×3 have also been created for specific disciplinary genres (e.g., Pessoa et al.,
2018), and one variation uses less SFL metalanguage and instead turns each cell in the matrix into
questions that can be asked of any academic writing task (Caplan, 2019b). These questions target all
the key features of academic register raised in this chapter, including stance and Engagement, thus
allowing teachers and students to make choices that are not just generically “academic” but specific
to a particular context. Pessoa et al. (2018) have demonstrated that by looking across texts, within
clauses, and beyond sentences, much can be inferred about the demands of an academic task that can
inform instruction, assignment design, and assessment.

Future Directions
The place of language learning in L2 academic writing remains open to debate, with scope for
further research, especially oriented towards pedagogy. In a personal essay on the future of EAP,
Swales (2019) invites researchers to redirect their attention away from text structure, moves, and
formulaic vocabulary sequences. He also criticizes research on stance and Engagement that does
not consider actual readers but instead makes assumptions about writers’ communicative goals. In
their place, Swales promotes "thicker" ethnographic studies of texts in context, including further
understanding of the use of lexicogrammatical patterns and cohesive techniques that have clear
pedagogical value.
Furthermore, even as we develop a deeper understanding of the language features that make
writing academic, either broadly or in specific fields and genres, we have a far hazier picture of
how and why some learners develop those linguistic and rhetorical skills in their writing. EAP/​
ESP and SFL pedagogies implicitly presume that instruction is at least beneficial and perhaps even
necessary, but the research base is insufficient to make strong claims about the value of teaching
academic language for L2 writing. More careful intervention studies from quantitative, qualitative,
and mixed-​methods paradigms as well as, and perhaps especially, high-​quality longitudinal studies
of development are needed.

References
Achugar, M., & Carpenter, B.D. (2014). Tracking movement toward academic language in multilin-
gual classrooms. Journal of English for Academic Purposes, 14, 60–​71. https://​doi.org/​10.1016/​j.jeap.
2013.12.002
Ai, H., & Lu, X. (2013). A corpus-​based comparison of syntactic complexity in NNS and NS university
students’ writing. In N. Ballier, A. Díaz Negrillo, & P. Thompson (Eds.), Automatic treatment and analysis
of learner corpus data (pp. 249–​264). Amsterdam: John Benjamins.
Biber, D. (1992). On the complexity of discourse complexity: A multidimensional analysis. Discourse
Processes, 15, 133–​163. https://​doi.org/​10.1080/​01638539209544806
Biber, D. (2006). University language: A corpus-​based study of spoken and written registers. Amsterdam: John
Benjamins.
Biber, D., & Gray, B. (2016). Grammatical complexity in academic English: Linguistic change in writing.
Cambridge: Cambridge University Press.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure gram-
matical complexity in L2 writing development? TESOL Quarterly, 45, 5–​35. https://​doi.org/​10.5054/​
tq.2011.244483
Biber, D., Gray, B., Staples, S., & Egbert, J. (2020). Investigating grammatical complexity in L2 English
writing research: Linguistic description versus predictive measurement. Journal of English for Academic
Purposes, 46. https://​doi.org/​10.1016/​j.jeap.2020.100869
Biber, D., Johansson, S., Leech, G., Conrad, S.M., & Finegan, E. (1999). Longman grammar of spoken and
written English. Harlow: Longman.
Bourdieu, P., & Passeron, J.C. (1977). Reproduction in education, society and culture (trans. R. Nice).
London: Sage Publications.
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-​term changes in L2 writing complexity.
Journal of Second Language Writing, 26, 42–​65. https://​doi.org/​10.1016/​j.jslw.2014.09.005
Caplan, N.A. (2019a). Grammar choices for graduate and professional writers (2nd ed.). Ann Arbor: University
of Michigan Press.
Caplan, N.A. (2019b). Asking the right questions: Demystifying writing assignments across the disciplines.
Journal of English for Academic Purposes, 41. https://​doi.org/​10.1016/​j.jeap.2019.100776
Caplan, N.A., & Johns, A.M. (Eds.). (2019). Changing practices for the L2 writing classroom: Moving beyond
the five-​paragraph essay. Ann Arbor: University of Michigan Press.
Cheng, A. (2019). Genre and graduate-​level research writing. Ann Arbor: University of Michigan Press.
Coffin, C. (1997). Constructing and giving value to the past: An investigation into secondary school history.
In F. Christie & J. R. Martin (Eds.), Genre and institutions: Social processes in the workplace and school
(pp. 196–​230). London: Cassell.
Connor-​Linton, J., & Polio, C. (2014). Comparing perspectives on L2 writing: Multiple analyses of a common
corpus. Journal of Second Language Writing, 26, 1–​9. https://​doi.org/​10.1016/​j.jslw.2014.09.002
Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34, 213–​238. https://​doi.org/​10.2307/​
3587951
Crossley, S.A., & McNamara, D. S. (2014). Does writing development equal writing quality? A computational
investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66–​79.
https://​doi.org/​10.1016/​j.jslw.2014.09.006
de Oliveira, L.C., & Lan, S.-​W. (2014). Writing science in an upper elementary classroom: A genre-​based
approach to teaching English language learners. Journal of Second Language Writing, 25, 23–​39. https://​
doi.org/​10.1016/​j.jslw.2014.05.001
de Oliveira, L.C., & Schleppegrell, M. (2015). Focus on grammar and meaning. Oxford: Oxford University Press.
Devrim, D.Y. (2015). Grammatical metaphor: What do we mean? What exactly are we researching? Functional
Linguistics, 2(1). https://​doi.org/​10.1186/​s40554-​015-​0016-​7
Ferris, D.R. (1994). Lexical and syntactic features of ESL writing by students at different levels of L2 profi-
ciency. TESOL Quarterly, 28, 414–​420. https://​doi.org/​10.2307/​3587446
Friginal, E., & Weigle, S. (2014). Exploring multiple profiles of L2 writing using multi-​dimensional analysis.
Journal of Second Language Writing, 26, 80–​95. https://​doi.org/​10.1016/​j.jslw.2014.09.007
Granger, S., Dupont, M., Meunier, F., Hubert, N., & Paquot, M. (2020). The International Corpus of Learner
English. Version 3. Louvain-​la-​Neuve: Presses Universitaires de Louvain.
Ha, A.Y.H., & Hyland, K. (2017). What is technicality? A technicality analysis model for EAP vocabulary.
Journal of English for Academic Purposes, 28, 35–​49. https://​doi.org/​10.1016/​j.jeap.2017.06.003
Halliday, M.A.K. (1989). Language, context, and text: Aspects of language in a social-​semiotic perspective
(2nd ed.). Oxford: Oxford University Press.
Halliday, M.A.K. (1993). Towards a language-​based theory of learning. Linguistics and Education, 5, 93–​116.
https://​doi.org/​10.1016/​0898-​5898(93)90026-​7
Halliday, M.A.K. (1999). Construing experience through meaning: A language-​based approach to cognition.
London: Cassell.
Halliday, M.A.K., & Hasan, R. (1976). Cohesion in English. Harlow: Longman.
Halliday, M.A.K., & Hasan, R. (2006). Language and literacy. In R. Whittaker, A. McCabe, & M. O’Donnell
(Eds.), Language and literacy: Functional approaches (pp. 15–​44). London: Continuum.
Halliday, M.A.K., & Martin, J.R. (1993). Writing science: Literacy and discursive power. London: Taylor &
Francis.
Halliday, M.A.K., & Matthiessen, C. (2013). Halliday's introduction to functional grammar (4th ed.). Abingdon:
Routledge.
Hartshorne, J.K., Tenenbaum, J.B., & Pinker, S. (2018). A critical period for second language acquisi-
tion: Evidence from 2/​3 million English speakers. Cognition, 177, 263–​277. https://​doi.org/​10.1016/​
j.cognition.2018.04.007
Hinkel, E. (2002). Second language writers’ text: Linguistic and rhetorical features. Mahwah, NJ: Lawrence
Erlbaum.
Hinkel, E. (2003). Simplicity without elegance: Features of sentences in L1 and L2 academic texts. TESOL
Quarterly, 37, 275–​301. https://​doi.org/​10.2307/​3588505
Hinkel, E. (2004). Teaching academic ESL writing: Practical techniques in vocabulary and grammar.
Mahwah, NJ: Lawrence Erlbaum.
Hood, S. (2010). Appraising research: Evaluation in academic writing. Basingstoke: Palgrave Macmillan.
Humphrey, S. (2013). And the word became text: A 4 × 4 toolkit for scaffolding writing in secondary English.
English in Australia, 48(1), 46–​55.
Humphrey, S., & Macnaught, L. (2016). Developing teachers’ professional knowledge for discipline literacy
instruction. In H. De Silva Joyce (Ed.), Language at work: Analysing language use in work, education,
medical and museum contexts (pp. 68–​86). Newcastle-​upon-​Tyne: Cambridge Scholars Publishing.
Humphrey, S., Martin, J.R., Dreyfus, S.J., & Mahboob, A. (2010). The 3 × 3: Setting up a linguistic toolbox
for teaching and assessing academic writing. In A. Mahboob & N. Knight (Eds.), Appliable Linguistics (pp.
185–199). London: Continuum.
Hyland, K. (1998). Hedging in scientific research articles. Amsterdam: John Benjamins.
Hyland, K. (1999). Academic attribution: Citation and the construction of disciplinary knowledge. Applied
Linguistics, 20, 341–​367. https://​doi.org/​10.1093/​applin/​20.3.341
Hyland, K. (2002). Authority and invisibility: Authorial identity in academic writing. Journal of Pragmatics,
34, 1091–​1112.
Hyland, K. (2004). Disciplinary discourses: Social interactions in academic writing. Ann Arbor: University
of Michigan Press.
Hyland, K. (2007). Genre pedagogy: Language, literacy and L2 writing instruction. Journal of Second
Language Writing, 16, 148–​164. https://​doi.org/​10.1016/​j.jslw.2007.07.005
Hyland, K. (2018). Sympathy for the devil? A defence of EAP. Language Teaching, 51, 383–​399. https://​
doi.org/​10.1017/​S0261444818000101
Hyland, K. (2019). Second language writing (2nd ed.). Cambridge: Cambridge University Press.
Hyland, K., & Jiang, F. (2017). Is academic writing becoming more informal? English for Specific Purposes,
45, 40–​51. https://​doi.org/​10.1016/​j.esp.2016.09.001
Hyland, K., & Tse, P. (2007). Is there an “academic vocabulary”? TESOL Quarterly, 41, 235–​253. https://​
doi.org/​10.2307/​40264352
Kwon, M.H., Staples, S., & Partridge, R.S. (2018). Source work in the first-​year L2 writing class-
room: Undergraduate L2 writers’ use of reporting verbs. Journal of English for Academic Purposes, 34,
86–​96. https://​doi.org/​10.1016/​j.jeap.2018.04.001
Li, J., & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study.
Journal of Second Language Writing, 18, 85–​102. https://​doi.org/​10.1016/​j.jslw.2009.02.001
Liardét, C.L. (2013). An exploration of Chinese EFL learner’s deployment of grammatical metaphor: Learning
to make academically valued meanings. Journal of Second Language Writing, 22, 161–​178. https://​doi.org/​
10.1016/​j.jslw.2013.03.008
Liardét, C.L. (2016). Nominalization and grammatical metaphor: Elaborating the theory. English for Specific
Purposes, 44, 16–​29. https://​doi.org/​10.1016/​j.esp.2016.04.004
Liardét, C.L., & Black, S. (2019). “So and so” says, states and argues: A corpus-​assisted engagement analysis
of reporting verbs. Journal of Second Language Writing, 44, 37–​50. https://​doi.org/​10.1016/​j.jslw.2019.
02.001
Lu, X. (2011). A corpus-​based evaluation of syntactic complexity measures as indices of college-​level ESL
writers’ language development. TESOL Quarterly, 45, 36–​62. https://​doi.org/​10.5054/​tq.2011.240859
Lu, X., Yoon, J., & Kisselev, O. (2018). A phrase-​frame list for social science research article introductions.
Journal of English for Academic Purposes, 36, 76–​85. https://​doi.org/​10.1016/​j.jeap.2018.09.004
Martin, J., & White, P. (2005). The language of evaluation: Appraisal in English. Basingstoke: Palgrave
Macmillan. https://​doi.org/​10.1057/​9780230511910
Mazgutova, D., & Kormos, J. (2015). Syntactic and lexical development in an intensive English for
Academic Purposes programme. Journal of Second Language Writing, 29, 3–​15. https://​doi.org/​10.1016/​
j.jslw.2015.06.004
Miller, R.T., Mitchell, T.D., & Pessoa, S. (2014). Valued voices: Students’ use of Engagement in argumenta-
tive history writing. Linguistics and Education, 28, 107–​120. https://​doi.org/​10.1016/​j.linged.2014.10.002
Mudraya, O. (2006). Engineering English: A lexical frequency instructional model. English for Specific
Purposes, 25, 235–​256. https://​doi.org/​10.1016/​j.esp.2005.05.002
Nation, I.S.P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Neff, J., Dafouz, E., Diez, M., Prieto, R., & Chaudron, C. (2004). Contrastive discourse analysis: Argumentative
text in English and Spanish. In C.L. Moder & A. Martinovic-​Zic (Eds.), Studies in language companion
series (Vol. 68, pp. 267–​283). Amsterdam: John Benjamins. https://​doi.org/​10.1075/​slcs.68.15nef
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis
of college-​level L2 writing. Applied Linguistics, 24, 492–​518. https://​doi.org/​10.1093/​applin/​24.4.492
Parkinson, J., & Musgrave, J. (2014). Development of noun phrase complexity in the writing of English
for Academic Purposes students. Journal of English for Academic Purposes, 14, 48–​59. https://​doi.org/​
10.1016/​j.jeap.2013.12.001
Pessoa, S., Mitchell, T.D., & Miller, R.T. (2017). Emergent arguments: A functional approach to analyzing stu-
dent challenges with the argument genre. Journal of Second Language Writing, 38, 42–​55. https://​doi.org/​
10.1016/​j.jslw.2017.10.013
Pessoa, S., Mitchell, T.D., & Miller, R.T. (2018). Scaffolding the argument genre in a multilingual univer-
sity history classroom: Tracking the writing development of novice and experienced writers. English for
Specific Purposes, 50, 81–​96. https://​doi.org/​10.1016/​j.esp.2017.12.002
Pinker, S. (2014, September 26). Why academics stink at writing. The Chronicle of Higher Education.
Retrieved from www.chronicle.com/​article/​Why-​Academics-​Writing/​148989
Polio, C., & Shea, M.C. (2014). An investigation into current measures of linguistic accuracy in second
language writing research. Journal of Second Language Writing, 26, 10–​27. https://​doi.org/​10.1016/​
j.jslw.2014.09.003
Rose, D., & Martin, J.R. (2012). Learning to write, reading to learn: Genre, knowledge and pedagogy in the
Sydney School. Sheffield: Equinox.
Ryshina-​Pankova, M. (2015). A meaning-​based approach to the study of complexity in L2 writing: The
case of grammatical metaphor. Journal of Second Language Writing, 29, 51–​63. https://​doi.org/​10.1016/​
j.jslw.2015.06.005
Schleppegrell, M.J. (2004). The language of schooling: A functional linguistics perspective. Mahwah,
NJ: Lawrence Erlbaum.
Schleppegrell, M.J., & Achugar, M. (2003). Learning language and learning history: A functional linguistics
approach. TESOL Journal, 12(2), 21–​27. https://​doi.org/​10.1002/​j.1949-​3533.2003.tb00126.x
Schleppegrell, M.J., Achugar, M., & Oteíza, T. (2004). The grammar of history: Enhancing content-​based
instruction through a functional focus on language. TESOL Quarterly, 38, 67–​93. http://​dx.doi.org/​10.2307/​
3588259
Shin, J.Y., Velázquez, A.J., Swatek, A., Staples, S., & Partridge, R.S. (2018). Examining the effectiveness of
corpus-​informed instruction of reporting verbs in L2 first-​year college writing. L2 Journal, 10(3). https://​
doi.org/​10.5070/​L210337022
Simpson-​Vlach, R., & Ellis, N.C. (2010). An academic formulas list: New methods in phraseology research.
Applied Linguistics, 31, 487–​512. https://​doi.org/​10.1093/​applin/​amp058
Staples, S., Egbert, J., Biber, D., & Gray, B. (2016). Academic writing development at the university
level: Phrasal and clausal complexity across level of study, discipline, and genre. Written Communication,
33, 149–​183. https://​doi.org/​10.1177/​0741088316631527
Staples, S., & Reppen, R. (2016). Understanding first-​year L2 writing: A lexico-​grammatical analysis across
L1s, genres, and language ratings. Journal of Second Language Writing, 32, 17–​35. https://​doi.org/​10.1016/​
j.jslw.2016.02.002
Starke-​Meyerring, D. (2011). The paradox of writing in doctoral education: Student experiences. In L.
McAlpine & C. Amundsen (Eds.), Doctoral education: Research-​based strategies for doctoral students,
supervisors and administrators (pp. 75–​95). New York: Springer. http://​link.springer.com/​chapter/​10.1007/​
978-​94-​007-​0507-​4_​5
Storch, N. (2009). The impact of studying in a second language (L2) medium university on the development of
L2 writing. Journal of Second Language Writing, 18, 103–​118. https://​doi.org/​10.1016/​j.jslw.2009.02.003
Swales, J.M. (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge
University Press.
Swales, J.M. (2019). The futures of EAP genre studies: A personal viewpoint. Journal of English for Academic
Purposes, 38, 75–​82. https://​doi.org/​10.1016/​j.jeap.2019.01.003
Sword, H. (2012). Stylish academic writing. Cambridge, MA: Harvard University Press.
Taguchi, N., Crawford, W., & Wetzel, D.Z. (2013). What linguistic features are indicative of writing quality?
A case of argumentative essays in a college composition program. TESOL Quarterly, 47, 420–​430. https://​
doi.org/​10.1002/​tesq.91
Tardy, C.M. (2016). Beyond convention: Genre innovation in academic writing. Ann Arbor: University of
Michigan Press.
Walsh Marr, J. (2019). Making the mechanics of paraphrasing more explicit through grammatical metaphor.
Journal of English for Academic Purposes, 42. https://​doi.org/​10.1016/​j.jeap.2019.100783
White, P.R.R. (2003). Beyond modality and hedging: A dialogic view of the language of intersubjective
stance. Text –​Interdisciplinary Journal for the Study of Discourse, 23, 259–​284. https://​doi.org/​10.1515/​
text.2003.011
Williams, J.M. (1995). Style: Toward clarity and grace. Chicago: University of Chicago Press.
Wolfe-​Quintero, K., Inagaki, S., & Kim, H.-​Y. (1998). Second language development in writing: Measures
of fluency, accuracy, and complexity. University of Hawai’i National Foreign Language Resource Center.
Yasuda, S. (2015). Exploring changes in FL writers’ meaning-​making choices in summary writing: A sys-
temic functional approach. Journal of Second Language Writing, 27, 105–​121. https://​doi.org/​10.1016/​
j.jslw.2014.09.008
Yasuda, S. (2017). Toward a framework for linking linguistic knowledge and writing expertise: Interplay
between SFL-​based genre pedagogy and task-​based language teaching. TESOL Quarterly, 51, 576–​606.
https://​doi.org/​10.1002/​tesq.383

21
L2 WRITING AND LANGUAGE
LEARNING IN ELECTRONIC
ENVIRONMENTS
Scott Aubrey and Natsuko Shintani
Chinese University of Hong Kong and Kansai University

Introduction
This chapter sets out to explore the potential for electronic environments to support writing for
second language acquisition (SLA). To do so, several core areas related to computers and language
learning –​ along with their requisite acronyms –​ will be defined. We will then describe how these
areas have become increasingly important for the teaching and learning of L2 writing. Next, we will
examine the three major areas of writing that technology has affected, namely, synchronous text-​
based communication (text chat), collaborative writing, and teacher-​provided written corrective
feedback. Included in this examination will be descriptions of commonly used electronic tools, their
capabilities, and their limitations. We will also examine the research findings as they relate to the
effect of technology on the L2 writing process, the written product, and the ability for learners to
acquire the L2 through writing, followed by a summary of common methodological characteristics.
Finally, based on our discussion, pedagogic suggestions and future directions for research relating
to L2 writing, SLA, and technology will be described.

Historical Perspectives
Computer-​assisted language learning (CALL) is defined as “any process in which a learner uses a
computer and, as a result, improves his or her language” (Beatty, 2003, p. 7) and is now an integral
part of language learning. In L2 writing, CALL environments provide a multitude of opportunities
for learners to engage in writing and for teachers to support them. Early research on writing in
CALL focused on the advantages of using word processing over traditional pen-​and-​paper writing,
emphasizing the convenience of the mechanical process of generating texts, such as deleting and
adding words and the physical ease of pressing keys (Pennington, 1996). This led to learners producing longer texts and writing for longer periods of time (Brock & Pennington, 1999; Chadwick & Bruce, 1989), revising more intensively (Pennington, 1993, 1999), and planning more within the task (Akyel & Kamisli, 1999) than they otherwise would. However, the development of the
internet and the advent of mobile devices have broadened the field of CALL and second language
writing into online communication.
The areas of mobile-​assisted language learning and technology-​enhanced language learning
have introduced a new kind of writing to the second language writing classroom: electronic writing.
Electronic writing is described as a conglomerate of writing that can be done on and through the
medium of a networked computer (Ferris, 2002), which offers a greater capacity for individual
participation and interactivity. Such writing takes place in the context of computer-​mediated com-
munication (CMC), that is, any kind of human communication that occurs through the use of elec-
tronic devices. CMC may occur either asynchronously (e.g., email, blogs) or synchronously (e.g., Facebook Messenger, LINE, WeChat), referred to as asynchronous CMC and synchronous CMC
respectively.
Internet-​based tools that facilitate CMC are referred to as Web 2.0 or “web technology that aims
to enhance creativity, information sharing and collaboration among users” (Tu, Blocher, & Ntoruru,
2008, p. 336) and have become ubiquitous in second language writing classrooms. One example
of such a tool is Google Docs, a free web-​based program that allows for synchronous editing of
documents by more than one user. This not only provides a platform for learners to collaborate
while writing but also allows the teacher to monitor and give feedback in real time. Thus, the most influential aspects of CALL for writing currently seem to lie in the CMC domain, which has turned the writing process from an individual, solitary activity into a collaborative one, opening up new
opportunities for developing writing skills and language learning. Furthermore, as recent reviews of
CALL demonstrate (e.g., Çiftçi & Aslan, 2019; Lin, 2015), CMC has become an important tool in
L2 writing. Although we acknowledge recent developments in non-​CMC areas of electronic writing
(e.g., automated writing scoring), for the purpose of this chapter, our discussion of writing in elec-
tronic environments only concerns writing that occurs in a CMC context.

Critical Issues and Topics

Writing in Text Chat


The first subset of CMC relevant to L2 writing is text chat, which is described as “types of technolo-
gies that allow users to transfer text messages between computers quasi-​instantaneously” (Wigham
& Chanier, 2015, p. 260). Text chat is now an integral part of everyday communication and exists
as part of mobile applications (e.g., LINE, WeChat), social networking sites (e.g., Facebook), and
Multi-​User Virtual Environments (e.g., Second Life, River City, Quest Atlantis).
The discourse of text chat combines characteristics of spoken and written language (Blake,
2009; Smith, 2003). Similar to traditional writing, it results in a permanent written record which can
be reviewed by the learner as the communication evolves (Chun & Payne, 2004), but like speaking,
text chat produces a spontaneous exchange of information (Adams, Mohd Alwi, & Newton, 2015).
Its similarity to speaking means the benefits of text chat are often interpreted through the Interaction
Hypothesis (Long, 1996), which allows researchers to theorize that text chat promotes language
learning in a similar way to oral interaction. However, text chat differs from face-​to-​face interaction
in that the learner does not have to process language instantaneously; text chat users benefit from
the opportunity to read and re-​read received messages, and to carefully formulate their responses.

Computer-​Mediated Collaborative Writing


The next area worthy of discussion is computer-​mediated collaborative writing. Collaborative
writing, without the use of computers, is a vibrant area of L2 writing in its own right. The
distinguishing feature of collaborative writing is that two or more L2 learners jointly develop a
single piece of writing. The importance of collaborative writing in SLA is mainly supported by
sociocultural theory (SCT), which claims learners collaborate to solve problems through inter-
action, thus facilitating the emergence of new knowledge and growth for all participants involved
(Fullan, 1999). Collaboration focusing on linguistic problems in particular is considered to be
important for SLA (Swain & Lapkin, 2002).
Computer-​mediated collaborative writing also allows learners to plan within the task, draft, and
revise the text cooperatively by utilizing multimodal online tools, such as blogs (e.g., Arslan &
Sahin-​Kizil, 2010; Sun & Chang, 2012) and online editing programs (e.g., Kessler, Bikowski, &
Boggs, 2012; Strobl, 2014; Yeh, 2014). The most common tool appearing in the literature is wikis
(e.g., Li & Zhu, 2011; Wang, 2015), or web applications which enable multiple participants to add,
remove, and edit content in an asynchronous manner. Wikis typically have four functions: edit
enables learners to write and revise the main text, or wiki page; history allows learners to observe
changes and revert to previous versions; discussion provides a space for learners to communicate via asynchronous messages; and comments enables learners to request and provide feedback. The synergy of these interaction modes provides enhanced opportunities for learners to attend to the linguistic form of their text as well as other aspects of writing, possibly leading to language development and improved writing skills.
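To make these four functions concrete, the sketch below models how such a platform might store a jointly written page. It is a hypothetical data model for illustration only, not the implementation of any particular wiki software.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Revision:
    """One saved state of the page text (supports 'edit' and 'history')."""
    author: str
    text: str
    saved_at: datetime = field(default_factory=datetime.now)

@dataclass
class Message:
    """A post in the asynchronous 'discussion' space or a feedback 'comment'."""
    author: str
    body: str
    posted_at: datetime = field(default_factory=datetime.now)

@dataclass
class WikiPage:
    title: str
    history: List[Revision] = field(default_factory=list)    # observe and revert changes
    discussion: List[Message] = field(default_factory=list)  # asynchronous messages
    comments: List[Message] = field(default_factory=list)    # request and provide feedback

    def edit(self, author: str, new_text: str) -> None:
        """Write or revise the main text, keeping every earlier version."""
        self.history.append(Revision(author, new_text))

    def current_text(self) -> str:
        return self.history[-1].text if self.history else ""

    def revert(self, version_index: int, author: str) -> None:
        """Restore an earlier version by saving it again as a new revision."""
        self.edit(author, self.history[version_index].text)
```

Storing every revision rather than overwriting the text is what makes the history function possible, and it is this record, together with the discussion and comment threads, that researchers draw on when reconstructing each writer's contributions.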

Teacher-​Provided Written Corrective Feedback


The final area we will discuss is teacher-​provided written corrective feedback in an electronic
environment. While feedback has mostly been examined in traditional pen-and-paper environments
(see Chapter 4), CMC written feedback differs in that it is provided in interaction with the writer
(synchronous or asynchronous), with different timing (immediate or delayed), and with different
modality (text or oral).
Synchronous feedback is provided within a synchronous online platform. By using synchronous
editing tools, such as Google Docs and Microsoft’s OneDrive (Aubrey, 2014), the teacher can
monitor students composing their texts and provide feedback immediately when needed. Students
can respond to feedback immediately, which might lead to further negotiation or clarification of
the feedback, thus facilitating the learning process (Ellis, 2009). The teacher can also strategic-
ally adjust the levels of help by, for example, providing increasingly more explicit feedback to the
learner's incorrect forms during a feedback session (i.e., graduated feedback; Aljaafreh & Lantolf, 1994; Erlam et al., 2013). In such an environment, the multiple phases of the writing process (i.e.,
drafting the text, processing feedback, and revising the text) take place within a single writing task
(Shintani & Aubrey, 2016), which might lead to different effects on accuracy development and/or
language learning.
Asynchronous feedback, on the other hand, is similar to traditional pen-​and-paper feedback,
where feedback is provided after a text is written (i.e., it is delayed). However, CMC feedback
differs from traditional writing feedback in that it can be delivered in either oral or written mode.
For example, feedback in written form can be provided using comment/​track changes in a word
processor (Ene & Upton, 2014) or by providing explanations by email (Tafazoli et al., 2014), while oral feedback is possible using voice recording or screencast software (Cunningham, 2019; Elola & Oskoz, 2016). Such a variety of feedback types might influence learners' writing and language
learning in distinct ways.

Current Contributions and Research

Studies Using Text Chat


While some studies have focused on the process features of text chat interactions, such as the inter-
actional moves associated with text chat or the resultant negotiation of meaning and the occurrence
of feedback, other studies have attempted to investigate language acquisition more directly, either
through investigating learners’ development of language features over longer periods of time or by
comparing the learning effects to the equivalent oral mode.
One broad line of research seeks to describe features of text chat interaction that are distinct
from oral interaction. Some of these features include the use of simplistic register and syntax,
abbreviations, symbols, and split negotiation routines (Smith, 2003), copy-​and-​pasting previously
sent messages as a method of composition (Ziegler, 2018), separate adjacency pairs (Smith, 2009),
non-​contingent feedback (Lai & Zhao, 2006), and both overt and covert self-​repair (Sauro & Smith,
2010; Smith, 2005). A more pertinent question, though, is whether these unique characteristics of
interaction provide a more optimal condition for language learning.

Negotiation and Noticing


A prominent research focus has been the quantity and focus of negotiation that is produced between
learners. Descriptive studies that draw on substantial corpora of text chat transcripts have found
high rates of negotiation of meaning (e.g., Pellettieri, 2000; Toyoda & Harrison, 2002; Tudini,
2003). However, when compared to similar oral interactions, this apparent advantage is less certain,
with some studies finding oral interaction produces an equal amount (Fernández-​García & Arbelaiz,
2003; Lai & Zhao, 2006; Loewen & Reissner, 2009) or much more negotiation (e.g., Pellettieri,
2010). In terms of what learners negotiate during text chat, most research has found that lexical
rather than grammatical items are usually the trigger for negotiation of meaning (Blake, 2000;
Jepson, 2005; Tudini, 2003, 2007). In an exception to this, Nik, Adams, and Newton (2012) found
a higher proportion of grammatical-​related than lexical-​related negotiation episodes. Pellettieri
(2000, p. 82), like many others, maintains that the "additional time to think about their language use, as well as a visual display of their utterances" benefits the more cognitively demanding aspects of restructuring learners' interlanguage. Task characteristics, such as goal-orientedness, heightened difficulty (e.g., Pellettieri, 2000), and rigid structure (Adams, Alwi, & Newton, 2015), are likely to
be strong factors promoting the kind of negotiation that leads to language acquisition.
The oft-quoted advantage of text interaction as an "amplifier of attention to form" (Ortega, 2009, p. 236) has been tested by researchers measuring the extent to which learners notice new language and whether those noticed items are taken up (i.e., used in subsequent language production). There seems to be ample evidence that text chat interactions provide a conducive context
for noticing (e.g., Lai & Zhao, 2006; Smith, 2005; Sotillo, 2005; Tudini, 2007). Explanations for
this relate to the visual display of text chat, which may draw learners’ attention to form (Pellettieri,
2010), and the delayed nature of text chat, which gives learners time to consciously notice gaps
between their production and incoming messages and to produce more accurate responses. This
more careful approach to reconstructing messages is demonstrated when learners engage in covert
repair (i.e., deletion and rewriting of would-​be texts; Sauro & Smith, 2010) as well as the common
practice of copying, pasting, and reformulating previously sent texts (Ziegler, 2018), suggesting
that text chat writers spend a great deal of time evaluating and revising their texts in ways that are
not possible during oral interaction.

Peer Feedback
The idea of using text chat as a platform for delivering feedback in ways that can be noticed
by learners is an interesting one. Feedback provided during text-​based interaction is often non-​
contingent; that is, it is provided in a delayed manner –​sometimes several turns after the error
occurs. The non-​contingent nature of feedback in text chat interaction has been cited as a reason
why text chat is an inefficient mode of delivering feedback (Jepson, 2005; Fiori, 2005) and why,
even if appropriate feedback is delivered, it may be difficult for learners to notice (e.g., Iwasaki &
Oliver, 2003; Lai, Fei, & Roots, 2008). However, how much negative feedback is provided, what
it focuses on, and whether it is noticed by the recipient may be dependent on the extent to which
interlocutors assume a pedagogical role. Evidence for this comes from the high rates of feedback
and noticing when researchers/​teachers are involved in interactions with the explicit purpose of
bringing learners’ attention to their errors (e.g., Lai, Fei, & Roots, 2008; Smith & Renaud, 2013).
This points to the suitability of text chat as a means for teacher-​directed feedback to learners.
However, as Ortega (2009) states, “one wonders how many teachers would choose text chat activ-
ities as a pedagogically appropriate environment in which to implement systematic error correction
with their students” (p. 242). In a subsequent section, we discuss claims that teacher-​provided
feedback within a synchronous CMC environment is both feasible and conducive for language
acquisition.

Language Learning
Few studies have attempted to collect evidence on how text chat interaction actually results in
language acquisition. However, there is growing evidence that learners participating in text chat
tasks engage in the kind of language negotiation that leads to gains in vocabulary (e.g., Smith,
2004; Smith, 2005) and grammatical knowledge (Smith & Renaud, 2013). It has been shown that
when learners discuss language issues during text chat interactions, they are able to recall know-
ledge related to those items immediately after as well as several weeks after the task (Shekary &
Tahririan, 2006) –​an indication of durable learning effects. Even when children interact via text
chat, there is evidence that new lexical items are noticed, retained, and used over time (Coyle
& Reverte, 2017). In a rare longitudinal study, González-​Lloret (2008) illustrated how pragmatic
development can occur over longer periods of time when learners interact with native speakers over
several months. She found that negative feedback episodes issued by L1 interlocutors can result in
sustained L2 changes in the subsequent production of formal and informal address by L2 learners.
Overall, the features of text chat interaction seem to produce an environment that can be exploited,
through well-​designed tasks, to encourage interaction, noticing of new language, and ultimately
acquisition.

Studies on Computer-​Mediated Collaborative Writing


Whereas interaction is usually both the means and the end of text chat activities, the interaction that
arises during collaborative writing has the explicit purpose of facilitating the production of a single
joint text. Similar to text chat, though, the computer-​mediated aspect brings out unique features of
collaborative interaction, such as frequent deleting, rewriting, and reshuffling of content (Strobl,
2014). Research has focused on the process features of collaborative writing interactions, the quality of the resultant joint texts, and language acquisition as measured through new pieces of writing.

Patterns and Phases of Collaboration


The use of a CMC tool, such as a wiki, during collaborative writing is thought to promote a
reflective, non-​threatening writing environment (Colomb & Simutis, 1996), balanced, open com-
munication between writers (Warschauer & Ware, 2006; Warschauer, 1997), and enhanced learning
through joint scaffolding. However, the reality is that collaborative interaction isn’t always “col-
laborative.” Storch (2002) explains that the nature of collaborative interaction varies in degree
of equality (distribution of contribution) and mutuality (engagement via reciprocal feedback),
resulting in differing patterns of collaborative interactions. One common line of research adapts
this framework to computer-​mediated collaborative writing.
Using one or more data sources available from wikis (i.e., discussion, comments, edited pages,
and history), researchers have found that computer-​mediated collaborative writing results in a wide
array of interaction patterns. In addition to collaborative interaction (i.e., when learners jointly
construct the text), it is not uncommon to find some groups of learners who resist interaction or those
who display cooperative interaction (i.e., division of labor; e.g., Bradley, Lindström, & Rystedt,
2010). In fact, cooperative (as opposed to collaborative) interaction has been commonly observed,
where learners in each group –​rather than combining efforts to work on the text together –​will
each sequentially and independently add their own contributions to the text (Abrams, 2016). Other
research has gone beyond simply describing patterns of interaction by showing how patterns can be
dynamic, resulting in phases of interaction.
Whereas some studies have found stable patterns of interaction during computer-mediated collaborative writing (e.g., Li & Zhu, 2011), others have shown that interaction patterns are not static. Some have
found that interaction patterns may vary within a task. For example, Kessler and Bikowski (2010)
found three sequential phases of interaction that align with different stages in the writing process: build and destroy, which consists of initial writing followed by large-scale deletions as learners evolve in their planning; full collaboration, which is a highly productive period without large-scale
deletions; and informal reflection, consisting of holistic commentary on the text. There is also evi-
dence of changes in interaction patterns between tasks, which arise, for example, as a result of changes
in task type (Li & Kim, 2016) as well as interlocutor characteristics (Li & Zhu, 2017). These studies
highlight the notion that computer-mediated collaborative writing doesn’t in itself result in collabor-
ation but depends on group members’ effort to align with co-​constructed goals throughout the task.

Collaborative Dialogues
Research has also looked at how learners resolve linguistic problems through interactions during
computer-​mediated collaborative writing. These studies draw on Swain’s (2000, 2006) notion of
collaborative dialogue –​described as cognitive and social activity that leads to the construction
and shaping of linguistic knowledge. Investigations into the production of collaborative dialogues
during computer-​mediated collaborative writing have yielded conflicting results. Some studies have
found that students are willing to engage in collaborative dialogues (Elola & Oskoz, 2010; Lee,
2010; Li & Zhu, 2011) or use their collective knowledge to resolve language issues (Hsu, 2019; Li,
2013; Nami & Marandi, 2014), while others have observed learners’ reluctance to enter into such
dialogues (Li & Zhu, 2011; Lim, So, & Tan, 2010; Lund & Smørdal, 2006). In terms of the kind
of collaborative dialogues generated, computer-​mediated collaborative writers tend to prioritize
meaning over form (Abrams, 2016; Kessler, 2009), supporting each other to correct errors at the sentence and word level and making both form-related (e.g., word choice, verb tense) and content-related changes to jointly constructed texts (Arnold, Ducate, & Kost, 2009; Aydin & Yildiz, 2014; Kessler,
2009; Kessler & Bikowski, 2010; Kost, 2011; Lee, 2010). As Li and Zhu (2011) argue, a high fre-
quency of collaborative dialogues tends to be associated with interactions that are collaborative –​
that is, when there is no identifiable expert and group members draw on their resources to scaffold
each other’s efforts. Studies such as Aydin and Yildiz (2014) show that certain task types built into
the computer-​mediated collaborative writing activity can promote more collaborative dialogues,
with argumentative tasks (as opposed to information-​gap tasks), in which learners are compelled to
share, defend, and react to others’ opinions, leading to more peer scaffolding.

Improvement in Writing Quality


Language acquisition as a result of computer-​mediated collaborative writing has been investigated
in a number of ways. First, researchers have examined changes in the quantity and quality of resultant texts across stages of collaboration. Longitudinal studies have found that several rounds of computer-mediated collaborative writing result in increasingly complex and
coherent texts (Mak & Coniam, 2008) and that writers –​ especially lower proficiency learners –​
show significant gains in content, organization, linguistic complexity, and accuracy (Yang, 2018).
Other studies have connected patterns of collaboration to the quality of resultant texts. Highly col-
laborative writers focus more on generating ideas and produce writing more fluently and accurately
(Yeh, 2014), with better coherence (Abrams, 2019) and more control over mechanics (Yim, Wang,
Olson, Vu, & Warschauer, 2017).
Fewer studies have investigated language acquisition by looking at gains in new individually
written texts using quasi-​experimental designs. Wang (2015) compared the effects of computer-​
mediated collaborative writing using a wiki and collaborative writing using Microsoft Word.
Learners in each group drafted, peer-​edited, and revised the same two written assignments in their
respective modes. A post writing test revealed that the wiki group achieved greater improvement in
terms of audience awareness, organization, content and style, grammatical accuracy, and sentence
structure than the Microsoft Word group. Other studies have found that computer-​mediated collab-
orative writers make significantly greater gains compared to their individual writing counterparts
(e.g., Bikowski & Vithanage, 2016; Hsu & Lo, 2018). For example, Hsu and Lo (2018) compared
computer-mediated collaborative writing and individual writing. Analyses of pre- and post-essay
writing tests indicated a significant gain only for the collaborative writing group on content quality
and linguistic accuracy. These kinds of studies are in their infancy, but so far they seem to show
consistent positive effects for computer-​mediated collaborative writing when compared to indi-
vidual and non-​computer-​mediated conditions.

Studies on Teacher-​Provided Written Corrective Feedback


CMC technology brings out unique features in teacher-provided feedback in terms of its modalities (e.g., text, oral, or video), timing (immediate or delayed), and mode of synchronicity (asynchronous or synchronous). Researchers have examined the effects of both asynchronous and synchronous teacher feedback provided in different modalities, as well as the possibility of providing immediate synchronous feedback. The combinations of CMC feedback that such research has
investigated are summarized in Figure 21.1.

Delayed Asynchronous Feedback


Most CMC feedback consists of asynchronous comments on learners’ texts, for example, when a
teacher uses the comment box and Track Changes functions of Microsoft Word.

Figure 21.1  CMC feedback types: delayed feedback is either asynchronous (text via asynchronous editing tools, oral feedback via audio recording, or video feedback via video recording) or synchronous (text chat, oral chat, or video chat); immediate feedback is synchronous (text via simultaneous editing software).

Such electronic feedback is considered beneficial because it enables teachers to provide feedback more quickly than
handwritten feedback and allows unlimited space for the teacher’s comments (Rodina, 2008), which
might have advantages in providing further elaboration about grammatical rules (Ferris, 2012).
Studies that have examined the effects of delayed asynchronous feedback have generally
compared it with pen-and-paper feedback (Ene & Upton, 2014; Tafazoli, Nosratzadeh, & Hosseini, 2014). Examining different kinds of written feedback, Ene and Upton (2014) found that asynchronous teacher feedback – provided via comments and the Track Changes functions in Microsoft
Word –​was similar to handwritten feedback in that it successfully elicited appropriate revisions of
grammatical structures and surface-​level features as well as content and organization. Similarly,
Tafazoli, Nosratzadeh, and Hosseini (2014) compared asynchronous written (provided in Microsoft
Word document) and handwritten feedback and found that asynchronous feedback had a greater
effect on the development of grammatical accuracy. These studies suggest that delayed asyn-
chronous written feedback is at least comparable to pen-​and-​paper feedback in terms of the resultant
written quality of learners’ texts.
An extensively researched aspect of delayed asynchronous feedback is its modality (e.g., text,
audio, or video). Evidence shows video-​recorded feedback benefits learners, as it engages them
visually, offers an opportunity for listening practice (Elola & Oskoz, 2016), and is preferred for its
conversational, positive, and explanatory attributes over written feedback (Cunningham, 2019).
Comparing video to audio ​feedback, research has shown that L2 learners perceive audio-​recorded
feedback as clearer and faster, which helps in their revision of errors (Ali, 2016; Ducate & Arnold,
2012), while video-​recorded feedback is perceived as personal, engaging, and memorable (Ali,
2016; Harper, Green, & Fernández-​Toro, 2018). However, the relative effects of audio and video
feedback on learners’ revisions may depend on proficiency. Li and Akahori (2008) found that audio-​
recorded feedback benefits lower proficiency learners, while video-​recorded feedback benefits high
proficiency learners in their ability to identify and correct errors in a piece of writing. Overall,
the audio and video modes may hold advantages over written feedback due to their interpersonal
nature, speed, and detail provided.

Delayed Synchronous Feedback


Far fewer studies have focused on synchronous teacher feedback. Two such studies that we identi-
fied dealt with synchronous feedback as supplementary to asynchronous text-​based feedback (Ene
& Upton, 2018; Odo & Yi, 2014). Both studies suggest that such instructional support offers add-
itional benefit for students’ writing. Ene and Upton (2018), for example, explored the effects of syn-
chronous feedback as an additional support for asynchronous feedback for university students in an
EAP program. Teacher asynchronous feedback was first provided using Microsoft Word comments
and Track Changes in electronic drafts, but for the final drafts, the students were engaged in a text
chat discussion with the teacher to receive feedback. Although the teacher feedback mainly focused on content in both Word comments and text chat, the teachers provided more feedback on linguistic (grammar and vocabulary) and mechanical errors in Word than in text chat. The teachers felt that adding
a text chat session after providing asynchronous feedback was helpful for students.
The other study explored doctoral students’ responses to synchronous oral feedback. Odo and
Yi (2014) investigated the use of voice chat software to provide feedback on academic writing to
students. The results showed that clarification requests commonly occurred in the oral feedback
sessions, but there were some occurrences of negotiation where students clarified the intention of
feedback or the supervisor requested students’ opinions on the feedback. There were also frequent
mentoring comments (e.g., discussion about the student’s doctoral life) apart from the feedback on
the text. The students appreciated the multimodal aspect of the feedback (e.g., they could listen to
the supervisor’s comments while taking notes) as well as the removal of the ‘geographical distance’
barrier.


The dearth of research, however, makes it difficult to form a clear picture of the value of delayed
synchronous feedback, in particular, of the effects of this type of feedback compared to asyn-
chronous feedback. As discussed, oral or visual asynchronous feedback has a number of benefits
for L2 writers (Cunningham, 2019; Elola & Oskoz, 2016; Li & Akahori, 2008). It seems likely,
therefore, that allowing students to interact concurrently with their teacher while receiving feed-
back is beneficial for learning.

Immediate Synchronous Feedback


Even less research has investigated immediate synchronous feedback. Early exploratory studies
documenting semester-​long implementations of immediate synchronous teacher feedback during
L2 writing courses have found that students respond positively to the interactive nature of the
feedback (Aubrey, 2014; Kim, 2010). However, we identified only one experimental study that
explored the effects of synchronous feedback on language acquisition. Shintani and Aubrey
(2016) examined the relative effects of immediate versus delayed feedback on the acquisition of
a grammatical structure (the hypothetical conditional) using Google Docs. After learners received feedback on their writing, post- and delayed post-tests revealed that both immediate and delayed feedback were effective, but that immediate feedback was more effective in improving learners’ accuracy.
Shintani (2015) further conducted a detailed analysis of the writing process under the same immediate and delayed conditions as in Shintani and Aubrey (2016). Shintani reported that (1) the immediate
feedback created an interactive writing process similar in some respects to oral corrective feedback;
(2) both the immediate and delayed feedback promoted noticing-​the-​gap, but self-​correction was
more successful in the immediate condition; (3) focus on meaning and form took place contigu-
ously in the immediate condition, while it occurred separately in the delayed condition; and (4) both
types of feedback facilitated metalinguistic understanding of the target feature.
These studies demonstrate that computer-​mediated synchronous error correction can assist L2
writing, and that immediate synchronous feedback offers teachers a way of providing effective
and engaging support to learners. However, as Shintani (2015) and Shintani and Aubrey (2016)
suggested, future research must explore whether immediate synchronous feedback is applicable to different learning situations, as well as teachers’ and students’ perceptions of such feedback.

Main Research Methods

Process Studies
The majority of studies cited in this chapter have focused on process features, such as text
characteristics (e.g., coherence, linguistic accuracy) and learner/​teacher behaviors (e.g., collabora-
tive dialogues, teacher feedback) in computer-​mediated writing. Individual or multiple case studies
figure prominently in the writing literature, especially in research on text chat and collaborative writing, indicating
a bias towards qualitative, exploratory studies.
In collaborative writing and feedback studies, researchers have relied heavily on the archived history of edits (e.g., the Track Changes function in Word, Google Documents, or wiki pages) to capture patterns of how texts are co-constructed (Arnold, Ducate, & Kost, 2012; Bradley, Lindström, & Rystedt, 2010; Li, 2013) or how feedback leads to revision (Elola & Oskoz, 2016; Ene & Upton, 2014, 2018). Some studies employ multiple data sources, including discussion posts and
wiki edits in tandem (Alghasab & Handley, 2017) and synchronous text chats to examine negoti-
ation routines (Elola & Oskoz, 2010).
Text chat research has mostly relied on chat scripts for data collection. Some recent studies have
begun using screen video-recordings to document synchronous interaction (Sauro & Smith, 2010;
Ziegler, 2018), enabling researchers to observe online planning behavior (e.g., scrolling to see pre-
vious messages). As these recordings capture behavior unique to text chat interactions, they are slowly becoming the standard for transcription in this line of research. Feedback studies typically analyze a
corpus of synchronous interactions (Lund, 2008; Shintani & Aubrey, 2016) and oral interactions
during feedback sessions (Cunningham, 2019; Odo & Yi, 2014).

Linking the Writing Process to Text Outcomes


Fewer studies have attempted to investigate the process-​product link by examining the relationship
between the writing process and the writing outcome (in the case of text chat and collaborative
writing studies) and the relationship between learners’ responses to feedback and improvement in
the quality of revisions (in the case of teacher feedback). These studies, which were motivated by
L2 acquisition theories, focused on how various noticing opportunities were related to the product.
The product has taken several forms in these studies: individualized tests based on negotiated lan-
guage items during text interaction (e.g., Shekary & Tahririan, 2006); tests of items that are seeded
in the task (e.g., Smith, 2003, 2005); and text chat scripts that are corrected by learners at a later
time (e.g., Smith, 2012).
Recent collaborative writing studies have focused on relating patterns of interaction to measures
of writing quality (e.g., Li & Zhu, 2017). These studies often have an experimental design, with
independent variables encompassing both technology (computer-​mediated versus non-​computer-​
mediated; see Wang, 2015) and group size (group versus individual; see Bikowski & Vithanage,
2016). Independent variables in feedback studies include feedback types, the target of feedback
(e.g., grammar, vocabulary), and feedback contexts (e.g., text versus oral chat), with uptake of the
feedback (e.g., successful, unsuccessful, unattempted, and unverifiable; see Ene & Upton, 2014) as
the dependent variable.

Acquisition Studies
Research has employed an experimental approach to investigate language development as a result
of CMC writing activities. Most of these studies measured acquisition in terms of the improve-
ment in accuracy, organization, content, and style of a new text (Bikowski & Vithanage, 2016;
Hsu & Lo, 2018; Wang, 2015). Some theory-​driven studies additionally examined the causal
relationship between noticing opportunities and language acquisition (Smith, 2005; Shekary &
Tahririan, 2006). Noticing has been operationalized in various ways: language-​related episodes
(Shekary & Tahririan, 2006), negotiation episodes (Smith, 2005) and uptake of feedback (Coyle
& Reverte, 2017).
Only a few studies have focused on a specific grammatical structure (Shintani & Aubrey,
2016) or vocabulary (Smith, 2004). These studies have examined acquisition through pre-​/​post-​
test research designs intended to test whether the treatments improve language production in new
writing. Shintani and Aubrey (2016) also included a process analysis of learners’ behavior to estab-
lish a link between the uptake of the target features during writing and language development.

Recommendations for Practice


We discourage the view that technology itself is a panacea for enhancing language learning oppor-
tunities. At this point it should be clear that CMC occurs through a disparate collection of technolo-
gies. We therefore strongly recommend that teachers familiarize themselves with various CMC
tools before deciding how to utilize them for classroom use. The larger a teacher’s repertoire, the
easier it will be to predict when technology will be beneficial. What follows are some suggestions
for how teachers can make use of CMC technologies.


Teachers should first use CMC in ways that can support activities they already use in the class-
room. We have seen from some studies that free conversation text chat does not result in the quality
of interaction needed for learners to learn from each other. Similarly, unstructured collaborative
writing tasks, where learners are not accountable for their contributions, can result in freeloading,
non-​collaborative patterns, and ultimately less language development. Therefore, teachers should
draw on task design principles (see Ellis, 2003) to ensure that learners are given distinct roles and clear task objectives and are motivated to collaborate.
We have seen that synchronous CMC provides a means for delivering feedback that is effective for
language learning either during an interactive task (Smith & Renaud, 2013) or as written corrective
feedback during individual writing (Shintani & Aubrey, 2016). It is less clear, however, how feasible
this would be in a real classroom. Still, given the positive research findings, we encourage teachers
to experiment with how synchronous corrective feedback might be successfully implemented. In
Shintani and Aubrey (2016), one teacher-​researcher was able to provide feedback to seven learners
at one time. Synchronous corrective feedback, therefore, might be most achievable during collabora-
tive writing tasks where the teacher can attend to groups rather than individual writers.
Finally, we encourage teachers to combine multiple CMC modes. For example, collaborative
writing activities may involve the discussion of language problems using text chat and the joint
production of a text using a co-​editing tool, while teachers provide synchronous corrective feed-
back. This “layering” of interaction serves different purposes (editing, organizing, feedback) and
provides opportunities for the cyclical processes of internalization, modification, and consolidation
involved in language learning (Williams, 2012). In addition to layering interaction, teachers can use
different CMC modes sequentially. For example, a decision-​making task in text chat mode can act
as a pre-​task for a main collaborative writing task, which may then be followed by an individual
writing task.

Future Directions
The above examination suggests that exploratory research on CMC writing processes is relatively
mature, but there is a crucial lack of research on acquisition – in particular, research that focuses on the relationship between learner behavior, noticing opportunities, and language acqui-
sition. This causal relationship can be established either by tracking learners’ noticing of a target
language feature in their production in a subsequent context (as in Smith 2005 for vocabulary
items and Shintani & Aubrey, 2016 for grammatical items) or by creating a tailor-​made test for
noticed items identified in a writing task (as in Shekary & Tahririan, 2006). Technologies such
as eye-tracking (e.g., Smith, 2010, 2012; Smith & Renaud, 2013) and keystroke logging (e.g., Révész et al., 2019) might help advance systematic investigations of how writing processes
affect written products. Longitudinal investigations can also shed light on the relationship between
noticing and gradual language development. In a rare longitudinal text chat study, González-Lloret (2008) suggested that noticing observed in language-related episodes (LREs) predicted accuracy improvement in learners’ subsequent writing. Such an approach can also be applied to the investigation of collaborative writing and
teacher feedback.
These suggestions for future directions, together with the discussions in the previous sections of
this chapter, show that research on electronic writing and its effects on SLA is a dynamic area of inquiry. It is also clear that technology plays an important role in teaching and learning to write
in an L2.

References
Abrams, Z.I. (2016). Exploring collaboratively written L2 texts among first-​year learners of German in Google
Docs. Computer Assisted Language Learning, 29(8), 1259–​1270.


Abrams, Z. I. (2019). Collaborative writing and text quality in Google Docs. Language Learning & Technology,
23(2), 22−42.
Adams, R., Alwi, N.A.N.M., & Newton, J. (2015). Task complexity effects on the complexity and accuracy of
writing via text chat. Journal of Second Language Writing, 29, 64–​81.
Akyel, A., & Kamisli, S. (1999). Writing and word processing in the EFL classroom: Possible effects on writing
strategies, attitudes, compositions, composition ratings and length. In M.C. Pennington (Ed.), Writing in an
electronic medium: Research with second language learners (pp. 14–​29). Houston, TX: Athelstan.
Alghasab, M., & Handley, Z. (2017). Capturing (non-)collaboration in wiki-mediated collaborative writing activities: The need to examine discussion posts and editing acts in tandem. Computer Assisted Language Learning, 30(7), 664–691.
Ali, A.D. (2016). Effectiveness of using screencast feedback on EFL students’ writing and perception. English
Language Teaching, 9(8), 106.
Aljaafreh, A.L.I., & Lantolf, J.P. (1994). Negative feedback as regulation and second language learning in the
Zone of Proximal Development. The Modern Language Journal, 78(4), 465–​483.
Arnold, N., Ducate, L., & Kost, C. (2009). Collaborative writing in wikis: Insights from culture projects in
German class. In L. Lomicka & G. Lord (Eds.), The next generation: Social networking and online collab-
oration in foreign language learning (pp. 115–​144). San Marcos, TX: CALICO.
Arnold, N., Ducate, L., & Kost, C. (2012). Collaboration or cooperation? Analyzing group dynamics and revi-
sion processes in wikis. CALICO Journal, 29(3), 431–​448.
Aubrey, S. (2014). Students’ attitudes towards the use of an online editing program in an EAP course. Annual
Research Review, 17, 45–​57.
Aydin, Z., & Yildiz, S. (2014). Using Wikis to promote collaborative EFL writing. Language Learning &
Technology, 18(1), 160–​180. Retrieved from http://​llt.msu.edu/​issues/​february2014/​aydinyildiz.pdf
Beatty, K. (2003). Teaching and researching computer-​assisted language learning. Harlow: Pearson Longman.
Bikowski, D., & Vithanage, R. (2016). Effects of web-​based collaborative writing on individual L2 writing
development. Language Learning & Technology, 20(1), 79–​99.
Blake, C. (2009). Potential of text-based internet chats for improving oral fluency in a second language. The
Modern Language Journal, 93(2), 227–​240.
Blake, R. (2000). Computer mediated communication: A window on L2 Spanish interlanguage. Language
Learning & Technology, 4(1), 120–​136.
Bradley, L., Lindström, B., & Rystedt, H. (2010). Rationalities of collaboration for language learning in a wiki.
ReCALL, 22, 247–​265.
Brock, M.N., & Pennington, M.C. (1999). A comparative study of text analysis and peer tutoring as input to
writing on computer in an ESL context. In M.C. Pennington (Ed.), Writing in an electronic medium: Research
with language learners (pp. 61–​94). Houston, TX: Athelstan.
Colomb, G.G., & Simutis, J.A. (1996). Visible conversation and academic inquiry: CMC in a culturally diverse
classroom. In S. Herring (Ed.), Computer-​mediated communication: Linguistic, social and cross-​cultural
perspectives (pp. 203–​222). Amsterdam: John Benjamins.
Coyle, Y. & Reverte, M.J. (2017). Children’s interaction and lexical acquisition in text-​based online chat.
Language Learning and Technology, 21(2), 179–​199.
Chadwick, S. & Bruce, N. (1989). The revision process in academic writing: From pen and paper to word pro-
cessor. Hong Kong Papers in Linguistics and Language Teaching, 12, 1–​27.
Chun, D.M., & Payne, J.S. (2004). What makes students click: Working memory and look-​up behavior. System,
32, 481–​503.
Çiftçi, H., & Aslan, E. (2019). Computer-mediated communication in the L2 writing process: A review of studies between 2000 and 2017. International Journal of Computer Assisted Language Learning and
Teaching, 9(2), 19–36.
Cunningham, K.J. (2019). Student perceptions and use of technology-​mediated text and screencast feedback in
ESL writing. Computers and Composition, 52, 222–​241.
Ducate, L., & Arnold, N. (2012). Computer-​mediated feedback: Effectiveness and student perceptions of
screen-​casting software versus the comment function. In G. Kessler, A. Oskoz, & I. Elola (Eds.), Technology
across writing contexts and tasks (Vol. 10, pp. 31–​56) San Marcos, TX: CALICO.
Ellis, R. (2003). Task-​based language learning and teaching. Oxford: Oxford University Press.
Ellis, R. (2009). A typology of written corrective feedback types. ELT Journal, 63(2), 97–​107.
Elola, I. & Oskoz, A. (2010). Collaborative writing: Fostering foreign language and writing conventions devel-
opment. Language Learning & Technology, 14(3), 51–​71.
Elola, I., & Oskoz, A. (2016). Supporting second language writing using multimodal feedback. Foreign
Language Annals, 49(1), 58–​74.
Ene, E., & Upton, T.A. (2014). Learner uptake of teacher electronic feedback in ESL composition. System, 46,
80–​95.


Ene, E. & Upton, T.A. (2018). Synchronous and asynchronous teacher electronic feedback and learner uptake
in ESL composition. Journal of Second Language Writing, 41, 1−13.
Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing: Two approaches compared.
System, 41(2), 257–​268.
Fernández-​García, M., & Arbelaiz, A. (2003). Learners’ interactions and the negotiation of meaning: A com-
parison of oral and computer-​assisted written conversations. ReCALL, 15, 113–​136.
Ferris, D.R. (2012). Written corrective feedback in second language acquisition and writing studies. Language
Teaching, 45(04), 446–​459.
Ferris, S. (2002). The effects of computers on traditional writing. Journal of Electronic Publishing, 8. Retrieved
from www.press.umich.edu/​jep/​08-​01/​ferris.html
Fiori, M. (2005). The development of grammatical competence through synchronous computer-​mediated
communication. CALICO Journal, 22, 567–​602.
Fullan, M. (1999). Change forces: The sequel. London: Falmer.
González-​Lloret, M. (2008). Computer-​mediated learning of L2 pragmatics. In E. Alcón Soler & A. Martinez-​
Flor (Eds.), Investigating pragmatics in foreign language learning, teaching, and testing (pp. 114–​132).
Clevedon: Multilingual Matters.
Harper, F., Green, H., & Fernández-Toro, M. (2018). Using screencasts in the teaching of modern
languages: investigating the use of Jing® in feedback on written assignments. The Language Learning
Journal, 46(3), 277–​292.
Hsu, H.C. (2019). Wiki-mediated collaboration and its association with L2 writing development: An exploratory study. Computer Assisted Language Learning, 32(8), 945–967.
Hsu, H.C., & Lo, Y.F. (2018). Using wiki-​mediated collaboration to foster L2 writing performance. Language
Learning & Technology, 22(3), 103–​123.
Iwasaki, J., & Oliver, R. (2003). Chat-​line interaction and negative feedback. Australian Review of Applied
Linguistics, 17, 60–​73.
Jepson, K. (2005). Conversations and negotiated interaction in text and voice chatrooms. Language Learning
& Technology, 9(3), 79–​98.
Kessler, G. (2009). Student-initiated attention to form in wiki-based collaborative writing. Language Learning & Technology, 13(1), 79–95.
Kessler, G., & Bikowski, D. (2010). Developing collaborative autonomous language learning abilities in com-
puter mediated language learning: Attention to meaning among students in wiki space. Computer Assisted
Language Learning, 23(1), 41–​58.
Kessler, G., Bikowski, D., & Boggs, J. (2012). Collaborative writing among second language learners in aca-
demic web-​based projects. Language Learning & Technology, 16(1), 91–​109.
Kim, S. (2010). Revising the revision process with Google Docs. In S. Kasten (Ed.), TESOL classroom prac-
tice series. Effective second language writing (pp. 171–​177). Alexandria, VA: TESOL Publications.
Kost, C. (2011). Investigating writing strategies and revision behaviour in collaborative writing projects.
CALICO Journal, 28(3), 606–​620.
Lai, C., Fei, R., & Roots, R. (2008). The contingency of recasts and noticing. CALICO Journal, 26, 70–​90.
Lai, C., & Zhao, Y. (2006). Noticing and text-​based chat. Language Learning & Technology, 10(3), 102–​120.
Retrieved from http://​llt.msu.edu/​vol10num3/​pdf/​laizhao.pdf
Lee, L. (2010). Exploring wiki-​mediated collaborative writing: A case study in an elementary Spanish course.
CALICO Journal, 27, 260–​276.
Li, K., & Akahori, K. (2008). Development and evaluation of a feedback support system with audio
and playback strokes. CALICO Journal, 26(1), 91–107. Retrieved from www.jstor.org/stable/
calicojournal.26.1.91
Li, M. (2013). Individual novices and collective experts: Collective scaffolding in wiki-​based small group
writing. System, 41(3), 752–​769.
Li, M., & Kim, D. (2016). One wiki, two groups: Dynamic interactions across ESL collaborative writing tasks.
Journal of Second Language Writing, 31, 25–​42.
Li, M., & Zhu, W. (2011). Patterns of computer-​mediated interaction in small writing groups using wikis.
Computer Assisted Language Learning, 35, 61–​82.
Li, M., & Zhu, W. (2017). Explaining dynamic interactions in wiki-​based collaborative writing. Language
Learning & Technology, 21(2), 96–​120.
Lim, W.Y., So, H.J., & Tan, S.C. (2010). E-​learning 2.0 and new literacies: Are social practices lagging behind?
Interactive Learning Environments, 18(3), 203–​218.
Lin, H. (2015). Computer-​mediated communication (CMC) in oral proficiency development: A meta-​analysis.
ReCALL, 27(3), 261–287.


Loewen, S., & Reissner, S. (2009). A comparison of incidental focus on form in the second language classroom
and chatroom. Computer Assisted Language Learning, 22, 101–​114.
Long, M.H. (1996). The role of the linguistic environment in second language acquisition. In W. C.
Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–​468). New York:
Academic Press.
Lund, A. (2008). Wikis: A collective approach to language production. ReCALL, 20(1), 35–​54.
Lund, A., & Smørdal, O. (2006) Is there a space for the teacher in a wiki? In Proceedings of the 2006
International Symposium on Wikis (WikiSym ’06). Odense: ACM Press, 37–​46.
Mak, B., & Coniam, D. (2008). Using wikis to enhance and develop writing skills among secondary school
students in Hong Kong. System, 36, 437–​455.
Odo, D.M., & Yi, Y. (2014). Engaging in computer-​mediated feedback in academic writing: Voices from L2
doctoral students in TESOL. English Teaching, 69(3), 129–150.
Ortega, L. (2009). Interaction and attention to form in L2 text-​based computer mediated communication. In
A. Mackey, & C. Polio (Eds.), Multiple perspectives on interaction in SLA: Research in honor of Susan
M. Gass (pp. 226–​253). New York: Routledge.
Pellettieri, J. (2000). Negotiation in cyberspace: The role of chatting in the development of grammatical com-
petence. In M. Warschauer & R. Kern (Eds.), Network-​based language teaching: Concepts and practice
(pp. 59–​86). Cambridge: Cambridge University Press.
Pellettieri, J. (2010). Online chat in the foreign language classroom: From research to pedagogy. MEXTESOL
Journal, 34(1), 41–​57.
Nami, F., & Marandi, S. S. (2014). Wikis as discussion forums: exploring students’ contribution and their
attention to form. Computer Assisted Language Learning, 27(6), 483–​508.
Nik, N., Adams, R., & Newton, J. (2012). Writing to learn via text-​chat: Task implementation and focus on
form. Journal of Second Language Writing, 21(1), 23–​39.
Pennington, M.C. (1993). A critical examination of word processing effects in relation to L2 writers. Journal
of Second Language Writing, 2(3), 227–​255.
Pennington, M.C. (1996). The computer and the non-​ native writer: A natural partnership. New York:
Hampton Press.
Pennington, M.C. (1999). Computer-​aided pronunciation pedagogy: Promise, limitations, directions. Computer
Assisted Language Learning, 12(5), 427–​440.
Révész, A., Michel, M., & Lee, M. (2019). Exploring second language writers’ pausing and revision behaviors: A
mixed-​methods study. Studies in Second Language Acquisition, 41(3), 605–​631.
Rodina, H. (2008). Paperless, painless: Using MS Word tools for feedback in writing assignments. The French
Review, 82(1), 106–​116.
Sauro, S., & Smith, B. (2010). Investigating L2 performance in text chat. Applied Linguistics, 31(4),
554–​577.
Shekary, M., & Tahririan, M.H. (2006). Negotiation of meaning and noticing in text-​based online chat. Modern
Language Journal, 90(4), 557–​573.
Shintani, N. (2015). The effects of computer-​mediated synchronous and asynchronous direct corrective
feedback on writing: A case study. Computer Assisted Language Learning, 29(3), 1–​22.
Shintani, N., & Aubrey, S. (2016). The effectiveness of synchronous and asynchronous written corrective
feedback on grammatical accuracy in a computer-​mediated environment. The Modern Language Journal,
100(1), 296–​319.
Smith, B. (2003). Computer-​mediated negotiated interaction: An expanded model. The Modern Language
Journal, 87(1), 38–​57.
Smith, B. (2004). Computer-mediated negotiated interaction and lexical acquisition. Studies in Second
Language Acquisition, 26, 365–​398.
Smith, B. (2005). The relationship between negotiated interaction, learner uptake, and lexical acquisition in
task-​based computer-​mediated communication. TESOL Quarterly, 39(1), 33–​58.
Smith, B. (2009). Task-​based learning in the computer-​mediated communicative ESL/​EFL classroom. CALL-​
EJ Online, 11(1).
Smith, B. (2012). Eye tracking as a measure of noticing: A study of explicit recasts in SCMC. Language
Learning & Technology, 16(3), 53–​81.
Smith, B., & Renaud, C. (2013). Eye tracking as a measure of noticing corrective feedback in computer-​
mediated instructor-​student foreign language conferences. In K. McDonough and A. Mackey (Eds.),
Interaction in diverse educational settings (pp. 147–​165). Philadelphia: John Benjamins.
Sotillo, M.S. (2005). Corrective feedback via instant messenger learning activities in NS-​NNS and NNS-​NNS
dyads. CALICO Journal, 22(3), 467–​496.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52(1), 119–158.


Strobl, C. (2014). Affordances of web 2.0 technologies for collaborative advanced writing in a foreign lan-
guage. CALICO Journal, 31(1), 1–​18.
Sun, Y-​C., & Chang, Y-​J. (2012). Blogging to learn: Becoming EFL academic writers through collabora-
tive dialogues. Language Learning & Technology, 16(1), 43–​61. Retrieved from http://​llt.msu.edu/​issues/​
february2012/​sunchang.pdf
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative dialogue.
In J. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 94–​119). Oxford: Oxford
University Press.
Swain, M. (2006). Languaging, agency and collaboration in advanced second language proficiency. In H.
Bynes (Ed.), Advanced language learning: The contribution of Halliday and Vygotsky (pp. 95–​108).
London: Continuum.
Swain, M., & Lapkin, S. (2002). Talking it through: Two French immersion students’ response to reformula-
tion. International Journal of Educational Research, 37, 285–​304.
Tafazoli, D., Nosratzadeh, H., & Hosseini, N. (2014). Computer-​ mediated corrective feedback in ESP
courses: Reducing grammatical errors via email. Procedia –​Social and Behavioral Sciences, 136, 355–​359.
Toyoda, E., & Harrison, R. (2002). Categorization of text chat communication between learners and native
speakers of Japanese. Language Learning & Technology, 6(1), 82–​99.
Tu, C., Blocher, M., & Ntoruru, J. (2008). Integrate Web 2.0 technology to facilitate online professional com-
munity: EMI special editing experiences. Educational Media International, 45(4), 335–​341.
Tudini, V. (2003). Using native speakers in chat. Language Learning & Technology, 7(3), 141–​159. Retrieved
from http://​llt.msu.edu/​vol7num3/​pdf/​tudini.pdf
Tudini, V. (2007). Negotiation and intercultural learning in Italian native speaker chat rooms. Modern Language
Journal, 91(4), 577–601.
Wang, Y. (2015). Promoting collaborative writing through wikis: A new approach for advancing innovative and
active learning in an ESP context. Computer Assisted Language Learning, 28(6), 499–​512.
Warschauer, M. (1997). Computer-​mediated collaborative learning: Theory and practice. Modern Language
Journal, 81, 470–​481.
Warschauer, M., & Ware, P. (2006). Automated writing evaluation: Defining the classroom research agenda.
Language Teaching Research, 10(2), 157–​180. Retrieved from www.learntechlib.org/​p/​101033/​.
Wigham, C.R., & Chanier, T. (2015). Interactions between text chat and audio modalities for L2 commu-
nication and feedback in the synthetic world second life. Computer Assisted Language Learning, 28(3),
260–​283.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21(4), 321–​331.
Yang, Y.F. (2018). New language knowledge construction through indirect feedback in web-based collaborative writing. Computer Assisted Language Learning, 31(4), 459–480.
Yeh, H.C. (2014). Exploring how collaborative dialogues facilitate synchronous collaborative writing.
Language Learning & Technology, 18(1), 23–​37.
Yim, S., Wang, D., Olson, J., Vu, V., & Warschauer, M. (2017). Synchronous writing in the class-
room: Undergraduates’ collaborative practices and their impact on text quality, quantity, and style.
Proceedings of the Conference on Computer Supported Cooperative Work, CSCW ’17.
Ziegler, N. (2018). Pre-​task planning in L2 text-​chat: Examining learners’ process and performance. Language
Learning & Technology, 22(3). 193–​213.

PART III

Expanding Research Agendas


22
DIRECTIONS FOR FUTURE RESEARCH AGENDAS ON L2 WRITING AND FEEDBACK AS LANGUAGE LEARNING FROM AN ISLA PERSPECTIVE
Ronald P. Leow and Rosa M. Manchón
Georgetown University and University of Murcia

Introduction: Writing and Language Learning from an ISLA Perspective


The role of writing is typically reserved for academic settings in which this ability either forms part of a four-skill language curriculum or comprises an entire upper-level writing course.
Consequently, research on writing, especially when associated with potential language learning,
needs to be situated within the context in which it occurs, namely, in an instructed setting. This,
in turn, leads to the importance of considering several crucial variables that exist within such
a formal environment, for example, the structure of the classroom setting, syllabus content,
the roles of both students and teachers, the provision of written corrective feedback (WCF),
and the learning goals and outcomes of any given language curriculum or composition course.
Accordingly, when related to potential language learning, research on writing can greatly benefit
from being situated within an instructed second language acquisition (ISLA) perspective, as
opposed to a purely SLA one.
This chapter, then, focuses on the relationship between L2 writing, WCF, and language learning,
and we begin by viewing this relationship from an ISLA perspective, which we divide into two sub-strands.
We then underscore a process-​oriented approach to this relationship by exploring more deeply the
nature, conditions, and effects of the processing dimension of both stages of the writing and revi-
sion (processing of feedback) process. We then argue for the relevance of situating ISLA-oriented writing research within a language curriculum. In line with the aims of chapters in this section of the Handbook, we propose several directions for future ISLA research and methodo-
logical recommendations associated with a research agenda based on pursuing a process-​oriented
perspective of writing, the role of WCF, and potential associations with language learning. We con-
clude with a synthesis of the ideas discussed in the chapter.


ISLA: Applied ISLA vs. ISLA Applied


ISLA, although in existence for over three and a half decades, has only recently witnessed a
concerted effort to avoid the conflation between this classroom-​based strand of research and its
broader and more naturalistic counterpart (SLA), and to revisit the previous foci of what defines
ISLA. Early definitions of ISLA, which ranged from classroom-based research (Ellis, 1990) to the role of instruction as an intervention in the second/foreign language (L2) learning process (Ellis, 2005), have gradually become more fine-tuned to view ISLA from a dual perspective: (a) externally, in
light of the role that instruction or intervention may play in mediating such processing to promote
learning internally, and (b) internally, by placing greater emphasis on research that underscores the
importance of a more nuanced understanding of the cognitive processes employed by L2 learners
while interacting with L2 data. Synthesizing previous definitions, Loewen (2015) characterizes
ISLA as

a theoretically and empirically based field of academic inquiry that aims to understand
how the systematic manipulation of the mechanisms of learning and/​or the conditions
under which they occur enable or facilitate the development and acquisition of a language
other than one’s own.
p. 2. Italics added.

This definition underscores (a) the instructed or classroom setting; (b) a focus on the “mechanisms
of learning” (cognitive processes) employed in this instructed setting (which include “the pro-
cessing and internalization of L2 input; the consolidation and storage of L2 knowledge, and the production of L2 output” (Loewen & Sato, 2017, p. 3)); and (c) the potential manipulation of these
processes by instructional intervention to promote superior L2 learning.
Viewing ISLA from contextual, processing, and curricular perspectives, Leow (2019a) divides
ISLA into two sub-​strands: (a) applied ISLA and (b) ISLA applied. Applied ISLA comprises studies
that investigate the many variables in the instructed setting (e.g., length of writing activity, type
of linguistic item or WCF, genre) while not necessarily taking place in an instructed setting and
thus includes lab-​based studies. ISLA applied –​which according to Leow (2019a) would be better
reflected by instructed language learning (ILL) –​comprises studies seeking to inform pedagogical
practice via pedagogical intervention. The main distinction, then, between applied ISLA and ISLA
applied lies in the former investigating the instructed setting without any specific effort to pro-
vide pedagogical implications (relatively similar to controlled or laboratory-​based ISLA research),
while the latter comprises studies not only situated within a language curriculum and seeking to
inform pedagogical practice but also attempting to promote a level of learning that is successful (a
passing grade) from a curricular perspective. Both sub-strands of ISLA research are important in their respective ways and can also potentially feed each other regarding the nature of the processing
dimension of writing and use of WCF in relation to language learning and the potential success of
instructional interventions.

The Cognitive Processing Dimension of ISLA and Writing


The focus on mechanisms of learning parallels the cognitive processing underscored in Cumming’s
(2020) concept of learning through writing: “L2 learning through writing appears to happen
through cognitive problem solving and restructuring, applications and enhancements of self-​
regulation, and collaborations with others while writing” (p. 40). These processes and strategies
are well established in studies in writing (see Roca de Larios, Nicolás-​Conesa, & Coyle, 2016, for
a comprehensive review; López-​Serrano, Roca de Larios, & Manchón, 2019, 2020; Stiefenhöfer
& Michel, 2020, for recent empirical studies on writing processes in individual and collaborative
writing conditions). In non-writing ISLA, recent studies have identified several cognitive processes associated with successful L2 learning during input and intake processing: activa-
tion of appropriate prior knowledge, metacognition, hypothesis testing and rule formulation, and
a high depth of processing that leads to awareness at the level of understanding (e.g., Leow, 2015;
See Leow, 2019b for several other recent works). Other studies have addressed the effects of their
manipulation during task completion (mostly controlled, computerized problem solving or reading
tasks, e.g., Cerezo, Caras, & Leow, 2016; Leow, Cerezo, Caras, & Cruz, 2019). However, while
there are ample data on L2 cognitive processes, as well as abundant research on writing processes,
there is a paucity of studies that have addressed the nature and role of these general L2 cognitive
processes within the strand of writing as a site for L2 learning and in association with any potential
role of WCF in subsequent L2 learning. We return to the need for future explorations of writing processes and processing from a language learning perspective at different points throughout the
chapter.

The Curricular Perspective


Pursuing a process-​oriented perspective of writing and language learning within ISLA also raises
several issues that need to be seriously considered by any research agenda aimed at understanding
writing as a site for L2 learning. The first is the need to acknowledge that ISLA-​oriented L2 writing
research should be situated within a language curriculum, which, as noted above, has a syllabus,
curricular information, specific goals, and expected learning outcomes students are required to
successfully achieve (Leow & Cerezo, 2016; see also Leow, 2019a, 2019c). Without denying the
relevance of controlled, laboratory-​type writing research, the failure of many SLA-​oriented writing
studies to situate their research designs within some language curriculum not only reduces the external validity of their findings but also limits any pedagogical ramifications to be derived from these findings. Acknowledging that ISLA-oriented L2 writing research lies within a language
curriculum opens up new directions for such studies to form part of the language syllabus followed
by L2 writers during the academic semester or year (see, e.g., Caras, 2019; Coyle, Cánovas-​Guirao,
& Roca de Larios, 2018; Cerezo, Manchón, & Nicolás-​Conesa, 2019; Manchón, Nicolás-​Conesa,
Cerezo, & Criado, 2020).
Situating ISLA within a curricular perspective also warrants serious consideration of the type of
learning assumed to take place in this formal setting (Leow & Cerezo, 2016). The formal instructed
setting does not share the affordances provided by the naturalistic setting (SLA) and is designed to
promote more explicit than implicit learning (Leow, 2019c). The statistical superiority of explicit
or intentional over implicit or incidental learning in instructed SLA contexts (Leow, 2018, 2019c)
suggests that the investigation of the cognitive processes associated with explicit learning should
be prioritized in this setting. Thus, as noted above, the processing dimension of the act of writing
and of feedback use warrants further scrutiny given the gaps in our understanding of the precise
nature and effects of the cognitive processes identified as contributing to successful L2 learning in
the domain of writing.
Arguably, the most important (and perhaps most debatable) aspect of a curricular approach
to ISLA research is the need to consider that any pedagogical implications for the instructed L2
environment should address the learning outcomes of the language program given its curricular
status (Leow, 2019a; Leow & Cerezo, 2016). Consequently, ISLA-​oriented L2 writing research that
seeks to provide pedagogical implications needs to consider the usefulness of its findings in rela-
tion to the curricular learning outcomes of the instructed setting (Leow, 2015, 2019a, 2019c). The
ramifications of this consideration affect any future ISLA research with regard to research questions
asked, research designs employed, and the impact and directions of the study’s findings. These
ramifications are captured in Leow’s (2019a) proposal to differentiate ISLA based on the goals of
the researchers, as reported above.


ISLA, Writing, and WCF


Writing as a site for learning has been approached from different perspectives (as recently reviewed
in Manchón, 2020a, 2020b). We discuss below those strands that are more closely linked and rele-
vant to the ISLA perspective in focus in this chapter.

ISLA and Writing


There has been a recent shift in orientation in the study of writing processes in an attempt to
address such processes from the perspective of their language learning potential (see Chapter 6
and Chapter 25, this Handbook, for a review of research trends and findings and further research
methodology considerations in this body of work). Pertinent to the focus of this chapter are past and
more recent studies investigating the temporal distribution of writing processes (i.e., how writers
allocate their attentional resources during the composing time and sequence their processes).
Gánem-​Gutiérrez and Gillmore (2018), employing a triangulation of data gathered via screen cap-
ture techniques, eye tracking, and stimulated retrospective recalls, and Roca de Larios, Manchón,
Murphy, and Marín (2008, see also Manchón, Roca de Larios, & Murphy, 2009), employing think-​
aloud data, both reported that formulation, that is, converting ideas into language, was the predom-
inant process in writing conditions with and without access to sources and that the various activities in which participants engaged were activated differentially at any given time during the composing
process. It was also reported that proficiency level played a role in the activation and distribution
of writing processes across the entire writing process. The role of formulation and the intense
linguistic processing that has been found to characterize writing underscores the need for further
studies of writing processes from the perspective of language learning, following recent studies
on the very nature of such processing in individual/​collaborative, pen-​and-​paper/​digital writing
conditions (e.g., López-Serrano, Roca de Larios, & Manchón, 2019, 2020; Stiefenhöfer & Michel,
2020), as well as across modalities (Zabildea, 2020). The findings deriving from this line of
research would constitute essential empirical evidence that could then inform classroom practice
and pedagogical decision making in the crucial areas of task selection and task implementation
conditions. For instance, teachers could benefit from empirical insights into which tasks elicit the
most attention to language, when, and by whom.

ISLA and Written Corrective Feedback


The role of WCF can only be situated within an ISLA perspective and theoretically and empir-
ically pursued with the ultimate aim of informing practice on how to substantially improve not
only students’ writing ability but also their L2 learning potential. Although investigated for many
decades, the evidence on the role of WCF in language learning remains inconsistent (see Manchón & Vasylets,
2019 for a recent review). The vast majority of these studies fall within the applied ISLA sub-strand
of inquiry given that many did not link their research designs to the language curriculum, although
some pedagogical attempts were made. In addition, they did not address how L2 writers processed
the feedback provided, relying instead on assumptions and claims pertaining to such processing (see
Leow, 2020 for further elaboration). Yet, ISLA writing research addressing potential L2 learning
via WCF is beginning to move toward a deeper probing into the cognitive processes employed by
L2 writers as they process WCF via the use of written languaging, think-​aloud protocols, collab-
orative dialogues, and noticing charts (see Roca de Larios & Coyle, Chapter 7, this Handbook).
For example, a number of studies have investigated WCF processing (in either individual or col-
laborative writing conditions) to, first, establish levels of processing and, second, assess the impact
of levels or depth of processing of WCF on immediate revisions (e.g., Adrada-​Rafael & Filgueras-​
Gómez, 2019; Caras, 2019; Cerezo et al., 2019; Manchón et al., 2020; Park & Kim, 2019; Suzuki,
2017). These descriptive studies are especially relevant because they represent real links with cru-
cial issues of debate in SLA and ISLA research such as the role of attention (see Chapter 23, this
Handbook). In addition, many of these studies are framed within Leow’s (2015) recent model of
the L2 learning process in ISLA that underscores the role of depth of processing and potential levels
of awareness at several processing stages (input, intake, knowledge) along the L2 learning process
(see Leow, 2020 for his feedback processing framework based on his model). These works are also
notable for the light they shed on the very processing of WCF, be it from the perspective of levels of
noticing (Suzuki, 2017) or levels or depth of processing (Adrada-​Rafael & Filgueras-​Gómez, 2019;
Caras, 2019; Cerezo et al., 2019; Manchón et al., 2020; Park & Kim, 2019).
Of relevance, other studies (e.g., Caras, 2019; Coyle et al., 2018) have moved into the cur-
ricular domain (ISLA applied) discussed above by situating their research design within existing
syllabi in an effort to inform practice. Caras’s (2019) exploratory study addressed how learners
process WCF during the revision stage of a composition and the effect of type of feedback (direct,
indirect, metalinguistic, or control) on participants’ subsequent performances on the Spanish lin-
guistic dichotomies ser versus estar and the preterit versus imperfect past tense aspects. Crucially,
the compositions written formed part of the regular assignments on the syllabus and were carefully
designed to elicit the target linguistic items. Also of importance in Caras’s curricular-​based study
was the effort to provide the unfocused WCF typical of classroom L2 learning, while probing
deeper into focused linguistic items. Type of WCF in Caras’s study had no differential effect on
accuracy scores over time.
Coyle et al. (2018) was a robust descriptive-​interventionist study that was conducted in
two intact primary school classrooms. It investigated the language learning potential of WCF
(models) processing by two groups of children (one receiving training in the use of models)
writing collaboratively over a period of five months. The researchers analyzed WCF processing
through a complex product-process analysis of the full trajectories the children followed:
from their initial, joint problem-solving activity while producing their texts, to their collabora-
tive analysis and use of the feedback provided in the model texts, to their collaborative effort to
revise their initial texts on the basis of their processing and appropriation of the input
provided in the model.
The findings in Coyle et al.’s (2018) study underscore the importance of seriously considering
the population (e.g., adults vs. children) before blindly extrapolating key constructs developed for
investigating writing processes for one population to another. The study also clearly demonstrated the
need to employ additional analytical categories, such as multi-layered analyses of the data, to address
the relationship between the provision of WCF and language development in the writing process.
In the next section we provide implications for future ISLA research on writing, WCF, and lan-
guage learning.

Implications for Future ISLA Research on Language Learning Through Writing and Corrective Feedback

Adopting an ISLA perspective and, more specifically, one that accepts the distinction discussed
above between applied ISLA and ISLA applied, brings with it implications for empirical research
on writing and language learning at two levels. The first relates to the ultimate aim of the research
being conducted, that is, whether we are concerned with non-​pedagogically-​or pedagogically-​
based studies. The second implication relates to the key questions addressed in research agendas
and methodologies employed to answer them.
Globally, given the current focus of ISLA on a better understanding of “how the systematic
manipulation of the mechanisms of learning and/​or the conditions under which they occur enable
or facilitate the development and acquisition of a language other than one’s own” (Loewen, 2015,
p. 2), as we advanced in an earlier publication (Manchón & Leow, 2020), there are minimally three
principal avenues of inquiry for future ISLA-oriented writing research: (1) exploring more deeply
the nature, conditions, and effects of the processing dimension of both the writing and the
feedback processing stages; (2) situating ISLA studies within a language curriculum; and (3) ascertaining
in both instructed and more controlled laboratory-type settings whether or not the two stages of
writing lead to the kind of language learning gains ("development and acquisition of a language"
in Loewen's definition of ISLA research) that are not only theoretically predicted, but also peda-
gogically expected while fulfilling curricular learning goals.
Advancing research along these three directions will entail conducting both controlled and
curricular-​based, longitudinal studies contributing to applied ISLA and ISLA applied, respect-
ively, that is, studies seeking to shed further light on the manner in which writing and WCF
may contribute to language learning, as well as studies seeking to inform pedagogical practice
via pedagogical interventions. In the next section we suggest how to move forward in pursuing
these research agendas.

Main Research Directions on the Language Learning Potential of Writing Processes


Building on Manchón and Leow (2020), future ISLA directions would need to minimally include
the following:
Direction 1: Expand current process-​oriented research on writing and engagement with WCF in
pen-​and-​paper and digital environments. Further research on the processing dimension of writing
itself and of the L2 writer’s engagement with WCF will not only provide more detailed data on such
processes and strategies employed during writing and feedback appropriation (across different aca-
demic levels or contexts, time, and populations) but also allow us to probe deeper into their roles in poten-
tial language learning, an avenue of research still in need of more robust exploration (Manchón &
Leow, 2020). This expansion could eventually allow us to move from generating hypotheses about
the connection between writing processes and language learning, to hypothesis testing in investi-
gating L2 writing as a site for language learning.
With regard to the processing dimension of WCF use, although the WCF strand of research
stretches over several decades, very little of the research has been conducted in relation to the
processes employed by L2 writers as they process WCF and the environment within which they are
situated, namely, within the language curriculum. As pointed out by Leow (2020), future studies
need to address the dearth of concurrent data on the cognitive processes employed by L2 writers
at all levels of proficiency during both stages of the writing process (composing and revising).
A better understanding of how L2 writers initially compose and then interact with WCF and what
role their cognitive processes play in subsequent L2 development in relation to WCF will prevent
researchers from making assumptions on how L2 writers process WCF. Concurrent data will also
provide deep insights into, for example, how L2 writers process different types of linguistic items,
perhaps based on the characteristics of the error produced (e.g., saliency, complexity), or the role
of depth of processing (e.g., Caras, 2019; Cerezo et al., 2019; Leow, 2015; Manchón et al., 2020)
during both the writing and revision process. With this process-​oriented focus, research can repli-
cate previous WCF studies that have addressed a multitude of variables, with future research
designs situated within the language classroom and curriculum. Ultimately, if viewed from an ISLA
applied perspective, such rich data can only lead to pedagogical implications aimed at promoting
superior learning in the instructed setting.
Future studies also need to acknowledge that digital modalities are currently accompanying trad-
itional modes of writing (i.e., paper-​and-​pen) in many language curricula and it is imperative that
these studies address this digital development (McKee & DeVoss, 2007), including digital writing
practices outside the formal setting, as well as digital writing in individual and collaborative writing
conditions (e.g., Stiefenhöfer & Michel, 2020). In an effort to simulate the ISLA setting, as noted
by Manchón and Leow (2020), process-​oriented writing research, then, must include the explor-
ation of both print-​based pen-​and-​paper and screen-​based writing processes in both individual and
collaborative writing conditions and inside and outside the formal setting.
Direction 2: Expand research to include diverse contexts and populations. An overview of ISLA-​
oriented writing research reveals that the majority of writing studies have addressed an academic
context whose population is typically college-level and, in several cases, possesses some background
in language and linguistics (Manchón & Leow, 2020). Indeed, Coyle et al. (2018), conducted in
two intact primary school classrooms, revealed important directions for future research with regard
to different contexts and populations. Given the potential for differences between contexts and
populations, future research on writing needs to not only include other lower academic or non-​
academic contexts and L2 writers who do not possess much linguistic knowledge, but also perform
contrastive analyses between them to arrive at a better understanding of these differences (see
McBride & Manchón, forthcoming). In addition, as suggested by Coyle et al. (2018), key constructs
applied to one population may not necessarily be appropriate for another and some caution and
reconsideration are warranted before blind extrapolation.
Direction 3: Expand studies on task complexity in writing. Current writing research exploring
online writing behavior has reported the effects of task-​complexity variables on attentional resources
allocated to addressing language-​related concerns during the process of writing (see Chapter 5, this
Handbook). For example, Ong (2014) found an effect of the manipulation of task conditions on
metacognitive processes (measured on a retrospective questionnaire), together with some trade-off
effects between, on the one hand, metacognitive processes related to idea generation and informa-
tion organization and, on the other hand, language-related dimensions of writing activity. Révész,
Kourtali, and Mazgutova (2017) reported that, based on keystroke logging and the stimulated recall
procedure, the less complex task in their study “reduced processing burden on planning processes,
facilitating attention to linguistic encoding” (p. 208). These findings align relatively well with
Manchón and Vasylets’s (2019) review of task-​complexity studies and empirically pursuing this
direction will clearly improve our understanding of the role of task complexity in this strand of
research and the connection between writing and language learning in ISLA (see also Manchón,
2020b). Importantly, however, concurrent data (think-​aloud protocols) will need to be gathered to
truly address how L2 writers are processing during exposure to manipulated task complexity instead
of relying solely on concurrent hand movement (keystroke logging) and post-​writing (stimulated
recall) data that are unable to capture the cognitive processes employed by L2 writers.
Direction 4: Expand research on the mediation of individual differences (IDs) in the processing
dimension of writing and WCF appropriation. An expanding body of empirical investigations has
responded to Kormos’s (2012) initial call to make the study of IDs more central in L2 writing
research (as reviewed in Chapters 11 and 12, this Handbook). This research on individual differences
has studied both writing and written corrective feedback, focusing primarily on the effects of cogni-
tive individual differences, especially aptitude and working memory (see review in Chapter 11, this
volume). Additionally, as reviewed by Ferris and Kurzer (2019), several studies have looked into the
mediation of affective and attitudinal individual differences in making use of feedback. Yet, most
of this research has studied effects on writing products. Hence, the kind of future process-​oriented
studies advocated here would benefit from pursuing further insights into the way in which the pro-
cessing dimensions of writing and feedback appropriation are mediated by learner characteristics.
Such research would be both theoretically and empirically relevant. Thus, from a theoretical
angle, recent postulations of working memory as part of language aptitude (e.g., Kormos, 2012;
Wen, 2019) need to be put to the empirical test in the case of writing, at a minimum on account of
the differential demands for attention and depth of processing required in oral and written commu-
nication (Williams, 2012; Manchón & Williams, 2016), and of the idiosyncratic nature of writing
processes that may be linked to crucial aptitude components other than working memory capacity
and functioning (as discussed by Ahmadian & Vasylets, Chapter 11, this Handbook). It is also an
empirical question whether or not various sets of individual differences may differentially influence
writing and WCF use by the same L2 writers. Importantly, most extant research has been carried out
with college students or with teenagers enrolled in immersion programs, while studies with younger
populations are underrepresented in the field (see Michel, Kormos, Brunfaut, & Ratajczak, 2019 for an
exception). Accordingly, there is scope for further work on individual differences in future ISLA-​
oriented L2 writing studies.

Main Methodological Considerations


The future research agenda suggested above poses diverse methodological challenges, among
which we should mention the following:

1. A process-oriented research agenda on writing as a site for learning in ISLA necessitates
a clear effort to gather as much concurrent data as possible during the writing process in order to
establish the cognitive processes being employed by L2 writers and whether such processes
are connected to or predictive of any potential learning. The current effort to expand meth-
odological approaches in order to capture writing processes in pen-​and-​paper and digital
writing is clearly laudable (see Révész & Michel, 2019; Chapter 25, this Handbook). Yet,
think-​aloud protocols, although a time-​consuming undertaking mostly due to transcription,
segmentation, and coding, arguably elicit the richest data on cognitive processes employed
during task performance, and this procedure also avoids, for the most part, the need to rely on claims and
assumptions postulated to account for how learners process L2 data. While these protocols
do hold the potential for reactivity (based on type of task) and obtrusiveness, all the other
procedures mentioned above may be inadequate to provide such rich data on specifically
how L2 writers process during the composing and revising stages or on what cognitive
processes related to subsequent L2 development are being employed, validity issues raised
by other researchers (e.g., Galbraith & Baaijen, 2019; Galbraith & Vedder, 2019; see
also contributions to Lindgren & Sullivan, 2019; Manchón & Roca de Larios, forthcoming; Révész
& Michel, 2019). Given these options, the recommendation by Leow, Grey, Marijuan, and
Moorman (2014) in their critical review of three concurrent data elicitation procedures
(think-aloud protocols [TA], eye tracking [ET], and reaction times [RT]) also holds for the
process-​oriented strand of writing, namely, that “one methodological suggestion may be
to employ a procedural combination of ET, RT, and TA that aims to increase the level of
internal validity of the study by maximizing the strengths of a particular procedure while
minimizing its weaknesses” (p. 121). This call for triangulation of data (see also Révész &
Michel, 2019 and Chapter 25, this Handbook), then, may be the logical direction to take in
an effort to capture the many subtle layers of the writing process, in relation to language
learning and the provision of WCF.
2. The quest for valid methodological procedures discussed above is linked to how the L2 writing
data are coded and analyzed. For example, as Coyle et al. (2018) pointed out, the analysis
of WCF processing needs to be more fine-​tuned and go beyond simple binary distinctions.
Additional analytical categories should be sought that allow, for example, for both the descrip-
tion of noticing and the outcomes of noticing as a question of degree rather than as all-or-
nothing categories. Such an effort is also evidenced in Manchón et al. (2020), who investigated
whether levels or depth of processing (DoP) were mediated by writing conditions (individual
vs. collaborative writing), and whether there was a relationship between the participants’ DoP
of WCF and any observed effects on accuracy measures in rewritten texts produced before and
after processing WCF. Given the potential of concurrent data elicitation procedures to obtain
depths of processing and levels of awareness during task performance, future research needs
to probe deeper into the sublayers of L2 writers’ processes and the relationships between such
processes and subsequent language learning.
3. Research needs to include, in addition to the typical one-​shot or short time span design (applied
ISLA), more long-​term or longitudinal designs if researchers seek to view the study of writing
and writing processes from an ISLA applied perspective. This latter perspective attempts to
(1) capture the temporal nature of a language curriculum and the repeated opportunities for
learning and development through writing (see Norris & Manchón, 2012), and (2) investigate
the nature and effects of the manipulation of processes reported to be conducive to language
learning within the real language/writing classroom and within the syllabus (i.e., instead of a
laboratory setting; see further discussion in Manchón, 2020b). Longitudinal data will likely
provide us with richer insights into the writing processes L2 writers deploy while performing writing activities that
are distributed over different time spans in a natural syllabus (see Seror, 2013 for one such
attempt). In short, as discussed by Manchón and Leow (2020), conducting only cross-​sectional
studies will not suffice to provide the kind of robust insights that can advance theoretical and
pedagogical knowledge in the domain of writing in a language program that is potentially
linked to language learning. In addition, future ISLA applied studies also need to conform to
the type of WCF provided in such formal settings, namely, direct or metalinguistic. Providing no
feedback fails to meet students' expectations, may be confusing, and may be viewed as a
dereliction of teacher responsibility.

An Exemplar of One ISLA Instructed Research Design


To illustrate an ISLA applied approach to gathering online or concurrent data on writing pro-
cessing and processes, one research design that adheres to contextual, processing, and curricular
perspectives may be the following, conducted within a language curriculum that incorporates sev-
eral compositions or tasks as assignments during the semester, based on the content, vocabulary,
and linguistic structures of the textbook chapters. For example, composition topics are selected
and carefully designed prompts are developed for each topic that will elicit a satisfactory number
of target linguistic items covered in the chapter within each composition (these prompts comprise
the focused WCF to be used to address potential learning). Subsequent topics, in addition to newly
covered grammatical or lexical items, will incorporate some of these previous prompts to address
the issue of learning or L2 development. Students (participants) will follow the typical assignment
of writing their compositions at home but, in order to gather concurrent data on how L2 writers pro-
cess written production both at the initial composing and revision stages of the L2 writing process,
they will be asked to record their thoughts and upload the protocols to a server. Corrected original
compositions and rewrites are also collected. The usual unfocused WCF will comprise only direct
and metalinguistic feedback at the lower levels of proficiency (due to the empirical findings that
indirect feedback elicits overall low depth of processing) and include this type of indirect feed-
back at upper levels (to promote more robust activation of prior knowledge). Providing no written
corrective feedback may be unethical and unusual in a language curriculum. Students revise their
original compositions once again at home and record their thoughts during this revision stage.
Gathering concurrent data at both the initial composing and revision stages of the writing process
makes it possible to explore potential correlations between, for example, the depths of processing and strategies
employed at the two stages. WCF may be alternated (as done in the study by Amelohina, Manchón,
& Nicolás-Conesa, 2020) to allow for a within-subjects analysis of type of feedback, and a ques-
tionnaire on students' preferences can be administered. The same procedure can be conducted at different
levels of language proficiency to address whether such levels play a role in the L2 writing pro-
cess and L2 learning. To address the role of individual differences, questionnaires or tests may be
conducted, while an attempt to equate or distinguish the L1 and L2 writing processes may require
students to write in their L1 while thinking aloud during written production of a similar topic.
Such a research design, while challenging to implement, is clearly of tremendous value and is intended
to address a multitude of variables that may potentially play a role in the L2 writing process and
product. Between- and within-subjects analyses, both quantitative and qualitative, are possible, which
would lead to stronger insights into the L2 writing process, the provision of WCF, and potential lan-
guage learning. The true strength of the design is that the data are taken from different stages within
an existing syllabus or language curriculum, are authentic (as opposed to a controlled, laboratory
setting), and are available to provide not only important insights into the two stages of
the writing process and the roles of several variables in overall writing ability but also
important feedback to teachers on their students' progress with regard to their L2 writing ability.

Conclusions
There is no question that the writing component of any instructed setting, be it during the com-
posing and/or the revision stages (with WCF provided), is one of the most investigated
areas of L2 learning and development. At the same time, further studies that specifically address
L2 writing processes during these two stages and whether the act of writing and revising (and the
cognitive processes employed during each act) promote L2 learning are both theoretically and
pedagogically relevant.
We began this chapter with the argument that any L2 writing research associated with language
learning needs to be situated within an ISLA perspective as opposed to an SLA one if viewed from
a contextual, processing, and curricular perspective. We suggested that future work on L2 writing as
a site for learning be subsumed within one of two perspectives of ISLA research: applied ISLA and
ISLA applied (Leow, 2019a), which differ sharply with regard to context (laboratory vs. classroom),
research designs (one-shot vs. longitudinal), type of WCF, and researchers' goals regarding
pedagogical implications derived from the empirical findings. In particular, we focused on the
processing dimension of writing that characterizes the inquiry into “L2 learning through writing”
(Cumming, 2020). This foundational principle carries implications for future research agendas at
various levels, namely, the need for future research efforts to adhere to current ISLA theorizing,
and the nature of research efforts in the domain premised on robust theoretical, methodological,
and pedagogical considerations. We underscored the necessity to acknowledge that ISLA-​oriented
L2 writing research is situated within a language curriculum, which opens up new directions for
such studies to form part of the language syllabus followed by L2 writers during the academic
program. We then proposed several directions of future research, whether from an applied ISLA
or ISLA applied perspective, that include (a) an expansion of current process-​oriented research on
L2 writing as a site for language learning; (b) an acknowledgment that digital modalities are cur-
rently accompanying traditional modes of writing in many language curricula and hence the need
for future studies to address this digital development inside and outside the formal setting; (c) a
consideration of diverse contexts and populations to include other lower academic (e.g., children)
or non-​academic contexts and L2 writers who do not possess much linguistic knowledge while
reconsidering a blind extrapolation of key constructs from one population to another; (d) an expan-
sion of studies on task complexity in writing given its effects on attentional resources allocated to
address language-related concerns during the process of writing; and (e) an expansion of research
on the role of individual differences in bringing about learning through writing.
We also provided several methodological considerations future ISLA process-​oriented studies
may want to address. One major consideration is the type of data elicitation procedure employed
to elicit data on L2 writers’ cognitive processes. While we have underscored the richness of
mental or cognitive data gathered by concurrent think-​aloud protocols when compared to other
procedures employed in the writing strand, we clearly support Leow et al.'s (2014) call for potential
triangulations of several data elicitation procedures or techniques that can minimize the limitations
of any one procedure. Linked to valid data elicitation procedures is how the L2 writing data are
coded and analyzed. Concurrent data provide the opportunity to code and analyze sublayers of
processes that will logically lead to a better understanding of how L2 writers process during both
composing and revising stages of the writing process and the relationships between such processes
and subsequent language learning. We also recommended going beyond the one-​shot/​short span
or laboratory-​based design to long span/​longitudinal designs that would allow the investigation
of writing processes employed by L2 writers as they perform writing activities over different time
spans. This design is crucial for ISLA instructed studies given its pedagogical purpose of informing
practice.
To conclude, a focus on L2 writing from a process-​oriented perspective as a site for learning and
situated within a language curriculum clearly opens up several promising areas of research in ISLA.
Perhaps the most challenging but clearly the most fruitful area is the adoption of an ISLA applied
approach that seriously considers the language curriculum, its learning outcomes, and those of L2
writers in this instructed setting.

References
Adrada-​Rafael, S., & Filgueras-​Gómez, M. (2019). Reactivity, language of think-​aloud protocol, and depth
of processing in the processing of reformulated feedback. In R.P. Leow (Ed.), The Routledge handbook of
second language research in classroom learning (pp. 201–​213). New York: Routledge.
Amelohina, V., Manchón, R.M., & Nicolás-​Conesa, F. (2020). Effects of task repetition with the aid of direct
and indirect written corrective feedback: A longitudinal study. In R.M. Manchón (Ed.), Writing and lan-
guage learning. Advancing research agendas (pp. 145–​181). Amsterdam: John Benjamins.
Caras, A. (2019). Written corrective feedback in compositions and the role of depth of processing. In R.P.
Leow (Ed.), The Routledge handbook of second language research in classroom learning (pp. 186–​198).
New York: Routledge.
Cerezo, L., Caras, A., & Leow, R.P. (2016). Effectiveness of guided induction versus deductive instruction on
the development of complex Spanish “gustar” structures: An analysis of learning outcomes and processes.
Studies in Second Language Acquisition, 38, 265–​291.
Cerezo, L., Manchón, R.M., & Nicolás-​Conesa, F. (2019). What do learners notice while processing written cor-
rective feedback? A look at depth of processing via written languaging. In R.P. Leow (Ed.), The Routledge
handbook of second language research in classroom learning (pp. 171–​185). New York: Routledge.
Coyle, Y., Cánovas-​Guirao, J., & Roca de Larios, J. (2018). Identifying the trajectories of young EFL learners
across multi-​stage writing and feedback processing tasks with model texts. Journal of Second Language
Writing, 42, 25–​43.
Cumming, A. (2020). L2 writing and L2 learning. Transfer, self-​regulation, and identities. In R.M. Manchón
(Ed.), Writing and language learning. Advancing research agendas (pp. 29–​ 48). Amsterdam: John
Benjamins.
Ellis, R. (1990). Instructed second language acquisition. Cambridge, MA: Blackwell.
Ellis, R. (2005). Instructed second language acquisition: A literature review. Wellington: Ministry of Education.
Ferris, D., & Kurzer, K. (2019). Does error feedback help L2 writers? Latest evidence on the efficacy of written
corrective feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing. Contexts and
issues (pp. 106–124). Cambridge: Cambridge University Press.
Gánem-​Gutiérrez, A., & Gillmore, A. (2018). Tracking the real-​time evolution of a writing event: Second lan-
guage writers at different proficiency levels. Language Learning, 68 (2), 469–​506.
Galbraith, D., & Baaijen, V.M. (2019). Aligning keystroke with cognitive processes in writing. In E. Lindgren
& K.P.H. Sullivan (Eds.), Observing writing: Insights from keystroke logging and handwriting (pp. 306–​
325). Leiden: Brill.
Galbraith, D., & Vedder, I. (2019). Methodological advances in investigating L2 writing processes. Challenges
and perspectives. In A. Révész & M. Michel (Eds.), Methodological advances in investigating L2 writing
processes. Special Issue, Studies in Second Language Acquisition, 41 (3), 633–​645.
Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing, 21,
390–​403.
Leow, R.P. (2015). Explicit learning in the classroom. A student-​centered perspective. New York: Routledge.


Leow, R.P. (2018). Explicit learning and depth of processing in the instructed setting: Theory, research, and
practice. Studies in English Education, 23, 769–​801.
Leow, R.P. (2019a). From SLA > ISLA > ILL: A curricular perspective. In R.P. Leow (Ed.), The Routledge
handbook of second language research in classroom learning (pp. 485–​493). New York: Routledge.
Leow, R.P. (Ed.) (2019b). The Routledge handbook of second language research in classroom learning.
New York: Routledge.
Leow, R.P. (2019c). ISLA: How explicit or how implicit should it be? Theoretical, empirical, and pedagogical/​
curricular issues. Language Teaching Research, 23, 476–​493.
Leow, R.P. (2020). L2 writing-​to-​learn: Theory, research, and a curricular approach. In R. M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 95–​117). Amsterdam: John Benjamins.
Leow, R.P., & Cerezo, L. (2016). Deconstructing the “I” and “SLA” in ISLA: One curricular approach. Studies
in Second Language Learning and Teaching, 6, 43–​63.
Leow, R.P., Cerezo, L., Caras, A., & Cruz, G. (2019). CALL in ISLA: Promoting depth of processing of com-
plex L2 Spanish “para/​por” prepositions. In R. DeKeyser & G. Prieto Botana (Eds.), SLA research with
implications for the classroom: Reconciling methodological demands and pedagogical applicability (pp.
155–​178). Amsterdam: John Benjamins.
Leow, R.P., Grey, S., Marijuan, S., & Moorman, C. (2014). Concurrent data elicitation procedures, processes,
and the early stages of L2 learning: A critical overview. Second Language Research, 30(2), 111–​127.
Lindgren, E., & Sullivan, K.P.H. (Eds.) (2019). Observing writing: Insights from keystroke logging and hand-
writing. Leiden: Brill.
Loewen, S. (2015). Introduction to instructed second language acquisition. New York: Routledge.
Loewen, S. & Sato, M. (Eds.) (2017). Instructed second language acquisition (ISLA): An overview.
In S. Loewen & M. Sato (Eds.), The Routledge handbook of second language acquisition (pp. 1–​12).
New York: Routledge.
López-​Serrano, S., Roca de Larios, J., & Manchón, R.M. (2019). Language reflection fostered by individual
L2 writing tasks: Developing a theoretically-​motivated and empirically-​based coding system. Studies in
Second Language Acquisition, 41, 503–​527.
López-Serrano, S., Roca de Larios, J., & Manchón, R.M. (2020). Processing output during L2 individual
writing tasks: An exploration of depth of processing and the effects of proficiency. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 231–​253). Amsterdam: John Benjamins.
Manchón, R.M. (Ed.). (2020a). Writing and language learning. Advancing research agendas. Amsterdam: John
Benjamins.
Manchón, R.M. (2020b). Writing and language learning. Looking back and moving forward. In R.M. Manchón
(Ed.), Writing and language learning. Advancing research agendas (pp. 3–26). Amsterdam: John Benjamins.
Manchón, R.M., & Leow, R.P. (2020). An ISLA perspective on L2 learning through writing. Implications
for future research agendas. In R.M. Manchón (Ed.), Writing and language learning. Advancing research
agendas (pp. 335–​355). Amsterdam: John Benjamins.
Manchón, R.M., Nicolás-​Conesa, F., Cerezo, L., & Criado, R. (2020). L2 writers’ processing of written cor-
rective feedback: Depth of processing via written languaging. In W. Suzuki & N. Storch (Eds.), Languaging
in language learning and teaching. A collection of empirical studies (pp. 241–​265). Amsterdam: John
Benjamins.
Manchón, R.M., & Roca de Larios, J. (Eds.) (Forthcoming). Research methods in the study of L2 writing
processes. Amsterdam: John Benjamins.
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving nature
of foreign language composing processes: Implications for theory. In R.M. Manchón (Ed.), Writing in for-
eign language contexts: Learning, teaching, and research (pp. 102–​129). Bristol: Multilingual Matters.
Manchón, R.M., & Vasylets, O. (2019). Language learning through writing: Theoretical perspectives and
empirical evidence. In J.W. Schwieter & A. Benati (Eds.), The Cambridge handbook of language learning
(pp. 341–​362). Cambridge: Cambridge University Press.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P. K. Matsuda (Eds.),
The handbook of second and foreign language writing (pp. 567–​586). Boston: de Gruyter.
McBride, S., & Manchón, R.M. (Forthcoming). Written corrective feedback processing in digital and pen-​
and-​paper environments: Comparing the affordances of written languaging and think-​aloud protocols.
In R.M. Manchón & J. Roca de Larios (Eds.), Research methods in the study of L2 writing processes.
Amsterdam: John Benjamins.
McKee, H., & DeVoss, D. (2007). Digital writing research: Technologies, methodologies, and ethical issues.
New dimensions in computers and composition. New York: Hampton Press.
Michel, M., Kormos, J., Brunfaut, T., & Ratajczak, M. (2019). The role of working memory in young second
language learners’ written performances. Journal of Second Language Writing, 45, 31–​45.


Norris, J., & Manchón, R.M. (2012). Investigating L2 writing development from multiple perspectives: Issues
in theory and research. In R.M. Manchón (Ed.), L2 writing development: Multiple perspectives (pp. 221–​
244). Berlin: de Gruyter Mouton.
Ong, J. (2014). How do planning time and task conditions affect metacognitive processes of L2 writers?
Journal of Second Language Writing, 23, 17–​30.
Park, E.S., & Kim, O.Y. (2019). Learners’ engagement with indirect written corrective feedback: Depth of pro-
cessing and self-​correction. In R.P. Leow (Ed.), The Routledge handbook of second language research in
classroom learning (pp. 212–​226). New York: Routledge.
Révész, A., Kourtali, N., & Mazgutova, D. (2017). Effects of task complexity on L2 writing behaviors and
linguistic complexity. Language Learning, 67, 208–​241.
Révész, A., & Michel, M. (Eds.) (2019). Methodological advances in investigating L2 writing processes.
Special Issue, Studies in Second Language Acquisition, 41(3).
Roca de Larios, J., Manchón, R.M., Murphy, L., & Marín, J. (2008). The foreign language writer’s strategic
behavior in the allocation of time to writing processes. Journal of Second Language Writing, 17, 30–​47.
Roca de Larios, J., Nicolás-​Conesa, F., & Coyle, Y. (2016). Focus on writers: Processes and strategies. In
R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 267–​286).
Berlin: de Gruyter Mouton.
Seror, J. (2013). Screen capture technology: A digital window into students’ writing processes. Canadian
Journal of Learning and Technology, 39(3), 1–​16.
Stiefenhöfer, L., & Michel, M. (2020). Investigating the relationship between peer interaction and writing
processes in computer-​supported collaborative L2 writing. A mixed-​methods study. In R.M. Manchón (Ed.),
Writing and language learning. Advancing research agendas (pp. 255–​279). Amsterdam: John Benjamins.
Suzuki, W. (2017). The effect of quality of written languaging on second language learning. Writing &
Pedagogy, 8(3), 461–​482.
Wen, Z. (2019). Working memory as language aptitude: The phonological/​executive model. In Z. Wen,
P. Skehan, A. Biedrón, S. Li, & R. L. Sparks (Eds.), Language aptitude. Advancing theory, testing, research,
and practice. New York: Routledge.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21, 321–​331.
Zabildea, J. (2020). A mixed-​methods approach to exploring the L2 learning potential of writing versus
speaking. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp. 207–
230). Amsterdam: John Benjamins.

23
DIRECTIONS FOR FUTURE
RESEARCH ON ATTENTION
AND L2 WRITING
Osamu Hanaoka and Shinichi Izumi
Tokyo International University and Sophia University

A Very Brief Historical Perspective on the Issue


In the field of second language acquisition (SLA), especially in the cognitive strand of research,
a central debate has revolved around the roles of attention and awareness in second language (L2)
development (Robinson, 1995; Schmidt, 1990, 1993, 1995; Schmidt & Frota, 1986; Tomlin &
Villa, 1994). The construct of attention minimally involves three levels, namely, peripheral (a
glance out of the corner of one’s eye), selective (corresponding to detection –​cognitive regis-
tration of the attended stimuli that may or may not be accompanied by awareness, as argued by
Tomlin & Villa, 1994), and focal (similar to the notion of noticing –​attention with a low level of
awareness and abstraction, as argued by Schmidt, 1990). In this chapter, we would like to restrict
the term “attention” to focal attention, and use the term “noticing” to mean focal attention accom-
panied by some further processing (e.g., data-​driven or conceptually-​driven processing). The latter
would lead to the subjective sense of awareness, along the lines suggested by Robinson (1995;
see also Izumi, 2002 and Chapter 2, this volume for further discussion). While debate still con-
tinues about the exact nature of the attentional processes that are implicated in L2 learning, it is
ultimately through further empirical research that answers to the puzzles can be fully elucidated.
It is the purpose of this chapter to present some relevant literature and suggestions to this effect in
relation to writing.
Spurred by this heightened interest in the role of attention, a number of empirical studies
have been conducted to examine the roles of attention and awareness in L2 learning (e.g., Leow,
2000; Rosa & Leow, 2004). While the initial focus of research was placed on the role of learners’
attention to the input they receive, an increasing number of researchers began to investigate the
critical roles of attention and awareness during output processing as stimulated by Swain’s (2005)
Output Hypothesis. Coinciding with this development, a new strand of research has emerged from
L2 writing scholarship and started investigating the language-​learning potential of writing from the
perspective of “writing-​to-​learn” (as opposed to a more traditional “learning-​to-​write” approach),
where the act of writing is seen to be contributing to the very process of learning (Manchón, 2011).
This new line of research has started to examine L2 writers’ attentional behavior in the cyclical and
recursive process of writing. The role of learners’ attention and awareness has thus begun to be
seen as an essential target of investigation, giving momentum to the cross-fertilization of the two
camps (Manchón & Williams, 2016). In particular, recent investigations have specifically focused
on the hypothesized advantages of writing for promoting focused attention to form, which derives
from the slower and self-​modulated pace and permanent record available in writing (J. Williams,
2012), together with the unique demands of communicating in displaced time and space, which push
the learner to be more precise, complex, and coherent (see Chapter 4, this volume). What is dis-
tinctive about attentional research in writing is the investigation into learners’ attentional behavior
observed in the process of writing, where major issues include attentional allocation at and across
different stages of writing, namely planning, formulation, and revision, as mediated by a number of
influencing factors.

A Brief Description of What We Know Already


A major strand of qualitative process-​oriented research addressing the Output Hypothesis has been
conducted in collaborative writing conditions from sociocultural perspectives (see Chapter 3, this
volume), as represented by Swain and her colleagues (e.g., Lapkin & Swain, 2004; Lapkin, Swain,
& Smith, 2002; Swain & Lapkin, 2002). By analyzing "language-related episodes" (i.e., episodes
in which learners focus on and discuss language), these researchers have uncovered how writing
stimulates learners’ noticing of linguistic problems, hypothesis-​testing, and metalinguistic reflec-
tion when they work in collaboration with other learners. Another line of research has focused on
individual writing conditions and has addressed how noticing of holes (i.e., finding what is lacking
in one’s interlanguage (IL) repertoire in the process of output) promotes subsequent noticing of
solutions and/​or gaps (i.e., finding discrepancies between one’s output formulation and input
forms) in the feedback received, and what uptake results from such noticing (e.g., Hanaoka, 2007,
2012; Hanaoka & Izumi, 2012; Martínez Esteban & Roca de Larios, 2010; Qi & Lapkin, 2001).
These studies of collaborative and individual writing have demonstrated the useful roles of input
texts (models and reformulations) in promoting learners’ noticing. Importantly, such noticing is
stimulated by internally motivated attention to form triggered by output, as opposed to externally
directed attention achieved by instruction or feedback.
Attention to form in tasks has been another main area of investigation in the exploration of
the language-learning potential of writing. Issues of attentional capacity and allocation have been
examined in relation to various task-​related variables such as task types, conditions, and sequences,
yielding complex pictures involving interactions of these and other related variables. A number of
studies in this domain have employed writing tasks including text reconstruction, essay writing, and
picture-​cued writing (Izumi, 2002; Izumi & Bigelow, 2000; Izumi, Bigelow, Fujiwara, & Fearnow,
1999; Russel, 2014; Song & Suh, 2008; Uggen, 2012). These studies reported facilitative roles of
written output in learning the targeted structures. Less clear, however, is the link between noticing
and learning both during the output stage and the subsequent input or feedback processing stages
(see Chapter 2, this volume).
In another strand of research, learners’ attention in composing processes and problem-​solving
behavior has been investigated rigorously from SLA-​ oriented L2-​ writing perspectives, with
two comprehensive programs in Spain and the Netherlands (see Manchón, Roca de Larios, &
Murphy, 2009, and Schoonen, Snellings, Stevenson, & Van Gelderen, 2009, for a synthesis of
the studies conducted within each program). The research program in Spain addressed two main
issues. One concerned learners’ attentional allocation to different composing activities (Roca de
Larios, Manchón, & Murphy, 2008), the specific macroprocess of formulation (Roca de Larios,
Marín, & Murphy, 2001; Roca de Larios, Manchón, & Murphy, 2006), and planning (Manchón &
Roca de Larios, 2007). These analyses suggested how L2 proficiency and the language in which
the writing task is performed influence learners' attentional behavior. The other issue involved the
learners' problem-solving behavior (Manchón et al., 2009; Roca de Larios et al., 2006; Roca de
Larios et al., 2001; Roca de Larios, Murphy, & Manchón, 1999), an issue whose investigation elucidated two types of
problems learners attended to (i.e., compensatory and upgrading types) and the learners' use of
three key strategies (i.e., restructuring, backtracking, and use of the L1). In the research program
three key strategies –​i.e., restructuring, backtracking, and use of the L1. In the research program
in the Netherlands, on the other hand, Schoonen et al. suggested that conceptual processing may be
inhibited in favor of linguistic processing in foreign language writing, which was indicated in their
temporal data provided by think-​aloud protocols.
Together with contextual factors, individual difference factors have also been increasingly
recognized as important variables to be investigated. Aptitude and L2 proficiency have often been
examined in relation to the learners’ variable performance outcomes and feedback processing (e.g.,
Benson & DeKeyser, 2019; Sheen, 2007; Shintani & Ellis, 2015; Stefanou & Révész, 2015), but
it is rare to see studies that examine these variables in relation to their attentional processes.1 In
what follows, of the numerous issues in need of investigation, we will focus on how attention while
writing is mediated by (a) task-​related issues as learner-​external factors and (b) individual diffe-
rence factors in L2 writing as learner-​internal factors. Before we suggest specific agendas for these
topics, we will first present general issues that relate to all the agendas that follow.

A Research Agenda

General Framework
In order to situate our discussion on various research suggestions to be made below in a broad
framework, it will be useful first to explain an SLA model which illustrates how L2 learners
develop their interlanguage (IL) as influenced by a host of relevant factors (see Izumi, 2003, 2013,
for detailed discussion).
As shown in Figure 23.1, the model illustrates how L2 acquisition is driven primarily by input,
whether oral or written. The solid arrows connecting input through output depict the main SLA
processes that convert input to intake and IL all the way through to output production –​whether
via speaking or writing.2 The dotted arrows connecting output to other processes illustrate cyclical
loops whereby output production affects overall SLA processes in dynamic ways as proposed by
the Output Hypothesis (Swain, 2005). Specifically, output production is believed to stimulate the
acquisition processes by inducing focused attention on form in the input, providing chances for

hypothesis-testing, and encouraging metalinguistic reflection.

[Figure 23.1 near here: a flow diagram of the main SLA processes, INPUT → INTAKE → INTERLANGUAGE → OUTPUT, acted on by external factors (e.g., instruction, interaction, feedback, task demands, input factors) and internal factors (e.g., L1, affective factors, cognitive factors, current L2 knowledge), with dotted loops from output labeled noticing form, noticing gap between IL and TL, noticing hole/gap in ability, metalinguistic reflection, hypothesis-testing, and focused attention facilitating intake.]

Figure 23.1 Overall SLA processes and affecting factors (Adapted from Izumi, 2013, p. 29)

Each of these processes is expected to
lead to different types of noticing as shown in the dotted squares in the figure, namely, noticing
of form, noticing of the gap between the IL and the target language, noticing of holes, and noticing
of the gap in ability (i.e., gap between what learners wish to express and what they can actually
formulate). The figure also depicts a number of factors hypothesized to affect learners’ noticing at
different stages of L2 development, including external factors such as instruction and feedback and
internal factors such as motivation and aptitude. All these factors likely interact with each other to
determine the outcome of noticing and learning.
Given the paramount importance of attention in driving L2 development, SLA-​oriented research
on writing, like general SLA research, would inevitably require close examination of how
attentional processes are engaged as learners grapple with the act of writing in the L2. It will be
necessary, in particular, to elucidate whether and how the observed effects of writing may differ
from those found for the act of speaking, which is ephemeral and usually occurs
under much greater time pressure (see Chapter 4, this volume). Writing, unlike speaking, typically
involves three broad and often overlapping stages of planning, formulating, and revising. Careful
empirical scrutiny is necessary to illuminate what learners pay attention to and what cognitive
processes are engaged in each of these stages. Also, given that writing involves recursive and cyc-
lical processes (Schoonen et al., 2009), research is needed into not only what happens in each stage,
but also whether and how events happening at each stage are related to each other culminating in
eventual L2 development. It is possible that L2-​learning benefits are amplified in the form of a chain
reaction of what occurs at different stages of writing. Alternatively, learners may be stuck at any of
these stages, facing difficulties in realizing what they wish to accomplish (cf. the Inhibition Hypothesis
proposed by Schoonen et al., 2009). In such cases, research will help us identify any missing link or
missteps in the learning process, which can inform us in devising (or in delaying) necessary peda-
gogical measures to address the problems.
Methodologically, in order to capture the otherwise elusive, hard-to-access phenomena of
attention, noticing, and learning, researchers will need to employ multiple measures, both online
and offline, to allow for as much methodological triangulation as possible (see Chapters 23 and
26, this volume). As online measures, keystroke logging (Lindgren & Sullivan, 2019) and eye-​
tracking (Godfroid, 2019) are particularly useful tools to capture the focus and shift of learners’
attention in real time as they write and receive input and feedback. These measures, however, will
not be very revealing about the elaborate thought processes that may be occurring in the learners’
heads; thus, they need to be complemented by other online measures like think-​aloud protocols
and offline measures such as in-​depth interviews, retrospective questionnaires, and stimulated
recall that probe why learners engaged (or did not engage) in certain behaviors (Révész, Michel,
& Li, 2017). The changes in the learners’ writing should also be examined from draft to draft to
keep track of how their attentional changes during writing may be reflected (or not) in the written
products.
Once we learn more about what happens at and across different stages of writing, we can move
on to study how engaging in such processes repeatedly over time, with possibly further instruc-
tional intervention provided, may lead to any change in the learners’ attentional focus and long-​
term L2 development. This will inevitably require much-​needed longitudinal research designs
that involve both extended opportunities for learning through writing and delayed assessment
measures to gauge any long-​term changes (see Manchón & Leow, 2020; Polio & Park, 2018;
Chapter 23, this volume). These basic methodological pointers apply equally to all issues explored
in the rest of the chapter, and thus will not be repeated hereafter. Keeping all these points in mind,
we will now move on to more specific research suggestions that we believe are particularly worth
investigating.


Task-​Related Issues

Research Task 1: Explore How Different Task Demands Posed by Different Task
Types and Sequences Affect Learners’ Attentional Allocation and Cognitive Processes
at and Across Different Stages of Writing and How They Contribute to L2 Learning
Task-​based language teaching (TBLT) is premised on the notion that learners should be challenged
to use the language from early on by engaging in meaningful tasks in order to learn it effectively
(Long, 2015; Van den Branden, Bygate, & Norris, 2009). When learners' writing is viewed in light of TBLT,
it is imperative to know more about how the external manipulation of task variables of different
kinds exerts influence on the learners’ internal attentional processes leading to L2 development. The
following are but some major issues in need of research in this area.

Suggested Empirical Studies


How Do Different Task Types Affect Learners’ Attentional
Allocation and Writing Performance?
Task types can be defined in various ways, but given our focus here on attentional processes in writing, we consider them mainly in terms of varying task complexity, which creates different task demands on learners (see Chapter 5, this volume). Depending on the degree of task demands, learners are said to be predisposed to allocate their attentional resources differently. This may
occur in line with the predictions made by Skehan’s (1998) Trade-​off Hypothesis, which states
that attention to one aspect (e.g., complexity) is likely to compromise attention to the others (e.g.,
fluency and accuracy). Alternatively, task demands may exert influence as predicted by Robinson’s
(2005) Cognition Hypothesis, which states that structural complexity and accuracy may both
increase in response to the increased functional demands imposed by the task. Still another pos-
sibility is that the learner’s L2 proficiency level moderates the effects of task demands such that
lower-​proficiency learners are more prone to the trade-​off phenomena due to their still limited pro-
cessing capacity. Another possibility is that the results could vary depending on the types of forms
examined (Sasayama & Izumi, 2012). All this could also depend not only on task type, but also on
the conditions under which the task is conducted (e.g., R. Ellis, 2005; Ong, 2014).
In terms of writing, it is possible that the written modality, with more time and a permanent record available for sustained processing, may mitigate the effects of imposed task demands such that simultaneous attention to different aspects of language becomes more manageable than in the case
of speaking (Manchón & Williams, 2016; J. Williams, 2012; Chapter 4, this volume). Though not implausible theoretically, the studies that have actually compared the written and oral modalities still leave this an open question requiring further research (Vasylets, Gilabert, & Manchón, 2017, 2020; Zalbidea, 2017, 2020; Chapter 4, this volume). An additional important characteristic of
writing relates to its problem-​solving nature (Manchón et al., 2009; Manchón & Williams, 2016).
Namely, as writing often requires more careful and elaborate linguistic processing than is possible in speaking, it should, theoretically at least, be conducive to promoting such processes as noticing and metalinguistic reflection, leading to greater L2 development. However, the scarcity of research in this area precludes us from drawing any definitive conclusions on this issue as yet.

How Do Different Task Sequences Affect Learners' Attentional Allocation and Writing Performance?
Task sequence is yet another area that can benefit from a concentrated research effort. By task sequence we do not mean the macro-level sequencing of different tasks in the syllabus or curriculum, but rather the micro-level sequencing within a single lesson, namely the integration of a writing task with other related tasks in a successive sequence within the lesson flow. Given that such skill integration is becoming the norm in language testing and language teaching alike (Gilabert, Manchón, & Vasylets, 2016), it is imperative to investigate the effects of different task sequences on learners' attentional processes, their writing performance, and their L2 development. A series of studies by Hanaoka and Izumi (Hanaoka, 2007, 2012; Hanaoka & Izumi, 2012; Izumi, 2002) is but one example of research that has investigated
such issues. These studies examined the shift of learners’ attention when the task was sequenced in the
order of writing, reading, and writing on the same prompt again. These studies found that learners face
specific language-​related problems in trying to write, find solutions to these problems when they read a
model text, and incorporate many of these solutions in their subsequent writing. This whole sequence,
therefore, seems to provide a learning opportunity for the learners in accordance with the predictions
made by the Output Hypothesis. To the extent that it is the combination and the specific sequence that
catalyze L2 learning, task sequencing and integration issues merit further research.
For instance, one question that may be raised is whether learners pay attention to similar things
and learn in similar ways when a model text is presented before rather than after writing. Similar
questions can be raised for the value of reading, discussion, debate, and other pedagogical activities
that can be integrated with the writing task. It is possible that the roles and effects of these activities
on writing (as well as those of writing on the development of other skills) may differ depending on
their placement in the instructional sequences (Hirvela & Belcher, 2016; J. Williams, 2008; see also
Chapter 4, this volume). As more and more classes are oriented toward content-​based instruction or
content and language integrated learning (CLIL), where writing is embedded in an overall instruc-
tional curriculum in an integrated manner, it will be important to investigate how writing tasks can
be situated and sequenced to derive their maximum benefit. It is also possible that the same sequence
may not benefit all learners equally, such that, for example, beginner-​level learners may learn better
in input-​output-​input order, while advanced learners may benefit more effectively from an output-​
input-​output sequence. This relates to the next set of issues on individual differences in L2 learning.

Effects of Individual Difference Factors


Research Task 2: Examine How Learners’ Attentional Allocation, Cognitive
Processes, and Writing Performance Are Affected by Various Individual Difference
Factors Including Motivation, Aptitude, Working Memory, L2 Proficiency,
and Previous Learning Experiences
While learner-external factors are important in stimulating and influencing learners' writing and the learning processes they engage, their influence is mediated by learner-internal factors (see Chapters 11 and 12, this volume) that shape cognitive and affective processes and, as a consequence, generate various individual differences in learning outcomes (see Figure 23.1). Some of the most frequently
researched internal factors in SLA research include motivation, language-​learning aptitude, working
memory, and L2 proficiency. Other less researched factors include learners’ previous L2-​learning
experiences, their beliefs about language learning, and their learning strategies. With regard to attention, we need to know how learners' attentional processes may be engaged differently in the planning, composing, and monitoring stages depending on their profiles along these individual difference variables. Some specific suggestions for research on this basic question are proposed below.

Suggested Empirical Studies


How Do Differences in Learners’ Motivation Affect Their Attentional Allocation,
Their Writing Performances, and Subsequent Learning?
Previous research on learner motivation typically looked at the effects of different motivational
types and differing degrees of motivational intensity on L2 learning in general, yielding findings that motivation is indeed one of the most important factors influencing L2 learning (Ushioda & Dörnyei, 2012). Such research has typically used correlational approaches, examining the degree to which motivation, as measured by questionnaires, is related to L2-learning outcomes, as measured by standard language tests. What is often left unexamined in such studies
is how affective factors and cognitive factors are linked by any specific learner behavior (Swain,
2013). However, as recent research in neuroscience reveals, cognition and affect are intricately
intertwined (Schumann & Wood, 2004), so it will be important to examine more closely how affect
and cognition interact, not just correlate, with each other to bring about language learning. At the
interface of affect, cognition, and learning may lie the mediating role of attention, which is believed
to bridge one’s external and internal worlds.
Added to this, in light of recent views of motivation as being dynamic, fluid, socially-​ and
contextually-situated (Ushioda & Dörnyei, 2012), it will be important to take a more focused approach than has been taken to date and investigate how specific motivational profiles of learners –
whether defined in terms of traditional integrative-​instrumental motivation (Gardner, 2010), inter-
national posture (Yashima, 2009), or L2 motivational self system (Dörnyei & Ushioda, 2009) –​are
implicated in specific learner attentional behavior leading to the use and development of those
attended aspects in L2 writing. For instance, it is possible that a particular type of motivation, say, a desire to be assimilated into the L2-speaking community, may predispose learners to seek out and attend to certain aspects of language (e.g., colloquial features, as opposed to formal registers),
alerting them and orienting their attention to these features and prompting noticing (Schmidt, 2001;
Tomlin & Villa, 1994). Because such effects are hard to observe through the results obtained by
general proficiency tests, researchers need to adopt a more process-​oriented approach to probe the
relationship between particular motivational types and specific learner behaviors. The less pressured
production conditions in writing may allow such possible motivational effects, if any, to become
more easily observable than in the case of speaking. This, however, awaits empirical verification.

How Do Differences in Learners' Aptitude and Working Memory Affect Their Attentional Allocation, Their Writing Performances, and Subsequent Learning?
As with motivation, differences in learners' aptitudinal profiles (or "aptitude
complexes”: Robinson, 2007, 2012) may be related to their attentional behavior and subsequent
use and development of those attended aspects. While it is by now fairly established that aptitude
constitutes an important variable that can account for individual differences in L2 learning (e.g., Li,
2016; Chapter 11, this volume), little is known about how variations in aptitude components are
implicated in what learners attend to and process in input and output. For example, do learners with
strengths in grammatical sensitivity attend to grammatical formulation aspects in the composing
process, resulting in improvement in writing and leading to L2 development? Future research on L2-​
learning aptitude vis-​à-​vis L2 writing would thus need to connect identification of learners’ aptitude
with investigation of their differential effects on writing tasks via attentional processes. It would be
equally illuminating to examine whether and how learners with weaknesses in any of the aptitude
components may shift their attention in an effort to cope with the demands of the writing tasks.
Another important cognitive factor that is likely to influence one’s attentional processes is
working memory. Skehan (2012) argued that working memory is implicated in different stages
of L2 processing. For example, greater working memory capacity may enable more attention to
be directed to monitoring, which can result in a reduced number of errors. Likewise, greater working
memory may enable more attention to be directed to feedback and the incorporation of feedback
into one’s output, leading to greater change in one’s IL held in long-​term memory. Working memory
is also argued to be implicated in task-​based learning in general, where learners are required to
switch attention between form and meaning rapidly in the course of communicative interactions (J.N. Williams, 2012). As previous research in this area has so far produced mixed results, further
research is needed that takes into account potential mediating factors such as L2 proficiency and
task complexity (see Chapter 11, this volume). Notably, a shift in focus is called for from product-​
oriented research that examines the effects of working memory on the learner’s written product to
more process-​oriented research that explores the learner’s attentional processes during writing to
illuminate a crucial link connecting the key constructs under investigation. Ultimately, if written
modality turns out to give any benefit to learners, especially for those with more limited working
memory capacity or weaknesses in any of the aptitude components, such results could inform peda-
gogical applications along similar lines to those offered by research on aptitude-​treatment inter-
action (Ackerman, 2003; Hwu & Sun, 2012).

How Does Learners' L2 Proficiency Affect Their Attentional Processes as They Engage in Different Stages of Writing?
Writing can be a very laborious activity for anybody, but it is especially so for those who are at
lower levels of L2 proficiency. This is because they have to struggle with a number of things at the
same time in the composing process, including generating ideas, organizing them, searching for
and retrieving lexical items that match their intentions, formulating these into sentences, putting
them down as a written product while attending to spelling, and checking to see whether the
output thus produced really matches their intended messages in a cohesive and coherent manner.
Any of these may break down easily in the writing process unless their cognitive resources are
effectively used and managed (Roca de Larios, Nicolás-​Conesa, & Coyle, 2016). Previous research
on the effects of L2 proficiency on L2 writing performance has generally found, not surprisingly, that the greater learners' L2 knowledge, the better their performance in many respects (e.g., Pae,
2018; Tiryakioglu, Peters, & Verschaffel, 2019). However, studies that specifically examined atten-
tional aspects are scarce. An important question for attention-​based research then is what learners at
different levels of L2 proficiency can and do pay attention to in different phases of the composing
process. What do they prioritize if they find themselves overburdened, and what do they put off until later? Do they come to pay attention to those less prioritized aspects at later stages in
the composing process, or perhaps over a longer span of time as their L2 proficiency increases? Or
do they still fail to shift their freed-​up attention, thus failing to improve their writing performance
and their L2 proficiency as a result?
Regarding the relationship between attention and L2 learning, VanPatten (2004) addressed specific issues concerning how intake is derived from input and what psycholinguistic strategies L2
learners of different proficiency levels rely on during input processing. Among those processing
principles that are believed to guide learners’ language acquisition, one principle states that learners
process lexical items –​particularly content words –​before grammatical form (the lexical pref-
erence principle), and another stipulates that, among the grammatical items, learners are more
likely to process non-​redundant meaningful grammatical items before they process redundant less-​
meaningful forms (the preference for non-redundancy principle). It is also argued that it is a function of their developing L2 proficiency that enables learners to attend to grammatical aspects of
language. Importantly, these processing strategies have been used by VanPatten and his colleagues
(see VanPatten, 2004) as a basis for creating their structured input activities known as “processing
instruction” –​a comprehension-​based activity that guides learners to pay attention to the otherwise
hard-​to-​acquire form-​meaning connections. VanPatten’s theory, however, is focused on input, not
output, processing. While it may be reasonable to assume that output processing closely mirrors
input processing tendencies, it still remains to be seen how learners’ attention is enabled and guided
by their developing L2 proficiency when we focus on output, especially in writing. Close exam-
ination of the learners’ attentional focus at different stages of the writing process, as was done for
example in Hanaoka (2012), should reveal what their processing biases may be, and longitudinal or pseudo-longitudinal studies should uncover how their output processing tendencies may or may
not change over time. Such investigations will inform our decisions about whether, when, and how we should teach learners as they struggle with writing tasks.

How Does Learners' Previous Learning Experience Affect Their Attentional Processes as They Engage in Different Stages of Writing?
Still another factor that may affect L2 writing processes is the learners’ previous learning experi-
ence. It is a common observation that some learners have learned the L2 by receiving mostly
grammar-​based formal instruction, while other learners have learned the L2 through exposure
to a large amount of naturalistic L2 input with very little formal instruction. The former type
of learners may be described as those who have learned the L2 in what Long (1991) calls a
focus-on-forms environment, whereas the latter have done so in a contrasting focus-on-meaning
context. Such differences in the learning background are likely to lead these learners to develop
not only different strengths and weaknesses in their L2 skills, but also different expectations and
beliefs about L2 learning in general (Izumi, Shiwaku, & Okuda, 2011; Ogawa & Izumi, 2015).
It is for this reason that Lyster and Mori (2006) were prompted to propose their Counterbalance
Hypothesis, which states that instructional activities and interactional feedback work best if they
act as a counterbalance to the predominant approach to classroom instruction that the learners
received previously. They based their hypothesis on the findings of their research that recasts
as communicative feedback resulted in better uptake and repair for those learners who had thus
far received more grammar-​focused instruction, while prompts worked better for those who had
been placed in more meaning-​focused immersion contexts. While the hypothesis still needs fur-
ther empirical verification, it is an intriguing possibility that familiarity with the prior instruc-
tional type exerts influence on how learners react to current tasks, feedback, and instruction.
In cognitive psychology, it is well known that selective attention characterizes human learning
especially for adults, and that selective attention is itself a learned phenomenon based on our pre-
vious experiences (this is known as “learned attention,” “attentional bias,” or “blocking”; see
N. Ellis, 2012). If so, we need to know more about what the learners’ current attentional bias may
be, how it is influenced by their previous training, and how it influences their engagement in the
current tasks. What happens, for example, when learners who received accuracy-​oriented instruc-
tion previously now receive fluency-​focused training, in contrast to the situation in which learners
who engaged in fluency-​focused training previously now receive accuracy-​focused treatment? In
relation to attention and writing, it will be interesting to investigate in what ways learners with
different previous L2-​learning backgrounds may differ in their current attentional behavior when
engaging in L2 writing tasks. Empirical research that tackles these issues can take the form of
experimental studies that examine how learners with different learning backgrounds behave in
terms of their attention and writing, as well as how they respond to different types of instruction
or feedback given to them. Alternatively, the research can take the form of qualitative case studies
that gather data from a small number of participants to see how they may (or may not) adjust their
attentional behaviors in tackling writing tasks over time.

Conclusion
Attention is believed to be implicated in any aspect of L2 development. Accordingly, shedding
light on the attentional processes involved in L2 writing is essential to understanding the language-​
learning potential of writing. In particular, we believe that the potential advantages of writing associated with its more generous time conditions and the permanent trace of written output deserve to be examined in relation to virtually all issues that have traditionally been investigated primarily with oral data. As writing-to-learn agendas have begun attracting much attention in recent years, a whole range of intriguing issues remain to be investigated, of which we were able to suggest only
a few in this chapter.
To summarize our points briefly, we first discussed general issues regarding attention in SLA
that cut across all areas of investigation. We then focused on task-​related issues as learner-​external
factors and individual difference factors as learner-​internal factors. While these topics are discussed
separately here for conceptual convenience, it is important to stress the need for the researchers to
keep in mind that it is the complex interaction among all these factors that produce much of SLA
outcomes. What is needed in future research is a greater focus on the attentional processes of indi-
vidual learners, which needs to be examined not just at one stage of the writing process or at one
instance of task implementation but also across different stages of the writing sequence and over
time. These process-​oriented studies should be combined with product-​oriented research to further
advance our understanding of the role of attention in L2 writing and learning.

Notes
1 We deliberately decided not to touch on research pertaining to written corrective feedback in this chapter,
due mainly to space limitations and in light of the wealth of research in this area that is discussed exten-
sively in other chapters of this volume. Readers are referred to Chapters 7 and 16 in particular for feedback-​
related issues.
2 The proposed model is situated in a broader socio-​cultural context in which acquisition takes place, which
most likely affects the overall processes inside. However, discussion of this is beyond the scope of the
present chapter and thus will not be touched on here.

References
Ackerman, P.L. (2003). Aptitude complexes and trait complexes. Educational Psychologist, 38(2), 85–​93.
Benson, S., & DeKeyser, R. (2019). Effects of written corrective feedback and language aptitude on verb tense
accuracy. Language Teaching Research, 23, 702–​726.
Dörnyei, Z., & Ushioda, E. (Eds.) (2009). Motivation, language identity and the L2 self. Bristol: Multilingual
Matters.
Ellis, R. (Ed.) (2005). Planning and task performance in a second language. Amsterdam: John Benjamins.
Ellis, N.C. (2012). Learned attention and blocking. In P. Robinson (Ed.), The Routledge encyclopedia of SLA
(pp. 370–​372). New York: Routledge.
Gardner, R.C. (2010). Motivation and second language acquisition: The socio-educational model. New York: Peter Lang.
Gilabert, R., Manchón, R., & Vasylets, O. (2016). Mode in theoretical and empirical TBLT research: Advancing
research agendas. Annual Review of Applied Linguistics, 36, 117–​135.
Godfroid, A. (2019). Eye tracking in second language acquisition and bilingualism: A research synthesis and
methodological guide. New York: Routledge.
Hanaoka, O. (2007). Output, noticing, and learning: An investigation into the role of spontaneous attention to
form in a four-​stage writing task. Language Teaching Research, 11, 459–​479.
Hanaoka, O. (2012). Spontaneous attention to form in EFL writing: The role of output and feedback texts
(Unpublished PhD dissertation). Sophia University.
Hanaoka, O., & Izumi, S. (2012). Noticing and uptake: Addressing pre-​articulated covert problems in L2
writing. Journal of Second Language Writing, 21(4), 332–​347.
Hirvela, A., & Belcher, D. (2016). Reading/​writing and speaking/​writing connections: The advantages of
multimodal pedagogy. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language
writing (pp. 587–​612). Boston: De Gruyter Mouton.
Hwu, F., & Sun, S. (2012). The aptitude-​treatment interaction effects on the learning of grammar rules. System,
40, 505–​521.
Izumi, S. (2002). Output, input enhancement, and the Noticing Hypothesis: An experimental study on ESL
relativization. Studies in Second Language Acquisition, 24, 541–​577.
Izumi, S. (2003). Comprehension and production processes in second language learning: In search of the
psycholinguistic rationale of the output hypothesis. Applied Linguistics, 24, 168–​196.
Izumi, S. (2013). Noticing and L2 development: Theoretical, empirical, and pedagogical issues. In J. M.
Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 25–38). Honolulu, HI: University of Hawai'i, National Foreign Language
Resource Center.
Izumi, S., & Bigelow, M. (2000). Does output promote noticing and second language acquisition? TESOL
Quarterly, 34, 239–​278.
Izumi, S., Bigelow, M., Fujiwara, M., & Fearnow, S. (1999). Testing the output hypothesis: Effects of output on
noticing and second language acquisition. Studies in Second Language Acquisition, 21, 421–​452.
Izumi, S., Shiwaku, R., & Okuda, T. (2011). Beliefs about language learning, learning strategy use, and self-​
efficacy/​confidence of EFL learners with and without living-​abroad experience. Sophia Linguistica, 59,
151–​184.
Lapkin, S., & Swain, M. (2004). What underlies immersion students’ production: The case of avoir besoin de.
Foreign Language Annals, 37, 349–​355.
Lapkin, S., Swain, M., & Smith, M. (2002). Reformulation and the learning of French pronominal verbs in a
Canadian French immersion context. Modern Language Journal, 86, 485–​507.
Leow, R.P. (2000). A study of the role of awareness in foreign language behavior: Aware versus unaware
learners. Studies in Second Language Acquisition, 22, 557–​584.
Li, S. (2016). The construct validity of language aptitude: A meta-​analysis. Studies in Second Language
Acquisition, 38(4), 801–​842.
Lindgren, E., & Sullivan K. (Eds.). (2019). Observing writing: Insights from keystroke logging and hand-
writing. Leiden: Brill.
Long, M. (1991). Focus on form: A design feature in language teaching methodology. In K. de Bot, R. Ginsberg, & C. Kramsch (Eds.), Foreign language research in cross-cultural perspective (pp. 39–52).
Amsterdam: John Benjamins.
Long, M. (2015). Second language acquisition and task-​ based language teaching. Malden, MA:
Wiley-​Blackwell.
Lyster, R., & Mori, H. (2006). Interactional feedback and instructional counterbalance. Studies in Second
Language Acquisition, 28, 269–​300.
Manchón, R.M. (2011). The language learning potential of writing in foreign language contexts: Lessons from
research. In M. Reichelt & T. Cimasko (Eds.), Foreign language writing: Research insights (pp. 44–​64).
West Lafayette, IN: Parlour Press.
Manchón, R.M., & Leow, R.P. (2020). Investigating the language learning potential of L2 writing: Methodological
considerations for future research agendas. In Rosa M. Manchón (Ed.), Writing and language learning.
Advancing research agendas (pp. 335–​355). Amsterdam: John Benjamins.
Manchón, R.M., & Roca de Larios, J. (2007). On the temporal nature of planning in L1 and L2 composing.
Language Learning, 57, 549–​593.
Manchón, R.M., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving nature
of foreign language composing processes: Implications for theory. In R.M. Manchón (Ed.), Writing in for-
eign language contexts: Learning, teaching, and research (pp. 102–​129). Bristol: Multilingual Matters.
Manchón, R.M., & Williams, J. (2016). L2 writing and SLA studies. In R.M. Manchón & P.K. Matsuda (Eds.),
Handbook of second and foreign language writing (pp. 567–​586). Boston: De Gruyter Mouton.
Martínez Esteban, N., & Roca de Larios, J. (2010). The use of models as a form of written feedback to sec-
ondary school pupils of English. International Journal of English Studies, 10, 143–​170.
Ogawa, E., & Izumi, S. (2015). Belief, strategy use, and confidence in L2 abilities of EFL learners at different
levels of L2 proficiency. JACET Journal 59, 1–​18.
Ong, J. (2014). How do planning time and task conditions affect metacognitive processes of L2 writers?
Journal of Second Language Writing, 23, 17–​30.
Pae, T. (2018). Effects of task type and L2 proficiency on the relationship between L1 and L2 in reading and
writing: An SEM approach. Studies in Second Language Acquisition, 40, 63–​90.
Polio, C., & Park, J. H. (2018). Language development in second language writing. In R.M. Manchón & P.K.
Matsuda (Eds.), Handbook of second and foreign language writing (pp. 287–​306). Boston: De Gruyter
Mouton.
Qi, D. S., & Lapkin, S. (2001). Exploring the role of noticing in a three-​stage second language writing task.
Journal of Second Language Writing, 10, 277–​303.
Révész, A., Michel, M., & Lee, M. (2017). Investigating IELTS Academic Writing Task 2: Relationships
between cognitive writing processes, text quality, and working memory. IELTS Research Reports Online
Series, 44. www.ielts.org/for-researchers/research-reports/ielts_online_rr_2017-3
Robinson, P. (1995). Review article: Attention, memory, and the “noticing” hypothesis. Language Learning,
45, 283–​331.
Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential framework for
second language task design. International Review of Applied Linguistics in Language Teaching, 43, 1–​33.


Robinson, P. (2007). Aptitudes, abilities, contexts and practice. In R. DeKeyser (Ed.), Practice in second
language learning: Perspectives from applied linguistics and cognitive psychology (pp. 256–​ 286).
Cambridge: Cambridge University Press.
Robinson, P. (2012). Individual differences, aptitude complexes, SLA processes, and aptitude test develop-
ment. In M. Pawlak (Ed.), New perspectives on individual differences in language learning and teaching
(pp. 57–​76). Heidelberg: Springer.
Roca de Larios, J., Manchón, R.M., & Murphy, L. (2006). Generating text in native and foreign language
writing: A temporal analysis of problem-​solving formulation processes. Modern Language Journal, 90,
100–​114.
Roca de Larios, J., Manchón, R.M., & Murphy, L. (2008). The foreign language writer’s strategic behaviour in
the allocation of time to writing processes. Journal of Second Language Writing, 17, 30–​47.
Roca de Larios, J., Marín, J., & Murphy, L. (2001). A temporal analysis of formulation processes in L1 and L2
writing. Language Learning, 51, 497–​538.
Roca de Larios, J., Murphy, L., & Manchón, R. (1999). The use of restructuring strategies in EFL writing: A
study of Spanish learners of English as a foreign language. Journal of Second Language Writing, 8, 13–​44.
Roca de Larios, J., Nicolás-​Conesa, F., & Coyle, Y. (2016). Focus on writers: Processes and strategies. In
R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 267–​286).
Boston: De Gruyter Mouton.
Rosa, E., & Leow, R. (2004). Awareness, different learning conditions, and second language development.
Applied Psycholinguistics, 25, 269–​292.
Russell, V. (2014). A closer look at the output hypothesis: The effect of pushed output on noticing and inductive
learning of the Spanish future tense. Foreign Language Annals, 47, 25–​47.
Sasayama, S., & Izumi, S. (2012). Effects of task complexity and pre-​task planning on Japanese EFL learners’
oral production. In A. Shehadeh & C.A. Combe (Eds.), Task-​based language teaching in foreign language
contexts: Research and implementation (pp. 23–​42). Amsterdam: John Benjamins.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13,
206–​226.
Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and
awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (Technical
Report No. 9, pp. 1–​63). Honolulu: University of Hawai’i.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–​32).
Cambridge: Cambridge University Press.
Schmidt, R., & Frota, S. (1986). Developing basic conversational ability in a second language: A case study of
an adult learner of Portuguese. In R. Day (Ed.), Talking to learn: Conversation in second language acqui-
sition (pp. 237–​326). Rowley, MA: Newbury House.
Schoonen, R., Snellings, P., Stevenson, M., & Van Gelderen, A. (2009). Towards a blueprint of the foreign lan-
guage writer: The linguistic and cognitive demands of foreign language writing. In R.M. Manchón (Ed.),
Writing in foreign language contexts: Learning, teaching, and research (pp. 77–101). Bristol: Multilingual
Matters.
Schumann, J.H. & Wood, L.A. (2004). The neurobiology of motivation. In J.H. Schumann, S.E. Crowell,
N.E. Jones, N. Lee, S.A. Schuchert, & L.A. Wood (Eds.), The neurobiology of learning: Perspectives from
second language acquisition (pp. 23–​42). Mahwah, NJ: Lawrence Erlbaum.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners’
acquisition of articles. TESOL Quarterly, 41, 255–​281.
Shintani, N., & Ellis, R. (2015). Does language analytical ability mediate the effect of written feedback on
grammatical accuracy in second language writing? System, 49, 110–​119.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2012). Language aptitude. In S. Gass & A. Mackey (Eds.), The handbook of second language
acquisition (pp. 381–​395). New York: Routledge.
Song, M., & Suh, B. (2008). The effects of output task types on noticing and learning of the English past coun-
terfactual conditional. System, 36, 295–​312.
Stefanou, C., & Révész, A. (2015). Direct written corrective feedback, learner differences, and the acquisition
of second language article use for generic and specific plural reference. Modern Language Journal, 99,
263–​282.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in
second language teaching and learning (pp. 471–​483). Mahwah, NJ: Lawrence Erlbaum.
Swain, M. (2013). The inseparability of cognition and emotion in second language learning. Language
Teaching, 46 (2), 195–​207.


Swain, M., & Lapkin, S. (2002). Talking it through: Two French immersion learners’ response to reformula-
tion. International Journal of Educational Research, 37, 285–​304.
Tiryakioglu, G., Peters, E., & Verschaffel, L. (2019). The effect of L2 proficiency level on composing
processes of EFL learners: Data from keystroke loggings, think alouds and questionnaires. In E. Lindgren,
& K. Sullivan (Eds.), Observing writing: Insights from keystroke logging and handwriting (pp. 212–​235).
Leiden: Brill.
Tomlin, R.S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in
Second Language Acquisition, 16(2), 183–​203.
Uggen, M. (2012). Reinvestigating the noticing function of output. Language Learning, 62, 506–​540.
Ushioda, E., & Dörnyei, Z. (2012). Motivation. In S. Gass & A. Mackey (Eds.), The handbook of second lan-
guage acquisition (pp. 396–​409). New York: Routledge.
Van den Branden, K., Bygate, M., & Norris, J. M. (Eds.) (2009). Task-​based language teaching: A reader.
Amsterdam: John Benjamins.
VanPatten, B. (2004). Processing instruction: Theory, research and commentary. Mahwah, NJ: Lawrence
Erlbaum.
Vasylets, O., Gilabert, R., & Manchón, R.M. (2017). The effects of mode and task complexity on second lan-
guage production. Language Learning, 67, 394–​430.
Vasylets, O., Gilabert, R., & Manchón, R.M. (2020). Task complexity and task modality: CAF measures and communicative adequacy. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas. Amsterdam: John Benjamins.
Williams, J. (2008). The speaking-​writing connection in second language and academic literacy development.
In D. Belcher & A. Hirvela (Eds.), The oral/​literate connection: perspectives on L2 speaking, writing, and
other media interactions (pp. 10–​25). Ann Arbor: University of Michigan Press.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second
Language Writing, 21(4), 321–​331.
Williams, J.N. (2012). Working memory and SLA. In S. Gass & A. Mackey (Eds.), The handbook of second
language acquisition (pp. 427–​441). New York: Routledge.
Yashima, T. (2009). International posture and the ideal L2 self in the Japanese EFL context. In Z. Dörnyei & E.
Ushioda (Eds.), Motivation, language identity and the L2 self (pp. 144–​163). Bristol: Multilingual Matters.
Zalbidea, J. (2017). “One task fits all”? The roles of task complexity, modality, and working memory capacity
in L2 performance. The Modern Language Journal, 101, 335–​352.
Zalbidea, J. (2020). Modality in SLA: A mixed-​methods approach to exploring the learning potential of writing
vs. speaking. In R.M. Manchón (Ed.), Writing and language learning. Advancing research agendas (pp.
207–​230). Amsterdam: John Benjamins.

24
DIRECTIONS FOR FUTURE
RESEARCH ON SLA, L2 WRITING,
AND MULTIMODALITY
Jungmin Lim and Matt Kessler
Dankook University and University of South Florida

Introduction
In a 2017 Disciplinary Dialogue appearing in the Journal of Second Language Writing, scholars
both debated and speculated about the potential affordances and drawbacks of multimodal writing
for the purposes of L2 learning. While acknowledging that writing inherently becomes more multimodal as people use digital media, where composing options are open to choice, researchers
have expressed some concerns about using multimodal writing in L2 writing classrooms (e.g.,
Manchón, 2017; Qu, 2017). Such concerns can be summarized as the potential for multimodal composing to distract from language development and to negatively affect academic writing development. These concerns might be rooted in the conventional understanding of language development as the acquisition of target, native-like implicit knowledge. To discuss the potential roles of multimodal writing in the
development of knowledge related to L2 writing, we follow the definition of L2 writing develop-
ment by Polio (2017) that encompasses various areas:

…writing development [is] change over time in any of the following areas related to written
text production: language (e.g., complexity, accuracy, fluency, cohesion, mechanics);
knowledge of different genres; text production processes; metacognitive knowledge and
strategy use; and writing goals and motivation.
p. 261

Using this definition does not mean avoiding an instructional focus on language, for multimodal writing is concerned with linguistic knowledge as one of the resources for meaning making. However, we also find it meaningful to focus on various changes over time in writing, including academic socialization and metadiscourse, as well as linguistic knowledge.
In this chapter, which focuses on the nexus of L2 writing development and SLA, we first briefly
provide a broad overview of the different approaches to multimodal writing and identify the gaps
that future research may address in order to provide empirical evidence as to how L2 learners con-
struct multimodal texts and what they may be able to gain through multimodal writing. We then propose four major areas for future research, along with some key research questions, to better situate multimodal writing research in SLA theory.


Approaches to Multimodal Writing


Research on multimodal writing has mostly been grounded in social semiotics (e.g., Cimasko & Shin, 2017; Jiang, 2018; Tardy, 2005) and systemic functional linguistics (e.g., Archer, 2010; Morell, 2015) and entails the assumption that all modes, linguistic and nonlinguistic resources alike, have equivalent potential for meaning making. Building on this assumption, L2 researchers
have focused on identifying the roles of nonverbal texts as well as linguistic texts in com-
municating meaning. Cimasko and Shin (2017), for example, utilized the ideas of transduc-
tion (i.e., between-​mode changes) and transformation (i.e., within-​mode changes) to discuss an
adult ESL writer’s identity development while creating multimodal texts. Similarly, focusing
on how L2 learners choose and orchestrate different semiotic resources in meaning making,
Anderson, Stewart, and Kachorsky (2017) described how academically marginalized secondary
school students renegotiate their positioning in multimodal writing. While this tradition of research has offered a basis for understanding how L2 writers orchestrate different modes for meaning
making, these studies do not provide empirical evidence as to how such meaning-​making activ-
ities affect learners’ development of linguistic knowledge.
In terms of pedagogical application, researchers working from the perspective of social semiotics have emphasized a strong version of multimodality, especially for content courses (e.g., science for ELLs; Grapin, 2018; Lee, Llosa, Grapin, Haas, & Goggins, 2019), which values different modes equally and does not necessarily prioritize language as an instructional focus. This emphasis on
nonlinguistic modes has triggered some pushback from L2 researchers and instructors against intro-
ducing multimodal writing practices into the writing classroom (e.g., Qu, 2017). A weak version of
multimodality, which sees nonlinguistic modes as compensatory in the learning context (Grapin,
2018), would be more appropriate for language classes because of its emphasis on language. What
is expected from multimodal tasks is their potential to mediate linguistic knowledge development
and to provide opportunities for language practice, as emphasized in Manchón (2017).
While social semiotics prevails as the most frequently adopted theoretical basis in multimodal research, the perspectives of genre and of the cognitive processes of writing provide additional insight into what linguistic knowledge contributes to multimodal writing and how it can be developed through interaction with multimodal texts. Based on genre analysis (Swales, 1990),
researchers have investigated common semiotic choices, ranging from lexicon to metadiscourse in
different discourse communities. Some example studies are D’Angelo (2010, 2016) for academic
posters and Rowley-​Jolivet (2002, 2012) for conference presentations (see Research Tasks 5 and
6 for more details). Because writing development also concerns the expansion of knowledge of
different genres, results from these studies could be adapted for instructional purposes. In spite of
the potential pedagogical implications, researchers have not yet incorporated multimodal genre
analysis into L2 studies.
Additionally, based on the cognitive model of writing (Hayes, 2012), researchers have recently illuminated individual writers' processes of composing multimodal texts. Leijten, Van Waes,
Schriver, and Hayes (2014) conducted an ethnographic study using a keystroke logging system to
update the original cognitive model of writing to accommodate the multimodal nature of contem-
porary writing (see Research Task 2 for more details). This revised model of the writing process essentially shows that real-life writing tasks require writers to construct multimodal texts in which they must integrate their writing and design schemas with their own ideas; however, to date, no multimodal writing study in L2 research has adopted this approach.
Despite this, the cognitive model of writing has been influential in L2 writing research that focuses
on individual writing processes (e.g., Ong & Zhang, 2013; Roca de Larios, Manchón, Murphy, &
Marín, 2008; Sasaki, 2000) and task-​based language teaching (TBLT) studies that investigate the
extent to which task design affects the composing process and text features (e.g., planning time by
Ong & Zhang, 2013 and idea support by Yoon, 2019). As the model has informed how task design induces different levels of cognitive demands, multimodal writing might be discussed in relation to
how writers’ cognitive capacity holds and processes information.
Thus far, we have briefly introduced prevalent and potential perspectives for examining multi-
modal writing, particularly regarding language development. While social semiotic approaches abound in both L1 and L2 research, in terms of relevance to SLA, much work from genre-based
and cognitive approaches has yet to be done. In the next section, we illustrate four themes regarding
what future researchers might address in order to better explore the intersections of multimodal
writing and SLA for the purposes of understanding language and writing development.

Research Agenda

Language Development Through Multimodal Writing


Research Task 1: Conduct Cross-​Sectional and/​or Longitudinal Studies to
Investigate the Effects of Multimodal Composition on Different
Features of L2 Acquisition
As outlined previously, most research surrounding L2 writing and multimodality has tended to
adopt qualitative approaches using case-​study designs, with the intended focus or outcome being
more exploratory or descriptive in nature. While such studies have been valuable for increasing
our understanding of various issues related to learners’ identity formation (e.g., Tardy, 2005) and
learners' engagement in the writing process itself (e.g., Pyo, 2016), there is still a need for future L2
research to explore learners’ linguistic development in studies that adopt quantitative and/​or mixed-​
method designs. To date, there have been few studies that have staged interventions (e.g., Dzekoe,
2017) or used control groups for the purposes of comparisons (e.g., Vandommele et al., 2017), and
even these studies have faced issues with regard to balancing and/or matching control groups.
Most of the studies we have highlighted throughout this chapter have been conducted in the context
of the L2 classroom. As such, in terms of research design, it may be that a case study involving the
analysis of one or a few individuals engaged in a single multimodal writing task is simply easier to execute than a larger-scale experimental study involving one or more entire classrooms; this, combined with the fact that many teachers may be reluctant to adopt or integrate multimodal writing activities into their classrooms, may explain the dearth of such studies in the literature.
Despite these challenges, future research needs to investigate the effects of multimodal com-
position on various features of L2 writing development using both cross-​sectional and longi-
tudinal designs. For example, in terms of short-​term interventions, researchers may wish to
examine the potential impacts of multimodal writing during the pre-​task or planning stages of
the writing process. In SLA, there is a growing body of research centered on investigating
the effects of different pre-​task planning modes (e.g., oral versus text) on learners’ subsequent
L2 writing (e.g., Kessler, Polio, Xu, & Hao, 2020), or conversely, the effects of writing and
text chat on learners’ subsequent oral performance (e.g., Abrams, 2003; Blake, 2009). Future
investigations involving multimodality may wish to continue this line of research by further examining different pre-task multimodal writing activities and their immediate effects on learners' performance in either oral or written language tasks. Some related
questions that researchers may wish to pose include: Do learners tend to transfer more language,
content, etc. from a certain multimodal task type (or platform, tool, etc.) over another? To what
extent does the mode(s) of the pre-​task itself affect measures related to learners’ subsequent
oral or written output? In other words, do certain multimodal pre-​task activities promote the
production of more complex, accurate, and/​or fluent language? And finally, what are learners’
perceptions of engaging in the different multimodal tasks when compared to other pre-​task
planning activities for the purposes of L2 learning?


Beyond the aforementioned examples, as discussed in the introduction to this chapter, some
researchers have expressed concerns that multimodal writing may impede L2 learners' language development and academic writing development. However, it is
important to note that at this point in time, these are theorized concerns, as research has yet to
corroborate them. Therefore, to further investigate these concerns, one of the major needs is for
future studies to adopt longitudinal designs in which the effects of multimodal composition are
compared with those of more traditional writing assignments involving pen and paper and/or basic
word-​processing programs. Designing these longitudinal quantitative and/​or mixed-​method studies
with control groups will undoubtedly prove challenging, so future studies may wish to adopt similar
designs from research in other areas of L2 writing. One such example that might be replicated is
that of Bikowski and Vithanage (2016), who used intact EAP classes to investigate the impacts
of repeated collaborative writing activities on individuals’ L2 writing development. In the study,
which took place over one academic semester, Bikowski and Vithanage integrated the repeated
collaborative writing activities into the EAP program’s existing curriculum and trained the par-
ticipating classroom teachers. As mentioned, since some teachers may be hesitant to alter their
teaching and/or classroom practices to adopt multimodal writing activities, they may perhaps
be more receptive to this type of design. In terms of multimodal writing and longitudinal research
designs, some related research questions include: What are the effects of repeated multimodal
writing assignments on L2 learners’ development? How does this development compare to that of a
control group that uses more traditional means of composition? Do students in each of the groups
show similar language gains?

Multimodal Writing Processes


Research Task 2: Explore L2 Learners’ Multimodal Composing
Processes Compared to Traditional Composition
While many SLA researchers are undoubtedly interested in the effects of multimodal tasks on language development, as listed in the first research agenda item, more attention needs to be directed to the interaction of language with other modes in terms of both the writing process and the written product. This question could be approached from different theoretical standpoints, including social semiotics and the cognitive process of writing. As discussed earlier, more empirical research using
a cognitive approach would provide a chance to connect cognitively oriented SLA researchers
with the current discourse on multimodal writing that has been built upon more post-​modern
perspectives.
For example, Leijten et al.’s (2014) ethnographic study revealed that the cognitive process of
composing multimodal texts is realized through translating a multimodal writing plan that combines
visual schema as well as writing schema. To provide a conceptual representation of the writing
process, which in the real world involves reciprocal processes over a long time, they analyzed
keystroke logs of the focal participant's writing of a business proposal, observation data, and
retrospective interview data. With this longitudinal data, the researchers added some elements to
Hayes’s (2012) model such as design schemas on the control level, a searcher that is related to
source-​based writing on the process level, and text-​and-​graphics-​created-​so-​far on the process
level. The authors summarized that writing in the workplace involves “integrating one’s own texts/​
graphics with ideas based on others' text/graphics" (p. 322). This ethnographic study, which combines millisecond-level quantitative data with observational and interview data, is a good model for follow-up studies to replicate. Given the lack of research on multimodal writing in an L2, some fundamental
questions include: What is the process of multimodal writing for L2 learners? How do writers
utilize resources in linguistic and nonlinguistic modes? How does linguistic knowledge play a role
in the multimodal writing process?


Another cognitive account of multimodal writing can be found in Flower and Hayes's
(1984) Multiple Representation Thesis, in which they attempted to illustrate the ways writers com-
pose a formal prose text from thoughts, or meanings, and stored and accessed multimodal forms.
Simply put, this thesis hypothesized that the distance between the modes of writing and representa-
tion could explain the amount of effort a writer invests in translating and recoding a writing plan, which is a composite of information represented in multiple ways, into written discourse. An interesting argument in
this thesis is that the mode of mental representation is critical to the difficulty of writing prose. For
example, non-verbal imagery that includes auditory and kinesthetic information may be more challenging to put into words than an abstract concept. This alignment of internal
and external representation can be connected to the task manipulation studies that have tried to
hypothesize what makes some tasks more cognitively challenging and how such challenges affect
language (e.g., Kormos, 2011; Vasylets, Gilabert, & Manchón, 2017). For example, researchers
could explore the following questions: What are the cognitive processes L2 writers experience in
planning, drafting, and revising multimodal texts? How do multimodal text construction processes
stimulate writers’ uses of linguistic and nonlinguistic resources?
While this cognitive approach offers some new areas of research, more in-depth analysis of how information represented through each mode and through intermodal relations constructs meaning should continue. For example, Cimasko and Shin (2017) examined how an undergraduate ESL
writer in a first-​year writing class performed on an argumentative monomodal writing task and
a reproduction of the argumentative essay in an animated video. By controlling the prompt, this
study provided valuable insights as to L2 writers’ authorial decisions in different modes of com-
position. Following this line of research, future researchers might aim to answer the following
question: What are the choice- and meaning-making strategies that L2 learners employ during multimodal writing tasks?

Research Task 3: Examine L2 Learners' Collaborative Multimodal Composition Processes from a Variety of SLA Theories such as Interactionist or Sociocultural Approaches
In the introduction to this chapter, we stated that many of the theories and methodological practices
that originated from L1 research have greatly informed research on L2 multimodal composition to
date. As such, some of the L1 lenses or approaches that have been used for examining L2 multi-
modal writing are not particularly conducive to studying the influence of multimodal practices on
language learning. While a handful of L2 multimodal writing studies have adopted SLA theories
(e.g., Dzekoe, 2017 used the concept of noticing from Schmidt, 1990), there is minimal research
that has employed SLA theories for the purposes of examining learners’ interactions. Importantly,
as many of the multimodal tools and platforms used by educators are both digital and conducive
to collaboration (e.g., videos, wikis, blogs), there is a particular need for researchers to adopt
different SLA theories such as interactionist approaches (e.g., Gass, 1997; Long, 1996) or socio-
cultural approaches (e.g., Engeström, 1987; Vygotsky, 1978) to explore the manner in which L2
learners co-​construct multimodal writing projects.
For instance, many L2 collaborative writing studies have explored the concept of languaging
(Swain & Lapkin, 1998), or learners’ discussions about language use and/​or their use of language as
a problem-​solving tool during the collaborative writing process (e.g., Fernández Dobao, 2012). One
interesting question that future scholars may wish to consider in relation to collaborative multimodal
writing is: To what extent do different multimodal writing projects affect learners’ languaging? In
addition to exploring this, researchers may wish to explore related topics such as the effects of
different multimodal writing assignments on learners’ collective scaffolding (e.g., Donato, 1994).
In other words, does the presence or inclusion of certain modes affect the manner in which learners
pool their linguistic resources and work together? This is not to say that there has been no research
in such areas. For instance, a study by Rouhshad and Storch (2016) examined how different col-
laborative writing modes (e.g., writing face-​to-​face versus online through the use of Google Docs)
affected learners’ interactions, finding that the face-​to-​face mode promoted more collaboration
among the L2 learners. Likewise, there have been multiple studies that have explored how face-​
to-​face versus online modes affect the nature of learners’ writing and their overall contributions to
texts, particularly involving the use of wikis (e.g., Li & Zhu, 2017a, 2017b). Despite the insights
gleaned from these studies, there is undoubtedly much more to explore beyond Google Docs, wikis,
and/​or face-​to-​face versus online collaborations, especially with different platforms that utilize and/​
or allow users to integrate different modes. Such future studies may wish to continue with this line
of research specifically by exploring how/​whether the presence of videos, images, and/​or sound,
etc. affects the focus of learners’ interactions. Does the presence (or absence) of such elements and
different modes affect the amount of languaging that occurs? If so, how?
To conclude, as previous studies on the individual and collaborative writing process have
provided bases for designing writing tasks for research and instruction, a better understanding
of what happens in the mind and how collaborators interact with each other when composing
multimodal texts could better guide future multimodal task design. For example, beyond
exploring learner languaging, researchers may also wish to investigate the differences between
multimodal and monomodal writing processes and the potential interplay between linguistic
and nonlinguistic resources in writing, similar to previous studies that have explored the cog-
nition hypothesis (e.g., Ong & Zhang, 2013; Roca de Larios, Manchón, Murphy, & Marín,
2008; Sasaki, 2000; Yoon, 2019). Such studies will offer insights into how linguistic knowledge is
drawn upon and how languaging takes place in the multimodal writing process, and will thus increase
the validity of the arguments made in this line of research.

Multimodal Text Analysis


Research Task 4: Explore the Similarities and/​or Differences Between
Individually Versus Collaboratively Produced Multimodal Writing
Assignments and Texts
This task concerns the need for scholars to explore the differences between individually and collab-
oratively produced multimodal writing projects. While there are a few studies that have explored the
effects of collaborative multimodal writing on various phenomena (e.g., Vandommele et al., 2017),
most of the L2 multimodal explorations thus far have been concerned with describing the processes
and performance of those learners engaged in individual composition (e.g., Pacheco & Smith, 2015;
Pyo, 2016). As referenced in the previous research task, many of the multimodal tools and platforms
used by educators are both digital and conducive to (or sometimes even designed for) collaboration.
Therefore, it is quite interesting that most multimodal writing research thus far has tended to focus
on individual composition. In a research timeline, Storch (2018) outlined seven themes that
scholars have pursued for the purposes of researching L2 collaborative writing, many of which can
be adopted by those scholars interested in comparing individually versus collaboratively produced
multimodal writing texts or assignments. Such related questions may include: Are there differences
in the multimodal text products that learners produce individually versus collaboratively (e.g.,
differences in complexity, accuracy, fluency, or in perceived quality)? Are there differences in
learners’ perceptions of engaging in individual versus collaborative multimodal writing activities
(i.e., do learners feel some activities are more conducive to solo-authorship or collaboration; see
Kessler, 2019 for a related discussion on text co-​ownership)? These are but a few questions that
future researchers may wish to consider exploring.

Research Task 5: Explore the Product of L2 Learners' Multimodal Compositions
Compared to Traditional Composition
This fifth task might be addressed in two ways: exclusive focus on linguistic modes (i.e., oral and/​
or written) or integrative analysis of linguistic and nonlinguistic resources (e.g., D’Angelo, 2016;
Tardy, 2005). First, researchers may want to compare various linguistic features of monomodal and
multimodal texts on a similar genre. There are numerous studies on linguistic features of L2 texts
across tasks and genres (see Johnson, 2017 for a meta-​analysis of the effect of task complexity
on CAF measures). In such studies, researchers use repeated-measures designs with counterbalancing for
comparisons between linguistic features in two tasks and/​or genres. Following this line of research,
scholars may collect monomodal and multimodal samples from participants and investigate what
linguistic features are preferred in the multimodal tasks. For example, Kim, Belcher, and Peyton
(2019) examined the effectiveness of multimodal writing tasks on writing development, and
compared linguistic features (i.e., CAF measures) of the monomodal and multimodal texts college
L2 writers produced. More specifically, the experimental group developed a storyboard including
English scripts and composed a video while the control group did traditional essay writing. Notably,
they found that the complexity and accuracy of the written texts produced in the multimodal task
(i.e., the scripts) were not quantitatively different from those of the control group's traditional
writing task, and that the scripts for the multimodal task were longer than the texts produced for the
traditional writing task. This finding suggests that multimodal tasks may elicit similar forms to
achieve the task goals, for example to make an argument, even when other modes are
available. Kim et al.'s study illustrates how multimodal texts can be studied in comparison
with monomodal texts, but further qualitative examination of the linguistic features is necessary
to provide concrete examples of language (in oral and written modalities) in multimodal texts,
which can inform how writers change their language as additional modes come into play in meaning
construction.
Another area of research is related to the connection between speaking and writing. The speaking-​
writing connection is an important issue that has been addressed in SLA (e.g., Hirvela & Belcher,
2016; Rubin & Kang, 2008; Vasylets, Gilabert, & Manchón, 2017). In terms of empirical research,
there tends to be a comparison between learners’ performance on a speaking task and a writing
task separately. However, in a multimodal writing task, writing and speaking might sometimes be
inseparable. For instance, for a task involving the production of a digital story, a writer may use
strings of words on the screen and write a script for a voiceover. Even if only limited written words
are visible in the final product, there may have been a substantial amount of writing involved
in the multimodal project (e.g., script writing), which might show that writing
can assist speaking and language development. Thus, students’ performance on a multimodal task
may include articulation of carefully planned and written texts and provide another way to investi-
gate writing-​speaking connections. Kim et al.’s (2019) study provides an example of investigating
the writing-​to-​speak dimension. They used scripts that students prepared for the multimodal task to
compare with traditional writing.
Multimodal outcomes involve both linguistic and nonlinguistic modes of communication, and
it has not yet been clearly described what makes some multimodal texts more compelling than
others. As language differs across genres of writing (e.g., Polio & Park, 2016; Yoon & Polio, 2017),
written multimodal genres might determine the types of nonlinguistic and linguistic resources that
are preferred in communication. One way of investigating such a question would be by following
a multimodal genre analysis as D’Angelo (2010, 2016) and Rowley-​Jolivet (2012) have done. For
example, D'Angelo (2016) integrated Kress and Van Leeuwen's grammar of visual design
and Hyland’s (2005) metadiscourse framework. In a convergent parallel mixed-​methods design,
D’Angelo analyzed a multimodal corpus of 120 posters (40 posters from the three disciplines of law,
psychology, and physics), and conducted an online survey with experienced and novice researchers
as well as interviews with 12 researchers. Based on the understanding that metadiscourse could
be realized through both textual and visual modes, she manually annotated textual metadiscourse
markers based on Hyland’s list of metadiscourse markers and coded interactive visual components
with her coding scheme consisting of five categories of interactive resources: information value,
framing, connective elements, graphic elements, and fonts. This study expanded genre research to
cover multimodal texts and exemplified coherent schemes for both images and texts. Following
this line of research, SLA researchers may want to explore how some rhetorical moves that have
only been realized in words in monomodal texts could be expressed through nonlinguistic resources
in multimodal texts. If, for example, L2 learners are able to use different nonlinguistic resources to
achieve some interactional meanings in multimodal texts but cannot express similar meanings in
monomodal texts, it could be the case that developing relevant linguistic repertoires, not raising the
awareness of interaction, needs to be the target of instruction.

Understanding Current Needs and Practices


Research Task 6: Conduct a Needs Analysis of the Types of Multimodal Writing
Required of L2 Learners Within Various Contexts (e.g., Within Various
Fields/​Majors in Academia or in Various Professional Contexts)
At a basic level, one of the most fundamental needs surrounding the nexus of multimodal
writing and SLA research involves the necessity for teachers and researchers to gain a better
understanding of the different types of multimodal writing actually required of L2 learners
within various academic and/​or professional contexts (e.g., Molle & Prior, 2008). As discussed
in the preceding section, in the body of research on L1 and L2 multimodal writing to date, a
number of descriptive studies have attempted to explain various aspects surrounding students’
multimodal writing practices within academia. For example, studies by Rowley-​Jolivet (2002,
2012) and Tardy (2005) have explored phenomena regarding learners’ practices with Microsoft
PowerPoint in terms of visual, rhetorical, organizational, and linguistic choices (e.g., uses of
keywords, salient parts-​of-​speech, etc.). However, in terms of the existing experimental studies
involving L2 learners and multimodal writing, it can be rather challenging to find studies in
which the authors present strong justifications for their choice(s) of multimodal platforms, tools,
and/​or other assignments that they give to their L2 learners.
Understanding the actual multimodal writing needs of students or learners in various contexts
is vital, first, because it has been suggested that the integration of authentic materials and tasks
into curricula may increase L2 learners’ motivation (e.g., Gilmore, 2007; Peacock, 1997). Second,
gaining such an understanding is necessary, because, just as different written genres often have
distinctive linguistic or rhetorical features (e.g., see Bhatia, 1993 for differences in professional
genres or Swales, 2004 for academic/​research genres), so, too, do different genres and/​or forms
of multimodal writing. For instance, Rowley-​Jolivet (2012) conducted a corpus-​based analysis
that examined 931 PowerPoint slides taken from scientific conference presentations. The results
of Rowley-​Jolivet’s analysis revealed that individuals tended to eliminate many of the function
words on their slides, resulting in noun groups that dominated the text. Likewise, most of the text
itself was found clustered in the introduction and conclusion slides, with other visual modes such
as images and graphs making up most of the slides in between.
Returning to the necessity of surveying learners’ multimodal writing needs and its connection to
SLA, if we are to take Rowley-​Jolivet’s study as a vehicle for reflecting on these connections, mul-
tiple questions may arise, such as: Is the language that is being taught in the L2 classroom reflective
of the different types of multimodal writing that students are actually expected to produce during
their schooling and in their subsequent professional careers (e.g., do L2 instructors teach students
the linguistic and rhetorical differences between various types of writing and the writing that is
characteristic of PowerPoint slide presentations, etc.)? Do students need to be explicitly taught such
differences in language use as skills (e.g., DeKeyser, 2015), or is this something that (either L1 or
L2) learners can induce naturally on their own through repeated exposure to the input (e.g., Ellis
& Wulff, 2015)? Understanding the answers to such questions is integral, as many instructors – both
in content- and in L2-based courses across university contexts – often assess their students
on different forms of multimodal writing in the classroom (e.g., PowerPoints, discussion boards,
blog postings, wikis, videos, etc.). As Elola and Oskoz (2017) have pointed out, teachers may
assume that their students are adept users of such current tools and technologies. However, an L2
learner's capacity to create a PowerPoint slide with text and images does not necessarily mean that
the learner understands, even at a basic level, the linguistic or rhetorical differences between
PowerPoint slides and other types of writing; importantly, too, technological capability does not
imply that an L2 learner can effectively use and manipulate language, pictures, video, etc. across
various modes and genres.

Research Task 7: Conduct a Survey to Understand L2 Learners' Current Practices and
Engagement with Multimodal Writing for Learning Beyond the Classroom
While the literature surrounding L2 multimodal writing continues to expand, it should be noted that
most research in the area – like research in the area of computer-assisted language learning – has
been predominantly researcher-driven in the sense that teachers or researchers have dictated
the technologies and types of multimodal writing assignments that students will engage in for the
purposes of research (Ma, 2017). As mentioned earlier, researchers still know very little about what
types of multimodal writing L2 learners are required to use in their everyday lives for academic
and/​or professional purposes. Similarly, even less is known about the types of multimodal writing
that L2 learners may be engaging in for the purposes of learning beyond the classroom.
In terms of learning beyond the classroom, Reinders and Benson (2017) have outlined three pri-
mary questions that L2 researchers need to consider in the future, including: “Where does learning
beyond the classroom take place? How does it take place? [and] How should teachers be involved?”
(p. 561). These questions, too, naturally extend to research involving multimodal writing and SLA.
Such questions might include: Are students engaging in multimodal writing for the purposes of
learning beyond the classroom? If so, where and in what form is this L2 learning taking place?
Additionally, if L2 learners are engaging in multimodal writing for the purposes of learning beyond
the classroom, what is the nature of this learning, or, how is this learning occurring? For instance, are
learning processes different outside of the L2 classroom, or can existing theories (in SLA or beyond)
help to explain the acquisition processes for L2 learning beyond the classroom? Because so little is
known in this regard, basic questions about the nature of these learning processes exist, such as: Is
the learning that occurs incidental or intentional (e.g., Hulstijn, 2008)? And, among other questions,
is the resulting knowledge that has been gained implicit or explicit in nature (e.g., DeKeyser, 2008)?
As Reinders and Benson (2017) have rightly noted, for SLA researchers, “modeling LBC [learning
beyond the classroom] is clearly a long-​term task” (p. 563); however, it is, nonetheless, an important
task that teachers and researchers must grapple with if they are to effectively connect or supplement
students’ existing outside-​of-​class practices with in-​the-​classroom instruction.

Research Task 8: Conduct Research to Understand How Instructors and/or Different
Disciplines/Discourse Communities Assess Different Types of Multimodal Writing Genres
Much of the work undertaken on the assessment of multimodal writing has focused on researchers
generating guiding principles (e.g., Hung, Chiu, & Yeh, 2013) or discussing general challenges
faced by educators (e.g., Yi & Choi, 2015; Yi, King, & Safriani, 2017). There have been no
large-​scale surveys or studies designed to understand teachers’ current classroom practices for
assessing multimodal writing. Although linguistic knowledge is featured prominently in language
classrooms, learners need to know what they are expected to produce and/​or how they are expected
to perform when using nonlinguistic modes along with linguistic mode. Additionally, as Yi and
Choi illustrated, the washback effect of language assessments greatly influences instructional focus
to limit the opportunities to practice multimodal writing. The absence of a basis of interpreting
multimodal writing performance thus has generated teachers’ and students’ reluctance to incorp-
orate the authentic modes of writing.
In addition to the needs of rating rubrics for pedagogical reasons, the lack of some common
understanding of what makes some multimodal texts better than others has been a challenge for
researchers. For L2 writing literature, rubrics have served for systematic analyses of learners’ lin-
guistic development (e.g., Connor-​Linton & Polio, 2014; Jacobs, Zinkgraf, Wormuth, Hartfiel, &
Hughey, 1981). When a new construct is introduced, researchers have developed a new rubric to
account for the construct (e.g., authorial voice in Zhao, 2012; integrated writing ability in Chan,
Inoue, & Taylor, 2015; Cumming, Kantor, & Powers, 2002). Such development of new grading
criteria was guided by target language use domain as language assessment literature guides (e.g.,
Bachman & Palmer, 1996; Norris, 2016) and included mixed methods (e.g., expert panel judgment,
group interview, textual analysis, and statistical analysis of rater reliability). A rubric will be able to
facilitate future research on the relationship between the roles of linguistic mode and nonlinguistic
modes in written communication. In other words, with rubrics of multimodal writing, further research
will be able to explore the traits of language competence and other meaning-​making competences,
for example whether they are interdependent in composing communicative competence.

Conclusion
Writing, as a practice, is inherently multimodal. It combines linguistic resources as well as other
modes such as visual cues. While some have argued that writing has changed from monomodal
to multimodal, it may be more accurate to say that scholarly attention has recently expanded
from focusing on linguistic knowledge to other communication resources. This expansion has
inevitably triggered some concerns from SLA researchers who aim to uncover the mechanisms by
which learners develop and use the target language (i.e., automatized or implicit linguistic knowledge).
However, if we focus on writing development as encompassing changes in linguistic knowledge,
genre knowledge, production processes, and strategy use, this new area of research may generate
many more interesting questions regarding how language learners use their L2 while completing
a multimodal task and how such tasks might enhance the language learning process. There is a
working hypothesis from L1 research that design schemas and writing schemas are both activated
while composing a multimodal text. Seen from the perspective of SLA, L2 researchers may
question whether such parallel activation eases L2 writers' text production processes or demands
more cognitive effort.
Even if we employ the narrower definition of SLA that focuses on linguistic knowledge, in
our opinion, there is currently no evidence that multimodal writing would be detrimental
to language instruction and L2 learning. In fact, some recent case studies and (quasi-)
experimental studies have revealed facilitative effects of multimodal writing on language and
identity development, as summarized in Table 24.1, which outlines the studies suggested for
further reading.
With the four research agendas presented in this chapter, we hope that future research on multi-
modal writing will find firmer grounding in SLA theories and literature. Finally, echoing Manchón's
(2017) words, we call for research on multimodal writing with a greater emphasis on language
and on its potential to create language learning opportunities. Targeting effective multimodal writing
could be, of course, a viable instructional goal. Nevertheless, it should be noted that many L2
learners, especially adults, may already possess unique or advanced multimodal literacy that they
have developed from previous life experience and current everyday practices. In this regard, what
such learners may primarily need to develop is L2 knowledge.

Table 24.1  Summary of the studies for further reading

Cimasko & Shin (2017)
  Focus: Authorial decisions and contextual factors in multimodal design
  Tasks: Reproduction of argumentative essays in animated video or slides
  Methods (Participants and Data): Ethnographic case study; one college ESL writer in English 101;
  her argumentative paper, video transcript, multimedia video, interview transcripts, observation notes

Dzekoe (2017)
  Focus: Effects of a computer-based multimodal poster project on students' revisions
  Tasks: Online multimodal posters
  Methods (Participants and Data): Case study with embedded quantitative data; 22 advanced-low
  proficiency ESL students; survey, students' revision histories, online posters, reflections,
  stimulated recall, final written drafts

Molle & Prior (2008)
  Focus: Genre and needs of EAP graduate students
  Tasks: Authentic writing tasks students performed in their content courses
  Methods (Participants and Data): Needs analysis; international graduate students in an EAP course;
  L1 instructors in the students' disciplines; student texts, class observations (ethnographic methods)

Tardy (2005)
  Focus: Identity development (disciplinarity and individuality) observed in slides
  Tasks: Participants' presentation slides made for their own academic purposes
  Methods (Participants and Data): Case study; 4 international graduate students in an EAP course;
  20-month period (12 slides in total); genre analysis of presentation slides

Vandommele et al. (2017)
  Focus: Effects of collaborative multimodal writing in different settings on writing development
  Tasks: Design of a website that included photo-comic, video-based interview, etc.
  Methods (Participants and Data): Experimental study; 84 novice learners of Dutch: in-class (n=26),
  out-of-school project (n=26), control group (n=32); pre- and post-test performance on traditional
  writing tasks (one narrative, one persuasive)

References
Abrams, Z.I. (2003). The effect of synchronous and asynchronous CMC on oral performance in German. The
Modern Language Journal, 87(2), 157–167.
Anderson, K.T., Stewart, O.G., & Kachorsky, D. (2017). Seeing academically marginalized students’ multi-
modal designs from a position of strength. Written Communication, 34(2), 104–​134. https://​doi.org/​
10.1177/​0741088317699897
Archer, A. (2010). Multimodal texts in higher education and the implications for writing pedagogy. English in
Education, 44(3), 201–​213. https://​doi.org/​10.1111/​j.1754-​8845.2010.01073.
Bachman, L.F., & Palmer, A.S. (1996). Language testing in practice: Designing and developing useful lan-
guage tests. Oxford: Oxford University Press.
Bhatia, V.K. (1993). Analysing genre: Language use in professional settings. New York: Routledge.
Bikowski, D., & Vithanaga, R. (2016). Effects of web-​based collaborative writing on individual L2 writing
development. Language Learning & Technology, 20(1), 79–​99.
Blake, C. (2009). Potential of text-​based Internet chats for improving oral fluency in a second language. The
Modern Language Journal, 93(2), 227–​240.
Chan, S., Inoue, C., & Taylor, L. (2015). Developing rubrics to assess the reading-​into-​writing skills: A case
study. Assessing Writing, 26, 20–​37. https://​doi.org/​10.1016/​J.ASW.2015.07.004
Cimasko, T., & Shin, D.-​S. (2017). Multimodal resemiotization and authorial agency in an L2 writing class-
room. Written Communication, 34(4), 387–​413. https://​doi.org/​10.1177/​0741088317727246
Connor-​Linton, J., & Polio, C. (2014). Comparing perspectives on L2 writing: Multiple analyses of a common
corpus. Journal of Second Language Writing, 26, 1–​9. https://​doi.org/​10.1016/​j.jslw.2014.09.002
Cumming, A., Kantor, R., & Powers, D.E. (2002). Decision making while rating ESL/​ EFL writing
tasks: A descriptive framework. The Modern Language Journal, 86(1), 67–​96. https://​doi.org/​10.1111/​
1540-​4781.00137
D’Angelo, L. (2010). Creating a framework for the analysis of academic posters. Language Studies Working
Papers, 2, 38–​50.
D’Angelo, L. (2016). Academic posters: A textual and visual metadiscourse analysis. Bern: Peter Lang.
DeKeyser, R. (2008). Implicit and explicit learning. In C.J. Doughty & M.H. Long (Eds.), The handbook of
second language acquisition (pp. 313–​348). Malden, MA: Blackwell.
DeKeyser, R. (2015). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second lan-
guage acquisition: An introduction (pp. 94–​112). New York: Routledge.
Donato, R. (1994). Collective scaffolding in second language learning. In J.P. Lantolf & G. Appel (Eds.),
Vygotskian approaches to second language research (pp. 33–​56). Norwood, NJ: Ablex.
Dzekoe, R. (2017). Computer-​ based multimodal composing activities, self-​ revision, and L2 acquisition
through writing. Language Learning & Technology, 21(2), 73–​95.
Ellis, N.C., & Wulff, S. (2015). Usage-​based approaches to SLA. In B. VanPatten & J. Williams (Eds.),
Theories in second language acquisition: An introduction (pp. 75–​93). New York: Routledge.
Elola, I., & Oskoz, A. (2017). Writing with 21st century social tools in the L2 classroom: New literacies,
genres, and writing practices. Journal of Second Language Writing, 36, 52–​60.
Engeström, Y. (1987). Learning by expanding: An Activity Theoretical approach to developmental research.
Helsinki: Orienta-​Konsultit Oy.
Fernández Dobao, A. (2012). Collaborative writing tasks in the L2 classroom: Comparing group, pair, and
individual work. Journal of Second Language Writing, 21(1), 40–​58.
Flower, L., & Hayes, J.R. (1984). Images, plans, and prose: The representation of meaning in writing. Written
Communication, 1(1), 120–​160.
Gass, S.M. (1997). Input, interaction and output in second language acquisition. Mahwah, NJ: Lawrence
Erlbaum.
Gilmore, A. (2007). Authentic materials and authenticity in foreign language learning. Language Teaching,
40, 97–118.
Grapin, S. (2018). Multimodality in the new content standards era: Implications for English learners. TESOL
Quarterly, 53(1), 30–55. https://doi.org/10.1002/tesq.443
Hayes, J.R. (2012). Modeling and remodeling writing. Written Communication, 29(3), 369–​388. https://​
doi.org/​10.1177/​0741088312451260
Hirvela, A., & Belcher, D. (2016). Reading/​writing and speaking/​writing connections: The advantages of multi-
modal pedagogy. In R.M. Manchón & P.K. Matsuda (Eds.), Handbook of Second and Foreign Language
Writing (pp. 587–​611). Berlin: De Gruyter.
Hulstijn, J.H. (2008). Incidental and intentional learning. In C.J. Doughty & M.H. Long (Eds.), The handbook
of second language acquisition (pp. 349–​381). Malden, MA: Blackwell.
Hung, H.T., Chiu, Y.C.J., & Yeh, H.C. (2013). Multimodal assessment of and for learning: A theory-​driven
design rubric. British Journal of Educational Technology, 44(3), 400–​409. https://​doi.org/​10.1111/​
j.1467-​8535.2012.01337.x
Hyland, K. (2005). Metadiscourse: Exploring interaction in writing. London: Continuum.
Jacobs, H., Zinkgraf, S., Wormuth, D., Hartfiel, V., & Hughey, J. (1981). Testing ESL composition: A practical
approach. Rowley, MA: Newbury House.
Jiang, L. (2018). Digital multimodal composing and investment change in learners’ writing in English as a for-
eign language. Journal of Second Language Writing, 40, 60–​72. https://​doi.org/​10.1016/​j.jslw.2018.03.002
Johnson, M.D. (2017). Cognitive task complexity and L2 written syntactic complexity, accuracy, lexical com-
plexity, and fluency: A research synthesis and meta-​analysis. Journal of Second Language Writing, 37,
13–​38. https://​doi.org/​10.1016/​j.jslw.2017.06.001
Kessler, M. (2020). Promoting text co-​ownership and peer interactions in collaborative writing. TESOL
Journal, 11(2), e476. https://​doi.org/​10.1002/​tesj.476
Kessler, M., Polio, C., Xu, C., & Hao, X. (2020). The effects of oral discussion and text chat on L2 Chinese
writing. Foreign Language Annals. https://​doi.org/​10.1111/​flan.12491
Kim, Y., Belcher, D., & Peyton, C. (2019). Writing to make meaning through multimodal composing: Does
it facilitate L2 writing development? Paper presented at the Symposium on Second Language Writing.
Tempe, Arizona.
Kormos, J. (2011). Task complexity and linguistic and discourse features of narrative writing performance.
Journal of Second Language Writing, 20(2), 148–​161. https://​doi.org/​10.1016/​j.jslw.2011.02.001
Lee, O., Llosa, L., Grapin, S., Haas, A., & Goggins, M. (2019). Science and language integration with English
learners: A conceptual framework guiding instructional materials development. Science Education, 103(2),
317–​337. https://​doi.org/​10.1002/​sce.21498
Leijten, M., Van Waes, L., Schriver, K., & Hayes, J.R. (2014). Writing in the workplace: Constructing
documents using multiple digital sources. Journal of Writing Research, 5(3), 285–​337.
Li, M., & Zhu, W. (2017a). Good or bad collaborative wiki writing: Exploring links between group
interactions and writing products. Journal of Second Language Writing, 35, 38–​53. https://​doi.org/​
10.1016/​j.jslw.2017.01.003
Li, M., & Zhu, W. (2017b). Explaining dynamic interactions in wiki-​based collaborative writing. Language
Learning and Technology, 21(2), 96–​120. https://​dx.doi.org/​10125/​44613.
Long, M.H. (1996). The role of linguistic environment in the second language acquisition. In W. Ritchie & T.
Bhatia (Eds.), Handbook of Second Language Acquisition (pp. 413–​468). San Diego, CA: Academic Press.
Ma, Q. (2017). A multi-case study of university students' language learning experiences mediated by mobile
technologies: A socio-​cultural perspective. Computer Assisted Language Learning, 30, 183–​203. https://​
doi.org/​10.1080/​09588221.2017.1301957
Manchón, R.M. (2017). The potential impact of multimodal composition on language learning. Journal of
Second Language Writing, 38, 94–​95. https://​doi.org/​10.1016/​j.jslw.2017.10.008
Molle, D., & Prior, P. (2008). Multimodal genre systems in EAP writing pedagogy: Reflecting on a needs ana-
lysis. TESOL Quarterly, 42(4), 541–​566.
Morell, T. (2015). International conference paper presentations: A multimodal analysis to determine effective-
ness. English for Specific Purposes, 37, 137–​150. https://​doi.org/​10.1016/​j.esp.2014.10.002
Norris, J.M. (2016). Current uses for task-​based language assessment. Annual Review of Applied Linguistics,
36, 230–​244. https://​doi.org/​10.1017/​S0267190516000027
Ong, J., & Zhang, L. J. (2013). Effects of the manipulation of cognitive processes on EFL writers’ text quality.
TESOL Quarterly, 47(2), 375–​398. https://​doi.org/​10.1002/​tesq.55
Pacheco, M.B., & Smith, B.E. (2015). Across languages, modes, and identities: Bilingual adolescents' multi-
modal codemeshing in the literacy classroom. Bilingual Research Journal, 38, 292–​312. DOI: 10.1080/​
15235882.2015.1091051
Peacock, M. (1997). The effect of authentic materials on the motivation of EFL learners. ELT Journal, 51(2),
144–​156.
Polio, C. (2017). Thinking allowed: Second language writing development: A research agenda. Language
Teaching, 50(2), 261–​275. https://​doi.org/​10.1017/​S0261444817000015
Polio, C. & Park, J.-​H. (2016). Language development in second language writing. In R.M. Manchón & P.
K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 287–​306). Berlin: De Gruyter.
Pyo, J. (2016). Bridging in-​school and out-​of-​school literacies: An adolescent EL’s composition of a multi-
modal project. Journal of Adolescent and Adult Literacy, 59(4), 421–​430. https://​doi.org/​10.1002/​jaal.467
Qu, W. (2017). For L2 writers, it is always the problem of the language. Journal of Second Language Writing,
38, 92–​93. https://​doi.org/​10.1016/​j.jslw.2017.10.007
Reinders, H., & Benson, P. (2017). Research agenda: Language learning beyond the classroom. Language
Teaching, 50(4), 561–​578. https://​doi.org/​10.1017/​S0261444817000192
Roca de Larios, J., Manchón, R.M., Murphy, L., & Marín, J. (2008). The foreign language writer’s strategic
behaviour in the allocation of time to writing processes. Journal of Second Language Writing, 17(1), 30–​47.
https://doi.org/10.1016/j.jslw.2007.08.005
Rouhshad, A., & Storch, N. (2016). A focus on mode: Patterns of interaction in face-​to-​face and computer-​
mediated contexts. In M. Sato & S. Ballinger (Eds.), Peer interaction and second language learning (pp.
267–​289). Amsterdam: John Benjamins.
Rowley-​Jolivet, E. (2002). Visual discourse in scientific conference papers: A genre-​based study. English for
Specific Purposes, 21, 19–​40.
Rowley-​Jolivet, E. (2012). Oralising text slides in scientific conference presentations: A multimodal corpus
analysis. In A. Boulton, S. Carter-​Thomas, & E. Rowley-​Jolivet (Eds.), Corpus-​informed research and
learning in ESP: Issues and applications (pp. 137–​165). Amsterdam: John Benjamins.
Rubin, D.L., & Kang, O. (2008). Writing to speak: What goes on across the two-​way street. In D. Belcher &
A. Hirvela (Eds.), The oral/​literate connection: Perspectives on L2 speaking, writing, and other media
interactions (pp. 210–​225). Ann Arbor: University of Michigan Press.
Sasaki, M. (2000). Toward an empirical model of EFL writing processes: An exploratory study. Journal of
Second Language Writing, 9(3), 259–​291. https://​doi.org/​10.1016/​S1060-​3743(00)00028-​X
Schmidt, R.W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–​158.
Storch, N. (2018). Research timeline: Collaborative writing. Language Teaching, 52(1), 40–​59. https://​doi.org/​
10.1017/​S0261444818000320
Swain, M., & Lapkin, S. (1998). Interaction and second language learning: Two adolescent French immersion
students working together. The Modern Language Journal, 82(3), 320–​337.
Swales, J.M. (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge
University Press.
Swales, J.M. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press.
Tardy, C.M. (2005). Expressions of disciplinarity and individuality in a multimodal genre. Computers and
Composition, 22, 319–​336. https://​doi.org/​10.1016/​j.compcom.2005.05.004
Vandommele, G., Van den Branden, K., Van Gorp, K., & De Maeyer, S. (2017). In-​school and out-​of-​school
multimodal writing as an L2 writing resource for beginner learners of Dutch. Journal of Second Language
Writing, 36, 23–​36. https://​doi.org/​10.1016/​j.jslw.2017.05.010
Vasylets, O., Gilabert, R., & Manchón, R. M. (2017). The effects of mode and task complexity on second lan-
guage production. Language Learning, 67(2). https://​doi.org/​10.1111/​lang.12228
Vygotsky, L.S. (1978). Mind in society: The development of higher psychological processes. Cambridge,
MA: Harvard University Press.
Yi, Y., & Choi, J. (2015). Teachers’ views of multimodal practices in K-​12 classrooms: Voices from teachers in
the United States. TESOL Quarterly, 49(4), 838–​847. https://​doi.org/​10.1002/​tesq.219
Yi, Y., King, N., & Safriani, A. (2017). Reconceptualizing assessment for digital multimodal literacy. TESOL
Journal, 8(4), 878–​885. https://​doi.org/​10.1002/​tesj.354
Yoon, H. (2019). The effects of writing task manipulations on ESL students’ performance: Genre and idea
support as task variables. In S. Papageorgiou & K.M. Bailey (Eds.), Global perspectives on language
assessment (pp. 139–​151). New York: Routledge.
Yoon, H., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51(2), 275–​301. https://​doi.org/​10.1002/​tesq.296
Zhao, C.G. (2012). Measuring authorial voice strength in L2 argumentative writing: The develop-
ment and validation of an analytic rubric. Language Testing, 30(2), 201–​230. https://​doi.org/​10.1177/​
0265532212456965

25
DIRECTIONS FOR FUTURE
METHODOLOGIES TO CAPTURE
THE PROCESSING DIMENSION
OF L2 WRITING AND WRITTEN
CORRECTIVE FEEDBACK
Andrea Révész, Xiaojun Lu, and Ana Pellicer-​Sánchez
UCL Institute of Education, University College London

Introduction
In the area of second language writing, there has been an increased interest in exploring and
understanding the cognitive mechanisms that underlie L2 text production (Cumming, 2016; Polio,
2012; Révész & Michel, 2019; Roca de Larios, Nicolás-​Conesa, & Coyle, 2016) and the processing
of written corrective feedback (Bitchener & Ferris, 2012; Bitchener & Storch, 2016). A large part
of this research has adopted a cognitive approach and focused on investigating the behaviors of L2
writers (i.e., the features of the writing and feedback process that can be directly observed) and the
cognitive processes that underlie L2 writing and feedback processing (see Chapters 2 and 3, this
volume). This chapter aims to provide a brief review of the methodological options available to
investigate L2 writing behaviors and processes and L2 feedback processing, and highlight methodo-
logical innovations in the field. We also intend to demonstrate how employing novel data collection
tools and data analysis procedures and using them together in innovative ways can help obtain a
more valid and complete understanding of the L2 writing and feedback process. The chapter starts
with an overview of key theoretical ideas underlying cognitively-​oriented writing and corrective
feedback research and a discussion of methods that can be used to explore L2 writing processes and
feedback processing. Then we turn to a description of the design of four suggested empirical studies
that, through the use of innovative methods, would help deepen our understanding of how L2 text
production processes unfold, the ways in which L2 writing processes change over time, and how
learners deal with the feedback they receive on their written output.

Methodological Options in Investigating


L2 Writing Behaviors and Processes
From its beginnings, cognitively-oriented L2 writing research has been informed by writing models
adopted from first language (L1) writing, with Hayes and Flower’s (1980) model being invoked in
most L2 investigations. Hayes and Flower saw writing as a recursive process of problem solving,
which involves three subprocesses: planning, including idea generation, organization, and goal-​
setting; translation or turning plans into linguistic form; and reviewing. These processes are thought
to respond to the task environment, which is shaped by the task instructions and the evolving text,
while interacting with the writer’s long-​term memory representations. The monitor, another main
component in the model, coordinates this set of writing operations. More recently, L2 researchers
have increasingly drawn on Kellogg’s (1996) model given that his framework, albeit positing the
same primary processes as Hayes and Flower’s, provides more elaborate description of linguistic
encoding operations (lexical retrieval, syntactic encoding, and expression of cohesion), whose exe-
cution often constitutes a key challenge in L2 writing.
To investigate L2 writing processes and by that inform L2 writing models and pedagogy,
researchers have employed various methods, with the bulk of research so far utilizing verbal
protocols. Among verbal report techniques, the think-​aloud procedure, or concurrent verbal
reporting, has probably been the most commonly employed data collection tool (Polio &
Freedman, 2017). During think-​alouds, participants describe their thoughts concurrently while
involved in composing. L2 writing researchers have used this method to examine the nature of
cognitive writing processes (e.g., Manchón, Roca de Larios, & Murphy, 2009), the relationship of
the process to the product of writing (e.g., Van Weijen, Van den Bergh, Rijlaarsdam, & Sanders,
2008), reliance on the L1 when composing (e.g., Wang & Wen, 2002), and the use of writing
strategies (e.g., Yang & Shi, 2003). While think-​aloud studies have considerably contributed to
our understanding of conscious writing processes, the method has been subject to criticisms given
that the act of engaging in thinking aloud may change the actual composing process (e.g., making
writers focus more on planning or accuracy) and, as a result, the writing product. This potential
threat to validity is referred to as reactivity. Several studies have investigated whether reactivity
does indeed pose a validity risk (e.g., Yanguas & Lado, 2012; Yang, Hu, & Zhang, 2014), and they
have yielded mixed findings (Polio & Freedman, 2017). Another methodological threat related
to think-​aloud protocols is veridicality. This means that verbal reports cannot capture all the
thoughts that participants had during writing given that not all cognitive processes are conscious
and thus verbalizable (Polio & Freedman, 2017), and some individuals are better able to think
aloud than others (Barkaoui, 2011).
Another type of verbal protocol, stimulated recall, is often employed by L2 researchers to inves-
tigate writing processes. This method aims to elicit the thoughts that participants had while writing
during a session after task performance (Gass & Mackey, 2016). A prompt such as a video recording
of the writing performance is employed to facilitate recall. L2 researchers, for instance, have used
this method to investigate how individual differences may affect writing processes (Bosher, 1998),
how writing processes change over time (Sasaki, 2004), what strategies L2 writers employ (DeSilva
& Graham, 2015), and how task complexity affects writing processes (e.g., Révész, Kourtali, &
Mazgutova, 2017). Stimulated recall protocols carry a lower risk of reactivity than think-​alouds,
but they pose a greater threat to veridicality. Owing to memory limitations, it is unlikely that
participants can fully and accurately remember all the thoughts they had during writing in a retro-
spective session.
Given the limitations of verbal reports, keystroke logging, alone or together with verbal
protocols, is increasingly employed to examine writing processes. Keystroke-​logging software
records the keystrokes and mouse movements of writers and produces log files which can be used
to extract information about fluency, revision, and pausing (Lindgren & Sullivan, 2019; Van Waes,
Leijten, Lindgren, & Wengelin, 2015). L2 keystroke-​logging studies have looked into how L1 and
L2 writing behaviors differ (e.g., Chukharev-​Hudilainen, Feng, Saricaoglu, & Torrance, 2019;
Stevenson, Schoonen, & De Glopper, 2006; Van Waes & Leijten, 2015); how writing behaviors
change over time (Spelman Miller, Lindgren, & Sullivan, 2008); and how task type (Barkaoui, 2016;
Michel, Révész, Lu, Kourtali, Lee, & Borges, 2020; Spelman-Miller, 2000), proficiency (Barkaoui,
2016), and task complexity (Révész et al., 2017) influence writing behaviors. As compared to verbal
protocols, keystroke logging has the advantage of providing real-​time data without interrupting
writing. A shortcoming of the technique, however, is that, unlike verbal reports, it provides no direct
information about cognitive activities. This makes it challenging to make unambiguous interpret-
ations about cognitive processes based on keystroke measures (e.g., a pause may reflect planning
for content or form) (Galbraith & Baaijen, 2019). Additionally, keystroke logs supply no evidence
about L2 writers' reading behaviors. This is a weakness since rereading previously produced text
to generate new ideas and/or to monitor the evolving text is a key writing process (see Manchón
et al., 2009).
This limitation, however, can be resolved through using keystroke logging together with eye-​
tracking (see Lindgren & Sullivan, 2019). Eye ​tracking records the moment-​by-​moment eye
fixations or eye gazes of writers during composition. Eye-​tracking methodology is based on the
assumption that the location, sequence, and duration of eye fixations mirror the writer’s allocation
of attention and visual processing (Reichle, 2006). That is, the combination of keystroke logging
and eye tracking allows researchers to capture writers’ text production as well as viewing processes.
In the area of L1 writing, eye tracking has been utilized in a number of studies (see Wengelin, Frid,
Johansson, & Johansson, 2019 for a review). However, this technique has rarely been applied
in L2 writing research to date. Révész, Michel, and Lee (2019) were among the first to investigate
the viewing behaviors of L2 writers. Among other things, they examined the position of writers’ eye
fixations during pauses, whether they stayed within the word, clause, sentence, or paragraph prior to
the inscription point. In addition, the eye-​tracking data were triangulated with keystroke logs (e.g.,
position and length of pauses) and comments from stimulated recall interviews (reflecting con-
scious cognitive activities). This combination of sources enabled the researchers to determine how
far writers looked back in their texts and what conscious cognitive activities they engaged in when
they paused at different textual units. Likewise, a study by Gánem-​Gutiérrez and Gilmore (2018)
relied on eye tracking combined with screen-​recordings and stimulated recall to gain insights into
L2 writing processes. It is also worth highlighting Chukharev-​Hudilainen et al.’s (2019) work. They
developed a tool to record time-​aligned logs of eye fixations and keystrokes to examine disfluencies
in L2 writing. In all three studies, triangulating methods enabled the researchers to reach a fuller
picture of the writing process than using a single method would have afforded.

Methodological Options in Investigating L2 Feedback Processing


Although initially research on written feedback was primarily concerned with the effects of feed-
back on L2 outcomes, the past decade has seen an increased interest in exploring how learners pro-
cess written feedback (see Chapter 2, this volume). Underlying much of this research has been the
view that feedback can draw learners’ attention to discrepancies between their interlanguage and
target-​like forms and thereby facilitate learning. In particular, following Schmidt’s (2001) Noticing
Hypothesis, advocates of written feedback have argued that feedback can help learners achieve
noticing, that is, becoming aware of gaps between their non-​target-​like and the corresponding
target-​like forms. Noticing, in turn, has been proposed to be an initial step towards restructuring
in the interlanguage system and acquisition of the target feature (Schmidt, 2001). While the neces-
sity of noticing (attention with awareness) for L2 acquisition is subject to debate, there is a gen-
eral agreement that, for feedback to be effective, learners need to attend to it (see Chapter 24, this
volume). Besides fostering attentional processes, written feedback has been proposed as a way to
engage learners in “guided learning and problem solving and, as a result, promote the type of reflec-
tion that is more likely to foster long-​term acquisition” (Bitchener & Knoch, 2008, p. 415).
To examine the cognitive processes in which learners engage when processing feedback,
researchers have relied on various techniques, including interviews, verbal protocols, written
languaging (see Chapter 3, this volume), and eye tracking. Regarding interviews, Ferris, Liu, Sinha,
and Senna (2013), for example, used semi-​structured retrospective interviews to examine what strat-
egies student writers use to include feedback during revision and how their individual backgrounds
and attitudes might influence their progress. Similarly, Han and Hyland (2013) and Zheng and Yu
(2018) employed semi-​structured interviews to investigate students’ general perspectives on feed-
back processing, coupled with verbal protocols to tap the processes in which learners engaged when
processing specific instances of feedback.
As compared to research on L2 writing processes, verbal protocols have been employed
in fewer written feedback studies. One of the first studies to use the think-​aloud procedure
was Sachs and Polio (2007), who relied on concurrent verbal reports to examine the extent to
which learners’ awareness of reformulations of their original texts would relate to the accuracy
of the revised versions. The researchers also examined the reactivity of the think-​alouds, and
found that those who thought aloud while they processed the reformulations produced a greater
number of accurate revisions (positive reactivity). In addition, the authors highlighted poten-
tial issues with veridicality; for example, the think-​alouds did not make it entirely clear what
learners were aware of during the treatment. In Zheng and Yu’s (2018) study, the concurrent
verbal reports, in combination with the interviews were used to assess participants’ affective and
cognitive reactions to written feedback. This study, however, did not directly address methodo-
logical issues such as reactivity.
Han and Hyland’s (2015) work is among the few studies that have employed stimulated recalls to
examine written feedback processing. Participants were asked to recall their feelings and thoughts
when they received, processed, and used written feedback. As prompts, the researchers used the
participants’ first drafts with feedback and the subsequent final draft. Due to possible memory
decay, these stimulated recall results likely suffered more from veridicality issues than the verbal
protocol data in think-aloud studies, but there was no risk of reactivity.
Written languaging, “the process of making meaning and shaping knowledge and experience
through language” (Swain, 2006, p. 98), is another technique that has been used by researchers to
get a window into feedback processing (see Chapter 7, this volume). It can take place in the oral and
written mode, and involves externalizing one’s thoughts about language and reflecting about them
while engaging in language use. Languaging is now regarded as a useful learning tool, but it was
not until recently that researchers have begun to explore the potential of languaging as a technique
to examine cognitive activities during feedback processing. Suzuki (2012) was the first to employ
written languaging in a study of feedback, investigating whether learners paid attention to grammar
or lexis while responding to feedback. Based on the same dataset, Suzuki (2017) engaged in fur-
ther analyses of the written-​language related episodes, exploring the level of awareness learners
displayed. Following the same procedure as Suzuki (2012, 2017) to elicit written languaging,
Cerezo, Manchón, and Nicolás-​Conesa (2019) used a more elaborate coding scheme to examine
the depth with which learners processed feedback, employing a five-​level coding scheme. These
studies (see further details in Chapters 7 and 16, see also Chapter 24) found written languaging to
be a useful technique to investigate feedback processing, as it allows for collecting data from sev-
eral participants simultaneously and runs a low risk of reactivity. However, it has the disadvantage,
like verbal protocols, of supplying no information about viewing behaviors while learners interact
with feedback.
An innovative study by Shintani and Ellis (2013) addressed this limitation by combining
stimulated recall with eye tracking to compare the attentional processes involved in the processing
of direct written corrective feedback versus written metalinguistic explanation. This combination
enabled the researchers to gain insights not only into participants’ thoughts but also their viewing
behaviors when they processed and used L2 written feedback or metalinguistic information. The
interest areas for the eye-tracking analyses were defined as instances of the target construction
and errors in its use. In the stimulated recall sessions, participants were prompted by the eye-​gaze
recordings to share what they were thinking when they interacted with the feedback or the metalin-
guistic explanation. Through the joint use of the two methods, the researchers were able to assess
not only the amount of attention learners paid to written feedback/​metalinguistic explanation (eye-​
tracking) but also the level of awareness the two techniques triggered (stimulated recall). This study
further demonstrates that a promising way to overcome the weaknesses of various methods is to
triangulate data gathered through multiple data elicitation tools.

A Future Research Agenda


Against this background, we now turn to describing four empirical studies that we believe would
further the fields of writing processes and feedback processing research from both a theoretical and
methodological perspective.

Empirical Study 1
Research Question: What Behaviors and Associated Cognitive
Processes Do L2 Users Engage in During L2 Text Production?
RATIONALE AND AIM
We have earlier suggested that combining various data elicitation methods can help researchers
achieve more thorough and valid insights into the writing process and thereby better inform models
of writing (e.g., Kellogg, 1996) and L2 instruction. Indeed, as outlined above, the few studies that
have adopted a mixed-​methods approach to studying L2 writing processes (e.g., Révész et al., 2017;
Révész et al., 2019) have provided novel and more complete insights into the cognitive activities of
L2 writers. A limitation of previous research, however, is that the dataset in most studies was only
analyzed quantitatively, including group-​level summaries. This is an important research gap given
that, without fine-​grained qualitative data analyses that consider individual-​level data, researchers
may fail to capture the details of how L2 writers develop their text, with individual behaviors being
masked by group analyses (e.g., Chan, 2017; Gánem-​Gutiérrez & Gilmore, 2018). To address this
issue, we would like to call for detailed qualitative studies to gain a fuller picture of the L2 writing
process. The first empirical study that we propose gives an example of how to go about designing
such a study.
Our proposed study would examine L2 writing behaviors and associated cognitive processes
using a multi-​level analysis of verbal protocols, keystroke logging, and eye-​gaze recordings as this
combination of sources would help reveal writers’ thinking processes (stimulated recall) and cap-
ture their real-​time writing (keystroke logging) and viewing (eye-​tracking) behaviors. The study
would focus on differences between more and less proficient L2 writers adopting a qualitative
approach to data analysis.

DESIGN
Given that the study would aim to explore every detail of the text production process, it would be
important that the writing prompt, the typing area, and the timer (if needed) appear on the computer
screen. In the case of integrated writing, it would also be preferable to include the resources (e.g.,
reading) on the screen. This set-​up would allow the researcher to record eye fixations also on the
resources and to use them as prompts in the subsequent stimulated recall session. In addition, it
would be important to select appropriate font size and line spacing (see Conklin, Pellicer-​Sánchez,
& Carrol, 2018, for recommendations), as the purpose of the study would be to investigate detailed
viewing behaviors, that is, what specific areas of the screen the writers are looking at (e.g., par-
ticular words).

During data collection, the participants would be asked to perform the writing task with the
keystroke-​logging program and eye tracker running in the background. Ideally, the keystroke-​
logging and eye-​tracking programs should be synchronized (Chukharev-​Hudilainen et al., 2019)
to facilitate accurate data triangulation. Immediately after completing the writing task, participants
would be invited to take part in a stimulated recall interview prompted by the recording of their
typing and synchronized eye-​gaze behaviors.
The data analysis would involve tabulating the stimulated recall comments in a chronological
order in relation to the data captured in the keystroke logs and eye-​gaze recordings (see Study 2
below for a detailed list of possible eye-​tracking measures). The analyses could, for example, be
informed by Kellogg’s (1996) model of writing.
Table 25.1 provides an example of how the qualitative analysis could be conducted, considering
the performance of a less and a more proficient writer taken from a dataset collected as part of a
larger project (e.g., Michel et al., 2020). Comparing the parts for lower-​ and higher-​proficiency
writers in the first row, it would appear that the two writers processed the task requirements in a similar
way, as they both recognized the importance of the prompt and spent the same amount of time
reading it. However, differences begin to emerge when the verbal protocols are triangulated with
the eye-​gaze data. The comment provided by the more proficient writer is well aligned with their
viewing behavior. There is a discrepancy, however, between the reported and actual behavior of
the less proficient writer. Although they reported that they had read the prompt carefully, the eye-​
gaze data show that they had only looked at the second sentence briefly instead of reading the
entire prompt. Probably as a result, three out of six pauses (#6, #9, and #10) that the less proficient
writer made in the next one minute and a half (shown in the keystroke logs) involved checking the
writing prompt (as reflected in the eye-​movement data). The stimulated recall comments confirmed
that, indeed, the writer paused because they were struggling to understand the task prompt. The
eye-​tracking data further suggest that the writer had a problem with processing the second sen-
tence, given the repeated fixations on it. The more proficient writer, by contrast, left the typing
area only once to go back to the writing prompt to confirm their understanding (#3). After that, as
manifested in the stimulated recall comments, their pauses were associated with paraphrasing, as
the writer attempted to look for lexical alternatives.

Empirical Study 2
Research Question: How Do L2 Writing Behaviors and Associated
Cognitive Processes Change Over Time?
RATIONALE AND AIM
There is also a scarcity of longitudinal research available on writing processes using a mixed-​
methods approach. Therefore, we would like to recommend that future researchers conduct longer-​
term investigations of L2 writing behaviors and processes using a combination of behavioral
techniques and verbal protocols. The second empirical study we propose describes how one such
study could be designed and conducted adopting a quantitative approach.
This study would aim to investigate L2 writing behaviors and associated cognitive processes
over a period of time (for instance, one academic year) employing verbal protocols, keystroke
logging, and eye-​gaze recordings. The study would focus on changes in writers’ behaviors and the
cognitive processes that underlie them by triangulating data at various data points.

DESIGN
A challenge in longitudinal writing research is to design and/or select comparable writing prompts. Otherwise, prompt difficulty can confound the results, making it impossible to decide whether any differences in behaviors or cognitive activities were due to changes in writing ability or were artefacts of prompt difficulty.

Table 25.1  First two minutes of the writing process from a lower- and a higher-proficiency L2 writer

Lower-proficiency L2 writer
1. 00:00, 16s; eye gazes: WQ (S2); SR comment: It was important to understand the question. I was reading it carefully.
2. 00:17, 3s; SR comment: Now I was thinking how to write the beginning. I knew I didn't have to think much but just rephrase the question.
3. 00:21, 12s; eye gazes: TA; keystrokes: Nowadays, it has been [PAUSE] a hot [BACK][BACK][BACK][BACK]; SR comment: I was thinking whether to write 'it has been a hot topic' or 'discuss blablabla'. I was deciding which one to go for. I didn't remember which one was grammatically correct.
4. 00:34, 26s; eye gazes: TA-WQ (S2)-TA; keystrokes: a [PAUSE] const [BACK][BACK] troversy; SR comment: I was thinking what word to use next. Initially I wanted to use important, but then I realised I shouldn't use the same word as in the question, so I changed for 'controversy (controversial)'.
5. 01:01, 2s; eye gazes: TA; keystrokes: whether poe [BACK][BACK] eople should choose w [PAUSE] hat; SR comment: I was going to write 'what they are interested in'. I was also thinking, shall I change it to 'what's their favourite majors' or 'their loved majors'? Then I thought they were too informal.
6. 01:04, 6s; eye gazes: TA-WQ (S2)-TA; keystrokes: they are interested as [PAUSE] a; SR comment: I was still thinking about what the question asked me to do. Because if you learn something you like, it may have nothing to do with your career. Did I understand the question correctly?
7. 01:11, 4s; eye gazes: TA; keystrokes: career [PAUSE] or; SR comment: I was thinking whether it was grammatically correct to use 'whether' here.
8. 01:16, 9s; eye gazes: TA; keystrokes: what promi [BACK][BACK][BACK]; SR comment: I wanted to write 'promise', but I was not sure whether it spelt in this way. I was thinking if there was another way to say 'promise a good future', but I only had that in mind, so I just went for it.
9. 01:26, 24s; eye gazes: TA-WQ (S2)-TA; keystrokes: omises [PAUSE] [BACK]; SR comment: I was again thinking whether I understood the question correctly. I wanted to change the way how the sentence was phrased. It seemed to me the question was asking about choosing the major in the university.
10. 01:51, 23s; eye gazes: TA-WQ (S2)-TA, TA: reading through the text; keystrokes: [BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK] [PAUSE]; SR comment: I deleted 'what promises' 'cause I didn't think I used it correctly. I wanted to be safe, so I was going to copy what was in the question. I was reading the first sentence, 'cause I knew if I started the first sentence wrongly, I wouldn't get a good score.

Higher-proficiency L2 writer
1. 00:00, 16s; eye gazes: WQ; SR comment: I was trying to understand the question. I knew it was very important to understand the question correctly, so I went over it twice. I realised it was asking whether you agree or disagree with something and give examples to support your opinion.
2. 00:17, 8s; SR comment: Then I was thinking how to write the first sentence, to introduce the topic. I knew normally the first paragraph was an introduction. I was thinking how to write it.
3. 00:26, 23s; eye gazes: TA-WQ (S2)-TA; keystrokes: Recently, many [PAUSE]; SR comment: I wanted to make sure how exactly the question stated the issue and then use my own words to phrase it.
4. 00:50, 20s; eye gazes: TA; keystrokes: people [BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK][BACK] an increasing number of people argue that it is more; SR comment: I wanted to use a more formal way to say 'many people'.
5. 01:10, 12s; eye gazes: TA; keystrokes: [PAUSE] [BACK][BACK][BACK][BACK] of more; SR comment: I wanted to use another collocation to say 'more important' in the question.
6. 01:23, 5s; eye gazes: TA; keystrokes: importance to [PAUSE] choose; SR comment: I was thinking how to rephrase the second half of that sentence in the question.
7. 01:29, 28s; eye gazes: TA; keystrokes: to study [PAUSE] the; SR comment: I was phrasing the next bit in my mind.
8. 01:58, 16s; eye gazes: TA: reading through the text; keystrokes: subjects that [PAUSE] one have interest in; SR comment: I was thinking of using a clause to describe the subjects 'that you are interested in'.

Notes: WQ = Writing question, TA = Typing area, WQ (S2) = The second sentence in the writing question. Each numbered entry gives the starting time and duration of the episode; [PAUSE] and [BACK] mark pauses and backspaces in the keystroke log.

In the proposed study, a way to deal with this challenge would be
to design and/​or select an initial pool of potential prompts and pilot the difficulty of these using
a within-​subject design, that is, asking participants to respond to several prompts from the pool
at around the same time. Then, the researcher could select a subset that appears comparable. The
difficulty of the prompts could additionally be judged by experts to triangulate the learner data.
Finally, to control for any remaining prompt effects, the researcher could administer the prompts in
a Latin-​square design.
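As a rough illustration of this counterbalancing step, the following Python sketch (with hypothetical prompt labels) generates a rotation-based Latin square so that, across groups, every prompt appears exactly once at every data-collection point; it is one possible way of implementing the design, not a prescribed procedure.

# Minimal sketch: rotation-based Latin square for assigning four piloted
# prompts (A-D) to four data-collection points across four participant groups.
prompts = ["A", "B", "C", "D"]  # hypothetical prompt labels

def latin_square(items):
    """Return a list of orderings; row i is the assignment for group i."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

for group, order in enumerate(latin_square(prompts), start=1):
    # Each group receives every prompt exactly once, in a different order,
    # so prompt and time point are not confounded at the group level.
    print(f"Group {group}: " + " -> ".join(order))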
Another key issue surrounding the selection of prompts concerns the type of task(s) participants
are asked to carry out. The researcher would need to select task type(s) that are appropriate for
writers at the beginning and end of the study period, taking into consideration that participants
would be likely to develop their writing ability over the span of an academic year.
The data collection would follow the same procedures as Study 1. The only difference would be
that only half of the participants would take part in the stimulated recall sessions. The inclusion of
both stimulated recall and non-​stimulated recall participants would enable the researcher to check
for potential reactivity (i.e., whether participating in stimulated recall affects subsequent perform-
ance) and thereby inform the ongoing methodological debate regarding reactivity in the context of
writing research (Gass & Mackey, 2016).
The keystroke-​logging and eye-​gaze data would be analyzed quantitatively. Using keystroke-​
logging software, measures of speed fluency, pausing and revision behaviors would be obtained to
tap different aspects of the writing process. Some keystroke-​logging software such as InputLog 7
(Leijten & Van Waes 2013) can generate such indices automatically. We would recommend that the
researcher adopts both a lower (e.g., 200 ms) and a higher pause threshold (e.g., 2000 ms) when
computing the fluency and pausing measures (Van Waes & Leijten, 2015). The former would allow
for capturing lower-​level cognitive writing processes, whereas the latter would enable comparison
with previous studies. Speed fluency could be assessed with measures of production rate
(e.g., number of characters typed between pauses) as well as process variance (e.g., standard devi-
ation of number of characters per minute) to encapsulate the multi-​faceted nature of the construct
(Van Waes & Leijten, 2015). Pausing behaviors would be expressed in terms of pause frequency
and pause length.
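A minimal sketch of how such indices might be derived from a keystroke log is given below in Python; the timestamp list, the burst definition, and the two thresholds are illustrative assumptions, and dedicated tools such as Inputlog compute far richer sets of measures.

# Minimal sketch of pause and fluency indices from a keystroke log, assuming
# `timestamps` is a list of key-press times in milliseconds (one entry per
# character produced); real logs would require additional parsing.
from statistics import mean

timestamps = [0, 150, 320, 2600, 2750, 2900, 5400, 5550]  # hypothetical data

def pause_measures(timestamps, threshold_ms):
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    pauses = [iv for iv in intervals if iv >= threshold_ms]
    # Characters produced between successive pauses (a simple burst length).
    bursts, current = [], 1
    for iv in intervals:
        if iv >= threshold_ms:
            bursts.append(current)
            current = 1
        else:
            current += 1
    bursts.append(current)
    return {
        "pause_count": len(pauses),
        "mean_pause_length_ms": mean(pauses) if pauses else 0,
        "mean_chars_per_burst": mean(bursts),
    }

for threshold in (200, 2000):  # the lower and higher thresholds discussed above
    print(threshold, pause_measures(timestamps, threshold))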
Two types of eye-​gaze analyses would be conducted, calculating indices that capture overall
viewing patterns and carrying out more detailed analyses that investigate eye movements during
pausing and revision. This combination would make it possible to obtain both more global and spe-
cific perspectives on the writing process. The interest areas for the more general eye-​gaze analyses
could be the writing window, the prompt, the timer (if present), and the various resources (e.g.,
reading text). Types of measures that could be computed for these interest areas include indices of
fixation duration and counts as well as number and length of saccades (see Brunfaut & McCray,
2015; Michel et al., 2020 for examples). Turning to the more detailed analyses (Révész et al., 2019),
for all pauses, the eye gazes could be coded according to whether writers stayed during the pause at
the inscription point or looked back in the evolving text (e.g., looked at the previous word, phrase,
clause, sentence, or paragraph). For revision, viewing behaviors preceding the revision could be
considered, that is, what areas of the screen writers visited (e.g., word, phrase, clause, sentence, or
paragraph before the inscription point). A particular benefit of these detailed eye-​tracking indices
would be that they could be directly linked to the keystroke logging and stimulated recall data (see
Révész et al., 2019).
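By way of illustration, the following Python sketch aggregates fixation counts and total fixation durations per interest area from a simplified export; the interest-area labels and durations are hypothetical, and eye-tracking software typically provides these indices (and saccade measures) directly.

# Minimal sketch of global interest-area measures, assuming `fixations` is a
# list of (interest_area, duration_ms) pairs exported from an eye tracker;
# the labels below are hypothetical.
from collections import defaultdict

fixations = [("prompt", 220), ("writing_window", 180),
             ("writing_window", 240), ("source_text", 310)]

counts = defaultdict(int)
durations = defaultdict(int)
for aoi, duration in fixations:
    counts[aoi] += 1            # fixation count per interest area
    durations[aoi] += duration  # total fixation duration per interest area

for aoi in counts:
    print(aoi, counts[aoi], durations[aoi])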
The stimulated recall comments would be inspected for emergent categories, which would be
classified into more general categories (see Révész et al., 2019 for example). This stage would also
be informed by models of writing (e.g., Kellogg, 1996). Then, the number of comments per cat-
egory would be summed up to obtain frequency counts for each writer by category.

The final stage of data analysis would involve interpreting the results through triangulating the
various data sources. The researcher would conduct statistical analyses to find relationships among
participants' performance on the various measures. We recommend that mixed effects modeling be
used, as this approach can control for any random variation caused by remaining task differences.
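A minimal sketch of such a model in Python (statsmodels) is shown below, assuming a long-format table with hypothetical column names and values; crossed random effects for prompts would require additional specification (e.g., variance components) or a package such as lme4 in R.

# Minimal sketch of a mixed-effects analysis; the data frame is a toy placeholder.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one pause-rate value per participant per time point.
df = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5", "p6", "p6"],
    "time_point": [1, 2] * 6,
    "pause_rate": [12.1, 10.4, 15.3, 13.8, 9.7, 9.1, 14.0, 12.2, 11.5, 10.9, 13.2, 12.6],
})

# Random intercepts for participants absorb individual variation over time.
model = smf.mixedlm("pause_rate ~ time_point", data=df, groups=df["participant"])
print(model.fit().summary())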

Empirical Study 3
Research Question: What Is the Relationship Between Feedback
Processing and Accuracy in the Revised Text?
RATIONALE AND AIM
As argued earlier, learners’ engagement with written corrective feedback has been under-​explored
in L2 writing research (Han & Hyland, 2015). Most available studies have examined feedback processing using think-alouds (e.g., Sachs & Polio, 2007), interviews (e.g., Zheng & Yu, 2018),
stimulated recalls (Han & Hyland, 2015), and written languaging (e.g., Cerezo et al., 2019; Suzuki,
2012, 2017). Although these methods provide useful insights about learners’ processing of feedback
(see Chapter 7, this volume), they do not inform about L2 writers’ viewing behaviors during feed-
back processing. This drawback can be overcome using eye tracking, but so far few studies (Shintani
& Ellis, 2013) have used this technique to investigate feedback processing. Examining learners’ eye
movements while processing feedback can help researchers to better understand subsequent revi-
sion behavior. The underlying assumption is that, if feedback fosters noticing and attention (see
Chapter 2, this volume), this would be reflected in eye movements with more and longer fixations.
If, as argued earlier, noticing and attention influence the effectiveness of feedback, we might see a
relationship between fixation durations and revision behavior. For example, indirect feedback has
been claimed to prompt deeper cognitive processing than direct corrective feedback (but see Cerezo
et al., 2019). Deeper levels of processing and increased cognitive effort would be reflected in more
and longer fixations (Pickering, Frisson, McElree, & Traxler, 2004), which might then be positively
related to the quality of the revised text. While Shintani and Ellis (2013) relied on eye tracking to
explore feedback processing, the researchers did not directly relate their eye-​tracking indices to the
outcome measures. Thus, the main aim of the suggested empirical study would be to examine the
processing of different types of written corrective feedback and its relationship with the number of
accurate revisions.

DESIGN
The overall design of this empirical study would involve the usual procedure of an initial writing
task, feedback provision (and writers’ processing of feedback provided), and the subsequent revi-
sion of the original text. It would follow a between-​subjects design with participants exposed to
two treatment conditions: 1) direct written corrective feedback, where participants' errors on the chosen target feature are indicated and corrected by providing the target form; and 2) indirect written
corrective feedback, where participants’ errors are only indicated by underlining them or by using
feedback codes. We suggest here examining the processing of these two types of written corrective
feedback as it has been claimed that they might prompt different levels of cognitive processing.
However, researchers might wish to compare other types of feedback. Only one target feature
would be chosen in this study. Studies could potentially be conducted to examine the processing of
different error types.
Data would be collected in two sessions. In Session 1, participants would be asked to complete a
writing task. Since we are interested in the processing of feedback, as opposed to viewing behavior
while producing the written text, the initial writing task could be typed or handwritten. However,
for the appropriate recording of eye movements, the handwritten text would need to be transcribed
(Courier New, min font size 14, triple spacing) and errors corrected in the electronic version of the text (see Conklin et al., 2018, for a discussion of methodological issues in the examination of handwritten production).

Figure 25.1  Example of a spelling error in the two treatment conditions (direct and indirect corrective feedback); areas of interest indicated by boxes. In both conditions the sentence 'The man is croshing the street.' contains the target error; in the direct condition the correction ('crossing') is provided beneath it, whereas in the indirect condition the error is only marked.

This corrected text would then form the stimulus for the eye-tracking
experiment and would be presented on the computer screen. Having participants type the original
writing task would avoid having to go through the transcription stage. In Session 2, participants
would be asked to read their text with the corrections (i.e., direct: target forms; indirect: underlined
errors or feedback codes) while their eye movements are recorded. After reading the feedback, they
would be asked to revise their original writing.
After data collection, the revised written texts would be scored according to the percentage of
errors that are accurately revised. Eye movements to the corrected errors would be extracted and
analyzed. An important methodological decision would be the specific eye-​movement measures to
be used. Existing studies exploring the relationship between eye movements and outcome measures
have typically examined late eye-​movement measures, which reflect more controlled and strategic
processing (see Conklin et al., 2018, for a review of eye-​movement measures).
Another important methodological consideration in this type of research is that the corrected
errors in the two treatment conditions could have different sizes, as illustrated in Figure 25.1. We
would expect that the larger areas of interest in the direct written corrective feedback would receive
more and longer fixations, not because of different underlying cognitive processes but just because
there is more written input to process. In order to account for this potential confounding factor,
eye-​movement data could be normalized by the number of characters or syllables in each area
of interest. The focus of the study would be the relationship between those potential differences
and accuracy in the revisions made. Statistical analyses would be used to explore the correlation
between eye-​movement measures and the percentage of accurate responses.
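The normalization and correlation steps could look roughly like the Python sketch below, here computed over per-error records (they could equally be aggregated per participant); the fixation durations, character counts, and accuracy codes are hypothetical.

# Minimal sketch: length-normalized fixation durations correlated with revision accuracy.
from scipy.stats import spearmanr

records = [
    {"fix_dur_ms": 850, "aoi_chars": 18, "accurate": 1},
    {"fix_dur_ms": 430, "aoi_chars": 9,  "accurate": 0},
    {"fix_dur_ms": 620, "aoi_chars": 12, "accurate": 1},
    {"fix_dur_ms": 390, "aoi_chars": 10, "accurate": 0},
]

# Normalize fixation duration by the size of the area of interest so that
# longer corrections do not inflate the measure.
norm_dur = [r["fix_dur_ms"] / r["aoi_chars"] for r in records]
accuracy = [r["accurate"] for r in records]

rho, p = spearmanr(norm_dur, accuracy)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")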
As hypothesized above, a positive relationship between feedback processing and percentage
of errors revised would be expected, denoting that the increased attention was a sign of deeper
engagement and deeper processing of the feedback, which is later reflected in a higher percentage
of accurate revisions. However, a negative relationship could also be found, indicating that the
increased attention paid to the errors might reflect processing difficulties that are later reflected
in lower scores. Results could also show a lack of relationship between eye movements and per-
centage of accurate responses, reflecting the complexity of subprocesses encoded in eye-​movement
measures and the need to distinguish these subprocesses more clearly, as suggested in other areas of
L2 learning (e.g., see Pellicer-​Sánchez, 2020, for a review).

Empirical Study 4
Research Question: What Behaviors and Associated Cognitive Processes
Do L2 Learners Engage in During Written Feedback Processing?
RATIONALE AND AIM
The examination of the relationship between feedback processing patterns and revision accuracy
explored in Study 3 contributes to our understanding of the processes that might lead to higher
accuracy in revisions. However, as argued above, eye movements in the context of feedback pro-
cessing might encode a range of subprocesses, and eye-​movement data on its own might not allow
us to discern between them. This drawback can be overcome by the combination of eye-movement
measures with stimulated recalls. Shintani and Ellis (2013) used eye movements to examine
how learners attended to the corrections they received in two feedback conditions (i.e., meta-
linguistic explanation and direct corrective feedback), and stimulated recalls to investigate how
learners responded to and handled specific corrections. Results showed that, while there were no
differences at the level of noticing (eye-​movement data), there were differences at the level of
understanding (stimulated recall data). It is important to note though, that the eye-​tracking experi-
ment and stimulated recalls in Shintani and Ellis (2013) were only conducted with six learners and
conclusions were therefore based on descriptive statistics of a small data set.
To address this issue, we would like to call for investigations that specifically focus on the pro-
cessing of feedback combining these two data collection methods. Thus, the fourth empirical study
that we propose here aims at examining the amount of attention and cognitive effort spent pro-
cessing instances of feedback (eye-​movement data) and learners’ actual thinking processes during
feedback processing (stimulated recall data).

DESIGN
This study would involve an initial writing task, feedback provision (and learners’ processing of
the feedback received), and stimulated recall interviews. Two types of feedback that could lead
to different amounts of processing effort would be examined in this investigation, i.e., focused
vs. unfocused direct corrective feedback. The study would follow a between-​subjects design and
participants would be randomly assigned to one of the two treatment groups. Participants in the
focused corrective feedback group would receive correction on only one type of error, whereas
participants in the unfocused corrective feedback group would receive corrective feedback on
different features. Unfocused feedback has been claimed to accord more closely with pedagogical
purposes (Nicolás-​Conesa, Manchón, & Cerezo, 2019; Van Beuningen, De Jong, & Kuiken, 2012).
However, it could be hypothesized that unfocused feedback entails increased cognitive effort, as
learners have to process and understand different types of errors. This increased cognitive effort
might then be reflected in more and longer fixations.
Data would be collected in two sessions. In Session 1, participants would be asked to complete
the writing task. The same writing prompt would be used for both experimental treatment groups.
As explained in the design of Study 3, since the focus of this study is the processing of feed-
back, the initial writing task could be either handwritten or typed. However, asking learners to type
the text would avoid having to go through the transcription process. Errors would be corrected
in the electronic version of the task, which would be the stimulus for the eye-​tracking experiment.
In the second session, participants would be asked to read their initial text with corrections while
their eye movements are recorded. Immediately after the eye-​tracking experiment, participants in
both treatment groups would participate in the stimulated recall interviews. The recordings of their
eye movements would be used as prompts for the stimulated recalls.
Eye movements to the corrected errors would be the focus of the analysis. As explained in
Study 3, one of the main methodological issues would be the need to adjust for size differences
in the corrected errors. In order to adjust for these potential differences, eye movements could be
normalized by number of characters or syllables of each corrected error. Another important meth-
odological consideration when triangulating eye movements and stimulated recall is the specific
eye-​movement measures to use. Since learners’ recalled thoughts are likely to be a reflection of more
controlled and strategic processing, exploring late measures (e.g., total fixation duration, fixation
count) would be an appropriate choice. The stimulated recall data would be analyzed following the
data procedures outlined in the design of Study 2. Finally, statistical analyses would be conducted to
relate eye-​movement measures and the categories that emerged in the stimulated recalls. This final
stage of data analysis would allow researchers to establish connections between amount of attention
and specific cognitive strategies.
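As a simple illustration of this final step, the Python sketch below computes the mean length-normalized fixation duration for each stimulated recall category; the category labels and values are hypothetical placeholders, and inferential tests (e.g., mixed-effects models, as in Study 2) would follow.

# Minimal sketch: linking normalized late eye-movement measures to recall categories.
from collections import defaultdict
from statistics import mean

errors = [
    {"category": "rule_search",   "total_fix_ms": 940, "chars": 16},
    {"category": "noticing_only", "total_fix_ms": 310, "chars": 11},
    {"category": "rule_search",   "total_fix_ms": 720, "chars": 12},
    {"category": "noticing_only", "total_fix_ms": 280, "chars": 9},
]

per_category = defaultdict(list)
for e in errors:
    per_category[e["category"]].append(e["total_fix_ms"] / e["chars"])

# Mean normalized fixation duration per recall category.
for category, values in per_category.items():
    print(category, round(mean(values), 1))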

Conclusion
In this chapter, we have provided a review of methods that can be used to examine L2 writing
processes and feedback processing. Our aim was to demonstrate that combining various data elicit-
ation tools, novel and more traditional, can facilitate a more valid and fuller understanding of the
L2 writing and feedback process. As an illustration, we proposed designs for four future empirical
studies that, through the use of innovative combination of methods (e.g., verbal protocols, keystroke
logging, eye tracking), would help obtain a clearer picture of the ways in which L2 text production
evolves, how L2 writing processes develop over time, the ways in which learners process written
feedback, and how the processing of different types of feedback relates to accurate revisions.

References
Barkaoui, K. (2011). Think-​aloud protocols in research on essay rating: An empirical study of their veridicality
and reactivity. Language Testing, 28, 51–75.
Barkaoui, K. (2016). What and when second-​language learners revise when responding to timed writing tasks
on the computer: The roles of task type, second language proficiency, and keyboarding skills. The Modern
Language Journal, 100, 320–340.
Bitchener, J., & Ferris, D. (2012). Written corrective feedback in second language acquisition and writing.
New York: Routledge.
Bitchener, J., & Knoch, U. (2008). The value of written corrective feedback for migrant and international
students. Language Teaching Research, 12, 409–​431.
Bitchener, J., & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Bosher, S. (1998). The composing process of three Southeast Asian writers at the postsecondary level: An
exploratory study. Journal of Second Language Writing, 7, 205–​241.
Brunfaut, T., & McCray, G. (2015). Looking into test-​takers’ cognitive processes whilst completing reading
tasks: A mixed-​method eye-​tracking and stimulated recall study (ARAGs Research Reports Online; Vol.
AR/​2015/​001). London: The British Council.
Cerezo, L., Manchón, R.M., & Nicolás-​Conesa, F. (2019). What do learners notice while processing written
feedback? In R. Leow (Ed.), The Routledge handbook of second language research in classroom learning
(pp. 173–​187). New York: Routledge.
Chan, S. (2017). Using keystroke logging to understand writers’ processes on a reading-​into-​writing test.
Language Testing in Asia, 10(7), 1–​27.
Chukharev-​Hudilainen, E., Saricaoglu, A., Torrance, M., & Feng, H.-​H. (2019). Combined deployable key-
stroke logging and eyetracking for investigating L2 fluency. Studies in Second Language Acquisition,
41(3), 583–​604.
Conklin, K., Pellicer-​Sánchez, A., & Carrol, G. (2018). Eye-​tracking: A guide for applied linguistics research.
Cambridge: Cambridge University Press.
Cumming, A. (2016). Theoretical orientations to L2 writing. In R.M. Manchón & P.K. Matsuda (Eds.),
Handbook of second and foreign language writing (pp. 65–​88). Boston: De Gruyter Mouton.
De Silva, R., & Graham, S. (2015). The effects of strategy instruction on writing strategy use for students of
different proficiency levels. System, 53, 47–​59.
Ferris, D.R., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22, 307–​329.
Galbraith, D., & Baaijen, V.M. (2019). Aligning keystrokes with cognitive processes in writing. In E. Lindgren
& K. Sullivan (Eds.), Observing writing: Insights from keystroke-​logging and handwriting (pp. 306–​325).
Leiden: Brill.
Gánem-​Gutiérrez, G.A., & Gilmore, A. (2018). Tracking the real-​time evolution of a writing event: Second
language writers at different proficiency levels. Language Learning, 68(2), 469–​506.
Gass, S., & Mackey, A. (2016). Stimulated recall in second language research (2nd ed.). New York: Routledge.
Han, Y., & Hyland, F. (2015). Exploring learner engagement with written corrective feedback in a Chinese
tertiary EFL classroom. Journal of Second Language Writing, 30, 31–​44.
Hayes, J.R., & Flower, L.S. (1980). Identifying the organization of writing processes. In L.W. Gregg & E.R. Steinberg (Eds.), Cognitive processes in writing (pp. 3–30). Hillsdale, NJ: Erlbaum.
Kellogg, R. (1996). A model of working memory in writing. In M. Levy & S. Ransdell (Eds.), The science of
writing: Theories, methods, individual differences, and applications (pp. 57–​72). Mahwah, NJ: Lawrence
Erlbaum Associates.
Leijten, M., & Van Waes, L. (2013). Keystroke logging in writing research: Using Inputlog to analyze and
visualize writing processes. Written Communication, 30, 358–​392.
Lindgren, E., & Sullivan, K. (Eds.). (2019). Observing writing: Insights from keystroke logging and hand-
writing (Vol. 38). Leiden: Brill.
Manchón, R., Roca de Larios, J., & Murphy, L. (2009). The temporal dimension and problem-​solving nature
of foreign language composing processes: Implications for theory. In R. Manchón (Ed.), Writing in foreign
language contexts: Learning, teaching, and research (pp. 102–​124). Bristol: Multilingual Matters.
Michel, M., Révész, A., Lu, X., Kourtali, N., Lee, M., & Borges, L. (2020). Investigating L2 writing processes
across independent and integrated tasks: A mixed methods study. Second Language Research, Advance
Access. https://​doi.org/​10.1177/​0267658320915501
Nicolás-​Conesa, F., Manchón, R.M. & Cerezo, L. (2019). The effect of unfocused direct and indirect written
corrective feedback on rewritten texts and new texts: Looking into feedback for accuracy and feedback for
acquisition. The Modern Language Journal, 103(4), 848–873.
Pellicer-​Sánchez, A. (2020). Expanding English vocabulary knowledge through reading: Insights from eye-​
tracking studies. RELC Journal, 51(1), 134–146.
Pickering, M.J., Frisson, S., McElree, B., & Traxler, M. J. (2004). Eye movements and semantic composition.
In M. Carreiras and C. Clifton (Eds.), The on-​line study of sentence comprehension: Eyetracking, ERPs and
beyond (pp. 33–​50). New York: Psychology Press.
Polio, C. (2012). Second language writing. In S. Gass & A. Mackey (Eds.), Handbook of second language
acquisition (pp. 319–​334). New York: Routledge.
Polio, C., & Freedman, D. (2017). Understanding, evaluating and conducting second language writing
research. New York: Routledge.
Reichle, E.D. (2006). Theories of the “eye-​mind” link: Computational models of eye-​movement control during
reading. Cognitive Systems Research, 7, 2–​3.
Révész, A., Kourtali, N.-​E., & Mazgutova, D. (2017). Effects of task complexity on L2 writing behaviors and
linguistic complexity. Language Learning, 67(1), 208–​241.
Révész, A., & Michel, M. (2019) (Eds). Methodological issues in investigating second language writing pro-
cess. Special Issue. Studies in Second Language Acquisition, 41, 491–​645.
Révész, A., Michel, M., & Lee, M. (2019). Exploring second language writers’ pausing and revision
behaviours: A mixed-​methods study. Studies in Second Language Acquisition, 41, 605–​631.
Roca de Larios, J., Nicolás-​Conesa, F., & Coyle, Y. (2016). Focus on writers: Processes and strategies. In
R.M. Manchón & P.K. Matsuda (Eds.), Handbook of second and foreign language writing (pp. 267–​286).
Boston: De Gruyter Mouton.
Sachs, R., & Polio, C. (2007). Learners’ uses of two types of written corrective feedback on a L2 writing revi-
sion task. Studies in Second Language Acquisition, 29, 67–​100.
Sasaki, M. (2004). A multiple-​data analysis of the 3.5-​year development of EFL student writers. Language
Learning, 54, 525–​582.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognitive and second language instruction (pp. 3–​32).
Cambridge: Cambridge University Press.
Shintani, N., & Ellis, R. (2013). The comparative effect of direct written corrective feedback and metalin-
guistic explanation on learners’ explicit and implicit knowledge of the English indefinite article. Journal of
Second Language Writing, 22, 286–​306.
Spelman Miller, K. (2000). Academic writers on-​line: Investigating pausing in the production of text. Language
Teaching Research, 4, 123–​148.
Spelman Miller, K., Lindgren, E., & Sullivan, K.P.H. (2008). The psycholinguistic dimension in second language
writing: Opportunities for research and pedagogy using computer keystroke logging. TESOL Quarterly, 42,
433–​453.
Stevenson, M., Schoonen, R., & De Glopper, K. (2006). Revising in two languages: A multi-​dimensional
comparison of online writing revisions in L1 and FL. Journal of Second Language Writing, 15, 201–​233.
Suzuki, W. (2012). Written languaging, direct correction, and second language writing revision. Language
Learning, 62(4), 1110–​1133.
Suzuki, W. (2017). The effect of quality of written languaging on second language learning. Writing &
Pedagogy, 8(3), 461–​482.
Swain, M. (2006). Languaging, agency and collaboration in advanced second language proficiency. In H.
Byrnes (Ed.), Advanced language learning: The contribution of Halliday and Vygotsky (pp. 95–​108).
London: Continuum.
Van Beuningen, C.G., De Jong, N., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive error
correction in second language writing. Language Learning, 62(1), 1–41.
Van Waes, L., & Leijten, M. (2015). Fluency in writing: A multidimensional perspective on writing fluency
applied to L1 and L2. Computers & Composition, 38, 79–​95.
Van Waes, L., Leijten, M., Lindgren, E., & Wengelin, Å. (2015). Keystroke logging in writing research: Analyzing
online writing processes. In C. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of writing research
(2nd ed.) (Vol. 2, pp. 410–​426). New York: Guilford Press.
Van Weijen, D., Van Den Bergh, H., Rijlaarsdam, G., & Sanders, T. (2008). Differences in process and process-​
product relations in L2 writing. ITL-​Review of Applied Linguistics, 156, 203–​226.
Wang, W., & Wen, Q. (2002). L1 use in the L2 composing process: An exploratory study of 16 Chinese EFL
writers. Journal of Second Language Writing, 11, 225–​246.
Wengelin, A., Frid, J., Johansson, J., & Johansson, V. (2019). Combining keystroke logging with other
methods: Towards an experimental environment for writing process research. In E. Lindgren & K.
Sullivan (Eds.), Observing writing: Insights from keystroke-​logging and handwriting (pp. 30–​49). Leiden/​
Boston: Brill.
Yang, C., Hu, G., & Zhang, L.J. (2014). Reactivity of concurrent verbal reporting in second language writing.
Journal of Second Language Writing, 24, 51–​70.
Yang, L., & Shi, L. (2003). Exploring six MBA students’ summary writing by introspection. Journal of English
for Academic Purposes, 2(3), 165–​192.
Yanguas, I., & Lado, B. (2012). Is thinking aloud reactive when writing in the heritage language? Foreign
Language Annals, 45(3), 380–​399.
Zheng, Y., & Yu, S. (2018). Student engagement with teacher written corrective feedback in EFL writing: A
case study of Chinese lower-​proficiency students. Assessing Writing, 37, 13–​24.

26
DIRECTIONS FOR FUTURE USE
OF EXISTING CORPORA
IN THE STUDY OF L2 WRITING
Shelley Staples, Adriana Picoral, Aleksey Novikov,
and Bruna Sommer-​Farias
University of Arizona, University of Arizona, University of Arizona,
and Michigan State University

Introduction: A Brief Historical Perspective


Analysis of learner corpora for the purposes of examining second language acquisition (SLA) and
development in writing started in the early 1990s, culminating in the creation of the first widely
available learner corpus, the International Corpus of Learner English (ICLE) (Granger, Dagneaux,
Meunier, & Paquot, 2002, 2009). The ICLE research team at Université catholique de Louvain,
Belgium, most notably Sylvianne Granger, contributed two major strands of research in rela-
tion to second language development in these early studies: Contrastive Interlanguage Analysis
and Computer-​Aided Error Analysis (Granger, 1996). Contrastive Interlanguage Analysis (CIA)
compares learner data (e.g., ICLE) with L1 data (e.g., LOCNESS, a corpus of student essays
loosely constructed to be similar to the ICLE but produced by British and American L1 English
writers). Early CIA assumed that the L1 is the target for L2 writers and that learners
“overuse” and “underuse” phenomena in relation to the L1. Computer-​Aided Error Analysis (CEA)
takes the accuracy of learner constructions as the focus of analysis, utilizing interactive tools for
coding errors in second language writing (SLW). These two approaches have led to valuable insights into the ways in which
learner language differs from L1 target language, and have been used to inform both pedagogy
and test development for SLW (Granger, 2015). However, both approaches have somewhat prob-
lematic relationships with current perspectives on SLA. CIA in particular has come under attack
by many SLA researchers for its use of the comparative fallacy and native-speaker norms. Granger
(2015) directly addressed this criticism, calling for the use of “reference language varieties” rather
than “native language varieties” to account for expert/​novice distinctions as well as a range of var-
ieties (including ELF) that learners may see as a target. However, the continued use of the terms
“overuse” and “underuse” suggests a prescriptive view of learner language in comparison with
“reference language varieties.” A less prescriptive approach which still allows for useful statistical
comparison between L1 and L2 groups can be to simply refer to the two groups descriptively (L1
and L2), and to the descriptive differences as “more” or “less” use of a feature.
At the same time as this line of research was developing, scholars also began using learner cor-
pora to investigate SLW from the perspective of genre and register. These studies, while they often
include comparisons between learners' L1 and L2, focus on describing how SLW is influenced by
contextual and discourse-​level factors, emphasizing a functional approach to language use and
SLW development (Flowerdew, 2003; Hyland, 2004; Reppen & Grabe, 1993). Most of these studies
do not explicitly align themselves with a particular SLA theory or approach; however, Flowerdew
(2003) specifically promotes the combination of learner corpus research with Systemic Functional
Linguistic models of language development.
From the early 2000s on, researchers have also turned their attention to cross-​sectional designs
of learner language at different proficiency levels (Grant & Ginther, 2000) and, more recently, lon-
gitudinal designs (Belz & Vyatkina, 2005; Byrnes, Maxim, & Norris, 2010; Connor-​Linton & Polio,
2014; Vyatkina, Hirschmann, & Golcher, 2015). Multivariate approaches to the study of SLW have
increased as well, moving analyses beyond individual variables to constellations of variables that
are related to L2 development, using factor analysis and cluster analysis (Biber, Gray, & Staples,
2016; Friginal, Li, & Weigle, 2014; Gries, 2018; Jarvis, Grant, Bikowski, & Ferris, 2003; Weigle
& Friginal, 2015; Yan & Staples, 2020). In addition, SLW research outside of English has begun
to flourish, with studies on topics such as the development of L2 pragmatics through use and awareness of
German modal particles (Belz & Vyatkina, 2005) and genre-​based curricular design for advanced
writing development in foreign languages (Byrnes et al., 2010).
The lack of an explicit theoretical focus in many learner corpus studies derives from the descrip-
tive nature of corpus linguistics. While no research, of course, is atheoretical, corpus linguists
tend to focus more on empirical data analysis rather than theory building. In part, for this reason,
since the early 2000s, we see learner corpus studies that adopt a variety of perspectives on SLA
research. One of the most widely used SLA theories in learner corpus research is a usage-​based/​
connectionist approach (e.g., Ellis, O’Donnell, & Römer, 2013; Wulff, 2016). Within this para-
digm, verb-​argument constructions (VACs) have recently become an important site for research,
since they are constructions at the “cornerstone of the syntax-​semantics interface” that allow users
to gain knowledge of language based on “inference of syntactic and semantic bootstrapping” (i.e.,
creation of exemplars based on various iterations of these forms from the input). Examples of VACs
include patterns like V across N (swim across a lake) or V of N (think of an example). Both Ellis
et al. (2013) and Wulff (2016) illustrate how VACs are used differently in writing than in speech.
Building on previous research that shows a preference for that deletion in complement clauses in
speech when compared to that deletion in L1 English writing, Wulff (2016) shows that L2 English
writers (from German and Spanish backgrounds) follow this pattern even more strongly than L1
writers, dropping the complementizer less frequently in writing.
Also within a usage-​based paradigm, researchers have started to employ dynamic systems
theory in their corpus-​based studies (e.g., Verspoor, Schmid, & Xu, 2012). Verspoor et al. (2012)
examine a cross-​sectional corpus of Dutch L2 English writers’ narrative texts. They compare group
level means (Levels 1–4) on a variety of syntactic and lexical measures (which represented com-
plexity, accuracy, and fluency, CAF), but focus on change between subsequent levels rather than
discrimination among levels in their findings. They found that the changes varied depending on the
level: between Levels 1 and 2, mainly lexical changes occurred; for Levels 2 and 3 mainly syntactic
changes; and Levels 3 and 4 showed changes in both sets of measures.
Others have sought to connect cognitive models of complexity to linguistic complexity (e.g.,
Bulté & Housen, 2014). Bulté and Housen (2014) provide a useful review of recent complexity
research in SLW, emphasizing that scholars often examine linguistic complexity, operationalized
as quantitative complexity (more is more complex), alongside cognitive complexity (difficulty).
Attempting to disambiguate these two types of complexity, they examine L2 English writing
at the beginning and end of an intensive English program (IEP). They focus on syntactic complexity
(using length and ratio-​based measures) and lexical complexity (e.g., D), showing that many of
the features are positively correlated with both time (more features used after the IEP) and holistic
writing quality (more features associated with higher scores).

As mentioned previously, Systemic Functional Linguistics and other functional approaches to
SLW development have had a close relationship to corpus research (e.g., Byrnes, Maxim, & Norris,
2010). Byrnes et al. (2010), for example, rely on SFL to argue for an approach to SLW develop-
ment that incorporates a textual, meaning-​oriented approach to grammar, functional interpretation
of language, and an acknowledgment of the impact of registers and genres on language used in
writing. They provide a detailed list of lexico-​grammatical features (e.g., present perfect and time
expressions) as well as stages of a genre (e.g., narrative genres) to suggest developmental sequences
and curriculum arcs for four levels within a foreign language program.
Other genre/​register-​based approaches have recently emerged in the study of SLW, particularly
from the work of Biber and colleagues, but others as well (e.g., Mazgutova & Kormos, 2015; Qin
& Uccelli, 2016). Similar to Byrnes et al. (2010), Biber et al. (2016) make the case that register,
in this case encompassing both mode (speaking/​writing) and task type (integrated/​independent),
should be taken into consideration when evaluating language development. They also argue for a
multidimensional approach towards investigating grammatical complexity, using factor analysis
to capture co-​occurring linguistic features that are used by writers at lower and higher proficiency
levels. Their study, which focuses on a corpus of TOEFL iBT essays, illustrates that features that
are often included as “more complex” in complexity frameworks (e.g., finite adverbial clauses)
are actually used more in speaking than writing, in independent rather than integrated tasks, and in
lower scoring responses. Higher-​scoring, integrated, written tasks, on the other hand, tend to have
more features associated with noun-​phrase complexity (more premodification and use of prepos-
itional phrases).
Finally, cross-​linguistic influence (CLI) has seen a resurgence in popularity alongside the devel-
opment of new corpora for examining this phenomenon (e.g., Golden, Jarvis, & Tenfjord, 2017;
Lu & Ai, 2015; Paquot, 2017). Golden et al. (2017) contribute to the study of CLI in SLW with
their edited collection on ASK (Norsk andrespråkskorpus), a corpus of adult immigrant learners of
Norwegian, which includes texts from ten different L1 backgrounds. Golden (2017), for example,
argues that a learner’s choice to use emotion words when writing in Norwegian is related to whether
the concept can be represented by one major equivalent term in the learners’ L1.
As can be seen, theoretical approaches using corpora range from the more socially-​oriented (SFL
and other genre/​register-​based approaches) to more cognitively-​oriented approaches (usage-​based/​
connectionist approaches and dynamic systems theory) to more linguistically (typologically) oriented
approaches (e.g., cross-​linguistic influence). Notably, most studies that use existing corpora to investi-
gate language development often use corpus linguistics to explore writers’ complexity, accuracy, lexis,
and fluency (CALF). We now turn to empirical studies that focus on SLW development from a variety
of theoretical perspectives. These studies span linguistic levels of lexis, grammar, lexico-​grammar, and
discourse.

Previous Empirical Research

Lexis
Corpus-​based studies emphasize the development of lexical complexity in the form of sophistica-
tion and diversity. Lexical sophistication has been traditionally operationalized by using frequen-
cies from reference corpora, with lower frequency lexical items considered to be more sophisticated
(Fellner & Apple, 2006; Laufer & Nation, 1995; Malvern, Richards, Chipere, & Durán, 2004).
Other measures of lexical sophistication are based on human ratings in experimental studies with
native-​speaker participants, and encompass judgment of concreteness (i.e., more abstract words
are more sophisticated), and reaction times for word recognition (i.e., less sophisticated words are
recognized faster), among others (McNamara, Crossley & McCarthy, 2010). Measures of lexical
sophistication have been shown to correlate with SLW development expressed longitudinally
(Fellner & Apple, 2006) and cross-​sectionally (McNamara et al., 2010).
Lexical diversity usually refers to the number of unique words in a text (Malvern et al., 2004).
This measure was traditionally operationalized as type-​token ratio or TTR, which consists of
the proportion of unique words (i.e., types) to the total number of words (i.e., tokens) in a text,
usually modified in some way to account for the effect of text length (Kuiken & Vedder, 2007).
TTR has been shown to correlate with language ratings, with higher rated essays displaying a
higher TTR (Staples & Reppen, 2016). Other more recent measures of diversity, including the
D-​value and ratio of pronouns to noun phrases, have been shown to be indicative of SLW pro-
ficiency (Malvern et al., 2004).
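For readers less familiar with these indices, the short Python sketch below computes a raw TTR and one common length-corrected variant (root TTR, equivalent to Guiraud's index) for a toy text; the tokenization is deliberately naive and the sample sentence is invented.

# Minimal sketch of type-token ratio and a length-corrected variant.
import re

text = "The students write and the students revise what they write."
tokens = re.findall(r"[a-z']+", text.lower())
types = set(tokens)

ttr = len(types) / len(tokens)              # raw type-token ratio
root_ttr = len(types) / (len(tokens) ** 0.5)  # root TTR, less sensitive to text length

print(len(types), len(tokens), round(ttr, 2), round(root_ttr, 2))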

Phraseology
The main distinction between lexical and phraseological corpus studies is that phraseology takes
into account lexical co-​occurrence, such as collocations (e.g., two words), lexical bundles (e.g.,
three or four sequential words), and other multi-​word sequences (see also Yoon, this volume).
These features are often operationalized as a component of learners’ complexity of language use,
but can also be considered as an aspect of fluency, in the case of multi-​word sequences (i.e., more
fluent writers use formulaic sequences to facilitate their abilities to produce more writing). In add-
ition, a novel approach by Polio and Yoon (2021) to accuracy measurement uses collocations from
an L1 corpus to identify accurate and inaccurate phrases by L2 writers.
Within multi-​word sequences, fixedness (usually operationalized as the percentage of words that
vary within a sequence of “slots”) has been shown to decrease with proficiency (Staples, Egbert,
Biber, & McClair, 2013). For example, the lexical bundle on the other hand is relatively fixed (>
50% of the time, these words do not vary) compared to the nature of the, in which the second slot is
often (> 50% of the time) filled by other words, such as end. Proficiency has also been linked with
strength of association, as L2 writers use less frequent but more strongly associated collocations at
higher proficiency levels (Paquot, 2019).
L2 writers also seem to rely on multi-​word units from their L1 when writing in an L2 (Pan,
Reppen, & Biber, 2016). The use and frequency of phrases in SLW, however, has been shown to be
nonlinear across L2 development (Staples et al., 2013). For example, Staples et al. (2013) found an
increase in the frequency of multi-​word units across TOEFL writing exams in intermediate levels
followed by a decrease at the highest level.
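The core of a lexical bundle analysis can be sketched in a few lines of Python, as below; the two toy texts and the frequency cut-off are placeholders, and published studies normally add a dispersion (range) criterion and report normalized rates per million words.

# Minimal sketch of four-word lexical bundle extraction from a tiny corpus.
from collections import Counter

texts = [
    "on the other hand the results show that the results are mixed".split(),
    "on the other hand the nature of the task may matter".split(),
]

bundles = Counter()
for tokens in texts:
    for i in range(len(tokens) - 3):
        bundles[tuple(tokens[i:i + 4])] += 1

for bundle, freq in bundles.items():
    if freq >= 2:  # placeholder frequency threshold
        print(" ".join(bundle), freq)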

Grammar and Syntax


Many corpus studies have used grammar-​based measures to investigate SLW development
(Biber et al., 2016; Lu, 2011; Vyatkina et al., 2015). Syntactic measures are often studied
through a complexity framework, and vary from clausal length-​ based measures to more
fine-​grained, register-​based measures such as particular types of embedded phrases (attribu-
tive adjectives, premodifying nouns, of-​phrases, and other prepositional phrases). The latter
have been demonstrated to be more characteristic of both L1 and L2 advanced English writing
when compared to conversation (Biber et al., 2011; Biber et al., 2016). These phrasal features
have also been shown to be correlated with increasing proficiency (Biber et al., 2016; Kyle &
Crossley, 2018; Mazgutova & Kormos, 2015; Vyatkina et al., 2015).
While less well-​studied in SLW, morphology has typically been investigated through the perspec-
tive of CLI (Golden et al., 2017; Spoelman, 2014). For example, Brezina and Pallotti (2019) show
that native speakers and higher proficiency L2 learners of Italian demonstrate a higher verb mor-
phological complexity index (MCI) than lower proficiency learners of Italian. The morphological
complexity index calculates the average morphological diversity within and across subjects for, in
this case, verb forms.

Lexico-​Grammar
Although grammar-​based measures reveal important developmental stages, a usage-​based perspec-
tive on language highlights the importance of investigating interconnections between grammar and
lexis (Römer & Schulze, 2009). The link between lexis and grammar is particularly emphasized
in studies of collostructions, which are extensions of collocations analyzed in terms of “which
lexemes are strongly attracted or repelled by a particular slot in the construction” (Stefanowitsch
& Gries, 2003, p. 214). For example, the verbs prefer and continue occur in gerundial argument
constructions in L1 English writing, while in L2 English they are more associated with the infinitive
construction (Wulff & Gries, 2011). Increased proficiency is also indicative of expanded repertoire
and productivity of VACs with more native-​like production (Römer & Berger, 2019), less frequent
verb-VAC combinations, and a stronger verb-VAC association (Kyle & Crossley, 2017).
The interplay of grammar and lexis has a long tradition within functional interpretations of lan-
guage use (Halliday, 1989; Biber, 1988). Lexico-​grammatical features have been connected with
specific discourse types in SLW, such as narration with verbal features and exposition with nom-
inal features in L2 Spanish (Asención-​Delaney & Collentine, 2011). Staples and Reppen (2016)
show that greater use of grammatical features (e.g., premodifying nouns) may be accompanied by
more lexical repetition, likely a strategy for providing coherence, but perhaps a less straightforward
measure of SLW development than current models of CALF might suggest.

Discourse
L2 discourse-​level writing development has been measured by lexico-​grammatical features in
tandem with rhetorical organization in registers and genres. For example, studies of grammat-
ical metaphor (GM) (i.e., nominalization) show that complexity increases not necessarily with the number of GM constructions but with where and how GM is placed in the text (Byrnes, 2009). This
research suggests that GM acquisition occurs in later stages (Ortega & Byrnes, 2008).
The use of interactional and interpersonal metadiscourse analyzed in the light of Hyland’s (2005)
metadiscourse model has revealed developmental trends by EFL learners of different backgrounds,
such as influence of modal system acquisition stage, adjectival lexical phrase choices, and reflexive
constructions in the case of Spanish EFL writers (Neff van Aertselaer & Dafouz-​Milne, 2008).
Using a longitudinal design, Lee (2016) also mapped the use of attitude markers by a Korean
learner of English over two years to show writer stance and identity development.

Research Agenda
Taking into account these major areas in which corpora have been used to understand SLW devel-
opment, we propose four research agendas for future studies.

Research Agenda 1: How Can CA(L)F (Complexity, Accuracy, Lexis, and Fluency)
Be Expanded or Reconceptualized Using Corpus Linguistic Methods?
Although ideas about how CA(L)F should be measured differ across research paradigms, it seems
that most SLA scholars agree that there is still much work to be done in defining and operational-
izing complexity and to some extent accuracy and fluency as well (Bulté & Housen, 2014; Ortega,
2015). Thus far, most studies of complexity either neglect lexis and focus entirely on grammatical
complexity or investigate lexis and grammar separately, despite the underlying assumption of usage-based approaches to SLA that language learning is a lexico-grammatical endeavor (Ellis et al., 2013). Below, we illustrate how corpus linguistic findings and approaches
can be used to support an expanded reconceptualization of complexity. We also promote a func-
tional approach that considers both context and co-​text in analyzing complexity.

RESEARCH TASK 1: INCORPORATE LEXICO-GRAMMATICAL FEATURES FROM CORPUS LINGUISTIC ANALYSIS INTO MODELS OF COMPLEXITY

There are several methodological models from corpus linguistics that illustrate how lexis and
grammar can be integrated in SLW research. The first approach includes integrating sophistica-
tion measures (e.g., frequency bands or other frequency criteria) with grammatical features. This
approach is minimally included in Biber, Gray, & Poonpon’s (2011) hypothesized sequence of SLW
development (e.g., Stage 2 includes extremely common attributive adjectives and Stage 4 includes
a wider range of attributive adjectives). Gray, Geluso, & Nguyen (2019) applied this framework
in a study of longitudinal development over nine months using the TOEFL iBT independent and
integrated tasks. The findings show that writers at Time 2 (nine months after Time 1) not only
used more attributive adjectives, but also a greater range of types (variety) as well as less frequent
adjectives (sophistication) (p. 53). Clausal features such as non-​finite verb complement clauses and
finite relative clauses decreased from Time 1 to Time 2 (p. 55). The analysis of these clausal constructions, however, did not include an examination of lexical variation, which could be further integrated into the model. For example, the verbs controlling the complement clauses could be further divided
into high and low frequency verbs. There is also quite a bit of evidence that relative pronouns (that
or wh) differ across registers, and thus may differ across developmental stages (Biber et al., 2011).
Incorporating context (here, register) as a factor in complexity development allows researchers to
accommodate both usage-​based and functional approaches to SLA.
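A minimal sketch of how such an integration could be operationalized is given below: attributive adjectives are identified from a dependency parse and then profiled for variety (types) and for a crude sophistication criterion (proportion outside a high-frequency list). The spaCy model and the small frequency set are stand-ins; any established tagger and frequency band list could be substituted.

    import spacy

    nlp = spacy.load("en_core_web_sm")
    HIGH_FREQUENCY = {"good", "new", "big", "small", "old", "great", "other"}  # stand-in for a real top-band list

    def attributive_adjective_profile(text):
        doc = nlp(text)
        adjs = [t.lemma_.lower() for t in doc if t.pos_ == "ADJ" and t.dep_ == "amod"]
        types = set(adjs)
        return {
            "tokens": len(adjs),
            "types": len(types),
            "type_token_ratio": len(types) / len(adjs) if adjs else 0.0,
            "prop_sophisticated": sum(a not in HIGH_FREQUENCY for a in adjs) / len(adjs) if adjs else 0.0,
        }

    print(attributive_adjective_profile("The experimental design raised several unexpected methodological problems."))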
Another way in which lexis and grammar have been examined in corpus-​based studies of SLW is
through semantic categories. Staples and Reppen (2016), for example, found significant differences
in the ways that L2 writers used semantic classes of adverbials when compared with L1 writers: L2
writers tended to use more causative adverbial clauses while L1 writers used more conditional
adverbial clauses. Future research can be extended to semantic categories of other grammatical
constructions such as attributive adjectives and premodifying nouns.
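One simple way to operationalize such semantic coding is sketched below: finite adverbial clauses are located in a dependency parse and assigned a class on the basis of their subordinator. The class lists are illustrative rather than exhaustive, and the dependency labels are assumptions tied to spaCy's English pipeline.

    import spacy
    from collections import Counter

    nlp = spacy.load("en_core_web_sm")
    SEMANTIC_CLASSES = {
        "causal": {"because", "since", "as"},
        "conditional": {"if", "unless"},
        "concessive": {"although", "though", "whereas"},
        "temporal": {"when", "while", "after", "before", "until"},
    }

    def adverbial_clause_classes(text):
        counts = Counter()
        for token in nlp(text):
            if token.dep_ == "advcl":                                    # head of an adverbial clause
                markers = {c.text.lower() for c in token.children if c.dep_ == "mark"}
                for label, subordinators in SEMANTIC_CLASSES.items():
                    if markers & subordinators:
                        counts[label] += 1
        return counts

    print(adverbial_clause_classes("If students revise carefully, their scores improve because they notice more errors."))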
Lexis and grammar have also been combined in research on multi-​word sequences (n-​grams,
lexical bundles, etc.). The most common method of doing so is to classify bundles that emerge
from the data into categories such as bundles formed from noun phrases (e.g., the end of the) or
verb phrases (e.g., take a look at) (Biber, Conrad, & Cortes, 2004; Yan & Staples, 2020). Another
important aspect of lexico-​grammar revealed in corpus-​based studies of multi-​word sequences is
the repetition of language from a prompt (or repetition more generally). A number of studies have
associated lower scores on proficiency tests with greater use of prompt-based language (Staples et al., 2013; Yan & Staples, 2020). However, these factors are generally not incorporated into grammatical
complexity measures.
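The sketch below illustrates both ideas in a few lines: four-word bundles are extracted with simple tokenization, and the share of a learner text's bundles that also occur in the prompt is reported. The tokenization, thresholds, and example texts are all deliberately simplified.

    import re
    from collections import Counter

    def ngrams(text, n=4):
        tokens = re.findall(r"[a-z']+", text.lower())
        return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def prompt_overlap(essay, prompt, n=4):
        essay_grams = ngrams(essay, n)
        prompt_grams = set(ngrams(prompt, n))
        return sum(g in prompt_grams for g in essay_grams) / len(essay_grams) if essay_grams else 0.0

    essay = ("Some people believe that technology makes life easier, and I agree that "
             "technology makes life easier in many ways.")
    prompt = "Do you agree or disagree that technology makes life easier?"
    print(Counter(ngrams(essay)).most_common(3))        # most frequent four-word bundles
    print(prompt_overlap(essay, prompt))                # proportion of bundles shared with the prompt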
VACs are another promising approach to identifying lexico-​grammatical development of L2
writers. This research expands lexical complexity research to include strength of association, which
has been shown to increase with higher proficiency (Kyle & Crossley, 2017; Römer & Berger,
2019). VACs also allow researchers to capture the functional relationship between the verb and
direct object and more generally to take co-​text into account, thus increasing connections with
usage-​based approaches to language development. Paquot (2019) incorporates VACs and other
structures, arguing for the addition of collocational analysis measures of diversity (type-​token ratio
of particular collocations) and sophistication (sophisticated word combinations) in complexity
research.
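As a small illustration of the kind of phraseological diversity measure Paquot (2019) argues for, the snippet below computes a type-token ratio over verb + direct object pairs extracted from a dependency parse; association strength could be layered on top with the Fisher-exact approach shown earlier. The dependency label and model are assumptions tied to spaCy's English pipeline.

    import spacy

    nlp = spacy.load("en_core_web_sm")

    def verb_object_ttr(text):
        doc = nlp(text)
        pairs = [(t.head.lemma_.lower(), t.lemma_.lower())
                 for t in doc if t.dep_ == "dobj" and t.head.pos_ == "VERB"]
        return len(set(pairs)) / len(pairs) if pairs else 0.0

    print(verb_object_ttr("The students wrote essays, revised the essays, and then wrote new essays."))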

Of course, researchers may not be able to (or want to) include so many disparate measures
in a study, and in fact it can be quite problematic to include multiple overlapping measures in a
single study. In order to determine which measures and approaches might yield the most fruitful
understanding of L2 development, it would be useful for researchers to incorporate multiple, com-
plementary approaches to complexity in their studies, or, for multiple researchers to use the same
dataset to explore complexity from these multiple perspectives. The special issue of Journal of
Second Language Writing in December 2014 (Connor-​Linton & Polio, 2014) offers a model of how
this could be carried out: five research teams examined CALF in a longitudinal data set compiled
over the course of one semester. For further research in this area, it would be helpful to examine the
various ways in which lexico-​grammatical complexity can be operationalized using the research
methods above.

Research Agenda 2: How Can Our Understanding of Writing Development Be Expanded Using Multivariate Corpus Analyses?

There have been a number of calls for expanding the ways in which we measure writing devel-
opment, including by Norris and Manchón (2012), Ortega (2015), and Polio (2017). Although a
great deal of SLA scholarship in this area has focused on measurements of CALF, Polio and Park’s
(2016) review of language development in SLW shows how many studies depart from this para-
digm (e.g., Byrnes, 2009 on grammatical metaphor). In what follows, we discuss ways in which
multivariate methods commonly used to examine corpus data (multidimensional analysis, cluster
analysis, discriminant analysis, and regression analyses) can be leveraged to expand notions of lan-
guage development in writing.

RESEARCH TASK 2: MULTIDIMENSIONAL ANALYSIS


Multidimensional (MD) analysis is both a statistical and interpretive procedure developed by Biber
(1988) for the purpose of examining differences across registers. The statistical procedure involves
factor analysis, which is used in many applied linguistics contexts, but the interpretive procedure is
guided by corpus linguistic principles (see Berber-​Sardinha & Veirano-​Pinto, 2019 for discussions
of how to conduct multidimensional analysis). While it was originally used to describe differences
in speech and writing, this approach has been applied in a growing number of discourse contexts
(including analyses of SLW) to examine linguistic features as constellations of co-​occurring
variables rather than in isolation (Biber et al., 2016; Weigle & Friginal, 2015; Yan & Staples, 2020).
In doing so, researchers can functionally interpret the factors of co-​occurring features to under-
stand differences in language use across developmental stages, as identified through some outside
category such as score level on a proficiency exam or years in an academic program. For example,
Biber et al. (2016) show that test takers on the TOEFL iBT writing exam used more of the constel-
lation of features labeled as “literate” as score level increased. That is, higher-​scoring test takers
used more nouns, prepositional phrases, adjectives, passives, and longer words (literate features)
and used fewer present-​tense verbs, mental verbs (think, know), modal verbs, verb + that comple-
ment clauses (I think that…), and finite adverbial clauses (oral features). The literate features are
all more associated with academic writing while the oral features are more associated with speech
(see, e.g., Biber et al., 2011). While other studies show that these variables can individually measure
grammatical complexity, Biber et al. (2016) show the ability of these two constellations of features
to work together for the same communicative function, increasing the interpretive power as well
as the statistical power. Thus, MD analysis can provide a more holistic approach to understanding
the language used by second language writers at different developmental stages than individual
features, but still allows for a functional interpretation of those variables.
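The statistical half of such an analysis can be sketched very compactly: standardized feature counts are submitted to a factor analysis, and each text is then scored on the resulting factors, which the researcher must still interpret functionally. The feature names and values below are invented, and published MD analyses typically use rotated factor solutions and far larger feature sets than this toy example.

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import FactorAnalysis

    # rows = texts, columns = feature counts normalized per 1,000 words (invented values)
    features = pd.DataFrame({
        "nouns":         [210, 250, 190, 230, 260, 200],
        "prep_phrases":  [90, 110, 80, 105, 120, 85],
        "present_verbs": [60, 40, 75, 45, 35, 70],
        "that_clauses":  [12, 6, 15, 8, 5, 14],
    })

    z = StandardScaler().fit_transform(features)
    fa = FactorAnalysis(n_components=2, random_state=0).fit(z)
    loadings = pd.DataFrame(fa.components_.T, index=features.columns, columns=["factor1", "factor2"])
    scores = fa.transform(z)                        # each text's score on each factor
    print(loadings.round(2))
    print(scores.round(2))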
The challenges of MD analysis are two-fold: 1) datasets need to be larger than 100 cases (texts), ideally n > 300; and 2) researchers need to be familiar with multivariate statistical tests, specifically
factor analysis. Most graduate programs do not require such high-​level statistical knowledge, and
some researchers may not even have access to training in this area. One alternative is to use a tool
that allows researchers to use the results of previous MD analyses to investigate their data. For
example, researchers can use Nini’s (2014) Multidimensional Analysis Tagger or Brezina’s (2019)
Statistical Tool to apply the findings from Biber (1988) to their own data sets.

RESEARCH TASK 3: CLUSTER ANALYSIS


Cluster analysis is another useful statistical procedure that is increasingly used in both corpus and
SLA research. Similar to factor analysis, cluster analysis is a multivariate approach, but tradition-
ally has been used to cluster cases (observations, e.g., participants) rather than linguistic features.
Cluster analysis can be used instead of a priori categories that are intended to represent develop-
mental stages (e.g., score levels on a test, task type). In this approach, texts (which represent the writing produced by second language writers) are grouped into clusters according to the linguistic features they contain. These clusters can then be compared with the a priori measures (e.g., score levels on
a test, task type) to see whether those categories align with expected patterns of language devel-
opment. When there is not a one-​to-​one alignment, the results may reveal variation in the lin-
guistic features used by high proficiency L2 writers (see Friginal, Li, & Weigle, 2014; Jarvis, Grant, Bikowski, & Ferris, 2003). The cluster analysis may also point to other learner characteristics that
might be impacting the patterns (e.g., L1 background). In this way, cluster analysis allows for indi-
vidual differences within groups.
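A bare-bones version of this workflow is sketched below: texts are clustered on standardized linguistic features and the resulting clusters are cross-tabulated against score level. K-means is used here purely for brevity; hierarchical clustering, as described in Staples and Biber (2015), follows the same logic, and all values are invented.

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    data = pd.DataFrame({
        "mean_sentence_length":         [12.1, 13.0, 18.5, 19.2, 22.4, 23.0],
        "dependent_clauses_per_clause": [0.21, 0.25, 0.38, 0.35, 0.30, 0.28],
        "modifiers_per_noun_phrase":    [0.40, 0.35, 0.55, 0.60, 0.85, 0.90],
        "score_level":                  ["low", "low", "mid", "mid", "high", "high"],
    })

    z = StandardScaler().fit_transform(data.drop(columns="score_level"))
    data["cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)
    print(pd.crosstab(data["cluster"], data["score_level"]))    # do clusters align with score levels?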
Staples and Biber (2015) describe the steps needed to conduct a cluster analysis, illustrating
the process with SLW data from the TOEFL iBT test (integrated and independent). Their results
showed some clustering in relation to proficiency level and task type, but also raised questions
about the language used at intermediate levels on the writing tasks. Thus, cluster analysis can help
to elucidate a more complex picture of language development in SLW.
The challenges for cluster analysis are similar to those for MD analysis, although fewer texts
are needed to meet the assumptions for cluster analysis. Readers are directed to Staples and Biber
(2015) for more details on how cluster analysis can be used to explore the multi-​faceted nature of
SLW development.

RESEARCH TASK 4: DISCRIMINANT ANALYSIS


Another multivariate statistical procedure, discriminant analysis, is predictive rather than being
exclusively interpretative like MD and cluster analyses. That means the researcher uses a number
of linguistic features (predictors) to create models (e.g., regression) that are then used to pre-
dict group membership (McNamara et al., 2010). The group here may refer to different learner
characteristics, such as a shared L1 or proficiency level. For example, McNamara et al. (2010)
used as predictors measures of cohesion, syntactic complexity, and lexical diversity and sophis-
tication to build predictive models to classify L2 English essays in one of two groups: high and
low-scoring. In this type of study, the full set of texts is divided into two subsets. The larger subset (usually 80% of the entire corpus) is used for creating predictive models. The models are
then used to predict the rest of the data (usually 20% of the corpus). Measures of model accuracy
generally comprise precision and recall, which take into account positive misclassification (i.e.,
how many texts were incorrectly identified as belonging to a group) and model sensitivity (i.e.,
how many texts were ignored, or negatively mislabeled as not belonging to a group), respectively
(Sokolova & Lapalme, 2009).
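The following sketch walks through that workflow with simulated data standing in for a real feature matrix: an 80/20 split, a linear discriminant model trained on the larger set, and precision and recall computed on the held-out texts.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.metrics import precision_score, recall_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))                       # e.g., cohesion, complexity, and lexical indices
    y = np.where(X[:, 0] + X[:, 2] + rng.normal(size=200) > 0, "high", "low")   # simulated score groups

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
    predictions = LinearDiscriminantAnalysis().fit(X_train, y_train).predict(X_test)

    print("precision:", precision_score(y_test, predictions, pos_label="high"))
    print("recall:   ", recall_score(y_test, predictions, pos_label="high"))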
The main challenges in using discriminant analysis with learner corpora are determining which
linguistic features to include as predictors in the modeling process, and subsequent model selection.
More SLA studies that use this technique are needed to establish statistically sound procedures for
these two steps.


RESEARCH TASK 5: REGRESSION ANALYSIS


Regression analyses are used more frequently in learner corpus studies than some other quantitative
methods, like discriminant or cluster analysis, but there are still several methodological concerns
that need to be addressed in future research, including the way learner populations are sampled, data
is aggregated, and assumptions are checked (Gries, 2018). Gries (2018) offers a strong argument
for using carefully constructed regression-​based approaches by re-​analyzing data from previously
published learner corpus research. He also includes in his argument step-​by-​step instructions on
how to analyze learner data by using case-​by-​variable datasets (instead of aggregating the data),
limiting tests of statistical significance, and calculating effect sizes and confidence intervals.
In addition, Gries (2018) argues for using an even more sophisticated regression-based approach, namely MuPDAR (Multifactorial Prediction and Deviation Analysis Using Regression), which has some steps in common with discriminant analysis, discussed above. The main difference in MuPDAR is that native-speaker behavior is first modeled to predict which language variant is chosen in a given linguistic context. The resulting native-speaker model, which captures how each linguistic factor affects native-speaker language choice, is then applied to non-native speaker (i.e., learner) data: for each construction produced by learners, the model supplies the language choice it predicts a native speaker would make in that specific linguistic context (Gries & Deshors, 2020). In other words, each learner choice is labeled as native-like or non-native-like.
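A drastically reduced sketch of this two-step logic, with simulated data in place of real corpus observations, is given below; it is intended only to show the shape of the procedure, not to reproduce the full MuPDAR protocol described by Gries and Deshors (2020).

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    # Step 1: model native-speaker choices between two variants from contextual predictors
    X_native = rng.normal(size=(300, 3))                 # e.g., constituent length, givenness, register
    y_native = (X_native[:, 0] - X_native[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)
    native_model = LogisticRegression().fit(X_native, y_native)

    # Step 2: apply the native-speaker model to learner contexts and compare with learner choices
    X_learner = rng.normal(size=(50, 3))
    learner_choice = rng.integers(0, 2, size=50)
    predicted_native_choice = native_model.predict(X_learner)
    nativelike = learner_choice == predicted_native_choice

    print(f"{nativelike.mean():.2%} of learner choices match the predicted native-speaker choice")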
While more learner corpus research using multifactorial regression analyses is needed, a word
of caution is warranted regarding the use of native-​speaker norms. MuPDAR, like other regression-​
based methods, offers a way to compare learner language use against the statistical language use of
a native-​speaker community, instead of using idealized native-​speaker language (e.g., prescriptive
grammar). However, the assumption remains that the norm or the target for learner language is
native-​speaker language use.

Research Agenda 3: How Can Corpus Linguistics Contribute to Our Understanding of What Developmental Patterns Look Like Over Time?

One of the most common refrains underscored in future directions for second language (writing)
development is the need for more robust longitudinal studies. As Polio (2017) indicates, most lon-
gitudinal studies are case studies, with a few important exceptions (e.g., Belz & Vyatkina, 2005;
Kreyer & Schaub, 2018; Biber, Reppen, Staples, & Egbert, 2020). This is an obvious place in
which corpus linguistics, and particularly the development of corpora and methods for examining
these large data sets, can contribute. So far, however, most corpus studies remain cross-​sectional
in design, due to the methodological challenges of collecting longitudinal corpora. There is also
the tradeoff, discussed in Polio (2017), between matching texts from learners across time for task,
genre, or register characteristics vs. ecological validity of what learners produce outside of a timed,
controlled context.
One important exception is the Michigan State University corpus of 70 students reported on
in a special issue of Journal of Second Language Writing (Connor-​Linton & Polio, 2014). This
provides one model of how institutionally created corpora can contribute to the study of language
development over time, although one limitation of such corpora is their constrained text types (in this case, five-paragraph essays). While the MSU corpus takes the first approach, matching texts
from learners across time (on counterbalanced topics), another approach is being taken by the Crow
(Corpus and Repository of Writing; https://crow.corporaproject.org) and MACAWS (Multilingual
Academic Corpus of Assignments – Writing and Speech; https://macaws.corporaproject.org) at the University of Arizona. These corpora are being collected from L2 first-year writing (English) and
Portuguese and Russian language classes, respectively. The tasks come from the existing course
curriculum, which means that they are more ecologically valid, but are not controlled for over time.

While these corpora are not exclusively longitudinal, the “monitor” approach taken to them (a monitor corpus continually collects data over time) and the careful tracking of student demographic data mean that 33 students in Crow and 76 Portuguese and 37 Russian students in MACAWS currently have longitudinal data in the corpus (at least two semesters). Crow and MACAWS are available to researchers through a password-protected site.

RESEARCH TASK 6: BUILD MORE LONGITUDINAL, OPEN ACCESS, INTERINSTITUTIONAL CORPORA


In order to support the compilation of longitudinal corpora, it is suggested that researchers take
an interinstitutional approach, as well as an approach to corpus development that allows, from the
beginning, for researcher access. This approach was taken, for example, by the LANGSNAP group
at Southampton (UK; langsnap.soton.ac.uk). By working with universities in France, Spain, and
Mexico, the Southampton group was able to track learner progress at six data points, before, during,
and after study abroad. Another example is the Crow and MACAWS projects, which have set up
their IRB protocols (ethics review) to allow for sharing of data after de-​identification. While the de-​
identification process requires additional time and resources, it allows researchers to share data eth-
ically. The Crow team is developing open access resources for interactive de-​identification (as well
as automated scripts), which should help researchers to share their data, with proper permissions
and consents (see https://github.com/writecrow/ciabatta/wiki).
There are also methodological challenges of longitudinal data that must be addressed, including
attrition of participants (so not all participants will be represented at all time points), and vari-
ation in tasks, for those longitudinal corpora for which learners produce different text types over
time (due to curricular progression). Mixed effects models allow for random effects of different
text types and learners at various points in time. In this way, they can also account for individual
differences within groups.
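For illustration, a minimal mixed-effects specification of this kind might look as follows, with random intercepts for learners so that participants need not be present at every time point; the file name and column names are hypothetical, and text type is kept as a fixed effect here purely for simplicity.

    import pandas as pd
    import statsmodels.formula.api as smf

    # assumed columns: learner, semester, task_type, phrasal_complexity
    df = pd.read_csv("longitudinal_measures.csv")

    model = smf.mixedlm("phrasal_complexity ~ semester + task_type",
                        data=df, groups=df["learner"])      # random intercept per learner
    print(model.fit().summary())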

Research Agenda 4: How Can Corpus Linguistics Contribute to Our Insights into
Cross-​Linguistic Influence for Different L1 Backgrounds/​Target Languages?
With the emergence of corpus-​based empirical studies, research on cross-​linguistic influence (CLI)
has moved away from the error-​based approach that was characteristic of early SLA (and CIA),
focusing on patterns of language use instead (Osborne, 2015; Paquot, 2017). An important limita-
tion of these studies is their heavy focus on L2 English, with additional overrepresentation of L2
French, L2 German, and L2 Spanish (Osborne, 2015). Exceptions include the edited volume by
Golden et al. (2017), discussed above, which focuses on L2 Norwegian produced by writers from
ten different language backgrounds. These studies investigate linguistic features that are not present
in English, such as adjective inflection and gender assignment (e.g., Spoelman, 2014).

RESEARCH TASK 7: INCORPORATE A WIDER VARIETY OF LANGUAGES AND A MULTILINGUAL APPROACH TO STUDIES OF CROSS-LINGUISTIC INFLUENCE

If we are to study CLI for a diverse group of linguistic features that run the gamut of language levels
discussed in this chapter (from morphology to discourse), more studies on target languages other
than English are needed. It is also interesting to note that most research on CLI assumes learners
have previously acquired only one language, which is not true for the majority of language learners
(De Angelis, 2007). There is still a lot of room for corpus-​based research on CLI that not only
considers the different languages learners previously acquired, but also includes data produced by
individuals with the same L1 background in different target languages.
While advances in statistical and technological methods (e.g., discriminant analysis) have allowed
for more empirical investigations of cross-​linguistic influence on larger corpora (Paquot, 2017),
the collection of rich metadata is at the heart of this type of research. The issue here centers once
again on how corpora are built, and what type of learner metadata are collected (McEnery, Brezina,
Gablasova, & Banerjee, 2019). We are still in need of documented procedures on how to collect
background information on learners’ previously acquired languages. Descriptions of how existing learner corpora were built, however, can help researchers design their own instruments for metadata
collection. Lozano (2009), for example, provides a detailed account of how learner metadata for
CEDEL2 (Corpus Escrito Del Español 2/​L2 Spanish Corpus of Writing; cedl2.learnercorpora.com)
was compiled. The author includes the survey used to collect information on learner demographics
(e.g., age), language background (e.g., learner’s native language, their parents’ native languages),
and self-reported L2 proficiency in any additional languages. The survey, like those used for many other L2 corpora, assumes that learners are native speakers of only one language. In contrast, the survey used for collecting learner information for MACAWS allows participants to enter multiple L1s. MACAWS currently has 118 Portuguese learners who self-reported speaking more than
one L1. The reported L1s include a broad variety of languages, such as American Sign Language,
Arabic, and Hebrew, in addition to the expected Spanish and English. ASK, CEDEL2, Crow, and MACAWS (discussed here and above) also represent an important trend in making their learner corpora publicly available.
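One possible shape for such a metadata record, not tied to any existing corpus and intended only to show that multiple previously acquired languages can be accommodated, is sketched below.

    import json

    participant = {
        "id": "P0118",
        "age": 21,
        "first_languages": ["Spanish", "English"],               # more than one L1 allowed
        "parents_first_languages": ["Spanish", "English"],
        "other_languages": [
            {"language": "Portuguese", "self_rated_proficiency": "intermediate"},
            {"language": "American Sign Language", "self_rated_proficiency": "beginner"},
        ],
        "target_language": "Portuguese",
        "proficiency_level": "second-year university course",
    }

    print(json.dumps(participant, indent=2, ensure_ascii=False))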

Conclusion
Learner corpus research has contributed to SLA and writing studies in key ways, particularly in
the last ten years. It has allowed writing scholars to investigate language development in more sys-
tematic ways, on larger datasets, providing insights into lexical, grammatical, lexico-​grammatical,
phraseological, and discourse-​level development. It has also allowed writing scholars to move
beyond tightly controlled assessment tasks to investigate the language used by writers in classrooms
and in their professions. Learner corpus researchers have begun to contribute to key areas of future
studies within SLA and corpora, including a broader understanding of complexity in L2 develop-
ment, cross-​linguistic influence beyond a deficit approach (and on languages beyond English), and
multivariate approaches to understanding SLW development. Longitudinal corpora also have begun
to be developed, and have the potential to revolutionize the ways in which we understand SLA at
both a group and individual level. However, there is much work to be done in these new areas. We
hope this chapter will inspire and give concrete direction to those research efforts.

References
Asención-​Delaney, Y., & Collentine, J. (2011). A multidimensional analysis of a written L2 Spanish corpus.
Applied Linguistics, 32(3), 299–​322.
Belz, J., & Vyatkina, N. (2005). Learner corpus analysis and the development of L2 pragmatic competence in
networked inter-​cultural language study: The case of German modal particles. Canadian Modern Language
Review, 62(1), 17–​48.
Berber-​Sardinha, T.B., & Pinto, M.V. (Eds.). (2019). Multi-​dimensional analysis: Research methods and
current issues. London: Bloomsbury.
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and
textbooks. Applied Linguistics, 25(3), 371–​405.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammat-
ical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–​35.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across language exam
task types and proficiency levels. Applied Linguistics, 37(5), 639–​668.
Biber, D., Reppen, R., Staples, S., & Egbert, J. (2020). Exploring the longitudinal development of grammatical
complexity in the disciplinary writing of L2-​English university students. International Journal of Learner
Corpus Research, 6(1), 38–​71.
Brezina, V. (2019). Lancaster stats tools online. Retrieved from http://​corpora.lancs.ac.uk/​stats/​

Brezina, V., & Pallotti, G. (2019). Morphological complexity in written L2 texts. Second Language Research,
35(1), 99–​119.
Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-​term changes in L2 writing complexity.
Journal of Second Language Writing, 26, 42–​65.
Byrnes, H. (2009). Emergent L2 German writing ability in a curricular context: A longitudinal study of gram-
matical metaphor. Linguistics and Education, 20(1), 50–​66.
Byrnes, H., Maxim, H. H., & Norris, J. M. (2010). Realizing advanced foreign language writing develop-
ment in collegiate education: Curricular design, pedagogy, assessment. The Modern Language Journal,
94(s1) i–​235.
Connor-​Linton, J., & Polio, C. (2014). Comparing perspectives on L2 writing: Multiple analyses of a common
corpus. Journal of Second Language Writing, 26, 1–​9.
De Angelis, G. (2007). Third or additional language acquisition. Bristol: Multilingual Matters.
Ellis, N.C., O’Donnell, M.B., & Römer, U. (2013). Usage-​based language: Investigating the latent structures
that underpin acquisition. Language Learning, 63, 25–​51.
Fellner, T., & Apple, M. (2006). Developing writing fluency and lexical complexity with blogs. JALT Call
Journal, 2(1), 15–​26.
Flowerdew, J. (2003). Signalling nouns in discourse. English for Specific Purposes, 22(4), 329–​346.
Friginal, E., Li, M., & Weigle, S.C. (2014). Revisiting multiple profiles of learner compositions: A comparison
of highly rated NS and NNS essays. Journal of Second Language Writing, 23, 1–​16.
Golden, A., Jarvis, S., & Tenfjord, K. (Eds.). (2017). Crosslinguistic influence and distinctive patterns of lan-
guage learning: Findings and insights from a learner corpus. Bristol: Multilingual Matters.
Granger, S. (1996). From CA to CIA and back: An integrated approach to computerized bilingual and learner
corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast: Text-​based cross-​
linguistic studies (pp. 37–​51). Lund: Lund University Press.
Granger, S. (2015). Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus
Research, 1(1), 7–​24.
Granger, S., Dagneaux, E., Meunier, F., & Paquot, M. (2002, 2009). ICLE: International corpus of learner
English. Louvain-​la-​Neuve: Presses Universitaires de Louvain.
Grant, L., & Ginther, A. (2000). Using computer-​tagged linguistic features to describe L2 writing differences.
Journal of Second Language Writing, 9(2), 123–​145.
Gray, B., Geluso, J., & Nguyen, P. (2019). The longitudinal development of grammatical complexity at the
phrasal and clausal levels in spoken and written TOEFL iBT® responses. ETS Research Report. Retrieved
from https://​onlinelibrary.wiley.com/​toc/​23308516/​2019/​2019/​1
Gries, S.T. (2018). On over-​and underuse in learner corpus research and multifactoriality in corpus linguistics
more generally. Journal of Second Language Studies, 1(2), 276–​308.
Gries, S.T., & Deshors, S.C. (2020). There’s more to alternations than the main diagonal of a 2×2 confu-
sion matrix: improvements of MuPDAR and other classificatory alternation studies. ICAME Journal
44, 69–​96.
Halliday, M.A.K. (1989). Spoken and written language. Oxford: Oxford University Press.
Hyland, K. (2004). Disciplinary interactions: Metadiscourse in L2 postgraduate writing. Journal of Second
Language Writing, 13(2), 133–​151.
Hyland, K. (2005). Metadiscourse: Exploring writing in interaction. London: Continuum.
Jarvis, S., Grant, L., Bikowski, D., & Ferris, D. (2003). Exploring multiple profiles of highly rated learner
compositions. Journal of Second Language Writing, 12(4), 377–​403.
Kyle, K., & Crossley, S. (2017). Assessing syntactic sophistication in L2 writing: A usage-​based approach.
Language Testing, 34(4), 513–​535.
Kyle, K., & Crossley, S.A. (2018). Measuring syntactic complexity in L2 writing using fine-​ grained clausal
and phrasal indices. The Modern Language Journal, 102(2), 333–​349.
Kreyer, R., & Schaub, S. (2018). The development of phrasal complexity in German intermediate learners of
English. International Journal of Learner Corpus Research, 4(1), 82–​111.
Kuiken, F., & Vedder, I. (2007). Cognitive task complexity and linguistic performance in French L2 writing. In
M.P. García Mayo (Ed.), Investigating tasks in formal language learning (pp. 117–​135). Bristol: Multilingual
Matters.
Laufer, B., & Nation, P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied
Linguistics, 16(3), 307–​322.
Lee, S. (2016). Metadiscourse and writer identity: A longitudinal case study of a Korean L2 writer in the US.
Modern Studies in English Language and Literature, 60, 129–​150.
Lozano, C. (2009). CEDEL2: Corpus Escrito del Español L2. In C.M. Bretones Callegas et al. (Eds.), Applied
linguistics now: Understanding language and mind (pp. 197–​212). Almería: Universidad de Almería.

Lu, X. (2011). A corpus-​based evaluation of syntactic complexity measures as indices of college-​level ESL
writers’ language development. TESOL Quarterly, 45(1), 36–​62.
Lu, X., & Ai, H. (2015). Syntactic complexity in college-​level English writing: Differences among writers with
diverse L1 backgrounds. Journal of Second Language Writing, 29, 16–​27.
Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical diversity and language development.
New York: Palgrave Macmillan.
Mazgutova, D., & Kormos, J. (2015). Syntactic and lexical development in an intensive English for Academic
Purposes programme. Journal of Second Language Writing, 29, 3–​15.
McEnery, T., Brezina, V., Gablasova, D., & Banerjee, J. (2019). Corpus linguistics, learner corpora, and
SLA: Employing technology to analyze language use. Annual Review of Applied Linguistics, 39, 74–​92.
McNamara, D.S., Crossley, S.A., & McCarthy, P.M. (2010). Linguistic features of writing quality. Written
Communication, 27(1), 57–​86.
Neff van Aertselaer, J., & Dafouz-​Milne, E. (2008). Argumentation patterns in different languages: An ana-
lysis of metadiscourse markers in English and Spanish texts. In J. Neff van Aertselaer & E. Dafouz-​Milne
(Eds.), Developing contrastive pragmatics interlanguage and cross-​cultural perspectives (pp. 87–​102).
Berlin: DeGruyter.
Nini, A. (2014). Multidimensional Analysis Tagger. Retrieved from https://​sites.google.com/​site/​
multidimensionaltagger/​versions
Norris, J.M., & Manchón, R. (2012). L2 writing development: Multiple perspectives. Berlin: DeGruyter.
Ortega, L. (2015). Syntactic complexity in L2 writing: Progress and expansion. Journal of Second Language
Writing, 29, 82–​94.
Ortega, L., & Byrnes, H. (2008). The longitudinal study of advanced L2 capacities: An introduction. In L. Ortega
& H. Byrnes (Eds.), The longitudinal study of advanced L2 capacities (pp. 3–​20). New York: Routledge.
Osborne, J. (2015). Transfer and learner corpus research. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The
Cambridge handbook of learner corpus research (pp. 333–​356). Cambridge: Cambridge University Press.
Pan, F., Reppen, R., & Biber, D. (2016). Comparing patterns of L1 versus L2 English academic
professionals: Lexical bundles in telecommunications research journals. Journal of English for
Academic Purposes, 21, 60–​71.
Paquot, M. (2017). L1 frequency in foreign language acquisition: Recurrent word combinations in French and
Spanish EFL learner writing. Second Language Research, 33(1), 13–​32.
Paquot, M. (2019). The phraseological dimension in interlanguage complexity research. Second Language
Research, 35(1), 121–​145.
Polio, C. (2017). Second language writing development: A research agenda. Language Teaching, 50(2),
261–​275.
Polio, C., & Park, J.H. (2016). Language development in second language writing. In R. Manchón & P.K.
Matsuda (Eds.), Handbook of second and foreign language writing (pp. 287–​306). Berlin: DeGruyter.
Polio, C., & Yoon, H.J. (2021). Exploring multi-​word combinations as measures of linguistic accuracy in
second language writing. In B. Le Bruyn & M. Paquot (Eds.), Learner corpus research meets second lan-
guage acquisition (pp. 96–121). Cambridge: Cambridge University Press.
Qin, W., & Uccelli, P. (2016). Same language, different functions: A cross-​genre analysis of Chinese EFL
learners’ writing performance. Journal of Second Language Writing, 33, 3–​17.
Reppen, R., & Grabe, W. (1993). Spanish transfer effects in the English writing of elementary students.
Lenguas Modernas, 20, 113–​128.
Römer, U., & Berger, C.M. (2019, advanced access). Observing the emergence of constructional know-
ledge: Verb patterns in German and Spanish learners of English at different proficiency levels. Studies in
Second Language Acquisition, 1–​22.
Römer, U., & Schulze, R. (Eds.). (2009). Exploring the lexis-​grammar interface. Philadelphia: John Benjamins.
Spoelman, M. (2014). The use of partitive plural predicatives by learners of Finnish from related and non-​
related L1 backgrounds: The same side of a slightly different coin. Apples: Journal of Applied Language
Studies, 8(3), 55–​70.
Staples, S., & Biber, D. (2015). Cluster analysis. In L. Plonsky (Ed.), Advancing quantitative methods in
second language research (pp. 243–​274). New York: Routledge.
Staples, S., Egbert, J., Biber, D., & McClair, A. (2013). Formulaic sequences and EAP writing develop-
ment: Lexical bundles in the TOEFL iBT writing section. Journal of English for Academic Purposes,
12(3), 214–​225.
Staples, S., & Reppen, R. (2016). Understanding first-​year L2 writing: A lexico-​grammatical analysis across
L1s, genres, and language ratings. Journal of Second Language Writing, 32, 17–​35.
Stefanowitsch, A., & Gries, S.T. (2003). Collostructions: Investigating the interaction of words and
constructions. International Journal of Corpus Linguistics, 8(2), 209–​243.

Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks.
Information Processing & Management, 45(4), 427–​437.
Verspoor, M., Schmid, M.S., & Xu, X. (2012). A dynamic usage based perspective on L2 writing. Journal of
Second Language Writing, 21(3), 239–​263.
Vyatkina, N., Hirschmann, H., & Golcher, F. (2015). Syntactic modification at early stages of L2 German
writing development: A longitudinal learner corpus study. Journal of Second Language Writing, 29, 28–​50.
Weigle, S.C., & Friginal, E. (2015). Linguistic dimensions of impromptu test essays compared with successful
student disciplinary writing: Effects of language background, topic, and L2 proficiency. Journal of English
for Academic Purposes, 18, 25–​39.
Wulff, S. (2016). A friendly conspiracy of input, L1, and processing demands: That-​variation in German and
Spanish learner language. In L. Ortega, A.E. Tyler, H.I. Park, & M. Uno (Eds.), The usage-​based study of
language learning and multilingualism (pp. 115–​136). Washington, DC: Georgetown University Press.
Wulff, S., & Gries, S. T. (2011). Corpus-​driven methods for assessing accuracy in learner production. In P.
Robinson (Ed.), Second language task complexity: Researching the cognition hypothesis of language
learning and performance (pp. 61–​87). Philadelphia: John Benjamins.
Yan, X., & Staples, S. (2020). Fitting MD analysis in an argument-​based validity framework for writing
assessment: Explanation and generalization inferences for the ECPE. Language Testing, 37(2), 189–​214.

27
DIRECTIONS FOR FUTURE
AUTOMATED ANALYSES OF
L2 WRITTEN TEXTS
Xiaofei Lu
Pennsylvania State University

Historical Perspectives
The past two decades have witnessed a tremendous growth of second language (L2) writing and
second language acquisition (SLA) research within the “data-​driven paradigm” (Barlow, 2000,
p. 17) informed in part by systematic, automated analyses of written texts produced by L2 learners.
The emergence and growth of this line of research did not come about lightly but entailed two non-
trivial prerequisites.
The first prerequisite was the availability of carefully constructed, adequately sized corpora
of written texts produced by L2 learners. Two important landmark efforts in this enterprise were
the publication of Learner English on Computer (Granger, 1998), “the first book devoted to the
idea of collecting a corpus of the language produced by foreign language learners” (Leech, 1998,
p. xiv) and the public release of the first version of the International Corpus of Learner English
(ICLE) (Granger, Dagneaux, & Meunier, 2002). Since that time, ICLE has released a second
version (Granger, Dagneaux, Meunier, & Paquot, 2009), and a host of other corpora either exclu-
sively of L2 writing or containing L2 writing samples have been compiled, such as the Written
English Corpus of Chinese Learners (WECCL) (Wen, Wang, & Liang, 2005), the Michigan Corpus
of Upper-​level Student Papers (MICUSP) (Römer & O’Donnell, 2011), the International Corpus
Network of Asian Learners of English (ICNALE) (Ishikawa, 2013), and the Longitudinal Database
of Learner English (LONGDALE) (Meunier, 2016).
The second prerequisite was the development of natural language processing (NLP) technolo-
gies and computational tools that made it possible to automate analyses of large corpora of written
texts produced by L2 learners in various ways. Concordancing programs such as WordSmith Tools,
with version 1.0 released in 1996 and now at version 8.0 (Scott, 2020), and AntConc, with version
1.0 released in 2002 and now at version 3.5.9 (Anthony, 2020), have played a long-​standing role
in learner corpus research. These programs offer the functionalities to generate word frequency
lists and to identify collocations, n-​grams, and keywords from a corpus without requiring the
corpus to be linguistically annotated beforehand. Corpus annotation tools such as part-​of-​speech
(POS) taggers, morphological analyzers, and syntactic parsers (see, e.g., Jurafsky & Martin, 2008;
Lu, 2014) initially designed for annotating corpora of first language (L1) texts have increasingly
been used to annotate corpora of L2 texts to enable a wider range of linguistic analysis, but the
accuracy with which they handle L2 texts needs to be carefully assessed. More recently, a number
of tools have been designed for analyzing written texts produced by L2 learners in specific ways.
For example, various systems have been designed for automated grammatical error detection
in learner writing (see, e.g., Leacock, Chodorow, Gamon, & Tetreault, 2014), such as Criterion
Online Writing Evaluation Service developed by Educational Testing Service (ETS) (available at
www.ets.org/​criterion) and the Grammar and Mechanics Error Tool (GAMET) (Crossley, Bradfield,
& Bustamante, 2019). The L2 Syntactic Complexity Analyzer (L2SCA) (Lu, 2010) and the Tool for
the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC) (Kyle, 2016) were
both designed for L2 writing syntactic complexity analysis. The capability to automate analyses
of written texts has been instrumental in the growth of SLA research in the data-​driven paradigm.

What We Know Already


As a research methodology, automated analyses of written texts produced by L2 learners have
enabled L2 writing and SLA researchers to address many questions of theoretical, pedagogical, and
practical significance. In this section, we briefly review the cumulative knowledge generated by
research that engaged this methodology in two prominent areas.
One prominent area of L2 writing and SLA research that employs automated analyses of written
texts has been the investigation of linguistic development in L2 writing, particularly the develop-
ment of various dimensions of complexity, accuracy, and fluency (CAF) (e.g., Wolfe-​Quintero,
Inagaki, & Kim, 1998). A sizable body of research in this area has focused on identifying valid
and reliable indices of L2 development and assessing their patterns of development, their ability
to discriminate different levels of language proficiency, their relationship to each other, and their
susceptibility to effects of different learner or task parameters (e.g., Lu, 2011; Yoon & Polio, 2017).
Take the construct of L2 writing syntactic complexity as an example. Syntactic complexity is
generally understood as the range and degree of sophistication of the grammatical structures used
in language production (Ortega, 2003). There is now consensus that syntactic complexity should be
conceptualized as a multidimensional construct encompassing global complexity (e.g., measured
using mean length of sentence), complexity by coordination (e.g., measured using T-​units per
sentence), complexity by subordination (e.g., measured using dependent clauses per clause), and
phrasal complexity (e.g., measured using complex nominals per clause) (e.g., Norris & Ortega,
2009). Multiple indices have been proposed to measure each of these dimensions (e.g., Kyle,
2016; Lu, 2017), and several computational tools have been designed to automate syntactic complexity analysis using those indices – in addition to L2SCA and TAASSC mentioned above, both the Biber Tagger (Biber, Johansson, Leech, Conrad, & Finegan, 1999) and Coh-Metrix (McNamara, Graesser, McCarthy, & Cai, 2014) can generate large sets of syntactic complexity indices.
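As a rough illustration, two of the indices named above can be approximated from a dependency parse as follows; the clause-counting heuristics are simplifications of how dedicated tools such as L2SCA operationalize these constructs, and the dependency labels are assumptions tied to spaCy's English model.

    import spacy

    nlp = spacy.load("en_core_web_sm")
    DEPENDENT_CLAUSE_LABELS = {"advcl", "ccomp", "relcl", "acl", "csubj"}    # rough proxy for dependent clauses

    def complexity_indices(text):
        doc = nlp(text)
        sentences = list(doc.sents)
        words = [t for t in doc if not t.is_punct and not t.is_space]
        finite_verbs = sum(1 for t in doc
                           if t.pos_ in {"VERB", "AUX"} and "Fin" in t.morph.get("VerbForm"))
        dependent_clauses = sum(1 for t in doc if t.dep_ in DEPENDENT_CLAUSE_LABELS)
        return {
            "mean_length_of_sentence": len(words) / len(sentences) if sentences else 0.0,
            "dependent_clauses_per_clause": dependent_clauses / finite_verbs if finite_verbs else 0.0,
        }

    print(complexity_indices("Although the task was difficult, the students who finished early said that they enjoyed it."))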
Given that texts produced by learners often contain some errors, researchers have paid attention
to the extent to which such tools can accurately analyze learner texts. Precision, recall, and F-​score
are commonly used to measure the performance of such tools on linguistic feature identification.
Consider a scenario where a tool is attempting to identify all verb phrases from a corpus. Precision
refers to the proportion of verb phrases identified by the tool that are actually verb phrases, recall
refers to the proportion of verb phrases in the corpus that are accurately identified by the tool, and
F-​score is the harmonic mean of precision and recall, calculated as (2 × precision × recall) /​(preci-
sion + recall). In a similar vein, these measures can also be used to report the overall performance
of a tool in identifying multiple types of target linguistic features (e.g., verb phrases, noun phrases,
and prepositional phrases). However, given the differences in the frequency and ease of identifi-
cation among different features, it is often useful to know how a tool performs on each individual
feature. In evaluating the performance of L2SCA, Lu (2010) manually annotated 20 essays written
by Chinese learners of English and reported F-​scores ranging from .916 to 1.0 for the identification
of different types of syntactic structures (e.g., dependent clauses and coordinate phrases), with the
exception of complex nominals, whose F-​score was .83. He further reported that the values of the
14 syntactic complexity indices generated by L2SCA, e.g., mean length of sentence and number
of dependent clauses per clause, correlated very strongly (r > .834) with those generated by human
annotators. Polio and Yoon (2018) assessed the reliability of 12 syntactic complexity measures in
L2SCA and eight measures in Coh-​Metrix. They reported very strong correlation coefficients (r >
.80) between the values generated by the tools and those generated by human annotators, with the
exception of one measure for each tool.
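Computed from raw counts, the three evaluation measures defined above reduce to a few lines of code; the counts in the example are invented.

    def precision_recall_f(true_positives, false_positives, false_negatives):
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        f_score = 2 * precision * recall / (precision + recall)
        return precision, recall, f_score

    # e.g., a tool proposes 95 verb phrases, 90 of which are genuine,
    # while the manual gold standard contains 100 verb phrases in total
    print(precision_recall_f(true_positives=90, false_positives=5, false_negatives=10))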
L2 writing syntactic complexity researchers have taken advantage of the capabilities of these
tools to increase both the scale of their data analysis and the scope of the research questions asked.
Various indices have been reported to be discriminative of spoken and written registers (Biber et al.,
2011) or different proficiency levels (Lu, 2011). There is collective evidence that coordination and
subordination may be indicative of beginning and intermediate levels of proficiency, respectively,
that phrasal complexity is indicative of an advanced level of proficiency, and that global complexity
tends to increase as proficiency increases (Norris & Ortega, 2009; Lu, 2011). It has been shown
that the development of at least some dimensions of syntactic complexity may take a relatively long period of time, as many measures have been found to discriminate non-adjacent proficiency levels
only (Lu, 2011), and several studies observed no development of multiple syntactic complexity
dimensions over the course of a semester (e.g., Bulté & Housen, 2014; Crossley & McNamara,
2014; Yoon & Polio, 2017). Additionally, various learner and task parameters have been reported to
affect L2 writing syntactic complexity, such as L1 background, genre, topic, and planning time (Lu,
2011; Lu & Ai, 2015; Yang, Lu, & Weigle, 2015; Yoon & Polio, 2017). The cumulative insights into
these issues have useful implications for L2 writing pedagogy, e.g., in terms of determining appro-
priate levels of linguistic complexity in materials development and syllabus design practices and
in terms of instructional attention to particular dimensions of syntactic complexity for learners at
particular levels of proficiency (e.g., Lu, 2011). One methodological caveat in L2 writing syntactic complexity research is that some indices of the same dimension may be redundant with each other. For example, Lu and Ai (2015) reported that three subordination measures (complex T-units per T-unit, dependent clauses per T-unit, and dependent clauses per clause) generated iden-
tical patterns of inter-​group difference in their investigation of L1-​related differences in syntactic
complexity. Such redundant indices should be selectively applied in future research, especially
when multivariate regression analysis is adopted.
Another body of L2 writing and SLA research in this area has attempted to use automated ana-
lyses of written texts to provide empirical evidence to illustrate, validate, compare, or challenge
different theoretical claims with respect to the nature of L2 development. Research from usage-​
based approaches takes the position that like L1 learning, L2 learning is achieved by learning
constructions (Römer, Skalicky, & Ellis, 2020) and has thus focused on the processes through
which different types of constructions emerge, grow, and mature in L2 learners and the factors that
affect those processes. Constructions are understood as conventionalized form-​meaning mappings
at varied levels of complexity that are entrenched as language knowledge in the speaker’s mind
(Goldberg, 1995). For example, through automated analyses of verb-​ argument constructions
(VACs) in a large corpus of written texts produced by L2 learners of English at different proficiency
levels, Römer (2019) found that “from lowest to highest proficiency levels, the VAC repertoire of
L2 English learners shows an increase in VAC types, growth in VAC productivity and complexity,
and a development from predominantly fixed sequences to more flexible and productive ones” (p. 270).
Wulff and Gries (2019) analyzed verb-​particle constructions in a corpus of spoken and written
productions by intermediate ESL learners from 17 L1 backgrounds and reported that “processing
demands, input effects, and native language typology jointly shape the degree to which learners’
choices of constructions are nativelike or not” (p. 1).
Viewing language development as a dynamic process, research from the dynamic systems
approach employs rich, longitudinal data to examine the trajectories of individual learners’ language
development, the interaction of different subsystems of the language in the developmental process,
as well as inter-​and intra-​individual variability in those developmental trajectories and interaction
patterns (Verspoor, De Bot, & Lowie, 2011). Through automated analyses of CAF features in lon-
gitudinal L2 writing data, multiple studies have reported evidence that the L2 developmental tra-
jectory is not discrete or linear but characterized by fluctuation, variation, and even regression, that
patterns of individual learner development and of interaction among different CAF features in the
developmental process are both highly variable, and that the variability of such patterns follow
the principles of dynamic systems (e.g., Caspi, 2010; Larsen-​Freeman, 2006; Verspoor, Lowie,
& Van Dijk, 2008). For example, Larsen-​Freeman (2006) found that over a period of six months,
a learner’s grammatical complexity may increase and then decrease, may compete with accuracy early on but co-develop with it later, and may exhibit a drastically different developmental trajectory and pattern of interaction with accuracy from those of another learner. Results from these studies
and many others of similar nature have clear theoretical significance.
An area of great practical significance is automated writing assessment, both in terms of
automated essay scoring (AES) and automated provision of feedback to L2 writers (e.g., Lu &
Bluemel, 2020). AES research generally adopts a construct-​driven approach. Such research starts
with an analysis of the definition of the writing construct being assessed, which is often avail-
able in the scoring rubric developed for the assessment. A number of features of written texts
that operationalize the various aspects of the writing construct are then identified. These features
are automatically derived from written texts using NLP tools and subsequently assessed for their
correlations with human ratings of writing quality. A scoring model is then developed using an
optimal set of features that are predictive of human ratings of writing quality to automatically
assign scores to writing samples, and the reliability of the model is rigorously evaluated. Many L2
writing studies, while not necessarily targeting the development of AES systems, have contributed
useful insights into the relationship of various linguistic features to human ratings of writing quality
(e.g., Biber, Gray, & Staples, 2016; Crossley & McNamara, 2014; Kyle & Crossley, 2017; Lu,
2017; Yang et al., 2015). A good example of a fully deployed AES system is the e-​rater, developed
at the ETS for scoring essays written by test-​takers in response to the TOEFL iBT independent and
integrated writing prompts (Ramineni, Trapani, Williamson, Davey, & Bridgeman 2012). E-​rater
incorporates a large number of features, including grammar, usage, mechanics, style, organiza-
tion, development, lexical complexity, positive attributes, and topic-​specific vocabulary usage,
in accordance with the key qualities of good writing defined in Culham’s (2003) six-​trait model.
The system employs a multiple regression procedure to combine features with strong correlations
with human ratings and achieves a very high level of reliability. For example, for independent
tasks, weighted kappa and Pearson correlation between scores assigned by human raters and the
e-​rater reach .69 and .74, respectively (Ramineni et al., 2012). While the bulk of AES research has
focused on English, some research on other languages exists as well, such as Chinese (e.g., Chang,
Lee, & Chang, 2006).
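A toy version of this construct-driven pipeline, with simulated features and ratings standing in for real essay data, might look as follows; it shows only the bare mechanics of fitting a regression-based scoring model and evaluating it with Pearson correlation and quadratically weighted kappa, not the feature engineering that real systems such as e-rater involve.

    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import cohen_kappa_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    X = rng.normal(size=(400, 6))                                     # simulated text features
    human = np.clip(np.rint(3 + X[:, :3].sum(axis=1) + rng.normal(scale=0.8, size=400)), 1, 5)

    X_tr, X_te, y_tr, y_te = train_test_split(X, human, test_size=0.2, random_state=7)
    machine = np.clip(np.rint(LinearRegression().fit(X_tr, y_tr).predict(X_te)), 1, 5)

    print("Pearson r:", round(pearsonr(y_te, machine)[0], 3))
    print("quadratic weighted kappa:",
          round(cohen_kappa_score(y_te.astype(int), machine.astype(int), weights="quadratic"), 3))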
In addition to assigning scores to written texts produced by L2 learners, automated writing
assessment systems can also provide useful feedback to L2 writers to help them improve their
writing. The types of feedback provided may include overall comments, comments on different
aspects or dimensions of the writing construct being assessed (e.g., grammar), and comments and
suggestions on specific problems or errors (e.g., grammatical errors). The ETS Criterion Online
Writing Evaluation Service mentioned above and the Jukuu online essay scoring system (avail-
able at www.pigai.org), for example, offer all three types of feedback. Another system, Grammarly
(available at www.grammarly.com), focuses primarily on detecting problems and errors and pro-
viding comments and suggestions for revision. Research has reported some positive impact of
automated feedback on learners’ writing strategy use and writing performance, along with some
concerns over misanalysis and feedback quality, among others (e.g., Cheng, 2017; Hsieh, Hiew, &
Tay, 2017; Lavolette, Polio, & Kahng, 2015; Ranalli, Link, & Chukharev-​Hudilainen, 2017).


A Research Agenda
As we have seen, much progress has already been made in developing NLP tools for automating
written text analysis and in using such tools to analyze written texts to address various research
questions of theoretical, methodological, and pedagogical significance in L2 writing and SLA. In
this section, I present a research agenda consisting of three primary research questions along with
relevant suggested empirical studies aimed at pushing this line of research even further.

Research Question 1: How Can We Improve the Accuracy of NLP Tools on Written Texts Produced by L2 Learners?

Many NLP tools designed for and trained on L1 texts have been used to analyze L2 texts. Needless
to say, the accuracy of the automated analyses of L2 texts performed by such NLP tools has direct
implications for the reliability of the results obtained through them and subsequently the validity
of the conclusions drawn from those results. For this reason, it is always good practice to expli-
citly report the accuracy of the NLP tools used in a specific study. As a field, we are certainly very
interested in finding ways to improve the accuracy of NLP tools on L2 texts. In what follows,
I suggest two empirical studies that aim to contribute to understanding the severity and impact of
this problem and to solving this problem, respectively.

Suggested Empirical Study 1


The first suggested empirical study aims 1) to evaluate the accuracy of relevant NLP tools on L2
texts by comparing automated analyses of L2 texts generated by them against manual analyses,
2) to identify possible sources of the performance errors of the NLP tools on L2 texts by conducting
a close error analysis of all misanalyzed data, 3) to determine whether the performance errors of
the NLP tools significantly affect the reliability of the research results and the conclusions drawn
from them by comparing the research results obtained through automated analyses of L2 texts
against those obtained through manual analyses, and 4) to assess the potential ceiling of perform-
ance improvement of the NLP tools on L2 texts by comparing their performance on L2 texts against
their performance on L1 texts.
This empirical study may focus on an NLP tool that has been or has the potential to be exten-
sively used in L2 writing research but that has not yet been systematically evaluated for its accuracy
on L2 texts. Some examples of such an NLP tool include corpus annotation tools such as POS
taggers (e.g., the Stanford POS Tagger, Toutanova, Klein, Manning, & Singer, 2003) and syntactic
parsers (e.g., the Stanford Parser, Klein & Manning, 2003) and corpus analysis tools such as Coh-​
Metrix (McNamara et al., 2014).
Data for this empirical study should ideally include a set of L2 texts produced by learners at
different proficiency levels (e.g., beginning, intermediate, and advanced) and of different genres
(e.g., narrative and argumentative texts) and L1 texts of the same genres. These texts will then
be automatically annotated or analyzed using the NLP tool in question, as well as manually
annotated using the same annotation scheme or analyzed for the same linguistic features or
measures. A satisfactory level of inter-​rater reliability should be achieved for the manual anno-
tation or analysis.
Once the texts have been automatically and manually annotated or analyzed, a series of analyses can then be performed to address the four sub-questions mentioned above. The automated
and manual analyses of the L2 texts can be compared to determine the accuracy of the NLP tool on
the complete set of L2 texts as well as on different subsets of the L2 texts by proficiency level and
by genre. The set of all instances in the L2 texts that have been erroneously annotated or analyzed
by the NLP tool can be closely examined to identify different types of error sources. If the set of


L2 texts has been or can be used to address a set of research questions, e.g., regarding linguistic
differences across different proficiency levels or genres, statistical analyses can be run separately
on the automatically analyzed texts and on the manually analyzed texts, and comparisons can then
be made between the two sets of statistical results to determine if the performance errors of the
NLP tool on the L2 texts have led to significant changes in such results. Finally, the accuracy of
the NLP tool on the L2 texts can be compared against its accuracy on the L1 texts, both overall
and by genre, to obtain a sense of the potential ceiling of any performance improvement that may
be attempted.
Results generated from such an empirical study will shed very useful light on the reliability and
error sources of automated analyses of L2 texts, the impact of performance errors of NLP tools on
the research results obtained, and the potential room for performance improvement.
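By way of illustration, the accuracy comparison described above might be organized along the lines of the following minimal Python sketch. The record fields, tag labels, and toy data are entirely hypothetical and are not drawn from any particular tool or corpus; the sketch simply computes token-level accuracy overall and by proficiency–genre subgroup and collects the mismatches for close error analysis.

```python
# Sketch: token-level accuracy of automated POS tags against manual
# (gold-standard) tags, overall and by proficiency level and genre.
# The records below are hypothetical; in a real study they would come
# from the manually and automatically annotated L2 and L1 texts.
from collections import defaultdict

texts = [
    {"proficiency": "intermediate", "genre": "argumentative",
     "gold_tags": ["DT", "NN", "VBZ", "JJ"], "auto_tags": ["DT", "NN", "VBZ", "NN"]},
    {"proficiency": "advanced", "genre": "narrative",
     "gold_tags": ["PRP", "VBD", "DT", "NN"], "auto_tags": ["PRP", "VBD", "DT", "NN"]},
]

def accuracy(pairs):
    return sum(1 for gold, auto in pairs if gold == auto) / len(pairs)

overall, by_group = [], defaultdict(list)
for text in texts:
    pairs = list(zip(text["gold_tags"], text["auto_tags"]))
    overall.extend(pairs)
    by_group[(text["proficiency"], text["genre"])].extend(pairs)

print(f"Overall accuracy: {accuracy(overall):.3f}")
for (proficiency, genre), pairs in sorted(by_group.items()):
    print(f"{proficiency:>12} / {genre:<14} {accuracy(pairs):.3f}")

# Misanalyzed tokens can then be collected for close error analysis.
errors = [(gold, auto) for gold, auto in overall if gold != auto]
print("Error instances (gold, automated):", errors)
```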

Suggested Empirical Study 2


The second suggested empirical study aims to improve the performance of NLP tools on L2 texts
by re-​training them using manually annotated corpora of L2 texts and to evaluate the relationship
between the amount of training data needed and the level of improvement achieved. This empirical
study may but does not have to follow the empirical study discussed above. In general, it may focus
on any NLP tool that was originally trained on L1 texts, that has useful applications in L2 writing
research, and that is in need of performance improvement on L2 texts. POS taggers and syntactic
parsers trained on L1 texts, for example, again constitute good examples of candidate NLP tools
for such a study.
The data for this study would be a corpus of written texts produced by L2 learners. Given that
a much larger dataset is generally required to train an NLP tool than to evaluate the performance
of an NLP tool, the corpus of L2 texts used here should be substantially larger than the set of L2
texts used in the suggested empirical study discussed above. The composition of the corpus should
ideally represent the full range of types of texts that the NLP tool is expected to analyze in L2 writing
research. This means that, at the minimum, it should include texts of different genres produced by
learners at different proficiency levels and from different L1 backgrounds. As in the first study, the
texts in the corpus need to be manually annotated using the same annotation scheme or analyzed for
the same linguistic features or measures as those used by the NLP tool, with a satisfactory level of
inter-​rater reliability (e.g., for POS tagging, a percent agreement of over 99% should be achieved).
A very small number of learner corpora have been annotated for similar purposes (e.g., Berzak
et al., 2016; Dahlmeier, Ng, & Wu, 2013; Ragheb & Dickinson, 2014).
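The inter-rater reliability check mentioned above is straightforward to compute once the two annotators' tag sequences are aligned. The sketch below uses hypothetical tag lists and reports both simple percent agreement and Cohen's kappa (via scikit-learn); any annotation scheme could be substituted.

```python
# Sketch: inter-rater reliability for the manual annotation step.
# rater_a and rater_b are aligned tag sequences over the same tokens
# (hypothetical data; any annotation scheme could be substituted).
from sklearn.metrics import cohen_kappa_score

rater_a = ["DT", "NN", "VBZ", "JJ", "NN", "IN", "DT", "NNS"]
rater_b = ["DT", "NN", "VBZ", "JJ", "NNS", "IN", "DT", "NNS"]

percent_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
kappa = cohen_kappa_score(rater_a, rater_b)

print(f"Percent agreement: {percent_agreement:.1%}")
print(f"Cohen's kappa: {kappa:.3f}")
```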
The computational part of the empirical study will then involve re-​training the NLP tool using a
training dataset (e.g., 80% of the manually annotated text samples), evaluating the performance of
the re-​trained tool using a development dataset (e.g., 10% of the manually annotated text samples)
followed by some revisions to the tool, and evaluating the performance of the revised tool using
a test dataset (e.g., 10% of the manually annotated text samples). Other experimental setups are
possible, too, such as ten-​fold cross-​validation, which means repeating the training and testing
experiment ten times, each with a different training dataset and test dataset (a development dataset
is not involved in this case), and then averaging the performance of the NLP tool from the ten
experiments. It may also be useful to examine how varying the amount of training data used may
affect the performance of the re-​trained NLP tool, as this can help determine whether more training
data helps further improve performance or results in overfitting, i.e., a situation where the trained
model fits well to the training data but not the test data. The performance improvement achieved by
re-​training the NLP tool can be reported by comparing the performance of the re-​trained NLP tool
on L2 texts against the performance of the original NLP tool on the same texts. The performance
of the NLP tool on different types of L2 texts (e.g., in terms of proficiency level and genre) should
also be reported.
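The experimental setups just described might be organized as in the following sketch, which is intended only to make the 80/10/10 split and the ten-fold cross-validation logic concrete. The training and evaluation calls are placeholders (hypothetical names) standing in for whatever tagger or parser is being re-trained, and the corpus is a dummy list of annotated samples.

```python
# Sketch: data splits for re-training and evaluating an NLP tool on a
# manually annotated corpus of L2 texts. train_tool and evaluate_tool
# are placeholders (hypothetical names) for the tool's own training and
# evaluation routines; the corpus here is a dummy list of sample IDs.
import random
from statistics import mean

from sklearn.model_selection import KFold

def train_tool(samples):
    """Placeholder: re-train the tagger/parser on the annotated samples."""
    return {"n_train": len(samples)}  # stands in for a trained model

def evaluate_tool(model, samples):
    """Placeholder: return accuracy of the (re-trained) tool on samples."""
    return 0.0  # replace with a real accuracy computation

annotated_texts = [f"sample_{i}" for i in range(100)]  # hypothetical corpus
random.seed(42)
random.shuffle(annotated_texts)

n = len(annotated_texts)
train = annotated_texts[: int(0.8 * n)]            # 80% training data
dev = annotated_texts[int(0.8 * n): int(0.9 * n)]  # 10% development data
test = annotated_texts[int(0.9 * n):]              # 10% held-out test data

model = train_tool(train)
# ... revise the tool against `dev`, then report final accuracy on `test` ...
print("Held-out accuracy:", evaluate_tool(model, test))

# Alternative setup: ten-fold cross-validation (no development set); the
# training/testing experiment is repeated ten times and the results averaged.
kfold = KFold(n_splits=10, shuffle=True, random_state=42)
fold_scores = []
for train_idx, test_idx in kfold.split(annotated_texts):
    fold_model = train_tool([annotated_texts[i] for i in train_idx])
    fold_scores.append(evaluate_tool(fold_model, [annotated_texts[i] for i in test_idx]))
print("Mean ten-fold accuracy:", mean(fold_scores))
```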


Research Question 2: How Can We Automatically Assess Whether the Linguistic Features Deployed in Written Texts Are Appropriate and Effective for the Rhetorical Functions They Are Used to Realize?
Previous research employing automated analyses of written texts, such as the types of studies
described in Section 2, has primarily focused on identifying and analyzing linguistic features indica-
tive of language proficiency, linguistic development, or writing quality. Such studies have provided
useful insights into which linguistic structures learners may benefit from acquiring. However, the
focus on frequency of linguistic forms divorced from their rhetorical functions, i.e., the commu-
nicative purposes that writers aim to achieve (e.g., indicating a research gap), is problematic for
several reasons. Specifically, it fails to capture the inseparable connections between linguistic forms
and rhetorical functions in the L2 writing construct, may negatively impact L2 writing learning
(e.g., learners trying to plug in desirable linguistic features that are inappropriate for the intended
rhetorical function), and may have even created a false ceiling for AES research. Meanwhile, aca-
demic writing scholars are beginning to shift their focus away from analyses which isolate form or
function towards explorations which unify the “functional patterns and constructions of different
academic genres” (O’Donnell, Römer, & Ellis, 2013, p. 84). An important question to ask is then
how we can automatically assess the functional effectiveness of linguistic features in L2 written
texts. I suggest two empirical studies that can contribute to answering this question below. In the
rest of the chapter, I use the term “function” (and its related forms, e.g., “functional” and “function-
ally”) to specifically refer to rhetorical functions, but the line of argument may be extended to other
compatible uses of the term.

Suggested Empirical Study 1


In order to be able to automatically assess the functional effectiveness of linguistic features, we
first need an understanding of the connections between linguistic forms and rhetorical functions
(hereafter, form-​function mappings) in expert or quality writing. Such mappings capture the range
and types of linguistic forms that are strongly associated with or effectively indicative of specific
rhetorical functions. For example, a sentence built around the sequences while much previous research has focused on … and no prior study has … serves to indicate a research gap. Various types of form-function mappings may be extracted from such a sentence, depending on the types and nature
of linguistic forms that are of interest. In terms of formulaic language use, for instance, we may
map the formulaic sequences while much previous research has focused on and no prior study has
to this function. At the syntactic level, we may map heavy left-​embeddedness (i.e., a large number
of words before the main verb) and more specifically, a subordinate clause of concession headed by
while before the main verb, to this function.
Some prior research has looked at formulaic language features in relation to rhetorical functions
in expert writing (Cortes 2013; Durrant & Mathews-​Aydınlı, 2011; Le & Harrington, 2015; Lim,
2010; Liu & Lu, 2020; Lu, Casal, & Liu, 2020; Lu, Yoon, & Kisselev, 2021; Omidian, Shahriari,
& Siyanova-​Chanturia, 2018), but the types of features examined are so far limited in scope
and research has not systematically examined form-​function mappings in L2 writing. The first
suggested empirical study thus aims to provide this understanding by systematically mapping lin-
guistic features of written texts extracted through automated analyses of written texts to the rhet-
orical functions they are used to realize in those texts.
The data for this study may consist of written texts produced by expert writers as well as written
texts of varied levels of quality produced by L2 writers (such as those produced for a standardized
language proficiency test). The first step of data analysis would be to identify and adapt an existing
taxonomy of rhetorical functions or to develop a new taxonomy, an effort that can be informed
by prior genre analysis research (e.g., Cotos, Huffman, & Link, 2017; Swales, 2004). Any given


rhetorical function may be realized in a text chunk of varied length (e.g., a paragraph, a few
sentences, a single sentence, a clause within a sentence, or a phrase). There may be an advantage
to treating the sentence as the unit of annotation for subsequent computational work to be described in
the next suggested empirical study, but I will leave this choice open here. Once the annotation unit
is determined, the taxonomy of rhetorical functions can then be used to annotate each unit in the
written texts for the rhetorical function or functions it realizes, with an acceptable level of inter-​rater
reliability. Next, linguistic features of one or several different types (e.g., lexical features, formulaic
language features, and syntactic features) that have been shown in previous L2 writing research
to be predictive of writing quality can be automatically extracted from the written texts, and each
occurrence of such a feature can then be mapped to the rhetorical function or functions of the sen-
tence (or other annotation unit) containing it.
In the case of expert writing, the form-​function mappings extracted provide a resource for deter-
mining what linguistic forms may be especially effective for realizing what rhetorical functions in
subsequent computational work to be described in the next suggested empirical study. They will
also constitute a useful resource for L2 writing pedagogy. In the case of L2 learner writing, it will
be useful to generate a repertoire of form-​function mappings for different levels of writing quality.
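As a concrete, if simplified, illustration of the mapping step, the sketch below links each occurrence of a target linguistic feature (here, a small hypothetical list of formulaic sequences) to the rhetorical function or functions of the sentence containing it. The annotated sentences, function labels, and sequences are all invented for illustration.

```python
# Sketch: mapping automatically extracted linguistic features to the
# rhetorical function(s) of the sentences that contain them. The
# annotated sentences, function labels, and target sequences are all
# hypothetical and serve only to illustrate the data flow.
from collections import Counter, defaultdict

annotated_sentences = [
    {"text": "While much previous research has focused on oral feedback, "
             "no prior study has examined written tasks.",
     "functions": ["indicating_research_gap"]},
    {"text": "The purpose of this study is to compare the two conditions.",
     "functions": ["stating_purpose"]},
]

# Features previously found to be predictive of writing quality
# (here, a small illustrative list of formulaic sequences).
target_sequences = [
    "much previous research has focused on",
    "no prior study has",
    "the purpose of this study is to",
]

form_function_map = defaultdict(Counter)
for sentence in annotated_sentences:
    lowered = sentence["text"].lower()
    for sequence in target_sequences:
        if sequence in lowered:
            for function in sentence["functions"]:
                form_function_map[function][sequence] += 1

for function, forms in form_function_map.items():
    print(function)
    for sequence, count in forms.most_common():
        print(f"  {count:>3}  {sequence}")
```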

Suggested Empirical Study 2


The second suggested empirical study aims to develop NLP tools to automatically annotate written
texts for rhetorical functions and to assess the degree of effectiveness of various form-​function
mappings in written texts. This study should ideally be preceded by the empirical study described
above and can use the same corpus of written texts produced by expert and L2 writers described in
the study above.
For the first part of the study, i.e., to develop NLP tools to automatically annotate written texts
for rhetorical functions, the corpus that has been manually annotated for rhetorical functions with
an appropriate taxonomy will be used. The task involves identifying an optimal machine learning
approach and an optimal set of linguistic features (see, e.g., Pustejovsky & Stubbs, 2012), training
a model using part of the annotated corpus, and evaluating the performance of the model on the
other part of the annotated corpus. Using the sentence as the unit of annotation will eliminate
problems in annotation unit segmentation (see, e.g., Cotos & Pendar, 2016). In particular, if the
annotation unit is allowed to vary in terms of its linguistic level, e.g., ranging from a phrase or a
clause to multiple sentences or a paragraph, the task of segmenting a written text into such units
will become highly challenging.
For the second part of the study, i.e., to develop NLP tools to automatically assess the degree of
effectiveness of various form-​function mappings in written texts, the complete set of form-​function
mappings derived from the corpus along with the indication of the expertise (i.e., expert vs. learner)
and quality level (i.e., different score levels) of each mapping will be used. The task involves using
part of the data to train a model that can classify each form-​function mapping into a quality category
and evaluating the performance of the model on the other part of the data.
Similar to the second study suggested for Research Question 1 above, it may be useful to
examine different experimental setups as well as the performance of the model on different types
of written texts for both tasks.
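One possible baseline for the sentence-level rhetorical function annotator, offered purely as a sketch, is a TF-IDF plus logistic regression classifier trained and evaluated on disjoint portions of the annotated corpus. The taxonomy labels, example sentences, and choice of learning algorithm below are hypothetical; the study itself leaves the optimal machine learning approach and feature set to be determined empirically.

```python
# Sketch: a baseline sentence-level rhetorical function classifier using
# TF-IDF features and logistic regression. The function labels, example
# sentences, and the choice of learning algorithm are all illustrative;
# a real study would train on a large manually annotated corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

sentences = [
    "While much previous research has focused on oral tasks, no prior study has examined writing.",
    "Little attention has been paid to the role of genre in these settings.",
    "The purpose of this study is to investigate task complexity effects.",
    "This paper aims to compare feedback conditions in two classrooms.",
    "To what extent does planning time influence written accuracy?",
    "A further question is whether these findings generalize to younger learners.",
    "These results suggest that proficiency mediates the observed effects.",
    "Taken together, the findings point to a strong role for engagement.",
]
labels = [
    "research_gap", "research_gap",
    "purpose", "purpose",
    "question_raising", "question_raising",
    "interpreting_results", "interpreting_results",
]

# Train on one part of the annotated corpus, evaluate on the other part.
X_train, X_test, y_train, y_test = train_test_split(
    sentences, labels, test_size=0.5, random_state=0, stratify=labels)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)),
                      LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), zero_division=0))
```

The same training-and-evaluation logic carries over to the second task, with extracted form-function mappings as input and quality categories (expert vs. learner, or score levels) as labels.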

Research Question 3: How Can the Capability to Automatically Identify Form-Function Mappings and Assess Their Appropriateness and Effectiveness in Written Texts Be Utilized to Inform and Promote the Teaching and Learning of L2 Writing?
The research outcomes of the empirical studies suggested for Research Question 2 above will have
direct implications for L2 writing and SLA research in terms of extending the analytical focus


from linguistic features in isolation to form-​function mappings. Furthermore, they can also lead to
the design of function-​oriented pedagogical resources as well as automated writing assessment or
aid systems that offer function-​oriented feedback to L2 writers (Lu, Casal, & Liu, 2021). In what
follows, I suggest two empirical studies aimed at designing such an automated writing assessment
or aid system and evaluating the usefulness of function-​oriented pedagogical resources and feed-
back in L2 writing teaching and learning.

Suggested Empirical Study 1


The first suggested empirical study aims to design an automated writing assessment system that
can offer function-​oriented feedback and to investigate the extent to which such a system can help
promote learner awareness of effective form-​function mappings and enhance learner performance
in using appropriate forms to effectively realize different rhetorical functions.
With the capability to automatically analyze the rhetorical functions of sentences (or other text
chunks) as well as to automatically assess the functional effectiveness of linguistic features in L2
written texts, it will become possible to design an automated writing assessment system similar
to the systems discussed in the second section, but now with the capability to automatically offer
L2 writers feedback on the functional appropriateness of relevant linguistic features found in the
written texts, in addition to feedback on linguistic forms alone. Such a system will analyze the
rhetorical function of each sentence (or any other unit of annotation used by the rhetorical function
annotator) in a written text, identify functionally effective and inappropriate language forms in
that sentence, and provide positive feedback on the effective forms and editing suggestions for
the inappropriate forms. For example, if the rhetorical function of a sentence is determined to
be question raising, the relevant linguistic features deployed in that sentence (e.g., lexical features,
formulaic language features, and syntactic features) can be analyzed in terms of their effective-
ness for that rhetorical function. Positive feedback can be given to features found to be effective
(e.g., a formulaic sequence such as to what extent does), while editing suggestions may be given to
those found to be inappropriate (e.g., a lengthy subordinate clause before the question, assuming
this is a feature that is rarely used for this rhetorical function in expert or quality writing).
Alternatively, the study may also design an interactive writing aid system that allows L2 writers
to specify the rhetorical function they are trying to realize at any point during the writing process
and either offer feedback on the language forms that the learners have produced for that function
or suggest language forms from the corpus of expert or high quality L2 writing for the learners’
consideration. Different from the automated writing assessment system described above, the inter-
active writing aid system facilitates the writing process and also has the advantage of avoiding
potential errors of automated rhetorical function annotation.
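To make the envisioned feedback loop concrete, the sketch below shows one way such a system might tie the pieces together: a rhetorical function prediction for each sentence, a lookup against a resource of effective and inappropriate forms for that function, and feedback messages for the writer. The function predictor and both form lists are placeholders (hypothetical), standing in for the classifier and form-function repertoires developed in the studies suggested for Research Question 2.

```python
# Sketch: the core loop of the envisioned function-oriented feedback
# system. predict_function and the two form lists are placeholders
# (hypothetical), standing in for the rhetorical function annotator and
# the form-function repertoires from the earlier suggested studies.
EFFECTIVE_FORMS = {
    "question_raising": ["to what extent does", "it remains unclear whether"],
    "research_gap": ["no prior study has", "little attention has been paid to"],
}
INAPPROPRIATE_FORMS = {
    "question_raising": ["as is well known"],  # illustrative only
}

def predict_function(sentence):
    """Placeholder for an automated rhetorical function annotator."""
    return "question_raising"

def feedback_for(sentence):
    function = predict_function(sentence)
    lowered = sentence.lower()
    notes = []
    for form in EFFECTIVE_FORMS.get(function, []):
        if form in lowered:
            notes.append(f"Effective for {function}: '{form}'")
    for form in INAPPROPRIATE_FORMS.get(function, []):
        if form in lowered:
            notes.append(f"Consider revising for {function}: '{form}'")
    return function, notes

print(feedback_for("To what extent does task complexity affect written accuracy?"))
```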

Suggested Empirical Study 2


The second suggested empirical study seeks to evaluate the usefulness of function-​oriented peda-
gogical resources, the automated writing assessment or aid system described above, and the
function-​oriented feedback the system provides for promoting learner awareness of effective form-​
function mappings and enhancing learner performance in using appropriate forms to effectively
realize different rhetorical functions.
The first empirical study suggested for Research Question 2 above can yield repertoires of
various types of form-​function mappings in expert or high quality L2 writing, in the form of lin-
guistic features (e.g., different types of formulaic sequences or complex syntactic structures) that
have been used to effectively realize different rhetorical functions. It will be highly useful to inves-
tigate the ways and extent to which such pedagogical resources can be used to promote L2 learners’
awareness of form-​function mappings and enhance their ability to deploy functionally effective


linguistic features in their own writing. To this end, systematic corpus-​based genre analysis activ-
ities designed to draw learner attention to form-​function mappings following the principles of data-​
driven learning (e.g., Boulton & Cobb, 2017) can be implemented throughout a writing course.
Various types of learner performance, perception, awareness, and behavior data can be gathered
at multiple time points to assess issues surrounding the implementation and usefulness of such
resources.
Similarly, it will be useful to investigate the pedagogical effectiveness of the automated writing
assessment or aid system designed in the first empirical study suggested for the current research
question. This part of the study may focus on how L2 writers use the function-​oriented feedback
provided by the system in the writing or revision process. It may also focus on the impact of
consistent, longer-​term use of such a system on L2 learners’ awareness and knowledge of form-​
function mappings and their skills to deploy functionally effective linguistic forms. Various types
of writing process data (e.g., screen capture data) and learner performance, perception, awareness,
and behavior data can be gathered over time.

Conclusion
The increasing availability of large-​scale corpora of written texts produced by L2 writers and
the capability of NLP tools to automate various types of linguistic analyses of written texts have
enabled L2 writing and SLA researchers to efficiently analyze and assess the linguistic features
used in L2 texts and subsequently employ that information to answer research questions of theor-
etical, pedagogical, and practical significance. In this chapter, I have proposed a research agenda
aimed at improving the accuracy of NLP tools on L2 texts, and more importantly, developing new
NLP tools with the capability to analyze linguistic features in relation to the rhetorical functions
they are used to materialize in expert and L2 texts, designing automated writing assessment or aid
systems that can offer function-​oriented feedback to L2 writers, and investigating the usefulness of
function-​oriented pedagogical resources and feedback in L2 writing teaching and learning. In my
view, the push for a functional turn in research on automated analyses of written texts is critical. It
is foreseeable that progress in this research area will lead to exponential growth in L2 writing and
SLA research that examines the development of L2 learners’ ability to use functionally appropriate
and effective linguistic forms, that assesses learner writing in terms of this ability, and that designs
corpus-​based pedagogical resources aimed at promoting this ability.

References
Anthony, L. (2020). AntConc (Version 3.5.9) [Computer software]. Tokyo: Waseda University. Retrieved from
www.antlab.sci.waseda.ac.jp/​
Barlow, M. (2000). [Review of the book Learner English on computer, by S. Granger (Ed.), 1998]. Language Learning & Technology, 3(2), 15–17.
Berzak, Y., Kenny, J., Spadine, C., Wang, J.X., Lam, L., Mori, K.S., Garza, S., & Katz, B. (2016). Universal
dependencies for learner English. In Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics (pp. 737–​746). Berlin: Association for Computational Linguistics. Retrieved
from https://​scholarspace.manoa.hawaii.edu/​bitstream/​10125/​25067/​1/​03_​02_​review1.pdf
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammat-
ical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–​35.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across language exam
task types and proficiency levels. Applied Linguistics, 37(5), 639–​668.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and
written English. New York: Longman.
Boulton, A., & Cobb, T. (2017). Corpus use in language learning: A meta-​analysis. Language Learning, 67(2),
348–​393.


Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-​term changes in L2 writing complexity.
Journal of Second Language Writing, 26, 42–​65.
Caspi, T. (2010). A dynamic perspective on second language development (Unpublished doctoral dissertation).
University of Groningen, the Netherlands.
Chang, T., Lee, C., & Chang, Y. (2006). Enhancing automatic Chinese essay scoring system from figures-​of-​
speech. In Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
(pp. 28–​34). Beijing: Tsinghua University Press.
Cheng, G. (2017). The impact of online automated feedback on students’ reflective journal writing in an EFL
course. The Internet and Higher Education, 34, 18–​27.
Cortes, V. (2013). The purpose of this study is to: Connecting lexical bundles and moves in research article
introductions. Journal of English for Academic Purposes, 12, 33–​43.
Cotos, E., Huffman, S., & Link, S. (2017). A move/​step model for methods sections: Demonstrating rigour and
credibility. English for Specific Purposes, 46(1), 90–​106.
Cotos, E., & Pendar, N. (2016). Discourse classification into rhetorical functions for AWE feedback. CALICO
Journal, 33(1), 92–​116.
Crossley, S.A., Bradfield, F., & Bustamante, A. (2019). Using human judgments to examine the validity
of automated grammar, syntax, and mechanical errors in writing. Journal of Writing Research, 11(2),
251–​270.
Crossley, S.A., & McNamara, D.S. (2014). Does writing development equal writing quality? A computational
investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66–​79.
Culham, R. (2003). 6 + 1 traits of writing: The complete guide. New York: Scholastic.
Dahlmeier, D., Ng, H.T., & Wu, S.M. (2013). Building a large annotated corpus of learner English: The NUS
Corpus of Learner English. Proceedings of the Eighth Workshop on Innovative Use of NLP for Building
Educational Applications (pp. 22–​31). Atlanta, GA: Association for Computational Linguistics.
Durrant, P., & Mathews-​Aydınlı, J. (2011). A function-​first approach to identifying formulaic language in aca-
demic writing. English for Specific Purposes, 30, 58–​72.
Goldberg, A.E. (1995). Constructions: A construction grammar approach to argument structure. Chicago:
University of Chicago Press.
Granger, S. (Ed.). (1998). Learner English on computer. Austin, TX: Addison Wesley Longman.
Granger, S., Dagneaux, E., & Meunier, F. (2002). The International Corpus of Learner English. Handbook and
CD-​ROM. Louvain-​la-​Neuve: Presses Universitaires de Louvain.
Granger, S., Dagneaux, E., Meunier, F., & Paquot, M. (2009). International corpus of learner English (Version
2.0). Louvain-​la-​Neuve: Presses Universitaires de Louvain.
Hsieh, Y., Hiew, C.K., & Tay, Y.X. (2017). Computer-​mediated corrective feedback in Chinese as a second
language writing: Learners’ perspectives. In D. Zhang & C. Lin (Eds.), Chinese as a second language
assessment (pp. 225–​246). Singapore: Springer.
Ishikawa, S. (2013). The ICNALE and sophisticated contrastive interlanguage analysis of Asian learners
of English. In S. Ishikawa (Ed.), Learner corpus studies in Asia and the world (Vol. 1) (pp. 91–​118).
Kobe: Kobe University.
Jurafsky, D., & Martin, J.H. (2008). Speech and language processing (2nd ed.). Upper Saddle River,
NJ: Prentice Hall.
Klein, D., & Manning, C.D. (2003). Fast exact inference with a factored model for natural language parsing.
In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in neural information processing systems 15
(pp. 3–​10). Cambridge, MA: MIT Press.
Kyle, K. (2016). Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity
and usage-​based indices of syntactic sophistication (Unpublished Doctoral Dissertation). Georgia State
University, Atlanta.
Kyle, K., & Crossley, S.A. (2017). Assessing syntactic sophistication in L2 writing: A usage-​based approach.
Language Testing, 34(4), 513–​535.
Larsen–​Freeman, D. (2006). The emergence of complexity, fluency, and accuracy in the oral and written pro-
duction of five Chinese learners of English. Applied Linguistics, 27(4), 590–​619.
Lavolette, E., Polio, C., & Kahng, J. (2015). The accuracy of computer-​assisted feedback and students’
responses to it. Language Learning & Technology, 19(2), 50–​68.
Le, T.N.P., & Harrington, M. (2015). Phraseology used to comment on results in the Discussion section of
applied linguistics quantitative research articles. English for Specific Purposes, 39, 45–​61.
Leacock, C., Chodorow, M., Gamon, M., & Tetreault J. (2014). Automated grammatical error detection for
language learners (2nd ed.). San Rafael, CA: Morgan & Claypool.
Leech, G. (1998). Learner corpora: What they are and what can be done with them. In S. Granger (Ed.),
Learner English on computer (pp. xiv–​xx). London: Addison Wesley Longman.


Lim, J. M. (2010). Commenting on research results in applied linguistics and education: A comparative genre-​
based investigation. Journal of English for Academic Purposes, 9, 280–​294.
Liu, Y., & Lu, X. (2020). N1 of N2 constructions in academic written discourse: A pattern grammar analysis.
Journal of English for Academic Purposes, 47, 1–​11.
Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal
of Corpus Linguistics, 15(4), 474–​496.
Lu, X. (2011). A corpus-​based evaluation of syntactic complexity measures as indices of college-​level ESL
writers’ language development. TESOL Quarterly, 45(1), 36–​62.
Lu, X. (2014). Computational methods for corpus annotation and analysis. Dordrecht: Springer.
Lu, X. (2017). Automated measurement of syntactic complexity in corpus-​based L2 writing research and
implications for writing assessment. Language Testing, 34(4), 493–​511.
Lu, X., & Ai, H. (2015). Syntactic complexity in college-​level English writing: Differences among writers with
diverse L1 backgrounds. Journal of Second Language Writing, 29, 16–​27.
Lu, X., & Bluemel, B. (2020). Automated assessment of language. In S. Conrad, A. Hartig, & L. Santelmann
(Eds.), The Cambridge introduction to applied linguistics (pp. 86–​93). Cambridge: Cambridge University
Press.
Lu, X., Casal, J.E., & Liu, Y. (2020). The rhetorical functions of syntactically complex sentences in social
science research article introductions. Journal of English for Academic Purposes, 44, 1–​16.
Lu, X., Casal, J.E., & Liu, Y. (2021). Towards the synergy of genre-​and corpus-​based approaches to aca-
demic writing research and pedagogy. International Journal of Computer Assisted Language Learning and
Teaching, 11(1), 59–​71.
Lu, X., Yoon, J., & Kisselev, O. (2021). Matching phrase-​frames to rhetorical moves in social science research
article introductions. English for Specific Purposes, 61, 63–​83.
McNamara, D.S., Graesser, A.C., McCarthy, P.M., & Cai, Z. (2014). Automated evaluation of text and dis-
course with Coh-​Metrix. Cambridge: Cambridge University Press.
Meunier, F. (2016). Introduction to the LONGDALE project. In E. Castello, A. Katherine, & F. Coccetta
(Eds.), Studies in learner corpus linguistics. Research and applications for foreign language teaching and
assessment (pp. 123–​126). Berlin: Peter Lang.
Norris, J.M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The
case of complexity. Applied Linguistics, 30(4), 555–​578.
O’Donnell, M., Römer, U., & Ellis, N.C. (2013). The development of formulaic sequences in first and second
language writing: Investigating effects of frequency, association, and native norm. International Journal of
Corpus Linguistics, 18(1), 83–​108.
Omidian, T., Shahriari, H., & Siyanova-​Chanturia, A. (2018). A cross-​disciplinary investigation of multi-​word
expressions in the moves of research article abstracts. Journal of English for Academic Purposes, 36, 1–​14.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis
of college-​level L2 writing. Applied Linguistics, 24(4), 492–​518.
Polio, C., & Yoon, H.J. (2018). The reliability and validity of automated tools for examining variation in syn-
tactic complexity across genres. International Journal of Applied Linguistics, 28(1), 165–​188.
Pustejovsky, J., & Stubbs, A. (2012). Natural language annotation for machine learning: A guide to corpus-​
building for applications. Sebastopol, CA: O’Reilly.
Ramineni, C., Trapani, C.S., Williamson, D.M., Davey, T., & Bridgeman, B. (2012). Evaluation of the e-​rater
scoring engine for the TOEFL independent and integrated prompts. ETS Research Report (ETS RR-​12-​06).
Princeton, NJ: Educational Testing Service.
Ranalli, J., Link, S., & Chukharev-Hudilainen, E. (2017). Automated writing evaluation for formative
assessment of second language writing: Investigating the accuracy and usefulness of feedback as part of
argument-​based validation. Educational Psychology, 37(1), 8–​25.
Ragheb, M., & Dickinson, M. (2014). Developing a corpus of syntactically-annotated learner language for
English. Proceedings of the 13th International Workshop on Treebanks and Linguistic Theories (pp. 293–​
300). Tübingen: Universität Tübingen.
Römer, U. (2019). A corpus perspective on the development of verb constructions in second language learners.
International Journal of Corpus Linguistics, 24(3), 268–​290.
Römer, U., & O’Donnell, M.B. (2011). From student hard drive to web corpus (part 1): The design, compil-
ation and genre classification of the Michigan Corpus of Upper-​level Student Papers (MICUSP). Corpora,
6(2), 159–​177.
Römer, U., Skalicky, S.C., & Ellis, N.C. (2020). Verb-​argument constructions in advanced L2 English learner
production: Insights from corpora and verbal fluency tasks. Corpus Linguistics and Linguistic Theory,
16(2), 303–​331.


Scott, M. (2020). WordSmith Tools (Version 8.0) [Computer software]. Liverpool: Lexical Analysis Software.
Retrieved from www.lexically.net/​wordsmith/​
Swales, J.M. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press.
Toutanova, K., Klein, D., Manning, C., & Singer, Y. (2003). Feature-​rich part-​of-​speech tagging with a
cyclic dependency network. Proceedings of the 2003 Human Language Technology Conference of the
North American Chapter of the Association for Computational Linguistics (pp. 252–​259). Edmonton: The
Association for Computational Linguistics.
Verspoor, M.H., De Bot, K., & Lowie, W. (Eds.). (2011). A dynamic approach to second language develop-
ment: Methods and techniques. Amsterdam: John Benjamins.
Verspoor, M.H., Lowie, W., & van Dijk, M. (2008). Variability in second language development from a
dynamic systems perspective. Modern Language Journal, 92(2), 214–​231.
Wen, Q., Wang, L., & Liang, M. (2005). Spoken and written English corpus of Chinese learners. Beijing: Foreign
Language Teaching and Research Press.
Wolfe-​Quintero, K., Inagaki, S., & Kim, H.-​Y. (1998). Second language development in writing: Measures
of fluency, accuracy and complexity. Honolulu, HI: University of Hawaii, Second Language Teaching and
Curriculum Center.
Wulff, S., & Gries, S.T. (2019). Particle placement in learner language. Language Learning, 69(4), 873–​910.
Yang, W., Lu, X., & Weigle, S.A. (2015). Different topics, different discourse: Relationships among writing
topic, measures of syntactic complexity, and judgments of writing quality. Journal of Second Language
Writing, 28, 53–​67.
Yoon, H.-​J., & Polio, C. (2017). The linguistic development of students of English as a second language in two
written genres. TESOL Quarterly, 51(2), 275–​301.

Coda
28
IMPLICATIONS OF SLA-ORIENTED RESEARCH FOR THE TEACHING OF L2 WRITING
Dana Ferris
University of California, Davis

In 2002, I was asked to teach an advanced graduate course in second language acquisition (SLA) in
place of a colleague who would be on sabbatical. This course was his favorite, so I knew I’d likely
never get another chance to teach it. I was being a good team player and helping out my colleague
and the department, but I gained something important for myself from the experience of preparing
for and teaching that course. For the first time I read SLA-​focused research that was relevant to my
own interests in error correction and second language writing (SLW).
My interest in the language and error correction side of teaching L2 writing was not new at
that point, but up until then it had been informed by research in foreign language writing and the
still minimal research base on error correction in ESL/​EFL writing or in L1 composition studies.
Though I had taken courses in SLA in graduate school, I had never, until I taught the SLA course in
2002, come across the research base on corrective feedback (e.g., Doughty & Varela, 1998) or focus
on form (Doughty & Williams, 1998). I realized that this research base had a good deal to teach me
in my own work on understanding how (or if) corrective feedback and form-​focused instruction
were effective and valuable in the teaching of L2 writing. Further, the theoretical work we read in
the class about noticing (Schmidt, 1990), output (Swain, 1995), and interaction (Long & Porter,
1985) gave me a broader perspective on why these pedagogical interventions might be important.
After 2002, I continued reading and learning about the intersections of SLA and L2 writing,
which led to an article on the different perspectives on written corrective feedback (WCF) in SLA
and L2 writing published in a special issue of Studies in Second Language Acquisition (Ferris,
2010) and later to my book-​length collaboration with John Bitchener (Bitchener & Ferris, 2012) in
which we discussed theoretical and practical overlaps and distinctions between the two fields and
their approach to WCF. Thus, I was happy to be invited to read the chapters in this new collection
and to reflect on what we can learn from them about the teaching of L2 writing and writers.
The lack of connections between SLA and writing has always been surprising to me. For
instance, in SLA, studies of corrective feedback were heavily slanted towards oral correction until
perhaps 10–​15 years ago, and in this volume, Gilabert and Vasylets (Chapter 4) note that most
task-​based language teaching (TBLT) research to date has focused on oral, rather than written,
tasks. Further, particularly in L1-​focused writing studies, facilitating language development in
the context of writing instruction has been a neglected topic in research and in teacher prepar-
ation (see MacDonald, 2007; Matsuda, 2012) that is still controversial to this day, a point to which
I return later.




The authors in this volume make a coherent argument that writing has great potential to
facilitate advanced SLA and indeed that the process of writing may offer rich opportunities for
linguistic development. For example, Johnson (Chapter 5) argues that in authentic writing tasks,
“…learners must grapple with the linguistic resources available to them in an effort to com-
municate cognitively demanding messages.” It is clear that, for the authors and editors of this
volume, writing is no longer the neglected so-​called fourth skill (after listening, speaking, and
reading) in SLA but rather has a unique, complex, and valuable role to play. But what do their
insights contribute to those of us focused on the teaching of L2 writing and writers? The rest of
this coda focuses on what the SLW community can learn from this like-​minded community of
SLA-​oriented L2 writing researchers.

Overarching Concepts

Writing to Learn
Two recurring themes in this handbook are foundational for considering the suggestions for prac-
tice contained in individual chapters. First, Manchón (2011) distinguished between learning to
write and writing to learn. This distinction has helped to frame the discussion of language acqui-
sition in the context of writing instruction. Writing to learn is not a new concept for United States-​
based writing specialists who are familiar with it from the Writing Across the Curriculum (WAC)
movement (e.g., Bean, 2011). However, in this SLA context, it means something different. In WAC,
writing to learn refers to setting writing tasks for students to help them process course content. For
example, students might be asked at the end of a lecture in a particular disciplinary course to write
a one-​minute summary of what they learned or to pose questions that the day’s material has raised
for them.
In contrast, in the SLA context, writing to learn refers rather to having learners write as a way
to expand their language resources, or to practice, apply, and solidify language they have learned.
Learning to write, on the other hand, refers to teaching students how to improve their writing
processes and genre awareness as a means to accomplish their communicative purposes inside and
beyond the writing/​composition classroom. Several authors in this collection make the generaliza-
tion that in ESL contexts, learning to write is emphasized, while in EFL settings, writing to learn
(language) might be the primary (although not always the only) focus.
The chapters in this volume, taken together, make the argument that both approaches to writing
instruction are important for a broad range of theoretical and practical reasons. Researchers and
practitioners focused especially on the SLA benefits of writing activities should keep in mind that
meaningful, authentic tasks will help learners’ motivation and keep them cognitively and affectively
engaged in ways that will facilitate language processing. On the other hand, writing instructors who
see learning to write as their sole task without facilitating language development are missing a crit-
ically important piece of the puzzle, particularly for their L2 students but even for L1 writers,
a point emphasized by Polio (2019) in a recent commentary.
Unfortunately, many writing instructors do not necessarily view themselves as language
teachers –​or even see facilitating language acquisition as an important part of their job. There
are several reasons for this lack of interest and/​or sense of urgency. First, depending upon their
academic backgrounds, they may not have received any/​much training in language teaching, lan-
guage acquisition, or even the linguistics of the target language. Many writing professionals have
completed their training in composition and rhetoric, literature, creative writing, or other human-
ities disciplines, but not applied linguistics. Even if they wanted to help their student writers with
their language development, they might not have the content knowledge or pedagogical knowledge
to do so competently and confidently. Second, writing studies as a field has largely been dismissive
of, if not outright hostile toward, considerations of language as a focus of writing instruction. Thus,


scholars in the field who prepare future teachers do not emphasize issues such as formal language
instruction or WCF and in some cases will actively discourage it. The reasons for this dismissal
are both practical –​a belief that research proves that classroom grammar instruction and error
correction are futile and even counterproductive (e.g., Connors & Lunsford, 1988; Hartwell, 1985;
Lunsford & Lunsford, 2008; Santa, 2006) –​and philosophical: many writing studies scholars fear
that attention to language reifies a deficit view of multilingual students and reflects an inappropriate
bias toward monolingualism (Canagarajah, 2017; Horner, Lu, Royster, & Trimbur, 2011; Horner,
NeCamp, & Donahue, 2011; Silva & Wang, 2021).
While some of these concerns are valid and indeed are echoed by some authors in this volume
(e.g., Schoonen & van Vuuren, Chapter 8; Gentil, Chapter 9), a concern with not overemphasizing
language in writing instruction should not be license for ignoring it altogether. As I have noted
elsewhere,

…successful writing, by definition, includes and requires the effective deployment of a range of linguistic and extralinguistic features…Such decision-making goes far beyond
simply avoiding errors…When we consider how much tacit and explicit linguistic know-
ledge goes into every sentence we utter and write, not to mention how complex these lan-
guage options can be and how daunting they are for L2 acquirers to master, it is amazing
that more attention is not given to this topic…
Ferris & Hedgcock, 2014, p. 310

Discussing teachers of academic writing, Caplan (Chapter 20) adds that “all writing teachers in all
academic settings need to see themselves as language teachers…” Happily, the theory, research,
and practical suggestions in this volume provide a way forward for learning-​to-​write-​focused
instructors to consider how to incorporate more writing-​to-​learn activities and goals in their curric-
ulum, tasks, and materials.

Intention and Attention


Writing specialists reading this handbook will not find many brand-​new suggestions in the
implications for practice sections of the various chapters –​at least not on the surface.
practical ideas are things that writing teachers usually do and have been doing for quite some time.
However, what is new, and very important, is the argument that these activities should explicitly
and intentionally move students’ attention toward the language of their writing. As an applied lin-
guist who is firmly anchored in writing studies –​I teach and am a program administrator in a large
writing program in the United States –​I can attest that many writing instructors in the learning-​to-​
write context do not see attention to language as an important priority for instruction. To provide
just one example of this, in our advanced writing courses, which are geared towards writing for
specific disciplines or in professional settings, we regularly utilize collaborative writing activities.
However, the goal behind such assignments is to build students’ psychosocial strategies for navi-
gating collaborative activities in their major courses or in a future workplace. We rarely or never
talk about how the process of collaborating over a jointly authored text forces students to process
language more deeply –​both the language of the text itself and the language needed to negotiate in
a potentially fraught environment (see Storch, Chapter 3; Michel et al., Chapter 6; Coyle & Roca de
Larios, Chapter 7). The learning-​to-​write perspective on collaborative writing is not wrong. Rather,
it just does not go far enough in maximizing the potential of such activities for a range of learners
and learning objectives.
The practical suggestions in this book, if applied, will help learners pay attention to language
through the process of writing. However, before that can happen, writing instructors themselves
need to decide to be intentional about language development –​writing to learn –​in addition to their


already established learning-to-write objectives. Teachers' intention to focus learners' attention on language while in the process of writing and processing feedback will likely give students max-
imum opportunities to make progress in SLA through both learning-​to-​write and writing-​to-​learn
activities, as more fully discussed in diverse chapters in this handbook (see especially chapters in
Part I, and Chapters 6, 7, 22, and 23). I will now turn to discussing the suggestions for practice
included throughout the handbook through the lens of both approaches and intention and attention.

Pedagogical Suggestions
With some 26 chapters in this book, there are too many specific ideas to do justice to all of
them in this brief coda. Rather, I will divide my discussion among three important areas: tasks
(including matters of writing assessment), feedback, and learning contexts (including both learner
characteristics and different instructional settings and modes).

Tasks
The different suggestions in various chapters argue that writing tasks, to facilitate language pro-
cessing resulting in uptake for language acquisition, should be challenging, authentic, and motiv-
ating. The chapters in Part II (e.g., Gilabert & Vasylets, Chapter 4; Johnson, Chapter 5) argue that
various theoretical models on task complexity predict that increased task complexity in L2 writing
will direct learners’ attention to language and thus likely promote SLA. They suggest that using
multiple modes for the same task, such as speaking-​to-​writing and writing-​to-​speaking sequences,
will direct students’ attention to the language of oral tasks versus the language used in writing.
Johnson adds that asking learners to explore a variety of genres rather than writing the same type of
text repeatedly will require students to notice and produce language appropriate to specific genres
and registers.
In earlier models of teaching second language writers, controlled composition activities would
prompt students to practice specific language forms they had learned, such as converting a past
tense narrative (“Last Sunday, I…”) to present tense (“Every Sunday, I…”). In contrast, many of
the authors in this volume suggest that students be asked to produce authentic text types and genres.
There are several sound reasons for this suggestion. First, it is more useful for students to grapple
with authentic forms in text models they read and write than artificial ones. In the controlled past-​
to-​present example just cited, students might indeed get a bit of hands-​on practice in producing
specific verb tenses but would not learn how or why, in authentic narratives, writers often switch
frequently between present and past tense. It would be much more on point to have learners analyze
an authentic narrative –​a news story, a plot summary, a retelling of a historical event –​to observe
how tenses are used naturally, and perhaps, as suggested by Coyle and Roca in Chapter 7, use
models of such texts as a feedback strategy.
Second, it is more motivating for students to read and write in genres they might actually
encounter and use outside of a language or writing class. In a large study of advanced university
students in the United States called The Meaningful Writing Project (Eodice, Geller, & Lerner
2017), the researchers learned that students across various disciplines and institutions found writing
assignments engaging when they were asked to produce genres in which they had never written. As
noted in Papi’s contribution (Chapter 12), if students are motivated and engaged in a writing task,
they are more likely to pay attention to all aspects of producing that text, including the language
that they use.
Several chapters in this volume make suggestions about encouraging lexical development,
including the appropriate production of lexical bundles (Kyle, Chapter 13; Yoon, Chapter 15) in the
context of writing and about how to measure lexical and syntactic development and performance


in assessment tasks (Plakans & Ohta, Chapter 17). While agreeing that attention to these matters
is important for L2 writing teachers, it is also fair to note that applying some of these ideas in a
learning-​to-​write context can be easier said than done. Learning-​to-​write syllabi typically are task-​
based (structured around multi-​stage writing assignments), and a good deal of time and energy is
devoted to generating content, rhetorical considerations of purpose, exigence (the importance or
urgency of the issue being considered), and audience, and to going through steps of the writing
process (planning, drafting, obtaining feedback at intermediate stages, revising, and editing). It is
challenging for teachers to figure out when and how to insert discussions of target vocabulary and
lexical bundles, and deciding how to assess language use along with everything else that is included
in a rubric for a writing assignment or assessment task can be controversial. In my own program, for
instance, there are differences of opinion among instructors who believe we should grade far more
harshly on grammar issues and those who believe that language facility is not an important part of
assessing writing at all (see Chapter 17 by Plakans & Ohta).
A final point on the subtopic of task and language development is whether there is any utility
for on-​demand (no prior preparation), timed writing tasks in either learning-​to-​write or writing-​to-​
learn contexts. In the task section of the volume in Part II (Chapters 4 and 5), the authors mention
that allowing students time to plan before writing seems to improve linguistic performance. Yet,
allowing time to plan is only relevant in timed writing situations –​in process-​oriented classrooms,
the students have days or even weeks to produce their texts, and planning and/​or prewriting is an
integral part of the instructional sequence. Timed writing tasks are really mostly or perhaps even
only useful for research purposes (i.e., to obtain writing samples under equivalent conditions), with
the possible exception of writing or language courses that are specifically designed for standardized
test preparation. Even for research, the ultimate utility of such tasks is questionable. Researchers
may find out what students can or cannot do under artificial conditions (still a most relevant and
valid aim for certain research purposes), but that may or may not mirror what they could do in
authentic settings; hence such tasks may be questionable in their ecological validity. In any case, many authors
in this volume (see, e.g., Michel et al., Chapter 6; Coyle & Roca, Chapter 7) call for both pedagogy
and future research to focus on authentic classroom conditions rather than research or laboratory
settings.

Feedback
Many authors in this collection discuss written corrective feedback (WCF) as a mechanism to
facilitate SLA (see especially Chapters 2–​3 and 10–​12). SLA perspectives on WCF have evolved
over time. In the first decade or so after Truscott’s (1996) incendiary essay was published, most
SLA researchers seemed to agree with him that the application of information provided by WCF to
revised texts was not meaningful and that only new texts written under posttest or delayed posttest
conditions mattered as evidence that WCF “works.” Now, however, at least some experts (see
Chapters 2–​3, in particular) acknowledge that the intermediate step of having learners apply WCF
to existing texts rather than new ones can lead to uptake, and in so doing, facilitate longer-​term
acquisition. Indeed, among the suggestions for practice given in Kang and Han (Chapter 16) and
Coyle and Roca (Chapter 7) are recommendations that teachers guide students in analyzing and
applying the WCF they have received (e.g., charting their errors/corrections or revising/rewriting errors in their texts) to explicitly assist in such uptake.
To the dismay of some L2 writing experts, WCF has (by far) been the most studied topic in
second language writing over the past 25 years (Atkinson & Tardy, 2018). As research has continued,
applications for pedagogy have become more nuanced, as can be seen in different contributions to
this volume. Suggestions have evolved from the earlier rigid prescriptions (“only new texts count
as evidence”; “direct feedback is always better than implicit correction”) to “well, it depends” –​on


learner proficiency, on the goals and purposes of the learning context and the tasks, and on indi-
vidual learner differences. For instance, Roca and Coyle (Chapter 7) argue that considering indi-
vidual writers’ engagement with WCF is a critically important focus for future L2 writing research
and pedagogy. Though authors in this collection make some general suggestions that seem to be
well supported across the research base, they also call for future studies to take up some of the more
complex questions related to individual differences and to use a broader range of research methods
to investigate WCF phenomena. For instance, Leow and Manchón (Chapter 22), suggest a renewed
research focus on processing conditions for both the writing and revision stages as learners apply
WCF to existing (i.e., revision) and new texts.
Several generalizations do appear to hold across a broad range of available WCF studies:

1. WCF should be focused on a few patterns of error, rather than a comprehensive marking of all
instances of all types of errors, so that learners are not overwhelmed with too much information
on the cognitive level or too much negative feedback on the affective level.
2. While direct corrections seem to provide more straightforward feedback for language acqui-
sition than does implicit feedback (e.g., underlining or circling errors), metalinguistic rule
reminders may be more important than either type of correction. Without that information,
students may be able to notice a correction and agree that it’s accurate (because it “sounds
right”) but may not be able to understand why it was incorrect and avoid the error the next time.
3. Promoting student engagement with WCF is very important to its success and a factor that has
been overlooked and underappreciated in most WCF research.

In addition to the above generalizations, which can apply to any WCF activities, several authors in
this handbook make a direct connection between collaborative writing and WCF. Indeed, Storch,
who wrote the chapter on theoretical perspectives of collaborative writing in this volume (Chapter 3),
recently published a paper arguing that collaborative writing is more beneficial to learners than are
traditional peer feedback activities because learners are more invested in a jointly authored text
than in giving quality feedback about someone else’s text (Storch, 2019). As has been well argued
by Storch and others in this volume and elsewhere (e.g., Bitchener & Storch, 2016), collaborative
writing provides unique opportunities for learners to engage with language in the process of shared
meaning-​making, and WCF provided by peer co-​authors (e.g., in the margins of a Google Doc) will
likely motivate student writers toward deeper processing of that feedback, since they have a more
significant stake in understanding whether the feedback is right or wrong. Shintani and Aubrey, in
their chapter on computer-​mediated feedback (Chapter 21) also suggest that synchronous WCF
provided by the teacher as part of collaborative writing activities can be an innovative approach to
providing multiple sources of WCF (peers and teacher) to multiple learners at a teachable moment,
as they are negotiating over their jointly authored text. In sum, nearly a quarter-​century of theory
and research on WCF predicts that it is beneficial for promoting SLA under the right conditions.

Learning Contexts and Learner Characteristics


One of the most appealing contributions of this handbook is its broad scope. Different authors dis-
cuss the role of writing in facilitating SLA in a broad range of contexts, from primary and secondary
learners (Coyle & Roca de Larios, Chapter 10; Pérez-​Vidal & Lasagabaster, Chapter 18) to study
abroad sojourners (Vallejos & Sanz, Chapter 19) to learners who need to master advanced academic
discourse in their specific fields of study (Caplan & Polio, Chapter 20). It also acknowledges the
rapidly advancing areas of computer-​mediated writing instruction and language learning (Shintani
& Aubrey, Chapter 21) as well as the importance of multimodality in our considerations of how
language is acquired and how instructed SLA might progress (Lim & Kessler, Chapter 24).

For some years, researchers interested in WCF, whether from a learning-​to-​write or a writing-​
to-​learn perspective, have called for increased attention to individual differences in how learners
respond to and apply information provided by WCF (see Bitchener & Storch, 2016; Ferris, 2006; Ferris & Kurzer, 2019; Ferris, Liu, Sinha, & Senna, 2013) rather than simply relying on statistics
from groups of learners to assess whether or why WCF is effective. It is encouraging therefore to see
not only a whole section of the volume (Chapters 10–​12) on this topic but many individual authors
throughout the collection urging attention to learner characteristics in their recommendations for
future research. Also promising is the call for a broader range of research methods, from corpus-​
based research to qualitative or mixed-​method/​triangulated designs (Part III, Chapters 22–​27).
For the teaching of writing, the broader perspective taken in this volume implies that instructors,
curriculum developers, and materials designers should consider activities and tasks that meet indi-
vidual needs rather than taking a one-​size-​fits-​all approach. Meeting individual needs might mean –​
to identify just a few examples – tailoring WCF to each student’s particular growth areas rather than pre-selecting WCF targets for a whole class. It might mean allowing students more choices in content to write about and/or genres/registers in which to write, both to engage more learners and to challenge them to consider language choices in tasks that might be less familiar to them.
It might mean combining several approaches in unique ways –​such as text chat, synchronous
teacher WCF, and collaborative writing (Shintani & Aubrey, Chapter 21) or oral and written modes
(Gilabert & Vasylets, Chapter 4). A creative combination of approaches by a teacher can cast a wide
net for learners –​if they do not engage in language processing in one mode, they might do so in a
complementary one.

Concluding Thoughts
The good news for instructors in learning-​to-​write contexts is that increased attention to language
acquisition does not require a complete curricular overhaul. Most of the practical suggestions
shared by authors in this volume are things that writing teachers are doing anyway: giving
thoughtful feedback, providing challenging and authentic tasks, considering individual stu-
dent needs, and prioritizing engagement and metacognitive awareness. Teachers simply need to build into those activities regular opportunities to direct students’ attention to language. For
example, while modeling for students how to write in authentic genres, teachers should include
analysis of the language of those target genres along with considerations of content, audience,
and text structure. While providing helpful feedback for writing development, teachers should
incorporate focused and precise information about student writers’ language use. The classroom technique known as Dynamic Written Corrective Feedback (Hartshorn et al., 2010; Kurzer, 2018) is an example of a writing-to-learn activity that is easily incorporated into a learning-to-write-oriented classroom and that provides a concrete, visible measure of individual students’ progress; in one study of an L2 writing program over several years, dynamic WCF was the most popular activity identified by students who had completed the program (Ferris, 2018a).
Student writers also may benefit from self-​directed language study as a requirement or option
for learning-​to-​write courses (see Ferris, Eckstein, & DeHond, 2017, for discussion of such
a program). There are many other suggestions in the “Practical applications” sections of the
various chapters of this handbook; the point here is that writing instructors do not need to com-
pletely transform their writing courses into language classes to be intentional about directing
students’ attention to language acquisition in a learning-​to-​write context.
In addition to the very comprehensive discussion of different teaching and learning contexts in
this volume (Chapters 18–​21), there is a call throughout for broadening research approaches, both
to investigate questions in new ways (e.g., incorporating qualitative methods and authentic tasks
or genres) and to utilize technological advances in corpus analysis, automated writing evaluation,

and translation software/devices (see Chapters 22–27). It is encouraging to anticipate even richer,
more robust research and theory that will emerge in years to come on this topic of how writing can
facilitate SLA –​and vice versa.

References
Atkinson, D., & Tardy, C.M. (2018). SLW at the crossroads: Finding a way in the field. Journal of Second
Language Writing, 42, 86–​93.
Bean, J.C. (2011). Engaging ideas: The professor’s guide to integrating writing, critical thinking, and active
learning in the classroom (2nd ed.). San Francisco, CA: Jossey-​Bass.
Bitchener, J., & Ferris, D. (2012). Written corrective feedback in second language acquisition and writing.
New York: Routledge.
Bitchener, J., & Storch, N. (2016). Written corrective feedback for L2 development. Bristol: Multilingual
Matters.
Canagarajah, A.S. (2017). Translingual practice as spatial repertoires: Expanding the paradigm beyond struc-
turalist orientations. Applied Linguistics, 39(1), 31–​54.
Connors, R., & Lunsford, A.A. (1988). Frequency of formal errors in current college writing, or Ma and Pa
Kettle do research. College Composition and Communication, 39, 395–​409.
Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty & J. Williams (Eds.), Focus
on form in classroom SLA (pp. 114–​138). Cambridge: Cambridge University Press.
Doughty, C., & Williams, J. (Eds.) (1998). Focus on form in classroom second language acquisition.
Cambridge: Cambridge University Press.
Eodice, M., Geller, A.E., & Lerner, N. (2017). The meaningful writing project: Learning, teaching, and writing
in higher education. Logan, UT: Utah State University Press.
Ferris, D.R. (2006). Does error feedback help student writers? New evidence on the short- and long-term
effects of written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in second language
writing: Contexts and issues (pp. 81–​104). Cambridge: Cambridge University Press.
Ferris, D.R. (2010). Second language writing research and written corrective feedback in SLA: Intersections
and practical applications. Studies in Second Language Acquisition, 32, 181–​201.
Ferris, D.R. (2018a). Using student satisfaction surveys for program improvement. CATESOL Journal, 30(2), 19–42.
Ferris, D.R. (2018b). “They said I have a lot to learn”: How teacher feedback influences advanced university
students’ views of writing. Journal of Response to Writing, 4(2). Retrieved from https://​journalrw.org/​
index.php/​jrw/​article/​view/​114
Ferris, D., Eckstein, G., & DeHond, G. (2017). Self-​directed language development: A study of first-​year
college writers. Research in the Teaching of English, 51(4), 418–​440.
Ferris, D., & Hedgcock, J. (2014). Teaching L2 composition: Purpose, process, and practice (3rd ed.).
New York: Routledge.
Ferris, D., & Kurzer, K. (2019). Does error feedback help L2 writers? Latest evidence on the efficacy of written
corrective feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and
issues (2nd ed.) (pp. 109–​124). Cambridge: Cambridge University Press.
Ferris, D.R., Liu, H., Sinha, A., & Senna, M. (2013). Written corrective feedback for individual L2 writers.
Journal of Second Language Writing, 22, 307–​329.
Hartshorn, J.K., Evans, N.W., Merrill, P.F., Sudweeks, R.R., Strong-​Krause, D., & Anderson, N.J. (2010). The
effects of dynamic corrective feedback on ESL writing accuracy. TESOL Quarterly, 44, 84–​109.
Hartwell, P. (1985). Grammar, grammars, and the teaching of grammar. College English, 47, 105–​127.
Horner, B., Lu, M.-​Z., Royster, J.J., & Trimbur, J. (2011). Opinion: Language difference in writing: A
translingual approach. College English, 73(3), 303–​321.
Horner, B., NeCamp, S., & Donahue, C. (2011). Toward a multilingual composition scholarship: From English
only to a translingual norm. College Composition and Communication, 63(2), 269–​300.
Kurzer, K. (2018). Dynamic written corrective feedback in developmental ESL writing classes. TESOL
Quarterly, 53(1), 5–​33.
Long, M.H., & Porter, P.A. (1985). Group work, interlanguage talk, and second language acquisition. TESOL Quarterly, 19, 207–227.
Lunsford, A.A., & Lunsford, K.J. (2008). “Mistakes are a fact of life”: A national comparative study. College
Composition and Communication, 59, 781–​806.
MacDonald, S.P. (2007). The erasure of language. College Composition and Communication, 58, 585–​625.
Manchón, R.M. (Ed.). (2011). Learning-to-write and writing-to-learn in an additional language. Amsterdam: John Benjamins.

Matsuda, P.K. (2012). Let’s face it: Language issues and the writing program administrator. Writing Program
Administration, 36(1), 141–​163.
Polio, C. (2019). Keeping the language in second language writing classes. Journal of Second Language
Writing, 46. doi:https://​doi.org/​10.1016/​j.jslw.2019.100675
Santa, T. (2006). Dead letters: Error in composition, 1873–​2004. Cresskill, NJ: Hampton Press.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
Silva, T., & Wang, Z. (Eds.) (2021). Reconciling translingualism and second language writing. New York:
Routledge.
Storch, N. (2019). Collaborative writing as peer feedback. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (2nd ed.) (pp. 143–162). Cambridge: Cambridge University Press.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer (Eds.), Principles and practices in applied linguistics: Studies in honour of H. Widdowson (pp. 125–144). Oxford: Oxford University Press.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46,
327–​369.

INDEX

Academic Collocation List 207 Attitude 87, 131, 156, 158, 161, 260, 262, 275–​276,
Academic formulas list 207 342; Attitude towards writing 75, 158–​159;
Academic writing 57, 70–​71, 111, 130, 157, Attitude towards feedback 87–​88, 90, 133;
173–​174, 200, 247, 250, 268–​289, 325, 362 Attitude towards speakers of other languages 116;
Accuracy measures 47, 58, 127, 169, 173, 199, Attitude towards English speakers 132
203–​204, 206–​208, 228–​235, 260–​261, 264, Authentic tasks 332, 386, 391
356–​357, 359–​360, 371; Accuracy & Writing 42, Automated analyses, 5, 275, 370–​376, 379
44–​46, 48, 52, 55, 59–​61, 101, 128–​129, 142, Automated scoring 145, 147, 207, 229, 235, 373
144, 146, 171, 173, 175–​177, 243, 245, 257–​258, Automated writing assessment 373, 378–​379
260–​263, 274; Accuracy & Oral speech 44–​46, Awareness 11–​13, 24, 26, 81, 83, 85–​86, 90, 99, 119,
61, 176–​177, 255; Accuracy & WCF 13, 15, 22, 130, 204–​205, 207, 230, 243–​244, 277, 288, 301,
82, 85–​88, 132–​133, 143, 214–​222, 255, 284, 312, 332, 341–​342, 357, 378–​379; Cross-​cultural
288–​292, 303, 306, 350–​351; Accuracy & Task awareness 116; Genre awareness 119, 130, 207,
Complexity 53–​56, 59, 159–​160, 162, 174, 386; Language awareness 81, 110; Metacognitive
316, 331 awareness 114, 391; Metalinguistic awareness
Achievement goals 156, 161–​162 110, 114, 119, 126–​127, 132, 234; Metacognitive
Activation of prior knowledge 10, 12, 17–​18 knowledge 4, 101, 105, 114
Age-​related differences 126–​128, 132, 134–​135 Awareness levels 11, 13, 16–​18, 83–​86, 90, 303, 307
Annotation: Corpus annotation tools 370, 374;
Annotation units 377–​378; Annotation scheme Barcelona Age Factor Project 127
375; Automated rhetorical function annotation Bidirectional transfer 4, 97, 103, 113
378 Bi/​multilingual turn 110–​111
Anxiety 18, 48, 140, 154–​155, 159–​161, 222; Bi/​multilingual writers 113
Writing anxiety 56, 154, 159–​162 Bi/​multilingual writing development 113
Appraisal 276 Bilingualism 101, 112; Bilingual cognition/​mind
Aptitude: Aptitude & WCF 15, 18, 143, 146–​147, 119; Cognitive effects 110; Holistic view
314; Aptitude & Writing 139, 142, 147, 110
318–​319; Aptitude Complexes Hypothesis British National Corpus 194, 208
140; Language aptitude 139–​143, 145–​148,
218–​219, 305 CLIL 4, 128–​129, 134–​135, 241–​250, 317
Association strength 202–​203, 359, 361; Association Cluster analysis 357, 362–​364
strength values 205; Association strength indices Co-​ownership 330
207 Cognitive flexibility 110, 116
Attention 11–​12, 24–​28, 48, 55, 187, 206, 244, 285, Cognitive individual differences 139, 141–​142,
312–​321, 379, 388; Attention and Writing 42–​43, 147–​148, 305
59–​61, 68–​69, 74, 101, 131, 170–​173, 247, 302, Cognitive load 41, 68–​70, 104, 148, 217, 250;
305, 313–​321, 387–​388; Attention and WCF 10, Cognitive Load Theory 148; Intrinsic cognitive
12, 14, 16, 25, 81, 83–​84, 154, 175, 216, 219, 221, load 148; Extraneous cognitive load 148;
320, 341–​343, 350–​352 Germane cognitive load 148

Cognitive processes/​processing 10, 12–​13, 15, Constructions 357, 372, 376; Syntactic
17–​19, 24–​26, 47, 53, 68–​70, 73–​74, 83, 87, 89, constructions 48, 142; Inversion, preposing
98–​100, 103, 110, 118, 131, 172, 187, 229–​302, and cleft constructions 103; Linguistic
304–​306, 308, 315–​317, 326, 329, 339–​341, constructions 215; Learner constructions 356;
343–​344, 351 Verb-​argument constructions 357, 372; Verb-​
Cognitive processes (See also activation of prior particle constructions 372; Gerundial argument
knowledge, awareness, depth of processing, constructions 360; Grammatical metaphor
hypothesis testing, metacognition, rule constructions 360; Reflexive constructions
formulation) 360; Clausal constructions 361; Grammatical
Cognitive resources 42–​43, 48, 101, 142, 148, 229, constructions (attributive adjectives and
319 premodifying nouns) 361
Cognitive system 110, 115–​117 Constructs 23–​24, 43, 86, 153, 158, 160, 194,
Cognitive writing processes 131, 340, 349; See also 226–​227, 230, 233, 235, 303, 305, 308, 319;
Composing processes Aptitude constructs 142; Constructs of WM
Coh-​metrix 191, 193–​194, 277, 371–​372 (working memory) 148; Motivational constructs
Cohesion 42, 226, 228, 246, 271, 275, 325, 340, 363 153; “The de facto test constructs” 231; Test
Collaborative dialogues 10, 29, 84, 89, 287, 290, 302 constructs 231
Collaborative processing of feedback 22–​24, 26, Content and language integrated learning. 4, 117,
29–​31 128–​129, 134–​135, 241–​250, 317; See also
Collaborative writing 3, 10, 22–​24, 26, 28–​31, “CLIL”
67–​69, 71, 72, 74–​76, 81, 82–​84, 86, 88, 134, Context 3–​4, 10, 18, 22, 23, 26, 28, 52, 70–​72, 74,
282, 286–​288, 290–​292, 300–​302, 304–​306, 313, 75–​76, 82, 88–​90, 97, 99, 100–​101, 105, 109, 118,
329–​330, 387, 390–​391; Collaborative writing 125–​127, 129–​130, 133–​134, 148, 153, 156–​157,
tasks/​activities 26, 28, 71–​72, 74, 135, 171, 292, 159–​161, 183, 187, 190, 192, 200, 203–​204,
328, 387, 390; Collaborative writing processes 215–​216, 220, 226–​228, 231, 234–​235, 241,
69, 71, 72, 74, 76, 329–​330; Computer-​mediated 243–​249, 254–​260, 262, 264, 266, 268, 272–​273,
collaborative writing 283, 284, 286, 287, 288; 275–​276, 278, 283, 285, 291–​292, 299, 301,
Computer-​supported collaborative writing 304–​305, 308, 320, 326–​327, 332–​333, 349, 351,
(CSCW) 72 361–​362, 364, 385–​391
Collective scaffolding 28–​29, 329 Contrastive Interlanguage Analysis (CIA) 104, 356
Collocations 102, 183, 185, 189, 194, 199, 200–​206, Cooperative interaction 287
359, 360, 361, 370 Corpus /​corpora 5, 58–​59, 98–​99, 102–​104, 106,
Complexity 39, 42, 44–​48, 52–​61, 85, 88, 90, 119, 184, 188–​192, 194–​195, 200–​202, 205–​208, 233,
127–​129, 133–​134, 144–​148, 156, 159–​160, 162, 246, 248–​249, 270–​274, 276, 285, 291,
169, 171–​174, 176–​177, 206, 208, 227, 229–​231, 331–​332, 356–​366, 370–​375, 377–​379, 391
243, 245–​246, 257–​258, 260–​261, 264, 273–​275, Corpus of Contemporary American English (COCA)
277, 287, 304, 316, 325, 330–​331, 351, 357–​362, 194–​195, 205, 208
366, 371–​373; See also Task complexity; Corpus linguistics 200, 357–​358
Lexical complexity 44–​46, 55–​56, 58–​61, 101, Corrective feedback 1, 3, 4, 9, 17, 22, 31, 46–​48,
127–​128, 144, 147, 173–​174, 177, 206, 227, 245, 81–​82, 84, 87, 125, 132, 139, 140–​143, 145,
257–​258, 260–​261, 357–​358, 361, 373; Syntactic 147–​149, 152, 167, 170, 172, 185, 187, 194, 207,
complexity 44–​46, 55, 57–​61, 102, 129, 144, 147, 213–​215, 243–​244, 282, 284, 288, 290, 292, 299,
169, 172–​175, 177, 206–​208, 227, 245, 257–​258, 302–​303, 305, 307, 339, 342, 350–​352, 385, 389,
260–​261, 269, 274, 277, 357, 363, 371–​372 391
Complexity, Accuracy, and Fluency (CAF) measures Cross-​linguistic influence (CLI) 97, 101, 104, 106,
44, 46–​47, 144, 169, 177, 245 109, 204, 358, 365
Composing processes 25, 118, 131, 313; Crosslinguistic transfer 110, 113; See also
Formulation 41, 48, 54, 55, 68–​69, 131, 134, 172, Transfer
229, 302, 313, 318–​319; Planning 41, 43, 47, 48, Cross-​sectional studies 106, 130, 133, 185–​186, 195,
55–​57, 59, 60, 67–​69, 74, 113, 131, 134, 141, 149, 307, 327
157, 174–​177, 179, 226, 229, 282, 287, 291, 305, Curricular design 256, 265, 357
313, 315, 317, 326–​327, 329, 340–​341, 372, 389; Curricular perspective 9, 300–​301, 307–​308
Translating 54, 149; Execution 68–​69, 141, 149;
Monitoring i, 41, 54–​56, 59, 60, 68–​69, 72–​73, Data-​driven learning 207, 379
170, 172, 229, 317–​318 Data elicitation procedures 9, 12, 306, 308, 309
Computer-​assisted language learning 282, Deficit/​non-​deficit view (orientation, approach) 112,
333 116, 270, 366, 387
Computer-​mediated communication 41, 135, 177, Depth of processing 10, 16–​17, 26, 42, 86, 170, 221,
283 301–​304, 306–​307

Development: Bi/​multilingual writing development 257–​266, 316, 320, 325, 330, 340, 349, 357–​360,
110, 112, 115; Language and writing development 371
111, 115, 117–​118, 291, 327–​328 Focus-​on-​form 11, 19, 215
Direct assessment 226, 228, 235 Focus-​on-​forms 320
Disciplinary literacy 247, 268, 276 Form-​function mappings 376–​379
Discourse-​level features 274 Formulaic language 102, 199–​208, 376–​378
Discriminant analysis 362–​365 Formulaic sequences 278, 359
Dual-​task methodology 146, 149 Formulation ; Writing processes 41, 48, 54–​55, 68,
Dynamic Systems Theory 83, 87, 114, 129, 135, 101, 131, 134, 172, 229, 247, 302, 313; Rule
357–​358 formulation 11, 17, 301; Hypothesis formulation
Dynamic Written Corrective Feedback 391 24–​25
Function-​oriented feedback 378–​379
Early language learning 126–​127, 135 Functional appropriateness 378
Effect size 143, 215, 248, 261, 364 Future L2 selves 157, 161–​162
Electronic writing 282–​284, 289, 292, 352
Emotions 29–​31; Emotions and WCF 133, 152; Genre: Genre analysis 326, 331, 376, 379; Genre
Emotions and L2 Writing 154, 159, 162 awareness 119, 130, 207, 386; Genre-​based
Engagement with written corrective feedback 4, 82, writing instruction 119, 130; Genre knowledge 61,
87, 132, 350–​351 115, 118–​119, 334; Genre theory 115
Enjoyment 154, 156, 160 Grammar: Grammar and vocabulary 9, 169, 228,
Error correction 89, 145, 204, 213–​217, 286, 290, 274, 289; Grammar and lexis 44, 170, 360;
385, 387, 389 Grammar knowledge 177; Grammar correction
Exemplar-​based system 142 214; Grammar-​focused 40, 320; Grammar
Expectancy value theory 158 learning 170–​171
Experimental research designs 48, 56, 82, 103, 106, Grammatical development 169–​173, 179
133, 147, 160–​161, 178, 192–​193, 204, 277, 288, Grammatical metaphor 130, 271, 274–​275, 360;
291, 327, 332, 334–​335, 358, 375, 377 See also Nominalization
Explicit correction 213, 215, 221
Explicit instruction 4, 126, 130, 148, 175, 204, 206, Hayes and Flower’s model 54, 339–​340
255, 261, 264, 275 Heteroglossia 276
Explicit learning 2, 10, 13, 17–​18, 301 Hypothesis testing 12–​13, 16–​17, 86, 98, 301, 304,
Eye fixations 85, 89 313, 315
Eye-​gaze data 70–​71, 73–​74, 343, 344, 349
Eye tracking 12, 16, 47, 70, 73, 76, 85, 89–​90, 103, Ideal L2 self 157, 159–​161
134, 149, 172, 178, 292, 302, 306, 315, 341, 342, Implicit learning 17–​18, 301
343–​344, 349, 350–​353 Incidental learning 157, 242, 301
Individual differences 4, 73, 75, 99, 140–​142, 148,
Feedback 1–​4, 22–​23, 47–​48, 81–​90, 143, 170, 178, 218, 220, 222, 256, 266, 305–​308, 314,
187, 194, 207, 244, 305, 307, 313, 315, 318, 385, 317–​318, 340, 365, 390–​391; See also Anxiety,
389–​390. Aptitude, Attitude, Enjoyment, Learner Beliefs,
Feedback cycles 132–​133 Mindsets, Motivation, Working Memory
Feedback types 24–​26, 46, 69, 105–​106, 132, 145, Individual writing 3, 9–​11, 18–​19, 23, 26, 30, 69–​71,
148, 189, 213–​222, 255, 283–​286, 288–​292, 73–​75, 83–​84, 87, 89, 288, 292, 313, 326
320, 373, 378–​379; Direct 24–​25, 86, 215–​216, Inhibition hypothesis 131, 315
342, 350–​351, 391; Indirect 24–​25, 86, 132, Input processing 1, 11, 15–​16, 18, 24, 84, 142, 319
146, 215–​216, 350; Metalinguistic 83, 86, 146, Instructed language learning (ILL) 300
307; Focused feedback 89, 213, 216–​217, 220, Instructed second language acquisition (ISLA) 9, 23,
222, 307, 352; Mid-​focused feedback 216; 86, 139, 148, 299–​300; Applied ISLA 300,
Synchronous/​asynchronous feedback 83, 87, 284, 302–​304, 308; ISLA applied 300, 303–​304,
288–​290; Unfocused feedback 89, 213, 216–​217, 307–​309
303, 307, 352 Instructed SLA contexts 23, 71, 301
Feedback processing 2, 10–​19, 22–​31, 43, 88, 90, Instruction 69, 71, 100, 109, 114–​118, 126–​130,
139–​142, 149, 221, 302–​304, 314, 339–​353 134, 153, 171–​172, 204, 217, 242–​244, 250,
Feedback-​seeking behavior 153–​156, 158, 259, 264, 273–​274, 275, 277–​278, 300, 313,
160–​162 320, 332, 343; Language instruction 9, 105,
Fixed mindset 156, 161 117, 274, 277, 334; Writing instruction 100,
Fluency 41, 44, 46–​48, 50–​56, 58–​61, 100–​103, 103, 105, 113–​115, 128, 130, 134–​135, 160,
126–​127, 129, 142, 144, 146, 157, 159–​162, 169, 162, 172–​173, 179, 206, 259–​260, 264, 270,
172, 177, 200, 206, 208, 229, 231, 243, 245, 255, 277, 385–​387, 390

Instructional contexts 22, 26, 90, 129, 134, 216 220–​222, 227, 241–​247, 249, 254, 256–​257,
Intake 11–​13, 15–​17, 24, 86, 301, 303, 314, 319 260–​261, 264–​265, 269, 270, 272–​275, 277, 282,
Integrated Contrastive Model (ICM) 104 284, 286, 290, 292, 299, 300–​304, 306–​309,
Integrated writing assessment 231 312–​313, 315, 317–​318, 320–​321, 326, 333,
Inter-​rater reliability 191, 227, 374–​375, 377 341–​342, 372, 376–​379, 385–​388; Learning
Interaction 12, 22, 24, 27–​29, 40–​41, 43, 47, 55–​57, affordances 1, 3, 71–​72; Learning context 23,
69, 72, 74–​76, 86, 88, 98, 109, 131, 141, 242, 247, 99, 130, 134, 135, 148, 244, 326, 388, 390–​391;
256, 265, 274, 283–​287, 289–​292, 319, 329–​330, Learning environment 99, 153, 195; Learning
332, 373, 385; Interaction Hypothesis 283; experience 3, 133, 161, 317, 320; Learning
Interaction patterns 72, 75, 286–​287 opportunities 23, 26, 28, 41, 69, 148, 291, 317,
Interactionist perspective 15 334; Learning outcomes 2, 11, 22, 149, 176, 301,
Interference 58, 97, 100 309, 317–​318
International Corpus of Learner English (ICLE) 102, L2 learning 1, 3, 10, 11, 12–​18, 22–​25, 27, 29–​30,
104, 277, 356, 370 43–​47, 67, 69–​71, 73, 75, 81, 87, 90, 99, 105, 131,
Internationalization 242 143, 149, 170, 220, 222, 243, 300–​303, 305,
Interventionist studies 71, 134 307–​309, 312, 315–​317, 325–​327, 333–​334, 351,
Interviews 44, 71, 73–​74, 89, 90, 103, 113, 118, 134, 372
160, 177, 178, 231, 233, 258–​259, 266, 315, 332, Learning process 2, 10–​19, 23, 42–​43, 45–​47,
341–​342, 350, 352 71–​72, 87, 105, 149, 153, 160, 162, 247, 284, 286,
Involvement load hypothesis 187, 189–​191 300, 303, 315, 317, 333–​334
Learning strategies 82, 113, 222, 317
Joint scaffolding 286 Learning beyond the classroom 333
Learning through writing 169–​170, 190, 300, 308,
Kellogg’s model of writing 52, 54, 141 315
Keystroke logging 70–​71, 73, 76, 103, 266, 305, Learning to write 3, 9, 22–​23, 69, 98–​99, 115,
315, 326, 340, 341, 343–​344, 349 117–​118, 264, 275, 292, 312, 386–​389, 391
Lexical bundle 102, 199–​205, 208, 277, 359, 361,
L1 use 100–​101, 116–​117 388–​389
L2 motivational self system 153, 157, 318 Lexical diversity 44, 46, 58, 147, 183–​185, 187–​190,
L2 proficiency 23, 42, 55, 57, 61, 99, 102, 118, 133, 193–​194, 261–​263, 359, 363
144, 146–​147, 149, 203, 205, 218–​220, 259, Lexical sophistication 45, 58, 144, 147, 183–​191,
313–​314, 316–​317, 319, 366 193–​194, 206, 261, 274, 358
Language analytic ability 142, 145–​146, 218–​219 Limited Attentional Capacity Model See Skehan
Language learning 1, 3–​5, 9–​13, 19, 22, 24–​25, 27, Linguistic competence 109–​110, 113, 115, 126, 131,
30, 39–​41, 68–​76, 82, 99, 106, 111, 117, 125–​126, 228
133, 143, 156, 160, 169, 173, 176, 178, 179, 183, Linguistic features 100, 143, 174, 207, 215,
187, 189, 192, 195, 213, 235, 242, 247, 249–​250, 217–​218, 229–​230, 232, 234, 249, 273, 331,
255, 268–​270, 278, 282–​286, 292, 299–​300, 358, 362–​363, 365, 371, 373–​379, 387; See also
302–​309, 317–​318, 329, 333, 361, 390 Contrastive analysis; detection-​based
Language learning potential of WCF 222, 303 approaches
Language learning potential of writing 2, 4, 43, 44, Linguistic knowledge 4, 13, 25, 27, 42, 45, 98–​99,
46, 227, 312–​313 101, 105–​106, 110, 113, 126–​127, 130–​132, 170,
Language learning process 247, 334 214, 219, 287, 305, 308, 325–​326, 328, 330, 334,
Language learning programs 135, 234 387
Language learning through writing 169–​170, 243, LLAMA test 145, 147
303 Longitudinal Database of Learner English
Language related episodes 26, 69, 178, 291, 313, (LONGDALE) 104, 370
342 Longitudinal Studies 57, 61, 102, 106, 127, 133,
Languaging 24, 29, 69, 71–​72, 84–​85, 89, 302, 192, 202–​203, 220–​222, 247–​248, 263, 271, 278,
329–​330, 342, 350 287, 304, 319, 327, 364
Learner beliefs 152, 153, 154, 160
Learner corpora 5, 104, 192, 205, 208, 356, 363, Meta-​analysis 4, 143, 172–​173, 176, 215, 248, 331
365–​366, 375 Metacognition 17, 301
Learning (conceptualization) 2, 3, 11–​13, 15, 17–​19, Metacognitive knowledge 4, 101, 105, 114, 131,
24–​25, 27–​30, 40, 43, 45–​47, 52, 74, 81, 83, 325
87–​90, 98–​99, 104, 109–​111, 113, 116–​117, Mindsets 153–​154, 156, 161–162
125–​127, 130, 133–​134, 140, 145–​149, 153–​154, Modality: Task modality 2, 4, 55, 81, 169–​171,
156–​158, 160–​162, 169–​177, 183, 185, 187, 175–​178, 201, 203, 316, 319; Feedback modality
189–​191, 193–​195, 207, 213–​215, 217–​218, 284, 289

Motivation 4, 18, 87, 131, 135, 140, 152–​163, 389; Planning stages 47, 327; Collaborative
189–​190, 218–​220, 242, 255–​256, 258–​260, planning 56, 174, 179
262–​265, 315, 317–​318, 325, 332, 386; See also Prewriting 172, 174–​175, 177, 179, 389;
Motivational Self-​System; Intrinsic motivation Collaborative prewriting 174
132, 156–​157, 160 Process-​oriented approach 299, 318
Motor execution 41, 141 Process-​oriented research agenda 306
Multicompetence 4, 106, 109–​119, 131, 133; Processing of written feedback 2, 22, 303, 339
Monocompetence 111–​112; Multicompetent Processing stages 12, 16, 25, 303, 313
110–​118 Process-​oriented research 304, 306, 308, 313, 319
Multi-​dimensional analysis 362–​363 Product-​oriented research 321
Multilingual writing development 110–​119 Proficiency 14–​15, 18, 23, 26–​28, 41–​42, 52, 55, 57,
Multilingualism 97, 109, 116, 229, 235, 242 59, 61, 70, 72, 75, 83, 87, 90, 99–​106, 112, 118,
Multimodal composition 327–​331 125–​127, 131, 133, 135, 141, 144, 146–​147, 149,
Multimodality 5, 235, 289, 326–​327, 390 173–​174, 177, 183, 185–​186, 189–​192, 201–​207,
Multistage tasks 81 218–​222, 227, 230–​231, 233–​234, 242, 246, 250,
Mutual information 184, 201 255, 257–​259, 261, 263–​265, 272–​273, 277, 287,
289, 302, 304, 307, 313–​314, 316–​319, 333, 340,
N-​gram 104, 191, 196, 199–​200, 205, 208 344, 357–​363, 366, 371–​372, 374–​376, 390
Natural language processing 370
Naturalistic and instructed settings 125–​126, Raters 115, 144, 190–​191, 207, 226–​227, 229–​233,
129–​301 235, 256, 274, 373
Needs analysis 332, 335 Rating 147, 189–​191, 226, 228–​230, 232, 235, 264,
Negotiation of meaning 69, 242, 284–​285 266, 358–​359, 373; Rating scales 114, 226,
Nominalization 244, 269, 271, 274–​275, 360 230–​235, 262
Noticing 10–​13, 15–​16, 24–​27, 42, 71, 75, 81, Reaction times 191, 306, 358
83–​89, 132–​133, 140, 142–​143, 187, 190, 207, Reactivity 73, 89, 306, 340, 342, 349
218, 247, 285–​286, 291–​292, 303, 306, 312–​316, Reading-​while-​writing 73, 235
318, 329, 341, 350, 352, 385 Register 59, 111, 171, 200–​201, 204–​206, 246,
Noticing gaps 26, 81, 285, 290, 313–​315 248–​249, 268–​275, 277–​278, 285, 318, 356,
Noticing Hypothesis See Schmidt’s Noticing 358–​362, 364, 372, 388, 391
Hypothesis Regression analysis 145, 205, 208, 233, 362, 364,
372–​373
Operation Span tests 144 Regression approach 146–​147, 364
Output 11–​15, 17, 18, 24–​26, 29, 41–​44, 55–​56, 69, Regulatory focus theory 153, 157, 162
85, 87, 98, 118, 126–​127, 131–​133, 141, 171, 193, Reliability 178, 185, 188, 277, 229, 233–​235,
243, 300, 313–​314, 318–​319, 385; Written output 372–​375
176, 247, 313, 320, 327, 339; Oral/​spoken output Repertoire (linguistic) 75, 90, 101, 103, 105,
1, 171, 176, 327; Output processing 312; Output 113–​115, 117, 131, 169, 173, 201, 204–​208, 313,
processing models 83–​84; Output-​input-​output 332, 360, 377–​378
cycle/​sequence 247, 317; Input-​output-​input order Resource-​directing variables See Task complexity
317 Resource-​dispersing variables
Output Hypothesis See Swain’s Output Revisions 10, 68, 72, 74, 85–​86, 88, 103, 133, 154,
Hypothesis 213, 218–​219, 289, 291, 302–​303, 342, 350–​351,
Output production 81, 314 353
Rhetorical function 376–​379
P-​burst 58 Robinson’s Aptitude Complexes Hypothesis 140
P-​frame 199, 200, 206 Robinson’s Cognition Hypothesis 39–​40, 42, 45,
Pausing 67–​68, 70–​71, 73, 103, 172, 340, 349 52–​56, 59, 175, 316
Pedagogical ramifications 18, 301 Rubrics 207, 227–​231, 334; Analytic 227; Holistic
Peer feedback 23, 26, 30, 69, 155, 160–​161, 285, 228, 262, 264, 266
390 Rule formation/​formulation 11, 17, 86, 301
Phrasal Expressions List 207 Rule-​based system 53, 142–​143, 216–​218, 222
Phrasal Verb Pedagogical List 207
Phraseological competence 201–​204, 206 Schmidt’s Noticing Hypothesis 11–​12, 15–​16,
Phraseology 200, 205–​206, 208, 359 24–​25, 69, 83, 84, 341
Planning 41, 43, 48, 56–​57, 59–​60, 67–​69, 131, Scores/​scoring 11, 44–​45, 101, 113, 128–​129, 132,
134, 141, 149, 157, 174–​177, 179, 226, 242, 282, 144–​147, 149, 174, 176, 184–​191, 204–​205, 207,
287, 291, 305, 313, 315, 317, 327, 329, 340–​341; 226–​235, 244–​246, 257–​259, 264, 266, 274, 283,
Planning time 39, 47, 55, 70, 74, 229, 326, 372, 300, 302–​303, 351, 357–​358, 361–​363, 371, 373

Self Correction 208, 290 57, 59, 70, 327; Online planning 57, 70, 74, 176,
Self-​determination theory 153, 156, 162 291
Self-​repair 285 Task conditions 75, 79, 174–​175, 305
Short-​term memory 141 Task demands 42–​43, 45–​46, 229, 314, 316
Skehan’s Macro-​SLA aptitude model 142 Task design 39–​41, 47–​48, 70, 148, 206, 226–​227,
Skehan’s Limited Capacity Model 39, 42, 53–​56, 229, 292, 326, 330
132 Task modality 2, 55
Skehan’s Processing Stage model 143 Task repetition 39, 42–​43, 46–​48
Skehan’s Trade-​off Hypothesis 40, 160, 316, 318 Task sequences 56–​57, 174, 316, 317
Skill Acquisition Theory 11, 13–​14, 170, 175 Task types 48, 70, 74, 75, 144, 195, 287, 313,
SLA-​L2 writing interfaces 1–​2 316
Social semiotics 326–​327 Task-​based language teaching (TBLT) 39–​41, 52,
Social turn 110–​111 56–​58, 60–​61, 244, 316, 326, 385
Sociocultural theory 3, 9, 23–​24, 26–​27, 152, Text analysis 134, 193, 204, 248, 330, 374
283 Text chat 4, 72–​73, 75–​76, 135, 169, 175–​177, 179,
Stance 22, 200, 205, 271, 275–​276, 278, 360 282–​286, 288–​292, 327, 391
Stimulated recall 73, 76, 89, 103, 134, 149, Text quality 72, 100, 129
171–​172, 178, 248, 266, 305, 315, 335, 340–​343, Think aloud 10, 25, 26, 29, 70, 73, 74, 103, 134,
349–​350, 352; Stimulated recall data 70–​71, 178, 171, 233, 248, 302, 305, 306, 308, 314, 315, 340,
349, 352; Stimulated recall comments 71, 344; 342, 350
Stimulated recall interviews 74, 341, 344, 352; Time-​compressed writing 68, 134
Stimulated recall sessions 173, 343, 349 Tokens 183, 185, 190, 201, 203, 206, 257;
Strategic knowledge 101, 131; See also Type-​token ratio 58, 183, 188, 359, 361
Metacognitive knowledge Trade-​off 42, 48, 117, 159, 316; -​Hypothesis 40,
Strategies 42, 70–​71, 75, 82, 84, 87–​89, 99, 159, 316; -​Effects 305
100–​101, 105, 113–​118, 135, 154, 156–​159, Training data 375
161–​162, 169, 179, 217, 218–​219, 221–​222, Trajectories 84, 87–​88, 102, 118, 129–​130, 132, 192,
248–​249, 255, 259, 300, 304, 307, 314, 317, 275, 303, 372–​373
319, 329, 340, 342, 352, 387; See also Learning Transcription 70, 76, 291, 306, 351, 352
strategies, WCF strategies, Writing strategies Transfer 3, 4, 48, 97, 99–​106, 109–​110, 113–​114,
Study abroad 3–​4, 129, 131–​133, 161, 244, 127, 171, 174, 176, 204, 216, 283, 327; Adaptive
254–​255, 260, 365, 390; Study abroad programs 105; Lexical 102; Negative (see interference) 97,
129, 132, 254, 260, 265 100; of writing processes 100; Positive 100, 103;
Subordination 44–​46, 58–​59, 144, 147, 176, 271, Reverse transfer 109; Rhetorical 103
273–​274, 371–​372 Transferability 102, 105
Swain’s Output Hypothesis 1, 11, 12, 13, 14, 18, 24, Translanguaging 111–​112
25, 26, 27, 69, 312, 314, 317 Translation 54, 59, 70, 102, 104, 118, 141–​142, 228,
Synchronous feedback 284, 288–​290 260, 340, 392; Machine translation 118
Syntactic complexity See Complexity
Systemic Functional Linguistics (SFL) 114–​115, Usage-​based approaches 111, 170, 199, 357–​361,
134, 246, 248, 272, 326, 358 372

T score 147, 201–​202, 229, 231, 234, 259, 377 Validity 57, 81, 178, 207, 227–​230, 233–​235, 301,
TAALED (Tool for the Automatic Analysis of 306, 340, 374; Ecological validity 75, 149, 217,
Lexical Diversity) 194 364, 389
TAALES (Tool for the Automatic Analysis of Verb-​argument constructions 357, 372
Lexical Sophistication) 194 Verbal protocols 47, 103, 340–​344, 353
Task see Collaborative writing tasks, task Verbal reports 233, 340–​342
complexity, Task conditions, Task demands, Veridicality 89, 340, 342
Task design, Task modality, Task repetition, Task
sequences, Task types WCF strategies 217–​219, 221
Task complexity 4, 42–​43, 45–​48, 52–​53, 70, 142, Within-​subject 115, 188, 256, 260–​261, 307–​308,
144–​145, 152, 159, 172, 175–​176, 229, 305, 308, 349; Within-​subject comparison 115; Within-​
316, 319, 331, 340–​341, 388; Cognitive task subject research design 256, 260–​261, 349
complexity (CTC) 43, 52–​61; Resource-​directing Word combinations 102, 184, 199, 201–​203, 361
task complexity features 42, 53, 55; Reasoning Working memory 15–​16, 41, 52–​56, 60, 68, 89, 131,
demands 42, 57, 59–​60; Number of elements 39, 139–​143, 148–​149, 218–​220, 305, 317–​319
42, 57, 59–​60, 148; Resource-​dispersing task Writing Across the Curriculum (WAC) movement
complexity features 48, 53, 60; Pre-​task planning 386

Writing fluency 55, 58, 101, 103, 126, 129, 169, Writing tasks 9, 23, 26, 28, 39, 41–​43, 48, 52,
257–​260, 262 54–​61, 68, 70–​72, 74–​75, 101, 134–​135, 143–​144,
Writing instruction 103, 105, 113–​115, 119, 128, 148–​149, 158, 161–​162, 172, 174, 178–​179,
130, 134–​135, 162, 172–​173, 179, 206, 259–​260, 186–​187, 214, 216, 220, 229, 231, 261, 264,
264, 270, 277, 385–​387, 391 266, 292, 313, 317–​320, 330–​331, 386,
Writing competence 110, 113, 115, 118, 127–​128, 388–​389
130, 161 Writing to learn 98, 115, 117–​118, 126, 134, 187,
Writing expertise 105, 112, 118, 131 386–​387
Writing knowledge 110, 113, 115, 131 Written feedback appropriation 15, 86, 90, 135,
Writing processes 10, 43, 54–​55, 60–​61, 67–​76, 139–​140, 143, 145, 147, 216, 221–​222,
98–​101, 112–​113, 131, 141–​143, 146, 149, 248, 304–​305
284, 292, 300–​302, 304–​308, 320, 326, 328–​330, Written languaging 85, 89, 302, 342, 350
339–​349, 353, 386
Writing strategies 23, 71, 84, 87, 99–​101, 154, Zone of proximal development 10–​11, 24,
156–​157, 162, 255, 259, 340 27, 29
