Learner Chunks in Second Language Acquisition: April 2018

Learner Chunks in Second Language Acquisition
Thesis · April 2018
1 author:
Timothy Hall
University of Pennsylvania
All content following this page was uploaded by Timothy Hall on 27 April 2018.

LEARNER CHUNKS IN SECOND LANGUAGE ACQUISITION
by
Timothy Merrick Hall
Dissertation Committee:
Professor ZhaoHong Han, Sponsor

Dr. Vivian Lindhardsen
Approved by the Committee on the Degree of Doctor of Education
Date: May 17, 2017
Submitted in partial fulfillment of the
requirements for the Degree of Doctor of Education in
Teachers College, Columbia University
2017
ABSTRACT
LEARNER CHUNKS IN SECOND LANGUAGE ACQUISITION
Timothy Merrick Hall
The field of Second Language Acquisition has sought to explain how learners
develop rules in their second language. Working within an emergentist framework, N.
Ellis (2002a) proposed that rule-like constructions emerge when domain-general learning
mechanisms grammaticalize holistically stored, multi-morphemic chunks of language.
This exploratory study took N. Ellis’s proposal as a starting point to investigate how three
adult English language learners in a U.S.-based community college used chunks.
Naturalistic oral data was elicited over ten months by means of three language tasks. The
oral output was recorded, transcribed into corpora, and coded for chunks. Patterns of
chunk use were analyzed to indicate the following: (a) All learners used chunks to
communicate a variety of meanings; (b) There were inter-learner differences with regard
to chunk use; (c) There were inter- and intra-learner differences in how chunk
grammaticalization manifested itself. Findings and implications are discussed with
respect to an emergentist understanding of second language acquisition.

© Copyright Timothy Merrick Hall 2017
All Rights Reserved
ACKNOWLEDGMENTS
This dissertation was possible thanks to the consideration and support of many
individuals. I would like first to thank Dr. ZhaoHong Han, my dissertation advisor, for
what has been the most profound learning experience of my life. I would also like to
thank Drs. Vivian Lindhardsen, Hoa Nguyen, and Hervé Varenne for their insightful
review of my work. Past and present members of my doctoral seminar have motivated me
through discussion, proofreading, resource sharing, and countless coffees, making a
decade of work feasible and collegial. Thanks to them all. In particular, I wish to thank
Adrienne Lew, Sunhee Song, Sarah Sok, Phillip Choong, Alice Chen, Farah Akbar, and
Rosette Finneran. I extend thanks to Marta Lynch, Ann Holby, Debra Billman, and Reyes
Llopis Garcia for coding, proofreading, and moral support, and to Carol Friend for access
to my research site. I am, of course, indebted to my research participants, Weike, Eddy,
and Jun. I also thank Nick Ellis for his willingness to engage in a thoughtful conversation
with me on the current topic, and of course, to pose such an interesting question in the
first place.
My academic work was completed with the support of friends, colleagues, and
family. Thank you Lonnie Lippert, Lea Rumbolo, Janet Immatteo, Pat Brokaw, and Andy
Black for your unwavering reliability. Thank you to Yiqiang Wu and Jean Wong, Nicole
Maldonado, Megan Gordon, Alan Amtzis, Amy Dell, Diane Gibson, and Mary Ann
Peterson for your encouragement. And heartfelt thanks to Kelly Blair and Michael
Castagna. Finally, my love and thanks go to my family: Ann, David, Christopher,
Heather, Liz, and Aaron for being the rock upon which I stand.
T.M.H.
TABLE OF CONTENTS
I - INTRODUCTION………………………………………………………………..…….. 1
1.1 Background…………………………………………………..……….……. 2
1.2 Focus of the Study……………………………………………………… 8
1.3 Key Terms………….……………………………………………….…… 9
Acquisition……..………..…………………………………….……. 9
Interlanguage……..………..………………………………………… 9
Constructions …………………………………….…………………. 10
Learner Chunk………………………………………………………. 11
N-gram………………………………………………………………. 11
Language Processing………….…………………………………….. 12
Holistic Processing and Storage…………………………………….. 12
Processes and Mechanisms………………………………………….. 12
Associative Learning………………………………………………… 13
Type and Token Frequency…………………………………………. 13
Productivity …………………..…………………………………….. 14
Grammaticalization…………..……………………………………… 15
Comparison, Analysis, Abstraction, Schematization………………. 15
Exemplars ………………………………………………………….. 16
Exemplar-Based Learning…………………………………………... 16
1.4 Outline of the Dissertation……………………………………………… 17
II - REVIEW OF THE LITERATURE…………………………………..……………… 18
2.1 Emergentism in SLA…………………………………………………… 18
Frequency Effects…………………………………………………… 23
Counterpoints to Frequency Effects………………………………… 27
2.2 Learner Chunks…………………………………………………………. 31
2.3 Development of Chunks………………………………………..……….. 37
2.4 The Pilot Study………………………………………..………..……….. 67
III - METHODOLOGY………………………………………………….……………… 72
3.1 Methodological Issues………………………………………….……….. 72
3.2 Design of the Study…………………………………………….……….. 80
3.2.1 Context of Study…………………………………………………. 80
3.2.2 Participants……………………………………………………….. 82
3.2.3 Data…………………………………………………..........……… 84
Instruments………………………………………………………….. 84
Narrative Task…………………………………………………… 86
Role Play Task……………………..……………………………. 87
Live Talk Task…………………………………………………… 87
3.2.4 Procedure………………………………………………..……….. 90
3.2.5 Coding and Analyses…………………………………………….. 93
Descriptive Statistics………………………………….…............. 95
Single Task Analysis……………………………………….......... 96
Chunk-Based Analysis ……………………..……………………. 96
IV - INTER-LEARNER ANALYSES AND RESULTS………….…………… 98
4.1 Learner Chunks………………………………………………………… 98
4.2 Descriptive Statistics………………………………………………….... 100
4.3 Inter-Learner Task Comparison………….…………………………….. 104
V - INTRA-LEARNER ANALYSES AND RESULTS……….….………………….. 116
5.1 Weike…………………………………………………………………… 116
5.2 Eddy…………………………………………………………………….. 127
5.3 Jun………………………………………………………………………. 134
5.4 Summary………………………………………………………………... 140
VI - DISCUSSION AND CONCLUSION………………………………………. 144
6.1 To What Extent Do Adult Learners of English as a Second Language Use
Chunks as a Resource for Acquisition ……………………………… 144
Inter-learner analyses and results………………………............... 144
Intra-learner analyses and results………………………............... 148
6.2 Study Limitations, Implications, and Directions for Future Research.… 157
REFERENCES ………………………………………...............…………………. 165
Appendix A: Tasks……………………………..............................……………. 179
Appendix B: Weike…………………………….....................…………………… 184
Appendix C: Eddy………………………........................…………………… 200
Appendix D: Jun………………………...........................…………………… 213
LIST OF TABLES
2.1 Distinctive Features of LCs, Low-Scope Patterns, and Constructions..… 38
2.2 Distinctive Features of LCs, Low-Scope Patterns, and Constructions as Operationalized in Hall (2010) …………………………………….... 69
3.1 Task Characteristics……………………………………………………... 85
3.2 Example of a Role Play Task……………………..……………………... 87
3.3 Monthly Task Completion Totals………………..…………………….... 90
3.4 Table of Analyses……………………………………………………….. 95
4.1 Sample of Human-Coded Chunk Types and N-gram Types……………… 99
4.2 Corpus Characteristics by Participant…………………………………… 101
4.3 Change in Chunk Use………………………….………………………… 104
4.4 Learner Chunks from Roommates Role Play…………………………… 106
4.5 Weike: Roommates Role Play 0711…………………………………….. 110
4.6 Eddy: Roommates Role Play 0916……………………………………… 111
4.7 Jun: Roommates Role Play 0711………………………………………… 112
4.8 All Learners: The Laptop Task………………………………………….. 114
5.1 Weike: <take a shower>……………………………….…………………. 117
5.2 Weike: <won’t be>………………………………………………………. 118
5.3 Weike: <want to>…………….………………………………………….. 119
5.4 Weike: <there have>………..……………………………………………. 121
5.5 Weike: <gonna be>…………………….…………………………………. 123
5.6 Weike: <what’s happened / what’s going on>…………………………… 124
5.7 Weike: <doesn’t work / doesn’t matter>.………………………………… 126
5.8 Eddy: <there is>……………………………………….…..…………….. 127
5.9 Eddy: <why are you>…………………………………………….……… 129
5.10 Eddy: <do you>…………………...………………………...……………. 130
5.11 Eddy: <a little bit>……………………………………………………….. 131
5.12 Eddy: <the first step is>……………………………………………….. 132
5.13 Eddy: <at that time>……….……………………………..………………. 134
5.14 Jun: <it’ll be>………………..…………………………………………… 135
5.15 Jun: <you have to>..…………………………………………………….. 136
5.16 Jun: <but at that time / but so>………………………………………….. 137
5.17 Jun: <do you have>……………………………………….……………. 138
5.18 Jun: <what’s>……………………………………………………….….. 139
5.19 Intra-Learner Findings: Weike …………………………………………. 141
5.20 Intra-Learner Findings: Eddy ……………………………………..…… 142
5.21 Intra-Learner Findings: Jun…………………………………..…….…… 143
LIST OF FIGURES
3.1 Example of Narrative Task………………………………………………… 86
3.2 Coding Methods………………………………………………..…………… 91
I – INTRODUCTION
With the rise of globalization and immigration, it is increasingly necessary to
communicate in a second language (L2), which in turn has motivated a keen interest in
effective ways of teaching second languages. Arguably, however, an understanding of teaching should
be predicated on an understanding of how languages are learned. Over the course of four
decades, the field of Second Language Acquisition (SLA) has attempted to inform such
an understanding, and has identified some fundamental observations: L2 learning is
characterized by a development of rule-like tendencies (Hakuta, 1974; Krashen &
Scarcella, 1978; Myles, Mitchell & Hooper, 1999); L2 learning can follow predictable
paths or stages (Pienemann & Lenzing, 2015; VanPatten & Williams, 2015); and L2
outcomes tend to be variable across linguistic subsystems and across learners (Han, 2004,
2009). These phenomena have been described from different perspectives, including
universal grammar, sociocultural theory, skill acquisition theory, monitor theory, and
input processing theory, to name a few. However, the field has yet to embrace a cohesive
theoretical framework that accounts for such observations.
More and more, SLA researchers have been working in a theoretical framework
called emergentism (MacWhinney, 2015) and its sub-variants: dynamic systems theory
(de Bot, 2008; Lowie & Verspoor, 2015), complex adaptive systems (Beckner et al.,
2009), learner varieties (Dimroth, 2012), ACT-R (Anderson, 1996), and usage-based
linguistics (UBL) (Bybee, 2008; Ellis & Wulff, 2015; Ellis, O’Donnell, & Römer, 2013;
Eskildsen, 2015; Tomasello, 2003). All of these approaches share a common belief that
language acquisition emerges from the dynamics of language use. The present study
considers a particular emergentist assertion: that L2 learners can develop productive,
rule-like patterns of language (‘rules’) using chunked language sequences as an initial
resource, and that this development proceeds along a specific path (N. Ellis, 2002a;
Eskildsen, 2015). The present study seeks evidence of the relationship between chunks
and rules in three adult L2 learners, and seeks a greater specification of the role that
chunks play in L2 learning overall. This dissertation simultaneously contributes to an
understanding of how L2 learners use chunks of language, what roles chunks play in
language acquisition, and the appropriateness of emergentism as a theoretical approach
for SLA.
1.1 Background
Since the late 1960s, researchers in the field of SLA have sought to understand why
the outcomes of L2 learning differ from those of first language (L1) learning. Variable L2
outcomes have been observed on an intra-learner basis because learners often fail to
reach target-likeness in some linguistic subsystems but not in others. Variable L2
outcomes have also been observed on an inter-learner basis because some learners are
less successful than others (Cook, 2010; Han, 2004; Kellerman, 1995; Selinker, 1972).
Such variable outcomes led to an assertion that there is a fundamental difference both in
the processes and in the products of first and second language acquisition, as articulated
by Bley-Vroman (1990) in his fundamental difference hypothesis, by Cook (1991) in his
notion of multi-competence, and by Selinker (1972) in his notion of interlanguage. In
order to account for the divergences in L1 and L2 learning, SLA researchers formed
many guiding questions, the most relevant to the present study being: (a) With what
resources do L2 learners build the mental structure of a new language? (b) By what path
does L2 acquisition proceed? (c) Why and how do we see inter- and intra-learner
variation in L2 language use?
Historically, the theory of universal grammar (UG) was influential in guiding
research into these questions. Broadly speaking, this theory seeks to describe “what we
know about language and where this knowledge comes from” (Cook & Newson, 2007, p.
4). UG proposes that L1 acquisition is guided and constrained by universal principles and
that acquisition is supported by an innate language endowment, which is unique to
humans. The innate endowment embodies a hierarchical, representational system
containing predetermined linguistic categories such as nouns, verbs, and modifiers that
are recursively manipulated into phrases through a narrow set of domain-specific,
combinatorial rules (Hauser, Chomsky, & Fitch, 2002). The universal principles and
corresponding innate system are argued to constrain the range of possibilities that a
learner will entertain when acquiring a language, thereby facilitating acquisition.
One of the central problems that UG attempts to address is what is called the logical
problem of language acquisition. It asserts that “learners of a language know more about
what can and cannot be done in a language than they could have possibly learned from
the input alone” (Gass, 2013, Kindle Locations 5295-5296). Proponents of UG assert that
an innate, universally specified language system helps learners to overcome the
deficiencies of impoverished input and to develop a robust language system.
Another tenet of UG is the distinction between linguistic competence, which is “the
implicit and abstract knowledge of language” (VanPatten & Benati, 2015, p. 104) and
performance, which is the language that speakers produce. This distinction is grounded in
the belief that competence is somewhat divorced from performance. Whereas
competence underlies linguistic performance, performance does not have a reciprocal
influence on competence. Furthermore, linguistic performance is an inaccurate reflection
of competence owing to the intermediating effects of working memory and articulatory
processes (VanPatten & Benati, 2015).
The logical problem argument, the assertions of an innate language system, and the
distinction between competence and performance carry over into the field of SLA under
the moniker of generative SLA (GenSLA) (Hawkins, 2009; Slabakova, Leal, & Liskin-
Gasparro, 2014, 2015; White, 1992, 2003). Like UG, GenSLA focuses on resolving
inconsistencies between the exposure to language input and the resulting development
(Han, 2009) and like UG, GenSLA emphasizes the role of an innate language learning
faculty and de-emphasizes the role of language input, whose essential value is restricted
to triggering the underlying UG system (Elman, 1995).
There are objections to UG’s relevance in SLA (de Bot, 2015), notably because the
relationship between input and acquisition does not appear to be the same in the L2 as in
the L1. Unlike L1 learners, L2 learners do not consistently arrive at target-like intuitions
about their new language. L2 learner language abilities do not seem to be reliably
triggered into target-likeness by language input. Finally, L2 systems as represented in the
minds of learners possess some forms that remain L1-like, and yet others that are neither
L1 nor L2-like (Han, 2009; Selinker, 1972). These observations led researchers to an
understanding that UG shapes the problem space of L2 learning, but that it has “a highly
circumscribed role” (Han, 2009, p. 141).

The GenSLA perspective has neglected other factors that have been shown to shape
L2 learning. These include cognitive faculties (Schmidt, 1990; Selinker, 1972; Skehan,
1998; VanPatten, 1996), factors related to language use such as input frequency and
recency (Hoey, 2005; Larsen-Freeman, 2002), interaction (Long, 1996), output (Swain,
2005), and how frequently and reliably language forms can be traced to their underlying
meanings (Han, 2013).
In reaction to these shortcomings, emergentist researchers (Barlow & Kemmer,
2000; Conway & Christiansen, 2002; Croft & Cruse, 2004; N. Ellis, 1996; Ellis &
Wulff, 2015; Eskildsen, 2015; MacWhinney, 2015; Tomasello, 2003) have taken a
different approach to describing the relationship between the exposure to language input
and L2 acquisition. Emergentists believe that in the L1 and the L2, language structures
emerge and change in the mind as a result of patterns of language use (Tomasello,
2003). They assert that there is no innate language endowment. Instead, there are domain-
general cognitive mechanisms that interact with language input. This interaction is
sufficient initially to produce and subsequently to change language structure in the mind,
thereby overcoming what is, in emergentist eyes, a fallacious logical problem argument
(Clark, 2015; N. Ellis, 2012). Accordingly, “grammar is viewed as the cognitive
organization of one’s experience with language” (Bybee, 2008, p. 216), and linguistic
knowledge is more gradient than categorical (Crocker & Keller, 2006; O’Grady, 2010a).
Emergentists therefore minimize the distinction between competence and performance
(Barlow & Kemmer, 2000; Bates & Goodman, 1999; Bybee, 2013; Tomasello, 2003)
because competence is conditioned and continually updated by performance.

Explaining language use as the basis of acquisition necessitates the investigation of a
number of constructs. These include a range of domain-general cognitive capacities such
as associative learning, attention, and memory (Baddeley, Gathercole, & Papagno, 1998;
N. Ellis, 2001; Ellis & Sinclair, 1996; Gathercole & Baddeley, 1993; Engel de Abreu &
Gathercole, 2012; Robinson, 1995, 2015; Schmidt, 1990; Skehan, 1998), and processing
tendencies (Han & D’Angelo, 2009; O’Grady, 2003; VanPatten, 1996, 2002). These
learner-internal constructs are considered in conjunction with properties of language
input, specifically the frequency and variety of input features (N. Ellis, 2002a, 2002b,
2012; Ellis & Wulff, 2015; Larsen-Freeman, 2002; Year, 2009), the context and recency
of input features (N. Ellis, 2002a; Hoey, 2005), and the extent to which meanings and
language forms are mutually retrievable (Han, 2008, 2009, 2013; Miller & Schmitt,
2010). Language use includes the consideration of language output because the patterns
of neural activation that underlie encoding processes are deemed to affect the language
system (Swain, 2005). Finally, language use entails interaction (Long, 1996) because of a
learner’s engagement in cycles of input processing, output processing, and feedback.
The role of meaning is also fundamental to understanding language use. For
example, Krashen (1985) broadly stipulated that input must be meaningful for acquisition
to take place. Han (2013) further stipulated that form-meaning relationships must be
robust in the input and transparent to the learner for structures to be more easily acquired.
Equally, Swain specified in her distinction between output and pushed output that the
learner must endeavor to encode intended meanings “precisely, coherently, and
appropriately” (Swain, 2005, p. 473) with increasing degrees of semantic nuance.
Similarly, in her evaluation of interaction, Gass (2013) observed that feedback and the
basic pragmatics of discourse create pressure to communicate in a meaningful way,
thereby catalyzing language acquisition. Finally, MacWhinney (2015) echoed this idea,
stating that “the fundamental functional pressure [driving language change] is to
communicate efficiently in ways that allow the listener to efficiently and accurately
decipher the message […] the forms of natural languages are created, governed,
constrained, acquired, and used in the service of communicative functions” (p. 2).
Following these emergentist assertions, it would seem that our entire language apparatus
for our L1 and L2 is continually re-tuned through an interaction between domain-general
learning mechanisms and patterns of language use in order to fulfill our drive to make
meaning with others.
The arguments and data produced by researchers operating within the emergentist
paradigm have been robust enough that even GenSLA recently articulated a theoretical
interest. For example, recent reviews (Slabakova, Leal, & Liskin-Gasparro, 2014, 2015)
highlight GenSLA’s interest in language processing and underlying cognitive constructs,
in the properties of linguistic input including frequency, and in the relationships between
form and meaning as important factors in SLA. However, there is a need to investigate
emergentist claims in reference to SLA interests, which include an understanding of the
variable paths by which L2 learners use their new language and how they build their own
mental L2 structures. These last points constitute the scope of interest for the present
study.
1.2 Focus of the Study
Working in an emergentist framework, the present study documented the extent to
which three adult L2 learners of English used holistically stored strings of language
called chunks during L2 communication and acquisition. One suggestion by N. Ellis
(2002a) is that the emergence of language rules can start with chunks and proceed along a
particular path: An L2 learner acquires a chunk of lexical and grammatical morphemes as
a whole (e.g., howyadoin). Then, learning mechanisms and cognitive processes determine
which features are fixed and which are variable, and induce the gradual
grammaticalization of the constituents into other contexts such that a learner’s
interlanguage becomes productive, rule-like, and grammatically refined.
The SLA literature seems to provide consistent evidence of such a phenomenon in
L1 and child L2 learners (e.g., Hakuta, 1974; Myles, Hooper, & Mitchell, 1998; Myles,
Mitchell, & Hooper, 1999; Peters, 1983, 2009; Wong Fillmore, 1976). However, N. Ellis
(2002a) questions the extent to which this path is relevant in adult L2 language learning
given some additional complexities that are not present during L1 learning: a mind
optimally tuned for a preexisting L1, mature cognitive development, a history of formal
language instruction in some cases, and of course, fossilization. The literature presents
conflicting arguments about the role of chunks in adult L2 learning, thereby motivating
the present study. A number of key terms are used in this study, and are described in the
next section.
1.3 Key Terms
Acquisition
Broadly, acquisition refers to the internalization of a new linguistic system
(VanPatten & Benati, 2015). Within the current framework, acquisition specifically
denotes the emergence of new form-meaning connections. It also denotes changes in
cognitive associations between language forms, meanings, and contexts of use such that
the learner can communicate in a second language. The present study uses acquisition,
learning, and development synonymously.
Interlanguage
Interlanguage (IL) (Selinker, 1972) is a construct characterizing the sources of
linguistic knowledge that underlie L2 language use, including “numerous elements, not
the least of which are elements from the NL [native language] and the TL [target
language]. There are also elements in the IL that do not have their origin in either the NL
or the TL […] the learners themselves impose structure on the available linguistic data
and formulate an internalized system” (Gass, 2013, Kindle Locations 1058-1059).
Processes that structure this data in the mind, according to Selinker (1972), include
language transfer, fossilization, transfer of training, strategies of second language
learning, strategies of second language communication, and the overgeneralization of
target language material.

Constructions
Constructions are the basic units of language in an emergentist framework.
Following Langacker (1987, 2009) and Goldberg (2003, 2006), constructions are “stored
pairings of form and function, and include morphemes, words, idioms, partially lexically
filled and fully general linguistic patterns” (Goldberg, 2003, p. 219). A construction can
be constituted by a single lexical morpheme such as “/dɔg/ = dog” or it can be a single
grammatical morpheme like the plural -s. It can be a full idiom with specified
constituents like “/kɪkðə’bʌkɪt/ = die.” A construction can be a low-scope pattern, which
is a range of minimally variable slot-and-frame forms with lexico-grammatical
restrictions, such as good morning/afternoon/evening (N. Ellis, 2002a). Finally, a
construction can be maximally variable, entailing several unspecified lexical or
grammatical morphemes within abstract frames, such as passives (SUBJECT + BE +
PAST PARTICIPLE) or di-transitive structures (SUBJECT + VERB + NOUN +
PREPOSITIONAL PHRASE). Constructions can be hierarchically assembled such that
small units (i.e., individual lexical or grammatical morphemes) can be combined to
represent larger semantic propositions (e.g., relative clauses, dependent clauses, and
discourse-level units) (Goldberg, 2003, 2006; Langacker, 1987, 2009). Because we are
interested in exploring connections between fixed and variable language structures in
learner output, the term construction will denote variable (i.e., syntactically or
morphologically productive) language forms in the present study.

Learner Chunk
A learner chunk is a type of construction created by the language learner. For the
purpose of this study, it shall refer to a multi-morphemic string of language that is
processed in a holistic manner. Much of the recent SLA literature on the topic refers to
these language structures as holophrases (Corder, 1973), holistic phrases, phraseological
units (Peters, 1983), prefabricated patterns and routines (Hakuta, 1974, 1976), formulas
(Bardovi-Harlig, 2012; R. Ellis, 1984) or formulaic expressions (FEs) (Weinert, 1995).
However, these terms, particularly the prevalent term, formulaic expression, can
confound several different assumptions that need to be considered separately. This
definition is discussed in further detail in Chapter II.
N-gram
N-gram is a term used in corpus linguistics and denotes any sequence of N (2, 3, 4,
etc.) words. Software can be configured to scan a corpus of learner utterances and count
multi-word sequences of specific lengths. If a corpus contains a single sentence, I walked
to the park, the software will segment the 5-word sequence into the following N-grams,
regardless of whether they correspond to meaningful units:
2-word grams (bi-grams): I walked, walked to, to the, the park (4 total)
3-word grams (tri-grams): I walked to, walked to the, to the park (3 total)
4-word grams (4-grams): I walked to the, walked to the park (2 total)
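
The segmentation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the software used in the study; the function name ngrams and the whitespace tokenization are assumptions made for the example.

```python
def ngrams(tokens, n):
    """Return every contiguous n-word sequence in tokens, in order.

    Hypothetical helper for illustration; it makes no judgment about
    whether a sequence is a meaningful unit.
    """
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Assumption: words are separated by single spaces, as in the example sentence.
tokens = "I walked to the park".split()

for n in (2, 3, 4):
    grams = ngrams(tokens, n)
    print(f"{n}-word grams ({len(grams)} total): {', '.join(grams)}")
```

A sentence of k words yields k - n + 1 N-grams of length n, which is why the 5-word sentence produces 4, 3, and 2 sequences respectively.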

Language Processing
Language processing refers to computations that the mind performs during language
use. Processing, for example, occurs during comprehension (i.e., input processing),
during the accommodation of new features into the interlanguage (intake), and during the
production of spoken or written language (output processing) (VanPatten & Benati,
2015).
Holistic Processing and Storage
Holistic processing, or automatic processing (Myles, 2016), denotes the concurrent
neural activation of multiple linguistic units as if they represented a single choice. L2
learners may not necessarily have full knowledge of individual constituent units during
holistic processing (N. Ellis, 1996; Sinclair, 1991; Wray, 2000). For example, the phrase
whachadoin, in a holistic processing view, would presuppose that all constituents (e.g.,
what + are + you + doing) are accessed as a single item from long-term memory as if
they constituted one large word. Holistic storage denotes the intake of a multi-morphemic
form-meaning mapping into long-term memory.
Processes and Mechanisms
Emergentist theories assume that our cognitive endowment mediates learning
through processes and mechanisms. The present study relies on Gernsbacher’s (1991)
usage, whereby processes are complex cognitive operations requiring multiple steps or
elements. Processes include analysis, comparison, abstraction, and schematization. In
contrast to processes, mechanisms tend to be associated with basic operations at the
neural level that have catalytic or causative qualities (Bechtel, 2008; Gernsbacher, 1991),
and can induce processes. Mechanisms include sensitivity to frequency and recency,
whereby neural activation is conditioned by how often and how recently a stimulus has
occurred (N. Ellis, 2006a, 2006b).
Associative Learning
Associative learning is a domain-general learning mechanism that relies on a
fundamental, cognitive “sensitivity to the contingency between cues and outcomes” (N.
Ellis, 2008, p. 375). In language acquisition, associative learning mediates the acquisition
of form-meaning pairs through the co-perception of form and meaning. The more
frequently a form and a meaning can be reliably perceived together, the stronger the
associative bond between the two becomes over time.
Type and Token Frequency
Type and token frequency are properties of the input that interact with cognitive
processes such as categorization, comparison, and associative strengthening. To illustrate
the difference between tokens and types, consider the following set of linguistic forms:
[car, house, shop, car, shop]. The entire set contains five tokens; there are two tokens of
car, two tokens of shop, and one token of house. Token frequency denotes the number of
occurrences, or tokens, of a specific feature in the input. Token frequency is purported to
strengthen associations between linguistic features in the language system, to help form
initial linguistic categories in the interlanguage (Goldberg, 2006), and to indirectly
predispose certain linguistic features to the process of abstraction by virtue of their
prominence in working memory (N. Ellis, 1998).
Type frequency, however, denotes the number of different patterns available in a set,
or the number of contexts in which an item occurs. In the above example, the set contains
three linguistic types: car, house, and shop. When describing morphological features of
English, for example, the grammatical morpheme plural –s has a high type frequency
because of the great number of different nouns that it can attach to. The derivational
morpheme –ness has a comparatively low type frequency because of the smaller number of
adjectives that it can attach to. In an emergentist view, encounters with high type frequency
items in the input ensure that a linguistic form is not associated with any particular
linguistic context. High type frequency facilitates the acquisition of abstract patterns
(Goldberg, 2006).
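The token and type counts for the example set, and the type frequency of a morpheme, can be illustrated with a short Python sketch. The stem and morpheme pairs in observed are invented toy data, and the helper names type_frequency and token_frequency are hypothetical; none of this comes from the study's corpora.

```python
from collections import Counter

# The example set from the text: five tokens, three types.
forms = ["car", "house", "shop", "car", "shop"]
token_counts = Counter(forms)
n_tokens = sum(token_counts.values())  # total occurrences
n_types = len(token_counts)            # distinct forms
print(n_tokens, n_types)


def token_frequency(pairs, morpheme):
    """Count every occurrence of the morpheme in (stem, morpheme) pairs."""
    return sum(1 for _, m in pairs if m == morpheme)


def type_frequency(pairs, morpheme):
    """Count the distinct stems the morpheme attaches to."""
    return len({stem for stem, m in pairs if m == morpheme})


# Assumption: toy observations, invented for illustration only.
observed = [("dog", "-s"), ("cat", "-s"), ("dog", "-s"),
            ("shop", "-s"), ("happy", "-ness")]

for morpheme in ("-s", "-ness"):
    print(morpheme, token_frequency(observed, morpheme),
          type_frequency(observed, morpheme))
```

In the toy data, -s occurs four times but with only three distinct stems, which shows how token frequency and type frequency can diverge for the same form.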
Productivity
The present study relies on the psycholinguistic definition of productivity, which
denotes the number of linguistic contexts in which a construction can appear (Bauer,
2004; N. Ellis, 1998). High productivity is essentially synonymous with the concept of
high type frequency. For example, the plural -s is a highly productive construction in
English compared to the morpheme -ness. The passive construction is another productive
structure because it is used with many transitive verbs in a variety of discourse contexts.
Grammaticalization
Emergentist theories view language as a network of associative connections (Bybee,
2006, 2008; N. Ellis, 2006a). Grammaticalization refers to the change in associative
connections between linguistic features in the interlanguage. When considered with
respect to learner chunks, grammaticalization can be understood in two ways: changes
within the chunk and changes outside the chunk. If a learner starts with knowledge of a
holistically stored chunk such as <there have> in the context of the utterance <there
have> three dogs, the learner gradually learns which of the features can vary (N. Ellis,
2002a). In this case, the learner breaks the exclusive association between there + have
and associates the constituents with other forms to produce utterances such as there had,
there is, there are, and there was. With regard to a learner chunk and its external
associative connections, a learner may possess a chunk <what are you> and produce an
utterance such as <what are you> do__ with that. Initially, the learner does not associate
the morphological slot for verb endings with relevant features inside the chunk, but over
the course of acquisition, the learner develops the appropriate association between the
chunk and the –ing verb ending.
Comparison, Analysis, Abstraction, Schematization
The processes of comparison, analysis, abstraction, and schematization are deemed
to underlie grammaticalization. Comparison entails two discrete language units being
evaluated for similarity. Analysis describes the gradual segmentation of holistically
stored, complex units such that their constituents are viewed individually by the language
system in their own right (Arnon & Christiansen, 2014). Abstraction creates a
paradigmatic “slot” in sequences based on regularities shared by two or more units (N.
Ellis, 2002a). Schematization is the process of expanding that paradigmatic slot to
accommodate an increasing variety of forms.
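The four processes can be illustrated with a deliberately simplified sketch in Python. It is a toy under stated assumptions, not a model proposed in the literature cited here: units are assumed to be already segmented into words (standing in for analysis), a slot is represented as a Python set of attested fillers, and the function names abstract and schematize are hypothetical. The word sequences reuse the <there have> example discussed under grammaticalization.

```python
def abstract(seq_a, seq_b):
    """Compare two segmented units; if they have equal length and differ in
    exactly one position, return a template with a slot at that position.
    The slot is a set of the fillers attested so far. Otherwise return None."""
    if len(seq_a) != len(seq_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
    if len(diffs) != 1:
        return None
    template = list(seq_a)
    template[diffs[0]] = {seq_a[diffs[0]], seq_b[diffs[0]]}
    return template

def schematize(template, seq):
    """Widen the slot with a new filler if seq matches every fixed position
    of the template. Return True if the template accommodated seq."""
    if len(seq) != len(template):
        return False
    for item, word in zip(template, seq):
        if not isinstance(item, set) and item != word:
            return False
    for item, word in zip(template, seq):
        if isinstance(item, set):
            item.add(word)
    return True

# Comparison and abstraction: two units share "there" and differ in one position.
template = abstract(["there", "have"], ["there", "is"])

# Schematization: the slot expands to accommodate further forms.
schematize(template, ["there", "are"])
schematize(template, ["there", "was"])

print(template[0])          # there
print(sorted(template[1]))  # ['are', 'have', 'is', 'was']
```

A sequence that does not match the fixed part of the template, such as ["what", "are"], leaves the slot unchanged, which reflects the idea that the slot is anchored to the regularities shared by the compared units.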
Exemplars
Exemplars are memorized experiences of language (Gahl & Yu, 2006). When a
learner experiences and comprehends a construction, the learner creates a memory of its
use, which contributes to the aggregate of the learner's language experience and serves as
a basis for interpreting input and producing output.
Exemplar-Based Learning
Exemplar-based learning describes how we acquire language on an exemplar-by-
exemplar basis and form linguistic categories based on similarities across features
(Bybee, 2013; N. Ellis, 2006a, 2006b; Gahl & Yu, 2006). Learning mechanisms compare
new exemplars that are apperceived in the input to those that are already stored in the
interlanguage. When an association is formed, the category is updated to accommodate
the new memory. The new exemplar need not have the same core properties as all of the
exemplars in a comparison category (Rosch, 1973; Taylor, 2008) nor does it have to be
target-like. Instead, the new exemplar must have enough properties in common with some
of the existing exemplars that populate the category to become associated, which results
in an update or modification to the linguistic category itself. Additionally, a single
exemplar can belong to multiple categories at the same time.
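The category-updating logic described above can be sketched as a toy procedure in Python. Everything specific in it is an assumption made for illustration: exemplars are represented as sets of feature labels, the labels and category names are invented, the overlap threshold of two shared features is arbitrary, and the function name assimilate is hypothetical.

```python
def assimilate(categories, exemplar, threshold=2):
    """Add the exemplar to every category in which at least one stored
    exemplar shares `threshold` or more features with it.
    Return the names of the categories that were updated."""
    updated = []
    for name, members in categories.items():
        if any(len(exemplar & stored) >= threshold for stored in members):
            members.append(exemplar)  # the category now includes the new memory
            updated.append(name)
    return updated

# Hypothetical stored exemplars, each a set of invented feature labels.
categories = {
    "existential": [frozenset({"there", "be", "NP"})],
    "possession": [frozenset({"subject", "have", "NP"})],
}

# A non-target-like exemplar, loosely modeled on <there have> three dogs.
new_exemplar = frozenset({"there", "have", "NP"})

joined = assimilate(categories, new_exemplar)
print(joined)  # ['existential', 'possession']
```

The new exemplar shares only some features with the members of each category and is not target-like, yet it is associated with both categories at once, which matches the three properties of exemplar-based categorization noted in the text.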

1.4 Outline of the Dissertation
The present study is an exploratory attempt to investigate the role of learner chunks
in adult L2 acquisition. Chapter II reviews the relevant literature as it pertains to the role
of chunks in second language acquisition. Chapter III describes the methodology of the
present study. Chapter IV presents the findings from analyses that consider data from an
inter-learner perspective. Chapter V presents the findings from analyses that consider
data from an intra-learner perspective. Chapter VI synthesizes and discusses the findings,
and concludes with theoretical implications, limitations of the study, and directions for
future research.
