The Language Learning Journal
ISSN: 0957-1736 (Print) 1753-2167 (Online)
Published online: 19 Aug 2018.
https://doi.org/10.1080/09571736.2018.1504229

The measurement of implicit and explicit knowledge

Rod Ellis (a) and Carsten Roever (b)

(a) School of Education, Curtin University, Perth, Australia; (b) Department of Linguistics and Applied Linguistics, University of Melbourne, Melbourne, Australia

ABSTRACT
This article presents a review of research that has investigated ways of measuring implicit and
explicit knowledge of a second language (L2), focusing on grammar. It begins by defining implicit
and explicit knowledge in terms of a distinguishing set of criteria. Two ways of investigating
implicit knowledge are discussed – through experimental studies of implicit learning and by means
of factor-analytic studies. This provides the basis for a taxonomy of tests designed to provide
separate measures of the two types of knowledge. Proposals for oral production tests,
comprehension tests, judgement tests, tests of metalinguistic knowledge and tests derived from
psycholinguistic research are examined and critiqued. The article concludes by suggesting there is
a need for tests of L2 pragmatic knowledge to complement those available for grammar and offers
a number of suggestions for the design of such tests.

KEYWORDS
Measuring implicit/explicit L2 knowledge; grammar; pragmatics

Introduction
A prevailing problem facing SLA researchers is how to obtain evidence that a person’s knowledge of a
second language is of the implicit rather than explicit kind. It is not possible to examine implicit
knowledge directly as there is no direct window into people’s minds to see how their knowledge
is represented or what kind of knowledge they utilise when they perform a task (although advances
in neurolinguistics show promise in this direction). Instead, inferences about the type of knowledge
involved have to be drawn by examining a person’s linguistic behaviour. This requires making pre-
dictions about what kinds of behaviour are most likely to constitute evidence of a learner’s implicit
and explicit knowledge and then developing validity arguments to support the theoretical premises
of these predictions.
The measurement of implicit knowledge is of importance for at least three areas of enquiry. SLA
researchers (e.g. Williams 2005; Godfroid 2016) are interested in establishing to what extent learners
(especially adult learners) are capable of implicit learning (i.e. learning without intention and aware-
ness) and to this end they need to be able to assess whether the learning that results from exposure
to specific linguistic features results in implicit knowledge. Language instructors are also keen to
know whether instruction directed at specific linguistic features leads to implicit or just explicit knowl-
edge of these features and also whether some types of instruction are more likely to achieve this than
others. Finally, language testers may also wish to know what aspects of a learner’s language profi-
ciency their tests measure. In short, understanding how best to measure implicit knowledge is of
enormous significance to the central areas of enquiry in applied linguistics.
My goal in this article is to examine the various tests and the measures derived from them that
have been used by researchers to assess the extent to which L2 learners possess implicit knowledge
of a target language. Given that the research to date has focused more or less exclusively on
grammar, I will restrict my review to tests of grammatical knowledge. However, I will also point to
ways (and difficulties) of utilising similar tests for measuring learners’ pragmatic knowledge. I
begin with a definition of implicit knowledge.

Defining implicit knowledge (Ellis 2005)


Definitions of implicit knowledge are based on identifying the characteristics that distinguish implicit
from explicit knowledge. In Ellis (2005, 2006), I proposed seven key characteristics that differentiate
the two types of knowledge. Table 1 from Ellis (2005) provides a summary of the assumptions I made.
Of these characteristics, some are clearly more central to defining implicit knowledge than others and
thus of greater importance for designing tests. The key characteristics are (1) awareness, (2) accessi-
bility and (3) self-report. That is, implicit knowledge is most clearly defined as knowledge that the
learner has no subjective awareness of, can access for spontaneous language use through automatic
processing, and is unable to verbalise.
Some researchers (e.g. Dienes and Perner 1999) have viewed the distinction between implicit and
explicit knowledge as continuous rather than dichotomous. In Ellis (2004), however, I rejected this
position, arguing that from a connectionist account of language, implicit knowledge consists of an ela-
borated, statistically determined network of weighted associations, making it difficult to see how
knowledge could be more or less implicit/explicit. Neurolinguistic studies also point to a clear separ-
ation of the two types of knowledge (see Paradis 2004; Ullman 2004).
There is, however, another possibility that carries greater conviction. This disputes characteristic
(2) by proposing that accessibility may not clearly distinguish the two types of knowledge. I and
other researchers have argued that automatic processing is a defining characteristic of implicit knowl-
edge and, therefore, that any language behaviour involving automatic processing affords evidence of
the learner’s implicit knowledge. This assumption has underscored the development of tests of
implicit knowledge as reported in a number of studies (e.g. Ellis 2005, 2006; Gutiérrez 2013; Spada,
Shiu and Tomita 2015; Zhang 2015). However, it has also been challenged by DeKeyser (2003),
who pointed out that explicit knowledge can be proceduralised and automatised through practice,
allowing it to be used spontaneously. Suzuki and DeKeyser (2015) suggested that a distinction needs
to be made between ‘explicit knowledge’ and ‘automatic explicit knowledge’ and that the latter is
functionally equivalent to implicit knowledge although still distinct from it. From this perspective,
then, time-pressured tests cannot provide indisputable evidence of implicit knowledge as learners
may be able to utilise their automatised explicit knowledge. In other words, to test implicit knowl-
edge it is necessary to base measurement on one or both of the two other defining characteristics
of implicit knowledge – awareness and self-report. Later we will consider studies that have attempted
this.
Discussions of implicit knowledge have focused on grammar (and to a lesser extent phonology
and vocabulary). There is an almost complete absence of any consideration of implicit pragmatic
knowledge, probably because the study of interlanguage pragmatics has been more concerned
with the social and sociolinguistic dimensions of language use than the psycholinguistic.

Table 1. Key characteristics of implicit and explicit knowledge.

Characteristic | Implicit knowledge | Explicit knowledge
Awareness | Intuitive awareness of linguistic norms | Conscious awareness of linguistic norms
Type of knowledge | Procedural knowledge of rules and fragments | Declarative knowledge of grammatical rules and fragments
Systematicity | Variable but systematic knowledge | Anomalous and inconsistent knowledge
Accessibility | Access to knowledge by means of automatic processing | Access to knowledge by means of controlled processing
Use of L2 knowledge | Access to knowledge during fluent performance | Access to knowledge during planning difficulty
Self-report | Non-verbalisable | Verbalisable
An exception is Taguchi (2012), who drew on Bialystok's two-dimensional model of language proficiency
to distinguish what she called pragmatic knowledge (i.e. the ability to comprehend and produce
speech intentions) and processing fluency in comprehension and production. In other words,
Taguchi proposed distinguishing pragmatic knowledge in terms of learners' accessibility to the lin-
guistic forms needed to decode and encode pragmatic intentions. Although Taguchi did not use
the terms implicit and explicit knowledge it is clear that her model of pragmatic knowledge addresses
this distinction. The model, however, relies solely on the accessibility criterion, which as noted above
may not be sufficient. However, the other two characteristics are equally applicable to pragmatic
knowledge; learners may or may not be aware of the social meanings they convey through their prag-
malinguistic choices and they may or may not be able to verbalise the choices they made.
Whether measuring grammatical or pragmatic knowledge, developing tests that distinguish
implicit and explicit knowledge is difficult. As de Jong (2005a) noted:
Testing whether learning is implicit or explicit is very difficult, because there are no clear boundaries between
implicit and explicit processes and nearly all cognitive processes have both implicit and explicit aspects. This
means that implicit learning should not be ruled out as soon as awareness has been established, nor should
implicit learning only be assumed when there is no awareness at all of the learning process or product. The
same argument holds for implicit and explicit knowledge, which can (and often do) co-exist and operate simul-
taneously. (p. 7)

Most L2 learners possess both types of knowledge, so determining which type of knowledge they
deploy on particular occasions is problematic. At best, tests can only hope to bias learners to the
use of one or the other type of knowledge.

Two approaches for investigating tests of implicit knowledge


Some of the most interesting work on measuring implicit knowledge comes from studies that have
investigated whether learning in the absence of conscious awareness is possible for adults (e.g. Wil-
liams 2005; Rebuschat et al. 2015; Godfroid 2016; Kerz, Wiechmann and Riedel 2017). To demonstrate
whether implicit learning is possible it is necessary to test whether the learning that takes place
results in implicit or explicit knowledge. The second approach involves administering a battery of
tests that have been theorised to afford relatively separate measures of implicit and explicit knowl-
edge and investigating whether they in fact do so by factor-analysing scores from the tests (e.g. Ellis
2005; Zhang 2015; Kim and Nam 2017; Suzuki 2017). The aim here is to show that scores from the
tests load on distinctive factors as predicted.

Studies of implicit learning


Researchers interested in investigating implicit language learning have adopted an experimental
approach often involving an artificial language to ensure that the learners had no prior knowledge
of a target feature. Learners are first exposed to multiple exemplars of a specific grammatical struc-
ture in a training phase of a study. This is followed by a testing phase. A variety of test types have
been used (e.g. a grammaticality judgement test (GJT), a forced-choice test and a fill-in-the-gap
test). By themselves, these tests are not able to show whether learners have acquired explicit or
implicit knowledge. But in addition, information is collected through self-report to determine
whether the learners had consciously registered (i.e. become aware of) the target structure during
the training or testing phases of the study. For example, Rebuschat et al. (2015) asked the participants
in their study to complete a forced-choice production test and to provide confidence ratings and
source attributions for responses to each item in the test. The participants indicated how confident
they were about their responses by choosing from four options (not confident at all – just guessing;
somewhat confident; very confident; 100% confident). Source attributions were gathered by requir-
ing the participants to state whether they guessed, relied on intuition, memory or rule knowledge.
These two kinds of self-report provided data about the participants’ judgement knowledge and
structural knowledge. Implicit learning was held to have occurred when the participants
indicated no confidence in the choices they made in the test (i.e. reported they were just guessing)
but nevertheless scored above chance.
This approach suffers from a number of problems especially if the tests are of the kind that are
likely to tap explicit knowledge. It relies on participants reporting their confidence levels and
source attributions honestly. But there is no way of telling if a test-taker based his/her confidence
ratings on implicit or explicit knowledge. Similarly, a test-taker may report he/she relied on intuition
but in fact may have also referred to a rule. As Rebuschat (2013) pointed out, learners often possess
both explicit and implicit knowledge of the same grammatical feature and may utilise one or the
other or both when subjectively rating test items. Nevertheless, confidence ratings and source attri-
butions can help to check whether tests designed to assess implicit or explicit knowledge actually
functioned as intended.
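
One way such self-report data can be analysed is a 'guessing criterion' check: whether accuracy on the items a participant attributed to guessing nevertheless exceeds chance. The sketch below is a minimal illustration in Python; the response data are invented and the 50% chance level assumes a two-alternative forced-choice test.

```python
# Minimal sketch: do items the participant attributed to 'guessing'
# nevertheless show above-chance accuracy? (Invented data.)
from scipy.stats import binomtest

# 1 = correct, 0 = incorrect, for guess-attributed items only
guess_items = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

result = binomtest(k=sum(guess_items), n=len(guess_items),
                   p=0.5, alternative='greater')
print(f"accuracy = {sum(guess_items) / len(guess_items):.2f}, "
      f"p = {result.pvalue:.3f}")
# Above-chance accuracy despite self-reported guessing is taken as
# evidence of implicit knowledge.
```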

Factor-analytic studies
The first of a series of factor-analytic studies investigating the assessment of implicit/explicit knowledge was
Ellis (2005). This study set out to validate a set of tests designed to provide relatively separate
measures of implicit and explicit knowledge of language. The tests focused on knowledge of 17
English grammatical structures selected to represent different levels of learning difficulty and to
include both morphological features (e.g. 3rd person -s) and syntactic structures (e.g. dative alter-
nation). The tests differed in terms of four criteria that were theorised to distinguish the two types
of knowledge: (1) degree of awareness, (2) time available, (3) focus of attention and (4) utility of
meta-language. Table 2 shows how the tests in the battery mapped onto these four design fea-
tures. Three tests (an oral elicited imitation test (EIT), an oral production test and a timed GJT)
were hypothesised to measure implicit knowledge and an untimed GJT and metalinguistic knowl-
edge test (MKT) to measure explicit knowledge. The tests were administered to a sample of mixed-
proficiency adult ESL learners in New Zealand. An exploratory factor analysis produced two factors
with scores from the two sets of tests loading more or less as hypothesised. A subsequent confir-
matory factor analysis (Loewen and Ellis 2007) tested a model based on the implicit/explicit dis-
tinction and a second model based on the oral/written distinction and showed that only the
former constituted a satisfactory fit (see Note 1).
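
The logic of this approach can be illustrated with a small simulation. The sketch below (Python; all data are simulated and the loadings are invented) fits a two-factor exploratory model to scores from five hypothetical tests and prints the rotated loadings, which should split the tests into an 'implicit' and an 'explicit' group. A confirmatory analysis of the kind reported in Loewen and Ellis (2007) additionally fixes which tests may load on which factor and evaluates model fit, but the exploratory sketch captures the underlying idea.

```python
# Minimal sketch of the factor-analytic logic on simulated test scores.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 200                            # hypothetical learners
implicit = rng.normal(size=n)      # latent implicit knowledge
explicit = rng.normal(size=n)      # latent explicit knowledge

# Columns: EIT, oral narrative, timed GJT (hypothesised implicit);
# untimed GJT, MKT (hypothesised explicit). Loadings are invented.
scores = np.column_stack([
    0.8 * implicit + rng.normal(scale=0.6, size=n),
    0.6 * implicit + rng.normal(scale=0.8, size=n),
    0.7 * implicit + rng.normal(scale=0.7, size=n),
    0.8 * explicit + rng.normal(scale=0.6, size=n),
    0.7 * explicit + rng.normal(scale=0.7, size=n),
])

fa = FactorAnalysis(n_components=2, rotation='varimax').fit(scores)
print(fa.components_.round(2))  # each test should load on one factor
```
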
Further studies by Bowles (2011) and Zhang (2015) on very different populations of learners (i.e.
classroom and heritage learners of Spanish in the US in Bowles and Chinese university students in
Zhang) have confirmed the results obtained in Loewen and Ellis (2007). Other studies (e.g. Spada,
Shiu and Tomita 2015; Suzuki and DeKeyser 2015), however, have produced different results,
leading in particular to questions regarding the validity of the elicited imitation test as a measure
of implicit knowledge (as opposed to a measure of automatised explicit knowledge) along with pro-
posals for alternative tests of implicit knowledge borrowed from the psycholinguistic literature. The
doubts regarding the validity of the elicited imitation test and also the timed GJT that were raised by
these studies are considered below.
The strength of the factor-analytic approach is that it serves to test theory-driven hypotheses
about the type of knowledge that different tests measure. The starting point is the definition of
implicit/explicit knowledge (see preceding section). This then serves to identify the key characteristics
of tests hypothesised to provide relatively distinct measures of the two types of knowledge. The tests
are designed accordingly, administered to a sample of L2 learners and the scores submitted to a
confirmatory factor analysis to determine whether in fact the tests distinguish the knowledge
types as intended. This approach, which to date has been used exclusively for tests of grammar, is
the obvious way to tackle the development of tests of pragmatic knowledge.

Table 2. Design features of the tests in the test battery (from Ellis et al. 2009: 47).

Criteria | Oral imitation | Oral narrative | Timed GJT | Untimed GJT | Meta-language
Degree of awareness | Feel | Feel | Feel | Rule | Rule
Time available | Pressured | Pressured | Pressured | Unpressured | Unpressured
Focus of attention | Meaning | Meaning | Form | Form | Form
Utility of knowledge of meta-language | No | No | No | Yes | Yes

A taxonomy of tests of implicit and explicit knowledge


There is now a range of tests that have been designed to measure implicit and explicit knowledge. Table 3
lists the various tests along with key variants in the basic types. It also gives examples of studies that
have investigated these types. These test types differ in a fundamental way. Whereas production and
comprehension tests can claim to be authentic in the sense that they involve how language is used in
normal communication, the other tests (the judgement, metalinguistic knowledge and psycholin-
guistic tests) are clearly lacking in authenticity as language users do not normally have to judge
the grammaticality of sentences, demonstrate knowledge of meta-language or read sentences as
they appear word by word on a computer screen in the course of their everyday use of language.
The authenticity of the tests may not be important if the purpose is to investigate theoretical
issues about language acquisition in experimental research but face validity does become important
if the tests are to serve as proficiency tests.
The tests marked with a * in Table 3 were designed as potential measures of implicit knowledge
while those marked with a ** were intended to measure explicit knowledge. Thus tests of implicit
knowledge include free production, elicited imitation, picture-matching comprehension tests,
timed and aural judgement tests and psycholinguistic tests. There are some notable differences
in these tests. Some of them (free production, elicited imitation and picture-matching) stipulate
that learners should focus on meaning rather than form. However, the judgement tests require
learners to focus on form and attempt to elicit the use of implicit knowledge by creating proces-
sing pressure either through an aural presentation of the sentence stimuli or through time restric-
tions. However, as noted above, simply ensuring that test performance is pressured cannot
guarantee that learners draw on their implicit knowledge as they may have been able to access
automatised explicit knowledge. The psycholinguistic tests adopt a very different approach.
They require a focus on the meaning of stimuli but measure implicit knowledge in terms of lear-
ners’ sensitivity to grammatical violations through a comparison of response times to grammatical
and ungrammatical sentences, assuming that learners will take longer to process ungrammatical
than grammatical forms.
In the sections that follow I will provide a commentary on each of the different tests and
also consider to what extent it might be feasible to use each to assess implicit pragmatic
knowledge.

Table 3. A taxonomy of tests of implicit/explicit knowledge.

Type of test | Versions | Example
Oral production tests | Free production * | Ellis (2005); Spada, Shiu and Tomita (2015)
 | Controlled production ** | Macrory and Stone (2000)
 | Elicited imitation * | Ellis (2005); Zhang (2015)
 | Error correction ** | Spada, Shiu and Tomita (2015)
Comprehension tests | Picture-matching listening test * | de Jong (2005a)
Judgement tests | Timed * vs untimed ** | Ellis (2005)
 | Aural * vs written ** | Kim and Nam (2017)
Metalinguistic knowledge tests | Receptive knowledge **; Productive knowledge ** | Ellis (2005)
Psycholinguistic tests | Word monitoring test * | Suzuki and DeKeyser (2015)
 | Self-paced reading test * | Vafaee, Suzuki and Kachinske (2017)
 | Visual-world task * | Suzuki (2017)

Tests involving oral production


Free oral production tests
A key characteristic of a free production test is that learners are not made aware of what linguistic
features the test has been designed to measure (i.e. learners are required to focus solely on
meaning). A free oral production test also requires real-time language processing so learners need
to draw on their automatised knowledge. Finally, learners are unlikely to draw on their knowledge
of meta-language. In short, free production tests correspond quite closely to how language is
used in everyday communication and satisfy Ellis’ (2005) four criteria for tests of implicit knowledge
(see Table 2).
Ellis (2005) included an oral narrative task in his battery of tests. Learners were asked to read a
story through twice and then retell it orally. To encourage them to speak spontaneously they were
told they only had 3 minutes. Scores on eight grammar structures were calculated using obligatory
occasion analysis. In the confirmatory factor analysis of the battery of tests, oral narrative scores
loaded on the implicit factor but the loading was weaker than for both the EIT and the timed GJT,
possibly because these other tests measured knowledge of all 17 grammatical structures. The oral
narrative scores also correlated weakly but significantly with scores on the Untimed GJT and the
MKT. Clearly, it was not a ‘pure’ measure of implicit knowledge.
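
Obligatory occasion analysis scores each structure as the proportion of contexts requiring it in which it was correctly supplied. A minimal sketch (the counts are invented for illustration):

```python
# Obligatory occasion analysis: correct suppliance / obligatory contexts.
def obligatory_occasion_score(correct: int, occasions: int) -> float:
    return correct / occasions if occasions else float('nan')

# e.g. a retelling created 9 contexts requiring 3rd person -s,
# and the learner supplied the form correctly in 6 of them
print(round(obligatory_occasion_score(6, 9), 2))  # 0.67
```
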
Bowles' (2011) study of L2 Spanish also reported that scores from the oral narrative test loaded on
the implicit factor in a study that used the same range of tests as in Ellis (2005). However, other follow-
up factor-analytic studies (e.g. Suzuki and DeKeyser 2015; Zhang 2015) did not include a free oral pro-
duction test in their test battery probably because administering such a test is very time-consuming
and because of the difficulty of ensuring that production of the target structures is task-essential (see
Loschky and Bley-Vroman 1993). Evidence of this latter problem comes from Spada, Shiu and Tomita
(2015), who included a picture-cued story-telling task. Even though learners were given the key words
for each picture, they failed to consistently use the target structure (passive forms) and scores on this
test were very low. Spada, Shiu and Tomita concluded that the ‘specific design features of the oral
production task exclude it as a reliable measure of implicit knowledge’ (2015: 740).
Despite the problems with oral production tasks, they have high face validity as a language testing
device. Also, as Ellis’ and Bowles’ studies showed, they have construct validity as tests of implicit
knowledge. Clearly, though, they do not prevent learners from accessing their explicit knowledge,
especially if this is of the automatised kind. Role-play tests – a popular way of testing pragmatic
ability – have similar design features to the oral narrative tests in Ellis' and Bowles' studies. Perhaps,
though, learners are even more likely in a role-play to draw on whatever explicit pragmatic knowl-
edge they possess as the specification of the situation included in such a task sensitises learners
to attend consciously to using language in a socially appropriate way.

Controlled production tests


Arguably the difference between a controlled and free production test is continuous rather than
dichotomous – that is, tests can direct learners' attention to specific grammatical forms to a
greater or lesser extent. Spada, Shiu and Tomita's (2015) picture-
cued story-telling task lies somewhere in the middle of this continuum. Ellis' (2005) oral narrative
task lies towards the free production end of the continuum.
Tests at the controlled end of the continuum are what Rebuschat (2013) called direct tests. That is,
they explicitly require students to make use of their grammatical knowledge to complete the test –
for example, a fill-in-the-gap test where learners are asked to complete sentences using specific lin-
guistic forms. Kerz, Wiechmann and Riedel (2017) used such a test to measure whether implicit learn-
ing had taken place but they also included confidence ratings to establish whether the zero-order
correlation criterion (Dienes et al. 1995) had been met (see above). They also asked participants to
report on whether they had identified the target feature and to describe what they had noticed.
Without the subjective measures and self-report information such a test cannot be used to
measure implicit knowledge. Godfroid (2016) included a controlled production test involving pictures
and word prompts but used it as a measure of explicit knowledge.
A controlled production test encourages a high degree of linguistic awareness, a focus on form
and the use of metalinguistic knowledge. As such it is more likely to tap explicit knowledge especially
if there is no time pressure. If time pressure is exerted – for example, by presenting the stimuli aurally
and by setting a time limit for responding to each stimulus – learners are perhaps more likely to draw
on their implicit knowledge if they have it. However, such test conditions would not preclude the use
of automatised explicit knowledge. Thus, a controlled production test cannot serve as a satisfactory
measure of either type of knowledge.
The discourse completion test (DCT), popular in interlanguage pragmatic research, is a controlled
production test. The written form of this test most likely taps meta-pragmatic (i.e. explicit) knowledge
as learners are encouraged to consciously think about what they would say in the situations given to
them (see Golato 2003). However, introducing processing pressure by requiring an aural rather than a
written response and by setting a time limit might show whether the learner has access to automa-
tised knowledge (implicit or explicit). This is doubtful, though. Enochs and Yoshitake-Strain (1999)
factor-analysed a battery of tests of pragmatic competence, including a role play, a written DCT,
an aural DCT and a multiple choice DCT. They found that scores all loaded on the same factor,
suggesting that the tests were all measuring the same type of knowledge – probably meta-
pragmatic.

Elicited imitation test


It is useful to distinguish two uses of this test. It has been used as a measure of global language profi-
ciency or ‘L2 processing efficiency’ (Gaillard and Tremblay 2016), in which case the assumption is that
test-takers draw on an amalgam of implicit and explicit knowledge. It has also been used in SLA
research as a measure of implicit knowledge. Common to its use for both of these purposes, the
test requires learners to listen to a set of sentences that are sufficiently long to prevent rote recall
and then to reproduce them orally. Thus, the test involves both input- and output-processing – listen-
ing comprehension and oral production. However, measures derived from the test are based entirely
on learner production. As there are differences in the specific features of tests intended to assess
general language proficiency and implicit knowledge, I will consider them separately.
As a test of global language proficiency, Ortega et al.’s (1999) Spanish EIT has served as a model for
the design of EITs for a variety of languages and for validation studies such as Gaillard and Tremblay
(2016) and Bowden (2016). The EIT includes sentences that cover a wide variety of grammatical struc-
tures and differ in sentence length (from 7 to 17 syllables in Ortega et al.’s test). All the sentences are
grammatical. Learners are given a fixed time to repeat each sentence (e.g. 2.5 seconds in Bowden’s
study) and are not required to perform any response other than repeating the sentences. Ortega
(2000) developed a rating scale for evaluating learners’ performance on an EIT. This does not focus
on the accuracy with which specific grammatical structures are performed but on the quantity
and quality of the idea units that a learner includes in his/her repetition of a sentence (e.g.
‘perfect repetition’ = 4; ‘more than half the content preserved; slight changes in content that make
the content inexact’ = 2; ‘silence, unintelligible content, or only one content word’ = 0). Studies
have shown that the test has high reliability, is strongly correlated with other standardised measures
of oral proficiency as well as with learners’ self-assessment scores, and effectively discriminates
between learners who vary in language learning experience (Bowden 2016). Some doubts about
its practicality exist, however, as rating learners’ performances on each sentence is time-consuming.
The use of the EIT as a measure of implicit L2 knowledge began with Ellis’ (2005) study. Seventeen
structures were embedded in sentences that expressed beliefs about something (e.g. ‘New Zealand is
greener and more beautiful than most countries’). Some of the sentences were grammatical and
some ungrammatical but test-takers were not told this. The rationale for including ungrammatical
sentences was that learners draw on their long-term memory of the target structures and thus auto-
matically correct the errors without necessarily noticing them. Studies have shown that native speak-
ers do this. The instructions given to learners were to (1) listen to each sentence, (2) indicate whether
they agreed or disagreed with the proposition it encoded and (3) repeat the sentence in correct
English. The sentences were scored dichotomously in terms of whether a learner had produced
the target structure in a sentence correctly or not. If learners paraphrased a sentence avoiding use
of the target structure, they were scored 0. This resulted in a total accuracy score, a grammatical sen-
tences score and an ungrammatical sentences score for each learner, and also scores for each separate
structure. Total scores on the EIT loaded strongly (.87) on the implicit factor in Loewen and Ellis’ (2007)
confirmatory factor analysis. It constituted the best measure of implicit knowledge. This finding has
been replicated in other factor-analytic studies.
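
The scoring procedure just described is simple enough to state exactly. The sketch below (Python; the item records are invented) derives the total, grammatical-sentences and ungrammatical-sentences scores from dichotomously scored items:

```python
# Dichotomous EIT scoring: 1 if the target structure was produced
# correctly, 0 otherwise (including paraphrases that avoid it).
items = [
    {"grammatical": True,  "correct": 1},
    {"grammatical": False, "correct": 1},  # error corrected in repetition
    {"grammatical": True,  "correct": 0},
    {"grammatical": False, "correct": 0},  # error reproduced or avoided
]

total = sum(i["correct"] for i in items)
gram = sum(i["correct"] for i in items if i["grammatical"])
ungram = sum(i["correct"] for i in items if not i["grammatical"])
print(total, gram, ungram)  # 2 1 1
```
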
But does the EIT measure implicit knowledge? Erlam (2006) argued that to function as a test of
implicit knowledge the EIT needs to (1) require a primary focus on meaning, (2) include a delay
between presentation of the stimuli and repetition and (3) be time-pressured. She argued that the
test in Ellis (2005) satisfied these conditions. The fact that the learners often corrected the ungram-
matical sentences, that there was no correlation between the length of the sentences and accuracy in
reproducing the target structures, and that EIT scores were strongly correlated with scores from other
tests involving online language use all supported the claim that it measured implicit knowledge.
Nevertheless, as noted previously, this claim has been challenged on the grounds that the EIT
cannot distinguish between implicit knowledge and well-automatised explicit knowledge. This chal-
lenge arises from studies that have shown that scores from established psycholinguistic measures of
implicit knowledge such as the word monitoring task do not load on the same factor as EIT scores,
suggesting the need to distinguish three types of linguistic knowledge – implicit knowledge, auto-
matised explicit knowledge and non-automatised explicit knowledge. These studies will be con-
sidered later in the section dealing with psycholinguistic measures.
A number of modifications to the EIT might arguably enhance its construct validity as a test of
implicit knowledge. Spada, Shiu and Tomita (2015) substituted the belief statements used in Ellis
(2005) with truth-value statements. This has the advantage of showing clearly whether learners
have processed the sentence stimuli for meaning. In Ellis’ EIT, learners had to respond ‘yes’, ‘no’ or
‘not sure’ with no way of ensuring that their responses actually reflected their beliefs. But if they
have to respond ‘true’ or ‘false’ to factual statements it is possible to see if they have understood
a sentence and therefore to eliminate those sentences that they failed to understand from the
scoring. Kim and Nam (2017) addressed another possible limitation of the Ellis EIT, which did not
impose a strict time limit for the learners’ repetition of the stimulus sentences. Kim and Nam had lear-
ners complete the EIT under time pressure (allowing just 20% more time than native speakers
required) and in an unpressured condition. They reported that the time-pressured version afforded a
stronger measure of implicit knowledge (see Note 2). These modifications increase the need for automatic
processing.
Another interesting possibility comes from the use of an EIT for dynamic assessment – a form of
assessment supported by sociocultural theory (Lantolf and Poehner 2011). Its aim is to overcome the
dualism of ‘assessment’ and ‘instruction’ by investigating what L2 learners can accomplish both
without and with assistance. van Compernolle and Zhang (2014) describe a study where learners
were asked to repeat each sentence in an EIT and, if they were unable to do so, were offered gradu-
ated assistance. The study did not set out to investigate implicit and explicit knowledge but it
suggests an interesting way of doing this within a dynamic assessment framework. If learners can
repeat a sentence spontaneously and independently this might constitute evidence of implicit
knowledge (or automatised explicit knowledge); if, however, they require assistance but then
succeed in reproducing the sentence this might be indicative of explicit knowledge. If they fail to
reproduce the sentence even with help this would demonstrate they had neither implicit nor explicit
knowledge.
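
The interpretation suggested above can be stated as a simple decision rule. The function below is an illustrative sketch only; its inputs (whether the sentence was eventually reproduced, and how many graduated prompts were needed) are assumptions about how such a test might be scored, not features of van Compernolle and Zhang's study.

```python
# Illustrative decision rule for an EIT embedded in dynamic assessment.
def classify(reproduced: bool, prompts_needed: int) -> str:
    if reproduced and prompts_needed == 0:
        return "implicit (or automatised explicit) knowledge"
    if reproduced:
        return "explicit knowledge (succeeded with assistance)"
    return "neither implicit nor explicit knowledge"

print(classify(True, 0))   # spontaneous, independent repetition
print(classify(True, 2))   # succeeded after graduated prompts
print(classify(False, 3))  # failed even with help
```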

Irrespective of whether the EIT measures implicit knowledge or automatised explicit knowledge, it
has obvious potential as a measure of learners’ ability to use a range of grammatical structures
without the need for controlled processing. For this reason, it is of potential interest for the measure-
ment of pragmatic competence. It is, however, not easy to see how it can be adapted to focus on
pragmatic aspects of language. In a grammatical EIT, the grammaticality of the stimulus sentences
depends solely on the linguistic context provided by each sentence. But this is not possible in a prag-
matic EIT as appropriateness depends on the situational, not the linguistic, context. Thus to include
both pragmatically appropriate and inappropriate stimuli it would be necessary to specify the situa-
tional context for each utterance. This may be possible, especially if the situational context is
described in the learners’ L1. Alternatively, an EIT might focus solely on learners’ pragmalinguistic
knowledge by assessing the accuracy with which they are able to reproduce sentences containing
different realisations of the same speech act (e.g. direct and indirect forms of requests). This,
however, would tell us nothing about their sociopragmatic knowledge. Another problem in design-
ing an EIT to assess pragmatic knowledge is that it may not be possible to replicate the dual-tasking
element of grammatical EITs by asking learners to make belief or truth-value judgements before
reproducing the sentences. Asking learners to judge whether a sentence is appropriate or inappropri-
ate (in relation to the situational context provided) will draw attention to its pragmatic acceptability
and encourage conscious attention to the linguistic forms needed to reproduce or correct it. Using an
EIT as a measure of implicit pragmatic knowledge will clearly need careful consideration of design
issues.

Error correction tests


An error correction test was not included in the Ellis (2005) test battery or in most of the follow-up
studies. Spada, Shiu and Tomita (2015), however, did include one. In addition to requiring learners
to correct the errors in sentences, their test asked them to identify the errors first and to explain
them after correcting them. Such a test involves a high degree of awareness, is unpressured,
focuses attention explicitly on form and involves the use of meta-language. As such, according to
the design features that informed the Ellis battery (see Table 2), it clearly functions as a measure
of learners’ explicit knowledge. Support for this claim comes from Spada, Shiu and Tomita’s study,
they found that the three scores based on the error correction test along with scores on a written
GJT loaded on the same factor. Further support comes from studies such as Shintani and Ellis
(2013). This study reported that performance on an error correction test following form-focused
instruction deteriorated markedly over time. Shintani and Ellis suggested that this was because an
epiphenomenon of explicit knowledge is that learners can easily forget it. There is, of course, the
possibility of asking learners to rate their confidence and indicate the knowledge source they used
when correcting sentences. As with other form-focused methods of assessment these subjective
measures might be used to derive measures of implicit knowledge. However, I know of no
attempt to do this with error correction tests.
It should be easy to develop pragmatic tests where learners are asked to rewrite situationally
embedded inappropriate utterances in appropriate language. Such tests, however, would serve pri-
marily as measures of explicit pragmatic knowledge.

Comprehension tests
de Jong (2005a) pointed out the importance of investigating receptive knowledge as well as pro-
ductive knowledge when investigating learners’ implicit and explicit knowledge. He argued that
overall the literature points to the receptive representation of linguistic knowledge preceding the
productive representation but with some shared representation developing over time. Somewhat
surprisingly, however, there have been few attempts to investigate receptive implicit knowledge
of L2 grammar.

In his own study, de Jong (2005b) investigated the effects of listening training on the acquisition of
a grammatical feature in an artificial language and included both receptive and productive tests
designed to measure implicit and explicit knowledge. In the training task, learners heard a sentence
and indicated which picture matched the meaning of the sentence by pressing a key on a computer
keyboard. To test receptive knowledge de Jong used a self-paced listening task (described below in
the section dealing with psycholinguistic tests). He also administered a testing version of the picture-
matching task and recorded response times. Longer reaction times to the sentences containing the
grammatical target were taken as evidence of slower processing. However, de Jong admitted that
these tests could not distinguish between the rapid processing of explicit knowledge and implicit
knowledge. de Jong also asked the learners to complete a questionnaire where they were asked
to describe the rule that was the target of the study and reported that half the participants produced
at least a half-correct description of the rule, suggesting that, in fact, for many of the learners, the tests
may have measured explicit knowledge.
Qin and van Compernolle (forthcoming) describe an interesting dynamic assessment study invol-
ving pragmatic implicature. Learners were presented with a situation for a specific speech act along
with the utterance performing the act and then asked to decide what the meaning of the utterance
was by answering a multiple choice question (five choices). If they chose the correct answer they
moved on to the next item in the test. If they chose a wrong answer they received graduated feed-
back in the form of prompts. An example of an item from the test and the prompts can be found in
the Appendix. As suggested above, correct responses provided without assistance and following
prompting might be taken as signs of implicit knowledge and explicit knowledge respectively. Qin
and Compernolle’s test was written and unpressured; an aural, time-pressured test would have stron-
ger validity if the aim was to distinguish implicit and explicit knowledge.
Listening comprehension tests have a clear advantage over any kind of production test as they
can be designed to require rapid online processing – a key requirement for testing implicit knowl-
edge. In a production test, even a speeded one, learners still have some latitude for controlled pro-
cessing. A time-pressured picture-matching test could be designed to investigate learners’ receptive
sociopragmatic knowledge; learners could be shown pictures with empty speech bubbles and then
asked to pick the picture that best matched the utterance they listened to. For example, for the utter-
ance ‘Excuse me but would you mind closing the window?’ learners would select from pictures
showing a boy speaking to another boy, a boy speaking to his mother and a boy speaking to an
old man. Time-pressuring the learners’ response would help to ensure that they could not easily
draw on rule-based knowledge about how to request in English.
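
A scoring rule for such a test might credit a response only if it is both correct and produced within the time limit. The sketch below is hypothetical; the 2500 ms window and the function itself are illustrative assumptions, not an attested design.

```python
# Hypothetical scoring of a time-pressured picture-matching item:
# credit only correct responses made within the response window.
def score_item(choice: int, correct_choice: int, rt_ms: float,
               limit_ms: float = 2500) -> int:
    return int(choice == correct_choice and rt_ms <= limit_ms)

print(score_item(choice=2, correct_choice=2, rt_ms=1800))  # 1
print(score_item(choice=2, correct_choice=2, rt_ms=3100))  # 0: too slow
```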

Judgement tests
The use of GJTs has a long history in both linguistics and SLA research. In much of the research, they
have been used to measure learners’ knowledge of grammatical structures without any consideration
of what type of knowledge is involved. In Ellis (2005) and follow-up studies, however, attempts were
made to design tests with different features (e.g. timed vs. untimed; aural vs. written) in order to dis-
tinguish measurements of implicit and explicit knowledge.
GJTs can come in various forms (see Ellis 1991). For example, test-takers can be simply asked to
judge whether the sentences are grammatical or ungrammatical, to indicate the part that is ungram-
matical, and/or to correct the errors in the ungrammatical sentences. Thus a judgement test can also
double up as an error correction test. The type of test that has been used in the implicit/explicit
knowledge studies has typically only asked for a grammaticality judgement. Asking test-takers
to indicate and correct errors is impractical in timed and aural tests.
One of the issues addressed is whether responses to the grammatical and ungrammatical sen-
tences draw on different knowledge sources. Ellis (2005), for example, found that scores derived
from the ungrammatical sentences in the untimed GJT loaded much more strongly on the explicit
factor than scores for the grammatical sentences. Total GJT scores (i.e. for both grammatical and
ungrammatical sentences) loaded on both the explicit and implicit factors although more strongly
on the explicit factor. Gutiérrez (2013) found that there were statistically significant differences
between the learners’ responses to the grammatical and ungrammatical sentences in both the
timed and untimed tests in her study and proposed that learners resort to their implicit knowledge
when judging grammatical sentences and to their explicit knowledge when judging ungrammati-
cal ones. However, Kim and Nam (2017) found that it was the ungrammatical sentences in an aural
GJT that loaded on the implicit factor. Clearly, results for grammatical versus ungrammatical sen-
tences have not been consistent across studies although the weight of the evidence supports Ellis’
(2005) finding, namely that ungrammatical sentences are the more likely to elicit explicit knowl-
edge. Vafaee, Suzuki and Kachinske (2017), for example, reported that the confirmatory factor
model with the best fit in their study identified the ungrammatical sentences in both a timed
and untimed GJT as loading together with scores on the MKT on a factor they labelled explicit
knowledge.
There is also evidence that the modality of a GJT affects the kind of knowledge tapped by a test.
This makes good sense as responding to aural sentences requires real-time language processing
especially if the items in the test are time-pressured. In Kim and Nam’s study, for example, the
reason why the ungrammatical sentences loaded on the implicit factor was probably because the
sentences were presented aurally (i.e. the learners did not have time to access their explicit
knowledge).
Drawing together the results from various studies, it is possible to identify the features of GJTs that
influence the type of knowledge learners are likely to draw on. I have summarised these in Table 4.
This table suggests that it is the grammatical sentences in an aural timed GJT that are most likely to
tap into learners’ implicit knowledge. This is an idealisation, however, as the various design features
interact in complex ways that are not entirely predictable. Also, given that any kind of GJT requires
learners to focus on form, it could be argued that GJTs can only distinguish automatised and non-
automatised explicit knowledge at best. Kim and Nam's factor analysis resulted in a three-factor sol-
ution that they labelled 'implicit strongest' (loadings just for the EIT), 'implicit weakest' (loadings for
the written timed GJT and the aural timed GJT) and 'explicit' (loading for the MKT). An alternative description
of these three factors, however, could be 'implicit', 'automatised explicit' and 'non-automatised explicit',
with the two GJTs serving as measures of automatised explicit knowledge.
GJT scores by themselves cannot convincingly tell us what kind of knowledge learners draw on.
This is why researchers investigating implicit learning have also obtained subjective ratings (i.e. confi-
dence ratings and source attributions) along with retrospective reports from learners in order to more
clearly establish whether their judgements were accompanied by conscious awareness and searching
for a rule. Arguably, then, if GJTs are to be used as measures of implicit knowledge, they must incor-
porate methods for investigating the nature of the judgements that learners make.
Judgement tests have not figured in interlanguage pragmatics research to the best of my knowl-
edge. If they are to be used, then learners would need to judge the appropriateness of utterances in
relation to their situational contexts. That is, it would be necessary to describe the situational context
for each item in the test. Also, whereas grammaticality is an either-or phenomenon, appropriateness is
best viewed as scalar, so learners might be asked to judge the appropriateness of utterances on a
scale from ‘very appropriate’ to ‘entirely inappropriate’. Aural stimuli along with timed responses
and subjective ratings offer the greatest likelihood of measuring implicit (or automatised explicit)
knowledge.

Table 4. Design features of GJTs influencing the type of knowledge tapped.

Design feature | Implicit knowledge | Explicit knowledge
Grammaticality | Grammatical sentences | Ungrammatical sentences
Timing | Timed | Untimed
Modality | Aural | Written

Metalinguistic knowledge test


The MKT used in Ellis et al. (2009) consisted of three parts. In the first part, learners were asked to
select the best explanation for grammatical errors in sentences from the four choices provided for
each sentence. The second part tested receptive knowledge of metalinguistic terms and the third
part productive knowledge.
The MKT is most clearly a test of explicit knowledge. In all the factor analysis studies that included
such a test, scores loaded on the factor labelled explicit. In Ellis (2005), ungrammatical sentences from
the untimed GJT loaded along with scores from the MKT. In Vafaee, Suzuki and Kachinske (2017),
scores from the ungrammatical sentences in both their timed and untimed GJT loaded with the
MKT scores. In Kim and Nam’s (2017) study, the MKT loaded on its own factor. Elder (2009) reported
a thorough validation study of the MKT used in Ellis (2005) as a test of explicit knowledge.
There would seem little value in testing L2 learners’ knowledge of meta-pragmatic terms as these
are very technical and unlikely to be known by most learners. Thus parts 2 and 3 of Ellis et al.’s test
would have no counterpart in a meta-pragmatic test. But the multiple choice format of part 1 could
be adapted by asking learners to select the best explanation for why a particular utterance is inap-
propriate from the explanations provided. Alternatively, learners could be asked to provide their
own explanations although this would confound learners’ meta-pragmatic understandings with
their ability to verbalise them.

Psycholinguistic measures
As previously discussed, Ellis (2005) characterised tests of implicit knowledge as tests that (1) involve
no conscious awareness of linguistic form, (2) are time-pressured to induce online processing, (3)
focus on meaning and (4) do not involve the application of meta-language. We have seen that a
number of factor-analytic studies reported that the EIT most convincingly satisfies these criteria.
However, Suzuki and DeKeyser (2015), Vafaee, Suzuki and Kachinske (2017) and Suzuki (2017)
have challenged this claim on the grounds that time pressure cannot limit access to explicit knowl-
edge sufficiently to ensure that implicit knowledge is drawn on. These studies included the measures
that Ellis claimed to be best measures of implicit knowledge (i.e. the EIT and timed GJT) along with
measures drawn from the psycholinguistic literature. They found that these latter measures
loaded on a separate factor from the other measures and went on to argue that the psycholinguistic
measures served as measures of implicit knowledge while the EIT and the timed GJT were best seen
as measures of automatised explicit knowledge.
The psycholinguistic measures were based on responses derived from three different tests. I have
summarised the descriptions of these tests as given in Suzuki (2017):

(1) The visual-world paradigm: This involved tracking learners' eye movements. Learners were pre-
sented with a scene involving four pictures. They then listened to sentences and the eye move-
ments they directed at specific pictures as they listened were recorded. The pictures included a
picture representing the grammatical target (e.g. the agent of the sentence), a competitor target
and a distractor. Learning was considered to be evident if the learners displayed sensitivity to the
grammatical target by looking at the relevant picture more frequently than at the pictures repre-
senting the competitor target or the distractor picture.
(2) Word monitoring task: In this test, a word appears on a screen. Learners are told to listen (see Note 3) to a
sentence and as soon as they hear the target word to press a button. The purpose of this is to
involve the learners in dual processing; that is, because they have to consciously listen for the
word they are not able to consciously attend to the grammaticality of the sentences. The sen-
tences were a mixture of grammatical and ungrammatical but learners were not told this.
Each sentence was followed by a comprehension question to ensure that the learners were
focused on meaning. Response times for the region of interest in each sentence (i.e. the target
word and the immediately following word) were recorded. A measure of grammatical sensitivity
was calculated by subtracting the time taken to respond in the region of interest in the ungram-
matical sentences from the time taken in the grammatical sentences. The assumption here is that
sensitivity to grammatical violations while learners are distracted from attending to form reflects
implicit knowledge (a minimal computation of this measure is sketched after this list).
(3) Self-paced reading task: Learners are asked to read a sentence word by word as quickly as poss-
ible. The learners press a computer button to bring up each word in the sentence. As they do so
the preceding word on their screen disappears. When learners have finished reading the whole
sentence they answer a comprehension question. As in the word monitoring task, some of the
sentences are grammatical and some ungrammatical and response times are recorded for the
region of interest in each sentence. Grammatical sensitivity scores were calculated in the same
way as for the word monitoring task.
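
Here is the minimal computation of the grammatical sensitivity measure from the word monitoring task (item 2 above). The response times are invented, and the difference is computed as ungrammatical minus grammatical so that a positive value indicates a slowdown at the violation.

```python
# Grammatical sensitivity: mean RT in the region of interest for
# ungrammatical sentences minus that for grammatical ones (invented data).
import numpy as np

rt_grammatical = np.array([412, 398, 430, 405, 421])    # ms
rt_ungrammatical = np.array([455, 470, 449, 462, 458])  # ms

sensitivity = rt_ungrammatical.mean() - rt_grammatical.mean()
print(f"grammatical sensitivity = {sensitivity:.0f} ms")  # ~46 ms
```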

If these tests were measuring the same construct (implicit knowledge) one might expect that they
would be strongly inter-correlated. Suzuki did find a weak correlation between scores on the word
monitoring task and the self-paced reading task but the correlations involving the visual-world
task were all very weak and non-significant. In general, scores on all these tests were much lower
(but with very high standard deviations) than scores on the tests considered more likely to
measure explicit knowledge (i.e. a timed and an untimed GJT and a controlled production test). It
is therefore not so surprising to find that the psycholinguistic tests and the explicit tests loaded
on separate factors. Also, while these psycholinguistic tests may be well-suited to experimental
studies, they do not have high face validity as language tests as they involve very artificial procedures
(e.g. reading a sentence word by word). Thus doubts must exist about their value if the aim is to
design tests of implicit and explicit knowledge that can serve as stand-alone tests.
It is also difficult to see how these psycholinguistic tests could be adapted to measure pragmatic
knowledge as they all rely on how learners process a specific word in the area of interest in a sen-
tence. Pragmatic meanings are not usually conveyed by a single word but by multiple linguistic
devices. A complex request such as “Excuse me, would you mind closing the window?”, for
example, does not contain a clearly identifiable region of interest. Thus what works for grammatical
sensitivity may not work for pragmatic sensitivity. A psycholinguistic test of pragmatic knowledge is
only feasible if it is possible to identify individual words that determine the social (in)appropriateness
of an utterance. Also, as noted for many of the other tests, assessing whether learners are pragma-
tically sensitive will require providing information about the situational context of each utterance.

Some final comments


The tests designed to measure implicit knowledge almost invariably involved specific elements in iso-
lated sentences (see Note 4). This is not a problem when it comes to testing grammar but it is a problem for prag-
matics testing as pragmatic features are multiple within an utterance and also cross utterance
boundaries, i.e. they are discoursal. A key question, then, is whether it is possible to design tests
that assess pragmatic ability in discourse and, if it is, how this can be done. The research on
grammar testing offers little help here.
If, however, the aim is to test implicit and explicit knowledge of individual speech acts, which after
all has been a major focus in interlanguage pragmatics, then the grammar tests do provide ideas for
how this might be done. The characteristics of a test most likely to elicit implicit pragmatic knowledge
are as follows:

• It will involve dual processing (i.e. learners will be asked to accomplish a secondary task that dis-
tracts their attention from linguistic form).
• It will assess receptive or productive pragmatic knowledge separately or both together (as in an EIT).
• It will assess pragmatic (not semantic) meaning (i.e. the social appropriateness of utterances).
• It will involve real-time language processing, which can best be achieved by the use of aural (as
opposed to written) stimuli and by imposing time limits on the learner's responses.
• The time taken for learners to respond to individual test items will need to be recorded.
• It will be supported by the collection of subjective ratings and self-report data from the learners to
show to what extent they manifested conscious awareness of the pragmatic targets of the test.

Any test involving the comprehension or production of language will potentially elicit the use of implicit and explicit knowledge, and the extent to which it leads to the use of one or the other type or both types will depend on how the language is represented in the minds of individual learners. At best, then, all a test can do is bias learners towards the deployment of one type of knowledge. Learners
who lack implicit knowledge will have to use their explicit knowledge irrespective of the kind of
test. They will just do badly in those tests designed to assess their implicit knowledge.
Finally, it may not ultimately be possible to distinguish true implicit from automatised explicit
knowledge. Some progress in achieving this has been made for grammar by using psycholinguistic
tests such as the word monitoring task. But such tests may not prove adaptable to the testing of prag-
matic aspects of language. It can also be argued that if the purpose is simply to develop a battery of tests that distinguish different types of pragmatic ability, then showing that particular groups of learners differ in terms of the extent to which they can access their pragmatic knowledge automatically may suffice. From a theoretical perspective, it is clearly desirable to distinguish implicit and explicit pragmatic knowledge. From a practical point of view, it may prove impossible to do so and, in any case, it is not functionally necessary.

Notes
1. Neither Loewen and Ellis (2007) nor the follow-up factor-analytic studies tested the fit of a model involving a single factor. Vafaee, Suzuki and Kachinske (2017) argued that it is essential to do so.
2. The EIT in Ellis (2005) was administered face-to-face with individual learners. The EITs in both Spada, Shiu and
Tomita (2015) and Kim and Nam (2017) were administered by computer, allowing for the time taken to reproduce
a stimulus sentence to be strictly controlled. Interestingly, Suzuki and DeKeyser (2015), who found that EIT scores correlated with scores from an MKT, allowed 8 seconds for learners to repeat a sentence, far longer than in any of the other EIT studies.
3. Another version of the word monitoring test (Vafaee, Suzuki and Kachinske 2017) had learners read rather than
listen to sentences.
4. The exception is the use of free production tasks – e.g. the oral production task in Ellis (2005).

Disclosure statement
No potential conflict of interest was reported by the authors.

References
Bowden, H. 2016. Assessing second-language oral proficiency for research: the Spanish elicitation test. Studies in Second
Language Acquisition 38: 647–675.
Bowles, M. 2011. Measuring implicit and explicit linguistic knowledge: what can heritage language learners contribute?
Studies in Second Language Acquisition 33: 247–271.
van Compernolle, R. and H. Zhang. 2014. Dynamic assessment of elicited imitation: a case analysis of an advanced L2
English speaker. Language Testing 31: 395–412.
de Jong, N. 2005a. Learning second language grammar by listening. Unpublished PhD thesis, Netherlands Graduate School of Linguistics.
de Jong, N. 2005b. Can second language grammar be learned through listening? An experimental study. Studies in
Second Language Acquisition 27: 205–234.
DeKeyser, R. 2003. Implicit and explicit learning. In Handbook of Second Language Acquisition, ed. C. Doughty and M.
Long, 313–348. Malden, MA: Blackwell.
Dienes, Z. and J. Perner. 1999. A theory of implicit and explicit knowledge. Behavioral and Brain Sciences 22: 735–808.
Dienes, Z., G.T.M. Altmann, L. Kwan and A. Goode. 1995. Unconscious knowledge of artificial grammars is applied stra-
tegically. Journal of Experimental Psychology: Learning, Memory, and Cognition 21, no. 5: 1322–1338.
Elder, C. 2009. Validating a test of metalinguistic knowledge. In Implicit and Explicit Knowledge in Second Language
Learning, Testing and Teaching, ed. R. Ellis, S. Loewen, C. Elder, R. Erlam, J. Philp and H. Reinders, 113–138. Bristol:
Multilingual Matters.
Ellis, R. 1991. Grammaticality judgments and second language acquisition. Studies in Second Language Acquisition 13, no. 2:
161–186.
Ellis, R. 2004. The definition and measurement of explicit knowledge. Language Learning 54: 227–275.
Ellis, R. 2005. Measuring implicit and explicit knowledge of a second language: a psychometric study. Studies in Second
Language Acquisition 27, no. 2: 141–172.
Ellis, R. 2006. Modelling learning difficulty and second language proficiency: the differential contributions of implicit and
explicit knowledge. Applied Linguistics 27: 431–463.
Ellis, R., S. Loewen, C. Elder, R. Erlam, J. Philp and H. Reinders. 2009. Implicit and Explicit Knowledge in Second Language
Learning, Testing and Teaching. Bristol: Multilingual Matters.
Enochs, K. and S. Yoshitake-Strain. 1999. Evaluating six measures of EFL learners’ pragmatic competence. JALT Journal 21,
no. 1: 29–50.
Erlam, R. 2006. Elicited imitation as a measure of L2 implicit knowledge: an empirical validation study. Applied Linguistics
27, no. 3: 464–491.
Gaillard, S. and A. Tremblay. 2016. Linguistic proficiency assessment in second language acquisition research: the elicited
imitation task. Language Learning 66: 419–447.
Godfroid, A. 2016. The effects of implicit instruction on implicit and explicit knowledge development. Studies in Second
Language Acquisition 38: 177–215.
Golato, A. 2003. Studying compliment responses: a comparison of DCTs and recordings of naturally occurring talk.
Applied Linguistics 24: 90–121.
Gutiérrez, X. 2013. The construct validity of grammaticality judgment tests as measures of implicit and explicit knowl-
edge. Studies in Second Language Acquisition 35: 423–449.
Kerz, E., D. Wiechmann and F.B. Riedel. 2017. Implicit learning in the crowd: investigating the role of awareness in the
acquisition of L2 knowledge. Studies in Second Language Acquisition 39: 711–734.
Kim, J-E. and H. Nam. 2017. Measures of implicit knowledge revisited: processing modes, time pressure and modality.
Studies in Second Language Acquisition 39: 431–457.
Lantolf, J.P. and M.E. Poehner. 2011. Dynamic assessment in the classroom: Vygotskian praxis for second language devel-
opment. Language Teaching Research 15: 11–33.
Loewen, S. and R. Ellis. 2007. Confirming the operational definitions of explicit and implicit knowledge in Ellis (2005).
Studies in Second Language Acquisition 29: 119–126.
Loschky, L. and R. Bley-Vroman. 1993. Grammar and task-based methodology. In Tasks and Language Learning:
Integrating Theory and Practice, ed. G. Crookes and S. Gass, 123–167. Clevedon: Multilingual Matters.
Macrory, G. and V. Stone. 2000. Pupil progress in the acquisition of the perfect tense in French: the relationship between
knowledge and use. Language Teaching Research 4: 55–82.
Ortega, L. 2000. Understanding syntactic complexity: the measurement of change in the syntax of instructed L2 Spanish learners. Unpublished PhD thesis, University of Hawai'i at Manoa.
Ortega, L., N. Iwashita, S. Rabie and J.M. Norris. 1999. A Multilanguage Comparison of Measures of Syntactic Complexity.
Hawai’i: University of Hawai’i, National Language Resource Center.
Paradis, M. 2004. A Neurolinguistic Theory of Bilingualism. Amsterdam: John Benjamins.
Qin, T. and R. van Compernolle. Forthcoming. Computerized dynamic assessment of implicature comprehension in L2 Chinese. Language Learning and Technology.
Rebuschat, P. 2013. Measuring implicit and explicit knowledge in second language research. Language Learning 63: 595–
626.
Rebuschat, P., P. Hamrick, K. Riestenberg, R. Sachs and N. Ziegler. 2015. Triangulating measures of awareness: a contri-
bution to the debate of learning without awareness. Studies in Second Language Acquisition 37: 299–334.
Shintani, N. and R. Ellis. 2013. The comparative effect of direct written corrective feedback and metalinguistic explanation
on learners’ explicit and implicit knowledge of the English indefinite article. Journal of Second Language Writing 22:
286–306.
Spada, N., J. Shiu and Y. Tomita. 2015. Validating an elicited imitation task as a measure of implicit knowledge: compari-
sons with other validation studies. Language Learning 65, no. 3: 723–751.
Suzuki, Y. 2017. Validity of new measures of implicit knowledge: distinguishing implicit knowledge from automatized
explicit knowledge. Applied Psycholinguistics 38: 1229–1261.
Suzuki, Y. and R.M. DeKeyser. 2015. Comparing elicited imitation and word monitoring as measures of implicit knowl-
edge. Language Learning 65, no. 4: 860–895.
Taguchi, N. 2012. Context, Individual Differences, and Pragmatic Competence. New York: Multilingual Matters.
Ullman, M. 2004. Contributions of memory circuits to language: the declarative/procedural model. Cognition 92: 231–270.
Vafaee, P., Y. Suzuki and I. Kachinske. 2017. Validating grammaticality judgment tests: evidence from two psycholinguistic
measures. Studies in Second Language Acquisition 39: 59–95.
Williams, J. 2005. Learning without awareness. Studies in Second Language Acquisition 27: 269–304.
Zhang, R. 2015. Measuring university-level L2 learners’ implicit and explicit linguistic knowledge. Studies in Second
Language Acquisition 37: 457–486.

Appendix
Example of question and prompts from Qin and van Compernolle (forthcoming)
Mike plans to travel to Beijing. He knows that his high school classmate, Lucy, is a graduate student there. Mike calls Lucy and asks if he can stay in her dorm for a few nights.
She replies (in Chinese): “There are many people living in my dorm and it is already fully occupied.” What does Lucy mean?

(a) Mike cannot stay in Lucy's dorm.
(b) Lucy's dorm is already fully occupied.
(c) Lucy invites Mike to visit Beijing.
(d) Lucy's dorm is very small.
(e) Lucy plans to have Mike stay in Beijing for a long time.

Prompt 1: That is not the right answer. Listen to the clip and try again.
Prompt 2: That is not right either. Why does Lucy talk about her dorm?
Prompt 3: That is not right either. Is Lucy going to allow Mike to stay with her?
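
Such graduated prompting lends itself to straightforward computerisation. The sketch below shows one way the item above might be administered and scored; the weighting (full credit for an unmediated correct answer, one point deducted per prompt used) is our own assumption and not necessarily the scheme adopted by Qin and van Compernolle.

```python
# The prompt texts come from the item above; the scoring scheme is illustrative.
PROMPTS = [
    "That is not the right answer. Listen to the clip and try again.",
    "That is not right either. Why does Lucy talk about her dorm?",
    "That is not right either. Is Lucy going to allow Mike to stay with her?",
]

def administer_item(correct_option, get_response):
    """Return (score, prompts_used). Fewer prompts mean more independent
    performance, the key datum in dynamic assessment."""
    max_score = len(PROMPTS) + 1           # 4 points if no mediation is needed
    for prompts_used in range(len(PROMPTS) + 1):
        if get_response() == correct_option:
            return max_score - prompts_used, prompts_used
        if prompts_used < len(PROMPTS):
            print(PROMPTS[prompts_used])   # give the next, more explicit prompt
    return 0, len(PROMPTS)                 # still incorrect after full mediation

score, prompts = administer_item("a", lambda: input("Your answer (a-e): ").strip().lower())
print(f"score: {score}, prompts used: {prompts}")
```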
