Professional Documents
Culture Documents
reading
Words you don't know, words you think you know,
and words you can't guess
Batia Laufer
Introduction
No text comprehension is possible, either in one's native language or in a
foreign language, without understanding the text's vocabulary. This is not
to say that reading comprehension and vocabulary comprehension are
the same, or that reading quality is determined by vocabulary alone.
Reading comprehension (both in LI and in L2) is also affected by tex-
tually relevant background knowledge and the application of general
reading strategies, such as predicting the content of the text, guessing
unknown words in context, making inferences, recognizing the type of
text and text structure, and grasping the main idea of the paragraph. And
yet, it has been consistently demonstrated that reading comprehension is
strongly related to vocabulary knowledge, more strongly than to the
other components of reading. Anderson and Freebody (1981) survey
various studies (correlational, factor analyses, readability analyses) that
show that the word variable is more highly predictive of comprehension
than the sentence variable, the inferencing ability, and the ability to grasp
main ideas. Beck, Perfetti, and McKeown (1982), Kameenui, Carnine,
and Freschi (1982), and Stahl (1983) have demonstrated that an improve-
ment in reading comprehension can be attributed to an increase in vocab-
ulary knowledge.
A similar picture of vocabulary as a good predictor of reading success
emerges from second language studies. Laufer (1991a) found good and
significant correlations between two different vocabulary tests (the Vo-
cabulary Levels Test by Nation [1983a] and the Eurocentres Vocabulary
Test by Meara and Jones [1989]) and reading scores of L2 learners. The
correlations were .5, significant at the level of p < .0001, and .75, signifi-
cant at the level of p < .0001, respectively. Even higher correlations are
reported by Koda (1989) between vocabulary (tested by a self-made test)
and two reading measures, cloze and paragraph comprehension. These
correlations are .69, p < .0002 and .74, p < .0001. Coady, Magoto,
Hubbard, Graney, and Mokhtari (1993) conducted two experiments that
showed that increased proficiency in high-frequency vocabulary also led
20
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 21
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
22 Batia Laufer
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 23
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
24 Batia Laufer
be 3,000 x 1.6 = 4,800 (for the conversion formula, see Nation [1983a]).
The level at which good LI readers can be expected to transfer their
reading strategies to L2 is 3,000 word families, or about 5,000 lexical
items. Until they have reached this level, such transfer will be hampered
by an insufficient knowledge of vocabulary. A regression analysis was
carried out to check how an increase in vocabulary quantity related to the
improvement in reading. The analysis showed that an increase in 1,000
words resulted in an increase of 7% on a comprehension test. Thus, if we
tried to predict the relationship between vocabulary increase and com-
prehension increase, we might expect a knowledge of 3,000 word fam-
ilies (5,000 lexical items) to result in a reading score of 56% (the mini-
mum passing grade on reading tests in the students' institution at the time
of the experiment), a knowledge of 4,000 (6,400 lexical items) in 63%,
5,000 (8,000 lexical items) in 70%, 6,000 (9,600 lexical items) in 77%,
and so on. These figures are correct if the progress in reading vis-a-vis
vocabulary size is always linear. It is possible, however, that when the
learner reaches a certain vocabulary level, progress in the reading score
will decrease and finally level off.
Even if the results are not conclusive for all vocabulary levels, they
provide, nevertheless, a general idea of how reading progresses above the
threshold level of 3,000 word families and what vocabulary size should
be aimed at for different reading levels. If the optimal reading score is
considered to be, for example, 70%, then the vocabulary size to aim for
will be 5,000 word families; if 63 % is taken as a passing score, then 4,000
will suffice.
Further support for 3,000 word families as the lexical threshold is
found in Laufer (1992). Adult learners were compared on their vocabu-
lary level, reading comprehension in EFL, and general academic ability,
which included a reading in LI component. (The General Academic Abil-
ity score and the EFL reading score were taken from the university psy-
chometric entrance test.) It was shown that learners below the 3,000
vocabulary level (5,000 lexical items) did poorly on the reading test re-
gardless of how high their academic ability was. In other words, even the
more intelligent students who are good readers in their native language
cannot read well in their L2 if their vocabulary is below the threshold.
In terms of text coverage, i.e., the actual percentage of word tokens in a
given text understood by a reader, the 3,000 word families are reported to
provide a coverage of between 90% and 95% of any text (for surveys of
frequency counts, see Nation, 1983a and 1900). The figure claimed to be
necessary for text comprehension is 95% of text coverage (Deville, 1985;
Laufer, 1989b). Thus, both earlier frequency counts and later empirical
studies of L2 vocabulary and reading suggest a similar vocabulary mini-
mum, which is 3,000 word families, or 5,000 lexical items. As mentioned
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 25
before, the higher the comprehension level expected, the larger the vocab-
ulary should be.
Deceptive transparency
So far we have discussed one aspect of the lexical plight in L2 reading -
words one does not know, or, more precisely, words the reader knows he
or she does not know. In addition to these, there are other words that are
not even recognized as unfamiliar by the reader. These words are decep-
tively transparent, i.e., they look as if they provided clues to their mean-
ing. For example, infallible looks as if it were composed of in+fall+ible
and meant 'something that cannot fall'; shortcomings looks like a com-
pound of 'short' and 'comings', meaning 'short visits' (these are actual
misinterpretations provided by students). Huckin and Bloch (1993)
found similar types of misinterpretations. They refer to the deceptively
transparent words as cases of 'mistaken ID'. The deceptively transparent
words (hence DT words) seem to fall into one of five distinct categories.
Idioms
Hit and miss, sit on the fence, a shot in the dark, miss the boat were
translated literally, word by word. The learner's assumption in the case of
idioms was similar to that of deceptive transparency, i.e., the meaning of
the whole was the sum of the meanings of its parts.
False friends
Sympathetic was interpreted as 'nice' (Hebrew 'simpati'); tramp as 'lift'
(Hebrew 'tremp'); novel as 'short story' (Hebrew 'novela'). The mistaken
assumption of the learner in this case was that if the form of the word in
L2 resembled that in LI, the meaning did so too.
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
26 Batia Laufer
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 27
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
28 Batia Laufer
since a variety of factors will interfere with the guessing attempts of the
reader.
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 29
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
30 Batia Laufer
Suppressed clues
One of the factors that contribute to successful guessing is the reader's
background knowledge of the subject matter of the text, or content sche-
mata. Inferences are drawn from the text on the basis of the reader's
expectations of certain content. Since 'reading is a psycholinguistic guess-
ing game' (Goodman, 1967), the successful players should, among other
things, draw on their experiences and concepts. This strategy may work
quite well, except when the reader's expectations and concepts are
different from those of the author of the text. Readers tend to disregard
information that, according to their world view, seems unimportant, add
information that 'should' be there, and focus their attention on what, in
their opinion, is essential (Steffensen &c Joag-Dev, 1984). So strong is the
effect of background knowledge that it overrides lexical and syntactic
clues. In an experiment by Laufer and Sim (1985b), students were given a
passage where the author (Margaret Mead) discussed biological
differences between men and women, and clearly implied that boys and
girls should get a different education. Some learners, however, insisted
that the author was advocating the same education for both sexes. From
interviews it became clear that the students were using their knowledge of
the world, which was that nobody today would dare to suggest different
education for men and women, certainly not a woman author. When a
biased opinion of this kind is introduced into the interpretation, individ-
ual unknown words will be taken to mean whatever suits the reader's
own notion of what the text says. If there are clues in context that would
suggest a different interpretation, they can easily be suppressed.
In this essay, I have discussed three lexical problems that may seriously
impede reading comprehension in L2: (1) the problem of insufficient
vocabulary, (2) misinterpretations of deceptively transparent words, and
(3) inability to guess unknown words correctly. I will recapitulate my
arguments starting with the third issue. A learner who has been taught
guessing strategies will not automatically produce correct guesses. The
following factors, which are beyond the reader's control, will affect
guessing.
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 31
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
32 Batia Laufer
References
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
The lexical plight in second language reading 33
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004
34 Batia Laufer
Downloaded from https://www.cambridge.org/core. Universiteit Gent, on 19 Feb 2018 at 20:02:15, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781139524643.004