
Lexical demands for reading the texts contained in the textbooks distributed by the MINEDUC
Marion Durbahn
KU Leuven & Pontificia Universidad Católica de Chile. Facultad de Letras. Pedagogía en Inglés.
Marion.durbahn@kuleuven.be
Supervisor: Elke Peters, KU Leuven, Belgium
Co-supervisor: Michael Rodgers, Carleton University, Canada

Daniela Sánchez
Pontificia Universidad Católica. Facultad de Letras. Pedagogía en Inglés. & Universidad San Sebastián
dsanchezb@uc.cl
Glossary
• Corpus
• BNC/COCA700 million words
• Frequency lists word families
• Tokens
• Word families
• RANGE software (Heatley, Nation, & Coxhead, 2002)
• Lexical coverage
Justification
• Vocabulary knowledge is one of the most powerful
predictors of reading comprehension (Hu and Nation,
2000; Laufer and Sim, 1985; Laufer, 1989, 1992; Nation,
2001, 2006; Qian, 1999).
• Vocabulary knowledge and reading comprehension are
strongly correlated (correlation coefficients range from
.4 to .83 [Laufer and Ravenhorst-Kalovski, 2010; Qian,
1999; Stæhr, 2008]).
Background
Lexical coverage     Vocabulary targets   Methodology                        Authors
95%                                       Underlined unknown words plus      Laufer (1985)
                                          a translation task
98%                                       Use of non-words                   Hu & Nation (2000)
98%                  8,000-9,000          Corpus study                       Nation (2006)
95%                  4,000-5,000          Frequency based vocabulary tests   Laufer & Ravenhorst-Kalovski (2010)
98%                  8,000
98% for 60%                               Vocabulary known in two            Schmitt, Jiang & Grabe (2011)
comprehension scores                      academic texts
98% lexical coverage
• Do you feel anxious to constantly be connected to the
Internet? Do you spend hours ________ [browsing] web pages
without noticing? Do you sleep for less than five hours
because you prefer to surf the net? If you answer yes to
these questions, then you might be suffering from
__________ [webaholism]: an addiction to the Internet.
Twenty years ago, no one could have predicted that
consulting the Internet would become addictive. There is,
however, evidence which suggests that people can develop a
compulsive need to be online, to check emails constantly,
to update blogs daily or to visit social network sites when
they should be studying.
Excerpt taken from Meet up, Unit 2, text 2
95% lexical coverage
• Normal ___________ [behaviour], you might say, but Internet
addiction is a serious condition and might be more
common than you think. A recent telephone ________ [survey]
from the Stanford School of Medicine found that one out of
eight people interviewed could be experiencing problems
related to the __________ [misuse] of the Internet. Spending
more hours online means that users are overtired, leading to
problems at work or school. But perhaps the social
__________ [implications] of webaholism are even more
serious. People share feelings and experiences with online
friends they have never met because they feel more
___________ [confident], but avoid meeting real friends.
Excerpt taken from Meet up, Unit 2, text 2
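To make the two coverage figures concrete, the short sketch below turns a coverage percentage into the average number of running words per unknown word and into the expected number of unknown tokens in a passage. The function names and the 100-token passage length are illustrative assumptions, not part of the study.

```python
def running_words_per_unknown(coverage_pct):
    """Average number of running words (tokens) per unknown word."""
    return 100 / (100 - coverage_pct)


def expected_unknown_tokens(n_tokens, coverage_pct):
    """Expected number of unknown tokens in a passage of n_tokens."""
    return n_tokens * (100 - coverage_pct) / 100


for pct in (98, 95):
    print(
        f"{pct}% coverage: 1 unknown word in every "
        f"{running_words_per_unknown(pct):.0f} running words, "
        f"{expected_unknown_tokens(100, pct):.0f} unknown tokens per 100"
    )
```

At 98% coverage a reader meets roughly one unknown word in every 50 running words; at 95% it is one in every 20.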
Relevant studies
• This study is based on
o Nation’s (2006) corpus study, in which he analyzed
different texts to see how much vocabulary is needed
for unassisted comprehension.
Finding: 8,000-9,000 word families for 98% coverage.
o Matsuoka and Hirsh’s (2010) study, which investigated
the vocabulary learning opportunities in an ELT course
book designed for upper-intermediate learners.
Finding: the 2,000 most frequent word families are covered,
but there are few opportunities to learn lower frequency
word families.
Demands at A2 and B1 levels
• According to the Programas de Estudio from the
MINEDUC
o No mention of lexical goals per level. The National
Curriculum (MINEDUC, 2012) only states that in 8th
grade of primary school, learners should identify key
words and phrases and frequent expressions (are they
tokens or word families?)
• 8th grade: A2 (KET)
• 12th grade: B1 (PET)
Lexical goals
• Table 1: Suggested vocabulary goals per level

• Native speakers know approximately 20,000 word families
(Schmitt, Cobb, Horst, and Schmitt, 2015).
Rationale

• We don’t know the lexical demands for unassisted reading
of the texts contained in the textbooks distributed by the
MINEDUC.
• We don’t know whether the texts of the textbooks
distributed by the MINEDUC favor the preparation of
students from 8th and 12th grades to succeed on
standardized tests at A2 and B1 levels, respectively.
Research questions

RQ1: What are the lexical demands of reading the texts from
the textbooks distributed by the Ministry of Education?
RQ2: How well do the texts from the textbooks distributed by
the MINEDUC reflect A2 and B1 levels in terms of lexical
demands as measured by the KET and PET tests?
Method
• Materials
Corpus
• 8th grade: 40 texts, 8,727 tokens from E-teens
• 12th grade: 64 texts, 13,497 tokens from Tune up
• A2: 3 texts, 571 tokens from Cambridge English Key
• B1: 3 texts, 1,550 tokens from Cambridge English Preliminary
• Total: 110 texts, 24,345 words


Analysis
To answer research questions 1 and 2, the following
procedure was used:
• All the texts were converted to .txt format.
• The texts were read and cleaned up
(e.g. hyphenated compounds).
• The texts were run through the RANGE software (Heatley,
Nation, and Coxhead, 2002) against the BNC/COCA corpus.
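The kind of profile this procedure produces can be illustrated with a minimal sketch. It is not the RANGE program: the `TOY_BANDS` dictionary is a hypothetical stand-in for the BNC/COCA lists, it maps individual word forms rather than whole word families, and the function names and sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical toy band list (assumption): word form -> frequency band.
# The real BNC/COCA lists group 1,000 word families per band.
TOY_BANDS = {
    "the": "1K", "internet": "1K", "people": "1K", "can": "1K",
    "be": "1K", "to": "1K", "addiction": "3K", "compulsive": "5K",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_profile(text, bands=TOY_BANDS):
    """Count tokens per band; unlisted words are labelled 'not in list'."""
    return Counter(bands.get(tok, "not in list") for tok in tokenize(text))


def cumulative_coverage(profile, band_order):
    """Return (band, cumulative % of tokens) pairs in the given band order."""
    total = sum(profile.values())
    running, rows = 0, []
    for band in band_order:
        running += profile.get(band, 0)
        rows.append((band, round(100 * running / total, 2)))
    return rows


sample = "People can be addicted to the Internet"
profile = frequency_profile(sample)
rows = cumulative_coverage(profile, ["1K", "3K", "5K", "not in list"])
for band, pct in rows:
    print(f"{band}: {pct}%")
```

Because the toy list holds word forms only, "addicted" falls outside it; a family-based list would normally group it under the same headword as "addict".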
Results:
Research question 1: What are the
lexical demands of reading the texts
from the textbooks distributed by the
Ministry of Education?
Table 2: Lexical Frequency Profile results for 8th grade
Frequency band  Tokens   Cum. %  Word families   Cum. %  Word families*   Cum. %
1K                6747    77.31            674    40.53             855    51.41
2K                 772    86.16            341    61.03             341    71.92
3K                 331    89.95            167    71.08             167    81.96
4K                 122    91.35             78    75.77              78    86.65
5K                  79    92.25             49    78.71              49    89.60
6K                  59    92.93             41    81.18              41    92.06
7K                  33    93.31             15    82.08              15    92.96
8K                  27    93.62             16    83.04              16    93.93
9K                  25    93.90             17    84.06              17    94.95
10K                 19    94.12             11    84.73              11    95.61
11K                 17    94.32             11    85.39              11    96.27
12K                 15    94.49              8    85.87               8    96.75
13K                 16    94.67              7    86.29               7    97.17
14K                  4    94.72              4    86.53               4    97.41
15K                  1    94.73              1    86.59               1    97.47
16K                  1    94.74              1    86.65               1    97.53
17K                  9    94.84              6    87.01               6    97.90
18K                  5    94.90              2    87.13               2    98.02
19K                  1    94.91              1    87.19               1    98.08
20K                 10    95.03              3    87.37               3    98.26
22K                  3    95.06              2    87.49               2    98.38
24K                  4    95.11              1    87.55               1    98.44
25K                  4    95.15              2    87.67               2    98.56
Compound nouns      43    95.65             24    89.12              24   100.00
Proper nouns       337    99.51            167    99.16
Marginal words      23    99.77             10    99.76
Acronyms             6    99.84              4      100
Spanish words        3    99.87
Not in the list     11      100            ???      ???
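The percentage columns in Table 2 are cumulative. As a quick check, the sketch below recomputes the token coverage for the first five bands from the token counts in the table and the 8,727-token corpus size reported under Materials. The variable and function names are illustrative.

```python
# Token counts per band copied from Table 2 (first five bands only).
TOTAL_TOKENS = 8727  # 8th grade corpus size reported under Materials
band_tokens = [("1K", 6747), ("2K", 772), ("3K", 331), ("4K", 122), ("5K", 79)]


def cumulative_percentages(rows, total):
    """Return (band, cumulative % of all tokens) for each band in order."""
    running, out = 0, []
    for band, count in rows:
        running += count
        out.append((band, round(100 * running / total, 2)))
    return out


coverage = cumulative_percentages(band_tokens, TOTAL_TOKENS)
for band, pct in coverage:
    print(f"{band}: {pct:.2f}%")
```

The printed values match the token percentage column of the table (77.31, 86.16, 89.95, 91.35, 92.25).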
Table 3: Lexical Frequency Profile results for 12th grade
Frequency band  Tokens   Cum. %  Word families   Cum. %  Word families*   Cum. %
1K               10469    78.09            795    37.68            1010    47.87
2K                1289    87.70            450    59.00             450    69.19
3K                 690    92.85            314    73.89             314    84.08
4K                 174    94.14            114    79.29             114    89.48
5K                  89    94.81             61    82.18              61    92.37
6K                  49    95.17             37    83.93              37    94.12
7K                  39    95.47             25    85.12              25    95.31
8K                  22    95.63             15    85.83              15    96.02
9K                  21    95.79             17    86.64              17    96.82
10K                  8    95.85              7    86.97               7    97.16
11K                  4    95.88              3    87.11               3    97.30
12K                  4    95.91              4    87.30               4    97.49
13K                  4    95.93              3    87.44               3    97.63
14K                  9    96.00              4    87.63               4    97.82
15K                  1    96.01              1    87.68               1    97.87
16K                  1    96.02              1    87.73               1    97.91
18K                  3    96.04              3    87.87               3    98.06
20K                  1    96.05              1    87.91               1    98.10
22K                  1    96.05              1    87.96               1    98.15
23K                  1    96.06              1    88.01               1    98.20
24K                  1    96.07              1    88.06               1    98.25
Compound words      61    99.71             37    99.19              37   100.00
Proper names       398    99.04            188    96.97
Marginal words      29    99.25             10    97.44
Acronyms            22    99.87             13    99.81
Spanish words        8    99.93              4   100.00
Results:
RQ2: How well do the texts from the
textbooks distributed by the
MINEDUC reflect A2 and B1 levels in
terms of lexical demands as
measured by the KET and PET
tests?
Table 4: Lexical Frequency Profile results for A2 level (KET)
Frequency band  Tokens   Cum. %  Word families   Cum. %  Word families*   Cum. %
1K                 500    87.57            168    85.28             184    93.40
2K                  14    90.02             10    90.36              10    98.48
3K                   0    90.02              0    90.36               0    98.48
4K                   2    90.37              2    91.37               2    99.49
5K-25K               0    90.37              0    91.37               0    99.49
Compound words       1    90.54              1    91.88               1      100
Proper nouns        54      100             16      100
Total              571                     197                      197
Table 5: Lexical Frequency Profile results for B1 level (PET)
Frequency band  Tokens   Cum. %  Word families   Cum. %  Word families*   Cum. %
1K                1322    85.29            313    70.34             349    78.43
2K                 120    93.03             67    85.39              67    93.48
3K                  30    94.97             18    89.44              18    97.53
4K                   7    95.42              6    90.79               6    98.88
5K                   2    95.55              2    91.24               2    99.33
6K                   2    95.68              2    91.69               2    99.78
7K-13K               0    95.68                   91.69                    99.78
14K                  5    96.00              1    91.91               1   100.00
15K-25K              0    96.00                   91.91                   100.00
Compound words       0    96.00                   91.91                   100.00
Proper nouns        44    98.84             22    96.85
Marginal words       6    99.23              6    98.20
Acronyms            12   100.00              8   100.00
Total             1550                     445                      445


Discussion
• Research question 1: What are the lexical demands of
reading the texts from the textbooks distributed by the
Ministry of Education?
• 8th grade
o 95% coverage: 9K
o 98% coverage: 17-18K
• 12th grade
o 95% coverage: 7K
o 98% coverage: 16K
o Goals well beyond the ones suggested by Meara
(1980), Meara & Milton (2003), and Kriszan (2003).
Discussion
• RQ2: How well do the texts from the textbooks distributed
by the MINEDUC reflect A2 and B1 levels in terms of
lexical demands as measured by the KET and PET tests?
• Table 6: Comparison between MINEDUC textbooks and
Cambridge Exams

Coverage   A2                      B1
           E-teens   KET           Tune up   PET
95%        9K        1.5K          7K        2.5K
                     (approx.)               (approx.)
98%        17K       2K            16K       3K
Conclusion

• In terms of vocabulary, the level of the texts in the
textbooks is considerably higher than what is expected
from 8th and 12th graders.

• Learners would need a vocabulary repertoire similar to that
of a native speaker to reach 98% lexical coverage and be
able to comprehend the texts from both levels without
external help (dictionaries, translators, teacher’s help, etc.).
Conclusion
• In relation to CEFR levels, the textbooks surpass the
requirements expected from the learners when compared
to KET and PET reading comprehension texts.
Implications
• For teaching
Knowing the lexical demands of texts in textbooks is
relevant for language teachers because it provides hints
on how to adapt the texts to the learners’ proficiency level.
o Glossaries
o Pre-teaching vocabulary
o Adapting the material
o Choosing appropriate texts according to frequency profiles
o Training learners in intensive reading
• For publishers or textbook writers
o To create or adapt texts by making frequency-informed
decisions.
Disclaimer
• Our intention in this presentation was not to stigmatize the
textbooks distributed by the MINEDUC in any way.

• Our purpose was to shed light on the importance of
frequency-based decisions when creating reading
comprehension material and to highlight the relevance of
developing lexical knowledge.
Selected references
Heatley, A., Nation, I. S. P., & Coxhead, A. (2002). Range computer program.
http://www.vuw.ac.nz/lals/staff/Paul_Nation
Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage,
learners' vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15-30.
Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern
Language Review, 63, 59-82.
Nation, I. S. P., & Beglar, D. (2007). A vocabulary size test. The Language Teacher, 31(7), 9-13.
Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and
reading comprehension. Modern Language Journal, 95(1), 26-43.
Thank you…
Not found in any list
A2
• Capricorns
• Chunking
• Colonizers
• Diastole
• Leos
• Scorpios
• Tauruses
• Virgos
• Widescreen
B1
• Automatised
• Councelor
• Fanging
• Habitants
• Joyologist
• Stewarding
• Webaholism
