Paul Leeming
Abstract
Vocabulary is an important element of language proficiency, and acquisition of an
extensive vocabulary should be a goal for every language learner (Nation, 2008). As
teachers, we should incorporate a focus on vocabulary into our language
courses, but in order to do so it is helpful to have some knowledge of our students'
current vocabulary level (Beglar, 2010). The Vocabulary Size Test is a free vocabulary
test available online that is designed to measure the size of students' vocabulary, with 140
items covering the first 14,000 words of English. This paper presents a
Rasch analysis of the first 80 test items used with university students in Japan. Rasch
analysis was used to determine the relative difficulty of each item, and also to assess the
validity and usefulness of the test in this context. Results show that although the test is
useful in assessing the relative vocabulary levels of students, the items do not behave
entirely as predicted by their word-frequency levels, and care should be
taken by teachers hoping to use the test to ascertain student knowledge of different
vocabulary levels.
Introduction
Vocabulary is an important element of language proficiency. Put simply, if we do
not know a given vocabulary item we are unlikely to understand it in spoken or written
text, and will be unable to express ourselves fully in a second language. Vocabulary
level determines what level of reading materials can be used in class (Hu & Nation,
2000) and informs many other decisions regarding teaching materials. Teachers aim to build
students' vocabulary, and most language courses feature some focus on vocabulary
(Nation, 2001). Having decided to include a vocabulary component, however, teachers
are faced with the difficult decision of what vocabulary to include. Should there be a
focus on the General Service List of vocabulary (West, 1953), which has proved its
reliability over time, or should teachers concentrate on developing a more academic
vocabulary (Coxhead, 2011) and assume that students know the most basic vocabulary?
教養・外国語教育センター紀要
In order to answer this question we need a test of our students' current vocabulary
level, and one such test is the Vocabulary Size Test (VST) developed by Paul Nation
and David Beglar (2007).
This paper begins by introducing and describing the VST, before presenting a
Rasch analysis of the results of the first 80 items of the test, which were administered to
81 students in a Japanese university context. The test was limited to the first 80 items
as the students in the current study were relatively low in English proficiency (average
TOEIC score of 390). The analysis will ascertain whether the theoretical hierarchy of
difficulty for the words is borne out in a Japanese context, and whether the test is
effective in differentiating between the different vocabulary sizes of the students in this
study. A correlation analysis with TOEIC test scores will be used to determine how
closely vocabulary knowledge can be considered a measure of overall English language
proficiency. A basic knowledge of the Rasch model (Rasch, 1960) is assumed, although a
brief description of the key parts of the model is provided (for a comprehensive
introduction to the Rasch model for measurement see Bond & Fox, 2007).
Analysis of the VST
Methodology
Participants
The participants in the study were 81 students (58 male and 23 female)
in a first-year compulsory English communication course of a science department at a
private university in western Japan. The students ranged in age from 18 to 22, with
77 first-year students and 4 students who were repeating the course and were in the
third or fourth year. All of the participants were native speakers of Japanese. Students
in the school of science and engineering are grouped according to major rather than
English proficiency, and students in the three classes that participated in this study
were majoring in biology, chemistry, and physics. The entry requirements were slightly
different for each major so there were slight differences between classes in terms of
English proficiency, and within each class there was a range of English proficiencies.
Although generally proficiency levels were similar, in one class there was a difference of
over 600 points on the TOEIC test between the highest and lowest level students, as
one student had spent her childhood in America and therefore scored highly on the
TOEIC test. All students had six years of formal education in English in Japanese
secondary schools. Only three of the participants had lived in English-speaking countries
for periods greater than one year, and two other students had taken part in short study
abroad programs. Approximately 15% of the students had experience of English
conversation classes outside of general education.
Test administration
The VST was administered to students at the start of the academic year and was
given to the students online through the website Survey Monkey. Instructions were
given in Japanese and the test format was explained. Students were given 15 minutes to
complete the test, which consisted of the first 80 items of the VST. This gave students
about 11 seconds per item (900 seconds for 80 items), the rationale being that if the students knew the
item this would be sufficient time and would prevent excessive guessing. Students were
told to skip items if they did not know the answer. The majority of students were
unable to finish the test in the allotted time, and the last 10 items were answered by
only a small number of students, as shown by the lack of response to these items.
The TOEIC test was administered by the university approximately two months
after the vocabulary test. The test is a measure of English proficiency specifically
designed to test business English and predict how effectively one can function in a
business environment (see http://www.ets.org/toeic for details). The first part of the
test focuses on listening, and the second part on reading, with questions related to
grammar and vocabulary included in the reading section. The vocabulary in the test
tends to be of a more academic or business nature, and is more likely to be covered by
the Academic Word List (Coxhead, 2011). Students were required to take the test, but it
was zero-stakes and students' motivation was low. Colleagues proctoring the test
claimed that it was not uncommon for students to sleep during the test, and therefore
the reliability of scores is somewhat limited.
Following the administration of both tests a Rasch analysis was performed on the
VST results using WINSTEPS version 3.64.2 (Linacre & Wright, 2007). The logit scores
derived from the test were correlated with the TOEIC scores for students.
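The correlation step described above can be illustrated with a minimal sketch of the Pearson product-moment coefficient. The function name and the sample values are hypothetical and chosen only for illustration; they are not data from this study, and the study itself presumably used a standard statistics package rather than this code.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (n >= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only; not data from this study.
vst_logits = [-1.2, -0.5, -0.3, 0.1, 0.8]
toeic_scores = [310, 365, 350, 420, 505]
r = pearson_r(vst_logits, toeic_scores)
```

Because Rasch person measures are on an interval (logit) scale, they can be entered into a Pearson correlation directly, which is one reason logits rather than raw scores are used here.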
Table 1.
In order to determine the degree to which these two misfitting items were
influencing the person measures, Rasch person ability estimates were calculated with
and without these two items. The Pearson correlation was significant with a value of
.998; thus, it was concluded that these items were not affecting the overall person ability
estimates and they were therefore retained in subsequent analyses.
The Wright map (Figure 1) shows the persons on the logit scale on the far left of
the figure. The persons are displayed as Xs according to their Rasch person ability
measures, with persons with larger vocabularies toward the top of the map and persons
with smaller vocabularies toward the bottom of the map. The items are displayed on the
right side of the figure according to their difficulty estimates: More difficult items are
toward the top of the map and easier items toward the bottom. A person has a 50%
chance of correctly answering an item that is at the same point on the logit scale. The
average measure for persons was -.28, indicating that the items were well matched to
the participants although a little difficult for this group, as shown by the negative value
of the mean for persons. This is supported by the Wright map, which shows that
participants and items are well distributed about the mean and that there are no
significant gaps in the item hierarchy. Linacre (2002) considers gaps in the item hierarchy
of greater than .59 logits to indicate a problem, and there are no gaps of this size close
to the person ability measures. Again the findings mirror those of Beglar (2010), which
showed that the test had sufficient items to avoid floor and ceiling effects and to
accurately measure the range in respondents' receptive vocabulary knowledge.
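The 50% figure follows from the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty, both expressed in logits. The sketch below is illustrative; the function name is an assumption, and the item difficulty of 2.0 logits is a round value chosen for the example rather than an estimate reported in this study.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits: P = exp(a - d) / (1 + exp(a - d)).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Person and item at the same point on the scale: probability is exactly 0.5.
p_equal = rasch_probability(0.0, 0.0)

# A person at the reported mean ability (-.28) facing an item assumed to sit
# at 2.0 logits (an illustrative value) has roughly a 9% chance of success.
p_mean_person_hard_item = rasch_probability(-0.28, 2.0)
```

This also shows why the negative person mean indicates a slightly difficult test: a person at -.28 logits has a little less than a 50% chance on an item of average difficulty (0 logits).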
The VST separates words by frequency levels, and item difficulty is hypothesized
to increase with item number, as word frequency decreases (Nation & Beglar, 2007).
This claim is generally supported by the distribution of items, with low-frequency words
being more difficult. One exception is English loan words in Japanese, which should
generally be slightly easier, regardless of their frequency in the English language,
although words such as item 70 (yoghurt) may be difficult due to spelling. The most difficult
items are 60 (veer), 71 (erratic), and 80 (mumble), which are all in the 6,000-word
frequency level or above, and these items were too difficult for all the students. The
easiest items are 2 (time), 5 (poor), 6 (drive), and 38 (vocabulary), which were
answered correctly by all the participants.
--------------------------------------------------------------------------------
Students with a | More difficult items
larger vocabulary |
3 +
|
|
| 60: Veer 71: Erratic 80: Mumble
| 77: Locust
| 10: Basis
| 45: Compost 49: Fracture
2 + 34: Tummy 40: Allege 58: Cavalier
|S 16: Nil 67: Demography
| 75: Eclipse 78: Authentic
| 19: Microphone 33: Candid 66: Bloc 72: Palette
X | 79: Cabaret 29: Rove 69: Azalea
| 57: Strangle 65: Bristle 73: Null 76: Marrow
X | 41: Deficit 43: Nun 64: Shudder 68: Gimmick
1 XXX T+ 51: Devious 59: Malign
| 55: Threshold 70: Yoghurt 74: Kindergarten
XX | 56: Thesis
XX | 15: Patience 52: Premier 63: Stealth
XXX S | 39: Remedy 44: Haunt
XX | 62: Quilt
XXXXXXXX | 13: Upset 32: Latter 47: Miniature 48: Peel
0 XXXXXXX +M 53: Butler 04: Figure 14: Drawer 23: Jug
XXXXXXXX | 54: Accessory 27: Pave 31: Compound 61: Olive
XXXXXXXXXXX M | 22: Restore 42: Wept
XXXX |
XXXXXXXXX | 24: Scrub
XXX | 03: Period
XX | 50: Bacterium
-1 XXXXXX S + 25: Dinosaur
XX | 18: Circle 26: Strap
XXX | 37: Crab 46: Cube
X | 11: Maintain
T | 09: Standard 20: Pro 30: Lonesome
X |
X |S 21: Soldier
-2 +
X | 07: Jump 36: Input
| 12: Stone
|
| 17: Pub
|
| 35: Quiz
-3 +
|
|
| 01: See 08: Shoe 28: Dash
|
|T
|
-4 +
|
|
|
|
|
| 02: Time 05: Poor 06: Drive 38: Vocabulary
-5 +
Students with a | Easier items
smaller vocabulary |
--------------------------------------------------------------------------------
Figure 1. Wright map for the Vocabulary Size Test items. M = mean; S = 1 SD; T = 2 SDs.
Each X = 1 person.
Items 2, 5, and 6 are in the first 1,000 high-frequency words of English and would
be expected to be easy. However, item 38 is in the 4,000-word level and is therefore
relatively infrequent, and yet it was easy for these students. The vocabulary item for 38 is
"vocabulary", and although it is a relatively infrequent word, it is used by the teacher regularly
and was used to introduce this test, and was therefore known to all students. Item 35 (quiz)
is also easy for these students as it is a loan word, commonly used in Japanese and
therefore known to all students. Item 50 (bacterium) is relatively easy for these
students as they are science students, for whom this is a high-frequency word. Item
10 (basis) is high frequency and yet not known by any of these students. An
examination of the item shows that it is difficult for these students, with a mixture of
responses. Item 16 (nil) is in the 2,000-word frequency level, and yet proved to be
difficult for these students. "Nil" is not typically taught in Japanese English classes, and
although a high-frequency word of English, it is quite specialized, being used generally
to talk about football scores in England. This explains why this item was difficult for
these students.
The logit difficulty for items was aggregated by level and the results are shown in
Table 2 below. Results show that item difficulty generally increases by level, although
there are several anomalies in the hierarchy. Words in the 2,000-word level are
harder than those in the next two levels, and there is a negligible difference in item
difficulty between the 6,000- and 7,000-word levels. One possible explanation is that
the students were in their first year, and had been studying vocabulary to pass the
entrance exam for university. The vocabulary in the entrance exam is unlikely to focus
on simple language, and therefore the students have studied the third and fourth
thousand words extensively, making these items easier for this particular group of students.
Table 2.
Word level    Aggregated item difficulty (logits)
1000          -23.82
2000           -4.94
3000           -8.54
4000           -5.10
5000            4.00
6000           10.15
7000            9.74
8000           17.60
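The anomalies in the hierarchy can be located mechanically by checking where the aggregated difficulty fails to rise from one level to the next. The following is a minimal sketch using the values in Table 2; the function and variable names are illustrative assumptions.

```python
# Aggregated logit difficulty by word-frequency level, as listed in Table 2.
level_difficulty = {
    1000: -23.82,
    2000: -4.94,
    3000: -8.54,
    4000: -5.10,
    5000: 4.00,
    6000: 10.15,
    7000: 9.74,
    8000: 17.60,
}

def find_reversals(difficulty_by_level):
    """Return adjacent level pairs where the higher level is easier."""
    levels = sorted(difficulty_by_level)
    return [
        (lower, higher)
        for lower, higher in zip(levels, levels[1:])
        if difficulty_by_level[higher] < difficulty_by_level[lower]
    ]

reversals = find_reversals(level_difficulty)
# reversals == [(2000, 3000), (6000, 7000)]
```

Only adjacent levels are compared here. The 4,000-word level (-5.10) is also marginally easier than the 2,000-word level (-4.94), which matches the observation that the second thousand is harder than both of the following levels.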
frequency items that were less difficult. These results, combined with the results of the
PCA, provide no evidence that a meaningful second dimension exists in the data; any
apparent second dimension might simply be item difficulty. It was therefore concluded that the test items form a
fundamentally unidimensional construct. Beglar (2010) reached the same conclusion,
and had strong results supporting the claim for unidimensionality of the test.
Table 3.
Statistic    Value           SE
Mean         -.27            .08
95% CI       [-.42, -.12]
SD           .68
Skewness     -.06            .27
Kurtosis     .40             .54
The results are in logits obtained from the Rasch analysis, and show that the distribution
is approximately normal for this measure. There were no outliers on this measure.
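One common way to support a normality claim of this kind is to divide the skewness and kurtosis statistics by their standard errors and check that the resulting z-values fall within ±1.96. The sketch below assumes that the second value listed for skewness and kurtosis in Table 3 is the standard error (the column labels did not survive in the table), and the source does not state that this particular check was the one the author applied.

```python
# Values taken from Table 3; treating .27 and .54 as standard errors is an
# assumption, since the table's column labels are missing.
skewness, se_skewness = -0.06, 0.27
kurtosis, se_kurtosis = 0.40, 0.54

def z_ratio(statistic, standard_error):
    """Ratio of a statistic to its standard error."""
    return statistic / standard_error

z_skew = z_ratio(skewness, se_skewness)   # about -0.22
z_kurt = z_ratio(kurtosis, se_kurtosis)   # about 0.74

# Rule of thumb: |z| < 1.96 gives no evidence of non-normality at the .05 level.
looks_normal = abs(z_skew) < 1.96 and abs(z_kurt) < 1.96
```

Under that assumption both ratios are well inside the ±1.96 band, which is consistent with the description of the distribution as normal.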
Correlation Analysis
In order to test the assumption that the VST was a measure of English proficiency,
a bivariate correlation analysis was performed with the listening and reading sections of
the TOEIC test. Table 4 shows the results of the correlation analysis.
Table 4.
Measure      1      2      3
1. VST       -
2. LIST      .48    -
3. READ      .56    .69    -
Note. VST = Vocabulary Size Test; LIST = TOEIC listening; READ = TOEIC reading. All
correlations significant at p < .01 (2-tailed).
All of the correlations are significant, indicating that there is a clear link between
vocabulary size and overall language proficiency. Reading correlates more strongly with
the VST, as reading is more likely to allow time for off-line processing of vocabulary, and
both reading and the VST use the same modality. The correlations are statistically
significant and strong, suggesting that vocabulary size is an important factor in
language proficiency, but that there are other variables. Again this is expected, as
language proficiency is considered to be complex and multifaceted (Fulcher & Davidson,
2007).
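The point that other variables are involved can be made concrete by squaring each correlation to obtain the proportion of variance the two measures share. This is a minimal sketch using the coefficients reported in Table 4; the variable names are illustrative.

```python
# Pearson correlations as reported in Table 4.
correlations = {
    ("VST", "LIST"): 0.48,
    ("VST", "READ"): 0.56,
    ("LIST", "READ"): 0.69,
}

# Coefficient of determination: proportion of variance shared by each pair.
shared_variance = {pair: round(r ** 2, 2) for pair, r in correlations.items()}
# shared_variance[("VST", "LIST")] == 0.23
# shared_variance[("VST", "READ")] == 0.31
# shared_variance[("LIST", "READ")] == 0.48
```

On these figures the VST shares roughly a quarter to a third of its variance with the TOEIC sections, leaving most of the variance in TOEIC scores to be explained by factors other than vocabulary size as measured here.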
The pedagogical implications of the current study are that the VST is a useful
measure of the receptive vocabulary knowledge of students such as those in the current
study, and can be used to differentiate between the vocabulary knowledge of students, even
within a fairly homogeneous sample. This makes the test useful for researchers looking
to measure elements of language proficiency, and also for teachers who may want to
use relative proficiency differences in constructing groups within the language
classroom. As Nation and Beglar (2007) state, the test should not be used to decide
which level of vocabulary to focus on, as items from different levels were
reasonably mixed in difficulty. The fact that the current test is free and readily available will also
appeal to teachers.
Conclusion
The results of the Rasch analysis support the previous analysis by Beglar (2010),
and show that the VST is a useful test of the general vocabulary level of students in
this context. Loan words do prove a slight complication in a Japanese context, giving
students knowledge of some relatively obscure vocabulary. Also, although the test
generally followed the predicted order in that items in the higher word bands were
more difficult, there was quite a large degree of mixing of difficulty, and based on these
results it would be inadvisable to use the test to try to identify specific weaknesses in
vocabulary within a given level. Again supporting the findings of Beglar (2010), the test
was unidimensional and was effective in differentiating between the different levels of
vocabulary knowledge for the students in this study, with a good level of separation of
items. The strong correlation with the TOEIC test supports the claim that vocabulary is
a relatively important part of overall language proficiency and suggests that teachers
References