d expertise
bjcctives
effective
Part A Data collection and materials development
1 Using corpus data in the classroom
Gwy
Introduction
During the past 20 yea
which language can be stu
y of computers to hi
possible to build
Often they were
was frequently
matical class typically behaved.
Nowadays, computers have developed in such a way that there is no
longer any restriction on the size a corpus can be. A corpus is nothing
more nor less than a collection of texts input into a computer, and the
number of texts will depend upon the uses that will be made of the
corpus. For example, if teachers want to know what type of
by students stud acy,
ida corpus of the books the students are requi
lectures they are required to attend; and tl
ly 125,000 for ‘start’. In the spoken data, ‘start’ is
more frequent than ‘begin’, but only just. ‘Commence’ is hardly
used at all — there are only 99 examples, including those used in radio
asts,
also vary in frequency: ‘judg ice a8
common as ‘judgement’, and there are approximately five citations of
‘inquire’ to every four citations of ‘enquire’.
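The kind of frequency count discussed here can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used for The Bank of English: the function names `word_frequencies` and `hapax_legomena` are hypothetical, the sample sentences are invented, and the tokenisation rule (runs of letters, lower-cased) is a simplifying assumption.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word form occurs, ignoring case."""
    # Assumption: a word is a run of letters, optionally with internal apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)

def hapax_legomena(freqs):
    """Return the word forms that occur exactly once."""
    return sorted(word for word, count in freqs.items() if count == 1)

# Invented toy text standing in for a corpus.
sample = (
    "We start at nine. They begin at ten. I start early, "
    "you start late, and lessons begin on Monday."
)

freqs = word_frequencies(sample)

# Compare near synonyms; a word that never occurs is reported as 0.
for word in ("start", "begin", "commence"):
    print(word, freqs[word])

# Share of distinct word forms that occur only once.
print("hapax share:", len(hapax_legomena(freqs)) / len(freqs))
```

With a real corpus the same comparison could be run separately on written and spoken texts, or on spelling variants such as ‘inquire’ and ‘enquire’.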
ceds to be taught.
course, only the first stage in the whole teaching-learning process; it
would not, for example, identify how the items might be taught or even
how they should be built into a syllabus. It should, however, mean that
ad the relevance of the items to their particular
will assist the learning of them. Specialised corpora of
id are being built at the present time by teachers in a number of
Further details of speci
found in Chapter 2 of this book by Jane Willis,
however, also possible to build a much more general corpus,
is available for research into th
For
ws these far more data is idea
es as possible. For example, The Bank of English, the
by COBUILD ar
University of Birmingham,
million words of current En
British, Ai
reason for so much data is to do with frequency: the
frequency of words and the frequency of individual senses of words.
By doing a frequency count, it is possible to find out the relative
frequency of words, ranging fr
‘the’
corpus of English to the least frequent;
always be a very high proportion (up to 50 per cent) of
hapax legomena, words which occur only once. Words which occur only
a few times are usually more or less ignored by corpus linguists, as there is
not enough evidence as to how they are typically used in the
language.
It is easy to discover which is most frequent among words which are
near synonyms, for example ‘start’, ‘begin’ and ‘commence’. In the
whole Bank of English, ‘start’ is about x0 per cent more frequent than
‘begin’, with ‘commence’ being very infrequent having just over 000

know which words are infrequent, as less
ly be expended on them. Infrequent words are
usually topic-specific, and can be acquired when needed. It is the
general vocabulary, those words used across the board in a wide range
of topics, that is more difficult to acquire, as the meaning is likely to
ng to the context.
It has been argued that the common words are actually the ones
which need less tea
ing as frequency of exposure will do the job for
ally valid for
and learning effort should go
nt senses of words than into rare senses. There is an
intransitive use of ‘give’ as in ‘the rope gave’, which all native speakers
know, but which is rarely used. The contexts it is used in are few and
easily predictable. If students come across it, they will quickly understand
its meaning. However, the transitive use of ‘give’ in clauses such
as ‘she gave him a really lovely smile’, and ‘he gave an extremely boring
talk’ is so common that many native speakers hardly even notice it. The
meaning of ‘give’ here is easily intelligible to learners, but what they
might not realise is how important this particular structural pattern is.
Native speakers use it all the time, as it allows us to focus on the event
rather than on the action, and also to give as much (or as little)
information as we want to about that event. Often there is a rel
verb that could be chosen instead. For example, ‘he
semantically means nearly the same as ‘he smiled at her’, although they
are potentially very different in what they indicate. Both are relevant to ‘she
eth ‘he account
mon and students should
in English text
be encouraged
show the pre
ot show any ci
He gave her a smile - He smiled at her
He gave the door a kick - He kicked the door
-houe period. If
r themselves how
ing, of course, that
and in many cases a discussion wi
a chosen in preference to one in
in teaching where we focus on
a word, rather than on
meanings are said to
of abstract ones,
Be
the more common uses out of the class-
room,
The word ‘thing’ is a further example. It does,
of course, mean
“What's th
more often used
Literacy isn't the same thing as
A really strange thing happened
And it is also frequently used as a prefacing device to tell the person you
are addressing what your attitude is to what you are saying
Because you can use ‘thing’ more or less any time when you do not
specify more precisely what you are saying, it is an extremely
useful word, but learners rarely use it as frequently as native speakers
do. This makes their language too precise and therefore
ly a sm to study a
concordance of ‘thing’, they can be alerted to all its uses and can t
tase these themselves.
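A concordance of the kind mentioned here lists every occurrence of a word together with the text on either side of it. The following is a minimal sketch of how such a listing might be produced in Python; the function name `concordance`, the `width` parameter and the sample sentences are all invented for illustration and do not describe any particular concordancing package.

```python
import re

def concordance(text, keyword, width=30):
    """Return one keyword-in-context line for each occurrence of keyword."""
    # Match the keyword as a whole word, regardless of case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    # Collapse whitespace so that line breaks do not disturb the alignment.
    flat = " ".join(text.split())
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sentences standing in for corpus text.
sample = (
    "The thing is, I never wanted to go. "
    "That is the same thing as giving up. "
    "Things change, but one thing stays the same."
)

for line in concordance(sample, "thing"):
    print(line)
```

Because the keyword is matched as a whole word, the plural ‘Things’ is not listed; a classroom version might well search for both forms so that students see the full range of uses.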
its up one of the significant differences between
id language in the real world. Most classroom
lanned (or, at fanned) and therefore is lacking
res of unplanned discourse, which include
is such as ‘something like that’
.