FACULTY OF ENGLISH
DEPARTMENT OF ENGLISH
UNIVERSITY OF GUJRAT
JANUARY, 2021
NAME: Ifrah Anum
ROLL NO. 18081517-020
SEMESTER. 5th
SUBJECT: Corpus Linguistics
INTRODUCTION:
One of the first steps when it comes to studying a language is the compilation of data, so it can
be analysed and used for the purpose(s) required. As Sinclair (1991: 13) states:
“The beginning of any corpus study is the creation of the corpus itself. The decisions that are taken
about what is to be in the corpus, and how the selection is to be organized, control almost everything
that happens subsequently.”
A corpus is a searchable database of language samples for linguistic research. A corpus may be
based on written or spoken language. Some corpora are tagged or annotated by part of speech;
other corpora are plain text.
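
The difference between searching a plain-text corpus and a tagged corpus can be illustrated with a minimal Python sketch. The sample sentences, the word_TAG format and the function names concordance and find_by_tag are assumptions made for illustration only; they are not taken from the BNC, COCA or any corpus tool.

```python
import re

# Toy plain-text corpus (hypothetical sentences, not drawn from the BNC or COCA).
PLAIN_CORPUS = [
    "The corpus was compiled in the early nineties.",
    "Researchers search the corpus for patterns of use.",
    "A tagged corpus records the part of speech of each word.",
]

# Toy tagged corpus in an illustrative word_TAG format (an assumption,
# not the actual annotation scheme of any real corpus).
TAGGED_CORPUS = [
    "Researchers_NOUN search_VERB the_DET corpus_NOUN",
    "The_DET search_NOUN returned_VERB many_ADJ hits_NOUN",
]


def concordance(sentences, keyword, width=3):
    """Return keyword-in-context lines with up to `width` words on each side."""
    lines = []
    for sentence in sentences:
        words = re.findall(r"[A-Za-z']+", sentence)
        for i, word in enumerate(words):
            if word.lower() == keyword.lower():
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                lines.append(f"{left} [{word}] {right}".strip())
    return lines


def find_by_tag(tagged_sentences, keyword, tag):
    """Return tagged sentences in which `keyword` carries the given tag."""
    hits = []
    for sentence in tagged_sentences:
        for token in sentence.split():
            word, _, pos = token.rpartition("_")
            if word.lower() == keyword.lower() and pos == tag:
                hits.append(sentence)
                break
    return hits


if __name__ == "__main__":
    for line in concordance(PLAIN_CORPUS, "corpus", width=2):
        print(line)
    print(find_by_tag(TAGGED_CORPUS, "search", "VERB"))
```

In the plain-text corpus, the word "search" can only be found as a string. In the tagged corpus, the same word can be separated into its use as a verb and its use as a noun, which is what part-of-speech annotation adds.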
BRITISH NATIONAL CORPUS (BNC):
BNC compared with COCA:

BNC: Work on building the corpus began in 1991 and was completed in 1994. No new texts have been added after the completion of the project, but the corpus was slightly revised prior to the release of the second edition, BNC World (2001), and the third edition, BNC XML Edition (2007).
COCA: 20 million words per year for 1990-2008.

BNC: Since the completion of the project, two sub-corpora with material from the BNC have been released separately: the BNC Sampler (a general collection of one million written words, one million spoken) and the BNC Baby (four one-million-word samples from four different genres).
COCA: Updated every 6-9 months.

BNC: Newspapers, academic books, letters, essays, informal conversations, radio shows.
COCA: Equally divided among spoken, fiction, popular magazines, and academic texts.

BNC: This corpus covers a variety of different genres.
COCA: Useful for studying variation across genres and over time.
REFERENCES:
Hilario, P., 2021. COCA Vs BNC, Two Corpora or Lexicological Databases, What Are the Odds of Having the Same Features? [online] Academia.edu. Available at:
<https://www.academia.edu/44241712/COCA_vs_BNC_two_corpora_or_lexicological_databases_what_are_the_odds_of_having_the_same_features> [Accessed 3 January 2021].
Sketch Engine. 2021. British National Corpus (BNC) Search | Sketch Engine. [online] Available at:
<https://www.sketchengine.eu/british-national-corpus/#:~:text=What%20is%20British%20National%20Corpus,letters%2C%20essays%2C%20etc.)> [Accessed 3 January 2021].
Burnard, L., 2021. [Bnc] About the British National Corpus. [online] Natcorp.ox.ac.uk. Available at:
<http://www.natcorp.ox.ac.uk/corpus/> [Accessed 3 January 2021].