Corpus Linguistics

Corpus Linguistics is an approach to the study of language that has become popular quite recently.
Its popularity and increased use owe much to the development of computer technologies,
which play an important role in the practical application of Corpus Linguistics. However, it is wrong to
assume that linguists did not use corpora in language studies before that development. Up until the 1950s,
language descriptions were based on “analyses of collections of natural texts: pre-computer
corpora”, and it was not until the 1980s, when some of the first major corpus-based linguistic studies
began to appear, that electronic corpora became available. (Biber, Reppen 2) Corpus Linguistics
could thus be characterized as “a research approach that facilitates empirical investigations of
language variation and use, resulting in research findings that have much greater generalizability
and validity than would otherwise be feasible.” (Biber, Reppen 3) This empirical investigation
is closely associated with quantitative methods, which are the methods most often used in Corpus
Linguistics for the analysis of language. Apart from knowledge of linguistics,
computational skills for working with corpora are equally important, since one of the goals of
activities based on Corpus Linguistics is to make the most of the computer technology that is
available. The foundation of Corpus Linguistics is a corpus, which is “a large, principled
collection of naturally occurring texts (written or spoken) stored electronically.” (Reppen)
“Naturally occurring” means that the language in corpora comes from common,
real-life situations, such as meetings, conversations with friends, books, etc. Corpora can be either
written, in which case they may include collections of texts from different genres, such as magazines,
fiction, academic language, etc., or spoken, in which case they are based on transcriptions. Such
transcriptions are time-consuming to produce, especially since there are certain rules that must be
followed. Numerous data-based corpora have been created and analyzed since the beginning of
the development of Corpus Linguistics, and such corpora can provide an understanding of the
language used in everyday communication rather than the understanding of the universal rules of
language and the way these rules should be applied. Accordingly, the use of “real-world”
language should be given priority. The main purpose of language should be
communication and its use in different socio-cultural contexts, because “today, language students
are considered successful if they can communicate in their second or foreign language.” (Celce-
Murcia 125) Corpus Linguistics can also be an interdisciplinary approach, and therefore may be
used for different text analyses in law, forensics, psychology, etc. Corpus Linguistics has
also contributed to the development of corpus-based activities which can be used in classrooms and
which “provide students with language input that accurately reflects the way language is used.”
(Reppen) Such activities could be used to enhance students’ acquisition of grammar, vocabulary, or
pragmatics. The main goal of corpus-based activities is to move away from
traditional teacher-centered classes and to encourage students to be actively involved in
the class and become researchers. Corpora could be used in teaching and material development
in ways that are already familiar to teachers and students.
Thus, corpus-based activities need not be something teachers avoid
in classrooms or something students are reluctant to do. Moreover, numerous activities
already used in course books available in classrooms are similar to corpus-based activities, such
as filling in the blanks, crosswords, understanding the functions of affixes, etc. Even though
“corpus-informed teaching materials provide students with examples of real language use,
helping learners to know how to use language that is appropriate in different contexts,” such
teaching materials are not commonly used in classrooms, and teaching is mostly based on direct
instruction. The main reasons for this are the lack of equipment in schools needed
to design such activities, and students’ reluctance and lack of motivation to try
something new when it is introduced to them. To encourage students to participate in
activities that can enhance their vocabulary acquisition, various corpus-based activities could be
created.
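
As a rough illustration of the quantitative side of corpus work and of the fill-in-the-blank activities mentioned above, the following Python sketch counts word frequencies in a tiny corpus and turns sentences containing a target word into gap-fill items. It is only a minimal sketch under stated assumptions: the three sample sentences, the names `SAMPLE_CORPUS`, `tokenize`, `word_frequencies`, and `gap_fill_items`, and the choice of target word are all hypothetical and do not come from the sources cited here. A real corpus would be far larger and would usually be handled with dedicated corpus tools.

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration only.
SAMPLE_CORPUS = [
    "We need to make a decision before the meeting ends.",
    "She had to make a decision about the new job.",
    "They will make an effort to arrive on time.",
]


def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(sentences):
    """Count how often each word form occurs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return counts


def gap_fill_items(sentences, target, blank="_____"):
    """Turn every sentence containing `target` into a fill-in-the-blank item."""
    pattern = re.compile(rf"\b{re.escape(target)}\b", re.IGNORECASE)
    return [pattern.sub(blank, s) for s in sentences if pattern.search(s)]


if __name__ == "__main__":
    # Show the most frequent word forms, then the generated exercise items.
    print(word_frequencies(SAMPLE_CORPUS).most_common(3))
    for item in gap_fill_items(SAMPLE_CORPUS, "decision"):
        print(item)
```

A teacher could swap in sentences drawn from an actual corpus and choose target words from the frequency list, so that the blanks in the exercise reflect vocabulary students are likely to meet in real language use.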
