GE2412 W5 Corpus Analysis - IMRD Results
Concordancers
WEEK 5
ENGLISH FOR THE HUMANITIES AND SOCIAL SCIENCES
GE2412
What is a corpus?
“… me facts that I couldn’t imagine finding out about in any other way.”
(Fillmore, 1992, p. 35)
Why use corpora?
Discussion activity: With the person sitting next to you, list some
specific questions you have about the best way to say something
in English.
Benefits of corpus data
Corpus data is natural: it’s how people really speak and write.
Corpora are often very large, so the patterns they show reflect the
usage of many speakers and writers.
Corpus data is contextualized; we can see how language is used
differently in different situations.
Corpus data can reveal differences that intuition alone cannot
perceive.
What can we learn from corpora?
Major corpus (1):
The Corpus of Contemporary American English (COCA)
More than one billion words of text (25+ million words each year,
1990-2019).
Eight genres: spoken, fiction, popular magazines, newspapers,
academic texts, and (since the update in March 2020) TV and
movie subtitles, blogs, and other web pages.
Major corpus (2):
The British National Corpus (BNC)
(Reveir, 2009)
Activity 1: Query a corpus
Underlying cause/s
Underlying problem
Underlying assumptions
Activity 2: Query a corpus
With a partner, find out which preposition usually follows the word
“gravitate”. Provide two or three examples of words that follow
“gravitate”.
Possible answers
gravitate toward
gravitate to
Activity 3: Query a corpus
Clicking on a particular collocation displays concordance data, that is, the
keyword shown in context.
Concordance data includes the year in which the phrase was used, and the
text type and sub-genre from which the phrase was extracted.
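To make the idea of "keywords displayed in context" concrete, here is a minimal Python sketch of a concordance display. It is not how COCA works internally; the function name `kwic`, the `width` parameter, and the sample sentences are all made up for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line for each match of `keyword`."""
    lines = []
    # \b ensures we match the whole word only, ignoring upper/lower case
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column
        lines.append(f"{left:>{width}}[{m.group(0)}]{right}")
    return lines

# Invented sample text (not taken from any corpus)
sample = ("Young readers gravitate toward stories with strong characters. "
          "Tourists tend to gravitate to the old town.")

results = kwic(sample, "gravitate")
for line in results:
    print(line)
```

Each printed line puts the search word in the same column, which is what makes the words to its left and right easy to scan.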
Advanced search I
Displays
LIST: Shows a list of words or word combinations, ordered by frequency.
CHART: Shows a chart comparing the frequency of a word across genres or time periods.
COMPARE: Compares two words according to their frequencies (either overall or
with a certain collocate).
COLLOCATES: Shows words (not phrases) that occur within up to 10 words before or
after the search word(s); you can choose the collocation range by clicking the two
little boxes next to the COLLOCATE box.
POS LIST: A list of parts of speech; use it to look for a part of speech (a noun, a verb, etc.)
that occurs after a word.
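The collocation range described under COLLOCATES can be pictured as a window of words around the search word. The Python sketch below counts the words inside such a window; it is only an illustration under simple assumptions (text split on spaces, exact word matches), and the function name `collocates` and the sample sentence are hypothetical.

```python
from collections import Counter

def collocates(tokens, node, left=4, right=4):
    """Count words found up to `left` words before and `right` words after `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            start = max(0, i - left)
            # Words before the node plus words after it (the node itself is excluded)
            window = tokens[start:i] + tokens[i + 1:i + 1 + right]
            counts.update(window)
    return counts

# Invented sample sentence, split into words on spaces
tokens = "new technology changes how people use technology in the classroom".split()

# One word to the left and one to the right of "technology"
near = collocates(tokens, "technology", left=1, right=1)
print(near.most_common())
```

Setting `left=0` would count only the words that follow the search word, which mirrors choosing a range on just one side.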
Advanced search II
So far, we’ve experimented with inflected forms. However, if you’re searching for a
verb, e.g. ‘go’, and would like to retrieve contexts containing all inflected forms of
‘go’ (go, went, gone, going, goes, etc.), you need to type your search word in
square brackets: [go]
Activity 4: Use of square brackets: [word]
Now, you can try that again with the following two
words. What inflected forms can you find?
(i) [explain]
(ii) [nice]
Exploring subcorpora
Activity 5: Discover whether a word is more common in a particular text type
Look at the example sentence below.
I am fully/totally aware of the problem.
Answer the following questions by doing the word search in COCA.
a) In which kind of text is “totally” most frequently used?
b) In which kind of text is “fully” most frequently used?
c) Which word are we more likely to use in academic writing?
Instruction
Type in ‘fully’
Click on Chart
Click on ‘See frequency by section’
If necessary click ‘Change to Vertical chart’
Repeat the same steps for “totally”
Possible solutions: the word “totally”
Possible solutions: the word “fully”
Discussion questions
Using COCA, find the more frequently used collocate for the word
“technology” in the sentence below.
Answer: ______________
Instruction
Suggested answers:
Answer: application
Activity 8: collocation
Review one of your recent written assignments and find three places
where you received comments related to word choice or
grammatical mistakes.
Using all the techniques you have learnt, correct your writing errors.
Search for the frequency of each word or phrase, its different combinations,
and their use, and complete the chart on the next page.
My word/phrase | Frequency (occurrences per 100,000 words) | What did you notice? | Ideas for better choices
e.g. entered onto | 9 | Usually followed by a noun, or in/into | entered into, or just entered
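The "occurrences per 100,000 words" column is a normalized frequency: the raw count is scaled so that results from corpora or sections of different sizes can be compared. A minimal Python sketch of the arithmetic follows; the function name `per_100k` and all the counts are hypothetical and chosen only to show the calculation.

```python
def per_100k(occurrences, corpus_size):
    """Normalized frequency: occurrences per 100,000 words of text."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return occurrences * 100_000 / corpus_size

# Hypothetical counts, not real corpus figures
small_section = per_100k(45, 500_000)     # 45 hits in 500,000 words
large_section = per_100k(30, 1_000_000)   # 30 hits in 1,000,000 words

print(small_section)  # 9.0
print(large_section)  # 3.0
```

With these made-up numbers, the phrase is three times as frequent in the smaller section once size is taken into account.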
Extra: YouTube tutorials
Introduction to COCA:
https://www.youtube.com/watch?v=sCLgRTlxG0Y
Using Part-Of-Speech Tags:
https://www.youtube.com/watch?v=KP-7thiUnLM
Collocations:
https://www.youtube.com/watch?v=t_SxpfiPo_o
COCA: Lemmas, Parts of Speech Tags, and Wildcards:
https://www.youtube.com/watch?v=3Oy7dL31rhY
COCA Bites: Using the Wildcard in Frequency Searches (5)
https://www.youtube.com/watch?v=_7mSZ6SRCjI
IMRD: Results
Results
Include only tables and figures that are necessary, clear, and worth
reproducing.
Tables
You should use tables only when necessary.
You should maintain a uniform format when using more than one table.
Figures (graphs, diagrams, maps, photographs, etc.)
You should avoid including too much information in one figure.
You should use figures only when they will help convey your information.
Purpose
Verb tense
Characteristics/elements
Activity
Verb tense | Present (refers to previous work) | Past | Past
Characteristics/elements | • Nature and scope of problem • Review of relevant literature • Principal results | • Description of materials and procedure • Sufficient detail so that procedure could be reproduced | • Observations • Results
Writing plan for LD Draft
At this stage, you should have started exploring the use of COCA. Now, form a
group and read the assignment instructions with your groupmate(s).
Pick a question and try searching for a few of the suggested search terms
as a starting point. Keep a record of what you discover.
RQ 1/ 2/ 3
References