Higher – 6 times
In – 93 times
Interest in – 67 times
Of – 58 times
Interest of – 11 times
Of interest – 24 times
Public – 5 times
TASK 2
Comparing association measures
a) Search for the word ‘interest’ using different association measures. Use the default collocation
settings. Complete the table below with the top ten collocates of ‘interest’ for each association
measure.
b) How do the different sets of collocates compare with each other? What types of words do they
include (grammatical or lexical, frequent or infrequent)? Is one association measure
preferable?
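To see why the measures can rank collocates differently, it may help to compute two of them by hand. The sketch below assumes the simple textbook forms MI = log2(O/E) and t-score = (O - E)/sqrt(O), with expected frequency E = f(node) x f(collocate) x window / N. The exact formulas and corrections used by #LancsBox may differ, and all counts are invented for illustration; they are not LOB figures.

```python
import math

def expected(node_freq, coll_freq, corpus_size, window):
    """Expected co-occurrence frequency under independence (assumed form)."""
    return node_freq * coll_freq * window / corpus_size

def mi_score(observed, exp):
    """Mutual information: log2 of observed over expected frequency."""
    return math.log2(observed / exp)

def t_score(observed, exp):
    """T-score: (observed - expected) / sqrt(observed)."""
    return (observed - exp) / math.sqrt(observed)

# Invented counts, for illustration only (not taken from any corpus).
N = 1_000_000      # corpus size in tokens
WINDOW = 10        # L5-R5 span
NODE_FREQ = 300    # frequency of the node

# collocate -> (corpus frequency, co-occurrence frequency with the node)
candidates = {
    "of": (35_000, 150),   # frequent grammatical word
    "vested": (12, 6),     # infrequent lexical word
}

scores = {}
for word, (freq, obs) in candidates.items():
    exp = expected(NODE_FREQ, freq, N, WINDOW)
    scores[word] = (mi_score(obs, exp), t_score(obs, exp))
    print(f"{word:8s} MI={scores[word][0]:6.2f}  t={scores[word][1]:6.2f}")
```

With these invented counts, MI places the infrequent lexical word higher, while the t-score places the frequent grammatical word higher. This is the kind of contrast question b) asks you to look for.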
TASK 3
Interpreting collocation networks
a) Create a graph for the word ‘time’ using the following
settings: Span L5-R5, Statistics MI, Statistic cut-off value 5.0, Collocation frequency 5.0, Type.
Find the first-order collocate ‘spend’ in the graph and double-click on it. Find the second-order
collocate ‘money’ in the graph and double-click on it. Comment on the connection between ‘time’
and ‘money’ that you can see in the collocation network.
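A collocation network can be thought of as a graph in which each expanded node is linked to its collocates. The sketch below uses hypothetical collocate sets, invented for illustration and not produced by GraphColl, to show how shared and second-order collocates could be read off such a graph. The function names are likewise made up for this example.

```python
# Hypothetical collocate sets, invented for illustration; a real network
# would come from the GraphColl output for each expanded node.
network = {
    "time": {"spend", "long", "waste"},
    "spend": {"time", "money", "waste"},
    "money": {"spend", "waste", "save"},
}

def shared_collocates(graph, a, b):
    """Return the collocates that node a and node b have in common."""
    return graph.get(a, set()) & graph.get(b, set())

def second_order(graph, node):
    """Return collocates of the node's collocates, excluding the node
    itself and its first-order collocates."""
    first = graph.get(node, set())
    reached = set()
    for collocate in first:
        reached |= graph.get(collocate, set())
    return reached - first - {node}

print(sorted(shared_collocates(network, "time", "money")))
print(sorted(second_order(network, "time")))
```

In this toy graph, ‘money’ is reachable from ‘time’ only through ‘spend’, which is what makes it a second-order collocate; the two nodes also share collocates, which is the sort of link the task asks you to comment on.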
TASK 4
In the following tasks, we use the LOB corpus, which is provided with #LancsBox.
Collocation settings
GraphColl produces collocation tables and graphs. You can search for the node and its
collocates after selecting the appropriate settings:
Span: how many words to the left (L) and to the right (R) of the search term (node) are considered when searching for collocates.
Statistics: the association measure used to compute the strength of collocation.
Threshold: the minimum frequency and statistics cut-off values for an item (word, lemma, POS) to be considered a collocate.
Corpus: the corpus that is being searched.
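The span and frequency threshold settings can be illustrated with a minimal sketch. It assumes a plain whitespace-tokenised text and a hypothetical function `collocates`; it only counts co-occurrences and applies a minimum frequency, without any association statistic, so it is not how GraphColl itself is implemented.

```python
from collections import Counter

def collocates(tokens, node, left=5, right=5, min_freq=1):
    """Count words within `left` and `right` tokens of each occurrence
    of `node`, keeping only those that reach `min_freq`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - left)
        # Words to the left of the node plus words to the right of it.
        window = tokens[start:i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return {w: c for w, c in counts.items() if c >= min_freq}

# Toy text, invented for illustration.
text = "we spend time and money but time is money and time flies"
tokens = text.split()

# Span L1-R1, minimum collocation frequency 2.
result = collocates(tokens, "time", left=1, right=1, min_freq=2)
print(result)
```

Widening the span or lowering the threshold admits more candidate collocates, which is why the same node can produce very different tables under different settings.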
Try searching for collocations in #LancsBox
1. Load the LOB corpus into #LancsBox: download the LOB corpus, then import it.
2. Open the GraphColl tool in #LancsBox.
3. Follow the instructions
4. Personalise the COLLOCATION SETTINGS
5. Type the search term and click ‘Search’
TASK 5
Using any corpus (e.g. Brown or LOB in #LancsBox), search for words ending in ‘ly’ with the search term *ly.
How many are adverbs? How many are nouns? Analyse two pages of results only.
Deliverable: table containing categories
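Outside #LancsBox, the wildcard search *ly can be approximated with a regular expression. The sketch below is an assumption about what the wildcard matches (any word form ending in ‘ly’), applied to an invented sentence. It shows why the results have to be categorised by hand: the pattern cannot tell adverbs from nouns or adjectives.

```python
import re

def ly_words(text):
    """Return lowercase word forms ending in 'ly', roughly like *ly."""
    return re.findall(r"\b[a-z]*ly\b", text.lower())

# Invented sample sentence, for illustration only.
sample = "She quickly fed the family; the bully was really only a silly fly."
matches = ly_words(sample)
print(matches)
```

The matches here include adverbs (‘quickly’, ‘really’), nouns (‘family’, ‘bully’, ‘fly’) and an adjective (‘silly’), so each hit still needs to be assigned to a category for the table.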
The Whelk tool is used to find the absolute and relative frequencies of the search term in the corpus files.
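The difference between absolute and relative frequency can be shown with a short calculation. The sketch assumes a normalisation base of 10,000 tokens, which is a common convention but an assumption here, and uses hypothetical file names and counts.

```python
def relative_frequency(hits, tokens, per=10_000):
    """Normalise an absolute frequency to a rate per `per` tokens."""
    return hits * per / tokens

# Hypothetical files: name -> (absolute frequency, file size in tokens).
files = {
    "file_a.txt": (12, 2_000),
    "file_b.txt": (12, 8_000),
}

rel = {name: relative_frequency(h, n) for name, (h, n) in files.items()}
for name, value in rel.items():
    print(f"{name}: {value:.1f} per 10,000 tokens")
```

Both hypothetical files contain the same absolute number of hits, but the relative frequency is four times higher in the smaller file, which is why relative frequencies are needed when comparing files of different sizes.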
The Ngrams tool is used to identify lexical bundles.
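Lexical bundles are recurrent word sequences, so identifying them starts with counting n-grams. The sketch below is a minimal illustration on an invented token sequence; the function name `ngrams` is hypothetical and the sketch applies no frequency or dispersion cut-offs.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented token sequence, for illustration only.
tokens = "on the other hand it is on the other side".split()

counts = Counter(ngrams(tokens, 3))
top = counts.most_common(1)
print(top)
```

Only the sequence ‘on the other’ recurs in this toy example; in a real corpus, candidate bundles would be the n-grams that recur often enough across texts.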