2010 9 24-25
CLC number: H319  Document code: A  Article ID: 1003-6105(2010)04-0419-08
1 Herdan
language in mass
language in line
Brown 1967
John Carroll
Brown LOB Frown Flob 500 AHI
BNC CLEC lognormal model
Herdan
Barber
Brown
Herdan Brown
1960
Type-Token Mathematics: A Textbook of Mathematical Linguistics
George
Miller
WordNet
Brigham Young
Mark Davies
ICAME (International Computer Archive of Modern and Medieval English)
LDC (The Linguistic Data Consortium)
TACT LEXA WordCruncher
2 WordSmith AntConc
1 Penn Treebank
Prague Dependency Treebank
PropBank
Penn Discourse Treebank
RSTBank
TimeBank
Linguistic Data Consortium
LDC
supervised machine learning
automatic syntactic parsing
automatic semantic analysis
parsing
information extraction
word sense disambiguation
question-answer system
automatic summarization
Chinese Linguistic Data Consortium (CLDC)
text data mining
CLDC text data mining mining
extraction
parallel corpus
2010
data
50 knowledge
strategy
transit
2
1
performance-based approach
2
1
2010 9
TaCL
3
2
pedagogic ESP
processing
1
1990
Chomsky
"Corpus linguistics does not exist" (Tognini-Bonelli 2001: 50)
Widdowson 2000
Linguistics applied Widdowson
Halliday
1993 1
Halliday
Appliable linguistics 3
Applied Corpus Linguistics
British National Corpus
PatCount
2008 Colligator
2009
www.corpus4u.org
2009
Halliday, M. A. K. 1993. Quantitative studies and probabilities in grammar. In Michael Hoey (ed.). Data, Description, Discourse [C]. London: HarperCollins Publishers, 1-25.
Herdan, G. 1960. Type-Token Mathematics [M]. The Hague: Mouton.
Tognini-Bonelli, E. 2001. Corpus Linguistics at Work.
5 71-76
2009
J
3 18-23
2010-10-15
2010-10-22