
Academic Vocabulary in Learner Writing

Corpus and Discourse

Series editors: Wolfgang Teubert, University of Birmingham, and Michaela Mahlberg, University of Liverpool.

Editorial Board: Paul Baker (Lancaster), František Čermák (Prague), Susan Conrad (Portland), Geoffrey Leech (Lancaster), Dominique Maingueneau (Paris XII), Christian Mair (Freiburg), Alan Partington (Bologna), Elena Tognini-Bonelli (Siena and TWC), Ruth Wodak (Lancaster), Feng Zhiwei (Beijing).

Corpus linguistics provides the methodology to extract meaning from texts. Taking as its starting point the fact that language is not a mirror of reality but lets us share what we know, believe and think about reality, it focuses on language as a social phenomenon, and makes visible the attitudes and beliefs expressed by the members of a discourse community. Consisting of both spoken and written language, discourse always has historical, social, functional, and regional dimensions. Discourse can be monolingual or multilingual, interconnected by translations. Discourse is where language and social studies meet.

The Corpus and Discourse series consists of two strands. The first, Research in Corpus and Discourse, features innovative contributions to various aspects of corpus linguistics and a wide range of applications, from language technology via the teaching of a second language to a history of mentalities. The second strand, Studies in Corpus and Discourse, is comprised of key texts bridging the gap between social studies and linguistics. Although equally academically rigorous, this strand will be aimed at a wider audience of academics and postgraduate students working in both disciplines.
Research in Corpus and Discourse

Conversation in Context: A Corpus-driven Approach. With a preface by Michael McCarthy. Christoph Rühlemann
Corpus-Based Approaches to English Language Teaching. Edited by Mari Carmen Campoy, Begona Bellés-Fortuno and Ma Lluïsa Gea-Valor
Corpus Linguistics and World Englishes: An Analysis of Xhosa English. Vivian de Klerk
Evaluation and Stance in War News: A Linguistic Analysis of American, British and Italian television news reporting of the 2003 Iraqi war. Edited by Louann Haarman and Linda Lombardo
Evaluation in Media Discourse: Analysis of a Newspaper Corpus. Monika Bednarek
Historical Corpus Stylistics: Media, Technology and Change. Patrick Studer
Idioms and Collocations: Corpus-based Linguistic and Lexicographic Studies. Edited by Christiane Fellbaum
Meaningful Texts: The Extraction of Semantic Information from Monolingual and Multilingual Corpora. Edited by Geoff Barnbrook, Pernilla Danielsson and Michaela Mahlberg
Rethinking Idiomaticity: A Usage-based Approach. Stefanie Wulff
Working with Spanish Corpora. Edited by Giovanni Parodi

Studies in Corpus and Discourse

Corpus Linguistics and The Study of Literature: Stylistics In Jane Austen’s Novels. Bettina Starcke
English Collocation Studies: The OSTI Report. John Sinclair, Susan Jones and Robert Daley. Edited by Ramesh Krishnamurthy. With an introduction by Wolfgang Teubert
Text, Discourse, and Corpora: Theory and Analysis. Michael Hoey, Michaela Mahlberg, Michael Stubbs and Wolfgang Teubert. With an introduction by John Sinclair


Academic Vocabulary in Learner Writing: From Extraction to Analysis

Magali Paquot

Continuum International Publishing Group
The Tower Building, 11 York Road, London SE1 7NX
80 Maiden Lane, Suite …, New York, NY 10038
www.…

© Magali Paquot 2010

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-1-4411-3036-5 (hardcover)

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress.

Typeset by Newgen Imaging Systems Pvt Ltd, Chennai, India
Printed and bound in Great Britain by the MPG Books Group


Contents

Acknowledgements
List of abbreviations
List of figures
List of tables
Introduction

Part I: Academic vocabulary

Chapter 1 What is academic vocabulary?
  1.1. Academic vocabulary vs. core vocabulary and technical terms
    1.1.1. Core vocabulary
    1.1.2. Academic vocabulary
    1.1.3. Technical terms
    1.1.4. Fuzzy vocabulary categories
  1.2. Academic vocabulary and sub-technical vocabulary
  1.3. Vocabulary and the organization of academic texts
  1.4. Is there an ‘academic vocabulary’?
  1.5. Summary and conclusion

Chapter 2 A data-driven approach to the selection of academic vocabulary
  2.1. Corpora of academic writing
  2.2. Corpus annotation
    2.2.1. Issues in annotating corpora
    2.2.2. The software
  2.3. Automatic extraction of potential academic words
    2.3.1. Keyness
    2.3.2. Range
    2.3.3. Evenness of distribution
    2.3.4. Broadening the scope of well-represented semantic categories
  2.4. The Academic Keyword List
  2.5. Summary and conclusion

Part II: Learners’ use of academic vocabulary

Chapter 3 Investigating learner language
  3.1. The International Corpus of Learner English
  3.2. Contrastive Interlanguage Analysis
  3.3. A comparison of learner vs. expert writing
  3.4. Summary and conclusion

Chapter 4 Rhetorical functions in expert academic writing
  4.1. The Academic Keyword List and rhetorical functions
  4.2. The function of exemplification
    4.2.1. Using prepositions, adverbs and adverbial phrases to exemplify
    4.2.2. Using nouns and verbs to exemplify
    4.2.3. Discussion
  4.3. The phraseology of rhetorical functions in expert academic writing
  4.4. Summary and conclusion

Chapter 5 Academic vocabulary in the International Corpus of Learner English
  5.1. A bird’s-eye view of exemplification in learner writing
  5.2. Academic vocabulary and general interlanguage features
    5.2.1. Limited lexical repertoire
    5.2.2. Lack of register awareness
    5.2.3. The phraseology of academic vocabulary in learner writing
    5.2.4. Semantic misuse
    5.2.5. Chains of connective devices
    5.2.6. Sentence position
  5.3. Transfer-related effects on French learners’ use of academic vocabulary
  5.4. Summary and conclusion

Part III: Pedagogical implications and conclusions

Chapter 6 Pedagogical implications
  6.1. Teaching-induced factors
  6.2. The role of the first language in EFL learning and teaching
  6.3. The role of learner corpora in EAP materials design

Chapter 7 General Conclusion
  7.1. Academic vocabulary: a chimera?
  7.2. Learner corpora, interlanguage and second language acquisition
  7.3. Avenues for future research

Appendix 1: Expressing cause and effect
Appendix 2: Comparing and contrasting
Notes
References
Author index
Subject index



Acknowledgements

There are several people without whom this book would never have been written. First and foremost, I want to express my deepest and most sincere gratitude to my PhD supervisor, Professor Sylviane Granger, for her infectious enthusiasm, her intellectual perceptiveness and her unfailing expert guidance. I am greatly indebted to you, Sylviane, for giving me the opportunity to join the renowned Centre for English Corpus Linguistics seven years ago now! I have been lucky enough to undertake research in an environment where writing a PhD also means collaborating with many fellow researchers on up-and-coming projects, attending thought-provoking conferences, organizing seminars, conferences and summer schools, as well as lecturing and offering guidance to undergraduate students.

I am also very grateful to my colleagues and friends at the Centre for English Corpus Linguistics – Céline, Claire, Fanny, Gaëtanelle, Jennifer, Marie-Aude, Suzanne and Sylvie – for making the Centre for English Corpus Linguistics such an inspiring and intellectually stimulating research centre. I also wish to thank them for their moral and intellectual support and for all the entertaining lunchtimes we spent together talking about everyday life . . . and work.

I am indebted to a great number of colleagues not only for supplying me with corpora, corpus-handling tools and references, but also for providing helpful comments on earlier versions and stimulating ideas for my research. I would like to thank Yves Bestgen, Liesbet Degand, Jean Heiderscheidt, Sebastian Hoffmann, Scott Jarvis, Jean-René Klein, Fanny Meunier, Hilary Nesi, John Osborne and JoAnne Neff van Aertselaer. I am also grateful to an anonymous reviewer for recommendations on the first draft of the text.

I gratefully acknowledge the support of both the Communauté française de Belgique, which funded my doctoral dissertation out of which this book has grown, and the Belgian National Fund for Scientific Research (F.N.R.S.).

On a more personal note, I would like to express my deepest thanks to my parents and friends for everything they have done to help me while I was working on this book. And last, but not least, Arnaud: thank you for making it all worthwhile.

Magali Paquot
Louvain-la-Neuve
November, 2009

List of abbreviations

AKL         Academic Keyword List (my own list)
AWL         Academic Word List (Coxhead, 2000)
BAWE        British Academic Written English (BAWE) Pilot Corpus
BNC         British National Corpus
B-BNC       Baby BNC Academic Corpus
BNC-AC      British National Corpus – academic sub-corpus
BNC-AC-HUM  British National Corpus – academic sub-corpus (discipline: humanities and arts)
BNC-SP      British National Corpus – spoken sub-corpus
CALL        Computer-assisted language learning
CECL        Centre for English Corpus Linguistics, Université catholique de Louvain
CIA         Contrastive Interlanguage Analysis
CLAWS       Constituent Likelihood Automatic Word-tagging system
CODIF       Corpus de Dissertations Françaises
EAP         English for academic purposes
EFL         English as a foreign language
ESL         English as a second language
ESP         English for specific purposes
GSL         General Service List (West, 1953)
ICLE        International Corpus of Learner English (Granger et al., 2002)
ICLEv2      International Corpus of Learner English (version 2) (Granger et al., 2009)
IL          interlanguage
L1          First language
L2          Foreign language
LDOCE4      Longman Dictionary of Contemporary English (4th edition)
LOCNESS     Louvain Corpus of Native Speaker Essays
LogL        Log-likelihood statistical test
MC          Micro-Concord Corpus Collection B
MED2        Macmillan English Dictionary for Advanced Learners (second edition)
MLD         Monolingual learners’ dictionary
NS          Native speaker
NNS         Non-native speaker
pmw         Per million words
POS         Part-of-speech
SLA         Second language acquisition
UCREL       University Centre for Computer Corpus Research on Language, Lancaster University
WST4        WordSmith Tools (version 4)

List of figures

Figure 1.1: The relationship between academic and sub-technical vocabulary
Figure 2.1: A three-layered sieve to extract potential academic words
Figure 2.2: WordSmith Tools – WordList option
Figure 2.3: Distribution of the words example and law in the 15 sub-corpora
Figure 2.4: WordSmith Tools Detailed Consistency Analysis
Figure 2.5: Distribution of the noun ‘solution’
Figure 3.1: ICLE task and learner variables (Granger et al., 2002: 13)
Figure 3.2: Contrastive Interlanguage Analysis (Granger 1996a)
Figure 3.3: BNCweb Collocations option
Figure 4.1: Exemplification in the BNC-AC-HUM
Figure 4.2: The distribution of the adverb ‘notably’ across genres
Figure 4.3: The distribution of ‘by way of illustration’ across genres
Figure 4.4: The distribution of ‘to name but a few’ across genres
Figure 4.5: The distribution of the verbs ‘illustrate’ and ‘exemplify’ across genres
Figure 4.6: The phraseology of rhetorical functions in academic prose
Figure 5.1: Exemplifiers in the ICLE and the BNC-AC-HUM
Figure 5.2: The use of the prepositions ‘like’ and ‘such as’ in different genres
Figure 5.3: The use of the adverb ‘notably’ in different genres
Figure 5.4: Distribution of the adverbials ‘for example’ and ‘for instance’ across genres in the BNC
Figure 5.5: The treatment of ‘namely’ on websites devoted to English connectors
Figure 5.6: The use of ‘despite’ and ‘in spite of’ in different genres
Figure 5.7: The frequency of speech-like lexical items in expert academic writing, learner writing and speech (based on Gilquin and Paquot, 2008)
Figure 5.8: Phraseological cascades with ‘in conclusion’ and learner-specific equivalent sequences
Figure 5.9: Collocational overlap
Figure 5.10: A possible rationale for the use of ‘according to me’ in French learners’ interlanguage
Figure 5.11: A possible rationale for the use of ‘let us’ in French learners’ interlanguage
Figure 5.12: Features of novice writing. Frequency in expert academic writing, native-speaker and EFL novices’ writing and native speech (per million words of running text)
Figure 6.1: Connectives: contrast and concession (Jordan 1999: 136)
Figure 6.2: Comparing and contrasting: using nouns such as ‘resemblance’ and ‘similarity’ (Gilquin et al., 2007b: IW5)
Figure 6.3: Reformulation: Explaining and defining: using ‘i.e.’, ‘that is’ and ‘that is to say’ (Gilquin et al., 2007b: IW9)
Figure 6.4: Expressing cause and effect: ‘Be careful’ note on ‘so’ (Gilquin et al., 2007b: IW13)

List of tables

Table 1.1: Composition of the Academic Corpus (Coxhead 2000: 220)
Table 1.2: Chung and Nation’s (2003: 105) rating scale for finding technical terms, as applied to the field of anatomy
Table 1.3: Word families in the AWL
Table 2.1: The corpora of professional academic writing
Table 2.2: The re-categorization of data from the professional corpus into knowledge domains
Table 2.3: The corpora of student academic writing
Table 2.4: Examples of essay topics in the BAWE pilot corpus
Table 2.5: An example of CLAWS vertical output
Table 2.6: CLAWS horizontal output [lemma + POS]
Table 2.7: CLAWS horizontal output [lemma + simplified POS tags]
Table 2.8: Simplification of CLAWS POS-tags
Table 2.9: CLAWS tagging of the complex preposition ‘in terms of’
Table 2.10: Semantic fields of the UCREL Semantic Analysis System
Table 2.11: USAS vertical output
Table 2.12: USAS horizontal output
Table 2.13: The fiction corpus
Table 2.14: Number of keywords
Table 2.15: Automatic semantic analysis of potential academic words
Table 2.16: Distribution of grammatical categories in the Academic Keyword List
Table 2.17: The Academic Keyword List
Table 2.18: The distribution of AKL words in the GSL and the AWL
Table 3.1: Breakdown of ICLE essays
Table 3.2: BNC Index – Breakdown of written BNC genres (Lee 2001)
Table 4.1: Ways of expressing exemplification found in the BNC-AC-HUM
Table 4.2: The use of ‘for example’ and ‘for instance’ in the BNC-AC-HUM
Table 4.3: The use of ‘example’ and ‘for example’ in the BNC-AC-HUM
Table 4.4: Significant verb co-occurrents of the noun ‘example’ in the BNC-AC-HUM
Table 4.5: Adjective co-occurrents of the noun ‘example’ in the BNC-AC-HUM
Table 4.6: The use of the lemma ‘illustrate’ in the BNC-AC-HUM
Table 4.7: The use of the lemma ‘exemplify’ in the BNC-AC-HUM
Table 4.8: The use of imperatives in academic writing (based on Siepmann, 2005: 119)
Table 4.9: Ways of expressing a concession in the BNC-AC-HUM
Table 4.10: Ways of reformulating, paraphrasing and clarifying in the BNC-AC-HUM
Table 4.11: Ways of expressing cause and effect in the BNC-AC-HUM
Table 4.12: Ways of comparing and contrasting found in the BNC-AC-HUM
Table 4.13: Co-occurrents of nouns expressing cause or effect in the BNC-AC-HUM
Table 4.13a: reason
Table 4.13b: implication
Table 4.13c: effect
Table 4.13d: outcome
Table 4.13e: result
Table 4.13f: consequence
Table 4.14: Co-occurrents of verbs expressing possibility and certainty in the BNC-AC-HUM
Table 4.14a: suggest
Table 4.14b: prove
Table 4.14c: appear
Table 4.14d: tend
Table 5.1: A comparison of exemplifiers based on the total number of running words
Table 5.2: A comparison of exemplifiers based on the total number of exemplifiers used
Table 5.3: Two methods of comparing the use of exemplifiers
Table 5.4: Significant adjective co-occurrents of the noun ‘example’ in the ICLE
Table 5.5: Adjective co-occurrents of the noun ‘example’ in ICLE not found in the BNC
Table 5.6: Significant verb co-occurrents of the noun ‘example’ in the ICLE
Table 5.7: Verb co-occurrent types of the noun ‘example’ in ICLE not found in BNC
Table 5.8: The distribution of ‘example’ and ‘be’ in the ICLE and the BNC-AC-HUM
Table 5.9: The distribution of ‘there + BE + example’ in ICLE and the BNC-AC-HUM
Table 5.10: The distribution of AKL words in the ICLE
Table 5.11: Examples of AKL words which are overused and underused in the ICLE
Table 5.12: Two ways of comparing the use of cause and effect markers in the ICLE and the BNC
Table 5.13: The over- and underuse by EFL learners of specific devices to express cause and effect (based on Appendix 1)
Table 5.14: The over- and underuse by EFL learners of specific devices to express comparison and contrast (based on Appendix 2)
Table 5.15: Speech-like overused lexical items per rhetorical function
Table 5.16: The frequency of ‘maybe’ in learner corpora
Table 5.17: The frequency of ‘I think’ in learner corpora
Table 5.18: Examples of overused and underused clusters with AKL words
Table 5.19: Clusters of words including AKL verbs which are over- and underused in learners’ writing, by comparison with expert academic writing
Table 5.20: Examples of overused clusters in learner writing
Table 5.21: Verb co-occurrents of the noun conclusion in the ICLE
Table 5.22: Adjective co-occurrents of the noun conclusion in the ICLE
Table 5.23: The frequency of sentence-initial position of connectors in the BNC-AC-HUM and the ICLE
Table 5.24: Sentence-final position of connectors in the ICLE and the BNC-AC-HUM
Table 5.25: Jarvis’s (2000) three effects of potential L1 influence
Table 5.26: Jarvis’s (2000) unified framework applied to the ICLE-FR
Table 5.27: A comparison of the use of the English verb ‘illustrate’ and the French verb ‘illustrer’
Table 5.28: ‘let us’ in learner texts
Table 5.29: The transfer of frequency of the first person plural imperative between French and English writing
Table 6.1: Le Robert & Collins CD-Rom (2003–2004): Essay writing

Introduction

That English has become the major international language for research and publication is beyond dispute. As a result, university students need to have good receptive command of English if they want to have access to the literature pertaining to their discipline. As a large number of them are also required to write academic texts (e.g. essays, reports, MA dissertations, PhD theses, etc.), they also need to have a productive knowledge of academic language. As noted by Biber, ‘students who are beginning university studies face a bewildering range of obstacles and adjustments, and many of these difficulties involve learning to use language in new ways’ (2006: 1).

Several studies have shown that the distinctive, highly routinized, nature of academic prose is problematic for many novice native-speaker writers (e.g. Cortes, 2002), but poses an even greater challenge to students for whom English is a second (e.g. Hinkel, 2002) or foreign language (e.g. Gilquin et al., 2007b). Studies in second language writing have established that learning to write second-language (L2) academic prose requires an advanced linguistic competence, without which learners simply do not have the range of lexical and grammatical skills required for academic writing (Jordan, 1997; Nation and Waring, 1997; Hinkel, 2002, 2004; Reynolds, 2005). A questionnaire survey of almost 5,000 undergraduates showed that students from all 26 departments at the Hong Kong Polytechnic University experienced difficulties with the writing skills necessary for studying content subjects through the medium of English (Evans and Green, 2006). Almost 50 per cent of the students reported that they encountered difficulties in using appropriate academic style, expressing ideas in correct English and linking sentences smoothly.

Mastering the subtleties of academic prose is, however, not only a problem for novice writers. International refereed journal articles are regarded as the most important vehicle for publishing research findings and non-native academics who want to publish their work in those top journals often find their articles rejected, partly because of language problems. These problems include the fact that they have less facility of expression and a poorer vocabulary, they find it difficult to ‘hedge’ appropriately, and the structure of their texts may be influenced by their first language (see Flowerdew, 1999).

Because it causes major difficulties to students and scholars alike, academic discourse has become a major object of study in applied linguistics. Flowerdew (2002) identified four major research paradigms for investigating academic discourse, namely (Swalesian) genre analysis, contrastive rhetoric, ethnographic approaches and corpus-based analysis. While the first three approaches to English for Academic Purposes (EAP) emphasize the situational or cultural context of academic discourse, corpus-linguistic methods focus more on the co-text of selected lexical items in academic texts.

Corpus linguistics is concerned with the collection in electronic format and the analysis of large amounts of naturally occurring spoken or written data ‘selected according to external criteria to represent, as far as possible, a language or language variety as a source of linguistic research’ (Sinclair, 2005: 16). Computer corpora are analysed with the help of software packages such as WordSmith Tools 4 (Scott, 2004), which includes a number of text-handling tools to support quantitative and qualitative textual data analysis. Wordlists give information on the frequency and distribution of the vocabulary – single words but also word sequences – used in one or more corpora. Wordlists for two corpora can be compared automatically so as to highlight the vocabulary that is particularly salient in a given corpus, i.e. its keywords. Concordances are used to analyse the co-text of a linguistic feature, in other words its linguistic environment in terms of preferred co-occurrences and grammatical structures.

The research paradigm of corpus linguistics is ideally suited for studying the linguistic features of academic discourse as it can highlight which words, phrases or structures are most typical of the genre and how they are generally used. Corpus-based studies have already shed light on a number of distinctive linguistic features of academic discourse as compared with other genres. Biber’s (1988) study of variation across speech and writing has shown that academic texts typically have an informational and non-narrative focus; they require highly explicit, text-internal reference and deal with abstract, conceptual or technical subject matter (Biber, 1988: 121–60). The Longman Grammar of Spoken and Written English (Biber et al., 1999) provides a comprehensive description of the range of distinctive grammatical and lexical features of academic prose, compared to conversation, fiction and newspaper reportage. Common features of this genre include a high rate of occurrence of nouns, nominalizations, noun phrases with modifiers, attributive adjectives, derived adjectives, activity verbs, verbs with inanimate subjects, agentless passive structures and linking adverbials. By contrast, first and second person pronouns, private verbs, that-deletions and contractions occur very rarely in academic texts.

In addition, studies of vocabulary have emphasized the importance of a ‘sub-technical’ or ‘academic’ vocabulary alongside core words and technical terms in academic discourse (Nation, 2001: 187–216). Hinkel (2002: 257–65) argues that the exclusive use of a process-writing approach, the relative absence of direct and focused grammar instruction, and the lack of academic vocabulary development contribute to a situation in which non-native students are simply not prepared to write academic texts. She provides a list of priorities in curriculum design and writes that, among the top priorities, ‘NNSs [non-native students] need to learn more contextualized and advanced academic vocabulary, as well as idioms and collocations to develop a substantial lexical arsenal to improve their writing in English’ (Hinkel, 2002: 247). The Academic Word List (Coxhead, 2000) was compiled on the basis of corpus data to meet the specific vocabulary needs of students in higher education settings.

This book aims to provide a better description of the notion of ‘academic vocabulary’. It takes the reader full circle, from the extraction of potential academic words through their linguistic analysis in expert and learner corpus data, to the pedagogical implications that can be drawn from the results. But what is ‘academic vocabulary’? Despite its widespread use, the term has been used in various ways to refer to different (but often overlapping) vocabulary categories. Recent corpus-based studies have emphasized the specificity of different academic disciplines and genres. As a result, researchers such as Hyland and Tse (2007) question the widely held assumption that students need a common core vocabulary for academic study. They argue that the different disciplinary literacies undermine the usefulness of such lists and recommend that lecturers help students develop a discipline-based lexical repertoire.

This book is an attempt to resolve the tension between the particularizing trend which advocates the teaching of a more restricted, discipline-based vocabulary syllabus, and the generalizing trend which recognizes the existence of a common core ‘academic vocabulary’ that can be taught to a large number of learners in many disciplines. I first argue that, to resolve this tension, the concept of ‘academic vocabulary’ must be revisited. I demonstrate, on the basis of corpus data, that, as well as discipline-specific vocabulary, there is a wide range of words and phraseological patterns that are used to refer to activities which are characteristic of academic discourse, and more generally, of scientific knowledge, or to perform important discourse-organizing or rhetorical functions in academic writing. A large proportion of this lexical repertoire consists of core vocabulary, a category which has so far been largely neglected in EAP courses but which is usually not fully mastered by English as a foreign language (EFL) learners, even those at the high-intermediate or advanced levels.

I make use of Granger’s (1996a) Contrastive Interlanguage Analysis to test the working hypothesis that upper-intermediate to advanced EFL learners, irrespective of their mother tongue background, share a number of linguistic features that characterize their use of academic vocabulary. The learner corpus used is the first edition of the International Corpus of Learner English (ICLE), which is among the largest non-commercial learner corpora in existence. It contains texts written by learners with different mother tongue backgrounds. Ten ICLE sub-corpora representing different mother tongue backgrounds (Czech, Dutch, Finnish, French, German, Italian, Polish, Russian, Spanish, Swedish) are compared with a subset of the academic component of the British National Corpus (texts written by specialists in the Humanities) to identify ways in which learners’ use of academic vocabulary differs from that of more expert writers. A comparison of the ten sub-corpora then makes it possible to identify linguistic features that are shared by learners from a wide range of mother tongue backgrounds, and therefore possibly developmental. The EFL learners are all learning how to write in a foreign language, and they are often novice writers in their mother tongue as well. However, not all learner-specific features can be attributed to developmental factors. The comparison of several ICLE sub-corpora helps to pinpoint a number of patterns that are characteristic of learners who share the same first language, and which may therefore be transfer-related. I made use of Jarvis’s (2000) unified framework to investigate the potential influence of the first language on French learners’ use of academic vocabulary in English.

The book is organized in three sections. The first scrutinizes the concept of ‘academic vocabulary’, reviewing the many definitions of the term and arguing that, for productive purposes, academic vocabulary is more usefully defined as a set of options to refer to those activities that characterize academic work, organize scientific discourse, and build the rhetoric of academic texts. It then proposes a data-driven procedure based on the criteria of keyness, range, and evenness of distribution, to select academic words that could be part of a common core academic vocabulary syllabus. The resulting list, called the Academic Keyword List (AKL), comprises a set of 930 potential academic words. One important feature of the methodology is that, unlike Coxhead’s (2000) Academic Word List, the AKL includes the 2,000 most frequent words of English, thus making it possible to appreciate the paramount importance of core English words in academic prose.

The AKL is used in Section 2 to explore the importance of academic vocabulary in expert writing and to analyse EFL learners’ use of lexical devices that perform rhetorical or organizational functions in academic writing. This section offers a thorough analysis of these lexical devices as they appear in the International Corpus of Learner English, describing the factors that account for learners’ difficulties in academic writing. These factors include a limited lexical repertoire, lack of register awareness, infelicitous word combinations, semantic misuse, sentence-initial positioning of adverbs and transfer effects. The final section briefly comments on the pedagogical implications of these results, summarizes the major findings, and points the way forward to further research in the area.
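The Introduction describes comparing the wordlists of two corpora to find the vocabulary that is particularly salient in one of them, i.e. its keywords. As a rough illustration only, the sketch below scores each word with the log-likelihood statistic (listed as LogL in the abbreviations) and keeps the words that are relatively more frequent in the target corpus. The function names, the toy token lists and the cut-off of 15.13 (the chi-square critical value for p < 0.0001 with one degree of freedom) are assumptions made for this example. They are not the settings of WordSmith Tools or of the procedure used in the book.

```python
import math
from collections import Counter


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood score for one word observed in two corpora."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    # A zero observed frequency contributes nothing (0 * log 0 is taken as 0).
    if freq_target > 0:
        score += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score


def extract_keywords(target_tokens, reference_tokens, threshold=15.13):
    """Return (word, score) pairs that are unusually frequent in the target."""
    target_counts = Counter(target_tokens)
    ref_counts = Counter(reference_tokens)
    size_target = len(target_tokens)
    size_ref = len(reference_tokens)
    keywords = []
    for word, freq_target in target_counts.items():
        freq_ref = ref_counts[word]
        # Keep positive keywords only: relatively more frequent in the target.
        if freq_target / size_target <= freq_ref / size_ref:
            continue
        score = log_likelihood(freq_target, size_target, freq_ref, size_ref)
        if score >= threshold:
            keywords.append((word, score))
    return sorted(keywords, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data standing in for a target and a reference corpus.
academic = ["the"] * 60 + ["hypothesis"] * 40
fiction = ["the"] * 60 + ["dragon"] * 40
print(extract_keywords(academic, fiction))
```

With these toy lists, only "hypothesis" is returned: "the" is equally frequent in both lists, and "dragon" does not occur in the target list at all. Real corpora would of course be tokenized (and possibly lemmatized) text rather than hand-built lists.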


Part I: Academic vocabulary

‘Academic vocabulary’ is a term that is widely used in textbooks on English for academic purposes and Second Language Acquisition (SLA) reference books. Nevertheless, it can be understood in a variety of ways and used to indicate different categories of vocabulary. In this section, my objectives are to clarify the meaning of ‘academic vocabulary’ by critically examining its many uses, and to build a list of words that fit my own definition of the term. Chapter 1 therefore tries to identify the key features of academic vocabulary and to clear up the confusion between academic words and other vocabulary. Chapter 2 proposes a data-driven methodology based on the criteria of keyness, range and evenness of distribution, and uses this to build a new list of potential academic words, viz. the Academic Keyword List (AKL). This list is very different from Coxhead’s Academic Word List and has already been used to inform the writing sections in the second edition of the Macmillan English Dictionary for Advanced Learners (see Gilquin et al., 2007b). The AKL is used in Section 2 to analyse EFL learners’ use of lexical devices that perform rhetorical or organizational functions in academic writing.
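Alongside keyness, the methodology relies on range and evenness of distribution. As a hedged illustration of what such criteria can look like in practice, the sketch below counts the sub-corpora in which a word occurs (range) and computes Juilland's D, one widely used dispersion measure, from the word's relative frequencies per sub-corpus. This passage does not say which dispersion measure or thresholds the book applies, so the choice of Juilland's D, the function names and the cut-off values are all assumptions made for the example.

```python
import math
import statistics


def juilland_d(rel_freqs):
    """Juilland's D: 1.0 for a perfectly even spread, 0.0 for one sub-corpus only."""
    n = len(rel_freqs)
    mean = statistics.mean(rel_freqs)
    if n < 2 or mean == 0:
        return 0.0
    # Coefficient of variation, using the population standard deviation.
    variation = statistics.pstdev(rel_freqs) / mean
    return 1 - variation / math.sqrt(n - 1)


def passes_range_and_evenness(rel_freqs, min_range, min_d):
    """Check a word against hypothetical range and evenness thresholds."""
    # Range: number of sub-corpora in which the word occurs at least once.
    word_range = sum(1 for freq in rel_freqs if freq > 0)
    return word_range >= min_range and juilland_d(rel_freqs) >= min_d


# Hypothetical relative frequencies (e.g. per million words) in four sub-corpora.
evenly_spread = [5, 5, 5, 5]
concentrated = [8, 0, 0, 0]
print(passes_range_and_evenness(evenly_spread, min_range=4, min_d=0.8))
print(passes_range_and_evenness(concentrated, min_range=4, min_d=0.8))
```

A word that is key but confined to a single discipline would fail a check of this kind, which is the intuition behind combining the three criteria rather than relying on keyness alone.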


parallel. colleague. 2008). I will show why a definition of academic vocabulary that excludes the top 2.000 words of English is not very useful for productive purposes in higher education settings and argue for a function-based definition of the term. But what is academic vocabulary? The term often refers to a set of lexical items that are not core words but which are relatively frequent in academic texts. solution) or discourse-organizing vocabulary (e. regardless of the discipline. Recent titles include Essential Academic Vocabulary: Mastering the Complete Academic Word List (Huntley.g. consist. equivalent. as witnessed by the increasing number of textbooks on the topic. core words. differ. likewise. cause. The very existence of academic words has recently been challenged by several researchers in English for Specific Purposes (ESP) who advocate that teachers help students develop a more restricted.1. contrast. chemical. sub-technical words and discourseorganizing words. core vocabulary and technical terms Numerous second language acquisition studies have investigated whether there is a threshold which marks the point at which vocabulary knowledge . feature. Academic vocabulary vs. they appear in a large proportion of academic texts.g. transport and volunteer (cf. compare. Academic vocabulary is also sometimes used as a synonym for subtechnical vocabulary (e. I set out to review the many definitions of academic vocabulary that have been given and to clear up the confusion between academic words. Unlike technical terms. I will round off this chapter by situating the book in ongoing debates over generality vs. nuclear. disciplinary specificity in teaching vocabulary for academic purposes. In this chapter.Chapter 1 What is academic vocabulary? Academic vocabulary is in fashion. discipline-specific lexical repertoire. mouse. 2006) and Academic Vocabulary in Use (McCarthy and O’Dell. hypothetical. 1. Coxhead. 2000). technical terms. 
Examples of academic words include adult. bug. and identify).

becomes sufficient for adequate reading comprehension. Laufer (1989, 1992) has shown that at least 95 per cent coverage is needed to ensure reasonable comprehension of a text. To achieve this coverage, it is commonly believed that students in higher education settings need to master three lists of vocabulary: a core vocabulary of 2,000 high-frequency words, plus some academic words, and technical terms. Some researchers, however, do not agree that vocabulary categories can be described as if they were clearly separable. In this section, the notions of core vocabulary, academic vocabulary and technical terms are described and illustrated. The criticisms levelled at the division of vocabulary into mutually exclusive lists are then reviewed.

1.1.1. Core vocabulary

A core (or basic or nuclear) vocabulary consists of words that are of high frequency in most uses of the language. It comprises the most useful function words (e.g. a, about, be, by, do, he, I, some and to) and content words like bag, lesson, person, put and suggest. Stubbs describes nuclear words as an essential common core of ‘pragmatically neutral words’ (1986: 104) and lists five main reasons for their pragmatic neutrality:

1. Nuclear words have a ‘purely conceptual, cognitive, logical or propositional meaning, with no necessary attitudinal, emotional or evaluative connotations’ (ibid.).
2. They have no cultural or geographical associations.
3. They give no indication of the field of discourse from which a text is taken, i.e. its domain of experience and social settings.
4. They are also neutral with respect to tenor and mode of discourse: they are not restricted to formal or informal usage or to a specific medium of communication, e.g. written or spoken language.
5. They are used in preference to non-nuclear words in summarizing tasks.

The best-known list of core words is West’s (1953) General Service List of English Words (GSL),¹ which was created from a five-million word corpus of written English and contains around 2,000 word families. Percentage figures are given for different word meanings and parts of speech of each headword. In a variety of studies, the GSL provided coverage of up to 92 per cent of fiction texts (e.g. Hirsh and Nation, 1992), and up to 76 per cent of academic texts (Coxhead, 2000). Next to frequency and coverage, other

criteria such as learning ease, necessity and style were also used in making the selection (West 1953: ix–x). West also wanted the list to include words that are often used in the classroom or that would be useful for understanding definitions of vocabulary outside the list.

The GSL has had a wide influence for many years and served as a resource for writing graded readers and other material. A number of criticisms have, however, been levelled at the GSL, most particularly at its coverage and age. Engels (1968) criticized the low coverage of the second 1,000 word families. While the first 1,000 word families covered between 68 and 74 per cent of the words in the ten texts of 1,000 running words he analysed, the second set of word families in the GSL provided coverage of less than 10 per cent. In addition, several researchers have pointed out that, because of changes in the English language and culture, the GSL includes many words that are considered to be of limited utility today (e.g. crown, coal, ornament and vessel) but does not contain very common words such as computer, astronaut and television (see Nation and Hwang, 1995: 35–6; Leech et al., 2001: ix–x; Carter, 1998: 207). However, for educational purposes, it still remains the best of the available lists because of ‘its information on frequency of each word’s various meanings, and West’s careful application of criteria other than frequency and range’ (Nation and Waring 1997: 13).

1.1.2. Academic vocabulary

A number of academic word lists have been compiled to meet the specific vocabulary needs of students in higher education settings (e.g. Campion and Elley, 1971; Praninskas, 1972; Lynn, 1973; Ghadessy, 1979; Xue and Nation, 1984). The Academic Word List (Coxhead, 2000) is the most widely used today in language teaching, testing and the development of pedagogical material. It is now included in vocabulary textbooks (e.g. Schmitt and Schmitt, 2005; Huntley, 2006), vocabulary tests (e.g. Schmitt et al., 2001), computer-assisted language learning (CALL) materials, and dictionaries (e.g. Major, 2006).

The Academic Word List (AWL) was created from a corpus of 414 academic texts by more than 400 authors and totals around 3.5 million words. The Academic Corpus includes journal articles, chapters from university textbooks and laboratory manuals. It is divided into four sub-corpora of approximately 875,000 words representing broad academic disciplines: arts, commerce, law and science. Each sub-corpus is further subdivided into seven subject areas as shown in Table 1.1.
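The coverage figures quoted for the GSL in this section express the share of running words in a text that belong to a given word list. The following Python sketch shows one way such a figure could be computed. It is only an illustration under stated assumptions: the two word sets are invented toy data rather than the real GSL bands, the sample sentence is made up, and the function names `tokenize` and `coverage` are hypothetical. It also matches plain word forms, whereas the lists discussed here count word families.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(tokens, word_list):
    """Return the proportion of running words (tokens) found in word_list."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return known / len(tokens)


# Assumption: toy stand-ins for a first and a second frequency band.
# These are NOT the actual GSL word families.
first_band = {"the", "of", "is", "a", "and", "in", "to", "words", "are"}
second_band = {"coal", "crown", "vessel"}

# Assumption: an invented sample text.
sample = "The crown is a symbol and the vessel carried coal to the astronaut."
tokens = tokenize(sample)

cov_first = coverage(tokens, first_band)
cov_second = coverage(tokens, second_band)
cov_both = coverage(tokens, first_band | second_band)

print(f"running words: {len(tokens)}")
print(f"first band coverage: {cov_first:.1%}")
print(f"second band coverage: {cov_second:.1%}")
print(f"combined coverage: {cov_both:.1%}")
```

Counting word families instead of word forms would require mapping each token to its headword before the lookup, which this sketch leaves out.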

Table 1.1 Composition of the Academic Corpus (Coxhead 2000: 220)

Discipline   Running words   Texts   Subject areas
Arts         883,214         122     education, history, linguistics, philosophy, politics, psychology, sociology
Commerce     879,547         107     accounting, economics, finance, industrial relations, management, marketing, public policy
Law          874,723         72      constitutional law, criminal law, family law and medico-legal, international law, pure commercial law, quasi-commercial law, rights and remedies
Science      875,846         113     biology, chemistry, computer science, geography, geology, mathematics, physics
Total        3,513,330       414

Like the General Service List, the Academic Word List is made up of word families. Each family consists of a headword and its closely related affixed forms according to Level 6 of Bauer and Nation’s (1993) scale, which includes all the inflections and the most frequent and productive derivational affixes. For example, the words presumably, presume, presumed, presumes, presuming, presumption, presumptions and presumptuous are all members of the same family.

Coxhead (2000) selected word families to be included in the AWL on the basis of three criteria:

1. Specialized occurrence: a word family could not be in the first 2,000 most frequent words of English as listed in West’s (1953) General Service List.
2. Range: a word family had to occur in all four academic disciplines with a frequency of at least 10 in each sub-corpus and in 15 or more of the 28 subject areas.
3. Frequency: a word family had to occur at least 100 times in the Academic Corpus.

The resulting list consists of 570 word families and covers at least 8.5 per cent of the running words in academic texts. By contrast, it accounts for a very small percentage of words in other types of texts such as novels, suggesting that the AWL’s word families are closely associated with academic writing (Coxhead, 2000: 225). It is divided into 10 sublists ordered according to decreasing word-family frequency. Some of the most frequent word families included in Sublist 1 are headed by the word forms analyse, benefit, context, environment, formula, issue, labour, research, significant and

vary. Examples of the least frequent word families in Sublist 10 are assemble, colleague, depress, enormous, likewise, persist and undergo.

Academic words are likely to be problematic for native as well as non-native students as a large proportion of them are Graeco-Latin in origin and refer to abstract ideas and processes, thus introducing additional propositional density to a text (cf. Corson, 1997). Scarcella and Zimmerman (2005: 127) have also shown that mastery of derivative forms makes academic words particularly difficult for foreign language learners who often fail to analyse the different parts of complex words.

1.1.3. Technical terms

Domain-specific or technical terms are words whose meaning requires scientific knowledge. They are typically characterized by semantic specialization, resistance to semantic change and absence of exact synonyms (cf. Mudraya, 2006: 238–9). Technical terms occur with very high or at least moderate frequency within a very limited range of texts (Nation and Hwang, 1995). In biology, for example, we find words such as alleles, genotype, chromatid, cytoplasm and abiotic. These words are very unlikely to occur in texts from other disciplines or subject areas. Since technical terms are highly subject-specific, it is possible to identify them on the basis of their frequencies of occurrence, range and distribution (see Section 2.3) and to use them as a way of characterizing text types (Yang, 1986).

Technical vocabulary is difficult to quantify. According to Coxhead and Nation (2001), technical dictionaries contain probably 1,000 headwords or less per subject area. Research suggests that knowledge of domain-specific or technical terms allows learners to understand an additional 5 per cent of academic texts in a specific discipline.

As explained by Nation (2001: 203), some practitioners consider that it is not the English teacher’s job to teach technical terms. Language teachers are not specialists in chemistry, computer science, law or economics and may have a great deal of difficulty with technical words. By contrast, learners who specialize in the field may have little difficulty in understanding these words (Strevens, 1973: 228). These words are best learned through the study of the body of knowledge that they are attached to.

1.1.4. Fuzzy vocabulary categories

Although core words, academic words and technical terms are described as if they were clearly separable, the boundaries between them are fuzzy

(cf. Yang, 1986; Mudraya, 2006; Beheydt, 2005). As Nation and Hwang remark, ‘any division is based on an arbitrary decision on what numbers represent high, moderate or low frequency, or wide or narrow range, because vocabulary frequency, coverage and range figures for any text or group of texts occur along a continuum’ (1995: 37).

Chung and Nation (2003) investigate what kinds of words make up technical vocabulary in anatomy and applied linguistics texts. They classify technical terms on a four-level scale designed to measure the strength of the relationship of a word to a particular specialized field. Results for vocabulary in anatomy texts are given in Table 1.2. Chung and Nation consider items at Steps 3 and 4 to be technical terms, but not items at Steps 1 and 2.

Table 1.2 Chung and Nation’s (2003: 105) rating scale for finding technical terms, as applied to the field of anatomy

Step 1
Words such as function words that have a meaning that has no particular relationship with the field of anatomy, that is, words independent of the subject matter. Examples are: the, is, between, it, by, adjacent, amounts, common, commonly, directly, constantly, early and especially.

Step 2
Words that have a meaning that is minimally related to the field of anatomy in that they describe the positions, movements, or features of the body. Examples are: superior, part, forms, pairs, structures, surrounds, supports, associated, lodges, protects.

Step 3
Words that have a meaning that is closely related to the field of anatomy. They refer to parts, structures and functions of the body, such as the regions of the body and systems of the body. Such words are also used in general language. The words may have some restrictions of usage depending on the subject field. Examples are: chest, trunk, neck, abdomen, ribs, breast, cage, cavity, shoulder, skin, muscles, wall, heart, lungs, organs, liver, bony, abdominal, breathing. Words in this category may be technical terms in a specific field like anatomy and yet may occur with the same meaning in other fields where they are not technical terms.

Step 4
Words that have a specific meaning to the field of anatomy and are not likely to be used in general language. They refer to structures and functions of the body. These words have clear restrictions of usage depending on the subject field. Examples are: thorax, sternum, costal, vertebrae, pectoral, fascia, trachea, mammary, periosteum, hematopoietic, pectoralis, viscera, intervertebral, demifacets, pedicle.

A large proportion of technical words belong to the 2,000 most frequent word families of English as given in the GSL or to the AWL. In the anatomy texts, 16.3 per cent of the word types at Step 3 are from the GSL or AWL (e.g. neck, chest, cage, shoulder). This increases to 50.5 per cent in the applied linguistics texts (e.g. acquisition, input, interaction, meaning, review). A major result of this study is that a word can only be described as general service, academic or technical in context.

Similarly, Hancioğlu et al. argue that ‘the assumption that any high frequency word outside the GSL coverage in the academic corpus would be a de facto academic item perhaps accounts for the distinctly “un-academic” texture of some of the items on the list’ (Hancioğlu et al., 2008: 462). They also comment that the fact that ‘items such as study appear in the GSL (but not in the AWL) and items such as drama in the AWL (but not in the GSL), suggests that the division of vocabulary into mutually exclusive lists is likely to be an activity that for all its initial convenience may prove inherently problematic in the long run’ (ibid.: 463). For example, the AWL includes words that are extremely common outside academia (e.g. adult, drama, sex, tape) (Paquot, 2007a). On the other hand, it has been shown that the GSL contains words that appear with particularly high range and frequency in academic texts (e.g. argument, example, find, reason, result, show, use) (cf. Martínez et al., 2009: 192). These words may be used differently in academic discourse. Partington (1998: 98) has shown that a claim in academic or argumentative texts is not the same as in news reporting or a legal report.

Originating from research on vocabulary needs for reading comprehension and text coverage, the division between core words and academic words is very practical for assessing text difficulty and targeting words that are worthy of explanation when reading an academic text in the classroom. Most English for Academic Purposes (EAP) students recognize core words but are not familiar with the meaning of academic words such as amend, concept, implement, normalize, panel, policy, principle and rationalize, which are not very common in everyday English. These words are, however, relatively frequent in academic texts and students will most probably encounter them quite often while reading. They should therefore be the focus of an academic reading course.

The division of vocabulary into three mutually exclusive lists becomes problematic, however, when it is transposed to academic writing courses and the need arises to distinguish between knowing a word for receptive and productive purposes. As early as 1937, West argued that ‘both as regards Selection and still more as regards detailed Itemization, there is a need of a divorce between receptive and productive work’ (West, 1937: 437) and regretted that teachers were giving composite lessons aiming at teaching reading and speaking simultaneously, whereas

reading and speaking are the Hare and the Tortoise. Reading and speech bear the same relation to each other as musical appreciation and actual execution on the piano. The one is Recognition of a lot, the other is Skill in using a little. (ibid.)

either a more specialized list or a larger common core vocabulary. ˘ See Stein (2008) for a similar approach. It is questionable whether all the words from the AWL should be the focus of productive learning.709 word families categorized according to the number of lists in which they were represented. The resulting Billuroglu-Neufeld-List (BNL) consists ˘ of 2. e.000 words of the Brown have tried to revise the General Service List. 2001: 27–8). . Gillett’s website about vocabulary in EAP < http://www. Other examples include the GSL. (5) the revised version of the GSL.academicvocabularyexercises. The AWL. [.000 word families which contains both technical terms and all the general words necessary for reading comprehension and shows that it provides 95 per cent coverage of many basic engineering texts (see also Mudraya.g.. (6) the Longman Wordwise of commonly used words and (7) the Longman Defining Vocabulary. to ensure maximum utility for any learner. Billuroglu and Neufeld (2007) ˘ combined into one list all the words from: (1) the GSL. built an engineering word list of 2. produce it to express the intended meaning in the appropriate context. 2006) and CALL materials (see. (4) the first 5. And yet this strategy lies at the heart of several recent textbooks (e. (3) the first 2.16 Academic Vocabulary in Learner Writing Learning vocabulary for productive purposes has been found to be much more difficult than learning for receptive uses. for example.. Luton’s Exercises for the Academic Word List < http://www. Others. 2005. (2) the AWL. and use it with words that commonly occur with it (Nation. Knowing a word productively> and Haywood’s AWL Gapmaker <) Several scholars have suggested replacing separate lists of general service words. academic vocabulary and technical terms by a single list.uefap. Selection is thus a key issue in teaching vocabulary for academic writing and speaking. 
This procedure led to the emergence of only 176 word families that were not in either the GSL or the AWL.g. Coxhead (2000: 218) argues that this practice is supported by psycholinguistic evidence suggesting that morphological relations between words are represented in . being able to pronounce and/or spell it correctly. Schmitt and Schmitt. Ward (1999). 2006). by contrast. 1984) and recent domain-specific lists such as those developed by Ward (1999) and Mudraya (2006).000 words of the British National Corpus. regardless of specialization. ] much of the AWL would be absorbed into it’ (Hanciog lu et al. groups words into families. for example. Huntley. A final criticism that can be levelled at the AWL is related to the notion of a word family. as well as most word lists for learners of English. . 2008: 466). thus confirming that ‘if the GSL was enlarged by even a relatively small degree. the University Word List (Xue and Nation.htm>.

Table 1.3 Word families in the AWL

link       linkage, linkages, linked, linking, links
proceed    procedural, procedure, procedures, proceeded, proceeding, proceedings, proceeds
issue      issued, issues, issuing
evident    evidenced, evidence, evidential, evidently
item       itemisation, itemise, itemised, itemises, itemising, items
stress     stressed, stresses, stressful, stressing, unstressed
utilise    utilisation, utilised, utilises, utilising, utiliser, utilisers, utility, utilities, utilization, utilize, utilized, utilizes, utilizing

the mental lexicon. This may well be true and may justify the use of word families for receptive purposes. However, not all members of a word family are likely to be equally helpful in academic writing. For example, under the headword item, which has a relative frequency of 134.29 occurrences per million words in the academic part of the British National Corpus (see Section 3.3), we find the noun itemisation and word forms of the verb itemise. However, these two lemmas are quite rare in academic writing, with relative frequencies of 0.06 and 1.17 occurrences per million words respectively.

A related problem is that parts-of-speech are not differentiated. Table 1.3 shows several word families taken from the AWL: the only information provided is that the words in italics are the most frequent form of their family. This, however, does not tell us whether the word forms issue and issues (under the headword issue) are more often used as nouns or verbs in EAP.

1.2. Academic vocabulary and sub-technical vocabulary

Like Coxhead (2000), Nation (2001: 187–96) uses the term ‘academic vocabulary’ to refer to words that are not in the top 2,000 words of English but which occur reasonably frequently in a wide range of academic texts. Unlike Coxhead, however, he also uses it to label a whole set of lexical items also known as ‘sub-technical vocabulary’ (Cowan 1974; Yang, 1986; Baker, 1988; Mudraya, 2006), ‘semi-technical vocabulary’ (Farrell, 1990), ‘non-technical terms’ (Goodman and Payne, 1981), and ‘specialised non-technical lexis’

(Cohen et al., 1988). However, all these terms have been used quite differently in the literature.

Cowan defines sub-technical vocabulary as ‘context independent words which occur with high frequency across disciplines’ and comments that,

Clearly some of what I am calling sub-technical vocabulary would be encompassed in the existing word frequency counts like Thorndike Lorge, Michael West’s General Service List and the recent one million word computer analysis by Henry Kučera and Nelson Francis. (Cowan, 1974: 391)

Cowan’s definition of sub-technical vocabulary applies to those words that have the same meaning in several disciplines. Trimble (1985) extends Cowan’s (1974) usage to include ‘those words that have one or more “general” English meanings and which in technical contexts take on extended meanings’ (Trimble 1985: 129). In biology, for example, the adjective specific may also be used with reference to the genetic notion of specificity, which is a characteristic of enzymes. Trimble’s definition thus encompasses words such as junction, circuit, wage and cage that would be categorized as technical terms according to Chung and Nation’s (2003) four-level rating scale of technicality or field-specificity (see Table 1.2) (see also Farrell, 1990: 37).

Cohen et al. (1988) regard the extended meanings of what they call ‘non-technical’ words as a major area of difficulty for non-native readers who may only be aware of one of their meanings. A second area of difficulty arises because non-technical words may be used in contextual paraphrases to refer to the same concept (e.g. repair processes and repair mechanism in a genetics text), thus causing problems of lexical cohesion at the level of synonymy. Cohen et al. (1988) identify a subset of non-technical vocabulary, viz. ‘specialized non-technical lexis’, as a third area of difficulty. They do not offer a precise definition of the term, but explain that this lexis includes vocabulary items indicating, for example, time sequence, measurement, or truth validity. They show that a large proportion of vocabulary items which indicate time sequence or frequency in a genetics text are unknown to their informants (e.g. alternatively, consecutively, ensuing, intermittently, subsequent and successive).

In Li and Pemberton’s (1994) view, sub-technical vocabulary as defined by Trimble (1985) is an important subset of academic vocabulary. They showed that first-year computer science students are better able to recognize the technical meanings of sub-technical words than their non-technical meanings. For example, they are quite familiar with the technical meaning of the verb compile in computer science and tend to interpret it as ‘convert

or translate a language into a machine code’ or ‘translate’ regardless of the context in which the word occurs. This is problematic as the non-technical meaning of a sub-technical word is often more common than its technical meaning (see Mudraya, 2006). For example, the word solution is more frequently used in its non-technical sense in engineering textbooks, even in a chemical engineering thermodynamics textbook.

Baker (1988) has argued that this middle area between core and technical vocabulary is itself made up of several different types of vocabulary:

1. Items which express notions shared by all or several specialized disciplines. Examples include factor, method and function.
2. Items which have a specialized meaning in a particular field, in addition to a different meaning in general language (e.g. bug in computer science, solution in mathematics and chemistry).
3. Items which are not used in general language but which have different technical meanings in different disciplines (e.g. morphological in linguistics, botany and biology).
4. General language items which have restricted meanings in one or more disciplines. In botany, ‘genes which are expressed have observable effects, i.e. are more apparent physically, as opposed to being masked. Expressed in botany is therefore not associated with emotional or verbal behaviour as is the case in general language’ (Baker, 1988: 92).
5. General language items which are used, in preference to other semantically equivalent items, to describe or comment on technical processes and functions. For example, an examination of biology textbooks showed that photosynthesis does not happen but takes place or occasionally occurs. Baker thus comments that take place and occur can be regarded as sub-technical words.
6. Items which are used in academic texts to perform specific rhetorical functions. These are ‘items which signal the writer’s intentions or his evaluation of the material presented’ (Baker, 1988: 92).

Martin uses the term academic vocabulary as a synonym for sub-technical vocabulary to refer to words that ‘have in common a focus on research, analysis and evaluation – those activities which characterize academic work’ (1976: 92). The vocabulary of the research process consists primarily of verbs, nouns and their co-occurrences (e.g. state the hypothesis and expected results, plan or design the experiment, present the methodology, develop a model). The vocabulary of analysis includes high-frequency verbs and two-word verbs that are ‘often overlooked in teaching English to foreign students but

Figure 1. according to the Oxford English Dictionary (OED). consist of. explanation.000 most frequent words of English. group. medicine. develop.g.1 shows that the various definitions of sub-technical vocabulary and academic vocabulary as defined in Section 1. the adjective nuclear has extended senses in astronomy. The AWL also contains several sub-technical words as defined by Martin (1976) (e. report. or to words that allow scholars to conduct research. Trimble’s definition of ‘sub-technical vocabulary’). describe. observe. 1985). significant. and study are among the top 2. the many definitions of sub-technical vocabulary proposed in the literature cover very different sets of lexical items. plan. model. 44 per cent were sub-technical. the verb enable has a specialized meaning in computer science (‘to make (a device) operational.2 partially overlap. result). sociology. e. bring about. found that out of 508 lemmas occurring more than five times in a corpus of electronic texts. . cause. function) but a large number of them do not fall within Coxhead’s definition of academic vocabulary. referring to words that take on extended meanings in specific academic disciplines (Trimble. for example. analyse data and evaluate results (Martin 1976). The same is true of Baker’s (1988) category of words that perform rhetorical functions: case. hypothesis. biology. derive. psychoanalysis. to turn on’).20 Academic Vocabulary in Learner Writing which graduate students need in order to present information in an organized sequence’ (ibid: 93). result from.1. Definitions of sub-technical vocabulary also differ widely. Coxhead’s (2000) Academic Word List includes a large proportion of the words that take on extended meanings in specialised fields (cf. be noted for. compare. Many of these are general service words (e. Adjectives and adverbs make up a large proportion of the vocabulary of evaluation. 
Baker (1988) uses the term as a broad category for different types of lexical sets including both Trimble’s (1985) sub-technical vocabulary and Martin’s (1976) academic vocabulary. base on. method. Sub-technical vocabulary is generally defined as a category of words which are frequent across disciplines and account for a significant proportion of word tokens in academic texts. linguistics and phonetics. cause.g. cause. The noun error refers to ‘the quantity by which a result obtained by observation or by approximate calculation differs from an accurate determination’ in mathematics.g. Farrell (1990). which are of various sizes and may share certain characteristics. group. This category will be the focus of the next section as it is itself made up of various sets of lexical items and Baker (1988) suggested that it is the most difficult type of sub-technical vocabulary to teach and acquire. For example. In summary.

[Figure 1.1 The relationship between academic and sub-technical vocabulary. Diagram of four partially overlapping sets: Baker's (1988) sub-technical vocabulary, Coxhead's (2000) academic vocabulary, Martin's (1976) academic vocabulary and Trimble's (1985) sub-technical vocabulary.]

1.3. Vocabulary and the organization of academic texts

Baker (1988) gave the following examples of sub-technical words that are used to perform rhetorical functions: ‘One explanation is that…’, ‘Others have said …’, and ‘It has been pointed out by . . . that . . .’. These words bear a strong relationship with what Winter (1977) called ‘Vocabulary 3 items’ and Widdowson (1983) ‘procedural vocabulary’.

Winter (1977: 14–23) distinguished between three types of words that are commonly used to create cohesion or structure in discourse and that are essential to the understanding of clause relations. Each group is distinguished by its clause-relating functions. Vocabulary 1 consists of ‘subordinators’ which either connect clauses together (e.g. although, except that, unless, whereas) or embed one clause within another (e.g. as far as, not so much . . . as . . ., let alone . . .). Vocabulary 2 comprises ‘sentence connectors’ which ‘make explicit the clause relation between the matrix clause and the preceding clause or sentence’ (Winter, 1977: 15), e.g. anyway, for example, hence, indeed, in other words, that is to say, therefore, thus. Vocabulary 3 items serve to establish semantic relations in the connection of clauses or sentences in discourse. They behave grammatically like subject, verb, object or complement and can be pre- or post-modified. These words ‘may be used to make the relation explicit by saying what the relation is’ (Winter, 1977: 22). Examples include addition, affirm, alike, analogous, cause, compare, connect, consequence, contradict, differ, explanation, feature, hypothetical, identify, method, reason, result, specify and subsequent.

Vocabulary 3 items include a large proportion of nouns that are inherently unspecific and require lexical realization in their co-text, either beforehand or afterwards. Francis (1994) refers to this type of lexical cohesion as ‘advance’ and ‘retrospective labelling’: labels² allow the reader to predict the precise information that will follow when they occur before their lexical realization and they encapsulate and package a stretch of discourse when they occur after their realization (e.g. approach, area, aspect, case, matter, move, problem, and way).

Labels have traditionally been described as content words. However, they are part of what Widdowson (1983) called ‘procedural vocabulary’, i.e. highly context-dependent items with very little lexical content which serve to do things with the content-bearing words and draw attention to the function that a stretch of discourse is performing (see also Harris, 1997; Luzón Marco, 1999). As such, when we encounter them in a text, we often need to do ‘something similar to what we do when we encounter words like it, he and do in texts: we either refer to the bank of knowledge built up with the author, look back in the text to find a suitable referent, or forward,

anticipating that the writer will supply the missing content’ (Carter and McCarthy, 1988: 206–7).

Within the category of labels, Francis identified a set of nouns which are ‘metalinguistic in the sense that they label a stretch of discourse as being a particular type of language’ (1994: 89). Metalinguistic labels are of four types, although there is some overlap between them:

1. Illocutionary nouns are nominalizations of verbal processes, e.g. advice, answer, argument, assertion, claim, observation, recommendation, remark, reply, response, statement, suggestion, etc.
2. Language-activity nouns refer to language activities and the results thereof, e.g. comparison, contrast, definition, description, detail, example, illustration, instance, proof, reasoning, reference, summary, etc.
3. Mental process nouns refer to cognitive states and processes and the results thereof, e.g. analysis, assumption, attitude, belief, concept, conviction, finding, hypothesis, idea, insight, interpretation, opinion, position, theory, thesis, view, etc.
4. Text nouns refer to the formal textual structure of discourse, e.g. excerpt, phrase, quotation, section, term, words.

As pointed out by Nation, the strength of labels as discourse organizing vocabulary is that ‘they have a referential function and variable meaning like pronouns but, unlike pronouns, they can be modified by demonstrative pronouns, numbers, and adjectives, they can occur in various parts of a sentence and they have a significant constant meaning’ (2001: 212).

As well as representing text segments, labels ‘additionally give us indications of the larger text-patterns the author has chosen, and build up expectations concerning the shape of the whole discourse’ (McCarthy, 1991: 76). The following words typically cluster round the elements of problem-solution patterns: concern, difficulty, dilemma, hinder, obstacle, respond, consequence, effect and result (see also Jordan, 1984; Hoey, 1993, 1994; Flowerdew, 2008; Nation, 2001: 211). As explained by McCarthy, ‘the language learner who has trouble with such words may be disadvantaged in the struggle to decode the whole text as efficiently as possible and as closely as possible to the author’s designs’ (McCarthy, 1991: 76).

Labels not only cluster around elements of macro patterns, they are also characterized by their specific collocational environment as shown by Francis:

there is a tendency for the selection of a label to be associated with common collocations. Many labels are built into a fixed phrase or ‘idiom’

(in the widest sense of the word), representing a single choice. Frequent collocations include, for example, ‘the move follows . . .’, ‘. . . rejected/denied the allegations’, ‘. . . to solve a problem’, and ‘. . . to reverse the trend’, where the retrospective label is found in predictable company (. . .). Even where the collocations are less fixed, the label occurs in a compatible lexical environment. (1994: 100–1)

More generally, Baker comments that sub-technical words which perform specific rhetorical functions and structure the writer’s argument ‘should not be taught in isolation but in context and as central elements in typical collocations’ (Baker, 1988: 103).

Labels are not the only indicators of text patterns in academic discourse. The claim-counterclaim pattern, for example, is often organized with verbs such as assert and state, adjectives like false and likely, the preposition according to and adverbs such as apparently and arguably.

Meyer (1997) commented that nontechnical words ‘provide a semantic-pragmatic skeleton for the text. They determine the status of the (more or less technically phrased) propositions that are laid down in it, and the relations between them’ (Meyer, 1997: 9). For example, these words express temporal deixis (e.g. since, currently), scholarly speech acts (e.g. indicate, suggest, distinguish, define), epistemic relations between the subject matter and the scholar (e.g. theory, proposal, show, seem), modality (e.g. may, obviously, likely), relations between entities (e.g. alike, likewise, parallel, follow, link, involve), classifiers of entities (e.g. problem, method, original, characteristic), quantitative changes of entities (e.g. increase, raise, rise, fluctuation), and textual deixis (e.g. above, later) (ibid: 10–11). Meyer’s ‘non-technical vocabulary’ is therefore closely related to the notion of metadiscourse, i.e. ‘a specialized form of discourse which allows writers to engage with and influence their interlocutors and assist them to interpret and evaluate the text in a way they will see as credible and convincing’ (Hyland, 2005: 60).

Similarly, in Building Academic Vocabulary, Zwier focused on lexical items that are ‘particularly useful in the kinds of writing most common in EAP writing classes – general description, description of processes (especially those involving changes), comparison/contrast, and cause/effect’ (Zwier, 2002: xiii) and described the way in which words such as consist of, comprise, stem from, arising from, and yield are used to perform specific rhetorical functions in academic discourse. He devoted particular attention to verbs because ‘accurate verb use is especially difficult for academic writers’ (Zwier, 2002: xi) (see also Swales and Feak, 2004).

As shown in the previous sections, the term ‘academic vocabulary’ has been used extensively in the literature to refer to various sets of lexical

items. However, its very existence has recently been challenged by several ESP researchers.

1.4. Is there an ‘academic vocabulary’?

In an article entitled ‘Is there an “academic vocabulary”?’, Hyland and Tse questioned the widely held assumption that ‘a single inventory can represent the vocabulary of academic discourse and so be valuable to all students irrespective of their field of study’ (Hyland and Tse, 2007: 238). They made use of Coxhead’s (2000) Academic Word List and showed that the coverage of AWL items in a corpus of 3.3 million words from a range of academic disciplines is not evenly distributed. The disciplines that make up the corpus are biology, physics and computer science (sciences sub-corpus), mechanical and electronic engineering (engineering sub-corpus), and sociology, business studies and applied linguistics (social sciences sub-corpus). Of the 570 AWL families, 534 (94%) have irregular distributions across the sciences, engineering, and social sciences sub-corpora, with, in many cases, a majority of the occurrences located in just one domain. Of these, 227 (40%) have at least 60 per cent of all occurrences concentrated in just one sub-corpus. By contrast, only 36 word families were found to be relatively evenly distributed across the sub-corpora. Overall, 78 families were extremely infrequent in one sub-corpus, 63 in two sub-corpora and 6 in all three.

Hyland and Tse further argued that ‘all disciplines shape words for their own uses’ (ibid: 240) as demonstrated by their clear preferences for particular meanings and collocations. They gave the example of the word process which is far more likely to be encountered as a noun by science and engineering students than by social scientists. An investigation of a set of potential homographs in the AWL revealed a considerable amount of semantic variation across fields. Science and engineering students, for example, are very unlikely to come across the noun volume in the meaning of ‘a book or journal series’ unless they are reading book reviews. They also showed that the verb analyse tends to refer to ‘methods of determining the constituent parts or composition of a substance’ in engineering, while in the social sciences it often simply means ‘considering something carefully’ (ibid: 244). In addition, words may take on additional discipline-specific meanings as a result of their regular co-occurrence with other items. The noun strategy, for example, often appears in the multi-word unit marketing strategy in business, learning strategy in applied linguistics and coping strategy

in sociology. The authors concluded that ‘By considering context, cotext, and use, academic vocabulary becomes a chimera’ (ibid: 250).

These findings pose a tremendous challenge to the growing number of students who enrol in interdisciplinary programmes and to English teachers who are regularly faced with mixed groups of students, most notably in international EAP programmes (cf. Huckin, 2003; Bhatia, 2002; Eldridge, 2008). Two decades ago, in an article on second language teaching for academic achievement, Saville-Troike insisted that ‘vocabulary knowledge is the single most important area of second language competence when learning content through that language is the dependent variable’ (1984: 199). EAP courses need to ensure that sufficient attention is given to vocabulary development (cf. Sutarsyah et al., 1994: 37). That being the case, if academic vocabulary is a chimera, the problem is to determine what words EAP tutors should teach mixed groups of students.

Granger and Paquot (2009a) advocate a ‘happy medium’ approach which concurs with Hyland and Tse’s rejection of approaches of EAP as ‘an undifferentiated unitary mass’ (Hyland and Tse, 2007: 247) while also subscribing to Eldridge’s claim that ‘though one function of research is to unravel what distinguishes different fields and genres, another function is to find similarities and generalities that will facilitate instruction in an imperfect world’ (Eldridge, 2008: 111). This balanced approach aims to reconcile research findings and the reality of EAP teaching practice. An investigation of the verb analyse in a corpus of 1,701,351 words of business, linguistics and medicine articles has shown that it is possible to identify both the common core features of an academic word and its discipline-specific characteristics in terms of meaning, lexico-grammar and phraseological patterns (Granger and Paquot, 2009a). Wang and Nation commented that ‘learners should be encouraged to look for the central concept behind a variety of uses’ (2004: 310). As regards the verb analyse, this central concept can be defined as ‘to examine data using specific methods or tools in order to make sense of it’, with the ‘data’ and the ‘methods or tools’ varying across academic fields. In linguistics, for example, analyse is also often used in the sense of carrying out a statistical analysis, submitting data to computer-aided analyses or distinguishing the constituents of a word, phrase or sentence. The lexico-grammatical environment of the verb will help differentiate its ‘distinct (though not unconnected)’ (Hoey, 2005: 105) senses (see also Sinclair, 1987, 1991). It is only by invoking more general definitions of this type that EAP tutors will help L2 learners deal with the various uses of verbs that they may come across even within a single discipline.
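The 60 per cent concentration criterion reported from Hyland and Tse (2007) in this section can be made concrete with a short sketch. It is only an illustration: the counts below are invented, the function names `top_share` and `is_concentrated` are hypothetical, and the sketch assumes that the sub-corpora are of comparable size (or that the counts have already been normalized), which the original study may have handled differently.

```python
def top_share(counts):
    """Return (sub-corpus with most occurrences, its share of all occurrences)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no occurrences in any sub-corpus")
    top = max(counts, key=counts.get)
    return top, counts[top] / total


def is_concentrated(counts, threshold=0.60):
    """True if a single sub-corpus holds at least `threshold` of all occurrences."""
    _, share = top_share(counts)
    return share >= threshold


# Invented counts for illustration only; not taken from Hyland and Tse (2007).
FAMILY_COUNTS = {
    "strategy": {"sciences": 40, "engineering": 55, "social_sciences": 405},
    "analyse": {"sciences": 120, "engineering": 310, "social_sciences": 95},
}

if __name__ == "__main__":
    for family, counts in FAMILY_COUNTS.items():
        domain, share = top_share(counts)
        flag = is_concentrated(counts)
        print(f"{family}: {share:.0%} of occurrences in {domain}; concentrated={flag}")
```

A word family flagged by such a check would be a weak candidate for a cross-disciplinary list, which is the concern that the ‘happy medium’ approach tries to balance against teaching practice.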

1.5. Summary and conclusion

There have been several studies that have investigated the vocabulary needed for academic study. Some of them have assumed that learners already knew the 2,000 most frequent words of English and looked at academic texts to see what words not in the core vocabulary occur frequently across a range of academic disciplines. The Academic Word List (Coxhead, 2000) consists of 570 word families that are not in West’s (1953) General Service List but which have wide range and occur reasonably frequently in a 3,500,000 word corpus of academic texts. This list is very useful for students entering university, as well as being an excellent resource for preparing for the reading test in International English Certificates such as TOEFL and IELTS. It proves helpful in setting feasible learning goals and assessing vocabulary learning.

Defining academic vocabulary in opposition to core words, however, is of limited use when the role words play in academic discourse is examined. As shown by Martínez et al. (2009: 192), the verbs show, find and report are not presented as academic words because they are part of the GSL. They perform, nevertheless, the same rhetorical function of reporting research as establish, conclude, and demonstrate and are often more frequent than these three AWL verbs in academic texts. These GSL verbs therefore also deserve careful attention in the academic writing classroom.

I agree with Hancioğlu and her colleagues that EAP practitioners should ‘avoid taking the GSL as any kind of “given” in the compilation of more specialized wordlists’ (Hancioğlu et al., 2008: 464). I do not, however, subscribe to the idea according to which we ‘should seriously consider putting aside the idea of a distinct discrete-item Academic Word List’ (Hancioğlu et al., 2008: 468). The construct of academic vocabulary remains a useful one which is, however, in need of a more precise definition (cf. Beheydt, 2005). That definition should rely on the work of researchers such as Martin (1976) and Meyer (1997) who focused on the nature and role of words that occur across subject-oriented texts, irrespective of the disciplines. Martin (1976) discussed words that are useful instruments in the description of activities that characterize academic work, that is, research, analysis and evaluation. Meyer (1997) focused on words that provide a semantic-pragmatic skeleton for academic texts and identified a number of lexical subsets that fulfil important rhetorical and organizational functions in academic discourse (e.g. expressing modality, textual deixis, scholarly speech acts). More recently, Martínez et al. commented that academic words should serve ‘to build the rhetoric of a text, providing words useful

for the construction of the argument of science’ (2009: 193).

All in all, it seems reasonable to argue that, for productive purposes, academic vocabulary would be more usefully defined as a set of options to refer to those activities that characterize academic work, organize scientific discourse and build the rhetoric of academic texts. Academic words in their ‘functional’ sense should be useful to biologists, agronomists, physicists, historians, sociologists, economists, lawyers, linguists and computer scientists writing in higher education settings but not to novelists, poets or playwrights. Following Coxhead (2000), this lexical set should therefore be reasonably frequent in a wide range of academic texts but relatively uncommon in other kinds of texts. This frequency-based criterion is not regarded as a defining property of academic words but as a way of operationalizing a function-based definition of academic vocabulary. There are, however, two major differences between this proposal and Coxhead’s work:

– The 2,000 most frequent words of English may be part of a list of academic vocabulary.
– Words that are reasonably frequent in a wide range of academic texts but relatively uncommon in other kinds of texts will not be granted the status of academic words automatically.

The next step is to build a list of academic words according to this definition and it remains to be seen whether this can be done automatically. In the next chapter, I will investigate whether academic words can be automatically extracted from corpora. To weed out those words that are not specific to academic texts, I will use a number of corpus linguistics techniques, and more particularly, keyword analysis.
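The frequency-based criterion stated above (reasonably frequent in academic texts, relatively uncommon elsewhere) is what a keyword analysis tests word by word. The sketch below uses the log-likelihood statistic, which is one widely used keyness measure; the section itself does not say which statistic is applied, so this choice, the function names, the corpus sizes and the cut-off of 15.13 (the conventional critical value for p < 0.0001 with one degree of freedom) are all assumptions made for illustration.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word: study corpus vs. reference corpus."""
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def is_positive_keyword(freq_study, size_study, freq_ref, size_ref, critical=15.13):
    """True if the word is significantly MORE frequent in the study corpus."""
    overused = freq_study / size_study > freq_ref / size_ref
    return overused and log_likelihood(freq_study, size_study, freq_ref, size_ref) >= critical


if __name__ == "__main__":
    # Hypothetical counts: 100 hits per million words in academic prose,
    # 20 hits per million words in a non-academic reference corpus.
    print(round(log_likelihood(100, 1_000_000, 20, 1_000_000), 2))
    print(is_positive_keyword(100, 1_000_000, 20, 1_000_000))
```

The direction check matters because the statistic alone does not distinguish words that are over-represented in academic texts from words that are under-represented in them.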

Chapter 2

A data-driven approach to the selection of academic vocabulary

In this chapter, I describe the data-driven approach used to extract potential academic words from corpora. The term ‘potential academic words’ is used to refer to words that are reasonably frequent in a wide range of academic texts but relatively uncommon in other kinds of texts and which, as such, might be used to refer to those activities that characterize academic work, organize scientific discourse and build the rhetoric of academic texts, and so be granted the status of academic vocabulary. The method used to extract potential academic words is based on Rayson’s (2008) data-driven approach, which draws on both the ‘corpus-based’ and the ‘corpus-driven’ paradigms in corpus linguistics.

Rayson (2008) identified two general kinds of research question that can be investigated using a corpus-based paradigm. The majority of corpus-based studies tend to focus on a particular linguistic feature, possibly a word, lemma, multiword expression or a grammatical construction. They examine ‘linguistic (lexical or grammatical associations of the feature), and non-linguistic aspects (distribution of the feature across different types of texts or speech)’ (Rayson 2008: 520). Other corpus-based studies invert this relationship and investigate the characteristics of whole texts or language varieties, by examining how certain linguistic features appear in a text (e.g. Biber, 1988). Common to all corpus-based studies is the prior selection of which linguistic features to study. Rayson proposed a different approach: ‘decisions on which linguistic features are important or should be studied further are made on the basis of information extracted from the data itself; in other words, it is data-driven’ (2008: 521). This model is set out in five main steps:

1. Build: corpus design and compilation.
2. Annotate: manual or automatic analysis of the corpus.

3. Retrieve: quantitative and qualitative analyses of the corpus.
4. Question: devise a research question or model (iteration back to Step 3).
5. Interpret: interpretation of the results or confirmation of the accuracy of the model.

Studies that make use of the data-driven approach first focus on whole texts (Step 3) and then refine the research question or suggest specific linguistic features to study in further detail (Step 4). Research questions emerge from iterative analyses of the corpus data.

The model bears some similarity to corpus-driven linguistics as presented by Tognini-Bonelli (2001: 85), in which the corpus is the main informant. However, Rayson (2008) uses the term ‘data-driven’ to distinguish this approach from the corpus-driven paradigm. Corpus-driven linguists question the ‘underlying assumptions behind many well established theoretical positions’ (Tognini-Bonelli, 2001: 48), stating that pre-corpus theories need to be re-examined in the light of evidence from corpora. They also have strong objections to corpus annotation (see McEnery et al., 2006: 8). Rayson’s (2008) data-driven method thus combines elements of both the corpus-based and the corpus-driven approaches. It relies on pre-existing part-of-speech tagsets but considers corpus data as ‘the starting point of a path-finding expedition that will allow linguists to uncover new grounds, new categories and formulate new hypotheses on the basis of the patterns that were observed’ (De Cock, 2003: 197). The model is also testimony to the fact that the ‘distinction between the corpus-based vs. corpus-driven approaches to language studies is overstated. In particular the latter approach is best viewed as an idealized extreme’ (McEnery et al., 2006: 9–10).

Following Rayson’s (2008) data-driven approach, I first detail the corpora used (Step 1) and the type of annotation adopted (Step 2). I then focus on the different steps undertaken to retrieve potential academic words. The keyword procedure is first used to retrieve a set of words which are distinctive of academic writing. The advantages and disadvantages of a keyword list are discussed and the criteria of range and evenness of distribution are proposed to refine the list of potential academic words (Steps 3 and 4). In the last part of the chapter, I give a description of the final list of potential academic words, the Academic Keyword List, and investigate whether its constituents fit my definition of academic vocabulary. This provides a check on the accuracy of the retrieval procedure (Step 5).
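The two refinement criteria named above, range and evenness of distribution, can be sketched as a simple filter over per-sub-corpus frequencies. The overview does not say at this point how evenness is measured, so the sketch assumes Juilland's D, a common dispersion statistic; the function names and any thresholds passed to `passes_filters` are hypothetical. Frequencies are assumed to be relative (for example per million words) so that sub-corpora of unequal size are comparable.

```python
import math


def word_range(freqs):
    """Number of sub-corpora in which the word occurs at least once."""
    return sum(1 for f in freqs if f > 0)


def juilland_d(freqs):
    """Juilland's D: 1 means perfectly even distribution, 0 means one sub-corpus only."""
    n = len(freqs)
    if n < 2:
        raise ValueError("at least two sub-corpora are needed")
    mean = sum(freqs) / n
    if mean == 0:
        return 0.0
    # Population standard deviation of the sub-corpus frequencies.
    sd = math.sqrt(sum((f - mean) ** 2 for f in freqs) / n)
    coefficient_of_variation = sd / mean
    return 1 - coefficient_of_variation / math.sqrt(n - 1)


def passes_filters(freqs, min_range, min_d):
    """Keep a keyword only if it is both widespread and evenly distributed."""
    return word_range(freqs) >= min_range and juilland_d(freqs) >= min_d


if __name__ == "__main__":
    even = [52.0, 48.0, 50.0, 50.0]   # hypothetical relative frequencies
    skewed = [200.0, 0.0, 0.0, 0.0]
    print(passes_filters(even, min_range=4, min_d=0.8))
    print(passes_filters(skewed, min_range=4, min_d=0.8))
```

A keyword that fails either test would be dropped before the remaining candidates are inspected in context.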

g.1. Mudraya.596 203.Selection of academic vocabulary 31 2.490 146. This division into ‘knowledge domains’ (Hyland. ‘novice writers do not (. science). As Nesi et al. arts.g. 1978.322 132. The MC comprises 33 book sections and the B-BNC is made up of 30 book sections and extracts from scientific journals. both corpora consist of five sections of about 200. Their early attempts at academic writing are more likely to be assessed texts produced in the context of a course study.026. 2000. Texts in the B-BNC were written by British scholars while the MC also includes texts written by American researchers. journal articles and textbooks.060 180.1.678 283. corresponding to five broad academic domains (e. notably student essays.612 219. The professional academic corpora used are the Micro-Concord Corpus Collection B (MC) and the Baby BNC Academic Corpus (B-BNC).000 words each.1 Corpus MC The corpora of professional academic writing Variety of English mainly British English Text type books Number of words 1.067 Arts Belief and religion Science Applied science Social science B-BNC Humanities Politics.496 199. 1990). . .041 2.007 262. or for a readership of strangers. social science. ) begin by writing for publication. The corpora contain about a million words of published academic prose each. however. Academic writing. Table 2.021.005. (2004: 440) comment. 2009: 62–5) is particularly well suited to extracting words that are used by all members of the ‘academic discourse community’ (Swales.316 202.476 196. Coxhead.302 1. As shown in Table 2. includes other kinds of text than professionally edited articles and books.’ The automatic selection of potential academic words for this study was therefore made on the basis of an analysis of both professional and student writing. Corpora of academic writing Corpus-based studies of vocabulary in academic discourse (e. Johansson. 2006) have principally considered book sections. 
education and law Social science Science Technology and engineering TOTAL British English books and periodicals .

As shown in Table 2.2, two corpora were compiled from the MC and the B-BNC: a corpus of professional ‘soft science’ (ProfSS) and a corpus of professional ‘hard science’ (ProfHS).

Table 2.2 The re-categorization of data from the professional corpus into knowledge domains
Columns: Corpus; Number of words
ProfSS: MC Arts; MC Belief and religion; MC Social science; BNC Humanities; BNC Politics, education and law; BNC Social science
ProfHS: MC Science; MC Applied science; BNC Science; BNC Technology and engineering

For centuries the traditional dividing line in the history of academia has been between the natural sciences and technology (‘hard sciences’), and humanities and the social sciences (‘soft sciences’). It is across this dividing line that ‘we tend to see the clearest discoursal variation and rhetorical distinctiveness’ (Hyland, 2009: 63). There seem to be good reasons for taking knowledge domains as the point of departure for identifying potential academic words.

Two corpora of student writing were also used: part of the Louvain Corpus of Native Speaker Essays (LOCNESS) and a selection of texts from the British Academic Written English (BAWE) Pilot Corpus. Together they constitute the Student Writing Corpus. LOCNESS totals 323,304 words and consists of argumentative and literary essays written by British A-level students (60,209 words), British university students (95,695 words) and American university students (168,400 words) (see Granger, 1996a, 1998a for further details). The part used for this study consists of argumentative essays written by university students and totals 168,593 words. Argumentative essay titles include, among others, ‘Euthanasia’, ‘Fox hunting’, ‘Nuclear power’, ‘The National Lottery’, ‘The death penalty’, ‘Crime does not pay’ and ‘Money is the root of all evil’.

The BAWE Pilot Corpus¹ contains about one million words of proficient assessed student writing, in the form of 500 assignments ranging from 1,000 to 5,000 words in length (Nesi et al., 2004). 27 per cent of the contributors

were not native speakers of English. Nesi et al. (2004: 444) comment that ‘the University of Warwick is a multicultural, multilingual environment, and in their departments students are assessed on merit, without regard for their language background’, and add that ‘all contributors are proficient users of English, given that their assignments have been awarded high grades’. For the purpose of identifying potential academic words, I decided to only make use of assignments written by British students as Hinkel (2003) has shown that even English as a Second Language (ESL) students ‘continue to have a restricted repertoire of syntactic and lexical features common in the written academic genre’ (Hinkel, 2003: 1066).

As shown in Table 2.3, the texts were grouped into four sub-corpora which represent a discipline or a set of disciplines. The ‘Language studies’ sub-corpus consists of essays produced for courses in English studies, French studies, Italian studies, theatre, and literature. Texts in business, law, politics, sociology and economics were grouped together as social sciences as there were not enough texts per discipline to build separate corpora. Unlike the final BAWE corpus², disciplines are not equally represented in the pilot corpus and the majority of student assignments come from the humanities and social sciences. Essay topics in the BAWE pilot corpus are very diverse and seldom repeated (see Table 2.4 for examples).

Table 2.3 The corpora of student academic writing
Columns: Corpus; Variety of English; Text type; Number of words
BAWE (Language studies; Social sciences; Psychology; History): British English; assignments; 845,344
LOCNESS: mainly American English; argumentative essays; 168,593
Student Writing Corpus: 1,013,937

The Student Writing Corpus is thus quite representative of university students’ writing in that it comprises different types of writing tasks (skills-based writing and content-based writing, argumentative and expository writing). It is, however, skewed towards humanities and social sciences. It could be argued that the Academic Keyword List might therefore not fully represent academic vocabulary used in the ‘hard sciences’. It will be seen in Section 2.3 that the procedure used to extract potential academic words largely overcomes this limitation.

Table 2.4 Examples of essay topics in the BAWE pilot corpus
Language studies [98 essays]:
– Visual arts in Britain
– Prince Arthur portrayed in books
– Rise of aestheticism
– Modes of writing essays
Social sciences [64 essays]:
– Housing policy
– Teachers as professionals
– Would you agree that subordination was inscribed in the life of domestic servant?
Psychology [103 essays]:
– Clinical depression
– Psychology as a science
– Expressing attitude
– Is attention merely a matter of selection?
History [136 essays]:
– Absolutism in early modern Europe
– Why did America dominate the world film market by the 1920s?
– Who was to blame for the Boxer rising?

2.2. Corpus annotation

The Academic Word List (Coxhead, 2000) is a list of word forms that were manually classified into 570 word families (cf. Section 1.1.2). Most studies of vocabulary in the field of English for specific purposes (ESP) are based on raw corpora (e.g. Mudraya, 2006; Coxhead and Hirsh, 2007; Wang et al., 2008; Martínez et al., 2009; Ward, 2009). None of them discuss issues arising from the format of the corpus. However, any corpus-based study that aims to identify a specific set of vocabulary items should consider the advantages and disadvantages of annotating corpora.

2.2.1. Issues in annotating corpora

As Leech put it, ‘corpora are useful only if we can extract knowledge or information from them. The fact is that to extract information from a corpus, we often have to begin by building information in’ (1997: 4). Corpus annotation refers to the practice of adding linguistic information to an electronic corpus of language data. Various levels of annotation can be distinguished, starting from the addition of lemma information to each word in the corpus. A lemma is used to group together inflected forms of a word, such as the singular and the plural forms of a noun, or the different conjugated forms of a verb. A second type of annotation is the morphosyntactic level of annotation, which concerns the labelling of the part-of-speech (POS) or grammatical category of each word in the corpus. POS tagging is

the most popular kind of linguistic annotation applied to text. By providing information about the grammatical nature of a word, it makes it possible to extract information about its various meanings and uses. Thus, for example, it distinguishes between left as the past tense or past participle of leave (‘I left early’), and left as a word meaning the opposite of right, either as an adjective (‘my left hand’), an adverb (‘turn left’) or a noun (‘on your left’). Other levels of annotation are syntactic annotation or parsing (the analysis of sentences into their constituents), semantic annotation (the labelling of semantic fields) and discourse tagging (the annotation of discourse relations within the texts). For more information on the different levels of annotation, see McEnery et al. (2006: 33–43).

A number of criticisms have been directed at corpus annotation, notably by distinguished contributors to corpus-driven linguistics. One of the most widespread criticisms is that annotation reflects, at least to a certain extent, some theoretical perspective. Although the sets of categories and features used in annotating a corpus are generally chosen to be as uncontroversial as possible, the interpretative nature of corpus annotation has been perceived as a way of imposing pre-existing models of language on corpus data (Tognini-Bonelli, 2001: 73–4). These models of language date from a ‘pre-corpus’ time and some of them derive from descriptions which ignore empirical evidence altogether (Sinclair, 2004a: 52). The argument, although valid, is certainly not strong enough to counterbalance all the advantages of corpus annotation, but it should be taken as a warning against the naive assumption that using annotating software is a neutral act.

Another argument against annotation is that it may introduce errors. While it is inevitable that annotation systems will sometimes get things wrong, the various levels of annotation distinguished above are performed with varying degrees of accuracy. POS tagging is a well-researched kind of linguistic annotation and taggers perform with very high levels of accuracy. However, discourse annotation systems, for example, are more recent and still need to be substantially refined.

Although annotated data is often described as ‘enriched’ data (Leech and Smith, 1999; Aarts, 2002; Bowker and Pearson, 2002), annotation has also sometimes been criticized for resulting in a loss of information (Sinclair, 1992, 2004a). The argument can be summarized as follows:

It could be argued that in a tagged text no information is lost because the words of the text are still there and available, but the problem is that they are bypassed in the normal use of a tagged text. The actual loss of information takes place when, once the annotation of the corpus is completed and the tagsets are attached to the data, the linguist processes

the tags rather than the raw data. By doing this the linguist will easily lose sight of the contextual features associated with a certain item and will accept single, uni-functional items – tags – as the primary data. What is lost, therefore, is the ability to analyse the inherent variability of language which is realised in the very tight interconnection between lexical and grammatical patterns. This is the price paid for simplification, a process that is so useful – but it is argued here that the interconnection between lexis and grammar is crucial in determining the meaning and function of a given unit: any processing that loses out on this is bound to lose out in accuracy. (Tognini-Bonelli 2001: 73–4)

The data-driven methodology adopted in this book aims to preserve the best of both worlds, by first extracting potential academic words from annotated corpora and then returning to raw data to analyse their use in context. Even the linguists who have directed the most severe criticisms at annotated data acknowledge that the ‘good point of annotation lies in its value in applications’ (Tognini-Bonelli, 2001: 73). Finally, it is worth stressing that this chapter does not attempt to meet a theoretical objective. Rather it is content with an applied aim.

The main objective of this chapter is to select nouns, verbs, adjectives, adverbs and other function words that are commonly used in academic texts. Part-of-speech tagged corpora will thus facilitate the extraction of specific word classes. Elsewhere (Paquot, 2007b), I have tested the extraction procedure described in this chapter on two corpus formats (word form + morphosyntactic tag, and lemma + morphosyntactic tag), and shown that using lemmatised corpora makes it possible to identify some 31 per cent more lexical verbs that are typical of academic texts than using unlemmatised corpora. If lemmas are used, the different inflectional forms of a verb (e.g. consist, consists, consisted, and consisting) are merged and so a better frequency distribution for the lemma across texts is obtained. Word forms of lexical items that have two alternative spellings (e.g. analyse/analyze, characterise/characterize, centre/center, behaviour/behavior) were lemmatized under the same headword (either the British variant or the most frequently used option).

2.2.2. The software

The analysis was carried out using Wmatrix, a web-based corpus processing environment which gives researchers access to several corpus annotation and retrieval tools developed at the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University. The tools

abound can only be a verb and kindness is always a noun). the probability that the item to its immediate right is a noun or an adjective can be calculated. and by and large. (see Ide. craft. The tagger makes use of a detailed set of 146 tags3 (CLAWS C7 tagset). 1988: 31).g. It also uses two lexicons: (a) a lexicon of single words with all their possible parts of speech and associated lemmas. which can be a determiner (Do you remember that nice Mr. over 40 per cent of the running words (or tokens) in a corpus are morphosyntactically ambiguous (DeRose. and in comparison with. i. issue. 1997). they are spelt the same but belong to different word classes. compounds such as tabula rasa. and conjunctions such as even though. prepositions such as as opposed to. Many words are part-of-speech homographs. abandon. Co-occurrence probabilities are often automatically derived by training the software on manually disambiguated texts. and so that. Typical rule-based taggers use context frame rules to assign tags to unknown or ambiguous words. all the same. Most current part-of-speech taggers use an approach to disambiguation which is at least partly probabilistic: they rely on co-occurrence probabilities between neighbouring tags. use. Non-probabilistic or rule-based taggers have also been making a comeback with systems such as that proposed by Brill (1992). contrary to. as if. brand new.g. given that x is a determiner. provided that. 2005). cause. Part-of-speech tagging is essentially a disambiguation task. and (b) a multiword expression lexicon.Selection of academic vocabulary 37 available in Wmatrix include the Constituent Likelihood Automatic Word-tagging System (CLAWS) and the UCREL Semantic Analysis System (USAS) (see Rayson. matter-of-fact and grown up. Hoskins who came to dinner?)5. Another very common source of ambiguity in English is homography between verbs and nouns. and so forth. given the immediate syntactic and semantic context of a homograph. For example. 
The Constituent Likelihood Automatic Word-tagging System A corpus uploaded to the Wmatrix environment is first grammatically tagged with the Constituent Likelihood Automatic Word-tagging System (CLAWS) (Garside and Smith. a relative pronoun (The people that live next door). A tagger needs to determine which part-ofspeech is most probable. An example of a context frame rule is ‘if an ambiguous or unknown . etc. a conjunction (I can’t believe that he is only 17) or an adverb (I hadn’t realized the situation was that bad!). This is largely due to the ambiguity of a number of high-frequency words such as that. e. 2003). Although close to 90 per cent of English types4 can only be one part-of-speech (e. Multiword expressions include adverbs such as a bit. because of. at least.e. in the light of.

word is preceded by a determiner and followed by a noun, tag it as an adjective’. Voutilainen (1999) has surveyed the history of the different approaches to word class tagging.

CLAWS is a hybrid tagger, combining both probabilistic and rule-based approaches. This hybrid approach allows CLAWS to assign POS-tags with a very high degree of accuracy – 97–98 per cent for written texts (Rayson, 2003: 63). The tagger is commonly described as going through five major stages (Garside, 1987):

1. A pre-editing or tokenisation phase: This stage prepares the text for the tagging process by segmenting it into words and sentence units, a task which is not trivial. A word is generally considered as an orthographic word, i.e. a string of letters surrounded by white spaces. However, words are not always separated by blanks (e.g. in contractions such as don’t, it’s, they’re). Similarly, a sentence is generally described as a string of words followed by a full stop. A full stop does not, however, always signal the end of a sentence (e.g. in figures (5.8 or 14.28), title nouns (Mr., Dr.), and other types of abbreviations (i.e., e.g., fig., etc.)).

2. An initial part-of-speech assignment: Once a text has been tokenised, the tagger assigns part-of-speech tags to all the word tokens in the text without considering the context. If a word is unambiguous, that is, belongs to only one part-of-speech category or word class (e.g. boat, person, belong), it is assigned a single tag. If a word is ambiguous, i.e. if it can belong to more than one word class (e.g. cause, use, fire), it is assigned several tags listed in decreasing likelihood. Thus, fire is first tagged as a noun and then as a verb, because the probability of it being a noun is higher than that of it being a verb. If a particular word is not found in the tagger’s lexicon, it is assigned a tag based on various sets of rules for tagging unknown items, viz. morphological rules. Thus, a word ending in *ness will be classified as a noun, a word ending in *ly will be classified as an adverb, etc.

3. A rule-based contextual part-of-speech assignment: This stage assigns a single ‘ditto-tag’ to two or more orthographic words which function as a single unit or multiword expression (e.g. as well as is tagged as a conjunction, and in situ as an adverb (see below for more details on ditto-tags and their advantages)).

4. The probabilistic tag-disambiguation program: The task of the probabilistic tag-disambiguation program is to inspect all the cases where a word has been assigned two or more tags and choose a preferred tag by considering the context in which the word appears and assessing the

probability of any particular sequence of tags. If, for example, the word run has been assigned both a noun and a verb tag, it is less likely to be classified as a verb if it appears in the vicinity of another verb, despite the fact that run is more often a verb than a noun. The probability of a tag sequence is typically a function of
– the probability that one tag follows another, and
– the probability of a word being assigned a particular tag from the list of all its possible tags (Garside and Smith, 1997: 104).

5. Output: The output data can be presented in intermediate format (vertical output for manual post-editing) or final format (horizontal and encoded in SGML).

Table 2.5 shows a typical CLAWS vertical output: each line represents a running word in the corpus and gives its POS-tag and lemma. The intermediate format has the advantage of allowing researchers to select the information needed. I have written a Perl program which takes this intermediate format as its input and creates a corpus with lemmas followed by their POS-tags (Table 2.6).

Table 2.5 An example of CLAWS vertical output

POS-tag   Word form   Lemma
AT        The         the
JJ        whole       whole
NN1       point       point
IO        of          of
AT        the         the
NN1       play        play
VVZ       seems       seem
TO        to          to
VBI       be          be
AT1       an          an
NN1       attack      attack
II        on          on
AT        the         the
NN1       Church      church
.         .           PUNC

The problem with the format shown in Table 2.6 is that the word forms are replaced by their lemmas, while the POS-tags are too specific for our purposes. Redundant information includes, for example, that on number given by the tags NN1 (singular common noun) or DD1 (singular

determiner) and that on verbal forms given by the tags VVZ (-s form of a lexical verb) or VVG (-ing form of a lexical verb). As a result, frequency lists based on this format generate different frequencies for ‘example_NN1’ and ‘example_NN2’. POS-tags were therefore simplified by a Perl program to match the level of specificity of the lemmas. The simplification routines are presented in Table 2.8. Table 2.7 shows the same sentence in Table 2.6 after simplification of the POS-tags.

Table 2.6 CLAWS horizontal output [lemma + POS]

the_AT whole_JJ point_NN1 of_IO the_AT play_NN1 seem_VVZ to_TO be_VBI an_AT1 attack_NN1 on_II the_AT Church_NN1 ._PUNC

Where AT: article; JJ: adjective; NN1: singular common noun; IO: of (as preposition); VVZ: -s form of lexical verb; TO: infinitive marker ‘to’; VBI: be, infinitive; AT1: singular article; II: general preposition; PUNC: punctuation

Table 2.7 CLAWS horizontal output [lemma + simplified POS tags]

the_AT whole_JJ point_NN of_IO the_AT play_NN seem_VV to_TO be_VB an_AT attack_NN on_II the_AT Church_NN ._PUNC

Finally, each CLAWS7 tag can be modified by the addition of a pair of digits to show that it occurs as part of a sequence of similar tags, representing a group of graphemic words which, for grammatical purposes, are best treated as a single unit. The expression ahead of is an example of a sequence of two graphemic words treated as a single preposition. It receives the tags: ahead_II21 of_II22, where II stands for a general preposition. The first of the two digits indicates the number of graphemic words in the sequence, and the second digit the position of each graphemic word within that sequence. Such ‘ditto tags’ are not included in the lexicon but the program assigns them via an algorithm which is applied after initial part-of-speech assignment and before disambiguation by looking for a range of multiword expressions included in a pre-established list.

Ditto tags are very useful as they make it possible to extract complex prepositions and complex conjunctions as well as single words that are typical of academic discourse. However, the annotation format has to be slightly modified to do this. Table 2.9 shows the CLAWS vertical output for the complex preposition in terms of. Each graphemic word of the complex preposition is tagged and lemmatised independently. A word list based on CLAWS horizontal output would thus distinguish between the preposition in (in_II) and the preposition in used as the first word of three-word sequences (such as in terms of) (in_II31). It would not be able to retrieve the complex

Table 2.8 Simplification of CLAWS POS-tags

CLAWS7 POS tags → Simplified POS tags

Singular vs. plural forms
MC1, MC2 → MC (cardinal number)
NN1, NN2 → NN (common nouns)
NNL1, NNL2 → NNL (locative nouns, e.g. island, street)
NNO, NNO2 → NNO (numeral nouns, e.g. hundred)
NNT1, NNT2 → NNT (temporal nouns, e.g. day, week)
NNU1, NNU2 → NNU (units of measurement, e.g. inch)
NP1, NP2 → NP (proper nouns)
NPD1, NPD2 → NPD (weekday noun)
NPM1, NPM2 → NPM (month noun)

Comparative and superlative forms
DAR (more, less), DAT (most, fewest) → DA (after-determiners, e.g. much, little, few)
JJR, JJT → JJ (adjective)

Verb forms
VB0 (be, base form), VBDR (were), VBDZ (was), VBI (be, infinitive), VBM (am), VBN (been), VBR (are), VBZ (is) → VB (be)
VD0 (do, base form), VDD (did), VDG (doing), VDI (do, infinitive), VDN (done), VDZ (does) → VD (do)
VH0 (have, base form), VHD (had), VHG (having), VHI (have, infinitive), VHN (had), VHZ (has) → VH (have)
VV0 (base form of lexical verb), VVD (past tense), VVG (-ing participle), VVGK (-ing participle catenative, e.g. be going to), VVI (infinitive), VVN (past participle), VVNK (past participle catenative, e.g. be bound to), VVZ (-s form) → VV (lexical verbs)

Table 2.9 CLAWS tagging of the complex preposition ‘in terms of’

POS-tag   Word form   Lemma
II31      in          in
II32      terms       term
II33      of          of
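The simplification in Table 2.8 amounts to a lookup from fine-grained CLAWS7 tags to coarser ones. The author's program was written in Perl and is not reproduced in the text, so the following Python version is only a minimal sketch under stated assumptions: tokens are taken to be in the `lemma_TAG` horizontal format, the names `GROUPS`, `TAG_MAP`, `simplify_token` and `simplify_line` are hypothetical, and the `AT1` to `AT` rule is inferred from the example sentence in Table 2.7 rather than listed in Table 2.8.

```python
# Simplified tag -> CLAWS7 tags it absorbs (transcribed from Table 2.8).
GROUPS = {
    "MC": ["MC1", "MC2"],
    "NN": ["NN1", "NN2"],
    "NNL": ["NNL1", "NNL2"],
    "NNO": ["NNO2"],
    "NNT": ["NNT1", "NNT2"],
    "NNU": ["NNU1", "NNU2"],
    "NP": ["NP1", "NP2"],
    "NPD": ["NPD1", "NPD2"],
    "NPM": ["NPM1", "NPM2"],
    "DA": ["DAR", "DAT"],
    "JJ": ["JJR", "JJT"],
    "VB": ["VB0", "VBDR", "VBDZ", "VBI", "VBM", "VBN", "VBR", "VBZ"],
    "VD": ["VD0", "VDD", "VDG", "VDI", "VDN", "VDZ"],
    "VH": ["VH0", "VHD", "VHG", "VHI", "VHN", "VHZ"],
    "VV": ["VV0", "VVD", "VVG", "VVGK", "VVI", "VVN", "VVNK", "VVZ"],
    # Assumption: not in Table 2.8, but Table 2.7 shows an_AT1 becoming an_AT.
    "AT": ["AT1"],
}

# Invert the grouping into a direct fine-tag -> coarse-tag lookup.
TAG_MAP = {fine: coarse for coarse, fines in GROUPS.items() for fine in fines}


def simplify_token(token):
    """Replace the POS tag of one lemma_TAG token by its simplified tag."""
    if "_" not in token:
        return token  # leave anything that is not lemma_TAG untouched
    lemma, _, tag = token.rpartition("_")
    return f"{lemma}_{TAG_MAP.get(tag, tag)}"  # unlisted tags stay as they are


def simplify_line(line):
    """Simplify every token of a whitespace-separated horizontal line."""
    return " ".join(simplify_token(token) for token in line.split())
```

Applied to the sentence of Table 2.6, `simplify_line` yields the string shown in Table 2.7. Tags that the table does not mention, including ditto tags such as II31, pass through unchanged.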

Letters Table 2. query. sports and games Life and living things Movement. the category ‘language and communication’ (Q) includes words such as answer. states and processes Science and technology Names and grammar . explain. expand into 232 categories (see Archer et al. in turn. and explanation. 2002). objects and equipment Education in general Language and communication Social actions. question. 1997: 54). feedback. A semantic field is a theoretical construct which groups together ‘words that are related by virtue of their being connected – at some level of generality – with the same mental concept’ (Wilson and Thomas. location. Another Perl program was therefore used to replace any sequence of words with ditto tags (e.10 Semantic fields of the UCREL Semantic Analysis System A B C E F G H I K L M N O P Q S T W X Y Z General and abstract terms The body and the individual Arts and crafts Emotional actions. travel and transport Numbers and measurement Substances. anecdote.. materials. house and the home Money and commerce in industry Entertainment.g. in-terms-of II). in_II31 terms_II32 of_II33) by the component words. response. message.10).42 Academic Vocabulary in Learner Writing preposition itself. For example. This includes not only synonyms and antonyms of a word but also its hypernyms and hyponyms. reply.g. which. The USAS tagset includes 21 major semantic fields (see Table 2. statement. The UCREL Semantic Analysis System A second layer of annotation was applied by the UCREL Semantic Analysis System (USAS). This tool assigns tags representing the general semantic field of words from a lexicon of single words and multiword expressions. states and processes Food and farming Government and public Architecture. and any other words that are linked in other ways with the concept concerned. states and processes Time World and environment Psychological actions. separated by a hyphen and followed by their POS-tag (e.

11 shows that in the sentence ‘This chapter deals with the approach of the criminal law to behaviour which causes or risks causing death’.1.2 G2.1 T1. states and processes – religion and the supernatural’) and T1.1 Z8 Z5 A2. These categories are all marked with a Z-tag. Like part-of-speech tagging. Table 2.1. the subcategory ‘affect’ (A2) and more precisely the sub-subcategory ‘cause / connected’ (A2. and contextual rules (Rayson.11 USAS vertical output POS-tag DD1 NN1 VVZ IW AT NN1 IO AT JJ NN1 II NN1 DDQ VVZ CC VVZ VVG NN1 . Semantic tag M6 Z5 Z8 Q4. semantic tagging can be subdivided broadly into a tag assignment phase and a tag disambiguation phase.2 L1- .1 I2. the word chapter has been assigned the tags Q4.1 Z5 Z5 G2. 2003: 67–8).2 M1 E1 S1.2 Z5 A15A2. For example. a set of potential semantic tags are attached to each lexical unit.2. domain of discourse.1[i1. S9 (‘social actions.K5. the semantic tag A2. First.2.2). states and processes – groups and affiliation’).1 A1. Word form This chapter deals with the approach of the criminal law to behaviour which causes or risks causing death . The program Table 2. S5 (‘social actions.1G2. The next stage consists of selecting the contextually appropriate semantic tag from the set of potential tags provided by the tag assignment algorithm. It assigns a semantic field tag to every word in the text with about 92 per cent accuracy.2 F3/I2.2 I2. The program makes use of a number of sources of information in the disambiguation phase.2 represents a word in the category ‘general and abstract words’ (A).1.1 (‘language and communication – media – books’).2 Z5 Z5 X4.3 S9/S5 S5+ A1. (‘time-period’).1.1 S6+ Y1 Z5 S1.1 A9.1[i1.Selection of academic vocabulary 43 are used to denote the major semantic fields while numbers indicate field subdivisions. The semantic annotation does not apply to proper names and closed classes of words such as prepositions.1.1 G2.3. conjunctions and pronouns. notably POS-tags.A5.

2 death_L1.3.1.2 or_Z5 risks_A15 causing_A2. academic year. Thus. Multiword expressions are analysed as if they were single words. .2. by the skin of one’s teeth) are described as regular expressions or templates. i.1. break out.2 of_Z5 the_Z5 criminal_ G2.1 law G2.12 Academic Vocabulary in Learner Writing USAS horizontal output This_M6 chapter_Q4. 2.1[i1. The same occurrence of a word in a text may simultaneously signal more than one semantic field. etc.g. advisory committee. 1996) to select words that met three frequency-based criteria: 1. at the drop of a hat.2 (see Table 2. It belongs equally to the semantic fields of ‘groups and affiliation’ and ‘religion’.1 with_Z5 the_Z5 approach_X4. The word chapter in the sense of ‘an ecclesiastical assembly of priests or monks’ is a case in point. sequences of words. determiner (D*) or article (AT*) and the singular noun (NN1) sense. the template ‘ma[kd]*_V* {JJ.2. Automatic extraction of potential academic words Coxhead (2000) made use of the Range corpus analysis program (Heatley & Nation.1[i1.1 law_G2.2.44 Table 2. For example.2.g. as represented by West’s (1953) General Service List.1[i1.1[i1. parts of words and grammatical categories used to match similar patterns of text and extract them. The two semantic tags are thus assigned in the form of a single tag S9/S5 (see Table 2. take off). and idioms (e. bank account).. phrasal verbs (e. to bark up the wrong tree. made more sense._PUNC ranked these semantic tags and chose Q4.11).000 most frequent words in English.1 as the semantic tag with the highest correctness probability. AT*} sense_NN1’ identifies all occurrences of the verb make directly followed by an optional adjective (JJ). compounds (e. criminal law is tagged as: criminal G2. D*.g.2 to_Z5 behaviour_S1. It thus retrieves all instances of the expression make sense and its variants make no sense.12).e.12).1 deals_A1. This is displayed in the final output format (see Table 2.1 which_Z8 causes_A2. 
makes little sense. using ditto-tags similar to those used in part-of-speech tagging. The word families included had to be outside the first 2. In the USAS lexicon of multiword expressions.

To address this limitation. a fully data-driven method that is often used in corpus linguistics to find salient linguistic features in texts (e. Section 1. 2009) and which does not require the use of a stop list to filter out function words. Range 3. Evenness of distribution Potential academic words Figure 2.1.1).1) but which are not necessarily the most representative lexical items in the Academic Corpus.1.Selection of academic vocabulary 45 2. are subsequently used to narrow down the resulting list of potential academic words (Figure 2. 3. the procedure described in this book is primarily based on keyness (Scott. with a frequency of at least 10 occurrences in each sub-corpus and in 15 or more of the 28 subject areas.2. On the other hand. A member of a word family had to occur in all 4 disciplines represented in the Academic Corpus. Keyness 2. Members of a word family had to occur at least 100 times in the corpus (cf. 1. Archer. 2009a: 3). These words and other high-frequency words will only occur in a keyword list “if their usage is strikingly different from the norm established by the reference text” (Archer.).1 A three-layered sieve to extract potential academic words . applying Criterion 1 makes it impossible to identify high-frequency words that are particularly prominent in academic texts. the resulting list would have included a large number of function words and other high-frequency words that tend to be frequent in the English language as a whole (cf. If Criterion 1 had not been used. 2001).g. Section 1. Two quantitative filters. namely range and evenness of distribution.

2.3.1. Keyness

Keyword analysis has been used in a variety of fields to extract distinctive words or keywords, e.g. business English words (Nelson, 2000), words typically used by men and women with cancer in interviews and online cancer support groups (Seale et al., 2006), and terminological items typical of specific sub-disciplines of English for information science and technology (Curado Fuentes, 2001). As emphasized by Scott and Tribble (2006: 55–6), ‘keyness is a quality words may have in a given text or set of texts, suggesting that they are important, they reflect what the text is really about, avoiding trivia and insignificant detail. What the text ‘boils down to’ is its keyness, once we have steamed off the verbiage, the adornment, the blah blah blah’.

The procedure to identify keywords of a particular corpus involves five main stages (see Scott and Tribble, 2006: 58–60):

1. Frequency-sorted word lists are generated for a reference corpus and the research corpus.
2. A minimum frequency threshold is usually set at 2 or 3 occurrences in the research corpus. Thus, ‘for a word to be key, then it (a) must occur at least as frequently as the threshold level, and (b) be outstandingly frequent in terms of the reference corpus’ (Scott and Tribble, 2006: 59).
3. The two lists of word types and their frequencies are compared by means of a statistical test, usually the log-likelihood ratio.
4. Words that occur less frequently than the threshold in the research corpus, or are not significantly more frequent in the research corpus than in the reference corpus, are filtered out.
5. The word list for the research corpus is reordered in terms of the keyness of each word type. Software tools usually list positive keywords, i.e. words that are statistically prominent in the research corpus, as well as negative keywords, i.e. words that have strikingly low frequency in the research corpus in comparison to the reference corpus.

For the purposes of this research, the two corpora of professional writing and the corpus of student academic writing described in Section 2.1 were each compared with a large corpus of fiction on the grounds that academic words would be particularly under-represented in this literary genre. Thus, the reference corpus was not chosen to represent all the varieties of the language⁶ but to serve as a ‘strongly contrasting reference corpus’ (Tribble, 2001: 396). The K (general fiction), L (mystery and detective fiction),

g. which means that there is less than 1 per cent danger of mistakenly claiming a significant difference in frequency. enzyme.). N. However. Positive keywords are more numerous than negative keywords for each academic corpus. N.Selection of academic vocabulary Table 2. P) BROWN (categories K. bacterium. P) Baby BNC fiction TOTAL Number of words 946..g. This can be explained by the large amount of specialized vocabulary present in academic texts. formula. the ProfHS corpus and the Student Writing Corpus. jurisdiction.322 4. archbishop. .14 Number of keywords Corpus ProfHS corpus ProfSS corpus Student Writing corpus Positive keywords 4. DNA. compared to a reference corpus. penicillin. L. M. Table 2. Keyness values were calculated with the Keyness module of WordSmith Tools 4 (Scott.688 1.13 (see Rayson et al.01 with a critical value of 15. The keyword procedure selects all words that occur with unusual frequency in a given text/corpus. etc. 2004). offence and policy in law (soft science) and theory. N (adventure and western fiction) and P (romance and love story) categories of the LOB (Lancaster-Oslo/Bergen) corpus. methane. N. P) FLOB (categories K.5. martyr. the BROWN corpus and the FROWN (Freiburg-Brown) corpus7 were combined with the Baby BNC fiction corpus (Table 2. factor and participant in student writing. law. 2004). in lemma + POS-tag format. rape. M. simply because they are under-represented in fiction writing (e.13 The fiction corpus Corpora LOB (categories K. M. the FLOB (Freiburg Lancaster-Oslo/Bergen) corpus. cell and species in biology (hard science). N. e. chromosome.14 gives the number of positive and negative keywords for each corpus.492 Negative keywords 837 1. L. The resulting list is therefore likely to include technical words that do not occur in all types of academic texts. Keywords were extracted for the ProfSS corpus.13) to form the reference corpus for this study.025 Table 2.946. L. M. P) FROWN (categories K.656 4. 
not all of the keywords meet the definition of academic vocabulary in Section 1.201 956 M (science fiction). The significance of the log-likelihood test was set at 0.337 47 999. L.
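The keyword procedure can be illustrated with a short Python sketch. It is not the WordSmith Tools implementation, whose internals are not described in the text. It assumes the two-cell form of the log-likelihood statistic commonly used in corpus comparison, takes the cut-off of 15.13 quoted above, and uses hypothetical names (`log_likelihood`, `keywords`, `min_freq`).

```python
import math

CRITICAL_VALUE = 15.13  # cut-off quoted in the text


def log_likelihood(freq_research, freq_reference, size_research, size_reference):
    """Two-cell log-likelihood for one word (assumed formula, see lead-in)."""
    combined = freq_research + freq_reference
    total = size_research + size_reference
    expected_research = size_research * combined / total
    expected_reference = size_reference * combined / total
    ll = 0.0
    for observed, expected in ((freq_research, expected_research),
                               (freq_reference, expected_reference)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def keywords(research, reference, size_research, size_reference, min_freq=2):
    """Return (positive, negative) keyword lists, each sorted by keyness."""
    positive, negative = [], []
    for word, freq in research.items():
        if freq < min_freq:
            continue  # below the minimum frequency threshold
        ref_freq = reference.get(word, 0)
        ll = log_likelihood(freq, ref_freq, size_research, size_reference)
        if ll < CRITICAL_VALUE:
            continue  # difference not significant at the chosen cut-off
        # Compare relative frequencies to decide the direction of the difference.
        if freq / size_research > ref_freq / size_reference:
            positive.append((word, ll))
        else:
            negative.append((word, ll))
    positive.sort(key=lambda pair: pair[1], reverse=True)
    negative.sort(key=lambda pair: pair[1], reverse=True)
    return positive, negative
```

With `research` and `reference` as plain word-to-frequency dictionaries, the function returns the positive and negative keywords that a keyword tool would list. Words absent from the research corpus are not considered, which is a simplification.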

As a consequence.1 with the WordList option of WordSmith Tools (Scott. the number of texts in which a word appears) is used to determine whether a word appears to be a potential academic keyword because it occurs in most academic disciplines or because of a very high usage in a limited subset of texts. are shown to appear in all 15 sub-corpora. Figure 2. Statistical measures such as the log-likelihood ratio are computed on the basis of absolute frequencies and cannot account for the fact that ‘corpora are inherently variable internally’ (Gries. ‘which suggests that this word is key because of a single author’s use of a word in a specific case. The criteria of range and evenness of distribution were subsequently used to refine the list of potential academic words still further. Range Range (i. The words ability. In other words. the procedure cannot distinguish between ‘global’ and ‘local’ keywords (Katz. Scott’s (1997) notion of ‘key keywords’). It is calculated on the basis of the 15 sub-corpora described in Section 2. for example. rather than being something that indicates a general difference in language use’ (2004: 350). 2004). in a keyword analysis of gay male vs.2. a phenomenon which Katz (1996: 19) has referred to as ‘burstiness’8. the keyword status of wuz is more a function of the sampling decision to include one particular narrative in the corpus than evidence of the distinctiveness of the word in gay male erotic narratives (see also Oakes and Farrow.e. able and about. in 100 per cent of the corpora . that is. 1996). Although the resulting number of keywords fell by more than 60 per cent. As a first step to overcome this inherent limitation of the keyword procedure.048 shared keywords were still identified. 2007: 91). when in fact its use is restricted to one single text.3. For example. 2007: 110). 2.2 shows that there is a column headed ‘Texts’ which shows the number of texts each word occurred in. the ProfSS corpus and the Student Writing corpus (cf. 
I wrote a Perl program which automatically compares keywords for several corpora and creates a list of positive keywords that are shared in the ProfHS corpus. Baker shows that wuz (used as a non-standard spelling of was) appears to be a keyword of gay male erotic narratives. Global keywords are dispersed more or less evenly through the corpus while local keywords appear repeatedly in some parts of the corpus only.48 Academic Vocabulary in Learner Writing The keyword procedure relies on the conception of a corpus as one big text rather than as a collection of smaller texts. 2. This tool can take several corpus files as input and range comes automatically with any word list it produces. lesbian erotic narratives.

analysed. For the purposes of this study, only words appearing in all 15 academic sub-corpora were retained as potential academic words.

Used alone, range has an important limitation in that it gives no information on the frequency of a word in each sub-corpus. Thus, the criterion of range excludes the words sector, paradigm and variance as they only appear in 11 sub-corpora but includes both the word example, which we intuitively regard as an academic word, and the word law, the meaning of which is more discipline or topic-dependent (e.g. criminal law, canon law, the law of gravity). The frequencies of these two words in each sub-corpus are shown in Figure 2.3 and accurately reflect the difference between them. The frequency of the word example ranges from 26 to 226 in the 15 sub-corpora, while that of the word law varies between 11 and 812. The large variation in the range of law can be explained by the peak frequency of occurrence of the noun in the professional soft science sub-corpora.

Figure 2.2 WordSmith Tools – WordList option
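The range criterion is simple to state in code. The sketch below is an illustration in Python rather than a description of WordSmith Tools. It assumes each sub-corpus is available as a word-to-frequency dictionary, and the names `word_range` and `full_range_words` are hypothetical.

```python
def word_range(subcorpus_freqs):
    """Count, for every word, the number of sub-corpora in which it occurs.

    subcorpus_freqs maps a sub-corpus name to a {word: frequency} dictionary.
    """
    ranges = {}
    for freqs in subcorpus_freqs.values():
        for word, freq in freqs.items():
            if freq > 0:
                ranges[word] = ranges.get(word, 0) + 1
    return ranges


def full_range_words(subcorpus_freqs):
    """Keep only the words attested in every sub-corpus (range = 100 per cent)."""
    n_subcorpora = len(subcorpus_freqs)
    ranges = word_range(subcorpus_freqs)
    return sorted(word for word, r in ranges.items() if r == n_subcorpora)
```

As the text points out, this filter looks only at presence or absence: a word with 11 occurrences in one sub-corpus and 812 in another passes it just as easily as an evenly spread word.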

e.3.. 2004). Its values range from 0 (most uneven distribution possible) to 1 (perfectly even distribution across the sectors of the corpus) (see Oakes (1998: 189–92) and Gries (2008) for more information on dispersion measures).3 Distribution of the words example and law in the 15 sub-corpora 2.3. This is the last criterion I applied to restrict the list of potential academic words. the number of sub-corpora or texts) in the corpus. The evenness of the distribution or dispersion of a word is ‘a statistical coefficient of how evenly distributed a word is across successive sectors of the corpus’ (Rayson. Juilland’s D was first used in the Frequency Dictionary of Spanish Words (Juilland and Rodriguez. 2003: 93). 1964) and is calculated as D = 1 – V / √n-1 where n is the number of sectors (i. but the exact number of times it appears’ (Oakes and Farrow 2007: 91). Juilland’s Ds were calculated for each word using the output list from WordSmith Tools Detailed Consistency Analysis. Evenness of distribution Differences in range can be highlighted by a measure of the evenness of the distribution of words in a corpus. And if the word is used commonly enough.4 gives an example of a M BEL C BN SO C C BN HU C M BN PO C L SO C M MC C SC AP P BN SC BN C SC B C BA AW TE C W E E A H BA H RT W IST S E O PS R Y BA YC W HO E LO SO C C N ES S M C M C AR TS . One such measure is Juilland’s D statistical coefficient. it will be well-distributed’ (Zhang et al. The variation coefficient V is given by V = s / x where x is the mean sub-frequency of the word in the corpus and s is the standard deviation of these sub-frequencies.50 900 800 700 600 500 400 300 200 100 0 Academic Vocabulary in Learner Writing LAW EXAMPLE Figure 2. This measure takes into account ‘not only the presence or absence of a word in each subsection of the corpus. A number of studies have used a measure of dispersion to define a core lexicon on the basis that ‘if a word is commonly used in a language. 
it will appear in different parts of the corpus. Figure 2.

4 WordSmith Tools Detailed Consistency Analysis 51 .Selection of academic vocabulary Figure 2.

These frequencies were copied into an Excel file and normalized per 100. provide.e.000 words as the 15 sub-corpora are of different sizes. and consequence. and the verbs label. conversely. the cut-off point of 0. was not selected. and confirm that only example is of widespread and general use in this genre. employment.8 appears to be too restrictive and words that would intuitively be considered as academic words are excluded. effective. Evenness of distribution is the only criterion used that could perhaps favour keywords that are more prominent in the different parts of the ProfSS corpus and the Student Writing corpus However. verbs such as prove. the variation coefficient. The noun example was thus identified as a potential academic word as its dispersion value was 0. whereas the noun law. similar and likely and adverbs such as particularly. At times. show. possibility of giving too much weight to words that would be particularly frequent in the ‘soft science’ sub-corpora but much less common in the ‘hard science’ sub-corpora. perceive and isolate. difference.8 reduces the . the third column (Texts) gives the range of each word and the following columns show its frequencies in each sub-corpus.5).8 and were therefore not selected include the nouns health. highly and above.83.9 For a word to be selected as a potential academic word. personality. appear.8. the mean sub-frequency and the standard deviation) were computed in Excel and Juilland’s D values were then calculated for each word. as it has both a general meaning (‘a way of solving a problem’) and a technical meaning (‘a liquid in which a solid or gas has been mixed’) with different frequencies and distributional behaviours. while the noun law is over-represented in the professional soft science corpus. its Juilland’s D value had to be higher than 0.69. Examples of words that have D values lower than 0. . Some words have skewed Juilland’s D values because of their polysemy. result and illustrate. 
and more specifically in the social science sub-corpus. extent. Its general meaning is found in all academic sub-corpora while its technical meaning is restricted to scientific writing and accounts for its much higher frequency in the two professional scientific sub-corpora (MC-SC and BNC-SC in Figure 2. significance. a relatively high minimum threshold of 0. The resulting list of 599 potential academic words includes nouns such as conclusion.52 Academic Vocabulary in Learner Writing detailed consistency analysis: the second column (Total) gives the total frequency of each word in the whole corpus. and treatment. The measures necessary to calculate Juilland’s D values (i. Dispersion values make it possible to avoid the mistaken conclusion that these two words behave similarly in academic writing. adjectives such as significant. with a Juilland’s D value of 0. The noun solution is a case in point. discuss.
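The dispersion calculation described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the Excel procedure actually used: the function names `normalise` and `juilland_d` are hypothetical, and the standard deviation is assumed to be the population standard deviation (dividing by n), which is the variant under which D falls exactly between 0 and 1.

```python
import math
from statistics import mean, pstdev


def normalise(raw_counts, subcorpus_sizes, per=100_000):
    """Convert raw counts to frequencies per `per` words, one value per sub-corpus."""
    return [per * count / size for count, size in zip(raw_counts, subcorpus_sizes)]


def juilland_d(sub_frequencies):
    """Juilland's D = 1 - V / sqrt(n - 1), where V = s / mean.

    Assumption: s is the population standard deviation (pstdev), so that
    D is 1 for a perfectly even distribution and 0 when every occurrence
    falls in a single sub-corpus.
    """
    n = len(sub_frequencies)
    if n < 2:
        raise ValueError("at least two sub-corpora are needed")
    average = mean(sub_frequencies)
    if average == 0:
        raise ValueError("the word does not occur in the corpus")
    variation_coefficient = pstdev(sub_frequencies) / average
    return 1 - variation_coefficient / math.sqrt(n - 1)


if __name__ == "__main__":
    # Invented frequencies for illustration only (not data from the study).
    evenly_spread = [10, 10, 10, 10]
    one_subcorpus_only = [40, 0, 0, 0]
    print(juilland_d(evenly_spread))
    print(juilland_d(one_subcorpus_only))
```

With a function of this kind, the selection rule used here amounts to keeping a word only when `juilland_d(...)` returns a value above 0.8.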

Figure 2.5 Distribution of the noun ‘solution’

These two peak frequencies of occurrence are responsible for the relatively low D value (54.6) of the noun solution. More generally, the 15 sub-corpora are relatively small and the frequencies of occurrence of words may be skewed by a particular topic or author’s preferred turn of phrase. I therefore made use of a semi-automatic procedure to identify words that did not pass the dispersion criterion but were semantically related to the 599 potential academic words.

2.3.4. Broadening the scope of well-represented semantic categories

Section 2.2 discussed how a text uploaded to the web-based environment Wmatrix is morphosyntactically and semantically tagged. The semantic analysis was conducted with the UCREL System which classifies words and multiword units into 21 major semantic categories. Table 2.15 shows the distribution of the 599 potential academic words across these semantic classes. Some words were automatically classified into more than one category but the figures given are based on the semantic tag most frequently attributed to each word.

Table 2.15 Automatic semantic analysis of potential academic words

Semantic categories                                   Number of words   Percentage of words
A. General and abstract terms                         267               44.6
B. The body and the individual                        2                 0.3
C. Arts and crafts                                    2                 0.3
E. Emotion                                            4                 0.7
F. Food and farming                                   0                 0.0
G. Government and public                              4                 0.7
H. Architecture, house and the home                   2                 0.3
I. Money and commerce in industry                     7                 1.2
K. Entertainment, sports and games                    0                 0.0
L. Life and living things                             0                 0.0
M. Movement, location, travel and transport           12                2.0
N. Numbers and measurement                            74                12.4
O. Substances, materials, objects and equipment       7                 1.2
P. Education in general                               4                 0.7
Q. Language and communication                         34                5.7
S. Social actions, states and processes               47                7.7
T. Time                                               26                4.3
W. World and environment                              2                 0.3
X. Psychological actions, states and processes        55                9.2
Y. Science and technology in general                  2                 0.3
Z. Names and grammar                                  50                8.3
TOTAL                                                 599               100

It is notable that 87 per cent of the 599 potential academic words fall into just six of the categories. In particular, the category ‘general and abstract terms’ includes almost half the potential academic words. Examples include the nouns activity, circumstance, and limitation as well as the verbs perform and cause, the adjectives detailed and particular and the adverbs similarly and conversely. The category ‘numbers and measurement’ accounts for more than 10 per cent of the potential academic words and includes nouns (e.g. amount, degree, measure, extent), adjectives (e.g. high, large, wide), verbs (e.g. increase, extend, reduce), adverbs (e.g. frequently, also) and prepositions (e.g. in addition to). The categories ‘psychological actions, states and processes’ (e.g. analyse, assumption, conclusion, interpretation, attempt), ‘social actions, states and processes’ (e.g. social, encourage, facilitate, impose), ‘names and grammar’ (mainly consisting of connective devices such as conjunctions (or, since, whether), prepositions (such as, according to, during) and adverbs (moreover, thus, subsequently, therefore)), and ‘language and communication’ (e.g. argue, claim, define, suggest) represent 9.2 per cent, 7.7 per cent, 8.3 per cent and 5.7 per cent of the potential academic words respectively.

On this basis, 331 keywords that did not have Juilland’s D values higher than 0.8 but which formed part of one of the six semantic categories described above were added to the list of potential academic words. Many of the words that were retrieved by this additional criterion are morphologically related to words that had already been automatically selected. For example, the noun analysis, which is morphologically related to the potential academic verb analyse, was retrieved by the semantic criterion although its Juilland’s D was below 0.8. However, this is not an argument for using word families instead of lemmas. The criteria of minimum frequency and range still apply: the noun analysis was retrieved only because it is very frequent in academic prose and appears in a wide range of academic texts. Other morphologically related words such as analyst or analysable were still excluded from the list.

2.4. The Academic Keyword List

The (semi-)automatic extraction procedure described in Section 2.3 identified 930 potential academic words on the basis of four criteria, the first of which is keyness. First, the words had to be keywords in professional (both ‘hard’ and ‘soft’ disciplines) and student academic writing. Second, they had to be characterized by wide range, i.e. to appear in all 15 sub-corpora representing different academic disciplines. Third, they had to be well-distributed across the corpora and have a Juilland’s D value higher than 0.8. Keywords that did not match this last criterion but belonged to one of the six best represented semantic categories – ‘general and abstract’, ‘numbers and measurement’, ‘names and grammar’, ‘psychological actions, states and processes’, ‘social actions, states and processes’ and ‘language and communication’ – were also included. The resulting list of potential academic words has been named the Academic Keyword List (AKL) to emphasize the fact that it is the output of a data-driven set of criteria, and not a list of academic vocabulary in its functional sense. The complete list is given in Table 2.17.

Table 2.16 presents a breakdown of the AKL by grammatical category.

Table 2.16 Distribution of grammatical categories in the Academic Keyword List

              Number   Percentage
Nouns         355      38.17
Verbs         233      25.05
Adjectives    180      19.35
Adverbs       87       9.35
Others        75       8.06
Total         930      100

Nouns make up 38.17 per cent of all potential academic words. This is consistent with Biber et al.’s (1999) finding that nouns are particularly frequent in academic prose. A large proportion of the nouns in the list are

Table 2.17 The Academic Keyword List

355 nouns
ability, absence, account, achievement, act, action, activity, addition, adoption, adult, advance, advantage, advice, age, aim, alternative, amount, analogy, analysis, application, approach, argument, aspect, assertion, assessment, assistance, association, assumption, attempt, attention, attitude, author, awareness, balance, basis, behaviour, being, belief, benefit, bias, birth, capacity, case, category, cause, centre, challenge, change, character, characteristic, choice, circumstance, class, classification, code, colleague, combination, commitment, committee, communication, community, comparison, complexity, compromise, concentration, concept, conception, concern, conclusion, condition, conduct, conflict, consensus, consequence, consideration, constraint, construction, content, contradiction, contrast, contribution, control, convention, correlation, country, creation, crisis, criterion, criticism, culture, damage, data, debate, decision, decline, defence, definition, degree, demand, description, destruction, determination, development, difference, difficulty, dilemma, dimension, disadvantage, discovery, discrimination, discussion, distinction, diversity, division, doctrine, effect, effectiveness, element, emphasis, environment, error, essence, establishment, evaluation, event, evidence, evolution, examination, example, exception, exclusion, existence, expansion, experience, experiment, explanation, exposure, extent, extreme, fact, factor, failure, feature, female, figure, finding, force, form, formation, function, future, gain, group, growth, guidance, guideline, hypothesis, idea, identity, impact, implication, importance, improvement, increase, indication, individual, influence, information, insight, instance, institution, integration, interaction, interest, interpretation, intervention, introduction, investigation, isolation, issue, kind, knowledge, lack, learning, level, likelihood, limit, limitation, link, list, literature, logic, loss, maintenance, majority, male, manipulation, mankind, material, means, measure, medium, member, method, minority, mode, model, motivation, movement, need, network, norm, notion, number, observation, observer, occurrence, operation, opportunity, option, organisation, outcome, output, parallel, parent, part, participant, past, pattern, percentage, perception, period, person, personality, perspective, phenomenon, point, policy, population, position, possibility, potential, practice, presence, pressure, problem, procedure, process, production, programme, progress, property, proportion, proposition, protection, provision, publication, purpose, quality, question, range, rate, reader, reality, reason, reasoning, recognition, reduction, reference, relation, relationship, relevance, report, representative, reproduction, requirement, research, resistance, resolution, resource, respect, restriction, result, review, rise, risk, role, rule, sample, scale, scheme, scope, search, section, selection, sense, separation, series, service, set, sex, shift, significance, similarity, situation, skill, society, solution, source, space, spread, standard, statistics, stimulus, strategy, stress, structure, subject, success, summary, support, survey, system, target, task, team, technique, tendency, tension, term, theme, theory, tolerance, topic, tradition, transition, trend, type, uncertainty, understanding, unit, use, validity, value, variation, variety, version, view, viewpoint, volume, whole, work, world

233 verbs
accept, account (for), achieve, acquire, act, adapt, adopt, advance, advocate, affect, aid, aim, allocate, allow, alter, analyse, appear, apply, argue, arise, assert, assess, assign, assist, associate, assume, attain, attempt, attend, attribute, avoid, base, be, become, benefit, can, cause, characterise, choose, cite, claim, clarify, classify, coincide, combine, compare, compete, comprise, concentrate, concern, conclude, conduct, confine, conform, connect, consider, consist, constitute, construct, contain, contrast, contribute, control, convert, correspond, create, damage, deal, decline, define, demonstrate, depend, derive, describe, design, destroy, determine, develop, differ, differentiate, diminish, discuss, display, distinguish, divide, dominate, effect, eliminate, emerge, emphasize, employ, enable, encounter, encourage, enhance, ensure, establish, evaluate, evolve, examine, exceed, exclude, exemplify, exist, expand, experience, explain, expose, express, extend, facilitate, fail, favour, finance, focus, follow, form, formulate, function, gain, generate, govern, highlight, identify, illustrate, imply, impose, improve, include, incorporate, increase, indicate, induce, influence, initiate, integrate, interpret, introduce, investigate, involve, isolate, label, lack, lead, limit, link, locate, maintain, may, measure, neglect, note, obtain, occur, operate, outline, overcome, participate, perceive, perform, permit, pose, possess, precede, predict, present, preserve, prevent, produce, promote, propose, prove, provide, publish, pursue, quote, receive, record, reduce, refer, reflect, regard, regulate, reinforce, reject, relate, rely, remain, remove, render, replace, report, represent, reproduce, require, resolve, respond, restrict, result, retain, reveal, seek, select, separate, should, show, solve, specify, state, stimulate, strengthen, stress, study, submit, suffer, suggest, summarise, supply, support, sustain, tackle, tend, term, transform, treat, undermine, undertake, use, vary, view, write, yield

180 adjectives
absolute, abstract, acceptable, accessible, active, actual, acute, additional, adequate, alternative, apparent, applicable, appropriate, arbitrary, available, average, basic, central, certain, clear, common, competitive, complete, complex, comprehensive, considerable, consistent, conventional, correct, critical, crucial, dependent, detailed, different, difficult, direct, distinct, dominant, early, effective, equal, equivalent, essential, evident, excessive, experimental, explicit, extensive, extreme, far, favourable, final, fixed, following, formal, frequent, fundamental, future, general, great, high, human, ideal, identical, immediate, important, inadequate, incomplete, independent, indirect, individual, inferior, influential, inherent, initial, interesting, internal, large, late, leading, likely, limited, local, logical, main, major, male, maximum, mental, minimal, minor, misleading, modern, mutual, natural, necessary, negative, new, normal, obvious, original, other, overall, parallel, partial, particular, passive, past, permanent, physical, positive, possible, potential, practical, present, previous, primary, prime, principal, productive, profound, progressive, prominent, psychological, radical, random, rapid, rational, real, realistic, recent, related, relative, relevant, representative, responsible, restricted, scientific, secondary, selective, separate, severe, sexual, significant, similar, simple, single, so-called, social, special, specific, stable, standard, strict, subsequent, substantial, successful, successive, sufficient, suitable, surprising, symbolic, systematic, theoretical, total, traditional, true, typical, unique, unlike, unlikely, unsuccessful, useful, valid, valuable, varied, various, visual, vital, wide, widespread

87 adverbs
above, accordingly, accurately, adequately, also, approximately, at best, basically, clearly, closely, commonly, consequently, considerably, conversely, correctly, directly, e.g., effectively, either, equally, especially, essentially, explicitly, extremely, fairly, far, for example, for instance, frequently, fully, further, generally, greatly, hence, highly, however, in general, in particular, increasingly, indeed, independently, indirectly, inevitably, initially, largely, less, mainly, more, moreover, most, namely, necessarily, normally, notably, often, only, originally, over, partially, particularly, potentially, previously, primarily, purely, readily, recently, relatively, secondly, significantly, similarly, simply, socially, solely, somewhat, specifically, strongly, subsequently, successfully, thereby, therefore, thus, traditionally, typically, ultimately, virtually, wholly, widely

75 others
according to, although, an, as, as opposed to, as to, as well as, because, because of, between, both, by, contrary to, depending on, despite, due to, during, each, even though, fewer, first, for, former, from, given that, in, in addition to, in common with, in favour of, in relation to, in response to, in terms of, in that, in the light of, including, its, itself, latter, less, little, many, most, of, or, other than, per, prior to, provided, rather than, same, second, several, since, some, subject to, such, such as, than, that, the, their, themselves, these, third, this, those, to, unlike, upon, versus, whereas, whether, whether or not, which, within

abstract terms which belong to Francis’s (1994) category of metalinguistic labels: illocutionary nouns (e.g. argument, question), language-activity nouns (e.g. comparison, contrast, definition, and description), mental process nouns (e.g. analysis, concept, hypothesis, theory, and view) and text nouns (e.g. section, term).

Verbs account for 25 per cent of the list and include verbs that Hinkel (2004) classified as activity verbs (e.g. show, use, deal, provide), reporting verbs (e.g. argue, discuss, emphasize, explain, respond), mental verbs (e.g. examine, assess, interpret, note), linking verbs (e.g. appear, become, be, prove) and logico-semantic relationship verbs (e.g. arise, cause, contrast, differ, follow, illustrate, imply, include).

Although usually disregarded in academic textbooks and teaching materials, adjectives represent 19.35 per cent of the potential academic words in this list. The syntactic function of adjectives is to modify nouns and noun phrases. So it is logical that, if nouns are frequent in academic writing, numerous adjectives will be used to qualify them. The majority of adjectives express value judgements, either positive or negative (e.g. adequate, appropriate, clear, inadequate, incomplete, interesting, useful), possibility (e.g. likely, possible, potential, unlikely) and logico-semantic relationships (e.g. additional, alternative, different, equivalent, final, following, parallel, prime, similar).

As Conrad (1999) pointed out, the category of potential academic adverbs (9.35%) consists essentially of linking (e.g. consequently, conversely, hence, however, thus) and evaluative (e.g. adequately, correctly, effectively, highly, increasingly, inevitably, significantly) words.

The category ‘other’ includes prepositions, conjunctions, pronouns, articles, determiners and ordinal numbers. Inspection of this list illustrates the value of CLAWS ditto tags (Section 2.2): some 40 per cent of the potential academic prepositions are complex (e.g. according to, in terms of, such as, because of). There are also complex conjunctions such as whether or not and given that. Pronouns, articles and determiners are not fully lemmatised by CLAWS. For example, this, these and those are not analysed as word forms of the same determiner and are categorized as separate lemmas.

In order to calculate the percentages of GSL and AWL words in the list of potential academic words, the Academic Keyword List was uploaded to the Web Vocab Profile developed by Tom Cobb.¹⁰ This web interface takes a text file as its input, analyses the vocabulary in the text, and classifies the words into four main categories: (1) the first 1,000 most frequent words of English, (2) the second 1,000 most frequent words of English, (3) words of the Academic Word List, and (4) other words. Before uploading the list of potential academic words, it was necessary to remove multiword units (such as the adverbs for example and for instance and the complex prepositions

derive. reason. result). induce. consequence. include. point. effect. and thus. on account of. compare. example. importance. emphasize. in response to. the prepositions depending on. survey. impact. in view of. in terms of and subject to. viewpoint AWL 40% Other 3% in addition to. the verbs associate. comprise. approach. connect. due to. analysis. in the light of. due to. relate. depend. result and stimulate. the semantic sub-category named ‘A2. method. exception. factor. The Academic Keyword List is a list of potential academic words. reference. exemplify. adequate. link. As explained in Section 1. influence. while 57 per cent are among the 2. The automatic semantic analysis has shown that a number of semantic sub-categories are particularly wellrepresented. given that. summary. increasingly. frequency-based criteria are not used as a defining property of academic words but as a way of operationalizing a function-based definition of academic vocabulary. which provides the retrieval procedure with some evidence of face validity. in the light of. combine. condition.17 shows that a high proportion of AKL words fit the functional definition of academic vocabulary (e. because of. scope. because. explanation. Other well-represented semantic categories . motivation.60 Table 2. discuss. whereas assertion. therefore. argue. and the conjunctions because. hypothesis. argument. assess. versus. The comparison is thus based on single words only. validity. likelihood. hence. namely.18). comparison. criticism. relation. result. difference. show. result and stimulus. proportion. base. effect. the adverbs consequently. render.g.5. relevance. define. provided and since. 1993: 237). cause. prior to. influence. inherent. related and resulting. extent. explain. For example.2. thereby. evidence.18 Lists GSL % 57% Academic Vocabulary in Learner Writing The distribution of AKL words in the GSL and the AWL Examples aim. link. Table 2. 
The results show that only 40 per cent of the potential academic words in my list also occur in the Academic Word List. with respect to. correlation. cause. proposition. implication. generate. in favour of and because of) since Web Vocab Profile cannot deal with them and would simply decompose multiword units into their parts. theory. measure. experience. Affect: cause – connected’ consists of the nouns basis. These results highlight the important role played by general service vocabulary in academic writing. conclusion. the adjectives dependent. attribute.000 most frequent words of English as described in West’s (1953) General Service List (see Table 2. correlation. differ. lead. produce. An important application of word frequency lists for course design is to uncover ‘functional and notional areas which might be important for the syllabus’ (Flowerdew. in relation to. degree. therefore. tackle. conclude. typically accurately. reason. result. consequence. in respect of.

g. considerable. Words such as the noun illustration. General actions. exclude. amount.5.g. the Academic Keyword List includes several words that are relatively well distributed keywords in academic prose but should probably not be part of a list of academic vocabulary (e. primary). Importance’ (e. The (semi-)automatic method used relies on Rayson’s (2008) data-driven model. female. making. classify. The keyword procedure was first used to retrieve a set of words which are distinctive of academic writing. assume. arise.g. academic vocabulary was defined as ‘a set of options to refer to those activities that characterize academic work. instance. illustrate). might count as academic vocabulary) from several corpora of academic prose. conversely). significance. unlike. refer. case. activity. ‘N5. Evaluation’ (e. favourable. organize scientific discourse and build the rhetoric of academic texts’.g.Selection of academic vocabulary 61 include ‘A1. also means ‘to put pictures in a book’. ‘A6. ‘A4. as such. for example.g. ‘A11. parent. which combines elements of both ‘corpus-based’ and ‘corpusdriven’ paradigms in corpus linguistics. examples’ (e. major. 1991: 20–1). similar. event. positive). a sense that will not be useful to all scholars who are writing in higher education settings. words that are reasonably frequent in a wide range of academic texts but relatively uncommon in other kinds of texts and which. figure. ‘Q2.g. male. ‘A5. perform). Comparing’ (e. sex and world). The fact that these AKL words are used to serve particular functions in academic prose has ‘to be corroborated by concordancing’ (Flowerdew. and to retrieve potential academic words (i. etc’ (e. The verb illustrate. suggest) and ‘X2. Similarly. 2003) and is thus better conceived of as a ‘platform from which to launch corpus-based pedagogical enterprises’ (Swales. definition. kind.g. account. groups. difference. 
the verb illustrate and the preposition like are often employed as exemplifiers but also have different uses. 2002: 151). progress. . the nouns country. It still needs ‘pedagogic mediation’ (Widdowson.g. describe. improve.1. discussion. circumstance. view. 1993: 237). consist.g. content. General kinds. consider.1. The Academic Keyword List is not a final product and does not in itself ‘carry any guarantee of pedagogical relevance’ (Widdowson. 2. formulate). scope. quote. The main objective of Chapter 2 has been to operationalize this function-based definition using frequency-based criteria. several). emphasize. contrast. enhance. widely.2.1.8. extent. example. Speech acts’ (e. include).1.g.e. Thought and belief’ (e. fundamental. Inclusion / exclusion’ (e. category. ‘A1. Summary and conclusion In Chapter 1. limited. Quantities’ (e.

inclusion/exclusion. states and processes’. ‘social actions. evaluation. and the minimum coefficient of dispersion (see Oakes and Farrow. This methodology has limitations. First. A reference corpus is itself characterized by a set of distinctive linguistic features. a high proportion of the words included in the Academic Keyword List match the definition of academic vocabulary. irrespective of differences in meaning or function. ‘numbers and measurement’. However. ‘the one thing that EAP seems to lack is a corpus that includes all levels of data – from presessional students’ writing and speech to academic lectures. confirming the accuracy of the retrieval procedure. quantities. Despite these limitations. and published research articles and books. some of which may be shared with the corpus under study. However. the results are dependent on a number of arbitrary cut-off points: the probability threshold under which log-likelihood ratio values are not considered significant. 2009). and speech acts. importance. at present no corpus exists that represents all the varieties of academic discourse. comparison. Second.62 Academic Vocabulary in Learner Writing Second. a limitation inherent in the keyness approach is the use of a reference corpus. The method provides a good illustration of the usefulness of annotation for the development of practical applications such as the Academic Keyword List. with sufficient detailed categorization to enable the users (teachers or students) to select a customized set of corpus texts appropriate for their needs’. which clearly relate to academic work. a number of keywords that had failed the dispersion test were selected on the basis of a semantic criterion: they belonged to one of the six semantic categories (‘general and abstract’. states and processes’ and ‘language and communication’) that included most potential academic words. the minimum number of texts in which a keyword must appear. 
the criteria of range and evenness of distribution were used as a sieve to refine the list of potential academic words. the corpora used are relatively small and contain a limited number of text types. ‘names and grammar’. 2001: 396). Third. The Academic Keyword List might have looked quite different if other cut-off points had been adopted. Such a corpus would need to include as many disciplines as possible. ‘psychological actions. AKL words have been shown to fall in semantic categories such as cause and effect. the organization of scientific discourse. Third. or the building of the argument of academic texts. 2007: 92. There is thus a strong case for using ‘strongly contrasting reference corpora’ (cf. As Krishnamurthy and Kosem (2007: 370) comment. it is likely that a few potential academic words passed unnoticed because they are also used in fiction. Paquot and Bestgen. Tribble. The method has also made . PhD theses.

. 2008: 461). . phraseology.Selection of academic vocabulary 63 it possible to appreciate the paramount importance of general service words in academic prose. A number of general service words take on prominent rhetorical and organizational functions in academic discourse and their absence from lists such as Coxhead’s (2000) Academic Word List may be highly problematic in academic writing courses. Each AKL word ˘ should be subject to a careful corpus-based analysis to confirm its status as an academic word and establish how it is used in academic prose in terms of meaning. However it unquestionably offers ‘a portal into the complex behaviour and intricate relationships of individual lexical items’ (Hanciog lu et al. The Academic Keyword List still needs validation. and sentence position.
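The selection criteria summarized in this chapter can be sketched as a single decision function. This is a hedged illustration rather than the procedure as implemented in the study: the `Candidate` record, its field names and the function `is_potential_academic_word` are hypothetical, the six category letters are taken from Table 2.15, and the D value used for analysis in the example is invented (the text only states that it was below 0.8).

```python
from dataclasses import dataclass

# Letters of the six best represented semantic categories (see Table 2.15):
# A general and abstract terms, N numbers and measurement,
# Q language and communication, S social actions, X psychological actions,
# Z names and grammar.
TOP_SEMANTIC_CATEGORIES = {"A", "N", "Q", "S", "X", "Z"}


@dataclass
class Candidate:
    """Hypothetical record holding what is known about one lemma."""
    lemma: str
    is_keyword: bool         # keyword in professional and student academic writing
    n_subcorpora: int        # range: number of sub-corpora in which it occurs
    juilland_d: float        # evenness of distribution
    semantic_category: str   # most frequent semantic tag (first letter)


def is_potential_academic_word(c, total_subcorpora=15, d_threshold=0.8):
    """Apply keyness, range, dispersion and the semantic fallback, in that order."""
    if not c.is_keyword:
        return False
    if c.n_subcorpora < total_subcorpora:
        return False
    if c.juilland_d > d_threshold:
        return True
    # Words failing the dispersion test are kept only if they belong to one
    # of the six well-represented semantic categories.
    return c.semantic_category in TOP_SEMANTIC_CATEGORIES


if __name__ == "__main__":
    candidates = [
        Candidate("example", True, 15, 0.83, "A"),
        Candidate("law", True, 15, 0.69, "G"),
        Candidate("analysis", True, 15, 0.70, "X"),  # D value invented for illustration
    ]
    for c in candidates:
        print(c.lemma, is_potential_academic_word(c))
```

Changing `d_threshold` or `total_subcorpora` in such a sketch makes it easy to see how sensitive the resulting list is to the cut-off points discussed above.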


Part II Learners’ use of academic vocabulary

Having clarified what academic vocabulary is by providing a critical overview of its many definitions, and proposed a data-driven methodology to extract potential academic words from corpora, I will now examine EFL learners’ use of academic vocabulary. As learner texts are short argumentative essays, I focus on AKL words that are used to organize discourse and build the rhetoric of a text, rather than on words that focus on research, analysis and evaluation.

Chapter 3 offers a detailed description of the corpora used and the method adopted to compare them: Granger’s (1996) Contrastive Interlanguage Analysis (CIA). The learner corpus used is a collection of texts taken from the International Corpus of Learner English.

In Chapter 4 the Academic Keyword List is shown to include a large number of lemmas that can be used to organize academic texts and structure their content around logico-semantic relations. The function of ‘exemplification’ is presented in detail. This illustrates the type of data and the results obtained when the whole range of lexical strategies available to expert writers for establishing cohesive links in their texts is examined. Chapter 4 also focuses on the phraseology of academic words.

Chapter 5 aims to test the working hypothesis that upper-intermediate to advanced EFL learners, irrespective of their mother-tongue background, share a number of linguistic features that characterize their use of academic vocabulary. The EFL learners are all learning how to write in a foreign language, and they are often novice writers in their mother tongue as well. Many of the linguistic features that characterize learner writing are shared by learners from a range of mother tongue backgrounds, and can therefore be assumed to be developmental. However, not all the features of their writing are shared across the different language backgrounds, and the differences may be due to transfer from the writers’ mother tongues. Chapter 5 therefore ends with a brief discussion of the potential influence of their mother tongue on French speakers’ use of academic vocabulary in English.

Chapter 3

Investigating learner language

Whatever the definition adopted, academic vocabulary has generally been described as a major source of difficulty for EFL learners. In the second part of this book, I will investigate whether upper-intermediate to advanced EFL learners, irrespective of their mother tongue backgrounds, share characteristic ways of using academic vocabulary. I will focus on learners’ use of academic words that serve typical organizational or rhetorical functions in academic discourse. In this chapter, I describe the corpora and methodology used to pursue these objectives. The learner corpus used is the International Corpus of Learner English, which is among the largest non-commercial learner corpora currently existing, and contains texts written by learners with different mother-tongue backgrounds. A subset of the British National Corpus is used as a control corpus and helps bring to light features of learner language. The method of analysing learners’ use of academic vocabulary and comparing it with that of expert writers relies on Granger’s (1996a) Contrastive Interlanguage Analysis.

3.1. The International Corpus of Learner English

The learner data used consist of ten sub-corpora of the International Corpus of Learner English version 1 (ICLE) compiled at the University of Louvain, Belgium, under the supervision of Sylviane Granger (Granger et al., 2002; Granger, 2003). A computer learner corpus is an electronic collection of (near-) natural language learner texts assembled according to explicit design criteria (see Granger, 2009 for the design criteria, and Ellis and Barkhuizen, 2005 for a discussion of different types of learner data). Each learner text is documented with a detailed learner-profile questionnaire, which all the learners were requested to fill in. Learner-profile questionnaires give two types of information: learner characteristics and information on the type of task.

Figure 3.1 ICLE task and learner variables (Granger et al., 2002: 13)
Shared features. Learner variables: age, learning context, proficiency level. Task variables: medium, field, genre, length.
Variable features. Learner variables: gender, mother tongue, region, other FL, L2 exposure. Task variables: topic, task setting, timing, exam, reference tools.

As shown in Figure 3.1, ICLE texts share a number of learner and task variables, which were used as corpus-design criteria. All the learners who submitted an essay to the ICLE were university undergraduates and were therefore usually in their twenties. They were learners of English as a Foreign Language rather than as a Second Language and were in their third or fourth year of university study. On the basis of these external criteria, their level is described as advanced although ‘individual learners and learner groups differ in proficiency’ (Granger, 2003: 539). Learner productions have quite a few task variables in common, notably in terms of medium (writing), genre (academic essay), field (general English rather than English for Specific Purposes) and length (between 500 and 1,000 words). Other variables differ (e.g. mother tongue background of learners, L2 exposure, essay topic and task settings).

The ICLE learners represent 11 different mother tongue backgrounds: Bulgarian, Czech, Dutch, Finnish, French, German, Italian, Polish, Russian, Spanish and Swedish. A large proportion of the learner texts are argumentative, but the ICLE essays cover a wide range of topics. Topics include ‘Most university degrees are theoretical and do not prepare students for the real world’, ‘In the words of the old song, “Money is the root of all evil”’ and ‘Feminists have done more harm to the cause of women than good’. Essays differ in task conditions: they may have been written in timed or untimed conditions, as part of an exam or not, with reference tools such as

grammars and dictionaries or not. Most studies of the ICLE data to date have not taken these task settings into consideration (see Ädel’s (2008) analysis of timed and untimed essays for an exception). However, researchers in second and foreign language acquisition and teaching insist that the influence of task type and condition is important (Shaw, 2004; Kroll, 1990).

Task and learner variables can be used to compile homogeneous subcorpora. As shown in Table 3.1, this study makes use of ten ICLE subcorpora, representing different mother tongue backgrounds. Learner essays in each sub-corpus were carefully selected in an attempt to control for the task variables which may affect learner productions: all the texts are untimed argumentative essays, potentially written with the help of reference tools. Although essays written without the help of reference tools would arguably have been more representative of what advanced EFL learners can produce, I chose to select untimed essays with reference tools as they represent the majority of learner texts in ICLE.

Table 3.1 Breakdown of ICLE essays

Sub-corpus           No. of essays   No. of words   Average no. of words per essay
Czech (ICLE-CZ)      147             130,768        890
Dutch (ICLE-DU)      196             162,243        828
Finnish (ICLE-FI)    167             125,292        750
French (ICLE-FR)     228             136,343        598
German (ICLE-GE)     179             109,556        612
Italian (ICLE-IT)    79              47,739         604
Polish (ICLE-PO)     221             140,521        636
Russian (ICLE-RU)    194             165,937        855
Spanish (ICLE-SP)    149             99,119         665
Swedish (ICLE-SW)    81              48,060         593
TOTAL                1641            1,165,524      697

The ICLE sub-corpora compiled for this study were analysed with the Concord tool of the computer software WordSmith Tools 4 (Scott, 2004). The software was used to examine the occurrences of potential academic words in context. The Clusters option proved very useful for identifying the most frequent n-grams or lexical bundles that contained the words being studied. More information on the types of analyses that can be performed with WordSmith Tools can be found on Mike Scott’s webpage (http://www.lexically. …) and in Scott and Tribble (2006).

3.2. Contrastive Interlanguage Analysis

The methodology most frequently used to analyse learner corpora is Contrastive Interlanguage Analysis (CIA) (Granger 1996a; Gilquin 2000/2001). Unlike contrastive analysis, which consists of comparing two or more languages, CIA compares varieties of one language: native and non-native varieties (L1/L2), or different non-native varieties (L2/L2) (see Figure 3.2). L1/L2 comparisons bring out the features of non-nativeness in learner productions, ‘which at an advanced level are as much (if not more) a question of over- and under-use of linguistic items or structures as a question of downright errors’ (Granger et al., 2009: 41). Comparisons of different interlanguages (e.g. the English of French speakers compared to that of Dutch speakers), on the other hand, make it possible to assess whether these features are peculiar to one language group (and thus possibly due to the influence of the learner’s mother tongue), or shared by several learner populations (and therefore likely to be developmental or due to other causes such as teaching methods) (cf. Granger, 2002).

Figure 3.2 Contrastive Interlanguage Analysis (Granger, 1996a): CIA comprises L1 > < L2 comparisons and L2 > < L2 comparisons

In his preface to Learner English on Computer (Granger, 1998), Leech describes the native control corpus as ‘a standard of comparison, a norm against which to measure the characteristics of the learner corpora’ (Leech, 1998: xv). In second language acquisition research, however, L1/L2 comparisons have generally been criticized for being guilty of the ‘comparative fallacy’ (Bley-Vroman, 1983), i.e. for comparing learner language to a native speaker norm and thus failing to analyse interlanguage in its own right (see Lakshmanan and Selinker, 2001; Firth and Wagner, 1997). A strong argument that can be invoked in defence of the CIA model is that the native speaker norm used in learner corpus research is explicit and corpus-based (Mukherjee, 2005) rather than implicit and intuition-based as has been common in second language acquisition (SLA) studies

(see also Granger, 2009: 20). Although they do not dwell on it, Lakshmanan and Selinker (2001) address the issue of the comparative fallacy and warn against the danger of ‘judging language learner speech utterances as ungrammatical from the standpoint of the target grammar without first having compared the relevant interlanguage utterances with the related speech utterances in adult native-speaker spoken discourse’ (Lakshmanan and Selinker, 2001: 401). Lakshmanan and Selinker’s point may be understood as a plea for more natural language data (i.e. corpus data) and a warning against hasty conclusions based on a single researcher’s intuitions.

Another criticism of L1/L2 comparisons is directed at the idea of the ‘native speaker’ as the target norm (e.g. Piller, 2001; Tan, 2005). Mukherjee, however, argues that ‘nativeness’ remains a useful construct both for linguistics and for the ELT community, a ‘useful myth’ in Davies’s (2003) terms. He proposes a usage-based definition of the native speaker based on three aspects that he regards as central to native-like performance, i.e. lexico-grammaticality, acceptability and idiomaticity (see Pawley and Syder, 1983):

The term ‘native speaker’ should be used for an abstraction of all language users (1) who have good intuitions about what is lexicogrammatically possible in a given language and speak/write accordingly, (2) who know to a large extent what is acceptable in a given communication situation and speak/write accordingly, (3) whose usage is largely idiomatic in terms of linguistic routines commonly used in a given speech community. If we refer to an individual speaker as a native speaker, this speaker is thus taken to exemplify the abstract native speaker model on grounds of his/her language use. (Mukherjee, 2005: 14)

Mukherjee advocates a corpus-approximation to the native speaker norm and argues that corpus data can be used to describe this norm by ‘generalizing and abstracting from a vast amount of representative performance data’ (ibid: 15). In this book, the corpus-approximation to the native speaker norm is based on British and American English corpora. It should be noted, however, that the existence of a variety of norms is recognized in learner corpus research (see Granger, 2009) and that other varieties of English are sometimes used as control corpora. For example, Gilquin and Granger (2008) compared the Tswana component of the second edition of the International Corpus of Learner English to a corpus of South African English editorials. The control corpora used in this study are described below.

3.3. A comparison of learner vs. expert writing

Carrying out L1/L2 comparisons implies choosing an L1 corpus to be used as some kind of ‘norm’ with which the learner corpus data can be compared. There is no general agreement, however, on the type of material that is best suited to serve as a control for a learner corpus. Several researchers have criticized the use of professional writing in learner corpus research, arguing that it is ‘both unfair and descriptively inadequate’ (Lorenz, 1999a: 14) and taking a stand against the ‘unrealistic standard of “expert writer” models’ (Hyland and Milton, 1997: 184). The question of the norm is best addressed by considering the aim of the comparison. As Ädel put it,

On the one hand, it can be argued that in order to evaluate foreign learner writing by students justly, we need to use native-speaker writing that is also produced by students for comparison. On the other hand, it can also be argued that professional writing represents the norm that advanced foreign learner writers try to reach and their teachers try to promote. In this respect, a useful corpus for comparison is one which offers a collection of what Bazerman (1994: 131) calls ‘expert performances’. (Ädel, 2006: 206–7)

Native student writing is arguably a better source of comparable data to EFL learner writing if the aim of the comparison is to describe and evaluate interlanguage(s) as fairly as possible. It is, however, doubtful whether the findings from such comparisons could make their way into the classroom. As Leech puts it, ‘native-speaking students do not necessarily provide models that everyone would want to imitate’ (Leech, 1998: xix). For example, native students have been shown to produce more dangling participles than EFL learners (Granger, 1997) and different types of spelling mistakes (Cutting, 2000). Professional writing has a major role to play in learner corpus research if instruction and pedagogical applications are the goals of the comparison between learner and native-speaker productions. In this study, learner writing was compared to expert academic prose.

The International Corpus of Learner English, however, consists of argumentative texts and ‘argumentative essay writing has no exact equivalent in professional writing’ (De Cock, 2003: 196). It has been suggested that the ICLE might be compared with ‘a corpus of newspaper editorials, a text type which combines the advantages of being argumentative in nature and written by professionals’ (Granger, 1998a: 18, footnote 10). In a number of

studies based on ICLE texts produced by Spanish EFL learners, Neff and her colleagues used both native-speaker student writing and newspaper editorials as control corpora (Neff et al., 2004a, 2004b; Neff van Aertselaer, 2008). General corpora have also been used in learner corpus research. Nesselhauf (2005), for example, made use of the written part of the British National Corpus to determine the degree of acceptability of verb-noun combinations that had been extracted from the German subset of ICLE.

The British National Corpus (BNC) was created to be a balanced reference corpus of late twentieth century British English. The BNC contains both written and spoken material. The written component totals about 90 million words and includes samples of academic books, newspaper articles, popular fiction, letters, university essays and many other text types. The BNC mark-up conforms to the Text Encoding Initiative (TEI) recommendations (Burnard, 2007). Mark-ups include rich metadata on a variety of structural properties of texts (e.g. headings, sentences and paragraphs), file description, text profile, etc., as well as linguistic information (morphosyntactic tags, lemmas, etc.).

Three criteria were originally used to select written texts to design a balanced corpus: domain, time and medium. Domain refers to the subject field of the texts, time refers to the period when the text was written, and medium refers to the type of publication (books, periodicals, newspapers, etc.). The text selection procedure has been described as follows:

In selecting texts for inclusion in the corpus, account was taken of both production, by sampling a wide variety of distinct types of material, and reception, by selecting instances of those types which have a wide distribution. Thus, having chosen to sample such things as popular novels, or technical writing, best-seller lists and library circulation statistics were consulted to select particular examples of them. (Aston and Burnard, 1998: 28)

Lee (2001) criticized the domain categories for being overly broad and not sufficiently explicit, and developed a new resource called the BNC Index which contains genre labels for all BNC texts. Table 3.2 gives the breakdown of the genre categories in the BNC written corpus and shows that genre labels are often hierarchically nested. Thus, if we want to analyse texts produced by scholars specializing in natural sciences, we can select all BNC texts classified under ‘W_ac_nat_science’. On the other hand, if we are not interested in discipline-specific differences and want to examine

Table 3.2 BNC Index – Breakdown of written BNC genres (Lee 2001)

‘Super genres’ distinguished in the table: Academic prose, Non-academic prose, Fiction, News, Letters, Unpublished essays.

BNC written genre               No. of files
W_ac_humanities_arts            87
W_ac_medicine                   24
W_ac_nat_science                43
W_ac_polit_law_edu              186
W_ac_soc_science                138
W_ac_tech_engin                 23
W_admin                         12
W_advert                        60
W_biography                     100
W_commerce                      112
W_email                         7
W_essay_sch                     7
W_essay_univ                    4
W_fict_drama                    2
W_fict_poetry                   30
W_fict_prose                    432
W_hansard                       4
W_institut_doc                  43
W_instructional                 15
W_letters_personal              6
W_letters_prof                  11
W_misc                          500
W_news_script                   32
W_news_brdsht_nat_arts          51
W_news_brdsht_nat_commerce      44
W_news_brdsht_nat_editorial     12
W_news_brdsht_nat_misc          95
W_news_brdsht_nat_reportage     49
W_news_brdsht_nat_science       29
W_news_brdsht_nat_social        36
W_news_brdsht_nat_sports        24
W_news_other_arts               15
W_news_other_commerce           17
W_news_other_report             39
W_news_other_science            23
W_news_other_social             37
W_news_other_sports             9
W_news_tabloid                  6
W_non_ac_humanities_arts        111
W_non_ac_medicine               17
W_non_ac_nat_science            62
W_non_ac_polit_law_edu          93
W_non_ac_soc_science            128
W_non_ac_tech_engin             123
W_pop_lore                      211
W_religion                      35
TOTAL                           3,144

texts produced by professional writers in higher education settings, we can select all texts whose categorizing labels begin with ‘W_ac’.

The BNC sub-corpus of academic prose in humanities and arts (W_ac_humanities_arts, henceforth BNC-AC-HUM) totals 3,321,867 words. It was used as the comparison corpus to ICLE in this study. This sub-corpus was chosen for two main reasons. First, ICLE texts were produced by students of humanities. Second, texts in the BNC-AC-HUM are arguably quite close to the type of text these students might have come across in their first few years at university. They also have the advantage of corresponding to the type of writing that learners will try to produce during their university studies.

There are, however, major differences between the two corpora. First, ICLE is a corpus of unpublished university student essays while BNC-AC-HUM consists of samples of published articles and books. Second, student essays rarely total more than 1,000 words while samples in the BNC-AC-HUM are much longer (from 25,000 to 45,000 words). Third, the topics in BNC-AC-HUM differ from those in ICLE (described in Section 3.1 above). They include The people’s peace, British literature since 1945, Europe in the central middle ages, National liberation, China’s students, Soviet relations with Latin America, Nietzsche on tragedy, What is this thing called science?, The morality of freedom, among others. Unlike the ICLE, topics in BNC-AC-HUM appear only once. Interpreting the results in the light of genre analysis thus required special care: differences between student essays and expert writing may simply reflect differences in their communicative goals and settings (Neff et al., 2004a).

The spoken part of the British National Corpus was also regularly consulted to check whether words and word sequences that were found in learner writing are more typical of speech or academic writing. The BNC spoken corpus (BNC-SP) consists of 10,334,947 words and includes a wide variety of spoken registers, i.e. broadcast documentaries and news, interviews and lectures, etc. The W_fic and W_news sub-corpora (cf. Table 3.2) were sometimes used to compare the frequency of words and phrases across ‘super genres’.

The British National Corpus was accessed via the BNCweb (CQP-edition) interface developed by Stefan Evert and Sebastian Hoffmann. This web interface is the result of a ‘marriage of two corpus tools’ (Hoffmann and Evert, 2006), i.e. the BNCweb, a web-based client developed at the University of Zurich which allows users to access the BNC by means of a Web browser (see Lehmann et al., 2000) and the Corpus Query Processor (CQP), a central component of IMS Open Corpus Workbench, which ‘allows sophisticated searches both for individual words (which can be matched against regular expressions) and for lexico-grammatical patterns (using linear grammars that have access to all levels of annotation)’ (Hoffmann and Evert, 2006: 180). The CQP edition of the BNCweb combines the strengths of both software packages while overcoming their limitations. It is a marriage between the efficiency and flexibility of CQP queries, and the user-friendliness of BNCweb with its wide range of query options and display facilities. Hoffmann et al. (2008) provide further information on the British National Corpus and the BNCweb interface.

One tool that is particularly useful is Collocations, which picks out significant co-occurrents of the search word on the basis of a number of measures of association. Association measures are the most widely used method of distinguishing between casual and significant co-occurrences. They compute an association score for each pair of words extracted from a corpus, which indicates the strength of the association relative to that expected by chance. Users of the BNCweb can decide to use any of five different measures: mutual information, MI3, z-score, log-likelihood and log-log measures. They can also sort co-occurrents by decreasing frequency. A number of other settings are customizable, e.g. maximum window span, inclusion of lemma and part-of-speech information, minimum frequency of the co-occurrent, minimum frequency of the co-occurrence, etc.

Mutual information, MI3, z-score, log-likelihood and log-log measures rank co-occurrences in very different ways (Evert, 2004). McEnery et al. (2006) compared the various statistical measures provided by BNCweb and reported that ‘MI and z-scores tend to put too much emphasis on infrequent words. In contrast, the log-likelihood, log-log and MI3 tests appear to provide more realistic collocation information’ (McEnery et al., 2006: 220). The log-likelihood test was therefore used to study the phraseology of academic words in expert and learner writing. Co-occurrences were analysed in windows of one to three words to both the left (3L-1L) and the right (1R-3R). Figure 3.3 displays a collocation query result. Significant co-occurrents are sorted by decreasing log-likelihood values (right column). The frequency of the co-occurrence is given together with the number of texts in which it appears.

The log-likelihood scores can be directly compared with critical values of a chi-square distribution table (see Oakes, 1998: 176). Rayson et al. (2004), however, focused on the comparison of word frequencies between corpora and suggested that, in order to extend applicability of the frequency comparisons to low expected values, use of a threshold value of 15.13 is preferred, which corresponds to the 0.01 per cent level (p < 0.0001). Co-occurrence frequencies can be quite low and I therefore followed Rayson et al.’s (2004) suggestion.

Figure 3.3 BNCweb Collocations option
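As a rough illustration of the log-likelihood measure discussed in this section, the sketch below computes the statistic for one node and one co-occurrent from a 2 × 2 contingency table and compares the result with the 15.13 threshold. It is not BNCweb’s implementation: the function name, the argument names and the example frequencies are hypothetical, and filling the table directly from raw corpus frequencies is a simplifying assumption.

```python
import math

LL_THRESHOLD = 15.13  # threshold value adopted in this chapter


def log_likelihood(cooc, f_node, f_collocate, n_tokens):
    """Log-likelihood (G2) score for a node and one co-occurrent.

    cooc        -- observed number of co-occurrences
    f_node      -- corpus frequency of the node word
    f_collocate -- corpus frequency of the co-occurrent
    n_tokens    -- corpus size (assumption: used as the table total)
    """
    observed = [
        cooc,
        f_node - cooc,
        f_collocate - cooc,
        n_tokens - f_node - f_collocate + cooc,
    ]
    expected = [
        f_node * f_collocate / n_tokens,
        f_node * (n_tokens - f_collocate) / n_tokens,
        (n_tokens - f_node) * f_collocate / n_tokens,
        (n_tokens - f_node) * (n_tokens - f_collocate) / n_tokens,
    ]
    # Empty cells contribute nothing to the sum (0 * ln 0 is taken as 0).
    return 2 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical frequencies, for illustration only.
score = log_likelihood(cooc=30, f_node=100, f_collocate=100, n_tokens=10_000)
print(round(score, 2), score > LL_THRESHOLD)
```

When the observed co-occurrence frequency equals the frequency expected by chance, the score is zero; the further the observed table departs from the expected one, the higher the score.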

Log-likelihood measures are strongly dependent on corpus size and word frequencies. Co-occurrence statistics are therefore not comparable across corpora of different sizes such as the British National Corpus and the International Corpus of Learner English. The ICLE sub-corpora are in fact too small for a statistical analysis of co-occurrences to be meaningful. In his study of the statistics of word co-occurrences, Evert argued that ‘data with co-occurrence frequency f < 3, i.e. the hapax and dis-legomena, should always be excluded from the statistical analysis’ (Evert, 2004: 133) as expected frequencies and p values for low frequency words are distorted in unpredictable ways. Academic words are not high-frequency words such as make, do and take and co-occurrences often appear less than three times.

A distributional (Evert, 2004) or frequency-based approach (Nesselhauf, 2004) was adopted to examine the phraseology of academic words in learner writing. Word pairs in the ICLE sub-corpora were classified into three groups according to their co-occurrence status in professional academic writing:

– word pairs that are statistically significant co-occurrents in the academic sub-corpus of the BNC (BNC-AC);
– word pairs that appear in the BNC-AC but are not statistically significant co-occurrents;
– word pairs that do not appear in the BNC-AC.

Word pairs that did not appear in the BNC-AC were presented to a native speaker of English for acceptability judgments. In a pilot study, I found that learners’ word pairs were sometimes not statistically significant in BNC-AC-HUM just because the co-occurrence was not frequent enough. As soon as more data was used, the co-occurrence proved significant. I therefore decided to use the larger BNC-AC instead of the BNC-AC-HUM to judge the acceptability and typicality of EFL learners’ phraseological sequences (see Section 5.2.3).

3.4. Summary and conclusion

This chapter has described the data and methodology used to investigate the use of academic vocabulary in writing by EFL learners. Special care has been taken to select a set of learner essays from the International Corpus of Learner English that is as homogeneous as possible and to control for a number of variables that have been found to influence such writing.



The learner corpus can be compared to the humanities and arts academic sub-corpus of the British National Corpus to identify learner-specific features of the use of academic vocabulary. The BNC spoken corpus can also sometimes be useful to check whether specific words and phrases that appear in the learner sub-corpora are more typical of speech or writing. The method used to investigate learners’ use of academic vocabulary is based on Contrastive Interlanguage Analysis (CIA) and combines comparisons of learner and native-speaker writing, and comparisons of different learner interlanguages. CIA is very popular among researchers in the field of learner corpus research and has helped to highlight an unprecedented number of features that characterize learner interlanguages. To date, however, most studies have used the technique only to compare a learner corpus and a native reference corpus, rather than to explore different learner corpora in the same target language. The studies that have compared more than one interlanguage have usually focused on learners from one mother-tongue background, and used data from one or two other learner populations only to check whether the features they have identified are L1-specific (and thus possibly transfer-related) or are shared by other learners. L2/L2 comparisons involving many different first languages are, however, indispensable if we want to identify the distinguishing features of learner language at a given stage of development (Bartning, 1997). In the following chapters, I try to make the most of CIA by comparing academic vocabulary in ten learner corpora representing different mother-tongue backgrounds.
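The three-way classification of learner word pairs according to their status in the BNC-AC, described in Section 3.3, can be sketched as follows. This is a minimal illustration and not the procedure actually implemented for the study: the names `CoocStatus` and `classify_pair`, the example pairs and their scores are all hypothetical, and using 15.13 as the significance cut-off for co-occurrences is an assumption carried over from the discussion of Rayson et al. (2004).

```python
from enum import Enum

LL_THRESHOLD = 15.13  # assumed significance cut-off


class CoocStatus(Enum):
    SIGNIFICANT = "statistically significant co-occurrent in the BNC-AC"
    ATTESTED = "appears in the BNC-AC but is not statistically significant"
    ABSENT = "does not appear in the BNC-AC"


def classify_pair(pair, bnc_ac_scores, threshold=LL_THRESHOLD):
    """Assign a learner word pair to one of the three groups.

    bnc_ac_scores maps each word pair attested in the reference
    corpus to its log-likelihood score.
    """
    if pair not in bnc_ac_scores:
        return CoocStatus.ABSENT
    if bnc_ac_scores[pair] >= threshold:
        return CoocStatus.SIGNIFICANT
    return CoocStatus.ATTESTED


# Invented scores, for illustration only.
reference_scores = {
    ("draw", "conclusion"): 250.0,
    ("big", "conclusion"): 2.1,
}

for learner_pair in [("draw", "conclusion"), ("big", "conclusion"), ("do", "conclusion")]:
    print(learner_pair, "->", classify_pair(learner_pair, reference_scores).value)
```

Pairs that fall into the third group would then be passed on to a native speaker for acceptability judgments rather than being treated as errors outright.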


Chapter 4

Rhetorical functions in expert academic writing

This chapter deals with academic vocabulary that serves specific rhetorical and organizational functions in expert academic writing. Section 4.1 focuses on the Academic Keyword List and shows that a high proportion of AKL words can fulfil these functions in academic prose. It lists several steps which are necessary to turn the AKL into a tool that can be used for curriculum and materials design (most notably a phraseological analysis of AKL words). Section 4.2 presents a detailed analysis of exemplificatory devices in academic writing. This serves as an illustration of the type of data and results obtained when the whole range of lexical strategies available to expert writers to organize scientific discourse is examined. For lack of space it is impossible to describe in similar detail all the functions that were analysed in the BNC-AC-HUM so as to provide a basis for comparison to EFL learner writing. Section 4.3 briefly comments on the types of lexical devices used by expert writers to serve four additional functions (‘expressing cause and effect’, ‘comparing and contrasting’, ‘expressing a concession’ and ‘reformulating: paraphrasing and clarifying’) and aims to characterize the phraseology of rhetorical functions in academic prose.

4.1. The Academic Keyword List and rhetorical functions
The functional syllabus has a long tradition in English language teaching (see Wilkins, 1976; Weissberg and Buker, 1978). Jordan (1997: 165) reports that most of the textbooks that were published in Britain in the 1980s and 1990s that followed a product approach to academic writing were organized according to language functions such as explanation, definition, exemplification, classification, cause and effect, and comparison and contrast (e.g. Jordan, 1999). However, they were rarely based on principled selection criteria, relying instead on the writers’ perceptions of good practice in academic writing. Unlike textbooks adopting a functional approach, courses which use vocabulary as the unit of progression introduce new words according to principles such as frequency and range of occurrence. Nation explains that such courses generally combine a “series” and a “field” approach to selection and sequencing. In a series approach, the items in a course are ordered according to a principle such as frequency of occurrence, complexity or communicative need. In a field approach, a group of items is chosen and the course covers them in any order that is convenient, eventually checking that all the items are adequately covered. Courses which use vocabulary as the unit of progression tend to break vocabulary lists into manageable fields, (. . . ), according to frequency, which are then covered in an opportunistic way. (Nation, 2001: 386) Most pedagogical applications of Coxhead’s (2000) Academic Word List to date have adopted this particular approach, using the frequency-based AWL sub-lists as fields (e.g. Obenda, 2004; Huntley, 2006). There is a need for teaching materials that merge the two types of syllabus design, thus adopting a ‘functional-product’ approach (Jordan, 1997: 165) to academic writing while introducing new vocabulary according to principled criteria such as frequency and range of occurrence. This is precisely where the Academic Keyword List has a role to play. As explained in Section 2.4, the Academic Keyword List requires pedagogic mediation: it is a platform which can inform a functional syllabus for academic writing, but it needs to be organized. As argued by Martinez et al. (2009: 193), ‘a list based on semantic and pragmatic criteria would perhaps be more useful than lists built solely on frequency criteria.’ Sinclair, however, warns us that ‘there is no assumption that meaning attaches only to the word’ (Sinclair, 2004b: 160).
Similarly, Siepmann (2005: 86) comments that ‘neat compartmentalizing of meanings or functions can do no more than partially capture a complex reality’ in which any word or multi-word sequence may express more than one discourse relation. This being said, the results of the automatic semantic analysis of the Academic Keyword List revealed that a significant proportion of AKL words fall into semantic categories that correspond to the rhetorical functions typical of scientific discourse, e.g. A2.2. Affect: Cause – connected, A4. Classification, A5. Evaluation, A6. Comparing, Q2.2. Speech acts (see Section 2.4). A close

Rhetorical functions in expert academic writing


examination of the words classified into these semantic categories made it possible to identify twelve rhetorical functions that dovetail with the functions typically treated in EAP textbooks adopting a functional approach to academic writing:

1. Exemplification, e.g. example, for example, illustrate
2. Cause and effect, e.g. cause, consequence, result
3. Comparison and contrast, e.g. contrast, difference, same
4. Concession, e.g. although, despite, however
5. Adding information, e.g. first, further, in addition to
6. Expressing personal opinion, e.g. appropriate, essential, major
7. Expressing possibility and certainty, e.g. likely, possibility, unlikely
8. Introducing topics and ideas, e.g. discuss, examine, subject
9. Listing items, e.g. first, second, third
10. Reformulating – paraphrasing and clarifying, e.g. namely
11. Quoting and reporting, e.g. define, report, suggest
12. Summarizing and drawing conclusions, e.g. conclude, conclusion, summary.

The next step is to analyse all words that may serve one of these twelve functions in context, with special emphasis on their phraseology. Multiword sequences have been shown to provide ‘basic building blocks for constructing spoken and written discourse’ (Biber and Conrad, 1999: 185) and to correlate closely with the complex communicative demands of a particular genre, thus contributing to its lexical profile (Biber et al., 1999; 2004; Luzón Marco, 2000). A phraseological analysis will also make it possible to investigate how academic vocabulary contributes to this ‘shared scientific voice or “phraseological accent” which leads much technical writing to polarise around a number of stock phrases’ (Gledhill, 2000: 204). It will examine phrasemes, i.e. syntagmatic relations between at least two lemmas, contiguous or not, written separately or together, which are typically syntactically closely related and constitute ‘“preferred” ways of saying things’ (Altenberg, 1998: 122). This is because such phrasemes:

– form a functional (referential, textual or communicative) unit (e.g. Burger, 1998); and
– display arbitrary lexical restrictions (e.g. Mel’čuk, 1998);

– display arbitrary restrictions on the word forms that can be used to instantiate at least one of the lemmas involved;
– display a certain degree of syntactic fixity;
and/or
– are characterized by a certain degree of semantic non-compositionality (e.g. Barkema, 1996).

The phraseological analysis used here is based on Granger and Paquot’s (2008a) classification of phraseological units into three main categories: referential phrasemes, textual phrasemes and communicative phrasemes. Referential phrasemes are used to convey a content message: they refer to objects, phenomena or real-life facts. They include lexical and grammatical collocations, idioms, similes, irreversible bi- and trinomials, compounds and phrasal verbs. Textual phrasemes are typically used to structure and organize the content (i.e. referential information) of a text or any type of discourse; they include grammaticalized sequences such as complex prepositions and complex conjunctions, linking adverbials and textual sentence stems. Communicative phrasemes are used to express feelings or beliefs towards a propositional content or to explicitly address interlocutors, either to focus their attention, include them as discourse participants or influence them. They include speech act formulae, attitudinal formulae, commonplaces, proverbs and slogans.

The use of academic words is compared in the BNC-AC-HUM corpus (a corpus of book samples and journal articles written by experts in the fields of arts and humanities) and the International Corpus of Learner English (a corpus of short argumentative essays produced by EFL learners of English). In this chapter and the next, I focus on the vocabulary of five rhetorical functions (exemplification, cause and effect, comparison and contrast, concession, and reformulating), with occasional forays into other functions. Apart from being essential rhetorical functions in academic prose, these functions should be among the least sensitive to the text type differences discussed in Section 3. As the BNC includes truncated texts, it would not be reasonable to quantitatively compare the words that are used to serve the function of ‘summarizing and drawing conclusions’. This function is typically localized in the last paragraphs of a piece of academic writing and might thus be absent from a number of BNC texts. Nor is it reasonable to focus on functions such as ‘reporting and quoting’ and ‘expressing personal opinion’. Unlike experts writing in their field, the learners who produced the argumentative essays were not supposed to show that they were familiar with the subject by referring to or quoting from the literature. By contrast, they were explicitly encouraged to give their personal opinions (topics for the essays include: ‘Some people say that in our modern world, dominated by science, technology and industrialism, there is no longer a place for dreaming and imagination. What is your opinion?’ and ‘In the 19th century, Victor Hugo said: “How sad it is to think that nature is calling out but humanity refuses to pay heed.” Do you think it is still true nowadays?’) (see Section 3.1).

Several researchers in applied corpus linguistics have examined language features in general reference corpora and compared the distributions and patterns found in actual language use with the presentations of the same features in teaching materials such as textbooks or grammars (e.g. Carter, 1998b; Conrad, 2004; Römer, 2004a, 2004b, 2005). They have often found considerable mismatches between naturally-occurring language and the type of language that is presented as a model in teaching materials (Römer, 2008: 4). I therefore consulted several EAP textbooks (Harris Leonhard, 2002; Jordan, 1999; Lonon Blanton, 2001; Oshima and Hogue, 2006; Ruetten, 2003; Zemach and Rumisek, 2005; Zwier, 2002) and listed all the lexical items that are commonly taught to serve rhetorical functions. The textbooks-derived list appeared to be very different from the words found in the Academic Keyword List. For example, the AKL includes a number of words and phrasemes that are commonly used as exemplifiers: the word-like units for example and for instance, the noun example, the verbs illustrate and exemplify, the preposition such as, the adverb notably and the abbreviation e.g. Other lexical items listed in EAP materials but not found in the AKL are the expressions by way of illustration and to name but a few, the nouns illustration and a case in point and the preposition like.

I decided to include the lexical items found in EAP textbooks in my study of academic vocabulary for two main reasons. First, it is not quite clear why these items are taught to novice writers and EFL learners while other much more frequent lexical items that are used to express the same rhetorical functions are missing. Frequency, however, may not be the sole criterion to include lexical items in the curriculum (see Section 1.1.1). Some of these non-AKL words may be used in very specific lexico-grammatical environments, have a very restricted meaning or prove particularly difficult for learners. It is only by examining their frequency and patterns of use in expert and learner corpora that I shall be able to assess whether these words and phrasemes should be part of an academic vocabulary and added to the AKL.

Second, their inclusion in the description of a specific function in academic writing allows us to approximate as closely as possible what Hoffmann (2004: 190) referred to as ‘conceptual frequency’, so that the frequency of each exemplificatory lexical item can be calculated as a proportion of the total number of exemplifiers. As Wray stated in her book on formulaic language,

To capture the extent to which a word string is the preferred way of expressing a given idea (for this is at the heart of how prefabrication is claimed to affect the selection of a message form), we need to know not only how often that form can be found in the sample, but also how often it could have occurred. In other words, we need a way to calculate the occurrences of a particular message form as a proportion of the total number of attempts to express that message. (Wray, 2002: 30)

This approach should help us move towards ‘understanding the intersection of form and function’ (Swales, 2002: 163) in academic prose.

The Academic Keyword List is based on native corpora only, which has limitations for an analysis of learner writing, especially if conceptual frequency is to be investigated. EFL learners may use different lexical devices than native writers to serve rhetorical functions. For example, they repeatedly use word-like units such as in a nutshell, in brief and all in all for summarizing and concluding, which are quite rare in academic prose. A keyword procedure such as that described in Section 2.3.1 for the automatic extraction of potential academic words was therefore adopted to identify words and word sequences that EFL learners frequently use, but which are not favoured by expert academic writers. The ICLE corpus was compared to the BNC-AC-HUM to extract distinctive words in the learner corpus. In learner corpus research, positive keywords are often referred to as overused words and negative keywords are said to be underused. These two terms are neutral, and simply reflect the fact that a word is more/less frequent in learner writing. The resulting list was analysed to identify words that might serve one of the 12 rhetorical functions listed above. Examples of overused words which do not belong to the AKL but are employed to serve rhetorical functions in learner writing include like, thing, say, let, I, maybe, really, sure, always, thanks, opinion, but, firstly, secondly, thirdly, so and why (see De Cock, 2003 for a keyword analysis of the French sub-corpus of ICLE). Learner-specific word sequences are discussed in Section 5.2.3.
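The overuse/underuse comparison described above can be sketched in a few lines of code. This is only an illustration, not the procedure actually used in the study: it assumes the common two-corpus log-likelihood statistic as the keyness measure and 3.84 (the chi-square critical value for p < 0.05 with one degree of freedom) as the cut-off. The function names and all frequencies in the usage example are hypothetical.

```python
import math


def log_likelihood(freq_a: int, size_a: int, freq_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood (G2) for a single word.

    freq_a / freq_b are the word's frequencies in corpus A and corpus B;
    size_a / size_b are the corpus sizes in running words.
    """
    combined = freq_a + freq_b
    expected_a = size_a * combined / (size_a + size_b)
    expected_b = size_b * combined / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def keyness_label(freq_learner: int, size_learner: int,
                  freq_expert: int, size_expert: int,
                  critical_value: float = 3.84) -> str:
    """Label a word as overused or underused in the learner corpus.

    The labels are purely descriptive: they only say that the word is
    significantly more or less frequent in learner writing.
    """
    g2 = log_likelihood(freq_learner, size_learner, freq_expert, size_expert)
    if g2 < critical_value:
        return "not distinctive"
    learner_rate = freq_learner / size_learner
    expert_rate = freq_expert / size_expert
    return "overused" if learner_rate > expert_rate else "underused"


# Hypothetical counts, for illustration only.
print(keyness_label(120, 100_000, 20, 100_000))
```

A positive keyword in the learner corpus corresponds to the "overused" label and a negative keyword to "underused"; neither label implies an error on the learner's part.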

The final lists of words that may be used to serve one of the five selected rhetorical functions are given below. They were identified on the basis of a close examination of EAP materials and a keyword analysis of learner corpora. The words in italics are not part of the Academic Keyword List. They are included in the corpus-based analyses presented in this chapter and in Chapter 5 to assess the adequacy of the treatment of rhetorical functions in EAP textbooks and investigate whether the AKL should be supplemented with additional academic words.

Exemplification: example, illustration, a case in point, illustrate, exemplify, for example, for instance, e.g., notably, to name but a few, by way of illustration, such as, like

Comparison and contrast: analogy, comparison, contrast, difference, differentiation, distinction, distinctiveness, parallel, parallelism, resemblance, similarity, (the) same, (the) contrary, (the) opposite, (the) reverse, alike, analogous, common, comparable, contrasting, contrary, different, differing, distinct, distinctive, distinguishable, identical, opposite, parallel, reverse, same, similar, unlike, compare, contrast, correspond, differ, differentiate, distinguish, look like, parallel, resemble, analogously, comparatively, contrariwise, contrastingly, conversely, correspondingly, differently, distinctively, identically, likewise, parallely, reversely, similarly, by/in comparison, by/in contrast, by way of contrast, in the same way, on the contrary, on the other hand, *on the opposite, *on the other side, as against, as opposed to, by/in comparison with, compared with/to, contrary to, *in contrary to, in contrast to/with, in parallel with, like, unlike, versus, as, as … as, in the same way as/that, whereas, while

Cause and effect: cause (n.), consequence, effect, factor, implication, origin, outcome, reason, result, root, source, arise from/out of, bring about, cause (v.), contribute to, derive, emerge, follow from, generate, give rise to, induce, lead, make sb/sth do sth, prompt, provoke, result in/from, stem from, trigger, yield, consequent, responsible, accordingly, as a consequence, as a result, by implication, consequently, hence, in consequence, so, thereby, therefore, thus, as a consequence of, as a result of, because of, due to, in consequence of, in (the) light of, in view of, on account of, owing to, thanks to, as, because, for, on the grounds that, since, so that, ... is why

Concession: however, nevertheless, nonetheless, though (adv.), yet, although, even if, even though, though (conj.), albeit, despite, in spite of, notwithstanding

Reformulating – paraphrasing and clarifying: i.e., that is, that is to say, in other words, namely, viz., or more precisely, or more accurately, or rather

4.2. The function of exemplification

This section presents a detailed analysis of the academic words that are used by expert writers to serve the rhetorical function of exemplification. Siepmann (2005) showed that exemplificatory discourse markers occur in all kinds of discursive prose, and are particularly frequent in humanities texts. He argued, however, that as an object of study, ‘exemplification continues to be the poor relation of other rhetorical devices’ and that ‘such neglect has led to a commonly held view in both the linguistic and the pedagogic literature that exemplification is a minor textual operation, subordinate to major discoursal stratagems such as “inferring” and “proving”’ (Siepmann, 2005: 111). Coltier (1988) remarked that examples and exemplification merit close investigation at two levels: the exemplificatory strategies adopted (i.e. when and why are examples introduced into a text), and the wording of the example (i.e. the choice of exemplifiers). This section deals with the latter and focuses on the lexical items used by expert writers to give an example. For a rhetorical perspective on exemplifiers in native writing, see Siepmann (2005: 112–18).

The Academic Keyword List (AKL) includes a number of words and multiword sequences that are commonly used as exemplificatory discourse markers: the mono-lexemic or word-like units for example and for instance, the noun example, the verbs illustrate and exemplify, the preposition such as, the adverb notably and the abbreviation e.g. Other lexical items commonly listed in textbooks and EAP/EFL materials, but not found in the AKL, are the expressions by way of illustration and to name but a few, the nouns illustration and a case in point and the preposition like. Table 4.1 gives the absolute frequencies of these words in the BNC-AC-HUM as well as their relative frequencies per 100,000 words and the percentage of exemplificatory discourse markers they represent. In Figure 4.1, the lexical items are ordered by decreasing relative frequency in the academic corpus. The most frequent exemplifiers in professional academic writing are the mono-lexemic phrasemes such as and for example, plus the noun example, which occur more than 35 times per 100,000 words. Almost half of the exemplifiers (for instance, e.g., illustrate, like, and notably) occur with a relative frequency of between 5 and 20 occurrences per 100,000 words. The verb exemplify and the noun illustration are less frequent (around 2.3 occurrences per 100,000 words) while the adverbials to name but a few and by way of illustration as well as the noun case in point appear very rarely in the BNC-AC-HUM. I will now discuss my main findings on the exemplificatory functions of prepositions, adverbs and adverbial phrases, and then focus on the exemplificatory use of nouns and verbs.

Table 4.1 Ways of expressing exemplification found in the BNC-AC-HUM

                            Abs. freq.      %     Rel. freq.
Nouns
  example                      1285       21.6       38.7
  illustration                   77        1.3        2.3
  (BE) a case in point           18        0.3        0.5
  TOTAL NOUNS                  1380       23.2       41.5
Verbs
  illustrate                    259        4.3        7.8
  exemplify                      79        1.3        2.4
  TOTAL VERBS                   338        5.7       10.2
Prepositions
  such as                      1494       25.1       45.0
  like                          532        8.9       16.0
  TOTAL PREP.                  2026       34.0       61.0
Adverbs
  for example                  1263       21.2       38.0
  for instance                  609       10.2       18.3
  e.g.                          259        4.3        7.8
  notably                        77        1.3        2.3
  to name but a few               4        0.1        0.1
  by way of illustration          3        0.1        0.1
  TOTAL ADVERBS                2215       37.2       66.7
TOTAL                          5959      100        179.4

Figure 4.1 Exemplification in the BNC-AC-HUM [bar chart: relative frequency per 100,000 words of each exemplifier, in decreasing order]
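The two derived columns of Table 4.1 follow mechanically from the absolute frequencies, and the percentage column is also the ‘conceptual frequency’ idea in miniature: each item's count as a proportion of all attempts to express the function. The sketch below uses the absolute frequencies reported in Table 4.1; the function names are illustrative, and the corpus size is left as a parameter because it is not stated in this section (the value in the usage example is a placeholder).

```python
# Absolute frequencies of exemplifiers in the BNC-AC-HUM, as reported in Table 4.1.
ABSOLUTE_FREQUENCIES = {
    "such as": 1494,
    "example": 1285,
    "for example": 1263,
    "for instance": 609,
    "like": 532,
    "illustrate": 259,
    "e.g.": 259,
    "exemplify": 79,
    "illustration": 77,
    "notably": 77,
    "(BE) a case in point": 18,
    "to name but a few": 4,
    "by way of illustration": 3,
}


def share_of_function(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of all exemplifiers that each item represents."""
    total = sum(counts.values())
    return {item: 100 * freq / total for item, freq in counts.items()}


def per_100k(absolute_freq: int, corpus_size: int) -> float:
    """Relative frequency per 100,000 running words."""
    return 100_000 * absolute_freq / corpus_size


shares = share_of_function(ABSOLUTE_FREQUENCIES)
for item, pct in sorted(shares.items(), key=lambda pair: -pair[1]):
    print(f"{item:<24}{pct:5.1f}%")

# Placeholder corpus size, for illustration only.
print(per_100k(50, 1_000_000))
```

Keeping the share and the relative frequency as separate functions mirrors the table: the first depends only on the set of exemplifiers considered, the second only on corpus size.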

4.1. This is the arrangement in Holland whereby various institutions such as media. The term ‘buggery’.1). while remaining essentially cataphoric in nature (i. . i. The small mammals living today in many different habitats and climatic zones have been described. the complex preposition such as is the most frequent exemplifier in the BNC-AC-HUM (see Example 4. for example is twice as frequent as for instance. enclosed by commas.1.2. 4. In a phraseological approach to academic vocabulary. But they can also follow the subject of the exemplifying sentence. and tropical faunas distinct from temperate faunas. These two adverbials are commonly classified as ‘code glosses’ in metadiscourse theory (see Section 1. 4. Using prepositions. and when these and more precise distinctions are made it is possible to correlate and even define ecological zones by their small mammal faunas. explaining. they fall into the category of textual phrasemes as they are mono-lexemic multi-word units.2.3 and 4. so that the associations between faunal types and ecology are well documented [ . to ensure the reader is able to recover the writer’s intended meaning’ (Hyland. cultural organisations. and run by the separate catholic and protestant communities. especially when executed by a showman like Salvador Dali. and hospitals are duplicated. Surrealist painting had publicity value.2). 50). Code glosses are ‘interactive resources’ in Hyland’s typology of metadiscourse: they are features used to ‘organize propositional information in ways that a projected target audience is likely to find coherent and convincing’ (ibid.3) as they are used to ‘supply additional information. for example. it is much more frequent than the preposition like in professional academic writing (Example 4. schools. Such associations of sexual deviance and political threat have a long history sedimented into our language and culture. to whom abominable practices were also ascribed. 
Unlike in other genres (such as speech and fiction). Woodland faunas. for instance. Similarly. are distinct from grassland faunas.e.90 Academic Vocabulary in Learner Writing 4. especially after the subject. . the OED tells us that it was later applied to other heretics. In the BNC-AC-HUM. for example and for instance are typically used within the sentence. 4. adverbs and adverbial phrases to exemplify As shown in Figure 4.3. pointing forward to the example) as shown in Examples 4. welfare services. ]. . who married the former wife of the poet Paul Éluard. 2005: 52).1.e. derives from the religious as well as sexual nonconformity of an eleventh-century Bulgarian sect which practised the Manichaean heresy and refused to propagate the species. multi-word units that are equivalent to single words and which fill only one grammatical slot. by rephrasing.4. or elaborating what has been said.4. with an organizational – exemplificatory – function.

Consider for example the law of defamation. They are used in the second person of the imperative.8%) 588 (96. and indeed perhaps cannot be.2).Rhetorical functions in expert academic writing Table 4.5. consider (f[n. It is worth pausing here momentarily to observe that such legally provided remedies can be morally justified even when applied to people who are not subject to the authority of the government and its laws. mono-lexemic adverbial phrases can also have their own phraseological patterns. however.6. c]1 = 13. Thirdly. given that counterrevolutionary response to any successful formula will ensure that it will be that much more difficult to apply the same tactics in another situation.185 (93.1) and see (f[n. as illustrated in Example 4. i.5). log-likelihood = 19. Three verbs.2 The use of ‘for example’ and ‘for instance’ in the BNC-AC-HUM Cataphoric marker for example for instance 1. Such is the relation which Nicaragua bears to El Salvador. if only the subject is the example. however.3 clearly shows that for example need not be placed in the initial position to introduce an exemplificatory sentence. log-likelihood = 71. much less frequent (see Table 4. Assuming that it is what it should be. This use is. c] = 19. The verbs consider and take are typically used with for example to introduce an example that is discussed in further detail over several sentences: 4.5%) Endophoric marker 78 (6.5. the debates over how far to forge a strategy either for winning power or for promoting economic development in a post-revolutionary society have not been satisfactorily resolved. it does no more than incorporate into law a moral right existing independently of the law.7) are significant left co-occurrents of for example in the BNC-AC-HUM. is not confirmed by corpus data.e. Enforcing such a duty against a person who refuses to pay damages is morally justified because it . The duty to compensate the defamed person is itself a moral duty. Like nouns and verbs. 
In Mieux écrire en anglais. take (f[n. This statement. 4. while the adverbial should follow the subject. log-likelihood = 92. Laruelle (2004: 96–7) writes that for example should be placed in the initial position if the whole sentence has an exemplificatory function.2%) 21 (3. for example. Example 4. between commas. c] = 7.5%) 91 For example and for instance can also function as endophoric markers and refer back to an example given before.

implements the moral rights of the defamed. One need not invoke the authority of the law over the defamer to justify such action. The law may not have authority over him.

4.7. But the concept of compresence is far from clear. If it implies that no time-lag is detectable between elements of an experienced “complex”, then this is true only in a very limited sense. Take, for example, the perceptual experience that I have while looking at this bunch of carnations arranged in a vase on the table in the middle of the room. I see this “complex” as one whole. But while I am looking at it my eyes constantly wander from one flower to the next, pausing at some, ignoring others, picking out the details of their shapes and colours. Finally, without taking my eyes off the flowers, I may move the vase closer, or walk around the table and look at the flowers from different angles. The scene will keep constantly changing. As a result, I shall experience a succession of different “complexes of qualities” but I shall still be looking at the same bunch of flowers.

The verb see is frequently used in professional academic writing as an endophoric marker to refer to tables, figures, or other sections of the article or to someone else’s ideas or publications (Hyland, 1998, 2002a, 2005; Hyland and Tse, 2007). The use of the second person imperative see ‘allow[s] academic writers to guide readers to some textual act, referring them to another part of the text or to another text’ (Hyland, 2002a: 217). Hyland describes this type of imperative as directives with a rhetorical purpose that ‘can steer readers to certain cognitive acts, where readers are initiated into a new domain of argument, led through a line of reasoning, or directed to understand a point in a certain way’ (2002a: 217). He categorizes them as interactional resources, and more specifically as engagement markers, i.e. ‘devices that explicitly address readers, either to focus their attention or include them as discourse participants’ (Hyland, 2005: 53). In the BNC-AC-HUM, 63 per cent of the occurrences of the sequence see for example appear between brackets as in Example 4.8:

4.8. Afro-Caribbean and Asian children are indeed painfully aware that many teachers view them negatively and some studies have documented reports of routine racist remarks by teachers (see for example Wright in this volume).

Swales et al. (1998) examined a corpus of research articles in ten disciplines (art history, chemical engineering, communication studies, experimental geology, history, linguistics, literary criticism, philosophy, political science and statistics) and found that second person imperative see was the most

frequent imperative form across disciplines. Similarly, in his study of directives in academic writing, Hyland (2002a) analysed a corpus of 240 published research articles, seven textbook chapters and 64 project reports written by final year Hong Kong undergraduates and found that the second person imperative see represented 45 per cent of all imperatives in that corpus. Note that in both studies, the use of the imperative varied across disciplines.

The sequences take/consider for example consist of two metadiscourse resources in Hyland’s (2005) categorization scheme: the imperative forms take and consider are interactional resources, and more specifically engagement markers, while for example is a code gloss. Similarly, see is an endophoric marker in see for example. In our phraseological framework, the sequences take/consider/see for example are textual phrasemes as they form functional (textual) units and display arbitrary lexical restrictions. The advantage of adopting a phraseological approach to rhetorical functions, and hence metadiscourse resources, appears quite clearly here.

The adverb notably can be regarded as a typical academic word: Figure 4.2 shows that it is much more frequent in academic writing than in the other genres. It is typically preceded by a comma (Example 4.9) and is qualified by the adverb most in 15.2 per cent of its occurrences in the BNC-AC-HUM (Example 4.10).

4.9. Some bishops, notably Jenkins of Durham, Sheppard of Liverpool, and Hapgood of York, have spoken out about deprivation in the inner cities, the miners’ strike, and the need for government to show a greater compassion for, and understanding of, the poor.

Figure 4.2 The distribution of the adverb ‘notably’ across genres [bar chart: frequency per million words in academic writing, news, fiction and speech]

4.10. At leading public schools, most notably Eton, there is a tradition of providing MPs, government ministers, and prime ministers.

The abbreviation e.g. (or less frequently eg) stands for the Latin ‘exempli gratia’ and means the same as for example. It is quite common in the BNC-AC-HUM, in which 65.7 per cent of its occurrences are between brackets:

4.11. It may help to refer the patient to other agencies (e.g. social services, a psychosexual problems clinic, self-help groups).

When e.g. is used without brackets, it is preceded by a comma:

4.12. Primary industries are those which produce things directly from the ground, the water, or the air, e.g. farming.

In contrast to for example and for instance, the great majority of occurrences of e.g. introduce one or more noun phrases rather than full clauses:

4.13. Direct curative measures (e.g. flood protection) are clearly within the domain of a soil conservation policy.

Figure 4.3 The distribution of ‘by way of illustration’ across genres [bar chart: frequency per million words in academic writing, news, fiction and speech]

As shown in Figure 4.1, the textual phrasemes by way of illustration and to name but a few are quite rare in the BNC-AC-HUM. In fact, these expressions are very infrequent in all types of discourse. Figures 4.3 and 4.4 show the distribution of the two phrasemes in four main genres of the British

National Corpus (BNC), namely academic writing, newspaper texts, fiction, and speech. Some 36 per cent of the occurrences of by way of illustration in the BNC (i.e. 10 out of a total of 28) appear in academic texts, and only one occurrence comes from speech. The expression to name but a few is more frequent than by way of illustration in the whole BNC, but only 12.8 per cent of its occurrences (10 out of 78) appear in academic writing. No instance of to name but a few was found in speech.

Figure 4.4 The distribution of ‘to name but a few’ across genres [bar chart: frequency per million words in academic writing, news, fiction and speech]

4.2.2. Using nouns and verbs to exemplify

Nouns and verbs are used to give examples in specific phraseological patterns. The noun which is most frequently used in this way is example, which is much more common than illustration or a case in point. Table 4.3 shows that it is as frequently used as its connective counterpart, the textual phraseme for example, in the BNC-AC-HUM.

Table 4.3 The use of ‘example’ and ‘for example’ in the BNC-AC-HUM

                    example                    for example
                Absolute freq.     %       Absolute freq.     %
BNC-AC-HUM          1285         50.43         1263         49.57

The significant verb co-occurrents of the noun example in the BNC-AC-HUM are listed in Table 4.4. The verb be is the most frequent verb co-occurrent of example in windows of one to three words to both the left

(3L-1L) and the right (1R-3R). Be is, however, twice as frequent in the left window.

Table 4.4 Significant verb co-occurrents of the noun ‘example’ in the BNC-AC-HUM

Left co-occurrents          Right co-occurrents
Verb          freq.         Verb          freq.
be             139          be              84
provide         26          illustrate      14
take            29          show            21
give            12          give            15
cite             5          suggest         12
consider        12          quote            6
illustrate       7          include          7
show             9          provide          8
see             10          concern          6
serve            5          will            16
                            can             15
                            would           13

When example is preceded by the verb be, it mainly functions as a retrospective label, i.e. it refers back to the exemplifying element which is given as the subject. The noun example may refer back directly to a noun phrase (Example 4.14) or to the demonstrative pronoun this which further points to a previous exemplifying sentence (Example 4.15).

4.14. Vision is a better example of a modular processing system.

4.15. The designer at Olympia chose to represent the race by the moment before it started, as Polygnotos showed the sack of Troy in its aftermath. This is the supreme surviving example of the early classical taste for stillness and indirect narrative.

By contrast, when the noun example is introduced by there + BE (11%) or here + BE (15%), it functions as an advance label which refers forward to a following example (underlined):

4.16. In addition, of course, choices can result from lengthy weighing of odds. Here is a simple example of the complexity at issue. I am driving along a narrow main road, used by fast-moving traffic, with my children in the back seat. A car some distance ahead strikes a large dog but does not stop, leaving the creature walking-wounded but in obvious distress.

My children, seeing what occurred, cry out. I glance in the rear-view mirror to see other cars close behind, slowing down but then speeding up again. I do not stop.

When example is the subject of the verb be, it always functions as an advance label. It is often qualified by an adjective (see Examples 4.17 to 4.19) and the exemplified item is generally introduced by the preposition of (Examples 4.18 and 4.19). In Example 4.19, the exemplified item is the pronoun this which refers back to the previous sentences.

4.17. The prime example is the Dada movement, whose nihilistic work is now admired for its qualities of imagination.

4.18. The clearest example of emotive language is poetry, which is entirely concerned with the evocation of feelings or attitudes, and in which the writer’s and reader’s attention is not, or should not be, directed at any of the objective relationships between words and things.

4.19. Until the seventeenth century many, even most, European frontiers were very vague, zones in which the claims and jurisdictions of different rulers and their subjects overlapped and intersected in a complex and confusing way. This was especially true in eastern Europe, where many states were large and central governments were usually less effective at the peripheries of their territories than in the west. The most striking example of this is perhaps the frontier in the Danubian plain between the Ottoman empire and the Habsburg territories in central Europe.

Copular clauses using the noun example consist of textual sentence stems (An example of Y is . . .) and rhemes (. . . is an example of Y). Textual sentence stems are routinized fragments of sentences which serve specific textual or organizational functions. They consist of sequences of two or more clause constituents, and typically involve a subject and a verb, e.g. An example of Y is . . . They typically have an empty slot for the following object or complement. Rhemes typically consist of a verb and its post-verbal elements, which do not contain any thematic element (e.g. . . . is another issue).

Four other verbs, namely provide, give, illustrate and show (given in italics in Table 4.4), are significant left and right co-occurrents of the noun example. The verbs take, cite, consider, see and serve are only significant left co-occurrents, while the verbs suggest, concern, quote and include and the modals will, can and would are significant right co-occurrents. The verbs provide, give, take, cite, consider, see, serve and include often co-occur with the noun example to form textual – exemplificatory – phrasemes. The verb

provide can be used in active or passive structures, but active structures in which the subject is the example (Example 4.20) are more frequent. The verb cite is more often found in a passive structure in which example functions as a retrospective label (Example 4.21). The two verbs often form rhemes with the noun example:

4.20. The Magdalen College affair, for example, provides a classic example of passive resistance.

4.21. A famous passage of art criticism can be cited as one example entirely beyond dispute.

The verb take is mainly used with the noun example in sentence-initial exemplificatory infinitive clauses (68.9%, Example 4.22). It also occurs in active structures with a personal pronoun subject (13.79%, Example 4.23) and in imperative sentences (13.79%). When used in the imperative, it generally appears in the second person (Example 4.24) and there is only one occurrence in the first person plural in the BNC-AC-HUM.

4.22. To take one example, at the beginning of the project seven committees were established, each consisting of about six people, to investigate one of a range of competing architectural possibilities.

4.23. In accordance with the theme of this chapter, I shall simply use ‘stylistics’ as a convenient label (hence the inverted commas) for the branch of literary studies that concentrates on the linguistic form of texts, and I shall take four different examples of this kind of work as alternatives to the Prague School’s and Jakobson’s approach to the relationship between linguistics and literature.

4.24. Take the example of following an object by eye-movements (so-called ‘tracking’).

By contrast, the verb consider is mainly used with the noun example in imperative sentences (70%), usually second person imperatives (Example 4.25) and less frequently first person plural.

4.25. Consider the following example.

The verb see always co-occurs with the noun example in the second person of the imperative (Example 4.26). It is never used to introduce an example, but always as an endophoric marker to direct the reader’s attention to an example elsewhere in the text.

4.26. The most important vowel is set to two or more tied notes in a phrase designed to increase the lyrical expression (see Example 47, above).

The verb include is used with the plural form of the noun example in subject position to introduce an incomplete list of examples in object position:

4.27. The floral examples include a large lotus calyx and two ivy leaves joined by a slight fillet.

Another set of verb co-occurrents of example is used to discuss the examples given in a text, give more detail about them, and make suggestions on their basis. These include quote (Example 4.28), suggest and show (Example 4.29) to talk about conclusions that can be drawn from the examples, and illustrate (Example 4.30) to show what something is like or that something is true.

4.28. Thirdly, in all the examples quoted here, there is a sense in which all observers see the same thing.

4.29. The example shows that the objector’s neat distinction between adjudicative and legislative authorities is mistaken.

4.30. This example clearly illustrates the theory dependence and hence fallibility of observation statements.

These significant co-occurrences illustrated in Examples 4.28 to 4.30 do not qualify as collocations as the meaning of the verb is not restricted by the noun example, and the combinations are fully explicable in semantic and syntactic terms. However, these co-occurrences are frequently used in adverbial clauses (e.g. as this example suggests ...) and sentence stems (e.g. this example [adv.] illustrates ...) which describe examples.

Table 4.5 gives the 24 adjectives that significantly co-occur with the noun example in the BNC-AC-HUM. The advantage of using the noun example rather than the adverbials for example or for instance is that it allows the writer to evaluate the example in terms of its suitability, e.g. good, fine, outstanding, excellent (Example 4.31) or typicality, e.g. classic, typical, prime (Example 4.32).

4.31. An outstanding example of this type of narrative is Vargas Llosa’s Conversation in the Cathedral, which pivots around a four-hour conversation between two characters, the whole novel being made up of dialogue and narrative units generated in waves by the central conversation, as the two men’s review of their past lives sparks off inner thoughts and recollections and conjures up other conversations and dramatised episodes.

4.32. The prime example is the Dada movement, whose nihilistic work is now admired for its qualities of imagination.

The adjectives above and following are used to situate the example in the text (Example 4.33): used with the noun example, they function as endophoric markers in Hyland’s (2005) typology of metadiscourse features.

4.33. Consider the following example.

Table 4.5 Adjective co-occurrents of the noun ‘example’ in the BNC-AC-HUM

Adjective     freq.       Adjective     freq.
good          38          fine          9
above         15          notable       8
following     18          isolated      8
well-known    10          interesting   9
obvious       16          known         7
classic       11          excellent     6
typical       13          prime         7
outstanding   10          trivial       5
extreme       12          previous      6
clear         16          remarkable    5
simple        13          numerous      5
striking      9           single        6

The co-occurrence prime + example is a clear example of a collocation: the adjective prime has two core meanings – ‘most important’ and ‘of the very best quality or kind’ – but a prime example is ‘a very typical example of sth’. Following Granger and Paquot (2008a: 43), I classified co-occurrences of this type as collocations, i.e. usage-determined or preferred syntagmatic relations between two lexemes in a specific syntactic pattern. Both lexemes make a separate semantic contribution to the word combination but they do not have the same status. The ‘base’ of a collocation is semantically autonomous and is selected first by a language user for its independent meaning. The second element, i.e. the ‘collocate’ or ‘collocator’, is selected by and semantically dependent on the ‘base’. Collocations represent 8.3 per cent of the types and 6.87 per cent of the tokens of adjective + example co-occurrences.

There is a case for considering the co-occurrence classic example as a free combination, i.e. a word combination that is semantically fully compositional, syntactically fully flexible and collocationally open: the adjective classic is used with a meaning that is listed as its first sense in the Longman Dictionary of Contemporary English (LDOCE4) (1. TYPICAL: having all the features that are typical or expected of a particular thing or situation) and the Oxford English Dictionary Online² (1. of the first class, of the highest rank or importance, approved as a model, standard, leading). However, the adjective is only commonly used in this sense with a very limited number of nouns: example, mistake and case³. This is clearly an illustration of the difficulty of separating the senses that a word has in isolation from those that it acquires in context (see Barkema, 1996).

another and one) are more frequent than the definite article the with example. 2005: 137). In addition to most of the adjectives given in Table 4. Nesselhauf (2005) has shown that free combinations are prone to erroneous or. next and last (Example 4. Lorenz (1998. The first two examples discussed below illustrate different ways in which the linguistic model is used to develop a narrative model. unidiomatic use in learners’ writing. Indefinite determiners (a. This does not mean. alarming. they constitute ‘preferred ways’ of qualifying example as they are repeatedly used with this noun.Rhetorical functions in expert academic writing 101 Other adjectives form semantically and syntactically fully compositional sequences with the noun example. which in turn is often determined by a demonstrative (Example 4. The added value of using statistics. Right co-occurrents include the preposition of and the pronoun this. however. class or event exemplified. awe-inspiring. other significant co-occurrents of the noun example are found in professional academic writing.34. Thus. eminent. irrespective of their phraseological status. the noun example is directly followed by the preposition of which introduces the idea. happy. . crass. in her study of verb + noun combinations. glittering. consummate. these co-occurrents are best described as ‘singularities’ and do not represent ‘the habitual usages of the majority of users’.31 above) or pronominalized to refer back to a previous sentence. . 4. and more specifically association measures. Apart from verbs and adjectives. Siepmann listed a number of adjectives that do not appear even once in the 87-million word written part of the BNC (beguiling. anodyne. to analyse the common co-occurrences of a word in a large corpus is made clear by comparing the significant adjective co-occurrents of the noun example (listed in Table 4. differs from that of native students. cautionary). edifying. well-worn. ). First. 
Left cooccurrents include determiners and the pronoun this. . and (.34). that they are pedagogically uninteresting. The is mainly used when the noun is qualified by a superlative adjective or preceded by ordinals such as first. apposite. In 40 per cent of its occurrences. the meaning of an outstanding example is composed of the meanings of the adjective outstanding and the noun example.5) with attested adjectival collocates (as given by Siepmann. The pronoun this is typically used as a subject with the verb be to refer back to an example given in a previous sentence (see Example 4. Second. and adjectives which occur only once or twice in the corpus (exquisite. at least.15 above).5. Similarly. emotive. hideous). To use Sinclair’s (1999: 18) words. 1999a) has pointed out that German learners’ use of adjectives.

These findings support Gledhill’s (2000) view that there may be a very specific phraseology and set of lexico-grammatical patterns for function words in academic discourse. Function words seem to display co-occurrence preferences just as content words do (also see Renouf and Sinclair’s (1991) notion of a ‘collocational framework’). These findings also provide strong evidence against the use of stopword lists when extracting co-occurrences from corpora as there is a serious danger of missing a whole set of phraseological patterns (Clear, 1993).

The verbs illustrate and exemplify can also be used as exemplifiers. The verb illustrate is used with the meaning of ‘to be an example which shows that something is true or that a fact exists’ (Example 4.35) or ‘to make the meaning of something clearer by giving examples’ (Example 4.36) (LDOCE4). The verb exemplify is used with the meanings of ‘to be a very typical example of something’ and ‘to give an example of’.

4.35. The narratives of the Passio Praeiecti and of the Vita Boniti both have their peculiarities, and it is possible that the appointment of Praeiectus and the retirement of Bonitus were less creditable than their hagiographers claim. Nevertheless they do illustrate the complexities of local ecclesiastical politics.

4.36. My aim will be to illustrate different ways of approaching literature through its linguistic form, ways involving the direct application of linguistic theory and linguistic methods of analysis in order to illuminate the specifically literary character of texts.

Figure 4.5 compares the relative frequencies of the two verbs in academic writing with three main genres represented in the British National Corpus. Both verbs are more frequent in academic writing than in any other genre. Exemplify is very rarely used in other genres. The verb illustrate is not uncommon in news but a quick look at its concordances shows that a significant proportion of its occurrences are used not to introduce an example, but with the meaning of ‘to put pictures in a book, article, etc’ (Example 4.37).

4.37. Also in the pipeline is an Australian children’s TV series based on Gumnut Factory Folk Tales (written, illustrated and published by Chris Trump). (BNC-NEWS)

Figure 4.5 also shows that the verb illustrate is more frequent than exemplify in professional academic writing. The frequencies of the two verb lemmas, their word forms and tenses in the BNC-AC-HUM⁴ were computed in the way described by Granger (2006). Table 4.6 shows that there is no

major difference in proportion between the verb forms illustrate, illustrated and illustrates. Almost all occurrences of the past participle appear in the passive construction BE illustrated by/in (Example 4.38). When used in active structures, the verb is often preceded by a non-human subject such as example, figure, table, case or approach (Example 4.39).

[Figure 4.5 The distribution of the verbs ‘illustrate’ and ‘exemplify’ across genres. Bar chart; y-axis: frequency per million words (0 to 140); x-axis: Academic, News, Fiction, Speech; series: illustrate, exemplify.]

Table 4.6 The use of the lemma ‘illustrate’ in the BNC-AC-HUM

The lemma illustrate                BNC-AC-HUM    %
illustrate                          97            37.45%
  simple present                    36            13.89%
  infinitive                        61            23.55%
illustrated                         84            32.43%
  simple past                       7             2.7%
  present/past perfect              0             0%
  past participle                   77            29.73%
illustrates                         63            24.32%
illustrating                        15            5.79%
  continuous tense                  2             0.77%
  -ing clause                       13            5%
Total                               259           100%
Nr of words                         3,321,867
Relative freq. per 100,000 words    7.8
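The normalised frequency and the percentages in Table 4.6 can be re-derived from the raw counts. The following is a minimal sketch in Python, assuming only the counts and the corpus size printed in the table; the function names `relative_frequency` and `proportions` are illustrative and do not come from the book.

```python
# Raw counts copied from Table 4.6 (lemma "illustrate" in the BNC-AC-HUM).
FORM_COUNTS = {
    "illustrate": 97,
    "illustrated": 84,
    "illustrates": 63,
    "illustrating": 15,
}
CORPUS_SIZE = 3_321_867  # number of words in the BNC-AC-HUM, as given in the table


def relative_frequency(count, corpus_size, per=100_000):
    """Normalise a raw count to a frequency per `per` words (hypothetical helper)."""
    return count / corpus_size * per


def proportions(counts):
    """Return each word form's share of the lemma total, as a percentage."""
    total = sum(counts.values())
    return {form: 100 * n / total for form, n in counts.items()}


lemma_total = sum(FORM_COUNTS.values())
print(lemma_total)                                             # 259
print(round(relative_frequency(lemma_total, CORPUS_SIZE), 1))  # 7.8
for form, share in proportions(FORM_COUNTS).items():
    print(f"{form:<13}{share:6.2f}%")
```

The same two helpers apply unchanged to any other lemma table that reports raw counts alongside the corpus size.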

4.38. The contrast between the conditions on the coast and in the interior is illustrated by the climatic statistics for two stations less than 30 km (18.5 miles) apart.

4.39. This example clearly illustrates the theory dependence and hence fallibility of observation statements.

In the BNC-AC-HUM, illustrate significantly co-occurs with the noun example [LogL = 112] in a 3L-1L window, and with the nouns point [LogL = 168.78], example [LogL = 49.65] and fig. [LogL = 45.08] in a 1R-3R window. The sentence-initial adverbial clause To illustrate this/the point/X (Example 4.40) represents 2.7 per cent of the occurrences of the lemma illustrate in the BNC-AC-HUM. The noun point is used as an object of illustrate which refers back to an idea put forward in a previous sentence:

4.40. How many observations make up a large number? (...) Whatever the answer to such a question, examples can be produced that cast doubt on the invariable necessity for a large number of observations. To illustrate the point, I refer to the strong public reaction against nuclear warfare that followed the dropping of the first atomic bomb on Hiroshima towards the end of the Second World War.

4.41. For most of this century it is those disorders gathered together under the heading of ‘schizophrenia’ that have been used as the paradigm for trying to describe and understand psychosis. Yet even in this form, or forms – for many would prefer to talk of ‘the schizophrenias’ – there is still no universally accepted set of criteria for diagnosis. To illustrate this, one of the present authors was recently asked to review a paper submitted to a prominent psychiatric journal, proposing a new set of rules for diagnosing schizophrenia. In the course of their analysis the authors determined the extent to which their proposed criteria agreed with those contained in other existing diagnostic schemes – some ten or twelve of them. Correlations varied over a very wide range.

The noun figure (and the abbreviation fig.) is used either as the subject of the verb illustrate or in the passive structure illustrated in Figure x. This co-occurrence is even more marked in academic genres such as social sciences, natural sciences and medicine which rely extensively on figures, tables and diagrams (see Examples 4.42 and 4.43).

4.42. Figure 1 illustrates the spread of results for the alcoholics and the controls. (W_ac_medicine BNC sub-corpus, see Table 3.2)
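The LogL scores and the 3L-1L / 1R-3R windows mentioned on this page can be made concrete with a small sketch. It assumes that LogL is the log-likelihood ratio (G²) computed over a 2x2 contingency table, and that the table is built by comparing the words inside the window of the node with the rest of the corpus; this section does not spell out the exact procedure, so both points are assumptions. The helper names (`window_counts`, `log_likelihood`, `association`) and the toy sentence are hypothetical.

```python
import math
from collections import Counter


def window_counts(tokens, node, left=0, right=0):
    """Count the words found up to `left` positions before and `right`
    positions after each occurrence of `node` (3L-1L: left=3; 1R-3R: right=3)."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            counts.update(tokens[max(0, i - left):i])
            counts.update(tokens[i + 1:i + 1 + right])
    return counts


def log_likelihood(o11, o12, o21, o22):
    """Log-likelihood ratio (G2) for a 2x2 table of observed frequencies."""
    total = o11 + o12 + o21 + o22
    row_sums = (o11 + o12, o21 + o22)
    col_sums = (o11 + o21, o12 + o22)
    observed = ((o11, o12), (o21, o22))
    g2 = 0.0
    for r in range(2):
        for c in range(2):
            if observed[r][c] > 0:  # empty cells contribute nothing to the sum
                expected = row_sums[r] * col_sums[c] / total
                g2 += observed[r][c] * math.log(observed[r][c] / expected)
    return 2 * g2


def association(tokens, node, collocate, left=0, right=0):
    """Score `collocate` inside vs. outside the window of `node`.
    Simplification: overlapping windows are not de-duplicated."""
    in_window = window_counts(tokens, node, left, right)
    window_size = sum(in_window.values())
    o11 = in_window[collocate]                 # collocate inside the window
    o12 = window_size - o11                    # other words inside the window
    o21 = tokens.count(collocate) - o11        # collocate elsewhere
    o22 = len(tokens) - window_size - o21      # everything else
    return log_likelihood(o11, o12, o21, o22)


# Toy data, not taken from the BNC.
toy = "to illustrate the point we illustrate this point with a figure and a table".split()
print(window_counts(toy, "illustrate", right=3)["point"])  # 2
print(association(toy, "illustrate", "point", right=3) > 0)  # True
```

A real study would work on lemmatised, part-of-speech-tagged corpus data and would apply a significance threshold to the scores; the sketch only shows how a window and a contingency table relate to the reported LogL values.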

4.43. The advantages of the system are illustrated in Fig. 8.2 and, like the Peruvian example discussed above, the fallow stage is contributing to crop productivity as well as providing protection against soil erosion. (W_ac_soc_science BNC sub-corpus, see Table 3.2)

The adverbs well, better, best and clearly are sometimes used with illustrate to evaluate the typicality or suitability of the example (Example 4.44). The verb illustrate also co-occurs significantly with how to introduce a clause (Example 4.45), with the verb serve (Example 4.46), and with the modals will, can and may (Example 4.47).

4.44. The history of the English monarchy well illustrates both the importance and the unimportance of war.

4.45. We recently did a simple experiment which happens to illustrate how children’s knowledge of where an object is determines their behaviour.

4.46. While our discussion in this chapter is of the doctrine of neutrality as such, Rawls’ treatment of it will serve to illustrate the problems involved.

4.47. This prejudice against close involvement with the secular government may be illustrated by an anecdote related in the about Molla Gurani.

Table 4.7 The use of the lemma ‘exemplify’ in the BNC-AC-HUM

The lemma exemplify                 BNC-AC-HUM    %
exemplify                           9             11.4%
  simple present                    5             6.33%
  infinitive                        4             5%
exemplified                         53            67%
  simple past                       8             10%
  present/past perfect              1             1.26%
  past participle                   44            55.7%
exemplifies                         15            19%
exemplifying                        2             2.53%
  continuous tense                  0             0%
  -ing clause                       2             2.53%
Total                               79            100%
Nr of words                         3,321,867
Relative freq. per 100,000 words    2.38

Table 4.7 shows that the lexico-grammatical preferences of the verb exemplify differ from those of illustrate. A large proportion of the occurrences

of the lemma exemplify are -ed forms, and more precisely past participle forms, of the verb. Unlike illustrate, the verb exemplify does not co-occur significantly with nouns. In the BNC-AC-HUM, the verb significantly co-occurs with the verb be and the conjunction as in a 3L-1L window, and with the prepositions by and in in a 1R-3R window. These significant co-occurrents highlight the preference of the verb for the passive structure BE exemplified by/in (Example 4.48) and the lexico-grammatical pattern as exemplified by/in (Example 4.49). Exemplify is also often used after a noun phrase, preceded by a comma (Example 4.50).

4.48. The association of this material with the clerk is clearly exemplified by Chaucer’s wife of Bath’s fifth husband, the clerk Jankyn, who, in the Wife of Bath’s Prologue, reads antifeminist material to her from his book Valerie and Theofraste.

4.49. He assumed, without argument, that science, as exemplified by physics, is superior to forms of knowledge that do not share its methodological characteristics.

4.50. Piaget’s claim that thinking is a kind of internalised action, exemplified in the assimilation-accommodation theory of infant learning mentioned above, is really a global assumption in search of some refined, detailed and testable expression.

4.2.3. Discussion

The description of exemplifiers presented here does not aim at exhaustiveness in professional academic writing but at typicality. The corpus-based methodology adopted has highlighted a number of lexical items that are repeatedly used as exemplifiers in academic writing. The function of exemplification can be fulfilled by a whole spectrum of single words (the preposition like, the adverb notably), word-like units or mono-lexemic phrasemes (the preposition such as, the abbreviation e.g., the adverbials for example and for instance) and word combinations, i.e. sentence stems (An example of Y is X, ... provides a classic example of ..., Examples include ...) and rhemes (... is an example of ...), imperative clauses (Consider, for example ...) and sentence-initial infinitive clauses (To take one example, ...). A large majority of these word combinations are semantically and syntactically fully compositional; the exceptions are a few collocations such as prime example and classic example. They are, however, characterized by their high frequency of use and can be described as ‘preferred ways’ (Altenberg, 1998) of giving an example in professional academic writing.

Siepmann (2005) analysed a 9.5-million word corpus of academic writing, but did not make use of statistical methods. He enumerated every

single occurrence of word sequences used to give an example and listed rare events such as the infinitive clauses to paint an extreme example and to pick just one example (a single occurrence in his corpus), the co-occurrence example + is afforded by and the expression for the sake of example. It may be argued that privileging exhaustiveness over typicality in corpus linguistic research is counter-productive, and that such an approach results in too much – unreliable – information. Siepmann, for example, wrote that

    English authors have a large range of exemplificatory imperatives at their disposal, using the direct second-person imperative VP ~ as well as the less imposing hortative let us + VP and the inclusive let me + VP. Of these last two, the former is around five times more frequent than the latter, showing a high degree of audience sensitivity among authors. (Siepmann, 2005: 119)

Table 4.8 The use of imperatives in academic writing (based on Siepmann, 2005: 120)

Imperatives in academic writing                                      Frequency   %
(for example/for instance) see (for example/for instance) NP         200         66.2
(for example) consider (for example) NP                              54          17.9
take, for example, NP                                                16          5.3
Consider a(n) (ADJ) example/instance                                 7           2.3
take the example of                                                  5           1.7
(as examples of NP) consider (as an example) NP                      3           1
take, as an example, NP                                              1           0.3
as an illustration (of this)/by way of (brief) illustration,
  for the sake of illustration, for (another) example,
  consider NP (2)                                                    2           0.7
Take (even) NP (2)                                                   2           0.7
Let us (now) take + (as) + DET + ADJ + example(s)                    4           1.3
Let us consider + DET + ADJ + example(s)                             4           1.3
Let me give (you) (but) one example                                  2           0.7
Let me offer + DET (+ ADJ+) example                                  1           0.3
Let us consider, for example, NP                                     1           0.3
Total                                                                302         100

A closer look at his frequency data (reprinted in Table 4.8) shows, however, that the co-occurrences see/take/consider + for example account for 89.4 per cent of the imperatives Siepmann found. Although a large range of exemplificatory imperatives may be available to language users, only a very limited set of these are widespread in professional academic writing. First person imperatives are extremely rare and let me + VP only appeared three times in the 9.5-million word corpus of professional academic writing he used.
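The 89.4 per cent figure can be checked directly against the frequencies in Table 4.8. The short sketch below assumes that the three largest rows of the table (200, 54 and 16 occurrences) are the see/consider/take + for example patterns referred to in the text; the variable names are illustrative only.

```python
# Frequencies copied from Table 4.8 (based on Siepmann, 2005: 120).
SEE_FOR_EXAMPLE = 200       # assumed: see + for example/for instance
CONSIDER_FOR_EXAMPLE = 54   # assumed: consider + for example
TAKE_FOR_EXAMPLE = 16       # assumed: take + for example
TOTAL_IMPERATIVES = 302

top_three = SEE_FOR_EXAMPLE + CONSIDER_FOR_EXAMPLE + TAKE_FOR_EXAMPLE
share = 100 * top_three / TOTAL_IMPERATIVES

print(top_three)        # 270
print(round(share, 1))  # 89.4
```

The remaining eleven patterns therefore share 32 of the 302 imperatives, which is the quantitative basis for the claim that only a limited set is widespread.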

The analysis of exemplifiers presented here also validates the method used to design the Academic Keyword List. The exemplificatory lexical items which were extracted are of two types:

- the most frequent exemplifiers in academic writing (such as, example, for example and for instance) (see Figure 4.1 discussed earlier in this chapter).
- lexical items which are not as frequent as such as, example, for example and for instance, but which are more frequent in academic prose than in any other genres (illustrate, exemplify, e.g. and notably).

The preposition like can be used to fulfil an exemplificatory function in academic writing but it is much more common in other genres. The nouns illustration and case in point are quite characteristic of formal textual genres, but they are infrequent. The expressions to name but a few and by way of illustration are rare in all types of discourse.

4.3. The phraseology of rhetorical functions in expert academic writing

This section briefly comments on the types of lexical devices used by expert writers to serve the functions of ‘expressing cause and effect’, ‘comparing and contrasting’, ‘expressing a concession’ and ‘reformulating’ in an attempt to give a wider overview of the way academic vocabulary is used to serve specific rhetorical functions. It aims to characterize the phraseology of these rhetorical functions in academic prose.

Table 4.9 shows that the lexical means of expressing a concession consist of single word adverbs (e.g. however, nevertheless, yet), (complex) conjunctions (e.g. although, even though) and (complex) prepositions (e.g. despite, in spite of). Similarly, reformulation is most frequently achieved by means of the mono-lexemic units that is and in other words, the abbreviation i.e., and the adverb namely (Table 4.10).

Adverbs, prepositions and conjunctions also represent a large proportion of the lexical devices used by expert writers to serve the functions of ‘expressing cause and effect’ (Table 4.11) and ‘comparing and contrasting’ (Table 4.12). However, these two functions can also be realized by means of nouns, verbs and adjectives in specific phraseological or lexico-grammatical patterns. As shown in Table 4.11, nouns account for 32.5 per cent of the lexical means used to express a cause or an effect in academic writing, e.g. cause, factor, source, consequence, effect, result, outcome and implication.

792 19.056 28.5 4.9 100 Rel. Prepositions despite in spite of notwithstanding TOTAL PREP.1 3.9 Ways of expressing a concession in the BNC-AC-HUM Abs. freq.6 0.46 353 2.7 182.7 2.10 Ways of reformulating.e.8 1.2 26.3 5.4 144.26 3.6 2.9 20.727 5. 9.2 15.8 1.8 0. freq.353 676 66 144 1.6 5.7 39.3 54.314 % 25.0 4. i.817 6.3 % Rel.6 .9 0. freq.3 7. paraphrasing and clarifying in the BNC-AC-HUM Abs.2 2.8 0.721 248 451 80 4.3 2.5 100 20.5 6.1 28.Rhetorical functions in expert academic writing Table 4. Adverbs however nevertheless nonetheless though ADV yet TOTAL ADVERBS Conjunctions although though CONJ even though (even if) albeit TOTAL CONJ.7 40.6 100.5 51. or more precisely or more accurately or rather TOTAL 330 375 81 210 187 21 12 7 91 1.5 14.4 0.6 1.2 16.0 51.292 1.5 13. freq.8 7.5 6.2 1.0 14.3 2.86 69. 109 Table 4.6 0.4 6.6 0. TOTAL 681 159 39 879 11.9 11. that is that is to say in other words namely viz.4 0.

g. distinguish and differentiate) are often used to compare and contrast but adjectives (e.6 0.8 1.0 2.3 6. difference and distinction) and verbs (e. Patterns involving nouns (e.2 4.6 55.g.175 500 183 1.8 1.1 4.2 0.2 3.4 259. .2 2.9 9.7 32.5 54.110 Academic Vocabulary in Learner Writing Verbs are also common: cause.8 3.4 1.2 per cent of the lexical means used by expert writers (Table 4.6 1.802 450 1.6 35.7 2. freq. distinct. contrast.0 2.8 2. differ. different.2 3.3 14.9 5.2 0.4 15. Table 4.0 5.1 0.3 12. derive.7 6.06 17. comparison.5 0.2 0. bring about.2 13.0 20.11 Ways of expressing cause and effect in the BNC-AC-HUM Abs.25 % Rel. differing and distinctive) play a more prominent role and account for 29.6 1.8 0.3 0. lead to.5 4. contrast.9 0.4 14.8 3.0 0. result in.0 755 550 1.5 4.8 8.0 24.612 2.4 0. freq.7 6.8 0. and stem.252 2.4 16. nouns cause factor source origin root reason consequence effect result outcome implication TOTAL NOUNS Verbs cause bring about contribute to generate give rise to induce lead to prompt provoke result in yield make sb/sth do sth arise from/out of derive emerge follow from trigger stem TOTAL VERBS 570 125 276 227 101 67 671 115 161 327 129 171 145 476 466 74 56 95 4.9 3.5 0.5 1.52 22.12). emerge.9 128.7 16.2 1.5 1. contribute to.g.4 0.830 813 143 411 8.

5 6.8 53 344 397 0.2 0.1 0.1 7.0 1.99 18.0 0.4 2.6 3. Prepositions because of due to as a result of as a consequence of in consequence of in view of owing to in (the) light of thanks to on the grounds of on account of TOTAL PREP.0 0.0 0.1 0.3 0.7 0.9 4. freq.036 696 52 22 18 12 83 5.1 22.2 21.894 182 101 20 14 35 5.5 0.2 0.3 3.4 0.1 0.0 2.9 5. Adverbs therefore accordingly consequently thus hence so thereby as a result as a consequence in consequence by implication TOTAL ADVERBS Conjunctions because since As 5 for so that PRO is why that is why this is why which is why on the grounds that TOTAL CONJ.5 177.5 57.7 26.7 1.4 0. 111 .4 12 % Rel.3 1.3 3.6 0.475 8.3 1.9 2.04 599 195 196 22 1 66 52 109 35 22 24 1.6 10.1 0.312 22.7 0. TOTAL 2.59 42.6 31.6 0.0 0.5 0.2 8.3 0.6 3.4 1.7 0. Adjectives consequent responsible (for) TOTAL ADJ.2 1.7 0.7 39.5 3. freq.99 1.1 0.3 53.5 3.7 0.9 0.912 26.0 5.49 1.321 2.3 0.6 0.1 180.Rhetorical functions in expert academic writing Abs.33 100 66.767 283 1.0 5.0 1.1 4.97 796.4 28.1 0.207 955 883 1.7 0.981 5.412 130 143 1.0 0.2 0.

5 0.5 4.9 0.3 0.3 15.7 4.6 2.7 257.6 0.4 0.496 72 278 163 33 43 27 127 23 8.9 0.7 4.7 2.1 2.2 0.1 0. freq.4 1.4 4.8 0.6 1.2 8.7 31.9 1.6 5.229 0.2 8.0 1. freq.5 0.3 2.5 0.0 1.44 116 212 147 19 175 522 311 1.7 0.9 0.3 0.6 0.9 1.7 30.1 0.3 0.2 14.7 9.5 0.2 3.8 0.8 1.4 0.1 0.1 0.46 3.3 17.2 0.4 0.8 3.1 8.1 29.4 39.9 3.8 6.12 Ways of comparing and contrasting found in the BNC-AC-HUM Abs.7 127. Nouns resemblance similarity parallel parallelism analogy contrast comparison difference differentiation distinction distinctiveness (the) same (the) contrary (the) opposite (the) reverse TOTAL NOUNS Adjectives same similar analogous common comparable identical parallel alike contrasting different differing distinct distinctive distinguishable unlike contrary opposite reverse TOTAL ADJECTIVES Verbs resemble correspond look like compare parallel contrast 138 137 102 278 56 137 0.24 77.318 76 595 10 559 28 85 56 4.3 % Rel.5 6.5 0.3 0.3 0.5 0.9 1.1 2.112 Academic Vocabulary in Learner Writing Table 4.6 1.8 0. .1 1.0 0.9 75.580 1.1 4.552 0.8 2.027 55 1055 223 137 52 98 63 2.1 4.1 3.4 4.9 0.3 16.1 0.5 0.

0 0.9 0.1 2.2 0.1 0.2 47.4 2.3 0.6 1.6 7.3 1.2 3.88 (Continued) 394 2 2 29 0 118 56 3 97 185 116 69 0 23 14 9 69 4 25 372 136 95 2 62 1.8 1.1 0.9 0.0 3.Rhetorical functions in expert academic writing Abs.568 % 0.6 11.4 0.2 2.3 5.2 5.1 1.3 2.1 0.3 12. 7.2 0.6 0.00 1.674 1.484 9.6 2.7 11.9 50.1 0.3 0.3 84.0 0.0 0.1 1.0 0.8 0.2 11. differ distinguish differentiate TOTAL VERBS Adverbs similarly analogously identically correspondingly parallely likewise in the same way contrastingly differently by/in contrast by contrast in contrast by way of contrast by/in comparison by comparison in comparison comparatively contrariwise distinctively on the other hand (on the one hand) on the contrary quite the contrary conversely TOTAL ADVERBS Prepositions like6 unlike in parallel with as opposed to as against in contrast to/with in contrast to in contrast with versus contrary to by/in comparison with in comparison with in comparison to by comparison with in comparison with TOTAL PREP.4 0.0 0.2 0.91 .5 0.2 0.1 2.3 0.2 4.36 Rel.2 0.4 0.5 2.6 0.4 104. 2.7 0.4 0.0 0.812 244 8 121 46 82 73 9 53 66 52 14 4 21 14 3.1 0. freq.3 0.5 1.4 0.1 0.72 0.2 0.1 0.39 0.6 0.9 5.6 242 404 74 1.2 113 0.3 0.7 0.9 0.0 0. freq.0 0.0 0.

Another direct result of conquest by force of arms was the development of slavery.12 Cont’d Abs. 0.114 Academic Vocabulary in Learner Writing Table 4. This had important implications for the debate over access to birth control information and abortion – rarely were demands for freer access to birth control information devoid of maternalist rhetoric. but actually merge with each other. 4. result and consequence. freq.55. .5 83.9 38.4 1. The reason is that with Van Gogh art and life are not merely conditioned by each other to a greater degree than with any other artist.08 151. 4.045 1264 442 6.751 17. However it is first necessary to consider another important consequence of the view of psychosis being presented here.54.52. .766 38 155 113 42 32 11 20 1 29. 4.0 13.0 880.3 0. Conjunctions as while whereas TOTAL CONJ. Health for women was held to be synonymous with healthy motherhood. .6 0.53. freq. This may be an effect of the uncertainty around television’s textuality.1 100 Table 4.51.3 1.1 0.23 % Rel.5 23. as illustrated in the following examples: 4. effect.2 4.3 1. which was widespread up to the beginning of the nineteenth century. Most of the co-occurrents listed form quite flexible and compositional textual sentence stems with their nominal node.7 3. 4.3 1.5 0.249 9.5 5.0 0. Other expressions as . but it is now an extremely limiting effect for the development of theory. outcome.13 shows a co-occurrence analysis of several nouns that are used to express cause or effect in academic prose: reason.3 203.1 4. implication. as in the same way as/that compared with/to compared with compared to CONJ compared to/with as compared to/with when compared to/with if compared to/with TOTAL 2.

Table 4.13 Co-occurrents of nouns expressing cause or effect in the BNC-AC-HUM

Table 4.13a: reason
Adjective + reason: good, main, sufficient, obvious, other, different, alleged, simple, tactical, political, major, additional, right, valid, similar, fundamental, real, independent, special, possible, historical, particular
Verb + reason: have, give, see, base on, provide, find, examine
Auxiliary verb + reason: be, seem
reason + verb: be, justify
reason + preposition: for, against
Preposition (2L) + reason: for
reason + conjunction: why, which, that
Determiner + reason: this, another
(no) reason to + verb: believe, suppose, doubt, prefer, think, fear, accept
reason(s) for ...: supposing, believing, thinking, accepting, rejecting, adopting
There + verb + reason: There is (no) reason to; There seems no reason; There are (DET/ADJ) reasons

Table 4.13b: implication
Adjective + implication: important, practical, political, serious, social
Verb + implication: have, carry
implication + verb: be
Auxiliary verb + implication: be
implication + preposition: of, for
Preposition + implication: with
Determiner + implication: this
implication + conjunction: that

Table 4.13c: effect
Adjective + effect: adverse, overall, good, profound, knock-on, indirect, far-reaching, damaging, cumulative, dramatic, immediate, excellent, long-term, practical, particular, powerful, special, full, general, important, other
Verb + effect: have, produce, achieve, create, cause
Auxiliary verb + effect: be
effect + verb: be, depend on, occur
effect + preposition: of, on, upon
Determiner + effect: this
effect + conjunction: that
Noun and effect: cause

Table 4.13d: outcome
Adjective + outcome: logical, eventual, likely, different, inevitable, final
outcome + preposition: of
Determiner + outcome: this
Verb + outcome: influence, determine, represent, affect
outcome + verb: be
Auxiliary verb + outcome: be

Table 4.13e: result
Adjective + result: inevitable, direct, immediate, beneficial, eventual, interesting, practical, main, similar
result + preposition: of, from
Preposition (3L) + result: with
Determiner + result: this
Verb + result: produce, achieve, yield, give, bring, lead to, show, present, interpret, obtain, have
result + verb: be
Auxiliary verb + result: be

Table 4.13f: consequence
Adjective + consequence: inevitable, unintended, unfortunate, direct, important, necessary, political, natural, bad, practical, social, likely, major, possible
Determiner + consequence: this, another
consequence + conjunction: that
Verb + consequence: have, suffer (from), avoid, consider, outweigh, discuss
consequence + verb: be, follow, ensue
Auxiliary verb + consequence: be
consequence + preposition: of, for
Preposition (3L) + consequence: with, of

The word combinations illustrated in Examples 4.51 to 4.55 are good illustrations of what Sinclair and his followers have called ‘extended units of meaning’ where lexical and grammatical choices are ‘intertwined to build up a multi-word unit with a specific semantic preference, associating the formal patterning with a semantic field, and an identifiable semantic prosody, performing an attitudinal and pragmatic function in the discourse’ (Tognini-Bonelli, 2002: 79). These extended units of meaning are categorized as textual phrasemes in Granger and Paquot’s (2008a) typology as they function as sentence stems to organize the propositional content at a metadiscoursal level. A few co-occurrences are collocations as illustrated by Example 4.56. The verb carry is used in a delexical sense in the collocation carry implications, which basically means have implications.

4.56. We may certainly talk of animals, in the absence of speech, “consciously intending” or being compassionate, both of which carry implications of understanding to some degree.

The variety of adjectives used with the nouns reason, implication, effect, outcome, result and consequence is also worthy of note and bears testimony to their prominent role in argumentation (Soler, 2002; Tutin, forthcoming). A large proportion of those are evaluative adjectives (e.g. good, important, major, serious, fundamental, inevitable, sufficient) and are used to express the ‘writer’s attitude or stance towards, viewpoint on, or feelings about the entities or propositions that he or she is talking about’ (Hunston and Thompson, 2000: 5).

Like nouns, verbs that serve specific rhetorical or organizational functions in academic prose generally enter compositional and flexible sequences. Table 4.14 gives the most frequent lexical bundles containing one of the four verbs suggest, appear, prove and tend typically used to express possibility or certainty. Most clusters are lexico-grammatical patterns which function as textual sentence stems (e.g. it has been suggested that, it appears that), sentence-initial adverbial clauses (e.g. as suggested above) or rhemes (e.g. proved a complete failure). It is worth noting that each verb form has its own ‘distinctive collocational relationship’ (Sinclair, 1999: 16), and that these constitute different form/meaning pairings, and thus different complete units of meaning. For example, the –ed form of suggest (unlike that of appear, tend or prove) is mainly used in passive constructions. It is often used to report suggestions made by other people in impersonal structures introduced by it (e.g. it has been suggested, it is sometimes suggested), and in phrases introduced by the conjunction as (e.g. as already suggested by).

As-phrases are also used with an endophoric marker (e.g. as suggested above) and/or the first person pronoun I (e.g. as I have suggested) to refer to a suggestion previously made. Suggested is also used in impersonal structures introduced by it followed by a modal verb (e.g. it may/might be suggested that) to make a tentative suggestion. By contrast, the verb form suggests is typically used to make it clear that the suggestion offered is made on the basis of who/whatever is the subject of the sentence:

4.57. Sinclair Hood (1971) suggests that woollen cloth and timber were sent to Egypt in exchange for linen or papyrus.
4.58. More recent evidence suggests, however, that while it lives in woodland it actually hunts over nearby open areas.

Table 4.14 Co-occurrents of verbs expressing possibility and certainty in the BNC-AC-HUM

Table 4.14a: suggest
suggested
– it has been suggested that
– it is (sometimes, commonly) suggested that
– it was (first, also, even) suggested that
– it can / could / may be suggested that
– this is suggested by
– as (already) suggested by
– as suggested above
– (as) I (have) (already) suggested
suggests
– NP / it / this (ADV: strongly, also) suggests (that)
– …, which suggests (that)
– as NP suggests
suggest
– NP / it / this might / may / would suggest (that)
– NP does suggest (that)
– there is evidence to suggest
– I (would / want to) suggest
– NP / it / this seems to suggest (that)
suggesting
– …, (ADV: strongly) suggesting (that)
– I am (not) suggesting that

Table 4.14b: prove
proved
– NP / it / this proved to
– NP / it / this proved (ADV) ADJ (to), with ADJ: difficult, easy, unable, abortive, obscure, successful, inadequate, necessary, impossible, possible
– NP / it / proved to be (ADV) ADJ
– NP proved NP
proves
– NP proves ADJ (impossible, …, successful)
– NP proves that
prove
– ADJ (likely, …, possible) to prove
– … may / might / would prove ADJ to
– NP was to prove ADJ
– attempt to prove
– seek to prove
proving
– BE proving
– …, proving that
– … by proving
– … of proving

Table 4.14c: appear
appeared
– it appeared (ADJ) that
– there appeared to be
– this appeared to V
– …, which appeared ADJ / to V
appears
– NP / it / this appears to V
– which appears to V
– what appears to V
– there appears to V
– it appears that
– as appears from/in
appearing / appear
– NP would/might/may appear to be/V

Table 4.14d: tend
tended
– NP tended to V (be, …, see)
tends
– NP tends to V
– …, which tends to V
– it tends to V
  with V: be, support, see, confirm, ignore, favour, become, look, take, regard
tend
– NP tend to V (be, …)
tending

In summary, results indicate that the phraseology of rhetorical or organizational functions in academic prose does not consist of idioms, similes, phrasal verbs, idiomatic sentences, proverb fragments and the like (see also Pecman, 2004 and Gledhill, 2000).⁷ Referential phrasemes that serve to organize scientific discourse mainly consist of lexical and grammatical collocations. Results also confirm Howarth’s (1996, 1998) conclusion that a large proportion of the lexical collocations found in academic discourse consist of a verb in a figurative sense and an abstract noun denoting a recurrent concept in academic discussion (e.g. adopt an approach/a method, develop an idea/a method/a model, carry out a task/a test/a study, draw an analogy/a comparison/a distinction, reach a conclusion/a consensus/a point).

In academic prose, the category of textual phrasemes consists of three types of phraseme (cf. Figure 4.6). The first is complex prepositions (e.g. with respect to, in addition to) and complex conjunctions (e.g. so that, as if, even though) which are used to establish grammatical relations (cf. Burger’s (1998) category of structural phrasemes). The second is multiword linking adverbials, used to connect two stretches of discourse. Although the majority of linking adverbials are single adverbs, and are therefore not part of the phraseological spectrum, prepositional phrases functioning as adverbs (e.g. for example, in other words, in addition, in conclusion, as a result) and clausal linking adverbials (e.g. that is, that is to say, what is more, to conclude) are also common in academic prose (Conrad, 1999: 11–12). These first two categories of textual phraseme broadly correspond to Moon’s (1998) set of organizational fixed expressions and idioms.

Phrasemes
  Referential function: Referential phrasemes
    (Lexical) collocations
    Grammatical collocations
  Textual function: Textual phrasemes
    Complex prepositions
    Complex conjunctions
    Linking adverbials
    Textual formulae (including textual sentence stems and rhemes)
  Communicative function: Communicative phrasemes
    Attitudinal formulae

Figure 4.6 The phraseology of rhetorical functions in academic prose

Textual sentence stems and rhemes constitute the third type of textual phrasemes, which I refer to as ‘textual formulae’. Textual sentence stems are multiple clause elements involving a subject and a verb, which ‘form the springboard of utterances leading up to the communicatively most important – and lexically most variable – element’ (Altenberg, 1998: 113). Examples include It has been suggested . . ., Another reason is . . ., and It is argued that . . . . Textual formulae are particularly prominent in academic writing and display different degrees of flexibility, from flexible fragments such as ‘DET (a, another) ADJ (typical, classic, prime, good, etc.) example of [NP] is . . .’ to more inflexible phrasemes such as ‘to be a case in point’. Rhemes typically consist of a verb and its post-verbal elements (e.g. . . . is another issue). They also sometimes function as textual phrasemes but are less frequent than sentence stems, possibly because rhemes are ‘usually tailored to expressing the particular new information the speakers want to convey to their listeners, and are therefore, as Altenberg (1998: 111) points out, “composed of variable items drawn from an open set”’ (De Cock, 2003: 269).

Attitudinal formulae make up a large proportion of communicative phrasemes in academic prose. They largely consist of sentence stems such as it is important/necessary that, it seems that or it is noteworthy that. This group is similar to Biber et al.’s (2004) category of stance bundles that ‘provide a frame for the interpretation of the following proposition, conveying two major kinds of meaning: epistemic and attitude/modality’ (Biber et al., 2004: 389).

The frequency-based approach adopted to study the phraseology of rhetorical functions has also helped uncover a whole range of word combinations that do not fit traditional phraseological categories. Co-occurrences such as direct result, final outcome, evidence suggests, and outstanding example have traditionally been considered as peripheral or falling outside the limits of phraseology (Granger and Paquot, 2008a: 29) but results suggest that they are essential for effective communication and are also part of the preferred lexical devices used to organize scientific discourse.

4.4. Summary and conclusion

In this chapter, I have shown that a high proportion of words in the Academic Keyword List (AKL) fit my definition of academic vocabulary and serve rhetorical or organizational functions in academic prose. The analysis of exemplifiers presented in Section 4.2 has also validated the method used to select AKL words: the lexical items which were automatically extracted included the most frequent exemplifiers in academic writing (such as, example, for example and for instance) and lexical items which are not as frequent but which are more common in academic prose than in other genres (illustrate, exemplify, e.g., notably).

The AKL could be very useful for curriculum and materials design as it includes a high number of words that serve rhetorical functions in academic prose. The list, however, still needs to be refined in various ways. To be useful to apprentice writers, it should include the word combinations (frequent co-occurrences, collocations, textual phrasemes, etc.) in which each AKL word is commonly found in academic prose, together with information on the word’s frequency (see Coxhead et al. (forthcoming) for a similar project for Coxhead’s (2000) Academic Word List). This means that each AKL word has to be described in context, as was done above (Section 4.2) for the function of exemplification. Such a contextual analysis will also make it possible to decide whether each word fits my definition of academic vocabulary and deserves to be retained in the Academic Keyword List.

The type of data analysis presented in this chapter has also offered valuable insights into the distinctive nature of the phraseology of rhetorical functions in scientific discourse. Most notably, results have shown that textual phrasemes make up the lion’s share of multiword units that ensure textual cohesion in academic prose. This type of phraseme, however, has often been neglected in theories of phraseology (cf. Granger and Paquot, 2008a: 34–5). Attitudinal formulae serve a major role in a restricted number of functions such as ‘expressing personal opinion’ and ‘expressing possibility and certainty’. Results have also pointed to the prominent role of free combinations to build the rhetoric of academic texts. My findings thus support Gledhill’s call for a rhetorical or pragmatic definition of phraseology:

    Phraseology is the ‘preferred way of saying things within a particular discourse’. The notion of phraseology implies much more than inventories of idioms and systems of lexical patterns. Phraseology is a dimension of language use in which patterns of wording (lexico-grammatical patterns) encode semantic views of the world, and at a higher level idioms and lexical phrases have rhetorical and textual roles within a specific discourse. Phraseology is at once a pragmatic dimension of linguistic analysis, and a system of organization which encompasses more local lexical relationships, namely collocation and the lexico-grammar. I claim that the phraseological analysis of a text should not only involve the identification of specific collocations and idioms, but must also take account of the correspondence between the expression and the discourse within which it has been produced. (Gledhill, 2000: 202)

In line with this call, the functions of all AKL words and their preferred phraseological and lexico-grammatical patterns should be identified by examining them in context.

Another objective of this chapter has also been to assess the adequacy of the treatment of rhetorical functions in EAP textbooks and investigate whether the AKL should be supplemented with additional academic words. To do so, I listed the words and phrases given in academic writing textbooks as typical lexical devices to perform the five rhetorical functions analysed in detail in this book and compared them with the AKL. I identified the words that were not part of the AKL and examined their use in the BNC-AC-HUM. Some of these lexical items turned out not to be typical of academic prose or to be extremely rare (e.g. to name but a few, by way of illustration) and should therefore not deserve the attention they have been given in pedagogical materials. By contrast, a large proportion of AKL words were not found in textbooks in spite of their relatively high frequency and major discourse functions in academic prose. These findings show the power of a data-driven approach to the selection of academic vocabulary and clearly call for a revision of the treatment of rhetorical functions in academic writing textbooks.

A pedagogically-oriented investigation of academic vocabulary cannot rest solely on native speaker data. It is essential to examine what learners actually do with lexical devices that serve rhetorical functions. For example, do they use exemplifiers? Do they rely on words and phrasemes that are typical of academic prose? Do they use the expressions to name but a few and by way of illustration? If so, do they use them correctly? And do they use them sparingly or do they make heavy use of these infrequent exemplifiers? These questions can only be answered by an analysis of learner corpus data. Such an analysis is presented in the next chapter.

64 (p < 0. The UCREL log-likelihood calculator website (http://ucrel. In Section 5.3. lack of register awareness. The same methodology was used to examine learners’ use of words that serve the rhetorical functions of ‘expressing cause and effect’. clusters of connectives and unmarked position of connectors. Instead.2 is on the general interlanguage features that emerge from these analyses. The learner’s first language also plays a considerable part in his or her use of academic vocabulary.1.1 presents a detailed comparison of exemplificatory devices in native and learner However these analyses are not presented in as much detail as for exemplification.Chapter 5 Academic vocabulary in the International Corpus of Learner English This chapter is devoted to academic vocabulary in learner writing.01) was taken as the threshold value. ‘expressing a concession’ and ‘reformulating: paraphrasing and clarifying’. This illustrates the type of results obtained when the range of lexical strategies available to EFL learners is compared to that of expert writers. 5. the focus of Section 5. Section 5. learner-specific phraseological patterns.lancs. semantic misuse. ‘comparing and contrasting’. The whole learner corpus was compared to the BNC-AC-HUM but the results are only reported if they are common to learners from a majority of the mother tongue backgrounds considered. both for reasons of space and because the presentation would soon become cumbersome. A bird’s-eye view of exemplification in learner writing A general finding of the comparison between the International Corpus of Learner English (ICLE) and the British National Corpus – Academic Humanities .html) was used to compute log-likelihood values. I focus on transfer effects on French learners’ use of multiword sequences with rhetorical functions. Differences between learner and native writing are highlighted by means of log-likelihood tests. These fall into six broad categories: limited lexical repertoire. 
However not all learner specific-features can be attributed to developmental factors.

) bespeak a general lack of concern for comprehensibility’ (Siepmann. The bar chart in Figure 5. Overuse of for example has also been found in other learner populations such as Japanese and Taiwanese learners (Narita and Sugiura. these lexical items are quite infrequent in both nativespeaker and learner writing.1. This shows that EFL learners’ overuse of the function of exemplification is largely explained by their massive overuse of the adverbials for example and for instance. There is no significant difference in the use of the preposition such as. Chen. This explanation for German learners’ underuse of exemplifiers is not entirely satisfactory.g. 2006).g. the nouns illustration and case in point and the expressions to name but a few and by way of illustration when comparisons are based on the total number of running words in each corpus. they do not choose the same exemplifiers. The bar chart shows that EFL learners’ use of exemplifiers differs from that of professional writers in at least two ways. learners tend to make little use of the verbs illustrate and exemplify and the adverb notably. As explained in Section 4.. First. The overuse of for instance has already been reported by Granger and Tyson (1996) for French learners and Altenberg and Tapper (1998) for Swedish learners. Corpus comparisons based on the total number of running words have shown that exemplification is used significantly more in the ICLE than . Figures and log-likelihood values for each corpus comparison are given in Table 5. The frequencies of individual items also differ widely.. The lexical items are ordered by decreasing relative frequency in the ICLE. 2006. the frequency of each exemplificatory lexical item can also be calculated as a proportion of the total number of exemplifiers. This result highlights the importance of analysing several learner populations and comparing them so as to avoid faulty conclusions about EFL learner writing in general. 
and does not apply to EFL learner writing in general: most L1 learner populations overuse exemplificatory discourse markers. By contrast. whereas the most frequent one in the BNC-AC-HUM is such as.1. Siepmann (2005) finds that the adverbials for example and for instance are less frequent in German learner writing than in native and non-native professional writing and argues that ‘under-use of exemplification as a rhetorical strategy in student writing may (. the noun example 2 and the preposition like. the abbreviation e. Thus the most frequent exemplifier in the ICLE is the adverbial for example.000 words of exemplifiers in the ICLE and the BNC-AC-HUM. 2005: 255). . which are underused in the ICLE. . Except for the preposition such as and the abbreviation e.126 Academic Vocabulary in Learner Writing (BNC-AC-HUM) subcorpus is that exemplificatory lexical items are significantly more frequent in learner writing than in professional academic prose.1 shows the frequencies per 100.

[Bar chart: frequency per 100,000 words of each exemplifier in the BNC-AC-HUM and the ICLE]
Figure 5.1 Exemplifiers in the ICLE and the BNC-AC-HUM

Table 5.1 A comparison of exemplifiers based on the total number of running words

                              ICLE              BNC-AC-HUM
                              Abs.    Rel.      Abs.    Rel.      LogL
Nouns
example                       713     61.17     1285    38.68     91.6 (++)
  example                     477     40.9      665     20        134 (++)
  examples                    230     19.7      620     18.7      0.5
  *exemple¹                   4       0.3
  *exampl                     1       0.1
  *examle                     1       0.1
illustration                  17      1.5       77      2.3
  illustration                16      1.4       63      2
  illustrations               1       0.1       14      0.4
(BE) a case in point          10      0.86      18      0.5       1.3
TOTAL NOUNS                   740     63.5      1380    41.5      83.6 (++)
Verbs
illustrate                    51      4.38      259     7.8       16.1 (− −)
  illustrate                  29      2.5       97      2.9       0.6
  illustrates                 14      1.2       63      1.9       2.6
  illustrated                 8       0.7       84      2.5       17.7 (− −)
  illustrating                0       0         15      0.5       9
exemplify                     6       0.43      79      2.38      20.32 (− −)
  exemplify                   2       0.2       9       0.3       0.4
  exemplifies                 2       0.2       15      0.5       2.1
  exemplified                 2       0.18      53      1.6       20.09 (− −)
    exemplified               1       0.1
    *examplified              1       0.1
  exemplifying                0       0         2                 1.2
TOTAL VERBS                   57      4.8       338     10.2      32.1 (− −)
Prepositions
such as                       489     42        1494    45        1.8
like                          468     40.2      532     16        199.6 (++)
TOTAL PREP.                   957     82.1      2026    61        55.3 (++)
Adverbs
for example                   857     73.5                        209.9 (++)
  for example                 854
  *for exemple                3
for instance                  344     29.5      609     18.3      47.3 (++)
e.g.                          94      8         259     7.8       0.1
notably                       5       0.4       77      2.3       22.1 (− −)
to name but a few             3       0.3       4       0.1       0.9
by way of illustration        1       0.1       3       0.1       0
TOTAL ADVERBS                 1304    111.9     2215    66.7      208.3 (++)
TOTAL                         3058    262.4     5959    179.4     279.2 (++)

Legend: (++) significantly more frequent (p < 0.01) in ICLE than in BNC-AC-HUM; (− −) significantly less frequent (p < 0.01) in ICLE than in BNC-AC-HUM
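
The log-likelihood comparison behind the LogL column can be sketched in a few lines of Python. This is a minimal illustration rather than the UCREL calculator itself: it uses the standard two-corpus log-likelihood formula (expected frequencies derived from the two corpus sizes), and the function names are hypothetical. The corpus sizes are an assumption, back-calculated from the absolute and relative frequencies in the TOTAL row of Table 5.1, because the sizes are not stated in this section.

```python
import math


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Log-likelihood (G2) for one item counted in two corpora."""
    combined = freq_a + freq_b
    total_size = size_a + size_b
    expected_a = size_a * combined / total_size
    expected_b = size_b * combined / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def verdict(freq_a, freq_b, size_a, size_b, threshold=6.64):
    """Return '++', '--' or '//' in the style of the table legends."""
    if log_likelihood(freq_a, freq_b, size_a, size_b) < threshold:
        return "//"
    return "++" if freq_a / size_a > freq_b / size_b else "--"


# ASSUMPTION: corpus sizes back-calculated from the TOTAL row of Table 5.1
# (absolute frequency / frequency per 100,000 words); not given in the text.
ICLE_WORDS = round(3058 / 262.4 * 100_000)
BNC_WORDS = round(5959 / 179.4 * 100_000)

ll_total = log_likelihood(3058, 5959, ICLE_WORDS, BNC_WORDS)
print(round(ll_total, 1), verdict(3058, 5959, ICLE_WORDS, BNC_WORDS))
```

Under these assumed sizes the value for the TOTAL row should land close to the figure reported in the table, with small differences due to rounding of the relative frequencies. The same call can be repeated for any row, for example `verdict(5, 77, ICLE_WORDS, BNC_WORDS)` for notably.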

Table 5.2 A comparison of exemplifiers based on the total number of exemplifiers used

                              ICLE              BNC-AC-HUM
                              Abs.    %         Abs.    %         LogL
Nouns
example                       713     23.3      1285    21.6      2.8
illustration                  17      0.6       77      1.3       11.7 (− −)
(BE) a case in point          10      0.3       18      0.3       0
TOTAL NOUNS                   740     24.2      1380    23.2      0.9
Verbs
illustrate                    51      1.7       259     4.4       47.7 (− −)
exemplify                     6       0.2       79      1.3       35 (− −)
TOTAL VERBS                   56      1.8       338     5.7       77.3 (− −)
Prepositions
such as                       489     16        1494    25        80 (− −)
like                          468     15.3      532     8.9       70.7 (++)
TOTAL PREP.                   957     31.3      2026    34        4.5
Adverbs
for example                   854     28        1263    21.2      39 (++)
for instance                  344     11.3      609     10.2      2
e.g.                          94      3         259     4.3       8.1 (− −)
notably                       5       0.2       77      1.3       36.9 (− −)
to name but a few             3       0.1       4       0.1       0.2
by way of illustration        1       0         3       0         0.2
TOTAL ADVERBS                 1301    42.6      2215    37.2      15.3 (++)
TOTAL                         3054    100       5959    100


Legend: (++) significantly more frequent (p < 0.01) in ICLE than in BNC-AC-HUM; (− −) significantly less frequent (p < 0.01) in ICLE than in BNC-AC-HUM

in the BNC-AC-HUM, and that the four lexical items discussed above are largely responsible for this overuse (Table 5.1). Comparisons based on the total number of exemplifiers allow us to ask and answer different research questions. They give information about which lexical item(s) EFL learners prefer to use when they want to give an example, and in what proportions. Thus Table 5.2 shows that EFL learners select for example on 28 per cent of the occasions when they introduce an example, whereas native-speaker academics only use it to introduce 21 per cent of their examples. Both methods indicate that EFL learners overuse the preposition like and the adverbial for example. As shown in Table 5.3, however, the two methods may also give different results. The noun example appears to be overused in the ICLE when comparisons are based on the total number of running

Table 5.3 Two methods of comparing the use of exemplifiers

Lexical item                  Comparison based on total      Comparison based on total
                              number of running words        number of exemplifiers
example                       ++                             //
illustration                  //                             − −
(be) a case in point          //                             //
TOTAL NOUNS                   ++                             //
illustrate                    − −                            − −
exemplify                     − −                            − −
TOTAL VERBS                   − −                            − −
such as                       //                             − −
like                          ++                             ++
TOTAL PREPOSITIONS            ++                             //
for example                   ++                             ++
for instance                  ++                             //
e.g.                          //                             //
notably                       − −                            − −
to name but a few             //                             //
by way of illustration        //                             //
TOTAL ADVERBS                 ++                             ++

Legend: ++ significantly more frequent (p < 0.01) in ICLE than in BNC-AC-HUM; − − significantly less frequent (p < 0.01) in ICLE than in BNC-AC-HUM; // no significant difference between the frequencies in the two corpora
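
The second method, in which each item is expressed as a share of all exemplifiers rather than of all running words, can be illustrated with the absolute frequencies from Table 5.2. The following is a small sketch only: the dictionary and function names are hypothetical, and just four of the items are included.

```python
# Absolute frequencies and exemplifier totals taken from Table 5.2.
ICLE = {"example": 713, "for example": 854, "such as": 489, "like": 468}
BNC_AC_HUM = {"example": 1285, "for example": 1263, "such as": 1494, "like": 532}
ICLE_TOTAL = 3054        # all exemplifiers in the ICLE
BNC_AC_HUM_TOTAL = 5959  # all exemplifiers in the BNC-AC-HUM


def share(freq, total):
    """Percentage of all exemplifiers accounted for by one item."""
    return round(100 * freq / total, 1)


for item in ICLE:
    learners = share(ICLE[item], ICLE_TOTAL)
    experts = share(BNC_AC_HUM[item], BNC_AC_HUM_TOTAL)
    print(f"{item:12} ICLE {learners:5}%   BNC-AC-HUM {experts:5}%")
```

Because the denominator is the number of exemplifiers, an item such as the noun example can look overused per running word and yet be chosen in roughly the same proportion by both groups of writers.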

words in each corpus. However, a comparison based on the total number of exemplifiers suggests that the learners choose the noun example about as often as professional academics when they want to introduce an example (23.3% vs. 21.6%). More lexical items are significantly underused when figures are based on the total number of exemplifiers. In addition to illustrate, exemplify and notably, the noun illustration and the preposition such as are selected proportionally less often by EFL learners than by professionals to introduce an example. This first broad picture of the use of exemplifiers in the ICLE points to EFL learners’ limited repertoire of lexical items used to serve this specific EAP function. This characteristic of learner writing is discussed in more detail in Section 5.2.1. By comparison with academics, EFL learners overuse the preposition like and underuse such as. Figure 5.2 shows the relative frequencies per 1,000,000 words of like and such as in four sub-corpora of the British National Corpus representing different ‘super genres’ (see Section 3.3): academic writing, fiction, newspaper texts and speech (BNC-SP) as well as in the ICLE. The

[Bar chart: frequency per million words of like and such as in academic writing, fiction, news, speech and learner writing]

Figure 5.2 The use of the prepositions ‘like’ and ‘such as’ in different genres

[Bar chart: frequency per million words of notably in speech, fiction, learner writing, news and academic writing]

Figure 5.3 The use of the adverb ‘notably’ in different genres

preposition like is much more frequent than such as in speech3, fiction, news and learner writing but is less frequent in academic prose. By contrast, such as is more frequently used in academic prose. Learners’ use of these exemplificatory prepositions thus differs from academic expert writing, but resembles more informal genres such as speech. Learners’ underuse of the adverb notably in their academic writing is another illustration of the same point (Figure 5.3).

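[Two pie charts showing the distribution across academic writing, news, fiction and speech. for example: 77% academic writing, with the remaining 14%, 7% and 2% spread over the other three genres. for instance: 59% academic writing, with the remaining 20%, 11% and 10% spread over the other three genres.]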

Figure 5.4 Distribution of the adverbials ‘for example’ and ‘for instance’ across genres in the BNC

A large proportion of EFL learner populations make repeated use of the word-like unit for instance. The use of this adverbial by native-speakers, however, differs significantly from that of for example, both in terms of frequency and register. Figure 5.4 shows that 77 per cent of all instances of for example in the BNC are found in the academic sub-corpus. However only 59 per cent of the occurrences of for instance appear in academic prose while 30 per cent are found in more informal genres such as speech and fiction. Lee and Swales (2006: 64) also showed that the use of these two adverbials differs across academic disciplines: for instance is more frequent in the social sciences and humanities while in natural sciences, technology and engineering, for example is strongly favoured to clarify a difficult or complex point through exemplification. Lack of register awareness manifests itself in a number of ways in learner academic writing. This will be the focus of Section 5.2.2. The phraseology of academic words is also a major source of difficulties to EFL learners. One of the main advantages of using a noun rather than the adverbials for example and for instance is that the use of a noun allows the writer to qualify the example with an adjective (see Section 4.2.2). However only 18 per cent of the adjective co-occurrents (types) of the noun example in the ICLE are significant co-occurrents in the BNC-AC-HUM (Table 5.4). A quarter of the adjective co-occurrents of example in the ICLE do not appear at all in the 100-million word British National Corpus (Table 5.5). A large proportion of these adjectives have been described by our

native-speaker informant as forming awkward co-occurrences with example as illustrated in the following sentences:

5.1. We all know thousands of such manipulative examples. For example a disliked politician will be shot through such a zoom as to expose his ugly bits. Which may most probably influence our feeling towards him. (ICLE-PO)
5.2. Of course, that was an overstated example, extreme, so to speak. (ICLE-DU)
5.3. The opposite example is (the former?) USSR, where the union was imposed by a central power without real approbation of the states and against people’s will. (ICLE-FR)
5.4. This mere example proves that the ideal union people dream of is not yet a total reality: national conflicts are still at work, every nation defends its own interests before fighting for those of “the group” they joined. (ICLE-FR)
5.5. The story of Cinderella is one more impermissible example. Cinderella is a neglected child, and once again the step-family is the guilty party. (ICLE-RU)

Table 5.4 Significant adjective co-occurrents of the noun ‘example’ in the ICLE

Adjective       freq.
good            77
extreme         12
above           8
clear           8
striking        7
simple          6
well-known      5
excellent       4
typical         3
classic         2
interesting     2
numerous        2
outstanding     1

Table 5.5 Adjective co-occurrents of the noun ‘example’ in ICLE not found in the BNC

Adjective       freq.
big             2
warning         2
absolute        1
bright          1
cruel           1
present day     1
evident         1
frightening     1
impermissible   1
manipulative    1
mere            1
model           1
opposite        1
overstated      1
polemic         1
hair raising    1
stirring        1
upsetting       1

Similarly, only 23 per cent of the verb types that are used with example in the ICLE are significant co-occurrents of the noun in the BNC-AC-HUM (see Table 5.6). Some 27 per cent of the verb co-occurrents (types) of the noun example in the ICLE do not appear with example in the whole BNC. They are listed in Table 5.7. Like adjective co-occurrents, several of these verbs form awkward co-occurrences with the noun example:

5.6. I can make the example of Naples: here there is everyday an incredible lot of crimes. In a new society made with less inequality, less poverty and more social justice we would not find the same quantity of crime that we find in our society. (ICLE-IT)

Table 5.6 Significant verb co-occurrents of the noun ‘example’ in the ICLE

Left co-occurrents          Right co-occurrents
Verb          freq.         Verb          freq.
be            162           be            119
take          36            show          31
give          28            illustrate    15
find          10            concern       2
show          10            suggest       1
serve         4             suffice       1
illustrate    3
provide       2
cite          2
consider      1
TOTAL         258           TOTAL         169

Table 5.7 Verb co-occurrent types of the noun ‘example’ in ICLE not found in BNC

Left co-occurrents            Right co-occurrents
Verb             freq.        Verb         freq.
culminate into   1            say          1
glide into       1            reinforce    1
state            1            criticize    1
plaster with     1            point out    1
derive           1            express      1
write            1
help as          1
appear           1
TOTAL            8            TOTAL        5
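The percentages quoted above (the share of learner co-occurrent types that are also significant, or attested at all, in a reference corpus) amount to a set comparison over types. The following is a minimal sketch under that reading; the function name and the toy verb sets are hypothetical and are not the ICLE or BNC lists.

```python
def share_attested(learner_types, reference_types):
    """Percentage of learner co-occurrent types that also occur in the reference set."""
    learner = set(learner_types)
    if not learner:
        raise ValueError("learner_types must not be empty")
    shared = learner & set(reference_types)
    return 100 * len(shared) / len(learner)


# Toy sets for illustration only (hypothetical, not taken from the corpora).
toy_learner_verbs = {"take", "give", "glide into", "plaster with"}
toy_reference_verbs = {"take", "give", "provide", "cite"}

toy_share = share_attested(toy_learner_verbs, toy_reference_verbs)
print(f"{toy_share:.1f} per cent of the toy learner types are attested")
```

Counting types rather than tokens means that a verb used once weighs as much as a verb used a hundred times, which is why a very frequent co-occurrent such as be does not dominate the percentage.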

5.7. To glide into an extreme example, unequality appears even between people living in towns and villages. It originates in dissimilar climate, life-style, social organization, political and economical stability of the country. Their understanding of the outside world differs. (ICLE-CZ)
5.8. The rules of the road you have to learn to pass your driving license are plastered with examples of children who cross the road unexpectedly, running after a ball. (ICLE-GE)

The copular be is the most frequent left and right co-occurrent of the noun example in learner writing. Textual sentence stems and rhemes with the verb be are significantly more frequent in learner writing than in professional academic writing (Table 5.8). These results differ markedly from those reported in Paquot (2008a) in which French, Spanish, Italian and German learners were shown to underuse stems and rhemes with the verb be. This difference may be explained by the fact that the reference corpus used for comparison in Paquot (2008a) was a collection of native-speaker student essays.

Table 5.8 The distribution of ‘example’ and ‘be’ in the ICLE and the BNC-AC-HUM

                ICLE            BNC-AC-HUM
be + example    162 (57.7%)     139 (62.3%)
example + be    119 (42.3%)     84 (37.7%)
TOTAL           281             223
Rel. freq.      24.1            6.71
LogL            199.52 (++)

Table 5.9 shows that the structure there + be + example is more frequently used in learner writing than in professional academic writing. It appears in all 10 learner corpora (i.e. irrespective of the learner’s mother tongue) as illustrated by the following sentence:

5.9. There is the example of Great Britain where a professional army costs less than, for example, the French army based on conscription. (ICLE-RU)

Table 5.9 The distribution of ‘there + BE + example’ in ICLE and the BNC-AC-HUM

there + BE + example    ICLE     BNC-AC-HUM
Abs. freq.              31       15
Rel. freq.              2.66     0.45
LogL                    34.76 (++)
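The LogL values reported in Tables 5.8 and 5.9 are log-likelihood scores for a frequency comparison between two corpora. The sketch below shows one common two-corpus formulation; whether it is exactly the variant used for these tables is an assumption, and the corpus sizes passed in are hypothetical round figures, since the text here does not state them.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood: 2 * sum(observed * ln(observed / expected))."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were equally likely in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Frequencies from the TOTAL row of Table 5.8; the corpus sizes are ASSUMED.
ASSUMED_ICLE_WORDS = 1_150_000
ASSUMED_BNC_AC_HUM_WORDS = 3_300_000
score = log_likelihood(281, ASSUMED_ICLE_WORDS, 223, ASSUMED_BNC_AC_HUM_WORDS)

# 6.63 is the chi-square critical value for p < 0.01 with one degree of freedom.
print(f"LogL = {score:.2f}; significant at p < 0.01: {score > 6.63}")
```

Because the corpus sizes are placeholders, the printed score should not be expected to reproduce the published figure exactly; the point is the shape of the calculation and the comparison with the significance threshold.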

In professional academic writing, the verb take is mainly used in sentence-initial exemplificatory infinitive clauses with the noun example (Example 5.10). This pattern is very infrequent in ICLE. EFL learners prefer to use the verb take in active structures introduced by the personal pronoun I (Example 5.11) or in first person plural imperative sentences (Example 5.12).

5.10. To take one example, at the beginning of the project seven committees were established, each consisting of about six people, to investigate one of a range of competing architectural possibilities.
5.11. I can take the example of the ‘Société Générale de Belgique’ which is directed by ‘Suez’. (ICLE-FR)
5.12. Let’s take the example of painting. (ICLE-FR)

As illustrated by Examples 5.13 and 5.14, learners often use the verb have in the same structures as take to introduce an example. The imperative sentence, however, was judged to be awkward by our native-speaker informant.

5.13. I have a good example in my family. (ICLE-CZ)
5.14. Let us have an example – an extract out of the famous Figaro’s soliloquy: There is a liberty of the press in Madrid now, so that I can write about anything I like, providing I will have it checked by two or three censors and an condition that I will not write against the government and religion. (ICLE-PO)

Interestingly, the verb have and the first person plural imperative let’s are not significant left co-occurrents of example in the BNC-AC but they are in the BNC-SP corpus of spoken language. The verb have is often used in speech with an inclusive we as subject (Example 5.15); let’s is typically used with the verb take + example (Example 5.16).

5.15. Er in relation to existing employment sites er and Mr Laycock referred to National Power, erm there we have an example of the attitude that the the council is taking towards the the re-use of employment sites. (BNC-SP)
5.16. Let’s take the example of a cooker. (BNC-SP)

The verb give is the most significant co-occurrent of the noun example in the BNC-SP. It is used in questions and first person plural imperative sentences (Examples 5.17 and 5.18), two patterns that are not found in the BNC-AC-HUM despite the fact that the verb is also a significant co-occurrent of

example in academic prose. By contrast, first person plural imperative sentences with the verb give do appear in the ICLE (Example 5.19).

5.17. Can you give an example when you say that the law is designed? (BNC-SP)
5.18. Let me give you some examples. (BNC-SP)
5.19. Let me give you one example – appaling shots from the war in ex-Yugoslavia that we can see nearly every day. (ICLE-CZ)

In summary, verb co-occurrents of the noun example provide further evidence for the genre-bound nature of phrasemes: the preferred phraseological environment of the noun differs in academic writing and speech (see Biber et al., 1999, 2004; Luzón Marco, 2000). Results suggest that EFL learners sometimes select co-occurrences that are more typical of speech, which can be interpreted as further indication of their lack of register awareness.

Differences in phraseological or lexico-grammatical preferences are often revealed by patterns of overuse and underuse of word forms. Thus, the different forms of the verbs illustrate and exemplify are not all underused in learner writing. Table 5.1 above shows that the two verbs are underused in their –ed form only. This underuse corresponds to an underuse of the passive constructions BE illustrated by/in (Example 5.20) and BE exemplified by/in (Example 5.21), the past participle exemplified following a noun phrase (Example 5.22) and the patterns as illustrated/exemplified by/in (Example 5.23):

5.20. The contrast between the conditions on the coast and in the interior is illustrated by the climatic statistics for two stations less than 30 km (18.5 miles) apart. (BNC-AC-HUM)
5.21. The association of this material with the clerk is clearly exemplified by Chaucer’s Wife of Bath’s fifth husband, the clerk Jankyn, who, in the Wife of Bath’s Prologue, reads antifeminist material to her from his book Valerie and Theofraste. (BNC-AC-HUM)
5.22. Piaget’s claim that thinking is a kind of internalized action, exemplified in the assimilation-accommodation theory of infant learning mentioned above, is really a global assumption in search of some refined, detailed and testable expression. (BNC-AC-HUM)
5.23. He assumed, without argument, that science, as exemplified by physics, is superior to forms of knowledge that do not share its methodological characteristics. (BNC-AC-HUM)

The verb illustrate is more often used with human subjects (11.76%) in learner writing, and more specifically with the personal pronoun I:

5.24. Film stars are usually very attractive and it’s not a surprise that children want to follow them. They take into consideration the behaviour patterns of film stars, they want to be like them. I can illustrate that by a real example.
5.25. I would like to illustrate that by means of some examples which, as you will see, are very diverse.

It is also frequently used in sentence-initial infinitive clauses (13.72%):

5.26. To illustrate this point, it would be interesting to compare our situation with the U.S.A.’s.
5.27. To illustrate the truth of this, one has only to mention people’s disappointment when realizing how little value has the time spent at university. In the worst cases people decide to suicide.

As in professional academic writing, the noun case in point is very rarely used in learner writing. When used, however, it sometimes appears in lexico-grammatical patterns that are not found in expert academic writing, e.g. in an infinitive clause with the verb take (Example 5.28) or determined by a definite article and followed by the verb be and a that-clause (Example 5.29).

5.28. However, wars always break out for economical reasons. For example, to take a case in point, the first world war, did not start because the murder of archduke Frank Ferdinand, heir of Autro-Hungary, that was only the straw that broke the camel’s back.
5.29. A great number of children spend more and more time watching television. Professional observers see some even deeper danger in the emerging situation. The case in point is that little children learn how to smoke how to drink how to be cunning and clever and get round the adults. (ICLE-RU)
(Sub-corpora of Examples 5.24 to 5.28: ICLE-CZ, ICLE-DU, ICLE-FR, ICLE-SP (two examples))

EFL learners’ phraseological and lexico-grammatical specificities will be discussed in detail in Section 5.2.3 below.

EFL learners may also experience difficulty with the meaning of single words and phrasemes. For example, they sometimes use the abbreviation i.e. instead of e.g. as an exemplificatory discourse marker (Examples 5.30

to 5.32). The abbreviation i.e., however, is a synonym of ‘that is’ used to reformulate by paraphrasing or clarifying, and not an exemplifier at all.

5.30. It might seem absurd, but many progressive social changes (*i.e. [e.g.] an increase of individual liberty) may lead to further increase of crime.
5.31. The states mostly tend to solve their politic problems in a peaceful way (*i.e.: [e.g.] the split of Czech federation or the unification of Germany).
5.32. One of the examples that makes this point is related to children’s toys, because nowadays children play with technological toys (*i.e. [e.g.] video games), and these toys do not let the children develop their imagination and, in many cases, they are so inactive that playing with these toys does not permit physical exercise.

Learners also sometimes use as in lieu of the complex preposition such as (Examples 5.33 to 5.37). It should be noted, however, that this erroneous use is more frequently found in learner populations with Romance mother tongue backgrounds.

5.33. Another proof will be the role that imagination plays in all the Arts *as [such as] Literature, Music and Painting.
5.34. There should be particular institutions for those who are mentally alienated *as [such as] the rapists, others for the young people, etc.
5.35. In this essay I would like to show how, in my opinion, crime is caused by a predisposition of the individuals and how, of course, other factors *as [such as] society, culture and politics can influence this natural inclination.
5.36. Thus soldiers learned mostly bad habits *as [such as] smoking, drinking (if possible) and being lazy in their leisure time.
5.37. In addition to the familiar subjects *as [such as] reading, writing and mathematics, time should be reserved for making children conscious of the fact that there is more to life than the things we see.
(Sub-corpora of Examples 5.30 to 5.37: ICLE-CZ (two examples), ICLE-DU, ICLE-FR, ICLE-IT, ICLE-RU, ICLE-SP (two examples))

As illustrated in Example 5.38, the adverb namely is also sometimes misused by EFL learners who use it instead of notably or another exemplifier.

5.38. This new wave of revolting trivial events is all the more worrying since it is linked to a rise of the small delinquance, implying a generalized climate of terror and a total mistrust of the citizens towards the police forces and the law, both accused of all vices and *namely [(most) notably] of being too lax with those evils. (ICLE-FR)

This confusion is relatively common, which is not surprising as it is even found on websites supposed to help learners master English connectors (Figure 5.5).

[Figure 5.5 The treatment of ‘namely’ on websites devoted to English connectors. Two French-language web pages are reproduced. The first lists, under ‘Pour donner des exemples’: for instance, for example, such as, like, namely (c’est-à-dire), above all (surtout). The second lists, under ‘Example’: for example, for instance, namely, in particular, just as, one example, to illustrate.]

More generally, namely is very often misused in learner writing and it is not always clear what learners mean when they use this adverb:

5.39. All the academic facilities are ?namely located on the main campus. Because the campus consists of modern buildings, built closely together, it is no more than a ten minute’s walk to get where you need to be for lectures and seminars.
5.40. Why, then, so many people object to gay marriages and, at the same time, yearn for equality? It is ?namely just equality what gay marriages are about, isn’t it? (ICLE-FI)
5.41. The efforts made by the firms are obvious. They ?namely create replacement products: they replace the gas in the aerosols and so we have ozone-friendly aerosols.
5.42. Reluctance to eventually join The Common Market is ?namely caused by fear, disbelieves, inferiority complex, short-sightedness or even nationalistic and xenophobic tendencies.
(Sub-corpora of Examples 5.39, 5.41 and 5.42: ICLE-DU, ICLE-FR, ICLE-PO)

More examples of semantic misuse are illustrated and discussed in Section 5.2.4.

Another explanation for the general overuse of the function of exemplification in learner writing may be that exemplifiers are repeatedly used when they are superfluous, redundant or even when other rhetorical functions should be made explicit. In Example 5.43, the logical relation between the two sentences is a causal link that is left implicit while an unnecessary exemplifier is used:

5.43. I described there only some examples from the great number of criminal offences. After some years many of those criminals will be set free because of their

relatively mild punishment. They had for example youthful age. (Youthful age – by the way in contrast to the punishment of 16 years old boys in our country, who got off with the light punishment, in England were recently sentenced two 10 years old boys for murder of a 3 years old boy to the lifelong punishment!) (ICLE-CZ)

Section 5.2.5 will focus on the unnecessary use of lexical items that serve rhetorical or organizational functions as well as on learners’ tendency to clutter up their texts with too many logical devices.

EFL learners’ use of exemplifiers also differs from that of expert writers with respect to positioning. A sentence-initial position for the adverbials for example and for instance is clearly favoured in the ICLE, compared to the BNC-AC-HUM:

5.44. But there are actually a number of things we all can do that make a difference. For example, there ought to be information about different ways to save electricity.
5.45. For instance, England has always been divided according to the kind of religion in which a person believed. There were a lot of wars due to the religion.

The two adverbials are also repeatedly found at the end of a sentence in the learner subcorpora (7.14% of the occurrences of for example and 8.4% of the occurrences of for instance), although this position is rare in academic professional writing (1.6% for for example, 1.3% for for instance):

5.46. Let us have a good look at television for example.
5.47. They only want an easy to operate camera, a Single Use Camera for instance.
(Sub-corpora of Examples 5.44 to 5.47: ICLE-DU, ICLE-PO, ICLE-SP, ICLE-SW)

Aspects of sentence position are dealt with in Section 5.2.6.

In Chapter 4, I argued that Academic Keyword List (AKL) lexical items and their phraseological patterns should be taught to EFL learners. Learner corpus data support this claim as all the AKL words that are used to give examples in academic prose present one or more learner-specific difficulties. The adverb notably and the abbreviation e.g. are semantically misused, the adverbials for example and for instance are predominantly used in sentence-initial position, and the noun example and the verbs illustrate and exemplify are used in learner-specific phraseological patterns. It was also argued that the pedagogical relevance of non-AKL items – the preposition

like, the nouns illustration and case in point and the expressions to name but a few and by way of illustration – depended on whether learners already used these exemplifiers and how they used them. The analysis of the ICLE corpus suggests that:

– A word of caution is needed against excessive reliance on the preposition like.
– The noun illustration should be specifically taught to upper-intermediate and advanced learners as it is underused in the ICLE.
– The specific lexico-grammatical patterns of case in point should also be taught as this phraseme is repeatedly used in ‘unidiomatic’ patterns.

The pedagogical implications of learner corpus-based findings will be further considered in Chapter 6.

5.2. Academic vocabulary and general interlanguage features

A comparison of words that serve the rhetorical functions of ‘giving examples’, ‘expressing cause and effect’, ‘comparing and contrasting’, ‘expressing a concession’ and ‘reformulating: paraphrasing and clarifying’ in learner and expert academic writing has made it possible to identify six specific areas of where learner English varies from native-speaker academic English. Section 5.2.1 focuses on learners’ limited lexical repertoire by examining aspects of over- and underuse. In Section 5.2.2, the characteristics of learner’s lack of register awareness are presented. Section 5.2.3 explores the type of phraseological and lexico-grammatical patterns that are found in most learner sub-corpora. Section 5.2.4 discusses patterns of semantic misuse of connectors and abstract nouns. Learners’ tendency to clutter their texts with unnecessary connectives is the focus of Section 5.2.5 and Section 5.2.6 illustrates their preference for placing connectors at the beginning of sentences.

5.2.1. Limited lexical repertoire

Several studies based on one or more ICLE subcorpora have argued that ‘these EFL writers are not equipped with the type of lexical knowledge necessary for the type of writing task they are undertaking’ (Petch-Tyson, 1999: 60). An analysis of learners’ use of potential academic words from the

Academic Keyword List (AKL) supports this view. Table 5.10 shows that almost 50 per cent of the words in the AKL are underused in the ICLE, a percentage that rises to 52.1 per cent for nouns and 56.3 per cent for adverbs. By contrast, the proportion of words in the AKL that are overused in learner academic writing is only 21.4 per cent. The largest percentages of overused items are found in nouns and in the ‘other’ category which includes prepositions, conjunctions, determiners, etc.

Table 5.11 gives examples of overused and underused AKL words in the ICLE. It could be argued that ‘learner usage tends to amplify the high frequencies and diminish the low ones’ (Lorenz 1999b: 59). For example, overused items such as the nouns idea and problem, the verbs be and become and the adjectives difficult and important are very frequent words in general English (relative frequencies of more than 200 occurrences per million words in the whole BNC). Conversely, underused items such as the nouns hypothesis and validity, the verbs exemplify and advocate, the adverbs conversely and ultimately and the prepositions as opposed to and in the light of are much less frequent in English (relative frequencies of less than 30 occurrences per million words in the whole BNC).

The picture, however, appears to be more complex than Lorenz’s quote suggests. Not all high frequencies are amplified in EFL learner writing. Many AKL words that appear with a relative frequency of more than 100 occurrences per million words in the whole BNC are underused in the ICLE, e.g. the nouns argument, difference and effect, the verbs argue and explain, the adjectives likely and significant and the adverbs generally and particularly (in bold in Table 5.11). Key function words such as between, by, in, and of are quite representative of the nominal style of academic texts, where 60 per cent of all noun phrases have a modifier (Biber, 2006). However, these highly frequent prepositions are underused in the ICLE, a fact that can be related to EFL learners’ tendency to avoid prepositional noun phrase postmodification (Aarts and Granger, 1998; Meunier, 2000: 279).

Table 5.10 The distribution of AKL words in the ICLE

              overused       no statistical difference    underused
nouns         86 [24.2%]     84 [23.7%]                   185 [52.1%]
verbs         40 [17.2%]     93 [39.9%]                   100 [42.9%]
adjectives    34 [18.9%]     59 [32.8%]                   87 [48.3%]
adverbs       16 [18.4%]     22 [25.3%]                   49 [56.3%]
other         21 [28.0%]     21 [28.0%]                   33 [44.0%]
TOTAL         199 [21.4%]    277 [29.8%]                  454 [48.8%]
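The bracketed percentages in Table 5.10 are row percentages: each count divided by the number of AKL words in that grammatical category. A minimal sketch of that arithmetic follows, using the counts printed for the noun and adverb rows; the variable and function names are illustrative only.

```python
def row_percentages(counts):
    """Return each count as a percentage of the row total, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * count / total, 1) for count in counts]


# Counts as printed in Table 5.10: (overused, no statistical difference, underused).
akl_rows = {
    "nouns": (86, 84, 185),
    "adverbs": (16, 22, 49),
}

for category, counts in akl_rows.items():
    print(category, row_percentages(counts))
```

The last value in each row is the underuse rate cited in the text (52.1 per cent for nouns, 56.3 per cent for adverbs).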

conclusion. because. obvious. parallel. readily. hypothesis. during. relatively. The amplification of a restricted set of low frequency words in learner writing may be partly explained by teaching-induced factors. originally. depend. treat. primarily. critical. moreover.144 Table 5. useful underused addition.6). difficult. theme. while its much less frequent synonym. prior to. or. develop. use common. consider. same. In addition. increasingly. subsequently. highlight. be. form. reality. aim. than. secondly. often. interesting. specifically. extensive. in terms of. from. contrast. extremely. effect. emphasis. provided. of. less. concept. unlike. representative. particularly. study. in relation to. the. latter. only. due to. significant. view. essentially. relative. to. yield adequate. scope. comprise. become. possibility. bias. this The preposition despite is underused. inherent. for. practical. indicate. allow. upon. concern. many. words such as the noun disadvantage. assess. assume. difference. major. comprehensive. likely. extent. potentially. idea. issue. and the adverbs consequently and moreover (underlined in Table 5. solve. given that. prove. perspective. example. choice. appropriate. Words such as consequently.11) are overused although they appear with frequencies of less than 50 per million words in the BNC. necessary. mainly. propose. argue. comparison. avoid. theory. influence. in the light of. describe. suggest. basis. contribute. emphasise. especially. more. possible. cause. disadvantage. increase. problem. largely. benefit. enhance. consequence. in response to. create. prime. argument. improve. unlikely adequately. detailed. by. influence. true. special. substantial. conduct. several. an. summary. contrast. note. generally. different. exist. define. particular. deal. explain. main. assumption. successfully. previously. validity adopt. despite. notably. including. real. therefore other according to. stress aim. the verbs participate and solve. cite. 
explicit. sense.11 the ICLE Academic Vocabulary in Learner Writing Examples of AKL words which are overused and underused in overused nouns advantage. reveal. participate. reflect. degree. subject to. conversely. in. ensure. advocate. each. consist. rather than. the complex preposition in spite of. fact. some. specify. position. criterion. as opposed to. reason. similar. irrespective of genre. examine. consequently. evidence. solution. exemplify. which verbs adjectives adverbs also. however. between. subsequent. choose. hence. effectively. risk. change. misleading. assert. its. ultimately although. is overused in learner writing (Figure 5. moreover and secondly usually appear in the long and . outcome. important. similarly. derive.

undifferentiated lists of connectors provided in EFL/EAP teaching materials (see Section 6.1). This situation may be compounded by problems of semantic misuse as will be discussed in Section 5.2.4.

[Figure 5.6 The use of ‘despite’ and ‘in spite of’ in different genres: academic writing, news, fiction, speech, learner writing]

Another tentative explanation may be that EFL learners do not amplify any high frequencies words except those that are common in speech. Underused words such as argument, issue, assume, indicate, appropriate, and particularly are quite frequent in general English (as represented by the whole BNC), but their frequencies are significantly less when the conversation component is analysed separately. As argued by Baayen et al. (2006), ‘the complexity of the frequency variable has been underestimated’ and it may be that more emphasis should be placed on the explanatory potential of spoken frequency counts.

The underuse of some frequent, but semantically specialized, words probably stems from learners’ tendency to rely on all-purpose, general, and vague words where more precise vocabulary should be used (Granger and Rayson, 1998; Petch-Tyson, 1999). In Section 5.1, it was shown that, although they generally overuse exemplifiers, EFL learners make little use of a number of EAP-specific lexical devices such as the verbs illustrate and exemplify or the adverb notably. They rely instead on a restricted lexical repertoire mainly composed of the adverbials for example and for instance, the noun example and the prepositions like and such as.

The same conclusion holds for learners’ use of cause and effect lexical items, which is compared with that of expert writers in Appendix 1. Broadly speaking, learners overuse logical links signifying cause and effect in their argumentative essays. This overuse does not, however, affect all grammatical categories. When corpus comparisons are based on the total

146 Academic Vocabulary in Learner Writing Table 5. when frequencies are compared to the total number of cause and effect lexical items. adverbs to express a cause or an effect. EFL learners prefer to use prepositions. not all individual connectors are overused in learner writing. even though EFL learners prefer to use prepositions.01) in ICLE than in BNC-AC-HUM. compared to expert writers. conjunctions and adverbs to express cause and effect. The categories of nouns. In other words. Table 5. − − significantly less frequent (p < 0. The overuse of conjunctions largely stems from learners’ marked preference for because. conjunctions and. // no significant difference between the frequencies in the two corpora number of running words in each corpus. to a lesser extent. verbs and adjectives do not display significant patterns of over. and tend to avoid nouns and verbs. that usage is bound to include a number of instances of over-extension. the overuse seems to be generally attributable to adverbs. He argued that ‘if a linguistic element is used as an all-purpose wild card.12 Two ways of comparing the use of cause and effect markers in the ICLE and the BNC Absolute frequency / total number of words Absolute frequency / total number of ‘cause and effect’ markers −− −− // // ++ ++ nouns verbs adjectives adverbs prepositions conjunctions // // // ++ ++ ++ Legend: ++ significantly more frequent (p < 0. only prepositions and conjunctions are significantly overused.or underuse. but which are nevertheless observed by the native speakers. Several of the overused lexical items are massively overused in learner writing.01) in ICLE than in BNC-AC-HUM.12). which represents 19. while nouns and verbs are underused (Table 5. The adverb so represents 11. By contrast. 1999b: 60–1). it can be expected that learners may disregard target-language restrictions which are not that obvious. 
Lorenz (1999b) examined the use of causal links in essays written by 16-to-18-year-old German learners and described the marked overuse of the conjunction because as ‘wild-card use’.5 per cent of the ‘cause and effect’ . or even accounted for in the standard grammars. This means that. prepositions and conjunctions. Such “simplification” is one of the most frequently cited features of learner language’ (Lorenz.9 per cent of all cause and effect markers in the ICLE.13 shows that.

always. Other examples of ‘lexical teddy bears’ (Hasselgren. usually. outcome. as a result. in fact. so that. In their study of expressions of doubt and certainty. effect. Hyland and Milton (1997) reported similar findings: Cantonese learners used a more limited range of epistemic modifiers. actually. thus. derive. emerge. think. yield. follow. would. due to. reason. on account of 0 10 [100%] 11 [100%] 2 [40%] conjunctions TOTAL because. in consequence 6 [54%] as a result of. 2004) are the prepositions because of and due to. with the ten most frequently used items (will.13 The over. contribute to. implication 13 [76%] generate. may.5 . so 3 [27%] prepositions because of. consequence 1 [6%] cause verbs 5 [45%] source. in (the) light of 11 [100%] 17 [100%] adjectives 0 1 [50%] responsible (for) 2 [100%] 4 [40%] adverbs consequently. result 3 [18%] bring about. arise. owing to. on the grounds that 5 [100%] 56 [100%] 16 [29%] 28 [50%] lexical items used by learners while it only accounts for 7. origin. provoke. on the grounds of. know. thereby 2 [18%] in view of. as a consequence of. result in. in consequence of. thanks to 2 [20%] therefore. induce. factor.2 per cent of those in expert writing. stem. hence. lead to underuse TOTAL 2 [19%] nouns root.Academic vocabulary in the ICLE 147 Table 5. this/that is why 12 [21%] 3 [60%] for.and underuse by EFL learners of specific devices to express cause and effect (based on Appendix 1) overuse no statistical difference 4 [37%] cause. as a consequence. trigger 1 [50%] consequent 4 [40%] accordingly. 1994) or ‘pet’ discourse markers (Tankó. and probably) accounting for 75 per cent of the total. give rise to. prompt.

On the other hand, 50 per cent of the lexical devices which serve to express cause or effect in expert writing are underused by learner writers. While underuse was found in all grammatical categories, the proportions varied significantly. Nouns and verbs constitute a large proportion of the possible ways of expressing a cause or an effect in academic prose, but 64.3 per cent of them are underused in the ICLE (e.g. the nouns source, effect and implication, the verbs induce, result in, yield, arise, emerge and stem from). This is not particularly surprising, as lexical cohesion has been largely neglected in teaching materials (textbooks and especially grammars), where the focus has generally been on adverbial connectors.

An analysis of the lexical items which serve to express a comparison or a contrast in academic prose shows that the rate of underuse is also quite high in this function. Table 5.14 shows that almost half of all comparison and contrast markers are underused. As with cause and effect lexical items, the degree of underuse varies significantly. Nouns and adjectives (e.g. resemblance, similarity, contrast, similar, distinct, and unlike) account for 59 per cent of all underused lexical items in the comparison and contrast category. The rate of overuse is relatively low, but once again overused items include words and phrasemes that are more frequent in speech (e.g. look like, in the same way) (see Section 5.2.2) as well as commonly misused expressions such as on the contrary (see Section 5.2.4). Unlike the cause and effect lexical items, however, overused comparison and contrast word do not compensate for the underused ones. Comparisons and contrasts are generally underused in learner writing.

In summary, EFL learners tend to rely heavily on a restricted set of greatly overused adverbs, prepositions or conjunctions to establish textual cohesion. Logical links can also be provided by nouns (cf. the concept of ‘labelling’ explained in Section 1.3), verbs and adjectives. These cohesive devices, which often account for a large proportion of the lexical strategies used to serve a specific rhetorical or organizational function in expert academic prose, do not seem to be readily accessible to upper-intermediate/advanced EFL learners. As will be discussed in Section 6.1, this may be explained by teaching-induced factors, as lexical cohesion has generally been neglected in teaching materials. These findings are not restricted to EFL learners: although they may become fluent in English conversational discourse, English as a Second Language (ESL) speakers have also been reported to ‘continue to have a restricted repertoire of syntactic and lexical features common in the written academic genre’ (Hinkel, 2003: 1066). Tables 5.13 and 5.14 provide useful

similarity. the same. distinct. differing.14 The over. difference. identical.8%] 79 . by/in comparison. parallel. correspondingly. the contrary. analogous. reversely. differently. unlike 7 [33%] similarly. versus 2 [66. distinction. comparison. + erroneous expressions 2 [22%] prepositions like. on the contrary. compare 2 [11%] adjectives same. different 4 [19%] adverbs in the same way. as against. contrary. identically. by/in contrast. on the other hand. by/in comparison with + erroneous expressions 0 1 [25%] other expressions as … as. common. distinguish. compared with/to. distinguishable. in contrast to/with.33%] whereas 3 [75%] in the same way as/ that. contrast 12 [67%] similar.9%] 31 [39. contrastingly. comparable. conversely.2%] 37 [46. quite the contrary. distinctiveness. 10 [48%] analogously. correspond. the reverse 2 [22%] 9 [100%] 15 [100%] TOTAL nouns verbs look like. CONJ compared with/to conjunctions 3 [100%] TOTAL 11 [13. differentiation. contrasting. contrast. the opposite 2 [22%] 5 [56%] resemble. distinctive. differentiate 4 [22%] alike. parallel. contrariwise.67%] as. on the one hand. analogy. distinctively 4 [44%] unlike.Table 5. likewise. by way of contrast. differ. while6 0 4 [100%] 9 [100%] 21 [100%] 18 [100%] underuse 10 [67%] resemblance. reverse parallel. as opposed to.and underuse by EFL learners of specific devices to express comparison and contrast (based on Appendix 2) overuse 0 no statistical difference 5 [33%] parallelism. opposite. parallely. contrary to 1 [33. comparatively 3 [33%] in parallel with.

different thoughts and emotions. But practically everybody is able to dream. there are different people with different concepts of happiness. Examples 5. (ICLE-FI) 5. According to Crystal it has little further potential ouside Spain. Altenberg and Tapper. So they want to get rid of the military service. Granger and Rayson. 2006). have often focused on learners with the same mother tongue background. (ICLE-IT) . and the adverbial all in all which is used to ‘show that you are considering every part of a situation’ (Longman Dictionary of Contemporary English (LDOCE4)). In Section 5. Lack of register awareness Many learner corpus-based studies have reported on EFL learners’ lack of register awareness (e. (ICLE-RU) 5. Many people who are in this situation think that this is a waste of time: you lose an entire year. the adverbial of course to express certainty.150 Academic Vocabulary in Learner Writing information about learners’ particular needs. the adverb though to introduce a concession.2. In the ICLE. most rhetorical functions are characterized by the overuse of at least one lexical item that is more typical of speech than of expert writing (Table 5. 1999b. These studies. The large-scale study undertaken here allows for a more systematic description of register awareness.and underused AKL single words and mono-lexemic units used to perform specific rhetorical functions. 5. 1998.48. 5. the breadth of EFL learners’ lexical repertoire has been examined in terms of the proportion of over. though.51. the stem I am going to talk about to introduce a new topic. what I want to demostrate is that a good way of making politics can cut the roots to crime. Ädel.15).2. however.52 illustrate overused lexical items that are more frequent in the BNC spoken component than in the BNC-ACHUM: the adverb so to express an effect. 
it will be shown that the limited nature of EFL learners’ lexical repertoire also stems from a restricted use of the phrasemes and lexico-grammatical patterns typically found in expert academic prose.49. 1998. Of course.g.3 will discuss how they can be used to inform pedagogical material. by exploring the way EFL learners with different mother tongue backgrounds use academic vocabulary. Lorenz. Spanish holds an important position in South America and increasingly so in the United States.3.48 to 5. Section 6. 2000.2.50. too. In this section. In this essay I am going to talk about the link between crime and politics. Meunier. (ICLE-DU) 5.

5.52. All in all, there are many ways in which mass media affect our approach to reality and they are, by no means, all positive or good for us. Thanks to them anyone willing to broaden his/her general knowledge of the world has an easy access to useful information. (ICLE-PO)

Table 5.15 Speech-like overused lexical items per rhetorical function

Rhetorical function                          Speech-like overused lexical item
Exemplification                              like
Cause and effect                             thanks to; so; because; that/this is why
Comparison and contrast                      look like; like
Concession                                   the (sentence-final) adverb though
Adding information                           sentence-initial and; the adverb besides
Expressing personal opinion                  I think; to my mind; from my point of view; it seems to me
Expressing possibility and certainty         really; of course; absolutely; maybe
Introducing topics and ideas                 I would like to/want/am going to talk about; thing; by the way
Listing items                                first of all
Reformulation: paraphrasing and clarifying
Quoting and reporting                        say
Summarizing and drawing conclusions          all in all

Gilquin and Paquot (2008) examined the use of some of the lexical items listed in Table 5.15 in the ten learner corpora used here as well as in four L1 sub-corpora (Norwegian, Japanese, Chinese, and Turkish) from the second version of the International Corpus of Learner English (ICLEv2) (Granger et al., 2009). The corpus totalled around 1.5 million words.

We compared the frequencies of speech-like lexical items in learner writing with their frequencies in the 10-million word spoken component (BNC-SP) and the 15-million word academic sub-corpus of the British National Corpus. Our findings support Lorenz’s (1999b: 64) statement that there is ‘mounting evidence that text-type sensitivity does indeed lie at the heart of the NS/NNS numerical contrast.’ They show that the relative frequency of these speech-like lexical items in learner writing is often situated between their frequency in academic prose and in speech (see the bar charts for maybe, I would like/want/am going to talk about, it seems to me, by the way and though in Figure 5.7). However, some of these items (so expressing effect, really, absolutely, definitely, of course and certainly) are even more frequent in learner writing than in speech.

The overuse of several of these speech-like lexical items has been highlighted in a number of studies focusing on specific L1 learner populations. For example, Lorenz (1999b) discusses the marked overuse of the conjunction because and the adverb so in German learner writing. Chen (2006) reports on the overuse of besides in Taiwanese student writing. Japanese, French and Swedish learners’ overuse of of course is highlighted by Narita and Sugiura (2006), Granger and Tyson (1996) and Altenberg and Tapper (1998). French, Spanish and Swedish learners’ heavy reliance on I think to express their personal opinion is reported by Granger (1998b), Neff et al. (2007) and Aijmer (2002). Using the ICLE, however, my results suggest that these features are often shared by a large proportion of the learners investigated, irrespective of their mother tongues, and are therefore likely to be developmental or teaching-induced. It remains to be seen, however, whether lack of register awareness is a typical feature of EFL learner writing or whether it is a more general characteristic of novice writing. This issue will be touched upon in Section 5.4.

Different EFL learner populations, however, do not use speech-like lexical items similarly. Although all L1 learner populations overuse the adverb maybe when compared to the BNC-AC-HUM, Table 5.16 shows that relative frequencies differ widely across L1 populations. Another example is EFL learners’ use of I think, which is overused by all L1 learner populations while showing marked differences across learner L1 sub-corpora. As shown in Table 5.17, relative frequencies range from 17.29 occurrences per 100,000 words in the Polish learner sub-corpus (ICLE-PO) to 143.57 occurrences per 100,000 words in the Swedish one (ICLE-SW). This huge difference may be partly explained by L1 influence. Studies in contrastive rhetoric (e.g. Connor, 1996; Vassileva, 1998) have shown that features of writer visibility in academic prose may differ markedly across languages.
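The comparisons above depend on relative rather than raw frequencies, since the sub-corpora differ in size. A minimal sketch of that normalisation follows. The function names and the raw count are hypothetical and serve only as illustration; the two rates at the end are the extremes quoted in the text.

```python
def relative_frequency(raw_count: int, corpus_size: int, per: int = 100_000) -> float:
    """Return the number of occurrences per `per` running words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * per / corpus_size


def per_100k_to_pmw(freq_per_100k: float) -> float:
    """Convert a rate per 100,000 words into a rate per million words (pmw)."""
    return freq_per_100k * 10


# Hypothetical sub-corpus: 250 hits in 200,000 running words.
rate = relative_frequency(250, 200_000)
print(rate)                   # 125.0 occurrences per 100,000 words
print(per_100k_to_pmw(rate))  # the same rate expressed per million words

# Ratio between the two extremes quoted in the text (143.57 vs 17.29).
print(round(143.57 / 17.29, 1))
```

Expressing every count per 100,000 (or per million) words makes sub-corpora of different sizes directly comparable, which is what allows a ranking of L1 populations.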

Figure 5.7 The frequency of speech-like lexical items in expert academic writing, learner writing and speech (based on Gilquin and Paquot, 2008)
[Bar charts, frequencies per million words (pmw): maybe; so expressing effect; it seems to me; I would like/want/am going to talk about; amplifying adverbs (really, of course, certainly, absolutely, definitely); by the way; though at the end of a sentence. Academic writing: British National Corpus, academic component (15 million words). Learner writing: ICLEv2 (14 L1s, 1.5 million words). Speech: British National Corpus, spoken component (10 million words).]
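Frequency contrasts between a learner corpus and a reference corpus are normally backed by a significance test. The passage does not name the statistic behind its p < 0.01 threshold, so the sketch below assumes the log-likelihood ratio commonly used for corpus comparison. The function name and the counts are hypothetical.

```python
import math

# Chi-square critical value for 1 degree of freedom at p < 0.01.
CRITICAL_P01 = 6.63


def log_likelihood(freq_a: int, size_a: int, freq_b: int, size_b: int) -> float:
    """Log-likelihood ratio for one item observed in two corpora.

    freq_a / freq_b are raw counts, size_a / size_b are corpus sizes in words.
    """
    total_size = size_a + size_b
    total_freq = freq_a + freq_b
    # Expected counts if the item were equally likely in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical counts: 300 hits per million learner words vs 100 per million
# reference words.
ll = log_likelihood(300, 1_000_000, 100, 1_000_000)
print(ll > CRITICAL_P01)  # True: the difference is significant at p < 0.01
```

The statistic only says that two frequencies differ. Whether the item is overused or underused is read off the relative frequencies themselves.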

18 38.79 6. Learner writing is also typically recognizable by a whole range of cooccurrences that differ from academic prose in quantitative and qualitative .88 32.93 ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ Legend: ++ frequency significantly higher (p < 0.34 35.7 94.13 101.34 16. I focus on aspects of overand underuse of word sequences that include AKL words before discussing learner-specific clusters that are not found in professional academic prose.3.57 134.154 Academic Vocabulary in Learner Writing Table 5.87 51.000 words ICLE-IT ICLE-GE ICLE-DU ICLE-CZ ICLE-SP ICLE-SW ICLE-FI ICLE-FR ICLE-PO ICLE-RU BNC-AC-HUM 48.2. I first present the major results of an analysis of recurrent word sequences in EFL learner writing.61 72. per 100. The phraseology of academic vocabulary in learner writing In this section.28 31.26 1.01) than in the BNC-AC-HUM Table 5.11 66.000 words ICLE-SW ICLE-IT ICLE-RU ICLE-CZ ICLE-FR ICLE-GE ICLE-SP ICLE-FI ICLE-DU ICLE-PO BNC-AC-HUM 143.37 13.14 ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ Legend: ++ frequency significantly higher (p < 0.13 32.21 24.16 The frequency of ‘maybe’ in learner corpora relative freq. per 100.74 20.59 55.06 121.77 17.17 The frequency of ‘I think’ in learner corpora relative freq.01) than in the BNC-AC-HUM 5.

terms. I illustrate this with a comparison of the co-occurrents of the noun conclusion in academic and learner writing and examine EFL learners’ phraseological infelicities and lexico-grammatical errors.

An analysis of word sequences in EFL learner writing

The results presented in this section are based on an analysis of 2-to-5 word sequences that are over- or underused in learner writing. The comparison between the ICLE and the BNC-AC-HUM was performed with the Keywords option of the software tool WST4. The results show that learner writing is characterized by a marked underuse of a large proportion of the 2-to-5 word sequences that include AKL words and that are typically used to serve specific rhetorical and/or organizational functions in academic prose. EFL learners rely instead on a restricted set of clusters which they massively overuse (e.g. for example, in order to, more and more, the problem is that). Granger (1998b) suggests that the use of these sequences ‘could be viewed as instances of what Dechert (1984: 227) calls “islands of reliability” or “fixed anchorage points”, i.e. prefabricated formulaic stretches of verbal behaviour whose linguistic and paralinguistic form and function need not be “worked upon”’ (Granger, 1998b: 155). This is also consistent with the author’s statement that ‘while the foreign-soundingness of learners’ productions has generally been related to a lack of prefabs, it can also be due to an excessive use of them’ (Granger, 1998b: 156).

Table 5.18 shows that EFL learners overuse adjective + noun sequences with ‘nuclear’ adjectives (see Section 1.1) such as main (e.g. main cause, main reason, main problem), real (e.g. real problem, real value), great (e.g. great number, great importance), important (e.g. important role, important question, important factor), different (e.g. different points, different problems, different reasons) and big (e.g. big problem) to the detriment of more EAP-like phrasemes such as extensive use, significant number, central issue, crucial importance, integral part, lesser extent and wide variety. Similarly, they overuse adverb + adjective/adverb/conjunction sequences with highly frequent adverbs such as mainly (e.g. mainly because), quite (e.g. quite clear) and very (e.g. very important) but make little use of phrasemes such as readily available, closely associated, particularly interesting, almost entirely, relatively few, significantly different, highly significant and precisely because.

The foreign-soundingness of EFL learner writing also stems from learners’ overuse of AKL words in clusters that are not typical of the particular genre of academic prose but are more frequently used in speech or more informal types of writing (e.g. from my point of view, it depends, people claim that, I will discuss, because of the fact⁷).
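The analysis above starts from recurrent 2-to-5 word sequences. As a rough illustration of how such sequences can be counted before any keyword comparison, here is a minimal sketch. The tokenizer, the function names and the sample sentence are assumptions made for this example and do not reproduce the procedure of the software used in the study.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep alphabetic word forms (a simplification)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def ngram_counts(tokens: list[str], n_min: int = 2, n_max: int = 5) -> Counter:
    """Count every contiguous word sequence of length n_min..n_max."""
    counts: Counter = Counter()
    for n in range(n_min, n_max + 1):
        for start in range(len(tokens) - n + 1):
            counts[" ".join(tokens[start:start + n])] += 1
    return counts


# Hypothetical learner-like sentence built from clusters mentioned in the text.
sample = ("In my opinion the problem is that people claim that "
          "the problem is that it depends")
counts = ngram_counts(tokenize(sample))
print(counts["the problem is that"])  # 2
print(counts["it depends"])           # 1
```

This sketch counts sequences across sentence boundaries and ignores punctuation. A fuller treatment would split the text into sentences first and then compare each sequence's frequency in the learner corpus with its frequency in the reference corpus.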

in the absence of. central issue. was by no means. we can. allows us. as an attempt to. the implications of. particular attention. in the presence of. they argued. it is true that. to a great extent. various forms of. it is very difficult to. good example. great part. suggestion that. in certain respects. there are also people. little evidence. opportunity of. be defined in terms of. the view that. good use. provide evidence. almost entirely. crucial role. advantages and disadvantages. the total number of. listed above. as a conclusion. it seems likely. he remarks. considerable degree. it is very difficult. a similar. in particular. it is obvious that. he cites. extensive use. it follows that. an account of. prevents us from. quite clear.18 Examples of overused and underused clusters with AKL words Overused clusters for example. it is more likely that. there are more and more. the absence of. in the presence of. it depends. it is unlikely that. similar to that of. in relation to. it does not follow. negative consequences. high degree of. it would appear that. central figure. different way. it was claimed. relatively few. important part. a concern with. significantly different. the extent to which. because of the fact. it was difficult to. more and more. be explained in terms of 2-word clusters 3-word clusters 5-word 4-word clusters as a result. more or less. in order to achieve. in practice. to the extent that. they suggest. it has been suggested that. on the assumption that. may suggest. main reason. extent to which. I consider. it is high time. a theory of.Table 5. by reference to. suggested above. the view. may well have been. in so far as they. were subject to. can be related to. important factor. for instance. more significantly. mainly because. inferred from. very important. may say that. another important. the hypothesis that. inherent in. great importance. is ultimately. partly because. it is a fact that. closely associated with. different points. 
what appears. wide variety. absolutely necessary. in my view. it is also true. radically different. by comparison. real problem. the problem is that. the fact is that. would seem to be. on the basis of. best solution. people claim that. general principles. integral part. with the exception of. by showing that. consistent with. important to. main problem. more generally. was probably. one of the most important . take into account. it means that. more difficult. much emphasis. are concerned. in his view. was effectively. as much as possible. from my point of view. conclusion I. the issue of. at any rate. still further. I will discuss. it is assumed that. to answer this question. it is very important to. final analysis. aim of this. have problems. a considerable degree. be ascribed to. in the belief that. his method in terms of. somewhat different. perhaps because. great amount. with the exception of. can choose. of great importance. this is not the case. I can. to the effect that. to this extent. precisely because. certain respects. major source. provides us with. the edge of the. allowing for. despite the fact. totally different. affect our approach. an attempt to. might have been expected. high proportion of. as noted above. a great number of. it could be argued that. readily available. pay attention to. main cause. is the fact that. therefore I. is described in it may be that. it is hardly surprising that. big problem. when compared with. only because. this need not. it seems likely that. as a consequence. highly significant. it is worth noting that. real value. no reason to suppose. far as I am concerned. as distinct from. discussed in. different reasons. to the advantage of. and therefore. this suggests that. crucial importance. the immediate aftermath of. important question. with the result that. but it is true that. take into consideration. as a matter of fact. because I. he concludes. are likely to be. good idea. it is possible that. reports that. on average. 
it is necessary for. in view of. described as. reported by. a wide variety of. different problems Underused clusters by contrast. may have been used as in the case of.

The results also seem to support the widely held view that EFL learners’ academic writing is characterized by ‘firmer assertions, more authoritative tone and stronger writer commitments when compared with native speaker discourse’ (Hyland and Milton, 1997: 193) (see also Petch-Tyson, 1998; Lorenz, 1998; Neff et al., 2004a). EFL learners state propositions more forcefully and make a more overt persuasive effort: they overuse communicative phrasemes that serve as attitude markers (e.g. it is very difficult to, it is very important to) and boosters (e.g. it is a fact that, but it is true that, it is obvious that). By contrast, they underuse hedges such as it is (more) likely that, it is unlikely that, it may be that, it seems likely that, it is possible that, it could be argued that, and it would appear that.

Word sequences used as self mentions are also much more frequent in learner writing than in academic prose (Aijmer, 2002; De Cock, 2003; Ädel, 2006). Examples include therefore I, because I, I consider, I can, we can, I will discuss, in my view, and from my point of view. Conversely, academic writers use more clusters with third person pronouns with an evidential function, e.g. he remarks, she cites, they suggest, his method, a difference which can be related to the more intertextual nature of professional academic texts.

EFL learners also underuse a whole set of word sequences involving the –ed form of verbs, and more precisely, their past participle form. For example, they underuse the 2-word clusters described as, inferred from, listed above, suggested above, discussed in and reported by, the 3-word clusters closely associated with, be ascribed to, is described in, it was claimed, as noted above, when compared with, the 4-word clusters can be related to, may have been used, might have been expected, it is assumed that, and the 5-word clusters it has been suggested that, be defined in terms of, and be explained in terms of. This is consistent with Granger and Paquot’s (2009b) finding that past participles are the most frequent verb forms in academic prose, but are highly underused in learner writing.

Verbs may have similar frequencies as lemmas in learner writing and academic prose, but still display over- or underuse of some forms (Granger and Paquot, 2009b). Examples of AKL verbs following this pattern are differ and discuss. The lemmas do not differ significantly in their use. However, differ is underused in its –ing form while discuss is overused in its unmarked form (discuss) and underused in its –ed form. Similarly, some verbs are under- or overused as lemmas without this affecting all forms of the verb. For example, the lemma provide is underused in learner writing compared to expert writing, but an analysis of word forms indicates that this only applies to provided; use of other forms of the verb does not differ significantly in the two corpora. Table 5.19 shows that the picture can even be more complex: verb forms may be overused in some specific lexical bundles

and tended to. is concerned with the depending upon. allows for. much depends. provide a. will depend. discussed in. discussed in chapter they tended to. have tended to. was to provide. depends upon the. by allowing. affect our approach to reality. media affect our. allowed him. discussed below. provide us with. people tend to. concerned to. allows us. depending on the will discuss. has tended to. not affect the allow (++) allowed (++) allow them.19 Clusters of words including AKL verbs which are over. depended upon. are concerned. which allowed. we are concerned. as I am concerned. to allow. been concerned. depends on the. tended to be might provide. media affect our approach to allowed to. are allowed to. to provide an. provides a. provides us with. I am concerned. affect our approach.01) in ICLE than in BNC-AC-HUM. concerning the. people tend. be allowed. provide an. provide us.and underused in learners’ writing. media affect our approach. // no significant difference between the frequencies in the two corpora . will depend on differed from.01) in ICLE than in BNC-AC-HUM. was concerned with. concerned with. by comparison with expert academic writing Lemmas and their word forms affect (++) affect (++) affects (++) Overused clusters Underused clusters affect our. provide the. affects the. it depends on. allow it. they tend. provide them with Legend: ++ significantly more frequent (p < 0. can provide. been concerned with. it allows. depended on. allowing for. are allowed. to provide a depend (++) depends (++) depending (++) depended (− −) differ (//) differing (− −) discuss (//) discuss (++) discussed (− −) tend (//) tend (++) tended (− −) provide (− −-) provided (− −) tend to. provides that. depend upon the.158 Academic Vocabulary in Learner Writing Table 5. provides an. it concerns. not allowed to. they tend to provides us. it depends. allow them. already discussed. are not allowed to is concerned. to provide. allows us to. am concerned. mass media affect our. 
far as I am concerned depends on. − − significantly less frequent (p < 0. are not allowed. not allow. not allowed. allow them to. affect us. in discussing. depending on. we tend to. media affect. allowed him. be allowed to. provide them. differs from the was discussed. we tend. and discussed. to discuss. mass media affect. to allow for concern (++) concerning (++) was concerned. I will discuss was affected. concerned about. provide evidence. concerned with the. allow that. affect our approach.

everybody knows that. many things. people believe that. I agree that. let us. Examples of learnerspecific sequences that do not include an AKL word are given in Table 5. due to the fact that. very serious. but at the same time. I am convinced.2. They include: – word sequences that are more frequently used in speech. instead of. my opinion. is why. – sequences that exist in English but are very rare in all types of discourse. we think. I am sure that. people often. I believe that. by the way. but if we look. there are. far as I am concerned. e. I think. sure that. I don’t agree with. we all know. so why. if we want. I guess. I would like to say. are more and more. I am afraid that. I do not agree with. I would like to. first of all. the verb form concerned is overused in as I am concerned and concerned about but underused in been concerned withor we are concerned. I hope. everybody knows. when I. there is. in order to. on one hand.2. thanks to. in this essay I. people feel. people believe. helps us in my opinion. think twice. as a matter of fact. there are a lot of (see Section 5.g. e. I think that. there are a lot.20. I would say. in fact. many people think on the one hand. of course. but I.g. the sequence as far as I am concerned which is repeatedly used to express a personal opinion in the ICLE. all kinds of. we want. on the contrary. if you. to my mind. I believe. it is easy to I do not think that. I want. I really. we have to. a look at.Academic vocabulary in the ICLE Table 5. we must. opinion is. just imagine. I will try. I must. even worse. last but not least.4 on semantic misuse). it means that. no matter. I would like. some people say that.20 Examples of overused clusters in learner writing Examples 2-word clusters 159 in sum. at all. 
It makes it possible to uncover a whole range of words and word sequences that are not typical of academic prose but which are nevertheless used by EFL learners to organize scientific discourse and build the argument of academic texts. in spite of. why we. we get. if we. people say. let them. . quite sure. I think that. to sum up.g. it seems to me that. – sequences that are not used in English to establish the logical link intended by the EFL learner. look at. EFL learners overuse the sequences it allows and allows us to and underuse the EAP sequence allows for. of course. or maybe. said that. I agree. we can say that. I do not think so 3-word clusters 4-word clusters 5-word clusters while being underused in others. and of course. on the other side (see Section 5. really think. For example. A keyword analysis of recurrent word sequences is indispensable if we want to build up a full picture of all the possible lexical realizations of rhetorical functions in learner writing. we look. I want to say.2). from my point of view. people think that. it is impossible to. that is why. Similarly. e. means that. people think.

– ‘unidiomatic’ sequences such as as a conclusion used as a textual phraseme to introduce a conclusion (see below for a co-occurrence analysis of the noun conclusion in the ICLE).
– erroneous sequences such as in contrary, by the contrary, in the contrary, in contrary to that are used to express a contrast in EFL learner writing.

EFL learners’ overuse of sequences that are rarely used by native-speakers (such as as far as I am concerned or last but not least) or ‘unidiomatic’ sequences (such as as a conclusion) may be partly explained by poor teaching materials and/or the influence of their mother tongue. For example, Les fiches essentielles du Baccalauréat en anglais (published in 2008 by Clairefontaine) give a list of linking words that French students are encouraged to use in the English test of the ‘Baccalauréat’ (the final secondary school examination which gives successful students the right to enter university) to ‘enrich their essay and give more clarity to their argumentation’. This includes as a conclusion but not in conclusion.⁸ The rare expression as far as I am concerned is also given as a key expression for voicing one’s own opinion.

In a longitudinal study of German learners’ use of the noun conclusion, Mukherjee and Rohrback (2006) commented that the sequence as a conclusion is gaining ground in learner writing to the extent that it is even more frequent than in conclusion in the more recent corpus they use:

Interestingly, the most frequent phrase is no longer in conclusion, but as a conclusion, the latter being notoriously overused by German learners of English at university level as well. This certainly is a problematical development because in conclusion is much more frequent and idiomatic than as a conclusion. (Mukherjee and Rohrback, 2006: 224)

This development may be related to the increasing use of the internet for study purposes and of the type of teaching materials available on this channel.

Preferred co-occurrences in EFL learner writing

In Section 5.2.1, it was shown that EFL learners manifest a marked preference for a restricted set of single words and mono-lexemic phrasemes to express logical links. They also use learner-specific functional equivalents of these markers such as the sequence as a conclusion instead of in conclusion, as discussed above. This learner-specific word combination represents 39.2 per cent of the concluding textual phrasemes involving the noun conclusion in the ICLE. Another example of a learner-specific logical

i.2. anticipating the reader’s reaction. chains of shared collocates)’ (Gledhill. However. who tend to produce their own phraseological ‘cascades’.7%) mention speak about reiterate quote 6 2 1 1 1 1 1 Figure 5.e. The sequence in conclusion. it was shown that mono-lexemic phrasemes such as for example have their own phraseological patterns in academic prose.2. Some 30. very often introduces the sequence like to.1. these do not seem to be readily available to EFL learners. metadiscourse items that refer explicitly to the writer and/or reader. Another example is learners’ use of the noun conclusion. the percentage of verb co-occurrents . serves a wide range of rhetorical functions (including exemplifying. I would like to either introduces the verb say or another verb of saying such as tell or mention. ‘collocational patterns which extend from a node to a collocate and on again to another node (in other words. in turn.1.8 shows that the textual phraseme in conclusion (or one of its learner-specific functional equivalents) is very often directly followed by the personal pronoun I in the ICLE.6%) like to say emphasize tell 13 (12. EFL learners use AKL nouns and verbs in different lexico-grammatical or phraseological patterns than professional writers. I is generally followed by the modal would to produce the word sequence in conclusion.2%) used in the ICLE do not appear in the BNC-AC. When tokens are analysed. I would. This is consistent with Ädel’s (2006) finding that personal metadiscourse.9 Figure 5. The sequence in conclusion.8 Phraseological cascades with ‘in conclusion’ and learner-specific equivalent sequences marker is on the other side which they use instead of on the other hand to compare and contrast (see Section 5. 2000: 212).Academic vocabulary in the ICLE 161 In conclusion 59 As a conclusion 40 As conclusion 3 I 37 (36%) would 21 (20. 
However almost half of the verb co-occurrent types (46.8 per cent of verb co-occurrent types are significant cooccurrents of the noun conclusion in the BNC-AC . and concluding) in Swedish learner writing.4 below for more details of learners’ use of on the other side). In Section 4. which.21 lists the verb co-occurrents of the noun conclusion in the ICLE. Table 5. This has already been illustrated by learners’ use of the noun example and the verbs illustrate and exemplify in Section 5. arguing.

Table 5.21 Verb co-occurrents of the noun conclusion in the ICLE

Verb + conclusion as object
Verb             Freq. in ICLE   Significant co-occurrent in BNC-AC   Appearance in the BNC-AC
add up to         1              −                                    x
apply             1              −                                    x
approach          1              −                                    x
arrive at         5              **                                   √
bring             1              −                                    x
bring sb to       2              −                                    √
come to          52              **                                   √
*come into        1              −                                    x
confirm           1              **                                   √
contain           1              −                                    x
draw             25              **                                   √
*draw up          1              −                                    x
end with          1              −                                    x
escape            1              **                                   √
express           1              **                                   √
find              2              −                                    x
gather            1              −                                    x
get               1              −                                    √
give              1              −                                    √
have              1              −                                    √
influence         1              −                                    √
jump to           2              **                                   √
lead to           4              **                                   √
leave sb with     1              −                                    x
look for          1              −                                    x
make             11              −                                    √
point to          1              *                                    √
put               1              −                                    x
put forward       1              −                                    x
reach             3              **                                   √
write as          1              −                                    x
TOTAL            128 tokens (32 types)

conclusion as subject + verb
Verb             Freq. in ICLE   Significant co-occurrent in BNC-AC   Appearance in the BNC-AC
emerge            1              **                                   √
arise             1              −                                    √
contain           1              −                                    x
be               23              **                                   √
come              1              −                                    x
need              1              −                                    √
bring sb to       1              −                                    x
TOTAL            29 tokens (7 types)

Legend: ** significant co-occurrent in the BNC-AC (p < 0.01); − not significant co-occurrents in the BNC-AC; √ the co-occurrent appears in the BNC-AC; x the co-occurrent is not found in the BNC-AC

that are significant co-occurrents of the noun conclusion in the BNC-AC rises to 75.8 per cent as several of the verbs are repeatedly used in learner writing (e.g. come to and draw). Conversely, the percentage of verb co-occurrents that are not found in the BNC-AC falls to 12 per cent as ‘non-native’ co-occurrences are rarely repeated.

EFL learners use the collocations arrive at + conclusion, come to + conclusion, draw + conclusion, lead + conclusion and reach + conclusion. However, they do not always use them in native-like lexico-grammatical patterns. In Examples 5.53 and 5.54, the indefinite article a is used instead of the definite article the, which is always used in the BNC-AC when the conclusion (underlined in the examples) is introduced by a that-clause. In Example 5.55, the frequent phraseme lead to the conclusion that is used with the personal pronoun us, a pattern which is very rarely found in academic prose. In the context of EFL teaching/learning, these findings support Nesselhauf’s (2005: 25) argument that collocations should not be viewed as involving only two lexemes; other elements closely associated with them should also be taught.

5.53. And taking into consideration that Marx was a materialist we can come to a conclusion that he himself would be attracted by the advantages of television, and religion for him would remain the opium of the masses. (ICLE-RU)
5.54. However, when we consider all the pros and cons of fast food we will certainly arrive at a conclusion that it is not an ideal way of eating. (ICLE-PO)
5.55. To sums up, all I have mentioned before lead us to the conclusion that if our lifes were a little ‘easier’ and we wouldn’t be dominated by a world that is constantly changing, due to new techniques and industrialization, we could enjoy doing things as dream and imagine more frequently. (ICLE-SP)

The collocation escape + conclusion appears in two phraseological patterns in academic prose: ‘it is difficult to escape the conclusion that’ and ‘we cannot escape the conclusion that’. The single occurrence of the collocation that appears in the ICLE is used in the native-like lexico-grammatical pattern ‘cannot escape the conclusion that’ but its subject is a nominal phrase headed by the noun evaluation:

5.56. However, a more objective evaluation of the problem cannot escape the conclusion that drug use and abuse have occurred in all civilizations all over the world, and that it is the criminalization of drugs that has created a much heavier burden on society. (ICLE-DU)
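The token-level figures quoted above (75.8 per cent and 12 per cent) can be checked against the counts in Table 5.21. The following is only a minimal sketch: the row data are transcribed from the table, the names ROWS and token_share are illustrative, and it assumes that the single-asterisk entry (point to) is counted among the significant co-occurrents, which is what the reported percentage seems to require.

```python
# Each row: (verb, tokens in ICLE, significance marker in BNC-AC, appears in BNC-AC).
# Transcribed from Table 5.21; "-" stands for "not significant".
ROWS = [
    # verb + conclusion as object
    ("add up to", 1, "-", False), ("apply", 1, "-", False),
    ("approach", 1, "-", False), ("arrive at", 5, "**", True),
    ("bring", 1, "-", False), ("bring sb to", 2, "-", True),
    ("come to", 52, "**", True), ("*come into", 1, "-", False),
    ("confirm", 1, "**", True), ("contain", 1, "-", False),
    ("draw", 25, "**", True), ("*draw up", 1, "-", False),
    ("end with", 1, "-", False), ("escape", 1, "**", True),
    ("express", 1, "**", True), ("find", 2, "-", False),
    ("gather", 1, "-", False), ("get", 1, "-", True),
    ("give", 1, "-", True), ("have", 1, "-", True),
    ("influence", 1, "-", True), ("jump to", 2, "**", True),
    ("lead to", 4, "**", True), ("leave sb with", 1, "-", False),
    ("look for", 1, "-", False), ("make", 11, "-", True),
    ("point to", 1, "*", True), ("put", 1, "-", False),
    ("put forward", 1, "-", False), ("reach", 3, "**", True),
    ("write as", 1, "-", False),
    # conclusion as subject + verb
    ("emerge", 1, "**", True), ("arise", 1, "-", True),
    ("contain", 1, "-", False), ("be", 23, "**", True),
    ("come", 1, "-", False), ("need", 1, "-", True),
    ("bring sb to", 1, "-", False),
]


def token_share(rows, keep):
    """Percentage of all tokens contributed by the rows for which keep(row) is true."""
    total = sum(freq for _, freq, _, _ in rows)
    kept = sum(row[1] for row in rows if keep(row))
    return 100 * kept / total


total_tokens = sum(freq for _, freq, _, _ in ROWS)
# Assumption: both "**" and "*" count as significant co-occurrents.
significant = token_share(ROWS, lambda row: row[2] in ("**", "*"))
not_in_bnc = token_share(ROWS, lambda row: not row[3])

print(total_tokens, round(significant, 1), round(not_in_bnc, 1))
```

Weighting by tokens rather than types is what moves the figures: the few verbs that learners repeat most (come to, draw, be) are attested and significant in the BNC-AC, whereas the unattested combinations are almost all single occurrences.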

In the collocation express + conclusion, the verb express has acquired a semi-technical sense and means ‘make something public’. It is mainly used in legal discourse and thus conveys a rather formal tone as illustrated in Example 5.57. Its single occurrence in the ICLE can be qualified as ‘non-native like’ as it appears with the first person singular pronoun I as subject and the possessive determiner my (Example 5.58). It may be hypothesized that the learner who wrote this sentence has been influenced by the native-like co-occurrence ‘express one’s opinion/view’.

5.57. The Divisional Court expressed its conclusion in the following terms: . . . (BNC-AC-HUM)
5.58. Finally, I wanted to express my conclusions, that is: technology, science and industrialization have not killed dreams and imagination. (ICLE-SP)

There are many other examples of EFL learners’ attempts at using native-like collocations, which result in crude approximations. In Example 5.59, the phrasal verb draw up is used in place of draw and in Example 5.60, the preposition into replaces to, and no article is used in an attempt to produce the native-speaker sequence ‘came to the conclusion that’.

5.59. Finally, a conclusion can be drawn up emphasizing our first statement. (ICLE-SP)
5.60. The woman started to think about the price of progress and came into conclusion that automation causes more problems than it solves. (ICLE-PO)

In Example 5.61, the verb put forward is used with the noun conclusion. This verb is commonly used with the abstract nouns plan and proposal, i.e. two nouns that, like conclusion, combine with the verb draw to form collocations. However, the verb put forward is not used with the noun conclusion in English (see Figure 5.9). Howarth (1996, 1998) refers to this phenomenon as a collocational overlap, i.e. a set of nouns which have partially shared collocates (see also Lennon, 1996).

Figure 5.9 Collocational overlap (verbs: draw, put forward; nouns: plan, proposal, conclusion)
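The notion of collocational overlap can be pictured as two partially intersecting sets. The sketch below is only an illustration restricted to the three nouns and two verbs of Figure 5.9; the names COLLOCATES, shared_nouns and not_transferable are hypothetical, and the sets are not a full description of either verb’s collocational range.

```python
# Verb -> nouns it combines with, limited to the nouns shown in Figure 5.9.
COLLOCATES = {
    "draw": {"plan", "proposal", "conclusion"},
    "put forward": {"plan", "proposal"},
}


def shared_nouns(verb_a, verb_b, table=COLLOCATES):
    """Nouns that combine with both verbs (the overlapping part)."""
    return table[verb_a] & table[verb_b]


def not_transferable(source_verb, target_verb, table=COLLOCATES):
    """Nouns that combine with source_verb but not with target_verb."""
    return table[source_verb] - table[target_verb]


overlap = shared_nouns("draw", "put forward")
gap = not_transferable("draw", "put forward")
print(sorted(overlap), sorted(gap))
```

A learner who generalizes from the shared nouns to the whole set is predicting a combination that falls in the gap, which is the pattern described for ‘put forward a conclusion’.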

5.61. Without putting forward premature conclusions, we can nevertheless notice that a certain importance is granted to them. (ICLE-FR)

The semantic incongruity of the co-occurrence ‘put forward a conclusion’ is made apparent by contrasting the definitions of put forward and conclusion. The verb put forward means ‘to suggest an idea, explanation etc., especially one that other people later consider and discuss’ (LDOCE4) while a conclusion is ‘something you decide after considering all the information you have’ (LDOCE4). Thus, a conclusion can hardly be put forward as it is supposed to be more than a suggestion and the result of serious consideration and discussion.

As already pointed out by Nesselhauf (2005), EFL learners also produce deviant verb + noun free combinations. The noun conclusion enters into combinations that are not found in academic prose and which are semantically awkward:

5.62. Having considered the various aspects of capitalism a conclusion must be gathered: the system cannot provide for the basic needs of the population, consequently it needs to take steps in order to prevent combativity which will endangered their interests. (ICLE-RU)
5.63. Looking for the conclusion I would like to say that every person is individual and each has his or her own character. (ICLE-SP)

The same remark can be made about several adjective + conclusion co-occurrences (Example 5.64). More importantly, however, adjective co-occurrents of the noun conclusion in learner writing are not the most typical ones in academic prose even though a large proportion of them occur in the BNC-AC (see Table 5.22). The first ten most significant adjective co-occurrents of the noun conclusion in the BNC-AC are general, logical, similar, opposite, firm, main, foregone, tentative, different, and definite. None of these appear in learner writing except for logical. This reveals learners’ weak sense of native speakers’ ‘preferred ways of saying things’.

5.64. Looking at this idea from the Polish point of view, also brings double standard conclusions. (ICLE-PO)

The phraseology of EFL learner writing is also characterized by a number of lexico-grammatical infelicities and errors. Learners sometimes use the preposition about after the abstract noun account (e.g. an account *about a murder (ICLE-RU)) or the preposition of instead of for after the noun demand (e.g. the demand *of raw material (ICLE-GE)). They also use

Table 5.22 Adjective co-occurrents of the noun conclusion in the ICLE

Adjective           Frequency   Significant co-occurrent of conclusion in the BNC-AC   Appearance in the BNC-AC
absolute            1           −                                                      √
awful               1           −                                                      x
certain             3           **                                                     √
clear               1           **                                                     √
clever              1           −                                                      x
concrete            1           −                                                      √
depressing          1           **                                                     √
double standard     1           −                                                      x
fair                1           −                                                      √
false               1           −                                                      √
final               5           **                                                     √
frightening         1           −                                                      x
interesting         1           −                                                      √
liberal             1           −                                                      x
logical             4           **                                                     √
long-searched for   1           −                                                      x
obvious             1           **                                                     √
overall             1           **                                                     √
only                1           −                                                      √
own                 4           **                                                     √
personal            1           −                                                      √
premature           1           −                                                      √
private             1           −                                                      x
radical             1           −                                                      √
right               2           −                                                      √
sad                 1           −                                                      x
same                2           **                                                     √
satisfactory        2           **                                                     √
satisfying          1           −                                                      x
sensible            2           −                                                      √
successful          1           **                                                     √
terrifying          1           −                                                      x
understated         1           −                                                      x
unequivocal         1           −                                                      √
wrong               1           −                                                      √
TOTAL               51 tokens (35 types)

Legend: ** significant co-occurrent in the BNC-AC (p < .01); − not significant co-occurrents in the BNC-AC; √ the co-occurrent appears in the BNC-AC; x the co-occurrent is not found in the BNC-AC.

a to-infinitive structure after the noun possibility instead of an -ing form (e.g. the possibility *to learn a good job (ICLE-FR)). Other examples of colligational errors include suggest *to, related *with, attempt *of, and discuss *about. Example 5.65 illustrates learners’ confusion between the prepositions despite and in spite of, which results in the blend *despite of (cf. Dechert and Lennon, 1989):

5.65. Despite *of [Despite] the absence of such professionalism our nation overcame fascists. (ICLE-RU)

Learners also have a tendency to use the impersonal pronoun it in the subject position after as:

5.66. It is a matter of fact that these ‘things’ cannot be bought and sold like shares on the stockmarket. Luckily, I would say because otherwise only the rich would be able to posses them as *it is [as is] unfortunately the case with many products in other areas of living. (ICLE-GE)
5.67. Because of the ambition for the power, their rivalry made them hold continuous battles, as *it was [as was] the case of Catholics and Protestants. (ICLE-SP)

Another source of error is the adjective same which is sometimes preceded by the indefinite article in the ICLE:

5.68. The negative image of feminism makes it twice as hard for women to rise above it than it would be if men were facing *a [the] same kind of dilemma. (ICLE-FI)
5.69. When different people read *a [the] same book they have probably various imaginations while reading. (ICLE-CZ)

It should be noted that very few of these errors are widespread in learner writing and that some of them may be partly L1-induced. For example, French learners use the erroneous colligation discuss *about as a translation of the French discuter de (Granger and Paquot, 2009b).

5.2.4. Semantic misuse

In Section 5.2.1, the function of comparing and contrasting was shown to be generally underused in learner writing. An analysis of individual lexical

items, however, reveals that the adverbials on the contrary and on the other hand are overused in the ICLE. As Lorenz (1999b: 72) has demonstrated, overuse is often accompanied by patterns of non-native usage. EFL learners’ semantic misuse of the phraseme on the contrary has already been reported for different learner populations:

In Hong Kong, we are all familiar with students who use ‘on the contrary’ for ‘however/on the other hand’, thus adding an unintended ‘corrective’ force to the merely ‘contrastive’ function sought. (Crewe, 1990: 317)

Granger and Tyson (1996) report the same conceptual problems for French learners. Lake (2004) states that a large proportion of EAP non-native speakers who use on the contrary do so inappropriately, that is, as an equivalent alternative to on the other hand. This is confirmed by our corpus-based analysis of EFL learners from different mother tongue backgrounds (see also Celce-Murcia and Larsen-Freeman, 1999: 534-535). EFL learners typically use on the contrary erroneously (instead of a contrastive discourse marker such as on the other hand or by contrast) to contrast the qualities of two different subjects (underlined in Examples 5.70 to 5.72). For instance, in Example 5.70, the fact that Onasis had everything is contrasted with the fact that Raskolnikov had nothing and the phraseme by contrast would have been more appropriate.

5.70. Raskolnikov differs from Onasis, of course. Onasis had everything but he wanted to have more. Raskolnikov, *on the contrary [by contrast], had nothing. (ICLE-RU)
5.71. Europeans have lived in their countries for hundreds of years. *On the contrary [By contrast], most Americans have moved to the USA from different countries as immigrants. (ICLE-PO)
5.72. The main users of this kind of vehicles are families. Thus, station wagons are not expensive in maintenance. *On the contrary [By contrast], the young like crazy driving, overtaking and leading on the roads. Sports cars are created for this use and this may be the reason why their price is so high and use is expensive. (ICLE-FI)

The semantic inappropriacy of on the contrary in EFL learner writing has been attributed to teaching practices. Teaching materials often provide lists of connectors in which the adverbial on the contrary is described as a phrase of contrast, together with however, by contrast, etc. (cf. Crewe, 1990). For pedagogical purposes, Lake (2004)

proposes a checklist of contextual features that should be present when on the contrary is employed:

As for the implications for learners, it now becomes possible to consult a checklist of contextual features that should be present in order for on the contrary to be appropriate: one subject, two contrasting qualities, one positive statement and one negative statement open to similar interpretations, an argument, either genuinely present or implied, to which the two statements adjacent to the phrase both form a refutation. Such a checklist may be simplistic in that it does not cover all the possible lexico-syntactical environments in which the phrase might be encountered, but as a guideline for production, it ought to prove a useful starting point from which EAP teachers can devise their own practice materials. (Lake, 2004: 142)

Lake (2004) rules out the possibility of an L1 influence on EFL learners’ semantic misuse of on the contrary on the basis that over 70 per cent of international students from widely different mother tongue backgrounds produced two distinctly separate L1-equivalent items in a cloze test in which they were required to insert on the contrary or on the other hand and provide an equivalent phrase for both adverbials. The L1 equivalent forms to on the contrary and on the other hand may be characterized by different patterns of usage and thus be the source of negative transfer. Granger and Tyson (1996), for example, argued that French learners’ overuse and misuse of on the contrary is probably due to an over-extension of the semantic properties of the French ‘au contraire’, which can be used to express both a concessive and an antithetic link. It is, however, probable that misguided teaching practices and L1 interact here. The potential influence of the first language on French learners’ use of on the contrary is discussed in Section 5.3 below.

Lake (2004) considers EFL learners’ misuse of on the contrary to be ‘something of an exception’ and writes that ‘in the EAP context, such functional phrases [connectives] are usually familiar to learners from an early stage, and do not pose great problems of usage’ (Lake, 2004: 137). This view, however, is over-optimistic and is clearly not reflected in our corpus-based learner data. In Section 5.1, EFL learners’ inappropriate use of the abbreviation i.e. (in lieu of e.g.), the preposition as (instead of such as) and the

adverb namely was discussed. Other examples of semantically misused lexical items in learner writing include on the other hand, on the other side, moreover, besides, and even if.

Field and Yip (1992: 25) reported that on the other hand is frequently used by Cantonese speakers to make an additional point, with no implied contrast. They suggested that this semantic misuse might be L1 induced: the Chinese equivalent of on the other hand is often misused by novice L1 writers, who use it to mean ‘another side or aspect’. Although L1 influence may play a part in Hong Kong Chinese students’ inappropriate use of the adverbial, erroneous uses of on the other hand are found in most ICLE sub-corpora, which suggests that there are other contributing factors to this learner difficulty. The following extracts are examples of the use of on the other hand in the ICLE where it would have been more appropriate to use no connector or an additive marker:

5.73. The re-introduction of the death penalty may have positive sides, too. Criminality would be limited, because criminals would be afraid of the severe punishment. [P] This might be an illusion, because *on the other hand [ø] the death penalty develops violence and is incompatible with the basic laws of humanity. (ICLE-GE)
5.74. Everybody knows that children like inventing funny stories and amusing plays by using their wide fantasy. This is one reason why children always bring happiness and awake the adults’ childish part. *On the other hand, fantasy is [also] a useful mean used by teachers in primary schools to teach school subjects to their little students. So, where there is a child, there are always dreams and imagination. So, I strongly believe that there is still a place for dreaming and imagination in our modern society, it is children who keep dreams and imagination alive! (ICLE-IT)
5.75. The function of punishment is to show that crimes are not acceptable or that they can solve any problems. *On the other hand the aim of punishments is [also] to make the criminals obey the laws and show example to other’s so that they will not follow the bad example and commit the same crime. (ICLE-FI)

The word combination on the other side sometimes appears in the ICLE in places where a contrast seems to be the logical link intended by EFL learners. This does not occur in academic prose. It is illustrated in the following examples:

5.76. On the other side, Poland cannot reply with isolation as the unification still remains the best solution to its problems. [P] Firstly, all countries should understand

that history and its consequences cannot divide the continent. The successful process of unification should be carried out with respect to nations’ rights and without special privileges given to the powerful. (ICLE-PO)
5.77. Another big problem is our environment. There is pollution wherever you look. We can no longer enjoy the sun in summer because of the hole in the ozone layer. This hole is caused by technical improvements in the last decades. But on the other side it is sometimes hard to live without car or aerosols. (ICLE-GE)
5.78. Europe 92 means well a loss of identity since we’ll be no longer Belgians, Italians, English . . . but Europeans. But on the other side we will form a new nation with new hopes, new ideas . . . (ICLE-FR)

There is also some confusion between the conjunctions even if and even though in EFL learner writing. Learners often use even if in lieu of even though to introduce a concession:

5.79. We must forget about refrigerators containing CFC-11 and CFC-12, *even if [even though] they are cheaper. (ICLE-GE)
5.80. However, *even if [even though] I agree that the American public school system is defective, home schooling to me is no real alternative, as I feel that parents are not the best teachers for their own children. (ICLE-PO)
5.81. We are as much a part of Europe as any other country here, *even if [even though] we are not in the European Union. (ICLE-SW)

Even if should be used to introduce a condition, not a concession. Compare:

5.82. Even though these descriptions are valid they still leave open a number of questions, particularly why the same mechanisms do not operate with girls.
5.83. Even if these descriptions are valid they still leave open a number of questions, particularly why the same mechanisms do not operate with girls.

In the first sentence, the writer knows and accepts that the descriptions are valid. In the second sentence, he or she does not.

Semantic misuse has often been discussed in the literature in relation to logical connectives. However, EFL learners also experience difficulty with the semantic properties of other types of cohesive devices, and more specifically, labels, i.e. abstract nouns such as issue, argument, and claim that are inherently unspecific and require lexical realization in their co-text, either beforehand or afterwards (Flowerdew, 2006). In addition to phraseological and lexico-grammatical infelicities, EFL learners’ use of labels is

characterized by semantic infelicity or lack of semantic precision. Learners, for example, use the noun problem as an ‘all purpose wild card’ (cf. Lorenz, 1999b) in lieu of more specific nouns such as issue or question as illustrated in the following sentences:

5.84. The most important question concerning genetic engineering is the problem [that] of gen manipulation with humans. (ICLE-GE)
5.85. This short discussion of the main points linked to the problem [issue] of capital punishment leads to the final question. (ICLE-PO)
5.86. Industrialisation has transformed dreaming into a waste of time which is now “cleverly” linked to money. If we are aware of the fact that such time-tables are very common for people living in a modern society like ours, the problem [question] of the place of imagination and dreaming is not even worth examining. (ICLE-FR)

The noun argument also seems to cause difficulty to EFL learners. In Example 5.87, the sentences that follow the label argument would be better described as ‘reasons’ why Big Tobacco did not depart from prepared statements. In Example 5.88, the rather unidiomatic expression ‘familiar arguments about’ should be rephrased as ‘widespread or popular beliefs about’.

5.87. There are two main arguments [?reasons] that help us understand why Big Tobacco stuck to their statements for so long. [P] First, the companies feared the consequences that would follow a confession. They feared that there was going to be even more legislation and regulation if they would ever admit to lying. (ICLE-PO)
5.88. Female participation in making decisions concerning war and peace, economy and environmental protection would be to the benefit of all. However it will not be possible until males re-think and, hopefully, reject familiar arguments [widespread/popular beliefs] about women being unreliable, irrational and dependent on instincts. (ICLE-DU)

Other problematic labels include, among many others, aspect and issue. In Example 5.89, another aspect introduces a second example (about the unemployed and housewives) of the fact that you are judged by what you do rather than by what you are, contrasting it with the first example (about physicists and mathematicians). In Example 5.90, in certain aspects stands for in some respects and the aspect of money probably refers to the ‘money issue’ or the ‘money question’ in Example 5.91.

5.89. Our modern western society puts a lot of pressure on people as far as work is concerned. Your job is your “trademark”. Or, in other words, you are judged by what you do rather than by what you are. For example, according to popular opinion you must be very intelligent if you are a physicist or a mathematician. And another aspect is that [?by contrast,] the unemployed or housewives are sometimes treated as social outcasts. Sad, but true. (ICLE-GE)
5.90. Actually, bits of information from the remotest parts of the globe reach us in an instant. Human beings can eventually feel as one great family, but only *in certain aspects [in some respects], for as far as real good relations among countries are concerned, it is still a matter of distant future. (ICLE-PO)
5.91. A legend exists that money was invented by the devil to tempt the mankind. The aspect [?issue/question] of money includes the problem of equality. There were and there are different ideas about making all people equal, because it was considered that this would lead to common happiness. (ICLE-RU)

In Example 5.92, it is not quite clear what her issues refer to and in Example 5.93, issue most probably stands for ‘product’:

5.92. Uta Ranke-Heinemann, the most famous woman in the field of Catholic theology, tries to provide answers to them. She is employed in defining the relation between faith and the mind. Her issues [?] lies on the verge of theology, philosophy and first of all, religion. (ICLE-PO)
5.93. The picture I draw from my dear old houseman admittedly is nothing but a mere cliché, a hyperbolic issue [product] of my vivid imagination. (ICLE-GE)

5.2.5. Chains of connective devices

EFL learners’ texts are sometimes characterized by the use of too many connective devices (Crewe, 1990; Chen, 2006; Narita and Sugiura, 2006). The following text is an excerpt from an essay written by a French-speaking EFL learner. Each sentence contains at least one connective device (typically an adverbial connector or a sentence stem), which is often found in sentence-initial position (see Section 5.2.6 below).

5.94. [1] But what about these prestigious institutions today? [2] To caricature them rapidly one could say that universities consist of courses given by professors (competent in their fields) in front of a silent audience who is conscientiously taking notes. [3] So one can wonder if a university degree really prepare students for real world and what his value is nowadays. [4] I think it is true that lectures in themselves are theoretical. [5] Firstly because students spend

most of their time sitting in big classrooms which do not allow practical exercises but only ex cathedra lectures. [6] Secondly because the subjects of the lectures are theoretical. [7] For example: during a general methodology course (which, we think, could be more practical) different theories as Krashen’s, Lado’s are studied in detail but practical points are hardly ever considered. [8] However is it true that this formation does not prepare students for real world? [9] I am of the opinion that the answer is no. [10] First I think that university degrees are theoretical on purpose (as opposed to high schools which are more practical). [11] The reason is that, thanks to the theoretical background they have learned, university students are able to build up their own way to achieve their aim. [12] Moreover they are also able to adapt or to modify their method according to the situation. [13] To take the example of a teacher again, I could say that a teacher in front of a classroom does not think about particular methodological theories again but that he has created his own methodology. [14] Secondly, I think that academic studies develop a critical mind. [15] The students are indeed trained to analyse pieces of information coming from different horizons from a critical point of view, which means that they have to dissect them, to confront them and then to be able to pass judgment on them. [16] That is the way they should create a personal opinion for themselves. [17] Nevertheless, I do not want to go too far. [18] I really think that theory is essential but I am convinced that practice should also be present. [19] Let’s take the example of a student in economics who has his certificate in his pocket and proudly goes working in a big firm for the first time. [20] I would compare this business man to a gentleman who perfectly knows the highway code and who knows how to start and how to run through the gears but who finds himself in the center of Paris at the peak hours the first time he really drives! [21] By this example, I want to show that theory must always be accompagnied by practical applications, which is not often the case at university. [22] I think that this is a fully justified criticism against this institution.

Several of these connectors are superfluous and sometimes wrongly used (e.g. moreover in sentence [12], indeed in sentence [15]). Crewe (1990) attributed EFL learners’ massive overuse of connective devices to their attempt at imposing ‘surface logicality on a piece of writing where no deep logicality exists’ (Crewe, 1990: 320). He added that ‘over-use at best clutters up the text unnecessarily, and at worst causes the thread of the argument to zigzag about, as each connective points it in a different direction’ (ibid: 324). Some EFL learners use many logical connectives between sentences simply to indicate to the reader that they are adding another point (e.g. first, firstly, secondly, moreover, for example, to take the example of). The following excerpt from an EFL learners’ essay is a good example

of EFL learners’ use of logical connectors as ‘stylistic enhancers’, i.e. ‘words or expressions that may be sprinkled over a text in order to give it an “educated” or “academic” look’ (Crewe, 1990: 316) but whose presence will not make the text coherent.

5.95. Hobbes is a stern determinist. He regards man, like nature, as subject to the chain of cause and effect. Therefore a concept like ‘free will’ is impossible. Furthermore, Hobbes even considers people as artificial creatures, not belonging to nature, because they are not able to live together in harmony, something which animals like bees and ants are capable of, because they are natural. Of course, these ideas were as much an insult to man’s estimation of himself as Darwin’s allegation, two hundred years later, that our ancestors used to live in trees. As a consequence, Hobbes was accused of being an atheist and forbidden to publish any more books. (ICLE-DU)

As Aijmer (2001) showed in a study of Swedish EFL student writing, learners use I think to make their claims more persuasive rather than to express a tentative degree of commitment. They often use I think or an equivalent expression (e.g. I am of the opinion that, I am convinced that) when it is communicatively unnecessary in the flow of argumentation. For example, Sentence [18] in Example 5.94 could be rephrased as ‘Theory is essential but practice should also be present’. The sequence I think it is true in Sentence [4] corresponds to what Aijmer (2001) described as a ‘rhetorical overstatement’, which the author regards as typical of non-native-speaker argumentative essays. The clusters To me, I think and as far as I am concerned, I think that in Examples 5.96 and 5.97 respectively are two more instances of rhetorical overstatement.

5.96. To me I think technology and imagination are very much interrelated, and then on the other hand I understand that they also can be seen as separate. (ICLE-SW)
5.97. I agree with George Orwell, because as far as I am concerned I think that in every country there are few people which are rich and many people which are poor. (ICLE-IT)

The pedagogical implication of these findings is that, ‘important as these links are, students need to be taught that excessive use of linking devices, one for almost every sentence, can lead to prose that sounds both artificial and mechanical’ (Zamel, 1983: 27). In other words, learning when not to use them is as important as learning when to do so.
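The idea of ‘one linking device for almost every sentence’ can be made concrete by measuring the share of sentences that contain at least one connector. The sketch below is a rough illustration only: the connector list is a small hypothetical sample rather than the inventory used in this study, the function names are invented for the example, and the four sentences are a subset of Example 5.95.

```python
import re

# Hypothetical, deliberately short connector list (an assumption for illustration).
CONNECTORS = ["therefore", "furthermore", "of course", "as a consequence",
              "moreover", "however"]


def connector_hits(sentence, connectors=CONNECTORS):
    """Return the connectors from the list that occur in the sentence."""
    lowered = sentence.lower()
    return [c for c in connectors
            if re.search(r"\b" + re.escape(c) + r"\b", lowered)]


def connector_density(sentences, connectors=CONNECTORS):
    """Share of sentences containing at least one listed connector."""
    flagged = sum(1 for s in sentences if connector_hits(s, connectors))
    return flagged / len(sentences)


# Four sentences taken from Example 5.95.
SENTENCES = [
    "Hobbes is a stern determinist.",
    "He regards man, like nature, as subject to the chain of cause and effect.",
    "Therefore a concept like 'free will' is impossible.",
    "As a consequence, Hobbes was accused of being an atheist and forbidden "
    "to publish any more books.",
]

density = connector_density(SENTENCES)
print(density)
```

A simple string match of this kind cannot tell a superfluous connector from a justified one; it only flags texts in which the proportion is high enough to merit a closer qualitative look.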

5.2.6. Sentence position

Linking adverbials occur in different sentence positions. They often occur initially, as does however in Example 5.98. They can also occur in a medial position, i.e. within the sentence, often immediately after the subject, as shown in Example 5.99. The final position is also possible, but is more typical of speech as illustrated in Example 5.100.

5.98. In practice, however, the Red Army units did nothing to conciliate the Ukrainian Left or the peasants. Agriculture was brutally collectivized and no concessions were made in the use of the Ukrainian language and culture. However, Denikin’s White armies counter-attacked and after seven months the Red Army was obliged to withdraw. (BNC-AC-HUM)
5.99. Coysevox’s bust of Lebrun repeats – again with a certain restraint – the general outlines of Bernini’s bust of Louis XIV. The face, though, shows a realism and subtlety of characterization that are Coysevox’s own. (BNC-AC-HUM)
5.100. It’d be worth asking him first, though. (BNC-SP)

EFL learners’ marked preference for the sentence-initial position has been reported in various studies focusing on one L1 learner populations (Field and Yip, 1992; Lorenz, 1999b; Zhang, 2000; Narita and Sugiura, 2006). Granger and Tyson (1996: 24) commented that ‘it is likely that this tendency for learners to place connectors in initial position is not language-specific’. Our analysis of connectors in the ICLE supports this hypothesis. Table 5.23 shows that the total proportion of sentence-initial connectors in learner writing is much higher than that found in academic prose (13.2% compared to 6%). Examples include the preposition despite which appears in sentence-initial position in 52 per cent of its occurrences in the ICLE but only in 34.5 per cent in the BNC-AC-HUM (see Example 5.101), and sentence-initial due to which is repeatedly used in learner writing but hardly ever occurs in academic prose (Example 5.102).

5.101. Despite its commercial character Christmas still means a lot to me. (ICLE-FI)
5.102. Due to these developments the production expanded enormously, which meant that a greater number of people could be fed. (ICLE-DU)

Another example is the adverb therefore which often appears in sentence-initial position in the ICLE but is not often used in that position in the BNC-AC-HUM:

5.103. Scientific research as well as individual observations prove that eating habits have a great impact on the condition of the human body and soul and, consequently, on rest, sleeping and even dreams. Therefore people should pay more attention to what they consume. (ICLE-PO)

Table 5.23 The frequency of sentence-initial position of connectors in the BNC-AC-HUM and the ICLE (S-I = sentence-initial occurrences)

Connector                    ICLE S-I    BNC-AC-HUM S-I
although                        263           676
and                           1,456         1,374
as a result                      71            65
as a result of                   24            22
as far as X is concerned         96            31
because                         107           151
because of                       62            46
consequently                    103            60
despite                          50           235
due to                           29             3
even if                          83            94
even though                      46            28
for example                     235           233
for instance                     93            86
furthermore                     113           176
however                         673           882
in spite of                      47            42
moreover                        255           365
nevertheless                    170           392
on the contrary                  92            48
on the other hand               228           155
so                              805           675
thanks to                        68             5
therefore                       340            75
thus                            221           756
TOTAL                         5,730         6,675

These findings provide evidence for EFL learners’ lack of knowledge of the preferred syntactic positioning of connectors in English.11 This lack has often been attributed to L2 writing instruction. Flowerdew (1993) argued that teaching materials do not provide students with authentic descriptions of syntactic patterns of words. He showed that, contrary to what is often taught in course books, the adverbial connector then rarely occurs in sentence-initial position, but is more usually found in a medial position. Similarly, Milton (1999: 225) discussed the problematic aspects of teaching connectors by means of lists of undifferentiated items, and suggested that one way in which instruction may skew EFL learners’ style is ‘by the presentation of these expressions as if they occurred in only sentence-initial position’ (see also Narita and Sugiura, 2006).

Unmarkedness provides another possible explanation for EFL learners’ massive overuse of sentence-initial connectors. Conrad (1999) studied variation in the use of linking adverbials across registers. She showed that, in both conversation and academic prose, the highest percentage of linking adverbials appear in sentence-initial position and concluded that ‘initial position seems the unmarked position for linking adverbials’ (Conrad, 1999: 13) (see also Biber et al., 1999 and Quirk et al., 1985). EFL learners’ tendency to place connectors in unmarked sentence-initial position seems to be reinforced by teaching (see Granger, 2004: 135). Thus, EFL learners seem to use the unmarked sentence-initial position as a safe bet.

Contrary to our expectations, the proportion of sentence-initial because is lower in learner writing than in professional writing. However, sentence-initial because is significantly more frequent (relative frequencies of 9.18 in learner writing and 4.54 in academic prose). It is also used to serve different functions in learner writing. In academic prose, sentence-initial because-clauses are attached to a main clause. As shown in the following examples, they introduce the cause of something that is described in the main clause:

5.104. Because the death-rate was high, marriages were usually short-term. (BNC-AC-HUM)

5.105. Because these changes were worldwide, Europe’s history is inseparable from world history between 1880 and 1945. (BNC-AC-HUM)

Unlike expert writers, EFL learners sometimes use sentence-initial because to introduce new information in independent segments and give the cause of something that was referred to in the previous sentence:

5.106. The crime rate would also strongly reduce and this is of course the main objective of all this measures. Because everybody wants to live in a safe society. (ICLE-DU)

5.107. To directly try to change people with ‘experience of life’ would, compared to this effective investment, at best, only be to win Pyrrhic-victories. Because deep inside every man’s heart lies the ‘Indian’-insight that we are only borrowing the earth from our children. (ICLE-SW)

5.108. In my opinion it is useful only for them, for their trial. Because their sorrow is found as the extenuating circumstance. (ICLE-CZ)

EFL learners share this characteristic with ESL writers. In a comparison of strategies for conjunction in spoken English and English as a Second Language (ESL) writing, Schleppegrell (1996) found that students who had spent most of their lives in the US and learnt English primarily through oral interaction, transferred conjunction strategies from speech to essay writing. They employed both an ‘afterthought’ because (Altenberg, 1984) to add information in independent segments, and other types of speech-like clause-combining strategies.

A medial position for connectors is quite typical of academic prose. Conrad (1999) reported that, in academic prose, most linking adverbials are placed in sentence-initial or medial position. Three types of medial position are particularly frequent (Conrad, 1999: 14–15):

1. Linking adverbials which occur immediately after the subject as illustrated in Example 5.99 above.
2. Linking adverbials which occur between an auxiliary and the main verb, such as: All estimates of population size must therefore allow for a large measure of conjecture, a fact stressed by all reputable modern historians who have worked on this intractable subject. (BNC-AC-HUM)
3. Linking adverbials which occur between the main verb and its complement, e.g.: It is difficult to believe therefore that one of these mosaics was not influenced by the other. (BNC-AC-HUM)

However, it is clearly less favoured by EFL learners. As indicated above, teaching materials tend to focus on sentence-initial position, and EFL learners probably feel unsafe about other syntactic positionings for connectors.

The final position is frequent in conversation, but rare in academic prose. Table 5.24 shows that several connectors are also repeatedly used in sentence-final position in the ICLE, which is quite uncommon in the BNC-AC-HUM. Conrad (1999) found that three highly frequent items – then, anyway and though – account for the relatively high proportion of sentence-final linking adverbials in native-speaker’s conversation. She argued that these linking adverbials are commonly found in sentence-final position as they serve important interpersonal functions:

Adverbials in conversation, in addition to showing a link with previous discourse, can also play important roles in the interpersonal interaction that takes place. These roles are often particularly noticeable for the common adverbials in final position. (...) then typically indicates that a speaker is making an inteference (sic) based on another speaker’s utterance. (...) final anyway is often associated with expressions of doubt or confusion, and (...) [A] final though often occurs when speakers are disagreeing or giving negative responses. The placement of these adverbials in final position is consistent with previous corpus analysis of conversation that has found that elements with particular interpersonal importance are often placed at the end of a clause (...). It may be, then, that in some cases in conversation there is a tension between placing the linking adverbial at the beginning of the clause, due to its linking function, and at the end of the clause, due to its interactional function. (Conrad 1999: 14)

Table 5.24 Sentence-final position of connectors in the ICLE and the BNC-AC-HUM (S-F = sentence-final occurrences)

                  ICLE                    BNC-AC-HUM
Connector         S-F    Tot. freq.       S-F    Tot. freq.
anyway             25       132            20        71
for example        63       854            20     1,263
for instance       31       344             8       609
indeed             15       257            18     1,413
of course          34       750            14       863
then               35     1,054            17     3,062
though             11       256             7       178

The type of interpersonal interaction that takes place in conversation is not typical of academic prose. Thus, none of the linking adverbials commonly associated with the final position in conversation are common in formal writing. These findings suggest that the positioning of linking adverbials in native discourse is directly influenced by the register in which they appear, and the textual and/or interpersonal functions they serve.

5.3. Transfer-related effects on French learners’ use of academic vocabulary

The focus of Section 5.2 was on interlanguage features that are shared by most learner populations when compared to expert academic writing, and which are therefore likely to be developmental. Multiple factors, however, may combine to influence learners’ use of academic vocabulary. It has, for example, been suggested that learners’ preference for the sentence-initial position for connectors may be attributed to the influence of instruction or ‘transfer of training’ (Selinker, 1972). The marked difference in frequency of I think across the learner sub-corpora may be partly explained by different academic writing conventions in the different mother tongues. As Granger (1998b: 158) put it, ‘learners clearly cannot be regarded as “phraseologically virgin territory”: they have a whole stock of prefabs in their mother tongue which will inevitably play a role – both positive and negative – in the acquisition of prefabs in the L2’. Claims made about the nature of L1 influence and its interaction with other factors, however, have often been built on shaky methodological foundations and suffer from what Jarvis (2000: 246) referred to as a ‘you-know-it-when-you-see-it’ syndrome. To remedy this situation, Jarvis (2000) incorporated three types of L1 observable effects into a unified framework for the study of L1 influence and proposed the following working definition of L1 influence, which is intended as a methodological heuristic to be used by transfer researchers:

L1 influence refers to any instance of learner data where a statistically significant correlation (or probability-based relation) is shown to exist between some features of learners’ IL performance and their L1 background. (Jarvis, 2000: 252)

Jarvis translated his working definition of L1 influence into a list of specific types of L1 observable effects that should be examined when investigating transfer. He argued that transfer studies should minimally consider at least three potential effects of L1 influence when presenting a case for or against L1 influence:

1. Intra-L1-group homogeneity in learners’ IL performance is found when learners who speak the same first language behave as a group with respect to a specific L2 feature. To illustrate this first L1 effect, Jarvis used Selinker’s (1992) finding according to which Hebrew-speaking learners of English as a group tend to produce sentences in which adverbs are placed before the object (e.g. I like very much movies).12 Intra-L1-group homogeneity is verified by comparing the interlanguage of learners sharing the same mother tongue background.

2. Inter-L1-group heterogeneity in learners’ IL performance is found when ‘comparable learners of a common L2 who speak different L1s diverge in their IL performance’ (Jarvis, 2000: 254). To illustrate this effect, Jarvis referred to a number of studies reported by Ringbom (1987) that have shown that Finnish-speaking learners are more likely than Swedish-speaking learners to omit English articles and prepositions. Jarvis argued that ‘this type of evidence strengthens the argument for L1 influence because it essentially rules out developmental and universal factors as the cause of the observed IL behaviour. In other words, it shows that the IL behaviour in question (omission of function words) is not something that every learner does (to the same degree or in the same way) regardless of L1 background’ (Jarvis, 2000: 254–5). Inter-L1-group heterogeneity is identified by comparing the interlanguage of learners from different mother tongue backgrounds.

3. Intra-L1-group congruity between learners’ L1 and IL performance is found where ‘learners’ use of some L2 feature can be shown to parallel their use of a corresponding L1 feature’ (Jarvis, 2000: 255). Selinker (1992) uses this type of evidence to show that Hebrew-speaking learners’ positioning of English adverbs parallels their use of adverbs in the L1. The added value of this third L1 effect is that it also has explanatory power by showing ‘what it is in the L1 that motivates the IL behavior’ (Jarvis, 2000: 255). Intra-L1-group congruity is confirmed by an IL/L1 comparison.

These three effects can emerge in circumstances in which transfer is not at play and can thus be misleading when considered in isolation. As shown in Table 5.25, Jarvis concluded that, despite differences in the degree of reliability, none of the three effects is sufficient by itself to verify or characterize L1 influence. The identification of two simultaneous L1 effects is necessary to present a convincing case for L1 influence. Identifying the three L1 effects would be even more convincing if it were not that ‘the ubiquity of conditions that can obscure L1 effects renders the three-effect requirement unrealistic in many cases’ (Jarvis, 2000: 255).

Table 5.25 Jarvis’s (2000) three effects of potential L1 influence

L1 effect                                                            reliability    sufficient criterion
Intra-L1-group homogeneity in learners’ IL performance               poor           no
Inter-L1-group heterogeneity in learners’ IL performance             strong         no
Intra-L1-group congruity between learners’ L1 and IL performance     strongest      no
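Jarvis’s conclusion that no single effect is sufficient, and that two simultaneous effects are needed for a convincing case, can be read as a simple decision rule. The Python sketch below only illustrates that reading: the class name L1Evidence, the function assess_l1_influence and the three verdict labels are assumptions introduced here for illustration and are not part of Jarvis’s (2000) framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L1Evidence:
    """Hypothetical record of which of the three L1 effects were observed."""
    intra_group_homogeneity: bool    # learners sharing an L1 behave as a group
    inter_group_heterogeneity: bool  # learners with different L1s diverge
    l1_il_congruity: bool            # IL use parallels use of the L1 feature


def assess_l1_influence(evidence: L1Evidence) -> str:
    """Return an illustrative verdict based on how many effects co-occur."""
    observed = sum([
        evidence.intra_group_homogeneity,
        evidence.inter_group_heterogeneity,
        evidence.l1_il_congruity,
    ])
    if observed == 3:
        return "most convincing"   # all three effects identified
    if observed == 2:
        return "convincing"        # two simultaneous effects
    return "insufficient"          # a single effect may arise without transfer


if __name__ == "__main__":
    # Homogeneity and heterogeneity found, congruity not tested or not found.
    print(assess_l1_influence(L1Evidence(True, True, False)))
```

The rule deliberately ignores the reliability ranking in Table 5.25; weighting the effects differently would be a further assumption.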

Table 5.26 Jarvis’s (2000) unified framework applied to the ICLE-FR

L1 effect: Intra-L1-group homogeneity in learners’ performance
Corpus comparison: A comparison of the use of a specific lexical item in all the essays written by French learners

L1 effect: Inter-L1-group heterogeneity in learners’ IL performance
Corpus comparison: A comparison of the use of a specific lexical item in the ICLE-FR against other L1 subcorpora

L1 effect: Intra-L1-group congruity between learners’ L1 and IL performance
Corpus comparison: A comparison of a specific lexical item in the ICLE-FR to the use of its equivalent form in a comparable corpus of French native student writing

I made use of Jarvis’s (2000) unified framework to investigate the potential influence of the first language on multiword sequences that serve rhetorical functions in French learners’ argumentative writing. The International Corpus of Learner English appears to be ideally suited to analysing the three potential effects of L1 influence described by Jarvis (2000). Table 5.26 lists the three steps needed to investigate the influence of French on recurrent word sequences in the ICLE-FR. Intra-L1-group homogeneity in learners’ performance is investigated by comparing all the essays written by French learners to verify whether they behave as a group with respect to a specific L2 feature. Inter-L1-group heterogeneity in learners’ IL performance is verified by a comparison of the number of texts in which a specific lexical item is used in the ICLE-FR and in other L1 sub-corpora. Unlike Jarvis (2000), I made use of comparison of means tests and post hoc tests such as Ryan’s procedure and Dunnett’s test to confirm this second L1 effect.13 To establish intra-L1-group congruity between learners’ L1 and IL performance, French EFL learners’ use of a specific lexical item is compared to the use of its equivalent form in a 225,174-word comparable corpus of essays written by French-speaking students collected at the University of Louvain, i.e. the Corpus de Dissertations Françaises (CODIF).

Applying Jarvis’s (2000) framework to the ICLE texts reveals the potential influence of transfer on French learners’ use of multiword sequences that serve specific rhetorical functions in English. For example, the three transfer effects are found in French learners’ use of on the contrary, indicating that L1 influence most probably reinforces the conceptual problems and misguided teaching practices that were identified in Section 5.2.4 as potential explanations for the frequent misuse of the adverbial. This strongly supports Granger and Tyson’s (1996) suggestion that French learners’ overuse and misuse of the connector is probably due to an over-extension of the semantic properties of the French au contraire, which can be used to express both a concessive and an antithetic link.

Most transfer studies have focused on what we can call ‘transfer of form’ (e.g. cognates), ‘transfer of meaning’ (cf. semantic transfer, semantic extension) or ‘transfer of form/meaning mapping’ (e.g. borrowing) (see Jarvis and Pavlenko, 2008; Odlin, 1989, 2003 and Ringbom, 2007 for excellent syntheses on lexical and semantic transfer). Next to knowledge of form and meaning, however, knowing a word also involves knowing in what patterns, with what words, when, where and how to use it (Nation, 2001: 27). These other types of knowledge can also give rise to transfer. Studies focusing on learners’ use of phrasemes have brought to light transfer effects on collocational restrictions and lexico-grammatical patterns (e.g. Biskup, 1992; Granger, 1998b; Nesselhauf, 2003). For example, research into learners’ use of cognates has highlighted transfer effects on style and register (cf. Granger and Swallow 1988; Van Roey 1990; Granger 1996b). However, much remains to be done regarding ‘transfer of use’.

Applying Jarvis’s (2000) unified framework on learner corpus data brings to light interesting findings relating to L1 influence on word use. It helps to identify a number of transfer effects that remain largely undocumented in the SLA literature: transfer of function, transfer of the phraseological environment, transfer of style and register, and transfer of L1 frequency. These four transfer effects often accompany transfer of form and meaning and may also reinforce each other. They are illustrated in the remainder of this section.

Multiword sequences with a pragmatic anchor seem to be quite easily transferred. French learners’ use of the idiosyncratic expression *according to me is a good example of transfer of function. This sequence is repeatedly used in the ICLE-FR. By contrast, it does not appear in other learner sub-corpora except for the ICLE-DU and the ICLE-SW, where it is extremely rare. Moreover, there is congruity between French learners’ use of according to me in English and selon moi in French, which are probably regarded as translation equivalents by French EFL learners. The English preposition according to and the French selon both mean ‘as shown by something or stated by someone’ (e.g. According to George Heard Hamilton, Rodin became “a figure of international significance, the most admired, prolific, and influential sculptor since Bernini”; BNC-AC-HUM). However, they differ in one significant way: according to me is usually not accepted as a correct English phraseme; selon moi, however, is perfectly fine in French and is, in fact, quite frequent in French native-speaker students’ writing. This may explain why French EFL learners are keen to use what they regard as a direct translation of a common French expression.

The following examples illustrate French students’ use of selon moi and French EFL learners’ use of according to me:

5.109. Selon moi, la chanson est un vecteur de culture parce qu’elle est un art qui impose l’engagement des différents acteurs. (CODIF)

5.110. Selon moi, tout le monde pense ce qu’il veut et comme il veut, agit comme il l’entend en respectant la loi et les codes établis. (CODIF)

5.111. According to me, the prison system is not outdated: it has never been a solution per se. (ICLE-FR)

5.112. According to me, the real problem now is not that man refuses to pay heed but that man refuses to make some sacrifices for the sake of ecology and to understand that the values that we have chosen are the wrong ones. (ICLE-FR)

Figure 5.10 represents graphically how the misleading translation equivalent may be created by French EFL learners.

Transfer effects are also detectable in French learners’ use of lexico-grammatical and phraseological patterns. The English verb illustrate is a case in point. Although it is not found in many texts written by French learners, it is much more frequent in ICLE-FR overall than in any other learner sub-corpus. French learners’ knowledge of the verb illustrer in their mother tongue probably influences the type of word combinations and lexico-grammatical patterns in which they use the English verb illustrate. Table 5.27 shows that French EFL learners frequently use the verb illustrate in its infinitive form. The percentage of use of this form (40%) is quite similar to that of the infinitive form of the French cognate verb illustrer in CODIF, but differs significantly from the proportion of infinitive forms of the English verb illustrate that were found in the BNC-AC-HUM (23.6%) (cf. Table 4.6 in Section 4.2.2). A closer look at the occurrences of the infinitive form of illustrate in ICLE-FR reveals that it is repeatedly used in sentence-initial to-infinitive structures (Examples 5.113 and 5.114), a pattern that is also the preferred lexico-grammatical environment of illustrer in the corpus of French essays (Example 5.115).

5.113. To illustrate this, it would be interesting to compare our situation with the U.S.A.’s. (ICLE-FR)

5.114. To illustrate this point, we can mention the notion of culture and language in the north of Belgium. (ICLE-FR)

5.115. Pour illustrer cela, prenons l’exemple des pâtes alimentaires italiennes. (CODIF)
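The sentence-initial patterns discussed above (According to me, To illustrate this) are the kind of thing a concordancer retrieves with an anchored search. The following Python sketch shows one naive way to count them; it is not the procedure used in this study. The pattern inventory PATTERNS and the function count_sentence_initial are hypothetical names, and the input is assumed to be already split into sentences.

```python
import re

# Hypothetical pattern inventory: labels and regular expressions are
# illustrative assumptions, anchored at the start of the sentence.
PATTERNS = {
    "according to me": re.compile(r"^according to me\b", re.IGNORECASE),
    "to illustrate": re.compile(r"^to illustrate\b", re.IGNORECASE),
}


def count_sentence_initial(sentences):
    """Count sentences that open with each pattern (one sentence per item)."""
    counts = {label: 0 for label in PATTERNS}
    for sentence in sentences:
        text = sentence.strip()
        for label, pattern in PATTERNS.items():
            if pattern.match(text):
                counts[label] += 1
    return counts


# Sentences taken from Examples 5.111, 5.113 and 5.114, plus one control
# sentence in which 'illustrate' is not sentence-initial.
EXAMPLES = [
    "According to me, the prison system is not outdated: it has never been a solution per se.",
    "To illustrate this, it would be interesting to compare our situation with the U.S.A.'s.",
    "To illustrate this point, we can mention the notion of culture and language in the north of Belgium.",
    "The English verb illustrate is a case in point.",
]

if __name__ == "__main__":
    print(count_sentence_initial(EXAMPLES))
```

A real corpus query would also need sentence segmentation and some handling of quoted material, both of which are left out here.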

[Figure 5.10 A possible rationale for the use of ‘according to me’ in French learners’ interlanguage. Three panels. FRENCH: ‘selon’ + [+HUM] (‘selon X’, e.g. Xavier Flores, monsieur Bernanos, certains, lui, etc.; ‘selon moi’) and ‘selon’ + [-HUM] (e.g. idée, norme, théorie, principe, loi, philosophie, etc.). ENGLISH: ‘according to’ + [+HUM] (e.g. Judge Kamins, Hugo, Civil Liberty Members, supporters, etc.) and ‘according to’ + [-HUM] (e.g. article, situation, theory, argument, idea, etc.). FRENCH LEARNERS’ INTERLANGUAGE: ‘according to’ + [+HUM] (‘according to X’; *‘according to me’).]

Similarly, French EFL learners almost always use the verb conclude in the sentence-initial discourse marker To conclude followed by an active structure introduced by a first person pronoun + modal verb. This pattern is less frequent in the writing of EFL learners with other mother tongue backgrounds and parallels a very frequent way of concluding in French, viz. sentence-initial Pour conclure. The following examples show that longer sequences and phraseological cascades (see Section 5.2.3) may also be transfer-related.

Table 5.27 A comparison of the use of the English verb ‘illustrate’ and the French verb ‘illustrer’

                                 En. ‘illustrate’ in ICLE-FR    Fr. ‘illustrer’ in CODIF
simple present                    10     50%                      8     31%
infinitive                         8     40%                     13     50%
past participle                    2     10%                      3     12%
imperative                         0      0%                      1      4%
past                               0      0%                      1      4%
TOTAL                             20    100%                     26    100%
Rel. freq. per 100,000 words      14.67                          11.55

5.116. Pour conclure, nous pouvons dire que les deux stades sont aussi importants l’un que l’autre : il est nécessaire que l’homme soit membre d’un groupe mais il est tout aussi primordial qu’il s’en détache pour construire son identité propre. (CODIF)

5.117. Pour conclure, je dirais que chaque individu est unique, différent et qu’il est facile de vouloir ressembler aux autres plutôt que de s’accepter tel qu’on est. (CODIF)

5.118. To conclude, we can say that many people are today addicted to television. (ICLE-FR)

5.119. To conclude, I would say that science, technology and industrialisation certainly stand in the way of human relationships but not in people’s dreams and imagination. (ICLE-FR)

My findings also point to a transfer of style and register. In Section 5.2, the first person plural imperative form let us was shown to be overused by all L1 learner populations when compared to expert academic writing. As shown in Table 5.28, the two-word sequence occurs in 25.9 per cent of the texts produced by French learners and is much more frequent in the ICLE-FR than in any other learner sub-corpus. This difference in use between ICLE-FR and the other ICLE sub-corpora proved to be statistically significant. An analysis of concordance lines for let us shows that this sequence is repeatedly used by French speaking students to serve a number of rhetorical and organisational functions. For example, it is used as a code gloss to introduce an example (Example 5.120), a transition marker to change topic (Examples 5.121 and 5.122), and an attitude marker (Example 5.123).

Table 5.28 ‘let us’ in learner texts

L1           Number of texts including ‘let us’ or ‘let’s’    Number of texts
French                           59                                 228
Czech                            19                                 147
Dutch                            19                                 196
Finnish                          10                                 167
German                           14                                 179
Italian                          10                                  79
Polish                           23                                 221
Russian                          47                                 194
Spanish                          14                                 149
Swedish                           9                                  81
TOTAL                           224                               1,641

5.120. To illustrate the truth of this, let us take the example of Britain which was already fighting its corner alone after Mrs Thatcher found herself totally isolated over the decision that Europe would have a single currency. (ICLE-FR)

5.121. Let us then focus on the new Europe as a giant whose parts are striving for unity. (ICLE-FR)

5.122. Let us now turn our attention to the students who want to apply for a job in the private sector. (ICLE-FR)

5.123. Let us be clear that we cannot let countries tear one another to pieces and if we closed our eyes to such an atrocity, our behaviour would be cowardly. (ICLE-FR)

As explained in Section 4, the first person plural imperative form let us is found in professional academic writing, but it is not frequent (relative frequency of 5.46 occurrences per 100,000 words). It is also restricted to a limited set of verbs (see Swales et al., 1998; Hyland, 2002). In the BNC-AC-HUM, there are only eight significant verb co-occurrents of let us: consider, look, begin, say, return, suppose, take and have. There is no lexically equivalent form to En. let us in French. Equivalence is however found at the morphological level as French makes use of an inflectional suffix to mark the first imperative plural form. Thus, to investigate the third L1 effect, i.e. intra-L1 group congruity between learners’ L1 and IL performance, I compared the use of let us in ICLE-FR with that of first person plural imperative verbs in CODIF. The rhetorical and organisational functions fulfilled by let us in French EFL learner writing can be paralleled with the very frequent use of first person plural imperative verbs in French student writing to organize discourse and interact with the reader (Paquot, 2008a):

5.124. Prenons l’exemple des sorciers ou des magiciens au Moyen Age. (CODIF)

5.125. Considérons un instant le cinéma actuel. (CODIF)

5.126. Examinons successivement le problème de l’abolition des frontières d’un point de vue économique, juridique et enfin culturel. (CODIF)

5.127. Envisageons tout d’abord la question économique. (CODIF)

5.128. Imaginons un monde ou règne une pensée unique. (CODIF)

5.129. Comparons cela à la visite de la cathédrale d’Amiens. (CODIF)

5.130. Ajoutons qu’une partie plus spécifique de la population est touchée. (CODIF)

First person plural imperative verbs serve specific discourse strategies in French formal types of writing, and more specifically in academic writing. Imperative forms that are repeated in the ICLE-FR often have formal equivalents that are found in CODIF (e.g. let us take ‘prenons’, let us take the example of ‘prenons l’exemple de’, let us consider ‘considérons’, let us examine ‘examinons’, let us think ‘pensons’, let us hope ‘espérons’, let us (not/never) forget ‘oublions/n’oublions pas que’). French EFL learners seem to transfer their knowledge of French academic writing conventions (Connor, 1996) and make use of imperatives in English academic writing in the same way as in French academic writing. This generalized overuse of the first person plural imperative in EFL French learner writing as a rhetorical strategy does not conform to English academic writing conventions but rather to French academic style. In English, let us (and more precisely its contracted form let’s) is much more typical of speech (relative frequency of 42.5 occurrences per 100,000 words in the BNC-SP but only 5.3 per 100,000 in the BNC-AC). As a result, the speech-like nature of let us in French EFL learner writing leads to an overall impression of stylistic inappropriateness.

This example points to yet another type of transfer effect, namely transfer of L1 frequency. As shown in Table 5.29, the frequency of let us in the ICLE-FR is much closer to the frequency of first person plural imperative verbs in student writing in French, than in English expert or novice writing. Other examples of sequences that have French-like frequencies in the ICLE-FR include on the contrary, on the other hand, let us take the example, to illustrate this, to conclude and *according to me.
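The comparisons in this section rest on two simple normalisations: a relative frequency per 100,000 words and the share of texts in which an item occurs. The Python sketch below is a minimal illustration with hypothetical function names (per_100k, share_of_texts). It assumes the figures as we read them from the text and tables: 26 occurrences of illustrer in the 225,174-word CODIF, and 59 of the 228 ICLE-FR texts containing let us or let’s.

```python
def per_100k(occurrences: int, corpus_words: int) -> float:
    """Relative frequency normalised to 100,000 words."""
    return occurrences / corpus_words * 100_000


def share_of_texts(texts_with_item: int, total_texts: int) -> float:
    """Percentage of texts in which the item occurs at least once."""
    return texts_with_item / total_texts * 100


if __name__ == "__main__":
    # 'illustrer' in CODIF: 26 occurrences in 225,174 words (assumed reading).
    print(round(per_100k(26, 225_174), 2))    # 11.55
    # 'let us' / 'let's' in ICLE-FR: 59 of 228 texts (assumed reading).
    print(round(share_of_texts(59, 228), 1))  # 25.9
```

Both results agree with the values reported in the text (11.55 per 100,000 words and 25.9 per cent), which suggests that the counts have been read correctly.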

Table 5.29 The transfer of frequency of the first person plural imperative between French and English writing

Corpus                                   Relative frequency per 100,000 words of first person plural imperative verbs
French L1 students (CODIF)               95.9
French EFL learners (ICLE-FR)            71.7
English expert writers (BNC-AC-HUM)       5.5
English novice writers (LOCNESS)          3

[Figure 5.11 A possible rationale for the use of ‘let us’ in French learners’ interlanguage. Three panels. FRENCH: Fr. 1st plural imperative (‘prenons l’exemple de’, ‘n’oublions pas’, ‘examinons’) with FREQUENCY-FR, REGISTER-FR, FUNCTION-FR and PHRASEOLOGY-FR. ENGLISH: En. 1st plural imperative (‘let us take the example of’, ‘let us not forget’, ‘let us examine’) with FREQUENCY-EN, REGISTER-EN, FUNCTION-EN and PHRASEOLOGY-EN. FRENCH EFL LEARNERS’ INTERLANGUAGE: En. 1st plural imperative (‘let us take the example of’, ‘let us not forget’, ‘let us examine’) with FREQUENCY-FR, REGISTER-FR, FUNCTION-FR and PHRASEOLOGY-FR.]

Transfer effects often interact in learners’ use of English lexical devices. As illustrated in Figure 5.11, French EFL learners’ use of textual phrasemes such as let’s take the example of, let’s examine or let us not forget mirror the stylistic profile of the French sequences prenons l’exemple de, examinons et n’oublions pas in French academic writing. Thus, French EFL learners use English first person plural imperatives in academic writing with the frequency of French imperative verbs in the corresponding register, in French-like phraseological patterns and to serve the same organizational and interactional functions.

The transfer effects identified in this section – transfer of function, transfer of lexico-grammatical and phraseological patterns, transfer of style and register, and transfer of frequency – make up what, following Hoey (2005: 183), I refer to as ‘transfer of primings’. EFL learners’ knowledge of words and word combinations in their mother tongue includes a whole range of information about their preferred co-occurrences and sentence position, stylistic or register features, discourse functions and frequency. Primings for collocational and contextual use of (at least a restricted set of frequent or core) L1 lexical devices are particularly strong in the mental lexicon of adult EFL learners. They are the result of many encounters with these lexical items in L1 speech and writing. Mental primings in the L1 lexicon probably influence EFL learners’ knowledge of English words and word sequences by priming the lexico-grammatical preferences of an L1 lexical item to its English counterpart.

5.4. Summary and conclusion

The data presented in this chapter support the idea that the ‘English of advanced learners from different countries with a relatively limited variation of cultural and educational background factors share a number of features which make it differ from NS language’ (Ringbom, 1998: 49). The focus of the analysis has been on the lexical means available to learners to perform specific rhetorical and organizational functions in academic writing, and more precisely in argumentative essays. This textual dimension is particularly difficult to master and has been described by Perdue (1993) as the last developmental stage before bilingualism in second language acquisition. My results show that the expression of rhetorical and organizational functions in EFL writing is characterized by:

A limited lexical repertoire: EFL learners tend to massively overuse a restricted set of words and phrasemes to serve a particular rhetorical

They also seem to prefer to use conjunctions. with labels. A marked preference for sentence-initial position of connectors: connectors are often used in the unmarked sentence-initial position in learner writing. Semantic misuse: As Crewe (1990: 317) commented. verbs and adjectives. ‘the misuse of logical connectives is an almost universal feature of ESL students’ writing’. which reveals learners’ weak sense of native speakers’ ‘preferred ways of saying things’. The frequency of informal words and phrases in learner writing is often closer to their frequency in native-speakers’ speech than in their academic prose. adverbs and prepositions rather than phraseological patterns with nouns. A medial position is not favoured by EFL learners. where lexical items are employed to signal grammatical and textual relations’ and that ‘a lack of coherence in advanced learners’ writing must at least partly be attributable to lexico-grammatical deficits’ (Lorenz. Preferred co-occurrences in the ICLE are often not the same as in academic prose. A lack of register awareness: texts produced by EFL learners often ‘give confusing signals of register’ (Field and Yip. The methodology used in the first part of this chapter has made it possible to draw a general picture of the writing of upper-intermediate to advanced EFL learners from different mother tongue backgrounds. abstract nouns that are inherently unspecific and require lexical realization in their co-text. either beforehand or afterwards.e. Chains of connective devices: EFL learners’ texts are sometimes characterized by the use of superfluous (and sometimes semantically inconsistent) connective devices. and specifically. 1999b: 56).Academic vocabulary in the ICLE 193 function and to underuse a large proportion of the lexical means available to expert writers. i. 
Lexico-grammatical and phraseological specificities: EFL learners’ writing is distinguishable by a whole range of lexico-grammatical patterns and co-occurrences that differ from academic prose in both quantitative and qualitative terms. Most . Learners’ attempts at using collocations are not always successful and sometimes result in crude approximations and lexico-grammatical infelicities. My results also support Lorenz’s (1999b) remark that ‘advanced learners’ deficits are most resilient in the area of lexico-grammar. 1992: 26) as they display mixed patterns of formality and informality. is that EFL learners also experience difficulty with the semantics of other types of cohesive devices. although it is typical of academic prose. however. What is less well-documented in the literature.

it is probably not the only explanation.6. teaching-induced factors have been identified as a possible cause for learners’ preference for sentence-initial position. once linguistic features of upper-intermediate to advanced EFL learner writing have been identified. In Section 5. this feature is actually common to all learner populations represented in the corpus. . in this case Chinese: The overuse of this expression [more and more] was most probably due to language transfer since a familiar expression in the Chinese language ye lai yue was popularly used. although it is indeed very significant in the Chinese component of the second edition of the ICLE (Granger et al.. 2008).. 2004: 135–6). As explained above. foreign learner writing and native student writing make it possible to distinguish between learner-specific and developmental features (e.2. It is not always possible to attribute learner-specific features to a single factor. (Zhang. Syntactic positioning of connectors is rarely taught and EFL learners often consider the sentenceinitial position to be a safe strategy. (Zhang. My methodology makes it possible to avoid hasty interpretations in terms of L1 influence.g. 2000: 77).1) has a role to play. Another advantage of the method I used is that. This is precisely where a corpus of essays written by English native university students such as LOCNESS (see Section 2. Neff et al. The reason for the initial positioning of conjunctions was again due to the transfer of the Chinese language where conjunction devices with similar meaning are mostly used at the beginning of a sentence. sentence-initial positioning of conjunctions is common to most learner populations. The mother tongue may reinforce learners’ preference for sentence-initial position but cannot be regarded as a complete explanation for this learner-specific feature. 
teaching-induced and transfer-related effects can reinforce each other (Granger.194 Academic Vocabulary in Learner Writing of these features have already been mentioned in the literature. as developmental. but they have always been reported on the basis of only one or two L1 learner populations. 2009). This suggests that. Consider the following quotations by Zhang (2000). who attributes a number of features to the influence of the learners’ mother tongue. Tripartite comparisons between professional writing. As for the overuse of the expression more and more. we can check to what extent they are specific to EFL learners or just typical of novice writing. while transfer may be at work in Chinese learners’ use of more and more. 2000: 83).
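The logic of such a tripartite comparison can be made concrete with a small sketch. The code below is only an illustration and is not part of the study: the function names, the overuse threshold of 1.5 and the counts in the usage note are all hypothetical assumptions. It normalizes raw counts to frequencies per million words and then labels a feature as developmental when both learners and native students overuse it relative to expert writing, and as learner-specific when only learners do.

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Relative frequency per million words of running text."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


def classify_feature(expert_pmw: float,
                     native_student_pmw: float,
                     learner_pmw: float,
                     ratio: float = 1.5) -> str:
    """Crude tripartite comparison of one lexical item.

    `ratio` is a hypothetical threshold: an item counts as overused when its
    relative frequency exceeds `ratio` times its frequency in expert writing.
    """
    learner_overuse = learner_pmw > ratio * expert_pmw
    novice_overuse = native_student_pmw > ratio * expert_pmw
    if learner_overuse and novice_overuse:
        # Shared by native and non-native novices: typical of novice writing.
        return "developmental"
    if learner_overuse:
        # Overused by learners only.
        return "learner-specific"
    return "not overused by learners"
```

With invented figures, `classify_feature(100, 300, 400)` returns `"developmental"`, whereas `classify_feature(100, 90, 400)` returns `"learner-specific"`. A fixed ratio is a deliberate simplification; a significance test on the raw counts would be a more careful choice.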

Whether a feature is learner-specific or developmental varies from lexical item to lexical item, but as a general rule, the findings suggest that the main feature shared by native and non-native novice writers is a lack of register-awareness. Figure 5.12 shows that a whole range of lexical items that Gilquin and Paquot (2008) found to be overused in EFL learners’ writing – maybe, so expressing effect, it seems to me, this/that is why, really, I think and first of all – are also more frequently used by native-speaker student writers than in expert academic prose. The overuse of I think in both EFL learner and native-speaker student writing has already been reported by Neff et al. (2004a) who described it as a general ‘novice-writer characteristic of excessive visibility’ (Neff et al., 2004a: 152).

[Figure 5.12 Features of novice writing – Frequency in expert academic writing, native-speaker and EFL novices’ writing and native speech (per million words of running text). Corpora compared: expert academic writing: British National Corpus, academic component (15m words); native-speaker student writing: sub-corpus of LOCNESS (100,702 words); EFL learners’ writing: ICLEv2 (14 L1s, around 1.5m words); native speech: British National Corpus, spoken component (10m words). Panels show the frequency (pmw) of: I think; first of all; PRO (this, that, which) is why; maybe; so expressing effect; it seems to me; by the way; sentence-final though; I would like / want / am going to talk about; amplifying adverbs. The labels really, of course, certainly, absolutely and definitely also appear in the figure.]

Figure 5.12 also shows that not all learner-specific speech-like lexical items are overused in the writing of native-speaking students. Thus, the lexical items of course, certainly, absolutely, sentence-final though, by the way and I would like/want/am going to talk about are quite rare in LOCNESS. They are even less frequent in native-speaker students’ writing than in academic prose, which suggests that native novice writers do not transfer all spoken features to their academic writing. It seems that lexical items which are not particularly frequent in speech and are rare in academic prose (e.g. I want/would like/am going to talk about) are less likely to be overused by native novice writers. By contrast, lexical items that are very frequent in speech, and acceptable in academic prose, are very likely to be overused (e.g. maybe, so expressing effect). As Gilquin, Granger and Paquot (2007a: 323) have argued, the issue of the degree of overlap between novice native writers and non-native writers has far-reaching methodological and pedagogical implications and is clearly in need of further empirical study.

Developmental factors in L1 and L2 acquisition cannot, however, be held responsible for all learner-specific features. Other linguistic features are limited to EFL learners. These include lexico-grammatical errors (*a same, possibility *to, despite *of, discuss *about), the use of non-native-like sequences (e.g. according to me and as a conclusion), and the overuse of relatively rare expressions such as in a nutshell. In addition to teaching-induced factors and proficiency, the first language also plays a part in EFL learners’ use of academic vocabulary.

In the last part of this chapter, I focused on the potential influence of the first language on multiword sequences that serve rhetorical functions in French learner writing. I made use of Jarvis’s (2000) framework for assessing transfer and identified a number of transfer effects – transfer of function, of the phraseological environment, of style and register, and of L1 frequency – that I referred to as ‘transfer of primings’. These results support Kellerman’s claim that the ‘hoary old chestnut’ according to which transfer does not afflict the more advanced learner ‘should finally be squashed underfoot as an unwarranted overgeneralization based on very limited evidence’ (Kellerman, 1984: 121). However, they also suggest that the main effect of the students’ mother tongue on higher-intermediate to advanced learner writing is not errors, but more subtle transfer effects, especially at higher levels of proficiency.


Part III Pedagogical implications and conclusions

In the first two sections I defined the concept of ‘academic vocabulary’, built a list of academic keywords from corpora of expert writing, and analysed their use in ten sub-corpora of the International Corpus of Learner English. In Chapter 6, I discuss some of the important pedagogical implications of this research. There are three key aspects: the influence of teaching on learners’ writing, the role of the first language in EFL learning and teaching, and the use of corpora, and more specifically, learner corpora, in the development of EAP teaching materials. Chapter 7 then briefly summarizes the major results, discusses some of their implications, and suggests several remaining issues and avenues for future research.


Chapter 6 Pedagogical implications

This chapter considers three areas where my findings have major pedagogical implications: teaching-induced factors, the role of the first language in EFL learning and teaching, and the role of corpora in EAP material design. The ways in which corpus data, in particular data from learner corpora, have been used to inform the academic-writing sections of the second edition of the Macmillan English Dictionary for Advanced Learners (MED2) (Rundell, 2007) are also discussed.

6.1. Teaching-induced factors

Factors linked to teaching have repeatedly been denounced in the literature as being responsible for a number of learners’ inappropriate uses of connectors (see Zamel, 1983; Hyland and Milton, 1997; Flowerdew, 1998; Milton, 1999; Lake, 2004). Connectors are often presented in long lists of undifferentiated and supposedly equivalent items, classified in broad functional categories. This can cause semantic misuse (Crewe, 1990). For example, Jordan (1999) describes the adverbial on the contrary as a phrase of contrast equivalent to on the other hand and by contrast (see Figure 6.1). The same is said about conversely. However this adverb should only be used to indicate that one situation is the exact opposite of another:

6.1. American consumers prefer white eggs; conversely, British buyers like brown eggs. (LDOCE4)

Also problematic are the categorization of besides as a marker of concession, and the misleading presentation of the conjunctions even if and even though as synonyms.

A. Contrast, with what has preceded:
instead; conversely; then; on the contrary; by (way of) contrast; in comparison; (on the one hand) . . . on the other hand . . .

B. Concession indicates the unexpected, surprising nature of what is being said in view of what was said before:
besides; (or) else; however; nevertheless; nonetheless; notwithstanding; only; still; while; (al)though; yet; in any case; at any rate; for all that; in spite of/despite that; after all; at the same time; on the other hand; all the same; even if/though

Figure 6.1 Connectives: contrast and concession (Jordan, 1999: 136)

Overuse of connectors such as nevertheless, in a nutshell, as far as I am concerned, on the one hand, and on the other hand can also be attributed to the long lists of connectors found in most textbooks (Granger, 2004: 135) as no information is given about their frequency or semantic properties.

The selection of connectors to be taught may also lend itself to criticism. It was shown in Section 5.2.3 that sequences that are rarely used by native speakers (e.g. as far as I am concerned or last but not least) or ‘unidiomatic’ sequences (e.g. as a conclusion) are sometimes found in teaching materials, especially in the lists of connectors freely available on the Internet. By contrast, the connectors most frequently used to serve rhetorical functions are sometimes missing from these lists.

Another direct consequence of these lists is EFL learners’ stylistic inappropriateness. Milton (1999) has shown that there is a strong correlation between the words and phrases overused by Hong Kong students and the functional lists of expressions distributed by tutorial schools (private institutions which prepare most high school students in Hong Kong for English examinations), as Milton explains:

Students are drilled in the categorical use of a short list of expressions – often those functioning as connectives or alternatively those which are colourful and complicated (and therefore impressive) – regardless of whether they are used primarily in spoken or written language (if indeed at all), or to which text types they are appropriate (1998: 190).

Thus, the spoken-like expression all the same is given as an equivalent alternative to more formal connectors such as on the other hand or notwithstanding in Jordan (1999) (see Figure 6.1). This example also illustrates the fact that no information about the connectors’ grammatical category or syntactic properties is made available to the learners. The preposition notwithstanding is listed together with adverbs and adverbial phrases (e.g. however, yet) as well as conjunctions (e.g. although, while).

Learners’ marked preference for the sentence-initial positioning of connectors has also been related to L2 instruction (see Flowerdew, 1998; Milton, 1999; Narita and Sugiura, 2006). Positional variation of connectors is usually not taught, and learners use the sentence-initial position as a safe bet.

Another problem of teaching practices (which has not often been documented) is that too much emphasis tends to be placed on connectors, that is, on grammatical cohesion (see Halliday and Hasan, 1976), to the detriment of lexical cohesion. However I have shown in this book that nouns, verbs and adjectives all have prominent rhetorical functions in academic prose. Labels have also been found to fulfil a prominent cohesive role in this particular genre. It is most probable that lexical cohesion has been neglected in EFL teaching because ‘there have been no good descriptions of the forms and functions of this phenomenon’ (Flowerdew, 2006: 345).

6.2. The role of the first language in EFL learning and teaching

My findings have at least two important pedagogical implications relating to the role of the first language in EFL learning and teaching. Transfer of primings means that words or word sequences in the foreign language may be primed for L1 use in terms of discourse function, collocational and lexico-grammatical preferences, register and frequency. One of the many roles of teaching should thus be to counter these ‘default’ and sometimes misleading primings in EFL learners’ mental lexicons. Awareness-raising activities focusing on similarities and differences between the mother tongue and the foreign language are clearly needed to achieve this. These activities should not be restricted to ‘helping learners focus on errors typically committed by learners from a particular L1’ (Hegelheimer and Fisher, 2006: 259). They should also raise learners’ awareness of more subtle differences such as the register differences and collocational preferences of similar words in the two languages. This recommendation stands in sharp contrast to Bahns’s (1993: 56) claim that collocations that are direct translation equivalents do not need to be taught. Learners have no way of knowing which collocations are congruent in the mother tongue and the foreign language; moreover, the differences between the collocations in L1 and L2 may lie in aspects of use rather than form or meaning.

However, as Odlin commented, it is not always possible to make use of the first language in the classroom and to rely on contrastive data:

Whatever the merits of contrastive materials in some contexts, it is clear that such materials are not always feasible. For example, when an ESL class consists of speakers of Chinese, Persian, Spanish, Tamil, and Yoruba, there is not likely to be any textbook that contrasts English verb phrases with verb phrases in all of those languages – and even if there were, teachers could not profitably spend the class time necessary to illuminate so many contrasts. Yet even in such classes, one type of contrastive information is frequently available: bilingual dictionaries. Although the comparisons are sometimes restricted to words in the native and target languages, the most carefully prepared dictionaries often provide some comparisons of pronunciation and grammar as well. If the class size allows it, teachers can help individual students in using any contrastive information that their dictionaries provide. (1989: 162)

Bilingual dictionaries should ideally facilitate the teacher’s task in multilingual as well as monolingual classrooms. However, it is questionable whether the type of contrastive information they provide is fully adequate. In Section 5.3, I showed that first person plural imperatives are not the best way of organizing discourse and interacting with the reader in English academic writing. However, the Robert & Collins CD-Rom (Version 1.1) includes an essay-writing section in which first person plural imperatives in French are systematically translated by structures employing let us in English (Granger and Paquot, 2008b). Table 6.1 lists examples of infelicitous translation equivalents. Similarly, a web-page devoted to linking words and hosted by the ‘Académie de Lille (Anglais BTS Informatique)’ lists according to me as a direct translation equivalent of the French ‘à mon avis’, and as a conclusion as a possible equivalent of the French ‘pour conclure / pour résumer’.

Table 6.1 Le Robert & Collins CD-Rom (2003–2004): Essay writing

Essay writing: function | French sentence | Proposed English equivalence
Developing the argument | Prenons comme point de départ le rôle que le gouvernement a joué dans l’élaboration de ces programmes | = ‘let us take ... as a starting point’
Developing the argument | En premier lieu, examinons ce qui fait obstacle à la paix | = ‘firstly, let us examine’
The other side of the argument | Après avoir étudié la progression de l’action, considérons maintenant le style | = ‘after studying ..., let us now consider’
The other side of the argument | Venons-en maintenant à l’analyse des retombées politiques | = ‘now let us come to’
Assessing an idea | Examinons les origines du problème ainsi que certaines des solutions suggérées | = ‘let us examine ... as well as’
Assessing an idea | Sans nous appesantir or nous attarder sur les détails, notons toutefois que le rôle du Conseil de l’ordre a été déterminant | = ‘without dwelling on the details, let us note, however, that’
Assessing an idea | Nous reviendrons plus loin sur cette question, mais signalons déjà l’absence totale d’émotion dans ce passage | = ‘we shall come back to this question later, but let us point out at this stage’
Assessing an idea | Avant d’aborder la question du style, mentionnons brièvement le choix des métaphores | = ‘before tackling ..., let us mention briefly’
Adding or detailing | Ajoutons à cela or Il faut ajouter à cela or À cela s’ajoute un sens remarquable du détail | = ‘let us add to this or added to this’
Introducing an example | Prenons le cas de Louis dans «le Nœud de vipères» | = ‘(let us) take the case of’
Stating facts | Rappelons les faits. Victoria l’Américaine débarque à Londres en 1970 et réussit rapidement à s’imposer sur la scène musicale | = ‘let’s recall the facts’
Emphasizing particular points | N’oublions pas que, sur Terre, la gravité pilote absolument tous les phénomènes | = ‘let us not forget that’

These findings are quite representative of a general lack of good contrastive studies on which pedagogical materials can be based. Multilingual corpora clearly have an important role to play here by providing an empirically-based source of translation equivalents (Bowker, 2003; King, 2003).

6.3. The role of learner corpora in EAP materials design

While teaching materials designed to help undergraduate students improve their academic writing skills are legion (e.g. Bailey 2006; Hamp-Lyons and Heasley 2006), few make use of authentic texts and very few are informed by the use of corpora. Even when they are corpus-informed, EAP resources tend to be based on data from native-speakers only. Thurstun and Candlin’s (1997) Exploring Academic English, which uses concordance lines to introduce new words in context and familiarize learners with phraseological patterns, relies exclusively on data from a native-speakers’ academic corpus. Although this is one of the most innovative EAP textbooks to date, it is arguably less useful for non-native learners, despite Thurstun and Candlin’s (1998) claim that it is equally appropriate for native and non-native writers.

The value of pedagogical tools for non-native speakers of English would be greatly increased if findings from learner corpus data were also used to select what to teach and how to teach it. As Flowerdew (1998) put it, ‘when choosing which markers to teach, decisions made should also be based on findings from a parallel student corpus to ascertain where students’ main deficiencies lie. If not, there is a danger that the emphasis on teaching the most frequent markers may focus on ones already familiar to and correctly used by students, or in this case, exacerbate the problem with their overuse’ (Flowerdew, 1998: 338).

As shown in Section 5.2, EFL writing is characterized by a number of linguistic features that differ from novice native-speakers’ writing. Thus, by showing, in context, the types of infelicities EFL learners produce and the types of errors they make, as well as the items they tend to under- or overuse, learner corpora are the most valuable resources for designing EAP materials which address the specific problems that EFL learners encounter (see also Flowerdew, 2001; Granger, 2009). Yet, such corpora have very rarely been used systematically to inform EAP materials (see Milton, 1998 and Tseng and Liou, 2006 for two exceptions in Computer-Assisted Language Learning).

The only type of resource in which learner corpus data have been relatively successfully implemented up to now is the monolingual learners’ dictionary (MLD). For example, the Longman Dictionary of Contemporary English and the Cambridge Advanced Learner’s Dictionary include a number of learner corpus-informed usage notes which warn against common learner errors (e.g. the countable use of the noun information, the confusion between the adjectives actual and current). Yet, if MLDs are to take further ‘proactive steps to help learners negotiate known areas of difficulty’ (Rundell, 1999: 47), learner corpora should not only be exploited to compile error notes but also to improve other aspects of the dictionary.

The method used in Chapter 5 has made it possible to identify a number of common features of EFL learners’ expression of rhetorical and organizational functions. A selected list of features was used to inform a 30-page writing section which I and two other members of the Centre for English Corpus Linguistics (CECL), Gaëtanelle Gilquin and Sylviane Granger, designed for the second edition of the Macmillan English Dictionary for Advanced Learners (Gilquin et al., 2007b: IW1–IW29). The writing section includes 12 functions that EFL learners need to master in order to write well-structured academic texts. These were identified in Section 4.1 as typically appearing in EAP textbooks which adopt a functional approach to academic writing: (1) adding information, (2) comparing and contrasting: describing similarities and differences, (3) exemplification: introducing examples, (4) expressing cause and effect, (5) expressing personal opinions, (6) expressing possibility and certainty, (7) introducing a concession, (8) introducing topics and related ideas, (9) listing items, (10) reformulation: paraphrasing or clarifying, (11) reporting and quoting, (12) summarizing and drawing conclusions.

Each writing section includes a detailed ‘corpus-based rather than corpus-bound’ description (Summers, 1996: 262) of the many lexical means that are available to expert writers to perform a specific function. As put by Cook (1998: 57) referring to Carter’s (1998b) standpoint, ‘materials should be influenced by, but not slaves to, corpus findings’ (see also Swales, 2002; Widdowson, 2003). Special emphasis is placed on AKL nouns, adjectives and verbs and their phraseological patterns. As shown in Figures 6.2 and 6.3, the sections provide information about how to use these words appropriately by focusing on their:

– semantic properties,
– syntactic positioning,
– collocations,
– frequency,
– style and register differences.

All the examples come from the academic component of the British National Corpus.

You can use the nouns resemblance, similarity, parallel, and analogy to show that two points, ideas, or situations are similar in certain ways:

If there is a resemblance or similarity between two or more points, ideas, situations, or people, they share some characteristics but are not exactly the same:
There is a striking resemblance between them.
The distribution of votes across the three parties in 1983 bears a close resemblance to the elections of 1923 and of 1929.
He would have recognized her from her strong resemblance to her brother.
There is a remarkable similarity of techniques, of clothes and of weapons.

The noun similarity also refers to a particular characteristic or aspect that is shared by two or more points, ideas, situations, or people:
These theories share certain similarities with biological explanations.
The orang-utan is the primate most closely related to man; its lively facial expressions show striking similarities to those of humans.

Collocation. Adjectives frequently used with resemblance and similarity: certain, close, remarkable, striking, strong, superficial

You can also use the noun parallel to refer to the way in which points, ideas, situations, or people are similar to each other:
Scientists themselves have often drawn parallels between the experience of a scientific vocation and certain forms of religious experience.
There are close parallels here with anti-racist work in education.

An analogy is a comparison between two situations, processes, etc which are similar in some ways, usually made in order to explain something or make it easier to understand:
A useful analogy for understanding Piaget’s theory is to view the child as a scientist who is seeking a ‘theory’ to explain complex phenomena.
A close analogy can be drawn between cancer of the cell and a society hooked on drugs.

Collocation. Adjectives frequently used with analogy and parallel: close, interesting, obvious

Figure 6.2 Comparing and contrasting: using nouns such as ‘resemblance’ and ‘similarity’ (Gilquin et al., 2007b: IW5)

When you want to explain or define exactly what you mean by something, you can use the abbreviation i.e. (short for ‘id est’, the Latin equivalent of ‘that is’) or the expressions that is and that is to say:
The police now have up to ninety-six hours, i.e. four days and nights, to detain people without charge.
Descartes was obsessed by epistemological questions, that is, questions about what we can know and how we can know it.
It excludes the public sector, that is to say, the nationalized industries.

The abbreviation i.e. follows a comma or is used between brackets:
Network emergencies (i.e. network failures) should be reported immediately.

That is and that is to say are usually enclosed by commas.

Note that, in academic writing and professional reports, i.e. and that is are much more frequent than that is to say.

[Graph: frequency per million words in academic writing of i.e., that is and that is to say]

Figure 6.3 Reformulation: explaining and defining: using ‘i.e.’, ‘that is’ and ‘that is to say’ (Gilquin et al., 2007b: IW9)

Evidence from learner corpora was used in several ways to inform the writing sections. First, the sections specifically address the types of problems discussed in Chapter 5 – limited lexical repertoire, lack of register awareness, phraseological infelicities, semantic misuse, overuse of connective devices and syntactic positioning. Our treatment of these problems is mainly explicit, in that we draw learners’ attention to error-prone items and we provide negative feedback in the form of ‘Be careful!’ notes which focus on problems of frequency (over- and underuse), register confusion and atypical positioning. There are also ‘Get it right’ boxes which are intended to give guidance on how to avoid common errors. These notes are typically supported by frequency data, in the form of graphs which help the reader visualize the differences between learners’ language and that of native writers. Thus, in the section on ‘Expressing cause and effect’, a graph is used to show that learners have a strong tendency to use the adverb so, which is relatively rare in academic prose and much more typical of speech (see Figure 6.4). Numerous authentic examples are provided to illustrate all the points we make. The reader is referred to Gilquin et al. (2007a) for more detailed information on the principles that guided the design of these writing sections.

Be careful! Learners often use so to express an effect. This use is correct, but it is more typical of speech and should therefore not be used too often in academic writing and professional reports.

[Graph: frequency per million words of so expressing effect in academic writing, learner writing and speech]

Figure 6.4 Expressing cause and effect: ‘Be careful’ note on ‘so’ (Gilquin et al., 2007b: IW13)

My investigation of academic vocabulary has shown that the use of learner corpus data, and their systematic comparison with native corpora, can bring to light a wide range of learner-specific features, not limited to grammatical or lexical errors, but also including over-reliance on a limited set of lexical devices and under-representation of a wide range of typical academic words and phraseological patterns. While Gilquin et al. (2007a, 2007b) have shown how these findings can be integrated into a learner’s dictionary, other writing resources, such as textbooks or electronic writing aids, could equally benefit from the use of learner corpus data.
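A systematic comparison of learner and native corpora ultimately rests on deciding whether an item occurs significantly more or less often in one corpus than in the other. The sketch below is a minimal illustration under stated assumptions: it uses the log-likelihood ratio, a statistic commonly applied to frequency comparisons in corpus linguistics, although the text here does not say which test was applied. The function names are hypothetical and the counts in the usage note are invented.

```python
import math

# Chi-square critical value for 1 degree of freedom at p < 0.05.
CRITICAL_P05 = 3.84


def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Log-likelihood ratio (G2) for one item observed in two corpora."""
    if size_a <= 0 or size_b <= 0:
        raise ValueError("corpus sizes must be positive")
    total = count_a + count_b
    combined = size_a + size_b
    g2 = 0.0
    for observed, size in ((count_a, size_a), (count_b, size_b)):
        # Expected count if the item were equally frequent in both corpora.
        expected = size * total / combined
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def compare_usage(learner_count: int, learner_size: int,
                  reference_count: int, reference_size: int) -> str:
    """Label an item as overused, underused or not significantly different."""
    g2 = log_likelihood(learner_count, learner_size,
                        reference_count, reference_size)
    if g2 < CRITICAL_P05:
        return "no significant difference"
    learner_rate = learner_count / learner_size
    reference_rate = reference_count / reference_size
    return "overuse" if learner_rate > reference_rate else "underuse"
```

For instance, `compare_usage(50, 1000, 10, 1000)` returns `"overuse"`, while swapping the two counts returns `"underuse"`. The statistic only signals that a difference exists; interpreting it as a learner-specific or developmental feature still requires the kind of qualitative analysis described in this chapter.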

Chapter 7

General conclusion

This book lies at the intersection of three areas of research: English for academic purposes, learner corpus research and second language acquisition. In this final chapter, I take stock of the main findings of the present study and bring out its major contributions to these three research areas. The chapter concludes with some avenues for future research.

7.1. Academic vocabulary: a chimera?

The status and usefulness of EAP has been questioned by Hyland who believes that ‘academic literacy is unlikely to be achieved through an orientation to some general set of trans-disciplinary academic conventions and practices’ (Hyland, 2000: 145). This book, however, supports and substantiates the concept of ‘English for (General) Academic Purposes’ both as a macro-genre which subsumes a wide range of text types in academic settings (Biber et al., 1999), and as a teaching practice that deals with ‘the teaching of the skills and language that are common to all disciplines’ (Dudley-Evans and St John, 1998: 41) and focuses on ‘a general academic English register, incorporating a formal, academic style, with proficiency in the language use’ (Jordan, 1997: 5).

My own contribution to legitimizing EAP has been to demonstrate – on the basis of corpus data – that ‘it is possible to delimit a procedural vocabulary of such words that would be useful for readers/writers over a wide range of academic disciplines involving varied textual subject matters and genres’ (McCarthy, 1991: 78). Academic texts are characterized by a wide range of words and phrasemes that refer to activities which are typical of academic discourse, and more generally, of scientific knowledge. These lexical items also contribute to discourse organization and cohesion, from topic introduction to concluding statements. I have therefore argued in favour of a functional

definition of ‘academic vocabulary’ (Martínez et al., 2009) and proposed the following definition: academic vocabulary consists of a set of options to refer to those activities that characterize academic work, organize scientific discourse and build the rhetoric of academic texts.

Unlike Coxhead’s (2000) definition of the term, a large proportion of what has been referred to as academic vocabulary in this book consists of core words, a category which has so far largely been neglected in EAP courses. Following researchers such as Hancioğlu et al. (2008), I have therefore questioned the fuzzy but well-established frequency-based distinction between general service words and academic words. Teachers should not assume that EAP students know the first 2,000 words of English. Numerous so-called general service words are not mastered productively by L2 learners, even at upper-intermediate to advanced levels of proficiency. However, these words serve important discourse-organizing functions in academic writing. As such, this suggests that they should be the target of teaching, particularly teaching aimed at productive activities.

Another fact that stands out is that a clear distinction should be made between vocabulary needs for academic reading and writing. My findings call into question the systematic use of Coxhead’s Academic Word List as the exclusive vocabulary syllabus in a number of recent productivity-oriented vocabulary textbooks. As a result, I have derived a productive counterpart to the Academic Word List, and have developed a rigorous and empirically-based procedure to select potential academic words for this list. The methodology makes use of the criteria of keyness, range and evenness of distribution, and provides a good illustration of the usefulness of POS-tagged corpora for applied purposes. One important feature of the methodology adopted here is that it includes the 2,000 most frequent words in English, thus making it possible to appreciate the paramount importance of core English words in academic prose.

The outcome of this procedure is the Academic Keyword List. This list should not, however, be regarded as an end product. In its current form (see Table 2.17), the list is the raw result of the application of purely quantitative criteria to native-speaker corpus data; it is not a list of academic vocabulary in a functional sense. Each word still needs ‘pedagogic mediation’ (Widdowson, 2003): its different meanings, lexicogrammatical patterning and phraseology in expert academic prose need to be carefully described and learner corpus data should be used to

complement these descriptions. This procedure has already been applied to the study of words that serve discourse functions (such as exemplifying, expressing cause and effect, comparing and contrasting) in academic prose.

I have shown that a phraseological approach to the description of academic vocabulary provides a mine of valuable information for pedagogical tools. The first result of this method has been to dethrone adverbs from their dominant position as default cohesive markers. Adverbs do not have a monopoly on lexical cohesion and discourse organization in academic writing. My results have provided ample evidence for the prominent discursive role of nouns, verbs and adjectives and their phraseological patterns, a role which is hardly ever mentioned in EFL/EAP teaching. These part-of-speech categories serve organizational functions as diverse as exemplification, comparing and contrasting, and expressing cause and effect. Second, the method has helped to demonstrate that an essential set of phrasemes in academic prose consists of ‘lexical extensions’ (Curado Fuentes, 2001: 115) of academic words (e.g. conclusion, issue, claim, argue). These words acquire their organizational or rhetorical function in specific word combinations that are essentially semantically and syntactically compositional (e.g. as discussed below, an example of . . . , the aim of this study is . . . , the next section aims at . . . , it has been suggested) (Oakey, 2002; Biber et al., 2004) and contribute to push ‘the boundary that roughly demarcates the “phraseological” more and more into the zone previously thought of as free’ (Cowie, 1998: 20).

A decade ago, Milton (1999: 223) commented that ‘a great deal of research [was] still necessary to describe with any empirical rigour the lexis that is characteristic of particular purposes, genres, and registers’. Since then, there has been a huge increase in the number of corpus-based studies highlighting the specificity of vocabulary and phraseology in different academic disciplines and genres. The primary motivation of these studies, however, has not been pedagogical. As a result, their findings do not easily lend themselves to being used in general EAP courses and it is now essential to find ways of reconciling research findings and the reality of EAP teaching practice.

The focus has been on words that are reasonably frequent in a wide range of academic texts, irrespective of discipline, and their preferred lexicogrammatical and phraseological patterns. As well as their common core features, however, these words may also have a discipline-specific phraseology (Granger and Paquot, 2009a). Different disciplines may also have their preferred ways of performing rhetorical or organizational functions. EAP tutors are left wondering how they can possibly meet the needs of all their

students in classes which are ‘often composed of students from different disciplines and/or language backgrounds with different purposes for taking the class’ (Huckin, 2003: 8). They do not know either what should be taught, for example, to law students who also have to take courses in economics, history, sociology or psychology. With the emergence of a wide range of interdisciplinary curricula, the problem is likely to become even more acute in the future, not only for students but also for their teachers as ‘it seldom happens, especially in mixed classes, that the LSP [Language for Specific Purposes] teacher has the disciplinary knowledge needed to provide reliably accurate instruction in technical varieties of language’ (Huckin, 2003: 6).

Faced with this difficulty, we have advocated elsewhere (Granger and Paquot, 2009a) a balanced approach which concurs with Hyland’s (2002b) plea for more specificity in EAP teaching while also subscribing to Eldridge’s view that an essential function of research is to identify ‘similarities and generalities that will facilitate instruction in an imperfect world’ (Eldridge, 2008: 111). We have shown that it is possible to identify both the common core features of an academic word and its discipline-specific characteristics in terms of meaning, lexico-grammar, phraseological patterns, etc.

One way of implementing this ‘happy medium’ approach in the classroom is to apply a data-driven learning methodology, which consists of making use of corpus data as a source of learning materials for language students (Johns, 1994). The study of ‘individualized’ examples derived from specialized corpora can be of considerable benefit in helping learners to appreciate the possible linguistic realisations of rhetorical and organizational functions in their own disciplines. As Charles put it,

although it may not be possible in all teaching situations to provide materials that are specifically tailored to the disciplines of the students taught, the process of investigation is itself of great value in raising students’ awareness of the patterned nature of academic discourse. With this understanding, students are better equipped to examine the ways in which grammatical patterns and lexical choices combine to perform rhetorical functions within their own disciplines and hence to apply this knowledge to their own academic writing. (2007: 216)

In a heterogeneous EAP class, where disciplinary variability constitutes a serious problem, this approach allows teachers to emphasize general academic words and phrasemes which ‘are not likely to be glossed by the content teacher’ (Flowerdew 1993: 236), while also empowering learners by giving them the tools to investigate authentic texts and practices

in their own disciplines, ‘thereby allowing considerations of subject specificity and disciplinary variation to inform classroom discussion’ (Groom, 2005: 273).

My journey into academic vocabulary – from the extraction of potential academic words through their linguistic analysis in expert and learner corpus data, to the pedagogical implications that can be drawn from the results – has contributed to fleshing out this concept and has convincingly demonstrated that academic vocabulary is anything but a chimera.

7.2. Learner corpora, interlanguage and second language acquisition

Contrastive Interlanguage Analysis (CIA) (Granger, 1996) involves two types of comparison. One compares native with non-native (or inter-) language, for example native English and the English produced by French-speaking learners. The other type of analysis compares two (or more) interlanguages, for example the English produced by French-speaking learners and the English produced by Italian-speaking learners. Although the CIA method has become quite popular, most studies using the method have been of the first type. Studies comparing more than one IL usually focus on learners from one mother tongue background and use data from one or two other learner populations only to check whether the features they have highlighted in one corpus are common to other learners, or are L1-specific (and so possibly transfer-related).

In this book, I have tried to make the most of CIA by systematically exploiting the two types of comparison it allows to examine EFL learners’ use of academic vocabulary. The results show that academic, and more precisely argumentative, essays written by upper-intermediate to advanced EFL learners share a number of linguistic features irrespective of the learners’ mother tongue backgrounds or language families. The common core of interlanguage features that characterize the expression of rhetorical and organizational functions in EFL writing includes a limited lexical repertoire and a lack of register awareness as well as lexico-grammatical and phraseological specificities. Several of these linguistic features, and more specifically, the lack of register awareness, the extensive use of chains of connective devices and a marked preference for placing connectors in the sentence-initial position, may also be found in novice native-speaker writing. However, other features such as lexico-grammatical errors, the semantic misuse of connectors and labels, the use of non-native-like sequences and the overuse of relatively rare expressions seem to be largely learner-specific.

A systematic analysis of several interlanguages is necessary to analyse the potential influence of developmental, teaching-induced and transfer-related factors on EFL learner writing. By focusing on shared features across L1 learner populations, I have highlighted the important role played by developmental and teaching-induced factors in learners’ written production. I have also shown that it is not always possible to attribute learner-specific features to a single factor, because developmental, teaching-induced and transfer-related effects can reinforce each other (Granger, 2004: 135–6). There are many other variables that interact in learners’ interlanguage which are also in need of careful operationalization.

Applying Jarvis’s (2000) methodological framework to learner corpus data has helped identify a number of transfer effects that until now have been largely undocumented in the SLA literature. Lexical transfer has too often been narrowed down to transfer of form/meaning mappings and the third aspect of word knowledge, i.e. use, has rarely been investigated. My study has helped to identify a number of transfer effects relating to word use that make up what, following Hoey (2005), I refer to as ‘transfer of primings’. Transfer of primings includes L1 influence on collocational use, lexico-grammatical and phraseological patterns, style and register preferences, discourse function, and frequency of use.

The valuable theoretical insights provided by a learner-corpus based approach to the study of L1 influence bring to the fore the potential contribution of learner corpora for SLA studies. Learner corpora are probably the best – if not the sole – type of learner interlanguage samples which can be used to investigate these transfer effects. In addition, they arguably provide a good account of the complexity and versatility of L1 influence. With its focus on frequency, register differences and phraseology, corpus linguistics clearly has numerous resources and specific tools to offer SLA researchers who wish to further investigate the manifestations of L1 influence on learners’ interlanguage. Learner corpora can clearly act as a test bed for studies that aim to provide empirical evidence for theories of second language acquisition. They are not the exclusive preserve of learner corpus researchers, and should feature prominently in the battery of data types used by all SLA specialists.

7.3. Avenues for future research

A promising area of research which has only been touched upon in this book lies in the investigation of patterns of difficulty shared by

mother-tongue English-speaking students and EFL learners. Novice native-speaker writers have been shown to have difficulty with academic language, and more particularly with its highly conventionalized phraseology. Howarth postulated the existence of a continuum of phraseological competence that would ‘encompass mature NS writers at one extreme and weak NNS writers at the other, with NS and NNS students of varying levels of proficiency in between, and some overlap between native and non-native writers’ (Howarth, 1999: 151). Hoey (2005) insisted that primings are constrained by register and genre. He gave the example of the word research which is primed in the mind of academic language users to occur with recent in academic discourse and news reports on research. The collocation is not primed to occur in other text types or other contexts. A direct implication of Hoey’s theory of lexical priming is that academic phraseology cannot be assumed to be primed in the mental lexicon of novice native-speaker writers who have had little contact with academic disciplines.

Further research is clearly needed to shed more light on the similarities and differences between EFL learners’ use of academic words and phrasemes and that of novice native-speaker writers. Such research would enable linguistic features that are characteristic of novice writing to be separated from those features that have commonly been attributed to EFL writing. New corpora such as the British Academic Written Corpus and the Michigan Corpus of Upper-level Student Papers are thus particularly welcome, as they consist of ESP texts produced by writers at different stages of undergraduate and graduate level study, both native and non-native speakers, in a variety of disciplines.1

L1 writing skills also need to figure more prominently in future research. It does not make sense to expect learners to write properly in English, and produce coherent and cohesive texts in a foreign language, if they cannot already perform this task in their mother tongue. Learner corpus research would greatly benefit from the design of comparable corpora of L1 and L2 writing produced by the same learners.

All in all, I have shown that the research paradigm of corpus linguistics is ideally suited to studying the lexical specificities of academic discourse in native-speaker and learner writing. The many corpora already available make it possible to examine a wide range of genres and text types. However, much more could be achieved in the field if other types of corpora were collected. In particular, longitudinal corpora of learner language are sorely lacking. There is also an urgent need for learner corpora which represent academic text types other than argumentative essays. A new corpus currently under development at Louvain, the Varieties of English for Specific Purposes dAtabase (VESPA) learner corpus, has been designed as the ESP

counterpart of the International Corpus of Learner English. It includes English for specific purposes texts written by L2 writers from various mother tongue backgrounds.

The role of frequency is a key issue in second language acquisition. However, it has generally been conceived of in terms of L2 frequency.2 Not a single article in the special issue of Studies in Second Language Acquisition (2002, Volume 24/2) is devoted to L1 frequency effects and their implications for second language acquisition. The volume largely focuses on input frequency, and its relation with language processing, intake,3 and implicit vs. explicit learning. Similarly, in a state-of-the-art article on SLA theory, Gregg (2003) only addresses the issue of frequency in relation to the role of input, thus restricting his discussion to the question of ‘how often does input of X need to be provided in order for X to be acquired?’ (Gregg, 2003: 846). The role of L1 frequency is particularly interesting, and can be expected to be the object of much attention in the next few years.

My journey into academic vocabulary has led me to explore a large number of fascinating fields of research, and experiment with a wide range of tools and methods. Navigating my way through the complexity of each of these research areas, I have sought to unify several aspects of English for academic purposes, learner corpus research, and second language acquisition into a coherent whole. The challenges presented by such a cross-disciplinary position have quickly been proved worthwhile by the fresh light the approach has shed on key issues such as the nature of academic vocabulary, the relative influence of developmental features and transfer effects, and the methodological aspects of interlanguage studies. Not only have a number of largely unrecognized transfer effects been brought to light, but the potential influence of L1 frequency on learner interlanguage has also been highlighted. New avenues of research can now be explored by SLA specialists, learner corpus researchers and teachers alike. There is still so much to explore. I hope that this book will serve as a starting block for further research into the many issues raised.
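
The appendices that follow report, for each lexical item, an absolute frequency (Abs.), a relative frequency (Rel.) and a log-likelihood value (LogL) comparing the learner corpus with the native-speaker academic corpus. As a reading aid, the sketch below shows one common way such figures are computed. It is an illustration under stated assumptions, not the procedure documented in this chapter: the two-corpus log-likelihood formula, the normalization base of 100,000 words, and the mapping of the markers to the chi-square critical values 3.84 (p < 0.05) and 6.63 (p < 0.01) are assumptions, and the function names and example counts are hypothetical.

```python
import math


def relative_frequency(count, corpus_size, per=100_000):
    """Normalize an absolute count to a rate per `per` running words.

    The base of 100,000 words is an assumption made for this sketch.
    """
    return count * per / corpus_size


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-corpus log-likelihood (G2) for the frequency of one item.

    Expected counts come from pooling both corpora; a zero observed
    count contributes nothing to the sum.
    """
    total = count_a + count_b
    pooled = size_a + size_b
    ll = 0.0
    for observed, size in ((count_a, size_a), (count_b, size_b)):
        expected = size * total / pooled
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll


def marker(count_a, size_a, count_b, size_b, weak=3.84, strong=6.63):
    """Return '++'/'+' for overuse in corpus A, '--'/'-' for underuse.

    An empty string means the difference is below the weaker threshold.
    The threshold-to-marker mapping is an assumption of this sketch.
    """
    ll = log_likelihood(count_a, size_a, count_b, size_b)
    if ll < weak:
        return ""
    overused = count_a / size_a > count_b / size_b
    if ll >= strong:
        return "++" if overused else "--"
    return "+" if overused else "-"


if __name__ == "__main__":
    # Hypothetical counts and corpus sizes, for illustration only.
    learner_count, learner_size = 120, 1_000_000
    native_count, native_size = 60, 3_000_000
    print(relative_frequency(learner_count, learner_size))
    print(relative_frequency(native_count, native_size))
    print(marker(learner_count, learner_size, native_count, native_size))
```

Because the statistic only measures how far the two frequencies diverge, the direction of the difference is read separately from the relative frequencies, which is why the marker function compares the two rates before choosing between an overuse and an underuse sign.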

Rel.7 6.7 16.5 35.5 83.6 1.3 (++) 27.8 26.802 1105 697 − − 450 223 269 − − − − − − − 54. LogL 19.2 (− −) 5.2 15 81. BNC−AC−HUM Abs.9 755 492 263 − 550 244 306 1.6 4.3 92.2 (− −) 5.4 40.4 13.6 23.175 577 598 − 500 286 214 − 183 72 111 80.6 87.1 (++) (Continued) . nouns cause cause causes *causae factor factor factors source source sources *sourse origin origin origins *origine root root roots reason reason reasons *reaons *reasongs consequence consequence consequences *consecvencies *consecuence *consecuences *consecuenses *consequencies *consequense *consequenses 314 127 186 1 229 100 129 274 194 78 2 60 48 11 1 173 112 61 939 563 374 1 1 319 76 227 1 2 3 2 4 1 3 14.3 (++) Rel.Appendix 1: Expressing cause and effect Comparisons based on total number of running words ICLE Abs.2 (++) 22.

33.7 24.7 101 23 21 32 18 7 3 6.2 .8 67. Rel.612 259.8 0.2 227 63 23 119 22 6.4 570 133 66 317 54 125 44 6 64 11 276 52 18 82 26 − 17.3 2.220 Appendix 1 ICLE Abs.3 9.8 10 8. 1.4 (− −) 499 140 106 220 33 51 25 10 14 2 116 61 20 21 13 1 14 3 2 9 0 20 8 4 3 5 0 42.4 170.4 4.03 (− −) 1 12.9 BNC−AC−HUM Abs. 55 84.2 211 (++) 3.9 (++) 2.8 4.8 (− −) LogL effect effect effects efect result result results *resut outcome outcome outcomes implication implication implications TOTAL NOUNS verbs cause cause causes caused causing bring about brings brings brought brining contribute to contribute contributes contributed contributing *contribuates generate generate generates generated generating give rise to give gives gave given giving 395 214 179 2 381 167 213 1 28 21 7 12 4 8 3.5 20.124 32.5 Rel.3 2.6 1.830 1249 581 813 502 311 − 143 135 8 411 93 318 268 8.4 (− −) 1.

4 476 77 68 297 34 − 2 2 0 0 0 42 0.6 15 7 8 0 0 356 184 83 72 17 12 4 2 3 3 4.9 0. 67 19 5 35 8 671 161 105 334 71 115 14 13 82 6 4.2 4.1 (− −) 2.2 (− −) (Continued) . 2 2.7 171 145 31 28 30 4 52 14. 1.2 50 14 8 16 8 1 2 1 8.8 (++) 1 3.8 0 30. induce induce induces induced inducing lead to lead leads led leading prompt prompt prompts prompted prompting provoke provoke provokes provoked provoking provocate provocated provoqued result in/from result results resulted resulting yield yield yields yielded yielding make sb/sth do sth# arise from/out of arise arises arose arisen arising derive derive derives derived deriving *derivated 39 12 8 15 3 1 489 8 4 2 2 0 0 3.1 (− −) 666.3 BNC−AC−HUM Abs.4 9.7 39.2 Rel.6 20.3 161 38 11 102 10 − − − 327 104 18 138 67 88 31 16 34 7 5.5 22.Appendix 1 ICLE Abs.5 Rel.1 (++) 46 (− −) 115.7 221 LogL 31.3 114 30 33 32 5 0.

466 107 95 221 43 74 33 35 5 1 1.6 (++) 175.847 158.5 53 344 397 1.9 0.7 7 (− −) Rel.109 45. Rel.9 13.89 531 246 79 7 5 8 17 7 199 3 7 1.7 1.8 229.6 0.3 0.321 18 5.6 68 14 22 23 9 4.1 6.1 8.6 17 0.2 (− −) LogL emerge emerge emerges emerged emerging follow from 33 2.8 11 6 15 1 4 follow follows followed following 1 0 2 1 8 trigger triggers triggered triggering 5 0 3 0 7 stem stems stemmed stemming 1 5 0 1 1. 14 126.1 (++) 3 0.7 56 14 3 27 12 2.4 12 3.6 21.2 23.9 14. 2.174 125.19 433.7 15.6 (− −) 360.8 0.3 (− −) 0.1 31.3 BNC−AC−HUM Abs.9 5.7 39.8 (++) 0.6 (− −) 0.5 0.222 Appendix 1 ICLE Abs.1 (++) 1.4 (++) . 10 171 181 0.6 10.5 0.4 0.7 13.7 0.3 1 0.6 95 599 195 196 22 1 66 52 109 35 22 24 1.1 0.7 (++) 10.7 (− −) trigger stem from TOTAL VERBS adjectives consequent responsible (for) TOTAL ADJ.7 66.6 3.3 (++) 4.7 0 2 1. prepositions because of due to as a result of as a consequence of in consequence of in view of owing to in (the) light of thanks to on the grounds of on account of TOTAL PREP.

5 0.7 28.9 0 257.3 3.3 8.4 (++) 38.495 2. Adverbs therefore therefore *therefor accordingly consequently consequently *consecuently thus hence so thereby as a result as a consequence in consequence by implication TOTAL ADVERBS conjunctions because because *becausae *becaus since## as ## 223 BNC−AC−HUM LogL Rel.4 1.7 (−) 132.74 26.9 (− −) 2.1 (++) 2.4 5 23.912 26.5 3 0.1 180 41.4 359 (++) 381.2 21 1.4 2.412 42.2 15.1 (− −) 243.3 7.7 130 143 3.036 696 52 22 18 12 83 5.2 (− −) 33.1 (++) 1 325.5 1 0.998 60.Appendix 1 ICLE Abs.207 66.810 13.9 1121 955 883 1.2 8.2 1.981 53.9 (++) for so that PRO is why that is why this is why which is why on the grounds that TOTAL CONJ. TOTAL 220 189 18 12 5 3.3 (− −) 457.7 (++) 34.436 15 103 35 11 0 2.9 4.2 1.6 123.8 21.4 18.5 54.9 17.407 28.3 (++) 2.3 (++) 3.9 16.5 178 794.3 25 (− −) 809.767 283 1. 701 689 12 26 183 179 4 446 42 1.56 0.4 1553.1 (++) 989.5 57 5.8 (++) 43.8 3 0.6 0. Abs.8 (− −) 55.2 1.066 .7 (++) 24.894 182 101 20 14 35 5.7 (++) 0. Rel.4 326.7 0.1 1.493 1 1 428 331 58 273 214 2.8 (++) 36.6 31.

7 0.7 0.8 (− −) 14.1 0.5 1.1 0.7 6.3 1.9 (− −) 4.4 1.2 0.1 10 171 181 0.8 1.7 6.1 570 125 276 227 101 67 671 115 161 327 88 171 145 476 466 74 56 68 4.3 0. nouns cause factor source origin root reason consequence effect result outcome implication TOTAL NOUNS Verbs cause bring about contribute to generate give rise to induce lead to prompt provoke result in yield make sb/sth do sth# arise from/out of derive emerge follow from trigger stem TOTAL VERBS adjectives consequent responsible (for) TOTAL ADJ.1 2.8 24.6 6.1 0.1 93.6 (− −) 16.2 0.3 7.9 2.8 84.6 1.4 53 344 397 0.5 (− −) 8.0 0.1 0.3 1.1 14.1 0.2 (− −) % BNC−AC−HUM Abs.5 1.8 (− −) 0.3 15.1 23.5 (− −) 192.802 450 1.3 (++) 2.9 (− −) 9 (− −) 1.5 1.2 0.5 0.5 9.4 0. % LogL .847 3.8 1.4 3 2.9 (− −) 10.6 (− −) 0 0.4 2.9 3.9 0.6 0.2 2.8 2.1 599 195 196 22 2.8 0.4 (++) 1.4 1.1 4.9 0.3 2.224 Appendix 1 Comparisons based on total number of ‘cause and effect’ lexical items ICLE Abs.1 1.9 (− −) 56 (− −) 463.2 0.1 39.4 (++) 1.8 0.2 (− −) 314 229 274 60 173 939 319 395 381 28 12 3.4 0.9 499 51 116 14 20 15 356 12 50 114 2 489 8 39 33 4 8 7 1.9 0.175 500 183 1.3 0.124 2.6 (− −) 16.7 0.3 (− −) 153.3 (− −) 36.55 1.0 3.174 2.4 0.9 0.3 (++) 95.9 755 550 1.1 0.9 0.612 2.7 32.3 0.4 0.6 1.9 145.3 0.7 23.7 0.9 0.7 0.7 (− −) 201 (− −) 36. prepositions because of due to as a result of as a consequence of 531 246 79 7 4.3 0.1 0.1 1.5 (++) 263.2 1.4 (− −) 274 (− −) 231.2 106.2 0.6 (++) 71.830 813 143 446 8.5 (− −) 23.

4 (− −) 72.1 (− −) 0. .894 182 101 20 14 35 5.9 2.1 (++) 5 8 17 7 199 3 7 1109 701 26 183 446 42 1.7 (++) 6 1.5 0.0 0.2 (− −) 270.4 0.1 0.4 (++) 1.407 8. 1 66 52 109 35 22 24 1321 % 0 0.3 3.4 0.1 (− −) 2.0 0. in consequence of in view of owing to in (the) light of thanks to on the grounds of on account of TOTAL PREP.7 (− −) 70.4 (− −) 26.4 50.3 0.4 0. adverbs therefore accordingly consequently thus hence so thereby as a result as a consequence in consequence by implication TOTAL ADVERBS Conjunctions because since## as## for so that PRO is why that is why this is why which is why on the grounds that TOTAL CONJ.436 15 103 35 11 0 2.1 0.2 (− −) 144.767 283 1.810 13.036 696 52 22 18 12 83 5.3 22.6 0.412 130 143 1.1 0 23 1.1 0.6 (++) 182.0 21.5 (++) 14.6 (++) 2.3 2.Appendix 1 ICLE Abs.8 0.2 100 2.2 (++) 21.1 0.5 0.1 7.2 0.2 1.1 0.8 (++) 1.2 0.92 (−) 262.9 (++) 73. Estimations based on an analysis of the first 200 occurrences of the word in each corpus.912 26.3 (− −) 507.0 0.1 3.8 (++) 294.3 28.998 5.5 0.495 428 331 58 273 220 189 28 3 5 3.7 164.59 (− −) 10.1 0.4 3.5 Abs.3 (++) ## Estimations based on Gilquin (2008).6 0.3 2.3 39.1 22.1 1.7 1.066 19.2 0.3 11 0.6 3.0 0.4 3.3 0.2 0.4 0.9 19.0 29.7 0.4 100 790.4 2.1 0.4 0.1 1.207 955 883 1. TOTAL # 225 BNC−AC−HUM LogL % 0.45 20.5 6.7 0.4 (− −) 158.1 0.1 8.1 0.7 1.1 5 6.981 5.

3 4 1.1 0.3 0.1 1.3 15.5 0.4 7.6 9.8 16 16.1 0.318 802 516 − − 3.5 1.5 0.3 0 2. Rel.6 1.1 0. nouns resemblance resemblance resemblances similarity similarity similarities *similarieties *similiraty parallel parallel parallels parallelism parallelism parallelisms *paralelism *parallelim analogy analogy analogies contrast contrast contrasts comparison comparison comparisons *comparaison *comparision difference difference differences *differencies *difference 3 3 0 25 18 7 38 36 0 1 1 394 187 191 6 3 3 1 0 1 1 6 6 0 2 2 0 25 7 16 1 1 0.1 0.2 0.3 0.19 35.1 33.38 3.4 0.1 0 0.5 0 0. LogL 4.1 15.4 0.1 0 0.3 0.1 0.3 2.9 (− −) Rel.7 24.9 (− −) 178.5 8 (− −) .6 3.5 0.6 0.3 54 (− −) 2 5.3 (− −) 39.49 3 0.2 0 2.4 2.19 3.3 (− −) 49.5 6. BNC−AC−HUM Abs.3 3.2 1.2 (− −) 54.3 116 100 16 212 106 106 − − 147 76 71 19 10 9 − − 175 133 42 522 470 52 311 249 62 − − 1.Appendix 2: Comparing and contrasting Comparisons based on total number of running words ICLE Abs.1 0.7 14.9 82.1 0.

7 127.3 2.3 30.8 (− −) 20.2 0.3 (− −) 10.9 3 2 0 1 47 38 9 2 246 1 17 16 1 44 40 4 5 860 1 1 1 1 1 1 1 Rel.7 13.4 1 0.1 75.1 0.1 3.9 30.4 0.1 130 129.1 1.8 13. 0.3 0.2 21.1 0.3 0.515 1510 2 1 2 4 90.5 0.8 (− −) 31.1 1.580 1.8 0.1 0.2 0.2 0.1 23.1 0.6 Rel.78 3.1 0.6 4.8 (++) 110.1 0.3 2.2 1.3 16.8 3 2.496 2496 − − − 72 1.9 (− −) 3.1 0.1 4.6 1.3 12.7 30. − − − − − − − 76 72 4 − 595 498 97 10 559 − 28 27 1 85 58 27 56 4.8 6. 227 LogL 28.229 0.5 1.3 (− −) 148.1 17.1 0.2 0 0.4 2 0.4 0.1 25.6 3 1.1 0.9 15 2.1 0.6 (− −) 283.7 4.1 0.9 0.7 (− −) 18.75 (− −) (Continued) .5 (− −) 55 1055 223 137 52 98 63 2.8 0.027 77.Appendix 2 ICLE Abs.3 (− −) 268 (++) 2.5 (+) 2.17 22.1 3.4 73.3 0.058 160 157 2 1 1 275 16 12 5 23 1 1.7 31.4 (− −) 8.6 0.4 (− −) 59.8 BNC−AC−HUM Abs. *diference *difference *difference *differency *differene *difference *diffrences differentiation differentiation differentiations *differenciation distinction distinction distinctions distinctiveness (the) same *similars (the) contrary contrary contraries (the) opposite opposite opposites (the) reverse TOTAL NOUNS Adjectives same similar similar *similiar *simmilar analogous common comparable identical parallel alike contrasting different different *differents *differrent *diffrent differing 1.9 75.

Appendix 2 (continued)

Columns: ICLE (Abs., Rel.), BNC-AC-HUM (Abs., Rel.), LogL.
[Table values not recoverable; row labels follow.]

Adjectives (continued): distinct (distinct, *distinc); distinctive; distinguishable; unlike; contrary; opposite; reverse (reverse, *reversed); TOTAL ADJECTIVES

Verbs: resemble (resemble, resembled, resembles, resembling); correspond (correspond, corresponded, corresponds, corresponding); look like (look like, looks like, looked like, looking like); compare (compare, compared, compares, comparing); parallel (parallel, parallels, paralleled, paralleling); contrast (contrast, contrasted, contrasts, contrasting); differ (differ, differs, differed); distinguish (distinguish, distinguished, distinguishes, distinguishing, *distinquish, *distingush); differentiate (differentiate, differentiates, differentiated, differentiating, *differenciate); TOTAL VERBS

Adverbs: similarly (similarly, *similarely, *similarily, *similary); analogously; identically; correspondingly; parallely; likewise; in the same way; contrastingly; differently; by/in contrast (by contrast, in contrast); by way of contrast; by/in comparison (by comparison, in comparison); comparatively (comparatively, *comparitively); contrariwise; distinctively; on the other hand; (on the one hand); *on the other side; *on the opposite; on the contrary (on the contrary, *on the contray, *on the contrairy); Other expressions with contrary (*in contrary, *by the contrary, *to the contrary, quite the contrary, *in the contrary, rather the contrary, *quite contrary, *contrary); reversely; conversely; TOTAL ADVERBS

Prepositions: like#; unlike; in parallel with; as opposed to; as against; in contrast to/with (in contrast to, in contrast with); versus; contrary to; *in contrary to; *opposite to; by/in comparison with (in comparison with, in comparison to, by comparison with); TOTAL PREP.

Conjunctions: as#; while#; whereas (whereas, wheras); TOTAL CONJ.

Other expressions: as ... as; in the same way as/that; compared with/to (compared with, compared to); CONJ compared to/with (as compared to/with, when compared to/with, if compared to/with)

TOTAL

Appendix 2 (continued)

Comparisons based on total number of ‘comparison and contrast’ lexical items

Columns: ICLE (Abs., %), BNC-AC-HUM (Abs., %), LogL.
[Table values not recoverable; row labels follow.]

Nouns: resemblance; similarity; parallel; parallelism; analogy; contrast; comparison; difference; differentiation; distinction; distinctiveness; (the) same; (the) contrary; (the) opposite; (the) reverse; TOTAL NOUNS

Adjectives: same; similar; analogous; common; comparable; identical; parallel; alike; contrasting; different; differing; distinct; distinctive; distinguishable; unlike; contrary; opposite; reverse; TOTAL ADJECTIVES

Verbs: resemble; correspond; look like; compare; parallel; contrast; differ; distinguish; differentiate; TOTAL VERBS

Adverbs: similarly; analogously; identically; correspondingly; parallely; likewise; in the same way; contrastingly; differently; by/in contrast (by contrast, in contrast); by way of contrast; by/in comparison (by comparison, in comparison); comparatively; contrariwise; distinctively; on the other hand; (on the one hand); *on the other side; *on the opposite; on the contrary; Other expressions with contrary; reversely; conversely; TOTAL ADVERBS

Prepositions: like#; unlike; in parallel with; as opposed to; as against; in contrast to/with; versus; contrary to; *in contrary to; *opposite to; by/in comparison with; TOTAL PREP.

Conjunctions: as#; while#; whereas; TOTAL CONJ.

Other expressions: as … as; in the same way as/that; compared with/to; CONJ compared to/with

TOTAL

# Estimations based on an analysis of the first 200 occurrences of the word in each corpus.
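The LogL column in the tables above reports a log-likelihood value for each item when its frequency in ICLE is compared with its frequency in BNC-AC-HUM. As a minimal sketch only, the following shows one common way such a value is computed for two corpora (the two-cell formulation often used in corpus linguistics). It is an assumption that the appendix uses this exact variant; the function names, the per-100,000-word normalisation and all counts and corpus sizes below are hypothetical and are not taken from the tables.

```python
import math


def relative_frequency(freq, corpus_size, per=100_000):
    """Normalised frequency (assumed here to be per 100,000 words)."""
    return freq * per / corpus_size


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood for one lexical item.

    freq_a, freq_b: absolute frequencies of the item in corpus A and B.
    size_a, size_b: total number of words in corpus A and B.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical item: 200 hits in a 1,000,000-word learner corpus,
# 300 hits in a 3,000,000-word reference corpus.
ll = log_likelihood(200, 1_000_000, 300, 3_000_000)
overused = relative_frequency(200, 1_000_000) > relative_frequency(300, 3_000_000)
print(round(ll, 1), "overuse" if overused else "underuse")
```

The score itself is always non-negative; the direction of the difference (more or less frequent in the learner corpus) has to be read off the relative frequencies, which is presumably what the plus and minus markers next to the LogL values encode.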

Notes

Chapter 1

See Stein (2008) for a review of major twentieth-century projects aimed at developing a controlled vocabulary for foreign language learners.

This specific set of abstract nouns has variously been referred to as ‘signalling words’ (Jordan, 1984), ‘anaphoric nouns’ (Francis, 1986), ‘carrier nouns’ (Ivanič, 1991), ‘shell nouns’ (Schmid, 2000) and ‘discourse-organising words’ (McCarthy, 1991).

Chapter 2

If a text is 75,000 words long, it has 75,000 tokens. But a lot of these words will be repeated, and there may be only 2,000 different words (called types) in the text.

See the definition of a reference corpus proposed by the Expert Advisory Group on Language Engineering Standards (EAGLES96) at http://www.ilc.cnr.it/EAGLES96/corpustyp/node18.html (accessed 2 August 2009).

Each of these corpora consists of one million words of British or American written English. The four corpora are equivalent in the sense that they were compiled using the same corpus design and sampling methods. For more information about these corpora, see http://khnt.hit.uib.[…] (accessed 2 August 2009).

Sentence examples are taken from the Longman Dictionary of Contemporary English (2005).

See […].lancs.[…].html for a list of tags used in the CLAWS C7 tagset (accessed 2 August 2009).

The BAWE Pilot Corpus was a pilot for the ESRC funded project ‘An investigation of genres of assessed writing in British higher education’ (RES-000-23-0800). It was created in 2001 under the directorship of Hilary Nesi, with support from the University of Warwick Teaching Development Fund.

The British Academic Written English (BAWE) corpus was developed at the Universities of Warwick, Reading and Oxford Brookes under the directorship of Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics at Warwick University), Paul Thompson (Department of Applied Linguistics, Reading) and Paul Wickens (Westminster Institute of Education, Oxford Brookes), with funding from the ESRC (RES-000-23-0800). The BAWE corpus contains 2761 pieces of proficient assessed student writing. Holdings are fairly evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences). Thirty-five disciplines are represented.

Katz (1996: 19) distinguishes between ‘document-level burstiness’, i.e. ‘multiple occurrences of a content word or phrase in a single-text document, which is contrasted with the fact that most other documents contain no instances of this word or phrase at all’, and ‘within-document burstiness’ or ‘burstiness proper’, i.e. the ‘close proximity of all or some individual instances of a content word or phrase within a document exhibiting multiple occurrences’.

Scott’s (2004) WordSmith Tools 4 can compute Juilland’s D values, but only for words in a single text, based on an arbitrary division of a text into 8 segments of equal size.

Available at http://www.lextutor.[…] (accessed 2 August 2009).

Chapter 3

A random sample of 20 essays from each of the 16 L1 sub-corpora available in the second version of ICLE was submitted to a professional rater who was asked to rate them on the basis of the Common European Framework of Reference for Languages (CEF) descriptors for writing. While 60 per cent of the sample essays were rated as advanced (C1 or C2), the proportion was much higher in some sub-corpora, reaching 100 per cent for students with Swedish mother tongue, but falling as low as 40 per cent for Spanish speakers (Granger et al., 2009: 11–12).

ICLEv2 now also includes texts written by students with Chinese, Japanese, Norwegian, Turkish and Tswana mother tongue backgrounds (cf. Granger et al., 2009).

ICLE also comprises a Bulgarian sub-corpus. Essays written by Bulgarian-speaking learners were mainly written without the help of reference tools and were therefore not included in the analysis.

Texts longer than 45,000 words were sampled so as to allow for a wider coverage of text types and avoid over-representation of idiosyncratic uses. This design criterion, however, causes problems for certain types of linguistic enquiries, especially when the lexical items under study are closely linked to specific parts of texts (e.g. words and phrasemes used to introduce the main topic or a conclusion). A number of studies in the field of English for academic purposes have shown that words may behave differently and display different preferred lexicogrammatical environments in different sections of a text (see, for example, Gledhill, 2000). Quantitative comparisons between the BNC and ICLE thus have to be treated with caution.

http://www.oed.[…] (accessed 2 August 2009).

Chapter 4

In f[n, c], f is the frequency, n the node and c the collocate.

See Stefan Evert’s webpage (http://www.collocations.de/index.html (accessed 2 August 2009)) for a comprehensive list of measures of association and their mathematical interpretation.

These three nouns are listed under the first sense of ‘classic’ in LDOCE4.

These figures are based on disambiguated data.

The instances of illustrate used in the sense of ‘to put pictures in a book, article, etc’ are not included.

Estimations based on an analysis of the first 200 occurrences of the preposition in the BNC-AC-HUM.

See Miller and Weinert (1995), Siegel (2002) and Biber et al. (1999: 562) for specific functions of like in speech. See Müller (2005: 197–228) for an analysis of like as a discourse marker.

Estimations based on an analysis of the first 200 occurrences of the conjunction in the BNC-AC-HUM.

Chapter 5

The ‘word list’ option of WST4 was used to search for any misspelt form of the words under study in the ICLE.

The underuse of the conjunctions as and while reported here must be treated with caution as it results from estimations based on an analysis of only the first 100 occurrences of each conjunction in each corpus.

The relative frequencies of for instance and example are higher in most learner corpora than in the BNC-AC-HUM. When the learner corpora for different mother tongues are analysed separately, however, the differences in use are only significant for a few groups. Aggregated frequencies thus also help to reveal general, though moderate, overuse in learner corpora in general.

AKL words are printed in bold in these examples.

Other verb co-occurrents that are quite frequent in the BNC-SP but not found in the BNC-AC-HUM are the verbs get and think.
So we’ve got some examples here of some patterns that we want to learn using the N tuple method and tuple and tuple. (BNC-SP)
Again think of the example of erm erm a social club you know, relationships between members, although they may be close and intimate and friendly and all that, are not the same as a relationship between members of a family. (BNC-SP)

The noun root is overused in the ICLE largely because it appears in an essay title given to some of the EFL learners, ‘In the words of the old song: “Money is the root of all evil”’, which learners then tend to work into their essays.

John Osborne (Université de Haute Savoie, France) kindly pointed out to me that the sequence according to me also appeared in published textbooks such as Ok! (Lacoste and Marcelin, Nathan 1984), which was widely used in French colleges throughout the 1980s and early 1990s.

This does not mean, however, that there are no idioms, similes, phrasal verbs, compounds, commonplaces and allusions to proverbs and quotations in academic prose. As shown by Gläser, ‘authors of scientific writing are prone to modify idioms, proverbs, and quotations for intellectual punning and sophisticated allusions’ (1998: 143). Studies focusing on terminological terms used in English for Specific Purposes have also revealed the pervasiveness of compounds (e.g. Bourigault et al., 2004) in specialized texts.

Gledhill (2000) uses the term ‘collocational cascade’; following Granger and Paquot (2008a), I prefer to avoid using the adjective ‘collocational’ to refer to sequences of co-occurrents.

[P] indicates a new paragraph in learner writing.

A related problem is that of punctuation. EFL learners sometimes omit commas after sentence-initial subordinate clauses or connectors or before and after appositives such as that is and that is to say (e.g. what matters is relative poverty* that is to say* the sudden decrease of […]; ICLE-IT). By contrast, they sometimes erroneously use a comma after the conjunctions although or (even) though (e.g. When I compare these languages I do not consider English as an easy language, although, I do admit that I have noticed some things that are easier about English than about the other languages that I had the chance to learn; ICLE-PO).

The figures should be treated with caution as the LOCNESS corpus is quite small.

Osborne (2008) compared adverb placement in the various interlanguages represented in the first version of the International Corpus of Learner English and found that ‘V-Adv-O order is most frequent in the productions of learners whose L1 has verb-raising (French, Italian and Spanish), and least frequent with speakers of V2 languages (Dutch, German and Swedish), with speakers of nonraising languages (Russian, Polish, Czech and Bulgarian) in between’ (Osborne, 2008: 77).

See Paquot (2008b and in preparation) for details on the corpus linguistics methods and statistical measures used to operationalize Jarvis’s (2000) framework on learner corpus data. The results reported here are only preliminary.

Chapter 6

The quality of the teaching material on the use of connectors in English that is freely available on the Internet is generally quite alarming, especially given that students increasingly use the Internet for study purposes (http://www2c.[…]methodes/Expressions_et_mots_de_liaison.htm; last accessed: 30 July 2009).

It is noteworthy that, when the preferred sentence position of individual connectors is taught, the information is often neither corpus-based nor confirmed by corpus data. According to von Mayer, […] As shown in Section 4.2.1, however, […]

Cohesion is often dealt with in grammars, where the focus is always on connectors. In the new corpus-based Cambridge Grammar of English (Carter and McCarthy, 2006), although there is a chapter on textual cohesion (‘Grammar across turns and sentences’, pp. 242–62) as well as a full chapter on ‘Grammar and Academic English’ (pp. 266–94), no attention is given to lexical cohesion.

See Gilquin et al. (2007a) for a detailed discussion of the role of corpora, and more specifically, learner corpora in the design of EAP materials and for possible explanations of the relatively modest role that corpora have played so far.

The ‘Improve your writing skills’ section in the MED2 shows how a rigorous corpus-based method can help users achieve higher levels of accuracy and fluency in academic writing. However, to achieve maximum efficiency, it is essential to explore ways of integrating this type of description into the microstructure of dictionaries rather than inserting it as a separate middle section. The Centre for English Corpus Linguistics (Université catholique de Louvain) has therefore recently launched a new dictionary project which consists of a web-based EAP dictionary-cum-writing aid tool, the Louvain EAP Dictionary (LEAD) (see Granger and Paquot, 2008b and 2010). This project is innovative in two main respects: it allows for both semasiological (via the lexeme) and onomasiological (via the concept) access and is customizable according to the learner’s mother tongue and the field in which he or she is specializing (business, medicine, etc.).

Chapter 7

The Centre for English Corpus Linguistics launched the LONGDALE project in January 2008, with the intention of building a large longitudinal database of learner English containing data from learners with a wide range of mother tongue backgrounds. In the LONGDALE project, the same students will be followed over a period of two to three years.

The major role of L1 frequency has been identified in a few transfer studies focusing on phonology and syntax (Selinker, 1992: 211; Kamimoto et al., 1992).

De Bot et al. distinguish between input and intake as follows: ‘“Input” is everything around us we may perceive with our senses, and “uptake” or “intake” is what we pay attention to and notice’ (2005: 8).

Feldman. Metadiscourse in L1 and L2 English. Studia Linguistica. Hasselgren (eds). in Granger S. R. A. Amsterdam: Rodopi. K. (ed.comp. (2006). Ädel. pp. D. 247–57. and Rayson. (1998). S. J. in Granger. ‘Modality in advanced Swedish learners’ written interlanguage’.ac. A. and Burnard. Hung. (2002).). (eds). Aarts. 1–15. ‘Involvement features in writing: do time and interaction trump register awareness’. Altenberg. H. B. 80–93. (1993). pp.. G. F. Archer. (2002). Wilson. Language Learning and Language Teaching 6. (1998). L.. D. Amsterdam: John Benjamins. and Others. (ed. (ed. ‘Does corpus linguistic exist? Some old and new issues’. Göteborg: Acta Universitatis Gothoburgensis. ELT Journal. A Wealth of English. (2009a). Learner English on Computer. J. ‘On the phraseology of spoken English: the evidence of recurrent word-Combinations’. G. Archer. S. K. pp. ‘Lexical collocations: a contrastive view’. Oxford: Oxford University Press. ‘Morphological influences on the recognition of monosyllabic monomorphemic words’. Bahns. and Granger. What’s in a Wordlist? Investigating Word Frequency and Keyword Extraction. pp. 35–53. J. (2008).). in Gilquin. ‘The use of adverbial connectors in advanced Swedish learners’ written English’. (2006).). Altenberg. and Petch-Tyson. Aston. Amsterdam and Philadelphia: John Benjamins. B. and Tapper. Second Language Acquisition and Foreign Language Teaching. 53. Phraseology: Theory. Computer Learner Corpora. Amsterdam and Atlanta: Rodopi.lancs. Ädel. Aijmer. (1998). (eds). (2006). (ed. S. The BNC Handbook.) (2009). pp. Introduction to the USAS category system. and Schreuder. S. (1984). (2002). D. D. Analysis and Applications. London and New York: Routledge. and Diez-Bedmar B. Edinburgh: Edinburgh University Press. Papp. Academic Writing: A Handbook for International Students (2nd edition). R. Studies in Honour of Göran Kjellmer. Learner English on Computer. pp. in Archer. From the COLT’s Mouth . P. Farnham: Ashgate. in Cowie. (2001). . (ed.. in L. 
in Aijmer.). Altenberg. Baayen. London and New York: Addison Wesley Longman. (1998). Aijmer. guide. What’s in a Word-list? Investigating Word Frequency and Keyword Extraction. 55–76. A. Journal of Memory and Language. 38. ‘Tag sequences in learner corpora: a key to interlanguage grammar and discourse’. 56–63. Farnham: Ashgate. P.References1 Aarts. K. 20–69. Linking up Contrastive and Learner Corpus Research. ‘Does frequency really matter?’. 47 (1). 1–17. 496–512. pp.pdf. A.. . 132–41. in Granger S. Bailey. M. ‘Causal linking in spoken and written English’. J. 101–22. B. London and New-York: Addison Wesley Longman. S. Archer. . ‘I think as a marker of discourse style in argumentative Swedish student writing’. L.). Available from http://www. M. Breivik and A. (ed.

Baker, M. (1988). ‘Sub-technical vocabulary and the ESP teacher: an analysis of some rhetorical items in medical journal articles’, Reading in a Foreign Language, 4, 91–105.
Baker, P. (2004). ‘Querying keywords: questions of difference, frequency and sense in keyword analysis’, Journal of English Linguistics, 32 (4), 346–59.
Barkema, H. (1996). ‘Idiomaticity and terminology: a multi-dimensional descriptive model’, Studia Linguistica, 50 (2), 125–60.
Bartning, I. (1997). ‘L’apprenant dit avancé et son acquisition d’une langue étrangère: tour d’horizon et esquisse d’une caractérisation de la variété avancée’, AILE, 9, 9–50.
Bauer, L. and Nation, P. (1993). ‘Word families’, International Journal of Lexicography, 6 (4), 253–79.
Bazerman, C. (1994). ‘Systems of genres and the enactment of social intentions’, in Freedman, A. and Medway, P. (eds), Genre and the New Rhetoric. London: Taylor and Francis, pp. 79–101.
Beheydt, L. (2005). ‘The development of an academic vocabulary’, in Battaner, P. and DeCesaris, J. (eds), De lexicografia. Actes del I Symposium Internacional de Lexicografia. Série actvitats 15. Barcelona: Institut universitari de linguistica applicada, pp. 241–50.
Bhatia, V. (2002). ‘A generic view of academic discourse’, in Flowerdew, J. (ed.), Academic Discourse. Harlow: Longman, pp. 21–39.
Biber, D. (1988). Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, D. (2006). University Language: A Corpus-based Study of Spoken and Written Registers. Amsterdam and Philadelphia: John Benjamins.
Biber, D. and Conrad, S. (1999). ‘Lexical bundles in conversation and academic prose’, in Hasselgård, H. and Oksefjell, S. (eds), Out of Corpora: Studies in Honour of Stig Johansson. Amsterdam: Rodopi, pp. 181–90.
Biber, D., Conrad, S. and Cortes, V. (2004). ‘If you look at . . .: lexical bundles in university teaching and textbooks’, Applied Linguistics, 25 (3), 371–405.
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999). Longman Grammar of Spoken and Written English. Harlow: Longman.
Billuroğlu, A. and Neufeld, S. (2007). BNL 2709: The most commonly used words in English. Fourth Edition. Nicosia: Rüstem Kitabevi.
Biskup, D. (1992). ‘L1 influence on learners’ renderings of English collocations: a Polish/German empirical study’, in Arnaud, P. and Béjoint, H. (eds), Vocabulary and Applied Linguistics. London: Macmillan, pp. 85–93.
Bley-Vroman, R. (1983). ‘The comparative fallacy in interlanguage studies: the case of systematicity’, Language Learning, 33, 1–17.
Bourigault, D., Aussenac-Gilles, N. and Charlet, J. (2004). ‘Construction de ressources terminologiques ou ontologiques à partir de textes: un cadre unificateur pour trois études de cas’, Revue d’Intelligence Artificielle, 18 (1), 87–110. Available from http://w3.univ-tlse2.[…].doc
Bowker, L. (2003). ‘Corpus-based applications for translator training: exploring the possibilities’, in Granger, S., Lerot, J. and Petch-Tyson, S. (eds), Corpus-based Approaches to Contrastive Linguistics and Translation Studies. Amsterdam: Rodopi, pp. 169–83.
Bowker, L. and Pearson, J. (2002). Working with Specialized Text: A Practical Guide to Using Corpora. London and New York: Routledge.

Brill, E. (1992). ‘A simple rule-based part of speech tagger’, Proceedings of ANLP-92, 3rd Conference on Applied Natural Language Processing. Available from http://citeseer.[…].psu.[…].html
Burger, H. (1998). Phraseologie. Eine Einführung am Beispiel des Deutschen. Berlin: Erich Schmidt Verlag.
Burnard, L. (2007). Reference Guide for the British National Corpus (XML edition). Available from http://www.natcorp.ox.ac.[…]
Campion, M. and Elley, W. (1971). An Academic Vocabulary List. Wellington: NZCER.
Carter, R. (1998 [1987]). Vocabulary: Applied Linguistic Perspectives (2nd edition). London: Routledge.
Carter, R. (1998b). ‘Orders of reality: CANCODE, communication, and culture’, ELT Journal, 52 (1), 43–56.
Carter, R. and McCarthy, M. (1988). ‘Lexis and discourse: vocabulary in use’, in Carter, R. and McCarthy, M. (eds), Vocabulary and Language Teaching. New York: Longman, pp. 201–20.
Carter, R. and McCarthy, M. (2006). Cambridge Grammar of English: A Comprehensive Guide. Spoken and Written English Grammar and Usage. Cambridge: Cambridge University Press.
Celce-Murcia, M. and Larsen-Freeman, D. (1999). The Grammar Book: An ESL/EFL Teacher’s Course (2nd edition). Boston: Heinle and Heinle.
Charles, M. (2007). ‘Argument or evidence? Disciplinary variation in the use of the noun “that” pattern in stance construction’, English for Specific Purposes, 26 (2), 203–18.
Chen, C. W. (2006). ‘The use of conjunctive adverbials in the academic papers of advanced Taiwanese EFL learners’, International Journal of Corpus Linguistics, 11 (1), 113–30.
Chung, T. and Nation, P. (2003). ‘Technical vocabulary in specialised texts’, Reading in a Foreign Language, 15 (2). Available from http://www.nflrc.hawaii.edu/rfl/
Clear, J. (1993). ‘From Firth principles: computational tools for the study of collocation’, in Baker, M., Francis, G. and Tognini-Bonelli, E. (eds), Text and Technology: In Honour of John Sinclair. Amsterdam: John Benjamins, pp. 271–92.
Cohen, A., Glasman, H., Rosenbaum-Cohen, P., Ferrera, J. and Fine, J. (1988). ‘Reading English for specialised purposes: discourse analysis and the use of student informants’, in Carrell, P., Devine, J. and Eskey, D. (eds), Interactive Approaches to Second Language Reading. New York: Cambridge University Press, pp. 152–67.
Coltier, D. (1988). ‘Introduction et gestion des exemples dans les textes à thèse’, Pratiques, 58, 23–41.
Connor, U. (1996). Contrastive Rhetoric: Cross-Cultural Aspects of Second-Language Writing. Cambridge: Cambridge University Press.
Conrad, S. (1999). ‘The importance of corpus-based research for language teachers’, System, 27, 1–18.
Conrad, S. (2004). ‘Corpus Linguistics, Language Variation, and Language Teaching’, in Sinclair, J. McH. (ed.), How to Use Corpora in Language Teaching. Amsterdam: John Benjamins, pp. 67–85.
Cook, G. (1998). ‘The uses of reality: a reply to Ronald Carter’, ELT Journal, 52 (1), 57–63.
Corson, D. (1997). ‘The learning and use of academic English words’, Language Learning, 47 (4), 671–718.

Cortes, V. (2002). ‘Lexical bundles in Freshman composition’, in Reppen, R., Fitzmaurice, S. and Biber, D. (eds), Using Corpora to Explore Linguistic Variation. Amsterdam: John Benjamins, pp. 131–45.
Council of Europe (2001). Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press.
Cowan, J. R. (1974). ‘Lexical and syntactic research for the design of EFL reading materials’, TESOL Quarterly, 8 (4), 389–400.
Cowie, A. (1998). ‘Phraseological dictionaries: some east-west comparisons’, in Cowie, A. (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Oxford University Press, pp. 209–28.
Coxhead, A. (2000). ‘A new Academic Word List’, TESOL Quarterly, 34 (2), 213–38.
Coxhead, A. and Hirsh, D. (2007). ‘A pilot science-specific word list’, Revue française de Linguistique Appliquée, 12 (2), 65–78.
Coxhead, A. and Nation, P. (2001). ‘The specialised vocabulary of English for Academic Purposes’, in Flowerdew, J. and Peacock, M. (eds), Research Perspectives on English for Academic Purposes. Cambridge: Cambridge University Press, pp. 252–67.
Coxhead, A., Bunting, J., Byrd, P. and Moran, K. (forthcoming). The Academic Word List: Collocations and Recurrent Phrases. Boston: University of Michigan Press.
Crewe, W. (1990). ‘The illogic of logical connectors’, ELT Journal, 44 (4), 316–25.
Curado Fuentes, A. (2001). ‘Lexical behaviour in academic and technical corpora: implications for ESP development’, Language Learning and Technology, 5 (3), 106–29.
Cutting, J. (2000). ‘Written errors of international students and English native speaker students’, in Blue, G., Milton, J. and Saville, J. (eds), Assessing English for Academic Purposes. Frankfurt am Main: Peter Lang, pp. 97–113.
Davies, A. (2003). The Native Speaker: Myth and Reality. Clevedon: Multilingual Matters.
De Bot, K., Lowie, W. and Verspoor, M. (2005). Second Language Acquisition: An Advanced Resource Book. London and New York: Routledge.
De Cock, S. (2003). ‘Recurrent sequences of words in native speaker and advanced learner spoken and written English: a corpus-driven approach’. Unpublished PhD thesis. Louvain-la-Neuve: Université catholique de Louvain.
De Cock, S. and Granger, S. (2004). ‘Computer learner corpora and monolingual learners’ dictionaries: the perfect match’, in Teubert, W. and Mahlberg, M. (eds), The Corpus Approach to Lexicography. Special issue of Lexicographica, 20, 72–86.
Dechert, H. (1984). ‘Second language production: six hypotheses’, in Dechert, H., Möhle, D. and Raupach, M. (eds), Second Language Productions. Tübingen: Gunter Narr, pp. 211–30.
Dechert, H. and Lennon, P. (1989). ‘Collocational blends of advanced language learners: a preliminary analysis’, in Olesky, W. (ed.), Contrastive Pragmatics. Amsterdam: John Benjamins, pp. 131–68.
DeRose, S. (1988). ‘Grammatical category disambiguation by statistical optimization’, Computational Linguistics, 14, 31–9.
Dudley-Evans, T. and St Johns, M. (1998). Developments in English for Specific Purposes. Cambridge: Cambridge University Press.
Eldridge, J. (2008). ‘“No, there isn’t an ‘academic vocabulary,’ but . . .” A reader responds to K. Hyland and P. Tse’s “Is there an ‘academic vocabulary’?”’, TESOL Quarterly, 42 (1), 109–13.
Ellis, R. and Barkhuizen, G. (2005). Analysing Learner Language. Oxford: Oxford University Press.

Engels, L. (1968). ‘The fallacy of word counts’, International Review of Applied Linguistics, 6, 213–31.
Evans, S. and Green, C. (2006). ‘Why EAP is necessary: a survey of Hong Kong tertiary students’, Journal of English for Academic Purposes, 6, 3–17.
Evert, S. (2004). ‘The statistics of word cooccurrences: word pairs and collocations’. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Available from http://www.collocations.de/phd.html
Farrell, P. (1990). ‘Vocabulary in ESP: a lexical analysis of the English of electronics and a study of semi-technical vocabulary’, CLCS Occasional Paper, 25, 1–83.
Field, Y. and Yip, L. (1992). ‘A comparison of internal conjunctive cohesion in the English essay writing of Cantonese and native speakers of English’, RELC Journal, 23 (1), 15–28.
Firth, A. and Wagner, J. (1997). ‘On discourse, communication and (some) fundamental concepts in SLA research’, Modern Language Journal, 81, 285–300.
Flowerdew, J. (1993). ‘Concordancing as a tool in course design’, System, 21 (2), 231–44.
Flowerdew, J. (1999). ‘Problems in writing for scholarly publication in English: the case of Hong-Kong’, Journal of Second Language Writing, 8 (3), 243–64.
Flowerdew, J. (2002). ‘Introduction: approaches to the analysis of academic discourse in English’, in Flowerdew, J. (ed.), Academic Discourse. Harlow: Longman, pp. 1–17.
Flowerdew, J. (2006). ‘Signalling nouns in a learner corpus’, International Journal of Corpus Linguistics, 11 (3), 345–62.
Flowerdew, L. (1998). ‘Integrating ‘expert’ and ‘interlanguage’ computer corpora findings on causality: discoveries for teachers and students’, English for Specific Purposes, 17 (4), 329–345.
Flowerdew, L. (2001). ‘The exploitation of small learner corpora in EAP materials design’, in Ghadessy, M., Henry, A. and Roseberry, R. (eds), Small corpus studies and ELT. Amsterdam: John Benjamins, pp. 363–79.
Flowerdew, L. (2008). Corpus-based Analyses of the Problem-Solution Pattern: A Phraseological Approach. Studies in Corpus Linguistics 29. Amsterdam and Philadelphia: John Benjamins.
Francis, G. (1986). Anaphoric Nouns. Discourse Analysis Monograph 11. Birmingham: English Language Research, University of Birmingham.
Francis, G. (1994). ‘Labelling discourse: an aspect of nominal-group lexical cohesion’, in Coulthard, M. (ed.), Advances in Written Text Analysis. London and New York: Routledge, pp. 82–101.
Garside, R. (1987). ‘The CLAWS word-tagging system’, in Garside, R., Leech, G. and Sampson, G. (eds), The Computational Analysis of English. London and New York: Longman, pp. 30–41.
Garside, R. and Smith, N. (1997). ‘A hybrid grammatical tagger: CLAWS4’, in Garside, R., Leech, G. and McEnery, A. (eds), Corpus Annotation: Linguistics Information from Computer Text Corpora. New York: Addison Wesley Longman, pp. 102–21.
Ghadessy, M. (1979). ‘Frequency counts, word lists and materials preparation: a new approach’, English Teaching Forum, 17, 24–7.
Gilquin, G. (2000/2001). ‘The integrated contrastive model. Spicing up your data’, Languages in Contrast, 3 (1), 95–123.
Gilquin, G. (2008). ‘Combining contrastive and interlanguage analysis to apprehend transfer’, in Gilquin, G., Diez-Bedmar, M. B. and Papp, S. (eds), Linking Up Contrastive and Learner Corpus Research. Amsterdam and Atlanta: Rodopi, pp. 3–33.
Gilquin, G. and Granger, S. (2008). ‘From EFL to ESL: evidence from the International Corpus of Learner English’. Paper presented at the First Triennial Conference of the

Corpus-based EAP pedagogy. and Hnazeli. (eds). E. M. (editor in chief) Macmillan English Dictionary for Advanced Learners (2nd edition). in Connor. Studies in English Language and Teaching.References 245 International Society for the Linguistics of English. ‘The stylistic potential of phraseological units in the light of genre analysis’. Proceedings of an International Symposium. and Payne. and Upton. English for Academic and Technical Purposes: Studies in Honour of Louis Trimble. 123–45. I. Amsterdam and Atlanta: Rodopi. Germany. ‘The International Corpus of Learner English: a new resource for foreign language learning and teaching and second language acquisition research’. ‘Romance words in English: from history to pedagogy’. (eds). in Svartvik. Analysis and Applications. S. G. Lund: Lund University Press. E. Phraseology: Theory. S. ‘From CA to CIA and back: an integrated approach to computerized bilingual and learner corpora’. in Selinker L. 105–21. ‘Improve your writing skills: writing sections’. 37–51. 3–33. Phraseology: Theory. Learner English on Computer. pp. Granger. Amsterdam and Philadelphia: John Benjamins. Granger. (eds). and Petch-Tyson. R. (eds). (ed. and Wekker. S. Lund Studies in English 88. (ed. 125–43. 6 (4). 538–46. (2008). Granger. Granger. in Aarts. P. pp. ‘A taxonomic approach to the lexis of science’. and Paquot. Rowley MA: Newbury House. (2007a) ‘Learner corpora: the missing link in EAP pedagogy. pp. M. 3–18. (2007b). Granger. in Cowie. S. Special issue of the Journal of English for Academic Purposes. pp. H. (1998b). J. S. Granger. pp. M. pp. pp.). (2006). Language in Performance 22. in Cowie. ‘On identifying the syntactic and discourse features of participle clauses in academic English: native and non-native writers compared’. A.. (ed. (2004). 41–61. Goodman. S. Albert-Ludwigs-Universität Freiburg. (1996a). Granger. S. S. 23–39. Languages in Contrast: Text-based Cross-linguistic Studies. Analysis and Applications. Granger. 
Collocations in Science Writing. IW4–IW28. M. 185–98. C. S. (1998a). 1 (1). Gilquin. and Paquot. Computer Learner Corpora. S. S. Granger.). ‘Computer learner corpus research: current status and future prospects’. Tuebingen: Gunter Narr Verlag. Granger. in Rundell. J. J. (ed. Tarone. in Granger. in Aijmer K. and Johansson. Oxford: Oxford University Press. (1981). P. 145–60.). Gläser. Stockholm: Almqvist and Wiksell International. Gilquin. (ed. Oxford: Oxford University Press. (1998). in Granger. Oxford: Macmillan Education. ‘Prefabricated patterns in advanced EFL writing: collocations and formulae’. pp. pp.. S. B. Learner English on Computer. Granger. Altenberg. Gilquin. (ed. T.). Second Language Acquisition and Foreign Language Teaching. Hung. G.. (2002). London and New York: Addison Wesley Longman. and Paquot. (2000). P. G. Gledhill. in Thompson. University of Hanover. (2003). ‘A bird’s-eye view of learner corpus research’. (eds). Granger..). Words. M. S. 319–35. pp. A... ‘Too chatty: learner academic writing and register variation’. V. London and New York: Addison Wesley Longman. (1996b). Language Learning and Language Teaching 6. de Mönnink. 5–7 October 2006. ‘The computer learner corpus: a versatile new source of data for SLA research’. 37 (3). 8–11 October 2008. . (1997). English Text Construction. ‘Lexico-grammatical patterns of EAP verbs: how do learners cope?’ Paper presented at Exploring the Lexis-Grammar Interface. Amsterdam and Atlanta: Rodopi. A. U. S. TESOL Quarterly.) (1998). S. Applied Corpus Linguistics: A Multidimensional Perspective.

246 References Granger. Granger. K. 19–29. and Paquot. (2009).ucl. 257–77. ‘Pattern and meaning across genres and disciplines: an exploratory study’. pp. Gregg. University of Crete Publications. S. academic_english. Learner English on Computer. H. S. London: Blackwell. in Granger. (eds) eLexicography in the 21st century: new challenges. and Heasley. M. S. Cahiers du Cental. ‘From dictionary to phrasebook?’. in Charles. Cohesion in English. F. Granger. (eds). Louvain-la-Neuve: Presses universitaires de Louvain. Harris.). D. and Meunier.). (1997). (2008a). (eds).S. and Paquot. (1998). R.). . pp. M. Presses universitaires de Louvain: Louvain-laNeuve. Halliday. Handbook of Second Language Research. and DeCesaris. N. English for Specific Purposes. in Aijmer. pp. P.. new applications. Barcelona. M. London: Longman. M. 4. S. 94–108. R. H. and Long. Neufeld. 289–308. S. Meunier. Gries. ‘Exploring variability within and between corpora: some methodological considerations’. World Englishes. ‘Automatic profiling of learner texts’. M. Corpora. Amsterdam and Philadelphia: John Benjamins. Granger. in Granger. (2010).. International Journal of Corpus Linguistics. Hamp-Lyons. ‘Lexical verbs in academic discourse: a corpus-driven study of learner use’. in Granger. E-media. Proceedings of the XIII EURALEX International Congress. 27 (4). and Meunier. Groom. M. Louvain-la-Neuve: Presses universitaires de Louvain. 109–51. 403–37. 193–214.fltr. 27–49. and Paquot. Study Writing: A Course in Writing Skills for Academic Purposes. ‘Through the looking glass and into ˘ (eds. M. M. S. 23. ‘Dispersions and adjusted frequencies in corpora’. S. and Swallow. (eds) (2002). Version 2. Granger. (2009). London and New York: Addison Wesley Longman. 1345–55. Granger. and Hasan. pp. (ed. F. Corpora and Language Teaching. J. M. (eds). S. and Paquot. in Doughty.. (ed. S. and Paquot.P practitioners Conference Proceedings. pp. and Tyson. S. (2006). Handbook and CD-ROM.pdf Granger. 
‘The contribution of learner corpora to second language acquisition and foreign language teaching: a critical evaluation’. (2003). S. and Eldridge. English for Specific Purposes. 1 (2). Langage et l’Homme. the land of lexico-grammar’. and Rayson. (2005). Dagneaux. ‘Customising a general EAP dictionary to learner needs’. S. Phraseology: An Interdisciplinary Perspective. Granger. Academic Writing: At the Interface of Corpus and Discourse. 831–65. 119–31. Granger. 13 (4). J. The International Corpus of Learner English. Spain. S.. The International Corpus of Learner English. Granger. Journal of English for Academic Purposes. Cambridge: Cambridge University Press. Continuum. ‘In search of General Academic English: a corpusdriven study’. 15–19 July 2008. F. K. ‘Disentangling the phraseological web’. M. and Hunston. K. CD-ROM and Handbook. B. Granger S. Options and Practices of L. (2008b). Hanciog N. (2008). (1976). 15. Available from http://cecl. S. 15–32. E. Amsterdam and Philadelphia: John Benjamins. E. E. S and Paquot. pp. in Katsampoxaki-Hodgetts. 16 (4). (ed. 108–20. S. Gries. (2009a). ‘SLA theory: construction and assessment’. pp. S. (1996). Dagneaux. ‘False friends: a kaleidoscope of translation difficulties’. 6. Proceedings of the eLex2009 Conference. ‘Connector usage in the English essay writing of native and non-native EFL speakers of English’. 459–79. L. and Paquot. in Bernal. (2007). ‘Procedural vocabulary in law case reports’. S. C.).. Pecorari . (1988). (2009b). (2008).

London: Routledge. M. P. writing. Tübingen: Max Niemeyer Verlag. and Nation. Hoey. Techniques of Description. New Zealand: Victoria University of Wellington. Nottingham: Nottingham University Press. Corpus Technology and Language Pedagogy: New Resources. (2006). (1992). (2005). pp. (1998). Lexical Priming: A New Theory of Words and Language. M. Phraseology: Theory. volume 3 of English Corpus Linguistics. Range [Computer software]. Bonston: Heinle and Heinle. T. ‘BNCweb (CQP-edition): The marriage of two corpus tools’. and Thompson. 689–96. E. M. (2006). 8. P. 1049–68. Discoveries in Academic Writing. (1999). ‘Grammar. London and New-York: Routledge. Advances in Written Text Analysis. (2004). Hinkel. Hirsh. Hoey. ‘Signalling in discourse: a functional analysis of a common discourse pattern in written and spoken English’. Frankfurt am Main: Peter Lang. Amsterdam and Philadelphia: John Benjamins. K. C. S.). in Lindquist. stefan. S. 3–17. 177–95.). resources/range. and Nation. M. H. in Coulthard. and Evert. Heatley. pp. ‘Lexical teddy bears and advanced learners: a study into the ways Norwegian students cope with English vocabulary’.. A. P. Journal of Pragmatics. 149–58. (eds). P. Available from http://www. . Frankfurt am Main. M. S.victoria. Evert. pp. and Fisher. (eds). Oxford: Oxford University Press. Phraseology in English Academic Writing: Some Implications for Language Learning and Dictionary Making. (2008).ac. (2003). (1996). International Journal of Applied Linguistics. V. and technology: a sample technology-supported approach to teaching grammar and improving writing for ESL learners’. Y. (eds). Smith. in Bool H.. J. Howarth. P. A. (eds) Academic Standards and Expectations: The Role of EAP. S. (2003). Hinkel. S. G. Hoey. 67–82. G. E. J. Howarth.pdf Hoffmann. (2002). NJ: Lawrence Erlbaum Associates. 237–60. ‘A common signal in discourse: how the word reason is used in texts’.aspx Hegelheimer. E. D. (1994). Peter Lang. Hasselgren. Ibérica. D. and Berglund Prytz. 
and Mukherjee. Hoey. Second Language Writers’ Text: Linguistic and Rhetorical Features. Mahwah. ‘Specificity in LSP’. CALICO Journal. 161–86. ‘What vocabulary size is needed to read unsimplified texts for pleasure?’ Reading in a Foreign Language. ‘Adverbial markers and tone in L1 and L2 students’ writing’.. in Cowie. Evaluation in Text: Authorial Stance and the Construction of Discourse.. Kohn. Huckin. London and New York: Routledge.. in Sinclair. Hoffmann. P. Howarth. (1993). 171–210. (ed. 35 (7). Available from http://purl. Oxford: Oxford University Press. S. 4 (2). P. New Methods. Lee D. Hunston. Essential Academic Vocabulary: Mastering the Complete Academic Word List. in Braun. New Tools. B. Boston: Houghton Mifflin Company. ‘Are low-frequency complex prepositions grammaticalized? On the limits of corpus data – and the importance of intuition’. Wellington. pp. pp. Corpus Approaches to Grammaticalization in English. (1994). Huntley. (1996). Hoffmann. 23 (2). N. London: Lawrence Erlbaum Associates. and Luford. (2002). (2006). ‘The phraseology of learners’ academic writing’. and Fox.References 247 Harris Leonhard. (eds) (2000). and Mair. (ed. Teaching Academic ESL Writing: Practical Techniques in Vocabulary and Grammar. 26–45. (2004). Analysis and Applications.evert/PUB/HoffmannEvert2006. 5. H. ‘Phraseological standards in EAP’. S. 257–79. Hinkel. Corpus Linguistics with BNCweb – a Practical Guide.

and Howatt. S. Hyland. Jarvis. S. in Granger. Katz. S. J. A. (2007). 183–205. King. London: Allen and Unwin. Natural Language Engineering. E. pp. (1978). T. K. and Pavlenko. K. in Davies A. A. (1999). Some Aspects of the Vocabulary of Learned and Scientific English. ‘A second language classic reconsidered: the case of Schachter’s avoidance’. London and New York: Continuum. A Guide and Resource Book for Teachers. Ivanic R. Göteborg: Acta Universitatis Gothoburgensis. 41 (2). (2005). 15–59. Second Language Research. and Tse. 356–73. (1996). 98–122. 29. A. R. J. K. 93–114. (1990). P. ‘Nouns in search of a context: a study of nouns with both ˇ. Johansson. and Petch-Tyson. ‘Parallel concordancing and its applications’. S. (2002a). (2008). 6. Harlow: Pearson Education Limited. 245–309. International Review of Applied Linguistics in Language Teaching. ‘Preparation and analysis of linguistic corpora’. (2009). (2007). Applied Linguistics. ‘Issues in creating a corpus for EAP pedagogy and research’. (ed. R. in Schreibman. R. K. pp. Study Skills in English. Frequency Dictionary of Spanish Words. ‘Is there an “academic vocabulary”?’ TESOL Quarterly. 6 (2). Hyland. M. Interlanguage. Academic Writing Course. S. ‘From printout to handout: grammar and vocabulary teaching in the context of data-driven learning’. open. (ed. (eds). Metadiscourse. and Unsworth. E. (1994). Harlow: Pearson Education Limited. and Milton. Rhetoric of Everyday English Texts.. English for Academic Purposes. 30. T. ‘Specificity revisited: how far should we go now?’ English for Specific Purposes. A Companion to Digital Humanities. Crosslinguistic Influence in Language and Cognition. Shimura. (1984). Cambridge: Cambridge University Press. 50 (2). La Haye: Mouton. Hyland. Siemens. ‘Methodological rigor in the study of transfer: identifying L1 influence in the interlanguage lexicon’. Journal of English for Academic Purposes. R.) Second Language Writing. S. Ide. R. (2005). Cambridge: Cambridge University Press. 
(1984). 293–313. Criper.. London and New York: Continuum. and Kosem. Krishnamurthy. class compositions’. Edinburgh: Edinburgh University Press. B. 215–39. 21. Oxford: Blackwell. Approaches to Pedagogic Grammar. 235–53. E. Kellerman. ‘The empirical evidence for the influence of the L1 in interlanguage’. Amsterdam and Philadelphia: John Benjamins. K. ‘What does time buy? ESL student performance on home vs. (eds). K. ‘Distribution of common words and phrases in text and language modelling’. ‘Directives: argument and engagement in academic writing’. P.248 References Hyland. New York and London: Routledge. and Kellerman. P. ‘Qualifications and certainty in L1 and L2 students’ writing’. N. Kamimoto. Language Learning.. Hyland. (2000) Disciplinary Discourses: Social Interactions in Academic Writing.and closed-system characteristics’. Johns. 8 (3): 251–77. in Kroll. C. Academic Discourse. Jarvis. Hyland. (1964). 385–95. Jordan. Corpus-based Approaches to Contrastive Linguistics and Translation Studies.). A. 289–306. in Odlin. 2 (1). Journal of Pragmatics. ‘Persuasion and context: the pragmatics of academic metadiscourse’. T. 157–68. pp. Kroll. Journal of Second Language Writing. B. Juilland. (1997). J. (1992). R. (1991). pp. Jordan. 23. K. Hyland. S.. and Rodriguez. I. Cambridge: Cambridge University Press. 437–55. Lerot. (2000). Hyland. Jordan. (2003). 140–53. (1997). (eds). (1998). K. (2002b). C. .

J. Composition Practice 3. C. ‘Using ‘on the contrary’: the conceptual problems for EAP students’. domains. S. London: Longman. (ed. G. Leech. text types. Learner English on Computer. and Béjoint. and Selinker. H. P. and Pemberton. M. London: Macmillan. ‘An investigation of students’ knowledge of academic and subtechnical vocabulary’. ‘Procedural vocabulary: lexical signalling of conceptual relations in discourse’. 55–75. (eds). ‘Overstatement in advanced learners’ writing: stylistic aspects of adjective intensification’. Lorenz. Laufer. in Kirk. Laruelle. ‘What percentage of text-lexis is essential for comprehension?’. Lennon. (ed. (2004). A. D. 23–36.. pp. non-native argumentative writing’. 19 (1). J.. (1997).References 249 Lake. ‘A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora’. in van Halteren. Adjective Intensification – Learners versus Native Speakers. pp. in Flowerdew. . (2001). New York: Addison Wesley Longman. Li. IRAL. and Wilson. ‘Getting ‘easy’ verbs wrong at the advanced level’. Rayson. U. (1998) ‘Preface’. R. pp. Luzón Marco. English for Specific Purposes. (2001). Applied Linguistics. and Smith. Syntactic Wordclass Tagging. H. (2001). D. London and New York: Addison Wesley Longman. L. (1999b). Entering Text. (1999). ‘Introducing corpus annotation’. Lenk. A. R. Learner English on Computer. Laufer. Leech. (ed. N.. Clevedon: Multilingual Matters. L. and Tong. (1998). ‘The use of tagging’. (eds). Amsterdam: Rodopi. K. and Swales.). ‘Learning to cohere: causal links in native vs. ‘Genres. Schneider. pp. G. L. in Granger. (1992). (2001). ELT Journal. (2006). 393–420. and Ventola. M. and styles: clarifying the concepts and navigating a path through the BNC jungle’. Amsterdam and Philadelphia: John Benjamins. G. Paris: Presses Universitaires de France. ‘Analysing interlanguage: how do we know what learners know?’ Second Language Research. 20 (1). How to Create it and How to Describe it. 
in Granger. 25. E.-M. S. Amsterdam and Atlanta: Rodopi. Corpora Galore: Analysis and Techniques in Describing English. 126–132.-L. London and New York: Addison Wesley Longman. 34 (1). in Bublitz. P. K. (eds). 1–21. Lonon Blanton. Leech G. J. 259–66. Leech. Corpus Annotation: Linguistics Information from Computer Text Corpora. 137–44. 17. Luzón Marco. pp. (1999a). E. Lorenz. (ed. 5 (3).. Lorenz. G. P. 23–36. Language Learning and Technology. Lehmann. 37–72. (eds). S. G. and Hoffmann. Boston: Heinle and Heinle. J. B. 183–96. (2000). English for Specific Purposes. 316–23. (eds). Language and Computers: Studies in Practical Linguistics 27. Dordrecht: Kluwer Academic Publishers. Hong Kong: The Hong Kong University of Science & Technology. in Garside. U. 63–86. pp. ‘How much lexis is necessary for reading comprehension?’. A Corpus Study of Argumentative Writing. P. Mieux écrire en anglais. (2000). (1989). P. A. H. S. in Lauren. 53–66. J. ‘Collocational frameworks in medical research papers: a genre-based study’. (1999). Leech. pp. B. 1–18. Lee. and Nordman. Lakshmanan.). (1994). Coherence in Spoken and Written Discourse. W. G. Vocabulary and Applied Linguistics. pp. and McEnery. G. 56–75. Word Frequencies in Written and Spoken English.). Lee. registers. (2004). M. in Arnaud. ‘BNCweb’. Special Language: From Humans Thinking to Thinking Machines. 58 (2). (1996). J.).

and Nation. K. Acquisition and Pedagogy. P. (1976). M. in Kettemann. (1998). and Hyland. RELC Journal. ‘Where would general service vocabulary stop and special purposes vocabulary begin?’ System. ‘The native speaker is alive and kicking – linguistic and languagepedagogical perspectives’. (1998).250 References Lynn. (1995). ‘The use of adverbial connectors in argumentative essays by Japanese EFL college students’. and Rohrback. 16 (2). London and New York: Addison Wesley Longman. J. pp. I.) (2006). Nation. Learner English on Computer. and Tono. English for Specific Purposes. and Sugiura. Beck. (1995). Corpus-based Language Studies: An Advanced Resource Book. Cambridge: Cambridge University Press. 205–32. ‘Engineering English: a lexical frequency instructional model’. R. R. R. 28. (eds). Y. Martínez. 23 (1). ‘Teaching academic vocabulary to foreign graduate students’. ‘Rethinking applied corpus linguistics from a language-pedagogical perspective: new departures in learner corpus research’. 235–56. F. Vocabulary: Description. English Corpus Studies. J. 183–98. pp. Martin. 365–93. Nation. Major. TESOL Quarterly. Frankfurt am Main: Peter Lang. J-M. (2006). 4 (1). pp. 221–43. Cambridge: Cambridge University Press. Gluing and Painting Corpora: Inside the Applied Corpus Linguist’s workshop. Discourse Markers in Native and Non-native English Discourse. O. and O’Dell. P. (ed. ‘Vocabulary size. C. A. 13. (1999). P. Phraseology: Theory. (2009). Mukherjee. A.. 7–23. A. (2006). Anglistik. pp. ‘Collocations and lexical functions’. Discourse Analysis for Language Teachers. W. Cambridge: Cambridge University Press. M. 91–7.. Unpublished PhD thesis. M. English for Specific Purposes. C. 35– Müller. Fixed Expressions and Idioms in English. 25 (2). I. (2005). Processes and Practices. and Hwang. 10 (1). M. (2008). McEnery. Miller. Learning Vocabulary in another Language. 23–42. ‘Lexical thickets and electronic gateways: making text accessible by novice writers’. Mel’c ˇuk. P. (1973). 
Milton. Coming to Know: Studies in the Lexical Semantics and Pragmatics of Academic English. B.). S. (eds). Mudraya. Amsterdam and Philadelphia: John Benjamins. in Schmitt. ‘Preparing word lists: a suggested method’. ‘Academic vocabulary in agricultural research articles: a corpus-based study’. Tübingen: Gunter Narr Verlag Tübingen. London and New York: Longman. Journal of Pragmatics. Narita. P.). and Marko.G. (2000). Nation. Oxford: Oxford University Press. Meunier. S. 6–19. Harlow: Longman. R. (ed. (1997). (2001). Analysis and Applications. F. J. K. Cambridge: Cambridge University Press. Oxford: Clarendon Press. 25–32. Academic Vocabulary in Use. Writing: Texts. (eds). (2006). and Panza. Xiao. (2006). P. 23–53. in Candlin. N. Mukherjee. G. The Longman Exams Dictionary. M. and Weinert. text coverage and word lists’. (1991). R. in Granger. Milton. in Cowie. and Waring. ‘A computer corpus linguistics approach to interlanguage grammar: noun phrase complexity in advanced learner writing’. . (1997). McCarthy. (1998). ‘Exploiting L1 and interlanguage corpora in the design of an electronic language learning and production environment’. Planning. McCarthy. J. ‘The function of like in dialogue’. Université catholique de Louvain: Louvain-la-Neuve. (ed. Available from http://www. Meyer. S. J. 186-198. N.uni-giessen. Moon. London and New-York: Routledge. 23. (2005).

pp. and Rica. and Wieser. G. Prieto. O. in Facchinetti. N (2004). The Handbook of Second Language Acquisition. in Tschichold. Edinburgh: Edinburgh University Press. Collocations in a Learner Corpus. C. F. (2000). E.. Díez. ‘A corpus-based study of business English and business English teaching materials’. M. 22 (1). C. 73–89. ‘Student papers across the curriculum: designing and developing a corpus of British student writing’. pp. L. R. Computers and Composition. ‘The expression of writer stance in native and non-native argumentative texts’. Frankfurt am Main: Peter Lang. in Reynolds.14 (1). and Prieto. Amsterdam and New York: Rodopi. E.. J. 436–86. M. (2008). pp. Oakey. pp. Cambridge: Cambridge University Press.References 251 Neff. Corpus Linguistics and Society. J.. T. Numbers.. Nesselhauf. 21. Statistics for Corpus Linguistics. Language. D... Amsterdam: Rodopi. Amsterdam: John Benjamins. D.. ‘A contrastive functional analysis of errors in Spanish EFL university writers’ argumentative texts: corpus-based study’. (2007). J. 85–100. F.. Neff. Dafouz. Odlin. Amsterdam and Philadelphia: John Benjamins. Manchester: University of Manchester. F. 267–83. People. F. (ed. (2005). Obenda. 203–25. Dafouz. A. Oxford: Blackwell. 19–45. Neff van Aertselaer. and Ganobcsik-Williams. Dafouz. in Gerbig. Literary and Linguistic Computing. and Granger. pp. (eds). ‘Formulaic language in English academic writing: a corpus-based study of the formal and functional variation of a lexical phrase in different academic disciplines’.. ‘Cross-linguistic influence’. ‘Linguistic correlates of second language literacy development: evidence from middle-grade learner essays’. Phraseology in Language Learning and Teaching. H.. 269–86.. F. C. (2003). N. M. J-P. Ballesteros. M. R. Odlin. M. D. H. J. ‘Contrasting English-Spanish interpersonal discourse phrases: a corpus study’.. in Fitzpatrick. A. Neff J. (ed. Language Transfer: Cross-linguistic Influence in Language Learning. Martínez. pp. (2004a). 
(2007). (2004b). J. J. D. pp. Díez.). Journal of Second Language Writing. C. and Mason. Corpus Linguistics beyond the Word: Corpus Research from Phrase to Discourse (Language and Computers 23). (2004). and Chaudron. (eds). 141–61.) English Core Linguistics.. and Martinovic-Zic. ‘Contrastive discourse analysis: argumentative text in English and Spanish’. E. and Rica. ‘What are collocations?’. Nesselhauf. Essays in Honour of D. R. Neff J. ‘Use of the Chi-Squared Test to examine vocabulary differences in English language corpora representing seven different countries’. Oakes. E. Basel: Schwabe. T. ‘Formulating writer stance: a contrastive study of EFL learner corpora’. N. Amsterdam: John Benjamins. and Farrow. Phraseological Units: Basic Concepts and their Application. 439–50. Nelson. F. J. (2008).) (2004). Allerton. C. (2002). 1–22. Tschichold. in Doughty. . Bern: Lang. (1998). M. (1989). Boston and New York: Houghton Mifflin Company. Rica. (eds). S. and Palmer. P. Sharpling. Martínez. P.. J. in Allerton. (eds). J-P. (2005). F. and Long. Oakes. (2003). Unpublished PhD Thesis. F. Discourse across Languages and Cultures. E. Nesselhauf. M. English Modality in Perspective. Dafouz. (eds). ‘Transfer at the locutional level: an investigation of Germanspeaking and French-speaking learners of English’. (ed. Ballesteros. Academic Word Power (1 – 4). (eds). Nesi. 85–99. Martínez. pp. in Meunier. in Moder. Ballesteros...

and Bestgen. Lodz Studies in Language 13. P. 43–64. (2006). (1983). Oshima. (eds). 12 (2).ucl.comp. (2008b). F. J. M. Quirk. D. (eds). ‘Reader/writer visibility in EFL persuasive writing’. 243–65. A. Paquot. Pawley. S. F. Learner English on Computer. ‘Exemplification in learner writing: a cross-linguistic perspective’. C. Amsterdam and Philadelphia: John Benjamins. Frankfurt am Main: Peter Lang. London and New York: Addison Wesley Longman. ‘Comment rendre compte de la “logique” de l’acquisition d’une langue étrangère par l’adulte?’ Etudes de Linguistique Appliquée. American University Word List. 13 (4). .uk/~paul/public. M. P. S. Corpus-based and Computational Approaches to Discourse Anaphora. S. and Jucker. (2008). ‘Who. in Hundt. PALC 2005. M.html Paquot. (1998). ‘Phraséologie contrastive anglais-français: analyse et traitement en vue de l’aide à la rédaction scientifique’. pp. Paper presented at the ASKeladden Opening Conference. ‘Phraseology effects as a trigger for errors in L2 English: the case of more advanced learners’.. (eds). and Hogue. (in preparation). Phraseology in Language Learning and Teaching. and Syder. ‘Two puzzles for linguistic theory: nativelike selection and nativelike fluency’. (1998). and Schmidt. Paquot. F. M. Praninskas. G. 107—118. and Granger. ‘From key words to key semantic domains’. 24–25 June 2008. in Walinski. Corpora: Pragmatics and Discourse. is a native speaker?’ Anglistik. (eds). (eds). Y. Corpora and ICT in Language Studies. ‘Matrix: a statistical method and software tool for linguistic analysis through corpus comparison’. Greenbaum. Unpublished PhD thesis. C. Paquot. Language and Communication. Amsterdam and Philadelphia: John Benjamins. H. ‘Towards a productively-oriented academic word list’. S. Leech. Rayson. (1993). Piller. (1985). and Gozdz-Roszkowski. Patterns and Meanings. Paquot. if anyone. Norway. pp. and Meunier. J. in Granger. Amsterdam and Philadelphia: John Benjamins. Bergen. 
International Journal of Corpus Linguistics. (2007b).html Rayson. Using Corpora for English Language Research and Teaching. Schreier. W.lancs. A. 29–59. S. Unpublished PhD Thesis. R. Université de Nice. in Richards. ‘Demonstrative expressions in argumentative discourse – a computer-based comparison of non-native and native English’. Phraseology in Language Learning and Teaching. in Granger. J.. (2009). M. 519–49. London: Longman. A phraseologically-oriented approach’. 67–84. M. A. (2008). London and New York: Longman. Available from http://cecl. in Meunier. Unpublished PhD thesis. A. ‘Unveiling L1-induced effects with the help of learner corpora: Transfer of lexical priming’. 101–19.. pp. Université catholique de Louvain. P. pp. (1972). I.252 References Osborne. pp. M. London: Longman. ‘Distinctive words in academic writing: a comparison of three statistical tests for keyword extraction’. (2001). J. Amsterdam: Rodopi. M. (eds). Writing Academic English. (2003). Amsterdam: John Benjamins. A. (ed. in Botley. K. ‘EAP vocabulary in native and learner writing: from extraction to analysis. Available from http://www. Pecman. be/ S. J. Petch-Tyson. Perdue. 127–40. Paquot. pp. Lancaster University. A. New York: Pearson Education. H. S. 109–21. Kredens.. Partington. and McEnery. ‘Lifting the “methodological fog” that covers transfer studies: a combination of Granger’s (1996) integrated contrastive model and Jarvis’s (2000) unified framework for transfer research’. Petch-Tyson.fltr. and Svartvik. Sophia Antipolis. (2004). (2008a). A Comprehensive Grammar of the English Language. M.). pp. (1999). 8–22. S. (2007a).

in Tyler. Progressives. [HSK series]. Kim. 199–219. ‘Developing and exploring the behaviour of two new versions of the Vocabulary Levels Test’. A Corpus-driven Approach to English Progressive Forms. Belgium. 271–85. in Lüdeling. ‘Collocational frameworks in English’. and Altenberg. cognition and writing: an investigation of the use of cognates by university second-language learners’. Berlin: Mouton de Gruyter. in Purnelle. ‘Vocabulary frequencies in advanced learner English: a crosslinguistic approach’. U. (1999). Developing Composition Skills. How to Use Corpora in Language Teaching. Renouf. (eds). Applied Linguistics.. McH. B. D. Ringbom. M.References 253 Rayson. Schmitt. 185–99. Scarcella. pp. J. (2005).) (2007). (1996). (2nd Edition). ‘Linguistic correlates of second language literacy development: evidence from middle-grade learner essays’. An International Handbook (volume 1). pp. D. S. 41–52. Fairon. Rhetoric and Grammar. ‘A corpus-driven approach to modal auxiliaries and their didactics’. (2001).). Louvain-la-Neuve. M. Berlin: Mouton de Gruyter.). Washington: Georgetown University Press. in Aijmer. pp. Schleppegrell. and Zimmerman. H. Patterns. 18 (1). C. and Kytö. and Sinclair.Corpora and Language Learners. Schmid. D. U. J. (2008). G. Cross-linguistic Similarity in Foreign Language Learning. M. Schmitt.. Römer. A. Amsterdam: John Benjamins. Schmitt. J. 123–36. and Clapham. Macmillan English Dictionary for Advanced Learners. Reynolds. The Role of the First Language in Foreign Language Learning. (eds). ‘What really matters in second language learning for academic achievement?’ TESOL Quarterly 18 (2). D. Rundell. Bernardini. P. Y. (2005). ‘Comparing real and ideal language learner input: The use of an EFL textbook corpus in corpus linguistics and language teaching’. H-J. in Aston. pp.. N. Takada. 112–30. and Dister. pp. ‘Conjunction in spoken English and ESL writing’. Oxford: Macmillan Education. D. 
‘Extending the Cochran rule for the comparison of word frequencies between corpora’. 2004. 35–53. G. (eds). (2004). London and New York: Addison Wesley Longman.. H. (ed. pp.. English Abstract Nouns as Conceptual Shells: From Corpus to Cognition. (2003). U. and Francis. Language Testing. . London: Longman. D. pp. ‘Corpora and language teaching’. S. R. A. 19–45. Clevedon: Multilingual Matters. Pedagogy. 55–88. Focus on Vocabulary: Mastering the Academic Word List. C. C. English Corpus Linguistics: Studies in Honour of Jan Svartvik. (1984). and Stewart. Berridge. 14. Language in Use: Cognitive and Discourse Perspectives on Language and Language Learning. M. and Marinova. Learner English on Computer. M. International Journal of Lexicography. (2004b). B. 17 (3). Ringbom. 926–36. Römer. H. 128–43. Römer. Functions. and Schmitt. (ed. (1991). (1998). Corpus Linguistics. Contexts and Didactics. Ruetten. B. (ed. (2000). Boston: Heinle. U. Ringbom. Clevedon and Philadelphia: Multilingual Matters. Rundell. ‘Cognates. K. N. A. (eds). (2005). Saville-Troike. (2005). 12 (1). (1987). London and New York: Longman. ‘Dictionary use in production’. C. Louvain-la-Neuve: Presses universitaires de Louvain. Le Poids des Mots: Proceedings of the 7th International Conference on Statistical Analysis of Textual Data (JADT 2004).. M. March 10–12. A. Journal of Second Language Writing. in Granger. (2007). M. Römer. 151–68. (eds). W. in Sinclair. (2004a). Amsterdam: John Benjamins. Amsterdam: John Benjamins.

Concordance. Advances in Corpus Linguistics. M. L. (2008). 25 (2). ‘Interlanguage’. and Altenberg. Contrastive Lexical Semantics. M. ‘Technical. Sinclair. Developing your English Vocabulary. Journal of Semantics. (2002). frequency distributions through the WordSmith Tools suite of computer programs’. IRAL. Corpus. (2004a). M. System. lexical competence and nuclear vocabulary’. (ed. J. pp. Trust the Text: Language.). (1986). 150–9. (ed. Amsterdam: John Benjamins. English for Specific Purposes. J. London and New York: Longman. Amsterdam: John Benjamins. M. London and New York: Routledge. B. M. and Roseberry. J. (2005). Shaw. ‘Comparing corpora and identifying key words. Oxford: Oxford University Press. (1973). ‘Analysing adjectives in scientific discourse: an exploratory study with educational applications for Spanish speakers at advanced university level’. Stubbs.. M. Looking up. in Stubbs. Aarhus. Developing Linguistic Corpora: A Guide to Good Practice. 2577–90. Henry. (2004). J. (ed. D. M. 39–59. pp. 27 (3). in Weigand. Amsterdam and Philadelphia: John Benjamins. 223–34. (2004b). C.). J. P. ‘Like: the discourse particle and semantic’. WordSmith Tools 4.). 378–97. 1–24. (2002). M. pp. Available from http://ahds. Discourse Markers across Languages: A Contrastive Study of Second-level Discourse Markers in Native and Non-native Text with Implications for General and Pedagogic Lexicography. (1999). A Systematic New Approach. (1992). Sinclair. Educational Linguistics. (1991). 21. collocations. Sinclair. J. ‘Intuition and annotation – the discussion continues’. (ed. J. Amsterdam and New York: Rodopi. Current Issues in Linguistic Theory 17. ‘The nature of the evidence’. Sinclair. E. cancer experience and internet use: a comparative keyword analysis of interviews and online cancer support groups’. M. (1987). S. in Sinclair. C. Rediscovering Interlanguage. May 27– London: Collins. G. Stein. . ‘Gender. ‘Corpus and text–basic principles’. Small Corpus Studies and ELT. (ed. 
62.). J. Scott. M. 47–67.254 References Scott. 1–16. Sinclair. Available from http://www. ‘The automatic analysis of corpora’. L. in Sinclair.htm. Oxford: Oxbow Books. V.html Siegel. M. and Charteris-Black. J. in Svartvik. M. K. ‘PC analysis of keywords and key keywords’. 145–65. Collocation. ‘Language development. 35–71. (2006). (2004). P. pp. in Aijmer. Selinker. Siepmann. pp. A. (1997). (eds). Selinker. Sinclair. Textual Patterns: Key Words and Corpus Analysis in Language Education. Oxford: Oxford University Press. ‘The empty lexicon’. (eds). M. Seale. Strevens. Scott. and scientific English’. J. 19 (1). (ed. M. M. Sinclair. M. 149–63. (2001). and Tribble. ‘The lexical item’. J. Papers from the 23rd International Conference on English Language Research on Computerized Corpora. in Ghadessy. pp.).). (1992). pp. L. appropriacy. Denmark.. 233–45. in Proceedings of the Ninth Nordic Conference for English Studies. 209–31. 98–115. M. technological. (2006). Berlin and New York: Mouton de Gruyter. Soler. ‘The development of Swedish university students’ written English. in Wynne.hum. Directions in Corpus Linguistics. Corpus and Discourse. (2005). Social Science and Medicine. ELT Journal. (1972). scope and coherence’.dk/ engelsk/naes2004/papers. Tübingen: Stauffenburg Verlag. Oxford and New York: Blackwell. New York: Routledge. Scott. X (3).uk/creating/guides/linguistic-corpora/chapter1.



Summers, D. (1996), ‘Computer lexicography: the importance of representativeness in relation to frequency’, in Thomas, J. and Short, M. (eds), Using Corpora for Language Research: Studies in Honour of Geoffrey Leech. London: Longman, pp. 260–6.
Sutarsyah, C., Nation, P. and Kennedy, G. (1994), ‘How useful is EAP vocabulary for ESP? A corpus based case study’. RELC Journal, 25, 34–50.
Swales, J. M. (1990), Genre Analysis: English in Academic and Research Settings. Cambridge: Cambridge University Press.
Swales, J. M. (2002), ‘Integrated and fragmented worlds: EAP materials and corpus linguistics’, in Flowerdew, J. (ed.), Academic Discourse. Harlow: Pearson Education, pp. 150–64.
Swales, J. M., Ahmad, U., Chang, Y., Chavez, D., Dressen, D. and Seymour, R. (1998), ‘Consider this: the role of imperatives in scholarly writing’. Applied Linguistics, 19 (1), 97–121.
Swales, J. M. and Feak, C. B. (2004), Academic Writing for Graduate Students: Essential Tasks and Skills (2nd edition). Ann Arbor, MI: University of Michigan Press.
Tan, M. (2005), ‘Authentic language or language errors? Lessons from a learner corpus’. ELT Journal, 59 (2), 126–34.
Tankó, G. (2004), ‘The use of adverbial connectors in Hungarian university students’ argumentative essays’, in Sinclair, J. (ed.), How to Use Corpora in Language Teaching. Amsterdam and Philadelphia: John Benjamins, pp. 157–81.
Thurstun, J. and Candlin, C. (1997), Exploring Academic English. A Workbook for Student Essay Writing. Sydney: NCELTR Publications.
Thurstun, J. and Candlin, C. (1998), ‘Concordancing and the teaching of the vocabulary of Academic English’. English for Specific Purposes, 17 (3), 267–80.
Tognini-Bonelli, E. (2001), Corpus Linguistics at Work. Amsterdam and Philadelphia: John Benjamins.
Tognini-Bonelli, E. (2002), ‘Functionally complete units of meaning across English and Italian: towards a corpus-driven approach’, in Altenberg, B. and Granger, S. (eds), Lexis in Contrast: Corpus-based Approaches. Amsterdam and Philadelphia: John Benjamins, pp. 73–95.
Tribble, C. (2001), ‘Small corpora and teaching writing: towards a corpus-informed pedagogy of writing’, in Ghadessy, M., Henry, A. and Roseberry, R. (eds), Small Corpus Studies and ELT: Theory and Practice. Amsterdam and Philadelphia: John Benjamins, pp. 381–408.
Trimble, L. (1985), English for Science and Technology. Cambridge: Cambridge University Press.
Tseng, Y-C. and Liou, H-C. (2006), ‘The effects of online conjunction materials on college EFL students’ writing’. System, 34, 270–83.
Tutin, A. (forthcoming), ‘Evaluative adjectives in academic writing in the humanities and social sciences’. Paper presented at Interpersonality in Written Academic Discourse: Perspectives across Languages and Cultures, Jaca, 11–13 December 2008. Available from interlae_2008_tutin.pdf
Van Roey, J. (1990), French-English Contrastive Lexicology: An Introduction. Louvain-la-Neuve: Peeters.
Vassileva, I. (1998), ‘Who am I/who are we in academic writing? A contrastive analysis of authorial presence in English, German, French, Russian and Bulgarian’. International Journal of Applied Linguistics, 8 (2), 163–90.



Voutilainen, A. (1999), ‘A short history of tagging’, in van Halteren, H. (ed.), Syntactic Wordclass Tagging. Dordrecht: Kluwer Academic Publishers, pp. 9–21.
Wang, J., Liang, S. and Ge, G. (2008), ‘Establishment of a Medical Academic Word List’. English for Specific Purposes, 27 (4), 442–58.
Wang, K. and Nation, P. (2004), ‘Word meaning in academic English: homography in the Academic Word List’. Applied Linguistics, 25 (3), 291–314.
Ward, J. (1999), ‘How large a vocabulary do EAP engineering students need?’ Reading in a Foreign Language, 12 (2), 309–24.
Ward, J. (2009), ‘A basic engineering English word list for less proficient foundation engineering undergraduates’. English for Specific Purposes, 28, 170–82.
Weissberg, R. and Buker, S. (1978), ‘Strategies for teaching the rhetoric of written English for Science and Technology’. TESOL Quarterly, 12 (3), 321–9.
West, M. (1937), ‘The present position in vocabulary selection for foreign language teaching’. The Modern Language Journal, 21 (6), 433–7.
West, M. (1953), A General Service List of English Words. London: Longman.
Widdowson, H. G. (1983), Learning Purpose and Language Use. Oxford: Oxford University Press.
Widdowson, H. G. (1991), ‘The description and prescription of language’, in Alatis, J. E. (ed.), Linguistics and Language Pedagogy: The State of the Art. Washington, D.C.: Georgetown University Press, pp. 11–24.
Widdowson, H. G. (2003), Defining Issues in English Language Teaching. Oxford: Oxford University Press.
Wilkins, D. A. (1976), Notional Syllabuses. Oxford: Oxford University Press.
Wilson, A. and Thomas, J. (1997), ‘Semantic annotation’, in Garside, R., Leech, G. and McEnery, A. (eds), Corpus Annotation: Linguistic Information from Computer Text Corpora. New York: Addison Wesley Longman, pp. 53–65.
Winter, E. (1977), ‘A clause relational approach to English texts: a study of some predictive lexical items in written discourse’. Instructional Science, 6, 1–92.
Wray, A. (2002), Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.
Xue, G. and Nation, P. (1984), ‘A University Word List’. Language Learning and Communication, 3 (2), 215–29.
Yang, H. (1986), ‘A new technique for identifying scientific/technical terms and describing science texts’. Literary and Linguistic Computing, 1 (2), 93–103.
Zamel, V. (1983), ‘Teaching those missing links in writing’. ELT Journal, 37 (1), 22–9.
Zemach, D. and Rumisek, L. (2005), Academic Writing: From Paragraph to Essay. Oxford: Macmillan.
Zhang, M. (2000), ‘Cohesive features in the expository writing of undergraduates in two Chinese universities’. RELC Journal, 31, 61–95.
Zhang, H., Huang, C. and Yu, S. (2004), ‘Distributional consistency: A general method for defining a core lexicon’, in Proceedings of the Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 26–28 May 2004. Available from
Zwier, L. J. (2002), Building Academic Vocabulary. Michigan: The University of Michigan Press.


All cited internet sources were correct as of 2 August 2009.

Author index

Note: Page numbers in italics denote illustrations.

Aarts, J. 35, 143
Ädel, A. 69, 72, 150, 157, 161
Aijmer, K. 152, 157, 176
Altenberg, B. 83, 106, 121, 126, 150, 152, 180
Archer, D. 42, 45
Aston, G. 73
Baayen, R. H. 145
Bahns, J. 204
Bailey, S. 206
Baker, M. 17, 19, 20, 21, 22, 24
Baker, P. 48
Barkema, H. 84, 100
Barkhuizen, G. 67
Bartning, I. 79
Bauer, L. 12
Bazerman, C. 72
Beheydt, L. 14, 27
Bestgen, Y. 62
Bhatia, V. 26
Biber, D. 2, 29, 55, 83, 122, 137, 143, 179, 211, 237n. 3
Billuroğlu, A. 16
Biskup, D. 185
Bley-Vroman, R. 70
Bourigault, D. 237n. 7 (Ch. 4)
Bowker, L. 35, 206
Brill, E. 37
Buker, S. 81
Burger, H. 83, 121
Burnard, L. 73
Campion, M. E. 11
Candlin, C. 206
Carter, R. 11, 23, 85, 207, 238n. 3
Celce-Murcia, M. 169
Charles, M. 214
Chen, C. W. 126, 152, 174
Chung, T. 14, 18
Clear, J. 102
Cohen, A. D. 18
Coltier, D. 88
Connor, U. 152, 190
Conrad, S. 59, 83, 85, 121, 179, 180
Cook, G. 207
Corson, D. 13
Cortes, V. 1
Cowan, J. R. 17, 18
Cowie, A. P. 213
Coxhead, A. 3, 5, 9, 10, 11, 12, 13, 16, 17, 20, 21, 25, 27, 28, 31, 34, 44, 63, 82, 122, 212
Crewe, W. 169, 174, 175, 176, 193, 201
Curado Fuentes, A. 46, 213
Cutting, J. 72
Davies, A. 71
De Bot, K. 239n. 3
Dechert, H. 155, 168
De Cock, S. 30, 72, 86, 121, 157
DeRose, S. 37
Dudley-Evans, T. 211
Eldridge, J. 26, 214
Elley, W. B. 11
Ellis, R. 67
Engels, L. K. 11
Evans, S. 1
Evert, S. 75, 76, 78
Farrell, P. 17, 18, 20
Farrow, M. 48, 50, 62
Feak, C. B. 24


Field, Y. 171, 177, 193
Firth, A. 70
Fisher, D. 204
Flowerdew, J. 2, 60, 61, 172, 178, 201, 214
Flowerdew, L. 23, 199, 201, 204
Francis, G. 22, 23, 59, 235n. 2 (Ch. 1)
Garside, R. 37, 38, 39
Ghadessy, M. 11
Gilquin, G. 1, 7, 70, 71, 151, 153, 195, 197, 207, 207, 208, 209, 210, 225, 238n. 5
Gläser, R. 237n. 7 (Ch. 4)
Gledhill, C. 83, 102, 119, 123, 161, 236n. 4, 238n. 9
Goodman, A. 17
Granger, S. 4, 26, 32, 65, 67, 68, 70, 71, 72, 84, 100, 102, 118, 122, 123, 126, 143, 145, 150, 151, 152, 155, 157, 168, 169, 170, 177, 179, 182, 184, 185, 194, 197, 202, 204, 206, 213, 214, 215, 216, 236n. 1, 236n. 2, 238n. 9
Green, C. 1
Gregg, K. R. 218
Gries, S. 48, 50
Groom, N. 215
Halliday, M. 203
Hamp-Lyons, L. 206
Hancioğlu, N. 15, 16, 27, 63, 212
Harris, S. 22
Harris Leonhard, B. 85
Hasan, R. 203
Hasselgren, A. 147
Heasley, B. 206
Heatley, A. 44
Hegelheimer, V. 203
Hinkel, E. 1, 3, 33, 59, 148
Hirsh, D. 10, 34
Hoey, M. 23, 26, 192, 216, 217
Hoffmann, S. 75, 76, 86
Hogue, A. 85
Howarth, P. 119, 165, 217
Huckin, T. 26, 214
Hunston, S. 118
Huntley, H. 9, 11, 16, 82
Hwang, K. 11, 13, 14
Hyland, K. 3, 24, 25, 26, 31, 32, 72, 90, 92, 93, 99, 147, 157, 189, 201, 211, 214
Ide, N. 37
Ivanič, R. 235n. 2 (Ch. 1)
Jarvis, S. 4, 182, 183, 184, 185, 197, 216, 238n. 13
Johansson, S. 31
Johns, T. 214
Jordan, M. P. 23, 235n. 2 (Ch. 1)
Jordan, R. R. 1, 81, 82, 85, 201, 202, 203, 211
Juilland, A. 50
Kamimoto, T. 239n. 2
Katz, S. 48, 236n. 8
Kellerman, E. 197
King, P. 206
Kosem, I. 62
Krishnamurthy, R. 62
Kroll, B. 69
Lake, J. 169, 170, 201
Lakshmanan, U. 70, 71
Larsen-Freeman, D. 169
Laruelle, P. 91
Laufer, B. 10
Lee, D. 73, 74, 132
Leech, G. 11, 34, 35, 70, 72
Lennon, P. 165, 168
Li, E. S.-L. 18
Liou, H.-C. 206
Lonon Blanton, L. 85
Lorenz, G. 72, 101, 143, 146, 150, 152, 157, 169, 173, 177, 193
Luzón Marco, M. J. 22, 83, 137
Lynn, R. W. 11
Major, M. 11
Martin, A. 19, 20, 21, 27
Martínez, I. 15, 27, 34, 82, 212
McCarthy, M. 9, 23, 211, 235n. 2 (Ch. 1), 238n. 3,

McEnery, A. 30, 35, 76
Mel’čuk, I. 83
Meunier, F. 143, 150
Meyer, P. G. 24, 27
Miller, J. 237n. 3
Milton, J. 72, 147, 157, 179, 201, 202, 203, 206, 213
Moon, R. 121
Mudraya, O. 13, 14, 16, 17, 19, 31, 34
Mukherjee, J. 70, 71, 160
Müller, S. 237,
Narita, M. 126, 152, 174, 177, 179, 202
Nation, I. S. P. 12
Nation, P. 1, 3, 10, 11, 13, 14, 16, 17, 18, 23, 26, 44, 82, 185
Neff, J. 73, 75, 152, 157, 194, 195
Neff van Aertselaer, J. 73
Nelson, M. 46
Nesi, H. 31, 32, 33
Nesselhauf, N. 73, 78, 101, 164, 166, 185
Neufeld, S. D. 16
Oakes, M. P. 48, 50, 62, 76
Oakey, D. 213
Obenda, D. 82
O’Dell, F. 9
Odlin, T. 185, 204
Osborne, J. 238n. 12
Oshima, A. 85
Paquot, M. 15, 26, 36, 62, 84, 100, 118, 122, 123, 135, 151, 153, 157, 168, 190, 195, 197, 204, 213, 214, 238nn. 9, 13, 239n. 6
Partington, A. 15
Pavlenko, A. 185
Pawley, A. 71
Payne, E. 17
Pearson, J. 35
Pecman, M. 119
Pemberton, R. 18
Perdue, C. 192
Petch-Tyson, S. 142, 145, 157
Piller, I. 71
Praninskas, J. 11
Quirk, R. 179


Rayson, P. 29, 30, 37, 38, 43, 47, 50, 61, 76, 145, 150
Renouf, A. 102
Reynolds, D. W. 1
Ringbom, H. 182, 185, 192
Rodriguez, E. C. 50
Rohrback, J.-M. 160
Römer, U. 85
Ruetten, M. 85
Rumisek, L. 85
Rundell, M. 201, 207
St Johns, M. J. 211
Saville-Troike, M. 26
Scarcella, R. C. 13
Schleppegrell, M. J. 180
Schmid, H.-J. 235n. 2 (Ch. 1)
Schmitt, D. 11, 16
Schmitt, N. 11, 16
Scott, M. 2, 45, 46, 47, 48, 69, 236n. 9
Seale, C. 46
Selinker, L. 70, 71, 182, 183, 239n. 2
Shaw, P. 69
Siegel, M. 237n. 3
Siepmann, D. 82, 88, 101, 107, 126
Sinclair, J. M. 2, 26, 35, 82, 101, 102, 118
Smith, N. 35, 37, 39
Soler, V. 118
Stein, G. 16, 235n. 1 (Ch. 1)
Strevens, P. 13
Stubbs, M. 10
Sugiura, M. 126, 152, 174, 177, 179, 203
Summers, D. 207
Sutarsyah, C. 26
Swales, J. M. 24, 31, 61, 86, 92, 132, 189, 207
Swallow, H. 185
Syder, F. H. 71
Tan, M. 71
Tankó, G. 147
Tapper, M. 126, 150, 152
Thomas, J. 42

171. 36. 16 Yang. 11. J. A. 25. A. Y. 3 Weissberg. S. C. I. 80 West. 26 Ward. 50 Zhang. 20. 35. 255n. 177. A. A. G. H. 70 Wang. 184 Van Roey. 206 Tutin. 92 Tseng. L. J. 16. K. R. 46. C. R. P. 1. 85 Thompson. 15. 118 Thompson. 34 Waring. 10. 27. M. 201 Zemach. 61. O. 42 Winter. 69 Trimble. 237n. P. R. 17 Yip. 118 Tribble. 24. 22. 2 Thurstun. 176. M. L. 12. 34 Wang. 169. 193 Zamel. 21 Tse.260 Author index Weinert. 212 Wilkins. 126. L. 81 Wilson. 18. A. V. J. H. 30. 86 Xue. 206 Tognini-Bonelli. 152 Voutilainen. W. 26. 118 Tyson. 60 Widdowson. 11. G. M. 38 Wagner. 85 Zhang. 170. 3. G. 14. 177. 152. J. H. 22 Wray. 177. D. 185 Vassileva. 13. B. 207. E. 11 . J. J. 13 Zwier. D. 62.-C. E. 194 Zimmerman. 44.

59. 31. 15. 43–4. overused and underused in ICLE 144 academic literacy 231 academic vocabulary 7 vs. 119. 34–5. 47 bilingual dictionaries 204 Billuroglu-Neufeld-List (BNL) 16 ˘ blend 168 BNC-AC-HUM 75. 43 part-of-speech annotation 30. 100. 59 semantic misuse and 139–40 sentence position 179 annotation 34–6.Subject index Note: Page numbers in italics denote illustrations. 38. 122. 95. 55–61. 32. in ICLE 143 exemplificatory discourse markers in 88 grammatical distribution categories in 55 need for concordancing in 61 need for pedagogic mediation 61. 60. 122 academic vocabulary and 60 automatic semantic analysis of 82–3 distribution. 212 activity verbs 59 adjectives 101. of AKL 82–3 Baby BNC Academic corpus (B-BNC) 31. 37. core vocabulary and technical terms 10–13 definition of 212 fuzzy vocabulary categories 13–17 meaning of 9. 3–4. 122. 167 potential academic 57. 59 adverbials/adverbs 93. 214. 43 semantic annotation 35. 27. 133. 78. in GSL and AWL 60 words. Academic Corpus 11–12 academic discourse 2. 102. 37–42. 7. 15. 34. 90. 114–18 expressing a concession in 109 expressing possibility and certainty 118–20 reformulating in 109 see also British National Corpus (BNC) . 217 academic discourse community 31 Academic Keyword List (AKL) 5. 63. 20. 40. 5. 27–8. 213 in the Academic Keyword List 58 mono-lexemic 91 multiword linking 121 potential academic 58. 40. 101 attitudinal formulae 84. 12. 82. 36. 63. 24–5. 102 comparing and contrasting in 112–14 expressing cause and effect in 110–11. 41. 11. 17. 53 association measures 76. 123 automatic semantic analysis. 16. 25. 82 nouns and 56 overused and underused clusters with 156 and rhetorical functions 81–7 words distribution. 28 and sub-technical vocabulary 17–21 Academic Word List (AWL) 3. 24. 118 in the Academic Keyword List 57 as co-occurrents of academic nouns 100.

95. 10–11. 93. 215. 172. 100. 238n. 31. 67. 95. 102. 210. 180. 30. 164. 204. 211. 150. 79. 192. 1. 133. 120 overuse of 146 sentence initial position of 194 connectors 140. 203. 152. 203 BNCweb 75. 213 non-technical words 18 textual 123. 191 corpus-based approach 2. 76. 203. 22. 130. 148 colligation 168 colligational errors 166. 193–4. 213. 121–2 comparative fallacy 70. 96. 147 Centre for English Corpus Linguistics (CECL) 207 Clairefontaine Les fiches essentielles du Baccalauréat en anglais 160 CLAWS 37–42. 118. 177. 115–17. 178–9. 166. 181. 77. 78. 226–34 in BNC-AC-HUM 112–14 EFL learners’ use of 148. 238n. 4. 167. 59. 190. 3. 207 Baby BNC Academic Corpus (B-BNC) 31 BNC-AC 78 BNC-AC-HUM see BNC-AC-HUM Index 73–5 mark-ups 73 BROWN corpus 47 burstiness 48. 76. 208. 169. 77. 161. 202. 70. 8 Cambridge Advanced Learner’s Dictionary 207 cataphoric markers 90–1 cause and effect markers 87. 61. 71.2 (ch2) British National Corpus (BNC) 4. 219–25 in BNC-AC-HUM 110–11 EFL learners’ use of 146–8. 152 control corpus 67. 71 comparison and contrast markers 87. 188–9 cohesion 22. 148. 77 booster 157 British Academic Written Corpus 217 British Academic Written English (BAWE) Pilot Corpus 32–3. 125. 3 advance and retrospective labelling 22 grammatical 203 lexical 18. 102. 119. 172. 79. 188. 133–4. 102. 74. 31. 100. 134. 35 Corpus Query Processor (CQP) 75. 203 Constituent Likelihood Automatic Wordtagging System (CLAWS) 37–42. 76 co-text 2. 174–82. 168 collocation 23–4. 160. 75–6. 17. 193. 178. 9 collocational framework 102 collocational overlap 165 . 235n. 202. 87. 84. 99. 59 code gloss 90. 114. 87. 165. 146. 193 preferred co-occurrences in EFL writing 160–8 core vocabulary 3. 16. 201–2. 84. 85. 132. 123. 78. 216. 15 Corpus de Dissertations Françaises (CODIF) 184. 8 Contrastive Interlanguage Analysis (CIA) 4. 186. 119–20. 1 (ch3) communicative phrasemes 84. 73 co-occurrence 37. 162. 73. 76. 101. 
1–3 medial position for 180 overuse of 201–2 semantic misuse 201 sentence 22 sentence position 141. 215 contrastive rhetoric 2. 236n. 217. 236n. 34. 193. 22. 30.262 Subject index Common European Framework of Reference for Languages (CEF) 236n. 238nn. 137. 22. 29. 170. 149 conceptual frequency 86 concession markers 87 in BNC-AC-HUM 109 conjunctions complex 40. 70. 59 content words 10. 174–6. 106. 65. 238n. 6 corpus-driven approach 29. 119.

211. 119 English as a Second Language (ESL) 33. 93. 44. 204 English for Academic Purposes (EAP) 15. 86 positive keyword 47. 214. 14. 55. 95. 132. 26. 159 global keyword 48 local keyword 48 negative keyword 47. 19 developmental factor 4. 235n. 75. 138 cataphoric marker 90. 204 second person 91–2. 27. 20. 86. 107 as directives with rhetorical purpose 92 first person plural 136. 84 attitudinal formulae 84. 94. 192. 46–8. 237n. 189–91 ‘extended units of meaning’ 118 fiction 46. 45. 93. 71. 102. 5. 143 fuzzy vocabulary categories 13–17 263 General Service List of English Words (GSL) 10–11. 130. 20. 33. 71. 98. 78. 93. 148. 45. 159. 14. 59 document-level burstiness 236n. 85 English for Specific Purposes (ESP) 9. 123 FROWN corpus 47 functional-product approach 82 functional syllabus 81–2. 65.Subject index data-driven approach 4. 145 global keywords 48 grammatical cohesion see cohesion graphemic words 40 hedge 2. 7 illocutionary nouns 23 imperatives in academic writing 93. 3 Juilland’s D statistical coefficient 50–3 keyness 4. 29. 181. 93. 61. 59. 55. 72–3. 16. 16. 95. 62. 212 keyword 30. 213. 15. 25. 126. 23 dispersion see distribution distribution 29. 27. 47. 216 directives see imperatives discipline 3. 135. 37 idiom 3. 73. 45. 98 IMS Open Corpus Workbench 75 International Corpus of Learner English (ICLE) 4. 18. 119. 217 epistemic modifiers 147 evenness of distribution see distribution and Juilland’s D statistical coefficient exemplifiers 85–8 in BNC-AC-HUM 88–108 learners’ use of 125–42. 60 general service word 16. 47 field approach 82 fixed phrase 23–4. 30. 18. 236n. 123 textual formulae 121 free combination 100. 13. 137. 47. 180. 50–3. 143 ditto-tag 38. 213. 15. 61. 74. 10. 99. 99. 44. 8 EAP material design 221 EAP teaching 26. 83 function words 10. 212 homographs 25. 44. 103. 212 genre 2. 91 endophoric marker 91. 131. 125. 37. 157 high-frequency word 5. 119 engagement marker 92. 197. 93 discourse-organizing vocabulary 9. 123. 132. 122. 84. 55. 
214 endophoric markers 91. 67–9. 60. 36 data-driven learning 214 derivation 12. 102. 191. 55. 121 FLOB corpus 47 formulae 12. 188. 86 . 84. 103. 78. 75. 27. 62. 23. 101. 86. 183. 98. 40. 62. 125. 26. 189–90. 215. 2 see also knowledge domain discourse marker 88. 63. 92. 45. 12. 28. 94. 27. 60. 46–8.

194. 32 monolingual learners’ dictionary (MLD) 206–7 morphosyntactic annotation 34–5 multiword expression 37. 25. 184 . 197. UCREL 125 LONGDALE project 239n. 118. 134. 152. 48. 95–6. 102. 10. 123. 162–3 novice writing 1. 76. 140. 195–6. 148. 204 LOB corpus 47 local keywords 48 logico-semantic relationship verbs 59 log-likelihood 47. 133. 185. 157.264 keyword analysis see keyness knowledge domain 31. 59. 181. 97 retrospective labeling 22. 1 Longman Dictionary of Contemporary English 206 Louvain Corpus of Native Speaker Essays (LOCNESS) 32. 4. 118. 161. 152. 35. 215 lexico-grammatical error 155. 239n2 L1-induced factor see transfer L1 influence 182–92 Jarvis’s unified framework 182–4 labeling 22–4. 20. 160. 158. 18–19. 214. 142–50. 161 metalinguistic labels 23. 65. 118. 84. 207. 145–6. 195 Macmillan English Dictionary for Advanced Learners 7. 201. 197 native control corpus 70 native speaker norm. 5 Oxford English Dictionary (OED) 20 paraphrasing and clarifying see reformulation markers L1 frequency 185. 52 mental process nouns 23 mental verbs 59 metadiscourse 24. corpusapproximation to 70. 238n. 217 nuclear vocabulary 9. 49. 201. 100–1. 121. 98 labels 22–3 semantic misuse 172–3 language-activity nouns 23 learners’ dictionary 206–7 lexical bundle 69. 192–3 lexical teddy bear 147 lexical transfer see transfer lexico-grammar 26. 59 Michigan Corpus of Upper-level Student Papers 217 Micro-Concord Corpus Collection B (MC) 31. 73. 20 nuclear words and pragmatic neutrality 14 organizational function see rhetorical function overuse 86. 99. 14. 155. 160. 150. 186. 177 lexical cohesion see cohesion lexical extension 213 lexical priming see priming lexical repertoire 3. 137. 6 meaning 10. 144. 5. 71. 88. 19. 197. 90. 129. 194. 167 verbs as co-occurrents of academic 95–9. 32 Subject index non-technical meaning 18. 78. 31. 156. 173. 125 log-likelihood calculator. 118. 179. 71 native student writing 72 negative keywords 86 n-gram 69 non-technical term 17. 62. 93. 4. 53. 215. 9. 
18 non-technical words 18–19. 159. 85. 125. 14. 184. 215 linking word 3. 138. 180. 190. 137. 197. 190. 96. 60. 19 technical meaning 18. 164. 185 delexical meaning 118 figurative meaning 119 over-extension 146. 24 nouns 22–3. 130. 193. 143. 206. 194. 177. 218. 121. 138 in the Academic Keyword List 56 adjectives as co-occurrents of academic 100. 83. 151. 126. 203 advance labeling 96. 44. 105. 170. 13. 237n. 108. 85. 195. 52.

157 mono-lexemic phraseme 88. 112–14. 161 phraseological accent 83 phraseological analysis 83–4. 76. 27. 45. 101 demonstrative 96 as exemplified item 97 impersonal 168 personal 98. 136. 151. 14. 201 semantic tagging 43 semantic transfer see transfer semi-technical vocabulary 17 sentence connectors 22 sentence stem 97. 139. 108. 108. 37–8 see also annotation pedagogic mediation 61. 153. 184. 33 sub-technical vocabulary 17–21. 122. 143 complex 40. 192. 178. 90. 4. 44–55 preferred co-occurrence 2. 161. 120. 22. 161 referential phraseme 84. 78. 148. 81. 33. 164 phraseme 83. 119 structural phraseme 121 textual phraseme 84. 161. 135 rhetorical function 5. 119.Subject index parsing see syntactic annotation part-of-speech (POS) tagging 34–5. 206. 108. 211 production 1. 148. 13. 164 third person 157 265 range 1. 160. 212 Perl program 48 personal metadiscourse 161 personal pronoun 98. 109. 69. 152. 30. 15. 121. 142. 160. 166. 48–50. 73. 139. 170. 209 in BNC-AC-HUM 109 register awareness 5. 7. 90. 9. 216. 62. 123. 141. 90. 60. 208. 211. 106. 22. 84. 197. 155. 106. 115–17. 164. 161. 62. 172. 212 Range corpus analysis program 44 reception 1. 13. 101. 136. 24. 61. 193. 137. 10. 213. 99. 148. 203. 93. 136. 125. 95. 145. 188 phraseological competence 217 phraseological infelicity 155 phraseology of rhetorical functions 65. 203. 160. 118. 85. 123. 118. 190. 14. 15–16. 195 spoken frequency counts 145 Student Writing Corpus 32. 24 syntactic annotation 35 tagging see annotation teaching material 82. 195. 95. 138. 213. 73 reference corpus 46. 114. 177. 180. 21. 213 speech-like lexical item 151–2. 185. 119 reformulation markers 87. 138. 151. 139–40. 217 mental 192 transfer of 192. 110–11. 142. 132. 119–20. 17–21 . 155. 193 preferred ways of saying things 83. 26. 145. 135 specialised non-technical lexis 17. 144 priming 192. 11. 213 phraseological ‘cascade’ 161. 123. 197. 214. 193. 12. 4. 68. 169. 102. 93. 47. 202. 207. 82. 9. 135 referential phrasemes 84. 214 communicative phraseme 84. 
20. 238n. 197. 190. 62. 131. 18 speech 2. 18 technical vocabulary 13. 193 preposition 97. 118. 1 technical terms 3. 106. 84. 217 frequency-based approach to 122 positive keywords 86 potential academic words 29. 41. 90. 132. 81. 188. 192–3. 168–74. 212 pronoun 23. 203 procedural vocabulary 22. 94. 166. 9. 122. 153. 142. 98. 121. 63. 84. 17. 71. 150. 121. 125. 154. 123. 150–2. 215 rhetorical overstatement 176 Robert & Collins CD-Rom 204. 94. 121. 205 semantic annotation 35 semantic misuse 5. 120. 121. 70. 138. 215 reporting verbs 59 retrospective labelling 22 rhemes 97.

194. 190–1 transfer of meaning 185 transfer of the phraseological environment 185. 45 in AWL 12. 16. 191. 159 underused words see negative keywords University Word List 16 USAS (UCREL Semantic Analysis System) 37. 171. 16–17. 197 transfer of primings 192. 26–7. 216. 11. 182. 157 word list 2. 118–20. 107 underuse 86. 168. 162–3 forming rhemes with noun 98 lexical 36 linking 59 mental 59 potential academic 57 reporting 59 in sentence-initial infinitive clauses 138 Vocabulary 3 items 22 Web Vocab Profile 59. 51. 157–8 activity 59 co-occurrents of academic nouns 95–9. 137. 49. 134. 119. 93. 17 in GSL 11 word form 12. 218 lexical transfer 216 transfer effects 182–5 transfer of form 185 transfer of form/meaning mapping 185. 197. 8 Wmatrix 36–7. 39. 161 textual sentence stems 97. 144. 147. 121. 201–3 transfer of use 185 transfer-related factor see transfer typicality 106. 135. 203. 131. 94. 149. 40. 203. 47. 17. 194. 126. 190. 121. 137. 160. 123. 148. 27. 69 Concord tool 69 Detailed Consistency Analysis 51 Keywords option 155 WordList option 49 ‘you-know-it-when-you-see-it’ syndrome 182 text coverage 10. 16.266 Subject index Varieties of English for Specific Purposes dAtabase (VESPA) 217 verbs 24. 143. 197. 91. 157. 42–4 . 155. 48. 146. 17. 158. 192 transfer of training 144. 118. 216 transfer of L1 frequency 185. 216 transfer of style and register 185. 36. 102. 181. 16. 145. 130. 34. 53 word families 12. 59. 60 within-document burstiness 236n. 188. 236n. 135 tokenisation 38 transfer 4. 136. 156. 46 in the Academic Keyword List 57 Word Smith Tools 2. 4 text nouns 23 textual formulae 121 textual phrasemes 84. 15.