Since English remains the primary language of science and research across the globe, many academics are required to produce research in a language that is not their own. This research has been motivated by the difficulties this presents for the post-graduate students at the Eastern Mediterranean University (EMU). The main aim of the study is to construct a comprehensive pedagogic corpus for such students, and to incorporate it into an advanced academic thesis writing course. To this end, a learner abstract corpus (LAC) and a target abstract corpus (TAC) were compiled respectively from work produced by post-graduate students at EMU, and from abstracts written by post-graduate students in English speaking countries. Both quantitative and qualitative methods were utilized to analyse the corpora. The comparison of the corpora exhibited extensive use of higher frequency vocabulary, a tendency to repeat similar items, and recurrent inadequacy in using appropriate collocations and lexico-grammatical patterns in the LAC. The work in the TAC, however, demonstrated the use of a wider range of lower frequency words, as well as a more varied lexico-grammatical utilization of these items. Accordingly, a pedagogic corpus was constructed. This corpus includes the Academic Abstract Corpus (AAC) Bank, which offers alternative lexico-grammatical patterns for fulfilling the required generic moves and sub-moves in abstract and thesis writing; the TAC Wordlist of 165 key words for thesis writing; the web concordances of the LAC and the TAC; and a variety of teacher-led data driven and learner-led discovery tasks as well as other diverse academic writing resources. The corpus-informed course is mounted on Moodle, a virtual learning platform founded on social constructivist principles. The study produced major conclusions regarding corpora, wordlists, and lexico-grammatical patterns, the broader implications and applications of which are explored from a range of perspectives.


Keywords: corpus, pedagogic corpus, learner corpus, lexico-grammar, genre, academic writing



Today, the fact that English is the language of science and research throughout the world requires many academics to produce research in a language that is not their own. This study was inspired by the difficulties that post-graduate students at the Eastern Mediterranean University experience in this context. The main aim of the research is to construct a comprehensive pedagogic corpus for such students and to incorporate it into an advanced thesis writing course. To this end, two corpora of thesis abstracts were compiled, one written by post-graduate students at the Eastern Mediterranean University and the other by students who completed post-graduate studies at universities in countries where English is the native language, and both qualitative and quantitative methods were used to analyse them. These corpora were named the learner abstract corpus (LAC) and the target abstract corpus (TAC). The comparison of the two corpora showed that the abstracts in the LAC made extensive use of words from the highest frequency wordlists, repeated similar words, and were inadequate in the correct use of collocations and lexico-grammatical patterns. The analysis of the abstracts in the TAC, on the other hand, showed that less frequent words were used often, and were used correctly with their collocates and within lexico-grammatical patterns. The comprehensive pedagogic corpus created on the basis of these findings consists of a bank of the collocations and lexico-grammatical patterns needed to realize the moves used in abstract and thesis writing (AAC Bank), the 165 word families most frequently used in abstract writing (TAC list), the web concordances of the two corpora used in the analysis, a variety of computer-supported and data-driven teacher-led and learner-led activities, and a range of academic writing resources. The corpus-informed advanced thesis writing course has been moved to a virtual interactive web environment (Moodle) founded on the principle of social constructivism. This study has produced important conclusions regarding corpora, frequency-based wordlists, and collocations and lexico-grammatical patterns. The implications and areas of application of these conclusions are examined in the study from different perspectives.

Keywords: corpus, learner corpus, pedagogic corpus, lexico-grammar, genre, academic writing



In a unique corpus-based study of dissertation acknowledgements, Ken Hyland remarks that completing a thesis is a long and difficult task, and that many students see “an acknowledgement as an important way of publicly recognizing the role of mentors and the sacrifices of loved ones” (2004b, p. 306). As I have now reached this concluding stage, I can understand these sentiments only too well.

I would like to express my sincere gratitude to my supervisor, Assoc. Prof. Dr. Gülşen Musayeva Vefalı, for being my constant guide and mentor, and for providing me with invaluable feedback by reading my numerous drafts not merely line by line, but word by word, to ensure that there were no gaps or omissions. I genuinely appreciate her tireless work and support.

I am extremely grateful to the members of the thesis monitoring committee, Prof. Dr. Gürkan Doğan and Assoc. Prof. Dr. Necdet Osam. It is thanks to their comprehensive, thoughtful, and highly useful feedback that this research developed in the way that it has. I would also like to thank the members of the jury, Prof. Dr. Ülker Vancı Osam, Prof. Dr. Deniz Zeyrek and Assist. Prof. Dr. Ali Sıdkı Ağazade for their valuable comments towards the improvement of this thesis.

I decided to conduct corpus-related research in 2002. It was during this period that I first read Averil Coxhead’s article on the development of the Academic Word List. Her article then led me to Prof. Dr. Tom Cobb’s extraordinary Lexical Tutor Website, the riches of which have generously been made freely available to us all. I would therefore like to acknowledge the genuinely inspirational role of both these researchers in the realization of this study.

I would like to thank all those friends that accompanied me on the long road to this PhD. It was fun because of them. Special thanks go to my closest companion, Elmaziye Özgür Küfi, with whom I went through thick and thin.

My colleagues at the Department of General Education have been extremely understanding and helpful. I would like to thank all of them, especially Ayfer Şen and Yeşim Dede for helping ensure that time and space were made available for me to focus on my research, especially in the last period, and Aytül Dereboylu for being by my side when I most needed support and a helping hand.

I should also thank all those ENGL501 students who have taken this course over the years, for bearing with me as the course evolved, and for willingly taking time out of their own busy research schedules to assist me further when I asked. This research and the accompanying course have been developed on their behalf, and their positive responses and appreciation therefore make me feel immensely proud and privileged.

I would like to thank my whole family, and especially my sister, Berna, my brother, Attila, my aunt and my uncle Jennifer and Engin Kemal Örek, and my niece and three nephews for their encouragement and support. Special thanks go to Özbil Ege, who showed a genuine interest in my work despite his young age.

This has indeed been a long and stressful journey. I would like to express my deepest gratitude to my beloved parents, Serpil and Özbil Hancıoğlu, for teaching me the meaning of hard work and sacrifice, for encouraging me, and believing in me not only through this period, but all my life. It really means a lot to me.

How can I ever thank my dear friend, Steve Neufeld, who was always there when I needed help, guidance, and advice? On this strenuous journey, it was wonderful to be accompanied by a true friend, and a positive personality, one who seems to lack words of negation in his mental lexicon.

I owe the greatest debt and gratitude to my dear husband, who looked after me, encouraged me, guided me with his wisdom, and always stood by me. It is a cliché to say “I could not have done it without you”. In my case, it is the essential reality. I could not have done it without you, John P. Eldridge.



The skill of writing is to create a context in which other people can think.
Edwin Schlossberg

This research study adopts a corpus-informed approach to academic writing pedagogy. It employs two corpora with the aim of constructing a pedagogic corpus with multiple components that incorporate teacher-directed data-driven in-class work, complemented by a virtual learning environment providing access to the authentic corpus data and learner-led exploratory tasks. This multi-component pedagogic corpus is intended to assist non-native post-graduate students involved in research and publication in producing accurate and appropriate written texts.

This introductory chapter first provides the background to the study. After the problem that prompted this research is described in detail, the chapter presents the purpose of the study. Following a discussion of the factors that make this research significant, the terms that are used in this study are defined.

1.1 Background to the Study

English is the primary international language of research communication (Garfield & Welljams-Dorof, 1990; Krashen, 2003; Swales, 1990), “today’s premier research language” (Swales, 2004, p. 33), and “indeed the international language of science” (Wood, 2001, p. 71). As the overall trend in many domains, including education and academic scientific research, is toward globalization, “more and more nonnative speakers are seeking to publish in international journals devoted to English language teaching, applied linguistics, and related areas” (Flowerdew, 2001, p. 121). Further, while the percentage of articles written in English in the 1977 Science Citation Index was 83% (Krashen, 2003), by 1997, this number had increased to 95% and of this, only half came from authors in English-speaking countries (Graddol, 1997). This increase was due not to more research being done by scholars in English-speaking countries, but to more scholars from non-English speaking countries publishing in English (Krashen, 2003). It is clear, therefore, that non-native speakers do write ‘a considerable number’ of research articles even in the “most prestigious journals in science” (Wood, 2001, p. 80).

Swales (2004) refers to today’s ‘Anglophone research world’, and states that “the status and contribution of the non-native speaker of English has become somewhat more central than it used to be and increasingly (albeit slowly) is perhaps recognized as such by native speakers of English” (p. 52). He further holds that by the beginning of the 21st century, there has been some internationalization of the research world, and the role of ‘non-Anglophones in that Englishized world’ has gained greater recognition (Swales, 2004, p. 46). According to Wood (2001), to become a member of this world, the scientist needs to produce research accepted by the community. “The more central the claim and the more widely accepted by the community, the more central a member of the community the researcher becomes” (Wood, 2001, p. 81). For the achievement of these aims, Wood (2001) claims, “the researcher must deploy a skilful use of language” (p. 81).


The statistics and observations above clearly indicate that non-native speakers of English are under increasing pressure both to follow the latest research and, probably even more so, to have their own research published. Non-native speakers of English “risk being unaware of - and overlooked by - mainstream international research unless they learn to read, write, and publish in English” (Garfield & Welljams-Dorof, 1990). Given the speed of change and development in all aspects of science, technology, and research in general, and the generally accepted view that a common language is needed to disseminate research findings efficiently, it is unlikely that the position of English as the dominant research language will diminish in the near future. Wood (2001) states that:

for scientists to become recognized and successful their work must be read and cited by their peers as frequently as possible. To ensure such citation it is imperative that their work be accessible to as many as possible and thus that it be written in English. (p. 71)

Hence, non-native speaker researchers and academics would seem to have little choice but to continue trying to master the prevailing conventions of academic English. This may not be a major problem for NNS (non-native speaker) researchers with experience, as “experienced NNS writers are familiar with the discourse requirements of their discipline” (Wood, 2001, p. 77). Nevertheless, Wood (2001) acknowledges that the beginning writer faces difficulties in terms of publishing research (p. 77), and cites Canagarajah (1996), Jernudd and Baldauf (1987), St John (1987), and Swales (1990, 1996), who have emphasized non-native writers’ difficulties in publishing in English (p. 77).


There seems to be little question, then, that non-native speakers of English need and will continue to need a lot of guidance and support in developing acceptable performance levels in reporting research. This is more so for beginning researchers. Whether or not they have traditionally been provided with this support is a different issue. Swales, for example, points out that there is little research on “how non-native speakers of English manage to survive in an increasingly English-dominated research world” (1990, p. 102).

Hyland (2004a) also agrees that “in an era of globalisation, English is now established as the world’s leading language for the dissemination of academic knowledge” (p. ix). He further emphasizes that:

whether we see this as a facilitative lingua franca or a rampaging Tyrannosaurus rex (Swales, 1997), the dominance of English has transformed the educational experiences and professional lives of countless students and academics across the planet. (Hyland, 2004a, p. ix)

Writing, therefore, has become a central element of university courses, as well as professional development programs, which has necessitated an understanding of “what these discourses of the academy are, and what counts as ‘good writing’” (Hyland, 2004a, p. x).

According to Grabe and Kaplan (1996), “academically valued writing requires composing skills which transform information or transform the language itself” (p. 17). Therefore, writing, particularly the more complex composing skill appreciated in the academy, “involves training, instruction, practice, experience, and purpose” (Grabe and Kaplan, 1996, p. 6). Conventionally, English for Academic Purposes (EAP) classes have offered academic language support, especially to university students. Yet, these courses have generally tended to focus on the general needs of students involved in academic studies, and catered more for university students at undergraduate level, who are not expected to carry out or publish research. However, post-graduate candidates who are engaged in conducting and disseminating research have more sophisticated needs in terms of language knowledge and related skills, the most important of which is producing cohesive and coherent written text. Hyland (2005) holds that EAP teachers should do more research in their classes to better understand their teaching context (p. 60). This need is even more pressing for EAP teachers of post-graduate students, who have very specialized needs, and who need a lot of guidance and support in attaining a language level whereby they can report their research and compete in the international academic discourse community.

Written text is “the product of a series of complicated mental operations” (Clark and Clark, 1977, cited in Richards, 1990, p. 101), and is not easy to construct. After deciding on a meaning to be conveyed, writers must consider the genre, the style they are going to employ, the purpose they want to achieve and the amount of detail required to achieve it (Richards, 1990, pp. 101-102). Nunan agrees that “producing a coherent, fluent, extended piece of writing is probably the most difficult thing there is to do in language” and “it is something most native speakers never master”. He also acknowledges the enormity of this challenge for second language learners, “particularly for those who go on to a university and study in a language that is not their own” (1999, p. 271).

One very important consideration in text creation is that language does not exist in a vacuum, but is a social phenomenon used for social interaction. Gumperz (1968, p. 219) emphasizes this fact by referring to verbal interaction as “a social process in which utterances are selected in accordance with socially recognized norms and expectations”. He states that “the communication of social information presupposes the existence of regular relationships between language usage and social structure” (Gumperz, 1968, p. 220). The fact that language use is closely related to the social context naturally leads to the concept of ‘genre’.

Hyland characterizes genres as “socially recognized ways of using language” (Johns et al., 2006, p. 3). For Swales, a genre is “a class of communicative events, the members of which share some set of communicative purposes” (1990, p. 58), and this purpose determines the genre’s ‘generic’ (Flowerdew, 2000; Halliday & Hasan, 1985; Henry, 2007; Nunan, 1993), ‘organizational’ (Flowerdew, 2000), ‘discourse’ (Swales, 1990), ‘generic move’ (Flowerdew, 2000), or ‘schematic’ (Swales, 1990) structure. This structure is achieved through units of purpose, called ‘moves’ (Swales, 1990) or ‘move structures’ (Flowerdew, 2000), which are fulfilled by lexico-grammar (Henry, 2007, pp. 1-2). Key lexical phrases represent the move structures of a genre (Flowerdew, 2000, p. 374). Moves, in turn, are realized through different ‘strategies’ or ‘tactics’ (Henry, 2007), which are tactical selections of the writer in accomplishing the purpose (Bhatia, 1993, p. 19). These tactics or strategies similarly necessitate the exploitation of lexico-grammar. Therefore, it can be concluded that lexico-grammar has a major function in the fulfillment of strategies or tactics leading to moves, which in turn form the generic structure of a genre, and thereby reflect its communicative purpose.

The major role lexico-grammar plays in text creation requires a thorough analysis of lexico-grammatical features employed to fulfill different communicative purposes in texts, and this comprehensive analysis is nowadays viable through the use of a corpus, “a collection of naturally-occurring language text, chosen to characterize a state or a variety of a language” (Sinclair, 1991, p. 171). Referring to the late 1950s, Leech (1991) recalls that “for years, corpus linguistics was the obsession of a small group which received little or no recognition from either linguistics or computer science” (cited in Granger, 2003, p. 538). Due to the recent developments in computer technology, however, it is now possible for anyone to store large amounts of language data on a computer for analysis. Like many other scholars and researchers, Hunston holds that “corpora, and the study of corpora, have revolutionized the study of language and the applications of language” (Hunston, 2002, p. 1). Referring to the emergence of this new view of language and the use of technology related to it, Sinclair points out that “the analysis of language has developed out of all recognition” (1991, p. 1).

In the last two decades, extensive vocabulary research has been carried out using corpora. In the 1980s, the English Language Research group at the University of Birmingham collaborated with Collins publishers to create language reference works, and the Collins Cobuild dictionary project opened up new research areas in the study and teaching of languages (Sinclair, 1991, pp. 1-3). Extensive research has also been done on compiling wordlists based on frequency, as research shows that most English texts are covered by a limited number of words, with the most frequent 2,000 word families making up 79.7% of all text (Cobb, 2002). Basing her research on the General Service List (GSL), West’s manually compiled 1953 list of the 2,000 most commonly used words in English, Coxhead (2000) produced the Academic Word List (AWL), which is based on a corpus of academic texts from different fields and consists of the most frequent 570 word families not included in the GSL. More recently, Billuroglu and Neufeld (2005), in an attempt to tackle the weaknesses they observed in the GSL and the AWL, compiled the BNL (Billuroglu Neufeld List), which is based on an analysis of six different wordlists and comprises 2,709 word families. These lists are invaluable in teaching English, especially for academic purposes, and are, undoubtedly, indispensable resources for vocabulary recognition and tools for staged vocabulary development. However, they are criticized for treating words as isolated units, and for separating lexis and grammar. Useful as they are in language learning, wordlists cannot be the sole resource for productive vocabulary teaching, since in real life words are never encountered or produced in isolation, but in a social context.
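The notion of coverage behind such figures can be illustrated with a minimal sketch. It assumes a plain-text sample and a set of word forms; the names `tokenize`, `coverage`, `toy_list` and `sample` are hypothetical and introduced only for illustration, and the toy list is not the GSL, the AWL or the BNL. Published lists count word families, whereas this sketch matches individual word forms only, so its output is not comparable with the percentages cited above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(text, wordlist):
    """Return the share of running words (tokens) that appear in the wordlist."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in wordlist)
    return known / len(tokens)

# Hypothetical toy list standing in for a high-frequency wordlist.
toy_list = {"the", "of", "a", "is", "to", "this", "study", "in"}

# Invented sample sentence, not taken from either corpus.
sample = "This study is a study of the use of collocations in abstracts."

print(f"{coverage(sample, toy_list):.1%}")  # prints 75.0%
```

Nine of the twelve running words in the sample are on the toy list, hence the 75% figure. The three words left uncovered are exactly the content-bearing items, which is one way of seeing why frequency lists alone say little about how words combine in context.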

As regards grammar, the belief is that corpus-based studies have the potential to revolutionize grammar teaching in the 21st century through providing register-specific descriptions of English grammar, shifting the emphasis from structural accuracy to appropriate use of structures, and most importantly, incorporating grammar teaching with the teaching of vocabulary (Conrad, 2000, p. 549). Extensive research employing corpora is also being increasingly carried out by genre analysts, especially by researchers involved in EAP. In addition to studies that focus on the generic move structure of many different kinds of genres, there has also been considerable research concerned with how different moves are achieved through language, i.e., lexico-grammar (Bonn & Swales, 2007; Flowerdew, 2000; Henry, 2007; Hyland, 2004a; Hyland & Tse, 2005; Ozturk, 2007; Paltridge, 2002; Weber, 2001). However, in spite of the wealth of the research in this field, there are few studies analyzing problems that non-native post-graduate students face in producing coherent and appropriate language to write their thesis, and to publish in internationally recognized journals. In addition, although there are books (Swales & Feak, 1994), websites, and university academic writing centres trying to provide support for post-graduate students in writing their theses, there has been insufficient attention to thesis writing instruction.

As already stated, EAP practitioners need to do more research in the classroom so as to be able to acquire a better understanding of the teaching-learning experiences, and provide continuous and up-to-date support and guidance to students. The need, therefore, is to focus more on classroom practices, and exploit corpora not only for research but also for pedagogic purposes.

1.2 Statement of the Problem

In academic settings, especially at post-graduate levels, non-native speakers of English are faced with a serious problem. They are specifically expected to produce work at a native-like level to be admitted into the academic discourse community. As the conventions in text types determine the intertextuality of texts, creating texts should not be considered an “individually-oriented, inner directed cognitive process….but an acquired response to the discourse conventions which arise from preferred ways of creating and communicating knowledge within particular communities” (Swales, 1990, p. 4). These communities are known as discourse communities and they are:

recognized by the specific genres that they employ, which include both speech events and written text types. The work that members of the discourse community are engaged in involves the processing of tasks which reflect specific linguistic, discoursal and rhetoric skills. (Swales, 1990, p. vii)


This study focuses on the academic genre of ‘theses’. ENGL501 (currently Advanced Thesis Writing) is an advanced writing course offered to Master’s and PhD candidates from all faculties by the Modern Languages Division of the Department of General Education at EMU (Eastern Mediterranean University), an English-medium university in North Cyprus. The students taking this course come from a variety of countries and backgrounds. The original EFL 501 course was designed by the former School of Foreign Languages, at the request of the EMU Graduate Institute, to provide language support to Master’s and PhD students at the thesis writing stage, because although language support was provided for undergraduate students, post-graduate students had not previously been given language guidance.

The course was first designed to focus on the common language functions and lexis in academic writing prior to thesis writing. Gradually it evolved into a thesis writing course with a language focus. Currently, the aims of the course are specified in the course description as follows: The purpose of this course is to develop the academic thesis writing knowledge and skills of post-graduate candidates. The course focuses on improving the participants’ academic study skills, and their knowledge of academic conventions, and thesis structure and format. It is also the objective of the course to systematically develop post-graduate candidates’ academic vocabulary knowledge and skills, to develop their awareness of the need and benefit of producing multiple drafts with the aim of improving the structure, lexis and style of their own text, and bringing their work to an acceptable level.

The participants of the thesis writing course observe and analyze the organization, the discourse structure, the grammar and lexis of different sections or chapters of authentic theses, and how they are made cohesive and coherent through the use of lexico-grammatical devices. The post-graduate candidates then produce their own work and are encouraged to identify their own problems with language use, and find solutions to those problems. To this end, they work on multiple drafts, which make up their end-of-semester portfolio, until the end-product is adequate.

Although the participants are given guidance and support in terms of the moves making up the generic structure of the thesis in accordance with the genre-based approaches, the quality of most of their work reveals a gap between actual and target performance levels in producing coherent text. The main problem hindering the production of coherent and appropriate texts seems to be the participants’ insufficient knowledge of the lexico-grammatical resources necessary for meaning creation. This problem possibly arises from insufficient exposure to, and lack of awareness of collocations, and syntagmatic and paradigmatic relations that are of major significance for creating meaning. This problem is consistent with the findings in the literature. For example, Hunston cites Halliday and Martin (1993) who emphasize that non-native writers use “fewer of the lengthy noun phrases that are essential to formal, particularly academic, writing in English” due to the fact that they do not use prepositions in a ‘native-like’ way (2002, p. 82).

Jordan (1997) also maintains that “written work has been referred to as being one of the major causes of concern for students” (p. 46), and reports a study (Jordan, 1981) exploring the writing difficulties of overseas post-graduate students taking writing classes in UK universities. The results showed that these students had the most difficulty in vocabulary (62%), followed by style (53%), spelling (41%), grammar (38%), punctuation (18%) and handwriting (12%) respectively. When asked what caused the difficulty in vocabulary, the students stated ‘using a word correctly’ (21%), ‘own lack of vocabulary’ (15%), and ‘confusion caused by similar sounding/looking words’ (12%). As regards the difficulties with grammar, the students reported ‘verbs: tense formation and use; active / passive use’ and ‘agreement of verb and subject’ (Jordan, 1997, pp. 46-47). As can be observed from the percentages, the greatest difficulty in text creation seems to be related to lexico-grammar.

The problem that has led to the present study can therefore be summarized as follows. Like all post-graduate students worldwide, the students pursuing a post-graduate degree at EMU are expected to produce coherent and appropriate academic texts, so that their work can be accepted in the global academic discourse community, and so that they can disseminate their research internationally. However, most of the work produced by the post-graduate candidates taking ENGL501 reveals problems specifically at the lexico-grammatical level.

1.3 Purpose of the Study

This study employs a corpus-informed approach (McCarthy, 2001) whereby the applied linguist can “mediate the corpus, design it from the very outset and build it with applied linguistic questions in mind, ask of it the questions applied linguists want answers to, and filter its output, use it as a guide or tool for what you, the teacher, want to achieve” (p. 129). Extracting lexico-grammatical information from a corpus is one application of this approach (McCarthy, 2001, p. 138).
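As a hedged illustration of what extracting lexico-grammatical information from a corpus can involve, the sketch below produces keyword-in-context (KWIC) lines, the basic display of a concordancer. It is not the software used in this study; the function name `kwic`, the `width` parameter and the sample text are hypothetical and invented for illustration only.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up vertically.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Invented sample text, not taken from the LAC or the TAC.
text = (
    "The aim of this study is to investigate learner writing. "
    "The results of the study indicate that collocations are problematic. "
    "This study also provides implications for teaching."
)

for line in kwic(text, "study", width=25):
    print(line)
```

Reading down the aligned keyword column shows which words recur immediately to the left and right of the node word, which is the kind of evidence on which statements about collocation and patterning rest.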

The main aim of the study is to construct a pedagogic corpus for ENGL501 that incorporates various corpus-informed components. The key component of the pedagogic corpus is a bank of lexico-grammatical patterns to fulfill the generic moves that constitute the overall generic structure of thesis abstracts. In the pedagogic corpus, there are also tasks for teacher-directed data-driven in-class work, and a complementary web-based interactive platform (Moodle) to provide access to the authentic data, the corpora, and to promote learner-led exploratory work. The complementary platform is a virtual classroom, with all the features of a traditional classroom and more, which is expected to increase the post-graduate students’ exposure to the pedagogic corpus. Moreover, the increased interaction with the data, the tasks, peers and the teacher is anticipated to maximize the participants’ learning opportunities. It is envisaged that the construction of a comprehensive corpus-informed advanced thesis writing course will assist the post-graduate students involved in research and publication in creating coherent academic texts, and therefore help to minimize the gap between the current and the target performance levels. Through the authentic corpus data and the data-driven tasks, the students are expected to observe the use of language themselves, and become language researchers, or ‘language detectives’ (Johns, 1997).

The two corpora incorporated into the pedagogic corpus are constructed from thesis abstracts. One of the reasons for this choice is that abstracts do not normally include quotations and paraphrases, and the language is expected to be the writers’ own. The second reason is that abstracts are miniature forms of research studies. The scientific research article has a particular type of rhetorical pattern which is reflected through the Introduction-Method-Results-Discussion (IMRD) format (Swales, 1990). Although there may be variations across different disciplines, Wood (2001) holds that these rhetorical conventions “are so accepted and so standard that they are often given in journal guidelines to contributors” (p. 74). In the same vein, according to Swales, the abstract, like other genres reporting research, also seems to have an IMRD structure (1990, p. 181). This structure reflects the main chapters of the thesis: Introduction, Methodology, Analysis, and Conclusion. Therefore, it is anticipated that the analysis of abstracts in this study will reveal language data that are relevant to the thesis as a whole.

For this study, two corpora are compiled: a learner corpus of abstracts of about 100 non-native participants as a representative sample of the whole ENGL501 population (LAC: Learner Abstract Corpus), and a specialized target corpus of abstracts from universities in countries where English is the native language (TAC: Target Abstract Corpus). The abstracts in the target corpus are also produced by learners, not experts. Flowerdew (2000) draws attention to the importance of providing good ‘apprentice’ models rather than ‘expert’ generic models as these are difficult to replicate due to learners’ communicative and linguistic deficiencies.









The corpora are analyzed for range and frequency, and Concordance and AntConc are used to explore the syntagmatic and paradigmatic relations of words. The learner abstract corpus (LAC) is analyzed to identify the most common lexico-grammatical problems in the academic work produced by the post-graduate candidates enrolled in the advanced thesis writing course. Then the target abstract corpus (TAC) is analyzed to extract the targeted lexico-grammar used for fulfilling the strategies and moves within the generic structure of a thesis, and compose a bank of moves and sub-moves. The data are integrated into the pedagogic corpus through both teacher-directed data-driven and learner-led discovery work. Through various task-based activities, the participants are provided with the opportunity to enrich their lexico-grammatical knowledge, and produce coherent and appropriate academic text. The study will seek answers to the following research questions:

1. What are the major lexico-grammatical patterns identified in the LAC?

2. What are the major lexico-structural patterns in the TAC?

3. How does the LAC relate to the TAC?

4. What does the cross-examination of the two corpora necessitate in terms of the comprehensive pedagogic corpus design?
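
As an illustration of the range-and-frequency and concordance analyses mentioned above, the following is a minimal Python sketch. It is not the software used in the study: the function names (`tokenize`, `frequency_and_range`, `kwic`) and the two sample sentences are invented for illustration. Frequency is taken here as the total number of occurrences of a word, and range as the number of texts in which it appears.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens (letters, hyphens, apostrophes)."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())


def frequency_and_range(texts):
    """Return two counters: total occurrences per word, and number of texts containing it."""
    freq, rng = Counter(), Counter()
    for text in texts:
        tokens = tokenize(text)
        freq.update(tokens)        # every occurrence counts towards frequency
        rng.update(set(tokens))    # each text counts once towards range
    return freq, rng


def kwic(texts, node, width=3):
    """Return keyword-in-context lines: up to `width` tokens on either side of `node`."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Invented sample sentences, not taken from the LAC or the TAC.
sample_abstracts = [
    "This study aims to investigate the effects of feedback on writing.",
    "The study aims to explore how learners use feedback.",
]
freq, rng = frequency_and_range(sample_abstracts)
print(freq["study"], rng["study"])
for line in kwic(sample_abstracts, "aims"):
    print(line)
```

Reading the concordance lines side by side is what makes recurrent patterns around a node word visible, which is the kind of observation the data-driven tasks rely on.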

1.4 Significance of the Study

Post-graduate students pursuing Master’s and PhD degrees are required to follow the latest international developments in their fields, get their research articles published in international journals and present at conferences. Authenticity and high performance standards of their academic work are of primary significance. Writing even in the mother tongue is no easy task. In a foreign language, text creation becomes a major challenge. Hence, in academic environments, EFL learners have to compete with their native peers in the international arena not only in terms of the quality and relevance of their research, but also in the coherent and appropriate manifestation of their work. Due to the recent developments in computer technology which have made possible the compilation of vast amounts of authentic data electronically, more and more studies across the globe are making use of corpora not only to provide better descriptions of languages, but also to offer a new, and a more effective way of learning languages. However, most of these studies “have focused on teaching a corpus approach per se rather than incorporating it into the writing process” (Yoon, 2008, p. 31). The current study is significant, as corpus data is integrated into the writing process, providing a data-rich environment where postgraduate students are exposed to authentic language use, and engaged in a process of discovery learning. Furthermore, this study is the first post-graduate study making use of corpora in EMU.

This research is also significant in terms of the nature of the ENGL501 course. Most universities have Academic Writing Centers, academic writing courses, and research methods courses to assist their post-graduate students. There are not, however, many universities that offer language support for thesis writing to their post-graduate students. In fact, a search of all domains ending with '' using WebCorp produced no reference to any blended advanced thesis writing course at any Turkish University in Turkey or the Turkish Republic of Northern Cyprus.

Another factor making this research significant is that this pedagogic study incorporates the use of an e-learning platform, or a virtual learning environment, which is widely used in the world, including the Open University in the UK, but quite innovative on the EMU campus. This platform, Moodle, which is based on strong underlying pedagogical principles, provides an environment where new knowledge is created through the individual’s interaction with the environment, as well as through individuals constructing things for one another. The use of Moodle in this research has increased the participants’ exposure to the target language, and the target genre manifold.

The research is also noteworthy in that the researcher has continued to teach the course throughout the research and thesis writing process. This made it possible for the researcher to examine the difficulties of new groups of post-graduate students, and continuously revise the pedagogic corpus. Furthermore, she could observe the impact of the pedagogic corpus and its components on the course participants’ learning and performance.

1.5 Definition of Terms


Corpus:

Sinclair (1991, p. 171) provides the following definition for a corpus: “A corpus is a collection of naturally-occurring language text, chosen to characterize a state or variety of a language”. A similar definition is provided by Biber, Conrad, and Reppen (1998): A corpus “is a large and principled collection of natural texts” (p. 12), which is analysed both quantitatively and qualitatively (p. 5). Hunston (2002) also states that “a corpus is planned, …, and it is designed for some linguistic purpose. The specific purpose of the design determines the selection of texts” (p. 2). In this study, authentic abstracts from the World Wide Web, and from the thesis writing course participants were compiled into two corpora based on the required design principles and for a linguistic purpose, and analyzed both quantitatively and qualitatively to address the identified language-related problem and work towards its solution.


Learner Corpus:

A learner corpus is composed of texts produced by learners of a language. It is used to “identify in what respects learners differ from each other and from the language of native speakers ….” (Hunston, 2002, p. 15). The compilation of learner corpora is very recent; it started only in the 1990s (Granger, 2003, p. 538). O’Keeffe, McCarthy and Carter (2007) describe the compilation of learner corpora as a very important development, and acknowledge Granger as a ‘forerunner in the area’ (p. 23). Granger (2003) refers to a learner corpus as “an electronic collection of authentic texts produced by foreign or second language learners” (p. 538). The best known learner corpus is the International Corpus of Learner English (ICLE) (Granger, Dagneaux & Meunier, 2002). The present study utilized a learner corpus of abstracts produced by EFL post-graduate students.

Specialised Corpus:

For this corpus, particular types of texts are chosen. Therefore, “it aims to be representative of a given type of text. It is used to investigate a particular type of language” (Hunston, 2002, p. 15). In this study, the Target Abstract Corpus (TAC) is a specialized corpus as it is representative of post-graduate thesis abstracts, and is used to explore the lexico-grammatical patterns fulfilling moves and sub-moves in theses.

Pedagogic Corpus:

A pedagogic corpus “can consist of all the course books, readers, etc. a learner has used, plus any tapes, etc. they have heard” (Hunston, 2002, p. 16). In short, according to Hunston, this corpus is comprised of “all the language a learner has been exposed to” (p. 16). Willis, on the other hand, provides a more comprehensive definition, and points out that a pedagogic corpus involves the texts that the learners have encountered, or will encounter (Willis, 2003, p. 165). He maintains that “learners process a set of texts to enable them to develop their own vocabulary and work out their own grammar of the language”, and this set of texts can be described as a pedagogic corpus (Willis, 2003, p. 163). According to Willis (2003), tasks are also components of a pedagogic corpus (p. 223). This study adopts Willis’ more inclusive definition of the pedagogic corpus.


Genre:

A genre is “a class of communicative events, the members of which share some set of communicative purposes” (Swales, 1990, p. 58). Stubbs (2002, p. 20) uses ‘genre’ and ‘text type’ interchangeably. However, according to Biber, “genre categories are determined on the basis of external criteria relating to the speaker’s purpose and topic; they are assigned on the basis of use rather than on the basis of form”, whereas “text types represent groupings of texts that are similar in their linguistic form, irrespective of genre” (1988, p. 170). Flowerdew and Peacock (2001) define ‘genre’ as “a particular type of communicative event which has a particular communicative purpose recognized by its users, or discourse community” (p. 15). This study adopts the definition of ‘genre’ by Swales (1990), Biber (1988) and Flowerdew and Peacock (2001), and differentiates between ‘genre’ and ‘text type’. In this study, the genres of post-graduate theses, and specifically thesis abstracts are explored.


Virtual Learning Environment:

A virtual learning environment is “a collection of integrated tools enabling the management of online learning, providing a delivery mechanism, student tracking, assessment and access to resources” (

effective-use-of-VLEs). Moodle, which is employed in this study, is an open-source, free, and highly adaptable virtual learning environment offering a rich selection of features (Robb, 2004, p. 1).




This chapter aims to present the conceptual framework of the study through a comprehensive review of text and textuality, text creation and the importance of lexico-grammar, the significance of the discourse community and the concept of ‘genre’, and the features of academic texts in general and thesis abstracts in particular. After this exploration of ‘text as product’, writing pedagogy is reviewed, as this study is pedagogical in nature and it is, therefore, essential to explore how text creation, i.e. writing in language teaching terms, is taught. Following an in-depth discussion of corpora, their relevance to language teaching pedagogy is assessed. The section on the use of corpora in language teaching incorporates a review of a closely related issue, Data-driven Learning (DDL), and the need for a platform to host and exploit corpora as well as DDL tasks. After the related research studies are reviewed, and their relevance to the present study explored, the chapter concludes with a summary of the literature review focusing on the implications for the present study.

2.1 Texts


Text and textuality

The concept of text has been extensively defined by linguists. Halliday and Hasan (1976) maintain that a text is not a collection of sentences, but realized through sentences, and a text needs to form a ‘unified whole’ to be considered as text. They note that most teachers are sometimes unsure about whether their students’ compositions can be regarded as texts or not, and stress the fact that “the distinction between a text and a collection of unrelated sentences is … a matter of degree” (p. 12).

What, then, is a text and what are the features and regularities through which textuality is achieved? Stubbs (1996) defines text as “an instance of language in use, either spoken or written: a piece of language behaviour which has occurred naturally, without the intervention of the linguist” (p. 4). Halliday and Matthiessen, on the other hand, consider “any instance of language, in any medium, that makes sense to someone who knows the language” (2004, p. 3) as text. For Nunan, text is “any written record of a communicative event” (1993, p. 6) and for Widdowson, “the product of the process of discourse” where, in written language, the writer is “part of the communication” (1996, p. 132). Halliday and Hasan provide the following definition for text: “any instance of living language that is playing some part in a context of situation” (1985, p. 10).

The common element in all these definitions is that text is an instance of language, a record, or a product of language in use, making a distinction between ‘text’ and ‘discourse’, the process of language in use. According to Stubbs (Hoey et al., 2007), text is a static, fixed product, and discourse is a dynamic, interactive process (p. 146). Likewise, Beaugrande and Dressler refer to ‘text’ as an ‘occurrence’, implying some sort of completion. According to them, a text is a “communicative occurrence which meets seven standards of textuality” (1981, p. 3). These seven standards are “the constitutive principles of textual communication and they define and create the form of behaviour identifiable as textual communicating” (Beaugrande and Dressler, 1981, p. 11).

The first standard of textuality is cohesion, “the way in which the components of the surface text, i.e. the actual words we hear or see, are mutually connected within a sequence” (Beaugrande and Dressler, 1981, p. 3). According to Halliday and Hasan, “typically, in any text, every sentence except the first exhibits some form of cohesion with a preceding sentence, usually with the one immediately preceding”. That is, each sentence contains at least one anaphoric tie that links it with the previous one or ones (1976, p. 293). Nunan has a word of caution about cohesion. He holds that “the cohesive devices themselves do not create the relationships in the text; what they do is to make the relationships explicit” (1993, p. 27). In a similar vein, Beaugrande and Dressler emphasize that cohesion by itself is not sufficient, and for efficient communication, there should be interaction with the other standards of textuality (1981, p. 4). They point out that cohesion “is the function of syntax in communication” (1981, p. 48) and it relies on grammatical dependencies which are “major signals for sorting out meanings and uses” (1981, p. 3).

Coherence “concerns the ways in which the components of the textual world, i.e. the configuration of concepts and relations which underlie the surface text, are mutually accessible and relevant” (Beaugrande and Dressler, 1981, p. 4), and it is “the outcome of cognitive processes among text users” (Beaugrande and Dressler, 1981, p. 6). The foundation of coherence is the continuity of meaning among the knowledge stimulated by the expressions of the text (1981, p. 84). Stubbs refers to coherence as semantic unity or connectedness (1983, p. 9).


In addition to cohesion and coherence, which are text-centred notions, there are also ‘user-centred notions’ acting upon textual communication (Beaugrande and Dressler, 1981, p. 7). Two of these are ‘intentionality’ and ‘acceptability’. The text producer intends to produce a cohesive and coherent text in line with the objectives, and the text receiver accepts the text as cohesive and coherent and relevant for the objectives (Beaugrande and Dressler, 1981, p. 7). ‘Acceptability’ requires the text receiver to maintain cohesion and coherence by providing material, and tolerating disturbances as required (pp. 7-8). Text receivers support coherence through inferencing, thereby contributing to the sense of the text (p. 8).

‘Informativity’ is the fifth standard and “concerns the extent to which the occurrences of the presented text are expected vs. unexpected or known vs. unknown / certain” (Beaugrande and Dressler, 1981, pp. 8-9). Low informativity causes boredom, and even rejection of text. On the other hand, very high informativity puts too much burden on the receivers’ processing and may endanger communication (Beaugrande and Dressler, 1981, p. 9).

“The factors which make a text relevant to a situation of occurrence” are known as ‘situationality’. Through this standard, “the sense and use of the text are decided” and the situation helps to make sense of the text (Beaugrande and Dressler, 1981, pp. 9-10). ‘Intertextuality’, the seventh standard, “concerns the factors which make the utilization of one text dependent upon knowledge of one or more previously encountered texts”, and it is “responsible for the evolution of text types as classes of texts with typical patterns of characteristics” (Beaugrande and Dressler, 1981, p. 10). Although there are certain features that are common to all texts to be considered as texts, there are also texts that share some common characteristics that distinguish them from other texts.

Beaugrande and Dressler consider these seven standards of textuality to be concerned with how occurrences are linked to others “via grammatical dependencies on the surface (cohesion), via conceptual dependencies in the textual world (coherence); via the attitudes of the participants towards the text (intentionality and acceptability); via the incorporation of the new and unexpected into the known and expected (informativity); via the setting (situationality); and via the mutual relevance of separate texts (intertextuality)” (1981, p. 37).

In addition to these constitutive principles, there are also ‘regulative’ ones that “control textual communication rather than define it” (Beaugrande and Dressler, 1981, p. 11). These are ‘efficiency’, ‘effectiveness’, and ‘appropriateness’ of a text. Efficiency refers to the use of a text with minimum effort by the participants. The effectiveness of a text is “its leaving a strong impression and creating favourable conditions for attaining a goal”. “The agreement between its setting and the ways in which the standards of textuality are upheld” is the appropriateness principle that regulates and controls a text (Beaugrande and Dressler, 1981, p. 11). According to Beaugrande and Dressler, “acceptability and appropriateness are more crucial standards for texts rather than grammaticality and well-formedness” (1981, pp. XIV-XV).



Text Creation

Nunan states that the creation of a written text is a complicated undertaking (1993, p. 2). An understanding of how textuality is achieved, therefore, initially requires an understanding of how language resources are used to create text, “the most extensive unit of meaning” (Halliday and Matthiessen, 2004, p. 566). Halliday and Hasan regard text “as a semantic unit; a unit not of form but of meaning” (1976, p. 2). Halliday and Matthiessen emphasize that it is important “to be able to think of text dynamically, as an ongoing process of meaning” (2004, p. 524). Beaugrande and Dressler maintain that “the text producer has the intention of pursuing some goal via the text” and thus, text creation is a sub-goal towards the main goal (1981, p. 39). Texts, then, are produced to achieve goals and to convey meanings, and the greatest challenge is whether or not the intended messages are coherently and appropriately communicated through the use of language since, as Beaugrande and Dressler point out, “knowledge is not identical with language expressions that represent or convey it” (1981, p. 85).

Having established that text creation is a means to an end, and the ultimate objective is to communicate via the text, it is worth examining how meaning is encoded through language. Widdowson proposes that “semantics is the complex interplay of morphology, lexis, and syntax” (1996, p. 61). They interact with each other to create meaning. Semantics is concerned with the meanings of words as lexical items (lexis), the meanings of derivational and inflectional morphemes (morphology) and how words are ordered (syntax) (Widdowson, 1996, p. 53). Morphology is concerned with “how morphemes operate in the processes of derivation and inflection” (Widdowson, 1996, p. 129). Derivation involves ‘lexical innovation’ or ‘formation’, i.e. the way words mean, and inflection is about ‘grammatical adaptation’, i.e. the way words function (pp. 47-48). Therefore, morphology is closely related to lexis and syntax. Widdowson concludes that although meaning is communicated by “the morphological and syntactic processes of word adaptation and assembly; … it is the words which provide the main semantic content” (1996, p. 54).

Morphological and syntactic processes together make up the study of grammar: how words are combined in sentences, and how they are adapted (Widdowson, 1996, p. 48). As grammar is concerned with word combinations and adaptations, it is impossible to think of lexis and grammar as two separate entities. McCarthy believes that there is no major distinction between vocabulary and grammar and “… any word in the language can be examined from the point of view of grammar, and, vice versa, any word, even words like articles and prepositions, can be considered as vocabulary items” (1990, p. 12).

Halliday and Matthiessen use the terms ‘lexicogrammar’ and ‘grammar’ interchangeably and argue that “grammar and vocabulary are not two separate components of a language - they are just the two ends of a single continuum”, and “the sound system and the writing system are the two modes of expression by which the lexicogrammar of a language is presented, or realized” (2004, p. 7). In lexicogrammar, according to Halliday and Hasan, there is “no hard-and-fast division between vocabulary and grammar; the guiding principle in language is that the more general meanings are expressed through the grammar, and the more specific meanings through the vocabulary” (1976, p. 5). Grammar is the fundamental processing unit of language (p. 21), and a resource for making meaning (Halliday and Matthiessen, 2004, p. 31). Widdowson also considers grammar as a tool to express meaning. Grammar, he says, is important because of its communicative purpose. It serves to “adapt words morphologically and organize them syntactically so that they are more capable of encoding the reality that people want to express” (1996, p. 51).

Within lexicogrammar, system and structure are very important in the creation of meaning. Structure is the “syntagmatic ordering in language patterns, or regularities, in what goes together with what”. System, which is the paradigmatic ordering in language, involves “patterns in what could go instead of what” (Halliday and Matthiessen, 2004, p. 22). System and structure work together and “… each system - each moment of choice - contributes to the formation of the structure” (Halliday and Matthiessen, 2004, p. 23). Therefore, what goes together with what and what has the potential to go instead of what are very important in text creation and “a text is the product of ongoing selection in a very large network of systems - a system network” (Halliday and Matthiessen, 2004, p. 23).

Widdowson states that language elements combining with others along a horizontal dimension are in a syntagmatic relationship, and those that have the same potential to vertically appear in the same environment are in paradigmatic relationship. The horizontal elements exist in combination; sounds or letters combine to form words, words combine to form phrases, phrases combine to form sentences. The vertical elements, on the other hand, exist in association; “when different forms have the same possibility of occurrence in a structure at a particular level, and are therefore equivalent in function, they are paradigmatically associated as members of the same class of items” (1996, pp. 33-34). According to Widdowson, this two-dimensional mode of organization allows the generation of infinite expressions from finite means and “is the essential source of the creativity and flexibility …. of human language” (1996, p. 34).
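
The two-dimensional organization described above can be illustrated with a minimal Python sketch. The slot contents below are invented for illustration and are not drawn from the corpora: each position in the list stands for a place in the syntagmatic (horizontal) chain, and the alternatives listed at each position form a paradigmatic (vertical) set.

```python
from itertools import product

# Each inner list is a paradigmatic set: items that could occupy the same slot.
# The order of the inner lists is the syntagmatic order of the slots.
# All words are invented examples.
slots = [
    ["the", "this"],
    ["study", "thesis", "paper"],
    ["examines", "explores"],
    ["collocations", "abstracts"],
]


def generate(slots):
    """Combine one choice per slot, in order, into every possible word sequence."""
    return [" ".join(choice) for choice in product(*slots)]


sentences = generate(slots)
print(len(sentences))  # 2 * 3 * 2 * 2 combinations
print(sentences[0])
```

Even this toy grid of nine words yields two dozen distinct sequences, which gives a small-scale sense of how a finite set of choices multiplies into a large number of expressions.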

Halliday and Hasan argue that all components of the semantic system are realized through the lexicogrammatical system (1976, p. 6). Stubbs holds that “… messages are conveyed not only explicitly, by words themselves, but also implicitly, by lexical and syntactic patterning” (1996, p. 10). Morphological and syntactic processes, according to Widdowson, perform the function of extending word meanings, and so “constitute a communicative resource” (1996, p. 52). Therefore, although grammatical processes play a supportive role in organizing and adapting existing units of lexical meaning to requirements, they do not initiate meaning but “act upon meaning already lexically provided” (Widdowson, 1996, p. 55).

As lexis is the initiator of meaning and grammar organizes and changes lexical meaning according to needs through syntax and morphology, it would be meaningful to look at the major carrier of meaning in more detail. A lexeme or a lexical item is a “separate unit of meaning, usually in the form of a word, but also as a group of words” (Widdowson, 1996, p. 129). Sinclair holds that “lexical items are not always words, and each word may enter into a variety of relationships with others to realize lexical items” (2004, p. 161). Lexical words are the ‘content’ words of the vocabulary of a language, and “they can be viewed in terms of the relations in which they enter: paradigmatic relations (the options that are open to them) and syntagmatic relations (the company they keep)” (Halliday and Matthiessen, 2004, p. 38).


Ginzburg defines paradigmatic relations as those “that exist between individual lexical items which make up one of the subgroups of vocabulary items, e.g. sets of synonyms, lexico-semantic groups, etc.”, and holds, for example, that “the meaning of the verb to get can be fully understood only in comparison with other items of the synonymic set: get, obtain, receive, etc.” (1979, p. 46). Paradigmatically, words can form lexical sets. “They function in sets having shared semantic features and common patterns of collocation” and “typically, the semantic features that link the members of a lexical set are those of synonymy or antonymy, hyponymy and meronymy” (Halliday and Matthiessen, 2004, p. 40). Antonymy is “the sense relation of various kinds of opposing meaning between lexical items” (Widdowson, 1996, p. 125), and synonymy “the sense relation of equivalence of meaning between lexical items” (Widdowson, 1996, p. 131). Cohyponyms are “words that are subtypes of the same type” and comeronyms are words that are “part of the same whole” (Halliday and Matthiessen, 2004, p. 40). Hyponymy is characterized by Widdowson as “the sense relation between terms in a hierarchy, where a more particular term (the hyponym) is included in the more general one (the superordinate)” (1996, p. 128).

According to Ginzburg, “syntagmatic relations define the meaning the word possesses when it is used in combination with other words” (1979, p. 46). Syntagmatically, lexical items can form collocations, “the co-occurrence of lexical items in text” (Widdowson, 1996, p. 125) and “a tendency for words to occur together” (Sinclair, 1991, p. 71). Approaches to the semantic analysis of natural languages depend on the view that ‘lexical items are interrelatable’ (van Buren, 1975, p. 126). The probabilistic view, also known as the ‘collocational theory of lexical meaning’, was supported by the British linguist J. R. Firth (van Buren, 1975, p. 126). According to Firth, the word ‘night’ is more likely to collocate with the word ‘dark’ than with ‘hippopotamus’, and this probability is part of the meaning of the word ‘night’ (van Buren, 1975, pp. 126-127). “These probabilistic lexical relations cut across and therefore independent of grammatical structure” (van Buren, 1975, p. 127). According to the collocational theory, “lexical items are not co-extensive with any grammatical unit” (van Buren, 1975, p. 127). A lexical item like ‘put up with’ should be considered as one lexical item if it significantly co-occurs with ‘a unique cluster of other items’ (van Buren, 1975, p. 127).
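
The probabilistic view of collocation can be made concrete with a small Python sketch. This is an illustration only, not a method used in the study: the three sentences are invented, the two-word window is an arbitrary assumption, and pointwise mutual information (PMI) is one common association measure chosen here for simplicity, estimated roughly from raw counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurrence_counts(sentences, window=2):
    """Count words, and unordered word pairs occurring within `window` tokens of each other."""
    word_counts, pair_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1:i + 1 + window]:
                if left != right:
                    pair_counts[frozenset((left, right))] += 1
    return word_counts, pair_counts


def pmi(word_a, word_b, word_counts, pair_counts):
    """Rough PMI estimate, log2(p(a,b) / (p(a) * p(b))); None if the pair never co-occurs."""
    joint = pair_counts[frozenset((word_a, word_b))]
    if joint == 0:
        return None
    total = sum(word_counts.values())
    return math.log2(joint * total / (word_counts[word_a] * word_counts[word_b]))


# Invented sentences built around Firth's 'night' example.
sentences = [
    "The night was dark and cold.",
    "A dark night fell over the town.",
    "The hippopotamus slept in the river.",
]
word_counts, pair_counts = cooccurrence_counts(sentences)
print(pair_counts[frozenset(("night", "dark"))])
print(pmi("night", "dark", word_counts, pair_counts))
print(pmi("night", "hippopotamus", word_counts, pair_counts))
```

In this toy sample ‘night’ and ‘dark’ co-occur within the window while ‘night’ and ‘hippopotamus’ never do, which mirrors the contrast in likelihood that Firth's example describes. With a corpus of realistic size, a frequency threshold would normally be applied, since PMI tends to overrate rare pairs.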

Carter and McCarthy also refer to Firth as the father of collocation, stating that he brought the term ‘collocation’ into prominence (1988, p. 32). According to this Firthian view, collocation is one type of meaning and it is “an abstraction at the syntagmatic level and is not directly concerned with the conceptual or idea approach to the meaning of words” (Firth, 1951/1957, cited in Carter and McCarthy, 1988, p. 32). Although collocation is often referred to as a ‘Firthian’ term, Nation (2001) notes that Palmer used it before Firth. Palmer’s idea of collocation was that each “must or should be learnt, or is best or most conveniently learnt as an integral whole or independent entity, rather than by the process of piecing together their component parts” (Palmer, 1933, p. 4, cited in Nation, 2001, p. 317).

Halliday and Matthiessen consider collocation “an instance of lexical cohesion” which depends on a tendency of items to co-occur (2004, pp. 576-577). McCarthy holds that the relationship of collocation is central to the study of vocabulary as it is “an important organizing principle in the vocabulary of any language” (1990, p. 12). He regards collocation as a “marriage contract between words”, some words being “more firmly married to each other than others” and gives ‘blond hair’ as a strong collocation as blond can refer to almost nothing else but hair (1990, p. 12). McCarthy does not make a distinction between collocation and colligation, and regards the co-occurrence of ‘the’ (a function word) with a noun (a content word) as collocation (1990, p. 12). Hunston defines ‘colligation’ as “the collocation between a lexical word and a grammatical one” but mostly refers to this co-occurrence as just ‘collocation’ (2002, pp. 12-13). Nation also provides a ‘loose’ definition of ‘collocation’ as “any generally accepted grouping of words into phrases or clauses” (2001, p. 317).

Syntagmatically, words may also appear together with other words forming phrasal verbs, compound nouns, and formulaic phrases, “a (relatively) fixed collocation” (Widdowson, 1996, p. 60). A lot of collocations, however, “are not fixed and can be syntactically modified to a certain extent” (Widdowson, 1996, p. 60). Halliday and Matthiessen point out that words sometimes come together and form “patterns which lie somewhere between structures and collocations, having some sort of the properties of both” and give ‘take + pride / pleasure / delight + in + -ing’ as an example (2004, p. 45). It is clear from the example that lexical co-occurrences (collocations) and co-occurrences of lexical and grammatical words (colligations) may come together and form longer lexical phrases. Nation has the most global view of collocations:

Collocations differ greatly in size (the number of words involved in the sequence), in type (function words collocating with content words (look with at), content words collocating with content words (united with states)), in closeness of collocates (expressed their own honest opinion), and in the possible range of collocates (commit with murder, a crime, hara kiri, suicide…). (2001, p. 56)


Syntagmatic and paradigmatic relations of lexis play a significant role in text production through forming ‘lexical relations’ (Halliday and Matthiessen, 2004, p. 571). Therefore, lexical cohesion is achieved through lexical relations at the syntagmatic (collocation) and paradigmatic level (synonymy, antonymy, hyponymy, meronymy). Through the choice of lexical items, “a speaker or writer creates cohesion in discourse” (Halliday and Matthiessen, 2004, p. 570). According to Hoey (1991), “lexical cohesion is the only type of cohesion that regularly forms multiple relationships (though occasionally reference does so too). If this is taken into account, lexical cohesion becomes the dominant mode of creating texture” (p. 10).

Halliday and Matthiessen believe that collocations have quite a cohesive effect. This is so because they are “one of the factors on which we build our expectations of what is to come next” (Halliday and Matthiessen, 2004, p. 577). Nation draws attention to the significant relationship between the size of word groupings such as collocation, and the level of proficiency:

when language users segment language for reception or production or to hold it in memory, they typically work with meaningful groupings of items. The size of these groupings, called chunks, depends on the level of proficiency they have attained. At one level they are realized as collocations. (Nation, 2001, p. 317)

Grammar and lexis, therefore, “are the essential resources for meaning”, but “how these resources have to be exploited for language users to achieve meaning” (Widdowson, 1990, p. 117) is also very important. Therefore, in addition to the semantic meaning created in language through lexico-grammatical processes, there is another dimension in text creation: pragmatics, “what people mean by the language they use” (Widdowson, 1996, p. 61). It is impossible to think of text creation only in semantic terms as, through text, not only are meanings encoded, but they are also appropriately communicated to the target reader through the context. Communication “can only be achieved by relating language with context” (Widdowson, 1990, p. 94).

Pragmatics is concerned with “what people mean in a particular context and how the context influences what is said” (Yule, 1996, p. 3). It also involves “how people conform to social conventions” as well as “how people assert themselves and manipulate others” by taking individual initiative (Widdowson, 1990, p. 68). Therefore, pragmatics “is concerned with how people negotiate meaning” and also “how they negotiate social relations” (Widdowson, 1990, p. 68). Appropriateness of the language to the social context is very important in conforming to social conventions and building interpersonal relationships. Inappropriate use of language does not simply mean “a violation of linguistic appropriateness norms”, but “may lead to misunderstanding of intent” (Gumperz, 1982, p. 50). Pragmatics is about how people actualize the meaning potential of language (Widdowson, 1996, p. 61). It is “much concerned with written as with spoken uses of language” such that “writers assume a degree of shared schematic knowledge, produce texts which are cohesive and which conform to the conventions of a particular genre” (Widdowson, 1996, p. 68).



“Every text is in some sense like other texts” (Halliday and Hasan, 1985, p. 42) and texts, in general, share certain characteristics that help them to be recognized and accepted as texts. Nonetheless, research has also shown that “texts vary according to the nature of the contexts they are used in” (Halliday and Matthiessen, 2004, p. 27). Therefore, it may be concluded that some texts share more features than others based on the purposes they serve and the contexts in which they are used.

“The production and reception of a given text depends upon the participants’ knowledge of other texts” (Beaugrande and Dressler, 1981, p. 182). Intertextuality, according to Beaugrande and Dressler, plays such a central role in the science of texts that “the whole notion of textuality may depend upon exploring the influence of intertextuality as a procedural control upon communicative activities at large” (Beaugrande and Dressler, 1981, p. 206). The role of intertextuality in text creation is also emphasized by Swales who states that creating texts should not be considered an “individually-oriented, inner directed cognitive process … but an acquired response to the discourse conventions which arise from preferred ways of creating and communicating knowledge within particular communities” (1990, p. 4). These preferred ways make texts recognizable in their discourse communities. Discourse communities “tend to separate people into occupational or speciality-interest groups” (Swales, 1990, p. 24) and have “common goals, participatory mechanisms, information exchange, community specific genres, a highly specialized terminology and a high general level of expertise” (1990, p. 29). Swales (1990) suggests that discourse communities use some specific lexis and also community-specific abbreviations and acronyms (p. 26).

Beaugrande and Dressler (1981) consider the notion of discourse community to be very important. They focus on the importance of what some scholars call ‘text types’ and state that “the scientists themselves cannot belong to a scientific community until they have acquired its conventions of discourse and argumentation. Instruction, description, explanation, examination, interviews, questionnaires, research reports - all these commonplace uses of texts are as indispensable to science as the most elaborate technological instruments” (1981, p. 212).

Genre and Text Type

In the literature, two different terms are used to refer to texts which share particular characteristics. These terms are ‘genre’ and ‘text type’. Although some linguists and researchers do not make a distinction between these two concepts and use them interchangeably, others consider them as having distinct meanings. Beaugrande and Dressler use the term ‘text type’ to mean “classes of texts expected to have certain traits for certain purposes” (1981, p. 182). They say that texts can be assigned to a text type according to their function in communication (Beaugrande and Dressler, 1981, p. 185), and give ‘descriptive’, ‘narrative’ and ‘argumentative’ texts as examples of text types, and emphasize that many texts may include all these three functions (1981, pp. 182-184). However, they also refer to telegrams and road signs as text types (1981, p. 142), which could be considered genres by other scholars.

Nunan holds that “different types of communicative events result in different types of discourse, and each of these will have its own distinctive characteristics” (1993, p. 49), emphasizing the ‘purpose’ of different genres. Indeed, he stresses that “the overall structure, appearance and grammatical elements reflect the purposes for which the texts were created” (Nunan, 1993, p. 53). Nunan refers to the structure of a text as its ‘generic structure’ and argues that this structure is determined by the communicative purposes of the text (1993, p. 58).


Biber makes a clear distinction between ‘genre’ and ‘text type’. He believes that “genre categories are determined on the basis of external criteria relating to the speaker’s purpose and topic; they are assigned on the basis of use rather than on the basis of form”, whereas “text types represent groupings of texts that are similar in their linguistic form, irrespective of genre” (1988, p. 170). Therefore, for example, “particular texts from press reportage, biographies, and academic prose might be very similar in having a narrative linguistic form, and they would thus be grouped together as a single text type, even though they represent three different genres” (Biber, 1988, p. 206). According to Widdowson, genre is “a type of discourse in written or spoken mode with particular characteristics established by convention” (1996, p. 127). That he refers to ‘a formal meeting’ (Widdowson, 1996, p. 67) as a genre seems to indicate that his understanding of genre is similar to Biber’s.

Stubbs acknowledges that “some authors distinguish between text type and genre” (1996, p. 11), but he does not. Therefore, he uses the two terms interchangeably and cites Kress (1989) to define these terms as “conventional ways of expressing meanings: purposeful, goal-directed language activities, socially recognized text types, which form patterns of meaning in the social world” (Stubbs, 1996, p. 11). He refers to jokes, sermons, chats, committee meetings, debates, signs, etc. as genres (or text types) (1996, p. 11), and claims that we can gain an understanding of them through identifying and comparing different genres (1996, p. 12).

In his seminal study ‘Genre Analysis’, Swales provides a comprehensive definition of ‘genre’: “A genre comprises a class of communicative events, the members of which share some set of communicative purposes” (1990, p. 58). He notes that these communicative purposes form the rationale of the genre which “shapes the schematic structure of the discourse and influences and constrains choice of content and style” (Swales, 1990, p. 58). From the definition, we can infer that genres share purposes, schematic structures, content and style.

Genre and Register

A discussion on genre necessitates some clarification as to what differentiates it from the ‘well-established and central concept in linguistics’ (Swales, 1990, p. 41), register. A register involves “the linguistic features which are typically associated with a configuration of situational features” (Halliday and Hasan, 1976, p. 22), and it refers to a language variety defined according to the characteristics of the situation (McArthur, 1992, p. 839). Halliday and Matthiessen define register as “a functional variety of language” (2004, p. 27). Halliday and Hasan hold that when a text is coherent with respect to the situation, it is consistent in register (1976, p. 23).

Registers are sub-classified into field, tenor, and mode of discourse. “Field is associated with the management of the ideas, tenor with the management of personal relations, and mode with the management of discourse itself” (Swales, 1990, p. 40). Couture (cited in Swales, 1990, p. 41) holds that registers enforce limitations on syntax and vocabulary, whereas genres enforce them on discourse structures. Additionally, unlike register, a genre can be realized in completed texts and specifies conditions for beginning, continuing and ending a text. The two concepts are distinct in that a genre (research report, business report) is a text with a structure, whereas a register (language of newspaper reporting, bureaucratic language) represents more generalizable stylistic choices. According to Swales, the study of genre is evolving, and it is essential to disconnect genres from registers or styles, and to recognize that genres have schematic structures (1990, p. 42).

Genres, Generic Structure, and Moves

Paltridge defines genres as “ways in which people ‘get things done’ through their use of language in particular contexts” (Johns et al., 2006, p. 2). According to Hyland, genres are the socially recognized ways that writers use language “to respond to and construct texts for recurring situations” (Johns et al., 2006, p. 3). Tardy admits that genres are complex and provides an interesting definition for genre, that it is “a kind of nexus among the textual, social, and political dimensions of writing” (Johns et al., 2006, p. 4). According to Coe, a genre is a “culturally typical structure that embodies a socially appropriate strategy for responding to varied situations” (Johns et al., 2006, p. 8). The common theme emerging in all definitions of the concept of ‘genre’ is that a genre involves the socially acceptable use of language in a situation or context to achieve a purpose. Another common view involving genres is that they should not be seen as ‘permanent’ formulas, as they are living texts and they change according to the needs of their users (Crossley, 2007, p. 15). Swales (2004) also admits that his 1990 definition of genre was ‘long and bold’, and that such definitions may not be relevant to “all possible worlds and all possible times” (p. 61).

The overall structure of a genre represents its purpose. Nunan agrees that the communicative purpose of a text determines its ‘generic structure’ (1993, p. 58). In the literature, this overall structure of a text is referred to as ‘generic structure’ (Flowerdew, 2000; Halliday & Hasan, 1985; Henry, 2007; Nunan, 1993), ‘organizational structure’ (Flowerdew, 2000), ‘discourse structure’ (Swales, 1990), ‘generic move structure’ (Flowerdew, 2000), and ‘schematic structure’ (Swales, 1990). Genres are composed of units of purpose, called ‘moves’ (Swales, 1990) or ‘move structures’ (Flowerdew, 2000), some of which are compulsory and some optional (Flowerdew, 2000; Hasan, 1985). These constituent parts, or moves, represent the writer’s communicative purpose (Flowerdew, 2000) and perform specific functions (Bhatia, 1993 cited in Henry, 2007, p. 2).

Swales holds that a move is “a discoursal or rhetorical unit that performs a coherent communicative function in a written or spoken discourse” (2004, p. 228). He emphasizes that a move should be “seen as flexible in terms of its linguistic realization”, although “it has sometimes been aligned with a grammatical unit such as a sentence, utterance, or paragraph” (Swales, 2004, pp. 228-229). He adds that a move is a ‘functional, not a formal unit’, and can be fulfilled by a clause, or by several sentences (2004, p. 229).

In addition to the discourse structure represented by the moves, the language used to realize the moves is also extremely important. Henry emphasizes the significance of the lexico-grammatical features commonly employed to fulfill moves (2007, pp. 1-2), Flowerdew draws attention to key lexical phrases ‘representative of the move structures’ (2000, p. 374), and Tardy also emphasizes the lexico-grammatical features in generic moves (Johns et al., 2006, p. 5). Moves can be realized in a number of ways, each of which is called a ‘strategy’ (Henry, 2007, p. 3) or a ‘tactic’ (2007, p. 7). Bhatia defines ‘strategies’ as the tactical choices made by the writer to fulfill his or her intention. He states that these strategies are generally used to make the “writing more effective, keeping in mind any special reader requirements, considerations arising from a different use of medium or prerequisites or constraints imposed by organizational and other factors of this kind”. He emphasizes that strategies do not generally change the fundamental communicative purpose of the genre (1993, pp. 19-20). Henry highlights the fact that each strategy has its own lexico-grammatical features that need to be identified (2007, p. 3).

These strategies are clearly identified by Bhatia. After emphasizing that a writer may use different rhetorical strategies to realize a communicative intention at the level of a move, he exemplifies this with the first move of research article introductions: ‘establishing the research territory’. According to Bhatia, this first move can be realized through three strategies. These are a) asserting centrality of the topic, or b) stating current knowledge, or c) ascribing key characteristics, and the choice depends upon “the constraints like the nature of the topic / field, the background knowledge of the intended readership, reader-writer relationship etc.” (1993, pp. 30-31). It is obvious that the discourse structure of a genre fulfilled by moves, as well as the lexico-grammatical patterns representative of the relevant moves, play significant roles in genre studies, and specifically in academic discourse.


Academic Discourse

Jordan discusses the features of academic texts and highlights formality, avoidance of contractions, colloquialisms and personal pronouns, and the need to use cautious language while making claims (1997, pp. 240-244). This concept of cautious language was first termed ‘hedging’ in 1972 and 1973 by George Lakoff (cited in Jordan, 1997, p. 240). Selinker explains why hedging is important and states that in scientific writing every attempt to explain a phenomenon in a certain way is open to an alternative explanation (cited in Jordan, 1997, p. 240). Different ways of making academic language vague are also mentioned by Hyland (cited in Jordan, 1997, p. 241) as the use of modal verbs (would, could), adverbs (probably, possibly), adjectives (certain, probable), nouns (assumption, estimate) and some lexical verbs (seem, appear, suggest).

Another feature of academic texts is formality. Formality involves not using contractions, colloquialisms, many phrasal verbs and personal pronouns, although ‘I’ can sometimes be appropriate depending on the situation. The writer should use an analytical, objective, intellectual and rational approach and employ a serious, impersonal and formal tone. In academic texts, passive forms of verbs, complex sentence structures and specialized vocabulary are frequently used (Jordan, 1997, p. 245).

Some academic discourse conventions, such as the need to be impersonal and objective, have been changing and evolving. For example, Elbow questions the language of academic texts which excludes the personal voice, and eliminates the author from the text. He maintains that ‘a detached and impersonal stance’ is a ‘pretense’ since arguments and opinions cannot be treated separately from the person who possesses them (cited in Zamel, 1993, p. 3). Hyland states that “over the past decade or so, academic writing has gradually lost its traditional tag as an objective, faceless and impersonal form of discourse and come to be seen as a persuasive endeavour involving interaction between writers and readers” (2005, p. 173). The recent view is that academic discourse should not be considered as ‘a uniform set of norms and conventions’, since this attitude would prevent the experience of constructing knowledge in a community (Zamel, 1993, p. 3). Zamel claims that academic cultures, like all cultures, are constantly re-created by people who enter as well as the languages they bring with them (1993, p. 7).

The focus in academic discourse nowadays seems to have shifted to knowledge creation and ‘solidarity’ (Hyland, 2005) with the readers. Writers, Hyland argues, “do not act in a social vacuum, and knowledge is not constructed outside particular communities of practice” (2005, p. 191). He suggests that academics do not simply produce texts, but also use language ‘to acknowledge, construct and negotiate social relations’, and therefore a successful academic text “displays the writer’s awareness of both its readers and its consequences” (Hyland, 2005, p. 174). Therefore, nowadays extensive research is carried out on how writers in academia “use language to express a stance and relate to their readers” (Hyland, 2005, p. 174).

In an earlier article, Hyland again emphasizes the importance of the reader, and focuses on ways of achieving a collegial stance towards them. He maintains that one important consideration in academic discourse is how writers try to “modify the assertions that they make, toning down uncertain or potentially risky claims, emphasizing what they believe to be correct, and conveying appropriately collegial attitudes to readers” (2000, p. 179). This stance, or position, is achieved through expressions of doubt and certainty, known as hedges and boosters (Hyland, 2000, p. 179). “Hedges such as might, probably and seem signal a tentative assessment of referential information and convey collegial respect for the views of colleagues”. Boosters, on the other hand, are expressions like clearly, obviously and of course, and they help writers to express confidence and also their involvement and unity with an audience (2000, p. 179). Hyland claims that conscious awareness of ‘pragmatic features’ such as hedges and boosters is very important in teaching English for Academic Purposes (EAP) (2000, p. 180), since these tools help to get the work of academics accepted “by balancing conviction with caution, and by conveying an appropriate disciplinary persona of modesty and assertiveness” (Hyland, 2000, p. 179).
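
How hedges and boosters might be counted in a text can be shown with a short sketch, offered only as an illustration and not as the procedure followed in this study. The word lists contain just the examples quoted from Hyland above (might, probably, seem; clearly, obviously, of course) and are therefore far from complete. The function name `count_stance_markers` and the sample sentence are assumptions.

```python
import re
from collections import Counter

# Illustrative lists only: these are the examples quoted above,
# not a full inventory of hedges or boosters.
HEDGES = {"might", "probably", "seem"}
BOOSTERS = {"clearly", "obviously"}
BOOSTER_PHRASES = ["of course"]


def count_stance_markers(text):
    """Return a Counter with the number of hedges and boosters found in `text`."""
    lowered = text.lower()
    counts = Counter(hedge=0, booster=0)
    for token in re.findall(r"[a-z]+", lowered):
        if token in HEDGES:
            counts["hedge"] += 1
        elif token in BOOSTERS:
            counts["booster"] += 1
    # Multi-word boosters are matched as whole phrases.
    for phrase in BOOSTER_PHRASES:
        counts["booster"] += len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
    return counts


# Invented sample sentence.
sample = "The results clearly differ, although they might seem small. Of course, this will probably change."
print(count_stance_markers(sample))
```

Such counts ignore inflected forms (seems, seemed) and context, so they would need manual checking through concordance lines before any claim about stance could be made.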

It may be concluded that in academic discourse, the appropriate expression of ideas has become as important as the accurate use of the linguistic resources. Pragmatic features, such as the use of linguistic resources to express solidarity with colleagues, to get one’s work considered seriously, and therefore to become a member of the discourse community, are now equally essential. Thus, powerful personal expression of ideas and arguments has gained major significance in academia.

Genres in Academia

Academic texts are of different genres. As stated by Swales, genres share some common features in terms of purpose, target audience, structure, style, and content.

A genre comprises a class of communicative events, the members of which share some set of communicative purposes. These purposes are recognized by the expert members of the parent discourse community, and thereby constitute the rationale for the genre. This rationale shapes the schematic structure of the discourse and influences and constrains choice of content and style. … In addition to purpose, exemplars of a genre exhibit various patterns of similarity in terms of structure, style, content and intended audience. (Swales, 1990, p. 58)

Jordan gives research articles, abstracts, theses, and textbooks as examples of genres in academic written English (1997, p. 231). Swales regards research articles, research presentations, grant proposals, theses and dissertations, reprint requests, and abstracts as instances of academic genres (1990). Scholarly books and conference presentations are also listed as academic genres by Bonn and Swales, although the research article in a first-rank peer-reviewed journal is referred to as the ‘top’ academic genre as it provides the greatest reward for its writers (2007, pp. 93-94).

An academic genre: The Abstract

Specific parts of a genre, such as abstract, introduction, discussion and literature review sections of research articles and theses, are named part genres by some scholars (Hyland, 2005; Flowerdew, 2000). Although abstracts are sometimes considered as part genres (Bonn and Swales, 2007), they are generally treated as a distinct genre as they are ‘independent discourses’ (van Dijk, 1980, cited in Swales, 1990, p. 179). They are independent, as not everyone who reads the abstract will necessarily read the article itself (Swales, 1990, p. 179).

Morton also stresses the ‘independent’ quality of research article abstracts in particular, and defines an abstract as “a continuous piece of prose written in whole sentences that can function as an independent discourse” (1999, p. 179). He also emphasizes the necessity that the abstract “should reflect the structure of the article, and follow its order exactly” (Morton, 1999, p. 179). He describes this structure as the presentation of the state of previous research, or a description of the background, followed by a description of the experiment or research conducted, a description of the results, and finally a statement of the implications of these results (1999, p. 179).

Morton’s observation is confirmed by Swales, who stands at the forefront in the field of genre studies. Swales holds that abstracts, like other genres reporting research, can be said to have an IMRD (Introduction-Method-Results-Discussion) structure (1990, p. 181). What is significant is that these moves recur throughout thesis and research writing in general. Therefore, abstracts act as a miniature of the academic research genre as a whole, which makes them a powerful and useful research and teaching device in general. However, Swales (1990) observes that abstracts are a neglected genre, which is unfortunate, as they are specifically suitable for genre investigation (p. 181).

Chan and Foo hold that “the abstract is perceived as a rhetorical structure” (2001, p. 4). They also emphasize the information structure realized by moves. They maintain that the abstract is genre specific, “governed by a consistent set of information elements, organized in a specific structure, and expressed in a particular style” (2001, p. 4). They also stress the fact that the cohesion of the abstract is sustained by a move structure, and “follows a broad model of the research paper” (Chan & Foo, 2001, p. 4): an introduction, which includes the statement of purpose, followed by the main method used, then the most significant result, and finally the conclusion or recommendation statement (2001, p. 4).

The research article abstract “has emerged as a result of a well-defined and mutually-understood communicative purpose that most abstracts fulfil, irrespective of the subject-discipline they serve” (Bhatia, 1993, pp. 77-78). Bhatia defines an abstract as “a description or factual summary of the much longer report, … meant to give the reader an exact and concise knowledge of the full article” (1993, p. 78). Bhatia also identifies four moves in abstracts: Introducing purpose, describing methodology, summarizing results, and presenting conclusions (1993, pp. 78-79).


Research results reveal that journal abstracts reflect the structure of the research article, across different disciplines (Hyland, 2000 cited in Hyland, 2004, p. 203). Again consistent with Swales’ IMRD moves and Bhatia’s four moves, the rhetorical moves in article abstracts are given as: Introduction, Purpose, Method, Results, Conclusion (Hyland, 2000 cited in Hyland, 2004, p. 204). Hyland also provides information on the key features of journal abstracts revealed in the study (2004, p. 204). Firstly, while the present tense is used for the background introduction and purpose, the past tense is common in reporting methods and results. Another finding is that although writers sometimes refer to themselves using pronouns, the use of the passive voice and inanimate subjects is also common. The verbs used in different moves differ. For the ‘purpose’ move, presentation verbs like discuss, describe, explore, and address are employed, while verbs such as show, demonstrate, find, and establish are used in the ‘results’ move. Extensive use of noun groups, which “allows the writers to package complex events and entities as single things” (Hyland, 2004, pp. 204-205), is also observed. Hedges are common in an attempt to avoid overstating findings. “This may be intended to state the writers’ confidence in the findings precisely, to avoid criticism, or to show respect for the alternative views of others” (Hyland, 2004, p. 205).
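
The association between verbs and moves reported above can be sketched as a crude sentence-level heuristic. This is an illustration under stated assumptions, not a validated move-tagging method: the verb lists are limited to the examples cited from Hyland, the inflection rule is deliberately naive, and the names `guess_move`, `build_forms` and the sample sentences are hypothetical.

```python
import re

# Verbs cited above for each move; illustrative, not exhaustive.
PURPOSE_VERBS = {"discuss", "describe", "explore", "address"}
RESULT_VERBS = {"show", "demonstrate", "find", "establish"}


def build_forms(verbs):
    """Naively generate inflected forms (-s, -es, -d, -ed) for each verb."""
    forms = set()
    for verb in verbs:
        forms |= {verb, verb + "s", verb + "es", verb + "d", verb + "ed"}
    return forms


PURPOSE_FORMS = build_forms(PURPOSE_VERBS)
# Irregular forms have to be added by hand.
RESULT_FORMS = build_forms(RESULT_VERBS) | {"found", "shown"}


def guess_move(sentence):
    """Guess 'purpose' or 'results' from the verbs present, or 'unclear' otherwise."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    purpose_hits = sum(token in PURPOSE_FORMS for token in tokens)
    result_hits = sum(token in RESULT_FORMS for token in tokens)
    if purpose_hits > result_hits:
        return "purpose"
    if result_hits > purpose_hits:
        return "results"
    return "unclear"


# Invented sample sentences.
print(guess_move("This paper explores the use of hedges in abstracts."))
print(guess_move("The analysis showed that learners repeat similar items."))
```

The heuristic cannot tell a verb from a homographic noun (for example, ‘address’ or ‘find’), so its output would serve at most as a first pass before manual move analysis.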

Hedges are used in abstracts written in other languages as well. According to the findings of a small-scale study comparing Turkish-medium thesis abstracts with those written in English, grammatical and lexical words are employed as hedges. Further, while the tense in abstracts written in both languages is predominantly present, their move structures also display considerable similarity (Hancioglu, 2002, pp. 4-6). Cross-linguistic analysis of abstracts has attracted considerable attention in recent years. Bonn and Swales (2007) compared French and English academic article abstracts from the language sciences, and found that in French article abstracts longer sentences are used, and in English abstracts, more passives are employed (p. 93). An interesting finding was that the abstracts written in English attempt to situate their research within a wider academic context (p. 104). Bonn and Swales (2007) state that this may be due to the larger size of the English discourse community (p. 93).

2.2 Writing Pedagogy

So far, the product, i.e. the text, has been explored in depth. As this research is pedagogical in nature, and since especially non-native writers from different fields need guidance and instruction in becoming aware of, and meeting the requirements of producing a coherent and appropriate text, writing pedagogy, with a special focus on academic writing (EAP-English for Academic Purposes), also needs to be investigated.


Approaches to Writing

Different approaches to teaching writing are characterized by their different focuses: the form, the writer, the reader. Although these approaches have developed as a reaction to, or as an attempt to improve on, the previous one, they all have a place in theory and in practice today (Raimes, 1991, p. 408). The Product Approach emerged in the late 1960s, when the audio-lingual method was used in language instruction. Jordan defines this approach as being concerned with ‘the finished product’ (1997, p. 164). This approach, also known as the text-based approach, considers writing “as mainly concerned with knowledge about the structure of language”, and development in writing is regarded as “the result of imitation of input in the form of texts” (Badger, 2002, p. 1). Pincas states that this approach considers writing “as primarily about linguistic knowledge, with attention focused on the appropriate use of vocabulary, syntax, and cohesive devices” (cited in Badger and White, 2000, p. 153). In this approach, learning how to write has four stages: familiarization by analyzing the target text, controlled writing, guided writing and free writing. Leki criticizes this approach saying that what is called guided writing in this approach is in fact ‘disguised grammar exercises’ (1991, p. 8). The other main criticisms of this approach are that skills, such as planning, play a minor role, that the learners’ knowledge is undervalued, and that the social context in which texts are produced is not paid enough attention (Badger, 2002, p. 1). Another criticism is that presenting the students with an ‘aimed-for model’ and expecting them to produce a parallel text restricts them in what they can write and how they can write it (Jordan, 1997, p. 164).

The emergence of the communicative approach led to student-centred learning, and therefore resulted in a focus on the writer rather than the finished text, and on meaning rather than form: the process approach. One vital element of this approach is the feedback from the teacher and peers to enable the writer to make improvements to the writing produced. Keh points out that feedback in the form of comments, suggestions and questions a reader provides for the writer serves as a good opportunity for revision by offering a reader perspective (1990, p. 294). The process approach mainly regards writing “as the exercise of linguistic skills and writing development as acquisition which happens in situations where teachers facilitate the exercise of writing skills” (Badger, 2002, p. 1). The writing skills mentioned here are rehearsing or pre-writing, drafting, revising and editing. The focus in this approach is not the product, but the process the students go through to create it. The rationale is that “error-free writing without substance is not as good as substantive writing even with errors” (Leki, 1991, p. 10), and “text evolves as a result of the writer’s efforts to explore, formulate, and reformulate meaning through revision” (Dheram, 1995, p. 160).

Jordan holds that to help students with the challenging process of expressing themselves effectively in writing, the process approach, which “is concerned with the processes of writing that enable the product to be achieved”, is used. Especially lower level students are expected to benefit from this approach since “the processes involved match the mental processes inherent in writing in the mother tongue, namely, planning, drafting, rethinking, revising, etc.” (1997, p. 164). Leki believes that when students are not focused on grammatical mistakes, and instead they write freely to convey a message, they “develop confidence and a sense of power over the language …” (1991, p. 8). With this approach, the students, liberated from an approach emphasizing correct form and accuracy, can express themselves fluently. They have “more opportunities for meaningful writing, are less dependent on the teacher, and work collaboratively with other students” (Richards, 1990, p. 110).

There are, however, some critics of this approach as well. Badger summarizes these criticisms as follows: These approaches “regard all writing as being produced by the same set of processes”, do not pay a lot of attention to the kind of texts writers produce, and may not “provide learners with sufficient input to carry out the writing tasks successfully” (2002, p. 1). One other criticism is that since EAP writing is very product-oriented and the conventions regarding the organization and expressions are very tight, an approach that encourages meaning, individuality and fluency may not be appropriate for it. Therefore, there is the need to familiarize students with these conventions to help them operate within them (White, 1988 cited in Jordan, 1997, p. 168).

Hyland is also a strong critic of process approaches and he offers five limitations related to this approach. Firstly, process writing “represents writing as a decontextualised skill by foregrounding the writer as an isolated individual struggling to express personal meanings” (2003, p. 18). Language use is the outcome of ‘individual capacities’. Thus, while this “approach acknowledges the cognitive dimensions of writing and sees the learner as the active processor of information, it neglects the actual processes of language use” (Hyland, 2003, p. 19). Secondly, these models “disempower teachers and cast them in the role of well-meaning bystanders” (2003, p. 19). The feedback stage is the most important step in this approach as explicit teaching of language is most likely to occur here. However, as language and rhetorical organization are only dealt with in the final ‘editing’ stage, students are not offered an understanding of how different texts are constructed in clear and recognizable ways in terms of their purpose, audience and message. The third limitation of this approach is that what is to be learned is not clear. Students are expected to discover appropriate forms as they are writing, aided by the teacher’s feedback and samples of expert writing which are not analysed. This attitude presupposes knowledge of target texts. L1 students may actually have this knowledge. However, L2 learners “do not have access to this cultural resource and so lack knowledge of the typical patterns and possibilities of variation within the texts that possess cultural capital”. These L2 learners are then forced to utilize “the discourse conventions of their own cultures and may fail to produce texts that are either contextually adequate or educationally valued” (Hyland, 2003, p. 19).


A fourth limitation, according to Hyland, is the role of the hidden American values in process methods. Principles such as personal voice, peer review, critical thinking and textual ownership “incorporate an ideology of individualism which L2 learners may have serious trouble accessing”. These norms of thought and expression may be clear and familiar to American students, but for students whose cultures do not involve the ideology of individualism, these norms may be difficult to recognize and accept (Hyland, 2003, p. 20). The final limitation that Hyland mentions is the “lack of engagement with the socio-political realities of students’ everyday lives and target situations”. This approach caters for the individual needs and personalities of learners and fails to offer them access to the resources to “participate in, understand, or challenge valued discourses”, thus failing to present them with “the cultural and linguistic resources necessary for them to engage critically with texts” (Hyland, 2003, p. 20).

After remaining the dominant pedagogical practice for more than 30 years, the process approach started to be replaced by more socially-oriented views of writing, which reject the liberal individualism (Hyland, 2003, p. 17). The genre-based approach has been “a response to the occasional excesses of a process approach to writing instruction. An emphasis on a process approach often disregards the importance of written form and, in effect, takes power away from learners, particularly those from different language and culture backgrounds” (Reppen, 2002, p. 321). For these learners from different backgrounds, unless teachers bring the forms and patterns of language use to conscious awareness, many writing conventions will forever remain unlearned. Emphasizing the process at the expense of the product ignores the need for direct instruction of features of text.


Nevertheless, students are still being assessed on features such as text organization and sentence structure (Reppen, 2002, p. 321-322). The genre approach, on the other hand, recognizes the vital role of language in written communication. Genre, therefore, “is, in part, a social response to process” (Hyland, 2003, p. 9). Genres have their own specific structures and own grammatical forms that reflect their specific communicative purpose. Nunan notes that this communicative purpose will determine the characteristics of the type of discourse (1999, p. 280). The goal of genre pedagogies is to “explore ways of scaffolding students’ learning and using knowledge of language to guide them towards a conscious understanding of target genres and the ways language creates meanings in context” (Hyland, 2003, p. 20). A genre-based approach to writing regards writing as “essentially concerned with knowledge of language in context and the development of writing as a response to input in the form of texts” (Badger, 2002, p. 1).

There are strong similarities between this relatively new approach and the product approach and, in some ways, “genre approaches can be regarded as an extension of product approaches”. Like the product approach, this approach “regards writing as predominantly linguistic,” but unlike the product approach, it emphasizes that “writing varies with the social context in which it is produced” (Badger and White, 2000, p. 155). The focus in this approach is on purpose: different kinds of writing fulfill different purposes. In addition to purpose, there are other important considerations such as the subject matter, the relationship between the writer and the audience, and the pattern of organization. The theory of learning in this approach, although not explicitly stated, is that “learning is partly a question of imitation and partly a matter of understanding and consciously applying rules” (Badger and White, 2000, p. 155). In this approach, after a sample text is introduced and analysed, students manipulate the language elements and eventually produce the target text (Badger, 2002, p. 1).

According to Hyland, “genre analysis can … provide the vocabulary and concepts to explicitly teach the text structures we would like our students to produce. It places language at the center of writing development by allowing shared understanding and explicit guidance” (1992, p. 16). Genre theory suggests that explicit knowledge of language is a very important aspect of effective communication and “seeks to establish that it is the context-determined structure of a piece of writing in combination with its propositional content that gives the text its meaning” (Hyland, 1992, p. 17).

The genre-based approach has, however, been criticized for underemphasizing the skills needed to produce a text and causing the learners to be passive. The learners may be able to deal with the types of texts they have studied in the classroom, but not be able to tackle new forms of texts outside the classroom. Moreover, they are not likely to use the language creatively (Badger, 2002, p. 1). The Process Genre Approach emerged as a consequence of the dissatisfaction caused by the emphasis on the product in the genre approach, where skills like drafting, revising and editing are undervalued. The development of writing in this approach is regarded as involving knowledge about language (product and genre approaches), knowledge of the context in which writing takes place (genre approach) and the purpose for the writing (genre approach). According to the process genre view, writing develops by extracting the learner’s potential (process approach), and by providing input for the learner to respond to (product and genre approaches) (Badger and White, 2000, p. 158). In the genre approach, writing takes place in a social situation and aims to achieve a specific purpose, which determines the subject matter, the writer/reader relationship, and organization, channel, or mode. To enable learners to do so, the genre approach focuses on the language used in a particular text. The process genre approach also includes the writers’ processes in producing a text that reflects such elements as field (purpose of communication, the subject matter), tenor (who the audience is/the social role relationships), and mode (channels of communication). After learners have identified the field, mode and tenor with the help of the teacher, peers or sample texts, they use the skills appropriate to the genre, such as redrafting and proofreading, and produce a text (Badger and White, 2000, p. 158).

A writing class employing a process genre approach starts with the situation that leads to a specific genre of writing. Then students produce some writing “in line with their own needs supported by the teacher, their peers and sample texts” (Badger, 2002, p. 1). Therefore, where learners lack knowledge, there are three potential sources: the teacher, other learners and examples of the target genre. The most typical source of input on contextual and linguistic knowledge is language awareness activities, based on a corpus of the relevant genre as there are similarities between texts written for the same reason. These activities help learners to notice the kind of sentence structure and vocabulary used in this genre (Badger and White, 2000, p. 159).


The Process Genre Approach in EAP

EAP (English for Academic Purposes) writing is currently dominated by genre-based approaches, as there is a widespread belief that different communicative purposes result in different genres, and that it is more valuable to focus on the discourse structure and linguistic features relevant to a given genre than on the features of text in general. Henry believes that the aims of a genre approach in ESP (English for Specific Purposes) are to focus on the organization of information in genres through moves performing specific purposes, and to establish the obligatory and optional moves and their order. This focus is claimed to have resulted in students organizing their writing more effectively (Henry, 2007, p. 1).

Another very important aim in this approach to writing is to identify the linguistic features employed to achieve the moves and present these in a meaningful context. Yet, in this regard, Henry has a word of caution. He emphasizes the need to provide learners with ‘a wide enough range of linguistic options’ to choose from to fulfill the different generic functions, since otherwise there is the risk of learners perceiving “the limited number of formulas as a template to be followed rather than as a description of acceptable language conventions” (2007, pp. 1-3).

Genre-based models are based on explicit theory of “how language works or the ways that social context affects linguistic outcomes” (Hyland, 2003, p. 20). In academic contexts, the acceptability of texts is very important to be able to become a member of the academic discourse community and in a genre-based approach, “language is seen as embedded in (and constitutive of) social realities, since it is through recurrent use of conventionalized forms that individuals develop relationships, establish communities, and get things done” (Hyland, 2003, p. 21).

Considering this view, the Process Genre approach seems to be a valuable approach, especially in the teaching of academic writing. The strength of product-based approaches is that they set out principles for the selection of content, which is a matter of syllabus design. Process-oriented approaches, on the other hand, have implications for classroom action, which is a purely methodological concern (Bamforth, cited in Nunan, 1999, p. 287). Nunan points out that “any comprehensive approach to pedagogy must incorporate syllabus design, methodology, and assessment” (Nunan, 1999, p. 287).

As already stated, generic moves in different genres are realized by strategies, which are in turn fulfilled through distinct lexico-grammatical structures. Flowerdew maintains that key lexical phrases are “representative of the move structures” (2000, p. 374). In this regard, the use of corpora, which will be discussed in detail in this chapter, is an effective means of extracting the lexico-grammar, as a corpus provides vast amounts of data and corpus software allows quick and easy manipulation of the language data. Beaugrande notes that “the advent of large corpus data with user-friendly access marks a turning point in the evolution of descriptive linguistics” (2002, p. 1), and concludes that “we should no longer displace real data with invented data” (Beaugrande, 2002, p. 28).

2.3 Corpora and Applications



A corpus may be defined as “a large collection of instances of spoken and written texts” (Halliday and Matthiessen, 2004, p. 29), as “a collection of naturally occurring examples of language, consisting of anything from a few sentences to a set of written texts or tape recordings, which have been collected for linguistic study” (Hunston, 2002, p. 2), and more recently as “collections of texts (or parts of text) that are stored and accessed electronically” (Hunston, 2002, p. 2). Sinclair adds a new dimension to the definition by pointing out that “a corpus is a collection of naturally-occurring language text, chosen to characterize a state or variety of a language” (1991, p. 171).

Corpora, which were first constructed in computer-readable form in the 1960s and 1970s (Stubbs, 1996, p. xvii), have made the observation of language possible. In the past, language description relied on introspection, which utilized intuitions, and elicitation, which drew on the intuitions of other members of the community. Although these two methods of linguistic data collection reveal information about the formal properties and the typical functioning of language, they do not expose data about actual language behaviour (Widdowson, 1996, pp. 72-73). Corpora make the observation of language possible on a vast scale (Widdowson, 1996, p. 73) and provide authentic data. Referring to this authenticity, Halliday and Matthiessen maintain that “what people actually say is very different from what they think they say; and even more different from what they think they ought to say” (2004, p. 34). Sinclair calls this ‘objective evidence’ (1991, p. 1) and holds that supporting the idea that “invented examples can actually represent the language better than real ones” is ‘absurd’ (1991, p. 5). Cook and Prodromou suggest that corpus data include all the peculiarities of native speaker language and contain cultural elements, and therefore material to be presented to the learner should be adapted according to the local context. On the other hand, Carter, like Sinclair, advocates the use of genuine language with learners, saying that otherwise the learner would always be an average user of the language and would be deprived of the opportunity to use the language at a native speaker level (cited in Deterding, 2005).


A corpus gives information about what language is like, and it is more dependable than native speaker intuition since although the native speaker has “experience of very much more language than is contained in even the largest corpus, much of that experience remains hidden from introspection” (Hunston, 2002, p. 20). Intuition is not very helpful in four aspects of language: collocation, frequency, prosody and phraseology. It is easy to intuit some common collocations, such as ‘play – game’, while it is not for some others such as adverb-adjective combinations, such as ‘acutely aware’. Without corpus evidence, the native speaker may not be aware of them (Hunston, 2002, p. 21). Similarly, without corpus data, being conscious of the relative frequency of words, phrases and structures is difficult. For native speakers, it is equally difficult to intuit many instances of pragmatic meaning, and also unusual phraseology without corpus backup (Hunston, 2002, pp. 21-22). However, intuition is very important in “extrapolating important generalizations from a mass of specific information in a corpus” (p. 22). Hunston stresses the fact that “the corpus simply offers the researcher plenty of examples, only intuition can interpret them” (2002, p. 23).

Over the last few decades, “corpora, and the study of corpora, have revolutionized the study of language and the applications of language” (Hunston, 2002, p. 1), as computers became more accessible, and more technologically advanced, making it possible to handle large amounts of data and allowing more corpus studies to be conducted (Hunston 2002, Biber et al. 1998). Biber et al. also maintain that empirical investigations of corpora “can shed new light on previously intractable research questions in linguistics” (1998, p. ix), and the corpus-based approach “has opened the way to a multitude of new investigations of language use” (1998, p. 3).


According to Biber et al. (1998), through the study of corpora, it is not possible to determine what is possible in language, but how the language is actually used in naturally occurring texts (p. 1). Therefore, a corpus-based approach involves the study of language use, and analysts “attempt to uncover typical patterns rather than making judgments of grammaticality” (Biber et al., 1998, p. 3). Hunston similarly draws attention to the fact that a corpus does not give information about the possibility of a language item. Instead, it will present data on its frequency. Furthermore, she adds that a statement about evidence in a corpus will be relevant for that particular corpus, and therefore “conclusions about language drawn from a corpus have to be treated as deductions, not as facts” (2002, pp. 22-23).

The corpus-based approach is empirical, “analyzing the actual patterns of use in natural texts”, and relies on quantitative as well as qualitative analytical techniques (Biber et al., 1998, p. 4). Biber et al. point out one important consideration for corpus-based approaches. The analyses should go beyond simple counts of linguistic features, and include qualitative, functional interpretations of quantitative patterns (1998, pp. 4-5). Hunston similarly maintains that a corpus provides evidence in the form of many examples which can only be interpreted by intuition (2002, p. 23).

A corpus-based approach may study the language use patterns for a linguistic structure, the language of a text or a group of speakers / writers, or the language of different texts or groups of texts (Hunston, 2002, p. 2). The patterns of language use in different texts provide information on language varieties, that is, on how different situations require different registers. Such studies attempt to describe the characteristics of registers, in which grammatical and lexical choices play a major role (Hunston, 2002, p. 2). McCarthy also emphasizes the ‘social’ dimension of corpora and corpus linguistics. He states that “its method, gathering large amounts of representative data, whether written or spoken, immerses it in the social, the world of texts and users, of producers and consumers” (2001, p. 125).

There are different approaches to corpora (McCarthy, 2001). The ‘corpus-based’ approach uses corpora to demonstrate some known facts about the language. On the other hand, through a ‘corpus-driven’ approach, one can “go with a completely open mind to a corpus, willing to be guided, illuminated by it in ways one could not dream of”. A third approach, which critics of corpus work like Widdowson fail to explore according to McCarthy, is the ‘corpus-informed’ approach. Widdowson claims that ‘freezing language in a computer database’ decontextualizes language and makes it impossible to use it as ‘real’ and ‘authentic’ in the language classroom (cited in McCarthy, 2001, p. 128). However, McCarthy holds that with the corpus-informed approach, the applied linguist can “mediate the corpus, design it from the very outset and build it with applied linguistic questions in mind, ask of it the questions applied linguists want answers to, and filter its output, use it as a guide or tool for what you, the teacher, want to achieve” (2001, p. 129). Extracting lexico-grammatical information from a corpus is an example of this approach (McCarthy, 2001, p. 138).

A corpus can do nothing at all by itself, but corpus access software re-arranges this store of used language, allows observations of various kinds to be made and provides a new perspective on language (Hunston, 2002, p. 3). Corpus analysis tools manage data in three ways: to show frequency, phraseology, and collocation (2002, p. 3), which will be explained in Chapter three in detail. Frequency counts are very useful, since they give valuable information on the frequencies of lexical and grammar words in different corpora (Hunston, 2002, p. 3), as well as frequencies of categories of linguistic items (e.g. present and past tenses) across registers (2002, p. 8). Through phraseology, on the other hand, differences between easily confused words can be readily observed from the concordance lines (Hunston, 2002, p. 12).
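The frequency and concordance functions described above can be illustrated with a minimal Python sketch. It is not the corpus software discussed by Hunston or used in this study; the function names (`tokenize`, `frequency_list`, `concordance`), the simple tokenization rule and the two sample sentences are all assumptions made purely for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def frequency_list(tokens, top=None):
    """Return (word, count) pairs ordered from most to least frequent."""
    return Counter(tokens).most_common(top)

def concordance(tokens, node, width=4):
    """Return key-word-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>30}  [{tok}]  {right}")
    return lines

# Hypothetical two-sentence "corpus", invented for this example only.
sample = ("The results show that the method is effective. "
          "The findings show a clear effect of the method on the results.")
tokens = tokenize(sample)

print(frequency_list(tokens, top=3))
for line in concordance(tokens, "show"):
    print(line)
```

Aligning the node word in a central column, as the concordance lines do here, is what allows the patterns to the left and right of the word to be scanned by eye.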

Through concordance lines, it is also possible to observe the ‘central and typical’, meaning distinctions, meaning and pattern, and detail (Hunston, 2002, p. 42). Although a corpus cannot be used to establish what is impossible or possible in a language, it provides information about ‘central and typical usage’. Typicality involves “the most frequent meanings or collocates or phraseology of an individual word or phrase” (Hunston, 2002, p. 42). Centrality, on the other hand, “can be applied to categories of things rather than to individual words” (2002, p. 43). For example, although present progressive can be used for the present, the future, or no specific time, the central use is for the present time. A corpus serves an important purpose here as the prototypical (what is felt to be typical) may not be the most frequent (Hunston, 2002, pp. 43-44). Exploring collocations in corpora is also very valuable, as they “can indicate pairs of lexical items, … , or the association between a lexical word and its frequent grammatical environment”, the latter frequently called ‘colligation’ (Hunston, 2002, p. 12). Collocational information can be useful in highlighting the different meanings of a word, therefore providing a semantic profile of the word, and in obtaining a profile of the semantic field of a word (Hunston, 2002, pp. 75-79).
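As a rough illustration of how collocational information might be extracted, the sketch below counts the words that occur within a fixed span of a node word. The function name `collocates`, the `span` parameter and the sample sentence are hypothetical, and the sketch reports raw co-occurrence counts only; corpus tools typically go further and apply statistical association measures, which are not attempted here.

```python
from collections import Counter

def collocates(tokens, node, span=2):
    """Count words occurring within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Collect the neighbours on both sides, excluding the node itself.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts

# Hypothetical sample text, invented for this example only.
tokens = "she was acutely aware of the risk and he was acutely aware of it".split()

print(collocates(tokens, "aware", span=1).most_common())
```

With a span of one, the left-hand neighbour shows the adverb-adjective combination (‘acutely aware’), while the right-hand neighbour shows the grammatical environment of the node (‘aware of’), which corresponds to the notion of colligation mentioned above.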



Types of Corpora

Corpora are designed for particular purposes, and the purpose determines the type of corpus. Sinclair broadly classifies corpora into two types, ‘sample’ and ‘monitor’ corpora (1991, p. 23). In sample corpora, the most important consideration is ‘the printed language as a whole’, and there is a ‘close to random selection of extracts within genres’ (Sinclair, 1991, p. 23). The Brown Corpus and LOB (the Lancaster-Oslo/Bergen Corpus) are sample corpora which consist of American English and British English, respectively. The LOB corpus “is the British counterpart of the Brown Corpus of American English”, and it was compiled from texts printed in 1961, the same year as the Brown Corpus, so that comparison between these two varieties of English could be made. The second type, the ‘monitor’ corpus, holds a “‘state of the language’ for research purposes” (Sinclair, 1991, p. 26) as opposed to ‘sample’ corpora, which “have a large and up-to-date selection of current English available” (1991, p. 25). Sinclair suggests that the monitor corpus is now ‘a standard research tool’ (1991, p. 26).

Hunston, on the other hand, has a more specific classification regarding the types of corpora, and lists the commonly used corpus types as ‘specialised corpus’, ‘general corpus’, ‘comparable corpora’, ‘parallel corpora’, ‘learner corpus’, ‘pedagogic corpus’, ‘historical or diachronic corpus’ and ‘monitor corpus’ (2002, pp. 14-16). A specialized corpus aims to represent a specific text type, and researchers often compile “their own specialized corpora to reflect the kind of language they want to investigate” (Hunston, 2002, p. 14). A general corpus consists of texts of different types, and it is not likely to represent any particular ‘whole’. This type is generally larger than a specialized corpus, and it is sometimes referred to as ‘reference corpus’ as “it is often used as a baseline in comparison with more specialized corpora”. LOB and the Brown corpus are early examples of general corpora (Hunston, 2002, pp. 14-15). This kind of corpus is called a ‘sample corpus’ by Sinclair (1991, p. 23).

Comparable corpora are two or more corpora which are used to compare different languages or different varieties of a language. Parallel corpora, on the other hand, contain texts “that have been translated from one language into the other” (Hunston, 2002, p. 15). A historical or diachronic corpus contains texts from different time periods and it is “used to trace the developments of aspects of a language over time” (Hunston, 2002, p. 16). A monitor corpus is “designed to track current changes in a language”, so that a language can be compared yearly (2002, p. 16).

A learner corpus is composed of texts which are produced by learners of a language. The compilation of learner corpora is a very recent development. Granger (2003) emphasizes that “only in the early 1990’s did publishers and academics - concurrently but independently - start collecting and analyzing learner data” (p. 538). In this early period, two learner English corpora were developed: the Longman Learners’ Corpus (Longman Corpus Network, 2003), and the International Corpus of Learner English (ICLE, Granger, n.d.) (Granger, 2003, p. 538). Learner corpora aim to find out “in what respects learners differ from each other and from the language of native speakers”. To fulfil the latter aim, a comparable corpus of native-speaker texts is necessary (Hunston, 2002, p. 15).

The last type in Hunston’s categorization of corpora is the pedagogic corpus, which consists of “all the language a learner has been exposed to” (Hunston, 2002, p. 16). Willis’s definition, however, is much more inclusive: a pedagogic corpus involves the texts that the learners have encountered, or will encounter (Willis, 2003, p. 165). According to Willis (2003), “learners process a set of texts to enable them to develop their own vocabulary and work out their own grammar of the language”, and this set of texts can be described as a pedagogic corpus (p. 163).


The Use of Corpora in Applied Linguistics

Corpora are currently used in various fields and for various purposes in Applied Linguistics, ‘to solve real-world problems’ (Hunston, 2002, p. 136). By far the most influential application of corpora has possibly been in the writing of dictionaries and grammar books for learners in that “even people who have never heard of a corpus are using the product of corpus investigation” (Hunston, 2002, p. 96). The effect of corpus analysis has been that there is now more emphasis on frequency, collocation and phraseology, variation, lexis in grammar, and authenticity in dictionaries and reference books. As a result, reference books stress frequency and typicality, as well as phraseology and the interaction between lexis and grammar more (Hunston, 2002, pp. 108-109).

Another application of corpora has been in the study of ideology and culture, critical linguists or critical discourse analysts studying “the role of language in forming and transmitting assumptions about what the world is and should be like, and the role of language in maintaining (or challenging) power relations” (Hunston, 2002, p. 109). Hunston gives as an example a research study carried out by Fairclough who used two corpora (New Labour and Old Labour Party documents) with the aim of showing changes in the ideology of the party through its language (2002, p. 110).


Corpus investigation is also increasingly used in translation, as for institutions such as the European Union, improving and automating translation are very important (Hunston, 2002, p. 123) and corpora offer evidence for “how words are used and what translations for a given word or phrase are possible” (2002, p. 128).

Although dictionaries, grammars and translation aids may all help writers, current work making use of corpora to offer support for writers appears to concentrate on writers in specific areas, “who have fairly narrow identifiable needs” (Hunston, 2002, p. 135). Hunston highlights a very important point regarding this support, stating that “for many writers who are expert in their own field, ……it is not the technical terminology but what might be called the terminology of rhetoric that causes problems” (2002, p. 135). Experts in academic disciplines trying to write papers in a foreign language are confronted with this problem, ‘signals of organization and purpose’ being more challenging to use for them than the technical terminology. Corpus analysis of specific kinds of paper may be employed to isolate words and phrases associated with specific moves and functions to provide the writer with on-line support to use the most appropriate phraseology at different points of the article (Hunston, 2002, p. 135). This last use of corpora naturally leads to a review of the use of corpora in pedagogy.


The Use of Corpora in Language Teaching

Corpus analysis is being used progressively more in language teaching, and it has had two significant effects on the language teacher’s professional life: on content, and on the approach to syllabus design and methodology (Hunston, 2002, p. 137). Corpora give rise to new descriptions of a language, causing a radical change in what the language teacher is teaching (Hunston, 2002, p. 137). O’Keeffe, McCarthy and Carter agree that “the contribution of corpus linguistics, … , to the description of the language we teach is difficult to dispute” (2007, p. 21). They state that “as well as providing an empirical basis for checking our intuitions about language, corpora have also brought to light features about language which had eluded our intuition” (O’Keeffe, McCarthy, & Carter, 2007, p. 21). They give the frequency of ready-assembled chunks as an example of this observation (2007, p. 21).

In a similar vein, Sinclair (1991) states that the fact that words occur in ‘preferred sequences’ has put phraseology at the heart of language description, leading to three important consequences (cited in Hunston, 2002, p. 138). The first is that pattern and meaning are closely related and when a word has more than one meaning, each meaning requires its own pattern and therefore, the ‘word’ as the unit of vocabulary teaching is replaced by the phraseology for each meaning of the word (Hunston, 2002, p. 139). That language is organized mainly according to ‘the idiom principle’ and, when this does not work, to ‘the open-choice principle’ is the second consequence. The open-choice principle is a “way of seeing language text as the result of a very large number of complex choices” and “at each point where a unit is completed (a word or a phrase or a clause), a large range of choice opens up and the only restraint is grammaticalness” (Sinclair, 1991, p. 109). This model is sometimes called a ‘slot-and-filler’ model, visualizing texts “as a series of slots which have to be filled from a lexicon which satisfies local restraints” (1991, p. 109). Sinclair describes the ‘idiom principle’ as a large number of semi-constructed phrases constituting single choices being available to the language user (1991, p. 110).


Hunston further clarifies Sinclair’s principles by saying that “meaning is either made by the phrase as a whole, in accordance with the conventional phraseology, or (less often) it is made by the individual words, operating in accordance with grammatical rules” (2002, p. 145). That “language as word list may be described in terms of phraseology, and language as text may be accounted for in terms of the idiom principle” (Hunston, 2002, p. 149) leads to the third consequence that there is no distinction between lexis and grammar. Sinclair claims that “a model of language which divides grammar and lexis, and which uses grammar to provide a string of lexical choice points is a secondary model” (1991, p. 114). Hunston emphasizes Sinclair’s claims that lexical (content) words and grammatical words (empty words) are not essentially different and “the observed patternings of lexical items are observations about both lexis and grammar” (2002, p. 149). The notions about words that support a distinction between lexis and grammar are challenged by corpus evidence. For example, the fact that not all grammatical words in a corpus come before all the lexical words refutes the notion that grammatical words are more frequent than lexical words (2002, p. 150).

The use of corpora to store vast amounts of data also resulted in the compilation of wordlists, and made it possible to determine which words frequently occur in the language, which words learners need to know primarily, which words to focus on at different stages of learning, and thus provide appropriate support for language learners. In other words, the use of corpora has also influenced syllabus design and methodology. West’s General Service List (GSL) is a “classic list of high frequency words which contains 2,000 word families” (Nation, 2001, p. 15), compiled in 1953 (2001, p. 11).


After West’s GSL of 1953, wordlists attracted attention again years later when computers became more advanced and more accessible, allowing vast amounts of data to be processed with speed and efficiency. Ellis refers to this inactive period as ‘40 years of exile’ for frequency profiling (2002, p. 143). Michael West’s GSL was developed from a corpus containing 5 million words, with the needs of EFL/ESL learners in mind. The words in the list were selected by West according to certain criteria: frequency, ease of learning, coverage of useful concepts, and stylistic level (Coxhead, 2000, p. 213). Each of the 2,000 words is a headword embodying a word family. It is argued that knowing the 2,000 word families in the GSL helps learners to understand 80% of the words in written texts and thus instigates motivation (Carter and McCarthy, 1988, p. 7). The GSL “greatly influenced the choice of vocabulary for EFL course materials, graded readers, and dictionaries until the mid-1970’s” (McArthur, 1992, p. 859), and it is still influential today as there has not been a ‘comparable replacement’ so far (Coxhead, 2000, p. 214).

More recent attempts to provide learners with more specialized lists resulted in a word list of academic words. The AWL (Academic Word List, Coxhead, 2000), compiled from a corpus of 3.5 million running words of written academic text in 1998, contains 570 word families, which account for about 10% of the total words in academic text. The compilation was done according to the range and frequency of words in academic text outside the GSL. Coxhead based the selection of the AWL on the GSL because, in addition to academic words, the 2,000 most commonly used words are necessary for the learner in an academic environment. Together, the word families in the GSL and the AWL account for 86% of the academic corpus (Coxhead, 2000, p. 222). Coxhead holds that the AWL may be used in setting vocabulary targets in EAP courses and designing relevant teaching materials to help students to focus on key vocabulary items. However, Billuroglu and Neufeld (2005) consider the separation of the GSL and the AWL invalid, as there are words in the GSL which are very commonly used in the academic world, and words in the AWL which can very comfortably be used outside academic circles. They therefore compiled the BNL (Billuroglu Neufeld List), which they claim to be more representative of any written text. The rationale for the BNL is that treating the GSL and the AWL separately, and focusing solely on the AWL in academic writing classes, may have serious consequences and may deprive learners of valuable learning opportunities (Hancioglu et al., 2008, p. 475). The BNL is based on “the identification of contemporary words in common use, leading to the creation of a critical lexical mass of 2,709 word families that consistently provides 90% to 95% coverage of the tokens (not including proper nouns, acronyms or abbreviations) in academic corpora” (Billuroglu & Neufeld, 2005, p. 1).
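The notion of coverage that underlies these figures can be illustrated with a minimal sketch. The Python fragment below is not the procedure used by West, Coxhead, or Billuroglu and Neufeld. It assumes two hypothetical, hand-made sets of word forms (`general_list` and `academic_list`) standing in for real lists, and it simply computes the proportion of running words (tokens) in a text that belong to each set.

```python
import re

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(tokens, family_members):
    """Return the share of tokens (0 to 1) whose form appears in the given set."""
    if not tokens:
        return 0.0
    covered = sum(1 for tok in tokens if tok in family_members)
    return covered / len(tokens)

# Hypothetical toy data. A real list would map every inflected and derived
# form to its headword (e.g. "analysed", "analysis" -> "analyse").
general_list = {"the", "of", "a", "in", "this", "study", "studies",
                "was", "were", "and"}
academic_list = {"analyse", "analysed", "analysis", "data",
                 "method", "methods"}

sample = ("The data were analysed in this study and the methods "
          "of analysis varied.")
tokens = tokenize(sample)

general_cov = coverage(tokens, general_list)
academic_cov = coverage(tokens, academic_list)
combined_cov = coverage(tokens, general_list | academic_list)

print(f"general: {general_cov:.1%}  academic: {academic_cov:.1%}  "
      f"combined: {combined_cov:.1%}")
```

In this toy sentence the two sets do not overlap, so the combined coverage is the sum of the two separate figures, which mirrors the way GSL and AWL coverage are added together in the literature. With real lists, the main additional work would lie in grouping word forms into families before counting.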

As mentioned earlier, the use of corpora has the potential to change not only the description of language and instructional content, but also views on syllabus design, methodology, the role of the teacher, and indeed education as a whole. McCarthy states that:

corpus linguistics probably also foreshadows even more profound technological shifts that will impinge upon our long-held notions of education, the roles of teachers, the cultural context of the delivery of educational services and the mediation of theory and technique as the twentieth century becomes history. (2001, p. 125)

Aston adopts a similar viewpoint. He maintains that with cheap concordancing software and the development of computer technology that allows storing of vast amounts of language data, “teachers and learners seem set to have an enormous quantity of material at their fingertips, with obvious implications (at least in theory) for greater democratization and autonomy of learning” (1995, p. 257). Data-driven Learning (DDL) is how this democratization and autonomy are actualized. The use of corpora in teaching and learning languages has become widespread through DDL, which was initially developed for use with international students by Tim Johns, who said “Research is too important to be left to the researchers” (1991, cited in Hunston, 2002, p. 170). The theory behind DDL is that students become ‘language detectives’ (Johns, 1997, p. 101), “discovering facts about the language they are learning for themselves, from authentic examples” (Hunston, 2002, p. 170). Johns especially takes into account the foreign language learner, and emphasizes that the learner’s duty is to ‘discover’ the foreign language, and the language teacher is responsible for providing “a context in which the learner can develop strategies for discovery – strategies through which he or she can ‘learn how to learn’” (Johns, 1991, p. 3).

DDL is a methodology that makes use of computerized concordancing, and this has made a great contribution to the development of CALL (Computer Assisted Language Learning) (Levy, 1997, pp. 64-65). Hunston maintains that “DDL involves setting up situations in which students can answer questions about language themselves by studying corpus data in the form of concordance lines or sentences” (2002, p. 170). In this case, students answer their own questions. As an alternative, Hunston states, self-access materials can be prepared to give students the chance to explore items that are considered either problematic or useful (Hunston, 2002, pp. 170-171). The advantage of the first kind of study is maximizing student motivation, as the student is asking a question to which s/he requires an urgent answer, and consults the corpus data to discover information. In the second kind of study, the advantage is that the teacher has more control over the information. Yet, the motivation of students is not as high as in the first type of study. The fact that dealing with problematic items is more motivating for students leads to the use of learner corpora for designing appropriate DDL tasks. For instance, learner corpora can be investigated to explore learners’ misuse, over- and underuse of language items, and DDL exercises can be developed to raise learners’ awareness of the problematic items (O’Keeffe et al., 2007, p. 23).
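One simple way of looking for over- and underuse is to compare how often a word occurs per 1,000 running words in a learner corpus and in a reference corpus. The sketch below is an illustration only, not a method taken from O’Keeffe et al. or from the present study. The two short texts, the function name `compare`, and the `min_count` threshold are hypothetical, and a real comparison would normally add a statistical significance measure rather than rely on raw ratios.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def compare(learner_text, target_text, min_count=2):
    """Return (word, learner_per_1000, target_per_1000, ratio) tuples.

    A ratio above 1 hints at overuse in the learner text, a ratio below 1
    at underuse. Words absent from the target text get an infinite ratio.
    """
    learner, target = tokenize(learner_text), tokenize(target_text)
    if not learner or not target:
        raise ValueError("both texts must contain at least one token")
    learner_counts, target_counts = Counter(learner), Counter(target)
    rows = []
    for word in set(learner_counts) | set(target_counts):
        if learner_counts[word] + target_counts[word] < min_count:
            continue  # skip words too rare to say anything about
        learner_freq = 1000 * learner_counts[word] / len(learner)
        target_freq = 1000 * target_counts[word] / len(target)
        ratio = learner_freq / target_freq if target_freq else float("inf")
        rows.append((word, learner_freq, target_freq, ratio))
    # Highest ratio first; ties are broken alphabetically.
    return sorted(rows, key=lambda row: (-row[3], row[0]))

# Hypothetical miniature "corpora" for illustration.
learner_text = ("This study aims to show the results. "
                "The study shows the results of the study.")
target_text = ("This thesis examines the findings. The analysis "
               "demonstrates that the findings support the thesis.")

rows = compare(learner_text, target_text)
for word, learner_freq, target_freq, ratio in rows:
    print(f"{word:<10} {learner_freq:7.1f} {target_freq:7.1f} {ratio:6.2f}")
```

Words at the top of the resulting list are candidates for overuse and words at the bottom for underuse. Either group could then be turned into DDL tasks of the kind described above.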

O’Keeffe et al. (2007) also emphasize CALL (Computer Assisted Language Learning), where “learners get hands-on experience of using a corpus through guided tasks or through materials based on corpus evidence” (p. 24). This inductive approach involving the observation of patterns in the target language and forming generalizations about language form and use is referred to as Data-driven Learning (Johns, 1986, cited in O’Keeffe et al., 2007, p. 24). Data-driven Learning often makes use of concordance lines. According to Nation (2001), through examining concordances, learners observe words in real contexts, and the multiple contexts provided by concordance lines offer rich information about words, such as collocates and grammatical patterns (p. 111). Moreover, “the use of concordances involves discovery learning, where the learners are being challenged to actively construct generalizations and note patterns and exceptions” (Nation, 2001, p. 111). Through this practice, learners “control their learning and learn investigative strategies” (Nation, 2001, p. 111), and therefore become ‘language detectives’ (Johns, 1997).
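For readers unfamiliar with concordance lines, the following minimal sketch shows how a keyword-in-context (KWIC) display can be produced. It is an illustration under stated assumptions, not the concordancer used in this study: the function name `kwic`, the `width` parameter, and the sample text are all hypothetical.

```python
import re

def kwic(text, node, width=30):
    """Return keyword-in-context lines for every occurrence of `node`.

    Each line shows up to `width` characters of context on either side,
    padded so that the node word is aligned in a central column.
    """
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Hypothetical sample text for illustration.
sample = ("The results of the study indicate that learners overuse "
          "frequent words. The results were compared with a reference "
          "corpus. These results suggest a need for further training.")

concordance = kwic(sample, "results", width=25)
for line in concordance:
    print(line)
```

Because the node word is aligned vertically, learners can scan the words immediately to its left and right and notice recurring collocates such as "results of", "results were" and "results suggest".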

Nation (2001) emphasizes that learners need training in reading concordance lines (p. 111). Arkin (2003) similarly draws attention to the necessity of providing “guidance, support and training for teachers in integrating computer technology resources into language instruction” (p. 101). As mentioned earlier, though, the use of corpora has transformed what teachers teach, as well as how they teach it, and how learners learn it. As “the corpus revolution is here to stay” (McCarthy, 2008, p. 573), teachers and learners alike need to embrace these innovations, and adapt to changes. One major change in this regard is the need for user-friendly platforms on which corpora, concordances and DDL activities can be mounted. A number of such applications are now widely used, including Moodle, which is a versatile, interactive, virtual learning environment based on social constructivist principles, and powerful enough both to host such data and to generate relevant activity types (Philosophy, 2008). The full scope and potential of Moodle in these respects is explored in more detail in the methodology chapter.

2.4 Related Studies

The final section of the literature review is devoted to the discussion of similar research projects that have been conducted in the field, so that the current research can be considered in context.

Vocabulary has been found to be one of the problems students have with written expression. In one study, Laufer (1998) looked at active vocabulary knowledge, which is an important factor in creating written texts. The study describes quantitative research investigating the development of three types of English as a foreign language vocabulary knowledge, passive, ‘controlled active’, and free active, over one year of school instruction. The study also explores how these types of vocabulary knowledge are related to one another, and what changes take place in these relationships after one year. The results of the research showed that passive vocabulary size progressed very well, and that controlled active vocabulary also improved, albeit less than the passive. However, free active vocabulary did not progress at all. Passive and controlled active vocabulary size scores correlated well, but there was no correlation between the free active vocabulary and the other two types, which means that gains in passive and controlled active vocabulary were not reflected in the lexical profiles of free writing (Laufer, 1998).

Not only vocabulary at word level, but also ‘phraseological competence’ is identified as an important factor in producing effective writing. The corpus-based study by Howarth (1996) involved L2 writers, who were overseas post-graduate students in university social science departments. The central hypothesis tested in the research is that non-native speakers often lack phraseological competence, even at a fairly high level of proficiency, and that this aspect of linguistic competence is frequently not well developed, obstructing easy comprehension, and reducing the effectiveness of their writing. Howarth concludes that native speaker writers of academic English depend as much on the stock of familiar and easily processed ‘ready-mades’ as language users in other registers. He further maintains that if the lexico-grammatical form conforms to the norms of the register to the anticipated degree, the reader’s conscious attention is focused on meaning while the form is processed subconsciously (Howarth, 1996).

Similar to Howarth (1996), Hyland (2008) focuses on the importance of ‘lexical bundles’, “extended collocations which appear more frequently than expected by chance, helping to shape meanings in specific contexts and contributing to our sense of coherence in a text” (2008, p. 4), in language production, but also emphasizes disciplinary variation. In the study, Hyland explores the forms, structures and functions of 4-word bundles in a 3.5 million word corpus of research articles, doctoral dissertations, and Master’s theses in four disciplines to find out about their frequencies and preferred uses. The study makes use of both quantitative and qualitative procedures for data analysis. Hyland’s findings support studies that have identified considerable variation in the frequency of forms, structures, and functions across different types of academic writing. Hyland concludes that there should be pedagogical focus on bundles in EAP courses, suggests that EAP course designers take this fact into consideration, and start with the student’s specific target context. He also maintains that corpus-informed lists and concordances can be used to identify these frequently occurring bundles.
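The identification of frequently occurring bundles can be sketched in a few lines of code. The fragment below is a minimal illustration, not Hyland’s actual procedure: the function name `bundles`, the thresholds `min_freq` and `min_texts`, and the three sample sentences are assumptions made for the example. It counts every 4-word sequence and keeps those that recur often enough and in enough different texts.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return word tokens (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences meeting a frequency and a range threshold.

    `min_freq` is the minimum total number of occurrences and `min_texts`
    the minimum number of different texts in which the sequence appears.
    Both values here are placeholders chosen for the toy data.
    """
    freq = Counter()        # total occurrences of each n-gram
    text_range = Counter()  # number of texts containing each n-gram
    for text in texts:
        tokens = tokenize(text)
        grams = [" ".join(tokens[i:i + n])
                 for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        text_range.update(set(grams))
    return {gram: count for gram, count in freq.items()
            if count >= min_freq and text_range[gram] >= min_texts}

# Hypothetical sample texts for illustration.
texts = [
    "On the other hand, the results of the study were mixed.",
    "The results of the survey, on the other hand, were clear.",
    "As a result of the changes, the results of the study improved.",
]

found = bundles(texts)
for gram, count in sorted(found.items(), key=lambda item: (-item[1], item[0])):
    print(f"{count}  {gram}")
```

The range threshold matters because a sequence repeated many times by a single writer is an idiosyncrasy rather than a bundle shared by the discourse community. Note that this sketch ignores punctuation, so sequences may cross clause boundaries, which a fuller analysis might wish to exclude.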

In another study, Hyland and Tse (2007) challenge the notion of a single core vocabulary, and analyse the assumption that “students of EAP should study a core of high frequency words because they are common in an English academic register” (p. 235). The study, which makes use of both quantitative and qualitative analyses, employs Coxhead’s (2000) Academic Word List (AWL) to investigate the distribution of the 570 word families in a corpus of 3.3 million words from a range of eight academic disciplines and seven different genres. The results showed that although the AWL offered good coverage of the corpus compiled for the study, it was not evenly distributed. Hyland and Tse conclude that each subject discipline is dependent on its own rhetorical practices, and recommend that “teachers help students develop a more restricted, discipline-based lexical repertoire” (2007, p. 235). Hyland and Tse therefore propose that field and genre provide a more solid basis for corpus-informed work than the concept of a “single academic literacy” (Hyland and Tse, 2007).


Considering disciplinary variation, Mudraya (2006) also carried out a study with the aim of establishing a frequency-based corpus of student engineering lexis. The Student Engineering English Corpus (SEEC), which contained 2,000,000 running words, was reduced to 1,200 word families and 9,000 word types. Based on the research findings, Mudraya maintains that sub-technical vocabulary (lexical items with technical and non-technical senses) as well as Academic English should be emphasized more in the ESP classroom. Mudraya also argues that data-driven instructional activities consistent with the lexical approach should be used “in order to help students acquire the so-called language prefabs, or formulaic multi-word units / collocations, for technical and non-technical uses” (2006, p. 235).

Quite a few studies emphasize the importance of rhetorical, or move structures for the teaching of different academic genres. One study carried out by Flowerdew (2000) proposes the use of a genre-based framework for the teaching of the organizational structure of academic report writing. The study involved 15 mechanical engineering under-graduate project reports from a university in Hong Kong, and an analysis of these reports revealed that there may be variations in terms of the moves, the ordering of the moves as well as their linguistic realizations within genres.

In another study, Anthony (1999) evaluates the standard model for describing the structure of research article introductions, the CARS (Create a Research Space) model (Swales, 1990), in terms of how it can be applied to twelve articles in the field of software engineering. The results showed that although the model was very successful in describing the main framework of introductions, problems emerged when a more detailed description was necessary. Many steps in the model were redundant or rarely used because the model was developed based upon a wide variety of disciplines. A more serious problem identified by Anthony was the absence of a separate ‘evaluation of research’ step, which was shown to be a crucial element in achieving the aims of the introduction in his research. Anthony (1999) concludes that if the limitations of the CARS model are understood, it can be effectively used as a pedagogic tool in the classroom.

Henry (2007) argues that although a lot of research has been conducted on identifying moves and their order in specific genres, rather less attention has been paid to the presentation of key lexico-grammatical features of genres and moves. The aim of Henry’s study is to determine the effectiveness of a website in presenting computer-based corpus analyses of sentence-level genre features to language learners in a meaningful way in an EAP/ESP teaching situation. When the job application letters written by the learners before and after using the website were compared, it was found that the learners wrote more effective letters of application, including more obligatory and optional moves, and making use of the lexico-grammatical features associated with the genre. The learners also responded positively to the use of a website and the self-study approach taken. One of the suggestions Henry makes for future research is to investigate whether this approach would be effective with less formulaic genres, such as essays and dissertations.

Recently, a number of studies have focused specifically on post-graduate level students. A pedagogic study particularly relevant to the current study explores the effect of corpora on the writing competence of post-graduate students. Yoon (2008) investigates the changes in students’ writing process associated with corpus use over an extended period of time through a qualitative study. The findings reveal that corpus use helps the students solve their immediate writing problems and promotes their perceptions of lexico-grammar and language awareness. With the introduction of the corpus into the writing process, the students assumed more responsibility for their writing, became more independent writers, and their confidence in writing increased.

Studies have also been conducted on the difficulties faced by post-graduate students in thesis writing. Paltridge (2002), through his study, aims to find out “the extent to which published advice on the organisation and structure of theses and dissertations concurs with what happens in actual practice” (p. 125). Paltridge maintains that published advice on thesis writing is quite important, as there is very little research carried out on the actual theses due to various reasons. The study examined guides which focused only on the thesis, and also handbooks which focused on the research process in general, but also referred to thesis writing. According to the findings, the published advice covered many important aspects of the research process, like selecting a topic, and writing a research proposal. However, the majority of the books emphasized thesis writing, rather than the content of the individual chapters. In addition, none of the books described the complete range of thesis options available to thesis writers. The study found a wider range of thesis types than those focused on in the guides and handbooks. Paltridge (2002) argues that teaching materials should present students with the range of thesis options, emphasize the kind of variation that occurs in actual texts, and consider the rationale for these various choices.

Thesis writing is even more challenging for post-graduate students who need to report their research in a second or foreign language. Yet, research on non-native post-graduate students’ writing difficulties, and “how faculty assist these students in their thesis/dissertation writing is sparse” (Dong, 1998, p. 370). Therefore, Dong aimed to find out the difficulties that non-native post-graduate students in particular faced in writing their theses, the discipline, genre, and audience specific knowledge of these students about thesis writing in science, the quality and quantity of the supervisor’s assistance in thesis writing, the helping networks available for these students, and the impact of perceived language and cultural differences on thesis writing. The research was carried out at two state research universities in the U.S., with Master’s and PhD students in science and engineering departments, who had passed their qualifying exams, and were in the process of doing their thesis research and writing. 137 native and non-native students and 32 advisors were involved in the research. The survey instruments included a thesis writing scale and a questionnaire, and the data analysis involved both quantitative and qualitative procedures. The results showed that the professors were more likely than students to perceive that they had provided help with topic selection, idea development, drawing conclusions, avoidance of plagiarism, paragraph organization, and logical presentation. Dong (1998), based on the findings, concludes that knowledge transformation skills should be taught in EAP classes, helping networks should be established to provide thesis writing support, and there should be collaboration among disciplines on audience/genre/discipline specific writing instruction.

Another very relevant study employs a corpus of theses written by native speakers, and genre-based pedagogy to improve the writing skills of non-native post-graduate students (Charles, 2007). In the pedagogic study, Charles explores how top-down and bottom-up approaches can be reconciled in EAP writing materials through the use of discourse analysis and corpus investigation. Charles defines ‘genre analysis’ as a top-down approach, and corpus investigation as a bottom-up approach to EAP. In the construction of EAP materials, emphasis is first on macro-textual features, whereas in corpus linguistics, the analyst starts with the linguistic features and subsequently attempts to connect them to wider discoursal concerns. In the small pilot project described in the study, the materials were used with 40 international post-graduate students. Based on the research findings, Charles concludes that “it is the combination of the two approaches that provides the enriched input necessary for students to make the connection between general rhetorical purposes and specific lexico-grammatical choices” (2007, p. 289). She further emphasizes that “in moving from discourse to corpus, the class moves from studying what texts do, to investigating how they do it” (Charles, 2007, p. 300).

In what could be considered the study most closely related to the present research, Lee and Swales (2006) present a discussion of an experimental course in corpus-informed EAP for non-native doctoral students, the aim of which was to introduce students to the corpus approach to language, “where the authority for language standards is decentered, and where learners, given some guidance and structured help, can take a more active role in their own learning” (p. 68). As part of this course, the participants were given access to specialized academic corpora, instructed in the use of web- and PC-based concordancers, and inducted into the skills needed to exploit the data for both directed and self-learning. The induction involved such practices as the analysis of language items and using the context to disambiguate near-synonymous pairs of words through examining concordances, frequency patterns in language usage and change through corpus frequencies, and understanding fixity and flexibility in language through collocations and prefabricated expressions. Lee and Swales report that at the end of the course, they had ‘an exceptional group of learners’ who could engage intelligently with ‘expert corpora’ of language, and examine them for lexico-grammatical and discourse patterns. This afforded the learners models against which they could compare their own performance. Lee and Swales are positive about the findings, and maintain that this approach to language learning would probably lead learners to assume more control over their own learning.

2.5 Summary

As far as the present research is concerned, a number of key issues emerge from the review of the literature to date. Most importantly, it should be acknowledged that creating text is a complicated and challenging task requiring not only knowledge of grammar and lexis, but a wide range of structural and lexico-grammatical nuances that create both textual cohesion and coherence. Further, standards of textuality require knowledge of textual features, namely intentionality, acceptability, informativity, situationality and intertextuality. This sophisticated knowledge contributes not only to the creation of text but to the creation of discourse communities, and associated genres, which may have specialized ways of expressing themselves, and lexico-grammatical features that represent them.

Genres themselves are very often constituted of moves, which may be compulsory or optional. These moves are realized through strategies, or tactics, reflected through lexico-grammatical elements, that allow for both paradigmatic and syntagmatic choice. Although in theory this generates an infinite number of possibilities out of a finite number of items, one problem learners face is that their paradigmatic and syntagmatic knowledge is often highly restricted, particularly as far as such areas as synonymy, and collocations are concerned. Genres are not necessarily subject field specific, and in fact may cut across multiple specialized fields and discourse communities. Academic genres, regardless of subject specialization, seem to display certain common features of different types that may need mastering. Some of these features are degrees of formality, the need for hedging, avoidance of continuous tenses, the use of passives or impersonal subjects, as well as generic organization.

Writing pedagogy has developed and tested different ways of instructing learners to express intended meanings coherently and appropriately. There have been approaches focusing on the product or the process, most of which have been criticized for various reasons. Current thinking leans towards a more integrated model, a process genre approach, which may be facilitated by technological developments that allow the creation of and access to large language corpora that can be analysed for lexico-grammatical content, word frequency, collocation and phraseology through concordance lines. This empirical approach towards the teaching of text creation has the advantage of providing authentic data. What is more, such information can also be accessed by learners if compiled and organized effectively, and can therefore be used not just as a basis for research, but also for teaching and learning.

Hence, the use of corpora has implications for both syllabus and methodology. DDL (Data Driven Learning) methodology provides the opportunity for learners to act as language researchers. They can find instant answers and solutions to their language and language-related problems, discover regularities and commonalities in the language, and thus become autonomous and self-directed learners. Furthermore, nowadays, e-learning or virtual learning platforms like Moodle are very popular, as they allow the learner to interact with the environment, the data, and other learners, which are believed to be effective contributors to learning.


In the light of the literature review, the main objective of the current research, therefore, is to construct a pedagogic corpus and incorporate data-driven learning tasks into the advanced academic writing course to develop the text creation skills of post-graduate students involved in thesis writing. However, it should be borne in mind that while dealing with texts, “we should work to discover regularities, strategies, motivations, preferences, and defaults rather than rules and laws” (Beaugrande & Dressler, 1981, p. xv).




In this chapter, first, the overall research design of the study is introduced, and the research context and the participants are described. After the presentation of the development and implementation of the data collection instruments, detailed information pertaining to the data collection and data analysis instruments and procedures is given. Then the virtual learning environment used in this study to host the corpora and the DDL activities, Moodle, is described. The chapter concludes with a discussion of the limitations and the delimitations of the study.

3.1 Overall Research Design

Educational research aims at continuous examination and improvement of the educational context. This necessitates the identification of any existing or emerging problems, seeking knowledge and ultimately taking required action to improve the practices. This type of research is referred to as ‘Action-Oriented Problem-Solving Research’ (Patton, 2002, p. 221). Cohen and Manion state that “the aim of action research is to improve the current state of affairs within the educational context in which the research is being carried out” (cited in Nunan, 1992, p. 18).

This study falls into the category of practical action research, “which takes a more applied and contextualized approach to action research” (Mills, 2000, p. 21). Mills defines the goals of action research as “gaining insight, developing reflective practice, effecting positive changes ... on educational practices … and improving student outcomes and the lives of those involved” (Mills, 2000, p. 6). In action research, teacher researchers choose their areas of study, decide on their data collection techniques and develop action plans after analyzing and interpreting their data (Mills, 2000, p. 9). Action research involves having a reflective and critical stance towards educational practice, and possessing the willingness to improve or enhance it. In the process, the researcher validates and challenges existing practices while also taking risks (Mills, 2000, p. 11).

Hyland stresses the importance of research carried out by teachers themselves. He argues that “a good teacher is a reflective teacher, and a reflective teacher is someone who is familiar with the research, or carries out research” (2005, p. 58). Advocating the same view, McCarthy calls teachers ‘action-oriented’ applied linguists (2001, p. 124).

Mills states that action research is a four-step process involving identifying a focus area, collecting data, analyzing and interpreting data, and developing an action plan (2000, p. 6). The four-step cycle of action research was realized in this research in the following way:

Step 1: Identifying an area of focus

Like all post-graduate learners worldwide, the graduate Master’s and PhD candidates taking ENGL501, currently Advanced Thesis Writing at EMU, are expected to carry out academic scientific research, report their research in their theses, contribute to their research fields by getting their research articles published, and be accepted in the global academic discourse community. To this end, these non-native learners of the English language living and studying in a non-English speaking environment are required to possess adequate academic language competence, and produce accurate and appropriate written work. However, these post-graduate candidates’ written work revealed deficiencies specifically at the lexico-grammatical level.

The problem was initially identified through:

observation of classroom performance / analysis of written assignments over six academic semesters;

needs analysis questionnaires administered to the candidates at the beginning of the semester;

end-of-semester feedback forms from the candidates.

Step 2: Seeking knowledge - collecting data

• Interviews with teachers who have taught the course before

• Observation of class performance, and written assignments

• Compilation of the Learner Abstract Corpus (the LAC) including 100 learner abstracts collected through six semesters

• Compilation of the Target Abstract Corpus (the TAC) including 600 thesis and dissertation abstracts from four domains (Architecture, Arts and Humanities, Sciences, Social Sciences), and written by Master’s and PhD students living and studying in English-speaking countries


Step 3: Analyzing and interpreting data

The collected data were analyzed to provide answers to the following four research questions which guided the research study:

1. What are the major lexico-grammatical patterns identified in the LAC?

2. What are the major lexico-structural patterns in the TAC?

3. How does the LAC relate to the TAC?

4. What does the cross-examination of the two corpora necessitate in terms of the comprehensive pedagogic corpus design?

Step 4: Developing and implementing an action plan

The action plan comprised the construction of a comprehensive pedagogic corpus with multiple components; making the pedagogic corpus as well as all the data available online, accompanied by data-driven exploratory tasks, with students learning and discussing the complexities of academic text in a learning environment; and the creation of a corpus-informed advanced thesis writing course for post-graduate candidates at the institution.

Quantitative and qualitative research methods in education cannot be considered as polar opposites. Patten points out that “some researchers conduct research that is a blend of the two approaches” (2004, p. 21), and Reichardt and Cook state that “researchers in no way follow the principles of a supposed paradigm without simultaneously assuming methods and values of the alternative paradigms” (cited in Nunan, 1992, p. 3). Fraenkel and Wallen hold that “much research in education is a mixture of quantitative and qualitative approaches” (1993, p. 380), and Patton (2002) maintains that “research and evaluation studies employing multiple methods, including combinations of qualitative and quantitative data, are common” (p. 5). In Applied Linguistics as well, the trend is more towards qualitative research, as stated by McCarthy: “the emphasis on both sides of the Atlantic and internationally has shifted away from purely quantitative notions of research in applied linguistics towards qualitative research paradigms” (2001, p. 119).

This is a combined study that made use of both quantitative and qualitative methods. The study initially envisaged applying qualitative methods, as it approached the problem in question inductively, and the research study developed as the data unfolded. The researcher identified a problem through observation over an extended period, and also through semi-structured interviews and open-ended questionnaires. Qualitative studies explore the quality of relationships, activities, situations or materials (Fraenkel and Wallen, 1993, p. 380). Accordingly, the study sought to find out how far the integration of corpus data-driven work into a thesis writing course would contribute to the quality of participants’ written work.

The present study, however, also exploited quantitative research tools, as in corpus studies, range and frequency calculation yields meaningful quantitative data. Subsequently, an in-depth study of corpus data produced qualitative data. The corpus-based approach is empirical, “analyzing the actual patterns of use in natural texts”, and relies on quantitative as well as qualitative analytical techniques (Biber et al., 1998, p. 4). Biber et al. emphasize that in research employing corpora, the analyses should go beyond simple counts of linguistic features, and include qualitative, functional interpretations of quantitative patterns (1998, pp. 4-5).


Hunston similarly draws attention to the fact that a corpus provides evidence in the form of many examples which can only be interpreted by intuition (2002, p. 23).
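The range and frequency calculations referred to above can be illustrated with a minimal Python sketch. It is not the software used in this study; the tokenizer, the function names, and the three sample sentences are assumptions made purely for illustration. Frequency is taken here as the total number of occurrences of a word in the corpus, and range as the number of texts in which the word occurs at least once.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its word tokens (hyphenated words stay whole)."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())


def frequency_and_range(texts):
    """Return two Counters: total occurrences per word, and texts per word."""
    frequency = Counter()
    text_range = Counter()
    for text in texts:
        tokens = tokenize(text)
        frequency.update(tokens)        # every occurrence counts
        text_range.update(set(tokens))  # each text counts once per word
    return frequency, text_range


# Three invented sentences standing in for abstracts (illustration only)
sample = [
    "This study investigates the use of corpora in thesis writing.",
    "The study reports the results of a corpus-based analysis.",
    "Results indicate that the learners repeat similar items.",
]

freq, rng = frequency_and_range(sample)
print("study:", freq["study"], "occurrences in", rng["study"], "texts")
print("the:", freq["the"], "occurrences in", rng["the"], "texts")
```

Counts of this kind are only the starting point; as noted above, they still call for qualitative, functional interpretation.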

3.2 Research Questions

The major objective of the current research was to construct a pedagogic corpus with multiple components, and incorporate corpus data as well as data-driven learning activities into the advanced thesis writing course with the aim of improving the text creation skills of non-native post-graduate students involved in thesis writing. As can be seen in Figure 3.1, multiple data sources were utilized to ensure triangulation.



Figure 3.1: Multiple data sources employed

(Figure components: 1. Observation of student work through six semesters; 2. Interviews with previous course designers/instructors; 3. Pre and post questionnaires; 4. Compilation of the Learner Abstract Corpus (the LAC); 5. Compilation of the Target Abstract Corpus (the TAC); Cross-reference the LAC with the TAC)


The following research questions steered the research:

1. What are the major lexico-grammatical patterns identified in the LAC?

2. What are the major lexico-structural patterns in the TAC?

3. How does the LAC relate to the TAC?

4. What does the cross-examination of the two corpora necessitate in terms of the comprehensive pedagogic corpus design?

3.3 The Context

The Eastern Mediterranean University (EMU) is an English medium university in Northern Cyprus providing higher education to 14,000 students of 68 different nationalities. Seven faculties (Faculty of Arts and Sciences, Faculty of Architecture, Faculty of Business and Economics, Faculty of Communication and Media Studies, Faculty of Education, Faculty of Engineering and Faculty of Law) and three schools (English Preparatory School, Tourism and Hospitality Management, Computing and Technology) offer instruction at undergraduate level. The students, who are generally non-native speakers of English, receive language support in the preparatory and in the freshman year to enable them to cope with their academic studies in the target language. The Institute of Graduate Studies and Research runs Master’s and PhD programs at the institution. At graduate level there was no language support until 2000, which led the Institute of Graduate Studies and Research to ask the former School of Foreign Languages for a language course that would help Master’s and PhD candidates with thesis writing.

Therefore, the former School of Foreign Languages designed a new course, EFL 501, Advanced Writing, and started to offer it as a non-credit course to graduate students as of 2001. In the year 2002, the Academic Writing Unit was founded by the researcher of this study with the primary aim of providing further individualized or group assistance to final year undergraduate and graduate students in improving their academic writing skills. The basis for foundation of the Academic Writing Unit was that language support was not provided in the final years of undergraduate studies. The Academic Writing Unit delivered educational services both through one-to-one tutorials and group workshops. In the year 2003, having observed the difficulties the graduate students were having with English, especially in terms of the accurate and appropriate expression of ideas in writing, the researcher and an associate, who was also a tutor at the Academic Writing Unit, proposed a prerequisite course to the then existing EFL 501. This course would be EFL 501, Advanced Writing 1 and would be offered as a non-credit elective course to Master’s and PhD candidates who were still taking courses and who had not started writing their thesis. The earlier EFL 501 therefore became EFL 502, and was offered to graduate students who had entered the thesis writing stage.

This new EFL 501 (now ENGL501) course aimed at developing the academic writing skills of graduate students, and was designed to provide opportunities for the participants to analyze the structure and lexis of authentic academic texts. The participants then produced their own work and were encouraged to recognize their language problems, and find solutions to those problems. The syllabus (see Appendix A) covered the common functions in theses. Therefore, relevant input and tasks were provided on such functions as introducing, comparing, interpreting data, narrating/reporting, and classifying/categorising.

ENGL501 (now Advanced Thesis Writing) has been offered as a non-credit course for almost five years now, and some departments (Departments of Civil Engineering, Industrial Engineering), and the Faculty of Architecture have made the course compulsory for their post-graduate students. Two other departments (Departments of Banking and Finance, and Economics) have asked the researcher to teach the course as a departmental course, and currently the advanced writing course is also offered under the course names of ‘Seminar in Banking and Finance’ and ‘Seminar in Economics’, with course codes BANK 598 and ECON 598. These developments seem to suggest that the post-graduate participants, as well as their departments benefit from the course.

ENGL501 has been revised every semester since its inception, taking into account the teacher researcher’s observations and experiences with the coursework, feedback from the participants, and feedback from the departments. In the year 2005, ENGL501 and ENGL502 were merged upon the request of the Department of Banking and Finance, which felt that its post-graduate students needed both courses. Therefore, the new course was completely redesigned, and was developed into an advanced thesis writing course with a language focus, and with special emphasis on academic writing conventions.

At present, the revised syllabus (Appendix B) is based on the five chapters in a traditional thesis, the thesis proposal, and the abstract. Therefore, the participants have the opportunity to produce a miniature of their thesis during the course. The instructional materials, which are obtained from the World Wide Web, are entirely authentic, and are enhanced with tasks designed by the researcher to promote awareness and critical analysis of the input. The approach is process-genre based. The focus is on the different generic moves in theses, and how they are fulfilled through language. For instance, the students are presented with alternative language for opening up a research gap, which is a necessary move in thesis introductions. As regards the process focus, drafting, revising, and editing are valued as fundamental steps in the writing process, and the participants go through these stages when they are producing their work. Academic writing conventions, and quoting, paraphrasing, referencing, and summarizing skills receive maximum emphasis.

The feedback received from multiple sources as regards the ENGL501 course has been very positive through the years. Yet, in education there is always room for improvement, and there should always be the motivation and will to enhance the learning outcomes. With the advanced computer tools, it is now possible for teacher researchers to compile their own specialized corpus, and offer this comprehensive authentic source to the service of their students. In the study, corpus work and DDL (Data Driven Learning) activities were integrated into the course to provide more learning opportunities for the candidates in developing their academic writing performance. Furthermore, the virtual learning environment was incorporated into the course to increase the amount of exposure to the data, and also enhance the autonomous learning and self-study skills of the participants.


3.4 Participants

This section presents a semester-based breakdown of the participants’ background in terms of the degree pursued, subject field, and nationality. Table 3.1 presents information on how the profiles changed through the three years in which the data were collected.

Table 3.1 Profiles: Degrees pursued, subject fields, and nationalities (2004-2007)

Degree pursued

Semester (n)    Master’s    PhD
04-051 (30)     17          13
04-052 (5)      3           2
05-061 (14)     11          3
05-062 (19)     13          6
06-071 (13)     12          1
06-072 (19)     15          4
Total (100)     71          29

Subject field

Semester    Com.& Med.    Arch.    B&E    Educ.    Engineering    A&S
04-051      17            7        3      2        1              -
04-052      -             1        1      1        1              1
05-061      -             3        1      1        9              -
05-062      -             5        3      -        11             -
06-071      1             1        6      -        5              -
06-072      -             4        7      -        8              -
Total       18            21       21     4        35             1

Nationality

Semester    North Cyprus    Turkey    Iran    Other nationalities (combined)
04-051      14              7         6       3
04-052      2               -         2       1
05-061      2               -         7       5
05-062      7               1         6       5
06-071      -               -         10      3
06-072      6               1         9       3
Total       31              9         40      20

Totals by nationality: North Cyprus 31, Turkey 9, Iran 40, Indonesia 1, Pakistan 1, Eritrea 1, Palestine 4, Nigeria 2, Sudan 1, Libya 3, Iraq 1, Kosovo 1, Zambia 1, Malawi 1, Albania 2, Macedonia 1.


It can be observed from the table that almost 30% of the total number of the participants who took the course between 2004 and 2007 were pursuing a PhD degree at the Eastern Mediterranean University. As regards subject fields, there was an increase in the number of Engineering students taking the course in the 2005-2006 Fall Semester, as the Department of Civil Engineering made EFL501 a compulsory course for their post-graduate students. A similar increase can be observed for the Faculty of Business and Economics in the 2006-2007 academic year, as the Department of Banking and Finance made the course compulsory for the post-graduate students, and the researcher started to offer the course as BANK598 for these students. In the 2006-2007 Spring semester, the Economics Department also started to offer ENGL501 as a Seminar course. Therefore, although the course offered was exactly the same, it was under three different course codes: ENGL501, BANK598, and ECON598.

Finally, regarding the nationality of the participants, of the 100 participants who took ENGL501 in these six academic semesters, 40 were from Iran, 31 from North Cyprus, 9 from Turkey, 4 from Palestine, 3 from Libya, 2 from Albania, 2 from Nigeria, 1 from Malawi, 1 from Zambia, 1 from Pakistan, 1 from Indonesia, 1 from Iraq, 1 from Macedonia, 1 from Kosovo, 1 from the Sudan, and 1 from Eritrea. Therefore, 40% of the participants were of Turkish background, while 60% were from a variety of different countries and had different mother tongues. Although during these six semesters the ENGL501 course was offered to the participants from a variety of national, first language, disciplinary and academic backgrounds, these participants exhibited and reported very similar problems in terms of the accurate and appropriate use of English in academic writing, specifically at the lexico-structural level. The 100 international participants who took the course between 2004 and 2007 were therefore regarded as a representative sample of the whole ENGL501 population.
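The percentages quoted in this section follow directly from the totals in Table 3.1. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative assumptions, and 'Turkish background' is taken, as in the text, to mean the participants from North Cyprus and Turkey together.

```python
# Totals as reported in Table 3.1 (100 participants, 2004-2007)
degree_totals = {"Master's": 71, "PhD": 29}
nationality_totals = {
    "North Cyprus": 31, "Turkey": 9, "Iran": 40, "Indonesia": 1,
    "Pakistan": 1, "Eritrea": 1, "Palestine": 4, "Nigeria": 2,
    "Sudan": 1, "Libya": 3, "Iraq": 1, "Kosovo": 1, "Zambia": 1,
    "Malawi": 1, "Albania": 2, "Macedonia": 1,
}


def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


total_participants = sum(nationality_totals.values())
phd_share = share(degree_totals["PhD"], sum(degree_totals.values()))

# Assumption taken from the text: Turkish background = North Cyprus + Turkey
turkish_background = nationality_totals["North Cyprus"] + nationality_totals["Turkey"]
turkish_share = share(turkish_background, total_participants)

print(total_participants)  # number of participants across all nationalities
print(phd_share)           # percentage pursuing a PhD
print(turkish_share)       # percentage of Turkish background
```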

3.5 Data Collection Instruments

The current study employed multiple tools to collect abundant quantitative and qualitative data for an in-depth examination of the problem identified. This section presents the tools which were used in the research study for triangulated data collection.


Interviews with course instructors

Two instructors other than the researcher were previously involved in the teaching of EFL501, Advanced Writing I (now ENGL501, Advanced Thesis Writing). These two instructors were interviewed to explore their teaching experiences with the participants, find out whether similar problems regarding the participants’ written work were observed, and elicit the instructors’ recommendations on how to further enhance the participants’ learning experiences through revisions to the course.

The advantage of interviews is that they are personalized, and “permit a level of in-depth information gathering, free response, and flexibility that cannot be obtained by other procedures” (Seliger and Shohamy, 1989, p. 166). Furthermore, the researcher has the chance to gather unpredicted data by probing for information (Seliger and Shohamy, 1989, p. 166). An interview guide listing the questions or issues to be explored during the interviews was used (Patton, 2002, p. 343). The advantage of an interview guide is ensuring that “the same basic lines of inquiry are pursued with each person interviewed” (Patton, 2002, p. 343), and providing “topics or subject areas within which the interviewer is free to explore, probe, and ask questions that will elucidate and illuminate that particular subject” (Patton, 2002, p. 343). This gives the researcher the opportunity to have a conversation on a subject area, and adopt a conversational style while focusing on a predetermined subject (Patton, 2002, p. 343). Yildirim and Simsek (1999) also list the advantages of employing an interview guide as allowing the researcher the flexibility of changing the order of the questions, going into more detail with some questions, omitting some questions that have already been answered by the interviewee, and getting systematic and comparable information from different interviewees (p. 95). Seliger and Shohamy (1989) similarly state that in semi-structured interviews, specific questions are prepared in advance, which provides a platform for the researcher to ‘branch off’ and “explore in-depth information, probing according to the way the interview proceeds, and allowing elaboration, within limits” (p. 167). The interview guide (Appendix C) in this research included twelve questions. If the interviewee had already answered a question while commenting on other issues, the question was not asked again.


Questionnaires administered to course participants

Each semester, two questionnaires were administered to the course participants. One of these was completed by the participants at the beginning of the semester, and aimed to determine needs, while the other was filled in at the end of the semester and provided feedback on the course. Some parallel questions also helped to compare the participants’ responses as regards the effectiveness of the course. As most of the items in the questionnaires were unstructured in nature, and were therefore open-ended questions, the participants’ language use also provided data on the level of the language both before and after taking the course. Best and Kahn (1986) note that the advantages of questionnaires are that the researcher has the opportunity to establish rapport, explain the aim, and explain the meaning of possibly unclear items (p. 166). Closed-form, or restricted-type, questionnaires necessitate short responses, are easy to tabulate and analyse, and are useful for certain types of information (Best and Kahn, 1986, p. 167). The open form, on the other hand, “calls for a free response in the respondent’s own words” and provides greater depth of response, encouraging the respondent to reveal the reasons for their responses (Best and Kahn, 1986, pp. 167-8).

Many questionnaires, Best and Kahn state, include both open- and closed-type items (1986, p. 168). In this research, the open type was used in both pre- and post-questionnaires, with the exception of the last closed-type section in the two questionnaires (Appendix D and Appendix E) that required the participants to mark the relevant choices. Slavin (1984) holds that “open-form questions are difficult to code and are disliked by many respondents as they take too much work” (p. 88). Yet, in cases where it is desired that the respondents express complicated opinions, closed-form questions are not suitable, and open-form questions become desirable (Slavin, 1984, p. 88).

In second language acquisition research, questionnaires are generally used to gather data on not so easily observable phenomena like attitudes, motivation, and self-concepts, as well as to collect background information about the research subjects (Seliger and Shohamy, 1989, p. 172). The questionnaires administered at the beginning and end of every semester provided the researcher with valuable data about the course participants’ backgrounds, their needs and levels, as well as their progress.


Compilation of the Learner Abstract Corpus

For this study, the abstracts for the Learner Abstract Corpus (LAC) were collated over a period of three years (six academic semesters) from 100 non-native ENGL501 participants, who took the course between the years 2004 and 2007. The individual abstracts written by the participants were first saved as Word files. After the accumulation of 100 learner abstracts, the Word files were converted into text files and stored as a databank ready to be analyzed. The reason for the conversion of the Word files into text files was that some corpus analysis programs do not support Word files, and operate on text files.
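Once abstracts are stored as plain text files, a databank of this kind can be read and sized with a few lines of code. The following Python sketch is an illustration only and not the procedure followed in this study: the function names, the file names, and the sample sentences are all hypothetical, and tokens are counted simply as whitespace-separated strings.

```python
import tempfile
from pathlib import Path


def load_corpus(folder):
    """Read every .txt file in a folder into a {file name: text} dictionary."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(folder).glob("*.txt"))
    }


def corpus_size(corpus):
    """Return (number of texts, number of whitespace-separated tokens)."""
    return len(corpus), sum(len(text.split()) for text in corpus.values())


# Demonstration with invented files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "abstract_001.txt").write_text(
        "This study investigates thesis writing.", encoding="utf-8")
    Path(folder, "abstract_002.txt").write_text(
        "The results are discussed.", encoding="utf-8")
    # A file without the .txt extension is ignored by load_corpus
    Path(folder, "abstract_003.doc").write_text("skipped", encoding="utf-8")
    demo = load_corpus(folder)
    n_texts, n_tokens = corpus_size(demo)

print(n_texts, "texts,", n_tokens, "tokens")
```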

The LAC was composed of 100 thesis abstracts. The reason for the choice of abstracts is that they are short and independent texts, so they are easy to analyze, and they contain all the information in a thesis in summary form. Abstracts, according to Swales, like other genres reporting research, seem to have an IMRD (Introduction-Method-Results-Discussion) structure (1990, p. 181). Within this structure, almost all abstracts appear to have common moves, which are the scope of research, problem statement, aim of research, methodology, main results and conclusions. Thus, abstracts are a sort of miniature of the research study and reflect the main sections of the thesis: Scope of research-problem-aim-method-results-conclusions.



Compilation of the Target Abstract Corpus

Initially, a smaller pilot corpus was compiled for purposes of experimentation, and to get an insight into the applicability and practicality of the tools. This corpus was made up of abstracts from five sub-fields, namely 36 communications abstracts from Canadian universities, 32 archaeology abstracts from Australian universities, 40 psychology abstracts from New Zealand universities, 41 information technology abstracts from the United Kingdom and 40 economics abstracts from America, a total of 189 abstracts from countries where English is the mother tongue. The pilot corpus was invaluable in terms of the familiarization and experimentation with the corpus analysis tools.

In the process of the compilation of the target corpus, three important criteria in corpus design were taken into consideration: size, content, and balance / representativeness (Hunston, 2002). As computer technology is developing day by day, it is now possible to store and manipulate very large amounts of language data. At present, very large corpora are being used, such as the Bank of English which is 400 million words, and the British National Corpus which is 100 million words (Hunston, 2002, p. 25). However, “the exact amount of language required, of course, depends on the purpose and the use of the research” (Coxhead, 2000, p. 216). Corpora of only a few thousand words are also used when they are designed for a specific research study (Hunston, 2002, p. 26) and for a specific purpose. Kennedy and Thorp (1999) report a research study concerning the compilation of a corpus of answers to the Academic Writing Task 2 of the UCLES IELTS Examination, where the aim is reported to be the investigation of the linguistic nature of the answers at three levels (8 - expert, 6 - competent and 4 - limited). The compiled corpus is referred to as a ‘large’ corpus of 130 scripts (36,000 words). This research study is an example of how smaller corpora can serve a specific purpose.

The second criterion, content, is related to what will go in the corpus, and this is determined by “what the corpus is going to be used for but also about what is available” (Hunston, 2002, p. 26). When the purpose of corpus development is to create representation of a particular type of language, representativeness and balance are of utmost significance. This necessitates breaking up the representative whole into component parts and seeking to incorporate equal amounts of data from each part (Hunston, 2002, p. 28).

The present target corpus was compiled in an electronic format from the World Wide Web and is approximately 174,000 words. In order to qualify for the corpus, the abstracts primarily had to be from universities in countries where English is the native language. Another important criterion was that the abstracts had to be thesis and dissertation abstracts, since abstracts written for journal articles and conference presentations tend to differ. Therefore, journal article and conference presentation abstracts were disqualified and did not appear in the corpus so as not to distort representativeness.

Yet another criterion was that the abstracts had to be written at Master’s and PhD levels at educational institutions. This criterion naturally excluded abstracts written at Bachelor’s level. One advantage of abstracts is that they do not exhibit significant differences in terms of length. This property contributes to balance within the corpus. To ensure representativeness of the corpus in accordance with the purpose of the study and corpus design in general, the corpus covers four main disciplines: Arts and Humanities, Social Sciences, Sciences and Architecture / Urban and Regional Planning. Each discipline, making up a sub-corpus, includes 150 representative texts, in this case abstracts.

In the Arts and Humanities sub-corpus, 5 abstracts come from Anthropology, 30 from Archaeology, 19 from Art History, 7 from History, 40 from Language / Literature / Linguistics, 10 from Philosophy / Religion / Theology, 21 from Psychology, 5 from Music and 13 from Sociology. Of the 150 abstracts in the Social Sciences sub-corpus, 20 are from Business Administration, 17 from Communications, 3 from Demography, 20 from Economics, 20 from Education, 5 from Geography, 38 from Information Technology, 12 from Accounting and 15 from Political Science. The Sciences sub-corpus is composed of 3 abstracts from Algebra, 13 from Biological Sciences, 10 from Chemistry, 11 from Computer Science, 92 from Engineering, 12 from Mathematics and 10 from Physics. The Architecture sub-corpus includes abstracts from the fields of town and regional planning, landscape architecture, interior architecture as well as architecture.

As the abstracts were taken from the World Wide Web, HTML entities distorted the data, especially in subjects where symbols and equations are abundantly used. All the abstracts had to be examined both manually and with the help of software to discard the HTML entities before the corpus was put through any analysis. Also, there is always the risk of the same abstract being published on multiple web sites. To avoid this risk, all the abstract titles and the names of the universities the abstracts came from were recorded during the compilation of the target corpus (Appendix F).
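The two safeguards described above, removing HTML residue and checking for repeated abstracts, can be sketched in Python as follows. This is a hedged illustration and not the software used in the study: the function names and sample records are hypothetical, tags are removed with a simple regular expression, entities are decoded with the standard library function html.unescape (the study itself discarded them), and two records are treated as duplicates when their title and university match after case and surrounding spaces are ignored.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def clean_abstract(raw):
    """Remove HTML tags, decode entities, and collapse runs of whitespace."""
    text = TAG_RE.sub(" ", raw)   # strip tags first, so decoded '<' survives
    text = html.unescape(text)    # e.g. &amp; -> &, &alpha; -> α
    return re.sub(r"\s+", " ", text).strip()


def find_duplicates(records):
    """Return the (title, university) records whose key was already seen."""
    seen = set()
    duplicates = []
    for title, university in records:
        key = (title.casefold().strip(), university.casefold().strip())
        if key in seen:
            duplicates.append((title, university))
        else:
            seen.add(key)
    return duplicates


# Invented examples, for illustration only
raw = "<p>The model assumes &alpha; &gt; 0.5 &amp; p &lt; 0.05.</p>"
cleaned = clean_abstract(raw)
print(cleaned)

records = [
    ("A Study of X", "University A"),
    ("Another Study", "University B"),
    ("a study of x ", "University A"),
]
repeated = find_duplicates(records)
print(repeated)
```

Automatic cleaning of this kind would still need to be followed by manual inspection, since symbols and equations can be damaged in ways that a simple pattern does not detect.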


3.6 Data Collection Procedures

The proposal to carry out the present research was accepted by the EMU Institute of Graduate Studies and Research (IGSR) in December 2004, and the data collection process was initiated immediately. This process took longer than anticipated, and lasted about three years in order to be able to gather sufficient data, in the form of abstracts, for the compilation of the Learner Abstract Corpus. Therefore, the collection process which began with the compilation of abstracts in February 2005 ended when 100 abstracts had been accumulated by June 2007. Accordingly, the analysis of the LAC and the cross-reference of the LAC with the TAC was delayed until August 2007 beyond the researcher’s control, as the number of the participants taking the course each semester was unpredictable and irregular. As the LAC was being gradually compiled into a databank, in the interim, the interviews were held with the course instructors, and the pre- and post-course questionnaires were designed and administered to the course participants.

After appointments were made with the instructors beyond working hours so that they could freely answer the questions without time and work pressures, the interviews were held through the Hotmail messenger chat program and in an informal manner. This would provide the interviewees with the time they needed to contemplate the question and express their opinions without the fear of being misheard, misunderstood, or misinterpreted. Yildirim and Simsek (1999, p. 92) state that a good interview should be free of the mistakes that occur in everyday conversation, such as not listening attentively and being biased. One of the reasons why a chat program was used for the interviews was to avoid any misinterpretation of the data as well as possible bias. Instead, the chat program ensured that the interviewees expressed their feelings and opinions themselves in print. Although there were twelve questions in the interview guide so that the interviewer could focus on the predetermined topics, there was no attempt to follow the exact order of the questions.

The questionnaires were completed by the course participants, at the beginning of every semester, mostly during the first class hour. The participants’ responses provided the researcher with data on their backgrounds, and their needs as perceived by themselves. This information about the participants enabled the researcher to make revisions to the course based on the backgrounds, and needs of the new group of participants. Additionally, at the end of each semester, generally during the last class hour, the unstructured questionnaires including open-ended questions were administered by the researcher to collect in-depth, qualitative feedback on various aspects of the course. The two questionnaires administered at the beginning and the end of the semester were designed in such a way as to enable the researcher to compare and cross-examine their respective items.

As mentioned earlier, the compilation of the Learner Abstract Corpus took three years. Some of these learner abstracts were handwritten, and therefore typing 100 abstracts up for corpus compilation also took time. The Target Abstract Corpus had to be very carefully compiled due to considerations related to size, content, and representativeness. A further issue was discarding the HTML entities in the data collected from the World Wide Web. Once the data were collected from the interviews and the questionnaires, and the corpora compiled, the analysis process could be initiated.


3.7 Methods for Data Analysis

“Action research studies provide teacher researchers with data that can be used formatively and summatively. That is, much of the qualitative data collected during the study can be used to positively affect teaching throughout the study” (Mills, 2000, p. 97). In this action research study, the results of the data analysis were constantly employed in the development of the pedagogic corpus, and accordingly, the components of the pedagogic corpus were simultaneously revised in line with the findings.

Seliger and Shohamy define data analysis as “sifting, organizing, summarizing, and synthesizing the data so as to arrive at the results and conclusions of the research” (1989, p. 201). Hitchcock and Hughes describe it as “the ways in which the researcher moves from a description of what is the case to an explanation of why what is the case is the case” (1989, p. 295).

They hold that:

analysis involves discovering and deriving patterns in the data, looking for general orientations in the data and, in short, trying to sort out what the data are about, why and what kinds of things can be said about them. (Hitchcock and Hughes, 1989, p. 295)

Therefore, data analysis is of extreme significance as it “becomes the product of all the considerations involved in the design and planning of the research” (Seliger and Shohamy, 1989, p. 201). Yildirim and Simsek suggest that the process of data analysis is not simple even for researchers with experience. The reason for this is that every research study is unique, and the methods of analysis in one cannot be precisely replicated in another study. Thus, each study requires new approaches to the issue of data analysis. The researcher is expected to draw a plan for the unique research study through the exploration of existing analysis methods, and determine the ones relevant to the research design as well as to the nature of the collected data (1999, p. 156).

In the same way as data collection, in data analysis too “alternatives can be mixed to create eclectic designs, like customizing an architectural plan to tastefully integrate modern, postmodern, and traditional elements, …” (Patton, 2002, p. 248). Seliger and Shohamy share the same opinion on the issue. They emphasize that as with data collection, a variety of techniques are possible for analyzing data. They further state that “the selection of a data analysis technique will depend mainly on the nature of the research problem, the design chosen to investigate it, and the type of data collected” (1989, p. 201). Hence, the value of data analysis depends on the extent of its valid relationship with the other components of the research (Seliger and Shohamy, 1989, p. 201).

In this study, multiple data collection instruments produced diverse data. The research problem, the research design and the collected data therefore necessitated not only diverse data analysis instruments, but also correspondingly varied procedures. The following three sections present the analysis procedures for the interviews, the questionnaires, and the two corpora.


Analysis Procedure for Interviews

The interviews with the instructors who had previously taught ENGL501 were conducted through MSN chat, for the reasons stated earlier in this chapter. Therefore, there was no need to transcribe the interviews, as they were already in written form. During the interviews, the interview guide was used to elicit data on the predetermined topics. As two instructors were interviewed, the use of the guide enabled the interview process and the resulting data to be ‘systematic and comparable’ (Yildirim and Simsek, 1999, p. 95).

In qualitative data analysis, the most important considerations are describing the data, and ensuring the emergence of themes (Yildirim and Simsek, 1999, p. 158). The qualitative data collected through the interviews were analyzed by utilizing ‘content analysis’ to be able to categorize and thematize. The main aim of content analysis, according to Yildirim and Simsek, is to access the relationships and themes that can explain the collected data (1999, p. 162). Therefore, what is being done through content analysis is to bring together similar data under certain concepts and themes, and then organize and interpret the data in a manner the reader can understand (Yildirim and Simsek, 1999, p. 162). Content analysis, Patton holds, “is used to refer to any qualitative data reduction and sense-making effort that takes a volume of qualitative material and attempts to identify core consistencies and meanings” (2002, p. 453). In that sense, it requires inductive analysis which “involves discovering patterns, themes, and categories in one’s data”, since “findings emerge out of the data, through the analyst’s interactions with the data” (Patton, 2002, p. 453).

The twelve questions in the interview guide (Appendix C) allowed for the two interviewees to reflect on the experiences they had with the ENGL501 participants during the time they taught the course. The abundant qualitative data collected therefore required reduction and categorization. As Yildirim and Simsek recommend (1999, p. 163), first the researcher tried to identify the common patterns in the data by recognizing key words. Then, the relationship among these common patterns was explored, and these were classified under four major themes, namely ‘student profiles’, ‘problems faced by the participants’, ‘the value and usefulness of the existing course’, and ‘suggestions for the improvement of the course’.


Analysis Procedure for Questionnaires

As mentioned earlier, the questionnaires administered to the participants made use of mostly open-type questions, with the exception of the last closed-type section in the pre-course questionnaire which required the participants to mark the relevant boxes (Appendix D). In the post-course questionnaire, most of the questions were again open-type, with some closed-type questions that required quantitative analysis. Therefore, on the whole, the questionnaires produced rich qualitative data, hence necessitating the utilization of ‘content analysis’ to reduce the data to a manageable size, identify patterns, and categorize them into meaningful themes.

The pre-course questionnaire included eight open-ended questions, and the post-course questionnaire nine. Parallel questions were included to enable comparison. As the questionnaires were administered to 100 participants, the volume of qualitative data generated was very large. Therefore, the data were first tabulated, and the common patterns identified. The categorization of these patterns produced three themes that were also suitable for a comparison with the interview results. The emerging themes were ‘the participants’ specific problems in producing written work in English’, ‘the effectiveness of the course’, and ‘the participants’ suggestions for the improvement of the course’. Yildirim and Simsek (1999, p. 176) note that quotations are used in describing, exemplifying, and interpreting data in content analysis. In this study, participants’ significant contributions were used as quotations during the description, exemplification, and interpretation of the data in the data analysis chapter.


Corpus Analysis Methodology

Corpus analysis incorporates both quantitative and qualitative data analysis techniques. The corpus-based approach is empirical, “analyzing the actual patterns of use in natural texts”, and relies on quantitative as well as qualitative analytical techniques (Biber et al., 1998, p. 4). Biber et al. point out that one important consideration for corpus-based approaches is that the analyses should go beyond simple counts of linguistic features, and include qualitative, functional interpretations of quantitative patterns (Biber et al., 1998, pp. 4-5). Hunston similarly draws attention to the fact that a corpus provides evidence in the form of many examples which can only be interpreted by intuition (2002, p. 23). Intuition is very important, according to Hunston (2002, p. 2), in “extrapolating important generalizations from a mass of specific information in a corpus” (p. 22).

This study fits into the corpus-informed approach defined by McCarthy, as the corpora were designed and built with applied linguistic questions in mind, their output filtered and organized, and employed as a tool for what the teacher researcher wanted to achieve. As an example of this third approach, he cites the extraction of lexico-grammatical information from a corpus (McCarthy, 2001, p. 138). Accordingly, it is relevant to define this study as a ‘corpus-informed’ study, as the research necessitated the identification of the lexico-grammatical patterns used for fulfilling the sub-moves and moves in thesis writing, to be exploited in the construction of a comprehensive pedagogic corpus.

Like all data sources, a corpus cannot accomplish anything unless it is analyzed. Corpus access software can re-arrange corpora; it allows observations of various kinds to be made and provides a new perspective on language (Hunston, 2002, p. 3). Corpus access software packages, which are employed in corpus analysis, manage data in three ways: in terms of frequency, phraseology, and collocation (Hunston, 2002, p. 3). It is possible to arrange words in a corpus in terms of their frequency, and to compare the frequency lists of different corpora with the aim of identifying differences that can later be studied in greater detail (Hunston, 2002, pp. 3-5). This, according to Hunston, is especially valuable in the comparison of specialized corpora, where generally a smaller, specialized corpus is compared with a larger, more general one so that the keywords, “words which are significantly more frequent in one corpus than another”, are identified (Hunston, 2002, pp. 67-68). One example given is a study by Sutarsyah et al. in which a corpus of Economics texts is compared with a corpus of general academic English to identify the keywords in the Economics corpus. The study revealed that some words, such as price, cost, demand, firm, and supply, occurred more frequently in the Economics corpus (1994, cited in Hunston, 2002, p. 68). Frequency counts can also give valuable information on the frequencies of lexical and grammar words in different corpora (Hunston, 2002, p. 3) as well as frequencies of categories of linguistic items (e.g. present and past tenses) across registers (Hunston, 2002, p. 8).
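The keyword comparison described above can be illustrated with a minimal Python sketch. It is offered as an illustration only: the log-likelihood keyness score is an assumption (the sources cited here do not specify a statistic), the function names tokenize, log_likelihood and keywords are hypothetical, and the tokenizer is a deliberately simple regular expression rather than the method of any software package discussed in this chapter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def log_likelihood(a, b, c, d):
    """Keyness of a word occurring a times in a corpus of c tokens
    and b times in a corpus of d tokens (assumed statistic)."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the first corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the second corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_text, reference_text, top_n=5):
    """Rank words that are relatively more frequent in the study corpus."""
    study = Counter(tokenize(study_text))
    ref = Counter(tokenize(reference_text))
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # keep only words over-represented in the study corpus
            scored.append((word, round(log_likelihood(a, b, c, d), 2)))
    # Highest score first; ties are broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_n]
```

With a specialized text as the first argument and a more general text as the second, keywords returns the words that are candidates for closer qualitative study; very small samples, however, will not yield statistically meaningful scores.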

Phraseology can be observed through concordance lines, which are lines that “bring together many instances of use of a word or phrase, allowing the user to observe regularities in use that tend to remain unobserved when the same words or phrases are met in their normal contexts” (Hunston, 2002, p. 9). Through phraseology, differences between easily confused words can be observed through the wealth of evidence provided by concordance lines (Hunston, 2002, p. 12). Concordance lines can be presented alphabetically or in groups selected and arranged to show a particular language behaviour (Hunston, 2002, p. 38). Therefore, it can be said that concordance lines can be used to observe how words behave (Hunston, 2002, pp. 39-41).
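The way concordance lines are assembled can be shown with a short Python sketch. It is a simplified illustration under stated assumptions: the function names kwic and show are hypothetical, the context is measured in characters rather than words, and the lines are sorted alphabetically by the co-text to the right of the node word, which is only one of several possible orderings.

```python
import re


def kwic(text, node, width=30):
    """Collect (left, node, right) triples for every occurrence of the node word."""
    text = " ".join(text.split())  # collapse line breaks so each context is one line
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    # Sort alphabetically by the co-text to the right of the node word.
    return sorted(hits, key=lambda hit: hit[2].lower())


def show(hits, width=30):
    """Print the triples with the node word aligned in one column."""
    for left, node, right in hits:
        print(f"{left:>{width}} {node} {right}")
```

Calling show(kwic(text, "study")) on any plain-text sample prints every occurrence of the word with its immediate co-text aligned, which is the layout that makes regularities in use visible.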

Through concordance lines, it is possible to observe the ‘central and typical’, meaning distinctions, meaning and pattern, and detail (Hunston, 2002, p. 42). Although a corpus cannot be used to establish what is impossible or possible in a language, it provides information about ‘central and typical usage’. Typicality involves “the most frequent meanings or collocates or phraseology of an individual word or phrase” (Hunston, 2002, p. 42). Centrality, on the other hand, “can be applied to categories of things rather than to individual words” (p. 43). For example, although present progressive can be used for the present, the future, or no specific time, the central use is for the present time. A corpus serves an important purpose here as the prototypical (what is felt to be typical) may not be the most frequent (Hunston, 2002, pp. 43-44).

Corpus exploration, by allowing comparative analysis, makes it possible to observe differences in meaning and use of near-synonyms, something which cannot be achieved through dictionaries as they handle words separately (Hunston, 2002, p. 45). The meanings of words are closely related to their co-text and are “distinguished by the patterns and phraseologies in which they typically occur” (Hunston, 2002, p. 46). Concordance lines can be divided into sets, each of which exemplifies one meaning of a word. Hunston holds that “meaning and phraseology are indistinguishable, and the concordance lines show both” (p. 47). One important problem with concordance lines is that they present information but do not interpret it, and “interpretation requires the insight and intuition of the observer” (p. 65).

The data in corpora can also be managed through the calculation of collocation, “the statistical tendency of words to co-occur” (Hunston, 2002, p. 12) or “the tendency of one word to attract another” (Hunston, 2002, p. 68). According to Hunston, collocation “can indicate pairs of lexical items, … or the association between a lexical word and its frequent grammatical environment”, the latter frequently called ‘colligation’ (p. 12). Sinclair, on the other hand, defines collocation as “the occurrence of two or more words within a short space of each other in a text”, and describes the ‘short space’ as “a maximum of four words intervening” (1991, p. 170).
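The span-based view of collocation can be made concrete with a minimal Python sketch. The function name collocates is hypothetical, and the sketch reports raw co-occurrence counts only; it does not apply any statistical measure of association, so it should be read as an illustration of the counting window rather than as the calculation performed by any particular software package.

```python
from collections import Counter


def collocates(tokens, node, span=4):
    """Count the words occurring within `span` tokens to the left and right of the node."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts
```

For a tokenized text, collocates(tokens, "study").most_common(10) lists the ten words that most often fall inside the four-word window on either side of the node word.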

In a corpus, collocation is calculated by software through taking a node word, “the selected word appearing in the centre of the screen”, and counting the words occurring within a particular span of the node word (Hunston, 2002, p. 69). Similar to Sinclair, Hunston also defines this span as four words to the left and four words to the right of the node word (p. 69). Collocational information can be useful in summarizing the information found in the concordance lines, in highlighting the different meanings of a word, therefore providing a semantic profile of the word, and in obtaining a profile of the semantic field of a word. An example of the latter is that words related to money, such as dollar, money, tax, and pound, can be categorized as a semantic field (Hunston, 2002, pp. 75-79). What a simple list of collocations cannot show, however, is the association between meaning and phraseology, which can only be obtained through the concordance lines (Hunston, 2002, pp. 76-77). In this study, in line with Hunston’s (2002) observations on corpus analysis, the corpus data are managed in all three ways: in terms of frequency, phraseology, and collocation.


Corpus Analysis Tools

A variety of tools were utilized for corpus analysis in this study. Some of these tools are computer-based, while some are web-based. The aim of this section, therefore, is to introduce all the tools employed in this study. These tools can be listed as follows: RANGE, FREQUENCY, Concordance, AntConc, and the British National Corpus interface designed by Mark Davies of Brigham Young University in Utah, the United States.

RANGE and FREQUENCY are both computer-based text analysis tools. They were programmed by Alex Heatley, and designed by Paul Nation and Averil Coxhead of the School of Linguistics and Applied Language Studies, Victoria University, New Zealand. RANGE is computer software used to compare the vocabulary of texts. This software, which calculates the range of vocabulary in texts, provides a range or distribution figure (how many texts the word occurs in), a headword frequency figure (the total number of times the actual headword type appears in all the texts), a family frequency figure (the total number of times the word and its family members occur in all the texts), and a frequency figure for each of the texts the word occurs in. It can be used to find the coverage of a text by certain word lists, to create word lists based on frequency and range, and to discover shared and unique vocabulary in several pieces of writing. The program is free for everyone to use and can be downloaded; it can operate on 32 different texts simultaneously.
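The range and frequency figures described above can be approximated with a short Python sketch. This is not the RANGE program itself: the function name range_and_frequency is hypothetical, the tokenizer is a simple regular expression, and the family frequency figure is omitted because it would require a list of word families.

```python
import re
from collections import Counter


def range_and_frequency(texts):
    """Map each word to (range, total frequency, per-text frequencies).

    `texts` is a dict of text name -> text content.
    """
    per_text = {
        name: Counter(re.findall(r"[a-z]+", content.lower()))
        for name, content in texts.items()
    }
    vocabulary = set().union(*per_text.values())
    table = {}
    for word in vocabulary:
        freqs = {name: counts[word] for name, counts in per_text.items()}
        word_range = sum(1 for f in freqs.values() if f > 0)  # number of texts containing the word
        table[word] = (word_range, sum(freqs.values()), freqs)
    return table
```

Words whose range equals the number of texts supplied are those shared by all of them, while words with a range of one are unique to a single text.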

The program can be used for multiple purposes. Firstly, it can be used to calculate the coverage of a text by using wordlists such as the GSL (The General Service List) and the AWL (The Academic Word List). It can also be employed to create one's own wordlists based on range and frequency. Most importantly, it can be utilized to find out common and distinctive vocabulary items in several pieces of writing. The program is accompanied by three ready-made base lists. The first (BASEWRD1.txt) includes the most frequent 1000 words of English and the second (BASEWRD2.txt) includes the second 1000 most frequent words. The first 2000 words come from A General Service List of English Words by Michael West (Longman, London 1953). The third (BASEWRD3.txt) includes 570 word families from The Academic Word List by Coxhead (2000). All of these base lists include the base forms of words and their derived forms. For instance, the headword aid has the following family members: aided, aiding, aids, and unaided.

As mentioned earlier, the program can be used to create one's own wordlists based on range and frequency. Recently, Billuroglu and Neufeld (2005) have created their own BNL (Billuroglu-Neufeld List) wordlist, which is also available as a word list of the 2,709 most common words in English that can be used with RANGE.
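The coverage calculation mentioned above can be sketched in Python as follows. The sketch rests on several assumptions: the function name coverage is hypothetical, each base list is represented as a plain set of word forms (the real base lists group forms into families), and a word is credited to the first list in which it is found.

```python
import re


def coverage(text, base_lists):
    """Return the percentage of running words covered by each base list.

    `base_lists` is a dict of list name -> set of word forms, checked in order.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {name: 0 for name in base_lists}
    counts["not in lists"] = 0
    for token in tokens:
        for name, words in base_lists.items():
            if token in words:
                counts[name] += 1
                break
        else:
            counts["not in lists"] += 1  # the word appears in none of the lists
    total = len(tokens) or 1  # avoid division by zero for an empty text
    return {name: round(100 * n / total, 1) for name, n in counts.items()}
```

Supplying the word forms of a general list and an academic list as the two sets shows what proportion of a text each list accounts for and what proportion falls outside both.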

FREQUENCY is another program that runs on a text file (.txt) to make a frequency list of all the words in a single text. Unlike RANGE, it can only process one text at a time. The output is either an alphabetical list or a frequency-ordered list. It gives the rank order of the words, their raw frequency and the cumulative percentage frequency.
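A frequency-ordered list with rank, raw frequency and cumulative percentage can be produced along the following lines. This is a minimal sketch, not the FREQUENCY program: the function name frequency_list is hypothetical, and ties in frequency are broken alphabetically as an arbitrary choice.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return rows of (rank, word, raw frequency, cumulative percentage)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows, running = [], 0
    for rank, (word, freq) in enumerate(ordered, start=1):
        running += freq  # running total of tokens accounted for so far
        rows.append((rank, word, freq, round(100 * running / total, 1)))
    return rows
```

The cumulative percentage column shows how much of a text is accounted for by its most frequent words, which is the figure used when the coverage of individual words is examined.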

Concordance, designed by Rob Watt, University of Dundee, Scotland, is a copyrighted computer-based program that can be used for evaluation purposes for 30 days; for longer use, a registration fee is paid to the author. The program is employed for the close study of texts: each word can be seen in its context and also located in the source text. Since words are given in and with their contexts, all the usages of any word in a text or body of writing can be compared, and insight into meaning and usage gained. Using this program, wordlists and word frequency lists can be created; full concordances can be made for texts of any size; collocation counts for each word, up to four words left and right, can be observed; the concordance of each word in the source text can be seen by clicking on the word; a wordlist can be lemmatised by grouping chosen words together; and web concordances can be made and published on the web. A major advantage of web concordances is that they are available to many users at the same time, which makes them ideal for class-based activity and student-centred learning. Users can locate every word in the source text and can see how it is used within its context.

AntConc 3.2.1 was developed in 2007 by Lawrence Anthony from the School of Science and Engineering of Waseda University in Tokyo, Japan. The earlier versions of this computer-based software started out as a simple concordancing program, but slowly progressed to become a useful text analysis tool. The program can run under any Windows environment including Win 98/Me/2000/NT and XP, and also on Macintosh OSX and Linux computers. AntConc 3.2.1 includes multiple text analysis tools, but only the ones utilized in this study are presented. The tools of AntConc 3.2.1 used in this study are as follows: ‘Concordance’, ‘File View’, ‘Clusters’, and ‘Collocates’. The Concordance tool generates concordance (or KWIC: key word in context) lines from one or more target texts chosen by the user. At any time, a target file can be viewed in its original form using the File View tool. The Clusters tool is used to generate an ordered list of clusters that appear around a search term in the target files listed in the left frame of the main window. The clusters can be ordered either by frequency or by the start or end of the word. A user can also select the minimum and maximum length (number of words) of each cluster, and the minimum frequency of clusters displayed. It is also possible to select whether the search term always appears on the left or right of the cluster.
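The idea behind a cluster search can be illustrated with a brief Python sketch. It does not reproduce AntConc: the function name clusters and its parameters (min_len, max_len, min_freq, term_on) are hypothetical stand-ins for the options described above, and the results are ordered by frequency only.

```python
from collections import Counter


def clusters(tokens, term, min_len=2, max_len=3, min_freq=1, term_on="left"):
    """List word sequences that begin (term_on="left") or end (term_on="right")
    with the search term, ordered by descending frequency."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != term:
            continue
        for n in range(min_len, max_len + 1):
            if term_on == "left":
                gram = tokens[i:i + n]
            else:
                gram = tokens[max(0, i - n + 1):i + 1]
            if len(gram) == n:  # skip sequences cut short by the text boundary
                counts[" ".join(gram)] += 1
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(cluster, freq) for cluster, freq in ordered if freq >= min_freq]
```

Raising min_freq filters out one-off sequences, so that only recurrent patterns around the search term remain for qualitative inspection.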

The Collocates tool is used to generate an ordered list of collocates that appear near a search term in the target files listed in the left frame of the main window. The collocates can be ordered either by frequency, frequency on the left or right of the search term, or the start or end of the word. A user can also select the span of words to the left and right of the search term in which to find collocates, and the minimum frequency of collocates displayed (READMEantconc3.2.1.txt).

The British National Corpus interface developed by Mark Davies is a web-based tool that makes use of the 100-million-word BNC. Words, phrases, lemmas (all forms of words, like sing or tall), wildcards (un*ly or r?n*), and more complex searches such as un-X-ed adjectives or verb + any word + a form of ground can be extracted from this tagged corpus. There are six macro registers on the interface, and the frequency of a word or phrase can be observed in all registers. Collocates can also be obtained through the interface. In addition, the use of a word can be compared across registers. For instance, it is possible to obtain information on the words and phrases which occur much more frequently in one register than another, such as -ness words in poetry, adjectives in tabloid newspapers, nouns in advertisements, or verbs in the slot ‘we * that’ in academic writing. Semantically-oriented searches can also be carried out. For instance, the most frequent nouns used with ‘small’ and ‘little’ can be compared. A very useful facility is finding the frequency and distribution of synonyms of a word, and comparing the frequency of synonyms in different registers. For example, the synonyms of ‘strong’ can be compared in the ‘academic’ and ‘news’ registers (Davies, 2004).


Corpus Analysis Procedures

The compiled corpora were thoroughly analyzed by utilizing the diverse tools presented in the previous section. First, the two corpora were separately put through the Frequency program to explore the most frequent words, as well as the cumulative percentage of individual words in the coverage of the corpora. Then, the corpora were independently put through the Range program for three fundamental reasons: firstly, to examine the range of occurrence of individual words in the four sub-corpora in each corpus and explore the ones that were present in all the sub-corpora; secondly, to determine whether the most frequent words in such academic corpora actually came from the AWL, which claims to cover 85% of any academic text when used with the GSL; and finally, to test the validity of the claim made by Billuroglu and Neufeld (2005), that separating the most frequent words as GSL and AWL is problematic, as words acquire different meanings and have different uses based on their contexts. In the next stage, the two corpora were separately analyzed for collocations and clusters through the simultaneous use of Concordance and AntConc 3.2.1. At this stage, the aim was to explore the lexico-grammatical patterns used in the two corpora.

When the data pertaining to the LAC and the TAC were compared in terms of frequency, range, phraseology, and collocation, the anticipated deficiencies in the LAC were further revealed. Therefore, through the use of Range and Frequency, the 165 most frequent word families were first extracted from the TAC. These 165 word families later acted as key words for the qualitative analysis of the TAC, when Concordance and AntConc 3.2.1 were used to extract the alternative lexico-grammatical patterns employed in the fulfillment of sub-moves and moves, specifically in abstracts and in theses in general. This exercise relied heavily on intuition, and employed the functional dimension of linguistic features as a basis. It also made use of the BNC interface to determine synonyms so that alternative lexico-grammatical patterns to achieve the same functions, sub-moves, and moves could be extracted from the corpus. Biber defines functions as “communicative purposes served by particular linguistic features in texts” (1988, p. 25), and refers to corpus analysis of a range of texts in the following way:

Strong co-occurrence patterns of linguistic features mark underlying functional dimensions. Features do not randomly co-occur in texts. If certain features consistently co-occur, then it is reasonable to look for an underlying functional influence that encourages their use. (p. 13)

3.8 Moodle - a Virtual Learning Environment

Corpora and data-driven learning tasks necessitate a platform on which they can be placed. The current study used Moodle as a virtual learning, or e-learning, platform. A virtual learning environment (VLE) can be described as “a collection of integrated tools enabling the management of online learning, providing a delivery mechanism, student tracking, assessment and access to resources”. It is sometimes also referred to as a ‘Managed Learning Environment’ (MLE), a ‘Course Management System’ (CMS), or a ‘Learning Management System’ (LMS) (IADT, 2007). These environments are conducive to turning teaching and learning into an active process that resembles real life. Through VLEs, the opportunities for collaboration and communication between the teacher and students, as well as among students, are maximized, allowing them to engage with the course more actively at a time and place convenient to them (IADT, 2007). A VLE not only makes students active, but also ‘actors’, since they are “members and contributors of the social and information place” (Dillenbourg et al., 2002, p. 5). Probably the most important advantage of virtual learning environments is that they can combine heterogeneous technologies and multiple pedagogical approaches. A variety of tools are used to support functions such as information, communication, collaboration, learning and management. Through the availability of various tools, the teacher has the opportunity to decide which type of interaction is suitable for which instructional and learning objective (Dillenbourg et al., 2002, p. 6).

Moodle is an open source, as opposed to commercial, VLE package developed by Martin Dougiamas (Robb, 2004, p. 1). Moodle stands for Modular Object-Oriented Dynamic Learning Environment (Liao, 2007, p. 1), and it is based on strong pedagogic principles, namely ‘constructionism’, ‘constructivism’, and ‘social constructivism’ (Philosophy, 2008). Constructionism means that learning becomes effective when people construct something for others to experience; in other words, we have a better understanding of things when we try to explain them to others. A constructivist view asserts that people create new knowledge as they interact with their environments. Social constructivism, on the other hand, “extends constructivism into social settings, wherein groups construct knowledge for one another, collaboratively creating a small culture of shared artifacts with shared meanings” (Philosophy, 2008). In an online course, “the activities and texts produced within the group as a whole will help shape how each person behaves within that group” (Philosophy, 2008). In short, with Moodle, the job of the teacher can change from being ‘the source of knowledge’ to being an influence and a role model for class culture. Moodle enables the teacher to connect with students in a personal way, to address their learning needs, and also to moderate discussions and activities in such a way as to lead students towards the goals collectively (Philosophy, 2008).


Robb (2004) and Eldridge and Neufeld (2007) emphasize the variety of Moodle modules. Robb states that “there are more features available than any one user is likely to use” (2004, p. 1), and Eldridge and Neufeld (2007) stress the various options and choices available with Moodle. The mode of input is only one of the examples they give. They highlight the fact that input can be delivered in the form of a ‘lesson’, or a ‘workshop’, through ‘locally designed materials’, or ‘links to the World Wide Web’ (p. 21). Eldridge and Neufeld (2007), however, emphasize that the revolutionary nature of Moodle and other virtual learning environments depends more on what teachers do with them using their creativity within solid pedagogical principles, than the technologies themselves (p. 25).

3.9 Limitations and Delimitations of the Study

As in all research studies, certain factors posed limitations in this study. The compilation of the corpus may have been a limitation as there might be an element of chance in text collection (Hunston, 2002, p. 2). Therefore, the nature of the compiled corpora determined the lexis, patterns and structures extracted from the corpora. Another limitation is related to the nature of corpus work in general. Corpus work reveals data about frequency, not about possibility. Descriptions of English are considered to be shifting towards the typical and away from ‘notions of well-formedness’ (Sinclair and Biber cited in Hunston, 2002). The fact that something is commonly used cannot be evidence that it is acceptable. Therefore, the data obtained from the corpora can only be referred to as ‘typical’, but not necessarily ‘well-formed’.


A further limitation is related to the pace of corpus studies. Corpus compilation, and corpus analysis in particular, proceed slowly. Hoey emphasizes that “corpus-based work is slow and often painful … It can sometimes take half a day to complete an analysis that will produce a single sentence or indeed a single cell in a table” (2005, p. xii). Indeed, this study was limited by a lack of uninterrupted time, and the pace was therefore relatively slow.

Corpus work is criticized for presenting “language out of its context” (Hunston, 2002, p. 23). To deal with this limitation, there is the “need for a corpus to be one tool among many in the study of language” (Hunston, 2002, p. 23). Mishan offers another solution to the problem. She suggests that instead of looking to the authenticity of the source text, i.e., the corpus, the aim should be “its authentication by the learner, which arises out of the involvement of the learner with the material, via the task” (2004, p. 219). This study was also limited by the issue of decontextualized language, and hence used multiple tools for both data collection and analysis, and also incorporated data-driven learning (DDL) activities to overcome this limitation of corpus studies.

However, the study also has some delimitations. The study made use of two corpora, which widened the range of lexico-structural patterns extracted. The use of comprehensive data involving two corpora, and the analysis of these corpora using a variety of computer-based and web-based tools, had a positive impact on both the findings and the outcomes. The use of abstracts in the compilation of the two corpora is also a delimitation, as abstracts are short texts that can be used as whole texts. The use of short and whole texts is an advantage in corpus design. If long texts are used in a corpus, “the peculiarities of an individual style or topic may occasionally show through into the generalities” (Sinclair, 1991, p. 19).

Another delimitation is the truly international participant population. Graduate candidates involved in the study were of different nationalities and from different backgrounds. Only about 40% of the participants were Turkish speaking, and the remaining 60% were speakers of other languages. “Inclusion of texts written by a variety of writers helps neutralize bias that may result from the idiosyncratic style of one writer, and increases the number of lexical items in the corpus” (Atkins et al., Sinclair, Sutarsyah et al., cited in Coxhead, 2000, p. 215). It was assumed that the diverse participant population would contribute to the richness and breadth of the pertinent data.

A further delimitation is the element of collaboration in the study. The participants engaged in activities and corpus-informed tasks in a supportive atmosphere, both in class and in the virtual learning environment, where they had time and space for reflection, experimentation and discovery. Another delimitation is the researcher’s status. During the process of providing the participants with more effective support for thesis writing, the researcher was also writing a thesis, and therefore was able to combine the insider-emic and the outsider-etic perspectives coined by Pike in 1954 (cited in Patton, 2002, p. 84) and incorporate these perspectives into the research process. The researcher’s status had a two-fold positive impact on the research in progress. Firstly, the researcher, going through the same process, could easily empathize with the participants, and understand their problems better. Secondly, the researcher had the chance to focus on additional problems she herself encountered during all the different stages of doing research and writing a thesis, and to integrate the experiences in the form of tasks into the pedagogic corpus.




The present study collected abundant data in accordance with the research agenda. The research questions that steered the research were addressed consecutively in this chapter:

Research Question 1: What are the major lexico-structural patterns identified in the Learner Abstract Corpus (LAC)?

This section presents sample work from the learner abstract corpus after the analysis of the interviews with the previous course instructors, and the feedback received from the post-graduate students through the needs analysis questionnaires and the end-of-the-semester feedback reports. The first section focuses on the problem, as perceived by the previous course instructors, the students themselves and the researcher. In the second section, through the use of the computer software introduced in chapter 3, the LAC is analyzed in terms of the lexico-structural patterns, demonstrating the deficiencies in the Learner Abstract Corpus.

Research Question 2: What are the major lexico-structural patterns identified in the Target Abstract Corpus (TAC)?

The TAC is closely examined in terms of the target lexico-structural patterns.


Research Question 3: How does the LAC relate to the TAC?

The analysis of the two corpora provides significant insights into the deficiencies of the post-graduate candidates’ written performance, and the target patterns required in advanced post-graduate studies in fulfilling moves such as stating the aim, and identifying the research gap.

Research Question 4: What does the cross-examination of the two corpora necessitate in terms of the comprehensive pedagogic corpus design?

The formerly analyzed and interpreted data are exploited to improve the current instructional materials and evolve a more comprehensive pedagogic corpus incorporating corpus data and related on-line tasks, taking into account the ultimate objective of action research, which is “effecting positive changes … on educational practices … and improving student outcomes and the lives of those involved” (Mills, 2000, p. 6).

4.1 Analysis of the preliminary data

The researcher has been teaching ENGL501 (presently Advanced Thesis Writing) since the 2004-2005 academic year, and observing the language deficiencies of the post-graduate candidates, who are expected to report their research in a coherent and appropriate manner. Some pertinent background data were therefore collected from the previous course instructors and the participants themselves to gain deeper insight into the problem. A small sample of the participants’ writing was also examined by three experienced writing instructors and a qualified IELTS examiner to profile the students’ writing at both course entry and exit points using the IELTS (International English Language Testing System) criteria.



Interviews with EFL 501 (ENGL501) course instructors

The interviews (see Appendix G and Appendix H) with the two previous course instructors, who were also involved in designing the course during the period in which they taught it, were held to elicit their perceptions and observations regarding the post-graduate candidates’ written performance, and in particular their views on the students’ common problems with the use of English in an academic environment. The two instructors’ suggestions for revising the course to provide better support for the participants were also elicited.

Four major themes emerged from the content analysis of the interviews with the previous course instructors: student profiles, the participants’ problems with the language, the value and usefulness of the existing course, and suggestions for the improvement of the course. As regards the first theme, student profiles, both instructors reported that the student population was quite diverse in terms not only of nationalities and departmental backgrounds, but also of language levels. As well as participants with very high, almost native-like language levels, there were those at the pre-intermediate, and in some cases the elementary, level in the target language. According to one of the respondents, some participants needed “some very basic training in English”. With respect to nationalities, although most participants were reported to be Turkish-speaking, the proportion of foreign students taking the course was on the rise.

Both course instructors stated that they observed serious problems with the use of English in an academic environment. One in fact expressed this opinion saying that “a number do have serious academic writing problems for students at post-graduate level”. Although both instructors taught two or more groups throughout the years, they said they could safely generalize the problems, and reported areas of difficulty related to vocabulary, coherence, grammar, organization, and elaboration of ideas. One of the interviewees elaborated on these problems, saying that the most serious problem is ‘lexis and language problems’, because “it is the most difficult problem to solve in the short-term”. Expanding on the ‘lexis’ problem, the interviewee said:

The participants’ lexical range is not very high, which can lead their work to be quite repetitive, and their accuracy breaks down, because their knowledge of the grammar of individual items is weak, e.g. accompanying prepositions, and their sense of collocations is limited.

The respondent also emphasized that the participants’ sense of ‘synonymy’ and ‘the need for variety in writing’ is quite limited. An important point regarding ‘insufficient lexical knowledge’ was also highlighted as follows:

this deficiency leads to a problem with nuances, e.g that 'claim' implies a somewhat critical view on the part of the writer, whereas 'As X states' tends to indicate support; this problem with nuance also leads to problems in such areas as hedging, making exaggerated claims for their own research.

With respect to the value and usefulness of the course, the interviewees stated that the course was extremely useful and valuable, as the participants were given ‘real’ academic support. They also reported that the participants themselves were motivated, willing and highly appreciative of the course, as they were aware that they needed help with their writing. They also emphasized that the feedback from the departments as well as the participants themselves was proof that the course definitely had a positive impact on the participants’ use of the academic language.

The previous course instructors felt that although the course was useful and valuable, they would make some revisions to it if they taught it again. The first interviewee stressed the need for presenting excerpts from authentic theses as models. The second interviewee also commented on the need to provide more exposure to ‘models’, as “good models provide the skeletons and building blocks of good writing”. There should also be focus on a sense of genre, organization and audience. In fact, one of the respondents said that there should be “prior focus on the logical organization of ideas within the conventions of the given genre”. More use of dictionaries, the introduction of thesaurus type work, and identification of collocations and chunks as the key feature of fluent writing were also emphasized. The course instructors also focused on the importance of presenting writing as a process, and adopting a process approach to writing.


Needs Analysis Questionnaires and End-of-the-semester Feedback Questionnaires administered to course participants

The needs analysis questionnaires were administered at the beginning of each semester to raise the participants’ awareness of academic writing and the course, as well as to elicit their self-perceived needs. A similar questionnaire was designed and employed at the end of the semester, so that feedback on the course could be collated and used to improve it, and so that the participants’ responses before and after taking the course could be compared and contrasted. As some questions were parallel in the two questionnaires to allow for comparison, the two are analyzed together to ensure a comprehensive understanding of the profiles and needs of the course participants before and after the course, and to evaluate the effectiveness of the course as perceived by the participants.

ENGL501, as introduced earlier, was initially designed to provide general academic language support for Master’s and PhD candidates and to prepare them for ENGL502, the Thesis Writing course offered by a colleague. In time, in response to the feedback received from the departments and the participants themselves, the two courses were merged to provide academic language support specifically for thesis writing. When the participants were asked what kinds of writing they needed to do in English, they singled out thesis and research paper writing, followed by writing articles for publication. This need was raised by such a vast majority of the participants over four semesters that it led to the fusion of the two courses and the development of the existing course, which has been on offer for the last four semesters as the Advanced Thesis Writing course (ENGL501).

The three main themes that emerged from the content analysis of the questionnaires administered at the beginning and the end of the semester can be categorized as the participants’ specific problems in producing written work in English, the effectiveness of the course, and the participants’ suggestions for the improvement of the course. The first theme, the difficulties the participants felt they faced when writing in English, deserved special attention as these difficulties were taken into consideration in the revision of the course, and the construction of the comprehensive pedagogic corpus. Before taking the course, the majority of the participants felt that poor vocabulary created the most serious difficulty in producing their own texts. They emphasized not being able to use different words with similar meanings, and therefore resorting to repeating the same words. A participant voiced this as “Writing with a rich vocabulary. Using various similar meaning [sic] words. I need to improve technical and academic vocabulary.”

Another almost equally important reported problem concerned grammar and sentence structure, length, and complexity. This seems to be consistent with the point that the most significant problem that needs to be dealt with is lexico-grammar in meaning creation. A few participants also mentioned difficulties with developing ideas, drafting, outlining, introducing, and concluding, but these macro-structure problems were reported much less often than those related to the ability to create the intended meaning. Upon completion of the course, although far fewer participants expanded on their problems, 85% of the problems mentioned were either lexical or grammatical in nature. One participant emphasized the widely-held belief that “Writing is the most difficult human being`s [sic] activity. It will remain a difficult task forever and for everyone.” Although at the beginning of the semester not a single participant mentioned concepts like collocations and appropriacy, as they were obviously not familiar with such nuances, at the end of the semester quite a number of the participants mentioned related problems. One of them said:

The difficulties that I still encounter in my written English are in terms of appropriacy of the vocabulary due to sometimes confusing the formal and the informal language. Also I need to know more synonyms and antonyms. Another difficulty in my written English is the misuse of collocations.

A very significant theme emerging from the analysis of the questionnaires is the effectiveness of the course as perceived by the course participants. This theme is exemplified through the responses of the participants to different questions in the questionnaires. The first of these questions related to the participants’ language levels as perceived by themselves before and after taking the course. The respondents were asked to rate their English language skills (global, reading, listening, writing, speaking) as excellent, good, fair or poor. Before taking the course, the majority of the participants rated their overall language level, and their reading, listening and speaking skills, as good. The only skill that the majority rated as fair was writing. After taking the course, however, the respondents rated all the language skills as good, and not a single person placed their writing skill in the fair category. One respondent felt that the course had contributed to the improvement of his/her language skills, saying “The course gave me an opportunity to improve my English by enhancing my vocabulary to include [sic] more academic words and phrases.” Another respondent referred specifically to the contribution of the course to the writing skill by saying “It helped me to build confidence and great interest in writing.”

Further data on the effectiveness of the course came from a question asking the respondents directly whether they had benefited from the course. Of the 58 respondents who chose to respond to this question, 57 answered in the affirmative and 1 in the negative. The respondent who said he did not benefit from the course said there was ‘insufficient and unrelated homework’. The positive feedback included such comments as: “The course has helped me tremendously in improving my writing skills and taught me how to communicate more clearly”, “Now, I can write my own sentences without plagiarism [sic]”, “Now after this course I have self-confidence to start writing even by [sic] mistakes. Guidance of you [sic], read the books [sic], search in the internet [sic] and the site (lexical) help me in the way [sic]”, and “I can claim that this course has been the most beneficial course during my graduate studies for academic writing purposes”.

Before taking the course, 45% of the respondents checked their written work by giving it to a more competent reader in the English language, followed by 34% who used Microsoft Word tools for spelling and grammar. 28% said they read their written work over and over again and revised it, and 18% consulted reference books and the Internet. 8% mentioned looking at well-written samples and benefiting from them, and 1% paid someone to edit his work. One respondent gave this very interesting answer: “I never trust whatever sentence I write that`s why I have always tried to copy from other people`s work or use their structure [sic] that’s why now in my thesis I have difficulty expressing my own ideas.”

Upon completion of the course, 28% referred to Word grammar and spell check, 22% said they used the Lexical Tutor website introduced during the course to check their written work, 10% mentioned getting feedback from the instructor, 8% said they used the course material, 7% said they gave their work to someone else to check, and 4% said they used the Word Thesaurus to find synonyms. Some responses were: “I am using ‘Academic Word List’ book and I use (vocabulary profile) web site for checking my written work”, “The first tool for checking written work is course lecture notes. The second tool is some websites which were introduced by the instructor and the third and the most important one is the teacher herself”, “Previously, I had just used MS Word editor and Google to check my writing. Now I know how can [sic] I use lextutor website. It`s more reliable than Google and other tools”, “I compare my notebook, lecture notes and what the teacher says about my written work. Yes, this course gave me new technique, system and procedure for writing”. The responses appear to indicate that after taking the course, the respondents felt more confident about their written work, used more varied resources to check their work, and relied less on other people to check it for them.

The participants were also asked about what other contribution, if at all, the course made. The majority mentioned language improvement, followed by benefits of the classroom sessions (self-confidence, interaction with peers), and a deeper understanding of what research/thesis writing involves. A few participants also referred to the contribution of the course to their knowledge of writing conventions, such as quoting, paraphrasing and referencing. One respondent expressed his ideas in the following way: “The most important help of this course is; I can say that I know how can [sic] I write my thesis. I didn`t know about writing skills, format of a thesis. I can criticize a thesis. I can see what`s wrong with the thesis.”

To raise the participants’ awareness of some key concepts in writing, a list was provided at the beginning of the semester and the participants were asked to mark the concepts they were familiar with. The same list, with minor changes, was again included in the questionnaire at the end of the semester to observe how effective the course was in contributing to the participants’ awareness of these key concepts. Table 4.1 below provides a comparison:


Table 4.1 Comparison of responses regarding familiarity with important concepts in writing
Frequency of positive responses

Terms                                    Beginning of semester   End of semester
Wordiness                                5                       11
Cohesion                                 11                      13
Coherence                                11                      15
Process writing                          11                      34
Drafting                                 22                      63
Revising                                 18                      61
Editing                                  22                      61
Appropriacy of vocabulary                3                       44
Collocations (words that go together)    5                       53
Citing (referencing) sources             14                      56
Avoiding plagiarism                      14                      68
Quoting, paraphrasing, summarizing       22                      68
Bibliographies                           22                      57
Format of thesis                         20                      68
Accuracy of language                     17                      56
Punctuation                              19                      59
Formal/Informal Language                 (not asked)             69

As can be observed from the table, the participants reported a markedly higher level of familiarity with almost all the concepts at the end of the semester than at the beginning of the course. Especially noteworthy is the fact that the participants’ awareness of plagiarism and of the ways of avoiding it through quotes and paraphrases seemed to have greatly increased after taking the course. Moreover, while initially only 5 of the students had a conception of ‘collocations’, and 3 of ‘appropriacy of vocabulary’, the number of students aware of these concepts increased to 53 and 44, respectively, at the end of the semester.

When asked if they would advise a friend to take the course, 71 out of 72 participants responded in the affirmative. Some interesting answers included: “Definitely yes, because most of the students have difficulty in writing thesis. Also most of them have problems in citing sources”, “Yes, it is a very useful course that [sic] I can advise my friends and also my teachers as well”, “Definitely. Academic writing without such course [sic] might be a nightmare”, “Definitely. Course empowers one to write and use English appropriately”, and “Yes, for sure. To excel his writing abilities as well as his English knowledge”.

The last theme emerging from the analysis of the questionnaires is the participants’ suggestions for the improvement of the course. The majority of the respondents said they were satisfied with all aspects of the course, but there were also a few suggestions. These included integrating more materials and more challenging assignments into the course, compiling the materials into a textbook, making the course compulsory for all post-graduate students, having smaller class sizes and homogeneous groups, providing more language work, setting stricter deadlines and finally opening more courses like this one. The participants were also asked if they would suggest the addition or deletion of anything related to the course. Although there were not many proposed changes, some respondents mentioned making the course compulsory and credited, making attendance compulsory, increasing the course hours, including more language work and more assignments with feedback, and opening more specialized courses (IELTS and TOEFL preparation). A respondent referred to the need for more specialized courses in the following way:

I think this course is well [sic] enough as a starting point for all university student [sic]. I suggest to continue [sic] this valuable work with more specialized topics such as writing scientific articles, conference and journal papers. (bolded by the researcher) Considering the effort needed to arrange these kind of activities and the number of people who may attend [sic]. It may be more applicable in form of workshops, short seminar, etc … However, I think such efforts are needed to improve and encourage research in our university.


These three major themes that emerged from the analysis of the questionnaires are noteworthy in that they reflect the participants’ voices. These voices are extremely significant, and were taken into account in the construction of the pedagogic corpus, and the revision of the course that employs a ‘corpus-informed approach’ (McCarthy, 2001).


A sample of participants’ writing profiles

In order to gain another perspective on the post-graduate students’ written performance at both course entry and exit points, a representative sample of 19 pieces of writing was profiled by four independent writing instructors, one of whom was a qualified IELTS examiner with 10 years of examining experience. IELTS was employed because this exam is used world-wide for evaluating academic performance, and is set by numerous educational institutions as an entrance requirement for undergraduate and post-graduate programs.


The samples of writing were graded according to the IELTS writing band descriptors on a nine-band scale. The participants’ pre-course texts revealed a wide range of writing levels, from 3 at the bottom end to 8 at the top end, with an average of approximately 5.5, which is below the level normally required by universities for entry to post-graduate study. These findings seem to corroborate the data gathered through the questionnaires and interviews regarding the language level of the participants. Towards the completion of the course, however, the post-graduate candidates’ texts exhibited an average of 6.1, ranging between 4 and 8. This is noteworthy, as a shift of half an IELTS band over the duration of a 50-60 hour course would generally be considered substantive.


Cross-reference of the Findings from the Interviews and the Questionnaires

The findings from the interviews with the course instructors and those from the questionnaires administered to the course participants were compared. Two common themes emerged from the cross-reference. These were the difficulties faced by the participants, and perceptions as regards the usefulness of the course.

The course instructors reported that the participants had the most serious difficulties with vocabulary, grammar, coherence, organization, and the elaboration of ideas. However, lexis was singled out as the most serious problem. Limited lexical range resulting in the repetition of the same words, and insufficient knowledge of the grammar of the individual words leading to the use of wrong collocates and colligates were emphasized. Similarly, poor vocabulary and insufficient knowledge of synonyms that caused the repetition of the same words were mentioned as the highest ranking difficulties by the participants. Parallel to the instructors’ observations, the fact that the participants also emphasized grammar and collocations as serious difficulties is interesting.

As regards the usefulness of the course, both the instructors and the participants held that the course definitely provides academic support. One post-graduate student expressed this quite strongly saying that thesis writing without such a course would be a nightmare. The instructors believed that the course had a positive impact on the academic writing skills of the participants. Likewise, the participants reported that they felt more confident about academic writing after taking the course. The different contributing factors mentioned by the course participants were: awareness of plagiarism and knowledge of the means of avoiding it, tools they could use to improve their writing skills, lecture notes and class discussions, exposure to more academic words and phrases, feedback from the instructor, and critical analysis of authentic samples.


Sample work from the LAC and some preliminary assumptions

In this part, two abstracts from the LAC (1 from the Sciences, and 1 from the Social Sciences sub-corpus) are presented so that some preliminary assumptions can be made, and so that the problem that led the researcher to carry out this research can be better observed.

Sample 1
Statistical process control, which is one of the quality control methods, have been used since the World War 2. It takes an important place for manufacturers to deal with the quality and the cost of their products. While manufacturing, it was a problem to find where and how the mistakes occurs. By using SPC, it stops being a problem. In this thesis, improving product quality in a pulp mill using statistical process control is issued. Critical pulp quality parameters like freeness and brightness are chosen to apply SPC. Data are collected by taking samples once in each hour and a software package is used to combine data and have X and MR charts. Too many journals, papers and articles are examined. After all these works, some ‘out-of-control’ points occurred, but the process was analyzed and found to be in process control. On the other hand, some recommendations are made. The company may improve its pulp mill by using more technological equipments. It may require a high investment first, but by this way, the cost of production will be reduced.

Sample 2
Media affects society in a wrong way. It has lots of benefits for society, such as, it broadcasts news to inform people, it entertains people when they return their home from workplace, it lets us to learn each other etc. However it also damages the society especially the children. The paper tried to explain affects of mass medium on children in three categories. The first is to document the power of mass medium (especially TV). It is the most reliable source because of the visual images. So that everyone spend the time in front of the television to entertain and to inform. Therefore children are affected about to watch it. Hence they start to watch television every time. Second, television is not limited itself to make profit. For that reason they give horror movies and sex movies. They only write that children should not watch it but it is not enough. Third, television reshapes the identities. Such as most of the children wants to be like Tarkan, Madonna or Julia Roberts, whose are the famous people in the world. Finally, Media affects children but it is not absolute. Its affect on the children cannot be generalized. But as I mentioned before, almost all over the world is watched television, therefore their affects are very common.

It can be observed from these two samples that the participants in this study were likely to have problems with grammar, vocabulary, cohesion and coherence, unity, register, and so on. Although they had acceptable vocabulary knowledge, this was at the individual word level, which led to problems in using vocabulary productively. Lack of knowledge of collocations and colligations and insufficient practice in lexico-grammar are bound to lead to problems which, in some cases, may cause total communication breakdown.

An example sentence from sample 1 is as follows: “After all these works, some ‘out-of-control’ points occurred, but the process was analyzed and found to be in process control.” In this sentence, insufficient knowledge of the verb ‘occur’ has led the student to use it with ‘points’. Can points occur? What are ‘out-of-control points’? Another problem concerns the word ‘work’. The writer of the abstract uses it in the plural and says ‘after all these works’. Does the student mean ‘after these steps are followed’? As can be observed, communication breaks down. It is very difficult for the reader to make sense of what the writer is trying to convey.

An example sentence from sample 2 is: “Second, television is not limited itself to make profit.” This writer is aware that ‘profit’ collocates with ‘make’. However, the student either does not know the meaning of ‘limit oneself’ and has used it quite by chance, or has learnt it wrongly. There is serious communication breakdown here as only a reader with the same mother tongue as the writer can predict what the message is. These examples will be expanded once the LAC is analyzed in detail.

4.2 Analysis of the Corpus Evidence

This section presents the analysis of the data from the two corpora compiled for this research.

4.2.1 Analysis of the LAC

In this section, the major lexico-structural patterns in the learner abstract corpus (LAC) are identified and examined in detail. The LAC comprises 100 learner abstracts written by the ENGL501 participants over a period of 6 semesters, and contains 21,575 tokens and 3,453 types. Compared to the 174,093-word TAC, the target abstract corpus compiled from the World Wide Web, the learner abstract corpus is small, as its size depended solely on the number of students taking the course and submitting an abstract each semester. The breakdown of the LAC into its four sub-corpora is as follows:

Table 4.2 The breakdown of the LAC into its 4 sub-corpora
Sub-corpus           Tokens
Sciences             6,859
Social sciences      9,256
Architecture         5,195
Arts & Humanities    265
Total                21,575










( to identify the frequencies in the whole corpus. The ‘Frequency’ program does not group words into word families; its output is based on individual word forms. Neither does it provide the range of words across the 4 main domains. However, the output offers cumulative percentages, which are very useful in determining what percentage of the whole corpus a given set of words makes up. The analysis showed that the 50 most common words in the LAC make up 42.79% of the whole corpus (21,575 words in total). Of these 50, only 8 are content words; the remaining 42 are function words. The most frequent word is ‘the’, which occurs 1,543 times and covers 7.15% of the LAC. The six most frequent words in this corpus are ‘the, of, and, in, to, is’. This finding is consistent with the 1967 study of Kucera and Francis, who found that in written English the simple grammatical morphemes the, of, and, to, a/an, in, that, is, was, and he are the most frequent words (cited in Hudson, 2000, p. 63).
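The kind of output described above (raw counts, percentages and cumulative percentages for the most frequent word forms) can be illustrated with a minimal sketch. It is not the ‘Frequency’ program used in the study; the function name `frequency_table`, the crude tokenisation rule and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def frequency_table(text, top_n=50):
    """Return (word, count, percent, cumulative_percent) rows for the top_n word forms."""
    # Assumed tokenisation: lower-cased alphabetic strings, with an optional
    # internal straight apostrophe (e.g. "don't"). Real tools may differ.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(tokens)
    rows = []
    cumulative = 0.0
    for word, count in Counter(tokens).most_common(top_n):
        percent = 100.0 * count / total
        cumulative += percent
        rows.append((word, count, round(percent, 2), round(cumulative, 2)))
    return rows


# Invented sample text, not taken from the LAC.
sample = "the aim of the study is to analyse the use of the words in the corpus"
for row in frequency_table(sample, top_n=3):
    print(row)
```

Reading down the last column shows how much of a text a given set of top-ranked words covers, which is how a figure such as the coverage of the 50 most common words can be read off.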

According to Hatch and Brown (1995), nouns, verbs, adjectives and adverbs are considered content words as they carry content meanings, whereas function words are pronouns, prepositions, conjunctions and determiners which carry grammatical meanings. Affixes can be added to content words and they can be modified. However, function words are affixlike in function and affixes cannot be added to them (p. 234). In the later stages of the present study, when collocations and common phrases were examined, function words took back their place since, without them, it would be impossible to express meanings appropriately. However, at this stage, the analysis looked at the most common content words in the corpus.

Some critical decisions had to be made during the filtering of the function words. As pronouns, prepositions and auxiliary verbs belong to the closed category of words and are considered function words, they were filtered from the corpus at this stage. Conjunctions are also treated as function words (Hatch and Brown, 1995, p. 234), although they certainly carry meaning and are vital cohesive devices in a text. At this stage, though, words like ‘as’, ‘thus’, and ‘although’ were filtered, particularly since some of them, such as ‘since’ and ‘as’, may function either as connecting devices or as prepositions. Another problem was the case of ‘have’ and ‘do’ and their related word forms. These words are so common that they had to be treated as function words. Since the aim of the analysis was not to produce a list of the most common words in academic abstracts, but to focus on collocations, colligations, and lexico-grammatical patterns, these words were initially considered grammatical words and excluded; they re-emerged in later stages when lexico-grammatical items were extracted from the corpus. Another problematic case included such items as ‘which’, ‘where’, ‘these’, ‘both’, ‘such’ etc. Although these words refer to other words and therefore carry semantic meaning in context, out of context they had to be treated as function words. Proper nouns, acronyms, abbreviations, numbers, and chemical symbols were also filtered.
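The filtering decisions described above were made by hand, but their general shape can be sketched in code. The following is only an illustration under stated assumptions: the stop list `FUNCTION_WORDS` is a short hypothetical sample rather than the list used in the study, and the rules for numbers, chemical symbols and acronyms are rough heuristics. Proper nouns are not handled, since identifying them requires human judgement or a name list.

```python
# Hypothetical, deliberately short stop list; the study's actual list is not reproduced here.
FUNCTION_WORDS = {
    "the", "of", "and", "in", "to", "is", "a", "an", "that", "was",
    "as", "thus", "although", "since", "which", "where", "these",
    "both", "such", "have", "has", "had", "do", "does", "did",
}


def is_filtered(token):
    """Return True if the token would be removed before the content-word analysis."""
    if token.lower() in FUNCTION_WORDS:
        return True
    # Heuristic: numbers and tokens containing digits (e.g. 2004, CO2).
    if any(ch.isdigit() for ch in token):
        return True
    # Heuristic: all-capital tokens of two or more letters are treated as acronyms.
    if len(token) > 1 and token.isupper():
        return True
    return False


def content_candidates(tokens):
    """Keep only the tokens that survive the filter, in their original order."""
    return [t for t in tokens if not is_filtered(t)]


print(content_candidates(["The", "SPC", "process", "was", "analyzed", "in", "2004"]))
```

A rule-based filter of this kind cannot resolve the ambiguous cases discussed above, such as a form that may be either a proper noun or a common noun, which is why such items still call for manual inspection.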

As there are an enormous number of content words and a limited number of function words in languages, not surprisingly, after the first fifty or so words, the filtering process got very slow as most words were content words. There were some problems though. For example, when the word ‘mark’ appeared, it could have been a proper noun, or ‘mark’ as a verb or a noun, so it was included. Previously, it was mentioned that conjunctions would not be treated as content words. However, although ‘consequence’ is a noun and therefore a content word, and therefore included, there is the possibility that in context it functions as part of the conjunction ‘as a consequence’.

Taking into account the belief that the most commonly used 2,000 words are necessary for a learner in an academic environment, and that the 2,000 word families in the General Service List (GSL) and the 570-word Academic Word List (AWL) together constitute 86% of the academic corpus (Coxhead, 2000, p. 214), it was felt necessary to explore the LAC in terms of wordlists. The analysis revealed that 67.24% of the top 50 words (39 out of 58) in the LAC came from the GSL, and approximately 24% from the AWL (14 out of 58). Abstracts are highly academic in nature, and for pedagogic purposes, one might be inclined to focus on the AWL entirely and take for granted that students are in control of the first 2,000 words. However, as exhibited by the data and the percentages, most GSL words carry academic meanings when used together with other words in academic texts. Relying solely on the AWL, and assuming that students at any level have mastered the GSL, would deprive them of the opportunity to add these very frequent words to their academic vocabulary knowledge. The next 50 content words needed to be analyzed to see whether most of these words also came from the GSL, and the results then compared with those from the TAC in the subsequent stages of the analysis.

In the same way as the top 50 content words in the LAC, 65.79% (37 out of 58) of the second 50 content words were from the GSL, followed by 25.86% from the AWL. For example, ‘term’, a word which is in the top 100 content words in the LAC, is in the GSL, where it is commonly used to refer to school semesters, as in ‘summer term’ and ‘winter term’. However, in academic texts ‘term’ is usually used to refer to an expression or a phrase, and the collocations as well as the use of the word in an academic context are different. Treating the GSL and the AWL as distinct constructs and focusing on them separately and for different purposes could lead to serious gaps in the learners’ knowledge of the most frequent words in English. In order to explore the distribution of the GSL, the AWL, and the off-list words in the LAC more clearly, it was necessary to consider the LAC as a whole:

Table 4.3 The distribution in the LAC in terms of the GSL and the AWL
WORD LIST    One (GSL1)     Two (GSL2)    Three (AWL)    Not in the lists (off-list)    Total
TOKENS/%     15489/71.79    1196/5.54     2793/12.95     2097/9.72                      21575
TYPES/%      1393/40.34     395/11.44     750/21.72      915/26.50                      3453
FAMILIES     712            246           388            Not Applicable                 1346


The table shows that the post-graduate students used quite a high percentage of word types from the GSL1 (40.34%). When the token coverage is considered, this percentage is much higher (71.79%). This means that the students employed words which are in the top 1,000 word list quite frequently. It was mentioned earlier that most of the words in the GSL are very frequently used in academic writing, and what makes a word academic has more to do with its collocates. The post-graduate candidates’ use of words from the AWL was also quite high: 21.72% in terms of type coverage and 12.95% as regards token coverage. Considering that the AWL comprises 570 word families in total, the students’ use of 388 families from the AWL is quite significant. However, caution needs to be exercised here. The candidates used 388 of the 570 word families in the AWL, but did they use them accurately? In the same vein, was their use of the GSL words accurate and appropriate?
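The difference between token coverage and type coverage discussed above can be illustrated with a short Python sketch. It is not the profiler used in the study: the miniature word lists are hypothetical, and matching is done on word forms rather than on word families, which is a simplifying assumption.

```python
from collections import Counter

def vocabulary_profile(tokens, wordlists):
    """Return {list name: (tokens, token %, types, type %)}.

    `wordlists` maps a list name to a set of word forms. Words found in
    no list are reported under "off-list".
    """
    freq = Counter(t.lower() for t in tokens)
    total_tokens = sum(freq.values())
    total_types = len(freq)
    token_counts, type_counts = Counter(), Counter()
    for word, n in freq.items():
        label = next(
            (name for name, words in wordlists.items() if word in words),
            "off-list",
        )
        token_counts[label] += n      # every occurrence counts as a token
        type_counts[label] += 1       # each distinct form counts once as a type
    return {
        label: (
            token_counts[label],
            round(100 * token_counts[label] / total_tokens, 2),
            type_counts[label],
            round(100 * type_counts[label] / total_types, 2),
        )
        for label in list(wordlists) + ["off-list"]
    }

# Hypothetical miniature lists, for illustration only.
lists = {"GSL1": {"study", "the", "of", "result"}, "AWL": {"research", "analysis"}}
text = "the study of the research the analysis of the corpus".split()
for label, row in vocabulary_profile(text, lists).items():
    print(label, row)
```

In this toy text the GSL1 items account for 70% of the tokens but only 50% of the types, the same kind of gap as that between 71.79% and 40.34% in Table 4.3: a small number of very frequent words is repeated many times.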

Billuroğlu and Neufeld (2005) argue that the separation of the GSL and the AWL is not valid, as there are words in the GSL which are very commonly used in the academic world, and words in the AWL which can very comfortably be used outside academic circles. They therefore compiled the BNL (Billuroğlu Neufeld List), which they claim to be more representative of any written text. They used the GSL headwords and word family members (Lextutor, 1,000 families; Lextutor, 2,000 families; Dickins, Extended version of a general service list of English words), the AWL headwords (Lextutor, AWL headwords) and most frequently occurring word family member (Lextutor, AWL sublists), the first 2,000 words of the Brown corpus (Edict, The first 2000 most frequent words from the Brown Corpus), the first 5,000 words of the British National Corpus (Kilgarriff, Lemmatized BNC frequency lists), the revised version of the GSL (Bauman, About the GSL), the Longman Wordwise commonly used words (Longman, 2003), and the Longman Defining Vocabulary (Kennaway, The Longman defining vocabulary). Their list comprises 2,709 words, and the words are classified into 6 bands (plus one band, band 0, for the function words) according to the number of the lists in which they occur, with band ‘1’ comprising the most frequent words. Their list, they claim, has a higher coverage of any given written text than the GSL and the AWL used together. The LAC was also put through the BNL vocabulary profiler to test this claim. Table 4.4 illustrates the distribution of the words in the LAC using the BNL.

Table 4.4 The distribution in the LAC in terms of the BNL
WORD LIST    One (function words)    Two           Three         Four          Five         Six          Seven        Not in the lists    Total
TOKENS/%     9289/43.05              5828/27.01    1462/6.78     1176/5.45     797/3.69     968/4.49     289/1.34     1766/8.19           21,575
TYPES/%      130/3.76                1073/31.07    458/13.26     316/9.15      259/7.50     313/9.06     102/2.95     802/23.23           3,453
FAMILIES     86                      519           238           172           154          159          68           Not Applicable

As can be seen from the table above, while the combined GSL and AWL token coverage of the LAC is about 90%, that of the BNL is around 92%. This may not be a major difference. However, the significant issue is that treating the GSL and the AWL separately, and focusing solely on the AWL in academic writing classes, may have serious consequences and may deprive learners of valuable learning opportunities. The BNL does not make this distinction and treats all the most frequent words in the same list. Therefore, considering lexis on the basis of word families in all six BNL bands would ensure that the students are exposed to those words which are very common, but acquire a different meaning in an academic context.

So far, the frequency and range of use of the GSL, the AWL, and the BNL words in the LAC have been investigated. However, the fact that a piece of writing reveals the use of a high percentage of academic words, or less common words, does not mean that the writing produced can successfully convey the writer’s intended messages. The use of a word does not necessarily mean the accurate and appropriate use of it. The data from the LAC revealed that the post-graduate candidates’ use of the English language in the academic context was not always coherent and appropriate, especially at the lexico-structural level. It is very important, at this point, to emphasize that “the learning of vocabulary, …, is not just a question of learning the ‘semantic properties’ of items, but also their ‘syntactic properties’” (Corder, 1973, pp. 279-280). The post-graduate candidates’ problem with lexis and grammar was reported as the top difficulty by the course instructors, as well as by the participants themselves. Identifying incorrect uses of language is an important pedagogic consideration, since “whilst the nature and quality of mistakes a learner makes provide no direct measure of his knowledge of the language, it is probably the most important source of information about the nature of his knowledge” (Corder, 1973, p. 257). According to Corder (1973), describing and classifying learners’ errors gives us information about what learners still need to learn, and which language features are causing them problems (p. 257). The analysis of errors is most useful for teachers, since errors provide them with data regarding the effectiveness of teaching materials and teaching techniques, as well as which parts of the syllabus have been insufficiently learnt and need to be focused on further (Corder, 1973, p. 265).


Errors can be described once what the learner is trying to say can be interpreted. In the case of a shared mother tongue, and if the learner is available, the teacher can ask the learner to “express his intentions in his mother tongue, and then translate his utterance into the target language”. This is known as ‘authoritative interpretation’, as the utterance is authoritatively reconstructed into an acceptable form, and “its appropriateness is guaranteed by the translation” (Corder, 1973, p. 274). In situations where the learner is absent, however, what he intended to say is inferred “from his utterance, its context and whatever we know about him and his knowledge of the world and the target language”. This is called a ‘plausible interpretation’ and “the corresponding reconstruction only a plausible reconstruction”. However, there are sometimes instances “when the utterance is so incoherent that no interpretation of any sort can be achieved” (Corder, 1973, p. 274). Errors can be described at the most superficial level “in terms of the physical difference between the learner’s utterance and the reconstructed version”. Four categories can be mentioned for differences of this kind: “omission of some required element; addition of some unnecessary or incorrect element; selection of an incorrect element; misordering of elements” (Corder, 1973, p. 277). A deeper description can be achieved through “assigning the items involved to the different linguistic levels: orthographic/phonological, syntactic, and lexico-semantic” (Corder, 1973, p. 278). A learner may use a correct semantic item, but choose its wrong grammatical form, or a correct semantic item used in its correct grammatical form may be spelt wrongly.
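Corder’s two dimensions of description, the surface category and the linguistic level, can be represented as a simple data structure. The Python sketch below is only an illustration of how such annotations might be recorded and counted; the class and function names are hypothetical and do not correspond to any tool used in the study.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum

class Surface(Enum):
    OMISSION = "omission of some required element"
    ADDITION = "addition of some unnecessary or incorrect element"
    SELECTION = "selection of an incorrect element"
    MISORDERING = "misordering of elements"

class Level(Enum):
    ORTHOGRAPHIC_PHONOLOGICAL = "orthographic/phonological"
    SYNTACTIC = "syntactic"
    LEXICO_SEMANTIC = "lexico-semantic"

@dataclass
class AnnotatedError:
    learner_text: str
    reconstruction: str                        # empty if no plausible interpretation
    tags: list = field(default_factory=list)   # (Surface, Level) pairs

def tally(errors):
    """Count how often each (surface category, linguistic level) cell is used."""
    return Counter(tag for err in errors for tag in err.tags)

# An utterance from the LAC with its plausible reconstruction.
example = AnnotatedError(
    learner_text="This study applied in North Cyprus ...",
    reconstruction="This study was conducted in North Cyprus ...",
    tags=[(Surface.OMISSION, Level.SYNTACTIC),
          (Surface.SELECTION, Level.LEXICO_SEMANTIC)],
)
for (surface, level), n in tally([example]).items():
    print(f"{surface.value} / {level.value}: {n}")
```

Crossing the four surface categories with the three levels gives the twelve cells of the classification matrix; a single utterance may fall into more than one cell.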


Table 4.5 Matrix for the classification of errors
             Phonological/orthographical    Grammatical    Lexical
Omission
Addition
Selection
Ordering

(Corder, 1973, p. 278)

The errors of the post-graduate candidates were analysed and described using Corder’s (1973) matrix for the classification of errors. This description was of great significance for the later stages of this research study, in the construction of the pedagogic corpus and the design of relevant tasks. It should be emphasized here that the LAC was very rich in errors, and thus only some of them are exemplified here. The most frequent content word in the LAC, ‘study’, occurs 128 times. The examples given include the use of the item ‘study’ both as a verb and as a noun. As Table 4.6 below shows, the majority of errors were of a syntactic and lexico-semantic nature: some required elements were omitted, incorrect elements were selected, some redundant elements were added, and some elements were misordered. It needs to be strongly emphasized here that ‘study’ is a GSL1 word, in other words, one of the 1,000 most common words in English. One might assume that students at this level would not make mistakes with a GSL1 word. However, as mentioned earlier, the use of words in academia may be quite different from that in spoken discourse or fiction, and thus treating the GSL words as ‘given’, and focusing on the AWL words exclusively, may result in gaps in the students’ knowledge.


Table 4.6 Description of errors - ‘study’

Data from the LAC: This study applied in North Cyprus and the results can be different in other countries.
Plausible reconstruction by the researcher: This study was conducted in North Cyprus, and the results can be different in other countries
Description of the error:
-Omission of a required element / syntactic
-Selection of an incorrect element / lexico-semantic

Data from the LAC: Elements are thought to be study individually, within application in a building and the influence through the neighborhood buildings
Plausible reconstruction by the researcher: Elements are thought to be studied individually …..
Description of the error:
-Syntactic / Inflectional affix

Data from the LAC: In this thesis, the aim of the study is to understand the meaning of buildings through their form, space and functional organizations and relationships in the Persepolis complex and the cultural values behind that.
Plausible reconstruction by the researcher: This study aims to explore the …
Description of the error:
-Addition of some unnecessary element / lexico-semantic
-Selection of an incorrect element / lexico-semantic

Data from the LAC: This study is to examine the relationship between emotional problems, alcohol and drug consumption and academic performance among Eastern Mediterranean University’s students (EMU).
Plausible reconstruction by the researcher: The objective of this study is to examine …
Description of the error:
-Omission of a required element / lexico-semantic

Data from the LAC: A study of various cultures’ architecture and relation to the climate and principles of sustainability will be analyzed.
Plausible reconstruction by the researcher: The architecture of various cultures in relation to the climate and the principles of sustainability will be examined.
Description of the error:
-Selection of an incorrect element / Addition of some unnecessary element / lexico-semantic
-Misordering of elements / syntactic

Data from the LAC: Through the reflection of study about marketing communication role on alcohol consumption offers different perspective about evaluation of its role and effects on alcohol consumption.
Plausible reconstruction by the researcher: Plausible interpretation not possible.
Description of the error:
-Misordering of elements / syntactic
-Coherence problem

‘Research’ is the second most frequent content word in the LAC. There can be no doubt that the post-graduate candidates, who are deeply involved in research, know the meaning of ‘research’ very well. Yet, as can be seen in Table 4.7, they have not mastered the lexico-structural use of the word in an academic context. It can be said that although they knew the word semantically, they seemed not to have learnt the ‘grammar’ of the word ‘research’.

Table 4.7 Description of errors - ‘research’

Data from the LAC: Secondly, the buildings which are restored contrary to the Venice charter articles will be the result of this research.
Plausible reconstruction by the researcher: the buildings …… will be the main focus of this research / This research will focus on the buildings which …
Description of the error:
-Selection of an incorrect element / lexico-semantic

Data from the LAC: A comparative research is prepared in order to achieving the background of house owners, searching the rules of housing sites, looking throw city life and comparing the life of housing site with comparing two different characteristics of two sites, which are existed already, in different time periods.
Plausible reconstruction by the researcher: This comparative research study is conducted in order to …
Description of the error:
-Selection of an incorrect element / lexico-semantic
-Omission of a required element / lexico-semantic

Data from the LAC: The online survey and Internet research is going to use in order to examine role of marketing communication tools in three different countries such as America,..
Plausible reconstruction by the researcher: Plausible interpretation not possible.
Description of the error:
-Omission of a required element / syntactic
-Selection of an incorrect element / lexico-semantic

Data from the LAC: Studying climatological factors which are affecting hydrologic cycle was the main concern of so many research and studies.
Plausible reconstruction by the researcher: …….so many research studies / so much research.
Description of the error:
-Selection of an incorrect element / lexico-semantic / syntactic
-Addition of some redundant element / lexico-semantic (research and studies)

The analysis of the data from the interviews with the previous course instructors, the pre- and post-questionnaires from the students, the profiles of the participants’ written work, the sample student work, and the LAC revealed that the post-graduate candidates were facing difficulties, especially in creating meanings in a language that was not their own. Therefore, it seemed crucial to compile a parallel corpus of the work of post-graduate students in English-speaking countries as a next step, in order to observe how they used lexico-grammatical patterns to fulfil moves, such as identifying the research gap, required in abstracts and theses. Their written performance in terms of fulfilling the required moves was then compared with that of the non-native post-graduate students studying and living in a non-English speaking environment.


Analysis of the Target Abstract Corpus (the TAC)

In this section, the major lexico-structural patterns in the TAC are identified and examined in detail. The target corpus is referred to as the TAC (Target Abstract Corpus), as it is composed of thesis abstracts that were the source of the target lexico-structural patterns in this research. The abstracts in the TAC, all published on the World Wide Web, were also produced by students, not ‘experts’. Flowerdew (2000) draws attention to the importance of providing good ‘apprentice’ models rather than ‘expert’ generic models, since expert models are more difficult to replicate due to the likely communicative and linguistic deficiencies of learners.

The TAC was compiled from 4 main domains, namely Sciences, Social Sciences, Architecture and Arts and Humanities. Each domain included 150 thesis or dissertation abstracts, a total of 600 abstracts altogether. The downloaded files had to be filtered using ‘Concordance’ software to eliminate the html formatting of the data as all abstracts were obtained from the World Wide Web. The breakdown of tokens according to the 4 domains (sub-corpora) was as follows:


Table 4.8 The breakdown of the TAC into its 4 sub-corpora
Sciences             42,113
Social Sciences      43,073
Architecture         39,994
Arts & Humanities    48,913
Total:               174,093

The table indicates that although the corpus was balanced in terms of the number of abstracts, the Arts and Humanities sub-corpus had by far the highest number of words. As the disciplines in the Arts and Humanities sub-corpus (Anthropology, Art History, History, Language, Literature, Linguistics, Philosophy, Religion, Theology, Psychology, Music, and Sociology) are dependent on verbal expression by nature, this seems to be expected.

In the same way as the LAC, the TAC was put through the ‘frequency’ program to identify the frequencies in the whole corpus. The total number of tokens (individual words) in the TAC is 174,093, and the total number of types is 15,425. The most frequent word in the TAC is ‘the’. It occurs 12,550 times and accounts for 7.21% of the whole corpus.
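The kind of output described here (total tokens, total types, and the share of the most frequent word) can be sketched in a few lines of Python. This is not the ‘frequency’ program itself; the tokenisation rule and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter

def frequency_list(text):
    """Return (total tokens, total types, Counter of word forms)."""
    # Assumed tokenisation: lower-cased alphabetic strings, allowing
    # internal hyphens and apostrophes.
    tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
    counts = Counter(tokens)
    return len(tokens), len(counts), counts

text = "The aim of the study is to examine the use of the corpus in the study of language."
n_tokens, n_types, counts = frequency_list(text)
word, freq = counts.most_common(1)[0]
print(n_tokens, n_types, word, freq, round(100 * freq / n_tokens, 2))

# The same calculation applied to the figures reported for the TAC.
share_of_the = round(100 * 12550 / 174093, 2)
print(share_of_the)
```

The last two lines reproduce the reported share of ‘the’ from its frequency (12,550) and the corpus size (174,093).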

The analysis of the TAC demonstrated that 39.5% of the whole corpus was made up of a mere 50 words. Of these 50 words, only 12 were content (lexical) words and the others were all ‘function’ words. Hunston (2002, p. 3) discusses word frequencies in corpora in detail and compares the first 50 words from three different corpora, a corpus of politics dissertations, a corpus of materials science dissertations and the 1998 Bank of English corpus, and she concludes that in the first 50, grammar words are more frequent than lexical words. In all the three corpora compared, the first 6 most frequent words are the, of, to, and, a, and in. The TAC findings are consistent with Hunston’s findings in that the first 6 most frequent words in the TAC are the, of, and, to, in, and a, respectively.

In parallel with the analysis of the LAC, the function words in the TAC were then filtered to provide a basis of comparison in the later stages of the data analysis. Of the top 50 content words in the TAC, 60% (30 out of 50) come from the GSL, followed by 34% (17 out of 50) from the AWL. It should again be emphasized that 60% is quite a high percentage, and very strongly confirms that some GSL words acquire a different meaning and a different use when used in an academic context. Not surprisingly, in the TAC, the most frequent content word was ‘study’ (a GSL word) (667 times, 0.39%), as the corpus is highly academic in nature. There were some subject-specific words occurring in the top 50 content words list. Some of these were: architecture, architectural, social, cultural, political, etc. At this point there was no need to filter these words since later in the research a word qualified for inclusion in the target bank of lexico-structural items to achieve moves not only in terms of frequency, but also with respect to the range of occurrence in the four sub-corpora.

One interesting finding was the emergence of the word ‘new’ as the third most frequent content word, occurring 377 times, and of the adjective ‘different’, occurring 172 times, as academic research, by definition, is expected to offer something ‘new’ and ‘different’ to the field. The analysis of the most frequent 50 content words also indicated that a high number of synonymous words were used. For example, the top word ‘study’ may frequently be used interchangeably with the 2nd word ‘research’, the 6th most frequent word ‘thesis’, the 23rd most frequent word ‘dissertation’ and even the 11th word ‘project’. Likewise, ‘analysis’, ‘data’, and ‘information’ are all useful content words which may convey a similar message. As the analysis developed, these relationships took better shape.

In the second top 50 list, 66.66% (34 out of 51) of the words were from the GSL, and 29.41% (15 out of 51) from the AWL. Relations of synonymy can also be observed in the second 50 list. When the British National Corpus is consulted, for example, it can be seen that the words ‘significant’, ‘important’ and ‘major’, which all appeared on the second 50 list, can be used as synonyms within an academic context. To get a better insight into the first 100 list, it was felt necessary to categorise the words according to their parts of speech.

Table 4.9 The most frequent 100 content words in the TAC, categorized in terms of their parts of speech










The table indicates that most of the frequent words appearing in the target corpus were nouns, followed by verbs, adjectives and adverbs. This would seem to suggest that in academic discourse most meaning is conveyed through nouns. At this stage, it is not clear whether some of the words were used as nouns or verbs, and therefore these words were categorised as both nouns and verbs. However, the concordancing program revealed the parts of speech in the future stages of the research in the analysis of collocations, colligations, and lexico-structural patterns.

In order to examine the distribution of the GSL, the AWL, and the off-list words used in the target corpus more clearly, it was necessary to look at the whole TAC using RANGE.

Table 4.10 The distribution of the words in the TAC in terms of the GSL and the AWL
WORD LIST    One (GSL 1)     Two (GSL 2)    Three (AWL)    Not in the lists    Total
TOKENS/%     113985/65.47    8835/5.07      25695/14.76    25578/14.69         174,093
TYPES/%      2658/17.23      1208/7.83      1992/12.91     9567/62.02          15,425
FAMILIES     938             575            564            Not Applicable      2,077

As indicated in Table 4.10, the target corpus, made up of 174,093 words, showed variations in terms of ‘types’ and ‘tokens’. While 65.47% of the tokens came from GSL 1, this was only 17.23% as regards types. This means that a low number of the most frequent words were very commonly used. On the other hand, the words that do not occur in any list made up 14.69% of the tokens but constituted 62.02% of the types. These are less frequent words which do not occur among the 2,570 words covered by the GSL and the AWL. 14.76% of the tokens came from the Academic Word List. This is quite a high percentage, but as the corpus is academic in nature, this is hardly surprising. In terms of types, these academic words constituted 12.91% of the whole target corpus. It should be stressed here yet again that some GSL words acquire an academic meaning when used with certain other words in an academic context, the “linguistic environment”, and especially the co-text, “the other words on either side” (Sinclair, 1991, pp. 171-172).
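The notion of co-text, the words on either side of a given word, can be made concrete with a minimal collocate counter. The Python sketch below is an assumption-laden illustration, not the concordancing software used in the study: the function name, the default span of four words, and the sample sentence are all hypothetical.

```python
from collections import Counter

def collocates(tokens, node, span=4):
    """Count the words occurring within `span` positions either side of `node`."""
    found = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            found.update(left + right)
    return found

tokens = "this study presents a novel approach and a novel technique for the study".split()
print(collocates(tokens, "novel", span=1))
```

With a span of one word, the sketch returns the immediate neighbours of ‘novel’ in the sample sentence; a wider span would bring in more of the linguistic environment at the cost of more noise.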

The GSL and the AWL, when used together, are claimed to cover 85 to 90% of words (tokens) in academic text. Billuroğlu and Neufeld (2005), as mentioned earlier, claim that their list, the BNL (Billuroğlu Neufeld List), is more comprehensive and representative of the commonly used words in English. The list claims to comprise 90 to 95% of tokens (not including proper nouns, acronyms or abbreviations) in academic corpora (BNL_Rationale.doc). When the BNL was applied to the TAC, the following output was obtained:

Table 4.11 The distribution of words in the TAC in terms of the BNL
WORD LIST    BNL 1 (function words)    BNL 2          BNL 3         BNL 4         BNL 5        BNL 6        BNL 7        Not in the lists    Total
TOKENS/%     70794/40.66               38871/22.33    11459/6.58    10785/6.19    6781/3.90    9100/5.23    2426/1.39    23877/13.72         174093
TYPES/%      163/1.06                  2049/13.28     1141/7.40     851/5.52      796/5.16     1004/6.51    340/2.20     9081/58.87          15425
FAMILIES     91                        636            416           314           305          281          136          Not Applicable      2179

As can be seen from the figures above, the BNL coverage of the target academic corpus was higher than that of the GSL and the AWL combined. Whereas the latter covered 85.3% of tokens and 37.98% of types, the BNL comprised 86.28% of tokens and 41.13% of word types. However, the percentage of off-list types was still high (58.87%). As noted above, the GSL and the AWL are claimed to cover 85 to 90% of actual words (tokens) in a text, and the BNL 90 to 95%. Considering that tokens are not equal to types, and that a word like ‘the’ is repeated 12,550 times, it is worth exploring the off-list word list. One of the off-list words which occurred in all the 4 sub-corpora, and was repeated 43 times in the TAC, is ‘novel’. In 28 of these 43 occurrences, ‘novel’ was used as an adjective meaning ‘new’, mostly together with the nouns ‘approach’, ‘technique’ and ‘application’. If we relied solely on word lists such as the GSL, the AWL and the BNL, the students might not encounter ‘novel’ and use it as an alternative to ‘new and original’ in their academic writing. A search of the BNC in fact revealed that ‘novel’ appears 1,349 times in the academic sub-corpus of the BNC, and is used with nouns such as ‘problems’, ‘definition’, and ‘way’.
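The coverage percentages compared at the start of this paragraph follow directly from the raw counts in Tables 4.10 and 4.11. The short Python check below recomputes them; the helper name `coverage` is hypothetical, and the figures are copied from the two tables.

```python
def coverage(counts_in_lists, total):
    """Percentage of `total` accounted for by the listed counts."""
    return 100 * sum(counts_in_lists) / total

TOTAL_TOKENS, TOTAL_TYPES = 174093, 15425

# GSL 1, GSL 2 and AWL counts from Table 4.10.
gsl_awl_tokens = coverage([113985, 8835, 25695], TOTAL_TOKENS)
gsl_awl_types = coverage([2658, 1208, 1992], TOTAL_TYPES)

# BNL coverage is everything except the off-list counts in Table 4.11.
bnl_tokens = coverage([TOTAL_TOKENS - 23877], TOTAL_TOKENS)
bnl_types = coverage([TOTAL_TYPES - 9081], TOTAL_TYPES)

print(round(gsl_awl_tokens, 1), round(gsl_awl_types, 2))
print(round(bnl_tokens, 2), round(bnl_types, 2))
```

The gap between the two lists is about one percentage point for tokens and about three for types, which is why the remaining off-list types deserve separate attention.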

Another example is the ‘interview’ family. Although it is in the BNL band 5, this family does not merit a place in the AWL. However, in academic writing, and especially thesis writing, where research is reported, it is an extremely important family in terms of methodological procedures. In the TAC, the family occurred 89 times as shown below:


Table 4.12 The frequency of occurrence of the ‘Interview’ family in the TAC
Arch. Subcorpus                 3     2     0    0    8     13
Arts & Humanities Subcorpus     8     6     1    1    19    35
Sciences Subcorpus              0     1     0    0    9     10
Social Sciences Subcorpus       3     1     1    0    26    31
Subcorpora                      3     4     2    1    4
Total Freq.                     14    10    2    1    62    89

The other off-list word families that occurred in all the four sub-corpora quite frequently were ‘database’ (17 times), ‘era’ (17 times), ‘discourse’ (103 times), and ‘dissertation’ (204 times). Especially noteworthy were the last two families: ‘discourse’ and ‘dissertation’. As regards ‘dissertation’, it is interesting that such a frequent word in the academic register could not qualify for the AWL, or the BNL. Considering that this word can mostly be interchangeably used with ‘thesis’, it would be unfair to deprive students writing their theses of a synonym. The ‘discourse’ family, on the other hand, exhibited meaningful data. Although the word itself and the family members occurred in the Architecture, the Arts and Humanities, and the Social Sciences sub-corpora, they did not occur in the Sciences sub-corpus even once, possibly due to the non-verbal requirements of this domain.
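The two measures used in this discussion, total frequency and range across the sub-corpora, can be computed with a small helper. The Python sketch below is an illustration only: the function names and the thresholds passed to `qualifies` are hypothetical, the ‘interview’ counts are the sub-corpus totals from Table 4.12, and the second family is invented to show a word that is absent from one sub-corpus.

```python
def frequency_and_range(counts_by_subcorpus):
    """Return (total frequency, number of sub-corpora with at least one hit)."""
    total = sum(counts_by_subcorpus.values())
    spread = sum(1 for n in counts_by_subcorpus.values() if n > 0)
    return total, spread

def qualifies(counts_by_subcorpus, min_frequency, min_range):
    """Apply hypothetical cut-offs for frequency and range."""
    total, spread = frequency_and_range(counts_by_subcorpus)
    return total >= min_frequency and spread >= min_range

# Sub-corpus totals for the 'interview' family, from Table 4.12.
interview = {"Architecture": 13, "Arts & Humanities": 35,
             "Sciences": 10, "Social Sciences": 31}

# Invented counts for a family missing from the Sciences sub-corpus.
hypothetical_family = {"Architecture": 5, "Arts & Humanities": 7,
                       "Sciences": 0, "Social Sciences": 3}

print(frequency_and_range(interview))
print(qualifies(interview, min_frequency=10, min_range=4))
print(qualifies(hypothetical_family, min_frequency=10, min_range=4))
```

Requiring both measures keeps a word that is frequent in only one or two domains from being treated as generally useful for thesis writing.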

The analysis so far seems to suggest very strongly that wordlists such as the GSL, the AWL, and the BNL have their pedagogic merits. Yet, caution needs to be exercised with them, as there are also extremely significant off-list words employed by different genres and sub-genres. Therefore, it can be said that while wordlists certainly facilitate the learning of vocabulary by progressively focusing on the most frequently used words, they need to be supplemented by genre-based specialized vocabulary, especially at higher levels.

In this section, some words from the TAC were analysed in terms of their lexico-grammatical patterns. The same words chosen for the LAC were focused on here as well, and the examples presented were selected on the basis of the deficient patterns in the LAC. As Table 4.13 shows, the examples indicated that the writers of the texts in the TAC used the analysed words accurately, appropriately, and in a variety of ways. Furthermore, the occurrence of less frequent words is easily observable.

Table 4.13 Lexico-grammatical patterns in the TAC - ‘study’ and ‘research’

Data from the TAC
While such deviant trademarks do not seem more likely to be abandoned or cancelled or to expire, further study suggests that trademarks that adhere most strictly to design norms are more likely to survive in use over time.
This study presents a systematic approach for doing the latter by identifying the ICTs, technology applications and key sectors that most impact the internal digital divide in developing countries.
The research is based on a model of systems adoption as a continuous process, and with the choices and decisions taken at an early stage with regard to technology having significant effects on the adoption across time.
This research is conducted as an effort to demonstrate the usefulness of the system dynamics models on the construction industry.

Comment
• Selection of a correct element / lexico-semantic
• Correct ordering of elements / syntactic
• Use of a correct element / syntactic / subject-verb agreement
• Selection of a correct element / lexico-semantic
• Correct ordering of elements / syntactic
• Use of a correct element / syntactic / subject-verb agreement
• Selection of a correct element / lexico-semantic
• Use of a correct element / syntactic / subject-verb agreement


The analysis so far provided an outline of the LAC and the TAC compiled for this study, and a rough profile of the writers in the two corpora. More comprehensive data regarding competences and deficiencies were revealed when the two corpora were compared and contrasted with each other.


Comparison of the LAC with the TAC

This section compares and contrasts the LAC with the TAC in terms of word frequency, range, collocations, colligations and lexico-structural patterns in general. This section is therefore of major significance, as the outcome of the analysis formed the basis of the pedagogic corpus and the revised course.

It can be observed from the table (see Appendix I) that while the first 50 words of the TAC made up 39.50% of the whole corpus, the percentage was higher for the LAC, the first 50 words making up 42.79% of the whole corpus. This appears to indicate that the writers in the LAC repeated the same words more frequently. Graph 4.1 displays the distribution of all the words in the two corpora more clearly.


Graph 4.1: The LAC and the TAC – Cumulative Percentages

Graph 4.1 indicates a striking difference between the LAC and the TAC in terms of coverage. As can be observed, the whole LAC is covered by approximately 3,400 different words. However, more than 15,000 different words make up the TAC, making it clear that the range of vocabulary used by the writers of the TAC is almost five times as wide as that of the post-graduate writers of the LAC. It should be remembered at this stage, however, that the corpora are not of equal sizes, and more meaningful data are revealed when an equal size sample of the TAC is compared with the LAC in the later stages of the research.
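A cumulative percentage curve of the kind plotted in Graph 4.1 can be derived from a frequency list, and the caveat about unequal corpus sizes can be addressed by counting types in equal-sized samples. The Python sketch below illustrates both steps under stated assumptions: the function names and the toy sentence are hypothetical, and taking the first `n` tokens is only one possible way of drawing an equal-sized sample.

```python
from collections import Counter
from itertools import accumulate

def cumulative_coverage(tokens):
    """Cumulative % of the corpus covered by the 1st, 2nd, ... most frequent word."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    total = sum(counts)
    return [round(100 * c / total, 2) for c in accumulate(counts)]

def types_in_sample(tokens, n):
    """Number of different words in the first `n` tokens (an assumed sampling rule)."""
    return len(set(tokens[:n]))

tokens = "the study of the data and the results of the study".split()
curve = cumulative_coverage(tokens)
print(curve)
print(types_in_sample(tokens, 5))
```

The curve always ends at 100%, and its length equals the number of types, so a corpus with a wider vocabulary produces a longer and flatter curve than one in which a few words are repeated heavily.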

The first ten words in both corpora were all function words, which is consistent with the pertinent literature. The most frequently used content word in both corpora was the same: ‘study’. This word was at number 17 in the LAC, and at number 20 in the TAC. The second most frequent content word in the two corpora was also the same, ‘research’, appearing at number 25 in the LAC and 32 in the TAC. There was no observable difference between the LAC and the TAC in terms of the first 100 words. Thus, it would be useful to compare the top 50 content words and then the second most frequent 50 content words in the two corpora for more relevant data.

Table 4.14 Comparison of the top 50 content words in the LAC and the TAC
Columns for each corpus: Content Word, Frequency, Off-list, GSL1, GSL2, AWL
The LAC (21,575 words), frequencies in rank order: 128, 79, 71, 70, 70, 65, 64, 58, 44, 44, 43, 41, 40, 40, 39, 38, 37, 37, 36, 35, 34, 33, 33, 33, 33, 32, 31, 31, 30, 30, 30, 30, 29, 28, 27, 27, 26, 26, 25, 25, 25, 25, 24, 24, 23, 22, 22, 22, 22, 21, 21
The TAC (174,093 words), frequencies in rank order: 667, 405, 377, 335, 320, 317, 289, 283, 282, 278, 271, 262, 256, 251, 243, 236, 235, 227, 226, 222, 220, 212, 205, 190, 189, 188, 187, 174, 172, 171, 158, 156, 155, 152, 151, 150, 148, 148, 147, 146, 146, 144, 136, 136, 135, 134, 133, 131, 130, 129, 129

The data indicated that although the two lists included quite a number of common words, 67.24% of the top 50 content words in the LAC came from the GSL, and approximately 24% from the AWL, whereas 60% of the top 50 content words in the TAC came from the GSL, followed by 34% from the AWL. This means that the post-graduate candidates taking ENGL501 made more use of the GSL and less of the AWL. The data from the TAC, however, revealed that the writers of the TAC made more use of academic words in their work, and used quite a number of synonymous words, evident even in the list of fifty words. ‘Study’, ‘research’, ‘thesis’, ‘dissertation’ and even ‘project’ can all be used to refer to some kind of product in research. Two other synonymous threads that can be observed are ‘analysis / data / information’ and ‘model / design’. Synonymy also occurred in the LAC, but to a lesser extent. One noteworthy observation regarding the LAC is that words like ‘process’, ‘theory’, ‘context’, and ‘framework’, which are very important in describing the methodology in research, did not occur within the top 50 words.

Table 4.15 Comparison of the second top 50 content words in the LAC and the TAC
Columns for each corpus: Content Word, Frequency, Off-list, GSL1, GSL2, AWL

The LAC (21,575 words), frequencies in rank order:
20 20 20 20 20 20 20 19 19 19 19 19 19 19 19 19 18 18 18 18 18 18 18 17 17 17 17 17 17 17 17 17 17 17 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16

The TAC (174,093 words), frequencies in rank order:
128 128 124 124 124 123 123 122 121 119 119 118 118 117 117 117 117 114 113 113 112 109 108 107 105 105 103 103 102 100 100 100 100 100 100 98 98 97 96 96 95 95 94 94 93 92 90 90 90 90

In the LAC, 65.79% of the second 50 content words were from the GSL, followed by 25.86% from the AWL. In the TAC, 66.66% of these words were from the GSL, and 29.41% from the AWL. Again the percentage of the use of academic words was higher in the TAC than in the LAC. However, as previously stated, some GSL words gain academic meanings when used with certain other words in an academic context.

Relations of synonymy could also be observed in the second 50 list. For example, the synonymous words ‘significant’, ‘important’ and ‘major’ appeared on the TAC second 50 list. The LAC second 50 most frequently used words list included two very common words (‘television’ and ‘internet’) which are off-list, as the GSL was compiled in 1953. The Billuroglu-Neufeld List (BNL), however, attempted to reflect the nature of English in the second half of the 20th century, by eliminating words like ‘shilling’ which are no longer common, and including words like ‘television’ and ‘internet’ which are in frequent use in the modern era. Having looked at the top 100 words in the two corpora, it was also worthwhile to consider the two corpora as a whole. RANGE software was used to observe the comparative distribution of the GSL, the AWL, and the off-list words used in the LAC and the TAC.

Table 4.16 Comparison of the LAC and TAC distribution – the GSL and the AWL

Word list                     LAC tokens/%   LAC types/%   LAC families   TAC tokens/%   TAC types/%   TAC families
one (GSL1)                    15489/71.79    1393/40.34    712            113985/65.47   2658/17.23    938
two (GSL2)                    1196/5.54      395/11.44     246            8835/5.07      1208/7.83     575
three (AWL)                   2793/12.95     750/21.72     388            25695/14.76    1992/12.91    564
not in the lists (off-list)   2097/9.72      915/26.50     *NA            25578/14.69    9567/62.02    *NA
Total                         21575          3453          1346           174093         15425         2077

*NA: Not applicable. The RANGE program cannot calculate word families of off-list words, as these words do not appear in any of the three lists (General Service List 1, General Service List 2 and Academic Word List).

The table indicates that the post-graduate candidates used the most common 1000 words (GSL 1) more frequently both in terms of types and tokens. Similarly, the percentage of the next most common 1,000 words (GSL 2) was higher in the LAC. This means that in the LAC the most common 2,000 words are more often used. However, the writers in the TAC made more use of the word families in the GSL. One interesting finding indicated by the above profile is that the post-graduate students taking ENGL501 used quite a high percentage of academic word types. However, careful scrutiny of the data revealed that the use of the academic words in the LAC was restricted to only 388 families, whereas the writers of the TAC exploited 564 families from the academic word list. If we consider that the total number of word families in the AWL is 570, this is quite significant.
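The kind of profile reported in Table 4.16 (tokens, types and word families per list) can be sketched as follows. This is an illustrative approximation and not the RANGE program itself; the function name `lexical_profile`, the data layout and the tiny word lists used in the usage note are assumptions.

```python
from collections import Counter


def lexical_profile(tokens, family_lists):
    """Count tokens, types and word families per word list.

    family_lists maps a list name to {headword: [other family members]}.
    Forms found in no list are reported under "off-list", for which no
    family count is possible.
    """
    # Map every word form to the list and headword it belongs to.
    lookup = {}
    for list_name, families in family_lists.items():
        for headword, members in families.items():
            for form in (headword, *members):
                lookup.setdefault(form, (list_name, headword))

    counts = Counter(tokens)
    total_tokens = sum(counts.values())
    total_types = len(counts)

    rows = {name: {"tokens": 0, "types": 0, "heads": set()}
            for name in (*family_lists, "off-list")}
    for form, freq in counts.items():
        list_name, headword = lookup.get(form, ("off-list", None))
        rows[list_name]["tokens"] += freq
        rows[list_name]["types"] += 1
        if headword is not None:
            rows[list_name]["heads"].add(headword)

    profile = {}
    for name, row in rows.items():
        profile[name] = {
            "tokens": row["tokens"],
            "token_pct": round(100 * row["tokens"] / total_tokens, 2) if total_tokens else 0.0,
            "types": row["types"],
            "type_pct": round(100 * row["types"] / total_types, 2) if total_types else 0.0,
            "families": None if name == "off-list" else len(row["heads"]),
        }
    return profile
```

With a hypothetical input such as `{"GSL1": {"study": ["studies", "studied"]}, "AWL": {"analyse": ["analysis"]}}`, each row of the returned dictionary mirrors one row of the table: a token count with its percentage, a type count with its percentage, and the number of families represented.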


The types of words which do not appear in any of the three lists (the off-list words) are much more frequent in the TAC (62.02%), as opposed to 26.50% in the LAC. This may indicate that the writers in the LAC tend to use the most common words more frequently, while the writers in the TAC exploit a variety of words. The table thus suggests that the post-graduate writers involved in research in English-speaking countries did not solely rely on the most commonly used 2,570 word families, but instead made use of more sophisticated words that were not on any of the lists. This is demonstrated by the fact that 62.02% of the types of words used in the TAC did not appear in either the GSL or the AWL. The LAC and the TAC also need to be compared with respect to the BNL, which is argued to have a higher coverage of any given written text than the GSL and the AWL used together.

Table 4.17 Comparison of the LAC and the TAC distribution according to the BNL

Word list                     LAC tokens/%   LAC types/%   LAC families   TAC tokens/%   TAC types/%   TAC families
BNL 1**                       9289/43.05     130/3.76      86             70794/40.66    163/1.06      91
BNL 2                         5828/27.01     1073/31.07    519            38871/22.33    2049/13.28    636
BNL 3                         1462/6.78      458/13.26     238            11459/6.58     1141/7.40     416
BNL 4                         1176/5.45      316/9.15      172            10785/6.19     851/5.52      314
BNL 5                         797/3.69       259/7.50      154            6781/3.90      796/5.16      305
BNL 6                         968/4.49       313/9.06      159            9100/5.23      1004/6.51     281
BNL 7                         289/1.34       102/2.95      68             2426/1.39      340/2.20      136
not in the lists (off-list)   1766/8.19      802/23.23     *NA            23877/13.72    9081/58.87    *NA
Total                         21575          3453          1396           174093         15425         2179

*NA: The RANGE program cannot calculate word families of off-list words, as these words do not appear in any of the three lists (General Service List 1, General Service List 2 and Academic Word List).
**BNL1 – function words.

Table 4.17 shows that more words from both the LAC and the TAC appeared in the seven sub-lists of the BNL than in the GSL and the AWL, which is consistent with the claim of higher coverage. The LAC population used 1,396 of the 2,709 word families in the BNL as opposed to 2,179 word families employed by the target population. The percentage of the off-list word types in the TAC was still high when the BNL was used. It should be kept in mind that words like ‘dissertation’, ‘discourse’, ‘encompass’, ‘interview’, ‘novel’, ‘manifest’ and ‘era’, which are abundant in the TAC, and which are commonly used in academic texts, are all off-list words in the GSL and the AWL. Of these words, ‘interview’ and ‘novel’ are in the BNL, while ‘dissertation’, ‘discourse’, ‘encompass’, ‘manifest’ and ‘era’ are off-list. These words are quite important as ‘novel’ is a synonym for ‘new, original’, ‘encompass’ for ‘include’, ‘manifest’ for ‘demonstrate, show’ and ‘era’ for ‘period’. It should be noted here that of these frequently used words, ‘discourse’ occurs only once in the LAC, and ‘era’, ‘novel’, and ‘encompass’ do not occur at all. It seems that, unlike the writers in the LAC, the TAC writers’ use of the language goes beyond what is offered by wordlists. The insufficient lexico-grammatical range of the post-graduate candidates, and the resulting difficulty in finding synonyms when producing their own text, was also confirmed by the data from the student questionnaires, where the students complained that this deficiency caused them to employ the same words repeatedly in their writing. Having identified that the TAC writers’ lexico-structural range was beyond wordlists, it would be valuable to see which of the most frequent off-list words in the TAC also occurred in the LAC.

Table 4.18 Comparison of the off-list word families in the TAC and the LAC

The table very clearly shows that the majority of the off-list words exploited by the target population were almost non-existent in the LAC. This picture seems to suggest that word lists need to be supplemented by data and genre-based lists of words and lexico-grammatical patterns to widen and enrich the post-graduate candidates’ range of productive knowledge and thus, use of lexico-grammar. If the same example ‘novel’ is considered again, the data confirmed that the post-graduate students did not use this word even once. Yet, ‘novel’ is a useful word which can be used synonymously with ‘new’ and ‘original’, and the post-graduate candidates taking ENGL501 would benefit from using such words in their theses. To examine the lexico-grammatical performance of the post-graduate writers in the LAC and the TAC, it is necessary to compare the two corpora in terms of collocations, colligations, and lexico-grammatical patterns in general.


Table 4.19 Comparison of the TAC and the LAC – collocates of ‘study’ (n) (one left, one right)

The TAC
Adjectives: archaeological, architectural, case, cash-flow, close, comparative, comprehensive, corpus-based, cross-cultural, current, detailed, disciplined, empirical, ethnoarchaeological, ethnographic, experimental, exploratory, field, further, in-depth, independent, intensive, interdisciplinary, longitudinal, main, original, pilot, preliminary, present, previous, qualitative, research, retrospective, serious, social, special
Verbs: adds, addresses, aims/aimed, analyzes/analyzed, applies, argues, arises, assessed/assesses, assumes, attempts, began, brings, claims, clarifies, combines, compares, complements, concludes, conducted/conducts, considered/considers, consisted/consisting, constitutes, contributes, corroborates, covers, demonstrate/demonstrated/demonstrates, describes, determines, develops, discerns, discusses, emphasizes, employed, encompass, enhances, establish/established, examined/examines, explores/explored, extends, failed, featured, fills, finds/found, focuses, had/has, illustrate, include/includes, indicated/indicates, initiated, investigated/investigates, involving, lasting, may, measured

The LAC
Adjectives: case, descriptive, documents, experimental, field, present, time
Verbs: aims, applied, argues, attempts, based, can/could, consists, develops, examines, explores, focus, found, has, indicate/indicates, investigates, is/was/are/were, offers, preferred, provides, results, tries, showed/shown, suggest, will


The table clearly indicates the difference in the range of vocabulary as regards the adjectival and verbal collocates of only one word, ‘study’. Both in terms of the adjectives and the verbs used with ‘study’, the TAC was very rich and varied, achieving different functions and moves. The collocation ‘study employed’ is most probably used to talk about methodological procedures, while ‘study reveals’ is used to report findings. In the same vein, ‘original study’, ‘pilot study’, and ‘interdisciplinary study’ have their specific functions, uses, and therefore messages. Compared to the TAC, the adjectival and verbal collocates of ‘study’ in the LAC lacked variety and sophistication. In contrast to the 36 adjectives used in the TAC to modify ‘study’, only 7 such modifiers occurred in the LAC. The difference in the range of the verbs used with ‘study’ in the two corpora was much greater, as can be seen in Table 4.20. As regards the collocates of ‘studied’ in pre-position, a similar difference can again be observed between the TAC and the LAC.

Table 4.20 Comparison of the TAC and the LAC – collocates of ‘studied’ (pre-position)

The LAC: be/been/was/is, has, not
The TAC: is/was/be/were/are/been, has, not, also, actively, carefully, commonly, previously, well, widely



According to Table 4.20, the TAC again exhibited wider variety in terms of collocates of ‘studied’ in the pre-position. One striking difference, as can be seen from the data, was how commonly adverbs were used by the writers in the TAC to modify the participle form ‘studied’. In the LAC, however, the participle form ‘studied’ was not modified by an adverb at all in pre-position. Therefore, it is clear that relevant input and practice need to be an essential component of the pedagogic corpus, the construction of which is the ultimate aim of this study.
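The extraction of collocates one position to the left and one position to the right of a node word, as reported in Tables 4.19 and 4.20, can be sketched as follows. The function name `adjacent_collocates` is an assumption, and the sketch performs no part-of-speech tagging, so sorting the collocates into adjectives, verbs or adverbs would remain a manual step. It also ignores sentence boundaries, which concordancing software may treat differently.

```python
from collections import Counter


def adjacent_collocates(tokens, node):
    """Count the words occurring immediately before and after each
    occurrence of `node` in a list of tokens."""
    left, right = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        if i > 0:
            left[tokens[i - 1]] += 1          # one position to the left
        if i + 1 < len(tokens):
            right[tokens[i + 1]] += 1         # one position to the right
    return left, right
```

Running the function separately on each corpus and comparing the two sets of left-hand collocates for ‘studied’, for instance, would show whether adverbs such as ‘widely’ occur in pre-position in one corpus but not in the other.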

So far, the differences between the TAC and the LAC have been explored at the level of individual words, and in terms of the collocates of words. The following table compares the use of the lexical item ‘result’ in the TAC and the LAC at the lexico-grammatical level.


Table 4.21 Comparison of the TAC and the LAC – Lexico-grammatical patterns (result)

The LAC
the result will be ……
as a result of …….
the result expected from …. will be
the result of this…
*if no result …
the result of the study will ……
*no special result could …
*leads to very helpful result
*The result shown that …
*to show the result of the dissertation.
*were listed according to result of the questionnaire.
*According to question 7 result, participants agreed that …
As a result, …

The TAC
As a result, (linking device)
as a result of both …….
can result in the …….
were not only the result of … but also almost certainly a consequence of ….
can result from a ………
is often the most favorable result of ……..
as the unexpected result of ………..
The result is a description of …….
The principal result is that ……..
is not so much a result of ….. as ……..
A major result of this study shows that ……
One important result is the lack of ……….
This last result has implications for …
This result is surprising because the common perception in the literature is that …….
This finding contradicts the expected result, which is that …….
A 'case result format' presents the results of the …..
There is a dual result.
Another result is the proposal for ………
This result is due mainly to the ……, as opposed to …..
Conclusions result from published studies in ….; from research conducted in the 1980s concerning …….; and from experience in ……. that has been in continual operation online since …….
…. and often result in ……
The result is a multiplicity of different and in some cases contradictory….. approaches.
one interesting empirical result of the research was that ………..
This result has broad implications for … research.


Table 4.21 needed to be analysed in detail, so that a clearer and more comprehensive picture could be offered of the differences between the LAC and the TAC as regards the lexico-grammatical patterns employed by their corresponding populations. As the data show, the writers of the LAC used the item ‘result’ in 13 different lexico-grammatical patterns. Of these, 7 were used inaccurately. These inaccurate realisations of the item are dealt with individually below. The first pattern, ‘if no result’, seemed to have no agent, and therefore did not convey the intended message. The second pattern, ‘no special result could …’, was problematic as ‘special’ and ‘result’ are not good collocates, and this collocation does not occur even once in either the 1,007,000 word British National Corpus (BNC Written) or the 1,000,000 word Brown Corpus. The third pattern, ‘leads to very helpful result’, was problematic in two ways. Firstly, the item ‘result’ either needs to have the determiner ‘a’, or needs to be in the plural, ‘results’. Secondly, ‘helpful’ is not a common collocate of ‘result’. According to the BNC Written, some of the collocates of ‘result’ in the pre-position are ‘main’, ‘major’, ‘inevitable’, ‘overall’, ‘probable’, ‘important’, ‘significant’, ‘powerful’, ‘direct’, ‘useful’, and ‘expected’.

Another inaccurate lexico-grammatical pattern was ‘The result shown that …’. In spite of the fact that the GSL 1 word ‘show’ is very commonly used, and encountered during the initial stages of language learning, the past tense of the verb was inaccurately formed. ‘To show the result of the dissertation’ sounds odd, as researchers generally discuss the result of the research or the study, but not of the dissertation or the thesis. Although it does not lead to a breakdown of communication, another flawed lexico-grammatical structure was ‘were listed according to result of the questionnaire’. A determiner (‘the’) and the plural form of ‘result’ are necessary to make the message more comprehensible. The last pattern that sounds unnatural was ‘according to question 7 result, participants agreed that…’. The problem is related to the formation of the noun phrase, which should be ‘the results of question 7’, or at least ‘question 7 results’. In some languages, including Turkish, which is the mother tongue of 40% of the participants, noun phrases are formed by putting the head noun after the modifying noun. Thus, this inaccurate use may be due to first language influence for some post-graduate candidates.

When the lexico-grammatical patterns used by the writers of the TAC were analysed, however, it could be observed that the target population used ‘result’ in a range of structures, and as different parts of speech. The most striking difference was that in the TAC, ‘result’ was used abundantly as a verb to show causal relationships, ‘result in’ to refer to an effect, and ‘result from’ to indicate a cause. This use was non-existent in the LAC. Another major dissimilarity was the use of ‘result’ in accurate and appropriate lexico-grammatical patterns in the TAC. One of these was ‘a result having implications’, another was ‘a finding contradicting an expected result’. The adjectives ‘favorable’, ‘principal’ and ‘unexpected’ were accurately and appropriately used to modify ‘result’. Furthermore, in the TAC, the word ‘result’ was employed with more complex structures such as ‘were not only the result of … but also almost certainly a consequence of ….’, ‘The principal result is that ……..’ and ‘This result is due mainly to the ……, as opposed to …..’.

The findings revealed by the analysis of the abundant data point to the fact that the abstracts in the LAC exhibit a more limited range of vocabulary, and an apparently more limited productive knowledge of the collocations and colligations of even relatively common items, resulting in difficulty for the writers in producing coherent and appropriate academic text. Therefore, the last section of this chapter describes the construction of the pedagogic corpus, and the integration of the corpus-informed data and tasks focusing on the lexico-structural patterns required to achieve, with accuracy and appropriacy, specific moves in abstracts, and in thesis writing in general. The pedagogic corpus and its components are envisaged to enhance the learning outcomes, and to assist the non-native post-graduate candidates living in a non-English speaking environment in producing coherent and appropriate academic text.

Comparison of the LAC with an equal-sized TAC

The data analysis demonstrated that the post-graduate students’ written work compiled in the LAC exhibited a more limited range of vocabulary, and an apparently more limited productive knowledge of lexico-grammatical patterns commonly used in thesis writing. Yet, one factor affecting the data might be that the two corpora were not of equal size, with 100 abstracts in the LAC as opposed to 600 in the TAC. The rationale for having a larger TAC was to be able to derive as many alternative lexico-grammatical patterns as possible for the fulfilment of the sub-moves and moves required in abstract writing, and thesis writing in general. It seemed imperative at this point, however, to reduce the TAC to the size of the LAC by randomly selecting 25 abstracts from each sub-corpus in the TAC, making the two corpora equal in size in terms of the number of abstracts (both comprising 100 abstracts). The objective was to compare the lexical range of the two writer categories more objectively, and to further substantiate the finding that there is in fact a gap between what the post-graduate students can actually produce, and what they are expected to produce. Graph 4.2 below compares the two parallel corpora composed of the work of post-graduate EFL students writing theses in a non-English-speaking country, and that of students writing them in English-speaking countries.
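The reduction of the TAC described above, with 25 abstracts drawn at random from each sub-corpus, can be sketched as a simple stratified sample. The function name `sample_equal`, the data layout and the fixed seed are assumptions for illustration; the study does not specify how the random selection was implemented.

```python
import random


def sample_equal(sub_corpora, per_sub_corpus=25, seed=0):
    """Draw the same number of abstracts, without replacement, from each
    sub-corpus. sub_corpora maps a sub-corpus name to a list of abstracts."""
    rng = random.Random(seed)  # fixed seed so that the sample can be reproduced
    sample = []
    for name in sorted(sub_corpora):
        abstracts = sub_corpora[name]
        if len(abstracts) < per_sub_corpus:
            raise ValueError(f"sub-corpus {name!r} has fewer than {per_sub_corpus} abstracts")
        sample.extend(rng.sample(abstracts, per_sub_corpus))
    return sample
```

With four sub-corpora, the call returns 100 abstracts, matching the number of abstracts in the LAC. The resulting corpora are equal in the number of abstracts, not necessarily in the number of running words.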

Graph 4.2: Comparison of the LAC with a relative sample of the TAC


As can be observed, the graph clarifies the difference between the LAC and the TAC, and further confirms the existence of the already identified gap. The post-graduate writers of the TAC used 5,206 different words in producing their abstracts, in contrast to the 3,392 different words used by the post-graduate students studying at the Eastern Mediterranean University. Considering that the two corpora were exactly equal in size, in terms of the number of abstracts, these figures point to a difference of 1,814 more words used in the TAC, which is a substantial disparity. The examination of these two equal-sized corpora yielded information not only on the extent of the gap, but also on its composition. Graph 4.2 also shows the text coverage in the two corpora. According to the graph, 95% of all the text produced in the LAC was covered by 2,311 words. In contrast, in the TAC, 3,798 different words made up 95% of the text. More strikingly, in the LAC, a mere 500 different words were used to produce 72.24% of the text, whereas in the TAC 500 words made up 63.87% of the text. All the data seem to substantiate the fact that the non-native post-graduate students at EMU used the most common words of English very frequently in writing their theses, and that their lexical range was quite limited, compared with post-graduate students writing their theses in English-speaking countries. As for the word families used in the equal-sized corpora, a variation can yet again be observed. It should be remembered at this point that the six content word bands of the BNL are ordered according to the frequency of occurrence of word families, with the first content word band including the most common ones. When the BNL word families were used to compare the two corpora, it was clear that the writers in the TAC made more use of the six content word bands. The bar graph below provides comprehensive data regarding this observation.


Graph 4.3: Comparison of the LAC and the TAC in terms of BNL2709 word families

According to Graph 4.3, the function words of English (BNL, Band 1) were used equally in the two corpora. Similarly, the word families in Band 2 (the first band of content words), which are the most frequently used word families in English, were almost equal. After Band 3 (the second band of content words), however, the number of content word families used in the LAC was consistently lower than that in the TAC. Finally, when the totals are analyzed, it can be seen that the post-graduate students at EMU used 200 fewer word families of the BNL. It is important to emphasize that this difference was not in terms of word tokens, or word types, but word families. To give an example, the family members, or the word types, of the headword ‘amend’ in the sixth band are ‘amended’, ‘amending’, ‘amendment’, ‘amendments’, and ‘amends’. The data can be analyzed even further to observe the difference in terms of the word types in these families. The bar graph below shows the broad dissimilarity:

Graph 4.4: Comparison of the LAC and the TAC in terms of BNL2709 word types

Graph 4.4 indicates that the disparity in terms of word types is even more significant. From the first band of content words (Band 2), 1,306 word types were used in the TAC, as opposed to 1,124 word types in the LAC. The figures were lower for the LAC in all six content word bands, and the gap gets wider in each band, considering that the number of word types decreases as the bands go up. For instance, compared to the 504 word families in Band 2, the number of families in Band 5 is 392, and in Band 6, 186. The case of off-list words was also quite striking. The TAC included 1,843 off-list word types, while in the LAC, the number of off-list word types was 697. This means that the TAC includes more than twice as many off-list word types as the LAC. Considering that off-list words are less frequent and more specialized words, it can be concluded that the post-graduate students studying in English-speaking countries have a wider productive knowledge of less frequent, and more specialized words.

Graph 4.4 therefore confirms that the LAC is quite poor in terms of the types of words employed, 3,453 word types in contrast to the 5,346 different types of words in the TAC. The data from the two graphs (4.3 and 4.4) seem to suggest that the problem of the post-graduate students studying at EMU stems from the insufficient productive knowledge of the word types belonging to different families, and the resulting inability to exploit these different types. This can be more clearly observed in Table 4.22 below:

Table 4.22 Comparison of the LAC and the TAC in terms of BNL2709 bands and word types

Types   one   two    three   four   five   six   seven   not in the lists   Total
TAC     134   1306   583     462    406    477   135     1843               5346
LAC     133   1124   464     349    274    317   95      697                3453

As stated earlier, the use of different function words (Band 1) in the two corpora is almost exactly the same. The discrepancy is observed in all the content word bands, and the gap gets larger as the bands go up. If Band 5 is considered, for instance, the figures show that 406 different words from this band are used in the TAC, compared to 274 in the LAC. The analysis of the word types and the word families provides in-depth data on the words exploited in the TAC as opposed to the LAC. A very good example is the case of the headword ‘constitute’. In the whole LAC, this word is used only as ‘constituent’, whereas in the TAC, the following three types of the word are utilized: ‘constituents’, ‘constitutes’, and ‘constitutive’. A list of all the family headwords used in both corpora, together with the word types employed in each corpus, is provided in Appendix J.
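The comparison of word types within families, of the kind listed in Appendix J, can be sketched as follows. The function names `family_types_used` and `compare_families` are assumptions, and the family shown in the usage note is the ‘amend’ family as described earlier in this section rather than an excerpt from the BNL files themselves.

```python
def family_types_used(tokens, families):
    """Return {headword: sorted word types of that family attested in tokens}.

    families maps a headword to a list of its other family members.
    """
    lookup = {}
    for headword, members in families.items():
        for form in {headword, *members}:
            lookup[form] = headword

    used = {}
    for token in set(tokens):
        headword = lookup.get(token)
        if headword is not None:
            used.setdefault(headword, set()).add(token)
    return {headword: sorted(forms) for headword, forms in used.items()}


def compare_families(lac_tokens, tac_tokens, families):
    """Side-by-side listing of the family members attested in each corpus."""
    lac = family_types_used(lac_tokens, families)
    tac = family_types_used(tac_tokens, families)
    return {
        headword: {"LAC": lac.get(headword, []), "TAC": tac.get(headword, [])}
        for headword in sorted(set(lac) | set(tac))
    }
```

For example, with `families = {"amend": ["amended", "amending", "amendment", "amendments", "amends"]}`, the result shows which members of the family each corpus actually employs. A family counts once towards the family total however many of its types occur, which is why a gap in word types can be larger than the corresponding gap in word families.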

The analysis of the equal-sized LAC and TAC further substantiated the existence of the gap. The post-graduate students’ written work compiled in the LAC exhibited a more limited range of vocabulary, and a more limited productive knowledge of lexico-grammatical patterns commonly used in thesis writing. The comparison of the two corpora of equal size added further insights into the depth and composition of the gap. One major finding was that the problem seemed to stem significantly from the fact that the post-graduate students studying at EMU had insufficient knowledge of the different types of words in each word family, or their knowledge of these words did not extend to the productive level. The next section focuses on the construction of a pedagogic corpus comprising various corpus-informed components, which is envisaged to provide a lexico-grammatical roadmap and to assist the post-graduate students studying at EMU in thesis writing.

4.3 Constructing a Pedagogic Corpus for Thesis Writing

This section focuses on what the cross-examination of the two corpora necessitates in terms of the comprehensive pedagogic corpus design. Therefore, this part concentrates on the contribution of this research study to the research field.



The development of the TAC (Target Abstract Corpus) Wordlist

The data from the LAC and the TAC, as well as the comparison of the LAC and the TAC, reveal that the post-graduate writers of the LAC exploited a restricted range of vocabulary in writing their theses. In addition, although they generally know words in isolation, they have problems using them together with other words accurately and appropriately. In other words, they seem to lack lexico-structural competence. The assumption is, therefore, that students at this level have prior knowledge and should know the most common words in English, although sometimes they are still not able to use some of these words accurately or appropriately. This may be due to the fact that quite a large number of very commonly used words (e.g. ‘study’, a GSL1 word) require the company of different words when used in academic texts. Taking this into account, a number of actions were taken.

Firstly, one of the major differences between the TAC and the LAC was the wide range of expression in the TAC as opposed to the restricted range in the LAC, as exhibited by the output of the quantitative as well as the qualitative analysis. Thus, the pedagogic corpus needed to provide a variety of patterns to enable the post-graduate students to express similar concepts in a variety of ways. At the same time, the pedagogic corpus had to be fairly restricted and economical, as the teaching-learning process is bound by space and time and the objective is to fulfill a set of pre-defined learning objectives in a limited time frame. In addition, it needed to be constructed on the basis of the most frequently communicated meanings in the TAC. Teubert (2005) emphasizes the fact that corpus linguistics focuses on meaning and adds that “meaning is what is being verbally communicated between the members of a discourse community” (p. 2). Moreover, the patterns also needed to be selected on the basis that they were fundamental to the realization of specific strategies, tactics or sub-moves that make up the required academic moves in abstract writing specifically, and thesis writing in general.

As mentioned in the earlier sections of this study, abstracts, like other genres reporting research, have an IMRD (Introduction-Method-Results-Discussion) structure (Swales, 1990, p. 181). What is significant is that these moves recur throughout thesis and research writing in general. Therefore, abstracts act as a miniature of the academic research genre as a whole, which makes them a powerful and useful research and teaching device. A close look at thesis abstracts reveals that the IMRD pattern in abstracts seems to comprise the following 4 moves, each achieved through some corresponding sub-moves:

• Introduction: An overview of the field / The establishment of a research gap that justifies the need for the research to be conducted and its likely value and significance to the field in general / A statement of aims and objectives;
• Method: The research methodology, data collection techniques, and references to data;
• Results: The analysis of the results;
• Discussion: Conclusions, evaluation, implications, recommendations.

As stated, it was fundamental to the design of the AAC (Academic Abstract Corpus) Bank of Moves that a limited bank of lexico-structural patterns was created that would enable the post-graduate students to accomplish these moves in abstracts, and in the reporting of research in general, coherently and appropriately. It was important to ensure that the use of this limited bank of patterns would assist students at the post-graduate level with thesis writing specifically, and academic writing in general.

Both quantitative and qualitative methods were used for the analysis of the data in the construction of the AAC Bank. For range and frequency, the RANGE software was used. To analyze collocations and colligations, word clusters, and lexico-structural patterns in general, Concordance software and AntConc were utilized. Additionally, some decisions based on insights from the BNC (the British National Corpus) were made in the selection and categorization of the data, which would eventually form the basis of the AAC Bank, the core of the pedagogic corpus. Thus, the selection and the categorization of the data were carried out fairly qualitatively. It should again be pointed out that one important consideration for corpus-based approaches is that the analyses should go beyond simple counts of linguistic features, and include qualitative, functional interpretations of quantitative patterns (Biber et al., 1998, pp. 4-5). The construction of the AAC Bank is described below step by step.

After the target corpus was quantitatively analyzed for frequency and range, a number of filtering devices were employed to reduce the corpus to a manageable size. Some of this work, such as the elimination of all function words, was carried out automatically. Items also had to appear in all four sub-corpora to qualify for inclusion, unless there was a valid reason to include them. Then, more qualitative decisions had to be made regarding the membership of the words in the four major moves in the IMRD pattern. Keeping in mind that the ultimate aim was constructing a pedagogic corpus as opposed to a research corpus whose reliability depends on the wealth of data, the selection of words for the AAC Bank was made in the most economical way possible. Therefore, the first step involved the identification of a highly restricted but key set of individual word families used in abstract writing. The behavior of these items then had to be examined in terms of collocations and colligations to identify a key set of lexico-structural patterns used in abstract and thesis writing. These patterns then had to be related to the moves involved in the structure of abstracts, and the thesis as a whole. The final product had to be further developed through comparison with the LAC, so that the final version would attend to the identified needs of the post-graduate candidates to the greatest possible extent.

The selection of the most frequent words could have been easily carried out on the basis of the word frequencies in the TAC. However, this would mean that the data would be deprived of word family frequencies. The use of wordlists, on the other hand, enables the observation of family frequencies in corpora. Furthermore, as previously mentioned, some GSL words are very commonly used in academic texts, and the separation of the most frequent words as GSL and AWL, despite its convenience, may be problematic in the long run. Gilquin, Granger and Paquot (2007, p. 324) point out that “Coxhead’s (2000) Academic Word List does not include the 2,000 most common English words, with which non-native writers may still have considerable difficulties, especially in cases where their use in academic writing differs from their habitual use”. Paquot (2007) also challenges “the widely used criterion of non-appearance in the GSL for the selection of EAP vocabulary”. The selection, therefore, was based on the GSL and the AWL, so that data could be obtained on which GSL word families were used abundantly in academic texts, and how.

The GSL 1 comprises the most frequent 1,000 words, accounting for some 70% of all running text in English, and is composed of the lexical items that appear in most genres of language use, and are most likely to be known by an advanced group of students. GSL 1 and GSL 2 words are the ones that advanced post-graduate students are most likely to have encountered, learned and used. It is awareness of and exposure to less frequent words that is called for. In academic environments, complex ideas are expressed in similarly rich and complex language. Bearing this fact in mind, GSL 1 words could be eliminated automatically, unless they were clearly fundamental to academic writing. It was also kept in mind that subsequent analysis of colligations and collocations would bring back key members of the family into the final output. Furthermore, once the basic keyword list was established and organized according to a semantic basis dependent upon the IMRD pattern in abstracts and theses in general, it would be necessary to review all the lists once more to locate those items that performed a more or less synonymous function.

The aim of constructing a pedagogic corpus is to provide advanced students with authentic material that depends on semantic frequency, and that will offer them a variety of alternative sophisticated patterns required in academic writing. Therefore, categorization is necessary not in terms of frequency, but in terms of functions realized by language in thesis writing. Nevertheless, as stated earlier, there are some very common GSL 1 content words used in academic texts, and the ones that occurred in the data frequently had to be included in the analysis.


As the pedagogic corpus needed to be restricted and economical by nature to cater for the needs of the students realistically, there had to be a cut-off point in the selection of word families that would be the basis of the AAC Bank of moves. Therefore, the GSL 1 word families that occurred at least twenty times in each of the four sub-corpora of the TAC, and at least one hundred and fifty times in total, qualified for inclusion. Of the 419 content word families in the GSL 1 list, 46 (about 11%) survived the filtering, and therefore formed the list of GSL 1 word families fundamental to the construction of key moves in abstracts (Appendix K).

The GSL 2 words, the second 1,000 most common set of words in English, form a considerably smaller proportion of running text in English. Therefore, the criteria for inclusion were set at a lower level as these words occur at a generally lower level of frequency in any text. The same procedure was followed and all those word families that occurred at least fifty times in the TAC as a whole, and at least ten times in each sub-corpus were included for further analysis. 13 word families emerged (Appendix K).
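The range-and-frequency criteria described above for the GSL 1 and GSL 2 families can be expressed as a simple filter. The following is only a minimal sketch in Python and not the procedure actually run with the RANGE software: the names `Threshold`, `qualifies` and `filter_families` and the toy counts are hypothetical, while the threshold values are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    min_per_subcorpus: int  # minimum occurrences in EACH sub-corpus (range)
    min_total: int          # minimum occurrences in the TAC as a whole (frequency)


# Cut-off points as stated in the text.
THRESHOLDS = {
    "GSL1": Threshold(min_per_subcorpus=20, min_total=150),
    "GSL2": Threshold(min_per_subcorpus=10, min_total=50),
}


def qualifies(counts, threshold, n_subcorpora=4):
    """Return True if a word family meets both the range and frequency criteria.

    counts: occurrences of the family in each of the sub-corpora.
    """
    if len(counts) != n_subcorpora:
        raise ValueError(f"expected {n_subcorpora} sub-corpus counts")
    return (min(counts) >= threshold.min_per_subcorpus
            and sum(counts) >= threshold.min_total)


def filter_families(family_counts, list_name):
    """Return the alphabetically sorted families that survive the filtering."""
    threshold = THRESHOLDS[list_name]
    return sorted(family for family, counts in family_counts.items()
                  if qualifies(counts, threshold))


# Invented counts, for illustration only (not data from the TAC).
toy_gsl1 = {
    "family_a": [40, 35, 50, 30],  # passes: at least 20 everywhere, 155 in total
    "family_b": [60, 60, 60, 15],  # fails: only 15 in one sub-corpus
    "family_c": [25, 25, 25, 25],  # fails: 100 in total
}
print(filter_families(toy_gsl1, "GSL1"))
```

Both conditions have to hold at once, which is why a family that is very frequent in three sub-corpora but rare in the fourth is still excluded.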

The corpus which the AWL is based on extends beyond the boundaries of abstracts, and comprises academic textbooks and research articles. The Academic Word List was therefore expected to generate more keywords for inclusion than the GSL 2. Setting up the selection criteria for the AWL at exactly the same level as for the GSL 2 (at least fifty times in the corpus as a whole and at least ten times in each sub-corpus) indeed produced a more extended list of academic word families. The only exception was the word family ‘objective’, which occurred 40 times in the data and thus fell short of the threshold. However, during the analysis of the off-list words, it was observed that ‘objectives’ was listed as an off-list word with 18 occurrences. The family was therefore included, as together the two forms occurred 58 times. Consequently, 85 academic word families were drawn from the AWL (Appendix K).

Off-list words are words that do not appear in any of the frequency lists, and therefore are less common in any text by definition. However, the fact that they are less common should not mean that they are to be ignored or deleted. Prior analysis of the two corpora revealed that some off-list words could occur in thesis abstracts and theses in general quite frequently. The filtering was thus carried out on a more generous basis. A word could be included on condition that it occurred at least 30 times in total with at least 5 occurrences in each sub-corpus, or a word merited entry if there were at least 20 occurrences as a family, and at least one member of the family occurred in all four sub-corpora. As off-list words do not occur in families, this procedure was carried out manually. There were some unexpected findings. Firstly, ‘organizational’ and ‘organizers’ occurred as off-list, although ‘organize’ is a GSL 1 word. In the same way, although ‘objective’ is an AWL word, ‘objectives’ appears off-list. The word ‘usage’ appears off-list, and not together with ‘use’, and the word ‘technologies’ does not appear together with ‘technology’. It can be speculated that in the past ‘technology’ was uncountable as breakthroughs could be followed more easily, but today, because of the almost daily innovations in every branch of technology, the word has started to be used in the plural. Once these words were identified, they were manually placed in their respective families in the GSL or the AWL in the lemmatiser for further analysis. One exception is ‘large-scale’. Although it occurs only 23 times, it was included as it was perceived as a common and useful word in academic writing. Twenty-one families of off-list words were included for further analysis, making the total number of word families 165 (Appendix K).
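The two alternative conditions for off-list words, together with the possibility of a manual inclusion such as ‘large-scale’, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the manual procedure followed in the study: the function names, the placeholder words and all per-sub-corpus counts are invented, and only the thresholds, the total of 23 for ‘large-scale’ and the four list sizes are taken from the text.

```python
# Thresholds for off-list words, as stated in the text.
WORD_MIN_TOTAL = 30          # condition 1: total occurrences of a single word
WORD_MIN_PER_SUBCORPUS = 5   # condition 1: occurrences in each sub-corpus
FAMILY_MIN_TOTAL = 20        # condition 2: total occurrences of the family
N_SUBCORPORA = 4

# Number of word families retained from each list, as reported in the text.
LIST_SIZES = {"GSL 1": 46, "GSL 2": 13, "AWL": 85, "off-list": 21}


def word_rule(counts):
    """Condition 1: a single word is frequent and present in every sub-corpus."""
    return (sum(counts) >= WORD_MIN_TOTAL
            and min(counts) >= WORD_MIN_PER_SUBCORPUS)


def family_rule(members):
    """Condition 2: the family is frequent and one member occurs everywhere."""
    total = sum(sum(counts) for counts in members.values())
    one_member_everywhere = any(min(counts) > 0 for counts in members.values())
    return total >= FAMILY_MIN_TOTAL and one_member_everywhere


def offlist_qualifies(members, manual_inclusions=frozenset()):
    """members maps each word form to its counts in the four sub-corpora."""
    for counts in members.values():
        if len(counts) != N_SUBCORPORA:
            raise ValueError(f"expected {N_SUBCORPORA} sub-corpus counts")
    if any(word in manual_inclusions for word in members):
        return True  # qualitative decision overrides the numeric criteria
    return any(word_rule(c) for c in members.values()) or family_rule(members)


# Invented distributions, for illustration only.
family = {"form_one": [6, 2, 9, 4], "form_two": [0, 1, 0, 0]}  # 22 in total
print(offlist_qualifies(family))                                # condition 2 holds
# Only the total of 23 comes from the text; the split is hypothetical.
print(offlist_qualifies({"large-scale": [12, 11, 0, 0]},
                        manual_inclusions={"large-scale"}))
print(sum(LIST_SIZES.values()))                                 # 46 + 13 + 85 + 21
```

The explicit `manual_inclusions` argument is a design choice intended to keep qualitative decisions visible and separate from the numeric criteria.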

This analysis very broadly attempted to identify a set of word families that are assumed to have a key role in the fulfillment of moves and sub-moves in thesis abstracts. The outcome was a word family list comprising 165 items found most frequently in thesis abstracts. At this stage, a broad classification of these items into possible strategies, tactics, or sub-moves was necessary. However, before attempting any classification, it would be useful to consider a randomly extracted abstract from the TAC to observe how the IMRD pattern is actualized.

Table 4.23 The IMRD pattern in a randomly extracted abstract from the TAC

Title: An Archaeological Analysis of Gender Roles in Ancient Non-Literate Cultures of Eurasia
Country: Australia

Ascription of sex to inhumed remains on the principle basis of grave-goods, as distinct from anthropometric data, can be a vague process due to incipient gender bias in interpretation. Cross-matching of anthropometrics with grave goods can sometimes generate results that appear ambiguous or paradoxical as they may not accord with preconceived relationships between gender roles and sex. This reduces confidence in the demography of various archaeologically-revealed cultures, especially those of Iron Age Europe, which were erected on the basis of what we may now see as potentially flawed analysis. Comparative and contrasting analyses are made of contemporary and related cultures to investigate gender role assumptions on a wide basis. Regarding non-literate cultures, archaeologists have limited means to interpret the relationships between sex and gender-roles, and these methods are explored. The traditional outlook is assessed for functional bias in light of its origins and perpetuation, and a new synthesis is proposed for ongoing analysis. This synthesis includes strict application of refined anthropometric methodology and the resolution of paradox by adoption of a revised underlying hypothesis. A correlation is observed between use of the horse and a significant blurring of gender role stereotypes, occurring in nomadic cultures whose legacy persists to the present day. This is examined in light of the proposed new synthesis for a consequential or coincidental relationship, the former being apparent. It is found that gender role bias has played an uncomfortably large part in Iron Age scholarship, and that outdated sociocultural assumptions continue to foster an unstoppable view of elements of world history.






Table 4.23 demonstrates that the IMRD moves are present in the randomly chosen abstract. It is also worth examining the same abstract to observe which sub-moves help to realize the four moves, although it is very difficult to draw a line between them as there seem to be overlaps:

Table 4.24 The sub-moves of the IMRD pattern in a randomly extracted abstract (TAC)

Introduction: Setting the context / background; Stating the problem / opening a research gap; Statement of the aim
Ascription of sex to inhumed remains on the principle basis of grave-goods, as distinct from anthropometric data, can be a vague process due to incipient gender bias in interpretation. Cross-matching of anthropometrics with grave goods can sometimes generate results that appear ambiguous or paradoxical as they may not accord with preconceived relationships between gender roles and sex. This reduces confidence in the demography of various archaeologically-revealed cultures, especially those of Iron Age Europe, which were erected on the basis of what we may now see as potentially flawed analysis. Comparative and contrasting analyses are made of contemporary and related cultures to investigate gender role assumptions on a wide basis. Regarding non-literate cultures, archaeologists have limited means to interpret the relationships between sex and gender-roles, and these methods are explored.

Method: Methodology used; Hypothesis; Data analysis
The traditional outlook is assessed for functional bias in light of its origins and perpetuation, and a new synthesis is proposed for ongoing analysis. This synthesis includes strict application of refined anthropometric methodology and the resolution of paradox by adoption of a revised underlying hypothesis.

Results: Significant findings
A correlation is observed between use of the horse and a significant blurring of gender role stereotypes, occurring in nomadic cultures whose legacy persists to the present day. This is examined in light of the proposed new synthesis for a consequential or coincidental relationship, the former being apparent.

Discussion: Filling the research gap by generalizing the findings
It is found that gender role bias has played an uncomfortably large part in Iron Age scholarship, and that outdated sociocultural assumptions continue to foster an unstoppable view of elements of world history.


The TAC offers plenty of data with which to analyze thesis abstracts in all the four sub-corpora according both to moves and sub-moves. Randomly chosen abstracts from the TAC were therefore analyzed with the following results:

Table 4.25 The moves, sub-moves, and examples from random abstracts (TAC)

Introduction

Introducing the Field
- U.S. fisheries legislation requires National Marine Fisheries Service (NMFS) to attend to the critical social and economic issues surrounding the definition and identification of fishing communities, and to the effects that changes to the physical environment and regulatory decisions can have on such communities.

Referring to previous work in the field
- Some researchers have found that the influence of flowers promotes people positive emotions.

Opening up a Research Gap
- This thesis posits the need to integrate the design of landscape with the design of architecture.

Stressing the value of and need for the Study
- The question of how to define Falun Gong is not just an academic issue; the use of the cult label has been used to justify the persecution of practitioners in China.

Stressing the challenge of the study
- The solution of linear systems is an ancient and inexhaustible problem.

Opening up new links and relationships
- This thesis explores the integration, through ideas of reciprocity, of landscape and architecture.

Giving an overview of the thesis
- This dissertation is concerned with C*algebras associated with boundary actions obtained from graphs of groups...

Specifying the objectives and aims of the thesis
- The intention is to demonstrate the usefulness of a pragmatic approach to applied ethics

Methodology of the Study

Giving information about research methods
- Through research, analysis, design experimentation and application, this thesis demonstrates an integrated design strategy...

Giving information about the research site
- Research sites included Tampa, Washington D.C., and cyberspace…
- The site is an abandoned rail yard

Justifying choice of material and data
- The site and project were selected because they offered a good opportunity to explore the issues of designing...



FUNCTIONS/SUBMOVES and EXAMPLES

Stressing the novelty of the findings (filling the research gap)
- My findings are contrary to the allegations made by the Chinese Government and Western anti-cultists in many ways.

Stressing the significant results
- One important result is the lack of persistent similarities or differences between register domains.

Categorizing the results
- Our major results can be grouped into two categories: recognition of links between substitution method calculations and well-known results in other areas of mathematics, and the development of novel algorithms to exploit special structure.

Implications
- The results provide valuable insights for landscape designers seeking to evoke particular emotions or designing therapeutic environments for particular patient groups.

Conclusion/Suggestions
- The thesis closes with an explanation as to why progressive dispensationalism is more compatible with amillennialism than with premillennialism.

Implications for further study
- The results suggest that multidimensional analysis of stance is effective, and that further study of individual registers and dialects would be fruitful. Application to other functional areas is suggested.

It would be possible to conduct this process in even greater detail. For research purposes, however, it is sufficient to note that a thesis abstract contains a number of moves, some of which seem to be more or less obligatory, and some more or less optional depending on the specific field and institutional requirements. Each move can be realized in a number of ways, and through a number of discourse functions, or sub-moves. While certain lexico-structural patterns are more likely to appear within certain moves, and beyond that within certain sub-moves, some others appear in different places as integral parts of quite different moves and sub-moves. This is one obvious reason why a wordlist cannot be used as a detached mechanism with which to teach academic writing. What is essential, then, is to analyze and describe how key lexical items act in combination (i.e., lexico-structurally) to perform certain sub-moves, strategies or tactics which are the foundation of the specified moves.

The list which will be henceforward referred to as ‘the Target Abstract Corpus (TAC) Wordlist’ (Appendix K) formed the basis of the AAC Bank (Academic Abstract Corpus Bank) of lexico-structural items used to actualize sub-moves and moves. The list, the bank, the two corpora, the in-class and online course materials and accompanying tasks became the major components of the pedagogic corpus, in line with the main aim of the study, which was to construct a pedagogic corpus through incorporating corpus work into the advanced thesis writing course to assist the post-graduate students involved in research and publication.

As revealed through the analysis of the data, the post-graduate students are restricted by their range of vocabulary, and their productive knowledge of collocates and lexico-grammatical patterns. Therefore, the syntagmatic and paradigmatic levels in the construction of moves are analyzed, and a comprehensive bank of lexico-structural patterns to achieve moves in abstracts, and thesis writing in general is formed. Henry (2007) maintains that although a lot of research has been conducted on identifying moves and their order in specific genres, rather less attention has been paid to the presentation of key lexico-grammatical features of genres and moves. Henry believes in the value of presenting computer-based, corpus analyses of sentence level genre features to language learners, incorporating all the important syntactic patterns with all possible paradigmatic and syntagmatic variations and collocations found in each of the moves and strategies in a given genre. According to the results of his corpus study (Henry, 2007), the students exposed to the sentence level genre features were able to produce more effective samples of the genre in question. At this point, it is also worth recalling that according to Widdowson, the syntagmatic and the paradigmatic mode of organization allows the generation of infinite expressions from finite means and “is the essential source of the creativity and flexibility …. of human language” (1996, p. 34).

As a first step, the 165 word families in the TAC Wordlist (Appendix K), identified as key in abstract writing in particular and thesis writing in general, were tentatively put in different move categories. Some words such as thesis, data, structure, and system were felt to be applicable to all moves, and were therefore classified as a separate ‘All moves’ category. This tentative list proved to be very useful in the construction of the AAC Bank, as words from the list were checked against the 100-million-word BNC (British National Corpus) for synonyms in the academic register. Subsequent analysis of the TAC data was therefore based upon a sound and solid foundation.


The development of the AAC (Academic Abstract Corpus) Bank of Moves and Sub-moves

The AAC Bank of moves and sub-moves (see Appendix L) that was developed as the central part of the pedagogic corpus was based on the four main moves in abstracts, and in the reporting of research in general. These four moves, known as the IMRD (Introduction-Methodology-Results-Discussion) moves (Swales, 1990, p.181), help to organize the required information in a structured and meaningful way. The discourse structure of an academic genre such as an abstract is undoubtedly very important. Yet, it is language that is essential in realizing the moves. Many researchers and scholars draw attention to the key role played by ‘lexico-grammar’ in fulfilling generic moves (Flowerdew, 2000; Henry, 2007; Johns et al., 2006), and sub-moves, also referred to as ‘strategies’ (Henry, 2007, p. 3) or ‘tactics’ (Henry, 2007, p. 7).

The main aim of developing the AAC Bank in this research was to provide a wealth of lexico-structural patterns for the post-graduate students to coherently and appropriately fulfill the moves that are called for in abstracts, and in their theses. This study is pedagogic in nature, and the aim of the AAC Bank is to offer a roadmap for the students. As the participants have access to the corpora (both the TAC and the LAC) compiled by the researcher as well as larger corpora such as the BNC (British National Corpus), they have the tools and the opportunity to apply the Data-driven Learning principles and methodology, and go beyond what is offered to them.

Earlier in this chapter, the methodology for the compilation of the TAC Wordlist was described in detail, and the resulting 165 word families were tentatively categorized into moves. This list, together with the synonym facility used for finding semantically related words in the BNC (in this case, the academic sub-corpus of the BNC), was indeed very helpful in deciding which key words to seek in the TAC. As expected, some words like ‘economy’ and ‘technology’, which are more common in some disciplines than others, did not occur within the lexico-grammatical patterns reflecting the moves, although they occurred frequently enough in the TAC to deserve a place in the TAC Wordlist itself. These words and other words like these are emphasized in different ways in the materials.


The AAC Bank of moves was developed by picking up the key words, accessing all the semantically related words using the BNC, and searching the TAC for all the lexico-grammatical realizations functioning in the construction of a particular move. Therefore, frequencies were not a major consideration here, as the aim in the building of the bank itself was to offer the post-graduate candidates enough varied data to ensure more variety and flexibility in the creation of their texts, the lack of which was the problem observed by the researcher and reported by the previous course instructors as well as the participants themselves. Hence, high frequencies, though not essential in the achievement of the major aim, are indicated in moves 2 and 3 for purposes of illustration, but are not emphasized in the other moves of the AAC Bank, except for very significant ones.

Once these semantically related words, within their lexico-grammatical patterns, were categorized into different moves and sub-moves, the data were organized in tables. However, it is important to emphasize here that these tables are not traditional substitution tables, but instead maps that are semantically organized according to the move to be achieved. The post-graduate candidates, with the help of relevant tasks, are directed and guided in using the AAC Bank, and accurately and appropriately construct the required moves through using the lexico-grammatical patterns compiled in the Bank.

The AAC Bank is composed of four sections representing the four moves in abstract writing, and research writing in general. Each move in the AAC Bank is further represented by its relevant sub-moves, some of which are necessary and some optional. The nature of the research itself, together with the specific departmental and institutional requirements, determines the inclusion and the ordering of these sub-moves in the abstract. However, considering that the abstract is a miniaturized version of the thesis, it would not be wrong to say that all these sub-moves occur in the actual thesis. In fact, referring to a research abstract, Bhatia (1993) states that “an abstract, …, is a description or factual summary of the much longer report, and is meant to give the reader an exact and concise knowledge of the full article” (p. 78). In a similar vein, Salager-Meyer (1990) argues that in the abstract, the structuring of the full paper should be reproduced, and the moves fundamental and obligatory ‘in the process of scientific inquiry and patterns of thought’ reflected (cited in Hyland, 2004a, p. 64).

Bhatia (1993) holds that “in order to realize a particular communicative intention at the level of a move, an individual writer may use different rhetorical strategies” (p. 30). These strategies, also referred to as ‘tactics’, ‘sub-moves’, or ‘steps’ (Hyland, 2004a, p. 47) therefore help the writer to fulfill a required move. Bhatia (1993) describes the four moves in abstracts in the following way:

Introducing Purpose: This move gives a precise indication of the author’s intention, thesis or hypothesis which forms the basis of the research being reported. It may also include the goals or objectives of research or the problem that the author wishes to tackle.

Describing Methodology: In this move the author gives a good indication of the experimental design, including information of the data, procedures or method(s) used and, if necessary, the scope of the research being reported.

Summarizing Results: This is an important aspect of abstracts where the author mentions his observations and findings and also suggests solutions to the problem, if any, posed in the first move.

Presenting Conclusions: This move is meant to interpret results and draw inferences. It typically includes some indication of the implications and applications of the present findings. (pp. 78-79)

It should be emphasized that most of the abstracts establish the field, or set the background as the initial sub-move. As this sub-move is particularly subject-specific by nature, this is the only sub-move that is not represented in the AAC Bank of Moves and Sub-moves.

Description of the AAC (Academic Abstract Corpus) Bank of Moves and Sub-moves based on IMRD

It was emphasized in various sections of this study that a genre is made up of a series of moves, each of which “is a distinctive communicative act designed to achieve one main communicative function” (Hyland, 2004a, p. 47). Moves are fulfilled through sub-moves, also referred to as ‘strategies’ (Bhatia, 1993), or ‘steps’ (Hyland, 2004a; Swales, 1990). According to Hyland, “both moves and steps may be optional, embedded in others, repeated, and have constraints on the sequence in which they generally occur” (2004a, p. 47). It can be understood from Hyland’s statement that overlaps between sub-moves should not be unexpected.

Move 1: Introduction

Sub-move 1: Defining the Scope of the Study

The wealth of language derived from the TAC in the form of lexico-grammatical patterns for only one sub-move, ‘scope’, is exceptionally rich, offering a wide range of alternatives both syntagmatically and paradigmatically. The scope of the research tells the reader what the area of focus is, and thus it is important to reflect on all the aspects of the area the research is going to concentrate on, and the perspectives that are going to be considered. Thus, the scope of the research defines the boundaries of the study, and what the research has set out to do. It is important to note that some of the lexico-grammatical patterns used for defining ‘the scope of the research’ may also be relevant to stating ‘the aim of the research’.

According to the data, at the paradigmatic level, the scope is defined by the ‘product’, such as ‘thesis’, ‘dissertation’, ‘project’, etc., followed by a verb which draws the boundaries of the research, and describes what the research has set out to do. Some of these verbs are ‘provides’, ‘explores’, ‘examines’, and ‘outlines’, and it can be observed that the present tense is used. At the syntagmatic level, according to the data, various lexico-grammatical patterns are used in the TAC to define the scope. If the verb ‘demonstrates’ is taken as an example, it can be observed that the verb is used in three different ways in the TAC: it can be followed by a noun clause including a wh- word like ‘how’, by a noun clause introducing a statement, or by a noun phrase. These examples are presented in full to provide a clearer understanding of the scope of the research:

This study demonstrates how successive military leaders bolstered their position in the capital by proactively assimilating and adopting established symbols of traditional authority.

This research demonstrates that all stone raw materials in Sydney archaeological assemblages are available in the Sydney region, mainly from Tertiary and Quaternary gravel beds, and that these are widely scattered.

This research demonstrates the feasibility of mass production of integrated optical and potentiometric sensors with CMOS circuitry on the same chip.

The first example informs the reader that the area of focus in the research is the method used by military leaders to strengthen their position in the capital city of a specific country. The scope gives enough information about the area of research. In this case, a political or a sociological perspective is communicated to the reader. In the second example, the study is of an archaeological nature, and the focus is the sites where the raw materials of the Sydney archaeological sites are found. The third example clearly comes from a study in computer engineering or mechanical engineering, and the scope of the study is whether it is possible to mass produce integrated optical and potentiometric sensors with CMOS circuitry on the same chip. These examples demonstrate that different lexico-structural patterns are available to writers in communicating, in this case, the scope of their research study.
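The three complementation patterns of ‘demonstrates’ illustrated above could be told apart automatically by inspecting the word that immediately follows the verb. The sketch below, in Python, is a rough heuristic offered for illustration and not the procedure used with the concordancing software in this study; the function name `complement_type` and the list of wh- words are assumptions. It would, for instance, mislabel ‘that’ used as a determiner (‘demonstrates that approach’) as a that-clause.

```python
import re

# Assumed set of wh- words that can introduce a noun clause after the verb.
WH_WORDS = {"how", "what", "why", "whether", "which", "where", "when", "who"}


def complement_type(sentence, verb="demonstrates"):
    """Classify what follows the verb, judging by the next word only."""
    match = re.search(rf"\b{re.escape(verb)}\s+(\w+)", sentence,
                      flags=re.IGNORECASE)
    if match is None:
        return None  # the verb does not occur in the sentence
    next_word = match.group(1).lower()
    if next_word in WH_WORDS:
        return "wh-clause"
    if next_word == "that":
        return "that-clause"
    return "noun phrase"


# The three TAC examples quoted in the text.
examples = [
    "This study demonstrates how successive military leaders bolstered their "
    "position in the capital by proactively assimilating and adopting "
    "established symbols of traditional authority.",
    "This research demonstrates that all stone raw materials in Sydney "
    "archaeological assemblages are available in the Sydney region, mainly "
    "from Tertiary and Quaternary gravel beds, and that these are widely "
    "scattered.",
    "This research demonstrates the feasibility of mass production of "
    "integrated optical and potentiometric sensors with CMOS circuitry on "
    "the same chip.",
]

for sentence in examples:
    print(complement_type(sentence))
```

The same function can be applied to other scope verbs by passing a different `verb` argument, for example `complement_type(sentence, verb="examines")`.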

Sub-move 2: Identifying a Research Gap

This sub-move is an attempt of the researcher or the writer to “establish a niche for about-to-be-presented research” (Swales, 1990, p. 154). It is possible to initiate this step with ‘an adversative sentence connector’, like ‘however’, ‘nevertheless’, ‘yet’, ‘unfortunately’, and ‘but’, and these are used to ‘indicate a gap’ (Swales, 1990, p. 154). Swales argues that in indicating a research gap which the writer is intending to fill through his/her research study, “the author does not counter-claim that the previous work is hopelessly misguided, but rather ‘suffers from some limitations’” (1990, p. 154). Linguistically, this sub-move may be realized through ‘negative or quasi-negative quantifiers’, like ‘no’, ‘little’, ‘none’, ‘few’, or ‘lexical negation’, through the use of such words as ‘fail’, ‘lack’, ‘overlook’, ‘misleading’, ‘limited’, and ‘failure’ (Swales, 1990, pp. 154-155). A niche can also be established through a ‘direct or indirect question’, ‘expressed needs, desires, interests’, ‘logical conclusions’, ‘contrastive comments’, or ‘problem-raising’ (Swales, 1990, p. 156).

The data extracted from the TAC reveal that the most distinguishing characteristic of this sub-move is indeed the use of negative and quasi-negative quantifiers such as ‘no’, ‘little’, ‘few’, lexical negation like ‘unclear’, ‘inadequacies’, and adversative sentence connectors, ‘however’, ‘although’ and ‘unfortunately’ being three cases in point. When this language is accurately and appropriately exploited in lexico-structural patterns as exemplified in Appendix L, a research gap is indicated, and a niche (Swales, 1990) is created for the writer.
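The three groups of gap-indicating items mentioned above lend themselves to a simple marker search. The following Python sketch is an assumption-laden illustration and not part of the study's method: the word sets merely combine the items cited from Swales (1990) with those reported for the TAC, the function name `gap_markers` is hypothetical, the sample sentence is invented, and matching is on exact word forms only (so ‘fails’ or ‘lacks’ would be missed without lemmatisation).

```python
import re

# Items listed in the text (Swales, 1990, and the TAC data).
QUANTIFIERS = {"no", "little", "none", "few"}
LEXICAL_NEGATION = {"fail", "lack", "overlook", "misleading", "limited",
                    "failure", "unclear", "inadequacies"}
CONNECTORS = {"however", "nevertheless", "yet", "unfortunately", "but",
              "although"}


def gap_markers(sentence):
    """Return the gap-indicating items found in a sentence, by category."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {
        "quantifiers": [t for t in tokens if t in QUANTIFIERS],
        "lexical negation": [t for t in tokens if t in LEXICAL_NEGATION],
        "connectors": [t for t in tokens if t in CONNECTORS],
    }


# Invented sentence, for illustration only (not taken from the TAC).
sample = ("However, little is known about this issue, "
          "and previous accounts remain unclear.")
print(gap_markers(sample))
```

A sentence in which all three lists are non-empty is a likely candidate for this sub-move, although the final decision would still require reading the concordance line in context.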

Sub-move 3: Filling the Research Gap

This sub-move serves a very important purpose, since it connects the present research to the gap, thereby creating a conducive research space for the researcher. This sub-move is typified by adjectives like ‘new’ and ‘first’, verbs like ‘extend’, ‘clarify’, ‘further’, ‘advance’, and nouns like ‘milestone’, and it usually follows the ‘gap’, thereby making the research as a whole necessary, meaningful and worthwhile. As this sub-move is very closely linked to the previous sub-move ‘Identifying a research gap’, there may be overlaps between them in terms of the language employed. Some examples where there are overlaps are:

Unlike previous studies that examine only …, this study also focuses on …

This thesis presents results for a broader … range than previously published materials.

As can be seen from the first example, the first part focuses on the ‘research gap’ through the use of ‘unlike’, and ‘only’, while the second part informs the reader that the research gap is to be filled through an additional aspect, or perspective (this study also focuses on …). The second example makes use of lexical choices to indicate the research gap as well as how the gap is to be filled (results for a broader … range than previously published materials).

Sub-move 4: Stating the ‘aim’ or ‘purpose’ of the study


This sub-move is most probably the most significant one in an abstract, and indeed in the whole thesis, since the research gap, the methodology, the findings, the conclusions, and the implications are all dependent on the main aim of the study. This sub-move is the same as the third move of introductions, which Swales refers to as ‘occupying the niche’ (1990, p. 159). The function is to turn the already established ‘niche’ into a research space that rationalizes the present research. The writer may indicate their main purpose or purposes, or describe what they regard as the main aspects of their research (Swales, 1990, p. 159).

Four patterns for stating the aim emerged from the data, which would enable the post-graduate writers to utilize alternative lexico-structural patterns for the aim in different parts of the thesis. As noted earlier, some of the patterns in the ‘scope’ sub-move are also relevant to the statement of the aim. The first pattern directly announces to the reader that the ‘purpose’, ‘aim’, ‘objective’, ‘goal’, ‘intention’, ‘object’ or ‘intent’ of the thesis or study is to be stated, and therefore the statement begins with one of these words. The other three patterns initially foreground the product of the research (thesis, dissertation, work, study, etc.), followed by a verb denoting what the study set out to achieve (expand, explain, develop, address, propose, investigate, etc.).

Move 2: Methodology

Perhaps the move with the highest potential to include the most detail is the second move, ‘Methodology’. This move in the thesis abstract, and the corresponding chapter in the actual thesis, incorporate information on the approach and specific methods used together with the rationale, the research context, the participants or subjects, and the data collection and analysis tools. The purpose of this move is to inform the reader about how the research was done, and why it was done that way. Therefore, it “covers not only the methods used to collect and analyze data, but also the theoretical framework that informs both the choice of methods and the approach to interpreting the data, and relates all of these explicitly to the research question(s) addressed in the thesis”. The initial analysis of this move with the aim of forming a lexico-grammatical bank for sub-moves generated such abundant data that they had to be trimmed and edited to a manageable size.

Sub-move 1: Presenting the methodology employed

The data from the TAC revealed that the three most common verbs used to talk about the methodology of the study are ‘use’, ‘utilize’, and ‘employ’. Both present and past tenses are used in the active, although in the actual thesis the past tense seems to be used more often than the present to refer to the methodological procedures. Alternatively, the passive may be employed to talk about the methodology adopted in the study. According to the data, the researcher may also describe or justify the methodology adopted, or may choose to go into considerable detail regarding the methodology employed in the study.

Sub-move 2: Justifying the methodology employed

It is also important that the methodology adopted in the study is relevant to the resolution of the problem that has led the researcher to conduct the current research. Accordingly, some researchers may want to discuss the methodology in relation to the problem, and present the theoretical framework justifying the choice of the approach and the methods used. As this sub-move is closely related to the previous sub-move, ‘presenting the methodology’, there may be overlaps between these two sub-moves.

The analysis revealed that the primary aim of an appropriate methodology in a research study is to collect sufficient data. The word ‘data’ occurs in the TAC 268 times, which is a very high frequency. This should not be surprising, as the significance of the research depends on its findings from data analysis and interpretation. It would therefore be valuable to consider the lexico-grammatical realizations of this very common and important word in detail.

Table 4.26 Lexico-structural patterns of ‘data’

DATA    collected from / in …
        gathered from …
        stored in …
        obtained from …
        used …
        produced by …
        relating to …
        assessed by the …
        derived from …
        generated by / from …
        required to …
        relevant to …
        measuring …

DATA    is / are analyzed for …
        was analyzed for …
        was analyzed using …
        is censored …
        are identified …
        is examined …
        is incorporated into …
        is passed onto …
        were sorted …
        that conform to …
        to analyze income …
        to contribute to …
        to deviate from …
        to help build …
        to simultaneously capture …
        to test …


Table 4.26 shows that although ‘data’ is a plural word, with ‘datum’ as the singular form, the writers in the TAC used ‘data’ both in the singular and the plural. As can be seen, ‘data’ is abundantly modified through relative clauses.
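Patterns of this kind can be surfaced with a simple key-word-in-context (KWIC) listing. The following is a minimal sketch only, not the procedure used in this study: the function name `kwic`, the window width, and the three sample sentences are all assumptions introduced for illustration, and the sentences are invented rather than taken from the TAC.

```python
import re

def kwic(text, keyword, width=4):
    """Return (left context, node word, right context) for each hit of keyword."""
    tokens = re.findall(r"[A-Za-z'-]+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Hypothetical mini-corpus; invented sentences, not TAC data.
sample = (
    "The data were collected from three schools. "
    "Data obtained from the survey was analyzed using regression. "
    "We used data generated by the simulation."
)

concordance = kwic(sample, "data")
for left, node, right in concordance:
    # Right-align the left context so the node word lines up in a column.
    print(f"{left:>32} | {node} | {right}")
```

Reading down the right-hand column of such a listing is one way of noticing the post-modifying patterns (‘collected from’, ‘obtained from’, ‘generated by’) and the alternation between singular and plural verbs.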

Sub-move 3: Describing the context

It is essential that sufficient information is offered to readers on the context, so that they can clearly understand the setting, the conditions and circumstances that led to the problem, and most importantly the problem itself. The researcher’s awareness of the setting plays a vital role in designing the research, since what is relevant to one context may not be so in another. This awareness can easily be observed from the data (see Appendix L), as most of the examples for the pattern include ‘examining a problem within the context of …’, or ‘within a wider context’.

An interesting finding is that while in the TAC, the ‘context’ family occurs 204 times and ‘context’ itself 139 times, the whole family appears only 3 times in the LAC. For this reason, in addition to the lexico-grammatical patterns derived from the TAC to describe the context, it is worthwhile to extract the common verbs to talk about the lexical item itself, independent of the sub-move.

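Table 4.27 Verbs used to talk about the ‘context’

creating                        a / an    … context …
suit                            the       … context …
build                           a new     … context …
describe                        the       … context …
lay out                         the       … context …
form                            the       … context …
identify                        the       … context …
represent                       the       … context …
complement                      the       … context …
highlight the importance of     the       … context …
construct a model within        the       … context …
draw on examples from           the       … context …
considered in                   the       … context …
interpreted in                  a / an    … context …
applied in                      the       … context …
discussed in                    the       … context …
influenced by                   the       … context …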
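A comparison of word-family frequencies across two corpora, such as the one reported above for ‘context’, can be sketched as follows. This is an illustration under stated assumptions: the members listed in `CONTEXT_FAMILY` are a guess (the forms actually counted are not listed here), the function `family_frequency` is hypothetical, and the two sample texts are invented stand-ins for the TAC and the LAC. Normalizing per 10,000 tokens is one common way of comparing corpora of different sizes.

```python
import re
from collections import Counter

# Assumed family members; the exact forms counted in the study are not given.
CONTEXT_FAMILY = {"context", "contexts", "contextual", "contextually",
                  "contextualize", "contextualized"}

def family_frequency(text, family):
    """Return (raw count, rate per 10,000 tokens, per-form counts)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in family)
    total = sum(counts.values())
    per_10k = 10000 * total / len(tokens) if tokens else 0.0
    return total, per_10k, counts

# Invented samples standing in for the target and learner corpora.
target_sample = ("The problem is examined within the context of reform. "
                 "Contextual factors shape each context.")
learner_sample = ("The problem is examined in schools. "
                  "Many factors are important in schools.")

t_total, t_rate, t_counts = family_frequency(target_sample, CONTEXT_FAMILY)
l_total, l_rate, l_counts = family_frequency(learner_sample, CONTEXT_FAMILY)
print(f"target:  {t_total} hits, {t_rate:.1f} per 10,000 tokens")
print(f"learner: {l_total} hits, {l_rate:.1f} per 10,000 tokens")
```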

Optional sub-move: Describing the variables

It should be emphasized that ‘describing the variables’ is an optional sub-move, as variables are used in experimental studies in the Sciences. In qualitative studies, variables are not part of the research design, and therefore this sub-move is not included in such studies. The data from the TAC revealed that ‘variables’ are mostly ‘analyzed’, ‘investigated’ or ‘tested’. The most frequently used collocates of ‘variables’ in the pre-position are ‘dependent’ and ‘independent’.

Move 3: Results

In the ‘Results’ section, the researcher reports the data analyzed, and highlights and summarizes the significant findings. Therefore, this section may be said to be composed of ‘what the data / analyses / results / findings show’ and ‘what the data / analyses / results / findings mean’. It should be taken into account that there may be overlaps in the use of language in moves 3 (results) and 4 (discussion), since in the discussion / conclusion part the writer may also refer to the research findings, results, or the data.

In scientific writing, when reporting what the results show, mean and how they can be interpreted, in other words, in both Move 3 and Move 4, ‘hedges’ and ‘boosters’ are used. Hyland explains the reasons for the need to use these expressions of ‘doubt’ and ‘certainty’ in the following way:

One of the most important features of academic discourse is the way that writers seek to modify the assertions that they make, toning down uncertain and potentially risky claims, emphasizing what they believe to be correct, and conveying appropriately collegial attitudes to readers. (2000, p. 179)

The examples he gives for ‘hedges’ are ‘might’, ‘probably’ and ‘seem’, and for ‘boosters’ ‘clearly’, ‘obviously’ (Hyland, 2000) and ‘demonstrate’ (Hyland, 2005, p. 179). Hyland is of the opinion that these devices help academics’ work to gain acceptance ‘by balancing conviction with caution’ (2000, p. 179). In another article, Hyland further emphasizes the significance of these devices:

Both boosters and hedges represent a writer’s response to the potential viewpoints of readers and an acknowledgement of disciplinary norms of appropriate argument. They balance objective information, subjective evaluation and inter-personal negotiation, and this can be a powerful factor in gaining acceptance for claims. (2005, p. 180)

Sub-moves 1 and 2: ‘What the data show’ and ‘What the data mean’

The findings of this research are consistent with the literature, as it was observed that the writers of the TAC made abundant use of hedges and boosters. The two very important sub-moves fulfilling this move are ‘what the data show’, and ‘what the data mean’. Instead of separating these two sub-moves, it would be more convenient to focus on the key words and the lexico-grammatical patterns used to fulfill them.

The first key word is ‘data’, and whether the data ‘show’ or ‘mean’ depends on the verb used with it. For example, ‘showed that’ and ‘demonstrate’ fulfill the first sub-move, whereas ‘implies’ and ‘suggests’ belong to the second sub-move. At the same time, ‘demonstrate’ is a ‘booster’, while ‘imply’ and ‘suggest’ are hedges.
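The distinction between hedges and boosters lends itself to a simple count over a text. The sketch below is illustrative only and is not the tool used in this study: the two word sets contain just the items named above, the suffix-stripping in `candidate_lemmas` is a deliberately crude stand-in for proper lemmatization, and the sample sentence is invented.

```python
import re
from collections import Counter

# Only the items named in the discussion above; real lists would be longer.
HEDGES = {"might", "probably", "seem", "imply", "suggest"}
BOOSTERS = {"clearly", "obviously", "demonstrate"}

def candidate_lemmas(token):
    """Yield the token plus crude base forms made by stripping verb endings."""
    yield token
    rules = (("ies", "y"), ("ied", "y"), ("ing", ""), ("ing", "e"),
             ("ed", ""), ("es", ""), ("d", ""), ("s", ""))
    for suffix, replacement in rules:
        if token.endswith(suffix):
            yield token[: len(token) - len(suffix)] + replacement

def count_stance_markers(text):
    """Count hedges and boosters, keyed by (category, base form)."""
    counts = Counter()
    for token in re.findall(r"[a-z]+", text.lower()):
        for lemma in candidate_lemmas(token):
            if lemma in HEDGES:
                counts[("hedge", lemma)] += 1
                break
            if lemma in BOOSTERS:
                counts[("booster", lemma)] += 1
                break
    return counts

# Invented example sentence, not taken from the TAC.
sample = ("The data clearly demonstrate that scores improved. "
          "This suggests that feedback might help, and it implies a need for practice.")

markers = count_stance_markers(sample)
hedge_total = sum(n for (kind, _), n in markers.items() if kind == "hedge")
booster_total = sum(n for (kind, _), n in markers.items() if kind == "booster")
print(f"hedges: {hedge_total}, boosters: {booster_total}")
```

A count of this kind cannot tell whether ‘suggest’ is being used to hedge a finding or to make a recommendation, so any totals would still need to be checked against the concordance lines.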

Move 4: Discussion

In the ‘discussion’ move, and more generally speaking, in the last chapter of the thesis ‘Conclusion’, the researcher summarizes the key findings, reports the conclusions and discusses the contribution of the research to the field, considers the implications of the study, and makes recommendations as suggested by the findings of the research. The writer may also mention how the current research relates to the existing research, and how it opens up gaps for further research. In this move as well, ‘hedges’ such as ‘imply’ (13 times), ‘suggest’ (168 times), ‘tend’ (35 times), ‘seem’ (24 times), ‘appear’ (54 times), and modal verbs like ‘could’ (77 times) and ‘may’ (128 times), were used abundantly by the TAC writers. Mostly the present tense is used, as can be observed from the data (Appendix L), since there is the need to emphasize the current relevance of the research. An interesting finding relating to especially Move 4 is the case of ‘suggest’. This verb is widely used in this section as a hedging device, while at the same time it is employed in making recommendations based on findings. It is therefore crucial that post-graduate students are provided with exposure and practice regarding the use of this verb in the research context.

Sub-move 1: Describing the ‘key findings’

This sub-move appears to be fulfilled mostly through such key words as ‘imply’, ‘suggest’, ‘seem’, ‘appear’, ‘tend’ and ‘tendency’, all of which can be described as ‘hedges’. The researcher, in general, should be extremely cautious when describing the key findings, and the use of these key words seems to protect the researcher from making risky claims.

Sub-move 2: Relating the findings of the study to the existing research

This sub-move can be optional, as some researchers choose not to compare their findings with those of previous research. However, this sub-move is quite significant, as it signals that the researcher is well-read and well-informed about the research field. The key words and phrases that are used to fulfill this sub-move are ‘consistent with / in agreement with previous research’, and ‘unlike / contrary to previous research’, as can be observed from the data (See Appendix L).

Sub-move 3: Describing the ‘conclusions’

According to the data (see Appendix L), the most common key words that are used to describe the conclusions are ‘conclude’ and ‘conclusions’. This sub-move can be said to be almost typified by the use of the simple present tense, most probably due to the need to focus on the current relevance of the research. While in ‘describing the findings’ hedges are used abundantly, in ‘describing the conclusions’, the writer seems more comfortable.

Sub-move 4: Discussing the contributions of the study to the research field

The ‘contribution’ family is used 127 times in the TAC to refer to the contributions of the research. Every piece of research ideally should contribute to the field of research by extending it, clarifying it, or by filling a gap. As the data indicate (see Appendix L), the family members of ‘contribution’ fulfill this sub-move through three patterns. In the first, ‘contribution’ is used as the subject (e.g. The key contribution of this research is the use of …); in the second, it is used as the object (e.g. The thesis makes a theoretical contribution to the growing field of …); and in the third, the verb ‘contribute’ is used (e.g. Our findings contribute not only to … but also to …).

However, the ‘contribution’ family is not the sole means of talking about contributions of the research. The data (see Appendix L) show how other verbs used with ‘study’, ‘findings’, and ‘results’ are employed to denote ‘contribution’. Some of these verbs are ‘provide’, ‘promote’, ‘validate’, ‘reinforce’, and ‘inconclusively prove’. A very interesting finding related to this sub-move is the high frequency of the use of the words ‘understanding’, ‘insight’, and ‘knowledge’, when talking about the contribution of the study. Therefore, it would be useful to present a few examples regarding the lexico-grammatical realizations of these words (see Appendix L). These realizations, however, are quite varied, and the post-graduate students should be encouraged to explore the TAC for further examples.

Sub-move 5: Making recommendations / suggestions based on the research findings


It is quite conventional to propose changes to the existing state of affairs based on the findings of the research. In fact, in some cases these suggestions can be important contributors to the field. The data (see Appendix L) show that three lexical verbs (recommend, suggest, propose) are extensively utilized for this purpose. However, apart from these lexical verbs, two modal verbs, ‘should’, and ‘must’ are widely employed for the same purpose. The fact that ‘ought to’, which is a commonly used modal that can be used interchangeably with ‘should’ in everyday language, occurs only once in the TAC is quite interesting.

Sub-move 6: Discussing the Implications of the research

Implications are important as they confirm that the researcher is aware of how the current research findings can or will impact the existing practice, or theory. Still, the writer may sometimes discuss the implications of the study for further research. In that case, there may be overlaps between this sub-move and the next one ‘suggestions for further research’.

This sub-move makes use of the lexical item ‘implication’ (see Appendix L). The word is used as the ‘subject’ (implications regarding … are discussed), as well as the ‘object’ (this research has important implications for …). The most common collocates of the word are ‘broad’, ‘important’, ‘practical’, ‘theoretical’, and ‘empirically grounded’.

Sub-move 7: Opening up new areas of research

Researchers have to draw boundaries for themselves so as to be able to exploit the tools and the data effectively, and to present meaningful findings. During the research process, and especially at the data analysis stage, the researcher will observe that other avenues of research emerge. Being aware of these and opening up research gaps for fellow researchers will benefit the field of study as a whole. The TAC data (see Appendix L) indicate that the central lexical items for opening up new areas of research are ‘future’, ‘further’, ‘recommendation’, and ‘suggestion’.

Significance of the AAC Bank of Moves and Sub-moves

The AAC Bank of moves and sub-moves brings together the lexico-structural patterns most commonly used in the TAC to fulfill the moves and sub-moves in abstract writing specifically, and in thesis writing in general. This bank is envisaged to serve as a road map for the non-native post-graduate candidates, and to assist them in creating meanings and conveying their intended messages coherently and appropriately.

The AAC Bank is presented to the students in a table format in the Moodle glossary. Moreover, accompanying tasks, both in-class and online, based on the lexico-structural patterns in the AAC Bank of moves and sub-moves can help the students acquire these structures, and employ them in their own texts.

The data, as already mentioned, are presented to the students in tables on the e-learning platform, Moodle. Yet it is also possible to present the data in visual format as a ‘word cloud’, which gives “greater prominence to words that appear more frequently in the source text”. The advantage of visual representation is that it is conspicuous: the most frequent words are the most prominent, and are therefore easily detected. In addition, the lexico-grammatical patterns can also be represented in a wordle. The wordle for the sub-move ‘scope of the research’ is provided in Figure 4.1 as a sample.


Figure 4.1: Wordle of the AAC Bank, Move 1, sub-move: defining the scope
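The principle behind such a display, namely that more frequent words are drawn larger, can be sketched as a mapping from word frequency to font size. The linear scaling below is an assumption made for illustration and is not a description of how the Wordle tool itself computes sizes; the function `font_sizes`, its parameters, and the sample text are likewise hypothetical.

```python
import re
from collections import Counter

def font_sizes(text, min_size=10, max_size=48, top_n=5):
    """Map the top_n most frequent words to font sizes by linear scaling."""
    tokens = re.findall(r"[a-z]+", text.lower())
    freqs = Counter(tokens).most_common(top_n)
    if not freqs:
        return {}
    highest = freqs[0][1]
    lowest = freqs[-1][1]
    span = (highest - lowest) or 1  # avoid division by zero when all counts tie
    return {
        word: round(min_size + (count - lowest) * (max_size - min_size) / span)
        for word, count in freqs
    }

# Invented sample standing in for the 'scope' patterns of the AAC Bank.
sample = "scope study scope focuses study scope"
sizes = font_sizes(sample)
for word, size in sizes.items():
    print(f"{word}: {size}pt")
```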


The Pedagogic Corpus

At this stage, it would be useful to recall how Hunston (2002) defines a pedagogic corpus:

A corpus consisting of all the language a learner has been exposed to. For most learners, their pedagogic corpus does not exist in physical form. If a teacher or researcher does decide to collect a pedagogic corpus, it can consist of all the course books, readers etc a learner has used, plus any tapes etc they have heard. The term ‘pedagogic corpus’ is used by D. Willis (1993). A pedagogic corpus can be used to collect together for the learner all instances of a word or phrase they have come across in different contexts, for the purpose of raising awareness. (2002, p. 16)


An important aspect of this definition is that the pedagogic corpus involves all the materials the learners have already been exposed to, and with the advanced students in this study, this is impossible, as the post-graduate candidates are from totally different backgrounds, and have not studied English in the same way using the same materials. Willis provides a more comprehensive definition, and points out that a pedagogic corpus involves the texts that the learners have encountered, or will encounter (Willis, 2003, p. 165). He further states that “learners process a set of texts to enable them to develop their own vocabulary and work out their own grammar of the language”, and this set of texts can be described as a pedagogic corpus (Willis, 2003, p. 163). He emphasizes that it is one of the roles of the teacher or the course designer to “highlight important features of the pedagogic corpus and to help learners familiarize themselves with it” (p. 163). Willis further refers to tasks as components of a pedagogic corpus (2003, p. 223).

In this study, the pedagogic corpus is constructed from:

• access to the two corpora (the LAC and the TAC);
• the web concordances of the two corpora available on Moodle;
• the in-class materials integrating corpus-informed data and tasks;
• various advanced academic writing resources, as well as larger corpora such as the BNC;
• the online corpus-informed tasks that are designed to encourage students to discover regular patterns in language themselves, and learn in a collaborative manner on an e-learning platform founded on social constructivist principles;
• the TAC Wordlist of 165 key words;
• a glossary of the AAC Bank of moves and sub-moves, offering alternative lexico-grammatical patterns for fulfilling the required moves and sub-moves in abstract and thesis writing.

Willis (2003) considers the pedagogic corpus ‘as a valuable body of language’ and emphasizes its value in the following way:

If we see the pedagogic corpus as central to syllabus and materials design, we can go beyond the view of language learning as the accumulation of a series of language forms. We can see learning as the learner’s growing familiarity with a valuable body of language. This in turn encourages the learner to take a positive view of learning. Learning is contextualized by the communicative framework, it is communicative activity in the classroom which enables learners to develop their own spontaneous communicative repertoire, but the catalyst for this development is the exploration of text. The learning processes of recognition and system building are important in that they facilitate exploration and communication, but, important though they are, they are simply facilitating processes, paving the way for real language use. (p. 225)

The pedagogic corpus was constructed in this research study upon identifying that “learners deviate from native speaker norms” (Keck, 2004, p. 98). Thus, the constructed pedagogic corpus, with its multiple components, is anticipated to minimize the gap between the identified and the desired performance levels, through the exploitation of authentic texts, and corpus-informed tasks. Data-driven tasks ensure maximum exposure to the authentic data and enable the students to observe the use of language themselves, and become language researchers, or 'language detectives' (Johns, 1997).

Task Framework and Taxonomy

This section briefly outlines how the pedagogic corpus is used as a basis for the teaching and learning of advanced academic thesis writing skills, most particularly in terms of task design and taxonomy.

The students engaged in the advanced academic study of the type described in this research have much in common in terms of the aims and discourse of the academic community, and in terms of the moves and the language they require to be successfully involved in that community. In fact, the pedagogic corpus, to a large degree, resulted from identifying what students have in common in terms of what they will need to write at an acceptable level, and the types of typical problems that they encounter before they have reached that level. Despite these commonalities, however, it is equally important to emphasize that these students are engaged in the writing of their own theses, and therefore involved in a process that is individualized and unique. As a result, there are some factors that need to be taken into consideration in designing a task framework to provide a rationale for task design:

Students’ general and academic English language proficiency level can vary quite markedly. Groups of students at this level are usually heterogeneous.

Students’ disciplinary areas and thesis topics may vary considerably, even within the same discipline. To illustrate, within Civil Engineering, there are different branches such as “Construction Materials, Geotechnical Engineering, Hydraulic Engineering, Transportation







Students may be at different stages of writing their theses. In terms of the present course, for instance, some of the students are well advanced in their research writing, while in other cases topic selection has not been finalized.

Students’ aspirations for the future can vary. For some, the master’s thesis is basically the peak of their academic career, and their main aim is to complete their thesis, get their degree, and leave the university to find a good job. However, for others who are pursuing a degree with the aim of finding a place in the academic world, the thesis is a critical milestone in their academic career, and mastery of advanced academic English is a crucial long term skill that needs to be developed.

Swales (2004) refers to the aspirations of the majority of master’s students in the following way, although he admits that there are exceptions:

For a majority of master’s students, the dissertation/thesis is the most sustained and complex piece of academic writing (in any language) they will undertake. As often as not, such students are uninterested in going on to write for publication but rather are looking forward, once having completed their fifty-to one-hundred-page documents, to entering or returning to a professional career in teaching, nursing, business, and so on. (p. 99)

Thus, these individual differences suggest that some students may require formal teaching-learning processes so that they can develop and acquire the language and skills that will help them write a thesis at a later point, whereas others may require a more resource-driven environment where they can easily access and acquire language that will be of immediate help to them in the writing process.

This heterogeneous group of students from different backgrounds, with different aspirations, at differing language levels and at differing points of progression in their theses cannot be expected to benefit from a systemized approach where everyone does the same thing at the same time. The purpose of the formal instruction and, to a larger extent, of the introduction of the web-based interactive tool Moodle was not only to ‘teach’ academic writing skills per se, but also to ‘develop’ in students an awareness of academic writing, to provide them with a roadmap through exemplification and practice in the necessary skills, and thereby to increase their self-confidence. It needs to be emphasized that high levels of discipline, autonomy, and self-study are key in bringing such an individualized piece of work as a thesis to a successful completion.

The limited hours of instruction (only 3 hours a week), combined with the post-graduate candidates’ individual differences, make it unrealistic and impractical to deal with specific language problems. The task design, therefore, should be flexible, and the tasks varied enough to cater for the needs, levels, and aspirations of such a heterogeneous group, so that the participants have the choice and the means to focus on their individual and specific needs through different kinds of tasks in their own time.

Taking all these factors into consideration, the tasks were designed to promote awareness of the resources available, and to familiarize the students with the skills and knowledge required to exploit these resources, so that they can build enough confidence to work individually on their theses. The approach to task design is data-driven and exploratory, rather than didactic. The pedagogic corpus thus functions as an open resource that students can exploit whenever and however they like or need. This opportunity is offered to the students through Moodle.

Moodle was first introduced to the course in the 2007-2008 Academic Year (see Appendix M for screenshots of Moodle). The purpose of integrating Moodle into the ENGL501 course was not only to provide a ‘host’ for the corpora, but also to mediate between the individualized nature of thesis work, and collective learning and psychological support work in a collaborative environment. Furthermore, a web-based platform would offer the opportunity for the students to work at their own pace, on activities of their own choice, and in their own time and place.

Before describing and exemplifying the tasks designed, the components of the pedagogic corpus need to be recalled. The two corpora (the LAC and the TAC) together with their web concordances, the in-class materials integrating corpus-informed data and tasks, various advanced academic writing resources, as well as access to larger corpora such as the BNC, the TAC Wordlist of 165 key words extracted from the TAC, the AAC Bank of moves and sub-moves, and the online corpus-informed tasks constitute the pedagogic corpus. Considering the wealth of authentic data derived from the LAC and the TAC in hand, there should be tasks that ensure their utilization to the fullest extent.

The LAC is also extensively exploited in task design, although the use of incorrect forms for pedagogic purposes has many critics. The opponents of using ‘incorrect forms’ maintain that the learner might learn these forms (Corder, 1973, p. 294). Corder (1973), however, is one of the supporters of the use of learner errors. He believes that “there is a strong argument in favour of the controlled use of examples of incorrect forms so long as these are correctly labeled as such” (Corder, 1973, pp. 293-294), and justifies his opinion in the following way:

Language learning is not parrot learning: we do not ‘learn’ or ‘practise’ examples. They are the data from which we induce the systems of the language. Skill in correction of errors lies in the direction of exploiting the incorrect forms produced by the learner in a controlled fashion. (p. 294)

The tasks designed for this course are essentially genre-based, data-driven, and ‘cyclical’ in nature. They are genre-based, since focus on generic moves and sub-moves is an integral part of the course. Besides being genre-based, they are data-driven, as they require the students to consult the data. The tasks are also cyclical, as all parts of language are interrelated, and learning a new item requires relearning all the other language items studied already (Corder, 1973, p. 265). With such a heterogeneous group of students, at different language levels and with different needs, it is necessary to include tasks dealing with more basic elements of language, as well as those that require higher level language competence. Lexis, collocations, and lexico-grammar are particularly emphasized in tasks.

To ensure that the students can get maximum benefit from the abundant data, ‘corpus-based learning tasks’ (Keck, 2004, p. 93) are designed. These tasks can be broadly grouped into two categories: teacher-directed data-driven tasks and student-led discovery learning tasks. The theory behind data-driven learning (DDL) is that students become ‘language detectives’ (Johns, 1997, p. 101), “discovering facts about the language they are learning for themselves, from authentic examples” (Hunston, 2002, p. 170). Teacher-directed data-driven tasks are therefore the ones that engage “learners in the analysis of concordance lines that have been selected, arranged, and possibly edited by the teacher in order to draw learners’ attention to patterns of language use” (Keck, 2004, p. 94). In this study, these tasks are designed by the teacher to draw attention to such patterns, but as the students are advanced, the data are not manipulated by the teacher.

In student-led discovery learning tasks, on the other hand, students might “generate their own concordance data, thus engaging in autonomous discovery learning” (Keck, 2004, p. 94). Some of the advantages of student-led corpus analysis are that “learners are in charge of their own learning, and thus motivation is increased” (Aston, 2001, cited in Keck, 2004, p. 94), and “learners make ‘serendipitous’ discoveries about language use that, without the use of corpora, would not have been possible” (Bernardini, 2001, cited in Keck, 2004, p. 94). The teacher-directed data-driven and student-led discovery learning tasks designed for this course are envisaged to help the students to acquire awareness of, or recognize, language patterns, be ‘language detectives’ (Johns, 1997, p. 101), explore the data, make their own discoveries, and eventually produce their own texts. The tasks which play an indispensable role in the exploitation of the data are exemplified and described in detail.


Figure 4.2: A student-led discovery task (individual)

This task was designed after it was found that the majority of the students frequently used the word ‘data’ incorrectly. The aim is to get the students to explore the use of the word ‘data’ in the BNC corpus. The students generate their own concordance lines and find the collocates of ‘data’. This task requires them to explore the word in detail, as they are expected to find out whether ‘data’ is a singular or a plural word. They are also expected to pick, from the corpus, three sentences that they think clearly express the meaning of the word. This task is in the Moodle assignment module, which means that the students complete it individually and submit it online. A sample student assignment is in Appendix N.
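The two operations the students perform in this task, generating concordance lines and listing collocates, can be illustrated with a minimal sketch. It is not the BNC interface used in the course: the function names, the window sizes, and the three sample sentences are assumptions made up for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def concordance(tokens, node, width=4):
    """Return one key-word-in-context line per occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left:>30} [{tok}] {right}")
    return lines


def collocates(tokens, node, span=1):
    """Count the words found within `span` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Invented sample sentences (hypothetical, not taken from the BNC or the TAC).
SAMPLE = (
    "The data were collected through interviews. "
    "These data are analysed in Chapter 4. "
    "The data show a clear pattern."
)

tokens = tokenize(SAMPLE)
for line in concordance(tokens, "data"):
    print(line)
print(collocates(tokens, "data").most_common(3))
```

In this invented sample the words immediately to the right of ‘data’ are plural verb forms (‘were’, ‘are’, ‘show’), which is the kind of evidence the students are asked to weigh when deciding whether ‘data’ is singular or plural.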


Figure 4.3: A student-led discovery task (collaborative)

Task 2 was designed after a classroom session when it became clear that the post-graduate students were not only overusing the linker ‘On the other hand’, but also using it incorrectly, most of the time to mean ‘However’. This is again a student led discovery learning task, which requires the students to go to the BNC-Written corpus, find some examples with both sentence connectors, and comment on why one, and not the other, is used in each situation. Unlike the first task, however, this task is designed as a WIKI, which means that the students work collaboratively and edit each other’s work. The post-graduate students’ collaborative work can be accessed at


Figure 4.4: A teacher directed data-driven task

At the beginning of the semester, during the initial classroom sessions, it became clear that the post-graduate students learned English at an individual word level, and had no conception of collocations, colligations, and lexico-grammatical patterns. This task was thus designed to raise the students’ awareness of the need to know collocations, colligations, and lexico-grammatical patterns to be able to produce a word in writing. This is a teacher directed data-driven task, as the data (an abstract) are provided and arranged by the teacher. The abstract is a PhD abstract from Virginia Commonwealth University, and the sentences in the abstract are divided into separate lexico-grammatical patterns. The students are expected to solve the puzzle individually, and put the abstract back together by naturally considering the meaning, but more importantly, by deciding what the next word can be. As can be seen in the picture, option C is ‘to be largely stemmed’. The students will not be able to decide on the next pattern unless they know that ‘stemmed’ is followed by ‘from’.


Figure 4.5: An individual student-led discovery task

This task is an individual student-led discovery task. After the post-graduate students analyze authentic thesis proposals both in class and on Moodle for two weeks, they are expected to produce their first draft proposals. They submit this online, and also get feedback online. They have the opportunity to submit their proposals three times, for all of which they get feedback. The students have access to the TAC, the TAC wordlist, the AAC Bank, larger corpora, and authentic proposal samples, and they are encouraged to explore all these different resources in the process of producing their proposal.


Figure 4.6: A collaborative teacher directed data-driven task

This task is based on the LAC, and the data are purposefully selected and arranged by the teacher. The students are encouraged to use the TAC, or the other corpora, to determine the source of the problem, and correct the incorrect statements one by one. For example, the second statement is “Risk analysis never mentions what exactly will happen in the future”. At a glance, this seems to be a perfectly formed sentence, with a singular noun used with a singular verb. Furthermore, the noun clause is accurately inserted into the sentence. However, the problem here is one of ‘appropriacy’: an inanimate subject such as ‘risk analysis’ cannot perform an action, such as ‘mention’, that requires an animate agent. The task is designed as a discussion, and therefore requires collaboration.


Figure 4.7: An individual teacher directed data-driven task

The data for this task also come from the LAC. The teacher, upon identifying instances of the incorrect uses of ‘study’ in the LAC, arranged the data by providing a correct alternative to accompany the incorrect uses. The students in this task are provided with one correct and three incorrect uses of the word ‘study’, and the incorrect use identified in the LAC is one of the alternatives. The students are expected to choose the correct use, and consider why the other options are incorrect. The task is designed as a ‘hot potato quiz’, which means that after the students complete the quiz, they get immediate feedback and a score. The score is then recorded on Moodle for the teacher’s reference.


HOW MANY MAIN MOVES ARE THERE IN AN INTRODUCTION?

The following task may help you to answer this question.

Task 4: Name the moves and underline the important language the writer uses to achieve the moves in the following introduction.

The introduction does not end here. However, the rest of the introduction belongs to the 3rd move. Have you now decided what the moves are?

Move 1:
Move 2:
Move 3:

Task 8: Match the following moves with the relevant language.

Move _____

However, little information/attention/work/data/research ....
However, few studies/investigations/researchers/attempts ....
The research has tended to focus on ..., rather than on ....
These studies have emphasised ..., as opposed to ....
Although considerable research has been devoted to ... , rather less attention has been paid to ....
The previous research ... has concentrated on ....
So far, investigations have been confined to ...

Move _____

The increasing interest in ... has heightened the need for ....
Of particular interest and complexity are ....
Recently, there has been growing interest in ....
The development of ... has led to the hope that....
The .. has become a favourite topic for analysis ....
The study of ... has become an important aspect of ....
A central issue in ... is ....
The ... has been extensively studied in recent years.
Many recent studies have focused on ....

Move _____

The purpose of this paper is to ...
The purpose of this investigation is to ...
The aim of this paper is to ...
This paper reports on the results obtained ....
This study was designed to ...
In this paper, we give results of ...
In this paper, we argue that ....
This paper argues that ....
We have organized the rest of this paper in the following way ....
This paper is structured as follows ....
The remainder of this paper is divided into five sections ....

Figure 4.8: An in-class teacher led data-driven task


In this task, the data and the tasks are provided by the teacher to be used in the classroom. The focus is the generic moves in thesis introductions, and thus an authentic thesis introduction is used, and accompanying tasks designed. The task requires the students to read the introduction individually, decide on the three moves, discuss their decisions with their classmates, and come to an agreement. Then they are required to match the alternative lexico-grammatical patterns provided with the relevant moves.

Figure 4.9 below shows the number of times the individual data-driven learning tasks in the Hot Potatoes module were attempted by the post-graduate students in the 2007-2008 Academic year, Spring semester.

Figure 4.9: Reports of hot potato quizzes

The post-graduate students who took ENGL501 in the 2007-2008 Spring term reported that they found these tasks very beneficial, as they needed guidance on how to use the corpora. It is interesting to note that the post-graduate students, all of whom hold undergraduate degrees and are architects, mechanical engineers, civil engineers, economists, and finance experts, became language detectives through exposure to corpora and data-driven learning activities.

The Pilot Implementation and Evaluation of the course- 2007-2008 Fall Semester

The corpus-informed Advanced Thesis Writing course began to take shape in the 2007-2008 Fall semester, when the online component was gradually developed and incorporated into the course, creating a virtual classroom for more interaction and sharing beyond the 3 hours a week in class. This incorporation provided the opportunity and the platform to introduce participants to the concepts of corpora, genres, moves, as well as the philosophy behind e-learning platforms, that knowledge is created collaboratively. Although the course in Fall 2007 was not a pilot implementation, it was still evaluated, as some key concepts were introduced to the participants for the first time. A total of 23 participants completed and submitted an evaluation questionnaire online. The major findings are presented below.

Moodle incorporates a questionnaire module, which collates and tabulates the questionnaire entries automatically. According to the analysis of the data, the course was most useful in terms of developing academic study skills and knowledge of academic conventions (4.7 on a 5-point scale), followed by developing knowledge of thesis structure and format, improving academic writing skills and knowledge, developing skills and knowledge of textual dynamics, developing academic vocabulary knowledge and skills, developing awareness of the need and benefit of producing multiple drafts in academic writing, and developing skills in exploiting computers as a study resource. When the participants were asked about the contribution of the course to their writing skills, 12 out of 23 (52%) said their writing skills were much improved, 9 out of 23 (39%) somewhat improved, and 2 out of 23 (9%) a little improved.

The questionnaire also inquired what the participants still found difficult about writing in English. One participant, unknowingly, articulated the aim and the research agenda of the present study:

I think that although the course was very useful, it was about academic writing in general and not specialized in my feild of study- as it is supposed to be, because it is not possible to have different courses for different thesis subjects of students. The thing that I need to do now is focussing on writing in my feild of study by reading relevant subjects, learning specialized vocabulary, and trying to write similar to them, by the aid of the knowledge I have earned in this course.

The newly introduced on-line learning platform was also part of the evaluation. The participants were asked about courses offered through Moodle. 91% of the respondents stated that they would like to have more courses supported by Moodle at the post-graduate level. The respondents reported that they benefited from Moodle most in terms of being able to see what the teacher was asking them to do each week (4.4 on a scale of 5), followed by learning independently, writing more in English, feeling more confident about using computers, and seeing what their friends were doing and saying and learning from their ideas. That the participants ranked collaboration among classmates the lowest may be because this was the first time they had taken a course supported by an online platform, and they were hesitant and uncomfortable about sharing ideas.


The participants also commented on the course in terms of content, teaching materials, instructional methods, focus, and instructor. Most respondents reported that, in general, they found the course satisfactory. One participant said “Best lecture i had this semester”. The necessity of allocating more hours to the course and having the course for two semesters was mentioned. One respondent had a very idealistic suggestion:

To establish a collaborated work between the student, instructor, and the supervisor. To customize the lecture through high interaction between the student, instructor, and of course the supervisor to provide not only a guide and feedback for writing (instructor) but also specialized advice on the topic (supervisor).

The same respondent went on to acknowledge that this suggestion might be unrealistic:

However, sounds challenging and even impossible, given the number of applicants, it would bring about significant changes and a solid support to the thesis in terms of both academic writing structure and specialized field supervision. Hence, a superior work and higher motivation for publication. (Thank you).

The Pilot Implementation and Evaluation of the course- 2007-2008 Spring Semester

The course was therefore first piloted as a corpus-informed genre-based course in the 2007-2008 Academic Year, Spring Semester. The TAC (Target Abstract Corpus) was put online in the form of web concordances, providing the opportunity for participants to explore a word in the corpus to find out about its use and its lexico-structural properties. The participants were also introduced to larger corpora and vocabulary profiling and concordancing tools ( to raise awareness of the importance of genre and register variation, as well as the value of observing and exploring the context of a word for its accurate and appropriate use. In addition, corpus-informed tasks and materials were developed and incorporated into both the in-class materials and the on-line component of the course. These tasks were designed to encourage the participants to explore the TAC and the other larger corpora made available, and to focus on lexico-structural patterns rather than isolated vocabulary items. Because corpora are stored on computers, most of the newly designed tasks were put online to facilitate the exploration of these valuable databanks.
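The idea behind vocabulary profiling can be sketched briefly: each word of a text is assigned to a band, and the share of each band is reported. The sketch below is only an illustration under stated assumptions. The two word lists are tiny hypothetical stand-ins for real frequency and academic word lists, and the band names and the `profile` function are invented here; they do not reproduce the tool used in the course.

```python
import re
from collections import Counter

# Hypothetical mini word lists standing in for real frequency bands.
HIGH_FREQUENCY = {"the", "of", "a", "in", "is", "to", "and", "this", "was", "on"}
ACADEMIC = {"analysis", "data", "method", "significant", "hypothesis"}


def profile(text):
    """Return the share of tokens that fall into each vocabulary band."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {}
    bands = Counter()
    for tok in tokens:
        if tok in HIGH_FREQUENCY:
            bands["high_frequency"] += 1
        elif tok in ACADEMIC:
            bands["academic"] += 1
        else:
            bands["off_list"] += 1
    return {band: count / len(tokens) for band, count in bands.items()}


# Invented example sentence, used only to make the sketch runnable.
result = profile("The analysis of the data is based on this method")
for band, share in sorted(result.items()):
    print(f"{band}: {share:.0%}")
```

A learner could run a draft and a target text through such a profile and compare the shares, which mirrors the comparison of higher and lower frequency vocabulary made between the LAC and the TAC.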

In the 2007-2008 Academic year, Spring semester, 32 participants, all of whom were doing their Master’s, took the course. Of these, 12 were from Architecture, 7 from Banking and Finance, 3 from Industrial Engineering, 3 from Economics, 3 from Communication and Media Studies, 2 from International Relations, 1 from Civil Engineering, and 1 from Mechanical Engineering. As for nationalities, this group was even more international than the previous groups: 14 participants were from Iran, 7 from Cyprus, 2 from Iraq, and 1 each from Cameroon, Nigeria, Albania, Kosovo, China, Russia, Syria, and Jordan.

As in the previous years, pre- and post- questionnaires were administered at the beginning and end of the course, so as to evaluate the participants’ perception of the impact of the course on their progress. The online pre-course questionnaire included 17 and the online post-course questionnaire 31 questions on various aspects of the course. The pre-course questionnaire was completed by thirty, and the post-course questionnaire by twenty-five of the total thirty-two participants. These questionnaires were analyzed together.


Seven themes emerged from the content analysis of the data; these are ‘perceptions of the importance of English in an English-medium university, especially for postgraduate students’, ‘perceptions of the significance of the writing skill’, ‘methods of checking / improving written work’, ‘the major difficulties in producing own texts’, ‘familiarity with the important concepts in writing’, ‘course-related opinions’, and ‘Moodle-related opinions’. The data analysis revealed that, at the end of the semester, there was an increase in the respondents’ perceptions of the importance of English, as can be seen from Table 4.28.

Table 4.28 Perceptions of the importance of English

To pursue a post-graduate degree in an English-medium University, one needs to have:

                                   Pre-course Questionnaire   Post-course Questionnaire
                                   (30 respondents)           (25 respondents)
a high level of English            9 (30%)                    9 (36%)
a satisfactory level of English    21 (70%)                   16 (64%)
a low level of English             0                          0

The more challenging task in a post-graduate program in an English-medium University is:

                                                    Pre-course Questionnaire   Post-course Questionnaire
                                                    (30 respondents)           (25 respondents)
dealing with the subject matter
(Architecture, Business, etc.)                      12 (40%)                   8 (32%)
dealing with the expression of ideas in English     18 (60%)                   17 (68%)

The first question on this theme was related to the level of English required to pursue a post-graduate degree in an English-medium university. While 30% of the respondents thought at the beginning of the semester that a high level of English was required, this percentage rose to 36% in the post-course questionnaire responses. The findings regarding the second question on this theme were also noteworthy. While 60% of the respondents thought dealing with the expression of ideas in English was more challenging than dealing with the subject matter before taking the course, this percentage rose to 68% at the end of the semester. This may mean that the course raised the participants’ awareness of the importance of coherent and appropriate expression of ideas, and contributed to the understanding that no matter how well one knows the subject matter in hand, the more demanding task is to convey this knowledge to the reader effectively.
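The percentages in Table 4.28 can be recomputed from the raw counts, which also makes the size of each shift explicit. The following is a minimal sketch: the counts and respondent totals are those given in the table, while the `share` helper and the variable names are assumptions introduced here for illustration.

```python
PRE_TOTAL, POST_TOTAL = 30, 25  # respondents to each questionnaire (Table 4.28)

# (pre-course count, post-course count) for each response option in Table 4.28
ROWS = {
    "a high level of English": (9, 9),
    "a satisfactory level of English": (21, 16),
    "dealing with the subject matter": (12, 8),
    "dealing with the expression of ideas in English": (18, 17),
}


def share(count, total):
    """Percentage of respondents, rounded to the nearest whole number."""
    return round(100 * count / total)


for label, (pre, post) in ROWS.items():
    pre_pct = share(pre, PRE_TOTAL)
    post_pct = share(post, POST_TOTAL)
    change = post_pct - pre_pct
    print(f"{label}: {pre_pct}% -> {post_pct}% ({change:+d} points)")
```

Because the two questionnaires had different numbers of respondents, the same count can give different percentages: nine respondents chose ‘a high level of English’ on both occasions, which is 30% of 30 and 36% of 25.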

The second theme that emerged is the perceptions of the respondents regarding the importance of the writing skill in post-graduate programs. To this end, the respondents were asked to rank ‘reading, writing, speaking, listening, vocabulary, grammar’ in order of importance. Before taking this course, the order of ranking was ‘reading-writing-vocabulary / grammar-speaking-listening’. However, at the end of the semester, the ‘writing’ skill ranked first, followed by ‘reading-vocabulary-grammar-listening / speaking’. The findings appear to suggest that dealing with advanced thesis writing for a semester raised the participants’ awareness of the importance of the writing skill, and of the fact that there are a lot of factors contributing to coherent and appropriate writing.

The course provides tools and resources to check and improve written work, so that the participants can become their own editors. Before taking the course, using Word tools and self-checking by reading the produced text ranked top, each referred to eleven times. Ten respondents said they asked a friend or a teacher to check their work. After taking the course, however, the respondents mentioned a wider variety of tools and resources. Ten respondents reported that in addition to the Word tools, they benefited from corpus concordancers, the vocabulary profiler, and the TAC (Target Abstract Corpus) in checking and improving their work. Five people mentioned focusing on collocations, and three people on synonyms to improve their work. One respondent said:

gemoodle is the best thing that help [sic] me to improve my written skill. when i stuck [sic] i can check the corpus concordancer. i can search the examples which relates [sic] chapter that im [sic] using in [sic] that moment. it is the best thing in my master education that i met [sic] up to now

The respondents reported that after taking the course, they started using different tools and resources to check and improve their work, and that they felt more confident about writing. One respondent expressed this confidence particularly well:

i [sic] trust my self now, i believe that i can write and complete my thesis because all my questions has [sic] answered. its not a big deal to complete the thesis, but it was before. its aid [sic] to me, especially Moodle! its my supervisor :)

The fourth theme that emerged was the participants’ major difficulties with writing in English. As in the previous semesters, and as reported by the previous course instructors in the interviews, the most common difficulty was related to ‘vocabulary’, both at the beginning and at the end of the semester. One respondent expressed this difficulty in the following manner: “The most difficulty [sic] about writing is vocabulary. Sometimes we know about our feeling, but because of the lack of vocabulary, we cannot explain it to others”. The other difficulties mentioned are shown in Table 4.29.


Table 4.29 The major difficulties in writing

Pre-course Q.: What do you find most difficult about writing in English? Please give details.
Post-course Q.: What do you still find most difficult about writing in English? Please give details and reasons.

Area of difficulty            Pre-course
Vocabulary                    19 times
Grammar                       12 times
Academic / formal language    5 times
Quoting / paraphrasing        -

Post-course: 8 times, 4 times, 4 times

As can be seen from the table, vocabulary and grammar posed the greatest difficulties in writing. As regards vocabulary, finding alternative words and accurate collocations were mentioned as being problematic, while, as regards grammar, using the passive was singled out as a problem. Interestingly, the two very important skills in reporting research, ‘quoting’ and ‘paraphrasing’, which were not mentioned at all at the beginning of the semester, were reported to be difficult at the end of the semester. This may stem from the fact that some of the participants had no awareness of the need to quote and paraphrase ethically to avoid plagiarism. In fact, it would not be wrong to say that there were participants who found out about the meaning of plagiarism when they took the course. For this reason, in the course syllabus, the first two weeks are allocated to ‘academic ethics’, ‘avoiding plagiarism’, and ‘quoting, paraphrasing, referencing’.

Familiarity with some important concepts in writing was another theme that emerged. At the beginning of the semester, the participants were given a list of important concepts in writing, to which they were going to be exposed during the course, and were asked to choose the ones they were already familiar with. The same list, with only one concept added, was included in the post-course questionnaire. The rationale for presenting these was that familiarity and awareness can be contributing factors to the skilful manipulation of these concepts in writing. Table 4.30 demonstrates the results:

Table 4.30 Familiarity with some important concepts in writing

Concept                       Pre-course      Post-course
Cohesion                      23%             36%
coherence                     33%             52%
process writing               33%             44%
drafting                      63%             72%
revising                      50%             48%
editing                       43%             52%
academic vocabulary           43%             80%
collocations                  13%             76%
avoiding plagiarism           60%             88%
quoting                       67%             92%
paraphrasing                  67%             92%
bibliographies/references     57%             84%
formal / informal language    60%             88%
hedging                       3%              52%
moves                         not included    72%

The data indicate that the most dramatic increase occurred in the familiarity with the concept of ‘collocations’, followed by ‘hedging’. The former concept plays a major role in both speech and writing, while the latter is particularly important in academic texts. The most familiar concepts in the pre-course questionnaire were ‘quoting, paraphrasing, summarizing’ with 67%. It can be said that, on the whole, the course succeeded in raising participants’ awareness of important concepts in writing. The ‘generic moves’ concept, which was not included in the pre-course questionnaire, was reported to be a familiar concept by 72% of the respondents. This is quite a satisfactory result considering that the course adopts a genre-based approach to instruction. To sum up, it can be said that the participants’ awareness of some important concepts in writing, especially of academic vocabulary, collocations, and hedging, improved significantly.
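The ranking of increases described above can be checked directly against the figures in Table 4.30. The sketch below is illustrative: the percentages are those reported in the table, ‘moves’ is left out because it has no pre-course value, and the variable and function names are assumptions introduced here.

```python
# (pre-course %, post-course %) for each concept, as reported in Table 4.30
FAMILIARITY = {
    "cohesion": (23, 36),
    "coherence": (33, 52),
    "process writing": (33, 44),
    "drafting": (63, 72),
    "revising": (50, 48),
    "editing": (43, 52),
    "academic vocabulary": (43, 80),
    "collocations": (13, 76),
    "avoiding plagiarism": (60, 88),
    "quoting": (67, 92),
    "paraphrasing": (67, 92),
    "bibliographies/references": (57, 84),
    "formal / informal language": (60, 88),
    "hedging": (3, 52),
}


def gain(concept):
    """Change in familiarity, in percentage points, from pre- to post-course."""
    pre, post = FAMILIARITY[concept]
    return post - pre


ranked = sorted(FAMILIARITY, key=gain, reverse=True)
for concept in ranked:
    print(f"{concept}: {gain(concept):+d} points")
```

Sorting by the change in percentage points places ‘collocations’ first, ‘hedging’ second, and ‘academic vocabulary’ third, in line with the three concepts singled out in the text.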

Another theme that emerged was the participants’ course-related opinions. The respondents reported that the major contribution of the course was in terms of ‘improving academic writing skills and knowledge’, followed by ‘developing knowledge of thesis structure and format’, ‘developing academic study skills and knowledge of academic conventions’, ‘developing academic vocabulary knowledge and skills’, ‘developing skills and knowledge of creation of text’, and ‘developing awareness of the need and benefit of producing multiple drafts in writing’. Forty-four percent of the respondents stated that upon completion of the course, their writing skills were ‘much improved’, followed by forty percent who felt these skills were ‘somewhat improved’. Therefore, it can be safely concluded that the majority of the participants felt their writing skills benefited from the course.

The last theme that emerged from the pre- and post-course questionnaires was the participants’ opinions regarding the e-learning platform, Moodle. At the beginning of the semester, the participants talked about their first impressions of Moodle by describing it as ‘excellent’, ‘interesting’, ‘exciting’, ‘very good’, and ‘great’. Some respondents considered it from the ‘learning’ and ‘educational’ point of view, and mentioned improvement in the quality of education, facilitating learning, and focusing on the learners’ needs and expectations, which is in line with the philosophy behind Moodle. One respondent simply said: “Very impressive. I wish our faculty had such a tool”. At the end of the semester, the participants thought that ‘class materials, tasks and activities’ were the most useful part of Moodle, followed by ‘tools like TAC, Concordance, Vocabulary Profiler’, ‘outside resources and materials’ and ‘on-line assignments’, and ‘quizzes’ and ‘discussion forums’. Sixty percent of the respondents stated that they would like to have more courses supported by Moodle at the post-graduate level. One respondent, talking about why more courses should be supported by Moodle said: “we had too many courses at postgraduate [sic] level but none of it [sic] were helpfull for our thesis. Moodle is the only one course which is usefull for writing, improving and developing our English in academic way [sic]”, and another commented “i [sic] think moodle is a contemporary medium of academic communication and distribution of course material”. Based on the comments of the course participants on various components of the course as well as the findings from the questionnaire, it can be concluded that the course was a fruitful learning experience for the majority of the participants.

4.4 Summary

This chapter presented the results of the data analysis in accordance with the research design introduced in Chapter 3, Methodology. The abundant data which were thoroughly analyzed led to significant findings, the broader implications and applications of which are presented and discussed from a range of perspectives in the final chapter.




This study was motivated by the recognition of the language barriers facing non-native post-graduate EFL students doing research and writing their theses in an English-medium university in a non-English speaking country. The non-native post-graduate students, who aspire to disseminate their research internationally and to be accepted in the global academic discourse community, encounter problems in producing coherent and appropriate academic texts. The findings suggest that the problem stems from the considerable gap between the actual and the required performance levels, especially in terms of the exploitation of different types of words and lexico-grammatical patterns, as well as from the inability to develop language competence from the receptive to the productive level.

5.1 Discussion

This problem is not unique to the Eastern Mediterranean University in Northern Cyprus. The use of English as the language of instruction is widespread, and not only in countries where English is the first language. The perceived importance of English as a global language has led to many English medium universities being established worldwide; it has enabled such universities to promote themselves as international, and to compete with the English speaking countries in attracting students. This trend, however, raises issues concerning language standards. Difficulties with English can be particularly marked at the post-graduate level, where students may be involved in significant, and in some cases original research, only to find themselves hindered by the difficulties they face in organizing and expressing their ideas. This problem relates not only to the completion of their coursework and theses, but also to the subsequent publication in the international arena, where again English will often tend to be the required language. Additional pressure is often exerted through “the increasing requirement imposed by many university administrations around the world-and across many fields-for publications to appear in major Anglophone peer-reviewed journals as prerequisites for faculty recognition, advancement, and promotion,…” (Swales, 2004, p. 38). Mirahayuni (2002), in her doctoral dissertation motivated by the scarcity of research activities by Indonesian researchers, explores the same issue:

For researchers to gain recognition in the wider research community, it is necessary for them to be able to communicate in language(s) that reaches a global readership. International dissemination of knowledge has become a necessary part of communication among researchers and this has become possible with the role of English as the most widely used international language. (p. 313)

The current study has thus explored a problem that is not only common, but also extremely critical to so many non-native writers around the world, and the awareness of the significance of this problem has led to a growing body of research in this area. However, until recently, most of these research studies focused on the genre of research articles, and not as much on thesis writing. This study attempted to contribute to the filling of this research gap, and add to the growing body of research and knowledge, through constructing a pedagogic corpus with multiple components, that would form the basis for a genre-based corpus-informed approach to academic writing pedagogy, and assist the non-native post-graduate students in reporting their research.

This final chapter is thus organized as follows. After the aims, methods and the major findings are briefly discussed, conclusions are drawn based on the research findings, and in line with the relevant literature. Then, the implications and applications of the conclusions are explored from various perspectives. The chapter concludes with suggestions for further research in the field.


Aims and methods

The research study described set out to construct a pedagogic corpus incorporating a range of corpus-informed components, and teacher-led data driven and learner-led discovery tasks, to assist the non-native post-graduate students in producing coherent and appropriate representation of their research. To this end, two corpora were compiled and lexico-structurally analyzed: A target academic abstract corpus (TAC) composed of thesis and dissertation abstracts from English medium universities in English speaking countries, and a learner corpus of academic abstracts (LAC) written by the post-graduate students studying at EMU. The research sought answers to the following questions:

What are the major lexico-structural patterns in the Learner Abstract Corpus (LAC)?

What are the major lexico-structural patterns in the Target Abstract Corpus (TAC)?

How does the LAC relate to the TAC?


What does the cross-examination of the two corpora necessitate in terms of the comprehensive pedagogic corpus design?

The LAC comprised four sub-corpora consisting of a total of 100 abstracts produced by the students who took the Advanced Thesis Writing course at the Eastern Mediterranean University over a six-semester period. These abstracts were in the process of being developed by the post-graduate students. The TAC also comprised the same four sub-corpora (Architecture, Social sciences, Sciences, and Arts and Humanities), each consisting of 150 abstracts, giving a total of 600 abstracts. All the abstracts had been published on the World Wide Web, and were completed at universities in English speaking countries, namely the UK, the USA, Australia, New Zealand and Canada. The corpus was therefore constructed from the completed work in a variety of countries, subject areas, and by a variety of postgraduate students, most of whom are assumed to be native speakers of English. The models that emerge from the target corpus, therefore, are not the 'expert' models of experienced, native-speaker academics, but 'peer' models, models to which the postgraduate students in this research, for example, might reasonably be expected to aspire.

The rationale for compiling the corpora from thesis and dissertation abstracts was not simply to improve understanding of the genre of the abstract, but to exploit the abstract as a kind of microcosm of the thesis itself, since the fundamental moves of an abstract are more or less parallel to the chapters of the full thesis, with the exception of the literature review. Thus, the abstract has the potential to provide an extremely economical means of finding out a great deal about the whole thesis. A secondary reason for the use of the genre of ‘abstracts’ was that abstracts do not generally include quotes and paraphrases, and therefore reveal the writer’s own language. The reason for having two corpora, on the other hand, was to enable the researcher to compare the two and identify the lexico-structural issues that needed addressing, and to provide the post-graduate students at EMU with an acceptable standard of academic writing, namely that produced by post-graduate students in universities in English-speaking countries.

The corpora were not of equal size. There were one hundred abstracts in the LAC as opposed to six hundred in the TAC. The size of the LAC was beyond the researcher’s control, as it took her six semesters to compile it. The TAC, however, was intentionally larger. The rationale for having a larger TAC was to be able to extract sufficient data to compile a bank of lexico-structural patterns that would provide comprehensive coverage of the relevant moves and sub-moves. However, as the aim of the research was pedagogic, economy was also an important consideration, since at a later point students would be using the corpus as a learning resource. In the event, six hundred abstracts proved sufficient to yield extensive data while remaining manageable. The compiled corpora were then analyzed through a range of computer-based and web-based tools presented in the third chapter, leading to the findings discussed in the following section.


Major findings

The post-graduate students studying at the Eastern Mediterranean University were aware of the difficulties they faced with academic writing. What was even more interesting, however, was the fact that in many cases, they independently identified poor vocabulary as the major problem, echoing the insights of those instructors who had previously taught the course. This was further confirmed by the results of the analysis of the learner corpus (LAC). The work compiled in the LAC exhibited extensive use of higher frequency vocabulary, an obvious tendency to repeat similar items, and recurrent failure to use appropriate collocations and lexico-grammatical patterns. Furthermore, the findings from the LAC pointed to a significantly restricted range not only of individual words, but frequently also of their lexico-structural realizations. The TAC, on the other hand, revealed the use of a wide range of lower frequency words, as well as extensive and varied lexico-grammatical utilization of these items. Another finding was that in the TAC, around sixty percent of the word types used were off-list words, which suggests that infrequent and more sophisticated words were widely used.

The cross-reference of the two corpora produced significant results. The analysis of the LAC uncovered quite weak vocabulary, in comparison with the more sophisticated language of the authors of the TAC. The abstracts in the TAC exhibited a much wider active range of vocabulary than those written by the post-graduate candidates at EMU. The weakness seemed to arise from the fact that the writers of the LAC could not develop their receptive vocabulary knowledge to the productive level. Compared with the post-graduate students in English-speaking countries, who used more infrequent items extensively, the post-graduate students at EMU showed a marked preference for extremely frequent vocabulary items, and their insufficient knowledge of collocates and lexico-grammatical patterns led to serious errors, and consequently to incoherent written communication. To confirm that this wide gap did not simply result from the unequal sizes of the two corpora, twenty-five abstracts were randomly selected from each sub-corpus of the TAC to obtain a TAC equal in size to the LAC. The findings from this analysis revealed the gap between the LAC and the TAC more conspicuously. In terms of the range of different words used, the figures pointed to 1,814 more words used in the TAC, a substantial difference considering that both corpora were composed of exactly the same number of abstracts. In addition to the extent of this considerable gap, its composition was examined, and it was revealed that whereas 3,798 different words made up 95% of the text in the TAC, only 2,311 words composed the same proportion of text in the LAC. The figures and the percentages provided further proof that the non-native post-graduate students at EMU used a more limited range of words, compared with the post-graduate candidates writing their theses in English-speaking countries. Therefore, the analysis of the equal-sized corpora further substantiated the existence of the gap.
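The equal-size comparison described above can be illustrated with a minimal Python sketch. It is not the toolchain used in the study (the tools are presented in the third chapter); the function names, the simple regular-expression tokenizer, and the fixed random seed are all assumptions made for illustration. The sketch draws the same number of abstracts from each sub-corpus, counts distinct word forms, and reports how many of the most frequent forms are needed to cover a given share of the running text.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word forms (hypothetical tokenizer)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def sample_equal(sub_corpora, per_sub, seed=0):
    """Draw `per_sub` abstracts at random from every sub-corpus.

    `sub_corpora` maps a sub-corpus name to a list of abstract strings.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for name in sorted(sub_corpora):
        sample.extend(rng.sample(sub_corpora[name], per_sub))
    return sample


def type_count(abstracts):
    """Number of different word forms (types) in a list of abstracts."""
    return len({tok for text in abstracts for tok in tokenize(text)})


def types_for_coverage(abstracts, coverage=0.95):
    """How many of the most frequent types are needed to cover
    `coverage` (a proportion between 0 and 1) of all running words."""
    counts = Counter(tok for text in abstracts for tok in tokenize(text))
    total = sum(counts.values())
    running = 0
    for n, (_, freq) in enumerate(counts.most_common(), start=1):
        running += freq
        if running >= coverage * total:
            return n
    return 0  # empty corpus
```

Under these assumptions, calling `sample_equal(tac, 25)` on a four-sub-corpus TAC would yield one hundred abstracts, and comparing `type_count` and `types_for_coverage` for that sample and for the LAC would give figures of the kind reported above. Note that this sketch counts surface word forms, not word families.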

This gap can be typified by a more limited range of vocabulary, a more limited productive knowledge of the lexico-grammatical patterns commonly used in thesis writing, inaccurate use of collocations and lexico-structural patterns, insufficient knowledge of the different types of words in word families, and an inadequate ability to extend the knowledge of these words to the productive level. Thus, the cross-reference of the two corpora identified and exemplified the deviation of the actual from the desired output. This deviation, at times, led to communication breakdowns. This finding is consistent with the literature. Howarth (1996) maintains that if the lexico-grammatical form conforms to the norms of the register to the anticipated degree, the reader's conscious attention is focused on meaning while the form is processed subconsciously. “Phraseologically deviant forms disrupt such processing, forcing the reader to analyse form and synthesise meaning from scratch” (Howarth, 1996, p. 10).


These findings necessitated immediate action in accordance with the research agenda. It was of utmost necessity to categorize the data, and then make it accessible in a form in which it could be exploited for teaching and learning purposes. Therefore, the identified discrepancy between the actual and the desired output initially led to the identification of the 165 word families that were of particular significance in thesis writing. However, in addition to isolating the most frequent content-based items used in the target corpus, it was essential to move beyond the individual word or word family, focus on the syntactic properties of these significant vocabulary items, and reveal how exactly these words were used in academic writing. Corder (1973) emphasizes that learning vocabulary is not learning only the semantic properties of items, but also their syntactic properties (pp. 279-280). The need to focus on syntactic properties resulted in the qualitative analysis, and subsequent categorization of the significant words within their lexico-grammatical realizations. This further led to the classification of these patterns according to the IMRD (Introduction-Methodology-Results-Discussion) moves (Swales, 1990), and even more subtle sub-moves, the product of which was named the AAC Bank of moves and sub-moves.

The AAC Bank was only one component of the comprehensive pedagogic corpus constructed in accordance with the research agenda. A virtual learning environment, Moodle, was integrated into the course, and online materials and tasks were created to accompany the in-class materials and tasks. A taxonomy for tasks was generated, and teacher-led data-driven tasks and student-led discovery tasks were produced. The web concordances of the two corpora were also mounted on the ‘host’ environment, Moodle. Two glossaries were created for the TAC Wordlist and the AAC Bank of Moves and Sub-moves. Hence, the construction of the comprehensive pedagogic corpus was completed.

The newly designed course incorporating the pedagogic corpus was evaluated through questionnaires in the Spring semester of the 2007-2008 academic year. In their questionnaire reports, the course participants expressed very positive perceptions of the TAC, the tasks, the materials, and the use of Moodle, in other words, the components of the pedagogic corpus. One participant felt that he had benefited so much from the components of the pedagogic corpus and the virtual learning environment that he called Moodle his supervisor. Meanwhile, the course continues to be offered, and the rich corpus data dictate the continuous production of new tasks.

5.2 Conclusions

In the light of the major findings of the research, it is possible to draw major conclusions regarding wordlists, lexico-grammatical patterns and banks, semantic frequency, the significance of generic information, a genre-based corpus-informed data-driven approach to writing, the use of virtual learning environments, and most significantly, the use of corpora.



The findings revealed that the majority of the most common words used in academic abstracts do not come from the AWL, and that the use of GSL words as well as off-list words is significantly high. A further finding was that the most frequent content words in the two corpora were in fact GSL words, which acquired a different meaning and use in academic text. Based on the findings, it can be concluded that the recent trend to base EAP classes and programs exclusively on the words in the AWL can prove counter-productive, as such applications are bound to disadvantage learners by denying them the opportunity to encounter and practice the GSL words which are frequently used in academic texts. The results of this study, therefore, strongly support the research by Billuroglu and Neufeld (2005), who argue that there is little convincing rationale for the division between the GSL and the AWL in the first place. The results also suggest that the BNL (Billuroglu-Neufeld List), with its unified perspective on commonly used words, would provide a more meaningful approach to managing graded vocabulary development for receptive purposes. The results of the current research also suggest that the designation of any vocabulary item as ‘academic’ may be misleading, as a word’s connotation, or ‘semantic prosody’, is determined largely by its co-text, or surrounding environment (Sinclair, 1991, p. 112; Stubbs, 1996, p. 173).

The discussion of wordlists leads to another conclusion that is drawn from this research. As much as wordlists are useful for receptive vocabulary teaching-learning purposes, they are not sufficient for productive vocabulary teaching-learning purposes, unless supplemented, for the following reasons: The first reason is that wordlists present the word families of each word in such a way that all the words in a particular family are assumed to express the same meaning, display the same features of use, and occur in the same contexts. However, just because words are in the same family does not mean that they share the same use and meaning. Sinclair (1991) disagrees with the grouping of words together based on the single criterion that they are the different forms of the same word. He maintains that:


It is now possible to compare the usage patterns of, for example, all the forms of a verb, and from this to conclude that they are often very different from one another. There is a good case for arguing that each distinct form is potentially a distinct lexical unit, and that forms should only be conflated into lemmas when their environments show a certain amount and type of similarity. (Sinclair, 1991, pp. 7-8)

Referring to the same potential problem with word families, Hyland and Tse (2007) maintain that “a vocabulary list must either avoid items with clearly different meanings and dissimilar co-occurrence patterns, or these items must be taught separately rather than as parts of families” (p. 243). They further cite Oakey (2003), emphasizing that caution should be exercised regarding the issue, as in some families, meanings and collocational environments change across each inflected and derived word form (p. 243).

The fact that 62.02% of all the word types used in the TAC are off-list words is the second reason why vocabulary teaching-learning for productive purposes cannot be based entirely on wordlists. The frequent words offered by wordlists are the ones that learners are most likely to encounter, learn and use. Especially in academic environments where complex ideas are expressed in similarly complex and rich language, the use of less frequent, more specialized words is called for. According to Clanchy and Ballard (1992), “the academic writer makes frequent use of passive forms of the verb, impersonal pronouns and phrases, qualifying words and phrases, complex sentence structures, specialised vocabulary” (cited in Jordan, 1997, p. 244). Further, “vocabulary choice is a strong indicator of whether the writer has adopted the conventions of the relevant discourse community” (Nation, 2001, p. 178). Therefore, wordlists are useful for understanding, or recognizing, but not producing language, and total reliance on them would lead to the production of simple and repetitive language. Hyland and Tse (2007), referring to the use of academic vocabulary lists, emphasize that teaching such a vocabulary “… ignores important differences in the collocational and semantic behaviour of words, and does not correspond with the ways language is actually used in academic writing” (p. 237).
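The idea of an off-list share of word types can be made concrete with a small Python sketch. It is a simplification offered under stated assumptions: the function name is hypothetical, the wordlists are passed in as plain sets of word forms, and matching is done on surface forms, whereas vocabulary profilers typically match whole word families. It therefore illustrates the calculation, not the profiling tool used in the study.

```python
def off_list_share(tokens, wordlists):
    """Percentage of word types in `tokens` that appear in none of the wordlists.

    `tokens` is an iterable of lower-cased word forms; `wordlists` is a list of
    sets (for example, one set for GSL items and one for AWL items).
    """
    types = set(tokens)
    if not types:
        return 0.0
    listed = set().union(*wordlists) if wordlists else set()
    off_list = types - listed  # types covered by no list
    return 100 * len(off_list) / len(types)
```

A figure such as the 62.02% reported for the TAC corresponds to the value this function would return if roughly six in ten distinct word forms in the corpus were absent from every list supplied.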

In line with Hyland and Tse’s (2007) opinion above, the finding that the writers in the LAC exhibited the greatest difficulty with collocations and lexico-grammatical patterns leads to the third, and most important, reason why wordlists should be treated with caution where language production is concerned. Hyland and Tse (2007) maintain that “… vocabulary is more than individual words acting separately in a discourse” (p. 251). Paquot (2007) also points out that the AWL may be of greater value as a resource for receptive rather than productive teaching purposes, and argues for a more phraseological approach to productive teaching. Wordlists treat words as isolated items, and thus ignore the co-text.

This weakness is expressed by Carter and McCarthy (1988) as “the absence of information on collocations and collocational frequencies” (p. 9). Knowing a word means knowing a lot of other words that collocate with it. Especially for skills that require production, this knowledge is essential to achieve fluency. Fluency in the language will be promoted if, in addition to individual words, learners have a knowledge of lexical chunks. Furthermore, focusing on co-occurring words contributes to the use of syntactic structures. Willis (2000) holds that “lexical phrases fill the gap between grammar, on the one hand, and vocabulary on the other” and adds that “they play a vital part in both speech and writing, contributing to the ease, fluency and appropriacy with which someone speaks or writes” (p. 2). According to Nattinger and DeCarrico (1992), this lexico-grammatical view presents a new potential in language teaching. These ‘strings of words’ are stored together in our minds and are retrieved and used as a ‘pre-fabricated chunk’ either in exactly the same way or with slight changes (cited in Willis, 2000, p. 1). Hill (1999) also describes the present view of language as consisting, to a large extent, of ‘prefabricated chunks of lexis’ (p. 3).
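The kind of co-text information that wordlists leave out can be illustrated with a short Python sketch that counts the words occurring near a chosen node word. This is a bare-bones illustration under stated assumptions: the function name and the default window size are hypothetical, the input is an already tokenized text, and raw co-occurrence counts are used where concordancing software would normally add association measures.

```python
from collections import Counter


def collocates(tokens, node, window=2):
    """Count the words found within `window` positions either side of `node`.

    `tokens` is a list of word forms in text order.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]      # words before the node
            right = tokens[i + 1:i + 1 + window]     # words after the node
            counts.update(left + right)
    return counts
```

Run over a corpus of abstracts, a call such as `collocates(tokens, "aims", window=1)` would list the words that immediately precede and follow "aims", which is the raw material for the lexico-grammatical patterns discussed here.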


Lexico-grammatical patterns and banks

Hill’s (1999) view of language paves the way to a major conclusion of this study. The findings showed that one of the most serious difficulties of the post-graduate students at EMU is insufficient knowledge of, or the inability to accurately employ, collocations and lexico-grammatical patterns in their writing. Lexis and grammar are inseparable. The international conference on the theme “Exploring the Lexis-Grammar Interface”, organized by the English Department of the University of Hanover, Germany, in 2006, invited proposals through the following introduction:

Over the last two or three decades, research in corpus linguistics has shown that lexis and grammar are closely interdependent. The conference aims to bring together scholars with a common interest in aspects of lexis-grammar co-selection in the English language. We aim to discuss empirical evidence on the inseparability of lexis and grammar, and explore in what respects these two parts that are often treated separately in linguistic description form an organic whole. (Schulze & Römer, 2006)

Considering that it is extremely difficult for any lexis in isolation to be termed specifically 'academic', that advanced language learners involved in the reporting of research seem to have the most difficulty with collocations and lexico-grammatical patterns, and that lexis and grammar are inseparable, the conclusion emerging from this research points towards lexico-grammar. If researchers and practitioners wish to teach the nuances of academic prose, and assist learners in the production of coherent and appropriate work, they need to identify the lexico-structural patterns that are utilized in specific moves in different genres, and put this valuable resource at the service of their students. Hyland and Tse (2007) also emphasize ‘concordances’ in addition to corpus-informed ‘lists’. They maintain that “… corpus-informed lists and concordances can be used to help establish vocabulary learning goals for EAP courses, design relevant teaching materials, and generally target instruction more carefully” (p. 251). Based on the findings of a study exploring the effect of corpora on writing competence, Yoon (2008) concludes that a focus on lexical and grammatical aspects is essential in L2 writing pedagogy.


Semantic frequency: Moves and Sub-moves

The conclusion regarding the need to focus on lexico-grammatical patterns for productive skills leads to a further conclusion: that the categorization of lexico-grammatical patterns should be based on purpose, in other words, on moves and sub-moves. It is semantic frequency that is called for, since in all genres, there are certain functions to fulfill, certain messages to convey, and “meaning is what is being verbally communicated between the members of a discourse community” (Teubert, 2005, p. 2). Categorization, therefore, needs to be based on the analysis of how words can be used together, as well as which words can be used instead of each other, in order to achieve the specified moves for the required purposes. Such work is painstaking, and whatever is produced is bound to be incomplete given the almost infinite variations of language. However, it is inevitable that such semantic categorization will assist writers in the production of coherent and appropriate text to some degree. This study concludes that lexico-grammatical banks of language organized according to the generic moves provide a very rich resource for student writers in deciding what vocabulary to use, how to use it (through lexico-structural patterns), when to use it, and also for what purpose (through moves, and specifically sub-moves).


Generic versus discipline-specific lexico-grammatical patterns

Emphasizing the importance of disciplinary variation, Hyland and Tse (2007) recommend that in EAP classes, “teachers help students develop a more restricted, discipline-based lexical repertoire” (p. 235). “As teachers”, they say, “we have to recognize that students in different fields will require different ways of using language and so we cannot depend on a list of academic vocabulary” (Hyland and Tse, 2007, p. 249). The conclusions drawn from the current study are not consistent with the claims made by Hyland and Tse (2007). Academic texts from different fields certainly exhibit linguistic variation, but genres cut across subject fields, and moves and many of their lexico-structural realizations are defined not only by the specific subject matter (e.g. architecture), but also by the conventions of the given genre (e.g. abstracts). Furthermore, in many institutions, advanced academic writing is taught in cross-disciplinary classes, and it is of utmost necessity to establish, especially for pedagogic purposes, the nature of cross-field generalities in academic writing. The analysis of the sub-corpora shows that the cross-disciplinary nature of moves and functions leads to the use of lexico-structural patterns that are extremely similar in widely different fields. McCarthy and O’Dell (2008) advise students that:

specialist terms are often relatively easy to master – they will be explained and taught as you study the subject… However, it is the more general words used for discussing ideas and research and for talking and writing about academic work that you need to be fully familiar with in order to feel comfortable in an academic environment. (p. 6)

Hunston also points out that “for many writers who are expert in their own field, … it is not the technical terminology, but what might be called the terminology of rhetoric that causes problems” (2002, p. 135). Another factor that should be kept in mind before attempting to base EAP programs on discipline-specific vocabulary is learner and learning variation. Differences in entry levels of students can be very marked, and it is likely that learners who have not mastered the general lexico-structural building blocks of the language will struggle with both general and specific academic English courses. Therefore, this study concludes that EAP practitioners dealing with classes of students from different academic fields would benefit more from lists and banks that identify generic lexico-structural commonalities, rather than from those that emphasize disciplinary variation.

Eldridge (2008) does not disregard the validity and use of more specialized lists, but emphasizes that general academic lists are necessary for the subsequent production of more specialized lists (p. 112). He points out the important fact that the discipline-specific language is already emphasized by the subject specialist, and thus “the EAP specialist might therefore still be used in helping with the scaffolding” (2008, p. 112). A post-graduate student expresses the need for this ‘scaffolding’ very well:

I think that although the course was very useful, it was about academic writing in general and not specialized in my field [sic] of study- as it is supposed to be, because it is not possible to have different courses for different thesis subjects of students. The thing that I need to do now is focusing [sic] on writing in my field [sic] of study by reading relevant subjects, learning specialized vocabulary, and trying to write similar to them, by the aid of the knowledge I have earned in this course.







approach to thesis writing

Although the original Advanced Thesis Writing course always emphasized the isolation, identification and use of key lexico-structural patterns, it was neither based on corpus data, nor did it make use of data-driven learning activities. The compilation and the analysis of the two corpora, as well as the use of concordancing tools, enabled the researcher to ensure the exploitation of the authentic data, through designing data-driven learning (DDL) tasks at different levels, with various aims, and in differing modes (individual or collaborative).

The generic move structure, IMRD, representative of abstracts and other genres reporting research (Swales, 1990), on the other hand, made it possible to categorize the data according to the basic moves in thesis abstracts, which are parallel, at least to some extent, with the chapters and sections of a thesis. Therefore, the lexico-structural patterns used to fulfill the important moves and sub-moves were identified, offering the post-graduate students guidance not only in terms of the structuring of information (generic moves), but also as regards the relevant language to accomplish the moves (lexico-structural patterns). For students at higher levels, and for those learning a language for specific purposes, the knowledge of genres is as necessary as text creation skills, as each genre has its own discourse structure, style, content and linguistic patterns (Swales, 1990, p. 58). The conclusion of this study is therefore that corpus-informed, data-driven, genre-based approaches to writing with a lexico-grammatical focus offer a comprehensive methodology for assisting writers in the organization of information in a text, as well as the coherent and appropriate expression of ideas (see Appendix O for Teachers’ Notes).

This conclusion is consistent with the literature. Flowerdew (2000), based on her research findings, concludes that providing exercises to familiarize students with the key organizational aspects of the genre is “the starting point for helping students to acquire competence in a particular genre” and that this product-based knowledge should be accompanied by a process approach to writing (2000, p. 375). Recent research by Charles (2007) also supports the conclusion of the present research. Having reconciled corpus investigation with genre-based pedagogy, Charles concludes that “it is the combination of the two approaches that provides the enriched input necessary for students to make the connection between general rhetorical purposes and specific lexico-grammatical choices” (2007, p. 289). She further emphasizes that “in moving from discourse to corpus, the class moves from studying what texts do, to investigating how they do it” (Charles, 2007, p. 300).


Virtual learning environments

It is a significant conclusion of this research that virtual learning environments like Moodle empower learners and promote exploratory learning and autonomy. Computer Mediated Communication, according to Zane and Collins (1995), “promotes a type of interaction that is often lacking in the traditional teacher-based classroom. It allows learners the freedom to explore alternative pathways - to find and develop their own style of learning” (p. 3). Realistically, language classes include a wide range of students from different backgrounds, at different levels, and with different needs. Moodle is a community building platform (Philosophy, 2008), and offers far more through its underlying constructivist and social constructivist principles. It helps the teacher to:

focus on the experiences that would be best for learning from the learner's point of view, rather than just publishing and assessing the information you think they need to know. It can also help you realise how each participant in a course can be a teacher as well as a learner. Your job as a 'teacher' can change from being 'the source of knowledge' to being an influencer and role model of class culture, connecting with students in a personal way that addresses their own learning needs, and moderating discussions and activities in a way that collectively leads students towards the learning goals of the class. (Philosophy, 2008)

The virtual learning platform, Moodle, integrated into the Thesis Writing Course in this study, is the host to the pedagogic corpus, with its multiple components. The corpus-informed tasks are of different types, with different aims, and at different levels. Students have the opportunity to determine what they need, and choose to engage in individual or collaborative tasks. This collaborative environment offered by Moodle also facilitates the construction of knowledge. Before the introduction of Moodle into the course, the participants inquired whether the teacher would support them with their writing-related problems after they finished the course. After the introduction of Moodle, the only concern the participants have is whether they will be able to access Moodle after they have completed the course. It is observable that the lack of self-confidence the participants have in their writing abilities when they first take the course gradually decreases towards the end of the semester. Echoing one of the course participants, Moodle becomes their ‘supervisor’, and they feel safe with its existence.

Provided that the course design is based on sound pedagogic aims, the virtual learning environment provided by Moodle promotes the pedagogical principles embodied in the European Language Portfolio (ELP), mainly in terms of learner-centered classroom practices (p. 6), self-directed collaborative learning (p. 5), learner autonomy (p. 5), communicative intercultural competence (p. 9), differentiated learning (Kohonen, 2000, p. 10), and the importance of input that is ‘functional’, “containing information that the learner would like to know”, and ‘realistic’, “the sort of language utterances that the learner is likely to encounter later in real-life situations” (Westhoff, 1999, p. 41). All of these principles, as discussed in chapter 4, are integrated into the pedagogic corpus through the virtual learning platform. There can be no doubt, therefore, that virtual learning environments with so much focus on and consideration of learners and learning are bound to make a positive impact on learning experiences and outcomes.


The use of Corpora

The most significant conclusions that can be drawn from this research are related to the use of corpora. The use of corpora started to become popular in the 1990s. Keck (2004) states that “since the early 1990s, researchers have become increasingly interested in applying the findings of corpus studies to language teaching” (p. 84). The conventional practice has been for researchers to compile and analyse corpora and integrate the resulting data into teaching materials, regardless of what practitioners think of their usefulness (McCarthy, 2008, p. 565). Gabrielatos (2005) also thinks that although “electronic language corpora, and their attendant computer software, are proving increasingly influential in language teaching as sources of language descriptions and pedagogical materials, … few teachers are clear about their nature or their relevance to language teaching” (p. 1). Nowadays, thanks to technological developments, the use, as well as the compilation and analysis, of corpora is no longer the monopoly of researchers who may not be practitioners.

McCarthy (2008) holds that “teachers should be central stakeholders in the corpus revolution” (p. 565). This study is an example of how a teacher can be a central stakeholder in the corpus revolution, compiling and analyzing corpora, integrating findings into the materials, as well as observing the impact of materials and tasks on learning. More importantly, this study is an example of how learners can also be central stakeholders in the corpus revolution. Gabrielatos (2005) maintains that learners should also be involved in language investigations with the assistance and guidance of the teachers. It is noteworthy that the post-graduate students in this study are able to carry out their investigations through the use of not only the TAC, but also larger corpora like the BNC, and the Brown Corpus.

This research exploited both learner and target corpora to construct a pedagogic corpus with multiple components, which formed the basis for a genre-based corpus-informed approach to academic writing pedagogy. It would have been possible to make use of only a target corpus and design a teaching / learning program based on it. However, the use of learner corpora provides concrete evidence of learners’ language problems. The use of learner corpora has become widespread in EAP in recent years. Gilquin et al. (2007, p. 323) are strong supporters of the use of learner corpora in EAP research, and criticize the fact that “the overwhelming majority of corpus-based EAP studies are exclusively based on native corpora”. Milton and Tsang (1991) also advocate the use of learner corpora as they provide evidence that quantifies students’ problems in written expression (cited in Gilquin et al., 2007, p. 322). Flowerdew (2001), meanwhile, emphasizes that “insights gleaned from learner corpora need to be employed to complement those from expert corpora for syllabus and materials design” (cited in Gilquin et al., 2007, p. 322).

This study has concluded that learner corpora are extremely valuable in that they not only provide teachers with information about the language level and problems of their students, but also learners themselves with data about what they lack, and therefore what they need, in order to compete in the academic world. The basic principles of such an approach do not only apply to post-graduate students involved in reporting their research, but can potentially be exploited in any language teaching-learning situation. The comparison of any specified learner corpora with any type of specified target corpora could provide the simple equation that ‘what one needs to know’ minus ‘what one already knows’ equals ‘what one needs to learn’. Corder (1973) maintains that describing and classifying learners’ errors give us information about what learners still need to learn (p. 257).
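The simple equation above can be illustrated as a set difference between the word types of a target text and those of a learner text. The following is a minimal sketch in Python and not the procedure used in this study: the two sample sentences are invented for illustration, and `vocabulary` and `needs_to_learn` are hypothetical helper names.

```python
def vocabulary(text):
    """Return the set of lower-cased word types in a text."""
    words = (w.strip(".,;:!?()\"'") for w in text.lower().split())
    return {w for w in words if w}


def needs_to_learn(target_text, learner_text):
    """'What one needs to know' minus 'what one already knows'."""
    return vocabulary(target_text) - vocabulary(learner_text)


# Invented sample sentences standing in for a target and a learner corpus.
target = "The findings demonstrate a significant correlation."
learner = "The results show a big relation."
gap = needs_to_learn(target, learner)
```

A comparison of real corpora would also need to consider frequency, collocation, and lexico-grammatical patterning, which a plain set difference of word types cannot capture.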

This study also concluded that quantitative and qualitative analysis should complement each other in corpus analysis, especially in order to accomplish pedagogic goals. Without qualitative analysis, the results of this study would have been figures, percentages, and tables. It is questionable whether, or to what extent, such data would be of any use to post-graduate students involved in writing their theses. In corpus analysis, Hunston (2002) holds, there is “the need to move between quantitative information, which can alert the user to potential points of interest, and qualitative information, which is needed to provide explanation of those points” (p. 212). Similarly, Biber et al. (1998) maintain that corpus analyses should go beyond simple counts of linguistic features, and include qualitative, functional interpretations of quantitative patterns (pp. 4-5). A study like Hyland’s (2000) analyzing hedges and boosters in academic texts would have been impossible to conduct without qualitative analysis of corpora.

A further conclusion of this study in terms of the use of corpora is that, although their size may sometimes be perceived as a limitation, small corpora are valuable for pedagogic purposes, as they are easier to analyse, exploit, and base materials and tasks on. Small specialized corpora are useful when they are designed for specific groups of learners, and specific domains. Mudraya (2006) advocates the use of small corpora for language learning and teaching. She states that such corpora “… can be more useful as they are designed to represent the specific part of the language under investigation and are tailored to address the aspects of the language relevant to the needs of the learner” (p. 237). Hunston (2002), similarly, states that “relatively small but highly specialized corpora can be used in certain situations, for example in describing the language of specific discourse communities or in comparing native-speaker and non-native-speaker usage” (p. 212). Corpora designed for pedagogic purposes “unlike corpora designed for research purposes, … , while useful for language description, are also intended to inform language teaching in specific instructional settings. The corpus, then, is carefully designed to represent the domains of use most relevant to the learner population” (Keck, 2004, p. 90).

A final conclusion that can be drawn from the use of corpora in this dissertation is that such use decreases reliance on native speaker intuition, empowers both the teacher and the learner, promotes the self-confidence of both, and, as mentioned earlier, reduces the dependence of the learner on the teacher, provided that learners are trained in how and when to use corpora. Corpora can potentially become learners’ best companion. The availability of corpora increases their confidence, and decreases their reliance on ‘more proficient’ others. Lee and Swales (2006) hold that with a corpus approach to language, the authority for language standards is ‘decentered’, as this approach allows non-native speakers “a chance to make their own discoveries about what is ‘done’ in the language, instead of relying on native-speaker intuitions or grammar/style books” (p. 69).

In a similar vein, Gabrielatos (2005) maintains that:

The use of corpora in language teaching has helped redefine learner and teacher roles. It has reinforced learner-centred methodologies, and facilitated a further step away from the conception of teachers as sources of knowledge and providers of input, towards one of teachers as guides and facilitators, or even co-researchers. Corpus use has also introduced the need for learners and teachers to acquire new skills, and has placed increased emphasis on the necessity for teachers to develop their awareness of the language they teach. Finally, corpus-based research and teaching has the potential to empower non-native teachers and researchers, since native speaker introspection is no longer considered the one infallible source of insights into language structure and use. (p. 27)

In her article titled “Will Corpus linguistics revolutionize grammar teaching in the 21st century?”, Conrad (2000) claims that corpus-based studies have the potential to revolutionize grammar teaching through providing register-specific descriptions of English grammar, shifting the emphasis from structural accuracy to appropriate use of structures, and most importantly, incorporating grammar teaching with vocabulary teaching (p. 549). The use of corpora seems to have great potential to revolutionize many more aspects of teaching, and other language-related domains. This great potential is likely to be highly influential for a long period of time. As McCarthy (2008) says, “The corpus revolution is here to stay” (p. 573).


5.3 Implications


Teachers and Teacher Education

For practitioners wishing to make use of this work, a few words of caution may be required. The learner corpus was deliberately limited to the work of students taking the researcher's own thesis writing course, and the composition of the sub-corpora in the LAC was dictated accordingly. Obviously, a replication of this procedure with a different group of learners, even in the same institution, would produce different results. However, given the widespread and global nature of English medium education and the production of theses and publications written in English by researchers for whom English is not the first language, it is also suggested that a replication of this procedure with many groups of learners would be likely to lead to results which are at least markedly similar in many ways, and which support this undertaking. The research has yielded a bank of genre-based lexico-structural items that are of general utility to EAP teachers, and particularly those working at the postgraduate level. For those practitioners who feel that their circumstances are somewhat more rarified, or indeed dissimilar to those faced by the researcher, what is made available here is an innovative method that enables analysis and categorization for practical and pedagogic purposes.

With the developments in computer technology, it is now possible for teachers to conduct their own corpus-based research. Even the use of corpora, let alone their compilation and analysis, requires certain skills to be acquired. However, “to date, relatively little attention has been given in teacher education programmes to the growing influence of corpora and the skills of evaluating and using corpora” (McCarthy, 2008, p. 563). With the use of corpora becoming more and more widespread, teacher education programs should integrate modules with a special focus on how teachers can compile, analyze, and exploit corpora. As already stated, McCarthy (2008) claims that the corpus revolution is “here to stay, and teacher education cannot afford to sideline it” (p. 573).



No individual student embarks on a course of study at exactly the same level as any other student, and the distance and route to be traveled between the ‘actual level’ and the ‘required level’ is in each case unique. Each individual thesis is also unique in that the post-graduate student is struggling to produce an original piece of work. Additionally, students are from different subject areas and disciplines. Hence, the pedagogic materials and tasks need to be flexible. It seems essential therefore that the lexico-structural bank of moves and sub-moves created in this research is used as an open-ended resource bank, and not as an imposition on groups of students from different subject areas. In fact, probably just as important as the categorized lexico-structural patterns is the use of corpora in raising students’ language awareness and promoting independent learning. It is hoped, therefore, that the corpora are used not only as tools for teaching and learning, but also as resources that can be used in the writing process itself, in the same way that a student might make use of a dictionary or thesaurus.


Software Applications and Virtual Learning Environments

As the pedagogic corpus evolved, progressively greater use was made of the fast-developing Moodle software application. The General Education department where the researcher works is the pioneering department at EMU in integrating Moodle into its language instruction. The introduction of the software at the institution is recent, and not all the facilities it offers have been fully exploited to date. However, time has been sufficient to make substantial developments in this regard, as well as to gain an appreciation of what else it can offer.

As an interactive (or Web 2.0) based course design application based on social constructivist principles, Moodle offers a number of features of relevance to this project. Firstly, it enables a course to be fully designed and implemented through the Web, making it possible for students to use as a resource for both individual and collective study. It can also be used and shaped as required by individual teachers as both a pedagogic resource and as a means of communication with learners, both individually and collectively.

Furthermore, Moodle is able to serve as a basic resource centre. In this study, both the TAC and the LAC can be accessed through Moodle, allowing the post-graduate students privileged access to the web concordances of the two corpora, generated using the Concordance software package (Watt, 2004). This enables the students to check the use, the frequency and full contexts of all the items in the corpora. Once the students have acquired an understanding of the major moves and functions in thesis and academic writing, they have a powerful tool on their desktop that they can exploit according to their individual needs. Moodle also has a powerful glossary feature that enables key items to be highlighted wherever and whenever they appear on the site and then linked into the glossary itself. The glossary can be exploited to provide a summary of details of key vocabulary in addition to lexico-structural patterns. Other features that can be used through or with Moodle include gap-filling and matching software such as HOT POTATOES and collaborative editing software like WIKI. The facility for online submission and return of assignments also offers significant pedagogic benefits.
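What a concordance offers the student, namely the frequency of an item and the contexts in which it occurs, can be illustrated with a simple key-word-in-context (KWIC) listing. The following is a minimal sketch in Python and does not reproduce the behaviour of the Concordance software package: the function name `concordance`, the `width` parameter, and the sample sentence are assumptions made for illustration.

```python
def concordance(tokens, keyword, width=3):
    """Return key-word-in-context lines for every hit of `keyword`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            # Up to `width` tokens of context on each side of the hit.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Invented sample text standing in for a corpus.
sample = ("This study aims to investigate the effects "
          "and the study concludes that").split()
hits = concordance(sample, "study", width=2)
frequency = len(hits)  # number of occurrences of the key word
```

Each returned line shows the key word in brackets with its immediate context, so that recurrent patterns to the left and right of the item can be compared line by line.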

To conclude, this research has generically categorized the language used in thesis writing for pedagogic purposes. However, what Moodle does is to enable these categories to be organized and presented to both teachers and learners in an accessible style. This is extremely significant considering the size of the data.

It should be emphasized, however, that Moodle is only a software application: “Like all technology, the Moodle, in itself, is not that revolutionary. What teachers do with it, given the autonomy to let them follow their creativity within solid pedagogical principles, can be” (Eldridge and Neufeld, 2007, p. 25). Yet, at present, using technology has become so widespread that Dudeney (2008) claims that “technology will not replace teachers, but teachers who use technology will replace those who don’t”.


The Use of Vocabulary Profiling Software

It should also be emphasized that vocabulary profiling software both initiated this study and made a critical contribution to it. This study was initially motivated by the compilation of the AWL, and the free vocabulary profiling tools made available on Tom Cobb’s Compleat Lexical Tutor website. Both RANGE and the Vocabulary Profilers (VP) have been integral to the implementation of this project. Therefore, the use of vocabulary profiling tools as a standard day-to-day practice for both teachers and students is recommended. As part of a related study, Hancioglu and Eldridge (2007) attempted to show how VP tools may be used by teachers to analyse texts and make reliable decisions on which vocabulary to teach at what point in the teaching-learning process. It may also be added that vocabulary profiling tools enable the analysis not only of authentic academic texts, but also of student work in either 'complete' or draft form. The basic VP tools offer a quick and economical way of analyzing and comparing the vocabulary profiles of any texts. Besides, they are very easy to operate. The use of these tools is strongly recommended not only for teachers, but also for students in the process of constructing texts.
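The principle behind vocabulary profiling, that is, reporting what proportion of the words in a text belongs to each frequency band or wordlist, can be sketched as follows. This is a minimal illustration in Python and not the implementation of RANGE or the Vocabulary Profilers: the function name `profile`, the two tiny word bands, and the sample sentence are assumptions made for illustration, and real profiling would rely on full wordlists.

```python
def profile(tokens, bands):
    """Return the percentage of tokens falling into each named band."""
    counts = {name: 0 for name in bands}
    counts["off-list"] = 0
    for token in tokens:
        word = token.lower()
        for name, words in bands.items():
            if word in words:
                counts[name] += 1
                break
        else:
            # The token was not found in any band.
            counts["off-list"] += 1
    total = len(tokens) or 1  # avoid division by zero for empty texts
    return {name: 100 * n / total for name, n in counts.items()}


# Tiny invented bands standing in for real frequency and academic wordlists.
bands = {
    "high-frequency": {"the", "of", "a", "this", "is"},
    "academic": {"analysis", "data", "research"},
}
text = "This research is the analysis of corpus data".split()
result = profile(text, bands)
```

Running the same function on an authentic text and on a student draft would allow the two profiles to be compared band by band.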

5.4 Implications for further research

As with all research, this study has been restricted largely by issues of practicality, economy, and, as is traditional, time. Considering the scope of the current study as well as its limitations, a number of suggestions can be made as to how future research can fruitfully build on the insights gained from the conclusions and implications of this research.

The research project has been limited to some extent by the selected sub-corpora, as they were of unequal sizes for reasons beyond the researcher’s control. Therefore, researchers and practitioners making use of the data and the discussion need to be aware that the data from the learner corpus is not based on equally-distributed sub-corpora. Since the basic structure of the abstract, as well as the emerging lexico-structural patterns, cuts across fields, it is unlikely that this limitation is of major significance, although wider and equal coverage would have been preferable. Hence, practitioners who are involved in the teaching of academic English in specific subject areas could conduct follow-up research by compiling corpora from their specific fields. The growing collection of learner corpora available for academic use can also be exploited in this respect.

The use of abstracts as the basis for the study has produced detailed insights into the functioning of the thesis as a whole. Compilation and analysis of much larger and more extensive corpora of Master’s and Doctoral theses is a valuable route that can be followed in future research.

A further approach that would certainly be of significance would be to construct sub-corpora of individual moves, and analyze these corpora separately to extract the lexico-grammatical patterns representative of each move and sub-move. This would undoubtedly be a time-consuming exercise, as not only do moves often overlap at the phrasal and sentential levels, but some lexico-grammatical patterns are also multifunctional. Nevertheless, such an approach would certainly constitute an ideal follow-up to the work described here, and is highly recommended for future researchers, particularly those with access to the technology that can facilitate the process of tagging the moves.

This study focused on identifying the lexis and lexico-grammatical patterns used for fulfilling thesis moves and functions that cut across different disciplines. An alternative approach to analysis could certainly be to focus on deriving wordlists and lexico-structural banks from specific fields, and sub-fields, as recommended by researchers such as Hyland and Tse (2007).

The learners represented in the learner corpus in this research come from different linguistic backgrounds. A follow-up study could comfortably aim at separating the corpus into sub-corpora based on linguistic backgrounds, and using error analysis to analyze and determine the cross-linguistic differences.

An obvious route that could be taken as a follow-up to this research would be to carry out a longitudinal study to determine the impact of the pedagogic corpus on the participants’ text creation skills over time. However, as the present ENGL501 Thesis Writing Course is a semester-based course, issues of time and feasibility could pose serious limitations to such a study.

As with all work of this nature, there have certainly been limitations, errors and omissions of various types in this research. Yet, it is hoped and indeed believed that the work described here may be of substantial assistance to those engaged in the teaching and learning of thesis writing in particular, and academic writing in general.

5.5 The Final Word

This study has most importantly demonstrated how corpora can be exploited by practitioners for pedagogic purposes. Painstaking as corpus analysis may be, the data provided and the opportunities offered are so great that the hard work becomes worthwhile. This research has also indicated that wordlists with a unified perspective, such as the BNL, can comfortably be used for setting targets for students, and for determining their receptive level, ‘the extent to which students understand language’ (Kohonen, 2000, p. 10). However, for students to ‘make themselves understood’ (Kohonen, 2000, p. 10), in other words, for production purposes, wordlists need to be supplemented. Especially at post-graduate levels, where students are expected to be aware of generic features, the study has concluded that lexico-structural banks of moves and sub-moves can help students in producing coherent and appropriate text. This study has also shown how available software and technologies may be harnessed in helping learners use data-driven learning techniques as a fundamental method of improving the level of their academic writing skills. Another important conclusion of this study is that virtual learning environments provide extensive learning opportunities and experiences for their users. This research was motivated by the difficulties that the post-graduate students at the Eastern Mediterranean University in Northern Cyprus faced in writing their theses, and therefore the final voice should be theirs:

Before the class, my idea was no matter how hard I try, I can not write well because good writing only belongs to native speakers, but now, I feel that I only need sufficient work and practice in order to write well.



APPENDIX A: EFL 501 Course Description


Instructor: Nilgun HANCIOGLU
Extension: 1070

Class Hours: Can be arranged by negotiation with the participants (any day between 14.30-17.30 or 15.30-18.30)

Office Hours: To be announced in class by the instructor

Course Description: EFL 501, offered by the SFL, is a non-credit course aimed at developing the academic and professional writing skills of MA/MS candidates. During the course, participants will have the chance to analyze the grammar and lexis of authentic academic texts. Participants will then be invited to produce their own work and will be encouraged to recognize their own problems with the use of the language and find solutions to those problems. At the same time, attention will be paid to devices that make a piece of writing coherent and cohesive. The emphasis will be on trying to produce texts where ideas are unified through certain cohesive devices and sound reasoning. Participants who would like to take this course should not have entered the thesis writing stage yet. Those MA/MS candidates who have already started their thesis are advised to enroll on EFL 502. It is essential that participants to these courses have at least a good intermediate level of English.

Course Assessment Procedures: This is a non-credit course. However, participants should expect to submit for formal assessment a portfolio of their own written work. Some of this work will be done in class, and some will be done outside class. The exact content of these submissions will be discussed during the course. The portfolio will be made up of a number of short pieces of work. In order to achieve a satisfactory grade, candidates will be expected to attend classes, and complete written tasks as required.

Plagiarism: This is intentionally failing to give credit to sources used in writing, regardless of whether they are published or unpublished. Plagiarism is a disciplinary issue and will be dealt with accordingly.

Week (Date): Classwork

Weeks 1-2 (1-5 March, 8-12 March):
• Introduction/Overview of the Course/Lexical Tutor
• Unit 1: Introducing
• Unit 2: Narrating and Reporting
Week 3 (15-19 March): • Unit 2: Narrating and Reporting (cont.)
Week 4 (22-26 March): • Unit 3: Paraphrasing, Summarising, Quoting and Synthesising
Week 5 (29 March – 2 April): • Unit 4: Describing Processes and Developments
Week 6 (5-9 April): • Unit 5: Giving Examples
Week 7 (12-16 April): • Unit 6: Generalizing
Week 8 (19-23 April): • Mid-term Exam Week (23 April: National Holiday)
Week 9 (26-30 April): • Unit 7: Classifying and Categorising
Week 10 (3-7 May): • Unit 7: Classifying and Categorising (cont.)
Week 11 (10-14 May): • Unit 8: Cause and Effect
Week 12 (17-21 May): • Unit 8: Cause and Effect (cont.) (19 May: National Holiday)
Week 13 (24-28 May): • Unit 9: Comparing and Contrasting
Week 14 (31 May – 4 June): • Unit 10: Defining
Week 15 (7-8 June): • Unit 11: Interpreting Data (Tables, Graphs, Charts)
Week 16 (11-18 June): • Final Exam Week
Week 17 (21-26 June): • Final Exam Week

This is a tentative course outline, subject to revision or adjustment as and if required.


APPENDIX B: ENGL501 Course Description

ENGL501 is a post-graduate academic English course. The purpose of this course is to develop the academic writing skills of MA/MS and Ph.D. candidates. The prime focus will be on examining authentic academic texts, and analysing such elements as structure, lexis, and style in theses and dissertations. Participants will then be invited to exploit this detailed understanding of textual dynamics in their own writing and helped to produce work that is accurate, concise, and appropriate.

The aims of ENGL501 are:
o To improve and develop academic writing skills and knowledge.
o To improve and develop academic study skills and knowledge of academic conventions.
o To improve and develop knowledge of thesis structure and format.
o To develop skills and knowledge of textual dynamics.
o To systematically develop academic vocabulary knowledge and skills.
o To develop awareness of potential problems in academic writing.
o To furnish with knowledge and strategies for dealing with problems in academic writing.
o To develop awareness of the need and benefit of producing multiple drafts in academic writing with the aim of improving the structure, lexis and style of own text and bringing it to a satisfactory level which would be accepted by the academic discourse community.
o To develop skills in exploiting computers both as a study resource and as a tool for producing professionally presented work.

On successful completion of this course, all students will have developed knowledge and understanding of:
o Thesis structure and format.
o Discourse and style of academic writing.
o Organizational structure of theses.
o English grammatical structure, functions and discourse patterns.
o Accurate and appropriate use of academic vocabulary.
o The importance and value of drafting, revising and editing written work.

On successful completion of this course, all students will have developed their skills in:
o Planning the steps of research.
o Placing research data in different sections of a thesis.
o Planning, drafting and writing a thesis proposal, introduction, literature review, methodology and abstract in the appropriate format, organizational structure and style.
o Conforming to academic conventions in writing their thesis.

On successful completion of this course, all students will have developed their appreciation of and respect for values and attitudes regarding the issues of:
o Academic honesty and the issue of plagiarism.

There are three class hours per week. Lessons are not lecture-based, although obviously there is some formal teacher input. For the most part though, participants should be expected to take active part in class discussions related to and dependent upon examining authentic academic texts, and analysing such elements as structure, lexis, and style, especially in theses and dissertations. Participants are then invited to exploit this detailed understanding of textual dynamics in their own writing and helped to produce work that is accurate, concise, and appropriate. In addition to the three class hours per week, there is also a complementary web-based interactive e-learning platform, Moodle, which provides the participants with maximum exposure to more tasks, materials, and interaction with peers. In addition, this platform enables constant communication between the participants and the instructor. Students are also expected to make use of the instructor’s office hours. Since this course is aimed at improving academic writing skills, participants should expect to spend a lot of time outside the classroom, writing and bringing their written work to a satisfactory level.

Participants are expected to work on parts of their thesis, which they can later expand. Though this is a non-credit course, attendance is compulsory as the completion of the assignments is closely linked to the input provided in class and class discussions. A student not attending 50% of the classes and not submitting the required assignments in a portfolio on time will receive an Unsatisfactory grade.

MATERIALS/TOOLS
Materials and tasks designed / compiled by the course instructor. Interactive e-learning platform

INDICATIVE BASIC READING LIST
None

EXTENDED READING LIST
Academic texts (theses, research articles)

SEMESTER OFFERED
Fall and Spring Semester

CONTENT & SCHEDULE
Time of classes for 2008-2009 Fall Semester.
Group 1: Thursday 13.30-16.30
Group 2: Tuesday 13.30-16.30
The following is a tentative course schedule, and it may be subject to change if and when required.


WEEK (DATE): CONTENT

Week 1 (6-10 Oct.): Introduction to the Course and Requirements
Weeks 2-4 (13-17 Oct, 20-24 Oct, 27 Oct-31 Oct; 29th Oct. Holiday):
• Introduction to the Course and Requirements
• Introduction to Moodle-Interactive e-learning platform
• Student Profiles
• Diagnostic Writing Exercise (CELF / IELTS criteria)
• Discussions on writing
• Lab session: Thesis structure; Microsoft word tools
• Moodle
• Avoiding Plagiarism
• Quoting/paraphrasing/bibliographies
Week 5 (3-7 Nov): • Quoting/paraphrasing/bibliographies (contd)
Week 6 (10-14 Nov): • Writing Research (Thesis) Proposals
Week 7 (17-21 Nov): • Writing Research (Thesis) Proposals (contd)
Week 8 (24-27 Nov): • Writing Introductions 1
Week 9 (28 Nov-6 Dec): MIDTERM EXAMS
Week 10 (8-12 Dec): RELIGIOUS HOLIDAY
Week 11 (15-19 Dec): • Writing Introductions II
Week 12 (22-26 Dec): • Literature review chapter
Week 13 (29 Dec-3 Jan; 1st Jan. holiday): • Methodology Chapter
Weeks 14-16 (5-9 Jan, 12-16 Jan, 19-22 Jan):
• Data Analysis and Conclusion Chapters
• Abstracts
• End-of-semester evaluation
• End-of-semester in-class writing
Week 17 (26 Jan-10 Feb): FINAL EXAMS


APPENDIX C: Interview Guide

1. Do you think that EFL 501 is a necessary course for post-graduate students? Why/Why not?
2. How long have you taught this course?
3. Could you please describe your experiences?
4. Could you please talk a little bit about the student profiles?
5. Can you talk a little bit about these students’ problems?
6. Were the problems specific to one group only?
7. What were the most common problems faced by these students?
8. What do you think could be done to reduce the problems these students face?
9. Do you think the EFL 501 course you were teaching suited the needs of these students?
10. Did you observe any concrete developments in these students’ use of the language at the end of the course? If so, mostly in terms of what?
11. Would you teach this course in the same way and using the same approach and materials if you taught it again? If not, what changes/additions/deletions would you make?
12. Some faculties have made this course a compulsory course for their postgraduate students? How do you feel about this?


APPENDIX D: Needs Analysis Questionnaire

EFL 501-Advanced Writing 1
2004-2005 Academic Year
STUDENT PROFILE QUESTIONNAIRE

Name:
Student number:
Department:
Stage in postgraduate study:
e-mail:

How would you rate your English as a whole, in terms of knowledge of the language, vocabulary, reading, writing, speaking and listening? Excellent? Good? Fair? Poor?

What kinds of writing do you need to do in English?

How would you rate your writing in English? Excellent? Good? Fair? Poor?

What do you find most difficult about writing in English? Please give details.

How do you check your written work?

What have you done up to now to improve your writing skills? What has helped you most? How?

Do you use any resources to help you with your writing (the internet, self-study books, dictionaries, etc.)? Which ones?

Do you write in hand or do you word-process your documents (including your first drafts)? Which one is more practical, time-saving and useful for you and why?

How familiar are you with the following terms in writing? Put a tick next to the ones you are very familiar with and a cross next to the ones you have never heard of.
Wordiness
Cohesion
Coherence
Process writing
Drafting
Revising
Editing
Appropriacy of lexis
Collocations
Rhetorical styles
Genres
Register
Citing sources
Avoiding plagiarism
Quoting, paraphrasing, summarizing
Bibliographies
Format
Accuracy of language
Punctuation


APPENDIX E: Course Evaluation Questionnaire

EFL 501-Advanced Writing 1
2004-2005 Academic Year/Spring Semester
END-OF-SEMESTER FEEDBACK FORM

Name:
Student number:
Department:
Stage in postgraduate study:

How would you now rate your English as a whole, in terms of knowledge of the language, vocabulary, reading, grammar? Excellent? Good? Fair? Poor? Has the course benefited you in this respect? If so, how?

How would you now rate your writing in English? Excellent? Good? Fair? Poor? Has the course benefited you in this respect? If so, how?

What do you still find most difficult about writing in English? Please give details and, if possible, reasons.

How do you now check your written work? Do you use any tools? Has the course benefited you in this respect? If so, how? Give details, please.

Do you now use any resources to help you with your writing (the internet, self-study books, dictionaries, etc,..)? Has the course benefited you in this respect? If so, how? Give details please.

In what other ways, if any, has the course benefited you?


Do you have any other comments about the course in terms of content, teaching materials, instructional methods, focus, etc?

Would you advise a friend to take this course? Why/Why not?

What changes/additions/deletions would you suggest and why?

How familiar are you now with the following terms in writing? Put a tick next to the ones you are very familiar with and a cross next to the ones you have not heard of.
Wordiness
Cohesion
Coherence
Process writing
Drafting
Revising
Editing
Appropriacy of lexis
Collocations (words that go together)
Citing sources
Avoiding plagiarism
Quoting, paraphrasing, summarizing
Bibliographies
Format
Accuracy of language
Punctuation
Formal/informal language


APPENDIX F: List of Abstracts

Algebra
1) C(star)-algebras for boundary actions of Abelian-by-cyclic groups ARIZONA STATE UNIVERSITY/ PhD, 2005
2) The X-Legion: A compiler-approach to exploit locality and portability of divide-and-conquer algorithms UNIVERSITY OF CALIFORNIA, IRVINE/ PhD, 2005
3) Cohomological methods for determining numerical invariants of algebras and modules RUTGERS THE STATE UNIVERSITY OF NEW JERSEY - NEW BRUNSWICK/ PhD, 2005

Accounting
1. Economic liberalization and its impact on civil war, 1870–2000 STATE UNIVERSITY OF NEW YORK AT BINGHAMTON/ PhD, 2005
2. The influence of leadership on the motivation of virtual teams NORTHCENTRAL UNIVERSITY/ PhD, 2005
3. The influence of evaluative reactions to attribute frames and accounting data on capital budgeting decisions VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY/ PhD, 2005
4. The economic, financial accounting and governance determinants of synthetic lease financing THE PENNSYLVANIA STATE UNIVERSITY/ PhD, 2005
5. On labor force participation of married women: The case of the United States since 1959 UNIVERSITY OF MINNESOTA/ PhD, 2005
6. An integrated systems approach to macroeconomic modeling: Stock-flow consistent accounting and the dynamics of interaction between the real and financial economy NEW SCHOOL UNIVERSITY, PhD, 2005
7. Implications of alternative emission trading plans MCMASTER UNIVERSITY (CANADA)/ PhD, 2005
8. The restructuring and fragility of the Mexican financial system NEW SCHOOL UNIVERSITY/ PhD, 2005
9. Does the extent of compliance with international accounting standards affect information asymmetry? OKLAHOMA STATE UNIVERSITY/ PhD, 2005
10. Convertible debt proceeds: Allocation methods and evidence of market valuation MISSISSIPPI STATE UNIVERSITY/ PhD, 2005
11. Pension accounting and the public interest UNIVERSITY OF CALGARY (CANADA)/ PhD, 2005
12. Introducing an ethical dimension into the earnings management decision VIRGINIA COMMONWEALTH UNIVERSITY/ PhD, 2005

Anthropology
1) Fishing-Dependent Communities on the Gulf Coast of Florida: Their Identification, Recent Decline and Present Resilience/ University of South Florida - Tampa, FL, MA, 2003
2) Falun Gong in the United States: An Ethnographic Study/ University of South Florida - Tampa, FL, MA, 2003
3) Dostoevsky's Conception of Man: Its Impact on Philosophical Anthropology/ The Pennsylvania State University, PhD, 1997
4) Impurity and Death: A Japanese Perspective/ San Francisco State University - San Francisco USA, MA, 2002
5) Osteometric Assessment of 20th Century Skeletons from Thailand and Hong Kong/ Florida Atlantic University, MA, 1997

Archaeology
1) An Archaeological Analysis of Gender Roles in Ancient Non-Literate Cultures of Eurasia/ Flinders University, Australia, MA, 2004
2) Bioarchaeology of the St. Mary's Free Ground Burials: Reconstruction of Colonial South Australian Lifeways/ University of Adelaide, Adelaide, PhD, 2004
3) The Blokemuseum: Motor Museums and Their Visitors/ Flinders University, PhD, 2004
4) Place as Occupational Histories: Towards an Understanding of Deflated Surface Artefact Distributions in the West Darling, New South Wales, Australia/ University of Auckland, Auckland, New Zealand, PhD, 2004
5) Investigations Towards a Late Holocene Archaeology of Aboriginal Lifeways on the Southern Curtis Coast, Australia/ University of Queensland, PhD, 2004
6) Site Unseen: Archaeology, Cultural Resource Management, Planning and Predictive Modelling in the Melbourne Metropolitan Area/ La Trobe University, Australia, PhD, 2003
7) "Of More Than Usual Interest": A Bioarchaeological Analysis of Ancient Aboriginal Skeletal Material from Southeastern South Australia/ Flinders University, Adelaide, PhD, 2003
8) Numerous Indications: The Archaeology of Regional Aboriginal Behaviour in Northwest Central Queensland/ University of New England, Austr.
PhD, 2003 9) Knowledge, Power and Voice: An Investigation of Indigenous South Australian Perspectives of Archaeology/ Flinders University, PhD, 2003 10) Agate and Carnelian Beads and the Dynamics of Social Complexity in Iron Age Mainland Southeast Asia/ University of New England, Armidale, PhD, 2003 11) The Archaeology and Socioeconomy of the Gunditjmara: A Landscape Analysis from Southwest Victoria, Australia/ Flinders University, PhD, 2002 12) Beyond the Divide: A New Geo-Archaeology of Aboriginal Stone Artefact Scatters in Western NSW, Australia/ Macquarie University, PhD, 2002 13) Ngarranggani, Ngamungamu, Jalanijarra: 'Lost Places', Recursiveness and Hybridity at Old Lamboo Pastoral Station, Southeast Kimberley, WA/ University of Western Australia, PhD, 2002 14) Inland Pilbara Archaeology: A Study of Variation in Aboriginal Occupation Over Time and Space on the Hamersley Plateau/ University of Western Australia, MA, 2002 15) A Space of Their Own: Nineteenth Century Lunatic Asylums in Britain, South Australia and Tasmania/ Flinders University, PhD, 2002 16) Deep Structures: An Examination of Deliberate Watercraft Abandonment in Australia Flinders University, PhD, 2002 17) Late Holocene Indigenous Economies of the Tropical Australian Coast: An Archaeological Study of the Darwin Region/ Northern Territory University, PhD, 2001 18) Database Design, Archaeological Classification and Geographic Information Systems: A Case Study from Southeast Queensland/ University of Queensland, PhD, 2001 19) Continuity and Change: A Late Holocene and Post Contact History of Aboriginal Environmental Interaction and Vegetation Process from the Keep River Region, Northern Territory, University of Wollongong, PhD, 2000 20) Palaeo-Environmental Change and the Persistence of Human Occupation in South-Western Australian Forests/ University Western Australia, PhD, 2000 21) The Archaeology of Body Modification: Identifying Symbolic Behaviour Through Usewear and Residues on Flaked Stone 
Tools/ University of Queensland, PhD, 2000 22) Station Camps: The Ethnoarchaeology of Cultural Change in the Post-Contact Period in the South-East Kimberley Region of Western Australia/ Flinders University, PhD, 2000 23) Squatting Landscapes in South-Eastern Australia (1820-1895)/ University of Sydney, PhD, 2000 24) Here and There: Links Between Stone Sources and Aboriginal Archaeological Sites in Sydney, Australia/ University of Sydney, MA, 1999 25) Past Aboriginal Hunter-Gatherer Economy and Territorial Organisation in Coastal Districts of Western Australia's Lower South-West/ University of Western Australia, PhD, 1999 26) 'Intended Solely for their Greater Comfort and Happiness': Historical Archaeology, Paternalism and the Peel Island Lazaret/ University of Queensland, PhD, 1999 27) Dependent Colonies: The Importation of Material Culture and the Establishment of a Consumer Society in Australia Before 1850/ Flinders University, PhD, 1999 28) Microdebitage and the Archaeology of Rock Art: An Experimental Approach/ University of Sydney, MSc, 1999 29) Wangala Time, Wangala Law: Hunter-Gatherer Settlement Patterns in a Sub-Humid to Semi-Arid Environment/ La Trobe University, Austr., PhD, 1997


30) The Importance of Quartz in Stone Artefact Assemblages: A Technological Analysis of Five Aboriginal Sites of the Coonabarabran/Warrumbungle Region/ University of New England, Armidale, MA, 1996 31) SS Xantho: Towards a New Perspective. An Integrated Approach to the Maritime Archaeology and Conservation of an Iron Steamship Wreck/ James Cook University of North Queensland, PhD, 1996 32) Accommodating the Destitute: An Historical and Archaeological Consideration of the Destitute Asylum of Adelaide/ Flinders University of South Australia, MA, 1996 33) Situating Style: An Ethnoarchaeological Study of Social and Material Context in an Australian Aboriginal Artistic System/ University of New England, Austr., PhD, 1994 Architecture/ Urban-regional planning/ Landscape Architecture 1) Reciprocity and Mutualism: The Integration of Landscape and Architecture in the Reclamation of the Former Cornfields Rail Yard. California State Polytechnic University. 2) The Creativity Experience: Examining the Design Process in Landscape Architecture. California State Polytechnic University. 3) Emotional Responses to Flowering Landscapes. California State Polytechnic University. 4) Tree Ordinances: Public Opinion Survey Examining Issues of Functionality and Aesthetics in Del Mar, California. California State Polytechnic University. 5) Increasing the Acceptability of Urban Nature Through Effective Cues to Care: A Study of the Lower Arroyo Seco Natural Park, Pasadena, California. California State Polytechnic University. 6) Anxiety and Situational Stress in Medical Oncology Patients: An Environmental Study of Landscape Views in Treatment Room Settings. California State Polytechnic University. 7) Communicating the Value of Landscape Architecture. California State Polytechnic University. 8) Unnatural Nature: Eight Artists Look at Southern California. California State Polytechnic University. 9) Landscape Water Conservation: Toward a Community Education Strategy for the Helix Water District. 
California State Polytechnic University. 10) Environmental Innovation in Residential Subdivision Design: An Investigation in Orange County, California. California State Polytechnic University. 11) A Study of Human Factors Related to Food Production in Regenerative Agriculture: A Design of a Preliminary Labor Model at the Institute for Regenerative Studies. California State Polytechnic University. 12) Rainforest Conservation and Agricultural Development: Conflict and Compatibility in Baja Talamanca, Costa Rica. California State Polytechnic University. 13) Ecosystematic Stormwater and Flood Management Practices for Southern California. California State Polytechnic University. 14) Guidelines for Water Conservation, Including Integrated Computer Systems for Measuring, Designing and Managing Water Use in the Landscape. California State Polytechnic University 15) The Community for Regenerative Studies: An Investigation into the Human Role in the Environment. California State Polytechnic University. 16) Renewable Energy Systems: An Integrated Approach to Environmental Design. California State Polytechnic University 17) The Value of Complexity in Environmental Design. California State Polytechnic University. 18) Reclamation Techniques for Western Surface Coal Mining. California State Polytechnic University. 19) The Gardens of Edward Huntsman-Trout. California State Polytechnic University. 20) Shopping Centers: Behavioral Archetypes and Design Synthesis. California State Polytechnic University. 21) Providing Landscape Architect Design Services for the Average Homeowner. California State Polytechnic University 22) Comparative River Basin Planning: A Historiographical Method for the Analysis of Regional Planning in America and China. California State Polytechnic University. 23) Power, Identity, and the Rise of Modern Architecture from Siam to Thailand- University of Colorado - Denver, CO, USA


24) The First World War, Britain, and modern design: The social use of architecture in inter-war Birmingham (England) AUBURN UNIVERSITY 25) Automatic building extraction for three-dimensional terrain reconstruction using image interpretation techniques UNIVERSITY OF NEW SOUTH WALES (AUSTRALIA) 26) Shaping sacred space: Toward an evangelical theology of church architecture TRINITY EVANGELICAL DIVINITY SCHOOL 27) Becoming indigenous: A mosaic of house and home UTAH STATE UNIVERSITY 28) Boston's 'three-decker menace': The buildings, the builders and the dwellers, 1870s--1930 (Massachusetts) BOSTON UNIVERSITY 29) Hybrid sketching: A new middle ground between 2- and 3-D, PhD, MIT, 2005 30) Building negotiation: Architecture and sociopolitical transformation at Chau Hiix, Lamanai, and Altun Ha, Belize/ Indiana University, 2005 31) Reading in three dimensions: Architectural biography from Harriet Beecher Stowe to Edith Wharton (Henry James, William Dean Howells), Boston University, 2005 32) Do the original elements by Kevin Lynch on community design apply today with New Urbanism? (Alabama)/ Mississippi State University, 2005 33) Participatory virtual preservation: A human-centered digital media procedure for architectural history inquiry/ Harvard University, 2005 34) Architecture that embodies the symbolic nature of good leadership and promotes productive collaboration between women's international organizations. 
WILL: Women's International Leadership League/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch 2005 35) Music on the edge: An addition to the music conservatory of Tolima, Colombia/ MArch, UNIVERSITY OF MARYLAND, COLLEGE PARK, 2005 36) The sustainable development of urban 'scrap sites'/ MArch, CARLETON UNIVERSITY (CANADA), 2005 37) View out of a window: Visual preferences of dually diagnosed adolescents residing in group homes/ OKLAHOMA STATE UNIVERSITY, MS, 2005 38) 'Fairy habitations of the mimic city': Sacred Victorian cottages at Chester Heights Camp Meeting/ UNIVERSITY OF DELAWARE (WINTERTHUR PROGRAM), MA, 2005 39) 'The real idealism of history': Historical consciousness, commemoration, and Johannes Brahms's 'years of study' (Germany)/ COLUMBIA UNIVERSITY, PhD, 2005 40) The politics of architecture: Suor Domenica da Paradiso and her convent of la Crocetta in post-Savonarolan Florence (Italy)/ RUTGERS THE STATE UNIVERSITY OF NEW JERSEY - NEW BRUNSWICK, PhD, 2005 41) Transforming Early Gothic form: The Cistercian church of Pontigny, Saint-Martin at Chablis, and northern Burgundian architecture (France)/ UNIVERSITY OF CALIFORNIA, SANTA BARBARA, PhD, 2005 42) Physical-social capital: Towards a critical design praxis for communities of place (Texas) / UNIVERSITY OF GUELPH (CANADA), PhD, 2005 43) Simplified building energy analysis tool for architects/ ILLINOIS INSTITUTE OF TECHNOLOGY, PhD, 2005 44) Relevant attributes in assessment for design features of indoor games halls: The application of importance-performance analysis/ INDIANA UNIVERSITY, 2005 45) The life story of the Cemberlitas Hamam: From bath to tourist attraction (Turkey)/ UNIVERSITY OF MINNESOTA, PhD, 2005 46) An archaeology of the fragment: The transition from the antique fragment to the historical fragment in French architecture between 1750 and 1850/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 47) An architectural history of grand opera houses: Constructing cultural identity in urban America from 1850 to the 
Great Depression/ RUTGERS THE STATE UNIVERSITY OF NEW JERSEY - NEW BRUNSWICK, PhD, 2005 48) Formalizing the informal city: Designing for development in a Peruvian shantytown/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 49) The politics of style: Building, builders, and the creation of federal Boston (Massachusetts, Charles Bullfinch)/ UNIVERSITY OF MASSACHUSETTS AMHERST, PhD, 2005 50) Re-weaving the urban fabric: A new midtown residential neighborhood in Newport News, Virginia/ UNIVERSITY OF MARYLAND, COLLEGE PARK MArch, 2005 51) Constructing process models from distributed design activity/ CARNEGIE MELLON UNIVERSITY, PhD, 2005


52) Re-constructing the Counter Reformation: Women architectural patrons in Rome and the case of Camilla Peretti (Italy)/ THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL, PhD, 2005 53) The Romanesque Cathedral of Saint Mary at Lincoln and the image of reform (England)/ COLUMBIA UNIVERSITY, PhD, 2005 54) A culinary school in San Mateo, California/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 55) The architecture of Maxentius: A study in architectural design and urban planning in early fourth-century Rome (Marcus Aurelius Valerius Maxentius, Emperor of Rome, Roman Empire)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 56) Squaring the circle: The regulating lines of Claude Bragdon's Theosophic architecture/ VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY, PhD, 2005 57) Homeowning: An exploration of the possession and personalization of the American Dream/ RICE UNIVERSITY, MArch, 2005 58) Home as investment: Housing markets and cultures of urban change in Houston (Texas)/ RICE UNIVERSITY, PhD, 2005 59) Imaginary figures of death and life in the architecture of Grandjean de Montigny (Auguste Henri Victor Grandjean de Montigny, France, Brazil)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 60) Professional sacrifice: Architects, ethics and advertising/ OPEN UNIVERSITY (UNITED KINGDOM), PhD, 2005 61) Shifting archi(text)ure: Notes on a discourse/ RICE UNIVERSITY, MArch, 2005 62) Dovetail Ranch at Ajo Valley/ RICE UNIVERSITY, MArch, 2005 63) Balancing on transit: Redevelopment of the Southern Pacific Railyards Sacramento, California/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 64) A new deal for progress: The 1933 Chicago World's Fair (Illinois)/ UNIVERSITY OF ILLINOIS AT CHICAGO, PhD, 2005 65) Entertainment of the most beautiful kind: The house of William and Harriet Aiken, 1833--1860/ UNIVERSITY OF DELAWARE (WINTERTHUR PROGRAM), MA, 2005 66) Architecture as a catalyst for organizational change: Facilitating a person-centered approach to care in an adult/dementia day 
center/ THE UNIVERSITY OF WISCONSIN – MILWAUKEE, PhD, 2005 67) Master planning the State University of New York for real estate development: A case study/ MS, 2005, STATE UNIVERSITY OF NEW YORK COL. OF ENVIRONMENTAL SCIENCE & FORESTRY 68) The effect of the interaction of architecture, culture, and nature on well-being and spirituality/ UNIVERSITY OF CALGARY (CANADA), PhD, 2005 69) Cultivat(ing) modernities: The Society for National Heritage, political propaganda, and public architecture in twentieth-century Iran/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY, PhD, 2005 70) Architecture and social complexity in the Late Ubaid Period: A study of the built environment of Degirmentepe in East Anatolia (Turkey)/ UNIVERSITY OF CALIFORNIA, LOS ANGELES, PhD, 2005 71) Leberecht Migge (1881--1935) and the modern garden in Germany/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 72) Spatial narratives, commemorative practices and the building project: New urban foundations in Upper Syro-Mesopotamia during the Early Iron Age/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 73) BIOcity/ RICE UNIVERSITY, MArch, 2005 74) Architecture, ritual and identity in the cathedral of Saint-Etienne and the abbey of Saint-Germain in Auxerre, France/ BROWN UNIVERSITY, PhD, 2005 75) Constructions of public space, Singapore/ HARVARD UNIVERSITY, 2005 76) Regionalism and universality on the Big Muddy: A trail of pavilions along the Mississippi River/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 77) Weaving place and object: A new Martin Luther King community library (Washington, D.C.)/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 78) Felt PET: A material research project/ RICE UNIVERSITY, MArch, 2005 79) Dislocations and relocations: The built environments of Japanese American internment/ UNIVERSITY OF CALIFORNIA, SANTA BARBARA, PhD, 2005


80) Operative topography: An agent for place-making in the age of globalization/ UNIVERSITY OF FLORIDA, PhD, 2005 81) Public spaces, public transit and accessibility for the blind: What Portland, Oregon has to teach us/ MORGAN STATE UNIVERSITY, MLA, 2005 82) Rivalry and representation: Regionalist architecture and the road to the 1937 Paris Exposition (France)/ UNIVERSITY OF VIRGINIA, PhD, 2005 83) Open space for the underclass: New York's small parks (1880--1915)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 84) An XML initiative of transferring architectural information to the construction site based on the BIM object concept/ ILLINOIS INSTITUTE OF TECHNOLOGY, PhD, 2005 85) Leopold Eidlitz: Becoming an American architect/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 86) Building in the air: Aspects of the aerial imagination in modern Italian architecture (Franco Albini, Edoardo Persico, Alberto Sartoris)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 87) Art, architecture and politics in Mewar, 1628--1710 (India)/ UNIVERSITY OF MINNESOTA, PhD, 2005 88) The effects of sculpture in a university public space: An empirical study of user behavior/ MISSISSIPPI STATE UNIVERSITY, MLA, 2005 89) Healing the circulatory wound/ RICE UNIVERSITY, MArch, 2005 90) Fear as a cultural phenomenon in Thailand with special reference to the spatial relations of domestic architecture/ OPEN UNIVERSITY (UNITED KINGDOM), PhD, 2005 91) Systems aesthetics: Architectural theory at the University of Cambridge, 1960--1975 (Massachusetts, Peter Eisenman, Lionel March, Leslie Martin, Christopher Alexander)/ HARVARD UNIVERSITY, PhD, 2005 92) Exploring the effects of local development regulations on ecological landscape structure/ TEXAS A&M UNIVERSITY, PhD, 2005 93) Requirements management interface to building product models/ STANFORD UNIVERSITY, PhD, 2005 94) Towards a draped architecture: An examination of theatricality, virtuosity, and ambiguity in the recent works of Frank O. 
Gehry, and others/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 95) Historic building documentation in the United States, 1933--2000. The Historic American Buildings Survey: A case study/ TEXAS A&M UNIVERSITY, PhD, 2005 96) A meta-language for systems architecting/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY, PhD, 2005 97) Regio: Leon Battista Alberti and the theory of region in architecture (Italy)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 98) Eco-metropolis: Tourism of the urban ecology/ RICE UNIVERSITY, MArch, 2005 99) Tower typewriter and trademark: Architects, designers and the corporate utopia, 1956--1964 (Gordon Bunshaft, Eero Saarinen, Henry Dreyfuss, Florence Knoll, Eliot Noyes)/ NEW YORK UNIVERSITY, PhD, 2005 100)The snake that swallowed an egg: A network of parks for Houston's wasted spaces (Texas)/ RICE UNIVERSITY, MArch, 2005 101)Grand Theater Square - Shanghai (China)/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 102)PHARM_STAD: Fieldworks for Somkhele (South Africa)/ RICE UNIVERSITY, MArch, 2005 103)Border crossings/ RICE UNIVERSITY, MArch, 2005 104)Personalization and its place in the New Urbanism/UNIVERSITY OF LOUISVILLE, PhD, 2005 105)The veneration of St. Benedict in medieval Rome: Parish architecture, monumental imagery, and local devotion (Italy)/ UNIVERSITY OF MICHIGAN, PhD, 2005 106)Functionalism with ornament: Modernist architectural discourse in Hermann Broch's 'Die Schlafwandler'/ WASHINGTON UNIVERSITY, PhD, 2005 107)The power of fame: Stowe and its uses (England, Alexander Pope, Samuel Richardson, Earl Temple, George Grenville, William Pitt)/ WASHINGTON UNIVERSITY, PhD, 2005 108)Modernity and memory: The politics of architecture in Hungary and East Germany after the Second World War/ PRINCETON UNIVERSITY, PhD, 2005


109)Assessing mold risks in buildings under uncertainty/ GEORGIA INSTITUTE OF TECHNOLOGY, PhD, 2005 110)Alfred Muller and Galveston's late nineteenth-century architectural style (Texas)/ UNIVERSITY OF HOUSTON-CLEAR LAKE, MA, 2005 111)Park space (Texas)/ RICE UNIVERSITY, MArch, 2005 112)Like and like/ RICE UNIVERSITY, Houston, MArch, 2005 113)Building stories: Literature and architecture in early modern England (Anne Clifford, Countess of Pembroke, Sir Henry Wotton, Ben Jonson, John Stow, George Herbert)/ UNIVERSITY OF CALIFORNIA, LOS ANGELES., PhD, 2005 114)The Alhambra in comparative perspective: Towards a definition of palace-cities (Spain, Palestine, France)/ BOSTON UNIVERSITY, PhD, 2005 115)Architecture in mind: Hegel's history of architecture and its place in the 'Philosophy of Fine Art' (Georg Wilhelm Friedrich Hegel)/ UNIVERSITY OF ESSEX (UNITED KINGDOM), PhD, 2005 116)Technically symbolic: The significance of schema and Claude Bragdon's Sinbad drawings in 'The Frozen Fountain'/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 117)Green fabric: An urban center for Virginia's wine culture (Virginia)/ UNIVERSITY OF MARYLAND, COLLEGE PARK, MArch, 2005 118)Decision-making framework for the selection and design of shading devices/ VIRGINIA POLYTECHNIC INSTITUTE AND STATE UNIVERSITY, PhD, 2005 119)Permeable walls and place recognition in Henry Klumb's architecture of social concern (Puerto Rico)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 120)A quantification of proportionality aesthetics in morphological design/ UNIVERSITY OF MICHIGAN, PhD, 2005 121)Tall building form generation by parametric design process/ ILLINOIS INSTITUTE OF TECHNOLOGY, PhD, 2005 122)Conceptual design tools for architects/ HARVARD UNIVERSITY 2005 123)Maximizing the benefits of courtroom POEs in design decision support and academic inquiry through a unified conceptual model/ GEORGIA INSTITUTE OF TECHNOLOGY, PhD, 2005 124)The side stage: A critical cultural awareness forum in Washington, D.C./ UNIVERSITY OF 
MARYLAND, COLLEGE PARK, MArch, 2005 125)The essence of architecture: August Schmarsow's theory of space/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 126)The First World War, Britain, and modern design: The social use of architecture in inter-war Birmingham (England)/ AUBURN UNIVERSITY, PhD, 2005 127)Niche life/ RICE UNIVERSITY, MArch, 2005 128)An investigation of methods for reducing the use of non-renewable energy resources for housing in Thailand/ TEXAS A&M UNIVERSITY, PhD, 2005 129)The comfort consciousness/ RICE UNIVERSITY, MArch, 2005 130)A case study of cost overruns in a Thai condominium project/ TEXAS A&M UNIVERSITY, PhD, 2005 131)Technik und Kultur: The German architectural discourse on iron, 1890--1918/ NEW YORK UNIVERSITY, PhD, 2005 132)At home in postwar France: The design and construction of domestic space, 1945--1975/ NEW YORK UNIVERSITY, PhD, 2005 133)Italian modern architecture and the vernacular tradition: The aesthetics of morality/ UNIVERSITY OF TORONTO (CANADA), PhD, 2005 134)Extreme spatial experience apparatus altering the perception of space through computer-mediated movement/ UNIVERSITY OF CALIFORNIA, LOS ANGELES, PhD, 2005 135)Technology and elderly people: Design methodology for interactive communication in later life/ CARLETON UNIVERSITY (CANADA), MArch, 2005 136)Hybrid housing/ RICE UNIVERSITY, MArch 2005 137)An analysis of the American outdoor sport facility: Developing an ideal type on the evolution of professional baseball and football structures/ THE OHIO STATE UNIVERSITY, PhD, 2005 138)Albert Mayer, architect and town planner: The case for a total professional/ NEW YORK UNIVERSITY, PhD, 2005 139)Fat City (a post-movement manifesto)/ RICE UNIVERSITY, MArch 2005


140)Ventilation drying and the performance of the exterior membrane in building enclosure systems/ THE PENNSYLVANIA STATE UNIVERSITY, PhD, 2005 141)Urban metamorphosis and change in central Asian cities after the Arab invasions/ GEORGIA INSTITUTE OF TECHNOLOGY, PhD, 2005 142)Facade-poche: The performative representation of thickened window-walls in the works of Marcel Breuer, Richard Neutra, and Jose Luis Sert/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 143)Reading Ashikaga history in the urban landscape: Kyoto in the early Muromachi period, 1336--1467 (Japan)/ PRINCETON UNIVERSITY, PhD, 2005 144)Johannes Baader and the demise of Wilhelmine culture: Architecture, Dada, and social critique, 1875—1920/ NEW YORK UNIVERSITY, PhD, 2005 145)Study of natural ventilation design by integrating the multi-zone model with CFD simulation/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY, PhD, 2005 146)'I am not a decorator': Florence Knoll, the Knoll Planning Unit, and the making of the modern office/ UNIVERSITY OF DELAWARE (WINTERTHUR PROGRAM), MA, 2005 147)Art and the conventual life in Renaissance Venice: The Monastery Church of Santa Caterina de' Sacchi (Italy)/ PRINCETON UNIVERSITY, PhD, 2005 148)Architecture in the manner of Giovanni Battista Piranesi: Ornamental excess and the apotropaic function of grotesque representations (Italy)/ UNIVERSITY OF PENNSYLVANIA, PhD, 2005 149)Technology as 'praxis of inquiry' in architectural design: Adaptability/modulation/emergence/ CARLETON UNIVERSITY (CANADA), MArch 2005 150)Spaces of illusion, artifice and play: From the nineteenth century to the themed environment/ UNIVERSITY OF ESSEX (UNITED KINGDOM), PhD, 2005 Art History 1) Architecture and Ideology: The National Gallery of Canada (A Reading of the Architecture Using Feminist and Postmodernist Theory) Joan Acland 1989, Canada 2) Closed Systems: Alexandra Luke, Hortense Gordon and the Canadian Art History Canon, Janice Anderson, 1995 3) H. 
Mabel May (1877-1971) - The Montreal Years: 1909-1938, Karen Antaki, 1992 4) The Sculpture of Anne Kahane, Sylvia A. Antoniou, 1992 5) The Beaver Hall Group and its Place in the Montreal Art Milieu and the Nationalist Network, Susan Avon, 1994 6) The Stained Glass War Memorial Windows of Charles William Kelsey, Shirley May Baird 7) Furs in Fashion as Illustrated in the Photo-Portraiture of William Notman in the 1860s, Jana Bara, 1986 8) William Notman's Portraits of Children, Katharine J. Borcoman, 1991 9) William Brymner ( 1855 - 1925 ) The Artist in Retrospect, Janet Grace Mills Braide, 1979 10) Photography, Immigration, and Canadianism: 1896-1921, Anna Maria Carlevaris, 1992 11) William H. Eagar: "Sensibilities of No Common Order" Alexandra E. Carter, 1979 12) Case Study: Michel Foucault, Critical Modernism, and Writing on the Visual Arts in English Canada, Timothy D. Clark, 1991 13) Jock Macdonald: The Search for the Universal Truth in Nature, Allison J. Colborne, 1992 14) The Church of St. Andrew and St. Paul, Montreal: An Architectural History 1805-1932, and Catalogue of Memorials, Sandra M. Coley, 1993 15) Janvier and Morrisseau: Transcending a Canadian Discourse, Curtis J. Collins, 1994 16) Modernism meets the farm: Precisionist paintings and photographs of vernacular architecture, 1915--1940 (Charles Sheeler, Georgia O'Keeffe), CITY UNIVERSITY OF NEW YORK, 2005 17) Clothing the corps: How the avant-garde and the Ballets Russes fashioned the modern body, UNIVERSITY OF PENNSYLVANIA, 2005 18) 'Against an epoch': Boston moderns, 1880--1905 (Massachusetts, F. Holland Day, Louise Imogen Guiney, Ralph Adams), BOSTON COLLEGE, 2005 19) Perceptions of Native American art: American sculpture and earthworks (Isamu Noguchi, James Pierce, Michael Heizer, Michelle Stuart), RUTGERS THE STATE UNIVERSITY OF NEW JERSEY - NEW BRUNSWICK, 2005


Biological Sciences 1) Pragmatism and Human Genetic Engineering/ Vanderbilt University - Nashville, Tennessee – USA, PhD, 1994 2) Ecological Study of the Role of Highly Processed Milk, Meat and Vegetable Oil in Prostate Cancer Causation/ Clayton College of Natural Health, Birmingham, USA, 2005 3) Population genetics and evolutionary history of some deep-sea demersal fishes from the Azores-North Atlantic (Helicolenus dactylopterus, Beryx splendens)/ UNIVERSITY OF SOUTHAMPTON (UNITED KINGDOM)/ PhD, 2005 4) Monolayer protected gold clusters: Application to biology/ STANFORD UNIVERSITY/ PhD, 2005 5) Factors underlying the interactions between people and wildlife in the Argentine Chaco/ THE UNIVERSITY OF ARIZONA/ PhD, 2005 6) Flexibility in the light reactions of photosynthesis/ WASHINGTON STATE UNIVERSITY/ PhD, 2005 7) Mapping protein functional epitopes using phage display: Applications for streptavidin, HIV-1 Vif, and a terpene cyclase/ UNIVERSITY OF CALIFORNIA, IRVINE/ PhD, 2005 8) DNA hybridization: Fundamental studies and applications in directed assembly/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY/ PhD, 2005 9) A systems biology approach to understanding cellular dynamics during HIV-1 infection and progression to AIDS/ UNIVERSITY OF MICHIGAN/ PhD, 2005 10) Chemical and cell biological approaches to study ER stress-induced apoptosis/ HARVARD UNIVERSITY/ PhD, 2005 11) Breeding biology and ecology of Great Tinamous: Female joint-nesting, extra-pair paternity and natural history/ CORNELL UNIVERSITY/ PhD, 2005 12) Systematics of the megadiverse superfamily Gelechioidea (Insecta: Lepidoptera)/ THE OHIO STATE UNIVERSITY/ PhD, 2005 13) Molecular markers of outcome in prostate cancer/ UNIVERSITY OF NEW SOUTH WALES (AUSTRALIA)/ PhD, 2005 Business Administration 1) Improved Forecast Accuracy in Airline Revenue Management by Unconstraining Demand Estimates from Censored Data/ Rutgers University, Newark, NJ USA, PhD, 2001 2) Towards Improved Project Management Practice: Uncovering 
the evidence for effective practices through empirical research/ Leeds Metropolitan University, UK, PhD, 2000 3) Electronic Marketing: Advantages and Disadvantages/ Saint Regis University - Wilmington, USA, PhD, 2004 4) Model for the Evaluation of Project Funding in Emerging Markets/ Columbia Pacific University, PhD, 2000 5) Is Total Quality Management Enough For Competitive Advantage? Realities in Organizations Implementing Change Initiatives: with Examples from the United States and the Developing World/ University of Hull, United Kingdom, MBA, 1999 6) Publicly Funded School Voucher Programs: A Policy Analysis/ DePaul University, Chicago, Illinois, USA, MS, 2001 7) Violence in the Workplace: Preparation, Prevention and Response/ Columbia Southern University, MS, 2002 8) Inside Chinese Organizations: An Empirical Study of Business Practices in China/ Oxford University, Said Business School, PhD, 1998 9) An empirical study on the state of the purchasing function within small and medium-sized business enterprises/ THE GEORGE WASHINGTON UNIVERSITY/ PhD, 2005 10) Does insider trading predict the future stock return?/ CORNELL UNIVERSITY/ PhD, 2005 11) A decision model for technology assessment to reduce the internal digital divide in emerging economies (Case: Costa Rica)/ PORTLAND STATE UNIVERSITY/ PhD, 2005 12) Higher education and entrepreneurship: The relation between college educational background and all business success in Texas/ UNIVERSITY OF NORTH TEXAS/ PhD, 2005 13) Relationship-specific motives and cultural values in the cross-border franchisor-franchisee relationship from the Puerto Rican franchisee's perspective/ THE GEORGE WASHINGTON UNIVERSITY/ PhD, 2005 14) Essays on the spatial distribution of population and employment/ THE UNIVERSITY OF CHICAGO/ PhD, 2005


15) Consumer expectations of quality in Master of Business Administration programs: A comparison between face-to-face learning and Web-delivered distance learning in schools of business/ ALLIANT INTERNATIONAL UNIVERSITY, SAN DIEGO/ 2005 16) Psychological ownership in complex technology/ BENEDICTINE UNIVERSITY/ PhD, 2005 17) Corporate governance and financial reporting credibility/ NORTHWESTERN UNIVERSITY/ PhD, 2005 18) Product capital model: Modeling the value of design to corporate performance/ STANFORD UNIVERSITY/ PhD, 2005 19) Regulations, foreign presence, and efficiency of local firms: A multiple country study in commercial banking/ UNIVERSITY OF MINNESOTA/ PhD, 2005 20) Catalog creative design and consumer demand: A spatial distance-metric approach/ THE UNIVERSITY OF CHICAGO/ PhD, 2005 Chemistry 1) Multiply aromatic clusters via ab initio genetic algorithm/ UTAH STATE UNIVERSITY/ PhD, 2005 2) Sol-gel zirconia- and titania-based surface-bonded hybrid organic-inorganic coatings for sample preconcentration and analysis via capillary microextraction in hyphenation with gas chromatography (CME-GC)/ UNIVERSITY OF SOUTH FLORIDA/ PhD, 2005 3) Reaction mechanisms for catalytic partial oxidation systems: Application to ethylene epoxidation/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY/ PhD, 2005 4) Ionic liquids: Synthesis, solvation interactions, chromatographic characteristics, and micelle formation/ IOWA STATE UNIVERSITY/ PhD, 2005 5) Chemistry of beta-ester radicals: Evidence supporting ion pair processes/ UNIVERSITY OF ILLINOIS AT CHICAGO/ PhD, 2005 6) Electromagnetic excitation of high frequency acoustic shear waves for the study of interfacial biochemical phenomena/ UNIVERSITY OF TORONTO (CANADA)/ PhD, 2005 7) Using the metal-ligand interaction to construct complex supramolecular polymer architectures/ CASE WESTERN RESERVE UNIVERSITY/ PhD, 2005 8) Evolution of chemical composition along river drainage networks (Puerto Rico, Nepal)/ UNIVERSITY OF NEW HAMPSHIRE/ PhD, 
2005 9) Characterization and mechanistic study of oxygen-iron intermediates in mononuclear nonheme model systems/ UNIVERSITY OF MINNESOTA/ PhD, 2005 10) Chemistry of tetradecacarbonyltetraosmium/ SIMON FRASER UNIVERSITY (CANADA)/ PhD, 2005 Communications 1) On the Interactions of News Media, Interpersonal Communication, Opinion Formation, and Participation: Deliberative Democracy and the Public Sphere/ University of Pennsylvania, PhD, 1997 2) Cross-Cultural Content Analysis of Advertising from the United States and India/ University of Southern Mississippi, U.S., PhD, 1996 3) Getting By: Race and Parasocial Interaction in a Television Situation Comedy/ University of Kentucky, U.S.A., PhD, 1994 4) An Examination of the Theory of the Commodity and its Application to Critical Media Studies/ Florida Atlantic University, Boca Raton, MA, 1996 5) The Invisible Farm: The worldwide decline of farm news and agricultural journalism training/ Carleton University, Ottawa, Canada, MA, 1996 6) Student Perceptions of Rules for Classroom Interaction/ Louisiana State U, MA, 1992 7) The Questioning Behavior of Males And Females in an Undergraduate Language Class/ Indiana University of Pennsylvania, USA, PhD, 1997 8) Cultural Democracy: The Way Festivals Affect Society/ Queen Margaret University College, Edinburgh, United Kingdom, MSc, 2002 9) On-line Virtual Museums: An Application of an On-line VR Museum for the Parthenon Marbles: Internet: A Means of Cultural Repatriation/ University of York, United Kingdom, MSc, 2002 10) Copyright, A Property of Communication, A Link Between Creativity and Control/ School of Communication, Simon Fraser University Can. 2004


11) Acting in the Name of Culture? The Participation of Organized Labour in the
Canadian Broadcasting Policy Process/ Communication and Culture, Ryerson University, Can. 2005 12) There I Was, 250 Miles Away From My Groom: A Genealogy of Media Weddings/ Université de Montréal, 2005 13) The popular reception of new information and communication technologies in Niger/ SOUTHERN ILLINOIS UNIVERSITY AT CARBONDALE/ PhD, 2005 14) From the serialized story to the reality show: A theoretical approach to models of reading and consuming about fiction and reality (Spanish text)/ THE UNIVERSITY OF ARIZONA/ PhD, 2005 15) African-American women's reception, influence and utility of television content: An exploratory qualitative analysis/ LOUISIANA STATE UNIVERSITY AND AGRICULTURAL & MECHANICAL COLLEGE/ PhD, 2005 16) The cultural politics of housing in a capitalist society: Representations of homelessness in contemporary American newspapers/ INDIANA UNIVERSITY/ PhD, 2005 17) Crime content and media economics: Gendered practices and sensational stories, 1950— 2000/ UNIVERSITY OF TORONTO (CANADA)/ PhD, 2005 Computer Science 1) The ALISA Shape Module: Adaptive Shape Recognition using a Radial Feature Token/ George Washington University, Washington DC, USA, PhD, 2002 2) R² - Heaps with Suspended Relaxation for Manipulating Priority Queues and a New Algorithm for Reweighting Graphs/ University of Colorado Boulder, Colorado, USA, PhD, 1995 3) Beautiful Mates: Applying Principles of Beauty to Computer Chess Heuristics/ University of Sussex, MSc, 1997 4) Distributed Programming in Ada with Protected Objects/ University of Alabama in Huntsville, MS, 1998 5) Success Factors Among Community College Students in an Online Learning Environment/ Nova Southeastern University, U.S.A., PhD, 2000 6) A Spare Capacity Planning Methodology for Wide Area Survivable Networks/ University of Pittsburgh, PhD, 1999 7) A Reactive Approach to Comprehensive Global Garbage Detection/ University of Dublin, Trinity College, Ireland, PhD, 1998 8) Experience-Based Language Acquisition: A 
Computational Model of Human Language Acquisition/ Louisiana State University, PhD, 2002 9) Application of Scheduling Theory to Spacecraft Constellations/ Florida Institute of Technology, Melbourne, Florida U.S., MS, 2000 10) Context Mediation among Knowledge Discovery Components/ University of Ulster – UK, PhD, 2004 11) A Machine Translation Approach to Cross Language Text Retrieval/ University of Glasgow, Glasgow, United Kingdom, MSc, 1997 Demography 1) Population dynamics, health, and labor migration in Micronesia during the Japanese occupation, 1919—1945/ PRINCETON UNIVERSITY/ PhD, 2005 2) Essays in economics of the family: Incorporating cohabitation/ UNIVERSITY OF WASHINGTON/ PhD, 2005 3) Women's lives through women's wills in the Spanish and Mexican borderlands, 1750--1846 (Texas, New Mexico)/ SOUTHERN METHODIST UNIVERSITY/ PhD, 2005 Economics 1) Essays on Genetic Evolution and Economics/ Harvard University - Cambridge MA, USA, PhD, 1997 2) Pricing, Demand Analysis and Simulation: An Application to a Water Utility/ University of Sydney, Australia, PhD, 1998 3) The Role Of Multinational Companies In The Middle East: The Case Of Saudi Arabia/ University of Westminster, London, United Kingdom, PhD, 2002 4) Power and Control in Chinese Private Enterprises: Organizational Design in the Taiwanese Media Industry/ London School of Economics and Political Science, MSc, 1995 5) Role of the Auditor General in Public Accountability - Some Issues/ University of New England, MSc, 1991


6) The Impact of Japanese Investment on the New Town of Milton Keynes/ Leeds University Business School, MA, 1996 7) Two Essays in Finance: Market Response to Catastrophic Losses on the Insurance Industry and Return on Investment of the University of Illinois to the State of Illinois Treasury/ University of Illinois at Urbana-Champaign, PhD, 1996 8) The limits of equality: An economic analysis of the Israeli Kibbutz/ NORTHWESTERN UNIVERSITY/ PhD, 2005 9) Essays on market design and strategic interaction/ CORNELL UNIVERSITY/ PhD, 2005 10) Essays in environmental economics/ HARVARD UNIVERSITY/ PhD, 2005 11) Long-run effects of E.P.A. designations of Superfund sites on housing values in Houston (Texas)/ UNIVERSITY OF HOUSTON/ PhD, 2005 12) Three essays on the economics of natural disasters (Ontario)/ UNIVERSITY OF GUELPH (CANADA)/ PhD, 2005 13) Electronic commerce in business-to-business procurement: The effects on organizations/ UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN/ PhD, 2005 14) Two essays on selection models and one essay on income inequality in rural China/ BOSTON COLLEGE/ PhD, 2005 15) Essays on development and finance/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY/ PhD, 2005 16) Applied papers in public policy (New York)/ THE UNIVERSITY OF OKLAHOMA/ PhD, 2005 17) Essays on the behavioral effects of tax policy/ THE UNIVERSITY OF TENNESSEE/ PhD, 2005 18) Growth and distributional effects of trade liberalization and alternative free trade agreements: A macro-micro analysis with an application to Egypt/ THE GEORGE WASHINGTON UNIVERSITY/ PhD, 2005 19) Essays on corporate governance/ UNIVERSITY OF PENNSYLVANIA/ PhD, 2005 20) Implications of expanding bank powers into securities activities: Section 20 subsidiaries/ TEMPLE UNIVERSITY/ PhD, 2005 Education 1) Differential Effects of a Multiple Intelligences Curriculum on Student Performance/ Harvard University - Cambridge MA, USA., 2000 2) Cracking the Glass Ceiling: Factors Influencing Women's Attainment of Senior Executive 
Positions/ Colorado State University, PhD, 1994 3) Online Distance Education: Historical Perspective and Practical Application/ American Coastline University, PhD, 1997 4) Hong, Luoluo, Redefining Babes, Booze and Brawls: Men Against Violence -- Towards A New Masculinity/ Louisiana State University, PhD, 1998 5) The Impact of Adventure-Based Training on Team Cohesion and Psychological Skills Development in Elite Sporting Teams/ University of Wollongong - Wollongong, Australia, PhD, 2002 6) Elemental Movement: A Somatic Approach to Movement Education/ Lesley University; Cambridge, Massachusetts USA, MS, 2001 7) Alcohol and the Chosen Few: Organizational Reproduction in an Addictive System/ Indiana University, PhD, 1995 8) Primary Education in Ecuador's Chota Valley: Reflections on Education and Social Reproduction in the Development Era/ University of Denver, U.S.A., MA, 2000 9) An Ethnographic Study of a Special Education School: The Harris-Hillman Story/ Vanderbilt U, PhD, 1996 10) Instructional Technology, Motivation, Attitudes and Behaviors: An Investigation of At-Risk Learners in the Middle School General Music Classroom/ Nova Southeastern University Fort Lauderdale, Florida, PhD, 2003 11) The Voices of Amerasians: Ethnicity, Identity, and Empowerment in Interracial Japanese Americans/ Harvard University - Cambridge MA, USA, 1986 12) Cameroonian teachers' perceptions of culture, education, and mathematics/ The University of Oklahoma, PhD, 2005 13) Buddhism and education in Burma: Varying conditions for a social ethos in the path to 'nibbana'/ PRINCETON UNIVERSITY/ PhD, 2005 14) Recasting Alaska Native students: Success, failure and identity/ STANFORD UNIVERSITY/ PhD, 2005


15) Strong to serve: The Alliance High School of Kikuyu, Kenya/ YALE UNIVERSITY/ PhD, 2005 16) The figured worlds of Mexican teens in Kentucky: Identity and educational decision making/ UNIVERSITY OF KENTUCKY/ PhD, 2005 17) 'It just opens your eyes up': The impact of African American Studies courses on students in a university setting/ TEMPLE UNIVERSITY/ PhD, 2005 18) The effects of intercollegiate athletics success on private giving to athletic and academic programs at National Collegiate Athletic Association institutions/ UNIVERSITY OF OREGON/ PhD, 2005 19) Bayesian expert systems and multi-agent modeling for learner-centric Web-based education/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY/ PhD, 2005 20) ProgrammingLand: An automated system for computer science education/ NORTH DAKOTA STATE UNIVERSITY/ PhD, 2005 Engineering 1) Use of Activity-Based Costing in The Public Sector-MIT 2) Enabling Organizational Strategy through Effective Capital Programming-MIT 3) Financial Engineering For BOT Infrastructure Projects-MIT 4) Globalization of Construction and Real Estate Companies through Mergers and Acquisitions-MIT 5) Capital Budgeting for Tren Urbano Extensions-MIT 6) Neurocontrol of a Cantilever Beam-MIT 7) Financing and Ownership Structures in International Project Finance-MIT 8) Stadium Financing a Case Study of the Stade de France-MIT 9) Effective Use of Integration Mechanisms for Complex Projects: An Empirical Analysis of Building Projects-MIT 10) Risk Management in Toll Road Concessions-MIT 11) Real Option Approach to Investments in Electricity Generating Capacity-MIT 12) Collaborative Negotiation Methodology for an Innovative Procured, Multi-Cultural and Multi-Phase Projects-MIT 13) Effective Partnering in an Innovative Procured, Multi-Cultural Project-MIT 14) Portfolio Management and Deferred Maintenance at Universities-MIT 15) Risk Management in BOT Project-MIT 16) BOT In China: Opportunities And Challenges For Foreign Firms-MIT 17) Structural Assessment of 
Pile Supported Piers-MIT 18) Design Strategies for New and Renovation Construction that Increase the Capacity of Buildings to Accommodate Change-MIT 19) System Dynamic Models for Construction Projects-MIT 20) The Climate Change Debate and Its Implications for Megacities-MIT 21) A Robust Planning and Control Methodology for Design-Build Fast-Track Civil Engineering and Architectural Projects-MIT 22) Project Delivery and Planning Strategies for Public Owners-MIT 23) Risk in Global Infrastructure Project Financing-MIT 24) Framework to Assess a Facility's Capability to Accommodate Change: Application to Renovated Buildings-MIT 25) Strategic Visitor & Ferry Management Plan: Boston Harbor Islands National Park Area-MIT 26) Simulation to Assess Plumbing and Fire Protection Systems-MIT 27) Adaptively Prestressed Concrete Structures-MIT 28) Management Systems for Infrastructure-MIT 29) Valuing Project Risk and Flexibility in Mining Resource Development-MIT 30) Innovative Design Approaches For Adaptable Multi-Unit Housing-MIT 31) Robust Control of Cost Impact on Fast-Tracking Building Construction Projects-MIT 32) Opportunities Created by Information Technology for the Executive in the Engineering and Construction Industry-MIT 33) Multi-Organizational Project Teams and Construction Innovation: The Role of the General Contractor and Construction Manager-MIT 34) Inter-Firm Collaboration in The Implementation of Structural Innovations in Building Construction-MIT 35) The Entrepreneurial Process in Construction and Real Estate Ventures-MIT 36) Operational, Aesthetic, and Construction Process Performance for Innovative Passive and Active Solar Building Components for Residential Building-MIT


37) Environment and Infrastructure Development in Mega-Cities: The Case of Shanghai, China-MIT 38) State of the Art Review of Methodologies for Dispute Avoidance and Resolution in Large Scale Engineering Systems-MIT 39) A Framework For Strategic Thinking In The Global Market For Large-Scale Japanese Construction Firms-MIT 40) Effect of Delivery Systems on Collaborative Negotiations for Large Scale Infrastructure Projects-MIT 41) Taiwan's Industrial Structural Change and Its Implication on Energy Intensity-MIT 42) Infrastructure Management for Tren Urbano-MIT 43) Transport Development: Impact Study of the London-Stockholm Corridor-MIT 44) Controlling Interfaces: A Key to Project Success-MIT 45) Dynamic Planning and Control Methodology for Large-Scale Concurrent Construction Projects-MIT 46) Stages in Project Financing: A Comparative Analysis of Independent Power Projects in Three Developing Countries India, Indonesia, and Peru-MIT 47) Three-Tiered Procurement Framework for U.S. Navy Waterfront Facilities-MIT 48) Technology and Policy for Removal of Sulfur from Fuel-MIT 49) Integrated Approach for the Analysis and Management of Urban Relocation and Infrastructure Development Projects: The Case of Southwestern Suburbs of Beirut, Lebanon-MIT 50) Managing the Development of the Real Estate Portfolios of State Transportation Authorities in the Boston Area-MIT 51) Study of Firm's Behavior in the B2B E-Business Regime-MIT 52) Economic Modeling of Urban Pollution and Climate Policy Interactions-MIT 53) The future of the Port Of Beirut-MIT 54) Environmental Liability, Policy and Technology in Real Estate Development-MIT 55) Private Risk-MIT 56) Agent-Based Techniques For National Infrastructure Simulation-MIT 57) Transit Oriented Development Strategy: Guangzhou Case Study-MIT 58) Improved Capital Programming Powered By GASB 34 Compliance: A Case Study From Winchester, MA-MIT 59) Green Development: Creating Incentives for Developers-MIT 60) Strategy for Penetrating Engineering 
& Construction Markets In Southeast Asia for Singapore through BOT Contract-MIT 61) Methodology for Achieving GASB 34 Modified Approach Compliance Using U.S. Navy "Smart Base" Facility Management Practices-MIT 62) Fundamental Analysis and Conceptual Model for Corporate Strategy in Global Engineering and Construction Markets-MIT 63) Comparative Analysis of Energy Consumption Trends in Cohousing and Alternate Housing Arrangements-MIT 64) Outsourcing Transportation Infrastructure Maintenance: a theoretical approach with application to JR East-MIT 65) Overseas Projects Financed by International Institutions for Japanese Construction Firms-MIT 66) Lean Enterprise in the Construction Industry-MIT 67) The Valuation of Construction Companies-MIT 68) A Study of the Naval Construction Force Project Material Supply Chain-MIT 69) Application of Lean Enterprise Concept to Construction Firms in Japan-MIT 70) Validation of the Project Definition Rating Index (PDRI) for MIT Building Projects-MIT 71) High fidelity miniaturized antennas and filters for wireless applications- The University of Michigan 72) An algorithm for selection of power supply systems for MEMS devices- The University of Michigan 73) Microthermoelectric cooler- The University of Michigan 74) Remotely-powered wireless monitoring systems- The University of Michigan 75) Mixed-signal circuit design issues in nanoscale PD-SOI- The University of Michigan 76) A wireless microsystem for neural stimulating microprobes- The University of Michigan


77) The integration of potentiometric and optical chemical sensor arrays- The University of
Michigan 78) Low-voltage and low-power, deep-submicron analog circuits for single-chip, mixed-signal microinstrumentation systems- The University of Michigan 79) CMOS-integrated liquid chemical microdetection systems- The University of Michigan 80) Monolithic and top-down clock synthesis with micromachined radio frequency reference- The University of Michigan 81) Microfluidic biochemical analysis system with electro-osmotic pump and thermally responsive polymer valve- The University of Michigan 82) MEMS angular rate and angular acceleration sensors with CMOS switched capacitor signal conditioning- The University of Michigan 83) Silicon recording arrays with integrated circuitry for in-vivo neural data compression- The University of Michigan 84) Digital circuit design techniques for low-leakage silicon-on-insulator (SOI) CMOS technologies- The University of Michigan 85) Thin-film technologies for hermetic and vacuum packaging of MEMS- The University of Michigan 86) Microfabricated voltammetric neuro-arrays for use in-vitro- The University of Michigan 87) Investigations of brain-machine interface system-control exploiting local field potential oscillations in motor cortex- The University of Michigan 88) Techniques for thermal and pneumatic programming of column selectivity and methods for reducing the impact of extra-column band broadening on system performance- The University of Michigan 89) A wireless microsystem for multichannel neural recording microprobes- The University of Michigan 90) Control of Sensory Perception for Discrete Event Systems- Australian National University 91) Terrain Aided Localisation of Autonomous Vehicles in Unstructured Environments- The University of Sydney, Sydney, Australia 92) A Preliminary Study of the Structural Dynamic Behavior of the NASA Manned Spacecraft Center (MSC) Centrifuge- Madison University - Madison, USA Geography 1) Modeling Carbon Fluxes, Net Primary Production, and Light Utilization in Boreal Forest Stands, PhD, 
University of Maryland, 1996 2) Dendroarchaeological and contextual investigations of remote log structures in Jasper, Banff, and Kootenay National Parks, Canada (Alberta, British Columbia)/ UNIVERSITY OF VICTORIA (CANADA), 2004, MSc 3) Walden: A sacred geography (Massachusetts, Henry David Thoreau)/ ANTIOCH NEW ENGLAND GRADUATE SCHOOL/ PhD, 2005 4) Urban-regional clusters and the mutual fund industry in the United States/ STATE UNIVERSITY OF NEW YORK AT BUFFALO/ PhD, 2005 5) The Tunica miracle, sin and savior in America's Ethiopia: A poverty and social impact analysis of casino gaming in Tunica, Mississippi/ THE PENNSYLVANIA STATE UNIVERSITY/ PhD, 2005 History 1) 'Men and Women of Their Own Kind': Historians and Antebellum Reform/ George Mason University - Fairfax, Virginia, USA, MA, 2001 2) The Nature of Resistance in South Carolina's Works Progress Administration Ex-Slave Narratives/ University of Toledo - Ohio – USA, MA, 1990 3) Chemawa Indian Boarding School: The First One Hundred Years, 1880 to 1980/ Dartmouth College, MA, 1997 4) 'The Atlas of Independency': The ideas of John Owen (1616--1683) in the North Atlantic Christian world/ KANSAS STATE UNIVERSITY/ PhD, 2005 5) 'With eyes before and behind': Time, rhetoric, and the vision of medieval history (England, William of Newburgh, Henry of Huntingdon, Matthew Paris, Geoffrey of Monmouth)/ THE GEORGE WASHINGTON UNIVERSITY/ PhD, 2005 6) 'Almighty God created the races': Theologies of marriage and race in anti-miscegenation cases, 1865--1967/ THE CLAREMONT GRADUATE UNIVERSITY/ PhD, 2005


7) 'An American type': The Kikuchi diaries, a cultural biography (1941--1947) (Charles Kikuchi)/ HARVARD UNIVERSITY/ PhD, 2005 Information Technology/ Information Systems 1) Communication of Information Technology Project Sponsors and Managers in Buyer-Seller Relationships/ Brunel University - Uxbridge, UK, PhD, 2003 2) A full-scale semantic content-based model for interactive multimedia information systems/ LSE (London School of Economics), PhD, 1997 3) Making Sense of Mobile ICT-Enabled Trading in Fast Moving Financial Markets as Volatility-Control Ambivalence: Case Study on the Organisation of Off-Premises Foreign Exchange at a Middle-East Bank/ LSE, 2005 4) Supporting Design Understanding in Evolutionary Prototyping: An Application of Change Theory and Semiotics/ LSE, 1997 5) Global Practices and Local Interests: Implementing Information Technology-Based Change in a Developing Country Context/ LSE, 2000 6) Risk Perception, Trust and Credibility: A Case in Internet Banking/ LSE, 2000 7) The Role of Information Systems Evaluation across an Extended System Life Cycle/ LSE, 1996 8) Formality and Informality in Internal Control Systems: A Comparative Study of Control in Different Social and Cultural Environments in a Global Bank/ LSE, 2002 9) Information Systems and Organisational Change: The Case of Flexible Specialisation in Cyprus/ LSE, 1999 10) The Challenges in Assimilating E-business in Large Established Organizations: A Structurational Examination of the E-Business Development at an American Auto Manufacturer/ LSE, 2004 11) Making Sense of Emergent Properties in IT Enabled Call Centre Operations: An Interpretive Systems Analysis Approach/LSE, 2003 12) Using Precedents to Identify Top Management Fraud: the Study of a Case-Based Learning and Reasoning Model/ LSE, 1996 13) Integrating On-line Learning Technologies into Higher Education: A Case Study of Two UK Universities/ LSE, 2004 14) Interpreting the Management of Information Systems Security/ LSE, 1995 15) 
Interpreting The Implementation Of Integrated Packaged Software: The Case Of Enterprise Resource Planning/ LSE, 2005 16) Signs and Signals: The Conception of Communication in U.S. Telecommunications Rhetoric/ LSE, 2001 17) The Interplay of Institutional Forces behind Higher ICT Education in India/ LSE, 2005 18) Evaluating Organisational Privacy Policy Implementation/ LSE, 2004 19) Making Sense of Knowledge Creation Processes: The Case of a Greek Petrochemical Company/ LSE, 2005 20) The Strategic Dimensions of Information Systems Capability: Case Studies in a Developing Country Context/ LSE, 1996 21) Computers and the Family: A Study of Technology in the Domestic Sphere/ LSE, 2000 22) Learning Technology in Higher Education: A Structurational Perspective on Technology Mediated Learning Practices/ LSE, 2005 23) The Role of Information Systems in Islamic Banking: An Ethnographic Study/ LSE, 2005 24) India’s Information Technology Industry: Adapting to Globalisation and Policy Change in the 1990s/ LSE, 1997 25) The Impact of Using DSS for National Debt Management/ LSE, 1998 26) Explanation in Information Systems: A Design Rationale Approach/ LSE, 2002 27) Regulating the Technological Actor: How Governments Tried to Transform the Technology and the Market for Cryptography and the Implications for the Regulation of Information and Communications Technologies/ LSE, 2003 28) Group Interaction and the Learning Process through Computer Conferencing/ LSE, 2002 29) The Social Epistemology of Open Source Software Development: The Linux Case Study/ LSE 30) Information Technology as Ontology: a Phenomenological Investigation into Information Technology and Strategy In-the-World/ LSE, 2002


31) Transaction Cost Applications in Information Systems: Explicating Institutions' Influences on Governance/ LSE, 2000 32) Knowledge, Development and Technology: Internet Use among Voluntary-sector AIDS Organisations in KwaZulu-Natal/ LSE, 2005 33) Emerging Work Practices of ICT-Enabled Mobile Professionals/ LSE, 2003 34) Banking and Innovation: The Case of Payment Systems Modernisation in Thailand/ LSE, 1999 35) Telehealth And Information Society: A Critical Study of Emerging Concepts in Policy and Practice/ LSE, 2001 36) The Transformation of IT Governance: A Neo-Institutional Interpretation/ LSE, 2005 37) Temporal Implications of Information Systems in Organisational Work: An Exploratory Study/ LSE, 1997 38) Initiating System Innovation: a Technological Frames Analysis of the Origins of Groupware Projects/ LSE, 2000 Language/ Literature/ Linguistics 1) The Publishing History of Aubrey Beardsley's Compositions for Oscar Wilde's Salomé/ Marquette University - Milwaukee, Wisconsin – USA, PhD, 1995 2) Diary as Fiction: Dostoevsky's Notes from Underground and Turgenev's Diary of a Superfluous Man/ Boston College, U.S., MA, 2000 3) A Computational Phonology of Russian/ University of Oxford – UK, PhD, 2000 4) The Publishing History of Aubrey Beardsley's Compositions for Oscar Wilde's Salomé/ Marquette University - Milwaukee, Wisconsin – USA, PhD, 1995 5) La Novela de las Transnacionales: Hacia Una Nueva Clasificación/ The University of Alabama Tuscaloosa, USA, PhD, 2001 6) Yasukuni Shrine and the Constraints on the Discourses of Nationalism in Twentieth-Century Japan/ U of Kansas, MA, 1996 7) J. Henry Shorthouse, "The Author of John Inglesant" (with reference to T. S. Eliot and C. G. 
Jung)/ University of London, England, PhD, 1995 8) Cinematic Techniques in the Prose Fiction of Beatriz Guido/ Michigan State University, USA, PhD, 1974 9) Fictionality and Reality in Narrative Discourse: A Reading of Four Contemporary Taiwanese Writers/ University of Washington, Seattle, U.S., PhD, 1990 10) The Plays of Christopher Marlowe and George Peele: Rhetoric and Renaissance Sensibility/ University of Aberdeen, Scotland, PhD, 1999 11) Master Players in a Fixed Game: An Extra-Literary History of Twentieth Century African-American Authors/ University of Michigan, PhD, 2001 12) Hawthorne's The Marble Faun: A Re-appraisal/ California State University, San Diego, MA, 1972 13) Julien Gracq and the art of anachronism: Thematic, intertextual and aesthetic parallels with the fin de siecle (France)/ Columbia University, PhD, 2005 14) American modernism and Depression documentary/ University of Pennsylvania, PhD, 2005 15) American regionalist modernism: Willa Cather, William Faulkner, Oscar Zeta Acosta, and Sandra Cisneros/ New York University, PhD, 2005 16) The aesthetics of resistance: Modernism and antifascism/ Indiana University, PhD, 2005 17) 'Far out past': Hemingway, manhood, and modernism (Ernest Hemingway)/ The College of William and Mary, PhD, 2005 18) The poetics of Terayama Shuji (Japan)/ Yale University, PhD, 2005 19) "Can Anyone Say What is Reasonable?": Promoting, accommodating to, and resisting elite rhetorical inquiry in a high-school classroom/ Georgetown University, 1998 20) A Coherence-Based Approach to the Interpretation of Non-Sentential Utterances in Dialogue/ University of Edinburgh 21) A Comparative Analysis of Thai and English Contrastive Discourse Markers: With a discussion of the pedagogical implications/ Boston University, 1999 22) A Contrastive Study of Syntactic Relations, Cohesion, and Punctuation as Markers of Rhetorical Organization in Arabic and English Narrative Texts/ University of Exeter, 1993 23) A Corpus Linguistic Analysis of 
Phraseology and Collocation in the Register of Current European Union Administrative French/ University of St. Andrews, 2002


24) A Corpus-Based Analysis of Simultaneous Speech in English Conversation/ Victoria
University of Wellington, 1997 25) A Corpus-Based Discourse Analysis of the Bei-Construction in Chinese Written Discourse: A study with special reference to the be-passive in English/ Ball State University, Department of English, 2005 26) A Critical Discourse Analysis of University ESL Classrooms: Power and accommodation/ Arizona State University, Department of English, 2003 27) A Data-Driven Methodology for Motivating a Set of Coherence Relations/ University of Edinburgh, School of Informatics, 1996 28) A Discourse Analysis of the Interview Process in a Private Social Service Setting/ University of Georgia, Linguistics Program, 1999 29) A Government Approach to Finnish-English Intrasentential Code-switching/ University of Southern California, Department of Linguistics, 1994 30) A Sociopragmatic Approach to the Use of Meta-Discourse Features in Effective Non-Native and Native speaker Composition Writing/ University of South Carolina, Program in Linguistics, 2001 31) Antonymy and Semantic Range in English/ Northwestern University, Department of Linguistics, 1997 32) Corpus Linguistics, Contextual Collocation and ESP Syllabus Design: A text analysis approach to the study of medical research articles/ University of Birmingham, School of English 33) Informalization in UK General Election Propaganda: 1964-1997/ University of Leeds, School of English, 2002 34) Lexical Vagueness in Student Writing/ University of Cambridge, Research Centre for English and Applied Linguistics, 2003 35) Passive Prototypes, Topicality and Conceptual Space/ University of North Carolina at Chapel Hill, Department of Linguistics, 2004 36) Patterns of Stance in English/ Northern Arizona University, Applied Linguistics, 2000 37) Politeness in Cypriot Greek: A frame-based approach/ University of Cambridge, Department of Linguistics, 2001 38) Pronunciation Modeling in Speech Synthesis/ University of Pennsylvania, Department of Linguistics, 1998 39) Recurrent Features of Translation in 
Canada: A corpus-based Study/ University of Ottawa, Department of Computational Linguistics, 2004 40) Situation-Based Intonation Pattern Distribution in a Corpus of American English/ University of Washington, Linguistics, 2005 Maths 1) Some Notes on the Theory of Hilbert Spaces of Analytic Functions on the Unit Disc/ University of California at Berkeley, PhD, 1994 2) Some Notes on Game Bounds/ University of California at Berkeley, MA, 1991 3) Radiative Transfer Using Boltzmann Transport Theory/ Chicago State University Chicago, USA, MSc, 1998 4) Affine Lie algebras, vertex operator algebras and combinatorial identities/ NORTH CAROLINA STATE UNIVERSITY/ PhD, 2005 5) Learning, adaptation and optimization: The nonnegative Boltzmann machine and the tunneling salesman algorithm/ PRINCETON UNIVERSITY/ PhD, 2005 6) Boundary regularity for conformally compact Einstein metrics in even dimensions/ UNIVERSITY OF WASHINGTON/ PhD, 2005 7) Generalized Tate cohomology/ UNIVERSITY OF KENTUCKY/ PhD, 2005 8) Morphlets: A multiscale representation for diffeomorphisms/ HARVARD UNIVERSITY/ PhD, 2005 9) Computational improvements in the substitution method for bounding percolation thresholds/ THE JOHNS HOPKINS UNIVERSITY/ PhD, 2005 10) Assessment practices in grade 9 academic and applied mathematics/ QUEEN'S UNIVERSITY AT KINGSTON (CANADA)/ MEd, 2005


11) Mathematics: Liminal perspectives from those living on the margin/ THE OHIO STATE UNIVERSITY/ PhD, 2005 12) A conflict in values: The dilemma of equity, diversity, and participation in higher mathematics/ UNIVERSITY OF MONTANA/ PhD, 2005 Music 1) New Tonality: An examination of the style with analyses of James Hopkins' 'Songs of Eternity', Paul Moravec's 'Songs of Love and War', and an original composition/ University of Northern Colorado, 2005 2) Musical poetics and political ideology in the work of Luigi Nono (Italy)/ PhD, Yale University, 2005 3) A pioneering twentieth-century African-American musician: The choral works of George T. Walker/ THE FLORIDA STATE UNIVERSITY/ PhD, 2005 4) Use of time and spatial form in String Quartet No. 2 by Charles Ives (with Original composition)/ UNIVERSITY OF MICHIGAN/ PhD, 2005 5) Twentieth-century flesh on an eighteenth-century skeleton: An analysis of George Walker's 'Poeme for Violin and Orchestra'/ MORGAN STATE UNIVERSITY/ MA, 2005 Philosophy/ Religion/ Theology 1) Augustine, Manichaeism and the Good/ University of Ottawa and St. Paul University, Canada, PhD, 1996 2) Overcoming Women's Subordination in the Igbo African Culture and in the Catholic Church: Envisioning an Inclusive Theology with Reference to Women/ Graduate Theological Foundation, Indiana, USA, PhD, 2001 3) Does God Change?: Reconciling the Immutable God with the God of Love/ Sydney College of Divinity - Sydney, Australia, MA, 1998 4) The Pretribulation Rapture Doctrine and the Progressive Dispensational System: Are They Compatible?/ Regent College - Vancouver, Canada, MA, 2003 5) Origen of Alexandria and St. Maximus the Confessor: An Analysis and Critical Evaluation of Their Eschatological Doctrines/ St. 
Elias School of Orthodox Theology Seward, NE, USA, PhD, 2004 6) The Authenticity of the Parable of the Wheat and the Tares and Its Interpretation/ Westminster College in collaboration with Wycliffe Hall, Oxford, England, PhD, 1991 7) Sermons, Systems and Strategies: The Geographic Strategies of the Methodist Episcopal Church in its Expansion into New York State, 1788 – 1810/ Syracuse University, Syracuse, New York, USA, PhD, 1988 8) Nichiren's Nationalism: A Buddhist Rhetoric of a Shinto Teaching/ Univ of Hawaii at Manoa, USA, MA, 1992 9) Conduct and Behavior as Determinants for the Afterlife: Comparison of the Judgments of the Dead in Ancient Egypt and Ancient Greece/ Florida State University, USA, PhD, 2000 10) The Process of the Cosmos: Philosophical Theology and Cosmology/ Flinders University of South Australia, PhD, 1998 11) Canon 1096 on Ignorance with Application to Tribunal and Pastoral Practice/ Saint Paul University, University of Ottawa, Canada, PhD, 2001 Physics 1) Electronic and Optical Properties of Semiconductors: A Study Based on the Empirical Tight Binding Model/ Worcester Polytechnic Institute, PhD, 1993 2) The Manufacture of High Temperature Superconducting Tapes and Films/ University of Southampton, U.K., PhD, 1996 3) An Improved Form for the Electrostatic Interactions of Polyelectrolytes in Solution and its Implications for the Analysis of QELSS Experiments on Sodium Dodecyl Sulfate and Cetyl Trimethyl Ammonium Bromide Solutions/ Worcester Polytechnic Institute, PhD, 1997 4) Development of the COAMPS adjoint mesoscale modeling system for assimilating microwave radiances within hurricanes/ THE FLORIDA STATE UNIVERSITY/ PhD, 2005 5) Application of effective field theory to density functional theory for finite systems/ THE OHIO STATE UNIVERSITY/ PhD, 2005 6) Cold nitric oxide molecules from a Stark guide/ THE UNIVERSITY OF OKLAHOMA/ PhD, 2005 7) A precision measurement of the top quark mass/ BOSTON UNIVERSITY/ PhD, 2005 8) NMR investigations of 
hydrogen occupation and mobility in palladium and ettringite/ WASHINGTON UNIVERSITY/ PhD, 2005


9) Numerical simulation of a single emitter colloid thruster in pure droplet cone-jet mode/ MASSACHUSETTS INSTITUTE OF TECHNOLOGY/ PhD, 2005 10) Searching for new physics: Contributions to LEP and the LHC/ THE UNIVERSITY OF WISCONSIN – MADISON/ PhD, 2005 Political Science 1) Terrorist Groups Are Aligning To Conduct Global Terrorism Breyer State University - Kamiah Idaho – USA, 2003 2) The New World Order An Economic Global Regime California State University, U.S, 1994 3) Johnson, McNamara, and the Birth of SALT and the ABM Treaty 1963-1969 Kings College, London, 1996 4) Institutional Change in the Horn of Africa The allocation of property rights and implications for development University of California, Los Angeles, 1995 5) UK Aid Policy and Practice 1974-90 An Analysis of the Poverty-Focus, Gender-Consciousness and Environmental Sensitivity of British Official Aid University of Liverpool, United Kingdom, 1994 6) Party Mobilization, Class and Ethnicity The Case of Hawaii, 1930 to 1964 Indiana University, 1996 7) Africa and the Democratic Option A Quest for Effectiveness and Legitimacy in Governance Clark-Atlanta University - Atlanta, Ga. U.S.A, 1992 8) The Politics of Transnational Television Beyond the Cultural Imperialism Question Clark-Atlanta University - Atlanta, Ga. 
U.S.A, 1994 9) Democratic Peace In the Spectrum of Conflicts in Sub-Saharan Africa University of Bradford Bradford/England, 2000 10) Regional integration and the challenge of economic development: The case of the Common Market for Eastern and Southern Africa (COMESA)/ RUTGERS THE STATE UNIVERSITY OF NEW JERSEY – NEWARK/ PhD, 2005 11) The phenomenology of courage (Plato, Herodotus, Jean-Jacques Rousseau, Alexis de Tocqueville)/ GEORGETOWN UNIVERSITY/ PhD, 2005 12) Interjurisdictional competition and urban area fragmentation/ UNIVERSITY OF MARYLAND, COLLEGE PARK/ PhD, 2005 13) Essays on political and economic attitudes/ COLUMBIA UNIVERSITY/ PhD, 2005 14) Making a global commodity: The production of markets and cotton in Egypt, Turkey, and the United States/ NEW YORK UNIVERSITY/ PhD, 2005 15) Authoritarian order as an equilibrium outcome of distributional battles in politics: The logic of war and political collusion in 19th and 20th century Mexico/ STANFORD UNIVERSITY/ PhD, 2005 Psychology 1) Repressive defensiveness in masters level psychology students: Implications for practice/ Tennessee State University, PhD 2004 2) A cross-discipline evaluation of clinical skills, knowledge base, and approaches to treatment in a psychiatric emergency, Tennessee State University PhD 2004 3) The master's tools...maori development inside pakeha psychology. MA, Hamilton: University of Waikato. New Zealand(1994). 4) Cultural safety within Clinical Psychology: A Maori perspective. MA. Hamilton: University of Waikato 1997 5) The learning preferences of Maori university students - cooperative, competitive, individualistic or intra-ethnic. MA. Hamilton: University of Waikato. 1991 6) A scale to measure motivation and behaviour in psychological experiments, 1972 7) Are we a violent society? Waikato University Psychology students' perceptions of violent crime and sentencing. 2004 8) Proactive lucidity: Superconsciousness, creativity, and the virtually real. 2004


9) Place attachment and traditional place: An examination of the land, identity and wellbeing relationship between Ngai Te Ahi and Hairini Marae. 2003 10) Client and caregiver perceptions of prodromal symptoms of relapse of schizophrenia in Aotearoa University of Waikato 2002 11) Hazard Perception and headway: The influences of drivers' perception skills on their car following behaviour. University of Waikato 2001. 12) Maori Identity within Whanau University of Waikato 1996 13) A Comparison of Two Approaches of Symbolic Modeling and Self-Efficacy/ Indiana State University, U.S., PhD, 1999 14) Parental Alienation Syndrome in Court Referred Custody Cases/ Northcentral University, Prescott, Arizona, PhD, 2001 15) Juvenile Firesetting: An Exploratory Analysis/ Indiana University, USA, PhD, 2000 16) A Study of the Thematic Apperception Test (TAT) with Japanese Subjects/ Southern California University for Professional Studies, PhD, 1998 17) Aching to Age/ Barrington University, Des Moines, Iowa, 1999 18) Paradox Lost and Paradox Regained: An Object Relations Analysis of Two Flannery O'Connor Mother-Child Dyads/ The San Francisco School of Psychology, U.S., PhD, 1999 19) Shades of Community and Conflict: Biracial Adults of African-American and JewishAmerican Heritages/ The Wright Institute: Berkeley, California, PhD, 1998 20) The Impact of an Anti-Bullying Program on the Prevalence of Bullying in Junior and Senior High School/ North Central University, Prescott, Arizona, US, PhD, 2000 21) Cognitive, Contextual, and Personality Factors in Wife Abuse/ Kansas State University, PhD, 1995 Sociology 1) Farmers' Rural Community Attachment-A Structural Symbolic Interactionist Explanation/ South Dakota State University - Brookings, SD USA, PhD, 2002 2) Towards the Formation of a Sustainable South Florida-An Analysis of Conflict Resolution and Consensus Building in the South Florida Ecosystem Restoration Initiative/ Florida International University, Miami, Florida, USA, 1999, PhD 
3) Reframing the Attitude-Behavior Debate:The Case of Meat-Abstinence in Vegetarian Student Cooperatives/ University of Michigan, MS, 1995 4) The Politics of Death: A Sociological Analysis of Revolutionary Communication/ University of Colorado, PhD, 1974 5) Sex differences in rape reporting/ Iowa State University, MS, 1995 6) NHS Complaints Managers: A Study of the Conflicts and Tensions in their Role/ London School of Economics and Political Science, University of London, London, UK, PhD, 2004 7) Women in Transition: Discourses of Menopause/ University of Windsor, Windsor, Ontario, Canada, MA, 2002 8) Student Perceptions of Rules for Classroom Interaction/ Louisiana State University, MA, 1992 9) Modernisation and social development in Saudi Arabia: An exploratory study of the attitudes of students at King Abdul Aziz University towards individual modernity/ PhD, University of Essex (United Kingdom), 2005 10) Transnational networks and community-based organizations: The dynamics of AIDS activism in Tijuana and Mexico City (Mexico)/ UNIVERSITY OF CALIFORNIA, SAN DIEGO/ PhD, 2005 11) Innovation, imitation, legitimacy and deviance in the design of graphical trademarks in the United States, 1884—2003/ THE UNIVERSITY OF ARIZONA/ PhD, 2005 12) Faculty perceptions of academic freedom at a metropolitan university: A case study/ VIRGINIA COMMONWEALTH UNIVERSITY/ PhD, 2005 13) Why neighborhoods matter: Structural and cultural influences on adolescents in poor communities (Massachusetts)/ HARVARD UNIVERSITY/ PhD, 2005 Town Planning 1)Locating the Slovenian nation: Competing folkloristic, state planning, and local constructions of the Trnovo neighborhood, 1895—1989/ University of Pennsylvania, PhD, 2005



APPENDIX G: Interview 1

Interviewee says: slm! [hi]
Interviewer says: Hello ----, are you there?
Interviewee says: yes hello!
Interviewer says: Everything OK? Shall we start?
Interviewee says: yes - let's
Interviewer says: First of all, how do you feel about EFL501 as a graduate course?
Interviewee says: very useful to students who are about to start their thesis
Interviewer says: Can you elaborate a bit on this usefulness?
Interviewee says: most of the students prior to taking this course had been away from English for a while. This course gave them a chance to reacquaint with for example grammar, vocabulary and most importantly writing skills such as organization, detailing, etc. Most of the time most were not familiar with functions necessary in specific genres in this case thesis writing.
Interviewer says: I know that you've taught this course to 2 different groups. Could you please talk a bit about the student profiles?
Interviewee says: Focusing on these specific functions enable students to approach thesis writing more confidently; they feel safe knowing when to use the right type of language in the right place. Again most of these students had not any training in academic writing and being introduced to academic writing conventions gave them the chance to express themselves in a style and manner acceptable in an academic environment.
Interviewer says: I know that you've taught this course to 2 different groups. Could you please talk a bit about the student profiles?
Interviewee says: The first group were mainly students at intermediate level (as in Prep) who struggled quite a bit in expressing themselves in writing and they had major grammar problems. The second group had a variety of students. Some were native-like students who only needed to learn about style and academic writing conventions. Few others were more advanced (upper intermediate) and they had a variety of problems such as grammar, vocabulary and elaboration of ideas. And the rest was of basic level; they needed some very basic training in English.
Interviewer says: I can see that there was a variety of students and levels within and among these groups. So are we in a position to say that the problems were specific to one group only? Or can we talk about some common problems? If so, can you please list the most common problems faced?
Interviewee says: common problems - vocabulary, coherence, grammar, organization and alaboration of ideas
Interviewee says: elaboration of ideas
Interviewer says: What do you think could be done to reduce the problems these students face?
Interviewer says: What more?
Interviewee says: practice practice & practice in writing and definitely more exposure to English throughout their academic career. And of course guidance in dealing with problems specific to each individual. Being exposed to excerpts from published thesis could be very useful too.
Interviewer says: Do you think the EFL501 course you were teaching suited the needs of these students?
Interviewee says: within the limited time - yes
Interviewer says: Did you observe any concrete developments in these students' use of the language at the end of the course? Mostly in terms of what?
Interviewee says: yes - mostly better and more formal word choice. It seemed that students paid closer attention to use of correct grammar, expressed ideas more coherently and used academic writing conventions more accurately.
Interviewer says: Thanks for your patience....Just a couple of more questions....Would you teach this course in the same way and using the same approach and materials if you taught it again? If not, what changes/additions/deletions would you make and why?
Interviewee says: I would change the reading materials which set models for students. would choose more authentic texts e.g. excerpts from published thesis. would still focus on teaching functions and the students would be introduced to them through the aforementioned authentic texts. Instead of asking students to do separate writing homework for each function, I would try to encourage them to produce texts with appropriate functions throughout the semester which would all fit different sections of a standard work of thesis e.g. introduction, conclusion, abstract, etc. The amount of work they produce - I would not change that.
Interviewer says: I don't understand the last bit.....'the amount of work they produce'??
Interviewee says: they had to do writing homework each week. And each homework - some students had to rewrite it twice or more.
Interviewer says: I see, but you would ask this work to be more in the context of parts of a thesis? I'm just checking my own comprehension ))))
Interviewee says: Yes of course.
Interviewer says: Last question....Some faculties have recently made this course a compulsory course for their graduate students. How do you feel about this?
Interviewee says: I think all the other ones should do the same
Interviewer says: ---------, thanks a lot. I really appreciate it. I'm hoping to transfer the course back to you next semester with some improvements.
Interviewee says: Wish could have helped more. U know I am more of a morning person - I hope my answers have been sufficient. Would love to see the improvements you have made.
Interviewer says: Take care!!!!!Thanks a lot. Have a good night's sleep.
Interviewee says: No problem Nilgun. I have enjoyed it. Sleep well and have a good Sunday. See you soon.
Interviewee says: And oh yes please let me know about the DVD...
Interviewer says: Definitely, iyi geceler [good night]


APPENDIX H: Interview 2

Interviewee says: Hi we are ready, are you?
Interviewer says: Please disregard the last 502 course that you have taught while answering the questions
Interviewee says: ok
Interviewer says: With how many different groups have you taught EFL 502, previously 501?
Interviewee says: Oh dear, I've failed on the first item. I don't know, but I must have taught this course six to eight times, I should think.
Interviewer says: Could you describe globally your experiences with these groups of students
Interviewee says: On the whole, the students have been willing, motivated, aware that they need help with their writing. There have been some excellent students, but a number do have serious academic writing problems for students at postgraduate level.
Interviewer says: Could you please talk a little bit about the student profiles, in terms of backgrounds, nationalities, levels, post graduate studies and anything else you think is worth mentioning
Interviewee says: They have mostly been masters students, with more recently a growing number of PhD students; nationalities have varied, with the majority being Turkish speaking. But again more recently, there has been an increase in foreign students from Iran, Palestine, Eritrea, and Algeria to name just four. Their fields of study have varied from year to year according really to the emphasis and importance...
Interviewee says: ...given to this course in departments. Architecture, Economics, and Communications have been particularly common... Academically, the foreign students' academic and linguistic background has tended to be stronger than the Turkish speakers, though there have been exceptions.
Interviewer says: your last comment....What's the reason for this do you think?
Interviewee says: I can't answer that factually, but I suspect the taking of post-graduate courses for our Turkish-speakers has become somewhat ritualised, and that the economic pressure on departments to accept students to take courses even when their level of achievement is not particularly high is quite marked...
Interviewee says: It may also be that the foreign students are more of an 'elite' group in their own countries, and we are getting better students from the top range only, whereas with the local students we enjoy the 'whole range'.
Interviewer says: You mentioned some students having serious academic writing problems. Were the problems specific to some groups or can you generalise? What were the most common problems faced by these students in writing? Can you please try to rank the problematic areas starting with the most serious?
Interviewee says: OK...I think I can probably safely generalise...
Interviewee says: 1) Lack of preparedness, meaning they are writing too early without a clear idea of where they are going.
Interviewee says: 2) Lexis and language problems, particularly the former. In fact, please rank this as number one, because it is the most difficult problem to solve in the short-term, whereas preparedness and organisational issues can be dealt with rather more easily. Specifically: their lexical range is not very high, which can lead their work to be quite repetitive, and their accuracy breaks down, because...
Interviewee says: ...their knowledge of the grammar of individual items is weak, e.g. accompanying prepositions, and their sense of collocations is limited.
Interviewee says: 4) The limited lexis can make lexical chains quite problematic, as their sense of synonymy and the need for variety in writing is quite limited.
Interviewee says: 5) Similarly, this leads to problems with nuances, e.g. that 'claim' implies a somewhat critical view on the part of the writer, whereas 'As X states' tends to indicate support; this problem with nuance also leads to problems in such areas as hedging, making exaggerated claims for their own research and so on.
Interviewee says: Structural problems also exist and are marked by a limited range of competence, but do not present such a great problem in the writing process.
Interviewer says: Can you clarify/exemplify the structural problems?
Interviewee says: Yes, there are still marked performance errors with active / passive, with relative clauses, reduction, positioning and appropriacy of discourse markers, and more basic areas with tense and agreement. Prepositions are always a problem, and in fact one real area in English where even very good students struggle is in fact the use of articles
Interviewer says: What do you think are some solutions to these concrete problems that you have mentioned?
Interviewee says: 1) More language work before they reach these critical postgraduate writing phases, based in my view on extended reading.
Interviewee says: 2) A more systematic approach to lexis, starting with receptive vocabulary instruction and then moving into productive vocabulary production.


Interviewee says: 3) More use of dictionaries, the introduction of thesaurus type work, and identification of collocations and chunks as the key feature of fluent writing.
Interviewee says: 4) More work on pre-writing activities, more exposure to models
Interviewee says: 5) Less emphasis on formal grammar, as such problems can normally be dealt with fairly easily at the process stage.
Interviewee says: 6) More confidence, by asking students to write more but in smaller and shorter doses and to do so at speed.
Interviewee says: 7) More awareness that writing is a process, and less obsession with producing 'mistake-free' work as a first draft.
Interviewee says: 8) A sense of genre, organisation, and audience.
Interviewee says: Many of these issues could be better dealt with by altering our approach to the teaching of reading, to introduce and deal with these issues.
Interviewer says: Your 501 course has now become 502. Do you mean that there should also be more language focus in the new 501 that Sezgi and I are teaching? Maybe there should be more collaboration between 501 and 502 in course design?
Interviewee says: Lexically, I am sure that one can never do enough with these students, but language-wise, I am not sure. I think we have to recognise the difference between 'learned' errors and 'performance' areas. The latter can pretty much be dealt with by a process approach; with the former, it becomes a question of error frequency, and the effects those errors have on meaning.
Interviewee says: Otherwise courses with limited hours inevitably have to prioritise.
Interviewer says: Do you think that the original 501 (now 502) that you have been teaching suits the needs of these learners? Is it adequate in terms of materials and approach? Or do you feel that changes/additions/deletions are necessary?
Interviewee says: I think it has represented a practical and economical approach to the problem that was originally faced. Good writers are good writers because they have absorbed language from written text, which is why I believe in the long run, students have to be exposed to models, and be trained to analyse these models in discoursal terms. In other words students at this point in their education and if ...
Interviewee says: ... they are to develop independent skills to deal with their language problems must become critical thinkers about and critical observers of language...
Interviewee says: Good models provide the skeletons and building blocks of good writing, and writing of this type is just as much formulaic as it is creative, a fact that academics are probably simply too immodest to admit. But once you accept the concept of collocation, and particularly chunking, I think there is little choice but to accept this


Interviewer says: Teaching the 502 course as it is now, have you been observing concrete developments in these students' language use at the end of the semester? If yes, mostly in terms of which area/s?
Interviewee says: Yes, I think so. The beauty of this lexical approach is its simplicity; what might one day have been viewed as plagiarism, is simply in fact the acquired skill of 'observing language' and recognising that formulaic and semi-formulaic, fixed and semi-fixed expressions make up the body of a text, and that creativity and originality is not dependent on them...
Interviewee says: The students certainly acquire and make use of the language once they have identified the necessary chunks, and the progress they make is quite noticeable in a way that I can't say I have observed with more structure-based approaches to language teaching. Process-wise, it also makes complete sense to me that the prior focus to all of this is logical organisation of ideas within the conventions...
Interviewee says: ...of the given genre
Interviewer says: This semester for the first time you had students who had already taken 501 before 502? Any noticeable differences? Mostly in terms of what?
Interviewee says: Definitely! Their sense of organisation was much more marked, and their sense and understanding of language functions was very much in place, so much so that elements of the course became redundant. In past years a solely structural understanding of language, and a fixation with grammar and grammar errors would be very noticeable at the beginning of the programme.
Interviewer says: Some faculties have made these two courses compulsory. How do you feel about this?
Interviewee says: I presume they have their reasons, but I would prefer them to be 'placed' in 501, or 502, or neither, according to an independent assessment of their level, either conducted by us, or through an IELTS score. It seems like something of a blanket approach to the issue to me, though it hasn't yet caused any serious problems in practice.
Interviewer says: Last question. You mentioned that some elements of 502 became redundant as these students had taken 501.
Interviewee says: yes
Interviewer says: What's the solution?
Interviewee says: I remember work I used to do in identifying and using structures that realise such functions as 'compare and contrast', 'cause and effect', and even basic work with relative clauses, and so on, as typical examples. It's not a problem basically, because once I'm aware of it, I can simply omit that work and move on. I wouldn't use the word solution, because this is not actually a problem, but rather
Interviewee says: a bonus, wouldn't you say?
Interviewer says: Yes, I agree but what about these ideas? Collaboration between the designers of these two courses so that different elements are focused on, merging the two courses together and offering only one comprehensive course or accepting students on 501 or 502 or both or neither according to their language abilities as you have already mentioned?
Interviewee says: Yes, it would make perfect sense right now to sit down with the two course programmes and overview them as a whole, now that these issues have emerged this semester. Maybe we could meet f2f somewhere?
Interviewer says: One very final question!!!!!
Interviewee says: I'm ready!
Interviewer says: Are you happy about the existence of these courses? Do you think that they are doing good service to postgraduate students? And do you feel that they themselves appreciate this support?
Interviewee says: Yes, definitely. Never at any point have I felt I was wasting my time, or that this was a service that wasn't educationally useful. A very fine job - and I think the reactions and feedback and invitations you have received are a great testimony to what has been set up in the new 501 (ie by you...). My experience is that the students do appreciate the work, and some of them express their...
Interviewee says: appreciation to a very high degree. They feel in fact very often that the real academic support they are getting is from us...
Interviewee says: final final final question?
Interviewer says: Thank you very much for your time and your very useful ideas and contributions. I deeply appreciate it.


APPENDIX I: Comparison of the top 50 words in the LAC and the TAC

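The LAC: 21,575 words. The TAC: 174,093 words.

Word Type (rank) | LAC Freq. | LAC Cum. Percent | TAC Freq. | TAC Cum. Percent
1 | 1543 | 7.15 | 12550 | 7.21
2 | 1023 | 11.89 | 8631 | 12.17
3 | 754 | 15.39 | 6783 | 16.06
4 | 595 | 18.15 | 4261 | 18.51
5 | 569 | 20.78 | 4223 | 20.94
6 | 450 | 22.87 | 3699 | 23.06
7 | 331 | 24.40 | 1904 | 24.15
8 | 247 | 25.55 | 1872 | 25.23
9 | 220 | 26.57 | 1664 | 26.19
10 | 216 | 27.57 | 1582 | 27.09
11 | 202 | 28.51 | 1426 | 27.91
12 | 164 | 29.27 | 1145 | 28.57
13 | 153 | 29.97 | 1113 | 29.21
14 | 141 | 30.63 | 1080 | 29.83
15 | 134 | 31.25 | 990 | 30.40
16 | 131 | 31.86 | 968 | 30.96
17 | 128 | 32.45 | 894 | 31.47
18 | 114 | 32.98 | 749 | 31.90
19 | 99 | 33.44 | 684 | 32.29
20 | 89 | 33.85 | 669 | 32.68
21 | 87 | 34.25 | 631 | 33.04
22 | 83 | 34.64 | 630 | 33.40
23 | 82 | 35.02 | 609 | 33.75
24 | 80 | 35.39 | 603 | 34.10
25 | 79 | 35.75 | 560 | 34.42
26 | 78 | 36.12 | 517 | 34.72
27 | 73 | 36.45 | 476 | 34.99
28 | 71 | 36.78 | 426 | 35.23
29 | 71 | 37.11 | 425 | 35.48
30 | 71 | 37.44 | 417 | 35.72
31 | 70 | 37.77 | 414 | 35.96
32 | 70 | 38.09 | 411 | 36.19
33 | 69 | 38.41 | 382 | 36.41
34 | 65 | 38.71 | 371 | 36.62
35 | 64 | 39.01 | 363 | 36.83
36 | 64 | 39.30 | 349 | 37.03
37 | 63 | 39.60 | 343 | 37.23
38 | 62 | 39.88 | 340 | 37.43
39 | 62 | 40.17 | 331 | 37.62
40 | 59 | 40.44 | 330 | 37.81
41 | 58 | 40.71 | 326 | 37.99
42 | 55 | 40.97 | 322 | 38.18
43 | 54 | 41.22 | 293 | 38.35
44 | 54 | 41.47 | 293 | 38.51
45 | 52 | 41.71 | 292 | 38.68
46 | 49 | 41.94 | 290 | 38.85
47 | 48 | 42.16 | 285 | 39.01
48 | 46 | 42.37 | 284 | 39.18
49 | 45 | 42.58 | 284 | 39.34
50 | 44 | 42.79 | 281 | 39.50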
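
The cumulative percentages in Appendix I appear to be running totals of word-type frequencies divided by the corpus size. The short Python sketch below illustrates that arithmetic under this assumption; the function name `cumulative_percent` is hypothetical and is not taken from the concordancing software used in the study. The input figures are the first three frequencies listed for each corpus.

```python
def cumulative_percent(frequencies, corpus_size):
    """Return the running share (in percent) of the corpus covered
    by the word types ranked so far, rounded to two decimals."""
    running_total = 0
    shares = []
    for freq in frequencies:
        running_total += freq
        shares.append(round(100 * running_total / corpus_size, 2))
    return shares


# First three frequencies from Appendix I (assumed to be ranked word types)
lac_top3 = [1543, 1023, 754]      # LAC: 21,575 words
tac_top3 = [12550, 8631, 6783]    # TAC: 174,093 words

print(cumulative_percent(lac_top3, 21575))   # [7.15, 11.89, 15.39]
print(cumulative_percent(tac_top3, 174093))  # [7.21, 12.17, 16.06]
```

These values agree with the first three cumulative percentages reported for the LAC and the TAC, which suggests the columns can be checked or extended in the same way.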


APPENDIX J: Family Headwords and Word Types used in the LAC and the TAC

Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
absorb | absorb | absorbed, absorption
accept | unacceptable, accepted | accept, acceptability
access | accessibility | access, accessible
accurate | inaccurately | accuracy, accurate, accurately
acknowledge | acknowledge | acknowledging
adapt | adapting, adaptation, adaptive | adapt
add | added, additions | add, adding, additive, adds
administer | administering, administrative | administered, administrator
advertise | advertisers, advertisement | ads
adverse | adverse | adversely
age | ages | aged
agent | agent | agencies, agency
aggregate | aggregates | aggregation
aid | unaided | aid
air | air | airy
align | align | alignment
alter | alter | alteration, altered
amount | amounted | amounts
analyse | analysts | analytic, analytically, analyzes
answer | answered | answering
thesis | theses | antithesis
any | anymore, anyone | anything
appear | appeared | appears
argue | arguments | argue, argued, arguing, argument
arise | arising | arise, arises, arose
arrive | arrive, arrives | arrived
articulate | articulation | articulate, articulates
ask | asks | ask, asked, asking
attain | attainment | attain
attract | attract, attracting | attracted, attractiveness
automobile | auto | automobile
award | award | awards
bank | banks | banking
begin | begin, beginnings | began, begins
bird | birds | bird
blind | blind | blindness
bond | bonds | bond, bonding
boom | boomed | boom
bear | bearing | born, borne
branch | branches | branching


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
bring | bring | brings
build | builders | builds
capacity | capacity | capacities
carry | carry, carries | carrier, carrying
cast | casting | cast
causal | causality | causal
censor | censorship | censored
centre | centre, centered | centers, centred
chance | chances | chance
cheap | cheap | cheaply
circumstance | circumstance | circumstances
civilise | civilization | civilized
claim | claiming | claimed, claims
classify | classifying, classified | classifications, classify
column | columns | column
combine | combinations | combines
commerce | commerce, commercials | commercially
commission | commissioning, commissions | commission
complete | completed | complete
concept | conceptual | conception, conceptualizes
confidence | confident | confidence
consequential | consequentially | consequential
consequent | consequent | consequently
conserve | conserved | conservationist, conserve
consist | consists | consisted, consisting
constitute | constituent | constituents, constitutes, constitutive
constrain | constrains | constrained
construct | constructions | constructs
consume | consumes | consuming
control | controlling | controlled, controller, controllers, controls
convert | converter | conversion
cope | coping | cope
correct | correct, correctness | corrected, correction
couple | couple | coupling
course | courses | course
cover | covered | coverage, covering
create | creativity, creators, recreating | creations
curve | curves | curve
cycle | cycle, cyclic | cycling
dam | dams | dam
database | database | databases
decide | undecided | decided
decrease | decreases, decreasing | decreased



Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
define | definitions | defines
degrade | degrading | degraded
demography | demographics | demography
department | departmental | department
depend | dependencies | depended, dependent, depending
depict | depicting | depicts
depress | depression | depressed, depressing
detect | detecting | detected
determine | determining, deterministic | determinations, determinism
develop | developmental | developers
disease | diseased | diseases
distinct | distinctive | distinction, distinctly
distribute | distribute | distributional
document | documenting | documentation, documented
door | indoor, outdoor | doors
doctor | doctor | dr
drama | dramatically | dramatic
drastic | drastically | drastic
drive | drive, drivers | driver
east | eastern | east
economy | economy, economical, economists | economics
educate | educated | educate, educators
elastic | inelastic | elasticity
elect | elections | elect, election
eliminate | eliminate | eliminated, eliminating
emotion | emotion | emotions
employ | employees, employers, unemployed, unemployment | employ, employee
enforce | enforcing | enforcement
engage | engaged | engage, engagement, engaging
ensure | ensure, ensures | ensured
equip | equipped | equipment
equivalent | equivalent | equivalence
establish | establishing | establish, establishes
exceed | exceeding | exceed, exceeds
excess | excess | excessive
excite | excitation | excitement, exciting
exhibit | exhibits | exhibited, exhibition
expense | expensive, expenditure, expenditures | expenses
explain | explanation | explaining
exploit | exploited, exploiting | exploits
explore | exploring | exploration
export | exporting | export
extend | extensively | extends, extensions
fail | fail, fails | failing
fall | fall | falls
fashion | fashionable | fashion


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
feed | feed | fed, feeders
fight | fight | fighting
figure | figured | figures
finance | finance, financed | financially
fiscal | fiscal | fiscally
fix | fixing | fixed
fluctuate | fluctuations | fluctuated, fluctuating
follow | follows | follow, followed
force | forced, forces | force
founded | founded | founder
friend | friendly | friendship, friendships
fulfil | fulfillment | fulfil, fulfilled
gain | gained | gains
gather | gather | gathering
generate | generations | generative
geography | geography, geographically | geographic, geographies
globe | globalized | globalization, globe
go | goes | gone
govern | governing, governed | gov, governmental, governor
graduate | graduates, graduation | graduate
graphic | graphic, graphical | graphically
hard | hardened | hardworking
harmony | harmony | harmonious
heat | heating | heats
help | helpful | helped
she | she | her
hold | holder, holding | hold, holdings
hospital | hospitals | hospital
hot | hot | hotly
human | humanity | humanism, humans
hypothesis | hypothetical | hypotheses, hypothesised, hypothesized
impose | impose, imposed | imposing
impress | impressive | impression
pure | pure | impurities, impurity
incorporate | incorporation | inc
incentive | incentive | incentives
complete | completed | incomplete
incorporate | incorporation | incorporated, incorporates, incorporating
independent | independence | independent, independently
indicate | indications | indicative
induce | induce | induced
initiate | initiated | initiation
institute | instituted | institution, institutional
interact | interactive | interactionist, interactions
depend | dependencies | interdependence


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
value | valuation | invaluable
vary | variants, varies | invariably
invest | investors | invest
investigate | investigative | investigating
involve | involved | involvement
joint | joints | joint
judge | judgement | judgments
keep | keeps | keeping
know | knowing | knowledgeable
lack | lacking, lacks | lacked
lag | lagging | lagged
last | lasting, lasts | lastly
lay | laid | lays
lead | misleading | leader, leadership
learn | learnt | learned, learners, learns
limit | limitations, limiting | limits
link | link | linking
list | listed | list
local | locals | localisation, localised, localization, localized, locally
loss | loss | losses
low | lows | lowering
maintain | maintains | maintain, maintained
manage | manage, managing | manager, managerial
map | mapping | maps
mathematics | mathematical | math
matter | matter | matters
maximise | maximize, maximized | maximization
meaning | meanings | meant
measure | measure | measurements, measuring
mechanic | mechanical | mechanics
mechanise | mechanization | mechanistic
memory | memory, memorize | memories
metre | meter | meters, metric
minimise | minimize, minimizes, minimized | minimization
understand | understood | misunderstanding
model | modeled | modeling
moment | moment | momentary
mrs | ms | mrs
nation | nationals, nationality | nation, nationally, nationwide
near | nearly | neared, nearer
new | renewing | newly
normal | normal, normalized | normally
obstruct | obstructs | obstruction
occupation | occupations | occupation, occupational
occupy | occupancy | occupied
occur | occurrence | occurring


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
organize | organized | organizers, organizes
origin | originality | origin, originated, origins
oscillate | oscillating | oscillation, oscillator
panel | panels | panel
par | pars | par
parent | parent | parents
particle | particles | particle
partner | partners, partnerships | partner
pass | passes | passing
patron | patrons, patronizing | patron
pattern | pattern | patterns
perceive | perceiving | perceives
percent | percentage | percent
perform | performer | performing
permanent | permanence | permanent
phenomenology | petroleum | phenomenology
photograph | photography, photos | photographs
place | placement | places
play | player, playing, plays | played
politic | politic, politician, politicians | politically
portfolio | portfolios | portfolio
posit | posit | posits
power | powerful | powers
precise | precise | precision
predict | predict | predictable, predicted, predictors
prepare | prepare, preparation, prepared | preparatory
preserve | preservation | preserved
pressure | pressured | pressure
prevent | prevention | preventing
profit | profit, profitable | profitably, profits
programme | programmes | programmed
progress | progressively | progress
project | projection | projected
promote | promotion, promotions | promote, promotes, promotional
proportion | proportion, proportions | proportional
prove | proved, proving | proven, proves
purchase | purchased, purchases | purchasing
pure | pure | purity
purpose | purposeful | purposely
pursue | pursue | pursuing
rate | rating | rated
react | reacted | react, reaction, reactive
realise | realize | realizes
reason | reasons | reasonably, reasoned
recognize | recognize, recognizes | recognition
combine | combinations | recombination
construct | constructions | reconstructed, reconstructing, reconstructs


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
define | definitions | redefines
distribute | distribute | redistribution
refine | refinery | refined, refinements, refining
reflect | reflection | reflections, reflective
register | registered | register
regulate | regulate | regulation, regulatory
reinforce | reinforcement | reinforced
relax | relax | relaxation, relaxed
rely | reliably, relying | relies, rely
remain | remaining | remain, remained, remains
remove | removal | remove, removed, removes
new | renewing | renewed
represent | representatives, representing | representation, representational
require | requirement | requires, requiring
research | researches | researched
use | usable | reuse
revise | revising | revision
right | right | rightly
routine | routine | routinely
rule | rules | rule
scenario | scenario | scenarios
schedule | scheduling | scheduled
science | scientifically | sciences, scientific, scientists
seem | seemed | seemingly, seems
segment | segments | segmentation
conduct | conducting, conductivity | semiconductor, semiconductors
send | sending | sent
separate | separated, separately, separates, separating | separations
sequence | sequence | sequences
serve | servant | servants, served, serves
share | share, shares | shared
shore | shores | shore
sick | sick | sickness
side | side, sides | sided
similar | similarity | similarly
sketch | sketch, sketching | sketches
slide | sliding | slides
solve | solving | solved
son | son | sons
special | special | specialization
specific | specifications | specifically, specificity
split | splits | split
square | sq | square, squares
stable | instability, stabilized | stabilizing, stable
statistic | statistic | statistically
story | stories | story


Family (HW) in both LAC and TAC | Actual LAC word tokens | Actual TAC word tokens
style | stylish | styled, stylistic
routine | routine | subroutine
succeed | succeeded | successfully
suffer | suffered, suffering, suffers | suffer
supply | suppliers | supplies
technical | technically | technical
tend | tendency | tend, tended, tends
theme | thematically | themes
threat | threats | threat
tolerate | tolerance | tolerant
travel | traveling | travel, traveling
tremendous | tremendously | tremendous
attract | attract, attracting | unattractive
avoid | avoided | unavoidable
balance | balances, balanced | unbalanced
clear | clearing | unclear
comfort | comfortable | uncomfortably
control | controlling | uncontrolled
cover | covered | uncover, uncovered
understand | understood | understands
stable | instability, stabilized | unstable
update | updated | update
utilise | utilize | utilisation, utilization, utilizes
vary | variants, varies | variance, varied, varying
visual | visually | visualization
weak | weakness, weaknesses | weakly
wish | wishes | wish
yield | yield | yielded


APPENDIX K: The TAC Wordlist - 165 Word Families selected for further analysis

Words from GSL 1
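No. | BASE ONE FAMILIES | RANGE | TYFREQ | FAFREQ | F1 | F2 | F3 | F4
1 | APPLY | 4 | 17 | 202 | 31 | 24 | 104 | 43
2 | BASE | 4 | 12 | 327 | 51 | 84 | 96 | 96
3 | BUILD | 4 | 23 | 342 | 222 | 32 | 64 | 24
4 | CASE | 4 | 153 | 188 | 31 | 27 | 52 | 78
5 | CHANGE | 4 | 140 | 315 | 60 | 81 | 67 | 107
6 | CHARACTER | 4 | 15 | 160 | 35 | 45 | 46 | 34
7 | CONSIDER | 4 | 19 | 167 | 48 | 51 | 27 | 41
8 | DESCRIBE | 4 | 39 | 155 | 36 | 51 | 41 | 27
9 | DEVELOP | 4 | 59 | 621 | 152 | 106 | 219 | 144
10 | DIFFERENCE | 4 | 12 | 302 | 47 | 104 | 83 | 68
11 | EFFECT | 4 | 70 | 254 | 36 | 45 | 69 | 104
12 | FIND | 4 | 44 | 286 | 40 | 75 | 53 | 118
13 | FORM | 4 | 117 | 220 | 67 | 76 | 35 | 42
14 | GROUP | 4 | 117 | 193 | 24 | 65 | 33 | 71
15 | HIGH | 4 | 101 | 226 | 21 | 41 | 76 | 88
16 | IMPORTANT | 4 | 126 | 190 | 52 | 61 | 25 | 52
17 | INCLUDE | 4 | 58 | 243 | 59 | 54 | 73 | 57
18 | INCREASE | 4 | 45 | 168 | 33 | 22 | 48 | 65
19 | KNOW | 4 | 13 | 161 | 20 | 38 | 44 | 59
20 | LARGE | 4 | 91 | 172 | 33 | 34 | 58 | 47
21 | LEVEL | 4 | 115 | 175 | 20 | 47 | 48 | 60
22 | MAKE | 4 | 57 | 195 | 48 | 43 | 35 | 69
23 | NATURE | 4 | 117 | 189 | 65 | 59 | 28 | 37
24 | NEED | 4 | 80 | 169 | 48 | 20 | 60 | 41
25 | NEW | 4 | 382 | 392 | 133 | 82 | 116 | 61
26 | ORGANIZE | 4 | 4 | 199 | 20 | 44 | 18 | 117
27 | PART | 4 | 113 | 164 | 38 | 46 | 49 | 31
28 | PARTICULAR | 4 | 106 | 155 | 30 | 51 | 30 | 44
29 | PRESENT | 4 | 88 | 269 | 47 | 86 | 84 | 52
30 | PROBLEM | 4 | 98 | 166 | 24 | 27 | 70 | 45
31 | PRODUCT | 4 | 46 | 160 | 42 | 28 | 37 | 53
32 | PROPOSE | 4 | 22 | 161 | 43 | 29 | 63 | 26
33 | PROVIDE | 4 | 137 | 365 | 78 | 80 | 95 | 112
34 | RELATION | 4 | 43 | 405 | 94 | 127 | 60 | 124
35 | RESULT | 4 | 76 | 363 | 55 | 91 | 114 | 103
36 | SHOW | 4 | 89 | 218 | 36 | 65 | 56 | 61
37 | STATE | 4 | 98 | 199 | 52 | 31 | 45 | 71
38 | STUDY | 4 | 669 | 919 | 174 | 270 | 153 | 322
39 | SUGGEST | 4 | 50 | 161 | 21 | 65 | 22 | 53
40 | SYSTEM | 4 | 262 | 530 | 112 | 48 | 222 | 148
41 | TIME | 4 | 190 | 219 | 35 | 68 | 59 | 57
42 | UNDERSTAND | 4 | 52 | 206 | 53 | 57 | 28 | 68
43 | USE | 4 | 292 | 930 | 187 | 213 | 284 | 246
44 | VALUE | 4 | 68 | 170 | 37 | 26 | 50 | 57
45 |  | 4 | 148 | 150 | 42 | 34 | 37 | 37
46 |  | 4 | 228 | 344 | 96 | 125 | 49 | 74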

Words from GSL 2
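No. | BASE TWO FAMILIES | RANGE | TYFREQ | FAFREQ | F1 | F2 | F3 | F4
1 | AIM | 4 | 37 | 86 | 16 | 28 | 16 | 26
2 | COLLECT | 4 | 4 | 100 | 30 | 21 | 13 | 36
3 | COMBINE | 4 | 7 | 77 | 21 | 19 | 19 | 18
4 | COMPARE | 4 | 11 | 152 | 29 | 49 | 49 | 25
5 | CRITIC | 4 | 3 | 101 | 21 | 34 | 22 | 24
6 | DISCUSS | 4 | 14 | 130 | 26 | 38 | 28 | 38
7 | EXAMINING | 4 | 24 | 303 | 73 | 101 | 33 | 96
8 | EXPLORE | 4 | 37 | 177 | 57 | 48 | 26 | 46
9 | GOVERN | 4 | 1 | 144 | 14 | 34 | 48 | 48
10 | IMPROVE | 4 | 46 | 133 | 25 | 14 | 67 | 27
11 | INFORM | 4 | 8 | 256 | 36 | 44 | 48 | 128
12 | MANAGE | 4 | 10 | 214 | 25 | 35 | 75 | 79
13 | MODEL | 4 | 293 | 444 | 72 | 87 | 139 | 146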

Words from the AWL
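No. | BASE THREE FAMILIES | RANGE | TYFREQ | FAFREQ | F1 | F2 | F3 | F4
1 | ANALYSE | 4 | 8 | 568 | 75 | 193 | 132 | 168
2 | DESIGN | 4 | 322 | 469 | 286 | 46 | 87 | 50
3 | RESEARCH | 4 | 411 | 441 | 78 | 111 | 64 | 188
4 | PROJECT | 4 | 281 | 424 | 90 | 30 | 245 | 59
5 | PROCESS | 4 | 241 | 372 | 87 | 62 | 128 | 95
6 | CULTURE | 4 | 108 | 358 | 95 | 142 | 26 | 95
7 | THESIS | 4 | 331 | 331 | 53 | 112 | 106 | 60
8 | CONSTRUCT | 4 | 14 | 310 | 86 | 37 | 161 | 26
9 | THEORY | 4 | 155 | 293 | 46 | 70 | 46 | 131
10 | ENVIRONMENT | 4 | 151 | 289 | 108 | 54 | 76 | 51
11 | IDENTIFY | 4 | 62 | 288 | 63 | 135 | 55 | 35
12 | METHOD | 4 | 86 | 287 | 44 | 62 | 112 | 69
13 | STRUCTURE | 4 | 127 | 267 | 61 | 63 | 78 | 65
14 | DATA | 4 | 262 | 262 | 31 | 62 | 77 | 92
15 | ECONOMY | 4 | 37 | 245 | 30 | 28 | 47 | 140
16 | APPROACH | 4 | 174 | 238 | 43 | 57 | 72 | 66
17 | CREATE | 4 | 48 | 219 | 79 | 48 | 39 | 53
18 | SIGNIFICANT | 4 | 132 | 212 | 33 | 66 | 35 | 78
19 | STRATEGY | 4 | 67 | 184 | 34 | 27 | 78 | 45
20 |  | 4 | 31 | 183 | 35 | 70 | 26 | 52
21 |  | 4 | 147 | 182 | 46 | 51 | 19 | 66
22 |  | 4 | 14 | 182 | 15 | 77 | 39 | 51
23 |  | 4 | 131 | 181 | 40 | 65 | 31 | 45
24 |  | 4 | 86 | 180 | 40 | 59 | 50 | 31
25 |  | 4 | 94 | 177 | 58 | 75 | 30 | 14
26 |  | 4 | 86 | 173 | 49 | 89 | 17 | 18
27 |  | 4 | 77 | 172 | 42 | 44 | 30 | 56
28 |  | 4 | 135 | 171 | 42 | 64 | 30 | 35
29 |  | 4 | 124 | 167 | 28 | 15 | 30 | 94
30 |  | 4 | 65 | 152 | 45 | 38 | 27 | 42
31 |  | 4 | 10 | 150 | 32 | 46 | 30 | 42
32 |  | 4 | 95 | 147 | 58 | 44 | 13 | 32
33 |  | 4 | 134 | 145 | 17 | 31 | 52 | 45
34 |  | 4 | 31 | 145 | 55 | 46 | 17 | 27
35 |  | 4 | 75 | 144 | 28 | 36 | 38 | 42
36 |  | 4 | 77 | 136 | 34 | 30 | 38 | 34
37 |  | 4 | 44 | 134 | 30 | 28 | 60 | 16
38 |  | 4 | 59 | 131 | 34 | 30 | 41 | 26
39 |  | 4 | 27 | 130 | 34 | 35 | 18 | 43
40 |  | 4 | 43 | 130 | 30 | 37 | 44 | 19
41 |  | 4 | 10 | 129 | 25 | 18 | 45 | 41
42 |  | 4 | 90 | 127 | 16 | 20 | 32 | 59
43 |  | 4 | 24 | 127 | 36 | 38 | 21 | 32
44 |  | 4 | 8 | 123 | 26 | 19 | 17 | 61
45 |  | 4 | 48 | 121 | 15 | 49 | 12 | 45
46 |  | 4 | 79 | 120 | 18 | 39 | 13 | 50
47 |  | 4 | 18 | 119 | 37 | 17 | 43 | 22
48 |  | 4 | 6 | 116 | 30 | 42 | 16 | 28
49 |  | 4 | 37 | 116 | 28 | 24 | 37 | 27
50 |  | 4 | 10 | 115 | 26 | 28 | 34 | 27
51 |  | 4 | 28 | 115 | 24 | 44 | 29 | 18
52 |  | 4 | 96 | 112 | 19 | 26 | 34 | 33
53 |  | 4 | 60 | 106 | 17 | 19 | 40 | 30
54 |  | 4 | 6 | 105 | 18 | 28 | 22 | 37
55 |  | 4 | 4 | 102 | 28 | 40 | 16 | 18
56 |  | 4 | 29 | 101 | 12 | 19 | 45 | 25
57 |  | 4 | 21 | 101 | 19 | 29 | 32 | 21
58 |  | 4 | 21 | 100 | 13 | 12 | 40 | 35
59 |  | 4 | 10 | 99 | 14 | 37 | 15 | 33
60 |  | 4 | 86 | 96 | 17 | 17 | 43 | 19
61 |  | 4 | 35 | 92 | 18 | 32 | 25 | 17
62 |  | 4 | 23 | 91 | 25 | 37 | 13 | 16
63 |  | 4 | 8 | 90 | 19 | 21 | 23 | 27
64 |  | 4 | 20 | 90 | 26 | 33 | 11 | 20
65 |  | 4 | 0 | 87 | 10 | 16 | 38 | 23
66 |  | 4 | 6 | 85 | 25 | 23 | 17 | 20
67 |  | 4 | 2 | 81 | 12 | 24 | 17 | 28
68 |  | 4 | 29 | 81 | 10 | 26 | 20 | 25
69 |  | 4 | 42 | 81 | 15 | 23 | 33 | 10
70 |  | 4 | 17 | 79 | 15 | 18 | 26 | 20
71 |  | 4 | 57 | 79 | 13 | 27 | 29 | 10
72 |  | 4 | 29 | 77 | 11 | 19 | 14 | 33
73 |  | 4 | 47 | 74 | 19 | 30 | 12 | 13
74 |  | 4 | 26 | 72 | 17 | 13 | 23 | 19
75 |  | 4 | 3 | 70 | 10 | 11 | 30 | 19
76 |  | 4 | 13 | 70 | 27 | 23 | 10 | 10
77 |  | 4 | 23 | 69 | 21 | 12 | 23 | 13
78 |  | 4 | 21 | 67 | 12 | 27 | 17 | 11
79 |  | 4 | 23 | 65 | 17 | 14 | 15 | 19
80 |  | 4 | 15 | 65 | 14 | 22 | 12 | 17
81 |  | 4 | 45 | 64 | 19 | 17 | 12 | 16
82 |  | 4 | 40 | 61 | 13 | 24 | 10 | 14
83 |  | 4 | 37 | 60 | 16 | 10 | 22 | 12
84 |  | 4 | 9 | 52 | 19 | 10 | 11 | 12
85 |  | 4 | 38 | 40 | 5 | 12 | 16 | 7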

Off-list Words
1 2 TYPE DISSERTATION INTERVIEWS INTERVIEW INTERVIEWED INTERVIEWEES INTERVIEWERS FAMILY OCCURRENCES: 89 ORGANIZATIONAL ORGANIZERS FAMILY OCCURRENCES: 56 OBJECTIVES NOVEL COLLABORATE COLLABORATED COLLABORATION COLLABORATIONS COLLABORATIVE COLLABORATIVELY COLLABORATOR COLLABORATORS FAMILY OCCURRENCES: 44 CORRELATE CORRELATED CORRELATION CORRELATIONAL CORRELATIONS FAMILY OCCURRENCES: 28 DEMOGRAPHER DEMOGRAPHIC DEMOGRAPHICS DEMOGRAPHY FAMILY OCCURRENCES: 31 ENTERPRISE RANGE 4 4 3 4 2 1 4 2 4 4 1 2 4 1 4 1 1 1 1 2 4 1 3 1 4 3 2 4 1 18 6 6 24 FREQ 204 62 14 10 2 1 53 3 18 45 1 2 21 1 16 1 1 1 2 5 17 1 3 F1 72 8 3 2 0 0 5 2 2 5 0 1 2 1 3 1 1 0 0 0 1 0 1 0 1 1 0 0 6 F2 48 19 8 6 1 1 12 1 1 13 0 1 5 0 3 0 0 0 2 0 2 0 1 F3 16 9 0 1 0 0 8 0 10 21 0 0 10 0 6 0 0 1 0 4 5 1 0 0 2 3 0 9 F4 68 26 3 1 1 0 28 0 5 6 1 0 4 0 4 0 0 0 0 1 9 0 1 0 7 1 5 7


***4 5 6



8 2 1 2





12 13 14







1 4 1 2 1 1 4 1 2 1 4 4 2 2 2 4 2 2 3 2 4 1 3 1 1 3 4 3 1 1 3 4 2 4 3 1 4 2 4 1 3 1 2 1

3 19 1 8 1 1 123 1 21 1 23 25 2 5 6 6 2 4 24 2 22 1 6 1 1 4 18 10 2 5 4 8 2 22 6 2 20 6 31 1 4 1 3 1

0 12 0 1 0 1 91 1 11 1 6 5 1 3 3 2 1 1 0 0 3 0 2 0 0 0 2 0 2 5 2 1 0 1 0 0 4 0 4 0 1 1 0 0

0 3 0 0 0 0 28 0 10 0 5 6 0 2 0 1 0 3 4 1 3 0 2 0 1 1 2 1 0 0 1 1 0 6 1 2 5 3 11 0 2 0 2 1

0 3 1 7 1 0 1 0 0 0 10 10 0 0 0 2 1 0 4 0 3 0 0 0 0 1 11 4 0 0 1 3 1 5 2 0 4 0 11 1 1 0 0 0

0 1 0 0 0 0 3 0 0 0 2 4 1 0 3 1 0 0 16 1 13 1 2 1 0 2 3 5 0 0 0 3 1 10 3 0 7 3 5 0 0 0 1 0


20 21


1 4 4 1 2 1 2 1 1 1 3 4 1 3 1 4 1 1 2 1 2 4 4

3 151 22 1 4 1 5 1 3 1 10 63 1 3 2 25 1 1 2 1 6 11 59

0 118 12 1 1 0 3 0 3 1 1 27 1 0 0 6 0 0 1 0 0 2 13

3 4 2 0 3 1 2 0 0 0 0 23 0 1 0 7 1 1 1 0 2 2 3

0 22 1 0 0 0 0 0 0 0 1 3 0 1 2 8 0 0 0 1 4 4 16

0 7 7 0 0 0 0 1 0 0 8 10 0 1 0 4 0 0 0 0 0 3 27



***24 ***25

*** These words are placed together with their families by the researcher for further analysis.



MOVE 1: INTRODUCTION
Sub-move 1: Defining the Scope of the Study


This study (274) / thesis (231) / dissertation (134) / research (116) / work (29) / paper (15) / project (14) / investigation (11) / report (5) / research study (3) / inquiry (2)

The scope of the research
provides (120) an overview of …
examines (114) the relationship between … / how …
explores (59) the importance of … / the ways in which … / how …
shows (52) how … / that …
presents (51) a qualitative investigation of …
focuses (43) on the development of …
demonstrates (35) how … / that … / the emergence of …
proposes (34) to determine … / that … / a new design of … / the integration of …
investigates (30) the impact of …
offers (30) alternatives / suggestions …
attempts to identify … / to establish …
that … / for the need to …




the theory and methods of a … approach …

is based

on a synthesis of … upon results obtained from …

represents (24) considers (23) seeks

a means of understanding … a series of … who / which …

to offer strategies … traces

the evolution of … the development of … a systematic methodology to … the risks of …

develops (21)



the role of …


Sub-move 2: Identifying a Research Gap
-Previous research has not adequately explained why …
-Previous research highlights a need for …
-Previous research is unclear about …
-Previous studies have demonstrated the feasibility of …, however, … still require further investigation
-Previous attempts to identify … relied on data from … Inadequacies in the data often resulted in …
-Although the importance of the topic of the … is evidenced through …, studies on this topic are scarce.
-Although there have been many studies on … in recent years, there is no comprehensive study of …
-Although researchers have devoted much attention to …, they have devoted little attention to …
-Although … has been proposed before as a solution, little serious investigation has been undertaken into …
-Although studies have discussed the importance of …, few have actually focused on …
-Although overlooked in the … domain, …
-Unlike previous … studies, …
-Unlike previous studies that examine only …, …
-Unlike previous work, …
-However, no attempt so far could be found to …
-However, no … research has been undertaken on …
-However, little research has been conducted to …
-However, little empirical research has studied the impact of … on the …
-However, little research has been directed at ….
-Despite such noteworthy efforts, however, few scholars have investigated …
-Despite the importance of the subject, no full … exists.
-Despite evidence that …, researchers still understand relatively little about …
-… yet ironically no serious study has been done on ….
-… yet no … is available.
-Yet, little has been done on …
-… yet an in-depth analysis of … does not exist.
-Little research has been done on …
-Little research in … had previously focused on the …
-Since there is no agreed and established method for …
-Since little … research has been conducted in …, …
-… the previous … research, which generally only acknowledges that …
-… to clarify previous research
-… which has not been the subject of previous study.
-… the disparate findings of previous research.
-… previously under-explored territory.
-… a previously unstudied / understudied / unexamined …
-… the unexplored problem of …
-… that have not been previously studied.
-… hitherto unexamined … materials …
-… broader … range than previously published materials.
-… largely unexplored.
-No comprehensive study of the … exists to date.
-There is no published evidence that …
-To date, however, no research exists on …
-Unfortunately, very little experimental data exists that demonstrates …
-One area of investigation that has received little critical attention, however, is …
-A challenge that has received relatively little attention is …
-This … has received little attention in … research.
-The number of studies in … is very few as opposed to other areas such as …
-Many scholars have failed to …

Sub-move 3: Filling the Research Gap
-Where previous research has often focused on …, this research targeted …
-Previous attempts to identify … relied on data from … Inadequacies in the data often resulted in … This research demonstrates that …
-Unlike previous … studies, this research explicitly takes into account …
-Unlike previous studies that examine only …, this study also focuses on …
-Unlike previous work, the current study considered the …
-This study extends the previous … research, which …
-This study seeks to go further by …
-This study combines … evidence to contribute to an ongoing effort to … advance … and reveal previously under-explored territory.
-The current study seeks to clarify previous research. This is the first study to use …
-In presenting …, this dissertation brings to light a previously unstudied phase of …
-This text marks a little milestone in the …
-This thesis presents results for a broader … range than previously published materials.
-This research represents one of a few studies to explore how …, and offers a new theoretical framework to explain …
-The purpose of this study is to provide the first … that have not been previously studied.
-The motivation for this study arises from previously unexamined phenomena
-Unfortunately, very little experimental data exists that demonstrates … Therefore, one of the objectives of this study conducted was …
-A broad range of literature has been reviewed and it was found that little research in … had previously focused on the … As such, this research presents a new area in … study.
-Despite such noteworthy efforts, however, few scholars have investigated … in any comprehensive fashion. Thus, the dissertation examines …
-Based on previously unexamined …, my dissertation emphasizes the …
-Although studies have discussed …, few have actually focused on … The overall aim of this study, therefore, is to increase understanding of why and how …, and what results are achieved.
-I emphasize two previously understudied mechanisms of …
-In this paper, we focus on the unexplored problem of … and examine methods to study how …

Sub-move 4: Stating the ‘aim’ or ‘purpose’ of the study
Pattern 1
THE purpose (82) / aim(s) (45) / objective/s (32) / goal(s) (23) / intention (10) / object (4) / intent (4)
OF THIS study (274) / thesis (231) / dissertation (134) / research (116) / work (29) / paper (15) / project (14) / investigation (11) / report (5) / research study (3) / inquiry (2)
IS TO provide (137) / examine (69) / determine (68) / identify (62) / develop (59) / understand (52) / improve (45) / increase (45) / demonstrate (43) / analyse/analyze (39) / describe (39) / explore (37) / investigate (31) / make (29) / contribute (24) / reduce (23) / enhance (23) / propose (22) / evaluate (21) / integrate (19) / apply (17) / present (16) / design (15) / compare (11) / extend (7)


Pattern 2
SEEKS (22) / AIMS (16) / ATTEMPTS (11) / TRIES (3) / INTENDS (1) TO
-address those concerns over …
-answer two questions: …
-answer some of the most fundamental questions of why and how …
-bridge the knowledge gap in …
-clarify the challenges faced in …
-contribute to the growing body of scholarship on …
-demonstrate the … role played by …
-develop a discourse on how …
-document and interpret this …
-enlarge the role of …
-establish a …
-examine the engagement between … and …
-examine the issue of … with reference to …
-explain the types of …
-explain why …
-expand … theories …
-explicate the concept of …
-formalize, and define a common ground for …
-formulate consensus through …
-further our understanding of …
-go further by …
-identify the linkages between … and … about …
-identify the most significant influences on …
-illuminate how and why …
-make sense of and provide insight into …
-offer strategies for …
-offer a theoretical foundation for the concept of …
-provide an answer to …
-provide information and analysis that would …
-understand the interaction between …


Pattern 3




-analyze the current …
-argue that …
-be the first comprehensive analysis to bring together …
-contribute to the …
-demonstrate that …
-discuss the …
-examine the …
-explore the validity of the …
-illustrate the way that …
-investigate two different approaches to …
-look at how …
-propose to reconnect …
-provide a greater understanding of …
-show that …

Pattern 4
THIS STUDY / PROJECT / DISSERTATION AIMS AT
-reconstructing …
-providing a framework that …
-planning, developing, and revitalizing the …

MOVE 2: METHODOLOGY Sub-move 1: Presenting the methodology employed -utilized -a qualitative/quantitative methodology (72) -used -employed This thesis This dissertation This report This research uses (12) -the … model -a/an … approach utilizes (9) perspective / model technique / algorithm (7) employs / / -to examine … -to construct … -to optimize … -to study …

This study


Name of approach / method / tool

is are was were has/have been can be

used (289) utilized (22) employed (20)

-This new -The research -The proposed -The resultant


produces adopts applies to provides

-to determine the … -to test the … -to direct … -to compute -for the classification of the results -by different researchers. -in this study / research. -in the analysis. -as a … device. -… results. -a/an … approach as the mode of inquiry. -a wider range of … -a … framework for …

The methodological design


a qualitative, comparative

historical- approach to … complement … and thereby strengthen the …

Many researchers

are of the belief that

the two methodologies

Using a … method Drawing upon theories of mixed-method approach …, this study uses a A multi-method approach Both the A pluralistic research method method

The research This

method method

The research method The research applies the method phenomenological The method The thesis adopts a method

provides more ... based on grounded theory and employing … is used with a quantitative analysis of …, followed by a qualitative analysis of … and the model are used in the development of … combining a case study and a field survey made up of a questionnaire and interviews is used to provide the necessary data. followed is a multiple case study analysis. of analysis is applied to two case studies of … is qualitative, using … to ... of investigation in its original form as developed by … sets the boundaries of the research. of qualitative research using … to …


Using a … A specially developed A…

approach, method method

a second analysis focuses more closely on … of … is used to categorise the …. is used to determine …


Sub-move 2: Justifying the methodology employed A pluralistic research is used to provide the necessary data. method combining a case study and a field survey made up of a questionnaire and interviews This is used as a theoretical base for the framework collection and analysis of data. The research adopted an interpretive gather approach using an information accompanied by ethnographic case study other methods of data collection technique to to have data triangulation. Archival, historical and are used to locate the data within ethnographic records … contexts and to reconstruct a model of … Based on the proposed are developed which allow the viewing or framework, two interpreting of data … within operations different contexts. This research presents and reviews a providing the reader with data significant number of new to … and innovative ways to promote … -To pursue -We address -This study explores -Some … have attempted to resolve -Resolution of this problem, -I combined elements of … theory and … text analysis. this problem -by designing and developing a … system … -by formalizing a … specification which … this problem -using an approach derived from … this problem -by involving … this problem -will require more detailed analysis of …

Sub-move 3: Describing the context
This dissertation examines the problem of … within the context of …
This thesis documents the importance of … in a wider … context.
This thesis is concerned with understanding … within the context of …
The concepts of … are considered in the context of …

Optional sub-move: Describing the variables
-Other variables such as … are considered.
-Two independent variables were analyzed for …
-Variables were investigated.
-Frequencies of occurrence between these variables were tested
-Interactions between variables contributed significantly.
-… variables included …


MOVE 3: RESULTS Sub-moves 1 and 2: What the data show / What the data mean KEY WORD: DATA -showed that -… lead to … -shows -support for this theory. that … a strong … differentiation between …. illustrates that … and … are intimately connected to … demonstrate a …, suggesting … provides an understanding of the … emphasize the complexity of the … by … supported … hypotheses … DATA provide highly suggestive evidence that (from … / … obtained from …/ -that … was influenced by … derived from … / has revealed / -… themes relating to … produced by … / revealed -striking differences in … obtained using / collected from / the ability to … gathered from …) suggests … was as being … identified corroborates observations and definitions of … has implicated the … we can the … as well as … determine implies extensive … sufficient to support the basic hypotheses: evidence was found confirmed virtually all of the research hypotheses.

The … An analysis of … Preliminary Results from an analysis of … This combination of … and … A review of … Analysis of the … These The results of the Results from … using … -The … -A thematic analysis of … -Analysis of… The Based on these It appears the The By analyzing these My analysis of … Through a set of The empirical results, supported with comprehensive secondary


KEY WORDS: ANALYSIS / ANALYSES found that … reduces the probability of … indicates that … indicated shows / showed -that … shows / has -support for … shown demonstrate -the utility of … demonstrates -that … demonstrated -the extent to which … -A identifies significant differences in … second revealed -several … characteristics … set of analyses (pl.) reveals -significant / striking differences in … (sing.) analysis (of …) -that … -The (sing.) -a heavy bias towards … results provide / -support that … of the provided -a better understanding of the … -The -an opportunity to gain insight into … -Furthe suggested -that … r suggests -similar increases in … -The -benefits … case implies extensive … -An assists in considering the … (sing.) contribute to the interpretation of … -This underscores that … (sing.) the importance of … supported the notion … -this finding aids in the … KEY WORDS: FINDINGS / RESULTS indicate / indicated that … illustrate -the duality of … -that … identified … key areas … (clearly) confirm / -that … confirmed -the significant impact of … -the hypothesis of this study. -the validity of … exhibited a… demonstrate -significant evidence of … -that … reinforce the role of … are that … show that … failed to achieve this outcome due to … represent a … improvement in … can be grouped into two categories:

(The / these / experimental/ research / three key) FINDINGS (The / empirical / these / major) RESULTS


MOVE 4: DISCUSSION Sub-move 1: Describing the ‘key findings’ KEY WORDS: IMPLY / SUGGEST The results imply … growth … These … records imply that … The rationale for this approach implies the need to … My analysis of … data implies extensive… The findings / analyses/ results/ suggest /s -that … -there is no relationship … -a link between … -… principal conclusions. … factors Differences in … The results The contribution of … to … Some practices … … It The patterns … KEY WORDS: SEEM / APPEAR seem strongly related seem related seem to be does not seem to be appear to be appeared appear appears appear immune to influence that to fit to be to be to … to … dependent on … significant. -context-dependent. -positively related to … to the influence of … … … criteria supporting the hypothesis of … no single … similar to one another.

There appears … evidence suggests appear that many … The patterns tend to

KEY WORDS: TEND / TENDENCY reflect issues that are connected to the relationship between … The results of tended to show no differences between … and the … … The findings tends to generate responses … suggest that the presence of … Evidence for tends to suggest that … … Research tendency for … to become … findings include a


Sub-move 2: Relating the findings of the study to already existing research
KEY WORDS: PAST / PREVIOUS
The findings / The results
-are consistent with past studies showing that …
-supported previous work conducted by …
-are in agreement with previous research.

KEY WORD: PREVIOUS
Consistent with previous research, the findings suggest that …
Unlike previous work, the current study considered …
Contrary to previous claims, the integration of … does not …

Sub-move 3: Describing the ‘conclusions’

KEY WORDS: CONCLUDE / CONCLUSION
This study / dissertation / thesis / investigation concludes -that … -with …
It is concluded that …
The conclusions drawn from the analysis indicate that …
A / The conclusion of this study is that …
Several conclusions derive from the …
The primary / central conclusions of this investigation are that …
The conclusion reiterates that …
In conclusion, this study shows that …
Two main conclusions are derived.

Sub-move 4: Discussing the contributions of the study to the research field

KEY WORD: CONTRIBUTION
The key contribution of this research is of this study of this research is concerns is is of this thesis of the project: of the thesis introduce are the suggestion that the use of … the support of … the provision of … twofold: practical and theoretical to demonstrate … … techniques that … The first is … 1) the development of …

The main contribution The core contribution theoretical The thesis’ contribution This dissertation’s The first two There are three main The main theoretical contribution contributions contributions contributions


This paper / -has thesis -makes

KEY WORD: CONTRIBUTION -a significant contribution methodological -the following principal contributions a theoretical a methodological contribution -by extending the … -to the ongoing debates about … -to the growing field of … to the literature. to the field of …

The thesis


This work

is believed to be

an important a significant

contribution contribution

The represent investigations reported in this thesis The results

KEY WORD: CONTRIBUTE
contribute -to the international need to document and to explore …
The research contributes (therefore) -to the … debate about … -to an understanding of …
The research / This piece of research contributes -practical implications and insights into … -to the development of …
The … analysis contributes -to the interpretation of …
Our findings contribute -not only to … but also to …
This and future work may contribute -significantly to the understanding of …
This work contributes -new techniques for …

OTHER VERBS DENOTING ‘CONTRIBUTION’ / KEY WORD: STUDY
The study
has increased our knowledge of …
constitutes a valid process for …
has resulted in … publications … -the following benefits.
provided / provides -support for …
reconceptualizes or elaborates on, or even modifies the systemic approach of … by …
should dispel some myths about …
promotes the application of …
validates the …
established -a … strategy to … -that produced a model of …


OTHER VERBS DENOTING ‘CONTRIBUTION’ / KEY WORDS: FINDINGS / RESULTS provided / provide -the knowledge needed to … -a better understanding of … (The / these / -highly suggestive evidence that … experimental/ -new insights into … research) have conclusively proven that … findings offer the first empirical evidence for establish that … were used to establish a framework for … (The / shed new light on … empirical / reinforce the role of … these / clearly confirmed -the effectiveness of major) -the validity of results open new prospects to improve challenge … approaches to … UNDERSTANDING / INSIGHT / KNOWLEDGE TO DENOTE ‘CONTRIBUTION’ The thesis makes a methodological to an understanding of … contribution This study contributes to an understanding of … This dissertation provides a better understanding of … This The new … The research These findings This research provides us with offer a number innovative contributes provide contributes by increasing insight of insights insights the knowledge our knowledge about how … into … into … needed both to … of …

Sub-move 5: Making recommendations / suggestions based on the research findings RECOMMENDATIONS: KEY WORDS: RECOMMEND /SUGGEST / PROPOSE suggest / s -that a … approach to … is absolutely necessary -a model for the … It is recommended that a … should be … is proposed that a … be added to… is suggested The findings suggest new approaches to … This thesis makes some recommendations for … This new … provides a unified set of that … recommendations This thesis / recommends that … the use of … be study applied to … The results are recommended to be applied to future … projects … This research proposed -that … The current proposes -a … model to assist in … study - a new perspective in … -ideas that … -the construction of …


RECOMMENDATIONS: KEY WORDS: RECOMMEND /SUGGEST / PROPOSE Recommendations were outlined for improving … -A series of is provided to … recommendations -Recommendations are provided to … Recommendations for include … further study Based on the findings of are put forth to aid … this research, recommendations Specific recommendations will be directed towards … A … classification of … A … scheme A new synthesis Methodological improvements is proposed. is proposed are suggested -as a way of … -which … -for … in order to …

RECOMMENDATIONS: KEY WORDS: MUST / SHOULD The research findings must be interpreted in the light of … The changes that … must be addressed and simplified Members of the … must work together and find team mutually agreeable … This thesis concluded should that … We should The findings from this should study apply work not be generalized … at the … level. to provide … with … to define all …


Sub-move 6: Discussing the Implications of the research ‘IMPLICATIONS’- KEY WORD: IMPLICATION regarding … are discussed. of this study for … are also considered. of … range from …to … are examined. The implication for … is also analyzed. One of this study is that … implication would be that … The main implications (of this research) are that … practical The wider Other of the research findings relate to … theoretical implications Practical regarding … are derived from implications findings. Implications




‘IMPLICATIONS’- KEY WORD: IMPLICATION
This research / dissertation has (important) implications for …
The results / findings have (broad) implications for … research.
The study concludes with implications for …
The study discusses (possible) implications (of (the) …) for …
The research contributes (practical) implications into …
The research offers empirically grounded implications for …

Sub-move 7: Opening up new areas of research

‘SUGGESTIONS FOR FURTHER RESEARCH’- KEY WORDS: FUTURE / FURTHER
This study offers / provides suggestions for further research.
Future / Further research is recommended into … / on …
Recommendations on further study include … methodologies …
Suggestions for further study are made.
The study emphasizes the need for further research.
Based on the results, suggestions were / are made for future research.


APPENDIX M: Screenshots of ENGL501 Moodle


APPENDIX N: A student-led discovery task

Task 1- A student-led discovery task.

Data is plural; the singular of data is datum which is the record of a single observation.

Data as noun:
• What are the relevant data?
• Pursuing it, he has logged 500,000 miles, suffered indescribable digestive indignities, and meticulously collected physiological data on the health and eating habits of 10,000 individuals, from Bantu tribesmen to Italian contadini.
• For many of these unwed mothers, the data on their family life and early childhood experiences revealed several indications and sources of their basic mistrust of their parents in particular and of the world in general.
• In addition, the 1952 study collected comparable data from 4,585 students at ten other colleges and universities scattered across the country: Dartmouth, Harvard, Yale, Wesleyan, North Carolina, Fisk, Texas, and University of California at Los Angeles, Wayne, and Michigan.

Data as an adjective
• The x ray data are consistent with particle sizes of 1000 A or greater
• Preliminary data from 1959 Eta give an average impact rate of **f for masses larger than **f for about 1000 events in a 22-day period (LaGow and Alexander, 1960).
• At the fifteenth magnitude, **f and at the twenty-fifth magnitude, **f These extrapolated fluxes are about an order of magnitude less than the values from the satellite data and the figures in Whipple's table.

Some Verbs used to talk about data
1. find out the data
2. analyze the data
3. figure out the data
4. express the data
5. provide the data
6. interpret the data
7. use the data
8. Determine the data of….
9. Measure the data from….
10. express the data in terms of physical quantities

3 sentences
1. From the resulting data the doctor can determine lung defects with hitherto unknown accuracy and detail.
2. The selective and directional qualities of basic value-orientations are clearly evident in these data:
3. The market for computers and other data-handling continues to expand at the rate of about 30% annually, reaching some $450 million in 1960.


APPENDIX O: Teachers’ Notes for the Advanced Academic Thesis Writing Course

The current ENGL501 advanced academic thesis writing course adopts a genre-based, corpus-informed, data-driven lexico-grammatical approach to thesis writing. The course therefore focuses on the generic features of texts, makes use of the corpora compiled and analyzed for this study as well as larger corpora, offers the participants the opportunity to explore the corpus data through data-driven learning activities, and pays special attention to how lexico-grammatical structures achieve different language functions. The participants are constantly alerted to the fact that fluency in productive skills is achieved through the knowledge of collocations and lexico-grammatical structures.

The course has three contact hours a week, complemented by the online component of the course on Moodle. Before the three contact hours each week, the participants’ prior knowledge of the section of the thesis to be focused on that week is elicited through a discussion forum on Moodle. The participants share their ideas regarding the sub-genre online in a collaborative environment. They are also encouraged to do research regarding the sub-genre on the web, and provide the source for reference. The discussion is generally structured in such a way as to encourage the participants to consider the sub-sections of the sub-genre in question, and how they are organized, thus focusing on the generic discourse structure. The thesis introduction is given as a sample here.


The first discussion forum on Moodle regarding thesis introductions is entitled ‘What is “CARS”?’. The participants are aware that CARS is related to thesis introductions, so they make predictions about its meaning. As they collaborate with each other, they come up with ideas such as ‘Critical Analysis Research System’, as well as more entertaining ones such as “Ferrari is the most famous motor designer in the world because of its hi-quality and excellent design”. After this initial discussion on thesis introductions, the in-class input materials are put online on Moodle one or two days before the contact hours. The participants continue their discussion in class, and find out that CARS stands for ‘Create A Research Space’ and reflects the three moves in thesis introductions. The in-class materials provide the participants with an authentic sample introduction, and a variety of tasks ensure that the discourse structure of the thesis introduction is clear. Focus on the organization is followed by work on language, that is, on how different moves and sub-moves are achieved through lexico-grammatical structures. After an intensive three hours in class, more samples of thesis introductions, links to good outside sources, and online tasks focusing on organization and language are offered to the participants on Moodle. Not only are data-driven learning tasks designed to promote the exploitation of the corpora, but the participants are also encouraged to explore the AAC Bank of moves and sub-moves for the relevant language. When enough input, exposure, and practice tasks have been provided, the participants are invited to submit their first draft introductions online. The feedback from the course instructor takes the form of suggestions and guidance to sources, and is by no means a proofreading and correction exercise. After getting feedback, the participants write their second drafts and submit them online to receive a second round of feedback from the course instructor.


