

Publisher: Routledge

Language Assessment Quarterly
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/hlaq20

General Language Proficiency (GLP): Reflections on the “Issues Revisited” from the Perspective of a UK Examination Board

Lynda Taylor
Cambridge English Language Assessment, University of Cambridge, Cambridge, United Kingdom

Published online: 21 May 2014.


To cite this article: Lynda Taylor (2014) General Language Proficiency (GLP): Reflections on the
“Issues Revisited” from the Perspective of a UK Examination Board, Language Assessment Quarterly,
11:2, 136-151, DOI: 10.1080/15434303.2014.896366

To link to this article: http://dx.doi.org/10.1080/15434303.2014.896366

Language Assessment Quarterly, 11: 136–151, 2014
Copyright © Taylor & Francis Group, LLC
ISSN: 1543-4303 print / 1543-4311 online
DOI: 10.1080/15434303.2014.896366

General Language Proficiency (GLP): Reflections on the “Issues Revisited” from the Perspective of a UK Examination Board

Lynda Taylor
Cambridge English Language Assessment, University of Cambridge, Cambridge, United Kingdom

Correspondence should be sent to Lynda Taylor, Research and Validation Group, Cambridge English Language Assessment, 1 Hills Road, Cambridge CB1 2EU, United Kingdom. E-mail: Taylor.L@cambridgeesol.org

Looking back to the language testing world of the 1980s in the United Kingdom, we need to be aware
that how we perceive or remember ourselves to have been then—whether as individual language test-
ing academics or as corporate language testing organisations—will be shaped by multiple influences.
Although we may have been present at and shared in the 1980 discussions, our recollections of how
things were then and our views on how they have (or have not) changed will vary. What follows in
this article offers a predominantly personal perspective. It is the view as I perceive it, in light of my
own journey as a UK-based language teacher and tester over the past 30 years, seen from where I
stand now as a consultant to a large international examining board in the United Kingdom. It is also
therefore an institutional perspective, drawing on a long association with one particular language test-
ing organisation. Just as my perspective is from the position of only one language testing institution,
I am also only one individual from within that institution. There will undoubtedly be other stances,
voices, and perspectives that are equally valid and relevant from within the same institution.

INTRODUCTION

Looking back in time can be an inherently risky business. Some novelists have warned of the
challenges and dangers involved. L. P. Hartley (1895–1972), for example, began his famous
1953 novel The Go-Between with the words, “The past is a foreign country: they do things dif-
ferently there” (L. P. Hartley, 1953/2000, p. 5). Aldous Huxley (1894–1963) warned of a similar
dilemma in his novel The Devils of Loudun, published the previous year: “The charm of his-
tory and its enigmatic lesson consists in the fact that, from age to age, nothing changes and yet
everything is completely different” (Huxley, 1952/2009, p. 300).
Looking back to the language testing world of the 1980s in the United Kingdom, it is per-
haps worth taking heed of such warnings that how we perceive or remember ourselves to have
been 30 years earlier—whether as individual language testing academics or as corporate language
testing organisations—will invariably be shaped by multiple influences, resulting partly from the
passage of time and partly from the place in which we now stand. It is important to be aware of
the impact of perspective, and though we may have been present at and shared in the 1980 discussions,
our recollections of how things were then and our views on how they have (or have not)
changed will vary. As theologian Professor Janet Soskice (1993) reminded us, "the truth looks
different from here" (p. 43).
I make this point at the outset because what follows in this article offers first of all a predom-
inantly personal perspective. It is the view as I perceive it, in the light of my own journey as
a United Kingdom–based language teacher and tester over the past 30 years, seen from where
I stand now as a consultant to a large international examining board in the United Kingdom.
It is also therefore an institutional perspective, drawing on a long association with one partic-
ular language testing organisation, during which I have served as both a freelance consultant
and a full-time employee, working in examiner, item writer, test developer, researcher, director,
presenter, and editorial roles over more than 25 years. It is important to emphasise that, just
as my perspective is from the position of only one language testing institution, I am also only
one individual from within that institution. There will undoubtedly be other stances, voices, and
perspectives that are equally valid and relevant from within the same institution.

The World in 1980

When looking back across time to reflect on how the journey has unfolded, it can be helpful to
remember what was going on in the world at that point in history. Was it really “a foreign coun-
try”? Are things completely different 30 years on, or are some things still very much the same?
A quick search of Wikipedia for 1980 offers the following snippets of general knowledge:
in the United Kingdom, Margaret Thatcher was prime minister of a Conservative government
that would remain in power for another 17 years; UK unemployment figures reached 1.9 mil-
lion, and inflation reached 22.8%; Zimbabwe became independent of Britain; and Soviet troops
and Mujaheddin guerrillas clashed in Afghanistan following invasion by the USSR. There might
be some evidence here for Huxley’s assertion that while everything is different, nothing really
changes. Today in the United Kingdom a Conservative government, albeit in coalition, is once
again in power. Unemployment is on the rise, and the economic conditions are not encouraging,
though inflation is thankfully nowhere near a staggering 23%! On the world stage, an independent
(but tragically impoverished) Zimbabwe is frequently in the news, and there is still a bitter war
going on in Afghanistan. It is worth noting, however, that in 1980–1981 the Internet and World
Wide Web did not yet exist, the fall of the Berlin Wall and of the Soviet empire was still 10 years
away, and the events and legacy of 9/11 (the 2001 terrorist attack on the United States involving
four hijacked planes) were still two decades in the future. So in some senses it was indeed “a
foreign country.”
On the personal front, I was 26 years old in 1980, recently married, living and working in
London as an English as a Foreign Language (EFL) teacher, first at Davies’ School of English,
Eccleston Square, and later at Eurocentre, Forest Hill. Most of my teaching was fairly typical in
the field of EFL in the United Kingdom at that time—general English proficiency classes with
students of mixed nationalities and first languages (L1s), mostly on 12-week courses and some-
times working toward a certificated language qualification. Commonly used teaching materials in
UK EFL schools at that time included published course books such as English in Situations
(O’Neill, 1970), which reflected the classic structural paradigm with endless situational dia-
logues for drilling, and Kernel Lessons Intermediate (O’Neill, Kingsbury, Yeadon, & Scott,
1972), purportedly linked to theories of transformational-generative grammar (TGG kernels) in
the Chomskyan tradition. There were also more “progressive” course books such as Building
Strategies (Abbs & Freebairn, 1979) and Functions of English (Jones, 1979), both derived from
the functional-notional syllabuses that had emerged during the 1970s. And then there was the
new and exciting Streamline English course produced by Oxford University Press (B. Hartley
& Viney, 1979) which, although it probably reverted to a more traditional structurally oriented
approach, still seemed to break new ground for many EFL teachers at that time with its exciting
full-colour cartoons and its fun and jokey style.


My own teaching timetable in 1980 was roughly 6 to 7 hr of contact time a day, teaching
mainly skill-focused lessons on reading and listening comprehension, writing, and speaking
skills, as well as enabling skills such as grammar, vocabulary, and pronunciation. For higher
level students there were also lessons covering set books from English Literature (e.g., George
Orwell’s, 1945/2008, Animal Farm) and lessons on British life, culture, and institutions (e.g.,
annual festivals such as Christmas, or national customs such as Poppy Day). I even recall teach-
ing some of my students an elective course on the history of Ireland, in an attempt to help them
understand the background to the “troubles” in Northern Ireland and the socio-political reasons
for the IRA terrorist bomb threats that we faced week by week in London at that time.
And what of language testing? What role did EFL tests or examinations play in my teaching
career in 1980? Some students followed special 3-month (or even 6–9 month) courses to pre-
pare for the Cambridge Proficiency in English (CPE) or the First Certificate in English (FCE).
In 1980 both these examinations still conformed to the versions of the tests released in 1975. The
1975 versions reflected growing interest in the communicative language teaching trend of the
1970s with its increasing concern for language in use rather than language as a system for study.
In 1980 both CPE and FCE comprised five papers: Composition, Reading Comprehension, Use
of English, Listening Comprehension, and Interview. The CPE examination lasted about 8 hr in
total, whereas FCE took just under 6 hr (compared with just under 6 and 5 hr, respectively, today).
Weir (2003) noted that, for CPE, translation tasks from and into English were still offered to test
takers, along with the English Literature papers, but these were now noncompulsory, bolt-on
options that could be endorsed on the test certificate.
Although the overall test focus and structure were similar in some ways to CPE and FCE
as they are today, there were some important differences. For example, the 1980 Listening tests for
CPE and FCE involved printed passages being read aloud to candidates from the front of the
examination room. Tape-recorded listening tests were not introduced until the 1984 revision of
both tests. This is just one example of how technological advances since 1980 have enabled
many examination boards to reengineer their large-scale testing programmes. It is important to
acknowledge, however, that the Association of Registered English Language Schools (ARELS)
had introduced tape-based tests of oral/aural skills from 1967, and I recall preparing students for
both the ARELS Certificate and Diploma in 1980–81. The CPE and FCE Interview components
(i.e., the speaking test) in 1980 included a section where the candidate and examiner had to read
between them the two parts of a dialogue from a play! (See Weir, 2003, and Hawkey, 2009, for a
comprehensive historical account of both tests.)
The Preliminary English Test (PET) had been introduced in 1978 but its take-up was still
small by comparison with its two older Cambridge siblings. Other Cambridge examinations—
such as the Certificate of Advanced English (CAE), the Key English Test (KET), the Business
English Certificates (BEC), and the Young Learner English Tests (YLE)—would not make their
appearance for at least another 10 to 15 years. In 1980 the English Language Testing Service
(ELTS) was introduced by the British Council to provide a specific-purpose language proficiency
measure for international students wishing to follow higher education or vocational training in
the United Kingdom, but numbers were small—only 3,876 in 1981, rising to just over 14,000 by
1988 (see Davies, 2008). In 1981 the Royal Society of Arts introduced the Communicative Use
of English as a Foreign Language (CUEFL) examinations, reflecting the more communicative
approach to language pedagogy and assessment that had been steadily developing in the United
Kingdom throughout the 1970s (see Hawkey, 2004b, on the development of this suite of tests).

THE 1981 “ISSUES” VOLUME

In his Introduction to the 1981 “Issues” volume, Alderson (1981a) made the following assertion:
“. . . ‘testing’ has not yet recovered from this image of being stubbornly irrelevant to or uncon-
cerned with the language teacher, except for its embodiment in “exams” which dominate many
a syllabus” (Alderson, 1981a, p. 5). Was this actually my own experience as an EFL teacher in
1980–81? Did testing have an image of being “stubbornly irrelevant to or unconcerned with” me
as a language teacher, except with regard to its embodiment in formal examinations and their
resulting dominance of my teaching syllabus? An honest answer is probably "yes and no."
On the one hand, those English language examinations that were available in the United Kingdom
in 1980 (and, as just shown, they were still relatively few in number compared with today) offered
teachers and learners at the higher proficiency levels some sort of organisational framework for
the teaching curriculum as well as a motivational driver for learning. So in one sense testing
was a relevant concern for EFL teachers. But I do not personally recall exams “dominating” the
EFL teaching syllabus, and it is worth remembering that there were really few formal tests as
yet at the lower proficiency levels (i.e., beginner, preliminary, lower intermediate). Furthermore,
few of the course books used in 1980 were closely linked to a public examination in the way
that tends to be the norm nowadays. In the schools where I taught, exam practice test books and
other materials (e.g., ARELS tapes for language laboratory practice) were certainly available, but
these were generally used only for the occasional exam-specific practice classes; such classes
were sometimes part of an optional “elective” program offered to those students who specifically
wished to prepare for public certification of their language skills. By no means all students chose
this option. I do recall that we sometimes used FCE and CPE Use of English practice papers for
content input to our Grammar and Vocabulary lessons, and if I was using CPE material I quickly
learned to make sure I always had the answer book with me in the classroom! Of course the
situation may well have been different in other UK schools, or for EFL teachers in other parts of
the world, where testing may indeed have been “embodied” in examinations that dominated the
teaching and learning experience.
On the other hand, a notion of “testing” in a broader sense than just coaching students to take
formal language examinations, that is, testing (or assessment) as an ongoing means of guiding
and supporting learning, probably did not figure very strongly in my consciousness as an EFL
teacher in 1980–81. I do not think I had an awareness of this being a part of my responsibility
or of it being an area in which I could make a contribution as a teacher, but nor do I recall it
being a part of my EFL teacher training in the late 1970s. In that sense, perhaps I did perceive
testing and assessment as largely “irrelevant and unconcerned” with me and my role as an EFL
teacher. Thirty years on, there can be little doubt that testing and assessment have moved centre
stage in language teaching and learning, and that “assessment literacy” is just one of multiple
competences needed today by language teachers (see Taylor, 2009a, for a fuller discussion of
these issues). Nowadays it would be hard to argue that testing and assessment retain an image of
being “stubbornly irrelevant to or unconcerned with the language teacher.” Alderson’s assertion
that they are embodied in examinations that dominate many a syllabus perhaps resonates much
more strongly today than it did in 1981.

The General Language Proficiency (GLP) “Issues” in the 1981 Volume

In light of my reflections thus far on the teaching and testing of general English language profi-
ciency as I experienced it 30 years ago, what were the specific “issues” related to GLP raised by
contributors to the 1980 Lancaster symposium and explored further in the follow-up published
volume? I would identify a set of eight GLP-related issues that seemed to thread their way in and
out of the six papers in the edited 1981 volume, and each of these issues is briefly outlined next.
Issue 1: Hypothesising the GLP construct. Both Palmer and Bachman’s paper and
Vollmer’s paper seemed to raise the fundamental issue of how applied linguists and language
testers can best hypothesise the GLP construct for the purposes of assessment. They reviewed the
strengths and weaknesses of the available options: the unitary competence hypothesis, the divis-
ible competence hypothesis (entailing multiple competences), and the partial divisibility hypothesis.
In raising this issue, the authors clearly acknowledged the implications of both theoretical and
empirical findings for practical testing purposes.
Issue 2: Approaches to investigating construct validity in GLP. Palmer and Bachman’s
paper highlighted the growing range of methodological approaches and considerations available
to the field for investigating construct validity in GLP, including principal components analysis,
correlational studies, and confirmatory factor analysis.
Issue 3: The social function of GLP testing. Vollmer’s paper touched upon the potential
social implications of testing—concerning “future needs or control purposes”; in doing so, per-
haps he prefigured the concerns for test impact, ethics, and professional responsibility that later
came to occupy a central position in the field during the 1990s and 2000s.
Issue 4: Appropriate and relevant tasks for GLP testing. Hughes’s paper addressed
the issue of how to specify “natural” language tasks and functions. He also raised the issue of
how to recognise differentiated performance across individuals along with the reasons that might
underpin this, such as skills, varieties, exposure, nonlinguistic factors, and processing variation.
Issue 5: The psychological reality of GLP levels. Hughes also raised a question about
the number and nature of proficiency levels that are appropriate to descriptions of particular
languages.
Issue 6: GLP as a practical or mathematical rather than theoretical phenomenon.
Davies’s paper questioned whether the way notions of competence or skills or ability are struc-
tured is partly a practical choice in relation to language teaching, language testing, learner
development, and criteria for judging eventual success. He also suggested that it may be partly a
mathematical choice—an issue of determining which is the “most elegant” solution.
Issue 7: GLP as one side in a dualist dilemma. Alderson’s (1981c) paper perceived a
conflict between a GLP view of the world, as expressed through a unitary competence hypoth-
esis, and an English for Specific Purposes (ESP) perspective, which invariably entailed multiple
competences. He further highlighted the issue of the relevance and meaningfulness of different
tests and of test profiles, that is, the question of test comparability.
Issue 8: The potential for a GLP research agenda. Alderson (1981c) also emphasised
the potential for a rich and productive GLP research agenda that would investigate the L1/L2 rela-
tionship, including the nature of proficiency and performance. Other areas for research, he
suggested, could include the description of criterion levels, the principles underlying test effi-
ciency (linked to test improvement), and understanding of language processing and subject
content.
These eight “issues” raised in the GLP section of the 1981 publication might be relabelled as
follows:
1. Construct specification
2. Research methodologies
3. Social ethic
4. Task analysis
5. Proficiency levels
6. Practicality matters
7. Test comparability
8. Research agenda

THE CONTRIBUTION OF ONE EXAMINATION BOARD OVER 30 YEARS

The remainder of this article uses these eight topic headings as a convenient organising framework
for considering what contribution, if any, one particular examination board in the United Kingdom
was able to make in each of these areas over the past 30 years. The following overview will be
necessarily brief and invariably partial and personal because it is offered from the perspective of
a single UK-based testing institution and through the eyes of one individual within that context.
Nevertheless, the following reflections will hopefully explore some of the ways in which the
1980s issues have been addressed over the past three decades. The intention in what follows is not
to imply that Cambridge English Language Assessment (formerly Cambridge ESOL, and before
that UCLES EFL) took a major lead in or had a monopoly over advances in language testing
understanding and practice, but rather to try to exemplify some of the trends and developments
of the past 30 years against the backdrop of one particular language testing institution. Other
examination boards and testing institutions, in the United Kingdom and in other parts of the
world, will all have their own stories to tell.

Construct Specification

When reflecting on the ways in which the understanding and practice of construct specification
has evolved since 1980, it is important to acknowledge the significant contribution of Canale
and Swain’s (1980) seminal paper “Theoretical Bases of Communicative Approaches to Second
Language Teaching and Testing”; of interest, their article appeared in the same year as the
Lancaster symposium, in the very first issue of the new journal Applied Linguistics. The Canale
and Swain model of communicative competence provided a valuable heuristic framework at a
time when there was a steady move toward increasingly explicit test specification in response
to demands for transparency and accountability, both internally within testing organisations and
externally from test stakeholders. This trend can be detected, for example, in the history of ELTS
and IELTS with its focus on test validation and redevelopment during the 1980s and 1990s (see
Davies, 2008). Hawkey (2009) described how the shift toward ever more detailed test specifica-
tion in the United Kingdom can also be traced in the CPE and FCE revision projects of 1984,
1996, 2002, and 2008.
It is plausible to suggest that explicit construct specification became a priority for examination
boards not only as they revised existing tests but also as they devised new tests to meet new
customer demands and market priorities. Clear differentiation between tests, whether in terms of
the target proficiency level or the intended language use domain, required a clear articulation of
the underlying constructs in order to operationalise these effectively. Since the establishment of
the Evaluation Group within UCLES EFL in 1989, extensive institutional experience of large-
scale testing has combined with findings from internal and external research to support improved
construct definition and operationalisation of the board’s examinations. The 1990s in particular
saw an increasingly systematic approach to test development and validation, articulated through
the ‘VRIP’ framework; this reflected a concern for an appropriate balance of the essential test
qualities of Validity, Reliability, Impact, and Practicality to ensure test integrity and long-term
sustainability.
More recently, Cambridge has found a comprehensive socio-cognitive framework helpful for
its test development and validation activities. The framework seeks to express a unified notion of
validity embracing multiple core components relating to the test taker, cognitive, contextual, scor-
ing, consequential, and criterion-related dimensions of any test. Shaw and Weir (2007), Khalifa
and Weir (2009), Taylor (2011), and Geranpayeh and Taylor (2013) reported the application of
the socio-cognitive framework to Cambridge’s assessment of second language writing, reading,
speaking, and listening, respectively, in a series of published volumes.
Both Morrow and Alderson seem to have foreseen such developments in the area of construct
specification when they wrote the following in 1981:
Examining bodies will have to draw up, and probably publish, specifications of the types of operations
they intend to test, the content areas to which they relate and the criteria which will be adopted in
assessment. (Morrow, 1981, p. 22)
The advantage of testing is that it forces explicitness: the test is an operationalisation of one’s
theory of language, language use and language learning. (Alderson, 1981b, p. 54)

Although both these comments were in the context of the 1980 discussion on Communicative
Language Testing, they are equally relevant to the GLP debate and indeed to the ESP debate.

Research Methodologies

The 1980s and 1990s saw the emergence of several innovative qualitative approaches for
analysing language and language tests. These new methodologies offered the potential for
analysing performance tests and data in new and exciting ways. Applied linguists and language
testers quickly set about applying the tools of discourse analysis, conversation analysis, and ver-
bal protocol analysis to language data, especially in order to investigate features of writing and
speaking assessment. Green (1998) and Lazaraton (2002) provided numerous examples of how
such qualitative research methods were successfully used with data from the Cambridge English
writing and speaking tests, helping to provide test providers with rich, empirically derived insights
into features of task design, test-taker output, rater behaviour, and assessment scales, which could
feed directly into test revision and development projects.

At the same time, the past 30 years have witnessed the rapid development and contribution of
corpus linguistics as it applies to both L1 and L2 usage. This area of research has enabled greater
understanding of the complex nature of L2 proficiency development as well as an awareness of
the marked differentiation between features of spoken and written language (see Taylor & Barker,
2008, for an overview of corpus linguistics and language assessment).
From the Cambridge perspective, the establishment of the Cambridge Learner Corpus (CLC)
in 1993 opened the way for a long-term and far-reaching program of L2 research into English,
which had direct relevance for the fields of applied linguistics, language pedagogy, curriculum
development, and language testing.1 The CLC, together with L1 corpora that are available to
researchers (e.g., the British National Corpus and the Cambridge International Corpus), has
been exploited for many years now to provide Cambridge with empirical input to its vocabu-
lary wordlists for the KET, PET, BEC, and other examinations. Research using the CLC is also
bearing fruit in not only lexical but also structural and functional analyses under the auspices of
the English Profile Programme, undertaken in collaboration with various other applied linguistics,
language teaching and language testing partners.2
Recognition has grown in recent years of the advantages and value of blending quantitative
and qualitative research methodologies, that is, “mixed methods” approaches to the analysis and
interpretation of data from language testing investigations. Studies adopting a mixed methods
approach are now routinely reported in the literature.

Social Ethic

Growing concern for a social ethic in language testing during the early 1990s, inspired partly by
Messick’s (1989) discussion of the social consequences of test use and their implications for any
test validity argument, coincided with Cambridge’s active concern for the impact of its examina-
tions given their worldwide take-up and their frequent use in high-stakes decision making. In the
1990s the examination board began to consider potential models and research methodologies for
investigating test washback and impact (Milanovic & Saville, 1996) and from 1996 onward a
long-term research agenda began to be implemented for examinations such as IELTS, CPE, and
PET (see Hawkey, 2004a, 2006, and Saville & Hawkey, 2004).
The 1990s and 2000s saw the addition of new examinations to the existing Cambridge port-
folio: CPE, FCE, and PET were joined by CAE, KET, BEC, and YLE, among others, reflecting
the board’s rejection of a “one size fits all” approach to testing. Such developments were also
a response to the changing nature of "general English" in the final decade of the 20th century
and the start of the new millennium, as mass population movements and globalisation steadily
became the norm from 1989 onward. The extension of English language teaching and learning
down the age range into the primary curriculum together with demands for new domain-specific
tests (a new wave of ESP/LSP) were key drivers in the commercial world of language testing
at this time, as were the new opportunities afforded by technology for computer- and web-based
assessment. As the ELT landscape evolved, and as stakeholder requirements and expectations
changed, so examination boards needed to adapt their provision accordingly. In his 1981 paper
on ESP, Carroll seems to have clearly foreseen this "process of diversification of test instruments
to meet the diversity of the test situations" (p. 67).

1 The CLC is an electronic database of exam scripts written by students taking Cambridge English exams around the world. For more information, go to www.cambridge.org/corpus
2 For more information about English Profile, go to www.englishprofile.org
For Cambridge, and presumably for other similar large-scale examination providers, a more
sophisticated and demanding stakeholder community meant that ever greater attention needed to
be paid to issues of test quality, as well as to matters of currency and recognition, access and
security, customer service and pricing. This coincided with a growing self-awareness in the wider
language testing profession and with early attempts at professionalisation and self-regulation by
sections of the language testing community at the national, regional and international level. For
example, in 1990 Cambridge was a founding member of the Association of Language Testers
in Europe (ALTE),3 an association of European providers of foreign language examinations that
included both universities and commercial assessment bodies. By 1994 this new professional
association had drafted its own Code of Practice, articulating a developing commitment on the
part of its member organisations to a process of continuous improvement via a system of quality
management and assurance.4 The development of such initiatives within the language testing
world can perhaps be seen against the wider canvas of socio-political trends in some parts of the
world concerned with ethical behaviour, customer charters, and service user rights.

Task Analysis

From the Cambridge perspective, the last three decades have seen a steady shift toward the
greater use of communicatively oriented skills and tasks in its multicomponent English lan-
guage tests, both those designed to assess general language proficiency and those targeted more
at specific language domains. Some of the significant changes to CPE and FCE in 1984 have
already been mentioned (i.e., introduction of recorded listening tests, more naturalistic speaking
test tasks). Partly thanks to ongoing developments in applied linguistics and language peda-
gogy, test developers’ ability to understand and describe the complex nature of performance
tasks and to operationalise these for assessment purposes steadily improved over time as evi-
denced, for example, in the mandatory use of paired speaking tests (e.g., CAE in 1991) and of
integrated reading-into-writing tasks (e.g., ELTS/IELTS). The special considerations and require-
ments associated with assessment tasks at lower proficiency levels (e.g., PET and KET), for
young learners (e.g., YLE), and for computer-based testing (e.g., BULATS) were also a key
focus for attention. Interest in task analysis and design embraces a concern to understand the
language processing and contextual parameters that can shape the way test takers engage with an
assessment task and the language sample they generate, which provides the basis for evaluation.
The psycholinguistic (interactional or cognitive) and the sociolinguistic (situational or contextual)
factors at work are much better understood, differentiated, and described now than they
were 30 years ago. The valuable contribution of conceptual frameworks developed for language
test development and validation, such as those offered by Bachman (1990) and Weir (1993, 2005),
needs to be acknowledged in this regard.

3 For more information, go to www.alte.org
4 The ALTE Code of Practice and quality management and assurance can be found at www.alte.org/qa/index.php

Proficiency Levels

Systematic work on describing and differentiating language proficiency levels for the purposes of
teaching and learning, for example, syllabus and curriculum design, had begun in the early 1970s.
Jan van Ek and John Trim’s Threshold and Waystage specifications, initiated by the Council of
Europe, were available in published form from 1975. These specifications were subsequently
revisited and updated in 1990 (Van Ek & Trim, 1998a, 1998b), paving the way for the devel-
opment of the Common European Framework of Reference for Languages in the mid-1990s
(Council of Europe, 2001). This initiative by the Council of Europe was contemporaneous with
work by Cambridge and its ALTE partners during the early 1990s on the ALTE Can-Do Project.
Both projects are perhaps best seen as responding to a perceived need to account coherently
and convincingly for the number and nature of different proficiency levels that were emerging
in the teaching, learning, and assessment of languages, not just for EFL but also for other lan-
guages across the European continent. There was a further need to somehow reconcile notions
of what was “core” and what was “specific” across differing language use domains (e.g., “gen-
eral” language as core, but with associated work/study/social dimensions that called for more
specific, domain-oriented language). In the European context, at least, there was perceived to
be considerable value in developing a common metalanguage and shared framework of refer-
ence that could aid communication and transparency among the growing numbers of people and
institutions involved in language policy and practice.
A significant development in this area over the past 30 years is the critique and subsequent
reappraisal of the native-speaker criterion as the de facto top of the proficiency scale. Another
is the acknowledgment of the complexities involved in language variation, leading to a growing
interest in linguistic varieties; concepts such as English as a Lingua Franca have begun to chal-
lenge the classical, dualistic L1 versus L2 paradigm, though the implications of this for language
testing and assessment still need to be carefully thought through (see Taylor, 2009b, for further
discussion).

Practicality Matters

The past three decades have seen something of a revolution in terms of the administrative and
practical dimensions of language testing and assessment. Technological advances have allowed
many test providers to explore new and innovative ways of designing, delivering, and process-
ing tests as well as enabling improved access (in terms of geography, timing, or equity), faster
turnaround times on results, enhanced security features, and sophisticated support systems for a
range of test stakeholders. Since the mid-1990s, the Internet and World Wide Web have allowed
large-scale testing organisations such as Cambridge to redesign and reengineer their assessment
operations in ways that the language testers who met in Lancaster in 1980–81 probably never
imagined.

At the same time, in an increasingly individualistic and consumer-oriented world, the demand
for “tailored” assessment solutions has expanded in ways that might not have been predicted
30 years ago. This is partly made possible by the multiple options and avenues that technology
now offers, promising the “diversification of test instruments to meet the diversity of the test
situations” to which Carroll (1981) referred (p. 67).
Whether technology has been exploited to its fullest potential in assessment remains a matter
for debate, however. Is it the case that traditional paper-and-pencil tests have simply been converted
to a different technological platform as a convenient rather than a creative or imaginative alternative?
It is also worth noting that many of the issues that are a high-priority concern for test
users—for example, test date frequency, accessibility, score turnaround times, security, ease of
administration, cost, and availability of support materials—are not necessarily those that exercise
academic language testers; the latter are understandably more often concerned with the essential
measurement and technical qualities of a test.

Test Comparability

The proliferation of tests over the past 30 years, combined with the interconnectedness of the
world in which language users live and move nowadays, has in recent years brought issues of test
comparability to the forefront. As Alderson predicted in 1981, increasing attention has been paid
to the relevance and meaningfulness of different tests and of the test scores they generate, espe-
cially when these are set alongside one another. Globalisation and increased people movement
for educational and employment reasons mean that decision makers are nowadays frequently pre-
sented with scores from different tests all claiming to measure the same thing, thus making the
business of appropriate score interpretation a priority issue.
The value of more transparent and accessible user-oriented scales to aid the process of score
interpretation has gradually been recognised, and since the early 1990s there have been numerous
attempts by examination providers to go beyond just providing numerical test scores or grades
to test users, whether candidates or decision makers. Nowadays, responsible examination boards
supplement test results with Can Do statements of ability, a statement of results containing a
graphical profile (e.g., showing relative strengths and weaknesses across skills), or even exemplar
performances illustrating a particular level of quality. The provision of this type of accompanying
score information and illustrative material supports the development of assessment literacy in the
public domain and it acknowledges the social function of tests highlighted by Vollmer in 1981.
There is today a growing understanding that the use of tests and test scores is often implicated
in the realms of social policy and political control. Responsibility for the appropriate use of tests
and test scores is properly shared between test providers and a range of other test stakeholders,
all of whom have a public duty and moral obligation to avoid the misuse or abuse of tests and
scores. This touches upon the complex issues that can emerge when an existing test, originally
developed for one context and purpose, is adopted to meet the needs of a different context or
purpose, a process that has been referred to as “retrofitting” (see Fulcher & Davidson, 2009).
In 1981 Carroll raised the issue of test diversification, but how sustainable is endless diversifi-
cation to meet changing or emerging testing contexts and purposes? How do language testers
determine when test situations are similar—or similar enough (i.e., when diversification is not
required or cannot be justified)?

Research Agenda

Finally, how far has the language testing and assessment research agenda changed when compared
with the research agenda as it looked in 1980–81?
From the perspective of Cambridge as a large-scale language test provider, the landscape cer-
tainly looks very different today from the way it did 30 years ago. Although the board already had
a long history of collaborating with external consultants and with members of the academic lan-
guage testing community (see the historical accounts of Davies, 2008; Hawkey, 2004b, 2009; and
Weir, 2003), this network of relationships has expanded considerably over the past three decades,
at both the individual and the institutional level.
A key moment in this part of the story must be the establishment in 1989 of Cambridge’s
Evaluation Unit, an internal research and validation facility within the examination board. Over
the past 20 years this facility, now known as the Research and Validation Group at Cambridge
English Language Assessment, has been able to articulate and implement a more systematic
and comprehensive programme of research and validation activity associated with the full range
of Cambridge tests than was the case in 1980–81. This has resulted in increased attendance
and regular presentations by Cambridge’s research staff at language testing conferences nation-
ally, regionally, and internationally (including the annual UK Language Testing Forum and the
Language Testing Research Colloquium), as well as sponsorship of a range of assessment training
and associated events for teachers, policy advisers, and other test stakeholders. Further contribu-
tions to the wider language testing and assessment community have been made through various
research and validation publications, for example, Cambridge’s quarterly Research Notes, from
2000 onwards; and the Studies in Language Testing series, jointly published with Cambridge
University Press, which since 1995 has published nearly 40 volumes including conference pro-
ceedings and quality Ph.D. dissertations, as well as authored monographs, edited collections of research
studies, and two language testing dictionaries. In addition, research papers regularly appear in
the peer-reviewed literature, for example, in international journals such as Language Testing and
Language Assessment Quarterly, and invited chapters by Cambridge staff are often included in
Handbook and Encyclopedia volumes on applied linguistics and language assessment. As a well-
resourced examination board with long experience, Cambridge is fortunate to be able to engage
in a wide variety of collaborative projects and endeavours (e.g., English Profile) and to sponsor
or support the research of others (e.g., the IELTS Joint-Funded Research Programme).
During the 1980s and early 1990s, the Cambridge examination board was heavily criticised
by some within the language testing research community for its insularity, for its failure to have
a serious research agenda and for its lack of transparency. The previous paragraph would sug-
gest that much has changed on this front since 1980–81, with significant progress having been
made by the board in interacting with the wider world, in undertaking quality research and in
being more transparent and accountable. A key aim over the past two decades has been to try to
explain to researchers and other test stakeholder groups how and why Cambridge tests English
language in the ways it does, bringing forward relevant theoretical and empirical research evi-
dence in support of claims of test quality and usefulness. It is interesting, however, to note that
some language testing researchers now express concern over Cambridge’s high profile in the
realm of research and scholarship and its influence over other language testers. One external
reviewer of this article commented as follows: “In 1981, this agency decidedly was not a player
on the stage of human ideas . . . now they are.” The dilemma is an intriguing one. It seems ironic,
and somewhat perplexing at times, that a concerted long-term effort by an institution to address
what were perceived as serious shortcomings has come to be regarded by some as questionable
in terms of its integrity and academic worth. The board’s research outputs are perceived by some
as more of a strategic marketing exercise than a genuine contribution to knowledge and scholar-
ship. One wonders why they cannot be both. Are values and qualities in scholarship inherently
incompatible with business interests, including a proper concern for commercial viability and for
quality service? The commonly held view that academic integrity and rigour are the preserve of
academic researchers needs challenging—it would surely be naïve to imagine that academics do
not have personal or institutional agendas of their own! From a personal perspective, it is tempt-
ing to feel that the examination board faces a no-win situation in this regard, especially given
that the position of Cambridge English Language Assessment is further complicated by its ori-
gins and heritage. As both a nonteaching department within a high-profile research university (cf.
Cambridge University Press) and as a key business strand within an international, multimillion-pound
higher education enterprise, the examination board enjoys a hybrid academic-commercial
status. It seems difficult to reconcile the positive and negative perceptions of Cambridge’s power
and influence in scholarship that apparently exist among the wider language testing research
community. Perhaps the most that can be hoped for is that the board’s scholarly offerings will
be received and critically evaluated in a fair and balanced way and that a process of mutually
respectful and constructive dialogue can be maintained.5

5 I am grateful to one of the external reviewers of an earlier draft for drawing my attention to the issue of power and influence in scholarship and for inviting some reflection on it here.

WHAT WAS NOT DISCUSSED IN 1980?

The early part of this article sought to identify eight key issues that appeared to thread their way
through the GLP discussions of the 1980 Lancaster symposium and its follow-up publication
in 1981. These strands were then used as an organising framework for considering some of the
developments that have been made over the past 30 years and what contribution a large-scale
examination board such as Cambridge has been able to make in each of the areas.
Another way of exploring what has changed in language testing over the past 30 years is to
consider what issues or topics did not apparently surface in discussions at the 1980 Lancaster
symposium or the subsequent 1981 publication. This may have been because they did not seem
relevant or significant at the time, or because they were not yet even on the radar or horizon of
language testers, at least in the UK context. If such a list were to be compiled, then it might
include the following:
• The contribution of corpus linguistics to language testing (especially as it relates to the
distinction between spoken and written language)
• The Common European Framework (though Waystage and Threshold had been around since
1975)
• New technologies for test delivery and scoring
• Washback and impact (at least not in those terms)
• Language testing ethics—accountability, codes of ethics/practice
• Language testing ecology (including the vast publishing and pedagogic infrastructures that
now accompany many commercially available language tests)
• The consequences of test use and consequential validity
• The commercialisation and/or industrialisation of language testing
• The globalisation of language testing
• The professionalisation of language testing
• The potential of qualitative methodologies for test development and validation
• The assessment of younger learners
• The need for assessment literacy—in both professional and public domains

It is no surprise that most of the above have been touched upon in the aforementioned reflections,
confirming just how much things have moved on in our field over the past 30 years.

CONCLUDING REMARKS

Testing language proficiency is akin to taking a snapshot at a particular moment in time. As Vollmer pointed out in 1981,

Testing language proficiency means making a cut at a given point in time in order to form a more or
less rough idea of a person’s state of achievement. . . . Proficiency then is a dynamic construct as the
relative degree or level of competence a person has reached by the time of measurement. (p. 164)

Taking snapshots is largely what language testers and examination boards do. They interpret
language proficiency (like age or height) as a dynamic construct, as the relative degree or level a
person has reached by the time of measurement. Like photographers they make a cut at a given
point in time to form a more or less rough idea of a person’s state of advancement. Perhaps the
1980 Lancaster symposium, and the 1981 Issues in Language Testing volume that resulted from
it, actually constitute a similar type of “snapshot in time,” that is, a snapshot capturing the features
and characteristics of the UK language testing landscape as it was then.
This article has attempted to reconstruct that 1980–81 snapshot in the United Kingdom and
to set it alongside a photograph of the field 30 years on. The field certainly appears older and
more mature, just as a natural landscape would after the passage of three decades. Hopefully we
are also wiser. But real wisdom involves not just celebrating what has been learned as a result of
the in-between years but also acknowledging how much still remains to be learned in the years
to come. Perhaps we do well to remind ourselves of words usually attributed to George Bernard
Shaw: “We are not made wise by the recollection of our past, but by the responsibility for our
future.”

REFERENCES

Abbs, B., & Freebairn, I. (1979). Starting strategies. Harlow, UK: Longman.
Alderson, J. C. (1981a). Introduction. In J. C. Alderson & A. Hughes (Eds.), Issues in language testing (ELT Documents
111, pp. 5–9). London, UK: The British Council.
Alderson, J. C. (1981b). Reaction to the Morrow paper (3). In J. C. Alderson & A. Hughes (Eds.), Issues in language
testing (ELT Documents 111, pp. 45–54). London, UK: The British Council.
Alderson, J. C. (1981c). Report of the discussion on General Language Proficiency. In J. C. Alderson & A. Hughes (Eds.),
Issues in language testing (ELT Documents 111, pp. 187–194). London, UK: The British Council.
Alderson, J. C., & Hughes, A. (Eds.). (1981). Issues in language testing (ELT Documents 111). London, UK: The British
Council.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford, UK: Oxford University Press.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing.
Applied Linguistics, 1(1), 1–47.
Carroll, B. J. (1981). Specifications for an English language testing service. In J. C. Alderson & A. Hughes (Eds.), Issues
in language testing (ELT Documents 111, pp. 66–110). London, UK: The British Council.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment.
Cambridge, UK: Cambridge University Press.
Davies, A. (1981). Reaction to the Palmer and Bachman and the Vollmer papers (2). In J. C. Alderson & A. Hughes
(Eds.), Issues in language testing (ELT Documents 111, pp. 182–186). London, UK: The British Council.
Davies, A. (2008). Assessing academic English: Testing English proficiency 1950-1989—The IELTS solution. Cambridge,
UK: UCLES/Cambridge University Press.
Fulcher, G., & Davidson, F. (2009). Test architecture, test retrofit. Language Testing, 26(1), 123–144.
Geranpayeh, A., & Taylor, L. (Eds.). (2013). Examining listening: Research and practice in assessing second language
listening. Cambridge, UK: UCLES/Cambridge University Press.
Green, A. (1998). Verbal protocol analysis in language testing research: A handbook. Cambridge, UK:
UCLES/Cambridge University Press.
Hartley, B., & Viney, P. (1980). Streamline English. Oxford, UK: Oxford University Press.
Hartley, L. P. (2000). The go-between. London, UK: Penguin Classics. (Original work published 1953)
Hawkey, R. (2004a). The CPE textbook washback study. Cambridge ESOL Research Notes, 20, 19–20.
Hawkey, R. (2004b). A modular approach to testing English language skills: The development of the Certificates in
English Language Skills (CELS) examinations. Cambridge, UK: UCLES/Cambridge University Press.
Hawkey, R. (2006). Impact theory and practice: Studies of the IELTS test and Progetto Lingue 2000. Cambridge, UK:
UCLES/Cambridge University Press.
Hawkey, R. (2009). Examining FCE and CAE: Key issues and recurring themes in developing the First Certificate in
English and Certificate in Advanced English examinations. Cambridge, UK: UCLES/Cambridge University Press.
Hughes, A. (1981). Reaction to the Palmer and Bachman and the Vollmer papers (1). In J. C. Alderson & A. Hughes
(Eds.), Issues in language testing (ELT Documents 111, pp. 176–181). London, UK: The British Council.
Huxley, A. (2009). The devils of Loudun. London, UK: HarperCollins. (Original work published 1952)
Jones, L. (1979). Functions of English. Cambridge, UK: Cambridge University Press.
Khalifa, H., & Weir, C. J. (2009). Examining reading: Research and practice in assessing second language reading.
Cambridge, UK: UCLES/Cambridge University Press.
Lazaraton, A. (2002). A qualitative approach to the validation of oral language tests. Cambridge, UK:
UCLES/Cambridge University Press.
Messick, S. A. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Washington, DC:
The American Council on Education and the National Council on Measurement in Education.
Milanovic, M., & Saville, N. (1996). Considering the impact of Cambridge EFL examinations (Internal working report).
Cambridge, UK: University of Cambridge Local Examinations Syndicate.
Morrow, K. (1981). Communicative language testing: Revolution or evolution? In J. C. Alderson & A. Hughes (Eds.),
Issues in language testing (ELT Documents 111, pp. 9–25). London, UK: The British Council.
O’Neill, R. (1970). English in situations. Oxford, UK: Oxford University Press.
O’Neill R., Kingsbury, R., Yeadon, T., & Scott, R. (1972). Kernel lessons intermediate. Harlow, UK: Longman.
Orwell, G. (2008). Animal farm. London, UK: Penguin. (Original work published 1945)
Palmer, A. S., & Bachman, L. F. (1981). Basic concerns in test validation. In J. C. Alderson & A. Hughes (Eds.), Issues
in language testing (ELT Documents 111, pp. 135–151). London, UK: The British Council.
Saville, N., & Hawkey, R. (2004). The IELTS impact study: Investigating washback on teaching materials. In L. Cheng,
Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research contexts and methods (pp. 73–96). London,
UK: Erlbaum.
Shaw, S. D., & Weir, C. J. (2007). Examining writing: Research and practice in assessing second language writing.
Cambridge, UK: UCLES/Cambridge University Press.
Soskice, J. M. (1993). The truth looks different from here, or on seeking the unity of truth from a diversity of perspectives.
In H. Regan & A. Torrance (Eds.), Christ and context (pp. 43–59). London, UK: T & T Clark.
Taylor, L. (2009a). Developing assessment literacy. Annual Review of Applied Linguistics, 29, 21–36.
Taylor, L. (2009b). Setting language standards for teaching and assessment: A matter of principle, politics or prejudice?
In L. Taylor & C. J. Weir (Eds.), Language testing matters: Investigating the wider social and educational
impact of assessment – proceedings of the ALTE Cambridge conference, April 2008 (pp. 139–157). Cambridge, UK:
UCLES/Cambridge University Press.
Taylor, L. (Ed.). (2011). Examining speaking: Research and practice in assessing second language speaking. Cambridge,
UK: UCLES/Cambridge University Press.
Taylor, L., & Barker, F. (2008). Using corpora for language assessment. In E. Shohamy & N. Hornberger (Eds.),
Encyclopedia of language and education (2nd ed.)—Language testing and assessment, Volume 7 (pp. 241–254). New
York, NY: Springer Science+Business Media.
van Ek, J. A., & Trim, J. L. M. (1998a). Threshold 1990. Cambridge, UK: Cambridge University Press.
van Ek, J. A., & Trim, J. L. M. (1998b). Waystage 1990. Cambridge, UK: Cambridge University Press.
Vollmer, H. (1981). Why are we interested in General Language Proficiency? In J. C. Alderson & A. Hughes (Eds.),
Issues in language testing (ELT Documents 111, pp. 152–175). London, UK: The British Council.
Weir, C. J. (1993). Understanding and developing language tests. Hemel Hempstead, UK: Prentice Hall International
Ltd.
Weir, C. J. (2003). A survey of the history of the Certificate of Proficiency in English (CPE) in the twentieth century.
In C. J. Weir & M. Milanovic (Eds.), Continuity and innovation: Revising the Cambridge Proficiency in English
examination 1913–2002 (pp. 1–56). Cambridge, UK: UCLES/Cambridge University Press.
Weir, C. J. (2005). Language testing and validation: An evidence-based approach. Basingstoke, UK: Palgrave Macmillan.
