
ARTICLE REVIEW

WHAT DOES LANGUAGE TESTING HAVE TO OFFER?

LYLE F. BACHMAN

University of California, Los Angeles

Lyle Bachman highlights three areas of advances in language testing in his article and describes an interactional model of language test performance that includes "language ability" and "test method" as its two major components.

In the first paragraph of his article, Bachman notes that language ability consists not only of knowledge about the language, but also of the metacognitive strategies for using that language in varied forms and for various purposes, whereas the test method includes the testing environment, the input, the expected output, the scoring rubric, and the relationships between and among these facets. Bachman also distinguishes two aspects of authenticity: (a) situational authenticity and (b) interactional authenticity. In the same opening paragraph, he identifies three earlier areas of advances in language testing: (a) the development of a theoretical view that considers language ability to be multicomponent and recognizes the influence of the test method and test taker characteristics on test performance, (b) the application of more sophisticated measurement and statistical tools, and (c) the development of "communicative" language tests that incorporate principles of "communicative" language teaching.

In his article, Bachman mentions that various studies and reviews have been conducted in this same area to discuss further the importance of language testing in applied linguistics and several related fields. Notably, Alderson (1991) and Skehan (1988, 1989, 1991) argue that language testing "has come of age as a discipline in its own right," and they present substantial evidence for this argument. A common theme in their claims is that the field of language testing has much to offer, in terms of theoretical, methodological, and practical accomplishments, to its sister disciplines in applied linguistics. Bachman also mentions that Skehan and Alderson went on to examine issues in language testing, covering a varied range of interests within the field that have been discussed from many perspectives.

In the third paragraph of his article, Bachman mentions several areas in which valuable contributions can be expected from the studies and review articles previously discussed. He mentions that a theoretical model of second language ability would benefit researchers, second language teachers, and learners. Furthermore, he mentions that language testing helps to conceptualize second language acquisition, which has been the subject of various research studies in relation to instructional efforts and to designing language tests for effective use in instructional settings and in second language teaching and learning. An approach to characterizing the authenticity of language tasks is described in his article as well.

As mentioned earlier, both Alderson and Skehan argued that while language testing has progressed in some areas, until recently there had been little progress in others. Part 1, Language Testing in the 1990s, discusses various factors, reasons, and possible outcomes of language testing, and what it has to offer to applied linguistics and other fields. Notably, it highlights theoretical issues, methodological advances, and language test development.

In addition, Oller (1979) claimed that language proficiency consists of a single, global ability, a view that was widely accepted at the time but was later challenged by various empirical studies. As the article argues, this unitary trait view was replaced by the view that language proficiency is multicomponent. Later in the article, Alderson is quoted as stating that we need to be concerned "not only with . . . the nature of language proficiency, but also with language learning and the design and researching of achievement tests; not only with testers, and the problems of our professionalism, but also with test takers, with students, and their interests, perspectives and insights." In this part, both Alderson and Skehan connect this progress to Bachman's model of language test performance, since it includes both language ability and the characteristics of test methods, and thereby relates actual test performance to underlying abilities.

In relation to this, studies by Bachman and Palmer (1981, 1982, 1988), Clifford (1981), and Shohamy (1983, 1984) demonstrated that the kind of test used can affect performance as much as the abilities we would like to measure. The article also mentions that Alderson and Urquhart (1985) and Erickson and Molloy (1983) demonstrated that the topical content of test tasks can affect performance. The results of these studies ignited further interest in the investigation of test content: Alderson and colleagues (Alderson, 1986, 1990; Alderson & Lukmani, 1986; Alderson, Henning, & Lukmani, 1987) investigated the extent to which "experts" would agree on which specific skills EFL reading test items measure and on the level of ability at which an item should be considered difficult. Various researchers subsequently conducted several studies to support these claims and to provide data in answer to these questions.

In the later part of the section on theoretical issues, the article reports that these investigations concluded that the experts "do not agree" and that there was no evident relationship between judgments of the levels of ability tested and empirical item difficulty. Bachman and colleagues (Bachman, Davidson, Lynch, & Ryan, 1989; Bachman, Davidson, & Milanovic, 1991; Bachman, Davidson, Ryan, & Choi, in press), on the other hand, claimed to have found that, by using a content-rating instrument based on the taxonomy of test method characteristics presented in Bachman (1990b) and by training raters to a common standard, relationships between content ratings and item difficulty and discrimination could be identified.

Furthermore, despite these claims, Bachman and colleagues did not, in my view, provide consistent and sufficient factual evidence to support them and take a strong stand on the matter, whereas, based on the article, earlier researchers such as Alderson and colleagues presented evidence and supporting details for their collective conclusions, as mentioned previously.

The article highlights that Bachman's research presents "what can be accomplished in a highly controlled situation, and provides one approach to solving the problem." Future researchers and experts can therefore further refine approaches to the investigation and analysis of test methods and of how the characteristics of test methods affect test performance. Many experts and researchers hold different and varied views about what language testing has to offer, particularly to applied linguistics and related language fields, and specifically with regard to theoretical issues.

As noted in the article, a number of studies have shown differences in test performance across different cultural, linguistic, or ethnic groups (e.g., Alderman & Holland, 1981; Chen & Henning, 1985; Politzer & McGroarty, 1985; Swinton & Powers, 1980; Zeidner, 1986), while others have found differential performance between the sexes (e.g., Farhady, 1982; Zeidner, 1987). Other studies have found relationships between field dependence and test performance (e.g., Chapelle, 1988; Chapelle & Roberts, 1986; Hansen, 1984; Hansen & Stansfield, 1981; Stansfield & Hansen, 1983).

Such studies demonstrate the effects of varied test taker characteristics on test performance, and these effects should be considered when designing tests and the rubrics used to score and interpret them. Although several studies have been conducted on the topic, there is still no strong claim or clear, concise evidence establishing the relationships among test method, test taker characteristics, testing environment, test practitioners, test takers, and levels of difficulty. Even the significance of the relationship between second language acquisition, teachers' and learners' levels of metacognition, and test outcomes has not been clearly established. This indicates that further study and investigation is needed to provide a clear interpretation and understanding of these theoretical issues.
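
The Mantel-Haenszel procedure cited later in the article (Ryan & Bachman, in press) is one common way of screening items for exactly this kind of differential group performance. The following is a minimal, hypothetical sketch of that computation for a single item; the counts, group labels, and flagging logic are invented for illustration and are not drawn from any of the studies cited.

```python
import math

# Hypothetical 2x2 tables for one test item, stratified by total score level.
# Each stratum: (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
strata = [
    (30, 20, 22, 28),   # low scorers
    (45, 15, 35, 25),   # mid scorers
    (50, 10, 44, 16),   # high scorers
]

num = 0.0  # sum over strata of A_k * D_k / N_k
den = 0.0  # sum over strata of B_k * C_k / N_k
for a, b, c, d in strata:
    n = a + b + c + d
    num += a * d / n
    den += b * c / n

alpha_mh = num / den                   # common odds ratio across score strata
delta_mh = -2.35 * math.log(alpha_mh)  # ETS delta scale; values far from 0 suggest DIF

print(f"Mantel-Haenszel odds ratio: {alpha_mh:.2f}")
print(f"MH D-DIF (delta scale): {delta_mh:.2f}")
```

A flagged item is not automatically biased; it simply signals that the item behaves differently across groups and warrants the kind of closer investigation of test method and test taker characteristics discussed above.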

In the later part of Part 1, on methodological advances, Bachman highlights three areas: psychometrics, statistical analysis, and qualitative approaches to the description of test performance. He mentions that the 1980s witnessed the application of various modern psychometric tools to language testing, namely item response theory (IRT) (Henning, 1987), generalizability theory (G theory) (Bachman, 1990b; Bolus, Hinofotis, & Bailey, 1982), criterion-referenced (CR) measurement (Bachman, 1990b; Hudson & Lynch, 1984), and the Mantel-Haenszel procedure (Ryan & Bachman, in press), material that is acknowledged to be fairly technical. The article notably mentions that the application of item response theory has brought about promising advances in computer-adaptive language testing, which promises to make language tests more efficient, more adaptable to individual test takers, and thus potentially more useful in the types of information they provide (e.g., Tung, 1986).
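
As a rough illustration of why item response theory lends itself to computer-adaptive testing, the sketch below uses a two-parameter logistic (2PL) model and selects whichever item in a small, invented item bank is most informative at the test taker's current ability estimate. The parameters, bank, and selection rule are assumptions for illustration only, not a description of the systems cited (e.g., Tung, 1986).

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta,
    given item discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a ** 2 * p * (1.0 - p)

# Hypothetical item bank: (discrimination a, difficulty b) per item
item_bank = [(0.8, -1.0), (1.2, 0.0), (1.5, 0.5), (0.9, 1.2)]

def next_item(theta, bank):
    """Adaptive rule: administer the item that is most informative
    at the current ability estimate."""
    return max(bank, key=lambda item: item_information(theta, *item))

theta_estimate = 0.3  # current provisional ability estimate
a, b = next_item(theta_estimate, item_bank)
print(f"Next item: a={a}, b={b}, "
      f"P(correct) = {p_correct(theta_estimate, a, b):.2f}")
```

Maximum-information selection is only one possible adaptive rule; operational systems also handle ability re-estimation after each response, content balancing, and item exposure control.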

However, as Canale (1986) states, this development "also presents a challenge not to complacently continue using familiar testing techniques simply because they can be administered easily via computer." Alderson (1988a) and Stansfield (1986) provide extensive discussions of the applications of computers to language testing.

The article also discusses the major advance in the area of statistical analysis: the application of structural equation modelling to language testing research, as found in Long (1983a, 1983b). Further studies with similar claims were conducted by several researchers (Fouly, 1985; Purcell, 1983; Sang, Schmitz, Vollmer, Baumert, & Roeder, 1986).
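
Structural equation modelling itself is normally carried out in dedicated software, and the models in the studies cited are not reproduced here. Purely as a loose, simplified illustration of the underlying idea that several observed test scores can be treated as reflecting a smaller number of latent abilities, the sketch below runs an exploratory factor analysis (a much simpler technique than SEM) on simulated scores; the latent abilities, loadings, and subtest labels are invented.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 200 test takers with two hypothetical latent abilities.
latent = rng.normal(size=(200, 2))

# Six observed subtest scores, each loading mainly on one latent ability,
# plus measurement error -- the point being that scores reflect multiple
# factors rather than a single global ability.
loadings = np.array([
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1],   # e.g., grammar, vocabulary, cohesion
    [0.1, 0.9], [0.2, 0.8], [0.1, 0.7],   # e.g., register, functions, sociolinguistic
])
observed = latent @ loadings.T + rng.normal(scale=0.5, size=(200, 6))

# Fit a two-factor model and inspect the recovered loadings (up to rotation).
fa = FactorAnalysis(n_components=2, random_state=0).fit(observed)
print(np.round(fa.components_, 2))
```

Unlike this exploratory sketch, structural equation modelling specifies the factor structure in advance and tests how well it fits the observed score covariances, which is what made it attractive for the multicomponent view of proficiency discussed above.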

Although several studies were conducted and various claims were presented, each with different theories and sample data, I would agree that the most important theoretical development in language testing in the late 1980s was the recognition that a language test score represents complex information shaped by multiple factors. A test score alone cannot precisely identify a test taker's level of language skill or proficiency in a given language, whether in first or second language learning and acquisition.

Researchers have argued about and spent a considerable amount of time and effort investigating the application of methodological tools, and however sophisticated the techniques they use to reveal how complex responses to test items can be, I would agree that the legacy of the 1980s is that a language test score cannot be interpreted simplistically as an indicator of the particular language ability we want to measure; several factors must be considered, because they affect the test result. Even the interpretation of a test score varies with the interpretive standards of the test practitioners and the test takers.

This realization reminds us that we should carefully consider how we interpret and use test scores. There is no standard, correct, or best test design, method, tool, or task, only ones that are suitable and appropriate for particular test takers and their intended purposes.

The last section of Part 1, on advances in language test development, discusses two strains of communicative approaches to language testing. One strain of communicative tests, illustrated by the Ontario Assessment Pool (Canale & Swain, 1980a) and the A Vous la Parole testing unit described by Swain (1985), traces its roots to the Canale/Swain framework of communicative competence (Canale, 1983; Canale & Swain, 1980b).

The other strain, exemplified by the Test of English for Educational Purposes (Associated Examining Board, 1987; Weir, 1983), the Ontario Test of English as a Second Language (Wesche et al., 1987), and the International English Language Testing Service (e.g., Alderson, 1988b; Alderson, Foulkes, Clapham, & Ingram, 1990; Criper & Davies, 1988; Seaton, 1983), has grown out of the English for specific purposes tradition. At the same time, lists of characteristics of communicative language tests were proposed by several writers (Alderson, 1981a; Canale, 1984; Carroll, 1980; Harrison, 1983; Morrow, 1977, 1979).

This section also focuses on four characteristics that distinguish communicative language tests: (a) the test creates an "information gap," (b) task dependency, (c) the tests integrate tasks and content within a given domain of discourse, and (d) the tests attempt to measure a much broader range of language abilities. The article presents clear information about each characteristic, with ample examples of how each can be observed and the basis for each.

Furthermore, this section discusses a different approach to language testing that evolved during the 1980s: the adaptation of the FSI oral interview guidelines (Wilds, 1975) to the assessment of oral language proficiency in contexts outside agencies of the U.S. government. This "AEI" (American Council on the Teaching of Foreign Languages / Educational Testing Service / Interagency Language Roundtable) approach to language assessment is based on a view of language proficiency as a unitary ability (Lowe, 1988), and thus diverges from the view that has emerged in language testing research and other areas of applied linguistics.

Regarding this matter, the article mentions that this approach to oral language assessment has been debated, studied, and analyzed by linguists and applied linguists alike, including experienced practitioners in the field such as language testers and language teachers (e.g., Alderson, 1981b; Bachman, 1988; Bachman & Savignon, 1986; Candlin, 1986; Kramsch, 1986; Lantolf & Frawley, 1985, 1988; Savignon, 1985).

The article explicitly discusses these different approaches to language assessment, using situational examples to make the ideas and theories concrete. Lowe (1988) has explicitly articulated a separatist view, stating that the "concept of Communicative Language Proficiency (CLP), renamed Communicative Language Ability (CLA), and AEI proficiency may prove incompatible." Communicative language testing and AEI assessment thus represent two different approaches to language test design, and each has developed a number of specific manifestations in language tests.

Hence, it is suggested that language testing will be enriched in the future by the variety of tests, approaches, and techniques that may emerge. The summary focuses on the common areas and characteristics among the four reviews of language testing, each of which mentions both progress and concerns. Many of the researchers presented their own ideas and information, but Skehan and Alderson notably agreed that, until very recently, other areas of applied linguistics had provided very little input into language testing. They also agreed that language testing must continue to explore new avenues, such as new test formats, and measure communicative skills more effectively.

However, Skehan was struck by the relevance of recent work in sociolinguistics and second language acquisition, such as the SLA-based approach to assessing language development of Pienemann, Johnston, and Brindley (1988), as an example of such input.

In this article, Alderson also discusses the "washback" effect and learner-centered testing, topics further discussed in Bachman's own studies of language testing. He points out that while we generally assume that tests have an impact on instruction (washback), there is virtually no empirical research into how, if at all, this instructional impact functions, under what conditions, and whether deliberate attempts to design tests with positive instructional impact are effective.

Part 2, An Interactional Approach to Language Test Development, highlights the broad categories of language test use: using test results to make inferences about test takers' language abilities and to make decisions about their levels of ability or their capacity for language use in particular contexts. This signifies that, in order to investigate and demonstrate the validity of the uses we make of test scores, we need a theoretical framework within which we can describe language test performance as a specific instance of language use.

It is mentioned that in an instructional setting we need to demonstrate that the content of the test represents the course content, and that the components of language ability it measures correspond to the types of learning activities included.

Furthermore, the article mentions that we need to relate our tests to real-life situations and applications, for example when test scores are used for possible employment in a job that requires a specified level of proficiency in a foreign language. In that case, we need to demonstrate that the tasks included in the test are representative of the language use tasks required by the future job; this provides justification for using the test scores to predict the capacity to use the foreign language effectively in the target employment situation.

Demonstrating this correspondence is discussed further, and varied situational examples are provided in support of the claims made earlier. For instance, if we were interested in investigating the interlanguage development of a specific component of ability in a target language, such as sensitivity to appropriate register, and wanted to use a test as one of our research instruments, we would need to be sure that the test we used measured this aspect of language ability.

In line with this, a framework is provided as a basis for relating test performance to non-test language use. It includes a model of language ability for describing the abilities involved in language use and test performance, and a framework of test method characteristics for relating features of the test to features of the language use context.

In the section on language ability, Bachman further explains the essentials of this component. He mentions that the language ability of the language user is one feature of language use. When we design language tests, we assume that the test taker's language ability will be engaged by the test tasks. Thus, in order to relate the abilities we believe are involved in test performance to the abilities involved in language use, we need a model of language ability. In Bachman and Palmer's model, language ability includes two types of components: (a) areas of language knowledge, which we would hypothesize to be unique to language use (as opposed to, for example, mathematical knowledge or musical knowledge), and (b) metacognitive strategies that are probably general to all mental activity. This view of language ability is consistent with research in applied linguistics (Bialystok, 1990; Spolsky, 1989; Widdowson, 1983) that has increasingly come to view language ability as consisting of two components: (a) language knowledge, sometimes referred to as competence, and (b) cognitive processes, or procedures, that implement that knowledge in language use. Bachman presents more details in the later part of the article.

Furthermore, the article presents additional detail and a figure to further explain and elaborate the ideas about language knowledge. There it is highlighted that "Language knowledge includes two broad areas: organizational knowledge and pragmatic knowledge. These are constantly changing, as new elements are learned or acquired, and existing elements restructured. The learning or acquisition of areas of language knowledge is beyond the scope of my discussion here, and for purposes of describing how they pertain to language use, I will treat them as more or less stable traits or constructs" (Canale, 1983; Canale & Swain, 1980b).

Lastly, the article turns to strategic competence, the second component of language ability. Two changes from Bachman's earlier model are noted: (a) "vocabulary" has been removed from "organizational competence" and placed within a new area, "propositional knowledge," under "pragmatic knowledge," and (b) "illocutionary competence" has been renamed "functional knowledge." However, the later part of the article does not thoroughly explain the essential implications of these two changes for language ability and language testing.
In the article, the researchers present varied information and draw on several studies to identify the essence of what language testing has to offer, arguing various points and claims. Still, after all the data gathered, there is no single, unified account that clearly and accurately identifies the advantages and advances language testing has to offer; as mentioned, the results vary in many ways, and the effects or results of a test also differ with the test takers and test administrators involved. There is no perfect test method, only an appropriate one.
