LANGUAGE TESTING
Chapter 3 & 8
Lecturer: Prof. Dr. Hj. Djamiah Husain, M.Hum.
By:
The Second Group
CLASS C
GRADUATE PROGRAM
STATE UNIVERSITY OF MAKASSAR
2016
INTRODUCTION
Language testing is designed to find out what students have learned, both in language skills and in language areas. Related terms are language assessment and evaluation. Language assessment is used in free variation with language testing, although it is also used somewhat more widely to include, for example, classroom testing for learning and institutional examinations. Language testing is a field within applied linguistics that essentially focuses on evaluating a person's proficiency in a language.
Testing also has an ethical dimension insofar as it affects people's lives (see Davies (ed.) 1997). This leads us into the area of consequential validity, where we are concerned with a test's impact on individuals, institutions, and society, and with the use that is made of test results. Getting it right, that is, ensuring test fairness, is a necessity for testing, not an ideal. In developing assessment tools, a decision must be taken on what the criterion is in the particular domain under review, and this decision and the test measures used for operationalizing it must be ethically defensible. Test developers must be held accountable for their products.
Test validation is the process of generating evidence to support the well-foundedness of inferences about a trait drawn from test scores; essentially, testing should be concerned with evidence-based validity. Test developers need to provide a clear argument for a test's validity in measuring a particular trait, with credible evidence to support the plausibility of this interpretative argument (see Kane 1992).
Language evaluation itself serves various purposes in education. Student evaluation gauges students' growth, development, and progress against stated learning objectives, making judgments on the basis of the information collected. Evaluation also tells educators the strengths and weaknesses of a program so that adjustments and adaptations can be made. In addition, teachers grow professionally when they reflect on their own teaching and keep informed of current instructional strategies and evaluation methods they may use in their programs.
Test questions may be written by teachers themselves or by others. Some teachers, however, do not understand how to construct good-quality test items, and they do not know how to judge the quality of items that have been or will be administered. This is partly due to their unfamiliarity with language testing and language tests, which is why language testing needs to be understood: it provides the basis for testing language.
In testing language, it is necessary to know what should be tested. As mentioned above, language testing involves language skills and language areas. Four skills of English should be tested: listening comprehension, speaking ability, reading comprehension, and writing ability. Testing the language areas, in turn, involves tests of grammar and usage, tests of vocabulary, and tests of phonology.
DISCUSSION
on test scores when screening or selection decisions are being made. In order for such decisions to be fair, our tests must be accurate in the sense that they must provide information that is both reliable and valid.
In the area of language testing, a common screening instrument is the aptitude test. It is used to predict the success or failure of prospective students in a language learning program.
3. Placement
Closely related to the notions of diagnosis and selection is the concept of placement. In this case a test is used to identify a particular performance level of a student and to place him or her at an appropriate level of instruction. It follows that a given test may serve a variety of purposes; thus the UCLA Placement Exam may be used to assign students to levels as well as to screen students with extremely low English proficiency from participation in regular university instruction.
A placement test identifies the right class for a particular learner; there is no such thing as a good score or a bad score, only a recommendation for the most suitable class. Obviously, the tester must know which classes or levels are available before placing the learner.
4. Program Evaluation
Here the focus of evaluation is not the individual student so much as the actual program of instruction. Therefore, group mean scores are of greater interest than the scores of individual students. Often one or more pretests are administered to assess gross levels of student proficiency, or "entry behavior," prior to instruction. Following the sequence of instruction, one or more posttests are administered to measure post-instructional levels of proficiency, or "exit behavior." The difference between the pretest and posttest scores for each student is referred to as a gain score.
Frequently in program evaluation, tests or quizzes are administered at intervals throughout the course of instruction to measure "en route behavior." If the results of these tests are used to modify the program to better suit the needs of the students, the process is termed formative evaluation. The final exam or posttest is administered as part of the process of what is called summative evaluation.
Sometimes a language program may be evaluated by comparing the mean posttest or gain scores of one program or partial program with those of other programs. Whatever the method of evaluation, the importance of sensitive, reliable, and valid tests is obvious.
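The gain-score arithmetic just described can be sketched in Python (a minimal illustration; the student labels and scores below are invented):

```python
# Gain scores in program evaluation: the difference between each
# student's posttest ("exit behavior") and pretest ("entry behavior").
# Student labels and scores are invented for illustration.

pretest = {"student_1": 45, "student_2": 60, "student_3": 52}
posttest = {"student_1": 70, "student_2": 74, "student_3": 66}

# Gain score per student: posttest minus pretest.
gains = {s: posttest[s] - pretest[s] for s in pretest}

# Group mean gain, the figure typically compared across programs
# in summative evaluation.
mean_gain = sum(gains.values()) / len(gains)

print(gains)
print(round(mean_gain, 2))
```

Comparing such mean gains across programs is one simple way to operationalize the program comparison mentioned above.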
5. Providing Research Criteria
Language test scores often provide a standard of judgment in a variety of other research contexts. Comparisons of methods and techniques of instruction, textbooks, or audiovisual aids usually entail reference to test scores. Even examination of the structure of the language itself, or of the physiological and psychological processes of language use, may involve some form of measurement or testing. If we are to learn about effective methods of teaching, strategies of learning, presentation of material for learning, or the description of language and linguistic processes, greater effort will need to be expended on the development of suitable language tests.
6. Assessment of Socio-Psychological Differences
Attitude toward the target language, its people, and their culture has been identified as an important affective correlate of good language learning. It follows that appropriate measures are needed to determine the nature, direction, and intensity of attitudes related to language acquisition. Apart from attitudes, other variables such as the cognitive style of the learner, socioeconomic status, locus of control, the linguistic situational context, and the ego permeability of the learner have been found to relate to level of language achievement and/or strategies of language use. Each of these factors must in turn be measured reliably and validly in order to permit rigorous scientific inquiry, description, explanation, and/or manipulation. This is offered as further evidence of the value of a wide variety of tests serving a variety of important functions.
they describe two or more extremes located at the ends of the same continuum. Many of the categorizations are merely mental constructs to facilitate understanding. The fact that there are so many categories, and that there is so much overlap, seems to indicate that few of them are entirely adequate in and of themselves, particularly the broadest categories. This part describes the types of language tests: objective vs. subjective tests; direct vs. indirect tests; discrete-point vs. integrative tests; aptitude, achievement, and proficiency tests; criterion-referenced vs. norm-referenced tests; speed tests vs. power tests; and other test categories.
1. Objective vs. Subjective Tests
An objective test is one that may be scored by comparing examinee responses with an established set of acceptable responses, or scoring key. No particular knowledge or training in the examined content area is required on the part of the scorer. A common example would be a multiple-choice recognition test. Conversely, a subjective test is said to require scoring by opinion or judgment, hopefully based on insight and expertise, on the part of the scorer.
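Objective scoring of this kind reduces to comparing responses with a fixed key, which can be sketched in Python (the items, key, and responses below are invented):

```python
# Objective scoring: compare each examinee response with an established
# answer key; no expertise in the content area is needed to score.
# Items, key, and responses are invented for illustration.

answer_key = {1: "B", 2: "D", 3: "A", 4: "C"}
responses = {1: "B", 2: "C", 3: "A", 4: "C"}

# An item earns a point only when the response matches the key exactly.
score = sum(1 for item, key in answer_key.items()
            if responses.get(item) == key)

print(score)  # 3 of the 4 items match the key
```

Any trained or untrained scorer (or a machine) applying this key would produce the same score, which is precisely what makes the scoring objective.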
Many tests, such as cloze tests that permit all grammatically acceptable responses to systematic deletions from a context, lie somewhere between the extremes of objectivity and subjectivity. Similarly, so-called subjective tests such as free compositions are frequently objectified in scoring through the use of precise rating schedules that clearly specify the kinds of errors to be quantified, or through the use of multiple independent raters.
Objectivity-subjectivity labels, however, are not always confined in their application to the manner in which tests are scored. These descriptions may also be applied to the mode of item or distracter selection by the test developer, to the nature of the response elicited from the examinee, and to the use that is made of the results for any given individual. Often the term subjective is used to mean unreliable or undependable. The possibility of misunderstanding due to such ambiguity suggests that objective-subjective labels for tests are of very limited utility. Objective and subjective tests are discussed further in another part.
2. Direct vs. Indirect Tests
It has been said that certain tests, such as ratings of language use in real and uncontrived communication situations, test language performance directly, whereas other tests, such as multiple-choice recognition tests, tap true language performance obliquely or indirectly and are therefore less valid for measuring language proficiency. Whether or not this observation is true, many language tests can be viewed as lying on a continuum from natural-situational to unnatural-contrived. Thus an interview may be thought of as more direct than a cloze test for measuring overall language proficiency, and a contextualized vocabulary test as more natural and direct than a synonym-matching test.
The issue of test validity is treated in greater detail in another part. It should be noted here that the usefulness of tests should be decided on the basis of other criteria in addition to whether they are direct or natural. Sometimes tests are explicitly designed to elicit and measure language behaviors that occur only rarely, if at all, in more direct situations. Sometimes most of the value of direct language data is lost through reductionism in the manner of scoring.
3. Discrete-Point vs. Integrative Tests
Another way of slicing the testing pie is to view tests as lying along a continuum from discrete-point to integrative. Discrete-point tests, as a variety of diagnostic test, are designed to measure knowledge or performance in a very restricted area of the target language. Thus a test of the ability to use the perfect tenses of English verbs correctly, or to supply correct prepositions in a cloze passage, may be termed a discrete-point test. Integrative tests, on the other hand, are said to tap a greater variety of language abilities concurrently and therefore may have less diagnostic and remedial-guidance value but greater value in measuring overall language proficiency. Examples of integrative tests are random cloze, dictation, oral interviews, and oral imitation tasks.
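The systematic deletion behind a fixed-ratio cloze test can be sketched in Python (the sample passage and the deletion ratio n = 5 are arbitrary choices for illustration):

```python
# Fixed-ratio cloze: delete every nth word from a passage and record
# the deleted words as the answer key. Passage and ratio are arbitrary.

def make_cloze(text, n=5):
    words = text.split()
    answers = []
    for i, word in enumerate(words, start=1):
        if i % n == 0:          # every nth word is blanked out
            answers.append(word)
            words[i - 1] = "____"
    return " ".join(words), answers

passage = ("Language testing is designed to find out what students "
           "have learned in language skills and language areas")
cloze_text, answers = make_cloze(passage, n=5)
print(cloze_text)
print(answers)
```

A random cloze would instead choose the deletion points at random; either way, scoring may accept only the exact deleted word or, more leniently, any grammatically acceptable response, as noted earlier.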
Here again, some tests defy such ready-made labels and may place the label advocates on the defensive. A test of listening comprehension may tap only one of the four general language skills (i.e., listening, speaking, reading, and writing) in a discrete manner and thus have limited value as a measure of overall language proficiency. On the other hand, such a test may examine a broad range of lexis and diverse grammatical structures and in this way be said to be integrative.
4. Aptitude, Achievement, and Proficiency Tests
Aptitude tests are most often used to measure the suitability of a candidate for a specific program of instruction or a particular kind of employment. For this reason, these tests are often equated with intelligence tests or screening tests. A language aptitude test may be used to predict the likelihood of success of a candidate for instruction in a foreign language; the Modern Language Aptitude Test is a case in point. Frequently, vocabulary tests are effective aptitude measures, perhaps because they correlate highly with intelligence and may reflect knowledge of and interest in the content domain.
A language aptitude test (prognostic test) is designed to measure students' probable performance in a foreign language which they have not started to learn; it assesses aptitude for learning a language. Language learning aptitude is a complex matter, consisting of such factors as intelligence, age, motivation, memory, phonological sensitivity, and sensitivity to grammatical patterning.
Achievement tests are used to measure the extent of learning in a prescribed content domain, often in accordance with explicitly stated objectives of a learning program. These tests may be used for program evaluation as well as for certification of learned competence. It follows that such tests normally come after a program of instruction and that the components or items of the tests are drawn directly from the content of instruction. If the purpose of achievement testing is to isolate learning deficiencies in the learner with the intention of remediation, such tests may also be termed diagnostic tests.
Achievement (attainment) tests are based on what the students are presumed to have learnt, not necessarily on what they have actually learnt, nor on what has actually been taught. Achievement tests are more formal and are intended to measure achievement on a large scale.
Proficiency tests are most often global measures of ability in a language or other content area. They are not necessarily developed or administered with reference to some previously experienced course of instruction. These measures are often used for placement or selection, and their relative merit lies in their ability to spread students out according to ability on a proficiency range within the desired area of learning.
It is important to note that the primary differences among these three kinds of test lie in the purposes they serve and the manner in which their content is chosen. Otherwise, it is not uncommon to find identical items occurring in aptitude, achievement, and proficiency tests.
Norm-referenced tests are not without their share of weaknesses. Such tests are usually valid only with the population on which they have been normed. Norms change over time as the characteristics of the population change, and therefore such tests must periodically be renormed. Since such tests are usually developed independently of any particular course of instruction, it is difficult to match their results perfectly with instructional objectives. Test security must be rigidly maintained, and debilitating test anxiety may actually be fostered by such tests. It has also been objected that, since the focus is on the average score of the group, the test may be insensitive to fluctuations in the individual. This objection relates to the concept of reliability discussed in another part, and may be applied to criterion-referenced as well as norm-referenced tests.
Some teachers may fail to grasp the distinctions between criterion-referenced and norm-referenced testing. It is common to hear the two types of testing referred to as if they served the same purposes or shared the same characteristics. Much confusion can be eliminated if the basic differences are understood.
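The norm-referenced interpretation discussed here can be illustrated with a percentile-rank sketch in Python (the norming sample below is invented): a score means little on its own and is instead reported by its position in the norming population.

```python
# Norm-referenced interpretation: a score is reported as a percentile
# rank, the percentage of the norming population scoring below it.
# The norming sample is invented for illustration.

def percentile_rank(score, norm_scores):
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

norm_scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

print(percentile_rank(62, norm_scores))  # half the norm group scored lower
print(percentile_rank(82, norm_scores))
```

This also makes the renorming problem concrete: if the norming population changes, the same raw score maps to a different percentile, so the norms must be recollected.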
6. Speed Tests vs. Power Tests
A pure speed test is one in which the items are so easy that every person taking the test might be expected to get every item correct, given enough time. But sufficient time is not provided, so examinees are compared on speed of performance rather than on knowledge alone. Conversely, power tests by definition allow sufficient time for every person to finish, but contain such difficult items that few if any examinees are expected to get every item correct. Most tests fall somewhere between the two extremes, since knowledge rather than speed is the primary focus, but time limits are enforced because weaker students may otherwise take unreasonable periods of time to finish.
7. Other Test Categories
The few salient test categories mentioned here are by no means exhaustive. Mention could be made of examinations vs. quizzes and questionnaires. A distinction could be made between single-stage and multi-stage tests, as is done in another part. Contrast might be made between language skill tests and language feature tests, or between production and recognition tests.
At a still lower level of discrimination, mention can be made of cloze tests, dictation tests, multiple-choice tests, true-false tests, essay/composition/précis tests, memory-span tests, sentence completion tests, word-association tests, and imitation tests, not to mention tests of reading comprehension, listening comprehension, grammar, spelling, auditory discrimination, oral production, listening recall, vocabulary recognition and production, and so on.
3. Ali ought not to ________ me his secret, but he did.
A. tell
B. having told
C. be telling
D. have told
4. A. Ali ought not to tell me his secret, but he did.
B. Ali ought not to having told me your secret, but he did.
C. Ali ought not to be telling me your secret, but he did.
D. Ali ought not to have told me your secret, but he did.
b. The man (A) / enjoyed (B) / looking the children (C) / playing in the yard (D).
c. Rini's mother (A) / does not let her (B) / to play (C) / on the dirty floor (D).
3. Rearrangement Tests
Rearrangement items can be in the form of multiple-choice items or in other forms. Consider the following example:
Well, you know how ..........
A. warm is it today
B. is it warm today
C. today it is warm
D. warm it is today
Since this type of multiple-choice arrangement may be confusing, the item may instead be written in word-order form:
Complete each sentence by putting the words below it in the right order; write in the boxes only the letters of the words.
Well, you know how ..........
A. it B. today C. warm D. is
Not only .........., but he also took me to his house.
A. me B. he C. did D. meet
4. Completion Tests
The completion item is a useful means of testing the student's ability to produce acceptable and appropriate forms of language. It measures production rather than recognition, testing the ability to insert the most appropriate words in selected blanks in sentences. The words selected for omission are grammatical or functional words.
The answer to the above sentence is the, and there is no other possible answer. The following example indicates the wide range of possibilities for one completion item:
The answer obviously required by the tester is haven't been; however, other answers are possible.
There are three possible ways of restricting the available answers: providing context, providing data, or using multiple-choice techniques. Completion items in context may take the form of blanks, or the omissions may not be indicated at all; in the latter case the students are required to put a slash (/) at the place where a word has been omitted and then to write the missing word in the appropriate space. Consider the following example:
5. Transformation Tests
The transformation type of item is exceptionally useful for testing the ability to produce structures in the target language, although transforming sentences is different from producing sentences.
Rewrite each of the following sentences in another way, beginning each new sentence with the words given. Make any changes that are necessary, but do not change the general meaning of the sentence.
1. I haven't written to you for a long time.
It's a long time…………………………………..
2. Ahmad can sing better than you.
You cannot………………………………………
CONCLUSION
REFERENCES
Jabu, Baso. 2008. English Language Testing. Makassar: UNM Publisher.
Larsen-Freeman, D. 2001. Teaching Grammar. In M. Celce-Murcia (ed.), Teaching English as a Second or Foreign Language (3rd ed., pp. 251-266). Boston, MA: Thomson/Heinle.
Mart, Ç. T. 2013. Theory and Practice in Language Studies, Vol. 3, No. 1, pp. 124-129. Erbil: Department of Languages, Ishik University.
Thornbury, Scott. 1999. How to Teach Grammar. Essex: Pearson Education Limited.
http://washington.academia.edu/PriscillaAllen, accessed 3 September 2016.