Chapter · April 2021

CITATION READS

1 4,128

1 author:

Ciler Hatipoglu
Middle East Technical University
53 PUBLICATIONS   265 CITATIONS   

SEE PROFILE

Some of the authors of this publication are also working on these related projects:

Catch a Tiger by the Toe: Modal Hedges in EFL Argumentative Paragraphs View project

Metadiscourse across Genres: Mapping out interactions in spoken and written discourses March 2017, CYPRUS View project

All content following this page was uploaded by Ciler Hatipoglu on 09 April 2021.

The user has requested enhancement of the downloaded file.


CHAPTER VI

TESTING AND ASSESSMENT OF SPEAKING


SKILLS, TEST TASK TYPES AND SAMPLE TEST
ITEMS

Çiler Hatipoğlu
Middle East Technical University, TURKEY
ciler@metu.edu.tr
https://orcid.org/0000-0002-7171-1673

Pre-reading Questions

1. This Chapter focusses on “Testing and Assessment of Speaking.”


(a) You have one minute to write down everything that you know
about this topic.
(b) Now, in one minute, write down everything you want to know
about the topic.
(c) Share with your classmates what you put down for (a) and (b) and
tick the ones that overlap with at least 3 of your classmates.

2. Why do we communicate with others? What are we trying to achieve when interacting with others?

3. What does it mean to “speak a language”? What skills, sub-skills, and kinds of knowledge do you need to be able to speak a language?

4. Is speaking in your mother tongue “difficult”?


(a) If “YES”, explain what makes speaking difficult. If “NO”, explain what
makes it easy.
(b) For both answers, first individually, then in pairs and, finally, in
groups, try to devise lists that include features that make speaking
difficult/easy.
5. Is speaking in a foreign language “difficult”?
(a) If “YES”, explain what makes speaking in a foreign language difficult.
If “NO”, explain what makes it easy.
(b) For both answers, first individually, then in pairs, and, finally, in
groups, try to devise lists that include features that make speaking
in a foreign language difficult/easy.

6. Is assessing students’ speaking skills in a foreign language difficult/easy? Why? What could be the main difficulties associated with assessing speaking in a foreign language?

7. Can ‘speaking’ be divided into types? If “YES”, list the types of speak-
ing that you know.
(a) Compare your answers with your classmates and make a list of all
the types of speaking mentioned in class.
(b) In pairs, think of criteria that you could use to place these types of
speaking in different groups.
122 LANGUAGE ASSESSMENT AND TEST PREPARATION


8. Choose a partner in class/Form pairs. One of you will be the examinee, and the other the examiner in a speaking test. The examinee will be
asked to choose a topic (e.g., talk about yourself, describe your daily
routine, what are your hobbies, tell us about the last book you read,
describe your town/school to your foreign friends who want to visit you).
The examinee will be given a minute to prepare, and then s/he will
be asked to speak for a minute about the topic. The examiner will
note down points for evaluation of the examinee and will give them
a final grade.
(a) Share with the class the points you noted down as an examiner.
Which evaluation criteria did you list/use? Why?
(b) Explain why and how the points you used for the evaluation are important.
(c) As a class, make a list of all the criteria you listed. Then group them
and, finally, order them from most to least important.

9. Should test writers try to assess speaking as a separate skill?


(a) List a number of reasons and advantages for doing this.
(b) Now, list some benefits of assessing speaking as a skill that
integrates with one or more of the other three skills.

10. Form groups and try to define the following terms without using any
extra sources:
(a) speaking vs writing
(b) testing vs assessment of speaking
(c) validity of a speaking test
(d) reliability of a speaking test
(e) practicality
(f) rubrics
As a class, compare the definitions written by the different groups.

Introduction

It is easy to take spoken language for granted – for most of us it is perhaps our most
natural form of expression. However, when learning a foreign language, it is the spoken
language that is often found the most challenging, and often frustrating, perhaps
precisely because speech is so natural for us (Bradshaw, 2020, p. 212).

Speaking is one of the most natural and one of the most crucial language skills in
our lives. It is the most needed skill in our everyday interactions, and the way we
speak reveals our identities and views of the world (Hatipoğlu, 2017b). Speaking is
also the “most complex and demanding of all human mental operations” (Taylor,
2011, p. 70). In today’s globalised world, with its technological advances and economic integration, and differently from previous centuries, when speaking in a foreign language was either ignored or deemed of secondary importance, the ability to speak in a foreign language is a “highly coveted skill and a source of cultural capital” (Isaacs, 2016, p. 131). Speaking skills are also “often considered
the most important part of an EFL course, and yet the difficulties in testing oral
skills frequently lead teachers into using inadequate oral tests or even not testing
speaking skills at all” (Knight, 1992, p. 294). The difficulty related to the assessment
of speaking, according to testing experts, stems from the following distinctive
characteristics of the skill:

Speech is difficult to define


For test writers, “the testing of speaking is widely regarded as the most challenging
of all language exams to prepare, administer, and score” (Madsen, 1983, p. 147),
and for language learners, it is usually the most under-developed skill (Hatipoğlu,
2017b; TES, 2017). One reason why speaking is so difficult to deal with is its elusive nature. There is disagreement among experts on what a successful oral communication skill is and what the evaluation criteria should be. Pronunciation,
vocabulary usage and grammar are frequently named the main ingredients, but
speech is more than just a series of sounds that form words and bigger meaningful
units. Elements such as fluency, correct tone (e.g., expressing happiness, fear
or sadness), appropriateness of expression, reasoning ability and listening
comprehension are also very important since speech is a joint co-construction and
negotiation of meaning. While communicating, interlocutors have to consider a
myriad of linguistic and contextual variables and wisely choose linguistic forms
(e.g., address forms, speech acts with a suitable level of politeness) from within a
socially defined set of accepted choices. The range of oral communication (formal,
informal, professional, over the phone vs face-to-face, in crowded, noisy places,
in the emergency room etc.) is another variable complicating the matter. What
is acceptable in one setting might be offensive or rude in another. While in some
contexts, the gender of the interlocutor is the main variable, in others, age matters

the most. If, to all these, we add the issues related to assessing speaking in a valid, reliable and practical manner, it is not difficult to understand why
many of the well-known international language tests (e.g., TOEFL) and editions
of prominent books discussing “current issues” in assessing second language
proficiency (e.g., Lowe & Stansfield, 1988) until very recently did not have sections
focusing on speaking.

Spoken vs written language


Spoken language is notably dissimilar to written language on a number of
important levels (e.g., lexicon, morphology, syntax) (McCarthy, 2006). Written work
is fixed, and, if needed, it can be re-read and reviewed. Speech, on the other hand,
is dynamic and ephemeral. In real-world interactions, interlocutors are under
time pressure. They have to contribute to the conversation in an appropriate and
effective manner, often without the possibility of asking for clarification and/or time to plan and think about their answers. These differences between spoken and
written languages should be acknowledged in teaching as well as in testing and
assessment (McCarthy, 2006). Test designers should strive to develop tests and
utilise techniques and practices to elicit valid and reliable information related to
the targeted micro- and macro-speaking skills. Methods that are useful in testing
written skills might not be the most suitable ones when it comes to uncovering
students’ level and progress in speaking.

Effects of the listening skill and interlocutors


With recent developments in foreign language education, it is now widely
accepted that teaching spoken language means developing students’ ability to
interact successfully in the target language and that “this involves comprehension
as well as production” (Hughes, 2003, p. 113). Hence, the claim that listening and
speaking skills are almost always closely interrelated (Brown, 2004, p. 140). Apart
from the limited contexts where the speaker gives a monologue or a speech (which
can be planned or impromptu), tells a story or reads aloud, the aural participation
of an interlocutor is vital. The majority of the real-life spoken interactions are
interpersonal or transactional dialogues (i.e., involving two or more interlocutors)
that are almost always unplanned. Therefore, in nearly all of the spoken interactions,
the speakers’ success is highly dependent on their interlocutors.
Speaking is a productive skill, and as such, it can be empirically observed. However,
the success of the productive tasks depends on the accuracy and effectiveness of
the test taker’s listening skills (i.e., receptive skills). If the test takers do not know
their interlocutors well (i.e., they do not have enough background information
about them) and they are not familiar with their accents, the reliability and
validity of an oral production test are compromised (Brown, 2004). An unrelated
or incorrect answer by a test taker could stem from a misunderstanding or lack

of comprehension since “speakers react to each other, take turns or overlap each
other, ask for clarification, fill in gaps in their interlocutor’s utterances and, finally,
produce the text of their joint speech” (Hatipoğlu, 2017b, p. 121). Because of all
those reasons, it is difficult to separate the speaking skill from the listening skill.

Design of elicitation techniques/Setting the task


When receptive performance (i.e., listening and reading) is assessed, the elicitation
stimulus can be structured to anticipate only the predetermined responses (e.g.,
using only the structures and providing only the information that test developers
want to test/elicit). Speech, however, is a productive skill where speakers creatively
combine grammatical, lexical and discursive structures. Therefore, to elicit only the structures that test writers are interested in, the stimulus they design should be carefully crafted and should prevent test-takers from using other structures and from avoiding or paraphrasing the target structures.
When designing tasks for testing students’ oral ability, Hughes (2003, p. 113) suggests that the following three rules be followed:
(i) Set tasks that form a representative sample of the population of oral tasks
that we expect candidates to be able to perform.
(ii) Create tasks that elicit behaviour that truly represents the candidates’ ability.
(iii) Design the tasks in such a manner that the collected sample of behaviour
can and will be scored validly and reliably.

Achieving the objectives listed by Hughes (2003) becomes more difficult with the
increase in the level of freedom of the exam tasks. With more open-ended tasks,
test takers have the liberty to respond with a wider variety of words and structures
not anticipated by the test writers. To avoid problems that might damage both the
reliability and validity of the exam, test writers should prepare detailed analytical
rubrics where every type and piece of information is allocated an individual score.
Speaking rubrics should be as detailed as possible and, depending on the aims of the exam, should allocate points for pronunciation, fluency, grammar, vocabulary, discoursal elements, and level of pragmatic appropriacy.
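The kind of analytic rubric described above can be sketched as a small weighted scorer. The criteria, weights, band range and function names below are illustrative assumptions for the sake of the example, not a published scoring scheme.

```python
# Illustrative analytic rubric scorer. The criteria, weights and band
# values are hypothetical examples, not an official scoring scheme.

RUBRIC = {
    # criterion: weight (weights sum to 1.0)
    "pronunciation": 0.20,
    "fluency": 0.20,
    "grammar": 0.20,
    "vocabulary": 0.20,
    "discourse": 0.10,
    "pragmatic_appropriacy": 0.10,
}

def score_candidate(band_scores: dict, max_band: int = 5) -> float:
    """Combine per-criterion band scores (0..max_band) into a 0-100 total."""
    total = 0.0
    for criterion, weight in RUBRIC.items():
        band = band_scores[criterion]  # KeyError if a criterion was not rated
        if not 0 <= band <= max_band:
            raise ValueError(f"{criterion}: band {band} outside 0..{max_band}")
        total += weight * (band / max_band)
    return round(100 * total, 1)

if __name__ == "__main__":
    ratings = {"pronunciation": 4, "fluency": 3, "grammar": 4,
               "vocabulary": 3, "discourse": 5, "pragmatic_appropriacy": 4}
    print(score_candidate(ratings))
```

Because every criterion carries an explicit weight, two raters who disagree on overall impressions are forced to locate their disagreement in a specific criterion, which is precisely what detailed analytical rubrics are meant to achieve.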

Number of test-takers
If the tasks are well designed, it is possible to assess the listening, reading, and
writing skills of a large number of students in a relatively short period of time in a
valid and reliable manner. However, due to the speaking skill’s real-time interactive
nature, assessing it becomes notoriously difficult when the group taking the test
is big, and the time allocated for the assessment process is limited. To be done
successfully, enough time, meticulous planning, and the involvement of a large
number of well-trained testers are required. Therefore, test administrators should

plan and schedule their speaking exams accordingly. With the greater emphasis on speaking in the new educational system, the pressure on assessment experts to find quick, reliable and valid ways to test speaking is now greater than ever before.

Theoretical Background
In this part of the Chapter, the developments of the last two centuries or so related to how speaking in a foreign language was conceptualised, and how those changes have been reflected in the construction of speaking tests over the years, are reviewed.
Up until the end of the 19th C, when the predominant method of language teaching was the Grammar-Translation Method (GTM), there was “little consideration given to the need for a speaking test of any type” (Bradshaw, 2020, p. 214). Originally, the aim of the proponents of GTM was to adapt the scholastic study of the teaching of Latin and Greek “to the circumstances and requirements of school students” (Howatt, 1984, p. 131). To carve out a place for GTM in the teaching of
modern languages, classes focused on grammar, reading and translation. This
training enabled learners to access the Classical cultures’ literature in their original
form and translate it, while speaking was defined mostly as reading aloud. During
that period, foreign language tests included mainly translation as a device for
assessing grammar and vocabulary (Hatipoğlu, 2017a; Hatipoğlu & Erçetin, 2016).
This situation started to change towards the end of the 19th C when the Reform
Movement in Germany began (around 1882). The new movement took as its focus the recently introduced science of phonetics (the International Phonetic Association, for instance, was founded in 1886) and argued that teaching foreign languages should “begin with the spoken language” (Sweet, 1899, p. 49). This movement
led to the development of a new pedagogical approach rooted in the spoken
language – the Direct Method (DM). DM supporters suggested more of an
immersion approach to language learning. According to them, rather than
learning L2, it should be acquired, and among all skills, “the acquisition of the spoken language was seen as the main purpose” (Bradshaw, 2020, p. 215). This
change in perspective started to be slowly reflected in the structure and content of
foreign language proficiency exams as well. The Certificate of Proficiency in English
(CPE) (now known as Cambridge English Proficiency), which was first administered
in 1913 and lasted for 12 hours, included both Written (11 hours) and Oral (1 hour)
sections (See Table 1).

Table 1
1913 CPE Examination Parts (Adapted from Weir, 2013, p. 3)

Main Parts   Sub-sections                                              Time (hours)
A: Written   (a) Translation from English into French or German        2
             (b) Translation from French or German into English,
                 and questions on English grammar                      2.5
             (c) English Essay                                         2
             (d) English literature                                    3
             (e) English Phonetics                                     1.5
B: Oral      (a) Dictation                                             0.5
             (b) Reading and conversation                              0.5

The Written Section of the exam had five sub-sections and lasted for 11 hours, while the Oral Section comprised two sub-parts, Dictation (0.5 hours) and Reading and Conversation (0.5 hours), and was just an hour long (Weir, 2013). As can be
seen from the content of the exam and the weights of each of the sub-sections,
at that time, the effect of the GTM was still strongly felt. The Written section of
the exam (including English Phonetics) focused on form but, for the first time,
“attention was clearly paid to active language as well” (Weir, 2013, p. 3).
The next development in the field of linguistics that affected how foreign languages
were taught and assessed was the birth of ‘structural linguistics’ (i.e., an approach
aiming to uncover the rules that underlie language and govern how it functions)
in the 1920s. Structural linguistics had a long-lasting (from the 1920s well into the 1970s) but varied effect on the teaching and assessment of foreign languages on both sides of the Atlantic. In the USA, where the leading structuralists were Bloomfield
(1926, 1933) and Fries (1945), the primary focus of foreign language classes and
exams was the testing of isolated language structures and groups of words using
objectively scored items (e.g., multiple-choice items, True/False questions). It is not
surprising then that the Test of English as a Foreign Language (TOEFL), which
was first administered in 1964, consisted of only reading, listening, structure and
vocabulary sections for the following 20 years. Even when a speaking section was
added to TOEFL, it was optional for quite a long period of time. Factors such as
reliability, practicality and lack of experts (i.e., variables that are still valid today)
were cited as the main reasons for the non-adoption of a speaking section by
TOEFL writers. In Britain, on the other hand, where the leading linguist was Palmer
(1921a, 1921b), the structuralist principles and content were “usually combined
with the direct method with its emphasis on the spoken language” (Weir, 2013,
p. 3). Word lists and grammatical structures were tested, but exam writers tried to locate them in relevant and interesting situations to make their meaning clear, leading to the introduction of the “situational approach” in the UK.

The gradual shift away from structuralism and towards the Communicative
Approaches to Language Teaching and Testing started at the beginning
of the 1980s and flourished during the 1990s. The communicative approach
underlined the importance of meaningful tasks and authentic materials, made
use of the rapidly developing field of sociolinguistics and gave birth to the
notional-functional syllabus. Because of these, differently from the previous
periods, a considerable amount of teaching and learning of foreign languages
began to be done orally. Experts, language teachers, and students became
interested in language as a means of communication rather than its structures
per se. Consequently, developing learners’ speaking proficiency started to rank high among most foreign language programs’ objectives. These changed views
“forced language testers out of their narrow conception of language ability as
an isolated trait, and required them to take into consideration the discoursal and
sociolinguistic aspects of language use, as well as the context in which it takes
place” (Bachman, 2000, p. 3). Finally, speaking came to the fore, and exam
tasks began to be constructed around this skill. Nowadays, communicative test
designers try to ensure that their exams involve ‘performance’ (i.e., test takers are
engaged in the act of communication) and are real-life like/authentic (i.e., the
social roles assumed by the test takers during the exam are similar to the ones
they are likely to undertake in real-world settings and the test takers know and
understand the communicative purpose of the task) (Fulcher, 1999; Harding, 2014;
McNamara, 2000). Therefore, now the aim of the speaking sections of exams such
as TOEFL and IELTS is to “measure test takers’ ability to actually communicate
in English about what they have read and heard” (Bridgeman et al., 2011, pp. 91-92). That is, they try to approximate as closely as possible the tasks students are expected to perform in their English-medium institutions.

Testing and Assessment of Speaking and Speaking Test Task Types


Tasks are the means by which exam writers can “elicit a sample of language that can be scored” (Fulcher, 2014, p. 50). Speaking tasks and their evaluation
criteria should be designed based on the analysis of the students’ needs and the
test’s aims. To choose the most appropriate tasks for their tests, exam developers
should clearly understand what the test scores will be used for and what type
of information the test takers need. The speaking-assessment tasks must also
be authentic (i.e., they should involve realistic and genuine communicative
interactions) and contextualised (i.e., as “normal” conversations do not occur in a
vacuum, the exam tasks should describe the conversational contexts in as much
detail as possible).
Depending on how they defined ‘knowing a language’, assessment experts
have classified speaking exam tasks differently. Valette (1977), for instance,
grouped tasks depending on what could be tested through each of them (e.g., “Communication - Self-expression: The student uses the foreign language to express his personal thoughts orally or in writing. He uses gestures as part of his expression.” Johnson, 1985, p. 34). Others (e.g., Underhill, 1987; Weir, 1988,
1993) organised their categories around the advantages and disadvantages
accompanying the elicitation techniques. More recently, with the development of
the communicative and intercultural communicative competence (ICCC) models, the importance of interactional ability was highlighted (Bachman & Palmer, 1996;
Candlin, 1987). The focus shifted from lexis and grammar to the operationalisation
of the dynamic process of communication (Kramsch, 1986, p. 386). This progress was combined with the development and growing popularity of task-based language teaching (TBLT) in the field and “the pursuit of developing appropriate testing models for this approach of pedagogy” (Norris, 2016, p. 230). As a result, speaking assessment tasks have started to be specified in terms
of context (i.e., the specific social settings where the interactions take place),
interlocutor(s) (i.e., who is interacting with whom, what is the level of closeness
and the social status relationship between the interlocutors), and goals (i.e.,
informative vs interactional; what the interlocutors aim to achieve) (May, 2010).
This led to Bachman and Palmer’s (1996) new definition of speaking test tasks
where the active participation of the learners is acknowledged: “an activity that
involves individuals in using language for the purpose of achieving a particular
goal or objective in a particular situation” (p. 44). The new policy now is to create
tasks (e.g., portfolios, performance assessment) that better represent test takers’
abilities to use the new language (Mislevy et al., 2002; Norris, 2016). This, in turn,
allows teachers and test developers to reorient their teaching, learning, and
assessment from focusing on rote memorisation and discrete facts to emphasising
more authentic and meaningful tasks and areas (e.g., appropriate use of knowledge
when needed).
For a fuller and more inclusive representation of the available speaking assessment
tasks, in this Chapter, an eclectic categorisation is presented and discussed. It is
based both on more classic taxonomies, such as Brown’s (2004, pp. 141-142), and on more modern task-based ones (Luoma, 2004; Norris, 2016). Brown describes
the primary tasks of oral production on a continuum, ranging from the least
spontaneous and most monitored ones to the most spontaneous, unmonitored
and demanding ones (Pawlak, 2016): (1) imitative, (2) intensive, (3) responsive, (4)
interactive, and (5) extensive.

Imitative Assessment Tasks (IAT)


Imitative speaking tasks assess test takers’ ability to “parrot back” (Brown, 2004,
p. 141) short units such as words, phrases, or sentences. These are the most
monitored and least spontaneous among the tasks eliciting oral production.
Madsen (1983, p. 148) calls them “pre-speaking activities” since they focus on
specific phonetic and/or phonological (e.g., stress, intonation) components of the
target language (TL) and only assess test takers’ ability to hear and repeat them.

As IAT do not encourage exchanges of meaning, no inferences related to the


test takers’ capability to understand, communicate, or participate in meaningful
interactions are made. The assessment criteria for the IATs may consist of prosodic
(e.g., stress patterns), lexical (e.g., correct pronunciation of words), or grammatical
(e.g., the pronunciation of grammatical suffixes such as -ed, -ing) features of the TL.
Recent studies on the use of speaking assessment tasks with secondary school
students (Lee, 2010) and adult learners (Tajeddin et al., 2018) showed that imitative
tasks were not usually among the speaking assessment methods employed by
language teachers. However, when employed, IAT are useful in honing language
students’ pronunciation (Seo, 2014; Stroh, 2012) especially when utilized as
classroom formative assessment activities (Madsen, 1983; Swastika et al., 2020).
They focus learners’ attention on crucial, meaning-changing dissimilarities between the native and target languages (e.g., lack/presence of some sounds, differences in stress patterns and intonation). IAT are particularly useful with elementary, beginner-level students (e.g., children who have been learning a foreign language for just a
few months). Activities where young learners are asked to repeat a stimulus (e.g.,
a new word, part of a song) several times proved to be particularly enjoyable and
useful (Ayas et al., 2020). With such exercises, both the perception (see Example
1) and the production (see Example 2) of speech in the foreign language can be
tested.
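Perception items of this kind can also be assembled automatically for classroom formative assessment. The minimal-pair bank, the item format, and the function name in this sketch are hypothetical illustrations, not items from an actual test.

```python
import random

# Hypothetical minimal-pair bank for a "listen and choose the correct
# picture" perception task; the pairs and item format are illustrative.
MINIMAL_PAIRS = [
    ("ship", "sheep"),   # short vs long vowel
    ("wet", "vet"),      # /w/ vs /v/
    ("pen", "pan"),      # /e/ vs /ae/
]

def make_perception_item(pair, rng):
    """Build one perception item from a minimal pair.

    Returns the word the examiner plays or reads aloud, the two answer
    options in shuffled order, and the index of the correct option.
    """
    options = list(pair)
    rng.shuffle(options)                 # randomise left/right placement
    target = rng.choice(pair)            # the word actually presented
    return target, options, options.index(target)

if __name__ == "__main__":
    rng = random.Random(7)               # fixed seed: reproducible item set
    for pair in MINIMAL_PAIRS:
        target, options, answer = make_perception_item(pair, rng)
        print(f"Play: {target!r}  Options: {options}  Correct index: {answer}")
```

Fixing the seed makes the generated paper reproducible, which matters when the same form must be administered to parallel groups.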

Example 1: Imitative Assessment Task Examples: PERCEPTION

Procedure:
The test taker, who is given a pair of pictures of contrasting words, hears: “Listen and choose the correct picture.”
The test taker listens and chooses a picture.

Example stimuli and focus/explanations:

(i) a. ship /ʃɪp/  b. sheep /ʃiːp/
Focus: contrasting sounds (i.e., long vs short vowels) in the TL; minimal pairs.
Note: particularly useful if such contrasts do not exist in the learners’ native language (e.g., Turkish).
[Picture (ia) source: https://artprojectsforkids.org/how-to-draw-a-ship/; Picture (ib) source: https://tr.pinterest.com/pin/194288171399913233/]

(ii) a. wet /wet/  b. vet /vet/
Focus: contrasting sounds (i.e., /w/ vs /v/) in the TL; minimal pairs.
Note: particularly useful if one of the sounds in the given minimal pair does not exist in the native language of the learners.
[Picture (iia) source: https://www.dreamstime.com/cartoon-stick-drawing-conceptual-illustration-surprised-wet-drenched-man-another-man-holding-water-hose-cartoon-image131024236; Picture (iib) source: https://www.dreamstime.com/illustration/vet-draw.html]

Example 2: Imitative Assessment Task Examples: PRODUCTION (adapted from Hatipoğlu, 2017b)

Procedure:
The test taker hears: “Repeat after me.”
The test taker repeats the stimulus.

Example stimuli and focus/explanations:

(i) (ia) permit /pəˈmɪt/ [pause], permit /ˈpɜː.mɪt/ [pause]; (ib) address /ˈæd.res/ [pause], address /əˈdres/ [pause]
Focus: contrasting stress patterns leading to meaning and/or word category change.
Note 1: permit /pəˈmɪt/ (verb): to allow something; permit /ˈpɜː.mɪt/ (noun): an official document that allows you to do something or go somewhere.
Note 2: address /ˈæd.res/ (noun): a place where someone lives; address /əˈdres/ (verb): to speak or write to someone.

(ii) (iia) Every day, we fill in the file. (iib) A strange black sheep jumps up and down in sixth street.
Focus: contrasting sound units in contexts/sentences (e.g., monophthongs vs diphthongs, consonant clusters).
Note 1: Every day, we fill /fɪl/ in a file /faɪl/ (monophthong vs diphthong contrast).
Note 2: A strange /streɪndʒ/ black /blæk/ sheep jumps /dʒʌmps/ up and down in sixth /sɪksθ/ street /striːt/. In some languages (e.g., English, Bulgarian, Serbian) consonant clusters are frequently found in different positions in words (e.g., initial /strange, black/, final /jumps/), while in other languages (e.g., Turkish) consonant clusters in certain positions (e.g., initial) are not allowed. It is useful/important to focus students’ attention on such differences between the mother and target languages.

(iii) (iiia) Are they busy? (iiib) Where did you go yesterday? (iiic) They are students, aren’t they?
Focus: such exercises are useful in focusing students’ attention on the existing intonational patterns in the target language (e.g., Yes/No questions, Wh-questions, tag questions) as well as on the differences between the first and target languages of the students.

One of the biggest problems related to testing speaking is its scoring since various
factors related to the administration of the exam (e.g., a noisy environment), the
examiner (e.g., his/her background, (lack of) knowledge of other foreign languages, acquaintance with the test taker’s accent) and/or the examinee (e.g., being shy/introverted, having other problems related to speaking) can affect the evaluation
process. To make the assessment as reliable as possible, the scoring criteria should
be clearly defined and explained in detail while keeping in mind the goals of the
task. Brown and Abeywickrama (2019, p. 161) state that with imitative tasks, it
would be more practical to adopt a two- or three-point system for each response.
Scoring scale for repetition tasks

2  Acceptable pronunciation
1  Comprehensible, partly correct pronunciation
0  Silence; seriously incorrect pronunciation

When the stretch of language that test-takers are asked to repeat gets longer, the
likelihood of making mistakes increases, and the evaluation of the oral production

becomes a difficult task. With longer utterances, it gets more difficult for the
examiners to isolate and describe the patterns that they are interested in. To
avoid such validity problems (e.g., instead of focusing on the intended pattern,
the examiner’s evaluation is affected by other more prominent mistakes related
to the pronunciation of the repeated phrase/sentence), Brown and Abeywickrama (2019) advise testers to “score only the criterion of the task” (p. 161). That is, in
instances where the focus is on meaning and word category changing stress
patterns (e.g., I am content with the content of this blog), test takers should be
awarded points for the correct stress patterns in the focus words, regardless of
any other problems in the utterance. If the focus is on stress patterns in the target
language, the examiners should refrain from penalising students for mistakes
related to pronunciation or intonation.
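As a minimal sketch of the criterion-only principle described above, the scoring of a stress-pattern repetition item can be reduced to a small function that looks only at the focus words and ignores every other error. The function name, the 2/1/0 mapping onto the repetition scale, and the 50% threshold for the middle band are illustrative assumptions, not values prescribed in this chapter.

```python
# Hypothetical sketch of "score only the criterion of the task"
# (Brown & Abeywickrama, 2019) for a stress-pattern repetition item.
# The 50% threshold for the middle band is an illustrative assumption.

def score_stress_item(focus_words_correct: int, focus_words_total: int) -> int:
    """Return 2, 1 or 0 based ONLY on the stress criterion.

    Other mistakes in the utterance (vowels, intonation, etc.)
    are deliberately not inspected here.
    """
    if focus_words_total <= 0:
        raise ValueError("item must contain at least one focus word")
    ratio = focus_words_correct / focus_words_total
    if ratio == 1.0:
        return 2  # all focus words stressed correctly
    if ratio >= 0.5:
        return 1  # partly correct
    return 0      # criterion not met

# e.g. "I am content with the content of this blog" has two focus words
print(score_stress_item(2, 2))  # -> 2
print(score_stress_item(1, 2))  # -> 1
print(score_stress_item(0, 2))  # -> 0
```

The point of the sketch is that the scorer’s input is restricted to the criterion counts; a transcriber could log any number of other errors without affecting the awarded score.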

Intensive Assessment Tasks (INAT)


The intensive assessment tasks (INAT) are limited response (Madsen, 1983),
mechanical (Underhill, 1987) tasks where test-takers are expected to demonstrate
their comprehension of a narrow band of grammatical, semantic, or phonological
relationships. To be able to complete the INAT successfully, test takers must be aware
of some of the phonological and lexical-semantic facets of the target language, but here, similarly to the IAT, interactions with interlocutors or test administrators are
minimal or non-existent. Scrutiny of the teachers’ classroom speaking assessment
practices in South Korea (Lee, 2010) showed that INAT were rarely used in secondary
schools, which led the researcher to conclude that “evaluation of phonological
features is rarely aimed at in classroom speaking assessment” (Lee, 2010, p. 41).
Examples of INAT include read-aloud, translation up to a simple sentence, direct
response tasks, sentence and dialogue completion, and limited picture-cued
tasks. These tasks illustrate the continuum of non-existent to minimal interaction
between the test taker and the test administrator.

Read-aloud Tasks
The read-aloud assessment tasks are the most controlled and the least
interactional INATs. As the name suggests, the test takers, depending on their age and proficiency level, are asked to read aloud texts of various lengths (e.g., from a single sentence to passages of up to 150 words). These ‘diagnostic passages’ (Prator & Robinett, 1985, p. ix) are constructed according to the objectives of
the test. That is, they include words and types of structures that enable testers
to elicit information related to the phonological (e.g., problematic consonants
and consonant clusters; vowels, diphthongs and triphthongs; difficult or distinct
stress and intonation patterns), structural (e.g., regular interrogative sentences vs question tags; irregular nouns and verbs) and lexical (e.g., homophones, synonyms, antonyms) knowledge of test takers in the target language (TL).

Until recently, general ELT methodology literature did not recommend the read-
aloud practice as it was not seen as ‘genuinely communicative’ (e.g., Broughton
et al., 1980). Assessment experts also argued that it was not a good technique
to test reading in a foreign language (Heaton, 1990; Hughes, 2003). However, a
significant number of studies conducted in the last two decades revealed the
benefits of using such assessment tasks and started calling for the reappraisal
of the technique (Gibson, 2008; Huang, 2010; Seo, 2014; Stroh, 2012; Supraba et
al., 2020; Weir & Wu, 2006). They have also shown where, why and how language
teachers, learners and testers can benefit from this practice. Below is a summary of
the most frequently mentioned benefits/advantages of using read-aloud activities
as speaking assessment tasks:

Versatile
Read-aloud tasks are versatile (Gibson, 2008). With well designed ‘diagnostic texts’,
read-aloud tasks can be used to assess the language abilities of learners of different ages (young learners, adults) and proficiency levels (A1, A2, B1, etc.).

Reliable
Read-aloud tasks are easy to administer and score as all of the test takers’
production is controlled (Brown, 2004; Weir & Wu, 2006). Since all of the students are asked to read the same text, test developers do not have to worry about parallel-test reliability either (Weir & Wu, 2006), and they have firm ground on which to compare students’ performances.

Assessment of phonetic and phonological features of the target language


Pronunciation (i.e., the ability to say the sounds and words in the new language in
a correct way), intonation (i.e., the rise and fall of the speaker’s voice that affects the
meaning of what is said), fluency (i.e., the ability to speak well and quickly) and correct
tone (i.e., the quality in the voices that expresses the speaker’s feelings and thoughts)
are some of the fundamental features of speech. Studies done on read-aloud show that
this technique can be used to assess all of these separately from the content of speech.
It also helps foreign language students to “get used to their own voice pronouncing
the target language and thus reduces anxiety” (Seo, 2014, p. 46). Finally, it aids the
acquisition of prosodic features of English (Gibson, 2008) and improves speech rate,
phonation, linking, and length of pauses in the foreign language speech of learners
(Can Daşkın & Hatipoğlu, 2019a, 2019b, 2019c; Chun, 2002; Stroh, 2012).

Formative assessment tool leading to autonomous learning


Reading aloud does not have to be a part of an official exam setting. It can happen
at home, in language laboratories or anywhere students feel comfortable, and
as such, it could be employed as a formative assessment tool (i.e., the assessment
that is specifically intended to provide feedback on performance to improve and
accelerate learning, Sadler, 1998, p. 77). After introducing new sounds, stress patterns,
structures or lexical groups, students might be asked to read aloud assigned texts
and record themselves in their own time. These recordings can be submitted to
the teacher for analysis and/or could also be employed as self-assessment tools
(i.e., students listen to themselves and evaluate their performance). Such exercises
can develop students’ pronunciation, intonation, chunking of a text into sense
groups and comprehension as well as their memorisation of new vocabulary items.
According to Prator and Robinett (1985), “reading aloud is clearly one of the most
effective mechanisms for learning to monitor one’s own pronunciation” (p. xxvi).
By hearing and analysing their own speech, students will gain self-confidence,
and this eventually will lead to “autonomous learning and may help some anxious
students to feel more able to speak” (Gibson, 2008, p. 29). It can also help students
to move between different levels of comprehension of a text as with more practice
in reading and speaking they become increasingly engrossed in the meaning of
what they are reading and saying.
The read-aloud activities in tests (especially those prepared for progress and final achievement tests) should match the curriculum goals. When selecting or preparing ‘diagnostic paragraphs’, preference should be given to texts written in simple modern language, (almost) free of unusual proper names (e.g., of people, towns, places) and dialectal peculiarities. The font of the text should be big enough, and there should be enough space between the lines. Numbering every sentence will help testers take notes and follow test takers’ progress (see Example 3).

Example 3: Read-aloud test task

Procedure:
The test taker hears: This is the passage that you will be asked to read aloud. First, you will have 2
minutes to look over the text and read it silently. Then, you will be asked to read the passage aloud
with attention to pronunciation, intonation, and flow of delivery. The reading should be done at
normal speed, and should sound as much like natural speech/conversation as possible. Do you
have any questions?
Test takers go over the passage and when the time is up, they start reading the diagnostic
passage out loud.

Example ‘diagnostic passage’ (adapted from Prator & Robinett, 1985, p. x)

(1) When a student from another country comes to study in the United States, he has to find
out for himself the answers to many questions, and he has many problems to think about. (2)
Where should he live? (3) Would it be better if he looked for a private room off campus or if he
stayed in a dormitory? (4) Should he spend all of his time just studying? (5) Shouldn’t he try to
take advantage of the many social and cultural activities which are offered? (6) At first, it is not
easy for him to be casual in dress, informal in manner, and confident in speech.

Depending on the test-takers’ age, proficiency level, and exam goals, the evaluation
of their performance on the read-aloud activities can be done using a very detailed
evaluation rubric (see Example 4) or a more general one (see Example 5).

Example 4: Checklist of problems for a Read-aloud test task (adapted from Prator &
Robinett, 1985, p. xii-xiv)

Columns: CRITERIA – POINTS TO CHECK – EVALUATION/NOTES (a notes column accompanies each point for the examiner to fill in)

I. Stress and rhythm
  Ia. Stress on the wrong syllable in words with more than one syllable.
  Ib. Improper sentence stress.
  Ic. Improper division of sentences into thought groups.
  Id. Failure to make a smooth transition between words and syllables.

II. Intonation
  IIa. Unnatural intonation at the end of statements.
  IIb. Unnatural intonation in wh-questions.
  IIc. Unnatural intonation in questions with two alternatives.

III. Vowels
  IIIa. Substitution of an improper vowel sound/diphthong:
    ____ for /æ/  ____ for /oʊ/  ____ for /Ɛ/  ____ for /aj/
  IIIb. Failure to lengthen stressed vowels.

IV. Consonants
  IVa. Substitution due to improper point of articulation:
    ____ /p/ for /b/  ____ /v/ for /w/  ____ /t/ for /ɵ/  ____ /n/ for /ŋ/
  IVb. Omission of a consonant.
  IVc. Insufficient aspiration of initial voiceless consonants.

V. Vowels and consonants
  Va. Confusion between the three usual ways of pronouncing the -ED ending.
  Vb. Confusion between the three usual ways of pronouncing the -S ending.

VI. General comments

Example 5: Read-aloud test task -Rating Scale (adapted from Weir & Wu, 2006, p. 196)

Columns: RATING – INTERPRETATION – EVALUATION/NOTES (a notes column accompanies each band)

5 = Excellent: Entirely intelligible pronunciation; very natural and correct intonation; the candidate speaks fluently with minimal hesitations.

4 = Good: Generally intelligible pronunciation; generally natural and correct intonation; the candidate generally speaks fluently; hesitations may sometimes occur.

3 = Fair: Some errors in pronunciation and intonation influence comprehensibility; the candidate sometimes speaks fluently, though unnecessary hesitations still occur.

2 = Poor: Many errors in pronunciation and intonation; the candidate sometimes gives up on reading words which he or she does not recognise; the candidate does not speak with ease; unnecessary hesitations occur frequently.

1 = Very poor: The candidate has little ability to handle the task; the candidate does not speak with ease; unnecessary hesitations occur very frequently.
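Band scales like the one above are, in practice, often operationalised through double marking: two raters score the same performance, their bands are averaged, and large disagreements are flagged for adjudication. The chapter does not prescribe this procedure; the sketch below is only an illustration, and the function name, the averaging rule, and the one-band tolerance are all assumptions.

```python
# Hypothetical double-marking sketch for a 1-5 band scale.
# Averaging and the one-band adjudication tolerance are illustrative
# assumptions, not procedures taken from this chapter.

BANDS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Very poor"}

def combine_ratings(rater_a: int, rater_b: int, tolerance: int = 1):
    """Average two band scores; flag the pair if they differ too much."""
    for r in (rater_a, rater_b):
        if r not in BANDS:
            raise ValueError(f"band must be 1-5, got {r}")
    needs_adjudication = abs(rater_a - rater_b) > tolerance
    return (rater_a + rater_b) / 2, needs_adjudication

print(combine_ratings(4, 5))  # -> (4.5, False): close enough to average
print(combine_ratings(2, 5))  # -> (3.5, True): send to a third rater
```

Making the tolerance explicit in this way forces the test designers to decide, before marking begins, how much rater disagreement the scale can absorb.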

Translation (of Limited Stretches of Discourse)


Translation, defined by House (2015, p. 3) as the “linguistic-textual operation in
which a text in one language is re-contextualised in another”, was one of the
essential tools for teaching and assessing language competence for a very long
time. However, with the emergence of the communicative era, which claimed that translation was an inadequate technique/method, it lost its importance and was almost forgotten (Tsagari & Floros, 2013). This negative attitude towards translation started to disappear slowly at the beginning of the 21st century, when critics of the Communicative Language Teaching approach began to argue that “it is an inappropriate methodology in contexts where the accuracy of language use is valued more highly than fluency” (Sun & Cheng, 2013, p. 235; Thornbury, 2003).
After that, discussions about the merits of translation as a method for language
teaching and assessment have been enriched and widened (Laviosa, 2014; Sun
& Cheng, 2013), and it has started to regain its respectability among language
professionals. Nowadays, there are many who claim that it can play a useful role in
second/foreign language teaching, learning and assessment (Sun & Cheng, 2013,
pp. 235-236). The most frequently cited advantages of the translation tool are:
(i) A reliable assessment tool (even though the test taker’s output is not a hundred percent controllable) (Buck, 1992).

When all of the students in the tested group are given the same words, phrases or
sentences to translate, it is possible both to limit the output and to compare the test takers against the same criteria.
(ii) An integrative assessment tool
To be able to translate a phrase or a sentence from L1 into L2 or vice versa,
students need to mobilise a diverse set of interdisciplinary skills (e.g., problem-
solving, creative and critical thinking) and types of knowledge (e.g., metalinguistic
competence, grammar, vocabulary, pronunciation skills) (Laviosa, 2014; Presas,
2000). What is more, according to Sun and Cheng (2013) and House (2015),
translation is unique in scaffolding students’ learning via their first language.
(iii) A practice that creates multilingual identities and facilitates communication
across cultures (House, 2015; Laviosa, 2014).
A good translation is fair, balanced, and grounded. It is not only a linguistic act,
where forms from one language are just converted into the other. It is also an
act of communication across cultures (House, 2015, p. 3). Eugene Nida (1964),
one of the leading figures of translation theory, argues that translation always
involves L1, L2 and the cultures of the two languages since language and
culture cannot be neatly separated. According to him, translation is one of the
major means of constructing representations of other cultures since languages
are culturally embedded. Languages serve to express and shape cultural reality,
and the “meanings of linguistic units can only be understood when considered
together with the cultural contexts in which they arise, and in which they are used.
In translation, therefore, not only two languages, but also two cultures invariably
come into contact” (House, 2015, p. 4). When viewed from that perspective,
translation is a form of intercultural communication.
Similarly to the read-aloud tasks, the translation assessment tasks are non-
interactional and allow for different levels of control of the test taker’s output.
In translation assessment tasks, instead of a picture or a written stimulus, test-
takers, depending on their age and proficiency level, are given a word, phrase, or
sentence in their native language and are asked to translate it into the TL. The test
takers could be asked to translate the stimulus instantly or might be allowed 1-2
minutes of thinking and organisation time before orally producing a translation
of somewhat longer texts. Example 6 shows instances where the test-takers are
asked to translate words (Part A), phrases (Part B) and sentences (Part C).

Example 6: Translation Tasks


Instructions
The test-taker hears/reads: First, listen to the given words/phrases/sentences and translate them
into English. (You have X minutes to prepare.)
The test taker hears:
Part A
Examiner: muz / sandalye / kırmızı / omuzlar / üzgün
Test taker (expected answer): a banana / a chair / red / shoulders / sad

Part B
Examiner: lezzetli bir muz/ iki ağaç sandalye / koyu kırmızı bir elbise
/ kırılan omuzum / kızgın olan taraftarlar
Test taker (expected answer): a delicious banana / two wooden chairs / a dark red dress
/ my broken shoulder / fans who are angry

Part C
Examiner: Çocuklar bu oyunu seviyor. / Bu hoteldeki yemekler
lezzetli değil. / Bu bilet kaç para? / Partiye gelecek misin?
Test taker (expected answer): Kids/Children love this game. / Food in this hotel is not
tasty. / How much is this ticket? / Will you come to the
party?

The quality of the translations done by test takers can be evaluated against a number of different criteria. Example 7 shows a rubric in which a more holistic approach to speaking is adopted: pronunciation, word and sentence stress, intonation, vocabulary usage and grammar are evaluated together.

Example 7: Translation task – Rating Scale (adapted from Angelelli, 2009, p. 33):

Columns: RATING – INTERPRETATION – EVALUATION/NOTES (a notes column accompanies each band)

5 = Excellent: Translation (T) shows masterful control of target language (TL) pronunciation, word and sentence stress, intonation, vocabulary usage and grammar. Very few or no errors.

4 = Good: T shows proficient control of TL pronunciation, word and sentence stress, intonation, vocabulary usage and grammar. Occasional minor errors.

3 = Fair: T shows weak control of TL pronunciation, word and sentence stress, intonation, vocabulary usage and grammar. T has frequent minor errors.

2 = Poor: T shows some lack of control of TL pronunciation, word and sentence stress, intonation, vocabulary usage and grammar. T is compromised by numerous errors.

1 = Very poor: T exhibits lack of control of TL pronunciation, word and sentence stress, intonation, vocabulary usage and grammar. Serious and frequent errors exist.

0 = No language use: No attempt to translate is made, or speech is incomprehensible.

Directed Response Tasks (DRT)


DRT are tasks requiring ‘limited responses’, and some of them can be quite artificial.
However, if designed well, they can elicit connected speech and, more importantly,
can represent everyday communication (Madsen, 1985). In DRT exams, students
hear a sentence which they are asked to transform into a new form. To be
successful in these tasks (i.e., to produce the correct grammatical output), test-
takers should be able to process the meaning of the sentence correctly. During the
exam, test takers can interact with an examiner, with a third person who plays the role of an interlocutor, or they can listen to a pre-recorded set of sentences. Each of the
DRT exam sentences focuses on a particular grammatical structure or a group of
structures. DRT are useful assessment tools because they are:

(a) Versatile
They can be utilised with students of any age (e.g., young learners and adults)
and proficiency level (e.g., students with limited speaking skills and students
approaching intermediate level).

(b) Reliable
As all students respond to the same prompts, there is a stable criterion against which all of the students are compared (i.e., there is consistency in how the assessment measures what it aims to measure; Popham, 2018).

(c) Practical
In schools with language labs or large classrooms with the required equipment, large groups of test-takers can be tested using DRT.
With DRT, the level of difficulty and authenticity and the degree of creativity required from the test-takers can be manipulated depending on the goals of the exam (see Example 8). Examples 8a-8c are all rather artificial, but they differ in difficulty, ranging from an item requiring simple repetition of part of the original sentence (8a) to longer prompts (8c) requiring one (8c) or more (8b) transformations. Examples 8d-8g are more authentic as they are related to the test takers’ lives (8d), imitate message giving, which is frequently encountered in real life (8e, 8f), and require the analysis of the contexts within which the conversation takes place (8e, 8g).

Example 8: Direct Response Tasks

Instructions: The test taker hears: Listen to me carefully and do as instructed

8a. Prompt: Tell me they finished their homework.
    Expected answer: They finished their homework.
    Comments: *Artificial and very basic. *No transformation is required; simple repetition of the last part of the examiner’s prompt.

8b. Prompt: Tell me that you aren’t taller than Mary.
    Expected answer: I am not taller than Mary.
    Comments: *Artificial. *Modification required: change of the subject (‘you’ to ‘I’) and contraction (‘you aren’t’ to ‘I am not’).

8c. Prompt: Tell me that you want to be a part of the team that will visit Italy for ground-breaking research next winter.
    Expected answer: I want to be a part of the team that will visit Italy for ground-breaking research next winter.
    Comments: *Artificial if not related to the test taker’s real situation. *More difficult, as the students will have to remember a long prompt. *Modification required: change of subject (‘you’ to ‘I’).

8d. Prompt (if the test taker is a native speaker of Turkish/German): Tell me that you speak Turkish/German.
    Expected answer: I speak Turkish/German.
    Comments: *More authentic/adapted to the test taker’s situation. *Modification required: change of subject (‘you’ to ‘I’).

8e. Setting: There are three people in the exam room: the test-taker, the examiner (Mrs Taylor) and the interlocutor (John).
    Prompt: Examiner: Tell him I can visit him on Wednesday.
    Expected answers: (i) Mrs Taylor says that she can visit you on Wednesday. (ii) Hey John/Pardon me, John. Mrs Taylor says that she can visit you on Wednesday.
    Comments: *More authentic, as in real life we are required to give messages to other people. *Modification required: change of subject and object (as in i). *If the aim is to assess the pragmatic development of the students (i.e., giving socially more appropriate messages), answer (ii) is expected. *Answer (ii) requires a more detailed analysis of the context and more creativity on the part of the test taker.
8f. Setting: There are three people in the exam room: the test-taker, the examiner (Mr Wilson) and the interlocutor (Linda).
    Prompt: Examiner: Ask her where the chemistry lab is.
    Expected answers: (i) Where is the chemistry lab? (ii) Pardon me, Linda. Where is the chemistry lab? (iii) Good morning. Could you tell me where the chemistry lab is, please?
    Comments: *Authentic, as in real life we are required to request something as speakers/representatives of a group. *Modification required: change of subject and object (as in ii). *If the aim is to assess the pragmatic development of the students (i.e., giving socially more appropriate messages), answers (ii) or (iii) can be expected depending on the level of formality of the interaction. *Answers (ii) and (iii) require analysis of the situation, initiative, and creativity on the part of the test taker.

8g. Prompt: It is 5:30 in the afternoon. You are in a hurry on your way home from the department store because you are expecting important guests. While you are getting out of the department store, you bump into a well-dressed elderly lady. It is your fault, and the lady seems really upset. What would you say in that situation? (Hatipoğlu, 2009).
    Expected answers: (i) Sorry. I didn’t see you there. (ii) I’m really sorry. Are you alright?
    Comments: *Authentic. *As in real life, test takers have more information about the context in which the interaction is taking place. *This exercise demands more critical and creative thinking as well as more initiative on the part of the test takers. *To be successful in such tasks, students would need practice with communicative activities in class and information about the target culture.
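Where, as in items 8e-8g, more than one answer is acceptable, scorers need an explicit list of the accepted variants so that all examiners apply the same criterion. The sketch below illustrates one way of representing such an item; the item structure, the function names, and the punctuation-insensitive matching are assumptions for illustration, not a scoring procedure given in this chapter.

```python
# Hypothetical representation of a DRT item with several acceptable
# answers (cf. items 8e-8g, where pragmatically richer replies also count).

def normalise(utterance: str) -> str:
    """Lowercase and drop punctuation so trivial differences do not fail a match."""
    kept = (ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return "".join(kept).strip()

ITEM_8F = {
    "prompt": "Ask her where the chemistry lab is.",
    "accepted": [
        "Where is the chemistry lab?",
        "Pardon me, Linda. Where is the chemistry lab?",
        "Good morning. Could you tell me where the chemistry lab is, please?",
    ],
}

def is_accepted(response: str, item: dict) -> bool:
    """Check a transcribed response against the item's accepted variants."""
    return normalise(response) in {normalise(a) for a in item["accepted"]}

print(is_accepted("where is the chemistry lab", ITEM_8F))  # -> True
print(is_accepted("Tell me the lab location.", ITEM_8F))   # -> False
```

In a real exam the matching would of course be done by a human rater from speech, not from a transcript; the value of listing the variants explicitly is that it records, in advance, which pragmatic upgrades (e.g., adding an address term or a politeness formula) still count as correct.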

Dialogue Completion Tasks (DCTs)


Dialogue completion tasks (DCTs) are exercises where test-takers see written
dialogues in which the lines of one of the interlocutors are omitted (see Examples
9 and 10). They are also known as “one-sided dialogues” (Heaton, 1990, p. 90).
These dialogues are written in advance so that every line focuses on a particular
issue (e.g., a grammatical structure, a type of vocabulary, a speech act). The aim
is to uncover how much test takers know about the assessed topics. Test-takers
are first given time to read the dialogue to understand its essence and to think
about the appropriate utterances to fill in the gaps in the exchange. Then, the test
administrator, an interlocutor, or the tape reads aloud/produces the non-deleted
lines in the dialogue while the test-taker responds.
Depending on the test takers’ level of proficiency (lower vs higher) and on how much control the test writers want to have over the output produced by the test takers, two types of DCT can be created. In Type 1 (i.e., the guided DCT), students are informed of what is expected from them: the expected answer is summarised in brackets (see Example 9). In Type 2, students see the prompts/questions but have more freedom to construct the responses themselves (see Example 10).

Example 9: Guided Dialogue completion task (Type 1): Focus: Speech Acts

Instructions
First, read the dialogue carefully and think about appropriate answers. You have X minutes
to prepare.
Now, listen to the statements carefully and respond with the appropriate lines.
At the local shop
Shop assistant: Hi there. Can I help you?
(You want to buy a present for your friend’s birthday)
You:

Shop assistant: What does he/she like wearing?


(Tell the shop assistant that he/she likes T-shirts and shirts)
You:

Shop assistant: Oh, we have many of those. What colour and what size are you looking for?
(You think that he/she likes darker colours and he/she is S or M size.)
You:

Shop assistant: Oh, ok. What do you think about this T-shirt and this shirt?
(Tell the shop assistant that you like the T-shirt and ask its price)
You:

Shop assistant: This T-shirt is $10.


(Buy the T-shirt and thank the shop assistant)
You:

Example 10: Dialogue completion task (Type 2): Focus: Grammar


Instructions
First, read the dialogue carefully and think about appropriate answers. You have X minutes
to prepare.
Now, listen to the statements carefully and respond with the appropriate lines.
At a scholarship interview
Interviewer: Tell us about yourself. What is your name? How old are you?
Candidate:

Interviewer: What are your favourite school subjects?


Candidate:

Interviewer: Why did you apply for this scholarship?


Candidate:

Interviewer: What will you do if you get this scholarship?


Candidate:

Interviewer: What have you learned about the city where you are going?
Candidate:

Some of the most important advantages of DCT as assessment tools are:

(i) Reliability
Examiners read the same prompts, and all test takers are assessed against identical
criteria. Since test takers can see the dialogues in written format, they have
relatively more time to process and anticipate the exam prompts. In turn, this
removes or, in some instances, at least decreases the “ambiguity created by aural
misunderstanding” (Brown, 2004, p. 151).

(ii) Focus students’ attention on certain aspects of the target spoken language
When created well, DCT are very useful assessment tools in contexts where English is taught as a foreign language and grammar and reading are the foci of education (Heaton, 1990). Such tasks help teachers focus learners’ attention on specific
aspects of the target spoken language (e.g., irregular nouns/verbs; collocations;
speech acts).

(iii) The examiners can (moderately) control the output of the test takers
By creating dialogues focusing on specific characteristics of the target language,
(and by providing guides related to the expected answers), test writers limit the
acceptable responses (e.g., in Example 10, by asking Question 2 in the past tense,
test writers limit the possible answers for the test-takers).

(iv) The answers expected from the test-takers are contextualised


The deleted lines are part of a dialogue. To produce sociolinguistically acceptable replies, test takers have to consider the previous and following lines in the conversation, analyse contextual variables (e.g., power distance, gender, age) and discern the expectancies in the interaction.

The disadvantages associated with the DCT are:


(i) “Reliance on literacy and the ability to transfer easily from written to spoken
English” (Brown, 2004, p. 150).
These tasks can put students with special needs (e.g., dyslexic) and those with
language backgrounds where different writing systems are used (e.g., Arabic,
Chinese) at a disadvantage. In such cases, students can be given more time to go through the texts or the help of a specialist.

(ii) Inauthentic and contrived


Regardless of the responses given by test-takers, DCT administering examiners
continue reading the lines in the “pre-prepared dialogue”. This is the reason why
Heaton (1990) calls DCT the “dialogue of the deaf!” (p. 91) and argues that they do
not allow for any real kind of “constructive interplay with unpredictable stimuli
and responses” (Heaton, 1990, p. 90). He also maintains that DCT cannot be described as “valid tests of speaking” (Heaton, 1990, p. 90) and suggests that they be used in combination with tasks that allow test-takers to be involved in genuine conversations and discussions.

Limited Picture-cued Tasks


Pictures, maps, diagrams and realia are versatile cues that can be used to assess the
speaking skills of students of different ages (from very young learners to adults)
and different levels of skill in English. They can be utilised while testing students
individually, in pairs or small groups. Picture-cues have the anchoring power of
giving test-takers “something specific to talk about, while allowing some flexibility
in expression” (Fulcher, 2014, p. 73). With picture-cues, depending on the aims of
the exam and the characteristics of the assessed group, it is possible to elicit a
single word, phrases, sentences, as well as descriptions and stories. That is, with
pictures and realia, test-developers might create tasks that focus only on sound
contrasts in the TL (e.g., minimal pairs, as in Example 11), grammatical categories/
structures (e.g., action verbs as in Example 12) or groups of vocabulary items.

Example 11: Picture-cued assessment tasks: Minimal pairs


Instructions
The test taker sees the pictures.
The test administrator points at one of the pictures and asks: What is this?
Test taker: (utters the name of the shown object/animal/person etc.)
(i) Consonant contrast – Expected answer: sum /sʌm/ vs. thumb /θʌm/
(ii) Vowel contrast – Expected answer: cap /kæp/ vs. cup /kʌp/

Assessment: The candidate should be able to enunciate the target consonant sounds (in (i)) and vowels (in (ii)) as well as the words they are part of.

Example 12: Picture-cued assessment tasks: Action verbs

Instructions
The test taker sees the pictures showing different actions.
The test taker hears [the test administrator points to each picture in succession]:
Situation 1: If the assessed students have a lower level of proficiency, the question can be: What is the monkey doing in 1/here?
Situation 2: When the test is constructed for students with a higher level of proficiency, and the aim is to elicit longer stretches of language, the questions might be: Tell me about picture 1. What do you see? What is happening here?
Test taker: [Gives the answer required by the question.]

Picture source: https://www.123rf.com/photo_50850135_stock-vector-verbs-of-action-in-pictures-cute-happy-monkey-character-black-and-white-outline.html

Scoring: The task is achieved if the candidates can produce comprehensible and grammatically
correct utterances fulfilling the requirements of the task.

Expected answers for Situation 1:
(1) pulling (a rope)
(2) throwing a ball
(3) pushing (a box)
(4) swimming

Expected answers for Situation 2:
(1) This is a picture of a cute monkey that looks worried. The monkey is pulling a rope, and it looks strained.
(2) This is Chimp. He is a young monkey that likes throwing balls. In the picture, we see Chimp doing his favourite activity. He is throwing his favourite ball. Chimp is very excited about what he is doing. We can see that from his facial expression.
(3) This is Aladdin. He is a handsome young monkey, and he loves bananas. In the picture, he is pushing a box full of bananas. He does not want other monkeys to take his bananas. The box looks heavy. It isn’t easy to move the box, but it does not look as if Aladdin will give up.
(4) In this picture, we see a monkey that is swimming. The monkey is enjoying himself.
CHAPTER VI: Testing and Assessment of Speaking Skills, Test Task Types and Sample Test Items 149

Diagrams and figures from authentic research papers can also be used in
speaking tests to elicit limited oral output. In Example 13, a figure from a study
(Hatipoğlu et al., 2020) comparing the use of crowdsourcing platforms for learning
foreign languages in Turkey, Poland, the Republic of North Macedonia, and Bosnia
and Herzegovina is given. By exposing students to such authentic materials,
test-writers can create experiences of real contexts in the target language. This gives
language learners self-confidence (i.e., they know that they are able to understand,
analyse and discuss authentic materials) and motivates them to keep learning
the foreign language, which, in the end, fosters the desire to continue learning
autonomously (Umirova, 2020).
Example 13: Picture-cued assessment tasks: Describing information given in a diagram/
figure
Instructions
The test takers are given the diagram to study for a few minutes. Then, they are asked to
answer the questions they hear/read.
Test administrator: You have one minute to study Figure 1. The given numbers represent
percentages. [When the time is up] Look at Figure 1 and answer the questions that you
hear.

Figure 1: Which languages have you learned while using crowdsourcing platforms? (from
Hatipoğlu et al., 2020, p. 86)

Test takers hear:


1. What is this Figure about?
2. Data sets from how many countries are compared in this Figure? What are those
countries?
3. What is the most popular foreign language in all four countries?
4. Which languages are more widely learned in Poland (POL) when compared to Bosnia and
Herzegovina (B&H)?
5. In which countries is Turkish the most and least studied foreign language?
6. What are the differences between Turkey (TUR) and Macedonia (MAC)?
7. How popular is Spanish when compared to English in those four countries?
8. Compare the popularity of Italian, French and German in the studied countries.
9. Are there any results that you knew already? Where from?
10. Is there anything surprising related to the results?

More open-ended responses can also be elicited when picture-cued tasks are
used. Students can be asked to describe a picture or the people in a picture (i.e.,
assessment of descriptive adjectives, physical appearance, prepositions, colours).
Test takers can be asked to talk about the relationships between the people in
the picture and/or their feelings as well (see Example 14). To complete such tasks,
students will have to use a wide variety of grammatical and semantic units.

Example 14: Picture-cued assessment tasks: Describing people, relationships and feelings

Instructions
The test takers are given a picture and told that they have 2 minutes to study it. When the time
is up, they are asked to describe in detail the place and the people they see in the picture.
Then they are instructed to look at the picture again and guess the relationship between the
people in the picture or what they might be discussing. Students can be given 2-3 minutes to
complete the task.

Test administrator:
(a) Look at the picture and describe the place and the people you see in the picture in
detail.
(b) What do you think the people in the picture are discussing? Why do you think so?
(c) What could be the relationship between them? Why do you think so?

Evaluating/Scoring students’ performance: For test takers to be successful on
this task, they should be, first, able to describe the environment and the people in
the picture with little or no prompting. They are expected to use colours (e.g., the
green and blue colours dominate the picture), comparatives (e.g., shorter hair, taller)
and shapes (e.g., rectangular); to describe the environment (e.g., tense, beautiful,
business-like) and the physical appearance of the people in the picture (e.g., tall,
slim, fit, long hair). Their sentences should be grammatically correct, and they
should use cohesive devices successfully.

Possible scoring methods:


Depending on the aims of the exam and the variables test administrators choose
to evaluate, the final score for a student’s performance can be based on one of
three evaluation techniques:
(i) Error-Based Method: For this method to be successfully implemented,
there should be three examiners in the room. The first one is the test taker’s
interlocutor (i.e., the person giving instructions, asking questions, providing
prompts and answering questions posed by the test taker). The second
examiner counts all of the test taker’s utterances (e.g., words, phrases and
sentences; if the focus is on a particular word category, how many physical
appearance adjectives are utilised). The last examiner counts the number of
mistakes made by the test taker. The final score is the ratio of utterances to
mistakes (e.g., 100 utterances / 20 mistakes = a ratio of 5; Heaton, 1990).
With the error-based method, it is easy to objectify and calculate students’ final
scores. However, when it comes to speaking, such mechanical analyses do not
always yield the most valid and reliable evaluations.
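The tally-and-ratio calculation described above can be sketched in a few lines of code (a minimal illustration; the function name, the handling of error-free performances, and the use of the raw ratio as the final score are assumptions, since the exact scaling of the overall score varies across implementations):

```python
def error_based_score(utterance_count, mistake_count):
    """Score a speaking performance as the ratio of total utterances
    (words, phrases, sentences) to the mistakes counted in them."""
    if utterance_count <= 0:
        raise ValueError("utterance_count must be positive")
    if mistake_count == 0:
        # An error-free performance: cap the ratio at the utterance count.
        return float(utterance_count)
    return utterance_count / mistake_count

# 100 utterances containing 20 mistakes give a ratio of 5.0
print(error_based_score(100, 20))
```

The two counting examiners’ tallies feed the two arguments, leaving the interlocutor free to manage the interaction.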
(ii) Analytic Method: This scoring begins by designing detailed rubrics covering the
micro- and macro-skills considered important for the group assessed in the
exam. A few examples could be:
Beginner level: correct pronunciation, word stress, and intonation contours
Intermediate level: produce language chunks, generate fluent speech,
respond with relevant phrases
Advanced level: generate fluent and intelligible speech, use grammatically
correct sentences, follow pragmatic conventions
The speaking exams are audio- or video-recorded. Examiners watch the
recording a number of times, each time scoring the test taker’s performance
on just one of the listed criteria (e.g., word stress).
Madsen (1983) argues that both teachers with little or no specialised training
and highly trained examiners can use the Analytic Method of scoring as
it is “consistent and easy to use” (p. 167).
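One way to combine the per-criterion scores collected over the repeated viewings is a weighted average, sketched below (the 0-5 scale, the criterion names and the equal default weights are illustrative assumptions, not a prescribed rubric):

```python
def analytic_score(criterion_scores, weights=None):
    """Combine per-criterion rubric scores (a 0-5 scale is assumed)
    into one mark, optionally weighting some criteria more heavily."""
    if weights is None:
        weights = {name: 1.0 for name in criterion_scores}
    total_weight = sum(weights[name] for name in criterion_scores)
    weighted_sum = sum(score * weights[name]
                       for name, score in criterion_scores.items())
    return weighted_sum / total_weight

# Intermediate-level criteria from the list above; the scores are invented.
scores = {"language chunks": 4, "fluent speech": 3, "relevant phrases": 5}
print(analytic_score(scores))  # unweighted mean -> 4.0
```

Passing an explicit `weights` dictionary lets an exam board make, say, fluency count double without changing the rubric itself.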
(iii) Holistic Method/Goal-Oriented Method
A method where examiners evaluate several criteria simultaneously and assign
an overall performance score is called a holistic method of scoring. With this
scoring method, individual criteria such as pronunciation, fluency, grammar,
vocabulary etc. are still considered, but the more important factor affecting
the scoring is whether or not test takers are able to achieve their goals (e.g., ask
for directions, describe a person, book a room). Language errors that impede
successful communication are more heavily penalised, while the ones that do
not are penalised more lightly.
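The principle that goal-blocking errors cost more than harmless ones can be illustrated with a toy calculation (the 100-point base, the per-error penalties, and the cap for an unachieved goal are invented weights for illustration, not an established scale):

```python
def holistic_score(impeding_errors, minor_errors, goal_achieved, base=100):
    """Deduct heavily for errors that block communication and lightly
    for those that do not; an unmet task goal halves the ceiling."""
    score = base - 10 * impeding_errors - 2 * minor_errors
    if not goal_achieved:
        score = min(score, base // 2)  # goal not met: at most half marks
    return max(score, 0)

# A candidate who books the room despite one serious and four minor slips:
print(holistic_score(impeding_errors=1, minor_errors=4, goal_achieved=True))  # 82
```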

Successful implementation of this scoring method requires more experience
and more thorough training. Therefore, novice teachers/examiners are advised
to do their first evaluations using Analytical Rubrics and slowly move to Holistic
Assessment.
Selection of the pictures for the picture-cued tasks is one of the most challenging
parts of the process. Therefore, pictures should be chosen carefully, following
closely the selection criteria listed in Example 15.

Example 15: PICTURE PROMPTS (Hatipoğlu, 2017b, p. 134)

PICTURE PROMPTS SHOULD NOT BE:
1. depicting distressing/violent/disastrous etc. scenes (e.g., a picture showing
the aftermath of an earthquake might be distressing/discouraging for children
who live in earthquake-prone regions; because of the stress, they might choose
not to continue the exam)
2. culture laden (i.e., try to choose prompts showing universally valid
environments that every test taker will be able to talk about)
3. representing scenes/places/jobs the description of which requires the use of
vocabulary outside the expected range of test takers (e.g., young learners asked
to describe a nuclear weapon test)
4. identical with the ones used in class (i.e., the use of the same pictures may
lead to mere parroting of information students remember from class)
5. politically biased (i.e., should not advocate a political point of view over any
other), racist etc.

PICTURE PROMPTS SHOULD BE:
1. big enough, clear and uncluttered (e.g., if the pictures are too small and there
are too many details in them, this may confuse test takers; the main characters,
buildings etc. should be easily identifiable; characteristics such as gender, age
and the role of the people in the picture should be clearly identifiable)
2. rich enough to allow test administrators to elicit the targeted structures and
vocabulary items
3. real-life pictures where possible so that test takers are able to relate to them
more readily
4. printed with high quality on quality paper (e.g., problems related to the
quality of the pictures may lead to invalid exam results)
5. (if necessary) accompanied by texts, captions and labels that are clear and
free of errors

Responsive Assessment Tasks (RAT)


RAT require brief interactions and assess comprehension. The conversations are
limited and short and include asking simple questions, making brief apologies/
requests, and exchanging standard greetings. To make these tasks as authentic as
possible (i.e., close to a natural conversation), the students are usually asked to
respond to a spoken prompt and are then allowed to ask questions themselves.

Question and Answer


The question and answer tasks are, as the name suggests, tasks where questions
are asked and answered. They are flexible in the sense that they can be used to
assess one student at a time, a pair, or a group of students. In all of these contexts,
test takers’ ability to understand, answer and ask questions is evaluated. In the
instances where students are invited into the exam room in pairs or groups, they
have the chance to interact not only with the examiner but also among themselves,
which taps into a missing part of the education in countries where English is
taught as a foreign language and the students mostly interact with their teachers
(Hatipoğlu, 2013, 2016b, 2017c).
The question and answer tasks can be just a part of a longer exam or can form
the whole spoken section of the test. When used as the main part of the exam,
they consist of a whole battery of questions ranging from straightforward ‘hello’
questions to more complex items aiming to discover the ceiling of the students’
abilities. The first one or two groups of questions are asked by the interviewer,
who usually starts with the so-called display questions (i.e., the person asking the
questions knows the answers but s/he asks them to check whether the test taker
can ‘display’ the answer; they are also called closed questions as they require
short and limited answers) (see Questions 1-5 in Example 16). If the students have
just started learning the foreign language, the session can end here. However, for
students with a more advanced proficiency level, display questions are followed by
a bunch of referential/information-seeking questions (i.e., ‘genuine’ questions
whose answers are not known by the person asking the question). Questions 6-10
in Example 16 are referential questions. They are also known as open-questions
since they require longer and varied answers. Some of them ask students about
their feelings and attitudes. The last part of the session aims to assess test takers’
ability to construct interrogative sentences. Therefore, examiners invite students
to ask questions (see Questions 11-15 in Example 16).

Example 16: Question and answer assessment tasks


Instructions
Please answer the questions that you hear.
1. Good morning! Is your name Steven?
2. In which grade are you?
3. [Pointing at the pen in his/her hand] Is this a pen?
4. [Pointing at the ruler on the table] What is this called in English?
5. What day was yesterday?
6. What is your favourite colour, and why?
7. What do you think about the food served in the school cafeteria?
8. Where did you go on your summer holiday? Why did you choose that place? What is
unique about it?
9. Why do you learn English/French/German?
10. Have you ever been to a foreign country? Why did you go there?
11. Ask the time. / Ask me about my family members.
12. Ask me about my favourite food/activities.
13. Ask me for a pen. / Ask for permission to leave the room.
14. If you could interview one famous person, who would that be, and what would you
ask him/her?
15. Do you have any other questions for me?

The writing, ordering and combining of the questions in the ‘Question and Answer’
tasks require careful planning and precise implementation as the exam time is
limited, and every question should serve a specific purpose. Depending on the
aims of the exam, the questions should elicit as detailed and accurate information
as possible about the phonetic, grammatical, semantic, discourse, pragmatic
knowledge of the students in the foreign language.

Paraphrasing
Paraphrasing is “restating a passage text using different words or sometimes even
rearranging the sentence structure” (Chen et al., 2015, p. 22). It is a vital skill that
we frequently employ at work, at school, or in our everyday lives. We paraphrase
because we cannot remember the exact words (i.e., lexical gaps), want to simplify
the topic, and make it more comprehensible (e.g., when we have to teach a
grammatical structure to young learners), need to summarize a long story when
we do not have much time for lengthy explanations, or even try to make something
more acceptable for the audience. It is also a critical skill for language learners to
be successful in their academic lives and survive in everyday interactions. That is
why paraphrasing is a skill that should be taught and closely assessed in foreign
language classes. Research shows, however, that this is not the case and that many
foreign language learners struggle when faced with the task (Frodesen, 2002; Sun,
2009) mainly due to insufficient lexical knowledge and lack of training (McInnis,
2009; Milićević & Tsedryk, 2011).
To assess this skill (i.e., paraphrasing), test developers have two
options. They can combine speaking with either listening or reading (Madsen,
1983). Combining speaking with listening gives test developers more freedom, as
those tasks can be used with almost all groups of students, even with illiterate
ones (e.g., very young and young learners). The technique is simple: the
teacher/tester reads a sentence/story/joke etc. and asks the student to retell it
in their own words. Example 18 is a paraphrasing task appropriate for
intermediate-level students, where students hear one sentence at a time and are
asked to paraphrase it. To control the output and check whether the students can
paraphrase the given statements in the desired manner, test developers can provide
the beginning of the expected utterance.

Example 18: Paraphrasing: Combining Listening with Speaking (Intermediate Level


Students)
Instructions
Listen as I read each sentence. After I am done reading, you will be asked to paraphrase the
sentence.
1. Original: There will be a number of changes around the school. Prompt: The principal announced that …
2. Original: They are well-educated, but they are not hardworking, unfortunately. Prompt: Although …
3. Original: She was not careful, so she broke the vase. Prompt: If …
4. Original: Does he have any plans for next week? Prompt: Is …?
5. Original: “I was not given enough information,” she complained. Prompt: She complained …

When speaking is combined with reading (i.e., when the students are literate and
older), teachers can give students a story to read silently in one room and then, in
a separate room, they can be asked to orally paraphrase the story individually to
the teacher (see Example 19). To reduce the burden on students’ memory, teachers
might keep the stories short or might prepare drawings reminding students of the
main points in the story (see Example 20).

Example 19: Paraphrasing: Combining Reading with Speaking (Upper-Intermediate to
Advanced Level Students)

Instruction
Read the story on this handout carefully. When you finish reading, you will be asked to tell
me the story in your own words.

I found a frog
By T. Albert

Even though I have grandchildren of my own now, it seems like it was only yesterday when I
returned home from school to find a frog on my bed. My mother chuckled when I yelled out.
“I found a frog on my bed.”

Now, she knew that I would eventually find one, but she let me discover a wonder of Nature
that many people miss. I am glad she did.

You see, a little earlier that spring, when I was 6 years old, I saw some little black fish in a pond.
Since I did not have any pets, I went home and asked my mother if I could have one. She
agreed. She gave me a bowl and told me to catch a few. Off I went. There were so many that
they were easy to catch. I filled the bowl and ran home. When my mom looked at the fish, she
said with a big smile,
“Tadpoles. Wow! You are in for a surprise.”

After a few weeks, I noticed some changes.


“Mom,” I yelled with excitement. “Come here, my fish are growing legs.”
She came to my room, looked, smiled and told me to keep watching.

After several more weeks, there were more changes.


“Mom,” I yelled with excitement. “Come here, my fish are growing front legs, and their tails are
going away.” She came to my room, looked, smiled and told me to keep watching.

A week or so later, when I got up, I was amazed. There were more changes. My fish did not
have tails, their legs were bigger, and they did not look like the little black fish I had caught
earlier in the spring.
“Mom,” I yelled with excitement. “Come here, my fish are really different.”
She came to my room, looked, smiled and told me that a surprise was very close.

That day, when I returned home from school, is when I yelled out,
“I found a frog on my bed.”

“Surprise,” yelled mom. “You watched a miracle right before your eyes. A fish changed into a
frog. Now you had better catch and take him and the other almost frogs back to the pond. I
do not need 50 more surprises tomorrow morning.”
Off I went…

(Adapted from: https://cdn.shopify.com/s/files/1/2081/8163/files/022-I-FOUND-A-FROG-Free-Childrens-Book-By-Monkey-Pen.pdf?v=1589890638)

Example 20: Pictures used to remind test-takers of the main parts of the story.

Picture 1 Picture 2

Picture 3 Picture 4

(Source: https://cdn.shopify.com/s/files/1/2081/8163/files/022-I-FOUND-A-FROG-Free-
Childrens-Book-By-Monkey-Pen.pdf?v=1589890638)

Interactive Assessment Tasks (IAT)


Interactive speaking is longer and more complex than responsive speaking. In
such exams, tasks require relatively long stretches of interactive discourse either
between the interviewer and the interviewee (e.g., interviews, games) or between
test takers (e.g., role-plays, discussions). In the majority of the situations, interactive
tasks involve interpersonal interactions, but tasks, including transactional
interactions, can also be included. Interpersonal interactions aim to connect,
bond and/or enhance solidarity between the interlocutors, and mark attitudes
towards the propositional content of the discourse unit. Since the degree of
intimacy in interpersonal relations and the context (e.g., formal, informal) where
the conversation takes place can vary greatly, being involved in such interactions
and knowing which words, grammatical structures, speech acts and styles to use
is usually difficult for language learners to master (Hatipoğlu, 2009, 2012, 2016a,
2017c). For such speech to sound natural and more authentic, the inclusion
of proverbs, colloquial language, slang, or ellipses is expected (Can Daşkın &
Hatipoğlu, 2019c; Hatipoğlu & Can Daşkın, 2020). The aims of the transactional
interactions, on the other hand, are to inform our interlocutors and to ‘get what
we need to get’. Conversations, where somebody is informed about the opening
hours of the supermarket, where the conference is going to be and what is the
abstract submission deadline, who was appointed as the new project manager
and what are the team members’ new responsibilities, which topics are included in
the midterm exam and what is the weight of each of the topics etc., are examples
of transactional interactions. In such interactions, speakers usually say not what
they want to but “what they have to” (Harley, 1993, p. 22).
Assessment tasks such as oral interviews, role-plays, games, and discussions
are classified as interactive. In this Chapter, we are going to discuss the most
common and the most widely used one of these tasks, i.e., the oral interview.

Oral Interviews
Oral interviews can include description tasks, narrative tasks, instruction
tasks, comparing and contrasting tasks, explaining and predicting tasks, and
decision tasks. Among these, the chapter focuses on the most frequently used
ones: The description and narrative interviews.

(i) Description Tasks


Descriptive tasks are the most commonly used tasks in current speaking tests. They
are practical, flexible, versatile and can be used in both one-to-one interviews and
with pairs. They are also popular because they suit tape-based testing.

(i1). One-To-One Interview


Example 21 shows a task suitable for a one-to-one oral interview. It is practical as
the description of the task is concise (i.e., does not require lengthy explanations on
the part of the examiner). It is also efficient. The prompt of the examiner is brief,
but the examinee is expected to give a long description which allows for efficient
use of exam time.
Such tasks are also versatile since they can be used with test takers of different
ages, proficiency levels and reasons for learning English (i.e., depending on the
purpose for learning a foreign language, test takers can be asked to describe
different places, people, situations, events etc.). The task type also has international
validity since it can be adapted to suit test takers with different cultural backgrounds.

Example 21: One-to-one interview

Instructions: Describe to me (in detail) your room/classroom/house/garden/city/village/


best friend/the person who had a profound effect on your life/favourite toy.

Expectations from examinees:


To describe in detail the place/person/event.

Criteria for judging the performance:


Can the listeners picture what is being described, as much as it would be in real life?

This task is authentic since every test taker will provide a different description,
and the examiner will ask genuine questions if more information is needed. These
types of exercises are particularly suitable if the aim is to know how well test takers
can describe something they know. What is more, to avoid the negative effects
of students comparing notes after the exam, every test-taker can be asked to
describe a different object (e.g., something that s/he mentions in her/his talk) as
long as there is ‘comparable parallelism’ between the tasks given to all students
(e.g., all students are asked to provide comparable descriptions of people, places,
animals).
The main criterion for scoring in such tasks is whether or not the listener can
picture what is being described by the test-taker.

(i2). Pair or Group Tasks in an Interview


When the aim is to check how students interact with each other and solve problems
together/collectively, pair or group interviews can be used. With those tasks, a
larger number of students can be tested at the same time (i.e., they are practical),
and by giving all students similar pictures, the content of their descriptions can be
controlled, and their performance compared. In Example 22, for instance, the test-
takers are asked to talk about the similarities and differences between the given
pictures, but both students will focus on the same topic (i.e., fruits and vegetables)
and will be required to use the same vocabulary to successfully complete the task.
Using pictures with similar content creates ‘information gap’ contexts. Students
have a genuine need to communicate to uncover the information missing in their
picture. Such tasks also increase students’ listening and problem-solving skills, as
students have to ask questions that will lead them to their goals. Students will also
need to listen carefully to check whether they can use the information to move
forward with solving the problem.

Example 22: Pair task in an interview test


Picture A Picture B

Picture A source: https://www.bbc.com/news/health-39057146
Picture B source: https://www.heart.org/en/healthy-living/healthy-eating/add-color/how-to-eat-more-fruits-and-vegetables
Instructions:
In this part of the test, I will give each of you a picture. Don’t show the pictures to
each other.
You will have 1 minute to examine your pictures. Then, each of you will be given 1
minute to describe their picture to your friend. Finally, you will have two minutes to
talk about the similarities and differences between your pictures.
Please have a look at your pictures for a minute now.
Note: If necessary, the examiner can give examinees further prompts: mention the
colour, size, number of the objects in your pictures.
Expectations:
First, Student A and, then Student B describe their pictures in detail for a minute.
Then, they talk about what is similar and what is different in their pictures for 2
minutes.

When designing interactive description tasks, similarly to the picture-cued tasks,
test developers need to consider carefully which pictures to use and whether
pictures really will be needed (see Section 3.2.5 where the rules for picture
selection are discussed).

(ii) Narrative Tasks


Narrative tasks are also frequently used in speaking exams. They help test
administrators to identify “how well the examinees can recount a sequence of
events, usually in one time frame, either present or past” (Luoma, 2004, p. 144).
Narratives could be based on two sources: (i) personal experiences and (ii) picture
sequences. Personal experience narratives are usually prompted by asking
test-takers to talk about events that happened in their lives. Such tasks are authentic
and relevant to the students and might encourage some students to talk more.
However, “while stories about ‘what happened to me’ are very common in real
life, they usually belong to social chatting, which is difficult to replicate in a test
situation” (Luoma, 2004, p. 144). That is, the stories that test-takers tell can be
very different from each other (e.g., content, place, time frame), and this might
make the comparison between them very difficult. Some personal stories might
require sharing sad, anxiety-causing events or revealing embarrassing details,
which might dishearten some of the test-takers and make introverted, shy students
go even quieter. Moreover, without any preparation, students might not be able
to think of a story interesting enough to tell, which again might prevent them
from exhibiting their real/best skills in speaking. Because of all these, picture-
based sequences are more common in speaking exams. Sequences provide
structure and a common ground for comparison and evaluation of the test takers.
However, while selecting them, test designers have to be really careful as for a
picture sequence to work well, it should both generate enough talk and give test
takers an opportunity to show what they know concerning all criteria deemed
important in the exam (e.g., various grammatical structures, vocabulary, register,
cohesive devices etc.). They should also enable students to “show their control of
the essential features of narratives: setting the scene, identifying the characters
and referring to them consistently, identifying the main events, and telling them in
a coherent sequence” (Luoma, 2004, p. 144). To make sure that picture sequences
will elicit the variety of language targeted in the exam, they should be tried out
with similar groups of students or even among the teachers who were not involved
in the selection of the pictures and the preparation of the exam.
Example 23 is a describe and rearrange exercise, which can be used both as a pair
and group task. Here, each student is given a different set of pictures in mixed
order, and the students have to describe, discuss and put them in the correct order
as a pair/group.
Step 1: Students are given 1 minute to look at their own pictures and to prepare to
describe the pictures without showing them to the other group members.
Step 2: Each student describes his/her pictures, while others are asked to take
(mental) notes.
Step 3: Students compare notes and decide on the appropriate sequence.
Step 4: Each student puts down the picture(s) they hold in the agreed order.
Step 5: Students are given 1 minute to comment and to change the order of the
pictures if needed.
Step 6: Students tell the story following the final agreed-upon order. Each student
talks about his/her pictures.
162 LANGUAGE ASSESSMENT AND TEST PREPARATION

Example 23: Ordering pictures and Narrating a story


Pictures source: https://tr.pinterest.com/pin/561472278516762461/
Picture 1 Picture 2

Picture 3 Picture 4

Picture 5 Picture 6

Depending on the performance of the students in the descriptive or narrative
tasks, they are placed in one of the six Common European Framework of Reference
for Languages (CEFR) levels (Council of Europe, 2001) listed in Example 24.

Example 24: CEFR Levels (Source: https://www.examenglish.com/CEFR/cefr.php)

C2 (Mastery): The capacity to deal with material that is academic or cognitively
demanding, and to use language to good effect at a level of performance which
may in certain respects be more advanced than that of an average native speaker.
Example: CAN scan texts for relevant information, and grasp the main topic of a
text, reading almost as quickly as a native speaker does.

C1 (Effective Operational Proficiency): The ability to communicate with the
emphasis on how well it is done, in terms of appropriacy, sensitivity and the
capacity to deal with unfamiliar topics.
Example: CAN deal with hostile questioning confidently. CAN get and hold onto
his/her turn to speak.

B2 (Vantage): The capacity to achieve most goals and express oneself on a range
of topics.
Example: CAN show visitors around and give a detailed description of a place.

B1 (Threshold): The ability to express oneself in a limited way in familiar situations
and to deal in a general way with non-routine information.
Example: CAN ask to open an account at a bank, provided that the procedure is
straightforward.

A2 (Waystage): An ability to deal with simple, straightforward information and
begin to express oneself in familiar contexts.
Example: CAN take part in a routine conversation on simple, predictable topics.

A1 (Breakthrough): A basic ability to communicate and exchange information in
a simple way.
Example: CAN ask simple questions about a menu and understand simple
answers.
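Automating the placement of a numeric exam score into one of these bands amounts to a cut-off lookup, as in the sketch below (the percentage cut-offs are invented for illustration; aligning real scores with CEFR levels requires a proper standard-setting study):

```python
# Invented cut-offs: (minimum percent score, CEFR band), highest first.
CUTOFFS = [(90, "C2"), (80, "C1"), (65, "B2"), (50, "B1"), (35, "A2"), (0, "A1")]

def cefr_band(percent_score):
    """Return the highest band whose cut-off the score meets or exceeds."""
    for cutoff, band in CUTOFFS:
        if percent_score >= cutoff:
            return band
    return "A1"  # negative or malformed scores fall through to the lowest band

print(cefr_band(72))  # 72% meets the 65 cut-off -> B2
```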

Extensive Assessment Tasks (EAT)


Speeches, oral presentations, storytelling, translating longer texts etc.
are activities included in the extensive assessment task (EAT) category. They
require test takers to produce ‘extensive’ spoken output, which they have to
plan, research, and deliver in a more formal manner. During those activities, the
opportunities for oral interaction between the speakers and listeners are either
highly limited (only nonverbal responses and backchannels) or ruled out altogether.

Oral Presentations
Oral presentations are a part of our academic, professional and everyday lives. Young children, students, teachers, researchers, and other professionals are frequently required to speak in public about their favourite toys and games, courses, final projects, new marketing strategies, or the books they have read recently. Since oral presentations can be about more general topics as well as more specific ones requiring expertise, they can be used both in general and in English for specific purposes exams. What is more, oral presentations can be used with learners of different ages and proficiency levels. While selecting topics and working on evaluation criteria, however, test designers have to keep in mind a number of rules, some of which are listed in Example 25.

Example 25: Oral Presentations: Topic selection rules and topics (Adapted from https://www.myspeechclass.com/good-2-minute-speech-topics-for-students.html)

Young learners

Topics selected for young learners should be:
* simple and fun
* something they are familiar with
* something that they can see and experience in their environment
* something that comforts them and they are happy to talk about

Topics selected for young learners should NOT be:
* too difficult
* something that requires too much preparation
* something that they are uncomfortable about

Example topics: My favourite (stuffed) animal; My best friend; My favourite food for breakfast/lunch/dinner; My favourite colour(s); Something I love to do for fun; How to build with Lego; How to eat an ice cream/apple/pizza; A time I was brave; The best day of my life; The smartest cartoon character(s); Why does it rain?; Why is the sky blue?

Teenagers

Topics selected for teenagers could/should be:
* a bit more personal
* demonstrative
* motivational
* persuasive
* informative
* related to the topics discussed in class
* funny
* audience captivating

Topics selected for teenagers should NOT be:
* something that might worry them/make them nervous
* something that shows how they do not fit in

Example topics: Teaching my grandparents to use a smartphone; Would you rather use textbooks or tablets in class?; How to effectively fake being sick; If time travel were real; The Best Book I've Ever Read; The Best App on My Phone; Three Things I Can't Live Without; How Do I Feel when My Cell Battery is at 10%?; My Favourite Sandwich; Should School Start Later?

Depending on the length and the type of content of the oral presentation, students can be given anywhere from 2-3 minutes to a number of days to prepare. In English for specific purposes exams and with more advanced learners, test-takers might be asked to include references, prepare notes and a PowerPoint presentation so that the real-life context is replicated as much as possible.
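Presentations of this kind are typically scored with an analytic rubric: several weighted criteria, each rated on a band scale and combined into one mark. The criteria names and weights in the sketch below are hypothetical examples chosen for illustration, not a scheme prescribed in this chapter:

```python
# A HYPOTHETICAL analytic rubric for scoring an oral presentation.
# Criterion weights sum to 1.0; each criterion is rated on a 0-5 band.
RUBRIC = {
    "content": 0.30,
    "organisation": 0.20,
    "delivery": 0.25,
    "language accuracy": 0.25,
}

def presentation_score(ratings: dict[str, float]) -> float:
    """Combine 0-5 ratings into a single weighted score on a 0-100 scale."""
    if set(ratings) != set(RUBRIC):
        raise ValueError("ratings must cover exactly the rubric criteria")
    weighted = sum(RUBRIC[c] * r for c, r in ratings.items())
    return round(weighted / 5 * 100, 1)

print(presentation_score({
    "content": 4, "organisation": 3, "delivery": 5, "language accuracy": 4,
}))  # 81.0
```

Making the weights explicit like this forces test designers to state in advance how much, say, delivery matters relative to content, which is exactly the kind of decision the evaluation-criteria rules above ask them to settle before the exam.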

Practice Activities
Picture-cued Items
Ask your students to draw pictures of their favourite food/ place/ cartoon character.
Collect those and prepare questions related to each of those pictures. Make sure that the questions are appropriate for the level of your students and that they cannot be answered with one-word responses.
Give sample answers for each of the questions and prepare scoring rubrics.

Using Maps, Figures, Charts


Collaborate with the art teacher in your school and ask students to draw maps
showing their town/ street/ school.
Write questions, example responses and prepare rubrics for objective scoring of
the answers.

Reading-aloud Task
Revise the topics and structures you covered in class and would like to include in your first Midterm exam. Based on your analysis, prepare a 'diagnostic passage' for a reading-aloud task on the exam.
Write the instructions for this task and list the criteria that will be used for grading
students’ performance in the exam.

Paraphrasing
Keeping in mind your students’ age, level of proficiency, interest and the material
covered in class, select/write two little stories for your students to paraphrase. The
first story is to be read aloud to the students, while they will have a chance to read
the second one for themselves.
Prepare rubrics for both of those tasks.

Types of Speaking
Think about the five types of speaking tasks discussed in this Chapter (i.e., imitative,
intensive, responsive, interactive, and extensive). First, in pairs and then as a class,
prepare lists for the groups of students and the types of exams where each of
them can be used.

Questions for Study and Reflection


1. Think about the five types of speaking tasks discussed in this Chapter.
(a) In groups, list the similarities and differences between imitative
and intensive and between responsive and interactive tasks.
(b) Make a list of the instances in which each of those types of tasks
were used to assess your level/progress in English as a foreign
language.
(c) Compare your notes with your classmates and identify the
tendencies for assessing speaking in your country.
2. Think of and describe in detail the groups of students you would like
to work with after your graduation (e.g., young children; high
school students preparing for the university entrance exam in your
country; adults in a prep school of an English medium university).
(a) Which micro- and macro-skills of speaking will each of these groups need the most (see Brown, 2004, pp. 142-143)?
(b) What are the most appropriate techniques to measure those micro
and macro speaking skills? Why?
(c) Compare your answers with classmates planning to work with the
same/a similar/a different group of learners and identify the tasks
that have been listed by at least 3 of your classmates.
(d) Compare and contrast those tasks with the ones at the bottom of
the list.
3. Examine the Primary/Middle/High school books published by the Ministry of Education in your country.
(a) Find out how “successful speech” is defined/ which micro and
macro skills of speaking are taught/ emphasised in these books.
(b) What are the common and different types of knowledge and
skills included in the books written for the Primary/Middle/High
schools?
(c) Are the tasks used to assess speaking given in the books
appropriate/ parallel to the taught micro and macro skills?

References
Angelelli, C. V. (2009). Using a rubric to assess translation ability: Defining the construct. In C. V. Angelelli & H. E. Jacobson (Eds.), Testing and assessment in translation and interpreting studies: A call for dialogue between research and practice (pp. 13-47). John Benjamins.

Ayas, A., Ayaydin, A., Öncü, E., Kaymakçı, S., Börkan, B., Hatipoğlu, Ç., Durukan, E., Karataş, F. Ö., Çetinkaya, F. Ç., Aslan Tutak, F., Ateşkan, A., Orçan, F., & Dinç Altun, Z. (2020). Okul ve sınıf tabanlı değerlendirmeye dayalı öğretmen kapasitesinin güçlendirilmesi: Yabancı dil olarak İngilizce dersi öğretmen rehber kitapçığı. Ankara: Milli Eğitim Bakanlığı. https://odsgm.meb.gov.tr/www/okul-ve-sinif-tabanli-degerlendirmeye-dayali-ogretmen-kapasitesinin-guclendirilmesi-calismasi/icerik/554

Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing, 17(1), 1-42.

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford University Press.

Bloomfield, L. (1926). A set of postulates for the science of language. Language, 2, 153-164.

Bloomfield, L. (1933). Language. Henry Holt.

Bradshaw, D. (2020). Assessing speaking: Then and now. In D. Gonzalez-Alvarez & E. Ramma-Martinez (Eds.), Languages and the internationalisation of higher education (pp. 212-228). Cambridge Scholars Publishing.

Bridgeman, B., Powers, D., Stone, E., & Mollaun, P. (2011). TOEFL iBT speaking test scores as indicators of oral communicative language proficiency. Language Testing, 29(1), 91-108.

Broughton, G., Brumfit, C., Flavell, R., Hill, P., & Pincas, A. (1980). Teaching English as a foreign language (Second edition). Routledge and Kegan Paul.

Brown, H. D. (2004). Language assessment: Principles and classroom practices. Pearson Education.

Brown, H. D., & Abeywickrama, P. (2019). Language assessment: Principles and classroom practices (Third edition). Pearson.

Buck, G. (1992). Translation as a language testing procedure: Does it work? Language Testing, 9(2), 123-148.

Can Daşkın, N., & Hatipoğlu, Ç. (2019a). Reference to a past learning event emerging as a practice of informal formative assessment in L2 classroom interaction. Language Testing, 36(4), 524-551.

Can Daşkın, N., & Hatipoğlu, Ç. (2019b). Reference to a past learning event in teacher turns in an L2 instructional setting. Journal of Pragmatics, 142, 16-30.

Can Daşkın, N., & Hatipoğlu, Ç. (2019c). A proverb learned is a proverb earned: Proverb instruction in EFL classrooms. Eurasian Journal of Applied Linguistics, 5(1), 57-88.

Candlin, C. N. (1987). Towards task-based language learning. In C. N. Candlin & D. F. Murphy (Eds.), Language learning tasks (pp. 5-22). Prentice Hall International (Lancaster Practical Papers in English Language Education, Vol. 7).

Chen, M. H., Huang, S. T., Chang, J. S., & Liou, H. C. (2015). Developing a corpus-based paraphrase tool to improve EFL learners' writing skills. Computer Assisted Language Learning, 28(1), 22-40.

Chun, D. (2002). Discourse intonation in L2: From theory to practice. John Benjamins.

Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge University Press.

Fries, C. C. (1945). Teaching and learning English as a foreign language. University of Michigan Press.

Frodesen, J. (2002). Developing paraphrasing skills: A pre-paraphrasing mini-lesson. http://www.ucop.edu/dws/lounge/dws_ml_pre_paraphrasing.pdf

Fulcher, G. (1999). Assessment in English for academic purposes: Putting content validity in its place. Applied Linguistics, 20(2), 221-236.

Fulcher, G. (2014). Testing second language speaking. Routledge.

Gibson, S. (2008). Reading aloud: A useful learning tool? ELT Journal, 62(1), 29-36.

Harding, L. (2014). Communicative language testing: Current issues and future research. Language Assessment Quarterly, 11(2), 186-197.

Hartley, J. (1993). Writing, thinking and computers. British Journal of Educational Technology, 24(1), 22-31.

Hatipoğlu, Ç. (2009). Culture, gender and politeness: Apologies in Turkish and British English. VDM Verlag Dr. Muller Aktiengesellschaft & Co. KG.

Hatipoğlu, Ç. (2012). Apologies and gender in Turkish and British English. Dilbilim Araştırmaları, 1, 55-79.

Hatipoğlu, Ç. (2013). First stage in the construction of METU Turkish English Exam Corpus (METU TEEC). Boğaziçi University Journal of Education, 30(1), 5-23.

Hatipoğlu, Ç. (2016a). Explicit apologies in L2 Turkish. In A. Gürel (Ed.), L2 acquisition of Turkish (pp. 221-248). John Benjamins.

Hatipoğlu, Ç. (2016b). The impact of the University Entrance Exam on EFL education in Turkey: Pre-service English language teachers' perspective. Procedia - Social and Behavioral Sciences, 232, 136-144.

Hatipoğlu, Ç. (2017a). History of language teacher training and English language testing and evaluation (ELTE) education in Turkey. In Y. Bayyurt & N. Sifakis (Eds.), English language education policies and practices in the Mediterranean countries and beyond (pp. 227-257). Peter Lang.

Hatipoğlu, Ç. (2017b). Assessing speaking skills. In E. Solak (Ed.), Assessment in language teaching (pp. 118-148). Pelikan.

Hatipoğlu, Ç. (2017c). Status and apologising in L2: A problem? In S. N. Büyükkantarcıoğlu, I. Özyıldırım & E. Yarar (Eds.), 45. Yıl Yazıları (pp. 195-213). Ankara: Hacettepe University Publications.

Hatipoğlu, Ç., & Can Daşkın, N. (2020). A proverb in need is a proverb indeed: Examination of the proverbs in the coursebooks used in high schools in Turkey. South African Journal of Education, 40(1), 1-15.

Hatipoğlu, Ç., & Erçetin, G. (2016). Türkiye'de yabancı dilde ölçme ve değerlendirme eğitiminin dünü ve bugünü (The past and present of foreign language testing and evaluation education in Turkey). In S. Akcan & Y. Bayyurt (Eds.), 3. Ulusal Yabancı Dil Eğitimi Kurultayı Bildiri Kitabı (pp. 72-89). Boğaziçi Üniversitesi Press.

Hatipoğlu, Ç., Gajek, E., Milosevska, L., & Delibegović Džanić, N. (2020). Crowdsourcing for widening participation and learning opportunities: A view from pre-service language teachers' window. In K.-M. Frederiksen, S. Larsen, L. Bradley & S. Thouësny (Eds.), CALL for widening participation: Short papers from EUROCALL 2020 (pp. 81-87). Research-publishing.net. https://research-publishing.net/publication/978-2-490057-81-8.pdf

Heaton, J. B. (1990). Writing English language tests. Longman.

House, J. (2015). Translation quality assessment: Past and present. Routledge.

Howatt, A. P. R. (1984). A history of English language teaching. Oxford University Press.

Huang, L. (2010). Reading aloud in the foreign language teaching. Asian Social Science, 6(4), 148.

Hughes, A. (2003). Testing for language teachers (Second edition). Cambridge University Press.

Isaacs, T. (2016). Assessing speaking. In D. Tsagari & J. Banerjee (Eds.), Handbook of second language assessment (pp. 131-146). De Gruyter Mouton.

Johnson, V. E. (1985). The place of general achievement testing in the Freshman English language program. Collection of Humanities, 32, 28-54. https://m-repo.lib.meiji.ac.jp/dspace/bitstream/10291/11979/1/jinbunkagakuronshu_32_(35).pdf

Knight, B. (1992). Assessing speaking skills: A workshop for teacher development. ELT Journal, 46(3), 294-302.

Kramsch, C. (1986). From language proficiency to interactional competence. The Modern Language Journal, 70(4), 366-372.

Laviosa, S. (2014). Translation and language education: Pedagogic approaches explored. Routledge.

Lee, S. (2010). Current practice of classroom speaking assessment in secondary schools in South Korea [Unpublished MA thesis]. University of Queensland.

Lowe, P. Jr., & Stansfield, C. W. (Eds.). (1988). Second language proficiency assessment: Current issues. Prentice Hall.

Luoma, S. (2004). Assessing speaking. Cambridge University Press.

Madsen, H. S. (1983). Techniques in testing. Oxford University Press.

May, L. (2010). Developing speaking assessment tasks to reflect the 'social turn' in language testing. University of Sydney Papers in TESOL, 5, 1-30.

McCarthy, M. (2006). Explorations in corpus linguistics. Cambridge University Press.

McInnis, L. (2009). Analyzing English L1 and L2 paraphrasing strategies through concurrent verbal report and stimulated recall protocols [Unpublished MA thesis]. University of Toronto.

McNamara, T. (2000). Language testing. Oxford University Press.

Milićević, J., & Tsedryk, A. (2011). Assessing and improving paraphrasing competence in FSL. In Proceedings of the 5th international conference on meaning-text theory (pp. 175-184).

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2002). Design and analysis in task-based language assessment. Language Testing, 19(4), 477-496.

Nida, E. (1964). Toward a science of translation. Brill.

Norris, J. M. (2016). Current uses of task-based language assessment. Annual Review of Applied Linguistics, 36, 230-244.

Palmer, H. E. (1921a). The oral method of teaching languages. Heffer and Sons Ltd.

Palmer, H. E. (1921b). The principles of language-study. Harrap.

Pawlak, M. (2016). Assessment of language learners' spoken texts: Overview of key issues. In H. Chodkiewicz, P. Steinbrich & M. Krzemińska-Adamek (Eds.), Working with text and around text in foreign language environments (pp. 89-105). Springer.

Popham, W. J. (2018). Classroom assessment: What teachers need to know (Eighth edition). Pearson Education Company.

Prator, C. H., & Robinett, B. W. (1985). Manual of American English pronunciation. University of California Press.

Presas, M. (2000). Bilingual competence and translation competence. Benjamins Translation Library, 38, 19-32.

Sadler, D. R. (1998). Formative assessment: Revisiting the territory. Assessment in Education: Principles, Policy & Practice, 5(1), 77-84. https://doi.org/10.1080/0969595980050104

Seo, S. (2014). Does reading aloud improve foreign language learners' speaking ability? GSTF International Journal of Education, 2(1), 46-50.

Stroh, E. N. (2012). The effect of repeated reading aloud on the speaking fluency of Russian language learners [Unpublished MA thesis]. Brigham Young University.

Sun, Y. C. (2009). Using a two-tier test in examining Taiwan graduate students' perspectives on paraphrasing strategies. Asia Pacific Education Review, 10(3), 399-408.

Sun, Y., & Cheng, L. (2013). Assessing second/foreign language competence using translation: The case of the college English test in China. In D. Tsagari & G. Floros (Eds.), Translation in language teaching and assessment (pp. 235-252). Cambridge Scholars Publishing.

Supraba, A., Wahyono, E., & Syukur, A. (2020). The implementation of reading aloud in developing students' speaking skill. IDEAS: Journal on English Language Teaching and Learning, Linguistics and Literature, 8(1), 145-153.

Swastika, P. A., Miranti, R. R., & Nur, M. R. O. (2020). The analysis of speaking assessment types in textbook "When English Rings a Bell Grade VII". Jurnal Studi Guru dan Pembelajaran, 3(2), 167-173.

Sweet, H. (1899/1964). The practical study of languages. Oxford University Press.

Tajeddin, Z., Alemi, M., & Yasaei, H. (2018). Classroom assessment literacy for speaking: Exploring novice and experienced English language teachers' knowledge and practice. Iranian Journal of Language Teaching Research, 6(3), 57-77.

Taylor, L. (2011). Examining speaking: Research and practice in assessing second language speaking (Studies in Language Testing, 30). Cambridge University Press.

Thornbury, S. (2003). How to teach grammar. World Affairs Press.

Tsagari, D., & Floros, G. (Eds.). (2013). Translation in language teaching and assessment. Cambridge Scholars Publishing.

Umirova, D. (2020). Authenticity and authentic materials: History and present. European Journal of Research and Reflection in Educational Sciences, 8(10), 129-133.

Underhill, N. (1987). Testing spoken language: A handbook of oral testing techniques. Cambridge University Press.

Valette, R. M. (1997). Modern language testing. Harcourt Brace Jovanovich.

Weir, C. (1988). Communicative language testing. Exeter University Press.

Weir, C. (1993). Understanding and developing language tests. Prentice Hall International.

Weir, C. (2013). Measured constructs: A history of Cambridge English language examinations 1913-2012. Cambridge English Research Notes, 51, 2-10.

Weir, C., & Wu, J. R. W. (2006). Establishing test form and individual task comparability: A case study of a semi-direct speaking test. Language Testing, 23(2), 167-197.

About the Author


Dr. Çiler Hatipoğlu is a Professor in the Department of Foreign Language
Education (FLE) at Middle East Technical University (METU), Ankara, Turkey,
where she teaches various Linguistics and FLE courses at the undergraduate
and graduate levels. Her main research areas are foreign language teaching
and assessment, cross-cultural communication, pragmatics, politeness, corpus
linguistics and metadiscourse. Dr. Hatipoğlu has published articles on these
issues in various national and international journals and books (e.g., Language
Testing, Journal of Pragmatics, System, South African Journal of Education, EJAL,
Dilbilim Araştırmaları, John Benjamins, Lexington, Peter Lang). She is also a member
of the team that was responsible for the development of the first Spoken Turkish
Corpus.
