Hill McNamara (2011) - Framework For Classroom-Based Assessment
Article

Developing a comprehensive, empirically based research framework for classroom-based assessment

Kathryn Hill and Tim McNamara
University of Melbourne, Australia

Language Testing 29(3) 395–420
© The Author(s) 2011
Reprints and permission: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0265532211428317
ltj.sagepub.com
Abstract
This paper presents a comprehensive framework for researching classroom-based assessment
(CBA) processes, and is based on a detailed empirical study of two Australian school classrooms
where students aged 11 to 13 were studying Indonesian as a foreign language. The framework can
be considered innovative in several respects. It goes beyond the scope of earlier models in addressing
a number of gaps in previous research, including consideration of the epistemological bases for
observed assessment practices and a specific learner and learning focus. Moreover, by adopting
the broadest possible definition of CBA, the framework allows for the inclusion of a diverse range
of data, including the more intuitive forms of teacher decision-making found in CBA (Torrance &
Pryor, 1998). Finally, in contrast to previous studies the research motivating the development of
the framework took place in a school-based foreign language setting. We anticipate that the
framework will be of interest to both researchers and classroom practitioners.
Keywords
classroom-based assessment, continuity in language learning, language policy, language testing,
school foreign language learning, transition in language programs
Introduction
The aim of the study reported in this paper was to develop from the ground up a compre-
hensive framework for conducting research on classroom-based assessment. The context
for this research is an increasing trend to devolve responsibility for assessment to classroom
teachers (e.g. Cumming & Maxwell, 2004; Davison & Leung, 2009), together with a grow-
ing awareness of the impact of assessment on learning (Black & Wiliam, 1998). However,
we would argue there is a lack of coherence in terms of focus and approach amongst
existing CBA studies. Previous studies have focused on issues such as validity and reliability
(e.g. Gipps, 1994; Huerta-Macías, 1995), criteria and standards (e.g. Leung, 2007; Leung
& Teasdale, 1997b) and the influence of external assessment and reporting regimes on
classroom practices (e.g. Brindley, 1998, 2001; Clarke & Gipps, 2000; Davison, 2004).
Relatively few studies, however, have focused on the actual processes of classroom-based
assessment (the research this framework is intended to guide), and even then only on
aspects of the assessment process rather than on assessment as a comprehensive whole.

Corresponding author:
Kathryn Hill, Medical Education Unit, School of Medicine, University of Melbourne, Victoria 3010, Australia
Email: kmhill@unimelb.edu.au
The framework we propose is based on a detailed empirical study of two Australian
school classrooms where students aged 11 to 13 were studying Indonesian as a foreign
language. Although essentially a bottom-up study, an initial orientation to the research
reported on here was developed from the existing literature. In this case, two themes from
the literature guided the investigation: the dimensions and scope of CBA, as well as the
way these two aspects intersected with each other.
Dimensions
McNamara (2001) sets out three critical dimensions of assessment: evidence, interpreta-
tion and use. According to McNamara, CBA is ‘[a]ny deliberate, sustained and explicit
reflection by teachers (and by learners) on the qualities of a learner’s work’ and the use
of this information, for example, ‘as an aid to the formulation of learning goals…’ (2001,
p. 343). However, the literature reveals significant diversity in how each of these dimen-
sions is understood (Table 1).
The definition adopted for the study is designed to reflect this diversity and, in line with
ethnographic principles, to admit all possible evidence. We thus propose the following
amended definition of CBA:
any reflection by teachers (and/or learners) on the qualities of a learner’s (or group of learners’)
work and the use of that information by teachers (and/or learners) for teaching, learning (feedback),
reporting, management or socialization purposes.
Note this definition of CBA incorporates both formative (or assessment for/as learning)
and summative assessment (assessment of learning).
There are also important differences in how the unit of analysis in CBA research is
defined. Whereas Leung and Mohan (2004), for example, focused on a planned assessment
activity, the unit of analysis in Rea-Dickins’ (2001) study, the ‘assessment opportunity’,
also included activities identified post hoc (i.e. on reflection) by the participating teachers.
Torrance and Pryor (1998) go even further, however, to include the forms of assessment
occurring ‘within the largely taken-for-granted discourse structure of teacher questions
and pupil responses’ (p. 131). The difficulty here is identifying when (or whether) assess-
ment is actually taking place, given the often intuitive nature of teacher decision-making
in CBA (Rea-Dickins, 2006). The challenge for the researcher is thus how to understand
CBA when it takes place ‘without conscious planning, as part of ordinary interaction with
students, that is, when [teachers] consider themselves to be teaching’ (Leung, 2005, p. 877).
This phenomenon is nicely captured in the following comment from a teacher in our study:
It’s all like you’ve got antennae sticking out of your ears and it all comes in… You’re constantly
processing it, you’re constantly building up, I mean, I just know, just sitting in class, you know,
you become aware of who’s got the answer or who’s gonna have a go at it. Like Arthur will
keep trying till the cows come home. You know he won’t get it straight away but, you know?
So but, there’s that but there’s also, there’s their identity in the class and there’s all sorts of
things. (Year 6)
In order to allow for these more intuitive forms of assessment, we have extended Rea-Dickins’
(2001) notion of the ‘assessment opportunity’ to include the following:

any actions, interactions or artefacts (planned or unplanned, deliberate or unconscious, explicit
or embedded) which have the potential to provide information on the qualities of a learner’s (or
group of learners’) performance.
Once again this definition is deliberately broad. In particular, it allows the observer
to consider the nature of the ‘affordances’ (van Lier, 2004) or evidence available for
assessment in the classroom, in recognition of the possibility that incidental forms of
assessment are taking place.
Table 2. The relationship between scope and dimensions

1. What do teachers do?
   Evidence: What activities or behaviours are assessed? Is assessment planned/incidental, explicit/embedded? Does it target individuals, groups, the whole class?
   Interpretation: Is reflection sustained or fleeting?
   Use: How is assessment used?

2. What do they look for?
   Interpretation: What criteria do they apply?

3. What theory or ‘standards’ do they use?
   Interpretation: What are the values guiding assessment?

4. Do learners share the same understandings?
   Evidence / Interpretation / Use: What are learners’ beliefs about how assessment is conducted, interpreted and used?
Scope
In terms of scope, the following focal research questions, reflecting issues identified by
Leung (2005) and Rea-Dickins (2006) respectively, were used to guide the empirical study:

1. What do teachers do?
2. What do they look for?
3. What theory or ‘standards’ do they use?
4. Do learners share the same understandings?
The relationship between scope and dimensions is set out in Table 2. Questions 2 and 3
primarily relate to ‘interpretation’, while Questions 1 and 4 investigate all three dimen-
sions (‘evidence’, ‘interpretation’ and ‘use’).
In summary, to date there is no comprehensive framework for conceptualizing and
guiding research into processes of CBA in its broadest definition. Here we present such
a framework, based on a detailed empirical study of two Australian classrooms where
students aged 11 to 13 were studying Indonesian as a foreign language. Hence, following
Wiliam (2001), ‘instead of building theoretical models and then trying to apply them to
[teachers’] assessment practices, we try to theorise what is actually being done’ (p. 172).
Research context
The study took place in two Indonesian language classrooms in Victoria, Australia, and
coincided with the introduction of a new curriculum and standards framework, VELS
(VCAA, 2008).1 Schooling in Victoria commences at age 5 and is divided into two levels,
primary (Preparatory to Year 6) and secondary (Years 7 to 12). Mainstream foreign language
programs in Australia involve ‘language arts’ (i.e. in contrast to, e.g., the Content and Language
Integrated Learning (CLIL) approach currently favoured in Europe). Furthermore, while
government schools are obliged to use the official framework for reporting, assessment at the
year levels involved in this study is not considered ‘high stakes’ (as is the case, for example,
in the UK National Curriculum).
There were a number of reasons for choosing Indonesian. Firstly, as this was an ethno-
graphic study involving participant observation, it was important to choose a language that
the primary researcher (Hill) speaks and understands. A second consideration was the avail-
ability of suitable programs. Since the 1990s there has been a particular emphasis on the
study of Asian languages in Australian language education policy. Indonesian is perceived
to be the easiest of the Asian languages promoted under the policy (e.g. in contrast to
Chinese, Japanese and Korean, there is a direct and transparent relationship between written
and spoken forms of the language). As a result, at the time of data collection Indonesian
was the most widely studied language in Victorian schools after Italian (DOE, 2006).
Interactions

Teacher–Teacher/Researcher
- planning sessions
- reporting sessions
- discussions with researcher

Teacher–Student
- written (on board) or verbal instructions
- explanation/clarification requests
- oral or written feedback

Student–Student
- discussion of task requirements
- discussion of feedback/results
- self- and peer evaluations
- student focus group interviews

Documents
- assessment task sheets
- assessment rubrics (criteria)
- course outlines, work requirements
- text books
- student workbooks, worksheets & assignments
- statements of aims (policy) at school & system level
- teachers’ notes & ‘running records’
- written communications between teachers
- summative reports
Table 4 shows that data for this unit of instruction (‘Introducing Yourself’) was col-
lected over the course of five lessons and includes recordings and transcriptions of whole-
class and paired interactions for each lesson. It shows, for example, that data for Pair 1
(Tamara and Jess) includes recordings of pair-work interactions from four lessons, an
extended individual feedback session, an oral presentation (including teacher feedback)
and related focus group interview discussions. Data also includes copies of the role-play
scripts they produced, the printed assessment rubric, the teacher’s notes about the pair’s
oral presentation, ledger entries, and other relevant documents (e.g. notes from the board,
completed homework exercises, etc.).
The coding process was grounded and iterative (Strauss, 1987; Strauss & Corbin, 1998).
A set of preliminary codes which had been established based on a combination of issues
suggested by theory and/or previous research were refined on the basis of a continuing
‘dialogue’ between the research questions, the literature and the data (Ritchie & Spencer,
2002). The aim was to produce a set of analytic categories, grounded in the data, which
provided a framework ‘that leaves nothing unaccounted for and that reveals the interre-
latedness of all the component parts’ (Hornberger, 1994, p. 688). The validity of the
categories was tested through comparison of similarities and differences within as well
as across the two classroom contexts. Hence successive iterations of the scheme were
applied to a small subset of the data, starting with a transcript from a single lesson expand-
ing to additional transcripts from the same unit of instruction. Revisions were informed
by the need to provide a faithful representation of the data and to account for emerging
insights with relevance to the main research questions (Lynch, 2003).
Other measures to ensure the validity of the categories included the large quantity of
data collected and analysed (Morgan, 2002), participant checks of transcripts and analyses
as well as checking by an ‘external’ coder (a colleague with knowledge of Indonesian).
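The iterative, cross-context comparison described above can be sketched in outline. The following Python fragment is a minimal illustration only: the code labels and counts are hypothetical, and the study itself refined its analytic categories through qualitative analysis of transcripts rather than through any such script.

```python
from collections import Counter

def tally_codes(coded_segments):
    """Count occurrences of each analytic code in (code, segment) pairs."""
    return Counter(code for code, _ in coded_segments)

# Hypothetical coded transcript segments from the two classroom contexts.
year6 = [("planned/explicit", "jigsaw reading task"),
         ("incidental/embedded", "questioning during drill"),
         ("incidental/embedded", "show of hands")]
year7 = [("planned/explicit", "writing task"),
         ("planned/explicit", "vocabulary test")]

by_context = {"Year 6": tally_codes(year6), "Year 7": tally_codes(year7)}

# Compare code frequencies within and across the two contexts, so that
# similarities and differences can inform the next revision of the scheme.
all_codes = sorted(set().union(*by_context.values()))
for code in all_codes:
    print(code, {ctx: counts.get(code, 0) for ctx, counts in by_context.items()})
```

Successive iterations would re-run such a comparison over a growing subset of transcripts, as described above.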
Findings
In what follows, we will report the findings in relation to each of the four focal questions
using brief illustrations from the classroom-based study and with reference to existing
CBA research.
Planning assessment
The first category, ‘Planning’, arose from an analysis of internal syllabus documents as
well as from discussions about teaching and assessment with, and between, the participating
teachers. This category captures information about the type and nature of planned assessment
tasks and the relationship of assessment to instruction as well as to the relevant external
frameworks (this last aspect will be revisited under RQ3).

Examination of the planning documents from Year 6 showed teaching and assessment
was planned in some detail. Extracts from the documents, reproduced as Figures 2 and 3,
indicate that the same task, in this case ‘jigsaw reading’ (present in both documents), was
specified as both a ‘learning’ and an ‘assessment’ activity:

Jigsaw reading – students read short passages of Indonesian history leading to Kemerdekaan
[Independence] and recombine to create a group timeline using a teacher-generated word bank.

Note that the format and terminology (e.g. ‘intercultural understanding’) explicitly reference
the mandated curriculum and assessment guidelines.
The second example (Figure 4), which appears on the first page of the Year 7 student
workbook, informs students that each unit of instruction will be associated with an
assessment activity:

Your task, we’re doing a writing task. On your report you have marks for reading comprehension,
writing tasks, speaking, listening and tests. This is going on your report. This is your writing
task for Term 1. (T2, Year 7)
Other examples of how the assessable nature of the activity was flagged to learners
included printed assessment task specifications and scoring rubrics (see RQ 2).
Conducting assessment
The processes captured by this category range from explicit, planned, formal assessment
activities to less visible, unplanned, instruction-embedded assessment activities. Table 5
sets out the terms used to distinguish between these types.
Um Dan can you read the next paragraph and then we’ll go um, Violet sorry Violet, Violet, um
Margie, sorry we’ll just go around, ok? (T1, Year 7)
This category was also used to code whether the assessment involved individual students,
pairs or small groups of students, or the whole class. In the following example of group-
level assessment in Year 7, asking for a show of hands provided the teacher with an indica-
tion of how the class, as a whole, had performed on a peer-corrected vocabulary test.
(T1, Year 7)
Torrance and Pryor (1998) concluded from their study of CBA practices that assess-
ment more often occurred at the whole group level (as in this example), than at the level
of individual students.
Finally, this category investigates the teachers’ approach to CBA at a more general,
philosophical, level. Torrance and Pryor (1998) contrast convergent and divergent
approaches to assessment. Convergent assessment is concerned with ‘whether the learner
knows, understands or can do a predetermined thing’ (p. 153). It is characterized by the
use of closed questions, intended to elicit a single ‘expected’ answer. This type of ques-
tioning limits the amount and quality of information afforded for assessment purposes. It
also tends to encourage guessing as demonstrated in the following example.
(Year 6)
Divergent assessment, on the other hand, involves asking ‘genuine’ questions (i.e. where
the answer is not predetermined in advance) and ‘require[s] learners to attend to the
principles at stake rather than the ritual of question and answer’ (Torrance & Pryor, 1998,
p. 129). Contrast the type of response elicited in the previous example (i.e. discrete ques-
tion words and guessing) with the level of learner engagement and type of knowledge
elicited using the more open, or exploratory, type of questioning used in the next example
(also from Year 6).
(Year 6)
It is perhaps worth noting that the focus of this discussion (which was the only example
of divergent assessment occurring in the data) was cultural, rather than linguistic,
knowledge.
Teaching. The following example shows how the teachers’ informal observations of how
the Year 7 class was progressing (‘incidental’, ‘group-level’ assessment) were used to
inform the pace of teaching.
T1: (I’m) trying to think how long [to spend on this topic]. When I first taught it I spent
about five lessons on it but sometimes I feel like I tend to go a bit slow and maybe
I need to speed it up a bit. But sometimes I feel that [going slow] also works better
as well. But that class, I don’t know.
T2: I only had them period 6.
T1: But they’ve really pulled up their socks.
T2: They have. I reckon we can crank it up a notch.
(Year 7)
You’re very smart. That was very good. Very, very good. (T1, Year 7)
Feedback is explanatory when it is used to highlight and/or explain the successful aspects
of performance.
Very good, um, good body language too. I liked the way you shook hands, said, then you waved
goodbye to each other at the end. (T1, Year 7)
The third type of task-related feedback, corrective feedback, is used to draw attention to
the gap between what the student has done and what was expected.
Corrective feedback may vary according to degree of explicitness, using Aljaafreh
and Lantolf’s (1994) ‘Regulatory Scale’. Based on an analysis of the level of assistance,
or ‘regulation’, provided by tutors, the scale specifies increasingly explicit forms of
corrective feedback, from ‘0’, where the learner is completely self-regulating, to ‘12’,
where the teacher provides additional examples to illustrate the point. In the following
example, we can see that the feedback became increasingly explicit over the course of
the interaction. The task was to translate the number, ‘1,979’ into Indonesian.
S1: Yep ok seribu sembilan ratus tujuh sembilan should be tujuh puluh
    [one thousand nine hundred seven_ nine] [seventy]
T:  Say it again (Type 3: indicates something is wrong)
S1: Seribu sembilan ratus tujuh sembilan
    [one thousand nine hundred seven^ nine]
T:  Missing one word (Type 6: identifies nature of the error)
S1: Ok
T:  What’s the word please [*] over here
S2: Jake
T:  What’s your name?
S2: Jake
T:  Jake
S2: Tujuh puluh satu [seventy one]
T:  Tujuh puluh? (rising intonation) (Type 5: identifies the location of the error)
S2: Satu [one]
T:  Sembilan [nine] (Type 10: T provides correct form)
S2: Is that a nine?
T:  That’s a nine.
(T1, Year 7)
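The coding applied in this exchange can be represented schematically. The following Python sketch is illustrative only and not part of the study’s method: the level descriptions paraphrase the selected points on Aljaafreh and Lantolf’s (1994) Regulatory Scale mentioned in the text, and the helper function is hypothetical.

```python
# Selected points on the 13-level (0-12) Regulatory Scale, paraphrased from
# the descriptions given in the text; the full scale defines every level.
REGULATORY_SCALE = {
    0: "learner is completely self-regulating",
    3: "teacher indicates that something is wrong",
    5: "teacher identifies the location of the error",
    6: "teacher identifies the nature of the error",
    10: "teacher provides the correct form",
    12: "teacher provides additional examples to illustrate the point",
}

# Teacher feedback turns from the number-translation exchange, coded by level.
coded_turns = [
    ("Say it again", 3),
    ("Missing one word", 6),
    ("Tujuh puluh?", 5),
    ("Sembilan", 10),
]

def explicitness_rose(turns):
    """Did the feedback end more explicit (higher level) than it began?"""
    levels = [level for _, level in turns]
    return levels[-1] > levels[0]

for turn, level in coded_turns:
    print(f"{turn!r}: level {level} ({REGULATORY_SCALE[level]})")
```

Applied to the coded turns above, the interaction begins at level 3 and ends at level 10, consistent with the observation that the feedback became increasingly explicit.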
Table 6 summarizes the relationship of the feedback types as they have been defined here
to terminology used by Tunstall and Gipps (1996) and others.
Reporting. In the following example the Year 7 teachers are using assessment-related
information to inform decisions about students’ end-of-year reports.
She got a 17 for her writing task, her personal writing task. She only got 11 out of 20 for her
[first] oral and then 19 out of 24 for her [second] oral.
(T1, Year 7)
Management. A number of researchers have noted the use of assessment for classroom
management, such as controlling or reinforcing behaviour, for encouragement or for creat-
ing a positive atmosphere (Mavrommatis, 1997; Tunstall & Gipps, 1996). Torrance and
Pryor (1998) argue the ubiquitous initiation-response-feedback (IRF) sequence (Mehan,
1979), featured in the next example, functions as much to manage teaching or ‘accomplish
the lesson’ as to assess group-level knowledge.
T: Ok. What’s the word we use when we want to get a number in the answer?
S1: Ke- [the prefix used for ordinals]
T: Now I’ve worn myself out for four years teaching you this one
S2: Berapa? [how many?]
T: Hey! Who said that? Well done!
(Year 6)
This example demonstrates the tension between the pedagogic and managerial func-
tions of classroom-based assessment. While ‘Ke-’ was not the expected response, it provided
an opportunity to explore understanding of the use of ‘ke-’ [prefix used for ordinals] as
opposed to ‘berapa’ [how many]. The teacher, however, gave priority instead to moving
through the lesson.
Socialization. Finally, researchers have highlighted a role for assessment in the ‘socialization’
of learners into the local conventions of teaching and assessment (Moni, 1999;
Torrance & Pryor, 1998; Tunstall & Gipps, 1996). In this interaction, from Year 7, the
teacher introduces learners to the (at this point novel) concept of using descriptive feedback
from successive oral assessment tasks to monitor their own progress.
T1: Remember the rubrics, remember these from your last oral presentation?
S?: All right
T1: What did I ask you to do with these last time?
S1: Keep them
T1: Ok. Today when I give you your [new] rubric I want you to look at it very closely. I
want you to stick it in your books. The [rubric from the] last lesson, last oral. Look
at it very closely. Have a look at where you’ve improved.
S2: The conversation one?
T1: Yes the conversation.
S3: um (*)
T1: Where you’ve improved.
(T1, Year 7)
Again, the same interaction could also be seen as an example of using assessment for the
purpose of learning (self-assessment).
In advance
Figure 5 provides a segment of the instructions to students for a Year 6 speaking and
writing task. It provides advance information about the obligatory components of the task
(Parts A, B and C), the response format (writing and speaking), performance conditions
(‘with a partner’) and weighting (15 from a total of 60 points). However, the specified
criteria (‘organization’, ‘persistence’, ‘getting along’ and ‘confidence’) refer to personal
qualities rather than any feature of written and spoken language.
Bonus points will be awarded for organization, persistence, getting along and confidence.
In feedback
A number of researchers have highlighted the importance of feedback in communicating
criteria and standards as well as the strategies for achieving them (e.g. Sadler, 1989;
Torrance & Pryor, 1998; Tunstall & Gipps, 1996).
In the following example the Year 7 class had been given an aural discrimination task,
where they had to note how many times they heard a specific vocabulary item. Here the
teacher provides information about the acceptable standard (or range of performance) for
this task, that is, exactly seven times (the correct answer) or close to that number (e.g.
five times).
Ok, whether you heard it five times or seven times as long as you’re in the vicinity you should
be pretty happy with yourselves that you’ve heard it that many times. If you’ve written down
‘once’ or if you’ve written down ‘24’ then there’s a bit of a problem. (T2, Year 7)
In reporting
Another source of evidence about what teachers look for was provided during the report-
ing process. The following example occurred during the end of year reporting meeting in
Year 7.
(Year 7)
Researchers have found that teachers draw on a range of factors outside the official
assessment criteria (e.g. Harlen, 1994; Leung, 2007; Leung & Teasdale, 1997;
Mavrommatis, 1997). Torrance and Pryor (1998), for example, noted a widespread
perception amongst teachers that ‘knowledge of the “whole child” is important in
interpreting performance and achievement’ (p. 36).
In the next example, the Year 7 teachers appeared to rely on a shared, but largely
unarticulated, understanding of the ‘expected’ level on VELS for Year 7 (4.5).
He’s been away, a lot of problems with absenteeism and that’s one of the reasons he’s down a
bit whereas everyone else, everyone else in my eyes is performing at 4.5. (T1, Year 7)
Tunstall and Gipps (1996) suggest teachers have a ‘notion of excellence’ which they
characterize as part of the teachers’ ‘guild knowledge’. This ‘guild knowledge’ informs
what Wiliam (2001) has termed ‘construct-referenced’ assessment, which ‘relies on the
existence of a construct (of what it means to be competent in a particular domain) being
shared by a community of practitioners’ (pp. 172–173).
(Year 7)
The teacher here attributes Karim’s performance to his presumed bilingualism, a personal
background variable, rather than to effort alone. This in turn suggests a belief that knowl-
edge of a second language makes it easier to learn a third.
I’ve had these for four years and I know where they’re at and I can probably pull a kid out at
the end and say, ‘Yes well he’s not too good at this, yeah, but he’s improving in that’ and whatever.
But yeah if you’ve had them for only one year it’s very, very hard to track. Yeah well it would
be very difficult. (Year 6)
In the next example, which occurred during a Year 7 planning meeting, the teachers
express a belief in the importance of assessing each unit of study.
T1: I did ‘Families and friends’ – do you feel you finished ‘Physical and emotional
descriptions’?
T2: Pretty much but I didn’t finish it off how I would have liked. I would have liked to
do some sort of test or something at the end.
(Year 7)
The final comment suggests a view (supported by research) that assessment needs to
take place reasonably soon after instruction to be effective in reinforcing learning
(Black & Wiliam, 1998). It is also consistent with a mastery orientation to assessment
and learning which, according to Thomas and Oldfather (1997), reflects a transmission
model of learning.
The study found teachers’ articulated views were broadly reflected in classroom practice.
For example, it concluded that the Year 6 teacher’s focus was on exposing learners to a
rich variety of culturally embedded language, together with informal assessment, whereas
in Year 7 the focus was on mastery (as evidenced by regular, formal assessment) of a
relatively narrow repertoire of linguistic items (Hill, 2010).
R: On the tape (you said) you thought she was good at Indonesian, so
S1: Yeah she is.
R: Why do you think that is?
S1: Cause she’s Vietnamese.
R: That would explain being good at Vietnamese but
S1: Yeah well isn’t it similar sort of?
R: Not really.
S1: Well see that just goes to show how much I don’t know.
S2: I think she’s really smart like she’s like smart at everything.
(Year 7)
Understandings of assessment
Previous research has found that learners often draw on their own, possibly incongruent,
understandings of task, criteria and standards (Coughlan & Duff, 1994; Moni, 1999;
Torrance & Pryor, 1998). In the following example, the Year 7 teacher anticipates and
expressly discourages a known propensity for students to focus on presentation at the
expense of content in their written work.
T: Please remember I’m collecting these today and I’m not marking them on how
pretty your border is. I’m marking you on your Indonesian…
T: Ok girls. Writing more than pictures.
S1: But I like pictures.
T: I know but I would like some writing, ok? Girls in this row in particular. Violet,
girls, Jessica, you need to do your writing first. Do your heading later, ok?
S2: (*)
T: Do it later. I’d rather you did it at the end.
S2: (*)
T: No. Write, write, write, write!
(T2, Year 7)
R: All right and so that’s just, what does the eight mean?
T: Depending, they’re not very huge errors so I just figured that that would get an
eight. A person who had incorrect use of a verb or whatever, made a major mistake
might have got seven and a half or something like that
R: What, what sort of things do you think [the teacher’s] looking at when she gives you
an eight?
S: Um your language, your spelling, how you put them [words?] in order like if you put
them in the wrong order she’d take a point off or, something um
Conclusion
This paper proposes a comprehensive framework for researching classroom based assess-
ment processes. The framework can be considered innovative in several respects. First,
as argued in the previous section, the framework goes beyond the scope of earlier models
and addresses a number of gaps in previous research, including consideration of the
epistemological bases for observed assessment practices and a specific learner and learn-
ing focus.
Second, by adopting the broadest possible definition of CBA (‘any reflection by teach-
ers (and/or learners) on the qualities of a learner’s (or group of learners’) work and the
use of that information by teachers (and/or learners) for teaching, learning, reporting,
management or socialization purposes’), the framework allows for the inclusion of a
diverse range of data.
Furthermore, the chosen unit of analysis, the ‘assessment opportunity’ (‘any actions,
interactions or artefacts (planned or unplanned, deliberate or unconscious, explicit or
embedded) which have the potential to provide information on the qualities of a learner’s
(or group of learners’) performance’), enables consideration of the more intuitive forms
of teacher decision-making in CBA.
Finally, whereas previous studies have been conducted in the context of general education
(e.g. Mavrommatis, 1997), English literacy (e.g. Torrance & Pryor, 1998) or English
as an Additional Language (EAL) classes (e.g. Rea-Dickins, 2001), the research motivating
the development of the framework took place in a school-based foreign language setting.
It is hoped that other researchers will investigate the utility of the framework for investi-
gating CBA at different levels of education (e.g. tertiary classrooms), program types
(especially content-based language programs) and policy contexts (especially in high
stakes assessment regimes).
The aim of the empirical study was to understand, rather than evaluate, CBA practices
in the respective classrooms, with a view to expanding, rather than answering, the questions
that should be asked in CBA research. However, in the words of Newman, Griffin
and Cole (1989), ‘descriptions of how a system works are never far removed
from questions about how to make it work better’. There is already a volume of research
evidence regarding the effects of different CBA practices on learning, not least of all that
found in Black and Wiliam’s (1998) influential meta-analysis. However, there is clearly
a place for experimental studies of how the different CBA processes outlined in this paper
might impact on learning outcomes.
In conclusion, we anticipate that the framework will be useful for researchers interested
in understanding classroom-based assessment and for teachers wishing to gain greater
insight into the integration of assessment in their everyday teaching practices and the
impact of their assessment practices on learning. As classroom-based assessment is increas-
ingly a focus of policy and research, the need for such a comprehensive framework for
both researchers and practitioners has never been more urgent.
Transcription symbols
Notes
1. The Victorian Essential Learnings & Standards (VELS) (VCAA, 2008) contain a particular
emphasis on ‘intercultural understanding’ as well as the ‘interdisciplinary’ and ‘interpersonal’
aspects of language learning. It is linked to year levels (or ‘stages of learning’) and specifies
separate trajectories (or ‘Pathways’) for ‘beginning’ and ‘continuing’ students post-primary.
2. This design reflects the focus of the larger study from which the data for this paper is drawn;
that is, the issue of continuity between primary and high school language programs.
References
Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second language learn-
ing in the Zone of Proximal Development. The Modern Language Journal, 78(4), 465–483.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education,
5(1), 7–74.
Brindley, G. (1998). Outcomes-based assessment and reporting in language learning programmes:
A review of the issues. Language Testing, 15, 45–85.
Brindley, G. (2001). Outcomes-based assessment in practice: Some examples and emerging insights.
Language Testing, 18(4), 393–407.
Clarke, S. (1998). Targeting assessment in the primary classroom. Strategies for planning assess-
ment, pupil feedback and target setting. London: Hodder & Stoughton.
Clarke, S., & Gipps, C. (2000). The role of teachers in teacher assessment in England 1996–1998.
Evaluation and Research in Education, 4, 38–52.
Coughlan, P., & Duff, P. (1994). Same task, different activities: Analysis of SLA from an activity
theory perspective. In J. Lantolf & G. Appel (Eds.), Vygotskian approaches to second language
research (pp. 173–194). Norwood, NJ: Ablex.
Cumming, J. J., & Maxwell, G. S. (2004). Assessment in Australian Schools: Current practices and
trends. Assessment in Education, 11, 89–108.
Davison, C. (2004). The contradictory culture of teacher-based assessment: ESL teacher assess-
ment practices in Australian and Hong Kong secondary schools. Language Testing, 21(3),
305–334.
Davison, C., & Leung, C. (2009). Current issues in English language teacher-based assessment.
TESOL Quarterly, 43(3), 393–415.
Department of Education, Victoria, Australia. (2006). Languages other than English in government
schools, 2006. Retrieved August 2009, from www.education.vic.gov.au/studentlearning/
teachingresources/lote.html
Duff, P. A., & Uchida, Y. (1997). The negotiation of teachers’ sociocultural identities and practices
in postsecondary EFL classrooms. TESOL Quarterly, 31(3), 451–486.
Gipps, C. (1994). Quality in teacher assessment. In W. Harlen (Ed.), Enhancing quality in assess-
ment (pp. 71–86). London: Paul Chapman.
Harlen, W. (1994). Issues and approaches to quality assurance and quality control in assessment.
In W. Harlen (Ed.), Enhancing quality in assessment (pp. 11–25). London: Paul Chapman.
Huerta-Macias, A. (1995). Alternative assessment: Responses to commonly asked questions. TESOL
Journal, 5, 8–11.
Hill, K. (2010). Classroom-based assessment and the issue of continuity between primary and
secondary school languages programs. Babel, 45(1), 5–12.
Hornberger, N. H. (1994). In A. Cumming (Ed.), Alternatives in TESOL research: Descriptive,
interpretive and ideological orientations. TESOL Quarterly, 28(4), 673–703.
James, M. (2006). Assessment, teaching and theories of learning. In J. Gardner (Ed.), Assessment
and learning (pp. 47–60). London: Sage Publications.
Leung, C. (2005). Classroom teacher assessment of second language development. In E. Hinkel
(Ed.), Handbook of research in second language teaching and learning (pp. 869–888). Mahwah,
NJ: Lawrence Erlbaum.
Leung, C. (2007). Dynamic assessment: Assessment for and as teaching? Language Assessment
Quarterly, 4(3), 257–278.
Leung, C., & Mohan, B. (2004). Teacher formative assessment and talk in classroom contexts:
Assessment as discourse and assessment of discourse. Language Testing, 21(3), 335–359.
Leung, C., & Teasdale, A. (1997). What do teachers mean by speaking and listening? A contextu-
alised study of assessment in multilingual classrooms in the English National Curriculum. In
A. Huhta, V. Kohonen, L. Kurki-Suonio, & S. Luoma (Eds.), Current developments and alter-
natives in language assessment: Proceedings of LTRC 96 (pp. 291–326). Jyväskylä: University
of Jyväskylä.
Lynch, B. K. (2003). Language assessment and programme evaluation. Edinburgh: Edinburgh
University Press.
Mavrommatis, Y. (1997). Understanding assessment in the classroom: Phases of the assessment
process – The assessment episode. Assessment in Education, 4, 381–400.
McNamara, T. (2001). Language assessment as social practice: Challenges for research. Language
Testing, 18(4), 333–350.
Mehan, H. (1979). Learning lessons: Social organization in the classroom. Cambridge, MA:
Harvard University Press.
Moni, K. B. (1999). Constructions of literacy assessment in two year 8 English classrooms.
Unpublished PhD thesis, Education, University of Queensland.
Morgan, D. L. (2002). Focus group interviewing. In J. F. Gubrium & J. A. Holstein (Eds.), Hand-
book of interview research: Context and method (pp. 141–159). London: Sage Publications.
Nassaji, H., & Swain, M. (2000). A Vygotskian perspective on corrective feedback: The effect of
random versus negotiated help on the learning of English articles. Language Awareness, 9,
34–51.
Newman, D., Griffin, P., & Cole, M. (1989). The construction zone: Working for cognitive change
in schools. Cambridge: Cambridge University Press.
Nicholls, J. G. (1989). The competitive ethos and democratic education. Cambridge, MA: Harvard
University Press.
Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment.
Language Testing, 18(4), 429–462.
Rea-Dickins, P. (2006). Currents and eddies in the discourse of assessment: A learning focused
interpretation. International Journal of Applied Linguistics, 16(2), 164–188.
Ritchie, J., & Spencer, L. (2002). Qualitative data analysis for applied policy research. In
A. M. Huberman & M. B. Miles (Eds.), The qualitative researcher’s companion (pp. 305–330).
Thousand Oaks, CA: Sage Publications.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional
Science, 18, 119–144.
Strauss, A. L. (1987). Qualitative analysis for social scientists. Cambridge: Cambridge University
Press.
Strauss, A., & Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for
developing grounded theory (2nd ed.). Newbury Park, CA: Sage Publications.
Thomas, S., & Oldfather, P. (1997). Intrinsic motivations, literacy and assessment practices: ‘That’s
my grade. That’s me.’ Educational Psychologist, 29, 107–123.
Torrance, H., & Pryor, J. (1998). Investigating formative assessment: Teaching, learning and
assessment in the classroom. Buckingham: Open University Press.
Tunstall, P., & Gipps, C. (1996). Teacher feedback to young children in formative assessment: A
typology. British Educational Research Journal, 22(4), 389–404.
Van Lier, L. (2004). The ecology and semiotics of language learning: A sociocultural perspective.
Boston: Kluwer Academic.
Victorian Curriculum & Assessment Authority (2008). Framework of essential learnings: Languages
other than English. Melbourne: Victorian Curriculum and Assessment Authority.
Wiliam, D. (2001). An overview of the relationship between assessment and the curriculum. In
D. Scott (Ed.), Curriculum and assessment (pp. 165–181). Westport, CT: Ablex.
Yin, M. (2010). Understanding classroom language assessment through teacher thinking research.
Language Assessment Quarterly, 7(2), 175–194.