Assessment & Evaluation in Higher Education, 2016
Vol. 41, No. 7, 1081–1099, http://dx.doi.org/10.1080/02602938.2015.1064858

Three in-course assessment reforms to improve higher education learning outcomes
D. Royce Sadler*

School of Education, The University of Queensland, Brisbane, Australia

*Email: d.sadler@uq.edu.au

A current international concern is that, for too large a proportion of graduates,
their higher order cognitive and practical capabilities are below acceptable levels.
The constituent courses of academic programmes are the most logical sites for
developing these capabilities. Contributing to patchy attainment are deficiencies
in three particular aspects of assessment practice: the design and specifications
of many assessment tasks; the minimum requirements for awarding a passing
grade in a course and granting credit towards the degree; and the accumulation
of points derived from quizzes, assessments or activities completed during the
teaching period. Rethinking and reforming these would lead to improvements for
significant sub-populations of students. Pursuing such a goal would also have
significant positive implications for academic teachers, but be contingent on
favourable contextual settings including departmental and institutional priorities.
Keywords: generic skills; higher education competencies; learning outcomes;
academic standards; higher education grades

Introduction
This article is mainly about cognitive capabilities that are important in most
academic fields: proficiency in thinking, reasoning, synthesising, conceptualising,
evaluating and communicating. These ‘higher order’ capabilities form a subset of
what are also variously called ‘intended learning outcomes’ (Biggs and Tang 2011)
or some combination of ‘generic’, ‘graduate’ or ‘higher education’ with ‘competen-
cies’, ‘skills’ or ‘attributes’. With the rapid expansion of higher education world-
wide, it is natural to ask about the extent to which all students can demonstrate
adequate levels of such ‘higher order’ capabilities by the time they graduate. But
what is meant by ‘adequate’? This is the fundamental question. A number of
agencies and commentators referenced in the next section have alleged that, while
many graduates do achieve desired standards, many others do not.
This article is based on the premise that the most logical, direct and appropriate
site for developing capabilities is within the courses that constitute degree pro-
grammes. Research by Jones (2009, 2013) has demonstrated that interpretations of
competences differ from field to field, sometimes widely. This is the nature of
disciplines. However, there are reasonable grounds for believing that capabilities
developed thoroughly in one context – a particular course or sequence of courses –
normally have a transferable element to them. This allows them to be reconfigured

*Email: d.sadler@uq.edu.au

© 2015 Taylor & Francis


1082 D. R. Sadler

and repurposed for use in other contexts at other times. As Strathern (1997, 320), an
anthropologist, explained it:
In making transferable skills an objective, one cannot reproduce what makes a skill
work, i.e. its embeddedness … what is needed is the very ability to embed oneself in
diverse contexts, but that can only be learnt one context at a time … if you embed
yourself in site A you are more likely, not less, to be able to embed yourself in site B.
But if in Site A you are always casting around for how you might do research in B or
C or D, you never learn that. There is a lesson here for disciplines … Somehow we
have to produce embedded knowledge: i.e. insights that are there for excavating later,
when the context is right, but not until then … we have not to block or hinder … the
organism’s capacity to use time for the absorption of information … time-released
knowledge or delayed-reaction comprehension. [Capitalization in the original]
Reforming three particular assessment practices would increase the likelihood that
more students, especially those currently at the minimum ‘pass’ level, would achieve
the levels expected of all graduates. The three form a mutually interdependent pack-
age. They are the design and specification of assessment tasks; the requirements for
a pass; and the design of course assessment programmes. Wherever these are not
currently being practised as aspects of normal institutional quality assurance, they
amount to reforms that require enabling changes to be made elsewhere in the
learning environment.

Context
Two widely read books by Bok (2006) and Arum and Roksa (2010) describe
unevenness in graduate outcomes as perceived in the USA. Bok (2006, 7–8) wrote
‘Survey after survey of students and recent graduates shows that they are remarkably
pleased with their college years’. Overall, they also ‘achieve significant gains in
critical thinking, general knowledge, moral reasoning, quantitative skills, and other
competencies’. At the same time and fully compatible with that, ‘colleges and uni-
versities, for all the benefits they bring, accomplish far less for their students than
they should. Many seniors graduate without being able to write well enough to sat-
isfy their employers’ (8) by expressing themselves ‘with clarity, precision, … style
and grace’ (82). ‘Many cannot reason clearly or perform competently in analysing
complex, nontechnical problems, even though faculties rank critical thinking as the
primary goal of a college education’ (8). ‘The ability to think critically – to ask
pertinent questions, recognise and define problems, identify the arguments on all
sides of an issue, search for and use relevant data, and arrive in the end at carefully
reasoned judgments – is the indispensable means of making effective use of
information’ (109).
Here, Bok has raised quite specific concerns. They may be valid to a greater or
lesser extent for particular institutions, academic degree programmes or component
courses – there is usually no independent way of telling. However, his portrayal of
the situation in the USA resonates with similar concerns raised in other countries.
These are reflected in the number of national and international discussions, policies,
projects, regulations, instruments and forms of cooperation aimed at assuring gradu-
ate outcomes (Australian Learning and Teaching Council 2010; Bergan and Damian
2010; Blömeke et al. 2013; Coates 2014; Dill and Beerkens 2013; Douglass,
Thomson, and Zhao 2012; Lewis 2010; Sadler 2013b; Shavelson 2010, 2013;
Tremblay 2013; Williams 2010). Part of the overall unease is because, globally,
higher education has expanded rapidly without matching increases in public funding
directed specifically towards teaching.
Despite what may seem an overwhelming challenge, progress could be made by
ensuring that the course grades entered on students’ academic transcripts can be
trusted to represent adequate levels of the expected graduate competencies. Across a
full degree programme, the transcript reports student performance on a large range
of demanding tasks, in a wide variety of courses, studied over a considerable period
of time, and covering substantial disciplinary and professional territory. Specialised
tests of graduate competencies are not set up to do this (Shavelson 2013). If third
parties are to draw reasonably robust conclusions about a graduate’s acquired overall
capability or competence, the grades on transcripts must be trustworthy.

Reform 1: Assessment task design and specifications


Grading a student’s performance involves drawing an inference from what the stu-
dent produces. The quality of the inference depends on several factors, two obvious
ones being the quality of the data (the student production) and the ability of the
assessor. The quality of the data is the focus here. Ordinarily, students respond to
assessment items. An ambiguous item is unlikely to give rise to good-quality data
because different students will most likely interpret the item differently. This is why
the stimulus needs to be both well designed and clearly specified. It must set up a
fresh problem to be solved, a question to be answered, an issue to be addressed, or a
position to be critiqued or defended. Students may or may not be able to do the task
well, but at least there is no excuse for not knowing the type of response required.
To make this concrete, consider this poor assessment task: ‘Write an essay on
directive and supportive leadership styles’. Any student who simply writes separate
detailed accounts of the two leadership styles technically fulfils the requirements of
the task description. However, high performing students delve deeply into a topic as
a matter of course. They may, for example, describe the two leadership styles briefly
but then go on to analyse similarities, differences and the superior fit of one of the
styles for a particular purpose. These students’ comprehensive understanding of both
the topic itself and the assessment context leads them to be analytical rather than
descriptive, and high marks typically follow. Regardless of the actual form of the
assessment task specifications, examiners and markers find that student responses
generally range from low to high quality for any reasonably sized student group. In
some examiners’ eyes, this range would be sufficient evidence to conclude that the
structure and content of the assessment task is unproblematic, thus reinforcing the
status quo. That reasoning is faulty.
An example of better design for the leadership styles task would be to set up
some scenario involving two particular types of organisations (say, a voluntary
association and a business employing mostly casual staff). Ask students to explain
which leadership style, or which aspects of each, might suit the two organisations.
Making the intention clear in this way makes separate descriptions unnecessary,
because how well students know the two styles will be evident in their responses.
The improved design also makes it reasonable to hold all students to the task
requirements. This is an important consideration if the ‘evidence’ of achievement is
not to be compromised by poor item structure: ‘Poor quality evidence of a stu-
dent’s... achievement must not be confused with evidence of poor achievement’
(Sadler 2014b, 286).

In general, tasks need to stimulate higher order thought processes such as
hypothesising; extrapolating (or interpolating); exploring and articulating relation-
ships among things; estimating the likely effects of varying the parameters of a sys-
tem; redesigning something to suit a new purpose; using analogues as explanatory
tools; outlining and defending a scenario; and evaluating inadequacies or errors in
solutions or arguments. Given the huge variety of expected outcomes in different
disciplines, fields and professions, academics in those fields are best placed to deter-
mine the nature of well-formed questions that push the students into the right
amount of unfamiliar territory.
Ideally, the task specifications identify for all students the genre of response
required. Critical reviews, arguments, underlying assumptions that have to be identi-
fied, and causal explanations are all distinct response genres (Sadler 2014a). This
does not mean that students should be given copious instructions on how to go
about the task or detailed rubrics and statements of criteria and standards of the type
often recommended (Grunert O’Brien, Millis, and Cohen 2008). It means they have
the right to know the genre for their response. It is both illogical and counterproduc-
tive to appraise the quality of a student work as a member of a particular genre if
the work is not actually a member of that genre (the concept of ‘response genre’ is
not identical with ‘writing genre’ as Gardner and Nesi (2013) use the term in
connection with teaching academic writing to students).
Creating demanding assessment tasks from scratch is hard work if the tasks are
to tap into higher order operations on ideas and information. A straightforward way
to proceed is to collect a broad range of existing tasks that require students to con-
struct responses of considerable length. Sources include previous assignment tasks,
project descriptions and examinations in the field. Similar material from related
discipline fields may also prove useful for ideas. Academics, individually or in
groups, generally can, without special tuition or much difficulty, scrutinise the mate-
rials, broaden their own insights, and differentiate them according to quality. In so
doing, they expand their own understanding of the possibilities and can decide
which to avoid, emulate or adapt to suit their own context and purpose. They can
also imagine themselves as students faced with responding to particular task
specifications, trying to figure out how they would proceed.
Potential sources also include real-life problems in the relevant field. These may
be of special value in assessing graduate capability late in a degree programme.
Although it may not be feasible to deal with the complexities of the full problem in
its context, doing away with unnecessary detail has to be balanced against the cost
of providing students with experience in deciding for themselves what is necessary
and what may be safely discarded to make the problem amenable to solution (Taylor
1974).

Iterative improvement of task specifications


Professional test developers routinely engage in revising task designs and specifica-
tions in the light of experience. In higher education, a simple but revealing check on
whether an assessment task actually requires higher order thinking and production is
to ask one or two competent others. They should interpret the wording of the speci-
fications literally and either indicate the absolute minimum that could be done to
complete the task, or, better still, actually attempt the task itself. A more thorough
check is to compare task specifications with actual student works or performances
and analyse how students responded to the tasks. This process is passive and
distinctly different from that used to score responses. What is sought is at least a
partial diagnosis of any deficiencies in the task design or specifications. Where at
least some responses technically do fall within a literal interpretation but are much
simpler than was intended, it may not have been imagined that such interpretations
would be possible. At the opposite extreme is a response that really ‘captures’ what
was intuitively hoped for but not fully conceptualised when the specifications were
written. Capitalise on that for the future.
The final check is to consult students themselves (Hounsell 1987), rather than try
to infer what they ‘must have been thinking’ as they went about the task. This is the
only independent source that can confirm or disconfirm their understanding or reac-
tions (Alderson 1986). What went on in their heads while they were working out
how to respond to the task, and then during the planning and production phases?
Were they surprised by how the quality of their work was appraised?

Intended learning outcomes


To digress briefly, it might be thought at this point that assessment task design should
start with statements of course objectives, graduate capabilities or intended learning
outcomes. Biggs and Tang (2011) recommended this as foundational to achieving
what they termed ‘constructive alignment’. The treatment above, however, began
directly with consideration of assessment task design and specification. The rationale
for this is as follows. Statements of objectives typically use abstract terminology to
frame higher order cognitive competencies such as ‘critical analysis’, ‘problem-
solving’ and the like. These terms are open to wide interpretation (Weissberg 2013),
and adding more words cannot solve the problem. The explanation can be found in
Sadler’s (2014b) parallel argument about the impossibility of expressing academic
achievement standards in verbal or other codified form. The same reasoning applies
to learning outcomes. The key terms in the language used cannot be interpreted
unambiguously. They ‘float’ according to context because they have imagined rather
than concrete referents. On the other hand, assessment tasks and specifications are
material formulations that can be exhibited, argued about and administered. They
provide the sharpest and most direct tool available for discussing, clarifying and
communicating course intentions for students and academics alike.

Reform 2: Grading at the ‘Pass’ level


Many of the objects, products and processes used in everyday life have their quality
governed by external standards that are set by some recognised authority, and shar-
ply discriminate pass from fail. Independent licensed testing agencies apply these
standards using calibrated testing procedures. No corresponding infrastructure exists
for marking, grading and reporting course-based student achievement in higher
education. Exactly what ‘standards’ and ‘comparability’ mean receives relatively lit-
tle attention. Yet markers constantly need to make sound judgments about the qual-
ity of work in order to infer underlying competence or capability. A central issue is
where to pitch the course grade boundaries. An especially important one is the lower
boundary for a passing grade, because that usually determines whether credit will be
granted towards the degree. Where should that lower boundary be set so that, when
all courses are taken together, the result satisfies discipline-based expectations,
professional accreditation requirements and the capabilities society expects of all
higher education graduates?
In ordinary conversation, something is said to be ‘passable’ when it is adequate
or satisfactory for the purpose. The speaker initially assumes – and hearers could
clarify if they need to – what ‘adequacy’ means in the context. Sometimes, tone of
voice can indicate that the requirements must technically be met, even if only just.
The following has been distilled from several existing definitions:
Pass (v): to demonstrate attainment, achievement or proficiency at or exceeding a level
accepted as satisfactory but not necessarily of the highest level; to satisfy fully a mini-
mum agreed performance requirement; to show sufficiency or adequacy to purpose; to
meet expectations, conform to specifications or reach some fixed and approved
standard.
How do institutions conceptualise what should count as a pass? Some clues can be
found in their published grade descriptors, where these exist, although the statements
may not necessarily correspond closely with actual grading decisions. Consider the
five statements in Table 1 outlining what a pass represents in five different institu-
tions, all obtained from their web sites. All use the word ‘pass’ explicitly as a grade
label or refer to a ‘pass’ in associated documentation. In some cases, conditions
apply. For example, the number of courses that can be passed at the minimum level
and also credited towards a degree may be strictly limited.
Table 1. Five grade descriptors for the lowest level of achievement in a course for which credit can be counted towards the degree. Conditions may apply.

1. 50–59 Pass*: Satisfactory. Demonstrates appreciation of subject matter and issues. Addresses most of the assessment criteria adequately but may lack in depth and breadth. Often work of this grade demonstrates only basic comprehension or competency. Work of this grade may be poorly structured and presented. (Monash University)
2. D (D+, D, D−)*: Earned by work that is unsatisfactory but indicates some minimal command of the course materials and some minimal participation in class activities that is worthy of course credit toward the degree. (Harvard University, College of Arts and Sciences)
3. 40–49 3rd Pass*: Acceptable attainment of most intended learning outcomes, displaying a qualified familiarity with a minimally sufficient range of relevant materials, and a grasp of the analytical issues and concepts which is generally reasonable, albeit insecure. (University of Stirling)
4. E*: Sufficient. A performance that meets the minimum criteria, but no more. The candidate demonstrates a very limited degree of judgement and independent thinking. (University of Oslo)
5. D*: Deficient in mastery of course material; originality, creativity, or both apparently absent from performance; deficient performance in analysis, synthesis, and critical expression, oral or written; ability to work independently deficient. (Dartmouth College)

*Grade code as entered on academic transcript.

In these statements, expectations range from concessions to students who stay the full length of courses but may actually learn very little, through to notionally adequate levels of capability. The statements also reveal open tolerance of low levels of performance on higher order objectives (specifically, the ability to make sound judgments, act independently, engage in analysis and communicate clearly), and specific endorsement of participation in class towards course credit. Participation
is not strictly an element of ‘achievement’ or ‘competence’ at all. Taken together,
these grade descriptors send mixed messages about what it means to ‘pass’.
Although these formal grade descriptors indicate particular orientations, the defini-
tive measure of the adequacy of an institution’s standards is whether the lowest-
performing students who gain credit for a course achieve higher order objectives to a
sufficient degree. In the case of written responses, that includes the quality of writing.
This can be determined only by scrutinising student responses to well-constructed
assessment tasks. If a grade of D is officially the lowest on the credit-earning scale
but all students gain at least a B−, the salient issue is whether the work awarded a
B− deserves credit in terms of higher order outcomes. At the upper end of the grade
scale, the issue is whether all students who gain the highest available grade really do
demonstrate excellence or a high level of distinction.
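
To make this check concrete, the sketch below (Python; the grade scale, official floor and cohort marks are hypothetical, not taken from any institution above) computes the effective credit floor for a course, that is, the lowest grade actually awarded, which identifies the student work that should be scrutinised against higher order outcomes:

```python
# Minimal sketch: find the effective credit floor in a cohort's awarded grades.
# The grade scale, official floor and sample cohort are all hypothetical.

GRADE_ORDER = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]
OFFICIAL_FLOOR = "D"  # lowest grade on the credit-earning scale

awarded = ["B-", "A", "B", "B+", "A-", "B-", "B"]  # grades actually given

# Keep only credit-earning grades, then find the lowest one actually awarded.
credited = [g for g in awarded
            if GRADE_ORDER.index(g) >= GRADE_ORDER.index(OFFICIAL_FLOOR)]
effective_floor = min(credited, key=GRADE_ORDER.index)

print(f"official floor: {OFFICIAL_FLOOR}; effective floor: {effective_floor}")
# -> official floor: D; effective floor: B-
```

Here the salient work to examine is the B− work, exactly as argued above, because it is the weakest performance for which credit was in fact granted.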
This is not the end of the story, and key questions still need to be asked: What is
meant and implied by acceptable standards or to a sufficient degree? How can
appropriate standards be set collaboratively so as to reflect a broad consensus? What
is required to give course grades integrity and currency across courses, programmes
and institutions? How may standards be given material form so they can remain
stable reference points over time? These have been at least partially addressed both
theoretically (Sadler 2013a, 2014b) and in field trials (Watty et al. 2014).

Reform 3: Redesigning course assessment plans


This reform is about the timing, purpose and structure of assessment during and at
the end of a course. Specific aspects are the practice of combining marks awarded
during a course with those awarded at the end; ensuring that assessment during a
course functions formatively; and changing the parameters of summative assessment
for grading.

Accumulation of marks
In theory, a course grade is meant to represent a student’s level of capability attained
by the end of a course: ‘grading … is the assignment of a letter or number to indi-
cate the level of mastery the student has attained at the end of a course of study’
(Schrag 2001, 63). It is literally the out-come that goes on record. This is entirely
consistent with the customary (and legitimate) way of expressing intended learning
outcomes: ‘By the end of this course, students should …’ Whether the actual path
of learning is smooth or bumpy, and regardless of the effort the student has (or has
not) put in, only the final achievement status should matter in determining the course
grade (Sadler 2009, 2010b). However, in many higher education institutions,
accumulating marks or points for work assessed during a period of learning (con-
tinuous assessment) is the prevailing practice, mandated or at least endorsed by the
institution. Readily available software provides bookkeeping tools for it. These make
it easy to progressively ‘bank’ marks, then weight and process them at the point of
withdrawal for conversion into the course grade.
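
To make the bookkeeping concrete, the following minimal sketch (Python; the component names, weights and scores are hypothetical and imply no particular institution's scheme) shows the banking model at work:

```python
# Minimal sketch of mark banking: components accumulated during the
# teaching period, then weighted and summed into a final course mark.
# All names, weights and scores are hypothetical.

banked = {
    "quiz_1":        {"score": 8,  "out_of": 10,  "weight": 0.10},
    "participation": {"score": 9,  "out_of": 10,  "weight": 0.10},  # behaviour, not achievement
    "assignment_1":  {"score": 55, "out_of": 100, "weight": 0.30},  # early, pre-mastery attempt
    "final_exam":    {"score": 82, "out_of": 100, "weight": 0.50},  # end-of-course performance
}

def course_mark(components):
    # The conversion done "at the point of withdrawal": convert each banked
    # score to a percentage, weight it, and sum.
    return sum(100 * c["score"] / c["out_of"] * c["weight"]
               for c in components.values())

print(f"final mark: {course_mark(banked):.1f}")  # -> final mark: 74.5
```

The single figure of 74.5 blends a participation mark and an early pre-mastery score with end-of-course performance; as the argument below develops, nothing in that number lets a reader recover the achievement status actually attained by course end.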
The common arguments for accumulation are essentially instrumentalist
(Isaksson 2008). The purpose is not so much to help learners attain adequate levels
of complex knowledge and skills by the end of a course, as to keep them working
and provide multiple opportunities for feedback. In any case, so it is argued,
students need, expect, appreciate and thrive under continuous assessment
(Hernández 2012; Trotter 2006). However, notwithstanding its superficial appeal,
accumulation actually diverts attention from the goal of achieving a satisfactory
level by course end.
First, accumulating performance measures during the learning period maps the
shape of the actual learning path into the grade (Sadler 2010b). In general, the context
and actions of both teacher and learner influence the rate and depth at which learning
occurs. For many students, coming to grips with and then overcoming false starts,
errors, bumbling attempts and time spent going up blind alleys lead to deep under-
standings by the end of a course. Students can take bold risks that end in disasters and
safely ‘make’ conceptual connections that later have to be unlearned. For well over a
century, the role of spacing during the total time available for developing high-order
knowledge and skills has been extensively studied. This research provides robust find-
ings on how humans learn (Bloom 1974; Budé et al. 2011; Conway, Cohen, and Stan-
hope 1992; Ebbinghaus 1885; Rohrer and Pashler 2010). This is ‘especially marked
in sequential learning in which competence is attained only after a series of learning
experiences that may take months or years to complete before the learner has devel-
oped a satisfactory degree of attainment in the field’ (Bloom 1974, 682).
Second, accumulation lends itself to awarding and banking marks for a variety
of non-achievements for the purpose of influencing student behaviour. Marks are
used to incentivise and reward student effort, engagement in preferred activities,
completion of exercises or work stages, and participation. These behaviours and
activities may well assist learning, but they do not constitute the final level of
achievement, or even part of it. On the debit side of the ledger, marks may be
deducted to penalise late submission, cheating or plagiarism. The cost of using
marks to modify behaviour is contamination of the grade. Other ways have to be
found. Quite apart from behaviour management, many students insist they have a
moral right for aspects other than unadulterated achievement to be included in their
grades (Tippin, Lafreniere, and Page 2012; Zinn et al. 2011). Overall, the banking
model takes data from non-achievement contaminants, early ‘deficits’ and idiosyn-
cratic paths of learning, and mixes them all into the final grade. The grade is then
logically impossible to disentangle and hence interpret (Brookhart 1991; Sadler
2010b). Equally serious is that no coherent concept of a ‘standard’ can apply to such
a mishmash of data.
Finally, although accumulating marks may succeed in motivating and focusing
student effort, the pressure and drive typically ease off once the ledger balance
approaches the pass score cut-off. This allows students to sidestep the challenge of
gaining a command over the course as a whole, especially its higher order
objectives. Put another way, accumulation invites students to valorise externally
offered proximate goals at the very time that the eventual goal should be kept front
and centre in their minds. A person’s perspective on the fullness of the eventual goal
to be achieved, or the central purpose to be served, can play a determinative role in
how they approach and manage their own learning, and the task of becoming
competent (Entwistle 1995; Sadler 2014a; Sommers 1980). A steady stream of
extrinsic rewards is a poor substitute for developed intrinsic rewards where students
take primary responsibility for their own learning. Extrinsic rewards work directly
against the students-as-learners maturation process in which they progress towards
becoming independent, self-directed, lifelong learners.

Designing during-course formative assessment


When the drag imposed by accumulating low or irrelevant marks is eliminated, dur-
ing-course tests and assessments are freed up to function purely formatively.
‘Purely’ indicates high stakes for learning but zero influence on the summative
grade. True, in a broader context, students may use information about end-of-course
performance in one course to improve their performance in subsequent courses, but
that is a different issue. Within a single course, formative and summative assess-
ments need to be clearly separated so that they can serve their respective purposes.
Given a set of course objectives, formative assessment is commonly viewed nar-
rowly as giving students assessment tasks and then feedback so that they can
improve (Nicol and Macfarlane-Dick 2006). Despite all the effort typically invested
in creating better and better feedback, it too often makes practically no difference.
Sadler (2010a, 2013c) argued that the principal reason is that feedback is basically
about telling students – the transmission model of teaching transposed into an
assessment setting. The alternative is to offer students formative assessment oppor-
tunities that provide authentic evaluative experience of the type they need in order
to become better able to recognise, monitor and control the quality of works they
themselves are to produce. This matter is discussed at length by Sadler (1989,
2010a, 2013c, 2014a).
Students need to be exposed to a variety of complex tasks and their correspond-
ing response genres. This immerses them in decision spaces that are similar to those
inhabited by marker-teachers in which judgments are made about whether a particu-
lar work falls within the required response genre, and if so, the macro-level and
micro-level determinants of its quality. Students need to become competent not only
in making judgments about their own works, but also in defending those judgments
and figuring out how those works could have been made better. This involves learn-
ing to notice aspects that make a difference to quality, and to pass over those that
make only negligible difference. In other words, they need practice in appraising
works holistically, so that they come to understand how the appropriate use of smal-
ler scale tactics enables a larger scale purpose to be accomplished. ‘This type of
“seeing” typically goes unrecognised in most of the research on assessment for
learning, where the focus has been on feedback’ (Sadler 2013c, 58).
Part of the agenda is specifically and deliberately to induct students into appre-
ciating the types and ranges of problems, issues or questions that could legitimately
be set as assessment tasks in the course. Multiple assessment tasks that demand
complex cognitive and other capabilities serve multiple purposes that include con-
veying the intended learning outcomes for the course and equipping students for
summative assessment at the end of the course. Students need to be challenged with
problems that develop, activate and coordinate the same cognitive processes and
professional skills they will need as graduates. By definition, capability implies the
freedom, versatility and adaptability to tackle successfully problems that have not
been delineated or anticipated in advance, and to do so on demand, unaided and to a
satisfactory standard. There are not just a handful of stereotypic problems or types
of tasks that characterise the course, but a wide range of possibilities that entail
diverse cognitive and practical skills in different combinations. Research by
Entwistle (1995) showed that the best preparation for course examinations comes
about only by having a thorough grasp of the whole course. Sound assessment
plans, tasks and specifications are crucial to this.

The choice of assessment task format is an important meta-parameter in the
design of course assessment programmes, both formative and summative. Extensive
use of multiple choice tests reduces – if not eliminates altogether – the number of
written prose responses, and with it a valuable opportunity to develop competence
in discipline-focused writing. Creating precise and cogent prose promotes high-level
learning primarily because it requires ‘careful, probing thought’ (Bok 2006, 103). In
her classic 1977 article, Emig wrote that ‘Clear writing by definition is that writing
which signals without ambiguity the nature of conceptual relationships, whether they
be coordinate, subordinate, superordinate, causal, or something other’ (126). In
Sternglass’s research, students repeatedly reported that ‘Only through writing [papers
of a type that] … required them to integrate theory with evidence did they achieve
the insights that moved them to complex reasoning about the topic under considera-
tion’ (1997, 295). Bok (2006), Zorn (2013) and many others have argued that the
best site for developing good writing is within the disciplines themselves, not
separately as a specialist activity.

Re-inventing end-of-course summative assessment


The plan for summative assessment in a course amounts to more than just ensuring
that each assessment task is well designed and specified. A basic tenet of assessment
is that the evidence of academic achievement should be unquestionably the student’s
own work. Common threats to the integrity of achievement data include cheating,
collusion, plagiarism, outsourcing term papers and using substitute test takers.
Increasingly sophisticated digital technologies and telecommunications have con-
tributed to the problem (Park 2003; Walker 2010). The traditional way of satisfying
the secure data requirement has been to use previously unseen assessment tasks in
invigilated, time-restricted written examinations. These assessment formats typically
fail to take advantage of the technologies and tools of production currently used in
most workplaces – use of keyboard input, office productivity software, the internet
and web searching. Exploring ways to address this is an active area of research.
Williams (2006) and Williams and Wong (2009), for instance, have trialled forms of
open-book, open-web assessments.
Composing text and manipulating data using a keyboard (rather than pen and
paper) are now so common that these tools should be readily available for candi-
dates. Editing of text, in particular, has the potential to improve learning during test-
ing. Sommers (1980) highlighted the special role that editing and revising one’s
work can play in creating and clarifying meaning:
experienced writers seek to discover (to create) meaning in the engagement with their
writing, in revision. They seek to emphasize and exploit the lack of clarity, the differences
of meaning, the dissonance that writing as opposed to speech allows in the possibility of
revision. Writing has spatial and temporal features not apparent in speech – words are
recorded in space and fixed in time – which is why writing is susceptible to reordering and
later addition. Such features make possible the dissonance that both provokes revision and
promises, from itself, new meaning. (386)
Additional concerns about traditional examinations have their roots in typical exam-
ination conditions. Students often experience considerable stress because of both the
strict time limits and the ‘summary’ nature of high stakes, make-or-break events. In
some cases, medical researchers have explored coping strategies and the possible
use of medication (Edwards and Trimble 1992). Removing or relaxing problematic
examination conditions could well include making time limits generous (within
reasonable limits) and allowing review time and re-examination (with an accompa-
nying fee if necessary). If it is objected that all students in a course should perform
under identical conditions, the reply is straightforward. Students with special needs
typically have accommodations made for them, but within any course, some stu-
dents may be just below the threshold at which special accommodations would
apply. In addition, the quality of a student’s response as appraised against standards
rather than against other students’ work is a clearer indicator of their capability than
the speed of task completion.
Two observations apply regardless of the mode or medium of response: effi-
ciency and sampling. An efficient plan results in high levels of valid achievement
information relative to the costs of getting it – including time in setting and marking
student work, and administrative overheads. Appropriate sampling involves cover-
age across both the course subject matter (a preoccupation with many examiners)
and the range of relevant intended higher order outcomes. These two together are
somewhat analogous to evaluating the economic potential of a mineral deposit by
drilling a series of cores into a prospective ore body to test its lateral extent and its
richness (Whateley and Scott 2006). Emphasising depth in thinking and precision in
expression may well result in higher quality but more condensed outputs.
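
As a concrete illustration of the sampling requirement, the sketch below (Python; the task, topic and outcome names are hypothetical) audits a summative plan for gaps in coverage across both subject matter and intended higher order outcomes:

```python
# Minimal sketch: audit a summative assessment plan for sampling gaps.
# Task names, topics and outcome labels are hypothetical.

tasks = {
    "exam_q1": {"topics": {"leadership_styles"}, "outcomes": {"analyse", "evaluate"}},
    "exam_q2": {"topics": {"motivation"},        "outcomes": {"hypothesise"}},
    "project": {"topics": {"org_design"},        "outcomes": {"synthesise", "communicate"}},
}

course_topics = {"leadership_styles", "motivation", "org_design", "group_dynamics"}
higher_order  = {"analyse", "evaluate", "hypothesise", "synthesise", "communicate"}

covered_topics   = set().union(*(t["topics"]   for t in tasks.values()))
covered_outcomes = set().union(*(t["outcomes"] for t in tasks.values()))

print("topics not sampled:  ", course_topics - covered_topics)    # {'group_dynamics'}
print("outcomes not sampled:", higher_order - covered_outcomes)   # set()
```

A plan that leaves a whole topic, or a whole class of higher order outcome, unsampled cannot support the drill-core style of inference described above.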

Implications for students


In many higher education institutions, reforms along the lines sketched out above
would shift a significant measure of responsibility from the educational environment
(teacher, programme director, resources, technology and the institution) to the stu-
dents themselves. For students to rise to the challenge of passing each course with-
out concessions of any kind, they have to set their priorities, and coordinate the
resources over which they have control (prior knowledge, personal talent, effort and
time) in order to gain credit towards their degrees. For that, they need a clear sense
of ‘future-mindedness’ or ‘prospection’ (Osman 2014; Seligman et al. 2013). This
makes sense for the student only when the socially constructed context or ‘order’ in
which they will live when they finish their degrees is sufficiently stable or secure for
them to know all the effort will have been worthwhile (Schatzki 2001). In the short
term, this involves attending to course achievement goals as they come, and, for
each, a sense of agency over personal performance.

Goal setting
Extensive research over several decades in a wide variety of field and laboratory set-
tings has investigated the impact that so-called hard goals have on task performance.
Progressive reviews of this work are available in Locke et al. (1981), Locke et al.
(1990), and the first and last chapters of Locke and Latham (2013). Hard goals are
specific and clear rather than general or vague, difficult and challenging rather than
simple or easy, and closer to the upper limit of a person’s capacity to perform than
was their initial level of performance. Goals that require students to stretch for them
generally lead to substantial gains in performance. They act to focus attention, mobi-
lise effort and increase persistence at a task. In contrast, do-your-best goals often
fare little better than having no goals at all. As one would expect, the degree of
improvement is moderated by other factors, including the complexity of the task, the
learner’s ability, the strategies employed and various contextual constraints (Locke
et al. 1981). However, the general conclusion is that ‘an individual [cannot] thrive
without goals to provide a sense of purpose … If the purpose is neither clear nor
challenging, very little gets accomplished’ (Locke and Latham 2009, 22).
Arranging the learning environment so that all students have an adequate grasp
of the higher order outcomes stated in course outlines is a clear imperative for uni-
versities and colleges. Setting standards that some students initially see as tough –
and possibly even unfair or coercive, depending on their initial expectations – is part
of that. Serious students adapt pragmatically to hard constraints provided the settings
are known, fair and relevant. The consequences of a hard-earned pass are highly
positive in terms of both credit towards the degree and personal sense of accom-
plishment. Carried out ethically, hard goals work constructively for the student in
both the short and the long term (Sadler 2014b).

Student sense of agency


The nub of student agency is the belief that, in the matter of passing courses and
gaining credit, one is significantly responsible for one’s own learning and achieve-
ment. This is captured nicely by Pacherie (2008, 195), who drew a distinction
between a long-term and an ‘occurrent’ sense of agency. The long-term sense is ‘a
sense of oneself as an agent apart from any particular action... a form of self-narra-
tive where one’s past actions and projected future actions are given a general coher-
ence and unified through a set of overarching goals, motivations, projects and
general lines of conduct’. The occurrent sense is that which ‘one experiences at the
time one is preparing or performing a particular action’. Pacherie was not writing
specifically about higher education, but Lindgren and McDaniel (2012) were:
‘Agency can shape both the process and the outcomes of student learning … People
are more driven to achieve the agendas they set for themselves’ (346). The three
reforms outlined above, including formative assessment implemented according to
sound principles, can contribute to the growth of student agency in learning, not
only by imposing rigorous but reasonable conditions for students to succeed, but
also by providing effective developmental support.
Being vividly aware that one is in control of one’s actions brings with it a per-
sonal sense of responsibility. Frith (2014) summarised an ancient Hellenistic per-
spective on this, which in essence is that one’s sense of agency is developed
through two factors. The first is the cognitive binding that links one’s intentional
action (say, considerable effort) to its outcome (passing the course). The second is
the belief that an alternative action (investing little or no effort) would have led to a
different outcome (fail) accompanied by an experience of regret. The second part of
this is known as ‘counterfactual reasoning’, because, although it is valid to think this
way, it is essentially hypothetical, being contrary to what actually happened (Roese
1997). If the likelihood of failure in a course is low or non-existent, the sense of
agency is weakened or disappears altogether.
For students to gain clarity on a complex course-based achievement goal – some-
thing radically different from trying to improve by, say, one grade – they must
understand what high-level achievement looks like and experience for themselves
what reaching it entails. Overall, students need to see and appreciate the purpose to
be served, experience success in moving towards its attainment, and be motivated,
with grit and determination, to follow through to completion.

Genuine achievement for which a student works hard and produces a high
quality result brings about levels of fulfilment and confidence that come only from
possessing deep and thorough knowledge of some body of worthwhile material, or
attaining proficiency in high-level professional skills. The terms pleasure, satisfac-
tion, motivation and accomplishment have many nuanced and overlapping mean-
ings, but there is little doubt about the legitimacy of ‘pleasure as a by-product of
successful striving’ (Duncker 1941, 391). This is categorically different from, in the
modern context, having satisfying experiences in the classroom (although the two
may co-occur) or experiencing success in winning against others. For some students
more than others, developing this type of personal capital demands substantial
striving and struggling – and induces considerable stress. However, little by way of
significant and enduring learning comes cheaply, and experiencing success at some-
thing that was originally thought to be out of reach brings a distinctive personal
reward, a palpable sense of accomplishment. Not to insist on a demonstration of an
adequate level of higher order capabilities is to deprive students of both an important
stimulus to achieve and the satisfaction of reaching a significant goal.

Inhibitors of change
Some inhibitors are conceptual in nature. One of these consists of the multiple
meanings attached to the term ‘standard’. Add to that a limited awareness of the
need for externally validated anchorage points for standards generally – and passing
grades in particular (Sadler 2011, 2013a). Others have to do with assessment prac-
tices that detract from the integrity of course marks and grades. Some have been
criticised in the literature for decades (Elton 2004; Oppenheim, Jahoda, and James
1967; Sadler 2009), but they are now so deeply embedded in assessment cultures
they are resistant to change. In addition, new practices keep coming along and are
added incrementally. Accepted uncritically, these often become popular through
being labelled as ‘innovative’ or ‘best practice’. They are defended strongly by aca-
demic teachers, students and administrators, and may even be mandated in institu-
tional assessment policies. Accumulating marks is but one example. That such practices reduce the integrity of course grades goes largely unheralded.
Whether hard goals are actually set and enforced depends on a variety of other
factors as well, some of which are related to the grading dispositions of individual
academics. At successively higher levels in the chain of authority, the freedom of
academics to make significant changes depends on: an enabling and supportive con-
text provided by academic department heads and programme directors; the fixedness
of the prevailing assessment traditions, grading policies and academic priorities; and
requirements externally set by governments or accrediting agencies.

Internal momentum and culture


Recent trends in higher education assessment practice have included: minimising
the proportion of achievement data that is secure; allowing lower-order outcomes to
be substituted for higher order as the minimum for course credit; programming the
learning path so it is presented in small manageable self-contained steps to facilitate
smooth, painless progress from step to step; and markers reading between the lines
of poorly composed written work accompanied by making generous inferences as to
the students’ level of understanding. The underlying drive is to ensure the least
possible discomfort or stress for students (Fiamengo 2013).
At the institutional policy level lie: curriculum freedom that allows students the
flexibility to pick and choose courses from a wide range to make up a substantial
part of a degree programme; and credit transfer policies and recognition of prior
learning that impose few restrictions. These decrease the effectiveness of coherent
sequences of courses specifically designed to promote development of higher order
outcomes, which require considerable time and multiple encounters to mature prop-
erly. Institutional factors are also influenced by financial considerations, particularly
continuity in total income from student fees and government funding. In principle,
assuring the quality of all graduates, maintaining student entry levels and ensuring a
satisfactory enrolments-based income stream are not incompatible. However in prac-
tice, academic achievement standards can be compromised to avoid rebalancing
internal resource allocations to prioritise teaching.
At the scale of individual academics, the following statements all have their
origins in conversations with academics in universities in different countries. They
reveal a range of problematic dispositions and attitudes related to passing courses.
‘Many students are low-entry or from disadvantaged backgrounds. They have lim-
ited ability to achieve well on higher order outcome measures, but they nevertheless
benefit greatly from the experience of higher education’. ‘Students who put in sub-
stantial effort no doubt learn something of importance in the process and therefore
deserve to pass’. ‘While it is disappointing when students submit low-level work,
there is no guarantee those students would gain employment directly in the fields of
their degrees anyway’. ‘Students who fail courses suffer adverse personal and social
consequences, such as loss of face, additional fees and delay to graduate earnings,
so avoiding failing grades is important’. ‘When students have to pay substantial fees,
they expect to pass and in any case would appeal against failure’. ‘All grade results
are reviewed by the Assessment Review Committee and, with very few exceptions,
approved without amendment’. ‘Consistent with the principle of academic freedom,
professors must be free to decide, according to their own professional judgments,
the grades to be assigned’. ‘Creative ways are found for students to earn enough
marks for them to at least pass, with scaffolding and active coaching there to help’.
‘Students these days need a qualification even if it means they are not truly qualified
at the end. In any case, graduates learn most of what they need to know after
graduation’. ‘Cutting out cumulative assessment and instead, grading according to
serious standards would produce high failure rates and consequential loss of income.
The institution would not tolerate that’.
Finally, ‘I know I am generous in grading, but I need to keep my teaching
evaluation scores up so I can look forward to tenure’. Whether there is a causal link
between grades and teaching evaluations is debated, but
Regardless of the true relationship between grades and teaching evaluations … that
many instructors perceive a positive correlation between assigned grades and student
evaluations of teaching has important consequences when there also exists a perception
that student course evaluations play a prominent role in promotion, salary, and tenure
decisions. (Johnson 2003, 49)
Most of these comments amount to admissions that things as they exist may not be
as they ought to be, but, by implication, not much can be done about it. Addressing
inflated pass rates at their source by raising actual achievement levels is the only
valid means of ensuring grade integrity. No amount of tinkering with other variables,
and no configuration of proxy measurements, will make the difference required.

Conclusion
In recent decades, the focus for evaluating teaching quality has been heavily
weighted towards inputs (student entry levels, participation rates, facilities, resources
and support services) and a select group of outcomes (degree completions, employa-
bility, starting salaries and student satisfaction, experience or engagement). Con-
spicuously absent is anything to do with actual academic achievement in courses.
This has allowed a number of sub-optimal assessment practices to become nor-
malised into assessment cultures. One of the consequences is that too many students
have been able to graduate without the capabilities expected of graduates, yet this is
not necessarily apparent from their transcripts.
The focus in this article is on student outcomes rather than inputs, with particular
emphasis on the higher order capabilities of students. Many students fail to master
these, yet they gain credit in course after course and eventually graduate. Directly
addressing the deficient aspects of assessment culture and practice could radically
alter this state of affairs, but it would require a transformation in thinking and prac-
tice on the part of many academics. The ultimate aim is to ensure that all students
accept a significant proportion of the responsibility for achieving adequate levels of
higher order outcomes. Bluntly put, no student would be awarded a pass in a course
without being able to demonstrate these levels. For some students, this would
necessitate a major change in their priorities. For academics, both their assessment
practices and the nature of the student–teacher relationship would change.
Undoubtedly, determination to pursue this end would have significant washback
effects on teaching, learning, and course and programme objectives, but that is
intended. The likelihood of success depends on finding a rational, ethical and afford-
able way to do it. This may require re-engineering some parts of the transition path,
creating other parts from scratch, and reworking priorities, policies and practices to
a considerable extent. In particular, it would entail rebalancing institutional resource
allocations in order to cater for student cohorts that have become much more
diversified. Except for aims geared narrowly to economic and employment con-
siderations, this goal is broadly consistent with older and many recent statements of
the real purposes of higher education.

Disclosure statement
No potential conflict of interest was reported by the author.

Notes on contributor
D. Royce Sadler is a senior assessment scholar in the School of Education, University of
Queensland. His research interests are in assessment and grading policies and practice in
higher education, especially the role of assessment in improving learning and capability,
academic achievement standards and the competence of graduates.

Sources for grade descriptors in Table 1


1. Monash University, Assessment in Coursework Units Policy: Grade Descriptors; Course label Pass P (above Credit C; below Fail, F). Accessed 26 May 2015. http://policy.monash.edu.au/policy-bank/academic/education/assessment/unit-assessment-procedures.html
2. Harvard University, College of Arts and Sciences; The Grading System. Grade label D (above C; below E). Course Requirements for the Degree: All candidates for the Bachelor of Arts or the Bachelor of Science degree must pass 16.0 full courses and receive letter grades of C− or higher in at least 10.5 of them (at least 12.0 to be eligible for a degree with honours). Additional note: ‘Grades of D+ through D− are passing but unsatisfactory grades’. Accessed 26 May 2015. http://static.fas.harvard.edu/registrar/ugrad_handbook/current/chapter2/grades_honours.html
3. University of Stirling, University Common Marking Scheme; Grade label 3rd Pass (above 2.2 Pass; below Fail-Marginal). Accessed 26 May 2015. http://www.stir.ac.uk/regulations/undergrad/assessmentandawardofcredit/
4. University of Oslo, Grading system: Grading scale with letter values. Grade label E Sufficient (above D Satisfactory; below F Fail). Accessed 26 May 2015. http://www.uio.no/english/studies/about/academic-system/grading-system/
5. Dartmouth College, Grade descriptions; Grade label D (above C; below E); Credit eligibility: Requirements for the Degree of Bachelor of Arts. II. ‘A student must pass thirty-five courses … No more than eight courses passed with the grade of D … may be counted toward the thirty-five courses required for graduation’. Accessed 26 May 2015. http://www.dartmouth.edu/~reg/transcript/grade_descriptions.html

References
Alderson, J. C. 1986. “Innovations in Language Testing?” In Innovations in Language Test-
ing: Proceedings of the IUS/NFER Conference, edited by M. Portal, 93–105. Windsor:
NFER-Nelson.
Arum, R., and J. Roksa. 2010. Academically Adrift: Limited Learning on College Campuses.
Chicago, IL: University of Chicago Press.
Australian Learning and Teaching Council. 2010. Learning and Teaching Academic Stan-
dards Project – Final Report. Strawberry Hills, NSW: Australian Learning and Teaching
Council.
Bergan, S., and R. Damian, eds. 2010. Higher Education for Modern Societies: Competences and
Values. Higher Education Series No. 15. Strasbourg: Council of Europe Publishing.
Biggs, J. B., and C. Tang. 2011. Teaching for Quality Learning at University: What the
Student Does. 4th ed. Maidenhead: Open University Press.
Blömeke, S., O. Zlatkin-Troitschanskaia, C. Kuhn, and J. Fege, eds. 2013. Modeling and
Measuring Competencies in Higher Education: Tasks and Challenges. Rotterdam: Sense
Publishers.
Bloom, B. S. 1974. “Time and Learning.” American Psychologist 29 (9): 682–688.
doi:10.1037/h0037632.
Bok, D. 2006. Our Underachieving Colleges: A Candid Look at How Much Students Learn
and Why They Should Be Learning More. Princeton, NJ: Princeton University Press.
Brookhart, S. M. 1991. “Grading Practices and Validity.” Educational Measurement: Issues
and Practice 10 (1): 35–36. doi:10.1111/j.1745-3992.1991.tb00182.x.
Budé, L., T. Imbos, M. W. van de Wiel, and M. P. Berger. 2011. “The Effect of Distributed
Practice on Students’ Conceptual Understanding of Statistics.” Higher Education 62 (1):
69–79. doi:10.1007/s10734-010-9366-y.
Coates, H., ed. 2014. Higher Education Learning Outcomes Assessment: International Per-
spectives. Vol. 6 Series: Higher Education Research and Policy. Frankfurt am Main:
Peter Lang.
Conway, M. A., G. Cohen, and N. Stanhope. 1992. “Very Long-Term Memory for Knowl-
edge Acquired at School and University.” Applied Cognitive Psychology 6 (6): 467–482.
doi:10.1002/acp.2350060603.
Dill, D. D., and M. Beerkens. 2013. “Designing the Framework Conditions for Assuring
Academic Standards: Lessons Learned about Professional, Market, and Government Reg-
ulation of Academic Quality.” Higher Education 65 (3): 341–357. doi:10.1007/s10734-
012-9548-x.
Douglass, J. A., G. Thomson, and C.-M. Zhao. 2012. “The Learning Outcomes Race: The
Value of Self-Reported Gains in Large Research Universities.” Higher Education 64 (3):
317–335. doi:10.1007/s10734-011-9496-x.
Duncker, K. 1941. “On Pleasure, Emotion, and Striving.” Philosophy and Phenomenological
Research 1 (4): 391–430. doi:10.2307/2103143.
Ebbinghaus, H. 1885. Memory: A Contribution to Experimental Psychology. Translated by
H. A. Ruger and C. E. Bussenius, 1913. New York: Teachers College, Columbia University.
Edwards, J. M., and K. Trimble. 1992. “Anxiety, Coping and Academic Performance.”
Anxiety, Stress & Coping: An International Journal 5 (4): 337–350. doi:10.1080/10615
809208248370.
Elton, L. 2004. “A Challenge to Established Assessment Practice.” Higher Education Quar-
terly 58 (1): 43–62. doi:10.1111/j.1468-2273.2004.00259.x.
Emig, J. 1977. “Writing as a Mode of Learning.” College Composition and Communication
28 (2): 122–128. doi:10.2307/356095.
Entwistle, N. 1995. “Frameworks for Understanding as Experienced in Essay Writing and in
Preparing for Examination.” Educational Psychologist 30 (1): 47–54. doi:10.1207/
s15326985ep3001_5.
Fiamengo, J. 2013. “The Fail-Proof Student.” Academic Questions 26 (3): 329–337.
doi:10.1007/s12129-013-9372-5.
Frith, C. D. 2014. “Action, Agency and Responsibility.” Neuropsychologia 55 (1): 137–142.
doi:10.1016/j.neuropsychologia.2013.09.007.
Gardner, S., and H. Nesi. 2013. “A Classification of Genre Families in University Student
Writing.” Applied Linguistics 34 (1): 25–52. doi:10.1093/applin/ams024.
Grunert O’Brien, J., B. J. Millis, and M. W. Cohen. 2008. The Course Syllabus: A
Learning-Centered Approach. San Francisco: Jossey-Bass.
Hernández, R. 2012. “Does Continuous Assessment in Higher Education Support Student
Learning?” Higher Education 64 (4): 489–502. doi:10.1007/s10734-012-9506-7.
Hounsell, D. 1987. “Essay Writing and the Quality of Feedback.” In Student Learning:
Research in Education and Cognitive Psychology, edited by J. T. E. Richardson, M. W.
Eysenck and D. Warren-Piper, 109–119. Milton Keynes: Open University Press.
Isaksson, S. 2008. “Assess As You Go: The Effect of Continuous Assessment on Student
Learning During a Short Course in Archaeology.” Assessment & Evaluation in Higher
Education 33 (1): 1–7. doi:10.1080/02602930601122498.
Johnson, V. E. 2003. Grade Inflation: A Crisis in College Education. New York: Springer-
Verlag.
Jones, A. 2009. “Redisciplining Generic Attributes: The Disciplinary Context in Focus.”
Studies in Higher Education 34 (1): 85–100. doi:10.1080/03075070802602018.
Jones, A. 2013. “There is Nothing Generic about Graduate Attributes: Unpacking the Scope
of Context.” Journal of Further and Higher Education 37 (5): 591–605. doi:10.1080/
0309877X.2011.645466.
Lewis, R. 2010. “External Examiner System in the United Kingdom.” In Public Policy for
Academic Quality: Analyses of Innovative Policy Instruments, edited by D. D. Dill and
M. Beerkens, 21–36. Dordrecht: Springer.
Lindgren, R., and R. McDaniel. 2012. “Transforming Online Learning through Narrative and
Student Agency.” Educational Technology & Society 15 (4): 344–355.
Locke, E. A., and G. P. Latham. 2009. “Has Goal Setting Gone Wild, or Have Its Attackers
Abandoned Good Scholarship?” Academy of Management Perspectives 23 (1): 17–23.
doi:10.5465/AMP.2009.37008000.
Locke, E. A., and G. P. Latham, eds. 2013. New Developments in Goal Setting and Task
Performance. New York: Routledge.
Locke, E. A., G. P. Latham, K. J. Smith, and R. E. Wood. 1990. A Theory of Goal Setting
and Task Performance. Englewood Cliffs, NJ: Prentice Hall.
Locke, E. A., K. N. Shaw, L. M. Saari, and G. P. Latham. 1981. “Goal Setting and
Task Performance: 1969–1980.” Psychological Bulletin 90 (1): 125–152. doi:10.1037/
0033-2909.90.1.125.
Nicol, D. J., and D. Macfarlane-Dick. 2006. “Formative Assessment and Self‐Regulated
Learning: A Model and Seven Principles of Good Feedback Practice.” Studies in Higher
Education 31 (2): 199–218. doi:10.1080/03075070600572090.
Oppenheim, A. N., M. Jahoda, and R. L. James. 1967. “Assumptions Underlying the Use of
University Examinations.” Higher Education Quarterly 21 (3): 341–351. doi:10.1111/
j.1468-2273.1967.tb00245.x.
Osman, M. 2014. Future-Minded: The Psychology of Agency and Control. Basingstoke:
Palgrave-Macmillan.
Pacherie, E. 2008. “The Phenomenology of Action: A Conceptual Framework.” Cognition
107 (1): 179–217. doi:10.1016/j.cognition.2007.09.003.
Park, C. 2003. “In Other (People’s) Words: Plagiarism by University Students – Literature
and Lessons.” Assessment & Evaluation in Higher Education 28 (5): 461–488.
doi:10.1080/02602930301677.
Roese, N. 1997. “Counterfactual Thinking.” Psychological Bulletin 121 (1): 133–148.
doi:10.1037/0033-2909.121.1.133.
Rohrer, D., and H. Pashler. 2010. “Recent Research on Human Learning Challenges Conven-
tional Instructional Strategies.” Educational Researcher 39 (5): 406–412. doi:10.3102/
0013189X10374770.
Sadler, D. R. 1989. “Formative Assessment and the Design of Instructional Systems.”
Instructional Science 18 (2): 119–144. doi:10.1007/BF00117714.
Sadler, D. R. 2009. “Grade Integrity and the Representation of Academic Achievement.”
Studies in Higher Education 34 (7): 807–826. doi:10.1080/03075070802706553.
Sadler, D. R. 2010a. “Beyond Feedback: Developing Student Capability in Complex Apprai-
sal.” Assessment & Evaluation in Higher Education 35 (5): 535–550. doi:10.1080/
02602930903541015.
Sadler, D. R. 2010b. “Fidelity as a Precondition for Integrity in Grading Academic Achieve-
ment.” Assessment & Evaluation in Higher Education 35 (6): 727–743. doi:10.1080/
02602930902977756.
Sadler, D. R. 2011. “Academic Freedom, Achievement Standards and Professional Identity.”
Quality in Higher Education 17 (1): 103–118. doi:10.1080/13538322.2011.554639.
Sadler, D. R. 2013a. “Assuring Academic Achievement Standards: From Moderation to Cali-
bration.” Assessment in Education: Principles, Policy and Practice 20 (1): 5–19.
doi:10.1080/0969594X.2012.714742.
Sadler, D. R. 2013b. “Making Competent Judgments of Competence.” In Modeling and Mea-
suring Competencies in Higher Education, edited by S. Blömeke, O. Zlatkin-Troitschan-
skaia, C. Kuhn and J. Fege, 13–27. Rotterdam: Sense Publishers.
Sadler, D. R. 2013c. “Opening up Feedback: Teaching Learners to See.” In Reconceptualising
Feedback in Higher Education: Developing Dialogue with Students, edited by S. Merry,
M. Price, D. Carless and M. Taras, 54–63. London: Routledge.
Sadler, D. R. 2014a. “Learning from Assessment Events: The Role of Goal Knowledge.” In
Advances and Innovations in University Assessment and Feedback, edited by C. Kreber,
C. Anderson, N. Entwistle and J. McArthur, 152–172. Edinburgh: Edinburgh University
Press.
Sadler, D. R. 2014b. “The Futility of Attempting to Codify Academic Achievement
Standards.” Higher Education 67 (3): 273–288. doi:10.1007/s10734-013-9649-1.
Schatzki, T. R. 2001. “Practice Mind-ed Orders.” In The Practice Turn in Contemporary
Theory, edited by T. R. Schatzki, K. K. Cetina and E. von Savigny, 50–63. London:
Routledge.
Schrag, F. 2001. “From Here to Equality: Grading Policies for Egalitarians.” Educational
Theory 51 (1): 63–73. doi:10.1111/j.1741-5446.2001.00063.x.
Seligman, M. E. P., P. Railton, R. F. Baumeister, and C. Sripada. 2013. “Navigating Into the
Future or Driven by the Past.” Perspectives on Psychological Science 8 (2): 119–141.
doi:10.1177/1745691612474317.
Shavelson, R. J. 2010. Measuring College Learning Responsibly: Accountability in a New
Era. Stanford, CA: Stanford University Press.
Shavelson, R. J. 2013. “An Approach to Testing and Modeling Competence.” In Modeling
and Measuring Competencies in Higher Education, edited by S. Blömeke, O. Zlatkin-
Troitschanskaia, C. Kuhn and J. Fege, 29–43. Rotterdam: Sense Publishers.
Sommers, N. 1980. “Revision Strategies of Student Writers and Experienced Adult Writers.”
College Composition and Communication 31 (4): 378–388. doi:10.2307/356588.
Sternglass, M. 1997. Time to Know Them: A Longitudinal Study of Writing and Learning at
the College Level. Mahwah, NJ: Erlbaum.
Strathern, M. 1997. “‘Improving Ratings’: Audit in the British University System.” European
Review 5 (3): 305–321.
Taylor, R. N. 1974. “Nature of Problem Ill-Structuredness: Implications for Problem
Formulation and Solution.” Decision Sciences 5 (4): 632–643. doi:10.1111/j.1540-5915.
1974.tb00642.x.
Tippin, G. K., K. D. Lafreniere, and S. Page. 2012. “Student Perception of Academic Grad-
ing: Personality, Academic Orientation, and Effort.” Active Learning in Higher Education
13 (1): 51–61. doi:10.1177/1469787411429187.
Tremblay, K. 2013. “OECD Assessment of Higher Education Learning Outcomes (AHELO):
Rationale, Challenges and Initial Insights from the Feasibility Study.” In Modeling and
Measuring Competencies in Higher Education, edited by S. Blömeke, O. Zlatkin-
Troitschanskaia, C. Kuhn and J. Fege, 113–126. Rotterdam: Sense Publishers.
Trotter, E. 2006. “Student Perceptions of Continuous Summative Assessment.” Assessment &
Evaluation in Higher Education 31 (5): 505–521. doi:10.1080/02602930600679506.
Walker, J. 2010. “Measuring Plagiarism: Researching What Students Do, Not What They
Say They Do.” Studies in Higher Education 35 (1): 41–59. doi:10.1080/
03075070902912994.
Watty, K., M. Freeman, B. Howieson, P. Hancock, B. O’Connell, P. de Lange, and A. Abraham.
2014. “Social Moderation, Assessment and Assuring Standards for Accounting Gradu-
ates.” Assessment & Evaluation in Higher Education 39 (4): 461–478. doi:10.1080/
02602938.2013.848336.
Weissberg, R. 2013. “Critically Thinking about Critical Thinking.” Academic Questions 26
(3): 317–328. doi:10.1007/s12129-013-9375-2.
Whateley, M. K. G., and B. C. Scott. 2006. “Evaluation Techniques.” Chap. 10 in Introduction
to Mineral Exploration, 2nd ed., edited by C. J. Moon, M. K. G. Whateley, and A. M.
Evans, 199–252. Malden, MA: Blackwell.
Williams, G. 2010. “Subject Benchmarking in the UK.” In Public Policy for Academic
Quality: Analyses of Innovative Policy Instruments, edited by D. D. Dill and M.
Beerkens, 157–181. Dordrecht: Springer. doi:10.1007/978-90-481-3754-1_9.
Williams, J. B. 2006. “The Place of the Closed Book, Invigilated Final Examination in a
Knowledge Economy.” Educational Media International 43 (2): 107–119. doi:10.1080/
09523980500237864.
Williams, J. B., and A. Wong. 2009. “The Efficacy of Final Examinations: A Comparative
Study of Closed-Book, Invigilated Exams and Open-Book, Open-Web Exams.” British
Journal of Educational Technology 40 (2): 227–236. doi:10.1111/j.1467-
8535.2008.00929.x.
Zinn, T. E., J. F. Magnotti, K. Marchuk, B. S. Schultz, A. Luther, and V. Varfolomeeva.
2011. “Does Effort Still Count? More on What Makes the Grade.” Teaching of
Psychology 38 (1): 10–15. doi:10.1177/0098628310390907.
Zorn, J. 2013. “English Compositionism as Fraud and Failure.” Academic Questions 26 (3):
270–284. doi:10.1007/s12129-013-9368-1.
