Assessment in Music: A Practitioner Introduction to Assessing Students

John M. Denis¹

Update: Applications of Research in Music Education, 2018, Vol. 36(3), 20–28
© 2017 National Association for Music Education
DOI: 10.1177/8755123317741489

Implications

•• Researchers have identified several assessment techniques and approaches that allow for measurement of varying teaching and learning goals in music education. Practitioner understanding of such differing approaches may allow for better implementation of assessment in the classroom.
•• Researchers have found difficulties in assessing music in the schools, including the lack of a set curriculum, lack of transparency in assessment procedures, and the inherent difficulties in assessing creativity. It is recommended that practitioners understand and implement multiple assessment techniques to address differing learning objectives at varying times in instruction sequences.

Abstract
There has been an increased interest in documenting the growth and learning outcomes of students in all subjects
in the past 20 years, and music education has not been immune to the accountability movement. Yet, in spite of the
increased sociopolitical pressures put on educators, music has remained a difficult discipline to assess, which in turn
has created tension between music educators and policymakers. This review of literature examines the basic nature
of assessment in music education and discusses possible concepts and methods to improve practitioner understanding
of student growth and learning. Topics include (a) What is assessment? (b) Why does assessment matter? (c) How do
we assess in music? and (d) What challenges occur in music assessment?

Keywords
assessment, evaluation, feedback, formative assessment, rubric, summative assessment

There has been an increased interest in documenting the growth and learning outcomes of students in all subjects in the past 20 years, and music education has not been immune to the accountability movement (Ciorba & Smith, 2009; Colwell, 1998; Fisher, 2008; McQuarrie & Sherwin, 2013). As such, assessment in the music classroom has become a primary area of concern to music educators (McQuarrie & Sherwin, 2013). Yet, in spite of the increased sociopolitical pressures put on educators, music has remained a difficult discipline to assess, which in turn has created tension between music educators and policymakers (Colwell, 2008).

Therefore, it has become increasingly important that music educators maintain an active awareness of assessment importance and practice in the field. Maintaining such awareness, however, may prove challenging for young practitioners, due, in part, to a lack of knowledge about assessment. Additionally, Asmus (1999) stressed the importance of teachers knowing the needs and learning styles of the students in their classes. The variety of class settings often navigated by music teachers has been suggested as heightening the necessity for building assessments around student needs (Asmus, 1999; West, 2012). Classes often vary in size, language spoken, cultures represented, and cognitive development. Accordingly, Chiodo (2001) recommended that music educators seek out information about their students from the students themselves, professional colleagues, and from data resources to inform decisions about assessment.

¹Texas State University, San Marcos, TX, USA

Corresponding Author:
John Denis, College of Fine Arts and Communication, School of Music, Texas State University, 601 University Drive, San Marcos, TX 78666, USA.
Email: john.denis@txstate.edu

The purpose of this article is to examine the following questions: (a) What is assessment? (b) Why does assessment matter? (c) How do we assess in music? and (d)
What challenges occur in music assessment? Topics were selected a priori by the author and the included studies were selected through ERIC, JSTOR, and ProQuest database searches.

What Is Assessment?

Idealistically, assessment has been presented as a method for gathering information relevant to both teachers and students about teaching and learning, centered on student knowledge and skills (Parkes, Rohwer, & Davidson, 2015). Goolsby (1999) suggested that four types of assessment are, in his experience, commonly used in instrumental settings: (a) placement (auditions, chair tests, etc.); (b) summative (concerts, festivals, etc.); (c) diagnostic (identifying performance skills and knowledge difficulties); and (d) formative (the regular monitoring of students for learning outcomes). Eisner (1998) also underscored the diagnostic, positioning, content evaluative, and reflective roles of assessment in the arts. Conceptual definitions and goals, such as those mentioned above, provide the foundation for the creation and implementation of assessments. It may, therefore, be necessary to have a basic understanding of what assessments accomplish.

Students have been found to benefit from specific and reliable feedback gained from both formative and summative assessments (Colwell, 1998; Salvador, 2011; Sicherl Kafol, Kordeš, & Holcar Brunauer, 2017; Zerull, 1990). Formative assessments have been defined as assessments conducted during learning, and have been one of the primary aspects of designing and differentiating instruction (Salvador, 2011; Saunders & Holahan, 1997; Sicherl Kafol et al., 2017). Music teachers have been said to engage in formative assessment whenever they listen critically to student performance/practice, make judgments, and provide feedback (Saunders & Holahan, 1997). While practitioners may often use formative assessment to ensure that students do not simply skirt the instructor's acceptance of mistakes (Goolsby, 1999), formative assessment has often been relegated to error detection, identification, and communication (Colwell, 1998). True formative assessment, according to Colwell (2008), only occurs when "some learning action follows from the testing, otherwise it is merely frequent summative assessment" (p. 13).

Summative assessment, or the end measurement of either learning, growth, or achievement (Sicherl Kafol et al., 2017), can also be thought of as answering the following question: How did the student(s) do? (Mastrorilli, Harnett, & Zhu, 2014). In music, individual performances and competitions often serve as summative assessments and have been found to be the source of many student grades and teacher evaluations (Ciorba & Smith, 2009; Latimer, Bergee, & Cohen, 2010). Russell and Austin (2010) found that the vast majority (95%) of the 352 secondary music teachers surveyed worked in schools that used traditional summative grading systems. Comparable systems have been noted as extremely common; however, Harrison, Lebler, Carey, Hitchcock, and O'Bryan (2013) suggested that simple grades lack meaningful feedback. Due to their perceived and real importance, summative experiences often influence instructional decision making, from sequencing to curriculum.

Why Does Assessment Matter?

Researchers have found that students may view assessment, particularly when measured by grades, as important and a powerful motivating factor (Colwell, 1998; McClung, 1996; Reimer, 2009). Boud, Cohen, and Sampson (1999) claimed that "assessment is the single most powerful influence on learning in formal courses and, if not designed well, can easily undermine the positive features of an important strategy in the repertoire of teaching and learning approaches" (p. 414). Lehman (2008) also emphasized the interconnectivity of instruction and assessment. In this view, assessment has been considered an essential component of quality instruction and necessary for improvement in teaching and learning to take place (Eisner, 1998; Lehman, 2008). Formative assessments, in their role of providing data for reflection and adjustment of instruction, may improve specific lessons, general content delivery, sequencing, and/or curricular focus. Additionally, the cumulative nature of summative assessments may be used to evaluate achievement, influence the development of future goals, serve as a motivating factor, and provide meaningful data about teaching and learning in the music classroom. It may be important, however, to note that these positive factors may be predicated on the proper implementation of assessment.

Increases in accountability, spurred by laws such as No Child Left Behind and grant programs such as Race to the Top, have exacerbated the preexisting natural emphasis on assessment (Ciorba & Smith, 2009; Wesolowski, 2014). This increased public attention, in turn, has increasingly affected policy decisions about music assessment (West, 2012; Zerull, 1990). Music teachers have been asked to teach more than one area of music, in many cases, to free up resources to support subjects struggling with high-stakes testing (Lehman, 2008; West, 2012). The implementation of standardized music tests, while attempted several times in the past 40 years, has not been widespread due to the inherent difficulties in uniform measurement of music skills (Parkes et al., 2015; Zerull, 1990). Researchers have also pointed to the breadth of music as a field, the expressive nature of the discipline, and the intrinsic value of all art forms as factors contributing to music's unsuitability for assessment (Lehman, 2008; Zerull, 1990).
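The formative/summative distinction discussed above can be made concrete with a small sketch. The event categories follow Goolsby's (1999) four types; every label, score, and follow-up action below is invented for illustration and is not drawn from any instrument in the cited studies.

```python
from statistics import mean

# Hypothetical assessment log for one student; "type" follows Goolsby's
# (1999) categories. All labels and scores are invented for illustration.
events = [
    {"type": "placement", "label": "fall chair test", "score": 82},
    {"type": "formative", "label": "scale check", "score": 70,
     "action": "assign slower metronome work"},
    {"type": "formative", "label": "phrasing check", "score": 78,
     "action": "model phrase endings in sectional"},
    {"type": "summative", "label": "winter concert", "score": 91},
    {"type": "summative", "label": "region festival", "score": 88},
]

# Formative results should trigger a learning action (Colwell, 2008)
# rather than simply accumulate as grades.
follow_ups = [e["action"] for e in events if e["type"] == "formative"]

# Summative results answer "How did the student(s) do?" and feed the grade.
summative_grade = mean(e["score"] for e in events if e["type"] == "summative")

print(follow_ups)
print(summative_grade)  # 89.5
```

In this framing, the formative rows matter chiefly for the instructional actions they trigger, while only the summative rows enter the grade; this is one possible reading of the distinction, not a prescribed grading scheme.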
Assessment has been found to be of such importance that evaluating music and music performance is included in the current national standards (Shuler, Norgaard, & Blakeslee, 2014). Music has been recognized at both national and state levels as being a core curricular subject, which has brought more focus on assessment in the music classroom (Fisher, 2008). As such, Asmus (1999) noted that some objections to assessment in music may stem from a lack of specificity as to students' expected knowledge and skills. Fisher (2008) provided several reasons for the adoption of more regular assessments in music classrooms, including easily understood and consistent accountability, legitimization in the perceptions of those outside music, and protection of music instruction time. Wright, Humphrey, Larrick, Gifford, and Wardlaw (2005), however, countered that there are also viable arguments that music is not suitable for formal standardized testing, including the inconsistent schedule of music instruction and the minuscule impact on subject credibility testing may provide. Likewise, the researchers listed additional contributing factors, including the breadth of music as a field, the expressive nature of the discipline, and the intrinsic value of all art forms. Asmus (1999) suggested that music, possibly more than other subjects, operates in unique circumstances and has longer teacher-to-student interactions. Therefore, formative and summative assessments may affect long-term learning outcomes.

The importance of music assessment extends beyond the need to evaluate for pedagogical purposes. A number of studies have shown that assessments can be used to communicate value and advocate for school music to those outside the discipline (Colwell, 1998; Reimer, 2009; Zerull, 1990). Accordingly, Reimer (2009) remarked that the use of nonmusic criteria for formative assessment, or even worse for summative grades, can lend credence to claims that music is a less serious subject than other academic disciplines. McClung (1996) found that only 18% of administrators surveyed perceived choir as having equal educational status with other core subjects. McQuarrie and Sherwin (2013) found that many of the most common assessments used in music classrooms were nonachievement based, which deviated sharply from the assessment literature in the Music Educators Journal. In general, music ensemble instructors have embraced the necessity of assessment in their classes, but often have no training in developing or administering assessment (Colwell, 1998).

How Do We Assess in Music?

Within the differing forms and goals of assessment, scholars have noted a multitude of approaches (McQuarrie & Sherwin, 2013; Rohwer, 1997; Russell & Austin, 2010). Teachers and researchers have used content-specific rating scales, checklists, rubrics, report cards, aptitude testing, observations, and portfolios to assess student learning and growth (Parkes et al., 2015; Rohwer, 1997; Salvador, 2011). Many of the common assessments found in schools, however, have been noted as either nonmusic or containing no actual measurements (Barkley, 2006; McClung, 1996; McQuarrie & Sherwin, 2013; Russell & Austin, 2010; Simanton, 2000). McQuarrie and Sherwin (2013) found that the top five assessments used in elementary general music were based on (a) participation, (b) effort, (c) individual performances using informal observation, (d) group performances, and (e) behavior. These results were echoed by Barkley (2006), who reported that observation was the most commonly used assessment strategy in elementary music classrooms, followed by concert performances and written tests. Russell and Austin (2010) found similar results among secondary music teachers, with the top five assessment criteria in order of use being (a) performance/skill, (b) knowledge, (c) attendance, (d) attitude, and (e) practice. Additionally, Simanton (2000) reported that secondary band directors used (a) participation/attitude, (b) band music performance, (c) attendance, and (d) technique/sight-reading as the primary means of assessing students, and that 56% of assigned grades in band classes were derived from nonperformance criteria. McClung (1996) noted that students perceived individual participation and attitude as the most frequent predictor of grades in choir.

The proliferation of nonmusic assessments, such as attendance or attitude, in classrooms has contrasted with much of the published literature, which has focused on assessments like individual and group performance, standardized testing, assessment software, and others (McQuarrie & Sherwin, 2013). While simple to execute and often supported by administrators (McClung, 1996; Russell & Austin, 2010), nonmusic assessments have not been noted to support music learning and growth to the same extent as content-based assessments built on demonstrations of music knowledge and skills (Reimer, 2009). Russell and Austin (2010) determined that content-based assessment, such as evaluation of music performance or music composition, was the most effective approach to improve teaching and learning.

Assessment Techniques

NAfME (2015) leaders have proposed that assessment in music "should measure student learning across a range of standards representative of quality, balanced music curriculum, including not only responding to music but also creating and performing music." Wesolowski (2014) remarked that having a deep understanding of music is essential for evaluating both student learning and one's own instruction. Fortunately, many of our schools of music have been perceived by recent graduates as
excelling at instruction in performance, music history, and music theory (Denis, 2017). Subsequently, music educators may find that their music preparations may facilitate the acquisition of performance assessment skills.

Performance has been found to be one of the most common forms of assessment in music, due to its paramount nature with regard to music making and its power to motivate students (Latimer et al., 2010; Reimer, 2009). While definitively authentic, performance assessment has been cited as a subjective endeavor (Bergee & McWhirter, 2005; Latimer et al., 2010; Reimer, 2009; Ryan & Costa-Giomi, 2004). Researchers have noted that school size classification, time of day, type of event, and level of expenditure per average daily attendance were all strong predictors of festival scores (Bergee & McWhirter, 2005). Bergee and Westfall (2005) found many of the same strong relationships between nonmusic predictors like school size or time of day, as well as the added predictor of geographical district (metropolitan or nonmetropolitan), and festival ratings. Furthermore, Ryan and Costa-Giomi (2004) reported that evaluations by judges were influenced by perceived attractiveness and gender.

Several different approaches have been developed and researched to offset inconsistencies in evaluation, including the use of rubrics. Chiodo (2001) suggested that rubrics can be effective tools to help judges organize and evaluate music performance, when aligned with expressed standards or expectations. In general, rubrics use descriptors of performance domain criteria (e.g., tone, balance, rhythm, etc.) to provide an isolated domain-specific rating or overall total for a holistic performance score (Latimer et al., 2010). Multidimensional rubrics with specific descriptions of performance dimensions have been found to provide both higher interjudge reliability (Norris & Borst, 2007) and a more detailed picture of student achievement (Ciorba & Smith, 2009). Latimer et al. (2010) used the institution of a new statewide rubric to assess both reliability and validity of rubric use in Kansas for bands, choirs, and orchestras across a 2-year period. In addition to finding that the rubric led to high interjudge reliability, they also reported that both adjudicators and ensemble directors found that rubric use led to scores that represented performance quality and provided better feedback than previously used forms. For practitioner purposes, rubrics may offer a way to remain consistent between individuals or groups (Chiodo, 2001) and may increase convergent thinking and focus among students, as well as serving as a source of student motivation (Colwell, 2002). Similarly, rubrics may communicate instructor expectations, highlight essential concepts or material, and guide student work.

Rhythm, however, has been noted to be the least reliable dimension used with rubrics (Latimer et al., 2010; Norris & Borst, 2007). Latimer et al. (2010) suggested that the limited reliability of rhythms may be due to a lack of rubric specificity. As such, practitioners may desire to strive for specific and clear definitions of rhythm and rhythmic accuracy in self-designed rubrics to increase reliability.

Several other ways to assess music performance, including checklists, rating scales, and recordings, have been discussed in the literature. Essentially functioning as a tally system to guide instructor focus, practitioners have described checklists as allowing for individual assessment on any number of teacher-determined skills (Chiodo, 2001; Goolsby, 1999). Therefore, checklists may be used with judges to arrive at an overall rating, and have been found to be reliable (Doane, Davidsen, & Hartman, 1990). Wesolowski (2014), however, expressed validity concerns for the implementation of checklists in his own experience, particularly due to the potential disconnect between cognitive processes and observable music behaviors. In music classrooms, checklists may often appear in the form of pass-off charts or objective sheets. Akin to rubrics, the checklist has been set forth by music educators as allowing for communication of expectations and content across a variety of topics (Chiodo, 2001; Goolsby, 1999) and may allow for more specific feedback than rubrics (Colwell, 2002). The feedback innate to the format, on the other hand, may be limited, as the metric relies on a pass/fail check of skill acquisition.

Rating scales have been likened to checklists in practitioner literature; however, they have been suggested to measure performance skills with more gradation in results (Chiodo, 2001; Wesolowski, 2014). These scales have been found to be criteria specific and can be continuous, through the measurement of mastery and a hierarchy of interdependent skills, or additive, through the measurement and summation of independent skills (Azzara, 1993; Saunders & Holahan, 1997). Researchers have found rating scales to lead to high interjudge reliability (Bergee, 2003; Saunders & Holahan, 1997).

Rating scales may offer a quick way to focus on specific music skills or knowledge and still attach the feedback necessary for learning in a large classroom setting. While Chiodo (2001) suggested that, in her experience, rating scales offered an efficient way to assess performance, a balance must be found between having a scale large enough to provide meaningful feedback and one small enough to keep the pacing advantages the format offers. In practical use, criteria-specific rating scales may be used for quick placement, diagnostic, or formative assessments. In small group settings, such as a sectional rehearsal, a performance assessment on phrasing might be taken and feedback given through a rating scale without the use of an excessive amount of instruction time.

Apart from purely performance assessment, portfolios have been implemented with increased frequency over
the past 25 years in music settings (Lehman, 2008; Parkes et al., 2015; Zerull, 1990). Often touted in response to the value-added modeling assessments stressed in the Race to the Top grant program, portfolios have been defined as collections of artifacts and student work that serve to document growth and learning over a period of time (Asmus, 1999; Hughes & Keith, 2014; Parkes et al., 2015). Portfolios may contain quantifiable aspects, such as the number of recordings, amount of written work, scores from competitions, qualitative journal entries, teacher impressions, and/or the performance recordings of music (Zerull, 1990). Additionally, Wesolowski (2014) suggested that practitioners may also use portfolios for both formative and summative assessment, in addition to authentic or alternative assessment, depending on timing and material selection approach. Colwell (2002) noted that portfolios may require many individual tasks for each assessed objective, and may lead to increased work for practitioners.

In 2012, the Memphis City Schools (now Shelby County Schools) and the state of Tennessee pioneered and developed a portfolio assessment that used student work in evaluating learning growth. Eventually called the Portfolio Growth Measure System, the approach used both teacher selection/scoring of materials and blind review by trained content-specific reviewers. In 2015, Parkes et al. piloted a modification of the Portfolio Growth Measure System in Virginia used for evaluating music educators and found that stakeholders believed that documenting teacher effectiveness with portfolios was an acceptable approach to documentation, and that the development of measurement instruments and reviews of the portfolios held reasonable validity and reliability. Making a diverse selection of artifacts that reflect content knowledge has been found to be essential for practitioners to receive the benefits of portfolio assessment (Parkes et al., 2015). This approach could be applied to student portfolios, and therefore used to document both specific learning and growth. It may be helpful to decide on the categories or types of artifacts prior to the beginning of the course, so as to avoid confusion.

In addition to performance assessments, traditional written assignments can also be used in calculating student grades (Russell & Austin, 2010). The most common written assessments in secondary music classes have been reported to be quizzes, worksheets, and exams, all of which may be used to appraise basic content knowledge, such as vocabulary, notation, or music theory, and basic music reading and listening skills (Russell & Austin, 2010; Salvador, 2011). Tests have been noted to be frequently used as an assessment tool; yet writing an appropriate, effective test has been found to be a difficult and time-intensive undertaking, and the necessary skills must be learned and developed (Lehman, 2008). Cross-discipline assignments, such as essays about music experiences, may be viewed favorably by many administrators, but may not be valid assessments of music content knowledge. In general, written assignments may allow for assessment of music knowledge, and careful consideration may alleviate validity concerns due to assignment construction or content. Moreover, attentive planning may free written assignments from the grade-focused summative nature practitioners may initially call to mind when considering traditional pen-and-paper assessments.

Assessment is not limited to teacher-centered activities; however, the development of self-assessment skills requires that students experience appropriate external assessment, and be taught to transfer those concepts to autonomous use (Reimer, 2009). Therefore, the complexities of self-evaluation should be experienced by students after basic skill acquisition and learning have occurred (Colwell, 1998; Goolsby, 1999). Any of the previously mentioned techniques can be used to help facilitate self-assessment; however, the use of recordings becomes essential (Silveira & Gavin, 2016). Contest adjudicators and teachers have been trained in assessment using audio recordings (Bergee, 2007), and when teaching students to evaluate themselves, the use of recordings can be advantageous (Silveira & Gavin, 2016). When asking students to record themselves, Goolsby (1999) suggested that practitioners both communicate and practice the necessary procedures to alleviate challenges with using technology.

What Challenges Occur in Music Assessment?

A number of inherent challenges exist when trying to develop music assessments. From a broad perspective, the lack of agreement on music curricula or the end goals of instruction has created divisions in assessment approaches, which in turn may have erected barriers to assessment (Lehman, 2008; Reimer, 2009). The subjective nature of any music value judgments may have further hindered the development of any consistency in assessment; however, the rise of the standards movement may assuage some of these concerns (Colwell, 2008; Lehman, 2008). The use of standards may also align with McClung (1996), who commented that grades should be linked to specific learning objectives.

Logistical challenges may also influence assessment decisions and impede improvement in assessment practice (Ferm Almqvist, Vinge, Väkevä, & Zandén, 2017; Harrison et al., 2013; Russell & Austin, 2010; Salvador, 2011). In a qualitative study of three purposely selected elementary music teachers, Salvador (2011) noted that participants viewed the number of students taught, time constraints, and the lack of administrator support as being obstacles to improved assessment. Similarly, Harrison
et al. (2013), Lehman (2008), and Ferm Almqvist et al. (2017) all echoed the concern over class size issues and their impact on effective assessment. Russell and Austin (2010) also found that administrative guidance or assistance with assessment was rare, yet when administrators provided help, nonmusic assessments were less likely to be present in the classroom. Additionally, Reimer (2009) suggested that the complexity of grading individuals in ensembles may create confusion or disagreement between music educators and administrators, and that administrators may be concerned that such confusion could lead to grade inflation.

A lack of transparency in grading procedures used in ensembles can lead to perceptions of favoritism (Harrison et al., 2013). Providing written grading policies to students and parents may alleviate transparency concerns. On the contrary, Ferm Almqvist et al. (2017) found that assessment often drove instruction and learning goals directly, stating that "assessment procedures and techniques [became] so persistent that they completely dominate[d] the teaching and learning experiences" (p. 6). Ferm Almqvist et al. (2017) further suggested that transparency contributed to the dominant role assessment played in participant classrooms. Overall, student understanding of teacher expectations has remained crucial for learning, and assessments have been suggested as meaningful indicators of expectations, goals, and objectives (Reimer, 2009). Including rationales and reasoning in assessments has also been offered by practitioners as a way for music educators to document rigor (Wesolowski, 2014).

Even when transparency exists, assessing creativity in music has often proven especially challenging for music educators (Hickey, 2001). Scholars have described creativity as novel, appropriate, and valuable (Barbot & Lubart, 2012; Stefanic & Randles, 2015) and have noted its historical importance in music education (Rohwer, 1997). Despite the educational value of creativity, the inherent subjectivity may prove problematic for educator assessments (Hickey, 2001; Rohwer, 1997).

Researcher-designed standardized assessments and product-based assessments have been used to evaluate and measure creativity in research literature. The most prominent researcher-designed standardized measures of creativity in music were developed by Guilford and Torrance during the middle of the 20th century (Rohwer, 1997; Stefanic & Randles, 2015). The Torrance Tests of Creative Thinking (TTCT) have been renormed and may serve as an example of standardized creative measures (Stefanic & Randles, 2015). Individuals or groups respond to activities (five activities on the TTCT-Verbal and three activities on the TTCT-Figural) in writing and in drawing, which are then scored (Kim, Crammond, & Bandalos, 2006). These standardized tests have been used as a foundation for creativity research, although concerns about reliability have been raised (Stefanic & Randles, 2015).

In contrast, product-based assessment has been favored by researchers in recent years, with many studies using Amabile's (1983) Consensual Assessment Technique (CAT) (Barbot & Lubart, 2012; Eisenberg & Thompson, 2003; Hickey, 2001). Counter to standardized tests, CAT measurements involve domain-appropriate judges using a Likert-type scale to rate artistic products on multiple criteria. This approach has been found by researchers to be reliable when used with both improvisation (Eisenberg & Thompson, 2003) and composition (Hickey, 2001; Stefanic & Randles, 2015). Stefanic and Randles (2015) also found that CAT was less reliable when evaluating group compositions or when there was only one judge. The need for multiple expert judges may pose a logistical problem for the implementation of CAT in some settings (Lu & Luh, 2012).

General tenets from both researcher-designed measurements and general assessment strategies may still be valuable to practitioners in developing appropriate assessments for music creativity. The lack of criteria or standards for evaluating creativity, for instance, may free music educators to focus on growth in novelty or appropriateness of the creative product. To this end, practitioners may use assessment tools such as checklists, rating scales, peer/teacher responses, or portfolios to evaluate student creativity (Rohwer, 1997).

Finally, the common use of nonmusic assessments may point to external factors influencing assessment choices. Russell and Austin (2010) found that practitioner practice and district/campus policies were often at odds, and music teachers were rarely provided assistance in reconciling any discrepancies. Barkley (2006) found that elementary music educators believed nonmusic factors, such as itinerancy, district/campus policies, planning time, training, and resources, all influenced assessment practices. Simanton (2000) suggested that the extensive workload of band directors may contribute to the lack of specific, individual assessments.

Conclusions

In a climate increasingly focused on accountability, how do we implement effective assessment without sacrificing effective music making? This process may begin with the instructor's deep understanding of music. Colwell (1998) argued that developing a meaningful assessment, particularly in a discipline with a high level of subjectivity, requires a strong foundation of music knowledge. Difficulty in assessment may increase if one attempts to keep the flexibility needed for creativity and artistry in the music classroom alive. The authentic performance assessments that so often constitute the majority of content-specific grades in music ensembles may, if developed and implemented poorly, contribute to the restriction of creative thought by reinforcing the power dynamics of
teacher-led decision making. As such, knowing music from a formalist and expressionist perspective might allow practitioners to create learning environments tied to a holistic ideal of music making, instead of simply continuing the deficient rote performance assessments often seen in schools.

Assessing specific knowledge may also be important; however, knowing exactly what and how to assess appropriately may remain problematic. Therefore, a significant understanding of assessment fundamentals and strategies may help practitioners gather truly meaningful data about their students (Colwell, 1998). Yet, in spite of this, many teachers may be unaware of validity concerns in either teacher-created or externally provided assessments. Additionally, deficiencies in practitioner assessment knowledge may lead to misuse of assessment strategies, which in turn may promote incorrect conclusions and negatively influence instruction. To avoid misuse, researchers have suggested that assessments be linked with learning objectives (McClung, 1996; Parkes et al., 2015) and be implemented in an instructionally meaningful sequence (Wesolowski, 2014). While nonmusic assessments may be common, they may devalue the overall learning goals and public perceptions of the rigor of school music.

Accordingly, practitioners should understand the knowledge and skills related to assessment in order to evaluate and instruct students effectively. Intentionally varying assessment approaches and the material assessed may provide a more holistic picture of student progress and the efficacy of teacher instruction. Additionally, disparate learning objectives may be best assessed by disparate techniques. Consequently, practitioner planning may be an essential part of meaningful assessment.

Planning may also allow practitioners to gain a more holistic view of student learning. For instance, a student composition may be assessed through both a teacher checklist, to provide feedback about specific areas of growth, and a written peer evaluation, to address more general aesthetic attributes (Rohwer, 1997). Planning may also mitigate issues common to assessment. Peer feedback, for example, may require careful introduction and practice of procedures to develop a safe environment for students and to facilitate the generation of relevant and meaningful feedback. Sicherl Kafol et al. (2017) found that proper establishment of peer feedback procedures led to improved student perceptions of assessment.

Practically, assessment can direct student and teacher focus. Ferm Almqvist et al. (2017) found that summative assessments drove student learning, as students were coached to perform well on assessments. In light of the power of assessment to shape content, Duke (2005) advocated planning instruction with the assessment clearly in mind. For practitioners, this may mean deciding which aspects of the material to assess, and through which approaches, prior to developing lesson plans. As an illustration, when working with a secondary ensemble, instructors may select the music; decide on the standards and objectives for musicality, expression, accuracy, and so on; develop several approaches for formative assessment (such as an overall rubric for the selected music, checklists for each song, or designated moments of peer feedback); and culminate with the summative assessment of a concert performance. Each specific learning objective or standard may need multiple assessments. Such frequency may serve to increase the opportunities to transition content knowledge about music into procedural knowledge of making music while lessening the consequences of failure (Duke, 2005). Consequently, practitioners may want to use frequent, intentional, formative assessments in their classrooms to facilitate teaching and learning.

As discussed previously, different approaches to assessment may be best suited to differing objectives. Performance remains one of the most common forms of assessment in music; however, practitioners may wish to extend their conceptualization of performance beyond the concert stage. Performances for peers may provide opportunities for reflection, the development of evaluation skills, and practice in the procedural demonstration of music understanding. Furthermore, performances in varying settings may allow students to exhibit music learning for the teacher, administration, community, and themselves. Practitioners may then use the information gathered to inform future curricula and instruction. For instance, students may be assigned solos and ensembles as part of the class curriculum. During the learning stages, performances for the instructor and other students may offer frequent, low-impact opportunities for feedback and assessment. Building from these performance assessment experiences, students may then participate in community performances of their music. At this stage, students have further opportunities to demonstrate their understanding of the music, and if recordings are made they may again reflect, either individually or with others, on their performance. Finally, performing the solo or ensemble as part of a competition may provide a summative assessment of students' understanding of the piece and individual instrument. By having multiple assessments in various settings, teachers may mitigate some of the noted subjectivity of performance assessments (Bergee & McWhirter, 2005; Latimer et al., 2010; Reimer, 2009; Ryan & Costa-Giomi, 2004).

Similarly, rubrics may help practitioners clearly communicate expectations and increase consistency in certain assessment areas. Practitioners may use rubrics as a way to communicate the desired learning outcomes of a particular unit or topic and to give structure to planned assessment. To illustrate, practitioners may create a rubric for a piece of music, prior to its distribution to students, that describes the intended learning objectives that will be assessed. This rubric may prime students for engaging
with the music once rehearsals begin. Of particular application, the high rates of interjudge reliability found with the use of rubrics (Latimer et al., 2010; Norris & Borst, 2007) may facilitate assessments where more than one practitioner might be involved, such as placement auditions. Correspondingly, rating scales and checklists may also be used to communicate expectations.

Portfolios, while potentially time-consuming for practitioners, may provide a wider picture of student learning by incorporating diverse artifacts related to music. Student music theory work, performance, composition, and writing may all contribute to a holistic picture of understanding and growth. Practitioners who implement portfolios may wish to predetermine a specific approach to artifact inclusion in the portfolio, both to avoid student confusion and to allow effective assessment of the learning objectives. Parkes et al. (2015) noted that blind portfolio reviewers occasionally struggled to understand teacher artifact selection. To reduce extraneous time spent evaluating portfolios, practitioners may choose to set a priori categories or descriptors for artifact inclusion and establish overarching goals and procedures for the portfolio. Structured planning may allow practitioners to synthesize artifacts with greater ease. Conversely, overly detailed criteria may limit creativity by once again focusing students and teachers toward the assessment instead of the learning objective.

Assessment and accountability are inherent in music education. As a profession, we strive for improved teaching and learning, leading toward instilling in our students a lifelong love of music. Consequently, the effectiveness of our assessments matters, as motivation (Colwell, 1998; McClung, 1996; Reimer, 2009), external value judgments (Colwell, 1998; Reimer, 2009; Zerull, 1990), and, most important, student learning (Boud et al., 1999; Eisner, 1998; Lehman, 2008) can all be linked to assessment.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

References

Amabile, T. M. (1983). The social psychology of creativity. New York, NY: Springer.
Asmus, E. P. (1999). Music assessment concepts. Music Educators Journal, 86(2), 19–24. doi:10.2307/3399585
Azzara, C. D. (1993). Audiation-based improvisation techniques and elementary instrumental students' music achievement. Journal of Research in Music Education, 41, 328–342. doi:10.2307/3345508
Barbot, B., & Lubart, T. (2012). Creative thinking in music: Its nature and assessment through musical exploratory behaviors. Psychology of Aesthetics, Creativity, and the Arts, 6, 231–242. doi:10.1037/a0027307
Barkley, M. (2006). Assessment of the National Standards for Music Education: A study of elementary general music teacher attitudes and practices (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 1439697)
Bergee, M. J. (2003). Faculty interjudge reliability of music performance evaluation. Journal of Research in Music Education, 51, 137–150. doi:10.2307/3345847
Bergee, M. J. (2007). Performer, rater, occasion, and sequence as sources of variability in music performance assessment. Journal of Research in Music Education, 55, 344–358. doi:10.1177/0022429408317515
Bergee, M. J., & McWhirter, J. L. (2005). Selected influences on solo and small-ensemble festival ratings: Replication and extension. Journal of Research in Music Education, 53, 177–190. doi:10.1177/002242940505300207
Bergee, M. J., & Westfall, C. R. (2005). Stability of a model explaining selected extramusical influences on solo and small-ensemble festival ratings. Journal of Research in Music Education, 53, 358–374. doi:10.1177/002242940505300407
Boud, D., Cohen, R., & Sampson, J. (1999). Peer learning and assessment. Assessment & Evaluation in Higher Education, 24, 413–426.
Chiodo, P. (2001). Assessing a cast of thousands. Music Educators Journal, 87(6), 17–23. doi:10.2307/3399687
Ciorba, C. R., & Smith, N. Y. (2009). Measurement of instrumental and vocal undergraduate performance juries using a multidimensional assessment rubric. Journal of Research in Music Education, 57, 5–15. doi:10.1177/0022429409333405
Colwell, R. (1998). Preparing student teachers in assessment. Arts Education Policy Review, 99, 29–36. doi:10.1080/10632919809600780
Colwell, R. (2002). Assessment's potential in music education. In R. Colwell & C. Richardson (Eds.), The new handbook of research on music teaching and learning (pp. 1128–1158). Reston, VA: MENC.
Colwell, R. (2008). Music assessment in an increasingly politicized, accountability-driven educational environment. In T. S. Brophy (Ed.), Assessment in music education: Integrating curriculum, theory, and practice (pp. 3–16). Chicago, IL: GIA.
Denis, J. M. (2017). Novice Texas band directors' perceptions of the skills and knowledge for successful teaching (Doctoral dissertation). Retrieved from https://digital.library.unt.edu/ark:/67531/metadc1011801/
Doane, C., Davidson, C., & Hartman, J. (1990). A validation of music teacher behaviors based on music achievement in elementary general music students. Research Perspectives in Music Education, 44(1), 24–41.
Duke, R. A. (2005). Intelligent music teaching: Essays on the core principles of effective instruction. Austin, TX: Learning and Behavior Resources.
Eisenberg, J., & Thompson, W. F. (2003). A matter of taste: Evaluating improvised music. Creativity Research Journal, 15, 287–296. doi:10.1080/10400419.2003.9651421
Eisner, E. W. (1998). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. Upper Saddle River, NJ: Merrill.
Ferm Almqvist, C., Vinge, J., Väkevä, L., & Zandén, O. (2017). Assessment as learning in music education: The risk of "criteria compliance" replacing "learning" in the Scandinavian countries. Research Studies in Music Education, 39, 3–18. doi:10.1177/1321103X16676649
Fisher, R. (2008). Debating assessment in music education. Research & Issues in Music Education, 6(1), 4. Retrieved from http://ir.stthomas.edu/rime/vol6/iss1/4
Goolsby, T. W. (1999). Assessment in instrumental music. Music Educators Journal, 86(2), 31–50. doi:10.2307/3399587
Harrison, S. D., Lebler, D., Carey, G., Hitchcock, M., & O'Bryan, J. (2013). Making music or gaining grades? Assessment practices in tertiary music ensembles. British Journal of Music Education, 30, 27–42. doi:10.1017/S0265051712000253
Hickey, M. (2001). An application of Amabile's consensual assessment technique for rating the creativity of children's musical compositions. Journal of Research in Music Education, 49, 234–244. doi:10.2307/3345709
Hughes, D., & Keith, S. (2015). Linking assessment practices, unit-level outcomes and discipline-specific capabilities in contemporary music studies. In D. Lebler, G. Carey, & S. Harrison (Eds.), Assessment in music education: From policy to practice (pp. 171–193). New York, NY: Springer. doi:10.1007/978-3-319-10274-0_12
Kim, K. H., Cramond, B., & Bandalos, D. L. (2006). The latent structure and measurement invariance of scores on the Torrance Tests of Creative Thinking-Figural. Educational and Psychological Measurement, 66, 459–477. doi:10.1177/0013164405282456
Latimer, M. E., Jr., Bergee, M. J., & Cohen, M. L. (2010). Reliability and perceived pedagogical utility of a weighted music performance assessment rubric. Journal of Research in Music Education, 58, 168–183. doi:10.1177/0022429410369836
Lehman, P. R. (2008). Getting down to basics. In T. S. Brophy (Ed.), Assessment in music education: Integrating curriculum, theory, and practice (pp. 17–28). Chicago, IL: GIA.
Lu, C. C., & Luh, D. B. (2012). A comparison of assessment methods and raters in product creativity. Creativity Research Journal, 24, 331–337. doi:10.1080/10400419.2012.730327
Mastrorilli, T. M., Harnett, S., & Zhu, J. (2014). Arts achieve, impacting student success in the arts: Preliminary findings after one year of implementation. Journal for Learning Through the Arts, 10, 1–24. Retrieved from http://escholarship.org/uc/item/6c81239d
McClung, A. C. (1996). A descriptive study of learning assessment and grading practices in the high school choral music performance classroom (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 9700217)
McQuarrie, S. H., & Sherwin, R. G. (2013). Assessment in music education: Relationships between classroom practice and professional publication topics. Research & Issues in Music Education, 11(1), 6. Retrieved from http://ir.stthomas.edu/rime/vol11/iss1/6
National Association for Music Education. (2015). Assessment in music education. Retrieved from http://nafme.org/about/position-statements/assessment-in-music-education-position-statement/assessment-in-music-education/
Norris, C. E., & Borst, J. D. (2007). An examination of the reliabilities of two choral festival adjudication forms. Journal of Research in Music Education, 55, 237–251. doi:10.1177/002242940705500305
Parkes, K. A., Rohwer, D., & Davison, D. (2015). Measuring student music growth with blind-reviewed portfolios: A pilot study. Bulletin of the Council for Research in Music Education, 203, 23–44. doi:10.5406/bulcouresmusedu.203.0023
Reimer, M. U. (2009). Assessing individual performance in the college band. Research & Issues in Music Education, 7(1), 7. Retrieved from http://ir.stthomas.edu/rime/vol7/iss1/3
Rohwer, D. A. (1997). The challenges of teaching and assessing creative activities. Update: Applications of Research in Music Education, 15(2), 8–12.
Russell, J. A., & Austin, J. R. (2010). Assessment practices of secondary music teachers. Journal of Research in Music Education, 58, 37–54. doi:10.1177/0022429409360062
Ryan, C., & Costa-Giomi, E. (2004). Attractiveness bias in the evaluation of young pianists' performances. Journal of Research in Music Education, 52, 141–154. doi:10.2307/3345436
Salvador, K. (2011). Individualizing elementary general music instruction: Case studies of assessment and differentiation (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 3482549)
Saunders, T. C., & Holahan, J. M. (1997). Criteria-specific rating scales in the evaluation of high school instrumental performance. Journal of Research in Music Education, 45, 259–272. doi:10.2307/3345585
Shuler, S. C., Norgaard, M., & Blakeslee, M. J. (2014). The new national standards for music educators. Music Educators Journal, 101(1), 41–49.
Sicherl Kafol, B., Kordeš, U., & Holcar Brunauer, A. (2017). Assessment for learning in music education in the Slovenian context: From punishment or reward to support. Music Education Research, 19, 17–28. doi:10.1080/14613808.2015.1077800
Silveira, J. M., & Gavin, R. (2016). The effect of audio recording and playback on self-assessment among middle school instrumental music students. Psychology of Music, 44, 880–892. doi:10.1177/0305735615596375
Simanton, E. G. (2000). Assessment and grading practices among high school band teachers in the United States: A descriptive study (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 304630933)
Stefanic, N., & Randles, C. (2015). Examining the reliability of scores from the consensual assessment technique in the measurement of individual and small group creativity. Music Education Research, 17, 278–295. doi:10.1080/14613808.2014.909398
Wesolowski, B. (2014). Documenting student learning in music performance: A framework. Music Educators Journal, 101(1), 77–85. doi:10.1177/0027432114540475
West, C. (2012). Teaching music in an era of high-stakes testing and budget reductions. Arts Education Policy Review, 113, 75–79. doi:10.1080/10632913.2012.656503
Wright, J., Humphrey, J., Larrick, G. H., Gifford, R. M., & Wardlaw, M. (2005). Don't count on testing. Music Educators Journal, 92(2), 6–7. doi:10.2307/3400175
Zerull, D. S. (1990). Evaluation in arts education: Building and using an effective assessment strategy. Design for Arts in Education, 92(1), 19–24.
