The Florida State University

DigiNole Commons
Electronic Theses, Treatises and Dissertations

The Graduate School

11-8-2011

The Role Of Segmentation And Expectation In The Perception Of Closure
Crystal Peebles
The Florida State University

Follow this and additional works at: http://diginole.lib.fsu.edu/etd


Recommended Citation
Peebles, Crystal, "The Role Of Segmentation And Expectation In The Perception Of Closure" (2011). Electronic Theses, Treatises and
Dissertations. Paper 5102.

This Dissertation - Open Access is brought to you for free and open access by The Graduate School at DigiNole Commons. It has been accepted for
inclusion in Electronic Theses, Treatises and Dissertations by an authorized administrator of DigiNole Commons. For more information, please contact
lib-ir@fsu.edu.

THE FLORIDA STATE UNIVERSITY


COLLEGE OF MUSIC

THE ROLE OF SEGMENTATION AND EXPECTATION IN THE PERCEPTION OF CLOSURE

By
CRYSTAL PEEBLES

A dissertation submitted to the


College of Music
in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy

Degree Awarded:
Fall Semester, 2011

Copyright 2011
Crystal Peebles
All Rights Reserved

Crystal Peebles defended this dissertation on October 31, 2011.


The members of the supervisory committee were:

Nancy Rogers
Professor Directing Dissertation
Michael Kaschak
University Representative
James Mathes
Committee Member
Matthew Shaftel
Committee Member

The Graduate School has verified and approved the above-named committee members, and
certifies that the dissertation has been approved in accordance with university requirements.

For my parents, John and Lola Peebles


ACKNOWLEDGEMENTS
I first must thank my committee members for their unending support and guidance
through the duration of this project. The unique perspectives contributed by Jim Mathes, Matt
Shaftel, and Mike Kaschak positively shaped my initial inquiry and the subsequent outcome. I
especially want to thank Mike for his assistance in designing the experiments, recruiting and
testing participants, and analyzing the data. Finally, to the head of my committee, Nancy Rogers,
I am immeasurably grateful for her constructive criticism, her keen attention to detail, and her
encouragement, both during this project and throughout my graduate career.
My experience as a graduate student in the music theory program at The Florida State
University has shaped me personally and professionally in more ways than I can count. I am
indebted to the entire music theory faculty for providing opportunities for me to grow both as a
teacher and as a scholar. I also value the meaningful friendships and professional relationships I
have formed with my fellow graduate students at Florida State. I will always cherish the time we
spent together in the T.A. office, in the library, and at the local pizza joint.
Within the Tallahassee community, I would also like to thank my violin students and
their parents for their untempered enthusiasm for music as well as the lovely people at St. Paul's
United Methodist Church. You have been my second family during my six years in Tallahassee,
offering an unwavering community of love and support. Indeed, I thank God every time I
remember you.
I cannot begin to express my gratitude for my family's love and encouragement. My
parents, John and Lola Peebles, have consistently supported me in all I do, even when it takes me
halfway across the country. A special note of thanks also goes to Rachel McCleery, whose
loving companionship during my graduate studies has made an indelible mark on my life. I will
always treasure our many meaningful conversations shared over a delicious dinner or a warm
cup of coffee. Your critical commentary and insight certainly influenced this project and are
reflected in these pages.
Finally, I would like to acknowledge Boosey & Hawkes and European American Music
Distributors for their permission to reproduce copyrighted excerpts of twentieth-century works
by Bartók, Copland, and Webern.


TABLE OF CONTENTS
List of Tables ............................................................................................................................vii
List of Figures............................................................................................................................ix
List of Examples .........................................................................................................................x
Abstract ....................................................................................................................................xii
CHAPTER 1: INTRODUCTION................................................................................................1
CHAPTER 2: MUSICAL CHARACTERISTICS OF CLOSURE.......................................6
Closure as the Completion of a Goal-Directed Process ........................................................6
Closure as the Segmentation of Musical Experience..........................................................11
Hierarchy and Closure.......................................................................................................15
Style and Closure ..............................................................................................................18
CHAPTER 3: MUSICAL EXPECTATION AND CLOSURE ..................................................22
Formation of Expectations: Statistical Learning ................................................................22
Expectation .......................................................................................................................26
Expectation in Music Theory ...................................................................................26
Types of Expectation and Schema............................................................................29
Expectation and Memory: An Alternative View.......................................................38
An Expectation-based Model of Closure ...........................................................................41
Three Analytical Vignettes................................................................................................47
Schumann's "Widmung" .........................................................47
Webern's "Der Tag ist vergangen"...........................................57
Copland's "The World Feels Dusty" ........................................60
CHAPTER 4: EVENT SEGMENTATION THEORY...............................................................67
Event Segmentation ..........................................................................................................67
Event Segmentation Theory ..............................................................................................73
Event Segmentation Theory and Musical Closure .............................................................77
Experiment Overview .......................................................................................................81
Experiment 1 ...........................................................................................................81
Experiment 2 ...........................................................................................................81
Experiment 3 ...........................................................................................................82
CHAPTER 5: EXPERIMENT 1................................................................................................85
Method .............................................................................................................................86
Participants ..............................................................................................................86
Stimuli .....................................................................................................................86
Coding Procedure ....................................................................................................88
Participant Procedure.............................................................................................101
Results ............................................................................................................................102
General Results......................................................................................................104

Experiment 1a: Bartók Results...............................................116
Experiment 1b: Mozart Results ..............................................124
Grouping Analysis .................................................................................................135
Discussion.......................................................................................................................149
CHAPTER 6: EXPERIMENT 2..............................................................................................153
Method ...........................................................................................................................154
Participants ............................................................................................................154
Stimuli...................................................................................................................154
Procedure...............................................................................................................157
Results ............................................................................................................................157
Discussion.......................................................................................................................162
CHAPTER 7: EXPERIMENT 3..............................................................................................165
Method ...........................................................................................................................166
Participants ............................................................................................................166
Stimuli...................................................................................................................166
Procedure...............................................................................................................167
Results ............................................................................................................................170
Discussion.......................................................................................................................179
CHAPTER 8: CLOSURE........................................................................................................181
APPENDIX A: SEGMENTATION RESPONSES IN EXPERIMENT 1 .................................186
APPENDIX B: ANNOTATED SCORES FOR EXPERIMENT 3 ...........................................201
APPENDIX C: COPYRIGHT PERMISSION LETTERS........................................................208
APPENDIX D: IRB APPROVAL LETTER AND INFORMED CONSENT LETTER ...........211
REFERENCES .......................................................................................................................215
BIOGRAPHICAL SKETCH...................................................................................................222


LIST OF TABLES
2.1 Lerdahl and Jackendoff's Grouping Preference Rules .......................................12
3.1 Text and Translation of "Widmung" .................................................49
4.1 Lerdahl and Jackendoff's Grouping Well-Formedness Rules ............................69
5.1 Musical Stimuli Characteristics for Experiments 1a and 1b ...............................................88
5.2 Window Construction in Each Window.............................................................................89
5.3 Arrival and Change Features in Bartók..............................................90
5.4 Arrival and Change Features in Mozart .............................................................................91
5.5 Total Number of Responses and Percentage Used in Data Analysis (Bartók)...................105
5.6 Total Number of Responses and Percentage Used in Data Analysis (Mozart)..................106
5.7 ANOVA Means for Interactions between Starting Task and the Nested Structure
(Bartók)..........................................................................................................110
5.8 Mixed Models Regression Analysis: Latency Time.........................................................110
5.9 Mixed Logit Regression Analysis: Number of Changes...................................................111
5.10 Mixed Logit Regression Analysis: Ending Type..............................................................114
5.11 Mixed Logit Regression Analysis: Arrival Features, Third Movement ............................117
5.12 Mixed Logit Regression Analysis: Arrival Features, Fifth Movement .............................118
5.13 Mixed Logit Regression Analysis: Change Features, Third Movement............................120
5.14 ANOVA Means for Interactions in Change Feature Analysis, Third Movement ..............120
5.15 Percentage of Responses at Complete Silence in Coarse 2, Third Movement...................122
5.16 Mixed Logit Regression Analysis: Change Features, Fifth Movement.............................123
5.17 ANOVA Means for Interactions in Change Feature Analysis, Fifth Movement ...............124
5.18 Mixed Logit Regression Analysis: Arrival Features, No. 19............................................125
5.19 ANOVA Means for Interactions in the Arrival Feature Analysis, No. 19.........................127
5.20 Percentage of Responses at PACs in Fine 2, No. 19 ........................................................128
5.21 Mixed Logit Regression Analysis: Arrival Features, No. 21............................................130
5.22 ANOVA Means for Interactions in the Arrival Feature Analysis, No. 21.........................131
5.23 Mixed Logit Regression Analysis: Change Features, No. 19 ...........................................132
5.24 Mixed Logit Regression Analysis: Change Features, No. 21 ...........................................132
5.25 Section Divisions in Bartók, String Quartet No. 5, Fifth Movement ................139
5.26 Mixed Logit Regression Analysis: Grouping Analysis ....................................................147

5.27 ANOVA Means for Interactions in the Grouping Analysis ..............................................148


6.1 Exposure Excerpts ..........................................................................................................155
6.2 Mixed Models Regression Analysis: Rating ....................................................................158
7.1 Mixed Logit Regression Analysis: Cadence and Hypermeter ..........................................172
7.2 Mixed Logit Regression Analysis: Cadence Types ..........................................................173
7.3 Mixed Models Regression Analysis: Response Time and Cadences ................................177
7.4 Mixed Models Regression Analysis: Response Time and Cadence Types .......................177
7.5 Mixed Models Regression Analysis: Ratings...................................................................178
7.6 Mixed Models Regression Analysis: Ratings and Response Time ...................................179


LIST OF FIGURES
3.1 Continuum of Expectations ...............................................................................................33
3.2 Continuum of Expectations with Schemata .......................................................................34
3.3 Continuum of Expectation: Closural Expectations.............................................................44
4.1 Schematic Depiction of the Event Segmentation Theory ...................................................74
4.2 Schematic Depiction of the Segmentation Process as Posited by EST................................76
5.1 Interactions between Subject Group and Consistency (Mozart) .......................................109
5.2 Interaction between Starting Condition and Consistency (Mozart)...................................109
5.3 Interactions between Subject Group and Number of Changes (Bartók)............................112
5.4 Interactions between Subject Group and Number of Changes (Mozart) ...........................113
5.5 Interactions between Subject Group and Ending Type (Bartók).......................................115
5.6 Interactions between Subject Group and Ending Type (Mozart) ......................................116
5.7 Possible Phrase Structure Analyses .................................................................................139
5.8 Mozart, String Quartet No. 19, mvmt. 4: Grouping Analysis of the Exposition................144
6.1 Two-way Interaction between Subject Group and the Exposure Composer......................159
6.2 Three-way Interaction between Subject Group, the Exposure Composer, and the Rated
Composer........................................................................................................................160
6.3 Two-way Interaction between Participant Group and the Ratings for Composer (cadential
excerpts) .........................................................................................................................161
6.4 Two-way Interaction between Participant Group and the Ratings for Composer (all
excerpts) .........................................................................................................................162
7.1 Excerpt No. 6 from Mozart's String Quartet in G Major (K. 156), third movement .........166
7.2 Interactions between Subject Group and the Presence of a Cadence ................................172
7.3 Interactions between Subject Group and Cadence Type...................................................175
A.1 Bartók, String Quartet No. 4, third movement .................................................187
A.2 Bartók, String Quartet No. 4, fifth movement..................................................190
A.3 Mozart, String Quartet No. 19, fourth movement.............................................................195
A.4 Mozart, String Quartet No. 21, second movement ...........................................................199


LIST OF MUSICAL EXAMPLES


2.1 Beethoven, String Quartet, Op. 130, second movement, mm. 1–8 (analysis after Meyer) ...15
3.1 Beethoven, "God Save the King," WoO 78, mm. 1–6 ...........................................40
3.2 Schumann, Myrthen, "Widmung," Op. 25, No. 1, mm. 1–13 ..............................50
3.3 Schumann, Myrthen, "Widmung," Op. 25, No. 1, mm. 14–29 ............................54
3.4 Schumann, Myrthen, "Widmung," Op. 25, No. 1, mm. 37–44 ............................56
3.5 Webern, Vier Lieder, "Der Tag ist vergangen," Op. 12, No. 1, mm. 1–11..........59
3.6 Webern, Vier Lieder, "Der Tag ist vergangen," Op. 12, No. 1, mm. 18–21........60
3.7 Copland, Twelve Poems of Emily Dickinson, "The World Feels Dusty," mm. 1–2.............63
3.8 Copland, Twelve Poems of Emily Dickinson, "The World Feels Dusty," m. 27..................65
5.1 Mozart, String Quartet No. 19, fourth movement, mm. 89–93 ...........................92
5.2 Mozart, String Quartet No. 19, fourth movement, mm. 67–70 ...........................93
5.3 Mozart, String Quartet No. 19, fourth movement, mm. 76–78 ...........................93
5.4 Mozart, String Quartet No. 21, second movement, mm. 15–20 .........................94
5.5 Bartók, String Quartet No. 4, third movement, mm. 20–23................................95
5.6 Bartók, String Quartet No. 4, third movement, mm. 40–41................................96
5.7 Bartók, String Quartet No. 4, fifth movement, mm. 235–239.............................96
5.8 Bartók, String Quartet No. 4, fifth movement, mm. 330–332.............................97
5.9 Bartók, String Quartet No. 4, fifth movement, mm. 74–76 ................................97
5.10 Bartók, String Quartet No. 4, fifth movement, mm. 279–284.............................98
5.11 Bartók, String Quartet No. 4, third movement, mm. 6–35 (cello).....................121
5.12 Mozart, String Quartet No. 21, second movement, mm. 1–8 (violin 1) ............131
5.13 Bartók, String Quartet No. 4, fifth movement, mm. 11–18 ..............................140
5.14 Bartók, String Quartet No. 4, fifth movement, mm. 102–108...........................140
5.15 Bartók, String Quartet No. 4, fifth movement, mm. 238–249...........................141
5.16 Mozart, String Quartet No. 19, fourth movement, mm. 1–34 (violin 1 and cello).............143
5.17 Mozart, String Quartet No. 19, fourth movement, mm. 118–135 (violin 1 and cello).......144
5.18 Mozart, String Quartet No. 21, second movement, mm. 40–44........................146
6.1 Motive x from Bartók's String Quartet No. 4, first movement, m. 7................156
6.2 Motive y from Bartók's String Quartet No. 4, fifth movement, mm. 16–18 .......156

7.1 Mozart, String Quintet No. 4 in G Minor (K. 516), third movement, mm. 1–13................168
7.2 Mozart's Sonata for Piano and Violin in B♭ Major (K. 454), third movement,
mm. 1–16.......................................................................170
B.1 Mozart, Quartet No. 3 in G Major, K. 156, third movement.............................................201
B.2 Mozart, String Quartet No. 8 in F Major, K. 168, third movement...................................203
B.3 Mozart, String Quartet No. 13 in D Minor, K. 173, third movement ................................205


ABSTRACT
In the musicological discourse, "closure" can refer to a variety of musical phenomena,
but the language describing closure usually involves at least one of two common metaphors:
closure is the completion of a musical process, or closure is the segmentation of musical
experience. Along with these two descriptions of closure, musicians also recognize that closure's
markers vary between musical styles and that some moments of closure are stronger than others,
articulating a composition's hierarchical construction. These four characteristics of closure,
gleaned from the musicological literature, inform my definition of closure: an anticipated end to
a musical segment.
This dissertation will empirically investigate the role of expectation in the perception of
closure. I hypothesize that closure is not something intrinsic to a piece of music; rather, it relies
on an individual's previous musical encounters. This previous experience gives rise to musical
expectations, and closure is experienced when a listener is accurately able to anticipate the end of
a musical segment, on any hierarchical level. The degree of perceived closure correlates with a
listener's ability to predict an ending, coupled with relatively weak expectations for what will
occur next. This perspective is informed by recent research in Event Segmentation Theory
(EST), a theory from the field of cognitive psychology that investigates the segmentation of
everyday non-musical events.
Three experimental studies test this hypothesis. The first study determines whether
listeners segment music according to the predictions made by EST. The results from this study
corroborate previous research: listeners consistently use musical features to segment an ongoing
composition, and the fine segmentation results are nested within the coarse segmentation results.
The learning task in the second study ascertains whether exposure to an unfamiliar musical style
will change a listeners perception of closure in that style. While the data do not entirely confirm
this hypothesis, results from this study suggest the importance of previous experience in the
perception of closure. The third study finds a correlation between predicted endings in a familiar
style and the rating of a listeners perceived strength of closure. Results from all three studies
support an expectation-based model of musical segmentation and the perception of closure.


CHAPTER 1
INTRODUCTION

Listeners and musicians have intense aesthetic opinions regarding the degree to which a
particular musical ending sounds satisfying. "Closure" is the oft-used term to describe the
listener's feeling of satisfaction or completeness. Although the concept of musical closure is
ubiquitous in the scholarly discourse, the exact meaning of closure can vary widely among
authors. Despite these different meanings, similarities in closural metaphors speak to our shared
experience of finality. This study examines that shared experience: exploring characteristics of
closure, situating these characteristics in a listener's musical expectations, and creating a
cognitive model of musical closure that transcends stylistic boundaries. In the second half of this
dissertation, I discuss a series of three experiments that test this model.
I define closure as the feeling of finality that occurs at the anticipated end of a musical
segment. This definition brings up three points that will be expanded throughout the course of
this project. First, I am primarily concerned with the listener's perception of closure, not with
how a composer achieves closure in a composition or with pinpointing the moment at which
closure occurs. By taking a listener's perspective, I am free to explore how listeners experience
closure in different repertoires without being bogged down by stylistic differences. Instead of
specifically talking about musical signs of closure, I examine how these signs evoke the feeling
of finality in a listener. Second, musical experience is segmented into discrete events that have a
beginning and end. Being able to segment the continuous stream of acoustical input is a
prerequisite to experiencing closure in the first place, for without endings there would be no
closure. Finally, a listener must be able to anticipate the placement and content of an ending in
order to experience finality. These expectations may not be conscious, and a listener will not
always be able to anticipate the exact content of an ending, but expectation is an integral part of
my model of closure.
In this chapter, I illustrate various meanings of closure seen through a handful of recent
contributions to the fields of music theory and musicology, where closure shapes the theoretic or
analytic narrative. Closure is regularly used as a synonym for resolution, particularly a
cadence-defining V–I progression in tonal music, but other times musical closure is highlighted
because it advances the theoretical aims of the author, illustrates the stylistic tendencies of a
composer, or supports the author's overarching analytical narrative. Despite methodological
differences, common metaphors of closure emerge. By far, the most common metaphor for
closure is likening it to a goal at the end of a musical pathway, experienced as a point of finality
or stasis, but closure can also be conceived as a boundary separating musical entities.1 These
common metaphors speak to two actions of closure: (1) closure marks the achievement of a
musical goal (usually coupled with a feeling of finality), and (2) closure segments a listener's
musical experience. While these metaphors are not necessarily explicit within the discourse, they
do shape the concept of closure.
James Hepokoski and Warren Darcy's oft-cited Elements of Sonata Theory
reconceptualizes sonata form as movements toward genre-defined goals (2006). These two main
goals are the essential expositional closure (EEC), the obligatory perfect authentic cadence
(PAC) located near the end of the exposition, and the essential structural closure (ESC), the
PAC in the corresponding place in the recapitulation. This perspective shifts the focus of sonata
form from a schematic script to a more dynamic process. According to Hepokoski and Darcy's
theory, the EEC is usually the first PAC following the initiation of the secondary theme: it is "the
moment towards which the preceding secondary theme has been aiming" (120). The authors are
careful to note that the EEC may not be the strongest cadence in the exposition; the EEC merely
represents the goal of the exposition.
One should not determine an EEC on the basis of what one imagines an EEC should
feel like in terms of force or unassailably conclusive implication. Nor should one
assume that we are making grand claims regarding either the completeness or the degree
of the closure implied by the EEC. Its closure may not in fact be absolute or fully
satisfying from the perspective of the larger proportions of or other telling factors within
the exposition as a whole. The first PAC closing the essential exposition is primarily an
attainment of an important generic requirement, nothing more and nothing less. (124)
Not only does the EEC mark the first confirmation of the new key (the goal of the exposition), it
also separates the secondary theme from the closing theme, dividing the exposition into smaller
1

These two metaphors of closure are similar to two of Brower's (2000) embodied image schemas:
SOURCE-PATH-GOAL and CONTAINER. These yield the metaphoric concepts of musical motion and musical
space. While a detailed study of an embodied basis of musical closure is outside the scope of this project, I speculate
that our shared experience of closure is rooted in embodiment.

parts. Closure in this sense is not dependent on listener perception; rather, it is a compositional
construct (according to Hepokoski and Darcy) that marks the end of a theoretically defined
process.
In some cases, a listener's perception of closure contradicts the interpretation of closure
from a theoretical perspective. Such is the case in Edward Pearsall's (1999) article, which
explores the analytical process through the lens of current cognitive theories. The second half of
this article analyzes "Nun ich der Riesen Stärksten überwand" by Alban Berg (Op. 2, No. 3).
Pearsall notes that the song ends on the dominant, an unexpected harmony that could signify a
lack of closure. Yet, as he states, "when we listen to the song, we have the sense that the piece
does end with finality" (246). In order to reconcile the perception of finality with the apparent
open-ended conclusion, Pearsall reconstructs the analysis to make the ending goal-directed
(253, footnote 20). By reinterpreting the last note of the penultimate melodic unit as an upper
neighbor to the last pitch of the song, he is able to reinterpret previously unexplained pitches
from earlier in the composition as semitone neighbors to the last pitch as well. In this example,
the author constructs a goal-directed process toward the last note to account for his feeling of
finality, and the analytical whole is shaped by this interpretation. This process of reconciling the
perception of finality with the theoretical lack of closure (defined by not ending on a tonic
harmony) illustrates Pearsalls stance that music is not a collection of immutable structures but
rather a subject for intentional creative perception (231).
The achievement or denial of closure frequently supports a larger narrative, especially
when closure is conceived as achieving a goal in the music, thus playing an integral role in
conveying musical meaning. When closure is problematized in analysis, often it is because the
expected goal of a musical process is postponed or even completely denied. Alternatively, as
seen in Pearsall's article, the experience of finality may contradict prevailing theories regarding
closure. In other cases it is the very denial of closure that carries the expressive content of the
composition. For instance, Ramon Satyendra (1997) examines four works by Liszt, each
implying a key but lacking adequate tonic resolution; all four compositions prolong a dominant
harmony that never resolves to a tonic chord on the same structural level. While Satyendra
argues that these pieces are contextually closed (beginning and ending with the same harmony),
they remain open because the dominant harmony remains unresolved. This paradox (the tension
between contextual closure and tonal openness) reflects "the Romantic aesthetic" (194).
Other authors examine closure as a reflection of a particular composer's stylistic
tendencies. W. Dean Sutcliffe (2010) argues that Haydn's slow movements written in the 1770s
are marked by an "expressive ambivalence" that defines his style (98). One way Haydn creates
this ambivalence, according to Sutcliffe, is by using the same passage to evoke two different
affective attributes. In the Andante of Haydn's Symphony No. 52, the same musical material
both opens and closes the first phrase, creating "two different, apparently opposed, meanings"
(102). Closure is somewhat obscured, compared with Mozart's more "punctual" endings from
the same period (110), but in Haydn's slow movements this helps create the impression of
ambivalence because the same gestures engender a feeling of initiation and finality.
Closure in Mahler's symphonies contributes expressively to the unfolding musical drama,
according to Seth Monahan (2011). In his analyses, recapitulatory success or failure (as defined
by Hepokoski and Darcy) correlates with the expressive outcome of the movement, but Mahler
moves away from a "closure-oriented tonal narrative" in his later compositions (38). Thus, the
expressive meaning of closure changes as Mahler's style changes. Monahan states that Mahler's
earlier sonata dramas "are oriented specifically around the ability of the secondary theme (S) to
attain tonic closure," where the S-theme acts like "a musical agent bent on controlling its own
modal/tonal fate, seeking to secure closure in the tonic major while avoiding a tragic collapse
into minor" (40). This goal-oriented perspective of closure supports Monahan's dramatic
narrative.
At the annual meeting of the Society for Music Theory in 2010, John Roeder presented
his analysis of Saariaho's song for two sopranos, "The claw of the magnolia," which is the third
of five settings from Sylvia Plath's poem "Paralytic." As a part of his narrative, Roeder
demonstrated how a gesture can become associated with closure as the piece unfolds and how
surrounding pitch material may imbue this gesture with meaning. He noted that the tritone in
m. 2 (between F4 and B4) sounds like an ending retrospectively because it precedes the first
simultaneous attack of the song (F#4 and A#4). This reading elevates the tritone to a cadential
gesture of sorts: a harmonic entity regularly segmenting phrases. Another analysis of this same
passage suggests that two different diatonic sets are superimposed, one with a B tonic, the other
with a B♭ tonic. As Roeder noted, such a reading seems contradictory to hearing the tritone as an
ending feature: "[T]he tonal focus modulates again to A#/B♭ and then to B. Such intuitions raise
an interpretative problem: they do not attribute repose to a tritone, and so they run contrary to
hearing stability at the {B, F}s that terminate the phrases." To reconcile these readings, Roeder
interpreted the {F#, A#} dyad as implying both tonalities simultaneously, combining the
dominant of B with the tonic of B♭, which allows the tritone to be interpreted as including both
the dominant of B♭ and the tonic of B. By recasting this gesture in a larger analytic narrative, this
same tritone, which concludes the entire song, provides both convincing closure and an
ingenious musical expression of the paralytic's mentality. Closure in this sense is not the
realization of a musical goal, for as Roeder stated, the mezzo's last F4 "wants to fall again to A#3
tonic, but this goal fails to be realized, leaving the listener musically, like the paralytic
literally, in a state of suspended animation"; instead, closure here refers to a fitting ending
considering the meaning of the text.
This small sample of musicological discourse demonstrates how closure shapes the
analytic and theoretic narrative. All of these examples serve to emphasize how important the
feeling of closure is in our musical experience, and how the common metaphoric descriptions of
closure speak to a shared experience of closure. By shifting the focus from how a composition or
composer achieves closure to how a listener experiences closure, I create an expectation-based
model for this shared experience. The satisfaction and feeling of finality associated with closure
are tied to how a listener expects the piece to unfold and, more importantly, how and when a
listener expects a composition (or a segment of a composition) to end. In the remainder of the
first part of my dissertation, I explore four common characteristics of closure (Chapter 2) and
describe how these characteristics are accounted for in an expectation-based model of closure
(Chapter 3). This emphasis on expectation is supported by recent theories in event segmentation,
which outline a possible cognitive process for the segmentation of music and the perception of
closure (Chapter 4).

CHAPTER 2
MUSICAL CHARACTERISTICS OF CLOSURE

Leonard Meyer conveys an inclusive understanding of closure in his influential books
Emotion and Meaning in Music (1956) and Explaining Music (1973). In these books, Meyer
enumerates various musical parameters relating to closure, recognizes that closure occurs on
different hierarchical levels, and considers closure in post-tonal music. He suggests four
characteristics of closure, which will be fleshed out in more detail throughout this chapter:
(1) closure is a completion of a goal-directed process resulting in an arrival of relative
stability or rest;
(2) closure segments a continuous musical stream into discrete events;
(3) the strength of closure depends on many musical variables and plays an integral role
in the hierarchic construction of a composition; and
(4) closure is stylistically dependent.
The first two characteristics of closure were discussed as common metaphors of closure (closure
is a directed goal and closure is a segmenting agent) in the previous chapter. The final two
characteristics can also be inferred through a close reading of those same analyses; for instance,
Hepokoski and Darcy (2006) recognize that PACs have varying degrees of finality, and even
within the small sample of analyses in the previous chapter, authors highlight different signs of
closure appropriate for the musical style. The extent to which an analyst emphasizes one of these
concepts over another colors the resulting musical discourse, resulting both in different
metaphorical descriptions of closure and in different evaluations of closures meaning within the
analytic narrative.

Closure as the Completion of a Goal-Directed Process


For both Robert Hopkins and Leonard Meyer, the completion of a musical process is the
most important marker of closure; both state that without a process, the music will just stop and
not close (Hopkins 1990, 4; Meyer 1956, 139). In other words, closure occurs upon completion
of a commonly recognized musical process or event as defined through an analytical theory. For
Meyer, the musical syntax of a particular style determines the processes that drive toward closure
and is manifested through its primary musical parameters. In tonal music, these primary

parameters would include melody, rhythm, and harmony, while timbre, dynamics, and register
would be secondary parameters (1973, 88). Meyer values primary parameters over secondary
parameters because these primary parameters point to a particular moment of cadential closure in
tonal music, while secondary parameters merely contribute to the perceived strength of this
closure.2 Closure defined through these goal-directed processes is understood as syntactical
closure.
For many theorists looking at tonal literature, the completion of the Schenkerian Ursatz
defines syntactical closure, with the arrival of scale degree 1 marking the attainment of a goal. This syntactic
understanding of closure is pervasive in the musicological discourse. Mark Anson-Cartwright
(2007) acknowledges that one assumption about closure has abided in discourse about tonal
music: "the idea that it is synonymous with tonal (or structural or syntactic) closure, a state of
rest articulated by a cadence, usually very close to, or even coinciding with, the end of a piece
or movement" (1). Even Patrick McCreless (1991), who addresses four types of closure,
privileges syntactic closure over other varieties. In his analysis of Beethoven's Piano Sonata in
C minor (Op. 10, No. 1), McCreless reveals this preference by first locating the point of
syntactic closure, "because syntactic closure is primary" (65).
While Anson-Cartwright and McCreless are mainly concerned with closure of the entire
piece or movement (or, using Agawu's term (1987), "global closure"), tonal processes can create
closure at the end of a phrase as well, with the goal being the last chord in a cadence. Although
tonal processes at the local or intermediate level are less specifically prescribed, from a
Schenkerian perspective, harmonic and contrapuntal structures of the Ursatz are projected onto
lower structural levels (Cadwallader and Gagné 2006). Alternatively, the basic phrase model
provides another source of smaller-scale musical processes: since the basic phrase model is a
formulaic succession of sonic events (contextually defined by a particular musical style), the
musical process consists of completing the step-by-step model. Even more simply, syntactic
closure often serves as a synonym for the resolution of V to I on any structural level of a
composition.

2 According to Meyer (1956), being able to predict when closure will occur is a prerequisite for being able
to perceive closure.

While tonal closure from a Schenkerian perspective remains a theoretical construct,
empirical research on the extent to which listeners perceive large-scale closure adhering to
Schenkerian norms (e.g., a composition beginning and ending in the same key) is not clear.3
Nicholas Cook (1987) found that listeners indicated a sense of closure in recomposed works even
when the music ended in a different key, suggesting that large-scale tonal closure is not
necessarily the only marker of closure. These studies, however, did not ask whether a listener
could detect these recomposed passages; Elizabeth West Marvin and Alexander Brinkman's
(1999) study demonstrated that expert musicians were able to detect whether musical passages
began and ended in the same key.
Michael Graubart (2003) recognizes the importance of tonal music's goal-directed nature,
noting the sense of completion when, "after setting-up of a charged dominant region and various
ensuing tonal adventures and misadventures, the tonic key is firmly re-established" (34). This
type of goal direction is missing in twelve-note music, so Graubart appeals to the completion of a
twelve-note row, proposing that the twelfth note of the pattern would supply the needed goal of
the musical process (34).4 Hence, this compositional method might substitute for tonal closure
in non-tonal works, although Graubart admits that a listener may have trouble recognizing this
pattern completion (specifically at the beginning of a row, given that a listener would not be able
to anticipate the ending pitch-class until the row approached its end). Although Graubart perhaps
carries his comparison between closure at the completion of a twelve-tone row and closure at a
tonal cadence to an extreme, his proposal speaks to the pervasive use of a goal-directed syntactic
concept of closure in musicological literature.
Not all musicians, though, subscribe to this goal-oriented view of musical closure. Anne
Hyland (2009) finds fault with syntactic closure as Hepokoski and Darcy's (2006) main defining
factor of sonata form. In her analysis of the first movement from Schubert's C-Major String
Quartet (D. 46), Hyland argues that the bias toward goal-directed trajectories in Hepokoski and
Darcy's theory (e.g., the exposition moves towards the EEC) does not accurately describe this

3 I do not think that any analytic theory needs empirical validation. However, since my study focuses on
listeners' perception of closure, I will use empirical studies to support my model of the perception of musical
closure.
4 As Graubart states, twelve-note rows "may give back to atonal music a goal-directed force and the
possibility of closure" (36).

movement, which instead relies on rhetorical signs of closure. In contrast to syntactic closure,
rhetorical closure consists of "those signals of a work's finality or closure which are not tonal in
nature" (113). While syntactical closure, defined through primary parameters, may be sufficient
for teleologically-composed pieces in the tonal style, other musical features may contribute to a
sense of goal-direction and the perception of closure. Like Hyland, Hopkins (1990) and Bryden
(2001) turn to secondary parameters to explain a process of closure.
In musical styles beyond common practice, secondary features may signify goal
completion. In his study of Mahlers music, Hopkins (1990) proposes the concept of
abatement to explain closure for a composer who at times eschews traditional cadences. In
order to depict abatement, Hopkins creates graphs that show the composite dynamics, registral
pitch, durations, and concordance (i.e., consonance) of the entire texture along with the number
of voices. For Hopkins, closure occurs when these parameters abate (or descend): for instance,
durations increase, harmonies become more consonant, and dynamics soften. Hopkins, however,
criticizes his own theoretic concept of abatement for not technically being a goal-directed
process. He explicitly states that a goal-directed process is a requirement for closure because "for
closure to occur, it is necessary but not sufficient for a discernible process or pattern in one
or more musical parameters to imply a particular point of conclusion" (4; emphasis added).
Hopkins recognizes that a listener cannot predict the moment closure will occur using his
abatement principles, but there seems to be an eventual goal of dying out that the listener
perceives. Even though Hopkins seems to be contradicting himself (i.e., abatement is the process
of closure in Mahler, but because we cannot predict the moment at which the process is
completed, closure can never occur), he suggests that a listener's feeling of finality may
depend on expectations other than predicting the exact moment of an ending.
Kristy Bryden (2001) uses a similar model in her dissertation on closure in late twentieth-century
chamber works. She begins with six characteristics of closural processes that transcend
stylistic boundaries.

Closural processes are


1) temporal and may operate on both local and larger more global levels,
2) lines of increasing intensity followed by lines of decreasing intensity,
3) the creation and either the fulfillment or postponement of expectations,
4) a summary of past events,
5) the highlighting of concluding moments, and
6) transitional techniques leading into or foreshadowing the following event. (i)
The second of her six definitions outlines a theoretical process based on secondary parameters.
Bryden's intensity curves represent processes of musical growth and decline. She creates a
graphical representation of her score analysis by mapping various musical elements: dynamics,
registral height, textural space, frequency of attack, density, composite rhythm, and pulse. These
are then averaged to create a composite curve.5 She posits that closure occurs when a rise in
tension is followed by a decrease in tension, with a possible parallel in tonal music: the rise in
harmonic tension followed by a decrease in tension at the end of a phrase.
Bob Snyder (2000) echoes this description of the influence of secondary parameters on
the perception of closure, stating
decreases in intensity can establish closure. If we look at changes in intensity of the
elements in a melodic, temporal, or formal grouping, we find that all other things being
equal, a grouping feels more closed the more the intensity of its various musical
parameters decreases at the end. (emphasis Snyder's; 63)6
Snyder argues that conceptualizing the musical surface in terms of our own bodily experiences
influences our perception of closure, comparing the feeling of repose in music to the way our
bodies feel after we have completed an action. This embodied perspective of closure implies that
we understand the metaphors of goal direction, completion, and finality through the way in
which our bodies interact with the world.
Many authors observe that the completion of a musical process results in a feeling of
repose or finality (or, in Meyer's words, "an arrival at relative stability" (1973, 81)). For

5 There are two problems with Bryden's methodology. First, she awards equal weight to each parameter in the overall average when one
parameter may exert more influence in projecting a close or continuation. Second, even though the data from these
parameters are normalized to fit on a scale from 1 to 10, the units of measurement are so different that an increase of a
single unit does not mean the same thing across parameters. Both of these render the composite curve almost
meaningless because it attempts to capture too much different information. While the methodology may not achieve
a good curve, Bryden's idea of using secondary parameters to inform music analysis certainly has merit.
6 This emphasis on abatement and relaxation overlooks other instances of finality where there is more of a
feeling of excitement.


instance, in his 2007 critical study of concepts of tonal closure, Anson-Cartwright posits that the
feeling of rest is a result of tonal resolution. Bryden (2001, 1) also notes that the perception of
decreasing intensity at the moment of closure results from a dynamic temporal process
modeled by her intensity curves. This feeling of finality that accompanies the completion of a
goal-directed process is further explored by David Huron (2006), who states that a listener
perceives closure at the expected completion of some process, and, because the listener is less
able to predict what will occur next, this completion is followed by a perceived loss of forward
continuation. From Hurons perspective, a feeling of repose is not created by musical parameters
that diminish in intensity or the resolution of a tense harmony; rather, the decrease in
predictability for subsequent events causes the perception of repose. Eugene Narmour (1990)
agrees, defining closure as "syntactic events whereby the termination, blunting, inhibiting, or
weakening of melodic implication occurs" (102).
Although the completion of a goal-directed process and a feeling of finality are
connected, it is important not to conflate these two characteristics of closure. A musical process
need not be consciously perceptible, whereas the feeling of finality is a psychological
experience. From a listener's perspective, closure is the sense of finality that occurs at an
anticipated ending. This perspective shifts the focus away from the music per se and toward an
individuals musical expectations. The completion of a goal-directed process does not itself elicit
a feeling of finality; rather, this feeling stems from the combination of an anticipated ending with
lessening of expectation for subsequent events.

Closure as the Segmentation of Musical Experience


The point at which a listener experiences finality segments the musical experience.
Closure marks the end of a musical event, resulting in a perceived boundary between two
musical entities. Early in his book, Snyder (2000) defines closure as the establishment of a
grouping boundary, allowing a listener to segment musical events (33). Meyer (1973) suggests
that closure creates "relatively stable musical entities" (90), although many musicians might
dispute the notion that all event boundaries necessarily correspond with a feeling of closure: for
instance, analysts normally do not consider motives to have closure despite their local-level
event boundaries. Not surprisingly, there is a close relationship between closure and
segmentation, but it is not necessarily a causal relationship.

Meyer (1973) uses both primary and secondary parameters to establish musical grouping
structure, focusing on the completion of harmonic units, changes in the musical surface, and
repetition as a means to create event boundaries. Meyer freely uses the term "closure" when
referring to segmentation at any level, including the identification of discrete motives. This sense
of closure comes from what Meyer calls "patterning," and he provides a list of various factors
that delineate musical patterns:
1. the presence of similarity and difference between successive events within a
particular parameter. Both complete uniformity and total heterogeneity preclude
syntactic organization, and hence establish no stability-instability relationships;
2. the separation of one event from another in time, pitch, or both; or through clear
differences in dynamics, timbre, or texture;
3. immediate repetition, whether varied or exact, of part or all of a pattern;
4. the completion of previously generated implications;
5. harmonic cadence and tonal stability. (83)
Many of these ideas regarding grouping are incorporated into Lerdahl and Jackendoff's
Grouping Preference Rules (GPRs), which reflect the principles of Gestalt psychology. While
Lerdahl and Jackendoff (1983) do not specifically discuss closure, they do use secondary
parameters such as attack points, register, dynamics, articulation, and duration to create grouping
boundaries. A complete list of GPRs is provided in Table 2.1; notice that GPRs 2 and 3 use
Meyer's secondary parameters.
Table 2.1: Lerdahl and Jackendoff's Grouping Preference Rules
GPR 1: Avoid analyses with very small groups; the smaller, the less preferable.
GPR 2: Group boundaries are heard at a slur or rest, as well as points where there is a
greater attack-point time interval.
GPR 3: A change in register, dynamics, articulation, and note lengths can distinguish
group boundaries.
GPR 4: Where the effects of GPR 2 and 3 are relatively more pronounced, a larger-level
group boundary may be placed.
GPR 5: Prefer grouping analyses that most closely approach the ideal subdivision of
groups into two parts of equal length.
GPR 6: Where two or more segments of the music can be construed as parallel, they
preferably form parallel parts of groups.
GPR 7: Prefer a grouping structure that results in more stable time-span and/or
prolongational reductions.


Narmour (1990) exclusively focuses on the establishment of closure within a three-note
grouping. In his Implication-Realization Model, Narmour suggests that the third note of a
sequence creates closure when the interval formed between the second and third notes does not
create any new implications (102). He also identifies six parametric conditions of closure, similar
to Meyer's list, which include:
1. a rest or repetition;
2. strong metric emphasis;
3. resolution of dissonance;
4. increase in duration;
5. smaller intervallic motion;
6. change of registral direction. (11–12)

Snyder (2000) builds on Narmour's work, stating, "continuity is nonclosural and progressive,
whereas reversal of implication is closural and segmentive" (148). This type of closure, Snyder
argues, "does not necessarily end anything," but it helps "articulate the contour of the
phrase" (148). He differentiates this from closure at the end of the phrase with the designation
"soft closure," which refers to having "any kind of segmentation, however weak" (148).
In his article on segmentation in post-tonal music, Christopher Hasty (1981) segments the
musical surface using parameters such as timbre, dynamics, intervallic associations, register and
contour. These parameters are described as musical domains, and Hasty suggests that groupings
are formed by discontinuities in at least one domain. He also claims that music is typically
segmented in such a way that the emerging groups are similar to one another; for instance,
groups in a stronger segmentation may contain the same number of constituents and may share
intervallic content. Hasty uses the term "closure" to describe a return to some musical quality
from a prior segment within a phrase, like an overall aba form, and reserves this expression as a
marker for higher-level formal divisions, such as a phrase or section. In a later article, he revisits
closure's role in creating phrase segments, stating that "closure is itself the articulation of the
unit since unrelated elements are thereby segregated" (1984, 172). Closure in this sense creates
a phrase-like entity containing musical elements that are related to each other (e.g., notes
segmented into the same set class, or notes in the same register). A phrase is closed off,
becoming its own entity, not including unrelated elements.
Dora Hanninen more carefully balances differentiation, similarity, and the role of music
theory in her 2001 article, which explicates her general theory on music segmentation. Hanninen


posits three types of criteria for segmenting music: (1) sonic, which rely on disjunction between
adjacent events or non-adjacent events, (2) contextual, which are based on associative
relationships between two possible groups, and (3) structural, which reflect a theoretical
orientation that remains purely conceptual until paired with sonic or contextual cues, resulting in
a musical segment.
Both Meyer (1973) and Hopkins (1990) also indicate that segmentation in music depends
on a plethora of musical markers; however, they divide these markers into two types: primary
parameters that allow a listener to project the moment of closure, and secondary parameters that
can strengthen the presence of an ending or retrospectively mark a moment of closure. Changes
in secondary parameters can create a sense of a new beginning, and hence a boundary, but
retrospectively recognized closure and anticipated closure may be different psychological
phenomena. The degree to which a listener is able to anticipate closure can vary as well, and it is
important to recognize that the capacity to predict an ending is quite different from arriving at an
ending and subsequently recognizing it as the end of a segment.
Huron suggests an even closer connection between musical segmentation and closure: we
perceive boundaries because we have experienced closure, a fulfillment of musical expectation.
Thus, from the perspective of segmentation, anything that creates a separable unit is closed
(2006). Meyer concurs, stating,
A motive, a phrase, or a period is defined by some degree of closure. On the level of its
closure, the level on which it is understood as a separable event, it is a relatively stable,
formal entity. Though it contains and is defined by internal processes, once closed, it is
not a process but a palpable thing. (1973, 90)
This understanding allows for closure in motives, units that most musicians would not consider
closed, as well as closure in segments that do not evoke any particular expectation for a specific
ending point.
These two characteristics of closure discussed thus far (i.e., closure as completing a
musical process and closure as marking the segmentation of musical events) harken back to the
metaphors examined in the first chapter. This shared understanding of closure, especially from a
listener's perspective, will form the basis of my expectation-based model of closure. Before
discussing segmentation and expectation from the perspective of cognitive psychology, I first
describe two related characteristics of closure: (1) closure has varying strengths, creating


segments that group together on different hierarchical levels, and (2) goal-directed processes and
specific closural expectations are defined by musical style.

Hierarchy and Closure


Having enumerated the various parameters that affect closure (both primary and
secondary), Meyer states, "the degree of closure depends upon the shaping of the particular
parameters at work, the degree of articulation contributed by each, and the number of parameters
promoting or preventing closure" (1973, 88). To illustrate his point, Meyer presents an analysis
of Beethoven's String Quartet, Op. 130, second movement (reproduced here as Example 2.1),
indicating that the sense of finality at the end of m. 4 is stronger than the sense of finality at the
end of mm. 1 and 2 because m. 4 is articulated by rhythmic closure.7 Furthermore, the half
cadence (HC) is weaker than the perfect authentic cadence (PAC) that follows in m. 8. The
reason for this, according to Meyer, is that in a HC not all the musical elements are implying
closure at the same time: the rhythm implies closure, but the harmony does not. As Meyer states,
"a semicadence is a case of parametric noncongruence which has become archetypal in the
stylistic syntax of tonal music" (85). Other authors further discuss how parametric congruence
and noncongruence vary closure's perceived strength, focusing especially on the role of
secondary parameters in confirming or weakening a syntactical close (Hopkins 1990, Snyder
2000, Hyland 2009).

Example 2.1: Beethoven String Quartet, Op. 130, second movement, mm. 1–8
(analysis after Meyer [1973])
McCreless (1991) uses formal and rhetorical markers of closure to determine the location
of structural (or syntactical) closure, implying that formal schema and rhetorical emphasis can
7 Meyer's precise meaning is unclear, but I suspect Meyer may be referring to hypermetrical expectations.


modulate the strength of closure. If a composition has several possible points of syntactic
closure, formal expectations can emphasize one of these PACs as the structural close. McCreless
further states that rhetorical closure uses rhythmic and registral extremes to highlight the end of
the melody, which can make one ending sound more conclusive than another ending.
Kofi Agawu (1987) also discusses closure as occurring on various musical levels. While
his definition of closure, "the tendency to close" (2), seems a bit circular, it emphasizes that
closure is not synonymous with an ending, but rather is "dependent for its effect on the listener's
experience of the entire composition" (4). From Agawu's perspective, an ending describes
"local elements in the musical structure," whereas closure denotes "a global mechanism" (4). This
view of closure is similar to Anson-Cartwright's third concept of closure, "that condition of
immanent rest or finality which a piece or movement possesses as a temporal whole, by virtue of
all the tendencies to close projected within that whole" (2007, 3). Global closure, according to
Agawu, secures closure for the entire piece and fulfills these "tendencies to close" (6). There is
only one global closure in a composition, and nested within this closure are subordinate closes on
the local and intermediate levels. While global closure is the most decisive close, local closure
"articulates the smallest meaningful units of the piece" and intermediate closure "nests one or
more local closes" (6). In all of these cases, Agawu requires a syntactic V-I gesture at the end,
but his idea of nested closes could be applied to music outside the tonal idiom.
William Caplin (2004) discusses closure primarily at the phrase level, stating that the
cadence "effects formal closure at middle-ground levels in the structural hierarchy of a
work" (56). He goes on to explain that "a cadence creates musical closure, but not all closure in
music is cadential," reserving the term "cadence" for a limited number of hierarchic levels (56).[8]
For Caplin, cadences close specific musical processes (harmonic and melodic, in the case of a
PAC); most importantly, cadences elicit a sense of formal closure. According to Caplin, a
cadence follows the structural beginning of a group on the same hierarchic level, and any
cadential harmonic paradigm must include a root-position dominant chord. While Caplin's
insistence that cadences must be on the same hierarchical level as the phrase beginnings is a bit
idiosyncratic, it does emphasize that a combination of V followed by I will not achieve closure
unless it concludes a formal unit.

[Footnote 8: Caplin specifically states that the Ursatz does not end with a cadence because there is no true beginning to the Ursatz. Also, because he reserves the term "cadence" as the means to close a theme, the fact that the Ursatz's structural close is on a higher hierarchical level precludes his use of this term. Caplin does indicate that a cadence can occur at the same time as the structural close.]
There is some interaction between musical hierarchy and the perceived strength of
closure. While the strength of closure may depend on local musical cues, as previous research
has indicated, it also depends on the boundary's role in the overall musical hierarchy (Joichi
2006).[9] Meyer further states, "every composition, then, exhibits a hierarchy of closures. The
more decisive the closure at a particular point, the more important the structural articulation....
The way in which a particular parameter acts in articulating structure may be different on
different hierarchic levels" (1973, 89). It seems that, according to Meyer, the strength of closure
for a particular segment determines the hierarchical structure of the piece, and markers of closure
at one level can vary from markers at another level. It follows that closure at the end of a phrase
is weaker than closure at the end of a section, which in turn is weaker than closure at the end of a
piece.[10]
I am not convinced that there is a simple correspondence between formal hierarchy and
the perceived strength of closure at the end of any particular segment. Some research has shown
that top-down knowledge of formal design can influence the perceived strength of a point of
closure (Joichi 2006). Agawu (1987) also emphasizes listener knowledge of how a composition
should unfold: its "scheme." Comparing poetry to music, he suggests,
the trained reader (or listener) approaches a lyric genre such as the Shakespearean
sonnet with a set of expectation regarding its length, meter and rhyme scheme. The
awareness of this scheme mediates the experience of the poem, and therefore of closure.
The same is true of musical genres such as minuet and trio, nocturne, concerto, and
prelude, genres in which various types of signs—some conventional, others arbitrary—
are used to inform the listener of how and when a piece is going to end. (4)
The completion of a schematic formal unit thus elicits a feeling of finality based on expectations
generated by previous experience, and this knowledge could lead to the various completions
within a composition having varying strengths of finality. Meyer's bottom-up view (the
hierarchical arrangement of these closes depending solely on the number of parameters
promoting or preventing closure) and this top-down view (underlying formal schemata
influencing a listener's perception) will be re-examined in Experiment 3, located in Chapter 7.

[Footnote 9: Joichi also notices that the length of the preceding context influences decisions regarding the strength of closure (longer contexts are rated as having stronger closure), but longer segments ending with a higher hierarchical boundary are rated as more closed than are longer segments ending with a lower hierarchical boundary.]

[Footnote 10: Caplin (2004) would qualify that statement, arguing that although the cadence located at the end of a piece may seem stronger than previous cadences, the cadence itself does not close the composition. A cadence typically presumed to close an entire movement "is often accorded a high degree of foreground rhetorical emphasis [which] renders such cadential arrivals so prominent and forceful that they can give the impression that they must be concluding something more structurally significant than a thematic region alone" (64–65).]

Style and Closure


Several authors have claimed that knowledge of style (even if only implicit) is a
prerequisite for perceiving closure. According to Mary Louise Serafine (1988), closure is marked
by stasis and rest compared to the surrounding material, and the factors that generate movement
and stasis vary among styles. While markers of closure differ among styles, it may be possible to
experience closure in unfamiliar styles by imposing knowledge gained by experience in some
familiar style onto the unfamiliar style. While markers of closure are highly conventionalized in
tonal music (cadences, stepwise descent to scale degree 1, etc.), they are more variable in recent
music. This variability has led to two approaches to discussing closure in post-tonal music: (1)
authors retain tonal models of closure even for music in a clearly non-tonal style (Kurth 2000,
Pellegrino 2002) or (2) authors turn to alternative goal-directed processes, such as abatement and
intensity curves (Hopkins 1990, Bryden 2001).
Richard Kurth's (2000) analysis of Schoenberg's Fourth String Quartet is particularly
revealing in regards to the first approach. He states that memory is one of the general conditions
for musical closure (139), both within a work, where memory engages elements to create
musical forms, and between works, where memory invokes materials from earlier pieces or
compositional approaches. From this perspective, Kurth argues that latent tonal tendencies are
present in Schoenberg's Fourth String Quartet, suggesting that Schoenberg's compositional
background (and presumably a listener's abundant experience with tonal music) allows tonal
implications and realizations to serve as markers of closure in this work. Kurth states that closure
occurs when "fluctuating tonal latencies can no longer be kept in a state of balanced suspension.
The latency of one or several individual tonalities is then revealed and itself becomes an
attribute of closure, in moments that are characterized by vivid qualities of incipience and
expectancy" (159). While other factors, such as duration and dynamic level, may also contribute
to closure at those moments in the quartet, Kurth raises an important point: knowledge of
structures within a work and between works can influence the way a listener perceives closure.
In her article on closure in John Adams's music, Catherine Pellegrino (2002) specifically
states that closure at the end of a work primarily depends upon tonal organization. Although she
acknowledges that Adams's music is not tonal, she suggests that discernible pitch patterns
emerge, and the completion of these patterns contributes to closure. She states,
[I]f the end of a work is to be experienced as closure and not simply as an arbitrary
stopping point, the nature and placement of the point of closure must be anticipated. In
other words, for closure to occur, the tonal organization of the music must either define
its own endpoint or participate in a system in which a given endpoint is already defined.
(150)
Pellegrino likens this experience of closure to achieving scale degree 1 in the melody over the tonic harmony
at the end of a tonal work. She also recognizes that other factors contribute to closure in this
repertoire: the completion of a well-known formal structure and rhetorical, stereotypical ending
gestures; she maintains, however, that these are subservient to tonal closure.
In contrast to these approaches, Robert Clifford (2005) suggests that we abandon the
notion of tonal closure in defining closure in atonal music, specifically addressing compositions
by Webern. He suggests that we instead redefine our expectations for closure in this style of
music based on compositional elements found in the piece.
For isn't tonal closure really about expectations set into motion by the composer?
Should we expect in atonal music, then, with its radically different melodic and harmonic
landscape, the same type of musical experience, the same solid confirmation of musical
expectations? I think not. (29)
The processes set in motion at the beginning of a work will differ from those in other
compositions, and could include symmetrical arrangement of pitches around a center pitch or a
series of gestures that balance each other (e.g., a rising gesture balanced by a descent). While
Clifford questions whether these types of processes are perceptible, he emphasizes that there are
alternative means of achieving closure besides those that are tonally motivated.
Abatement (Hopkins 1990) and intensity curves (Bryden 2001) were addressed
previously, but this approach to describing closure in non-tonal styles warrants further
discussion. Returning to Meyer and Hopkins, the emphasis on goal direction as a marker of
closure across styles leads to a disturbing conclusion: there can be no closure in music without a
goal-directed process. Setting aside the obvious difficulties in defining what exactly constitutes a
goal-directed process, I think most musicians would agree that the musical features determining
a goal-directed process depend on musical style. The issue of style becomes increasingly
problematic throughout the twentieth and twenty-first centuries, because works by different
composers (and sometimes even works by the same composer) do not typically share the same
musical syntax. If closure requires the completion of a goal-directed process, there must first be a
goal-directed process to complete.
Intensity curves, and the like, are similar to the phenomenon McCreless (1991) describes
as rhetorical closure, "the importation of closural conventions or the use of harmonic, melodic,
rhythmic, textural, orchestrational, dynamic, articulative, or registral extremes as a means of
dramatizing the end of a piece" (51). Although some of these rhetorical conventions transcend
styles (like Bryden's closural processes), repertoires and composers can have their own
idiosyncratic rhetorical ending gestures. For instance, Gretchen Wheelock (1991), George
Edwards (1991), and Floyd Grave (2009) all focus on rhetorical indicators of closure in Haydns
string quartets, while Wye Allanbrook (1994) looks at a tune (as defined in her article) as a
closural sign in Mozart.
In contrast to these composer-specific signs of closure, some theories of closure and
segmentation attempt to define musical characteristics of closure that are not style specific (most
notably, Lerdahl and Jackendoff 1983; Narmour 1990). One such example is durational closure
(Joichi 2006), defined by rests, pauses, and longer durations at the end of a segment. While
Narmour cites these characteristics as closural, Elizabeth Margulis (2007) found that the
interpretation of silence (how much tension the silence carries) varies with context. This suggests
that the interpretation of closure is based on more than changes in acoustic input: the
meaning of these supposedly cross-stylistic cues varies based on context and listener experience.
Thus, stylistic competency is a direct manifestation of a listener's experience, where
expectations gathered through statistical learning (Huron 2006) influence the perception of
closure. From the perspective of probabilistic learning, conventional signs of closure (such as
cadential patterns) begin as recurring surface features that are gradually incorporated into a
listeners stylistic knowledge. As a listener experiences the same harmonic/melodic paradigms
ending musical units, such paradigms begin to evoke the feeling of closure. Listeners are better
able to anticipate endings in musical styles where they have sufficient experience.


Along with stylistic considerations in the perception of closure, the other three
characteristics of closure explored in this chapter are dependent in some fashion on a listeners
previous musical experience. It is musical expectations engendered from these knowledge
structures that contribute to the sense of goal-direction, the segmentation of musical experience,
and the recognition of differing strengths of closure. The formation of expectations and their
influence on a listeners perception of finality will be further explored in the next chapter.


CHAPTER 3
MUSICAL EXPECTATION AND CLOSURE

As discussed in the previous chapter, there is a relationship between expectation and
closure. To this effect, Eugene Narmour asserts that closure is a fulfillment of musical
expectation followed by an absence of expectation for what will follow, or, in Narmour's terms,
closure is the realization of a melodic implication that does not create some new implication.
Leonard Meyer (1956) even goes so far as to state that without any expectation of when and how
a musical segment will end, the music will always sound incomplete; it will merely stop and
will not close. David Huron, in his seminal study on musical expectation (2006), also
acknowledges the role of expectation in the perception of closure. Building on the work of Huron
and others, I propose a model of closure that explains how various characteristics of closure
(completion of a goal-directed process, segmentation of musical experience, hierarchical
construction, and stylistic dependence; see Chapter 2) are derived from expectation. This
chapter concludes with my model of musical closure and illustrates its predictions in three short
songs: Robert Schumann's "Widmung," Anton Webern's "Der Tag ist vergangen," and Aaron
Copland's "The World Feels Dusty."

Formation of Expectations: Statistical Learning


Previous research has shown that we are experts at extracting statistical regularities from
auditory input, and this process of statistical learning leads to expectations for musical events
(Krumhansl 1990; Huron 2006). An individual's musical experience will therefore determine
how closure is perceived; for example, the more often a person hears a certain harmonic or
melodic unit at the conclusion of a musical segment, the more that the listener will associate that
unit with closure. This is supported by a study (Eberlein and Frick 1992) that asked musicians to
rate the strength of closure projected by cadential patterns from a variety of historical periods.
The ratings correlated with an individual's self-determined stylistic competency, confirming
that increased exposure to a musical style influences the perception of closure.


Since it is well documented that different musical styles have their own characteristic
tokens of closure, these results are hardly surprising,[11] but they leave an important question
unanswered: how do listeners form an association between musical cues and a feeling of finality?
A mere exposure effect (whereby listeners perceive closure because similar patterns have ended
musical segments in the past) is an insufficient explanation: this simplistic view cannot account
for how listeners segment music into meaningful units. A possible solution may be found in
research on language acquisition, which has shown that an auditory stimulus is segmented based
on sequential probabilities. As in language acquisition, these sequential probabilities may
influence a listeners segmentation of music and, thus, contribute to the perception of closure.
Children acquiring a language are faced with a daunting task. Before they can even begin
to learn semantic meaning and grammatical syntax, they must first learn to discern boundaries
between words. This is a difficult task when based on acoustical cues alone, because word
boundaries are not consistently marked in fluent speech (Saffran, Aslin, and Newport 1996).
Saffran and her colleagues demonstrated that infants as young as eight months can extract
transitional probabilities (the probability that one event will follow another) between spoken
syllables. As an example, imagine that an infant hears the phrases "pretty baby" and "pretty
flower." The transitional probability between "pre" and "ty" is higher than the transitional
probability between "ty" and "ba" simply because the former sounds have been heard in
sequence more often.[12] Saffran, Aslin, and Newport (1996) created a speech stream of
three-syllable nonsense words where every syllable was spoken without accentual stress and at a
steady tempo. The only cues to the location of word boundaries were the transitional
probabilities between the sounds. In two separate experiments, infants were able to distinguish
between "words" and "non-words," as well as between "words" and "part-words," after only a
two-minute exposure period.
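The arithmetic behind these transitional probabilities is easy to make concrete. The short Python sketch below is an illustration only: the syllable stream and the helper function are invented for this example, not drawn from the studies cited. It counts adjacent pairs in an unsegmented stream and computes each pair's transitional probability.

```python
from collections import Counter

def transitional_probabilities(stream):
    """P(y | x) = frequency of the pair xy / frequency of x, over adjacent events."""
    pair_counts = Counter(zip(stream, stream[1:]))   # counts of each adjacent pair
    first_counts = Counter(stream[:-1])              # counts of each pair-initial event
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# Unsegmented syllable stream for "pretty baby ... pretty flower"
stream = ["pre", "ty", "ba", "by", "pre", "ty", "flo", "wer"]
tp = transitional_probabilities(stream)

print(tp[("pre", "ty")])  # 1.0 -- "ty" always follows "pre"
print(tp[("ty", "ba")])   # 0.5 -- after "ty", "ba" and "flo" each occur once
```

A boundary would then be posited where the pair probability dips, here after "ty," which is exactly where the word boundary falls in the spoken phrases.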

[Footnote 11: As Robert Gauldin (1988) wrote in his eighteenth-century counterpoint text, "Each period of music history has devised clichés associated with cadential formulas. These may include stereotyped soprano and bass melodic movements, harmonic progressions, rhythmic figuration, non-harmonic activity, and suspensions" (13). Following this statement, Gauldin presents cadential paradigms common to the late-Baroque period. Similar lists of stylistically appropriate cadential paradigms are included in his sixteenth-century counterpoint text as well (1985, 27 and 87).]

[Footnote 12: The formula for calculating the transitional probability of x followed by y is P(y|x) = frequency of xy / frequency of x. In the limited example above, the probability of "pre" being followed by "ty" is 2/2 = 1.0, while the probability of "ty" being followed by "ba" is 1/2 = 0.5.]

Of course, there are other acoustical cues that assist in the perception of word boundaries
(accentual patterns, intonational profiles, pauses, etc.), so it is especially remarkable that these
infants showed significant learning despite such impoverished stimuli. This experiment has been
replicated with adult participants (Saffran 2001), as well as with stimuli consisting of action
sequences (Baldwin et al. 2008) and tones (Saffran et al. 1999). A more recent study replicated
these results using chord sequences derived from an artificial harmonic syntax. Jonaitis and
Saffran (2009) found that after a two-day exposure period listeners were able to generalize
syntactical rules from the transitional probabilities governing chord succession, and subsequently
to differentiate novel correct harmonic progressions from progressions that did not adhere to the
artificial harmonic syntax. Compared to actual music, the stimuli were quite impoverished (lacking,
for instance, metrical regularities), yet listeners were able to extract transitional probabilities
between chords after sufficient exposure.
Although the experiment outlined above does not explicitly address the inference of
musical segments based on transitional probabilities, it does suggest that a similar learning
mechanism is used for both language and music. Comparable to word boundaries in language,
musical boundaries are formed when two musical events have a relatively low transitional
probability in a particular style. Such events are not limited to pitch and harmonic material
(although these elements are the most explored in the literature), but can extend to timbre,
rhythm, loudness, articulation, etc.[13]
The analytical preference to segment music at a point of change in the musical surface
can be explained with the help of transitional probabilities. Recall from Chapter 2 Hanninen's
(2001) theory of segmentation, which posits that a change in a musical parameter (e.g., register,
instrumental timbre, or articulation) creates a boundary in the sonic domain. Other authors
(Lerdahl and Jackendoff 1983; Meyer 1973) have also used differentiation to segment musical
experience. These authors imply that listeners expect continuity in all musical domains, an
expectation that is formed through statistical learning. For instance, a large melodic leap could
signify a boundary between two musical groups (similar to Lerdahl and Jackendoff's GPR 3).
[Footnote 13: When using transitional probabilities, the grain at which probabilities are extracted must be specified. In music, this depends on the time window, or event type, in question when describing statistical regularities. Transitional probabilities can be calculated between motivic cells, chords, and individual notes, but one can also reduce the window size to calculate the transitional probabilities within a single sound. Such a fine grain of division, though, will not necessarily provide interesting results.]

Folk songs from a variety of musical cultures reveal that smaller intervals occur more frequently
(Huron 2006), increasing the transitional probability for pitches that are closer together in pitch
space. Although I know of no database documenting the transitional probabilities of non-pitch
domains, given that expectation is informed by statistical learning, it is likely that other musical
domains have a transitional probability profile similar to that of intervallic succession, where no
change or slight changes occur more frequently than do drastic sonic changes. That said, not
every sonic disjunction will result in a meaningful musical boundary and a feeling of closure.
The mind unconsciously uses statistical learning to extract transitional probabilities of
musical events, which in turn guide segmentation.[14] Along with extracting transitional
probabilities, a listener also becomes sensitive to the likelihood that a particular sound will occur
somewhere in a composition. Huron (2006) nicely summarizes this point by differentiating
between inclusional probabilities and transitional probabilities.[15] Inclusional probabilities convey
the likelihood that a particular sound element will be present in the style, regardless of the
preceding events, while transitional probabilities represent the likelihood that a sound element
will occur based on the previous event. In tonal music, members of the tonic triad occur more
frequently than do other scale degrees (inclusional probability), and the tonic chord usually
follows a dominant harmony (transitional probability). Both sets of probabilities would give rise
to musical expectations, but I posit that a segmentation resulting in the strongest feeling of
finality, or closure, depends specifically on transitional probabilities. In addition to creating
perceptual boundaries, information gleaned through statistical learning is generalized into a
broad set of musical expectations called schematic expectations.
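The contrast between the two kinds of probability can be sketched concretely. In the fragment below, the chord sequence is a toy stand-in for a tonal corpus, invented purely for illustration.

```python
from collections import Counter
from fractions import Fraction

# Toy chord sequence standing in for a tonal corpus (illustrative only).
chords = ["I", "IV", "V", "I", "ii", "V", "I", "V", "vi", "IV", "V", "I"]

# Inclusional (zeroth-order): how likely each chord is to occur at all.
inclusional = {c: Fraction(n, len(chords)) for c, n in Counter(chords).items()}

# Transitional (first-order): how likely a chord is, given the previous chord.
pairs = Counter(zip(chords, chords[1:]))
firsts = Counter(chords[:-1])
transitional = {(x, y): Fraction(n, firsts[x]) for (x, y), n in pairs.items()}

print(inclusional["I"])          # 1/3 -- tonic accounts for a third of all chords
print(transitional[("V", "I")])  # 3/4 -- after V, tonic follows three times out of four
```

Even in this tiny sample, the two measures answer different questions: the inclusional figure says how common the tonic is overall, while the transitional figure captures the V-to-I expectancy that the text associates with closure.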

[Footnote 14: In an analytical narrative, transitional probabilities are not the only means of segmenting a musical stream. Take, for instance, Hanninen's other criteria, contextual and structural. Forming associations between groups in a composition requires listeners to remember past material in order to form new groupings. This dynamic listening process is not based on a generalization of transitional probabilities. Structural criteria are based on a theoretical framework and are applied consciously to the sonic and contextual domains to create musical segments. Because statistical learning occurs unconsciously, using conscious theoretical knowledge to create groupings is a different phenomenon.]

[Footnote 15: Huron labels inclusional probabilities as "zeroth-order" probabilities and transitional probabilities as "first-order" probabilities.]

Expectation
Being able to anticipate upcoming musical events is an integral part of musical
experience. This experience has not only been captured empirically through various experimental
paradigms, but is also reflected in musical discourse. Empirical research indicates that not all of
these expectations are explicit and that expectations can reflect different types of musical
knowledge, such as generalized knowledge applicable to different works, or exact knowledge of
a particular work. While I do not provide a comprehensive overview of musical expectation (see
Huron 2006; Ockelford 2006), I first discuss how the concept of expectation informs discourse in
the discipline of music theory, especially with regard to musical closure. After this summary, I
turn to Hurons four types of expectation (schematic, veridical, dynamic, and conscious),
outlining concepts essential to an expectation-based model of musical closure.
Expectation in Music Theory
As Schmuckler (1989) states, "almost all contemporary music-theoretic analyses have
adopted implicit or explicit ideas of expectation" (111). While "almost all" may seem like an
overstatement, I believe that the aims of music theory, as a discipline, are indeed rooted in
expectation. Some theorists strive to bring unconscious expectations to consciousness, while
others create alternative sets of musical expectations, allowing listeners to experience music in
new ways. This is readily evident in the language used in music theory pedagogy, analytical
discourse, and theoretic systems.
A common topic in which we invoke expectation in the music theory classroom is the
deceptive cadence (or deceptive resolution, or deceptive motion). The term "deceptive" clearly
indicates the use of an unexpected chord in place of the expected chord, and textbooks usually
spell out clearly that listener expectations have been thwarted. Take, for instance, Clendinning
and Marvin (2005):
Bachs solution at the end of the first phrase is to replace the expected tonic
harmony with a tonic substitute, the submediant triad, to make a deceptive
cadence: V7-vi. The name of this cadence is appropriate, since the drama of this
harmonic deception can be striking. (300; emphasis added)
Other labels for musical phenomena reveal a foundation in expectation. Some musical
vocabulary implicitly relies on expectation; consider "tendency tone" and "anticipation."
Tendency tones require a particular resolution in common-practice tonality, expressing recurring
patterns of dissonance resolution. Anticipations, a term that describes the early arrival of scale
degree 1 (usually), anticipate the next harmony. In the realm of harmonic syntax, textbooks
frequently organize the common pre-dominant chords into a hierarchy based on strength, or how
likely a given chord is to progress to dominant harmony. It is the higher transitional probability
from ii6–V compared to IV–V that makes the ii6 chord a "stronger" pre-dominant than the IV
chord. In a similar vein, the transitional probability between V65/V–V is even higher, resulting
in an even stronger pre-dominant harmony.
In aural skills classrooms, some teachers advise students taking dictation to rely
strategically on the conscious application of theoretical patterns to guide the listening experience.
For instance, Rogers (1984) suggests that instructors train students to chunk the musical surface
into memorable musical patterns to assist in melodic dictation, where the student learns to expect
patterns taught in the written theory classroom. This "intelligent guessing" allows students to fill
in missing pitches based on theoretical expectations. In their recent aural-skills textbook, Jones
and Shaftel (2009) encourage students to fill in the harmonic content of cadences early in the
listening process because "the harmonies in these measures will be very predictable" (3-10). In
both written theory and aural skills classes, textbooks and instructors regularly appeal to student
expectations: those formed through previous musical experiences and those formed within the
classroom.
Other writings about music regularly draw upon a hypothetical listeners expectation,
either explicitly or implicitly. As an example, Sarver (2010) explores how chromatic passages
interact with prolongational processes in works by Richard Strauss. Her analysis of the chromatic
passage in mm. 34–40 of "Säusle, liebe Myrthe" makes explicit use of expectation.
The digression leads to a cadential six-four in E♭ minor, which establishes the
expectation for local closure in E♭ in the measures that follow. The illusory
cadential six-four, however, is thwarted in m. 39 by an upward chromatic shift
that leads to a surprising cadence in E major. (83; emphasis added)
For this short passage, Sarver draws on conventional tonal expectations to explain the surprising
cadence that follows when expectations for closure are thwarted. Denial of closural
expectations set up by a listener's extensive familiarity with tonal syntax is integral to Sarver's
analytical methodology and ensuing narrative.

Forrest's 2010 article "Prolongation in the Choral Music of Benjamin Britten" implies a
slightly different view of expectation. Exploring the means by which a musical entity might be
prolonged in a non-functional, yet triadic, musical style, Forrest makes the case for surface-level
triads prolonging symmetrical middle-ground interval cycles, which in turn promote pitch
centricity. Central to this discussion of musical expectation is his analysis of the fifth movement
of Britten's Ad majorem Dei gloriam, entitled "O Deus, Ego Amo Te." In his analysis, Forrest
shows how the first two sections of the movement establish the precedent for unbroken interval
cycles, setting up the implicit expectation that the final two sections of the piece will also include
complete interval cycles. When an ic3 cycle begins in the third phrase (starting with B major and
moving to D major), he suggests, "This incomplete cycle creates a strong expectation of F
major, the next step in the cycle" (21; emphasis added). However, this projected expectation is
not immediately satisfied: the ic3 cycle is temporarily suspended with a strong arrival on E♭ in
the last section of the movement. The concluding phrase of the movement finally resumes the
interval cycle, arriving on an F-major chord, which itself is prolonged by a complete ic3 cycle.
To this effect, Forrest states, "Both voices then proceed through their familiar minor-third cycle
to cadence ultimately on the pitch classes which began the piece, thereby completing the
interrupted cycle" (22). While the expectation for a complete ic3 cycle may not be shared by
as many musicians as are expectations for syntactical tonal language, the expectation for
continuity, especially continuity on deeper structural levels, is shared among music theorists.
This expectation for continuity is built into many of our theoretic systems, especially
theoretical narratives that rely on organicism or self-similarity. In Schenkerian analysis, we
expect to find foreground musical structures in the middleground, and in transformational theory
we expect transformations to relate sound objects to each other. Further, a Reti-style analysis
would encourage listeners to find homogeneity both between the movements and between the
parts of one movement within a multi-movement composition (Reti 1951 [reprinted 1978], 5).
In short, any theoretical system creates expectations for how the structure of the music can be
experienced or explained.
The concept of expectation pervades the entire discipline of music theoryits pedagogy,
analytical discourse, and theoretical systems. These expectations need not be empirically
grounded: while empirical evidence is necessary for cognition studies, music analysis is
interpretative. An analyst draws upon explicit and implicit expectations (formed through
previous musical engagements) and methodological choices to create an individual
interpretation. However, as we saw with closure, the ubiquity of expectation in this discipline
speaks to the general human experience of musical engagement. While listeners might not be
able to verbalize their expectations, they have definite opinions regarding how music should go in
a particular style.
Types of Expectation and Schema
Different experiences of expectation emerge from this brief survey of music theory
literature. Music theory pedagogy and Sarvers thwarted expectation for phrase closure, for
example, imply that listeners have generalized expectations regarding how harmonic syntax
should proceed and how phrases should progress in common-practice tonality. In contrast,
Forrest's analytical expectation of ic3 cycle completion is based on prior events in that particular composition.16 This latter experience could also be considered a conscious expectation (one dictated by a theoretical system or other musical knowledge), while expectations stemming from previous knowledge of a particular composition capture yet another type of expectation. These different experiences of expectation have led scholars such as Bharucha, Huron, and Margulis to categorize various types of expectation.
Schematic expectations represent broadly enculturated patterns of events. According to
Bharucha and Stoeckig (1987), these automatically formed expectations generalize musical
patterns from a large musical corpus. Such patterns range from the consistent hypermeter and
harmonic syntax of the Classical style to the timbre and riffs associated with punk music. I want
to emphasize that these are generalized expectations: although a listener may have specific
expectations for ensuing events (for instance, a V4/2 chord in a Mozart piano sonata will lead a
listener specifically to expect a I6 chord), they are not based on knowledge of that particular
work. Rather, these expectations are formed gradually by listening to many exemplars of a
particular musical style and accumulating knowledge of their recurring patterns.

16. There is no clear boundary between piece-specific expectations and general expectations. Someone familiar with a large corpus of Britten's works might form more general expectations for interval cycles, just as someone familiar with tonal music has general expectations for harmonic syntax.
In contrast, veridical expectations are formed by specific knowledge about the sequence
of events in a single composition. Both Bharucha and Stoeckig (1987) and Huron (2006) use
these two types of expectation to explain the musical surprise associated with a deceptive
cadence (V-vi) in a well-known piece of music. Schematic expectations guide listeners to
anticipate a tonic harmony following a V chord (the transitional probability for the progression
V-I is higher than that of any other chordal succession), even if they veridically expect a vi chord
based on previous experience with this particular composition. This violation of schematic
expectations results in a feeling of deception even when a listener expects the surprising
harmony.
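The first-order transitional probabilities invoked here (e.g., that V-I is the likeliest chordal succession) can be estimated by simple counting over a corpus. The sketch below is illustrative only: the function name and the toy "corpus" of Roman-numeral progressions are invented for this example, not drawn from any actual corpus study cited in this chapter.

```python
from collections import Counter, defaultdict

def transitional_probabilities(progressions):
    """Estimate first-order transitional probabilities P(next | current)
    from a corpus of chord-symbol sequences."""
    pair_counts = defaultdict(Counter)
    for progression in progressions:
        # Count each adjacent pair (current chord, next chord).
        for current, nxt in zip(progression, progression[1:]):
            pair_counts[current][nxt] += 1
    # Normalize counts into conditional probabilities.
    return {
        chord: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for chord, counts in pair_counts.items()
    }

# Toy corpus: in these made-up progressions, V moves to I far more often
# than to vi, mirroring the schematic expectation described above.
corpus = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "IV", "V", "I"],
    ["I", "V", "vi", "IV", "V", "I"],
]
probs = transitional_probabilities(corpus)
print(probs["V"])  # {'I': 0.8, 'vi': 0.2}
```

On this reading, the "surprise" of a deceptive cadence corresponds to the low conditional probability of vi after V, even for a listener who veridically knows the vi chord is coming.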
Put another way, general schematic expectations derive from learned categories of
musical experience. Both psychologists and music theorists use the term "schema" to describe these learned categories. Like "closure," "schema" is an often-used but seldom-defined term. Even among musicians, "schema" encompasses different shades of meaning, due in part to
disciplinary differences (whether the focus of the study is psychological or musicological). Three
of the most common uses of the word are exemplified in the writings of Meyer, Gjerdingen, and
Huron. (For a comprehensive and more nuanced discussion of schema, see Byros 2009,
especially chapter 5, part 1.)
In Explaining Music (1973), Meyer posits that melodies in Western music derive from a limited set of melodic processes. Melodic processes represent basic "archetypes," his term for innate or universally valid schemata (Gjerdingen 1988, 7). Two archetypes that Meyer
subsequently tested with Rosner are the gap-fill and changing-note archetypes. A gap-fill
archetype consists of an initial upward melodic leap subsequently filled in by a stepwise descent,
while the changing-note archetype comprises two melodic dyads, the first leading away from the tonic triad, the second leading back to the tonic triad (for instance, scale degrees 1-7 followed by 4-3). Rosner and Meyer (1982) found that listeners could abstract the archetype from musical exemplars of each category; then, in a forced-choice paradigm, they could identify the archetype present in novel
musical clips. A later study (Rosner and Meyer 1986) expanded this work and found that these
archetypes also influenced similarity judgments between musical excerpts.17
17. In a more recent article, Paul von Hippel (2000) questions the perceptual validity of these archetypes by re-examining the results from the 1982 and 1986 studies. Von Hippel concludes that gap-fill does not influence melodic shape, nor does it influence the classification of melodies to the extent that Rosner and Meyer suggest.
Gjerdingen's (1988) work on schema builds upon the foundation laid by Meyer, especially Meyer's changing-note archetype. Gjerdingen limits his study to schemata that
generalize a musical event-sequence (notes and rhythms notated in the score). He differentiates
"scripts," which outline an event sequence, from "plans," which contain information regarding
intentionality rather than implying a particular series of events. Relating these two types of
schemata to Meyers archetypes, a changing-note archetype is an example of a script, while a
gap-fill archetype would be a plan. This distinction can account for style change; for instance,
according to Gjerdingen, the eighteenth century's scripted phrase construction evolved into the nineteenth century's plan-like phrase. Although my own discussion of schematic expectations
will not distinguish scripts from plans, it is important to recognize that schemata can vary in their
specificity.
Huron offers the most inclusive definition of schema: "a mental preconception of the habitual course of events" (2006, 419). Huron likens schemata to semantic categories, where
schemas are generalizations formed by encountering many exemplars. Our most established
schemas reflect the most commonly experienced patterns (225). Without schemata (which guide
schematic expectations) it would be impossible to have any expectations for a novel work;
listeners could only have expectations for a work after listening to it. Schemata also aid in
encoding and remembering music; for instance, pitches presented in a tonal context are more
easily remembered than those presented in a non-tonal context (see the discussion in Hébert,
Peretz, and Gagnon 1995, 194). Further support for the existence of schemata comes from
instances in which these broad generalizations create an incorrect expectation or musical
memory, as Huron discusses (2006, 210-16). For instance, a schema could be overly general (e.g., not providing specific enough expectations) or misapplied (e.g., approaching non-Western
music with Western expectations). Since schemata aid in remembering music, listeners tend to
misremember an atypical musical pattern in a way that conforms to a more common schema.
This brief overview shows that schemata can range in specificity, from less specific
expectations (pitch proximity, stylistic timbres, and behavior of scale degrees in the major mode)
to more specific expectations (a particular chord succession). The narrower understandings of schema posited by Meyer (universal archetypes) and Gjerdingen (style-specific harmonic/melodic progressions) are easily subsumed within Huron's broad definition of schema. Margulis (2005) addresses this range of schema types, positing that schematic
expectation itself encompasses more than one type of expectation.
Specifically, schematic expectations inhabit a continuum from relatively deep to
relatively shallow, where depth relates to availability for direct access (from little to
much availability), susceptibility to change through exposure (from little to much
susceptibility), and scope of application (from more universal to more limited). Examples
of increasingly shallow schematic expectations might be: expectations for closure;
expectations for cadential closure in tonal music; expectations for common cadence types
in music from the classical period; and expectations for common cadence figures in the
music of Mozart, where these expectations are increasingly available for access,
increasingly susceptible to change through exposure to new pieces within the relevant
repertoire, and increasingly limited in scope. (666)
I agree that schematic expectations exist along a continuum, but, as noted by authors who
specifically discuss deeply schematic expectations, there seems to be a question about the extent
to which music alone informs the creation of these expectations. Expectations for pitch proximity
and melodic regression transcend musical culture (for the most part), so there seems to be a
perceptual preference for small melodic intervals and post-skip reversals. Indeed, many of these
expectations do not apply just to musical stimuli, but adhere to the broad perceptual laws set
forth by Gestalt theories of perception. Instead of teetering dangerously close to a chicken-or-egg question (which came first: a preference for small melodic intervals, evident in musical composition, or small melodic intervals in musical composition, informing a preference for these intervals?), I simply posit that these deeply schematic expectations are different from other
types of schematic expectation because they are not distinctive to music and arise from general
perceptual processes. Music-specific schematic expectations are then derived solely from
musical experience.18 Both types of expectation are applicable to music, but only the latter is
solely applicable to music.
Figure 3.1 shows this continuum between "deep" schematic expectations (Margulis: "deeply schematic expectations") and "surface" schematic expectations (Margulis: "shallowly schematic expectations"). The difference in shading indicates that the deepest of the deep
expectations transcend musical culture and are cross-modal; however, there is no clear boundary
between these expectations and ones that are unquestionably influenced by a particular musical
18. Musical experience here is understood in the broadest sense: listening to music, performing music, bodily engaging with music, and understanding the limits the body may place on performance can all contribute to these music-specific expectations.
culture. As Narmour explains (1990), these cross-modal expectations are formed through a
bottom-up cognitive system (consisting of Gestalt principles), while the music-specific
expectations are formed through a top-down cognitive system. Even so, these cross-modal
expectations are evident as statistical regularities.
Pearce and Wiggins (2006) present another perspective on the creation of these regularities, stating that "patterns of expectation that do not vary between musical styles are accounted for in terms of simple regularities in music whose ubiquity may be related to the constraints of physical performance" (378). Whether these statistical regularities are determined
by performance or perceptual limitations, and whether such expectations are formed through
statistical learning or are innate, it remains that listeners expect continuity of sound (in terms of
pitch proximity, location, timbre, etc.). Regardless of its origins, continuity is the cross-modal
bedrock on which other expectations are constructed.

Figure 3.1: Continuum of Expectations
Figure 3.2 shows the continuum again with possible schemata located along the right
side. The expectations on this continuum are implicit, evident in statistical regularities in the
music. Expectations for pitch proximity and melodic regression (Huron 2006) are considered
deep expectations because they are the most widely applicable, providing generalized
expectations. Style-specific schemata can range from general tonal and rhythmic expectations to
more specific expectations for a particular harmonic progression. Within a composer's oeuvre,
his or her characteristic fingerprint may result in statistical regularities that differentiate these
works from those of other composers writing within the same style. In general, a greater quantity
of compositions informs the creation of deep schematic expectations while a smaller quantity of
compositions informs the creation of surface schematic expectations.

Figure 3.2: Continuum of Expectations with Schemata
Empirical research addressing expectation has predominantly focused on schematic expectations. For instance, Schellenberg (1996, 1997) found empirical support for aspects of
Narmour's Implication-Realization theory, which makes relatively specific predictions for deep
melodic expectation, particularly pitch proximity and pitch-reversal. Many authors have found
resounding empirical support for expectations of pitch proximity (Shepard 1964; Deutsch 1991;
Aarden 2003) using a variety of experimental methodologies and stimuli. Despite overwhelming evidence that pitch proximity is preferred, the expectation for pitch proximity can be influenced by other expectations, as Hébert, Peretz, and Gagnon (1995) found in their probe-tone study, which examined listener preference for tones at the end of musical phrases; there, scale degree predicted the results better than pitch proximity alone. So, while pitch proximity represents a
deep schematic expectation, more accessible surface-level expectations (such as the tonal
system) can exert greater influence on listener expectations.
Surface schematic expectations, especially those pertaining to expectations within the
tonal system, have been extensively explored. Along with Hébert, Peretz, and Gagnon (1995),
who focused on melodic phrase endings, other authors have examined more general melodic
expectations (Carlsen 1981), harmonic expectations (Bharucha and Krumhansl 1983; Bharucha
and Stoeckig 1986), and a combination of both (Schmuckler 1989). Other studies have examined
types of musical expectation beyond a tonal context, and they still confirm the existence of
expectations that are not work-specific, but applicable to a wider body of music (Cuddy and
Lunney 1995). Finally, a series of cross-cultural expectation studies illustrates how these surface
expectations are formed through previous exposure (Krumhansl et al. 1999; Krumhansl et al.
2000).
Dynamic expectations, which Huron also discusses, exploit a listener's short-term memory to form predictions of likely future events within a musical composition while it is being heard. Dynamic expectations are so named because they are relatively volatile compared to schematic and veridical expectations, and arise from brief exposures to a stimulus. Huron states, "as the events of a musical work unfold, the work itself engenders expectations that influence how the remainder of the work is experienced" (227). Adaptive expectations of this sort have
been explored in empirical studies demonstrating that listeners unfamiliar with a particular style
can pick out statistical probabilities after a short exposure period. Kessler, Hansen, and Shepard
(1984) performed a cross-cultural study using a probe-tone method to explore schemata based on
tonal structure. Their Balinese and American participants listened to three melodies, one based
on the Western scale and two based on Balinese scales (Pelog and Slendro), then rated the
"goodness of fit" between the material they just heard and a probe tone. In general, the results showed that pre-existing schemata based on first-order probabilities influenced ratings by enculturated listeners, while ratings by naïve listeners were based on pitch frequency (inclusional probabilities). Even listeners completely unfamiliar with the style were able to extrapolate
statistical regularities from the given music to form a set of expectations.
These three types of expectation (schematic, veridical, and dynamic) are very much
related, as illustrated by Ockelford's (2006) zygonic model of musical expectation. His model
shows how implications (and hence expectations) can arise while listening to a musical
composition. Ockelford differentiates between expectations within a musical segment and those
between musical segments. Within a segment, there are implications of pitch proximity (we tend
to expect small intervals), while expectations between segments are formed through four
different experiences:
1) other material or materials occurring within the same hearing of the same performance of
the same piece;
2) a different hearing or hearings of the same performance of the same piece (in the case of
recorded music);
3) a hearing or hearings of a different performance or performances of the same piece; or
4) a hearing or hearings of a performance or performances of a different piece or pieces.
(110)
The first expectation refers to dynamic expectations, which are molded by the ongoing
composition. The second and third experiences result in piece-specific expectations, or veridical
expectations, while the fourth results in generalized learning, influencing schematic expectations.
Ockelford relates the interaction of structures within a segment (which he calls "within-group structures") to the interaction of structures between two different segments (called "between-group structures"). Previous structures, which inform schematic and veridical expectations, are stored in long-term memory, while current structures are encoded in short-term memory. Previous structures form between-group expectations and can give either a general or a specific indication of future events. Dynamic expectations guide between-group expectations, while pitch proximity forms within-group expectations.19 As a listener hears a composition multiple times, more specific expectations are formed, but these expectations are always
contextualized by schematic expectations. Although these expectations seem categorically
discrete, all three operate concurrently, allowing a listener to experience musical expectations in
unfamiliar music as well as thwarted expectations in well-known music.
According to Huron and Ockelford, the differences between schematic and veridical
expectations arise from the way in which the information is encoded in long-term memory;
however, the difference between schematic and dynamic expectations is less clear. First, both
types of expectation are formed through a common learning mechanism: statistical learning.
Second, assumptions behind what is valued in dynamic expectations seem to be rooted in
schematic expectations. Expectations for continuity and repetition are deep schematic
expectations, but the specific musical elements to be repeated or continued are piece-specific. As
discussed earlier, Forrest's analytical narrative implies that the expectation for interval-cycle completion is formed as Britten's composition progresses. Given that dynamic expectations are
considered implicit, and this particular pattern could be captured by transitional probabilities, a
listener could indeed form an unconscious expectation for interval-cycle completion. However,
the foundation for this expectation would be the listener's schematic expectation for repetition in
a work, blurring the line between these two types of expectation. Furthermore, since Britten uses
interval cycles regularly in his works, a listener could unconsciously form a general schema for
this compositional characteristic after exposure to many exemplars.20 Even though dynamic
expectations may sometimes be difficult to distinguish from schematic expectations, the term
remains useful because the concepts behind dynamic expectations strongly influence music
analytical discourse.
Ockelford emphasizes that the expectations in his model are not explicit and are formed
unconsciously; however, there are times when expectations rise to the surface of consciousness.
A knowledgeable listener hearing the first movement of a Mozart piano sonata will have specific
19. Ockelford distinguishes pitch proximity from other forms of schematic expectation, very similar to my own representation of the continuum between various types of schematic expectation. Ultimately, I still maintain (along with Margulis [2005] and Huron [2006]) that pitch proximity is a deeply schematic expectation.

20. Whether the completion of an interval cycle can be an implicit expectation is debatable. It is more likely that conscious knowledge of Britten's compositional preference for interval cycles shapes a listener's hearing of a work. In any case, Forrest uses the language of dynamic expectation to shape his narrative.
formal and structural expectations that, assuming some training and technical vocabulary, could
be explicitly articulated (e.g., the second theme will probably have a lyrical character, and the
recapitulation will probably be preceded by a prolonged dominant harmony). Conscious expectations arise from "conscious reflection and conscious prediction" (Huron 2006, 235), bringing us full circle back to the role of expectation in music theory. Theoretical systems ask us to
carry another set of expectations into our listening, which can enhance our musical experience. Many music-theoretical systems invoke the language of expectation, but we should be careful not to confuse conscious expectations reflecting implicit expectations with those that arise solely from abstract theories. I do not presume that abstract theoretical expectations are invalid, but it is important, given the pervasive references to expectation in our scholarship and teaching, to recognize the distinctive varieties of expectation.
Expectation and Memory: An Alternative View
While these different types of expectation seem to capture our experience with music, the
underlying assumption that they reside in different memory structures is problematic. Huron
(2006) posits two different types of memory guiding schematic and veridical expectations.
Knowledge of individual pieces is stored in episodic memory (also known as autobiographical
memory), while auditory generalizations are stored in semantic memory. Huron notes that this
distinction is problematic for two reasons (225). First, memories of familiar works may lose their biographical (episodic) content: although we can remember a composition along with the context in which it was heard, we are also capable of remembering a composition without explicitly recalling the context. Second, all auditory generalizations began as a single exemplar of a recurring pattern,
suggesting that all semantic memories began as a large collection of episodic information.
Rather than two distinct systems, Hintzman posits a multiple-trace memory model that can account both for schema abstraction and for veridical expectations. In Hintzman's
multiple-trace memory model (1986, 1988, 2010), each experience is recorded in long-term
memory as a separate memory trace, in contrast with models where subsequent exposures to a given stimulus strengthen an existing memory trace for that stimulus. Hintzman suggests that schema abstraction of everyday concepts (like "chair" or "table") is determined by a person's exposures to many exemplars of a category, each exposure laying down a memory trace. The
memory traces encode various features of each exemplar, such as the context in which the exemplar was experienced, along with other sensory characteristics. Abstract concepts are then
derived from this pool of episodic traces. When a retrieval cue interacts with all of these traces
simultaneously, it activates traces according to their similarity with the cue. Traces that are more
similar to the cue are more strongly activated than other traces, and the summed content of these
activations represents the information retrieved from memory (Hintzman 1986). This concept of
memory can also reflect the results of statistical learning: the number of activated traces informs
inclusional probabilities, while the number of activated traces involving a particular event
succession informs transitional probabilities.
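The retrieval step just described can be sketched in a few lines of code. In Hintzman's published simulations, traces are feature vectors, each trace's activation is its similarity to the probe raised to an odd power (the cube, so that near matches dominate), and the retrieved "echo" is the activation-weighted sum of all traces. The feature coding and numbers below are invented for illustration; this is a minimal sketch in the spirit of the model, not a reimplementation of Hintzman's programs.

```python
import numpy as np

def echo(probe, traces, power=3):
    """Sketch of multiple-trace retrieval: all traces respond in parallel,
    weighted by their similarity to the probe."""
    # Similarity of the probe to each stored trace (normalized dot product).
    sims = traces @ probe / probe.size
    activations = sims ** power          # cubing lets near matches dominate
    intensity = activations.sum()        # overall familiarity signal
    content = activations @ traces       # blended "echo" of all traces
    return intensity, content

# Toy memory: feature vectors in {-1, 0, +1}, as in Hintzman's simulations.
traces = np.array([
    [ 1,  1, -1,  1],   # a trace identical to the probe
    [ 1,  1, -1, -1],   # a similar trace
    [-1, -1,  1,  1],   # a dissimilar trace
])
probe = np.array([1, 1, -1, 1])
intensity, content = echo(probe, traces)
```

Here the identical trace is activated at full strength while the dissimilar trace contributes only weakly (and negatively), so the echo is dominated by the experiences that best match the incoming music, exactly the behavior the chapter appeals to.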
This memory model can also be applied to music cognition. Although Hintzman has not
explored the creation and content of memory traces for temporal experiences, if each musical
experience results in a separate memory trace, then this model can account for both veridical and
schematic expectations without relying on two separate memory systems.21 When a person
listens to music, each musical trace is activated in parallel. Traces that are most similar to the
current auditory input are activated more strongly and will have the greatest impact on listener
expectations; less similar traces, while present, will influence expectations to a lesser degree. In
this model, generalized schematic expectations are based on the activation of memory traces
from many different pieces of music.
The range of specificity for expectation depends on the degree to which these traces share
the same subsequent events. A large number of memory traces lead us to expect pitch proximity,
but these traces do not imply the same pitch or scale degree because the context of each trace is
so different. In contrast, harmonic progressions conforming to tonal syntax may activate fewer
memory traces, but these traces will overwhelmingly involve a tonic chord following a dominant
harmony, creating a more specific expectation. Memory traces for a particular composition
provide even more specific expectations, of course. Because all these traces respond in parallel, a
listener can have several different expectations for a single piece of music.
Once again, consider a deceptive cadence, which, as previously discussed, can be
surprising even in a well-known composition. When a listener hears a dominant chord, all
21. The creation of memory traces would probably be influenced by the way in which a listener interacts with music. A listener's ability to use language to label musical events, the extent to which a listener is paying attention to the music, and how the listener is parsing the musical surface would probably influence the creation and content of the memory traces. Also, the way in which a memory trace is activated would differ from Hintzman's original conception because listening to music is a temporal experience.
memory traces containing an experience of this harmony are activated, including the memory
trace(s) for the work itself. In the vast majority of these traces, the dominant chord is followed by
a tonic harmony, eliciting a strong and relatively specific expectation for tonic. This expectation
is in direct conflict with the even more specific expectation for the submediant chord that stems
from the listeners experience with this particular piece.
Increased exposure to a particular composition can lessen the effect of a surprising
musical feature, like the deceptive cadence, and Hintzmans model can account for this type of
experience. Consider, for example, the opening phrase of "America" ("My Country, 'Tis of Thee"; Beethoven's arrangement of the identical "God Save the King" is shown in Example 3.1). This
six-measure phrase contains a deceptive resolution in the fourth measure. The abnormal length of
the phrase (six measures instead of the more typical four) and the deceptive motion violate
schematic expectations and should therefore elicit surprise. However, "America" is so familiar in
our culture that listeners have multiple memory traces for this song. The sheer number of traces correctly predicting the features of the phrase, and the greater activation of these traces based on their similarity to the incoming input, lessen the influence of the generalized expectations based on a listener's cumulative musical experience. Because this model relies on specificity of
expectation as well as a listeners cumulative musical experience, it can account for the
simultaneous occurrence of different expectations as well as a listener's changing expectations
with increased exposure.

Example 3.1: Beethoven, "God Save the King," WoO 78, mm. 1-6
Hintzman's conception of memory allows us to consider two distinct factors influencing
our expectations: the specificity of an expectation and the number of times a listener has been
exposed to a particular pattern. Figure 3.2 depicted both generalized expectations and specific
expectations for particular pieces (that is, both schematic and veridical expectations). The
specificity and quantity of memory traces determine the strength of an expectation, where higher specificity and greater trace quantity result in stronger expectations. For instance, tonal
syntax makes highly specific predictions about the harmonic and melodic content of a
composition, and the considerable exposure to music conforming to these norms creates strong
expectations in Western listeners.

An Expectation-based Model of Closure

The concept of expectation can also provide insight into the listener's experience of musical closure. This idea is nothing new; recall Meyer's (1956) statement about closure: "A stimulus series which develops no process, awakens no tendencies, will . . . always appear to be incomplete" (139). Whether expectations are implicit or explicit, Meyer is quite adamant that a
listener must anticipate the point at which a musical segment will terminate in order to
experience a feeling of finality. Given the definition of closure from Chapter 1 (the expected end to a musical segment, resulting in a feeling of finality), closure necessarily depends upon expectation. The characteristics of closure (completion of a goal-directed process, segmentation of musical experience, hierarchical construction, and stylistic dependence) describe the listener's experience. To capture the various experiences of closure proposed by this model, I
suggest three types of closure: anticipatory closure, arrival closure, and retrospective closure.
These describe a listener's phenomenological perception of closure, where each of these types of
closure draws upon a different set of expectations, resulting in a different feeling of closure.
Segmentation is a prerequisite to experiencing musical closure. After unconsciously
learning the transitional probabilities of various musical domains, a listener is able to segment
the musical surface into meaningful musical units. Some of these segmentations reflect deep
schematic expectations (e.g., expectations for continuity and proximity), while other
segmentations depend on stylistic knowledge. If a style consistently uses a formulaic pattern at
boundary locations, then this pattern itself becomes associated with endings. Consider for a
moment the classical cadence, V-I. Nothing inherently links this harmonic succession with
closure; only through a learned schematic expectation has the authentic cadence, especially the perfect authentic cadence (PAC), become synonymous with closure. Because musical
boundaries are formed at points with a lower transitional probability (and hence a higher
prediction error), the uncertainty of subsequent events leads to the closing off of the musical
segment, resulting in a feeling of finality.
41

According to Huron, this feeling of finality is a direct consequence of learned first-order probabilities (2006, 167). He associates closure or a feeling of "home" with the positive
emotions resulting from a correct prediction based on these transitional probabilities. A listener
misattributes this positive feeling to the sound itself (in the case of a PAC, to the tonic triad).
This understanding of the relationship between expectation and closure works well for tonal
music, but more recent music exhibits fewer first-order probabilities that transcend an individual composition. Expectations for twentieth- and twenty-first-century works typically result from fewer episodic traces (surface schematic expectations) or tend to be overly general (deep schematic expectations). Specific expectations supported by many memory traces (from the middle of the continuum) are not present in this repertoire, unlike in tonal music. Without a strong prediction effect, listeners do not have as strong a sense of finality.
Huron notes that the uncertainty following a cadence also contributes to the sense of
finality. In my model, more specific expectations for the end of a segment lead to a greater
change in the listener's ability to predict subsequent events. Once again, consider the PAC in the
Classical style. A knowledgeable listener will have strong expectations for the tonic arrival, but
not necessarily for subsequent events, resulting in a large increase in prediction error. Compare
this to a listener who is not fluent in the Classical style and therefore did not strongly expect this
same tonic arrival. Both listeners may have similar uncertainty for events following the cadence,
but the knowledgeable listener will experience a greater decrease in the ability to predict
subsequent events, resulting in a greater feeling of finality for the same passage of music.
Considering that the feeling of finality is the product of a successful prediction (and its
resulting positive valence) followed by a rise in uncertainty for subsequent events, and that
memory traces for music are activated in parallel, the experience of musical closure is directly
related to a listener's prior experience and the implicit expectations derived from this
experience.22 The multiple-trace memory model can account for a comparative rise in
uncertainty even in well-known pieces, given that general schematic expectations are an
amalgamation of all active memory traces for the musical context. While not all traces are
activated to the same degree (the ones that share the most features with the incoming sonic input
are more strongly activated), the sheer variety of musical content following a typical ending
gesture results in a higher prediction error for the next event.23

22 Listeners also experience closure when an ending conforms to conscious expectations. For instance, a
listener might expect a segment-concluding figure that occurs early in a composition to return in a significant way,
closing another segment later in the work. Also, conscious knowledge of musical structure could lead a listener to
anticipate when an ending is likely to occur.
As discussed in Chapter 2, many aspects of closure are style dependent, but broader
schematic expectations allow the experience of closure not only in unfamiliar pieces but in
unfamiliar styles as well. As Huron notes, however, the crossover that occurs with schematic
expectations between styles does not always provide contextually sensitive expectations. More
general expectations, like the expectation that phrases normally relax at the end, can transcend
styles (Hopkins 1990), but whether this expectation is a product of understanding music through
embodied motion (Snyder 2000) or acquired through statistical learning is a question that
warrants more research.
I would argue that the strongest sensation of closure is typically associated with music-specific
schematic expectations (as opposed to cross-domain schematic expectations), which
may account for the common description among music theorists that closure occurs at the end of
a goal-directed process. This feeling of goal direction is often misattributed to the music (or,
more specifically, to an actively unfolding musical process), when in fact the feeling is an
artifact of the listener's ability to predict subsequent events within a musical segment with
increasing certainty. Consider, for instance, a schema that provides a general outline of successive
events (such as Gjerdingen's do-ti-re-do schema or Byros's le-sol-fi-sol schema). As each event in
such a schema occurs, the transitional probability for subsequent elements increases in that
context. More specific listener expectations entail higher transitional probabilities between
successive elements, resulting in a greater feeling of goal direction and, ultimately, closure.
The continuum of expectation specificity also applies to closure, as depicted in
Figure 3.3. On the deepest level, we expect some sort of musical closure: a feeling of finality at
the end of a musical segment or piece. Margulis (2005) agrees that closure is a deep schematic
expectation; it is also a cross-modal expectation (in general, we expect sounds to end eventually,
and, from a Gestalt perspective, we see broken objects as completed wholes), not a music-specific
expectation. This deeply schematic expectation is extremely general, applicable to all
music.

23 To explore how traces can be activated to different degrees (as opposed to binary activation), consider a
categorization task. Suppose you see a chair, and for this illustration this chair has five features: "sitability," four
legs, no arms, a back, and a location close to the ground. The memory trace for a different chair that shares all of
these features will be more strongly activated than the memory trace for a three-legged barstool, which would only
share the "sitability" and "no arms" features with the perceived stimulus. While these features are binary in nature,
they may be weighted differently in determining the degree of activation of a particular memory trace (in this
example, "sitability" is more of a requirement for something to be a chair than whether the chair has arms). In
actuality, the mind probably keeps track of many more features than this simple example, including context. In
music, the process may be more complicated due to the temporal nature of music and listener limitations in
remembering the preceding musical context.
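The graded trace activation described in footnote 23 can be sketched as a weighted feature-overlap computation. This is a toy illustration, not a claim about any particular memory model; the feature names, weights, and similarity rule are invented for the example.

```python
# Toy sketch of graded (non-binary) trace activation via weighted
# feature overlap, following the chair/barstool example.

def activation(stimulus, trace, weights):
    """Weighted proportion of matching binary features."""
    total = sum(weights.values())
    matched = sum(w for f, w in weights.items()
                  if stimulus.get(f) == trace.get(f))
    return matched / total

# "Sittability" is weighted most heavily: it is more of a requirement
# for chairhood than, say, having arms.
weights = {"sittable": 3, "four_legs": 1, "no_arms": 1,
           "has_back": 1, "low_to_ground": 1}

chair = {"sittable": True, "four_legs": True, "no_arms": True,
         "has_back": True, "low_to_ground": True}
other_chair = dict(chair)   # shares all five features -> fully activated
barstool = {"sittable": True, "four_legs": False, "no_arms": True,
            "has_back": False, "low_to_ground": False}
# The barstool matches only the two features it shares with the chair,
# so its trace is only partially activated.
```

The same idea scales to musical traces, where many more features (and the preceding context) would enter the weighted match.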

Figure 3.3: Continuum of Expectation: Closural Expectations


Appearing just above these expectations are general expectations for phrase length,
ending gestures, and performative signs of closure, all of which contribute to a basic phrase
shape, usually corresponding to an increase in tension and a subsequent relaxation. These
general expectations are applicable across styles, but musical patterns found in a smaller
collection of musical works (style-specific, genre-specific, or composer-specific patterns) also
shape expectations. Endings that conform to the surface end of the schematic expectation
continuum result in a greater feeling of finality because of the relatively high specificity of the
expectations they engender. With the frequent use of a formulaic cadence in certain styles, a
listener has more exact expectations for the end, resulting in a better ability to predict when
phrases will end and consequently an increased feeling of goal direction. The number of memory
traces supporting a particular progression also influences the feeling of goal direction: a greater
number of traces increases the anticipation of the ending.
The more specific an expectation is, the less applicable it will be to a variety of pieces:
for instance, there are forms of closure specific to a particular period, genre, or even composer.
At the top of the diagram are expectations for closure based on previous encounters with a
particular composition. Even though these work-specific expectations allow a listener to
anticipate the end of a musical segment accurately, the quantity of these specific episodic traces
is dwarfed by a listener's more general cumulative musical experiences; the feeling of finality
would be less intense than the feeling of finality following signs of closure from the middle of
the continuum.
Because the feeling of finality results from a change in the listener's ability to predict
subsequent events, particularly specific expectations leading to the end of a segment will produce
a greater subsequent decrease in predictability. For instance, the feeling of closure following the
end of a harmonic schema is stronger than an ending indicated by a long note because the
expectations are more specific (in regards to content and timing), resulting in a larger difference
in a listener's ability to predict subsequent events. Deeper schematic expectations are too general
to elicit specific expectations for the end of a segment, resulting in a smaller change in prediction
error. At the same time these expectations are unfolding, listeners still carry surface schematic
expectations and perhaps veridical expectations for closure. Similar to the experience of a
deceptive cadence, an ending that conforms to a listeners veridical expectations for a particular
piece, but that overall denies schematic expectations, could lead to a less intense feeling of
finality because some expectations imply continuation while others imply an ending. The
proportion of continuation and ending implications results in various degrees of closure, creating
a sense of hierarchical closes within a piece.
I posit three different types of closure, based on the type of expectations governing the
feeling of finality and the amount of change in the prediction error, which results in closes with
different strengths. Anticipatory closure occurs when a listener can predict when the musical
segment will be completed. Anticipatory closure can be experienced through all the types of
expectation, but, as mentioned earlier, schematic expectations on the surface end of the spectrum
seem to create the greatest feeling of finality. Anticipatory closure is the strongest type of closure
because the expectations leading into the final event are so strong, resulting in a large rise in
prediction error. Veridical and dynamic expectations can also lead a listener to experience
anticipatory closure, even if a composer eschews conventions set forth by schematic
expectations; however, without the combined weight of many exemplars, the experience of
closure may not be as strong.24
Arrival closure occurs when a listener experiences finality before the beginning of the
next segment. This is best understood through Narmour's definition of closure: a musical event that
creates no further implications. Like anticipatory closure, arrival closure depends upon a
listener's schematic expectations and requires a decrease in the ability to predict subsequent
events, but in arrival closure the listener does not know when a musical segment is going to end
until it actually ends. Sonic features (learned through statistical learning) such as an extended
note or silence signify the ending's arrival, but the transient increase in prediction error is not as
high as in anticipatory closure because the expectations approaching a particular ending are not
as strong.
Retrospective closure can only barely be classified as closure, since it refers to
experiencing an ending solely because a new beginning has occurred. Meyer would not classify
this as closure at all, although he does suggest that closure occurs prior to a recognized
beginning, such as the beginning of a parallel consequent phrase (Meyer 1973, 86). In my view,
rather than experiencing finality because the next event is more difficult to predict than previous
events, we instead experience an ending because the current event defied expectations for
continuation implied by the previous event. Retrospective closure can be characterized as a
failure to recognize an ending precisely when it occurs. Deeply schematic expectations are at
work here, based mainly on sonic disjunction. While multiple listenings may change a once-retrospective
close to an arrival or anticipated close, these veridical expectations will not elicit
the same effect as shallow-end schematic expectations.

24 An evaded (or deceptive) cadence would be an example of denied anticipatory closure. A listener is
primed to expect an ending following the dominant harmony, but the chord that follows implies continuation
instead. While a listener may experience some finality with this harmony, like the closing off of a subphrase, the
feeling is not nearly as strong as it would have been had the expected harmony occurred instead.

Three Analytical Vignettes


This section will briefly examine closure in three songs that represent different musical
styles and carry different sets of schematic expectations: Robert Schumann's "Widmung" from
Myrthen, Op. 25, No. 1; Anton Webern's "Der Tag ist vergangen" from Vier Lieder für
Singstimme und Klavier, Op. 12, No. 1; and Aaron Copland's "The World Feels Dusty" from
Twelve Poems of Emily Dickinson. My analyses, representing the perspective of an experienced
listener, will focus on schematic, dynamic, and conscious expectations; obviously, multiple
repetitions of a single work would also create more specific veridical expectations. I will
highlight goal-directed processes and surface features that most compellingly project closure in
each song, briefly exploring how schematic and dynamic expectations influence closure.
Because the text can contribute to closure through repetition, rhyme scheme, and semantic
meaning, I chose these three texted works to illustrate the role of expectation in the perception of
closure.
Schumann's "Widmung"
Many authors portray the completion of a tonal process as the main marker of closure;
however, these tonal markers of closure, such as cadences and the completion of the Ursatz,
combine with several musical elements to create a sense of closure. Other markers of ending,
such as increased duration and strong metrical articulation, interact with pitch-based signs of
closure to create a hierarchy of segmentation, ranging from subphrase to phrase to section to the
entire song. While subphrases are not typically considered closed, they are usually marked by
musical characteristics that also accompany phrase endings. These elements (such as change of
texture, increase in rhythmic duration, and silence) create perceptual boundaries in the musical
surface because of the discontinuity in the sound. Other surface characteristics (such as a
decrease in tempo, softer dynamics, and falling pitch contour) also signify endings, usually
coinciding with the end of a subphrase or a phrase. All of these features, which reflect common
ending gestures, contribute to the perception of closure.
From a tonal standpoint, "Widmung" is relatively straightforward. It begins in A♭ major,
and the first part of its simple ternary form contains a single phrase. The B section of the ternary
form begins with a common-tone modulation to E major (♭VI). Over the course of this section,
the music slips back to A♭ by reinterpreting the A-major chord in m. 25 as a B-double-flat chord (♭II in A♭
major). The B section sets up the return of the A section with a prolonged dominant, which could
be read from a Schenkerian perspective as the arrival of 2̂ in an interrupted structure. The A'
section begins in m. 30, exactly repeating the music and text from the beginning until m. 35.
Here Schumann tonicizes ii instead of IV, and the lyrics return to the text from the end of the B
section. The tempo broadens and the vocal line rhetorically leaps up to F5-E♭5 (covering the
structural 2̂) before falling conclusively to 1̂ on the strong beat of m. 39, achieving tonal closure
(from a Schenkerian perspective, completing the Ursatz).
The text (provided along with an English translation in Table 3.1) is a setting of
F. Rückert's poem. In the first half of the poem, set in the A section of Schumann's song, the
author acknowledges that his love is his entire life. The anaphora preceding each characteristic of
his lover ("du meine") is contrasted with the anaphora "du bist" in the second half of the poem (set
in Schumanns B section). In this second half, the author describes the transformative power of
his lovers affection. The twelve-line poem consists of rhymed couplets, where both lines within
each couplet have the same number of syllables. The poem alternates between couplets of eight
and couplets of nine syllables until the last couplet, which breaks the pattern and substitutes an
eight-syllable couplet in place of the expected nine-syllable couplet. Rückert's consistent syllabic
structure and rhyme scheme within each couplet provide a listener with very specific
expectations for the ending of each couplet.
The A section has only one true ending, as dictated by tonal schemata: the perfect
authentic cadence in m. 13 (see Example 3.2) completes the goal-directed harmonic process
initiated at the beginning of the song. Other musical markers, such as the middleground
descending step progression and the relatively long vocal note followed by a rest (coupled with
the diminuendo and ritardando in the piano) strengthen the feeling of closure at this point. These
musical features represent deep schematic expectations, which apply to a larger variety of works
than do the more surface schema for tonal syntax. Other points in the A section (such as in
mm. 5 and 10) also include a descending pitch contour, a longer rhythmic duration, and
diminuendos to mark endings, but these points do not evoke the same feeling of finality as does
the cadence in m. 13. The more surface schemata of tonal process allow a listener to predict a
more specific type of ending (a tonic chord on a strong beat) than do more general schematic
signs of ending (or, to use Meyer's term, "secondary parameters").
Table 3.1: Text and Translation of "Widmung"
Poem by F. Rückert [trans. by Reinhard 1989, 128-9]
The number of syllables in each line is noted to the left of the text

8  Du meine Seele, du mein Herz,            You my soul, you my heart,
8  Du meine Wonn', o du mein Schmerz,       You my bliss, oh you my grief,
9  Du meine Welt, in der ich lebe,          You my world in which I live,
9  Mein Himmel du, darin ich schwebe,       My heaven, you, therein I soar,
8  O du mein Grab, in das hinab             Oh you my grave, down into
8  Ich ewig meinen Kummer gab!              I eternally gave my sorrow!

9  Du bist die Ruh, du bist der Frieden,    You are repose, you are peace,
9  Du bist vom Himmel, mir beschieden.      You are bestowed to me by heaven.
8  Da du mich liebst, macht mich mir wert,  Your love for me makes me worthy to myself,
8  Dein Blick hat mich vor mir verklärt,    Your gaze has transfigured me within my own eyes,
8  Du hebst mich liebend über mich,         You lift me above myself with your love,
8  Mein guter Geist, mein bessres Ich!      My good spirit, my better self!

A listener could segment the first phrase into smaller units; as Meyer (1973) explains, a
hierarchy of closure emerges based on the proportion of musical parameters projecting an ending
to the parameters projecting continuation. A listener's previous musical encounters will
determine whether the features of Schumann's first phrase project closure or continuation.
Because harmony is one of the best predictors of endings in this style, an experienced listener
will presumably first sense closure in m. 13. The subphrase-concluding harmonies prior to this
point imply continuation because an experienced listener will have specific expectations for
subsequent harmonies. A subphrase in this analysis consists of a grouping on a lower hierarchic
level than the harmonically-driven phrase.25
25 While in this analysis subphrases are marked by an initiating and concluding gesture occurring on the
same hierarchical level, subphrases in other pieces could be formed on the basis of discontinuity within the phrase.
Further, in this analysis there are different levels of subphrases: larger subphrases can be divided into smaller
subphrases.
The first two subphrases (mm. 1-3 and mm. 4-5; refer to Example 3.2) are similarly
delineated by rests, and each ends with a relatively long note on a strong beat. The first subphrase
ascends to C5 over a prolonged tonic harmony, and a change of harmony initiates the second
subphrase, which reaches beyond the registral boundary established by the first subphrase before
stepping down to 2̂ (harmonized with the borrowed ii6/5 chord) in m. 5. Both the first and second
subphrases clearly set up the expectation that more music will follow, but in the local context
they group together to form a higher hierarchical unit. Although musical elements at the end of
the second subphrase also imply continuation (setting up sentential expectations and arriving on
an unstable scale degree and harmony), the descending gesture in mm. 4-5 balances the rising
gesture in mm. 2-3, creating a stronger boundary at m. 5. The text influences my interpretation
as well: the last word's rhyme confirms the grouping of these subphrases. In other words, I am
able to predict the conclusion of the second subphrase better than the first because of the rhyme
scheme and repeated syllabic pattern.


Example 3.2: Schumann, "Widmung" from Myrthen, Op. 25, No. 1, mm. 1-13
The annotations under the score represent the two levels of subphrase analysis discussed in the text.

Example 3.2 (continued): Schumann, "Widmung" from Myrthen, Op. 25, No. 1, mm. 1-13
Both the subphrase that begins with the V4/2 chord in the second half of m. 5 and the
subphrase that begins with the V4/2/IV chord in the second half of m. 7 use the same harmonic
progression, first in the home key, then tonicizing IV. Even without a rest separating the two
events, the repetition of the harmonic progression and basic melodic outline (2̂-1̂-7̂-3̂), the 4-3
suspension on the strong beat, and the dynamic expectation of subphrase length (established by
the first two subphrases) suggest that the V42/IV chord in the second half of m. 7 begins a new
event. Again, these subphrases group together because of their musical similarity; a listener
would have strong expectations for the location of the end of this second subphrase because of
the harmonic and melodic repetition. A dynamic expectation of subphrase length coupled with
the number of syllables present in each line of text also provides fairly specific expectations for
this point of conclusion. The arrival on the pre-dominant harmony with the rhyming final word
of the couplet in m. 9 does initially sound like the conclusion of the subphrase, fulfilling a
listeners expectation for an ending.
However, the material that immediately follows (end of m. 9 to the beginning of m. 10) does
not sound like a complete subphrase: it sounds like a stronger ending than the resolution of the
4-3 suspension in m. 9. Despite confounding dynamic expectations for subphrase length, hearing
the arrival on 2̂ in m. 10 as an ending confirms other dynamic expectations. "O du mein Grab"
begins with the same words and melodic material as the end of the second subphrase (mm. 4-5),
producing specific expectations for an ending. One possible reading of the grouping structure is
that this short segment groups with the previous subphrase to prolong the pre-dominant area,
creating a longer subphrase in mm. 8-10, which, when combined with the previous subphrase in
mm. 6-7, essentially replicates the harmonic motion in mm. 2-5. Even though both measures
conclude with a descent down to 2̂, m. 10 sounds more implicative than does m. 5. Perhaps the
premature stop in the middle of a line and the absence of a complete rhyming couplet project
continuation at the surface level, while the expanded pre-dominant chord increases anticipation
for an upcoming cadence at the middleground level. This anticipated completion of a harmonic
schema and the unexpected subphrase length overshadow other signs of ending.
This section's last subphrase is longer than the previous ones, concluding in m. 13 with a
PAC. Interestingly, Schumann disguises the end of the fifth line, which occurs on the downbeat
of m. 11 (the expected place for the line to end, as implied by the previous subphrase lengths).
The rising melodic contour and short note values make "in das hinab" sound more like a musical
beginning than an ending. The cadential arrival coincides with the conclusion of a rhymed
couplet, and, although the end of the couplet's first line was not musically articulated, Schumann
takes advantage of the internal rhyme in line five. In this first section, Schumann's compositional
decisions and the text of the poem influence the perceived grouping structure of the subphrases,
the conclusiveness of the endings, and the sense of closure at the end of the A section.
Even though Schumann concludes the first phrase with a 3̂-2̂-1̂ melodic motion coupled
with a strong harmonic cadential gesture, an experienced listener is unlikely to hear this as the
final close of the piece. There has not been enough music; a listener familiar with Romantic
Lieder would know that a contrasting section, or at least another stanza, usually follows.
Furthermore, the poem itself remains open; the speaker has only listed the characteristics of his
love. Musicians who take an organicist analytical position may further state that the full
implication of the borrowed ♭6̂ has not been fully realized, necessitating more music.
The common-tone modulation to ♭VI (enharmonically respelled as E major) begins the
song's next section (see Example 3.3). This section is differentiated from the first not only by
this key change but also by a change in the accompaniment pattern and the longer durations in
the melodic line. After the half cadence (HC) in m. 21, harmonic and melodic patterns from the
first section recur followed by another common-tone modulation back to the home key. Despite a
ritardando and a return of the original accompaniment figure, the return to the original key does
not sound like the third part of a ternary form. Part of the reason is that elements from the A
section gradually emerge from the B section, blurring the boundary from the perspective of tonal
and thematic content. It is not until the HC in m. 29, which concludes the phrase begun in m. 22,
that there is a strong enough anticipated ending to close the B section of this Lied. The rhymed
couplet, the dominant prolongation preceding this point, and the ritardando that occurs in the
previous measure all contribute to the close of this section. However, the very nature of a HC
and, to a lesser extent, the ascending melodic line implies continuation. In his paper at the 2010
meeting of the Society for Music Theory, Poundie Burstein explored this very issue of
continuation at a half cadence, noting that the implicative nature of the V chord sometimes blurs
the distinction between a half-cadential ending and an elided authentic cadence.

Example 3.3: Schumann, "Widmung" from Myrthen, Op. 25, No. 1, mm. 14-29
From a theoretical perspective, a HC can end a phrase, but the phrase is not considered
closed. If closure is seen as the completion of a goal-directed process, the most common
harmonic process in tonal music is the motion away from tonic and then motion back towards
tonic. For instance, in his textbook Harmonic Practice in Tonal Music, Gauldin describes half
cadences as "open" cadences compared to "closed" authentic cadences (2004, 132). However,
listeners do experience a sense of some finality at a HC because of the role of expectation in the
perception of closure. While a HC does not conclude on a restful tonic chord, it does signify
the end of a common harmonic paradigm. As previously discussed, listeners misattribute the
feeling of a goal-directed process to the increased prediction success when successive elements
in a script-like schema are confirmed. Since the ending on V can be anticipated, there is a degree
of finality at the arrival of a HC, and yet a HC does not sound as closed as an authentic cadence
because the dominant harmony itself implies continuation. Despite the prevalence of half-cadential harmonic patterns, there are still many more traces in long-term memory where the V
chord does, in fact, proceed to a tonic harmony.26
Because Schumann's A' section replicates many elements from the beginning, I will not
provide a thorough analysis of this section, but a few words about the structural close and the
codetta are warranted. To heighten the expectation of closure in m. 39, Schumann uses rhetorical
and formal markers of closure. Returning to the last two lines of the poem, Schumann concludes
the song's narrative by reiterating the transformative power of the poet's love. The dramatic leap
up to 6̂ before the leap down to the final tonic covers the structural descending step progression.
This rhetorical flourish points towards the approaching cadence, whose finality is further
emphasized by a ritardando.
An experienced listener would know that the song is unlikely to end with this structural
close, because most Lieder conclude with a piano codetta. The passing 4/2 chord following the
cadence implies a continuation, which is realized by the return of the accompaniment
figure from the beginning. This codetta divides into two segments (see Example 3.4), each one
concluding with a V-I harmonic motion. Adopting Caplin's (2005) perspective on phrase and
cadence, these two-bar units do not constitute a phrase: they are merely an external phrase
extension, repeating the cadential gesture from the previous phrase.27 The last one sounds more
final because it lacks the passing 4/2 chord on the third beat of m. 43 and because the tonic harmony is
sustained as the tempo slows down and the music fades out.28

26 There are instances where the half cadence can sound more conclusive. Most notably, the Phrygian half
cadence (typically iv6-V) is a common harmonic formula in Baroque music that may conclude non-final movements
in multi-movement works.

27 According to Caplin, a cadence must end a formal unit. These two-bar units do not end anything; rather,
they are extra endings tacked onto the end of the phrase.

Example 3.4: Schumann, "Widmung" from Myrthen, Op. 25, No. 1, mm. 37-44
In this analysis, the completion of harmonic schemata projected the strongest feeling of
closure. Harmonic schemata, by their very nature, are broad generalizations formed by many
exemplars and tend to elicit rather specific expectations, allowing a listener to experience
anticipatory closure at cadential points. Signs of closure such as silence and increased duration
contribute to a work's grouping structure, but perhaps not its sense of closure (especially at the
level of the subphrase) because the arrival of these features is not as strongly anticipated.29

28 One can point to the final sonority of this song, an A♭-major triad in second inversion, as evidence that
this song does not have a satisfying ending; however, I hear the low A♭2 at the beginning of the measure as the
functional bass note for the entire measure. Further, when viewed as part of the entire song cycle, this opening song
contains implicative elements in terms of the overall narrative of the cycle, but in this analysis I only examined
closure within the context of this single song. My analysis of Copland's song "The World Feels Dusty" will
consider closure within the context of the entire song cycle.

29 A listener with less experience in this musical style might have quite a different impression of endings. I
imagine that sonic disjunctions might exert much more influence in the song's segmentation and in the listener's
perception of closure.
Returning to Meyer's (1973) suggestion that the strength of a particular ending can influence the
perception of form, it follows that the strongest closes conclude the larger sections of the ternary
form, while weaker endings delineate and group the subphrases. Furthermore, sophisticated
listeners may use genre-specific knowledge (the relationship between the piano and voice,
stylistic knowledge about typical Romantic-era harmonic language, and conscious knowledge
about prototypical formal construction) to shape expectations, affecting their perception of
closure for this work.30
Webern's "Der Tag ist vergangen"
In my analysis of closure in "Widmung," I relied primarily on tonal paradigms. Although
segmentation in post-tonal styles may seem more reliant than tonal music on differences in the
musical surface, I believe that schematic knowledge structures still provide a top-down guide to
segmentation. One such structure is an increase in tension followed by a decrease in tension,
marking a complete unit (this is also discussed by Bryden [2001] and Hopkins [1990]). This deep
schematic expectation does not imply a specific type of ending or point of ending, but refers to a
general phrase shape based on exposure to many different types of music.31 Both Meyer's
"primary" and "secondary" musical parameters can influence our learned association between
falling or decreased intensity and our perception of closure. Even in tonal music, musicians
maintain that we experience an increase in harmonic intensity followed by a relaxation at
cadences. According to Huron (2006), this feeling of intensity and relaxation is misattributed: it
is our strong first-order expectations at the dominant chord followed by the reward for a correct
prediction at the tonic chord. Some other musical elements (discussed in the previous analysis),
such as falling pitch contour and a decrease in tempo and dynamics, can also contribute to a
sense of lessening intensity.
Violations of the even deeper expectation for continuity in sound also play a role in the
segmentation of music. Abrupt changes in pitch and articulation can segment a work into motivic
units, usually occurring within a phrase, while other changes, such as the intrusion of silence and
30 This interaction between a bottom-up construction of form and pre-existing top-down knowledge of formal structure will be explored further in Chapter 7.
31 Some authors derive this structure from our embodied experience (Snyder 2000), while others attribute it to the perception of movement by musical forces (Bryden 2001). In either case, statistical generalizations of how musical segments should end are accumulated in long-term memory.


the lengthening of durations, usually occur at the end of a phrase. It is the degree to which these
elements are anticipated, if they are at all, that gives rise to feelings of closure.
Schematic knowledge of phrase shape and discontinuity in the sound influence a listener's sense of closure (and therefore formal structure) in Webern's "Der Tag ist vergangen."
Dynamic expectations also play a role in establishing which musical parameters will contribute
to the sense of closure. Consider, for instance, the song's opening measures (reproduced in Example 3.5), which illustrate the parameters that contribute to the shape of phrases in this work: dynamics and pitch register, where an increase in the dynamic level and pitch height builds musical tension, followed by relaxation as the dynamic level drops and the pitches descend.
This basic shape is expanded in the first vocal phrase, where the vocal line has two distinct peaks, one in each line of the poem, with the second peak reaching a step higher than the first. The piano line has a similar shape in mm. 5–6, but it does not align with the vocal shape. The vocal line is foregrounded in this phrase, due, in part, to its higher register, making the vocal rest at the end of the phrase more salient than the silence separating gestures in the piano part. A listener who did not recognize a phrase ending with the arrival of the G4 in m. 6 (arrival closure) would almost certainly realize the phrase had concluded with the first instance of silence in the vocal line (retrospective closure).32 If this silence is insufficient, the piano punctuates the end of the phrase with a two-chord gesture that collapses in pitch space (a span of 56 semitones contracting to 34 semitones).
The first phrase creates a dynamic expectation for phrase length and basic phrase shape. The second phrase traverses more pitch space, reaching past the registral boundaries of the first phrase, but still descends in pitch space at the end of the phrase. The end of the second phrase sounds more conclusive because its length exactly matches that of the first phrase, confirming a listener's expectations for when the phrase should end. Coupled with the rhyme scheme, a listener can experience arrival closure on "mir," or even anticipatory closure if the listener is paying attention to phrase length.

32 There are elements present in this first phrase that could elicit anticipatory closure: the foreshortening of the contour segments and the syntax of the text.


Webern, VIER LIEDER, OP. 12
© 1925 by Universal Edition A.G., Wien; Renewed; All rights reserved
Used by permission of the European American Music Distributors LLC,
U.S. and Canadian agent for Universal Edition A.G., Wien

Example 3.5: Webern, Vier Lieder, "Der Tag ist vergangen," Op. 12, No. 1, mm. 1–11
In the first stanza, an increase in the dynamic level and pitch height correlates with an increase in tension, which is relaxed as the dynamic level and pitch drop, but this pattern does not hold true at the end. While the dynamics decrease in intensity towards the end (Example 3.6), the vocal line leaps a diminished octave up to F5. One could argue that the pitch is displaced an octave and essentially moves only a half step from the preceding F#, but even aside from this explanation there are other elements that clearly contribute to the sense of closure here. First, the dynamic expectations established by the first stanza suggest that the length of the second stanza will be similar, which it is, and also that it will probably end with a word conforming to the poem's rhyme scheme, which it does. The descending perfect fifth between the last pitch of phrase one (G4) and that of phrase two (C4) is inversionally related (in pitch space) to the interval between the last pitches of phrases three and four (B♭4 and F5). This intervallic balance may also contribute to closure.33

Webern, VIER LIEDER, OP. 12
© 1925 by Universal Edition A.G., Wien; Renewed; All rights reserved
Used by permission of the European American Music Distributors LLC,
U.S. and Canadian agent for Universal Edition A.G., Wien

Example 3.6: Webern, Vier Lieder, "Der Tag ist vergangen," Op. 12, No. 1, mm. 18–21
Incidentally, ending a song with an upward leap is not uncommon within Webern's own oeuvre; for example, the well-known songs "Wie bin ich froh!" and "Vorfrühling" end with rising vocal leaps. While this otherwise unusual interval may be a composer-specific sign of closure, the small number of examples of this gesture ending a musical segment, compared with the vast number of more typical endings, may still imply openness or continuation for even the most expert of Webern listeners.34
Copland's "The World Feels Dusty"
In his setting of Emily Dickinson's poem "The world feels dusty," Copland manipulates a listener's expectation for closure. His consistent use of two-bar groupings and stereotypical tonal patterns at the beginning of the song creates a set of dynamic expectations that are left unfulfilled
33 An analysis could also imply latent tonal tendencies (Kurth 2000) by pointing to allusions of a tonal relationship between these two intervals (suggesting that the first and third phrases end with a kind of HC that is answered by a PAC). While this is a possible hearing, I think other factors discussed in my analysis play a larger role in projecting closure.
34 One could argue that regarding a large ascending leap as strong closure would constitute an attempt to normalize music that is intended to be unusual. The very idea of the text ("Give to the deceased eternal rest") suggests eternal stasis, not closure at all, so Webern may be manipulating closural expectations in order to explore meaning within the text.


as the song progresses. These unfulfilled expectations for closure might lead the listener to an understanding of the poem that differs from reading Dickinson's words without music, and presumably Copland's compositional choices reflect his own interpretation of the poetry. Of course, we might also understand this song differently in the context of the entire song cycle.
The text for this song is taken from the 1929 Bianchi edition of Dickinson's poems, which includes quite a few changes from Dickinson's original poem. For instance, the last two lines were changed from "And Hybla Balms / Dews of Thessaly, to fetch" to "Dews of thyself to fetch and Holy balms." According to Cherlin (1991), the 1929 edition also removes the possibility of enjambment by inserting periods at the end of each stanza, thereby limiting the number of possible readings of this poem.
The dashes at the ends of stanzas equivocate, no less than those internal to quatrains. At the ends of stanzas dashes avoid the strong sense of closure that the unfortunate editorial choice of 1929, periods, bring. Hence, "Honors," "taste dry," "Flags," "vex" are separated by versification, yet connected by belonging to the same short catalogue of things dry, vexing, things not needed that can crowd out those most needed. (58)
Still, multiple readings of the text are possible. Looking just at the text, this poem can be understood as leaving behind the dusty honors and flags of this world and longing for close companionship while death is embraced. Another interpretation places the poem within the context of Copland's song cycle.35 "The world feels dusty" immediately follows the poem "Why do they shut me out of heaven?" Taken in this context, Baker (2003) argues that the speaker longs for the dews of this world, since she will not receive the honors of the next.
At the moment of death, when the good Christian is said to weary of the world and seek heavenly waters for spiritual relief, Dickinson argues that one thirsts for the dew of this world, not the next. We want the dew then; Honors taste dry. The honors that come with the privilege and victory of death in anticipation of salvation (to invoke the metaphors of the nineteenth-century Calvinist) taste dry on the tongue of one who has just learned in song three that she will be denied happiness in the afterlife. Dickinson looks to the soothing waters of worldly friendship as a holy balm that might restore one to life on earth. (11)

35 Soll and Dorr's (1992) research indicates that Copland intended for the song cycle to be heard as a whole (100). Their subsequent analysis reveals features supporting their cyclic reading, where the intricate musical and textual materials "combine to create a highly organized and unique overall structure" (101).


In either case, Copland's manipulation of a listener's expectation for closure can color the meaning of the text. This analysis will focus on the musical expectations Copland creates at the beginning of the song and what the confirmation or denial of these expectations might mean.
Although diatonic, this song does not project a clear tonal center in the manner of Schumann's "Widmung," but the diatonic (or perhaps very weakly tonal) nature of this piece creates tonal implications, unlike Webern's highly chromatic song "Der Tag ist vergangen." F# minor seems to be prevalent in the piano part, but the voice emphasizes D major in the first few measures. This conflict is also present at the end of the first stanza, where the voice clearly implies B minor (with a skip down a fifth in the vocal part to emphasize pitch-class B), but the piano part does not corroborate this tonal center. Fleeting tonal implications would activate traces of previous experiences with tonal music, eliciting tonal expectations.
The piano introduction establishes an expectation for two-measure subphrases with an
emphasis on the second beat of each measure (see Example 3.7). The vocal line continues this
two-measure division in the first two lines of the poem, and both of these subphrases create a
dynamic expectation that registral extremes in the vocal line will correlate with segmentation.
The established subphrase length from the piano introduction projects a boundary after "dusty" in m. 4, even though the contour ascends to this pitch and the rhythmic syncopation emphasizes the weak syllable of this word. The stronger boundary in m. 6 concludes a higher-level subphrase unit. The features contributing to the close in m. 6 include the completion of a thought in the text, the slightly longer note on the word "die," and a longer break in the vocal line. The pitch in m. 6 is higher than in m. 4, confirming a dynamic expectation that a range extreme correlates with the end of musical segments.
Copland changes the subphrase grouping in the remainder of the first stanza. In m. 7, the vocal line articulates a one-measure group, with the durational accent beginning on the second beat of the measure, echoing the sighing piano gesture from the beginning. The piano part supports this reading: the recurring gesture is transposed up a third for just this measure before slipping back down to the original pitch level in m. 8. The increased pace is surprising, differentiating this subphrase from the previous material.


"The World Feels Dusty" by Aaron Copland, text by Emily Dickinson
Copyright © 1951 by The Aaron Copland Fund for Music, Inc.
Copyright Renewed. Boosey & Hawkes, Inc., Sole Licensee.
Reprinted by Permission.

Example 3.7: Copland, Twelve Poems of Emily Dickinson, "The World Feels Dusty," mm. 1–2
The last line of text arpeggiates a B-minor triad, using the same syncopated rhythm from m. 4, and concludes on the lowest vocal pitch sounded thus far (B3). While this creates an arch-shaped phrase (the register gradually ascends through B4 in m. 4 to the E5 in m. 7 before falling quickly in the last line), other musical parameters do not follow the pattern of abatement suggested by Hopkins (1990). The accompaniment at this cadential point subtly changes from a sighing gesture to a rising step gesture in an inner voice, but perhaps the most obvious continuational parameters are the increased tempo and dynamics.36
In summary, the first stanza sets up these expectations:
1) Text: four lines in a stanza, abcb rhyme scheme, 5+5+5+4 syllabic pattern
2) Two-measure subphrases (except for the last three measures in the vocal line) creating balanced larger groups (2+2)
3) Diatonic, but not strongly tonal, melodic line
4) Generally, registral extremes mark endings
5) More specifically, phrases end with a falling gesture, concluding with the pitch B3
However, in the second stanza, many of these dynamic expectations are altered to match the changes in the text. The rhyme scheme changes to abbc and the syllabic pattern shifts slightly to 6+4+5+4. These changes, along with the descending contour in mm. 12–13 and the similar melodic gestures in mm. 14 and 15, lead me to group the subphrases a bit differently from the first phrase. I hear a 1+2+1 grouping instead of the balanced 2+2 structure I heard earlier. The last note of this section overshoots the anticipated B3 from the first section, arriving on A#3. Cherlin (2003) recognizes that
36 The use of secondary parameters to imply continuation at cadential points is not atypical for Copland's settings; see "There Came a Wind Like a Bugle" and "Heart, We Will Forget Him."


the low A# on "rain" is clearly a substitute for the B of "dry," which is the expected note, the fresh note for refreshing rain. This touch is quite effective, and it works in conjunction with a modulation (extending the term slightly so that it can fit this not-quite-tonal context) projected by a shift in the piano's ostinato. (73)
These changes in the second stanza, along with the piano's registral change (now reaching down to G2), the louder dynamics, and the faster tempo, set this section apart from the first. Copland's B section creates an expectation for ternary form, confirmed by the return of the original tempo in m. 17 and the opening sonority (minus pitch-class B) in m. 19. However, any expectation for a clear return to the A section is thwarted by the lingering A# in the piano part and the new vocal line. A literal repetition of the A section would not capture the poet's transformed perspective: that friendship provides needed comfort at the time of death.
The material in mm. 23–24 imitates the pitch and rhythmic content of mm. 7–8 (where the pace increased in the first phrase), both passages setting similar texts: "we want the dew then" (mm. 7–8) and "Dews of thyself to fetch" (mm. 23–24). The listener may consequently develop strong expectations for the phrase in this final section to close on B3, similar to the ending in m. 9; however, the phrase instead arrives on A#3. Along with this surprising pitch, the abrupt change of pitch collection (G Aeolian) and the final rising gesture in the piano (m. 27; see Example 3.8) suggest that this song is not closed at all; it is just one in a cycle. While elements at the end of this song imply finality, and multiple hearings of the song can allow a listener to experience a sense of closure, the first encounter might leave a listener unsatisfied. Copland may have chosen to undermine strong closure at the end of this song in order to underscore the text. The poem ends trapped in the time between life and death; strong closure would imply death. Here the poet wants to hold onto life in this world as long as possible.
This lack of satisfactory closure leads me to consider further how this poem functions in the larger context of the song cycle. Baker (2003) argues that its placement after "Why do they shut me out of heaven?" shows that the poet chooses a worldly life, ministering to friends, over the afterlife. This reading is supported in Copland's last song of the cycle, "The Chariot," where the protagonist will not stop for Death. Although this reading creates a single trajectory from the beginning of the cycle until the end, I tend to favor the interpretation of the song offered in the previous paragraph. For me, this song cycle explores various aspects of the human experience: our relationships with nature, with each other, and, ultimately, with death. Copland's blatant thwarting of rather specific expectations created in the first stanza captures the timelessness Dickinson drafted, denying death as long as possible in order to enjoy earthly friendship.

"The World Feels Dusty" by Aaron Copland, text by Emily Dickinson
Copyright © 1951 by The Aaron Copland Fund for Music, Inc.
Copyright Renewed. Boosey & Hawkes, Inc., Sole Licensee.
Reprinted by Permission.

Example 3.8: Copland, Twelve Poems of Emily Dickinson, "The World Feels Dusty," m. 27
In the three preceding examples, we have seen that expectation informs a listener's sense of closure, segmenting musical experience. The perception of goal direction is an artifact of transitional probabilities formed through a listener's familiarity with a musical style. The positive valence resulting from the confirmation of a listener's expectation for the end of a segment, followed by a weakening of expectations for subsequent events (i.e., an increase in prediction error), causes a feeling of finality (or momentary repose). The perceived strength of closure will correlate with the specificity of expectation and the number of traces in long-term memory, which together will influence the difference between the transitional probabilities for sonic entities approaching and leaving an ending. From here, I hypothesize that there are two main factors that influence our perception of closure:
1) Knowledge structures will guide our expectations for when and how musical segments will conclude. These knowledge structures give rise to schematic, veridical, and conscious expectations, promoting anticipatory and arrival closure.
2) At the same time, sonic disjunctions in the musical surface features will segment musical experience. We expect musical surface features to continue in a similar manner (deep schematic expectation); when they don't, we retrospectively perceive closure.
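The probabilistic reasoning behind the first factor can be sketched computationally. The toy model below is not part of this dissertation's method; the mini-corpus of Roman-numeral progressions, the function names, and the scoring scheme are my own inventions for illustration. It estimates first-order transitional probabilities from a tiny training set and marks positions where a strongly expected arrival is followed by weakly predictable continuations, i.e., a local rise in prediction error of the kind described above.

```python
from collections import defaultdict

def transition_probs(sequences):
    """Estimate first-order transitional probabilities from training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def closure_strength(seq, probs):
    """Score each position by the probability of the arriving event minus the
    best probability of any continuation: a confident arrival followed by weak
    expectations for what comes next suggests a point of repose."""
    strengths = []
    for i in range(1, len(seq)):
        arrive = probs.get(seq[i - 1], {}).get(seq[i], 0.0)
        follow = max(probs.get(seq[i], {}).values(), default=0.0)
        strengths.append(arrive - follow)
    return strengths

# A hypothetical mini-corpus of chord progressions.
corpus = [["I", "IV", "V", "I"], ["I", "ii", "V", "I"], ["I", "V", "I"]]
probs = transition_probs(corpus)
melody = ["I", "IV", "V", "I"]
print(closure_strength(melody, probs))  # the score peaks at the final tonic
```

In this invented corpus, V always moves to I, so the V–I arrival is maximally expected, while continuations from I are diffuse; the difference is greatest at the final tonic, matching the intuition that closure is strongest where specific expectation is confirmed and subsequent expectation weakens.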
Musical surface features can either support or contradict the expectations formed through knowledge structures, and the weight of elements projecting closure compared to those projecting continuation creates a hierarchical grouping structure. I predict that sophisticated listeners with more experience in a particular style will tend to rely more on knowledge structures than on sonic features. Expectations arising from both of these types of features are incorporated in Event Segmentation Theory (EST), a cognitive model of segmentation. The next chapter explores this cognitive model and applies it to the perception of musical structure and closure.


CHAPTER 4
EVENT SEGMENTATION THEORY
Chapters 2 and 3 illustrated that segmentation is closely related to closure, comprising an integral part of my definition of closure: the anticipated end of a musical segment. Segmentation in music depends both on the bottom-up processing of sensory features and on the top-down processing of knowledge structures. Deep schematic expectations for sound continuity are reflected through the bottom-up sensory features that segment musical experience (for instance, the musical features listed in Lerdahl and Jackendoff's GPRs 2 and 3). Knowledge structures inform surface schematic expectations, allowing a listener to anticipate endings and providing a framework for musical segmentation. Event Segmentation Theory (EST), as proposed by Christopher Kurby and Jeffrey Zacks (2007), incorporates segmentation based on learned knowledge structures and changes in sensory features into a single cognitive model. Kurby and Zacks's proposed process by which we segment everyday events is grounded in expectation, which, when applied to music, can account for the perception of musical closure.

Event Segmentation
An event, as defined by Kurby and Zacks (2007), is "a segment of time at a given location that is conceived by an observer to have a beginning and an end" (72). This definition is extremely similar to how musicians define a phrase, which can be considered a discrete event in music. For instance, Lerdahl and Jackendoff (1977) define a phrase as "the lowest level of grouping which has a structural beginning, a middle, and a structural ending (a cadence)" (123). If we carry this line of thinking to its logical conclusion, then any musical object is an event since it exists as a temporal experience; for instance, a single tone has a beginning and end, so by the definition posited by Kurby and Zacks, it is an event. But this conclusion goes too far because musical tones are not usually heard in isolation; instead, tones are usually grouped together to create larger units. Because I am examining the perception of musical closure, I examine grains of segmentation resulting from the creation of formal units, ranging from subphrases through entire pieces.


Event segmentation is the process by which people parse a continuous stream of activity into meaningful events (Zacks and Swallow 2007). Research has shown that event segmentation is an automatic, hierarchical process, and is essential for guiding memory and learning. In many event segmentation experiments, participants designate perceived boundaries through an explicit task. Although this does not directly support the idea that event segmentation is automatic, widespread agreement among individuals on the location of event boundaries suggests that individuals may be tapping into ongoing event processing (Kurby and Zacks 2007; Zacks, Speer, and Reynolds 2009). Even music segmentation studies reveal a high level of agreement among participants on the location of boundaries (Joichi 2006). Listeners broadly agree on the location of musical segments across a variety of literature (Deliège 1987; Deliège et al. 1996; Krumhansl 1996). Even when confronted with ambiguous stimuli, participants using the same segmentation strategy still generally agree (Pearce, Müllensiefen, and Wiggins 2010), implying a shared underlying cognitive process. However, an explicit task may change the nature of the perceptual processing (Zacks and Swallow 2007, 80); Zacks, Tversky, and Iyer (2001) found that participants' segmentation behavior varies depending on whether the participants verbally described the event and, to a lesser extent, their familiarity with the stimulus.
Brain-imaging data provide much stronger evidence for an automatic process. In one
such study, participants passively watched movies of actors portraying everyday events while
their brains were scanned. The fMRI showed increased brain activity at points later designated
by the same subjects as event boundaries (Zacks et al. 2001). A related music study found similar
results. Musically untrained participants listened to excerpts from a multi-movement orchestral
work by William Boyce while their brain activity was recorded with fMRI (Sridharan et al.
2007). There was increased brain activity in two distinct regions of the brain coinciding with
movement boundaries: first there was an increase in activity in the ventral network, corresponding to violations of musical expectancy, followed by an increase in activity in the dorsal network, corresponding to the processing of new musical information. Despite not directly attending to the event structure of the stimuli, participants in both studies were still sensitive to event boundaries.
Event segmentation is hierarchical and occurs simultaneously on multiple time scales
(Kurby and Zacks 2007). When asked to mark the smallest meaningful units and the largest
68

meaningful units, participants' fine-grained divisions are nested within their coarse-grained
divisions (Zacks, Speer, and Reynolds 2009). Music is also understood to be hierarchical: notes
form motives, which form subphrases, which form phrases, etc. Lerdahl and Jackendoff (1983)
have laid out grouping well-formedness rules (GWFRs) as well as grouping preference rules
(GPRs, discussed previously) to explain how a listener creates groups as well as how these
groups are hierarchically related.37 These rules are listed in Table 4.1.
Table 4.1: Lerdahl and Jackendoffs Grouping Well-Formedness Rules
GWFR 1: Any contiguous sequence of pitch-events, drum beats, or the like can
constitute a group, and only contiguous sequences can constitute a
group.
GWFR 2: A piece constitutes a group.
GWFR 3: A group may contain smaller groups.
GWFR 4: If a group G1 contains part of a group G2, it must contain all of G2.
GWFR 5: If a group G1 contains a smaller group G2, then G1 must be
exhaustively partitioned into smaller groups. (37–39)
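Read computationally, GWFRs 2, 4, and 5 describe a strict containment hierarchy: the piece itself is a group, groups never partially overlap, and any group that contains a subgroup is exhaustively partitioned by its immediate subgroups. The sketch below checks these constraints, representing groups as (start, end) time spans; the representation and function name are my own, not Lerdahl and Jackendoff's.

```python
def well_formed(piece_span, groups):
    """Check a set of (start, end) spans against GWFRs 2, 4, and 5.

    GWFR 2: the piece itself counts as a group (added to the set here).
    GWFR 4: no two groups may partially overlap.
    GWFR 5: a group containing any subgroup must be exhaustively
            partitioned by its immediate subgroups.
    """
    spans = sorted(set(groups) | {piece_span})
    # GWFR 4: any two spans must be disjoint or nested, never crossing.
    for a in spans:
        for b in spans:
            if a[0] < b[0] < a[1] < b[1]:
                return False
    # GWFR 5: a parent's immediate children must tile it exactly.
    for parent in spans:
        children = [s for s in spans
                    if parent[0] <= s[0] and s[1] <= parent[1] and s != parent]
        immediate = [c for c in children
                     if not any(d != c and d[0] <= c[0] and c[1] <= d[1]
                                for d in children)]
        if immediate:
            immediate.sort()
            tiled = (immediate[0][0] == parent[0]
                     and immediate[-1][1] == parent[1]
                     and all(x[1] == y[0]
                             for x, y in zip(immediate, immediate[1:])))
            if not tiled:
                return False
    return True

# A phrase spanning mm. 1-8 split into two subphrases nests legally...
print(well_formed((1, 9), [(1, 5), (5, 9)]))          # True
# ...but a group straddling a subphrase boundary violates GWFR 4.
print(well_formed((1, 9), [(1, 5), (5, 9), (3, 7)]))  # False
```

GWFR 3 ("a group may contain smaller groups") is permissive rather than restrictive, so it needs no check; the crossing test captures the intuition that grouping, unlike mere succession, is hierarchical.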
Being able to segment an activity into events can also guide learning and understanding. Zacks et al. (2006) demonstrate that subjects tend to parse events in a similar manner, usually creating segments that correspond to nameable events, and the better a person can form large meaningful groups, the better he or she will remember the activity. Here, elderly adults segmented movies of everyday events; some adults did the task well, meaning their segmentation followed the patterns of the group, while other adults were not as successful. In a subsequent identification task, participants were shown still pictures, some of which were drawn from the movies and others of which were not. Participants who segmented the movies successfully also better identified the visual images from the movies. This research is directly applicable to music. As discussed in Chapter 3, some ear-training curricula teach undergraduates to listen for larger patterns in order to remember the passage for dictation, since creating discrete, nameable units aids in remembering the music.38
37 In their chapter on grouping structure, Lerdahl and Jackendoff liken the grouping of the musical surface to partitioning "the visual field into objects, parts of objects, and parts of parts of objects" (36). This perspective is quite different from my more phenomenological approach, comparing the grouping of the musical surface to the segmentation of experience.
38 Ease of segmentation might also relate to individual aesthetic preferences: musical processing would be facilitated for a person who could easily segment a composition into meaningful events. A similar correlation between processing fluency and aesthetic preference in visual art has been demonstrated (Reber, Winkielman, and Schwarz 1998).

Both top-down knowledge structures and bottom-up sensory characteristics influence segmentation (Zacks 2004). Knowledge structures, as defined by Zacks, are representations that "capture recurring patterns of covariation" (2004, 980), referring to transitional probabilities gleaned through statistical learning. Other authors refer to knowledge structures using different terms, such as event schemata (Hard, Tversky, and Lang 2006) and situation models (Zwaan and Radvansky 1998). Knowledge structures guide the top-down processing of events, especially for perceived goals and intentions. Research has shown that top-down knowledge of goals assists in segmentation, and event boundaries coincide with changes in perceived intention (Baldwin and Baird 2001; Baldwin et al. 2008; Hard, Tversky, and Lang 2006; Zacks 2004). When the actors' goals are unpredictable, viewers tend to segment an activity into smaller units, suggesting that knowledge structures particularly assist in creating larger groups.
In music, stylistic competency might also assist in creating larger groups. For instance, the first section of Schumann's "Widmung" (discussed earlier) contains a single phrase, although surface discontinuities suggest several subphrases. A listener less familiar with Schumann's style may not be able to perceive harmonic intention towards the tonic goal over these breaks in the sound. This is readily apparent in the undergraduate classroom, where instructors have to teach music students not to be tricked by surface discontinuities and to focus on the longer, harmonically driven phrase.
In music, knowledge structures are revealed through musical expectations, both implicit and explicit. In light of the discussion from the previous chapter, knowledge structures can take the form of generalized schemata (e.g., tonality), learned knowledge (e.g., formal archetypes), and piece-specific knowledge. While an expectation for continuity can be a knowledge structure, especially when used consciously or when it creates dynamic expectations within a composition, the deepest expectations for continuity merely rely on sonic disjunctions and correspond with the bottom-up changes in sensory characteristics.39

39 Refer to Figure 3.1 for an illustration of the different types of schematic expectations.

Segmentation created by bottom-up processing relies on changes in movement features or spatial location to produce event boundaries (Magliano, Miller, and Zwaan 2001). In the visual realm, when a person or object changes direction or speed, a boundary is perceived, and these
changes correlate with increased activity in brain regions involved with motion processing
(Zacks, Braver, et al. 2001). Even segmentation based on surface features alone can create a
nested hierarchical structure, where event boundaries with a large number of feature changes are
perceived as more important than those with fewer changes (Newtson, Engquist, and Bois 1977).
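Newtson, Engquist, and Bois's finding, that junctures with many simultaneous feature changes outrank those with few, amounts to a simple count over per-event feature descriptions. The sketch below illustrates the idea; the feature names and event values are invented for illustration, not drawn from their study.

```python
def boundary_salience(events):
    """Score each juncture between adjacent events by how many surface
    features change across it; more simultaneous changes suggest a
    higher-level perceived boundary."""
    scores = []
    for prev, curr in zip(events, events[1:]):
        scores.append(sum(1 for f in prev if prev[f] != curr[f]))
    return scores

# Hypothetical musical events described by surface features.
events = [
    {"register": "mid",  "dynamic": "p", "articulation": "legato"},
    {"register": "mid",  "dynamic": "p", "articulation": "legato"},
    {"register": "high", "dynamic": "p", "articulation": "legato"},    # 1 change
    {"register": "low",  "dynamic": "f", "articulation": "staccato"},  # 3 changes
]
print(boundary_salience(events))  # [0, 1, 3]
```

Ranking junctures by these counts yields exactly the nested structure described above: the three-feature change marks a higher-level boundary than the single-feature change, with no boundary perceived where nothing changes.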
Changes in the musical surface equate to these visual discontinuities; for instance, Lerdahl and Jackendoff's (1983) GPRs 2, 3, and 4 (see Table 2.1) are based on sensory characteristics, indicating that changes in attack-point, register, dynamics, articulation, and duration distinguish group boundaries, while more marked changes result in higher-level boundaries. Deliège (1987) has shown that when musicians are asked to segment a short excerpt, most of their divisions correspond to these GPRs. When listening to recorded music, one cannot see physical motion, unlike the visual stimuli used in studies examining event segmentation. However, many descriptions of music employ motion words (for instance, a stepwise ascent or a downward leap), suggesting that, even when motion is not literally present, it is still conceptually understood.40 Research in discourse processing suggests that conceptual changes in sensory characteristics have the same effect on event segmentation as do physical changes (Zacks, Speer, and Reynolds 2009). Of course, the influence of musical motion on segmentation becomes much more complex when combined with a visual stimulus such as a performer, conductor, score, or dancer. While this is outside the scope of my study, it would be interesting to see the extent to which visual input influences auditory segmentation.41
Comparable to the expectation continuum discussed in Chapter 3, in which there was no
clear boundary between cross-modal expectations and musically-derived expectations,
segmentation prompted by knowledge structures and segmentation prompted by sensory
characteristics cannot be sharply distinguished. Zacks and Swallow (2007) also characterize
elements essential to segmentation in terms of a continuum:
Further, we believe that a number of little-studied features, from purely sensory to purely
conceptual, must be important for event segmentation. Toward the sensory end are
features such as sound, lighting, and contact between actors and objects. Toward the
conceptual end are features such as goals and social conventions. In the middle are
features such as sequential statistical structure, that is, the order in which events tend to
occur. (83)

40 Gjerdingen (1994) likens our experience of motion in music to apparent motion in visual studies: a sequence of flashing lights can create the impression that light is moving from one place to another. Others (Clarke 2001) explain our perception of motion in music from an embodied perspective.

41 For instance, observers can acquire both structural and expressive information from musicians' motions (Nusseck and Wanderley 2009).
Other research indicates an interaction between knowledge structures and movement
features (characteristics of the object's movement), supporting four postulates (Zacks 2004,
983):
1. Movement features contribute to the identification of fine event segments.
2. Grouping these fine segments into larger units can be based on aspects of the activity
other than movement features. Observers rely less on movement features as the grain
of encoding becomes larger.
3. Inferences about actors' intentions can affect how and to what extent movement
features drive the identification of event segments.
4. Inferences about actors' intentions can be influenced both by intrinsic features of the
stimulus and by top-down information.
Sensory characteristics (e.g., movement features) contribute more to the segmentation of small
units than to larger units. The extent to which an observer can infer an actor's intention
determines the influence of sensory characteristics on segmentation, and intention itself is
inferred both from top-down information and from sensory input. Usually fine-level events are
described with the actor's motion path, while coarse-level events are described with the actor's
intention. As previously mentioned, intention isn't the only determiner of coarse-level events:
hierarchical event segmentation can result even without an overarching event schema. More
change in motion results in a higher-level segment, which could then be stored as a new
knowledge structure (Deliège 2006; Hard, Tversky, and Lang 2006).
Even in music, knowledge structures can influence the perception of sensory
characteristics and vice versa. Returning once again to music theory pedagogy, one of the main
goals of an aural skills curriculum is to teach students strategies for determining the structure of a
composition. Students who learn to label musical events are more likely to parse the musical
surface in a way that highlights those events. According to the tenets of statistical learning,
repetitions of musical patterns, even atypical patterns, guide the creation of new knowledge
structures. For instance, as a listener is exposed to more instances of a particular melodic
paradigm occurring at the end of a segment (as determined by movement features), the listener
will begin to expect the end of a segment when the melodic paradigm is heard (Eberlein and
Frick 1992; Huron 2006). Segmentation therefore facilitates both learning and memory. Event
Segmentation Theory, outlined in the next section, incorporates these characteristics of event
segmentation into a single cognitive model.

Event Segmentation Theory


According to Kurby and Zacks (2007), the Event Segmentation Theory (EST) proposes
that perceptual systems "spontaneously segment activity into events as a side effect of trying to
anticipate upcoming information" (72). Predictions in this model are formed both by knowledge
structures and by sensory characteristics, as discussed earlier. Because segmentation results
from "the continual anticipation of future events" (77), a person can adaptively encode event
structure from a continuous stream, understand the intention of an actor (hence anticipate the
actors future actions), and select future actions in response to the ongoing event. Correct
segmentation has distinct evolutionary advantages: being able to chunk an interval of time
together as a single event saves on cognitive resources and, as seen earlier, improves
comprehension. Also, being able to group events together hierarchically assists in learning and
problem solving. The way we segment experience reflects the environment in which the human
perceptual system developed, as stated by Kurby and Zacks (2007):
None of this would be true if the structure of the world were not congenial to
segmentation. If sequential dependencies were not predictable, if activity were not
hierarchically organized, there would be no advantage to imposing chunking and
grouping on the stream of behavior. In this regard, as in many others, human perceptual
systems seem to be specialized information-processing devices that are tuned to the
structure of their environment. (78)
EST posits that perceivers form a representation of the eventan event modelin
working memory. These event models capture what is happening now and guide predictions
about upcoming actions. As long as predictions are accurate, the event model is maintained,
integrating the new information, but when prediction errors rise, a boundary is perceived as the
event model is updated. At this moment, the system becomes more sensitive to incoming
information. As a new event model is established, prediction errors fall and the system stabilizes
once again. Periods of stability in this system are then perceived as single events, while periods
of change create perceptual boundaries (Kurby and Zacks 2007; Zacks et al. 2007; Zacks et al.
2009).
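The update cycle just described, in which an event model is maintained while its predictions succeed and reset when prediction error spikes, can be caricatured in a few lines of code. This is only an illustrative sketch: the `predict` function, the error threshold, and the reset rule are my own assumptions, not part of the model as Zacks and colleagues specify it.

```python
# Minimal sketch of the EST update cycle described above (hypothetical
# threshold and predictor; not from Zacks et al. 2007).

def segment_stream(observations, predict, error_threshold=1.0):
    """Return indices at which an event boundary is perceived.

    `predict(model, obs)` is an assumed user-supplied function that
    returns (prediction_error, updated_model) for one observation.
    """
    boundaries = []
    model = None  # current event model (None = not yet established)
    for i, obs in enumerate(observations):
        if model is None:
            model = obs  # open a new event model from fresh sensory input
            continue
        error, model = predict(model, obs)
        if error > error_threshold:
            # transient rise in prediction error: perceive a boundary
            boundaries.append(i)
            model = obs  # reset the model to incoming sensory input
    return boundaries
```

With a toy predictor in which the model is simply the previous value and the error is the absolute change, large jumps in the stream are marked as boundaries while stable stretches are integrated into a single event.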


Figure 4.1 (reproduced from Zacks et al. 2007, 274, Figure 1) is a schematic depiction of
this theory. Sensory inputs include information collected by the peripheral nervous system, such
as visual, auditory, and tactile information. These inputs are transformed by perceptual
processing, which produces "multimodal representations with rich semantic content" (274).
Here objects are identified, motion trajectories are determined, and intentions and goals are
inferred. The ultimate goal of the processing is to make predictions about the future state of
events. Event models bias processing since they provide a stable representation of the current
event (275), and they are only open to new sensory input when they are updated at event
boundaries (discussed in more detail below). While event models are active and accessible
representations of current events, the amount of information held in these models exceeds the
amount of information held in working memory. Zacks and colleagues (2007) suggest that the
mental capacity of the event model is extended by efficiently using previously stored knowledge
structures (275).

Figure 4.1: Schematic Depiction of the Event Segmentation Theory


(Zacks et al. 2007, 274, Figure 1).
The grey arrows show the flow of information while the dashed arrow represents the signal that initiates an updating
of event models. Sensory input only feeds into the event models when the model is updated.

Event schemata represent the knowledge structures in this model. While they can be
understood as "semantic memory representations that capture shared features of previously
encountered events" (Zacks et al. 2007, 275), Hintzman's view of schematic abstraction as
deriving from the summed activation of episodic traces can also be applied here (refer back to
Chapter 3). No matter the perspective, these event schemata contain previously learned
information about "the sequential structure of activities" (Zacks et al. 2007, 275). These
knowledge structures, which represent learned statistical regularities, interact with the current
representation of the event, influencing its shape. In turn, the content in the event model shapes
long-term memory through a learning process. So, in terms of expectations derived from
knowledge structures, when the actor achieves the perceived goal, there is a momentary rise in
prediction error, since the observer cannot predict the actor's future intentions. A similar effect
was noted in the last chapter, where the event following an authentic cadence is less predictable
than the goal tonic chord. The event model is updated as the perceiver discerns the actor's next
goal.
These knowledge structures correspond with longer events, whereas fine segmentation is
usually influenced by changes in motion. Being able to predict the movement of objects and
people around us is essential to being able to interact with the world, so when an actor changes
direction and speed, the perceiver has to form new predictions about where the actor will be next
in order to interact effectively. There are deep-seated expectations for continuity (e.g., in music,
pitches in a certain register are usually followed by notes in that same register), so when there is
an unexpected disruption, the event model is updated, incorporating new sensory input in order
to form a new set of expectations.
Figure 4.2 (from Kurby and Zacks 2007, 73, Figure 1) illustrates a schematic depiction of
the segmentation process, portraying event models as relatively robust representations of the current
event. This perceptual constancy allows the ongoing event to be a single entity despite potential
disruptions in sensory input such as occlusion or distraction (Zacks et al. 2007, 274). The first
panel shows that the event model accurately guides perceptual processing and predictions,
therefore it is not open to new sensory input. Only when the prediction error rises does the
current model become insufficient, at which point it resets and integrates new perceptual
information (Kurby and Zacks 2007). Several studies have corroborated this facet of the model:
observers showed superior long-term memory for information presented at event boundaries,
suggesting greater sensitivity to sensory input at those moments (Baird and Baldwin 2001;
Zacks et al. 2006; Swallow, Zacks, and Abrams 2009).

The model suggested by EST may be easily assimilated into a theory of event
segmentation in music, but with one important difference. As originally conceived, EST
considers perceptional changes of motion and intentionality, not conceptual changes; however,
the predictions made by EST can account for event segmentation in narratives, which rely on
conceptual changes. In discourse comprehension, readers are able to track multiple dimensions at
once. Despite changes in character, location, character goals, causal relationships, and time,
readers are able to follow and understand what is going on in the text. Several researchers,
including Zwaan and Radvansky (1998), have suggested that we use situation models to
comprehend the action in the discourse. When there is a discontinuity in any of these dimensions
(such as a change in location), the reader updates the situation model. According to Zwaan's
(1995) event-indexing model, readers monitor five independent dimensions of the situation
model: time, space, protagonist, causality, and intentionality. A change in any one of these
dimensions prompts an update to the corresponding index in the situation model. Because
updating the model takes time, reading time increases as the number of discontinuous elements
in a narrative increases (Zwaan, Langston, and Graesser 1995).
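The bookkeeping implied by the event-indexing model is simple enough to sketch directly. The five dimension names come from the text; the rule of counting changed indices per clause (as a stand-in for predicted reading-time cost) is my own simplification for illustration.

```python
# Illustrative sketch of event-indexing bookkeeping; the dimensions are
# from Zwaan's model, the counting rule is a simplification.
DIMENSIONS = ("time", "space", "protagonist", "causality", "intentionality")

def count_updates(clauses):
    """Given each clause as a dict over the five dimensions, count how
    many indices of the situation model must be updated at each clause."""
    model = {}
    updates = []
    for clause in clauses:
        changed = [d for d in DIMENSIONS
                   if d in clause and clause[d] != model.get(d)]
        for d in changed:
            model[d] = clause[d]  # update only the discontinuous indices
        updates.append(len(changed))
    return updates
```

On this view, a clause that changes several dimensions at once forces several index updates, which is why reading time rises with the number of discontinuities.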

Figure 4.2: Schematic Depiction of the Segmentation Process as Posited by EST


(Kurby and Zacks 2007, 73, Figure 1)
Further recent research by Zacks, Speer, and Reynolds (2009) suggests that EST both
supports and extends discourse comprehension theories (basically equating the event model and

the situation model). Because the event-indexing model addresses conceptual events, while EST
focuses on live-action events, Zacks and his colleagues designed experiments to discern whether
the predictions made by EST are applicable to conceptual events and whether the predictions
made by the event-indexing model are applicable to live-action events. In their first two
experiments, the authors presented either a set of narratives or a set of short films to two different
groups of participants, who were asked to divide the stimuli into smaller events. For both groups,
the event-indexing model predicted the location of participants' boundaries, and a later study
confirmed that boundaries in narratives corresponded with story elements rated by readers as less
predictable. These findings indicate that both physical cues and conceptual changes enable event
segmentation.
Using a combination of empirical research and computational modeling, Pearce,
Müllensiefen, and Wiggins (2010) illustrate that expectation, based on probabilistic learning, can
also inform the segmentation of musical melodies, further speculating that it can be extended to
all auditory stimuli. They hypothesize that, similar to the mechanisms of segmentation described
by EST, boundaries are perceived "before events for which the unexpectedness of the outcome
and the uncertainty of the prediction are high" (1375). The first principle of segmentation, the
unexpectedness of the outcome, refers to discontinuities in the sound, roughly corresponding
with Lerdahl and Jackendoff's GPRs 2 and 3 (unexpected disruptions in the musical surface).
Their results indicate that a computer model, using only probabilistic learning, can segment a
melody according to their first principle in a way that mirrors the results from expert listeners.
From a phenomenological perspective, this results in a retrospective marking of a boundary. The
second principle, the uncertainty of a prediction, refers to the way preexisting knowledge
structures guide segmentation. Both principles are derived from statistical learning and inform
listener expectation. EST provides a cognitive mechanism that describes how we segment not
only visual or conceptual experience, but musical experience as well.
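The first principle, segmenting where the outcome is unexpected, can be sketched with a toy probabilistic learner. This is a deliberately reduced illustration in the spirit of Pearce, Müllensiefen, and Wiggins (2010), not their actual system: the bigram counts, add-one smoothing, and mean-surprisal peak-picking rule are all my own simplifying assumptions.

```python
# Hedged sketch: mark melodic boundaries where statistical surprisal is
# high. Bigram model and threshold rule are simplifications, not IDyOM.
import math
from collections import defaultdict

def train_bigrams(melodies):
    """Count note-to-note transitions over a corpus of pitch lists."""
    counts = defaultdict(lambda: defaultdict(int))
    for mel in melodies:
        for a, b in zip(mel, mel[1:]):
            counts[a][b] += 1
    return counts

def surprisal(counts, a, b):
    """Unexpectedness of b after a, in bits, with add-one smoothing so
    unseen continuations receive high (but finite) surprisal."""
    total = sum(counts[a].values())
    p = (counts[a][b] + 1) / (total + len(counts) + 1)
    return -math.log2(p)

def boundaries(counts, melody):
    """Place a boundary before notes whose surprisal exceeds the mean."""
    s = [surprisal(counts, a, b) for a, b in zip(melody, melody[1:])]
    mean = sum(s) / len(s)
    return [i + 1 for i, v in enumerate(s) if v > mean]
```

After training on a corpus in which, say, 60→62→64 is routine, a continuation the model has never seen carries high surprisal, so a boundary is marked before it, mirroring the retrospective marking of a boundary described above.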

Event Segmentation Theory and Musical Closure


Because EST is applicable both to live-action events as well as to conceptual events, EST
interacts with an expectation-based model of musical closure. Considering only the auditory
experience of music (setting aside visual input from the score or a performer), according to EST,
boundaries are formed when the event model is updated because of an increased prediction error

in the music. Both sensory input from the musical surface and pre-existing knowledge structures
influence the creation of the event model and the ensuing predictions.
This theory has several implications for music cognition in general and for the perception
of closure in particular. First, EST represents an automatic and unconscious model for
segmentation; it does not require conscious attention. While listeners can focus their attention on
segmenting music, it occurs spontaneously as well, as seen in brain imaging studies. For
instance, Knösche et al. (2005) asked musicians to determine whether a melody they heard
contained any note outside of the key while their brain waves were measured with an EEG. The
authors found that between 500 and 600 ms following the end of a phrase, there was a positive
wave spike (called a closure positive shift) in the electrical output of the brain structures that
guide memory and attention processes.42 The location of this closure positive shift suggests
increased attention following the end of a phrase, not merely resulting from the identification of
phrase boundaries.43 An extension of this study examined the difference between musicians and
non-musicians. Although both groups experienced a closure positive shift following the end of a
tonal phrase, Neuhaus, Knösche, and Friederici (2006) note differences between the musical
features that elicited this effect. This study manipulated the last chord implied by the melody
(either I or V), the length of the last note, and the length of the silence between phrases. Both
groups responded to all three markers, but musicians were more sensitive than non-musicians to
the implied harmony. The authors speculate that musicians, who had more stylistic knowledge of
closure, used top-down knowledge, while non-musicians relied more on sensory features.
Event segmentation occurs on multiple time scales simultaneously, creating a hierarchical
construction of the musical grouping structure. Hierarchical grouping structures are explored in
detail by Lerdahl and Jackendoff (1983), and empirical work by Krumhansl (1996) supports the
perceptual validity of hierarchically construed musical segments. In Krumhansl's study,
participants designated section endings on three different time scales. First, they segmented the
entire first movement of Mozart's Piano Sonata in E♭ Major, K. 282, followed by the first fifteen
measures of the movement, and finally just the first eight measures. As the musical excerpts'
length decreased, participants were told to decrease the grain size of the segmentation, and the
results indicated a high correlation between the responses for the large and small sections.
Boundaries at different segmentation grains imply varying degrees of continuation and closure.
EST posits that we can simultaneously hold event models on multiple time scales, which can
account for the feeling of closure at the level of a phrase even while the listener expects a
continuation of the piece: although a hypothetical event model is updated on the phrase level, a
separate event model could continue on the level of the entire composition.

42 This closure positive shift is similar to the positive shift found at the end of spoken phrases in language studies.

43 This might reflect the gating mechanism in EST. This cognitive control mechanism delegates more processing resources (thereby increasing sensitivity to new input) at boundaries when the event model is updated.
Other studies specifically addressing the question of musical closure, rather than
segmentation in general, usually explore the knowledge structures guiding a listener's experience
of closure. For instance, Rosner and Narmour (1992) played two pairs of chords and asked
participants to rate which pair seemed more closed.44 Unsurprisingly, they found that V-I was
rated as more closed than III-I, IV-I, or VI-I. Most likely due to the brief context, there was no
effect for soprano scale degree and only a weak effect for inversion. Tonic is usually described
as the goal of V; recall from Chapter 3 that this feeling of goal-directedness is an artifact of first-order probabilities. Because a tonic chord can go just about anywhere, a listener has no strong
expectations for the next event, creating a perceptual boundary.
While the findings from probe-tone studies are intended to measure melodic expectancy
and are usually used to support a hierarchical model of tonality, Aarden (2003) suggests that
these studies really examine the perception of closure. Aarden posits that due to the nature of the
design (a retrospective rating of a tone following a musical context), these studies really ask how
well a particular tone would complete that musical unit, reflecting a learned schema of the
distribution of tones as the final note in a melody (ii-iii). From this perspective, probe-tone
studies support the model of closure based on expectations formed through statistical learning.
For instance, because 1̂ is statistically more likely to conclude musical units, listeners rated 1̂ as a
better fit to end the musical context compared to the other scale degrees.

44 Rosner and Narmour described closure to their participants as the degree of conclusiveness or satisfaction of a musical ending. They also described the varying strength of closure by likening it to punctuation: "The most strongly closed progressions says that a piece has finished. This is like the words, THE END, at the conclusion of a story. Less closed chord progressions in music act like a full stop (a period) at the end of a sentence, signaling that one thought is complete and a new one will follow. Still less closed progressions behave like semicolons and tell you that one complete thought will be followed by a closely related one. Just as a writer must use punctuation marks correctly, a composer must get his or her signs of closure right" (390).

Hierarchical knowledge structures can also influence the perception of closure. Joichi
(2006) examines two related issues: (1) the perception of closure in small musical units in
relation to the larger hierarchical structure and (2) the variation of a particular cue's influence on
the perception of closure among different hierarchical levels. In one study, Joichi divided binary-form excerpts into four shorter units (usually evenly dividing the binary into fourths), which
were played either individually or grouped into longer contexts. The most robust finding was a
positive correlation between listeners' ratings of completeness and the length of the excerpt.
With a longer context, participants had more opportunity to anticipate the point at which the
excerpt would conclude, especially if the longer context had a cadential arrival occurring
halfway through. This cadence at the midpoint provided a fairly specific expectation for the
location of the eventual ending.
EST predicts that expectations for schematic knowledge structures, like the ones explored
above, will influence the perception of higher hierarchic levels, while changes in surface events
will dictate boundaries at lower levels. At the moment the schematic structure is completed, there
is a rise in uncertainty for subsequent events, initiating an update of the event model. As
discussed in the previous chapter, the more specific an expectation is for a particular ending, the
greater the change in expectancy levels after that ending occurs, and this amount of change
correlates with the strength of closure. Fulfillment of expectations for endings derived from
knowledge structures results in the feeling of anticipatory and arrival closure.
Changes in the musical surface can lead to an increase in prediction error. However, the
expectation for continuity (and other similarly deep schematic expectations) is very general in
nature and does not regularly lead to a feeling of finality at the end of a segment dictated solely
by changes in the musical surface. In EST, a musical boundary forms when the expectation for
continuity is violated; changes in sensory input initiate an updating of the event model.
Inexperienced listeners rely on these surface discontinuities more than seasoned musicians do
because they have not formed the knowledge structures necessary to segment their experience.
Nevertheless, for all listeners, a greater change in the musical surface will result in the creation
of a higher-level boundary. Retrospective closure is associated with changes in the continuity of
sound.


Experiment Overview
To explore the association between segmentation and expectation in the perception of
musical closure, I conducted a series of studies designed to test three main hypotheses of EST:
(1) musical experience is segmented unconsciously, hierarchically, and consistently among
subjects; (2) stylistic knowledge in the form of learned musical schemata influences a listener's
perception of closure; and (3) boundaries are formed at moments of transient increases in
prediction error.
Experiment 1
Earlier empirical research has shown that listeners segment music consistently and
hierarchically (Deliège 1989; Krumhansl 1996). The purpose of this experiment is to replicate
these findings using the same methodology outlined in Zacks, Speer, and Reynolds (2009).
Listeners will be asked to indicate the end of both fine- and coarse-grained segments while
listening to string quartet movements by either Mozart or Bartók. In accordance with past
research, I hypothesize that listeners will segment the musical stream consistently and
hierarchically.
This study will also examine correlations between a set of musical features and perceived
boundaries. Along with correlations between the musical surface and event boundaries, I will
also see whether factors such as musical expertise or the order in which tasks are completed
apparently influence listeners' perception of event boundaries. For instance, perhaps it is easier
to make decisions about larger boundaries after hearing the movement in its entirety, or
musicians might indicate boundaries at larger formal units more consistently than do
non-musicians. Participants more familiar with Bartók's music, and with twentieth-century music in
general, might mark boundaries more consistently. While this study does not directly ask about
closure, it does establish where listeners perceive boundaries, and it may lend support for
applying EST to music.
Experiment 2
The perception of closure is contingent on a listener's musical expectations, especially
those that anticipate the completion of a musical schema. As explored in previous chapters, these
expectations are formed through musical experiences. Event Segmentation Theory supports a

developmental story for the formation of these knowledge structures and how they can influence
an individuals perception of closure. When confronted with a new style, a listener relies on
changes in the musical surface to segment the composition into smaller events, with bigger
changes resulting in a higher hierarchical boundary. With repeated exposure to the style,
statistical regularities allow the listener to develop expectations for how various hierarchical
levels of the piece should unfold in this style. When events include stylized signs of endings, a
listener becomes able to predict that an ending is about to occur.
This study will test the hypothesis that learned stylistic cues influence our perception of
closure. This study is in two parts. First, participants will listen to string quartet music during a
twelve-minute exposure period and will mark endings within each excerpt by pressing a key
combination on the computer. Participants will be randomly assigned to one of two conditions:
one group will listen to excerpts by Bartók and the other group will listen to excerpts by Mozart.
In the second part of the study, participants will rate the degree of closure for two blocks of short
excerpts, one drawn from the Bartók quartet and the other from the Mozart quartet.
I expect that ratings for cadential paradigms in the Mozart excerpts will be higher than
those for the Bartók excerpts, based on a presumed greater listener familiarity with Mozart's
style. While I do not expect a strong effect, I hope to see that participants exposed to as little as
twelve minutes of Bartók's music will interpret closure in these works differently from the group
of participants who initially listened to Mozart. Even though studies have shown that participants
pick up on statistical regularities in auditory stimuli rather quickly, this study may not produce
robust differences between the conditions because it asks a slightly different question. Other
statistical learning studies ask whether a series of sounds form a grammatical entity based on an
exposure period. However, all of the testing excerpts in this task are grammatical entities in the
style they represent, and participants instead must make an interpretive judgment regarding the
suitability of an excerpt to end a musical unit in that style.
Experiment 3
The first two experiments examine two facets of EST: the segmentation of music and the
influence of learned musical schemata on the perception of closure. This experiment seeks to
support the theoretical claim that the perception of closure stems from being able to predict the
moment of the completion for a schematic unit, followed by a transient increase in prediction

error. In their examination of the segmentation of narratives, Zacks, Speer, and Reynolds (2009)
found that boundaries tended to occur when the activity in the narrative was rated as less
predictable. However, these subjective predictability ratings did not account for all of the event
boundaries, especially those formed through a change in character, so the authors suggested the
need for a more objective measure of prediction performance (323–324).
In this study, instead of asking participants retrospectively to rate the predictability of a
musical segment, I will ask participants to predict the moment at which a musical unit will
conclude. I will then correlate these results with listeners' perceived degree of closure.
Participants in this study will include both musicians and non-musicians, and all musical
excerpts will come from minuet movements of three Mozart string quartets (K. 156, K. 168, and
K. 173). After the prediction task, participants will hear excerpts from the same minuets in one
of two conditions: either in the order in which they occur in the movement or in a random order.
Participants will rate each excerpt's strength of closure on a seven-point scale. Participants in the
ordered condition will see a schematic representation of the movement to help locate each
excerpt within the movement.
In the prediction task, I expect that musicians, who probably are more familiar with
Mozart's compositional style, will be more successful than non-musicians at predicting phrase
endings. Since the minuets used for this study are all in binary form, participants will hear each
section of the movement at least twice, so I anticipate that all participants will make more
accurate predictions the second time through each section. Further, the predictability of a musical
units conclusion may vary with the type of musical ending; for instance, a PAC may be more
predictable than a HC.
According to EST, endings that are better predicted should correlate with a higher rating
of closure. Also with the data from the rating task, I will look for a main effect for condition
(ordered vs. unordered excerpts), which might reveal that formal hierarchy can influence the
perception of closure. If ratings for the same excerpt vary widely between participants in
different conditions, my results might indicate that a schematic understanding of form
contributes to the sense of musical closure. As previously discussed, the strength of closure
informs the hierarchy of a composition; however, this study may instead demonstrate that


schematic knowledge of formal structures affects the listeners perception of the strength of
closure.


CHAPTER 5
EXPERIMENT 1
This first study does not explicitly examine a listener's perception of closure; rather, it
addresses what I consider a prerequisite to this larger issue by looking only at segmentation.
Event Segmentation Theory (EST), as discussed in Chapter 4, provides a model for the
perception of closure that is compatible with previous research in musical expectation and
segmentation. Experiments 1a and 1b only consider segmentation, to see whether listeners
segment music in a manner consistent with this theory. To test this, I adopted an experimental
paradigm previously used in studies to support EST.
The design for Experiment 1 is based on the segmentation task used in Zacks, Speer, and
Reynolds (2009), in which participants either read or listened to several narratives detailing the
everyday activities of a seven-year-old boy and were asked to divide the continuous narrative,
"identifying points at which one meaningful unit of activity ended and another began" (309).
Each participant read or heard every narrative twice, once to indicate the largest unit of
meaningful activity (coarse segmentation) and once to indicate the smallest unit of meaningful
activity (fine segmentation).45 The results indicated that participants segment narratives
hierarchically (with smaller units nested within larger units) and that conceptual changes in the
narratives can predict the presence of a boundary. In the terms of EST, the event model updates
at a conceptual change (like a change in temporal or spatial location), since change is less
predictable than continuity, and at the end of an event defined by pre-existing event models.
In my study, participants segmented two complete string quartet movements using a
similar segmentation task. In Experiment 1a, participants segmented two movements by Béla
Bartók, and in Experiment 1b, participants segmented two movements by Wolfgang Amadeus
Mozart. Consistent with Zacks, Speer, and Reynolds, I hypothesize that listeners will segment
the music hierarchically and that their segmentation will consistently correlate with various
musical features falling into two broad categories: arrival features, marking the end of a musical
segment; and change features, epitomized by Lerdahl and Jackendoff's Grouping Preference
45. The order of tasks was counterbalanced between subjects.


Rules (GPRs) 2 and 3. Further, based on the literature examined in Chapter 4, I predict an
increased response time for the coarse-grained segmentation task compared to the fine-grained
task and that these higher-level boundaries will correlate with an increased number of change
features.

Method
Participants
In both studies, participants were divided into three groups based on their musical
expertise: non-musicians, first-year undergraduate music majors, and graduate/professional
musicians. There were 32 participants in Experiment 1a (14 non-musicians, 9 undergraduate
musicians, 9 graduate musicians) and 33 participants in Experiment 1b (14 non-musicians, 10
undergraduate musicians, 9 graduate musicians).
Stimuli
Two sets of stimuli were used in this study, one from twentieth-century non-tonal
practice and the other from the common-practice tonal idiom. In Experiment 1a, participants
listened to the third and fifth movements from Béla Bartók's String Quartet No. 4; in Experiment
1b, participants listened to the fourth movement from Wolfgang Amadeus Mozart's String
Quartet No. 19 in C major (K. 465) and the second movement from Mozart's String Quartet No.
21 in D major (K. 575). Mozart's music unquestionably exemplifies the common-practice style,
whereas Bartók's music represents just one of many twentieth-century styles. Bartók's style
tends to be relatively accessible to listeners unfamiliar or uncomfortable with non-tonal music
because he incorporates phrase lengths and formal divisions familiar from the common-practice
repertoire, and he usually provides a metrical framework. His String Quartet No. 4 particularly
epitomizes these stylistic characteristics.
One inherent difficulty in using pre-composed pieces of music in an experiment is that
many elements of the stimuli cannot be controlled. For instance, given the experiment's reliance
on existing recordings, the exact length of the movement and tempo could not easily be
manipulated. To compensate, I used strict selection criteria. First, I wanted to use a genre and
instrumental group that is well established in both the common-practice and twentieth-century
repertoires, a criterion met by the string quartet. I only considered short string quartet movements
(lasting under five minutes) that included both a clearly distinguishable melody and also a
strongly articulated formal structure that could be divided into smaller phrases. Finally, I wanted
each study to include one fast movement and one slow movement, so I sought a suitable fast and
slow movement from each composer.
Bartók has the smaller repertoire (six string quartets compared with Mozart's twenty-three), so I
began by selecting two movements by Bartók that fit my criteria before turning to
Mozart's repertoire to find suitable pairs. Bartók's slow third movement from his fourth string
quartet has characteristics of a theme and variation movement, which is then organized into a
larger ternary form. This movement significantly features the cello, which plays the opening
theme and subsequent variations, accompanied by sustained chords in the upper strings for the
first 34 measures. The matching Mozart movement (String Quartet 21, second movement) also
has a clear ternary construction and extensively features the cello (especially in mm. 38–50).
To match the sonata-like thematic construction of the fifth movement from Bartók's
fourth quartet, I used the last movement of Mozart's Quartet No. 19 (the "Dissonance" Quartet),
which is in sonata form.46 Although the two movements have vastly different characters, both
conclude longer works, suggesting that their respective composers considered their degree of
finality appropriately strong for the conclusion of a significant work. Presumably to enhance its
degree of closure, the last movement of Bartók's string quartet harkens back to the first
movement, repeating its ending gesture at a much slower tempo. The last movement of Mozart's
"Dissonance" Quartet features an extensive coda, ending with a recurring motive from the
secondary tonal area (STA). Even though the Mozart movements are not from the same quartet,
the two movements chosen are still suitable companions because both were written later in
Mozart's life (No. 19 was composed in 1785 and No. 21 in 1789). Table 5.1 outlines the basic
characteristics of each movement.
Along with these four movements, I selected two shorter excerpts for practice before data
collection. Participants in Experiment 1a listened to another excerpt from Bartók's fourth string
quartet, the first 49 measures of the first movement (lasting 1:43) as performed by the Emerson
String Quartet; participants in Experiment 1b listened to the third movement of Mozart's String
Quartet No. 2 (K. 155) as performed by the Amadeus String Quartet. Both excerpts introduced
46. In order not to exceed the five-minute time limit, I chose a recording that did not repeat the exposition.


participants to the style of music used in the study and exemplified the grouping of phrases into
clearly differentiated sections.
Table 5.1: Musical Stimuli Characteristics for Experiments 1a and 1b

Movement                 Experiment   Tempo marking      Time (in m:ss)   Performers47             Number of Measures   Meter
Bartók, No. 4, mvmt 3    1a           Non troppo lento   5:12             Emerson String Quartet   71                   4/4
Bartók, No. 4, mvmt 5    1a           Allegro molto      5:05             Emerson String Quartet   392                  2/4
Mozart, No. 19, mvmt 4   1b           Allegro            5:22             Emerson String Quartet   419                  2/4
Mozart, No. 21, mvmt 2   1b           Andante            4:01             Amadeus String Quartet   73                   3/4

Coding Procedure
The most practical way to accommodate variations in subject response time was to
examine only responses made in predetermined time windows in each movement.48 I analyzed
each movement and created two types of windows: Type 1 Windows, coinciding with a
meaningful arrival feature, and Type 2 Windows, determined by changes in the musical surface.
Listeners were free to indicate endings at any point, and I do not mean to suggest that these
windows are the only correct places to respond. The windows occur in locations that I thought
were likely dividing points because of their musical features, allowing me to explore how a set of
chosen features predicts listener responses, which also simplifies the data analysis.
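The window logic described above can be sketched in code. This is a minimal illustration under stated assumptions (the window boundaries, timestamps, and function name are hypothetical, not the analysis software actually used in the study):

```python
# Hypothetical sketch of mapping keypress timestamps to predetermined
# analysis windows. Each window is a (start, end) interval in seconds;
# a response counts toward a window if its timestamp falls inside it.

def assign_responses(windows, keypress_times):
    """Return (hits, misses): hits maps each window index to the list of
    timestamps falling inside that window; misses collects the rest."""
    hits = {i: [] for i in range(len(windows))}
    misses = []
    for t in keypress_times:
        for i, (start, end) in enumerate(windows):
            if start <= t < end:
                hits[i].append(t)
                break
        else:  # no window contained this timestamp
            misses.append(t)
    return hits, misses

# Illustrative data: two windows and three responses, one of which
# falls between windows and is therefore not counted.
windows = [(10.0, 13.5), (42.0, 44.0)]
hits, misses = assign_responses(windows, [11.2, 25.0, 42.9])
```

Under this scheme, a response at 11.2 s is credited to the first window, while the response at 25.0 s falls outside every window and is set aside.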
Type 1 Windows begin with the onset of the beat containing the last note of a segment
and continue on until the beginning of the next musical segment, whether it is a section, phrase,
or subphrase. It would have been interesting to see if participants responded immediately to the
end of a formal unit or if they instead waited until the beginning of the next formal unit to make
a decision, but because I was unable to determine the length of time between the listener's
47. The Bartók recordings are performed by the Emerson String Quartet on the CD Bartók: The String
Quartets (1988). Mozart's String Quartet No. 19 is also performed by the Emerson String Quartet in their 2005
album Mozart String Quartets K. 465 "Dissonance," 458 "The Hunt" & 421. Unfortunately, the Emerson Quartet
did not record the slow movement from Mozart's String Quartet No. 21, so the recording used here was performed
by the Amadeus String Quartet in their 1988 recording Mozart: The String Quartets.
48. Alternatively, I could have followed Krumhansl (1996), who smoothed out responses over a two-beat
window in her data analysis of a similar data set. Because I am examining more features than Krumhansl and I am
not working from a MIDI source, using predetermined windows is a better option.


perception of an ending and his/her subsequent response, I included both the ending and
subsequent beginning in a single window. Type 2 Windows do not correspond with an ending,
but instead occur at some sort of change in the musical surface, as defined by the "change
features" described below. I created these windows to begin with the onset of the beat before the
change occurs and to continue on for one to five beats, depending on the tempo (some windows
are shorter to avoid overlapping with another window).
Window beginnings always coincide with the beginning of the beat, even when the last
melodic note of the phrase begins off the beat (for instance, when Mozart includes a suspension).
Because listeners might conceivably respond to the harmonic arrival as opposed to the melodic
resolution, Type 1 Windows begin at the initiation of the goal harmony. When the melody is
presented in canon, listeners might respond to the melodic ending in the leading voice, so in
these cases the window begins with the last note of the leading voice. In all such cases, the
imitation was temporally close, and the ending of the following voice fell within the same
window.
Windows exist in various sizes, both within a movement and between movements. Table
5.2 presents the number of windows found in each piece as well as the average duration of these
windows (measured both in seconds and in beats). This variation in window length will not
affect the data analysis. While the windows are useful for identifying which features are present,
they are also useful for determining the extent to which the data is hierarchically constructed, as
well as for measuring the consistency of the responses between listenings.
Table 5.2: Window Construction in Each Movement

Movement    Number of windows   Average duration of windows (high/low)   Average number of beats in each window* (high/low)
Bartók 3    43                  3.53 s (13.7 s / 1.39 s)                 2.77 beats (10 / 2)
Bartók 5    94                  1.87 s (5.2 s / .69 s)                   4.76 beats (12 / 2)
Mozart 19   77                  1.61 s (3.11 s / .60 s)                  4.01 beats (7 / 2)
Mozart 21   40                  2.73 s (5.14 s / .92 s)                  2.65 beats (5 / 1)

*not including the last window in each movement, which lasted until all the sound faded out

From these movements, I chose a set of musical features that could influence listener
segmentation and catalogued the presence of these features in each window. There are two
categories of features: features that define musical endings (arrival features) and features
corresponding to a change in the musical surface (change features). Each composer has a
different set of features, defined in Tables 5.3 and 5.4.49 Once defined, most of the musical
features can simply be catalogued from the music, but some rely on analytic interpretation.
Table 5.3: Arrival and Change Features in Bartók

Arrival Features:
- Intervallic Direction. Descent; Ascent: Melodic line has at least a two-note descending or ascending figure.
- Intervallic Approach. -1/-2; -3/-4; -5/-7; +1/+2: The distance in semitones to the final melodic pitch of a segment, measured from the previous melodic pitch.
- Duration Change. Compared to the preceding melodic sound, the last note of the segment is longer or shorter.
- Cadences. Falling fourth (4th); Falling third (3rd); Low-high Chord (LH); Single Chord (1); Double Chord (2): Defined by common ending gestures in these two movements.50

Change Features:
- Silence. Complete Silence; Melodic Silence; Non-Melodic Silence: Silence in the entire texture, in just the melody, or in at least one non-melodic instrument.
- Orchestration Changes. New Instrument; New Melodic Instrument: A new instrument joins the texture or a new instrument performs the melody.
- Other Changes. Register Change: The melodic line leaps up or down an octave. Dynamic Change: The melody is performed louder or softer. Ostinato Change: The underlying ostinato changes in pitch content or rhythmic figuration.

Most of the arrival features are pitch-centered structural features: the approach to the last
melodic note, the scale-degree of the last melodic note and its harmonic support (Mozart only),
and the presence of a cadential paradigm (defined by the repertoire). Duration change is included
in this category (as opposed to the change features) because a change in duration can punctuate
the end of a segment (so-called "durational closure").51 As stated in Chapter 3, listeners have
learned through previous experience with music which specific features define the end of a
musical unit, and when an expectation for an ending gesture is fulfilled, the listener experiences
anticipatory or arrival closure. While this study does not explicitly ask about closure, an
expectation-based view of closure would suggest that listeners more familiar with a particular
49. These lists do not represent an exhaustive inventory of features that could influence the responses to a
musical segmentation task.
50. The labels in parentheses are used throughout this chapter to designate specific cadential gestures in the
Bartók. See the discussion regarding Examples 5.4–5.10 for more detail about these cadences.
51. See Narmour (1990) and Joichi (2006).


repertoire, or even the typical cadential gestures of a style, will consistently respond to these
features. These end-defining features roughly correspond with a goal, such as 1̂, the tonic triad,
or the conclusion of a cadential gesture. EST predicts that our understanding of "goals and
intentions" helps us segment life experience on a coarser grain. Musical "goals," which are merely
a metaphor resulting from a misattribution of the positive emotions experienced when a listener
makes a correct prediction, still have perceptual salience.52 Since determining the goal of a
musical unit is highly subjective, especially in the Bartók movements, which do not conform to a
widely shared syntax, these end-defining features as a group do not necessarily signify goals, but
a listener could interpret some of them as goals.
Table 5.4: Arrival and Change Features in Mozart

Arrival Features:
- Scale-Degree. 1̂; 3̂; 5̂; and 2̂ or 7̂: Scale degree of the last melodic note of the musical segment as defined by the local tonal context.
- Harmony/Harmonic Progression. I; V; V7: Last harmony of a musical segment as defined by the local tonal context. V-I; x-V: Motion into the last harmony of a musical segment as defined by the local tonal context.
- Intervallic Direction. Descent; Ascent: Melodic line has at least a two-note descending or ascending figure into the final note of a segment. Leading Tone Ascent to Tonic.
- Steps/Embellished Steps. Step Descent; Step Ascent: Melodic line has three or more notes descending or ascending by diatonic step. Embellished Step Descent; Embellished Step Ascent: Melodic line has three or more notes descending or ascending by diatonic step with surface embellishment.
- Duration Change. Compared to the preceding melodic sound, the last note of the segment is longer or shorter.
- Cadences. PAC; IAC; HC; Evaded Cadence: Defined by tonal cadential paradigms.

Change Features:
- Silence. Complete Silence; Melodic Silence; Non-Melodic Silence: Silence in the entire texture, in just the melody, or in at least one non-melodic instrument.
- Orchestration Changes. New Instrument; New Melodic Instrument: A new instrument joins the texture or a new instrument performs the melody.
- Other Changes. Register Change: The melodic line leaps up or down an octave. Dynamic Change: The melody is performed louder or softer.

52. As explored in Chapter 3, Huron (2006) suggests that the feeling of finality or repose associated with
closure is an artifact of correctly anticipated endings. These expectations result from learned transitional (first-order)
probabilities.


In both Mozart movements, the local tonal area (not the overall key of the movement)
determines scale-degree designations and harmonic and cadential labels. For instance, in the
exposition from the C-major "Dissonance" Quartet, the STA turns from G major (V) to
tonicize E♭ major (♭VI in the context of G major); see Example 5.1. Even though there is no
cadence in E♭, the I-V-I progression in that key clearly implies E♭ as a local tonic in this short
passage. Therefore, the melodic G4 that ends the opening subphrase of the longer phrase is
interpreted as 3̂.

Example 5.1: Mozart, String Quartet No. 19, fourth movement, mm. 89–93
The Type 1 Window is annotated on the score with a solid box.

In both the Mozart and the Bartók analysis, the directional approach into the last melodic
note of a musical unit is easily catalogued from the musical surface: either a descent or an ascent.
In the case of Example 5.1, there is a melodic descent into 3̂ from the preceding B♭. Specific
ordered pitch intervals in the Bartók movements are also determined from the musical surface.
These particular intervals were chosen because of their prevalence at endings in both movements
and roughly correspond to a downward leap, a smaller downward skip, and a stepwise ascent or
descent.
Step progressions in the Mozart analysis only consider a surface stepwise ascent or
descent, but often a mid-range step progression is decorated with embellishing tones. After
removing these tones, if the resulting melodic line moves three diatonic steps up or down in its
approach to the last note of a musical segment, then it is classified as an embellished step
progression. The two annotated examples below show cadential arrivals from the fourth
movement of Mozart's String Quartet No. 19. Example 5.2 (mm. 67–70) shows an approach to a
PAC in G major, which arrives on the downbeat of m. 69. The annotations highlight members of
the underlying step-progression (which is embellished by a series of escape tones) by circling
notes involved in the stepwise descent. The final note of the phrase is approached from above,
but does not appear to include three diatonic steps leading to the cadence until a layer of
embellishment is removed, so this phrase ending only has a melodic descent and an embellished
step descent. In addition to both of those features, Example 5.3 (mm. 76–78) also has a stepwise
descent as the sixteenth notes cascade down to the cadential G4 in the first beat in m. 77. In this
case, the embellished descent connects the B4 and A4 in m. 76 with the final G4 of the phrase.

Example 5.2: Mozart, String Quartet No. 19, fourth movement, mm. 67–70
The Type 1 Window is annotated on the score with a solid box.


Example 5.3: Mozart, String Quartet No. 19, fourth movement, mm. 76–78
The Type 1 Window is annotated on the score with a solid box.

Cadential gestures vary between repertoires. In the Mozart movements, cadences are
defined by standard harmonic and melodic paradigms. A perfect authentic cadence (PAC) is
narrowly defined as a root-position V-I progression ending with 1̂ in the melody, which is
contrasted with the more broadly defined imperfect authentic cadence (IAC): a cadential V-I
progression in which either chord (or both chords) may not be in root position or, more likely,
the melody does not conclude on 1̂. A half cadence (HC) ends a formal unit with a V chord. An
evaded cadence is not technically a cadence; rather, it is the denial of cadential expectation. Here
this term encompasses both the deceptive cadence (typically a V-vi progression) and a weakened
authentic cadence. An evaded PAC is illustrated in Example 5.4, where Mozart denies a perfect
authentic arrival in m. 16. Instead of landing conclusively on a root-position tonic harmony, the
bass voice slips to a I6 chord, the cello revisits the melody from mm. 13–14, and the top voice
takes up an accompanimental texture. The cadential expectation set up by the dominant chord in
m. 15 is finally resolved in m. 19.

Example 5.4: Mozart, String Quartet No. 21, second movement, mm. 15–20
The Type 1 Windows are annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.

To clarify, the presence of these harmonic and melodic paradigms does not necessarily
signify a cadence. I agree with Caplin's (2004) definition of cadence in the Classical style, which
describes cadence as a syntactic ending to mid-level formal units. In Caplin's words, a cadence
"must end something" (56), and that something is usually a phrase. However, the moment of
cadential articulation may or may not coincide with the conclusion of a phrase, an issue explored
in more detail below.
The cadential gestures in the Bartók movements are not representative of a wider
twentieth-century style or even of cadences in Bartók's own oeuvre. So, in this context, a
cadence is a movement-specific gesture that concludes a mid-level formal unit. In the third
movement, a descending fourth gesture concludes many of the variations. This melodic figure is
usually presented with a long-short durational pattern; see m. 21 in Example 5.5 (the phrase ends
on the third beat of m. 21). A lesser-used cadence in this movement is a descending minor third
(illustrated in Example 5.6), which is used extensively in the fifth movement (Example 5.7).
Also in the fifth movement, Bartók uses a multi-voiced chord in all the instruments as a
concluding gesture, and this assumes several guises throughout the movement. At times it is
presented as a single chord, as in the cadential arrival in m. 332 (Example 5.8). More often, a
held lower note precedes the single chord. This lower note is usually presented in unison, for
instance in mm. 280–281 (not serving a cadential function in this passage), but it also takes other
forms, as in m. 75 (see Examples 5.9 and 5.10). Another variation of this cadential type is
articulating the chord twice following a long held note (mm. 283–284, also in Example 5.10).


String Quartet No. 4 by Béla Bartók. © Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed. Reprinted by Permission.

Example 5.5: Bartók, String Quartet No. 4, third movement, mm. 20–23 (Falling fourth: 4th)
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.




Example 5.6: Bartók, String Quartet No. 4, third movement, mm. 40–41 (Falling third: 3rd)
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.



Example 5.7: Bartók, String Quartet No. 4, fifth movement, mm. 235–239 (Falling third: 3rd)
The Type 1 Window is annotated on the score with a solid box.




Example 5.8: Bartók, String Quartet No. 4, fifth movement, mm. 330–332 (Single chord: 1)
The Type 1 Window is annotated on the score with a solid box.



Example 5.9: Bartók, String Quartet No. 4, fifth movement, mm. 74–76
(Single chord preceded by lower dyad: LH-1)
The Type 1 Window is annotated on the score with a solid box.




Example 5.10: Bartók, String Quartet No. 4, fifth movement, mm. 279–284
(Single chord preceded by single pitch class: LH-1, not a cadence here;
Double chord preceded by single pitch class: LH-2)
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Window is annotated on the score with a dashed box.

For duration changes, the notated rhythmic value of the last note of a segment had to be
longer or shorter than the note immediately preceding it. A similar procedure was used to
determine a change in register or dynamics, where the feature is present if there is a notated
octave leap or change in dynamics. For practical reasons I did not interpret either feature,
determining both through a score-based analysis. Although the leap to the F♯ following the
cadence in Example 5.3 lands on an embellishing tone that resolves upward to the G, it is not
interpreted as a register change because on the surface of the music it is a major seventh leap, not
an octave leap. Dynamic change was determined through a score-based analysis, relying only on
the composer's written directions for dynamics, which the performers conveyed faithfully. While
there are varying degrees of change for all three features, I decided to treat them as binary
features, indicating only whether such a change was present. The last two features, register and
dynamic change, usually don't correspond to a musical ending; instead, they belong to the
second category of features: ones that contribute to an acoustic change.
Along with register and dynamic change, the other features (silence, texture change, and
orchestration change) indicate some change in the musical surface and can possibly elicit
retrospective closure by indicating a new beginning or the space between an ending and a
subsequent new beginning. Even though listeners were instructed in this study to indicate
musical endings, they might not realize an ending had occurred prior to the onset of a new
beginning. EST, and more specifically the segmentation study by Zacks, Speer, and Reynolds
(2009), suggests that an increased number of changes in a stimulus will correlate with an
increased likelihood of segmentation, especially on a coarser grain of segmentation.
The absence of sound is one of these acoustic changes that could influence the
segmentation task. I distinguish complete silence from melodic silence and from non-melodic
silence, although certainly these categories are interrelated. Both melodic and non-melodic
silence occur in moments of complete silence, so I reserve these terms for instances in which not
all instruments are silent. Referring back to Examples 5.2 and 5.5, complete silence occurs
immediately following the cadence (the black line in the Bartók example indicates a caesura),
while only melodic silence occurs after the PAC in Example 5.4. Melodic silence can occur
simultaneously with non-melodic silence, though. Following the PACs in Examples 5.3 and 5.4,
at least one non-melodic instrument temporarily drops out, thinning the texture. In contrast to
silence, the addition of an instrument thickens the texture. This change can coincide with a
change in orchestration, where the new instrument becomes a melodic force (as in Example 5.4,
m. 16).
For each window, I recorded whether a particular feature was present. Arrival features
always occur at the beginning of a window coinciding with the last note of a musical segment
(Type 1 Window) or the last note before a change (Type 2 Window). Change features occur later
in the window; in a Type 1 Window they follow the ending and coincide with the new beginning,
while they occur before the second beat is completed in a Type 2 Window.
The presence of a musical ending was determined through my own analysis, which
reflects a more complex interaction between the various arrival features (going beyond merely
cataloguing their presence), and my analysis also represents a possible well-formed hierarchical
grouping structure for each movement. I analyzed each movement for the location of subphrase,
phrase, and section endings, which determined the placement of Type 1 Windows.53 Since I am
admittedly bringing my own musical experience and bias into this study, I will briefly outline
53. The presence of these formal endings was also coded with the other features.


how I created a three-level grouping structure by determining the ends of subphrases, phrases,
and sections for each movement.
Because I decided that all grouping structures must be well-formed, as defined by
Lerdahl and Jackendoff (1983), and I was working with a limited vocabulary for data analysis
purposes, some of my analytic designations use these terms (especially "subphrase") in non-traditional ways.54 While the hierarchical relationships between sections, phrases, and subphrases
remain constant between composers, some of the defining features of these units vary. Also,
determining the type of hierarchical ending represents my own analytic interpretations (even
more so than evaluating the features already mentioned), and does not represent an objective
measure of the phrase structure.
For the Mozart compositions, my definition of a phrase conforms to current analytic
understanding of this term: a formal musical unit consisting of a beginning, middle, and end,
most of the time concluding with a cadential gesture. Following Caplin (2004), who disentangles
phrases and cadences (allowing for phrases to exist without cadential punctuation), a phrase is
not necessarily completed immediately following a cadential gesture, for a phrase also
encompasses any phrase extensions that follow the cadence. I use "subphrase" to describe formal
units smaller than a phrase, and in order to have a well-formed hierarchical analysis, a phrase
must be divided into subphrases either completely or not at all. I use this designation for both
parts of the presentation as well as for the continuation of a sentential formal structure, for legs in
a sequence, for the material between the cadential arrival and the end of a phrase (i.e., an
external phrase extension), and for introductory material preceding the beginning of a phrase
(i.e., a prefix), although the "subphrase" label might not be completely apt in every case. While I
could have expanded my analytic vocabulary, acknowledging each feature individually, grouping
these features together under "subphrase" simplifies the data analysis. A "section" describes
formal units larger than a phrase. Not every unit larger than a phrase received the designation of
a section; instead, sections unite areas of the movement that share the same formal function, the
same key, and related melodic ideas.
Formal designations in Bartók are more open to interpretation since there is not a widely
agreed-upon definition of "phrase" in this repertoire. Given that cadential arrival, as narrowly
54. A list of Lerdahl and Jackendoff's well-formedness rules can be found in Chapter 4, Table 4.1.


understood from common-practice style, is absent, I decided to analyze these works using a top-down approach, starting with large sections. I first divided each movement into large sections,
grouping together music that is thematically related, shares the same pitch-collection, and
implies the same formal function. I then divided the sections into phrases, units that seemed to
have a beginning, middle, and end. In many cases these units ended with a shared musical
gesture that could arguably be described as cadential because of its prevalence at endings.
Finally, if there seemed to be any internal divisions in the phrase, I then further divided the
phrases into subphrases. As in the Mozart analysis, "subphrase" was also used to describe
music that functions as a prefix or suffix.
Participant Procedure
After giving informed consent, participants were assigned to one of two conditions,
which determined the starting task. In order to differentiate between the coarse and fine
segmentation tasks while avoiding unfamiliar technical vocabulary, I described the tasks to all
participants using a linguistic analogy. The directions stated:
In language, sentences group together to form paragraphs. The same is true in music,
where smaller sentence-like phrases are combined to form larger paragraph-like sections.
In this task, you will hear the same piece of music four times. The first time, you will
press the SPACEBAR every time you hear the end of a PARAGRAPH-LIKE SECTION.
Because you may change your mind about the location of the boundaries, you will repeat
this activity in the second listening. In the third and fourth listenings, you will indicate
the end of sentence-like phrases.55
After the participants read the instructions and were given the opportunity to ask
questions, they began with a practice task on a short excerpt before segmenting the actual
stimuli. Participants in Experiment 1a listened to the first 49 measures of the first movement of
Bartók's String Quartet No. 4, and participants in Experiment 1b listened to the third movement
of Mozart's String Quartet No. 2 (K. 155). The first time through, they performed the
segmentation task as dictated by their condition. If participants performed the task in a manner
that demonstrated understanding of the instructions (designating between 8 and 15 fine divisions
or between 3 and 7 coarse divisions), they continued on to segment the practice excerpt again
using the other grain of division. Subjects who did not achieve the needed number of responses
received feedback and additional instruction before performing the same segmentation task
again. After the minimum requirements were met in both tasks, subjects were given the
opportunity to ask any additional questions before moving on to the actual test.

55. This order was changed in the other condition.
While listening to the entire movement, participants in the coarse segmentation condition
first indicated event boundaries delineating groups of phrases and formal sections, while
participants in the fine segmentation group first indicated boundaries for shorter events, such as a
phrase or subphrase. Because segmenting music in real time could be a difficult task, the
participants immediately repeated the task before switching conditions and performing the other
task on the same movement. Thus, all participants listened to each movement four times,
performing both the coarse and fine segmentation tasks twice on each movement. Everyone
listened to both movements through headphones and indicated event boundaries by pressing the
space bar on a computer, which recorded the times for these key presses. Participants in
Experiment 1a listened to the third and fifth movements of Bartók's String Quartet No. 4, while
participants in Experiment 1b listened to the fourth movement of Mozart's String Quartet No. 19
(K. 465) and the second movement of Mozart's String Quartet No. 21 (K. 575).56 Following this
task, participants completed a questionnaire documenting musical experience and familiarity
with the compositions.

Results
Each subject had four trials with each movement, which I will identify as Fine 1, Fine 2,
Coarse 1, and Coarse 2 (the number indicates whether it was the first or second time the
participant performed that particular segmentation task on the given movement). Since there is
no limit to the number of responses an individual could make, and since I cannot reliably
determine listener response time (i.e., the time between a given feature and the key press), the
data analysis examines only presses that occurred within the predetermined windows. The dependent variable
is a binary variable indicating whether the participant responded within a particular window
(scored as 1) or not (scored as 0).
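The window-scoring step can be sketched in a few lines of Python; the window boundaries and key-press times below are hypothetical, invented only to illustrate the coding rule.

```python
# Minimal sketch of the coding rule (hypothetical numbers): a window scores 1
# if any key press falls inside it and 0 otherwise; presses that fall outside
# every window are simply discarded.
windows = [(0.0, 2.0), (5.0, 7.5), (11.0, 13.0)]  # (start, end) times in seconds
presses = [1.4, 6.1, 20.3]                         # one participant's key presses

scores = [int(any(start <= t <= end for t in presses))
          for (start, end) in windows]
print(scores)  # [1, 1, 0] -- the press at 20.3 lands in no window
```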
56. The order in which the movements were heard was counterbalanced between participants.

I used two different types of mixed-models regressions in this analysis section. Both
regression types take into account the fact that individual participants may respond differently
during the task and allow for the assessment of variables such as whether the participant was a
musician or a graduate student, along with the start tempo (fast or slow) and start segmentation
(fine or coarse) for each participant. Most analyses used a mixed logit model (i.e., a
mixed-models logistic regression) to analyze the binary response variable. This regression is used to
predict the odds of a participant responding to a feature of the stimulus (or any other independent
variable), and this information is conveyed by the odds ratio (OR) value. Odds ratios of 1.0
indicate that the odds of a response are the same whether a specific feature or variable is present
or not. Odds ratios greater than 1.0 indicate that the odds of a response increase when the
feature is present, while odds ratios less than 1.0 indicate that the odds of a response decrease
when the feature is present. While odds ratios can't be less than zero, there is no upper bound for
these ratios. The other type of mixed-models regression was used to analyze continuous
dependent variables, such as response time. This regression does not produce an odds ratio, but
rather a series of coefficients showing the weight of each variable on the outcome as predicted by
the regression equation.
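As a quick illustration of how an odds ratio behaves, with made-up response rates rather than the study's data:

```python
# Illustrative only: an odds ratio compares the odds of responding when a
# feature is present with the odds when it is absent.
def odds(p):
    return p / (1 - p)

p_absent, p_present = 0.20, 0.50  # hypothetical response probabilities
odds_ratio = odds(p_present) / odds(p_absent)
print(odds_ratio)  # 4.0 -- the odds of a response quadruple when the feature is present
```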
For this study, only results significant at p < 0.05 are included; indeed,
most of the discussion will highlight results significant at p < 0.02 to avoid over-interpreting
spurious results. Analysis of interactions will focus on the apparent influence of musical training
(specifically, whether the subjects were non-music majors, music majors, or post-graduate
musicians). For interactions significant at p < 0.02, I ran an ANOVA to determine the direction
of the interaction. In these interactions, I usually compare two groups of participants, one with a
higher level of musical training to one with a lower level of training. When I compare musicians
to non-musicians, the "musicians" group includes both graduate and undergraduate musicians,
but when I compare graduates to undergraduates, the undergraduate group includes both the
undergraduate musicians and the non-musicians (all of whom were undergraduates). I labeled the
ANOVA tables throughout the chapter with the headings "Less Musical Training" and "More
Musical Training" to distinguish between these types of groups. I interpreted the direction of the
interaction by comparing the change between the means for instances when a particular feature is
present and instances when it is not for each subject group.
For each composer, I examine how well the Fine 1 and Coarse 1 conditions predict the
Fine 2 and Coarse 2 responses respectively (i.e., within-subject consistency) and how well the
Coarse 1 and Coarse 2 responses predict the Fine 1 and Fine 2 responses (i.e., nested lower
levels). For each composer, I also observe the influence of tempo and segmentation task on
latency (i.e., the delay between the beginning of a window and the subjects response). Then, for
each individual movement, I use a series of mixed logit regressions to explore how well the
coded arrival features, change features, and formal endings outlined earlier in this chapter predict
the responses. These data cannot conclusively indicate whether subjects are responding to these
musical features; rather, they represent the probability of a response given the presence of a set
of features.
General Results
To get an overall picture of the responses in these movements, Figures A.1 through A.4
in Appendix A tally the total number of responses associated with each beat. The red line
indicates responses in the Fine 2 trial, while the dashed purple line shows the Coarse 2 trial. The
distinct peaks and valleys in all four movements imply that listeners were responding to musical
features consistently, rather than just pressing the spacebar in a random manner. For clarity, I
have labeled each peak with the measure number and the beat on which it is located.
Notice that the dashed-purple line peaks (coarse segmentation) tend to match up with the red
peaks, suggesting a nested hierarchical structure. For participants segmenting the Mozart
movements (A.3 and A.4), the peaks and valleys are more sharply articulated than they are for
the Bartók movements, suggesting more consensus among these participants. All participants
tend to take more time to indicate coarse boundaries; generally these boundaries occur slightly
later than do the boundaries in the fine condition. Overall, as expected, there are far fewer coarse
segmentation responses than fine segmentation responses.

Table 5.5: Total Number of Responses and Percentage Used in Data Analysis (Bartók)

Trial | Subject Group | Total responses | Average responses | % of responses in a window

Bartók, No. 4, Mvmt. 3
Fine 1   | Non-musicians  | 420  | 30.00 | 64.76%
Fine 1   | Undergraduates | 186  | 20.67 | 78.49%
Fine 1   | Graduates      | 109  | 12.11 | 95.41%
Fine 1   | Total          | 715  | 22.34 | 73.01%
Fine 2   | Non-musicians  | 369  | 26.36 | 65.58%
Fine 2   | Undergraduates | 192  | 21.33 | 71.35%
Fine 2   | Graduates      | 104  | 11.56 | 93.27%
Fine 2   | Total          | 665  | 20.78 | 71.28%
Coarse 1 | Non-musicians  | 111  | 7.93  | 88.29%
Coarse 1 | Undergraduates | 62   | 6.89  | 96.77%
Coarse 1 | Graduates      | 48   | 5.33  | 93.75%
Coarse 1 | Total          | 221  | 6.91  | 91.86%
Coarse 2 | Non-musicians  | 96   | 6.86  | 93.75%
Coarse 2 | Undergraduates | 50   | 5.56  | 96.00%
Coarse 2 | Graduates      | 44   | 4.89  | 100.00%
Coarse 2 | Total          | 190  | 5.94  | 95.79%
Movement total            | 1791 | 55.97 | 77.22%

Bartók, No. 4, Mvmt. 5
Fine 1   | Non-musicians  | 453  | 32.36 | 83.00%
Fine 1   | Undergraduates | 256  | 28.44 | 83.98%
Fine 1   | Graduates      | 252  | 28.00 | 88.49%
Fine 1   | Total          | 961  | 30.03 | 84.70%
Fine 2   | Non-musicians  | 502  | 35.86 | 79.48%
Fine 2   | Undergraduates | 337  | 37.44 | 81.90%
Fine 2   | Graduates      | 278  | 30.89 | 88.13%
Fine 2   | Total          | 1117 | 34.91 | 82.36%
Coarse 1 | Non-musicians  | 146  | 10.43 | 82.88%
Coarse 1 | Undergraduates | 92   | 10.22 | 84.78%
Coarse 1 | Graduates      | 92   | 10.22 | 89.13%
Coarse 1 | Total          | 330  | 10.31 | 85.15%
Coarse 2 | Non-musicians  | 121  | 8.64  | 85.95%
Coarse 2 | Undergraduates | 71   | 7.89  | 85.92%
Coarse 2 | Graduates      | 76   | 8.44  | 80.26%
Coarse 2 | Total          | 268  | 8.38  | 84.33%
Movement total            | 2676 | 83.63 | 83.74%

While the figures in Appendix A show every response made in the Fine 2 and Coarse 2
trials, I am only considering responses that fell inside one of the predetermined windows for data
analysis, discarding responses not meeting this requirement. Tables 5.5 and 5.6 show the total
number of responses in each trial, divided by subject group, and the percentage of those
responses that fell into a window. A high number of responses was retained in all four
movements. These responses indicate a general trend: as musical expertise increases, participants
make fewer responses during the segmentation task, and, a majority of the time, greater musical
expertise also correlates with a higher percentage of the responses falling in the predetermined
windows.
Table 5.6: Total Number of Responses and Percentage Used in Data Analysis (Mozart)

Trial | Subject Group | Total responses | Average responses | % of responses in a window

Mozart, No. 19, Mvmt. 4
Fine 1   | Non-musicians  | 713  | 50.93  | 65.08%
Fine 1   | Undergraduates | 496  | 49.60  | 76.01%
Fine 1   | Graduates      | 270  | 30.00  | 85.19%
Fine 1   | Total          | 1479 | 44.82  | 72.41%
Fine 2   | Non-musicians  | 665  | 47.50  | 64.06%
Fine 2   | Undergraduates | 500  | 50.00  | 74.80%
Fine 2   | Graduates      | 315  | 35.00  | 88.25%
Fine 2   | Total          | 1480 | 44.85  | 72.84%
Coarse 1 | Non-musicians  | 218  | 15.57  | 82.11%
Coarse 1 | Undergraduates | 184  | 18.40  | 80.43%
Coarse 1 | Graduates      | 113  | 12.56  | 69.91%
Coarse 1 | Total          | 515  | 15.61  | 78.83%
Coarse 2 | Non-musicians  | 121  | 8.64   | 85.95%
Coarse 2 | Undergraduates | 197  | 19.70  | 78.17%
Coarse 2 | Graduates      | 125  | 13.89  | 82.40%
Coarse 2 | Total          | 443  | 13.42  | 81.49%
Movement total            | 3917 | 118.70 | 74.44%

Mozart, No. 21, Mvmt. 2
Fine 1   | Non-musicians  | 163  | 11.64  | 76.07%
Fine 1   | Undergraduates | 230  | 23.00  | 88.26%
Fine 1   | Graduates      | 136  | 15.11  | 88.97%
Fine 1   | Total          | 529  | 16.03  | 84.69%
Fine 2   | Non-musicians  | 316  | 22.57  | 78.80%
Fine 2   | Undergraduates | 230  | 23.00  | 86.96%
Fine 2   | Graduates      | 130  | 14.44  | 94.62%
Fine 2   | Total          | 676  | 20.48  | 84.62%
Coarse 1 | Non-musicians  | 128  | 9.14   | 79.69%
Coarse 1 | Undergraduates | 82   | 8.20   | 84.15%
Coarse 1 | Graduates      | 37   | 4.11   | 78.38%
Coarse 1 | Total          | 247  | 7.48   | 80.97%
Coarse 2 | Non-musicians  | 126  | 9.00   | 77.78%
Coarse 2 | Undergraduates | 73   | 7.30   | 86.30%
Coarse 2 | Graduates      | 39   | 4.33   | 79.49%
Coarse 2 | Total          | 238  | 7.21   | 80.67%
Movement total            | 1690 | 51.21  | 83.55%

Using these data, the first set of mixed logit regressions demonstrates how well one set of
subject responses predicts another set of responses.57 Results indicate that participants are
consistent in their responses between trials in the same condition. Across both Bartók
movements, Fine 1 and Coarse 1 responses predict Fine 2 and Coarse 2 responses, respectively
(t(4251) = 6.19, p < 0.001, OR = 3.24 and t(4251) = 8.25, p < 0.001, OR = 20.4). The same trend
occurs across both Mozart movements (Fine: t(3757) = 5.76, p < 0.001, OR = 5.57; Coarse:
t(3757) = 12.73, p < 0.001, OR = 25.16). In both of these cases, the odds ratios suggest that
participants who respond in a particular window the first time through the piece are more likely
to respond in that window the second time through the piece. In both the Bartók and Mozart
conditions, the difference in odds ratios between the fine task and the coarse task is significant,
indicating more consistency in the coarse condition, but the Bartók responses are not
significantly different from the Mozart responses.58
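The comparison rule from note 58 can be sketched directly; this is the stricter two-way reading (neither ratio may fall inside the other's interval), and the numbers are the Bartók consistency values quoted above.

```python
# Two odds ratios count as significantly different when neither falls inside
# the other's 95% confidence interval (the rule described in note 58).
def significantly_different(or_a, ci_a, or_b, ci_b):
    return not (ci_b[0] <= or_a <= ci_b[1]) and not (ci_a[0] <= or_b <= ci_a[1])

# Bartók consistency: fine OR = 3.24, coarse OR = 20.4
print(significantly_different(3.24, (2.232, 4.698), 20.4, (9.969, 41.763)))  # True
```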
Only in the Mozart condition are there two significant interactions. First, musicians in
the fine condition are more likely to respond in a window within which they had responded
previously. This effect is even stronger for graduates, suggesting that an increase in musical
training correlates with increased consistency (see Figure 5.1). This effect appears only in the
fine condition; there is no interaction in the coarse condition. Second, the starting segmentation
task influenced consistency in the coarse condition: participants who began with the fine task
tended to be more consistent in the coarse condition (Figure 5.2). This could reflect a learning
effect, since the coarse condition comprised the third and fourth listenings for these participants,
but a similar interaction was not found in the Bartók condition. Instead, the use of consistent
cadential paradigms at the end of fine divisions in the Mozart stimuli might facilitate the
formation of larger sections, suggesting a bottom-up approach to determining formal sections.

57. In all of these mixed logit analyses, the odds ratio shows the probability of a response when a
given variable is present. In other words, the presence of a variable can predict the occurrence of
a response. In this particular case, I am treating the presence of a response in another trial as a
variable to see if one response can predict another. Used this way, "predict" is not time-sensitive:
the data from a later response can predict the presence of a response in an earlier trial.
58. Odds ratios are said to be significantly different if one odds ratio does not fall within the
confidence interval of the other odds ratio. In short, I am saying that I am 95% confident that
these ratios are different from one another. In the Bartók analysis the confidence interval for the
fine odds ratio is (2.232, 4.698) and the confidence interval for the coarse odds ratio is (9.969,
41.763), while in the Mozart analysis the confidence interval for the fine odds ratio is (3.11,
10.00) and the confidence interval for the coarse odds ratio is (15.31, 41.33).

In order for the resulting subject analysis to be hierarchically constructed, every coarse
response should correspond with a fine response, but obviously not vice versa. In both composer
conditions, Coarse 1 is a significant predictor for Fine 1, while Coarse 2 is a significant predictor
of Fine 2.59 In the Bartók analysis, Coarse 1 and 2 responses are strong predictors of Fine 1 and 2
responses (t(4251) = 7.25, p < 0.001, OR = 4.52 and t(4251) = 3.00, p = 0.003, OR = 2.54),
indicating that the fine responses are nested within the coarse responses. The odds ratio for the
first trial in each condition is slightly higher than that of the second trial, but this difference is not
significant. The Mozart analysis shows the same trend: Coarse 1 responses significantly predict
Fine 1 responses (t(28) = 4.248, p < 0.001, OR = 3.69), and Coarse 2 responses significantly
predict Fine 2 responses (t(28) = 3.04, p = 0.005, OR = 4.44). As with the Bartók analysis, the
difference between odds ratios is not significant.
For the participants in the Bartók condition, the starting segmentation task significantly
interacted with the coarse responses, where the ability of the coarse responses to predict the fine
responses varies based on the starting segmentation task. The direction of the difference is
represented by the estimated means shown in Table 5.7. In both cases, the difference between the
estimated means for subjects beginning with the coarse segmentation task is significantly higher
than for the subjects who began with the fine segmentation task, so the coarse responses are
better predictors of the fine responses when participants start with the coarse segmentation task.
These participants are therefore more likely to have their fine segmentation responses nested
within the coarse responses. Unlike in the Mozart condition, where starting with the fine
segmentation task produces more consistent results, participants had an advantage in the Bartók
condition when they began with the coarse segmentation task.
The segmentation task and tempo of the movement affected participant response time
across the board. Response latency was measured from the beginning of each window to the time
at which the subject responded, and it varies significantly between different tempos and
segmentation tasks (Table 5.8). Segmentation task and tempo were coded as binary variables
where 0 represents the fine segmentation task and a fast tempo and 1 represents the coarse
segmentation task and a slow tempo. Both coefficients are positive, indicating that subjects were
slower to respond in the coarse segmentation task or while listening to the movement with the
slower tempo. The first result confirms the observation made previously that coarse responses
tend to occur later than fine responses (refer to Appendix A, Figures A.1–A.4). The latter result is
also unsurprising because the response windows are almost twice as long in the slow
movements, providing the opportunity for a longer response time.

59. I did not examine whether Coarse 2 predicts Fine 1 or whether Coarse 1 predicts Fine 2
because this would compare the first trial in one condition with the second trial in the other
condition.
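Because both predictors are 0/1 coded, the Table 5.8 coefficients combine additively into predicted latencies. A sketch using the Bartók values (latency measured from the start of each window, in the units reported by the experiment software):

```python
# Predicted latency from the Bartók regression in Table 5.8.
# Coding from the text: segmentation 0 = fine, 1 = coarse; tempo 0 = fast, 1 = slow.
def predicted_latency(segmentation, tempo):
    return 976.716640 + 1011.440535 * segmentation + 1447.353184 * tempo

print(round(predicted_latency(0, 0), 2))  # 976.72  -- fine task, fast movement
print(round(predicted_latency(1, 1), 2))  # 3435.51 -- coarse task, slow movement
```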

Figure 5.1: Interactions between Subject Group and Consistency (Mozart)


Each line connects the mean number of responses in the Fine 2 trial that do not occur in the same window in both
trials to the mean number of responses that do occur in the same window in both trials.

Figure 5.2: Interaction between Starting Condition and Consistency (Mozart)


Each line connects the mean number of responses in the Coarse 2 trial that do not occur in the same window in both
trials to the mean number of responses that do occur in the same window in both trials.

Table 5.7: ANOVA Means for Interactions between Starting Task and the Nested Structure (Bartók)60

Outcome variable61 | Feature62 | Start Fine, Feature Absent | Start Fine, Feature Present | Start Coarse, Feature Absent | Start Coarse, Feature Present | p-value63
Fine 1 | Coarse 1 | 0.268 | 0.621 | 0.241 | 0.705 | 0.019
Fine 2 | Coarse 2 | 0.324 | 0.687 | 0.232 | 0.745 | 0.008

Table 5.8: Mixed Models Regression Analysis: Latency Time

Composer | Fixed Effect | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value
Bartók | Intercept64   | 976.716640  | 39.947454  | 24.450 | 27   | <0.001
Bartók | Segmentation  | 1011.440535 | 170.049966 | 5.948  | 3582 | <0.001
Bartók | Tempo         | 1447.353184 | 155.966148 | 9.280  | 3582 | <0.001
Mozart | Intercept     | 1002.587521 | 64.547880  | 15.532 | 28   | <0.001
Mozart | Segmentation  | 338.847945  | 53.545578  | 6.328  | 4500 | <0.001
Mozart | Tempo         | 711.824574  | 101.598533 | 7.006  | 4500 | <0.001

The next set of analyses examines whether the probability of a segmentation response
increases as the number of changes in the music increases. For this analysis, I counted the
number of changes occurring in each window. The features included in this count are: complete
silence, melodic silence, non-melodic silence, entrance of a new instrument, and a change of
register, dynamics, or ostinato.65 In the Bartók movements, the number of changes in a given
window ranges from 0–7, but windows with more than four changes are grouped together
because there are relatively few windows with more than four changes, resulting in a scale
from 0–4. The Mozart windows have fewer changes (ranging only from 0–4), so windows with
three or more changes are grouped together, forming a scale from 0–3. The results from a mixed
logit regression predicting the presence of a listener-perceived boundary from the number of
changes in a window are shown in Table 5.9. A positive coefficient indicates a higher probability
of a boundary, and the p-value indicates whether an increase in the number of changes is a
statistically significant predictor of the observed behavior.

60. These means are not the same estimated means produced by the regression, but they still
indicate the direction of the interaction.
61. Also known as the dependent variable; it is the variable that the regression predicts.
62. In this case, the means represent the number of windows in which there is a fine response but
not a coarse response (feature absent) and the number of windows in which there is both a fine
response and a coarse response (feature present).
63. In all of the ANOVA tables in this chapter, the p-values come from the mixed-models
regression, not the actual ANOVA.
64. The intercept is needed for the regression equation and does not represent anything
meaningful (it is the point where the line crosses the y-axis when the other variables are not
included in the equation).
65. Change of ostinato only occurred in the Bartók stimuli.
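The pooling of high change counts described above amounts to capping the predictor, with a cap of 4 for the Bartók windows and 3 for the Mozart windows:

```python
# Cap the change count so that the rare high-change windows pool together.
def change_scale(n_changes, cap):
    return min(n_changes, cap)

print([change_scale(n, 4) for n in range(8)])  # [0, 1, 2, 3, 4, 4, 4, 4] -- Bartók, 0-7 capped at 4
print([change_scale(n, 3) for n in range(5)])  # [0, 1, 2, 3, 3] -- Mozart, 0-4 capped at 3
```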
Table 5.9: Mixed Logit Regression Analysis: Number of Changes

Composer | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Bartók | Fine 1   | 0.280069 | 0.049584 | 5.648 | 4251 | <0.001 | 1.323221 | (1.201, 1.458)
Bartók | Fine 2   | 0.093373 | 0.089227 | 1.046 | 4251 | 0.295  | 1.097871 | (0.922, 1.308)
Bartók | Coarse 1 | 0.300257 | 0.081595 | 3.680 | 4251 | <0.001 | 1.350206 | (1.151, 1.584)
Bartók | Coarse 2 | 0.624884 | 0.062921 | 9.931 | 4251 | <0.001 | 1.868029 | (1.651, 2.113)
Mozart | Fine 1   | 0.276388 | 0.053886 | 5.129 | 3757 | <0.001 | 1.318359 | (1.186, 1.465)
Mozart | Fine 2   | 0.180791 | 0.109938 | 1.644 | 3757 | 0.100  | 1.198165 | (0.966, 1.486)
Mozart | Coarse 1 | 0.440043 | 0.103856 | 4.237 | 3757 | <0.001 | 1.552774 | (1.267, 1.903)
Mozart | Coarse 2 | 0.631480 | 0.076264 | 8.280 | 3757 | <0.001 | 1.880391 | (1.619, 2.184)
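The odds ratios and confidence intervals in these regression tables follow directly from each coefficient b and standard error: OR = e^b, and the 95% CI is e^(b ± 1.96·SE). Checking the first Bartók row of Table 5.9:

```python
import math

# Recover an odds ratio and its 95% confidence interval from a logit
# coefficient and its standard error.
def odds_ratio_ci(b, se):
    return math.exp(b), (math.exp(b - 1.96 * se), math.exp(b + 1.96 * se))

# Bartók, Fine 1 (Table 5.9): b = 0.280069, SE = 0.049584
or_, (lo, hi) = odds_ratio_ci(0.280069, 0.049584)
print(round(or_, 6), round(lo, 3), round(hi, 3))  # 1.323221 1.201 1.458
```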

In the Bartók responses, in three of the four trials, the amount of change significantly
predicts the subject responses. For Fine 1, Coarse 1, and Coarse 2, an increase in the number of
changes increases the chance of responding, especially for Coarse 2, whose odds ratio is
significantly greater than those of the other three trials. There is also an interaction effect for the
first three outcome variables and graduate students, illustrated in Figure 5.3, where the lines
connect the estimated means determined by an ANOVA. The slopes of lines
representing graduates are steeper than those representing undergraduates, indicating that
graduates are more likely to respond as the number of changes increases. The two lower graphs
show that coarse responses are best predicted by four or more changes in the musical surface,
indicated by the large jump from 3 to 4 instead of the incremental rise seen in the fine responses.
The difference between the two subject groups in Coarse 2 is not significant.

Figure 5.3: Interactions between Subject Group and Number of Changes (Bartk)
There is not a significant interaction between the subject groups in Coarse 2.

Turning to the Mozart analysis, there is also a main effect for an increase in the
number of changes in three of the four trials (Fine 1, Coarse 1, and Coarse 2). The odds ratios in
both coarse trials are significantly larger than the odds ratios in the fine trials, indicating that
more musical changes are needed before a listener will respond in the coarse condition than in
the fine condition. There was only one interaction between the fixed effect (number of changes)
and a subject group (graduates), occurring in the Fine 1 trial, where graduates are less likely to
respond overall. Despite the lack of this interaction effect in the other trials, Figure 5.4 shows the
estimated means for all four trials for easier comparison with the Bartók results. In the fine
condition, there is an inconsistent upward slope as the number of changes increases, while
participants in the coarse condition are increasingly likely to respond as the number of changes
increases. This might suggest that phrase-ending analyses are less dependent upon changes in the
musical surface, and perhaps participants are paying attention to other musical features in
Mozart, while subjects are much more sensitive to changes as a demarcation of boundaries in
Bartók.

Figure 5.4: Interactions between Subject Group and Number of Changes (Mozart)
The only significant difference between subject groups occurs in Fine 1.

My own grouping analysis might provide a more nuanced measure of boundary strength,
since it reflects an interaction between arrival features, which I deemed meaningful for a
particular piece, and change features taken from the surface of the music. To determine how well
my three-level grouping hierarchy predicts responses, I coded the windows according to the type
of ending each contained. A window containing at least a section ending was coded as 3; a
window containing a phrase ending was coded as 2; a window containing a subphrase ending
was coded as 1. This creates a rating that corresponds to the hierarchical level of the ending.
These designations are my own analytical interpretations, of course, but there is a main effect for
an increase in hierarchical ending level on responses in all four outcome variables in both
composer conditions (see Table 5.10). Overall, the odds ratios for the ending ratings in the coarse
condition are significantly higher, indicating a much higher probability of responding as the
hierarchical level changes from a lower level to a higher level.
In the Bartók condition, there is an interaction between graduates and the ending ratings
in the first three conditions (Figure 5.5), where graduates tend not to respond as often as
undergraduates within a window containing no ending or just a subphrase ending; conversely,
graduates are more likely to respond within a window that concludes a section. The Mozart
condition also exhibits this same interaction between the ending type and level of expertise.66 As
Figure 5.6 illustrates, graduate responses are less likely to occur without some sort of ending,
especially in the coarse condition, where graduates wait for at least a phrase ending before
responding.
Table 5.10: Mixed Logit Regression Analysis: Ending Type

Composer | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Bartók | Fine 1   | 0.583101 | 0.070496 | 8.271  | 4251 | <0.001 | 1.791586 | (1.560, 2.057)
Bartók | Fine 2   | 0.365162 | 0.122154 | 2.989  | 4251 | 0.003  | 1.440748 | (1.134, 1.830)
Bartók | Coarse 1 | 0.936090 | 0.155891 | 6.005  | 4251 | <0.001 | 2.549992 | (1.879, 3.461)
Bartók | Coarse 2 | 1.160561 | 0.144327 | 8.041  | 4251 | <0.001 | 3.191725 | (2.405, 4.235)
Mozart | Fine 1   | 0.576789 | 0.141974 | 4.063  | 3757 | <0.001 | 1.780312 | (1.348, 2.352)
Mozart | Fine 2   | 0.615687 | 0.167034 | 3.686  | 3757 | <0.001 | 1.850927 | (1.334, 2.568)
Mozart | Coarse 1 | 1.050279 | 0.074346 | 14.127 | 3757 | <0.001 | 2.858447 | (2.471, 3.307)
Mozart | Coarse 2 | 1.220777 | 0.223261 | 5.468  | 3757 | <0.001 | 3.389822 | (2.188, 5.251)

Neither a simple count of changes nor the presence of an ending determined by a
grouping analysis can show how particular musical features predict listener responses in each
movement. Movements were separated for these subsequent analyses because different features
may predict endings in each movement. For similar reasons, a separate analysis was run for each
of the four trials. The first set of regressions looks at the arrival features, which vary between
composers (refer back to Tables 5.3 and 5.4). Because some of these arrival features strongly
correlate with one another (e.g., a PAC will always occur with scale degree 1), they were not
combined into one large regression; instead, I ran several smaller regression analyses. The
change features were also divided into separate regressions according to their location in a
window; for instance, silence is more likely to occur following the end of a segment and before a
new beginning, while the other change features usually signify a new beginning. The final
regression examines the extent to which analytic endings (subphrase, phrase, and section)
predict a perceived boundary. In all of these analyses, the presence of a feature was coded as 1,
so a positive coefficient indicates that a musical feature predicts the segmentation responses,
while a negative coefficient means the listener is less likely to respond within a window
containing the given feature.

66. This interaction is between musicians and non-musicians in the Fine 1, Fine 2, and Coarse 1
conditions; and between graduates and undergraduates in the Fine 2, Coarse 1, and Coarse 2
conditions.

Figure 5.5: Interactions between Subject Group and Ending Type (Bartk)
There is not a significant difference between the subject groups in Coarse 2.

115

Figure 5.6: Interactions between Subject Group and Ending Type (Mozart)
There is not a significant difference between the subject groups in Fine 1.

Experiment 1a: Bartók Results

Arrival Features: In both movements, arrival features were grouped into four separate
analyses: the interval into the last note of a segment (Type 1 Window) or the first note of a
window (Type 2 Window) and whether it was approached from below or above; change of
duration; and cadential type (which varied between movements). Tables 5.11 and 5.12
summarize the significant results from this set of regressions. Overall, no single feature or set of
features predicts responses across both Bartók movements; instead, the features that correspond
with listener responses are movement-specific.
Table 5.11: Mixed Logit Regression Analysis: Arrival Features, Third Movement

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Intervallic Direction: Descent | Coarse 1 | 0.965358 | 0.200947 | 4.804  | 1398 | <0.001 | 2.625727 | (1.770, 3.895)
Intervallic Direction: Descent | Coarse 2 | 0.897002 | 0.130830 | 6.856  | 1398 | <0.001 | 2.452240 | (1.897, 3.170)
Intervallic Direction: Ascent  | Fine 1   | 1.228127 | 0.206982 | 5.933  | 1398 | <0.001 | 3.414828 | (2.275, 5.125)
Intervallic Approach: -5       | Coarse 1 | 1.136407 | 0.217230 | 5.231  | 1398 | <0.001 | 3.115554 | (2.034, 4.771)
Intervallic Approach: -5       | Coarse 2 | 1.242533 | 0.216170 | 5.748  | 1398 | <0.001 | 3.464376 | (2.267, 5.294)
Intervallic Approach: -7       | Coarse 1 | 1.768610 | 0.243063 | 7.276  | 1392 | <0.001 | 5.862696 | (3.639, 9.445)
Intervallic Approach: -7       | Coarse 2 | 2.218797 | 0.210323 | 10.549 | 1392 | <0.001 | 9.196257 | (6.087, 13.894)
Intervallic Approach: +1       | Fine 1   | 1.921392 | 0.429727 | 4.471  | 1388 | <0.001 | 6.830462 | (2.940, 15.871)
Intervallic Approach: +1       | Fine 2   | 0.723158 | 0.297351 | 2.432  | 1388 | 0.015  | 2.060932 | (1.150, 3.693)
Intervallic Approach: +2       | Coarse 1 | 1.363304 | 0.287454 | 4.743  | 1392 | <0.001 | 3.909086 | (2.224, 6.871)
Intervallic Approach: +2       | Coarse 2 | 2.108026 | 0.343805 | 6.131  | 1392 | <0.001 | 8.231973 | (4.193, 16.161)
Duration Change                | Coarse 1 | 1.049846 | 0.199121 | 5.272  | 1403 | <0.001 | 2.857210 | (1.933, 4.223)
Duration Change                | Coarse 2 | 0.875178 | 0.231805 | 3.775  | 1403 | <0.001 | 2.399303 | (1.523, 3.781)
Cadence: Falling 4th           | Coarse 1 | 1.277560 | 0.174930 | 7.303  | 1372 | <0.001 | 3.587874 | (2.546, 5.057)
Cadence: Falling 4th           | Coarse 2 | 1.333816 | 0.161271 | 8.271  | 1372 | <0.001 | 3.795498 | (2.766, 5.208)

For the third movement, most of the main effects point toward the influence of the falling
fourth cadence on listener segmentation in the coarse condition (an example of this cadence
appears in Example 5.5). Along with a descent of five semitones, this cadential gesture also
features a duration change, where the arrival note is shorter than the preceding note. While the
main effect for an intervallic ascent, specifically by one or two semitones, is not associated with
the falling fourth cadential gesture, the melodic note in the last window of the movement is
approached by an ascending step. Almost everyone in each trial identified this window as a
boundary point. Even though this does not constitute a predefined cadential gesture, it does
illustrate how the results can be swayed by particular compositional features.
The arrival features in the fifth movement do not indicate a systematic preference for any
particular cadential gesture. There is a positive main effect for a melodic descent in two trials,
while participants are less likely to respond to a melodic ascent (note the negative coefficient).
Musicians, however, exhibit less reaction to a melodic descent; an interaction suggests that the
contour leading to the final note of a segment is not a strong indicator of course endings for

117

musicians.67 The more specific intervallic approaches illustrate a similar trend: regardless of the
interval size, a descending contour better predicts responses. Again, an interaction in the coarse
condition for the descending step suggests that this preference for downward intervals may not
hold across subject groups. While undergraduates are more likely to respond when this feature is
present, this feature does not influence graduates, supporting that motion into the final note of a
segment is not a strong indicator for boundaries as musical training and segmentation grain
increases.68
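The "percent increase" comparisons in the footnotes can be read as relative changes in mean response rate between windows where a feature is absent and windows where it is present. A small sketch of that arithmetic, using the Coarse 1 ANOVA means reported in footnote 68:

```python
def relative_increase(absent, present):
    """Relative increase in mean response rate when a feature is
    present versus absent."""
    return (present - absent) / absent

# ANOVA means from footnote 68 (Coarse 1, melodic descent).
undergrad = relative_increase(0.095, 0.140)  # roughly a 47% increase
graduate = relative_increase(0.100, 0.111)   # roughly an 11% increase
print(round(undergrad, 2), round(graduate, 2))
```

The much smaller relative increase for graduates is what the text describes as the feature's influence fading with musical training.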
Table 5.12: Mixed Logit Regression Analysis: Arrival Features, Fifth Movement

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Intervallic Direction: Descent | Fine 1 | 0.521713 | 0.150503 | 3.466 | 2806 | <0.001 | 1.684911 | (1.254, 2.263)
Intervallic Direction: Descent | Coarse 2 | 0.437531 | 0.151715 | 2.884 | 2806 | 0.004 | 1.548878 | (1.150, 2.085)
Intervallic Direction: Ascent | Coarse 2 | -0.887467 | 0.229938 | -3.860 | 2806 | <0.001 | 0.411697 | (0.262, 0.646)
Intervallic Approach: -1, -2 | Coarse 2 | 0.551656 | 0.151708 | 3.636 | 2796 | <0.001 | 1.736125 | (1.290, 2.337)
Intervallic Approach: -3, -4 | Fine 1 | 0.587790 | 0.205178 | 2.865 | 2796 | 0.004 | 1.800005 | (1.204, 2.691)
Intervallic Approach: -3, -4 | Coarse 1 | 0.760485 | 0.228989 | 3.321 | 2796 | <0.001 | 2.139314 | (1.366, 3.351)
Intervallic Approach: -5, -7 | Fine 1 | 1.779049 | 0.720338 | 2.470 | 2796 | 0.014 | 5.924219 | (1.444, 24.311)
Intervallic Approach: +1, +2 | Coarse 2 | -1.440268 | 0.610440 | -2.359 | 2796 | 0.018 | 0.236864 | (0.072, 0.784)
Duration Change | Fine 1 | 0.568032 | 0.154410 | 3.679 | 2811 | <0.001 | 1.764791 | (1.304, 2.389)
Duration Change | Fine 2 | 0.622929 | 0.175157 | 3.556 | 2811 | <0.001 | 1.864380 | (1.323, 2.628)
Cadence | Coarse 2 | -2.097373 | 0.804417 | -2.607 | 2800 | 0.009 | 0.122779 | (0.025, 0.594)
Cadence | Fine 2 | 0.755961 | 0.255543 | 2.958 | 2796 | 0.003 | 2.129657 | (1.291, 3.514)
Cadence: L-H | Coarse 1 | 1.052860 | 0.307220 | 3.427 | 2796 | <0.001 | 2.865836 | (1.569, 5.233)

In both fine trials, there is a main effect for duration change; this is unlike the third
movement, where duration better predicted the coarse responses. Perhaps this difference reflects
the association between the long-short rhythmic figure and the falling fourth cadential gesture in
the third movement, while the fifth movement has no consistent relationship between a particular
durational pattern and phrase endings. This is further reflected in the lack of main effects for
cadential gestures, where the only positive main effects include the double chord gesture (Fine 2)
and the low-high succession (Coarse 2). There was an interesting interaction in the first fine trial
involving the double chord cadential gesture and musical expertise: the ANOVA means indicate
that all subjects are more likely to respond at a double chord cadence, but musicians demonstrate
a higher percent increase with the presence of this feature, and graduates show an even higher
percent increase.69

67 For non-musicians, the ANOVA means increased from 0.062 to 0.126 when the melody descended, compared with the smaller percent increase from 0.054 to 0.120 for musicians.
68 For undergraduates, in Coarse 1 the ANOVA means increased from 0.095 to 0.140 when the melody descended, compared with the smaller percent increase from 0.100 to 0.111 for graduates; in Coarse 2, the ANOVA means increased from 0.067 to 0.133 when the melody descended, compared with the smaller percent increase from 0.072 to 0.093 for graduates.
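The p-values reported in these tables pair each t-ratio with its approximate degrees of freedom. Because the d.f. here are in the thousands, the t distribution is nearly normal, and a simple normal approximation reproduces the reported values closely. This is only an approximation for checking the tables, not the software used in the study:

```python
import math

def two_sided_p(t_ratio):
    """Approximate two-sided p-value for a t-ratio, using the normal
    approximation (reasonable for the d.f. > 1000 in these tables)."""
    return math.erfc(abs(t_ratio) / math.sqrt(2))

# Two t-ratios reported in Table 5.11 (the p = 0.015 and a p < 0.001 row).
print(round(two_sided_p(2.432), 3))
print(two_sided_p(5.272) < 0.001)
```

The first call recovers the table's p = 0.015; the second confirms a value well below the <0.001 threshold.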
Change Features: Another set of mixed logit regressions calculated the odds ratios for
the change features in each movement. Tables 5.13 and 5.16 summarize the main effects for
these features. In both movements, participants tend to respond to silence fairly consistently,
especially in the coarse trials, but other change features vary between movements. For instance,
register, dynamic, and ostinato changes tend to influence the results more in the third movement
than in the fifth movement.
Complete silence and a thinning of the texture tend to predict responses in the third
movement, and there is an interaction between subject groups and non-melodic silence, where a
thinning of the texture influences graduates (who are overall less likely to respond) more than
undergraduates. (All interactions for this set of features are located in Table 5.14.) Another
feature that usually follows endings, melodic silence, significantly predicts an absence of a
response in the coarse condition. This feature usually occurs at subphrase divisions and in the
middle of musical segments in this movement, which is reflected in these results. For instance,
consider the cello melody from mm. 6-35 (Example 5.11). In this example, all the windows in
mm. 19-29 are marked and annotated with the percentage of participants who responded in each
window (Coarse 1 trial only). These low-performing silences may be too short to evoke a
boundary, or another feature, like phrase length or melodic content, may be influencing the
participants' performance at this grain of segmentation.

69 For non-musicians, the ANOVA means increased from 0.288 to 0.379 at a double chord cadence, compared with the larger percent increase from 0.054 to 0.120 for musicians; for undergraduates, the ANOVA means increased from 0.271 to 0.400, compared with the larger percent increase from 0.017 to 0.047 for graduates.

Table 5.13: Mixed Logit Regression Analysis: Change Features, Third Movement

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Complete Silence | Coarse 1 | 1.551360 | 0.371348 | 4.178 | 1393 | <0.001 | 4.717880 | (2.277, 9.776)
Complete Silence | Coarse 2 | 2.194610 | 0.280773 | 7.816 | 1393 | <0.001 | 8.976497 | (5.174, 15.572)
Melodic Silence | Coarse 1 | -0.797169 | 0.227236 | -3.508 | 1393 | <0.001 | 0.450603 | (0.289, 0.704)
Non-melodic Silence | Fine 1 | 2.180737 | 0.311925 | 6.991 | 1393 | <0.001 | 8.852828 | (4.801, 16.326)
Non-melodic Silence | Fine 2 | 1.485596 | 0.228470 | 6.502 | 1393 | <0.001 | 4.417596 | (2.822, 6.916)
Non-melodic Silence | Coarse 1 | 1.477900 | 0.425064 | 3.477 | 1393 | <0.001 | 4.383729 | (1.904, 10.093)
Non-melodic Silence | Coarse 2 | 2.521323 | 0.451601 | 5.583 | 1393 | <0.001 | 12.445049 | (5.131, 30.186)
New Instrument | Coarse 1 | 1.113262 | 0.339207 | 3.282 | 1398 | 0.001 | 3.044271 | (1.565, 5.923)
New Instrument | Coarse 2 | 1.818096 | 0.128201 | 14.182 | 1398 | <0.001 | 6.160116 | (4.790, 7.922)
New Mel. Instrument | Fine 1 | 0.998206 | 0.302832 | 3.296 | 1398 | 0.001 | 2.713410 | (1.498, 4.915)
New Mel. Instrument | Coarse 1 | 0.874641 | 0.366009 | 2.390 | 1398 | 0.017 | 2.398015 | (1.169, 4.917)
Register | Fine 2 | 0.362799 | 0.154054 | 2.355 | 1393 | 0.019 | 1.437347 | (1.062, 1.945)
Register | Coarse 1 | 0.868092 | 0.304006 | 2.856 | 1393 | 0.004 | 2.382360 | (1.312, 4.326)
Register | Coarse 2 | 0.775337 | 0.297488 | 2.606 | 1393 | 0.009 | 2.171325 | (1.211, 3.892)
Dynamics | Fine 1 | 0.973550 | 0.206354 | 4.718 | 1393 | <0.001 | 2.647326 | (1.766, 3.969)
Dynamics | Coarse 1 | 1.336720 | 0.204935 | 6.523 | 1393 | <0.001 | 3.806536 | (2.546, 5.691)
Dynamics | Coarse 2 | 2.513043 | 0.484093 | 5.191 | 1393 | <0.001 | 12.342428 | (4.774, 31.907)
Ostinato | Coarse 2 | 0.664673 | 0.227824 | 2.917 | 1393 | 0.004 | 1.943855 | (1.243, 3.039)

Table 5.14: ANOVA Means for Interactions in Change Feature Analysis, Third Movement

Outcome variable | Feature | More-Training Group | Less Training: Absent | Less Training: Present | More Training: Absent | More Training: Present | p-value
Fine 1 | Lose Instr. | Graduate | 0.298 | 0.632 | 0.150 | 0.556 | 0.004
Fine 2 | Lose Instr. | Graduate | 0.280 | 0.545 | 0.147 | 0.525 | 0.003
Coarse 1 | Lose Instr. | Graduate | 0.075 | 0.439 | 0.007 | 0.434 | <0.001
Coarse 2 | Silence | Graduate | 0.086 | 0.598 | 0.089 | 0.278 | 0.001
Fine 1 | New Instr. | Graduate | 0.356 | 0.446 | 0.158 | 0.500 | 0.001
Fine 2 | New Instr. | Graduate | 0.327 | 0.395 | 0.152 | 0.481 | 0.013
Coarse 2 | New Melody | Graduate | 0.112 | 0.326 | 0.070 | 0.472 | 0.013
Fine 1 | Register | Graduate | 0.355 | 0.478 | 0.185 | 0.506 | 0.01
Fine 2 | Register | Graduate | 0.316 | 0.459 | 0.164 | 0.543 | <0.001


Example 5.11: Bartók, String Quartet No. 4, third movement, mm. 6-35 (cello)
The Type 1 Windows are annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.
String Quartet No. 4 by Béla Bartók. Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed. Reprinted by Permission.


On the other hand, all participants strongly respond in the coarse condition to the
presence of complete silence, which usually is reserved for the ends of phrases and sections. In
Coarse 2, however, the effect is tempered for graduates, indicating that not every complete
silence indicates a boundary in this condition. Returning to the cello melody in Example 5.11,
complete silence is marked in the texture by a short, black vertical line. Table 5.15 lists all the
points of silence in this short passage followed by the percentage of undergraduate and graduate
responses in the Coarse 2 trial. Only after m. 34, which doesn't have complete silence, does the
texture change and a new melody enter, initiating what I consider a new section.70 On their
second time performing the coarse segmentation task, graduates may have learned enough not to
be tricked by the complete silence before the end of this section.
Table 5.15: Percentage of Responses at Complete Silence in Coarse 2, Third Movement

Window Location | Undergraduates | Graduates
m. 13 | 78% | 22%
m. 21 | 74% | 11%
m. 34 | 39% | 89%

The last two sets of regressions examine how changes usually associated with new
beginnings affected listener responses. All of the changes have a main effect in at least one of the
coarse conditions, but the two strongest predictors in this condition are the introduction of a new
instrument and a change in dynamics. There is no main effect for the entrance of a new
instrument in the fine segmentation task, but there is an interaction between this feature and
graduates, who tend to respond more to this cue than undergraduates. In contrast, the entrance of
a new melodic instrument significantly predicts responses in the fine condition, but the
interaction remains the same: in Coarse 2, graduates are more likely to respond to this cue.71 In
sum, both features, which sometimes roughly coincide, significantly predict listener responses,
especially in the coarse conditions. Among the other change features, dynamic change is
especially predictive of coarse segmentation (producing a significantly higher odds ratio in
Coarse 2), and there is both a main effect and an interaction for a change in register, especially
for graduates, for whom this feature has a pronounced effect in the fine condition.

70 Even though complete silence is not indicated in the score, the performers insert a little space between beats 2 and 3 in m. 34.
71 There is also an effect for this feature in Fine 2 (which is not included in the chart): t(1398) = 2.01, p = 0.044, OR = 2.83.
For the features present between musical segments in the fifth movement, complete
silence consistently predicts listener responses, especially in the coarse condition (notice the
significantly higher odds ratios), while non-melodic silence only predicts responses in the coarse
condition (see Table 5.16). Although there is no main effect for melodic silence, this feature is
involved in a couple of interactions (Table 5.17). In Fine 1, musicians are more likely to respond
to melodic silence, whereas non-musicians show no reaction. The interaction in the Coarse 1 trial
reveals a different trend: participants are less likely to respond to melodic silence,
undergraduates more so than graduates. This suggests that, for trained musicians, melodic silence
is sufficient for a fine boundary, but not necessarily for a coarse boundary.
Table 5.16: Mixed Logit Regression Analysis: Change Features, Fifth Movement

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Complete Silence | Fine 1 | 0.976257 | 0.265263 | 3.680 | 2801 | <0.001 | 2.654503 | (1.578, 4.465)
Complete Silence | Coarse 1 | 1.910363 | 0.319696 | 5.976 | 2801 | <0.001 | 6.755540 | (3.610, 12.641)
Complete Silence | Coarse 2 | 2.114830 | 0.305034 | 6.933 | 2801 | <0.001 | 8.288176 | (4.558, 15.070)
Non-melodic Silence | Coarse 1 | 0.610535 | 0.170981 | 3.571 | 2801 | <0.001 | 1.841417 | (1.317, 2.575)
Non-melodic Silence | Coarse 2 | 1.190035 | 0.146008 | 8.151 | 2801 | <0.001 | 3.287195 | (2.469, 4.376)
New Instrument | Fine 1 | -0.502388 | 0.187898 | -2.674 | 2806 | 0.008 | 0.605084 | (0.419, 0.874)
New Instrument | Coarse 1 | -1.102216 | 0.214054 | -5.149 | 2806 | <0.001 | 0.332134 | (0.218, 0.505)
New Instrument | Coarse 2 | -0.665164 | 0.157920 | -4.212 | 2806 | <0.001 | 0.514189 | (0.377, 0.701)
Dynamics | Coarse 2 | 1.024097 | 0.277650 | 3.688 | 2801 | <0.001 | 2.784581 | (1.616, 4.798)

During the third movement, the introduction of a new instrument predicts a listener
response, but during the fifth movement we observe the opposite effect. This could be an effect
of the more complex contrapuntal texture of the fifth movement compared with the texture of the
third movement. This more complex texture could also explain the lack of main effects for the
change features that mark the beginning of a new musical segment. Dynamic change in the
Coarse 2 trial produced the only main effect; however, an interaction reveals that this is almost
entirely attributable to graduates.

123

Since Bartók's music did not necessarily conform to an established syntax, the arrival
features that listeners used to decide upon boundaries vary between pieces. Surprisingly, though,
silence is the only change feature that is consistently used as a boundary marker for both
movements. More specific feature interactions might have been concealed by this broad
overview: for instance, listeners may respond only when a certain combination of change and
arrival features is present. My future research will pursue this avenue, but my own grouping
analysis can function as a simplification of these interactions, since these features influenced my
decisions. I will return to this point after examining the influence of arrival and change features
in Mozart.
Table 5.17: ANOVA Means for Interactions in Change Feature Analysis, Fifth Movement

Outcome variable | Feature | More-Training Group | Less Training: Absent | Less Training: Present | More Training: Absent | More Training: Present | p-value
Fine 1 | Mel. Silence | Musician | 0.296 | 0.302 | 0.258 | 0.301 | 0.007
Coarse 1 | Mel. Silence | Graduate | 0.112 | 0.084 | 0.109 | 0.085 | 0.008
Fine 2 | Dynamic | Graduate | 0.314 | 0.334 | 0.259 | 0.368 | 0.007
Coarse 1 | Dynamic | Graduate | 0.072 | 0.146 | 0.044 | 0.181 | 0.021

Experiment 1b: Mozart Results


Arrival Features: While the features that best predicted responses in the Bartk excerpts
were fairly evenly divided between arrival features and change features (especially in the third
movement), arrival features, such as melodic scale degrees and cadential figures, become highly
predictive of responses in the Mozart stimuli. Tables 5.18 and 5.21 list the main effects for the
arrival features, which were grouped into six different regressions comparing similar features:
scale-degrees, harmony/harmonic progression, intervallic direction, step progressions, duration,
and cadences. While some of these features are also explored in the Bartk analysis, a majority
of these features are only associated with endings in the tonal style.
For Mozart's String Quartet No. 19, melodic $, (, and - are all highly predictive of
listener responses, especially in the coarse condition (notice the high odds ratios). While it may
seem strange to have such a high odds ratio for -, in this movement the three largest sections
(exposition, development, and recapitulation) all end with - in the soprano. There is not a
consistent main effect for $ in the fine condition because non-musicians tend not to respond to
this feature while musicians are likely to respond (see the interaction table: Table 5.19). There
are additional interactions between the other scale degrees and subject group in the fine
condition: all participants, especially musicians, are less likely to respond to ( in the melody,
whereas all participants, especially non-musicians, are more likely to respond to -. These
results suggest that musicians are more sensitive to scale degrees, reserving most of their
responses for cadences with $ in the melody.
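An odds ratio of this size is easier to interpret as a change in response probability. The sketch below applies the 39.67 odds ratio reported in Table 5.18 to a purely hypothetical 5% baseline response rate; the baseline is an assumption for illustration, not a value from the study:

```python
def apply_odds_ratio(p_baseline, odds_ratio):
    """Response probability after multiplying baseline odds by an
    odds ratio (how a logit main effect scales the odds)."""
    odds = p_baseline / (1 - p_baseline) * odds_ratio
    return odds / (1 + odds)

# Hypothetical 5% baseline, scaled by a large odds ratio from Table 5.18.
boosted = apply_odds_ratio(0.05, 39.674674)
print(round(boosted, 3))
```

Under this hypothetical baseline, the feature raises the predicted response probability from 5% to roughly 68%, which conveys why such odds ratios dominate the coarse-condition results.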
Table 5.18: Mixed Logit Regression Analysis: Arrival Features, No. 19

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Scale Degrees: $ | Fine 1 | 1.018028 | 0.409848 | 2.484 | 2492 | 0.013 | 2.767730 | (1.239, 6.183)
Scale Degrees: $ | Coarse 1 | 3.680713 | 0.647215 | 5.687 | 2496 | <0.001 | 39.674674 | (11.151, 141.161)
Scale Degrees: $ | Coarse 2 | 2.834016 | 0.612873 | 4.624 | 2496 | <0.001 | 17.013647 | (5.115, 56.592)
Scale Degrees: ( | Coarse 1 | 2.762380 | 0.611132 | 4.520 | 2496 | <0.001 | 15.837484 | (4.778, 52.500)
Scale Degrees: ( | Coarse 2 | 1.755356 | 0.528397 | 3.322 | 2496 | <0.001 | 5.785508 | (2.053, 16.306)
Scale Degrees: - | Fine 1 | 1.344350 | 0.422937 | 3.179 | 2492 | 0.001 | 3.835691 | (1.674, 8.791)
Scale Degrees: - | Coarse 1 | 4.561242 | 0.738300 | 6.178 | 2496 | <0.001 | 95.702250 | (22.498, 407.095)
Scale Degrees: - | Coarse 2 | 3.738453 | 0.577161 | 6.477 | 2496 | <0.001 | 42.032921 | (13.553, 130.356)
Harmony: I | Coarse 2 | 1.563465 | 0.638883 | 2.447 | 2496 | 0.014 | 4.775340 | (1.364, 16.715)
Harmony: V | Coarse 2 | 1.848644 | 0.617211 | 2.995 | 2496 | 0.003 | 6.351204 | (1.893, 21.306)
Harmonic Progression: V-I | Coarse 2 | 1.561891 | 0.638213 | 2.447 | 2500 | 0.014 | 4.767829 | (1.364, 16.667)
Harmonic Progression: x-V | Coarse 2 | 1.464869 | 0.597402 | 2.452 | 2500 | 0.014 | 4.326977 | (1.341, 13.962)
Intervallic Direction: Descent | Fine 2 | 0.967142 | 0.308031 | 3.140 | 2493 | 0.002 | 2.630415 | (1.438, 4.812)
Intervallic Direction: Ascent | Fine 1 | 1.770332 | 0.473346 | 3.740 | 2493 | <0.001 | 5.872802 | (2.321, 14.858)
Intervallic Direction: Ascent | Fine 2 | 1.732640 | 0.432869 | 4.003 | 2493 | <0.001 | 5.655566 | (2.420, 13.217)
Intervallic Direction: Ascent | Coarse 1 | 1.678095 | 0.396861 | 4.228 | 2493 | <0.001 | 5.355344 | (2.459, 11.662)
Intervallic Direction: LT-Tonic | Fine 1 | -1.576431 | 0.407705 | -3.867 | 2493 | <0.001 | 0.206711 | (0.093, 0.460)
Steps/Emb. Steps: Descent | Fine 1 | 0.641761 | 0.204140 | 3.144 | 2488 | 0.002 | 1.899823 | (1.273, 2.835)
Steps/Emb. Steps: Descent | Fine 2 | 0.663260 | 0.209690 | 3.163 | 2488 | 0.002 | 1.941109 | (1.287, 2.928)
Steps/Emb. Steps: Descent | Coarse 2 | -0.663897 | 0.240564 | -2.760 | 2492 | 0.006 | 0.514841 | (0.321, 0.825)
Steps/Emb. Steps: Emb. Ascent | Fine 1 | 0.771459 | 0.287521 | 2.683 | 2488 | 0.007 | 2.162920 | (1.231, 3.801)
Steps/Emb. Steps: Emb. Ascent | Fine 2 | 0.811606 | 0.264175 | 3.072 | 2488 | 0.002 | 2.251521 | (1.341, 3.780)
Duration Change | Coarse 1 | 0.980611 | 0.106508 | 9.207 | 2503 | <0.001 | 2.666084 | (2.164, 3.285)
Duration Change | Coarse 2 | 1.083713 | 0.220590 | 4.913 | 2503 | <0.001 | 2.955633 | (1.918, 4.554)
Cadences: Evaded | Coarse 2 | -1.733752 | 0.646037 | -2.684 | 29 | 0.012 | 0.176621 | (0.047, 0.662)


Harmonic progression only had a significant effect in Coarse 2, suggesting that harmonic
goals only influence segmentation on a coarse grain. The increased likelihood of responding
following a dominant harmony is a bit surprising, but an interaction effect shows that musicians
are less likely than non-musicians to respond when that feature is present. Structural features of
the movement, like the standard ) over V at the end of the development, may also account for
this result.
The approach to the last note of a segment is also significant, but not consistent between
trials. A descending stepwise melodic line into the last note predicts responses in both fine trials,
but a more general descending melodic contour only predicts responses in Fine 2. The opposite
effect occurs in the coarse condition, where participants are less likely to respond to a descending
stepwise linea feature that in this particular movement is associated more with cadential
articulations within the exposition and recapitulation than with the ends of these sections. A
significant interaction shows that non-musicians are mostly responsible for this effect in the
coarse condition; musicians exhibit no change based on the presence or the absence of this
feature. Particular compositional characteristics of this movement might account for this result:
the approach to - at the end of the exposition, development, and recapitulation (before the coda)
is not a stepwise descent, and the entire movement concludes with an ascending gesture, &/$ in
the melody.
There is a strong main effect for an ascending motion into the last note of a segment, and
an interaction shows that graduates are slightly more likely than undergraduates to respond to
this feature in the fine condition. On the other hand, while an embellished ascending line also
predicts the fine responses, graduates are much less likely to respond than undergraduates when
this feature is present. This might reflect the tendency for the less-experienced musicians to
perceive a boundary in mm. 29-30 (see Example 5.16) and other similar passages, in contrast to
more experienced musicians.72 This moment of silence interrupts the ongoing phrase and is
preceded by an embellished stepwise ascent that concludes on - (supported by a dominant
harmony). Participants who are listening for the arrival of harmonic and melodic goals
probably would not perceive a boundary at this point, but participants who are responding to
changes in the surface might. This trend is verified by the negative main effect for the leading
tone. While listeners do not indicate endings in Fine 1 for the motion from &-$, an interaction
reveals that graduates are much more likely to respond to this feature.73 In general, increased
expertise seems to elevate arrival features that project a harmonic or melodic goal.

72 For instance, within this particular window in Fine 1, 11% of graduates responded, 70% of undergraduate musicians responded, and 93% of non-musicians responded.
Table 5.19: ANOVA Means for Interactions in the Arrival Feature Analysis, No. 19

Outcome variable | Feature | More-Training Group | Less Training: Absent | Less Training: Present | More Training: Absent | More Training: Present | p-value
Fine 1 |  | Musician | 0.452 | 0.467 | 0.346 | 0.524 | <0.001
Fine 2 |  | Musician | 0.364 | 0.446 | 0.381 | 0.549 | 0.019
Fine 1 |  | Musician | 0.461 | 0.440 | 0.438 | 0.275 | 0.004
Fine 1 |  | Musician | 0.430 | 0.582 | 0.390 | 0.504 | 0.001
Coarse 2 |  | Musician | 0.179 | 0.116 | 0.202 | 0.088 | 0.007
Coarse 2 | x-V | Musician | 0.165 | 0.157 | 0.199 | 0.137 | 0.018
Coarse 2 | Step Descent | Musician | 0.185 | 0.106 | 0.173 | 0.165 | 0.009
Fine 1 | Ascent | Graduate | 0.441 | 0.573 | 0.306 | 0.424 | 0.001
Fine 2 | Ascent | Graduate | 0.408 | 0.518 | 0.364 | 0.528 | 0.004
Fine 1 | Emb. Ascent | Graduate | 0.449 | 0.596 | 0.342 | 0.256 | 0.002
Fine 2 | Emb. Ascent | Graduate | 0.419 | 0.508 | 0.410 | 0.322 | 0.003
Fine 1 | LT-Tonic | Graduate | 0.469 | 0.458 | 0.307 | 0.506 | <0.001
Fine 2 | LT-Tonic | Graduate | 0.430 | 0.440 | 0.371 | 0.605 | 0.002
Fine 1 | Duration | Musician | 0.434 | 0.503 | 0.344 | 0.532 | 0.007
Fine 1 | PAC | Graduate | 0.460 | 0.491 | 0.358 | 0.246 | <0.001
Fine 2 | PAC | Graduate | 0.431 | 0.430 | 0.416 | 0.345 | 0.01
Coarse 1 | PAC | Graduate | 0.169 | 0.200 | 0.128 | 0.044 | 0.001

Surprisingly, though, none of the cadences were predictive at the p < 0.02 level,74 but
there are interactions involving cadences and graduates in almost every trial. While
undergraduate responses show little (Fine 2) or no (Fine 1 and Coarse 1) influence from a PAC,
graduates consistently are less likely than undergraduates to respond in a window with a PAC
(note the decreasing means at the bottom of Table 5.19 for graduates).75 This seems
counterintuitive given the previous results that indicated that tonal structural features predict
endings. As will be discussed in more detail below, this movement had a large number of
external phrase extensions (i.e., extra material following the cadence but occurring before the
end of the phrase). Graduates, who are presumably more familiar with Mozart's style, would be
more likely to wait until the end of the phrase extension (which extends the cadence) before
indicating a response.76 Furthermore, some of the PACs in this movement aren't as strong as
other PACs. Perhaps musicians are choosier than non-musicians about which PACs indicate
a boundary, as indicated in Table 5.20. This table displays the percentage of responses for three
windows in the passage from mm. 70-87. Graduates do not respond at all to the weaker PAC in
m. 73, while everyone responds in greater numbers to the cadential arrival in m. 77. Following
this arrival is a ten-measure phrase extension prolonging the tonic harmony. At the conclusion of
this passage, everyone is much more likely to indicate a boundary.

73 This particular arrival feature might account for graduates responding more to an ascending motion.
74 There is a main effect for the PAC in Coarse 1: t(2465) = 2.24, p = 0.025, OR = 1.54.
75 This trend continues in the Coarse 2 trial, but the interaction is only significant at p = 0.046, meaning the difference between the two groups has a greater likelihood of occurring by chance.
Table 5.20: Percentage of Responses at PACs in Fine 2, No. 19

Window Location | Undergraduates | Graduates
m. 73 | 29% | 0%
m. 77 | 50% | 56%
m. 87 | 67% | 89%

Table 5.21 reveals no main effects for scale degree at p < 0.02 in String Quartet No. 21;
however, $ does significantly predict the fine endings.77 Since there was no subject-group
interaction present with this feature (see Table 5.22), there might be a feature interaction, where
subjects respond to $ only when another feature is present. The significant main effect in the fine
condition for the presence of a tonic chord also suggests that this may be the case. There is also a
main effect for the presence of a dominant chord in this condition, but the odds ratio for the tonic
chord is twice that of the dominant.78 Harmonic motion into a tonic harmony is only significant
in Coarse 1, but the presence of a V-I progression correlates with a much higher response rate for
the graduate subject group in the fine condition; this is due mostly to a higher baseline for the
undergraduates, implying that undergraduates are less discriminating in their responses (the same
pattern occurs in Coarse 2, but with the musicians subject group; see Table 5.22). While
musicians are more likely to respond to a V-I progression, they are less likely than non-musicians
to respond to a progression that terminates on the dominant in the fine condition.
However, in Coarse 2 the opposite is true: both groups again tend not to respond to an ending on
the dominant, but this effect is lessened for musicians, meaning that they are more likely than
non-musicians to respond. This may relate to the main effect of the HC in the coarse condition,
suggesting that cadential goals influenced segmentation more in the coarse condition than in the
fine condition.

76 I only noted the cadential arrivals in my coding even though the phrase extensions carry the cadential function until the end of the phrase.
77 In Fine 1, t(1201) = 2.00, p = 0.046, OR = 4.46, and in Fine 2, t(1201) = 2.06, p = 0.039, OR = 8.01. Despite a high odds ratio for this feature, the large standard error reveals the high variability in the data, increasing the p value.
78 Due to the large standard error, these ratios aren't significantly different.
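The "twice that of the dominant" comparison can be checked from the two Fine 1 harmony coefficients in Table 5.21; which coefficient belongs to the tonic chord and which to the dominant is an assumption of this sketch, inferred from the reported odds ratios. Consistent with footnote 78, the resulting ratio of roughly 1.9 is descriptive only, since the large standard errors mean the two odds ratios are not significantly different:

```python
import math

# Fine 1 coefficients from Table 5.21 assumed here to correspond to the
# tonic-chord and dominant-chord features.
tonic_or = math.exp(2.783041)     # ~16.17
dominant_or = math.exp(2.123094)  # ~8.36
print(round(tonic_or, 2), round(dominant_or, 2), round(tonic_or / dominant_or, 2))
```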
Unlike the previous movement, where there were no main effects for the presence of the
cadence, participants more consistently responded to cadential articulation in Mozarts String
Quartet No. 21. Across the board, PACs are significant, and in both coarse conditions there was
also a main effect for the IAC and HC, although the odds ratio for the HC is significantly lower
than for the PAC. This movement has considerably fewer phrase extensions, so a majority of the
time the cadence coincides with the end of the phrase.
This is the only movement that exhibits a consistent main effect for a stepwise descent.
Although a general downward approach to an ending is not significant, a stepwise descent, even
when embellished, predicts increased responses across all four trials.79 This may reflect the
melodic construction of this piece, where clearly defined subphrases end with a stepwise descent
(m. 2) or an embellished stepwise descent (m. 4) (see Example 5.12). As in the previous
movement, a general upward contour, including the motion from the leading tone to the tonic,
significantly predicts the absence of a response in two of the trials. Again, musicians are less
likely than non-musicians to perceive an ending when the line ascends, but they are more likely
than non-musicians to perceive an ending when the leading tone ascends to tonic. A stepwise
ascent is predictive in three trials, while an embellished stepwise ascent is only significant in
Fine 2 (an example of an embellished stepwise ascent appears in mm. 5 and 6 of Example 5.12).

79 An interaction reveals that this effect for a descending stepwise line, however, is absent for graduates in the Coarse 2 trial, where this feature does not influence their responses compared with those of the undergraduates.

Table 5.21: Mixed Logit Regression Analysis: Arrival Features, No. 21

Feature | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Harmony: I | Fine 1 | 2.783041 | 0.849425 | 3.276 | 1215 | 0.001 | 16.168114 | (3.054, 85.594)
Harmony: I | Fine 2 | 2.793356 | 1.131480 | 2.469 | 1217 | 0.014 | 16.335743 | (1.774, 150.403)
Harmony: V | Fine 1 | 2.123094 | 0.794719 | 2.672 | 1215 | 0.008 | 8.356952 | (1.757, 39.739)
Harmonic Motion: V-I | Coarse 1 | 1.257265 | 0.407899 | 3.082 | 1213 | 0.002 | 3.515791 | (1.579, 7.827)
Direction: LT-Tonic | Coarse 1 | -1.720381 | 0.545702 | -3.153 | 1209 | 0.002 | 0.178998 | (0.061, 0.522)
Direction: Ascent | Fine 2 | -2.448702 | 0.816184 | -3.000 | 1206 | 0.003 | 0.086406 | (0.017, 0.429)
Direction: Ascent | Coarse 2 | -2.937525 | 0.957314 | -3.069 | 1209 | 0.002 | 0.052997 | (0.008, 0.347)
Steps/Emb. Steps: Descent | Fine 2 | 1.507373 | 0.284820 | 5.292 | 1201 | <0.001 | 4.514854 | (2.582, 7.895)
Steps/Emb. Steps: Descent | Coarse 1 | 1.058496 | 0.309171 | 3.424 | 1203 | <0.001 | 2.882034 | (1.571, 5.286)
Steps/Emb. Steps: Descent | Coarse 2 | 1.463821 | 0.323258 | 4.528 | 1203 | <0.001 | 4.322443 | (2.292, 8.150)
Steps/Emb. Steps: Embellished Descent | Fine 1 | 1.418066 | 0.567440 | 2.499 | 1201 | 0.013 | 4.129126 | (1.356, 12.571)
Steps/Emb. Steps: Embellished Descent | Fine 2 | 1.218496 | 0.394620 | 3.088 | 1201 | 0.002 | 3.382096 | (1.559, 7.336)
Steps/Emb. Steps: Embellished Descent | Coarse 1 | 1.616638 | 0.553836 | 2.919 | 1203 | 0.004 | 5.036132 | (1.699, 14.928)
Steps/Emb. Steps: Ascent | Fine 1 | 1.184301 | 0.331546 | 3.572 | 1201 | <0.001 | 3.268403 | (1.705, 6.264)
Steps/Emb. Steps: Ascent | Fine 2 | 2.014063 | 0.294312 | 6.843 | 1201 | <0.001 | 7.493704 | (4.206, 13.350)
Steps/Emb. Steps: Ascent | Coarse 1 | 1.228347 | 0.453447 | 2.709 | 1203 | 0.007 | 3.415580 | (1.403, 8.315)
Steps/Emb. Steps: Ascent | Coarse 2 | 0.712774 | 0.298505 | 2.388 | 1203 | 0.017 | 2.039641 | (1.136, 3.664)
Steps/Emb. Steps: Emb. Ascent | Fine 2 | 1.240258 | 0.399695 | 3.103 | 1201 | 0.002 | 3.456504 | (1.578, 7.572)
Cadence: PAC | Fine 1 | 1.592021 | 0.484712 | 3.284 | 1201 | 0.001 | 4.913670 | (1.898, 12.718)
Cadence: PAC | Fine 2 | 2.365207 | 0.487669 | 4.850 | 1201 | <0.001 | 10.646245 | (4.089, 27.716)
Cadence: PAC | Coarse 1 | 2.058855 | 0.255570 | 8.056 | 1203 | <0.001 | 7.836993 | (4.747, 12.939)
Cadence: PAC | Coarse 2 | 1.955235 | 0.270974 | 7.216 | 1204 | <0.001 | 7.065582 | (4.152, 12.024)
Cadence: IAC | Fine 2 | 2.449530 | 0.991296 | 2.471 | 1201 | 0.014 | 11.582905 | (1.656, 81.000)
Cadence: IAC | Coarse 1 | 1.848909 | 0.699153 | 2.644 | 1203 | 0.008 | 6.352888 | (1.612, 25.044)
Cadence: IAC | Coarse 2 | 1.854143 | 0.767237 | 2.417 | 1204 | 0.016 | 6.386224 | (1.417, 28.774)
Cadence: HC | Coarse 1 | 1.314523 | 0.414939 | 3.168 | 1203 | 0.002 | 3.722974 | (1.649, 8.403)
Cadence: HC | Coarse 2 | 1.419039 | 0.484763 | 2.927 | 1204 | 0.003 | 4.133147 | (1.597, 10.699)
Cadence: Evaded Cadence | Fine 2 | -1.82185 | 0.699290 | -2.605 | 1201 | 0.009 | 0.161726 | (0.041, 0.638)

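The odds ratios and confidence intervals in these regression tables are direct transformations of each coefficient and its standard error. A minimal sketch, using the normal-approximation critical value of 1.96 (the reported values were presumably computed from the exact t critical value for the given degrees of freedom, so the interval bounds match only to within rounding):

```python
import math

def odds_ratio_ci(coef, se, crit=1.96):
    """Exponentiate a logit coefficient into an odds ratio,
    with a Wald-style 95% confidence interval."""
    return math.exp(coef), (math.exp(coef - crit * se),
                            math.exp(coef + crit * se))

# First row above (Fine 1): coefficient 2.783041, standard error 0.849425
or_, (lo, hi) = odds_ratio_ci(2.783041, 0.849425)
# or_ is approximately 16.168, matching the reported odds ratio
```

An odds ratio above 1 (positive coefficient) means the feature makes a segmentation response more likely; an odds ratio below 1 (negative coefficient) means it makes a response less likely.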

Table 5.22: ANOVA Means for Interactions in the Arrival Feature Analysis, No. 21

Outcome variable | Feature | Group with more training | Less training: feature absent | Less training: feature present | More training: feature absent | More training: feature present | p-value
Fine 1   | V-I          | Graduate | 0.436 | 0.607 | 0.236 | 0.595 | 0.016
Fine 2   | V-I          | Graduate | 0.424 | 0.595 | 0.231 | 0.611 | 0.003
Coarse 2 | V-I          | Musician | 0.143 | 0.291 | 0.066 | 0.237 | <0.001
Fine 2   | x-V          | Graduate | 0.580 | 0.359 | 0.490 | 0.208 | 0.011
Fine 2   | x-V          | Musician | 0.536 | 0.366 | 0.569 | 0.283 | 0.006
Coarse 2 | x-V          | Musician | 0.260 | 0.112 | 0.153 | 0.095 | <0.001
Coarse 2 | Step Descent | Graduate | 0.110 | 0.271 | 0.086 | 0.091 | 0.006
Fine 1   | Ascent       | Musician | 0.563 | 0.304 | 0.593 | 0.140 | 0.007
Coarse 1 | LT-Tonic     | Musician | 0.206 | 0.024 | 0.153 | 0.350 | 0.009
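Each interaction row in a table like this can be read as a difference of differences: how much the mean response rate rises when the feature is present, compared across the two experience groups. A small illustration using the first row (Fine 1, V-I, graduates as the more-trained group); the numbers come from the table, and the computation itself is just subtraction:

```python
# ANOVA means from the first row of Table 5.22 (Fine 1 trial, V-I feature)
less_training = {"absent": 0.436, "present": 0.607}  # less musical training
more_training = {"absent": 0.236, "present": 0.595}  # more musical training (graduates)

gain_less = less_training["present"] - less_training["absent"]  # about 0.171
gain_more = more_training["present"] - more_training["absent"]  # about 0.359

# The interaction: the more-trained group's response rate rises roughly
# twice as much when a V-I motion is present, despite a lower baseline.
interaction = gain_more - gain_less  # about 0.188
```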

Example 5.12: Mozart, String Quartet No. 21, second movement, mm. 1-8 (violin 1)
The Type 1 Windows are annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.

Change Features: There is an overall lack of main effects for change features in the Mozart movements, indicating that change alone is insufficient to create a sense of ending. The amount of surface change in both movements is much less than in the Bartók movements, and thus presumably plays a smaller role in the segmentation task. The change features were analyzed in the same groups as in the Bartók analysis: silences, orchestration changes, and other changes. The significant results from these analyses are shown in Tables 5.23 and 5.24. Fewer change features positively influence the results compared with the Bartók analyses: in fact, String Quartet No. 19 has positive coefficients only for complete silence and a change in dynamic level. Surprisingly, String Quartet No. 21 does not have an effect for complete silence, but other changes influence participants' responses.


Table 5.23: Mixed Logit Regression Analysis: Change Features, No. 19
Features analyzed: Silence, Melodic Silence, Other Instrument Silence, New Instrument, Register, Dynamics.

Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Fine 1   |  1.299193 | 0.232535 |  5.587 | 2493 | <0.001 | 3.666337 | (2.324, 5.785)
Fine 2   |  1.110595 | 0.282217 |  3.935 | 2493 | <0.001 | 3.036164 | (1.746, 5.281)
Coarse 1 |  1.803089 | 0.274909 |  6.559 | 2493 | <0.001 | 6.068363 | (3.540, 10.404)
Coarse 2 |  2.011653 | 0.312752 |  6.432 | 2493 | <0.001 | 7.475663 | (4.048, 13.804)
Fine 1   | -0.691147 | 0.289167 | -2.390 | 2493 | 0.017  | 0.501001 | (0.284, 0.883)
Coarse 1 | -1.830045 | 0.532366 | -3.438 | 2493 | <0.001 | 0.160406 | (0.056, 0.456)
Fine 1   | -1.103671 | 0.177798 | -6.207 | 2493 | <0.001 | 0.331651 | (0.234, 0.470)
Fine 2   | -0.980578 | 0.191200 | -5.129 | 2493 | <0.001 | 0.375094 | (0.258, 0.546)
Coarse 1 | -0.907200 | 0.219563 | -4.132 | 2493 | <0.001 | 0.403653 | (0.262, 0.621)
Fine 1   | -0.975525 | 0.354085 | -2.755 | 2493 | 0.006  | 0.376995 | (0.188, 0.755)
Fine 2   | -1.638651 | 0.426600 | -3.841 | 2493 | <0.001 | 0.194242 | (0.084, 0.448)
Coarse 1 | -0.756947 | 0.259056 | -2.922 | 2498 | 0.004  | 0.469096 | (0.282, 0.780)
Coarse 2 | -0.674887 | 0.270200 | -2.498 | 2498 | 0.013  | 0.509214 | (0.300, 0.865)
Fine 1   |  0.770116 | 0.179109 |  4.300 | 2493 | <0.001 | 2.160017 | (1.520, 3.069)
Coarse 1 |  1.664430 | 0.293689 |  5.667 | 2498 | <0.001 | 5.282661 | (2.970, 9.397)
Coarse 2 |  2.220171 | 0.317209 |  6.999 | 2498 | <0.001 | 9.208909 | (4.944, 17.154)

Table 5.24: Mixed Logit Regression Analysis: Change Features, No. 21

Feature      | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Non-mel Sil  | Coarse 1 |  0.861873 | 0.187515 |  4.596 | 1206 | <0.001 | 2.367591 | (1.639, 3.420)
New Instr    | Fine 2   | -0.903296 | 0.293516 | -3.078 | 1211 | 0.002  | 0.405232 | (0.228, 0.721)
New Mel      | Fine 2   |  0.610553 | 0.255521 |  2.389 | 1211 | 0.017  | 1.841450 | (1.115, 3.040)
Register     | Fine 1   |  0.756718 | 0.239491 |  3.160 | 1211 | 0.002  | 2.131269 | (1.332, 3.410)
Dynamics     | Coarse 1 |  0.971746 | 0.257615 |  3.772 | 1211 | <0.001 | 2.642554 | (1.594, 4.381)
Dynamics     | Coarse 2 |  1.016433 | 0.248980 |  4.082 | 1211 | <0.001 | 2.763320 | (1.695, 4.504)

The negative coefficients in Mozart's String Quartet No. 19 may reflect that participants were using cues other than surface changes in the segmentation task, and these changes tend to distinguish larger units. Consider, for instance, the changes of register in mm. 16 and 17 (refer to Example 5.16): while the first follows the end of a phrase, the second does not. Accordingly, participants are much more likely to respond in m. 16 than in m. 17 (in the Coarse 1 trial, 45% of the participants respond in m. 16 compared with only 9% in m. 17). However, a pair of subject-group interactions shows that participants with formal musical training are more likely than non-musicians to indicate endings when the register changes.80 This probably reflects the tendency for musicians to be more consistent than non-musicians in their responses, but it does not change the overarching trend that register change by itself probably does not influence segmentation, especially in the coarse condition. Also, as seen in the third movement of Bartók's quartet, a thinning of the texture is not sufficient to perceive a boundary. An analysis examining feature interactions would probably reveal that when these change features are combined with a cadential gesture, participants are more likely to respond to them.
Complete silence, on the other hand, is highly predictive in both conditions, especially in the coarse condition, where the odds ratio is significantly higher. Besides silence, only dynamic change elicits a positive main effect, and this occurs only in the coarse condition. This reflects a feature interaction between cadential arrival followed by the introduction of a new theme and a change in dynamics: Mozart is much more likely to change the dynamics than to change the register at these points. Further, in Fine 2, despite the absence of a main effect for dynamics, musicians are much more likely than non-musicians to respond to a change in dynamics, while the opposite is true in Coarse 1, where musicians are less likely to respond.81 This suggests that musicians may be more influenced by these sorts of change features at a finer grain of segmentation, while arrival features project coarse boundaries.
The second movement of String Quartet No. 21 is the first movement examined here in which complete silence does not predict a response. This may be a result of the compositional design of the movement. Complete silence never occurs at cadential points; instead, it articulates only subphrase divisions, in mm. 2 and 10 as well as at the corresponding points in the return of the opening section (mm. 44 and 52). As in the previous movements, melodic silence is still not a strong predictor of the results, but a thinning of the texture with a non-melodic silence is significant in the Coarse 1 trial (Table 5.24). An interaction shows that graduates are more
80 For non-musicians, in Fine 2 the ANOVA means increased from 0.383 to 0.433 when the register changed, compared with the larger percent increase from 0.428 to 0.492 for musicians, and in Coarse 1 the ANOVA means increased from 0.168 to 0.206 when the register changed, compared with the larger percent increase from 0.089 to 0.170 for musicians.
81 For non-musicians, in Fine 2 the ANOVA means increased from 0.367 to 0.433 when the dynamics changed, compared with the larger percent increase from 0.346 to 0.577 for musicians, and in Coarse 1 the ANOVA means increased from 0.080 to 0.433 when the dynamics changed, compared with the smaller percent increase from 0.062 to 0.170 for graduates.


likely to respond to this same feature in the fine condition than are undergraduates, who are less likely to change their behavior when the feature is present.82 This is opposite from the effect found in the previous Mozart movement, perhaps reflecting a different compositional structure in which a thinning of the texture co-varies with new sections.

The other change features that predict a segmentation response include a new melodic instrument and changes of register or dynamics, but none of these features is consistently significant across trials. Generally, the mere introduction of a new instrument predicts the absence of a response for all participants in the Fine 2 trial (as in the previous movement), but when the orchestration of the melodic line changes in this same trial, participants respond more often. This suggests that the entrance of a new instrument is not by itself important for segmentation, unlike a change of timbre in the melodic line. In this movement, as in the third movement of Bartók's Fourth String Quartet, the violin and cello exchange the melodic role several times (usually at the beginning of a new phrase), perhaps accounting for this result. A registral change increases responses in the fine condition, while a dynamic change increases responses only in the coarse condition. An interaction reveals that graduates in Fine 2 and musicians in Coarse 2 are more likely than other subjects to respond to dynamic changes.83 As before, Mozart tends to shift dynamics at important structural points. In this movement, the second phrase begins much louder than the sotto voce first phrase; the contrasting B section suddenly begins softer following the loud cadential gesture; and so forth.

In both movements, arrival features best predicted listener responses, especially for participants with more musical experience. As expected, many of the features associated with tonal closure influenced segmentation. The change features, on the other hand, do not consistently predict responses, suggesting one of two things: (1) because listeners have a familiar musical syntax on which to base their segmentation, they are less swayed by surface changes; or (2) listeners are influenced by surface changes, but only when these changes are combined with
82 For undergraduates, the ANOVA means increased from 0.497 to 0.504 in Fine 1 and from 0.483 to 0.496 in Fine 2 when the texture thinned, compared with the larger percent increase from 0.325 to 0.475 in Fine 1 and 0.333 to 0.465 in Fine 2 for graduates.
83 For undergraduates, in Fine 2 the ANOVA means increased from 0.478 to 0.524 when the dynamics changed, compared with the larger percent increase from 0.319 to 0.603 for graduates. For non-musicians, in Coarse 2 the ANOVA means increased from 0.175 to 0.296 when the dynamics changed, compared with the larger percent increase from 0.080 to 0.346 for musicians.


another feature. These feature interactions are not captured by the current data analysis. My own grouping analysis can represent how these features may interact in a segmentation task. As already noted, the data corroborate the hierarchical levels identified by my grouping analysis, but this next set of analyses explores how well each of these ending types (subphrase, phrase, and section) predicts responses. Before looking at the data, I will briefly describe which musical features contributed to my own grouping analysis.
Grouping Analysis
Bartók, String Quartet No. 4, Third Movement: As discussed previously, because the notions of phrase and subphrase are less objective in twentieth-century repertoire than in common-practice repertoire, I first divided the movement into sections, defined mainly by changes in texture, melodic content, and melodic instrument. Formally, I hear this movement as a series of varied repetitions of a theme organized into a larger ternary structure: the opening A section (mm. 1–34), a contrasting B section (mm. 34–55), a return to the A material (mm. 55–63), plus a coda that incorporates elements from both previous sections.84

The analysis of phrases, and especially subphrases, in this style is quite open to interpretation. After dividing the movement into sections, I further divided it into eight phrase-like units, mostly informed by changes in the sustained chord and silence in the entire texture. At the beginning of this movement, a diatonic chord leads into the cello's first melodic entrance in m. 6. I interpret the first five measures as a prefix to the beginning of the phrase, and in order to create a well-formed hierarchical structure, I coded it as a subphrase within the larger phrase. This analysis does not capture the introductory character of the opening five measures, nor does it convey the feeling of a beginning at m. 6. Alternatively, I could have separated the first five measures from the material starting in m. 6 by creating two phrases; however, the first five measures do not contain a sense of a beginning, middle, and end. This analytical choice of interpreting the opening five measures as part of a larger phrase is consistent with my subsequent choices to designate phrase beginnings at the chord changes throughout this movement.

84 Other formal designations are also possible, such as sonata form and strophic construction (Bayley 2000, 363–64). While a decision about the formal designation is not essential to this study, it is important to note the movement's range of possible interpretations.


The body of this eight-measure phrase (mm. 6–13) further divides into 4+4 subphrases. Measure 10 introduces a new melodic gesture following an inversion of the falling-fourth cadential figure used throughout this movement to mark endings in both A sections. Other analytic writings support this sense that m. 10 marks a division within a larger unit. Bayley notes a division here, and recognizes that "the background continuity of a sustained harmony is retained in order that the melody is perceived as an eight-bar entity" (2000, 375).85 Refer to Example 5.11, which shows the cello melody from mm. 6–35.
The beginning of the second phrase (mm. 14–20) poses new questions regarding the interpretation of the sustained harmony, which changes on the fourth beat of m. 13, preceding the cello entrance on the third beat of m. 14. I could interpret the music between the presentation of the chord and the cello melody as a prefix to the ensuing melodic entrance of the cello, comparable to mm. 1–5. However, unlike the beginning, these three beats seem too short to constitute a subphrase. Another alternative is to consider these three beats as belonging to neither phrase (i.e., merely existing between phrases), but the resulting phrase analysis will not be well-formed unless this unit is a phrase itself, which would be inconsistent with the analysis of the opening five measures. The remaining alternative is not to divide the subphrase (mm. 13–17) into smaller units, recognizing that the beginning of the phrase may not exactly coincide with the melodic beginning. In this case, and for similar instances in this movement, I chose this third alternative, which does not explicitly mark the melodic entrance. Similar questions regarding beginnings and endings arise in the analysis of the fifth movement of this quartet.
The division of the second phrase into subphrases creates a structure similar to that of the first phrase, where an inversion of the material from m. 10 motivates a subphrase division in m. 17, beat 3. Dividing the last phrase of this section (mm. 21–35) is more difficult than dividing the first two phrases. Bayley (2000) places her only subphrase division at m. 31, drawing a connection between the cadential material found in m. 19, beat 4 through m. 21, beat 3 and the material in mm. 29–30. I agree that mm. 29–30 are cadential, so I also placed a new subphrase at m. 31. However, I interpreted the material in mm. 31–34 as a cadential extension, not as a consequent response to the material in mm. 21–31. The unsettled character of mm. 31–34 transitions to the B
85 Bayley uses different terminology at this point. From her perspective, each of the phrases I note is in fact a period containing its own antecedent and consequent phrases (which I describe as subphrases). We agree about the boundaries of these entities, but we interpret the music on slightly different hierarchical levels.


section, which begins in m. 35. I also heard another subphrase division at the end of m. 25, occurring around the same point in the phrase as the subphrase divisions in the first two phrases. Even though m. 26 does not pick up the same melodic content as the subphrases that began in mm. 10 and 17, it does repeat the opening gesture from m. 22, beat 3, clearly articulating a new beginning. I did not further divide the phrase, despite the melodic silence in m. 27 (which, in my opinion, merely articulates the motivic repetitions as the melody becomes more fragmented).

The B section begins on the third beat of m. 34 and features less homogeneity between the phrases than did the preceding A section. This section also has three phrases: mm. 34–41, mm. 42–47, and mm. 47–55. The first phrase has a subphrase division on the second beat of m. 37, when the rhythmic kernel introduced in m. 35 morphs into a more melodic rendition. Also at this point, the texture of the harmonic accompaniment changes to a tremolo figure, further setting this material apart. Following a strong falling-third cadence, the second violin takes up the melody in the short second phrase, with a subphrase division coinciding with the first melodic rests in this phrase, at m. 44. The last phrase of this section (mm. 47–55) presents two contrasting ideas, organizing the phrase into three subphrases with a loose aba construction. The canonic presentation of a new, rambunctious melodic idea interrupts the "circling around C" melody in m. 50, thus initiating a new subphrase. This subphrase is short-lived; the phrase returns to its previous melodic content on the third beat of m. 51.
The third section (A' section) is a single phrase in sentential structure. A variation of the opening melody is played in inverted canon between the cello and the first violin. Each subphrase ending is marked by the descending-fourth cadential gesture in the cello. Measure 64 begins the last section of the piece, a coda that incorporates ideas from the previous sections. I did not divide this coda into subphrases: although a listener could segment this section at one of the abundant rests, I believe there is insufficient contrast to divide this phrase. Instead, the rhythmic kernel from the B section spins out over the A diatonic harmony from the A section.

Most of my interpretation of the grouping structure was based on differentiation of melodic material, a variable not included in my data analysis. These points are usually articulated by some sort of change in the musical surface or by an arrival feature, which I did include in my data analysis. A similar interaction between change and arrival features influenced my analysis of the grouping structure in the fifth movement.

Bartók, String Quartet No. 4, Fifth Movement: I divided this movement into eight large sections, outlined in Table 5.25. The thematic content of this driving movement has some similarities to sonata form, in that the material in Sections 2 and 3 returns in Sections 6 and 7 as a quasi-recapitulation after a developmental Section 5. Sections are demarcated by the introduction of a new melodic idea, new texture, or new ostinato figure. Much of the movement, in fact, employs some ostinato, which changes throughout the course of the piece. This ostinato, which continues past the end of melodic gestures, sounds like a backdrop upon which the melodic phrases are presented. Although the continuous ostinato sounding between phrase presentations does not clearly belong to the phrases on either side, I had to interpret the music between melodic gestures either as a prefix or as a suffix in order to use only well-formed hierarchical structures (as discussed in the context of the third movement). A pictorial representation of these possibilities is presented in Figure 5.7.
Usually my analytic decisions interpreted the earliest element (whether the ostinato or the melody) as the beginning of the phrase. For instance, the first phrase in the second section (mm. 11–18, reproduced in Example 5.13) begins with an ostinato figure, so the ostinato is a prefix, like the hypothetical phrase diagram in Figure 5.7b. The first phrase in the third section (Example 5.14) begins with the melody, so the accompanimental material is a suffix, like Figure 5.7a.

Even though Section 6 recapitulates melodic material from Section 2, my interpretation of the phrase structure shifts with the introduction of the strong descending-third cadence in the fifth section. In Section 2, phrases concluded with the end of the movement's primary melodic theme (Example 5.13); in Section 6, however, repeated chords from the beginning, concluding with a descending-third cadence, follow the melodic theme (Example 5.7). This cadence sounds like a stronger ending than did the conclusion of the melodic theme (Example 5.15); my phrase analysis of this section therefore looks quite different from the expositional presentation in Section 2. Because the development marked the unison descending third as a strong cadential gesture, its presence elevates the repeated chords to a subphrase within the phrase proper rather than a suffix (although the limited vocabulary I used for the sake of data analysis does not preserve the distinction between an internal subphrase and a suffix).


Table 5.25: Section divisions in Bartók, String Quartet No. 4, fifth movement

Section | Formal Function                  | Measures
1       | Introduction                     | 1–11
2       | Exposition: First Theme          | 11–101
3       | Exposition: Second Theme         | 102–121
4       | Closing material                 | 121–151
5       | Development                      | 152–238
6       | Recapitulation: First Theme      | 238–343
7       | Recapitulation: Second Theme     | 344–374
8       | Coda                             | 374–392

a.                                        b.

Figure 5.7: Possible Phrase Structure Analyses
A complete arc in the phrase diagram represents the main content of the phrase, while the incomplete arc leaning on it represents either a prefix (if it comes before) or a suffix (if it comes after). In order to create a well-formed grouping analysis, connected arcs collectively form a single phrase.
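The prefix/suffix convention of Figure 5.7 can also be modeled as a small data structure: a phrase is a body span plus an optional leading or trailing span, and the phrase's overall boundaries take in whichever extras are present. A minimal sketch in Python; the exact split of mm. 11–18 into prefix and body is an illustrative placeholder, not the dissertation's analysis:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Span = Tuple[int, int]  # (first measure, last measure), inclusive

@dataclass
class Phrase:
    body: Span
    prefix: Optional[Span] = None  # ostinato heard before the melody (Fig. 5.7b)
    suffix: Optional[Span] = None  # accompaniment left after the melody (Fig. 5.7a)

    def extent(self) -> Span:
        """Overall phrase boundaries: prefix + body + suffix heard as one unit."""
        start = self.prefix[0] if self.prefix else self.body[0]
        end = self.suffix[1] if self.suffix else self.body[1]
        return (start, end)

# Section 2's first phrase (mm. 11-18): the ostinato opens, so it is a prefix.
p = Phrase(body=(13, 18), prefix=(11, 12))
# p.extent() == (11, 18)
```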


String Quartet No. 4 by Béla Bartók
© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.
Reprinted by Permission

Example 5.13: Bartók, String Quartet No. 4, fifth movement, mm. 11–18

String Quartet No. 4 by Béla Bartók
© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.
Reprinted by Permission

Example 5.14: Bartók, String Quartet No. 4, fifth movement, mm. 102–108
My analysis of this movement was based on differentiation (as in the third movement), but context influenced my categorization of the prevailing ostinato as a beginning or an ending. The ways in which various musical features shaped my interpretation of the context are difficult to capture using only the arrival and change features from the data analysis, but the features that particularly influenced my segmentation included cadential gestures, changes of ostinato, and melodic content and repetition.


String Quartet No. 4 by Béla Bartók
© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.
Reprinted by Permission

Example 5.15: Bartók, String Quartet No. 4, fifth movement, mm. 238–249
Mozart, String Quartet No. 19 in C Major ("Dissonance"), K. 465, Fourth Movement: In my analysis of both Mozart movements, arrival features, especially those associated with tonal paradigms, principally shaped my analysis, while formal schema and phrase length also contributed to my interpretation. The primary tonal area (PTA) of this sonata-form movement resembles a rounded sectional binary form (without the repeats), in which the first sixteen measures present a periodic structure followed by an eight-measure phrase group leading into a return of the main theme (see Example 5.16). Each phrase in the opening period, an antecedent phrase ending on the dominant and the parallel consequent concluding on tonic, can be divided into a pair of clearly articulated subphrases, where the first three subphrases conclude on a dominant harmony. Following the PAC in m. 16, the next eight measures form a parallel phrase group, in which each phrase begins on the tonic but quickly moves to prolong the dominant for the remainder of the phrase. These phrases lack true cadential articulation, but the melodic repetition suggests two phrase-like units grouped together under a higher hierarchical umbrella. This emphasis on the dominant and the shorter phrase lengths rhetorically signify the beginning of the binary form's second reprise. In m. 24, the opening melody returns, but instead of immediately concluding with a PAC, the phrase is expanded by repeating the rising-fourth gesture up a step in the first subphrase (mm. 28–29), followed by two beats of complete silence before initiating the cadential formula.


Following the PTA, a typical independent transition begins, with a cadential arrival on the dominant of V (a D-major chord) in m. 49, followed by a six-measure phrase extension prolonging the cadential harmony. While the goal of this harmonic motion is achieved in m. 49, I could not consider m. 49 to be the end of the phrase because doing so would have left an unacceptable gap in the hierarchical analysis: mm. 49–54 do not constitute a phrase (there is no harmonic motion), and the next phrase does not begin until after the medial caesura, with the initiation of the STA. Because mm. 49–54 follow the structural ending, they are not technically a subphrase, but in order to form a nested hierarchical analysis that acknowledges both the cadential arrival in m. 49 and the formal phrase ending in m. 54, it seemed reasonable to designate both mm. 34–49 and mm. 49–54 as subphrases. Both subphrases are nested within a single larger phrase, forming the entire transition section.

I divided the STA into three sections. The first, mm. 54–88, remains in G major (the dominant of the original key). The second, mm. 88–103, tonicizes E♭ major (♭VI of G) before returning to G for the cadence in m. 103, and this section elides into the third, which continues until the end of the exposition (refer to Figure 5.8, illustrating the division of the entire exposition into sections, phrases, and subphrases). In this well-formed analysis, all music must be included in a phrase and in a section, but it need not belong to a subphrase because not every phrase is divided into subphrases. For instance, the phrase that begins the STA (mm. 54–61) does not easily divide into smaller units, and so the subphrase level is absent from my analysis at this point.
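The well-formedness constraints invoked here (every phrase contained in a section, phrases tiling each section without gaps, and subphrases optional but always nested within a phrase) can be stated as a mechanical check. A minimal sketch in Python, with half-open (start, end) spans; the spans in the example are illustrative placeholders rather than the actual analysis:

```python
def well_formed(sections, phrases, subphrases):
    """Check the grouping rules used in the analysis: phrases tile each
    section exactly; subphrases are optional but must nest in a phrase."""
    # Every phrase must fall inside some section.
    for p in phrases:
        if not any(s[0] <= p[0] and p[1] <= s[1] for s in sections):
            return False
    # Phrases must cover each section without gaps or overlaps.
    for s in sections:
        inside = sorted(p for p in phrases if s[0] <= p[0] and p[1] <= s[1])
        if not inside or inside[0][0] != s[0] or inside[-1][1] != s[1]:
            return False
        if any(a[1] != b[0] for a, b in zip(inside, inside[1:])):
            return False
    # Every subphrase must be contained in some phrase.
    return all(any(p[0] <= u[0] and u[1] <= p[1] for p in phrases)
               for u in subphrases)

# Illustrative spans: one section, three phrases tiling it, two subphrases.
sections = [(1, 34)]
phrases = [(1, 13), (13, 20), (20, 34)]
subphrases = [(1, 6), (6, 13)]
# well_formed(sections, phrases, subphrases) -> True
```

A phrase analysis that leaves measures between phrases, like the gap between the cadential arrival and the next formal beginning discussed above, fails this check, which is why such spans must be folded in as prefixes, suffixes, or extra subphrases.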
To this point, I have used subphrases to account for external phrase extensions, but the subphrase level is also used to account for internal phrase expansions, including those caused by an evaded cadence. Measures 118–135 present one such use of subphrases (see Example 5.17). The cadential arrival occurring in m. 125 follows a deceptive motion in m. 122. This evaded cadence initiates an internal expansion, dividing the phrase into two shorter subphrases. Three subphrases follow the PAC in m. 125: the first two (mm. 125–129 and 129–131) extend the phrase by repeating the cadential formula, and the third detonicizes G major with the introduction of F♮, in preparation for the repeat of the exposition (which is not taken in the recording I used for this study) or for the C-minor beginning of the development section. Along with illustrating the division of the exposition into smaller groups, Figure 5.8 also shows the

location of cadential arrivals in relation to phrase endings.86 My analysis of the recapitulation followed the model established in the exposition, since the recapitulation essentially replicates the formal construction of the exposition (except that STA2 is expanded with a deceptive motion leading to a tonicization of ♭II in m. 306). Both remaining parts of the movement, the short development and the coda, consist of a single larger section divided into three phrases.

Example 5.16: Mozart, String Quartet No. 19, fourth movement, mm. 1–34 (violin 1 and cello)

86 I could have included additional grouping levels between the phrase level and the section level, since not all phrase endings are equivalent. One such example occurs in mm. 70–77, where the PAC in m. 73 is weaker than the PAC in m. 77, creating a longer eight-measure unit (not counting the phrase extension that follows). While such an addition would better reflect the grouping structure of the movement, it was omitted in order to simplify the data analysis.


PTA:  8 HC | 12 | 16 PAC | 20 | 24 | 29 | 34 PAC
TR:   49 HC | 54
STA1: 61 HC | 69 PAC | 73 (PAC) | 77 PAC
STA2: 87 | 91 | 95 | 103 PAC | 117 PAC | 121 | 125 PAC | 129 | 131 | 135

Figure 5.8: Mozart, String Quartet No. 19, Fourth Movement: Grouping Analysis of the Exposition
The top line shows phrase boundaries; the bottom line shows subphrase boundaries.

Example 5.17: Mozart, String Quartet No. 19, fourth movement, mm. 118–135 (violin 1 and cello)


Formal schema (including typical phrase length) and arrival features (such as cadential articulation and harmonic goals) influenced my segmentation in this movement. In my analysis, endings were marked by typical cadential paradigms, and beginnings were marked by new melodic material or by a repetition of previous melodic material. While surface changes in dynamics, articulation, and register may coincide with a beginning, they did not play a major role in my grouping analysis. Given that the second movement of String Quartet No. 21 shares the same tonal syntax and is written in the same style, I used a similar procedure for my analysis of that movement.
Mozart, String Quartet No. 21 in D Major, K. 575, Second Movement: I divided this andante movement into five large sections: the primary theme (mm. 1–19), transition (mm. 20–33), secondary theme and retransition (mm. 34–42), return of the primary theme (mm. 43–61), and closing material with a codetta (mm. 62–73). The A-major primary theme is a two-phrase parallel period in which both phrases exemplify sentential structure and the second phrase evades an expected cadence in m. 16 before arriving on the tonic in m. 19 (refer back to Example 5.2). As in the previous movement, I analyzed the evaded cadence as completing a subphrase (mm. 12–16) and initiating a new subphrase (mm. 16–19). Each part of the sentential structure (both parts of the presentation and the entire continuation) is also analyzed as a subphrase.

The transition begins with a sequence that passes the melodic material between the instruments, eventually making its way to the dominant (E major). Despite the complete I–V–I progression in mm. 20–23 (A major) and in mm. 24–27 (F♯ minor), I interpret every two measures as a subphrase, synchronized with the sequential repetition of the melodic line. The cadential arrival occurs in m. 31 on a HC in E major, but it is immediately extended by two external phrase expansions (analyzed as subphrases) until the end of the formal phrase in m. 33. The secondary theme in E major, which is only two phrases long, follows this transition. The first violin plays the melody in the first phrase, ending with an IAC, and the cello picks up the melody in m. 38, reaching a PAC in m. 41. E major is then detonicized with the introduction of a D♮, setting up the return of the primary theme (see Example 5.18). Because this retransition is so brief, I heard it as an extension of the phrase begun in m. 38, yet another example of the cadence occurring before the end of the phrase. The return of the primary theme mirrors the formal structure from the beginning, complete with an evaded cadence, and m. 62 initiates the closing section of the piece. The second of the two phrases in this section is extended past the cadential arrival in m. 69. The two subphrases that follow repeat the cadential pattern, rhetorically serving as a codetta to this movement.


Example 5.18: Mozart, String Quartet No. 21, second movement, mm. 40–44
Data Analysis: While my decisions were based on many listenings and conscious
reflection, participants in this study had limited experience with these compositions and did not
have the opportunity to reflect on their segmentation decisions. Despite these differences, the
participants' segmentation decisions significantly mirrored my own, especially as musical
expertise increased. This set of analyses investigates the extent to which analytical ending types,
defined by my grouping analysis, predict subject responses. The results from these regressions
are in Table 5.26.
Among all participants, phrase endings consistently predict responses. Sometimes this
effect is particularly strong; for instance, the coarse trials in Bartók's third movement have
exceedingly high odds ratios. Subject group interactions reveal that the group with more musical
training is more likely to respond at phrase endings (with the exception of Bartók's fifth
movement).43 Most of these results are driven by the less experienced musicians having a
relatively higher baseline mean for when a phrase ending is absent. This suggests that
non-musicians are in less agreement with my analysis, and, looking at the results as a whole,
they tend to be less discriminating overall.

43 This result is inexplicable, especially since all section endings corresponded with a phrase ending and the
main effect for section ending in this movement indicates that participants respond consistently to that feature.
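The odds ratios and confidence intervals reported in Table 5.26 follow directly from the logit coefficients and standard errors: OR = exp(β), with an approximate 95% interval of exp(β ± crit·SE). A minimal sketch (the function name is mine, and 1.96 is used as a normal-distribution approximation to the exact t critical value):

```python
import math

def odds_ratio_ci(coef, se, crit=1.96):
    """Convert a logit coefficient and standard error into an
    odds ratio with an approximate 95% confidence interval."""
    return (math.exp(coef),
            math.exp(coef - crit * se),
            math.exp(coef + crit * se))

# First row of Table 5.26 (Bartok Mvmt. 3, section endings, Fine 1):
or_, lo, hi = odds_ratio_ci(2.834667, 0.957010)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

This reproduces the tabled odds ratio of 17.02 exactly; the interval bounds land near the tabled (2.604, 111.310), differing slightly because the table presumably uses the exact t critical value for 1393 degrees of freedom.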
Table 5.26: Mixed Logit Regression Analysis: Grouping Analysis

Movement       | Endings   | Outcome variable | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value | Odds Ratio | Confidence Interval
Bartók Mvmt. 3 | Section   | Fine 1   | 2.834667  | 0.957010 | 2.962  | 1393 | 0.003  | 17.024731  | (2.604, 111.310)
Bartók Mvmt. 3 | Phrase    | Fine 1   | 1.701294  | 0.502734 | 3.384  | 1393 | <0.001 | 5.481033   | (2.044, 14.697)
Bartók Mvmt. 3 | Phrase    | Fine 2   | 1.917259  | 0.547971 | 3.499  | 1393 | <0.001 | 6.802289   | (2.321, 19.933)
Bartók Mvmt. 3 | Phrase    | Coarse 1 | 4.000502  | 0.690529 | 5.793  | 1393 | <0.001 | 54.625563  | (14.093, 211.732)
Bartók Mvmt. 3 | Phrase    | Coarse 2 | 6.247745  | 1.037728 | 6.021  | 1393 | <0.001 | 516.845955 | (67.473, 3959.08)
Bartók Mvmt. 3 | Subphrase | Fine 2   | -0.545557 | 0.214859 | -2.539 | 1393 | 0.011  | 0.579519   | (0.380, 0.883)
Bartók Mvmt. 3 | Subphrase | Coarse 1 | -1.310557 | 0.464784 | -2.820 | 1393 | 0.005  | 0.269670   | (0.108, 0.671)
Bartók Mvmt. 3 | Subphrase | Coarse 2 | -2.955109 | 0.911855 | -3.241 | 1393 | 0.001  | 0.052073   | (0.009, 0.312)
Bartók Mvmt. 5 | Section   | Coarse 1 | 2.190514  | 0.265271 | 8.258  | 2801 | <0.001 | 8.939810   | (5.315, 15.036)
Bartók Mvmt. 5 | Section   | Coarse 2 | 1.588419  | 0.248499 | 6.392  | 2801 | <0.001 | 4.896003   | (3.008, 7.968)
Bartók Mvmt. 5 | Phrase    | Fine 1   | 0.939456  | 0.161433 | 5.819  | 2801 | <0.001 | 2.558589   | (1.865, 3.511)
Bartók Mvmt. 5 | Phrase    | Fine 2   | 0.684197  | 0.187449 | 3.650  | 2801 | <0.001 | 1.982180   | (1.373, 2.862)
Bartók Mvmt. 5 | Phrase    | Coarse 1 | 0.870921  | 0.214951 | 4.052  | 2801 | <0.001 | 2.389110   | (1.568, 3.641)
Bartók Mvmt. 5 | Phrase    | Coarse 2 | 1.310339  | 0.201581 | 6.500  | 2801 | <0.001 | 3.707429   | (2.497, 5.504)
Bartók Mvmt. 5 | Subphrase | Fine 2   | 0.315503  | 0.095895 | 3.290  | 2801 | 0.001  | 1.370949   | (1.136, 1.654)
Bartók Mvmt. 5 | Subphrase | Coarse 2 | -0.604209 | 0.207700 | -2.909 | 2801 | 0.004  | 0.546507   | (0.364, 0.821)
Mozart No. 19  | Section   | Coarse 1 | 0.642937  | 0.256859 | 2.503  | 2493 | 0.012  | 1.902059   | (1.149, 3.148)
Mozart No. 19  | Section   | Coarse 2 | 1.202767  | 0.405722 | 2.965  | 2493 | 0.003  | 3.329315   | (1.503, 7.377)
Mozart No. 19  | Phrase    | Fine 1   | 1.209364  | 0.254545 | 4.751  | 2493 | <0.001 | 3.351351   | (2.034, 5.521)
Mozart No. 19  | Phrase    | Fine 2   | 0.907300  | 0.356961 | 2.542  | 2493 | 0.011  | 2.477624   | (1.230, 4.989)
Mozart No. 19  | Phrase    | Coarse 1 | 1.545690  | 0.152140 | 10.160 | 2493 | <0.001 | 4.691206   | (3.481, 6.322)
Mozart No. 19  | Phrase    | Coarse 2 | 1.722955  | 0.248001 | 6.947  | 2493 | <0.001 | 5.601058   | (3.444, 9.109)
Mozart No. 19  | Subphrase | Coarse 1 | 0.624000  | 0.200877 | 3.106  | 2493 | 0.002  | 1.866379   | (1.259, 2.767)
Mozart No. 21  | Phrase    | Fine 1   | 1.647495  | 0.615075 | 2.679  | 1206 | 0.007  | 5.193953   | (1.554, 17.362)
Mozart No. 21  | Phrase    | Fine 2   | 2.249056  | 0.595786 | 3.775  | 1206 | <0.001 | 9.478788   | (2.945, 30.508)
Mozart No. 21  | Phrase    | Coarse 1 | 2.410880  | 0.425501 | 5.666  | 1206 | <0.001 | 11.143764  | (4.836, 25.680)
Mozart No. 21  | Phrase    | Coarse 2 | 2.840864  | 0.509186 | 5.579  | 1209 | <0.001 | 17.130561  | (6.308, 46.520)
Mozart No. 21  | Subphrase | Fine 1   | 0.920706  | 0.317333 | 2.901  | 1206 | 0.004  | 2.511062   | (1.347, 4.680)
Mozart No. 21  | Subphrase | Coarse 1 | 1.997244  | 0.378694 | 5.274  | 1206 | <0.001 | 7.368718   | (3.505, 15.491)
Mozart No. 21  | Subphrase | Coarse 2 | 1.284160  | 0.476020 | 2.698  | 1209 | 0.007  | 3.611634   | (1.419, 9.190)

The results for section endings depend somewhat on the tempo of the movement and the
grain of segmentation. There are not as many main effects for the end of a section in the slower
movements because these movements have fewer sectional divisions that are spread out over a
longer period of time. Participants with less musical training tend to indicate boundaries more
often (seen in the total counts of all responses in Tables 5.5 and 5.6) and this would lessen the
effect of section on the results, especially in the slow movements. The end section feature is
notably absent from the main effects for the slow Mozart movement, but there is an interaction
with this feature in the Coarse 1 trial. As seen in Table 5.27, both graduates and undergraduates
respond to this feature, but graduates have a higher percent change (notice the high baseline
again for the undergraduates). This interaction is replicated in the other three movements.
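The "percent change" comparisons used throughout this discussion can be read directly off the group means in Table 5.27. A minimal sketch, assuming percent change denotes the relative increase over the feature-absent baseline (the function name is mine):

```python
def percent_change(absent, present):
    """Relative increase in mean response rate when a feature is present,
    expressed as a percentage of the feature-absent baseline."""
    return 100 * (present - absent) / absent

# Coarse 1 / section-ending interaction for the slow Mozart movement
# (Table 5.27): undergraduates 0.138 -> 0.558, graduates 0.024 -> 0.511.
undergrad = percent_change(0.138, 0.558)
graduate = percent_change(0.024, 0.511)
print(f"undergraduates: {undergrad:.0f}%  graduates: {graduate:.0f}%")
```

Both groups respond to the feature, but the graduates' much lower baseline yields a far larger percent change, which is the pattern described above.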
Table 5.27: ANOVA Means for Interactions in the Grouping Analysis

Movement       | Outcome variable | Feature   | More-trained group | Less training, feature absent | Less training, feature present | More training, feature absent | More training, feature present | p-value
Bartók Mvmt. 3 | Fine 1   | Section   | Musician | 0.350 | 0.893 | 0.247 | 0.847 | 0.01
Bartók Mvmt. 3 | Fine 1   | Section   | Graduate | 0.336 | 0.826 | 0.179 | 0.972 | 0.011
Bartók Mvmt. 3 | Coarse 2 | Phrase    | Graduate | 0.076 | 0.696 | 0.027 | 0.917 | <0.001
Bartók Mvmt. 3 | Coarse 1 | Section   | Graduate | 0.107 | 0.750 | 0.030 | 0.944 | <0.001
Bartók Mvmt. 3 | Fine 1   | Subphrase | Musician | 0.354 | 0.464 | 0.191 | 0.463 | <0.001
Bartók Mvmt. 3 | Fine 2   | Subphrase | Graduate | 0.288 | 0.430 | 0.115 | 0.426 | 0.005
Bartók Mvmt. 5 | Coarse 1 | Section   | Graduate | 0.072 | 0.429 | 0.056 | 0.569 | 0.003
Bartók Mvmt. 5 | Coarse 2 | Phrase    | Graduate | 0.020 | 0.146 | 0.340 | 0.121 | <0.001
Mozart No. 19  | Coarse 1 | Section   | Musician | 0.122 | 0.411 | 0.089 | 0.482 | 0.013
Mozart No. 19  | Fine 1   | Phrase    | Musician | 0.370 | 0.580 | 0.233 | 0.660 | <0.001
Mozart No. 19  | Fine 1   | Phrase    | Graduate | 0.351 | 0.633 | 0.131 | 0.609 | <0.001
Mozart No. 19  | Fine 2   | Phrase    | Musician | 0.306 | 0.518 | 0.266 | 0.690 | <0.001
Mozart No. 19  | Fine 2   | Subphrase | Musician | 0.393 | 0.395 | 0.421 | 0.449 | 0.001
Mozart No. 21  | Coarse 1 | Section   | Graduate | 0.138 | 0.558 | 0.024 | 0.511 | 0.02
Mozart No. 21  | Coarse 2 | Phrase    | Musician | 0.126 | 0.429 | 0.027 | 0.456 | 0.02
Mozart No. 21  | Fine 1   | Subphrase | Musician | 0.363 | 0.536 | 0.184 | 0.573 | 0.003

The effect of subphrase is less clear over all four movements. Looking first at the Bartók
responses, participants rarely respond within the windows containing a subphrase division
(observe the negative coefficient), especially in the coarse condition. This suggests that listeners
tend to reject lower-level endings when asked to note higher-level endings. In the third
movement, despite the negative coefficient for the main effect of subphrase in Fine 2, musicians
are more likely than non-musicians to indicate a boundary at a subphrase division in the fine
condition. Table 5.27 illustrates that when a subphrase ending is present in the third movement
of the Bartók quartet, there is little difference between the means of the two subject groups. In
the fine condition, when a subphrase ending is not present, the group with more musical
experience has a much lower mean, resulting in a higher percent change for when the feature is
present. This indicates that the effect of subphrase is less pronounced on non-musicians than it is
on musicians, reflecting once again that non-musicians may be less discriminating in their
responses and don't agree with my analysis to the same extent as do the musicians.
While subphrases are significant predictors only in Coarse 1 in the fast Mozart
movement, there is an increased number of main effects for subphrase in the slow Mozart
movement. For both movements, there is an interaction for this feature: although the presence of
a subphrase ending does not change the non-musicians' responses much, musicians have a higher
percent change when a subphrase ending is present. Again, this reflects a more consistent
segmentation strategy for participants with more formal training.

Discussion
The results from this experiment support the hypotheses posited by EST: (1) participants
will segment a given stimulus consistently between trials; (2) different participants will segment
a given stimulus consistently; (3) the resulting segmentation will form a nested hierarchical
structure; and (4) pre-existing knowledge structures will influence segmentation. The
expectations both for continuity and for goal-directed musical successions derive from these
knowledge structures. Differences between the subject groups further support this last
hypothesis. Non-musicians tend to perceive more boundaries that aren't paired consistently with
musical features, but as musical expertise increases participants are more likely to respond
consistently to particular features. More musical experience, both listening and performing,
would create knowledge structures that could assist in this segmentation task.44

44 For future data analysis, it might be helpful to divide subjects according to how often they indicated a
boundary, given that participants may have been segmenting on different hierarchical levels (meaning that one
listener's phrase might be another listener's section).

Overall, participants segmented the compositions consistently between trials, forming a
nested hierarchical structure, and participants somewhat agreed about the location of these
boundaries. There is no significant difference between the odds ratios in the Bartók and Mozart
conditions for these main effects, pointing to a shared cognitive process that is not style-specific,
although the musical features that mark boundaries differ between the styles represented by these
composers. These different features may have resulted in different segmentation strategies,
reflected in the interactions between the starting segmentation task and consistency for the
Mozart subjects and between the starting segmentation task and hierarchy for the Bartók
subjects. In the Mozart condition, participants who began with the fine segmentation task tended
to be more consistent in their coarse responses. The opposite is true for participants who
segmented the Bartók movements: these participants produced a better nested analysis when they
began with the coarse segmentation task. The aural analysis of the Mozart movements improved
when participants were instructed to listen first for fine-grained boundaries, which usually
coincide with a cadential paradigm. This type of musical syntax is missing in Bartók's style, so
change in the musical surface apparently informed the participants' segmentation decisions.
Participants in the Bartók condition who first divided the movements into large sections seem to
have formed a more stable representation of the movement into which their fine responses were
nested. Since Mozart's style does not have as many surface changes differentiating larger
sections, it may have been easier for participants to group together already-determined shorter
musical segments into longer sections, as opposed to dividing a longer section into shorter
phrases.
Both of these experiences of the musical structure are supported by the theoretical
cognitive mechanism that guides event segmentation as posited by EST. First, coarse segments
are usually marked with the culmination of some sort of goal or a more drastic change in the
musical surface compared to the surrounding input. The Bartók analysis was influenced more
than the Mozart analysis by the number of changes in the musical surface, especially the
perception of coarse divisions. Returning to Figure 5.5 (the estimated mean response in a given
trial for the number of changes), the coarse condition shows a large increase in response rate
when there are between three and four musical changes, suggesting that the change features
highlighted in this analysis strongly influenced segmentation when present in sufficiently large
quantities. The creation of a nested structure is facilitated by first identifying these points in a
style where standard cadential gestures are not the norm. Further research suggests that memory
tends to be better at perceptual boundaries because these are the moments at which event models
update, incorporating new input from the ongoing perceptual stream (Swallow, Zacks, and
Abrams 2009). Perhaps participants in my study remembered these points better than other
moments in the music, and this assisted in the fine segmentation task.
The feature analysis reveals that when a movement has goal-directed features (such as
consistent cadential progressions), participants tend to rely on them more than on surface
changes, especially when segmenting on a coarse grain. When goal-directed features are absent,
however, an increased number of changes creates a hierarchical structure. In the Mozart results,
participants relied mainly on arrival features to mark both fine and coarse boundaries. Most of
the arrival features reflect transitional probabilities (an ending is highly likely to follow certain
musical events, but not necessarily others), which would allow participants to anticipate an
ending in this repertoire more so than in the Bartók movements. Since sections weren't
delineated by drastic changes in
the musical surface, participants had to group together shorter segments to build longer sections.
Even though these arrival features, which resemble a musical goal, predict responses in both
conditions, they are far more likely to predict responses in the coarse condition. This effect
increases with musical training, suggesting that musicians use these arrival cues consistently to
segment musical experience.
Even though the absence of standard cadential paradigms in the Bartók examples makes
goal achievement difficult to quantify, features associated with movement-specific cadential
paradigms tend to predict coarse boundaries in the third movement. In this movement, the falling
fourth cadence and other arrival features associated with this cadence (a specific duration pattern
and intervallic succession) consistently predict coarse responses.45 This can be contrasted with
Bartók's fifth movement, where a change in duration predicts fine responses because it is not
consistently paired with a cadential goal. Although, for both composers, not every feature that
predicts a coarse response is coupled with a cadential paradigm, and cadential paradigms can
also predict fine segmentation, a general trend that associates goal-directed motion with coarse
divisions emerges from these data.

45 It is difficult to distinguish whether listeners were responding to these cadential gestures or to the large
amount of surface change that usually followed these gestures. A follow-up study that controls for these variables in
the segmented stimuli might be able to distinguish the extent to which a listener depends on arrival and change
features.
EST also predicts that fine divisions occur at a change in motion, which in music would
presumably involve a change in the acoustical stream. Although a durational change predicts
fine responses in Bartók's fifth movement, as just mentioned, no consistent effects otherwise
support this hypothesis. Because these analytical features are very general, some of the fine
division predictors may have been lost in an over-generalized picture, explaining this lack of
effect. Coding more specific feature interactions (for instance, adding features that acknowledge
a particular change in duration or register) or coding in the degree of change might yield
different results.
For all four movements, the formal divisions I previously identified tended to be the best
predictor of listeners' responses. Of course, my own analysis did not solely rely on the presence
of a change in the musical surface; I also considered motivic and melodic repetition, conformity
to formal schema, and phrase length, among other things. This evaluation was analytically
complex rather than simplistic, and it depended more on my prior musical experience than on
individual surface features. Despite disparities in musical training and experience, listeners
evidently agreed to a remarkable extent with my formal analysis. In the fast movements, subjects
tended to corroborate my higher-level endings, while in slow movements they were more likely
to confirm my lower-level endings, perhaps revealing a general feeling of phrase length.
Participants with more musical training were even more likely to respond at my formal divisions,
presumably reflecting similarities in our musical experience and training.
This experiment demonstrates that pre-existing knowledge structures can influence
segmentation. Both expectations for continuity and arrival features derive from learned
transitional probabilities, but, as discussed in Chapter 3, the degree of finality would correlate
with the degree to which the arrival of the ending was anticipated and with the subsequent rise in
prediction error. The next study uses a learning task to explore how arrival features may
influence the perception of closure.


CHAPTER 6
EXPERIMENT 2
Experiment 1 found that listeners, despite varying familiarity with the musical style,
could segment a musical stream consistently based on features in the music; as previously
discussed, determining meaningful musical segments is a necessary precursor to the actual
perception of closure. Previous research has suggested that the ability to segment a given
stimulus, as well as the associated perception of closure at the end of a segmented unit, derives
from unconsciously learning the statistical structure of music. Probabilistic learning, which
creates expectations that guide listener segmentation, falls into one of two categories: inclusional
probabilities and transitional probabilities. Transitional probabilities are the more closely
associated with closure, because listeners who are able to anticipate endings will experience a
stronger feeling of finality at the end of a unit. In the previous study, transitional probabilities
were associated both with goal completion and with a change in the acoustic landscape, the
latter revealing an expectation for continuity. Broadly speaking, coarse segmentation responses
correspond with arrival features (equivalent to achieving a musical goal), while fine
segmentation responses reflect a change in musical motion; however, this correspondence is not
consistent and varies according to musical training. Presumably, the boundaries marked by
coarse segmentation elicited a stronger feeling of finality in the participants than did the
boundaries marked by fine segmentation alone. While the specific feeling of finality
(anticipatory, arrival, or retrospective) wasn't explored, participants consistently used cues to
make decisions about segmentation, and their ability to make these decisions consistently
depends upon musical experience.
Transitional probabilities stemming broadly from a listener's previous musical
experiences, as well as transitional probabilities learned as a particular composition unfolds,
guide a listener's segmentation of an unfamiliar style. If a consistent feature concludes segments,
listeners can pick up on these first-order probabilities, resulting in a greater feeling of finality
when this feature occurs. Experiment 2 examines the process by which listeners learn the
musical markers of endings in two styles: a more familiar common-practice style exemplified by
the composer Wolfgang Amadeus Mozart and a twentieth-century style represented by the
composer Béla Bartók. As discussed in the previous chapter, Bartók tends to provide a metrical
framework and to employ phrase lengths and formal divisions familiar from common-practice
style. More important for my purposes, consistent gestures conclude phrases in the representative
composition by Bartók used in this experiment. While the first-order probabilities for Bartók's
cadential gestures may not be as high as for Mozart's cadential gestures, the goal of this study is
to see whether listeners can extract these cadential cues.
While many statistical learning tasks have shown that listeners are sensitive to both
inclusional and transitional probabilities, this study determines whether exposure to an
unfamiliar style can change the way a listener interprets a "satisfactory" ending. Any effects
from this study are unlikely to be robust, for several reasons. First, the statistical learning task
outlined in this chapter has an additional layer of interpretation: rather than asking participants
whether a test item is a grammatical entity within a style, this study asks participants to
determine whether an ending sounds "complete" or "satisfying." Also, this task uses music from
the existing repertoire, which could not be controlled for all the transitional probabilities between
sound elements. Finally, while Bartók is more consistent than many other twentieth-century
composers in his ending gestures, learning any first-order probabilities from a twelve-minute
exposure could be a difficult task.
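The first-order probabilities at issue here are simply conditional frequencies estimated from a sequence of events: how often each event type is followed by an ending. A minimal sketch with a hypothetical event stream (the labels and function are illustrative, not drawn from the actual stimuli):

```python
from collections import Counter, defaultdict

def first_order_probs(sequence):
    """Estimate P(next event | current event) from a list of event labels."""
    pair_counts = defaultdict(Counter)
    for cur, nxt in zip(sequence, sequence[1:]):
        pair_counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
            for cur, nxts in pair_counts.items()}

# Hypothetical stream in which a "cadence" event reliably precedes an "end":
events = ["motive", "motive", "cadence", "end",
          "motive", "cadence", "end",
          "motive", "motive", "cadence", "end"]
probs = first_order_probs(events)
print(probs["cadence"]["end"])  # 1.0: an end always follows a cadence here
```

A listener sensitive to this statistic would come to anticipate an ending whenever the cadence-like event occurs, which is precisely the learning this experiment probes.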

Method
Participants
The participants were divided into three subject groups determined by their levels of
formal musical training. Data from 21 undergraduate non-musicians (who received psychology
credit for participating in this study), 22 undergraduate music majors (who received extra credit
in their freshman Music Theory class), and 10 graduate music majors (who received a $10 gift
card for their participation) were included in this study. Data from 10 additional participants
were discarded due to technology problems.
Stimuli
Stimuli consisted of clips from Mozart's String Quartet No. 19 in C Major
("Dissonance"), movements 1, 2, and 4; and Bartók's String Quartet No. 4, movements 1, 3, and
5.87 To create the exposure period, I selected excerpts from each movement that conclude with a
cadential gesture. Table 6.1 lists the excerpts used in the exposure period, all of which were
created in Audacity by segmenting the original digital file. All excerpts for each composer were
then combined into a single audio file in the order in which they occur in the composition. Each
clip was heard twice in succession, with successive clips separated by three seconds of silence.
The resulting Mozart and Bartók sound files were each slightly over twelve minutes long.
Table 6.1: Exposure Excerpts

Composer | Movement | Measure Numbers   | Time (s)
Bartók   | 1 | 1–49 (beat 1)     | 104.34
Bartók   | 1 | 148 (beat 4)–161  | 26.66
Bartók   | 3 | 1–5               | 22.93
Bartók   | 3 | 13 (beat 4)–21    | 34.00
Bartók   | 3 | 47–55 (beat 1)    | 37.28
Bartók   | 5 | 15–57             | 45.00
Bartók   | 5 | 121–148           | 22.86
Bartók   | 5 | 238–284 (beat 2)  | 35.88
Mozart   | 1 | 23–44 (beat 2)    | 38.55
Mozart   | 1 | 176–211           | 62.54
Mozart   | 2 | 1–13 (beat 2)     | 46.56
Mozart   | 2 | 26–39 (beat 1)    | 52.74
Mozart   | 2 | 101–109 (beat 2)  | 33.02
Mozart   | 4 | 1–34              | 26.85
Mozart   | 4 | 258–291           | 26.92
Mozart   | 4 | 326–348           | 18.87
Mozart   | 4 | 371–419           | 40.21

From these same movements, I selected 115 test clips representing each composer (a total
of 230 clips). These clips were catalogued either as cadential (target stimuli) or as non-cadential
fillers. For the Bartók stimuli, the average time for the 55 cadential excerpts is 3.06 seconds
(SD = 1.28), while the average time for the 51 cadential Mozart stimuli is 3.59 seconds
(SD = 1.36). Some of the same cadential points were heard more than once, each one having a
different length of context preceding the cadential point. For the Mozart stimuli, the sound clips
began at the initiation of the pre-dominant area (long context) or the dominant area (short
context). For the Bartók stimuli, the sound clips began between 2 and 10 beats prior to the
cadential arrival. Sixty percent of the cadential excerpts were present in the exposure;
participants heard these clips in a semi-random order. While the clips were randomized within
each movement, the movements were heard in the order in which they occurred in the
composition.

87 The Emerson String Quartet performed both of the recordings used in this study, from the albums Bartók:
The String Quartets (1988) and Mozart String Quartets K. 465 "Dissonance," 458 "The Hunt" & 421 (2005).
Cadential gestures in Mozart's string quartet include authentic cadences (both the PAC
and the IAC) and half cadences, as defined in Chapter 5. Cadential gestures in Bartók's string
quartet include those defined in Chapter 5: the descending fourth and the descending third
gestures as well as the multi-voiced chord, which can be presented by itself, immediately
repeated, or following a lower note. To supplement this list of cadences, the ends of motives x
and y and the diatonic chord that concludes the third movement were also treated as cadential.
Motive x is a chromatic gesture that occurs both in its prime form (Example 6.1) and inverted in
pitch-space. This motive is introduced in the first movement in m. 7 and reappears about halfway
through the fifth movement. Motive y first occurs in the fifth movement in m. 16. Like motive x,
the rhythmic gesture is the primary identifier for this motive (see Example 6.2).

Example 6.1: Motive x from Bartók's String Quartet No. 4, first movement, m. 7

Example 6.2: Motive y from Bartók's String Quartet No. 4, fifth movement, mm. 16–18

The target clips were constructed to include as much of the final cadential event as
possible, cutting the clip right before the beginning of the next event. Unlike the first experiment,
where I could not differentiate between the feelings of anticipatory, arrival, or retrospective
closure, the construction of the stimuli in this experiment allowed me to examine specifically the
feeling of closure based solely on arrival features rather than discontinuities in the musical
surface. In some cases, the construction of the music resulted in a clipped ending in the sound
file, especially for cadences that weren't followed by a rest. Because participants might be
sensitive to this sound, I catalogued the presence of silence in the music, along with other
features such as the type of cadential gesture and the length of the excerpt, for data analysis.
Procedure
After giving informed consent, participants were assigned to one of two listening
conditions that determined the exposure content: participants heard the excerpts either by Mozart
or by Bartók. While listening to the exposure track, participants were asked to indicate on the
computer every time they heard an ending, to ensure they were paying attention. Following the
exposure, they listened to the test clips from both composers (order of presentation was
counterbalanced between participants) and rated how complete each ending sounded on a
seven-point scale.88 All of their responses were recorded on a computer. Upon completion of the
rating task, participants filled out a brief questionnaire documenting their familiarity with the
compositions and their musical experience.

Results
The results only examine data from the target stimuli: ratings from the clips that conclude
with a cadential gesture. Using a mixed-model regression, I examine the influence of several
independent variables on the dependent rating variable: within-subject variables, which record
characteristics of the stimuli (the composer of the excerpt, whether a rest followed the cadential
arrival, the length of the excerpt, and the order in which the clips were presented), and
between-subject variables (exposure composer, composer first rated, and subject group, as
determined by musical experience). Composer and silence are binary variables (Bartók = 0 and
Mozart = 1; no silence = 0 and silence = 1), while group is a three-level variable
(Non-musicians = 0, Undergraduate Musicians = 1, and Graduate Musicians = 2). The question
order is a number from 1–115, corresponding with the order of the excerpts within each block
(determined by composer), and time is the length of each excerpt measured in seconds.

88 The instructions read to the participants were: "In this part of the study you will use a seven-point scale to
rate how well the short musical clip would end a musical idea. In music, some endings sound more conclusive than
others. If the musical clip does not sound at all like an ending, then press 1. Use higher numbers to indicate stronger
endings, with 7 representing the strongest possible ending. Since this is a matter of opinion, don't worry that there is
a right or wrong answer. Feel free to use the entire range of the scale."
Table 6.2: Mixed Models Regression Analysis: Rating

Row | Fixed Effect 89 | Coefficient | Standard error | t-ratio | Approx. d.f. | p-value
1   | Intercept      | 3.072478  | 0.164794 | 18.644 | 51   | <0.001
2   |   Exposure     | -0.561667 | 0.217472 | -2.583 | 51   | 0.013
3   |   First Rated  | 0.065684  | 0.218071 | 0.301  | 51   | 0.764
4   |   Group        | 0.593148  | 0.125842 | 4.713  | 51   | <0.001
5   | Rated Composer | 0.790687  | 0.126829 | 6.234  | 5159 | <0.001
6   |   Exposure     | 0.069363  | 0.197554 | 0.351  | 5159 | 0.726
7   |   First Rated  | 0.141623  | 0.195685 | 0.724  | 5159 | 0.469
8   |   Group        | 0.307577  | 0.111427 | 2.760  | 5159 | 0.006
9   | Silence        | 1.845760  | 0.139089 | 13.270 | 5159 | <0.001
10  |   Exposure     | 0.013615  | 0.177474 | 0.077  | 5159 | 0.939
11  |   First Rated  | -0.157893 | 0.177682 | -0.889 | 5159 | 0.374
12  |   Group        | 0.040004  | 0.116624 | 0.343  | 5159 | 0.732
13  | Excerpt Length | 0.274982  | 0.039850 | 6.900  | 5159 | <0.001
14  |   Exposure     | 0.079446  | 0.051066 | 1.556  | 5159 | 0.120
15  |   First Rated  | -0.065781 | 0.051092 | -1.287 | 5159 | 0.198
16  |   Group        | -0.066598 | 0.029681 | -2.244 | 5159 | 0.025
17  | Question Order | -0.007765 | 0.000878 | -8.847 | 5159 | <0.001
18  |   Exposure     | 0.002446  | 0.001237 | 1.976  | 5159 | 0.048
19  |   First Rated  | 0.001444  | 0.001246 | 1.159  | 5159 | 0.247
20  |   Group        | -0.001042 | 0.000807 | -1.291 | 5159 | 0.197

An interaction between the variables that document the composer of a rated excerpt and
the composer a participant heard during the exposure period would indicate that the exposure
period influenced the listeners' perception of closure. As seen in row 6 of Table 6.2, the
interaction between exposure and composer is not significant. Despite the lack of direct support
for the main hypothesis of this experiment, other variables significantly influenced the rating
data. For instance, rows 2 and 4 show a main effect for two of the between-subject variables,
exposure and group. The negative coefficient for exposure indicates that the ratings made by
participants who listened to Bartók during the exposure period are generally higher than the
ratings made by participants who listened to Mozart, while the positive coefficient for group
indicates that participants with more musical experience also tend to rate all the excerpts higher.

89 A few notes on this table: the first row is the coefficient needed for the regression equation. The
between-subject variables underneath it show the influence of being in any one of these groups on the rating
variable. The between-subject variables underneath a given within-subject variable show the interaction between
these subject groups and a given independent variable.
A separate ANOVA reveals an interaction between exposure and subject group on the
mean rating of the excerpts. This interaction reveals that the graduate musicians who were
exposed to Bartók drive the main effect of higher ratings for the Bartók excerpts (see Figure 6.1).
In fact, a significant three-way interaction shows that the ratings for cadential gestures in Bartók
made by graduate students who listened to Bartók during the exposure period are much higher
than the ratings made by any of the other subject groups for the same cadential gestures
(Figure 6.2).

Figure 6.1: Two-way Interaction between Subject Group and the Exposure Composer

Figure 6.2: Three-way Interaction between Subject Group, the Exposure Composer, and the
Rated Composer
The first panel shows the mean ratings for the Bartók stimuli, and the second panel shows the mean ratings for the
Mozart stimuli.

The main effect for rated composer shows that all participants tended to rate the excerpts
composed by Mozart higher than those composed by Bartók (row 5), and the significant
interaction for group (row 8) suggests that the three subject groups evaluated the composers
differently. While all three groups exhibit the general trend of rating Mozart excerpts higher than
Bartók excerpts, graduate musicians rate both composers higher, suggesting more familiarity
with both composers (see Figure 6.3). Undergraduates seem to be more familiar with the Mozart
stimuli than with the Bartók stimuli; their estimated mean rating for the Bartók stimuli is closer
to that of the non-musicians, whereas their estimated mean rating for the Mozart stimuli is closer
to that of the graduate musicians. This effect is accentuated in the data set that includes the
non-cadential filler ratings (see Figure 6.4).

Figure 6.3: Two-way Interaction between Participant Group and the Ratings for Composer
Only the cadential excerpts are included in this data set.


Figure 6.4: Two-way Interaction between Participant Group and the Ratings for Composer
All the excerpts (both cadential and non-cadential) are included in this data set.

The remaining within-subject variables are all significant. The presence of silence has the
greatest effect on the ratings, suggesting that participants were paying more attention to the
acoustical properties of the final note of the excerpt than to the approach to the last note of
the excerpt. The positive coefficient for length indicates that as the length of the excerpt
increases, the rating increases, but the negative coefficient in the interaction suggests that
participants with more musical training aren't as influenced by the length of the stimulus. The
final variable, question order, shows that as the experiment progressed all participants tended to
rate the excerpts lower. While this could suggest that participants became slightly more
discerning between degrees of cadential articulation as the experiment progressed, a significant
but weak negative correlation between all the ratings (including the ratings for non-cadential
fillers) and the question order reveals that all the ratings gradually decreased as the experiment
progressed (r = -0.055, p = 0.01).
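A correlation of this kind is an ordinary Pearson product-moment correlation between trial order and rating. As a minimal sketch of the calculation (the data below are invented for illustration and are not from the study):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative only: ratings drifting slightly downward over trials.
order = [1, 2, 3, 4, 5, 6]
rating = [6, 5, 6, 5, 4, 5]
print(round(pearson_r(order, rating), 3))  # -0.639
```

A negative value, as in the reported r = -0.055, indicates that later trials received slightly lower ratings.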

Discussion
While there isn't a significant effect for the interaction between the exposure composer
and the rated composer (exposure match), these data suggest other trends that are supportive of
EST. Increased familiarity with a musical style increases the ratings for closure. Musicians, who
are presumably more familiar with common-practice repertoire, rate Mozart's excerpts higher
than the non-musicians do; graduates, who are presumably more familiar with twentieth-century
repertoire, rate Bartók's excerpts higher than the other two groups do. Further, graduate
musicians seem to be more affected by the exposure period. While both groups of graduates rate
the Mozart excerpts higher than the Bartók excerpts, graduates who were exposed to Bartók rate
the Bartók excerpts higher than did the participants who were exposed to Mozart. Given their
more extensive musical experience, graduates could assimilate new stylistic markers of closure
better than participants with less musical experience could.
Exposure match does not have a significant main effect on the rating of the target stimuli,
which may be a product of the design of this experiment. First, the twelve-minute exposure
period was probably too short for participants to acquire stylistic cues for endings. In their statistical
learning study, Jonaitis and Saffran (2009) found that participants only learned a novel harmonic
syntax after a two-day exposure period. Further, the present study did not ask whether the stimuli
formed a grammatical entity, as most statistical learning tasks do; rather, it asked for an aesthetic
judgment of completeness. Such opinions may require even more experience with a style,
consistent with the increased ratings by the more experienced musicians. Second, the strong
main effect for silence suggests that acoustical properties of the last pitch influenced ratings
more than the feeling of finality experienced at the arrival of the last pitch. An alternatively
designed study could control for this variable by prematurely clipping the last note of every
cadence, which would shift attention towards the predictability of the last pitch.
Both Experiments 1 and 2 show the importance of pre-existing knowledge structures in
musical segmentation and the perception of closure. While the exposure period in this study
failed to create new knowledge structures, the main effect for musical expertise (measured by
formal training) suggests that previous knowledge did, in fact, influence the results. Research
reviewed in Chapters 3 and 4 suggests that both segmentation and the perception of closure
depend upon a listeners expectations. Increased experience with a particular style would
presumably create more accurate expectations, which could be reflected by increased ratings for
more familiar music because the feeling of finality results from accurate expectations followed
by a rise in uncertainty for subsequent events. While Experiment 2 does not specifically examine
the influence of expectation on closure, Experiment 3 explicitly asks participants to predict when
musical phrases will end. Endings that are more predictable should correlate with an increased
feeling of finality.


CHAPTER 7
EXPERIMENT 3
This last experiment tests the third hypothesis of EST: anticipated endings followed by a
rise in uncertainty for subsequent events correspond with a feeling of finality. Zacks, Speer, and
Reynolds (2009) tested this hypothesis by asking participants to rate retrospectively the
predictability of clauses from a longer narrative. The authors found that the perceived
predictability decreased as the number of changes in the narrative increased. Participants also
tended to read less predictable clauses more slowly than predictable clauses, suggesting an
update of the event model during unpredictable clauses. While there was a correlation between
reading speed and predictability, Zacks, Speer, and Reynolds note that a retrospective rating of
predictability may not be the most reliable measure, and they suggest that a real time measure of
predictability might provide more support for EST.
In Experiment 3, I measure predictability by asking participants to anticipate the endings
of musical phrases while listening to three complete movements by W.A. Mozart. Following this
prediction task, participants rated the degree of completeness for short clips from these
movements. In this rating task, one group of participants heard the clips in order of the
movement, while the other group of participants heard a random-order presentation of the clips,
allowing an examination of the relationship between the perception of closure and the formal
structure of a composition. Meyer (1973) suggests that the formal structure of a composition is
articulated through a hierarchy of closes. His bottom-up construction of structure suggests that
stronger endings result from more parameters projecting closure; however, a listener's previous
experience with typical formal structures could also influence an ending's perceived strength.
This top-down view suggests that knowledge structures represent yet another parameter that can
project closure, hence influencing the perceived strength of closure.


Method
Participants
The participants were divided into three subject groups determined by their levels of
formal musical training. Data from 24 undergraduate non-musicians (who received psychology
credit for participating in this study), 27 undergraduate musicians (who received extra credit in
their freshman Music Theory class), and 23 graduate student musicians (who received a $10 gift
card for their participation) were included in this study. Each subject group was divided into two
conditions that determined the nature of the rating task. In the random condition, participants
heard the clips in random blocks by movement, while in the visual condition participants not
only heard the clips in the order they occurred in the movement but also saw a visual
representation of the movement, an example of which is replicated as Figure 7.1.

Figure 7.1: Excerpt No. 6 from Mozart's String Quartet in G Major (K. 156), third movement.
Participants were instructed that the blue box represents the relative length and location of the clip.
Stimuli
Stimuli in this study used the minuet and trio movements from three string quartets by
W.A. Mozart: String Quartet No. 3 in G major (K. 156), third movement; String Quartet No. 8 in
F major (K. 168), third movement; and String Quartet No. 13 in D minor (K. 173), third
movement. Participants listened to all three movements as performed by the Amadeus String
Quartet. (The scores of these movements are located in Appendix B.) I chose these particular
movements because Mozart controls two musical features that may influence the predictability of
phrase endings: consistent four-bar hypermeter and clear cadential arrivals. First, while K. 168
maintains a consistent four-bar hypermeter throughout, the other two movements contain phrase
expansions and extensions that disrupt the established four-measure groupings. This variability
in phrase length prevents listeners from relying exclusively on predictable metrical cycles
to anticipate endings. Second, the three movements contain a variety of cadential paradigms,
including all three significant cadence types (PAC, HC, and IAC).90 For experimental purposes,
there is also a practical advantage in that the majority of cadences in these movements do not
involve a melodic suspension (i.e., the harmonic and melodic arrivals coincide).
For the prediction task, I combined all three movements into a single audio file, inserting
fifteen seconds of silence between successive movements (the order of the movements was
counterbalanced between participants). Participants listened to this file through the digital audio
software Audacity, recording their predictions for endings on a separate label track by pressing a
key on the computer keyboard. For subsequent data analysis, I then converted this label track
into a text file that listed each participant response as the time elapsed from the beginning of
the file.
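This conversion step could be sketched roughly as follows. The code is a hypothetical reconstruction, not the script used in the study; it assumes Audacity's tab-separated label export (start time, end time, and label text per line), with each keypress stored as a point label.

```python
def parse_label_track(text):
    """Parse an Audacity-style label-track export (tab-separated
    start-time, end-time, label per line) into a list of response
    times in seconds from the beginning of the audio file."""
    times = []
    for line in text.strip().splitlines():
        fields = line.split("\t")
        # A keypress is a point label, so start and end coincide;
        # the start time is the elapsed time of the response.
        times.append(float(fields[0]))
    return times

# Hypothetical excerpt of one participant's label track:
track = "12.403\t12.403\tM\n27.951\t27.951\tM\n44.120\t44.120\tM"
print(parse_label_track(track))  # [12.403, 27.951, 44.12]
```

Each resulting list of elapsed times can then be compared against the annotated ending points in the three movements.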
I followed the model used in Experiment 2 to create the excerpts for the rating task. In
each movement, I selected excerpts that concluded different formal units in each composition,
representing subphrase, phrase, and section endings. As in Experiment 2, I created these clips by
splicing the original audio file using Audacity. The clips varied in length from two to six
measures, where each clip began with the onset of a formal unit (subphrase or phrase) and
concluded with the release of the last sound of that formal unit. These clips were drawn from the
first iteration of a passage on the recording, with the exception of clips from the return of the
Minuet section. The clips were paired with a visual representation of the movement and were
heard either in the order in which they occur in the movement (visual condition) or in random
blocks (random condition).
Procedure
After the participants gave informed consent, they read and listened to instructions for the
first part of the study before completing two practice excerpts:
In the first part of the study you will listen to several pieces. While listening, try to
predict the moment at which a musical phrase ends. Your goal is to press control-m at
the exact moment the phrase is completed. The composer could surprise you, so it's okay
if you press prematurely. Just keep on listening and try to anticipate the next ending. You
will hear two practice excerpts; complete the practice and compare your answers with
the ones provided.

90 My hypermetrical interpretations and cadence analyses are annotated on the scores in Appendix B.
The two practice excerpts were chosen because they illustrated the nature of the task and trained
the participants to predict endings actively rather than react retrospectively to a phrase ending.
The first example, the first reprise from Mozart's String Quintet No. 4 in G minor, third
movement (K. 516), is a modulating contrasting period with phrases of different lengths. The
second phrase is longer due in part to an internal expansion caused by a deceptive cadence in m.
10. Most listeners who did the task correctly were initially tricked by this deceptive cadence,
although only some were also tricked the second time. Listeners heard this excerpt through
Audacity (the actual sound wave was hidden from view) and indicated their predictions on a
separate label track. After completing this first practice excerpt, participants compared their label
tracks to ones that indicated the cadences. Participants who did poorly or did not understand the
task repeated the task on this excerpt.

Example 7.1: Mozart's String Quintet No. 4 in G Minor (K. 516), third movement, mm. 1–13

The quick tempo and the possibility for multiple interpretations made the second excerpt,
Mozart's Sonata for Piano and Violin in B♭ major, third movement, mm. 1–16 (K. 454), a bit
more difficult than the previous practice excerpt. While I hear a parallel period with a HC in m. 8
and a PAC in m. 16, it is also possible to interpret HCs in both mm. 4 and 12, creating a double
period. For this reason, this excerpt was chosen to demonstrate to the participant that there could
be multiple "right" predictions in this task, and also that their interpretation of phrase endings
might change over time.91 Because this excerpt also has a suspension, subjects were instructed to
predict when the goal harmony would arrive, not when the dissonant melodic tone(s) would
resolve. As before, after completing the practice task, participants compared their results with my
responses; if they did poorly, they repeated the task. Any remaining questions were addressed
before the participants began the prediction task with the test stimuli.
Following a successful completion of the practice tasks, participants began the prediction
task on the three minuet and trio movements. Afterwards, participants rated clips from these
movements on a seven-point scale to indicate how complete the end of the clip sounded.92 The
presentation mode of this task, random or visual, varied according to the subject's assigned
condition. Following this rating task, participants filled out a questionnaire documenting their
familiarity with the compositions and their musical training.

91 As a side note, many participants only predicted endings at mm. 8, 12, and 16, pointing towards this
change. The first half of the second phrase (mm. 9–12) is the same as the previously heard first phrase, except for a
change in orchestration. If more predictable cadences correspond to stronger closes, this suggests that the V chords
in mm. 4 and 12 are less closed than the one in m. 8, which was only predictable in the second listening.

92 This task used the same directions as Experiment 2.

Example 7.2: Mozart's Sonata for Piano and Violin in B♭ Major (K. 454),
third movement, mm. 1–16

Results
This results section is divided into two parts. The first examines the results only from the
prediction task, specifically assessing the musical features that led listeners to predict the end of
a phrase successfully. As in Experiment 1, I created windows around the endings for this part of
the analysis. Some of these endings coincided with a cadential arrival, while others merely
demarcated subphrases (all windows are marked on the scores in Appendix B). These windows
began 500 ms before the arrival of the ending and lasted for 1500 ms after this point.93 None of
the windows overlapped. I measured response time from the onset of the last note, so responses
occurring before the last note received a negative response time while responses occurring after
the last note received a positive response time.

93 In the case of suspensions, the window began 500 ms before the beginning of the goal harmony.

The results show that listeners are more sensitive
to cadential cues than they are to hypermetric regularities: in general, listeners best predict the
tonic arrival in a PAC. The second section examines correlations between the participants'
ratings for closure and their responses from the prediction task, along with the correlation
between a listener's response time in the prediction task and his/her rating of closure. While the
data indicate that both predictability and response time influence the ratings, so do other
independent variables such as cadential closure within the clip and the overall length of the clip.
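The windowing and response-time coding described above might be implemented along the following lines. This is my own illustrative reconstruction (the function name and structure are assumptions, not the study's actual analysis code): a two-second window runs from 500 ms before the onset of an ending's final note to 1500 ms after it, and response time is signed relative to that onset.

```python
def code_response(ending_onset, response_time):
    """Return the signed response time (in seconds) if the response
    falls inside the two-second window around an ending, else None.
    The window spans 500 ms before the onset of the last note to
    1500 ms after it; a negative value means the participant
    responded before the last note sounded."""
    rt = response_time - ending_onset
    if -0.5 <= rt <= 1.5:
        return rt
    return None

# A response 200 ms before the final note's onset falls in the window:
print(round(code_response(30.0, 29.8), 3))  # -0.2
# A response outside the window is discarded:
print(code_response(30.0, 32.0))  # None
```

Under this coding, a prediction registers as a "hit" for an ending whenever the function returns a value rather than None.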
The first regression, which analyzes the data from the prediction task, examines whether
formal units that conclude either with a cadence or at a temporal distance of four or eight
measures from the end of the previous phrase are associated with an increased probability that a
participant will predict the end of a formal unit (see Table 7.1). The significant value for group
(in the second row) indicates that as musical expertise increases, so does the participant's
to predict the ends of phrases. Of the two independent feature variables, only the presence of a
cadence significantly predicts participant responses (participants are 1.8 times more likely to
respond when a cadence occurs). There is a significant interaction between musical expertise and
the presence of a cadence, where participants with more musical expertise are more likely to
respond at a cadential gesture (see Figure 7.2). The hypermeter variable examined whether
phrases that exhibit a regular 4-bar hypermeter are more predictable than phrases that have some
sort of phrase expansion. As seen in the bottom third of Table 7.1, there is no main effect for
ending points occurring 4 or 8 measures after the end of the previous formal unit. Evidently these
participants were able to predict endings based on the presence of a cadence and were not
necessarily influenced by hypermetrical regularities or irregularities.
Looking more closely at the main effect for cadence, I separated this variable into three
categories based on the type of cadence (PAC, HC and IAC) and ran an additional analysis with
these variables (the results from the mixed logit regression analysis are located in Table 7.2).
While there is not a significant main effect for the HC, there are significant main effects for both
the PAC and the IAC. Listeners are more than twice as likely to predict an ending when a tonic
chord concludes the phrase. The tonic arrival in both the PAC and IAC always follows a
dominant harmony, so listeners' expectations for the ending are presumably influenced by the
high transitional probability between V and I. In contrast, because the dominant harmony of the
HC is not always preceded by the same harmony, a listener may not be able to accurately predict
its arrival.
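The odds ratios and confidence intervals reported in Tables 7.1 and 7.2 follow directly from the logit coefficients and their standard errors: the odds ratio is exp(b), and a 95% Wald interval is exp(b ± 1.96·SE). As a sanity check (a standard calculation sketched here, not code from the study), the cadence row of Table 7.1 can be recomputed:

```python
import math

def odds_ratio(coef, se, z=1.96):
    """Convert a logit coefficient and its standard error into an
    odds ratio with a 95% Wald confidence interval."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

# The cadence coefficient and standard error from Table 7.1:
odds, lo, hi = odds_ratio(0.607246, 0.168288)
print(round(odds, 3), round(lo, 3), round(hi, 3))  # 1.835 1.32 2.553
```

These values match the table's reported odds ratio of 1.835 and confidence interval (1.320, 2.553).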
Table 7.1: Mixed Logit Regression Analysis: Cadence and Hypermeter

Fixed Effect94   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value   Odds Ratio   Confidence Interval
Intercept        -1.176341     0.274569         -4.284    9023           <0.001    0.308405     (0.180, 0.528)
Group             0.462805     0.114176          4.053      72           <0.001    1.588524     (1.265, 1.995)
Cadence           0.607246     0.168288          3.608    9023           <0.001    1.835371     (1.320, 2.553)
   Group          0.801740     0.081178          9.876    9023           <0.001    2.229417     (1.901, 2.614)
Hypermeter       -0.114407     0.161683         -0.708    9023            0.479    0.891895     (0.650, 1.224)
   Group         -0.094700     0.077748         -1.218    9023            0.223    0.909646     (0.781, 1.059)

94 As in the analyses in Chapter 5, the odds ratio in each row for every within-subject variable shows the
odds of a participant's response if the feature is present (with the exception of the first row, which is only needed for
the regression equation). The Group row for each within-subject variable shows the interaction between musical
expertise and the independent variable.

Figure 7.2: Interactions between Subject Group and the Presence of a Cadence

Table 7.2: Mixed Logit Regression Analysis: Cadence Types

Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value   Odds Ratio   Confidence Interval
Intercept      -1.298018     0.323561         -4.012    9022           <0.001    0.273072     (0.145, 0.515)
Group           0.551098     0.127846          4.311      72           <0.001    1.735157     (1.345, 2.239)
PAC             0.784639     0.197140          3.980    9022           <0.001    2.191616     (1.489, 3.225)
   Group        1.260553     0.103436         12.187    9022           <0.001    3.527371     (2.880, 4.320)
HC              0.109291     0.182932          0.597    9022            0.550    1.115487     (0.779, 1.597)
   Group        0.708266     0.085718          8.263    9022           <0.001    2.030468     (1.716, 2.402)
IAC             0.913797     0.221879          4.118    9022           <0.001    2.493774     (1.614, 3.852)
   Group        0.420757     0.104090          4.042    9022           <0.001    1.523114     (1.242, 1.868)

I am defining these cadences by their traditional harmonic paradigms, as explained in
Chapter 5; however, these movements challenge some of these traditional markers of cadence
and phrase. Phrases are traditionally defined as having some sort of harmonic motion, with the
cadence representing the culmination of this motion. Several times in these movements, there is
no harmonic motion leading into the point of ending. One such example occurs in m. 16 in the
F-major String Quartet (K. 168). Here, the B section ends on a V chord that arrives in m. 15;
however, the end of the phrase is not until m. 16, so I coded the HC as occurring at that point
(refer to the annotated scores in Appendix B). Also in this movement, a HC that arrives in m. 36
is followed by a post-cadential extension that repeats the cadential gesture. In this case, the I-V
gesture in m. 40 is not coded as a HC, despite the hypermetric four-measure groupings, because
it merely extends the ending that arrived in m. 36. Four measures later, in m. 44, the trio
concludes with the same type of cadential gesture from m. 15, but this one extends the tonic for
two measures. Even though the goal-directed motion concludes in m. 43, hypermetrical
expectations project an ending at m. 44, which is where the trio concludes.
Most of the cadential types are clear in these movements, but there are a few moments of
possible cadential ambiguity. In his 2010 talk at the Annual Meeting of the Society for Music
Theory, Burstein effectively demonstrated that distinguishing between a HC and an elided PAC
could be difficult, especially when there is continuous motion from the dominant chord of the
HC to the tonic beginning of the next phrase. Measures 54–55 of the G-major Quartet represent
one such case of this type of cadential ambiguity: despite the convincing arrival on the dominant
in m. 54, listeners could interpret the cello's downward motion into the tonic pitch on the
downbeat of m. 55 as the ending instead. A similar situation occurs in mm. 24–25 of the same
quartet. Here an arrival on the dominant in m. 24 marks the end of the B section, but at this point
the second violin initiates a gesture that leads into the return of the A section. It could be possible
that without a break in the sound, listeners would not experience arrival closure in m. 24, but
rather retrospective closure when the opening theme recurs. This ambiguity surrounding the HC
may further explain the lack of a main effect for this cadence type.
Along with the main effect for the PAC and the IAC, musical expertise also affects the
results of the prediction task. Overall, as musical expertise increases, participants are more likely
to make their predictions within the two-second windows around the cadence points.
Specifically, for all three cadence types there is an interaction between the cadence type and
musical expertise (see Figure 7.3, which graphs the ANOVA estimated means for each
participant group). Compared to the non-musicians, the musician groups have a larger change in
their responses when there is a PAC. For the HC, only the graduate musicians are more likely to
respond; all participants respond to the IAC, but to different extents.
Response time data for the points at which listeners predicted an ending were also
analyzed. A response time less than zero signifies that the participant responded prior to the
onset of the last note of the formal unit, while a response time greater than zero signifies that the
participant pressed after the onset of the last note. Most of the data are greater than zero, which
could reflect the time it takes for a participant physically to respond to a prediction. It could also
reflect the difficulty of the prediction task, where participants may be responding retrospectively
to an ending point despite instructions to predict endings. Even so, response times can still
measure the fulfillment of expectations, given that faster response times should correlate with
expected musical events.


Figure 7.3: Interactions between Subject Group and Cadence Type


Since there is no main effect for hypermetric regularity, these analyses will only consider
the influence of cadences on the timing of the prediction. As seen in Table 7.3, as musical
expertise increases, the response time decreases (note the negative coefficient for group in the
third row). This result suggests that more experienced musicians better predict the approach of
an ending. The positive coefficient for the presence of a cadence is surprising, because cadences
represent highly predictable patterns in music. The more predictable a pattern, the faster listeners
should respond to it. The interaction reveals that more experienced musicians respond faster to
cadential patterns.
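Reading the fixed effects in Table 7.3 as a regression equation (with group coded 1 = non-musicians, 2 = undergraduate musicians, 3 = graduate musicians, and cadence as a 0/1 indicator whose terms enter only when a cadence is present), the predicted response times can be illustrated as follows. This is my own worked reconstruction of the equation, not the study's code:

```python
def predicted_rt(group, cadence):
    """Predicted response time (seconds) from the Table 7.3 fixed
    effects: intercept + group effect + cadence effect +
    group-by-cadence interaction.  group: 1 = non-musicians,
    2 = undergraduate musicians, 3 = graduate musicians;
    cadence: 1 if the window contains a cadence, else 0."""
    return (0.821911
            - 0.126699 * group
            + (0.114577 - 0.048349 * group) * cadence)

# Non-musicians vs. graduate musicians responding at a cadence:
print(round(predicted_rt(1, 1), 3))  # 0.761
print(round(predicted_rt(3, 1), 3))  # 0.411
```

The worked values show the pattern described in the text: the more experienced the group, the shorter the predicted response time at a cadence.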


Table 7.3: Mixed Models Regression Analysis: Response Time and Cadences

Fixed Effect95   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept         0.821911     0.072698         11.306    3988           <0.001
Group            -0.126699     0.031078         -4.077      72           <0.001
Cadence           0.114577     0.038407          2.983    3988            0.003
   Group         -0.048349     0.017685         -2.734    3988            0.006

Looking at specific cadence types (Table 7.4), there is a main effect for all three cadence
types on the response time. A smaller coefficient corresponds with a smaller increase in the
response time for that cadence. Participants respond faster to a PAC than to either a HC or an
IAC. For windows in which participants responded to a HC, their response time was faster than
for an IAC. However, it is important to remember that this analysis uses a slightly different data
set, only using the points where subjects responded. This may have removed then more
ambiguous cadences, leaving those that were especially predictable. For both the PAC and the
IAC, there is a subject group interaction indicating that participants with more musical
experience responded more quickly.
Table 7.4: Mixed Models Regression Analysis: Response Time and Cadence Types

Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept       0.821587     0.072173         11.384    3986           <0.001
Group          -0.125998     0.031354         -4.019      72           <0.001
PAC             0.104156     0.043245          2.408    3986            0.016
   Group       -0.071626     0.019831         -3.612    3986           <0.001
HC              0.135055     0.046483          2.905    3986            0.004
   Group       -0.035412     0.021012         -1.685    3986            0.092
IAC             0.264896     0.052113          5.083    3986           <0.001
   Group       -0.095702     0.023666         -4.044    3986           <0.001

95 This table (and Table 7.4) is similar to the corresponding one found in Chapter 6 (Table 6.2). In Table
7.3, there is a significant main effect for cadence: when a participant is predicting a cadential arrival, the coefficient
for that variable is factored into the regression equation. The first group variable (a three-level variable where
1 = non-musicians, 2 = undergraduate musicians, and 3 = graduate musicians) is always present in the equation
whether or not a cadence is present, but the group variable under cadence is only factored into the equation when a
cadence occurs.

Data from the second half of the study reveal no main effect for the rating condition,
whether visual or random, on the ratings of closure, nor did the condition factor into any
interaction. There are several possible interpretations: 1) participants in the visual condition may
have disregarded the visual information; 2) participants in the random condition may have been
able to place the clip correctly within the formal hierarchy, given that they heard each movement
in its entirety prior to the rating task (which seems improbable due to memory constraints); or 3)
the visual information may have corroborated the rating that would have occurred even without
it. Both the first and last possibilities support Meyer's statement that the form of a piece emerges
from its hierarchy of closes (1973). Because condition did not influence the rating results, it was
not included in the data analysis.
The remaining independent variables (whether the participant anticipated a particular
ending in the prediction task, the length of the excerpt, and the presence of a cadence) all
significantly influence the rating task (see Table 7.5). Both the predicted and the cadence
variables are binary (1 = the participant predicted that particular ending in the previous task;
1 = presence of a cadence), while the length variable is coded in seconds. For each of these three
variables, an increase corresponds to a significant increase in the rating of closure, with the
presence of a cadence having the largest effect. Before taking into account any of the
independent variables, there is no significant difference between the ratings made by subjects
with different levels of expertise, but there are interactions between subject group and predicted
ends as well as between subject group and the presence of a cadence. Participants with more
musical experience consistently rate the clips higher when these variables are present.
Table 7.5: Mixed Models Regression Analysis: Ratings

Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept       1.851092     0.310883          5.954    3643           <0.001
Group           0.054959     0.142029          0.387      72            0.700
Predicted       0.418119     0.157657          2.652    3643            0.008
   Group        0.270692     0.076986          3.516    3643           <0.001
Length          0.288156     0.031036          9.285    3643           <0.001
   Group       -0.022383     0.014364         -1.558    3643            0.119
Cadence         1.296993     0.162001          8.006    3643           <0.001
   Group        0.191809     0.078961          2.429    3643            0.015

The final analysis uses only the rating data from clips in which the participant
successfully predicted the ending, to see if there is a correlation between ratings and response
time. A negative coefficient for the response time variable in Table 7.6 indicates that as response
times increase, the ratings of closure decrease. While there is no main effect for response time,
there is an interaction: as musical expertise increases, subjects are more likely to give the clips
with a faster response time in the prediction task a higher rating.
Table 7.6: Mixed Models Regression Analysis: Ratings and Response Time

Fixed Effect      Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept          5.044658     0.297488         16.958    1875           <0.001
Group              0.186685     0.128796          1.449      72            0.152
Response Time     -0.282416     0.311667         -0.906    1875            0.365
   Group          -0.405425     0.152425         -2.660    1875            0.008

Discussion
Overall, the data support the hypothesis that anticipated musical endings evoke a feeling
of closure. The data illustrate a correlation between a listeners ability to predict an ending as the
composition unfolds and that listeners subsequent rating of closure for that particular ending.
Further, cadences that are traditionally considered more closed were better predicted in the first
task and had faster response times (in other words, participants responded more consistently and
quickly to an anticipated PAC than to the other cadences).
As in the previous studies, musical experience influenced the participants' results.
Here, the main effects were magnified for the participants with more musical experience: they
successfully predicted more endings, and they predicted all of the cadence types better than
participants with less experience did. Their ability to anticipate endings more quickly and
accurately in the prediction task suggests that participants with more musical experience drew
from knowledge structures supported by many more exemplars of common ending paradigms in
this style. These experienced participants also showed a stronger correlation between their ratings
and their data from the prediction task: both the endings they predicted and their faster response
times correlate with higher ratings for closure.
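The mixed-models analyses reported in this chapter can be illustrated with a minimal, self-contained sketch using Python's statsmodels. Everything here is invented for illustration: the variable names (rating, rt, group, subject), the simulated effect sizes, and the data itself; this is not the analysis script or data from Experiment 3. The sketch fits a random intercept per subject and fixed effects for group, response time, and their interaction, mirroring the structure of Table 7.6.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate data: faster responses earn higher closure ratings,
# and the effect is stronger for the more experienced group.
rng = np.random.default_rng(0)
rows = []
for s in range(40):                       # 40 hypothetical subjects
    group = s % 2                         # 0 = less experience, 1 = more
    subj_int = rng.normal(0, 0.5)         # random intercept per subject
    for _ in range(24):                   # 24 clips each
        rt = rng.uniform(0.2, 2.0)        # response time in seconds
        rating = 5.0 + subj_int - (0.3 + 0.4 * group) * rt + rng.normal(0, 0.8)
        rows.append({"subject": s, "group": group, "rt": rt, "rating": rating})
df = pd.DataFrame(rows)

# Mixed-effects regression: random intercept for subject,
# fixed effects for group, response time, and their interaction.
fit = smf.mixedlm("rating ~ group * rt", df, groups=df["subject"]).fit()
print(fit.summary())
```

As in Table 7.6, a negative coefficient on the group-by-response-time interaction (`group:rt`) indicates that faster predictions accompany higher ratings as expertise increases.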
179

Surprisingly, the rating condition had no influence on the rating task: participants in
the visual and random rating conditions showed no difference in their clip ratings. These data
support Meyer's assertion that form emerges from a hierarchy of closes more strongly than they
support the idea that top-down knowledge of formal structure influences the perception of closure.
Appendix B shows windows for each movement, the percentage of participants who indicated an
ending in each window, and the mean rating in each window. While the data do not always match
the formal structure exactly, the endings that demarcate the conclusion of a formal section were
often better predicted and subsequently given higher ratings.
Event Segmentation Theory posits that an increase in transient prediction error creates
a perceptual boundary. While I was unable to measure directly any increase in transient
prediction error following a cadence, it is safe to assume that a listener's ability to predict
upcoming musical material following a cadence is lower than his or her ability to predict the
cadential arrival. The correlation between the prediction and rating tasks further suggests that
larger increases in prediction error result in hierarchically significant musical boundaries,
eliciting a stronger feeling of closure.

180

CHAPTER 8
CLOSURE
Four characteristics of closure inferred from the musicological literature form the
foundation for my definition of closure and my cognitive model for the perception of closure
outlined in this dissertation:
1) closure segments a continuous musical stream into discrete events
2) closure is stylistically dependent
3) closure is a completion of a goal-directed process resulting in an arrival of relative
stability or rest
4) the strength of closure depends on many musical variables and plays an integral role in
the hierarchic construction of a composition
While my own definition of closure, the anticipated end to a musical segment, responds to the
concept of closure as used in musical analysis, my methodology is removed from the music
itself as an object of study, focusing instead on the perception of closure. In other words,
instead of examining closural processes in a particular musical style or a specific composer's
corpus, the cognitive model for the perception of closure (developed in Chapters 3 and 4 and
supported by the three experiments in Chapters 5, 6, and 7) uses recent research in event
segmentation and musical expectation to explore how and why a listener perceives closure.
Corroborating previous studies examining event segmentation, Experiment 1 established
the possibility of a shared cognitive process in musical segmentation. Specifically, the results
from this study showed that subjects were highly consistent in their segmentation responses, both
within an individual subject and between subjects, and that subjects perceived event structure
hierarchically, where smaller musical segments combine to form larger segments. This study also
revealed that specific musical features can predict a listener's perception of an ending, and this
correlation grows stronger with musical training. Many times, a perceptual boundary occurred at
the end of a schematic unit or following a discontinuity in the musical surface, corresponding
with the arrival and change features in Experiment 1. Segments that terminate with an arrival
feature would presumably sound more closed (resulting in anticipatory or arrival closure) than
would segments ending with a change feature (resulting in retrospective closure).

181

Results for segmentation consistency and nested lower levels did not significantly vary
between the participants who segmented Mozart and the participants who segmented Bartók, but
there was a difference in the types of features that signaled an end in these two styles. This
suggests that the cognitive mechanism of segmentation remains constant between styles, while
the specific features that signal closure may change. Subjects who segmented Mozart tended to
rely on arrival features, while subjects who segmented Bartók tended to rely on change features.
The arrival features in Mozart represent well-learned endings from the common-practice style,
while the arrival features that predicted endings in Bartók tended to be peculiar to these particular
compositions. Because participants who segmented Bartók could not rely on previously learned
endings representing a wide variety of compositions, they tended to rely more on surface
changes during their task.
Learning ending gestures for a style, or even for a specific piece, is an unconscious
process dependent on listener experience. According to Hintzman's multiple trace theory (1986),
every encounter with a stimulus creates a trace in long-term memory (LTM). Of course, the
quality of the information stored in the trace is contingent on the listener's attention to the
stimulus and the type of encounter with the stimulus. The number of traces in LTM that
anticipate the conclusion of a particular ending gesture determines whether a listener experiences
closure. For instance, while a listener may have veridical expectations for an ending in a
particular composition, that listener may also have many more traces in LTM for continuation at
that point, lessening the sense of closure.
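A minimal sketch can make this trace-based account concrete. The code below implements the echo-intensity computation at the heart of Hintzman's MINERVA 2 model (the similarity of a probe to each stored trace, cubed and summed); the feature encoding, the noise level, and the names cadence and continuation are illustrative assumptions, not a fitted model of any listener.

```python
import numpy as np

def echo_intensity(probe, traces):
    """MINERVA 2: each trace is activated in proportion to the cube of its
    similarity to the probe; echo intensity sums activation over all traces."""
    probe = np.asarray(probe, dtype=float)
    similarities = traces @ probe / probe.size   # roughly in [-1, 1]
    return float(np.sum(similarities ** 3))      # cubing sharpens close matches

rng = np.random.default_rng(1)
n = 20                                           # features per trace
cadence = rng.choice([-1.0, 1.0], n)             # a hypothetical ending gesture
continuation = rng.choice([-1.0, 1.0], n)        # a hypothetical mid-phrase event

# A listener whose stored traces mostly encode the cadential gesture...
experienced = np.vstack([cadence + rng.normal(0, 0.3, n) for _ in range(50)])
# ...versus one whose traces at that point mostly encode continuation.
novice = np.vstack([continuation + rng.normal(0, 0.3, n) for _ in range(50)])

print(echo_intensity(cadence, experienced), echo_intensity(cadence, novice))
```

On this toy encoding, the cadence probe produces a far stronger echo from the experienced trace store, mirroring the claim that the number of traces anticipating an ending gesture governs the strength of closure.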
Experiment 2 used a learning task to explore whether a listener can associate the arrival
features of a particular compositional style with the feeling of closure. The aim was to see
whether a brief period of listening to excerpts by either Mozart or Bartók would influence
subsequent ratings of closure for similar endings. While there was no interaction between the
composer heard during the exposure period and a listener's rating of that composer (perhaps
because the exposure period was too brief, or because I could not control the transitional
probabilities between sound elements), participants with more musical training rated all cadential
excerpts higher than did those participants with less training. Given that more experienced
musicians presumably have had more exposure to cadential gestures in both styles, these higher
ratings support the learned association between closure and a particular musical gesture. Further,
graduate music students who were exposed to Bartók tended to be more sensitive than other
182

subject groups to the learning task, suggesting that their increased training allowed them to
assimilate cadential cues more quickly from a less-familiar style.
Our ability to pick out transitional probabilities also supports the perception of goal
direction towards an ending. The feeling of moving towards a musical goal is an artifact of being
able to predict subsequent events in a phrase with increasing certainty. Studies have revealed
expectations for acoustic continuity as well as expectations for learned musical patterns. Event
Segmentation Theory is an expectation-based model in which an unexpected event (e.g., a
discontinuity in the musical surface) or the expected completion of an event causes a perceptual
boundary. According to my definition of closure, not every end of every segment produces a
feeling of finality. Anticipatory and arrival closure capture the experience of an anticipated
ending, while retrospective closure represents a failure to predict the moment of closure. In
retrospective closure, a discontinuity signals a beginning, but the preceding ending wasn't
accurately predicted or recognized at the moment it occurred.
Experiment 3 asked listeners to predict endings in three Mozart minuet movements. Most
of their responses coincided with cadences, especially authentic cadences, which represent
highly predictable endings in the common-practice style. Data from a subsequent rating task
showed that listeners rated the endings they had predicted in the previous task as more closed
than other endings from the same composition. These data suggest that the strength of closure is
directly related to the predictability of an ending; larger structural boundaries were generally
more predictable and received higher ratings, supporting Meyer's argument that form emerges
from a hierarchy of closes.
The degree to which closure permeates the musicological discourse is a testament to its
analytical and aesthetic importance and speaks to an essential characteristic of the music
listening experience. Despite stylistically varied markers of closure, I posit that an innate
cognitive mechanism engenders the perception of closure. EST provides a model for musical
segmentation and closure that transcends stylistic boundaries and captures some of our musical
intuitions about closure: that closure is contingent upon musical expectation and prompts a
hierarchical understanding of a composition. The perception of closure is thus a product of an
ongoing cognitive process that segments our continuous life experiences into discrete events.

183

Two different agents of closure can be inferred from the language used in musicological
discourse: the music (referring to a compositional process) or the mind (referring to a
psychological experience). While these perspectives may seem irreconcilable, the four
characteristics of closure can serve as a point of intersection, leaving only differences of
language. Because the perception of closure (or at least the evocation of closure) shapes musical
analyses of all kinds, we should look past the differences in language and recognize the
underlying role of expectation (whether musical or disciplinary) in musical analysis. By
considering closure as a result of musical expectation, we can better reevaluate how we use the
concept of closure to shape our analysis of music.
While this project builds a strong case for the role of expectation in both the creation of
musical segments and the perception of closure, work on this topic remains to be done. My own
studies were large in scope, deriving their stimuli from actual musical compositions and placing
no limit on the number of segmentation/prediction responses a subject could make. I plan to
reanalyze some of the data collected because there are additional ways to examine it that I didn't
pursue in this dissertation. For instance, participants in Experiment 1 could be divided into
subject groups based on how often they indicated a boundary during the segmentation task. From
a pragmatic perspective, this would ensure that subjects segmenting on the same hierarchical
level would be grouped together, and preferred segment length might distinguish musically
experienced listeners better than degree programs did. In Experiment 2, I did not discuss the data
collected during the exposure period (while listeners heard music in the exposure period, they
indicated endings using a computer keyboard). While there was no interaction effect between
exposure composer and rated composer, there might be a correlation between the endings
identified by a participant in the exposure period and that participant's ratings of closure in the
subsequent task. In all of the studies, additional musical features could be examined for their
effect on participants' segmentation/prediction responses.
Additionally, I plan to perform more focused and controlled studies that I hope will
produce more robust results in support of these theories. For instance, to replicate the failed
learning task in Experiment 2, I could create a statistical learning task in which I compose the
material in the exposure period, controlling for the transitional probabilities between pitches
(using a non-tonal style). Every pitch could be presented at a steady rate, but cadential figures
184

(which would be a composed pitch pattern particular to this task) would be followed by rests. I
would then compare listeners' ratings of these cadential pitch patterns to their ratings of pitch
patterns drawn from the beginnings and middles of segments.
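A sketch of such a stimulus generator appears below. The pitch set, the Dirichlet-sampled transition matrix, the segment length, and the cadential figure are all placeholders invented for illustration; the point is only that every transition probability is under the composer's control and every cadential figure is followed by a rest.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pitches = 8                                    # a small non-tonal pitch set
# Row-stochastic matrix: fully controlled transitional probabilities.
P = rng.dirichlet(np.ones(n_pitches), size=n_pitches)

CADENCE = [3, 6, 1]                              # composed ending figure (placeholder)

def compose_stream(n_segments, seg_len=12):
    """Each segment: a Markov-generated pitch run, then the fixed cadential
    figure, then a rest (None); all events at a notionally steady rate."""
    stream = []
    for _ in range(n_segments):
        pitch = int(rng.integers(n_pitches))
        for _ in range(seg_len):
            stream.append(pitch)
            pitch = int(rng.choice(n_pitches, p=P[pitch]))
        stream.extend(CADENCE)
        stream.append(None)                      # rest marks the segment end
    return stream

stream = compose_stream(n_segments=4)
print(stream[:17])
```

Ratings of the cadential pitch patterns could then be compared against ratings of patterns sampled from segment beginnings and middles, as proposed above.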
Two avenues for future research include examining the influence of previous knowledge
structures on the formation of new closural expectations and the influence of non-compositional
features on the perception of closure. This expectation-based theory of closure posits that
through statistical learning listeners associate ending gestures with closure even in an unfamiliar
style, but the extent to which already learned closural gestures may influence this process is
unknown. While a learning task similar to Experiment 2 could explore this issue, such an
experiment could also be expanded into a cross-cultural study specifically examining the
influence that learned closural gestures in one style may have on the perception of closure in a
different style. With regard to non-compositional features influencing the perception of closure,
the data from Experiment 2 showed that the acoustic properties of the final pitch of a segment
might influence the perception of closure. A study that specifically manipulated the final sound
of a segment could reveal how performance aspects may shape the perceived structure of a
composition. Another avenue for further research is the influence of bodily gestures, including
those that necessarily accompany a live performance, on the perception of formal structure and
closure.
This project differs from previous theoretical and cognitive studies regarding closure by
casting both the creation of discrete segments from an ongoing musical stream and the
perception of closure at the end of some of these segments as contingent upon a listener's
musical expectations. While much work remains to be done, the theoretical literature, previous
cognitive studies in both music and event segmentation, and the three experiments presented in
this dissertation support the connection between segmentation and expectation and their
influence on the perception of closure.

185

APPENDIX A
SEGMENTATION RESPONSES IN EXPERIMENT 1
The figures in this appendix plot all of the responses made over the course of each
movement in the Fine 2 and the Coarse 2 trials. Points on the solid red line illustrate the total
number of segmentation responses made during each beat in the Fine 2 trial; points on the dashed
purple line represent the total number of segmentation responses made during each beat in the
Coarse 2 trial. These counts represent the responses made by all three subject groups: nonmusicians, undergraduate musicians, and graduate musicians. Relatively high numbers of
responses are labeled on each figure with the measure and beat number of their occurrence (where
the first number in the label is the measure number and the second number is the beat within that
measure). Figures do not illustrate the same number of measures because they are divided by
formal sections within each movement. Figures A.1 and A.2 represent the responses made by the
participants in Experiment 1a (n = 32), and Figures A.3 and A.4 represent the responses made by
participants in Experiment 1b (n = 33).

186

Figure A.1: Bartk, String Quartet No. 4, third movement


187

Figure A.1 (continued): Bartk, String Quartet No. 4, third movement


188

Figure A.1 (continued): Bartk, String Quartet No. 4, third movement


189

Figure A.2: Bartk, String Quartet No. 4, fifth movement


190

Figure A.2 (continued): Bartk, String Quartet No. 4, fifth movement


191

Figure A.2 (continued): Bartk, String Quartet No. 4, fifth movement


192

Figure A.2 (continued): Bartk, String Quartet No. 4, fifth movement


193

Figure A.2 (continued): Bartk, String Quartet No. 4, fifth movement


194

Figure A.3: Mozart, String Quartet No. 19, fourth movement


195

Figure A.3 (continued): Mozart, String Quartet No. 19, fourth movement
196

Figure A.3 (continued): Mozart, String Quartet No. 19, fourth movement
197

Figure A.3 (continued): Mozart, String Quartet No. 19, fourth movement
198

Figure A.4: Mozart, String Quartet No. 21, second movement


199

Figure A.4 (continued): Mozart, String Quartet No. 21, second movement
200

APPENDIX B
ANNOTATED SCORES FOR EXPERIMENT 3
This appendix includes all three movements used in Experiment 3, including some
annotations: cadences are marked above each system, the hypermeter is notated between the
violin 2 and viola parts, and vertical lines through the score signify the points I used in data
analysis for the prediction task. Under each system, I included the percentage of participants who
predicted an ending at various points (on the first listening) as well as the mean rating. Not every
point from the prediction task was included in the rating task due to time constraints.

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.1: Mozart, Quartet No. 3 in G Major, K. 156, third movement

201

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.1 (continued): Mozart, Quartet No. 3 in G Major, K. 156, third movement

202

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.1 (continued): Mozart, Quartet No. 3 in G Major, K. 156, third movement

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.2: Mozart, String Quartet No. 8 in F Major, K. 168, third movement

203

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.2 (continued): Mozart, String Quartet No. 8 in F Major, K. 168, third movement

204

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.3: Mozart, String Quartet No. 13 in D Minor, K. 173, third movement96

96. There is an incorrect note in m. 8: the cello performs a B♭ instead of the notated C3.

205

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.3 (continued): Mozart, String Quartet No. 13 in D Minor, K. 173, third movement

206

[Annotated score excerpt: cadence labels with prediction percentages and mean closure ratings]

Example B.3 (continued): Mozart, String Quartet No. 13 in D Minor, K. 173, third movement

207

APPENDIX C
COPYRIGHT PERMISSION LETTERS

208

209

210

APPENDIX D
IRB APPROVAL LETTER AND
INFORMED CONSENT LETTER
Office of the Vice President For Research
Human Subjects Committee
Tallahassee, Florida 32306-2742
(850) 644-8673 FAX (850) 644-4392
APPROVAL MEMORANDUM
Date: 6/23/2010
To: Crystal Peebles
Address:
Dept.: MUSIC SCHOOL
From: Thomas L. Jacobson, Chair
Re:
Use of Human Subjects in Research
Listener perception of segmentation and closure in music
The application that you submitted to this office in regard to the use of human subjects in the
proposal referenced above have been reviewed by the Secretary, the Chair, and two members of
the Human Subjects Committee. Your project is determined to be Expedited per 45 CFR
46.110(7) and has been approved by an expedited review process.
The Human Subjects Committee has not evaluated your proposal for scientific merit, except to
weigh the risk to the human participants and the aspects of the proposal related to potential risk
and benefit. This approval does not replace any departmental or other approvals, which may be
required.
If you submitted a proposed consent form with your application, the approved stamped consent
form is attached to this approval notice. Only the stamped version of the consent form may be
used in recruiting research subjects.
If the project has not been completed by 6/22/2011 you must request a renewal of approval for
continuation of the project. As a courtesy, a renewal notice will be sent to you prior to your
expiration date; however, it is your responsibility as the Principal Investigator to timely request
renewal of your approval from the Committee.
You are advised that any change in protocol for this project must be reviewed and approved by
211

the Committee prior to implementation of the proposed change in the protocol. A protocol
change/amendment form is required to be submitted for approval by the Committee. In addition,
federal regulations require that the Principal Investigator promptly report, in writing any
unanticipated problems or adverse events involving risks to research subjects or others.
By copy of this memorandum, the Chair of your department and/or your major professor is
reminded that he/she is responsible for being informed concerning research projects involving
human subjects in the department, and should review protocols as often as needed to insure that
the project is being conducted in compliance with our institution and with DHHS regulations.
This institution has an Assurance on file with the Office for Human Research Protection. The
Assurance Number is IRB00000446.
Cc: Nancy Rogers, Advisor
HSC No. 2010.4328

212

Office of the Vice President For Research


Human Subjects Committee
Tallahassee, Florida 32306-2742
(850) 644-8673 FAX (850) 644-4392
RE-APPROVAL MEMORANDUM
Date: 5/4/2011
To: Crystal Peebles
Address:
Dept.: MUSIC SCHOOL
From: Thomas L. Jacobson, Chair
Re:
Re-approval of Use of Human subjects in Research
Listener perception of segmentation and closure in music
Your request to continue the research project listed above involving human subjects has been
approved by the Human Subjects Committee. If your project has not been completed by
5/1/2012, you must request a renewal of approval for continuation of the project. As a courtesy, a
renewal notice will be sent to you prior to your expiration date; however, it is your responsibility
as the Principal Investigator to timely request renewal of your approval from the committee.
If you submitted a proposed consent form with your renewal request, the approved stamped
consent form is attached to this re-approval notice. Only the stamped version of the consent
form may be used in recruiting of research subjects. You are reminded that any change in
protocol for this project must be reviewed and approved by the Committee prior to
implementation of the proposed change in the protocol. A protocol change/amendment form is
required to be submitted for approval by the Committee. In addition, federal regulations require
that the Principal Investigator promptly report in writing, any unanticipated problems or adverse
events involving risks to research subjects or others.
By copy of this memorandum, the Chair of your department and/or your major professor are
reminded of their responsibility for being informed concerning research projects involving
human subjects in their department. They are advised to review the protocols as often as
necessary to insure that the project is being conducted in compliance with our institution and
with DHHS regulations.
Cc: Nancy Rogers, Advisor
HSC No. 2011.6316

213

214

REFERENCES
Aarden, Bret. 2003. "Dynamic Melodic Expectancy." Ph.D. Diss., Ohio State University.
Agawu, Victor Kofi. 1987. "Concepts of Closure and Chopin's Opus 28." Music Theory
Spectrum 9: 1–17.
Allanbrook, Wye Jamison. 1994. "Mozart's Tunes and the Comedy of Closure." In On Mozart,
edited by James M. Morris, 169–89. Cambridge: Woodrow Wilson Center Press and the
Press Syndicate of the University of Cambridge.
Anson-Cartwright, Mark. 2007. "Concepts of Closure in Tonal Music: A Critical Study." Theory
and Practice 32: 1–17.
Baird, Jodie A. and Dare A. Baldwin. 2001. "Making Sense of Human Behavior: Action
Parsing and Intentional Inference." In Intentions and Intentionality: Foundations of
Social Cognition, edited by Bertram F. Malle, Louis J. Moses, and Dare A. Baldwin,
193–206. Cambridge: The MIT Press.
Baker, Dorothy Zayatz. 2003. "Aaron Copland's Twelve Poems of Emily Dickinson: A Reading
of Dissonance and Harmony." The Emily Dickinson Journal 12: 1–24.
Baldwin, Dare, Annika Andersson, Jenny Saffran, and Meredith Meyer. 2008. "Segmenting
Dynamic Human Action via Statistical Structure." Cognition 106 (3): 1382–407.
Bharucha, Jamshed Jay and Carol L. Krumhansl. 1983. "The Representation of Harmonic
Structure in Music: Hierarchies of Stability as a Function of Context." Cognition 13:
63–102.
Bharucha, Jamshed Jay and Keiko Stoeckig. 1986. "Reaction Time and Musical Expectancy:
Priming Chords." Journal of Experimental Psychology: Human Perception and
Performance 12 (4): 403–10.
Brower, Candace. 2000. "A Cognitive Theory of Musical Meaning." Journal of Music Theory
44 (2): 323–79.
Bryden, Kristy A. 2001. "Musical Conclusions: Exploring Closural Processes in Five Late
Twentieth-Century Chamber Works." Ph.D. Diss., University of Nebraska.
Burstein, Poundie. 2010. "Half, Full or In Between? Distinguishing Between Half and Authentic
Cadences." Paper presented at the Annual Meeting of the Society for Music Theory,
Indianapolis, 4–7 November.
Byros, Vasileios (Vasili). 2009. "Foundations of Tonality as Situated Cognition, 1730–1830: An
Enquiry into the Culture and Cognition of Eighteenth-Century Tonality with Beethoven's
Eroica Symphony as a Case Study." Ph.D. Diss., Yale University.
Cadwallader, Allen and David Gagné. 2006. Analysis of Tonal Music: A Schenkerian Approach,
2nd ed. New York: Oxford University Press.
Caplin, William E. 2004. "The Classical Cadence: Conceptions and Misconceptions." Journal of
the American Musicological Society 57 (1): 51–117.

215

Carlsen, James C. 1981. "Some Factors which Influence Melodic Expectancy."
Psychomusicology 1: 12–29.
Cherlin, Michael. 1991. "Thoughts on Poetry and Music, on Rhythms in Emily Dickinson's 'The
World Feels Dusty' and Aaron Copland's Setting of It." Intégral 5: 55–75.
Clarke, Eric F. 2001. "Meaning and the Specification of Motion in Music." Musicae Scientiae:
The Journal of the European Society for the Cognitive Sciences of Music 5 (2): 213–34.
Clendinning, Jane Piper and Elizabeth West Marvin. The Musician's Guide to Theory and
Analysis. New York: W.W. Norton & Company.
Clifford, Robert. 2005. "Perennial Questions: Atonal Closure: Process, Completion, and
Balance." Tempo: A Quarterly Review of Modern Music 59 (234): 29–33.
Cook, Nicholas. 1987. "The Perception of Large-Scale Tonal Closure." Music Perception 5 (2):
197–206.
Cuddy, Lola L. and Carol A. Lunney. 1995. "Expectancies Generated by Melodic Intervals:
Perceptual Judgments of Continuity." Perception and Psychophysics 57: 451–62.
Deliège, Irène. 1987. "Grouping Conditions in Listening to Music: An Approach to Lerdahl and
Jackendoff's Grouping Preference Rules." Music Perception 4 (4): 325–60.
———. 2006. "Emergence, Anticipation, and Schematization Processes in Listening to a
Piece of Music: A Re-Reading of the Cue Abstraction Model." In New Directions in
Aesthetics, Creativity, and the Arts, edited by Paul Locher, Colin Martindale, and Leonid
Dorfman, 153–73. Amityville, NY: Baywood Publishing Company.
Deliège, Irène, Marc Mélen, Diana Stammers, and Ian Cross. "Musical Schemata in Real-Time
Listening to a Piece of Music." Music Perception 14 (2): 117–59.
Deutsch, Diana. 1991. "Pitch Proximity in the Grouping of Simultaneous Tones." Music
Perception 9 (2): 185–98.
Eberlein, Roland and Jobst Peter Fricke. 1992. Kadenzwahrnehmung und Kadenzgeschichte: Ein
Beitrag zu einer Grammatik der Musik. Frankfurt am Main: Peter Lang.
Edwards, George. 1991. "The Nonsense of an Ending: Closure in Haydn's String Quartets." The
Musical Quarterly 75 (3): 227–54.
Forrest, David. 2010. "Prolongation in the Choral Music of Benjamin Britten." Music Theory
Spectrum 32 (1): 1–25.
Gauldin, Robert. 1985. A Practical Approach to Sixteenth-Century Counterpoint. Long Grove,
Illinois: Waveland Press.
———. 1988. A Practical Approach to Eighteenth-Century Counterpoint. Prospect Heights,
Illinois: Waveland Press.
———. 2004. Harmonic Practice in Tonal Music, 2nd ed. New York: W.W. Norton & Company.
Gjerdingen, Robert O. 1988. A Classic Turn of Phrase: Music and the Psychology of
Convention. Philadelphia: University of Pennsylvania Press.
———. 1994. "Apparent Motion in Music?" Music Perception 11 (4).
216

Graubart, Michael. 2003. "Perennial Questions: What Are Twelve-Note Rows Really For?"
Tempo: A Quarterly Review of Modern Music 57 (225): 32–36.
Grave, Floyd K. 2009. "Freakish Variations on a Grand Cadence Prototype in Haydn's String
Quartets." Journal of Musicological Research 28 (2–3): 119–45.
Hanninen, Dora A. 2001. "Orientations, Criteria, Segments: A General Theory of Segmentation
for Music Analysis." Journal of Music Theory 45 (2): 345–433.
Hard, Bridgette M., Barbara Tversky and David Lang. 2006. "Making Sense of Abstract Events:
Building Event Schemas." Memory & Cognition 34 (6): 1221–35.
Hasty, Christopher. 1981. "Segmentation and Process in Post-Tonal Music." Music Theory
Spectrum 3: 54–73.
———. 1984. "Phrase Formation in Post-Tonal Music." Journal of Music Theory 28 (2):
167–90.
Hébert, Sylvie, Isabelle Peretz, and Lise Gagnon. 1995. "Perceiving the Tonal Ending of Tune
Excerpts: The Roles of Pre-existing Representation and Musical Expertise." Canadian
Journal of Experimental Psychology 49 (2): 193–209.
Hepokoski, James and Warren Darcy. 2006. Elements of Sonata Theory: Norms, Types, and
Deformations in the Late-Eighteenth-Century Sonata. Oxford: Oxford University Press.
Hintzman, Douglas L. 1986. "Schema Abstraction in a Multiple-Trace Memory Model."
Psychological Review 93 (4): 411–28.
———. 1988. "Judgments of Frequency and Recognition Memory in a Multiple-Trace Memory
Model." Psychological Review 95 (4): 528–51.
———. 2010. "How Does Repetition Affect Memory? Evidence from Judgments of Recency."
Memory & Cognition 38 (1): 102–15.
Hopkins, Robert G. 1990. Closure and Mahler's Music: The Role of Secondary Parameters.
Philadelphia: University of Pennsylvania Press.
Huron, David. 2006. Sweet Anticipation: Music and the Psychology of Expectation. Cambridge:
MIT Press.
Hyland, Anne M. 2009. "Rhetorical Closure in the First Movement of Schubert's Quartet in C
Major, D. 46: A Dialogue with Deformation." Music Analysis 28 (i): 111–42.
Jonaitis, Erin McMullen and Jenny R. Saffran. 2009. "Learning Harmony: The Role of Serial
Statistics." Cognitive Science 33: 951–68.
Jones, Evan and Matthew Shaftel. 2009. A Critical Approach to Sight Singing and Musical Style,
preliminary ed. Plymouth: Hayden-McNeil Publishing.
Joichi, Janet M. 2006. "Closure, Context, and Hierarchical Grouping in Music: A Theoretical and
Empirical Investigation." Ph.D. Diss., Northwestern University.
Kessler, Edward J., Christa Hansen, and Roger N. Shepard. 1984. "Tonal Schemata in the
Perception of Music in Bali and in the West." Music Perception 2 (2): 131–65.

217

Knösche, Thomas R., Christiane Neuhaus, Jens Haueisen, Kai Alter, Burkhard Maess, Otto W.
Witte, and Angela D. Friederici. 2005. "Perception of Phrase Structure in Music." Human
Brain Mapping 24: 259–73.
Krumhansl, Carol. 1990. "Tonal Hierarchies and Rare Intervals in Music Cognition." Music
Perception 7 (3): 309–24.
———. 1996. "A Perceptual Analysis of Mozart's Piano Sonata K. 282: Segmentation, Tension,
and Musical Ideas." Music Perception 13 (3): 401–32.
Krumhansl, Carol L., Jukka Louhivuori, Petri Toiviainen, Topi Järvinen, and Tuomas Eerola.
1999. "Melodic Expectation in Finnish Spiritual Hymns: Convergence of Statistical,
Behavioral and Computational Approaches." Music Perception 17: 151–95.
Krumhansl, Carol L., Pekka Toivanen, Tuomas Eerola, Petri Toiviainen, Topi Järvinen, and
Jukka Louhivuori. 2000. "Cross-Cultural Music Cognition: Cognitive Methodology
Applied to North Sami Yoiks." Cognition 76: 13–58.
Kurby, Christopher A. and Jeffrey M. Zacks. 2007. "Segmentation in the Perception and
Memory of Events." Trends in Cognitive Sciences 12: 72–79.
Kurth, Richard B. 2000. "Moments of Closure: Thoughts on the Suspension of Tonality in
Schoenberg's Fourth Quartet and Trio." In Music of My Future: The Schoenberg
Quartets and Trio, edited by Reinhold Brinkmann and Christoph Wolff, 139–60.
Cambridge: Harvard University Press.
Lerdahl, Fred and Ray Jackendoff. 1977. "Toward a Formal Theory of Tonal Music." Journal of
Music Theory 21 (1): 111–71.
———. 1983. A Generative Theory of Tonal Music. Cambridge: MIT Press.
Magliano, Joseph P., Jason Miller, and Rolf A. Zwaan. 2001. "Indexing Space and Time in Film
Understanding." Applied Cognitive Psychology 15: 533–45.
Margulis, Elizabeth Hellmuth. 2005. "A Model of Melodic Expectation." Music Perception
22 (4): 663–713.
———. 2007. "Silences in Music are Musical not Silent: An Exploratory Study of Context
Effects on the Experience of Musical Pauses." Music Perception 24 (5): 485–506.
Marvin, Elizabeth West and Alexander R. Brinkman. 1999. "The Effect of Modulation and
Formal Manipulation on Perception of Tonic Closure by Expert Listeners." Music
Perception 16 (4): 389–408.
McCreless, Patrick. 1991. "The Hermeneutic Sentence and Other Literary Models for Tonal
Closure." Indiana Theory Review 12: 35–73.
Meyer, Leonard B. 1956. Emotion and Meaning in Music. Chicago: University of Chicago Press.
———. 1973. Explaining Music. Chicago: University of Chicago Press.
Monahan, Seth. 2011. "Success and Failure in Mahler's Sonata Recapitulations." Music Theory
Spectrum 33 (1): 37–58.
Narmour, Eugene. 1990. The Analysis and Cognition of Basic Melodic Structures: The
Implication-Realization Model. Chicago: The University of Chicago Press.
Neuhaus, Christiane, Thomas R. Knösche, and Angela D. Friederici. 2006. "Effects of Musical
Expertise and Boundary Markers on Phrase Perception in Music." Journal of Cognitive
Neuroscience 18 (3): 472–93.
Newtson, Darren, Gretchen Engquist, and Joyce Bois. 1977. "The Objective Basis of Behavior
Units." Journal of Personality and Social Psychology 35: 847–62.
Nusseck, Manfred and Marcel M. Wanderley. 2009. "Music and Motion—How Music-Related
Ancillary Body Movements Contribute to the Experience of Music." Music Perception
26 (4): 335–53.
Ockelford, Adam. 2006. "Implication and Expectation in Music: A Zygonic Model."
Psychology of Music 34: 81–142.
Pearce, Marcus T. and Geraint A. Wiggins. 2006. "Expectation in Melody: The Influence of
Context and Learning." Music Perception 23 (5): 377–405.
Pearce, Marcus T., Daniel Müllensiefen, and Geraint A. Wiggins. 2010. "The Role of
Expectation and Probabilistic Learning in Auditory Boundary Perception: A Model
Comparison." Perception 39: 1367–91.
Pearsall, Edward. 1999. "Mind and Music: On Intentionality, Music Theory, and Analysis."
Journal of Music Theory 43 (2): 231–55.
Pellegrino, Catherine. 2002. "Aspects of Closure in the Music of John Adams." Perspectives of
New Music 40 (1): 147–75.
Reber, Rolf, Piotr Winkielman, and Norbert Schwarz. 1998. "Effects of Perceptual Fluency on
Affective Judgments." Psychological Science 9 (1): 45–48.
Reinhard, Thilo. 1989. The Singer's Schumann. New York: Pelion Press.
Reti, Rudolph. 1951. The Thematic Process in Music. Westport, CT: Greenwood Press.
Reprinted 1978.
Roeder, John. 2010. "Superposition in Saariaho's 'The claw of the magnolia . . .'" Paper
presented at the Annual Meeting of the Society for Music Theory, Indianapolis, 4–7
November.
Rogers, Michael R. 1984. Teaching Approaches in Music Theory: An Overview of Pedagogical
Philosophies. Carbondale and Edwardsville: Southern Illinois University Press.
Rosner, Burton S. and Leonard B. Meyer. 1982. "Melodic Processes and the Perception of
Music." In The Psychology of Music, edited by Diana Deutsch, 317–41. New York:
Academic Press.
———. 1986. "The Perceptual Roles of Melodic Process, Contour, and Form." Music
Perception 4: 1–40.
Saffran, Jenny R. 2001. "Constraints on Statistical Language Learning." Journal of Memory and
Language 47 (1): 172–96.
Saffran, Jenny R., Richard N. Aslin, and Elissa Newport. 1996. "Statistical Learning by
8-Month-Old Infants." Science 274 (5294): 1926–28.
Saffran, Jenny R., Elizabeth K. Johnson, Richard N. Aslin, and Elissa L. Newport. 1999.
"Statistical Learning of Tone Sequences by Human Infants and Adults." Cognition
70 (1): 27–52.
Sarver, Sarah. 2010. "Embedded and Parenthetical Chromaticism: A Study of Their Structural
and Dramatic Implications in Selected Works by Richard Strauss." Ph.D. diss., Florida
State University.
Satyendra, Ramon. 1997. "Liszt's Open Structures and the Romantic Fragment." Music Theory
Spectrum 19 (2): 184–205.
Schellenberg, Glenn E. 1996. "Expectancy in Melody: Tests of the Implication-Realization
Model." Cognition 58: 75–125.
———. 1997. "Simplifying the Implication-Realization Model of Melodic Expectancy." Music
Perception 14: 295–318.
Schmuckler, Mark A. 1989. "Expectation in Music: Investigation of Melodic and Harmonic
Processes." Music Perception 7 (2): 109–49.
Serafine, Mary Louise. 1988. Music as Cognition: The Development of Thought in Sound. New
York: Columbia University Press.
Shepard, Roger. 1964. "Circularity in Judgments of Relative Pitch." Journal of the Acoustical
Society of America 36: 2346–53.
Snyder, Bob. 2000. Music and Memory. Cambridge: MIT Press.
Soll, Beverly and Ann Dorr. 1992. "Cyclical Implications in Aaron Copland's Twelve Poems of
Emily Dickinson." College Music Symposium 32: 99–128.
Sridharan, Devarajan, Daniel J. Levitin, Chris H. Chafe, Jonathan Berger, and Vinod Menon.
2007. "Neural Dynamics of Event Segmentation in Music: Converging Evidence for
Dissociable Ventral and Dorsal Networks." Neuron 55 (3): 521–32.
Sutcliffe, W. Dean. 2010. "Ambivalence in Haydn's Symphonic Slow Movements of the 1770s."
Journal of Musicology 27 (1): 84–134.
Swallow, Khena M., Jeffrey M. Zacks, and Richard A. Abrams. 2009. "Event Boundaries in
Perception Affect Memory Encoding and Updating." Journal of Experimental
Psychology: General 138 (2): 236–57.
von Hippel, Paul. 2000. "Questioning a Melodic Archetype: Do Listeners Use Gap-Fill to
Classify Melodies?" Music Perception 18 (2): 139–53.
Wheelock, Gretchen A. 1991. "Engaging Strategies in Haydn's Opus 33 String Quartets."
Eighteenth-Century Studies 25 (1): 1–30.
Zacks, Jeffrey. 2004. "Using Movement and Intentions to Understand Simple Events." Cognitive
Science 28: 979–1008.
Zacks, Jeffrey M. and Khena M. Swallow. 2007. "Event Segmentation." Current Directions in
Psychological Science 16: 80–84.
Zacks, Jeffrey M., Barbara Tversky, and Gowri Iyer. 2001. "Perceiving, Remembering, and
Communicating Structure in Events." Journal of Experimental Psychology: General
130 (1): 29–58.
Zacks, Jeffrey M., Nicole K. Speer, and Jeremy R. Reynolds. 2009. "Segmentation in Reading
and Film Comprehension." Journal of Experimental Psychology: General 138 (2): 307–27.
Zacks, Jeffrey M., Nicole K. Speer, Jean M. Vettel, and Larry L. Jacoby. 2006. "Event
Understanding and Memory in Healthy Aging and Dementia of the Alzheimer Type."
Psychology & Aging 21: 466–82.
Zacks, Jeffrey M., Nicole K. Speer, Khena M. Swallow, Todd S. Braver, and Jeremy R.
Reynolds. 2007. "Event Perception: A Mind-Brain Perspective." Psychological Bulletin
133 (2): 273–93.
Zacks, Jeffrey M., Shawn Kumar, Richard A. Abrams, and Ritesh Mehta. 2009. "Using
Movement and Intentions to Understand Human Activity." Cognition 112: 201–16.
Zacks, Jeffrey M., Todd S. Braver, Margaret A. Sheridan, David I. Donaldson, Abraham Z.
Snyder, John M. Ollinger, et al. 2001. "Human Brain Activity Time-Locked to Perceptual
Event Boundaries." Nature Neuroscience 4 (6): 651–55.
Zwaan, Rolf A. and Gabriel A. Radvansky. 1998. "Situation Models in Language Comprehension
and Memory." Psychological Bulletin 123 (2): 162–85.
Zwaan, Rolf A., Mark C. Langston, and Arthur C. Graesser. 1995. "The Construction of
Situation Models in Narrative Comprehension: An Event-Indexing Model."
Psychological Science 6 (5): 292–97.
BIOGRAPHICAL SKETCH
Crystal Peebles received a B.M. in Music Education from East Carolina University and an
M.M. and Ph.D. in Music Theory from The Florida State University. Crystal has presented
research at a variety of conferences, including the International Conference on Music Perception
and Cognition, the Annual Meeting of the Society for Music Theory, and numerous regional
conferences. She currently teaches Music Theory at Northern Arizona University.