Modelling the perception and cognition of musical structure
David Meredith
<dave@titanmusic.com>
Centre for Cognition, Computation and Culture
Goldsmiths College, University of London
Algorithmic models of music cognition
[Diagram: an algorithmic model (formal rules, computer program) takes an input representation (e.g., MIDI, piano roll, WAV file) and generates a structural description (e.g., harmonic analysis, metrical structure, grouping structure). Other boxes in the diagram: Theory; Auxiliary hypotheses; "Real-world" manifestation of music (e.g., sound, printed score, dance); Sense organs (ears, eyes); Neural encoding; Brain; Percept, interpretation, mental representation; Musical behaviour (e.g., dancing, expressive performance, composition, improvisation). Arrow labels: "represented by", "predicts", "causes".]
Longuet-Higgins’ model
Longuet-Higgins, H. C. (1976). The perception of melodies. Nature, 263(5579), 646-653.
Longuet-Higgins, H. C. (1987). The perception of melodies. In H. C. Longuet-Higgins (ed.), Mental Processes:
Studies in Cognitive Science, pp. 105-129. British Psychological Society/MIT Press, London/Cambridge, MA.
[Figure: rotated axis label, apparently "metrical strength"]
A flat, not G sharp
OUTPUT:
[[[24 C STC] [[-5 G STC] [0 G STC]]] [[1 AB] [-1 G TEN]]] [[[REST] [4 B STC]] [1 C TEN]]
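The bracketed OUTPUT above can be read as a nested tree. The following is a minimal Python sketch, assuming only that square brackets nest and that each innermost bracket holds one note or rest. The slide does not define the fields inside a bracket, so the tokens are kept as uninterpreted strings. The names `parse_brackets` and `leaves` are hypothetical.

```python
def parse_brackets(text):
    """Parse '[a [b c]] [d]' into nested Python lists of string tokens."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    stack = [[]]  # stack[0] collects the top-level items
    for tok in tokens:
        if tok == "[":
            stack.append([])
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return stack[0]


def leaves(tree):
    """Yield the innermost bracket groups (lists of tokens only), left to right."""
    if all(isinstance(x, str) for x in tree):
        yield tree
    else:
        for sub in tree:
            if isinstance(sub, list):
                yield from leaves(sub)


OUTPUT = ("[[[24 C STC] [[-5 G STC] [0 G STC]]] [[1 AB] [-1 G TEN]]] "
          "[[[REST] [4 B STC]] [1 C TEN]]")
tree = parse_brackets(OUTPUT)   # two top-level units
events = list(leaves(tree))     # eight innermost brackets, in order
```

The nesting depth of each innermost bracket is preserved in `tree`, so the hierarchical grouping of events can be inspected without committing to a reading of the individual fields.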
Longuet-Higgins’ model of rhythm
• Assumes that the listener initially assumes pure binary metre
• But the listener is willing to change their mind at any metrical level
• Evidence for change in metre:
  - Current metre implies syncopation
  - No note onset at beginning of next higher metrical unit
  - Current metre implies excessively large change in tempo
• Metre changed if
  - there is evidence for change and
  - the other division does not imply syncopation or excessive tempo change
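The decision rule above can be sketched as a Boolean test. This is a minimal sketch, not Longuet-Higgins' program: it assumes that any one of the three kinds of evidence is enough (the slide does not say how they combine), and the function and parameter names are hypothetical.

```python
def should_change_metre(syncopation, missing_onset, tempo_jump,
                        alt_syncopation, alt_tempo_jump):
    """Decide whether to switch to the other division (binary <-> ternary).

    The first three flags describe the current metre; the last two describe
    what the alternative division would imply.
    """
    # Assumption: the three kinds of evidence are combined with "or".
    evidence = syncopation or missing_onset or tempo_jump
    # The alternative must not itself imply syncopation or a tempo jump.
    alternative_ok = not (alt_syncopation or alt_tempo_jump)
    return evidence and alternative_ok
```

For example, `should_change_metre(True, False, False, False, False)` returns `True`, whereas the same evidence with `alt_syncopation=True` leaves the metre unchanged.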
Longuet-Higgins’ model of tonality
• For each note, estimates value of sharpness: position of pitch name on line of fifths
• Theory of tonality consists of six rules
  - First ensures that each note is spelt so that its name is as close as possible to the local tonic on the line of fifths
  - Other rules control how the algorithm deals with chromatic intervals and modulations
  - e.g., second rule states that if the current key implies two consecutive chromatic intervals, then the key should be changed so that both become diatonic
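The first rule can be illustrated with a small Python sketch. It assumes a line of fifths numbered with C = 0 (… Ab = -4, Eb = -3, Bb = -2, F = -1, C = 0, G = 1 … F# = 6 …), accidentals written as `#` and `b`, and a tonic of C chosen purely for illustration. The other five rules are not modelled, and the function names are hypothetical.

```python
# Positions of the natural notes on the line of fifths (C = 0 is an assumed origin).
LETTER_POS = {"F": -1, "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5}


def sharpness(name):
    """Line-of-fifths position of a pitch name such as 'Ab' or 'F#'."""
    pos = LETTER_POS[name[0].upper()]
    for acc in name[1:]:
        if acc == "#":
            pos += 7      # each sharp moves 7 steps in the sharp direction
        elif acc == "b":
            pos -= 7      # each flat moves 7 steps in the flat direction
        else:
            raise ValueError(f"unknown accidental {acc!r} in {name!r}")
    return pos


def closest_spelling(candidates, tonic):
    """Pick the candidate spelling nearest the tonic on the line of fifths."""
    t = sharpness(tonic)
    return min(candidates, key=lambda n: abs(sharpness(n) - t))


choice = closest_spelling(["G#", "Ab"], tonic="C")
```

With these assumptions Ab lies 4 steps from C and G# lies 8 steps away, so the sketch selects A flat rather than G sharp.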
Longuet-Higgins’ model: Output
• Section of cor anglais solo from Act III of Wagner’s Tristan und Isolde
• Change from binary to ternary in first beat of fifth bar (triplets)
• Grace note correctly identified in seventh bar
• Agrees fully with original score in tonal and rhythmic indications
• Wagner marked all triplets as staccato – fault with performance, not program!
• 98.21% of notes spelt correctly (3508 errors) in a 195972-note corpus of classical and baroque music
  - Cf. 99.44% spelt correctly (1100 errors) by Meredith’s PS13s1 algorithm
  - Meredith, D. (2006). The ps13 pitch spelling algorithm. Journal of New Music Research, 35(2), pp. 121-159.
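The two percentages follow directly from the error counts and the corpus size. A short Python check, using only the figures quoted above (the function name is hypothetical):

```python
CORPUS_NOTES = 195972  # corpus size quoted on the slide


def percent_correct(errors, total=CORPUS_NOTES):
    """Percentage of notes spelt correctly, given the number of errors."""
    return 100.0 * (total - errors) / total


lh = round(percent_correct(3508), 2)    # Longuet-Higgins' algorithm
ps13 = round(percent_correct(1100), 2)  # Meredith's PS13s1
```

Both values round to the percentages given on the slide (98.21 and 99.44).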
Generative Theory of Tonal Music (GTTM)
Lerdahl, F. and Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press,
Cambridge, MA.
[Diagram: the musical surface is the input to four sets of rules (grouping structure rules, metrical structure rules, time-span reduction rules, prolongational reduction rules), which yield the corresponding structures: grouping structure, metrical structure, time-span reduction, prolongational reduction.]
• WELL-FORMEDNESS RULES define the CLASS of POSSIBLE structural descriptions
• PREFERENCE RULES are used to find the BEST structural descriptions
Lerdahl and Jackendoff’s theory of grouping structure
• Listener automatically segments music into structural units of various sizes called groups
• Grouping structure of a passage is the way that it is perceived to be segmented into groups
• “Grouping can be viewed as the most basic component of musical understanding” (Lerdahl and Jackendoff, 1983, p. 13)
grouping well-formedness rules
GWFR 1 Any contiguous sequence of pitch-events, drum beats, or the like can
constitute a group, and only contiguous sequences can constitute a group.
GWFR 2 A piece constitutes a group.
GWFR 3 A group may contain smaller groups.
GWFR 4 If a group G1 contains part of a group G2, then it must contain all of
G2.
GWFR 5 If a group G1 contains a smaller group G2 then G1 must be
exhaustively partitioned into smaller groups.
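The five rules above can be checked mechanically. The following Python sketch assumes a hypothetical representation in which the piece is a sequence of `n_events` events and each group is a half-open index range `(start, end)`. GWFR 1 then holds by construction, and GWFR 3 is a permission rather than a constraint.

```python
def is_well_formed(groups, n_events):
    """Check a set of (start, end) groups against GWFRs 1-5 as stated above."""
    groups = set(groups)
    # GWFR 1: a range is contiguous by construction; just validate the bounds.
    if any(not (0 <= s < e <= n_events) for s, e in groups):
        return False
    # GWFR 2: the whole piece is a group.
    if (0, n_events) not in groups:
        return False
    # GWFR 4: no two groups may partially overlap.
    for s1, e1 in groups:
        for s2, e2 in groups:
            if s1 < s2 < e1 < e2:
                return False
    # GWFR 5: a group with subgroups must be tiled exactly by its direct subgroups.
    for s, e in groups:
        inside = [g for g in groups if g != (s, e) and s <= g[0] and g[1] <= e]
        if not inside:
            continue
        children = sorted(
            g for g in inside
            if not any(h != g and h[0] <= g[0] and g[1] <= h[1] for h in inside)
        )
        pos = s
        for cs, ce in children:
            if cs != pos:      # gap before this child
                return False
            pos = ce
        if pos != e:           # gap after the last child
            return False
    return True


ok = is_well_formed([(0, 8), (0, 4), (4, 8), (0, 2), (2, 4)], 8)
overlapping = is_well_formed([(0, 8), (0, 4), (3, 8)], 8)
not_partitioned = is_well_formed([(0, 8), (0, 4)], 8)
```

In this example the first structure satisfies all the rules, the second violates GWFR 4 and the third violates GWFR 5.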
The Gestalt principles of proximity and similarity in vision and in music
second grouping preference rule
GPR 2 (Proximity) Consider a sequence of four notes n1, n2, n3, n4.
All else being equal, the transition n2–n3 may be heard as a group
boundary if
a. (Slur/Rest) the interval of time from the end of n2 to the
beginning of n3 is greater than that from the end of n1 to
the beginning of n2 and that from the end of n3 to the
beginning of n4, or if
b. (Attack-Point) the interval of time between the attack
points of n2 and n3 is greater than that between the attack
points of n1 and n2 and that between the attack points of
n3 and n4.
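GPR 2 can be sketched as a sliding four-note window. The Python below assumes a monophonic melody given as a list of `(onset, offset)` pairs in arbitrary time units. This representation and the function names are hypothetical, and the "all else being equal" clause is not modelled.

```python
def rest_gap(a, b):
    """Time from the end of note a to the beginning of note b."""
    return b[0] - a[1]


def attack_gap(a, b):
    """Time between the attack points (onsets) of notes a and b."""
    return b[0] - a[0]


def gpr2_boundaries(notes):
    """Return (index, case) pairs: a boundary may be heard after notes[index]."""
    found = []
    for i in range(1, len(notes) - 2):
        n1, n2, n3, n4 = notes[i - 1:i + 3]
        # GPR 2a (Slur/Rest): the n2-n3 gap exceeds both neighbouring gaps.
        if rest_gap(n2, n3) > rest_gap(n1, n2) and rest_gap(n2, n3) > rest_gap(n3, n4):
            found.append((i, "2a"))
        # GPR 2b (Attack-Point): the n2-n3 inter-onset interval exceeds both neighbours.
        if attack_gap(n2, n3) > attack_gap(n1, n2) and attack_gap(n2, n3) > attack_gap(n3, n4):
            found.append((i, "2b"))
    return found


# Six notes of length 1 with a rest of length 1 after the third note.
melody = [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7)]
boundaries = gpr2_boundaries(melody)
```

Here both cases of the rule place a possible boundary after the third note (index 2), where the rest occurs.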
third preference rule
GPR 3 (Change) Consider a sequence of four notes n1, n2, n3, n4.
All else being equal, the transition n2–n3 may be heard as a group
boundary if
a. (Register) the transition n2–n3 involves a greater intervallic
distance than both n1–n2 and n3–n4, or if
b. (Dynamics) the transition n2–n3 involves a change in dynamics
and n1–n2 and n3–n4 do not, or if
c. (Articulation) the transition n2–n3 involves a change in
articulation and n1–n2 and n3–n4 do not, or if
d. (Length) n2 and n3 are of different lengths and both pairs n1,
n2 and n3, n4 do not differ in length.
(One might add further cases to deal with such things as change in
timbre or instrumentation.)
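The four cases of GPR 3 can be tested on a single four-note window. This Python sketch assumes a hypothetical `Note` record with MIDI pitch numbers, and takes "intervallic distance" to mean the absolute pitch difference in semitones. Both choices are illustrative assumptions rather than part of the rule.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int                    # MIDI note number (assumed encoding)
    duration: float               # note length in arbitrary units
    dynamic: str = "mf"           # dynamic marking
    articulation: str = "normal"  # e.g. "staccato", "tenuto"


def gpr3_cases(n1, n2, n3, n4):
    """Return the GPR 3 cases that suggest a boundary between n2 and n3."""
    def leap(a, b):
        return abs(b.pitch - a.pitch)

    cases = []
    if leap(n2, n3) > leap(n1, n2) and leap(n2, n3) > leap(n3, n4):
        cases.append("3a")  # Register
    if (n2.dynamic != n3.dynamic and n1.dynamic == n2.dynamic
            and n3.dynamic == n4.dynamic):
        cases.append("3b")  # Dynamics
    if (n2.articulation != n3.articulation and n1.articulation == n2.articulation
            and n3.articulation == n4.articulation):
        cases.append("3c")  # Articulation
    if (n2.duration != n3.duration and n1.duration == n2.duration
            and n3.duration == n4.duration):
        cases.append("3d")  # Length
    return cases


window = [Note(60, 1), Note(62, 1), Note(72, 2), Note(71, 2)]
cases = gpr3_cases(*window)
```

In this window the leap of ten semitones and the change from short to long notes both fall between n2 and n3, so the register and length cases apply.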
analyser
Temperley, D. (2001). The Cognition of Basic Musical Structures. MIT Press,
Cambridge, MA.
Meredith, D. (2002). Review of David Temperley’s The Cognition of Basic Musical
Structures (Cambridge, MA: MIT Press, 2001). Musicae Scientiae, 6(2), pp. 287-302.
[Diagram: data flow among the analyser's components. Component labels: Meter and Harmony (each appears twice, one instance marked "prechord mode"), Streamer, Grouper, Key. Data labels: Notes, Beats, Beats (tactus and below), Chord change time points, Notes with streams, Phrases, TPCNotes, Chords, Roman numeral harmonic analysis.]
Temperley’s theory of contrapuntal structure: Input representation
Temperley’s contrapuntal well-formedness rules (CWFRs)
CWFR 1 A stream must consist of a set of temporally contiguous squares on
the plane.
CWFR 2 A stream may be only one square wide in the pitch dimension.
CWFR 3 Streams may not cross in pitch.
CWFR 4 Each note must be entirely included in a single stream.
Temperley’s contrapuntal preference rules (CPRs)
CPR 1 (Pitch Proximity Rule) Prefer to avoid large leaps within streams.
CPR 2 (New Stream Rule) Prefer to minimize the number of streams.
CPR 3 (White Square Rule) Prefer to minimize the number of white squares
in streams.
CPR 4 (Collision Rule) Prefer to avoid cases where a single square is
included in more than one stream.
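One way to make the four preference rules concrete is to turn them into a penalty score for a candidate set of streams. The Python sketch below is not Temperley's implementation. It assumes a stream is a list of `(time, pitch)` squares with `pitch=None` for a white square, and all four weights are arbitrary placeholders. Well-formedness (the CWFRs) is not checked here.

```python
def stream_cost(streams, black, w_leap=1.0, w_stream=20.0,
                w_white=5.0, w_collide=10.0):
    """Penalty for a contrapuntal analysis; lower is better.

    streams: list of streams, each a list of (time, pitch_or_None) squares.
    black:   set of (time, pitch) squares that contain a note.
    The weights are illustrative assumptions, not published values.
    """
    cost = w_stream * len(streams)                  # CPR 2: fewer streams
    usage = {}
    for stream in streams:
        prev_pitch = None
        for t, p in stream:
            if p is None or (t, p) not in black:
                cost += w_white                     # CPR 3: white square in stream
                continue
            if prev_pitch is not None:
                cost += w_leap * abs(p - prev_pitch)  # CPR 1: leap size
            prev_pitch = p
            usage[(t, p)] = usage.get((t, p), 0) + 1
    # CPR 4: each extra stream sharing a black square is a collision.
    cost += w_collide * sum(n - 1 for n in usage.values() if n > 1)
    return cost


black = {(0, 60), (1, 62), (2, 64), (0, 48), (1, 48), (2, 47)}
two_streams = [[(0, 60), (1, 62), (2, 64)], [(0, 48), (1, 48), (2, 47)]]
three_streams = [[(0, 60), (1, 62), (2, 64)], [(0, 48), (1, 48)], [(2, 47)]]
cost_two = stream_cost(two_streams, black)
cost_three = stream_cost(three_streams, black)
```

With these placeholder weights the two-stream analysis costs less than the three-stream one, because the saving from one fewer stream outweighs the extra one-semitone step.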
Using Temperley’s theory to model listening, composition, performance and style
• Temperley and Sleator’s programs scan the music from left to right, keeping note of the analyses that best satisfy the preference rules so far at each point.
• Ambiguity: Two or more best analyses at a given point in the music.
• Revision: The best analysis at some point in the music does not form part of the best analysis at some later point.
• Expectation: The most expected events are those that will lead to an analysis that best satisfies the preference rules.
• Style: A passage is in the style defined by a set of preference rules if the analysis that best satisfies the preference rules achieves a score that is not too high (boring) and not too low (incomprehensible).
• Composition: Choices guided by the goal of producing a piece that satisfies the preference rules to just the right extent.
• Performance: Temporal and dynamic expression geared towards conveying structure in accordance with the analysis that best satisfies the preference rules.
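The left-to-right scan and the notion of revision can be illustrated with a generic dynamic-programming sketch in Python. This is not Temperley and Sleator's code. It assumes each segment of music gets a score for each candidate label, that changing label between segments costs a fixed penalty, and that scores are simply added. All names and numbers are hypothetical.

```python
def scan(local_scores, change_penalty=1.0):
    """Scan segments left to right; return the best analysis after each one.

    local_scores: one dict {label: score} per segment (higher is better).
    """
    best = {}      # label -> (score, path) of the best analysis ending in label
    history = []   # best overall analysis after each segment
    for scores in local_scores:
        new_best = {}
        for label, s in scores.items():
            if best:
                # Best predecessor, paying a penalty if the label changes.
                prev_score, prev_path = max(
                    ((ps - (0.0 if pl == label else change_penalty), pp)
                     for pl, (ps, pp) in best.items()),
                    key=lambda item: item[0],
                )
            else:
                prev_score, prev_path = 0.0, []
            new_best[label] = (prev_score + s, prev_path + [label])
        best = new_best
        history.append(max(best.values(), key=lambda item: item[0])[1])
    return history


segments = [{"A": 1.0, "B": 0.8}, {"A": 0.5, "B": 0.6}, {"A": 0.0, "B": 2.0}]
history = scan(segments)
# A revision occurs at t when the earlier best analysis is not a prefix of the new one.
revised = [t for t in range(1, len(history)) if history[t][:t] != history[t - 1]]
```

In this toy input, analysis A is preferred after the first two segments, but strong evidence for B in the third segment makes the all-B analysis best, so the earlier interpretation is revised at the third segment.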
Summing up
• We can attempt to model the perception and cognition of musical structure by constructing algorithms that take representations of musical passages as input and generate structural descriptions of those passages as output
• We can evaluate such algorithms by comparing their output with expert human analyses and authoritative scores
• We can express a theory of musical structure as a preference rule system consisting of
  - Well-formedness rules that define the class of legal structural descriptions
  - Preference rules: the legal structural descriptions that best satisfy the preference rules are predicted to be the ones that listeners are most likely to hear
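The general shape of a preference rule system can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: well-formedness rules are Boolean tests, preference rules return numeric scores, and the scores are combined by simple addition (the theories discussed here do not prescribe that). The toy candidates and rules are invented for illustration.

```python
def best_descriptions(candidates, well_formedness_rules, preference_rules):
    """Return the legal candidates with the highest total preference score."""
    # Well-formedness rules filter out illegal structural descriptions.
    legal = [c for c in candidates if all(rule(c) for rule in well_formedness_rules)]
    if not legal:
        return []
    # Preference rules score the legal ones (summing is an assumption).
    scored = [(sum(rule(c) for rule in preference_rules), c) for c in legal]
    top = max(score for score, _ in scored)
    return [c for score, c in scored if score == top]


# Toy example: ways of dividing six notes into groups, given as group sizes.
candidates = [(3, 3), (2, 4), (6,), (1, 5), (3, 4)]
wf_rules = [lambda c: sum(c) == 6]           # the groups must cover all six notes
pref_rules = [
    lambda c: -(max(c) - min(c)),            # prefer groups of equal size
    lambda c: -abs(len(c) - 2),              # prefer division into two groups
]
best = best_descriptions(candidates, wf_rules, pref_rules)
```

The function returns a list rather than a single winner, so ties between equally preferred descriptions (ambiguity) remain visible.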
