Modeling The Perception
David Meredith
<dave@titanmusic.com>
Centre for Cognition, Computation and Culture
Goldsmiths College, University of London
Algorithmic models of music cognition
[Figure: how an algorithmic model (theory) relates to music cognition (real world).
Theory: input representation (e.g., MIDI, piano roll, WAV file); algorithmic model (formal rules, computer program) with auxiliary hypotheses; structural description (e.g., harmonic analysis, metrical structure, grouping structure).
Real world: "real-world" manifestation of music (e.g., sound, printed score, dance); sense organs (ears, eyes); neural encoding; brain; percept, interpretation, mental representation; musical behaviour (e.g., dancing, expressive performance, composition, improvisation).
Arrow labels: "represented by", "predicts", "causes".]
Longuet-Higgins’ model
Longuet-Higgins, H. C. (1976). The perception of melodies. Nature, 263(5579), 646-653.
Longuet-Higgins, H. C. (1987). The perception of melodies. In H. C. Longuet-Higgins (ed.), Mental Processes:
Studies in Cognitive Science, pp. 105-129. British Psychological Society/MIT Press, London/Cambridge, MA.
[Figure: vertical axis labelled "metrical strength"]
OUTPUT:
[[[24 C STC] [[-5 G STC] [0 G STC]]] [[1 AB] [-1 G TEN]]] [[[REST] [4 B STC]] [1 C TEN]]
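The bracketed output above can be read into a tree for inspection. The following is a minimal sketch that only recovers the nesting; it does not interpret the fields inside each innermost group, and the assumption that the nesting mirrors the metrical subdivision is ours. The names `parse_brackets` and `leaves` are hypothetical helpers, not part of Longuet-Higgins' program.

```python
import re

OUTPUT = ("[[[24 C STC] [[-5 G STC] [0 G STC]]] [[1 AB] [-1 G TEN]]] "
          "[[[REST] [4 B STC]] [1 C TEN]]")

def parse_brackets(text):
    """Parse a run of bracketed expressions into nested Python lists."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    stack = [[]]  # stack[0] collects the top-level expressions
    for tok in tokens:
        if tok == "[":
            stack.append([])
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return stack[0]

def leaves(tree):
    """Yield the innermost bracket groups (lists that hold only tokens)."""
    if all(isinstance(x, str) for x in tree):
        yield tree
        return
    for child in tree:
        if isinstance(child, list):
            yield from leaves(child)

tree = parse_brackets(OUTPUT)
for group in leaves(tree):
    print(group)
```

Keeping every token as a string avoids committing to a meaning for the numbers or the labels such as STC and TEN.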
Longuet-Higgins’ model of rhythm
Assumes that the listener starts by assuming a purely binary metre
But is willing to change their mind at any metrical level
Evidence for a change in metre:
  Current metre implies syncopation
  No note onset at beginning of next higher metrical unit
  Current metre implies excessively large change in tempo
Metre is changed if
  there is evidence for a change and
  the other division does not imply syncopation or excessive tempo change
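The decision rule on this slide can be sketched as a small predicate. This is an illustration under stated assumptions, not Longuet-Higgins' program: we assume that any one of the three listed observations counts as evidence, and the names `DivisionReading` and `should_change_metre` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DivisionReading:
    """What one way of dividing a metrical unit implies (hypothetical type)."""
    implies_syncopation: bool
    implies_excessive_tempo_change: bool

def has_evidence_for_change(current, onset_at_next_higher_unit):
    # Assumption: any single item from the slide's list counts as evidence.
    return (current.implies_syncopation
            or not onset_at_next_higher_unit
            or current.implies_excessive_tempo_change)

def should_change_metre(current, other, onset_at_next_higher_unit):
    """Change only if there is evidence and the other division is clean."""
    evidence = has_evidence_for_change(current, onset_at_next_higher_unit)
    other_is_clean = not (other.implies_syncopation
                          or other.implies_excessive_tempo_change)
    return evidence and other_is_clean

binary = DivisionReading(implies_syncopation=True,
                         implies_excessive_tempo_change=False)
ternary = DivisionReading(implies_syncopation=False,
                          implies_excessive_tempo_change=False)
print(should_change_metre(binary, ternary, onset_at_next_higher_unit=True))
```

How syncopation and an "excessive" tempo change are detected is left outside the sketch, since the slide does not define them.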
Longuet-Higgins’ model of tonality
Longuet-Higgins’ model: Output
Section of cor anglais solo from Act III of Wagner’s Tristan und Isolde
Change from binary to ternary in first beat of fifth bar (triplets)
Grace note correctly identified in seventh bar
Agrees fully with original score in tonal and rhythmic indications
Wagner marked all triplets as staccato – fault with performance, not program!
98.21% of notes spelt correctly (3508 errors) in a 195972-note corpus of classical and baroque music
Cf. 99.44% spelt correctly (1100 errors) by Meredith’s PS13s1 algorithm
Meredith, D. (2006). The ps13 pitch spelling algorithm. Journal of New Music
Research, 35(2), pp. 121-159.
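The two percentages follow directly from the error counts and the corpus size quoted above. A short check, with `percent_correct` as a hypothetical helper name:

```python
def percent_correct(errors, total):
    """Percentage of notes spelt correctly, given the number of errors."""
    return 100.0 * (total - errors) / total

CORPUS_NOTES = 195972  # corpus size quoted on the slide

longuet_higgins = round(percent_correct(3508, CORPUS_NOTES), 2)
ps13s1 = round(percent_correct(1100, CORPUS_NOTES), 2)
print(longuet_higgins, ps13s1)  # 98.21 99.44
```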
Generative Theory of Tonal Music (GTTM)
Lerdahl, F. and Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press,
Cambridge, MA.
[Figure: the musical surface feeds four sets of rules (grouping structure rules, metrical structure rules, time-span reduction rules, prolongational reduction rules), which yield the grouping structure, metrical structure, time-span reduction and prolongational reduction.]
GWFR 1 Any contiguous sequence of pitch-events, drum beats, or the like can
constitute a group, and only contiguous sequences can constitute a group.
GWFR 2 A piece constitutes a group.
GWFR 3 A group may contain smaller groups.
GWFR 4 If a group G1 contains part of a group G2, then it must contain all of
G2.
GWFR 5 If a group G1 contains a smaller group G2 then G1 must be
exhaustively partitioned into smaller groups.
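The well-formedness rules above can be checked mechanically. The sketch below assumes a representation that is ours, not Lerdahl and Jackendoff's: each group is a half-open index range `(start, end)` over the sequence of events, so contiguity (GWFR 1) holds by construction, and GWFR 3 is permissive and needs no check. The function name `check_grouping` is hypothetical.

```python
def check_grouping(groups, n_events):
    """Return a list of rule violations for a set of (start, end) groups."""
    ordered = sorted(set(groups))
    problems = []
    for s, e in ordered:
        if not (0 <= s < e <= n_events):
            problems.append(f"invalid range {(s, e)}")
    # GWFR 2: the whole piece is a group.
    if (0, n_events) not in ordered:
        problems.append("GWFR 2: the piece itself is not a group")
    # GWFR 4: no partial overlap. With groups sorted by start, two groups
    # overlap partially exactly when s1 < s2 < e1 < e2.
    for i, (s1, e1) in enumerate(ordered):
        for s2, e2 in ordered[i + 1:]:
            if s1 < s2 < e1 < e2:
                problems.append(f"GWFR 4: {(s1, e1)} and {(s2, e2)} overlap")
    # GWFR 5: if a group has subgroups, the largest of them must tile it.
    for g in ordered:
        inside = [h for h in ordered
                  if h != g and g[0] <= h[0] and h[1] <= g[1]]
        maximal = [h for h in inside
                   if not any(k != h and k[0] <= h[0] and h[1] <= k[1]
                              for k in inside)]
        if maximal:
            pos = g[0]
            for s, e in maximal:  # still sorted by start
                if s != pos:
                    break
                pos = e
            else:
                if pos == g[1]:
                    continue
            problems.append(f"GWFR 5: {g} is not exhaustively partitioned")
    return problems

good = {(0, 8), (0, 4), (4, 8), (0, 2), (2, 4)}
print(check_grouping(good, 8))              # []
print(check_grouping({(0, 8), (0, 4)}, 8))  # one GWFR 5 violation
```

An empty list means the grouping is a legal structural description under these rules; which legal description a listener actually hears is the job of the preference rules.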
The Gestalt principles of proximity and similarity in vision and in music
Lerdahl and Jackendoff’s
second grouping preference rule
GPR 3 (Change) Consider a sequence of four notes n1, n2, n3, n4. All else being equal, the transition n2–n3 may be heard as a group boundary if
a. (Register) the transition n2–n3 involves a greater intervallic distance than both n1–n2 and n3–n4, or if
b. (Dynamics) the transition n2–n3 involves a change in dynamics and n1–n2 and n3–n4 do not, or if
c. (Articulation) the transition n2–n3 involves a change in articulation and n1–n2 and n3–n4 do not, or if
d. (Length) n2 and n3 are of different lengths and both pairs n1, n2 and n3, n4 do not differ in length.
(One might add further cases to deal with such things as change in timbre or instrumentation.)
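The four cases of GPR 3 translate almost directly into comparisons over a window of four notes. This is a minimal sketch under assumptions of ours: pitch is a MIDI note number, intervallic distance is the absolute pitch difference, and dynamics and articulation are plain labels. `Note` and `gpr3_cases` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Note:
    pitch: int         # MIDI note number (assumption)
    dynamic: str       # e.g. "p", "f"
    articulation: str  # e.g. "legato", "staccato"
    length: float      # notated duration in beats

def gpr3_cases(n1, n2, n3, n4):
    """Return the GPR 3 cases that favour a boundary between n2 and n3."""
    cases = []
    middle = abs(n3.pitch - n2.pitch)
    if middle > abs(n2.pitch - n1.pitch) and middle > abs(n4.pitch - n3.pitch):
        cases.append("register")
    if (n2.dynamic != n3.dynamic
            and n1.dynamic == n2.dynamic and n3.dynamic == n4.dynamic):
        cases.append("dynamics")
    if (n2.articulation != n3.articulation
            and n1.articulation == n2.articulation
            and n3.articulation == n4.articulation):
        cases.append("articulation")
    if (n2.length != n3.length
            and n1.length == n2.length and n3.length == n4.length):
        cases.append("length")
    return cases

window = [Note(60, "p", "legato", 1.0), Note(62, "p", "legato", 1.0),
          Note(72, "p", "legato", 0.5), Note(71, "p", "legato", 0.5)]
print(gpr3_cases(*window))  # ['register', 'length']
```

The rule says a boundary "may be heard", so the returned cases are evidence to be weighed against other preference rules rather than a decision.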
analyser
Temperley, D. (2001). The Cognition of Basic Musical Structures. MIT Press,
Cambridge, MA.
Meredith, D. (2002). Review of David Temperley’s The Cognition of Basic Musical
Structures (Cambridge, MA: MIT Press, 2001). Musicae Scientiae, 6(2), pp. 287-302.
[Figure: data flow between the analysis programs. Components: Meter, Streamer, Grouper, Harmony (prechord mode), Harmony, Key, Roman numeral harmonic analysis. Data passed between them: Notes, Beats, Beats (tactus and below), Notes with streams, Phrases, TPCNotes, Chords, Chord change time points.]
Temperley’s theory of contrapuntal structure: Input representation
Temperley’s contrapuntal well-formedness rules (CWFRs)
CPR 1 (Pitch Proximity Rule) Prefer to avoid large leaps within streams.
CPR 2 (New Stream Rule) Prefer to minimize the number of streams.
CPR 3 (White Square Rule) Prefer to minimize the number of white squares in streams.
CPR 4 (Collision Rule) Prefer to avoid cases where a single square is included in more than one stream.
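One way to see how the four preference rules trade off is to score a candidate stream analysis with a penalty per rule. The sketch below is an illustration, not Temperley's implementation: squares are `(time, pitch)` pairs, `black` is the set of squares that contain a note, leaps are penalised linearly, and the weights are invented for the example.

```python
from collections import Counter

# Hypothetical weights; the slide does not give numerical values.
DEFAULT_WEIGHTS = {"leap": 1.0, "stream": 20.0, "white": 5.0, "collision": 30.0}

def cpr_penalty(streams, black, weights=None):
    """Total penalty of a stream analysis; lower is preferred."""
    w = dict(DEFAULT_WEIGHTS)
    if weights:
        w.update(weights)
    leap = 0
    white = 0
    for stream in streams:
        notes = [sq for sq in stream if sq in black]
        # CPR 1: sum of pitch leaps between successive notes in the stream.
        leap += sum(abs(b[1] - a[1]) for a, b in zip(notes, notes[1:]))
        # CPR 3: squares in the stream that hold no note.
        white += sum(1 for sq in stream if sq not in black)
    # CPR 4: squares claimed by more than one stream.
    counts = Counter(sq for stream in streams for sq in set(stream))
    collisions = sum(1 for c in counts.values() if c > 1)
    # CPR 2: every stream costs something.
    return (w["leap"] * leap + w["stream"] * len(streams)
            + w["white"] * white + w["collision"] * collisions)

black = {(0, 60), (0, 72), (1, 62), (1, 71)}
parallel = [[(0, 60), (1, 62)], [(0, 72), (1, 71)]]
crossing = [[(0, 60), (1, 71)], [(0, 72), (1, 62)]]
print(cpr_penalty(parallel, black), cpr_penalty(crossing, black))  # 43.0 61.0
```

With these weights the analysis that keeps each voice in its own register is preferred to the one in which the voices cross, which is the behaviour CPR 1 is meant to capture.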
Using Temperley’s theory to model listening, composition, performance and style
Temperley and Sleator’s programs scan the music from left to right, keeping note of the analyses that best satisfy the preference rules so far at each point.
Ambiguity: Two or more best analyses at a given point in the music.
Revision: The best analysis at some point in the music does not form part of the best analysis at some later point.
Expectation: The most expected events are those that will lead to an analysis that best satisfies the preference rules.
Style: A passage is in the style defined by a set of preference rules if the analysis that best satisfies the preference rules achieves a score that is not too high (boring) and not too low (incomprehensible).
Composition: Choices guided by goal to produce piece that satisfies preference rules to just the right extent.
Performance: Temporal and dynamic expression geared towards conveying structure in accordance with analysis that best satisfies the preference rules.
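The left-to-right scan and the notion of revision can be illustrated with a generic dynamic-programming search. This is a sketch under our own assumptions, not Temperley and Sleator's code: an analysis is one label per segment, `local_score` and `transition_score` stand in for the preference rules, and all names are hypothetical.

```python
def scan(segments, labels, local_score, transition_score):
    """Return, for each position, the best analysis of the music so far."""
    # best[label] = (score, path) of the best analysis ending in that label
    best = {lab: (local_score(segments[0], lab), [lab]) for lab in labels}
    history = [max(best.values(), key=lambda sp: sp[0])[1]]
    for seg in segments[1:]:
        new_best = {}
        for lab in labels:
            score, path = max(
                ((s + transition_score(prev, lab), p)
                 for prev, (s, p) in best.items()),
                key=lambda sp: sp[0],
            )
            new_best[lab] = (score + local_score(seg, lab), path + [lab])
        best = new_best
        history.append(max(best.values(), key=lambda sp: sp[0])[1])
    return history

def revisions(history):
    """Positions whose best analysis is not kept by the next best analysis."""
    return [i for i in range(len(history) - 1)
            if history[i + 1][:len(history[i])] != history[i]]

# Toy input: each segment states how well it fits labels "A" and "B".
segments = [{"A": 1, "B": 0}, {"A": 0, "B": 2}, {"A": 0, "B": 2}]
history = scan(
    segments,
    ["A", "B"],
    local_score=lambda seg, lab: seg[lab],
    transition_score=lambda prev, lab: 0 if prev == lab else -3,
)
print(history)             # [['A'], ['B', 'B'], ['B', 'B', 'B']]
print(revisions(history))  # [0]
```

In the toy run the first segment is initially heard as "A", but once the second segment arrives the best analysis of both segments is "B, B", so the earlier interpretation is revised. Ambiguity would show up as a tie between the scores in `best`, which this sketch resolves arbitrarily in favour of the first label.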
Summing up
We can attempt to model the perception and cognition of musical structure by constructing algorithms that take representations of musical passages as input and generate structural descriptions of those passages as output.
We can evaluate such algorithms by comparing their output with expert human analyses and authoritative scores.
We can express a theory of musical structure as a preference rule system consisting of
  Well-formedness rules, which define the class of legal structural descriptions
  Preference rules: the legal structural descriptions that best satisfy the preference rules are predicted to be the ones that listeners are most likely to hear