




Joshua P. Salmon

Submitted in partial fulfillment of the requirements

for the degree of Master of Science


Dalhousie University
Halifax NS
July 2006

© Copyright by Joshua P. Salmon, 2006



The undersigned hereby certify that they have read and recommend to the Faculty

of Graduate Studies for acceptance a thesis entitled “ARCH YOU GLAD: THE AFFECT


by Joshua P. Salmon in partial fulfillment of the requirements for the degree of Master of


Dated: July 19, 2006

Supervisor: ___________________________

Readers: ___________________________




DATE: July 19, 2006

AUTHOR: Joshua P. Salmon





Permission is herewith granted to Dalhousie University to circulate and to have

copied for non-commercial purposes, at its discretion, the above title upon the request of
individuals or institutions.

Signature of Author

The author reserves other publication rights, and neither the thesis nor extensive
extracts from it may be printed or otherwise reproduced without the author’s written

The author attests that permission has been obtained for the use of any
copyrighted material appearing in the thesis (other than the brief excerpts requiring only
proper acknowledgment in scholarly writing), and that all such use is clearly


I would like to dedicate this thesis to my loving fiancée, Marsha, who showed
infinite patience putting up with me throughout the entirety of this work. I count my
blessings every day, and look forward to spending the rest of my life with her.


List of Tables … viii

List of Figures … x
Abstract … xii
List of Abbreviations and Symbols Used … xiii
Acknowledgements … xv
Chapter 1: Introduction …1
Affect and Musical Structure …1
Affect and Melodic Contour …5
Contour and Affect: More recently …7
Types of Melodic Contour … 10
Issues for Melodic Contour … 11
Inversions & Complementary Contours … 15
Inversions of the Melodic Arch … 17
Characterizing Melodic Contour … 18
Measuring Emotion of “Affect” … 20
The Current Study … 23
Chapter 2: Method … 27
Participants … 27
Procedure … 28
Tonality Assessment … 30
Affect Ratings … 31
Apparatus and Stimuli … 33
Tonality Assessment … 33
Affect Ratings … 34
Chapter 3: Annotated Results … 39
I. Reliability of Measures … 41
Within-Subject Reliability … 41
Between Groups Reliability … 43
Between Group ANOVA … 45
II. Descriptives … 46
Correlations between Affect Scales … 58
III. Modeling the Responses … 61
Overall ANOVA … 61
Analysis of Each Sequence … 63
Analysis of Each Affect Scale … 64
Modeling effect of Sequences … 65
Basic Statistics / Indices for Scales … 69
Predicting Responses from Basic Indices … 71
Comparison Table (% of variance explained) … 75
Modeling with Polynomials … 76
Results from Polynomial modeling … 82
Contour vs. Pitch Height … 82
Correlations of Coefficients to Responses … 86
Higher order terms … 88
Absolute Values of Coefficients … 91
IV. Summary of Results … 93
V. Additional (Secondary) Analyses … 93
Tonality … 93
Musical Experience … 94
Other Demographics … 95
Chapter 4: General Discussion … 96
Other Observations … 97
Pitch Height (Register / Octave) … 98
Linear (Ascending versus Descending) Contours … 100
The Melodic Arch … 103
The Influence (Effect Size) of Melodic Contour … 104
The function of Melodic Contour … 107
Limitations & Concerns … 108
Modeling … 109
Group Differences … 110
Affect Scales … 111
Future Directions … 111
References … 113
Appendix 1: Computer Instructions … 119
Appendix 2: Musical Background Questionnaire … 122


Table 2.1: Affect scales used for Affect Ratings, shaded area indicates
the affect scales rated by both groups. … 34
Table 3.1: Legend of abbreviations used for affect scales (words) … 40
Table 3.2: Legend of abbreviations used for musical sequences. … 40
Table 3.3: Significance of the change in responding from Time 1 to
Time 2 for each affect scale - sequence combination … 42
Table 3.4: Between group t-tests for repeated Affect Scales … 44
Table 3.5: ANOVA for affect scale contentment-joy … 45
Table 3.6: ANOVA for affect scale hesitation-confidence … 45
Table 3.7: Response Means for each Sequence & Affect Scale … 57
Table 3.8: Correlation of means across all 16 sequences for Group 1 … 59
Table 3.9: Correlation of means across all 16 sequences for Group 2 … 59
Table 3.10: Overall ANOVA, including both groups and all 10
affect scales … 62
Table 3.11: ANOVA for Group 1 … 62
Table 3.12: ANOVA for Group 2 … 62
Table 3.13: ANOVA for Sequences across Affect Scales … 63
Table 3.14: ANOVA for words across all sixteen Sequences … 65
Table 3.15: Transformation of Sequence notes based on a MIDI
coding scheme. … 67
Table 3.16: Transformation of Sequence notes based on a diatonic
coding scheme. … 69
Table 3.17: Simple Statistics per Sequence based on the chromatic … 70
Table 3.18: Simple Statistics per Sequence based on the diatonic … 71
Table 3.19: Obtained R² from Mean, Median & Mode Regressions … 72
Table 3.20: Obtained R² from Minimum & Maximum Regressions … 73
Table 3.21: Obtained R² from Standard Deviation & Range Regressions … 73
Table 3.22: Obtained R² from Final Note Regressions … 74
Table 3.23: % of Variance accounted for by the three Best Predictors:
mean, max and final note … 75
Table 3.24: Resulting coefficients per sequence when chromatic scaling
is used … 80
Table 3.25: Resulting coefficients per sequence when diatonic scaling
is used … 80
Table 3.26: Proportion of variance explained by contour … 83
Table 3.27: Proportion of variance explained by pitch height … 84
Table 3.28: Proportion of variance explained by all coefficients … 85
Table 3.29: Coefficient (Pearson r) Correlations to Responses … 86
Table 3.30: Pearson r-values. Coefficient inter-correlations … 91
Table 3.31: Correlations between Responses and Absolute Values
of Coefficients … 92
Table 4.1: Comparison between current ratings for high pitch, and
how equivalent terms were rated by Hevner (1936a) … 98
Table 4.2: Comparison between current ratings for low pitch, and
how equivalent terms were rated by Hevner (1936a) … 99


Figure 1.1: Examples of melodies with different contours … 10

Figure 1.2: Demonstration of complementary contours … 16

Figure 1.3: The COM-Matrix for <0 1 3 2> … 18

Figure 1.4: Examples of the different types of contours … 25

Figure 2.1: The overview of the computerized six stage design … 29

Figure 2.2: Basic design within each Block … 31

Figure 2.3: General instructions for Affect Ratings … 32

Figure 2.4: The complete set of stimuli used in Tonality Assessment … 35

Figure 2.5: The musical stimuli used for the Affect Ratings … 37

Figure 3.1: Profile for sequence 1Lau … 48

Figure 3.2: Profile for sequence 2Ldu … 48

Figure 3.3: Profile for sequence 3Lal … 49

Figure 3.4: Profile for sequence 4Ldl … 49

Figure 3.5: Profile for sequence 5Rau … 50

Figure 3.6: Profile for sequence 6Rdu … 50

Figure 3.7: Profile for sequence 7Ral … 51

Figure 3.8: Profile for sequence 8Rdl … 51

Figure 3.9: Profile for sequence 9Na … 52

Figure 3.10: Profile for sequence 10Nd … 52

Figure 3.11: Profile for sequence 11Mu … 53

Figure 3.12: Profile for sequence 12Mm … 53

Figure 3.13: Profile for sequence 13Ml … 54

Figure 3.14: Profile for sequence 14Wu … 54

Figure 3.15: Profile for sequence 15Wm … 55

Figure 3.16: Profile for sequence 16Wl … 55

Figure 3.17: “8Rdl” modeled by a fourth order polynomial … 78

Figure 3.18: “14Wu” modeled by a fourth order polynomial … 79

Figure 3.19: A sixth order polynomial fit for the sequence 14Wu … 81

Figure 3.20: Demonstration of how a negative correlation for a2 … 87

Figure 3.21: The relationship between the a1 and a3 term … 89

Figure 3.22: Shows the relationship between the a2 and a4 term … 90


This research examined the relationship between affect (emotional content) and
the shape of sixteen equi-temporal, eight-note musical sequences designed to differ only
in melodic contour (ascending, descending, arch, inverted-arch, etc.). Two groups of
participants rated the intended affect of each sequence on ten affect scales
(contentment-joy, delicacy-strength, excitement-boredom, passivity-aggression, etc.).
Melodic contour was modeled using fourth-order polynomial equations. In these equations,
the first term captures the average pitch height, the second term captures the ascending /
descending aspect of the shape, and the third term captures the arch aspect of the shape.
Higher-order terms capture nuances. The analysis indicated that mean pitch height was a
primary predictor on eight of the affect scales. The analysis also indicated that the
ascending/descending aspect was important for five affect scales, while the degree of
arch was important for eight affect scales. Different terms were important for different
affect scales.
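
As an illustration only, the polynomial description of contour summarized above can be sketched in a few lines of Python. The sketch assumes orthogonalized polynomial terms over the eight note positions, so that the constant term equals the mean pitch and the linear and quadratic terms are free to describe slope and arch. The abstract does not state how the terms were parameterized, so this is an assumption. The function names and the two example sequences (given as MIDI note numbers) are hypothetical and are not the thesis stimuli.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def orthogonal_basis(n_points, degree):
    """Discrete orthogonal polynomials over note positions 0..n_points-1,
    built by Gram-Schmidt orthogonalization of 1, t, t^2, ... (assumed scheme)."""
    positions = [float(i) for i in range(n_points)]
    basis = []
    for power in range(degree + 1):
        vector = [t ** power for t in positions]
        for earlier in basis:
            scale = dot(vector, earlier) / dot(earlier, earlier)
            vector = [v - scale * e for v, e in zip(vector, earlier)]
        basis.append(vector)
    return basis


def contour_coefficients(pitches, degree=4):
    """Least-squares coefficients a0..a_degree on the orthogonal basis.
    a0 is the mean pitch, a1 the overall slope, a2 the degree of arch."""
    basis = orthogonal_basis(len(pitches), degree)
    return [dot(pitches, b) / dot(b, b) for b in basis]


# Hypothetical eight-note sequences (MIDI note numbers), not the thesis stimuli.
ascending = [60, 62, 64, 65, 67, 69, 71, 72]
arch = [60, 62, 64, 65, 65, 64, 62, 60]

for name, sequence in (("ascending", ascending), ("arch", arch)):
    coefficients = contour_coefficients(sequence)
    print(name, [round(c, 3) for c in coefficients])
```

Under these assumptions, the ascending sequence yields a positive linear coefficient, whereas the symmetric arch yields a linear coefficient of zero and a negative quadratic coefficient.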


Symbol Meaning
1:Lau Musical sequence that is: (L)inear, (a)scending, & in (u)pper range
2:Ldu Musical sequence that is: (L)inear, (d)escending, & in (u)pper range
3:Lal Musical sequence that is: (L)inear, (a)scending, & in (l)ower range
4:Ldl Musical sequence that is: (L)inear, (d)escending, & in (l)ower range
5:Rau Musical sequence that is: A(R)ched, (a)scending, & in (u)pper range
6:Rdu Musical sequence that is: A(R)ched, (d)escending, & in (u)pper range
7:Ral Musical sequence that is: A(R)ched, (a)scending, & in (l)ower range
8:Rdl Musical sequence that is: A(R)ched, (d)escending, & in (l)ower range
9:Na Musical sequence that is: (N)-shaped, & (a)scending
10:Nd Musical sequence that is: (N)-shaped, & (d)escending
11:Mu Musical sequence that is: (M)-shaped, & in (u)pper range
12:Mm Musical sequence that is: (M)-shaped, & in (m)iddle range
13:Ml Musical sequence that is: (M)-shaped, & in (l)ower range
14:Wu Musical sequence that is: (W)-shaped, & in (u)pper range
15:Wm Musical sequence that is: (W)-shaped, & in (m)iddle range
16:Wl Musical sequence that is: (W)-shaped, & in (l)ower range
α alpha, the probability of rejecting the statistical hypothesis tested when in
fact, that hypothesis is true.
a0 through a6 Regression coefficients from polynomial modeling
ANOVA Analysis of variance, a statistical method for making simultaneous
comparisons between two or more means.
χ² Chi-squared, a test statistic for comparing observed and theoretical counts.
cf. confer (compare)
cont-joy1 affect scale: contentment-joy (for Group 1)
cont-joy2 affect scale: contentment-joy (for Group 2)
df degrees of freedom
del-str affect scale: delicacy-strength

ener-tran affect scale: energy-tranquility
exci-bore affect scale: excitement-boredom
F F-value from the ANOVA
Grp Group
hes-conf1 affect scale: hesitation-confidence (for Group 1)
hes-conf2 affect scale: hesitation-confidence (for Group 2)
Hz Hertz, or number of cycles (repetitions) per second
IBM International Business Machines
irr-calm affect scale: irritation-calmness
M the arithmetic mean
Ma Magnitude (or ratio) of the difference between two R²’s
mm “metronome marking” or the pace of music measured by the number of
beats in 60 seconds.
η² Eta squared, a measure of effect size
MIDI Musical Instrument Digital Interface
ms milliseconds
pass-agr affect scale: passivity-aggression
pens-play affect scale: pensiveness-playfulness
ques-ans affect scale: questioning-answering
r Pearson product-moment correlation coefficient
R² R-squared, the relative predictive power of the model or the proportion of
variance explained
# a musical sharp, as in A#, C#, etc.
SD Standard Deviation
SPSS Statistical Package for the Social Sciences
sur-exp affect scale: surprise-expectation
t t-value from a t test


I would, foremost, like to thank my supervisor, Dr. Bradley Frankland, for without
his help and counsel this work would have been entirely impossible. I would also like to thank
my committee, Drs. Raymond Klein and Patricia McMullen, who helped keep me on
track and provided encouraging words when they were needed most. I would like to
thank my fiancée, Marsha, who provided support in just about all aspects of this work. I
would like to thank my sister, Tessah Woodman, for her help with proof-reading and
pilot work. I would like to thank my family for always being supportive. Finally, I would
like to thank everyone who took time out of their busy schedules to participate and give
me feedback during the piloting of this research.


It is widely known that music conveys an emotional message. Anyone who has
watched a Hollywood movie can attest to the emotionally heightening effect of music.
The consistency of the message is strong enough that music is often relied on to induce
moods in controlled studies (e.g. Richell & Anderson, 2004; Van der Does, 2002;
Västfjäll, 2001-2002). Ongoing research since at least the 1930s has attempted to
characterize the structures of music that carry this message. For example, the effects of
changes in mode, pitch, tempo, harmony and rhythm on emotion have been well
documented in a number of studies (e.g. Hevner, 1935; 1936a; 1936b; Gagnon & Peretz,
2003; Schellenberg, Krysciak, & Campbell, 2000; Webster & Weir, 2005).
However, the effects of more complex musical structures like melodic contour have
proven to be much more elusive. This research explores the relationship between melodic
contour and emotion (affect).
This thesis begins with a discussion of the previous research into the affect of
various music structures using, as much as possible, the terminology applied by the
research being described. However, it is important to note that terminology employed by
researchers in this field is not always consistent, and therefore, several terms are applied
to essentially the same construct. For example, the term describing the octave or register
of a melody has also been described as the pitch, pitch level, pitch height, and mean or
average pitch height. Additionally, the musical stimuli such as those used in this thesis
have been described as sequences (a series of ordered pitches), but they could also be
described as phrases, melodies or compositions.

Affect and Musical Structure

Hevner (1935; 1936a; 1936b) was one of the first to look at affect and musical
structure. Hevner tested the relationship between different music structures (mode,
tempo, pitch, harmony, rhythm and contour) and affect through careful manipulations of
one aspect of music structure at a time. Segments of classical music (8 measures) were
chosen, rearranged, and then played live by a concealed pianist for audiences of
participants. Each participant heard only one version of the piece of music and was
asked to rate the emotional content by checking off all the adjectives that applied from a
checklist provided. Hevner then compared affective responses between groups that heard
different versions of the same musical piece (1935; 1936a; 1936b).
For example, to assess the effect of mode, pieces played in a major key were
transposed into a minor key, and vice versa. Participants were then divided into two
groups with one group hearing the original and the other group the transposition. The
results indicated that participants rated the major key versions as happy, light, sprightly,
cheerful, joyous, gay, bright, merry, playful, graceful and exhilarated. Conversely, the
minor key versions were rated as pathetic, melancholy, plaintive, yearning, mournful,
sad, sober, pleading, mysterious, longing, doleful, gloomy, restless, weird, mystical,
depressing, etc. (Hevner, 1935, 1936b).
To assess tempo -- the speed at which the music was played -- Hevner had the
pianist learn to play the pieces at two different tempos (speeds). The results indicated that
participants rated fast pieces as happy, bright, exciting and elated. Conversely, slow
pieces were rated as serene, gentle, dreamy and sentimental (Hevner, 1936a).
To assess pitch -- the register in which a piece was played -- the pianist played the
same piece in different octaves. In other words, pieces that were normally played on the
high notes of the piano were transposed to be played on low notes and vice versa. The
results indicated that pieces played in the high register (or high pitch) were rated as
graceful, sparkling, sprightly, humorous, et cetera. Conversely, low pitch was rated as
sad, heavy, vigorous, majestic, dignified, serious, et cetera (Hevner, 1936a).
To assess harmony -- the non-melodic part of a composition -- Hevner rearranged
pieces with simple consonant harmonies so that they would have dissonant (complex)
harmonies. The results indicated that simple harmonies were rated as happy and bright.
Complex harmonies were rated as exciting and elated (Hevner, 1936b).
To assess rhythm -- the sense of motion of the music -- rhythms with a firm beat
and full chord (a chord is a collection of notes played simultaneously) were changed to be
more smooth and flowing with supporting chords spread evenly throughout the measure.
The results indicated that firm rhythms were rated as dignified and solemn. Flowing
rhythms were rated as happy, bright, dreamy, and tender (Hevner, 1936b).

Finally, Hevner attempted to assess the affect of melodic contour, or the “rising
and falling of the melodic line” (1936b, p.256). Assessing the affect of melodic contour
(the subject of this thesis), however, turned out to be quite difficult. Hevner’s attempt is
discussed in detail in the next section. For now, it is important to note that, of all the
musical manipulations performed by Hevner, melodic contour was the most difficult
to carry out. In contrast, manipulations of musical mode, pitch and
tempo are quite straightforward and have been well studied in both past and
recent research (recent examples include Costa, Bitti, & Bonfiglioli, 2000; Costa, Fine, &
Bitti, 2004; Gagnon & Peretz, 2003; Juslin & Madison, 1999; Pittenger, 2003;
Schellenberg et al., 2000; Vines, Nuzzo, & Levitin, 2005; Webster & Weir, 2005).
As an example, Webster and Weir (2005) had 177 college participants rate short
musical phrases on a continuous happy – sad dimension. The sequences were
manipulated by mode (major vs. minor), tempo (72, 108, 144 beats per min) and texture
(nonharmonized vs. harmonized). Consistent with Hevner’s (1935, 1936a, 1936b) results,
and other research in the field, their results indicated that the major key, faster tempos
and nonharmonized (monophonic) music were associated with happier ratings. Sad
ratings were associated with the minor key, slower tempos, and harmonized music.
Gagnon and Peretz (2003) performed a similar experiment in which only mode
(major vs. minor) and tempo (fast vs. slow) were manipulated for equitone melodies (all
notes the same length). Again, participants rated music along the happy – sad dimension,
and the results showed that major keys and fast tempos were associated with happier
ratings. Gagnon and Peretz (2003) also examined the isolated and combined effects of
mode and tempo, and their results suggested that tempo was more salient. That is,
manipulations of tempo showed a larger effect (on affect) than manipulations of mode.
Additionally, Costa et al. (2000) examined the musical affect of both mean pitch
height (register) and harmonic intervals. They called their musical stimuli “bichords”
because they were essentially two-note chords. They played the twelve different possible
harmonic intervals (bichords) in either a low or high register. The judging was done by
43 university students on 30 bipolar adjective scales. Costa et al.’s results indicated that
low register bichords were evaluated as more emotionally negative than high register
bichords. All high register bichords were also judged as more unstable, mobile, restless,
dissonant, furious, tense, vigorous and rebellious compared to low bichords. With regard to
intervals, Costa et al.’s results indicated that “dissonant” bichords (second and seventh
chords) were rated as more negative, unstable, and tense than consonant ones. Finally,
Costa et al. (2000) also considered the difference between minor and major bichords, and
found minor bichords were perceived as more dull, mysterious, gloomy, and sinister.
As a final example, Schellenberg et al. (2000) used short melodies that had
already been judged consistently to convey a single emotion (happy, sad or scary) and
manipulated them along either the dimension of pitch or rhythm. Unlike Hevner (1936a),
however, their manipulation of pitch was not one of high versus low, but instead one of
varying versus equal pitch. In other words, for pitch they compared the usual melody
(that varies in pitch) to one in which all pitches were set to the median pitch level. A
similar manipulation was done for rhythm (varying rhythm vs. equal rhythm).
Undergraduate students were asked to rate the original and altered versions on the
corresponding unipolar affect scale. That is, the set of happy melodies were rated on how
happy they were from 1 (not at all happy) to 7 (extremely happy). Sad melodies were
rated on how sad they were, and scary melodies on how scary. Their results indicated that
affective ratings were more influenced by differences in pitch than by differences in
rhythm. That is, pitch appeared to be a more emotionally salient feature of music in this
research. This was an interesting result, because even though their study was not
officially studying melodic contour, melodies in different pitch conditions expressed
different melodic shapes. Thus, Schellenberg et al.’s results suggest that melodic contours
may, indeed, express affect.
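
The pitch manipulation described above, in which a varying melody is compared with a version whose pitches are all set to the median pitch level, can be sketched as follows. This is a minimal illustration rather than Schellenberg et al.'s actual procedure: the example melody is hypothetical, pitches are written as MIDI note numbers, and the use of the lower median for even-length sequences is an assumption made so that the flattened pitch is always a note that occurs in the melody.

```python
from statistics import median_low


def flatten_to_median(pitches):
    """Return a sequence of the same length in which every note is replaced
    by the (lower) median pitch of the original sequence."""
    level = median_low(pitches)
    return [level] * len(pitches)


# Hypothetical melody (MIDI note numbers), not taken from the cited study.
melody = [60, 64, 67, 64, 62, 60, 59, 60]
flat_version = flatten_to_median(melody)
print(flat_version)
```

The flattened version keeps the number of notes (and, in a fuller representation, the rhythm) but removes both the contour and every melodic interval, which is why an effect of this manipulation cannot be attributed to contour alone.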
Many more examples of research exploring the effects of musical structure on
affect could be cited. The affect of musical structures such as mode, tempo, pitch,
harmony and rhythm has been extensively studied (tempo and mode in particular).
However, little research has been done on the relationship between melodic contour and
emotions (affect). The primary reason for this asymmetry is that melodic contour is
comparatively difficult to manipulate. The next section will explore some of the attempts
to measure the emotional (affective) content of melodic contour, returning first to
Hevner (1936b) and her research.

Affect and Melodic Contour
Put simply, melodic contour refers to the change in pitch (or tone) of notes
over time. To assess the effect of melodic contour, one would like to manipulate contour
in a systematic, balanced fashion, without simultaneously altering any other attributes of
the music. The alterations should maintain a sense of "musicality". This is not easily
Hevner (1936b) tried to manipulate contour through “inversion[s] on the original
melody” (p.258). In other words, she chose generally rising (ascending) or falling
(descending) melodies and tried to find the optimal opposite (or complementary) contour
for each composition. In applying this approach, many of the melodies she first tried were
found to be “unsuitable” (p. 258). She listed many difficulties in trying to construct her

(1) New melodies often did not sound sensible or logical, and [were] often
unmusical, and even unpleasant. Many compositions were discarded for this
(2) The harmony demanded by the new melody was not always the same as the
harmony from the original.
(3) It was sometimes difficult to keep the relation of the melody to the tonic or
keynote exactly parallel throughout the length of the two versions. (Hevner,
1936b, p. 258-259)

What Hevner meant by point three (3) was that there are certain notes in any melody that
help define the key and serve as structural centers for the melody. It was difficult to keep
these notes in the same place in all versions. Suffice to say, Hevner had to do some
creative tweaking to design satisfactory new melodies. Therefore, while Hevner’s design
suggested that she was studying melodic contour by having a pianist play two versions of
the same piece of music to participants, these versions really differed on many more
factors than just melodic contour.
As a result of the confounds, it is not surprising that Hevner’s research showed a
much weaker effect for her manipulation of contour than for any of her other
manipulations (e.g. mode, harmony, rhythm). She summed up her findings on
contour differences as neither “clear-cut, distinct or consistent” (Hevner, 1936b, p. 268).
However, the results suggested a “tendency” for descending melodies to express both
exhilaration and serenity, and ascending melodies to express dignity and solemnity. In
other words, contour may have had an effect, but the design was not powerful enough to
yield consistently significant results.
For understanding the relationship between musical structure and affect, Hevner’s
design (1936b) had a number of additional weaknesses. First, the sections of music were
fairly long (8 measures). The problem with long pieces is that they quickly become
difficult to model. If the affective responses only vary subtly with the musical structure
then there is the potential for effects to cancel each other out over time. Even when an
overall effect is found, with long pieces it becomes difficult to know if the reported affect
was related to summation over the entire piece, or a particular section of the music.
Shorter pieces simplify the analysis and interpretation.
Second, Hevner (1936b) chose real music pieces complete with different keys,
pitches, harmony, varying rhythms, dynamics, et cetera. As previous research has shown,
some musical structures are more salient than others. Thus, when all musical structures
are present and varying throughout the piece, it can become very difficult to tease out the
effect of the specific structure that is being measured, especially if that structure
is less emotionally salient than the others. Its influence may
be masked by the effects of the other structures present.
Third, Hevner’s (1936b) research relied on a live performer. Of
course, at the time of Hevner’s research, no other options were available, but this is a
problem because there is no guarantee that a live performer will play the piece exactly as
intended by the researcher. Today, highly precise computer
software can be used to play music for participants.
Lastly, Hevner (1936b) attempted to assess melodic contour through
compositional inversions. These are very difficult to construct, especially in long pieces
containing harmony and varying rhythms, as her selections contained. The reason
inversions are so difficult to construct will be discussed in more detail later on. Suffice to
say that subsequent work tended to simplify the problem by using short (approximately 8
note) equi-temporal (no rhythm) music containing no harmony (monophonic). Two of
these designs will now be discussed.

Contour and Affect: More recently
Scherer and Oshinsky (1977) addressed the question of contour and affect by
having participants rate the affective content of eight short synthesized tone sequences
(sawtooth wave bursts). Each “tone sequence” was comparable to a musical note.
However, they called their stimuli “tone sequences”, as opposed to notes, since they were
not necessarily musical and were designed to mimic structures of speech as much as
structures of music. Scherer and Oshinsky manipulated their sequences on a number of
variables including pitch contour (up or down), pitch variation (small or large), and pitch
level (low or high). By up or down they were referring to what in music is usually
described as ascending or descending contours. Pitch variation referred to whether the
sequence spanned large or small intervals. Pitch level referred to pitch (frequency), or
whether the music was played with low or high notes. The affective scales used contained
3 bipolar scales (pleasantness-unpleasantness, activity-passivity, potency-weakness), and
a checklist of seven adjectives (anger, fear, boredom, surprise, happiness, sadness, and
disgust). Participants would listen to each sequence twice and rate on the bipolar scales
during one listening and the checklist items on the other. Unlike Hevner’s research
(1936b), Scherer and Oshinsky (1977) did find participants responded reliably to
manipulations of pitch contour when asked to “judge the kind of emotion” that each
sequence expressed (p. 334). Specifically, they found that upward pitch contour and
high pitch level were both rated as expressing fear, surprise, anger, and potency. On the
other hand, downward pitch contour and low pitch level were both rated as expressing
boredom, pleasantness, and sadness (p. 339). Of the two variables, pitch level appeared to
have the stronger effect on affect ratings. Thus, mean pitch level / height appeared to
be a more salient emotional feature than pitch contour in this research.
Scherer and Oshinsky’s (1977) research had a number of methodological
advantages over Hevner’s (1935; 1936a; 1936b). These advantages included using
computer-controlled presentations instead of a live pianist, avoiding harmony, and using
shorter pieces / sequences. A computer-controlled presentation is better because it can
play the same sequence the same way every time. Avoiding harmony, that is, using monophony
(playing only one note at a time), simplifies interpretation. Short pieces (8 notes) simplify
interpretation further relative to longer pieces (e.g. 8 measures or approximately 32 notes, as
used in Hevner’s design). That is, in shorter pieces less happens, so there is greater
certainty as to what aspect of the piece is causing the effect. Thus, Scherer and Oshinsky
demonstrated that using short sequences and avoiding the complexities of harmony can
be an effective way to measure the relationship between affect and structures of music
such as contour.
However, Scherer and Oshinsky’s (1977) design left a few questions open. First of all,
the sequences in their study were not designed to sound musical, as their study was a
speech study as much as it was a music study. Thus, it is possible that their
stimuli were processed by parts of the brain responding to language in addition to parts of
the brain responding to music. This may or may not be an important point. Secondly,
Scherer and Oshinsky restricted their study of contour to only two types (up or down), but
there are many more types of contour than this (e.g. the arch). Thus, Scherer and
Oshinsky’s results seem to indicate that contour can influence affect given the right design, but
their study does not provide any definitive answers, especially for other types of contours
like the arch.
Gerardi and Gerken (1995) addressed the concerns about musicality by using
short, obviously musical, compositions. Melodies were chosen from Music for Sight
Singing by Richard Ottman (1956) that were composed in either the major or minor key
and consisted of predominantly ascending or descending lines. They then
transformed each melody along both modality and contour. Since their melodies
contained no harmony (monophonic), inversions of melodies were easier than they had
been for Hevner (1936a). Their chosen method of inversion was to write the melodies
backward (retrograde), while maintaining the original rhythm (Gerardi & Gerken, 1995).
Affect was rated on only one bipolar “happy-sad” scale by 5-year-old, 8-year-old and
college-aged participants. Results indicated that participants of all ages responded to
manipulations of mode, but only college-aged participants responded to the manipulation
of contour. That is, all participants rated the major keys as happy, but only college-aged
participants rated ascending melodies as happier than descending ones. For this study,
mode (major vs. minor) was a more powerful indicator of affect than contour, at least
along the happy-sad dimension.

Similar to Hevner (1936b), Gerardi and Gerken (1995) used melodies that still
contained rhythm (in this case, not all notes were equal length). Rhythm becomes a
problem when doing inversions, since longer notes are perceived as more important by a
listener, and inversions that maintain rhythm may change which notes are considered
important in a melody. Additionally, Gerardi and Gerken only employed one affective
bipolar scale (happy-sad), and of course, there are many more emotions than just these
two. Finally, their design followed the trend of other research in this area in that it only
considered ascending and descending sequences. But there are more than just these two
contours in music (e.g. there is the arch).
Gerardi and Gerken’s (1995) results, to our knowledge, are one of few to report a
significant relationship between melodic contour and affect. Schellenberg et al.’s (2000)
and Scherer and Oshinsky’s (1977) results, however, could be interpreted as a significant
finding for contour. That is, Schellenberg et al. (2000) found pitch varying melodies
(non-flat contours) expressed a significantly stronger affective meaning than non-varying
(flat) contours. This result, however, is confounded by the fact that intervals themselves
convey affective meaning (cf. Costa et al., 2000), so the relative effect of contour versus
absence of intervals was difficult to tease out in their design. In Scherer and Oshinsky’s
(1977) case, their stimuli were not exactly musical; thus the extrapolation of their results
to true musical stimuli is unclear.
Even though reports of a significant relationship between melodic contour and
affect are rare, it is common for studies to report “trends” that indicate this relationship.
For example, Hevner (1936b) reported observing a trend of ascending contours to express
dignity and solemnity, and descending contours to express exhilaration and serenity.
Schubert (2004) claimed a trend for ascending contours to convey happiness (the details
of his design are discussed later). Finally, Gerardi and Gerken (1995) claimed seeing a
trend in their 8-year-old participants to rate ascending contours more positively,
although this trend was not observed in 5-year-old participants. Thus, previous research
has suggested that a relationship between affect and melodic contour does exist, but
demonstrating this relationship has not been a trivial task.
This research explores the relationship between melodic contour and affective /
emotional responses. It does so by keeping the design simple, while extending the design

to include more types of melodic contour, as well as more affective scales to hopefully
capture more aspects of emotion.

Types of Melodic Contour

Melodic contour was previously defined as the change in pitch (or tone) of notes
over time. It could also be the perception of the change in pitch height over time. While
these simple definitions capture the essence of melodic contour, it is important to realize
that the movement between notes is constrained by the rules or principles of good music
composition. Not just any sequence of notes is a melodic contour. In addition, several
other issues --particularly the notion of tonality (the sense of key) -- are relevant to the
concept of a melodic contour. These issues will be discussed in more detail below.
Numerous terms have evolved to describe different melodic contours. When
subsequent notes in melody tend to be higher than previous ones, the contour is described
as “ascending”. When subsequent notes tend to be lower the melody has a “descending”
contour. When a melody rises, and then falls, returning close to its original pitch height it
is labeled as an “arch”. The opposite pattern (falling first and then rising) can be called an
“inverted arch”, et cetera. Figure 1.1 shows an example of these four main contours,
including the music score and the overall shape of the contour (drawn above the score).

[Figure 1.1. Examples of melodies with different melodic contours. Panels: A. Ascending Contour; B. Arch Contour; C. Descending Contour; D. Inverted Arch Contour.]

Intuitively, there are many different shapes a melodic contour can take. However,
it is difficult to convert that intuition into a meaningful, concise, classification of all

possible shapes. One problem is that the number of possible contours increases
dramatically with the number of notes. For example, with only two notes the second note
can only rise in pitch (ascending), fall in pitch (descending) or remain the same pitch
(flat). Simplistically, these contours could also be referred to as “linear” since the contour
can be represented by a straight line between the notes.
With the addition of a third note, arches become possible (e.g., rise then fall, or
fall then rise). The addition of a third note also creates the possibility of more complex
“ascending” (or “descending”) contours. A contour may rise slowly then more rapidly,
may rise at a constant rate (i.e., linear) or may rise rapidly and then more slowly. That is,
even the simple notions of ascending and descending contours become more complex
with more notes. Arches may be combined with a generally increasing (or decreasing)
contour. With the addition of a fourth note, it becomes possible to have contours with two
changes in direction (e.g. an N-shape), and so on. In addition, such complex contours
may be combined with a generally ascending (or descending) component and/or a general
arch shape.
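The growth in the number of possible contours can be illustrated with a short Python sketch. It assumes, only for illustration, that a contour is reduced to the direction of each step between adjacent notes (rise, fall, or same); the function name direction_patterns is hypothetical and is not part of the original study. Because step sizes are ignored, the counts understate the variety described above.

```python
from itertools import product

def direction_patterns(n_notes):
    """Return every sequence of step directions for a melody of n_notes notes.

    Assumption: a contour is reduced to the direction of each note-to-note
    step: '+' (rise), '-' (fall) or '0' (same pitch). Step sizes are ignored.
    """
    return list(product("+-0", repeat=n_notes - 1))

for n in (2, 3, 4, 5):
    print(n, "notes:", len(direction_patterns(n)), "direction patterns")
```

Under this simplification the count triples with every added note (3, 9, 27 and 81 patterns for two to five notes).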
To our knowledge, most previous research has used only linear (ascending and
descending) melodies when considering emotion / affect. However, music does more
than just monotonically rise or fall -- it will often change direction. In fact, the melodic
“arch” is known as one of the most common melodic contours in music. For example,
when Huron (1996) classified the presence of different melodic contours in over 36,000
phrases taken from western folksongs he found that the melodic arch (in its regular
convex form) occurred more than any other type of contour. Yet, we are not aware of a
single study that has looked at whether or not arches convey emotional meaning. Thus,
this design included consideration of arch contours in addition to the impact of other
complex contours.

Issues for Melodic Contour

One main problem with the classification or measurement of melodic contour is
the notion of pitch or pitch height. Pitch height refers to the frequency or “pitch” or
“tone” (in cycles per second or hertz, Hz) of the individual notes of a sequence. However,

the scaling of such notes within a melody is not straightforward because the perception of
pitch height in music is complicated by several factors.
First of all, the perception of pitch (or sound frequency) in music is very different
from the physics underlying the actual sounds. In nature, sounds may take on any
frequency and humans are generally considered to be able to hear sounds between 20 Hz
and 20,000 Hz. However, most western music is composed of a limited set of discrete
tones, and each has a single predefined frequency and a specific name or label (e.g., the
note A in the fourth octave or A4 is 440 Hz). That label is a letter/number combination
usually in the range of 1 - 128 (this is not completely standardized). In music, notes are
treated as categorical (not continuous) and it is argued that they are also heard
categorically (cf. Handel, 1993). The standard piano keyboard is the classic
representation of this concept.
There are further complications. Tones that are doubled in frequency sound very
similar -- they sound like the "same" note. For example, A4 at 440 Hz seems the same as
A3 at 220 Hz, and C3 at 262 Hz sounds the same as C4 at 524 Hz. Hence, any two notes
that have a frequency ratio of 2 are given the same label. For example, the tone "A"
occurs as 27.5 Hz, 55 Hz, 110 Hz, 220 Hz, 440 Hz, 880 Hz, 1760 Hz, 3520 Hz, et cetera.
The tone "C" occurs as 32.7 Hz, 65 Hz, 131 Hz, 262 Hz, 524 Hz, et cetera. This is called
octave equivalence.
The range of frequencies between two repetitions of the same note is called an
"octave". There is one octave between A3 and A4, and another between A4 and A5. Each
octave -- no matter which two notes define the octave -- is further divided into 12 tones.
However, these notes are spaced as equal intervals on a logarithmic scale. If one sets A0
as note number 1, then the equation is:
$$\mathrm{freq} = 27.5 \times 2^{(N-1)/12}$$
The octave from A3 to A4 is divided into the notes A3 (220 Hz), A#3 (233 Hz), B3 (247
Hz), C3 (262 Hz), C#3 (277 Hz), D3 (294 Hz), D#3 (311 Hz), E3 (330 Hz), F3 (349 Hz),
F#3 (370 Hz), G3 (392 Hz), G#3 (415 Hz), and A4 (440 Hz). This is called "equal
temperament tuning" (this is a simplification of many aspects of modern music;
frequencies are approximate; other formulas are used). The 12 notes cited repeat over all
octaves and it does not matter which note is considered the first (e.g., it could be A to A

or C to C). Hence, notes are often referred to simply as A, A#, B, C, C#, D, D#, E, F,
F#, G, G#, (A) as octave information is not relevant.
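As a rough check on the equal-temperament formula above, the following Python sketch computes the frequency of note number N with A0 taken as note 1. The function name note_frequency and the NAMES list are illustrative assumptions; the note names carry no octave numbers because octave-numbering conventions differ.

```python
def note_frequency(n):
    """Frequency in Hz of note number n, assuming A0 = note 1 = 27.5 Hz."""
    return 27.5 * 2 ** ((n - 1) / 12)

# The 12 note names of one octave, starting from A (sharps only).
NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Notes 37 to 49 span the octave from 220 Hz to 440 Hz.
for n in range(37, 50):
    print(n, NAMES[(n - 1) % 12], round(note_frequency(n)))
```

Under this numbering the note at 440 Hz is note 49. Each step multiplies the frequency by the same ratio (about 1.0595), which is why the steps are equal on a logarithmic scale but not in Hz.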
Note that the frequency steps are not equal. Yet, such notes when played on a
regular piano tend to be perceived as following each other in a linear manner. That is,
each note seems to be "one more-or-less-equal step" above or below its neighbors. For
music, it is the perception that matters. Thus, quantifying the perception of pitch height of
individual notes is best done along some sort of linear scale that refers to each discrete
pitch used in music. That linear scale is usually referred to as the chromatic scale. Most
simply, a chromatic scale assigns a cardinal number to each musical pitch. For the notes
of the typical piano, the numbers 1 through 88 are used to represent all the black and white
keys, with 1 referring to the lowest note (A0 = 27.5 Hz) and 88 to the highest note (C7 =
4186 Hz), with A4 set to note number 49. Other systems (such as MIDI notation) set A4
(still 440 Hz) to note number 69, so as to allow for frequencies below 27.5 Hz.
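The piano and MIDI numbering systems differ only by a constant offset. The sketch below assumes the standard MIDI convention, in which A4 (440 Hz) is note 69 and middle C is note 60; the function names are hypothetical and are used only for illustration.

```python
def piano_key_to_midi(key):
    """Convert a piano key number (1 to 88, with A0 = 1) to a MIDI note number."""
    if not 1 <= key <= 88:
        raise ValueError("piano keys are numbered 1 to 88")
    return key + 20

def midi_to_frequency(midi_note):
    """Equal-temperament frequency in Hz, with MIDI note 69 tuned to 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

# Lowest piano key, and the key tuned to 440 Hz.
for key in (1, 49):
    midi_note = piano_key_to_midi(key)
    print(key, midi_note, midi_to_frequency(midi_note))
```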
However, even with the notion of a chromatic scale, the perception of pitch height
along that linear chromatic scale is not straight-forward because not all notes within a
single octave are perceived as equal. For example, some notes may sound "sharp" or
"flat" when played in the context of other notes. Notes that sound sharp or flat, in context,
are often referred to as accidentals.
This phenomenon arises because most western music only uses a pre-defined
subset of 7 notes of the 12 possible notes within a single octave. Different subsets are
defined, and each subset is called a key. There is one major key that starts on each of the 12
possible notes of the octave. For example, C major uses the notes C, D, E, F, G, A, B, (C)
of an octave, whereas A major uses A, B, C#, D, E, F#, G#, (A). Note that these tones are not
equally spaced within the octave: In the key of C major, the step from C to D is 2
chromatic notes, from D to E is 2, from E to F is 1, from F to G is 2, from G to A is 2,
from A to B is 2, and from B back to C is 1. The same steps are used in the key of A
major (2, 2, 1, 2, 2, 2, 1), but starting with note A. Hence all major keys use the same steps
(or intervals) but start at different points of the octave. In addition, there are two minor
keys that correspond to each note of the octave. The two minor keys use different steps
(intervals). Music that is restricted to the notes of one key is called "diatonic". Hence,
diatonic is often used synonymously with key, though technically this is incorrect. It is

rare for music to be completely restricted to the notes of one key -- notes taken from
outside the key can add color or interest. However, if most of the notes are confined to
one key, the music is called diatonic, and notes taken from outside the key are called
chromatic notes (in the sense of adding "colour") or accidentals.
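The way a major key is built from the step pattern can be sketched in Python. This is a minimal illustration, assuming sharps-only note names (so keys normally written with flats would be spelled unconventionally); the names major_scale and out_of_key are hypothetical.

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # chromatic steps between successive scale notes

def major_scale(tonic):
    """Return the 7 notes of the major key that starts on tonic."""
    index = CHROMATIC.index(tonic)
    notes = [tonic]
    for step in MAJOR_STEPS[:-1]:  # the last step only returns to the tonic
        index = (index + step) % 12
        notes.append(CHROMATIC[index])
    return notes

def out_of_key(melody, tonic):
    """Return the notes of melody that lie outside the major key of tonic."""
    key = set(major_scale(tonic))
    return [note for note in melody if note not in key]

print(major_scale("C"))
print(major_scale("A"))
print(out_of_key(["C", "E", "F#", "G"], "C"))
```

The same step list produces every major key; only the starting note changes. Notes returned by out_of_key are the ones that would be heard as chromatic notes in that key.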
If a composition is diatonic, then the 7 notes within an octave "seem" to be
adjacent. That is, for diatonic music, each diatonic note seems to be "one more-or-less-
equal step" above or below its neighbors. In such music, out of key notes sound "sharp"
or "flat". Hence, one can perceive the "scale" of the music as chromatic or diatonic
depending on context. If the perception is diatonic, then usually one will simplify the
numerical notation to represent only the 7 notes of the key. That is, the diatonic notes of
the C major key would be labeled as 1, 2, 3, 4, 5, 6, 7 and 8 for C D E F G A B and C.
This system only works if the music is diatonic.
In compositions, a musical key is usually explicitly stated at the beginning of a
written score (the key-signature). Key is important because musical key also indicates the
“tonal center” or “main note” of a melody. That is, within a key, not all 7 notes are
perceived as equally important. For example, when writing in the key of C major, the
note “C” is the most important, should occur most frequently, and is usually the note the
melody ends on. This is called the tonic or root. Other notes also have special
significance. The notes of the "tonic triad” (C, E, G) are also important in the key “C”.
These notes are C (root), E (third), and G (fifth). Thus, music in the key of C major
should also contain a lot of E’s and G’s, with G being the more important. If a
composition violates these rules of note frequency the resulting melody will sound less
like it is in the key of “C”, and may instead sound like another key or even “unmusical”.
Generally, there is a "tonal hierarchy" that defines the relative importance of the 7
diatonic notes within a single key. This tonal hierarchy has been mapped by Krumhansl
and others (see Krumhansl, 1990), and is related to the frequency of note use, the
summed duration of note use, and other aspects of musical structure.
Not only is occurrence of the triad notes important to tonal sense (or sense of
key), but rhythm and harmony are also important. That is, both rhythm and harmony can
be used to accentuate notes in a melody. In the key of “C” the notes C, E, and G are
perceived as stable notes or resting points in the music. Thus, C, E, and G are often the

notes of choice for long/held notes in a melody. They are also used more often as the
points of change in a contour, for example as the apex of an arch. Harmony can
accentuate the strength of the triad (C, E, and G) by playing all notes of the chord
together in the harmonic line. However, even in the absence of harmony and varying
rhythm, note frequency alone can provide the listener with sufficient context to convey a
sense of key.
Thus, to maintain a sense of musicality and key, musical sequences used as
stimuli should be designed such that change points in the sequence correspond with one
of the notes of the triad (notes that tend to be perceived as more important and stable to a
listener).

Inversions & Complementary Contours

The issue of scale and tonality (or sense of key) makes the design of sequences
for the purpose of comparing and contrasting different contours difficult. Tonality implies
that not all notes have the same “weight” for the perception of structure within a
sequence. Some notes are more important or more central (e.g. C, E and G). This has an
effect on the assessment of the relationship between structure and affect because it makes
it difficult to design a clearly “opposite” or “complementary” sequence.
The most common method of inverting a contour is reversing the order of the
notes. A very simple example is the ascending sequence C-E-G becoming G-E-C.
Although these two sequences use the same notes, they vary on more than just melodic
contour (ascending vs. descending). For example, in the key of C, the “C” is more
important than all other notes. So, C-E-G has the most important note first, while G-E-C
has the most important note last. Thus, these two simple contours vary on both contour
(ascending vs descending) and on the order in which important notes are played (early vs
late). Also, scaling is important. In a diatonic scale, these two contours are perfect
opposites because there is only one note between C and E (D), and one note between E and
G (F). However, in a chromatic scale these sequences are no longer opposites because
one interval contains more notes than the other. Specifically, on a chromatic scale there
are three notes between C and E (C#, D, D#), but only two notes between E and G (F and F#).
Thus, the intervals C-E and E-G are considered equivalent on a diatonic scale, but non-

equivalent on a chromatic scale. That is, reversibility of a sequence is an aspect of the
scaling of notes.
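The dependence of reversibility on scaling can be checked with a small Python sketch. It assumes the key of C major with no accidentals; the dictionaries and function names are illustrative only. A reversed sequence is treated as an exact opposite when it has the same step sizes, in the same order, with the opposite sign.

```python
# Position of each C major note on a diatonic scale (1 to 7)
# and on a chromatic scale (semitones above C).
DIATONIC = {"C": 1, "D": 2, "E": 3, "F": 4, "G": 5, "A": 6, "B": 7}
CHROMATIC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def steps(melody, scale):
    """Signed step sizes between successive notes on the given scale."""
    return [scale[b] - scale[a] for a, b in zip(melody, melody[1:])]

def is_exact_opposite(melody, scale):
    """True if reversing the melody mirrors every step in its original order."""
    forward = steps(melody, scale)
    backward = steps(melody[::-1], scale)
    return backward == [-s for s in forward]

melody = ["C", "E", "G"]
print(steps(melody, DIATONIC), is_exact_opposite(melody, DIATONIC))
print(steps(melody, CHROMATIC), is_exact_opposite(melody, CHROMATIC))
```

On the diatonic scale both steps are the same size, so G-E-C mirrors C-E-G exactly; on the chromatic scale the steps are 4 and 3 semitones, so the reversal changes the order of the step sizes.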
Furthermore, reversing the order of the notes (retrograde) is not the only way of
inverting a contour. Instead, a complementary contour could be designed in which the
order of relative intervals is maintained. For example, a contour that starts at middle
C (designated C4) and rises to the C above it (C5), but skips a note near the end, can be
complemented in two ways. It can be complemented the usual way by reversing the note
order (Figure 1.2-B). It can also be complemented by changing direction, but maintaining
the order of interval changes (Figure 1.2-D). In summary, if (1) the same notes are used
for the “opposite”, or “complement”, then the interval jumps at a given time are not the
same and (2) if the same size intervals are used at each time then the “opposite” has to
use different notes. This effect is demonstrated in Figure 1.2 with each contour written as
a numerical expression of the diatonic interval between notes.

[Figure 1.2: Demonstration of complementary contours. Panels: A. Ascending Melody; B. Reversing Note Order (Retrograde); C. The Same Ascending Melody; D. “Opposite” Using the Same Intervals.]

When the opposite of the ascending melody (A) is composed using the same notes (B), the
descending melody has its largest interval at the start instead of the end. When the
opposite of the same ascending melody (C) is instead expressed using the same interval
arrangement, the descending sequence (D) is forced to use different notes. Thus, a single
melody or sequence of notes can have two “opposite” or “complementary” contours.
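The two kinds of complement can be contrasted in a short Python sketch. The melody used here is a hypothetical stand-in (diatonic scale degrees 1 to 8 in C major, rising from C4 to C5 and skipping the seventh degree); it is not necessarily the melody shown in Figure 1.2, and the function names are illustrative.

```python
def intervals(melody):
    """Signed diatonic steps between successive notes."""
    return [b - a for a, b in zip(melody, melody[1:])]

def retrograde(melody):
    """Complement by reversing the note order (same notes)."""
    return melody[::-1]

def mirror_intervals(melody, start):
    """Complement by keeping the interval order but reversing each direction."""
    result = [start]
    for step in intervals(melody):
        result.append(result[-1] - step)
    return result

ascending = [1, 2, 3, 4, 5, 6, 8]  # hypothetical: C D E F G A C
reversed_notes = retrograde(ascending)
same_intervals = mirror_intervals(ascending, start=8)

print(reversed_notes, intervals(reversed_notes))
print(same_intervals, intervals(same_intervals))
```

The retrograde keeps the original notes but moves the large step to the start, whereas the interval-preserving version keeps the large step at the end but has to use a note (degree 7) that the original melody never used.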

Inversions of the Melodic Arch
Inverting a melodic arch is complicated by the factors mentioned above.
Obviously the rule of reversing the note order does not make sense for a melodic arch.
For example reversing the order of a non-symmetric melodic arch C-G-E gives E-G-C,
but reversing the order of a symmetric arch such as C-E-G-E-C gives back the exact same
contour. In both cases, reversing the order of notes results in an arch of the same orientation
(both begin by rising). In order to compare a regular (convex) arch to an inverted
(concave) arch a different system must be used.
For example, the “opposite” of the arch that starts on middle C (C4) jumps an
octave (C5) and comes back down to C4 is the inverted arch of C5-C4-C5. However, in
addition to differing on contour, these two contours differ on mean pitch height. That is,
the regular arch (C4-C5-C4) has more C4’s and thus would be perceived as being in a
lower range / register than the inverted arch (C5-C4-C5) which has two C5’s.
Additionally, as soon as other notes beside C are used the scaling issue becomes a
problem again. For example the opposite or inversion of C4-G4-C4 could be considered
both C5-G4-C5 and C4-G3-C4. However, not only are both these “opposite” contours in
different ranges (different mean pitch heights), but the intervals in the contours are not
equivalent. That is C4-G4 is a fifth, meaning there are three intervening notes on a
diatonic scale and C5-G4 is only a fourth. This disparity is also true in the chromatic
scale. Therefore, any “opposite” contour for the simple sequence C4-G4-C4 that uses the
same notes will differ from the original in both average pitch height and the
intervals present. All examples presented here were short examples, but longer sequences
have the same problems. This is, perhaps, one of the reasons most research has limited
their contour manipulations to ascending versus descending (cf. Gerardi & Gerken, 1995;
Hevner, 1936b; Scherer & Oshinsky, 1977; Schubert, 2004).
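The differences between an arch and its candidate inversions can be made concrete with a small Python sketch. It assumes standard MIDI note numbers (middle C, C4, is 60) as a chromatic pitch scale; the dictionary and the function name describe are illustrative only.

```python
# Chromatic pitch numbers, assuming standard MIDI numbering (C4 = 60).
MIDI = {"G3": 55, "C4": 60, "G4": 67, "C5": 72}

def describe(names):
    """Return (mean pitch height, signed chromatic steps) for a note sequence."""
    pitches = [MIDI[name] for name in names]
    mean_pitch = sum(pitches) / len(pitches)
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    return mean_pitch, steps

for melody in (["C4", "G4", "C4"], ["C5", "G4", "C5"], ["C4", "G3", "C4"]):
    mean_pitch, steps = describe(melody)
    print("-".join(melody), round(mean_pitch, 2), steps)
```

The regular arch moves by a fifth (7 semitones), while both candidate inversions move by a fourth (5 semitones), and all three sequences have different mean pitch heights.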
Thus, due to the inherent difficulties and issues involved in doing “inversions” of
melodies, especially with arches, the current research will avoid using approximately
“opposite contours” and focus, instead, on careful quantification of the differences
between these contours.

Characterizing Melodic Contour
The earliest methods for describing contour were largely descriptive, and
qualitative. For example, Toch (1948) stated that most music is written as a series of
lower level ascending climaxes that combine to form a single ultimate climax. The
climax is then followed by a descent, such that the overall global contour of an entire
piece tends to follow the shape of an arch (Toch, 1948). Adams (1976), on the other
hand, tried to classify all possible types of contour and then categorize melodies
according to which type they belonged. Adams’ (1976) method considered only the first,
final, highest and lowest note as important in any melody, and thus melodies would be
classified on the basis of only these notes. This approach resulted in 15 possible contour
types, but was a gross simplification of what the music was really doing. Intuitively, the
number of different melodies that have been played and preserved for posterity implies
that a listener pays attention to more than just four notes.
Morris (1987) proposed a more quantitative approach. His method advocated
defining the “contour space” of a given melody. Each note in the melody was assigned a
number based on its relative pitch to all other notes. The lowest pitch would be assigned
the value 0, and the highest pitch would be assigned the value n-1 (where n was the total
number of notes). For example, the sequence C4-E4-G4-F4 would be described as having
the contour space <0 1 3 2>. Different contours could then be compared through Morris’
(1987) comparison matrices or “COM-Matrices”. COM-Matrices for a given contour
would be developed by comparing each note position in a sequence to every other note
position. The COM-Matrix for <0 1 3 2> is shown in Figure 1.3.

COM 0 1 3 2
0 0 + + +
1 - 0 + +
3 - - 0 -
2 - - + 0

Figure 1.3: The COM-Matrix for <0 1 3 2>

In the COM-Matrix, pitches are rated as being higher (+), lower (-) or equal (0) to every
other note, but there is no measure of how much higher or how much lower. Within this

system, contours with the same matrix are considered to have the same contour. For
example the sequence C4-G4-C5-B4 would be considered to have the same contour as
C4-E4-G4-F4. Inversions are considered any contours that have the opposite matrix (all
+’s are –’s and vice versa).
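The construction of a contour space and its COM-Matrix can be sketched in a few lines of Python. This is an illustrative reading of the description above, not Morris’ own notation: pitches are given as MIDI numbers (C4 = 60), ranks are assigned over distinct pitches, and the function names are hypothetical.

```python
def contour_space(pitches):
    """Replace each pitch by its rank among the distinct pitches (lowest = 0)."""
    ranks = {pitch: rank for rank, pitch in enumerate(sorted(set(pitches)))}
    return [ranks[pitch] for pitch in pitches]

def com_matrix(contour):
    """Entry [i][j] is '+', '-' or '0' as note j is higher than, lower than or equal to note i."""
    def sign(difference):
        return "+" if difference > 0 else "-" if difference < 0 else "0"
    return [[sign(b - a) for b in contour] for a in contour]

first = contour_space([60, 64, 67, 65])   # C4-E4-G4-F4
second = contour_space([60, 67, 72, 71])  # C4-G4-C5-B4
print(first, second)
for row in com_matrix(first):
    print(" ".join(row))
```

Both sequences reduce to the same contour space and therefore to the same matrix, even though their interval sizes differ.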
Marvin and Laprade (1987) extended Morris’ (1987) COM-Matrices by
developing other methods for comparing these matrices. For example, Marvin and
Laprade developed the Contour Similarity Function (or CSIM) which compared two
COM-matrices by computing the percentage of positions returning the same sign. They
also developed the Contour Embedding Function (CEMB) for comparing contours of
different sizes. Polansky and Bassein (1992) pointed out that there were a number of
conceptual problems with COM-Matrices. Specifically, using analytical techniques on
COM-Matrices to “recompose” music can result in COM-Matrices representing
impossible contours. That is, if note a is > (higher than) b, and b > c, then it must be true
that a > c. However, it is possible to develop a COM-Matrix where this rule is violated.
In other words, there are more possible COM-Matrices than there are possible melodic
contours using this approach.
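Both ideas in this paragraph can be sketched in Python. The similarity function below follows the description given here (the proportion of matrix positions with the same sign) and, as an assumption, compares only the positions above the main diagonal, since the lower half mirrors the upper half; it is a simplified reading rather than a definitive implementation of CSIM. The transitivity check illustrates the problem raised by Polansky and Bassein. All function names are hypothetical.

```python
def com_matrix(contour):
    """Entry [i][j] is '+', '-' or '0' as note j is higher than, lower than or equal to note i."""
    def sign(difference):
        return "+" if difference > 0 else "-" if difference < 0 else "0"
    return [[sign(b - a) for b in contour] for a in contour]

def csim(contour_a, contour_b):
    """Proportion of above-diagonal COM-Matrix positions that share the same sign."""
    if len(contour_a) != len(contour_b) or len(contour_a) < 2:
        raise ValueError("contours must have the same length (at least 2 notes)")
    matrix_a, matrix_b = com_matrix(contour_a), com_matrix(contour_b)
    n = len(contour_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    matches = sum(matrix_a[i][j] == matrix_b[i][j] for i, j in pairs)
    return matches / len(pairs)

def is_transitive(matrix):
    """False if the matrix claims i < j and j < k without also claiming i < k."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if matrix[i][j] == "+" and matrix[j][k] == "+" and matrix[i][k] != "+":
                    return False
    return True

print(csim([0, 1, 3, 2], [0, 1, 2, 3]))
print(csim([0, 1, 3, 2], [3, 2, 0, 1]))

# A hand-built matrix that no real contour can produce:
# note 1 above note 0, note 2 above note 1, yet note 2 below note 0.
impossible = [["0", "+", "-"], ["-", "0", "+"], ["+", "-", "0"]]
print(is_transitive(com_matrix([0, 1, 3, 2])), is_transitive(impossible))
```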
Around the same time as Morris (1987), Freidman (1985) was developing his own
system, which included constructing “Contour Interval Arrays” that contained
information about the presence of different interval types. However, like Morris (1987),
Freidman’s (1985) system was relative and contained no information about the size of the
actual intervals. Thus, again, the sequences C4-G4-C4 and C4-D4-C4 would be
considered identical by Freidman’s system. This seems intuitively odd since the interval
between C4-G4 is much larger than the interval between C4-D4. Certainly absolute
interval size must be important for music perception (cf. Costa et al., 2000).
Indeed, it is possible to model contour with approaches that take absolute interval
size into account (not just relative interval size). One approach that takes absolute interval
size into account can be referred to as the polynomial method. This method assigns each
note a value based on the pitch of the note, and plots pitch height over time. Then, a
polynomial curve is generated to model the overall contour, and the equation of that
curve can be considered a representation of the actual melodic contour. This method has
the advantage of not only retaining information about absolute interval size, and pitch

height, but also allows for a comparison between mean pitch height and various shapes
within the contour. This method will be discussed in more detail later on. For now it is
important to note that it was the primary modeling method employed in this thesis, and
has also been used successfully in other melodic contour research (cf. Beard, 2003).
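The general idea of the polynomial method can be illustrated with a generic least-squares fit in Python. This sketch is not the procedure used in this thesis, which is described later; the choice of a quadratic, the coding of time as centred note positions, the MIDI pitch numbers and the function name polyfit are all assumptions made for illustration.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...], lowest power first."""
    m = degree + 1
    # Normal equations (X^T X) c = X^T y for the powers 0..degree.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for row in reversed(range(m)):
        rest = sum(a[row][k] * coeffs[k] for k in range(row + 1, m))
        coeffs[row] = (b[row] - rest) / a[row][row]
    return coeffs

times = [-2, -1, 0, 1, 2]            # centred note positions (assumption)
arch = [60, 64, 67, 64, 60]          # C4-E4-G4-E4-C4 as MIDI numbers
ascending = [60, 62, 64, 65, 67]     # C4-D4-E4-F4-G4

print(polyfit(times, arch, 2))
print(polyfit(times, ascending, 2))
```

With time centred on the middle of the sequence, the linear coefficient reflects the overall rise or fall and the sign of the quadratic coefficient distinguishes an arch (negative) from an inverted arch (positive). Unlike the rank-based systems above, the fitted values depend on the actual interval sizes.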

Measuring Emotion or “Affect”

In order to assess the emotional impact of different types of sequences, some
means of assessing that emotion is necessary. As noted in the review, several different
methods have been developed and it would seem that the most common method is to
have participants who are listening to music rate the perceived emotional content using
bipolar scales. A bipolar scale is a scale in which words / adjectives are provided at both
ends and a participant is forced to decide which end of the scale is more appropriate.
Often, bipolar scales come with a degree of magnitude, for example, a 7-point Likert
bipolar scale could look like this:

happy 1 – 2 – 3 – 4 – 5 – 6 – 7 sad

Other options include unipolar scales and checklists. A unipolar scale is similar to
a bipolar scale, except that only one word is being rated. For example, in this case the two
unipolar scales could be:

not happy 1–2–3–4–5–6–7 happy

not sad 1–2–3–4–5–6–7 sad

As seen above, one needs two unipolar scales to capture the same information as a single
bipolar scale.
For checklists, participants are provided with a word list and check off all the
words that apply. In the checklist approach, again two words would be required to
capture the same meaning as a bipolar scale. For example:

A checklist has the advantage of being faster, but the disadvantage of not containing as
much information as unipolar and bipolar scales.
There has been some debate over which type of scale or “reporting style” is the
best. Some researchers claim this is an important debate because the type of measurement
used can greatly influence obtained results (e.g. Gilpin, 1973). Yorke (2001) has made a
case for preferring unipolar scales over bipolar. Specifically, Yorke (2001) claimed that
grammatical antonyms do not necessarily correspond to psychological opposites, and the
“centre” of a bipolar scale may not be interpreted consistently across participants.
Schimmack, Bockenholt, and Reisenzein (2002), on the other hand, recently suggested
that different types of reporting styles (unipolar vs. bipolar) had a negligible effect on
affect ratings. That is, participants reporting in both bipolar and unipolar formats
appeared to give approximately the same affect ratings. In the end, however, Schimmack
et al. (2002) sided with Yorke (2001) and stated that, because bipolar scales can be
confusing, unipolar scales may be preferable over bipolar.
Despite claims that unipolar scales may work better, the vast majority of music
research has employed the use of bipolar scales. For example, most of the studies
previously cited in this thesis have employed bipolar scales (e.g. Gagnon & Peretz,
2003; Gerardi & Gerken, 1995; Scherer & Oshinsky, 1977; Schubert, 2004; Webster &
Weir, 2000). Studies of the physiological effects of music have employed bipolar scales
(e.g. Blood, Zatorre, Bermudez, & Evans, 1999). Research into the affect of musical
intervals and chords has been assessed by bipolar scales (e.g. Costa et al., 2000; Costa et
al., 2004). Research into the relationship between music and video tends to employ
bipolar scales (e.g. Iwamiya, 1994; Lipscomb & Kendall, 1994; Marshall & Cohen,
1988). Even studies of vocal expression, which may be very similar to musical
expression, have employed the use of bipolar scales to measure affect (cf. Scherer, 1986).
Certainly, the literature suggests that bipolar scales are more commonly used in this
type of research -- a trend that has been largely inspired by the Semantic Differential
work of Osgood, Suci, and Tannenbaum (1957). Thus, the choice between unipolar and
bipolar scales is quite complicated.
It is worth noting, however, that other approaches to measuring affect do exist.
The scale systems (checklist, unipolar, or bipolar) generally involve participants waiting

until the end of the presentation of a stimulus (melody) to respond. But, it is also possible
to take continuous measures of affect from participants. This can range from something as
complicated as recording EEG (electroencephalogram) or MRI (magnetic resonance imaging)
data to something as simple as measuring the galvanic skin response, heart rate or respiration rate during an
experiment (e.g. Blood et al., 1999; Gomez & Danuser, 2004; Iwanaga, Kobayashi, &
Kawasaki, 2005; Kawano, 2004; Schmidt, Trainor, & Santesso, 2003). However, the
results of such physiological studies have proven to be difficult to interpret, and do not
capture the more subtle flavors of emotion (Iwanaga et al., 2005; Swanwick, 1973).
Emery Schubert (2004), on the other hand, has proposed a method for measuring
perceived emotion /affect in a continuous non-physiological way. His novel approach
involved defining a two-dimensional space on a computer screen where the up/down
dimension represented arousing/boring and the left/right dimension represented the
sad/happy dimension. Participants were then required to move the cursor around the
dimension space on a screen while listening to musical compositions. For example,
participants moved the cursor towards the right when they began to perceive the
expression of “happiness”, and then returned the cursor to the center of the emotional
space as the emotion dissipated, etc.
Schubert’s (2004) method had the potential advantage of capturing the influence
of subtle affective responses to a melody. Unfortunately, it had the disadvantage of only
being able to capture two emotional dimensions simultaneously, and requiring
participants to translate emotions into movement of the hand (which may not be a direct
translation). Certainly, Schubert’s (2004) design was a very interesting one, and he
employed creative methods for analyzing his data, but his approach was not used in this
thesis. This decision was made, mainly, because of the interest in exploring more than
just two bipolar affective dimensions of music.
For our design, both unipolar and bipolar affective scales were considered in pilot
work. However, with the particular set of musical stimuli we were using, bipolar scales
seemed more applicable. Specifically a forced choice between two words (bipolar) worked
better than the decision to apply one word or not (unipolar) with such short sequences.
Thus, the final study began with bipolar affective scales since not only did they fit our
experiment better, but they have been used by other researchers in this field. Words for

the affective scales were originally chosen to match those that had been used in previous
research (e.g. Blood et al., 1999; Costa et al., 2000; Costa et al., 2004; Iwamiya, 1994;
Lipscomb & Kendall, 1994). The words were then modified to fit the context of this
experiment. For example, with the bipolar scale “happy – sad” the option “sad” was
never selected in pilot work, since all the sequences were written in the major mode (C
major) at a moderately fast tempo making them sound happy. Thus, this scale was
changed to “contentment – joy” since it was more appropriate for the musical stimuli
being used.
As a result of modifying the scales to fit the context of the experiment, some of the
final scales could be considered not “true” bipolar scales. That is, the ends of the scales
may not have been true opposites. As an example, the scale “contentment-joy” worked
very well in this experiment but the words “content” and “joy” are not considered
opposites the same way that “happy” and “sad” are considered opposites. Thus, the scale
“contentment-joy” may have been better described as a unipolar scale from a little
happiness (contentment) to a lot of happiness (joy). For this reason, the term “bipolar”
may not be appropriate for some of our scales, and instead all our scales were referred to
simply as “affect scales”.

The Current Study

The current study set out to demonstrate a clear relationship between melodic
contour and affect (emotion). It also explored the possibility that non-linear contours such
as the melodic arch convey emotional content. The general design closely followed the
methodology of both Scherer and Oshinsky (1977) and Gerardi and Gerken (1995).
Specifically, this study had participants rate a number of short melodic sequences on a
number of affect scales. Sixteen musical sequences were composed for this research, and
six affect scales were selected to measure affect. The initial intention was to run a single
block of participants, but a second group was added to the design. This second group
rated the same 16 musical sequences using two of the same affect scales and four new
affect scales. Thus, the final design resulted in the use of 10 different affect scales.
In composing the music for this study, a large number of constraints needed to be
taken into consideration. First, the musical sequences were designed so that they differed
essentially only in melodic contour. That is, all sequences were designed to be in the same
key (key of C), in the same mode (major), and to begin and end on the same note (C).
Beginning and ending on the same note was important, as it gave each sequence a sense
of completion, and helped reinforce the sense of key. Second, sequences were designed
such that intervals between the notes were never very large (never more than a fifth), and
all important changes happened upon a note of the chord (C, E, or G). This helped the
sequences sound more musical (the former), and helped maintain a constant sense of key
by emphasizing the notes of the chord (the latter, cf. Krumhansl, 2000). All sequences
were contained between two octaves with the C above middle C (C5, 524 Hz) as the
highest note ever played, and the C below middle C (C3, 131 Hz) as the lowest note ever
played. Third, harmony was avoided (the sequences were monophonic) and all notes were
equitemporal, with one exception. This exception was the final note, which was held an extra
beat in all sequences to help reinforce the perception of completion, and the sense of a C
major key (since all sequences ended on C). These 16 sequences were thus described as
having 8 notes or 7 attack points (since the last note was held). Imposing all these
constraints on all the musical stimuli helped minimize the number of possible confounds
in later analyses.
From the pilot work it became evident that mean pitch height was playing a large role
in responses. Here, “mean pitch height” refers to the register in which sequences
were played: high register (C4 to C5) versus low register (C3 to C4). On one
hand, the results of pilot testing were not surprising, as Costa et al. (2000), Hevner
(1936a), Scherer and Oshinsky (1977) and others had long since documented the
importance of pitch height. On the other hand, the effect of pitch height appeared to be
much larger than expected for sequences that basically only spanned two octaves (C3 to
C5). Thus, the final set of sequences was chosen to be balanced, such that an equal
number of notes were above and below middle C (C4) within the set of all sequences. In
this manner, sequences using notes above C4 were no more common than those using
notes below C4.
The sixteen sequences were designed to represent a number of contours beyond
the simple linear (ascending / descending) contours. Globally, four different types of
contours were defined. These were contours that: (1) never changed direction: regular
ascending & descending, (2) changed direction once: arches & inverted arches, (3)
changed direction twice: N-shape & inverted N contours, and (4) changed direction three
times: M & W-shape contours. Figure 1.4 shows an example of each of these types of
contours with 8 notes (or 7 attack points).

A. No change: Ascending/Descending B. One change: Arch / Inverted Arch

C. Two changes: N / Inverted N D. Three changes: M / W -shapes

Figure 1.4: Examples of the different types of contours

All examples in Figure 1.4 show only the rising-first version of each contour. However,
both rising-first and falling-first versions of each contour were used in this study (see
For the analysis, musical contour was modeled using higher order polynomial
equations (cf. Beard, 2003). Using this technique, each contour could be approximated by
a 4th order polynomial equation of the form:
$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$$
The equation models pitch height ($y$) as a function of time ($x$), where each of the
coefficients ($a_0$ to $a_4$) yielded a different piece of information about the shape of the
contour. For example, $a_0$ contained information about the mean pitch height of a
sequence, $a_1$ about the extent of linear ascending or descending, and $a_2$ about the extent of
an arch or inverted arch. Thus, polynomials offered a convenient option for encoding the
various elements of a melodic shape.
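
As an illustration only, the polynomial model of contour can be sketched in Python. This is a minimal least-squares sketch under stated assumptions, not the fitting procedure actually used in this thesis (cf. Beard, 2003): pitch is coded as MIDI note numbers (middle C = 60), time is coded as note position centred on zero, and the function name fit_polynomial and the example arch are hypothetical.

```python
def fit_polynomial(pitches, degree=4):
    """Least-squares fit of y = a0 + a1*x + ... + a_degree*x**degree.

    Assumption: x is the note position centred on zero (-3..3 for seven
    notes). Returns the coefficients [a0, a1, ..., a_degree].
    """
    n = len(pitches)
    xs = [i - (n - 1) / 2 for i in range(n)]
    m = degree + 1
    # Build the normal equations (X^T X) a = X^T y as an augmented matrix.
    aug = []
    for r in range(m):
        row = [sum(x ** (r + c) for x in xs) for c in range(m)]
        row.append(sum((x ** r) * y for x, y in zip(xs, pitches)))
        aug.append(row)
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][m] for r in range(m)]


# Hypothetical symmetric arch in MIDI note numbers (C4 = 60).
arch = [60, 64, 67, 72, 67, 64, 60]
a0, a1, a2, a3, a4 = fit_polynomial(arch)
print(f"a0={a0:.2f} a1={a1:.2f} a2={a2:.2f} a3={a3:.2f} a4={a4:.2f}")
```

With time centred in this way, a0 is the fitted pitch at the temporal midpoint of the sequence, a positive a1 roughly indicates an overall rise, and a negative a2 indicates an arch rather than an inverted arch.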
Finally, our study had a number of expectations that were not strong enough to
call "formal hypotheses" (i.e., the work is somewhat exploratory). First, it was expected
that this design would be powerful enough to show a relationship between melodic
contour and affective responses (emotion). In particular, one would expect that some
contour shapes would be associated with some affects while other contour shapes would
be associated with other affects. Based on prior research, one could also hypothesize that
ascending contours would tend to elicit positive responses, and descending contours
would tend to elicit negative responses (Gerardi & Gerken, 1995; Schubert, 2004),
although all sequences, as noted, were set in the major mode and would therefore elicit
some "positive" affect. Also, since the melodic arch is such a common contour in music
(cf. Huron, 1996), it was expected that it might play an important role in conveying
affective content.


The final design included two groups of university students exposed to different
affect scales but the same 16 musical sequences. Each participant rated each musical
sequence on each affect scale twice in a 16 (sequences) x 6 (affect scales) x 2 (times)
design. This task was called the “Affect Ratings” task, and was interspersed with another
task called the “Tonality Assessment”.
The “Tonality Assessment” was similar in design to the “Affect Ratings” task but
had the purpose of assessing the internal representation of tonality for each participant.
This was collected as a post-manipulation check, but was not expected to influence Affect
Ratings in any significant way (cf. Frankland, 1998; Frankland & Cohen, 1990;
Krumhansl, 1990). Additional demographic and musical background information was
also collected.

All participants were recruited from the Dalhousie University Psychology subject
pool. All participants were students in a first year psychology course, and received course
credit for their participation in this study. One hundred students took part in this
study. The mean age of the participants was 19.21 ± 2.11 (range: 17 – 28) with a skew of
2.03 ± .24 (the error is the standard error of skew). The majority were 18 years of age
(freshmen), 75% of participants were female, and 88% reported being right-handed. Only
one participant reported being ambidextrous. Additionally, 90% of participants reported
learning English as their first language. None of the participants reported having
perfect/absolute pitch or evidenced any auditory problems.
The majority of participants reported exposure to a music program in elementary
school that involved singing and music theory. In particular, 85% of participants reported
doing some amount of singing, 65% reported learning the piano, and 62% reported
playing in some kind of band (usually school band). Additionally, 42% reported playing
or at least trying to learn the guitar. The average participant reported playing at least 3
(median & mode) ± 1 instruments (when voice was counted). In fact, all participants
reported having at least some exposure to music, music education, and at least one
instrument. The average participant also reported having studied their main instrument
for 5.4 years (mean) to 6 years (mode) ± 3 years, with a skew of .61 (range: 1 to 14).
Participants were asked approximately how many hours a week they thought they
spent listening to music both by choice and by default (e.g. at work or played by
roommate). By choice, participants reported listening to an average of 18.04 ± 15.41
(range: 1-83) hours a week of music with a skew of 1.76 ± .24. By default, participants
reported listening to an average of 5.23 ± 10.15 (range: 0-60) hours a week of music with
a skew of 3.32 ± .24.
All participants belonged to one of two groups, with 50 participants in each
group. Chi-square tests and independent two-group t-tests indicated that these groups did not
differ significantly in age (t (98) = .613, p = .541), gender distribution (χ²(1, N = 100) =
1.33, p > .05), in the number of instruments they reported playing (t (98) = -.761, p =
.448), or in the number of years they reported playing their main instrument (t (98) =
-1.247, p = .215). They also did not differ in the amount that they reported listening to
music both by choice (t (98) = -.055, p = .956), and by default (t (98) = .265, p = .792). In
fact, there appeared to be no significant differences in the demographic composition of
the two groups.

Each participant was brought into the sound attenuating room of the lab, and
seated comfortably in front of the computer. After informed consent was obtained the
experiment began with computer-based tasks, followed by background questionnaires
and debriefing. The computerized part of the experiment presented and recorded both the
main task (Affect Ratings) and the Tonality Assessment, in a six stage design (see Figure
Stages 1, 3, and 5 were designed to assess the internal representation of tonality of
the participant using a modified probe tone task. This served as a useful check on the
musical background of each participant. Each tonality stage was essentially the same,
with Stage 1 serving as practice for Stages 3 and 5. For that reason, Stage 1 actually
contained fewer trials.
Stages 2, 4 and 6 were the stages of interest, and required participants to provide
Affect Ratings (ratings on affect scales) for different musical sequences. Stages 2, 4 and 6
were essentially the same, with Stage 2 serving as practice for Stages 4 and 6 (Stage 2 had
fewer trials).

Stage 1: Tonality Assessment (practice)
Stage 2: Affect Ratings (practice)
Stage 3: Tonality Assessment
Stage 4: Affect Ratings
Break: Musical Interlude
Stage 5: Tonality Assessment
Stage 6: Affect Ratings

Figure 2.1: The overview of the computerized six stage design. Note: Stages 1 & 2
served as practice for the subsequent Stages 3 to 6. Also, there was a short break between
Stages 4 & 5 in which a piece of classical music was played.

Detailed instructions were presented to the participant at the beginning of each
stage, at a pace determined by the subject, via the computer. The instructions for Stage 1
were repeated at the beginning of Stages 3 and 5. Likewise, the instructions for Stage 2
were repeated at the beginning of Stages 4 and 6. Abbreviated instructions remained on
screen during all trials at all stages. The experimenter remained with the subject during
practice stages (Stages 1 and 2) to ensure, through observation of the subject, that the
instructions were understood. Additional instruction and clarification was provided if
needed. A break was included between Stages 4 and 5 to help alleviate any boredom the
participant might experience.
After the computerized part of the experiment (which is described in more detail
below), paper and pencil questionnaires were used to collect basic demographic
information, as well as information about musical preferences and experience. Thereafter,
participants were asked open-ended questions about what they thought they were rating
during the main Affect Rating task. Participants’ responses to this open-ended question
were recorded and considered in the interpretation of the results. Finally, participants
were given the debriefing form, and a chance to ask any remaining questions they
had. The instructions provided for each stage, as well as the questionnaires, can be found
in the Appendices.

Tonality Assessment
Each participant performed two identical Tonality Assessments, one in Stage 3
and another in Stage 5. Stage 1 served as practice for these assessments. In each of these, a
modified probe tone task was used to determine the tonality profile of each participant
(Frankland, 1998; Frankland & Cohen, 1990; Krumhansl, 1990). Participants were
presented with an ascending or descending scale in the key of C major, followed by a
pause, and finally followed by a single probe tone drawn from the octave bounded by the
scale (see Figure 2.2). The scale defined a context and the probe tones consisted of any of
the 13 chromatic tones within that context. The participants rated the fit of the probe
tone to the context on a three-point scale (poor, indeterminate/uncertain, and good).
In a block of 26 trials (Stage 3 or 5), each of the 13 chromatic tones was presented
once as a probe tone for the ascending scale and each was presented once as the probe
tone for the descending scale. Presentation of the ascending and descending scales was
randomized uniquely within a stage separately for each participant. Presentation of the
probe tones was also randomized for each participant. Stages 3 and 5 each contained a
full block of 26 trials, while Stage 1 presented only three of the 13 possible probe tones.
Responses were indicated by the use of the 1, 2 or 3 keys of the keypad or by the use of
the left, center and down arrow keys. For left-handed participants the keys 1, 2 and 3 at
the top row of the standard QWERTY keyboard were also available.

Tonality Assessment: Stages 1, 3, & 5 (Probe-Tone Task)
Key-defining context (ascending & descending major scales), then probe tone (each of the 13 chromatic notes), then participant rates fit of probe to context on a 3-point scale.

Affect Ratings: Stages 2, 4, & 6 (Rating Affect of Music)
Affect scale (words) appears, then musical sequence plays, then participant uses affect scale to rate sequence on a 3-point scale.

Figure 2.2: Basic design within each block. Note that the Tonality Assessment and Affect
Rating components are relatively independent, but both require responses on a 3-point
scale. In the Tonality Assessment participants rated just the Probe-Tone, whereas in the
Affect Ratings participants rated the entire sequence.

Affect Ratings
Each participant performed two identical sets of Affect Ratings, one in Stage 4
and another in Stage 6. Stage 2 served as practice for these assessments. In each of these,
participants were presented with an eight-note sequence and asked to rate the affect that
the music was “trying to convey” using an affect scale. The phrase “trying to convey”
was important because previous research has shown that participants are more likely to
agree on “emotion expressed [or conveyed] by the music than on the emotion evoked by
the music” (Schubert, 2004, p. 566; cf. Hampton, 1945; Swanwick, 1973). An example
of the instructions is presented in Figure 2.3.

Figure 2.3: General instructions for Affect Ratings

In a block of 96 trials (Stages 4 and 6) participants were exposed to 16 musical
sequences paired with 6 affect scales. Each trial began with the presentation of one of six
affect scales. Each participant read the affect scale (words), and then initiated
presentation of the musical sequence. The participant then rated the affect of the sequence
using a three-point scale: a one (1) for the first word, a three (3) for the alternative word, or a
two (2) if a delineation was not possible. Stages 4 and 6 each contained a full block of 96
trials, while Stage 2 presented only one affect scale paired with the 16 sequences. The
affect scale used in practice was different from the six used in Stages 4 and 6.
As in the Tonality Assessments, responses were indicated by the use of the 1, 2 or
3 keys of the keypad, or by the use of the left, center and down arrow keys. For left-handed
participants the keys 1, 2 and 3 at the top row of the standard QWERTY keyboard were
also available.
A 3-point scale was chosen because (a) it allowed participants to respond without
having to move their hand, (b) it simplified the decision task required from each
participant by requiring no decisions of magnitude, and (c) the Tonality Assessment also
used a 3-point scale. The task was further simplified by having participants rate each
sequence on only one affect scale at a time. This type of design has been successfully
employed by Frankland and Cohen (1996; 1998) in the past.
Each affect scale was run as a full block. In other words, all 16 musical sequences
were presented with the first affect scale. Then all 16 sequences were presented with the
second affect scale, et cetera. Pilot testing indicated that this allowed a better flow for the
experiment, as participants would not need to adjust their criteria on every single trial.

For each affect scale, the 16 musical sequences were presented in a random order. The
order of the affect scales was randomized for each participant, and was randomized
separately within each Stage.
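
The blocked randomization described above can be sketched as follows. This is an illustrative sketch, not the original Turbo C/C++ presentation program: the function name affect_rating_block is hypothetical, the scale labels are the Group 1 abbreviations from this thesis, and Python's random module stands in for whatever random number generator was actually used.

```python
import random


def affect_rating_block(scales, n_sequences=16, rng=None):
    """Return one block of (scale, sequence_number) trials.

    The scale order is shuffled for the participant; within each scale
    the sequences are shuffled independently, so every scale runs as a
    contiguous sub-block of n_sequences trials.
    """
    rng = rng or random.Random()
    scale_order = list(scales)
    rng.shuffle(scale_order)
    trials = []
    for scale in scale_order:
        sequence_order = list(range(1, n_sequences + 1))
        rng.shuffle(sequence_order)
        trials.extend((scale, seq) for seq in sequence_order)
    return trials


GROUP1_SCALES = ["cont-joy", "hes-conf", "exci-bore",
                 "pens-play", "irr-calm", "del-str"]
# A fresh call per Stage gives a separate randomization for Stages 4 and 6.
stage4 = affect_rating_block(GROUP1_SCALES, rng=random.Random(4))
stage6 = affect_rating_block(GROUP1_SCALES, rng=random.Random(6))
print(len(stage4), len(stage6))
```

Keeping each scale in a contiguous sub-block mirrors the pilot finding that participants should not have to adjust their criteria on every trial.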
Participants were instructed to wait until the entire musical sequence was finished
before providing a rating, and in fact, data collection was designed such that ratings could
not be entered (recorded) until the end of the sequence. Feedback was provided to inform
participants that the response had been recorded. Participants were encouraged to make
quick responses. The trial would end automatically 4000 ms after the music stopped
playing. This, however, gave participants more than adequate time to respond, in most cases.

Apparatus and Stimuli

All tones within the entire experiment were created using the internal MIDI driver
of a Creative Labs Sound Blaster 16, housed within an IBM AT (486, 50MHz)
compatible computer. All tones were created using the default instrument 0, mode 0,
corresponding to the sound of an acoustic piano. The same computer provided instruction
on a monitor and recorded responses. Programs for the presentation of stimuli and
recording were written in-house using Borland’s Turbo C/C++, Version 3.0, aided by the
Creative Labs Sound Blaster Developer’s Kit, Version 2.0.
Tones were presented binaurally through a pair of Sony Studio Monitor MDR-V600
headphones connected directly to the audio output of the Sound Blaster 16 at a
volume level considered comfortable by the subject. The monitor and keyboard were
housed within an Industrial Acoustics single-walled sound-attenuating room, with the
main computer (case) external to the sound-attenuating room in order to minimize noise.
For all stages of the experiment, participants were seated comfortably in front of
an IBM style computer in an Industrial Acoustics single-walled sound-attenuating room.
The entire experiment took less than 1 hour to complete, and on average ran 45 minutes.

Tonality Assessment
For the Tonality Assessments (Stages 1, 3 and 5) the basic stimuli consisted of 13
tones representing the equal-tempered chromatic scale within the octave C3 to C4
(261.63 to 523.25 Hz). The context consisted of eight notes creating an ascending or
descending C major scale, with each note having a duration of 250 ms (a tempo of
approximately 150mm). Probe tones consisted of each of the 13 chromatic tones, also
with a duration of 250 ms. All thirteen tones were used in Stages 3 and 5, but in the
practice, only a subset of three tones was used (Figure 2.4). In addition, there was a 400
ms rest (gap) between the final note of the context and the probe tone. All probe tones
were presented at the same intensity.

Affect Ratings
For the Affect Ratings, participants were presented with an affect scale, and then
a musical sequence consisting of seven notes. Affect scales were chosen to be
comparable to other research in this area (e.g. Blood et al., 1999; Costa et al., 2000;
Iwamiya, 1994; Lipscomb & Kendall, 1994; Scherer, 1986). Affect scales were modified
to be appropriate for this experimental design. For example, the happy-sad dimension
was modified to be contentment-joy to reflect the fact that all stimuli were written in the
major mode. The list of affect scales used in this study can be found in Table 2.1. Note
that Groups 1 and 2 received four different affect scales, and two common affect scales.
The two repeated affect scales allowed for the assessment of between-groups reliability.

Table 2.1: Affect scales used for Affect Ratings. Scales 1 and 2 (shaded in the original)
were rated by both groups.

Scale      Group 1                          Group 2
practice   energy 1 2 3 tranquility         excitement 1 2 3 boredom
1          contentment 1 2 3 joy            contentment 1 2 3 joy
2          hesitation 1 2 3 confidence      hesitation 1 2 3 confidence
3          excitement 1 2 3 boredom         energy 1 2 3 tranquility
4          pensiveness 1 2 3 playfulness    questioning 1 2 3 answering
5          irritation 1 2 3 calmness        surprise 1 2 3 expectation
6          delicacy 1 2 3 strength          passivity 1 2 3 aggression

key-defining context probe-tone


Figure 2.4: The complete set of stimuli used in Stages 3 and 5, for the assessment of
tonality. The circled portion indicates the subset of probes used during practice (Stage 1).

All participants were exposed to the same musical stimuli. The sixteen sequences
used were designed to vary in melodic contour, and were classified according to their
overall shape or contour (Figure 2.5) and mean pitch height (register). Here contour was
categorized by the number of times the sequence changed in direction. For example,
linear sequences were described as sequences that never changed direction, and arches as
sequences that changed direction once. Contours that changed in direction twice or three
times were also used in this study.
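
The classification by number of direction changes can be sketched as follows. This is an illustrative sketch rather than the procedure used to compose the stimuli: the helper names (to_midi, direction_changes) are hypothetical, note names follow the convention middle C = C4 = MIDI 60, only natural notes are handled, and repeated notes are assumed not to count as a change of direction.

```python
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
CONTOUR_TYPE = {0: "linear", 1: "arch", 2: "N-shape", 3: "M/W-shape"}


def to_midi(note):
    """Convert a natural note name such as 'C4' to a MIDI number (C4 = 60)."""
    return 12 * (int(note[1:]) + 1) + PITCH_CLASS[note[0]]


def direction_changes(notes):
    """Count reversals of melodic direction, ignoring repeated notes."""
    pitches = [to_midi(n) for n in notes.split()]
    steps = [b - a for a, b in zip(pitches, pitches[1:]) if b != a]
    return sum(1 for s, t in zip(steps, steps[1:]) if (s > 0) != (t > 0))


# Four of the sixteen sequences, written out as note names.
examples = {
    "1:Lau": "C4 E4 G4 G4 A4 B4 C5",
    "5:Rau": "C4 G4 C5 G4 E4 D4 C4",
    "9:Na": "C4 E4 D4 C4 A3 G3 C4",
    "11:Mu": "C4 G4 C4 E4 D4 D4 C4",
}
for label, notes in examples.items():
    print(label, CONTOUR_TYPE[direction_changes(notes)])
```

Note that the count alone does not separate rising-first from falling-first versions of a contour; the sign of the first non-zero step would be needed for that.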
Two sequences were classified as “ascending”, defined as sequences that were
monotonically increasing in pitch (each subsequent note was equal to or higher in pitch).
One ascending sequence was placed in the high pitch range (high register) and the other
in the low pitch range (low register). Similarly, two sequences were classified as
“descending”, defined as sequences that were monotonically decreasing in pitch (each
subsequent note was equal to or lower in pitch). One was placed in the high pitch range
and the other in the low. Two sequences were classified as “arches”, defined as sequences
that increased and then decreased (one change in direction), ultimately returning to the
starting note. One of the arches was placed in a high pitch range and the other was
placed in a low pitch range. Two sequences were classified as “inverted-arches”, defined
as sequences that decreased and then increased, returning to the starting value. One of the
inverted arches was placed in a high pitch range and the other was placed in a low pitch
range. Like the ascending and descending sequences, arches could be defined by type
(regular or inverted) and range (high or low), resulting in four arches.
“N-shaped” sequences were defined as sequences that had two changes in
direction, and would therefore spend part of the time above C, and part below. Since it
was not desired that sequences extend beyond the two octave range, only two N-shaped
sequences were used centered at C4 (middle C) with one rising first and the other falling first.
“M-shaped” and “W-shaped” sequences were designed to have three changes in
direction. There were a number of ways these could be designed, but the final set was
divided basically by range with one of each in the high range, one of each in the
mid-range, and one of each in the low range (Figure 2.5).

The Sixteen Musical Sequences

Arch & Inverted Arch


Figure 2.5: The musical stimuli used in Stages 2, 4 and 6, for the Affect Ratings. Some
melodies had no change in direction (linear: 1-4), some had one change in direction
(arches: 5-8), some had two changes in direction (N-shape: 9 & 10), and some had three
changes in direction (M & W-shaped: 11-16). Notice that not all sequences were written
in the treble clef; some were in the bass clef (i.e. sequences 3, 4, 7, 8, 13, and 16).

For all sequences, the timing of the “points of change” was varied
so that sequences were not entirely predictable. To help maintain a sense of tonality, all
sequences started and ended on C (the tonic) and all directional changes occurred on a
note of the tonic triad (C, E, or G).
All sequences were designed in consultation with M. Curry (BA in music theory)
and B. Frankland, PhD (researcher in music cognition) with the desire to maintain a sense
of musicality (i.e., the rules of composition) and all were pre-tested in pilot work.
The final set of 16 musical sequences spanned two octaves from C3 to C5 (261.63
to 1046.50 Hz). All sequences were played at the same volume, at the same tempo, and
on the same instrument. Each note was played for a duration of 300 ms with the last note
held an extra beat (600 ms). This resulted in a moderately paced tempo (cf. Krumhansl,
2000). Note that all sequences began and ended on a C, and were designed to fit the
tonality of C major. In essence, the sequences were designed so that they differed only in
the contour (shape) of the melody and in the range of notes used.

Annotated Results

This section presents the analyses in the order in which they were conducted. It is
divided into five main sections: (I) Reliability of Measures, (II) Descriptives, (III)
Modeling, (IV) Summary, and (V) Additional Analyses. First, the reliability of measures
analyses were conducted to ensure that the measures in this study were working properly,
and to validate collapsing (averaging) the data for subsequent analyses. Second, basic
descriptive statistics were generated and profile charts were created to get a rough idea of
the differences in response patterns for different musical sequences. Third, the data was
modeled to determine which aspects of the music were most important for predicting
different responses on different affect scales. Fourth, the overall results of the modeling
were summarized. Finally, additional analyses and factors that were considered are
presented in the final section. Results from the Tonality Assessment, for example, were
placed in this section since tonality profiles were not expected to play a large role in
Affect Ratings.
Throughout this section abbreviations for the measures were used. This was done
to reduce the space required to present the sequences and affect scales. Tables 3.1 and
3.2 contain the legend of the abbreviations used for the affect scales and musical
sequences in this section.

Table 3.1: Legend of abbreviations used for affect scales (words).

Group 1, Affect Scales (words)    Group 1 (Abbreviations)

contentment – joy                 cont-joy1
hesitation – confidence           hes-conf1
pensiveness – playfulness         pens-play
delicacy – strength               del-str
irritation – calmness             irr-calm
excitement – boredom              exci-bore

Group 2, Affect Scales (words)    Group 2 (Abbreviations)

contentment – joy                 cont-joy2
hesitation – confidence           hes-conf2
passivity – aggression            pass-agr
questioning – answering           ques-ans
surprise – expectation            sur-exp
energy – tranquility              ener-tran

Table 3.2: Legend of abbreviations used for musical sequences.

# Sequence Abbrev. Legend

1 C4 E4 G4 G4 A4 B4 C5 1:Lau (L)inear, (a)scend., (u)pper range
2 C5 A4 G4 F4 E4 D4 C4 2:Ldu (L)inear, (d)escend., (u)pper range
3 C3 D3 E3 F3 G3 B3 C4 3:Lal (L)inear, (a)scend., (l)ower range
4 C4 A3 G3 F3 E3 D3 C3 4:Ldl (L)inear, (d)escend., (l)ower range
5 C4 G4 C5 G4 E4 D4 C4 5:Rau A(R)ch, (a)scend., (u)pper range
6 C5 B4 A4 G4 E4 G4 C5 6:Rdu A(R)ch, (d)escend., (u)pper range
7 C3 E3 G3 F3 E3 D3 C3 7:Ral A(R)ch, (a)scend., (l)ower range
8 C4 G3 E3 C3 E3 G3 C4 8:Rdl A(R)ch, (d)escend., (l)ower range
9 C4 E4 D4 C4 A3 G3 C4 9:Na (N)-shape, (a)scend.
10 C4 G3 E3 C4 G4 E4 C4 10:Nd (N)-shape, (d)escend.
11 C4 G4 C4 E4 D4 D4 C4 11:Mu (M)-shape, (u)pper range
12 C4 E4 C4 G3 C4 E4 C4 12:Mm (M)-shape, (m)iddle range
13 C3 G3 E3 D3 C3 E3 C3 13:Ml (M)-shape, (l)ower range
14 C4 A4 G4 C4 G4 B4 C4 14:Wu (W)-shape, (u)pper range
15 C4 G3 E4 C4 G3 B3 C4 15:Wm (W)-shape, (m)iddle range
16 C4 A3 G3 B3 C4 G3 C4 16:Wl (W)-shape, (l)ower range

I. Reliability of Measures

This was exploratory research with never-before-tested musical sequences, and
relatively untested new affect scales. Thus, assessment of the reliability of these measures
was important. Two reliability checks were built into our design: a within-subjects
reliability test and a between-groups reliability test. Both these checks showed the
measures to be reliable.

Within-Subjects Reliability
The repetition of the Affect Ratings (Stages 4 and 6) served as a within-subjects
reliability check. In other words, participants were expected to give the same response to
the same sequence regardless of the stage or the randomly-different context. Stage 4
could be considered the first time a participant saw a particular word/music combination,
and Stage 6 would be the second time. Thus, a simple paired t-test analysis could be used
to determine if participants responded significantly differently, on average, between Time
1 (Stage 4) and Time 2 (Stage 6). This paired t-test indicated that participants did not, on
average, significantly change their response pattern between Time 1 and Time 2 (Group
1: t (4742) = 1.634, p = .102; Group 2: t (4729) = .850, p = .395). Responses between
Time 1 and Time 2 were also significantly correlated (Group 1: r = .502, p < .001; Group
2: r = .409, p < .001).
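
For readers unfamiliar with the test, the paired t statistic used in this check can be sketched as follows. This is a minimal illustration with made-up ratings, not the thesis data or the statistical package actually used; the function name paired_t is hypothetical. Each pair is assumed to be one participant's Time 1 and Time 2 rating of the same scale-sequence combination.

```python
import math


def paired_t(time1, time2):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [a - b for a, b in zip(time1, time2)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical 3-point ratings from eight scale-sequence pairings.
time1 = [1, 2, 3, 3, 2, 1, 3, 2]
time2 = [1, 3, 3, 2, 2, 1, 3, 3]
t, df = paired_t(time1, time2)
print(f"t({df}) = {t:.3f}")
```

A t value near zero, as in this toy example, corresponds to the pattern reported above: no systematic shift in ratings from Time 1 to Time 2.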
A further check was performed to see if any individual musical sequences or
words were consistently unreliable. Table 3.3 contains the 16 x 12 = 192 p-values obtained from
this analysis.

Table 3.3: Significance (p-values) of the change in responding from Time 1 to Time 2 for
each affect scale - sequence combination.

p-values by affect scale (words)

Musical sequence cont-joy1 cont-joy2 hes-conf1 hes-conf2 pens-play pass-agr ques-ans del-str irr-calm sur-exp ener-tran exci-bore
1:Lau .371 .569 .308 .595 .013 .755 .132 .011 1.00 .129 .096 1.00
2:Ldu .533 .792 .511 .878 .077 .644 .417 .172 1.00 .685 .332 .743
3:Lal .850 .533 .855 .679 .776 .679 .462 .241 .652 .351 .041 .077
4:Ldl .322 1.00 .221 1.00 .598 .012 .498 .498 .455 .302 .518 1.00
5:Rau .360 .561 1.00 .871 .086 .151 1.00 .429 1.00 .542 .451 .358
6:Rdu .830 .377 .471 .471 .472 .005 .125 1.00 .607 .267 .252 .083
7:Ral 1.00 .533 .699 .070 .110 .513 .358 .261 .642 .821 .473 .710
8:Rdl .728 .749 .124 .890 .298 .197 .901 .267 .290 1.00 .522 1.00
9:Na .892 .561 .290 .202 .471 .749 .323 .231 .417 .900 1.00 .583
10:Nd .382 .878 .420 .074 .583 1.00 .011 1.00 .584 .886 1.00 1.00
11:Mu .652 .734 .522 .776 .595 .229 .267 .898 .799 .175 .659 .881
12:Mm .728 .511 .056 .290 .162 .253 .377 .392 .875 .154 .151 .471
13:Ml 1.00 .261 .881 .129 .322 .892 .440 .607 .261 .537 1.00 .096
14:Wu .322 1.00 .735 .771 .766 .008 .269 .255 .290 1.00 .032 .159
15:Wm .617 .776 .042 .521 .348 .552 .489 .883 .172 .605 .335 .749
16:Wl .417 .878 .200 .451 .090 .132 .498 .543 .404 .033 .489 .728
Total Sign. 0 0 1 0 1 3 1 1 0 0 2 0

Notes: Bolded values indicate a significant change (p < .05). “Total Sign.” counts the
number of significant differences for each affect scale. Shaded cells indicate scales
evaluated by Group 2, non-shaded were evaluated by Group 1.

The affect scale passivity–aggression received the highest number of significant
results (3 of a possible 16), indicating that it was possibly the least reliable of all affect
scales. However, when conducting 192 t-tests (16 x 12) at an α = .05 level, one would
expect 5% (or 9.6) of the tests to come out significant just by chance. Hence, the 9
significant results are well within what one would expect by chance. In addition, the
significant results were fairly randomly distributed (with the possible exception of
passivity–aggression), so it is reasonable to conclude that participants responded in a
reliable, consistent manner on these measures. Thus, data were collapsed across time for
all participants, and all subsequent analyses were performed using this average rating.
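
The chance-level argument above can be made concrete with a short calculation. The following is a minimal sketch rather than part of the original analysis: it assumes the 192 tests are independent (which repeated ratings from the same participants may not be) and treats the count of significant results as a binomial variable. The function names are illustrative.

```python
from math import comb


def expected_false_positives(n_tests, alpha):
    """Expected number of significant tests if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


n_tests = 16 * 12   # 16 sequences x 12 affect-scale columns in Table 3.3
alpha = 0.05
observed = 9        # number of significant results reported in the text

print(f"expected by chance: {expected_false_positives(n_tests, alpha):.1f}")
print(f"P(at least {observed} significant): {prob_at_least(observed, n_tests, alpha):.3f}")
```

Under these assumptions, observing 9 or more significant results is more likely than not, which is in line with the conclusion drawn above.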
It was also noted that both groups had significantly faster responses, on average,
in Time 2 (Group 1: M t1 = 3184 ms, M t2 = 3088 ms, t (4799) = 6.680, p < .001; Group 2:
M t1 = 3341 ms, M t2 = 3221 ms, t (4799) = 7.954, p < .001)². This effect, however, was
quite small, and was probably due to increased familiarity with the task (a practice effect).
Also, reaction times between Time 1 and Time 2 were correlated, indicating that
scale-sequence combinations that were difficult in Time 1 were also difficult in
Time 2 (Group 1: r = .144, p < .001; Group 2: r = .191, p < .001). Since actual responses
did not differ significantly between Time 1 and Time 2, this practice effect was not
considered particularly important and is not discussed further.

Between Groups Reliability

The repetition of two affect scales (contentment–joy, hesitation–confidence)
across both groups served as a between-groups reliability check. In other words, these
two affect scales were used to determine if the two groups used the measures in an
equivalent way. This check consisted of a series of between-subjects t-tests, one for each
sequence on each of the two affect scales. The results of these t-tests, broken down by
sequence, can be found in Table 3.4.

² M t1 and M t2 stand for the mean reaction time at Time 1 and the mean reaction time at Time 2, respectively.

Table 3.4: Between group t-tests for repeated Affect Scales

contentment - joy hesitation - confidence

1:Lau t (98) = -.185, p = .854 t (98) = .656, p = .513
2:Ldu t (98) = 2.202, p = .030 * t (98) = -.378, p = .706
3:Lal t (98) = -1.500, p = .137 t (98) = -1.235, p = .220
4:Ldl t (98) = 1.510, p = .134 t (98) = -.814, p = .418
5:Rau t (98) = .881, p = .380 t (98) = .753, p = .453
6:Rdu t (98) = .458, p = .648 t (98) = 1.332, p = .186
7:Ral t (98) = 1.750, p = .083 t (98) = -1.599, p = .113
8:Rdl t (98) = 1.082, p = .282 t (98) = 1.125, p = .263
9:Na t (98) = 1.210, p = .229 t (98) = 1.373, p = .173
10:Nd t (98) = 1.126, p = .263 t (98) = -.085, p = .932
11:Mu t (98) = .980, p = .330 t (98) = 1.002, p = .319
12:Mm t (98) = .165, p = .869 t (98) = .165, p = .869
13:Ml t (98) = 1.735, p = .086 t (98) = -.162, p = .872
14:Wu t (98) = -.221, p = .826 t (98) = 1.268, p = .208
15:Wm t (98) = 1.516, p = .133 t (98) = 1.743, p = .085
16:Wl t (98) = -.747, p = .457 t (98) = -1.567, p = .120

Note: Some Levene’s tests for equality of variances were significant, but this did not
affect the t- or p-values.

Only one t-test was found to be significant. Thus, participants in the two groups
did not appear to be giving significantly different responses, on average, to the repeated
measures. Again, it is worth mentioning that with 32 tests using an α = .05 one would
expect 32 times 5% (or 1.6) to be significant just by chance. Since only one test was
found to be significant, this is well within what would be expected by chance. Thus,
one can conclude that Group 1 and Group 2 did not differ significantly in their responses
to our measures. This would imply that responses could be collapsed (averaged) across
groups for these two affect scales. However, the two groups were not collapsed for these
two affect scales because doing so would have made subsequent analyses and
interpretation more complex due to the asymmetry of design (i.e., a between-subjects
design with double the sample size for only 2 of the 10 affect scales).
Analysis of reaction time data indicated that participants in Group 2 were slightly
slower completing the task than participants in Group 1 (Group 1: M = 3135.8 ms, SD =
573.2 ms; Group 2: M = 3280.9 ms, SD = 633.3 ms; t (9598) = -11.763, p < .001). This
difference, however, was only about 145 ms, which is not large. The two groups had
approximately the same demographic makeup, but were exposed to different affect scales
(four of the six affect scales were different). Hence, the difference was most likely related
to the differences in affect scales between the two groups, and not to demographic
differences. The slower responding of Group 2 is an interesting point, but it was
not considered particularly relevant for this research.
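
To put the size of this reaction-time difference in context, a standardized effect size can be computed from the reported means and standard deviations. This is a minimal sketch: Cohen's d is not reported in this thesis and is introduced here only for illustration, and the simple pooled standard deviation assumes equal group sizes (consistent with the reported degrees of freedom).

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference (b minus a), pooling SDs for equal-sized groups."""
    pooled_sd = sqrt((sd_a**2 + sd_b**2) / 2)
    return (mean_b - mean_a) / pooled_sd


# Reaction-time summaries (ms) as reported in the text above.
g1_mean, g1_sd = 3135.8, 573.2
g2_mean, g2_sd = 3280.9, 633.3

d = cohens_d(g1_mean, g1_sd, g2_mean, g2_sd)
print(f"raw difference = {g2_mean - g1_mean:.1f} ms, Cohen's d = {d:.2f}")
```

By Cohen's conventional benchmarks a d of about 0.2 counts as small, so a highly significant t-value with a sample of this size is compatible with a modest difference.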

Between Groups ANOVA

An overall ANOVA was calculated to verify that there were no significant
differences in how these groups rated the sequences. This was done as a 2 (groups) x
16 (musical sequences) ANOVA for each of the two repeated affect scales. In these
ANOVAs, a large main effect of musical sequence was to be expected. However, a lack of
group differences would be indicated by no significant main effect of group and no
significant interaction. Results of these ANOVAs can be found in Tables 3.5 and 3.6.

Table 3.5: ANOVA for affect scale contentment-joy

Source F-value Eta Squared (η²)

Group (G) F (1,1568) = 8.020, p<.010 .005
Musical Sequences (M) F (15,1568) = 144.087, p<.001 .580
GxM F (15,1568) = 1.168, p=.290 .011

Table 3.6: ANOVA for affect scale hesitation-confidence

Source F-value Eta Squared (η²)

Group (G) F (1,1568) = 0.226, p=.426 .000
Musical Sequences (M) F (15,1568) = 13.661, p<.001 .268
GxM F (15,1568) = 0.422, p=.277 .011

For both repeated affect scales (contentment-joy and hesitation-confidence) the
interaction between group and musical sequences was not significant. This indicated that
participants in different groups did not rate the musical sequences significantly
differently across these repeated affect scales. However, there was a small main effect of
group for the affect scale contentment-joy, which indicated that participants in Group 2
tended to endorse “contentment” ratings for sequences more often than participants in
Group 1 (MG1 = 1.998, MG2 = 1.929)³. The effect size of this main effect was, however,
very small (η² = .005); thus this group difference was not considered particularly
important. Overall, these groups appeared to use the two repeated affect scales in an
equivalent way.

³ MG1 indicates the mean of Group 1, and MG2 indicates the mean of Group 2, where a lower rating
indicates more endorsement of “contentment”.

II. Descriptives

Given that the measures seemed reliable, the patterns of responses (profiles) per
sequence averaged over all participants within each group were examined. These patterns
(response profiles) are presented in Figures 3.1 through 3.16. Each figure presents all the
data from both groups for one sequence. The horizontal axis is an arbitrarily-ordered list
of the affect scales. The vertical axis represents the mean response of all participants (in
the relevant group) to each affect scale. Error bars represent standard errors. The musical
sequence is written out as a musical score at the top of each figure, and sequence
abbreviations appear in the top-left corner.
In this comparison all the figures present the 10 affect scales in the same order
along the horizontal axis. This facilitates comparison. Furthermore, it is important to
realize that the ordering of those affect terms is arbitrary (in fact, the assignment of “1”
and “3” to the ends of the affect scale was also arbitrary). That is, the overall shape of the
affect profile does not imply any particular affect. It is simply the fact that the profiles are
different as a function of sequence and that within a sequence, some affect terms are
Note that each figure contains the data for both Groups 1 and 2. As such, there are
two data points for the affect scales of contentment-joy and hesitation-confidence, since
they were used by both groups. The fact that the two data points overlap is a
reaffirmation of the aforementioned between-groups reliability analysis. Group 1 data
are denoted by a solid diamond, and Group 2 data by an open triangle.
Note, further, that several sequences exhibit differentiated profiles. That is, the
sequence led to polarized responses (i.e., toward one end of the scale) for some affect
terms and to undifferentiated responses for other affect terms. The “1:Lau” sequence
(Figure 3.1) was a good example. Responses strongly endorsed joy on the contentment-joy
affect scale, playfulness on the pensiveness-playfulness scale, excitement on the
excitement-boredom scale and energy on the energy-tranquility scale. However, the
passivity-aggression, delicacy-strength, and questioning-answering affect scales showed
little differentiation. The remaining three affect scales were between these two extremes
(some tendency to choose one end of the scale). This implied that some dimensions of
affect applied to this sequence, whereas other dimensions of affect did not. Stated
alternatively, the sequence expressed, or had relevance to, some dimensions of affect, but
not other dimensions of affect.
Finally, different sequences produced different affect profiles. For example, the
sequence “1:Lau” (Figure 3.1) seemed to produce the "opposite" affect to the sequences
“4:Ldl” (Figure 3.4), “7:Ral” (Figure 3.7), and “13:Ml” (Figure 3.13). In contrast, the
sequence “1:Lau” (Figure 3.1) produced a very similar profile to sequences “6:Rdu”
(Figure 3.6) and “14:Wu” (Figure 3.14). Some sequences did not seem to produce
"differentiated profiles" in that all affect scales were rated near the middle of the scale
(see Figures 3.1 to 3.16).




[Figures 3.1 to 3.16 are response-profile plots: the vertical axis is the mean rating and the horizontal axis lists the affect scales.]

Figure 3.1: The response profile for the Linear-Ascending-Upper register (Lau) sequence.

Figure 3.2: The response profile for the Linear-Descending-Upper register (Ldu) sequence.

Figure 3.3: The response profile for the Linear-Ascending-Lower register (Lal) sequence.

Figure 3.4: The response profile for the Linear-Descending-Lower register (Ldl) sequence.

Figure 3.5: The response profile for the Arch-Ascending-Upper register (Rau) sequence.

Figure 3.6: The response profile for the Arch-Descending-Upper register (Rdu) sequence.

Figure 3.7: The response profile for the Arch-Ascending-Lower register (Ral) sequence.

Figure 3.8: The response profile for the Arch-Descending-Lower register (Rdl) sequence.

Figure 3.9: The response profile for the N-shape-Ascending (Na) sequence.

Figure 3.10: The response profile for the N-shape-Descending (Nd) sequence.

Figure 3.11: The response profile for the M-shape-Upper register (Mu) sequence.

Figure 3.12: The response profile for the M-shape-Middle register (Mm) sequence,
where “middle” means the sequence is between the two registers (around C4).

Figure 3.13: The response profile for the M-shape-Lower register (Ml) sequence.

Figure 3.14: The response profile for the W-shape-Upper register (Wu) sequence.

Figure 3.15: The response profile for the W-shape-Middle register (Wm) sequence,
where “middle” means the sequence is between the two registers (around C4).

Figure 3.16: The response profile for the W-shape-Lower register (Wl) sequence.

From the other perspective, note that each affect scale showed some range of
responding across the different sequences. This implied that each affect scale had
some particular relevance to some sequences.
For example, for the affect scale contentment-joy, four of the sequences were
rated as conveying some degree of joy (1:Lau, 5:Rau, 6:Rdu, and 14:Wu), four sequences
were rated as conveying some degree of contentment (4:Ldl, 7:Ral, 8:Rdl, and 13:Ml),
and the remaining eight sequences' ratings were toward the middle of the scale. The affect
scale pensiveness-playfulness was rated similarly, with approximately three of the
sequences rated as conveying pensiveness (4:Ldl, 7:Ral, 13:Ml), five of the sequences
rated as conveying playfulness (1:Lau, 2:Ldu, 5:Rau, 6:Rdu, 14:Wu), and the rest rated
around the mid-point. The affect scale that was probably the most distinctive was the scale
“questioning-answering”. For questioning-answering most sequences were rated at the
neutral point (“2”), with only a few sequences rated as indicating a slight question (1:Lau,
3:Lal, 6:Rdu, 8:Rdl), and four of the sequences rated as conveying an answer (2:Ldu,
4:Ldl, 7:Ral, 11:Mu). Notably, all sequences rated as conveying questioning ended
with an ascending component, and sequences rated as conveying answering ended with a
descending component. Thus, each affect scale appeared to be rated differently across
different sequences.
The mean ratings for all affect terms for each sequence (collapsing Groups 1 and 2
for the contentment-joy and hesitation-confidence scales) are shown in Table 3.7. The
average rating over all sequences for each affect scale was near 2, the middle of the
scale. A mean near 1 or 3 would imply a poor choice of affect terms because it would
indicate that only one part of the scale was being endorsed by participants.

Table 3.7: Response Means for each Sequence & Affect Scale

Affect Scales (Words)

Seq. cont-joy hes-conf pens-play pass-agr ques-ans del-str irr-calm sur-exp ener-tran exci-bore Mean SD
1:Lau 2.88 2.67 2.80 2.10 1.75 1.74 1.56 1.35 1.15 1.14 2.06 0.86
2:Ldu 2.12 2.19 2.38 1.65 2.51 1.63 1.66 2.19 1.89 1.70 2.02 0.71
3:Lal 1.68 2.15 1.72 1.97 1.75 2.35 2.35 2.28 2.13 2.16 2.03 0.65
4:Ldl 1.14 1.55 1.18 1.64 2.48 2.38 2.64 2.69 2.79 2.90 2.01 0.85
5:Rau 2.39 2.38 2.44 1.80 2.30 1.77 1.68 1.70 1.71 1.39 2.03 0.70
6:Rdu 2.84 2.61 2.85 2.05 1.79 1.72 1.37 1.37 1.16 1.12 2.03 0.86
7:Ral 1.15 1.46 1.14 1.79 2.33 2.42 2.64 2.75 2.83 2.89 2.00 0.87
8:Rdl 1.60 1.90 1.74 1.82 1.87 2.19 2.39 2.26 2.36 2.26 1.99 0.63
9:Na 1.83 1.97 1.93 1.64 2.10 1.92 2.27 2.15 2.28 1.97 1.99 0.57
10:Nd 2.03 1.99 2.17 2.00 2.07 1.94 1.80 1.85 1.92 1.84 1.97 0.54
11:Mu 2.05 2.10 2.27 1.75 2.35 1.85 1.90 2.09 1.98 1.77 2.02 0.63
12:Mm 1.97 1.87 2.15 1.95 2.00 2.00 1.83 2.17 1.90 1.76 1.95 0.57
13:Ml 1.15 1.50 1.18 1.77 2.21 2.39 2.52 2.69 2.80 2.87 1.98 0.85
14:Wu 2.92 2.56 2.91 2.15 1.97 1.69 1.42 1.18 1.12 1.06 2.04 0.88
15:Wm 1.96 2.09 2.18 1.84 2.23 1.97 2.05 2.26 2.00 1.90 2.04 0.54
16:Wl 1.75 2.05 1.88 1.87 2.11 2.22 2.25 2.10 2.13 2.08 2.02 0.57
Mean 1.96 2.06 2.06 1.86 2.11 2.01 2.02 2.07 2.01 1.93 2.01 0.72
SD 0.74 0.69 0.73 0.67 0.72 0.70 0.71 0.71 0.74 0.73
Min 1.14 1.46 1.14 1.64 1.75 1.63 1.37 1.18 1.12 1.06
Max 2.92 2.67 2.91 2.15 2.51 2.42 2.64 2.75 2.83 2.90

Notes: The contentment-joy and hesitation-confidence scales were based on both groups
(dark shading). Light shading indicates affect scales rated by Group 2.
Seq. is short for “sequences”.

The primary point of the remainder of this work is the exploration of the link
between the structure of the sequence and the type of affect that it expresses. Some
preliminary observations indicated that sequences that ended on either the high C (C5) or
low C (C3) seemed to elicit the strongest responses. For example, the six sequences that
yielded the strongest response patterns were: “1:Lau”: linear ascending in upper octave
ending on C5 (Figure 3.1), “4:Ldl”: linear descending in lower octave ending on C3
(Figure 3.4), “6:Rdu”: descending arch beginning and ending on C5 (Figure 3.6),
“7:Ral”: ascending arch beginning and ending on C3 (Figure 3.7), “13:Ml”: M-shape
beginning and ending on C3 (Figure 3.13), and “14:Wu”: W-shape beginning and ending
on C5 (Figure 3.14). These sequences are hereafter collectively referred to as the
“extreme six”.
The profile figures show that certain affect scales received consistently stronger levels
of responding for the “extreme six”. These included the scales contentment–joy,
pensiveness–playfulness, energy–tranquility, and excitement–boredom. It was also
evident that the three sequences with a final note of C3 (4:Ldl, 7:Ral, 13:Ml) had almost
identical profiles, with 13:Ml having the most extreme responses. Correspondingly, the
three sequences with C5 as a final note (1:Lau, 6:Rdu, 14:Wu) had almost identical
profiles, with 14:Wu having the most extreme responses. Specifically, sequences that
ended on C3 were rated as conveying contentment, pensiveness, tranquility and boredom.
On the other hand, sequences that ended on C5 were rated as conveying joy, playfulness,
energy, and excitement.
The profiles of the other ten sequences (those not in the extreme six) were mostly
flat, with average responses close to 2. The N-shapes and the remaining W and M shapes
were the flattest. The arches and linear sequences that were not part of the “extreme six”
behaved more like the arches and linear sequences in the same register than like those
with the same melodic shape or contour. In other words, pitch height (whether
sequences were in the upper or lower octave) appeared to be a stronger predictor of
participant responses than melodic contour or shape of the melody.
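
The pitch-height observation can be checked informally against the contentment-joy column of Table 3.7. The following is a minimal sketch, not the analysis used in this thesis: the register code (+1 for upper, -1 for lower, 0 for middle-register and N-shape sequences) is a hypothetical coding read off the final letter of each sequence abbreviation, and the function names are illustrative.

```python
from math import sqrt

# Mean contentment-joy ratings per sequence, copied from Table 3.7.
cont_joy = {
    "1:Lau": 2.88, "2:Ldu": 2.12, "3:Lal": 1.68, "4:Ldl": 1.14,
    "5:Rau": 2.39, "6:Rdu": 2.84, "7:Ral": 1.15, "8:Rdl": 1.60,
    "9:Na": 1.83, "10:Nd": 2.03, "11:Mu": 2.05, "12:Mm": 1.97,
    "13:Ml": 1.15, "14:Wu": 2.92, "15:Wm": 1.96, "16:Wl": 1.75,
}


def register_code(label):
    """Hypothetical coding: final 'u' -> +1 (upper), final 'l' -> -1 (lower), else 0."""
    return {"u": 1, "l": -1}.get(label[-1], 0)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


registers = [register_code(label) for label in cont_joy]
ratings = list(cont_joy.values())
r = pearson_r(registers, ratings)
print(f"register vs. contentment-joy rating: r = {r:.2f}")
```

A strongly positive r under this coding would be consistent with upper-register sequences being rated toward the joy end of the scale, but with only 16 means and an ad hoc predictor it is an illustration rather than a test.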

Correlations between Affect Scales

From the profile charts, it appeared that certain affect scales tended to be used in a
very similar manner across sequences. For example, sequences described as conveying
joy were also described as conveying confidence, energy and excitement. Correlations on
the affect scales indicated that, in fact, some affect scales were highly correlated with
each other (Tables 3.8 and 3.9).

Table 3.8: Correlation of means across all 16 sequences for Group 1.

cont-joy1 hes-conf1 pens-play del-str irr-calm excit-bore

cont-joy1 - 0.959** 0.989** -0.923** -0.963** -0.984**
hes-conf1 - 0.955** -0.850** -0.877** -0.960**
pens-play - -0.932** -0.967** -0.993**
del-str - 0.930** 0.913**
irr-calm - 0.955**
excit-bore -
Mean 1.995 2.075 2.056 2.012 2.022 1.926
SD 0.573 0.406 0.572 0.277 0.427 0.599

Notes: ** p < .01, * p < .05

For all correlations, N = 16

Table 3.9: Correlation of means across all 16 sequences for Group 2.

cont-joy2 hes-conf2 pass-agr ques-ans sur-exp ener-tran

cont-joy2 - 0.940** 0.713** -0.491 -0.977** -0.995**
hes-conf2 - 0.639** -0.490 -0.907** -0.938**
pass-agr - -0.775** -0.738** -0.733**
ques-ans - 0.528* 0.503*
sur-exp - 0.965**
ener-tran -
Mean 1.928 2.049 1.863 2.116 2.067 2.009
SD 0.588 0.343 0.161 0.244 0.472 0.552

Notes: ** p < .01, * p < .05

For all correlations, N = 16

For example, for Group 1 the correlations ranged in strength from |r| = .850 to |r| =
.993 (Table 3.8). These high, significant correlations indicated that all six affect scales in
Group 1 were giving approximately the same information. In other words, the affect
scales appeared to be somewhat redundant with one another. The correlation squared is a
measure of overlap (proportion of variance explained), and as such, a correlation of .979
implies a 95.8% overlap, while a correlation of .850 implies a 72.3% overlap (or a 27.7%
non-overlap). As the affect scales were designed to measure different affective qualities of
the music, such strong correlations indicated that these measures were not measuring
different qualities. Thus, the validity of the affective measures was called into question
by this result for Group 1. For example, it seemed odd that the correlations imply that
sequences rated the most joyful were also rated as the most irritated. To understand this,
one must remember that the correlation is a measure of pattern similarity. The correlation
ignores the absolute values of the measure. Note that across sequences, the contentment-
joy affect scale ranged from 1.14 to 2.92 (1.78), whereas the irritation-calmness scale
ranged from only 1.37 to 2.64 (1.27). Hence, the irritation-calmness scale showed a
narrower range of difference (less differentiation). In addition, the middle of the
contentment-joy scale (the mean of 1.96 was near the middle) implied some positive
affect for a sequence. On the other hand, the middle of the irritation-calm scale (the mean
of 2.02 was near the middle) implied the absence of affect (on this dimension). Hence,
the correlation is saying that small deviations from neutrality on the irritated-calm scale
are associated with large deviations along the positive domain of happiness. Each
correlation needs to be considered in this light, but since this falls out of the modeling
(e.g., η²), discussion will be postponed till then.
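
The overlap figures quoted in this paragraph follow directly from squaring the correlation. As a small worked sketch (the function name is illustrative), the calculation is shown for the example values in the text and for the weakest and strongest correlations in Table 3.8:

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


# .850 and .993 are the smallest and largest absolute correlations in Table 3.8;
# .979 is the example value used in the text.
for r in (0.850, 0.979, 0.993):
    overlap = 100 * shared_variance(r)
    print(f"|r| = {r:.3f}: overlap = {overlap:.1f}%, non-overlap = {100 - overlap:.1f}%")
```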
The strength of correlations found between the affect scales used in Group 2 was
generally lower (Table 3.9), and in fact, for some measures the correlations were not
significant. For example, the affect scale questioning–answering was not found to be
significantly correlated with the scales contentment–joy and hesitation–confidence
(r = -.491; r = -.490 respectively), although it was found to be moderately correlated
with surprise–expectation and energy–tranquility (r = .528, p = .035; r = .503, p = .047;
respectively). However, in Group 2 the affect scales contentment–joy,
hesitation–confidence, energy–tranquility and surprise–expectation were all strongly
correlated, with |r| > .90 (p < .01) in all cases. Sequences described as conveying joy were
also described as conveying confidence, surprise, and energy. Thus, even in Group 2 our
affect measures were doing a poor job at capturing completely independent aspects of the
music, with the exception of the scale questioning–answering and possibly the scale
passivity–aggression.

III. Modeling the Responses

Overall ANOVA
To obtain a general sense of the differences between affect scales and between
sequences, several initial ANOVAs were conducted. The first analysis examined affect
ratings as a function of affect scale and sequence using a 10x16 (affect scales by
sequences) mixed ANOVA on all responses. The main effect of sequence would imply
that different sequences produced different average ratings collapsed over all affect
scales. The main effect of affect scale would imply that different affect scales produced
different average ratings collapsed over all sequences. However, neither of these terms is
particularly interesting or important because some sequences were expected to produce
higher ratings for some scales and lower ratings for other scales. Hence, it was not
expected that the main effects for affect scale or sequence would be significant. For this
analysis, the important term was the interaction.
Thereafter, analyses were conducted for each affect scale separately as a function
of sequence to assess the impact of differences in sequences for each affect scale (i.e., the
simple effects of sequence within each affect scale). Finally, there were analyses of each
sequence as a function of affect scale to assess the impact of differences in affect scales
for each sequence (i.e., the simple effects of affect scale within each sequence).
In all these analyses, one important statistic was η² (eta squared). That is, as a
prelude to the final modeling analysis, it was necessary to determine what proportion of
the variation in the responses could actually be explained by our manipulations of
different sequences and affect scales. Eta squared (η²) is a simple measure of effect size
(a measure of the proportion of variance explained) which indicates the relative impact of
the differences between sequences on each affect scale (or the relative impact of the
differences between affect scales on each sequence). Values near 1 would imply that
differences between the sequences could explain nearly all of the differences in affect
ratings, and values near 0 would imply no impact. The important issue for modeling was
that, at most, a model could expect to explain the observed η². That is, the ANOVAs
captured any and all differences that could be explained by the independent variables
(sequences & scales). However, a model only explains that part of "any and all
differences" that is relevant to the model. Hence, η² served as a benchmark for assessing
the quality of the various models.
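
For readers who wish to relate the tabled F-values to the tabled effect sizes, the sketch below recovers an effect size from F and its degrees of freedom. This is an inference from the reported numbers rather than something stated in the text: the values in Table 3.10 are reproduced by the formula for partial eta squared, F x df_effect / (F x df_effect + df_error), and the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# (source, F, df_effect, df_error) as reported in Table 3.10.
table_3_10 = [
    ("Musical Sequences (M)", 10.879, 15, 9440),
    ("Affect Scales (A)", 15.413, 9, 9440),
    ("MxA", 43.228, 135, 9440),
]

for source, f_value, df_effect, df_error in table_3_10:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{source}: eta squared = {eta:.3f}")
```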

Table 3.10: Overall ANOVA, including both groups and all 10 affect scales.

Source F-value Eta Squared (η²)

Musical Sequences (M) F (15, 9440) = 10.879, p<.001 .017
Affect Scales (A) F (9, 9440) = 15.413, p<.001 .014
MxA F (135, 9440) = 43.228, p<.001 .382

Table 3.11: ANOVA for Group 1

Source F-value Eta Squared (η²)

Musical Sequences (M) F (15, 4704) = 3.830, p<.001 .012
Affect Scales (A) F (5, 4704) = 7.384, p<.001 .008
MxA F (75, 4704) = 47.672, p<.001 .432

Table 3.12: ANOVA for Group 2

Source F-value Eta Squared (η²)

Musical Sequences (M) F (15, 4704) = 1.403, p = .136 .004
Affect Scales (A) F (5, 4704) = 20.617, p<.001 .021
MxA F (75, 4704) = 31.109, p<.001 .332

The overall ANOVA based on both groups (16 sequences by 10 affect scales,
Table 3.10) indicated that the interaction between musical sequences and affect scales
(F(135, 9440) = 43.228) was much stronger than the main effect for sequence (F(15,
9440) = 10.879) and for affect scale (F(9, 9440) = 15.413). Similarly, the effect size for
the interaction (η² = .382) was more than 20 times larger than it was for the main effect of
sequence (η² = .017) or affect scale (η² = .014). This significant interaction indicated to us
that the manipulation was working. That is, participants’ responses depended on the
combination of musical sequence being played and affect scale presented. An effect size
of η² = .382 in this case indicated that approximately 38.2% of the variation
in responses could be explained by the combination of musical sequences and affect
scales. By group, Group 1 showed a larger interaction and a larger main effect for
sequence, while Group 2 showed a larger main effect for affect scale, further suggesting
some group differences (Tables 3.11 & 3.12).

Analysis of Each Sequence
The next analysis examined the differences between the affect scales for each
sequence separately. Since different groups received different affect scales, these
analyses were conducted within each group separately. Recall that two of the affect scales
were repeated across groups (contentment-joy and hesitation-confidence).
For Group 1 the ANOVAs were all significant for each sequence across the affect
scales (Table 3.13).

Table 3.13: ANOVA for Sequences across Affect Scales. The percentage of variance
accounted for by each sequence.

Sequence ANOVA Result (Group 1) ANOVA Result (Group 2) Eta Squared (Group 1) Eta Squared (Group 2)
1:Lau F (5, 294) = 105.6 *** F (5, 294) = 70.4 *** .642 .545
2:Ldu F (5, 294) = 13.1 *** F (5, 294) = 10.8 *** .182 .155
3:Lal F (5, 294) = 15.0 *** F (5, 294) = 7.0 *** .203 .106
4:Ldl F (5, 294) = 126.5 *** F (5, 294) = 75.1 *** .683 .561
5:Rau F (5, 294) = 34.4 *** F (5, 294) = 13.3 *** .369 .184
6:Rdu F (5, 294) = 124.1 *** F (5, 294) = 59.7 *** .679 .504
7:Ral F (5, 294) = 118.8 *** F (5, 294) = 72.6 *** .669 .553
8:Rdl F (5, 294) = 13.0 *** F (5, 294) = 15.0 *** .181 .203
9:Na F (5, 294) = 3.7 ** F (5, 294) = 9.6 *** .059 .140
10:Nd F (5, 294) = 4.3 ** F (5, 294) = .8, p=.541 .068 .014
11:Mu F (5, 294) = 5.5 *** F (5, 294) = 4.8 *** .086 .075
12:Mm F (5, 294) = 3.1 ** F (5, 294) = 1.7, p=.127 .051 .029
13:Ml F (5, 294) = 90.1 *** F (5, 294) = 72.7 *** .605 .553
14:Wu F (5, 294) = 115.8 *** F (5, 294) = 87.8 *** .663 .599
15:Wm F (5, 294) = 2.3 * F (5, 294) = 5.3 *** .038 .083
16:Wl F (5, 294) = 7.8 *** F (5, 294) = 3.4 ** .117 .055

Note: *** p<.001, ** p<.01, * p < .05

Note that almost all of the sequences produced different ratings for the 6 affect scales
(within each group, there were only 6 affect scales). In Group 1, the largest F-value was
associated with the linear sequence 4:Ldl and the smallest with the W-shaped sequence
15:Wm (F-values 126.5 and 2.3, respectively). Obtained squared etas ranged from η² =
.683 to η² = .038. Note that the average (approximately .33 for Group 1) is near that of the
interaction for the overall ANOVA. Not surprisingly, the “extreme six” (the six sequences
with an ending note other than middle C) had the highest obtained F-values and η²s. All
six of the extreme six had η² > .60. Thus, for Group 1, all sixteen sequences were
important for explaining the variation in participant responses, but the “extreme six”
sequences were the most powerful predictors (had the most reliable level of responding
across all affect scales).
For Group 2, sequences 10:Nd and 12:Mm did not yield significant ANOVAs (F
(5, 294) = .8, p = .541, and F (5, 294) = 1.7, p = .127 respectively). All other sequences
yielded significant ANOVAs, with the largest F-value and η² obtained by the sequence
14:Wu (F (5, 294) = 87.8, p < .001, η² = .599). Again, for Group 2, the “extreme six” had
the highest obtained F-values and η²s. For Group 2, however, the obtained η²s were not
as high (η² > .50). In fact, for almost all sequences squared etas (η²s) and obtained
F-values were smaller for Group 2 compared to Group 1. The only exceptions were the
three sequences 8:Rdl, 9:Na, and 15:Wm. This means that, in general, the different
sequences explained more variation across affect scales in Group 1 compared to Group 2.
Comparatively, participants in Group 1 responded more consistently to the sequences
than did participants in Group 2.
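
The comparisons drawn from Table 3.13 can be reproduced with a few lines of code. This is a minimal sketch that simply re-reads the tabled effect sizes; the variable and function names are illustrative and it is not part of the original analysis.

```python
# Eta squared per sequence as (Group 1, Group 2), copied from Table 3.13.
eta_sq = {
    "1:Lau": (0.642, 0.545), "2:Ldu": (0.182, 0.155),
    "3:Lal": (0.203, 0.106), "4:Ldl": (0.683, 0.561),
    "5:Rau": (0.369, 0.184), "6:Rdu": (0.679, 0.504),
    "7:Ral": (0.669, 0.553), "8:Rdl": (0.181, 0.203),
    "9:Na": (0.059, 0.140), "10:Nd": (0.068, 0.014),
    "11:Mu": (0.086, 0.075), "12:Mm": (0.051, 0.029),
    "13:Ml": (0.605, 0.553), "14:Wu": (0.663, 0.599),
    "15:Wm": (0.038, 0.083), "16:Wl": (0.117, 0.055),
}


def top_sequences(group_index, n=6):
    """The n sequences with the largest eta squared in one group (0 = Group 1, 1 = Group 2)."""
    ranked = sorted(eta_sq, key=lambda seq: eta_sq[seq][group_index], reverse=True)
    return ranked[:n]


mean_g1 = sum(g1 for g1, _ in eta_sq.values()) / len(eta_sq)
mean_g2 = sum(g2 for _, g2 in eta_sq.values()) / len(eta_sq)
higher_in_g2 = [seq for seq, (g1, g2) in eta_sq.items() if g2 > g1]

print(f"mean eta squared: Group 1 = {mean_g1:.3f}, Group 2 = {mean_g2:.3f}")
print("larger in Group 2:", higher_in_g2)
print("top six, Group 1:", sorted(top_sequences(0)))
print("top six, Group 2:", sorted(top_sequences(1)))
```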

Analysis of Each Affect Scale

The next analysis examined the effect of differences in the sequences on each
affect scale separately. Since different groups received different scales, these analyses
were conducted within each group separately. In addition, the analyses of affect scales
that were repeated across groups (contentment-joy and hesitation-confidence) were
conducted within each group.
For both Groups 1 and 2 the ANOVAs were significant for all affect scales
across all sixteen sequences, even for the scale passivity–aggression, which many
participants reported as being hard to apply consistently to the sequences (Table 3.14). As
might be expected, passivity–aggression had the lowest obtained F and η² (F (15, 784) =
3.032, p < .001, η² = .055). The largest F-value was obtained by the scale
excitement–boredom (F (15, 784) = 88.660, p < .001, η² = .629). Again, Group 1 tended
to have higher obtained F-values and η²s, with five of six scales having η² > .30, whereas
Group 2 had only three of six scales with η² larger than .30. The five scales able to explain
the most variation in responses (the largest η²s) were contentment–joy (for both groups),
pensiveness–playfulness, excitement–boredom, surprise–expectation, and
energy–tranquility (all with η²s > .40).
The two affect scales that had the least in common with the others
(questioning–answering and passivity–aggression) also had the lowest η²s.
Questioning–answering only explained 11.1% (η² = .111) of the variance in responses,
and passivity–aggression only 5.5% (Table 3.14).

Table 3.14: ANOVA for words across all sixteen Sequences

Words (Group 1) F-value Eta Squared (η²)

contentment – joy1 F(15,784) = 70.478 *** .574
hesitation – confidence1 F(15,784) = 23.649 *** .312
pensiveness -- playfulness F(15,784) = 72.389 *** .581
delicacy – strength F(15,784) = 8.978 *** .147
irritation -- calmness F(15,784) = 26.371 *** .335
excitement – boredom F(15,784) = 88.660 *** .629

Words (Group 2) F-value Eta Squared (η²)

contentment – joy2 F(15,784) = 74.801 *** .589
hesitation – confidence2 F(15,784) = 16.045 *** .235
passivity – aggression F(15,784) = 3.032 *** .055
questioning – answering F(15,784) = 6.550 *** .111
surprise – expectation F(15,784) = 37.812 *** .420
energy -- tranquility F(15,784) = 56.316 *** .519

Note: *** p<.001

Modeling effect of Sequences

The final and primary set of analyses was intended to determine those features of
the "music" in the sequences that resulted in the observed variation in responses across
affect scales. In these analyses, differences in (musically-relevant) construction of the
sequences were used to predict (model) the observed differences in the affect ratings.
Because the current work was largely exploratory, each affect rating was analyzed
separately. In each analysis, structural differences were used as predictor variables and
each affect rating was used as the criterion using multiple regressions. Essentially, each
participant rated 16 sequences on each affect scale. That is, each participant contributed
16 data points to the analysis of one affect scale.

In order to relate sequence structure to affect ratings, it was necessary to assess
and quantify sequence structure. Hence, the first step in the modeling was the
quantification of musically-relevant structures of the 16 sequences. "Musically-relevant
structure" referred to the internal (cognitive) representation of sequence structure, not
necessarily the external (physical) structure of the sequences.
Sequences differed in terms of pitch pattern. That is, each sequence had different
notes arranged in a different pattern. All sequences had the same number of notes, the
same constant duration per note and the same constant temporal gap between notes. In
fact, many sequences had the same notes in different arrangements. As such, sequences
differed primarily in pitch height (e.g., tone, or note) and the change in pitch pattern over
time. However, despite the constrained construction, these small differences in the
physical stimulus can lead to large differences in a number of musically-relevant
cognitive dimensions. That is, the perception of the physical dimension of pitch and the
associated change in pitch is complicated by the internal (cognitive) representation of
pitch in terms of key (tonality) or harmony (harmonic progression) or other aspects of
music. Of these, for the simpler sequences used in the current work, tonality is a primary
In western tonal music, there are twelve possible notes in an octave. These may be
described as "C, C#, D, D#, E, F, F#, G, G#, A, A#, B" which then repeats in the next
octave. To the human listener, a "C" in one octave sounds "pretty much" the same as a
"C" in any other octave (the same is true of all notes). Hence, pitch height really has two
aspects simplistically labeled as note (chroma) and octave.
To create the internal representation of pitch height, there are two basic options
within music cognition (cf., Frankland & Cohen, 1996). The simplest method is to use the
MIDI standard. The MIDI standard is a numerical encoding of all possible notes used in
western tonal music. For example, one can assign the numbers 1 to 88 to encode all 88
black and white notes of the (full) piano keyboard. In fact, the scheme assigns the
numbers from 0 to 127 so as to address all possible notes in all octaves (i.e., the piano
keyboard does not include all possible notes of all possible instruments). C4 (middle C of
the keyboard) is referenced as 60, C3 as 48 (one octave below middle C), and C5 as 72
(one octave above middle C), et cetera. In the MIDI notation, the 12 notes per octave (C
C# D D# E F F# G G# A A# B [C]) would be numbered sequentially as 60 61 62 63 64
65 66 67 68 69 70 71 [72]. Repeating notes in different octaves are separated by the
value of 12 (e.g., 60 and 72).
In the current work, sequences were situated in the middle two octaves of the
piano keyboard. Thus, the linear sequence 1:Lau (C4, E4, G4, G4, A4, B4, C5) is assigned
the MIDI coding: 60, 64, 67, 67, 69, 71, 72. Table 3.15 shows the result of applying this
MIDI coding to all sixteen sequences.

Table 3.15: Transformation of Sequence notes based on a MIDI coding scheme.

Sequences MIDI (chromatic) note coding

# Name Actual Notes 1 2 3 4 5 6 7
1 1:Lau C4 E4 G4 G4 A4 B4 C5 60 64 67 67 69 71 72
2 2:Ldu C5 A4 G4 F4 E4 D4 C4 72 69 67 65 64 62 60
3 3:Lal C3 D3 E3 F3 G3 B3 C4 48 50 52 53 55 59 60
4 4:Ldl C4 A3 G3 F3 E3 D3 C3 60 57 55 53 52 50 48
5 5:Rau C4 G4 C5 G4 E4 D4 C4 60 67 72 67 64 62 60
6 6:Rdu C5 B4 A4 G4 E4 G4 C5 72 71 69 67 64 67 72
7 7:Ral C3 E3 G3 F3 E3 D3 C3 48 52 55 53 52 50 48
8 8:Rdl C4 G3 E3 C3 E3 G3 C4 60 55 52 48 52 55 60
9 9:Na C4 E4 D4 C4 A3 G3 C4 60 64 62 60 57 55 60
10 10:Nd C4 G3 E3 C4 G4 E4 C4 60 55 52 60 67 64 60
11 11:Mu C4 G4 C4 E4 D4 D4 C4 60 67 60 64 62 62 60
12 12:Mm C4 E4 C4 G3 C4 E4 C4 60 64 60 55 60 64 60
13 13:Ml C3 G3 E3 D3 C3 E3 C3 48 55 52 50 48 52 48
14 14:Wu C5 A4 G4 C5 G4 B4 C5 72 69 67 72 67 71 72
15 15:Wm C4 G3 E4 C4 G3 B3 C4 60 55 64 60 55 59 60
16 16:Wl C4 A3 G3 B3 C4 G3 C4 60 57 55 59 60 55 60

Note that the MIDI coding tends to assign a step size of 2 or 4 to adjacent notes in most
sequences, with the occasional 1 or 3.
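
The chromatic coding can be sketched in a few lines. The sketch assumes the convention described above, in which C4 is 60 and each semitone adds 1; the helper name `note_to_midi` is hypothetical.

```python
# Semitone offset of each natural note above C within one octave.
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name):
    """Convert a note name such as 'C4' or 'F#3' to a MIDI number (C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    sharp = rest.startswith("#")
    octave = int(rest[1:] if sharp else rest)
    return 12 * (octave + 1) + SEMITONE[letter] + (1 if sharp else 0)


lau = "C4 E4 G4 G4 A4 B4 C5".split()
print([note_to_midi(n) for n in lau])
```

For sequence 1:Lau this reproduces the first row of Table 3.15.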

The MIDI coding is based on a chromatic scale. This may not be an accurate
representation of the manner in which listeners encode pitch information. It is likely that
most listeners encode (hear) pitch within the context of some key (Krumhansl, 1990;
Lerdahl, 1988; Lerdahl, 2001). A key is defined as a particular subset of 7 notes from the
12 possible notes of the octave. It is commonly taught in the formative years as the "doh
ray me fah soh la te doh" scale. Note that the scale covers the entire octave using only 7
notes. This is a diatonic representation of pitch (in fact, there are other possible internal
representations of pitch).
Each different key uses a different subset of 7 notes. For example the key of C
major uses C D E F G A B (the white keys of the piano), and the key of G uses C D E F#
G A B, normally written as G A B C D E F#. The process of key-abstraction is probably
automatic and rapid. For example, Cohen (1991) has shown that key abstraction can be
reliably done with just the first four notes of a melody. Listeners have a strong propensity
to hear music within a given key. In fact, sequences of notes that do not fall within a
single key do not sound "musical" to most individuals (e.g., 12-tone music is not
appreciated by most listeners in the western world). More importantly, sequences of notes
that are not confined to the notes of a single key are not likely processed as music. For
that reason, all sequences in the current work were designed to fall within the single key
of C major.
This created a problem for modeling structure. If listeners processed sequences
using a key-based internal representation of pitch, then the notes of the C major scale
would be C D E F G A B [C]. The steps between all these notes sound the same. That is,
the step from C to D sounds the same as the step from E to F. Yet, when compared to the
chromatic scale (MIDI), they are not. The step from C to D is two units, while the step
from E to F is only one unit. The diatonic coding assigns the values of 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 to the notes "C3 D3 E3 F3 G3 A3 B3 C4 D4 E4 F4 G4 A4 B4 C5".
Note that the scaling arbitrarily assigns a 1 to the lowest note of any sequence used in the
current work. Since the point is to compare the effect of the sequences to each other,
nothing is lost by this simplification. Note that no numbers were assigned to accidental
notes such as D# and F#. However, in this work, no accidental notes were used in our
sequences. Thus, 1:Lau (C4, E4, G4, G4, A4, B4, C5) on our diatonic scale would
become: 8, 10, 12, 12, 13, 14, 15. Table 3.16 shows the result of applying the diatonic
coding scheme to all sixteen sequences.

Table 3.16: Transformation of Sequence notes based on a diatonic coding scheme.

Sequences Diatonic note coding

# Name Actual Notes 1 2 3 4 5 6 7
1 1:Lau C4 E4 G4 G4 A4 B4 C5 8 10 12 12 13 14 15
2 2:Ldu C5 A4 G4 F4 E4 D4 C4 15 13 12 11 10 9 8
3 3:Lal C3 D3 E3 F3 G3 B3 C4 1 2 3 4 5 7 8
4 4:Ldl C4 A3 G3 F3 E3 D3 C3 8 6 5 4 3 2 1
5 5:Rau C4 G4 C5 G4 E4 D4 C4 8 12 15 12 10 9 8
6 6:Rdu C5 B4 A4 G4 E4 G4 C5 15 14 13 12 10 12 15
7 7:Ral C3 E3 G3 F3 E3 D3 C3 1 3 5 4 3 2 1
8 8:Rdl C4 G3 E3 C3 E3 G3 C4 8 5 3 1 3 5 8
9 9:Na C4 E4 D4 C4 A3 G3 C4 8 10 9 8 6 5 8
10 10:Nd C4 G3 E3 C4 G4 E4 C4 8 5 3 8 12 10 8
11 11:Mu C4 G4 C4 E4 D4 D4 C4 8 12 8 10 9 9 8
12 12:Mm C4 E4 C4 G3 C4 E4 C4 8 10 8 5 8 10 8
13 13:Ml C3 G3 E3 D3 C3 E3 C3 1 5 3 2 1 3 1
14 14:Wu C5 A4 G4 C5 G4 B4 C5 15 13 12 15 12 14 15
15 15:Wm C4 G3 E4 C4 G3 B3 C4 8 5 10 8 5 7 8
16 16:Wl C4 A3 G3 B3 C4 G3 C4 8 6 5 7 8 5 8

Hence, there is some question as to the internal representation of the pitch
structure of the sequences: Is it chromatic (MIDI) or is it diatonic? Because this question
is not clearly resolved in the literature (it is possible that individuals switch between the
two systems as needed; see Lerdahl, 1988), it was decided that both methods for coding
sequence structure should be used. Note that the two schemes are not "dramatically"
different with respect to the change in pitch within a sequence, but they are different. For
example, the change in 1:Lau when using the chromatic scale is +4 +3 +0 +2 +2 +1 but
the change when using the diatonic scale is +2 +2 +0 +1 +1 +1. The subtle differences
may be an issue when comparing responses to sequences that differ only subtly.
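
The difference between the two codings can be illustrated as follows. This is a minimal sketch that assumes natural notes only and the scaling in which C3 is coded as 1; the names `note_to_diatonic` and `steps` are illustrative.

```python
# Scale degree of each natural note within the C major scale (C = 0).
DEGREE = {"C": 0, "D": 1, "E": 2, "F": 3, "G": 4, "A": 5, "B": 6}


def note_to_diatonic(name):
    """Code a natural note so that C3 = 1, D3 = 2, ..., C4 = 8, ..., C5 = 15."""
    letter, octave = name[0].upper(), int(name[1:])
    return 7 * (octave - 3) + DEGREE[letter] + 1


def steps(codes):
    """Signed change between each pair of adjacent notes."""
    return [b - a for a, b in zip(codes, codes[1:])]


lau_notes = "C4 E4 G4 G4 A4 B4 C5".split()
lau_chromatic = [60, 64, 67, 67, 69, 71, 72]  # MIDI coding from Table 3.15
lau_diatonic = [note_to_diatonic(n) for n in lau_notes]

print(steps(lau_chromatic))
print(steps(lau_diatonic))
```

The two step lists share the same signs, so the direction of the contour is the same under both codings; only the step sizes differ.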

Basic Statistics / Indices for Scales

The first attempt at modeling considered simple statistics that described the
overall sequence based on both the chromatic (MIDI) and diatonic coding schemes. The
mean (arithmetic average), median, and mode were calculated, as well as the standard
deviation (SD), maximum note, minimum note, and absolute range (maximum –
minimum). A mode could not be found for three of the four linear sequences, so the final
note was used as the "mode" because the final note was held longer than the others, and
tended to be labeled as the mode for other sequences as well. In fact, since non-linear
sequences all began and ended on the same note, the mode ended up representing the
final note for 15 of the 16 sequences (sequence 1:Lau was the only exception). Tables 3.17
and 3.18 contain the values resulting from the calculation of all these indices (statistics)
for the MIDI and diatonic coding schemes. These basic indices would then be used as
predictors in the subsequent analyses.
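
These indices can be computed with Python's standard `statistics` module. The sketch below is illustrative only (the helper name `describe` is hypothetical) and assumes the sample standard deviation (n - 1 in the denominator), which matches the SD values in the tables.

```python
from statistics import mean, median, multimode, stdev


def describe(codes):
    """Simple descriptive indices for one coded sequence."""
    return {
        "mean": round(mean(codes), 2),
        "median": median(codes),
        "modes": multimode(codes),      # every value tied for most frequent
        "sd": round(stdev(codes), 2),   # sample SD (n - 1 denominator)
        "min": min(codes),
        "max": max(codes),
        "range": max(codes) - min(codes),
    }


lau = [60, 64, 67, 67, 69, 71, 72]   # 1:Lau, chromatic coding
ldu = [72, 69, 67, 65, 64, 62, 60]   # 2:Ldu, chromatic coding

print(describe(lau))
print(describe(ldu))
```

For 2:Ldu every note occurs exactly once, so `multimode` returns all seven values; this is the situation in which the final note was substituted for the mode.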

Table 3.17: Simple Statistics per Sequence based on the chromatic (MIDI) coding.

Sequence  Mean   Median  Mode  SD    Min  Max  Range
1:Lau     67.14  67      67    4.14  60   72   12
2:Ldu     65.57  65      60    4.12  60   72   12
3:Lal     53.86  53      60    4.45  48   60   12
4:Ldl     53.57  53      48    4.12  48   60   12
5:Rau     64.57  64      60    4.39  60   72   12
6:Rdu     68.86  69      72    3.02  64   72   8
7:Ral     51.14  52      48    2.61  48   55   7
8:Rdl     54.57  55      60    4.39  48   60   12
9:Na      59.71  60      60    2.98  55   64   9
10:Nd     59.71  60      60    5.06  52   67   15
11:Mu     62.14  62      60    2.61  60   67   7
12:Mm     60.43  60      60    3.05  55   64   9
13:Ml     50.43  50      48    2.70  48   55   7
14:Wu     70.00  71      72    2.31  67   72   5
15:Wm     59.00  60      60    3.16  55   64   9
16:Wl     58.00  59      60    2.31  55   60   5

Table 3.18: Simple Statistics per Sequence based on the diatonic coding

Sequence  Mean   Median  Mode  SD    Min  Max  Range
1:Lau     12.00  12      12    2.38  8    15   7
2:Ldu     11.14  11      8     2.41  8    15   7
3:Lal     4.29   4       8     2.56  1    8    7
4:Ldl     4.14   4       1     2.41  1    8    7
5:Rau     10.57  10      8     2.57  8    15   7
6:Rdu     13.00  13      15    1.83  10   15   5
7:Ral     2.71   3       1     1.50  1    5    4
8:Rdl     4.71   5       8     2.63  1    8    7
9:Na      7.71   8       8     1.70  5    10   5
10:Nd     7.71   8       8     2.98  3    12   9
11:Mu     9.14   9       8     1.46  8    12   4
12:Mm     8.14   8       8     1.68  5    10   5
13:Ml     2.29   2       1     1.50  1    5    4
14:Wu     13.71  14      15    1.38  12   15   3
15:Wm     7.29   8       8     1.80  5    10   5
16:Wl     6.71   7       8     1.38  5    8    3

Predicting Responses from Basic Indices

Separate linear regressions were performed to determine the extent to which these
basic indices could explain the variation in responses for each affect scale. In each
bivariate regression analysis, one index (computed for each of the 16 sequences) was used as the
independent variable (IV) and the rating on one affect scale was used as the dependent
variable (DV). From the previous ANOVAs, the maximum amount of variance that a
model could potentially explain (η2) was known. In regression, R2 is the corresponding
measure of proportion of variance explained. The closer R2 is to η2, the better the model.
Each measure of central tendency (the Mean, Median & Mode) did an excellent
job at explaining the variation in responses (see Table 3.19).

Table 3.19 Obtained R2 from Regressions for Mean, Median & Mode

R2 values
Mean Median Mode
Group 1 chr. dia. chr. dia. chr. dia. Eta squared (η2)
cont-joy1 .542 .541 .541 .541 .481 .480 .574
hes-conf1 .256 .255 .254 .255 .274 .274 .312
pens-play .545 .544 .544 .546 .504 .503 .581
del-str .133 .133 .130 .132 .089 .089 .147
irr-calm .308 .307 .301 .303 .246 .245 .335
exci-bore .574 .574 .569 .571 .552 .551 .629

Group 2 chr. dia. chr. dia. chr. dia. Eta squared (η2)
cont-joy2 .522 .521 .521 .519 .521 .520 .589
hes-conf2 .175 .174 .171 .171 .188 .187 .235
pass-agr .013 .012 .014 .013 .026 .026 .055
ques-ans .006 .005 .006 .006 .042 .042 .111
sur-exp .344 .343 .349 .344 .362 .361 .420
ener-tran .455 .454 .453 .453 .463 .462 .519

Notes: All tests with R2 > .007 were significant at p < .01, otherwise p < .05
"chr." stands for chromatic, "dia." stands for diatonic

The regressions (Table 3.19) indicated that the Mean and Median were explaining
the same amount of variance for each affect scale, with the Mean possibly doing a little
better than the Median. The Mode also did very well. For Group 1, the Mean
outperformed the Mode (explained more variance) in all except one case. However, for
Group 2 the Mode appeared to outperform the Mean.
The extreme values for sequences (Minimums and Maximums) also did a very
good job (see Table 3.20). The measures of Standard Deviation and Range, however,
both failed to explain any significant variation in the responses (see Table 3.21).

Table 3.20: R2 from Regressions based on the Minimum and Maximum.

R2 values
Min Max
Group 1 chr. dia. chr. dia. Eta squared (η2)
cont-joy1 .483 .478 .506 .505 .574
hes-conf1 .225 .224 .242 .243 .312
pens-play .486 .481 .509 .505 .581
del-str .120 .119 .135 .134 .147
irr-calm .274 .270 .297 .296 .335
exci-bore .507 .502 .536 .532 .629

Group 2 chr. dia. chr. dia. Eta squared (η2)

cont-joy2 .463 .459 .468 .468 .589
hes-conf2 .149 .149 .166 .168 .235
pass-agr .009 .009 .008 .008 .055
ques-ans .002 .001 .002 .002 .111
sur-exp .300 .295 .307 .309 .420
ener-tran .396 .391 .410 .410 .519

Notes: R2 > .008 were significant (p < .01), and R2 > .004 significant (p < .05)
Only the ques-ans scale was not significant (R2 values < .003)
"chr." stands for chromatic, "dia." stands for diatonic

Table 3.21: R2 from Regressions based on the Standard Deviation and Range.

R2 values
SD Range
Group 1 Both Both Eta squared (η2)
cont-joy1 .000 .000 .574
hes-conf1 .002 .000 .312
pens-play .000 .000 .581
del-str .000 .001 .147
irr-calm .001 .000 .335
exci-bore .001 .000 .629

Group 2 Both Both Eta squared (η2)

cont-joy2 .000 .000 .589
hes-conf2 .003 .001 .235
pass-agr .000 .000 .055
ques-ans .001 .000 .111
sur-exp .000 .000 .420
ener-tran .000 .000 .519
Notes: None of the R2 values were significant
"Both" means the diatonic and chromatic scales had the same values

Note that Table 3.21 shows that neither the Standard Deviation (SD) nor the
Range of the sequences predicted any of the variance in responses.
Since computation of the mode resulted in the value of the "Final Note" for 15 of
the 16 sequences, and the previous analysis suggested the importance of the Mode, an
additional regression was performed using just the value of the final note as a predictor
(Table 3.22). Not surprisingly, that analysis yielded values very similar to that of the
Mode, with the Final Note appearing to be a better predictor in almost all cases.

Table 3.22: R2 from Regressions based on Final Note

R2 values
Group 1 chr. dia. Eta squared (η2)
cont-joy .496 .496 .574
hes-conf .282 .282 .312
pens-play .508 .508 .581
del-str .089 .089 .147
irr-calm .245 .245 .335
exci-bore .556 .556 .629

Group 2 chr. dia. Eta squared (η2)

cont-joy .540 .540 .589
hes-conf .198 .198 .235
pass-agr .028 .028 .055
ques-ans .046 .046 .111
sur-exp .372 .372 .420
ener-tran .477 .477 .519
Notes: All R2 values significant with p < .01
"chr." stands for chromatic, "dia." stands for diatonic

Overall, the indices measuring the central tendency appeared to do the best
(highest R2 values). The Final Note also appeared to do quite well, especially in Group 2.
In general, the extremes (Minimum and Maximum note) did not do as well. Of these, the
Maximum note appeared to be a better predictor (suggesting that higher notes were more
salient). The Standard Deviation and Range were not at all useful in explaining the
variation in responses. Also, it was noted that there was no clear advantage for the
diatonic over the chromatic coding scheme. Both schemes appeared to yield equivalent results.

Comparison Table (% of variance explained)
In order to determine the relative quality of each model, the obtained η2 and R2 were
compared. For each affect scale, η2 indicated the total explainable variance, and R2
indicated how much of the variance a particular index explained. Thus the percentage of the
explainable variance accounted for by an index could be calculated as R2 divided by η2. For
example, for the scale pensiveness–playfulness, η2 was .581 (Table 3.14), and the
Mean from the MIDI coding yielded an R2 of .545 (Table 3.19). Thus, the Mean for this
scale could explain R2 / η2 = .545/.581 = 93.8% of the explainable variance. The
calculations for each affect scale were done with the best three predictors (the Mean, the
Maximum note, and the Final Note) and the results are presented in Table 3.23.

Table 3.23: % of Variance accounted for by the three Best Predictors: mean, max and
final note

Words       Eta Squared    % of Variance Explained (R2 / η2)    Best Predictor
Group 1     η2             Mean    Max    Final Note            Predictor    Best %
cont-joy1 .574 94.4% 88.2% 86.4% Mean 94.4%
hes-conf1 .312 82.1% 77.6% 90.4% Final Note 90.4%
pens-play .581 93.8% 87.6% 87.4% Mean 93.8%
del-str .147 90.5% 91.8% 60.5% Max 91.8%
irr-calm .335 91.9% 88.7% 73.1% Mean 91.9%
exci-bore .629 91.3% 85.2% 88.4% Mean 91.3%

Group 2 η2 Mean Max Final Note Predictor Best %

cont-joy2 .589 88.6% 79.5% 91.7% Final Note 91.7%
hes-conf2 .235 74.5% 70.6% 84.3% Final Note 84.3%
pass-agr .055 23.6% 14.5% 50.9% Final Note 50.9%
ques-ans .111 5.4% 1.8% 41.4% Final Note 41.4%
sur-exp .420 81.9% 73.1% 88.6% Final Note 88.6%
ener-tran .519 87.7% 79.0% 91.9% Final Note 91.9%

Note: These values were calculated on the basis of the MIDI coding (diatonic values were virtually identical).

For Group 1 the Mean explained the greatest amount of variance for four of the
six affect scales (contentment–joy, pensiveness–playfulness, irritation–calmness, and
excitement–boredom). The Maximum only out-performed the Mean for the affect scale
delicacy–strength. In Group 1, the Final Note only outperformed the Mean for the scale
hesitation–confidence. However, for Group 2, the best predictor was the Final Note for
all six affect scales. Even the affect scale contentment–joy was better predicted by the
Final Note than the Mean in Group 2. This result suggested that participants in Group 1
and Group 2 might have been listening for different things.
For Group 1, most of the possible explainable variance could be explained by
these three crude indices with the percentage of explained values ranging from 90.4% to
94.4%. For Group 2, these indices did not do as well with the Final Note only explaining
41.4% of the variation in questioning–answering and 50.9% of the variation in passivity–
aggression. Low percentage of variance accounted for was actually a good thing in this
case. Our expectation was that more complex characterizations of the data, such as
melodic contour, would be necessary to explain the variation in responses. The fact that
Group 1 had such high percentage of variance explained suggested that complex models
of melodic contour were unnecessary but low percentage explained values in Group 2
necessitated further analyses with more complex models.
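
The percentage-of-explainable-variance comparison used in this section amounts to dividing each R2 by the corresponding η2 and picking the largest ratio. A minimal sketch, using three Group 1 rows copied from Tables 3.19, 3.20 and 3.22 (chromatic coding); the dictionary layout and function names are illustrative only.

```python
# scale -> (eta^2, {predictor: R^2}); values copied from the tables above.
scales = {
    "hes-conf1": (0.312, {"Mean": 0.256, "Max": 0.242, "Final Note": 0.282}),
    "pens-play": (0.581, {"Mean": 0.545, "Max": 0.509, "Final Note": 0.508}),
    "del-str": (0.147, {"Mean": 0.133, "Max": 0.135, "Final Note": 0.089}),
}


def percent_explained(r2, eta2):
    """Share of the explainable variance (eta^2) captured by a predictor."""
    return 100 * r2 / eta2


def best_predictor(eta2, r2_by_predictor):
    """Return (name, percent) for the predictor with the highest R^2."""
    name = max(r2_by_predictor, key=r2_by_predictor.get)
    return name, round(percent_explained(r2_by_predictor[name], eta2), 1)


for scale, (eta2, r2s) in scales.items():
    print(scale, best_predictor(eta2, r2s))
```

For these three scales the sketch selects the Final Note, the Mean and the Maximum respectively, in line with Table 3.23.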

Modeling with Polynomials

Fourth order polynomials were used to characterize the shape of the sequences
(cf. Beard, 2003). A fourth order polynomial follows the form:
$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$$
where a1 to a4 are called the coefficients (or the constant multipliers) and a0 is a special
constant often called the y-intercept. For the current sequences, y would indicate the
value corresponding to the note’s pitch height (i.e. chromatic: 60 for C4), and the x-term
would indicate the note’s ordinal position: 1, 2, 3 … 7. The polynomial equation remains
the same for all sixteen sequences with only the coefficients changing (a0 to a4).
For each sequence, different coefficients capture different aspects of the music.
For example, the first coefficient a0 is simply an indication of the vertical distance along
the y-axis; it represents the value of y at the point at which x = 0, and is often called the
y-intercept. This y-intercept or a0 thus captures information about pitch height, and therefore, yields
values similar to the mean or median of the sequence. The rest of the coefficients,
however, capture information about the shape of the melodic contour completely
independent from pitch height. For example, the coefficient a1 captures information about
the presence of a linear term in the shape. That is, a1 captures the degree to which the
shape of the sequence could be captured by a straight line. The a2 coefficient similarly
captures the degree of arch present in the sequence, with a large positive a2 indicating a
strong inverted arch, a large negative a2 indicating a strong regular arch, and values of a2
close to zero indicating no or only a slight arch shape. The other terms (a3 and a4) capture
more complex shapes that are necessary for the modeling of the N-shaped and W-shaped
sequences.
For each analysis, the R2 values provided an indicator of the match between the
polynomial equation and the actual musical sequence. Note that the polynomial is only an
approximation to the shape. It is an idealization that captures the essential information.
The closer the R2 value is to 1, the better the polynomial equation matches the actual musical
sequence. For this modeling, the temporal pattern representing the relative time of the
presentation of each note within each musical sequence was centered around zero (x = -3,
-2, -1, 0, 1, 2, 3), rather than starting with the first note (e.g., x = 1, 2, 3, 4, 5, 6, 7).
Centering the temporal pattern around zero is simply a mathematical manipulation that
reduces the correlation between even and odd powers (Cohen & Cohen, 1983, pp. 237-
238), which helps to clarify the interpretation of the analysis.
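
The effect of centering can be checked directly for the seven note positions. This is a small sketch (the `pearson` helper is written out here rather than taken from any library) comparing the correlation between x and x² with and without centering.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


raw = [1, 2, 3, 4, 5, 6, 7]
centered = [-3, -2, -1, 0, 1, 2, 3]

r_raw = pearson(raw, [x ** 2 for x in raw])
r_centered = pearson(centered, [x ** 2 for x in centered])

print(f"x = 1..7:  r(x, x^2) = {r_raw:.3f}")
print(f"x = -3..3: r(x, x^2) = {r_centered:.3f}")
```

With positions 1 to 7 the linear and quadratic terms are almost collinear, whereas with centered positions they are uncorrelated, which is why the linear and arch coefficients can be interpreted separately.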
For the set of sixteen sequences, a power series was used to solve for polynomial
coefficients on both the chromatic and diatonic coding schemes. For example, for the
sequence 8:Rdl the basic shape is that of an inverted arch (or u). When fitted with a
polynomial using the diatonic coding the resulting equation was:
$y = 1.5628 - 0x + 1.053x^2 - 0x^3 - 0.0379x^4$ (see Figure 3.17).

[Figure 3.17 plot: Y (pitch height) against X (position centered at 0) for the notes of 8:Rdl, with the fitted curve $y = 1.5628 - 0x + 1.053x^2 - 0x^3 - 0.0379x^4$, R2 = 0.9823.]

Figure 3.17: Inverted Arch 8:Rdl modeled by a fourth order polynomial (using diatonic coding).
Note that the equation contained a very small (approximately 0) a1 term since, on
average, the sequence was flat in pitch. However the equation contained a large a2 term
since the overall shape was that of a u. The a2 term was positive because the shape was an
upward-opening u (i.e., a negative a2 would capture a regular arch). Note the high R2 value indicated a
good match between the equation and the data (notes).
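
The fit for 8:Rdl can be reproduced with an ordinary least-squares polynomial fit on the centered positions. The thesis does not name the software used, so the sketch below is an assumption about the procedure rather than a record of it; `polyfit` and `r_squared` are illustrative pure-Python helpers.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [a0, a1, ..., a_degree]."""
    n = degree + 1
    # Normal equations (X^T X) a = X^T y, where X[i][j] = xs[i] ** j.
    rows = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    rhs = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    m = [row + [b] for row, b in zip(rows, rhs)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys reproduced by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    fitted = [sum(a * x ** k for k, a in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


XS = [-3, -2, -1, 0, 1, 2, 3]            # centered note positions
RDL_DIATONIC = [8, 5, 3, 1, 3, 5, 8]     # 8:Rdl, diatonic coding
WU_DIATONIC = [15, 13, 12, 15, 12, 14, 15]  # 14:Wu, diatonic coding

rdl_coeffs = polyfit(XS, RDL_DIATONIC, 4)
print([round(a, 4) for a in rdl_coeffs], round(r_squared(XS, RDL_DIATONIC, rdl_coeffs), 4))
```

Under this assumption the coefficients and R2 for 8:Rdl match the reported equation; the same call with `WU_DIATONIC` gives the fit for 14:Wu.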
As another example, for the sequence 14:Wu, the basic shape is that of a W.
When fitted with a polynomial (using the diatonic coding), the resulting equation was:
$y = 13.442 + 0.2659x - 0.3295x^2 - 0.0278x^3 + 0.0568x^4$ (see Figure 3.18).

[Figure 3.18 plot: Y (pitch height) against X (position centered at 0) for the notes of 14:Wu, with the fitted curve $y = 13.442 + 0.2659x - 0.3295x^2 - 0.0278x^3 + 0.0568x^4$, R2 = 0.4924.]

Figure 3.18: W-shape sequence 14:Wu modeled by a fourth order polynomial.

Note that the equation had a comparatively strong a2 term (–0.3295) resulting in an overall
inverted arch (u) shape. The equation also had a weak a1 term (0.2659) indicating that the
sequence had a tendency to rise / ascend. However, for this example the equation was not
a very good approximation of the sequence (R2 = .492). The middle notes in particular
(e.g. C5) were not very close to the line, and the shape of the W was much more obvious
in data (actual notes) than in the polynomial approximation.
The same procedure was applied to all sixteen musical sequences. The resulting
coefficients per sequence for both the chromatic and diatonic coding system can be found in
Tables 3.24 and 3.25.

Table 3.24: Resulting coefficients per sequence when chromatic scaling is used
Chromatic (MIDI)
Name Sequence a0 a1 a2 a3 a4 R2
1:Lau C4 E4 G4 G4 A4 B4 C5 67.5 1.27 0.21 0.08 -0.04 0.991
2:Ldu C5 A4 G4 F4 E4 D4 C4 65.2 -1.50 0.08 -0.06 0.00 0.999
3:Lal C3 D3 E3 F3 G3 B3 C4 53.0 2.04 0.59 0.00 -0.05 0.992
4:Ldl C4 A3 G3 F3 E3 D3 C3 53.2 -1.50 0.08 -0.06 0.00 0.999
5:Rau C4 G4 C5 G4 E4 D4 C4 68.0 -3.17 -0.72 0.36 -0.02 0.939
6:Rdu C5 B4 A4 G4 E4 G4 C5 66.4 -2.21 0.60 0.25 0.00 0.966
7:Ral C3 E3 G3 F3 E3 D3 C3 53.6 -1.22 -0.60 0.14 0.00 0.963
8:Rdl C4 G3 E3 C3 E3 G3 C4 49.3 0.00 1.80 0.00 -0.07 0.966
9:Na C4 E4 D4 C4 A3 G3 C4 59.8 -3.54 -0.18 0.39 0.02 0.971
10:Nd C4 G3 E3 C4 G4 E4 C4 59.8 5.85 -0.18 -0.67 0.02 0.881
11:Mu C4 G4 C4 E4 D4 D4 C4 61.7 -0.87 1.08 0.08 -0.14 0.451
12:Mm C4 E4 C4 G3 C4 E4 C4 56.1 0.00 3.36 0.00 -0.33 0.947
13:Ml C3 G3 E3 D3 C3 E3 C3 49.0 -1.72 1.97 0.19 -0.23 0.930
14:Wu C5 A4 G4 C5 G4 B4 C5 69.3 0.53 -0.30 -0.06 0.07 0.439
15:Wm C4 G3 E4 C4 G3 B3 C4 60.5 -1.01 -1.45 0.14 0.16 0.254
16:Wl C4 A3 G3 B3 C4 G3 C4 58.8 0.62 -1.41 -0.08 0.17 0.593

Table 3.25: Resulting coefficients per sequence when diatonic scaling is used
Name Sequence a0 a1 a2 a3 a4 R2
1:Lau C4 E4 G4 G4 A4 B4 C5 12.3 0.68 -0.02 0.06 -0.01 0.988
2:Ldu C5 A4 G4 F4 E4 D4 C4 11.0 -0.91 -0.05 -0.03 0.01 1.000
3:Lal C3 D3 E3 F3 G3 B3 C4 3.9 1.18 0.21 0.00 -0.02 0.997
4:Ldl C4 A3 G3 F3 E3 D3 C3 4.0 -0.91 -0.05 -0.03 0.01 1.000
5:Rau C4 G4 C5 G4 E4 D4 C4 12.5 -1.95 -0.44 0.22 -0.01 0.932
6:Rdu C5 B4 A4 G4 E4 G4 C5 11.5 -1.22 0.28 0.14 0.01 0.947
7:Ral C3 E3 G3 F3 E3 D3 C3 4.3 -0.73 -0.47 0.08 0.01 0.957
8:Rdl C4 G3 E3 C3 E3 G3 C4 1.6 0.00 1.05 0.00 -0.04 0.982
9:Na C4 E4 D4 C4 A3 G3 C4 7.8 -2.02 -0.18 0.22 0.02 0.978
10:Nd C4 G3 E3 C4 G4 E4 C4 7.8 3.40 -0.18 -0.39 0.02 0.859
11:Mu C4 G4 C4 E4 D4 D4 C4 8.8 -0.57 0.67 0.06 -0.08 0.478
12:Mm C4 E4 C4 G3 C4 E4 C4 5.8 0.00 1.81 0.00 -0.17 0.917
13:Ml C3 G3 E3 D3 C3 E3 C3 1.4 -0.99 1.11 0.11 -0.13 0.942
14:Wu C5 A4 G4 C5 G4 B4 C5 13.4 0.27 -0.33 -0.03 0.06 0.492
15:Wm C4 G3 E4 C4 G3 B3 C4 8.2 -0.62 -0.94 0.08 0.10 0.328
16:Wl C4 A3 G3 B3 C4 G3 C4 7.1 0.43 -0.79 -0.06 0.10 0.622
Example of a good polynomial fit for "8Rdl" (a) and a less good polynomial fit for
"14Wu" (b).

As expected, fourth order polynomials captured the shape of the melodic contours
for virtually all sequences, except for the four complex sequences 11:Mu, 14:Wu,
15:Wm, and 16:Wl which had R2s ranging from R2 = .254 (15:Wm with MIDI) to R2 =
.622 (16:Wl with diatonic). In general, the polynomials derived based on the diatonic
coding scheme appeared to work slightly better. In other words, better polynomial fits
(higher R2s) appeared to be obtained when the diatonic coding scheme was used,
especially for linear, W, and M shaped contours.
Those familiar with regression analysis will know that since all sequences
contained 7 attack points (8 notes) with 6 intervals between them, a sixth order
polynomial would have provided a perfect fit to the shape of the melodic contour.
However, since the most complicated contours we defined were M and W-shape contours
with only 3 changes in direction, fourth order polynomials seemed adequate. In fact, in
many cases, a sixth order polynomial although giving a perfect fit (R2 = 1) resulted in an
equation that did not make sense in the context of the experiment. Figure 3.19 shows an
example of a sixth order approximation of the sequence 14:Wu.



[Figure 3.19 plot: Y (pitch height) against X (position centered at 0) for the notes of 14:Wu, with the fitted curve $y = 15 - 0.15x - 4.275x^2 + 0.1667x^3 + 1.375x^4 - 0.0167x^5 - 0.1x^6$, R2 = 1.]

Figure 3.19: A sixth order polynomial fit for the sequence 14:Wu. Pitch height was coded
using the diatonic coding scheme.

Note from the figure (3.19) that the polynomial equation described the sequence
as having an arch shape between the first two notes (C5 and A4) and last two notes (B4
and C5). However, there is no chance a participant would perceive an arch to occur
between those notes. That is, the equation gives an odd solution for the sake of fitting the
data. This was one of the reasons fourth order and not sixth order polynomials were used.
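
The overshoot can be seen by evaluating the sixth order equation reported in Figure 3.19 both at the note positions and halfway between the outer notes. The sketch uses the rounded coefficients as printed, so the values at the note positions are only approximately exact; the variable names are illustrative.

```python
# Coefficients a0..a6 as printed in Figure 3.19 (rounded in the source).
COEFFS = [15, -0.15, -4.275, 0.1667, 1.375, -0.0167, -0.1]
XS = [-3, -2, -1, 0, 1, 2, 3]          # centered note positions
NOTES = [15, 13, 12, 15, 12, 14, 15]   # 14:Wu, diatonic coding


def evaluate(coeffs, x):
    """Evaluate a0 + a1*x + a2*x^2 + ... at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Error at each note position: close to zero, i.e. an (almost) perfect fit.
errors = [abs(evaluate(COEFFS, x) - y) for x, y in zip(XS, NOTES)]

# Curve height halfway between the first two and the last two notes.
between_first_two = evaluate(COEFFS, -2.5)
between_last_two = evaluate(COEFFS, 2.5)

print(max(errors), between_first_two, between_last_two)
```

Between the outer pairs of notes the curve rises above 15, the highest note in the sequence, which is the spurious arch described above.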

Results from Polynomial modeling

Multiple linear regressions were conducted using the coefficients (a0 to a4) from
the fourth order polynomial approximations as five IVs attempting to predict ratings
within each affect scale separately. The coefficients a1 to a4 taken together provided
information as to shape of the melody. The y-intercept or coefficient a0 is essentially the
same as average pitch height (related to the mean pitch height). Since the approximations
were centered at zero, a0 for each sequence provided approximately the value of the
middle (fourth) note in each sequence. Regressions were done with both the chromatic
and diatonic coding schemes and the results indicated no substantial benefit of one
system over the other. There did, however, appear to be a slight advantage for the
diatonic system over the chromatic system, and thus all statistics and tables presented in
the following section are based on the diatonic results (reporting chromatic
values would have been redundant).

Contour vs Pitch Height

The combination of all 4 shape / contour coefficients (a1 to a4) significantly
predicted responses for all affect scales except scales delicacy–strength and irritation–
calmness (both from Group 1). The amount of variance these coefficients explained (R2),
however, was quite small in Group 1 with the contour coefficients explaining the most
variance for scales hesitation–confidence and excitement–boredom in Group 1 (R2 =
0.089 for both when using the diatonic system). For Group 2, however, the contour
coefficients did significantly better. For Group 2 the highest obtained R2s were for
contentment–joy and for questioning–answering (diatonic R2 = .116 and R2 = .100
respectively). Again, these obtained R2 values were generally quite low compared to
values obtained by other predictors (i.e., Mean, Median, Mode, Maximum, and Final
Note). The two exceptions were the scales passivity–aggression and questioning–
answering. For these two affect scales the contour coefficients did a much better job of
predicting the variance than any previous predictors had (R2 = .032, and R2 = .100
respectively). That is, the contour coefficients taken together explained 58.2% of the
explainable variance for passivity–aggression (compared to the 50.9% explained by the
Final Note) and 90.1% of the variance for questioning-answering (compared to 41.4%).
Table 3.26 contains all the R2 values from the analysis of contour coefficients with the
percentage explained from the best predictors of Table 3.23 for comparison.

Table 3.26: Proportion of variance explained by "shape", contrasted with previous
measures.

Group  Words      η2    Previous Best Predictor    Contour Coefficients
                        Predictor      %           R2 (Y.a1,a2,a3,a4)   %
1 cont-joy1 .574 Mean 94.4 .071*** 12.4
hes-conf1 .312 Final Note 90.4 .089*** 28.5
pens-play .581 Mean 93.8 .070*** 12.0
del-str .147 Max 91.8 .005 3.4
irr-calm .335 Mean 91.9 .011 3.3
exci-bore .629 Mean 91.3 .089*** 14.1
2 cont-joy2 .589 Final Note 91.7 .116*** 19.7
hes-conf2 .235 Final Note 84.3 .071*** 30.2
pass-agr .055 Final Note 50.9 .032*** 58.2
ques-ans .111 Final Note 41.4 .100*** 90.1
sur-exp .420 Final Note 88.6 .090*** 21.4
ener-tran .519 Final Note 91.9 .099*** 19.1
Note: ***p < .001, **p < .01, *p < .05
η2 refers to the proportion of variance explained when analyzed using ANOVA

The pitch height coefficient (a0) was also considered (Table 3.27). For both
Groups 1 and 2 the R² values obtained from this analysis were generally very high and
significant; again, the only exceptions were the scales passivity-aggression and
questioning-answering. That a0 was a successful predictor was not very surprising, as it
represented the approximate middle of each sequence, and thus was similar to the Mean
as a predictor. It was interesting to note, however, that the influence of pitch height
appeared to be stronger for Group 1 than for Group 2. This was evident when dividing
the R² obtained from the pitch height regression by the R² obtained from the contour
regression:

\[ M_a = \frac{R^2_{Y.a_0}}{R^2_{Y.a_1,a_2,a_3,a_4}} \]

where R²Y.a0 was the R² obtained from the a0 regression, R²Y.a1,a2,a3,a4 the R² obtained
from the a1 to a4 regression, and Ma indicated the magnitude of difference, or ratio
comparison, between the obtained R² values. This magnitude score (Ma) was included in
the subsequent analysis of pitch height (a0). See Table 3.27.

Table 3.27: Proportion of variance explained by "pitch height" (a0), contrasted with
previous measures.

                          Previous Best Predictor   Pitch Height Coefficient
Group  Words      η²      Predictor     %           R²Y.a0    %      Ma
1      cont-joy1  .574    Mean          94.4        .474**    82.6    6.7
       hes-conf1  .312    Final Note    90.4        .222**    71.2    2.5
       pens-play  .581    Mean          93.8        .463**    79.7    6.6
       del-str    .147    Max           91.8        .119**    81.0   23.8
       irr-calm   .335    Mean          91.9        .261**    77.9   23.7
       exci-bore  .629    Mean          91.3        .491**    78.1    5.5
2      cont-joy2  .589    Final Note    91.7        .451**    76.6    3.9
       hes-conf2  .235    Final Note    84.3        .163**    69.4    2.3
       pass-agr   .055    Final Note    50.9        .008*     14.5    0.3
       ques-ans   .111    Final Note    41.4        .000       0.0    0.0
       sur-exp    .420    Final Note    88.6        .295**    70.2    3.3
       ener-tran  .519    Final Note    91.9        .380**    73.2    3.8

Notes: **p < .01, *p < .05
η² refers to the proportion of variance explained when analysed using ANOVA

Note that for Group 1 pitch height was on average an 11.5 times better predictor
than contour (Ma-scores range = 2.5 to 23.8), but for Group 2 it was on average only a
2.3 times better predictor (Ma-scores range = 0 to 3.9). These differences were largely
driven by the fact that contour appeared to have no predictive value for
delicacy-strength and irritation-calmness and large predictive value for
passivity-aggression and questioning-answering. It is also worth noting that the
influence of pitch height appeared to be larger for Group 1 on contentment-joy
(Ma = 6.7 compared to 3.9) but not for hesitation-confidence (Ma = 2.5 compared to 2.3).
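
The group averages quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the magnitude score is simply the ratio of the two reported R² values and that the group average is taken over unrounded ratios; the variable and function names are illustrative only, and the R² values are copied from Tables 3.26 and 3.27.

```python
from statistics import mean

# R squared for pitch height (a0) and for contour (a1 to a4), by group,
# in the row order of Tables 3.26 and 3.27
r2_pitch = {1: [.474, .222, .463, .119, .261, .491],
            2: [.451, .163, .008, .000, .295, .380]}
r2_contour = {1: [.071, .089, .070, .005, .011, .089],
              2: [.116, .071, .032, .100, .090, .099]}

def magnitude_scores(pitch, contour):
    """Ma for each scale: pitch-height R squared divided by contour R squared."""
    return [p / c for p, c in zip(pitch, contour)]

ma = {g: magnitude_scores(r2_pitch[g], r2_contour[g]) for g in (1, 2)}
group_means = {g: round(mean(ma[g]), 1) for g in (1, 2)}
print(group_means)
```

Because two of the Group 1 contour R² values are close to zero, their ratios (about 24) dominate the Group 1 average, which is the pattern described in the text.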

The best result, of course, was obtained by considering both pitch height and
contour coefficients together (using all the coefficients). The obtained R² values from
this analysis ranged from .043 to .615 and all were highly significant. Table 3.28 shows
the R² values obtained from the contour coefficients regression (R²Y.a1,a2,a3,a4), the
pitch height coefficient regression (R²Y.a0) and all the coefficients together
(R²Y.a0,a1,a2,a3,a4) for comparison. The percentage of variance explained by the full
(all coefficients) regression is also shown alongside the percentage of variance
explained by the best previous predictors (i.e. mean, max and final note). See Table 3.28.

Table 3.28: Proportion of variance explained by all regression coefficients, contrasted
with previous measures.

Grp  Words      η²     R²Y.a1,a2,a3,a4   R²Y.a0   R²Y.a0,a1,a2,a3,a4   %      Previous %
1    cont-joy1  .574   .071**            .474**   .566**               98.6   94.4
     hes-conf1  .312   .089**            .222**   .299**               95.8   90.4
     pens-play  .581   .070**            .463**   .570**               98.1   93.8
     del-str    .147   .005              .119**   .135**               91.8   91.8
     irr-calm   .335   .011              .261**   .322**               96.1   91.9
     exci-bore  .629   .089**            .491**   .615**               97.8   91.3
2    cont-joy2  .589   .116**            .451**   .582**               98.8   91.7
     hes-conf2  .235   .071**            .163**   .214**               91.1   84.3
     pass-agr   .055   .032**            .008*    .043**               78.2   50.9
     ques-ans   .111   .100**            .000     .103**               92.8   41.4
     sur-exp    .420   .090**            .295**   .393**               93.6   88.6
     ener-tran  .519   .099**            .380**   .511**               98.5   91.9

Notes: **p < .01, *p < .05
η² refers to the proportion of variance explained when analyzed using ANOVA
“Previous %” refers to the proportion of the variance explained by the best previous
predictor

For virtually all affect scales this regression with the polynomial coefficients was
better than all previous methods for explaining the variation in responses (i.e. the Mean,
Max, and Final Note). The only exception was the affect scale delicacy-strength, for
which the coefficient regression appeared to explain exactly the same amount of variance
(91.8%) as was explained before by the Maximum note. That is, the polynomial method did
as good a job or better at explaining the variation in responses across affect scale by
sequence pairs. The coefficient method explained over 90% of the variance for every
affect scale except passivity-aggression, for which it explained only 78.2%. That this
method did not work as well for passivity-aggression may indicate that the
passivity-aggression scale was sensitive to the number of times a sequence changed
direction (M/W-shape contours). If this was true, then the fact that the fourth order
polynomials had a harder time describing these contours may have weakened the predictive
value of the coefficients (see Tables 3.24 and 3.25, above). However, explaining 78.2%
of the explainable variance was still an improvement over the best previous method,
which only explained 50.9% of the explainable variance.

Correlations of Coefficients to Responses

Each coefficient was correlated with the response pattern, by affect scale. From
these correlations the impact of each coefficient on the ratings could be interpreted. For
example, since a0 coded for pitch height, positive correlations indicated that higher
notes were associated with the second affect word, and lower notes with the first affect
word. The correlations (r-values) and their significance levels are presented in Table
3.29. Each affect scale (word pair) is presented in the orientation in which it was rated
by participants (first word indicated by “1”, and second word indicated by “3”).

Table 3.29: Coefficient (Pearson r) Correlations to Numeric Responses

     Words         Diatonic coefficients
Grp  “1” – “3”     a0         a1         a2         a3      a4
1    cont-joy1     .688***    .042      -.115***    .044    .152***
     hes-conf1     .471***    .019      -.109***    .079*   .152***
     pens-play     .680***    .070*     -.100***    .009    .150***
     del-str      -.345***    .031       .050      -.027   -.058
     irr-calm     -.511***   -.063       .030       .032   -.049
     exci-bore    -.700***   -.067       .094***   -.039   -.137***
2    cont-joy2     .672***    .092**    -.114***    .033    .158***
     hes-conf2     .403***    .068      -.130***    .023    .156***
     pass-agr      .091*      .132***    .003      -.068    .018
     ques-ans     -.019      -.134***   -.076*      .016    .017
     sur-exp      -.543***   -.120***    .108**     .019   -.155***
     ener-tran    -.617***   -.121**     .086*      .010   -.138***

Notes: ***p < .001, **p < .01, *p < .05
Grp is an abbreviation for “Group”

From the correlations (Table 3.29) it was evident a high pitch (a0) conveyed: joy,
confidence, playfulness, delicacy, irritation, excitement, aggression, surprise and energy.
The reverse was therefore true. That is, a low pitch (a0) conveyed: contentment,
hesitation, pensiveness, strength, calmness, boredom, passivity, expectation, and
tranquility. A positive a1 term indicates an ascending sequence, a negative a1 term a
descending sequence. Thus, a positive correlation between a1 and the response pattern
indicates that ascending sequences were associated with the second word in the affect
scale, and descending sequences with the first word. The analysis therefore suggested
that ascending sequences were rated as conveying: joy, playfulness, aggression,
questioning, surprise, and energy. Conversely, descending sequences conveyed the
opposite pattern: contentment, pensiveness, passivity, answering, expectation and
tranquility.
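
The sign logic used to read Table 3.29 can be made explicit with a small sketch. It assumes, as stated above for a0, that a positive correlation pairs high coefficient values with the second word of a scale and a negative correlation pairs them with the first word. The a0 correlations are copied from Table 3.29; the function name and data layout are illustrative only.

```python
# (first word, second word, r for a0, significant?) for each row of Table 3.29
A0_ROWS = [
    ("contentment", "joy", .688, True),
    ("hesitation", "confidence", .471, True),
    ("pensiveness", "playfulness", .680, True),
    ("delicacy", "strength", -.345, True),
    ("irritation", "calmness", -.511, True),
    ("excitement", "boredom", -.700, True),
    ("contentment", "joy", .672, True),
    ("hesitation", "confidence", .403, True),
    ("passivity", "aggression", .091, True),
    ("questioning", "answering", -.019, False),
    ("surprise", "expectation", -.543, True),
    ("energy", "tranquility", -.617, True),
]

def words_for_high_values(rows):
    """Affect words associated with HIGH coefficient values (significant rows only)."""
    words = []
    for first, second, r, significant in rows:
        if not significant:
            continue
        word = second if r > 0 else first  # positive r -> second word
        if word not in words:              # scales shared by both groups appear once
            words.append(word)
    return words

high_pitch_words = words_for_high_values(A0_ROWS)
print(high_pitch_words)
```

Applying the same function to another column of the table only requires swapping in that column's correlations and significance flags.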
The a2 coefficient is interpreted as the arch term. A negative a2 indicates an arch
shape present in the sequence, and a positive a2 indicates an inverted arch present in the
sequence. Thus negative a2 correlations indicated that an arch shape was important to the
second affect word, and that an inverted arch shape was important to the first affect
word. This can be a little complicated to follow, so a figure is provided for
clarification (see Figure 3.20).

Figure 3.20: Demonstration of how a negative correlation for a2 indicates a relationship
between a regular arch and the second affect word (joy in this case).

From this, the inverted arch shape appeared to be important for conveying contentment,
hesitation, pensiveness, boredom, questioning, expectation, and tranquility. Conversely,
the regular arch conveyed joy, confidence, playfulness, excitement, answering, surprise,
and energy (Table 3.29).
Furthermore, the correlations indicated that not only was the arch shape important
to conveying emotional meaning, it appeared to be more important than linear
ascending/descending sequences. Specifically, in the linear (a1) case 6 of 12
correlations were significant, but in the arch (a2) case as many as 9 of 12 correlations
were significant.

Higher order terms

The a3 and a4 terms are harder to interpret. Unlike a0 to a2, which capture relatively
obvious dimensions of the sequences, a3 and a4 can only be interpreted in the context of
all the contour coefficients. Specifically, a third order polynomial is needed, in
conjunction with a1 and a2, to describe N and inverted N-shapes, and a fourth order
polynomial is needed, in conjunction with a1, a2 and a3, to describe M/W shapes. The a3
and a4 terms (alone) do not describe the extent of these shapes. In pure form, the a3 term
is highly correlated with the a1 term. That is, the simple curve y = x³ looks very similar
to y = x. These two equations are depicted by a solid line and a dotted line below
(Figure 3.21).

Figure 3.21: Shows the relationship between the a1 and a3 term. Shown is the straight
line of a1 = 1 or y = x (solid), and a plot of a3 = 1, or y = x³ (dotted). When both a1 and
a3 are combined, for example in the equation y = -2x + .5x³, an N-shape is represented.

Similarly, the a4 term is correlated with the a2 term. That is, the curve y = x⁴ looks very
similar to the curve y = x² (Figure 3.22).

Figure 3.22: Shows the relationship between the a2 and a4 term. Shown is the inverted
arch with a2 = 1 or y = x² (dashed), and a plot of the similarly shaped a4 = 1, or y = x⁴
(dotted). When a2 and a4 are combined a W-shape can be formed, for example
y = -2x² + .4x⁴ (solid).

It is the addition of the higher order terms to the lower order terms that allows the
regression model to follow more complex contours (such as the N or M). That is, the a3
and a4 terms only have meaning within the context of the equation. They should not be
considered in isolation. In Figure 3.21, the curves of y = x and y = x³ are superimposed.
The curve representing y = -2x + .5x³ is also shown (Figure 3.21). Note that the final
curve is very different from the first two and has an N-shape. In Figure 3.22 the curves
of y = x² and y = x⁴ are superimposed with the curve represented by the equation
y = -2x² + .4x⁴, which has a W-shape. Notice that the a3 term (or y = x³) alone does not
capture the N-shape and requires some a1 term to be present (Figure 3.21). Similarly, the
a4 term (or y = x⁴) alone does not capture the W-shape and requires some a2 term to be
present (Figure 3.22).

One should also note that the combination of the different terms is very much
dependent on the range of X used. For example, y = x and y = x² are very similar if X is
confined to the range 0 to 2 or 3.
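
The point that the higher order terms only produce N and W shapes in combination with the lower order terms can be illustrated by counting how often each curve changes direction. The sketch below is illustrative rather than part of the original analysis: it evaluates each polynomial on a grid and counts switches between rising and falling. The plotting range of -2.5 to 2.5 and the function names are assumptions, chosen so that the turning points of the example curves fall inside the range.

```python
def poly(x, coefs):
    """Evaluate a0 + a1*x + a2*x^2 + a3*x^3 + a4*x^4 for coefs = (a0, ..., a4)."""
    return sum(a * x ** k for k, a in enumerate(coefs))

def direction_changes(coefs, lo=-2.5, hi=2.5, n=501):
    """Count how many times the curve switches between rising and falling on [lo, hi]."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [poly(x, coefs) for x in xs]
    # +1 for a rising step, -1 for a falling step; flat steps are ignored
    signs = [1 if b > a else -1 for a, b in zip(ys, ys[1:]) if b != a]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

curves = {
    "y = x":              (0, 1, 0, 0, 0),
    "y = x^3":            (0, 0, 0, 1, 0),
    "y = -2x + .5x^3":    (0, -2, 0, .5, 0),
    "y = x^2":            (0, 0, 1, 0, 0),
    "y = x^4":            (0, 0, 0, 0, 1),
    "y = -2x^2 + .4x^4":  (0, 0, -2, 0, .4),
}
changes = {name: direction_changes(c) for name, c in curves.items()}
print(changes)
```

On this range the pure linear and cubic curves never change direction, the pure quadratic and quartic curves change once (a single inverted arch), the combined cubic changes twice (an N-shape) and the combined quartic changes three times (a W-shape).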
Also, the odd powers (coefficients a1, a3 etc.) are most clearly distinguished from
the even powers (coefficients a2, a4 etc.) when the X values are centered around zero
(Cohen & Cohen, 1983). That was the approach used herein. However, odd powers are
always related to other odd powers and even powers are always related to other even
powers. One consequence of this is that one must be careful when considering each
coefficient in isolation. They are not independent of each other. The correlations
between the coefficients are shown in Table 3.30.
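
The effect of centering can be demonstrated on the predictor terms themselves. The sketch below is a minimal illustration, assuming seven equally spaced note positions (a hypothetical sequence length, not taken from the stimuli). It correlates x with x² and x with x³, before and after centering. Note that it concerns the powers of x, not the fitted coefficients whose inter-correlations appear in Table 3.30.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def power_correlations(xs):
    """Return (r between x and x^2, r between x and x^3)."""
    return (pearson_r(xs, [x ** 2 for x in xs]),
            pearson_r(xs, [x ** 3 for x in xs]))

uncentered = [1, 2, 3, 4, 5, 6, 7]      # hypothetical note positions
centered = [x - 4 for x in uncentered]  # the same positions centered on zero

r12_raw, r13_raw = power_correlations(uncentered)
r12_ctr, r13_ctr = power_correlations(centered)
print(round(r12_raw, 3), round(r12_ctr, 3), round(r13_ctr, 3))
```

With uncentered positions the linear and squared terms are almost collinear, whereas centering removes the correlation between odd and even powers entirely but leaves the odd powers strongly related to each other.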

Table 3.30: Pearson r-values. Coefficient inter-correlations

      a0     a1      a2      a3         a4
a0    ---   -.083   -.426    .110       .372
a1           ---     .000   -.890***    .085
a2                   ---     .015      -.928***
a3                           ---       -.121
a4                                      ---

Notes: N = 16

Note that, because the data were centered on X (i.e., x = 0 was considered the center of the
sequence), there were (expectedly) strong correlations between a1 and a3 and between a2
and a4 (Table 3.30). This relationship was especially evident between a2 and a4 which
correlated to all the same affect scales with the exception of questioning-answering which
only a2 correlated to (Table 3.29). However, this relationship was less evident between a1
and a3 with a1 significant in six cases, and a3 only significant in one, even though these
two coefficients were strongly correlated (r = -.890). For such an analysis, the main point
is that a3 and a4 capture nuances of the curves that go beyond the simple linear and arch

Absolute Values of Coefficients

The absolute values of the coefficients were also calculated and correlated with
responses. This allowed a comparison of the importance of the shape without the confound
of the orientation of that shape. For example, the absolute value of a1 conveys
information about how strong a linear term the sequence has, regardless of whether that
term is ascending or descending. Similarly, the absolute value of a2 conveys the absolute
strength or amount of arch present, regardless of orientation (inverted or not). For
example, sequence 8:Rdl, which spans an octave before returning to middle C, has a much
larger arch term than arch sequence 7:Ral, which only rises to a G before coming back
down (1.8 compared to 0.6). When the absolute values of the coefficients were correlated,
the significance of the linear term (a1) almost vanished, but the arch term (a2) stayed
strong (see Table 3.31). This suggested that, for the linear term (a1), rising or falling
(orientation) was more important than the strength or amount to which the sequence rose
or fell. For the arch term, however, both orientation and strength were important. This
result suggested that, at least in this experiment, an arch was more important for
conveying emotional meaning than linearity in a sequence. In fact, the only affect scales
for which the magnitude of the arch did not appear to be important were the
passivity-aggression and questioning-answering scales. Thus, a strong arch appeared to
convey the affects of contentment, hesitation, pensiveness, strength, calmness, boredom,
expectation and tranquility. Similarly, low arch magnitude was correlated with: joy,
confidence, playfulness, delicacy, irritation, excitement, surprise and energy.

Table 3.31: Correlations between Responses and Absolute Values of Coefficients

                   Diatonic coefficients (absolute values)
Group  Affect      a0            a1       a2         a3         a4
1      cont-joy1   (as before)   .042    -.200***    .102**    -.143***
       hes-conf1   (as before)   .014    -.168***    .034      -.135***
       pens-play   (as before)   .026    -.155***    .084*     -.089*
       del-str     (as before)  -.053     .100**    -.080*      .066
       irr-calm    (as before)  -.059     .102**    -.100**     .050
       exci-bore   (as before)  -.045     .153***   -.100**     .099**
2      cont-joy2   (as before)   .019    -.196***    .067      -.132***
       hes-conf2   (as before)   .017    -.194***    .003      -.155***
       pass-agr    (as before)  -.013     .001       .002       .010
       ques-ans    (as before)   .024    -.019       .021       .009
       sur-exp     (as before)  -.072*    .185***   -.125***    .139***
       ener-tran   (as before)  -.005     .160***   -.045       .098**

Notes: N = 800 (16 sequences * 50 participants)

Again, in the absolute value case, a4 followed the trend of a2. The fourth coefficient (a4)
was significant for all the same affect scales except delicacy-strength and
irritation-calmness (Table 3.31). However, in the absolute value case, the significance of
a1 and a3 appeared to reverse. For the regular case (orientation) a1 was more important,
but for the absolute value (magnitude) case a3 became significant for more of the affect
scales and the significance of a1 all but disappeared. This analysis suggested that a
strong a3 was correlated with: joy, playfulness, delicacy, irritation, excitement and
surprise. By contrast, a weak a3 or flat sequence was correlated with: contentment,
pensiveness, strength, calmness, boredom, and expectation.

IV. Summary of Results

In general, there was a significant relationship between melodic contour and
affective responses (emotion). In this experiment pitch height was generally more
important than melodic contour, but melodic contour was also important. The analysis
that most clearly showed this result was the regression using only the contour coefficients
(a1 to a4), in which a significant result was achieved for contour for 10 out of 12 affect
scales (see Table 3.28). Finally, both the ascending/descending component and the melodic
arch appeared to be important for conveying affect (the arch more so than linear
ascending/descending sequences).

V. Additional (Secondary) Analyses

From the tonality assessment, cluster analysis was used to identify participants
with odd profiles. Such profiles might represent a listener who has an internal
representation of tonality atypical of listeners of western-tonal music. One could
speculate that such an individual might also have an unusual affect response to each
sequence. To determine the possible influence of this, the responses of each individual in
the tonality task were converted to a tonality profile for each individual (see Frankland &
Cohen, 1996; Krumhansl, 1979; Krumhansl & Toiviainen, 2001). A cluster analysis was
then performed to determine which participants had profiles that did not cluster well with
the other participants. The presence of distinct individuals or groups of individuals was
determined by looking at the tonality profiles relative to different cut-off values for
classifying listeners into groups.
The criterion used was a between group average correlation of 0.340. From this
analysis 15 participants from Group 1 and four participants from Group 2 were identified
as having odd tonality profiles, and representing a possible subgroup. All analyses of the
affect related to each sequence were recomputed without these 19 participants. Excluding
these 19 “odd” participants appeared to improve the model by explaining more variance,
but this improvement was only slight. Thus, these 19 participants were maintained in all
analyses and tables presented in this thesis. In other words, this potential subgroup did
not significantly impact the overall results in this study. It could also imply that the
simple tonal structures of the sequences used in the current work only required a basic
representation of tonality so that individual differences in tonal profiles simply did not

Musical Experience
In a similar manner, participants were categorized into low or high musical
experience groups based on the number of years that they had played their primary
instrument. A median split was performed to divide participants into low and high musical
experience groups. Based upon this split no significant main effect for musical experience was
found, and there was no significant 3-way interaction between level of musical
experience, the musical sequences, and the affect scales. A significant 3-way interaction
would have indicated that musical experience influenced how participants rated certain
sequences on certain affect scales.
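
For readers unfamiliar with the procedure, a median split can be sketched as follows. This is a minimal illustration with made-up values, not the data from this study. The thesis does not state how participants falling exactly on the median were assigned, so placing them in the low group is an assumption, and the function name is hypothetical.

```python
from statistics import median

def median_split(years_by_participant):
    """Label each participant 'low' or 'high' relative to the median number of years."""
    cutoff = median(years_by_participant.values())
    # Assumption: values equal to the median are placed in the 'low' group.
    return {pid: ("high" if years > cutoff else "low")
            for pid, years in years_by_participant.items()}

# Hypothetical years of experience on a primary instrument (not data from the study)
years = {"p01": 0, "p02": 2, "p03": 5, "p04": 8, "p05": 12, "p06": 3}
groups = median_split(years)
print(groups)
```

The resulting group label can then be entered as a between-subjects factor alongside musical sequence and affect scale.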
Two of the 2-way interactions with musical experience were found to be significant.⁴
For Group 1, musical experience was found to interact with the affect scales (F(5,4608)
= 3.604, p = .003, η² = .004). For Group 2, musical experience was instead found to
interact with the musical sequences being played (F(15,4608) = 2.206, p = .005, η² =
.007). Both of these effect sizes, however, were very small and this result was not as
interesting as a 3-way interaction would have been. Thus, these significant results were
not explored any further.

⁴ Although it is not mentioned in the text, in all analyses the two-way interactions between
musical sequences and affect scales were significant, with adequate effect sizes. This
interaction is not discussed in this section since the effect was a given, and is the very
effect considered in the rest of the thesis.

Other Demographics
Sex and Age were also considered, but did not appear to have an important
influence on responses. There was no significant main effect of Sex or Age on responses
and there were no significant three-way interactions between these demographics and
Musical sequence by Affect scale. Only in Group 2 were significant two-way interactions
found. In Group 2, a Sex by Musical sequence interaction was found (F(5,4608) = 2.09,
p = .002, η² = .008), and an Age by Affect scale interaction was found when participants
were categorized by a median split (F(5,4608) = 3.23, p = .006, η² = .003). These effects
were very small and, since 3-way interactions were what really interested us,
were not explored any further.

General Discussion

Different sequences produced significantly different ratings on the ten affect
scales. Modeling of the relationship between the structure of the sequence and the ratings
implied that pitch height (a.k.a. the mean pitch height of the sequence or the a0 term of
regression) was important for all the affect dimensions except questioning-answering.
However, pitch height per se does not truly include information about the shape of the
melodic contour. For shape, the degree and/or type of linearly ascending/descending
contour was important for the affect dimensions of contentment-joy, pensiveness-
playfulness, passivity-aggression, questioning-answering, surprise-expectation, and
energy-tranquility. The degree and/or type of an arched contour was important for the
dimensions of contentment-joy, hesitation-confidence, questioning-answering,
surprise-expectation, and energy-tranquility. Hence, one can conclude that melodic
contour does convey affective (emotional) meaning.
In all analyses, pitch height was clearly the strongest predictor. For Group 1, the
predictive power of the pitch height (coded by a0) was, on average, 11.5 times better than
all of the contour coefficients combined. For Group 2, however, it was found to be only
an average of 2.3 times better. This difference was largely driven by the four affect scales
delicacy-strength, irritation-calmness, passivity-aggression and questioning-answering.
For the former two scales musical contour did not explain a significant proportion of the
variance, and for the latter two affect scales musical contour explained virtually all of the
explainable variation.
Although it was expected that pitch height would play a role in affective ratings,
the strength of the effect was surprising. It was hoped that by keeping the sequences
contained within two octaves, and by having an equal number of sequences above and
below middle C, the effect of pitch height would be attenuated. Yet it remained
prominent in this study. That is, mean pitch height appeared, on average, to be a more
salient affective feature of music than melodic contour in this study.

Additionally, the influence of the melodic arch appeared to be stronger, in
general, than the influence of linearity. Note that the degree of arch was significantly
correlated to 9 of 12 affect scales, while the linearity coefficient was only correlated to 6
of 12 (when correlations for the absolute values of the coefficients were included).

Other Observations
The data were collected from two groups of participants. Recall that both groups
received the same 16 sequences, but the two groups rated those sequences on different
affect terms. A number of group differences were noticed in the analysis. The most
noticeable difference was that the role of pitch height was much stronger for Group 1. In
fact, pitch height alone was "almost" sufficient as a predictor in Group 1. The reason for
this difference is unclear. Experimentally, the only difference between the two groups
was in the choice of affect scales. Even then, both groups rated the affect scales of
contentment-joy and hesitation-confidence, and the two groups performed effectively the
same on those two affect scales (for Group 2, aspects of shape were slightly more
important for contentment-joy). Furthermore, there were no discernable demographic
differences between the two groups. Hence, it would seem that something about the word
lists encouraged participants to respond in different ways.
One possibility is that the affect scales questioning-answering and passivity-
aggression which appeared in Group 2 but not in Group 1 may have induced greater
attention on a trial. In particular, participants reported rating questioning-answering based
upon whether the sequence ended going down or up, suggesting that this scale forced
participants to attend to contour more than pitch height. This was supported by the
finding that questioning-answering and passivity-aggression were the only affect scales in
which contour explained more variance than mean pitch height. Additionally, these two
scales had the lowest η2’s, suggesting that participants had the least amount of agreement
on how to rate these scales. Participants also often reported these scales as being hard to
use. Accordingly, response times were slower for participants who rated these scales
(Group 2) compared to those who did not (Group 1).

Pitch Height (Register / Octave)
The results of this research replicated previous results on the importance of pitch
height to conveyed affect (cf. Costa et al., 2000; Hevner, 1936a). In this study, mean
pitch height was important to all affect scales except questioning-answering. In
particular, higher pitch was rated as conveying joy, confidence, playfulness, delicacy,
irritation, excitement, aggression, surprise and energy. Hevner (1936a) used an adjective
checklist to measure affect in her manipulations of pitch, and her checklist was further
divided into groups of related adjectives. Hevner (1936a) found that high pitch was most
associated with her sprightly-humorous adjective group which contained the adjectives:
delicate, light, graceful, sparkling, playful, jovial, whimsical, fanciful, quaint, sprightly
and humorous. However, all eight of Hevner’s adjective scales appeared to respond to
manipulations of pitch height to some degree. This allowed for direct comparison
between the current results and affect terms from Hevner (1936a). Table 4.1 shows the
comparison between ratings of high pitch sequences in the current study to ratings of the
approximately equivalent terms used by Hevner.

Table 4.1: Comparison between current ratings for high pitch, and how equivalent terms
were rated by Hevner (1936a).

Current Ratings for Hevner (1936a) Equivalent Ratings

high pitch high pitch (match) low pitch (mis-match)
joy jovial
confidence carefree, vivacious triumphant, exalting, impetuous
playfulness playful
delicacy delicate
irritation restless
excitement exciting
aggression forceful, impetuous, martial
romantic, dreamy,
whimsical, fanciful
energy sparkling, sprightly

Table 4.1 shows that the current results partially replicated the results of Hevner (1936a).
That is, 6 of 9 of the affect terms used in the current experiment had equivalent terms
from Hevner (1936a) that were rated the same way. The best examples were joy,
playfulness and delicacy, which had the equivalents of jovial, playful and delicate in
Hevner’s design. The worst example was the affect term “excitement”, which was rated
for high pitches in the current design, but its equivalent (exciting) was rated for low
pitches in Hevner’s (1936a) design. Additionally, “irritation” and “aggression” were rated
for high pitch sequences in the current results, and their closest equivalents (restless and
forceful) were rated for low pitches by Hevner’s (1936a) participants.
Hevner’s (1936a) research further suggested that low pitch was most associated
with her vigorous-majestic and dignified-serious adjective groups. The vigorous-majestic
group contained the adjectives: forceful, martial, ponderous, emphatic, exalting,
vigorous, and majestic. The dignified-serious group contained the adjectives: spiritual,
solemn, sober, dignified, and serious. For comparison, the current research indicated that
low pitches were rated as conveying: contentment, hesitation, pensiveness, strength,
calmness, boredom, passivity, expectation, and tranquility. Table 4.2 compares ratings for
low pitch sequences from the current study to how equivalent terms were rated in
Hevner’s (1936a) design.

Table 4.2: Comparison between current ratings for low pitch, and how equivalent terms
were rated by Hevner (1936a).

Current Ratings for Hevner (1936a) results

low pitch low pitch (match) high pitch (mis-match)
contentment sad merry, calm
hesitation solemn, heavy leisurely
pensiveness sober, serious plaintive, dreamy
strength forceful, ponderous
calmness calm
boredom depressing, gloomy
passivity pathetic leisurely
expectation martial
tranquility spiritual calm, serene

Table 4.2 shows that, again, in most cases a similar term rated in a similar manner could
be found in Hevner (1936a). The one exception to the rule, in this case, was the affect
scale calmness-irritation for which “calmness” described low pitch sequences in the
current design, but “calm” described high pitch melodies in Hevner’s (1936a) results
(Table 4.2). Similarly, the current results indicated “irritation” described high pitch
sequences, but Hevner’s (1936a) equivalent “restless” was used to describe low pitch
melodies (Table 4.1). This discrepancy might have been related to negative valencing of
the affect scale calmness-irritation in the current design such that irritation was rated
equivalently to “sad”. By comparison, in Hevner’s (1936a) design “restless” was likely
positively valenced by being grouped with positive adjectives such as “exciting” and
“elated”. Thus, the current results were a good replication (with a few exceptions) of
Hevner’s (1936a) results on the affect associated with manipulations of pitch height.
More recently, Costa et al. (2000) considered the affect for bichords played in
different octaves. Their results indicated that low register bichords were evaluated as
more emotionally negative than high register bichords. Additionally, high register
bichords were judged as more unstable, mobile, restless, dissonant, furious, tense,
vigorous and rebellious compared to low bichords (Costa et al., 2000). Recall that high
pitch sequences in the current design were rated as conveying joy, confidence,
playfulness, delicacy, irritation, excitement, aggression, surprise and energy. Hence, the
current results were a strong replication of Costa et al.’s (2000) results for pitch height.
Specifically, both sets of results indicated higher octaves as more positive, and as
expressing a greater degree of activity/movement. For example, the current results used
the terms confidence, irritation, excitement, aggression, surprise, and energy, compared
to Costa et al.’s (2000) terms of mobile, dissonant (irritation), tense, restless, furious
(aggression), unstable (surprise) and vigorous (energy). Thus, although the designs used
different terms, both appeared to be rating high pitches in an equivalent way.
More examples of pitch height research could be cited; however, the primary
interest in this research was the manipulation of melodic contour, not pitch height. Thus,
it is sufficient to show that the current results replicated the approximate affective results
for mean pitch height in both past and recent research (i.e. Costa et al., 2000; Hevner,

Linear (Ascending versus Descending) Contours

The results of this research indicated that ascending and descending contours
conveyed affective meaning. In particular, ascending sequences were rated as conveying
joy, playfulness, aggression, questioning, surprise, and energy (Table 3.27). This result
was consistent with results found in previous research. In particular, Gerardi & Gerken
(1995) found that college-aged participants rated ascending sequences as more “happy”

than descending sequences. Schubert’s (2004) results also suggested a trend for
ascending contours to convey happiness. Thus, findings that ascending sequences were
more likely to be rated as conveying joy replicated the results of Gerardi & Gerken
(1995) and were consistent with the trend observed by Schubert (2004).
Although Scherer & Oshinsky (1977) considered pitch contour using stimuli that
were not necessarily musical, there is increasing evidence that music and language are
processed in the same brain areas / regions (e.g. Koelsch, Gunter, & Wittfoth, 2005).
Scherer & Oshinsky’s (1977) design also employed the use of more than just one or two
affect scales, and can thus provide greater comparability for the current results. Scherer &
Oshinsky’s (1977) results indicated that downward pitch contour appeared to convey
boredom, pleasantness, and sadness. By comparison, the current results indicated that
descending contours conveyed contentment, pensiveness, passivity, answering,
expectation, and tranquility. “Excitement-boredom” was used as an affect scale, but no
significant relationship between this scale and a1 (the linear term) was found. Degree of
“pleasantness” was not measured in this research. However, it may be argued that
“tranquility” or “contentment” is similar in affect to “pleasantness”. Finally, none of the
current sequences were rated as conveying sadness (in pilot work) because they were all
written in a major key. However, the current results are consistent with Scherer &
Oshinsky’s (1977) results in that ascending sequences were rated as conveying more
happiness (joy).
Scherer & Oshinsky’s (1977) results further indicated that upward contours
conveyed fear, surprise, anger, and potency. By comparison, the current results indicated
ascending sequences conveyed joy, playfulness, aggression, questioning, surprise, and
energy. The current research did not measure “fear”, but notice both results indicated
“surprise” as an important affect of a rising contour. The equivalent to “anger” in the
current design was “calmness-irritation” which did not vary significantly according to the
linear term (a1) in this design. Also, the equivalent to “potency” in this design was
“strength-delicacy”, but, like calmness-irritation, strength-delicacy did not show
consistently reliable responses to contour. Thus, the current results were only a partial
replication of findings by Scherer & Oshinsky (1977).

Finally, Hevner (1936b) did not achieve results that were consistent or reliable in
her manipulation of contour, but she did find a trend suggesting that descending melodies
expressed both exhilaration and serenity. The current research did not use these terms,
however, “exhilaration” may be considered equivalent to “excitement”, and “serenity”
might be considered equivalent to “calmness” or “tranquility”. Of these, only
“tranquility” was significant in the current research, and, like Hevner (1936b), the results
indicated that tranquility was associated with descending contours. For ascending
melodies Hevner’s (1936b) results suggested a trend for ascending melodies to express
dignity and solemnity. Again, these terms were not employed in the current design, but
“solemnity” may be considered equivalent to “pensiveness”. “Dignity”, on the other
hand, appears to have no good equivalent in the current design (maybe “confidence”). Of
these, pensiveness-playfulness was important to linear (ascending/descending) sequences,
and hesitation-confidence was not. However, the current design indicated that
pensiveness was conveyed by descending sequences, not by ascending sequences as in
Hevner (1936b). Thus, the current results matched the trend observed by Hevner (1936b) for only
one of four terms: serenity (tranquility). It is important to note, though, that the two
results did not lend themselves very well to comparison.
In summary, the current results replicated previous research. In particular,
ascending sequences in this research tended to convey more happiness (cf. Gerardi &
Gerken, 1995; Scherer & Oshinsky, 1977; Schubert, 2004). To a lesser extent, this
research replicated previous results by showing that descending sequences conveyed
tranquility and contentment, and ascending sequences conveyed surprise (Hevner, 1936b;
Scherer & Oshinsky, 1977).
Note that previous research considering melodic contour has controlled for this
effect of pitch height by only considering ascending versus descending contours (e.g.
Hevner, 1936b; Gerardi & Gerken, 1995; Scherer & Oshinsky, 1977). That is, inversions
of ascending and descending contours using the same notes written in reverse result in
adequate complementary contours (when harmony & rhythm are avoided). This reverse
(or retrograde) method works fine for ascending versus descending sequences (e.g. C-E-
G becomes G-E-C) but does not work well for arches (e.g. C-E-G-E-C becomes C-E-G-
E-C). Previous research has conveniently side-stepped this issue by only considering

ascending versus descending contours, but other contours are important in music. This
research, however, suggests that the arch should be considered an important affective
contour in music and should be considered in future research.
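
The retrograde problem described above can be illustrated with a minimal Python sketch. The note lists, the MIDI-style pitch numbers, and the helper names `retrograde` and `invert_about` are assumptions introduced for this illustration only; they are not part of the original study.

```python
def retrograde(notes):
    """Return the notes in reverse order (the retrograde)."""
    return list(reversed(notes))


def invert_about(pitches, axis):
    """Mirror each pitch around an axis pitch (a standard pitch inversion)."""
    return [2 * axis - p for p in pitches]


ascending = ["C", "E", "G"]
arch = ["C", "E", "G", "E", "C"]

# Reversing an ascending sequence yields its descending complement.
print(retrograde(ascending))     # ['G', 'E', 'C']

# An arch is a palindrome, so its retrograde is the same arch.
print(retrograde(arch) == arch)  # True

# Hypothetical MIDI numbers for C4-E4-G4-E4-C4 (assumed for illustration).
arch_midi = [60, 64, 67, 64, 60]
# Inverting about the starting note flips the arch into an inverted arch.
inverted = invert_about(arch_midi, 60)
print(inverted)                  # [60, 56, 53, 56, 60]
```

Under these assumptions the inverted arch has a lower mean pitch than the original arch, which suggests why contour and pitch height are harder to separate for arches than for linear sequences.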

The Melodic Arch

The demonstration of affect for the arch is a new contribution to the literature. To
my knowledge, there is no research relating the arch to affect on short sequences. On the
other hand, the finding of an affect for the arch should not be surprising because the arch
is one of the most important melodic contours in music. For example, Huron (1996)
found that in an analysis of over 36,000 sequences taken from western folk music the
arch or inverted arch occurred 48.3% of the time, just as often as linear sequences
(48.2%). Thus, analyses of contour and affect should be considered incomplete if they
consider ascending versus descending sequences but neglect to include arches and
inverted arches. Additionally, the regular melodic arch (in its convex form) occurred
more than any other type of melodic contour. This study shows us that not only do arches
convey affective meaning, they may in fact convey more affect than a simple ascending
or descending sequence.
In the current study, the melodic arch was found to be important for conveying
joy, confidence, playfulness, excitement, answering, surprise and energy. Conversely, the
inverted arch was found to be important for conveying contentment, hesitation,
pensiveness, boredom, questioning, expectation, and tranquility. Certainly it made sense
that the regular arch conveyed answering and the inverted arch conveyed questioning.
The former ends descending, and the latter ends ascending. The fact that the regular arch
conveyed surprise, energy, excitement also seemed consistent with Toch’s (1948)
assertion that the melodic arch is used to convey a climax in a musical piece. Thus,
although no research in the literature has directly considered the affect of short
melodic arches in a controlled experiment, the results from this experiment were
consistent with what would, intuitively, be expected.
Further, the current design analyzed the absolute values of the polynomial
coefficients. For the arch (a2) term this meant a comparison between contours with a
melodic arch present, and those without. The results indicated that not only was direction

of melodic arch (concave/inverted versus convex/regular) important but so was the
magnitude of melodic arch. Specifically, a strong melodic arch shape was important for
conveying contentment, hesitation, pensiveness, strength, calmness, boredom, expectancy
and tranquility. By contrast a lack of melodic arch was important for conveying joy,
confidence, playfulness, delicacy, irritation, excitement, surprise and energy. Again,
some of these results make intuitive sense, but unfortunately there is no literature to
which these results can be compared.
It could be suggested that this absolute value comparison was similar to
Schellenberg et al.’s (2000) comparison of pitch varying (arch present) versus pitch
invariant (no arch) contours; however, there is a key difference that should be noted here.
In Schellenberg et al.’s (2000) design, pitch invariant contours were all played on the
same pitch, but in the current design a lack of arch did not indicate that sequences were
all played on the same pitch. For example, an ascending sequence could have no arch,
and similarly an N-shape contour could contain an arch value (a2) of zero in this design.
This is because, in polynomial modeling with a perfect N-shape, the two arches cancel
each other out to give an overall arch term (a2) of zero. Thus, Schellenberg et al.’s (2000)
design was considerably different from the current design, and the two do not warrant a
direct comparison here.
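
The cancellation of the arch term for an N-shape can be checked numerically. The sketch below is only an illustration under stated assumptions: it fits a plain quadratic (a0 + a1·t + a2·t²) by least squares with `numpy.polyfit` on a centered note index, and it uses made-up four-note sequences in MIDI-style numbers. The function name `contour_coefficients` is hypothetical, and the scaling and sign convention of the coefficients in the actual analysis may differ (here a regular, convex arch yields a negative a2).

```python
import numpy as np


def contour_coefficients(pitches):
    """Fit pitch = a0 + a1*t + a2*t**2 by least squares on centered time.

    Returns (a0, a1, a2). Centering t makes the linear and quadratic
    terms uncorrelated for evenly spaced notes.
    """
    y = np.asarray(pitches, dtype=float)
    t = np.arange(len(y)) - (len(y) - 1) / 2.0
    a2, a1, a0 = np.polyfit(t, y, 2)  # polyfit returns highest power first
    return a0, a1, a2


# Hypothetical four-note sequences (assumed for illustration).
sequences = {
    "ascending": [60, 62, 64, 66],  # evenly spaced rise
    "arch": [60, 64, 64, 60],       # rise then fall
    "N-shape": [60, 64, 60, 64],    # rise, fall, rise
}

for name, pitches in sequences.items():
    a0, a1, a2 = contour_coefficients(pitches)
    print(f"{name}: linear a1 = {a1:.3f}, arch a2 = {a2:.3f}")
```

In this sketch the ascending and N-shape sequences both give an arch term of zero (up to floating-point error) even though neither is played on a single pitch, whereas the arch gives a clearly non-zero a2.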

The Influence (Effect Size) of Melodic Contour

In this study, the affect of contour was largely overshadowed by the affect of
pitch height. Manipulations of tempo, harmony, key, rhythm, et cetera, were purposely
avoided so that contour would be the most prominently manipulated feature, and yet
affective responses to it were still comparatively weak. For example, melodic contour
only explained 14% of the explainable variance for affect scale excitement-boredom.
There are three possible explanations for the relatively weak effect of contour: (1) the
sequences were not long enough to induce affect, (2) the affect of melodic contour is
learned / cultural, or (3) melodic contour may just be not that important to conveying
affect.
Firstly, with regard to sequence length, short sequences may have the problem of
not being complex enough to elicit strong emotion. If this is true, longer sequences may

be required to measure contour’s full emotional/affective effect. On the other hand, with
longer sequences it becomes difficult to determine what part(s) the participant is
responding to. For example, Hevner (1936b) used full sections of music that were, on
average, 8 measures long. Participants were provided with an adjective list and checked
off adjectives that applied while listening to music played by a live (concealed) pianist.
However, Hevner had no method for measuring the time at which certain adjectives were
endorsed. Thus, Hevner’s interpretations were based on the assumption that particular
adjectives applied to the entire piece, when it was entirely possible that any single
adjective was referring to just a section of the piece. Thus, if contour conveyed subtle
effects of affect that are only relevant to small sections of music it would be almost
impossible to measure the affect of contour using long musical pieces.
A number of researchers have tried to tackle this problem by collecting continuous
responses from participants (cf. Krumhansl, 1997; Schubert, 2004).
For example, Schubert (2004) analyzed the relationship between melodic contour
and the affect dimension of happy-sad and exciting/boring, using extracts from classical
repertoire. Like many other researchers, he found that contour “varied positively with valence,
though this finding was not conclusive” (p. 561). However, as noted, in Schubert (2004),
the musical stimuli were excerpts from classical music, so the influence of other musical
variables may have masked the small influence of melodic contour.
In Schubert’s study, affect ratings were measured continuously by having
participants move a cursor around a two dimensional space while listening to music.
The east/west dimension measured valence (happy/sad) and the north/south dimension
measured arousal (exciting/boring). The analysis was then done by comparing ratings at a
given time to musical features/events occurring at slightly earlier times (assuming a delay
in reaction). However, this analysis becomes quite computationally intensive, as different
types of affect may have different delays (e.g., sad affects could take longer to manifest).
This would limit the ability to tie the current ratings to musical structure, and may have
explained why Schubert (2004) did not find a significant relationship between contour and
affect. Methodologically, this approach is also limited to measuring only two (or a few)
affect dimensions at a time. That, combined with the length of each stimulus, would limit
the amount of information gleaned for a variety of affects and sequence structures.

Therefore, choosing the ideal length of musical stimuli can be a difficult issue, and future
research should perhaps consider testing musical stimuli of different lengths.
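
The idea of relating a rating at one moment to a musical feature at a slightly earlier moment can be sketched as a lagged correlation. This is a simplified illustration and not Schubert's (2004) actual procedure: the data are invented, the delay is counted in samples, and the helper names `lagged_correlation` and `best_lag` are hypothetical.

```python
import numpy as np


def lagged_correlation(feature, rating, lag):
    """Pearson correlation between feature[t] and rating[t + lag]."""
    feature = np.asarray(feature, dtype=float)
    rating = np.asarray(rating, dtype=float)
    n = len(feature)
    return np.corrcoef(feature[: n - lag], rating[lag:])[0, 1]


def best_lag(feature, rating, max_lag):
    """Return the lag (in samples) with the highest correlation."""
    scores = [lagged_correlation(feature, rating, k) for k in range(max_lag + 1)]
    return int(np.argmax(scores))


# Made-up contour feature and a rating that echoes it two samples later.
feature = [0, 1, 0, 2, 1, 3, 0, 1, 4, 2, 0, 1, 3, 1, 0, 2]
rating = [0, 0] + feature[:-2]

print(best_lag(feature, rating, max_lag=4))  # 2
```

Because every candidate delay requires its own comparison, and because different affects may call for different delays, the number of comparisons grows quickly, which is consistent with the computational concern raised above.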
Secondly, the affect of melodic contour may be learned, or acquired culturally.
Gerardi and Gerken (1995) found that college students appeared to respond consistently
to manipulations of melodic contour, while 5-year-olds did not. This suggests that
response to melodic contour is either developmentally or culturally related. A cultural
difference implies that 5-year-olds simply have not been exposed to enough Western
music to begin to learn affective distinctions of contour. A developmental difference
implies that processing melodic contour requires a higher order level of cognitive
processing that does not develop until later years. Indeed, Koelsch and Siebel (2005)
recently published a model of musical perception that placed perception of complex
structures of music at a higher level than perception of pitch height, pitch chroma, timbre,
intensity, and roughness (p. 579). They, unfortunately, did not put forth any theories
about the age at which these mechanisms develop. However, there has been research showing
that infants are responsive to differences in vocal contour (e.g. Balaban, Anderson &
Wisniewski, 1998; Trehub, 2000). This suggests that 5-year-olds are probably aware of
the differences in melodic contour, and thus level of exposure (or culture) is likely the
driver of this effect. This could be investigated by comparing affect ratings of college
aged participants in Canada to participants of the same age in a culture with reduced
exposure to Western music. However, all of the current work was conducted with adults
enmeshed in the norms of western tonal music. The tonality analysis implied that only a
few participants were a bit different. Hence, if there are learned cultural norms of musical
expression, the participants in the current work should have acquired those norms. In
addition, none of the sequences used in the current work represented "complex" or
"atypical" structures. The construction of all sequences was firmly grounded in basic
structural devices of western tonal music. Thus, cultural or learned norms are not a viable
explanation for the weak effect in this study.
Thirdly, it could be possible that melodic contour is just not that important for
affect. That is, the pattern of notes (contour) may only serve as a vehicle for the
expression of other aspects of music. For example, contour may serve as a means of
moving from the tonic (C in the key of C) to the dominant (G in the key of C) and back to

the tonic, but the route that the music takes to get there may not be that important.
However, this possibility does not make a lot of sense in light of previous research.
Previous research has shown that melodic contour is very important to musical memory
(e.g. Cutietta & Booth, 1996; Dowling, 1994; Dowling, Kwak, & Andrews, 1995).
Melodic contour is also very important to music recall, and preserved melodic contour
can lead to music recognition even when other features of music are altered (cf. Dowling,
1994; Massaro, Kallman, & Kelly, 1980). Thus, it makes little sense that melodic contour
could be important for musical memory but serve absolutely no function on its own. If
contour had no function, then there would be no point to its retention. If contour had no
function, then all contours containing the same collections of notes would be functionally
equivalent. All melodies containing the same collections of notes (in any order) could be
remembered by reference to that set collection. In the extreme, all melodies would be
encoded as the tonal hierarchy of Krumhansl (1990). However, this is not the case, so
contour must be important.

The Function of Melodic Contour

Contour must have some function, but could that function be for something other
than affect? This is possible, but it is extremely difficult to imagine: What is there besides
affect in music? In every culture, a great deal of effort is directed into the archiving,
production and performance of music, at the cultural and individual levels. Yet, clearly,
music does not convey any semantic (factual) information except in the most contrived
situations (e.g., songs to teach children to count, such as "One, Two Three Little
Indians"). Hence, music must serve to express (including self-expression), communicate,
or possibly to induce, affect.
The ability to communicate affect (or to induce affect) presupposes that there are
"common" structures that can be used for particular communication goals. These
common structures would obviously include key, tempo, and pitch height, but clearly
must also include contour. If contour did not have an impact on affect, there would be no
point to remembering the contour (i.e., the tune). A performer could use any ordering of
the same set of notes to communicate the same affect (often this seems like the method of

Jazz). Memory for music would be limited to memory for a set of notes, and this is
clearly not the case.
Thus, contour must be associated with affect, and yet, there is still the quandary of
why the effect of contour on affect is so difficult to assess. It may be that the
communication of affect requires (to varying degrees) the induction of the same affect
in the listener. That is, to know what the music conveys, one must feel that affect.
This notion leads to two concerns of direct relevance to the current experimental
situation. First, affect may take some time to build, so one would need longer sequences
to actually induce strong affect. Second, it is possible that such knowledge of affect is
only approximate. This implies that the effect of musical structure on affect is unknown
unless the music induces that affect. That is, listeners may not know what the music is
intended to convey if that music does not induce that affect. This might explain the
relatively weaker effects of contour in the current work. Specifically, the sequences used
may have been too short to induce affect (affect that can override a prior mood state), and
for that reason, listeners may have been "ambivalent" about the intended affect. This could lead listeners to
gravitate to the center of the scales or rely on more salient (learned) cues such as pitch
height. Note that issues of key (major/minor) and pitch height are much more salient and
often explicitly taught in the public school system.
In summary, contour must play some role in affect, and the best explanation for
the small effect of contour in the current design is that the demand characteristics of the
experiment rendered other elements of music somewhat more salient.
Specifically, this design involved a somewhat sterile presentation of a number of
unrelated short sequences, with only a limited set of affect terms. Future research should
consider similar sequences of longer length to determine if sequence length really is a
driving factor of the weak effect for contour observed in the current study.

Limitations & Concerns

There are a number of limitations and concerns that remain. Foremost, there is the
issue of demand characteristics. Participants were asked to rate sequences on affect

scales, and since these sequences only differed on contour and mean pitch height
participants were forced to respond to only these manipulations. Therefore, ratings may
have been assigned in an artificial way, differently from how participants would normally
respond to music. However, if each participant was deriving their own completely
different system for rating, then the results should have averaged out. Thus, in this study,
the ratings of sequences captured at least some relatively consistent effect.
Additionally, the fact that arches were rated as more important than linear
sequences may have been driven by something other than their inherent importance. For
instance, it may have been related to the fact that more arch shape sequences were present
in the stimulus set, or the fact that the Tonality Task was done entirely with linear
sequences. These disparities may have made arches seem more salient to listeners, thus
resulting in more consistent affect ratings for arches. Certainly, future studies should
control these concerns by balancing the number of linear and arch sequences within a
study, and not intermixing Affect Ratings with a Tonality Task (i.e. saving the Tonality
Task for last).

The polynomial modeling worked very well, in general. However, the model did
less well for the more complex sequences such as the M/W contours (R² not as high).
Thus, the model’s ability to explain the influence of these sequences was somewhat
compromised. Future research should consider strategies to improve the fit between the
model and the data as it remains a weakness in the current design. For example, Fourier
analysis may be an effective modeling tool for N-shape and M/W shape contours, since
Fourier analysis models contour with sine waves and N-shape contours follow
approximately the period of a sine wave. Additionally, each sequence was checked to
ensure that it fit the tonality of C major, but no analysis was performed on the frequency
of intervals per sequence, for example, the frequency of dissonant intervals (cf. Costa et
al., 2000; Costa et al., 2004). However, this particular analysis would probably not add
much for these particular musical stimuli.
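
The Fourier suggestion above can be illustrated with a small sketch that fits a mean plus a single sine/cosine period to a contour by least squares and reports R². This is an assumption-laden illustration, not an analysis performed in this study: the sequences are hypothetical relative pitch steps, one full period is assumed to span the sequence from first to last note, and the function name `fit_single_sine` is invented for the example.

```python
import numpy as np


def fit_single_sine(pitches):
    """Least-squares fit of mean + one sine/cosine period; returns R^2."""
    y = np.asarray(pitches, dtype=float)
    # Assumed phase convention: first note at 0, last note at 2*pi.
    phase = 2 * np.pi * np.arange(len(y)) / (len(y) - 1)
    design = np.column_stack([np.ones_like(y), np.sin(phase), np.cos(phase)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Hypothetical N-shape in relative semitone steps: rise, fall, rise.
n_shape = [0, 1, -1, 0]
print(round(fit_single_sine(n_shape), 6))  # 1.0

# A steadily ascending sequence is fit poorly by a purely periodic basis.
ascending = [0, 1, 2, 3]
print(fit_single_sine(ascending) < 0.5)    # True
```

Under these assumptions a single sine period captures the N-shape exactly, but a purely periodic basis cannot represent a net rise or fall, so a linear term would probably still be needed alongside any Fourier terms.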
More importantly, the modeling must, eventually, incorporate issues of tonality
and harmonic progression. It is possible that affect is strongly related to harmonic

progression, which would be an aspect of both tonality (key) and contour. For example, a
sequence C E G E C may be distinct from C G E E C because of the simpler harmonic
structure (within the key of C major) of the first, or because of the simpler steps between
adjacent notes. Additional sequences like C E D D C and C D E D C (weaker harmonic
motion) or sequences like D E F E D and D F E E D (difficult to maintain the sense of the
key of C major) might help to clarify such distinctions. Current models were not
extended to the notion of harmonic progression because the topic is so complex, and
there are no empirically valid models of harmony. In addition, this work was, in part,
exploratory (one step at a time).

Group Differences
Participants were assigned to groups sequentially, not randomly as would be done
in a normal design. The observed group differences may have been a consequence of the
order in which participants participated. Participants in Group 2, in general, responded
more to manipulations of melodic contour (as desired). However, all participants in
Group 2 participated after all participants in Group 1. Thus, it is possible (though
unlikely) that participants participating at earlier times were passing on information about
the study, and that this passing of information was the primary reason for group differences.
Future studies with these stimuli should randomly assign participants to groups and run
groups in parallel. This was not done in the current study because the second group was
run largely as an extension to more affect terms.
Although collected demographic information led to the conclusion that the groups
did not differ demographically, there may have been problems with how this data was
collected. For example, the musical background was free format and relied (largely) on
participant memory for what instruments they played, for how long, how often they
practiced, etc. In this study, musical expertise was ascribed based on the number of years
a participant played an instrument. Although there was a general correlation between
years played and number of hours per week during those years, using hours per week
may have been a better method. However, the tight clustering of the affect rating data per
sequence did not imply that there were strong distinct subgroups in the data. In addition,
previous research (e.g. Waterman, 1996) has suggested that trained musicians do not tend

to rate affective content of music differently than non-trained participants. Thus, extent of
musical training was likely not an important factor in the current study.

Affect Scales
Affect scales for this study were developed through pilot work, by choosing
scales and adjectives that appeared to work well with the music. The final list was by no
means perfect or exhaustive, and certainly should be updated as needed for future
research. There is also the concern that the scales represent a mix of unipolar and
bipolar scales. The initial plan was to use bipolar scales with a defined neutral point, but
unfortunately, many had to be adjusted for the current work. For example, the
contentment-joy affect scale typically appears in similar research as the bipolar happy-sad
dimension, but this was deemed inappropriate to the current work because all
sequences were designed within the major mode.
There was also the problem that many affect scales in this design were highly
correlated with each other (see Tables 3.6 and 3.7). This suggested that these affect
scales were capturing redundant or non-unique pieces of emotion/affect. Perhaps the
stimuli in this study were only strong enough to elicit responses on a few affective
dimensions, or perhaps there were some important affective dimensions excluded from
the current study. It would be worthwhile to consider new and different affect scales to
determine if they add any more information to the model.

Future Directions
Future research could carefully design sequences that yield a balanced set of
polynomial coefficients from which the importance of arch contour to ratings of affect
could be confirmed. It would also be valuable to test longer sequences to determine if
longer sequences showed larger affect. This could be done both by composing new
sequences and by combining the current set of sixteen sequences to form 16-note
(4-measure) sequences. The additive/interactive effects of these combined sequences could
then be analyzed in detail. It would also be worthwhile to consider more affect scales. The
eventual goal of this research will be to build predictive models of contour and affect, and
test these models in an experimental setting.

The effect of melodic contour on emotion (affect) is rather weak, particularly, it
seems, when compared to the effect of other features such as pitch height, mode, tempo,
rhythm, etc. However, characterizing the influence of contour is important as it is one of
the most fundamental features of music.
Previous research has focused investigations of contour on the difference between
ascending and descending contours. This paper provides evidence that the arch is also an
important affective contour in music. In light of this finding, and the fact that the arch is
one of the most common melodic contours, future research into contour and affect should
consider the arch.


Adams, C. R. (1976). Melodic contour typology. Ethnomusicology, 20(2), 179-215.

Balaban, M. T., Anderson, L. M., & Wisniewski, A. B. (1998). Lateral asymmetries in
infant melody perception. Developmental Psychology, 34(1), 39-48.
Beard, D. (2003). Contour modeling by multiple linear regression of the nineteen piano
sonatas by Mozart. Unpublished Dissertation, Florida State University, Florida.
Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999). Emotional responses to
pleasant and unpleasant music correlate with activity in paralimbic brain regions.
Nature Neuroscience, 2(4), 382-387.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the
behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Cohen, A. J. (1991). Tonality and perception: Musical scales primed by excerpts from
The Well-Tempered Clavier of J. S. Bach. Psychological
Research/Psychologische Forschung, 53(4), 305-314.
Costa, M., Bitti, P. E. R., & Bonfiglioli, L. (2000). Psychological connotations of
harmonic musical intervals. Psychology of Music, 28(1), 4-22.
Costa, M., Fine, P., & Bitti, P. E. R. (2004). Interval distributions, mode, and tonal
strength of melodies as predictors of perceived emotion. Music Perception, 22(1),
Cutietta, R. A., & Booth, G. D. (1996). The influence of metre, mode, interval type and
contour in repeated melodic free-recall. Psychology of Music, 24(2), 222-236.
Dowling, W. J. (1994). Melodic contour in hearing and remembering melodies. In R.
Aiello & J. A. Sloboda (Eds.), Musical perceptions (pp. 173-190). New York,
NY: Oxford University Press.
Dowling, W. J., Kwak, S., & Andrews, M. W. (1995). The time course of recognition of
novel melodies. Perception & Psychophysics, 57(2), 136-149.
Frankland, B. W., & Cohen, A. J. (1990). Expectancy profiles generated by major scales:
Group differences in ratings and reaction time. Psychomusicology, Special Issue:
Music expectancy, 9(2), 173-192.

Frankland, B. W. & Cohen, A. J. (1996). Using the Krumhansl and Schmuckler key-
finding algorithm to quantify the effects of tonality in an interpolated tone pitch
comparison task. Music Perception, 14, 57-83.
Frankland, B. W. (1998). Empirical assessment of Lerdahl and Jackendoff's low-level
group preference rules for the parsing of melody. Thesis: Doctor of Philosophy,
Dalhousie University, Halifax.
Friedmann, M. L. (1987). A response: My contour, their contour. Journal of Music
Theory, 31(2), 268-274.
Gagnon, L., & Peretz, I. (2003). Mode and tempo relative contributions to 'happy-sad'
judgements in equitone melodies. Cognition & Emotion, 17(1), 25-40.
Gerardi, G. M., & Gerken, L. (1995). The development of affective responses to modality
and melodic contour. Music Perception, 12(3), 279-290.
Gilpin, A. R. (1973). Lexical marking effects in the semantic differential. Journal of
Psychology: Interdisciplinary and Applied, 85(2), 277-285.
Gomez, P., & Danuser, B. (2004). Affective and physiological responses to
environmental noises and music. International Journal of Psychophysiology,
53(2), 91-103.
Hampton, P. J. (1945). The emotional element in music. Journal of General Psychology,
33, 237-250.
Handel, S. (1993). Listening: An introduction to the perception of auditory events.
Cambridge, MA, US: The MIT Press.
Hevner, K. (1935). The affective character of the major and minor modes in music.
American Journal of Psychology, 47, 103-118.
Hevner, K. (1936a). The affective value of pitch and tempo in music. American Journal
of Psychology, 50, 621-630.
Hevner, K. (1936b). Experimental studies of the elements of expression in music.
American Journal of Psychology, 48, 246-268.
Huron, D. (1996). The melodic arch in western folksongs. Computing in Musicology, 10,

Iwamiya, S. (1994). Interaction between auditory and visual processing when listening to
music in an audiovisual context: I. Matching II. Audio quality. Psychomusicology,
13(1-2), 133-153.
Iwanaga, M., Kobayashi, A., & Kawasaki, C. (2005). Heart rate variability with repetitive
exposure to music. Biological Psychology, 70(1), 61-66.
Juslin, P. N., & Madison, G. (1999). The role of timing patterns in recognition of
emotional expression from musical performance. Music Perception, 17(2), 197-221.
Kawano, K. (2004). Changes in EEGs and other physiological indicators while listening
to healing music. Journal of International Society of Life Information Science,
22(2), 378-380.
Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends
in Cognitive Sciences, 9(12), 578-584.
Koelsch, S., Gunter, T. C., & Wittfoth, M. (2005). Interaction between syntax processing
in language and in music: An ERP study. Journal of Cognitive Neuroscience,
17(10), 1565-1577.
Krumhansl, C. L. (1979). The psychological representation of musical pitch in a tonal
context. Cognitive Psychology, 11(3), 346-374.
Krumhansl, C. L. (1990). Cognitive foundations of musical pitch. New York, NY, US:
Oxford University Press.
Krumhansl, C. L. (1997). An exploratory study of musical emotions and
psychophysiology. Canadian Journal of Experimental Psychology, 51(4), 336-
Krumhansl, C. L. (2000). Rhythm and pitch in music cognition. Psychological Bulletin,
126(1), 159-179.
Krumhansl, C. L., & Toiviainen, P. (2003). Tonal cognition. In I. Peretz & R. Zatorre
(Eds.), The cognitive neuroscience of music (pp. 95-108). New York, NY, US:
Oxford University Press.
Lerdahl, F. (1988). Tonal pitch space. Music Perception, 5(3), 315-349.
Lerdahl, F. (2001). Tonal pitch space. New York: Oxford University Press.

Lipscomb, S. D., & Kendall, R. A. (1994). Perceptual judgement of the relationship
between musical and visual components in film. Psychomusicology, 13(1-2), 60-
Marshall, S. E., & Cohen, A. J. (1988). Effects of musical soundtracks on attitudes
toward animated geometric figures. Music Perception, 6, 95-112.
Marvin, E. W., & Laprade, P. A. (1987). Relating musical contours: Extensions of a
theory for contour. Journal of Music Theory, 31(2), 225-267.
Massaro, D. W., Kallman, H. J., & Kelly, J. L. (1980). The role of tone height, melodic
contour, and tone chroma in melody recognition. Journal of Experimental
Psychology, 6(1), 77-90.
Morris, R. (1987). Composition with pitch classes: A theory of compositional design.
New Haven and London: Yale University Press.
Morris, R. (1993). New directions in the theory and analysis of musical contour. Music
Theory Spectrum, 15(2), 205-228.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning.
Urbana: University of Illinois Press.
Ottman, R. (1956). Music for sight singing. Englewood Cliffs, NJ: Prentice Hall, Inc.
Pittenger, R. A. (2003). Affective and perceptual judgements of major and minor musical
stimuli. Dissertation Abstracts International: Section B: The Sciences and
Engineering, 63(7-B), 3491.
Polansky, L., & Bassein, R. (1992). Possible and impossible melody: some formal
aspects of contour. Journal of Music Theory, 36(2), 259-284.
Quinn, I. (1997). Fuzzy extensions to the theory of contour. Music Theory Spectrum,
19(2), 232-263.
Richell, R. A., & Anderson, M. (2004). Reproducibility of negative mood induction: A
self-referent plus musical mood induction procedure and a
controllable/uncontrollable stress paradigm. Journal of Psychopharmacology,
18(1), 94-101.
Schellenberg, E. G., Krysciak, A. M., & Campbell, R. J. (2000). Perceiving emotion in
melody: Interactive effects of pitch and rhythm. Music Perception, 18(2), 155-171.

Scherer, K. R. (1986). Vocal affect expression: A review and a model for future research.
Psychological Bulletin, 99(2), 143-165.
Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from
auditory stimuli. Motivation and Emotion, 1(4), 331-346.
Schimmack, U., Böckenholt, U., & Reisenzein, R. (2002). Response styles in affect
ratings: Making a mountain out of a molehill. Journal of Personality Assessment,
78(3), 461-483.
Schmidt, L. A., Trainor, L. J., & Santesso, D. L. (2003). Development of frontal
electroencephalogram (EEG) and heart rate (ECG) responses to affective musical
stimuli during the first 12 months of post-natal life. Brain and Cognition, 52(1),
Schmuckler, M. A. (1999). Testing models of melodic contour similarity. Music
Perception, 16(3), 295-326.
Schubert, E. (2004). Modeling perceived emotion with continuous musical features.
Music Perception, 21(4), 561-586.
Swanwick, K. (1973). Musical cognition and aesthetic response. Psychology of Music,
1(2), 7-13.
Toch, E. (1948). The shaping force in music. NY: Criterion Music Corp.
Trehub, S. (2000). Human processing predispositions and musical universals. In N. L.
Wallin, B. Merker & S. Brown (Eds.), The origins of music (pp. 427-448).
Cambridge, MA, US: The MIT Press.
Van der Does, W. (2002). Different types of experimentally induced sad mood. Behavior
Therapy, 33(4), 551-561.
Västfjäll, D. (2001-2002). Emotion induction through music: A review of the musical
mood induction procedure. Musicae Scientiae, Special Issue: Current trends in
the study of music and emotion, 173-211.
Vines, B. W., Nuzzo, R. L., & Levitin, D. J. (2005). Analyzing temporal dynamics in
music: Differential calculus, physics, and functional data analysis techniques. Music
Perception, 23(2), 137-152.
Waterman, M. (1996). Emotional responses to music: Implicit and explicit effects in
listeners and performers. Psychology of Music, 24(1), 53-67.

Webster, G. D., & Weir, C. G. (2005). Emotional responses to music: Interactive effects
of mode, texture, and tempo. Motivation and Emotion, 29(1), 19-39.
Yorke, M. (2001). Bipolarity ... or not? Some conceptual problems relating to bipolar
rating scales. British Educational Research Journal, 27(2), 171-186.

Appendix 1

Computer Instructions for Practice

Sequence Fit (Tonality) – Practice

This task provides some background information. In this task, you will hear 6
sequences. Each sequence will consist of 8 notes, followed by a pause, and then a
single note. Please judge the “fit” of the single note (i.e., the last note) to the
preceding notes.
Feel free to use any criteria that you like as the basis for your assessment: There
is no right or wrong answer. Please try to respond naturally, instinctively, or with
whatever “feels” right. At first this may seem awkward, but eventually it becomes
automatic, even rhythmic.
When making your judgement, try to use the full range of a scale from one (1)
to three (3) with a 1 representing a good fit, and a 3 representing a poor fit.
good fit 1 ------------------------ 2 ------------------------- 3 poor fit
You can use the number keys (top row of the keyboard) or the keypad or the arrow
keys (labeled on the keyboard) for your response. Finally, do not worry if you should
change your mind. Sequences will be repeated later.

Musical Meaning (Affect) – Practice

This task assesses musical meaning. In this task, you will hear 16 musical
sequences paired with a word-scale. Use the word-scale to indicate which
affect/emotion you feel the musical sequence is intended to convey.
Feel free to use any criterion that you like as the basis for your assessment.
There is no right or wrong answer. Try to make the judgement as natural or
“instinctive” as possible.
The practice word-scale will be:
excitement 1 2 3 boredom

When making your judgment, try to use the full range of a scale from one to
three with a one (1) representing the first word (e.g. excitement) and a three (3)
representing the second word (e.g. boredom).
You can use the number keys (top row of the keyboard) or the keypad or the
arrow keys (labeled on the keyboard) for your response. Please wait until the entire
sequence has been played before responding.

Computer Instructions for Actual Trials

Tonality Trial:

Judge the fit of the last note of the sequence to the sequence.
Please use the following range.

good fit 1 ----------------------- 2 ----------------------- 3 poor fit

Playing sequence 2_

Affect Rating Trial:

Please rate according to what you feel the music is trying to convey.

Is the music trying to convey … ?

Trial: 1 hesitation 1 2 3 confidence

……….hit any key to continue_

Computer Instructions prior to Regular (non-practice) Trials

Musical Meaning (Affect)

This is a repeat of the second task. You will hear the same 16 sequences paired with
6 different word-scales. There will be 96 trials in all. Use the presented word-scale(s)
to indicate which affect/emotion you feel the musical sequence is intended to convey.
There will be a break between each word-scale. If you do not understand a word-
scale, feel free to ask the experimenter for clarification.
For example, you may see the instructions:

“Is the music trying to convey … ?

contentment 1 2 3 joy
… press any key to continue”

Once you have read and understood the word-scale, press any key to begin the
music. When making your judgment, try to use the full range of the scale from one to
three with a one (1) representing the first word (contentment), and a three (3)
representing the second word (joy).

Appendix 2
J. Salmon, B. Frankland Musical Background Questionnaire 1 of 2
Participant Number: Date:
Age: years Sex: M or F First Language English: Y N
Do you have Absolute (or perfect) pitch? Yes or No Handedness: R or L or A
How many hours per week do you listen to music?
By Choice hours/week (e.g., you select)
By Default hours/week (e.g., at work)

Musical Tastes and Preferences

Label  (Some) Examples
Classical Any/All
Renaissance (1450-1600) Dufay, Palestrina, Gregorian Chant, Gabrieli, Albeniz, Albioni
Baroque (1600-1750) Monteverdi, Purcell,Vivaldi, Rameau, JS Bach, Handel, Adam
Classical (1750-1830) CPE & JC Bach, Hayden, Gluck, Mozart, Schubert, Beethoven
Romantic (1700-1900) Chopin, Liszt, Rubinstein, Schumann, Berlioz, Brahms, Wagner
Modern (20th C.-21st C.) Bartok, Greig, Bellini, Reinecke, Stravinski, Debussy, Schoenburg
Opera S. Brightman, A. Bocelli, Irish Tenors, Charlotte Church
Easy Listening Nana Mouskouri, Ray Conniff, Henry Mancini, Perry Como
Folk Natalie MacMaster, Joan Baez, Crash Test Dummies
Country/Western Emmylou Harris, Dixie Chicks, Shania Twain
Blues Billy Holliday, Ray Charles, Ma Rainey, Corey Harris
Rhythm & Blues Buck 65, Me'Sell NdegeOcello, Alicia Keys
Jazz Any/All
Classic/Early St. Louis Band, Scott Joplin, King Oliver’s Creole Jazz Band
Big (Swing) Bands Benny Goodman, Dorsey Brothers, Duke Ellington, Glen Miller
Bebop (Bop) Charlie Parker, Lester Young, Thelonious Monk
Dixieland Eddie Cordon, Kid Ory, Bunk Johnson
Progressive Jazz Dave Brubeck, Miles Davis, Lenny Tristano
Rock Any/All
Soft Matchbox Twenty, Robbie WIlliams, Bruce Cockburn
Hard Bon Jovi, Limp Bizkit, Stratovarius, AC / DC
Classic Rod Stewart, Eagles, Peter Gabriel
Rock & Roll Beatles, Rolling Stones
Punk Rancid, No Doubt, Good Charlotte, Sex Pistols
Psychedelic Pink Floyd, Tangerine Dream
Alternative S. McLachlan, The Red Hot Chilli Peppers, R.E.M., Nirvana
Pop Sheryl Crow, Britney Spears, Roxette
New Age Enigma, June Tabor, David Sylvain
Dance Asian Dub Foundation, Mariah Carey, Moby
Reggae Bob Marley & the Wailers, Burning Spear, Lee Perry
Rap & Hip Hop Outkast, Rage Against the Machine, 2Pac
Christain/Gospel Amy Grant, The Statler Brothers, Alan Lomax

Other (please specify)

J. Salmon, B. Frankland Musical Background Questionnaire 2 of 2

Musical Experience/Training
Formal Lessons Play/Practice
Voice, Theory, Start End Freq. School Music Grade Start End Freq.
mn/yr mn/yr hrs/wk Year? (if appl.) mn/yr mn/yr hrs/wk