Interface, Vol. 21 (1992), pp. 135-148. 0303-3902/92/2101-0135 $3.00 © Swets & Zeitlinger

On the Spectral Analysis of Melody

Nigel Nettheim

ABSTRACT

The widely-known claim of Voss and Clarke (1978) that much music is well modelled by “1/f noise” is critically examined. Some new data are provided for classical music, yielding mainly negative conclusions. In the course of the investigation some of the problems of the statistical analysis of musical data are discussed.

Spectral analysis has long been applied in the study of the timbre of a given musical tone, but it is only more recently that attention was drawn by Voss and Clarke (1978) to the possibility of applying it to several features of the note-to-note progressions in a piece of music. That paper received the endorsement of Gardner (1978) and Mandelbrot (1983), and has since influenced composers of algorithmic music, including for example Bolognesi (1983) and Dodge and Bahn (1986), as well as theorists such as Boon et al. (1990). The claims, including in particular that “the frequency fluctuations of music ... have a 1/f spectral density at frequencies down to the inverse of the length of the piece of music” (Voss & Clarke, 1978, p.258), have so far not been challenged, and have been further presented by Voss (1988). In the present paper I first examine these claims and then offer some new data from 18th-19th century music.

REVIEW OF THE WORK OF VOSS AND CLARKE

Let us acknowledge the appealing character of the work of Voss and Clarke (hereafter abbreviated to Voss) and its stimulation of interest in a novel view of music. The criticism which follows is intended to be entirely constructive.

Data Acquisition

Voss's musical data were obtained by electronically monitoring a radio signal over a twelve-hour period. The varying amplitude of the signal reflected changes in the loudness of the music, a variable which will, however, not be pursued here.
The varying rate of zero-crossing of the signal was also recorded; as each group of two or more zero-crossings (the number depending on the timbre) corresponds to one cycle of sound pressure, it was assumed that this quantity “roughly follows the melody” (Voss, 1978, p.260), though no evidence was given for the reliability of this method. Considering that a melody is normally accompanied by other tones which will have their own zero-crossings, not necessarily in phase with those of the melody, and considering also that the timbre of the instrument(s) playing the melodies, or of words sung to them, will generally vary within and between pieces, it would be of interest to see experimental support for this method, by comparing its results with the printed scores. Until such evidence is provided, I believe the method should be considered suspect. The method also evidently assumes that the highest pitch at each point forms the melody, a somewhat limiting assumption as will be discussed later.

Manuscript received March 26, 1991.

Length of the Data Run

Voss accumulated data over a twelve-hour stretch, thus including a variety of composers, pieces, movements, and announcers' comments. By analysing the entire sample undivided he took the period of spectral components up to the average length of a piece, which however seems questionable. A single piece is normally the largest unit of artistic significance, excepting possibly the relatively uncommon groups of related “pieces” such as a song cycle. In most cases, once a piece is finished it is considered to have passed beneath the horizon, and a fresh one starts with the time reset to zero. Appending pieces of different lengths in the order chosen by the broadcaster gives rise to correlations whose musical significance is not clear.¹

Pitch, Rhythm and Melody

We next consider by what method suitable variables may be defined from an encoded melodic sequence.
Here a contradiction appears between Voss's method of analysis, on the one hand, and his method of synthesis (stochastic composition), on the other. For the melody analysed from the recorded signal (Voss, 1978, p.260) is evidently the resultant of both pitch-sequence and duration-sequence, which could be represented by a single graph in which the vertical axis is proportional to pitch and the horizontal to duration, as in my Fig. 1. In what follows, I will take this to be the meaning of “melody”. But melody has been synthesized (Voss, 1978, p.262) from a separate pitch-sequence and duration-sequence. Similarly in Gardner (1978, pp.24-28), which by acknowledgement is largely the work of Voss, each of the two components is plotted as a separate graph with uniform horizontal increment. Indeed, these graphs have been given the horizontal label “time” in error for “note number”, a very different quantity, as the cumulated time is formed not by uniform increments but according to the successive durations which occur in the composition.

No evidence has so far been provided to my knowledge that pitch and duration considered separately are 1/f processes, and it seems hard to know on what basis such a result has been incorporated into the synthesized music referred to. Indeed, as successive pitches occur in general at unequal time intervals, it is not clear what meaning could be attached to a spectrum of pitches arranged uniformly on the time axis. The meaning of a spectrum of durations seems even less clear, for a duration itself involves movement along the time axis. It is true that the analysis and synthesis of the separated processes is a simpler statistical task than that of a joint pitch/duration process.² But such a separation seems out of place as a model for most composers within the scope of the present paper; the possibility of it arose, if at all, only after Schoenberg in developments such as total serial music.³
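The difference between “note number” and cumulated “time” as horizontal axes can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than material from Voss or Gardner: the four-note fragment is invented, pitches are given as keyboard note numbers, durations are counted in units of a small basic duration, and the function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def onset_times(durations):
    """Cumulated onset time of each note, in units of the basic duration."""
    if not durations:
        return []
    ends = list(accumulate(durations))
    return [0] + ends[:-1]


def pitch_at(time, pitches, durations):
    """Pitch sounding at a given time on the melody graph (pitch against time)."""
    if len(pitches) != len(durations):
        raise ValueError("pitches and durations must have the same length")
    if not 0 <= time < sum(durations):
        raise ValueError("time lies outside the melody")
    onsets = onset_times(durations)
    # Index of the last note whose onset is not later than `time`.
    return pitches[bisect_right(onsets, time) - 1]


# Hypothetical fragment: invented values, for illustration only.
pitches = [60, 62, 64, 67]
durations = [4, 2, 1, 1]

# Each printed row gives: note number, cumulated onset time, pitch.
for note_number, onset in enumerate(onset_times(durations)):
    print(note_number, onset, pitches[note_number])
```

Plotted against note number, the four notes are evenly spaced; plotted against cumulated time, their onsets fall at 0, 4, 6 and 7 units, which is the horizontal axis assumed for “melody” in this paper.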
Stochastic Music

Voss (1978), Gardner (1978) and others have generated pitch/duration sequences stochastically according to assumed separate independent 1/f processes for pitch and duration and, for comparison, according to white noise and 1/f-squared processes.⁴ From what has been said above it is clear that the resultant processes do not have the properties of their two constituents. For example, the resultant of “white noise pitches” and “white noise durations” is by no means “white noise music”. In any case, listeners found the music derived from 1/f processes preferable, the white processes producing music considered too random and the 1/f-squared too highly correlated. However, in connection with the highly correlated music derived from 1/f-squared processes, one can guess that this would be undervalued when presented without the interest of appropriate harmony and metre which normally accompanies the composed music whose analysis produced the method of synthesis. A mere line of unaccompanied notes might need more variability to sustain interest.⁶ The purpose of the music should be considered too: a funeral march might produce higher correlation than a frenzied dance, but should not on that account be underrated as artistic music.

EXAMPLES FROM MUSICAL SCORES

Scope

In this section I set aside Voss's work and describe an investigation of the spectrum of melodies encoded from musical scores. The selections listed in Table 1 were chosen. It will be seen that I have considered single movements to have the appropriate scope for the present purpose, by contrast with Voss's twelve-hour run discussed earlier. It would of course be desirable to include more examples; however, not only is the task of data entry and checking fairly time-consuming, but also spectral analysis requires individual attention to each case and does not lend itself entirely to automation.
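The remark made under Stochastic Music, that the resultant of white-noise pitches and white-noise durations is not itself white, can be checked numerically. White noise has no correlation between successive values, so a clearly positive lag-one correlation in the resultant is enough to show the point. The sketch below is only an illustration under assumed conditions: the pitch and duration ranges, the seed and the function names are invented, and independent uniform random integers stand in for the “white” sequences.

```python
import random


def resultant(pitches, durations):
    """Repeat each pitch once per unit of its duration (uniform time steps)."""
    series = []
    for pitch, duration in zip(pitches, durations):
        series.extend([pitch] * duration)
    return series


def lag1_autocorrelation(series):
    """Sample correlation between each value and its successor."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    numerator = sum(a * b for a, b in zip(dev, dev[1:]))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


rng = random.Random(1992)  # fixed seed so that the run is repeatable
n_notes = 2000
# Assumed ranges, chosen only for illustration.
pitches = [rng.randint(48, 72) for _ in range(n_notes)]    # independent pitches
durations = [rng.randint(1, 4) for _ in range(n_notes)]    # independent durations

print(round(lag1_autocorrelation(pitches), 2))
print(round(lag1_autocorrelation(resultant(pitches, durations)), 2))
```

The first figure should lie close to zero, while the second should be substantially positive, because most adjacent time steps of the resultant fall within the same note.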
Despite the perceived artificiality of the separation of pitch and duration sequences, I have analysed them as well as the resultant melodies, in order to be able to comment on Voss's claims and on the stochastic composition applications referred to. The characteristics of the pitch and duration components of the examples are summarised in Table 2 after preliminary inspection of the scores. It is seen that the selection was made so as to give a variety of combinations of types of pitch and duration processes. The Bach selection has a strongly repetitive pattern, which one will expect to confirm by observing peaks in its spectrum.

Table 1. List of Musical Selections.

1. Bach       Prelude in D, Well-tempered Clavier I, No. 5.
2. Mozart     Piano Sonata in Bb, K. 570, 1st movement.
3. Beethoven  Piano Sonata in D, Op. 2 no. 2, 2nd movement.
4. Schubert   Piano Sonata in A, D. 664, 2nd movement.
5. Chopin     Etude in Ab, Op. Posth. No. 3.
6. Gardner    White music.
7. Gardner    1/f music.
8. Gardner    1/f-squared music.

Table 2. Characteristics of the Musical Selections.

Selection         Pitch                        Duration
1. Bach           Variable, repeated pattern   Smooth
2. Mozart         Variable                     Variable
3. Beethoven      Smooth                       Variable
4. Schubert       Variable                     Variable
5. Chopin         Smooth                       Smooth
6. White          Very variable                Very variable
7. 1/f            Intermediate                 Intermediate
8. 1/f-squared    Smooth                       Smooth

Data acquisition

My approach to data acquisition has been different from Voss's, discussed earlier: I have encoded the melodies directly from the scores. This approach, though apparently simple, conceals many problems. Melody may be clearly defined as a single sequence of pitches in cases such as liturgical chant, hymn tunes and folk-song, but by no means always in the case of Western art-music of the 18th-19th centuries, which is the scope of the present paper. Within that scope, there is at present no definition of melody allowing its satisfactory programmed extraction from an encoded score.
Features militating against automatic extraction include the following:

(1) A single line played by a single instrument or voice may be formed by movement between two or more melodic or accompanimental strands: the Bach solo violin sonatas furnish clear examples, and others abound. One then has not so much “the melody” as “the intertwined melodies”.
(2) Two or more contrapuntal lines may have equal claim as “the melody”, as in some of the Bach two-part inventions.
(3) The melodic line may move from one voice to another, possibly with overlap, as in the second subject group of the Mozart selection below.
(4) There may be passages of figuration not properly considered as melody, as in the Chopin Etude Op 25 No 12, where only one or two notes out of the sixteen in each bar belong to “the melody”.
(5) A rest might in some cases best be interpreted as if the previous note were prolonged through it (as it might indeed be by the piano's sustaining pedal), or in other cases as a “zero” pitch where the melody is therefore not comparably defined.
(6) A trilled note is represented by a single note-head on the page, and analytically it may sometimes best be regarded as a single entity; but in performance it is expanded into a number of tones.

Given the difference between artistic melody as properly understood and melody as it may be represented in a single sequence, a melodic encoding of a composition which has any of the above properties may be of questionable significance. A non-musical investigator may of course take a more detached view, but one must question what he is then measuring. I proceeded with the encoding nevertheless; of the present selections, one may find in the Mozart movement some of the problems of the extraction of the melodic line discussed above, whereas the remaining examples tend to minimize those problems.

Assumptions

Before proceeding with statistical analysis it is healthy to consider the underlying assumptions.
The extent to which the composition of artistic music is a suitable field for such analysis is debatable. A work of art is individual and, although generally related to a tradition, to some extent establishes its own terms of reference, rather than being a replication or the output of a production line. If the notes in artistic compositions are treated as a statistical phenomenon, a conflict may arise in the analysis or synthesis of an individual work to be appreciated with some autonomy. Here we can do no more than raise this question; see Meyer (1989, pp.57-65).

An underlying assumption of spectral analysis is the statistical stationarity of the given process. This means the constancy over time of the probability structure of the process, or informally that any one portion and any other non-overlapping portion have equivalent probabilistic properties. This assumption can hardly be made of a musical phrase, which typically follows a controlled course of development from its beginning towards a point of greater intensity and is followed by a cadence or resolution. For a similar reason a movement or piece of music can hardly be considered a stationary process.⁷ Still the shape of a musical unit is partly determined by its harmonic and tonal structure, and these would presumably account in part for the non-stationary component while melody might in general depart not too far from stationarity. Further exploration of this question would be worthwhile, but for now we will act as if stationarity applied to melody.

Numerical processing

The first step for each selection was the encoding in a computer file of what was judged to be the melodic line, in the format of the SCORE commercial music printing program. The encoding was checked both by printing it in musical notation and by playing it on the computer's speaker. Second, a custom computer program derived for each selection three files as follows.
(i) The sequence of pitches was expressed in Hertz with rests converted to a prolonging of the preceding note.⁸
(ii) The corresponding sequence of durations was expressed as integer multiples of a suitable small duration such as a sixteenth-note.
(iii) The resultant of the pitch and duration sequences was formed, each pitch number from (i) being repeated a number of times given by (ii).⁹

Finally, each of the three sequences for each musical selection was used as input to a program for spectral estimation.¹⁰

Results

The original melodic data are shown in Fig. 1. Points for orientation include the rising sequence just before the middle of the Chopin selection and the beginning of the recapitulation 3/5 through the Mozart. Simple though they are, these traces give a good bird's-eye view of the selections. The trace for “1/f-squared music” resembles in general terms those for Schubert and Beethoven, and indeed their spectra will be seen, below, to be fairly similar.

The spectral estimates for melody are shown in Fig. 2 on a double-logarithmic scale, the reason for which will become apparent when their slopes are studied; the vertical shifts, which have no numerical significance here, produce the same order of the selections as in Tables 1-2. The horizontal axis has been indicated in terms of musical note-values for the relevant periods. The maximum period for estimation is between three and four bars of music in each case, ensuring reasonable accuracy. The periods thus extend to the length of a typical musical phrase; the higher periods which Voss measured are precluded when the data run is limited to single movements.

The peaks in the Bach spectrum reflect the strong cyclical pattern in this music, the peak at the whole-note period, for example, corresponding to one-bar cycles. Some smaller spectral peaks, on the other hand, should be considered insignificant.
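The processing chain set out under Numerical processing (mean subtraction, a Fourier-based spectral estimate, and comparison of slopes on a double-logarithmic scale) can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for this paper: it uses a plain discrete Fourier transform in place of the fast algorithm, a simple moving average over up to 2k+1 neighbouring ordinates as a stand-in for the averaging step, and an ordinary least-squares fit for the slope. The function names and the two test series (independent Gaussian values and their running sum) are invented for the example.

```python
import cmath
import math
import random


def periodogram(series):
    """Raw spectral estimate at frequencies k/N (k = 1 .. N//2), mean removed."""
    n = len(series)
    mean = sum(series) / n
    centred = [value - mean for value in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        step = -2j * math.pi * k / n
        coefficient = sum(v * cmath.exp(step * t) for t, v in enumerate(centred))
        freqs.append(k / n)
        power.append(abs(coefficient) ** 2 / n)
    return freqs, power


def smooth(power, k):
    """Average each ordinate with up to k neighbours on either side."""
    smoothed = []
    for i in range(len(power)):
        window = power[max(0, i - k): i + k + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def loglog_slope(freqs, power):
    """Least-squares slope of log(power) against log(frequency)."""
    points = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if p > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def estimated_slope(series, k=2):
    """Slope of the smoothed spectrum on a double-logarithmic scale."""
    freqs, power = periodogram(series)
    return loglog_slope(freqs, smooth(power, k))


rng = random.Random(21)  # fixed seed so that the run is repeatable
white = [rng.gauss(0.0, 1.0) for _ in range(512)]  # uncorrelated values

walk = []          # running sum of the white series (a random walk)
level = 0.0
for shock in white:
    level += shock
    walk.append(level)

print(round(estimated_slope(white), 2))
print(round(estimated_slope(walk), 2))
```

The uncorrelated series should give a slope near zero, and the running sum, whose spectrum falls off roughly as 1/f-squared at low frequencies, a clearly negative one. Because the ordinates are uniformly spaced in frequency, the fit is dominated by the higher frequencies, so with a series of this length the fitted value is only a rough guide.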
As the spectral ordinates are estimated at uniformly spaced frequencies, greater detail appears at the higher frequencies on the logarithmic scale. Interesting periodicities involving multiples of three units, as in the Chopin example, require interpolation on the horizontal axis.¹¹

As mentioned earlier, the concept of the spectra of the separate pitch and duration sequences, especially the latter, would seem to be dubious. To complete the investigation of Voss's results, these spectra are nevertheless shown in Figs. 3 and 4. Comparisons can be made with the characteristics of each selection listed in Table 2. For the Bach and Chopin selections the durations are uniform, so that their pitch spectra are virtually the same as their melody spectra and their duration spectra are uniformly zero.

The main purpose of this exercise was to study the slopes of the spectra, and lines with a slope of -1 and -2, corresponding to 1/f and 1/f-squared processes respectively, are included in Figs 2-4 for comparison.¹² The comparison is not entirely easy to carry out by sight, the ideal aid being the “rolling ruler”. Let us look first at the artificial selections (numbers 6-8). The slopes of their pitch and duration spectra (Figs. 3 and 4) confirm their source, allowing for the considerable deviations to be expected from short series. Their melody spectra (Fig. 2) are intermediate, with the 1/f-squared one tending towards its name-sake; in particular, the melodic spectrum of the so-called “white music” is not flat, as is known from the earlier discussion.

We turn finally to the five composers. In the graphs for melody (Fig. 2), the Bach and Mozart spectra are intermediate between 1/f and 1/f-squared, the Beethoven, Schubert and Chopin close to 1/f-squared.
Certainly the notion of a 1/f process is to be taken as a tendency rather than as a literal prescription, but the present results do not support it well; on the basis of these data one would not claim that melody is a 1/f process. The pitch spectra (Fig. 3) of Bach and Chopin reproduce their melody spectra, as mentioned; the Mozart graph is not clearly linear at all; while the remainder appear to be intermediate. The duration spectra (Fig. 4) of all five composers, which have been calculated without the conviction of a reasonable interpretation to be placed upon them, are rather flat, contrary to the claim that the sequence of durations is a 1/f process.

CONCLUSIONS

The claim of Voss and Clarke that 1/f processes well represent pitch in music has been found in these preliminary studies of classical music to have only slender support, and the claim for duration must evidently be rejected. Some apparent confusion involving the separation of melodies into pitch sequences and duration sequences has been pointed out, and it is suggested that melody is more appropriately analysed as their single-sequence resultant, particularly if spectra are to be calculated. In the present studies of melodies so defined, the spectrum has been found to tend more towards the 1/f-squared than the 1/f function, for periods up to about four bars of music. More generally, the appropriateness of spectral analysis as a tool for music analysis seems so far undemonstrated. Hopes for the success of music generated stochastically in the manner which had been advocated would appear not to be well-founded, if the music is to be consistent with 18th-19th century models. Although these conclusions are on the whole negative, it is hoped that they may clear the way for work on other characterization having a stronger musical basis.

Figure 1.
Melodies of

Figure 2. Melody Spectra. (Double-logarithmic plot of Const + log spectrum against Const + log frequency, with reference lines of slope -2 and slope -1; traces labelled “1/f-squared”, “1/f”, “white”, Chopin, Schubert, Beethoven, Mozart, Bach.)

Figure 3. Pitch Spectra. (Double-logarithmic plot of Const + log spectrum against Const + log frequency, with reference lines of slope -2 and slope -1; traces labelled by selection.)

Figure 4. Duration Spectra. (Double-logarithmic plot with reference lines of slope -2 and slope -1; traces labelled by selection.)

NOTES

1. Not only piece-to-piece but also movement-to-movement relationships might be questioned, considering that even in masterpieces movements have occasionally been substituted by the composer, probably not preserving spectral properties; there is also the question of the appropriate treatment of the un-notated pause between movements.
2. A joint pitch/duration process is an example of a “point process with adjoined random variables” in the terminology of Yaglom (1986, p.34 Fig. 12). Such a process was analysed and synthesized by Spyridis and Roumeliotis (1983).
3. More explicitly, the typical subconscious process of a composer of the period under discussion is unlikely to be first to ask, “what pitch will I use next?” and then independently “what duration will it have?”; it would rather be “what musical gesture will I use next?”, where a musical gesture has as its basis a pattern formed from several successive pitches and durations jointly.
4. Informal definitions are as follows. If a series of numbers is regarded as formed from component cycles of various frequencies, its spectral density, or spectrum, measures the relative contributions at those frequencies. The term “noise” generally indicates erratic behaviour, whether of sound or of some other quantity, by contrast with a “signal” which it may accompany; here it simply indicates variable behaviour.
“White noise” has equal contributions from all frequencies, just as does the colour white [...]. Noise whose spectrum over some range of frequencies is approximately a function 1/f or 1/f-squared is so named. The two powers have prominence here both because of mathematical properties underlying these phenomena and because of their being widely observed in non-musical applications.
5. It is even possible that music formed as the resultant of white noise pitches and uniform durations sounds more “random” than with white noise durations. It may seem paradoxical that the introduction of rhythmic figures, even random ones, into a series of random pitches can produce music sounding more coherent, but this experiment can be tried by playing Gardner's or Voss's example with the durations replaced by uniform ones. It may indeed be impossible to define “random” music convincingly; see Boon et al. (1990, pp.5-6).
6. In plain-chant and folk-song the text and intended mood are important ingredients.
7. An exception might be made for some music of the Baroque period which is relatively undifferentiated during its course: music which runs along rather homogeneously until it stops, and is sometimes informally referred to as “sewing-machine” music.
8. The unit of measurement of pitch could alternatively have been taken as the note-number on a piano keyboard, which is proportional to the logarithm of the frequency in Hertz; this was found to make little difference to the results. The conversion of rests prevents misleading wild fluctuations in the numerical sequence.
9. By this method one sacrifices, apparently unavoidably, the differentiation between repeated notes and a single note having their combined duration. The reason for the conversion to repetitions of a small basic duration is that spectral analysis is always formulated in terms of a uniform time increment.
10. The program was derived from Press et al.
(1988), where an introduction to spectral methods is given. The fast Fourier transform method was preferred over the maximum entropy method, as sharp spectral peaks were not the main interest. Means were subtracted from each series. The number of points to be estimated and the bandwidth for averaging were chosen as in Table 3. (Here one would like to know similar details of Voss's spectral estimation procedure.)
11. The need for interpolation is a consequence of a peculiarity of the fast Fourier transform, which works in powers of two, and so does not provide estimates at periods of multiples of three data points. That transform, valuable as it is, thus has two left feet when faced with dance music!
12. If S denotes the spectrum, f the frequency, c a constant, and p a numerical power (special values of which are 1 and 2), it follows from the inverse power relation $S = c f^{-p}$ that the logarithms are linearly related with slope $-p$: $\log S = \log c - p \log f$.

Table 3. Details for the Analyses.

Columns: Selection; Minimum Duration; Tempo; Melody (N, M, K); Pitch & Duration (N, M, K).

1. Bach            toh 438055932528 32
2. Mozart          32nd 1270 S016. G43 LIB 3217
3. Beethoven       Woh = 128960 32S. 36816 TT
4. Schubert        Teh 383.1800 3227-358 1610,
5. Chopin          12h 162359 16103491610
6. Gardner white   toh 4808161625
7. Gardner 1/f     1h = 480443613122
8. Gardner 1/f-sq  ton 48084016 261227

Notes to Table 3
Minimum Duration = horizontal increment for melody.
Tempo = number of Minimum Duration notes per minute, estimated for selections 1-5 from recordings by Landowska, Schnabel, Gilels, Solomon, and Rosenthal, respectively. These tempi determined the horizontal scales for Fig. 1.
N = number of data points.
M = number of spectral ordinates estimated.
K: the number of data points used per estimated ordinate is (2K+1)M.
The approximate duration in minutes of a selection is N(melody) / Tempo.

REFERENCES

Bolognesi, T. (1983).
Automatic composition: Experiments with self-similar music. Computer Music Journal, 7(1), 25
Boon, J.-P., Noullez, A. & Mommen, C. (1990). Complex dynamics and musical structures. Interface, 19, 3-14.
Dodge, C., & Bahn, C.R. (1986, June). Musical fractals. Byte, pp.185-196.
Gardner, M. (1978, April). White and brown music, fractal curves and one-over-f fluctuations. Scientific American, pp.16-32.
Mandelbrot, B. (1983). The Fractal Geometry of Nature. New York: W.H. Freeman. (Re music: pp.374-375.)
Meyer, L.B. (1989). Style and Music: Theory, History, and Ideology. Philadelphia: University of Pennsylvania Press.
Press, W.H., Flannery, B.P., Teukolsky, S.A., & Vetterling, W.T. (1988). Numerical Recipes in C. Cambridge: Cambridge University Press. (With computer programs.)
Spyridis, H. & Roumeliotis, E. (1983). Fourier analysis and information theory on a musical composition. Acustica, 52, 255-256.
Voss, R.F. (1975). 1/f noise: diffusive systems and music. Unpublished doctoral dissertation, University of California, Berkeley.
Voss, R.F. (1988). Fractals in nature: from characterization to simulation. In H.-O. Peitgen & D. Saupe (Eds.), The Science of Fractal Images (pp.21-70). New York: Springer-Verlag.
Voss, R.F., & Clarke, J. (1978). “1/f noise” in music: Music from 1/f noise. Journal of the Acoustical Society of America, 63(1), 258-263.
Yaglom, A.M. (1986). Correlation Theory of Stationary and Related Random Functions I. New York: Springer-Verlag.

Nigel Nettheim
204A Beecroft Rd
Cheltenham NSW 2119
Australia

Nigel Nettheim was born in Sydney, Australia in 1940. The first phase of his studies was in statistics, and led to the degree of Ph.D. from Stanford University (1966), specialising in the spectral analysis of time series. The second phase of his studies was in music, and included four years of composition with Dr. Samuel Dolin of the Royal Conservatory of Music, Toronto, a former president of the International Society for Contemporary Music. Dr.
Nettheim's degrees in music are B.Mus. from the N.S.W. State Conservatorium of Music and M.Litt. in musicology from the University of New England. He is currently a Research Fellow in the Centre for Liberal and General Studies at the University of New South Wales, pursuing the relationship between the above two areas.
