
Which aspects of music can be described by quantitative models?

Music - Chaos, Fractals, and Information

Jan Beran

VOL. 17, NO. 4, 2004

"A repulsive monster, a wounded tail-lashing serpent, dealing wild and furious blows as it stiffens into its death agony." These were the comments of a critic after a performance of Beethoven's second symphony in the early 19th century. About Bartok's compositions, a critic wrote: "Some can be played better with the elbows, others with the flat of the hand. None require fingers to perform or ears to listen to." On August 1, 1919, The Musical Times in London said of Strawinsky's Sacre du Printemps: "The music of Le Sacre du Printemps baffles verbal description.... Practically it has no relation to music at all as most of us understand the word..."

These comments exemplify the fundamental problem underlying judgement and analysis of music, namely the absence of a clear objective criterion that would tell right from wrong. This is in contrast to genuinely quantitative sciences, such as physics or chemistry, where it may be assumed that a true answer exists and can be found by repeated experiments and appropriate theoretical modeling. In music, there is usually no clear definition of optimality or, if there is, no unique optimal answer exists. Moreover, not all questions in musicology can be answered by repeated experiments.

Consider, for instance, the task of finding musical structures in composed music. Traditional musical analysis starts with historic information that helps us to focus on structures that are likely to be present. For example, if we analyze a classical sonata, a harmonic analysis or an analysis of motifs is based on the well-defined form of a sonata. More difficult, but often also more interesting, is the question whether there is structure beyond the standard rules. It is, however, not clear a priori which of the structures that may be discovered by data analytical methods are musically important. For instance, one may ask why Beethoven's symphonies are coherent, how he sustains the suspense over long stretches of time, and how this should be reflected by a performance. With respect to performance, repeated experiments can be helpful for finding similarities and differences between different styles of performance. The question of how these are related to structures in the score is much more difficult. An appropriate analysis of the score is needed, and that is where the question of identification and quantification of relevant structures is most challenging.

The lack of precise definitions and objective functions, together with the emotional effect of music, has led to a tradition of purely qualitative, mostly descriptive, and occasionally rather emotional music reviews and a predominance of historic aspects in music theory. Nevertheless, throughout the centuries there were occasional attempts to gain a more quantitative understanding of music. The most obvious quantitative approach is due to the physical nature of performed music as a sound wave. For instance, the Pythagoreans in ancient Greece (fifth century B.C.) were aware of the musical significance, and apparently pleasant effect, of simple frequency ratios such as 2/1 (octave), 3/2 (fifth), 4/3 (fourth), etc. A systematic physical understanding of musical sounds and acoustics was initiated by pathbreaking contributions of the German physicist Helmholtz. Musical acoustics and sound engineering is now a well-developed scientific discipline (see e.g. Bailhache 2001 for a historical account of musical acoustics) with an abundance of commercial applications (music recording, computers, synthesizers, portable phones, digital television, computer games, etc.).

Acoustics is, however, only one aspect of music. Music is not just an arbitrary collection of sounds, but rather organized sound, or, as the German philosopher and mathematician Leibniz (1646-1716) put it: "Music is the arithmetics of the soul." Whether we listen to a sonata, a symphony, or an Indian raga, logical construction is an inherent part of music.

Standard musical techniques, such as retrograde, inversion, arpeggio, or augmentation, are mathematical functions. In the 20th century a number of composers, such as Schönberg, Webern, Bartok, Xenakis, Cage, Lutoslawski, Eimert, Kagel, Stockhausen, Boulez, and Ligeti, even used explicit mathematical formulas and ideas for their compositions, though perhaps it was psychologically not too clever to admit this publicly. The general audience tends to have a more romantic view of music and wants to relax. A sober explanation of the logical construction is likely to stigmatize a composition as purely intellectual. Musical Opinion (London) wrote in December 1949 about Schoenberg's Piano Pieces op. 11: "The Three Pieces, op. 11, are now more than 40 years old; they are rarely performed, which is not surprising, since no pianist of the first rank would bother to learn, or desire to inflict on his audience, such unrelievedly ugly and unrewarding abstractions... Schoenberg states: 'I write what I feel in my heart.' If this is really so, we can only assume that from 1908 or so, Schoenberg has been suffering from some unclassifiable and peculiarly virulent form of cardiac disease."

Also in the 20th century, a number of theoretical attempts were made to develop general mathematical foundations of music. In particular, in the last two decades, a number of modern mathematical and statistical methods have been developed (Mazzola 2002, Beran 2003). So, how much can music be understood in a quantitative manner? Completely or not at all? The truth probably lies somewhere in between: Some aspects of music may be described and understood by quantitative models, while other aspects may be less accessible to a mathematical approach. Here, and in the subsequent articles of this volume, some (but not all) of those aspects are discussed where quantitative investigations are possible.
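The earlier remark that retrograde, inversion, and augmentation are mathematical functions can be made concrete with a small sketch. It assumes, purely for illustration, that a motif is coded as a list of (onset time, pitch) pairs of integers, with pitch counted in semitones; the function names and the toy motif are hypothetical and not taken from any of the cited works.

```python
# A motif is a list of (onset, pitch) pairs; onset in beats, pitch in semitones.
# This coding and all names below are illustrative assumptions.

def retrograde(notes):
    """Play the motif backwards: reflect every onset around the last onset."""
    end = max(onset for onset, _ in notes)
    return sorted((end - onset, pitch) for onset, pitch in notes)


def inversion(notes, axis):
    """Mirror every pitch around the axis pitch."""
    return [(onset, 2 * axis - pitch) for onset, pitch in notes]


def augmentation(notes, factor=2):
    """Stretch the motif in time by an integer factor."""
    return [(onset * factor, pitch) for onset, pitch in notes]


motif = [(0, 60), (1, 62), (2, 64), (3, 60)]  # hypothetical four-note motif
print(retrograde(motif))
print(inversion(motif, axis=60))
print(augmentation(motif))
```

Retrograde and inversion are involutions in this coding: applying either of them twice returns the original motif, which is one simple sense in which these techniques behave like mathematical maps.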

1/f-noise, Fractals, and Chaos

In 1975, Voss and Clarke formulated a provocative statement that may be summarized in simplified form as follows: Music is 1/f-noise. Their conclusion is based on spectral time series analysis of recorded music. After eliminating high frequencies by low-pass filters, the frequency structure was analyzed by looking at the empirical spectrum (periodogram). The results indicated a pattern that seemed to be common to recorded music in general: The value of the spectrum increases with decreasing frequency f, and the increase is proportional to 1/f (hence the name 1/f-noise).

There are at least two reasons why this statement led and still leads to controversial discussions among musicologists. Firstly, 1/f-noise processes are purely random. In contrast, musical compositions are highly organized and presumably deterministic (except for aleatoric music). To associate music with a purely random object is therefore rather disturbing. A justification can be given as follows: Fitting a stochastic process can provide a descriptive summary of some essential features, even though the structure may be deterministic. For instance, if the same pattern were repeated exactly with frequency f=0.1 Hz (every 10 seconds), then a spectral analysis would reveal exactly this property in the form of a distinct peak at this frequency. More generally, a spectral analysis reveals how much of the observed signal is due to periodicities with specific frequencies. If repetitions are only approximate, then the peak in the spectrum is less pronounced. Now, in real music, patterns are rarely repeated exactly. Instead, variations are applied to both the patterns (e.g., melodies) and their frequencies. The empirical spectrum of such a disturbed periodicity then may resemble the empirical spectrum of a random process with similar properties. It can therefore be characterized by the spectrum of a matching random process. In this sense, it is meaningful to associate a recorded musical signal, or other musical data, with a random process. In other words, the random process (or its spectrum) summarizes the degree of variation and memory in a musical signal.

The second thought-provoking statement by Voss and Clarke is that all (low-pass filtered) music has the same spectrum, namely proportional to 1/f. This statement may indeed be too general. To investigate this, let us ask the following question first: Which aspects of a composition does recorded music represent? The sound wave of a musical performance is mainly determined by the notes that are played, the instruments, the way the instruments are played, and specific acoustic conditions (room, microphone, etc.). Musical instruments do not generate strictly periodic signals. The attraction of a Steinway piano or a Stradivari violin is due to a complex sound wave that changes in time. As it turns out, even the sound wave of a single note may look like 1/f-noise. The reason is that irregular slow changes imply a high contribution of low-frequency components in the signal, which in turn implies higher values of the spectrum at low frequencies. Figure 1 illustrates this effect for a harpsichord sound (one e is played). Plotting the logarithm of the empirical spectrum versus the logarithmic frequency shows, for small frequencies, a negative slope close to -1, which corresponds to 1/f-noise. This is the case, even though a trend function was subtracted beforehand to take care of the most obvious slow changes (Figure 1e).

Figure 1. Harpsichord sound wave, squared sound wave (power) on logarithmic scale, together with their aggregations and spectra.

Figure 2. Spectra (both coordinates logarithmic) for compositions by Bach, Scarlatti, Haydn, Mozart, Chopin, Brahms, Rachmaninoff, Prokofieff, and Beran.

Figure 3. Fitted values of power (α) of spectrum near origin, categorized according to date of birth of composer. The results are based on 60 compositions.

Now, one single note can hardly be called a composition! So, finding 1/f-noise-like behavior in recorded music could be due to the instrument(s) rather than the composer. In order to separate the abstract content from instrumental sounds, the notes in a score need to be analyzed by themselves. This is quite easy for monophonic music, as long as the duration of notes is ignored. Focusing on onset time and pitch only, notes can be coded as pairs of positive integers (x,y) with x=onset time and y=pitch. In the well-tempered tuning, an increase of pitch by one unit corresponds to an increase in frequency by a factor equal to the 12th root of two.

A difficulty arises in polyphonic music. If two or more notes occur at the same onset time, how do the notes add up? In recorded music, a natural superposition of notes is created automatically, since the different sound waves are added. For abstract notes, usual addition does not make sense. For instance, adding two notes y1=1 and y2=8 (y1 and y2 are a fifth apart) yields 9, which would be higher than each of them individually. Also, the average of 4.5 is meaningless and does not correspond to any note. Brillinger and Irizarry (1998) propose to use artificially created sound waves based on cosines multiplied by envelope functions. Essentially, this brings us back to recorded music where sound waves can be added without any problem. Results, however, again depend on the particular choice of envelope functions. An advantage over recorded sound waves is that the envelope functions can be designed such that their contribution to the low frequency behavior of the sound wave is negligible. An alternative approach that completely avoids confusion between score and instrumental sound, while avoiding the problem of superposition, is to define a simplified score.
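The pitch coding above can be checked numerically. The following sketch assumes twelve equal semitones per octave, so that a step of one pitch unit multiplies the frequency by the 12th root of two, and compares the resulting ratios with the simple Pythagorean ratios; the function name and the table of intervals are illustrative choices.

```python
from fractions import Fraction


def frequency_ratio(semitones):
    """Frequency ratio of two pitches that are `semitones` units apart."""
    return 2 ** (semitones / 12)


# Interval name -> (size in semitones, simple frequency ratio).
SIMPLE_RATIOS = {
    "fourth": (5, Fraction(4, 3)),
    "fifth": (7, Fraction(3, 2)),
    "octave": (12, Fraction(2, 1)),
}

for name, (steps, ratio) in SIMPLE_RATIOS.items():
    print(f"{name}: tempered {frequency_ratio(steps):.4f}, simple {float(ratio):.4f}")

# The two notes from the text: y1 = 1 and y2 = 8 are 7 semitones (a fifth) apart.
y1, y2 = 1, 8
print(y2 - y1, round(frequency_ratio(y2 - y1), 4))
```

Pitch units behave like logarithms of frequency: the difference y2 - y1 is musically meaningful (it determines the frequency ratio), whereas the sum y1 + y2 has no acoustic interpretation, which is the superposition problem described above.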
Beran (2003) considers the so-called upper and lower envelope, defined by the sequence of highest or lowest notes, respectively. Here, we look instead at an arpeggio simplification. This means that notes occurring at the same onset time are replaced by a sequence of the same notes played one after the other, ordered according to pitch (lowest first, highest last). Moreover, the exact spacing of onset times is ignored.

The question that we address now is: Does the sequence of notes resemble 1/f-noise? In a first step, high frequencies are eliminated by aggregation, i.e., by dividing the time axis into blocks of k observations and taking averages over each block (see e.g. Beran and Ocker 2001, Tsai and Chan 2004). Observed periodograms of the aggregated series and fitted spectra are shown in Figure 2 for the following compositions: (a) J.S. Bach: Prelude and Fugue No. 4 from Das wohltemperierte Klavier; (b) D. Scarlatti: Sonata K381; (c) J. Haydn: Sonata op. 31 (1st Mov.); (d) W.A. Mozart: Sonata KV 333 (2nd Mov.); (e) F. Chopin: Nocturne op. 32, No. 1; (f) J. Brahms: Hungarian Dance No. 3; (g) S. Rachmaninoff: Prélude op. 23, No. 2; (h) S. Prokofieff: Vision fugitive No. 14; (i) J. Beran: Piano concert No. 2 - Śānti (2nd Mov.) (Beran 2000). The fitted spectral densities are based on so-called SEMIFAR-models. In the SEMIFAR-approach, an overall estimated trend function is removed before the periodogram is calculated.

All plots in Figure 2 exhibit negative slopes, but not all of them are close to -1. This is confirmed by an extended analysis of 60 pieces by W. Byrd, J.S. Bach, D. Scarlatti, F. Couperin, J.-P. Rameau, L. van Beethoven, M. Clementi, J. Haydn, W.A. Mozart, Fr. Chopin, J. Brahms, C. Debussy, G. Fauré, S. Rachmaninoff, S. Prokofieff, and J. Beran, with estimated slopes varying between -0.08 and -1.44. Thus, not all compositions resemble 1/f-noise. One may say, however, that almost all compositions resemble 1/f^α-noise for some α > 0, and clear deviations from 1/f appear to be rare. Figure 3 shows boxplots of the estimated slopes -α, grouped according to the date of birth of the composer. Remarkable in particular is that, in the classical period (Haydn, Mozart, Beethoven), the slopes are almost identical for all compositions considered here, namely, very close to 1/f-noise. It may be worth investigating how much this may have to do with the classical form of a sonata. In our investigation, all pieces in this time period were of this form.

Intuitively, the results illustrate the fact that exact repetition is musically not interesting and therefore a spectral analysis shows random behavior. However, the amount of variation has to be limited so that we are still able to notice certain patterns and see connections between different parts. We therefore do not observe completely independent random events, which would correspond to 1/f^0-noise. Instead, even patterns that are far apart have a strong similarity, which corresponds to 1/f^α with a positive value of α. The amazing finding is that the degree of variation chosen by most composers coincides with α=1, which is at the border between stationarity and nonstationarity. For α < 1, 1/f^α-noise is stationary, which implies a certain degree of stability. The preferred degree of variation and long-term memory in music appears to be exactly at or close to the border α=1, beyond which the process would become unstable, in the sense that the probability distribution would be different at each time point.

Similar investigations can be carried out for other aspects of a score. For instance, Figure 4 shows spectra for time series derived from onset-time gaps between occurrences of the most frequent pitch. All three spectra in Figure 4 exhibit 1/f-shape near the origin.

Figure 4. Aggregated gaps between occurrences of the most frequent note modulo octave and observed 1/f-type spectra.

"...How anyone has the nerve to call that kind of stuff music I just do not know. None of it had any tune and most of it could equally well have been our fat old tabby cat jumping on and off the piano chasing after a marmalade jar cover..." In spite of this comment by a distressed music lover (in a letter to the editor of Musical Opinion in January 1961) who complained about Schönberg's piano music, Schönberg's music is in many ways a natural continuation of romantic music of the late 19th century rather than its destruction. In particular, he seems to have used the same amount of variation in terms of 1/f^α-behavior. For instance, the empirical spectrum of Schönberg's piano piece op. 19, No. 2 in Figure 5 is very close to 1/f-noise.
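As a rough illustration of the steps described above (arpeggio simplification, aggregation over blocks of k observations, and reading off the slope of the spectrum near the origin in log-log coordinates), here is a minimal sketch using only the standard library. It is not the SEMIFAR fit used for the figures: no trend is removed, and the slope is obtained by a plain least-squares regression of the log-periodogram on log-frequency over the lowest frequencies. All function names, the normalization of the periodogram, and the toy score are assumptions made for the example.

```python
import cmath
import math


def arpeggio(notes):
    """Turn (onset, pitch) pairs into one pitch sequence.

    Notes sharing an onset are played one after the other, lowest first;
    the exact spacing of onset times is then discarded.
    """
    return [pitch for _, pitch in sorted(notes)]


def aggregate(series, k):
    """Average over consecutive, non-overlapping blocks of k observations."""
    n_blocks = len(series) // k
    return [sum(series[i * k:(i + 1) * k]) / k for i in range(n_blocks)]


def periodogram(series):
    """Periodogram of the mean-centered series at Fourier frequencies m/n."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    freqs, power = [], []
    for m in range(1, (n - 1) // 2 + 1):
        coef = sum(x * cmath.exp(-2j * math.pi * m * t / n)
                   for t, x in enumerate(centered))
        freqs.append(m / n)
        power.append(abs(coef) ** 2 / (2 * math.pi * n))
    return freqs, power


def loglog_slope(freqs, power, n_low):
    """Least-squares slope of log(power) against log(frequency),
    using only the n_low lowest frequencies (a crude estimate of -alpha)."""
    xs = [math.log(f) for f in freqs[:n_low]]
    ys = [math.log(p) for p in power[:n_low]]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy score: a C-major triad followed by two single notes.
toy_score = [(0, 64), (0, 60), (0, 67), (1, 62), (2, 65)]
print(arpeggio(toy_score))  # [60, 64, 67, 62, 65]
```

For a real score one would chain the steps, for example `loglog_slope(*periodogram(aggregate(arpeggio(score), k)), n_low)`, where the block length k and the number of low frequencies n_low are tuning choices that the text does not specify. A slope near -1 would correspond to 1/f-noise; periodogram ordinates that are exactly zero must be excluded before taking logarithms.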
Figure 5. Arnold Schönberg, op. 19, No. 2 - Spectrum (in log-log-coordinates) of arpeggio-version.

On a qualitative level, the connection between music and 1/f-noise may also appear plausible via a possible connection with the mathematical definition of chaos. First of all, common 1/f^α-noise processes are fractals in the sense of Mandelbrot (1983), with the fractal dimension closely linked to α. Mathematical chaos refers to dynamic systems that are highly sensitive to initial conditions, and with (suitably defined) trajectories that define a fractal geometric object. With respect to sensitivity, music can also be called chaotic. This is one of the reasons why it takes so long to become a professional musician. Every little detail matters. A typical example is pointed out in Georgii's classical book on piano music. Figure 6 shows parts of a piece from Clementi's Gradus ad Parnassum and the beginning of Chopin's Etude op. 10, No. 1. In both cases, the same basic C-major chord is decomposed. But what a difference! Compared to Chopin, Clementi's chord-decomposition sounds ugly. The reason for the beautiful Chopin sound is the ordering of notes according to the sequence of overtones of C. Also, Chopin avoids the rough sound of the third in the C-major chord at the beginning.

Figure 6. Comparison of Clementi and Chopin: small change, great improvement.

Finally, perhaps a note of caution is needed at this point. The fact that most compositions are related to 1/f-noise does not mean that one can compose music by the simple device of a random 1/f-noise or fractal generator. The fractal character is only one of many aspects that define a composition. It is perhaps this misunderstanding that led to the failure of purely algorithmic music. However, fractals can certainly provide an interesting starting point for creative work. Impressive examples of such music are, for instance, György Ligeti's piano etudes and his concert for piano and orchestra.

One may say that fractal analysis provides a global measure of the coherence of a composition. Since music may be considered as transmission of information from the composer/musician to the listener, another global feature that may be defined is the information content (or entropy) of a composition. A well-known definition of information is Shannon's entropy. For a random experiment, it characterizes the average amount of information, or surprise, in the outcome of the experiment. If a random experiment can have a finite number of outcomes, then Shannon's entropy is maximal if all outcomes are equally likely. On the other hand, if only one outcome is possible, then entropy is zero, because we know the outcome even before carrying out the experiment. The usefulness of entropy in music depends on whether we are able to define suitable musical quantities for which probabilities or relative frequencies can be calculated.

Let us illustrate this by a simple example. In the well-tempered tuning, we may identify two values of pitch as harmonically the same if they differ by one or several octaves. Thus, we code pitch as an integer between 0 and 11. In algebraic terms this means that we consider integers modulo 12. For a given composition, we now count for each of the 12 categories how many times the corresponding pitch occurred, and calculate the entropy of the resulting distribution. This was done for 147 compositions, with dates ranging from the 13th to the 20th century. The composers are: Anonymous (dates of birth between 1200 and 1500), Halle (1240-1287), Ockeghem (1425-1495), Arcadelt (1505-1568), Palestrina (1525-1594), Byrd (1543-1623), Dowland (1562-1626), Hassler (1564-1612), Schein (1586-1630), Purcell (1659-1695), D. Scarlatti (1660-1725), F. Couperin (1668-1733), Croft (1678-1727), Rameau (1683-1764), J.S. Bach (1685-1750), Campion (1686-1748), Haydn (1732-1809), Clementi (1752-1832), W.A. Mozart (1756-1791), Beethoven (1770-1827), Chopin (1810-1849), Schumann (1810-1856), Wagner (1813-1883), Brahms (1833-1897), Fauré (1845-1924), Debussy (1862-1918), Scriabin (1872-1915), Rachmaninoff (1873-1943), Schoenberg (1874-1951), Bartok (1881-1945), Webern (1883-1945), Prokofieff (1891-1953), Messiaen (1908-1992), and Takemitsu (1930-1996).

Plotting entropy against the date of birth (Figure 7) shows a surprising dependence. After about 1400, entropy seems to be rising. Why this is so can be best understood by reordering the 12 categories of notes. In the western tonal system, tonalities can be ordered in a natural way according to the circle of fourths (Figure 8). Representing the frequencies of the thus ordered categories by star plots shows the following pattern (Figure 9): Up to the 19th century, the frequencies are high in the circle-of-fourths neighborhood of a central note, whereas they are very low for the other note categories.
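The entropy calculation described above (pitch reduced modulo 12, relative frequencies of the 12 categories, Shannon's entropy of the resulting distribution) can be sketched in a few lines. The base of the logarithm is not stated in the text; base 2 (entropy in bits) is an assumption here, as are the function name and the toy pitch sequences.

```python
import math
from collections import Counter


def pitch_class_entropy(pitches):
    """Shannon entropy (in bits) of the distribution of pitches modulo 12."""
    counts = Counter(p % 12 for p in pitches)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical examples: one repeated pitch class versus all twelve classes.
single_class = [60, 72, 48, 60]      # the same note in different octaves
chromatic = list(range(60, 72))      # each of the 12 classes exactly once

print(pitch_class_entropy(single_class))
print(pitch_class_entropy(chromatic))
```

The two examples correspond to the extreme cases mentioned in the text: entropy is zero when only one category occurs, and it reaches its maximum, log2(12), when all 12 categories are equally frequent.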
Figure 7. Entropy vs. date of birth.

Figure 8. The circle of fourths.

Figure 9. Star plots of note frequencies, with note categories ordered according to the circle of fourths.

The picture changes later, somewhat dramatically (e.g., Bartok), starting with Scriabin, who was one of the pioneers of atonal music. Due to the replacement of the tonal system by other principles, the circle of fourths lost its central role. The distribution of notes became less predetermined so that entropy increased. An even clearer temporal development can be seen when looking at the ratio of the entropy of frequencies in the circle-of-fourths neighborhood of the central note and the overall entropy (Figure 10). The ratio decreases in time, because the more a composition relies on the circle of fourths, the more uniform the distribution of notes is in the neighborhood of the central note. Thus, compared to the overall entropy, the local entropy tends to be high. This is not the case when the circle of fourths plays no role or a less prominent role.
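The reordering by the circle of fourths and the ratio of local to overall entropy can be sketched as follows. Several details are not specified in the text and are assumptions made only for this example: the central note is taken to be the most frequent pitch class, the neighborhood consists of the positions within a fixed radius of it on the circle, the frequencies inside the neighborhood are renormalized before the local entropy is computed, and logarithms are taken to base 2.

```python
import math
from collections import Counter


def circle_position(pitch_class):
    """Position of a pitch class (0..11) on the circle of fourths starting at 0.

    One step on the circle is 5 semitones; since 5 * 5 = 25 = 1 (mod 12),
    the position of pitch class p is 5 * p modulo 12.
    """
    return (5 * pitch_class) % 12


def entropy(weights):
    """Shannon entropy (in bits) of the distribution given by nonnegative weights."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def local_to_total_entropy(pitches, radius=3):
    """Entropy in a circle-of-fourths neighborhood divided by overall entropy."""
    counts = Counter(circle_position(p % 12) for p in pitches)
    by_position = [counts.get(i, 0) for i in range(12)]
    # Assumption: the central note is the most frequent category.
    center = max(range(12), key=lambda i: by_position[i])
    local = [by_position[(center + d) % 12] for d in range(-radius, radius + 1)]
    total_entropy = entropy(by_position)
    if total_entropy == 0:
        raise ValueError("total entropy is zero; the ratio is undefined")
    return entropy(local) / total_entropy


# C, F, B flat, E flat are consecutive steps on the circle of fourths.
print([circle_position(p) for p in (0, 5, 10, 3)])  # [0, 1, 2, 3]
```

With this ordering, the seven notes of a major scale occupy seven adjacent positions on the circle, which is why tonal pieces concentrate their note frequencies in a neighborhood of a central note.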

Score Information and Performance

Figure 10. Local entropy divided by total entropy, plotted against the date of birth of the composer.

An area of musicology where statistics plays a major role is performance theory. The reason is that repeated observations are available and experiments can be designed to answer specific questions. For music that is played from a score, the main question is: How does a performer translate information given in a score into a performance? The first question is how to quantify information contained in a score. Beran and Mazzola (1999; see also Mazzola 2002 and Beran 2003) encode structural information of a score by so-called metric, harmonic, and melodic weights or indicators. These curves characterize, for each onset time or even for each note, its metric, harmonic, and melodic importance. A slightly modified motivic indicator that takes into account a priori knowledge about important motifs in the score is defined in Beran (2003). Figure 11 shows motivic indicator functions for eight different motifs in Schumann's Träumerei.

Figure 11. Träumerei by R. Schumann: Motivic indicators.

These can be related to performance data in various ways. For instance, if the tempo of a performance is recorded, we may apply regression techniques or data sharpening. Score-related data sharpening can be done, for instance, by considering tempo values only at onset times where a structural curve exceeds a certain threshold. For example, Figure 12 shows tempo for onset times where the first motivic indicator exceeds its 90th percentile. The tempo curves used for this analysis were provided to us by B. Repp.

Figure 12. Träumerei by R. Schumann: Tempo sharpened by 90%-quantiles of main motif.

Based on the sharpened data, clear differences as well as similarities can be identified. The visual impression that Horowitz clearly differs from Cortot, as well as an amazing consistency of Cortot and Horowitz throughout several decades, is confirmed by a cluster analysis of the sharpened data (Figure 13). In contrast to a cluster analysis of the original data, clustering is associated with motivic elements of the score.

Figure 13. Träumerei by R. Schumann: Clusters of tempo curves, obtained after sharpening by 90%-quantiles of main motif.

Of course, in a score such as Schumann's Träumerei, several motivic streams are present simultaneously so that a unique causal attribution of tempo variations to one and only one motif is not possible. Nevertheless, the statistical analysis provides the means for finding possible explanations why a performer may slow down or accelerate at specific points. For instance, in a more complex analysis involving metric, harmonic, and melodic curves simultaneously, Beran and Mazzola derived the following approximate rule for Träumerei (see e.g. Beran 2003): Tempo decreases at onset times that are important from the point of view of harmony and melody, whereas it tends to increase for metrically important points.
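The score-related data sharpening described above, keeping tempo values only at onset times where a structural curve exceeds its 90% quantile, can be sketched as follows. The quantile definition (linear interpolation between order statistics), the function names, and the toy numbers are assumptions for illustration; they are not the motivic indicators or the tempo data used in the analysis.

```python
def quantile(values, q):
    """Empirical q-quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def sharpen(onsets, tempo, indicator, q=0.9):
    """Keep (onset, tempo) pairs where the indicator exceeds its q-quantile."""
    threshold = quantile(indicator, q)
    return [(t, v) for t, v, w in zip(onsets, tempo, indicator) if w > threshold]


# Hypothetical data: ten onsets, a tempo curve, and a structural indicator.
onsets = list(range(10))
tempo = [60 + i for i in range(10)]
indicator = list(range(1, 11))

print(sharpen(onsets, tempo, indicator))  # [(9, 69)]
```

Sharpened tempo curves from several performances could then be compared at the retained onset times only, for example as input to a cluster analysis.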

Music is a fascinating mixture of order and chaos. Quantitative musicology involves many scientific disciplines, because music is neither a purely physical nor a purely philosophical phenomenon. Music is not easy to quantify when quantification is aimed at gaining a better understanding of music. All this makes music an interesting and at the same time highly challenging topic for interdisciplinary research. In particular, due to its interdisciplinary nature, statistics plays an important part in this emerging field. Some of us may perhaps fear that music could lose its charm once it is explained by numbers. Yet, just like in other sciences, each new insight is likely to reveal more questions than answers, thus taunting the curious to further explore the mysterious nature of the universe.

References

Bailhache, P. (2001). Une histoire de l'acoustique musicale. CNRS Editions.

Beran, J. (2000). Śānti. col legno, WWE 1CD 20062.

Beran, J. (2003). Statistics in Musicology. Chapman & Hall, CRC Press, Boca Raton.

Beran, J. (1994). Statistics for Long-Memory Processes. Chapman & Hall, London.

Brillinger, D. and Irizarry, R.A. (1998). An investigation of the second- and higher-order spectra of music. Signal Processing, Vol. 65, 161-179.

Mandelbrot, B.B. (1983). The Fractal Geometry of Nature. Freeman & Co., San Francisco.

Mazzola, G. (2002). The Topos of Music. Birkhäuser, Basel.

Voss, R.F. and Clarke, J. (1975). 1/f noise in music and speech. Nature, Vol. 258, 317-318.

I would like to thank B. Repp for providing us with the tempo measurements.