A Real-Time Way to Turn Urban Environments into Music

Noah Vawter
Thesis Proposal for Degree of Master of Science, Fall 2005
Thesis Advisor
Chris Csikszentmihalyi
Professor of Media Arts and Sciences
MIT Media Laboratory
Thesis Reader
Barry Vercoe
Professor of Media Arts & Sciences
MIT Media Laboratory
Thesis Reader
Douglas Repetto
Director of Research
Columbia University, Computer Music Center
Table of Contents
Introduction, Motivation and Inspiration
Prior Explorations
Overview and Physical Description
Analysis, Processing and Synthesis
As human civilization devises ever more powerful machines, living among them may
become more difficult. We may find ourselves surrounded by incidentally created sounds and
noises, out of synchronization with our momentary needs and discordant. Currently,
legislating noise pollution is the only articulated solution and clearly it is not very effective.
Our impression of sound, however, may be mediated and manipulated, transformed into
something less jarring. So far, Walkmans and sound canceling headphones have done this,
isolating us from noise but also from one another. In their place, a next generation
headphone system is proposed which integrates environmental sound into a personal
soundscape. It allows one to synthesize music from environmental sound using a number of
digital signal processing (DSP) algorithms to create a sonic space in which the listener
remains connected with his or her surroundings, is also cushioned from the most harsh and
arrhythmic incursions and may also be drawn to appreciate the more subtle and elegant
Introduction, Motivation and Inspiration
The idea for this project came after experiencing a high-pitched squealing and
extended grating of the bus brakes in the city where I live. The squeal generated by the
friction between the brake pads and the rotors had a particular character. Similar in origin to
a bowed violin string, it held its pitch as the speed of the bus slowed, but resounded through
the sonically hideous frame and panels of the city bus instead of the critically-designed body
of a violin. Instead of the gingerly-spaced overtones of a stringed instrument gently
communicating a pitch to my mind, the inharmonic tones of the manmade vehicle posed a
question at the highest level of urgency: What is going on? I wanted to hear a response to
the blaring sound. Perhaps something soothing, like the exclamation of "Excuse me"
following a disgusting belch.
The bus sounded as if a dozen children with tin whistles each picked a random note
and piped away at it. It was highly dissonant, but composers and conductors are able to lead
instrumental sound into and out of dissonances with a terrific dynamic range. Louis Curran,
Professor Emeritus of Music Theory and Music History at WPI, organized compositional
styles by the way they resolve dissonance. Extrapolating, it is possible to create an intelligent
system that analyzes soundscapes and makes them more harmonious.
To do this, one must inquire: What partials are in the squeal? What harmony did it
resemble? What chord would resolve it? Further sounds among the urban cacophony may
be considered. Many have a tonal component, others are rhythmic. I propose the city may
be seen not in terms of the relationships of the people forming it or the treasures within it, but
by the vibrations of all its physical objects forming a continuous and haphazard concatenation
of disordered pitches. This is neither composed nor improvised, but may in some cases be
While nearly every sound means something useful to the person closest to it (if not in
physical proximity then in operation or design), in a densely populated environment, one is
destined by proximity to perceive meaningless emanations. We may have gotten used to this
routine cacophony, but this thesis proposes to explore what it would be like to order the
sound. Not to regulate every sound-causing movement within it, but to develop digital signal
processing (DSP) techniques to simulate human perception of a busy scene and reconfigure
To have complete control however would mean isolation, never needing to learn to
negotiate and never getting pleasantly surprised. Since the essence of city life is exposure to
and sharing a variety of ideas, even if some of them are not immediately palatable, elitism
behind a Walkman or sound-canceling headphones is unacceptable. Techniques for
cushioning the harshest sounds and adding variety to the monotonous ones are more enticing.
For example, why must every note sounded from the speaker of a reversing truck have the
same pitch? Why not transform every fourth one into a leading tone? Perhaps a bass line
could be added. The goal of this project is to reconfigure urban audio space into a personally
inspiring and soothing environment.
This thesis may lead to many things. It may inspire people to design more systems
that integrate ambient sound with musical compositions, recalling some of the original
impetus for making live music, before recording was possible. Also, it may encourage
designers of anything sound-emitting to more carefully consider interactions among sound-
emitters. For example, the keypad beeps of adjacent automated teller machines (ATMs)
could be altered to form a chord. Instead of varying continuously, elevator motors could run
at RPMs to form a simple motif. Maybe quieter shipping containers could be designed. And
quite simply, maybe it will make people more considerate to others.
Longer term, it would be nice if this thesis were to lead to a rise in flâneurs (urban
wanderers seeking novelty), because the remixed sound will encourage people to
re-experience familiar areas through the ears of the device, and lure them into exploring areas
they have never been. It may even influence the layout of modern cities. For example, it
might lead one to consider different, more consonant designs of audible systems like
crosswalk signals, automobile mufflers and subway tunnels. Once designers and planners
receive a vision of a more consonant-sounding city, it might encourage them to locate noisy
artifices like chemical plants away from residences. It might persuade city planners to
consider not only volume levels, but pitches in local sound ordinances. It might inspire
designers of large machines like construction equipment to consider auditory harmony and
rhythm in the mechanical operation.
Prior Explorations
In their paper "Smart Headphones," Sumit Basu and Alex Pentland at the MIT Media
Lab describe a reality-mediation project based on headphones, microphones and signal
processing [Basu 2001]. In the context of my project, their work is interesting because it
demonstrates a system with external cognition that shapes the perception of a wearer's
sonic environment. Their paper begins "Though our ears are wonderful instruments, there
are times when they simply cannot handle everything we need them to," which is similar to
the basis of my argument: In the millions of years of evolution leading to the construction of
my hearing system, the inharmonic sounds of arbitrary metal shapes have probably only
influenced the last 5000 years to even the tiniest degree. It is therefore a strain for the
human mind to interpret some of the new sounds.
However, Basu and Pentland's project resulted in a different system. It only lets in
human speech from the outside world, superimposing it over prerecorded music. This is an
improvement in certain situations, but in rich environments it censors too much interesting
information. It treads on the ideals of the flâneur who roams the streets in search of "bustle,
gossip and beauty." [Levi 2004] Along with the honking horns and crossing signals, one
would miss the ringing of bells, the clamoring sirens and warbling birds. It would overlook
cultural differences such as the distinction between the American Republic's sine wave
modulated police sirens and the European tritonic version.
Artists have also addressed some of these ideas. For example, Luigi Russolo wrote a
manifesto titled the Art of Noises in 1913 [Russolo 1913]. This brief document
circumscribes the sonic environment from "ancient life" until 1913, with prescriptions for the
future. It describes sound's evolution as ever-growing in complexity and from mystery to
ecstasy to tedium. He writes "For many years Beethoven and Wagner shook our nerves
and hearts. Now we are satiated and we find far more enjoyment in the combination of the
noises of trams, backfiring motors, carriages and bawling crowds." He beseeches
composers to break out of the monotony of the music of their time by recasting the sounds
around him into a composition with hand-built noise-generating instruments. He writes "We
are therefore certain that by selecting, coordinating and dominating all noises we will enrich
men with a new and unexpected sensual pleasure."
I agree with his sentiment completely - that taking control of the environment around
oneself and ordering it can be used to stimulate emotion. However, I choose to help the
noise become music, rather than perform concerts using those noises, which the Futurists
Another artist who examined the sounds of the city is Iori Nakai [Nakai 2003]. In 2003,
he demonstrated "Streetscape" in Linz. This is an interactive look at urban acoustics, but
no processing is involved. In this art piece, map representations of Tokyo and Linz are
presented to visitors along with a stylus. As the visitor moves the stylus over various parts
of the city, recorded sounds from that region play back. Thus the piece is an inversion of
this thesis. It bends the goals of the flâneur in the direction of voyeur, and therefore
isolation. It is relevant, however, for its presentation of sound as an exploration and a
choice. Similarly, my device will encourage one to experience portions of the city outside
one's vital paths and possibly to alter behavior to tend toward particularly exciting areas. In
contrast, my device will not enable one to do this anonymously, nor as rapidly.
Coincidentally, the concepts of anonymity and rapidity of access are keys to understanding
the modern debates over public photography and privacy.
Another artist who experimented with the mobile headphone/microphone combination is
Akitsugu Maebayashi [Maebayashi 2000]. In 2000, he exhibited a piece which sought to
process the environment although with very modest intentions. Rather than respond to the
environment in a manner based on human perception, it simply stepped through a fixed
sequence of echoes and reverberation, generally causing an even more disconcerting
effect than being in the city. Nevertheless, this piece is important for two reasons. First, it
confirms the appeal of a lively processed environment. Second, and most importantly, it contributes
the idea that one can compose a sequence of processing parameters, that the
transformation of an environment can have varying modes. This superimposes the idea of
a "song" onto the outside world, giving a more recognizable and repeatable construction.
One of the neat things about this is that a flâneur can, as with any good composition, learn
to anticipate changes in the sound, and relocate his physical presence to one where the
processing will be especially appropriate.
In 2005, I demonstrated a project called "Sonic Authority" [Vawter 2005]. In Sonic
Authority, manmade machines with periodic waveforms such as air conditioners, electric
power transformers and unidentified telephone pole equipment were analyzed to determine
the dominant perceived tone. Permanent, official-looking tags were printed and affixed on or
near the devices declaring the machines' contribution to the audible scene.
Finally, I was inspired by a segment I heard on a radio show in which a man walked
around New York City with a cardboard tube. He had calculated the resonant pitch of the
tube to be B-flat and was letting people listen to the sounds of the city as they were
sustained by the tube.
Related Psychoacoustical Research
The proposed device intends to manipulate images of the surrounding sonic
environments into tonal music. To do this, it will measure the sampled sound's features, then
process the sound numerically so it more closely follows the pre-composed musical structure
stored inside. Not every musical characteristic humans perceive can be reliably computer
calculated, but it is an active area of research. The most important musical characteristics in
this context are volume, key, dissonance, and tempo.
Although it may seem simple, the impression of volume has some important nuances.
This is evident in the well-known Fletcher and Munson curves, and masking makes it even
more difficult to predict [Dowling 1986]. Detection of key through integration of
overtones is nicely explained in Wei Chai's paper Automated Analysis of Musical Structure
[Chai 2005]. In 1722, Jean-Philippe Rameau offered a very early look at dissonance, as derived from
the ratios of harmonic instruments (without regard to tonality) [Rameau 1722]. The topic of
dissonance and its role in composition was important throughout Western music and is the
topic of numerous textbooks such as Harmony [Piston 1941]. In the second half of the 20th
century, computation of dissonance for arbitrary groups of nonharmonic tones was examined
in the often referenced paper "Consonance Theory Part II: Consonance of Complex Tones
and Its Calculation Method" [Kameoka 1969]. Autodetection of tempo is notably examined
in "Tempo and Beat Analysis of Acoustic Musical Signals" [Scheirer 1998]. Finally, in addition
to musical features, the relationship between dissonance, spectra and the construction of
unevenly spaced scales gets intense scrutiny in Tuning, Timbre, Spectrum, Scale [Sethares
Overview and Physical Description
The hardware produced will be a small package which fits in the pocket of the flâneur's
clothing. It will listen to the environment surrounding the flâneur and transform it according to
the schemas of a short album of songs. It will have an on/off switch, volume control and a
"tuning adjustment." A pair of over-the-ear headphones on a single cable will be attached.
In addition to the headphones' speakers, one small microphone will be mounted on the
outside of each ear.
The microphones' purpose is to produce an image of the audible world surrounding the
listener. The tuning adjustment is to select modes or songs. The progression through the
album will be similar to an Extended Play record, yet implemented as an analog radio-style
tuning knob with simulated static in order to underscore the concept of a graded integration
with the environment, as opposed to the digital in/out of CD player tracks. Furthermore, the
fake static sustains the impression of the music and sound coming from "out there" rather
than "in here."
The hardware will most likely be implemented using a development board based on the
Analog Devices (ADI) Blackfin Digital Signal Processor (DSP). This is desirable because it
has a large amount of processing power (1.5 billion multiply and accumulate operations per
second is considered large for a portable system by today's standards), can be
programmed in C++ and has the Linux environment available for it. At present the
VicCore53M development board from Voice Interconnect, a German company, is desired.
The hardware will also have a stereo codec and 1/8" jacks for headphone output and
microphone input. Custom modification will be necessary to implement the tuning knob.
Based on unfortunate experiences with Chiclet, the DSP Musicbox, a protective case will be
designed to cover the printed circuit board.
One goal of this project is to subtly mix fantasy and reality. This will be done by carefully
considering which signal processing algorithms to apply. For example: When it is
necessary to produce as realistic-sounding an environment as possible, algorithms such as
equalization, linear filtering (FIRs and IIRs), pitch-scaling, and sampling will be used. In
lesser measure, unnatural sounding algorithms such as waveshaping distortion, bit
reduction, ring modulation, aliasing and wavetable/additive synthesis will be used.
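As a rough illustration of the "unnatural sounding" end of this palette, the following Python sketch applies ring modulation and bit reduction to a block of samples, with a wet/dry mix to control how much fantasy is blended into reality. The function names, the floating-point sample format in [-1, 1] and the 44.1 kHz default are assumptions made for illustration only; the device itself is proposed to run fixed-point code on the DSP.

```python
import math


def ring_modulate(samples, carrier_hz, sample_rate=44100):
    """Multiply the input by a sine carrier (ring modulation)."""
    return [x * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
            for n, x in enumerate(samples)]


def bit_reduce(samples, bits):
    """Quantize samples in [-1, 1] to a coarse grid of 2**(bits-1) steps per unit."""
    levels = 2 ** (bits - 1)
    return [max(-1.0, min(1.0, round(x * levels) / levels)) for x in samples]


def mix(dry, wet, wet_amount):
    """Blend the untouched (dry) and processed (wet) signals.

    wet_amount = 0.0 keeps reality, 1.0 keeps only the processed sound.
    """
    return [(1.0 - wet_amount) * d + wet_amount * w for d, w in zip(dry, wet)]


if __name__ == "__main__":
    # A short 220 Hz test tone standing in for microphone input (assumption).
    tone = [math.sin(2.0 * math.pi * 220.0 * n / 44100) for n in range(441)]
    fantasy = bit_reduce(ring_modulate(tone, carrier_hz=30.0), bits=4)
    blended = mix(tone, fantasy, wet_amount=0.25)
    print(len(blended), max(blended))
```

Keeping the wet amount small corresponds to using these effects "in lesser measure," so the listener still recognizes the source sound.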
Another goal of this project is to ensure that it is not overly singular. This means that in
the near future, someone else should be able to pick it up and create their own algorithms
with it. Therefore, an extension language will be created for operating on the environmental
noise. It will be similar to Csound, but have much higher level primitives. The extension
language is intended to survive the physical project.
Analysis, Processing and Synthesis
The analysis routines will get their input data from the Analog to Digital Converters
(ADCs). All signal paths will be stereo, fixed-point data at 44.1 kHz rate for high quality. Data
will be operated on in windows whose size will be picked based upon experimentation, since
there is a system of tradeoffs between immediacy/delay, processing efficiency and processing
Analysis and synthesis are operations which require computational resources, which
are typically measured in percent of CPU usage or number of cycles per second of run time, e.g.
MIPS (Millions of Instructions Per Second). Given the limited number of MIPS, allocation
decisions must be made. Since the device will produce a variety of different outputs, it will
utilize a number of synthesis techniques, each of which will require a varying number of
MIPS. For simplicity, the analysis and synthesis routines will each be required to utilize less
than 50% of the available MIPS. It is expected that the analysis routines will always occupy
the same number of MIPS.
The basic flow of signal processing will traverse a simple network of analysis,
processing and synthesis modules. See Figure 1.
Illustration 1: Block Diagram of Signal Processing Flow
The analysis routines form a mostly sequential processing chain, with outputs taken at
each link available to the main sequencer. In the first step in the chain, the sound will be
filtered using the Inner/Outer ear transform as in "Skeleton" [Jehan 2004]. This equalization
stage is done when processing microphone input to more closely resemble the audio a
human ear would hear. It may also be tweaked to account for the transfer function of the
microphones. Following the E.Q., one stream will be sent to a beat detection module, which
will supply the main sequencer with tempo and rhythm data. The equalized sound stream will
also be continuously supplied to the Fast Fourier Transform routine. The frequency domain
data will be supplied to Kameoka and Kuriyagawa's dissonance measurement algorithm
[Kameoka 1969]. The frequency domain data will also be supplied to the Dominant Pitch
Analysis module. The dominant pitch data forward data to the Chromagram computation
In the previously mentioned work Sonic Authority, the analysis began with samples of
each device. For noise immunity, long, 30-second windows were used. Next, the Fast
Fourier Transform (FFT) was computed. This resulted in a spectrum with about 1,000,000
bins (44100 samples per second × 30 seconds). To transform this into a dominant frequency,
the bins were used to compute 121 sums, one for each step of the audible 10-octave
chromatic scale. Each sum indicates the relative dominance of one note. For example, to
find the dominance of note A-4, the total of every bin whose frequency is within 25% of an
integer multiple of the 440 Hz fundamental is summed. The dominance levels are then
compared, and the most dominant is reported. This method is similar in spirit to computing
the Chromagram [Chai 2005] and computing the Constant Q Transform [Brown 1991]. For
example output, see Figure 2.
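The note-sum idea described above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not in the original work: a short 8192-sample window instead of 30 seconds, a scale whose lowest step is C0 (about 16.35 Hz), a tolerance read as 25 cents around each harmonic, a limit of eight harmonics, and a 1/k weight on the k-th harmonic. The weighting is an addition here, since a plain sum would let every subharmonic of a tone score at least as high as the tone itself. All function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 44100
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
C0_HZ = 440.0 * 2.0 ** (-57 / 12)  # assumed lowest step of the 121-note scale


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitude_spectrum(samples):
    """Hann-window the block and return magnitudes of the bins below Nyquist."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(samples)]
    return [abs(c) for c in fft(windowed)[: n // 2]]


def note_dominance(mags, n_fft, cents=25.0, max_harmonic=8):
    """Return 121 sums, one per chromatic step over ten octaves.

    Each sum collects the bins lying within `cents` of an integer multiple
    of the note's fundamental. The 1/k weighting and the harmonic limit are
    assumptions of this sketch, not part of the described method.
    """
    bin_hz = SAMPLE_RATE / n_fft
    ratio = 2.0 ** (cents / 1200.0)
    sums = []
    for step in range(121):
        f0 = C0_HZ * 2.0 ** (step / 12)
        total = 0.0
        for k in range(1, max_harmonic + 1):
            lo = math.ceil(k * f0 / ratio / bin_hz)
            hi = min(math.floor(k * f0 * ratio / bin_hz), len(mags) - 1)
            for b in range(lo, hi + 1):
                total += mags[b] / k
        sums.append(total)
    return sums


def dominant_note(sums):
    """Name the chromatic step with the largest sum, e.g. 'A4'."""
    step = max(range(len(sums)), key=sums.__getitem__)
    return NOTE_NAMES[step % 12] + str(step // 12)


if __name__ == "__main__":
    n = 8192
    # Synthetic stand-in for a recorded machine hum (assumption).
    hum = [math.sin(2.0 * math.pi * 440.0 * i / SAMPLE_RATE)
           + 0.5 * math.sin(2.0 * math.pi * 880.0 * i / SAMPLE_RATE)
           for i in range(n)]
    print(dominant_note(note_dominance(magnitude_spectrum(hum), n)))
```

With a window this short the lowest octaves cannot be resolved, because several notes fall between adjacent bins. That is one reason a long window, as used in Sonic Authority, helps.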
Once computed, the dominant frequency spectrum is of great usefulness. Its outputs
can be readily applied to computing the key of the piece. This is a useful piece of information
because it can inform how to harmonize. In practice, the precision of the spectral dominance
algorithm varied with the sampled location. Some sounds resulted in quite narrow bands,
and it was possible to name the dominant pitch by finding the maximum value on the
121-value graph. Other sounds produced small clusters of dominance, from 3-9 semitones wide,
whose amplitudes were within 5% of each other. Such clusters are highly dissonant, and it is
the goal of this project to turn such dissonance into music and improve the quality of the
algorithm. There are many ways to interpret such results and one of the goals is to explore
Techniques for the system could come from many places. They will be both
discovered and inspired from other musicians. For example, jazz musician Thelonious Monk
would play a cluster of semitones, then release all but one key, creating a very dissonant
attack on an otherwise normal note. To mimic this effect inside the listener's environment,
two enveloping filters would be employed. First, the cluster of notes would be attenuated with
either an array of comb or notch filters. This would virtually eliminate the dissonant sound
from the environment. Then, to sustain connection with the listener's environment, a second
filter or filter bank would be used to isolate only one note from the cluster at a time and remix
it in. Furthermore, the reintroduced note could be varied with time, creating a melodic line.
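The two-filter idea can be sketched in Python as follows. This is only an illustration: it uses second-order (biquad) notch and band-pass sections with the widely used Audio EQ Cookbook coefficient formulas, floating-point samples, and a fixed Q of 30, all of which are assumptions rather than design decisions of the proposal. The function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 44100


def _biquad(samples, b, a):
    """Run a direct-form I biquad; b and a are (b0, b1, b2) and (a0, a1, a2)."""
    b0, b1, b2 = (c / a[0] for c in b)
    a1, a2 = a[1] / a[0], a[2] / a[0]
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def _alpha_cos(freq_hz, q):
    w0 = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    return math.sin(w0) / (2.0 * q), math.cos(w0)


def notch(samples, freq_hz, q=30.0):
    """Remove a narrow band around freq_hz."""
    alpha, cw = _alpha_cos(freq_hz, q)
    return _biquad(samples, (1.0, -2.0 * cw, 1.0),
                   (1.0 + alpha, -2.0 * cw, 1.0 - alpha))


def bandpass(samples, freq_hz, q=30.0):
    """Pass a narrow band around freq_hz with unity gain at the center."""
    alpha, cw = _alpha_cos(freq_hz, q)
    return _biquad(samples, (alpha, 0.0, -alpha),
                   (1.0 + alpha, -2.0 * cw, 1.0 - alpha))


def cluster_to_single_note(samples, cluster_hz, keep_hz, q=30.0):
    """Notch out every note of the cluster, then remix one isolated note."""
    removed = samples
    for f in cluster_hz:
        removed = notch(removed, f, q)
    kept = bandpass(samples, keep_hz, q)
    return [r + k for r, k in zip(removed, kept)]


def tone_level(samples, freq_hz):
    """Estimate the amplitude of a sinusoid at freq_hz by correlation."""
    w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    acc = sum(x * cmath.exp(-1j * w * n) for n, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


if __name__ == "__main__":
    cluster = [440.0, 466.16, 493.88]  # A4, A#4, B4 as a stand-in cluster
    noise = [sum(math.sin(2.0 * math.pi * f * n / SAMPLE_RATE) for f in cluster)
             for n in range(SAMPLE_RATE)]
    out = cluster_to_single_note(noise, cluster, keep_hz=466.16)
    tail = out[SAMPLE_RATE // 2:]  # skip the filters' start-up transient
    for f in cluster:
        print(f, round(tone_level(tail, f), 2))
```

Changing `keep_hz` from block to block would give the time-varying reintroduced note mentioned above. A higher Q isolates the kept note more cleanly from its semitone neighbors, at the cost of longer ringing.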
Another response to the dissonant sound would be to harmonize with it. This is the
response offered by Kelly Dobson in "Machine Therapy" [Dobson 2002]. In her project, the
human listener harmonizes with a machine's movements and audible vibrations. Computer
musicians have taken all kinds of approaches toward autoharmony. One area to explore is
when to mix in realistic vs. fantastic instruments. A realistic instrument would have a similar
harmonic spectrum to the original. Wei Chai's paper, for example, describes the comparison
of odd and even harmonic levels to get an image of timbre.
Illustration 2: Dominant Pitches in Unidentified Telephone Pole Equipment
Another set of techniques is based on William Sethares' ideas [Sethares 2005]. He
examines the peculiar spectra of naturally-occurring rocks, describing how K&K's dissonance
algorithm informs a particular musical scale. It would be possible to automate his
methodology in order to create scales in real time. After calculating the non-chromatic scale,
an additive synthesis method would be used to play melodies using the new instrument. This
is an important reference point for analyzing manmade noise because often physical design
of machines such as vehicle transmissions create similar groups of sounds whose
fundamental frequencies scale at the same rate, but are inharmonic.
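One way such a scale calculation might be automated is sketched below in Python. Note the substitution: instead of the Kameoka and Kuriyagawa calculation, this sketch uses the simpler parameterized Plomp-Levelt roughness curve popularized in Sethares' work, with amplitudes combined by product. The constants, the function names and the six-partial harmonic test spectrum are assumptions for illustration. Scale steps are taken to be the local minima of the dissonance curve obtained by sounding a spectrum against a transposed copy of itself.

```python
import math


def pair_dissonance(f1, a1, f2, a2):
    """Roughness of two partials (parameterized Plomp-Levelt curve)."""
    lo, hi = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * lo + 19.0)  # scales the curve with register
    d = hi - lo
    return a1 * a2 * (math.exp(-3.5 * s * d) - math.exp(-5.75 * s * d))


def total_dissonance(partials):
    """Sum the roughness over every pair of (frequency, amplitude) partials."""
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            total += pair_dissonance(*partials[i], *partials[j])
    return total


def dissonance_curve(spectrum, ratios):
    """Dissonance of the spectrum sounded with a copy transposed by each ratio."""
    return [total_dissonance(spectrum + [(f * r, a) for f, a in spectrum])
            for r in ratios]


def scale_steps(ratios, curve):
    """Return the ratios at local minima of the curve (candidate scale steps)."""
    return [ratios[i] for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]]


if __name__ == "__main__":
    # Harmonic spectrum used only as a familiar test case (assumption);
    # a measured machine spectrum would be substituted here.
    spectrum = [(261.6 * k, 1.0) for k in range(1, 7)]
    ratios = [1.0 + i / 100 for i in range(101)]
    print(scale_steps(ratios, dissonance_curve(spectrum, ratios)))
```

For an inharmonic spectrum, such as one measured from a vehicle transmission, the minima would generally fall at intervals other than those of the usual chromatic scale, which is the property the proposal hopes to exploit.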
December 2005 - Initial development activities such as porting Linux to the VicCore
development board will take place.
January 2006 - The hardware, including headphones, microphones, tuning knob, case
and battery system will be constructed.
February 2006 - The initial DSP modules and extension language will be developed.
Demos of each algorithm will be presented to readers for critique.
March 2006 - The system will be tested extensively in several cities. An online audio
journal will be kept in several cities for readers to critique.
April 2006 - Writing the thesis will begin.
May 2006 - Writing the thesis will continue.
VicCore 53x-OEM Blackfin DSP Development Board
IGLOO Parallel Port ICE (In Circuit Emulator)
A reformulated Walkman-like device that transforms the sonic environment into music.
New algorithms to transform disordered manmade noise into music.
An evaluation of which algorithms are best suited to the goals.
[Basu 2001] Basu, S. and Pentland, A. (2001) "Smart Headphones," Proceedings of
CHI 2001, Seattle, WA.
[Levi 2004] Levi, Lawrence. (2004) Flaneur magazine, 2004. Brooklyn, New York.
[Russolo 1913] Russolo, L. (1913) "The Art of Noises." Translated by Robert Filliou
1967, Great Bear Pamphlet, Something Else Press.
[Nakai 2003] Iori Nakai. (2003) Streetscape. Ars Electronica, Linz 2003.
[Maebayashi 2000] Maebayashi, Akitsugu. (2000) "Sonic Interface", exhibited in Tokyo.
[Vawter 2005] Vawter, Noah. (2005) "Sonic Authority." Exhibited in Cambridge.
[Dowling 1986] Dowling and Harwood, (1986) Music Cognition, Academic Press,
Orlando. pp. 46-49.
[Chai 2005] Chai, Wei. (2005) "Automated Analysis of Musical Structure."
[Rameau 1722] Jean-Philippe Rameau (1722). Treatise on Harmony, translated from
French, Dover Press, New York. p. 29.
[Piston 1941] Piston, Walter. Harmony. (1941) Norton Press, New York.
[Kameoka 1969] A. Kameoka and M. Kuriyagawa, (1969) "Consonance Theory Part II:
Consonance of Complex Tones and Its Calculation Method", The
Journal of the Acoustical Society of America, 1969b, Vol. 45(6),
[Scheirer 1998] Scheirer, Eric. (1998) "Tempo and Beat Analysis of Acoustic Musical
Signals." J. Acoust. Soc. Am. 103:1 (Jan 1998), pp 588-601.
[Sethares 2005] Sethares, W. (2005) Tuning, Timbre, Spectrum, Scale, 2nd edition.
Springer-Verlag London. pp 139-144.
[Jehan 2004] Jehan, Tristan. (2005) "Skeleton Computer Software."
[Brown 1991] Brown, J.C., (1991). "Calculation of a Constant Q Spectral
Transform" J. Acoust. Soc. Am. 89 425-434.
[Dobson 2002] Dobson, Kelly. (2002). "Machine Therapy Session."