
THE UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF ELECTRICAL ENGINEERING AND
COMPUTER SCIENCE AND ENGINEERING

Artificial Evolution for Sound Synthesis

Jonathon Crane (2172214)


Bachelor of Engineering (Computer Engineering)
July 5, 1996

Supervisor: Andrew Taylor


Assessor: Tim Lambert
Abstract
A technique of Artificial Evolution is applied to Sound Synthesis algorithms with
the goal of producing an effective, intuitive way of searching the space of possible
sounds.
Interactive Evolutionary Algorithms (IEA’s) are optimisation techniques inspired
by the process of biological evolution. In this study, an IEA is applied to a Frequency
Modulation (FM) synthesis algorithm. This produces a tool which allows users to
explore, sculpt and evolve sounds without any knowledge or understanding of the
underlying algorithm.
To determine the effectiveness of the system, seven different users compared three
different exploration techniques: manually adjusting parameters of the algorithm; ran-
domly adjusting parameters; and using the IEA to adjust the parameters.
It was discovered that the random method performed best, but did not offer the
degree of control required by users. The manual method provided this fine grained
control, but was slow and difficult to use - it required the users to understand the
underlying FM algorithm. In contrast, the IEA method provided all the advantages
of the random method, combined with the control of the manual method. It enabled
users to rapidly locate good sounds and then refine them as necessary, even with no
knowledge or understanding of the FM synthesis algorithm.
Contents

1 Introduction
1.1 Motivation
1.2 Goals
1.3 Objectives
1.4 Outline

2 Previous Work
2.1 Introduction
2.2 Sound Synthesis
2.2.1 Describing Sound
2.2.2 Synthesis Algorithms
2.2.3 Additive Synthesis
2.2.4 Subtractive Synthesis
2.2.5 Frequency Modulation
2.2.6 Sampling
2.2.7 Other Methods
2.3 Evolutionary Algorithms (EA’s)
2.3.1 Genetic Algorithms (GA’s)
2.3.2 Evolution Strategies (ES’s)
2.3.3 Evolutionary Programming (EP)
2.4 EA’s for Sound Synthesis
2.5 Interactive Evolutionary Algorithms (IEA)
2.5.1 Dawkins’ Biomorphs
2.5.2 Oppenheimer’s Artificial Menagerie
2.5.3 Smith’s Bugs
2.5.4 Sims’ Artificial Evolution
2.5.5 Moore’s GAMusic 1.0
2.5.6 van Goch’s P-Farm
2.6 Summary

3 System Design
3.1 Introduction
3.2 System Outline
3.3 Considerations
3.3.1 Sound Synthesis
3.3.2 Evolutionary Algorithm
3.3.3 User Interface
3.4 The Design Decision
3.5 Evaluation

4 System Development
4.1 Introduction
4.2 EvoS 2
4.2.1 Aim
4.2.2 Implementation
4.2.3 Results
4.2.4 Conclusion
4.3 EvoS 3
4.3.1 Aim
4.3.2 Implementation
4.3.3 Results
4.3.4 Conclusion
4.4 EvoS 4
4.4.1 Aim
4.4.2 Implementation
4.4.3 Results
4.4.4 Conclusion
4.5 EvoS 5
4.5.1 Aim
4.5.2 Implementation
4.5.3 Results
4.5.4 Conclusion
4.6 EvoS 6
4.6.1 Aim
4.6.2 Implementation
4.6.3 Results
4.6.4 Conclusion
4.7 EvoS 7
4.7.1 Aim
4.7.2 Implementation
4.7.3 Results
4.7.4 User evaluation
4.7.5 Conclusion

5 Results
5.1 Aim
5.2 Implementation
5.2.1 Manual Tool
5.2.2 Random Tool
5.2.3 Evolution Tool
5.3 Method
5.4 Results
5.4.1 Raw Data
5.4.2 Analysis of Data
5.4.3 An Alternative Analysis
5.4.4 User Opinions
5.5 Conclusions

6 Conclusion
6.1 Conclusion
6.2 Other lessons learnt
6.3 Further Work
6.4 A final note

A Evolving Samples
A.1 Introduction
A.1.1 Implementation
A.1.2 Results
A.1.3 Conclusion
List of Figures

1.1 This Moog synthesizer illustrates the complexity of synthesis algorithms. Each
knob controls a separate parameter that can be adjusted by the user to affect the
resulting sound - taken from [Mus98].

2.1 Patch required for additive synthesis - taken from [Wil88].
2.2 A basic subtractive synthesis patch - taken from [Wil88].
2.3 The most basic setup required for FM synthesis - taken from [DJ85].
2.4 A more complicated FM patch - taken from [DJ85].
2.5 An illustration of 5 point crossover - taken from [Bäc95].
2.6 Evolution of “biomorphs” with Dawkins’ system - taken from [Daw86].
2.7 Example of an evolved plant form - taken from [Ope88].
2.8 Biomorphs evolved with Smith’s system - taken from [Smi91].
2.9 An image generated from an evolved symbolic lisp expression - taken from [Sim91].
2.10 A screen shot of the GAMusic 1.0 user interface.
2.11 A screen shot of P-Farm’s user interface.

3.1 The three main components of the proposed system.

4.1 A screen shot of the EvoS 2 user interface.
4.2 A sample evolution run for EvoS 2.
4.3 Evolution of synthesis parameters in EvoS 3.
4.4 Evolution of variances in EvoS 3.
4.5 An envelope (top) controls the amplitude of a waveform (bottom).
4.6 The envelope of figure 4.5 shown with an envelope produced by mutation.
4.7 The envelope of figure 4.5 compared with an envelope produced by randomization.
4.8 EvoS 4 evolution with a variance of 1/50.
4.9 EvoS 4 evolution with a variance of 1/10.
4.10 EvoS 4 evolution with a variance of 1/20.
4.11 EvoS 5 evolution with a variance of 1/50.
4.12 EvoS 5 evolution with a variance of 1/10.
4.13 EvoS 5 evolution with a variance of 1/20.
4.14 EvoS 7 evolution: Modulation indices ($I_1$ and $I_2$).
4.15 EvoS 7 evolution: Frequency ratios ($N_1$ and $N_2$).
4.16 EvoS 7 evolution: Amplitude Envelopes.
4.17 EvoS 7 evolution: Modulation Index Envelopes.

5.1 Screen shot of the manual tool user interface.
5.2 Screen shot of the random tool user interface.
5.3 Screen shot of the evolution tool.
5.4 Sounds Liked plotted against Time for all search tools.
5.5 Sounds Liked vs. Sounds Auditioned for all search tools.
5.6 Sounds Liked plotted against search tool.

A.1 Example of a heterodyne analysis file, the amplitude envelopes.
A.2 Example of a heterodyne analysis file, the frequency envelopes.
A.3 Example of a nicely mutated amplitude envelope.
A.4 Example of a nicely mutated frequency envelope.
A.5 Example of a badly mutated amplitude envelope.
List of Tables

2.1 Papers published by Andrew Horner et al. on the subject of applying GA’s to
sound synthesis.

4.1 Ranges and variances for the parameters of EvoS 2.
4.2 Variances and ranges for the variances of EvoS 3.
4.3 Effects of different variances in EvoS 4.
4.4 Effects of different variances in EvoS 5.

5.1 Raw statistics collected for the manual tool.
5.2 Raw statistics collected for the random tool.
5.3 Raw statistics collected for the evolution tool.
5.4 Mean statistics for each search tool.
Acknowledgements
I would firstly like to thank Emma Hogan, who supported me throughout this thesis -
thank you for all the meals you cooked me.
Next, I would like to thank my family: my mother, Colleen, who has supported
me financially these last four years; my brother, Andrew, who got me out of some
dodgy situations; and my aunt, Nathalie, who also cooked a lot of meals for me.
In relation to this project I would like to thank all those users who volunteered
to test the system: Emma Hogan (again); Brett Webb; Nick Perkins; Dougal May;
Rachael James; Tom Fryer; Chris V.; Nick Mariette; and Brendan Hanna (I’m sorry
about the outlier stuff). I would also like to thank Tom and Will Edwards for letting
me use their scanner.
Academically I would like to thank my supervisor, Andrew Taylor, especially for
approving my project which was a bit out of the ordinary. And also my assessor, Tim
Lambert, for dealing with my frequent anxiety attacks ;). Similarly I would like to
thank those people who answered my emails with queries about this project: Jason
Moore; Arno van Goch; and Waleed Kadous.
Finally, I would like to thank anyone else that I may have left out - I apologise for
this and hereby give you permission to run a guilt trip on me.
In memory of my father

(“The journey of a thousand miles starts with the first step.”)

Chapter 1

Introduction

In theory at least, any sound can be synthesized electronically...


The synthesizer is a wonderfully versatile musical instrument. It can
sound like dozens, hundreds, or even thousands of instruments... Many
other sounds are only possible through sound synthesis.
—Delton T. Horn, Music Synthesizers

Indeed, sound synthesis can produce any sound you desire, but the above quote
does not mention the difficulty involved in obtaining such sounds.
This thesis is concerned with using a process of artificial evolution¹ to produce
novel and interesting sound spectra. More specifically, an Interactive Evolutionary
Algorithm (IEA) is used here to generate parameter sets for sound synthesis algo-
rithms.
The results will show that artificial evolution provides an efficient and intuitive
way for people to search through the space of possible sounds.

1.1 Motivation

Why should anyone bother with such an exercise? — There are a number of reasons.
Synthesis algorithms typically involve a large number of parameters (see figure
1.1). This often makes it difficult to obtain a desired sound quickly. Users of the
synthesis algorithms either have to understand exactly how each parameter will
affect the final sound, or they have to spend a lot of time aimlessly “twiddling knobs”
until a satisfactory sound is obtained. What is needed is an efficient, intuitive and
enjoyable way to search through the multi-dimensional space created by the synthesis
parameters.

¹ An optimisation procedure modeled on biological evolution.

Figure 1.1: This Moog synthesizer illustrates the complexity of synthesis algorithms.
Each knob controls a separate parameter that can be adjusted by the user to affect the
resulting sound - taken from [Mus98].

Primarily this could have application as a creative tool for musicians, composers
and electronic artists. Instead of twiddling knobs all day they could very quickly
obtain novel and interesting sounds for use in their compositions. Indeed, the process
of evolving a sound could even form the basis of a composition.
Secondly, it would open the world of sound synthesis to many people who don’t
have the time or patience to understand the algorithms. The only expertise you need
is your opinion of ‘what sounds good’ and ‘what sounds bad’.
Finally, another application may lie with researchers experimenting with new
kinds of synthesis algorithms. The system would provide an easy way to quickly
test the boundaries of any new algorithm they discover.

1.2 Goals

The goals of this project were:

• To investigate the use of an Interactive Evolutionary Algorithm applied to sound
synthesis.

• To determine whether this provides a useful tool for people interested in
exploring the space of possible sounds.

1.3 Objectives

The objectives accomplished in the process of achieving the above goals are sum-
marised below:

• To conduct a review of the previous research done in this field. This will serve as
a background for an informed investigation.

• To assess the possible avenues of implementation for the idea and decide on the
most feasible.

• To execute this most feasible avenue of implementation - resulting in a system
that embodies the goals of this project.

• To collect data (by conducting appropriate experiments) in order to assess whether
the system is effective or not.

• To analyse the collected data in a suitable manner and so assess the feasibility of
this idea for the applications suggested.

• To document the findings of this investigation in a concise and legible manner
- accessible to people in related fields of research, or anyone interested in the
results.

1.4 Outline

So far we have defined the goals and objectives of this thesis. The rest of the report
is structured as follows:

Chapter 2

Following the standard format of an undergraduate thesis, we begin with a review of
the relevant literature. This will provide a picture of what has already been achieved in
this area of research and illustrate the originality of this project.

Chapter 3

We then consider the possibilities for implementation of the project. The pros and
cons of each design issue are discussed and the final implementation decision is de-
scribed and justified. The problem of measuring the project’s success is also dis-
cussed.

Chapter 4

Next, we describe how the software for this project was developed. This is presented
in the form of a series of experiments, each with its own aim and conclusion. The
final experiment in this series combines all the previous discoveries into a complete
system.

Chapter 5

This chapter sees the complete system fashioned into an experiment that compares
three different ways of searching the same sound space. The experiment is described
and the results collected are reported and analysed.

Chapter 6

Finally, this chapter will summarise the findings of this investigation and suggest av-
enues of future research which could expand on the work presented here.
So without further ado, let’s get down to business²... I hope you enjoy it!

² A note on grammar: throughout this document there is occasional use of language traditionally
regarded as informal. For example, the use of dashes ‘-’ to add afterthoughts, and the word ‘But’ at
the beginning of a sentence. It is the author’s opinion that in some circumstances the use of informal
language helps to communicate ideas, and this is the justification for its use in this document. Informal
language also serves the purpose of adding the occasional ‘personal touch’ which can promote the
reader’s interest in an otherwise bland report.
Chapter 2

Previous Work

2.1 Introduction

This chapter gives a summary of the background research that was conducted for
this project. First, sound synthesis is briefly introduced. Next comes a general
introduction to the field of Evolutionary Algorithms, followed by a review of the
research done in applying these techniques to sound synthesis. Applications of
Interactive Evolutionary Algorithms are then reviewed, and finally a short summary is
given. This will demonstrate the validity and originality of this project in the
context of the research described.

2.2 Sound Synthesis

The process of synthesizing sound has a long and detailed history that is beyond the
scope of this document. Instead, this section aims to give a very brief introduction,
focusing on aspects relevant to this thesis.

2.2.1 Describing Sound

Before methods of synthesizing sound are discussed, it would be good to discuss some
characteristics of sound itself. The basic properties of a (musical) sound are its pitch,
amplitude, duration and timbre:


Pitch

Pitch is our perception of the frequency of a sound. Frequencies are usually measured
in Hertz (Hz) which is a measure of the number of cycles per second in the sound.
Most sounds have many frequencies present at the same time. If these frequencies are
related in a harmonic series then the pitch we hear for the complete sound is that of
the lowest frequency or the fundamental.
People are very used to the idea of controlling the pitch of a sound - take for
example the piano. Each key on the keyboard generates a note of a different pitch.
You can change the pitch of the sound by pressing different keys on the keyboard.

Amplitude

Amplitude (or volume) is our perception of how loud a sound is. Amplitude is com-
monly measured in decibels (dB).
Using the example of the piano again, you control the amplitude of the sound by
striking keys with different speeds. The faster (or harder) you hit a key, the louder the
resulting sound.

Duration

Duration is the time a sound lasts for. It is usually measured in seconds.


On the piano the duration of the sound is controlled by the amount of time a key
is held for. The longer you hold a key down, the longer the duration of the sound.

Timbre

This is much harder to describe. Timbre is “the characteristic tone quality of a
particular class of sounds”¹. You describe the timbre of a sound when you say things like
“that sounds like a trumpet” or “that sounds very metallic”. The timbre of all brass
instruments is quite similar - a trumpet sounds like a trombone, yet the timbres of brass
and wind instruments are quite different - a trumpet does not sound at all like a flute.
Although timbre cannot be measured on any scale, two important aspects that contribute
to the timbre of a sound are its spectrum and its amplitude envelope.
The amplitude envelope of a sound is the way the volume of a sound varies over its
duration. Sounds whose envelopes have a sharp attack segment will be heard as
percussive. Those with very long attack segments will sound like they are being played
backwards.

¹ From page 48 of [DJ85].
The spectrum of a sound is determined by Fourier analysis which describes it as
a sum of simple sinusoids. Changing the spectrum of a sound can radically alter
its characteristics. A sound with a simple or narrow spectrum will sound “thin” and
“pure” - like a sine wave. A sound with a wide spectrum will sound “rich” and “buzzy”
- like a square wave.
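
The contrast between a narrow and a wide spectrum can be checked numerically. The following is only a minimal sketch and not part of the system described in this thesis: the frame length `N`, the number of cycles `CYCLES`, the threshold and the function names are all hypothetical choices, and a direct (slow) discrete Fourier transform is used for clarity rather than an FFT.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the normalised magnitude of each DFT bin up to the Nyquist bin."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

def significant_bins(samples, threshold=0.01):
    """List the bin indices whose magnitude exceeds the (assumed) threshold."""
    return [k for k, m in enumerate(dft_magnitudes(samples)) if m > threshold]

N = 64                 # samples in the analysis frame (hypothetical)
CYCLES = 4             # both test tones complete 4 cycles in the frame
PERIOD = N // CYCLES   # samples per cycle

sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
square = [1.0 if (i % PERIOD) < PERIOD // 2 else -1.0 for i in range(N)]

sine_bins = significant_bins(sine)      # a single component: the fundamental
square_bins = significant_bins(square)  # the fundamental plus its odd harmonics
```

The sine wave occupies one bin, while the square wave spreads its energy over the fundamental and its odd harmonics, which is what gives it the “rich” and “buzzy” character.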
In our example of the piano, there is no way to control the timbre of the sound². It
was not until the advent of electronic musical instruments that very radical changes in
timbre could be effected. Today, a synthesizer can provide many controls that affect
the timbre of a sound in different ways.

2.2.2 Synthesis Algorithms

Research in sound synthesis is usually conducted with musical applications in mind.
As a result, a lot of work has been done in finding ways to simulate the sounds of
various acoustic instruments. Often along the way, many completely new and unnatural
sounds are discovered. These (usually very large) portions of the unnatural timbral
space can only be accessed via sound synthesis algorithms.
Synthesis algorithms can be implemented with analog electronics or digital
computers. Early analog sound synthesisers were large machines³, laden with knobs⁴ and
strewn with patch cables, which generated all manner of weird and wonderful sounds.
To create a sound, one manually “patched” the output of an oscillator, for example, to
the input of a filter or any other sound processing module which was available. Any
such configuration of modules was called a patch⁵.
The advantage of synthesis in this manner was that each parameter (e.g. oscilla-
tor frequency, filter cut-off frequency, etc.), had an associated knob which could be
“tweaked” by the user. This real-time tweaking of knobs, combined with the ability
to patch the output of any module to the input of another afforded an intuitive and
enjoyable way to explore the space of possible sounds.
The disadvantages of modular analog synthesis are its high cost and the sheer
physical size of the machines. If these are concerns, you should probably implement
your synthesis algorithm on a digital computer.
² You could make very minor alterations to the timbre by doing things like opening the lid of the
piano - this would cause a slight difference in tone quality.
³ Indeed, decent modern analog synthesisers are still very large.
⁴ Look back at figure 1.1.
⁵ A term still in use today and throughout this document.

The advantage of synthesising sound on computers is that any analog configuration
can be simulated, but the cost and size of the machine are much smaller. However,
you lose the intuitive control (and usually the real-time response) of an analog system.
The next few sections describe some popular sound synthesis algorithms as ex-
plained in [Pre92], [DJ85] and [Wil88]. Although these are usually implemented on
digital computers, in order to understand them it is sometimes easier to picture them
in an analog implementation.

2.2.3 Additive Synthesis

This is the most direct way to form any kind of sound spectrum you desire. For each
sinusoidal component (harmonic)⁶ of the desired spectrum you use a separate sinusoidal
oscillator to produce it. The outputs of all the oscillators are then simply added
together.
In most natural sounds, the volume and pitch of each sinusoidal component varies
with time. To mimic this, two envelopes are used to control each oscillator. One
envelope controls the amplitude and the other varies the pitch. This results in the
patch illustrated in figure 2.1. Here, you can see each oscillator being controlled by a
volume envelope generator (VEG) and a pitch envelope generator (PEG).
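
As a rough illustration of this patch, the sketch below sums a few sinusoidal oscillators, each driven by its own amplitude envelope and pitch envelope. It is not the implementation used in [Wil88] or in this thesis: the sample rate, the linear envelopes, the choice of three harmonics of 220 Hz and all function names are assumptions made for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (hypothetical, kept low for speed)

def linear_envelope(start, end):
    """Return a function mapping normalised time (0..1) to a value from start to end."""
    return lambda u: start + (end - start) * u

def additive_synth(partials, duration, sample_rate=SAMPLE_RATE):
    """Sum sinusoidal partials, each with its own amplitude and pitch envelope.

    partials: list of (base_frequency_hz, amplitude_env, pitch_env) tuples.
    Both envelopes are functions of normalised time; pitch_env scales the
    base frequency (1.0 means no pitch change).
    """
    n = int(duration * sample_rate)
    out = [0.0] * n
    for freq, amp_env, pitch_env in partials:
        phase = 0.0
        for i in range(n):
            u = i / n
            out[i] += amp_env(u) * math.sin(phase)
            # Accumulate phase so that a time-varying frequency stays continuous.
            phase += 2 * math.pi * freq * pitch_env(u) / sample_rate
    return out

# Three harmonics of 220 Hz; higher harmonics start quieter, all decay to silence.
partials = [
    (220.0 * k, linear_envelope(1.0 / k, 0.0), linear_envelope(1.0, 1.0))
    for k in (1, 2, 3)
]
tone = additive_synth(partials, duration=0.5)
```

Even this toy example needs one oscillator and two envelopes per harmonic, which hints at how quickly the number of parameters grows for realistic tones.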

Figure 2.1: Patch required for additive synthesis - taken from [Wil88].
⁶ The terms ‘sinusoidal component’ and ‘harmonic’ are usually interchangeable.

Using this technique, researchers have been able to very accurately synthesize
the sounds of many instruments. All they have to do is analyse a given instrument
tone (via Fourier decomposition) to determine the shape of the envelopes for each
harmonic. These envelopes are then used to control the volume of oscillators set
at the correct frequencies. The resulting sound is almost indistinguishable from the
original instrument tone.
There are a few disadvantages of the additive synthesis technique however. Firstly,
because a separate oscillator and envelope generator are required for each harmonic
(typically there are more than ten harmonics) it is computationally expensive. If a
fast response is required between specification of the parameters and synthesis of
the sound, you will need either a very powerful computer or specialised hardware.
Secondly, because this technique involves specification of so many parameters, it can
be hard for a musician to achieve a desired sound.

2.2.4 Subtractive Synthesis

In contrast to additive synthesis, subtractive synthesis starts with a dense spectrum
and carves away selected portions to produce the desired sound. Instead of sine wave
oscillators, square and sawtooth wave oscillators are used and combined with noise
generators. The dense spectrum that results is passed to various combinations of high
pass, low pass or band pass filters. To create sounds that change their character (tim-
bre) over time, envelope generators and other oscillators are used to control the cut-off
frequencies of the filters. Figure 2.2 illustrates the concept of subtractive synthesis.
Here, the filter is controlled by an envelope generator (EG) and a low frequency os-
cillator (LFO).
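
A minimal sketch of the idea follows, under several assumptions: a naive (non band-limited) sawtooth stands in for the dense source, a one-pole low-pass filter stands in for the filter module, and only an envelope (no LFO) controls the cut-off frequency. The names and constants are hypothetical and are not taken from [Wil88].

```python
import math

SAMPLE_RATE = 8000  # samples per second (hypothetical)

def sawtooth(freq, n, sample_rate=SAMPLE_RATE):
    """Naive sawtooth wave in the range [-1, 1): a harmonically dense source."""
    return [2.0 * ((freq * i / sample_rate) % 1.0) - 1.0 for i in range(n)]

def one_pole_lowpass(signal, cutoff_env, sample_rate=SAMPLE_RATE):
    """Low-pass filter whose cut-off (in Hz) follows cutoff_env(normalised_time)."""
    out = []
    y = 0.0
    n = len(signal)
    for i, x in enumerate(signal):
        cutoff = cutoff_env(i / n)
        # Smoothing coefficient of a one-pole filter for the current cut-off.
        a = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        y += a * (x - y)
        out.append(y)
    return out

n = SAMPLE_RATE // 2
source = sawtooth(110.0, n)
# The envelope sweeps the cut-off from 2000 Hz down to 100 Hz, so the tone
# starts bright and becomes progressively duller.
filtered = one_pole_lowpass(source, lambda u: 2000.0 - 1900.0 * u)
```

Replacing the lambda with a slow sine function would give the LFO-controlled variant mentioned above.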

Figure 2.2: A basic subtractive synthesis patch - taken from [Wil88].

Subtractive synthesis is useful for imitating instruments with harmonic spectra
such as wind and string instruments. Inharmonic spectra (such as the sounds of bells
and drums) can be produced by combining other oscillators or using other devices
such as ring modulators. The range of sounds producible with this technique depends
on the way in which you interconnect modules rather than the numbers you throw at
an algorithm.

2.2.5 Frequency Modulation

Frequency Modulation (FM) is a very popular synthesis method that was first defini-
tively described by John Chowning [Cho73].
FM synthesis involves using one oscillator (the modulator) to control the fre-
quency of another (the carrier). The simplest FM setup is depicted in figure 2.3.

Figure 2.3: The most basic setup required for FM synthesis - taken from [DJ85].

In order to understand the effect of FM, imagine that the frequency of the
modulating oscillator ($f_m$) is very low (1 to 10Hz). The output of the carrier will be a
tone that wavers up and down in pitch. The pitch will vary between ($f_c - d$) and
($f_c + d$), where $f_c$ is the frequency of the carrier and $d$ is the amplitude of the
modulator. Musicians call this wavering pitch effect “vibrato”. If the amplitude ($d$) of
the modulating oscillator is increased, the tone will waver up and down more wildly -
like a police siren. Thus, increasing $d$ increases the depth of the vibrato. If the
frequency of the modulating oscillator ($f_m$) is now increased, the tone will waver up
and down faster. As the modulating frequency goes into the audio range (above 20Hz),
the output of the carrier is no longer heard as a vibrato, but obtains a distinct timbre
of its own. Many different timbres can be obtained by varying the frequency and
amplitude of the modulating oscillator.
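
The behaviour described above can be sketched in a few lines. This is an illustration only, assuming the usual formulation in which the modulator is added to the carrier's phase with a modulation index of $d / f_m$, so that the instantaneous frequency swings between $f_c - d$ and $f_c + d$. The function name, sample rate and example values are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second (hypothetical)

def simple_fm(fc, fm, d, duration, amplitude=1.0, sample_rate=SAMPLE_RATE):
    """Sine carrier at fc Hz whose frequency swings by +/- d Hz at a rate of fm Hz."""
    index = d / fm  # modulation index: peak deviation over modulating frequency
    out = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate
        phase = 2 * math.pi * fc * t + index * math.sin(2 * math.pi * fm * t)
        out.append(amplitude * math.sin(phase))
    return out

# Slow modulation: heard as a vibrato of +/- 20 Hz around 440 Hz, 5 times a second.
vibrato = simple_fm(fc=440.0, fm=5.0, d=20.0, duration=1.0)

# Audio-rate modulation: no longer a vibrato but a tone with a new timbre.
timbre = simple_fm(fc=440.0, fm=220.0, d=440.0, duration=1.0)
```

Only three numbers (`fc`, `fm` and `d`) separate the gentle vibrato from the much richer second tone, which is part of what makes the technique attractive.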
Remember that this is only the simplest FM patch; a more complex one is shown in
figure 2.4. Here, the constant $d$ is replaced by an envelope generator. The envelope
dynamically changes the amplitude of the modulating oscillator over time. This
results in a time varying spectrum at the output - a very interesting sound that changes
its character over time. Notice also that the amplitude of the carrier is shaped by an
envelope. This envelope dynamically controls the volume of the time-varying spectrum.
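
A sketch of this more complex patch is given below. It is a simplified reading of the patch rather than a transcription of [DJ85]: the piecewise-linear envelopes, their breakpoints, the sample rate and all names are assumptions, and the deviation envelope is converted to a modulation index by dividing by the modulating frequency.

```python
import math

def envelope(points):
    """Piecewise-linear envelope from (time_fraction, value) breakpoints.

    Breakpoint times are assumed to be strictly increasing from 0 to 1.
    """
    def value(u):
        for (u0, v0), (u1, v1) in zip(points, points[1:]):
            if u0 <= u <= u1:
                return v0 + (v1 - v0) * (u - u0) / (u1 - u0)
        return points[-1][1]
    return value

def fm_with_envelopes(fc, fm, deviation_env, amp_env, duration, sample_rate=8000):
    """FM tone whose deviation (Hz) and amplitude both follow envelopes."""
    n = int(duration * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        u = i / n
        index = deviation_env(u) / fm  # time-varying modulation index
        phase = 2 * math.pi * fc * t + index * math.sin(2 * math.pi * fm * t)
        out.append(amp_env(u) * math.sin(phase))
    return out

# Amplitude: fast attack, slow decay. Deviation: starts wide and narrows, so the
# spectrum is rich at the onset and becomes purer as the note dies away.
amp = envelope([(0.0, 0.0), (0.1, 1.0), (1.0, 0.0)])
dev = envelope([(0.0, 880.0), (1.0, 0.0)])
note = fm_with_envelopes(fc=440.0, fm=440.0, deviation_env=dev, amp_env=amp,
                         duration=0.5)
```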

Figure 2.4: A more complicated FM patch - taken from [DJ85].

Recall that two important contributors to the timbre of a sound were its spectrum
and amplitude envelope⁷. This simple FM algorithm has achieved control of both
of these! Compare this to the additive synthesis approach which required tens of
oscillators. In the late 60’s and 70’s, the prospects of FM synthesis seemed so good
that Yamaha purchased the rights to it⁸.
⁷ See section 2.2.1.
⁸ See [Wil88] page 52.

However, the disadvantage of FM is that it is hard to control. For a given set of
parameters, it is difficult for a human to predict how the result will sound.

2.2.6 Sampling

Sampling synthesis has become a lot more popular in recent years due to the falling
price of digital electronics.
Sampling is the process of converting an analog audio source into digital form
(via an ADC9 ) and storing this information. The digital audio is stored as a string of
bits and can be played back at any time via a DAC10 . It can also be played back at
different rates to simulate different pitches. For example to synthesize a piano, you
could sample just one note of a real piano. The rest of the notes can be synthesized by
playing the sampled note back at different rates.
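Playing a sample back at a different rate can be sketched as follows. Linear interpolation and equal-tempered tuning (a ratio of 2^(n/12) for n semitones) are assumptions made for this illustration; the text does not say how any particular sampler does it.

```python
def resample(samples, rate):
    """Read through `samples` at `rate` times normal speed (linear interpolation)."""
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out

def semitone_ratio(n):
    """Playback rate that shifts the pitch by n equal-tempered semitones."""
    return 2.0 ** (n / 12.0)

# One sampled "note" (a stand-in ramp), played an octave higher.
note = [0.0, 1.0, 2.0, 3.0, 4.0]
octave_up = resample(note, semitone_ratio(12))
```

Note that the faster playback also shortens the note, which is one reason real instruments are usually sampled at several pitches.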
Sampling has proved very successful for synthesising all manner of instruments
and usually provides a much more realistic imitation than any other synthesis method.
There is a price to pay for this however - it is very data intensive. To accurately
synthesize an instrument you have to sample it a number of times at a very high rate.
This can lead to huge amounts of data which has to be stored and processed. Compare
this with a sophisticated FM algorithm with which you only need to store about 20
parameters.
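To give a feel for the difference in scale, here is a small calculation. The figures are assumptions for illustration only: CD-quality mono audio (44100 samples per second, 2 bytes per sample) and 4 bytes per stored FM parameter.

```python
def sample_bytes(seconds, sample_rate=44100, bytes_per_sample=2, channels=1):
    """Storage needed for an uncompressed recording."""
    return int(seconds * sample_rate * bytes_per_sample * channels)

one_note = sample_bytes(2.0)   # a single two-second note
fm_patch = 20 * 4              # about 20 parameters, 4 bytes each (assumed)
ratio = one_note / fm_patch    # how many times larger the sample is
```

Under these assumptions a single sampled note needs 176400 bytes, against 80 bytes for the FM patch.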

2.2.7 Other Methods

Other methods of sound synthesis include waveshaping, discrete summation synthesis,
group synthesis and many others. Receiving particular attention in recent times
is granular synthesis which creates sound by combining together tiny sound “grains”.
However, we cannot possibly hope to cover all these methods - and besides, they are
not really relevant to this thesis 11 .
And now for something completely different... we move to a discussion of Evolu-
tionary Algorithms. The material that follows is based largely on the excellent frame-
work presented by Bäck [Bäc95].
9
Analog to Digital Converter, pronounced “aydack”.
10
Digital to Analog Converter, pronounced “dack”.
11
However, future projects may explore artificial evolution applied to these and other, even more
bizarre synthesis methods.

2.3 Evolutionary Algorithms (EA’s)

Evolutionary Algorithms (EA’s) are a broad class of optimisation techniques that
mimic the process of biological evolution. In nature, populations of organisms adapt
to their environment through a process of reproduction and selection. Unfit organisms
do not survive to reproduce, which leaves the fitter organisms to reproduce and dom-
inate the population. Occasionally, mutations in the genes of an individual organism
will lead to a more successful creature. This creature’s genes will come to dominate
the gene pool and so raise the level of fitness of the population.
Likewise, the EA works by optimising artificial genes which represent parameters
of the problem you are trying to solve. The following pseudo code shows basically
how the EA works12 :

    <Initialise> population’s genes randomly.
    While (population unfit), do
        <Recombine> genes of individuals with others (mating).
        <Mutate> genes of individuals.
        <Evaluate> the fitness of each individual.
        <Select> the individuals which form the new population.
    od.

This description skips a lot of detail. Each of the operations in the above pseudo
code needs careful consideration for effective implementation:

<Initialise> This involves setting the genes of each individual in the population
to a random number. What probability distribution should you use?
<Recombine> Recombination (mating) can be sexual13 or panmictic14 . Selecting
the individuals that mate with each other is usually random. There are also dis-
crete and intermediate forms of mating. Discrete recombination forms a child
by copying genes from one parent or the other. Intermediate recombination
allows interpolation of gene values from parents to form the child.
<Mutate> Mutation introduces slight variations into the population by randomly
modifying genes of individuals. You have to decide what proportion of the
parent population is subject to mutation and also the degree of mutation - how
much different should the child be from the parent?
<Evaluate> This is a big step. First, you have to translate each individual from a
genotype to a phenotype15 . You then have to evaluate each phenotype using a
12
Adapted from [Bäc95] page 66.
13
Involving just two individuals from the parent population.
14
Involving three or more individuals.
15
How you do this will depend on how the genes are encoded.

fitness function. The fitness function must give an indication of how close an
individual is to the optimum solution.
<Select> Once you have found the relative fitness of each individual, you must
select which individuals survive to form the next population. This can be as
simple as just selecting the n most fit individuals 16, or more complicated. Usu-
ally, an individual is more likely to be selected if it has a high fitness score, but
unfit individuals still have a chance of being selected.
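The pseudo code and the five operations above can be turned into a minimal runnable sketch. Every concrete choice here is an assumption made for illustration: uniform initialisation, discrete recombination of two random parents, Gaussian mutation, selection of the fittest from parents and children together, and a fixed number of generations in place of the "population unfit" test.

```python
import random

def evolve(fitness, n_genes, pop_size=20, generations=50, sigma=0.1, rng=None):
    """Generic EA loop; returns the fittest individual found."""
    rng = rng or random.Random(0)
    # <Initialise>: genes drawn uniformly from [0, 1).
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # <Recombine>: discrete recombination, each gene copied from one parent.
            mum, dad = rng.sample(population, 2)
            child = [rng.choice(pair) for pair in zip(mum, dad)]
            # <Mutate>: add a little Gaussian noise to every gene.
            child = [gene + rng.gauss(0.0, sigma) for gene in child]
            children.append(child)
        # <Evaluate> and <Select>: keep the pop_size fittest of parents and children.
        pool = population + children
        population = sorted(pool, key=fitness, reverse=True)[:pop_size]
    return population[0]

# Toy problem: fitness is highest when every gene equals 0.5.
def closeness(genes):
    return -sum((g - 0.5) ** 2 for g in genes)

best = evolve(closeness, n_genes=3)
```

Swapping any one of the marked steps for a different operator changes the flavour of EA without changing the shape of the loop.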

As the field of EA’s is relatively new, there is still much debate as to the best
method for implementing all these operations 17. There are currently three main flavours
of EA, which each handle these problems in different ways. They are Genetic Algo-
rithms, Evolution Strategies and Evolutionary Programming.

2.3.1 Genetic Algorithms (GA’s)

The Genetic Algorithm (GA) seems to be the most popular form of EA and no doubt,
you have already heard of it. It was developed in America by John Holland [Hol75]
and popularised by David Goldberg [Gol89] 18 .
The most distinctive feature of the GA is utilisation of a binary encoding for genes.
That is, the mutation and recombination operations operate on raw bitstrings - se-
quences of zeros and ones. This leads to very simple operations and a very versatile
problem solver. Since everything on a computer is represented as a bitstring at the
lowest level, you can apply a GA to a very wide range of problems. A disadvan-
tage with this approach is that implementation is complicated as it is often difficult to
operate on single bits.
Mutation is achieved by very occasionally ‘flipping a bit’ 19 of an individual. Re-
combination works in two steps. First, a number of crossover points are chosen along
the length of the genome. Next, the child genome is formed by taking the first parent’s
genome up until the first crossover point, taking the second parent’s genome until the
next crossover point and so on. This process is illustrated in figure 2.5.
There are two possible children formed from the same set of crossover points
depending on which parent is chosen first. In GA literature, the recombination process
is often just referred to as “crossover”.
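Multi-point crossover and bit-flip mutation, as described above, can be sketched like this. Genomes are held as Python lists of 0s and 1s for readability rather than as packed bits, and the function names are illustrative.

```python
import random

def crossover(parent_a, parent_b, points):
    """Copy from parent_a up to the first crossover point, then parent_b, and so on."""
    parents = (parent_a, parent_b)
    child = []
    current = 0   # index of the parent currently being copied from
    start = 0
    for point in sorted(points) + [len(parent_a)]:
        child.extend(parents[current][start:point])
        current = 1 - current   # switch parents at each crossover point
        start = point
    return child

def mutate(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

a = [0] * 8
b = [1] * 8
first_child = crossover(a, b, [2, 5])    # [0, 0, 1, 1, 1, 0, 0, 0]
second_child = crossover(b, a, [2, 5])   # [1, 1, 0, 0, 0, 1, 1, 1]
```

The two calls use the same crossover points with the parents in the opposite order, giving the two possible children mentioned in the text.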
GA philosophy stresses that mutation is a background operator and crossover does
the real work of optimisation. This claim is backed up by experimental evidence and
16
Literally “survival of the fittest”.
17
A lot of it is guesswork.
18
For an entertaining and inspiring story of the discovery of the GA, see [Lev93].
19
Changing a zero to a one, or vice versa.

Figure 2.5: An illustration of 5 point crossover - taken from [Bäc95].

also Holland’s schema theorem [Hol75]. This theory of schemata basically proposes
that crossover raises the fitness of the population by combining “building blocks” of
high fitness together in different ways20 .
Although GA’s are very versatile, they are often slow to converge on a solution.
Bäck demonstrated that GA’s converged slower than both Evolution Strategies and
Evolutionary Programming in a number of test problems [Bäc95].

2.3.2 Evolution Strategies (ES’s)

The Evolution Strategy 21 (ES) was developed in Germany independently of the GA
[Bäc95]. ES’s use a real encoding for the genome; each gene is represented by a real
number as opposed to a single bit in the GA. An individual is formed by a vector of
real numbers. While this means that ES’s are not as flexible as GA’s, they are often
much easier to implement.
In the most basic ES, an individual $\vec{a}$ is represented by a vector of object variables
$\vec{x}$ which are the parameters being optimised:

$$\vec{a} = (x_1, x_2, \ldots, x_n)$$

Mutation is performed by adding a small amount of noise to each parameter:

$$x'_i = x_i + N(0, \sigma_i)$$

where $N(0, \sigma)$ denotes a normally distributed random variable with zero mean and
standard deviation $\sigma$. The standard deviation may be different for each parameter $x_i$
and must be specified in advance. The child individual is formed as a vector of all the
mutated parameters:

$$\vec{a}' = (x'_1, x'_2, \ldots, x'_n)$$
20
See [Hol75] for a lot more detail.
21
Or more correctly Evolutionsstrategie.

This mutation procedure is all we need to implement the simple (1+1)-ES 22 : an evolution
strategy where 1 parent is mutated to form 1 child. The better of the parent and
child is chosen to become the parent of the next generation, and the process is then
repeated. Obviously recombination (mating) is impossible with a parent
population of one.
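A minimal sketch of the (1+1)-ES just described is given below. The fixed number of generations, the tie-breaking rule (the child replaces the parent when it is at least as fit) and the toy fitness function are assumptions made for the example.

```python
import random

def one_plus_one_es(fitness, parent, sigmas, generations=200, rng=None):
    """Mutate one parent into one child; the better of the two becomes the next parent."""
    rng = rng or random.Random(1)
    parent_fitness = fitness(parent)
    for _ in range(generations):
        # Add normally distributed noise, one standard deviation per parameter.
        child = [x + rng.gauss(0.0, s) for x, s in zip(parent, sigmas)]
        child_fitness = fitness(child)
        if child_fitness >= parent_fitness:
            parent, parent_fitness = child, child_fitness
    return parent

# Toy fitness: highest (zero) at the origin.
def sphere(x):
    return -sum(v * v for v in x)

start = [1.0, 1.0]
result = one_plus_one_es(sphere, start, sigmas=[0.1, 0.1])
```

Because the parent is only ever replaced by a child that is at least as fit, the fitness of the returned individual can never be worse than that of the starting point.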
Much more complex kinds of ES arise when populations of more than one are
considered. The $(\mu + \lambda)$-ES works by recombining $\mu$ parents which are then mutated
to form $\lambda$ children. The best $\mu$ individuals are chosen from the parents and children
to form the parent population of the next generation. The current state-of-the-art
evolution strategy is the $(\mu, \lambda)$-ES. Here again the $\mu$ parents are recombined then
mutated to form $\lambda$ children, but the new parent population is selected only from the
children.
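The difference between the plus and comma strategies comes down to one line of the selection step, as this sketch of a single generation shows. The parent population size is conventionally called mu and the number of children lambda. Intermediate recombination of two random parents and a single shared standard deviation are simplifying assumptions, and the sketch needs at least two parents and, for comma selection, at least as many children as parents.

```python
import random

def es_generation(parents, n_children, fitness, sigma, plus, rng):
    """One generation of a plus (plus=True) or comma (plus=False) ES."""
    mu = len(parents)
    children = []
    for _ in range(n_children):
        mum, dad = rng.sample(parents, 2)
        # Intermediate recombination (average of the parents), then mutation.
        child = [(a + b) / 2.0 + rng.gauss(0.0, sigma) for a, b in zip(mum, dad)]
        children.append(child)
    # Plus: survivors come from parents and children. Comma: from children only.
    pool = parents + children if plus else children
    return sorted(pool, key=fitness, reverse=True)[:mu]

def sphere(x):
    return -sum(v * v for v in x)

rng = random.Random(0)
parents = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(3)]
plus_survivors = es_generation(parents, 12, sphere, 0.1, True, rng)
comma_survivors = es_generation(parents, 12, sphere, 0.1, False, rng)
```

With plus selection the best fitness can never fall from one generation to the next; with comma selection it can, since even the best parent is discarded.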
The current state-of-the-art Evolution Strategies also employ self optimisation.
Individuals are not only represented by a vector of object variables $\vec{x}$ as before, but
also by a vector of standard deviations $\vec{\sigma}$ and rotation angles $\vec{\alpha}$:

$$\vec{a} = (\vec{x}, \vec{\sigma}, \vec{\alpha})$$

The standard deviations and rotation angles are subject to mutation as described above.
After these have been mutated, they are used to modify the probability distribution for
mutation of each object variable. Thus each object variable is mutated by a special
normal distribution that is being optimised as well [Bäc95]:

$$x'_i = x_i + N(0, C(\sigma'_i, \alpha'_i))$$


In the above equation $C^{-1}(\sigma, \alpha)$ is a covariance matrix. If a variable $x_i$ needs a large
standard deviation, its $\sigma_i$ will eventually be mutated to a larger value. Using this
method, you do not have to worry about specifying standard deviations for each
variable; they will be set to optimum values automatically.
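The idea of self optimisation can be illustrated with a simplified sketch that leaves out the rotation angles, so each object variable is mutated independently. The log-normal update of the standard deviations and the learning rate tau = 1/sqrt(n) are one common choice and are assumptions here, not details taken from this thesis.

```python
import math
import random

def self_adaptive_mutation(x, sigmas, rng, tau=None):
    """Mutate the standard deviations first, then use them to mutate x."""
    n = len(x)
    tau = tau if tau is not None else 1.0 / math.sqrt(n)
    # Multiplying by exp(...) keeps every standard deviation positive.
    new_sigmas = [s * math.exp(tau * rng.gauss(0.0, 1.0)) for s in sigmas]
    new_x = [xi + rng.gauss(0.0, si) for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas

rng = random.Random(0)
child_x, child_sigmas = self_adaptive_mutation([0.0, 0.0, 0.0], [0.1, 0.1, 0.1], rng)
```

The child carries its own mutated standard deviations, so step sizes that happen to produce fit children are inherited along with the object variables.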
In contrast to GA’s, mutation is stressed as the main operator in ES’s. The contri-
bution of recombination is not disregarded, but all the self optimisation features de-
scribed above are for the benefit of mutation, not recombination. Bäck demonstrates
that the ES converges much faster than both GA’s and Evolutionary Programming on
a variety of fitness landscapes [Bäc95].
22
This notation is used in [Bäc95] to distinguish the different flavours of ES. The first number is the
size of the parent population. The second number is the size of the child population that is formed from
the parent. If the numbers are separated by a ‘+’ (plus), the new parent population is chosen from the
old parents and the children. If the numbers are separated by a ‘,’ (comma), the new parent population
is chosen only from the children.

2.3.3 Evolutionary Programming (EP)

Evolutionary Programming (EP) is basically the American equivalent of the ES 23 . It
uses a real encoding for genes and has some self optimising features. In EP, variances
are stored in the genome as opposed to ES’s which store standard deviations. Finally,
EP has no recombination operator and relies on the power of mutation alone. It still
performs admirably however, converging faster than GA’s24 on a variety of fitness
landscapes [Bäc95].
We now move on to look at how EA’s have been applied to the field of sound
synthesis.

2.4 EA’s for Sound Synthesis

Almost all of the published research on EA’s applied to sound synthesis has been
conducted by Andrew Horner and his colleagues. He first applied a GA to a musical
application with David Goldberg in 1991 [HG91]. Given a starting and finishing
pattern of notes, they used a GA to evolve patterns of notes that bridged the start
pattern to the finishing pattern25 . The results were gratifying and the authors suggested
a further application could be found in using a GA to evolve the timbre of a sound 26 ,
as opposed to a sequence of notes.
This is exactly what Horner went on to do in subsequent years, publishing a veritable
barrage of papers on the subject. These are shown in table 2.1.
In their first paper on the subject, Horner et al. outline just why GA’s are useful for
sound synthesis [HBH93]:

FM synthesis is a very efficient, though not always easily controlled technique
for generating interesting sounds.

The paper reviews the techniques used so far to solve the problem of choosing param-
eters for FM synthesis and labels them “ad hoc.” Instead [HBH93]:

The task of finding FM parameters to match musical tones is typical of
problems that defy traditional optimisation, yet are suited to [a] genetic
algorithm solution.
23
However, it seems that EP is not as advanced as ES.
24
But slower than ES’s
25
For musicians: this process is called “Thematic Bridging”.
26
Evolving timbre’s with an EA is exactly what this thesis is concerned with, so on reading this I
was very interested to find out if Horner had followed up his suggestion.

Year  Title                                                      Reference

1993  GA’s and their Application to FM Matching Synthesis        [HBH93]
1995  Wavetable Matching Synthesis of Instruments with GA’s      [Hor95a]
1995  Envelope Matching with Genetic Algorithms                  [Hor95b]
1996  A GA Based method for Synthesis of Low Peak Amp Signals    [HB96]
1996  Group Synthesis with Genetic Algorithms                    [CH96a]
1996  Common Tone Adaptive Tuning using Genetic Algorithms       [HA96]
1996  Discrete Summation Synthesis using GA’s                    [CH96b]
1997  Hybrid Sampling-Wavetable Synthesis with GA’s              [YH97]

Table 2.1: Papers published by Andrew Horner et al. on the subject of applying GA’s
to sound synthesis

All of the papers in table 2.1 use the same basic method to apply a GA to a sound
synthesis algorithm 27: First, a sample of the instrument to be synthesised is taken,
for example a trumpet tone. This original sample will form the fitness function by
enabling a comparison between it and the synthesized sound. The GA then chooses
parameters for the synthesis algorithm. The tone is synthesised and a fitness score is
obtained from the relative spectral error between the synthesised tone and the original
sample. As the generations pass, each population of individuals matches more closely
with the sampled tone. In the example of the trumpet tone, as each generation passes
you would hear synthesised tones that get closer and closer to the sampled trumpet
sound.
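A fitness score of this kind can be sketched as follows. The exact error measure used in the papers is not given here, so the formula below (the root of the summed squared differences between magnitude spectra, divided by the energy of the target spectrum) is an assumption, as is the use of a single spectrum for the whole tone. A naive DFT is used to keep the example self-contained.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitudes of the DFT bins from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def relative_spectral_error(target, candidate):
    """0.0 for identical spectra; larger values mean a worse match."""
    t = magnitude_spectrum(target)
    c = magnitude_spectrum(candidate)
    error = sum((a - b) ** 2 for a, b in zip(t, c))
    energy = sum(a ** 2 for a in t)
    return math.sqrt(error / energy)

n = 64
target = [math.sin(2.0 * math.pi * 4 * i / n) for i in range(n)]
candidate = [math.sin(2.0 * math.pi * 9 * i / n) for i in range(n)]
score = relative_spectral_error(target, candidate)
```

A GA would treat a lower error as a higher fitness, for example by negating the score.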
The results of this technique were very successful in all of the papers in table 2.1.
Most of the papers conclude that the Genetic Algorithm method of choosing synthesis
parameters is much more effective than any of the “ad hoc” methods used to date.
Horner and his colleagues’ work may be extensive, but note that their method does
not involve human interaction. The user of the system gives only a sample of the
instrument they want synthesised, and the GA goes away and does the rest. This
implies that the method is limited to producing sounds that already exist. That is
good news for this project: Horner’s work has shown that GA’s can be successful
when applied to sound synthesis algorithms, but he hasn’t tried an interactive system
capable of producing unknown, novel and unheard-of sounds.

2.5 Interactive Evolutionary Algorithms (IEA)

So far we have only considered Evolutionary Algorithms where the fitness function
is objective. For example, in Horner’s work, the fitness of a synthesised instrument
27
The synthesis algorithm was different for each paper.

tone was calculated from the relative spectral error between it and the original sam-
pled tone. What if instead, a human steps in and tells the computer how good the
synthesised tone sounds? In this case the fitness function becomes subjective. There
has actually been a fair amount of investigation into EA’s which use subjective,
human-supplied fitness functions. This section will present a summary of this work. It
is given in chronological order so you can follow the story as it develops.

2.5.1 Dawkins’ Biomorphs

Richard Dawkins [Daw86] was the first to demonstrate the power of a subjective fit-
ness function. He aimed to construct a system that would demonstrate the role of
mutation in evolution; his system would enable a human user to “breed” pictures of
trees and plants. On testing his system for the first time, he was astounded to find that
after only 20 generations he had evolved bug-like forms that he later called “biomorphs”.
Dawkins’ system28 worked with a simple recursive tree drawing algorithm that
took nine parameters. These parameters determined the form of the resulting tree.
The user of the system is presented with a screen of “mutant offspring” trees that are
generated by randomly changing one parameter of the “parent” tree. The user exam-
ines the offspring and selects the most aesthetically pleasing 29 - the chosen tree goes
on to be the parent for the next screenful of mutant progeny. Figure 2.6 demonstrates
evolution using Dawkins’ system.
The picture shows each individual biomorph in a box. A line between boxes con-
nects each biomorph to its parent, forming a sort of “family tree”. You can see how
quick the process is - a bug-like creature is evolved in just a few generations.
This mutation and selection process produced surprising results. Dawkins quickly
found that the scope of his simple algorithm was not limited to drawing trees. He
was able to produce all manner of plants, bugs and sea creatures. He even managed
to evolve the letters of his own name! Clearly, this variety of forms was not
apparent when he first designed the algorithm. By manually plugging numbers into
the algorithm, he may never have found the variety of forms he saw. It was by using
the EA as a way to explore the parameter space of the algorithm that he was able to
see its real potential.
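The mutate-and-select loop behind the biomorphs can be sketched as below. The integer genes, the step of plus or minus one and the number of offspring per screen are assumptions for illustration, and the `choose` function stands in for the human user who picks the most pleasing offspring.

```python
import random

def mutant_offspring(parent, n_offspring, rng, step=1):
    """Each child differs from the parent in exactly one randomly chosen parameter."""
    offspring = []
    for _ in range(n_offspring):
        child = list(parent)
        gene = rng.randrange(len(child))
        child[gene] += rng.choice((-step, step))
        offspring.append(child)
    return offspring

def breed(parent, choose, generations, n_offspring=8, rng=None):
    """Repeatedly show a screen of mutants and let `choose` pick the next parent."""
    rng = rng or random.Random(0)
    for _ in range(generations):
        screen = mutant_offspring(parent, n_offspring, rng)
        parent = screen[choose(screen)]   # index of the selected offspring
    return parent

# Nine parameters, as in the tree drawing algorithm; this stand-in "user"
# simply picks the first offspring on every screen.
evolved = breed([0] * 9, choose=lambda screen: 0, generations=20)
```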
28
The system is described in much more detail in [Daw88].
29
This is Artificial selection as opposed to Natural Selection.

Figure 2.6: Evolution of “biomorphs” with Dawkins’ system - taken from [Daw86].

2.5.2 Oppenheimer’s Artificial Menagerie

Peter Oppenheimer also described a system of interactive artificial evolution [Ope88].
However, his images were more complex than Dawkins’. Oppenheimer used an algorithm
with 15 parameters to evolve impressive three dimensional plant forms (see
figure 2.7).

Figure 2.7: Example of an evolved plant form - taken from [Ope88].

As in Dawkins’ system, mutation is the only genetic operator used and “Fitness, of
course, is in the eye of the beholder.”30

2.5.3 Smith’s Bugs

Joshua Smith extended Dawkins’ work with “biomorphs,” as well as formalising some
concepts about GA’s with subjective fitness functions [Smi91].
Smith implemented a system like Dawkins’ but with a larger breeding population
and a genetic recombination operator added. This allowed the user to select multiple
biomorphs and have them mate as opposed to just mutating them. Also, a two di-
mensional Fourier series was used to generate the biomorphs as opposed to Dawkins’
30
From [Ope88].

recursive tree drawing algorithm. Here, the genes of an individual are Fourier coef-
ficients31 . Figure 2.8 shows a sample of biomorphs evolved using Smith’s system.

Figure 2.8: Biomorphs evolved with Smith’s system - taken from [Smi91].

Smith defined the term Interactive Genetic Algorithm (IGA) to refer to systems
where a human user acts as the fitness function for a GA32 . Three criteria are specified
for applicability of an IGA to a problem [Smi91]:

• The problem can be formulated as a search through a parameter space.

• Candidate solutions to the problem can be generated in near real time.

• The utility of candidate solutions can be compared by humans, but not (practically)
by means of a precisely specified formula.
31
Smith uses a real encoding for his genes which implies that he is using an ES as opposed to a GA
which are characterised by binary encodings. However, Smith considers ES’s to be a subset of GA’s
as he states in [Smi94]. This is contrary to Bäck’s belief that ES’s and GA’s fall under the umbrella of
EA’s [Bäc95].
32
Likewise I use the term Interactive Evolutionary Algorithm (IEA) to refer to an Evolutionary
Algorithm (EA) that uses a human user as the fitness function. I believe this to be a more politically
correct term as IGA implies the use of a binary encoding.

Note that the problem of generating novel and interesting sounds with synthesis algorithms
(artificial evolution for sound synthesis) satisfies these criteria:

• Finding interesting sounds is a search through the algorithm’s parameter space.

• Current computing power allows synthesis of sound from parameters in near
real time 33 .

• Deciding whether a sound is “novel” or “interesting” cannot be achieved with a
precisely specified formula, but can be easily accomplished by a human.

2.5.4 Sims’ Artificial Evolution

Karl Sims [Sim91] constructed one of the most elaborate demonstrations of these
techniques to date34 . Sims used a powerful super-computer to generate and evolve
astounding images (see figure 2.9).

Figure 2.9: An image generated from an evolved symbolic lisp expression - taken
from [Sim91].
33
Not to mention the real time synthesis available with specialised hardware
34
Indeed, this paper was the main inspiration for this thesis.

Sims evolved 3D plant structures generated from procedural models as in [Ope88],
but he also added four different recombination operators. This allowed the user to
mate plant structures in a variety of ways to form all manner of offspring. Sims went
on further and used symbolic lisp expressions as genotypes for the evolution process.
In this case, evolution was not limited to the number of parameters in a procedural model,
but was “open ended.” The symbolic expressions were mutated and mated to generate
2D images35 , 3D volume textures and even animations. Although the resulting images
are rather abstract, they are nevertheless beautiful. In a later paper, he reported that
he had extended the system to evolve 3D shapes and even 2D dynamical systems
[Sim93].

2.5.5 Moore’s GAMusic 1.0

Jason Moore’s program “GAMusic 1.0” [Moo94] is an interactive melody evolver and
up until recently was the closest example of work similar to this thesis.
The program runs on Microsoft Windows and consists of a simple interface to
enable the user to evolve melodies that play over the PC speaker. The user auditions
each melody and assigns a fitness value (good, average or poor). Melodies are rep-
resented as a 128 bit binary string and a simple GA is used to recombine and mutate
these. There are controls that enable adjustment of the mutation and recombination
frequency. After the user assigns a fitness value to each melody in the population, the
GA mutates and recombines the binary strings that represent them. The newly created
population is again auditioned by the user and the process is repeated until the user is
satisfied with the melody.
Figure 2.10 shows a screen shot of the GAMusic user interface. You can see the
controls for mutation and recombination frequency, the population of 12 melodies
(with fitness ratings) and the bit string that represents the current melody.
There are a number of problems with GAMusic. Since the melodies play over the
PC speaker as simple bleeps, they are not very pleasing to listen to. Also, there is no
way (except by ear) to obtain the melody data so you can use it in other applications
(e.g. a composition that you are working on).
However, the most frustrating aspect of GAMusic is that it takes a long time to
evolve decent melodies. This is partly due to the time it takes to audition a single
population. There are 12 melodies in each population which can last up to 5 seconds
each. On average, it takes about 1 minute to go through all the melodies and judge
their fitness. Many populations must be auditioned to get a good melody - evolution
progresses slowly and the user quickly becomes bored 36 . In documentation for the
35
Again, see figure 2.9.
36
Especially after listening to the PC speaker bleeps for so long.

Figure 2.10: A screen shot of the GAMusic 1.0 user interface.

program, Moore mentions plans to develop a program to evolve more complex sounds
using a PC sound-card37 , but these have subsequently been canceled [Moo98].

2.5.6 van Goch’s P-Farm

When I first discovered this program I was shattered. Arno van Goch’s “P-Farm”
[Goc96] achieved exactly what I hoped this project would - and it did it well! The
problem was, I only discovered it halfway through the year when my literature survey
had been completed and I was thinking about implementation.
P-Farm is an experimental program that works with an external synthesizer. In
its current implementation (version 0.3) it supports the Roland Juno 106 and Yamaha
V50 synthesisers, and runs under Microsoft Windows.
To use P-Farm, you must have an external synthesiser. The program works by
evolving patches, then sending the patch information to the synthesiser via MIDI 38 .
The user can then play the synthesiser to evaluate the fitness of the patch that has just
been sent. Figure 2.11 shows a screen shot of P-Farm’s user interface. The external
synthesizer that is also required to use the system is not shown.
37
A program that evolves sounds is the topic of this thesis.
38
Musical Instrument Digital Interface.

Figure 2.11: A screen shot of P-Farm’s user interface.

P-Farm uses a GA with a population of 32 patches. The fitness function is binary:
the user can choose either to keep a patch or delete it. After you have gone through
the population keeping only the fit patches, the deleted patches are filled with new
individuals formed by crossing and mutating the fit patches. The program lets you
control three parameters of the GA, namely:

crossover ratio What proportion of each parent’s genes appear in the child.

mutation rate The probability that a given gene will mutate.

transposition rate The probability that genes will get transposed to different parts of
the chromosome.
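Since the technical details of P-Farm are not available, the following is only a guess at how a keep-or-delete scheme with these three controls could look; it is not van Goch's implementation. Patches are assumed to be lists of integers in the range 0 to 127, and all function names are hypothetical.

```python
import random

def make_child(mum, dad, crossover_ratio, mutation_rate, transposition_rate, rng):
    """Cross two kept patches, then mutate and transpose genes (hypothetical scheme)."""
    # Crossover ratio: probability that a gene is taken from mum rather than dad.
    child = [m if rng.random() < crossover_ratio else d for m, d in zip(mum, dad)]
    # Mutation rate: probability that a gene is replaced by a random value.
    child = [rng.randrange(128) if rng.random() < mutation_rate else g for g in child]
    # Transposition rate: probability that a gene swaps places with another gene.
    for i in range(len(child)):
        if rng.random() < transposition_rate:
            j = rng.randrange(len(child))
            child[i], child[j] = child[j], child[i]
    return child

def refill(population, keep, rng, crossover_ratio=0.5, mutation_rate=0.05,
           transposition_rate=0.01):
    """Kept patches stay in place; deleted ones are replaced by children of kept ones."""
    survivors = [patch for patch, kept in zip(population, keep) if kept]
    new_population = []
    for patch, kept in zip(population, keep):
        if kept:
            new_population.append(patch)
        else:
            mum, dad = rng.choice(survivors), rng.choice(survivors)
            new_population.append(make_child(mum, dad, crossover_ratio,
                                             mutation_rate, transposition_rate, rng))
    return new_population

rng = random.Random(0)
patches = [[rng.randrange(128) for _ in range(10)] for _ in range(32)]
keep = [i < 4 for i in range(32)]          # the user keeps the first four patches
next_patches = refill(patches, keep, rng)
```

The sketch assumes the user keeps at least one patch; with none kept there would be nothing to breed from.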

After you get used to it, you can breed quite good sounds using P-Farm. There are
however some criticisms. Evaluating and keeping track of all 32 patches is sometimes
difficult. You frequently have to go back to a patch and remind yourself of what it
sounded like. Also, switching between the computer keyboard and the synthesiser
keyboard can be frustrating. Although some of the program functions are accessible
via the synthesiser keyboard, when you have to switch around it can make evolution
slow and painful.
The work of van Goch is very close to the subject of this thesis. So close in fact that
I could base my work on extending or modifying the algorithms used in P-Farm. This

however is not a possibility. Although the program is still under active development,
technical details39 are not available [Goc98].

2.6 Summary

The survey of literature conducted gives a good indication that an Interactive Evo-
lutionary Algorithm applied to sound synthesis would yield fruitful results. This is
highlighted by the following facts:

• Techniques such as FM synthesis can produce a wide variety of timbres from
a small number of parameters. However, choosing the right parameters is difficult.

• Evolutionary Algorithms are optimization procedures that have proved to be
useful in searching the parameter spaces of a wide variety of problems.

• Horner and his colleagues have shown that EA’s can be successfully applied to
sound synthesis algorithms in a non-interactive manner.

• Dawkins, Sims and Oppenheimer demonstrated the power of Interactive Evolutionary
Algorithms for searching the parameter space of procedural models for
computer graphics.

• The problem of evolving “interesting” and “novel” sounds fits Smith’s criteria
for applicability of an IEA.

• Very few people have tried to apply an IEA to sound synthesis before. Moore
canceled his plans and the only other known work is that of van Goch, which is
still very experimental.

So it seems that artificial evolution for sound synthesis is a good idea. Using the
surveyed literature as a foundation we can now begin to design a system that will
embody the goals of this thesis – this is the topic of the next chapter.

39
That is, source code that I could modify.
Chapter 3

System Design

3.1 Introduction

This chapter is concerned with the initial design of a system that can achieve the
goals of this project. First, the problem is divided into a number of subproblems –
this gives an outline of the basic system. Then, for each of these subproblems issues
related to implementation are discussed. Next, the design decisions actually taken
are explained and justified. Then finally, we look at ways of evaluating the system in
order to determine whether the goals have been achieved.

3.2 System Outline

A system that applies an Interactive Evolutionary Algorithm to sound synthesis can
be broken into three main parts as shown in figure 3.1.
Each part performs a different function:

User Interface This part allows the user to audition individuals (hear the sounds pro-
duced by the Sound Synthesis section) and rate their fitness (evaluate them).

Evolutionary Algorithm This part takes the fitness data from the User Interface
and accordingly mates and mutates the synthesis parameters.

Sound Synthesis This part generates the actual audio data from the sound synthesis
parameters passed from the Evolutionary Algorithm section.
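The way the three parts pass data to one another can be sketched as a simple loop. This is only an outline of the data flow, not a design: the function names are hypothetical, and the three stand-in components are deliberately trivial.

```python
import random

def run_session(population, synthesise, select, breed, generations):
    """Wire the three parts together for a fixed number of generations."""
    for _ in range(generations):
        audio = [synthesise(individual) for individual in population]   # Sound Synthesis
        chosen = select(audio)                                          # User Interface
        parents = [population[i] for i in chosen]
        population = breed(parents, len(population))                    # Evolutionary Algorithm
    return population

# Stand-in components, for illustration only.
rng = random.Random(0)

def toy_synthesise(individual):
    return list(individual)   # pretend the parameters are the audio data

def toy_select(audio):
    # A real user would listen; here the two "loudest" sounds are selected.
    ranked = sorted(range(len(audio)), key=lambda i: sum(abs(s) for s in audio[i]))
    return ranked[-2:]

def toy_breed(parents, size):
    # Mutation only: each child is a noisy copy of a randomly chosen parent.
    return [[g + rng.gauss(0.0, 0.1) for g in rng.choice(parents)] for _ in range(size)]

start = [[rng.random() for _ in range(3)] for _ in range(6)]
final = run_session(start, toy_synthesise, toy_select, toy_breed, generations=5)
```

Keeping the three parts behind separate functions means any one of them can be replaced without touching the others.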


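[Figure 3.1 diagram: the Evolutionary Algorithm (mutation and mating of individuals)
passes Synthesis Parameters to Sound Synthesis (generates sound from individual), which
passes Audio Data to the User Interface (selection of fit individuals), which passes
Selected Individuals back to the Evolutionary Algorithm.]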

Figure 3.1: The three main components of the proposed system

3.3 Considerations

For each part of the proposed system (figure 3.1), a number of issues have to be
considered before you can decide on an implementation. These issues are discussed
in the following sections.

3.3.1 Sound Synthesis

There are basically two ways you can implement this part: in hardware or in software.

Hardware

A hardware implementation would use an existing stand-alone synthesizer 1 . The Evolutionary
Algorithm would send parameters to the synthesizer via MIDI 2 . The user
would then be able to audition the sound by playing the synthesizer as normal. This
1
Sometimes the abbreviation ‘synth’ is used in this document - it refers to a stand-alone hardware
synthesizer.
2
Musical Instrument Digital Interface. See [Boo87] for a good introduction to MIDI.

is exactly the way that van Goch’s system works 3.


The advantage of a hardware approach is speed. The only delay required is the
time taken to transfer the parameter data (typically less than 200 bytes) from a com-
puter to the synth. The user can then test out the synthesis algorithm in real time,
playing any combination of notes desired.
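The size of this delay is easy to estimate. MIDI is a serial link running at 31250 bits per second, and each byte is framed by a start and a stop bit, so it occupies 10 bits on the wire. The calculation below ignores any processing time inside the synth, which is an assumption.

```python
MIDI_BAUD = 31250            # bits per second on a MIDI cable
BITS_PER_BYTE_ON_WIRE = 10   # 8 data bits plus one start bit and one stop bit

def midi_transfer_seconds(n_bytes):
    """Time needed to send n_bytes over a single MIDI connection."""
    return n_bytes * BITS_PER_BYTE_ON_WIRE / MIDI_BAUD

delay = midi_transfer_seconds(200)   # 0.064 seconds for a 200 byte patch
```

A 200 byte patch therefore takes about 64 milliseconds to send, which is short enough to feel immediate when auditioning sounds.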
The disadvantages of a hardware system are cost and inflexibility. Synthesizers are
typically quite expensive and they only implement a very limited number of synthesis
algorithms4. Since each synthesizer manufacturer has their own way of interpreting
parameters, you could only hope to support one particular synth from one particular
manufacturer. Users who didn’t own that particular synth wouldn’t be able to use the
system. Even if they did own it, they would always be restricted to exploring the set
parameter space that comes with that particular synth.

Software

Alternatively, a software implementation of the Sound Synthesis part would provide
far greater flexibility at a lower cost, since no extra hardware would be required by the
user. If they possessed a basic computer capable of sample playback5 , they would be
able to use the system.
Freely available packages such as Csound [Cso98] enable the user to experiment
with virtually any known synthesis algorithm. MatLab [Mat98] also allows construc-
tion of custom algorithms. MatLab is not as musically oriented as Csound, but it
provides other useful features such as data visualisation and user interface tools.
Another advantage of a software Sound Synthesis part is the increased integration.
Users don’t have to divide their attention between a musical keyboard and a
computer keyboard⁶. They can audition individuals, then rate their fitness all from the
same place.
However, the disadvantage of a software approach is speed. A software synthesis
algorithm will generally be slower than a dedicated hardware synth. This is cause for
concern: recall that one of Smith’s criteria is that candidate solutions must be generated
in near real time⁷. Never fear – today’s computing power is fast enough to satisfy this
requirement as long as the algorithm is not too complex⁸.
³ Refer back to section 2.5.6 for a description of van Goch’s system.
⁴ Most hardware synths only implement 1 synthesis algorithm.
⁵ Most computers these days come with sound cards - these provide sample playback functionality.
⁶ This was one of the annoying features of van Goch’s system - see section 2.5.6.
⁷ Smith’s criteria for application of an IGA were reported in section 2.5.3.
⁸ There are now real time versions of Csound that run with Pentium processors.

3.3.2 Evolutionary Algorithm

Implementing the Evolutionary Algorithm part of the system also raises issues which
must be considered. It is assumed that this part of the system will run on a small
computer (UNIX or Windows) as this is the easiest and cheapest option. Given this, the
following issues have to be considered:

Representation

How will the genes of individuals be represented?


The basic choice here is between a binary encoding (as in a GA) or a real encoding
(as in an ES). Binary encodings are much more flexible, but they are harder to imple-
ment. A real encoding for genes leads to an easier implementation – each gene is just
a real number representing a different parameter of the synthesis algorithm. It could
also lead to faster convergence⁹ – that is, the user is able to find interesting sounds
more quickly.

Operators

What kind of genetic operators will be useful for a sound synthesis application?
Mutation seems an essential operation, and is also fairly easy to implement. Mating
(recombination) requires more thought and also has many alternatives¹⁰. Since
hardly any work has been done in this field, it is not known which types of mating
work better for an interactive sound synthesis application.

Random Numbers

Whichever operators for the Evolutionary Algorithm are chosen, good random num-
ber generators will be required for their implementation. This fact should be kept in
mind when making a decision on the implementation language.

Self Adaption

Should any of the self-adaptive features of ES’s¹¹ be used in the system?


⁹ Section 2.3 discusses the convergence rates of different EA’s.
¹⁰ These were discussed in section 2.3.
¹¹ These were discussed in section 2.3.2.

Self adaption has the potential to make convergence very fast. However, it has
only been used in non-interactive ES’s with large populations – it may not work in
an interactive system because population sizes are much smaller and are evolved for
fewer generations.

3.3.3 User Interface

The User Interface component possibly requires the most design thought. The issue
here is basically: How do you make the system easy and intuitive to use?
Some User Interface considerations are:

Population size

A user of the system can only hear one sound at a time¹². This is a major difference from
the graphical systems of Dawkins, Sims and Smith¹³ where the user can very quickly
audition a lot of images. Only being able to audition one sound at a time implies
that population size should be somewhat smaller. If it weren’t, auditioning the whole
population would take a long time, and the user may become bored.

Fitness rating

Would it be better to have a binary rating (e.g. good or bad) or some kind of scale
(e.g. 1 to 10)? This decision will also be dependent on the population size and the
genetic operators available.

Implementation Platform

Because the interface to the system is likely to be graphical, the possible implementa-
tion platforms need to be considered.
Java provides very good GUI¹⁴ facilities and also has the advantage that it could be
used over the World Wide Web. However at present, the audio support in Java is less
than adequate – Java cannot generate or play audio files on most architectures. This
could possibly be circumvented by using a CGI script to generate audio files. Most
Web browsers support playback of audio files, so you could send the pre-generated
audio files across the Web to the client. Alas, this idea is also impractical: audio files
are typically quite large, and thus would take a long time to send over the Web. This
would make the system slow and frustrating to use.

¹² A possible method to overcome this limitation involves use of the “cocktail party effect” [CE95].
This is the phenomenon whereby a person can distinguish multiple sound sources as long as they are
spatially separated. For example, at a cocktail party where there are many conversations occurring at
once, you are able to clearly understand the one you are focussed on.
¹³ Described in section 2.5.
¹⁴ Graphical User Interface

Microsoft Windows provides good GUI support once you come to grips with its
ridiculous conventions. However, its sound playback utilities are very hard to use.
Tcl/Tk for Unix provides very good and easy to use GUI facilities. The powerful
redirection features of Unix would make audio playback a breeze.
MatLab also has easy to use GUI features, but these are limited in some respects.

3.4 The Design Decision

The final decision for the implementation of the three system parts was not reached in
a single instant, but evolved¹⁵ over the course of the project.
The initial experiments and system development (chapter 4) were conducted in
MatLab. MatLab was very easy to use and provided a rapid development environment.
Later in the project when the basic system had been worked out, it seemed easiest to
stick with MatLab. Also, the extra time saved in not having to re-implement the
system in another language meant that more could be accomplished. As a result, the
final system was implemented using the GUI features of MatLab (chapter 5).
Implementing all three system parts in MatLab had a number of impacts. These
are described as follows:

Sound Synthesis The sound synthesis algorithm ran in software. This cost less than
using external hardware, but was slower. The slow speed of the software meant
that users were limited to auditioning just one note of each sound. However,
the advantage of having everything implemented in MatLab was increased in-
tegration - there were no problems switching between computer keyboards and
synthesizer keyboards. A software approach also offers flexibility in the choice
of synthesis algorithm, although this project mainly focussed on FM synthesis¹⁶.

Evolutionary Algorithm Being a numerical computing package, MatLab provides very good
random number generators – this was an advantage for this part of the system.
The big disadvantage however was the lack of data structures. This made some
tasks quite difficult and meant concepts like ‘Object Oriented’ did not exist.
Fortunately, implementing a real encoding for genes was no problem – vectors
of real numbers are supported well by MatLab.

User Interface In preliminary experiments, a simple keyboard interface was used
and this was no problem to implement in MatLab. In the final system, a GUI
was built. MatLab’s GUI features were somewhat limited and at times painful
to use. However, due to MatLab’s interactive nature, development was probably
a lot easier and quicker than it would have been on other platforms.

¹⁵ This pun was bound to happen, sooner or later.
¹⁶ An experiment was also conducted using additive synthesis - see Appendix A.

Other design decisions not explained here (e.g. population size) will be illumi-
nated as the system is developed in chapter 4.

3.5 Evaluation

It’s all very well having a system that generates “novel” sounds, but how can you
measure the success of such a system?
A successful system is one that satisfies its goals. The first goal as stated in section
1.2 could be satisfied by implementing the system described above. What about the
second goal?
One way to approach this would be to collect opinions from a number of users.
These users could be composers or musicians who require such sounds for their work,
or merely novices who are seeking entertainment. Some questions that could be asked
of the users to determine the success of the system include:

• Is the system easy to use?
• In your opinion, is the evolution of sounds controllable?
• Is this system useful? Would you be able to use the sounds generated in compositions?
• Is using this system easier than other methods of obtaining the same kinds of sound?

Another way to test success would be to collect statistics from users. These statistics
would be obtained from a number of experiments:

• First the user would be asked to use a system where they manually had to adjust
  parameters of the sound synthesis algorithm. The number of changes they had
  to make until they had a satisfying sound would be recorded.
• This process would be repeated on a system where the parameters of the algorithm
  were chosen at random. Finally the users would use the evolutionary system.
• After these tests, we would have a measure of how intuitive each system was to
  use. A comparison could be made about which is the most effective system.

The actual experiment conducted to assess the success of the system forms the
subject of chapter 5.
Now that we have discussed the various issues relating to the implementation and
assessment of artificial evolution for sound synthesis, we move on to chapter 4 which
tells the story of how the actual system was developed.
Chapter 4

System Development

4.1 Introduction

This chapter documents the steps taken in developing the software that forms the basis
of this project – software that applies artificial evolution to sound synthesis¹.
The guiding philosophy throughout the development of this system was to imitate
Dawkins’ system², only applied to sound instead of graphics. Recall that Dawkins
implemented an IEA based on a very simple tree drawing algorithm with mutation
as the only genetic operator. Humble as it was, the fantastic results obtained inspired
a whole new generation of research. It seems justified to attempt to repeat this for
sound synthesis. The idea then, is to keep the synthesis algorithm simple and focus
on mutation.
To this end, FM synthesis was chosen as the sound synthesis algorithm. FM syn-
thesis can produce a wide variety of timbres with only a few parameters, yet it is very
hard to ‘control’. It seemed a perfect candidate for artificial evolution.
The software development is presented in a series of experiments. As described
before, these experiments were conducted in MatLab as it was easy to use, and pro-
vided a rapid development environment. The programs that performed the experi-
ments were named EvoS which stands for Evolution Strategy.
An outline of this chapter is as follows:

EvoS 2 aims to implement a most basic form of evolution with the most basic FM
algorithm.

EvoS 3 experiments with self-adaption, but concludes it is unsuitable for an interactive
system.

EvoS 4 tackles the tricky problem of mutating envelopes.

EvoS 5 develops a method of mutating modulation indices.

EvoS 6 describes how the problem of mutating frequency ratios was overcome.

EvoS 7 finally, combines all the previously developed mutation techniques into a single
system. This system is tested by a number of users.

That being said, let’s start at the beginning...

¹ Yeah, well... it was a spur of the moment thing.
² See section 2.5.1.

4.2 EvoS 2

4.2.1 Aim

The goal of EvoS 2³ was to create a most basic interactive Evolution Strategy. To keep
things very simple, a (1+1)-ES was decided on. Here, one parent sound is mutated to
form a single child sound. If the fitness function⁴ deems the child sound better than
the parent sound, the parent is replaced by the child, which becomes the new parent. On
the other hand, if the child sound is worse than the parent sound, that child is discarded
and a new one is formed by mutation of the same parent. Note that there is no recombination
or mating involved; the only genetic operator is mutation.
Likewise, to keep things simple in the sound synthesis section, a very basic algorithm
was used: fixed index frequency modulation (FM)⁵. This requires only three
parameters to produce a sound:

fm = modulating frequency
fc = carrier frequency
I = modulating index, a measure of how much the modulator deviates
the carrier frequency.

Combining the FM synthesis with the Evolution Strategy, EvoS 2 aimed to produce a
system that provided interactive evolution of sounds.
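As a rough illustration of how the three parameters produce a sound, the following Python sketch renders a fixed index FM tone using the standard simple FM formula y(t) = sin(2π fc t + I sin(2π fm t)). It is not the MatLab code used in the project: the function name fm_tone, the sample rate and the duration are assumptions made for the example.

```python
import math


def fm_tone(fc, fm, index, duration=0.5, sample_rate=22050):
    """Render a fixed index FM tone as a list of samples in [-1, 1].

    fc is the carrier frequency (Hz), fm the modulating frequency (Hz) and
    index the modulating index I. The duration and sample rate are
    illustrative defaults, not values taken from the thesis.
    """
    n_samples = int(duration * sample_rate)
    samples = []
    for k in range(n_samples):
        t = k / sample_rate
        # The modulator deviates the phase of the carrier by up to `index` radians.
        phase = 2 * math.pi * fc * t + index * math.sin(2 * math.pi * fm * t)
        samples.append(math.sin(phase))
    return samples


tone = fm_tone(fc=440.0, fm=220.0, index=5.0)
```

Setting index to 0 removes the modulation and leaves a plain sine wave at fc, which is a convenient sanity check.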
³ What happened to EvoS 1? – It was a mutant that got out of control and had to be deleted.
⁴ Remember, in an interactive system the human user acts as the fitness function.
⁵ FM synthesis was described in detail in section 2.2.5. Also, see figure 2.3 for an illustration of the
basic FM algorithm implemented in EvoS 2.

4.2.2 Implementation

Implementing the system described above in MatLab required a number of subprograms
or procedures. These subprograms were drawn together in the main program
(EvoS 2) which acted as a user interface.
Since there are three parameters required to generate a sound using FM, an individual
is represented by a vector of three real numbers - these are its genes:

$\vec{a} = (f_c, f_m, I)$

There are a number of procedures that operate on the genes of an individual and
achieve the vital functions of the ES:

eval This procedure takes the genotype of an individual and evaluates the pheno-
type. Although this sounds complicated, all it involves is extracting the synthe-
sis parameters (fm , fc and I ) from the vector, generating the resulting sound,
and playing this sound to the user. In a non-interactive system, eval would
also assess the phenotype according to the fitness function, but here we must let
the user decide how good it sounds.

mutate This procedure takes the genes of a parent individual and returns genes of
the mutant child offspring. The ‘mutation’ is just a random deviation applied
to each gene of the parent. This is achieved by sampling a normal distribution
with a mean of the parent gene. The variance of the normal distribution differs
according to which gene is being mutated. For example the variance for mu-
tating the frequency genes fm and fc is different to the variance for mutating
the index gene I . Deciding on these variances is tricky and will be discussed
later. So basically, mutate works by applying the normal distribution to each
element in the vector of numbers passed to it. What results is a mutated child
genome.

init This procedure returns a random set of genes. It is used to initialise the first
parent before any mutation can take place. To generate the random genes, a
uniform probability distribution is sampled. Here we have to decide on the
range of legal values for each gene, this is also a tricky issue that deserves a
discussion of its own.

When eval, mutate and init are combined with a suitable user interface, an
Evolution Strategy is formed. The interface randomly initialises the first parent using
init and then forms the first mutant child with mutate. These two sounds are
played to the user by calling eval. The user then decides which sound they like
more: if they like the parent more, the old child is discarded and mutate is called
on the parent once again; if they like the child more, the child takes the parent’s place
and a new child is formed via mutate. With this system the user explores the space
of possible sounds by listening to some randomly chosen directions, deciding which
is the best and then moving forward in that direction.
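The control flow described above can be sketched in Python as follows. This is only an illustration, not the EvoS 2 source: the function one_plus_one_es and the prefers_child callback are hypothetical names, and the human listener is replaced by a stand-in that prefers genomes closer to an arbitrary target so that the example can run unattended.

```python
import random


def one_plus_one_es(init, mutate, prefers_child, steps):
    """Run a (1+1)-ES: keep one parent, try one mutant child at a time."""
    parent = init()
    history = [parent]  # every parent chosen so far, oldest first
    for _ in range(steps):
        child = mutate(parent)
        if prefers_child(parent, child):
            parent = child  # the child takes the parent's place
            history.append(parent)
        # otherwise the child is discarded and the same parent is mutated again
    return parent, history


rng = random.Random(0)
TARGET = (440.0, 220.0, 5.0)    # arbitrary stand-in for "what the user wants"
SPREADS = (100.0, 100.0, 2.0)   # mutation spread per gene (fc, fm, I)


def init():
    return (rng.uniform(0, 5000), rng.uniform(0, 5000), rng.uniform(0, 30))


def mutate(genes):
    return tuple(rng.gauss(g, s) for g, s in zip(genes, SPREADS))


def score(genes):
    # Lower is better; in the real system a human judgement replaces this.
    return sum(((g - t) / s) ** 2 for g, t, s in zip(genes, TARGET, SPREADS))


def prefers_child(parent, child):
    return score(child) < score(parent)


best, history = one_plus_one_es(init, mutate, prefers_child, steps=300)
```

In an interactive version, prefers_child would play both sounds and wait for the user's choice, and history would back the ‘Display’ and ‘History’ commands.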
Figure 4.1: A screen shot of the EvoS 2 user interface.

Figure 4.1 shows a screen shot of EvoS 2, demonstrating the user interface. The
user executes commands by entering a character corresponding to a command in the
menu. The following commands can be seen in the menu in figure 4.1:

Parent This plays back the current parent sound to the user.

Child This plays back the current child sound to the user.

Mutate This mutates the parent to form a new child sound and then plays this new
sound so the user can hear it.

Replace This replaces the parent with the current child. A mutant child is formed
from the new parent and this sound is played to the user.

Genes This prints out the actual genes (fc , fm and I ) of the current parent and child.

Display The user can obtain a graph of all the parents they have chosen to date. The
graph shows how they have navigated through the space of possible sounds.

History The user can hear all of the parents they have chosen to date, starting with
the first and ending with the most current. This enables the user to hear how the
sound they started with has gradually changed to become the sound it is now.

Variances and Ranges

One of the tricky issues encountered in implementing mutate and init was decid-
ing on valid ranges and variances for each of the genes in the genome. Remember,
the genes represent parameters of a synthesis algorithm and so certain values will not
make any sense – for example if fm or fc are negative.
Once a valid range for each parameter was chosen, a variance was needed.
Choosing the “right” variance was critical to the effective operation of mutate. For
example, suppose valid frequencies for fm and fc range from 0 to 5,000Hz. If the
variance is 1Hz then the user will not notice the difference between a parent sound
and its mutant child. They will sound almost exactly the same. As a result, all
directions in the sound space will sound the same and the user won’t be able to decide
which way to go. On the other hand, suppose the variance is 1000Hz. The user will
not be able to tell that a parent and its mutant child are related. They will sound vastly
different, and the user will end up hopping at random around the space of possible sounds.
Clearly a balance must be reached where the variance is small enough that the
user can hear how the child sound relates to its parent, but large enough so there is a
discernible difference.
In EvoS 2 the main focus was to get a system up and running, so deciding on
effective ranges and variances was left to later versions of the system. The somewhat
arbitrary ranges and variances selected are shown in table 4.1.

Parameter    Minimum    Maximum    Variance
fc           0Hz        5,000Hz    100Hz
fm           0Hz        5,000Hz    100Hz
I            0          30         2

Table 4.1: Ranges and variances for the parameters of EvoS 2.
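To make the role of the ranges and variances concrete, here is a minimal Python sketch of init and mutate driven by the values in table 4.1. Two things are assumptions rather than facts from the text: the "variance" column is passed to the normal distribution as its standard deviation, and mutated values that leave the legal range are simply clipped back into it (the text does not say how out-of-range values were handled).

```python
import random

# (minimum, maximum, spread) for each gene, copied from table 4.1.
GENES = {
    "fc": (0.0, 5000.0, 100.0),
    "fm": (0.0, 5000.0, 100.0),
    "I": (0.0, 30.0, 2.0),
}


def init(rng):
    """Return random genes, sampled uniformly over each legal range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi, _) in GENES.items()}


def mutate(parent, rng):
    """Return a child whose genes are normally distributed around the parent's."""
    child = {}
    for name, (lo, hi, spread) in GENES.items():
        value = rng.gauss(parent[name], spread)
        # Assumption: clip to the legal range so that, for example,
        # frequencies never become negative.
        child[name] = min(hi, max(lo, value))
    return child


rng = random.Random(1)
parent = init(rng)
child = mutate(parent, rng)
```

Changing the third entry of each tuple is all that is needed to experiment with the "too similar" and "unrelated" extremes discussed above.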

4.2.3 Results

Although EvoS 2 was very simple it did provide some gratifying results. As there
were only three parameters, you could graph these in 3 dimensional
space and obtain a picture of how the user navigated through the parameter values
(the ‘Display’ feature). Figure 4.2 shows one such picture.

Figure 4.2: A sample evolution run for EvoS2.

Each cross in figure 4.2 represents a parent that the user has chosen. The evolution
starts where there is no cross and finishes at the asterisk⁶.
The thing to notice about figure 4.2 is the clustering of points in one area. This
cluster represents an area of the parameter space that the user found “interesting”
and wanted to keep exploring. The user has auditioned points outside this cluster but
found them unpleasant. You can see at the start, the user has cut a straight course away
from the initial point and into the cluster. This implies the initial point represented a
parameter combination resulting in an unpleasant sound.
In reality, the sounds produced by EvoS 2 were very simple, ranging from mild
bleeps to harsh distorted tones. The only real form of navigation the user could
conduct was to make the discomforting outbursts more sublime. The cluster in figure 4.2
actually represents an area of the sound space that was less harsh than its surroundings.
Using the ‘history’ feature you could hear how each parent was related to its
predecessor, but you could also notice the difference. This suggests that the choice of
variances was adequate.

⁶ Just an apology about the graphs: the results for the EvoS experiments were collected at different
times. Later on, clearer methods of displaying the data were discovered, but it was too late for these
early results. As well as being less clear, it means that the display format is inconsistent between
graphs. Whenever there is a format change, it is explained in detail, but once again I apologise for this
inconvenience.

Another interesting result was the effect of a small initial gene pool. In nature,
when the gene pool becomes small, it is disastrous. If a change in the environment is
introduced that kills one individual, all individuals will die because they are so similar
to one another. In EvoS 2 the gene pool is very small – the entire population consists
of just one individual. If this individual sounds very displeasing, it is most likely that
all its mutant children will sound displeasing. Consequently the user cannot pick an
appropriate child and the entire population is doomed. This problem may be alleviated
if the initial population consists of a few random parents. Hopefully, at least one of
these will sound slightly pleasant and allow meaningful evolution.

4.2.4 Conclusion

EvoS 2 demonstrated that a user could navigate through a simple FM sound space
using a (1+1) evolution strategy. Further work needs to address the problems of:
choosing appropriate ranges and variances for parameters; and generating sounds that
are actually pleasant to listen to.

4.3 EvoS 3

4.3.1 Aim

The goal of EvoS 3 was to trial an automated method of choosing variances for pa-
rameters.
In EvoS 2 it was noted that deciding on effective variances for each of the parame-
ters was a tricky matter. Why not, instead of guessing the variances by trial and error,
include the variances as part of the genome? The variances would then evolve to suit-
able values just as the synthesis values do. This is the kind of self-adaption or “meta-
evolution” that is characteristic of more advanced Evolution Strategies [Bäc95].

4.3.2 Implementation

In order to evolve the variances for each of the parameters, extra genes were added
to the genome from EvoS 2. These were:

$\sigma^2_{f_m}$ The variance used for mutation of the modulating frequency (fm).
$\sigma^2_{f_c}$ The variance used for mutation of the carrier frequency (fc).
$\sigma^2_I$ The variance used for mutation of the modulation index (I).

These additions resulted in a genome of the form:

$\vec{a} = (f_c, f_m, I, \sigma^2_{f_c}, \sigma^2_{f_m}, \sigma^2_I)$

Parameter           Minimum    Maximum    Variance
$\sigma^2_{f_c}$    0Hz        200Hz      100Hz
$\sigma^2_{f_m}$    0Hz        200Hz      100Hz
$\sigma^2_I$        0          4          2

Table 4.2: Variances and ranges for the variances of EvoS 3

The eval subroutine remained unchanged from EvoS 2, as it only required the
three synthesis parameters to generate the sound.
The init subroutine needed a slight change in order to generate random initial
values for the new parameters. Once again this raised the issue of a valid range for
the new parameters.
The mutate subroutine was modified so that the variances of each parameter
were mutated first. These mutated variances were used to mutate the synthesis
parameters. This again raised the issue of what variance and valid range to choose for
mutating the variances $\sigma^2_{f_m}$, $\sigma^2_{f_c}$ and $\sigma^2_I$.
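The two-stage mutation can be sketched in Python as shown below. This is a hedged reconstruction and not the EvoS 3 code: the genome layout follows the six-gene form given above, the ranges and the fixed spreads for the variance genes come from tables 4.1 and 4.2, and the clipping to legal ranges and the use of each variance gene directly as the spread of the normal distribution are assumptions.

```python
import random

SYNTH_RANGES = [(0.0, 5000.0), (0.0, 5000.0), (0.0, 30.0)]  # fc, fm, I (table 4.1)
VAR_RANGES = [(0.0, 200.0), (0.0, 200.0), (0.0, 4.0)]       # variance genes (table 4.2)
VAR_SPREADS = [100.0, 100.0, 2.0]                           # spread for mutating them


def clip(value, lo, hi):
    return min(hi, max(lo, value))


def mutate_self_adaptive(genome, rng):
    """Mutate a six-gene genome: three synthesis genes, then three variance genes."""
    synth, variances = genome[:3], genome[3:]
    # Stage 1: mutate the variance genes with fixed, hand-chosen spreads.
    new_vars = [
        clip(rng.gauss(v, s), lo, hi)
        for v, s, (lo, hi) in zip(variances, VAR_SPREADS, VAR_RANGES)
    ]
    # Stage 2: use the freshly mutated variances to mutate the synthesis genes.
    new_synth = [
        clip(rng.gauss(x, v), lo, hi)
        for x, v, (lo, hi) in zip(synth, new_vars, SYNTH_RANGES)
    ]
    return new_synth + new_vars


rng = random.Random(2)
parent = [440.0, 220.0, 5.0, 100.0, 100.0, 2.0]
child = mutate_self_adaptive(parent, rng)
```

Note that the sketch still needs hand-chosen spreads and ranges for the variance genes themselves, which is the guesswork the text describes.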

Variances and Ranges

In EvoS 3, the variances and valid ranges of $\sigma^2_{f_m}$, $\sigma^2_{f_c}$ and $\sigma^2_I$ were chosen with the aim
of achieving similar results to EvoS 2; however, there was still a lot of guesswork
involved. The values chosen are shown in table 4.2.

4.3.3 Results

EvoS 3 produced very similar results to EvoS 2. The sounds produced were still quite
unpleasant, but the user did have some control of the evolution.
It seems that evolving the variances did not achieve anything. This is illustrated
by figures 4.3 and 4.4, which show a typical evolution session using EvoS 3. As
before, each cross represents a parent that the user has chosen. The evolution starts
where there is no cross and ends at the asterisk.

Figure 4.3: Evolution of synthesis parameters in EvoS 3.

Figure 4.4: Evolution of variances in EvoS 3.

Figure 4.3 shows the evolution of the synthesis parameters with the small clusters
that were illustrated in EvoS 2. It was concluded that these clusters were evidence of
controlled evolution: the user had found an area in the sound space that was pleasant
and stayed there. Figure 4.4 shows the variance parameters of the genome. It is clear
that there is no convergence on any area in this space; it is a wild mess. Together
figures 4.3 and 4.4 demonstrate that while the user can navigate the sound space
adequately, they flop around wildly in the variance space.
One reason for this might be that the fitness function doesn’t stay stable for long
enough for the variances to converge on any point. For instance, if a freak mutant’s
large variances place it in an interesting area of the sound space, the human
user will choose it, instead of sticking to the goal they were following before. Clearly
a human fitness function is not as stable as an objective assessment of fitness.
Another reason could be the small number of individuals that can be tested by
the human user. In a proper Evolution Strategy with a computer evaluated fitness
function, thousands more individuals can be evaluated in a shorter amount of time.
Consequently, convergence in the variances may be observed there, whereas in EvoS 3 it
was absent.

4.3.4 Conclusion

EvoS 3 demonstrated that evolving the variances of synthesis parameters does not
offer any advantage over fixed variances.

4.4 EvoS 4

4.4.1 Aim

Since automatically choosing variances seemed like a dead end in EvoS 3, EvoS 4
aimed to begin addressing the problem of creating more interesting sounds.
Generation of more musically useful sounds using FM synthesis requires the use
of ‘envelopes.’ Envelopes allow the values of synthesis parameters to vary over the
duration of a sound. For example in EvoS 2 and EvoS 3 the modulation index pa-
rameter (I ) remained constant throughout the course of the sound. A much more
interesting result would occur if the value of I was varied as the sound proceeded.
This can be accomplished by using an envelope to control the value of I .

The simplest kind of envelope is one that controls the amplitude (volume) of a
sound. This situation is depicted in figure 4.5.

Figure 4.5: An envelope (top) controls the amplitude of a waveform (bottom).

To use envelopes in an Evolution Strategy, you have to be able to mutate them - this
is not as easy as it appears. In order to isolate the problem of mutating envelopes, EvoS
4 aimed to implement an instrument that only evolved amplitude envelopes. This
seems like a step backwards in the journey toward interesting sounds, as an amplitude
envelope modifying a sine wave produces more boring results than the bleeps of EvoS
2. However, this intermediate step was necessary in order to effectively advance.
Another goal of EvoS 4 was to start isolating the timbre of sound from other
musical characteristics⁷. One of the problems of EvoS 3 was that as well as timbre
being subject to evolution, pitch was also being evolved. It would be desirable to
separate pitch from the evolution process. A composer would then be able to work
out a melody, and then use the evolution system to find an appropriate timbre for that
melody.
Finally, EvoS 4 addressed another shortcoming of the previous versions. Recall
that init was called to generate a random parent to start off the evolution process.
If the user did not like this initial sound, they had to quit the system and start again.
This was the problem of the small initial gene pool. EvoS 4 would allow the user to
randomize the genome from within the system. This would allow the user to quickly
get to other parts of the sound space if they had become uninterested in the current
area.

⁷ Timbre is the characteristic tone quality of a particular instrument. See section 2.2.1 for more
information.

4.4.2 Implementation

Representing Envelopes

Representing an envelope with a genome requires the following parameters:

n The number of ‘breakpoints’ in the envelope.
xi The point in the sound where each breakpoint occurs.
yi The amplitude value at each breakpoint.

So an individual $\vec{a}$ has the following form:

$\vec{a} = (n, x_1, \ldots, x_n, y_1, \ldots, y_n)$ where $x_i, y_i \in [0, 1]$

For example, the envelope in figure 4.5 would be represented as:

$\vec{a} = (3, 0.166, 0.333, 0.833, 1, 0.75, 0.625)$

Notice here that only 3 breakpoints are specified; the endpoints (0, 0) and (1, 0) are
added implicitly.
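The following Python sketch shows one way to decode such a genome into an envelope. The helper names envelope_points and envelope_value are hypothetical, and straight-line interpolation between breakpoints is an assumption; the text only specifies the breakpoints and the implicit endpoints.

```python
def envelope_points(genome):
    """Decode (n, x1..xn, y1..yn) into a list of (x, y) points with endpoints."""
    n = int(genome[0])
    xs = genome[1:1 + n]
    ys = genome[1 + n:1 + 2 * n]
    # The endpoints (0, 0) and (1, 0) are added implicitly.
    return [(0.0, 0.0)] + list(zip(xs, ys)) + [(1.0, 0.0)]


def envelope_value(points, x):
    """Return the envelope amplitude at position x in [0, 1].

    Assumes straight-line segments between consecutive points.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return y1
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x must lie in [0, 1]")


# The example genome given for the envelope of figure 4.5.
genome = [3, 0.166, 0.333, 0.833, 1.0, 0.75, 0.625]
points = envelope_points(genome)
peak = envelope_value(points, 0.166)
```

Multiplying each output sample of a sound by envelope_value at its relative position in the sound would apply the envelope as an amplitude control.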

Mutating Envelopes

Once we have a representation for envelopes, we can work out a strategy that mutates
them. This might be pretty straightforward, except for a few considerations:

• If we change the number of envelope points (n), we have to decide which points
  to remove or add. This can be done by randomly picking one (or more) of the
  breakpoints and deleting or copying it. Also note that n is an integer and so if it
  is mutated with a normal distribution, it must be rounded back to an integer.

• If we change the xi’s we must make sure that they remain monotonic, otherwise
  we might end up with an envelope that runs backwards in time! We can
  make sure of this by sorting the xi’s in increasing order after we have made any
  changes.

• If we change the yi’s we should scale them so that the largest yi has a value of
  1. This ensures that the synthesized sound is played with maximum volume⁸.

With these constraints in mind, the mutation procedure can be described in the
following steps:

1. Mutate n, the number of envelope points. Any change in n occurs fairly infre-
quently so we set the variance quite low. Also if we do change n, we must make
sure it is still an integer.

2. For each xi , mutate with a normal distribution in the same way as in EvoS2.
Once the xi ’s have been mutated, sort them to make sure they are monotonically
increasing.

3. For each yi , mutate as for the xi ’s. Once they have been mutated, scale the
values so the largest yi is 1.

These steps were implemented in the mutate subroutine.
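The three steps might look like this in Python. It is a sketch under stated assumptions rather than the actual mutate subroutine: the spread values are placeholders (the text goes on to trial several), values are clipped to [0, 1], at least one breakpoint is always kept, and copied breakpoints are appended before sorting.

```python
import random


def clip01(value):
    return min(1.0, max(0.0, value))


def mutate_envelope(xs, ys, rng, spread=0.1, n_spread=0.3):
    """Return mutated copies of the breakpoint positions xs and amplitudes ys."""
    xs, ys = list(xs), list(ys)

    # Step 1: mutate n with a small spread and round back to an integer.
    new_n = max(1, round(rng.gauss(len(xs), n_spread)))
    while len(xs) > new_n:          # delete randomly chosen breakpoints
        i = rng.randrange(len(xs))
        del xs[i]
        del ys[i]
    while len(xs) < new_n:          # copy randomly chosen breakpoints
        i = rng.randrange(len(xs))
        xs.append(xs[i])
        ys.append(ys[i])

    # Step 2: mutate each x, then sort so time never runs backwards.
    xs = sorted(clip01(rng.gauss(x, spread)) for x in xs)

    # Step 3: mutate each y, then scale so the largest y is 1.
    ys = [clip01(rng.gauss(y, spread)) for y in ys]
    largest = max(ys)
    if largest > 0:
        ys = [y / largest for y in ys]
    return xs, ys


rng = random.Random(3)
child_xs, child_ys = mutate_envelope(
    [0.166, 0.333, 0.833], [1.0, 0.75, 0.625], rng
)
```

Passing n_spread=0 disables mutation of the number of breakpoints, which matches the simplification used in the experiment described next.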

Variances and Ranges

What about the variances and ranges for the parameters in the above mutation proce-
dure?
In this experiment, to make things easier, mutation of the number of breakpoints
n was disabled. Envelopes always had 3 breakpoints. This means the focus was on
finding a good variance for the xi ’s and yi ’s. Note that both xi ’s and yi ’s are in the
range [0..1] so they will have the same variance. A few different values of variance
were trialled, and the results of these are documented in the next section.

Randomizing Envelopes

As mentioned in the aim, EvoS4 let the user randomize envelopes without having
to quit and start again. This was simply done by providing an option that replaced
the current child with one generated by the init procedure. init also had to be
modified to deal with the envelope genome. It used a very similar strategy to mutate
except that uniform distributions were used instead of normal distributions.
8
If we wanted sounds to play with different volumes, we would have overall-volume as a parameter
in the genome. Here, we are only interested in the relative values of envelope breakpoints.

Isolating Timbre

A new eval subroutine was implemented that allowed any sequence of notes to be
played with the evolved parameters. This allowed a user to enter a melody that would
be heard each time with a new mutant timbre.

4.4.3 Results

The envelope mutation procedure seemed quite successful: figure 4.6 shows a parent
envelope and its mutant child.

Figure 4.6: The envelope of figure 4.5 shown with an envelope produced by mutation.

The most interesting result observed was the difference between mutation and
randomization. Randomization produced envelopes that were completely unrelated to
the parent sound (see figure 4.7), whereas mutation produced slightly different ones
(compare figures 4.6 and 4.7).
The degree of similarity of mutated envelopes was controlled by the variance used
for mutation of the xi ’s and yi ’s. Being able to compare the results of mutation and
randomization provided a useful metric for tuning this variance. If the variance was
too small, no difference would be heard between the parent and the child and the user

Figure 4.7: The envelope of 4.5 compared with an envelope produced by randomiza-
tion.

resorted to using randomization to search the space of possible sounds. If the variance
was too large, there seemed to be no difference between randomizing an individual
and mutating it, both produced equally dissimilar sounds. However, if the variance
was just right, mutation would produce a sound that was satisfyingly different yet
clearly related to the parent sound. It was here that useful evolution really took place.
The user can feel that they are trying different directions in the sound space and then
choosing the one that sounds best. If this area of the space proves uninteresting they
can jump to an entirely new area by randomizing.
This idea is illustrated in the following figures which show the use of different
variances for mutation. Each of the figures plots how the xi ’s (top graph) and yi ’s
(bottom graph) changed as the user evolved a sound. Each cross represents a parent
that the user has chosen. A solid line joining two parents means mutation was used to
make this jump, whereas a dotted line means randomization was used. The evolution
run starts at the asterisk and ends at the circle. Note that the number of breakpoints
(n) is always 3 in these graphs, that is, n was not subject to mutation. This restriction
was necessary in order to effectively visualize the results.
Figure 4.8 shows an evolution where the variance was set at 0.02 (1/50 of the
parameter range). You can see the very tightly bunched clusters of points separated by dotted
lines. These represent groups of mutants separated by randomizations. The mutants

Figure 4.8: EvoS 4 evolution with a variance of 1/50.

are so similar to each other that you can’t hear (or see) the difference between them.
This variance is too small.
Likewise, figure 4.9 shows an evolution where the variance was set at 0.1 (1/10 of
the parameter range). Here, the distance between mutations (solid lines) and random-
izations (dotted lines) is almost the same. It is difficult to tell the difference between
a mutant sound and a random one. As a result, randomization seems just as effective
for exploring the space as mutation. This variance is too large.
Finally, figure 4.10 shows an evolution where the variance was set at 0.05 (1/20 of
the parameter range). In this figure you can see that mutations are separated by a small
yet still visible distance (as opposed to figure 4.8). Randomizations traverse a much
larger distance than mutations (as opposed to figure 4.9). This allows other parts of
the space to be explored that would take too long to reach via mutation alone. This
variance is just right.
Table 4.3 summarises these results at a glance.

Figure 4.9: EvoS 4 evolution with a variance of 1/10.

Figure 4.10: EvoS 4 evolution with a variance of 1/20.

Variance   Normalised variance (a)   Shown in      Observations
0.02       1/50                      Figure 4.8    Too small, no audible difference between parent and child.
0.1        1/10                      Figure 4.9    Too large, mutation no different to randomization.
0.05       1/20                      Figure 4.10   Just right! Useful evolution obtained.

(a) Normalised variance is the variance expressed as a fraction of the parameter range.

Table 4.3: Effects of different variances in EvoS 4.
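To get a feel for why the size of the mutation step matters, the average distance travelled by a mutation can be compared with the average distance travelled by a randomization on a single parameter in [0..1]. The Python sketch below is an illustration only, not part of EvoS: the function name mean_step is made up for this example, and it assumes the quoted figure is used as the standard deviation of the normal noise. If the figure is a true variance, the standard deviation would be its square root and the mutation steps would be correspondingly larger.

```python
import random


def mean_step(spread, trials=10000, seed=0):
    """Average |child - parent| for mutation and for randomization.

    Assumption: `spread` is the standard deviation of the mutation noise
    and the parameter lives in [0, 1].
    """
    rng = random.Random(seed)
    mutation_total = 0.0
    random_total = 0.0
    for _ in range(trials):
        parent = rng.uniform(0.0, 1.0)
        # Mutation: a small normally distributed step, clipped to the range.
        mutant = min(1.0, max(0.0, parent + rng.gauss(0.0, spread)))
        # Randomization: a completely fresh uniform sample.
        fresh = rng.uniform(0.0, 1.0)
        mutation_total += abs(mutant - parent)
        random_total += abs(fresh - parent)
    return mutation_total / trials, random_total / trials


for spread in (0.02, 0.05, 0.1):
    mutation_step, random_step = mean_step(spread)
    print(f"spread={spread}: mutation {mutation_step:.3f}, randomization {random_step:.3f}")
```

For unclipped normal noise the mean absolute step is about 0.8 times the standard deviation, while two independent uniform points on [0..1] are on average one third of the range apart. The sketch only measures distance in parameter space; whether a step is audible is a perceptual question that the user trials above answer.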

Isolating Timbre

The melody playing feature worked well. It isolated the timbre of the sound from
other musical characteristics such as pitch and duration. However, it was too slow to
be of much use. The more notes that the melody contained, the longer MatLab took
to produce the accompanying sound. It was so slow in fact, that the only viable way
of using the system was to just have one note programmed in the melody.
By the way, the sounds produced by EvoS 4 were quite boring. As it was only
a volume envelope that was being evolved, you really had to know what you were
listening for to tell the difference between sounds.

4.4.4 Conclusion

EvoS 4 demonstrated a successful system for evolving simple envelopes. It was found
that a variance of one twentieth of the parameter range facilitated useful evolution.

4.5 EvoS 5

4.5.1 Aim

EvoS5 aimed to advance the complexity of the evolved instrument a step further.
Instead of applying an envelope to the volume of a tone, an envelope was applied to
the modulation index (I ) input of an FM instrument. When this was done, two extra
parameters were included to control the range that this envelope varied the modulation
index over. These are called I1 and I2 .

For example if I1 = 0 and I2 = 5 and we have an envelope as shown in figure
4.5, then the modulation index will start out as 0, rise quickly to 5 at the maximum
amplitude and then decay slowly back down to 0. If I1 = 5 and I2 = 0 then the envelope
will be flipped. The modulation index will start at 5, drop quickly to zero and then
rise slowly back to 5.
Because the modulation index is a control of the spectral density of the resulting
sound, quite interesting and lively sounds can be produced merely by changing the
values of I1 and I2 .
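One way to read I1 and I2 is as the end points of a linear mapping: an envelope value of 0 gives a modulation index of I1 and an envelope value of 1 gives I2. That reading is an assumption which agrees with the example above; it is not taken from the EvoS source. In the Python sketch below the function names and the three breakpoints are hypothetical, the breakpoints standing in for the envelope of figure 4.5.

```python
def envelope_value(xs, ys, t):
    """Piecewise-linear interpolation of breakpoints (xs, ys) at time t in [0, 1]."""
    if t <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        x0, y0, x1, y1 = xs[i - 1], ys[i - 1], xs[i], ys[i]
        if t <= x1:
            if x1 == x0:
                return y1
            return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
    return ys[-1]


def modulation_index(xs, ys, i1, i2, t):
    """Map the envelope (0..1) linearly onto the range i1..i2 (assumed mapping)."""
    return i1 + (i2 - i1) * envelope_value(xs, ys, t)


# Hypothetical envelope: fast rise to the peak, slow decay back to zero.
xs, ys = [0.0, 0.1, 1.0], [0.0, 1.0, 0.0]
rising = [modulation_index(xs, ys, 0.0, 5.0, t / 10) for t in range(11)]
flipped = [modulation_index(xs, ys, 5.0, 0.0, t / 10) for t in range(11)]
```

With I1 = 0 and I2 = 5 the index follows the envelope shape; swapping the two values turns the same envelope upside down, as described above.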
In EvoS 4, an effective method for evolving envelopes was developed. EvoS5
aimed to develop an effective method for evolving the parameters I1 and I2 .

4.5.2 Implementation

Representation

In order to concentrate on the parameters I1 and I2 , all other aspects of the sound were
kept constant. The amplitude envelope, the modulation index envelope and the ratio
of fm to fc remained unchanged. So the genome for an individual in EvoS5 was very
simple:
\[ \vec{a} = (I_1, I_2) \quad \text{where } I_1, I_2 \in [0, 20] \]

Mutation

I1 and I2 were very easy parameters to mutate. Their effective range was [0..20] and it
didn’t matter if their values were real or integer. Consequently the mutation procedure
(mutate) just involved adding a small amount of normally distributed noise as usual.
The only consideration is the variance of the normal distribution. The next section
shows how an effective variance was obtained.
Changing init to provide a randomization procedure was equally simple. It just
involved sampling a uniform distribution with the range [0..20].
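As a rough illustration of how little is involved, the two procedures might look like this in Python. This is a sketch under assumptions rather than the EvoS code: the function names are invented here, values are clipped so that they stay inside [0..20], and spread is treated as the standard deviation of the noise, with a default of one twentieth of the range.

```python
import random

I_MIN, I_MAX = 0.0, 20.0  # effective range of I1 and I2


def mutate_index_range(i1, i2, spread=1.0, rng=random):
    """Add a small amount of normally distributed noise to I1 and I2."""
    def nudge(value):
        # Clipping to the effective range is an assumption of this sketch.
        return min(I_MAX, max(I_MIN, value + rng.gauss(0.0, spread)))
    return nudge(i1), nudge(i2)


def init_index_range(rng=random):
    """Randomization: sample I1 and I2 uniformly over the whole range."""
    return rng.uniform(I_MIN, I_MAX), rng.uniform(I_MIN, I_MAX)


parent = init_index_range()
child = mutate_index_range(*parent)
```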

4.5.3 Results

In EvoS5, the most interesting result was once again the contrast between the mutation
and randomization procedures. This was illustrated particularly well as the space was
only two dimensional. Different variances were trialled for the mutation procedure
and you can really see their effect on the corresponding graphs. The graphs show how

the values of I1 and I2 changed while a user was searching for interesting sounds. It
was interesting to note that the results were very similar to EvoS4: a variance
of 1/20 (one twentieth) of the parameter range led to the most successful evolution. The
next few paragraphs document the results of the trials conducted.
Figure 4.11 shows a typical user’s course through the parameter space with a
variance of 0.4 (1/50 of the parameter range).

Figure 4.11: EvoS 5 evolution with a variance of 1/50.

As with the pictures in EvoS 4, each cross represents a parent that the user has
chosen. The evolution starts on the left of the figure and ends at the cross with the
circle. Mutations are represented by solid lines joining crosses and randomizations
are a dotted line between crosses.
Notice how tightly the clusters of points are bunched. This is because the small
variance doesn’t allow mutants to occur very far away from the parent sound. As
a result mutants always sound very similar to the parent. In fact with a variance
of 1/50 there was no discernible difference between a parent and its mutant child.
Consequently, this is not very useful for evolving sounds – the user can never make
any progress as all mutant children sound exactly the same. Also note how these
clusters of points are widely spaced. These large jumps are caused when the user
decides to randomize the parameters.
Figure 4.12 shows a user’s path with a variance of 2 (1/10 of the parameter range).

You can immediately see the contrast to figure 4.11. The clusters of points are no

Figure 4.12: EvoS 5 evolution with a variance of 1/10.

longer tightly bunched and it is difficult to tell a mutation from a randomization. What
the user finds when listening to the sounds is that each mutant child sounds quite
different from the parent – so different that the user can’t tell how they are related. As
a result, using randomization seems just as effective for getting good sounds as using
mutation.
Figure 4.13 shows a user’s path, this time with a variance of 1 (1/20 of the
parameter range). Here is a balance between the extremes of figures 4.11 and 4.12.
The mutant children are spaced widely enough from the parent that the user can hear
the difference in the sound, yet they are close enough that the user can hear how the
sound is related to the parent. Consequently when the user hears something they like,
they can follow that direction and obtain a pleasing sound. Instead of getting ran-
dom clusters of points you get ‘lines’ of points that correspond to the user following
a particular direction in the sound space. Note also in figure 4.13 that you can see the
difference between mutation steps and randomization steps. It is not as marked as it
was in figure 4.11 but it is still visible.
Table 4.4 sums up at a glance the discussion of the previous paragraphs.
A final point is that the sounds produced were now a bit more interesting. Instead
of just a volume envelope modifying a sine wave (as was the case in EvoS 4), the

Figure 4.13: EvoS 5 evolution with a variance of 1/20.

sounds had characteristics of trumpets and oboes.

4.5.4 Conclusion

EvoS 5 demonstrated an effective system for evolving the parameters I1 and I2 . It
was found that a variance of 1/20 of the parameter range resulted in successful evolution,
which was the same result obtained in EvoS 4. Could this hold true as a general rule
(footnote 9), or is this a little early to be hypothesizing?

4.6 EvoS 6

4.6.1 Aim

In EvoS4 and EvoS5 effective methods for evolving envelopes and modulation indices
were discovered. With these we could build an FM instrument with an amplitude
envelope and a modulation index envelope which would be able to produce a fair range
Footnote 9: The “1/20 rule”?

Variance   Normalised variance (a)   Shown in      Observations
0.4        1/50                      Figure 4.11   Too small, no audible difference between parent and child.
2          1/10                      Figure 4.12   Too large, mutation no different to randomization.
1          1/20                      Figure 4.13   Just right! Useful evolution obtained.

(a) Normalised variance is the variance expressed as a fraction of the parameter range.

Table 4.4: Effects of different variances in EvoS 5.

of sounds. However, Chowning’s FM instrument [Cho73] has one more parameter
that permits synthesis of a very wide range of timbres, namely the ratio of carrier
frequency to modulating frequency:

\[ \frac{f_c}{f_m} = \frac{N_1}{N_2} \]

EvoS6 aimed to discover an effective method of evolving this ratio.


The ratio of fc to fm was actually being evolved in EvoS2 and EvoS3 as whenever
the values of fc or fm were mutated, the ratio was changed. This form of mutation is
undesirable because as stated before, we want our system to evolve timbres that are
independent of pitch. The pitch of a note is usually determined by its fundamental
frequency f0 . In FM synthesis, f0 is given by the following equation:
\[ f_0 = \frac{f_c}{N_1} = \frac{f_m}{N_2} \]

where N1 and N2 are integers with no common factors.

Clearly, if fc and fm are mutated arbitrarily as they were in EvoS2 then f0 will
change with each mutation. To see how this would be undesirable, consider the fol-
lowing example:

John is a composer who has just come up with a brilliant melody. It is
very simple, consisting of two notes: middle C followed by the A above
this. John worked out the melody on his piano, but he now wants to play
it with a ‘space-age brassy’ kind of instrument. I suggested he use EvoS
to find his perfect sound. Taking my advice, John uses EvoS and finds a
sound that he really likes. However, when he uses the parameter settings
to play his melody, he finds it is out of tune with the piano part! Angered
by the time he has wasted, John goes to a traditional synthesizer where he
knows if he presses the middle C key, that is the note he will hear. After
many days of twiddling knobs he doesn’t understand, John finally settles
for a sound that is not what he wanted - a very second rate ‘space-age
brassy’ timbre. However, he is consoled by the fact that when he plays
his melody it will be in tune with whatever other parts he has worked out
for his composition.

Although the above example is fictional, we would like to avoid anything like this
happening in real life. For this to be the case, we need to make sure the fundamental
frequency f0 stays constant after mutation and even randomization, so that only the
timbre of the sound is changed.
To do this we simply mutate the parameters N1 and N2 . When it comes time to
synthesize the sound, we work out fc and fm according to the pitch (f0 ) that the user
wants to hear. That is:

\[ f_c = f_0 N_1 \quad \text{and} \quad f_m = f_0 N_2 \]
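To make this concrete, the Python sketch below derives the carrier and modulator frequencies from the requested pitch and then evaluates the basic FM equation e(t) = A sin(2π fc t + I sin(2π fm t)) with a constant modulation index. It is an illustration only: the function names, the sample rate and the chosen values of N1, N2 and I are assumptions for this example, and the envelopes of EvoS 4 and EvoS 5 are left out for brevity.

```python
import math

SAMPLE_RATE = 8000  # samples per second, chosen arbitrarily for the example


def fm_frequencies(f0, n1, n2):
    """Carrier and modulator frequencies for pitch f0 and ratio N1:N2."""
    return f0 * n1, f0 * n2


def fm_sample(t, f0, n1, n2, index, amplitude=1.0):
    """One sample of a simple FM tone at time t (seconds)."""
    fc, fm = fm_frequencies(f0, n1, n2)
    return amplitude * math.sin(
        2 * math.pi * fc * t + index * math.sin(2 * math.pi * fm * t)
    )


# The same timbre (N1 = 1, N2 = 2, I = 5) played at two different pitches:
# middle C (about 261.63 Hz) and the A above it (440 Hz).
middle_c = [fm_sample(k / SAMPLE_RATE, 261.63, 1, 2, 5.0) for k in range(80)]
a_above = [fm_sample(k / SAMPLE_RATE, 440.0, 1, 2, 5.0) for k in range(80)]
```

Because only N1 and N2 live in the genome, changing the pitch of a note never touches the evolved timbre parameters.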

The problem is that mutation of N1 and N2 is not straightforward. If N1 and
N2 are integers, the FM sound produced will be ‘harmonic’. That is, each frequency
present in the spectrum of the sound is a whole number multiple of the fundamental
frequency. The human ear is used to hearing harmonic sounds in string and wind
instruments; harmonic sounds are ‘nice’. If N1 and N2 are not integers then the FM
sound produced is ‘inharmonic’. Examples of inharmonic sounds are drums and bells,
which are usually ‘harsh’.

4.6.2 Implementation

An EvoS6 individual is simply represented as:

\[ \vec{a} = (N_1, N_2) \]

In order to implement an effective procedure for mutating N1 and N2 we have
to distinguish between the two domains of harmonic and inharmonic sounds. Since
mutation should produce sounds that are similar to the parent sound yet different,
mutation follows these guidelines:

• When the parent sound is harmonic, most mutant children should be harmonic
too, since an inharmonic sound is very different from a harmonic sound.

• When the parent sound is harmonic, mutant children that are harmonic should
only occasionally possess a different value of N1 or N2 from their parents. This
is because within the harmonic domain, different ratios of N1 and N2 produce
quite different timbres.

• When the parent sound is harmonic, there is a very small chance that a mutant
child could be a closely related inharmonic sound.

• When the parent sound is harmonic, and the values of N1 and N2 have been
mutated, all common factors of these numbers should be eliminated. This ensures
that you never get the situation of a child timbre being the same as the parent
timbre but just transposed by a certain amount. For example if N1 = 2 and N2
= 2, the resulting sound has the same timbre as when N1 = N2 = 1, but it is just
transposed up an octave.

• When a sound is inharmonic, N1 and N2 may not be integers. In this case,
mutation is straightforward: a small normally distributed random variable is
added to the parent settings. Of course, there is a very small chance that both
N1 and N2 end up as integers after mutation. If this happens, the rules for
harmonic mutation apply subsequently.

The mutate subroutine was modified to accommodate the above guidelines.
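The guidelines can be sketched in Python as follows. This is one possible reading, not the EvoS subroutine: the function names, the escape probability p_escape, the choice of a step of plus or minus one for harmonic changes and the clamping to a fixed range are all assumptions made for the illustration. The defaults of 1 chance in 8, a spread of one twentieth of the range and a range of [1..10] are borrowed from the results reported in the next section.

```python
import math
import random


def reduce_ratio(n1, n2):
    """Remove common factors so a child is never just a transposed parent."""
    g = math.gcd(n1, n2)
    return n1 // g, n2 // g


def clamp(value, lo=1.0, hi=10.0):
    return min(hi, max(lo, value))


def mutate_ratio(n1, n2, p_change=1 / 8, p_escape=0.02, spread=0.45, rng=random):
    """Mutate (N1, N2), treating harmonic and inharmonic parents differently."""
    if float(n1).is_integer() and float(n2).is_integer():
        n1, n2 = int(n1), int(n2)
        roll = rng.random()
        if roll < p_escape:
            # Very small chance: jump to a closely related inharmonic sound.
            return (clamp(n1 + rng.gauss(0.0, spread)),
                    clamp(n2 + rng.gauss(0.0, spread)))
        if roll < p_escape + p_change:
            # Occasionally move one of the two integers by a single step.
            if rng.random() < 0.5:
                n1 = int(clamp(n1 + rng.choice((-1, 1))))
            else:
                n2 = int(clamp(n2 + rng.choice((-1, 1))))
            return reduce_ratio(n1, n2)
        # Most of the time a harmonic child keeps the parent's ratio.
        return n1, n2
    # Inharmonic parent: add a small normally distributed step to each value.
    return (clamp(n1 + rng.gauss(0.0, spread)),
            clamp(n2 + rng.gauss(0.0, spread)))


child = mutate_ratio(3, 2)
```

Because the harmonic test is made on every call, an inharmonic child that happens to land on two integers is handled by the harmonic rules the next time it is mutated.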


The init subroutine which is used for initialisation of individuals and the ran-
domization feature gave the individual an equal chance of being harmonic or inhar-
monic.

4.6.3 Results

After a little fine tuning, N1 and N2 were being evolved successfully. If you started
with a harmonic sound, the mutant children would generally be the same, only
occasionally changing to another harmonic sound. This effect was achieved by having a
harmonic variance (footnote 10) that gave about 1 chance in 8 of changing the values of
N1 and N2 . Starting with an inharmonic sound, each mutant child was different yet clearly
related to the parent. This was achieved by having an inharmonic variance (footnote 11)
set to 1/20 of the range. The useful range of N1 and N2 was found to be [1..10].

Footnote 10: The variance when N1 and N2 are integers.
Footnote 11: The variance of N1 and N2 when they are not integers.

4.6.4 Conclusion

EvoS6 demonstrated an effective system for evolving the ratio of carrier frequency to
modulating frequency. The parameters N1 and N2 are not trivial to mutate as they are
discontinuous in the timbral space. It is interesting to note how the complexity of the
mutation procedure blossoms in an attempt to provide a continuous timbral space.

4.7 EvoS 7

4.7.1 Aim

EvoS 7 finally put all the pieces together. It implemented an evolution strategy for the
complete FM instrument described in [Cho73]. To describe a particular timbre with
this instrument, you had to specify a separate envelope for amplitude and modulation
index, the modulation index range and the ratio of carrier to modulation frequency.
Since this instrument was finally capable of producing semi-decent sounds, it was
decided to measure its success by gathering the opinions of a few users. The users
first assessed Arno van Goch’s P-Farm [Goc96], then used EvoS 7. After this they
were asked to comment on the merits and faults of each system.

4.7.2 Implementation

Implementation of EvoS 7 consisted of just tacking together all the previous versions
of EvoS and making them work in harmony.

Representation

As in previous versions, an individual is represented by a vector of real numbers. The
genome in EvoS 7 was the most complicated so far:

\[ \vec{a} = (n_I, x_{I.1}, \ldots, x_{I.n_I}, y_{I.1}, \ldots, y_{I.n_I}, I_1, I_2, N_1, N_2,
             n_V, x_{V.1}, \ldots, x_{V.n_V}, y_{V.1}, \ldots, y_{V.n_V}) \]

The genome starts with the modulation index envelope. This is specified in the same
way as the amplitude envelopes were in EvoS4: n_I is the number of points in the
envelope and (x_{I.i}, y_{I.i}) are the coordinates of the breakpoints. The modulation index
envelope is followed by I1 and I2 specifying the range that this envelope operates
over (as in EvoS5). N1 and N2 specify the ratio of carrier to modulator frequency as
in EvoS6. Finally the amplitude envelope is specified, also following the convention
adopted in EvoS4: n_V specifies the number of points in the envelope and (x_{V.i}, y_{V.i})
are the coordinates of the breakpoints.
Of course the eval procedure was modified to accept individuals of this form. It
had to be able to interpret each of the parameters in order to synthesize the sound and
play it to the user.
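Interpreting such a genome amounts to walking along the vector and peeling off each section in turn. The Python sketch below shows one way this could be done; the function name decode_genome and the dictionary layout are invented for this illustration and do not describe the actual eval procedure.

```python
def decode_genome(genome):
    """Split a flat EvoS 7 style genome into its named sections."""
    values = list(genome)
    pos = 0

    def take(count):
        # Consume the next `count` values from the vector.
        nonlocal pos
        chunk = values[pos:pos + count]
        if len(chunk) != count:
            raise ValueError("genome is too short")
        pos += count
        return chunk

    def take_envelope():
        # An envelope is stored as n, then n x values, then n y values.
        n = int(take(1)[0])
        return {"n": n, "xs": take(n), "ys": take(n)}

    index_envelope = take_envelope()
    i1, i2, n1, n2 = take(4)
    amplitude_envelope = take_envelope()
    if pos != len(values):
        raise ValueError("genome has trailing values")
    return {
        "index_envelope": index_envelope,
        "I1": i1, "I2": i2, "N1": n1, "N2": n2,
        "amplitude_envelope": amplitude_envelope,
    }


example = [3, 0.0, 0.1, 1.0, 0.0, 1.0, 0.0,   # modulation index envelope
           0, 5, 1, 2,                         # I1, I2, N1, N2
           2, 0.0, 1.0, 1.0, 0.0]              # amplitude envelope
sections = decode_genome(example)
```

Since the number of breakpoints is stored in front of each envelope, the two envelopes can have different lengths without any change to the decoder.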

Mutation and Randomization

Mutation and Randomization functions are accomplished using the techniques de-
scribed in EvoS4 to EvoS6.
The mutate subroutine takes an individual and splits it up, applying the mutation
schemes of previous versions to each section. The amplitude and modulation index
envelopes are mutated using the technique described in EvoS4. The modulation index
range (I1 and I2 ) is mutated as described in EvoS5. N1 and N2 are mutated using the
technique of EvoS6.
A similar scheme is used to achieve the randomization feature through the init
subroutine.

User Interface

The user interface used for EvoS 7 is still the same simple keyboard interface that was
used in EvoS 2 (refer back to figure 4.1).

4.7.3 Results

To illustrate the effectiveness of EvoS 7, figures 4.14 to 4.17 show an example of an
evolution run. The figures show how each parameter in the EvoS 7 genome changed
over the course of the evolution.
First look at figure 4.14 which shows how I1 and I2 change. Parents that the
user has chosen are represented by a number - the generation. The evolution starts
at generation 1 and ends at generation 15. A solid line joining two points means
mutation was used to make this step, whereas a dotted line implies randomization.
Looking at figure 4.14 you can see that mutation was used from generations 1 to
5, then a random jump occurred between 5 and 6. Again, mutation is used from 6 to
10 followed by another random jump from 10 to 11. Finally, generations 11 to 15 are
traversed via mutation. Figure 4.14 is quite an exquisite example of evolution. The

Figure 4.14: EvoS 7 evolution: Modulation indices (I 1 and I2 ).

sections of mutation exhibit a strong sense of direction, implying that the user could
control the search through the sound space. This is contrasted by the large leaps of
randomization which rapidly transport the user to parts of the sound space that would
be unreachable via mutation alone.
Now look at figure 4.15 showing how N1 and N2 changed over the course of the
evolution.
In generations 1 to 5 we note that the sound is inharmonic (footnote 12). A small path is
followed (via mutation) until the random jump occurs between generations 5 and 6.
This jump lands the user on a harmonic sound (footnote 13) where it stays until generation 11.
Recall from EvoS 6 that when the sound is harmonic, you don’t want it to change
often because the harmonic sounds will be quite different from each other. This is
why there is no movement from generation 6 to 10. The random jump to generation
11 also results in a harmonic sound, but from 11 to 12 and 12 to 13 there are some
changes via mutation. Notice that the sound is still harmonic - this illustrates how the
harmonic and inharmonic spaces are disjoint under mutation. Finally the evolution
comes to rest from generation 13 to 15.
Finally, examine figures 4.16 and 4.17 which show the amplitude and modulation
Footnote 12: Remember, an inharmonic sound results when either N1 or N2 is not an integer.
Footnote 13: Harmonic sounds result when N1 and N2 are integers.

Figure 4.15: EvoS 7 evolution: Frequency ratios (N1 and N2 ).

index envelopes respectively.


Although it is hard to pick out much detail in these figures, you can clearly see
the contrast between mutation and randomization. In both figures, the envelope shape
changes gradually from generation 1 to 5 as it is mutated. At generation 6 the shape
is quite different due to the randomization step. From generation 6 to 10, mutation
changes the shape gradually again until the random jump at generation 11. Note again
the marked difference in shapes between generation 10 and 11. Finally in generations
11 to 15, the shape is changed gradually via mutation.
Note that each of these figures leaves out a lot of information. For each generation,
only the chosen parent is shown - you don’t see the many children that may have
been auditioned and discarded. Imagine what figure 4.14 would look like if you did:
around each parent point you would see a cluster of sounds that the user had auditioned,
only one of which (the one you can see on the graph) would be joined by a line to
the parent. The point being made is that it is not a fluke that the first five generations
in figure 4.14 form a line going vertically down - the user has tried other directions
but decided ‘down’ is what they like the sound of best. If you appreciate that at the
same time, around 14 other parameters are being mutated which also exhibit similar
trends (footnote 14), then you can see that EvoS 7 (and artificial evolution in general) is really
quite an effective tool for searching this large, multi-dimensional space.

Footnote 14: Refer back to figures 4.15 to 4.17.

Figure 4.16: EvoS 7 evolution: Amplitude Envelopes.

Figure 4.17: EvoS 7 evolution: Modulation Index Envelopes.



4.7.4 User evaluation

To further assess the utility of EvoS 7 a number of people spent some time using both
EvoS 7 and Arno van Goch’s P-Farm [Goc96]. Since P-Farm uses Roland’s Juno-106
synthesizer this allowed extra comparisons. The Juno allows you to manually adjust
the sliders that control its parameters, so this method could be compared with evo-
lution using P-Farm. Also, P-Farm can generate a population of completely random
patches. Once again, this random method of finding good sounds could be compared
with evolution using P-Farm.
The users (Brett, Emma and Nick) have had little musical training; however, they
all have a strong interest in electronic music and enjoy listening to ‘off-the-wall’
sounds and timbres. Each user spent about an hour experimenting with the systems;
the next sections summarise their experiences and opinions.

Brett

Brett found using P-Farm was easier than manually moving the sliders on the Juno.
After ten generations he had found a few sounds which he was reasonably happy
with. Brett thought it was very hard keeping track of the sounds that he liked. P-Farm
presents a population of 32 sounds to the user who has to audition them all to find the
ones they like most. Brett felt if there had been a quick way to note sounds that you
liked (for example, a fitness rating of 1 to 10) it would have been much easier to keep
track of which sounds were good.
Brett thought that using EvoS 7 was a lot easier than P-Farm. In EvoS 7 you
are only presented with one sound at a time, and you have to make a decision as to
whether it is better or worse than the current parent sound. Brett felt that this provided
fewer distractions than P-Farm where you are constantly thinking about the 31 other
sounds and whether they are better or worse than the current one. Brett noticed that
EvoS 7 was slower than P-Farm when generating sounds for auditioning. P-Farm has
a real-time response – as soon as you click on an individual you can play a tune with
it on the keyboard. EvoS 7 however, takes a little while to generate the melody used
for auditioning. The longer this melody is, the longer the delay. Finally, Brett did not
like the fact that you couldn’t save in EvoS 7. If you got a sound you were reasonably
happy with, you were too afraid to explore anywhere else as you may lose the sound
you worked so hard for. To overcome this, Brett suggested a ‘stud-farm’ idea. As you
go along exploring the sound space, if you find a sound you like, you can add it to
your ‘stud-farm’. Sounds stored in the stud-farm can be retrieved at any time, so you
no longer have to worry about exploring further and losing a good sound. Also, after a
while the stud-farm will contain a portfolio of your favourite sounds. You could mate
these favourites together (just like in a real stud-farm) to see if anything interesting

happens.

Emma

Emma used P-Farm for 8 generations and found two sounds that she was happy with.
She found evolution with P-Farm was much more intuitive than just moving the sliders
on the Juno without knowing what they did. A very interesting result however is that
out of 32 completely random patches (one random P-Farm population) she found 4
sounds that she liked even more than the 2 she took 8 generations to evolve.
When using EvoS 7 Emma thought that all the sounds it made were a lot harsher
than the ‘nice’ sounds created with P-Farm. She felt that in order to use EvoS 7
effectively you have to know what kind of sound you are looking for to begin with.
Emma also thought it would be good to be able to keep and go back to any sounds
that you liked, but she also suggested an ‘undo’ feature. Sometimes when you are
auditioning sounds very quickly, you make a mistake and accidentally destroy a child
sound that you wanted to keep. It would be good to be able to backtrack a few steps
and rectify your mistake. Emma thought that to make an evolutionary sound system
really useful you must be able to mutate samples. For example if I’m listening to a
song and I hear a trumpet sound that I really like – I should be able to sample this
sound and then mutate it into something similar yet different. Emma thought she
would also like to take samples of all her favourite sounds, then breed these sounds
together to make new and interesting sounds.

Nick

Nick thought that using P-Farm was much easier than adjusting the Juno’s sliders
manually. He also found P-Farm evolution easier than randomly generating patches.
Nick used P-Farm for 10 generations to evolve an “organ” type sound.
Nick found using EvoS 7 much easier than P-Farm. He felt that EvoS 7 gave
you more control over where the sound was going – you could see trends that were
occurring. Although sometimes progress was a bit slow, he felt eventually you would
get closer to the sound you desire. On the downside, Nick thought it was frustrating
that you can sometimes go ‘backwards’, away from the sound you like. When you do
this, there is no going back. Also, sometimes when you are being cautious you can
miss an opportunity to get closer to your desired sound. Both these problems could be
fixed with an ‘undo’ feature that allows the user to backtrack a number of steps and
pick up where they left off. Nick disliked the slow speed of EvoS 7 and would have
preferred more notes to test the sounds on. Finally Nick suggested a feature where
you specify areas of ‘like’ and ‘dislike’. If you were in a bad spot in the sound space,
you could tell the system and it would steer you away from that area in future. This

would stop you going round in circles, constantly encountering the same bad area of
the sound space.

4.7.5 Conclusion

A number of conclusions can be drawn from analyzing the user’s comments:

 P-Farm demonstrated that evolution is more intuitive than adjusting synthesis


parameters manually. This is especially the case when the user does not know
how each parameter effects the overall sound.

 The user interface of EvoS is more effective than that of P-Farm. It seems
presenting the user with one sound at a time provides less distraction than when
they are faced with 32 sounds. The user interface of EvoS enables the user to
hear how each mutant child is different from its parent.

 The sounds created by P-Farm are generally more pleasing than those of EvoS.
This may be a result of the different synthesis models in use. P-Farm utilises a
Roland Juno-106 which is a subtractive synthesizer while EvoS uses FM syn-
thesis. Also, the Juno is a well designed instrument. The ranges on all the
parameters have been set very carefully. As a result, it is very hard to get a
bad sound out of the Juno. This is why Emma had just as much success with
randomization as she did with evolution using P-farm. It seems that in order
to get better sounds out of EvoS, it might be wise to experiment with different
synthesis models.

 EvoS was much slower than P-Farm in generating sounds. Unfortunately, this
fault in EvoS is not so easy to rectify. The way to improve speed would be
to use a hardware synthesizer (like the Juno) that provides real time response.
However a fair amount of driver software is required to interface any computer
program with the synthesizer. Due to time constraints, this will not be possible.

• EvoS most certainly needs a ‘save’ feature. This would allow any parent or
child sound to be saved so the user could return to it at any time. This feature
might be extended to implement the ‘stud-farm’ that Brett suggested. Here,
sounds that you have collected can be bred together to make new ones.

• EvoS also needs an ‘undo’ feature. This would enable the user to backtrack
through a number of steps and rectify a mistake.

Overall, EvoS7 performed surprisingly well when put to the test against P-Farm.
If the suggested features are implemented, EvoS will become an even more effective
tool for interactive evolution of sounds.
So what’s next?

Development of the artificial evolution system for this project was basically complete.
What remained was to implement the users’ suggestions and then extensively test the
system to measure its effectiveness.
Emma’s suggestion of evolving samples was explored, and this is documented in
Appendix A. Alas, this avenue did not prove successful. Instead, the user-suggested
improvements were implemented in the next version of EvoS - it was this system that
was used as the basis for the experiments in the next chapter.
Chapter 5

Results

5.1 Aim

Chapter 4 described the development of EvoS - a system that could evolve sounds.
The aim was now to measure the effectiveness of that system. This chapter describes
exactly how the effectiveness of the EvoS system was measured and what results were
achieved.
How can you measure the effectiveness of a system that relies so much on subjec-
tive opinion? This was a tricky problem which I thought about for a long time.
In the case of an interactive system, at the very least you must acquire some user
opinions. After a few different users had tried your system, you would get an idea
of just how good it is. However, this thesis is not concerned with just how ‘fun’
or interesting a program is to use, but more importantly, whether or not artificial
evolution is a useful tool for sound synthesis.
In order to determine this, a comparative test was made. Users tested the evolution
system, but also tested other (more traditional) methods of searching the sound space.
Through the results of these tests, we will see not only how intuitive the evolution
system was, but also how it rated as a search tool beside other methods.
It was decided the comparison would occur between three search methods:

Manual A manual search corresponds to specifying parameter values exactly and
then listening to the resulting sound. This is akin to manually turning the dials
on an analog synthesizer. To do this effectively usually requires knowledge
of the synthesis algorithm in order to predict the effect of changing various
parameters to achieve the sound you want.

Random In a random search, parameters are assigned a completely random value
and the resulting sound is listened to. This is the same as randomly spinning the
dials of an analog synthesizer and hoping that where they land results in a good
sound. It is a lottery - you might win the first time you try it, or you may spend
years trying with no luck.

Evolution This method of searching works via mutation and selection. It was
described in detail in chapter 4.

To compare these three search methods, a user interface was required for each of
them. The next section describes the implementation of these user interfaces. Fol-
lowing this, the method of the experiment conducted will be explained and the results
discussed and analysed. Finally, conclusions will be drawn from this analysis.

5.2 Implementation

All of the search methods for comparison were implemented using MatLab. For an
effective comparison, each search method used the same underlying synthesis
algorithm: the FM synthesis algorithm of EvoS7. Since this was already implemented
in MatLab, it seemed an obvious choice. MatLab also provides basic GUI¹ facilities
that are easy, if at times tedious, to use. The following subsections describe the
implementation of each search method or tool.

5.2.1 Manual Tool

Figure 5.1 shows a screen shot of the manual search tool user interface.
Look at the left side of figure 5.1 and recall the genome of an EvoS7 individual²:
$$\vec{a} = (n_I, x_{I.1}, \ldots, x_{I.n_I}, y_{I.1}, \ldots, y_{I.n_I}, I_1, I_2, N_1, N_2, n_V, x_{V.1}, \ldots, x_{V.n_V}, y_{V.1}, \ldots, y_{V.n_V})$$
The manual tool interface simply allows the user to type in a value for any of these
parameters. Considerable work was involved in keeping parameters consistent. For
instance, when the user changes the number of points in the amplitude envelope $n_V$,
the corresponding vectors $x_{V.i}$ and $y_{V.i}$ should shrink or grow to reflect this value.
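
The consistency rule just described (when the point count changes, the x and y
vectors must change length with it) can be sketched as follows. This is a minimal
Python illustration, not the MatLab code of the manual tool. The function name
resize_envelope and the choice to repeat the last point when the envelope grows are
assumptions made for the example; the thesis does not say how new points are filled in.

```python
def resize_envelope(xs, ys, new_n):
    """Return copies of an envelope's x and y vectors with exactly new_n points.

    Hypothetical helper: shrinking drops trailing points, growing repeats the
    final point (an assumed padding policy, chosen so the shape is unchanged).
    """
    if new_n < 1:
        raise ValueError("an envelope needs at least one point")
    if len(xs) != len(ys) or not xs:
        raise ValueError("x and y vectors must be non-empty and equally long")
    new_xs, new_ys = list(xs[:new_n]), list(ys[:new_n])
    while len(new_xs) < new_n:
        new_xs.append(new_xs[-1])
        new_ys.append(new_ys[-1])
    return new_xs, new_ys


# Example: a three-point envelope shrunk to two points, then grown to five.
shrunk = resize_envelope([0.0, 0.5, 1.0], [0.0, 1.0, 0.0], 2)
grown = resize_envelope([0.0, 0.5, 1.0], [0.0, 1.0, 0.0], 5)
```

Whatever padding policy is used, the invariant is the same: after any edit, both
vectors hold exactly as many entries as the point count says.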
The user can hear the current sound by pressing the Play button. This causes the
current settings of parameters to be read and then synthesized into a sound. Synthesis
is achieved easily by using the existing eval subroutine from EvoS7. A small graph
of the amplitude and modulation index envelopes is displayed beside the parameter
settings. This assists the user in achieving the desired envelope. Also assisting the user
is the status bar. This is the line of text at the bottom of the screen. It is frequently
updated to notify the user of the program’s status. The status bar has been implemented
in all of the search tools.

Figure 5.1: Screen shot of the manual tool user interface.

¹ Graphical User Interface.
² Refer back to section 4.7.2 for a more detailed explanation of the genome.

Figure 5.1 illustrates another feature common to all three search tools. Down the
right-hand side is the Patch Store. This is an implementation of the “stud farm”
idea³. By typing in a name⁴ and pressing Save As:, the current parameter settings
will be saved in the Patch Store. These can be recalled later by selecting the name
of the patch you want⁵ and pressing Load.
When you have a number of parameter settings recorded in your Patch Store you
will want to save them all to disk. This can be accomplished by using the Save File...
button. Conversely, to load a previously created Patch Store file you can use the
Open File... button.
Finally, when the user is finished using the manual search tool they can hit Quit
and exit the program.

³ The stud farm idea was suggested by Brett - see section 4.7.4.
⁴ For example, “Chowning 1” in figure 5.1.
⁵ For example, “Trumpet” in figure 5.1.

Searching the sound space using the manual tool requires the user to have knowledge
of how each parameter affects the sound - or at least have a feel for it. The
user listens to the current sound and tries to guess which parameter needs adjusting
to achieve the desired sound. The adjustment is made and the user listens again. If
the resulting sound is further away from the desired sound, the change made to the
parameter is reversed. Other changes are auditioned until a step closer is made. If the
resulting sound is closer to the desired sound, other parameters may be ‘tweaked’ to
bring it even closer. When the user is satisfied, the sound is saved in the Patch Store
for later reference.

5.2.2 Random Tool

The random search tool is much simpler than the manual tool. A screen shot of the
user interface is shown in figure 5.2.

Figure 5.2: Screen shot of the random tool user interface

The two buttons in the center are all that is required. Pressing Randomize gener-
ates a new patch using totally arbitrary parameter settings. The user no longer has to
tediously enter all the parameters as they did in the manual tool. The downside is that
there is absolutely no control over what the parameters will be. This randomization
is implemented by using the existing init subroutine from EvoS7. When called,
init simply returns an EvoS7 genome with completely random contents. The Replay
button simply plays the current randomized sound and is implemented using the
eval subroutine from EvoS7.
Looking at the right hand side of figure 5.2 you can once again see the Patch
Store. This performs exactly the same function as in the manual tool. If the user hears
any sounds they like, they can save them in the Patch Store for later reference. Note
that the file formats used in all the search tools are compatible. Thus if a user found a
nice sound in the random tool but wanted to ‘tweak’ it a little, they could save it in a
file, load it up in the manual tool and there adjust the parameters manually to get the
desired effect.
Searching using the random tool is pretty straightforward. The user simply presses
Randomize until an interesting sound is heard. The interesting sound can be saved in
the Patch Store. To continue the search, the user presses Randomize again. When the
user is finished searching the sound space, they can save all their interesting patches
in a file (using Save File...) and exit the program by pressing Quit.

5.2.3 Evolution Tool

Last, but not least, is the evolution search tool - shown in figure 5.3.
The evolution tool is a GUI that replaces EvoS7’s keyboard interface. Looking
at the left side of figure 5.3 you can see the Parent box. This box is divided into two
halves. One half is the Play button which plays the current parent sound⁶. The other
half is a Patch Store allowing the current parent sound to be saved or previous sounds
to be loaded.
Below the Parent is the Child box. This is also divided into two halves. The left
half allows you to select a particular child and play it; the right half is a Patch Store
allowing you to save the current child sound. As in EvoS7, children are created using
Mutate and Randomize. Mutate mutates the current parent sound using EvoS7’s
mutate subroutine, and Randomize creates a random child using init.
The Replace button replaces the current parent with the current child. So through
the three buttons Replace, Mutate and Randomize you have all the functionality
of the EvoS7 system. Users can search through the sound space by mutating and
replacing. When they are satisfied with any sound they can save it in the Patch Store.
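
The interplay of the Replace, Mutate and Randomize buttons described above can be
sketched as a small state machine. This is a hedged Python illustration, not the
MatLab GUI code. The init and mutate functions below are simplified stand-ins for
the EvoS7 subroutines of the same names: a flat list of numbers in [0, 1) and a
single-gene nudge are assumptions, and the real genome and mutation operator are
the ones described in chapter 4.

```python
import random


def init(rng, length=8):
    """Stand-in for EvoS7's init: a genome of uniformly random genes (assumed)."""
    return [rng.random() for _ in range(length)]


def mutate(genome, rng, step=0.1):
    """Stand-in for EvoS7's mutate: nudge one randomly chosen gene (assumed)."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = min(1.0, max(0.0, child[i] + rng.uniform(-step, step)))
    return child


class EvolutionSession:
    """Holds the current parent and child, mirroring the Parent and Child boxes."""

    def __init__(self, rng):
        self.rng = rng
        self.parent = init(rng)
        self.child = None

    def press_mutate(self):
        # The child is a mutated copy; the parent itself is left untouched.
        self.child = mutate(self.parent, self.rng)

    def press_randomize(self):
        # The child is a fresh random genome, unrelated to the parent.
        self.child = init(self.rng)

    def press_replace(self):
        # The current child becomes the new parent.
        if self.child is not None:
            self.parent = self.child


session = EvolutionSession(random.Random(0))
session.press_mutate()    # audition a mutant child
session.press_replace()   # keep it as the new parent
```

The point of the sketch is that selection is done entirely by the listener: the
program only proposes children, and Replace is pressed when the user prefers one.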
But what about the Partner box and that Mate button?

Figure 5.3: Screen shot of the evolution tool

⁶ Playback is once again implemented with EvoS7’s eval subroutine.

Mating Sounds

Mating or recombination has previously been ignored in this study - the main focus
has been on mutation. However, it would seem foolish to conduct an investigation of
artificial evolution for sound synthesis without even experimenting with a recombina-
tion operator. To this end, it was decided to include one in this final incarnation of the
system. Unfortunately, as time was running out, this mating operator had to be kept
extremely simple. The following paragraphs will describe its implementation.

One point crossover

The simplest form of recombination is one point crossover: one point along the
genome is chosen at random, and a child is formed by taking the genes of the first
parent up to this point and the genes of the second parent after this point.
For example, consider two parents $\vec{a}$ and $\vec{b}$ which will form a child $\vec{c}$. The
genomes of the parents look like this:
$$\vec{a} = (a_1, a_2, \ldots, a_{n-1}, a_n)$$
$$\vec{b} = (b_1, b_2, \ldots, b_{n-1}, b_n)$$
A random crossover point $i$ is chosen where $i \in [1, n-1]$. Thus the child genome
will look like this:
$$\vec{c} = (a_1, a_2, \ldots, a_{i-1}, a_i, b_{i+1}, b_{i+2}, \ldots, b_{n-1}, b_n)$$
But notice this assumes parent $\vec{a}$ goes first. If we decided that parent $\vec{b}$ should go
first we would get a different child:
$$\vec{c} = (b_1, b_2, \ldots, b_{i-1}, b_i, a_{i+1}, a_{i+2}, \ldots, a_{n-1}, a_n)$$
So for each crossover point $i$ there are two possible children. Thus for two parents
$\vec{a}$ and $\vec{b}$ with $n$ genes each, there are $2(n-1)$ different children that can be formed
with one point crossover.
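
The counting argument above can be checked with a short sketch. The Python below
is an illustration only (the thesis code is in MatLab), and the function name
one_point_children is made up for the example. It enumerates every child of two
fixed-length parents, taking each crossover point with either parent going first.

```python
def one_point_children(a, b):
    """List every one point crossover child of two equal-length genomes."""
    if len(a) != len(b):
        raise ValueError("this simple scheme assumes equal-length genomes")
    n = len(a)
    children = []
    for i in range(1, n):               # crossover point i in [1, n-1]
        children.append(a[:i] + b[i:])  # parent a goes first
        children.append(b[:i] + a[i:])  # parent b goes first
    return children


parent_a = ["a1", "a2", "a3", "a4"]
parent_b = ["b1", "b2", "b3", "b4"]
children = one_point_children(parent_a, parent_b)
```

With n = 4 genes per parent this yields 2(n-1) = 6 children, the first of which
is (a1, b2, b3, b4). The children are all different only when the parents differ
in every gene; parents that share genes can produce duplicates.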

Implementing one point crossover

When this scheme of one point crossover is applied to the genomes of EvoS7, some
major problems occur. One of these is caused by the fact that EvoS7 genomes do not
have a fixed length. For example, imagine parent $\vec{a}$ has a modulation index envelope
with two points ($n_I = 2$) and parent $\vec{b}$ has one with ten points ($n_I = 10$). If the
crossover point is chosen to be one ($i = 1$), then the child will inherit $n_I = 2$ from
parent $\vec{a}$, and all its following genes from parent $\vec{b}$. This will be an invalid child -
$n_I$ will specify two envelope points, but there will be ten envelope points following it,
care of parent $\vec{b}$!
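
The invalid child in this example can be reproduced in a few lines. The sketch
below is a simplified Python illustration: it models only the index envelope part
of the genome (the point count followed by the x and y values), and the helper
names and envelope values are assumptions made for the example.

```python
def naive_crossover(a, b, i):
    """Plain one point crossover that ignores the genome's internal structure."""
    return a[:i] + b[i:]


def envelope_is_consistent(genome):
    """Check that the leading point count matches the x and y values after it."""
    n = genome[0]
    return len(genome) == 1 + 2 * n


# Parent a: n_I = 2, so two x values and two y values follow the count.
parent_a = [2] + [0.0, 1.0] + [0.3, 0.7]
# Parent b: n_I = 10, so ten x values and ten y values follow the count.
parent_b = [10] + [k / 9 for k in range(10)] + [0.5] * 10

# Crossover point i = 1: the count comes from a, everything else from b.
child = naive_crossover(parent_a, parent_b, 1)
```

Both parents pass the consistency check, but the child claims two envelope points
while carrying twenty envelope values, so it fails.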
OK, so this problem could be fixed up with some fancy coding and a bit of patience,
but there were other problems as well⁷. To avoid having to deal with all these,
a much simpler scheme was devised: the genome could only be split at points where
the results were well defined. This resulted in a ‘meta genome’ that looked like this:
$$\vec{a} = (\text{Index Envelope}, I_1, I_2, \text{Freq Ratios}, \text{Amplitude Envelope})$$
This scheme worked quite well and solved all of the previously mentioned problems.
The downside of the scheme was that in terms of mating, the genome’s length was
drastically reduced - there are only 4 possible crossover points. This means that for
any two parents, there are only $2 \times 4 = 8$ possible children!
Although it would have been desirable to have more combinations available, the
crossover system seemed to work adequately. A MatLab procedure mate was written
that implemented the scheme. The procedure takes two parent genomes, randomly
chooses a crossover point and randomly chooses which parent will ‘go first’. The
resulting child genome is the return value of the procedure.
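
A procedure of this kind might look like the following sketch. It is written in
Python for illustration and is not the MatLab mate procedure itself. Representing
the meta genome as a five-element tuple, the block names, and the helper
all_children are assumptions made for the example.

```python
import random

# The five blocks of the 'meta genome'; crossover happens only between blocks.
BLOCKS = ("index_envelope", "I1", "I2", "freq_ratios", "amplitude_envelope")


def mate(parent_a, parent_b, rng=random):
    """Return one child: random crossover point, random parent 'going first'."""
    if len(parent_a) != len(BLOCKS) or len(parent_b) != len(BLOCKS):
        raise ValueError("each parent must have exactly five blocks")
    point = rng.randrange(1, len(BLOCKS))  # one of the 4 block boundaries
    if rng.random() < 0.5:
        first, second = parent_a, parent_b
    else:
        first, second = parent_b, parent_a
    return tuple(first[:point]) + tuple(second[point:])


def all_children(parent_a, parent_b):
    """Enumerate every child mate could return (4 points x 2 orders)."""
    kids = []
    for point in range(1, len(BLOCKS)):
        kids.append(tuple(parent_a[:point]) + tuple(parent_b[point:]))
        kids.append(tuple(parent_b[:point]) + tuple(parent_a[point:]))
    return kids


# Envelopes are whole blocks, so their point counts can differ between parents.
a = (((0.0, 1.0), (1.0, 0.2)), 1.0, 4.0, (1.0, 2.0),
     ((0.0, 0.0), (0.1, 1.0), (1.0, 0.0)))
b = (((0.0, 0.5), (0.5, 3.0), (1.0, 0.0)), 2.0, 0.5, (1.0, 1.4),
     ((0.0, 1.0), (1.0, 0.0)))
child = mate(a, b, random.Random(3))
```

Because each envelope travels as a single block, a child can never end up with a
point count that disagrees with its envelope values, whichever of the eight
children is produced.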

⁷ Think about what happens if the crossover point is right in the middle of the x’s and y’s of an
envelope.

Mating in the evolution tool

Now we can finally explain the mystery parts of figure 5.3. The Partner box allows
you to load any sound previously saved in the Patch Store. When the Mate button
is pressed, a child is formed by mating⁸ the current Parent with the current Partner.
This child can then be subjected to more mating or mutation using the previously
described features of the evolution tool.
When the user is finished searching the sound space via mutation, mating and
randomization, they can save all their acquired sounds with the Save File... button.
Once again, this is the same file format used in the random and manual tools - the user
can change their sounds with the other programs by using the Open File... button.
Finally, the user can press Quit which will exit the program.

5.3 Method

After the three search tools were implemented, it was time to actually conduct the
experiment and get some results. The experiment involved getting users to try each of
the search tools.
The users first tried the manual tool, then the random tool and finally the evolution
tool. Before they used each tool, detailed instruction was given and the user was taken
through an example of the tool’s use. Three statistics were recorded for each tool:

Time The time that the user used the tool for. This was usually around 10-15 minutes.

Sounds Auditioned The number of different sounds auditioned. Each time the user
made a change and listened to a new sound, it was noted.

Sounds Liked The number of sounds that the user ‘liked’ out of all the sounds the
user auditioned. This statistic was collected at the end, after all the tools had
been used. Along the way, the user was encouraged to save any sounds that were
remotely interesting. After all the tools had been used, all the saved sounds
for each tool were reviewed. The user specified in retrospect which sounds
they really did think were interesting, useful or amusing - the number of these
was noted. It was necessary to collect this statistic at the end because quite
frequently in the early stages of the trial, the user would save a lot of sounds
they thought were great. However, after using the other tools and seeing what
other sounds they could get out of the algorithm, they would change their minds.

⁸ Using the aforementioned mate subroutine.

The motivation for these statistics was to see how many good sounds you could
find with each tool in a certain time. It might be found that with one tool, users find
a lot of good sounds quickly, whereas with another, it takes a long time to find a few.
For this comparison to be valid, it was important that each tool be used for an equal
amount of time. However an ‘equal amount of time’ was not so easy to judge. For
instance, the manual tool took a long time to get used to. Also when using the manual
tool, people usually think a while, make a few deliberate changes and then listen to
the result. These factors result in the user spending a lot of time with the manual tool
and only listening to a few sounds. The random tool however is very simple and users
get the idea straight away. Many sounds are auditioned in a very short time. To get a
good comparison between these two tools one would like the manual tool to be used
for longer, just to ensure that the user had ‘the hang of it’, and was not spending all
the time just trying to work out how to use the interface.
As a result, each tool was used until the user got ‘the hang of it’. Although this
doesn’t seem like a very good metric of time, it proved to be the only viable option.
As well as collecting the three statistics mentioned, after using the search tools,
users were asked a question along the lines of:

“If you needed some sounds (say, for a composition you were writing)
and you were only able to use this algorithm and the three search tools,
what strategy would you use to find the sounds you desired?”

It was hoped by asking this question, further insight could be gained into which tool
the users thought was most effective for searching the sound space. Their answer to
the question and any other comments they had about the three search methods was
duly noted.
The whole process of instructing the users, collecting statistics and asking the
question took about 1 hour. This is quite a large amount of time to ask volunteers to
donate⁹ - because of this, just seven users were surveyed in total.
Of these seven users, only two were active composers of electronic music - people
who have a real need for these kinds of tools. The rest of the
users, although having an interest in electronic music, would not have a need for the
search tools. Although this may seem like a deficiency, it would be good to see if any
of the search methods allows ‘normal’ people to explore the sound space effectively
- people who have no knowledge of how the synthesis algorithm works, but simply
know what sounds nice and what doesn’t.

⁹ Especially considering they were not getting paid!

5.4 Results

5.4.1 Raw Data

The raw data for the statistics Time, Sounds Auditioned and Sounds Liked is pre-
sented first. The results for the manual tool are shown in table 5.1; the random tool in
table 5.2; and the evolution tool in table 5.3. Each table shows the data collected from
each user and also the mean and standard deviation of this data.

User        Time (minutes)    Sounds Auditioned    Sounds Liked
Emma        18                20                   3
Dougal      21                40                   2
Rachael     9                 25                   2
Tom         18                21                   2
Brett       14                24                   2
Chris       14                25                   3
Nick        15                20                   2
Mean ± σ    15.57 ± 3.867     25.00 ± 6.976        2.286 ± 0.488

Table 5.1: Raw statistics collected for the manual tool

User        Time (minutes)    Sounds Auditioned    Sounds Liked
Emma        10                40                   5
Dougal      4                 40                   6
Rachael     5                 40                   4
Tom         6                 24                   5
Brett       5                 41                   3
Chris       8                 36                   4
Nick        5                 43                   3
Mean ± σ    6.143 ± 2.116     37.71 ± 6.396        4.286 ± 1.113

Table 5.2: Raw statistics collected for the random tool

5.4.2 Analysis of Data

Table 5.4 presents a comparison of the previous statistics – it compares the averages
of each statistic for each search tool. It will be useful to analyse each line of this table.

User        Time (minutes)    Sounds Auditioned    Sounds Liked
Emma        24                65                   6
Dougal      13                80                   8
Rachael     9                 40                   4
Tom         16                54                   4
Brett       15                48                   5
Chris       13                53                   7
Nick        15                80                   5
Mean ± σ    15.00 ± 4.583     60.00 ± 15.57        5.571 ± 1.512

Table 5.3: Raw statistics collected for the evolution tool

                           Search Tool
Average Statistic          Manual           Random           Evolution
1. Time                    15.57 ± 3.867    6.143 ± 2.116    15.00 ± 4.583
2. Sounds Auditioned       25.00 ± 6.976    37.71 ± 6.396    60.00 ± 15.57
3. Sounds Liked            2.286 ± 0.488    4.286 ± 1.113    5.571 ± 1.512
4. Liked ÷ Time            0.155 ± 0.049    0.762 ± 0.351    0.395 ± 0.142
5. Liked ÷ Auditioned      0.097 ± 0.032    0.120 ± 0.048    0.095 ± 0.022
6. Auditioned ÷ Time       1.685 ± 0.574    6.757 ± 2.511    4.185 ± 1.230

Table 5.4: Mean statistics for each search tool.


In the first line we see that the average Time spent using the Manual and Evolution
tools was about the same. This implies that the tools were of similar complexity and
users took about 15 minutes to get comfortable with them. The average Time spent
using the Random tool however was much less - only about 6 minutes. This reflects
the fact that the Random tool was much simpler than the others and users quickly
became proficient in it.
The second line of table 5.4 shows the average number of Sounds Auditioned
with each tool. Sounds Auditioned for the Manual tool is quite low because the
manual tool required users to think about what they were doing before modifying
parameters. The value for the Random Tool is a little higher - it was easy to audition
sounds with this tool, but users didn’t spend much Time with it. The Evolution tool
had a much higher average than both of the others as auditioning sounds was easy
(you didn’t have to think as much compared to the Manual tool), but users spent a lot
of Time getting used to the tool (due to its complexity).
Looking at the Sounds Liked statistic in table 5.4 we see a very positive result for
the Evolution tool. On average, with the Manual tool, users found about 2.3 sounds they
liked; with the Random tool, they found about 4.3; however with the Evolution tool they
found about 5.6 sounds that they liked. On the surface, we could conclude from this result that the
Evolution tool performed the best and thus Artificial Evolution is the most effective
way of searching the sound space. But doing this would neglect the facts that each
tool was not used for an equal amount of time; and an unequal number of sounds were
auditioned using each tool. If we analyse some ratios of these statistics we can gain
further insight into the actual meaning of the data.
The fourth line of table 5.4 shows the average of Sounds Liked divided by Time
for each search tool - this gives us a measure of “Sounds Liked per Minute”. You
can see that the Random tool scores highest here, with Evolution coming second and
Manual last. This implies that using the Random tool you can find more sounds in a
shorter amount of time. Does this mean that the Random tool is the most efficient for
searching the sound space? This is in contradiction to the conclusion reached from
looking at the Sounds Liked statistic alone.
Furthermore, look at the fifth line of table 5.4 which shows the ratio of Sounds
Liked to Sounds Auditioned. This could be another measure of efficiency - how
many sounds do you have to audition before you get one you like? Again we see
that the Random tool is the best, with the Manual and Evolution tools being about
equal. This seems to imply that if a user auditioned an equal number of sounds with
the Random and Evolution tools, they would like more of the sounds discovered with the
Random tool. So again, the efficiency of the Evolution tool is questioned.
Finally, line six of table 5.4 shows the ratio of Sounds Auditioned to Time. This,
along with the Time statistic, provides another measure of the tool’s simplicity - how
many sounds can you audition per minute? The Random tool ranks best again - it is
a very simple tool to use and encourages rapid auditioning of sounds. The Evolution
tool comes second - it also encourages rapid auditioning, but it is not as simple as the
Random tool. Lastly, the Manual tool - it is quite complicated and requires users to
think before auditioning sounds.
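
The figures in table 5.4 can be reproduced from the raw data in tables 5.1 to 5.3.
The sketch below is a Python illustration, not the tooling used in the study. It
relies on two observations that checking the numbers bears out but that the text
does not state outright: the quoted deviations are sample standard deviations
(dividing by n - 1), and each ratio statistic is computed per user and then
averaged, rather than being a ratio of the column means.

```python
from statistics import mean, stdev

# (time in minutes, sounds auditioned, sounds liked) per user, in table order:
# Emma, Dougal, Rachael, Tom, Brett, Chris, Nick.
DATA = {
    "Manual": [(18, 20, 3), (21, 40, 2), (9, 25, 2), (18, 21, 2),
               (14, 24, 2), (14, 25, 3), (15, 20, 2)],
    "Random": [(10, 40, 5), (4, 40, 6), (5, 40, 4), (6, 24, 5),
               (5, 41, 3), (8, 36, 4), (5, 43, 3)],
    "Evolution": [(24, 65, 6), (13, 80, 8), (9, 40, 4), (16, 54, 4),
                  (15, 48, 5), (13, 53, 7), (15, 80, 5)],
}


def summarise(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return mean(values), stdev(values)


def table_5_4(data):
    """Compute the six rows of table 5.4 for every search tool."""
    rows = {}
    for tool, records in data.items():
        rows[tool] = {
            "Time": summarise([t for t, a, l in records]),
            "Sounds Auditioned": summarise([a for t, a, l in records]),
            "Sounds Liked": summarise([l for t, a, l in records]),
            "Liked / Time": summarise([l / t for t, a, l in records]),
            "Liked / Auditioned": summarise([l / a for t, a, l in records]),
            "Auditioned / Time": summarise([a / t for t, a, l in records]),
        }
    return rows


results = table_5_4(DATA)
for tool, rows in results.items():
    for name, (m, s) in rows.items():
        print(f"{tool:10s} {name:20s} {m:7.3f} +/- {s:.3f}")
```

With only seven users per tool the standard deviations are large relative to the
differences between tools, which is worth keeping in mind when reading the
rankings above.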

5.4.3 An Alternative Analysis

Before making any conclusions on the above analysis, let us consider another possi-
bility. What if the number of Sounds Liked is not dependent on the search tool at
all, but some other variable like Time? Instead of the trend being “people found fewer
sounds with the Manual tool, and more sounds with the Evolution tool”, it could be
“the more time people used any system, the more sounds they found.”
We can see if this is the case by looking at a graph of Sounds Liked with any
tool vs. the Time spent using that tool. This graph is shown in figure 5.4. Note that
each point is labelled with a letter that identifies the tool used: E for Evolution; R for
Random; and M for Manual.

Figure 5.4: Sounds Liked plotted against Time for all search tools.

Apart from points of the same search tool being clustered, there is no trend in
figure 5.4. This suggests that Sounds Liked does not depend on Time spent.
For example, if someone used a search tool longer than another person, you could not
predict that they would find more sounds.
On the other hand, look at figure 5.5, which shows Sounds Liked plotted against
Sounds Auditioned. Again, each point is labelled with a letter to identify the search
tool used.

Figure 5.5: Sounds Liked vs. Sounds Auditioned for all search tools.

A clear trend is shown in figure 5.5: no matter which tool is used, more Sounds
Auditioned implies more Sounds Liked. Before, we were trying to argue that one
particular search tool would produce more Sounds Liked than another. This graph
goes against this argument, suggesting that it doesn’t matter which search tool you use,
just how many sounds you audition.
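
The two visual impressions above can be given a rough number. The thesis reports
no correlation coefficients; using Pearson's r on the pooled 21 data points is an
assumption of this sketch, as is pooling all three tools together. The Python
below takes its data from tables 5.1 to 5.3.

```python
from math import sqrt

# (time in minutes, sounds auditioned, sounds liked) for all 21 trials:
# manual tool, then random tool, then evolution tool.
TRIALS = [
    (18, 20, 3), (21, 40, 2), (9, 25, 2), (18, 21, 2),
    (14, 24, 2), (14, 25, 3), (15, 20, 2),
    (10, 40, 5), (4, 40, 6), (5, 40, 4), (6, 24, 5),
    (5, 41, 3), (8, 36, 4), (5, 43, 3),
    (24, 65, 6), (13, 80, 8), (9, 40, 4), (16, 54, 4),
    (15, 48, 5), (13, 53, 7), (15, 80, 5),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


times = [t for t, a, l in TRIALS]
auditioned = [a for t, a, l in TRIALS]
liked = [l for t, a, l in TRIALS]

r_time = pearson(times, liked)            # figure 5.4: Liked against Time
r_auditioned = pearson(auditioned, liked)  # figure 5.5: Liked against Auditioned
```

On this data the coefficient for Time is close to zero while the one for Sounds
Auditioned is clearly positive, in line with the two graphs. A correlation over so
few points, with the tools clustered as they are, cannot by itself separate the
effect of the tool from the effect of the number of sounds auditioned.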
For comparison, figure 5.6 shows Sounds Liked plotted against the search tool
used.
Here there is also a clear trend: the Evolution tool produces the most Sounds
Liked, the Random tool produces fewer, and the Manual tool produces fewer again.
So which trend is more dominant? Is Sounds Liked a function of Sounds Auditioned
(figure 5.5); or is Sounds Liked dependent on which search tool is used (figure 5.6)?
An attempt was made to answer this question by fitting polynomials to the two
graphs and checking confidence intervals; however, there were too few data points to
get meaningful results. Consequently, these numerical results should be taken with a grain
of salt. Although conducting a test like this seemed a good idea at the start, drawing
concrete conclusions from the data proved to be quite difficult. It might be more
instructive to go straight to the horse’s mouth and find out what the users actually
thought about each of the search tools.

Figure 5.6: Sounds Liked plotted against search tool

5.4.4 User Opinions

Recall that as well as collecting the statistics, each user was asked to answer this
question:

“If you needed some sounds (say, for a composition you were writing)
and you were only able to use this algorithm and the three search tools,
what strategy would you use to find the sounds you desired?”

This section presents a short summary of each user’s answer to this question and
any other comments they may have made.
Emma

To find sounds, Emma would use all three search tools. She found the manual tool
very confusing at first, but it then became fun as she understood it. The random tool
was the easiest to use and you could get quite good sounds from it. The evolution
tool was also confusing at first - it was much more complicated than the others. She
liked the mating feature and got quite good results from it, although she thought the
evolution tool could be more straightforward.
Overall, she thought the tool used would depend on the situation. Sometimes you
might be in a rush, and so use the random tool to get sounds quickly. At other times
you may want to slowly refine sounds by mutation. She thought it would be best to
have a system that combined all three tools. This way you would enjoy the advantages
of each.

Dougal

For finding sounds, Dougal thought the evolution tool was the best. He thought it
was much easier to use than the manual tool. He noted (as did many other users) that
the evolution tool contains all the features of the random tool and more - it would be
stupid to use the random tool when the evolution tool is available. Dougal had a lot of
fun using the evolution tool, and wanted to continue even after the survey had ended.

Rachael

Rachael would also use the evolution tool to find sounds. She thought it was quite
easy to use, except sometimes you forgot to save sounds¹⁰. Rachael thought that if
you understood the algorithm in depth, you would be better off using the manual tool.
This would give you exact control over the sound you wanted.

Tom

Tom thought he would use the manual tool to find sounds. He liked the manual tool
because you could see the envelopes that were being used. Also, when using the
manual tool he knew the sounds generated were his and not some random creation that
the computer cooked up. Tom thought that the evolution tool was good for refining
sounds. He thought a good strategy to find sounds would be to first create something
with the manual tool and then work on it and refine it in the evolution tool¹¹.

¹⁰ This was due to a poor interface design... my fault!

Brett

To find sounds, Brett would first use the random tool and get a number of raw sounds
that had desirable characteristics. He would then take these to the evolution tool and
start working on them. He really liked the Mate feature and thought it was successful
at combining the traits of two different sounds. He thought that Mutate was very good
for fine tuning a sound. Brett liked the random tool because its simple interface did
not distract you from listening to the sounds. This is why he would use it first, even
though all its features are contained in the evolution tool.
Brett thought the only use for the manual tool would be to ‘tweak’ sounds that you
obtained via other means.

Chris

Before Chris’ comments are presented, a special note should be made: unlike most of
the other users, Chris is an active composer of electronic music. His compositions involve the
use of many different synthesizers, both digital and analog. Because of this, he has
an in-depth understanding of many synthesis algorithms. Chris has been composing
for over five years - he is the kind of person that would really utilize the search tools
being assessed here. As a result, special attention should be paid to his comments.
Chris strongly stated that the evolution tool was the most efficient way to search
for sounds. Using the evolution tool, his search strategy would involve first gener-
ating some random sounds. These would be refined a bit using mutation and any
vaguely interesting sounds would be saved. Next, Chris would mate the saved sounds
in different combinations and hopefully obtain his desired sounds.
Chris would have liked to see the envelopes of sounds in the evolution tool. He
believed this would provide a visual feedback, indicating to the user what part of the
sound had changed. He also would have liked a control that varied the degree of
mutation.
Chris thought it was good having a variety of ways to change the sound - in the
evolution tool there were three: randomization, mutation and mating. Ideally though,
Chris thought it would be good to enable manual modification of parameters if you
needed it. This would create a ‘super tool’ with all the features of the three existing
tools. The ‘super tool’ would provide the composer with the most flexibility.

¹¹ Tom thought that all the programs could do with catchier names. He expected something along
the lines of “SoundGen 2000” rather than “evolution tool”.

Finally, Chris would have liked a ‘loop’ feature. This would constantly replay the
current sound and enable you to sequence drums or other sounds on top of it. The
loop feature would help you audition sounds; you could see if a sound fitted in with
the rest of your composition. It would also make using the system a lot more fun; you
would really be making music as the sounds mutated and evolved.

Nick

Nick is also an active composer of electronic music. Although not as experienced as
Chris, he would still have a use for the search tools presented here. Special attention
should also be paid to Nick’s comments.
Nick had two strategies for finding sounds. The first was like Brett’s: the random
tool would be used to get a few interesting sounds and then these would be refined
using the evolution tool. The second strategy would involve using the manual tool to
set the desired envelopes - this is easier here as the envelopes are displayed. Then the
evolution tool would again be used to refine the sounds and set the other parameters.
Nick saw the evolution tool mainly useful in refining sounds obtained through other
means.
Nick got some good results using the mutation feature of the evolution tool. Using
it he was able to define a small subspace of sounds that he really liked. He thought
it would be really cool to be able to interpolate sounds within this subspace. Ideally,
this would be done in real time, while performing a composition.
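Nick's suggestion of interpolating between liked sounds can be illustrated with a small sketch. It assumes each sound is stored as a flat list of numeric synthesis parameters; the function name and the example values are hypothetical and are not taken from the evolution tool itself.

```python
def interpolate(sound_a, sound_b, t):
    """Linearly blend two parameter vectors.

    t = 0.0 returns sound_a, t = 1.0 returns sound_b, and values in
    between give intermediate sounds along the straight line joining them.
    """
    if len(sound_a) != len(sound_b):
        raise ValueError("parameter vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(sound_a, sound_b)]


# Hypothetical parameter vectors for two sounds the user liked.
liked_a = [440.0, 2.0, 0.5]
liked_b = [220.0, 4.0, 1.0]
halfway = interpolate(liked_a, liked_b, 0.5)
```

A straight-line blend is only the simplest choice. Parameters that behave discretely, such as frequency ratios, may not interpolate smoothly and would probably need separate treatment.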
Nick was a bit disappointed in the underlying synthesis algorithm and quickly got
a feel for all the sounds it could produce. He thought it would be good to use an
algorithm that could produce a wider range of timbres.

5.5 Conclusions

Taking all of the previous discussion and analysis of results into account, we can formu-
late a number of conclusions:

 Drawing conclusions from the statistics collected proved difficult. This was
because the tools were not used for equal amounts of time, and unequal numbers
of sounds were auditioned with each. This couldn’t be helped because some of the tools
took longer to get used to, and some of them encouraged more auditioning of
sounds. Also, with such a small sample size it was difficult to tell which variables
were actually affecting the number of sounds liked.

 We might tentatively say that, based on the statistics, the Random tool proved
to be most efficient for searching the sound space. The Random tool produced
more interesting sounds in less time, and with fewer auditioned sounds, than either
of the others. The Random tool was also the easiest to use, with users getting used
to it much more quickly than the other tools. The Evolution tool found more sounds
than the Manual tool in less time, but on a sounds liked vs. sounds auditioned
measure, they were about equal. The Evolution tool was easier to use than the
Manual tool.

 Remember that the Random tool is a subset of the Evolution tool - the Evolution
tool provides all the functionality of the Random tool and more. This implies
the good result for the Random tool mentioned above is also a good result for
the Evolution tool.

 The users’ comments generally reinforce the above two points. The Random tool
is good for finding sounds quickly, but it doesn’t have any control - you just
wildly jump around the sound space. What is needed is a finer level of control
that allows sounds to be ‘tweaked’ to get them sounding just right. Both the
Manual and Evolution tools provide fine grained control or ‘tweakability’, but
most users preferred the Evolution tool as it was easier to use.

 Even with a simple implementation, the Mate feature of the Evolution tool
proved quite successful. Three of the users obtained good results with this fea-
ture.

 A definite bonus is Chris’ strong support for Evolution as his tool of choice.
Being an active composer, he may be representative of the people who would
gain most use out of these tools.

 Some users expressed the utopian philosophy that “all the search tools were
good.” In this conclusion, you can’t say that one tool is better than the others,
because they each have their unique advantages and disadvantages. Each tool
is the perfect choice in certain situations.

In summary, the success of the Random tool was a bit unexpected - this simple
method seemed to perform better than the more complex Manual or Evolution tools.
However, the Random tool does not have the fine grained control required by users -
it is here where the Evolution tool excels. The Evolution tool enables users to tweak
sounds but is much easier to use than the Manual tool. What’s more, you don’t need
to understand the underlying synthesis algorithm in order to use the Evolution tool.
If we note that the Random tool is a subset of it, the Evolution tool can definitely be
regarded as the most efficient method for searching the sound space - combining rapid
sound location and fine grained control. In short, artificial evolution is a useful tool
for sound synthesis.
Chapter 6

Conclusion

6.1 Conclusion

This investigation has explored the application of artificial evolution to sound syn-
thesis. It has been shown that interactive evolution is a useful tool for exploring the
sound space of synthesis algorithms.
The above conclusion was reached from building and testing a system that evolved
parameters for a simple FM synthesis algorithm.
A preliminary version of the system was assessed by 3 users. It was also compared
with the only similar work known to the author - van Goch’s P-Farm. Against P-Farm,
EvoS performed well. Its main differences were a focus on mutation rather than
mating, and a much smaller population size. The users preferred these features and
rated EvoS highly. The only drawback of EvoS was the quality of sounds it produced -
this was a limitation of the FM synthesis algorithm.
In addition, a more objective study was conducted that compared artificial evo-
lution to more traditional methods of parameter adjustment. Three tools were con-
structed in order to compare manual, random and evolutionary methods of searching
the sound space of the FM synthesis algorithm. Seven users took part in the survey,
two of whom were composers of electronic music.
The results of the study suggested that the random method was the most efficient
way to search the sound space. However, it did not provide the fine grained control
required by users to ‘tweak’ sounds. The evolutionary method proved easier and more
efficient in providing this control than the manual method. This is emphasised again
when it is realised that the random method is just a subset of the evolutionary method.
Both composers and non-composers found the evolutionary method useful. This
shows that artificial evolution provides a tool, not only for people who understand the
underlying synthesis algorithm, but also non-experts who would still like to experi-
ment with sound synthesis.

6.2 Other lessons learnt

Along the way, some other valuable lessons were learnt - usually the hard way:

 Evolving sounds wasn’t as easy as first expected! At first I thought I could just
throw a Genetic Algorithm at a synthesis algorithm and fantastic things would
happen. Sadly, this was not the case - the task involved a lot more careful
design. Suitable ranges for each parameter had to be determined, usually by
trial and error. One example of these difficulties was the problem of evolving
the carrier to modulator frequency ratios¹. Trying to make the parameters N1
and N2 seem ‘continuous’ was hard and not simply a matter of blindly applying
an Evolutionary Algorithm.

 The interactive evolution technique developed did not prove successful when
scaled up by a large amount. In order to surmount the shortcomings of the
FM synthesis algorithm used, artificial evolution was applied to an alternative
algorithm². With around 20 parameters, it worked fine - with around 2000
parameters, it failed dismally!

6.3 Further Work

Although much has been learnt during the course of this project, as always, there is
still much that could be gained from research in the future. Some unexplored avenues
include:

 More experimentation could be done with self adaption in the EvoS system.
Looking back at EvoS3, where the trial of self adaption was conducted³, the
variances and ranges chosen for f2m, f2c and I2 seem ridiculous! It would be
good to go back and see if some better results could be obtained by using some
sensible values⁴.
¹ See section 4.6.
² This experiment is described in appendix A.
³ See section 4.3.
⁴ You could apply the 1/20 rule that was developed later (see section 4.5.4).

 As was suggested by a number of users, it would be good to implement a
‘variable-variance mutation’ feature. This would allow the user to vary the
amount by which a mutant child differs from its parent. For example, as the
user approached their desired sound they would decrease the variance so mutants
would lie closer to the parent sound - thus they would be able to ‘home in’
on the desired sound.
 The mutation procedure could be further extended to allow certain parameters
to be fixed as the user desired. For example, if a user knew they didn’t want
to mutate the amplitude envelope further, they could ‘fix’ it. In subsequent
mutant children, the amplitude envelope would remain constant while the other
parameters were being optimised.
 The EvoS system would benefit from an increase in speed. Currently, there is
a significant delay encountered when auditioning sounds. This delay could
be eliminated by using a faster implementation language, for example using
Csound instead of MatLab. Better still, real time response could be obtained
by using dedicated hardware synthesizers in an approach similar to van Goch’s
P-Farm [Goc96].

 A large problem that needs to be dealt with is the limitations of the current
synthesis algorithm. FM synthesis only provides a limited range of timbres. It
would be good to adapt the system to an algorithm that can produce a much
wider range of sounds. Furthermore, it would be good to make the system
modular enough to be adapted to any synthesis algorithm the user cares to ex-
periment with. Be warned however, this is no easy task.

 A preliminary attempt at solving the above problem was made (see Appendix A),
but it was not very successful. However, using artificial evolution for additive
synthesis should not be disregarded. A mating operator seems quite feasible and
the mutation operator could be made to work with a bit more experimentation.
The payoff for creating such a system would be large: evolving samples of real
sounds into unreal mutant relatives is a prospect to be relished by musicians,
composers and anyone with a taste for the bizarre.
 I believe the work presented here follows in the footsteps of Richard Dawkins.
The system combines a simple synthesis algorithm with an EA whose main op-
erator is mutation. The next step is to follow in the footsteps of Karl Sims. This
would require the evolution of symbolic lisp expressions that would be inter-
preted as sounds instead of images. Achieving this would solve the problem of
limited-timbre synthesis algorithms – the evolution would be “open-ended.”

 Another large project would involve integrating the EvoS system into a larger
compositional framework. This would better serve musicians and composers
by combining artificial evolution with existing musical composition software.
In the short term, progress toward this goal could be made by implementing a
‘looping’ feature which allows drum patterns to be sequenced and played along-
side of evolved sounds. It would also be good to provide access to all search
methods from the same interface - manual and random, as well as evolution.

 Finally, we have seen that artificial evolution for sound synthesis provides a
useful tool for composers - who use the system for finding sounds that will
be used in compositions; and also non-composers - who use the system for
entertainment and experimentation. It would be interesting to see if the system is also
useful for researchers who explore new sound synthesis algorithms. This would
involve getting some researchers to trial the system and give their opinions.

And there are, without doubt, many other avenues of future research related to this
work that I have not mentioned. Don’t let this stop you from pursuing them!
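Two of the extensions proposed in this section, variable-variance mutation and the fixing of chosen parameters, can be sketched together. This is a minimal illustration under stated assumptions and not the EvoS implementation: it assumes each sound is a list of numeric parameters with known (low, high) ranges, that mutation adds Gaussian noise, and that the user-controlled `scale` is a fraction of each parameter's range. The names `mutate`, `scale` and `fixed` are hypothetical.

```python
import random


def mutate(parent, ranges, scale=1.0 / 20.0, fixed=(), rng=random):
    """Return a mutant child of `parent`.

    parent: list of parameter values
    ranges: list of (low, high) bounds, one pair per parameter
    scale:  fraction of each range used as the spread of the noise;
            the user lowers it to 'home in' on a sound (assumed control)
    fixed:  indices of parameters the user has frozen
    rng:    source of randomness (the random module or a random.Random)
    """
    child = []
    for i, (value, (low, high)) in enumerate(zip(parent, ranges)):
        if i in fixed:
            # Frozen parameters are copied unchanged.
            child.append(value)
            continue
        spread = scale * (high - low)
        mutated = value + rng.gauss(0.0, spread)
        # Keep the mutant inside the legal range for this parameter.
        child.append(min(high, max(low, mutated)))
    return child
```

For example, `mutate(parent, ranges, scale=0.01, fixed={3, 4})` would produce a child close to its parent while leaving parameters 3 and 4 untouched.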

6.4 A final note

Maybe it is not really important that one method of searching the sound space be
better than another. Each method has its own appeal depending on the circumstances
and the needs of the user – all three methods presented here make up a whole.
Ultimately we are trying to build tools that enhance human creativity – unblock
it – set it free. I look forward to the day that a machine can interpret my vision and
make it a reality.
Appendix A

Evolving Samples

A.1 Introduction

After the EvoS experiments had been conducted (chapter 4) it was decided to try
evolving some more complicated sounds. The sounds from EvoS were interesting, yet
very few were suitable for practical use - for example, in a composition. A user that
evaluated the EvoS system suggested that a system that could mutate sound samples
would be very useful to composers¹. But how could you mutate samples?
One practical way of mutating any pre-recorded sound is via additive synthesis².
Recall that additive synthesis can produce any possible sound by using a separate
sinewave oscillator for each harmonic in the sound. Each oscillator has a separate
envelope controlling its amplitude and pitch deviation. This is a bonus as we have
already had a lot of experience in mutating envelopes – it should be a piece of cake.
However, additive synthesis is computationally expensive, with most sounds typically
requiring 8 or more harmonics for accurate reproduction. As was found with EvoS,
MatLab is already slow implementing a couple of oscillators and envelopes for FM
synthesis. Clearly, MatLab would be unsuitable for experimenting with additive syn-
thesis at interactive rates.
All is not lost, however: the Csound program has a very quick implementation of
additive synthesis and also another major advantage: a heterodyne filter analysis tool.
This tool can analyse any audio sample and return a data file containing the amplitude
and pitch envelopes of each harmonic. Another Csound command can then be called
which reads the analysis file and uses the amplitude and pitch envelopes to synthesize
the sound. Using Csound, an evolution system can be built very easily by following
these steps:
¹ Emma suggested this - see section 4.7.4.
² Refer to section 2.2.3.

1. Sample the sound you wish to evolve.

2. Analyse the sample using Csound’s heterodyne filter analysis tool.

3. Read and modify the analysis data file. This can be done with an external pro-
gram like MatLab. The modifications you make represent the mutation of the
sound.

4. Synthesize the modified analysis file using Csound. Play the synthesized sound
to the user. The user can decide whether they like this sound (discard the orig-
inal analysis file), or would rather try another mutation of the original (discard
the mutated analysis file).

5. Repeat steps 3 and 4 until a satisfactory sound is obtained.
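Steps 3 to 5 above amount to a simple mutate-and-select loop, which can be sketched as follows. The sketch is deliberately independent of Csound and MatLab: the `mutate` and `audition` callbacks are hypothetical stand-ins for "modify the analysis file" and "synthesize, play and ask the user", and the three verdict strings are an assumed convention.

```python
def evolve(parent, mutate, audition, max_rounds=100):
    """Interactive mutate-and-select loop.

    parent:   analysis data for the current sound (any representation)
    mutate:   function returning a mutated copy of the analysis data
    audition: function that plays a candidate to the user and returns
              "keep"    - adopt the mutant as the new parent,
              "discard" - drop the mutant and try another one, or
              "accept"  - adopt the mutant and stop
    """
    for _ in range(max_rounds):
        child = mutate(parent)
        verdict = audition(child)
        if verdict == "keep":
            parent = child
        elif verdict == "accept":
            parent = child
            break
        # "discard": the child is dropped and the same parent is mutated again
    return parent
```

Because only one parent and one child exist at any time, at most two analysis files ever need to be kept on disk.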

A.1.1 Implementation

The first step in implementing the evolution system described above was to read and
interpret the Csound analysis files. Due to the simple structure of the files, this proved
quite easy to do with MatLab. And using MatLab it was easy to visualise the results.
Figures A.1 and A.2 show the data of a particular analysis file.

Figure A.1: Example of a heterodyne analysis file, the amplitude envelopes



Figure A.2: Example of a heterodyne analysis file, the frequency envelopes

You can see that the amplitude and frequency envelopes are quite detailed – each
has about 100-200 points.
The next (and hardest) task was developing a procedure that sensibly mutated
these envelopes. At first an almost identical strategy to that used in EvoS4 was
applied³. Each envelope breakpoint was mutated by a small random amount. A variance
of 1/20 of the amplitude range was used to control the amount of mutation. This
resulted in some envelopes being mutated nicely, while others were badly distorted. The
problem was that some envelopes were much less significant than others. Looking
at figure A.1 you can see that the amplitude of the higher harmonic envelopes is
much less than that of the lower harmonic envelopes. The lower harmonic envelopes were
being mutated nicely (because the variance was 1/20 of their amplitude range), but the
higher harmonic envelopes were badly distorted (because the variance was many times
greater than 1/20 of their amplitude range).
To overcome this problem, an adaptive variance was used for mutation. Each
envelope was taken on its own: the range of its amplitude values was calculated, and a
variance equal to 1/20 of this range was derived. This variance was then used to control
mutation for that envelope alone. Using this scheme, nicely mutated envelopes were
obtained; see figures A.3 and A.4 for examples.
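The adaptive scheme can be sketched as below. The text speaks of a ‘variance’ of 1/20 of the range; the sketch treats that figure as the standard deviation of Gaussian noise, which is an assumption, as are the function names. It is a Python illustration of the idea and not the MatLab code that was actually used.

```python
import random


def mutate_envelope(values, fraction=1.0 / 20.0, rng=random):
    """Mutate one envelope using a spread scaled to that envelope's own range."""
    spread = fraction * (max(values) - min(values))
    # Note: no clamping is applied here, so a mutated amplitude could dip
    # below zero; a fuller version would probably clip such values.
    return [v + rng.gauss(0.0, spread) for v in values]


def mutate_all(envelopes, fraction=1.0 / 20.0, rng=random):
    """Mutate every envelope independently, each with its own spread."""
    return [mutate_envelope(env, fraction, rng) for env in envelopes]
```

With this scaling, a quiet high harmonic whose amplitudes span 0 to 1 receives noise one hundred times smaller than a loud low harmonic spanning 0 to 100, so both are disturbed by the same proportion.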
³ Refer to section 4.4.2.

Figure A.3: Example of a nicely mutated amplitude envelope

Figure A.4: Example of a nicely mutated frequency envelope



Figure A.5: Example of a badly mutated amplitude envelope

However, there were still problems with this scheme. Figure A.5 shows a mutated
amplitude envelope that is very noisy. This noise is undesirable and seemed to occur
when breakpoints in the envelope were clustered close together. Although this problem
could be rectified with more complex mutation schemes⁴, time constraints did not
allow this. As most envelopes were being mutated well, it was decided to
press on and experiment with evolution using this mutation scheme.
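One possible form of the filtering mentioned as a remedy is a simple moving average applied to the mutated breakpoint values. This was not tried in the experiment; the sketch below is only an illustration. It assumes the filter runs over breakpoint index rather than time, which ignores the uneven spacing of clustered breakpoints, and the function name and default window width are hypothetical.

```python
def smooth(values, width=5):
    """Moving-average filter over breakpoint values.

    The window is centred on each point and shrinks at the two ends of
    the envelope so that the output has the same length as the input.
    """
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out
```

A wider window removes more noise but also flattens genuine features of the envelope, such as a sharp attack.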
After the mutation procedure had been implemented in MatLab, the rest was easy.
Evolution was performed ‘manually’ by first mutating an analysis file in MatLab, then
using Csound to play the resulting sound. It would be a simple matter to construct a
user interface that automated this task.

A.1.2 Results

For this experiment, evolution was conducted with three main samples: an organ, a
bass guitar and a voice sample. The results were very disappointing. The problem
was that all mutants sounded the same. For a given parent with two mutant children,
it was impossible to tell the two children apart. The obvious solution to this was to
increase the variance from 1/20 to a larger factor. However, this didn’t help matters at
all. No matter which mutants you chose, the sample just ended up sounding like it was
being played underwater - the result was the same for the organ, the bass guitar and
the voice sample. When smaller variances were tried, not only were mutant children
indistinguishable from each other, but you couldn’t even tell the difference between
the parent and the mutant child!
⁴ For example, filtering the noisy envelope.
Was this due to the badly mutated envelopes described before (figure A.5)? Cer-
tainly, the underwater/bubbling sounds produced could have been a result of rapidly
oscillating, noisy envelopes. But even if this weren’t the case, I suspect that you still
would not be able to tell the mutant children apart from each other. The problem is
really the increased number of dimensions in the sound space.
In EvoS7, there were about 20 to 30 parameters being mutated at a given time.
Here, there are 20 envelopes (10 for amplitude and 10 for frequency), each with 100
to 200 breakpoints which are all being mutated at once. This is a total of at least 2000
parameters - a sound space with at least 2000 dimensions. The human ear just cannot detect small
differences in this space, so a human user cannot possibly hope to choose a direction
and follow it.
To test this theory a small experiment was conducted with a very simple set of
amplitude and frequency envelopes. Four envelopes were used (2 amplitude and 2
frequency), each with four breakpoints. This gives a total of 16 points - a sound space
with 16 dimensions. These were mutated and evolved using exactly the same
procedure as above. It was found that in this case, you could tell the difference between
mutant children. In fact, it was quite easy to control the evolution and steer it in the
direction you desired. It seemed that with 16 dimensions, evolution worked fine but
when you scaled this up to 2000, it fell apart. Maybe this result can be accepted as
evidence that this technique doesn’t work when the number of dimensions is increased
above 20 or 30.

A.1.3 Conclusion

As no useful results were obtained from this experiment, the idea of mutating samples
had to be scrapped - time was running out. This was a shame because if you could
make the idea work, it would be great. Possibly all it needs is a more complex
mutation procedure⁵, but I suspect not. More likely, you need to find a way of reducing
the dimensional space to about 20 parameters. This might be achieved by swapping
envelopes between harmonics instead of mutating each point in every envelope. It
is also a shame to drop this idea as mating samples would have been fairly easy to
implement. To mate two samples you could just randomly swap some envelopes of
one sound for envelopes of the other sound.
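The mating operator suggested here was never implemented, but it could look something like the following sketch. It assumes both sounds have been analysed into the same number of envelopes, listed in the same harmonic order; the function name and the `swap_probability` parameter are hypothetical.

```python
import random


def mate(sound_a, sound_b, swap_probability=0.5, rng=random):
    """Build a child by taking each envelope from one parent or the other.

    sound_a, sound_b: lists of envelopes in the same harmonic order
    swap_probability: chance that a given envelope comes from sound_b
    """
    if len(sound_a) != len(sound_b):
        raise ValueError("both sounds need the same number of envelopes")
    return [
        env_b if rng.random() < swap_probability else env_a
        for env_a, env_b in zip(sound_a, sound_b)
    ]
```

Because whole envelopes are exchanged rather than individual breakpoints, the number of decisions equals the number of envelopes (20 in the experiment above), which is in line with the suggestion of reducing the search to about 20 parameters.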

⁵ Maybe one that involves filtering to get rid of the noisy envelopes.
Bibliography

[Bäc95] Thomas Bäck, Evolutionary Algorithms in Theory and Practice, Oxford:
Oxford University Press, 1995.

[Boo87] Michael Boom, Music through MIDI, Redmond, Washington: Microsoft
Press, 1987.

[Cho73] John M. Chowning, “The Synthesis of Complex Audio Spectra by means of
Frequency Modulation”, Journal of the Audio Engineering Society, Volume
21, Number 7, pp. 526–534, 1973.

[Cso98] Csound - sound synthesis software. Csound is free for educational
and research purposes. The latest version is available via ftp from
ftp.math.bath.uk in the /pub/dream directory. Other Csound
information and resources can be found at the Csound Front Page,
http://www.leeds.ac.uk:80/music/Man/c_front.html

[CE95] Kai Crispien and Tasso Ehrenberg, “Evaluation of the ‘Cocktail Party Effect’
for multiple stimuli within a spatial auditory display”, Journal of the Audio
Engineering Society, Volume 43, Number 11, pp. 932–941, November 1995.

[CH96a] Ngai-Man Cheung and Andrew Horner, “Group Synthesis with Genetic
Algorithms”, Journal of the Audio Engineering Society, Volume 44, Number
3, pp. 130–147, March 1996.

[CH96b] San-Kuen Chan and Andrew Horner, “Discrete Summation Synthesis of
Musical Instrument Tones using Genetic Algorithms”, Journal of the Audio
Engineering Society, Volume 44, Number 7/8, pp. 581–592, July/August
1996.

[Daw86] Richard Dawkins, The Blind Watchmaker, Harlow: Longman Scientific &
Technical, 1986.

[Daw88] Richard Dawkins, “The Evolution of Evolvability”, in Artificial Life,
Christopher Langton, Editor, Addison-Wesley, 1988.


[DJ85] Charles Dodge and Thomas A. Jerse, Computer Music: Synthesis,
Composition and Performance, New York: Schirmer Books, 1985.

[Goc96] Arno van Goch, “P-Farm”, an experimental version of this program is
available from http://www.xs4all.nl/~avg/pfarm.html

[Goc98] Arno van Goch. Personal Communication. August 1998.

[Gol89] David E. Goldberg, Genetic Algorithms in Search, Optimisation and
Machine Learning, Reading, Massachusetts: Addison-Wesley, 1989.

[HA96] Andrew Horner and Lydia Ayers, “Common tone adaptive tuning using
genetic algorithms”, Journal of the Acoustical Society of America, Volume
100, Number 1, pp. 630–640, July 1996.

[HB96] Andrew Horner and James Beauchamp, “A genetic algorithm-based method
for synthesis of low peak amplitude signals”, Journal of the Acoustical
Society of America, Volume 99, Number 1, pp. 433–443, January 1996.

[HBH93] Andrew Horner, James Beauchamp and Lippold Haken, “Machine Tongues
XVI: Genetic Algorithms and Their Application to FM Matching Synthesis”,
Computer Music Journal, Volume 17, Number 4, pp. 17–29, Winter
1993.

[HG91] Andrew Horner and David E. Goldberg, “Genetic Algorithms and Computer
Assisted Music Composition”, Proceedings of the Fourth International
Conference on Genetic Algorithms, In Richard K. Belew and Lashon B. Booker,
editors, San Mateo, California: Morgan Kaufmann Publishers Inc., 1991.

[Hol75] John H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor:
The University of Michigan Press, 1975.

[Hor95a] Andrew Horner, “Wavetable Matching Synthesis of Dynamic Instruments
with Genetic Algorithms”, Journal of the Audio Engineering Society,
Volume 43, Number 11, pp. 916–931, November 1995.

[Hor95b] Andrew Horner, “Envelope Matching with Genetic Algorithms”, Journal
of New Music Research, Volume 24, Number 4, pp. 318–341, December
1995.

[Lev93] Steven Levy, Artificial Life: The quest for a new creation, London: Penguin,
1993.

[Mat98] MatLab computing software, The MathWorks, Inc., see
http://www.mathworks.com/products/matlab/ for more
information.

[Moo94] Jason H. Moore, “GAMusic 1.0”, available via ftp from
fly.bio.indiana.edu in the /science/ibmpc/ directory

[Moo98] Jason H. Moore. Personal Communication. May 1998.

[Mus98] Music Machines, website at http://machines.hyperreal.org

[Ope88] Peter Oppenheimer, “The Artificial Menagerie”, in Artificial Life,
Christopher Langton, Editor, Addison-Wesley, 1988.

[Pre92] Jeff Pressing, Synthesizer Performance and Real-Time Techniques, Madison,
Wisconsin: A-R Editions Inc., 1992.

[Sim91] Karl Sims, “Artificial Evolution for Computer Graphics”, Computer
Graphics, Volume 25, Number 4, pp. 319–328, July 1991.

[Sim93] Karl Sims, “Interactive evolution of equations for procedural models”, The
Visual Computer, Volume 9, Number 8, pp. 466–476, 1993.

[Smi91] Joshua R. Smith, “Designing Biomorphs with an Interactive Genetic
Algorithm”, Proceedings of the Fourth International Conference on Genetic
Algorithms, In Richard K. Belew and Lashon B. Booker, editors, San Mateo,
California: Morgan Kaufmann Publishers Inc., 1991.

[Smi94] Joshua R. Smith, Evolving Dynamical Systems with the Genetic Algorithm,
Honors Thesis, Williams College, Williamstown, Massachusetts, 1994.

[Wil88] Scott R. Wilkinson, Tuning In: Microtonality in Electronic Music,
Milwaukee: Hal Leonard Books, 1988.

[YH97] Jennifer Yuen and Andrew Horner, “Hybrid Sampling-Wavetable Synthesis
with Genetic Algorithms”, Journal of the Audio Engineering Society,
Volume 45, Number 5, pp. 316–330, May 1997.