
ISSN 1744-1986

Technical Report No. 2010/23

Sound Spheres
A non-contact virtual musical instrument
played using finger tracking
C Hughes

19 September, 2010

Department of Computing
Faculty of Mathematics, Computing and Technology
The Open University

Walton Hall, Milton Keynes, MK7 6AA


United Kingdom

http://computing.open.ac.uk
Sound Spheres
A non-contact virtual musical instrument
played using finger tracking

A dissertation submitted in partial fulfilment


of the requirements for the Open University's
Master of Science Degree
in Computing for Commerce and Industry.

Craig Hughes
(T8078171)

8th March 2011


Word Count: 19,687
Preface

Firstly, I would like to thank my tutor, Mr. Michel Wermelinger, whose support and

guidance throughout this project have been invaluable. My thanks also go to the

project's specialist advisor, for his constructive advice and comments.

A huge thank you must also go to my family, especially my wife Wendy for her

unyielding patience and support.

I would also like to thank many of my friends and family for generously and

enthusiastically giving up their personal time to help evaluate the Sound Spheres

virtual musical instrument, and for all their words of encouragement.

Finally, I would like to acknowledge that this project has been greatly inspired by the

work of Johnny Lee (2008). In particular, his study “Hacking the Nintendo Wii

Remote” and various practical demonstrations such as his acclaimed presentation at

the Technology, Entertainment and Design (TED) Conference in 2008 have been of

great influence.
Table of Contents

Preface ........................................................................................................................ i

List of Figures ....................................................................................................................... v

List of Tables ...................................................................................................................... vi

Abstract ..................................................................................................................... vii

Chapter 1 Introduction.................................................................................................... 1

1.1 Definition of Terms ................................................................................................ 2


1.2 Background to the Research ................................................................................... 2
1.3 Aims and Objectives of the Research Project......................................................... 8
1.4 Contribution to Knowledge .................................................................................... 9
1.5 Overview of the Dissertation ................................................................................ 10

Chapter 2 Literature Review ........................................................................................ 11

2.1 New Electronic Musical Interfaces....................................................................... 11


2.2 Virtual Musical Instruments ................................................................................. 16
2.3 VMI Mapping Strategies ...................................................................................... 19
2.4 Non-Contact Musical Interfaces ........................................................................... 23
2.5 Finger Tracking .................................................................................................... 25
2.6 Wiimote ................................................................................................................ 26
2.7 Research Question ................................................................................................ 30
2.8 Summary............................................................................................................... 31

Chapter 3 Research Methods ........................................................................................ 32

3.1 Research Techniques ............................................................................................ 32


3.2 Research Detail ..................................................................................................... 33
3.2.1 Literature Review ............................................................................................. 33
3.2.2 Pilot Study ........................................................................................................ 34
3.2.3 User Study ........................................................................................................ 35
3.2.4 Observation ...................................................................................................... 37
3.2.5 Quasi-Experiments ........................................................................................... 38
3.2.6 Interviews ......................................................................................................... 38
3.2.7 Questionnaires .................................................................................................. 39

3.2.8 Other Research Methods .................................................................................. 41
3.3 Preliminary Analysis of Research Data ................................................................ 42

Chapter 4 System Design ............................................................................................. 45

4.1 Overview .............................................................................................................. 45


4.2 User Interface ....................................................................................................... 45
4.2.1 Tracking and Sound Spheres ............................................................................ 45
4.2.2 Sound Sphere Layout ....................................................................................... 46
4.3 Implementation of Control Parameters................................................................. 49
4.3.1 Position............................................................................................................. 49
4.3.2 Angle ................................................................................................................ 49
4.3.3 Speed ................................................................................................................ 51
4.3.4 Pressure ............................................................................................................ 52
4.4 Visual Feedback ................................................................................................... 53
4.5 Control to Sound Synthesis Mappings ................................................................. 55
4.6 System Set-up ....................................................................................................... 56
4.7 Reflective Markers ............................................................................................... 58
4.8 Occlusion Issues ................................................................................................... 58
4.9 Ethical Issues ........................................................................................................ 59
4.10 Pilot Study and Prototype Review Influences ...................................................... 60

Chapter 5 Results.......................................................................................................... 63

5.1 Playability ............................................................................................................. 63


5.1.1 Sound Sphere Layout ....................................................................................... 64
5.1.2 Reflective Markers ........................................................................................... 64
5.2 Audio Visual Feedback ........................................................................................ 65
5.3 Reproducibility ..................................................................................................... 66
5.4 Control Parameters ............................................................................................... 67
5.4.1 Ranking of Control Parameters ........................................................................ 69
5.5 Spearman's Rank Correlation Results .................................................. 71
5.5.1 Unexpected Correlation Results ....................................................................... 72
5.6 Mann-Whitney U Test Results ............................................................................. 73
5.7 Validation ............................................................................................................. 74
5.7.1 Playability ........................................................................................................ 74
5.7.2 Progression ....................................................................................................... 75
5.7.3 Predictability .................................................................................................... 76
5.7.4 Challenge, Frustration and Boredom ............................................................... 76

5.7.5 Application of Control Parameters ................................................................... 77

Chapter 6 Conclusions.................................................................................................. 78

6.1 Project Review...................................................................................................... 79


6.2 Future Research .................................................................................................... 81
6.3 Further Development ............................................................................................ 82

References ..................................................................................................................... 83

Index ..................................................................................................................... 86

Appendix A – Extended Abstract .......................................................................................... 87

Appendix B – User Study Questionnaire ............................................................................... 92

Appendix C – User Study Questionnaire Results - Likert Scale ........................................... 97

Appendix D – User Study Questionnaire Results - Control Parameter Rankings ................. 98

Appendix E – Spearman's Rank Correlation Results ............................................................ 99

Appendix F – Mann-Whitney U-Test Results ..................................................................... 105

Appendix G – User Study Interview Responses .................................................................. 109

Appendix H – Categorization of Qualitative Results ........................................................... 112

Appendix I – Ethical Issues ................................................................................................. 113

Appendix J – Sound Spheres Setup and Installation............................................................ 116

Appendix K – Potential Enhancements ............................................................................... 123

Appendix L – Sound Spheres Source Code ......................................................................... 125

Appendix M – Sound Spheres Videos ................................................................................. 126

List of Figures

Figure 1 – Overview of Research Methods .................................................................... 33

Figure 2 – User Study Research Methods and Stages..................................................... 37

Figure 3 – Axis Alignment in Perspective View ............................................................ 47

Figure 4 – Axis Alignment in Orthogonal View ............................................................. 47

Figure 5 – Illustration of User Interface .......................................................................... 48

Figure 6 - Illustration of the Position control parameter ................................................. 49

Figure 7 - Illustration of the Angle control parameter .................................................... 50

Figure 8 - Illustration of the Speed control parameter .................................................... 51

Figure 9 - Sensory Feedback Loops ................................................................................ 54

Figure 10 - Sphere Collision Sparks ............................................................................... 54

Figure 11 – Sound Spheres System Setup ...................................................................... 57

Figure 12 - LED Array, Wiimote and Cover .................................................................. 58

Figure 13 – Control Parameter Rankings for Ease of Control ........................................ 69

Figure 14 - Control Parameter Rankings for Importance to Musical Outcomes ............ 70

List of Tables

Table 1 – User Study Participant Profile......................................................................... 36

Table 2 – Control Parameter Mapping ............................................................................ 56

Table 3 - Questionnaire Result Summary - General Playability ..................................... 63

Table 4 - Questionnaire Result Summary - Audio Visual Feedback .............................. 65

Table 5 - Questionnaire Result Summary - Reproducibility........................................... 66

Table 6 - Questionnaire Result Summary - Position Control ......................................... 68

Table 7 - Questionnaire Result Summary - Speed Control ............................................. 68

Table 8 - Questionnaire Result Summary - Angle Control ............................................. 68

Table 9 - Questionnaire Result Summary - Pressure Control ......................................... 69

Table 10 – Spearman's Rank Correlation Results Summary .......................................... 71

Table 11 - Mann-Whitney U-Test Results ...................................................................... 74

Abstract

The creation and performance of music is predominantly and traditionally reliant on

the direct physical interaction between the performer and a musical instrument. The

advent of electronics and computing has given rise to many new electronic musical

instruments and interfaces. Recent advances in these areas have seen an emerging

trend into the design of virtual musical interfaces in which audio is synthesized and

played back based on a musician's body movements captured by some gestural

interface. Designing new electronic or virtual musical instruments necessitates

consideration of many factors that affect their control and playability.

The research described in this dissertation concerns the design and construction of a

new non-contact virtual musical instrument (called Sound Spheres) that uses a finger

tracking method as its gestural interface. The dissertation identifies control

parameters and key factors that are considered important for the design of such

instruments and provides research into whether these can be successfully achieved in

a non-contact virtual musical instrument played by finger tracking.

Results show that implementation of the control parameters of pressure, speed, and

position can successfully be achieved for a non-contact virtual musical instrument.

Achieving successful implementation of the angle control parameter however was

inconclusive. Furthermore, the results present evidence that the finger tracking

technique is an effective method for playing such an instrument.

Chapter 1 Introduction

The cool computer interface technique of finger tracking and the fascinating world of

virtual musical instruments have, up until very recently, completely escaped my

attention! This perhaps would not be surprising if it were not for the fact that I am a

software developer by profession and a keen musician and recording artist in my

spare time.

The ability to control a software application purely by moving one's fingers freely in

the air (i.e. finger tracking) has always been, for me at least, the subject of science

fiction. This perception changed quite by chance after stumbling upon a fascinating

presentation given by Johnny Lee that was published on the Technology,

Entertainment and Design (TED) Conference website. The presentation showed how

to implement a cost-effective finger tracking software application utilizing the built-in

infrared camera and simple Bluetooth connectivity of the Nintendo Wii Remote

Games Controller. This opened a world of possibilities and I began to think about the

various types of software application to which I could apply the finger tracking

technique. As a recording musician I am no stranger to the use of computer

technology in the production and performance of music. Could finger tracking be

used in the production of music and more precisely could it be used to play a virtual

musical instrument (VMI)? The non-contact nature of the finger tracking method

provoked a further question. How might the player of such an instrument exercise

control in order to affect its musical outcomes? This project involves the design and

construction of a non-contact VMI played using finger tracking and provides

research into playability with respect to its control affordances and effectiveness.

1.1 Definition of Terms

As a central theme of the project the term interaction is used specifically to describe

the relationship between a musical instrument and an entity that manipulates the

instrument's controlling mechanisms necessary to produce sound. Furthermore, the

term physical interaction describes an interaction that involves exerting a tangible

force on a musical instrument in order for it to produce sound.

The terms effective and effectiveness are used throughout this project. They refer

specifically to whether the finger tracking method contributes to the application of

factors considered important for the design of new musical interfaces.

1.2 Background to the Research

Common musical instrument classification systems, such as the Mahillon (Mahillon,

circa 1890) and Hornbostel-Sachs (Hornbostel and Sachs, 1914) systems, clearly

illustrate that practically all well-known musical instruments, from traditional to

more modern variants (e.g. electric guitar, electric keyboard, electric flute, etc),

produce their sound through physical interaction. It is indeed these different

interactions that form the basis of these classifications.

Not all physical interaction with musical instruments requires a performer however.

Take for example the Aeolian Harp, which is an instrument that is played entirely by

the wind as it blows across the harp's strings. The wind exerts a force on the harp

strings to produce sound and this by my definition is a physical interaction.

For musical instruments where interaction by a performer is a necessity, the

interaction is typically physical and involves contact with the instrument either

directly through touch or indirectly through the use of equipment or implements such

as sticks or bows. However, the Theremin and Terpsitone are two notable

instruments that, whilst requiring a performer to play them, are controlled through

body gestures and are played without any physical contact with the instrument (i.e. a

non-physical interaction). One might argue that even for the Theremin and

Terpsitone the interaction is of a physical nature, for the performer must still provide

body movement in order to play the instrument. Furthermore, an extreme argument

might suggest that the manipulation of electromagnetic waves is also physical (albeit

invisible). However, using the definition adopted in section 1.1, we can say that

interaction by a performer that does not directly or indirectly contact the musical

instrument is best described (for the sake of clarity) as a non-physical interaction and

the instrument as a non-contact instrument.

Electronics and computers, along with music related software, have enabled many

new musical devices, interfaces and software systems to be developed to facilitate

the creation of music for musicians and non-musicians alike. The aspiration to create

new musical devices and interfaces has given rise to a number of interesting studies.

One such study, by Crevoisier et al. (2006), showed how a simple everyday object, a

table, could be transformed into a musical and visual instrument using sound

produced by touch. Kiefer (2010) explored three new input devices for the intuitive

control of composition and editing for digital music. Paine et al. (2007) also sought

to develop a new electronic musical instrument/interface based upon a detailed

re-evaluation of the relationship between the musician and musical interface. A more

artistic approach to the creation of music was taken in the design of the Sounds of

Influence software application (Johnstone et al., 2005) where visual feedback is

influenced by musical input using a computer mouse. In each of these examples the

physical interaction through contact with a musical instrument or device was central

to the study.

One research project of particular interest is the Performance Practice in New

Interfaces for Real-time Electronic Music Performance carried out at the Virtual,

Interactive, Performance Research Environment and the MARCS Auditory

Laboratories at the University of Western Sydney. This project sought to develop a

taxonomy for new interfaces for real-time electronic music performance under the

working title TIEM (Taxonomy for real-time Interfaces for Electronic Music

performance). A key component of this study is a publicly accessible online database

and website where, after completing a survey, people can submit details of new

musical interfaces they have developed. The website http://vipre.uws.edu.au/tiem

currently lists over 70 new and unique electronic musical instruments and interfaces.

Significantly only 4 instruments in the taxonomy are played without physical

contact.

Software applications along with various types of controllers (or gestural interfaces

which convert body movement and hand gestures to computer commands) have

enabled the development of virtual musical instruments (VMIs) in which audio is

processed, synthesized and played back to the musician using computer software.

Visual feedback for VMIs is typically carried out by displaying graphics on a

computer screen.

A wide variety of controllers have been developed to facilitate the creation of music

for VMIs. Some of these are included in the TIEM taxonomy and many more have

been described or been the focus of research into gestural interfaces. Mulder (2000)

for example provides descriptions of different types of VMI controller. Based on the

evidence of these and other sources, the majority of these VMI controllers rely on

physical interaction with a device such as the T-Stick, researched, designed and built

by Malloch (2007). The T-Stick can sense where and how much of it is touched,

tapped, twisted, tilted, squeezed, and shaken.

There is, however, a small number of VMIs that are controlled without contact

(non-physical interaction) and instead rely on the proximity or movement of parts of the

body. A good example of such a free-gesture controller is described by Franco

(2005).

Physical contact with any instrument (including VMIs) provides the performer with

varying degrees of touch sensation (or tactile feedback), where the performer can feel

aspects of the device such as resistance, vibration, texture or forces (such as weight

or movement). VMIs played through physical interaction typically rely on haptic

technology (or haptics) to mechanically provide this tactile feedback.

Perhaps the reason why the majority of VMIs are played through physical interaction

is to do with the wide belief that tactile feedback provides a greater degree of control

and hence the musician/instrument can be more expressive. It has been argued that

tactile feedback plays a central role in musical performance, and that audio and

vision are of a lesser importance, and merely act as monitoring senses. This is

certainly the conclusion reached by Castagne et al. (2004) and Lecuyer et al. (2005).

However, there is another viewpoint. Whilst one might acknowledge that tactile

feedback may be important for precise control of a musical instrument, it is clearly

not an entirely necessary factor as the Theremin and other non-contact instruments

have demonstrated. I suggest that removing physical constraints and precision of

control allows for new musical possibilities. The following viewpoint can also be

considered. Instruments are generally played for musical performance of some kind.

The musical performance may be recorded for later playback or carried out in

real-time. It is during this real-time musical performance that it could be argued audio and

visual feedback become equally (if not more) important than the physical control of

the instrument itself. This is because audience participation is a key component of a

musical performance and from an audience's perspective audio and visuals are the only

feedback afforded to them. Musical performances of both contact and non-contact

instruments can be equally compelling to watch and moving to listen to. One only

has to watch a performance of Ivan Franco playing the Airstick at STEIM, Netherlands

in 2007 to conclude this.

Typically non-contact VMIs take their input from body proximity, movement and/or

gestures. Motion capture techniques are therefore a key component of non-contact

VMIs. In Vlaming's (2008) thesis a wide range of motion capture techniques and

systems are identified. Finger tracking is one such motion capture technique.

Finger tracking systems recognize and follow the position and gestures of fingers as

they are moved freely in the air, to enable control of software and user interfaces.

This technique has many applications. Kiefer (2010) demonstrated with the

Phalanger gesture recognition system that it is possible to use hand tracking to

control editing of digital music.

However, can finger tracking be an effective non-contact technique in which to play

a VMI? The answer to this would very much depend on what is meant by effective.

The Thummer Mapping Project study (Paine et al., 2007) identified four common

physical instrument variables (pressure, speed, angle and position) that control

instrument dynamics, pitch, vibrato and articulation. In a later study Paine (2009)

reiterated these control parameters as important factors for the design of new musical

interfaces. Perhaps if implementation of these four variables can be achieved for a

finger-tracking VMI then one might conclude that key aspects likely to contribute to

effectiveness are present.

Jorda (2004) describes other factors that are perhaps also important to the

consideration of a good musical instrument. He suggests that playability, progression

(learning curve), control and predictability are all important factors. He also

suggests that the balance between challenge, frustration and boredom must be met.

Ferguson and Wanderley (2009) highlight reproducibility as one more important

factor for digital musical instruments. They suggest that musical instruments that

allow a performer to be expressive must also permit a performer to imagine a

musical idea and be able to reproduce it.

If these factors (i.e. playability, progression, control, predictability, balance between

challenge, frustration and boredom, and reproducibility) can be achieved then one

might conclude that finger-tracking is an effective method for playing a VMI.

Until recently, hardware to support finger tracking has been expensive and confined

to specialist use (such as the motion capture systems from Vicon). However, in the

fascinating study “Hacking the Nintendo Wii Remote”, Lee (2008) showed an

accessible and affordable finger tracking technique utilizing the Nintendo Wii

Remote controller (Wiimote) for the Nintendo Wii game console. He cleverly

exploited the Wiimote's built-in infrared camera and simple Bluetooth connectivity,

demonstrating how to implement a low-cost finger tracking application.

It is this accessibility to an affordable finger tracking system that has been the

catalyst for this project. It provides a means by which to develop a new non-contact

virtual musical instrument that utilizes finger tracking, and to investigate its musical

and control possibilities.

1.3 Aims and Objectives of the Research Project

The aim of this project is to develop a non-contact virtual musical instrument (VMI),

that is played using a finger tracking method (movement of fingers in the

air), and to study its playability.

The study has primarily focused on the control aspects of the VMI, specifically the

ability for the VMI to provide control parameters of position, speed, pressure and

angle to vary the audio feedback of the VMI. However, the VMI has also been

assessed for its playability in a more general sense. For this I have considered the

factors progression, predictability, reproducibility, and the balance between

challenge, frustration and boredom.

To facilitate finger tracking I have utilized the infrared camera and Bluetooth

connectivity capabilities of a Wiimote as demonstrated by Johnny Lee (2008). As

detailed in section 2.5, with this approach there are two possible marker-based

implementation strategies; passive markers and active markers. This project utilizes

passive markers in the form of highly reflective tape stuck to a lightweight cap

placed on the fingertips. Some experimentation was necessary in order to determine

the most appropriate and effective method of affixing the markers (e.g. directly to the

finger tip or to an object that is placed over the finger tip). An infrared illuminator

has been used as a light source which is directed from the position of the Wiimote in

the direction of the player's fingers. The passive markers reflect the infrared light

back to the camera of the Wiimote, which, in turn, sends data (via Bluetooth

communication) to a computer application running on a laptop. This application is

responsible for both the audio (sound synthesis and playback) and visual feedback

(displayed on a computer monitor) of the VMI.
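
To make this pipeline concrete, the following minimal Python sketch shows how
such an application might poll the Wiimote and drive the instrument. The names
wiimote.poll_ir_points(), instrument.update_finger() and
instrument.render_audio_and_visuals() are hypothetical stand-ins, not the actual
interfaces used in this project (the real source code is given in Appendix L).

    import time

    def run_tracking_loop(wiimote, instrument):
        """Poll marker positions reported by the Wiimote and drive the VMI."""
        while True:
            # The Wiimote's infrared camera reports up to four bright points;
            # each is the reflection of one passive fingertip marker, delivered
            # over Bluetooth as (x, y) coordinates (assumed normalised to 0.0-1.0).
            for x, y in wiimote.poll_ir_points():
                instrument.update_finger(x, y)
            # The application produces both the audio (sound synthesis and
            # playback) and the visual feedback shown on the computer monitor.
            instrument.render_audio_and_visuals()
            time.sleep(0.01)  # poll at roughly 100 Hz to keep feedback immediate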

I have termed the VMI Sound Spheres and this term will be used

throughout this document.

1.4 Contribution to Knowledge

In this research I make several references to work carried out by Paine (2007, 2009)

on the design of new musical interfaces and respectfully acknowledge his

contribution in this area. His assertion that pressure, speed, angle and position could

act as a design consideration for future music interface development has provoked

much thought as to how it might be applied to a virtual musical instrument that is

played using finger tracking (i.e. non-contact). This dissertation explores this idea

further by incorporating the control parameters of position, speed, pressure, and

angle into the design of the Sound Spheres VMI.

As described in section 1.2, the Taxonomy for real-time Interfaces for Electronic

Music performance (TIEM) database currently lists over 70 new and unique

electronic musical instruments and interfaces. Significantly, only 4 instruments in the

taxonomy are played without contact and none of these uses the finger tracking

technique. This presents an obvious opportunity for further research and

development into this area. I therefore believe that the addition of the Sound Spheres

VMI would be a significant contribution to the TIEM database. As previously

discussed, all those who submit to the taxonomy must take part in the TIEM Survey.

As part of the survey‟s questionnaire, participants are asked a series of question

about the qualities of movement needed to play their instrument/interface. They are

also asked to rank their relative importance. I believe that results gathered from the

user study of the Sound Spheres VMI will enable me also to contribute to this

survey, especially on the rankings of position, speed, pressure and angle.

1.5 Overview of the Dissertation

This project essentially involves the design of a new non-contact virtual musical

instrument and a study on the effectiveness of the finger tracking method and

application of control parameters.

A literature review of published academic artefacts is presented in Chapter 2, which

highlights areas of knowledge that have influenced both the design and study of the

Sound Spheres VMI.

Research methods and the data collection process used within the project are

discussed in Chapter 3 and include methods for a pilot study and a user study of the

completed Sound Spheres VMI.

The functional design of the Sound Spheres VMI is outlined in Chapter 4, where

design decisions are explained.

Chapter 5 presents results and analysis of data collected during the Sound Spheres

VMI user study.

The dissertation ends with Chapter 6 where conclusions are drawn and discussed in

relation to the research aims and objectives. A project review and recommendations

for future research are made.

Chapter 2 Literature Review

Research for this project has centered on four core themes: new electronic

musical interfaces, virtual musical instruments (VMIs), non-contact musical

interfaces and finger tracking.

2.1 New Electronic Musical Interfaces

Research into new electronic musical interfaces revealed a wide variety of different

types of interfaces and classifications. While the focus of this research is principally

on designing a VMI that uses a finger tracking method, useful insights can also be

gained from research studies into the design of other types of electronic musical

interface.

The Sound Rose project (Crevoisier et al., 2006) is an audio interactive system based

around a touch sensitive interface that enables its user to create music and images

through the tapping and dragging of fingers over the surface of a simple table top

(the touch table). Their project was of interest as it primarily describes the design of

the hardware and software components of the system and it was useful to appreciate

how these are combined into a single system and how the sequence of events/inputs

were processed. The choice of layout and construction of the touch table (designed

with usability and ergonomic considerations) is also described and justified. The user

interface, graphics and music components of the system are described. It was

enlightening to see how the x-y coordinates of the position of touch were mapped to

different control and processing parameters.

The Sound Rose project shares two similarities to the Sound Spheres project and

hence has been useful in developing initial ideas and design of the Sound Spheres

system. The first similarity is that both projects take input from finger positions (x

and y coordinates) and process these to render graphics and generate sound to

provide both visual and audio feedback to the user. The simple process flow chart

(and accompanying text) presented in the Sound Rose project helped clarify how

similar processes could be sequenced for the Sound Spheres project. Unfortunately

the Sound Rose paper does not provide an in-depth view or discussion on the sound

and graphics processing and hence it did not help to identify any associated

problems. The second similarity between the projects is that they both use finger

positions as a mapping to the sound generated. Whilst the Sound Spheres system will

consider the parameters of speed, pressure and angle in addition to position, the Sound

Rose paper does at least highlight an interesting example of a mapping which will be

considered for the Sound Spheres system: the spatialization and panning of

sound based on x and y coordinates.
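
As an illustration only, a mapping of this kind can be expressed in a few lines.
The Python sketch below pans a sound between stereo channels using the x
coordinate and scales its volume using the y coordinate; the linear panning law
and the choice of y for volume are assumptions made for the example, not details
taken from the Sound Rose paper.

    def spatialize(x, y):
        """Map a normalised finger position (0.0-1.0 per axis) to stereo gains."""
        pan = 2.0 * x - 1.0                      # -1.0 = hard left, +1.0 = hard right
        volume = y                               # higher y = louder, by assumption
        left_gain = volume * (1.0 - pan) / 2.0   # x = 0 sends all signal left
        right_gain = volume * (1.0 + pan) / 2.0  # x = 1 sends all signal right
        return left_gain, right_gain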

The process and study of mapping a performer's interaction with an electronic

musical interface is detailed in the Thummer Mapping Project (ThuMP) (Paine et al.,

2007). This study sought to develop a new electronic musical instrument (the

Thummer) based upon a detailed re-evaluation of the relationship between the

musician and musical interface. The methods used to design and evaluate the new

interface are described in detail and have provided insights during the definition of

my own research methodology. The first stage of the project sought to quantify and

categorize perceived control gestures for several types of instrument, specifically

from the performer's perspective. Through a series of interviews with musicians and

subsequent analysis, the parameters of instrument control exercised were noted, and

these included pitch, dynamics, articulation and vibrato. Further analysis revealed

that the four most common physical instrument controls used to manipulate these

parameters were pressure, speed, angle and position. The second stage of the project

sought to use these controls to experiment with mapping strategies in order to

optimize the playability of the Thummer.

The pressure, speed, angle and position parameters are acknowledged to be applied

in different ways for different instruments. Take for example the control parameter of

pressure. The pressure of a bow on a violin's strings will vary the tone and dynamics

of the sound produced, whereas for a wind instrument an increase in pressure results

when more air is directed into the instrument and a change of sound is

produced. The pressure of a finger on a guitar string will also change its sound,

perhaps from a muted note when low pressure is applied to a clean sound when high

pressure is applied.

The importance of the pressure, speed, angle and position controls was further

explored by Garth Paine when he studied the application of these controls to

computer-based musical interfaces. He outlines his experimentation with two

specific interfaces, the Wacom Graphics Tablet and The Nintendo Wii Remote

(Wiimote). Paine (2007, 2009) describes in detail how the various gestural

possibilities of the Wiimote can be mapped to provide pressure, speed, angle and

position characteristics. For example the angle characteristic might be implemented

using the pitch, roll and yaw controls of the Wiimote, whilst the pressure

characteristic may be implemented by pressing one or more of the Wiimote

buttons. He suggests that a computer music interface could convincingly provide

nuanced and expressive performance if the gestural possibilities are mapped

sufficiently to these controls. However, Paine presents no formal conclusion as to

whether this was found to be the case.
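
By way of illustration, one plausible realisation of the angle characteristic is to
estimate the Wiimote's pitch and roll from its three-axis accelerometer, as in the
Python sketch below. This is an assumption for the example, not Paine's actual
method: it presumes the controller is held roughly still (so that gravity dominates
the reading) and uses one common axis convention; yaw cannot be recovered from the
accelerometer alone.

    import math

    def wiimote_angles(ax, ay, az):
        """Estimate pitch and roll (in degrees) from accelerometer readings."""
        # With the controller at rest the accelerometer measures gravity, so
        # the direction of the gravity vector gives the device orientation.
        pitch = math.degrees(math.atan2(-ax, math.sqrt(ay * ay + az * az)))
        roll = math.degrees(math.atan2(ay, az))
        return pitch, roll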

The significance of pressure, speed, angle and position controls is further supported

by the TIEM (Taxonomy for real-time Interfaces for Electronic Music performance)

research project. This ongoing project seeks to develop a taxonomy for new interfaces

for real-time electronic music performance. The project's objectives, methods,

preliminary outcomes and future plans are detailed by Paine and Drummond (2009).

The project's primary method of defining the taxonomy is through the online TIEM

Questionnaire (http://tiem.emf.org/survey) with results collated in the TIEM

database, dividing types of instrument/interface into the four concepts: Gesture,

Instrument, Digital Controller, Software. The taxonomy currently includes over 70

musical interfaces, which implies 70 responses to the online TIEM Questionnaire. As

part of this questionnaire, participants were asked to select and rank the qualities of

movement needed to play their instrument/interface. The results were in line with the

pressure, speed, angle and position controls, ranked as follows:

1. Position (81.13%)
2. Speed (71.70%)
3. Pressure (58.49%)
4. Angle (49.06%)

These studies have been of great interest as they stress the importance of these physical

controls when designing new electronic musical instruments or interfaces. The

pressure, speed, angle and position controls have played a central role in the

design and study of the Sound Spheres VMI.

These controls are not the only important factors in the design of new musical

interfaces, however. Jorda (2004a) also explored the dynamic relationship between

the musician and the instrument. His project starts from the observation that, given that

live electronic musical interfaces and laptop music are now so widespread, it is

surprising how few professional musicians use them, and asks whether they can be

considered to support a virtuoso performance (i.e. a performance where the musician

must apply great skill). To this end he presents factors that contribute to what makes

a good digital musical instrument. A balance between challenge, frustration and

boredom is one such factor he describes, suggesting that an instrument/interface that

is too simple may not provide a rich experience, and one that is too difficult to master

may alienate the user before they are able to progress. This leads onto a discussion

concerning the playability, progression and learning curve of an instrument. Jorda

(2004b) goes on to look at other factors that would perhaps allow a player of an

instrument to reach virtuoso standard. He suggests that factors such as variability,

reproducibility and predictability are important to provide a rich musical learning

experience for an instrument's players. One might consider these factors at odds

with those outlined by Paine. However, they might also be seen as complementary.

For example, variability can only be achieved if there are multiple control parameters

of which to choose. This study is of interest for the design of new musical interfaces

and hence has been useful in formulating design ideas. Whilst no real method is

described by Jorda on how these factors might be applied, their importance is

assessed as part of this dissertation.

Wanderley (2000) looks at how to design and perform new computer-based musical

instruments from yet another perspective. The focus of his study is on gestural

control and hence aligns well with the design of the Sound Spheres VMI. His paper

provides a number of useful insights. He presents a four-part approach to the design

of sound synthesis for a gesture-based musical interface as follows:

1. Definition and topology of gesture


2. Gesture acquisition and input device design
3. Mapping of gestural variables to synthesis variables
4. Synthesis algorithms

It is comforting to see that Wanderley's approach is directly in line with the design

approach adopted for the Sound Spheres VMI and hence reaffirms its validity. His

paper also discusses various types of gestural control and looks at characteristics of

the sensors used to capture the performer's actions. Again, the use of strategies for

mapping of gestural variables to sound synthesis is a highlighted theme and hence

supports the work carried out by Garth Paine on the Thummer Mapping Project.

2.2 Virtual Musical Instruments

The evolution of musical instruments from acoustic, electro-acoustic, electronic (e.g.

MIDI) to virtual musical instruments is outlined by Mulder (2000). He gives

examples of many new musical interfaces and he characterizes a VMI as having at

least two features. Firstly, any gestures or body movements can be used to control

the sound synthesis process. Secondly, the mapping of these gestures is entirely

programmable and hence limited only by the sound synthesis model. He suggests

however that many designs of new musical controllers and interfaces are more

technology-driven endeavours than based on a model of musician performance or

auditory perception. Hence many VMI designs are not adopted for musical

performance. This is an interesting perception and underlines the importance of the

user studies during instrument development.

This notion is re-affirmed by Dobrian (2003) who states that the design of new

interfaces for music mostly focuses on technical issues and engineering challenges.

He discusses the relationship between the performer and instrument and suggests that

much of one's appreciation of music is in its performance. As an observer we might

be impressed or enthralled by the skills and manoeuvres of a musician as he

plays his instrument. One's knowledge of the instrument being played enhances

appreciation of the skill of the performer. However Dobrian recognizes that with new

VMIs there is no established standard for the relationship between a movement in a

virtual space and resulting sounds or music and hence audience appreciation can be

affected. He therefore provides guidelines for the aesthetic design of VMIs.

However, one could argue that his guidelines are too biased towards audience

appreciation rather than playability for the performer. Take his simplicity guideline

for example. This guideline states that mappings of gesture to sound must be simple

and direct in order for the audience to perceive the cause and effect relationship. This

implies that Dobrian is suggesting a one-to-one gesture to sound mapping be used

(i.e. a specific gesture would always generate the same sound). As argued in section

2.3 on mapping strategies, mappings that are not one-to-one are more engaging for

users. He does go some way towards dealing with this issue in his multiple

simplicities guideline however, suggesting a single simple gesture-sound relationship

may soon seem simplistic to an audience, but two or more simple simultaneous

relationships can be more engaging.

Aesthetic considerations for VMI design are further highlighted by Barbosa (2001).

He gives clear illustrations of the complex sensory feedback system formed by the

performer and the instrument and argues that the feedback between the output of a

virtual device and the user must be in real time if the system is to be classified as an

instrument. Indeed it is hard to see how a VMI could be playable in any other

context. He also gives a list of considerations for live musical performances of VMIs,

although two specific points are open to criticism. Firstly, Barbosa argues that

'although the consequences of the performer's actions should be very clear they

should not be predictable and in this sense the interaction process should not be

totally understood by the audience'. This is in contrast to the view held by Dobrian.

An opposing view to Barbosa's argument is that knowing exactly

how a musician plays any instrument does not take away from the pleasure an

audience can get from watching the musician perform. Secondly, Barbosa also

argues that 'if the interaction process is too obvious and the audience is not surprised

by its results, the performance becomes a technology demonstration'. One might

suggest that Barbosa's point is in fact back to front. A musical performance is more

likely to be just a technology demonstration if the audience has no idea what role the

musician is playing in the performance.

Software is at the core of a VMI and its mode of implementation can influence the

performer experience and hence the musical outcome. One particular study of

interest (Johnstone et al., 2008) investigated three different modes in which software

can exert control over the VMI. These modes are described as instrumental,

ornamental and conversational. In instrumental mode the musician is in control of

the musical outcome and plays the virtual instrument in a similar fashion to a

traditional instrument. For example, each sound is generated by a specific and

intended input from the musician. In ornamental mode, the player surrenders control

for the generated sound and visuals of the VMI to the software itself. The player in

this mode merely influences the musical outcome but cannot control it. For example,

the musician may initiate a sound sequence through some input, but the specific

sounds generated may be randomly selected by the software. In conversational mode

the player and the software share control of the musical outcome. For example, part

of the musical outcome (like the playing of a melody) may be controlled through the

musician's direct input and part may be automatically generated by the software

(perhaps as a layering of abstract sounds) to accompany the melody. Through a

qualitative study of expert musicians using the VMI in different modes of software

control it was (not surprisingly) discovered that the instrumental mode was the clear

preference. This has obvious implications for the interaction design of VMIs, as the

instrument's software component should ideally leave the balance of power and

control to the player. This position has been taken during the design of the Sound

Spheres VMI where sound synthesis will be controlled by the player.
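
The three modes can be summarised in a short sketch. In the Python fragment
below, synth.play() stands in for any sound synthesis back end and the note set is
arbitrary; the fragment is illustrative rather than a description of Johnstone et
al.'s implementation.

    import random

    SCALE = ["C4", "E4", "G4", "B4"]  # an arbitrary note set for illustration

    def respond(mode, player_note, synth):
        """Trigger sound according to the software-control mode in use."""
        if mode == "instrumental":
            # The musician is fully in control: each sound is generated by a
            # specific and intended input.
            synth.play(player_note)
        elif mode == "ornamental":
            # The software controls the outcome; the player merely initiates it.
            synth.play(random.choice(SCALE))
        elif mode == "conversational":
            # Control is shared: the player's melody is accompanied by
            # software-generated material.
            synth.play(player_note)
            synth.play(random.choice(SCALE))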

2.3 VMI Mapping Strategies

VMI design considerations for balance of power and control to the performer, the

performer/audience relationship, and aesthetics are clearly important. To cater for

these in the design of a VMI necessitates a well-conceived mapping strategy between

the performer's input and the instrument's audiovisual output. The Sound Spheres

project focuses on the control aspects of the VMI, specifically the ability for it to

provide control parameters of position, speed, pressure and angle to vary the audio

feedback of the VMI. It therefore follows that these control parameters should be

mapped to the specific effect on the audio feedback that each will modify. This has

meant that a strategy needed to be designed to map control parameters to sound

synthesis parameters.

The Sound Spheres VMI is essentially a multi-parametric interface where the

parameters of position, speed, pressure and angle are controlled simultaneously and

the audio feedback is a result of their combined effect.

The study by Hunt and Wanderley (2002) looked at various strategies for mapping

gestures onto sound synthesis parameters. They start by describing attributes of a

real-time musical instrument control system and use these attributes to define a mode

of operation they call Performance Mode, which essentially describes how a player

of a VMI discovers how to control it by exploring the different input control options

and their combinations. In this mode the player may appear to be merely 'playing

around' but in fact they are actually discovering hidden relationships between the

various system parameters. They state that this Performance Mode is usually the

player's first mode of operation when they try a new instrument for the first time.

This reflects my own experience of picking up and trying to play a wide variety of

instruments. I see this as an indication that players of musical instruments expect to

control multiple parameters in order to play. This preference for performance mode

is considered in the design of the user study of my project where the session starts

with a period of free-play.

Hunt and Wanderley also discuss the concept that multi-parameters should be

coupled together and that this is key to the design and development of richer

interfaces. They first describe two types of mapping, convergent and divergent.

Convergent mapping is where multiple control parameters can control a single sound

synthesis control (i.e. a many-to-one mapping). They illustrate this well by asking

'where is the volume control on a violin' and go on to explain that there is no single

control. The violin's volume is controlled by a combination of a number of inputs

(such as bow-speed, bow-pressure, choice of string and finger position). Divergent

mapping is a one-to-many mapping where one control parameter can control multiple

sound synthesis parameters (e.g. pitch, reverb, volume, etc). Again they use the

violin for illustrative purposes by asking the question “which sonic parameter does

the bow control”. It actually influences many aspects of the sound (e.g. volume,

timbre, articulation, and pitch). Initially only a one-to-one mapping strategy was

considered for the Sound Spheres VMI; however, Hunt and Wanderley's illustration

of these two mapping types has presented an alternative view. Surprisingly they do

not go on to express the point that in actual fact both convergent and divergent

mapping types can be used in conjunction to form a many-to-many mapping strategy.
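
The distinction is easy to state in code. The Python sketch below caricatures Hunt
and Wanderley's violin example: the weightings and formulas are invented purely for
illustration and make no claim about real violin acoustics.

    def violin_volume(bow_speed, bow_pressure, finger_position):
        """Convergent (many-to-one) mapping: several control parameters
        combine to set a single synthesis parameter (volume)."""
        return 0.5 * bow_speed + 0.3 * bow_pressure + 0.2 * finger_position

    def bow_speed_effects(bow_speed):
        """Divergent (one-to-many) mapping: one control parameter influences
        several synthesis parameters at once."""
        return {
            "volume": bow_speed,
            "brightness": bow_speed ** 2,     # faster bowing, brighter tone
            "vibrato_depth": 0.1 * bow_speed,
        }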

Hunt and Wanderley do however give details of a study that they conducted to

compare different types of parametric control that are commonplace in computer

music. Three types were considered: a set of on-screen sliders controlled by a mouse,

a set of physical sliders moved by the fingers, and a multi-parametric interface which

uses parametric coupling. The same set of four sound synthesis parameters (pitch,

volume, timbre and panning) was used in each case. Whilst their study gave

technical and set-up details for each interface used, it was their detailed conclusions

drawn from the study that were most interesting, and they can be summarised as follows:

- Real-time control can be enhanced by a multi-parametric interface.
- Mappings that are not one-to-one are more engaging for musicians.
- Complex tasks may need complex interfaces.
- A one-to-one mapping is good for simple use and little practice time.
- Some people prefer to think in terms of separate parameters.

Although the study focused on controls for computer-based music interfaces, one

might argue that their findings are equally relevant to any VMI and especially the

Sound Spheres VMI whose interface is also computer-based. It would have been

good to have also seen a comparison between different multi-parametric interfaces to

compare the mapping types many-to-one, one-to-many and many-to-many. This

would have added further value to design decisions for the Sound Spheres mapping

strategy.

So far, focus has been placed on the need for a mapping strategy between input

control parameters and audio feedback. What about mapping for visual feedback? A

view was previously expressed that for a non-contact VMI both audio and visual

feedback is necessary (for both the performer and audience), and the Sound Spheres

VMI will very much be an audiovisual experience. Logically therefore the Sound

Spheres mapping strategy must (and does) also consider visual feedback.

Furthermore, the synchresis (i.e. a term coined by film theorist Chion (1994),

meaning the forging between something one sees and something one hears) of both

the audio and visual feedback has also been an influence in system design

considerations.

Both mapping strategies and synchresis for audiovisual instruments are topics

covered in some depth by Moody (2009). His thesis presents an approach to

constructing audiovisual instruments based on the notion of synchresis and hence

details how both sound and visuals may be linked together in a musical instrument.

Not surprisingly, Moody devotes much of his discussion to the importance of

mapping strategies for digital and virtual musical instrument design and makes

reference to the work of Hunt and Wanderley. His section on synchresis was

enlightening and he clearly illustrated its importance by explaining how audio and

visuals are combined in films. Typically film audio and visuals are recorded from

different sources (e.g. sound effects are added after the film visuals have been

recorded), however they are perceived by the audience as a single, fused audiovisual

event. Moody's hypothesis is that where synchresis is involved, it is motion, and the

domain in which the motion occurs, that forms the connection between audio and

visuals. Moody does however acknowledge that film based synchresis is simpler to

achieve than that required for an audiovisual musical instrument as there is no

concept of interactivity in watching films. A VMI on the other hand must also

consider how synchresis is to be achieved to connect the gestures of the performer to

the audio and visual feedback of the instrument. Moody partially addresses this issue

by developing 8 small experimental audiovisual instruments with differing control

parameters and evaluating them against a set of criteria he believed were key factors

for a successful audiovisual instrument. Knowledge from this development was

consolidated to develop the Ashitaka audiovisual instrument which was then also

evaluated against the key factors. Whilst his thesis has enabled the incorporation of

synchresis ideas into the design of Sound Spheres, Moody's conclusion as to how valuable or relevant synchresis is to a VMI is less than satisfying, mainly because evaluation of his experimental and final Ashitaka instruments was carried out only by himself.

2.4 Non-Contact Musical Interfaces

A study by Monaci, et al. (2007) proved very informative in understanding different

technologies applied to non-contact interfaces in general. It presents a technology

matrix summarizing the relevant characteristics of a wide variety of different device-

less technologies, as well as examples of the types of systems that use them. The

technologies are grouped into five categories: presence detection, proximity detection, gesture control, attention detection and speech recognition. Notably, infrared illuminators and/or infrared cameras are used in three of these categories (presence detection, gesture control and attention detection), and interestingly the Wiimote was not featured. Whether this omission was intentional or not, it suggests that awareness of the Wiimote as a means of device-less interaction is relatively recent.

There are relatively few musical interfaces that use non-contact methods of input and

those that exist often tend to be more akin to an art form than a controllable musical

instrument. Take for instance the Sound Sculpture created by Hegarty and Fernstrom

(2008) which uses electric field sensing (similar to the Theremin) to detect the

proximity and activity of people. Designed to be used in public places, the system

maps people's proximity and activity to various sound files and audio effects, and the

resulting sounds are amplified through a loudspeaker embedded in the Sound

Sculpture itself.

However, Franco (2005) presents a good example of a true non-contact VMI. The

AirStick is described as an instrument designed for musical expression and draws

parallels to the Theremin in that it is played "in the air". It is basically composed of a series of infrared proximity sensors. These sensors map the position of objects (typically the player's hands) that are placed above them. The x-y coordinates of the player's hands are mapped to sounds using real-time synthesis algorithms. The paper

gives technical details of the AirStick and describes two different approaches that

were used to trigger sounds. The first approach is described as sustained events,

where the note or sound being generated is sustained until the hand is removed. The

second is described as percussive events, where each new sound is triggered and then

decayed regardless of other new hand movements. I see the Sound Spheres VMI as

percussive in nature and hence sound generation takes the percussive events

approach.

Also of interest was the account of the initial user experience of the AirStick. Franco

tells us that players new to such an abstract environment quickly find some level of facility in playing the instrument, and in this sense the controller may be seen as too easy, almost to the point of being nothing more than a novelty. However, he is quick to point out that with persistence musicians find qualities that could enable genuine virtuosity. This indicates that this type of VMI could appeal to both non-musicians and musicians, from novices to experts.


Kapros and Raptis (2009) created a synthesizer which is controlled without physical

contact by body movement of the player. Their paper describes the design and

evaluation of the synthesizer. The particular technology used for the non-contact

interface was the processing of video input which processed the movement of the

hands and head into audio parameters. The paper details a number of audio

processing elements that can be controlled by the player. Although the elements of

tone, cutoff and decay are of specific interest and could have been important design

factors for the Sound Spheres VMI, they have not (due to the fixed project schedule)

been considered in its sound synthesis.

2.5 Finger Tracking

Finger tracking is a specific type of motion capture technique used to follow the

movement and/or the gestures of fingers in the air. In Vlaming's (2008) thesis a wide

range of motion capture techniques and systems are identified, which he classifies as

optical, inertial, mechanical and magnetic systems. For finger tracking we see it is

optical systems that are most prevalent. Vlaming describes in some detail different

types of optical system (passive markers, active markers, and marker-less). For

passive marker finger tracking, reflective markers are placed on the fingers which are

illuminated by a light source which is reflected by the markers back into a camera.

Basically no direct light is sent from the fingers. This has the advantage that the user

of the system does not need to wear any electronic device. With active marker finger

tracking the opposite is true, and direct light must be emitted from the fingers.

Marker-less systems use video capture systems to recognize the movement and

gestures of fingers. Vlaming makes no study or assessment between these different

types of systems. However his work is of interest because he provides a narrative of

his experience and study in using the Wiimote for finger tracking, and in doing so he

does highlight several minor problems he had with this approach, such as the need to

ensure both the horizontal and vertical alignment of the Wiimote are positioned

correctly for optimal results, and the issues found with the different type of reflective

materials he tried for the markers. Another common problem highlighted by Vlaming regarding gestural computer interfaces such as finger tracking is occlusion, where one gesture wholly or partially obstructs the view of another. Take for example a system that is operated through hand gestures. Without care it would be quite possible for the system operator to make a gesture with one hand that obstructs or overlaps a gesture made with the other. I have outlined how the design of the Sound Spheres VMI addresses the problem of occlusion in Section 4.8.

2.6 Wiimote

The use of the Wiimote for finger tracking is clearly important for my project;

however it is not the focus of the study itself. Its importance is twofold. Firstly it

must be acknowledged that inspiration for my project came from Johnny Lee's (2008) study "Hacking the Nintendo Wii Remote" and his compelling presentation at

the 2008 TED conference. Secondly, it is important because it provides an accessible

and affordable means in which to develop a new non-contact virtual musical

instrument that utilizes finger tracking.

Research into Wiimote technology and implementations is nevertheless still important. Developing a system such as a non-contact VMI without understanding the

technological possibilities and limitations of such a device is unwise. The study of

other Wiimote finger tracking implementations gives the opportunity to learn from

others' efforts and perhaps mistakes. More importantly, it was necessary to assess whether the accuracy of the finger tracking with the Wiimote would be sufficient for

the development of a new musical interface.

Softic (2009) gives a general understanding of the Wiimote technology and its set-up

for finger tracking and provides very good illustrations of how to setup a finger

tracking system utilizing an infrared illuminator and a Wiimote and compliments

Lee‟s paper by providing more technical details, especially on the specification of the

LEDs used in the LED array. He also provides a detailed account on how the infrared

illuminator is built. However, his paper really only provides a general description of

each of the components of a Wiimote based finger tracking system. There seems to

be no specific aim to the study and hence there was little that could be taken from his

work to be of use for the design of the Sound Spheres VMI interface.

A study by Vuong, et al. (2009) evaluates the accuracy of the tracking algorithm

used to position the Wiimote in 3D space. Using its built-in camera, the position and

orientation of the Wiimote in 3D space can be calculated by tracking four known

infrared LED positions and then relating these to the reported position of the

Wiimote‟s camera. The positions of the infrared LEDs reside in one coordinate

system and the position of the Wiimote‟s camera focal point resides in another. It is

possible to derive a mathematical expression to linearly transform the LED positions

from their coordinate system to that of the Wiimote‟s coordinate space, and

subsequently the 3D position of the Wiimote. Whilst not stated in the study this

process is commonly known as triangulation and is a principle used in 3D optical

systems to determine the spatial position and dimensions of an object. Vuong describes the mathematics for 3D geometric rotation (used for the tracking

algorithm) in detail. To determine the accuracy of the tracking algorithm the

Wiimote is compared against a commercial motion tracking system (Phasespace). By


comparing coordinate data from both the Wiimote and Phasespace, Vuong

determines the residual distances between the reported positions of the Wiimote and

Phasespace. The study concludes that the overall level of accuracy of the Wiimote

makes it suitable for most tele-immersion systems. It also concludes that there are a

number of parameters that prevent the Wiimote from being highly accurate. Firstly

the Wiimote's camera resolution is relatively low (1024 x 768), which contributes

to loss of precision. Secondly, the relative positions of the infrared beacons may not

be measured correctly. By looking at the photographs of the setup used by Vuong

one might suggest that this may well be the case. The LEDs appear to be held in

position by polystyrene and tape which may possibly allow movement of the LEDs

after initial measurement of the positions. Finally not all four infrared LEDs are

detected during the entire motion of the Wiimote. This may have been down to the

choice of the LEDs which differ in the beam angle of light radiance. Perhaps this

point could have been overcome by Vuong if he had investigated further into

different LED technologies and tried different types of LED in the study.

Wang and Huang (2008) also use a triangulation method to explore the potential of

the Wiimote to track objects in 3D space in terms of accuracy and processing

throughput. Their study differs from Vuong‟s in two aspects. Firstly two Wiimotes

are used for stereo triangulation. Secondly, it is the LEDs that are moved in 3D space

instead of the Wiimote itself. Again the mathematics of their tracking algorithm is

well explained. The study does not however give any specific accuracy level, and

simply concludes that the Wiimote is suitable for 3D motion tracking.
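To make the geometry concrete, the following sketch (Python, using a simple pinhole-camera model; the focal length and baseline figures are illustrative assumptions, not values taken from either study) shows how two horizontally offset cameras can recover a 3D position from the disparity between the two reported image positions of a single infrared point.

    # Minimal stereo-triangulation sketch with a pinhole-camera model.
    # Two cameras (e.g. two Wiimotes) sit side by side, separated by a
    # known baseline. All constants are illustrative assumptions.

    FOCAL_LENGTH_PX = 1380.0   # assumed focal length, in pixels
    BASELINE_M = 0.30          # assumed horizontal distance between cameras

    def triangulate(x_left, x_right, y_left):
        """Recover (x, y, z) in metres for one infrared point seen by both
        cameras. Image coordinates are in pixels, measured from each
        image's centre; x_left/x_right are horizontal, y_left is vertical."""
        disparity = x_left - x_right          # shift between the two views
        if disparity <= 0:
            raise ValueError("point must lie in front of both cameras")
        z = FOCAL_LENGTH_PX * BASELINE_M / disparity   # depth from disparity
        x = x_left * z / FOCAL_LENGTH_PX
        y = y_left * z / FOCAL_LENGTH_PX
        return x, y, z

    # A point seen 276 px apart in the two images lies 1.5 m away:
    print(triangulate(100.0, -176.0, 50.0))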

The studies by both Vuong, and Wang and Huang raised concerns regarding whether the proposed set-up and use of infrared LEDs and the Wiimote for the Sound Spheres VMI would be accurate enough to provide the necessary control when playing
the VMI. As with Lee's demonstration of Wiimote motion tracking, the Sound Spheres system will utilize a single Wiimote and will track up to four markers reflecting infrared light. At any point in time the system will be tracking from zero to four markers (depending on the performer's gestures), and each marker moves independently of the others. Positioning is therefore only ever carried out between a single marker and one Wiimote, so triangulation cannot take place and it is only possible to determine the x and y coordinates of each marker.

significant problem though, due to the fact that the Sound Spheres VMI only tracks

the x and y coordinates of each marker anyway. However, the system may

experience differences in calculating x and y movements of the markers if for

example the player's hands are set at different distances from the Wiimote. Early

prototyping confirmed that the motion tracking of the Wiimote seems to be adequate

for Sound Spheres.

The use of two Wiimotes for finger tracking has been studied by Martin (2010). His

project is very similar to the Sound Spheres project in that he also implements finger

tracking for sound generation. Specifically he seeks to explore a bi-dimensional

sound feature space by means of open hand gestures. There are a number of

noteworthy differences however. Firstly, Martin's system involves tracking fingers in a 3D space and hence uses triangulation and two Wiimotes. Secondly, he has opted not to use finger markers but instead to place infrared LEDs on the tips of the fingers (held in place by a glove). He cites that this approach worked better than the use of reflective markers on the fingers. Interestingly, he also acknowledges that as fingers are bent, the LEDs are sometimes no longer tracked. This is because the beam angle of the LEDs is not sufficiently wide for them to be continuously picked up

by the Wiimote camera.

Martin's work provided important input for the design of the Sound Spheres VMI. It

was considered that bending the fingers while interacting with the Sound Spheres

VMI would be a key movement in the way that performers played the instrument and

perhaps the use of reflective markers at the fingertips would present issues similar to those faced by Martin during finger tracking. To overcome this, the reflective marker was

stuck to a convex shaped cover for the finger tip which allows for a more continued

reflection of the transmitted infrared light back to the Wiimote camera as the finger

is bent.

2.7 Research Question

A key theme presented in both the project background and literature review is the

importance of the control parameters position, speed, pressure and angle for new

musical interfaces. I hypothesise that implementation of these control parameters can

be successfully achieved for a non-contact virtual musical instrument.

The literature review also revealed other important factors (i.e. playability,

progression, control, predictability, balance between challenge, frustration and

boredom, and reproducibility) for a musical instrument to be considered as effective.

I hypothesise that these factors can be applied to a non-contact virtual musical

instrument and hence finger tracking could be considered an effective method for

playing such an instrument.

This dissertation attempts to answer the following research question:-

Can implementation of the control parameters of position, speed, pressure and angle

be successfully achieved for a non-contact virtual musical instrument, and is finger

tracking an effective non-contact technique in which to play a virtual musical

instrument?

2.8 Summary

The Sound Spheres project essentially involves the design and construction of a new

non-contact virtual musical instrument, Sound Spheres, and a study on its playability.

This literature review identifies important factors, control parameters, control-to-

sound mapping strategies and design considerations for musical instruments in

general, new electronic musical interfaces and more specifically VMIs. It also

considers how these might be achieved for a non-contact VMI.

The non-contact element of the Sound Spheres VMI will be implemented using

finger tracking and will utilize the Wiimote game controller to provide the finger

tracking mechanism. For further design consideration the literature review identifies

finger tracking techniques and challenges, and discusses technical issues associated

with the Wiimote.

Chapter 3 Research Methods

This project essentially involves both the design of a new virtual musical

instrument and a user study of its musical potential. Different research

methods were needed for each of these project goals. This section describes

the research techniques and methods used.

3.1 Research Techniques

Four stages to the project's research methodology have been identified, with each having a number of research methods applied. These stages are: literature review, pilot study (prototyping), user study and data interpretation.

This methodology is very similar to that used by Paine (2005) in the development

and assessment of the Thummer electronic musical interface, in which he employed

both prototyping and a user study during its development. Much of the development

was focused on the mapping of the instrument's various control parameters to sound

synthesis. Prototyping also played an important role here. The mapping of control

parameters is equally central to the Sound Spheres VMI and prototyping of the

mapping will also be included in the project. For user testing of the resultant control

mappings Paine conducted a series of sessions where a selected group of performers

would be observed using the instrument and their feedback was recorded using

cognitive interviews.

The design, construction and testing of the Sound Spheres system took place in the

literature review and pilot study phases. The system was then evaluated in the user

study and data interpretation stages.

An overview of the research stages, applicable methods and the type of data to be

collected is shown in figure 1. Both qualitative and quantitative elements to the

methodology have been incorporated. A more detailed discussion of each research method follows.

[Figure 1 is a flowchart of the four research stages: Literature Review (problem definition; literature search; literature digest and review), Pilot Study (design/construct/test; prototype review), User Study (observation/quasi-experiments; interviews; questionnaire) and Data Interpretation (data analysis; presentation of results), together with the research data each stage produces (participant comments, review notes, observation and interview notes, video recordings and completed questionnaires).]

Figure 1 – Overview of Research Methods

3.2 Research Detail

3.2.1 Literature Review

This is a key research method and was used as input to the system‟s design and to

identify appropriate methods for its assessment. There are many aspects to the

system's design that have benefited from literature review, such as understanding the

importance of sound synthesis mappings and gaining a better understanding of the

workings, algorithms and limitations of the Wiimote. It has enabled best practice and

current trends in user interface design for virtual musical instruments to be


considered, and has helped to identify problems experienced by others in the design

of similar systems.

3.2.2 Pilot Study

The approach for the pilot study was as follows:-

• The system was initially designed using knowledge obtained from the problem domain and literature review.

• The system was then constructed (both hardware and software) and the first prototype of the system produced.

• An initial prototype review session was conducted with three selected participants. This was deemed to be a sufficient number to acquire a good range of comments, views and recommendations. Too many participants would likely extend the duration of the review session (through more discussion and debate) and not necessarily be more constructive. One complete day was set aside for the review and no specific time duration was set. The review session was in the form of a demonstration, followed by free play by each of the participants and then a group discussion. All participant comments were noted for potential inclusion in the second prototype. The primary aims of this session were:-

  • To ensure that the general system set-up and interface are usable.

  • To review the implementation of the control parameter mappings and discuss possible alternatives.

  • To review the implementation of audio and visual feedback mappings and discuss possible alternatives.

• The system was then modified to incorporate selected ideas and to fix any defects discovered from the first prototype review. A second prototype system was produced.

• A second prototype review session was conducted with the same three people who participated in the first review session, and it followed the same format. The primary aims of this session were to:-

  • Review the revised implementation of the control parameter mappings and discuss possible alternatives.

  • Review the revised implementation of audio and visual feedback mappings and discuss possible alternatives.

  • Validate that the system is sufficiently free of errors to be ready for further evaluation.

• Finally, the system was modified to incorporate participant comments and to fix any further defects discovered from the second prototype review.

3.2.3 User Study

A user study of the Sound Spheres VMI was conducted and it catered for both the

assessment of the implementation of control parameters and the effectiveness

afforded by the finger tracking technique. It included research methods of

observation, quasi-experiments, interviews and a questionnaire. During the

observation, quasi-experiments and interview methods of the study, a camcorder was

used to record the proceedings for later playback during the data interpretation phase.

The approach for the user study was as follows:-

• Eight participants were selected to take part in the user study, including the three participants who took part in the prototype review. This enabled assessment of whether the skills of the prototype participants had improved. Tauber et al. (2005) conclude that 3 to 8 participants yield the most useful results. Participants were from a range of age groups and both musicians and non-musicians were included. The profile of these participants is shown in Table 1.

Participant | Age | Participated in Prototype Reviews? | Musician?
1 | 15 | No | Yes
2 | 17 | No | Yes
3 | 47 | No | No
4 | 43 | No | Yes
5 | 40 | Yes | Yes
6 | 9 | No | Yes
7 | 42 | Yes | No
8 | 25 | Yes | No

Table 1 – User Study Participant Profile

• A user study session was conducted for each participant in isolation. Each session lasted approximately 100 minutes (just over 1½ hours) and was structured into stages, each of a set period of time, as shown in Figure 2.

• The session started with a basic introduction to the system and how to use it. Each participant was given the same introduction and instructions.

• The participant was then given a period of free play to see how they initially interacted with the system and to observe their path of discovery.

• Further instruction was given to formally show the participant how the position, speed, pressure and angle controls can affect musical outcomes (they may have already worked this out for themselves in the free play section, of course).

• The session continued with structured play (quasi-experiment), with the participant attempting to manipulate sound by using each control parameter in turn.

• Next the participant was given an additional period of free play, to assess how they then performed with the knowledge they had gained from the structured play and to see how they might make use of the combination of control parameters.

• Finally a reproducibility test was conducted where the participant was asked to compose a simple tune, using the controls to add expression as desired. They were asked to repeatedly reproduce the tune with the same expression.

The user study research methods used at each of these stages, along with the planned stage durations, are shown in Figure 2.

Stage | Duration (minutes) | Research Method
1. Basic Instruction | 5 |
2. Discovery and Free Play | 15 | Observation
3. Interview 1 | 10 | Interview
4. Control Instruction | 5 |
5. Structured Play - Position | 5 | Observation
6. Structured Play - Speed | 5 | Observation
7. Structured Play - Pressure | 5 | Observation
8. Structured Play - Angle | 5 | Observation
9. Interview 2 | 10 | Interview
10. Discovery and Free Play | 15 | Observation
11. Reproducibility Test | 10 | Observation
12. Structured Interview | 10 | Questionnaire

Figure 2 – User Study Research Methods and Stages

3.2.4 Observation

This qualitative method (commonly used in user studies) focuses on observing how

people adapt to and perform with the system. This approach has recently been used

in the evaluation of an interactive music performance system also utilizing the

Wiimote (Wong, L. et al. 2008).

An approach to evaluating a VMI taken by Johnstone et al. (2008) was to give

participants freedom to use the software in any way they wished and to make music

with it to explore its full potential. Their user study involved observing participants

during free-play of the instrument. During observation, participants were encouraged

to think-aloud in order to gain insight into their experience. This think-aloud

approach to observation is also appropriate for the Sound Spheres VMI.


3.2.5 Quasi-Experiments

Whilst free-play of the system may be an appropriate technique to observe how

participants generally interact with the system, it would not specifically address a

more focussed review of how they interacted with the control parameters of position,

speed, pressure and angle. In this regard periods of structured-play have been

included in which participants were asked to explore each of the control parameters

in turn, so that their interaction with them could be observed in isolation. This means

that questions relating to the control parameters in the structured interview should

result in more meaningful answers from participants. The structured-play

methodology is essentially the quasi-experiment method, where participants are

constrained to specific tasks to provide some basis for comparison. The same think-

aloud observation technique has been applied here also.

The studies by Jorda (2004b) and Ferguson and Wanderley (2009) highlight

reproducibility as an important factor for digital musical instruments and I have

previously suggested that this may be one way of assessing the Sound Spheres VMI.

Thus, a simple reproducibility test (as described in section 3.2.3) has been

incorporated into the user session. The observation technique was applied as they played. To focus concentration on the task, the participants were asked not to converse while they did this.

3.2.6 Interviews

Conducting cognitive interviews with users after using the system enabled the

capture of participants' thoughts and feelings and helped contextualize observations

made whilst they were using the system. This research method is commonly used

when performing user studies, and has recently been used by Kiefer (2010) when
evaluating the Phalanger gesture recognition system. As participants were of

different ages (maturity level), musical abilities and experience with interactive

systems, the terms of reference and understanding between participants were not

consistent. The cognitive interview process therefore catered for this difference in

domain knowledge. Paine (2005) also cites this as a reason for introducing cognitive

interviews during the user studies of the Thummer electronic musical instrument.

The strengths and weaknesses of interviews reside in the interaction between the

interviewer and respondent. Considering the difference in ages, musical ability and

experience with interactive systems of the planned participants the type of interaction

with them is also likely to differ. There is therefore a potential for bias or distortion

in the interview responses. This bias was considered carefully during data analysis

and to some extent is offset by the fact that other research methods (such as a

questionnaire) were used.

The output of the interview process was a set of interview notes and a video

recording of each interview.

3.2.7 Questionnaires

The use of questionnaires as a research method was recently adopted for a project

that developed several Theremin based 3D-interfaces (Geiger, C., et al. 2008). For

this project participants were asked to complete a questionnaire (see Appendix B)

consisting of 49 questions that captured data on both the playability of the system

and its control parameters. The questionnaire allowed responses from the participants

to be directly compared, contrasted and analyzed.

Walonick (2004) presents a very informative paper on designing and using

questionnaires and includes useful considerations and good guidelines on designing

and wording questions. For example, the advice to group related questions and to

ensure that each question asks for an answer on only one dimension has been

followed.

The questionnaire (and interview) questions were designed to collect data in the

following areas:-

• Control parameters of position, speed, pressure and angle.

• Playability, progression, control, predictability, reproducibility and expression.

• Balance between boredom and challenge.

• Playing position, posture and fatigue.

• Audio and visual feedback (including synchresis).

Of the 49 questions, 39 asked the participant to respond using a 5-point Likert rating scale (strongly disagree, disagree, neither agree nor disagree, agree, and strongly

agree), thus providing quantitative data to which statistical analysis could be applied

(e.g. 75% of participants thought that the mapping of sound synthesis to the position

control parameter was appropriate). One could argue that the wording of the responses implies symmetry around the middle response (neither agree nor disagree) and hence treat the data as interval-level data. However, it is widely believed that the propensity to move from neutral to agree or disagree is greater than the propensity to move from agree to strongly agree (or disagree to strongly disagree). Therefore, this project has treated the

Likert scale data as ordinal. The scoring used a bipolar scale of -2, -1, 0, 1, 2 instead

of the more popular form 1, 2, 3, 4, 5. This provides an easy means of identifying

whether the central tendency (mode and median) corresponds to disagreement or


agreement (negative values versus positive values). The majority of Likert scale

questions were worded such that positive responses (agree and strongly agree) were

in support of the hypothesis to the research question. Three questions were in reverse

of this, where the responses "disagree" and "strongly disagree" are viewed as

positive.

The remaining questions served to identify the participant and their ability to play

and read music, rank control parameters based on ease of use and importance to

musical outcomes, and finally to ask for general comments about what participants

liked most or least about the Sound Spheres.

The quantitative data captured from the questionnaire was used to support or refute conclusions drawn from the qualitative data captured.

3.2.8 Other Research Methods

One approach that was used for evaluating expressive musical interfaces by Stowell,

et al. (2008) was discourse analysis. They report that the discourse analysis method

can derive detailed information about how musicians interact with a new musical

interface. However, considering the fixed project timescale and the fact that I am applying a variety of research methods (both qualitative and quantitative), the discourse analysis method has not been used, as it is time-intensive and would be unlikely to help draw further conclusions.

Both card sorting and participatory design research methods had been considered

but these options were ruled out. The requirements for the Sound Spheres application

have not been generated by a known set of users, and hence I believe capturing user

preferences is not appropriate in this case.

An agile/iterative approach to the prototyping stage of development was also

considered, perhaps by prototyping and reviewing how the system will implement

each of the control parameters (position, speed, pressure and angle) in turn. There

were two concerns with this approach. Firstly, the project has a fixed timescale and

the project plan allowed a relatively short time for the design and development of the

system, making it difficult to incorporate four prototype reviews into the schedule.

Secondly, reviewing the implementation of these controls in isolation does not help

identify any issues of usability when two or more of them are combined.

Nevertheless, prototyping is an essential part of system development and hence two iterations of the prototype-review cycle were included. These iterations were not structured entirely around the control parameters. Other aspects of the system

were equally important to assess during the prototype review (such as layout, visual

feedback, audio feedback, interaction with the Wiimote, etc).

3.3 Preliminary Analysis of Research Data

Much of the data collected is qualitative and as such a grounded theory methodology

has been used in preparing this data for analysis which is carried out in two stages.

Firstly the data has been thoroughly read through and in the case of the video

recordings, transcripts of participants' comments were made (along with notes to put

comments into context). Secondly, the key points from each data source were then

coded, conceptualized and categorized which allowed comparison analysis across the

data to discover similarities and differences between the data sources. This

methodology has also been used for the non-Likert scale questions from the user

study questionnaires.

Several approaches were used to analyse the Likert scale questions from the user

study questionnaires. Responses to the questionnaire were collated and scored for

statistical analysis. A series of bar charts were also produced to graphically show the

results. The median and mode were determined for each question to measure central tendency. The mean and standard deviation were not considered due to the ordinal

nature of the data. The responses for each question were also nominally grouped into

positive and non-positive categories. Summation of the responses was not seen as

appropriate as the data is ordinal.

A non-parametric method for statistical dependence was used for determining

correlation. In this case the Spearman’s rank correlation method was used to

determine the relationship between 57 pairs of questionnaire responses. The question

pairings aimed to discover factors affecting challenge, frustration, the facilitation of music creation, visual feedback and synchresis.

A chi square test for the questions that specifically relate to control parameters was

considered to test the hypothesis that implementation of these parameters can be

achieved for a non-contact VMI. This test was not used as the sample data was not

sufficiently large.

A non-parametric method for statistical hypothesis tests was used. In this case,

because the samples were small the Mann-Whitney U Test was used. A comparison

was made between musicians and non-musicians, and between those who did and did

not participate in the prototype reviews. Perhaps, for example, those who participated

in the prototype reviews found control of the Sound Spheres VMI easier, suggesting that progression can be improved by practice over time.

Two questions on the user study questionnaire asked participants to express a

preference by ranking the control parameters from 1 to 4. An overall ranking was

determined by scoring each response and totalling the scores for each control

parameter. The control parameter with the highest score was ranked first.
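As an illustration of this analysis pipeline, the sketch below (Python with SciPy; all response values are invented for illustration and the variable names are hypothetical) brings together the bipolar scoring, the central tendency measures, the Spearman and Mann-Whitney tests, and the rank-totalling scheme just described.

    from statistics import median, mode
    from scipy.stats import spearmanr, mannwhitneyu

    # Bipolar Likert scoring: -2 strongly disagree .. +2 strongly agree.
    # Hypothetical responses from the eight participants to two questions.
    q_easy_to_control = [1, 2, 0, 1, -1, 1, 2, 1]
    q_felt_frustrated = [-1, -2, 0, -1, 1, -1, -2, -1]

    # Central tendency for ordinal data: median and mode, not the mean.
    print(median(q_easy_to_control), mode(q_easy_to_control))

    # Spearman's rank correlation for one of the 57 question pairings.
    rho, p = spearmanr(q_easy_to_control, q_felt_frustrated)

    # Mann-Whitney U test: musicians vs non-musicians on one question.
    musicians = [1, 2, 1, 1, 2]
    non_musicians = [0, 1, -1]
    u, p_u = mannwhitneyu(musicians, non_musicians, alternative="two-sided")

    # Ranking questions: rank 1 is best of 4, so score each rank as
    # 5 - rank and total the scores; the highest total is ranked first.
    ranks = {"position": [1, 2, 1], "speed": [2, 1, 2],
             "pressure": [4, 4, 3], "angle": [3, 3, 4]}
    totals = {name: sum(5 - r for r in given) for name, given in ranks.items()}
    print(max(totals, key=totals.get))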

Chapter 4 System Design

4.1 Overview

This section presents the functional design of the Sound Spheres VMI. The design

and construction of the Sound Spheres system took into account my inexperience in graphics-based programming and the project's fixed timescale. In this regard, functionality was limited to directly addressing the research question. The final version of the system was

greatly influenced by outcomes of the two prototype review sessions as described in

Section 4.10.

An object-oriented approach was taken for the software component of the VMI. The

system's software has been developed using Microsoft's Visual Basic .NET object-oriented programming language and utilizes Microsoft's DirectX runtime libraries

for its graphics implementation. For handling and interpreting data from the Wiimote

the software application uses a .Net managed library developed by Peek (2009).

A series of videos have been created to demonstrate the Sound Spheres VMI user

interface and implementation of control parameters. Details on how to access these

videos can be found in Appendix M.

4.2 User Interface

This section gives an overview of the user interface components.

4.2.1 Tracking and Sound Spheres

Unlike some finger tracking applications, complex hand or finger gestures to interact

with the software have been avoided. Instead the finger tips are used as passive

markers, effectively providing a series of moving pointers. These pointers are

represented on the user interface as small spheres (tracking spheres), whose movements are used to trigger sounds and effects through collision with a set of

larger spheres (the sound spheres).

Only four points can be simultaneously tracked with the Wiimote, and hence the system is limited to a maximum of four tracking spheres.

Each of these sound spheres is assigned a unique sounding musical note. The

collision of the tracking spheres with the sound spheres plays back the assigned

sound, and hence the user is able to play the VMI rather like a percussive (e.g. drum,

xylophone) or keyboard based instrument. Three sound possibilities were provided

for the prototype reviews, one of which was eventually selected for the usability

study. A sound with unique qualities was sought so that it would not be recognizable as another instrument (a piano, for example). A MIDI controller and synthesiser were

used to create distinct musical sounds for the individual notes to be assigned to each

of the sound spheres.
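The trigger mechanism reduces to a standard sphere-overlap test. The sketch below is a minimal illustration in Python (the actual system is implemented in Visual Basic .NET, and the coordinates and radii shown are invented): two spheres collide when the distance between their centres is no greater than the sum of their radii.

    import math

    def spheres_collide(centre_a, radius_a, centre_b, radius_b):
        """True when two spheres overlap: the distance between their
        centres is no greater than the sum of their radii."""
        dx = centre_a[0] - centre_b[0]
        dy = centre_a[1] - centre_b[1]
        dz = centre_a[2] - centre_b[2]
        return math.sqrt(dx * dx + dy * dy + dz * dz) <= radius_a + radius_b

    # A tracking sphere grazing a sound sphere:
    print(spheres_collide((1.0, 2.0, 0.0), 0.2, (1.3, 2.1, 0.0), 0.5))  # True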

4.2.2 Sound Sphere Layout

The presentation (or layout) of the sound spheres on the screen is an important

design consideration and will affect the playability of the VMI.

Both the tracking spheres and sound spheres have been aligned on the x-y plane (i.e. z = 0), as shown in a perspective view in Figure 3.

Figure 3 – Axis Alignment in Perspective View

Note however, the actual display on the screen is an orthogonal view as shown in

Figure 4.

Figure 4 – Axis Alignment in Orthogonal View

In this layout one might question why spheres are used rather than circles (i.e. 3-dimensional objects over 2-dimensional ones). The reason is to provide visual feedback using spin. This is discussed further in Section 4.4.

The sound spheres are arranged in two rows, each comprising the 12 notes of an

octave as shown in Figure 5. To differentiate the natural notes from the sharp notes,

different size sound spheres are used. This type of visual differentiation is used in

many traditional musical instruments. For example, a piano's natural and sharp notes

are differentiated using black and white keys, as well as differences in key size.

Similarly a glockenspiel uses size and position of its bars for natural and sharp note

differentiation.

The two rows of sound spheres also correspond to two different octaves, one octave

apart.

Figure 5 – Illustration of User Interface
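As an illustration of how such a layout might be represented in software, the following Python sketch builds the two rows; the MIDI note numbers, starting octave and radii are illustrative assumptions rather than the values used in the actual system.

    # Two rows of 12 semitones, the rows one octave apart; sharps are
    # smaller than naturals, echoing a piano's black and white keys.
    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
                  "F#", "G", "G#", "A", "A#", "B"]

    def build_layout(base_midi_note=60):          # 60 = middle C (assumed)
        spheres = []
        for row in range(2):                      # lower and upper octave
            for column, name in enumerate(NOTE_NAMES):
                spheres.append({
                    "row": row,
                    "column": column,
                    "midi_note": base_midi_note + 12 * row + column,
                    "radius": 0.35 if "#" in name else 0.5,
                })
        return spheres

    print(len(build_layout()))   # -> 24 sound spheres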

4.3 Implementation of Control Parameters

This section describes how each of the control parameters of position, speed,

pressure and angle has been implemented for the Sound Spheres VMI.

4.3.1 Position

When a tracking sphere collides with a sound sphere, the horizontal distance between the point of collision and the central line of the sound sphere is determined (as illustrated

in Figure 6). The sound generated at the point of collision is then adjusted dependent

on this distance (see Section 4.5).

[Figure 6 shows a tracking sphere striking a sound sphere, labelling the point of sphere collision and the horizontal distance between that point and the sound sphere's central line.]

Figure 6 - Illustration of the Position control parameter
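A minimal sketch of this calculation follows (Python; normalising the offset by the sound sphere's radius is an assumption about how the distance is scaled). The resulting value is what Section 4.5 maps to stereo panning.

    def pan_from_position(collision_x, sphere_centre_x, sphere_radius):
        """Map the horizontal offset of the collision point from the sound
        sphere's central line to a stereo pan value in [-1.0, +1.0]."""
        offset = collision_x - sphere_centre_x
        return max(-1.0, min(1.0, offset / sphere_radius))

    # A strike half a radius left of centre pans halfway to the left:
    print(pan_from_position(4.75, 5.0, 0.5))   # -> -0.5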

4.3.2 Angle

When a tracking sphere collides with a sound sphere, the angle between three

points is determined. Point 1 is the center of the tracking sphere at the start of its

movement towards the sound sphere. Point 2 is the center of the tracking sphere at

the point of collision with the sound sphere. Point 3 is a point anywhere on the

horizontal line at the point of collision. The sound generated is then adjusted dependent on the angle between these points. This is illustrated in Figure 7.

This type of action can often be seen when a drummer strikes a cymbal for example.

A change in the angle at which the drummer chooses to strike the cymbal with the

drumstick will produce a different sound. Sometimes the player will use a very large

angle and appear to brush the drumstick over the surface of the cymbal and

sometimes a more direct hit is executed with widely different sounds being

generated.

Note: a tracking sphere colliding with a sound sphere at the same position can yield a different sound depending on the starting position of the tracking sphere. This

enables musical expression by swiping fingers in different ways.

Figure 7 - Illustration of the Angle control parameter

Determining the starting point of the tracking sphere was a challenge. The transition

of a tracking sphere from playing one note to the next will frequently involve continuous motion, and hence the point at which the movement starts to play

a new note is difficult to ascertain.

When playing a traditional percussive instrument such as a xylophone or steel drums,

or even a stringed instrument like the piano, the movement of the striking object (be

it a mallet, stick or fingers) from one note to the next is rarely linear. A player

generally lifts the object from one striking position before they start the movement to

make another strike. With this in mind, the tracking sphere's starting position was

determined by the point at which the movement changes from a positive direction in

the y-plane to a negative one (i.e. the point at which a downward movement starts

following an upward movement).
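The following Python sketch illustrates this turning-point logic together with one plausible reading of the three-point angle definition (treating the angle as the elevation of the stroke's start point above the horizontal line through the collision point). Both the class structure and that reading are assumptions for illustration, not the system's actual code.

    import math

    class StrokeTracker:
        """Follows one tracking sphere and remembers where its current
        downward stroke began (the upward-to-downward turning point)."""

        def __init__(self):
            self.prev = None          # previous (x, y) position
            self.moving_up = False
            self.start = None         # start of the current stroke

        def update(self, x, y):
            if self.prev is not None:
                going_up = y > self.prev[1]
                # a downward move that follows an upward move marks the
                # start of a new stroke
                if self.moving_up and not going_up:
                    self.start = self.prev
                self.moving_up = going_up
            self.prev = (x, y)

        def strike_angle(self, cx, cy):
            """Angle (degrees) between the stroke's start point, the
            collision point and the horizontal line through it."""
            sx, sy = self.start
            return math.degrees(math.atan2(sy - cy, abs(sx - cx)))

    tracker = StrokeTracker()
    for pos in [(0.0, 1.0), (0.2, 1.4), (0.5, 0.6)]:   # up, then down
        tracker.update(*pos)
    print(round(tracker.strike_angle(0.5, 0.6)))       # -> 69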

4.3.3 Speed

When a tracking sphere collides with a sound sphere the average speed of the

tracking sphere is determined as…

average speed = distance between start position and collision position / (t2 – t1)

…where t1 is the time at the starting position of the tracking sphere and t2 is the time

at the collision point.

The sound generated at the point of collision is then adjusted dependent on the

average speed. This is illustrated in Figure 8.

Figure 8 - Illustration of the Speed control parameter
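In code the calculation is direct; the sketch below (Python, with invented positions and timestamps) computes the average speed from the same stroke start point that the angle parameter uses.

    import math

    def average_speed(start_pos, t1, collision_pos, t2):
        """Average stroke speed: distance from the stroke's start point
        to the collision point, divided by the elapsed time (t2 - t1)."""
        dx = collision_pos[0] - start_pos[0]
        dy = collision_pos[1] - start_pos[1]
        return math.hypot(dx, dy) / (t2 - t1)

    # A 0.5-unit stroke completed in 0.1 s has an average speed of 5.0:
    print(average_speed((0.2, 1.4), 0.0, (0.5, 1.0), 0.1))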

One might ask how it is possible to control a low speed when having to

quickly move a tracking sphere from one sound sphere to another to play in time to a

musical piece. This problem, however, exists in many traditional instruments and

does not prohibit the control parameter being of importance. Again, consider the

xylophone. The volume of sound generated is very much dependent on the speed with which the player strikes the bars with the mallets. The mallets may well have to be moved quickly from one note to another in order to play in time to a piece of

music, but the player is still able to exert a degree of control in the resulting volume.

This is similar to the playing of a piano.

The issue of control therefore becomes more to do with the skill of the player and the

choice of how the next note in the piece is executed. With the Sound Spheres VMI

the player has a choice of playing any subsequent note with any of the four tracking

spheres and hence perhaps the distance they need to travel can be minimized. A

pianist would take this approach with the choice of fingering.

One must also bear in mind that in reality a small movement of the fingers produces a large movement of the tracking spheres (sensitivity), and hence the fingers do not need

to actually move great distances.

A more difficult aspect to the speed concept as presented above is to determine the

starting position of the tracking sphere (as described for the angle control in section

4.3.2).

4.3.4 Pressure

In a non-contact environment the pressure control parameter presented a design

challenge. For this project pressure was considered from a perspective of momentum.

Momentum is defined as the product of an object's mass and its velocity.

Consider two objects with different masses travelling at the same velocity. In this

case their momentum differs. If they were both to collide with the same surface, the one with the larger mass would exert more pressure on the surface.

In the virtual world tracking spheres obviously have no mass. However they do have

size! As each tracking sphere is the same type of object we can reasonably conclude

that tracking spheres with the same size would also have the same mass. We could

then conclude that a large tracking sphere would exert a greater pressure on a sound

sphere than a smaller tracking sphere travelling at the same velocity.

In other words, by varying the size of a tracking sphere we vary the pressure being

applied to a sound sphere during collision.

To implement the ability to dynamically and quickly change the size of the tracking

spheres the user interface displays a visual component that I call a pressure control.

A pressure control has been placed on either side of the user interface so it can be

quickly accessed by tracking spheres controlled by either the player's right or left

hand. The pressure control has two circular surfaces, one containing an upwards

facing arrow representing increasing pressure and one a downward facing arrow

representing decreasing pressure. This control will gradually increase or decrease

the size (and hence the implied pressure) of all tracking spheres when the center

point of one of the tracking spheres is positioned over one of the pressure control's

surfaces.
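A minimal sketch of the control's behaviour follows (Python; the per-frame growth rate and the size limits are illustrative assumptions, not the values used in the system).

    GROW_PER_FRAME = 0.005      # assumed change in radius per frame
    MIN_RADIUS, MAX_RADIUS = 0.1, 0.6

    def update_pressure(radii, over_increase, over_decrease):
        """Grow or shrink every tracking sphere a little each frame while
        any tracking sphere's centre sits over a pressure-control surface."""
        if over_increase:
            delta = GROW_PER_FRAME
        elif over_decrease:
            delta = -GROW_PER_FRAME
        else:
            delta = 0.0
        return [min(MAX_RADIUS, max(MIN_RADIUS, r + delta)) for r in radii]

    print(update_pressure([0.2, 0.3], over_increase=True, over_decrease=False))
    # -> [0.205, 0.305]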

4.4 Visual Feedback

Figure 9 shows the sensory feedback loops for both contact and non-contact VMIs. It

can be seen that the non-contact sensory loop can only provide feedback to the user

using audio and vision. Thus visual feedback is an important factor to the design of

the Sound Spheres VMI.

Figure 9 - Sensory Feedback Loops

The Sound Spheres VMI provides visual feedback to the player when the tracking

spheres collide with sound spheres in several ways.

Firstly, graphics are displayed at the point of each collision. A graphics particle

engine was implemented to display a set of flying sparks at the point of collision.

The direction and dispersal of the sparks are dependent on the position of the colliding

tracking sphere. This is illustrated in Figure 10.

Figure 10 - Sphere Collision Sparks

Secondly, when a tracking sphere collides with a sound sphere, the sound sphere vibrates as if it were mounted on a spring. The vibration diminishes over time until it stops. The direction of the vibration is always up and down. Consideration was given as to whether the direction of vibration should also depend on the angle; however, this was not implemented, as the sound spheres are placed close together and any sideways vibration would result in their collision.

Thirdly, when a tracking sphere collides with a sound sphere, the sound sphere spins

around its horizontal axis. The initial speed of spin is dependent on the speed of the

colliding tracking sphere, and the speed of rotation diminishes over time until the

spinning stops. So that the speed of spin is readily apparent to the user graphical

textures have been placed on the sound spheres.

The vibration and spinning of the sound spheres are not related to any specific control parameter and denote collision with a tracking sphere only. It was felt that two elements of visual feedback provide a stronger reference and better compensate for the lack of physical contact.
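Both decaying behaviours can be modelled with simple exponential damping. The sketch below (Python; the damping constants, frequency and example values are illustrative assumptions) shows one way the vibration offset and spin speed might be computed at time t after a strike.

    import math

    def vibration_offset(t, amplitude, freq_hz=6.0, damping=3.0):
        """Vertical displacement of a struck sound sphere t seconds after
        the collision: a damped sine that dies away to nothing."""
        return amplitude * math.exp(-damping * t) * math.sin(2 * math.pi * freq_hz * t)

    def spin_speed(t, initial_speed, damping=1.5):
        """Rotation speed t seconds after the strike; the initial value is
        proportional to the colliding tracking sphere's speed."""
        return initial_speed * math.exp(-damping * t)

    print(round(spin_speed(1.0, 10.0), 3))   # -> 2.231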

4.5 Control to Sound Synthesis Mappings

For ease and speed of implementation, the sounds generated by the Sound Spheres

VMI use a set of pre-recorded sound wave files and Microsoft's DirectSound

technology is used to load and play these back. This technology also provides the

ability to apply effects to sounds as they are played back. Each effect has parameters

used to modify the sound generated further.

Four specific effects have been selected, and a mapping between these and the control parameters (position, speed, pressure and angle) has been applied as shown in

Table 2.

Control Parameter | Effect
Position | Stereo Panning. The sound is increasingly panned to the left or right speaker dependent on the position of collision.
Speed | Volume. A greater speed results in a higher volume.
Pressure | Parametric EQ. A greater pressure results in a tone where the higher frequencies are boosted.
Angle | Chorus. An acute angle results in a chorus effect with a greater degree of modulation than a less acute angle.

Table 2 – Control Parameter Mapping
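The following Python sketch shows how these mappings might be wired together; play_note is a hypothetical stand-in for the audio back-end (DirectSound in the real system), and the scaling factors are illustrative assumptions. Because an acute angle gives more modulation, the chorus depth decreases as the angle grows.

    def play_note(note, **effects):
        """Stand-in for the audio back-end (DirectSound in the real system)."""
        print(f"note {note}: {effects}")

    def on_collision(note, position, speed, pressure, angle_deg):
        """Apply the Table 2 mappings to one collision event."""
        play_note(
            note,
            pan=position,                      # Position -> stereo panning
            volume=min(1.0, speed / 10.0),     # Speed -> volume (assumed scale)
            eq_high_boost_db=12.0 * pressure,  # Pressure -> high-frequency boost
            chorus_depth=max(0.0, 1.0 - angle_deg / 90.0),  # acuter angle -> more modulation
        )

    on_collision("C4", position=-0.5, speed=5.0, pressure=0.4, angle_deg=30.0)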

4.6 System Set-up

The Sound Spheres VMI comprises the following software and hardware components. Specific details of each component can be found in Appendix J.

• The Sound Spheres software application and supporting software libraries.

• Laptop computer with external speakers and wide-screen monitor.

• Bluetooth adapter and supporting driver.

• 24-bit sound card.

• Wiimote controller.

• Infrared LED array.

• Cover for the infrared LED array (to shield light from the player's eyes).

• Two-tiered desk and adjustable chair.

• Four reflective markers.

The components are set up on a two-tiered desk, with the top tier used as a surface on which to stand the speakers and computer monitor and the lower tier used as a surface for placement of the Wiimote and LED arrays. Separate tiers enable the Wiimote and LED arrays to be positioned horizontally central to the monitor and speakers without obstructing the player's view of the monitor. The Wiimote and LED array can be adjusted up or down to suit desired playing positions. An adjustable chair also allows players to raise or lower their playing position. The

reference speakers are positioned either side of the monitor so that stereo effects are

maximized. The system‟s setup can be seen in Figure 11.

Figure 11 – Sound Spheres System Setup

The initial system setup included two LED arrays. However, one of the LED arrays

failed and could not be repaired. This put the project at risk, as there was no time remaining in the schedule to procure and construct another. The eventual solution to the problem

vastly improved the playability of the system. Details of this problem and solution

can be found in Appendix J. The final system setup only used one infrared LED

array, and the Wiimote was positioned so that its camera was partially obscured by

the LED array. A cover was placed over the LED array and Wiimote to shield

infrared light from the participant's eyes. This can be seen in Figure 12.

Figure 12 - LED Array, Wiimote and Cover

4.7 Reflective Markers

Four reflective markers have been constructed. Each marker is essentially a

lightweight plastic cap with a convex surface onto which highly reflective tape has

been fixed. As described in section 2.6 based on work by Martin (2010), the convex

surface improves continued reflection when the fingers are slightly bent. The cap is

placed over the tip of the finger and is held into position by gaffer tape inserted

inside the cap. The gaffer tape can be renewed when necessary to ensure good

adhesion. The plastic caps used were taken from Playmobil© toy figures, as the hair attachment was discovered to be the perfect shape and size, very lightweight, and already convex.

4.8 Occlusion Issues

The physical setup and nature of the Sound Spheres VMI naturally reduce occlusion problems. This is because there is nothing between the Wiimote camera and the player's finger tips. Whilst occlusion can occur if fingers on one hand obstruct fingers on the other, it would take quite a contorted gesture to occlude fingers on the same hand, and this sort of gesture is unlikely when playing the Sound Spheres VMI in a percussive manner where only the finger tips are used.

Occlusion is considered a limitation of the VMI that players will need to work around, and it differs little from the constraints of playing a traditional instrument. For example, the physical constraints of a guitar and a guitarist's hands and fingers mean that not all desired

playing positions are possible. Knowing these constraints and through a learning

process a guitarist has to work out the most effective way to play a desired piece of

music on the guitar. Similarly, the occlusion problem associated with the Sound

Spheres VMI could be considered simply as a constraint that the player must learn to

overcome through correct finger positioning and movement.

4.9 Ethical Issues

A central component to the Sound Spheres research project is a user study. Whilst

the participants were volunteers, it was important that they were made fully aware of any

potential risks to their health and safety before agreeing to participate. In this regard

three potential risks have been identified and were disclosed to all participants prior

to selection. Furthermore participants were asked to sign an acknowledgement form

to confirm that they had been informed about and understood the risks involved. These ethical issues are detailed in Appendix I and are summarized below.

The Sound Spheres system involves the use of an infrared (IR) illuminator. Without

adhering to appropriate precautions, this technology introduces a small risk to participants' health and safety, namely a potential risk of eye damage. This risk has

been reduced in the design of the Sound Spheres VMI, through a combination of

positioning and shielding of the IR illuminator.

Ergonomics was also a consideration. Playing requires participants to sit with their arms extended, pointing one or more fingers towards the computer display.

Movement of both the arms and fingers is necessary. Lee (2008) suggests that the

onset of fatigue happens quite quickly for this type of activity. I believe it is ethical

to warn participants of this fact, especially as exercising muscles that are not

generally used may result in stiffness or soreness sometime after the exercise has

stopped.

People with photo-sensitive epilepsy may find that moving or flickering light, including from computer screens, can cause problems. Whilst

this condition is quite rare it is ethical to warn participants of this risk and to ensure

that anybody with known epilepsy is not selected to participate.

4.10 Pilot Study and Prototype Review Influences

The pilot study's prototype review sessions were very much part of the design process rather than a study into the VMI's playability. For this reason the results or

primary outcomes of the prototype review are included here in this chapter. The

initial prototype review was used to discuss and make a number of design decisions,

and the second review proved invaluable for refining audio and visual feedback of

the VMI.

A number of unexpected events occurred during the first prototype review session

which inadvertently provided valuable feedback. Firstly, the user interface randomly

displayed additional tracking spheres. This turned out to be caused by reflective materials that the participants were wearing, such as wedding rings, a t-shirt covered in sequins,

and gloss-painted fingernails. Secondly, the tracking sphere movement seemed to randomly stop. This turned out to be caused by the invocation of auto-backup and virus

scanning software on the laptop that were affecting the performance of the graphics

display.

The following list summarizes key design influences from the prototype reviews:-

• Two methods for controlling the pressure parameter were initially identified and presented to the participants. Whilst one option was more analogous to pressure, the chosen option was more intuitive to the users and easier to implement. The

selected option was preferred by two of the three participants. However, after

using the pressure control during the second review session they thought that the

speed of increase or decrease of tracking sphere size was too fast and hence this

was adjusted accordingly.

 Three different sound types for the VMI were presented to the participants. The

selected sound was preferred by all three participants.

 One participant identified that the decay and attack of each note could be adjusted dependent on the speed and pressure control parameters so that the synchresis was more obvious: the greater the speed, the quicker the attack, and the greater the pressure, the longer the decay. This suggestion has not been implemented; however it has been noted for future research.

 The vibration of the sound spheres upon collision from a tracking sphere was thought to appear too 'springy'; that is, the variable distance of vibration was too great and the time taken to return to the non-vibrating state was too long. Adjustments were made until all participants were satisfied.

 Participants thought that the speed of sound sphere spin upon collision from a tracking sphere correlated well with the speed of the tracking sphere. However, two improvements were highlighted. One participant suggested that the speed of spin should be dependent on both the speed and pressure of the tracking sphere. This suggestion has not been implemented; however it has been noted for future research. Another suggested that the sound spheres should not spin at all until a collision took place (initially all spheres were in a state of spin and collision just made the spin greater). This suggestion was trialled but, upon seeing the results, the participants were in agreement that it should not be implemented. Without spin the sound spheres appeared on the screen more like circles than spheres, and the participants preferred the 3D nature of the interface.

 Different playing positions were compared for playability and comfort, including a comparison between sitting and standing, and the amount of bend in the elbows. The seated position was found to be preferable and enabled a higher degree of control.
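
The adjustment to the pressure feedback mentioned in the first bullet point can be illustrated with a short sketch. The code below is a hypothetical Python reconstruction, not the project's Visual Basic .Net implementation; the function name, parameters and rate value are invented for illustration only.

    # Hypothetical sketch: move the tracking sphere's radius towards a target
    # size at a tunable rate, so the growth/shrink speed can be adjusted.
    def update_radius(current, target, rate_per_second, dt):
        step = rate_per_second * dt                  # max change this frame
        if current < target:
            return min(current + step, target)
        return max(current - step, target)

    # Example: at 4.0 units/s and 60 frames per second the change is gradual.
    radius = 1.0
    for _ in range(30):                              # half a second of frames
        radius = update_radius(radius, 2.0, rate_per_second=4.0, dt=1 / 60)
    print(round(radius, 2))                          # 2.0 (target reached)

Lowering rate_per_second slows the visual response, which is the kind of tuning the review participants requested.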

Chapter 5 Results

This chapter presents results and analysis of data collected during the user study

deriving from the research methods outlined in Chapter 3. Detailed tables of results

can be found in Appendices C to H.

5.1 Playability

The summary of results from the general playability section of the user study questionnaire, as presented in Table 3, clearly shows that playing the Sound Spheres VMI was a pleasurable experience and that, whilst challenging, it did not generally cause frustration or fatigue. The Sound Spheres VMI was on the whole thought to facilitate the creation of music well, and the majority of participants saw a positive progression in their ability to play the Sound Spheres over time.

Table 3 - Questionnaire Result Summary - General Playability

The results above are in line with observations and interview responses. However, a few differences are worth noting. Firstly, signs of frustration were observed in four of the participants, and for three of these frustration was observed early in the session (two participants had problems with keeping the finger markers in place, and one participant's initial playing technique was not providing enough control). However, they were able to resolve these issues quickly and hence were no longer frustrated by the end of the session. Secondly, while the majority of participants responded in the questionnaire that their playing improved over time, the observed result was that this was only for general control (i.e. moving the tracking spheres and producing sound) and for using the control parameters in isolation. Progression in using multiple control parameters during a piece of music was observed in only two participants.

5.1.1 Sound Sphere Layout

It is clear from observation notes and responses to interview questions that participants were not entirely content with the layout of the sound spheres. Having the top position of the sharp notes lower than the top position of the natural notes presented problems when using the angle control parameter, as the natural notes obstructed access to the sharp notes at some angles, sometimes resulting in the wrong note being played. Furthermore, the placement of the two octaves on separate rows also caused participants an element of difficulty in playing even simple tunes. The transition of tracking spheres from the top octave to the bottom invariably caused inadvertent notes to be struck on the top octave.

5.1.2 Reflective Markers

Whilst participants had the opportunity to play with up to four reflective markers on the fingers, all participants opted to use only two, with one marker on each of their index fingers. Only one participant attempted to use four markers, and whilst they enjoyed the possibility, they found it relatively difficult to control. One participant had a tendency to use just one marker despite wearing two; however, this appeared to be down to a general lack of coordination with their left hand.

Two participants experienced problems with keeping the reflective markers in position on their fingers.

5.2 Audio Visual Feedback

A summary of results from the Audio Visual Feedback section of the user study

questionnaire is shown in Table 4.

Table 4 - Questionnaire Result Summary - Audio Visual Feedback

The majority of participants liked the sounds generated by the Sound Spheres VMI and no participant disliked them. However, two participants did suggest improvements to the sound generation. Similarly, the look and feel of the Sound Spheres VMI also received positive feedback. Despite this, four participants did express views during the user study on how the user interface might be changed.

The general opinions on the control aspects of the Sound Spheres VMI are shown in the results of questions 15, 16 and 18 in Table 4. Controlling the movement of the tracking spheres was found easy by most participants. In contrast, the consistency of movement was viewed less positively. Furthermore, only half of the participants claimed to be able to accurately position the tracking spheres.

All participants agreed that visual feedback was used appropriately, with the majority agreeing that both the spinning of the sound spheres and the flying sparks on collision were effective.

5.3 Reproducibility

Table 5 shows a summary of results from the reproducibility section of the user

study questionnaire. All but one participant agreed that they were able to repeatedly

play a simple piece of music using the Sound Spheres VMI. Observation showed that

in fact all participants were able to repeatedly play a simple tune.

Table 5 - Questionnaire Result Summary - Reproducibility

The user study questionnaire indicates that the control parameters were not used

(intentionally at least) during the reproducibility test. Results also indicate that the

ability to play a piece of music in perfect time was only achieved by one participant.

Observation of participants did show that in fact two participants appeared able to

play in good time.

5.4 Control Parameters

This section focuses on results that relate specifically to the assessment of the implementation of each of the control parameters (angle, position, pressure and speed), based on ease of control, audio and visual feedback, and perceived importance to musical outcomes.

Three factors were considered for audio feedback during the application of each control parameter. Firstly, did participants think the change in sound was apparent (i.e. could the participant clearly hear the effect of the control parameter)? Secondly, was the change in sound consistent (i.e. did repeated use of the same application of a control parameter produce the same change in sound)? Thirdly, was the change in sound appropriate (i.e. was it well suited to the application of the control parameter)? For visual feedback, only whether its use was apparent was considered when each control parameter was applied.

The questionnaire responses (see Tables 6, 7 and 9) show that the position, speed and pressure control parameters all received largely positive feedback from participants. Where non-positive feedback was identified it was primarily down to participants responding 'neither agree or disagree' rather than disagreeing.

As can be seen in Table 8, the questionnaire responses clearly show that the angle control parameter received less positive feedback when compared with the other control parameters. The ease of control of this parameter and the consistency of its effect on the audio feedback were the factors to which the majority of participants gave negative feedback.

Table 6 - Questionnaire Result Summary - Position Control

Table 7 - Questionnaire Result Summary - Speed Control

Table 8 - Questionnaire Result Summary - Angle Control

Table 9 - Questionnaire Result Summary - Pressure Control

5.4.1 Ranking of Control Parameters

On the user study questionnaire participants were asked to rank the control

parameters from 1 to 4 in order of ease of control (1 being the easiest and 4 being the

hardest). The detailed results of these rankings can be found in Appendix D and are

summarised in the graph in Figure 13.

Figure 13 – Control Parameter Rankings for Ease of Control

The overall ranking of the control parameters shows that the Pressure control

parameter was the easiest to control followed in order by Speed, Position and Angle.

These results correlate with the observed ease of control and with the Likert question responses in the questionnaire. Furthermore, when the rankings were restricted first to those who participated in the prototype review and then to musicians only, the control parameters were ranked in the same order.
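
As an illustration of how such an overall ordering can be derived, the sketch below (in Python) sums each participant's ranks per parameter, with the lowest total ranking first. The rankings shown are invented examples, not the study's data.

    # Hypothetical per-participant rankings (1 = easiest, 4 = hardest).
    rankings = [
        {"pressure": 1, "speed": 2, "position": 3, "angle": 4},
        {"speed": 1, "pressure": 2, "position": 3, "angle": 4},
        {"pressure": 1, "speed": 2, "angle": 3, "position": 4},
    ]

    # Sum each parameter's ranks; the smallest total is the easiest overall.
    totals = {p: sum(r[p] for r in rankings) for p in rankings[0]}
    for parameter, total in sorted(totals.items(), key=lambda kv: kv[1]):
        print(parameter, total)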

Participants were also asked to rank the four control parameters from 1 to 4 in order

of importance to musical outcomes (i.e. which control could be used best for

affecting the musical outcome, 1 being the most important and 4 being the least). The

graph in Figure 14 illustrates the distribution of the number of participants per

ranking per control parameter.

Figure 14 - Control Parameter Rankings for Importance to Musical Outcomes

The overall ranking of the control parameters shows that the Speed control parameter was thought most important to musical outcomes, followed in order by Pressure, Position and Angle. When ranked only by musicians, the control parameters were ranked in the same order. However, when ranked by those who participated in the prototype review, the Angle and Position rankings were reversed.

5.5 Spearman’s Rank Correlation Results

The results of the Spearman's rank correlation method can be found in Appendix E. Very few of the P-values are lower than 0.05 and, for those that are, further analysis suggests that in many cases the correlation is unlikely to be reliable. It is therefore difficult to draw any real conclusions based on statistical significance. Table 10 shows a summary of the correlation results where the P-value is less than 0.05.

Table 10 – Spearman’s Rank Correlation Results Summary
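
For readers wishing to reproduce this style of analysis, the sketch below shows how one rho and P-value pair could be computed with SciPy, assuming the bipolar Likert scoring described in Appendix C. The paired scores are hypothetical examples, not the study's data.

    # A minimal sketch of one Spearman's rank correlation test.
    from scipy.stats import spearmanr

    q11 = [2, 1, 1, 2, 1, 1, 2, 1]    # hypothetical scores, question 11
    q15 = [1, 2, -1, 1, 2, 1, -1, 1]  # hypothetical scores, question 15

    rho, p_value = spearmanr(q11, q15)
    if p_value < 0.05:
        print(f"significant: rho = {rho:.3f}, p = {p_value:.3f}")
    else:
        print(f"not significant: rho = {rho:.3f}, p = {p_value:.3f}")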

The issue of reliability can be seen from the result for questions 11 and 15, where a strong correlation exists between them with high significance (P-value = 0.01). However, the negative correlation coefficient (of -0.834) suggests that the Sound Spheres VMI facilitates the creation of music better as the control of the tracking spheres gets harder. This is the reverse of what might be expected, especially considering that both questions received a high percentage of positive responses, with 87.5% of participants thinking that the Sound Spheres VMI facilitated the creation of music well and 75% thinking that the movement of the tracking spheres was easy. Five out of the eight results in Table 10 could be considered unreliable based on similar reasoning. The remaining three results appear more reliable; however, considering the other results, their reliability could also be called into question.

Correlation suggests that accuracy in positioning the tracking spheres increases as the consistency in control of tracking sphere movement increases. Not surprisingly, correlation also suggests that tracking sphere positioning is considered easier with an increase in accuracy of positioning.

A strong correlation exists between the improvement of ability to play the Sound

Spheres VMI over time and the ability to distinguish the application of more than

one control parameter at a time. This suggests that progression of ability or skill of

Sound Spheres players can be achieved.

5.5.1 Unexpected Correlation Results

87.5% of participants thought that the Sound Spheres VMI was challenging and

hence it was envisaged that contributing factors could be identified through

correlation of paired questionnaire responses. Unfortunately, no significant

correlation was identified. Similarly, 87.5% of participants thought that the Sound

Spheres VMI facilitated the creation of music well, however no significant

correlation between factors that might contribute to this was found.

All participants were in agreement that visual feedback was used appropriately. One might therefore expect that sound sphere spin and the use of flying sparks (the two forms of visual feedback that received predominantly positive responses) would correlate strongly with the appropriate use of visual feedback. However, they did not. We therefore cannot draw conclusions (through correlation at least) about the appeal of the visual feedback.

The speed, position and pressure control parameters all showed predominantly

positive questionnaire responses in relation to the audio and visual feedback used for

each control. However, there was no strong correlation between audio and visual

feedback for these control parameters, suggesting that synchresis of the Sound

Spheres VMI audio and visual feedback was not achieved.

5.6 Mann-Whitney U Test Results

The Mann-Whitney U Test was used for statistical hypothesis tests. A comparison

was made between musicians and non-musicians, and between those who did and did

not participate in the prototype reviews. A summary of the Mann-Whitney U test

results can be found in Appendix F.
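
A minimal sketch of such a test, assuming SciPy is available, is shown below. The group sizes mirror the study (five musicians, three non-musicians) but the bipolar Likert scores are invented for illustration.

    # Compare one question's scores between two independent groups.
    from scipy.stats import mannwhitneyu

    musicians = [2, 1, 1, 2, 0]       # hypothetical scores
    non_musicians = [1, 0, 2]         # hypothetical scores

    u_stat, p_value = mannwhitneyu(musicians, non_musicians,
                                   alternative="two-sided")
    print(f"U = {u_stat}, p = {p_value:.3f}")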

No significant difference was found between musicians and non-musicians, suggesting that non-musicians are not disadvantaged in using the finger tracking method for music playing. However, there were five results that identified notable differences between the responses of those who participated in the prototype review sessions and first-time users of the Sound Spheres VMI. Table 11 shows an extract from the Mann-Whitney summary where the results are significantly different at a confidence level of 0.05. These results indicate that participants of the prototype review sessions were more able to consistently control the movement and position of the tracking spheres. They also used the control parameters to add expression during play more than first-time participants did. Participants of the prototype review sessions more strongly agreed that the change in sound was apparent and consistent when using the pressure control. This suggests that, as with traditional instruments, the nuances of the Sound Spheres VMI take time to appreciate.

Table 11 - Mann-Whitney U-Test Results

5.7 Validation

I hypothesised that factors of playability, progression, control, predictability, balance

between challenge, frustration and boredom, and reproducibility can be applied to a

non-contact VMI using a finger tracking method. I make the assertion that validation

of the positive application of these factors would imply that finger tracking would be

considered an effective method to play a non-contact VMI. The user study of the

Sound Spheres VMI confirms that several of these factors are afforded the finger

tracking method. The successful application of other factors is inconclusive. This is

discussed in the following sub-sections.

5.7.1 Playability

Playing a non-contact VMI using finger tracking was an enjoyable experience for all, with all participants wishing to play the Sound Spheres VMI again. The positive results regarding the control of tracking sphere movement and the ability of the Sound Spheres VMI to facilitate the creation of music well imply that the finger tracking technique affords successful playability. The degree of success is less certain, and there are two reasons for this.

Firstly, seven out of eight participants did not agree that they could play in perfect

time. This suggests that the Sound Spheres VMI does not facilitate the creation of

music well, as timing is an essential component of music. However this could be

attributed to the fact that the Sound Spheres VMI was a totally new experience to

participants and more practice is needed to build timing skill. Observation of

participants did show that in fact two participants appeared able to play in good time.

As these two had also participated in the prototype reviews, this indicates that good timing is possible with time and practice.

Secondly, none of the participants attempted to apply the control parameters during the reproducibility test. On reflection this is entirely natural. When learning any new musical instrument, the ability to produce music and apply good control only comes with practice, and expecting participants to do this after only 1½ hours of playing a new instrument for the first time is overly ambitious. It is therefore inconclusive whether the finger tracking method can facilitate control during play. However, as discussed in section 5.1, the Sound Spheres VMI does facilitate the control parameters well and hence there is good reason to assume that over time a player could develop the skills to incorporate the controls during the playing of music.

5.7.2 Progression

All but one participant agreed that their ability to play the Sound Spheres VMI improved over time, and this was also observed, especially during the structured play stages of the user study sessions. Observation also showed that two participants improved in their ability to use more than one control parameter at a time (during the free-play session stages). One can therefore conclude that the progression factor can be applied to the finger tracking method and non-contact VMIs.

5.7.3 Predictability

Predictability of a musical instrument can be assessed by the consistency of control and feedback (both audio and visual). Results from the user study show conflicting evidence on consistency. The speed, pressure and position control parameters were generally considered easy to control and provided consistent audio feedback. However, only two participants agreed that they could consistently control the movement of the tracking spheres. Interestingly, both of these had participated in the prototype review sessions, and this would again indicate that consistency in control could be accomplished with time and practice. Whilst 75% of participants thought that the change in sound for different angles was apparent, only 37.5% thought it consistent. The difficulty in controlling the angle parameter (as discussed in section 5.7.5) is possibly one factor preventing consistency.

5.7.4 Challenge, Frustration and Boredom

All participants found the Sound Spheres VMI challenging to play, and the finger tracking element was central to this challenge. However, despite the initial frustration observed in some participants, only one agreed that it was frustrating to play. No participant thought the Sound Spheres VMI was boring to play; comments made (such as "so cool", "wow", "amazing", "really fun", "addictive") indicate this. I conclude therefore that the balance between challenge, frustration and boredom is an achievable factor for a non-contact VMI.

5.7.5 Application of Control Parameters

Results from the user study would indicate that implementation of the position,

speed, and pressure control parameters has been successfully achieved. Each of these

control parameters showed positive feedback in terms of ease of control, consistency

and appropriateness of audio and visual feedback, and importance to musical

outcomes.

Whilst two participants agreed that the angle control parameter was easy to control, 75% of participants found angle control difficult. This was evident from the rankings, which placed angle 4th for both ease of control and importance to musical outcomes. Observation and interview responses strongly indicate that the sound sphere layout (as described in section 5.1.1) is a key factor in the difficulty of using the angle control parameter. Another factor, however, could be that visual feedback was not implemented for the angle control parameter; this suggests that synchresis plays an important role in the successful control of non-contact VMIs and that without it control is perhaps impaired. A point to consider is that the results of the TIEM questionnaire (as highlighted in section 2.1) also rank the angle control parameter 4th in terms of the qualities of movement needed for control. It is quite likely therefore that even if control, layout and synchresis were improved, the angle control parameter would still be ranked 4th for its importance to musical outcomes.

Chapter 6 Conclusions

This research project concludes that the implementation of the position, speed, and pressure control parameters can be successfully achieved for a non-contact VMI; however, it is inconclusive as to whether implementation of the angle control parameter can be successfully achieved. Software is at the core of the Sound Spheres VMI, and hence finger tracking control, the user interface, sound synthesis and visual feedback (key factors that contribute to control parameter success) are entirely programmable. I therefore suggest that through design improvements and further programming it would be possible to successfully implement an angle-based control parameter for a non-contact VMI.

By assessing how the factors of playability, progression, control, predictability,

balance between challenge, frustration and boredom, and reproducibility can be

applied to the Sound Spheres VMI, this research project also sought to determine

whether the finger tracking method is an effective non-contact technique in which to

play a virtual musical instrument. This research concludes that factors of playability,

progression, control, and balance between challenge, frustration and boredom can be

applied to the finger tracking method. Whether the factors of predictability and

reproducibility can be applied is inconclusive; however the statistically significant

differences between first time users and those who had played Sound Spheres before,

during the prototype review sessions, presents evidence suggesting these factors

could be achieved by the finger tracking method if participants had more time and

practice with the VMI. Evidence of progression in participants‟ ability to exert

control supports this theory also.

Results indicate that non-musicians are not disadvantaged in using the finger tracking

method for music playing. The Sound Spheres VMI thus provides a fun and novel

musical instrument accessible to people of all musical abilities.

6.1 Project Review

This project has been successful in that a non-contact VMI using the finger tracking method has been conceived, designed, constructed, prototyped, tested and studied from the ground up. It has given birth to a new and novel non-contact VMI, the Sound Spheres. The TIEM database (as described in section 1.2) provides evidence of the originality of the Sound Spheres VMI, with only 4 instruments in the taxonomy considered non-contact and none of these using a finger tracking technique. The user study has enabled many factors relating to finger tracking methods, control parameters, audio and visual feedback, and system design decisions to be assessed. It has also identified areas for future research and ideas for further development of the Sound Spheres VMI.

Whilst the research methods of the user study were appropriate and provided a good set of data for analysis, in hindsight a different approach to the user study would have been better suited. I have surmised that where results are inconclusive it is down to a lack of time and practice with the Sound Spheres VMI. Had the user study sessions been more numerous and/or longer in duration, participants would have had more time to adapt to the Sound Spheres VMI and develop better skills and control. Similarly, the project would have benefited from more prototype reviews, as not all design issues were identified during this stage. The poor sound sphere layout that contributed to the unsuccessful implementation of the angle control is one such issue that should have been identified during prototype review.

In hindsight a more neutral sample of participants should have been selected. The

participants chosen were predominantly friends and family and this non-neutrality

may have biased the results of the prototype review and user study with participants

wanting to provide positive feedback.

The user study questionnaire used a five-point Likert scale, and the central response (neither agree or disagree) was chosen in 20% of responses. Much of the analysis was based on whether responses showed a negative or positive trend, and therefore it may have been more appropriate to use a four-point scale in which participants were forced to provide negative or positive feedback.

Results from the Spearman's rank correlation were disappointing, with few statistically significant or reliable results found. It is possible that the small number of participants contributed to this. As statistical analysis of quantitative data was an important aspect of this research, a greater number of participants would perhaps have provided more conclusive correlation results.

The project has not been without its challenges. The design and construction of the

system was far more complex and time consuming than first envisaged. The learning

curve of new technologies, graphics programming and statistics has been steep.

Personal challenges have also made it difficult to accommodate the demands of the

project. A detailed and evolving project schedule has enabled the system and study to

be planned and executed on time.

6.2 Future Research

This research highlights a number of topics that, whilst falling outside the project scope, are worthy of further research; these are discussed here.

This project acknowledges the importance of synchresis in the design of VMIs, and

this has been incorporated into the design of the Sound Spheres VMI. Evidence from

the user study suggests that without both audio and visual feedback the

implementation of control parameters is impaired. Further study could build upon

previous research by Moody (2009) to determine whether this is true for a non-

contact VMI, to what extent synchresis affects musical outcomes, or how best to

implement synchresis for non-contact VMIs.

This research assessed the Sound Spheres VMI from the player's perspective only. Audience participation, however, is a key component of a musical performance and, as stressed by both Mulder (2000) and Dobrian (2003), much of one's appreciation of music is in its performance. The importance of mapping strategies for audience participation is discussed in sections 2.2 and 2.3. Further research could determine alternative or optimal mapping strategies for musical performance and audience appreciation of a non-contact VMI.

The Sound Spheres VMI used the Wiimote and passive markers to implement finger tracking. Other finger tracking techniques could perhaps be investigated. If the Sound Spheres VMI could support multiple techniques then a comparison could be made and perhaps the most suitable technique found. Furthermore, research into whether other gestural interfaces could be adopted for the Sound Spheres VMI could be considered. One such interface is Microsoft's Kinect, which was launched after the start of this project.


This project has presented only one possible sound mapping per control parameter. A

comparison study with other possible mappings could be undertaken to identify the

most appropriate sound synthesis model for each of the control parameters.

For the Sound Spheres VMI a very simple user interface was designed. Other user

interface designs (e.g. different sound sphere layout) could be investigated to

determine which is most appropriate for the finger tracking method.

6.3 Further Development

Functional design of the system was limited to addressing the research question only, to ensure the system was delivered on time according to the project schedule. However, a number of potential enhancements to the Sound Spheres VMI were identified during the project and details of these are included in Appendix K.

References

Barbosa, A. (2001). Instruments and temporal control in the context of musical communication and technology. Workshop at the "Olhares de Outono" Festival – New Trends in Digital Art. http://www.abarbosa.org/docs/instruments_temp_control.pdf

Chion, M. (1994). Audio-Vision: Sound on Screen. New York: Columbia University Press.

Crevoisier, A., Bornand, C., Guichard, A., Matsumura, S., Arakawa, C. (2006). Sound Rose: Creating Music and Images with a Touch Table. Proceedings of the 2006 International Conference on New Interfaces for Musical Expression (NIME06), Paris, France.

Crowley, J., Berard, F., Coutaz, J. (1995). Finger Tracking as an Input Device for Augmented Reality. Proceedings of the International Workshop on Face and Gesture Recognition, Zurich, 1995.

Dobrian, C. (2003). Aesthetic Considerations in the Use of "Virtual" Music Instruments. Journal SEAMUS, Spring 2003.

Ferguson, S., Wanderley, M. (2009). The McGill Digital Orchestra: Interdisciplinarity in Digital Musical Instrument Design. Centre for Interdisciplinary Research in Music Media and Technology, Digital Composition Studios, Schulich School of Music of McGill University, Canada.

Franco, I. (2005). The Airstick: A Free-Gesture Controller Using Infrared Sensing. Proceedings of the 2005 International Conference on New Interfaces for Musical Expression (NIME05), Vancouver, BC, Canada.

Geiger, C., Reckter, H., Paschke, D., Schulz, F. (2008). Evolution of a Theremin-Based 3D-Interface for Music Synthesis. IEEE Symposium on 3D User Interfaces 2008, 8–9 March, Reno, Nevada, USA.

Gotterbarn, D. et al. (1999). Software Engineering Code of Ethics and Professional Practice. ACM/IEEE-CS Joint Task Force on Software Engineering Ethics and Professional Practices (SEEPP), 1999.

Guillaume, D., Jouvelot, P. (2005). Motivation-Driven Educational Game Design: Applying Best Practices to Music Education. ACM International Conference Proceeding Series, Vol. 265.

Hornbostel, E. M. von, Sachs, C. (1914). Systematik der Musikinstrumente: Ein Versuch. Translated as "Classification of Musical Instruments" by Anthony Baines and Klaus Wachsmann, Galpin Society Journal (1961), 14: 3–29.

Hunt, A., Wanderley, M. (2002). Mapping Performer Parameters to Synthesis Engines. Organised Sound, 7(2), pp. 97–108.

Johnstone, A., Marks, B., Edmonds, E. (2005). Sounds of Influence – An Interactive Musical Work. ACM International Conference Proceeding Series, Vol. 123.

Johnstone, A., Candy, L., Edmonds, E. (2008). Designing and Evaluating Virtual Musical Instruments: Facilitating Conversational Interaction. Design Studies, Vol. 29, No. 6. Elsevier Ltd.

Jorda, S. (2004a). Digital Instruments and Players: Part I – Efficiency and Apprenticeship. Proceedings of NIME 2004.

Jorda, S. (2004b). Digital Instruments and Players: Part II – Diversity, Freedom and Control. Proceedings of the International Computer Music Conference, Miami, 2004.

Kapros, E., Raptis, K. (2009). An Audiovisual Virtual Interface. Proceedings of the International Multiconference on Computer Science and Information Technology 2009.

Kiefer, C. (2010). Input Devices and Mapping Techniques for the Intuitive Control and Composition and Editing for Digital Music. ACM, New York, NY, USA.

Lee, J. (2008). Hacking the Nintendo Wii Remote. IEEE Pervasive Computing, Vol. 7, No. 3, pp. 39–45.

Mahillon, V.-C. (circa 1890). Catalogue descriptif et analytique du Musée Instrumental du Conservatoire Royal de Musique de Bruxelles III (Ghent: Ad. Hoste, 1900), no. 1602.

Malloch, J., Wanderley, M. (2007). The T-Stick: From Musical Interface to Musical Instrument. Proceedings of the 2007 International Conference on New Interfaces for Musical Expression (NIME07), New York City, USA, pp. 66–69.

Martin, G. (2008). The Enlightened Hands: Navigating through a bi-dimensional feature space using wide and open-air hand gestures. Music Technology Area, McGill University, Montreal, QC, Canada.

Monaci, G., Triki, M., Sarroukh, B. (2007). Device-less Interaction. Koninklijke Philips Electronics N.V.

Moody, N. (2009). Ashitaka: An Audiovisual Instrument. Thesis, Department of Electronics and Electrical Engineering, University of Glasgow, Scotland.

Mulder, A. (2000). Virtual Musical Instruments: Accessing the sound synthesis universe as a performer. Proceedings of the First Brazilian Symposium on Computer Music, Caxambu, Minas Gerais, Brazil, August 2–4 1994, during the XIV annual congress of the Brazilian Computing Society (SBC), pp. 243–250.

Paine, G. (2007). Interfacing for Dynamic Morphology in Computer Music Performance. The inaugural International Conference on Music Communication Science, Sydney, 2007.

Paine, G. (2009). Towards unified design guidelines for new interfaces for musical expression. Organised Sound, 14(2), pp. 143–156.

Paine, G., Drummond, J. (2007). Developing an Ontology of New Interfaces for Realtime Electronic Music Performance. MARCS Auditory Laboratories, Sydney.

Paine, G., Stevenson, I., Pearce, A. (2007). The Thummer Mapping Project (ThuMP). International Conference on New Interfaces for Musical Expression (NIME07), New York City, NY.

Peek, B. (2009). Managed Library for Nintendo's Wiimote. http://wiimotelib.codeplex.com/

Softic, S. (2009). Using Nintendo Wii Remote Controller for Finger Tracking, Gesture Detection and a HCI Device. Institute of Information Systems & Computer Media, Graz University of Technology.

Stowell, D., Plumbley, M., Bryan-Kinns, N. (2008). Discourse analysis evaluation method for expressive musical interfaces. Centre for Digital Music, Queen Mary, University of London, London, UK.

Tauber, E., Stanford, J., Klein, L. (2005). How many users are enough? User Experience, Vol. 4, No. 4.

Vlaming, L. (2008). Human Interfaces – Finger Tracking Applications. Department of Computer Science, University of Groningen, July 4, 2008.

Vuong, P., Kurillo, G., Bajcsy, R. (2009). Wiimote Tracking Algorithm and its Limitations.

Walonick, D. (2004). Excerpts from Survival Statistics. StatPac, Inc., 8609 Lyndale Ave. S. #209A, Bloomington, MN 55420.

Wanderley, M. (2000). Gestural Control of Music. IRCAM, Paris, France.

Wang, D., Huang, D. (2008). Low-Cost Motion Capturing Using Nintendo Wii Remote Controllers. CSC2228 Project Report, Department of Electrical and Computer Engineering, University of Toronto, Ontario.

Wong, E., Yuen, W., Choy, C. (2008). Designing Wii Controller – A Powerful Musical Instrument in an Interactive Music Performance System. Proceedings of MoMM 2008, Linz, Austria.

Index

Aeolian Harp, 2
Airstick, 6
AirStick, 24
angle, 6, 88, 90
audience participation, 6, 81
bluetooth, 8
Convergent mapping, 20
conversational, 18
Divergent mapping, 20
Finger tracking, 6, 25
haptic technology, 5
infrared camera, 8
infrared illuminator, 27
instrumental, 18
interaction, 2
learning curve, 15
mapping strategy, 19
motion capture, 6, 88
ornamental, 18
passive markers, 8
Phalanger, 6
physical interaction, 2
playability, 15
position, 6, 88, 90
pressure, 6, 88, 90
progression, 15
reproducibility, 7, 88
Sound Spheres, 9
sound synthesis, 16
speed, 6, 88, 90
synchresis, 22
tactile feedback, 5
Terpsitone, 3
Theremin, 3
TIEM, 4
T-Stick, 5
virtual musical instruments, 4
VMI controller, 4, 88
Wiimote, 26



Appendix A – Extended Abstract
SOUND SPHERES: A NON-CONTACT VIRTUAL MUSICAL
INSTRUMENT PLAYED USING FINGER TRACKING
Craig Hughes
Extended Abstract of Open University MSc Dissertation Submitted 8 March 2011
ABSTRACT

This paper describes the design and evaluation of a new virtual musical instrument (Sound Spheres) which uses a finger tracking method as its gestural interface. By assessing the instrument against key factors that are considered important for the design of such instruments, the aim of the project was to determine whether finger tracking is an effective non-contact technique in which to play a virtual musical instrument (VMI), and whether common musical instrument control variables (position, speed, pressure and angle) can be successfully applied.

1. INTRODUCTION

The creation and performance of music is predominantly and traditionally reliant on the direct physical interaction between the performer and a musical instrument. The advent of electronics and computing has given rise to many new electronic musical instruments and interfaces. Recent advances in these areas have seen an emerging trend in the design of virtual musical interfaces in which audio is synthesized and played back based on a musician's body movements captured by some gestural interface.

A large number of VMIs are included in the Taxonomy for real-time Interfaces for Electronic Music performance (TIEM) [6] and many more have been described or been the focus of research into gestural interfaces. Mulder [4] for example provides examples and descriptions of different types of VMI controller. Based on the evidence of these and other sources, the majority of these VMI controllers rely on physical interaction, i.e. an interaction that involves exerting a tangible force on a musical instrument in order for it to produce a sound.

There is however a small number of non-contact VMIs that are controlled without physical contact interaction and instead rely on the proximity or movement (gestures) of parts of the body. Motion capture techniques are therefore a key component of non-contact VMIs. Vlaming [8] identifies a wide range of motion capture techniques and systems. Finger tracking is one such motion capture technique.

Until recently, hardware to support finger tracking has been expensive and confined to specialist use. However, Lee [3] showed an accessible and affordable finger tracking technique utilizing the Nintendo Wii Remote controller (Wiimote) for the Nintendo Wii game console. He cleverly exploited the Wiimote's built-in infrared camera and simple bluetooth connectivity, demonstrating how to implement a finger tracking application. The launch, after our project had started, of Microsoft's Kinect introduced another low cost opportunity for developing body tracking applications.

This paper concerns both the design and user study of a new non-contact virtual musical instrument, the Sound Spheres, which uses Lee's finger tracking motion capture technique for its gestural interface.

2. DESIGN FACTORS

Designing new electronic or virtual musical instruments necessitates consideration of many factors that affect their control and playability. For example, the Thummer Mapping Project [5] identified four common physical instrument variables (pressure, speed, angle and position) that control instrument dynamics, pitch, vibrato and articulation. In a later study Paine [7] re-iterated these control parameters as important factors for the design of new musical interfaces. Jorda [1,2] describes other factors that may be important to the consideration of a good musical instrument, suggesting playability, progression (learning curve), control and predictability. He also suggests that the balance between challenge, frustration and boredom must be met. Ferguson and Wanderley [9] highlight reproducibility as one more important factor for digital musical instruments, suggesting that musical instruments that allow a performer to be expressive must also permit a performer to imagine a musical idea and be able to reproduce it.

A key aim of this project was to determine whether finger tracking can be an effective non-contact technique in which to play a virtual musical instrument. We take effectiveness as meaning that the factors identified above are supported, i.e. players can apply the four control variables (pressure, speed, angle and position) and achieve playability, progression, control, predictability, reproducibility, and balance between challenge, frustration and boredom.

3. SOUND SPHERES

The Sound Spheres VMI is essentially a software application controlled only by the movement of the musician's fingers in the air. The software has been developed using Microsoft's Visual Basic .Net object oriented programming language and utilizes Microsoft's DirectX runtime libraries for its graphics implementation. For handling and interpreting data from the Wiimote the software application uses a .Net managed library called WiimoteLib [10].

Unlike some finger tracking applications, complex finger gestures are avoided and only the finger tips are used. Highly reflective tape placed on the fingertips reflects infrared light to the Wiimote's infrared camera. The Wiimote then passes data concerning the positioning of the fingertips to the Sound Spheres VMI software. Only four fingertips can be simultaneously tracked with the Wiimote's infrared camera and hence this poses a limitation of up to a maximum of four tracking spheres.

The position of the finger tips is represented on the user interface as small spheres (tracking spheres), whose movement is used to trigger sounds through collision with a set of fixed larger spheres (the sound spheres), which are organized in two rows, each comprising the 12 notes of an octave. Before sounds are played back, the software synthesizes the sound depending on how the musician has used the control parameters of position, speed, pressure and angle.

Figure 1. Playing the Sound Spheres

Much of the design effort focused on how each of the control parameters could be implemented without the sense of touch and with only audio and visual feedback. For example, the angle control parameter was based on the angle generated at the point of collision by the tracking sphere's starting position and collision point (as illustrated in Figure 2). Having in mind how instruments like the piano and the xylophone are played, the tracking sphere's starting position was determined by the position at which the movement changes from a positive direction in the y-plane to a negative one, i.e. the point at which a downward movement starts, following an upward movement.

Figure 2. Angle control parameter

Each control parameter was mapped to a different audio effect. For example, a varying amount of modulation is applied to the generated sound dependent on the degree of angle. Control parameters were also mapped to different visual effects, which included flying sparks and spinning of the sound spheres.

Position. Audio effect: Stereo Panning; the sound is increasingly panned to the left or right speaker dependent on the position of collision. Visual effect: Flying Sparks; the direction of sparks is dependent on the position of tracking sphere collision.

Speed. Audio effect: Volume; a greater speed results in a higher volume. Visual effect: Spin; the greater the speed of the tracking sphere, the faster the sound spheres spin on collision.

Pressure. Audio effect: Parametric EQ; a greater pressure results in a tone where the higher frequencies are boosted. Visual effect: Size; the greater the pressure, the larger the tracking sphere.

Angle. Audio effect: Chorus; an acute angle results in a chorus effect with a greater degree of modulation than a less acute angle. Visual effect: None.

Table 1. Control Parameter Mapping

Vibration of the sound spheres as they were struck by the tracking spheres was also a visual effect that was implemented; however this was not control parameter dependent.

4. METHOD

Research was carried out in two phases: a pilot study followed by a user study. The pilot study phase consisted of the design, construction and testing of the Sound Spheres VMI. Using an iterative design process, a series of prototype review sessions were used to test and improve system design.

In the user study phase eight participants took part in individual sessions to play the Sound Spheres. Five of the participants were musicians and three had previously participated in the pilot study prototype reviews. The sessions were split into a number of stages that required the participants to try out different elements of the instrument. These stages catered for both the assessment of the implementation of control parameters and the effectiveness afforded by the finger tracking technique.

1. Basic Instruction (5 minutes)
2. Discovery and Free Play (15 minutes; Observation)
3. Interview 1 (10 minutes; Interview)
4. Control Instruction (5 minutes)
5. Structured Play - Position (5 minutes; Observation)
6. Structured Play - Speed (5 minutes; Observation)
7. Structured Play - Pressure (5 minutes; Observation)
8. Structured Play - Angle (5 minutes; Observation)
9. Interview 2 (10 minutes; Interview)
10. Discovery and Free Play (15 minutes; Observation)
11. Reproducibility Test (10 minutes; Observation)
12. Structured Interview (10 minutes; Questionnaire)

Figure 3. User study stages

Both qualitative and quantitative data were collected. Observation notes were taken during the user study sessions to provide data for a comparative study to determine patterns of use and behaviour (feelings), body movement and posture, ease of use of the interface, ability to understand and use the control parameters, progression of learning, likes and dislikes, etc. Video recordings were also taken to support, validate and clarify observation notes. Interviews were conducted with each participant after each stage, and the responses were also used for a comparative study.

At the end of each user study session the participants were asked to complete a questionnaire with 49 questions. The initial 5 questions served to identify the participant and their ability to play and read music. Two questions asked the participant to rank the control parameters in terms of ease of use and importance to musical outcomes. The last 3 questions asked for general comments about what participants liked most and least about Sound Spheres. The remaining 39 questions covered the various design factors mentioned in section 2, asking the participant to respond using a 5 point Likert rating scale (strongly disagree, disagree, neither agree or disagree, agree, and strongly agree), thus providing quantitative data to which statistical analysis could be applied.

For example, based on the received answers, the control parameters of pressure, speed, angle and position were ranked based upon their ease of control and their perceived importance to musical outcomes (Figure 4). In addition, Spearman's rank correlation method was used to determine the relationship between 57 pairs of questionnaire responses, e.g. whether the ease of use of the speed control parameter correlated with the preference for its applied visual feedback. Furthermore, due to the small sample size, the non-parametric Mann-Whitney U Test was systematically applied to each of the 41 questions to test the hypotheses that questions may be answered differently between musicians and non-musicians, and between those who did and did not participate in the prototype reviews.

5. RESULTS

Statistical analysis of the questionnaire responses showed very positive feedback on many factors relating to the Sound Spheres VMI. For example, 87.5% of participants thought that the Sound Spheres VMI facilitated the creation of music well and that their playing improved over time. 75% of participants thought that it was easy to move the tracking spheres using the finger tracking method. Factors such as general playability, the progression of the musician's ability, control, and balance between challenge, frustration and boredom all applied positively to the Sound Spheres VMI. It was inconclusive as to whether the factors of predictability and reproducibility applied, as the questionnaire showed negative feedback in these areas. However, the research (through observation and the results of the Mann-Whitney U Tests) also suggested that this could be down to a lack of playing time (i.e. practice) given to the participants during the user study.

Figure 4. Control parameter rankings

In general the control parameters pressure, speed, and position were considered easy to use and the sounds generated for each of these controls were considered apparent, consistent and appropriate. Angle was the control parameter that received the most negative feedback in terms of its ease of control and audio feedback. Interestingly, it was the only control parameter that did not use visual feedback of any kind.

Only 8 of the 57 Spearman's rank correlation results showed statistical significance, and through further analysis 5 of these results were considered unreliable. However, a strong correlation exists between the improvement of ability to play the Sound Spheres VMI over time and the ability to distinguish the application of more than one control parameter at a time. This suggests that progression of ability or skill in playing the Sound Spheres VMI can be achieved. Correlation also suggests that accuracy in positioning the tracking spheres increases as the consistency in control of tracking sphere movement increases.

The Mann-Whitney U Test results indicated that there was no significant difference between musicians and non-musicians in the way questions were answered. However, there were five questions that identified significant (i.e. p < 0.05) differences between the responses of those who participated in the prototype review sessions and first-time users of the Sound Spheres VMI. These results indicate that participants of the prototype review sessions were more able to consistently control the movement and position of the tracking spheres. They also used the control parameters to add expression during play more than first-time participants. Participants of the prototype review sessions more strongly agreed with the change in sound being apparent and consistent when using the pressure control.

6. CONCLUSIONS

This research project concludes that implementation of the control parameters of position, speed, and pressure can be achieved for a non-contact VMI. The research was inconclusive as to whether the angle control parameter could also be successfully implemented. There are a number of reasons why the angle control parameter was not received well. Firstly, analysis revealed that the positioning of the sharp note sound spheres (which were placed lower than the natural notes) made them difficult to hit at an angle. Participants also found that they often played more than one intended note when using the angle control due to the close proximity of sound spheres. Secondly, visual feedback was not implemented for the angle control parameter. This suggests the combination of both audio and visual feedback (synchresis) could play an important role in the successful control of non-contact VMIs, and without it control is perhaps impaired.

The participants' answers to interviews and the questionnaire also indicate that the factors of playability, progression, control, and balance between challenge, frustration and boredom can be achieved by a non-contact VMI using the finger tracking method. Whether the factors of predictability and reproducibility can be achieved is inconclusive. However, the statistically significant differences between first-time users and those who had played Sound Spheres before, during the prototype review sessions, present evidence suggesting these factors could be achieved by the finger tracking method if participants had more time and practice with the VMI. Last but not least, the results seem to indicate that non-musicians are not disadvantaged in using the finger tracking method for music playing.

7. REFERENCES

[1] Jorda, S. "Digital Instruments and Players: Part I – Efficiency and Apprenticeship". Proceedings of NIME, 2004.

[2] Jorda, S. "Digital Instruments and Players: Part II – Diversity, Freedom and Control". Proceedings of the International Computer Music Conference, Miami, 2004.

[3] Lee, J. "Hacking the Nintendo Wii Remote". IEEE Pervasive Computing, 7(3):39–45, 2008.

[4] Mulder, A. "Virtual Musical Instruments: Accessing the sound synthesis universe as a performer". Proceedings of the First Brazilian Symposium on Computer Music, pp. 243–250, 1994.

[5] Paine, G., Stevenson, I., Pearce, A. "The Thummer Mapping Project (ThuMP)". International Conference on New Interfaces for Musical Expression (NIME07), New York City, NY, 2007.

[6] Paine, G., Drummond, J. "TIEM – Taxonomy for real-time Interfaces for Electronic Music performance". MARCS Auditory Laboratories at the University of Western Sydney, 2008. http://vipre.uws.edu.au/tiem

[7] Paine, G. "Towards unified design guidelines for new interfaces for musical expression". Organised Sound, 14(2):143–156, 2009.

[8] Vlaming, L. "Human Interfaces – Finger Tracking Applications". Department of Computer Science, University of Groningen, July 4, 2008.

[9] Wanderley, M. "Gestural Control of Music". IRCAM, Paris, France, 2000.

[10] Peek, B. "Managed Library for Nintendo's Wiimote". http://wiimotelib.codeplex.com/
Appendix B – User Study Questionnaire

Participant Details

1. What is your name? ______________________________________

2. Are you male or female?  Male  Female (tick one only)


3. How old are you? __________

4. Do you play a musical instrument?  No  Yes (tick one only)


If Yes what instrument(s) do you play?
______________________________________

5. Can you sight read music?  No  Yes (tick one only)

General Playability

For each of the statements below select one response only.
(Response options: Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree)

6. Playing the Sound Spheres VMI was enjoyable.
7. Playing the Sound Spheres VMI was challenging.
8. Playing the Sound Spheres VMI was frustrating.
9. Playing the Sound Spheres VMI was boring.
10. Playing the Sound Spheres VMI was tiring.
11. The Sound Spheres VMI facilitates the creation of music well.
12. My ability to play the Sound Spheres VMI improved over time.

Audio and Visual Feedback

For each of the statements below select one response only.
(Response options: Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree)

13. I liked the sounds generated by the Sound Spheres VMI.
14. I liked the overall look and feel of the Sound Spheres VMI.
15. Controlling the movement of the tracking spheres was easy.
16. I could consistently control the movement of the tracking spheres.
17. Visual feedback was used appropriately.
18. I could accurately position the tracking spheres.
19. The use of sound sphere vibration was effective.
20. The use of sound sphere spin was effective.
21. The use of flying particles when a tracking sphere collided with a sound sphere was effective.

Position Control
For each of the statements below select one response only.
(Response options: Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree)

22. It was easy to control the position of the tracking spheres when colliding with the sound spheres.
23. The change in sound when the tracking spheres collided with the sound spheres at different positions was apparent.
24. The change in sound when the tracking spheres collided with the sound spheres at different positions was consistent.
25. The change in sound when the tracking spheres collided with the sound spheres at different positions was appropriate.
26. The visual feedback when the tracking spheres collided with the sound spheres at different positions was apparent.

Speed Control

For each of the statements below select one response only.
(Response options: Strongly Disagree / Disagree / Neither Agree or Disagree / Agree / Strongly Agree)

27. It was easy to control the speed of the tracking spheres when colliding with the sound spheres.
28. The change in sound when the tracking spheres collided with the sound spheres at different speeds was apparent.
29. The change in sound when the tracking spheres collided with the sound spheres at different speeds was consistent.
30. The change in sound when the tracking spheres collided with the sound spheres at different speeds was appropriate.
31. The visual feedback when the tracking spheres collided with the sound spheres at different speeds was apparent.

Angle Control

For each of the statements below select one response only:
Strongly Disagree / Disagree / Neither Agree nor Disagree / Agree / Strongly Agree

32. It was easy to control the angle of the tracking spheres when colliding with the sound spheres.

33. The change in sound when the tracking spheres collided with the sound spheres at different angles was apparent.

34. The change in sound when the tracking spheres collided with the sound spheres at different angles was consistent.

35. The change in sound when the tracking spheres collided with the sound spheres at different angles was appropriate.
Pressure Control

For each of the statements below select one response only:
Strongly Disagree / Disagree / Neither Agree nor Disagree / Agree / Strongly Agree

36. It was easy to control the pressure of the tracking spheres when colliding with the sound spheres.

37. The change in sound when the tracking spheres collided with the sound spheres at different pressures was apparent.

38. The change in sound when the tracking spheres collided with the sound spheres at different pressures was consistent.

39. The change in sound when the tracking spheres collided with the sound spheres at different pressures was appropriate.

40. The visual feedback when the tracking spheres collided with the sound spheres at different pressures was apparent.
Using Multiple Controls

For each of the statements below select one response only:
Strongly Disagree / Disagree / Neither Agree nor Disagree / Agree / Strongly Agree

41. I was able to distinguish the application of more than one control parameter at a time.

42. Rank each of the control parameters from 1 to 4 in order of ease of control (1 being the easiest and 4 being the hardest).
    Angle: ___ Position: ___ Pressure: ___ Speed: ___
    Note: No two controls may be given the same rank.

43. Rank each of the control parameters from 1 to 4 in order of importance to musical outcomes (1 being the most important and 4 being the least important).
    Angle: ___ Position: ___ Pressure: ___ Speed: ___
    Note: No two controls may be given the same rank.
Reproducibility

Answer this section only if you took part in the reproducibility test.

For each of the statements below select one response only:
Strongly Disagree / Disagree / Neither Agree nor Disagree / Agree / Strongly Agree

44. I was successfully able to play a simple piece of music repeatedly.

45. I used the control parameters to add expression whilst playing the piece of music.

46. I was able to play the piece of music in perfect time.

General Comments

47. What did you like most about the Sound Spheres VMI?

48. What did you like least about the Sound Spheres VMI?

49. Would you like to play the Sound Spheres again? ☐ No ☐ Yes (tick one only)
Appendix C – User Study Questionnaire Results - Likert Scale

The table below shows the responses to the Likert scale questions on the user study questionnaire given by each participant. The mode and median are shown for the responses to each question. The Likert scale used a bipolar scoring as follows:-

-2 Strongly disagree, -1 Disagree, 0 Neither agree nor disagree, 1 Agree, 2 Strongly agree

A score of 1 or 2 (Agree or Strongly Agree) was considered a positive response. There were three exceptions to this (questions 8, 9 and 10, shown with an asterisk suffix), where a favourable response was a negative one; in these cases a score of -2 or -1 (Strongly Disagree or Disagree) was considered a positive response.
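
As an illustration only (this sketch is not part of the Sound Spheres deliverables, and the module and function names are invented for the example), the bipolar scoring and its reverse-scored exceptions could be applied as follows:

' Illustrative sketch only (not part of the Sound Spheres deliverables):
' applying the bipolar Likert scoring, with questions 8, 9 and 10 reverse-scored.
Imports System.Collections.Generic
Imports System.Linq

Module LikertScoringSketch
    ' Questions where a favourable response is a negative score.
    Private ReadOnly ReverseScored As Integer() = New Integer() {8, 9, 10}

    ' True when a raw score (-2..2) counts as a positive response to a question.
    Public Function IsPositive(question As Integer, score As Integer) As Boolean
        If ReverseScored.Contains(question) Then
            Return score <= -1   ' Disagree or Strongly Disagree
        End If
        Return score >= 1        ' Agree or Strongly Agree
    End Function

    ' Median of the scores given to one question across all participants.
    Public Function Median(scores As IList(Of Integer)) As Double
        Dim sorted = scores.OrderBy(Function(s) s).ToList()
        Dim mid = sorted.Count \ 2
        If sorted.Count Mod 2 = 0 Then
            Return (sorted(mid - 1) + sorted(mid)) / 2.0
        End If
        Return sorted(mid)
    End Function
End Module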

Appendix D – User Study Questionnaire Results - Control Parameter Rankings

The tables below show the responses to the ranking questions (42 and 43) on the user study questionnaire given by each participant. Participants were asked to rank each control parameter from 1 to 4 based on the ease of using the control and the importance of the control parameter to musical outcomes. An overall ranking was determined by scoring each response and totalling the scores for each control parameter. The control parameter with the highest score was ranked first.
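
The precise scoring scheme is not stated beyond this; one scheme consistent with the description (an assumption for illustration, not taken from the dissertation) awards each control parameter $5 - r$ points for a rank of $r$ and totals these across participants:

$$S_c = \sum_{p=1}^{P} \left(5 - r_{p,c}\right)$$

where $r_{p,c}$ is the rank (1 to 4) given by participant $p$ to control parameter $c$, and the parameter with the highest total $S_c$ is ranked first overall.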

Control Parameter Rankings By All Participants

Control Parameter Rankings By Prototype Review Participants

Control Parameter Rankings By Musicians

Appendix E – Spearman’s Rank Correlation Results

Results for Spearman's rank correlation coefficient (Rho) for selected Likert question pairings are shown in the table below. The P-value and significance percentage are also shown. Typically one would reject the null hypothesis when the P-value < 0.05 (a statistical significance of 95%), and results that attain this level of significance have been highlighted in red. The percentage of positive and non-positive responses has been included for each question to aid analysis of the result.
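
For reference, when there are few or no tied ranks, Spearman's rank correlation coefficient for $n$ paired responses reduces to the standard formula

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\left(n^2 - 1\right)}$$

where $d_i$ is the difference between the ranks of the two responses in pair $i$.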

Appendix F – Mann-Whitney U-Test Results

The U1 and U2 values for the Mann-Whitney U-Tests were calculated from the rank sums R1 and R2 of the two groups (of sizes n1 and n2), using the standard formulas shown below:

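$$U_1 = n_1 n_2 + \frac{n_1\left(n_1 + 1\right)}{2} - R_1 \qquad\qquad U_2 = n_1 n_2 + \frac{n_2\left(n_2 + 1\right)}{2} - R_2$$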
The following tables show the n1, n2, R1, R2, U1 and U2 values. The U1 or U2 values were compared with Mann-Whitney U distribution critical values at a significance level of 0.05. A red flag is shown in the table against any U1 or U2 value that falls below the corresponding critical value, indicating that the null hypothesis can be rejected and that a significant difference has been found.

Results for Musicians and Non-Musicians

Results for Prototype Review Participants and First Time Participants

Appendix G – User Study Interview Responses

Appendix H – Categorization of Qualitative Results

Appendix I – Ethical Issues

A number of ethical issues were considered for the Sound Spheres research project. These are identified and discussed here.

Health and Safety

A central component of the Sound Spheres research project is a user study of a new system involving a number of participants. Whilst the participants are volunteers (i.e. willing participants), it was important that they were made fully aware of any potential risks to their health and safety before agreeing to participate in the user study. In this regard I identified three specific potential risks that I felt ethically bound to disclose.

Firstly, the Sound Spheres system involves the use of an infrared (IR) illuminator. One must therefore ask whether this is a safe technology to which to subject participants. There is certainly contradictory guidance across various publications on the safe use of such equipment. However, my research into this area (as outlined later in this section) provides convincing evidence that this technology is safe when appropriate precautions (i.e. strength of IR LEDs, safe working distances and durations of use) are taken. Without adhering to these precautions there is a small risk that a participant's health and safety could be compromised, namely a potential risk of eye damage. These risks, and the evidence of safe use, were fully disclosed to participants during the selection process for the user study.

For the Sound Spheres VMI the natural playing position of the hands is at torso height with the arms bent at the elbows. The IR illuminator is therefore also positioned at torso height and raised or lowered depending on the height of the player. This variation in height is most prominent between a child and an adult. Therefore, in addition to adjusting the height, a shield has been used to cover the IR illuminator so that infrared light cannot be directed towards the player's eyes.

Secondly, there are ergonomic considerations. Playing any musical instrument for the first time requires individual adaptation, and playing the Sound Spheres virtual instrument is no exception. Playing requires the participant to sit with their arms extended, pointing one or more fingers towards the computer display; movement of both the arms and fingers is necessary. Lee (2008) suggests that the onset of fatigue happens quite quickly for this type of activity. I believe it is ethical to warn participants of this fact, especially as exercising muscles that are not generally used may result in stiffness or soreness some time after the exercise has stopped.

Lastly, consideration was given to the specific medical condition called 'photo-sensitive epilepsy'. People with this condition may find that moving or flickering light can cause problems, and this can include computer screens. Whilst this condition is quite rare (only 3-5% of people with epilepsy are in fact photo-sensitive), it is ethical to warn participants of this risk and to ensure that anybody with known epilepsy is not selected to participate.

I have discussed all of these issues with participants prior to selection. Additionally, I have asked all participants to sign an acknowledgement form (prior to taking part in the user study) confirming that they were told about and understand the risks involved. This form also acted as a parental consent form for any participants considered minors. In Abu Dhabi (the location of this research) anybody less than 21 years of age is considered a minor.
Discrimination

The Sound Spheres virtual instrument makes no specific provision for users with disabilities, and it can therefore be argued that it discriminates against such users. Users of the system must have use of their upper limbs and must not have severely impaired vision. Is it ethical, therefore, to develop a software system that is not accessible to all? Is it ethical to develop a musical instrument that is not playable by all?

Whilst issues of accessibility for those with physical disabilities have seen much ethical debate, I believe that in the case of Sound Spheres there is no real ethical debate to be had. Looking at the Sound Spheres system from a musical instrument perspective supports this view: as stated in the project overview, the overwhelming majority of musical instruments require physical contact to play them, and thus will logically discriminate against those with certain physical disabilities. Looking at the Sound Spheres system from a software application perspective also supports this notion. I would suggest that the majority of software-based systems make no specific provision for those with disabilities. Also, the Software Engineering Code of Ethics and Professional Practice document (published by the IEEE Computer Society) simply states that one must "consider issues of physical disabilities" where appropriate, but does not go as far as to suggest that this should apply to all cases.
Appendix J – Sound Spheres Setup and Installation

System Components

The Sound Spheres VMI comprises the following software and hardware components.

• The Sound Spheres software application and supporting software libraries.

• Sony Vaio Z-Series laptop computer running the Windows 7 operating system, with an Intel Core i7 CPU @ 2.67 GHz, 6GB of memory, and an NVIDIA GeForce GT 330M graphics card.

• Samsung LD220 wide-screen monitor.

• Bluetooth adapter.

• BlueSoleil bluetooth driver.

• M-Audio Black Box. This is essentially a computer interface for home recording. For the purposes of the Sound Spheres VMI I am using it for its 24-bit sound card, which is needed to implement some of the effects (such as stereo panning) of the Microsoft DirectSound software library; a minimal panning sketch is shown after this list. There are additional features of this device that make for fun and interesting playing of the Sound Spheres VMI, such as 100 pre-set drum beats and various effects processors. These features, however, are outside the scope of this project. All that is essential is the 24-bit sound card.

• M-Audio Studiophile BX5a 70 watt bi-amplified studio reference speakers. These are used for audio playback and, combined with the sound card of the M-Audio Black Box, they provide the VMI with a very professional sound quality and allow for high volumes without distortion. They are also magnetically shielded, which makes them ideal for desktops where they are used with computer monitors.

• Wiimote controller.

• Infrared LED arrays of 32 LEDs, with a wavelength of 850 nm.

• Cover for the infrared LED array and the Wiimote, shielding light from the player's eyes.

• Two-tiered desk and adjustable chair.

• Four reflective markers.

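As an illustration of the panning effect mentioned above, the following minimal sketch (not taken from the Sound Spheres source; the file name, owner form and pan value are assumptions for the example) plays a wave file through Managed DirectX DirectSound with the buffer panned to the right:

' Minimal sketch: playing a wave file with stereo panning via Managed
' DirectX DirectSound. The file name and owner form are illustrative.
Imports Microsoft.DirectX.DirectSound

Public Class PanningSketch
    Public Sub PlayPanned(owner As System.Windows.Forms.Form)
        ' Create the DirectSound device and attach it to the application window.
        Dim dev As New Device()
        dev.SetCooperativeLevel(owner, CooperativeLevel.Priority)

        ' The buffer must be created with pan control enabled.
        Dim desc As New BufferDescription()
        desc.ControlPan = True

        ' Load the pre-recorded wave file into a secondary buffer.
        Dim buffer As New SecondaryBuffer("note.wav", desc, dev)

        ' Pan ranges from -10000 (full left) to 10000 (full right).
        buffer.Pan = 5000
        buffer.Play(0, BufferPlayFlags.Default)
    End Sub
End Class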
The connectivity of the Sound Spheres hardware components is shown below.

[Figure: Sound Spheres Hardware Component Connectivity¹ – laptop computer, wide-screen monitor, M-Audio Black Box, M-Audio BX5a reference speakers, bluetooth adapter, Wiimote and LED arrays.]

Physical System Setup

The components are essentially set up on a two-tiered desk. The top tier is used as a surface on which to stand the speakers and computer monitor. The lower tier is used as a surface on which the Wiimote and LED arrays are placed. The separate tiers enable the Wiimote and LED arrays to be positioned horizontally central to the monitor and speakers, but without obstructing the player's view of the monitor. The lower tier is set at 70 cm above the ground and represents the minimum height at which the Wiimote can be placed. The Wiimote and LED arrays can be adjusted up or down as desired to suit the height and natural playing position of each player. The top tier is set 20 cm above the lower tier and hence allows the height of the Wiimote to be adjusted by up to 20 cm without obstructing the monitor.

¹ Images have been taken from the corporate websites of Samsung, M-Audio, Sony and Nintendo.

[Figure: Sound Spheres Physical Setup²]

The players also sat on an adjustable chair, which allowed them to raise or lower their playing position. The positioning of the laptop and the M-Audio Black Box device is not important, as long as they do not obstruct the player's view of the monitor or the LED arrays. The reference speakers are positioned either side of the monitor so that stereo effects are maximized.

² Images have been taken from the corporate websites of Samsung, M-Audio, Sony and Nintendo.
System Failure

An event occurred during the first user study session that, whilst it initially put the research project at risk, provided a valuable insight worth further study. One of the infrared LED arrays stopped working and could not be repaired. With no time to procure and construct another, an attempt was made to get the system working with just one LED array. With this configuration movement of the tracking spheres was possible, but they would only move in a limited area of the user interface. A different algorithm for calculating coordinates was tried to overcome this but was unsuccessful. Through desperate trial and error it was discovered that when the Wiimote was placed at a slight angle to the LED array the movement improved; however, it was still stilted and inconsistent. A breakthrough came when the Wiimote camera was partially (and accidentally) obscured by the back of the LED array: the movement of the tracking spheres became very consistent, fluid, and more responsive than when two LED arrays were working. The Sound Spheres playability vastly improved! Due to project time constraints, the technical reasons why performance improved were not investigated.
Software Installation

This section outlines the installation steps for the Sound Spheres VMI. The Sound Spheres VMI is available online in a compressed folder at the following location:-

http://dl.dropbox.com/u/18234532/SoundSpheres.zip

Note: The software will also be made available on request to Craig.G.Hughes@gmail.com

Note: The system has been developed and tested using a laptop computer running the Windows 7 operating system, with an Intel Core i7 CPU @ 2.67 GHz, 6GB of memory, and an NVIDIA GeForce GT 330M graphics card. The Sound Spheres VMI should be installed and run on a workstation or laptop of an equivalent specification.

1. Ensure pre-requisite software has been installed:

   • Microsoft DirectX For Managed Code 1.0.2902.0
   • Microsoft .Net 2.0
   • Microsoft .Net 3.5

2. Download the compressed folder (SoundSpheres.zip) from its online location (see above) and extract the contents to a desired folder (the installation folder). Follow the steps outlined in the Installation.txt file delivered in the installation folder.

The alternative is to download the Sound Spheres VMI source code (as detailed in Appendix L) and rebuild the Sound Spheres solution.
Starting the Sound Spheres VMI

Note: Before the Sound Spheres VMI software can be started (executed) it is necessary to establish a bluetooth connection between the Wiimote controller and the laptop (or workstation). This is perhaps the most challenging part of the setup. Several bluetooth stacks / drivers were tried without success; even the bluetooth drivers delivered by Microsoft failed. Searching on the internet revealed that this is a common problem. The bluetooth driver that seemed to have the best success rate was the BlueSoleil driver, which can be downloaded from www.bluesoleil.com.

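Once the pairing is in place, the application connects to the Wiimote through the WiimoteLib library (Peek, 2009). The following minimal sketch assumes the later WiimoteLib 1.x API; it is not taken from the Sound Spheres source, and the console output is purely illustrative of reading a tracked IR point:

' Minimal sketch: connecting to a paired Wiimote with WiimoteLib and
' reading tracked IR points. Not the Sound Spheres implementation.
Imports WiimoteLib

Module WiimoteConnectionSketch
    Private WithEvents wm As New Wiimote()

    Sub Main()
        Try
            wm.Connect()                                ' Requires an existing bluetooth pairing
            wm.SetReportType(InputReport.IRAccel, True) ' Enable IR camera reporting
            wm.SetLEDs(1)                               ' Light LED 1 to confirm the connection
        Catch ex As WiimoteNotFoundException
            Console.WriteLine("Unable to establish a connection with the Wiimote controller.")
        End Try
    End Sub

    ' Fires whenever the Wiimote reports new data, including tracked IR points.
    Private Sub wm_WiimoteChanged(sender As Object, e As WiimoteChangedEventArgs) _
            Handles wm.WiimoteChanged
        Dim ir = e.WiimoteState.IRState.IRSensors(0)
        If ir.Found Then
            ' Position is normalised to the range 0..1 across the camera's field of view.
            Console.WriteLine("IR point at {0:F2}, {1:F2}", ir.Position.X, ir.Position.Y)
        End If
    End Sub
End Module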
Steps to start the Sound Spheres VMI are as follows:-

1. Ensure the laptop's bluetooth adapter is switched on (or inserted if it is an external USB type adapter).

2. Establish a bluetooth connection between the Wiimote and the laptop.

3. Position the IR LED array(s) and the Wiimote as shown in Section 4.6. Turn on the IR LED array(s).

4. Execute the SoundSpheres.exe application from the installation folder. The Sound Spheres splash screen will be displayed while the system is initializing.

5. If the application is unable to establish a connection with the Wiimote controller then an error message will appear.

6. The Sound Spheres user interface is then displayed, regardless of whether the Wiimote could be initialized. This was to facilitate testing, as the sound spheres can also be played with the tracking sphere controlled by the mouse.
Appendix K – Potential Enhancements

• The Sound Spheres VMI limits movement of the tracking and sound spheres to a fixed z-coordinate, effectively preventing any depth of field or movement in 3D space. This design was primarily due to the fact that a single Wiimote was used, and hence the position of the finger tracking markers in 3D space cannot be calculated. Using multiple Wiimotes would enable 3D positioning of the finger tracking markers and a richer (and perhaps more appealing) graphical interface where both the tracking and sound spheres could be positioned along the z-axis as well as the x-axis and y-axis. This would perhaps enable more diverse musical outcomes through different playing techniques and control-to-sound mappings. Visual feedback might also be improved. Furthermore, the use of multiple Wiimotes would reduce (or possibly eliminate) occlusion problems.

• The Sound Spheres VMI uses pre-recorded sound wave files for audio feedback. The sounds generated by the Sound Spheres VMI could instead be implemented using true MIDI output. This would enable the VMI to be used in conjunction with other MIDI devices for variable sound generation, and hence the player could assign practically any sound to the sound spheres.

• The implementation of tone, cutoff and decay controls would enable richer audio feedback and perhaps enhance musical outcomes and improve appeal. Similarly, biasing, where one parameter needs to be activated to a certain threshold before another one can have an effect, could be implemented.

• The Sound Spheres VMI is limited to two fixed octaves. The ability for the player to dynamically change these octaves (during a performance) could be implemented.

• An improvement to visual feedback would be to make the sound sphere's direction of spin dependent on the position and angle of a colliding tracking sphere. This would enhance visual feedback for both the player and the audience. Similarly, the spinning speed of the sound spheres is currently affected by the speed of a colliding tracking sphere; a further improvement would be to also make it dependent on the pressure control.

• The control-to-sound mappings and their parameters have been essentially hard-coded for this project. One improvement would be to make these mappings and their parameters configurable, so that the performer can configure the Sound Spheres VMI to their specific preference.
Appendix L – Sound Spheres Source Code

The full source code for the Sound Spheres VMI is available online in a compressed folder at the following location:-

http://dl.dropbox.com/u/18234532/Sound%20Spheres%20Source%20Code.zip

Note: The source code will also be made available on request to Craig.G.Hughes@gmail.com

The Sound Spheres VMI has been developed with Microsoft's Visual Studio Professional 2008 using the Visual Basic .Net programming language. The compressed folder contains the following 3 solutions:-

• Sound Spheres – The main solution file for this project. It can be accessed from /SoundSpheres/SounSpheres.sln and includes references to the two supporting solutions below.

• DirectXUtil – This solution is delivered with the Microsoft DirectX SDK. It is used to provide graphics classes for rendering the user interface.

• WiimoteLib – A .Net managed library developed by Peek (2009) used for handling and interpreting data from the Wiimote.

Software pre-requisites

• Microsoft DirectX For Managed Code 1.0.2902.0
• Microsoft .Net 2.0
• Microsoft .Net 3.5
Appendix M – Sound Spheres Videos

In order to show various aspects of the Sound Spheres VMI, a series of short videos has been created and made available online.

The following videos show one of the participants of the user study playing the Sound Spheres VMI during the first discovery and free play session.

http://dl.dropbox.com/u/18234532/Sound%20Spheres%20Video%201.MOV

http://dl.dropbox.com/u/18234532/Sound%20Spheres%20Video%202.mov

The following short videos attempt to show the change in sound as each of the control parameters is used:-

• Speed Control Parameter
http://dl.dropbox.com/u/18234532/SoundSPheres%20-%20Speed.AVI

• Pressure Control Parameter
http://dl.dropbox.com/u/18234532/SoundSPheres%20-%20Pressure.AVI

• Angle Control Parameter
http://dl.dropbox.com/u/18234532/SoundSpheres%20-%20Angle.AVI

• Position Control Parameter
Note that the camera used to create the video does not have stereo capabilities and hence it was not possible to show the sound being panned left and right. However, the video does demonstrate the use of the flying sparks for the different positions of the tracking sphere.
http://dl.dropbox.com/u/18234532/SoundSPheres%20-%20Position.AVI

The following video shows one of the participants of the user study playing a simple
melody during the reproducibility test.

http://dl.dropbox.com/u/18234532/SoundSpheres%20-%20SimpleSong.AVI
