
The spatial complexity of auditory scenes

Research Proposal

Faculty of Architecture, Design and Planning
The University of Sydney

Submitted in fulfilment of the first-year requirements for the degree of Doctor of Philosophy
Submitted by: Luis Alejandro Miranda Jofre
Supervisor: Dr. Densil Cabrera (Faculty of Architecture, Design and Planning)
Associate Supervisor: Dr. Craig Jin (Faculty of Electrical Engineering)



Abstract
The research project proposed here was motivated by the increased interest in understanding the factors that affect the perception of quotidian spaces. Several soundscape studies have been conducted in recent years, aiming to relate soundscape attributes to users' preferences. However, besides some expected common findings (e.g. natural sounds are preferred to mechanical sounds), there seems to be very little consensus as to what makes people prefer a certain soundscape. The proposed research project steps back from the soundscape approach of focusing on naturally occurring scenes and analysing them holistically, and aims to isolate and examine specific factors that are common in natural everyday acoustic scenes. The research project will focus on examining the spatial characteristics of an auditory scene. The two spatial characteristics under study will be the number and distribution of sources, and reverberation. The concept of spatial complexity is introduced by analysing the effect of sources, reverberation and their spatial distribution on the perception of an acoustic scene. The research project combines common auditory scene analysis techniques (speech segregation, localisation, event detection) with the focus on ecological validity that is customary in soundscape research (using speech and naturally occurring sounds as test signals, non-anechoic conditions, ecologically validated reproduction systems). Advances in understanding the spatial complexity of a given acoustic scene could serve as a starting point for comprehending everyday acoustic scenes from their most elementary characteristics.

Table of Contents
The spatial complexity of auditory scenes
Abstract
1. Introduction
2. Literature Review
2.1 Soundscape studies and ecological validity
2.2 Acoustic and auditory scene analysis applied to everyday spaces
2.3 Perception of complexity
2.4 Soundfield simulation and acquisition using acoustic holography
2.5 Conclusions
3. Research Question
4. Research Design and Methods
4.1 Research Design Outline
4.2 Methods of data collection
4.2.1 Quantitative methods of data collection
4.2.2 Qualitative methods of data collection
4.3 Methods of data analysis
4.3.1 Quantitative methods of data analysis
4.3.2 Qualitative methods of data analysis
4.3.3 Data conversion
4.3.3.2 Laboratory qualitative data conversion
5. Work Schedule
6. References

1. Introduction
The purpose of this study is to advance knowledge of the perception of everyday auditory scenes by identifying, isolating and examining the influence of the spatial distribution of sources and the spatial distribution of reverberation.
The starting point of this research project was the study of soundscapes, departing from the common soundwalk approach. An extensive literature review (included in the following section) shows that, beyond some very general recommendations, there is very little consensus on which characteristics make one soundscape preferable to another. This has led to the idea that a different approach to study and analysis could lead to a better understanding of preferences for everyday auditory scenes.
Auditory scene analysis has long been a staple of auditory research. It was greatly propelled by Bregman's collection of studies in his volume Auditory Scene Analysis (Bregman, 1990) and by the subsequent surge in computational techniques based on these studies. Computational auditory scene analysis has placed a special focus on localisation, pitch detection, dereverberation and segregation, particularly of speech signals. This approach has been very valuable, as it has found several applications in fields such as video conferencing and forensic acoustics. However, as the complexity of the auditory scene increases, the ability of computational models to perform as well as human subjects decreases. This has led to the idea that auditory scene analysis could be used in a different way, where the task is more akin to description and analysis.
The last important topic under research is the concept of auditory complexity. While there is a significant body of sources and theories on visual complexity, very few studies investigate auditory complexity. Informal listening suggests that as the number of sources in an auditory scene increases, the perceived complexity also increases. Likewise, for a fixed number of sources, increasing the reverberation in an acoustic scene tends to increase the perceived complexity.

Based on the ideas described in the previous paragraphs the research proposed here addresses the study of the spatial complexity of auditory scenes, using a signal approach common to auditory scene analysis, while trying to preserve the ecological validity of the study, comparable to soundscape studies. One more factor to consider is the analysis of existing scenes. During the research process spatial analysis of everyday scenes will be performed with the aid of microphone arrays capable of decomposing the soundfield into its spherical harmonic components. This type of array provides the ability to perform highly directional beamforming, benefitting the analysis of the spatial characteristics of a sound field. It is expected that the development of measurements that take into account the spatial characteristics of a soundfield could lead to a better understanding of the perceived spatial characteristics of a soundfield. The research project presented here has the potential to lead to important advances in applied acoustics. One of the main drivers behind soundscape research in recent years has been the increasing interest in defining design guidelines for everyday spaces. Large projects in Europe have been funded to understand soundscape preferences. The present study could provide the basis for design guidelines that could be applied in the process of planning everyday spaces. These guidelines would be based on the performance of subjects in auditory scenes of different spatial complexity, and characteristics that are defined based on spatial measurements.

2. Literature Review
The literature review presents the main ideas that have led to the research question. The structure of this section is intended to help the reader follow related blocks of literature while guiding the reader towards the research question.

2.1 Soundscape studies and ecological validity


The study of the perception of urban sonic environments as a whole rather than in parts can be traced back several decades (Southworth, 1969). R. Murray Schafer conducted the initial documented studies of how sonic environments are perceived in the 1960s. The term soundscape, as a sonic analogy to the term landscape, was first adopted by Schafer in his influential 1977 book The Tuning of the World (for the latest edition see Schafer, 1993). In this book he proposes several techniques that would become the customary soundscape research tools. His work was a starting point for further studies that set out to analyse the acoustic environment in an inclusive way, using techniques from the physical as well as the perceptual sciences (Truax, 2001).
One of the most widely used techniques proposed by Schafer is the soundwalk. Soundwalks have been adopted since Schafer's proposal and are the preferred tool for detailing how subjects experience a soundfield, most commonly in open spaces (Semidor, 2006). Commonly, the soundwalk process involves leading subjects into the space or spaces of interest, asking them to shift their attention to the sonic environment and, based on their listening experience, having them answer questionnaires designed by the researcher. A common aim of these questionnaires is to understand users' preferences and the perceived characteristics of the spaces studied. Open-ended questionnaires and rating of attributes are common techniques (Raimbault, 2006, Ge and Hokao, 2004). Statistical analysis is often used to find commonalities in judgements (Berglund and Nilsson, 2006).
Several studies have been conducted under this paradigm. In these studies, demographic factors and the type of sound being evaluated have proven to be important factors in soundscape preference. Age and education appear to be important factors in soundscape evaluation; nature and cultural sounds are preferred to industrial or technological sounds (Kang and Zhang, 2010, Yang and Kang, 2005b, Yang and Kang, 2005a, Mace et al., 1999, Kariel, 1990, Han et al., 2010, Yu and Kang, 2008).
An extensive literature review of soundscape studies using the soundwalk method in urban areas can be found in Zhang and Kang (2007), and of interior urban spaces in Kang (2007). Artificial neural networks have been used to analyse preference information gathered with the methods described above, but their efficacy is limited to predicting preferences in the space where the preference information was gathered (Yu and Kang, 2009). Furthermore, it has been proposed that predicting soundscape preferences requires different disciplines to study individual aspects of the soundscape, and that convergence between these disciplines will be needed to fully understand the perception of soundscapes. Examples are given below.
As a first example of interdisciplinary soundscape research, it has been proposed that to understand soundscape preference, not only the acoustical properties of the soundfield need to be evaluated, but also the meaning or semiotics of the individual sounds that form the soundscape (Jekosch, 1999). Focusing on sound source and preference, linguistic studies suggest that subjects tend to group sound events according to their source when presented with complex auditory scenes. This differs from situations where subjects are presented with music, in which they tend to analyse stimuli according to the characteristics of the sound rather than the source it belongs to (Guastavino, 2007, Guastavino, 2006). This is linked to the studies mentioned above, where it was found that natural sources are preferred to human or industrial sources.
Expanding on these findings on sound sources, a two-part study has been conducted. In the first part, the most preferable natural sounds were identified and recorded. Subsequently, the effect on preference of the ratio of natural to urban sounds, expressed as a signal-to-noise ratio, was studied (Jeon et al., 2010).
The perception of environmental sounds as foreground and background sound and its relation to soundscape preference has also been studied. Studies have been conducted whose intended outcome is the classification in time and space of background and foreground sounds in an environment. The soundscape is compared to the landscape as its visual counterpart (Mazaris et al., 2009).
Notice-events are proposed as a model to explain the perception of environmental sounds; in this study the following is stated: "Our key hypothesis states that the perception of environmental sound is primarily determined by consciously noticed sounds." This theory is evaluated against road noise. Auditory scene analysis is considered, but a simplification is preferred: the model uses a gate-type analysis triggered solely by level. This separates noticed events from unnoticed events lying in the background. The gating of events is based on temporal factors of the sound being studied as well as on previous inputs into the system (De Coensel et al., 2009).
Psychoacoustic methods have been proposed to evaluate soundscapes (Genuit and Fiebig, 2006, Raimbault et al., 2003). However, studies that take into account human preference for soundscapes in their context suggest that psychoacoustic metrics do not completely predict soundscape preference (Lam et al., 2010). In addition, studies show that a combination of psychoacoustic data and sound source identification is better at predicting soundscape preference. Using a multiple regression model, soundscape preference has been linked to psychoacoustic as well as sound source properties (Lavandier et al., 2006). In other studies, proven audio techniques have been proposed to analyse soundscapes with regard to quality. This approach, taken in conjunction with conventional approaches to soundscape analysis, has proven successful in identifying soundscape preferences (Ljungdahl Eriksson, 2009).
Finally, it should also be noted that the soundscape is seldom presented as a unique sensory stimulus, and that soundscape evaluation is linked to visual stimuli (Abe et al., 2006, Pheasant et al., 2008, Viollon et al., 2002, Carles et al., 1999).
Key ideas: Based on the soundscape literature review presented, it should be noted that there is very little consensus on what makes one soundscape preferable to another. It should also be noted that most signal analysis is based on binaural signals and that the spatial and temporal organisation of events that commonly occurs when listening in a scene is seldom taken into account. However, the ecological validity of these studies cannot be doubted, as they all take place in or are derived from real-life scenes. The proposed research project aims to maintain the ecological validity of soundscape studies while incorporating new elements that aid in the physical and perceptual description of everyday scenes.

2.2 Acoustic and auditory scene analysis applied to everyday spaces

In brief, acoustic analysis refers solely to the study of the physical characteristics of sound in space. Auditory analysis, on the other hand, incorporates a further layer of analysis that includes how sound is perceived by a human listener.
Bregman's Auditory Scene Analysis (Bregman, 1990) is considered the first major work on auditory scene analysis. In this influential work he set out the principles of auditory scene analysis from a psychological point of view, and these now constitute the foundation of computational auditory scene analysis (CASA). Bregman proposes two primary means by which listeners integrate and differentiate sound sources.
The first is sequential integration. Sequential integration refers to the auditory ability to integrate sonic sequences in time as if they belonged to the same stream. This in turn helps to identify sound sources, as each sonic stream is perceived as coming from a single source. Sequential integration in Bregman's studies is mainly based on the temporal characteristics of the stream, although it is mentioned that spatial characteristics aid in the integration process.
The second is based on the integration of simultaneous components in a complex auditory scene. Different sources are segregated and differentiated using spectral cues. Again, it is mentioned that spatial cues aid in this process of segregation, but Bregman does not develop this further. There is also a brief mention suggesting that the spatial attributes of multiple sound sources, whether similar or dissimilar, affect the psychoacoustic characteristics of a complex scene. This is mentioned in relation to roughness.
CASA combines the concepts put forward in Bregman's work with digital signal processing techniques. Some of the techniques used in CASA could provide the basis for a spatial acoustic and auditory scene analysis.

Some of the sonic features studied under CASA are the following.
Fundamental frequency detection: One of the most basic parameters that can be examined in an acoustic or auditory scene is the fundamental frequency, or F0, of a signal. Algorithms for the estimation of F0 are abundant in the literature (Hess, 1983) and good solutions have been provided (Cheveigne and Kawahara, 2002). However, it would be beneficial to estimate the F0 of several sources in a complex acoustic or auditory scene. The theories behind multiple-F0 estimation in the spectral domain (Parsons, 1976) and in the time domain (Frazier et al., 1976) are not new; nevertheless, room for improvement still exists. Approaches combining separation in the spectral and time domains have also been proposed and provide better results (Jin and Wang, 2010, Mingyang et al., 2003). Based on F0 estimation, algorithms have been proposed to estimate the number of sources in a soundfield (Klapuri, 2003).
Onset and offset detection: Acoustic analysis can be performed on the onset and offset of sound signals, which allows basic temporal parameters to be investigated. An onset can be defined as a noticeable increase in the level of the signal under study, while an offset can be defined as a noticeable decrease in the level of the signal under study (Hu and Wang, 2007).
Envelope extraction: A common signal analysis technique is envelope extraction by means of the Hilbert transform. Furthermore, it has been proposed that the spectrum of the envelope of a signal can be used in auditory scene analysis for signal separation, as it correlates with psychoacoustical experimental data (Kollmeier and Koch, 1994).
Spatial location: Another basic parameter of a sound source is its location. Binaural techniques exploit interaural time and spectral differences to situate sources in space. Furthermore, beamforming with microphone arrays can provide higher resolution for source localisation in space. It has been proposed that a combination of fundamental frequency detection and source localisation through beamforming could provide better results in separating multiple sound sources (Drake et al., 2009). It has also been proposed recently that the use of statistical methods and spherical microphone arrays could be beneficial for identifying source location and for source separation (Epain et al., 2010).
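Two of the features listed above, onset and offset detection and single-source F0 estimation, can be illustrated with a minimal sketch. It is not the method of any of the cited studies. It assumes that a "noticeable" change in level is a change of at least 10 dB between consecutive frames, and it estimates F0 from the lag that maximises a plain autocorrelation. All function names, the threshold and the search range are hypothetical choices made for illustration.

```python
import math

def sine_tone(freq_hz, sample_rate, num_samples, amplitude=0.5):
    """Generate a test tone (illustrative helper, not part of any cited method)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

def frame_levels_db(signal, frame_len):
    """RMS level in dB of consecutive, non-overlapping frames."""
    levels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-12)))  # floor avoids log(0)
    return levels

def detect_onsets_offsets(levels_db, threshold_db=10.0):
    """Return frame indices where the level rises (onset) or falls (offset)
    by at least threshold_db relative to the previous frame.
    The 10 dB default is an assumed value for a 'noticeable' change."""
    onsets, offsets = [], []
    for i in range(1, len(levels_db)):
        change = levels_db[i] - levels_db[i - 1]
        if change >= threshold_db:
            onsets.append(i)
        elif change <= -threshold_db:
            offsets.append(i)
    return onsets, offsets

def estimate_f0_autocorr(signal, sample_rate, f_min=80.0, f_max=500.0):
    """Estimate F0 as sample_rate / lag, where lag maximises the autocorrelation
    within the assumed search range [f_min, f_max]."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(signal) - 1)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

if __name__ == "__main__":
    sr = 8000
    tone = sine_tone(200.0, sr, 1600)
    scene = [0.0] * 800 + tone + [0.0] * 800  # silence, tone, silence
    print(detect_onsets_offsets(frame_levels_db(scene, 400)))
    print(estimate_f0_autocorr(tone, sr))
```

A single isolated tone is the easiest possible case. With several simultaneous sources, a frame-level threshold and a single autocorrelation peak are no longer sufficient, which is the situation the multiple-F0 and separation approaches cited above address.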

Key ideas: Auditory scene analysis is well established and provides a good basis for the study of complex scenes based on signals as perceived by human subjects. The research project proposed here aims to combine this approach with the ecological validity of soundscape research. It is expected that the precision that is usually sought after in computational auditory scene analysis tasks (e.g. segregation and localisation), will be replaced by description of the complexity of an auditory scene based on a signal analysis correlated to human perception.

2.3 Perception of complexity


Research in visual perception usually predates research on auditory perception, and the perception of complexity is no exception. Theories of visual complexity based on different models, ranging from Gestalt psychology to algorithmic information theory, have been available for decades (for a review the reader is referred to Donderi, 2006). There is also a good amount of information on the visual perception of scenes and of objects within scenes, as well as on the role of visual selective attention in perception and task performance (Goldstein, 2010, pp. 99-155).
In the field of audition there are very few studies on the perception of complexity. Some of the few available studies on auditory complexity have focused on musical complexity. In Scheirer (2000), several psychoacoustical parameters of musical excerpts were correlated with semantic descriptors of music, including complexity. The proposed model, constructed using linear regression techniques, is claimed to correlate well with perceived complexity. In Streich and Herrera (2004), several musical parameters such as melody, harmony, rhythm, timbre and structure are proposed as descriptors of the complexity of a musical piece as a whole. Scheirer (2001) applies algorithmic information theory to express the complexity of a signal; however, this is applicable only in information-theoretic terms and not to the perceptual characteristics of the signal.
It could be argued that several studies have examined the spatial complexity of auditory scenes indirectly, without addressing it explicitly. This has been done with experiments that use spatially separated signals. Common tasks are localisation and speech intelligibility tasks. Given that there is some overlap with the previous section, only studies with more than two sources are mentioned here, as these are deemed, for the purposes of this research, to have higher spatial complexity.
The phenomenon of focusing attention on a single talker in a multi-talker situation (also known as the cocktail party effect) has been widely studied (Bronkhorst, 2000). The following multi-talker studies are mentioned for their importance in forming a perceptual theory of spatial auditory complexity. In Kopco et al. (2010), the ability of listeners to localise a female voice amongst four spatially separated male voices is tested. In addition, the effect of knowing the location of the competing speech sources was tested. The results show that prior spatial awareness of masking sources aids speech localisation. The task is further complicated if the sources exhibit movement (Brungart and Simpson, 2007).
In Iwaya (2009), the realism of a sound scene is evaluated against the distribution of sound sources. The author allows unrestricted head movement when evaluating subjects, and proposes a spatial sound pressure distribution for which the scene is perceived with greater realism. In Santala and Pulkki (2011), the effects of multiple sources extending in two dimensions are examined. In this study the effect of contiguous multiple sources in the horizontal plane is tied to the perception of sources. It is shown that as the pattern complexity and the number of sources increase, subjects become less effective at localising sources and identifying their number. In Braasch and Hartung (2002), it is shown that reverberation reduces the ability of a listener to localise a source when it is presented simultaneously with a distracter. It is important to note that this study was conducted entirely over headphones.
Key ideas: The common trait of the studies presented in the latter part of this section is that they provide the basis for a study of spatial complexity based on source number, localisation and reverberation. While spatial complexity is not specifically addressed, several pointers should be noted. As an important conceptual conclusion, it can be stated that as the number of sources and the reverberation increase, the performance of subjects in completing tasks decreases. Some methodological features of these studies that should be taken into account are the use of completely unrestricted head movement (aiding in maintaining ecological validity) and headphone-based presentation of stimuli (aiding in the uniformity of the stimuli presented).
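The algorithmic information view of complexity mentioned in this section can be illustrated with a compression-based proxy. The sketch below is an illustration only and is not the method of Scheirer (2001): it assumes that the size of a zlib-compressed signal relative to its raw size is a rough stand-in for algorithmic complexity. The function name and the 8-bit test signals are hypothetical.

```python
import random
import zlib

def compression_ratio(samples):
    """Compressed size divided by raw size for a sequence of 8-bit samples (0..255).
    Lower values mean the sequence is more regular and easier to describe."""
    raw = bytes(samples)
    return len(zlib.compress(raw, 9)) / len(raw)

if __name__ == "__main__":
    random.seed(0)  # fixed seed so the noise example is repeatable
    constant = [128] * 4096                                # perfectly regular
    periodic = [128 + (n % 32) for n in range(4096)]       # repeating ramp
    noise = [random.randrange(256) for _ in range(4096)]   # no structure
    for name, sig in (("constant", constant), ("periodic", periodic), ("noise", noise)):
        print(name, round(compression_ratio(sig), 3))
```

Such a measure ranks broadband noise as maximally complex, although listeners would not necessarily describe noise as a complex scene. This is consistent with the point made above that an information-theoretic measure does not by itself describe the perceptual characteristics of a signal.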

2.4 Soundfield simulation and acquisition using acoustic holography



"In the following, a number of realistic applications for AVEs (Auditory Virtual Environments) are listed. The list is based on projects which the Bochum Institute has been involved in. Such projects have been, e.g., auditory displays for pilots of civil aircraft, AVEs for the acoustic design and evaluation of space for musical and oral performances, for individual interactive movie sound, and for teleconferencing. Further, virtual sound studios, listening and control rooms, musical practicing rooms, and systems to generate artificial sound effects, especially so-called spatializers. In addition, there is the auditory representation in simulators of all kinds of vehicles, e.g., aircraft, passenger cars, trucks, train, motorcycles. Also, AVEs are in use for the archiving of cultural heritage, for training, e.g., police and fire-fighter training, for rehabilitation purposes, e.g., motoric training, and as an interactive interface to the world-wide web, e.g., an internet kiosk." (Blauert, 2005)

Auditory virtual environments have reached a level of sophistication in recent years that allows for real-world applications without compromising the quality or perceived realism of the reproduction. This is possible thanks to several recent advances in soundfield simulation and acquisition with high-resolution spatial characteristics, achieved through the use of wave-field synthesis and high-order ambisonics (Daniel et al., 2003, Berkhout et al., 1993). Acoustic holography is a commonly used term that refers to the rendition of three-dimensional soundfields with a high degree of accuracy.
Spherical microphones have been proposed to capture soundfields with high spatial resolution (Meyer and Elko, 2002). The bandwidth over which these microphones operate optimally is limited by the number of transducers and the spacing between them (Rafaely, 2005). The optimal bandwidth refers to the region where the microphone can record and decompose a soundfield without spatial errors. Researchers at the Department of Electrical Engineering at the University of Sydney have proposed a concentric microphone in which transducers are located on an open outer sphere and a rigid inner sphere, extending the operational bandwidth of the spherical microphone (Parthy et al., 2008).
The validity of using high-resolution spatial audio for listening experiments has been tested. Initial studies using first-order spherical harmonics have shown that two-dimensional arrays are preferred to three-dimensional arrays in laboratory conditions. This has been the case for simple audio reproduction preferences as well as for laboratory reproduction of test stimuli. It has also been shown that, with appropriate laboratory settings, laboratory listening experiments can achieve results similar to those of field studies (Guastavino et al., 2005, Guastavino and Katz, 2004).
Key ideas: Spherical harmonic decomposition of a sound scene is possible thanks to spherical microphone arrays. It is intended that the early research stages will include the use of microphone arrays to record and analyse everyday scenes. The development of a suitable experimental reproduction setting will likely involve spherical harmonic domain applications at a later stage of the research. The use of spherical harmonics in the reproduction of scenes, whether via an ambisonics system or translated to head-tracked headphone systems, is expected.
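The dependence of the usable bandwidth of a spherical microphone on the number of transducers and the array size can be sketched numerically. The sketch assumes two widely used rules of thumb that are not stated in this proposal: an array with M transducers supports a spherical harmonic decomposition up to order N where (N + 1)^2 <= M, and spatial aliasing remains low while kr <= N, with k the wavenumber and r the array radius. The function names, the transducer count, the radius and the speed of sound are illustrative assumptions and do not describe any particular array.

```python
import math

def max_order(num_transducers):
    """Largest spherical harmonic order N with (N + 1)**2 <= num_transducers."""
    if num_transducers < 1:
        raise ValueError("at least one transducer is required")
    return math.isqrt(num_transducers) - 1

def upper_frequency_hz(order, radius_m, speed_of_sound=343.0):
    """Approximate upper frequency limit from the rule of thumb k * r = N,
    i.e. f = N * c / (2 * pi * r). The speed of sound is an assumed value."""
    return order * speed_of_sound / (2.0 * math.pi * radius_m)

if __name__ == "__main__":
    n = max_order(32)                  # hypothetical 32-transducer array
    f = upper_frequency_hz(n, 0.042)   # hypothetical 4.2 cm radius
    print(n, round(f))
```

For these illustrative values the order is 4 and the limit is about 5.2 kHz. Under the same rule of thumb a larger radius lowers this limit, which gives some intuition for why combining transducers at two radii, as in the concentric design described above, can extend the operational bandwidth.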

2.5 Conclusions
Based on the existing literature, some gaps in knowledge have been identified and a sense of direction has been obtained. Building on the ecological validity of soundscape research and the signal-oriented methods of auditory scene analysis, a research project is proposed in which the perceived spatial complexity of an auditory scene is studied. The main aspects under study will be the number and location of sources in space and the reverberation, which is also a spatial phenomenon. It is proposed that spherical microphone arrays be used to analyse existing auditory scenes, developing measurements based on the spatial distribution of sound. Experimental setups will be prepared in which subjects are tested in a manner intended to emulate real-life settings. Unrestricted head movement in an ambisonics setup, or a head-tracked reproduction system with six degrees of freedom, can aid in this task.

3. Research Question
The research question is formulated as follows: How does the distribution of sources and reverberation in space affect the perceived complexity of an auditory scene?

4. Research Design and Methods


4.1 Research Design Outline
Suggestions of appropriate research designs can be found in Teddlie and Tashakkori (2009, pp. 141-147). The research frame proposed in the present document is founded on understanding the advantages and disadvantages of using quantitative, qualitative or mixed-method research strategies during all the research phases. In order to arrive at a suitable research design, each of the research phases proposed in the aforementioned book is analysed and integrated into the overarching process.
It is important to consider the nature of the research question, as this is the basis for all subsequent research design decisions. This appears in Teddlie and Tashakkori (2009, p. 145) as the Conceptualisation Stage. If we take the research question as a purely psychoacoustical question, we could conclude that commonly used psychoacoustical tests would suffice to obtain a valid result. In traditional psychoacoustic tests, the influence of a few selected variables is tested by presenting stimuli in a controlled environment where the subject under test commonly has to provide an answer. Methods of data collection and analysis are well established (Goldstein, 2010, pp. 8-15). The common factor in these methods is that they are all quantitative. Based on this we can conclude that the research question is essentially a quantitative question. However, as we hope to clarify in the following sections, to properly ask the quantitative question, some pieces of information not available in the existing literature need to be obtained via qualitative research methods. Also, a purely quantitative research design could lead to results without the appropriate internal and external validity.
The Experiential (Methodological) Stage (Teddlie and Tashakkori, 2009, p. 145) involves the experiment design and data collection. Given this assessment of the research question, it would seem natural to plan this stage as purely quantitative. We could plan laboratory data collection where we control the variables under test and subjects are exposed to controlled situations. However, if we put the research under the scrutiny of common measures of research quality (Groat and Wang, 2002, pp. 34-40), the credibility of the research could be compromised due to the complexity of the research and the lack of key information. While the data collection experiments will be conducted using purely quantitative methods, the key elements to be tested need to be elicited via qualitative methods, and the experimental setup needs to be validated via qualitative methods.
The internal validity of the study is a measure of the extent to which the variables under study account for the responses obtained, observed as a direct cause-effect relationship, without the influence of variables not under consideration. The external validity of the study is the extent to which the results obtained can be generalised outside the conditions used during the study (Groat and Wang, 2002, Leedy and Ormrod, 2005). The two are closely related due to the expected research outcomes. In our case, the best way to examine the internal validity of the research is by conducting qualitative research, in parallel with or prior to the quantitative research, where the variables under study are segregated by means of exclusion.
A very important aspect of this research is to preserve the external validity of the study and, in particular, the ecological validity (Schmuckler, 2001). In our case, ecological validity is preserved if the subjects respond to the stimuli presented in the laboratory in the same way that they would respond in a natural setting. Again, this will be verified by qualitative research methods.


Similar studies to validate experimental settings of natural scenes have been conducted in the past (Guastavino et al., 2005).
A key missing point in the research is the subjective experience of complexity in an auditory scene. A very important aspect of the research is to understand, qualitatively, the common characteristics that correlate with the concept of complexity as perceived by subjects. While there is considerable information from studies on visual complexity (Donderi, 2006), there is very little information available on auditory complexity. In order to correctly prepare the experimental setup mentioned above, we have to understand what makes a scene complex from a qualitative point of view. The elicitation of this information is included in the experiential (methodological) stage.
The Experiential (Analytical) Stage (Teddlie and Tashakkori, 2009, p. 145) involves the analysis of the data collected in the experiential (methodological) stage. In this research there will be three main strategies for the analysis of data: quantitative data analysed using quantitative methods, qualitative data analysed using qualitative methods, and qualitative data transformed into quantitative data to be analysed using quantitative methods. It is expected that the details of the experiential stage (methodological and analytical) will evolve as the research evolves. However, it is not expected that the research design will change.
The Inferential Stage (Teddlie and Tashakkori, 2009, p. 145) is where conclusions are drawn and new understandings of the research question are achieved. This will start from the analysis of data in the previous stage. The expected methods for inferential analysis will be in accordance with the experiential (analytical) stage; therefore it is expected that both quantitative and qualitative methods will be used. It should be noted that the process from the experiential (methodological) stage to the inferential stage is iterative. The experiential stage is enriched with partial conclusions drawn from the inferential stage until a satisfactory answer to the research question is reached.


As an additional final step we have a Metainference Stage. This is where the conclusions drawn from the quantitative and qualitative processes are brought together to arrive at an answer to the research question that has the internal and external validity only possible with a mixed method research approach. Approaches to validate the results obtained from the metainference stage are presented in Tashakkori and Teddlie (2008). In brief, these include checking the validity of each conclusion against the findings, checking the validity against current theories and other investigators in the field, and checking that the conclusion is the most plausible of all those investigated. These validity checks are applicable to all metainferences as well as to inferences made on each strand of the research. A final validity test is the integrative efficacy, which is the degree to which inferences made on each strand integrate to formulate theories that take into account both strands and at the same time are completely congruous.


Figure 1 presents the complete research design. Quantitative processes appear in green, qualitative processes appear in red and mixed method processes appear in orange.


Figure 1. Research design

4.2 Methods of data collection


The methods of data collection that encompass the experiential (methodological) stage are described below. For clarity they are divided into quantitative and qualitative methods, and ordered in their expected chronological sequence.

4.2.1 Quantitative methods of data collection

4.2.1.1 Physical measurements

The measurement of acoustic parameters is well established (for an introduction the reader is referred to Bies and Hansen, 1996, pp. 92-122). Most well established physical measurements of sound take into account its intensity and temporal characteristics and correlate them to human perception (e.g. weighting curves and integration parameters). However, the spatial characteristics of sound fields are usually reduced to simple parameters measured in a single direction. The first phase of this study is to examine and optimise possible ways of measuring a soundfield taking into account its spatial characteristics. For this data collection stage, measurements will be developed and implemented that use the spherical harmonic decomposition of the soundfield as a starting point for describing the soundfield's spatial characteristics. Measurements using traditional techniques will also be performed (i.e. sound level and reverberation measurements). It should be noted that all sound recordings will be paired with visual recordings.

4.2.1.2 Closed-ended questionnaires and rating scales

Simultaneously with the measurement of acoustic environments, closed-ended and rating scale questionnaires will be given to test subjects (Leedy and Ormrod, 2005, pp. 183-187). The design of these data collection tools will be aimed at obtaining information about specific characteristics of the acoustic environment. The questionnaires will be mainly aimed at providing information about the spatial characteristics of the soundfield; however, other characteristics will also be analysed (e.g. visual and olfactory stimuli). The sampling method to obtain the information could be considered as purposeful sampling, with the peril of falling into convenience sampling (Teddlie and Tashakkori, 2009, pp. 173-178) (Mertens, 1998, pp. 261-265). The subjective data expected to be collected from real life scenes will be extensive and in-depth. It is highly desirable that the subjects are easily contactable and motivated to be called multiple times and for long periods of time.
The subjects are expected to be students or personnel of the Audio and Acoustics program at the University of Sydney, as it is likely that these subjects will be motivated by interest in the research. This leads us to convenience sampling, which can lead to erroneous results when trying to extrapolate the results to the general population. On the other hand, we can expect these subjects to have better training in critical listening tasks, leading to better results. As highlighted in Patton (2002, pp. 230-232), purposeful sampling can be a strength when the case we are analysing includes a qualitative element, and the subjects are chosen carefully because of their expertise or relevance to the case under study. This will be assessed during a pilot study. It is also expected that the questionnaires and rating scales will be refined based on the analysis of initial data. However, as a starting point, questionnaires will be based on existing soundscape studies (Jeon et al., 2010, Kang, 2007, Yang and Kang, 2005b).

4.2.1.3 Experimental data collection

The experimental data collection will include presenting real and synthetic acoustic scenes in the laboratory and performing common psychoacoustic tests on subjects. Presenting acoustic scenes in laboratory reproduction systems has been validated in the past for other types of tasks (Guastavino et al., 2005), and further qualitative tests will be done to validate synthetic scenes. The experimental data collection is based on experimental and quasi-experimental techniques where one or several known and identified variables are explored to obtain results (Mertens, 1998, pp. 85-103) (Leedy and Ormrod, 2005, pp. 217-239). The sampling method to obtain information will be based on a random probability sampling method (Teddlie and Tashakkori, 2009, pp. 171-173) (Mertens, 1998, pp. 258-265). Acknowledging the limitations of the sampling, it should be noted that it will not be a truly random sampling, as it is not envisaged that subjects will be chosen by traditional random sampling methods (i.e. telephone-based methods or mail-based methods). The random subject selection will only extend as far as the main researcher's sample recruitment capabilities. This will be assessed at the time of implementing the experiment.
4.2.2 Qualitative methods of data collection

4.2.2.1 In-situ qualitative data collection

At the same time as physical measurements and subjective quantitative data are collected, an open-ended questionnaire will be given to the same subjects to understand the qualities of the spaces under study. The study will be based on the grounded theory method (Leedy and Ormrod, 2005, p. 140) (Mertens, 1998, p. 170). In this type of study, theories are developed from the empirical impressions of subjects: the data are reduced to recurrent themes, and to similarities and differences among subjects, from which the theories are built. The sampling method will be the same as for the closed-ended questionnaires, as the subjects will be the same. As explained above, it is preferable to collect in-depth data from a few subjects, and to have the ability to delve deeper into these subjects' experience through repeat visits to measurement sites.

4.2.2.2 Laboratory qualitative data collection

Before the experimental data collection, qualitative data collection will be performed in the laboratory using recorded and synthetic acoustic scenes. As with the in-situ qualitative data collection, a grounded theory study will be applied. The sampling method will be the same as for the closed-ended questionnaires and the in-situ qualitative data collection, as the subjects are expected to be the same.

4.2.2.3 Elicitation of complexity characteristics by the repertory grid technique

The repertory grid technique was proposed by Kelly (1955) to reveal the classification of a subject's experiences. This technique has been successfully used to describe the perceived spatial characteristics of sound reproduction systems (Berg and Rumsey, 2006). In the repertory grid technique, triads of stimuli are presented to a subject and the researcher then asks in which way two of them are similar and how they differ from the third. In this manner, the researcher can obtain a relational attribute as well as a differentiation attribute in one trial. The sampling method will be similar to that of the experimental quantitative data collection, as a large number of subjects is expected to be needed to obtain significant results.
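The triad procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_triads` and `record_trial`, the stimulus labels and the construct words are all hypothetical, and presenting every possible triad is just one option; the proposal does not specify how triads will be selected.

```python
from itertools import combinations


def make_triads(stimuli):
    """Return every unordered triad (3-element combination) of the stimuli."""
    return list(combinations(stimuli, 3))


def record_trial(triad, similar_pair, construct, contrast):
    """Store one elicitation trial.

    triad        -- the three stimuli presented together
    similar_pair -- the two stimuli the subject judged to be alike
    construct    -- how the pair is similar (relational attribute)
    contrast     -- how the remaining stimulus differs (differentiation attribute)
    """
    if len(set(triad)) != 3:
        raise ValueError("a triad needs three distinct stimuli")
    if len(set(similar_pair)) != 2 or not set(similar_pair) <= set(triad):
        raise ValueError("similar_pair must be two distinct stimuli from the triad")
    # The stimulus left over after removing the similar pair is the excluded one.
    (odd_one_out,) = set(triad) - set(similar_pair)
    return {
        "triad": tuple(triad),
        "similar_pair": tuple(similar_pair),
        "odd_one_out": odd_one_out,
        "construct": construct,
        "contrast": contrast,
    }


if __name__ == "__main__":
    # Hypothetical stimulus labels, for illustration only.
    scenes = ["scene_A", "scene_B", "scene_C", "scene_D"]
    for triad in make_triads(scenes):
        print(triad)
```

Each recorded trial yields one bipolar construct (similarity pole and contrast pole), which is the unit later arranged into the grid for analysis.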

4.3 Methods of data analysis


The methods of data analysis that encompass the experiential (analytical) stage are described below. For clarity they are divided into quantitative, qualitative and data conversion methods, and ordered in their expected chronological sequence. A brief description of the purpose of each stage in the path towards answering the main research question is also included.

4.3.1 Quantitative methods of data analysis

4.3.1.1 Physical measurements analysis

The analysis of physical measurements of everyday soundfields with common instrumentation will be carried out using well-established acoustic indicators (Bies and Hansen, 1996, pp. 92-122). On the other hand, measurements using spherical harmonic domain signals will be developed based on the existing methods and knowledge of the capabilities of soundfield spatial decomposition. Preliminary studies can be found in O'Donovan et al. (2008). The main purpose of this data is to obtain objective descriptions of the spaces being studied.

4.3.1.2 Closed-ended questionnaire and rating scale analysis

The analysis of the closed-ended questionnaires and rating scales will be performed using statistical tools. Descriptive statistics will be used to assess and present general population trends on auditory spatial characteristics of the sites under study (Mertens, 1998, p. 330) (Leedy and Ormrod, 2005, pp. 253-267). More specifically, measures of central tendency, measures of spread and standard scores are among the descriptive statistical techniques (Rosenthal and Rosnow, 2008, pp. 299-313) that will be used to describe the data. The purpose of this data analysis is to obtain perceptual characteristics of the sites under study. The first round of data collection will be based on previous studies of spatial characteristics; subsequent stages of data collection will benefit from the qualitative data collection and analysis.

4.3.1.3 Experimental data collection

The analysis of the experimental data will be performed using statistical tools. Again, descriptive statistics will be used to assess general trends in the population.
In order to calculate correlation between responses, inferential statistics will be used to test possibly related variables (Mertens, 1998, p. 330) (Leedy and Ormrod, 2005, pp. 267-275).
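As a minimal sketch of the descriptive measures named in this section (central tendency, spread and standard scores) and of a correlation between two sets of responses, the following Python uses only the standard library. The function names `describe` and `pearson_r` are illustrative choices, the use of the sample standard deviation and of Pearson's coefficient are assumptions (the proposal does not name a specific estimator), and the numbers in the usage block are invented values, not results.

```python
import math
import statistics


def describe(ratings):
    """Central tendency, spread and standard (z) scores for one set of ratings."""
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("standard scores are undefined when all ratings are equal")
    return {
        "mean": mean,
        "median": statistics.median(ratings),
        "sd": sd,
        "z_scores": [(r - mean) / sd for r in ratings],
    }


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented illustrative values: number of sources in a scene and a
    # hypothetical complexity rating given by one subject.
    n_sources = [1, 2, 4, 8]
    complexity = [2, 3, 5, 6]
    print(describe(complexity))
    print(pearson_r(n_sources, complexity))
```

In practice a statistics package would also report significance levels; this sketch only shows what the quantities are.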


The descriptive statistical methods will again be measures of central tendency, measures of spread and standard scores (Rosenthal and Rosnow, 2008, pp. 299-313). The inferential statistical methods that will be used to analyse the data will be based on examining the effects of variables, mainly using correlation (Rosenthal and Rosnow, 2008, pp. 314-353) and mean comparison using t tests and analysis of variance (Rosenthal and Rosnow, 2008, pp. 381-433). The purpose of this data analysis is to obtain answers to the research question in the form of the relation of spatial variables to perceived complexity.

4.3.2 Qualitative methods of data analysis

4.3.2.1 In-situ qualitative data analysis

The methodology used to analyse the qualitative data collected will be based on that presented in Corbin and Strauss (2008). In this method, subjects are interviewed on several occasions. While the initial interviews are usually open, in subsequent interviews the researcher leads the subject to the areas of interest. The task of the interviewer in the data analysis stage is to identify commonly occurring themes, so that the subject can be asked to expand on them in the next interview. It is usually considered that this process (interview-analysis-interview) is finished when no new main themes appear. There are methods and computer software that might aid the interviewer in these tasks. The purpose of this data analysis is to gain an understanding of the main perceived spatial characteristics of everyday environments.

4.3.2.2 Laboratory qualitative data analysis

The methodology will follow the methods used in the in-situ qualitative data analysis. The purpose of this data analysis is to compare the perceived characteristics of in-situ interviews and laboratory interviews.

4.3.2.3 Repertory grid technique data analysis

The grid obtained from the repertory grid test can be analysed by cluster analysis. The purpose of data clustering is to uncover hidden structures in the data. A method for analysing repertory grid data is included in Shaw (1980). It is important to consider the variability of subjects in areas where little information exists. Techniques to reduce the error due to differences in subjects' expertise are treated in Shaw and Gaines (1989). The aim of this data analysis is to understand the main perceived characteristics that describe the complexity of an auditory scene. This could form the basis for selecting the controlled variables of the experimental setup.

4.3.3 Data conversion

4.3.3.1 In-situ qualitative data conversion

There are several methods to convert qualitative data for analysis using quantitative data analysis methods (Bazeley, 2010). Most techniques involve the use of computers, and the simplest methods include word counting and key word identification. From this quantised data, pattern recognition techniques can be implemented to gain further understanding of the phenomenon under study. The purpose of this data analysis is to gain an insight into the main characteristics that influence the perception of acoustic space in situ.

4.3.3.2 Laboratory qualitative data conversion

See section above for details of data conversion. The purpose of this data analysis is to gain an insight into, and compare results of, the acoustic space characteristics of auditory scenes presented in a laboratory situation.
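The simplest conversion methods mentioned in section 4.3.3.1, word counting and key word identification, can be sketched as follows. This is a hedged illustration rather than the procedure of Bazeley (2010): the function names, the tokenising rule (lower-case runs of letters and apostrophes) and the example sentences are all assumptions made here, and dedicated qualitative analysis software would normally be used instead.

```python
import re
from collections import Counter

# Assumed tokenising rule: lower-case runs of letters and apostrophes.
_WORD = re.compile(r"[a-z']+")


def tokenize(text):
    """Split a free-text response into lower-case words."""
    return _WORD.findall(text.lower())


def word_counts(responses, stopwords=frozenset()):
    """Count how often each word occurs across all responses."""
    counts = Counter()
    for text in responses:
        counts.update(w for w in tokenize(text) if w not in stopwords)
    return counts


def keyword_matrix(responses, keywords):
    """One row per response, one column per key word; cells are occurrence counts.

    The resulting numeric table is the kind of quantised data that could be
    passed on to quantitative analysis.
    """
    rows = []
    for text in responses:
        counts = Counter(tokenize(text))
        rows.append([counts[k] for k in keywords])
    return rows


if __name__ == "__main__":
    # Invented example responses, for illustration only.
    answers = ["The traffic is loud", "Loud birds, loud traffic"]
    print(word_counts(answers, stopwords={"the", "is"}).most_common(3))
    print(keyword_matrix(answers, ["loud", "birds"]))
```

The key word list would in practice come from the themes identified in the qualitative analysis, not be fixed in advance.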


5. Work Schedule
The following figure presents the expected work schedule, divided into months across three years.


Figure 2. Work Schedule


6. References

ABE, K., OZAWA, K., SUZUKI, Y. & SONE, T. 2006. Comparison of the Effects of Verbal Versus Visual Information About Sound Sources on the Perception of Environmental Sounds. Acta Acustica united with Acustica, 92, 51-60.
BAZELEY, P. 2010. Computer-assisted integration of mixed methods data sources and analyses. In: TASHAKKORI, A. & TEDDLIE, C. (eds.) Sage Handbook of Mixed Methods in Social & Behavioral Research. Thousand Oaks, CA: Sage.
BERG, J. & RUMSEY, F. 2006. Identification of Quality Attributes of Spatial Audio by Repertory Grid Technique. Journal of the Audio Engineering Society, 54, 365-379.
BERGLUND, B. & NILSSON, M. E. 2006. On a Tool for Measuring Soundscape Quality in Urban Residential Areas. Acta Acustica united with Acustica, 92, 938-944.
BERKHOUT, A. J., VRIES, D. D. & VOGEL, P. 1993. Acoustic control by wave field synthesis. The Journal of the Acoustical Society of America, 93, 2764-2778.
BIES, D. A. & HANSEN, C. H. 1996. Engineering Noise Control, New York, NY, Spon Press.
BLAUERT, J. (ed.) 2005. Communication Acoustics, New York, NY: Springer.
BRAASCH, J. & HARTUNG, K. 2002. Localization in the presence of a distracter and reverberation in the frontal horizontal plane. I. Psychoacoustical data. Acta Acustica united with Acustica, 88, 942-955.
BREGMAN, A. S. 1990. Auditory Scene Analysis: The Perceptual Organization of Sound, Cambridge, MA, MIT Press.
BRONKHORST, A. W. 2000. The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions. Acustica, 86, 117-128.
BRUNGART, D. S. & SIMPSON, B. D. 2007. Cocktail party listening in a dynamic multitalker environment. Perception & Psychophysics, 69, 79-91.
CARLES, J. L., BARRIO, I. L. & DE LUCIO, J. V. 1999. Sound influence on landscape values. Landscape and Urban Planning, 43, 191-200.
CHEVEIGNE, A. D. & KAWAHARA, H. 2002. YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111, 1917-1930.
CORBIN, J. & STRAUSS, A. 2008. Basics of Qualitative Research, Thousand Oaks, CA, Sage.
DANIEL, J., NICOL, R. & MOREAU, S. 2003. Further investigations of high order ambisonics and wavefield synthesis for holophonic sound imaging. AES 114th Convention. Amsterdam, The Netherlands.
DE COENSEL, B., BOTTELDOOREN, D., DE MUER, T., BERGLUND, B., NILSSON, M. E. & LERCHER, P. 2009. A model for the perception of environmental sound based on notice-events. The Journal of the Acoustical Society of America, 126, 656-665.
DONDERI, D. C. 2006. Visual Complexity: A Review. Psychological Bulletin, 132, 73-97.
DRAKE, L. A., RUTLEDGE, J. C., ZHANG, J. & KATSAGGELOS, A. 2009. A Computational Auditory Scene Analysis-Enhanced Beamforming Approach for Sound Source Separation. Eurasip Journal on Advances in Signal Processing.
EPAIN, N., JIN, C. & VAN SCHAIK, A. Blind source separation using independent component analysis in the spherical harmonic domain. 2nd International Symposium on Ambisonics and Spherical Acoustics, 2010 Paris, France.
FRAZIER, R., SAMSAM, S., BRAIDA, L. & OPPENHEIM, A. Enhancement of speech by adaptive filtering. Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '76., 1976. 251-253.
GE, J. & HOKAO, K. 2004. Research on the sound environment of urban open space from the viewpoint of soundscape - A case study of Saga Forest Park, Japan. Acta Acustica united with Acustica, 90, 555-563.
GENUIT, K. & FIEBIG, A. 2006. Psychoacoustics and its Benefit for the Soundscape Approach. Acta Acustica united with Acustica, 92, 952-958.
GOLDSTEIN, E. B. 2010. Sensation and Perception, Belmont, CA, Wadsworth.
GROAT, L. & WANG, D. 2002. Architectural Research Methods, New York, NY, John Wiley & Sons, Inc.
GUASTAVINO, C. 2006. The Ideal Urban Soundscape: Investigating the Sound Quality of French Cities. Acta Acustica united with Acustica, 92, 945-951.


GUASTAVINO, C. 2007. Categorization of environmental sounds. Canadian Journal of Experimental Psychology-Revue Canadienne De Psychologie Experimentale, 61, 54-63.
GUASTAVINO, C. & KATZ, B. F. G. 2004. Perceptual evaluation of multi-dimensional spatial audio reproduction. The Journal of the Acoustical Society of America, 116, 1105-1115.
GUASTAVINO, C., KATZ, B. F. G., POLACK, J.-D., LEVITIN, D. J. & DUBOIS, D. 2005. Ecological validity of soundscape reproduction. Acta Acustica united with Acustica, 91, 333-341.
HAN, M.-H., JOO, M.-K. & OH, Y.-K. 2010. Residential and Acoustic Environments Perceived by Residents of Regional Cities in Korea: A Case Study of Mokpo City. Indoor and Built Environment, 19, 102-113.
HESS, W. 1983. Pitch Determination of Speech Signals, Berlin, Springer-Verlag.
HU, G. & WANG, D. 2007. Auditory Segmentation Based on Onset and Offset Analysis. Audio, Speech, and Language Processing, IEEE Transactions on, 15, 396-405.
IWAYA, Y. Sound space perception in virtual environments with head movements. International Workshop on the Principles and Applications of Spatial Hearing, 2009 Zao, Miyagi, Japan.
JEKOSCH, U. 1999. Meaning in the Context of Sound Quality Assessment. Acustica, 85, 681-684.
JEON, J. Y., LEE, P. J., YOU, J. & KANG, J. 2010. Perceptual assessment of quality of urban soundscapes with combined noise sources and water sounds. The Journal of the Acoustical Society of America, 127, 1357-1366.
JIN, Z. & WANG, D. A multipitch tracking algorithm for noisy and reverberant speech. Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, 2010. 4218-4221.
KANG, J. 2007. Urban Sound Environment, London, Taylor & Francis.
KANG, J. & ZHANG, M. 2010. Semantic differential analysis of the soundscape in urban open public spaces. Building and Environment, 45, 150-157.
KARIEL, H. G. 1990. Factors affecting response to noise in outdoor recreational environments. Canadian Geographer / Le Géographe canadien, 34, 142-149.
KELLY, G. A. 1955. Psychology of Personal Constructs, New York, NY, W. W. Norton and Company, Inc.
KLAPURI, A. P. 2003. Multiple fundamental frequency estimation based on harmonicity and spectral smoothness. Speech and Audio Processing, IEEE Transactions on, 11, 804-816.
KOLLMEIER, B. & KOCH, R. 1994. Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction. The Journal of the Acoustical Society of America, 95, 1593-1602.
KOPCO, N., BEST, V. & CARLILE, S. 2010. Speech localization in a multitalker mixture. The Journal of the Acoustical Society of America, 127, 1450-1457.
LAM, K. C., BROWN, A. L., MARAFA, L. & CHAU, K. C. 2010. Human Preference for Countryside Soundscapes. Acta Acustica united with Acustica, 96, 463-471.
LAVANDIER, C. & DEFRÉVILLE, B. 2006. The Contribution of Sound Source Characteristics in the Assessment of Urban Soundscapes. Acta Acustica united with Acustica, 92, 912-921.
LEEDY, P. D. & ORMROD, J. E. 2005. Practical Research. Planning and Design, Upper Saddle River, NJ, Pearson Education, Inc.
LJUNGDAHL ERIKSSON, M. & BERG, J. 2009. Soundscape Attribute Identification. AES 126th Convention. Munich, Germany.
MACE, B. L., BELL, P. A. & LOOMIS, R. J. 1999. Aesthetic, Affective, and Cognitive Effects of Noise on Natural Landscape Assessment. Society & Natural Resources: An International Journal, 12, 225-242.
MAZARIS, A. D., KALLIMANIS, A. S., CHATZIGIANIDIS, G., PAPADIMITRIOU, K. & PANTIS, J. D. 2009. Spatiotemporal analysis of an acoustic environment: interactions between landscape features and sounds. Landscape Ecology, 24, 817-831.
MERTENS, D. M. 1998. Research Methods in Education and Psychology, Thousand Oaks, CA, Sage.
MEYER, J. & ELKO, G. A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. Acoustics, Speech, and Signal Processing, 2002. Proceedings. (ICASSP '02). IEEE International Conference on, 2002. 1781-1784.
MINGYANG, W., DELIANG, W. & BROWN, G. J. 2003. A multipitch tracking algorithm for noisy speech. Speech and Audio Processing, IEEE Transactions on, 11, 229-241.
O'DONOVAN, A., DURAISWAMI, R. & ZOTKIN, D. Imaging concert hall acoustics using visual and audio cameras. 2008. 5284-5287.
PARSONS, T. W. 1976. Separation of speech from interfering speech by means of harmonic selection. Journal of the Acoustical Society of America, 60, 911-918.


PARTHY, A., JIN, C. & VAN SCHAIK, A. 2008. Measured and theoretical performance comparison of a co-centered rigid and open spherical microphone array. International conference on audio, language and image processing.
PATTON, M. Q. 2002. Qualitative Research & Evaluation Methods, Thousand Oaks, CA, Sage.
PHEASANT, R., HOROSHENKOV, K., WATTS, G. & BARRETT, B. 2008. The acoustic and visual factors influencing the construction of tranquil space in urban and rural environments tranquil spaces-quiet places? The Journal of the Acoustical Society of America, 123, 1446-1457.
RAFAELY, B. 2005. Analysis and design of spherical microphone arrays. Speech and Audio Processing, IEEE Transactions on, 13, 135-143.
RAIMBAULT, M. 2006. Qualitative Judgements of Urban Soundscapes: Questionning Questionnaires and Semantic Scales. Acta Acustica united with Acustica, 92, 929-937.
RAIMBAULT, M., LAVANDIER, C. & BERENGIER, M. 2003. Ambient sound assessment of urban environments: field studies in two French cities. Applied Acoustics, 64, 1241-1256.
ROSENTHAL, R. & ROSNOW, R. L. 2008. Essentials of Behavioral Research. Methods and Data Analysis, New York, NY, McGraw-Hill.
SANTALA, O. & PULKKI, V. 2011. Directional perception of distributed sound sources. Journal of the Acoustical Society of America, 129, 1522-1530.
SCHAFER, R. M. 1993. The Soundscape: Our Sonic Environment and the Tuning of the World, Rochester, VT, Destiny Books.
SCHEIRER, E. D. 2000. Music-Listening Systems. PhD Thesis, MIT.
SCHEIRER, E. D. 2001. Structured audio, Kolmogorov complexity and generalized audio coding. IEEE Trans. on Speech and Audio Processing, 9, 914-931.
SCHMUCKLER, M. A. 2001. What Is Ecological Validity? A Dimensional Analysis. Infancy, 2, 419-436.
SEMIDOR, C. 2006. Listening to a City With the Soundwalk Method. Acta Acustica united with Acustica, 92, 959-964.
SHAW, M. L. G. 1980. On Becoming a Personal Scientist, London, UK, Academic Press.
SHAW, M. L. G. & GAINES, B. R. 1989. Comparing conceptual structures: consensus, conflict, correspondence and contrast. Knowledge Acquisition, 1, 341-363.
SOUTHWORTH, M. 1969. Sonic environment of cities. Environment and Behavior, 1, 49-70.
STREICH, S. & HERRERA, P. 2004. Towards describing perceived complexity of songs: computational methods and implementation. AES 25th International Conference: Metadata for Audio. UK.
TASHAKKORI, A. & TEDDLIE, C. 2008. Quality of inferences in mixed method research. In: BERGMAN, M. M. (ed.) Advances in Mixed Methods Research. Thousand Oaks, CA: Sage.
TEDDLIE, C. & TASHAKKORI, A. 2009. Foundations of Mixed Methods Research, Thousand Oaks, CA, Sage.
TRUAX, B. 2001. Acoustic Communication, Westport, CT, Ablex Publishing.
VIOLLON, S., LAVANDIER, C. & DRAKE, C. 2002. Influence of visual setting on sound ratings in an urban environment. Applied Acoustics, 63, 493-511.
YANG, W. & KANG, J. 2005a. Acoustic comfort evaluation in urban open public spaces. Applied Acoustics, 66, 211-229.
YANG, W. & KANG, J. 2005b. Soundscape and sound preferences in urban squares: a case study in Sheffield. Journal of Urban Design, 10, 61-80.
YU, L. & KANG, J. 2008. Effects of social, demographical and behavioral factors on the sound level evaluation in urban open spaces. The Journal of the Acoustical Society of America, 123, 772-783.
YU, L. & KANG, J. 2009. Modeling subjective evaluation of soundscape quality in urban open spaces: An artificial neural network approach. The Journal of the Acoustical Society of America, 126, 1163-1174.
ZHANG, M. & KANG, J. 2007. Towards the evaluation, description, and creation of soundscapes in urban open spaces. Environment and Planning B-Planning & Design, 34, 68-86.

