
British Journal of Educational Technology Vol 0 No 0 2019 1–25

doi:10.1111/bjet.12852

Interactive sonification of images in serious games as an education aid for visually impaired children

Andrzej Radecki, Michał Bujacz, Piotr Skulimowski and Paweł Strumiłło
Andrzej Radecki is an Assistant Professor at the Institute of Automatics, Lodz University of Technology (TUL). He specializes in embedded systems, automatic control methods and signal processing algorithms. His current research focuses on practical applications of signal processing in system diagnostics, fault-tolerant control and multimedia user interfaces. Michał Bujacz is an Assistant Professor at the Dept. of Medical Electronics, TUL. His research experience focuses on spatial audio, echolocation, sonification and the auditory representation of space in assistive technologies for the blind. Piotr Skulimowski is an Assistant Professor at the Dept. of Medical Electronics, TUL. He specializes in image and video analysis, 3D reconstruction, and hardware and software systems aiding the visually impaired in independent travel. Paweł Strumiłło is a Professor and the director of the Institute of Electronics, TUL. His current research interests include medical electronics, the processing of biosignals and human–system interaction. He has published more than 200 frequently cited technical articles and has authored one and co-authored two books. He is a Senior Member of the IEEE and a member of the Biocybernetics and Biomedical Engineering Committee of the Polish Academy of Sciences. Address for correspondence: Michał Bujacz, Institute of Electronics, Lodz University of Technology, Wolczanska 211/215 B9, 90-924, Lodz, Poland. Email: bujaczm@p.lodz.pl

Abstract
The paper presents an application for the interactive sonification of images, intended for use on mobile devices in the education of blind children at the elementary school level. The paper proposes novel sonification algorithms for converting colour and grayscale images into sound. The blind user can interactively explore image content through multi-touch gestures and select image sub-regions for sonification, with real-time control of synthesized sound parameters such as amplitude, frequency and timbre. Additionally, images may contain text fields read by a text-to-speech synthesizer. The usability of one of the proposed sonification schemes is tested by collecting data on the tracking accuracy and recognition speed of basic shapes, such as lines, curves, figures and simple functions. To facilitate the learning process for blind children, a number of games were proposed that use the devised sonification schemes. The first game—"Hear the Invisible"—is intended for two players: one child draws a shape, and the task of the other is to guess the displayed shape by means of the available sonification methods. The second proposed game, "Follow the Rabbit," is intended for a single player who tracks a colourful "Rabbit" that runs along a path representing a given geometric shape. The obtained results show that the proposed sonification methods make it possible to reach previously unattainable levels of flexibility in the exploration of shapes and colours. The main application of the described interactive tool is teaching geometry and mathematics in schools for blind children. The developed games are meant to enhance the learning process and motivate the children.

© 2019 British Educational Research Association



Practitioner Notes
What is known about the topic
• Sonification is a generic term for data-based non-speech sound generation.
• Interactive sonification is a relatively new area of study, in which the listener heavily influences the process of sound synthesis.
• Earlier studies on sonification and educational computer games have shown that it is possible to present shapes such as plots and geometric figures using only non-verbal sounds.
What the paper adds
• The paper proposes an interactive sonification algorithm that can be used on greyscale or colour images, using additive synthesis and the HSV colour representation.
• The presented software also allows researchers to record and analyse user interaction with the images, making it a good research tool for current and future sonification algorithms.
• The paper presents a pilot study with blind children using two games that facilitate learning of the proposed algorithms.
Implications for practice and/or policy
• Once the software is openly released, it can be easily installed on any Android OS device to allow blind users to analyse images by touch.
• The proposed games train users in the basics of the sonification algorithm in a competitive context.
• The software can be especially useful in the education of blind children, eg, in teaching geometry and math and in training visual and spatial imagination.

Introduction
There are approximately 283 million visually impaired persons worldwide, including 36 million totally blind and 19 million children (Bourne et al., 2017). Surveys conducted with the visually impaired (Marston, 2009) have indicated the three most problematic challenges faced by blind people: (1) independent safe mobility, (2) spatial orientation and navigation and (3) access to visual information (text and graphics). The authors' previous experience has been with devices that address the first two problems, mainly in the form of electronic travel aids (Bujacz & Strumillo, 2016) or navigation software (Baranski & Strumillo, 2015; Skulimowski, Korbel, & Wawrzyniak, 2014). In this project, however, we attempt to address the third major problem and focus on the interactive auditory presentation of images for use in the education of blind children.
Images and graphics are extensively used as teaching aids at schools at different education levels (Hersh & Johnson, 2008). Such a visual education medium, however, can only be accessed by visually impaired children by means of a verbal description (The Audio Description Project, 2018), by touch (Klatzky & Lederman, 2007; Visell, 2009), with the help of a non-verbal representation termed sonification (Hermann & Hunt, 2005) or through some form of a combined multimodal approach (Fernandes & Healy, 2013). These methods all require specific solutions that increase the effort of the teaching personnel and/or the cost of teaching due to, for instance, the necessity of purchasing additional technologies.
Verbal description, also known as audio description, needs to follow specific guidelines to be naturally comprehended by a blind pupil. Such descriptions can be provided either by an experienced teacher or prepared in advance as recordings. An innovative and pioneering solution is to apply advanced machine learning classifiers. A promising new avenue is interactive haptic displays (Gay, Rivière, & Pissaloux, 2018) or tactile tablets, which remain an extremely expensive technology (http://web.metec-ag.de, about 15000 EUR).
Engaging the sense of touch in image perception, although currently the predominant method,
incurs significant costs associated with preparing specialized printed materials and storing large
libraries. Such materials can include additional descriptions of the images with the use of the
Braille alphabet or tactile diagrams reflecting the main graphical components of the studied
images. Tactile diagrams, in particular, are very helpful in teaching such subjects as art, geogra-
phy, biology and mathematics (Sheppard & Aldrich, 2001).
In this paper, we focus on interactive sonification—the third technique for presenting images or graphics for educational purposes. Sonification is formally defined as "data-dependent generation of sound in a way that reflects objective properties of the input data" (Degara, Hunt, & Hermann, 2015). Interactivity is the addition of a user-feedback loop to the sound generation and data selection process.
Our view is that this feature of the sonification technique is particularly important in the teaching process. To support this view, we have designed and implemented original algorithms for converting images and graphics into a sonic representation. These algorithms have been implemented on mobile devices and presented to a group of blind children in first trials. To encourage the children to use a novel interaction device, we proposed simple audio games, which nevertheless captured the attention of the blind children, who eagerly took part in the trial sessions.

Related work on sonification and audiogames


Sonification, the use of non-speech sounds to present information, is an important class of auditory display approaches to building human–computer interfaces (Csapó & Wersényi, 2013; Kramer, 1994). A richer incorporation of the audio modality into interfaces can enhance the perception capacity of personnel whose visual channel tends to be overloaded. Auditory display techniques are employed in science, technology, industry and medicine, especially in the form of simple alerts, eg, in medical devices or parking sensors. Sounds and sonification techniques are particularly important in assistive devices for blind people. Most recently, the authors have used the interactive sonification approach to help the blind perceive 3D space (Skulimowski et al., 2018).
Sonification research is frequently concerned with presenting images to blind users. One of the best-known sonification algorithms for the blind is the vOICe (Meijer, 1992). It is popular due to its simplicity of generation, but has been criticized for its "unfriendliness" and perceptual difficulty. The algorithm converts images into sound spectra by cyclically scanning columns of pixels and assigning each pixel's brightness to the loudness of a different frequency component (low rows mapped to low frequencies and high rows to high frequencies). Neurological research has shown that after several weeks of training with the vOICe, previously unused regions of the visual cortex are activated when hearing the sonification (Merabet & Pascual-Leone, 2010).
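To make this mapping concrete, the following sketch renders one left-to-right sweep of a greyscale image in the vOICe style. It is only an illustrative reading of the published idea: the sweep duration, sample rate and frequency range are our assumptions, not Meijer's original parameters.

```python
import numpy as np

def voice_like_sweep(image, duration=1.0, fs=22050, f_lo=200.0, f_hi=5000.0):
    """One vOICe-style sweep: scan columns left to right; each pixel's
    brightness sets the loudness of a sinusoid whose frequency depends on
    its row (bottom rows -> low frequencies, top rows -> high frequencies).
    `image` is a 2D numpy array of brightness values in [0, 1]."""
    rows, cols = image.shape
    spc = int(duration * fs / cols)                      # samples per column
    t = np.arange(spc) / fs
    # One frequency per row, spaced exponentially like musical pitch;
    # index 0 is the top image row, so it gets the highest frequency.
    freqs = f_lo * (f_hi / f_lo) ** np.linspace(1.0, 0.0, rows)
    out = [
        (image[:, c][:, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(0) / rows
        for c in range(cols)
    ]
    return np.concatenate(out)

# Example: a diagonal line rises in pitch as the sweep progresses.
img = np.eye(64)[::-1]          # bottom-left to top-right diagonal
signal = voice_like_sweep(img)
```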
Very few widely available sonification algorithms employ any user interaction, especially by touch, as they sonify whole images automatically, as, eg, the vOICe does. The ones that do usually utilize mouse or keyboard input. An interesting project presented in O'Neill and Ng (2008) used Nintendo's Wiimote controller to provide the user with sonic and haptic feedback. In this work, the image was pre-segmented and the detected regions were assigned descriptors that were further mapped to sounds; the mapping method, however, was not disclosed. Two other studies were designed to sonify images by concentrating exclusively on image edges or boundaries. In Ramloll et al. (2000), the interaction is limited to choosing which line elements of images are to be sonified, whereas in Yoshida, Kitani, Koike, Belongie, and Schlei (2011) both image edges and the distance to the edges were sonified. The latter is a truly interactive tool in which the user can select image regions for sonification by touch. Edges are sonified in a similar manner as in the vOICe algorithm, whereas the distance to the nearest edge is sonified by a frequency that increases as the user's finger approaches it. A drawback of the above sonification approaches is that colour information is entirely discarded and not reproduced in the sonification scheme.
An advanced study was carried out by Banf and Blanz (2012), in which a multilevel image sonification approach was proposed. Namely, low-level image features like edges, colour and texture
were sonified in parallel to applying a machine learning algorithm to recognise the object and
verbally communicate its name to the user. Although tools for automatic image understanding
are undoubtedly useful, it is our belief that for some teaching processes it may be beneficial for
a blind person to learn to perceive low-level features of images and then build up a higher level
understanding of the image on their own.
Summarizing the above approaches to graphics and image sonification, we can conclude that the use of interactive sonification techniques in interfaces dedicated to blind people is scarce. In particular, to the authors' best knowledge, no work has been reported on applying interactive sonification in educational aids for blind children. Here, we state again our view that the interactivity of sonification techniques is particularly important in teaching, because a blind child using the assistive device can individually control the pace of exploring the visual content in question. In our solution, this interaction is enabled by simple touch gestures on a tablet. This design decision was motivated by the observation that children (including blind children) are already very skilful in handling touch devices such as smartphones or tablets. Moreover, the sonification techniques we propose are flexible, and different sonification schemes can be reprogrammed and tuned to various content.
We hypothesized that, through audio games, interactive sonification techniques would be readily accepted by blind children. An audio game is a type of computer game in which the audio channel, rather than the visual modality, is primarily or solely used for the user interface. Computer audio games originated from the need to offer accessible games to the visually impaired. Early designs of audio games for the visually impaired were based on text-to-speech technologies. With the fast development of graphical operating systems, however, video games came to dominate due to strong market demand. Recently, audio games have again grown in popularity (Merilampi, Koivisto, & Sirkka, 2018) and have evolved into an important type of computer game devoted not only to blind users (Giannakopoulos, Tatlas, Giannakopoulos, Floros, & Katsoulis, 2018).
One of the first games designed for the visually impaired that used solely non-verbal sounds was the auditory version of the Towers of Hanoi (Winberg & Hellström, 2001). The game relies on moving discs of different sizes between three towers. Unique pitch and timbre features were assigned to every disc, while tower locations were coded by stereo panning and amplitude envelope. Successful trials showed that users can effectively interact with a complex auditory space.
Important conclusions were drawn from another study, conducted by Targett and Fernström (2003), who showed that non-speech audio implementations of Tic-tac-toe and Mastermind helped blind users develop skills that could be used outside the gaming environment.
Similar objectives reaching beyond the gaming environment were achieved in the study reported in Balan, Moldoveanu, Moldoveanu, and Morar (2016). The designed audio games were used for creating immersive 3D spaces for the visually impaired. The purpose of these games was to train the visually impaired in comprehending space from spatial audio signals generated with the use of binaural rendering with Head-Related Transfer Functions (Dobrucki et al., 2010).

Figure 1:  Modules of the interactive sonification mobile application
Finally, an interesting, recently developed application for understanding echolocation and learning to navigate was presented in Wu et al. (2017). The authors have shown that their game application, dedicated to the visually impaired, can be a useful tool for increasing awareness of echo cues.
An important potential application of the proposed sonification software is the presentation of interactive tactile maps to the blind. An interested reader can find reviews of this subject in Brock and Jouffrais (2015), Heuten, Henze, and Boll (2007) and Ducasse, Brock, and Jouffrais (2018).
In the following sections, we explain the proposed sonification method and define the interactive sonification schemes for graphics and images that we use in the developed audio games. We then report the results of the first trials of the proposed interface with the participation of six blind children.

Interactive sonification for Serious Games


For many children, games are one of the most attractive ways to spend free time, yet visually impaired and blind children face very limited access to games of any type. Taking into account earlier successful designs of games for the visually impaired, presented, eg, in Winberg and Hellström (2001), we followed a similar approach to combine two concepts: interactive sonification and serious audio games for blind children. For this purpose, we developed an Android application which combines a number of interactive sonification schemes with educational modules (Figure 1).
Figure 2:  Examples of images from a database built for the project; they include computer-generated graphics, infrared photos and colorized scans of tactile diagrams used in the education of the blind

Every module enables different methods of learning and comprises its own set of images (Figure 2), which are interactively sonified. Many of the images are scans of tactile printouts from teaching materials used in a school for the blind. The application allows each image to have its own two-level

audio description to enhance the educational content. The two levels consist of a short caption and a longer text description that can be read after a longer touch gesture over a chosen region. A separate tool was developed to easily assign the captions to image regions, either rectangular or of a specific colour, and store them in an XML file under the same name as the image. A further planned development of the tool will entail using it with tactile printouts overlaid on the tablet's screen.
We developed three games with the purpose of familiarizing blind children with the interactive sonification schemes and of testing the potential readability of basic educational figures. Two of the games, "Hear the Invisible" and "Follow the Rabbit," are described in this paper. In both tested Serious Games we used images and paths intended to teach the basics of mathematics, ie, fundamental functions and elementary geometric shapes.

Algorithms of interactive sonification of images


The main role of the interactive sonification algorithm is to respond to an interactively selected Region of Interest (ROI) in the image and to produce sound according to the sonification scheme (Figure 3).

Figure 3:  Simplified block diagram of the interactive sonification algorithm

Both parts of the algorithm require an understanding of the needs and the sensory abilities of the visually impaired. We have developed a number of algorithms differing in the ROI input (point, line or area selection) and the sound output (continuous or periodic), but only the most basic one, ie, the point method, is presented here, as it was the one tested by blind users. Its concept is similar to algorithms previously published in Banf and Blanz (2013) and Yoshida et al. (2011), ie, a touched point is converted into a combination of synthesized sounds. The user's task is to explore the image and try to understand its structure solely on the basis of the generated sound feedback.
In our work, we proposed a different approach to colour mapping than the one presented in Banf
and Blanz (2013). Our mapping is based on the HSV (Hue Saturation Value) colour system in the
following way:

1. the H component, for S = 1, is used for determining the dominant frequency of the synthesized polyphonic sound (which is fundamentally a sum of a number of sinusoids),
2. red is represented using the highest frequencies (a colour often described as hot or dangerous) and blue is mapped to the lowest frequencies (a colour often described as cold or denoting deepness),
3. five sounds for producing the polyphonic timbre are sufficient to achieve a reasonable resolution of the colour space without being overcomplicated,
4. the synthesized sound reflects the circular nature of the H component, resulting in a smooth frequency change regardless of its direction,
5. the V component controls the overall sound volume,
6. the S component controls a smooth transition between a simplified monoharmonic (sixth) sound for S = 0 and the full polyphonic timbre for S = 1 (a code sketch of this mapping follows the list).
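As a minimal sketch of such a mapping (not the authors' exact Equation A2, whose coefficients appear only in the appendix), the function below distributes a hue value over five buffer volumes in a circular fashion, blends in a sixth monoharmonic tone according to saturation and scales everything by value. The triangular weighting and all frequencies are illustrative assumptions:

```python
import numpy as np

# Illustrative buffer frequencies (Hz): simple consonant ratios over 220 Hz.
# The paper's actual coefficients c11..c15 are given only in its appendix.
BUFFER_FREQS = [220.0, 247.5, 275.0, 330.0, 366.7]   # hypothetical
MONO_FREQ = 261.6                # hypothetical monoharmonic (sixth) tone

def hsv_to_volumes(h, s, v):
    """Map an HSV colour (h in [0,1), s and v in [0,1]) to the volumes of
    the five polyphonic buffers plus the monoharmonic tone at MONO_FREQ.
    The hue circle is split among the five buffers; each buffer's volume
    peaks when the hue sits at its centre and falls off towards its
    neighbours, keeping the mapping smooth and circular (modulo arithmetic)."""
    n = len(BUFFER_FREQS)
    poly = []
    for k in range(n):
        # circular distance between hue h and this buffer's centre k/n
        d = abs((h - k / n + 0.5) % 1.0 - 0.5) * n
        poly.append(max(0.0, 1.0 - d) * s * v)       # triangular weighting
    mono = (1.0 - s) * v         # monoharmonic tone dominates when S -> 0
    return np.array(poly), mono  # overall loudness scales with value V

# A fully saturated hue activates at most two adjacent buffers:
print(hsv_to_volumes(h=0.1, s=1.0, v=1.0))
```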

In summary, our sonification method produces a polyphonic sound composed of five sound buffers, where the Hue component determines the dominant frequency volume, the Saturation component controls a smooth transition between the colour sound representation and the monoharmonic tone, and the Value component determines the overall sound amplitude (Figure 4). A different timbre for each HSV colour results from the buffer combinations defined by Equation A2 (included in the appendix).

Figure 4:  Hue component mapping to the individual sound volumes of the five sound buffers

The proposed colour mapping model, given by Equations A2 and A3, combines the circular nature of the hue component (handled through modulus operations) with the simplicity of a single-harmonic sound for monochromatic points (saturation equal to zero). The circular property of the sonic transformation was also used to represent pseudo-colour images and to map the desired direction of movement in a polar coordinate system during image exploration.

Every generated sound can be traced back to a colour purely on the basis of Equation A2. A user with a strong mathematical and musical background may learn to recognize a sonified colour in a much shorter period, as understanding concepts such as HSV and frequency components may aid the imagination; however, the purpose of the prepared games was that any user could learn the sonification rules through training.
The described sonification method can be expressed algorithmically as shown in Figure 5. The algorithm is based on the implementation of Equation A2 and on additive synthesis of sound buffers. A more detailed description is available in the appendix.

Although the presented algorithm uses only five sound buffers, the software can handle up to 16 different types of sounds, as this is the maximum number of parallel sound buffers in the Android OS. This leaves room for a straightforward extension of the sonification algorithm in the future without changing the sound engine in the software.
Figure 5:  Sonification algorithm and its execution time on an Android device. The longest step is the preloading of the sound buffers; once they are in memory, the sonification runs in real time

Other sonification algorithms developed by the authors, but not tested in the presented study (Radecki, Bujacz, Skulimowski, & Strumillo, 2016), allowed the user to select different regions of interest with multi-finger gestures (points, lines and areas) and to choose between two main sonification modes: continuous (a real-time response to ROI changes, with loudness reflecting the brightness of the currently selected pixel or area) and looped (performing small sweeps of an area around the selected point, along a line or over an area, in a fashion similar to the vOICe (Meijer, 1992), with columns of pixels swept left to right and pixel rows corresponding to the pitch of a frequency component). The development version of the application allows the user to adjust various parameters of the sonification algorithm in real time, eg, to change the default ROI area or the period of the looped sound.
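A looped ROI sweep of this kind might be sketched as follows; this is our illustrative reading of the mode described above (the window size, sweep period and frequency range are assumptions), reusing the column-sweep idea of the vOICe but restricted to a small region around the touched point:

```python
import numpy as np

def looped_roi_sweep(image, cx, cy, half=20, period=0.5, fs=22050,
                     f_lo=300.0, f_hi=3000.0):
    """Render one sweep over a small window around the touched point
    (cx, cy); the caller loops the returned buffer while the finger rests
    on the point. Columns are swept left to right and each pixel row maps
    to the pitch of one frequency component, as in the vOICe."""
    roi = image[max(0, cy - half):cy + half, max(0, cx - half):cx + half]
    rows, cols = roi.shape
    spc = int(period * fs / cols)                 # samples per column
    t = np.arange(spc) / fs
    freqs = np.linspace(f_hi, f_lo, rows)         # top row = high pitch
    return np.concatenate([
        (roi[:, c][:, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(0) / rows
        for c in range(cols)
    ])
```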

Research methodology
We used the above-mentioned interactive continuous point sonification to conduct tests with six visually impaired children: three aged 10–12 and three aged 13–16. Every child performed the same tests in the same order in the two Serious Games under analysis. The whole research process was divided into three main steps for each game:

1. Interactive sonification algorithm learning process. This was done using specially prepared images that emphasized all the important aspects of a given sonification method (Figure 6).

Figure 6:  Images used for (a) learning the continuous point ROI sonification scheme in the "Hear the Invisible" game, where the primary goal was to learn to recognize the sounds of edges, and (b), (c) learning the sonic colour mapping in the "Follow the Rabbit" game, where the colour (and thus the sound) changed depending either only on the distance to the target's centre or on both distance and direction

2. Analysing an image using the proposed interactive sonification methods in each of the two Serious Games separately. A series of images was sonified in the same order for each Serious Game by each tester.
3. Verifying image understanding by retracing its content on the touchscreen. The drawn curves
were later analysed using coefficients measuring their similarity to the original figures.

The test images included basic mathematical functions and geometric shapes. None of the subjects were told what kind of drawings to expect, which increased the complexity of the tasks and improved the reliability of the results. By later asking the testers to name the shapes, we were able to verify two things: first, the usefulness of the sonification method for recognizing and recreating a shape, and second, the mathematical knowledge of the testers (eg, whether they identified a parabola).
Unfortunately, we had very limited time for conducting the tests, as they were scheduled to take place in the school for the blind under teacher supervision. Hence, the first familiarization step was limited to 15 minutes per participant. In the next two research steps (image analysis and reconstruction), all touch points and gestures of the six subjects were recorded and stored in a database. The saved data were used for measuring the times and the accuracy of recognition and reconstruction.
To visualize the process of image analysis by the blind testers, two-dimensional discrete heatmaps described by the function fHMap(px, py) were created using the AutoKorel software (Radecki, 2006). The heatmaps were calculated by dividing an image into 200×200 ROIs and summing the time spent touching each of them. A sample visualization of the analysis performed by a visually impaired subject asked to find the highest mountain chain in a hypsometric map of Europe is shown in Figure 7. The example comes from another game in the application—"Highest, Fastest, Closest"—in which the player's task was to find extrema in various physical images.
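Such a heatmap can be accumulated directly from the logged touch events. The sketch below assumes a simple log of (x, y, duration) samples in normalized image coordinates, which is our guess at a convenient format rather than the authors' actual database schema:

```python
import numpy as np

def touch_heatmap(touches, bins=200):
    """Accumulate touch durations into a bins x bins grid.
    `touches` is an iterable of (x, y, duration) with x, y in [0, 1)."""
    hmap = np.zeros((bins, bins))
    for x, y, duration in touches:
        px, py = int(x * bins), int(y * bins)     # grid cell hit by the touch
        hmap[py, px] += duration                  # sum time spent in each ROI
    return hmap

# Example log: three samples dwelling near the image centre.
log = [(0.50, 0.50, 0.12), (0.51, 0.50, 0.20), (0.49, 0.52, 0.08)]
print(touch_heatmap(log).max())
```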
Figure 7:  Hypsometric map of Europe in pseudocolours (a) and a heatmap generated by the AutoKorel software (Radecki, 2006) visualizing the analysis of the image by a blind volunteer tasked to find the tallest mountain (b)

We introduced three simple measures to represent the quality and completeness of the analysis of the image. Coefficient $Q_1$ is defined as the ratio between the number of sonified image points belonging to the sonified object ($c_{hit}$) and the number of all sonified image points ($c_{hit} + c_{miss}$):

$$Q_1 = \frac{c_{hit}}{c_{hit} + c_{miss}} \cdot 100\% \qquad (1)$$

Coefficient $Q_2$ defines the ratio between the sonification duration of image points belonging to the sonified object ($t_{hit}$) and the sonification duration of all sonified points ($t_{hit} + t_{miss}$):

$$Q_2 = \frac{t_{hit}}{t_{hit} + t_{miss}} \cdot 100\% \qquad (2)$$

Coefficients $Q_1$ and $Q_2$ can be interpreted as the ability to remain within the desired image region (the region carrying useful information, in the analysis stage) or as the accuracy of tracing the recognized shape (in the recognition stage). In addition, the $Q_2$ coefficient takes the duration of the analysed points into account, which gives further information about the ease of returning to the sonification of the desired part of the image (in the analysis stage) or to drawing the correctly recognized shape (in the recognition stage).
The last coefficient, $Cmpl$, determines the degree of sonification completeness of the sonified object. It is defined as the ratio between the number of curve points that were sonified together with their close neighbourhood ($c_{hitLP}$) and the number of all points defining the curve ($c_{SMAP}$):

$$Cmpl = \frac{c_{hitLP}}{c_{SMAP}} \cdot 100\% \qquad (3)$$
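Under these definitions, the three coefficients can be computed from a touch log and a ground-truth mask. The sketch below assumes hypothetical inputs (a boolean mask of object pixels, per-sample dwell durations and a pixel neighbourhood radius); the radius stands in for the unspecified "close neighbourhood" of Equation 3:

```python
import numpy as np

def quality_coefficients(mask, touches, curve_points, radius=5):
    """Compute Q1, Q2 and Cmpl. `mask` is a 2D boolean array, True on the
    sonified object; `touches` is a list of (x, y, duration) samples in
    pixel coordinates; `curve_points` lists the (x, y) points of the curve."""
    c_hit = sum(1 for x, y, _ in touches if mask[y, x])
    t_hit = sum(d for x, y, d in touches if mask[y, x])
    t_all = sum(d for _, _, d in touches)
    q1 = 100.0 * c_hit / len(touches)               # Equation (1)
    q2 = 100.0 * t_hit / t_all                      # Equation (2)
    # A curve point counts as covered if any touch fell in its neighbourhood.
    c_hit_lp = sum(
        1 for cx, cy in curve_points
        if any((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
               for x, y, _ in touches)
    )
    cmpl = 100.0 * c_hit_lp / len(curve_points)     # Equation (3)
    return q1, q2, cmpl

# Tiny example: a horizontal line object in a 100x100 image.
mask = np.zeros((100, 100), dtype=bool)
mask[50, :] = True
curve = [(x, 50) for x in range(100)]
log = [(x, 50, 0.01) for x in range(0, 100, 2)] + [(10, 20, 0.05)]
print(quality_coefficients(mask, log, curve))
```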


“Hear the Invisible” Serious Game


The first proposed game, named "Hear the Invisible," focuses on discovering the content of an image through its sonic analysis and on measuring the accuracy of its subsequent reconstruction. The interactive analysis was performed without any specific guidelines or instructions, with each test participant finding the most suitable method for him- or herself. The goal of the game was to reconstruct the image structure as accurately as possible and in the shortest possible time. A set of images representing basic mathematical functions was used in the game (Figure 8).
For every examined volunteer and every mathematical function, we gathered data during both the image analysis stage and the reconstruction stage. It should be emphasised, however, that the children differed in their perception and sensory abilities as well as in their approach to the tasks. For example, one volunteer (User1) was able to find a function image extremely quickly and hardly left the function image area (Figure 9a), while another (User4) explored the whole image area trying to find additional content (Figure 9b). It should be borne in mind that none of the test participants had any prior information about the nature of the image.

Figure 8:  Elementary mathematical functions used for the “Hear the Invisible” Serious Game

Figure 9:  Analysis heatmaps for (a) User1 who scored Q1 = 26.8%, Q2 = 83.44%, Cmpl = 100% and (b) User4
who scored Q1 = 16.13%, Q2 = 61.61%, Cmpl = 100%


User1 was able to identify the linear function (Figure 10a) accurately, while User4 mistook the quadratic function for the modulus of a linear function (Figure 10b).

The plots in Figure 11 summarize the results for the age groups 10–12 and 13–16. Although the differences between the age groups are not significant, the older children had marginally better quality and completeness scores, at the cost of spending more than twice as much time on the tasks.

“Follow the Rabbit” Serious Game


The “Follow the Rabbit” game is intended for a single player whose task is to track a colourful
“Rabbit” (Figure 4b and c) that runs along a path representing a particular geometric shape.

Figure 10:  Reconstruction of analysed images for (a) User1 Q1 = 96.69%, Q2 = 98.27%, Cmpl = 92.34% and (b)
User4 Q1 = 52.99%, Q2 = 48.78%, Cmpl = 45.32%

"Hear the Invisible" game – mean results


Analysis stage Reconstruction stage Task times [s]
110 110 300
100 100
250
90 90
80 80 200
70 70
150
60 60
50 50 100
40 40
50
30 30
20 20 0
Analysis Reconstruction
10 10
0 0 10-12 Years Old
Q1[%] Q2[%] Cmpl[%] Q1[%] Q2[%] Cmpl[%] 13-16 Years Old

Figure 11:  Average quality and completeness scores for the “Hear the Invisible” game


The “Rabbit” starts at a specific point (which is different for every shape) and once touched in
its centre, it starts to move. It starts moving very slowly, but it accelerates when it is success-
fully tracked. Touching the “Rabbit” triggers generation of different sounds (consistent with
the colour sonification scheme), dependent on how far from its centre the player is (for circled
“Rabbit” Figure 4b) or in which direction the player should move (for omni-directional “Rabbit”
Figure 4c). The player is allowed to occasionally lose the “Rabbit,” but the game ends when the
player loses track of it for more than two seconds. We prepared several paths, along which the
“Rabbit” moved (Figure 12).
There are two types of "Rabbits" used in the game. The first is characterised by concentric rings of different hue (Figure 6b). This "Rabbit" does not convey information about the direction in which the player should move to stay on the path; it only indicates how far from the centre the player is touching. This is a very simple approach, and the players need to explore the area around the centre of the "Rabbit" on their own. The second "Rabbit" is omni-directional and uses the full capability of the proposed colour sonification scheme. In its very centre, it produces a "monochromatic" (S = 0 and V = 1) sound, while at its edges it has a pure saturated colour (S = 1). The hue component changes radially, with red in the north, blue in the south, green in the east and purple in the west.
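For illustration, the omni-directional "Rabbit" can be modelled as a small radial colour field, with saturation growing with distance from the centre and hue encoding direction. A simple linear hue sweep starting from red in the north is used below; the paper's exact compass assignment (green east, blue south, purple west) would require a slightly non-uniform mapping:

```python
import math

def omni_rabbit_colour(dx, dy, radius):
    """HSV colour at offset (dx, dy) from the rabbit's centre (y grows
    downwards, as on a screen). Saturation grows from 0 at the centre
    (monochromatic sound) to 1 at the edge; hue sweeps linearly around
    the circle, starting from red (h = 0) at north."""
    dist = math.hypot(dx, dy)
    s = min(1.0, dist / radius)                    # distance -> saturation
    angle = math.atan2(dx, -dy)                    # 0 at north, clockwise
    h = (angle % (2 * math.pi)) / (2 * math.pi)    # direction -> hue
    return h, s, 1.0                               # V = 1 everywhere

# Touching east of the centre yields a fully saturated quarter-circle hue.
print(omni_rabbit_colour(30, 0, radius=30))        # -> (0.25, 1.0, 1.0)
```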
For every volunteer and every geometric path, we conducted tests covering the image analysis stage (tracking the "Rabbit") and the path reconstruction stage (probing the understanding of the shape). One of the best players with the centred-rings "Rabbit" was User5. His shape analysis results are shown as heatmaps in Figure 13a and b, and the shapes he plotted in the reconstruction phase are shown in Figure 13c and d.

Figure 13:  Analysis heatmaps and shape reconstructions made by User5 using the centred-rings "Rabbit"
User3 achieved impressive results with the omni-directional "Rabbit." His analysis and reconstruction results are shown in Figure 14a and b and Figure 14c and d, respectively.

Figure 14:  Analysis heatmaps and shape reconstructions made by User3 using the omni-directional "Rabbit"

The results from the "Follow the Rabbit" Serious Game were analysed in the same manner as in the case of the "Hear the Invisible" game. All children were able to finish the tests and achieved results suitable for comparison. The results describing the analysis and reconstruction are summarized in Figure 15 for the different age groups.
Our intention had been to improve image exploration by indicating the direction in which the exploration should continue; for this purpose, we developed the omni-directional "Rabbit." Unfortunately, this approach was too complex for the children, and they did not have enough time to practise direction recognition based on sonified colour. This can be seen in the generally worse results achieved with the omni-directional "Rabbit" (compare Figures 15‒17).
Discussion
In this paper, we report the results of pilot trials of interactive sonification techniques for aiding blind children in perceiving shapes and colours. We have designed custom sonification algorithms for converting visual properties of images (colours and shapes) into sounds. The image HSV colour space is mapped into such sound properties as dominant frequency (controlled by the image Hue), mono-harmonic to poly-harmonic sounds (reflecting the Saturation of image colours) and sound volume corresponding to image brightness (Value). Moreover, the computer-generated sonification schemes can be designed in a flexible manner and even customized for the individual needs of a visually impaired person. The schemes we tested in the trials are the result of considerable research effort and many disappointing pre-trials with blind participants, which enabled us to improve the tool to a level accepted by the blind children. The interactive functionality of the sonification algorithm allows the young user to flexibly define (by touch gestures) regions of interest for sonification and to track salient geometric shapes in the image. Thus, the range and pace of image exploration can be entirely controlled by the user. This algorithm property offers a pupil-centred approach to educating children with different categories of blindness.


"Follow the Rabbit" game – mean results


Analysis stage Reconstruction stage Task times [s]
110 110 250
100 100
90 200
90
80 80 150
70 70
60 60 100
50 50
50
40 40
30 30 0
20 20 Analysis Reconstruction

10 10
10-12 Years Old
0 0
Q1[%] Q2[%] Cmpl[%] Q1[%] Q2[%] Cmpl[%] 13-16 Years Old

Figure 15:  Average quality and completeness scores for the “Follow the Rabbit” game with the use of default
concentric colour coding (Figure 4b)

From the results of our study we can conclude that interactive sonification techniques are a viable approach to teaching blind children and are worth further research effort. Sensory substitution by sonification can enrich the education process and elevate it to standards that are not attainable with embossed pictures or tactile maps (Jansson, 2008). While noting the usefulness of the proposed sensory substitution tool, we should stress that this claim is based on a small study group of only six blind children. The children taking part in the trials differed in their sensory abilities as well as in their approach to the tasks. This observation applies both to their attitude to a novel interactive tool and to the values of the quantitative performance measures computed from the individual trials. Thus, our trial is best defined as a case study that concentrated on individual user achievements rather than on group-based statistics.
The individual results from the "Hear the Invisible" game were analysed using basic statistics (mean and standard deviation) for all calculated quality coefficients $Q_1$, $Q_2$ and $Cmpl$ and for the total time of image sonification and reconstruction. Note that User6 (see Appendix Table A1) spent the longest time analysing the shapes (T = 313 seconds); however, he achieved the best scores for coefficients $Q_2$ = 78.23% (the duration of sonified points belonging to the shape relative to all sonified points, see Equation 2) and $Cmpl$ = 99.69% (sonification completeness, see Equation 3). User5 was the runner-up in the use of the interactive sonification scheme for the "Hear the Invisible" game; he also won his age group. At the opposite end was User2, who did not spend enough time on image analysis (T = 18.94 seconds) and so could not understand the content well. Unsurprisingly, the image reconstruction scores (in terms of the performance coefficients) were better than the analysis-stage scores in almost all cases: the analysis was an exploratory task, whereas the reconstruction was a simple re-drawing of the discovered shape. Again, User2 achieved the shortest reconstruction time, but her other scores were the worst; she was the least patient child playing the game. Finally, note that the age-group comparison for the analysis and reconstruction of static images showed more patience and diligence in the older children, who achieved better results (Figure 11 and Appendix Tables A3–A4).
The “Follow the Rabbit” is a dynamic game in a sense that the user needs to track an object (the
rabbit) along predefined paths forming simple geometric figures (see Figure 9). The results from
the “Follow the Rabbit” game were analysed using the same performance coefficients as for the
“Hear the Invisible” game.
This time User1 was at the forefront, but surprisingly the most promising results were also achieved by User3, who had not excelled in the earlier game. He was able to track the "Rabbit" carefully and to reconstruct the shape precisely. One of the volunteers (User4) was replaced by User7 for the "Follow the Rabbit" game due to unforeseen circumstances.
It was quite unexpected that the younger children were able to follow the "Rabbit" better (Figures 15 and 16) and to reconstruct the geometric shapes better (Figure 11), especially since they had worse results for analysing static images. We can hypothesize that the younger children performed better in this game because it was very simple, did not contain any intellectual challenges, and their stronger competitive instinct dominated the interaction with the game.
Disappointing results were obtained in the omni-directional "Rabbit" trial. Our intention was to use the sonic colour mapping to indicate the direction in which the tracking should proceed. Unfortunately, the children had difficulty accommodating this additional feature (see the results in Figure 17). While trying to focus their attention on the sound timbre, they kept losing the "Rabbit." Possibly, after more training sessions this extra sonification feature could be useful, but not after such short training.
Obviously, the examined groups were not large enough to draw statistically significant conclusions, but the study will continue with a larger group of participants. Further tests will use a wider range of sonifications and another already implemented game—"Highest, Fastest, Closest"—which tasks the user with locating the brightest spot in an image from various data sources (eg, thermograms) in the shortest possible time (see examples of such images on the left-hand side of Figure 2). All the child testers were blind from birth, as the ability to understand visual information can be significantly better for persons who lost sight at a later age (Fernandes & Healy, 2013).

Omnidirectional "Follow the Rabbit" game – mean results


Analysis stage Reconstruction stage Task times [s]
110 110 250
100 100
90 90 200
80 80 150
70 70
60 60 100
50 50
50
40 40
30 30 0
20 20 Analysis Reconstruction
10 10
10-12 Years Old
0 0
Q1[%] Q2[%] Cmpl[%] Q1[%] Q2[%] Cmpl[%] 13-16 Years Old

Figure 16:  Results achieved in “Follow the Rabbit” Serious Game for analysis stage and “Rabbit” type


[Figure 17 shows grouped bar charts comparing the concentric circled and omnidirectional rabbit variants: mean $Q_1$ [%], $Q_2$ [%] and $Cmpl$ [%] for the analysis and reconstruction stages, and task times.]

Figure 17:  Comparison of the concentric circled rabbit and the omnidirectional rabbit

Conclusions
The paper has presented interactive sonification techniques for aiding blind children in perceiving shapes and colours, based on custom algorithms that map the image HSV colour space onto sound: the dominant frequency is controlled by Hue, the transition from mono-harmonic to poly-harmonic timbre reflects Saturation and the sound volume corresponds to image brightness (Value). The interactive functionality of the algorithm allows the young user to define regions of interest by touch gestures, so that the range and pace of image exploration are entirely controlled by the user.
The six blind children who took part in the trials were able (with different success rates) to learn to perceive basic mathematical functions and geometric shapes. We should note, however, that the conducted study is a pilot one and, although very promising, only preliminary conclusions can be drawn about the validity of the proposed interactive sonification scheme. Firstly, the study was conducted with the participation of a limited number of visually impaired children willing to take part in the trials. The participants had to be blind or visually impaired but show no other impairments, which also narrowed the list of potential participants. The group of volunteers showed keen interest in the sonic educational games; such an acceptance level of the tool will have to be confirmed for wider groups of subjects. Secondly, the trials were conducted with the help of the authors of the software tool. Teachers will need to acquaint themselves with the developed tool before using it during regular classes. This can be a challenging task for the teachers, as also indicated in a similar study devoted to teaching mathematics to blind children (Scherer, Beswick, DeBlois, Healy, & Opitz, 2016).
Nevertheless, with cautious enthusiasm, we note that the majority of the volunteers were able to correctly analyse and recognize the test images. This was confirmed over numerous sessions spent with the new interaction tool. The sessions were intentionally built in such a way that they captured the children's interest in the games, whether these contained dynamic image objects or static shapes. The study has revealed that the efficiency of the interactive sonification method for analysing and reconstructing basic mathematical shapes or functions strongly depends on the individual abilities of the visually impaired children. On the one hand, we observed that some of the volunteers surpassed all our expectations as regards their ability to understand and reconstruct the presented test images. On the other hand, we must admit that in both age groups there were children who could not fully grasp the interactive sonification schemes; they needed more time for analysis, and their answers were not as precise as those of the other children. This observation should be an important prerequisite for appraising the educational usefulness of the proposed tool in teaching practice. Nevertheless, our view (also supported by another recent study, Scherer et al., 2016) is that multimodal resources enhance children's perception and facilitate the grasp of specific geometric relations. Currently, we plan to develop more user-friendly versions of the applications in cooperation with a software company, taking users' feedback into account. We envision that the main applications of the interactive sonification tool will be in teaching geometry, some branches of mathematics, biology and geography.

Acknowledgements
The project was financed by the Polish National Science Centre grant no. 2015/17/B/ST7/03884.

Statements on open data, ethics and conflicts of interest


Additional information, raw data (*.csv) and processed data (*.png) are available for download online: http://www.eletel.p.lodz.pl/programy/ison.
The study obtained approval no. RNN/261/16/KE from the Bioethics Committee at the Medical University of Lodz. It was carried out in compliance with the required ethical standards: non-harmful procedures, informing the participants, proper supervision of minors and anonymity of the participants and results. The study was performed on the grounds of, and under the supervision of instructors from, the School for the Blind and Visually Impaired Children, 24 Dziewanny Str., 91-001 Lodz.
The authors certify that they have no conflicts of interest, affiliations with or involvement in any
organization or entity with any financial interest or non-financial interest in the subject matter
or materials discussed in this manuscript.

References
Balan, O., Moldoveanu, A., Moldoveanu, F., & Morar, A. (2016). From game design to gamification and
serious gaming—How game design principles apply to educational gaming. In eLearning & Software for
Education Conference (pp. 334–341), Bucharest, Romania, Issue 1.
Banf, M., & Blanz, V. (2012). A modular computer vision sonification model for the visually impaired. In
Proceedings of the International Conference of Auditory Display. Atlanta, GA.
Banf, M., & Blanz, V. (2013). Sonification of images for the visually impaired using a multi-level approach. In
Proceedings of the 4th Augmented Human International Conference (pp. 162–169). Stuttgart, Germany: ACM.
Baranski, P., & Strumillo, P. (2015). Emphatic trials of a teleassistance system for the visually impaired.
Journal of Medical Imaging and Health Informatics, 5(8), 1640–1651.
Bourne, R. R. A., Flaxman, S. R., Braithwaite, T., Cicinelli, M. V., Das, A., Jonas, J. B., … Taylor, H. R., on behalf of the Vision Loss Expert Group. (2017). Magnitude, temporal trends, and projections of the global prevalence of blindness and distance and near vision impairment: A systematic review and meta-analysis. The Lancet Global Health, 5(9), e888–e897.

Brock, A., & Jouffrais, C. (2015). Interactive audio-tactile maps for visually impaired people. In ACM
Sigaccess Accessibility and Computing (ACM Digital Library), Association for Computing Machinery (ACM)
(pp. 3–12).
Bujacz, M., & Strumillo, P. (2016). Sonification: Review of auditory display solutions in electronic travel
aids for the blind. Archives of Acoustics, 41(3), 401–414.
Cavaco, S., Henriques, J. T., Menguccia, M., Correiaa, N., & Medeirosd, F. (2013). Color sonification for the
visually impaired. Procedia Technology, 9, 1048–1057.
Csapó, A., & Wersényi, G. (2013). Overview of auditory representations in human-machine interfaces.
ACM Computing Surveys, 46(2), 1–23.
Degara, N., Hunt, A., & Hermann, T. (Eds.). (2015). Interactive sonification. IEEE multimedia (Vol. 22, pp.
20–23).
Dobrucki, A., Plaskota, P., Pruchnicki, P., Pec, M., Bujacz, M., & Strumillo, P. (2010). Measurement system
for personalized head-related transfer functions and its verification by virtual source localization trials
with visually impaired and sighted individuals. Journal of the Audio Engineering Society, 58(9), 724–738.
Ducasse, J., Brock, A. M., & Jouffrais, C. (2018). Accessible interactive maps for visually impaired users. In
E. Pissaloux, & R. Velazquez (Eds.), Mobility of visually impaired people (pp. 537–584). Cham, Switzerland:
Springer.
Fernandes, S., & Healy, L. (2013). Multimodality and mathematical meaning-making: Blind students' interactions with symmetry. International Journal for Research in Mathematics Education, 3, 36–55.
Gay, S., Rivière, M., & Pissaloux, E. (2018). Towards haptic surface devices with force feedback for visually
impaired people. In K. Miesenberger, & G. Kouroupetroglou (Eds.), Lecture Notes in Computer Science: Vol.
10897. Computers Helping People with Special Needs. ICCHP 2018 (pp. 110–113). Cham: Springer.
Giannakopoulos, G., Tatlas, N., Giannakopoulos, V., Floros, A., & Katsoulis, P. (2018). Accessible electronic
games for blind children and young people. British Journal of Educational Technology, 49, 608–619.
Hermann, T., & Hunt, A. (2005). An introduction to interactive sonification. IEEE Multimedia, 12(2),
20–24.
Hersh, M., & Johnson, M. (Eds.). (2008). Assistive technology for visually impaired and blind people. London,
UK: Springer.
Heuten, W., Henze, N., & Boll, S. (2007, April). Interactive exploration of city maps with auditory torches.
In CHI'07 extended abstracts on Human factors in computing systems (pp. 1959–1964). San Jose, CA: ACM.
Jansson, G. (2008). Haptics as a substitute for vision. In M. A. Hersh & M. A. Johnson (Eds.), Assistive technology for the visually impaired and blind people (pp. 135–166). London: Springer-Verlag London Limited.
Klatzky, R. L., & Lederman, S. J. (2007). Object recognition by touch. In J. J. Rieser, D. H. Ashmeed, F. F.
Ebner, & A. L. Com (Eds.), Blindness and brain plasticity in navigation and object perception (pp. 185–207).
Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Kramer, G. (Ed.). (1994). Auditory display: Sonification, audification, and auditory interfaces. Reading, MA:
Addison-Wesley.
Marston, J. (2009). Researching end-user requirements with application to wayfinding technologies.
Conference & Workshop on Assistive Technologies for People with Vision & Hearing Impairments. Wroclaw,
Poland (CD-edition).
Meijer, P. (1992). An experimental system for auditory image representations. IEEE Transactions on
Biomedical Engineering, 39(2), 112–121.
Merabet, L. B., & Pascual-Leone, A. (2010). Neural reorganization following sensory loss: The opportunity
of change. Nature Reviews Neuroscience, 11(1), 44–52.
Merilampi, S., Koivisto, A., & Sirkka, A. (2018). Designing serious games for special user groups—Design
for somebody approach. British Journal of Educational Technology, 49, 646–658.
O'Neill, C., & Ng, K. (2008). Hearing images: Interactive sonification interface for images. EVA London
Conference, London, 22–24 July 2008 (pp. 188–195).
Radecki, A. (2006). Numerical features and applications of AutoKorel program. Measurement Automation
and Monitoring, no. 9 bis (pp. 130–133) (in Polish).


Radecki, A., Bujacz, M., Skulimowski, P., & Strumillo, P. (2016). Interactive sonification of color images on
mobile devices for blind persons—Preliminary concepts and first tests. Interactive Sonification Workshop
(ISon 2016), December 15–16th 2016 CITEC. Bielefeld, Germany.
Radecki, A., Bujacz, M., Skulimowski, P., & Strumillo, P. (2017). Interactive sonification of images on mobile
devices for the visually impaired. 21st IEEE Conference on Signal Processing—Algorithms, Architectures,
Arrangements, and Application (SPA 2017), September 20–22nd 2017. Poznan, Poland.
Ramloll, R., Yu, W., Brewster, S., Riedel, B., Burton, M., & Dimigen, G. (2000). Constructing sonified haptic line graphs for the blind student: First steps. 4th International ACM Conference on Assistive Technologies, Arlington, VA, 13–15 November 2000 (pp. 17–25).
Scherer, P., Beswick, K., DeBlois, L., Healy, L., & Opitz, E. (2016). Assistance of students with mathematical
learning difficulties: how can research support practice? ZDM, 48, 633–649.
Sheppard, L., & Aldrich, F. K. (2001). Tactile graphics in school education: Perspectives from teachers.
British Journal of Visual Impairment, 19(3), 93–97.
Skulimowski, P., Korbel, P., & Wawrzyniak, P. (2014). POI explorer—A sonified mobile application aiding
the visually impaired in urban navigation. Frontiers in Network Applications, Network Systems and Web
Services (SoFAST-WS'14) Federated Conference on Computer Science and Information Systems FedCSIS 2014,
Warsaw 2014.
Skulimowski, P., Owczarek, M., Radecki, A., Bujacz, M., Rzeszotarski, D., & Strumillo, P. (2018, November).
Interactive sonification of U-depth images in a navigation aid for the visually impaired. Journal of
Multimodal User Interfaces, 1–12.
Targett, S., & Fernström, M. (2003). Audio games: Fun for all? All for fun? In Proceedings of the International
Conference on Auditory Displays. Boston, MA.
The Audio Description Project. (2018). An initiative of the American council of the blind. Retrieved from http://
acb.org/adp/
Visell, Y. (2009). Tactile sensory substitution: Models for enaction in HCI. Interacting with Computers, 21(1–
2), 38–53.
Winberg, F., & Hellström, S. O. (2001). Qualitative aspects of auditory direct manipulation: A case study
of the Towers of Hanoi. Proceedings of the International Conference on Auditory Displays. Espoo, Finland.
Wu, W., Morina, R., Schenker, A., Gotsis, A., Chivukula, H., Gardner, M., …Heller, L. M. (2017).
EchoExplorer: A game app for understanding echolocation and learning to navigate using echo cues.
In Proceedings of the International Conference on Auditory Displays (pp. 237–240). University Park, PA.
Yoshida, T., Kitani, K. M., Koike, H., Belongie, S., & Schlei, K. (2011, March). EdgeSonic: Image feature
sonification for the visually impaired. In Proceedings of the 2nd Augmented Human International Conference
(p. 11). Tokyo: ACM.

APPENDIX 1. Sonification algorithm details


Our implementation of the sound synthesis method is based on the parallel playback of sound buffers, which
is hardware-supported on Android devices. We change only the frequency and volume of buffer playback in
real time. Because changes of both parameters are implemented in hardware, the latency and delays in the
overall sonification process are kept to a minimum. No sound buffer transformations that could cause addi-
tional delays are performed. This allows the synthesis parameters (such as sampling frequency and sound
volume) to be changed completely asynchronously, regardless of the sound playing time. Such an approach
gives the lowest latency in changing the sound characteristics in response to a change of the ROI and makes
it possible to change the ROI in a continuous manner with negligible delay.
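The sketch below illustrates this buffer-playback scheme using the standard Android AudioTrack API in
static mode. It is a minimal illustration only: the class name, buffer handling and parameter values are our
assumptions, not the exact implementation used in the application.

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

// Minimal sketch: one looped static buffer whose pitch and volume
// are changed asynchronously via hardware-supported playback controls.
public class LoopedTone {
    private final AudioTrack track;

    public LoopedTone(short[] pcm) {
        track = new AudioTrack(
                AudioManager.STREAM_MUSIC,
                44100,                          // base sampling rate f_b
                AudioFormat.CHANNEL_OUT_MONO,
                AudioFormat.ENCODING_PCM_16BIT,
                pcm.length * 2,                 // buffer size in bytes
                AudioTrack.MODE_STATIC);        // buffer uploaded once, looped in hardware
        track.write(pcm, 0, pcm.length);
        track.setLoopPoints(0, pcm.length, -1); // loop indefinitely
        track.play();
    }

    // Changing the playback rate shifts the perceived pitch without
    // touching the buffer contents, so no extra latency is introduced.
    public void setRate(int sampleRateHz) {
        track.setPlaybackRate(sampleRateHz);
    }

    // Volume changes are likewise applied asynchronously (API level 21+).
    public void setGain(float gain) {
        track.setVolume(gain);
    }
}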
The sonification equation presented in the paper evolved from our past approaches (Radecki et al., 2016).
The sonification algorithms were initially evaluated subjectively by our visually impaired partners. The first
approaches used sounds that were judged as frequently annoying, and our test participants could not endure
prolonged training. Our effort was then directed at proposing a sonification method that would be consistently
"pleasant," or at least "not too annoying," for a larger group of test participants. In Equation A2, coefficients
$c_{11}$ to $c_{15}$ were set to stay in harmony. This led to an overall sound timbre that our test participants
called "pleasant." In fact, the sonification method described by A2 can be compared to a musical instrument
that can produce only consonant chords.
Since the beginning of our research in early 2012, the developed sonification method has been constantly
evolving. We have tried different numbers of dimensions and different colour-to-sound mapping algorithms
for various colour representations and sound synthesis parameters (Radecki et al., 2016; Radecki, Bujacz,
Skulimowski, & Strumillo, 2017). As a result of this experience, we picked a total of six dimensions for the
sonic space and the HSV colour space: five dimensions for colour representation and one for pure
monochromatic graphic information.
Based on the above, we decided to test a sound synthesis scheme composed of five sounds symmetrically
spread over the H component. This approach differed from those used in Cavaco et al. (2013) and Yoshida
et al. (2011): the timbre achieved with our scheme was pleasant and consistent, in contrast to the results
obtained in Yoshida et al. (2011), which the listeners described as "too annoying to be used."
The presented approach is based on a direct colour-to-sound converter rather than on mapping shapes with
colours. This means that the user has to learn how each colour is built up in the HSV space and how this
mapping sounds in order to recognize it. However, one does not need to learn the specific assignment to
easily recognize the difference between colours. This is very helpful in object classification tasks in which
different colours, or mappings of depth, height, temperature and other scales onto colours, are used.
The developed sonification algorithm is actually quite generic and can create a complex sonic space $S$ from
a parallel playback of $n$ (up to 16) sounds $s_n^m$. The $S$ space consists of $n$ dimensions, and each of its
elements features one of $m$ unique timbres. Each component of the $S$ space has two main parameters:
sampling frequency $f_{nb}$ and playback amplitude $A_n$. This gives the $S$ space of $s_n^m$ elements that
can be described with the following formula:

$$
S = \left( s_1^{m_1}(f_{1b}, A_1), \cdots, s_n^{m_n}(f_{nb}, A_n) \right)
\tag{A1}
$$

Sounds $s_n^m$ depend on the sonification mode, which combines the sound synthesis scheme with the image
ROI selection and discretization. For every ROI selection of a given size, we calculated the averaged compo-
nents and then transformed them into the sonic space using the following formula:

$$
\begin{aligned}
g_{CPoint}\colon HSV \to S,\quad g_{CPoint}(H, S, V) = \big(\, & s_1^1\left(f_b \cdot c_{11},\; S \cdot V \cdot g_{amp}((H + c_{21}) \bmod 1)\right),\\
& s_2^1\left(f_b \cdot c_{12},\; S \cdot V \cdot g_{amp}((H + c_{22}) \bmod 1)\right),\\
& s_3^1\left(f_b \cdot c_{13},\; S \cdot V \cdot g_{amp}((H + c_{23}) \bmod 1)\right),\\
& s_4^1\left(f_b \cdot c_{14},\; S \cdot V \cdot g_{amp}((H + c_{24}) \bmod 1)\right),\\
& s_5^1\left(f_b \cdot c_{15},\; S \cdot V \cdot g_{amp}((H + c_{25}) \bmod 1)\right),\\
& s_6^2\left(f_b \cdot c_{16},\; V \cdot (1 - S)\right) \,\big)
\end{aligned}
\tag{A2}
$$

where:

$$
g_{amp}(x) =
\begin{cases}
1 - 2x & \text{for } x \le 0.5\\
2x - 1 & \text{for } x > 0.5
\end{cases}
\tag{A3}
$$

in which the $\bmod 1$ operator calculates the fractional part of a number, and the coefficients
$f_b = 44100\,\mathrm{Hz}$, $c_{11} = 0.494$, $c_{12} = 0.44$, $c_{13} = 0.349$, $c_{14} = 0.293$,
$c_{15} = 0.262$, $c_{16} = 0.293$, $c_{21} = 0$, $c_{22} = 0.2$, $c_{23} = 0.4$, $c_{24} = 0.6$,
$c_{25} = 0.8$ are set to be spread over one octatonic musical scale, starting from note C for the blue hue
value and ending at note B for the red hue value. The octatonic scale alternates semitones and whole tones
and is known for its symmetry, ie, it can be shifted by any number of tones without disturbing the relations
between notes. The sound buffers used in A2 hold one sinusoid period each and are by default set to play
back at a frequency of 1 kHz; scaling the base sampling frequency by $c_{1i}$ therefore scales the pitch
proportionally (eg, $c_{15} = 0.262$ gives approximately 262 Hz, note C, while $c_{11} = 0.494$ gives
approximately 494 Hz, note B).
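For concreteness, the following sketch expresses Equations A2 and A3 in code. It assumes HSV components
normalised to [0, 1]; the class, method and variable names are ours and only illustrate the mapping.

// Sketch of the colour-to-sound mapping of Equations A2 and A3.
public class SonificationMapping {

    // Triangular amplitude window g_amp (Equation A3): equals 1 at
    // x = 0 and x = 1, and falls linearly to 0 at x = 0.5.
    static double gAmp(double x) {
        return (x <= 0.5) ? 1.0 - 2.0 * x : 2.0 * x - 1.0;
    }

    // Colour-to-sound mapping g_CPoint (Equation A2). Input: HSV in [0, 1].
    // Output: for each of the six component sounds, a pair
    // {playback sampling frequency [Hz], amplitude in [0, 1]}.
    static double[][] gCPoint(double h, double s, double v) {
        final double fb = 44100.0;                                    // base rate f_b
        final double[] c1 = {0.494, 0.44, 0.349, 0.293, 0.262, 0.293};
        final double[] c2 = {0.0, 0.2, 0.4, 0.6, 0.8};                // hue offsets c_21..c_25
        double[][] sounds = new double[6][2];
        for (int i = 0; i < 5; i++) {                                 // five hue-tuned sounds s_1..s_5
            sounds[i][0] = fb * c1[i];
            sounds[i][1] = s * v * gAmp((h + c2[i]) % 1.0);
        }
        sounds[5][0] = fb * c1[5];                                    // sixth sound, second timbre
        sounds[5][1] = v * (1.0 - s);                                 // carries desaturated (grey) content
        return sounds;
    }
}

For a fully saturated red (H = 0, S = V = 1), for instance, the first sound plays at full amplitude while the
sixth, grey-channel sound stays silent.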

APPENDIX 2. Tables with individual results


The collected data were used to assess the overall effectiveness of the developed sonification method.
Tables A1–A6 list the individual results obtained in the analysis stage and the reconstruction stage,
respectively, for every blind volunteer taking part in the trials.

Table A1:  Analysis stage results in "Hear the Invisible" game for individual subjects

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    22.76   6.25     58.97  20.28     95.52   5.25     97.01   54.27
User2    22.50   9.46     28.14  10.40     81.02  12.85     18.94    4.43
User3    35.82  22.32     55.44  30.31     92.24  10.90     73.05   50.79
User4    13.88   1.61     31.78  19.91     94.63   9.30     80.93   53.77
User5    44.05  17.54     67.53  29.86     96.43   7.15     34.54   16.40
User6    33.60   9.88     78.23   6.07     99.69   0.54    313.18   97.01

Table A2:  Reconstruction stage results in "Hear the Invisible" game for individual subjects

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    61.91  37.46     60.49  39.88     69.61  24.64      5.86    0.58
User2    16.24  13.70     15.49  13.14     20.74  13.50      3.17    1.19
User3    38.20  23.68     36.37  24.52     47.64  16.97      9.59    5.03
User4    41.61  14.49     43.61  21.08     37.97  12.65      3.64    1.64
User5    65.89  24.66     61.27  28.74     82.48  17.38     20.76    8.07
User6    28.90  22.36     32.17  24.20     40.46  13.69     37.79   21.89

Table A3:  Analysis stage results in "Follow the Rabbit" game using the centred rings "Rabbit"

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    55.89   3.94     75.72  12.59     82.87  17.95     87.99   35.56
User2    31.39   0.93     66.07  12.38     80.57  27.48    238.37  134.37
User3    54.76  11.84     70.34  12.16     99.98   0.02    122.02   27.46
User7    43.93  10.12     67.47  11.83     98.43   2.73    139.70   57.96
User5    47.85  20.18     81.13   9.21     94.64   5.90    133.45   45.02
User6    32.68   9.64     61.67  11.57     65.32  15.46    119.18   32.70

Table A4:  Reconstruction stage results in "Follow the Rabbit" game using the centred rings "Rabbit"

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    35.28  18.77     35.05  19.58     54.12  11.35      9.20    1.69
User2    21.32   0.33     19.23   0.38     38.66   7.31      7.01    0.59
User3    51.54  23.23     50.93  23.17     85.82   9.82      4.86    0.58
User7    47.05   7.30     43.75   8.03     76.11   4.14      2.52    0.57
User5    34.97  21.11     32.17  22.60     61.00  22.92     14.44    6.57
User6    16.66   4.11     21.20   8.60     34.48  13.39     24.59    3.50

Table A5:  Analysis stage results in "Follow the Rabbit" game using the omni-directional "Rabbit"

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    51.44   7.23     73.84   6.62     99.89   0.14    120.35    5.26
User2    22.35   0.67     65.75   9.31     69.87  14.32    209.17   52.11
User3    59.23   9.85     80.02   9.08     99.93   0.09    119.10   10.00
User7    40.07   6.35     60.78   9.53     93.38  11.03    182.38   67.93
User5    42.93  10.30     73.44   8.76     84.98  18.41    131.74   56.80
User6    41.11  10.77     68.86  14.10     66.57  20.47    106.44   17.52

Table A6:  Reconstruction stage results in "Follow the Rabbit" game using the omni-directional "Rabbit"

            Q1 [%]           Q2 [%]          Cmpl [%]          T [s]
          Mean     SD      Mean     SD      Mean     SD      Mean      SD
User1    45.87  13.90     43.92  18.95     85.04   5.81      8.94    2.67
User2    25.10   8.85     19.28   8.92     58.47  22.40     10.75    0.85
User3    77.74  16.99     76.58  19.06     93.12   9.36      3.65    0.31
User7    45.62  17.00     45.46  14.82     72.58  11.65      4.34    2.02
User5    54.99  18.72     55.13  20.34     89.13   8.99     12.27    1.67
User6    34.19  10.39     36.18   8.97     56.00  30.16     24.73    7.98
