
Popular Music and Society

ISSN: 0300-7766 (Print) 1740-1712 (Online) Journal homepage: http://www.tandfonline.com/loi/rpms20

The dB in the .db: Vocaloid Software as Posthuman Instrument

Sarah A. Bell

To cite this article: Sarah A. Bell (2015): The dB in the .db: Vocaloid Software as Posthuman
Instrument, Popular Music and Society, DOI: 10.1080/03007766.2015.1049041

To link to this article: http://dx.doi.org/10.1080/03007766.2015.1049041

Published online: 23 Jun 2015.


This article presents an analysis of the Vocaloid voice synthesis software in order to explore
what kind of instrument the synthetic voice is emerging to be. Vocaloid software
exemplifies a database paradigm of cultural production analogous to that from which
instrumental music has always drawn, but representative of a perspective on the voice as
an increasingly malleable instrument. Furthermore, Vocaloid music production
challenges notions of voice as revealing something essential about the body, resulting in
works that subvert traditional expectations of bodies based on vocal pitch and timbre.

Fifty years after the Beatles performed on the Ed Sullivan Show, the Ed Sullivan
Theater was the setting for another milestone in popular music performance. On 8
October 2014 the musical guest on the Late Show with David Letterman was virtual
pop idol Hatsune Miku, a holographic projection whose voice is generated with
Yamaha’s VOCALOID™ voice synthesis software. Miku’s first US television
performance might not have attracted 80 million screaming and fainting fans as
the Fab Four did, but “she” represents more than just a fad in fringe youth culture.
The synthesis of the singing voice is a watershed event in our evolving sociocultural
relationship with what it means to be human.
“The voice makes our life and subjectivity audible,” declares media philosopher
Dieter Mersch (26). The voice “marks the boundaries between the inwardness and the
outwardness of the subject, stages itself as a venture of exposure, addresses the other,
and still remains self-sufficient,” he explains. As this crucial interface between
inwardness and outwardness, artificial simulation of the human voice has captured
the engineering imagination since Daedalus added quicksilver to give speech to his
automata in the mythology of ancient Greece. The late 20th century saw the
development of voice-synthesis technology advance to a point at which text-to-speech
applications can allow those with mutism to speak, and interactive voice response
systems are a ubiquitous part of the bureaucracy of contemporary life. The Spike
Jonze movie Her (2013) speculates about a world where our Siri and Cortana personal
assistants have evolved to become disembodied voices with which we might have the
most intimate relationships. Machinic voices blur the boundaries between human
users and our electronic devices, facilitating what we might experience as ever more
naturalized communication with our machines. Synthetic speaking voices have come
quite a long way from merely understandable speech (think Speak & Spell) to almost
natural sounding speech (like Siri), but a rich and natural synthetic singing voice has
proven to be much more complex to develop. If speech and language were historically
held to be the unique ability of Homo sapiens, singing was the artistic pinnacle.
Although musicians have always experimented with vocalization, and 20th-century
electronic music, in particular, explored manipulation of the human voice through
vocoders, recording techniques, sampling, and effects machines, widespread access to
a fully synthesized voice instrument has been available for only about the last decade.
The affordances and constraints of the Vocaloid software, as well as the way it is
currently being used, reveal how the lines between what is human and what is
machine continue to blur.
This article presents an analysis of the Vocaloid voice synthesis software in order to
explore what kind of instrument the synthetic voice is emerging to be. First, I describe
how the Vocaloid software uses a database paradigm of cultural production analogous
to that from which instrumental music has always drawn, but representative of a
perspective on the voice as an increasingly malleable instrument. Next, I demonstrate
how the ethos of Vocaloid music production challenges notions of voice as revealing
something essential about the body, resulting in works that subvert traditional
expectations of bodies based on pitch and timbre. Finally, I discuss media convergence
in Vocaloid culture, showing how user-generated content circulates to simultaneously
support and destabilize commercial interests in Vocaloid music. Specifically, I follow
the trajectory of two songs, “Tell Your World” and “PoPiPo,” in order to explore how
Vocaloid emerges as a new musical instrument at the crux of fissures in musical
production, performance, and distribution in a globalized and informationalized
world.

Hatsune Miku: The First Sound from the Future


The Vocaloid software was developed at the Pompeu Fabra University (Spain), with
financing from the Yamaha Corporation, as part of the dissertation project of Jordi
Bonada. The goal was to develop a synthesis engine that could “sound as natural and
expressive as a real singer” and whose inputs could be just the score and lyrics of a
song (Bonada and Serra 68). The software is licensed to third-party vendors who
create a database (called a singer library) consisting of phonemes voiced by an organic
body and then release that voice with a score editing version of the software that
allows the user to write music that is sung by the synthesized voice. One journalist
likens a singer library to a musical “font” (Werde). Within the editor, the user clicks
on notes in a “piano roll” interface to create the melody line and then writes the lyrics
for each note (pictured in Figure 1).1 Once the basic vocal track has been created,
modifications to numerous parameters (vibrato, pitch bend, dynamics, and so forth)
Figure 1 The Piano Roll Interface of Vocaloid Software (Left). Crypton’s Original Art
for the Hatsune Miku Vocaloid Software (Right).

can be manipulated by the user. Multiple vocal tracks can create virtual choruses and
instrumental accompaniment can also be written, although exporting the vocal .wav
files to a more complex audio mixing program is the most common and flexible
workflow. Originally intended for commercial music production—one producer
mused of the potential for synthesized background vocals, “It’s intriguing this idea of
‘O.K., just give me all your vowels and all your consonants and I’ll see you later’”
(Werde)—the creation of an avatar, Hatsune Miku, ignited an entire fandom
circulating around Vocaloid music production by amateurs.2
Anime Expo 2011 marked the North American concert debut of a holographic
projection of Miku, a “virtual singer who can sing any song that anybody composes”
(according to her official Facebook page). The Miku Vocaloid was released by Crypton
Future Media in 2007 and quickly became the number one selling computer software
program in Japan. Miku’s singing is generated from a database of samples recorded by
voice actress Fujita Saki, and Crypton hired manga artist Kei Garo to design the image
of Miku (see Figure 1). However, Crypton provided no back story and no personality
characteristics for Miku, leaving her almost completely malleable in the art and
imagination of fan production. Furthermore, Crypton does not control the rights to
non-commercial uses of Miku’s image, which is protected under the equivalent of a
Creative Commons share-alike license, resulting in thousands and thousands of fan-
generated works that incorporate Miku’s image, many of them animated music videos
of original songs written by end-users of the software.3 Beginning in 2010, SEGA and
Crypton started sponsoring live concerts featuring Miku as a life-sized projection
backed by human musicians on keyboards, drums, guitar, and bass, performing songs
that were written within the fan community. As MIT anthropologist Ian Condry says
of these concerts, “It’s as if we could all write songs for Lady Gaga, and she would
perform them for us. Does it matter that Miku’s not real? How ‘real’ is Lady Gaga
anyway?”4 In 2012 an opinion-polling website posed the question, “What music artist
would you most like to perform at the London Olympics?” and Miku was submitted
more than any other artist. Lady Gaga came in tenth. Although there is no
documented evidence that Gaga herself had any knowledge of Condry’s statements or
the London Olympics poll, in an interesting twist Miku was the opening act for 14
North American dates on Lady Gaga’s 2014 ArtRave tour, boosting even further the
exposure of Vocaloid music in the west.
When Miku appeared on the Late Show, Letterman joked that watching the
performance was akin to being on Willie Nelson’s bus, but, as Condry lamented the
next day in the Huffington Post, “People focus on the live show like it’s so bizarre . . .
[to be] cheering at a cartoon on stage” but what Miku fans are really cheering for are
the “people on the other side,” the community of amateur producers of songs, stories,
fashion, and animation that are collectively “Miku” (qtd in Rao). When the Huffington
Post goes on to describe Miku as the “mascot for a product,” it misrepresents
the relationship between Miku the image and Miku the synthetic voice. It is more
accurate to think of Miku as an instrument, a means whereby something is achieved,
performed, or furthered, and, especially, as a musical instrument.

Database
The key to creating a realistic (rather than robotic) synthesized singing voice is the
software’s architecture, which consists of a database of sampled sounds, an expression
module that analyzes the input melody, and a synthesis module that runs algorithms
to concatenate the database samples and apply the expression models. A singer library
(database) is created to capture all the elements of a human singer’s sonic space,
including multiple dimensions of pitch, tempo, loudness, attack, and vibrato
(Bonada). Modeling sonic space in this way places less emphasis on emulating the
mechanics of the human vocal tract and more on the analysis of sound waves in
performance. This is a development over previous voice synthesis technologies that
were often optimized for text-to-speech systems and relied on the shortcuts of
perceptual coding in order to reduce the amount of data processing necessary to
emulate an intelligible voice.
The first electronic voice synthesis apparatus was patented by Homer Dudley at Bell
Laboratories in the 1930s.5 Bell showcased this technology in its exhibits at the 1939 New York
World’s Fair and the San Francisco Exhibition, with daily demonstrations of the
Voder, short for Voice Operating Demonstrator, an organ-like apparatus that required
a skilled “girl operator” to use keys and foot pedals to manipulate the frequency range
and oscillation of electric noise to produce something like recognizable speech “with
what might be called a slight electrical accent” (Bell Laboratories 170). Dudley’s patent
paved the way for developments in formant synthesis of the voice over the next several
decades. Voice synthesis in the 1970s and 1980s was accomplished with specialized
signal processors on integrated circuit chips. Once the price of computer memory and
processing power began to plummet, concatenative synthesis emerged as the choice
for increasingly natural-sounding voices. Since the 1990s, software has been the most
cost-effective means of achieving voice synthesis. Although concatenative synthesis
still often suffers from glitchiness where samples are conjoined (such as in the
characteristic lilt of Apple’s Siri), it allowed Roger Ebert to have assistive technology
created from recordings made before he lost his voice to cancer in 2006,6 and it
allowed Bonada and Serra to create software that emulates realistic human singing.
One of the keys to both the realism and the mutability of voices like Miku is the
database of recorded phonemes from which the voice is assembled.
Vocaloid software is sold with one or more singer libraries included.7 Each library
represents a specific voice, usually recorded by a voice actor, and consists of samples of
all the necessary diphones,8 sustained vowels, and triphones in a language. At a
minimum, diphones need to include all possible phoneme combinations (C-V, V-C,
V-V, C-C),9 so roughly 500 diphones are necessary to synthesize Japanese and around
2,500 for English. Triphones (usually V-C-V) are optional but make for a more
realistic and flexible library. In order to create models that can imitate the timbre and
expressiveness of the actor’s voice, each diphone, vowel, or triphone is usually
recorded in various contexts represented in Bonada’s sonic space model. Although
third-party vendors will decide what they will include in the singer library they sell,
Vocaloid creators recommend at least three different pitch, two different loudness
(soft, loud), and two different tempo (90 and 120 beats per minute) contexts for each
di-allophone.
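The recording burden these recommendations imply can be estimated with simple arithmetic. The sketch below is illustrative only: it uses the approximate diphone counts and the minimum context recommendations quoted above, and it assumes (a simplification not stated in the source) that every diphone is recorded once in every combination of contexts. The pitch labels and all function names are hypothetical.

```python
from itertools import product

# Approximate diphone inventory sizes quoted in the text.
DIPHONES_PER_LANGUAGE = {"Japanese": 500, "English": 2500}

# Minimum recommended recording contexts for each unit.
# The pitch labels are placeholders: the text says only "three different pitch".
PITCHES = ("low", "mid", "high")
LOUDNESSES = ("soft", "loud")
TEMPOS_BPM = (90, 120)


def recording_contexts():
    """Enumerate every (pitch, loudness, tempo) combination."""
    return list(product(PITCHES, LOUDNESSES, TEMPOS_BPM))


def minimum_samples(language):
    """Estimate recordings needed if each diphone is captured in every context."""
    return DIPHONES_PER_LANGUAGE[language] * len(recording_contexts())


if __name__ == "__main__":
    print(len(recording_contexts()))  # 3 * 2 * 2 = 12
    for language in DIPHONES_PER_LANGUAGE:
        print(language, minimum_samples(language))
```

Under these assumptions a Japanese library needs on the order of 6,000 diphone recordings and an English library about 30,000, before sustained vowels and optional triphones are counted.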
Through the process of concatenative synthesis, samples are pulled from the
database and assembled to create the words the software user has entered as lyrics (in
natural language). Therefore, it is also necessary to include sinusoid models in the
database that are used to connect the waves of sampled phonemes in a way that
effectively imitates the singer’s natural articulation. Vocaloid creators Bonada and
Serra believe that sampling “succeeds in capturing the naturalness of the sounds, since
the sounds are real sounds” (67), but it is important to note that the concatenation of
samples is achieved through computational models that are abstractions of human
physiological processes. For example, the Excitation plus Resonances (EpR) model
consists of filtering algorithms that reproduce nuances of the harmonic envelope (the
amplitude structure of the sound frequencies), and templates mapped from the
human singer’s performance are used to generate vibratos. Because aspects of sound
such as harmonic frequencies are measurable, computational models can be used to
“fill in” for the function of larynx, tongue, lips, and so forth in maintaining the timbre,
or tone quality, of the samples as they are concatenated.10 However, like all models,
these are abstractions of intricate processes such as attack, breathiness, airflow, and
breath control that contribute to the complexity known as timbre, which has been
called “the psychoacoustician’s multidimensional waste-basket category for everything
that cannot be labeled pitch or loudness” (McAdams and Bregman 34).11
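The lookup-and-join step of concatenative synthesis can be sketched in a few lines. This is a deliberately simplified illustration, not Vocaloid's actual method: it substitutes a plain linear crossfade in the time domain for the spectral (EpR and sinusoidal) models the software uses. The phoneme symbols, the `#` silence marker, and all function names are hypothetical.

```python
def to_diphones(phonemes, silence="#"):
    """Split a phoneme sequence into overlapping pairs, padded with silence."""
    padded = [silence] + list(phonemes) + [silence]
    return [(padded[i], padded[i + 1]) for i in range(len(padded) - 1)]


def crossfade(left, right, overlap):
    """Join two sample lists, linearly blending `overlap` samples at the seam."""
    if overlap == 0:
        return list(left) + list(right)
    if overlap > min(len(left), len(right)):
        raise ValueError("overlap longer than a sample")
    blended = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # ramps from near 0 to near 1
        a = left[len(left) - overlap + i]
        b = right[i]
        blended.append((1 - weight) * a + weight * b)
    return list(left[:len(left) - overlap]) + blended + list(right[overlap:])


def synthesize(phonemes, database, overlap=2):
    """Look up each diphone's samples and concatenate them with crossfades."""
    output = []
    for unit in to_diphones(phonemes):
        samples = database[unit]  # raises KeyError if the library lacks the unit
        output = crossfade(output, samples, overlap) if output else list(samples)
    return output
```

Given a toy database keyed by pairs such as `("#", "m")`, `("m", "i")`, and `("i", "#")`, calling `synthesize(["m", "i"], database)` returns one continuous sample list. The seams between units are where the glitchiness of real concatenative systems tends to arise, which is why the actual software relies on models far richer than a crossfade.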
Abstractions have limitations. Bonada and Serra note that certain transformations
are rather difficult to achieve, especially those related to irregularities in the voice
pulse sequence, which require more difficult-to-model subharmonics. These types of
irregularities are inherent in characteristically rough or creaky voices and appear
frequently in singing, sometimes even as an intentional expressive recourse such as a
growl (75). In many respects, the software’s synthesis is more pure than a
physiological performance would be. Vocaloid enables users to adjust synthesis
parameters to manipulate the acoustical spectrum at any instant of time in a
composition; however, it is difficult to make the thousands (or tens of thousands) of
adjustments by hand that would be necessary to consistently and significantly change
the timbre of phonemes stored in the singer library. Additionally, the “light” editing
version of the software that is sold to end users often has more limited capability with
regard to “tuning” the spectral envelope of individual phonemes. Crypton has
released several different Miku libraries in which the timbre of the voice throughout
the database is adjusted to imply a “style”—dark, light, soft, solid, sweet, and vivid.
Nevertheless, as is often the case with constraints, human creativity will turn them
into affordances. When imagined as a musical instrument in its own right, Vocaloid
software allows users to create effects beyond the physical capabilities of biological
human voices, especially in range and rhythm. Following the cycles of appropriation
identified by technology scholar Ron Eglash, users of Vocaloid software have adapted
and even reinvented uses and contexts for Vocaloid voice instruments. The first
Vocaloid products released were intended to replace live studio back-up singers but
were not widely adopted for that purpose. Leon and Lola were soul singer libraries
created by United Kingdom sound effects company ZeroG and launched at the 2004
winter National Association of Music Merchants trade show in Los Angeles. The
products stirred interest from the press and won a 2005 Electronic Musician Editor’s
Choice award. However, the software was typically reviewed as an “impressive debut”
that still suffered from considerable ease-of-use issues and lacked the quality that
could be attained with processing software like Auto-Tune (see, for example, Miller;
Walden).
It would take the Japanese cultural penchant for idol singers, kawaii mascots,12 and
robotics to inspire the president of audio company Crypton Future Media, Hiroyuki
Itō, to reinterpret the use of Vocaloid technology for the large dōjinshi (self-published)
media culture, releasing a singer named Meiko in 2004. While the box art depicted an
anime-like female character, there was not an explicit marketing strategy that tied a
Meiko avatar to the software. Users responded, and, as the software matured, so did
Crypton’s strategy for selling it. With the release of Hatsune Miku, the virtual identity
of the singer/instrument was defined by just enough physical traits to unite users with
a platform (the way drummers or violists might be united) while not closing off any
possibilities for how the software might be used as an instrument. While Miku’s
“suggested genre” is pop with a 70–150 bpm tempo range, the software has been used
in genres from opera to death metal, and at tempos and pitch ranges far beyond
human ability. A popular animated music video (AMV) of an original song by cosMo,
“The Disappearance of Hatsune Miku,” has Miku reflecting on her precarious
existence within digital files that can end up in the computer operating system’s trash
bin. As the song builds to the event of “an irreversible error,” the singing is furious,
glitchy, and rapid, effects achieved as affordances of the software itself, leaving behind
any desire for human fidelity.
New media critic Lev Manovich believes that the database is “the center of the
creative process in the computer age” (227), defining new media objects as those which
consist of one or more interfaces to a database of multimedia material. He goes so far
as to argue that the database paradigm is representative of ontology itself in which
“any process or task is reduced to an algorithm, a final sequence of simple operations
that a computer can execute to accomplish a given task. And any object in the
world . . . is modeled as a data structure, that is, data organized in a particular way for
efficient search and retrieval” (223). In fact, Manovich refers to database and narrative
as “natural enemies” (225). Although descriptive of the technical limitations of
computerized databases, his claim fails to account for the vast combinations that a
human user of the database will assemble from the raw materials of the data and
algorithms available. The database is not a new cultural form. Although its prevalence
has certainly increased in scale and authority, it is still a construct with which human
creativity has long coexisted. Musicians have always worked from within a limited set
of tones available from their instrument of choice, but this constraint is technical
rather than creative. Musicians create at the point of friction between the constraints
of a limited database and the possibilities inherent in its vast potential for
recombination. As literature scholar N. Katherine Hayles explains, database and
narrative are symbiotic: “Because database can construct relational juxtapositions but
is helpless to interpret or explain them, it needs narrative to make its results
meaningful. Narrative, for its part, needs database in the computationally intensive
culture of the new millennium to enhance its cultural authority and test the generality
of its insights” (How We Think 176). Hayles’s conclusion that “the expansion of
database is a powerful force constantly spawning new narratives” is played out in the
case of the Vocaloid phenomenon, where narratives proliferate in the form of tens of
thousands of original songs and their derivatives. Hayles seems to describe music
perfectly when she explains, “No longer singular, narratives remain the necessary
others to database’s ontology, the perspectives that invest the formal logic of database
operations with human meanings and gesture toward the unknown hovering beyond
the brink of what can be classified and enumerated” (183).

Timbre
But what of the unique expressiveness, the imperfect perfection of the biological
voice? That bit of something that conveys individuality? Certainly there is no need to
fight an either/or battle for the superiority of a “real” voice in a world of ubiquitous
technologization of the human body. The unaltered biological voice is one of any
number of contexts for the instrument “voice” as it is emerging in today’s popular
music. Examples of timbre as a manipulation within Vocaloid software help identify
how an expansion of the abilities of the singing voice through synthesis might serve to
undermine essentialist and reductive beliefs about human beings. Rather than the
valorization of the machine, my purpose is to explore the creative potentialities
enabled by human-machine symbiosis.
Electronic music has always taken advantage of its non-human elements. For
example, the affectless, robotic sound of vocoder voice synthesis was used specifically
for these qualities by Afro-Futurists conveying a transhuman state of post-racial
identity. Laurie Anderson, also known for incorporating vocoders, self-playing
instruments, and other electronica into her performance art, has been described as “an
electric body in which gestures, stories, and songs mix with synthesizers, video
projections, printed matter, and . . . personal computers . . . reiterating and
recombining simple components into diversifying assemblages” (McKenzie 30).13
While fidelity in voice synthesis may be an engineering goal, it is not necessarily one
that musicians share. Vocaloid singer libraries provide users with an instrument that
can convey posthuman experiences of our co-evolution with our technologies and
each other.
Musicologists Utz and Lau call the “principle of contested identities” a sine qua non
of contemporary vocal music. It goes without saying that particular vocal
articulations have been associated with particular groups, but is there a future in
which this kind of cultural essentialism might be challenged? Utz and Lau hope so.
While recognizing that “speaking with one’s own voice” glosses both hegemonic and
liberatory dimensions, they claim that “the versatility of the voice, its capacity to
simulate, mask, or transcend identities, as epitomized by narrative and dramatic
musical genres worldwide, suggests that voices can be leading agents in shaping what
may be termed intercultural dialogue, transnationalism, or cultural hybridity”
(“Introduction” 2). They temper that enthusiasm by recognizing that “vocal musical
practices are deeply embedded in the historical trajectories and local situations within
which composers, performers, and listeners act” (3). Nevertheless, they believe new
identities, consciously staged, enacted, or subversively dissolved in vocal performance,
offer the potential to transgress personal and cultural boundaries. It would seem that
the ability to experiment, via synthesis, with timbre, a quality that has often been
understood as biologically essential, might offer one method of such transgression.
Musicologist Nina Eidsheim proposes a semiotic analogy as a framework for
considering vocal timbre as an artifact of identity rather than biology, thus
challenging its essentialism. She suggests that the denotative “meaning” of a sound is
the idea of the sound itself, such as a human voice singing B flat above middle C, but
that the timbre of the sound, its reverb, brightness, and fidelity, for example,
constitute its connotation. Therefore, a reverberating B flat might invoke cultural
connotations of church music, but it does not “mean” church music. In a third order
of signification, the B flat might be understood as part of the hegemonic diatonic
system of western music. Any musical genre incorporates many such connotations
that may be misunderstood as essential (consider, for example, Billboard’s “Black
Music” category). Eidsheim uses this framework to argue that race and gender,
organizing principles of American culture, are often seen to be essential to vocal
timbre, bundling together a particular body and a particular timbre in a “flawed
semiology,” when timbre is, more accurately, a performance. As evidence, Eidsheim
offers the user reception to Lola, the very first Vocaloid singer library to be released.
Lola was marketed as a “soul” singer, but rejected as such by users. “Do we have a
British soul singer with a Japanese accent who lisps like a Spaniard?” asked message-
board user RobotArchie (offensively). In fact, the singer whose voice was sampled to
create Lola is culturally none of these. Eidsheim explains the mistaken assumptions of
the database producers:

[T]he [black female] Lola singer was from a Caribbean background, but . . . she was
often in demand as a studio singer for soul material since she sounded idiomatically
like a soul singer. Because [the creators] assumed that a soul sound would be
emitted from any black body, they chose a black body to provide the sound samples.
But when the Lola singer sang pure syllables outside the soul music context, her
origin in the Caribbean—and thus an accent atypical for soul music—was recorded.
In assuming an essential relationship between a black body and the soul sound,
[software creators] assembled Lola using pieces that failed to add up to what we
know as soul. Users’ rejection of Lola as a soul voice shows us that a vocal sound that
we recognize as soul is not the essential sound of blackness which any black vocalist
will automatically inhabit; instead it is comprised of a particular vocal delivery and
timbre (with an indisputable origin in African American culture).

An organic body is physically capable of producing a range of timbres given a
functioning vocal apparatus (lungs, trachea, vocal folds, vocal tract, tongue).
Eidsheim concludes: “When a person is identified by the sound of her voice as African
American, the sound of that voice represents the vocal community to which the singer
belongs, or in which she desires to mark herself as a participant, rather than the
essential sound of her body.” Timbre is a performance of sociocultural divisions, but it
has historically been considered only a sonic phenomenon. We assume we hear the
unmediated sound of a body rather than a conscious performance, but timbre is
largely produced.
While Eidsheim demonstrates that timbre is a performance, her evidence from Lola
users also shows that essentialist ideas about timbre are deeply entrenched. This may
be even more the case as regards gendered voices, where there is a biological
component to the production of vocal timbre as well as pitch. Research has
demonstrated a bias associating certain personality characteristics with vocal gender
cues, such as equating high pitch with submissiveness (see, for example, Robinson and
McArthur; Gordon and Heath). Although such arguments are often framed in terms
of evolutionary biology, there is no reason to perpetuate such stereotypes in
contemporary society (Karpf). A majority of human bodies are able to manipulate
their voices to a mid-range of pitch and, with training, can control articulation and
prosody to achieve transgender effects. However, the engineering of voice-activated
systems makes heavy use of gender biases when designing interfaces (Nass and Brave)
at the cost of perpetuating sexist and heteronormative prejudices. There is a long
historical tradition of gender crossing on stage and in music, from the drastic
measures taken to create castrato singers, to the elaborate costuming of the kabuki and
travesti traditions. Artistic expression has the flexibility to reinforce, resist, or re-imagine
stereotypes, and Vocaloid music often emerges at the fulcrum of this conflict.
There are ten editable parameters in Vocaloid software, one of which is called
“gender factor.” This parameter adjustment applies a combination of filters that affect
pitch transposition, timbre mapping, and spectral shape. These filters are based on
physiological models of the human vocal tract. Timbre mapping applies an algorithm
to adjust for the influence on sound frequencies due to the size of vocal tract organs.
Spectral shape compression involves a theoretical abstraction of the distance between
formants (the amplitude peaks in the frequency spectrum of the sound), which
impacts the perception of pitch. Rather than requiring users to apply the gender factor
filters to each individual phoneme in order to adjust the gender of a singer wholesale,
most Vocaloids have separate singer libraries for a “genderbend” version such as the
female Miku and male Mikuo. It is interesting to note, though, that many genderbend
duos are sampled from the same voice actor. Vocaloid twins Kagamine Rin and Len,
the second product released by Crypton Future Media in 2007, were recorded by the
same voice actress, Shimoda Asami, who explained that Len’s “male” voice is
achieved by speaking from her belly, while Rin’s “female” voice is achieved by speaking
at the top of her head.14 “I imagine Rin’s cute and punchy voice springing out from
the whorl of my hair. You know, this kind of high voice you do with your eyes wide
open, it really comes from there. On the contrary, Len’s low voice can’t come out
properly if you don’t use the power of your belly” (Kurisuto).
Between using multiple singer libraries and adjusting the gender factor parameter,
it is common for Vocaloid compositions to play with voicing that is ambiguous as to
sex. The first opera written to include Vocaloids, The Memory Palace of Matteo Ricci,
makes thematic use of this affordance. One example is that of a Vocaloid character
called “the mother” who appears in the first scene. Even though “mother” is a
biologically gendered reference, aspects of this digitally generated mother obscure
specific identity markers including age, race, and even gender. As musicologist
Samson Young describes it,
The personal identity of the mother is uncertain . . . . The character could be at
once referring to the Virgin Mary, Ricci’s mother, or simply a pacifying and
nurturing spiritual presence . . . . She is devoid of a body, and the [Vocaloid]
singing voice possesses a generic English accent. On stage she is represented by a
large computer-generated talking head, which has facial features that could be Asian
or Western. The . . . synthesized voice also renders the character somewhat sexually
ambiguous . . . . [I]t is sometimes unclear to the ear at which point does the voice
cross over from the female range into the male range . . . . This configuration
presents gender not as a binary, but as a continuum. (Young 209–10)

Vocaloid affords the manipulation of sounds within a singer library to a point at
which pitch and timbre are not reflections of gender, but function outside biological
“male” and “female.” In fan production, Vocaloid characters follow suit, with
gendered “pairs” being commonly performed by either men or women (see
Figure 2). Crossplay, or dressing as a character of the opposite gender, is so common
as not even to raise eyebrows at fan events. After all, many, many characters in anime
and manga worlds (e.g. Pokémon) do not have gender and, for those who do, this

Figure 2 Crossplay Examples, Clockwise from Top Left: Two Female Cosplayers
Dressed as Vocaloid Twins Kagamine Rin and Len; Hatsune Miku and Genderbend
Counterpart Hatsune Mikuo as Portrayed by Female Cosplayers; Miku and Luka
Cosplayers, in Homemade Costumes, Representing a Fan-Drawn Image Associated with
the Popular Song “Magnet” as Portrayed by Two Males and Two Females.

rarely defines their physical abilities, cognitive abilities, or reproductive options in
the fantasy world. As Sue-Ellen Case states, in the cyborg era the body is no longer
an agency for operations, but a theater of operations (44).15 I am not suggesting that
gender becomes a non-issue in Vocaloid culture,16 but that its representation is
much less a reflection of the user’s sense of a stable self than a reflection of emergent
orientation to the community at any specific time. Facilitated by the experimental
affordances of singer library databases and timbre algorithms, gender becomes a
fluid process rather than an essential identity. In other words, gender is one of any
number of narrative possibilities that can be assembled from the database of a singer
library.

Convergence
“The traditional split between the work of creation and the work of production no
longer obtains,” announced Katherine Hayles nearly a decade ago (“Print” 87), just as
Tim O’Reilly was outlining the World Wide Web as platform during the first “Web
2.0” conference. Of course, almost everyone was starting to notice this. Remember,
“You” were TIME magazine’s Person of the Year in 2006 because “You” were out
blogging and posting videos online and updating your MySpace page as part of “a
story about community and collaboration on a scale never seen before” (Grossman).
Pervasive computation marks the end of a paradigmatic concept of media as we have
known them (Gunkel). No longer a matter of studying individuals or collectives,
digital networked technologies force us to “explore relations with others in
technological ensembles” (Mackenzie 197).
The history of the unofficial Miku anthem, “Tell Your World,” is emblematic of
the prosumer convergence that characterizes Vocaloid culture. Commissioned by
Google Chrome Japan for a marketing campaign, the song is the creation of Kz
(“K-Zet”), one of the first dōjin musicians to gain popularity for songs featuring
Miku vocals. Dōjin refers to self-published works, including manga, novels, video
games, art, and music, that are often sold and traded through conventions like the
biannual Comiket market in Tokyo. As an amateur, Kz first gained popularity by
uploading songs to Nico Nico Douga, Japan’s primary video-sharing site. He teamed
up with another dōjin musician, Kajuki P,17 to release an independent album under
the band name livetune at Comiket 73 in December 2007. Livetune was quickly
signed by Victor Entertainment (a subsidiary of JVC) and released two albums of
music featuring Hatsune Miku vocals, the first of which reached number five on the
Japanese music charts. “Tell Your World” debuted in a 61-second spot for Google
Chrome in Japan in 2011. Livetune (just Kz as of 2009) left Victor for the
independent Toy’s Factory label and in early 2012 released a full-length single of
“Tell Your World” through the iTunes store, making it immediately available in 217
countries. “Tell Your World” was the finale of Miku’s set on Lady Gaga’s 2014
ArtRave tour.
While Kz is a sort of hero within Vocaloid music culture for emerging out of the
dōjin scene because his songs were essentially nominated by the community, it is easy
to see how corporate sponsorship can quickly take advantage of community ethos to
achieve its own ends. The Chrome ad depicts Japanese teenagers writing Vocaloid
music, creating digital Miku art, and sharing and collaborating via the Chrome
browser. As in the American campaigns that feature Justin Bieber and Lady Gaga as
web-native musical artists, Miku’s popularity is seen to grow exponentially through
search page counts, Google map bubbles, and YouTube uploads. The spot ends with a
quickly scrolling list of credits, giving individuals’ screen names and the roles they play
within Vocaloid culture (e.g. musician, illustrator, dancer, cosplayer) and then pauses
on the final “credit”: Everyone, Creator.18 Kz explains,

For a lot of people and of course for me as well, Google already equates to the
Internet so I feel extremely happy and incredibly honoured for having the chance to
participate in such a project and have the song used in such a meaningful way.
I wrote the song to express the feelings that I get when using the internet, the
feelings of happiness and excitement and I think it’s because many people had
similar feelings and were able to relate with the song so that’s why it became such a
big hit. (qtd in Collab)
As in all art forms, commodification is often a tension in Vocaloid music
production. On the one hand, expert users are valorized when recognized by
commercial interests, while, on the other, commercial exploitation of the community
is disdained. When amateur Kz is promoted from within the community and ends up
achieving commercial success, the community participates in that success. When
Marc Jacobs is hired to design new costumes for the first Miku opera, the community
is outraged, taking to discussion boards to organize boycotts. Furthering this tension
is the free labor that sustains the very possibilities of commodification in the first
place. It is unlikely that the amateur musicians whose songs were featured in Miku’s
ArtRave set were compensated in spite of the fact that their user-generated content
creates additional markets for Lady Gaga.
These concerns are not insignificant,19 but media scholar Jonathan Sterne has
noticed that “future of music” questions have “tended to revolve around . . .
intellectual property debates and matters of industry control over professional
musicians,” questions that focus on music as a profession and an industry, perhaps at
the expense of seeing music as a social practice (256). As a social practice, Vocaloid
music often calls attention to “something deeper beneath the waves of dominant
corporate culture” (Nozomi). Miku is not a Vocaloid mascot, and her development is
not regulated by any corporate body. Even in examples as innocuous as the
ridiculously popular song “PoPiPo,” which has Miku celebrating the virtues of
vegetable juice, community values for peer-to-peer communication, folk artistry, and
an ironic relationship to globalization are evident.
The catchy “PoPiPo,” composed by fan artist Lamaze-P, was one of the first Miku
songs to become popular at Nico Nico Douga. The silly song lyrics are about drinking
vegetable juice, but the catchiness of the simple tune and the dance that the hand-
drawn Miku animation does in the original video (see Figure 3) caught on quickly,
spawning imitations, parodies, remixes, and human fans performing the “PoPiPo”
dance all over the world. As with other successful pop song dances (the “Macarena,”
the “Achy Breaky Heart” line dance, and 2012’s viral hit “Gangnam Style”), a simple,

Figure 3 Image of Miku from Original “PoPiPo” Music Video (Left). Stop-Motion Art
from a Fan Derivative “Popipo” Video Showing the Two Images Needed to Animate the
PoPiPo Dance Move (Right).
repetitive tune, choreographed with easy to mimic, somewhat exaggerated dance steps
is the key to the participation and parody. However, since the dance step is also easy to
animate, requiring only two still drawings (many fanimations are created with hand-
drawn stop-motion techniques), it also promoted numerous fan-made videos.
About six weeks after Lamaze-P’s original “PoPiPo” was posted to Nico Nico
Douga, another user, Sunafuki P, posted what is now recognized as the “official”
video (although by what authority “official” is judged is a reflection of the fan
community).20 This animation is also hand drawn but is significantly more complex
in terms of graphics, backgrounds, and movement, reflecting an anime aesthetic of
bright colors, rapid movements, exaggerated facial expressions, and annotated
gestures. Many parodies followed, often making reference to the ridiculousness of the
original lyrics about vegetable juice being healthy, by depicting, for example, the
exploits of those who fail to heed the song’s advice and choose beverages of a less
healthy variety, or those who must be force-fed their vegetable juice. Others mock the
commodification of something as unsexy as green vegetable juice. Many versions of
the odotte mita (“I’ve tried dancing it”) and utatte mita (“I’ve tried singing it”) genres
have also been uploaded by fans who filmed their own performances for the
community.
Following its initial success as a music video, references to “PoPiPo” showed up in
many other guises. In 2012, a search at worldwide SNS deviantART for “PoPiPo”
returned almost 10,000 results.21 The diversity of imitation, parody, and response is
impossible to summarize. From drawings (simple, complex, cartoonistic, realistic) to
animated GIFs to costume play to one-of-a-kind trading cards to handmade toys and
clothing, “PoPiPo” has spawned user-generated content from every corner of the
networked world. As Harvard law professor Yochai Benkler observes, “One cannot
make new culture ex nihilo” (qtd in Jenkins 281). Mass media have dominated first-
world transnational culture long enough for us to draw on it as a shared vocabulary.
Benkler explains, “If we are to make this culture our own, render it legible, and make it
into a new platform for our needs and conversations today, we must find a way to cut,
paste, and remix present culture” (qtd in Jenkins 282). Rather than being simply a
means to trivialize, parody functions in this remix of culture to rework materials for
alternative purposes. Parody becomes pedagogy, a movement beyond seeing popular
culture as a “weapon of mass distraction” and instead viewing it as material to be
transformed in the crucible of sociality into tools for political action (Jenkins 288).
So far as I can tell, Miku fan production has not yet been organized toward radical
politics, but the uptake of Miku in parody cannot be used to dismiss Vocaloid out of
hand. Even the frivolity of the “PoPiPo” example shows a certain critical orientation
to pervasive advertising and, occasionally, world politics (see Figure 4).

Conclusion
Ian Condry is correct; Miku’s appearance on the Late Show caricatures the network that
exists behind the “cartoon on stage” (Rao). Condry hopes that thinking about Miku can

Figure 4 Collage of User-Generated Parody Based on “PoPiPo” from deviantART, © the
Artists: Clockwise from Top Left, Parody of the “Inspiration Poster” Meme Featuring
Miku; Anti-Soviet Cartoon Replacing Vegetable Juice from Original with Bomb;
Animated GIF Showing a Canadian Miku Doing the PoPiPo Dance; and
Photoshop-Generated Image Replacing Miku with Self-Portrait of the Artist and
Vegetable Juice with Mt. Dew.

help us “imagine new approaches to creating communities of action for the future.”
Recognizing that she reinforces some of the lessons about participatory civic media that
have already been learned—people need to feel a genuine openness to participate;
sharing and dialogue are key to building a community; free culture is more generative
than controlled-IP systems—as well as the fact that cooption and commercialization
are always risks, he nevertheless believes that Miku represents a new platform for
organizing civic action through transnational transmedia creation. That is, the many
incarnations of Miku, from fan production to holographic distribution, offer a model
of civic practice that provides insight into how collective action is leveraged in the
digital environment. As a slight revision of Condry’s dreams for Miku, I would assert
that music is the platform, as it has always been, and that Vocaloid is an instrument with
affordances and constraints uniquely able to help us explore our complex relationship
with conceptions of voice and its changing material and technological iterations.
As with any instrument, the personal practice can be as meaningful as the
community sharing or the public performance. A quick glance at some of the Facebook
comment threads on a local Vocaloid users page for the intermountain American West
reveals that a main point of conversation after the Late Show performance was not
whether the appearance will make or break Miku’s popularity in the United States as a
virtual idol, but rather had to do with the tuning of the instrument itself. Some
complained that the English-language song “Sharing Your World” was performed
rather than “Tell Your World” (the similarities in the messages of the two are probably
not coincidental), but discussions often focused on the quality of the singing, or, by
extension, the skill of the electronic musician using the Vocaloid software. In fact, it
seems that Crypton might have made a significant misstep by promoting Miku the
performer to US audiences rather than Miku the instrument. American audiences have
already proved to be amenable to Vocaloid music, as demonstrated by the success of
Porter Robinson’s album Worlds, which features Vocaloids.22
Speculating about possible ethical implications in the creation of increasingly
sophisticated virtual musicians, composer Nick Collins summarizes the engineering
impulse clearly:
Expressive modeling in alliance with advances in music information retrieval has a
long-term goal of targeting the playing styles of past musicians. There is a continual
engineering drive to further the abilities of transcription technology . . . . Our
ability to extract and generalize musical personality from fixed recordings (with
some amount of psychological and physiological modeling to fill in the gaps) will
only improve. (Collins 37)

It is precisely those “gaps” in his parenthetical in which our technogenesis blossoms. Or as
Collins puts it, “The more of ourselves we commit to virtual musicians, the weirder and
more interesting musical life may become” (38). It is neither the reincarnation of “dead”
voices nor the duplication of “real” voices that Vocaloid musicians are interested in. When
asked which Vocaloid character is his favorite, Kz stated, “I don’t really see them as an
actual character, I see them as something I can adjust and use in a software” (Collab). Miku
is, first of all, an instrument with which one can create music. Since she is the product of a
database paradigm and does not exist as a set personality with a specific narrative back
story, what she can become is emergent, dynamic, and malleable in the hands of musicians
and fans alike. As such, she is a construction in the sense that Judith Butler uses the term:
Construction is not opposed to agency; it is the necessary scene of agency, the very
terms in which agency is articulated and becomes culturally intelligible . . . . The
critical task, [sic] is . . . to locate strategies of subversive repetition enabled by those
constructions, to affirm the local possibilities of intervention through participating
in precisely those practices of repetition that constitute identity and, therefore,
present the immanent possibility of contesting them. (Butler 147)

Everyone, Creator.

Disclosure statement
No potential conflict of interest was reported by the author.

Notes
[1] The software takes care of syllabification by matching words to its own syllable pronunciation
dictionary. As of 2014, Japanese (hiragana, katakana, and Japanese case), English, Korean
(Hangeul), and Spanish are supported, with plans to expand to Chinese.
[2] Although Vocaloid is not the only singing voice synthesis product available, it is currently the
most affordable and most widely used by amateur musicians. In addition to Miku, there are
currently more than two dozen voice banks available in Japanese, English, and Spanish.
[3] Crypton claims that more than 50,000 original songs have been written for Miku. In October
2014 a search of YouTube for “Hatsune Miku” returned “about 1,550,000 results,” but it is
impossible to make an authoritative count of the number of fan-generated works in the
dynamic, international digital environment, let alone the analog works or those that are not
uploaded to the internet. Miku had 2,553,615 likes on Facebook at that point.
[4] PBS’s online Idea Channel series of videocasts about the evolving relationship between
modern technology and art also raised the question of “reality” in its episode “Who’s the Real
Pop Star?” which, by way of juxtaposing Miku with a controversial human singer, Lana Del
Rey, poses the question: What does authenticity really mean in the world of popular music?
(http://www.pbs.org/arts/exhibit/idea-channel-s1e2-lana-del-rey-miku-hatsune/). However,
the brief episode fails to mention that Miku’s popularity stems largely from the fact that her
songs are created by the fan community.
[5] US Patent No. 2151091 was filed on 30 October 1935 and granted on 21 March 1939.
[6] Stephen Hawking is one of the most famous users of assistive voice synthesis; however,
Hawking uses the CallText5010, a hardware-based synthesizer with a solid state synthesis
chip. This is a different technology from the concatenative synthesis by software of Vocaloid.
[7] Vocaloid software does not enable the creation of new singer libraries, but freeware called
Utau, reverse engineered from Vocaloid, does.
[8] In linguistic terms, a phone is any speech sound or gesture considered as a physical event
without regard to its place in the phonology of a language. A diphone is an adjacent pair of
phones. In speech synthesis, pre-recorded diphones are often combined to create synthesized
speech because the resulting sounds are much more natural than combining just simple
phones as the pronunciations of each phone can vary based on the surrounding phones.
[9] C = consonant and V = vowel.
[10] Bonada and Serra provide a complete technical account of each model.
[11] More technically, timbre is often understood as the unique combination of five acoustic
parameters: tone, spectral envelope, ADSR (attack, decay, sustain, release), changes in
fundamental frequency, and the prefix of the sound (which in the case of vocal timbre is tied
to the control of breath).
[12] Kawaii is roughly translated as “cute.”
[13] For a comprehensive discussion of the use of vocoders in electronic music, see Tompkins.
[14] I use the verb speaking here because that is how the phonemes are recorded. It is the
concatenation that creates singing. Shimoda insists that she is not the singers Rin and Len; “I
only provide the voices . . . I don’t actually do the singing” (Kurisuto).
[15] A 2001 survey-based study of gender-switching in MOOs (an object-oriented online multi-
user environment) concluded that it was a minority practice and “more benign” than usually
portrayed in the literature. The researchers believed gender-switching to be “an experimental
behavior rather than . . . an enduring expression of sexuality, personality or gender politics”
(Roberts and Parks 265–66). Their surveys revealed that the most prevalent reason for
gender-switching was simply that it expanded possibilities for game play (277).
[16] In contrast to many visual analyses of anime which find the overt sexualization of young
female bodies problematic (Flanagan is a representative example of the kinds of analyses I am
referring to, while Black discusses virtual idols specifically), Joel Gn concludes that “the
animated body is not a portrait of anatomical sex, but a synthesized performance of gender”
(585). However, this would seem to avoid the existence of myriad sexual fetish communities
that do congregate in the culture around anime and manga and may or may not perpetuate
subject positions that constrain human agency. There is an active pornographic industry
featuring Miku as a character. This is precisely why I advocate Hayles’s media specific analysis
(MSA) as part of a toolbox of analytical concepts that can be applied when weaving one’s way
through the complexities of media. See also the position of Mackenzie as presented
previously.
[17] P is the common designation for “Producer,” the role that many musicians who write music
for Vocaloid voices identify themselves with.
[18] The original Japanese ad is available at the Google Chrome Japan YouTube channel:
https://www.youtube.com/watch?v=MGt25mv4-2Q&list=UUgJaiN-DuBKOcjFgcGJ3Q2w.
A version with English subtitles is at https://www.youtube.com/watch?v=xExy_FCC0PA.
[19] There are many excellent critiques of user-generated free labor. See, for example, Lessig; Scholz.
[20] These videos can be seen at the entry for “PoPiPo” at the website http://knowyourmeme.com.
[21] deviantART claims 33 million registered users of its peer artist network. The “about” page
claims that “the site’s vibrant social network environment receives over 160,000 daily uploads
of original art works ranging from traditional media, such as painting and sculpture, to
digital art, pixel art, films and anime.” It should be noted that the popularity of “PoPiPo” has
waned in the last couple of years and there are currently just over 2,000 derivative works
uploaded as of October 2014.
[22] Worlds, released in August 2014, entered the Billboard 200 Albums chart at #18 and the iTunes
Overall Albums chart at #4. Worlds reached #1 on the Billboard Dance/Electronic chart and
the iTunes Electronic Albums chart.

Works Cited
Bell Laboratories. “Pedro the Voder: A Machine that Talks.” Bell Laboratories Record 17.6 (1939):
170–171. Print.
Black, Daniel. “The Virtual Ideal: Virtual Idols, Cute Technology and Unclean Biology.” Continuum:
Journal of Media & Cultural Studies 22.1 (2008): 37–50. Print.
Bonada, Jordi. “Voice Processing and Synthesis by Performance Sampling and Spectral Models.”
Diss., U Pompeu Fabra, 2008. Print.
Bonada, Jordi and Xavier Serra. “Synthesis of the Singing Voice by Performance Sampling and
Spectral Models.” IEEE Signal Processing Magazine 24.2 (2007): 67–79. Print.
Butler, Judith. Gender Trouble: Feminism and the Subversion of Identity. New York: Routledge,
1990. Print.
Case, Sue-Ellen. “Performing the Cyberbody on the Transnational Stage.” Gramma: Journal of
Theory and Criticism 10 (2002): 41–57. Print.
Collab, Meta. “Comic Fiesta 2013 Exclusive Interviews.” Metanorn, 14 Jan. 2013. Web. 12 Apr. 2015.
Collins, Nick. “Trading Faurés: Virtual Musicians and Machine Ethics.” Leonardo Music Journal 21
(2011): 35–39. Print.
Condry, Ian. “Miku: Japan’s Virtual Idol and Media Platform.” MIT Center for Civic Media, 11 July
2011. Web. 12 Apr. 2015.
Eidsheim, Nina. “Synthesizing Race: Towards an Analysis of the Performativity of Vocal Timbre.”
Revista Transcultural de Música 13 (2009): n. pag. Web. 12 Apr. 2015.
Eglash, Ron. “Appropriating Technology.” Appropriating Technology: Vernacular Science and Social
Power. Ed. Ron Eglash, Jennifer L. Croissant, Giovanni Di Chiro and Rayvon Fouché.
Minneapolis: U of Minnesota P, 2004. vii–xxi. Print.
Flanagan, Mary. “Mobile Identities, Digital Stars, and Post-Cinematic Selves.” Wide Angle 2.1 (1999):
76–93. Print.
Gn, Joel. “Queer Simulation: The Practice, Performance and Pleasure of Cosplay.” Continuum:
Journal of Media & Cultural Studies 25.4 (2011): 583–593. Print.
Gordon, Matthew and Jeffrey Heath. “Sex, Sound Symbolism, and Sociolinguistics.” Current
Anthropology 39.4 (1998): 421–449. Print.
Grossman, Lev. “You—Yes, You—are TIME’s Person of the Year.” TIME Magazine, 25 Dec. 2006.
Web. 12 Apr. 2015.
Gunkel, David J. “Beyond Mediation: Thinking the Computer Otherwise.” Interactions: Studies in
Communication and Culture 1.1 (2009): 53–70. Print.
Hayles, N. Katherine. How We Think: Digital Media and Contemporary Technogenesis. Chicago: U of
Chicago P, 2012. Print.
Hayles, N. Katherine. “Print is Flat, Code is Deep: The Importance of Media Specific Analysis.”
Poetics Today 25.1 (2004): 67–90. Print.
Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York: New York UP,
2006. Print.
Karpf, Anne. The Human Voice. New York: Bloomsbury, 2006. Print.
Kurisuto. Asami Shimoda Interview. Vocaloidism, 12 Feb. 2011. Web. 12 Apr. 2015.
Lessig, Lawrence. Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture
and Control Creativity. New York: Penguin, 2004. Print.
Mackenzie, Adrian. Cutting Code: Software and Sociality. New York: Peter Lang, 2006. Print.
Manovich, Lev. The Language of New Media. Cambridge, MA: MIT Press, 2001. Print.
McAdams, Stephen and Albert Bregman. “Hearing Musical Streams.” Computer Music Journal 3.4
(1979): 26–43. Print.
McKenzie, Jon. “Laurie Anderson for Dummies.” The Drama Review 41.2 (1997): 30–50. Print.
Mersch, Dieter. “Presence and Ethnicity of the Voice.” Utz and Lau 25–44. Print.
Miller, Dennis. “ZERO-G Vocaloid 1.02 Leon and Lola (Win) (review).” Electronic Musician, 1 Aug.
2004. Web. 12 Apr. 2015.
Nass, Clifford and Scott Brave. Wired for Speech: How Voice Activates and Advances the Human-
Computer Relationship. Cambridge, MA: MIT Press, 2005. Print.
Nozomi, Hayase. “Otaku: A Silent Cultural Revolution.” Cultureunplugged.com, 1 July 2010. Web. 12
Apr. 2015.
Rao, Mallika. “Meet Hatsune Miku, the Sensational Japanese Pop Star Who Doesn’t Really Exist.”
Huffington Post, 10 Oct. 2014. Web. 12 Apr. 2015.
Roberts, Lynne D. and Malcolm R. Parks. “The Social Geography of Gender-Switching in Virtual
Environments on the Internet.” Virtual Gender: Technology, Consumption, and Identity. Ed.
Eileen Green and Alison Adam. New York: Routledge, 2001. 265–285. Print.
Robinson, Janet and Leslie Z. McArthur. “Impact of Salient Vocal Qualities on Causal Attribution
for a Speaker’s Behavior.” Journal of Personality and Social Psychology 43 (1982): 236–247.
Print.
Scholz, Trebor, ed. Digital Labor: The Internet as Playground and Factory. New York: Routledge, 2012.
Print.
Sterne, Jonathan. “On the Future of Music.” Cybersounds: Essays on Virtual Music Culture. Ed.
Michael D. Ayers. New York: Peter Lang, 2006. 255–263. Print.
Tompkins, Dave. How to Wreck a Nice Beach: The Vocoder from World War II to Hip-Hop. Chicago:
STOPSMILING Books, 2010. Print.
Utz, Christian and Frederick Lau. “Introduction: Voice, Identities, and Reflexive Globalization in
Contemporary Music Practices.” Utz and Lau 1–22. Print.
Utz, Christian and Frederick Lau, eds. Vocal Music and Cultural Identity. Hoboken, NJ: Taylor &
Francis, 2013. Print.
Walden, John. “Vocaloid Leon & Lola: Singing Synthesis Software for Windows (Review).” SOS:
Sound on Sound, Mar. 2004. Web. 12 Apr. 2015.
Werde, Bill. “Could I Get That Song in Elvis, Please?” New York Times, 23 Nov. 2003. Web. 12 Apr.
2015.
Young, Samson. “A ‘Digital Opera’ at the Boundaries of Transnationalism: Human and Synthesized
Voices in Zuni Icosahedron’s The Memory Palace of Matteo Ricci.” Utz and Lau 203–24. Print.

Notes on Contributor
Sarah A. Bell is a doctoral candidate in Communication at the University of Utah.
Her research focuses on the history and rhetoric of digital computer technologies.
