DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

A TECHNICAL SEMINAR ON
“VOICE MORPHING”

CO-ORDINATOR: E. SRI LAXMI (M.Tech, Asst. Prof.)
PRESENTED BY: P. SHIVA SHANKAR (19631A0518)
CONTENTS
 WHAT IS VOICE MORPHING?
 APPROACHES TO THE PROBLEM.
 CONVERSION OF VOICE.
 TYPES OF VOICE MORPHING.
 REFERENCES OR METHODS.
 APPLICATIONS OF VOICE MORPHING.
 AVAILABLE SOFTWARE FOR VOICE MORPHING.
 CONCLUSION.
WHAT IS VOICE MORPHING?
 Voice morphing, also referred to as voice
transformation or voice conversion, is a technique that
modifies a source speaker's speech utterance to sound as
if it were spoken by a target speaker.
 Many applications may benefit from this sort of
technology. For example, a text-to-speech (TTS) system
with integrated voice morphing can produce many
different voices. Where speaker identity plays a key
role, such as in dubbing movies and TV shows,
high-quality voice morphing would be very valuable: it
allows the appropriate voice to be generated (possibly
in different languages) without the original actors
being present.
APPROACHES TO THE PROBLEM

 Voice conversion will be performed in two phases.

 In the first phase, training, the speech signals of the
source and target speakers will be analyzed and their
voice characteristics will be extracted by means of
Linear Predictive Coding (LPC), a mathematical
optimization technique that is very popular in speech
processing.
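
The slides do not say how the LPC coefficients are computed. A common choice is the autocorrelation method solved with the Levinson-Durbin recursion, and the following is a minimal sketch under that assumption. The function names and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are illustrative and are not taken from the slides.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the autocorrelation of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err): a = [1, a1, ..., ap] describes the prediction-error
    filter A(z) = 1 + a1 z^-1 + ... + ap z^-p (assumed convention), and
    err is the remaining prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update the coefficients of orders 1..i.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

In the training phase, a routine like this would be run on every frame of both the source and the target speaker, so that the two sets of coefficients can be compared.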

 In the second phase, the transformed features will be
used to synthesize speech that will, hopefully, resemble
that of the target speaker.

 Speech synthesis will again be performed by means of
Linear Predictive Coding.
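
LPC synthesis amounts to driving an all-pole filter 1/A(z) with an excitation signal. The sketch below assumes the coefficient convention A(z) = 1 + a1 z^-1 + ... + ap z^-p. The function names are hypothetical, and pairing the source speaker's residual with transformed coefficients is an assumption about how the second phase could be wired, not something the slides specify.

```python
def lpc_residual(signal, a):
    """Analysis (inverse) filter: e[n] = sum_k a[k] * x[n-k], with a[0] = 1."""
    return [sum(a[k] * signal[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(signal))]


def lpc_synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        feedback = sum(a[k] * out[n - k]
                       for k in range(1, len(a)) if n - k >= 0)
        out.append(e - feedback)
    return out
```

Filtering a frame with `lpc_residual` and passing the result back through `lpc_synthesize` with the same coefficients reproduces the frame. Swapping in a different coefficient set before synthesis is what changes the spectral envelope.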
CONVERSION OF VOICE
 TECHNIQUES:
 Wavelet decomposition.
 Proposed model.

 Wavelet Decomposition:
 Wavelets are a class of functions that possess compact
support and form a basis for all finite energy signals.
 They are able to capture the non-stationary spectral
characteristics of a signal by decomposing it over a set of
atoms which are localized in both time and frequency. The
DWT uses the set of dyadic scales and translates of the
mother wavelet to form an orthonormal basis for signal
analysis.
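
In the usual notation (the symbols are not from the slides), the dyadic scales and translates of a mother wavelet \psi, and the expansion of a finite-energy signal x over them, can be written as:

```latex
\psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), \qquad j,k \in \mathbb{Z}

x(t) = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} d_{j,k}\,\psi_{j,k}(t), \qquad d_{j,k} = \langle x, \psi_{j,k} \rangle
```

Here j indexes the scale (frequency resolution) and k the position in time, which is what lets each coefficient d_{j,k} describe the signal locally in both domains.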
EXAMPLE
The original signal S is split into an approximation cA1
and a detail cD1. The approximation is then itself split
into an approximation and a detail, and so on.
Decomposing a signal into k levels therefore results in
k+1 sets of coefficients at different frequency
resolutions: k levels of detail coefficients and 1 level
of approximation coefficients.
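
The repeated splitting described above can be sketched in a few lines. The slides do not name the wavelet, so this example assumes the Haar wavelet because it is the simplest orthonormal choice. `haar_step` and `wavedec` are local helper names chosen for illustration.

```python
import math


def haar_step(signal):
    """One DWT level with the Haar wavelet: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavedec(signal, levels):
    """Return [cA_k, cD_k, ..., cD_1]: k detail sets plus one approximation."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the approximation is split again at the next level.
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]
```

Calling `wavedec(signal, 3)` on an 8-sample signal returns four coefficient sets (one approximation and three details), matching the k+1 count given above.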
 Proposed model :
 Voice morphing is performed in two steps: training and
transformation. The training data consist of repetitions
of the same phonemes uttered by both source and
target speakers.
 The source and target training data is divided into
frames of 128 samples and the data is randomly
divided into training and validation sets.
 A 5-level wavelet decomposition is then performed on
the source and target training data.
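
The data preparation described above can be sketched as follows. The frame length (128) and the number of levels (5) come from the slides. The Haar wavelet, the 20% validation fraction, the fixed random seed, and the choice to drop a trailing partial frame are assumptions made only to keep the example concrete.

```python
import math
import random

FRAME_LEN = 128  # frame length given in the slides
LEVELS = 5       # decomposition depth given in the slides


def split_frames(samples, frame_len=FRAME_LEN):
    """Cut a signal into non-overlapping frames, dropping a partial tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def train_validation_split(frames, validation_fraction=0.2, seed=0):
    """Randomly divide frames into (training, validation) lists."""
    shuffled = frames[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def haar_wavedec(frame, levels=LEVELS):
    """Haar DWT (assumed wavelet): returns [cA_k, cD_k, ..., cD_1]."""
    s = 1.0 / math.sqrt(2.0)
    approx, details = list(frame), []
    for _ in range(levels):
        pairs = range(0, len(approx) - 1, 2)
        details.append([(approx[i] - approx[i + 1]) * s for i in pairs])
        approx = [(approx[i] + approx[i + 1]) * s for i in pairs]
    return [approx] + details[::-1]
```

With these settings, each 128-sample frame yields six coefficient sets of lengths 4, 4, 8, 16, 32 and 64. The same steps would be applied to the source and to the target recordings.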
TYPES OF VOICE MORPHING
 THIS SECTION SHOWS THE FORMS INTO WHICH A NORMAL
VOICE OR SPEECH CAN BE TRANSFORMED.

CONVERSION  SOURCE   TARGET   RESULT1  RESULT2
F TO M      SPEECH1  TARGET1  RESULT1  VOICE1
M TO F      SPEECH2  TARGET2  RESULT2  VOICE2
F TO F      SPEECH3  TARGET3  RESULT3  VOICE3
M TO M      SPEECH4  TARGET4  RESULT4  VOICE4

 The "SOURCE" column lists the utterances of the source
speaker.
 The "TARGET" column lists the target speaker's
utterances.
 The utterances in these two columns are NOT included
in the training data used to estimate the conversion
function.
 The next two columns give the conversion results.
 The difference between them is that "RESULT1" applies
the target prosody extracted from the target utterance,
whereas "RESULT2" keeps the original prosody of the
source utterance.
REFERENCES OR METHODS

 Abe M., Nakamura S., Shikano K. and Kuwabara H.:
Voice Conversion through Vector Quantization,
Proceedings of ICASSP, 1988.
 Stylianou Y., Cappé O. and Moulines E.:
Statistical Methods for Voice Quality Transformation,
Proceedings of Eurospeech, 1995.
 Arslan L. and Talkin D.: Voice Conversion by
Codebook Mapping of Line Spectral Frequencies and
Excitation Spectrum, Proceedings of Eurospeech, 1997.
APPLICATIONS OF VOICE MORPHING

 ENTERTAINMENT.
 THE FILM INDUSTRY.
 SECURITY.
 COMPUTER GAMING.
AVAILABLE SOFTWARE FOR VOICE MORPHING
 MORPH VOX PRO VOICE CHANGER 2.0.6.
 MORPH VOX PRO VOICE CHANGER 4.2.2.
 MORPH VOX PRO VOICE CHANGER 4.3.8.
 TERA VOICE SERVER 2004.
 FLASH VOICE BUTTONS 3.0.
 VOICE TWISTER 1.0.4.
 VOICE AGAIN 1.5.2.
 QUICK VOICE FOR OSX 2.2.0.
 QUICK VOICE FOR WINDOWS 2.2.0.
CONCLUSION

Voice morphing is a technology with many interesting,
useful and fun applications. Further research on the
subject, with or without the GTM (Generative
Topographic Mapping) model, is bound to follow and
should lead to morphed speech of excellent quality.
THANK YOU
