Efficient diphone database creation
for MBROLA,
a multilingual speech synthesiser
Jolanta Bachan
Institute of Linguistics
Adam Mickiewicz University
Poznań
OWD 2010
Wisła-Kopydło, Poland
Why MBROLA?
● useful for testing speech models in linguistic work
● easy manipulation of duration and pitch values
● easy to create new synthetic voices
● Recently used for:
– expressive speech
– dialogue synthesis
– voice quality
– under-resourced languages
– large speech corpora evaluation (ACCS)
Ph.D. thesis context
● to model different speech styles which will align
with the speaker in a consultation situation
● in a stress situation
● based on the phonetic and linguistic characteristics
of the speaker’s speech
● to design and build a speech synthesis
component and a style selection module for an
adaptive dialogue system
2010-10-24 Efficient diphone database creation for 3

MBROLA
Jolanta Bachan
Ph.D. thesis context
● Adaptive dialogue system
● to adapt its speech by selecting a speech style
appropriate for the speaker’s level of speech
arousal
● to improve human-computer interaction at
emergency unit control centres and the help desks
of call centres, by making the dialogue more
natural.

Objectives
● Minimisation of the material to be recorded and
annotated for synthetic voice creation
● Automation of the process of synthetic voice
creation

MBROLA voice creation
(Dutoit et al. 1996)
● Creating text corpus
– list of phones with allophones (PL)
– list of diphones (DL), |DL| = |PL|²
– list of words
– words in carrier sentences
● Recording corpus with monotonous intonation
● Segmenting corpus
– phone level
– automatically and/or manually
– extracting diphones
● Equalising corpus (mbrolation)
– energy levels normalisation
– pitch normalisation
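The relation |DL| = |PL|² can be illustrated with a minimal sketch that builds every ordered phone pair from a phone list. The toy inventory, the `_` pause symbol and the `left-right` label format are assumptions made for illustration only; they are not taken from the slides.

```python
from itertools import product


def diphone_list(phones):
    """Return every ordered pair of phones, so len(result) == len(phones) ** 2."""
    return [f"{left}-{right}" for left, right in product(phones, repeat=2)]


# Hypothetical toy inventory; "_" stands for a pause symbol (an assumption).
phones = ["_", "a", "t"]
diphones = diphone_list(phones)

print(len(diphones))  # 3 phones give 3 ** 2 = 9 diphones
print(diphones)
```

Not every pair in such a list necessarily occurs in the language, so the squared inventory is best read as an upper bound on the diphones a corpus has to cover.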
Mbrolation
The Mbrolator is a software suite for MBROLA
voice creation
● database file in the SEG format
– diphone filename
– diphone label
– diphone start & end
– diphone subsplitting
● restrictions put on the diphone files are:
– 16000 Hz sampling rate
– no longer than 10000 samples
– context of 800 samples on the left and the right sides
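The three restrictions can be checked before a diphone file is handed to the Mbrolator. The sketch below is one possible reading, assuming that the 800-sample context means at least 800 samples must lie before the diphone start and after the diphone end inside the file. The function name and its parameters are hypothetical and are not part of the Mbrolator itself.

```python
SAMPLE_RATE_HZ = 16000  # required sampling rate
MAX_SAMPLES = 10000     # maximum length of one diphone file
CONTEXT_SAMPLES = 800   # context assumed to be required on each side


def check_diphone_file(sample_rate, n_samples, start, end):
    """Return a list of violated restrictions (empty if the file is acceptable).

    start and end are the sample indices of the diphone inside the file.
    """
    problems = []
    if sample_rate != SAMPLE_RATE_HZ:
        problems.append(f"sampling rate {sample_rate} Hz, expected {SAMPLE_RATE_HZ} Hz")
    if n_samples > MAX_SAMPLES:
        problems.append(f"{n_samples} samples, limit is {MAX_SAMPLES}")
    if start < CONTEXT_SAMPLES:
        problems.append(f"left context {start} samples, need {CONTEXT_SAMPLES}")
    if n_samples - end < CONTEXT_SAMPLES:
        problems.append(f"right context {n_samples - end} samples, need {CONTEXT_SAMPLES}")
    return problems


print(check_diphone_file(16000, 4000, 800, 3200))
print(check_diphone_file(44100, 12000, 100, 11900))
```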

Phonetically rich sentence extractor
● to select the smallest possible set of sentences
from a text corpus which will contain the largest
number of diphones
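One common way to approach this goal is greedy selection: repeatedly pick the sentence that adds the most diphones not yet covered. The sketch below assumes this strategy for illustration; the slides do not say which algorithm the extractor actually uses, and the sentence identifiers and transcriptions are invented.

```python
def diphones_of(phones):
    """Set of adjacent phone pairs in one transcribed sentence."""
    return set(zip(phones, phones[1:]))


def select_sentences(corpus):
    """Greedily choose sentences until no sentence adds a new diphone.

    corpus maps a sentence id to its list of phone symbols.
    Returns (selected sentence ids, set of covered diphones).
    """
    remaining = {sid: diphones_of(phones) for sid, phones in corpus.items()}
    covered, selected = set(), []
    while True:
        best_id, best_gain = None, 0
        for sid, dset in remaining.items():
            gain = len(dset - covered)
            if gain > best_gain:
                best_id, best_gain = sid, gain
        if best_id is None:  # nothing left that adds coverage
            break
        selected.append(best_id)
        covered |= remaining.pop(best_id)
    return selected, covered


# Hypothetical transcriptions; "_" marks a pause (an assumption).
corpus = {
    "s1": ["_", "t", "a", "k", "_"],
    "s2": ["_", "t", "a", "k"],
    "s3": ["_", "k", "o", "t", "_"],
}
selected, covered = select_sentences(corpus)
print(selected, len(covered))
```

Greedy selection does not guarantee the smallest possible set, but it is simple and usually gets close; here `s2` is skipped because it adds nothing beyond `s1`.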

Available text resources
● 1623 sentences from the BOSS corpus
● 8828 sentences from the Jurisdict database
● 10451 sentences altogether
● transcription in
● Polish SAMPA = 37 phonemes
● Polish Extended-SAMPA (PE-SAMPA) = 40 phonemes

Sentence extraction procedure

Results
● SAMPA (38*38=1444 diphones)
● 1008 diphones in 211 sentences out of 10451
● PE-SAMPA (41*41=1681 diphones)
● 1095 diphones in 201 sentences out of 10451
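The coverage these figures imply can be worked out directly. The inventory sizes 38 and 41 used in the results are each one larger than the phoneme counts listed under the text resources (37 and 40); the extra symbol is presumably a pause marker, which is an assumption here rather than something the slides state.

```python
def coverage(found, inventory_size):
    """Return (number of theoretically possible diphones, fraction covered)."""
    possible = inventory_size ** 2
    return possible, found / possible


results = [("SAMPA", 38, 1008), ("PE-SAMPA", 41, 1095)]
for name, size, found in results:
    possible, ratio = coverage(found, size)
    print(f"{name}: {found}/{possible} = {ratio:.1%}")
```

This gives roughly 69.8% for SAMPA and 65.1% for PE-SAMPA. The denominators count every ordered symbol pair, including combinations that may never occur in the language, so full coverage is not necessarily attainable.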

Diphone extractor
● to automatically cut out diphones from the
recordings based on the annotations of those
recordings on the phone level
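A minimal sketch of such cutting, assuming the usual diphone convention that a unit runs from the middle of one phone to the middle of the next. The slides do not state where the extractor places its cut points, and the annotation tuples, sample positions and label format below are invented for illustration.

```python
def diphone_cuts(segments):
    """Derive diphone cut points from a phone-level annotation.

    segments is a list of (label, start_sample, end_sample) in time order.
    Returns a list of (diphone_label, cut_start, cut_end), where each cut
    runs from the midpoint of one phone to the midpoint of the next.
    """
    cuts = []
    for (l1, s1, e1), (l2, s2, e2) in zip(segments, segments[1:]):
        mid1 = (s1 + e1) // 2
        mid2 = (s2 + e2) // 2
        cuts.append((f"{l1}-{l2}", mid1, mid2))
    return cuts


# Hypothetical annotation in samples at 16 kHz; "_" marks a pause.
annotation = [
    ("_", 0, 1600),
    ("t", 1600, 2880),
    ("a", 2880, 4800),
    ("_", 4800, 6400),
]
for label, start, end in diphone_cuts(annotation):
    print(label, start, end)
```

A real extractor would additionally add context samples around each cut and keep only one instance per diphone label.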

Available material
● 1580 sentences from BOSS corpus
● recordings in professional recording studio
● recorded male voice in monotonous intonation
● annotated in Polish Extended-SAMPA
– automatic annotation
– manual correction

Diphone extractor architecture

Diphone extraction results
● SAMPA: 1039 diphones from 1580 sentences
● PE-SAMPA: 1058 diphones from 1580 sentences

Tools combination and evaluation
● 226 sentences recorded by a male speaker
● sentences annotated automatically
● 1002 extracted diphones
● MBROLA voice creation
● Total time: ca. 5 hours

Tools combination and evaluation
● original
● fully automatic
● manual correction (micro-voice)

Conclusions
● Phonetically rich sentence extractor and
diphone extractor seem to be indispensable in

Acknowledgements
● This work was partly funded by
● the research supervisor project grant to Prof. Grażyna
Demenko & the author No. N N104 119838
● the international cooperation scholarship funded by
Bielefeld University, Germany
● the scholarship for scientific achievements funded by the
Kulczyk Family Foundation
● The author is very grateful to Prof. Grażyna Demenko
for providing the text and speech corpora and to Prof.
Dafydd Gibbon for his invaluable advice on the system
design and implementation.

Thank you!
