
It should be possible to use eSpeak as a front-end to mbrola.

eSpeak does the spelling-to-phoneme translation and intonation, and mbrola
produces the speech. I've tried this with the mbrola "en1" voice and
it sounds good. The speech quality is better than eSpeak's own, and the
intonation is better than festival etc. Response times are probably slower
than standard eSpeak.

I have now added some mbrola support to eSpeak, version 1.21.05 or
later, from http://espeak.sf.net/test/latest.html. This now includes a
command-line version of the espeak.exe program (which doesn't yet
include a sound-interface).

If you run the command:

espeak -v en --mbrola=en1 "some text"

then it produces mbrola .pho data to stdout.

On Linux, you can simply pipe that into mbrola to write a sound file, e.g.:

espeak -v en --mbrola=en1 -f textfile | mbrola /usr/share/mbrola/en1 - test.wav

Or, to play the sound immediately, pipe the output from mbrola to a
sound player, e.g.:

espeak -v en --mbrola=en1 -f textfile | mbrola /usr/share/mbrola/en1 - - | aplay -r16000 -fS16

eSpeak needs data to translate from its own phoneme names to the mbrola
phoneme names. So far, I've only set up data for mbrola voices: en1,
us1, us2, us3, ro1.
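
As a rough sketch, the pipeline and the voice list above could be wrapped in a couple of shell functions. The function names, the MBROLA_DIR variable and the error message are made up for illustration and are not part of eSpeak or mbrola; the script assumes the voice files live under /usr/share/mbrola as in the commands above, and that the caller picks a matching eSpeak voice (e.g. "en" for en1).

```shell
#!/bin/sh
# Illustrative wrapper only: names below are assumptions, not eSpeak/mbrola features.

# Directory holding the mbrola voice files (assumed, as in the examples above).
MBROLA_DIR="${MBROLA_DIR:-/usr/share/mbrola}"

# Succeeds only for the mbrola voices that currently have
# eSpeak-to-mbrola phoneme translation data.
mbrola_voice_supported() {
    case "$1" in
        en1|us1|us2|us3|ro1) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: speak_to_wav ESPEAK_VOICE MBROLA_VOICE TEXTFILE OUT_WAV
# Runs the same espeak | mbrola pipeline shown earlier, after checking the voice.
speak_to_wav() {
    if ! mbrola_voice_supported "$2"; then
        echo "no eSpeak phoneme data for mbrola voice: $2" >&2
        return 1
    fi
    espeak -v "$1" --mbrola="$2" -f "$3" | mbrola "$MBROLA_DIR/$2" - "$4"
}
```

For example, `speak_to_wav en en1 textfile test.wav` should do the same as the first pipeline above, while an unlisted voice is rejected before espeak is run.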

I don't know how to use mbrola from a Windows SAPI5 speech synthesizer,
but it looks like the information is available. I don't have time to
work on it just now though.