Abstract
This paper presents the design and performance of
a Malay grapheme-to-phoneme (G2P) tool for
generating the pronunciation dictionary of a Malay
automatic speech recognition (ASR) system. The G2P
tool is rule based. It makes it easy to add and
remove rules and to handle English words. The
G2P tool also contains morphological and syllable
tools, which it uses to determine the pronunciation of a
word. Our evaluation shows that, using the
pronunciation dictionary generated
automatically by our G2P tool, our Malay ASR
system achieves a word error rate (WER) of 16.5%, which is only 1.9%
higher than with a manually verified pronunciation
dictionary.
1. Introduction
A grapheme-to-phoneme (G2P) converter is a tool used to
generate the pronunciation of a given word. A
grapheme is the fundamental unit of written
language, and a phoneme is the smallest
linguistically distinctive unit of sound [1]. G2P is an
important component of many speech processing
systems. For example, in speech synthesis systems, the
pronunciation of unknown words, that is, words that are
not in the pronunciation dictionary, can be predicted by
applying G2P rules. In speech recognition systems, a
G2P tool can be used to generate the pronunciation
dictionary.
Malay, in its various forms, is widely used in
Malaysia, Indonesia, Singapore, and southern
Thailand. In this paper, we focus only on Malay as
it is used in Malaysia. Malay is written using either
the Latin alphabet (Rumi) or an adapted Arabic alphabet
(Jawi). The G2P tool described in this paper handles only
Rumi Malay.
This paper reports our effort to develop a Malay G2P
system for an ASR system. Section 2 provides an
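[Tables 1 and 2: IPA charts of the Malay vowels and consonants, flattened in extraction; some IPA symbols were lost. Consonant columns (place of articulation): Bi-lab., Lab.-dent., Dent., Alveo., Alveopalat., Palat., Vel., Glot. Consonant rows (mode of articulation): Plosive, Fricative, Affricate, Vibrant, Lateral, Nasal, Glide. Surviving consonant entries: p b, f v, m, w, t d, s z, t, r, l, n, k g, x, ?, h. Surviving vowel entries: i, e, u, o (label: Back).]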
3. Malay phonology
There are 36 phonemes in Malay [10]. Six of them
are vowels, three are diphthongs and 27 are
consonants. Table 1 and Table 2 show the IPA tables
for Malay vowels and consonants respectively. The
three Malay diphthongs are /aj/, /aw/ and /oj/. Figure 1
shows the Malay phoneme distribution in the text.
[Figure 1: Malay phoneme distribution in the text. y-axis: Percent (0 to 18); x-axis: Phoneme (a, b, d, dZ, e, f, g, h, i, j, J, k, l, m, n, N, o, oj/aj/aw, p, r, s, S, t, tS, u, w, z, ?, @).]
4. Malay morphology
Malay is an agglutinative language: it creates
new words by adding affixes to a root word. In addition,
further bound morphemes can be attached to the
affixed word, as shown in Figure 2 [11].
[Figure 2: Structure of a Malay word. Diagram labels: Proclitic, Affixed word (Prefix, Infix, Circumfix, Root, Suffix), Enclitic, Particle.]
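The layering of bound morphemes around a root can be illustrated with a minimal Python sketch. It is not the morphological tool described in this paper. The morpheme lists, the `roots` lexicon, the function name `segment`, and the fixed order (proclitic, prefix, root, suffix, enclitic, particle) are all assumptions made for illustration, and infixes and circumfixes are ignored.

```python
# Hypothetical, deliberately tiny morpheme lists (illustration only).
PROCLITICS = ("ku", "kau")
PREFIXES = ("di", "ber", "ter", "me", "pe", "se", "ke")
SUFFIXES = ("kan", "an", "i")
ENCLITICS = ("nya", "ku", "mu")
PARTICLES = ("lah", "kah", "pun")


def _strip_end(word, endings):
    """Yield (remainder, ending) pairs, starting with 'nothing stripped'."""
    yield word, ""
    for ending in endings:
        if word.endswith(ending) and len(word) > len(ending):
            yield word[: -len(ending)], ending


def _strip_start(word, starts):
    """Yield (remainder, start) pairs, starting with 'nothing stripped'."""
    yield word, ""
    for start in starts:
        if word.startswith(start) and len(word) > len(start):
            yield word[len(start):], start


def segment(word, roots):
    """Return the morphemes of `word`, or None if no known root is found.

    Assumed order: proclitic + prefix + root + suffix + enclitic + particle.
    Every slot except the root is optional.
    """
    for rest1, particle in _strip_end(word, PARTICLES):
        for rest2, enclitic in _strip_end(rest1, ENCLITICS):
            for rest3, suffix in _strip_end(rest2, SUFFIXES):
                for rest4, proclitic in _strip_start(rest3, PROCLITICS):
                    for root, prefix in _strip_start(rest4, PREFIXES):
                        if root in roots:
                            parts = [proclitic, prefix, root,
                                     suffix, enclitic, particle]
                            return [p for p in parts if p]
    return None


print(segment("diberikannya", {"beri"}))  # ['di', 'beri', 'kan', 'nya']
```

Accepting a segmentation only when the remainder is a known root keeps the sketch from stripping letters that merely look like affixes. A practical analyser would also need spelling-change rules, which are omitted here.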
[Figure: stepwise syllabification of diberikannya: diberikan.nya, diberi.kan.nya, dibe.ri.kan.nya, di.be.ri.kan.nya (CV.CV.CV.CVC.CV).]
Malay syllable structures are shown in Table 3.
Most of the words with two or more consonants that
form the coda of a syllable are borrowed from English.
For example, the Malay word struktur comes from the
English word structure.
Table 3. Malay syllable structures
Syllable   Word        Description
V          i.kan       V.CVC
CV         sa.tu       CV.CV
CVC        ban.tu      CVC.CV
CCV        dwibahasa   CCV.CV.CV.CV
CCVC       prak.tik    CCVC.CVC
CCCV       stra.tegi   CCCV.CVCV
CCCVC      struk.tur   CCCVC.CVC
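The Description column of Table 3 can be reproduced mechanically from a syllabified word. The following Python sketch is an illustration, not the syllable tool of this paper. The function name `cv_pattern` and the choice to treat the digraphs ng, ny, sy, kh and gh as single consonants are assumptions, and diphthongs are not given special treatment.

```python
# Digraphs counted as one consonant each (an assumption of this sketch).
DIGRAPHS = ("ng", "ny", "sy", "kh", "gh")
VOWELS = set("aeiou")


def cv_pattern(syllabified):
    """Map a dot-separated word such as 'ban.tu' to its CV pattern 'CVC.CV'."""
    patterns = []
    for syllable in syllabified.lower().split("."):
        pattern = []
        i = 0
        while i < len(syllable):
            if syllable[i:i + 2] in DIGRAPHS:
                pattern.append("C")   # two letters, one consonant
                i += 2
            elif syllable[i] in VOWELS:
                pattern.append("V")
                i += 1
            else:
                pattern.append("C")
                i += 1
        patterns.append("".join(pattern))
    return ".".join(patterns)


print(cv_pattern("struk.tur"))         # CCCVC.CVC
print(cv_pattern("di.be.ri.kan.nya"))  # CV.CV.CV.CVC.CV
```

Splitting on the syllable boundary before looking for digraphs means that a letter pair straddling a boundary is never merged into one consonant.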
Figure 3 shows the distribution of Malay words in
the texts in terms of the number of syllables. Most
Malay words are disyllabic: disyllabic words make up nearly
half of all the words in the text. They are followed by
words with three syllables.
[Figure 3: Distribution of Malay words by number of syllables. x-axis: Number of syllables (1 to >5); y-axis: Percent (%), ticks from 0 to 0.5.]
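[Table residue: grapheme and phoneme columns flattened in extraction; non-ASCII phoneme symbols were lost. Entries in extracted order:
p b t j l r d l r d
k q g s x h f v z sy sh kh gh c d
k k g s s h f v z
m n ng ny w y a e i o u ai au oi
m n w j a i o u aj aw oj]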
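A grapheme-to-phoneme lookup of this kind can be sketched as a greedy longest-match over the spelling. The Python below is a minimal illustration and not the rule set of the G2P tool. The function name `pronunciations` is hypothetical. The phoneme values for ng, ny, sy, kh, c and j are assumptions, written with the ASCII symbols that appear as axis labels in Figure 1 (N, J, S, tS, dZ, @). Diphthongs and context-dependent rules are left out, and grapheme e is expanded into both /e/ and /@/ variants.

```python
from itertools import product

# Two-letter graphemes, matched before single letters (assumed values).
MULTI = {"ng": ["N"], "ny": ["J"], "sy": ["S"], "kh": ["x"]}

# Single letters whose phoneme differs from the letter itself.
# 'e' is ambiguous, so both candidates are kept as variants.
SINGLE = {
    "c": ["tS"], "j": ["dZ"], "y": ["j"],
    "q": ["k"], "x": ["s"],
    "e": ["e", "@"],
}


def pronunciations(word):
    """Return every candidate pronunciation as a space-separated string."""
    word = word.lower()
    options = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in MULTI:
            options.append(MULTI[pair])   # longest match first
            i += 2
        else:
            letter = word[i]
            # Letters not listed map to themselves.
            options.append(SINGLE.get(letter, [letter]))
            i += 1
    return [" ".join(choice) for choice in product(*options)]


print(pronunciations("nyanyi"))  # ['J a J i']
print(pronunciations("beri"))    # ['b e r i', 'b @ r i']
```

Keeping both readings of e produces extra dictionary variants, which is the kind of ambiguity that a schwa rule or a manual check would then have to resolve.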
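[Table residue: mapping table with header "Target grapheme", flattened in extraction; non-ASCII phoneme symbols were lost. Entries in extracted order:
u i a a b e , e f g h i I (long)
a o e o aw aj b d d e ej f g h i i k l m n p r s u v w j z
k l m n ow oj p r s t t u u v w j z]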
8. Conclusion
The results show that the automatically generated
pronunciation dictionary performed only slightly worse
than the pronunciation dictionary that was created
semi-automatically. However, they also show that there is
still room for improvement. First, for the mapping of
grapheme e to phoneme /e/, one possible way to
reduce the mismatch is to force-align the grapheme
e to either /ə/ or /e/. This approach, however,
solves only part of the problem. The second improvement is
to identify the words to which the schwa rules should be applied
and the words to which they should not. As discussed earlier, one way
is to determine manually which words should follow
this rule. This will eliminate some unnecessary variants
from the dictionary. Third, we should verify the
English-to-Malay phoneme mapping to make sure that
it is applied correctly. We may even improve the
mapping by taking into account the context in which the
English phoneme occurs. A fourth possible improvement
is to determine from the original text whether a word found in
the English pronunciation dictionary is
really an English word or a Malay word.
8. References
7. Evaluation