Speech Recognition System and Method Using HMM
[FIG. 3: block diagram showing a telephony network and a state table; FIG. 4]
[FIG. 5: table listing entries N(1) through N(10) against columns M1, M2, and M3]
[FIG. 6A: table of questions and responses, including SMALL and MEDIUM; FIG. 6B]
SPEECH RECOGNITION SYSTEM AND METHOD USING A HIDDEN MARKOV MODEL ADAPTED TO RECOGNIZE A NUMBER OF WORDS AND TRAINED TO RECOGNIZE A GREATER NUMBER OF PHONETICALLY DISSIMILAR WORDS

FIELD OF INVENTION
The present invention relates to a speech recognition system, and more particularly to a speech recognition system which uses a Hidden Markov Model (HMM) for the recognition of discrete words.

BACKGROUND OF THE INVENTION

In recent years it has become very popular to use Hidden Markov Models (HMMs) to perform speech recognition. An HMM has a number of states, which it transfers between in a statistically determined fashion. FIG. 1 gives an example of a simple HMM having N states (S(1), S(2), ... S(N)).
The probability of transferring from one particular state to another particular state can be specified in an N by N matrix (generally denominated A), in which element a_ij represents the probability of transferring from state i to state j. FIG. 1 illustrates how the values of a_ij are associated with transitions between states (note that FIG. 1 only shows a subset of the possible transitions). For the HMMs normally used in speech recognition, a_ij = 0 for j < i, and it may be additionally specified that the model is initially in state S(1). In terms of FIG. 1, this implies that the model starts on the far left, and state transitions only proceed from left to right; there are no state transitions which operate in the reverse direction. It is common, however, for models to permit self-transitions, i.e., a transition which starts and ends at the same node. Thus a_ii may be non-zero. An example of a self-transition is shown for S(2) in FIG. 1.
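By way of illustration (this example is not part of the patent text), the short Python sketch below builds a transition matrix of exactly this left-to-right form; the number of states and the probability values are arbitrary assumptions chosen only to exhibit the structure, namely a_ij = 0 for j < i, with non-zero a_ii permitting self-transitions.

import numpy as np

def left_to_right_transitions(n_states, stay_prob=0.5):
    # Build an n_states x n_states left-to-right transition matrix.
    # Each non-final state keeps probability stay_prob on its
    # self-transition (a_ii) and passes the rest to the next state;
    # the final state is absorbing. Entries with j < i remain zero,
    # so the model can never move from right to left.
    a = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        a[i, i] = stay_prob
        a[i, i + 1] = 1.0 - stay_prob
    a[-1, -1] = 1.0
    return a

A = left_to_right_transitions(4)
assert np.allclose(A.sum(axis=1), 1.0)   # each row is a probability distribution
assert np.allclose(A, np.triu(A))        # a_ij = 0 for j < i (upper triangular)
print(A)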
In a "hidden' Markov model, the state transitions cannot
Model 1:
be directly observed. Instead the model produces a series of 2
outputs or observations, (OO ... O ...). Like the state 0 - - 0
transitions, the outputs are also produced in a statistically
determined fashion, and follow a probability distribution A=
0 2- 4- 4- -
-
B(O), where B generally depends only on the current state 3
1, 2
-
of the model. The output probability distribution for state 0 0 3 3
S(i) can therefore be written as Bs (O), with B now being 45 O. O. O. 1
used to represent the set of Bs (O) for all states. Note that
B may represent either a discrete probability distribution, in Model 2:
which case there is a fixed set of possible outputs, or O
3
0
alternatively B may represent a continuous probability dis 3
tribution. 1 1 --
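To make the Model 1 figures concrete, the following Python sketch (again, not part of the patent text) encodes the A matrix given above for the four-state HMM of FIG. 2 and scores a silence-bounded output sequence with the standard forward algorithm. Since Model 1's B values are not reproduced in this excerpt, the X/Y output probabilities below are illustrative assumptions only.

import numpy as np

# Transition matrix A of Model 1 (as given above); state 4 is absorbing.
A = np.array([
    [0.0, 0.5, 0.5,  0.0],
    [0.0, 0.5, 0.25, 0.25],
    [0.0, 0.0, 1/3,  2/3],
    [0.0, 0.0, 0.0,  1.0],
])

# Output distributions over (X, Y, Z). States 1 and 4 emit only Z
# (silence); states 2 and 3 emit X or Y. The X/Y splits are assumed
# values, not taken from the patent.
B = np.array([
    [0.0, 0.0, 1.0],   # state 1: always Z
    [0.7, 0.3, 0.0],   # state 2: X or Y (assumed split)
    [0.2, 0.8, 0.0],   # state 3: X or Y (assumed split)
    [0.0, 0.0, 1.0],   # state 4: always Z
])

OUTPUTS = {"X": 0, "Y": 1, "Z": 2}

def forward_probability(obs, A, B, start_state=0):
    # Standard forward algorithm: probability that the model, started
    # deterministically in start_state, emits the sequence obs.
    alpha = np.zeros(A.shape[0])
    alpha[start_state] = B[start_state, OUTPUTS[obs[0]]]
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, OUTPUTS[symbol]]
    return alpha.sum()

# Probability of the silence-bounded sequence Z X Y Z under Model 1.
print(forward_probability(["Z", "X", "Y", "Z"], A, B))

A recogniser built on such models compares the likelihoods that different word models assign to the same observation sequence, and the highest-scoring model identifies the word.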