
# Chapter 13


07/23/2010


Temporal Processing

13.1 Introduction
The back-propagation algorithm described in an earlier chapter has established itself as the most popular method for the design of neural networks. However, a major limitation of the standard back-propagation algorithm described there is that it can only learn an input-output mapping that is static. Consequently, the multilayer perceptron so trained has a static structure that maps an input vector $\mathbf{x}$ onto an output vector $\mathbf{y}$, as depicted in Fig. 13.1a. This form of static input-output mapping is well suited for pattern-recognition applications (e.g., optical character recognition), where both the input vector $\mathbf{x}$ and the output vector $\mathbf{y}$ represent spatial patterns that are independent of time.

The standard back-propagation algorithm may also be used to perform nonlinear prediction on a stationary time series. A time series is said to be stationary when its statistics do not change with time. In such a case we may also use a static multilayer perceptron, as depicted in Fig. 13.1b, where the input elements labeled $z^{-1}$ represent unit delays. The input vector $\mathbf{x}$ is now defined in terms of the past samples $x(n-1), x(n-2), \ldots, x(n-p)$ as follows:

$$\mathbf{x} = [x(n-1),\, x(n-2),\, \ldots,\, x(n-p)]^T \tag{13.1}$$

We refer to $p$ as the prediction order.
Thus the scalar output $y(n)$ of the multilayer perceptron produced in response to the input vector $\mathbf{x}$ equals the one-step prediction $\hat{x}(n)$, as shown by

$$y(n) = \hat{x}(n) \tag{13.2}$$

The actual value $x(n)$ of the input signal represents the desired response.

The important point to note from both Figs. 13.1a and 13.1b is that the multilayer perceptron represents a static model, all of whose free parameters have fixed values.

However, we know that time is important in many of the cognitive tasks encountered in practice, such as vision, speech, signal processing, and motor control. The question is how to represent time. In particular, how can we extend the design of a multilayer perceptron so that it assumes a time-varying form and therefore will be able to deal with time-varying signals? Indeed, how can we do a similar modification for other neural networks? The answer to these questions is to allow time to be represented by the effect it has on signal processing. This means providing the mapping network dynamic properties that make it responsive to time-varying signals.

In short, for a neural network to be dynamic, it must be given memory (Elman, 1990). One way in which this requirement can be accomplished is to introduce time delays into the synaptic structure of the network and to adjust their values during the learning phase. The use of time delays in neural networks is neurobiologically motivated, since it is well known that signal delays are omnipresent in the brain and play an important role in neurobiological information processing (Braitenberg, 1967, 1977, 1986; Miller, 1987).

FIGURE 13.1 Static multilayer perceptron used as (a) a pattern classifier and (b) a nonlinear predictor.
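As a rough illustration of Eqs. (13.1) and (13.2), the sketch below builds the training pairs that a static predictor such as the one in Fig. 13.1b would see: each input vector holds the p most recent past samples, and the desired response is the actual sample x(n). The function name `delay_embed` and the toy series are assumptions introduced here for illustration; they do not come from the text.

```python
def delay_embed(series, p):
    """Build (input vector, desired response) pairs for one-step prediction.

    For each n >= p the input is [x(n-1), x(n-2), ..., x(n-p)], as in
    Eq. (13.1), and the desired response is the actual sample x(n).
    """
    if p < 1 or p >= len(series):
        raise ValueError("prediction order p must satisfy 1 <= p < len(series)")
    inputs, targets = [], []
    for n in range(p, len(series)):
        # Most recent sample first, matching the ordering in Eq. (13.1).
        inputs.append([series[n - k] for k in range(1, p + 1)])
        targets.append(series[n])
    return inputs, targets


# Hypothetical toy series, used only to show the shape of the data.
series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
inputs, targets = delay_embed(series, p=3)
print(inputs[0], targets[0])  # [2.0, 1.0, 0.0] 3.0
```

Any static network trained on these pairs remains a fixed mapping; the only notion of time it has is the one built into the delayed inputs.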
In this chapter we focus on error-correction learning techniques that involve the use of time delays in one form or another. One such popular technique is the so-called time-delay neural network (TDNN), which was first described by Lang and Hinton (1988) and Waibel et al. (1989). The TDNN is a multilayer feedforward network whose hidden neurons and output neurons are replicated across time. It was devised to capture explicitly the concept of time symmetry as encountered in the recognition of an isolated word (phoneme) using a spectrogram. A spectrogram is a two-dimensional image in which the vertical dimension corresponds to frequency and the horizontal dimension corresponds to time; the intensity (darkness) of the image corresponds to signal energy (Rabiner and Schafer, 1978). Figure 13.2a illustrates a single hidden-layer version of the TDNN (Lang and Hinton, 1988).
The input layer consists of 192 (16 by 12) sensory nodes encoding the spectrogram; the hidden layer contains 10 copies of 8 hidden neurons; and the output layer contains 6 copies of 4 output neurons. The various replicas of a hidden neuron apply the same set of synaptic weights to narrow (three-time-step) windows of the spectrogram; similarly, the various replicas of an output neuron apply the same set of synaptic weights to narrow (five-time-step) windows of the pseudospectrogram computed by the hidden layer. Figure 13.2b presents a time-delay interpretation of the replicated neural network of Fig. 13.2a; hence the name “time-delay neural network.” This network has a total of 544 synaptic weights. Lang and Hinton (1988) used the TDNN for the recognition of four isolated words: “bee,” “dee,” “ee,” and “vee,” which accounts for the use of four output neurons in Fig. 13.2.

FIGURE 13.2 (a) A three-layer network whose hidden units and output units are replicated across time. (b) Time-delay neural network (TDNN) representation. (From K. J. Lang and G. E. Hinton, 1988.)
Labels in the figure: 16 input units; 8 hidden units, each connected to all the input units; 4 output units, each connected to all the hidden units; time delays of 1, 2, 3, 4, 5; time slices of spectrogram.
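The replica counts and the weight total quoted for the network of Fig. 13.2a can be checked with a little arithmetic. The sketch below assumes that windows slide one time step at a time with no padding, and that bias terms are not counted among the synaptic weights. Both are assumptions made here (chosen because they reproduce the figures in the text), and the helper name `tdnn_layer` is hypothetical.

```python
def tdnn_layer(n_in_steps, n_in_features, n_units, window):
    """Return (replicas, shared_weights) for one time-replicated layer.

    Assumes each replica sees `window` consecutive time steps, windows
    slide by one step without padding, and bias terms are not counted.
    """
    replicas = n_in_steps - window + 1
    # Every replica reuses the same weights, so they are counted once.
    shared_weights = n_units * n_in_features * window
    return replicas, shared_weights


# Input spectrogram: 16 frequency channels by 12 time steps.
hidden_replicas, hidden_weights = tdnn_layer(12, 16, n_units=8, window=3)
# The hidden layer produces 8 features per time step (the pseudospectrogram).
output_replicas, output_weights = tdnn_layer(hidden_replicas, 8, n_units=4, window=5)

print(hidden_replicas, output_replicas, hidden_weights + output_weights)  # 10 6 544
```

Under these assumptions the 12 input time steps yield 10 hidden replicas, the 10 hidden time steps yield 6 output replicas, and the shared weights sum to 384 + 160 = 544. The count is small precisely because replicas share weights rather than each having their own.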
A recognition score of 93 percent was obtained on test data different from the training data. In a more elaborate study reported by Waibel et al. (1989), a TDNN with two hidden layers was used for the recognition of three isolated words: “bee,” “dee,” and “gee.” In performance evaluation involving the use of test data from three speakers, the TDNN achieved an average recognition score of 98.5 percent. For comparison, various hidden Markov models (HMMs) were applied to the same task, for which a recognition score of only 93.7 percent was obtained. It appears that the power of the TDNN lies in its ability to develop shift-invariant internal representations of speech and to use them for making optimal classifications (Waibel et al., 1989).

The TDNN topology is in fact embodied in a multilayer perceptron in which each synapse is represented by a finite-duration impulse response (FIR) filter (Wan, 1993). This latter neural network is referred to as an FIR multilayer perceptron.
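To make the idea of an FIR synapse concrete, here is a minimal sketch, assuming the standard definition of an FIR filter: the synapse output at time n is a weighted sum of the current input sample and a finite number of past input samples. The names `fir_synapse` and `fir_neuron`, the zero initial conditions, and the example coefficients are assumptions for illustration and are not taken from Wan (1993).

```python
import math


def fir_synapse(weights, signal):
    """Filter `signal` with FIR coefficients: s(n) = sum_k weights[k] * x(n-k).

    Samples before the start of the signal are taken to be zero.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            if n - k >= 0:
                acc += w * signal[n - k]
        out.append(acc)
    return out


def fir_neuron(synapse_weights, inputs, activation=math.tanh):
    """Sum the FIR-filtered input signals at each time step, then squash.

    `synapse_weights[i]` holds the filter coefficients applied to `inputs[i]`.
    """
    filtered = [fir_synapse(w, x) for w, x in zip(synapse_weights, inputs)]
    return [activation(sum(step)) for step in zip(*filtered)]


# A unit impulse reveals the synapse's finite memory: the response is just
# the coefficient sequence, and it vanishes after len(weights) steps.
print(fir_synapse([0.5, 0.3, 0.2], [1, 0, 0, 0, 0]))  # [0.5, 0.3, 0.2, 0.0, 0.0]
```

With a single coefficient per synapse the neuron reduces to an ordinary static one, which is one way to see the FIR multilayer perceptron as a generalization of the static network.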
For its training, we may construct a static equivalent network by unfolding the FIR multilayer perceptron in time, and then use the standard back-propagation algorithm. A more efficient procedure, however, is to use a temporal back-propagation algorithm that invokes certain approximations to simplify the computation, and which was first described by Wan (1990a, b). The FIR multilayer perceptron is a feedforward network; it attains dynamic behavior