Recognition Based on Artificial Neuron Networks (2)
Perceptron:
Recognition Based on Artificial Neuron Networks (3)
Let α > 0 denote a correction increment (also called the learning increment or the
learning rate),
let w(1) be a vector with arbitrary values, and
let wn+1(1) be an arbitrary constant.
Then, do the following for k = 2, 3,…: For a pattern vector, x(k), at step k,
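The correction rule itself did not survive extraction; a standard reconstruction consistent with these definitions (assuming the two classes are denoted c1 and c2) is:

```latex
\begin{aligned}
&\text{If } x(k) \in c_1 \text{ and } w^T(k)\,x(k) + w_{n+1}(k) \le 0:\\
&\qquad w(k+1) = w(k) + \alpha\, x(k), \qquad w_{n+1}(k+1) = w_{n+1}(k) + \alpha;\\
&\text{if } x(k) \in c_2 \text{ and } w^T(k)\,x(k) + w_{n+1}(k) \ge 0:\\
&\qquad w(k+1) = w(k) - \alpha\, x(k), \qquad w_{n+1}(k+1) = w_{n+1}(k) - \alpha;\\
&\text{otherwise: } w(k+1) = w(k), \qquad w_{n+1}(k+1) = w_{n+1}(k).
\end{aligned}
```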
Recognition Based on Artificial Neuron Networks (4)
Recognition Based on Artificial Neuron Networks (5)
The notation in previous equations can be simplified if we
add a 1 at the end of every pattern vector and include the
bias in the weight vector. That is, we define
x = [x1, x2, …, xn, 1]T
and
w = [w1, w2, …, wn, wn+1]T
Then,
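With these augmented vectors the decision function reduces to a single inner product (a reconstruction from the definitions above; the class labels c1, c2 follow the earlier discussion):

```latex
d(x) = w^T x, \qquad
\begin{cases}
d(x) > 0 & \Rightarrow\ x \in c_1,\\
d(x) < 0 & \Rightarrow\ x \in c_2.
\end{cases}
```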
Recognition Based on Artificial Neuron Networks (6)
The perceptron algorithm can be modified as: For any
pattern vector, x(k), at step k
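The modified rule can be sketched in code. This is a minimal, hypothetical example (the AND-gate data, zero initialization, and α = 1 are assumptions, not from the slides); encoding the labels as r = ±1 lets the two correction cases collapse into a single test r·wᵀx ≤ 0:

```python
import numpy as np

def train_perceptron(X, labels, alpha=1.0, max_epochs=100):
    """Perceptron training with augmented pattern vectors.

    X: (num_patterns, n) array; labels: +1 for class c1, -1 for class c2.
    A 1 is appended to each pattern so the bias is the last weight.
    """
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # augmented patterns
    w = np.zeros(Xa.shape[1])                      # arbitrary initial weights
    for _ in range(max_epochs):
        errors = 0
        for x, r in zip(Xa, labels):
            if r * (w @ x) <= 0:        # misclassified (or on the boundary)
                w += alpha * r * x      # correction step
                errors += 1
        if errors == 0:                 # separable case: stop when all correct
            break
    return w

# Two linearly separable classes in 2-D (illustrative AND-gate data)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
labels = np.array([-1, -1, -1, 1])
w = train_perceptron(X, labels)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

Because the data are separable, the convergence theorem guarantees the loop exits with all four patterns classified correctly.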
Recognition Based on Artificial Neuron Networks (7)
For nonseparable pattern classes:
Let r denote the response we want the perceptron to have for any pattern during training; r is either +1 or −1. We want to find the augmented weight vector, w, that minimizes the mean squared error (MSE) between the desired and actual responses of the perceptron. The perceptron algorithm for finding w is based on the least-mean-squared-error (LMSE) algorithm.
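The LMSE (Widrow–Hoff) update referred to here is, in its standard form (a reconstruction; the original equation was lost in extraction):

```latex
w(k+1) = w(k) + \alpha\,\bigl(r(k) - w^T(k)\,x(k)\bigr)\,x(k)
```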
Recognition Based on Artificial Neuron Networks (8)
Artificial Neuron:
Neural networks are interconnected perceptron-like computing elements called artificial neurons. These neurons perform the same computations as the perceptron, but with a different activation function.
Activation function:
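The activation-function plot did not survive extraction. A common choice in this setting is the sigmoid; a minimal sketch (assuming the sigmoid is the intended function) together with its derivative, which backpropagation needs later:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation h(z) = 1 / (1 + e^{-z}), mapping R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative h'(z) = h(z) (1 - h(z)), reused during backpropagation."""
    h = sigmoid(z)
    return h * (1.0 - h)
```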
Recognition Based on Artificial Neuron Networks (9)
Recognition Based on Artificial Neuron Networks (10)
Figure: forward pass through a fully connected feedforward neural network (NN with 4 layers).
Recognition Based on Artificial Neuron Networks (11)
The outputs of layer 1 are the components of the input vector x, where n = n1 is the dimensionality of x:
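The lost equation presumably states that the layer-1 activations are simply the input components:

```latex
a_i(1) = x_i, \qquad i = 1, 2, \ldots, n_1
```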
Recognition Based on Artificial Neuron Networks (12)
Example: (Example 12.10, pp. 948-949, [1])
Recognition Based on Artificial Neuron Networks (13)
The matrix W(l) contains all the weights in layer l; each row contains the weights for one of the nodes in layer l:
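The matrix itself was lost in extraction; from the description, W(l) has size nl × nl−1, with row i holding the weights of node i in layer l:

```latex
W(l) =
\begin{bmatrix}
w_{11}(l) & w_{12}(l) & \cdots & w_{1 n_{l-1}}(l)\\
w_{21}(l) & w_{22}(l) & \cdots & w_{2 n_{l-1}}(l)\\
\vdots    & \vdots    &        & \vdots\\
w_{n_l 1}(l) & w_{n_l 2}(l) & \cdots & w_{n_l n_{l-1}}(l)
\end{bmatrix}
```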
Recognition Based on Artificial Neuron Networks (14)
Then, we can obtain all the sum-of-products computations, zi(l), for layer l
simultaneously:
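In matrix-vector form (a reconstruction consistent with the definitions of W(l) and the bias vector b(l) of layer l):

```latex
z(l) = W(l)\,a(l-1) + b(l), \qquad l = 2, 3, \ldots, L
```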
Recognition Based on Artificial Neuron Networks (15)
Because the activation function is applied to each net input independently of the
others, the outputs of the network at any layer can be expressed in vector form
as:
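The vector form being referred to is (a reconstruction; h is applied elementwise):

```latex
a(l) = h\bigl(z(l)\bigr) = h\bigl(W(l)\,a(l-1) + b(l)\bigr)
```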
Recognition Based on Artificial Neuron Networks (16)
Example: (Example 12.11, p. 951, [1])
Recognition Based on Artificial Neuron Networks (17)
For multiple pattern vectors, begin by arranging all input pattern vectors as columns of a single matrix, X, of dimension n × np, where n is the dimensionality of the pattern vectors and np is the number of pattern vectors. It follows that each column of matrix A(1) contains the initial activation values (i.e., the pattern vector values) for one pattern. Then the input matrix for all neurons and all pattern vectors at layer l is
where W(l) is given as before and B(l) is an nl × np matrix whose columns are duplicates of b(l), the bias vector containing the biases of the neurons in layer l.
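This matrix form can be sketched directly in NumPy. The 2-3-2 architecture and random parameters below are illustrative assumptions, and broadcasting the (nl × 1) bias vector across the columns of the activation matrix plays the role of B(l):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_all(X, weights, biases):
    """Forward pass for all patterns at once.

    X: (n, num_patterns) matrix whose columns are the patterns (A(1) = X).
    weights[l]: weight matrix W(l); biases[l]: (nl, 1) bias vector b(l).
    Broadcasting adds b(l) to every column, i.e., it supplies B(l).
    """
    A = X
    for W, b in zip(weights, biases):
        Z = W @ A + b          # Z(l) = W(l) A(l-1) + B(l)
        A = sigmoid(Z)         # A(l) = h(Z(l)), applied elementwise
    return A

# Illustrative 2-3-2 network with arbitrary fixed parameters
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((2, 3))]
biases = [rng.standard_normal((3, 1)), rng.standard_normal((2, 1))]
X = rng.standard_normal((2, 5))        # 5 pattern vectors as columns
A_L = forward_all(X, weights, biases)  # (2, 5): one output column per pattern
```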
Recognition Based on Artificial Neuron Networks (18)
Recognition Based on Artificial Neuron Networks (19)
The equations above are used to classify each of a set of patterns into one of nL
pattern classes. Each column of output matrix A(L) contains the activation values
of the nL output neurons for a specific pattern vector. The class membership of
that pattern is given by the location of the output neuron with the highest
activation value.
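As a tiny illustration (the output matrix below is made up), the class of each pattern is the row index of the largest activation in its column:

```python
import numpy as np

# Hypothetical 3x4 output matrix A(L): 3 classes, 4 patterns
A_L = np.array([[0.90, 0.10, 0.20, 0.30],
                [0.05, 0.80, 0.30, 0.30],
                [0.05, 0.10, 0.50, 0.40]])
classes = np.argmax(A_L, axis=0)   # most active output neuron per column
```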
It is assumed in this section that we know the weights and biases of the network.
These are obtained during training using backpropagation.
Recognition Based on Artificial Neuron Networks (20)
Recognition Based on Artificial Neuron Networks (21)
These steps (epochs) are repeated until the error reaches an acceptable level.
Recognition Based on Artificial Neuron Networks (22)
Equations of backpropagation
Given a set of training patterns and a multilayer feedforward neural network
architecture, the approach is to find the network parameters that minimize an
error (also called cost or objective) function.
We define the error function for a neural network as the average of the squared differences between desired and actual responses. Let r denote the desired response for a given pattern vector, x, and let a(L) denote the actual response of the network to that input.
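A standard way to write this error for a single pattern is (a reconstruction; the factor 1/2 is a common convention that simplifies the derivatives):

```latex
E = \frac{1}{2}\sum_{j=1}^{n_L}\bigl(r_j - a_j(L)\bigr)^2
  = \frac{1}{2}\,\lVert r - a(L)\rVert^2
```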
Recognition Based on Artificial Neuron Networks (23)
The activation value of neuron j in the output layer is aj(L). The error of that neuron is defined as
The key objective is to find a scheme for adjusting all weights in the network using the training patterns. To do this, we need to know how E changes with respect to the weights and biases in the network, in terms of quantities that can be computed.
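The per-neuron error referred to above is presumably (consistent with the total error being the sum over output neurons):

```latex
E_j = \frac{1}{2}\bigl(r_j - a_j(L)\bigr)^2, \qquad E = \sum_{j=1}^{n_L} E_j
```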
Recognition Based on Artificial Neuron Networks (25)
Finally, the network parameters are updated using gradient descent:
and
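The two update equations lost in extraction are, in their standard form (with the delta of neuron i in layer l defined as δi(l) = ∂E/∂zi(l), and learning rate α):

```latex
w_{ij}(l) \leftarrow w_{ij}(l) - \alpha\,\frac{\partial E}{\partial w_{ij}(l)}
         = w_{ij}(l) - \alpha\,\delta_i(l)\,a_j(l-1),
\qquad
b_i(l) \leftarrow b_i(l) - \alpha\,\delta_i(l)
```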
Recognition Based on Artificial Neuron Networks (26)
Recognition Based on Artificial Neuron Networks (27)
Then, rewriting
This nL × 1 column vector δ(L) contains the error (delta) values of all the output neurons for one pattern vector. To account for all np patterns simultaneously, we form a matrix D(L) whose columns are the δ(L) vectors, one per pattern.
Each column of A(L) is the network output for one pattern. Similarly, each column of R is a binary vector with a 1 in the location corresponding to the class of a particular pattern vector, and 0's elsewhere. All matrices are of size nL × np.
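The matrix equations this passage describes are, in standard form (⊙ denotes elementwise multiplication; a reconstruction consistent with the definitions above):

```latex
D(L) = \bigl(A(L) - R\bigr) \odot h'\bigl(Z(L)\bigr),
\qquad
D(l) = \Bigl(W(l+1)^{T} D(l+1)\Bigr) \odot h'\bigl(Z(l)\bigr),
\quad l = L-1, \ldots, 2
```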
Recognition Based on Artificial Neuron Networks (28)
Finally, the equations for updating the network parameters (weights and biases) at layer l:
and
where δk(l) is the kth column of matrix D(l). The matrix B(l), of size nl × np, is formed by concatenating b(l) np times in the horizontal direction:
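These matrix updates can be sketched end to end. Everything below (the 2-3-1 network, OR-gate data, sigmoid activation, α = 2, and 3000 steps) is an illustrative assumption, not the book's example; one call performs a forward pass, backpropagates the deltas D(l), and applies the matrix-form updates averaged over the np patterns:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, R, weights, biases, alpha):
    """One full-batch gradient-descent step in matrix form.

    X: (n, n_p) input patterns as columns; R: (nL, n_p) desired responses.
    Modifies weights/biases in place and returns the pre-update MSE.
    """
    n_p = X.shape[1]
    # Forward pass, caching the activations A(l) of every layer (A(1) = X)
    As = [X]
    for W, b in zip(weights, biases):
        As.append(sigmoid(W @ As[-1] + b))   # broadcasting supplies B(l)
    # Output deltas: D(L) = (A(L) - R) * h'(Z(L)), with h' = A(L)(1 - A(L))
    D = (As[-1] - R) * As[-1] * (1.0 - As[-1])
    for l in range(len(weights) - 1, -1, -1):
        grad_W = (D @ As[l].T) / n_p         # averages D(l) A(l-1)^T over patterns
        grad_b = D.sum(axis=1, keepdims=True) / n_p
        if l > 0:                            # back-propagate with pre-update weights
            D = (weights[l].T @ D) * As[l] * (1.0 - As[l])
        weights[l] -= alpha * grad_W
        biases[l] -= alpha * grad_b
    return float(np.mean((As[-1] - R) ** 2))

# Tiny 2-3-1 network learning the OR gate (illustrative data)
rng = np.random.default_rng(1)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
biases = [np.zeros((3, 1)), np.zeros((1, 1))]
X = np.array([[0., 0., 1., 1.],
              [0., 1., 0., 1.]])
R = np.array([[0., 1., 1., 1.]])
errors = [train_step(X, R, weights, biases, alpha=2.0) for _ in range(3000)]
```

Note that the deltas for layer l−1 are computed before W(l) is overwritten; updating first and then propagating would use the wrong weights.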
Recognition Based on Artificial Neuron Networks (29)
During training, steps 1–4 are repeated for a specified number of epochs, or until a predefined measure of error is deemed small enough.
Recognition Based on Artificial Neuron Networks (30)