
NEURAL NETWORKS

(ELEC 5240 and ELEC 6240)


Single Neuron Training

Bogdan M. Wilamowski

Area with 4 partitions: input–output mapping

Assuming a linear activation function for the output neuron, the result is a relatively complex nonlinear mapping.

Question:

How to design a system for an arbitrary nonlinear mapping?
How to design?

What is given?
(a) A mathematical function
    No need for design – just use a microcomputer for the calculations.
(b) Experimental data
    - Find an analytical function describing the process ???
    - Use the data to train neural networks.

Hamming code example

Let us consider binary (bipolar) signals and weights such as

x = [+1 -1 -1 +1 -1 +1 +1 -1]

If the weights equal the pattern, w = x:

w = [+1 -1 -1 +1 -1 +1 +1 -1]

then

net = Σ_{i=1}^{n} w_i·x_i = 8

This is the maximum value net can have; for any other combination net would be smaller.
Hamming code example

For the same pattern

x = [+1 -1 -1 +1 -1 +1 +1 -1]

and slightly different weights

w = [+1 +1 -1 +1 -1 -1 +1 -1]

net = Σ_{i=1}^{n} w_i·x_i = 4

In general,

net = Σ_{i=1}^{n} w_i·x_i = n - 2·HD,     where HD is the Hamming distance between x and w.
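A minimal MATLAB check of the relation net = n - 2·HD for bipolar patterns (a sketch added here for illustration, not part of the original slides):

% check net = n - 2*HD for bipolar patterns
x = [ 1 -1 -1  1 -1  1  1 -1];      % stored pattern
w = [ 1  1 -1  1 -1 -1  1 -1];      % slightly different weights
n = length(x);
net = w*x'                          % weighted sum  (= 4)
HD  = sum(x ~= w)                   % Hamming distance  (= 2)
n - 2*HD                            % equals net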

Unsupervised learning rules for single neuron

Δw_i = c·x_i          where c is the learning constant

Hebb rule:   Δw_i = c·o·x_i

Pattern normalization required.
Supervised learning rules for single neuron

General form:  Δw_i = c·δ·x_i

correlation rule (supervised):   δ = d
perceptron fixed rule:           δ = d - o
perceptron adjustable rule – as above, but the learning constant is modified to:
                                 α* = α·(x^T w)/(x^T x) = α·net/||x||²
LMS (Widrow-Hoff) rule:          δ = d - net
delta rule:                      δ = (d - o)·f'
pseudoinverse rule (the same as LMS):   w = (x^T x)^(-1) x^T d
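To make the rules concrete, the following MATLAB sketch (not part of the original slides) evaluates δ for each rule on the first pattern of the later example; the variable names are illustrative:

% one update step dw = c*delta*x computed with different rules
x = [1 2 1];  d = -1;               % augmented pattern and desired output
w = [1 3 -3];  c = 0.3;  k = 1;     % weights, learning constant, gain
net = w*x';
o   = tanh(0.5*k*net);              % soft bipolar activation
fprime = 0.5*k*(1 - o^2);           % its derivative
delta_correlation = d;              % correlation rule
delta_perceptron  = d - sign(net);  % perceptron fixed rule
delta_LMS         = d - net;        % LMS (Widrow-Hoff) rule
delta_delta       = (d - o)*fprime; % delta rule
dw = c*delta_perceptron*x           % e.g. perceptron weight change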

Training Neurons

Perceptron learning rule:

Δw_i = α·δ·x_i,      δ = d - o

Δw_i = α·x_i·(d - sign(net))

Assuming bipolar neurons (output = ±1):

Δw_i = ±2·α·x_i
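The same update can be written as a one-line MATLAB helper (the function name perceptron_step is only illustrative):

function w = perceptron_step(w, x, d, alpha)
% One perceptron update: w <- w + alpha*(d - sign(w*x'))*x
% For bipolar targets d = +/-1 the change is either 0 or +/-2*alpha*x.
w = w + alpha*(d - sign(w*x'))*x;
end

For example, perceptron_step([1 3 -3], [1 2 1], -1, 0.3) returns [0.4 1.8 -3.6], which matches the hand calculation on the following slides.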
Simple example of training one neuron

Two patterns in the (x, y) input plane:
    (1, 2)  =>  -1
    (2, 1)  =>  +1

[Figure: the two patterns and the initial decision line for weights (1, 3, -3); the initial setting gives a wrong answer – both patterns are classified into the same (+1) category.]

Simple example of training one neuron

Weights: 1  3  -3

            inputs (with bias)    desired output
Pattern 1:     1   2   +1              -1
Pattern 2:     2   1   +1              +1

net = Σ_{i=1}^{n} w_i·x_i

Actual output:
for pattern 1:  net = 1·1 + 3·2 - 3·1 = 4   =>   +1
for pattern 2:  net = 1·2 + 3·1 - 3·1 = 2   =>   +1
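Both responses can be verified directly in MATLAB (a trivial check, not part of the original slides):

% initial responses of the neuron to the two augmented patterns
w  = [1 3 -3];
x1 = [1 2 1];   x2 = [2 1 1];
net1 = w*x1'    % = 4  ->  output +1
net2 = w*x2'    % = 2  ->  output +1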
Simple example of training one neuron

assuming learning constant α = 0.3

weights:   w = [1  3  -3]
pattern 1: x = [1  2  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 1·1 + 2·3 + 1·(-3) = 4   =>   o = +1
Δw = 0.3·x·(-1 - 1) = -0.6·x
Δw = [-0.6  -1.2  -0.6]
w  = [0.4  1.8  -3.6]

Simple example of training one neuron

After applying the first pattern the first time:

w = [0.4  1.8  -3.6]

[Figure: decision line for w = (0.4, 1.8, -3.6); intercepts x0 = 3.6/0.4 = 9, y0 = 3.6/1.8 = 2.]
Simple example of training one neuron

Applying the second pattern the first time:

weights:   w = [0.4  1.8  -3.6]
pattern 2: x = [2  1  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 2·0.4 + 1·1.8 + 1·(-3.6) = -1   =>   o = -1
Δw = 0.3·x·(1 - (-1)) = 0.6·x
Δw = [1.2  0.6  0.6]
w  = [1.6  2.4  -3.0]

Simple example of training one neuron

After applying the second pattern the first time:

w = [1.6  2.4  -3]

[Figure: decision line for w = (1.6, 2.4, -3); intercepts x0 = 3/1.6 ≈ 1.87, y0 = 3/2.4 = 1.25.]
Simple example of training one neuron

Applying the first pattern the second time:

weights:   w = [1.6  2.4  -3]
pattern 1: x = [1  2  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 1·1.6 + 2·2.4 + 1·(-3) = 3.4   =>   o = +1
Δw = 0.3·x·(-1 - 1) = -0.6·x
Δw = [-0.6  -1.2  -0.6]
w  = [1  1.2  -3.6]

Simple example of training one neuron

After applying the first pattern the second time:

w = [1  1.2  -3.6]

[Figure: decision line for w = (1, 1.2, -3.6); intercepts x0 = 3.6/1 = 3.6, y0 = 3.6/1.2 = 3.]
Simple example of training one neuron

Applying the second pattern the second time:

weights:   w = [1  1.2  -3.6]
pattern 2: x = [2  1  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 2·1 + 1·1.2 + 1·(-3.6) = -0.4   =>   o = -1
Δw = 0.3·x·(1 - (-1)) = 0.6·x
Δw = [1.2  0.6  0.6]
w  = [2.2  1.8  -3.0]

Simple example of training one neuron

After applying the second pattern the second time:

w = [2.2  1.8  -3.0]

[Figure: decision line for w = (2.2, 1.8, -3.0); intercepts x0 = 3/2.2 ≈ 1.36, y0 = 3/1.8 ≈ 1.67.]
Simple example of training one neuron

Applying the first pattern the third time:

weights:   w = [2.2  1.8  -3.0]
pattern 1: x = [1  2  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 1·2.2 + 2·1.8 + 1·(-3) = 2.8   =>   o = +1
Δw = 0.3·x·(-1 - 1) = -0.6·x
Δw = [-0.6  -1.2  -0.6]
w  = [1.6  0.6  -3.6]

Simple example of training one neuron

After applying the first pattern the third time:

w = [1.6  0.6  -3.6]

[Figure: decision line for w = (1.6, 0.6, -3.6); intercepts x0 = 3.6/1.6 = 2.25, y0 = 3.6/0.6 = 6.]
Simple example of training one neuron

Applying the second pattern the third time:

weights:   w = [1.6  0.6  -3.6]
pattern 2: x = [2  1  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 2·1.6 + 1·0.6 + 1·(-3.6) = 0.2   =>   o = +1
Δw = 0.3·x·(1 - 1) = 0·x
Δw = [0  0  0]
w  = [1.6  0.6  -3.6]    (unchanged)

Simple example of training one neuron

After applying the second pattern the third time:

w = [1.6  0.6  -3.6]

[Figure: decision line unchanged; intercepts x0 = 3.6/1.6 = 2.25, y0 = 3.6/0.6 = 6.]
Simple example of training one neuron

Applying the first pattern the 4th time:

weights:   w = [1.6  0.6  -3.6]
pattern 1: x = [1  2  1]

net = Σ_{i=1}^{n} w_i·x_i          Δw = α·x·(d - o)

net = 1·1.6 + 2·0.6 + 1·(-3.6) = -0.8   =>   o = -1
Δw = 0.3·x·(-1 - (-1)) = 0·x
Δw = [0  0  0]
w  = [1.6  0.6  -3.6]    (unchanged – both patterns are now classified correctly)
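The entire hand calculation above can be reproduced with a few lines of MATLAB (a minimal sketch using the same patterns, targets, initial weights, and α = 0.3; it prints the same weight sequence as the slides):

% reproduce the step-by-step perceptron training
ip = [1 2 1; 2 1 1];            % augmented patterns (bias input = 1)
dp = [-1; 1];                   % desired outputs
w  = [1 3 -3];  alpha = 0.3;    % initial weights and learning constant
for ite = 1:4
  for p = 1:2
    net = w*ip(p,:)';
    w   = w + alpha*(dp(p) - sign(net))*ip(p,:);
    fprintf('ite %d, pattern %d: w = [%g %g %g]\n', ite, p, w);
  end
end
% final weights: [1.6  0.6  -3.6]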

Supervised learning rules for single neuron

General form:  Δw_i = c·δ·x_i

correlation rule (supervised):   δ = d
perceptron fixed rule:           δ = d - o
perceptron adjustable rule – as above, but the learning constant is modified to:
                                 α* = α·(x^T w)/(x^T x) = α·net/||x||²
LMS (Widrow-Hoff) rule:          δ = d - net
delta rule:                      δ = (d - o)·f'
pseudoinverse rule (the same as LMS):   w = (x^T x)^(-1) x^T d
Training one neuron using the perceptron rule

            inputs (with bias)    desired output
Pattern 1:     1   2   +1              -1
Pattern 2:     2   1   +1              +1

Initial weights: 1  3  -3
learning constant α = 0.3

Δw_i = α·x_i·(d - sign(net))  =  ±2·α·x_i

Training one neuron using the perceptron rule

Final weights:  w = [1.6  0.6  -3.6]

[Figure: final decision line in the input plane; intercepts x0 = 3.6/1.6 = 2.25, y0 = 3.6/0.6 = 6.]
Soft activation functions

Hard activation functions (for comparison):

unipolar:  o = f(net) = (sign(net) + 1)/2 = { 1 if net > 0,  0.5 if net = 0,  0 if net < 0 }
bipolar:   o = f(net) = sgn(net) = { +1 if net > 0,  0 if net = 0,  -1 if net < 0 }

Soft activation functions:

unipolar:  o = f(net) = 1/(1 + exp(-λ·net))                           f' = λ·(1 - o)·o
bipolar:   o = f(net) = tanh(0.5·λ·net) = 2/(1 + exp(-λ·net)) - 1     f' = 0.5·λ·(1 - o²)
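A small MATLAB sketch of the two soft activation functions and their derivatives (λ is written as k, as in the programs on the following slides; added here for illustration):

% soft activation functions and their derivatives
k   = 1;                          % gain (lambda)
net = -5:0.1:5;
o_uni  = 1./(1 + exp(-k*net));    % unipolar sigmoid
fp_uni = k.*(1 - o_uni).*o_uni;   % its derivative
o_bip  = tanh(0.5*k*net);         % bipolar sigmoid
fp_bip = 0.5*k.*(1 - o_bip.^2);   % its derivative
plot(net, o_uni, net, o_bip); xlabel('net'); ylabel('o'); legend('unipolar','bipolar');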

Program in MATLAB (with graphics)

%single neuron perceptron training with soft activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww'; op(p)=sign(k*0.5*net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter<0.001, break; end;
end;
hold off; ite, figure(2); clf; semilogy(tter)
MATLAB training results

c = 0.3   iter = 4    error = 0        c = 0.1    iter = 9    error = 0
c = 1     iter = 4    error = 0        c = 0.01   iter = 66   error = 0

Program in MATLAB (perceptron - hard)

%single neuron perceptron training with hard activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww'; op(p)=sign(net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter<0.001, break; end;
end;
hold off; ite, figure(2); clf; semilogy(tter)
MATLAB training results (perceptron - hard)

c = 0.3   iter = 4    error = 0        c = 0.1    iter = 9    error = 0
c = 1     iter = 4    error = 0        c = 0.01   iter = 66   error = 0

Program in MATLAB (perceptron - soft)

%single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p)); %hyperbolic tangent function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter<0.0001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: decision-boundary evolution in the input plane and error vs. iterations (semilog scale) for c = 0.3, k = 1.]
Program in MATLAB (perceptron - soft)

%single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=3; k=0.3;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p)); %hyperbolic tangent function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter<0.0001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: decision-boundary evolution and error vs. iterations (semilog scale) for c = 3, k = 0.3.]

LMS learning rule

d(TE)/dw_i = -2 · Σ_{p=1}^{np} (d_p - o_p)·f'·x_ip

In the LMS rule (Widrow-Hoff, 1962) it is assumed that f' = 1
(they worked with hard-threshold neurons, so f' was not defined):

d(TE)/dw_i = -2 · Σ_{p=1}^{np} (d_p - o_p)·x_ip

Therefore:

TE = Σ_{p=1}^{np} (d_p - net_p)²
Program in MATLAB (LMS)

%single neuron LMS training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=0.1; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:100,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p)); %hyperbolic function
    er(p)=dp(p)-op(p); ww=ww+c*(dp(p)-net(p))*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter<0.0001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: decision-boundary evolution and error vs. iterations (semilog scale) for c = 0.1, k = 1.]

Delta learning rule

Errors for the individual patterns:

Err_1  = (d_1 - o_1)²
Err_2  = (d_2 - o_2)²
...
Err_np = (d_np - o_np)²

Total error:

TE = Σ_{p=1}^{np} (d_p - o_p)²
Delta learning rule 1

TE = Σ_{p=1}^{np} (d_p - o_p)²

o_p = f(w_1·x_1 + w_2·x_2 + ... + w_ni·x_ni)

The gradient of TE along w_i:

d(TE)/dw_i = -2 · Σ_{p=1}^{np} (d_p - o_p) · do_p/dw_i

do_p/dw_i = (do_p/dnet_p)·(dnet_p/dw_i) = f'·x_i

Delta learning rule 2

d(TE)/dw_i = -2 · Σ_{p=1}^{np} (d_p - o_p) · do_p/dw_i

do_p/dw_i = (do_p/dnet_p)·(dnet_p/dw_i) = f'·x_i

d(TE)/dw_i = -2 · Σ_{p=1}^{np} (d_p - o_p)·f'·x_ip

Δw_i = 2·c · Σ_{p=1}^{np} (d_p - o_p)·f'·x_ip
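A minimal batch form of this update, written with the same expressions as the program on the next slide (a sketch; the factor of 2 from the derivation is absorbed into the learning constant c):

% one batch delta-rule update
ip = [1 2 1; 2 1 1];            % augmented patterns
dp = [-1; 1];  ww = [1 3 -3];  c = 2;  k = 0.5;
net = ip*ww';                   % net value for every pattern
op  = tanh(0.5*k*net);          % outputs
fp  = k*(1 - op.^2);            % f' for every pattern
dw  = (c*(dp - op).*fp)'*ip;    % sums (d_p - o_p)*f'*x_p over patterns, scaled by c
ww  = ww + dw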
Program in MATLAB (Delta)

%single neuron delta training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:250,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p)); %hyperbolic function
    fp(p)=k*(1-op(p)*op(p));
    er(p)=dp(p)-op(p); ww=ww+c*fp(p)*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter<0.001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: decision-boundary evolution and error vs. iterations (semilog scale) for c = 2, k = 0.5.]

Program in MATLAB (Delta) – Batch training

%single neuron delta training with soft activation function
% BATCH training
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:125,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter<0.001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: decision-boundary evolution and error vs. iterations (semilog scale) for c = 2, k = 0.5.]
Delta learning for multiple patterns

(c = 3, k = 1, derr = 0.01, ite = 576)

%single neuron delta training with soft activation function
% BATCH training with several patterns
format compact; clear all;
ip=[-1,-1; 2,2; 0,0; 1,1; -0.5,0; 2,1; 0,1; 3,1; 1,1.5; 2.5,1.5]
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1) %augmenting input
dp=[-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]', ww=[-1 3 -3], c=1.8; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
a=axis; a=[-2 4 -2 4]; axis(a); j=0;
for ite=1:10000,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4; y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter<0.01, break; end;
end;
for p=1:np, if dp(p)>0, plot(ip(p,1),ip(p,2),'ro');
else plot(ip(p,1),ip(p,2),'bx'); end; end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: final decision line with the ten patterns, and error vs. iterations (semilog scale).]
