
Connectionist approach to AI

Lecture # 21
 neural networks, neuron model
 perceptrons
 threshold logic, perceptron training, convergence theorem
 single layer vs. multi-layer
 backpropagation
 stepwise vs. continuous activation function


Connectionist models (neural nets)

 humans lack the speed & memory of computers
 yet humans are capable of complex reasoning/action
 maybe our brain architecture is well-suited for certain tasks

general brain architecture:
 many (relatively) slow neurons, interconnected
 dendrites serve as input devices (receive electrical impulses from other neurons)
 cell body "sums" inputs from the dendrites (possibly inhibiting or exciting)
 if the sum exceeds some threshold, the neuron fires an output impulse along the axon

Biology

 The brain doesn’t seem to have a CPU.
 Instead, it has lots of simple, parallel, asynchronous units, called neurons.
 Every neuron is a single cell that has a number of relatively short fibers, called dendrites, and one long fiber, called an axon.
 The end of the axon branches out into more short fibers.
 Each fiber “connects” to the dendrites and cell bodies of other neurons.
 The “connection” is actually a short gap, called a synapse.
 Axons are transmitters; dendrites are receivers.

Neuron

 [figure: a neuron — dendrites, cell body, axon, synapses]

How neurons work

 The fibers of surrounding neurons emit chemicals (neurotransmitters) that move across the synapse and change the electrical potential of the cell body.
 Sometimes the action across the synapse increases the potential, and sometimes it decreases it.
 If the potential reaches a certain threshold, an electrical pulse, or action potential, will travel down the axon, eventually reaching all the branches, causing them to release their neurotransmitters. And so on ...

Brain metaphor

connectionist models are based on the brain metaphor
 large number of simple, neuron-like processing elements
 large number of weighted connections between neurons
   note: the weights encode information, not symbols!
 parallel, distributed control
 emphasis on learning

brief history of neural nets
1940's  theoretical birth of neural networks: McCulloch & Pitts (1943), Hebb (1949)
1950's & 1960's  optimistic development using computer models: Minsky (50's), Rosenblatt (60's)
1970's  DEAD: Minsky & Papert showed serious limitations
1980's & 1990's  REBIRTH – new models, new techniques: backpropagation, Hopfield nets

From a neuron to a perceptron

 All connectionist models use a similar model of a neuron.
 There is a collection of units, each of which has:
   a number of weighted inputs from other units
    inputs represent the degree to which the other unit is firing
    weights represent how much the unit wants to listen to other units
   a threshold that the sum of the weighted inputs is compared against
    the threshold has to be crossed for the unit to do something (“fire”)
   a single output to another bunch of units
    what the unit decided to do, given all the inputs and its threshold

A unit (perceptron)

 [figure: inputs x1, x2, ..., xn feed a summation node along weights w1, w2, ..., wn; the node computes y = Σwixi and emits O = f(y)]

 xi are inputs
 wi are weights
 wn is usually set for the threshold, with xn = 1 (bias)
 y is the weighted sum of inputs including the threshold (the activation level)
 o is the output; it is computed using a function that determines how far the perceptron’s activation level is below or above 0

Interesting questions for NNs

 How do we wire up a network of perceptrons?
  - i.e., what “architecture” do we use?
 How does the network represent knowledge?
  - i.e., what do the nodes mean?
 How do we set the weights?
  - i.e., how does learning take place?

Artificial neurons

 McCulloch & Pitts (1943) described an artificial neuron
  inputs are either an electrical impulse (1) or not (0)
   (note: the original version used +1 for excitatory and –1 for inhibitory signals)
  each input has a weight associated with it
  the activation function multiplies each input value by its weight
  if the sum of the weighted inputs >= θ, then the neuron fires (returns 1), else it doesn't fire (returns 0)

if Σwixi >= θ, output = 1
if Σwixi < θ, output = 0

Computation via activation function

 can view an artificial neuron as a computational element
  accepts or classifies an input if the output fires

example: w1 = w2 = .75, θ = 1
INPUT: x1 = 1, x2 = 1   .75*1 + .75*1 = 1.5 >= 1  OUTPUT: 1
INPUT: x1 = 1, x2 = 0   .75*1 + .75*0 = .75 < 1  OUTPUT: 0
INPUT: x1 = 0, x2 = 1   .75*0 + .75*1 = .75 < 1  OUTPUT: 0
INPUT: x1 = 0, x2 = 0   .75*0 + .75*0 = 0 < 1  OUTPUT: 0

this neuron computes the AND function
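The threshold unit above fits in a few lines of Python (weights and threshold as on the slide; the function name is my own):

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts unit: fire (1) iff the weighted input sum reaches theta."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= theta else 0

# AND via weights .75/.75 and threshold 1, as on the slide
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mp_neuron((x1, x2), (0.75, 0.75), theta=1))
```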
In-class exercise

 specify weights and a threshold to compute OR; the weights must satisfy:

INPUT: x1 = 1, x2 = 1   w1*1 + w2*1 >= θ  OUTPUT: 1
INPUT: x1 = 1, x2 = 0   w1*1 + w2*0 >= θ  OUTPUT: 1
INPUT: x1 = 0, x2 = 1   w1*0 + w2*1 >= θ  OUTPUT: 1
INPUT: x1 = 0, x2 = 0   w1*0 + w2*0 < θ  OUTPUT: 0

TRY to find a neuron (e.g., keeping w1 = w2 = .75) that computes the OR function.
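One solution (my choice, not the only one): keep w1 = w2 = .75 and lower the threshold to .5, so a single active input is already enough to fire. Checking all four cases:

```python
def fires(inputs, weights, theta):
    # fire iff the weighted sum of the inputs reaches the threshold
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

# OR: any single active input contributes .75, which clears theta = .5
for x in ((1, 1), (1, 0), (0, 1), (0, 0)):
    print(x, fires(x, (0.75, 0.75), theta=0.5))
# outputs: 1, 1, 1, 0 — the OR truth table
```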


Another exercise?

 specify weights and a threshold to compute XOR; the weights would have to satisfy:

INPUT: x1 = 1, x2 = 1   w1*1 + w2*1 < θ  OUTPUT: 0
INPUT: x1 = 1, x2 = 0   w1*1 + w2*0 >= θ  OUTPUT: 1
INPUT: x1 = 0, x2 = 1   w1*0 + w2*1 >= θ  OUTPUT: 1
INPUT: x1 = 0, x2 = 0   w1*0 + w2*0 < θ  OUTPUT: 0

 these constraints are contradictory: w1 >= θ and w2 >= θ force w1 + w2 >= 2θ > θ (since 0 < θ), so no single perceptron computes XOR

Normalizing thresholds

 to make life more uniform, can normalize the threshold to 0
 simply add an additional input x0 = 1 with weight w0 = -θ

Σwixi >= θ  ≡  -θ*1 + Σwixi >= 0

advantage: threshold = 0 for all neurons
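The equivalence can be checked directly; this small sketch (function names my own) compares a thresholded unit against its bias-normalized form on the AND unit from earlier:

```python
def fires(xs, ws, theta):
    # original form: compare the weighted sum against theta
    return 1 if sum(w * x for w, x in zip(ws, xs)) >= theta else 0

def fires_normalized(xs, ws, theta):
    # normalized form: extra input x0 = 1 with weight w0 = -theta, threshold 0
    return 1 if -theta * 1 + sum(w * x for w, x in zip(ws, xs)) >= 0 else 0

# both forms agree on every input for the AND unit (.75/.75, theta = 1)
for xs in ((1, 1), (1, 0), (0, 1), (0, 0)):
    assert fires(xs, (0.75, 0.75), 1) == fires_normalized(xs, (0.75, 0.75), 1)
```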


Normalized examples

AND: w0 = -1, w1 = .75, w2 = .75 (with x0 = 1)
INPUT: x1 = 1, x2 = 1   1*-1 + .75*1 + .75*1 = .5 >= 0  OUTPUT: 1
INPUT: x1 = 1, x2 = 0   1*-1 + .75*1 + .75*0 = -.25 < 0  OUTPUT: 0
INPUT: x1 = 0, x2 = 1   1*-1 + .75*0 + .75*1 = -.25 < 0  OUTPUT: 0
INPUT: x1 = 0, x2 = 0   1*-1 + .75*0 + .75*0 = -1 < 0  OUTPUT: 0

OR: w0 = -.5, w1 = .75, w2 = .75 (with x0 = 1)
INPUT: x1 = 1, x2 = 1   1*-.5 + .75*1 + .75*1 = 1 >= 0  OUTPUT: 1
INPUT: x1 = 1, x2 = 0   1*-.5 + .75*1 + .75*0 = .25 >= 0  OUTPUT: 1
INPUT: x1 = 0, x2 = 1   1*-.5 + .75*0 + .75*1 = .25 >= 0  OUTPUT: 1
INPUT: x1 = 0, x2 = 0   1*-.5 + .75*0 + .75*0 = -.5 < 0  OUTPUT: 0

Perceptrons

 Rosenblatt (1958) devised a learning algorithm for artificial neurons
 start with a training set (example inputs & corresponding desired outputs)
 train the network to recognize the examples in the training set (by adjusting the weights on the connections)
 once trained, the network can be applied to new examples

Perceptron learning algorithm:
1. Set the weights on the connections to random values.
2. Iterate through the training set, comparing the output of the network with the desired output for each example.
3. If all the examples were handled correctly, then DONE.
4. Otherwise, update the weights for each incorrect example:
   • if it should have fired on x1, …, xn but didn't: wi += xi (0 <= i <= n)
   • if it shouldn't have fired on x1, …, xn but did: wi -= xi (0 <= i <= n)
5. GO TO 2.
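The perceptron learning algorithm can be sketched in Python. This is a batch variant (each pass classifies every example with the pass's starting weights, then applies all the updates), which is how the worked trace on the following slides proceeds; helper names are my own:

```python
def fires(w, xs):
    # xs includes the bias input x0 = 1; fire iff the weighted sum >= 0
    return 1 if sum(wi * xi for wi, xi in zip(w, xs)) >= 0 else 0

def train_perceptron(w, examples, max_passes=100):
    """Perceptron learning rule: wi += xi on a missed fire, wi -= xi on a false fire."""
    for _ in range(max_passes):
        # classify every example with this pass's starting weights
        mistakes = [(xs, t) for xs, t in examples if fires(w, xs) != t]
        if not mistakes:
            return w  # all examples handled correctly: DONE
        for xs, target in mistakes:
            sign = 1 if target == 1 else -1
            w = [wi + sign * xi for wi, xi in zip(w, xs)]
    return w

# AND, with the bias input x0 = 1 prepended to each example
examples = [((1, 1, 1), 1), ((1, 1, 0), 0), ((1, 0, 1), 0), ((1, 0, 0), 0)]
w = train_perceptron([-0.9, 0.6, 0.2], examples)
print(w)  # converges to weights that compute AND
```

Starting from the slide's initial weights (-0.9, 0.6, 0.2), this reproduces the trace that follows, ending at weights (-1.9, 1.6, 1.2).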
Example: perceptron learning

 Suppose we want to train a perceptron to compute AND

training set: x1 = 1, x2 = 1  1
              x1 = 1, x2 = 0  0
              x1 = 0, x2 = 1  0
              x1 = 0, x2 = 0  0

randomly, let: w0 = -0.9, w1 = 0.6, w2 = 0.2

using these weights:
x1 = 1, x2 = 1: -0.9*1 + 0.6*1 + 0.2*1 = -0.1  0 WRONG
x1 = 1, x2 = 0: -0.9*1 + 0.6*1 + 0.2*0 = -0.3  0 OK
x1 = 0, x2 = 1: -0.9*1 + 0.6*0 + 0.2*1 = -0.7  0 OK
x1 = 0, x2 = 0: -0.9*1 + 0.6*0 + 0.2*0 = -0.9  0 OK

new weights: w0 = -0.9 + 1 = 0.1
             w1 = 0.6 + 1 = 1.6
             w2 = 0.2 + 1 = 1.2

Example: perceptron learning (cont.)

using these updated weights:
x1 = 1, x2 = 1: 0.1*1 + 1.6*1 + 1.2*1 = 2.9  1 OK
x1 = 1, x2 = 0: 0.1*1 + 1.6*1 + 1.2*0 = 1.7  1 WRONG
x1 = 0, x2 = 1: 0.1*1 + 1.6*0 + 1.2*1 = 1.3  1 WRONG
x1 = 0, x2 = 0: 0.1*1 + 1.6*0 + 1.2*0 = 0.1  1 WRONG

new weights: w0 = 0.1 - 1 - 1 - 1 = -2.9
             w1 = 1.6 - 1 - 0 - 0 = 0.6
             w2 = 1.2 - 0 - 1 - 0 = 0.2

using these updated weights:
x1 = 1, x2 = 1: -2.9*1 + 0.6*1 + 0.2*1 = -2.1  0 WRONG
x1 = 1, x2 = 0: -2.9*1 + 0.6*1 + 0.2*0 = -2.3  0 OK
x1 = 0, x2 = 1: -2.9*1 + 0.6*0 + 0.2*1 = -2.7  0 OK
x1 = 0, x2 = 0: -2.9*1 + 0.6*0 + 0.2*0 = -2.9  0 OK

new weights: w0 = -2.9 + 1 = -1.9
             w1 = 0.6 + 1 = 1.6
             w2 = 0.2 + 1 = 1.2

Example: perceptron learning (cont.)

using these updated weights:
x1 = 1, x2 = 1: -1.9*1 + 1.6*1 + 1.2*1 = 0.9  1 OK
x1 = 1, x2 = 0: -1.9*1 + 1.6*1 + 1.2*0 = -0.3  0 OK
x1 = 0, x2 = 1: -1.9*1 + 1.6*0 + 1.2*1 = -0.7  0 OK
x1 = 0, x2 = 0: -1.9*1 + 1.6*0 + 1.2*0 = -1.9  0 OK

DONE!

EXERCISE: train a perceptron to compute OR

Hidden units

 the addition of hidden units allows the network to develop complex feature detectors (i.e., internal representations)

 e.g., Optical Character Recognition (OCR)
  perhaps one hidden unit "looks for" a horizontal bar
  another hidden unit "looks for" a diagonal
  another looks for the vertical base
  the combination of specific hidden units indicates a 7

Building multi-layer nets

 can combine perceptrons to perform more complex computations (or classifications)

smaller example: a 3-layer neural net
 2 input nodes
 1 hidden node
 2 output nodes

RESULT?
HINT: the left output node computes AND, the right output node computes XOR
— together they form a HALF ADDER (carry and sum of the two input bits)

Hidden units & learning

 every classification problem has a perceptron solution if enough hidden layers are used
  i.e., multi-layer networks can compute anything
  (recall: can simulate AND, OR, NOT gates)

 expressiveness is not the problem – learning is!
  it is not known how to systematically find solutions
  the Perceptron Learning Algorithm can't adjust weights between levels

Minsky & Papert's results about the "inadequacy" of perceptrons pretty much killed neural net research in the 1970's

rebirth in the 1980's due to several developments
 faster, more parallel computers
 new learning algorithms, e.g., backpropagation
 new architectures, e.g., Hopfield nets
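Combining perceptrons can be sketched directly. The weights below are one choice of my own (not the slide's figure): a hidden unit computes AND, and the output unit fires on OR of the inputs minus twice the AND — which is exactly XOR:

```python
def fires(w, xs):
    # normalized threshold unit: the bias input x0 = 1 is prepended to xs
    return 1 if sum(wi * xi for wi, xi in zip(w, (1,) + xs)) >= 0 else 0

def xor_net(x1, x2):
    h = fires([-1.5, 1, 1], (x1, x2))             # hidden unit: AND of the inputs
    return fires([-0.5, 1, 1, -2], (x1, x2, h))   # output: OR of inputs minus 2*AND

for xs in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(xs, xor_net(*xs))
# outputs: 0, 1, 1, 0 — XOR, which no single perceptron can compute
```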
Backpropagation nets

backpropagation nets are multi-layer networks
 normalize inputs between 0 (inhibit) and 1 (excite)
 utilize a continuous activation function

 perceptrons utilize a stepwise activation function:
   output = 1 if sum >= 0
            0 if sum < 0

 backpropagation nets utilize a continuous activation function:
   output = 1/(1 + e^(-sum))

Backpropagation learning

 there exists a systematic method for adjusting the weights, but no global convergence theorem (as was the case for perceptrons)

 backpropagation (backward propagation of error) – vaguely stated:
  select arbitrary weights
  pick the first test case
  make a forward pass, from inputs to output
  compute an error estimate and make a backward pass, adjusting the weights to reduce the error
  repeat for the next test case

 testing & propagating for all training cases is known as an epoch

 despite the lack of a convergence theorem, backpropagation works well in practice
  however, many epochs may be required for convergence
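The procedure above can be sketched as a toy implementation (my own, not production code): a 2-2-1 sigmoid network trained on XOR, making a forward pass per case and then a backward pass that adjusts the weights to reduce the error. Depending on the random seed it may settle in a local minimum rather than the exact XOR table, but the total error drops from its starting value:

```python
import math
import random

def sigmoid(s):
    # continuous activation: output between 0 and 1
    return 1 / (1 + math.exp(-s))

random.seed(0)
# 2-2-1 network; the bias is folded in as weight index 0 (input x0 = 1)
w_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(3)]

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR
lr = 0.5

def forward(x1, x2):
    xs = (1, x1, x2)
    h = [sigmoid(sum(w * x for w, x in zip(wh, xs))) for wh in w_hid]
    hs = (1,) + tuple(h)
    o = sigmoid(sum(w * x for w, x in zip(w_out, hs)))
    return xs, hs, o

def sse():
    # sum of squared errors over all training cases
    return sum((t - forward(*x)[2]) ** 2 for x, t in data)

err_before = sse()
for epoch in range(10000):           # one epoch = one pass over all training cases
    for (x1, x2), target in data:
        xs, hs, o = forward(x1, x2)  # forward pass
        # backward pass: delta terms use the derivative of the sigmoid, o*(1-o)
        d_out = (target - o) * o * (1 - o)
        d_hid = [hs[j + 1] * (1 - hs[j + 1]) * d_out * w_out[j + 1] for j in range(2)]
        for j in range(3):
            w_out[j] += lr * d_out * hs[j]
        for j in range(2):
            for i in range(3):
                w_hid[j][i] += lr * d_hid[j] * xs[i]

print(err_before, "->", sse())  # total error decreases over the epochs
```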


Backpropagation applets on the Web

 http://www.cs.ubc.ca/labs/lci/CIspace/Version4/neural/
 http://members.aol.com/Trane64/java/JRec.html

Neural net applications

 pattern classification
  9 of the top 10 US credit card companies use Falcon
   uses neural nets to model customer behavior, identify fraud
   claims improvement in fraud detection of 30-70%
  Sharp, Mitsubishi, … -- Optical Character Recognition (OCR)

 prediction & financial analysis
  Merrill Lynch, Citibank, … -- financial forecasting, investing
  Spiegel – marketing analysis, targeted catalog sales

 control & optimization
  Texaco – process control of an oil refinery
  Intel – computer chip manufacturing quality control
  AT&T – echo & noise control in phone lines (filters and compensates)
  Ford engines utilize a neural net chip to diagnose misfirings, reduce emissions

 also from the AI video: the ALVINN project at CMU trained a neural net to drive
  backpropagation network: video input, 9 hidden units, 45 outputs

DEMONSTRATION

SOLVING AND, OR and XOR problems using the NN model: multi-layer perceptron
