Historical Sketch
Pre-1940: von Helmholtz, Mach, Pavlov, etc.
General theories of learning, vision and conditioning; no specific mathematical models of neuron operation.
An NN is a network of many simple processors (units), each possibly having a small amount of local memory. The units are connected by communication channels (connections) which usually carry numeric (as opposed to symbolic) data, encoded by any of various means. The units operate only on their local data and on the inputs they receive via the connections.
Who is concerned with NNs? Computer scientists want to find out about the properties of non-symbolic information processing with neural nets and about learning systems in general.
Statisticians use neural nets as flexible, nonlinear regression and classification models.
Engineers of many kinds exploit the capabilities of neural networks in many areas, such as signal processing and automatic control.
Cognitive scientists view neural networks as a possible apparatus to describe models of thinking and consciousness (high-level brain function).
Neurophysiologists use neural networks to describe and explore medium-level brain function (e.g. memory, sensory systems, motor control).
Physicists use neural networks to model phenomena in statistical mechanics and for many other tasks.
Biologists use neural networks to interpret nucleotide sequences.
CSE 590 Lecture 1
Human brain
The brain is a highly complex, non-linear, parallel information-processing system. It performs tasks like pattern recognition, perception and motor control many times faster than the fastest digital computers. It is characterized by:
- robustness and fault tolerance
- flexibility: it can adjust to a new environment by learning
- the ability to deal with fuzzy, probabilistic, noisy or inconsistent information
- high parallelism
- being small and compact, requiring little power
CSE 590 Lecture 1 6
- representation and computation
- the ability to self-organize
- the ability to generalize based on existing knowledge
- associative memory recall
- alternation between concepts
- low energy consumption and very high capacity
Cesare Pianese
1. Biological inspiration
2. Artificial neurons and neural networks
3. Learning processes
4. Learning with artificial neural networks
Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours. An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems. The nervous system is built from relatively simple units, the neurons, so copying their behavior and functionality should be the solution.
The spikes travelling along the axon of the pre-synaptic neuron trigger the release of neurotransmitter substances at the synapse. The neurotransmitters cause excitation or inhibition in the dendrite of the post-synaptic neuron. The integration of the excitatory and inhibitory signals may produce spikes in the postsynaptic neuron. The contribution of the signals depends on the strength of the synaptic connection.
Neuron structure
The human brain consists of approximately 10^11 elements called neurons. They communicate through a network of long fibers called axons. Each of these axons splits up into a series of smaller fibers, which communicate with other neurons via junctions called synapses that connect to small fibers called dendrites attached to the main cell body.
Biological Neuron
Neural Networks
A neural network (NN) is a machine learning approach inspired by the way in which the brain performs a particular learning task. A NN is specified by:
- an architecture: a set of neurons and links connecting neurons, where each link has a weight;
- a neuron model: the information-processing unit of the NN;
- a learning algorithm: used for training the NN by modifying the weights in order to model the particular learning task correctly on the training examples.
The aim is to obtain a NN that generalizes well, that is, that behaves correctly on new instances of the learning task.
Neuron structure
The basic computational unit is the neuron:
- Dendrites (inputs, 1 to 10^4 per neuron)
- Soma (cell body)
- Axon (output)
- Synapses (excitatory or inhibitory)
Interconnectedness
- 80,000 neurons per square mm
- 10^15 connections
- Most axons extend less than 1 mm (local connections)
- Some cells in the cerebral cortex may have 200,000 connections
- The total number of connections in the brain network is astronomical: greater than the number of particles in the known universe
A synapse is like a one-way valve. An electrical signal is generated by the neuron, passes down the axon, and is received by the synapses that join onto other neurons' dendrites. The electrical signal causes the release of transmitter chemicals which flow across a small gap in the synapse (the synaptic cleft). The chemicals can have an excitatory effect on the receiving neuron (making it more likely to fire) or an inhibitory effect (making it less likely to fire). The total inhibitory and excitatory contributions to a particular neuron are summed; if this value exceeds the neuron's threshold, the neuron fires.
Neuron structure
A one-output unit: the output can be excited or not excited. Incoming signals from other neurons determine whether the neuron shall excite ("fire"). The output is subject to attenuation in the synapses, which are junction parts of the neuron.
(Figure: biological neuron, showing soma, dendrites, and synapses)
Knowledge is represented in neural networks by the strength of the synaptic connections between neurons (hence "connectionism"). Learning in neural networks is accomplished by adjusting the synaptic strengths (weights). There are three primary categories of neural network learning algorithms:
- Supervised: exemplar pairs of inputs and (known, labeled) target outputs are used for training.
- Reinforcement: a single good/bad training signal is used for training.
- Unsupervised: no training signal; self-organization and clustering produce the outputs.
(Figure: a physical neuron, with soma, dendrites, and synapses, side by side with an artificial neuron)
Artificial Neuron
(Figure: artificial neuron with input signals x1 … xn, weights w1 … wn, and output signal Y)
Neurons work by processing information. They receive and provide information in the form of spikes.

The McCulloch-Pitts model (inputs x1 … xn, weights w1 … wn):

z = Σ_{i=1..n} w_i x_i ;  y = H(z)

where H is the step (Heaviside) function and y is the output.
y = f(x, w)

where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights. Examples:
y = 1 / (1 + e^(w^T x - a))   (a sigmoid-type neuron)

y = e^(-||x - w||^2 / (2a^2))   (a Gaussian-type neuron)
An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.
Tasks to be solved by artificial neural networks:
- controlling the movements of a robot based on self-perception and other information (e.g., visual information);
- deciding the category of potential food items (e.g., edible or non-edible) in an artificial world;
- recognizing a visual object (e.g., a familiar face);
Learning = learning by adaptation. The young animal learns that the green fruits are sour, while the yellowish/reddish ones are sweet. The learning happens by adapting the fruit-picking behavior. At the neural level the learning happens by changing the synaptic strengths, eliminating some synapses, and building new ones.
- An output signal is either discrete (e.g., 0 or 1) or a real-valued number (e.g., between 0 and 1).
- The net input is calculated as the weighted sum of the input signals.
- The net input is transformed into an output signal via a simple function (e.g., a threshold function).
(Figure: neural network as a black box mapping inputs 0 … n to outputs 0 … m)
The McCulloch-Pitts Model (First Neuron Model, 1943). The neuron has binary inputs (0 or 1) labelled x_i, where i = 1, 2, …, n. These inputs have weights of +1 for excitatory synapses and -1 for inhibitory synapses, labelled w_i, where i = 1, 2, …, n. The neuron has a threshold value T which has to be exceeded by the weighted sum of the signals if the neuron is to fire. The neuron has a binary output signal denoted by o.
That is, the output of the neuron at time t+1 is 1 if the sum of all the inputs x at time t multiplied by their weights w is greater than or equal to the threshold T, and 0 if that sum is less than the threshold T:

o(t+1) = 1 if Σ_i w_i x_i(t) ≥ T, and 0 otherwise

Simplistic, but it can perform the basic logic operations NOT, OR and AND.
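As a small sketch (function and gate names are illustrative, not from the lecture), the McCulloch-Pitts threshold rule and the three basic gates can be written as:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: fires (outputs 1) iff the weighted
    sum of binary inputs meets or exceeds the threshold T."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Basic logic gates as MP neurons (binary inputs, +/-1 weights):
AND = lambda x1, x2: mp_neuron([x1, x2], [1, 1], threshold=2)
OR  = lambda x1, x2: mp_neuron([x1, x2], [1, 1], threshold=1)
NOT = lambda x:      mp_neuron([x], [-1], threshold=0)

print(AND(1, 1), OR(0, 1), NOT(1))  # -> 1 1 0
```

Note that all three gates use the same neuron; only the weights and the threshold differ, which is exactly the point of the model.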
Mathematical Representation

net = Σ_{i=1..n} w_i x_i + b

y = f(net)

(Figure: neuron with inputs x1 … xn, weights w1 … wn, bias b, a summation stage, and an activation stage f producing the output)
Adding biases

A linear neuron is a more flexible model if we include a bias. A bias unit can be thought of as a unit which always has an output value of 1, and which is connected to the hidden and output layer units via modifiable weights. It sometimes helps convergence of the weights to an acceptable solution. A bias is exactly equivalent to a weight on an extra input line that always has an activity of 1.

y = b + Σ_{i=1..m} x_i w_i

where i indexes the input connections, w_i is the weight on the i-th input, and y is the output. Equivalently, introducing an extra input x_0 that is always 1:

y = Σ_{i=0..m} w_i x_i,  with w_0 = b

(Figure: neuron with inputs x1 … xm, weights w1 … wm, bias b, summing function, and activation f(x))
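The equivalence between an explicit bias and a weight w0 = b on a constant input x0 = 1 can be checked with a short sketch (function names are illustrative, not from the lecture):

```python
def linear_neuron_bias(x, w, b):
    # y = b + sum_i w_i * x_i
    return b + sum(wi * xi for wi, xi in zip(w, x))

def linear_neuron_extra_input(x, w, b):
    # y = sum_{i=0}^{m} w_i * x_i  with w0 = b and x0 = 1
    return sum(wi * xi for wi, xi in zip([b] + w, [1.0] + x))

# Both formulations give the same output for any input:
x, w, b = [0.5, -1.0], [2.0, 0.3], 0.1
assert abs(linear_neuron_bias(x, w, b)
           - linear_neuron_extra_input(x, w, b)) < 1e-9
```

Treating the bias as just another weight is convenient because a learning rule that updates weights then updates the bias for free.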
w_ij is the weight characterizing the synapse from input j to neuron i; w_ij is known as the weight from unit j to unit i. If w_ij > 0 the synapse is excitatory; if w_ij < 0 the synapse is inhibitory. Note that the input x_j may be an external input or the output of some other neuron.
Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realizes a nonlinear function, called the neuron activation function. The signal x is the adder's output signal, and y = f(x) is the output signal of the nonlinear element. The signal y is also the output signal of the neuron.
Activation Functions
Usually, we don't just use the weighted sum directly; we apply some function to the weighted sum before it is used (e.g., as the output). This is called the activation function, written f(n), f(net) or f(e). A step function can be a good simulation of a biological neuron spiking.
f(x) = 1 if x ≥ T
f(x) = 0 if x < T

Step function, with threshold T.
f(x) = 1 / (1 + e^(-λx))

Sigmoid function, where λ is the steepness parameter.
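Both activation functions can be sketched in Python (the parameter names, including `lam` for the steepness parameter, are illustrative):

```python
import math

def step(x, T=0.0):
    """Threshold (step) activation: 1 if x >= T, else 0."""
    return 1 if x >= T else 0

def sigmoid(x, lam=1.0):
    """Logistic sigmoid with steepness parameter lam:
    f(x) = 1 / (1 + exp(-lam * x))."""
    return 1.0 / (1.0 + math.exp(-lam * x))

print(step(0.7), round(sigmoid(0.0), 2))  # -> 1 0.5
```

The sigmoid behaves like a softened step: for large λ it approaches the step function, while small λ gives a gentler transition, which is what makes gradient-based learning possible later on.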
Topology: layers and connection weights; heteroassociative vs. recurrent autoassociative networks.

(Figure: network topology with input neurons, hidden neurons, and output neurons; example connection weights -0.9 and 0.5)
Feedforward networks: information is propagated from the inputs to the outputs; they compute non-linear functions of n input variables by compositions of Nc algebraic functions. Time plays no role (there are no cycles between outputs and inputs).
Recurrent topologies: can model systems with internal states (dynamic ones). Delays are associated with a specific weight. Training is more difficult, performance may be problematic, stable outputs may be more difficult to evaluate, and unexpected behavior may occur (oscillation, chaos, …).
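The internal-state idea can be sketched with a single recurrent unit (all weight values here are illustrative, not from the lecture):

```python
import math

def recurrent_unit(inputs, w_in=0.8, w_rec=0.5):
    """Minimal recurrent neuron: the output at time t is fed back as
    an extra input at time t+1, so the unit carries an internal state."""
    y = 0.0                                  # internal state before any input
    outputs = []
    for x in inputs:
        y = math.tanh(w_in * x + w_rec * y)  # feedback through w_rec
        outputs.append(y)
    return outputs

# A constant input produces a changing output, because the state evolves:
ys = recurrent_unit([1.0, 1.0, 1.0])
```

A feedforward unit would return the same value three times here; the dependence of the output on past inputs is exactly the "internal state" the slide refers to.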
(Figure: taxonomy of network types: feedforward networks, trained unsupervised (Kohonen) or supervised; recurrent networks, trained unsupervised (ART) or supervised (Elman, Jordan, Hopfield))
Types of connectivity
1-Feedforward networks
- The neurons are arranged in separate layers.
- There is no connection between the neurons in the same layer.
- The neurons in one layer receive inputs from the previous layer.
- The neurons in one layer deliver their output to the next layer.
- The connections are unidirectional (hierarchical).

(Figure: input units feed hidden units, which feed output units)
2-Recurrent networks
Some connections are present from a layer to previous layers. This is more biologically realistic. Feedforward + feedback = recurrent.
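A single forward pass through a layered, unidirectional network can be sketched as follows (the weight values are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W_hidden, W_output):
    """One forward pass through a two-layer feedforward net: signals
    flow input -> hidden -> output, with no connections inside a layer
    and none going backwards."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_hidden]
    return [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W_output]

# Illustrative weights: 2 inputs, 2 hidden units, 1 output unit.
W_h = [[0.5, -0.6], [0.9, 0.2]]
W_o = [[1.0, -1.0]]
y = forward([1.0, 0.0], W_h, W_o)
```

Each layer is just the neuron equation y = f(Σ w_i x_i) applied once per unit, with the outputs of one layer becoming the inputs of the next.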
(Figure: development workflow: collect data; separate into training and test sets; define the network; select learning parameters; train the network, resetting parameter values as needed; test the network; implementation)
A neural network is a set of interconnected neurons (simple processing units). Each neuron receives signals from other neurons and sends an output to other neurons. The signals are amplified by the strength of the connection. The strength of the connection changes over time according to a feedback mechanism, so the net can be trained.
(Figure: layered network: input signals x1 … xn enter the input layer, connection weights w_ij and w_jk link the layers, and the output layer produces y1 … yl)
Advantages of ANNs
- Generalization: using responses to prior input patterns to determine the response to a novel input
- Inherently massively parallel
- Able to learn any complex non-linear mapping
- Learning instead of programming
- Robust
- Can deal with incomplete and/or noisy data
Disadvantages of ANNs
- Difficult to design: there are no clear design rules for arbitrary applications
- The learning process can be very time-consuming
- Can overfit the training data, becoming useless for generalization
- Difficult to assess internal operation: it is difficult to find out whether, and if so which, tasks are performed by different parts of the net
- Unpredictable: it is difficult to estimate future network performance
Applications
- Aerospace: autopilots, flight path simulations, aircraft control systems
- Automotive: automobile automatic guidance systems, warranty activity analyzers
- Banking: check and other document readers, credit application evaluators
- Defense: weapon steering, target tracking, object discrimination, facial recognition
- Electronics: code sequence prediction, integrated circuit chip layout, …
Applications
- Robotics: trajectory control, forklift robot, manipulator controllers, vision systems
- Speech: speech recognition, speech compression, vowel classification, …
- Securities: market analysis, automatic bond rating, stock trading advisory
- Telecommunications: image and data compression, automated information services, …
- Transportation: truck brake diagnosis systems, vehicle scheduling, routing
Introduction
The assumption of (traditional) AI work is that knowledge may be represented as symbol structures (essentially, complex data structures) representing bits of knowledge (objects, concepts, facts, rules, strategies, …).
E.g., "red" represents the colour red; "car1" represents my car; "red(car1)" represents the fact that my car is red.
Knowledge representation (KR) languages are designed to facilitate this. Rather than using general C++/Java data structures, we use special-purpose formalisms. A KR language should allow you to:
- represent adequately the knowledge you need for your problem (representational adequacy)
- do it in a clear, precise and natural way
- reason on that knowledge, drawing new conclusions
Well-defined syntax/semantics
Knowledge representation languages should have precise syntax and semantics. You must know exactly what an expression means in terms of objects in the real world.
(Figure: the real world is mapped into the KR language; inference in the computer produces new conclusions, which are mapped back to the real world)
Logic is a language with well-defined syntax and semantics, which supports sound inference, and is independent of the domain of application. Different logics exist, which allow you to represent different kinds of things, and which allow more or less efficient inference:
propositional logic, predicate logic, temporal logic,
However, representing some things in logic may not be very natural, and inferences may not be efficient. More specialised languages may be better.
Propositional logic
In general a logic is defined by:
- syntax: what expressions are allowed in the language
- semantics: what they mean, in terms of a mapping to the real world
- proof theory: how we can draw new conclusions from existing statements in the logic
Propositional logic is the simplest.
P represents the fact "Andrew likes chocolate"; Q represents the fact "Andrew has chocolate". These are called atomic propositions. Logical connectives are used to represent and: ∧, or: ∨, if-then: →, not: ¬. Statements or sentences in the language are constructed from atomic propositions and logical connectives.

P ∧ ¬Q: Andrew likes chocolate and he doesn't have any.
P → Q: If Andrew likes chocolate then Andrew has chocolate.
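The truth of such sentences under a given assignment can be checked mechanically; a small Python sketch (the helper name `implies` is illustrative):

```python
# P = "Andrew likes chocolate", Q = "Andrew has chocolate".

def implies(a, b):
    # Material implication: a -> b is false only when a is true and b is false.
    return (not a) or b

P, Q = True, False
sentence1 = P and (not Q)    # P AND (NOT Q)
sentence2 = implies(P, Q)    # P -> Q
print(sentence1, sentence2)  # -> True False
```

With P true and Q false, "Andrew likes chocolate and he doesn't have any" holds, while "if Andrew likes chocolate then Andrew has chocolate" is falsified, matching the truth-table semantics of the connectives.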
CS 561, Session 28
Converging Frameworks
Artificial intelligence (AI): build a packet of intelligence into a machine
- explain human behavior by interacting processes (schemas) in the head, but not localized in the brain
- interactions of components of the brain: computational neuroscience, neurologically constrained models
Cognitive psychology:
- connectionism: networks of trainable quasi-neurons to provide parallel distributed models, little constrained by neurophysiology
- abstract (computer program or control system) information processing models
1950s: the beginning of computer vision. Aim: give machines the same or better vision capability than ours. Drive: AI, robotics applications and factory automation. Initially vision was treated as a passive, feedforward, layered and hierarchical process that was just going to provide input to higher reasoning processes (from AI). But it was soon realized that this could not handle real images. 1980s: active vision: make the system more robust by allowing the vision to adapt with the ongoing recognition/interpretation.