
Artificial Neural Networks

CSE 590 Lecture 1

Historical Sketch

Pre-1940: von Helmholtz, Mach, Pavlov, etc.
General theories of learning, vision, and conditioning; no specific mathematical models of neuron operation.

1940s: Hebb, McCulloch and Pitts
Mechanism for learning in biological neurons; neural-like networks can compute any arithmetic function.

1950s: Rosenblatt, Widrow and Hoff
First practical networks and learning rules.

1980s: Grossberg, Hopfield, Kohonen, Rumelhart, etc.
Important new developments cause a resurgence in the field.

Artificial Neural Network systems are also called:
neurocomputers
neural networks
parallel distributed processors (PDP)
connectionist systems


What is a Neural Network?

There is no universally accepted definition of an NN, but perhaps most people in the field would agree that an NN is a network of many simple processors (units), each possibly having a small amount of local memory. The units are connected by communication channels (connections) which usually carry numeric (as opposed to symbolic) data, encoded by any of various means. The units operate only on their local data and on the inputs they receive via the connections.

Who is concerned with NNs?

Computer scientists want to find out about the properties of non-symbolic information processing with neural nets and about learning systems in general.
Statisticians use neural nets as flexible, nonlinear regression and classification models.
Engineers of many kinds exploit the capabilities of neural networks in many areas, such as signal processing and automatic control.
Cognitive scientists view neural networks as a possible apparatus to describe models of thinking and consciousness (high-level brain function).
Neuro-physiologists use neural networks to describe and explore medium-level brain function (e.g. memory, sensory systems, motor control).
Physicists use neural networks to model phenomena in statistical mechanics and for many other tasks.
Biologists use neural networks to interpret nucleotide sequences.

Human brain

The brain is a highly complex, non-linear, parallel information-processing system. It performs tasks such as pattern recognition, perception, and motor control many times faster than the fastest digital computers. It is characterized by:
Robustness and fault tolerance
Flexibility: it can adjust to a new environment by learning
The ability to deal with fuzzy, probabilistic, noisy, or inconsistent information
High parallelism
Being small and compact, and requiring little power

Inspiration for Artificial Intelligence

AI has been inspired by two fundamental questions:
How does the human brain work?
How can we exploit the brain metaphor to build intelligent machines?

Phenomenological Properties of the Human Brain

massive parallelism
distributed representation and computation
fault tolerance
graceful degradation
endurance of memories
fast retrieval and quick alternation between concepts
the ability to self-organize
the ability to generalize based on existing knowledge
associative memory recall
low energy consumption and very high capacity

Other Factors that Promoted Interest

Neuroscience research
Exponential increase in desktop computing power
Powerful neural network algorithms and applications
Psychological research on human problem solving and decision making

Brain vs. Computer Processing

Processing speed: milliseconds vs. nanoseconds.
Processing order: massively parallel vs. serial.
Abundance and complexity: between 10^11 and 10^14 neurons operate in parallel in the brain at any given moment, each with between 10^3 and 10^4 abutting connections.
Knowledge storage: adaptable vs. new information destroying old information.
Fault tolerance: knowledge is retained through redundant, distributed encoding of information vs. the corruption of a conventional computer's memory, which is irretrievable and leads to failure.

1. Biological inspiration
2. Artificial neurons and neural networks
3. Learning processes
4. Learning with artificial neural networks

Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to produce these behaviours. An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems. The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality should be the solution.

The spikes travelling along the axon of the pre-synaptic neuron trigger the release of neurotransmitter substances at the synapse. The neurotransmitters cause excitation or inhibition in the dendrite of the post-synaptic neuron. The integration of the excitatory and inhibitory signals may produce spikes in the post-synaptic neuron. The contribution of the signals depends on the strength of the synaptic connection.

Neuron structure

The human brain consists of approximately 10^11 elements called neurons. They communicate through a network of long fibres called axons. Each of these axons splits up into a series of smaller fibres, which communicate with other neurons via junctions called synapses that connect to small fibres called dendrites attached to the main body of the neuron.

(Figure: biological neuron)

Neural Networks

A neural network (NN) is a machine learning approach inspired by the way in which the brain performs a particular learning task. An NN is specified by:
an architecture: a set of neurons and links connecting the neurons, where each link has a weight;
a neuron model: the information processing unit of the NN;
a learning algorithm: used for training the NN by modifying the weights in order to model the particular learning task correctly on the training examples.

The aim is to obtain an NN that generalizes well, that is, one that behaves correctly on new instances of the learning task.

Neuron structure

The basic computational unit is the neuron:
Dendrites (inputs, 1 to 10^4 per neuron)
Soma (cell body)
Axon (output)
Synapses (excitatory or inhibitory)

Interconnectedness

80,000 neurons per square mm
10^15 connections
Most axons extend less than 1 mm (local connections)
Some cells in the cerebral cortex may have 200,000 connections
The total number of connections in the brain network is astronomical: greater than the number of particles in the known universe

A synapse is like a one-way valve. An electrical signal is generated by the neuron, passes down the axon, and is received by the synapses that join onto other neurons' dendrites. The electrical signal causes the release of transmitter chemicals which flow across a small gap in the synapse (the synaptic cleft). The chemicals can have an excitatory effect on the receiving neuron (making it more likely to fire) or an inhibitory effect (making it less likely to fire). The total inhibitory and excitatory contributions to a particular neuron are summed; if this value exceeds the neuron's threshold, the neuron fires.


Inspiration from Neurobiology

A neuron is a many-inputs / one-output unit.
The output can be excited or not excited.
Incoming signals from other neurons determine whether the neuron shall excite ("fire").
The output is subject to attenuation in the synapses, which are junction parts of the neuron.

(Figure: electron micrograph of a real neuron)

Biological vs. Artificial Neuron

(Figure: a biological neuron, with soma, dendrites, axon, and synapses, shown alongside its artificial counterpart)
Knowledge is represented in neural networks by the strength of the synaptic connections between neurons (hence "connectionism"). Learning in neural networks is accomplished by adjusting the synaptic strengths (weights). There are three primary categories of neural network learning algorithms:

Supervised: exemplar pairs of inputs and (known, labeled) target outputs are used for training.
Reinforcement: a single good/bad training signal is used for training.
Unsupervised: no training signal; self-organization and clustering produce the categories.

BNNs versus ANNs

Both learn from experience: examples / training data.
The strength of the connection between the neurons is stored as a weight value for the specific connection.
Learning the solution to a problem means changing the connection weights.

(Figure: a physical neuron and an artificial neuron, side by side)

Analogy between biological and artificial neural networks

(Figure: a biological neuron, with soma, dendrites, axon, and synapses, mapped onto an artificial network whose input, middle, and output layers carry the input and output signals)

Artificial Neuron

(Figure: input signals x1 ... xn with weights w1 ... wn feeding a neuron that produces the output signal y)

Neurons work by processing information. They receive and provide information in the form of spikes. In the McCulloch-Pitts model, the neuron computes

$z = \sum_{i=1}^{n} w_i x_i, \quad y = H(z)$

where $H$ is the Heaviside step function and $y$ is the output.

The McCulloch-Pitts model:

spikes are interpreted as spike rates;
synaptic strengths are translated into synaptic weights;
excitation means a positive product between the incoming spike rate and the corresponding synaptic weight;
inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.

Nonlinear generalization of the McCulloch-Pitts neuron:

$y = f(x, w)$

where $y$ is the neuron's output, $x$ is the vector of inputs, and $w$ is the vector of synaptic weights. Examples:

$y = \dfrac{1}{1 + e^{-w^T x - a}}$  (sigmoidal neuron)

$y = e^{-\frac{\|x - w\|^2}{2a^2}}$  (Gaussian neuron)
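To make the two formulas concrete, here is a minimal Python/NumPy sketch; the inputs, weights, and the parameter a below are made-up illustrative values, not taken from the lecture.

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    # y = 1 / (1 + exp(-(w . x) - a))
    return 1.0 / (1.0 + np.exp(-np.dot(w, x) - a))

def gaussian_neuron(x, w, a):
    # y = exp(-||x - w||^2 / (2 a^2))
    return np.exp(-np.sum((x - w) ** 2) / (2.0 * a ** 2))

x = np.array([0.5, 1.0])    # illustrative inputs
w = np.array([0.8, -0.3])   # illustrative synaptic weights
print(sigmoidal_neuron(x, w, a=0.1))  # value in (0, 1)
print(gaussian_neuron(x, w, a=1.0))   # peaks when x equals w
```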

An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.

Tasks to be solved by artificial neural networks:
controlling the movements of a robot based on self-perception and other information (e.g., visual information);
deciding the category of potential food items (e.g., edible or non-edible) in an artificial world;
recognizing a visual object (e.g., a familiar face).

Learning = learning by adaptation. The young animal learns that the green fruits are sour, while the yellowish/reddish ones are sweet. The learning happens by adapting the fruit-picking behaviour. At the neural level, the learning happens by changing the synaptic strengths, eliminating some synapses, and building new ones.

An output signal is either discrete (e.g., 0 or 1) or a real-valued number (e.g., between 0 and 1). The net input is calculated as the weighted sum of the input signals, and is transformed into an output signal via a simple function (e.g., a threshold function).

Neural networks abstract from the details of real neurons

Basic Artificial Model

Consists of simple processing elements called neurons, units, or nodes.
Each neuron is connected to other nodes with an associated weight (strength).
Each neuron has a single threshold value.
The weighted sum of all the inputs coming into the neuron is formed, and the threshold is subtracted from this value to give the activation.
The activation signal is passed through an activation function (also called a transfer function) to produce the output of the neuron.

Basic Network Concepts

A neural network generally maps a set of inputs to a set of outputs.
The number of inputs/outputs is variable.
The network itself is composed of an arbitrary number of nodes with an arbitrary topology.

(Figure: inputs 0 ... n feeding a neural network that produces outputs 0 ... m)

The McCulloch-Pitts Model (First Neuron Model, 1943)

The neuron has binary inputs (0 or 1), labelled $x_i$ where $i = 1, 2, \dots, n$.
These inputs have weights of +1 for excitatory synapses and -1 for inhibitory synapses, labelled $w_i$ where $i = 1, 2, \dots, n$.
The neuron has a threshold value $T$ which has to be exceeded by the weighted sum of signals if the neuron is to fire.
The neuron has a binary output signal denoted by $o$.

The output $o$ at time $t+1$ can be defined by the following equation:

$o^{t+1} = 1$ if $\sum_{i=1}^{n} w_i x_i^t \ge T$
$o^{t+1} = 0$ if $\sum_{i=1}^{n} w_i x_i^t < T$

i.e. the output of the neuron at time $t+1$ is 1 if the sum of all the inputs $x$ at time $t$ multiplied by their weights $w$ is greater than or equal to the threshold $T$, and 0 otherwise. Simplistic, but it can perform the basic logic operations NOT, OR, and AND.
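A minimal sketch of this model in Python; the weights and thresholds realizing the three logic operations are hand-picked illustrative choices (excitatory +1, inhibitory -1, as on the previous slide).

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (output 1) iff the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Hand-picked weights/thresholds realizing the basic logic gates:
AND = lambda x1, x2: mcculloch_pitts([x1, x2], [1, 1], threshold=2)
OR  = lambda x1, x2: mcculloch_pitts([x1, x2], [1, 1], threshold=1)
NOT = lambda x1:     mcculloch_pitts([x1],     [-1],   threshold=0)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 1) == 1 and OR(0, 0) == 0
assert NOT(0) == 1 and NOT(1) == 0
```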

Mathematical Representation

(Figure: inputs x1 ... xn with weights w1 ... wn and bias b feeding a summation unit followed by an activation function f, producing the output y)

$net = \sum_{i=1}^{n} w_i x_i + b$

$y = f(net)$

Adding biases

A linear neuron is a more flexible model if we include a bias. A bias unit can be thought of as a unit which always has an output value of 1 and which is connected to the hidden and output layer units via modifiable weights. It sometimes helps convergence of the weights to an acceptable solution. A bias is exactly equivalent to a weight on an extra input line that always has an activity of 1.

$y = b + \sum_{i=1}^{m} w_i x_i$

where $w_i$ is the weight on the $i$-th input connection. Equivalently, treating the bias as the weight $w_0 = b$ on an extra input $x_0 = 1$:

$y = \sum_{i=0}^{m} w_i x_i$

Bias as extra input

(Figure: attribute values x1 ... xm plus a fixed extra input x0 = +1, with weights w0 ... wm, feeding a summing function and an activation function f(x) that produces the output class y)

$y = \sum_{j=0}^{m} w_j x_j, \quad w_0 = b$
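The equivalence of the two formulations is easy to verify numerically; a small sketch with made-up numbers:

```python
import numpy as np

x = np.array([0.5, -1.0])   # example inputs
w = np.array([0.3, 0.7])    # example weights
b = 0.2                     # bias

# Explicit-bias form: y = b + sum_i w_i x_i
y1 = b + np.dot(w, x)

# Bias-as-weight form: prepend x0 = +1 and w0 = b
xe = np.concatenate(([1.0], x))
we = np.concatenate(([b], w))
y2 = np.dot(we, xe)

assert np.isclose(y1, y2)   # the two forms give identical outputs
```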

Elements of the model neuron:

$x_j$ is the input to synapse $j$.
$w_{ij}$ is the weight characterizing the synapse from input $j$ to neuron $i$; it is known as the weight from unit $j$ to unit $i$.
If $w_{ij} > 0$ the synapse is excitatory; if $w_{ij} < 0$ the synapse is inhibitory.
Note that $x_j$ may be an external input or the output of some other neuron.

Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realizes a nonlinear function, called the neuron activation function. Signal $x$ is the adder output signal, and $y = f(x)$ is the output signal of the nonlinear element. Signal $y$ is also the output signal of the neuron.

Activation Functions

Usually, we don't use the weighted sum directly; we apply some function to the weighted sum before it is used (e.g., as output). This is called the activation function, and it is variously written as f(n), f(net), or f(e). A step function can be a good simulation of a biological neuron spiking:

$f(x) = 1$ if $x \ge T$; $f(x) = 0$ if $x < T$

where $T$ is the threshold.

Another Activation Function: The Sigmoid

The math of some neural nets requires that the activation function be continuously differentiable. A sigmoidal function is often used to approximate the step function:

$f(x) = \dfrac{1}{1 + e^{-\lambda x}}$

where $\lambda$ is the steepness parameter.
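A short sketch comparing the two activation functions; the name lam for the steepness parameter is my own:

```python
import numpy as np

def step(x, T=0.0):
    # Hard threshold: 1 if x >= T, else 0.
    return np.where(x >= T, 1.0, 0.0)

def sigmoid(x, lam=1.0):
    # Continuously differentiable approximation of the step.
    return 1.0 / (1.0 + np.exp(-lam * x))

xs = np.linspace(-3, 3, 7)
print(step(xs))             # abrupt 0/1 jump at the threshold
print(sigmoid(xs, lam=1))   # smooth transition
print(sigmoid(xs, lam=10))  # larger steepness -> closer to the step
```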

(Figure: taxonomy of network characteristics: topology, layers, connection weights, architecture, modular, feedforward, recurrent, heteroassociative, autoassociative)

Basic Neural Network & Its Elements

(Figure: a basic network built from bias neurons, input neurons, hidden neurons, and output neurons)

Example: Mapping from input to output

(Figure: feed-forward processing through input, hidden, and output layers: the input pattern <0.5, 1.0, -0.1, 0.2> is presented to the input layer and transformed into the output pattern <-0.9, 0.2, -0.1, 0.7> at the output layer)
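The slide's weight values are not given, so the sketch below only illustrates the mechanics of such a feed-forward pass, with randomly initialized weights and an assumed tanh activation; its outputs will not reproduce the pattern above.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, layers):
    """Propagate an input pattern layer by layer."""
    a = x
    for W, b in layers:
        a = np.tanh(W @ a + b)   # assumed activation; allows negative outputs
    return a

# 4 inputs -> 3 hidden units -> 4 outputs, randomly initialized.
layers = [(rng.normal(size=(3, 4)), np.zeros(3)),
          (rng.normal(size=(4, 3)), np.zeros(4))]
print(forward(np.array([0.5, 1.0, -0.1, 0.2]), layers))
```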

Feed Forward Neural Networks

The information is propagated from the inputs to the outputs.
They compute non-linear functions of the n input variables by composing $N_c$ algebraic functions.
Time plays no role (there are no cycles between outputs and inputs).

(Figure: inputs x1 ... xn feeding a 1st hidden layer, a 2nd hidden layer, and an output layer)

Recurrent Neural Networks

Can have arbitrary topologies.
Can model systems with internal states (dynamic ones).
Delays are associated with specific weights.
Training is more difficult.
Performance may be problematic: stable outputs may be more difficult to evaluate, and unexpected behaviour can occur (oscillation, chaos, ...).

(Figure: a small recurrent network over inputs x1, x2 with delayed connections)
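As a sketch of what "internal state" means, here is one step of a simple Elman-style recurrence (my own minimal construction, not the slide's specific topology): the same input produces different outputs depending on the stored state.

```python
import numpy as np

rng = np.random.default_rng(1)
W_in = rng.normal(size=(2, 3))    # input -> state weights
W_rec = rng.normal(size=(2, 2))   # state -> state (feedback) weights

def rnn_step(x, h):
    # The new state depends on the current input AND the previous state.
    return np.tanh(W_in @ x + W_rec @ h)

h = np.zeros(2)                   # internal state, initially empty
x = np.array([1.0, 0.0, 0.0])
for t in range(3):
    h = rnn_step(x, h)            # identical input every step...
    print(t, h)                   # ...but the state keeps changing
```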

The main characteristics of NNs

Architecture: the pattern of nodes and the connections between them.
Learning algorithm, or training method: the method for determining the weights of the connections.
Activation function: the function that produces a neuron's output from its net input.

Taxonomy of neural networks

Feedforward
  Unsupervised (Kohonen)
  Supervised (MLP, RBF)
Recurrent
  Unsupervised (ART)
  Supervised (Elman, Jordan, Hopfield)

Types of connectivity

1. Feedforward networks
The neurons are arranged in separate layers.
There are no connections between the neurons in the same layer.
The neurons in one layer receive inputs from the previous layer and deliver their output to the next layer.
The connections are unidirectional (hierarchical): input units feed hidden units, which feed output units.

2. Recurrent networks
Some connections are present from a layer to the previous layers.
More biologically realistic.
Feedforward + feedback = recurrent.

Artificial Neural Network Development Process

(Flowchart:)
1. Collect data; get more, better data if needed.
2. Separate the data into training and test sets.
3. Define a network structure.
4. Select a learning algorithm.
5. Set parameters and initialize values (reset as needed).
6. Transform the data into network inputs.
7. Start training and determine and revise the weights.
8. Stop and test.
9. Implementation: use the network with new cases.
If testing fails, refine the structure or select another learning algorithm and repeat.
Typical Neural Network

A neural network is a set of interconnected neurons (simple processing units). Each neuron receives signals from other neurons and sends an output to other neurons. The signals are amplified by the strength of the connection, and the strength of the connection changes over time according to a feedback mechanism. The net can be trained.

Three-layer back-propagation neural network

(Figure: input signals x1 ... xn enter the input layer, pass through weights w_ij to the hidden layer and weights w_jk to the output layer, producing outputs y1 ... yl; error signals propagate backwards from the output layer)
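A minimal sketch of back-propagation through such a network, here a 2-4-1 net trained on XOR with sigmoid units; the layer sizes, learning rate, and iteration count are my own illustrative choices, and a different random seed may need more iterations.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

# XOR: a task a single neuron cannot learn, but a hidden layer can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

W1, W2 = rng.normal(size=(2, 4)), rng.normal(size=(4, 1))
b1, b2 = np.zeros(4), np.zeros(1)

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)          # forward: hidden layer
    y = sigmoid(h @ W2 + b2)          # forward: output layer
    # backward: error signals flow from the output layer toward the input
    d2 = (y - t) * y * (1 - y)        # output-layer delta
    d1 = (d2 @ W2.T) * h * (1 - h)    # hidden-layer delta
    W2 -= 0.5 * h.T @ d2; b2 -= 0.5 * d2.sum(axis=0)
    W1 -= 0.5 * X.T @ d1; b1 -= 0.5 * d1.sum(axis=0)

print(np.round(y.ravel(), 2))         # approaches [0, 1, 1, 0]
```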

Types of Problems Solved by NNs

Classification: determine to which of a discrete number of classes a given input case belongs.
Regression: predict the value of a (usually) continuous variable.
Time series: predict the value of variables from earlier values of the same or other variables.

Advantages of ANNs

Generalization: using responses to prior input patterns to determine the response to a novel input
Inherently massively parallel
Able to learn any complex non-linear mapping
Learning instead of programming
Robust
Can deal with incomplete and/or noisy data

Disadvantages of ANNs

Difficult to design: there are no clear design rules for arbitrary applications
The learning process can be very time-consuming
Can overfit the training data, becoming useless for generalization
Difficult to assess internal operation: it is hard to find out whether, and if so what, tasks are performed by different parts of the net
Unpredictable: it is difficult to estimate future network performance

Applications

Aerospace
High-performance aircraft autopilots, flight path simulations, aircraft control systems
Automotive
Automobile automatic guidance systems, warranty activity analyzers
Banking
Check and other document readers, credit application evaluators
Defense
Weapon steering, target tracking, object discrimination, facial recognition
Electronics
Code sequence prediction, integrated circuit chip layout

Applications

Robotics
Trajectory control, forklift robots, manipulator controllers, vision systems
Speech
Speech recognition, speech compression, vowel classification, text-to-speech synthesis systems
Securities
Market analysis, automatic bond rating, stock trading advisory systems
Telecommunications
Image and data compression, automated information services, real-time translation of spoken language, customer payment processing systems
Transportation
Truck brake diagnosis systems, vehicle scheduling, routing systems

KR and Logic

Introduction

An assumption of (traditional) AI work is that knowledge may be represented as symbol structures (essentially, complex data structures) representing bits of knowledge (objects, concepts, facts, rules, strategies, ...).
E.g., "red" represents the colour red; "car1" represents my car; "red(car1)" represents the fact that my car is red.
Intelligent behaviour can be achieved through manipulation of symbol structures.

Knowledge representation languages

Knowledge representation languages have been designed to facilitate this. Rather than using general C++/Java data structures, we use special-purpose formalisms. A KR language should allow you to:
represent adequately the knowledge you need for your problem (representational adequacy);
do so in a clear, precise and natural way;
reason on that knowledge, drawing new conclusions.

Well-defined syntax/semantics

Knowledge representation languages should have a precise syntax and semantics. You must know exactly what an expression means in terms of objects in the real world.

(Figure: facts in the real world are mapped into the KR language, represented in the computer, inference produces new conclusions, and these are mapped back to the real world)

Logic as a Knowledge Representation Language

A logic is a formal language, with precisely defined syntax and semantics, which supports sound inference, independent of the domain of application. Different logics exist, which allow you to represent different kinds of things and which allow more or less efficient inference:
propositional logic, predicate logic, temporal logic, modal logic, description logic, ...
But representing some things in logic may not be very natural, and inferences may not be efficient. More specialised languages may be better.

Propositional logic

In general, a logic is defined by:
syntax: what expressions are allowed in the language;
semantics: what they mean, in terms of a mapping to the real world;
proof theory: how we can draw new conclusions from existing statements in the logic.
Propositional logic is the simplest.

Propositional Logic: Syntax

Symbols (e.g., letters, words) are used to represent facts about the world, e.g.,
P represents the fact "Andrew likes chocolate"
Q represents the fact "Andrew has chocolate"
These are called atomic propositions.
Logical connectives are used to represent and (∧), or (∨), if-then (→), and not (¬). Statements or sentences in the language are constructed from atomic propositions and logical connectives:
P ∧ ¬Q  "Andrew likes chocolate and he doesn't have any."
P → Q  "If Andrew likes chocolate then Andrew has chocolate."
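The semantics of a connective can be checked mechanically with a truth table; a tiny sketch for the implication example, using the standard definition P → Q ≡ ¬P ∨ Q:

```python
from itertools import product

# Truth table for P -> Q ("if Andrew likes chocolate then
# Andrew has chocolate"), defined as (not P) or Q.
print("P     Q     P->Q")
for P, Q in product([False, True], repeat=2):
    print(P, Q, (not P) or Q)
```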

Artificial Neural Networks and AI

Artificial Neural Networks provide:
- a new computing paradigm
- a technique for developing trainable classifiers, memories, dimension-reducing mappings, etc.
- a tool to study brain function

Converging Frameworks

Artificial intelligence (AI): build a packet of intelligence into a machine.
Cognitive psychology: explain human behavior by interacting processes (schemas) in the head, but not localized in the brain.
Brain theory: interactions of components of the brain; computational neuroscience; neurologically constrained models.
Artificial intelligence and cognitive psychology, abstracting from the brain:
- connectionism: networks of trainable quasi-neurons that provide parallel distributed models, little constrained by neurophysiology
- abstract (computer program or control system) information processing models

Vision, AI and ANNs

1950s: the beginning of computer vision.
Aim: give machines the same or better vision capability than ours.
Drive: AI, robotics applications, and factory automation.
Initially viewed as a passive, feedforward, layered and hierarchical process that would simply provide input to higher reasoning processes (from AI). But it was soon realized that this could not handle real images.
1980s: Active vision: make the system more robust by allowing the vision to adapt with the ongoing recognition/interpretation.
