Neural Networks
Introduction
(Artificial) Neural Network?
• Computational model inspired by the neurological model of the brain
• The human brain computes in a different way from a digital computer (the von Neumann machine)
  • highly complex, nonlinear, and parallel computing
  • many times faster than a digital computer at pattern recognition, perception, and motor control
  • has great structure and the ability to build up its own rules from experience
    • dramatic development within the first 2 years after birth
    • continues to develop afterward
    • language learning device: most effective before age 13
  • plasticity: the ability to adapt to its environment

Neural Network Definitions
• A machine designed to model the way in which the brain performs tasks
  • implemented by electronic devices and/or software (simulation)
  • learning is the major emphasis of NN
• A massively parallel distributed processor
  • a massive interconnection of simple processing units
  • the simple processing units store experience and make it available for use
  • knowledge is acquired from the environment through a learning process
• A learning machine
  • modifies its synaptic weights to obtain a design objective
  • may modify its own topology: neurons die and new ones can grow
• A connectionist network (connectionism)


Benefits of Neural Networks (I)
• Power comes from the massively parallel distributed structure and the ability to learn and generalize
  • generalization: the ability to produce reasonable outputs for inputs not encountered during training
  • a NN cannot provide the solution by working individually: a complex problem is decomposed into simple tasks, and each task is assigned to a NN
  • there is a long way to go to build a computer that mimics the human brain
1. Non-linearity
  • an interconnection of non-linear neurons is itself non-linear
  • a desirable property if the underlying physical mechanism is non-linear

Benefits of Neural Networks (II)
2. Input-Output Mapping
  • the input-output mapping is built by learning from examples
  • training reduces the difference between the desired response and the actual response
  • a form of non-parametric statistical inference
  • can estimate arbitrary decision boundaries in the input signal space
3. Adaptivity
  • adapts its synaptic weights to changes in the environment
  • a NN can be retrained to deal with minor changes in the operating environment
  • can change its synaptic weights in real time
  • more robust, reliable behavior in a non-stationary environment
  • adaptive pattern recognition, adaptive signal processing, adaptive control
  • but note the stability-plasticity dilemma
Benefits of Neural Networks (III)
4. Evidential Response
  • provides not only the selected class label but also the confidence in the decision
  • confidences can be used to reject ambiguous patterns
  • recognition accuracy vs. reliability (do only what you can do)
5. Contextual Information Processing
  • (contextual) knowledge is represented in the very structure of the network
  • every neuron is affected by the others
6. Fault Tolerance
  • performance degrades gracefully under adverse conditions
  • contrast with the catastrophic failure of a digital computer
7. VLSI Implementability
  • the massively parallel nature makes a NN well suited for VLSI implementation
Benefits of Neural Networks (IV)
8. Uniformity of Analysis and Design
  • the neuron is an ingredient common to all NNs
  • makes it possible to share theories and learning algorithms
  • modular networks can be built through seamless integration
9. Neurobiological Analogy
  • the brain is living proof of fault-tolerant, fast, powerful parallel processing
  • neuroscientists see NNs as a research tool for interpreting neurobiological phenomena
  • engineers look to neuroscience for new ideas
Human Brain
• Human nervous system:
  stimulus → Receptors → Neural Net → Effectors → response
• 10¹¹ neurons in the human cortex
• 60 × 10¹² synaptic connections
• 10⁴ synapses per neuron
• 10⁻³ s cycle time (computer: 10⁻⁹ s)
• energetic efficiency: 10⁻¹⁶ joules per operation per second (computer: 10⁻⁶ joules)
Human Brain
• Neuron structure
  • nucleus, cell body, axon, dendrites, synapses
• Neurotransmission
  • a neuron's output is encoded as a series of voltage pulses, called action potentials or spikes
  • transmission is by means of electrical impulses effected by chemical transmitters
  • period of latent summation: an impulse is generated if the total membrane potential reaches a threshold level (firing)
  • synapses are excitatory or inhibitory
• Cerebral cortex
  • areas of the brain are specialized for specific functions
Biological Neuron
[Figure: a biological neuron, excerpted from Artificial Intelligence: A Modern Approach by S. Russell and P. Norvig]
Neurotransmission
[Figure: neurotransmission at a synapse; source: http://www.health.org/pubs/qdocs/mom/TG/intro.htm]
Organization of Levels in the Brain
• Central nervous system: the highest level
• Interregional circuits: pathways, columns, and topographic maps; involve multiple regions mapped into the cerebral cortex
• Local circuits: neurons of similar and different properties, about 1 mm in size; perform operations characteristic of a localized region in the brain
• Neurons: about 100 µm in size; each contains several dendritic trees
• Dendritic trees: made up of neural microcircuits
• Neural microcircuits: assemblies of synapses (analogous to a silicon chip); millisecond time scale, µm size
• Synapses: the most fundamental level (analogous to a transistor)
• Molecules: the lowest level

Cerebral Cortex
[Figure: the cerebral cortex]
Models of Neuron
• A neuron is an information-processing unit with three basic elements:
• A set of synapses, or connecting links
  • each characterized by a weight or strength
• An adder
  • sums the input signals, weighted by the respective synapses
  • a linear combiner
• An activation function
  • also called a squashing function
  • squashes (limits) the output to some finite value

Nonlinear model of a neuron (I)
[Figure: block diagram of neuron k. Input signals x1, x2, …, xm pass through synaptic weights wk1, wk2, …, wkm into a summing junction Σ; together with the bias bk this produces the induced local field vk, which the activation function φ(·) maps to the output yk]

  y_k = φ(v_k),  where  v_k = Σ_{j=1}^{m} w_{kj} x_j + b_k
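A minimal numerical sketch of this model (NumPy and the sigmoid activation are illustrative assumptions; the slides fix neither):

```python
import numpy as np

def neuron(x, w, b, phi):
    """One neuron: weighted sum of inputs plus bias, passed through phi."""
    v = np.dot(w, x) + b                     # induced local field v_k
    return phi(v)                            # output y_k = phi(v_k)

x = np.array([0.5, -1.0, 2.0])               # input signals x_1..x_m
w = np.array([0.1, 0.4, -0.3])               # synaptic weights w_k1..w_km
b = 0.2                                      # bias b_k
y = neuron(x, w, b, lambda v: 1.0 / (1.0 + np.exp(-v)))   # sigmoid phi
```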

Nonlinear model of a neuron (II)
[Figure: the same block diagram with the bias absorbed into the weights: a fixed input x0 = +1 with synaptic weight wk0 = bk]

  y_k = φ(v_k),  where  v_k = Σ_{j=0}^{m} w_{kj} x_j,  with x_0 = +1 and w_{k0} = b_k
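A short sketch of this bias-absorption trick (NumPy assumed; the numbers are illustrative):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])               # inputs x_1..x_m
w = np.array([0.1, 0.4, -0.3])               # weights w_k1..w_km
b = 0.2                                      # bias b_k

x_aug = np.concatenate(([1.0], x))           # prepend the fixed input x_0 = +1
w_aug = np.concatenate(([b], w))             # prepend w_k0 = b_k
assert np.isclose(np.dot(w_aug, x_aug), np.dot(w, x) + b)   # same field v_k
```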

Types of Activation Function
[Figure: three activation functions φ(v) plotted against v, each saturating at +1: the threshold function, the piecewise-linear function, and the sigmoid function (differentiable)]

Sigmoid:  φ(v) = 1 / (1 + exp(−a·v)),  where a is the slope parameter
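A sketch of the three functions (NumPy assumed; the piecewise-linear form below uses the common convention of a unity-slope region on [−1/2, 1/2]):

```python
import numpy as np

def threshold(v):
    return np.where(v >= 0.0, 1.0, 0.0)      # hard limiter: 1 if v >= 0, else 0

def piecewise_linear(v):
    return np.clip(v + 0.5, 0.0, 1.0)        # linear on [-1/2, 1/2], saturating outside

def sigmoid(v, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * v))      # slope parameter a; differentiable
```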
Activation Function Value Range
[Figure: odd activation functions ranging over (−1, +1), plotted against vi: the signum function and the hyperbolic tangent function]

Signum:  φ(v) = +1 if v > 0;  0 if v = 0;  −1 if v < 0
Hyperbolic tangent:  φ(v) = tanh(v)
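The bipolar counterparts of the functions above, as a sketch (NumPy assumed):

```python
import numpy as np

def signum(v):
    return np.sign(v)                        # +1 for v > 0, 0 at v = 0, -1 for v < 0

def tanh_act(v):
    return np.tanh(v)                        # smooth bipolar squashing onto (-1, +1)
```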
Stochastic Model of a Neuron
• Deterministic vs. stochastic
  • stochastic: the neuron occupies a state with a certain probability

  x = +1 with probability P(v)
      −1 with probability 1 − P(v)

  where x is the state of the neuron, v is the induced local field, and P(v) is the probability of firing:

  P(v) = 1 / (1 + exp(−v/T))

  where T is a pseudotemperature; as T → 0, the model reduces to the deterministic form.
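A sketch of sampling the stochastic neuron state (NumPy assumed; v and T values are illustrative):

```python
import numpy as np

def stochastic_state(v, T, rng):
    """Return +1 with probability P(v) = 1/(1 + exp(-v/T)), else -1."""
    p_fire = 1.0 / (1.0 + np.exp(-v / T))    # as T -> 0 this approaches a hard threshold
    return 1 if rng.random() < p_fire else -1

rng = np.random.default_rng(0)
states = [stochastic_state(v=0.5, T=1.0, rng=rng) for _ in range(5)]
```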
Neural Network as a Directed Graph
• The block diagram can be simplified by the idea of a signal-flow graph
  • a node is associated with a signal
  • a directed link is associated with a transfer function
• Synaptic links
  • governed by a linear input-output relation
  • the signal xj is multiplied by the synaptic weight wkj
• Activation links
  • governed by a nonlinear input-output relation
  • the nonlinear activation function

Signal Flow Graph of a Neuron
[Figure: signal-flow graph of neuron k. Source nodes x0 = +1 (with wk0 = bk), x1, x2, …, xm feed synaptic links wk0, wk1, …, wkm into the node vk; an activation link φ(·) then yields the output yk]
Architectural Graph of a Neuron
• A partially complete directed graph describing the network layout
  • computation nodes: shaded
  • source nodes: small squares
• Three graphical representations:
  • block diagram: provides a functional description of a NN
  • signal-flow graph: a complete description of signal flow
  • architectural graph: the network layout

Feedback
• The output of a neuron determines in part its own input via a feedback loop
[Figure: single-loop feedback system. The input xj(n) and the weighted, delayed feedback form the internal signal xj′(n); the forward weight w produces the output yk(n); the feedback path contains the unit-delay operator z⁻¹]

  y_k(n) = Σ_{l=0}^{∞} w^{l+1} x_j(n − l)

• Depending on w, the response is stable (|w| < 1), linearly divergent (|w| = 1), or exponentially divergent (|w| > 1); see the sketch below
• We are interested in the case |w| < 1: infinite memory
  • the output depends on inputs from the infinite past, with fading influence
• A NN with one or more feedback loops is a recurrent network
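A sketch simulating the loop to show the regimes (NumPy assumed; the impulse input is an illustrative choice):

```python
import numpy as np

def feedback_response(w, x):
    """y_k(n) = sum_{l=0..n} w^(l+1) x_j(n-l), computed via the loop recursion."""
    y = np.zeros_like(x, dtype=float)
    xp = 0.0                                 # internal signal x_j'(n), initially at rest
    for n in range(len(x)):
        xp = x[n] + w * xp                   # x_j'(n) = x_j(n) + w * x_j'(n-1)
        y[n] = w * xp                        # y_k(n) = w * x_j'(n)
    return y

impulse = np.zeros(8)
impulse[0] = 1.0                             # for an impulse input, y(n) = w^(n+1)
print(feedback_response(0.5, impulse))       # |w| < 1: geometric decay (stable)
print(feedback_response(2.0, impulse))       # |w| > 1: exponential divergence
```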


Network Architecture
• Single-layer feedforward networks
  • an input layer of source nodes and an output layer
  • a single (computation) layer
  • feedforward, acyclic
• Multilayer feedforward networks (see the sketch after this list)
  • one or more hidden layers of hidden neurons (hidden units)
  • hidden layers enable the network to extract higher-order statistics
  • e.g., a 10-4-2 network, a 100-30-10-3 network
  • fully connected layered network
• Recurrent networks
  • at least one feedback loop
  • with or without hidden neurons
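A sketch of a forward pass through a fully connected 10-4-2 feedforward network (NumPy, random weights, and the tanh activation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [10, 4, 2]                               # the "10-4-2 network"
weights = [rng.standard_normal((m, n))                 # one weight matrix per layer
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m) for m in layer_sizes[1:]]

def forward(x):
    for W, b in zip(weights, biases):
        x = np.tanh(W @ x + b)                         # affine map + squashing per layer
    return x

y = forward(rng.standard_normal(10))                   # 10 source nodes -> 2 outputs
```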

Network Architecture
[Figures: a fully connected single-layer network; a fully connected multilayer network; a recurrent network without hidden units, with unit-delay operators z⁻¹ in the feedback paths; a recurrent network with hidden units, inputs on the left and outputs on the right]
Knowledge Representation
• Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world
  • what information is actually made explicit
  • how the information is physically encoded for subsequent use
• A good solution depends on a good representation of knowledge
• In a NN, knowledge is represented by the internal network parameters
  • this is the real challenge of network design
• Knowledge of the world
  • the known world state, represented by known facts: prior knowledge
  • observations, obtained by (noisy) sensors: the training examples
Knowledge Acquisition by NN Training
• Training examples: either labeled or unlabeled
  • labeled: an input signal paired with a desired response
  • unlabeled: different realizations of the input signal
  • the examples represent the knowledge of the environment
• Example: handwritten digit recognition (see the sketch below)
  1. An appropriate architecture is selected for the NN
    • one source node per pixel of the input image
    • 10 output nodes, one for each digit
    • a subset of the examples is used to train the NN by a suitable learning algorithm
  2. Recognition performance is tested on the rest of the examples
  • positive and negative examples
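A hedged sketch of this train/test procedure using scikit-learn's bundled 8×8 digits dataset (the dataset, the library, and the hidden-layer size are illustrative assumptions; the slides prescribe none of them):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)               # 8x8 images: 64 pixels -> 64 source nodes
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# 10 output classes, one per digit; trained by a suitable learning algorithm
net = MLPClassifier(hidden_layer_sizes=(30,), max_iter=500, random_state=0)
net.fit(X_train, y_train)                         # step 1: train on a subset of examples
print(net.score(X_test, y_test))                  # step 2: test on the remaining examples
```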

Rules of Knowledge Representation in NN
1. Similar inputs from similar classes should produce similar representations
  • similarity measures: Euclidean distance, dot (inner) product, cosine of the angle; for random variables, the Mahalanobis distance; … (see the sketch after this list)
2. Items to be assigned to separate classes should produce widely different representations
3. More neurons should be involved in the representation of a more important feature
  • e.g., probability of detection vs. probability of false alarm
4. Prior information and invariances should be built into the design of the network
  • general-purpose vs. specialized design
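A sketch of the similarity measures named in rule 1 (NumPy assumed; the covariance matrix for the Mahalanobis distance must be supplied by the caller):

```python
import numpy as np

def euclidean(x, y):
    return np.linalg.norm(x - y)                 # Euclidean distance

def inner_product(x, y):
    return np.dot(x, y)                          # dot (inner) product

def cosine(x, y):
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def mahalanobis(x, y, cov):
    d = x - y                                    # distance under covariance matrix cov
    return np.sqrt(d @ np.linalg.inv(cov) @ d)
```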
Building Prior Information into NN Design
• Specialized structure
  • learns fast because it has few free parameters
  • runs fast because of its simple structure
• There are no well-defined rules for building a specialized NN; ad hoc approaches:
  • restricting the network architecture through the use of local connections
    • receptive fields
  • constraining the choice of synaptic weights
    • weight sharing (parameter tying)

[Figure: network combining receptive fields and weight sharing (a convolution network). Inputs x1, …, x10; each of four hidden neurons is connected to a receptive field of six consecutive inputs; the hidden neurons feed two outputs y1 and y2]

  v_j = Σ_{i=1}^{6} w_i x_{i+j−1},   j = 1, 2, 3, 4

Combining the receptive field and weight sharing: each hidden neuron applies the same set of weights to a shifted window of the input (convolution network)
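The equation above is a one-dimensional convolution; a sketch (NumPy assumed; the input and weight values are illustrative):

```python
import numpy as np

x = np.arange(1.0, 11.0)                 # inputs x_1 .. x_10
w = np.ones(6) / 6.0                     # shared weights w_1 .. w_6 (illustrative values)

# v_j = sum_{i=1..6} w_i x_{i+j-1}, j = 1..4: the same weights over shifted windows
v = np.array([w @ x[j:j + 6] for j in range(4)])

# equivalently, the first four terms of a 'valid'-mode correlation of x with w
assert np.allclose(v, np.correlate(x, w, mode="valid")[:4])
```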

Building Invariance into NN Design
• We want the network to be capable of coping with transformations of the input
• Invariance by structure
  • synaptic connections are arranged so that the output is not affected by the transformation
  • e.g., rotation invariance by forcing wji = wjk for all pixels k at the same distance from the center of the image
• Invariance by training
  • train on data covering many different transformations
  • can be computationally infeasible
• Invariant feature space
  • use features that are invariant to the transformations
• There is no well-developed theory for optimizing the architecture of a NN
• NN lacks explanation capability
AI and NN
• Definitions and goals of AI
  • the art of creating machines that perform tasks that require intelligence when performed by people
  • the study of mental faculties through the use of computational models
  • to make computers perceive, reason, and act
  • to develop machines that perform cognitive tasks
• Functions of an AI system
  • store knowledge
  • apply the knowledge to solve problems
  • acquire new knowledge through experience
• Key components of an AI system
  • representation
  • reasoning
  • learning
AI
• AI is a goal, an objective, a dream
• NN is a model of an intelligent system
  • it is not the only such system
  • an intelligent system is not necessarily the same as a human
  • example: a chess machine
• Symbolic AI is a tool, a paradigm, toward AI
• NN can be a good tool toward AI

Representation in Symbolic AI
• Knowledge is represented by symbol structures
  • symbolic AI
  • well suited for human-machine communication
• Knowledge is just data
  • declarative
    • knowledge is represented by facts
    • a general inference procedure is used to manipulate the facts
    • e.g., PROLOG
  • procedural
    • knowledge is embodied in executable code
    • e.g., LISP

Reasoning
• Reasoning is the ability to solve problems; the system
  • must be able to express and solve a broad range of problems
  • must be able to make explicit and implicit information known to it
  • must have a control mechanism to select which operator to apply in a given situation
• Problem solving is a searching problem
  • it must deal with incompleteness, inexactness, and uncertainty
  • probabilistic reasoning, plausible reasoning, fuzzy reasoning

Learning
• Model of machine learning (Figure 1.25)
  • environment
  • learning element
  • knowledge base
  • performance element
• Inductive learning
  • generates rules from raw data
  • similarity-based learning, case-based reasoning
• Deductive learning
  • general rules are used to determine specific facts
  • theorem proving
• Augmenting the knowledge base is not a simple task


Symbolic AI vs. NN

Level of explanation
  • Symbolic AI: symbolic representation; cognition as sequential processing of symbolic representations
  • NN: parallel distributed processing; neurobiological explanation

Processing style
  • Symbolic AI: sequential, step-by-step; logical-inference-like; the von Neumann machine
  • NN: parallelism; processing power in certain tasks such as sensory processing; flexibility; robustness

Representational structure
  • Symbolic AI: quasi-linguistic structure; composition of symbolic expressions; top-down
  • NN: no quasi-linguistic structure; bottom-up
Symbolic AI + NN
• Structured connectionist models
  • hybrid systems integrate the two approaches
  • from NN:
    • adaptivity
    • robustness
    • uniformity
  • from symbolic AI:
    • representation
    • inference
    • universality
• Extracting rules from a trained NN
  • if possible, it will be good for explanation

Advantages of Extracting Rules from NN
• To validate the NN by making its internal states accessible and understandable to users
• To identify regions of the input space that need more training samples
• To indicate the circumstances under which the NN fails to generalize
• To discover salient features for data exploration
• To traverse the boundary between the symbolic and connectionist approaches
• To satisfy the critical need for safety

http://www.StudentRockStars.com
http://www.StudentRockStars.com
Historical Notes
• See Section 1.9 of the textbook
