
What is a Neural Network?

An Artificial Neural Network (ANN) is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information.
The key element of this paradigm is the novel structure of the information-processing system. It is composed of a large number of highly interconnected processing elements (neurons) working together to solve specific problems.
Why use neural networks?
Neural networks have a remarkable ability to derive meaning from complicated or imprecise data.
They can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques.
Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.
Self-organization: an ANN can create its own organization during learning.
Real-time operation.
Ideal for imprecise and noisy data.
Comparison between Neural Networks, Expert Systems & Conventional Programming
(Comparison table shown on the slide.)
Artificial Neural Networks
– An ANN is a model that emulates a biological neural network.
– Biological neurons receive inputs through dendrites and pass signals to other neurons through the axon.
– The nucleus is the processing element of the neuron.
– An ANN is composed of artificial neurons; these are the processing elements (PE). Each neuron receives input(s), processes them, and delivers a single output.
– A synapse decides whether to amplify or attenuate the signal.
– A single output signal from an axon can go to multiple dendrites.
• The basic element of a neural network is the perceptron.
• First proposed by Frank Rosenblatt in 1958 at Cornell University, the perceptron has five basic elements: an n-vector input, weights, a summing function, a threshold device, and an output.
• Outputs are in the form of -1 and/or +1. The threshold has a setting which governs the output based on the summation of the input vector. If the summation falls below the threshold setting, the output is -1; if the summation exceeds the threshold setting, the output is +1.
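The five elements above can be sketched in a few lines; a minimal illustration (the function name, the example inputs, and the threshold value are all made up for the sketch):

```python
def perceptron_output(inputs, weights, threshold=0.0):
    """Sum the weighted inputs; emit +1 above the threshold, else -1."""
    summation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if summation > threshold else -1

# Weighted sum = 1.0*0.4 + 0.5*0.2 + (-0.5)*0.1 = 0.45 > 0.3, so output is +1.
print(perceptron_output([1.0, 0.5, -0.5], [0.4, 0.2, 0.1], threshold=0.3))
```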
Artificial Neural Networks
– Inputs to the perceptron can be raw data or outputs from other processing elements.
– Outputs of the perceptron can be the final product or the input to another neuron.
• The Network
– An ANN is composed of a collection of neurons that are grouped in layers, with a minimum of two layers (input and output layers). Other layers are known as hidden layers.
– A typical structure is shown on the next page.
– The processing of information is massively parallel, as it is in our brain.
Processing of Information
– Inputs:
• Each input corresponds to a single attribute of the problem.
• For example, for the diagnosis of a disease, each symptom could represent an input to one node.
• The input could also be an image (pattern) of skin texture, if we are looking to diagnose normal versus cancerous cells.
– Outputs:
• The outputs of the network represent the solution to a problem.
• For the diagnosis of a disease, the answer could be yes or no.
– Weights:
• A key element of an ANN is the weight.
• A weight expresses the relative strength of the entering signal from the various connections that transfer data from the input point to the output point.
Processing of Information
– Summation Function:
• Finds the weighted sum of all input elements entering the PE:

    Y = Σ (i = 1 to n) x_i w_i

• If there are several output neurons, the output at the jth neuron is:

    y_j = Σ (i = 1 to n) x_i w_ij

  where w_ij is the weight from the ith input node to the jth output node.
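The two summation formulas above can be written out directly; a small sketch (the function names are illustrative):

```python
def weighted_sum(x, w):
    """Y = sum of x_i * w_i for a single neuron."""
    return sum(xi * wi for xi, wi in zip(x, w))

def layer_outputs(x, w_matrix):
    """y_j = sum of x_i * w_matrix[i][j]; w_matrix[i][j] is the weight
    from input node i to output node j."""
    n_outputs = len(w_matrix[0])
    return [sum(x[i] * w_matrix[i][j] for i in range(len(x)))
            for j in range(n_outputs)]

x = [3, 1, 2]
print(weighted_sum(x, [0.2, 0.4, 0.1]))   # ≈ 1.2
print(layer_outputs(x, [[0.2, 0.5],
                        [0.4, 0.1],
                        [0.1, 0.3]]))     # ≈ [1.2, 2.2]
```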
Processing of Information
– Transfer Function:
• The summation function computes the internal cumulative signal value.
• There is also an activation level of the neuron. Based on the cumulative value of the signal received, the neuron may or may not produce an output.
• The relationship between the activation level and the output of the neuron may be linear or non-linear.
• The selection of the specific activation function determines the network's operation.
• One popular function is the sigmoid function:

    Y_T = 1 / (1 + e^(-Y))

  where Y_T is the transformed value of Y.
Processing of Information
– Transfer Function:
• The purpose of the transfer function is to modify the output level to a reasonable value (between 0 and 1). This transformation is done before the output reaches the next level.
• Example:
    x1 = 3, w1 = 0.2
    x2 = 1, w2 = 0.4   →   Y = (3)(0.2) + (1)(0.4) + (2)(0.1) = 1.2,   Y_T = f(Y)
    x3 = 2, w3 = 0.1
• You can also use a simple threshold value.
(Diagram: inputs X1, X2, X3 with weights w1, w2, w3 feed a summation node producing Y, which passes through the transfer function to give Y_T.)
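The worked example above can be reproduced directly; a minimal sketch applying the sigmoid function from the previous slide as f (the variable names and the threshold of 1.0 are illustrative):

```python
import math

def sigmoid(y):
    """Sigmoid transfer function: Y_T = 1 / (1 + e^(-Y))."""
    return 1.0 / (1.0 + math.exp(-y))

x = [3, 1, 2]
w = [0.2, 0.4, 0.1]
Y = sum(xi * wi for xi, wi in zip(x, w))   # 3*0.2 + 1*0.4 + 2*0.1 ≈ 1.2
Y_T = sigmoid(Y)                           # squashed into (0, 1), ≈ 0.769

# A simple threshold could be used instead of the sigmoid:
Y_threshold = 1 if Y > 1.0 else 0
```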
Neuron Transfer Functions
1. Pure Linear Transfer Function
2. Hard Limit Transfer Function
3. Log Sigmoid Transfer Function
(Plots of the three transfer functions shown on the slide.)
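The three functions listed can be sketched as follows; note that the hard-limit convention (0/1 versus -1/+1) varies between texts, so the 0/1 choice here is an assumption:

```python
import math

def pure_linear(y):
    return y                            # output equals the input

def hard_limit(y):
    return 1 if y >= 0 else 0           # step function at 0

def log_sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))   # smooth output in (0, 1)

for f in (pure_linear, hard_limit, log_sigmoid):
    print(f.__name__, f(-1.0), f(0.0), f(1.0))
```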
A Multi-layered Network Function
(Diagram shown on the slide.)
Processing of Information
– Learning:
• An ANN learns from its experience. The usual process of learning involves three tasks:
– Compute the output(s).
– Compare the outputs with the desired patterns and feed back the error.
– Adjust the weights and repeat the process.
• The learning process starts by setting the weights by some rule (or randomly). The difference between the actual output (y) and the desired output (z) is called the error (delta).
• The objective is to reduce delta (the error) to zero. The reduction in error is done by changing the weights.
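The three learning tasks above can be sketched as a loop; this toy example uses the delta rule on a single linear neuron (the training data, learning rate, and epoch count are all illustrative assumptions):

```python
# Train a single linear neuron: y = sum(x_i * w_i).
# Each pass computes the output, compares it with the desired output z,
# and adjusts the weights in proportion to the error (delta).

training_set = [([1.0, 0.0], 1.0),   # (input vector, desired output z)
                ([0.0, 1.0], 2.0),
                ([1.0, 1.0], 3.0)]
weights = [0.0, 0.0]                  # initial weights
rate = 0.1                            # learning rate

for epoch in range(200):
    for x, z in training_set:
        y = sum(xi * wi for xi, wi in zip(x, weights))  # 1. compute output
        delta = z - y                                   # 2. compare with desired
        weights = [wi + rate * delta * xi               # 3. adjust weights
                   for wi, xi in zip(weights, x)]

print(weights)  # approaches [1.0, 2.0], which reproduces the training data
```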
• The formal definition of learning in the context of the network model can be given as "the process of updating network connection weights so that the network can perform a specific task efficiently".
• The network learns (or modifies) the connection weights from the available training patterns (the available data).
• The performance of the network improves over time by iteratively updating the weights in the network.
• The ability of ANNs to automatically learn from examples (data), instead of following a set of rules specified by human experts, makes them attractive and exciting.
• This is one of the major advantages of neural networks over traditional expert systems.
• The key to adaptive learning is to change the weights in the right direction, so as to reduce the error.
• There are various algorithms for adjusting the weights; a few will be introduced later.
• A procedure for developing NN-based applications is:
1. Collect data.
2. Separate the data into training and test sets.
3. Define (select) a network structure.
4. Select a learning algorithm.
5. Transform the data to network inputs (training data).
6. Start training and revise the weights until the error criterion is satisfied.
7. Stop and test the results with the test data.
8. Implementation: use the network for new cases.
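The procedure can be illustrated end to end on a toy problem; the generated data, the one-weight "network", the delta-rule learning algorithm, and the error criterion below are all assumptions made for the sketch:

```python
import random

random.seed(0)

# 1-2. Collect data (here y = 2x) and separate it into training/test sets.
data = [([x], 2.0 * x) for x in [random.uniform(-1, 1) for _ in range(40)]]
train, test = data[:30], data[30:]

# 3-5. Define the structure (a single weight), select a learning
#      algorithm (the delta rule), and use numeric inputs directly.
w = 0.0
rate = 0.1

def mse(dataset, weight):
    """Mean squared error of the one-weight model on a data set."""
    return sum((z - weight * x[0]) ** 2 for x, z in dataset) / len(dataset)

# 6. Train and revise the weight until the error criterion is satisfied.
while mse(train, w) > 1e-6:
    for x, z in train:
        w += rate * (z - w * x[0]) * x[0]

# 7-8. Test with the held-out test data; the trained weight approaches 2.0.
print(round(w, 3), mse(test, w) < 1e-4)
```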
General Considerations for Network Design
• Which input attributes will be used to build and train the network? (Prefer attributes with minimal correlation, i.e., independent inputs.)
• Which network architecture will be suitable for the study?
• How many hidden layers should the network contain?
• How many nodes should there be in each hidden layer?
• What conditions will terminate the network training?
Strengths of the network
• Neural networks are very suitable for noisy or partial data sets. Transfer functions such as the sigmoid normally smooth out the variations.
• ANNs can process and predict numeric as well as categorical outcomes.
• ANNs can be used for applications that require a time element to be included in the data set.
• Neural networks have performed well in certain domains where rules are not defined and there is no structure.
• The network can be trained for supervised and unsupervised clustering.
Weaknesses
• The biggest weakness is that they do not expose the criterion behind a decision. This is important at times.
• The learning algorithms are not guaranteed to converge to an optimal solution. However, you can experiment with various learning parameters.
• Neural networks can easily be over-trained (memorize) to the point of working well on the training data but performing poorly on test data. You have to monitor this problem carefully.
Forward and Backward Propagation
(Diagram shown on the slide.)
Developing NN Models
• One of the important steps is the selection of the network structure. We will discuss detailed structures at a later stage.
• Associative Memory Systems:
• This refers to the ability to recall complete situations from partial information. Such systems correlate input data with information stored in memory.
• Information can be recalled even from incomplete or "noisy" inputs.
• Associative memory systems can detect similarities between new inputs and stored input patterns, using a distance criterion.
• Hidden Layer Systems
• Complex practical applications require one or more hidden layers between the inputs and outputs, and a correspondingly large number of weights.
– Hidden Layer Systems (contd):
• Using more than three hidden layers is rare.
• The amount of computation involved is enormous.
– Double-Layered Networks:
• This structure does not require knowledge of the precise number of classes in the training data (unsupervised). It is normally used in cases where the output is not given and only input data are available.
• Instead, it uses a feed-forward and feed-backward approach to adjust parameters/weights as the data are analyzed, to establish an arbitrary (required) number of categories that represent the data presented to the system.
• Back-Propagation Network:
• It is the most widely used architecture: a very popular technique that is relatively easy to implement. It requires a large amount of training data for conditioning the network before using it to predict outcomes.
• A back-propagation network includes at least one hidden layer.
• The approach is considered a "feed-forward / back-propagation" approach.
• Limitations:
• NNs do not do well at tasks that people themselves do not perform well.
• They lack an explanation facility.
• Training time can be excessive.
Back-Propagation Algorithm
• The most popular and successful method.
• Steps to be followed for training:
– Select the next training pair from the training set (input vector and desired output).
– Present the input vector to the network.
– Calculate the output of the network.
– Calculate the error between the network output and the desired output.
– Back-propagate the error through the network.
– Adjust the weights of the network in a way that minimizes the error.
– Repeat the above steps for each vector in the training set until the error is acceptable for each training pair.
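The steps above can be sketched for a tiny network with one hidden layer and sigmoid transfer functions; the network size, learning rate, epoch count, and the use of the logical OR function as training data are all illustrative assumptions:

```python
import math
import random

random.seed(1)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Training set: (input vector, desired output). The logical OR function
# serves purely as a toy example.
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

# One hidden layer (2 neurons) and one output neuron, with biases.
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
b_hidden = [0.0, 0.0]
w_out = [random.uniform(-0.5, 0.5) for _ in range(2)]
b_out = 0.0
rate = 0.5

def forward(x):
    """Feed-forward pass: hidden activations and the network output."""
    h = [sigmoid(sum(x[i] * w_hidden[i][j] for i in range(2)) + b_hidden[j])
         for j in range(2)]
    y = sigmoid(sum(h[j] * w_out[j] for j in range(2)) + b_out)
    return h, y

for epoch in range(5000):
    for x, z in training_set:           # select the next training pair
        h, y = forward(x)               # present the input, compute the output
        delta_out = (z - y) * y * (1 - y)                    # error at output
        delta_h = [delta_out * w_out[j] * h[j] * (1 - h[j])  # back-propagate
                   for j in range(2)]
        for j in range(2):              # adjust the weights to reduce the error
            w_out[j] += rate * delta_out * h[j]
            b_hidden[j] += rate * delta_h[j]
            for i in range(2):
                w_hidden[i][j] += rate * delta_h[j] * x[i]
        b_out += rate * delta_out

final_error = sum((z - forward(x)[1]) ** 2 for x, z in training_set) / 4
print(round(final_error, 4))  # small after training
```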
Network Training
• Supervised Learning
– The network is presented with the input and the desired output.
– It uses a set of inputs for which the desired outputs/classes are known. The difference between the desired and actual output is used to calculate adjustments to the weights of the NN structure.
• Unsupervised Learning
– The network is not shown the desired output.
– The concept is similar to clustering.
– It tries to create a classification in the outcome.
• Unsupervised Learning:
• Only input stimuli (parameters) are presented to the network. The network is self-organizing; that is, it organizes itself internally so that each hidden processing element and its weights respond appropriately to a different set of input stimuli.
• No knowledge is supplied about the classification of outputs. However, the number of categories into which the network classifies the inputs can be controlled by varying certain parameters in the model. In any case, a human expert must examine the final classifications to assign meaning and assess the usefulness of the results.
• Reinforcement Learning
In between supervised and unsupervised learning: the network gets feedback from the environment.