
Artificial neural network

What are Neural Networks?
Biological neurons
Artificial neurons
Activation functions
Training
Example: Voice recognition
Different network models
Applications

What are Neural Networks?

Some neural networks (NNs) are models of biological neural networks and some are not. Historically, however, much of the inspiration for the field came from the desire to produce artificial systems capable of sophisticated, perhaps intelligent, computations similar to those that the human brain routinely performs, and thereby possibly to enhance our understanding of the human brain.

Biological inspirations
Some numbers
The human brain contains about 10 billion nerve cells (neurons)
Each neuron is connected to the others through about 10,000 synapses
The working speed of the human brain is estimated at 10^18 operations/s, while the fastest machine, the ETA10, does 10^10 operations/s

Properties of the brain

It can learn and reorganize itself from experience
It adapts to the environment
It is robust and fault tolerant

Biological Neural Nets

Pigeons as art experts (Watanabe et al. 1995)
Pigeon in a Skinner box
Present paintings by two different artists (e.g. Chagall / Van Gogh)
Reward for pecking when presented with a particular artist (e.g. Van Gogh)

Pigeons were able to discriminate between Van Gogh and Chagall with 95% accuracy (when presented with pictures they had been trained on)
Discrimination was still 85% successful for previously unseen paintings by the artists

Pigeons do not simply memorise the pictures
They can extract and recognise patterns (the style)
They generalise from the already seen to make predictions
This is what neural networks (biological and artificial) are good at (unlike conventional computers)

Biological neuron

The brain is a collection of about 10 billion interconnected neurons. Each neuron is a cell that uses biochemical reactions to receive, process and transmit information. Each terminal button is connected to other neurons across a small gap called a synapse. A neuron's dendritic tree is connected to a thousand neighbouring neurons. When one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation.

Artificial neuron
Inputs: p1, p2, p3
Weights: w1, w2, w3
Bias: b

$$a = f(p_1 w_1 + p_2 w_2 + p_3 w_3 + b) = f\left(\sum_i p_i w_i + b\right)$$

Neural computing requires a number of neurons to be connected together into a neural network. Neurons are arranged in layers. Each neuron within the network is usually a simple processing unit which takes one or more inputs and produces an output. At each neuron, every input has an associated weight which modifies the strength of that input. The neuron simply adds together all the weighted inputs, plus a bias, and applies the activation function to calculate an output to be passed on.
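The single-neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the function names, the choice of a sigmoid for f, and the numeric values are assumptions made for the example.

```python
import math

def sigmoid(x):
    """One common choice for the activation f (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through f."""
    total = sum(p * w for p, w in zip(inputs, weights)) + bias
    return activation(total)

# Illustrative values only (not taken from the slides).
p = [1.0, 0.5, -1.0]   # inputs p1, p2, p3
w = [0.2, -0.4, 0.1]   # weights w1, w2, w3
b = 0.1                # bias

a = neuron_output(p, w, b, sigmoid)
```

With these particular values the weighted sum is zero, so the sigmoid returns its midpoint value.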

Activation functions

The activation function is generally non-linear. Linear functions are limited because the output is simply proportional to the input.
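As a rough sketch of why non-linearity matters, the Python below defines three common activation functions and shows that two stacked linear units behave like a single linear unit. The function names, slopes and weights are illustrative assumptions; the slides do not say which activation functions were used.

```python
import math

def step(x):
    """Threshold unit: output is either off (0) or on (1)."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Smooth, non-linear squashing of any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Smooth, non-linear squashing of any input into (-1, 1)."""
    return math.tanh(x)

def linear(x, slope=2.0):
    """Linear unit: output is simply proportional to the input."""
    return slope * x

def two_linear_layers(x, w1=3.0, w2=-0.5):
    """Two linear units in series; the weights are hypothetical."""
    return linear(w2 * linear(w1 * x))

def single_linear_unit(x):
    """The one linear unit that the two layers above collapse into."""
    return -6.0 * x
```

However many linear layers are stacked, the result is still proportional to the input, so extra layers add nothing unless the activation is non-linear.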

Training the Network - Learning

Requires a training set (input / output pairs)
Starts with small random weights
Error is used to adjust weights (supervised learning)
Gradient descent on the error landscape
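Gradient descent on an error landscape can be illustrated with the smallest possible case: one weight, one training pair, and a squared error. The sketch below assumes a linear unit and a squared-error measure; the learning rate, step count and data values are hypothetical.

```python
import random

def error(w, x, target):
    """Squared error of a one-weight linear unit on one training pair."""
    return (w * x - target) ** 2

def gradient(w, x, target):
    """Slope of the error landscape with respect to the weight."""
    return 2.0 * (w * x - target) * x

def gradient_descent(x, target, lr=0.1, steps=20, seed=0):
    # Start from a small random weight, then repeatedly step downhill.
    w = random.Random(seed).uniform(-0.1, 0.1)
    history = [error(w, x, target)]
    for _ in range(steps):
        w -= lr * gradient(w, x, target)
        history.append(error(w, x, target))
    return w, history

# Hypothetical training pair: input 2.0 should map to output 1.0.
w, history = gradient_descent(x=2.0, target=1.0)
```

For this pair the error is lowest at w = 0.5, and each step moves the weight closer to that value.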

Training methods
Supervised learning
In supervised training, both the inputs and the desired outputs are provided. The network processes the inputs and compares its resulting outputs against the desired outputs. Errors are then propagated back through the system, causing the system to adjust the weights which control the network. This process occurs over and over as the weights are continually tweaked. The set of data which enables the training is called the training set. During the training of a network the same set of data is processed many times as the connection weights are progressively refined.
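The loop described above (process inputs, compare with desired outputs, adjust weights, repeat over the same training set) can be sketched for a single sigmoid neuron. This is an illustration under assumptions: the logical OR training set, the learning rate, the epoch count and all names are invented for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, x):
    """Output of one sigmoid neuron for input vector x."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def squared_error(weights, bias, training_set):
    """Total squared difference between outputs and desired outputs."""
    return sum((t - predict(weights, bias, x)) ** 2 for x, t in training_set)

def train(training_set, epochs=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    # Small random starting weights.
    weights = [rng.uniform(-0.1, 0.1) for _ in training_set[0][0]]
    bias = rng.uniform(-0.1, 0.1)
    for _ in range(epochs):               # same data processed many times
        for x, target in training_set:
            out = predict(weights, bias, x)
            # Error scaled by the slope of the sigmoid at this output.
            delta = (target - out) * out * (1.0 - out)
            weights = [w + lr * delta * v for w, v in zip(weights, x)]
            bias += lr * delta
    return weights, bias

# Hypothetical training set: logical OR as input / desired-output pairs.
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(training_set)
```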

Training methods
Unsupervised learning
In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide what features it will use to group the input data. This is often referred to as self-organization or adaptation.
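One simple way to see grouping emerge without desired outputs is winner-take-all competitive learning, sketched below. The slides do not name a specific algorithm, so this choice, along with the data points, starting prototypes and learning rate, is an assumption made for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(prototypes, point):
    """Index of the prototype closest to the point (the winner)."""
    return min(range(len(prototypes)),
               key=lambda k: squared_distance(prototypes[k], point))

def competitive_learning(data, prototypes, lr=0.5, epochs=10):
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for point in data:
            k = nearest(protos, point)
            # Only the winning prototype moves towards the input.
            protos[k] = [w + lr * (x - w) for w, x in zip(protos[k], point)]
    return protos

# Hypothetical unlabelled inputs forming two loose groups.
data = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
prototypes = competitive_learning(data, [(0.4, 0.4), (0.6, 0.6)])
groups = [nearest(prototypes, point) for point in data]
```

No labels are supplied at any point; the two prototypes drift towards the two clusters and the grouping follows from the data alone.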

Example: Voice Recognition

Task: Learn to discriminate between two different voices saying "Hello"
Data: Person A, Person B

Frequency distribution (60 bins)
Analogy: cochlea

Network architecture
Feed forward network
60 inputs (one for each frequency bin)
6 hidden units
2 outputs (0-1 for A, 1-0 for B)
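A forward pass through a 60-6-2 network of this shape might look like the sketch below. The sigmoid activation, the weight range, and the random stand-in for a frequency spectrum are assumptions; the slides give only the layer sizes.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    """One row of weights per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward_layer(layer, inputs):
    """Weighted sum plus bias for every neuron, passed through the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in layer]

rng = random.Random(0)
hidden_layer = make_layer(60, 6, rng)   # 60 inputs -> 6 hidden units
output_layer = make_layer(6, 2, rng)    # 6 hidden units -> 2 outputs

# Stand-in for one 60-bin frequency distribution (random, not real audio).
spectrum = [rng.random() for _ in range(60)]

hidden = forward_layer(hidden_layer, spectrum)
outputs = forward_layer(output_layer, hidden)
```

Before training, both outputs sit near the middle of the sigmoid's range, in the same spirit as the indecisive untrained outputs shown for this example.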

Presenting the data


Presenting the data (untrained network)


Outputs for Person A: 0.43, 0.26
Outputs for Person B: 0.73, 0.55

Calculate error

Person A: |0.43 - 0| = 0.43, |0.26 - 1| = 0.74
Person B: |0.73 - 1| = 0.27, |0.55 - 0| = 0.55

Backprop error and adjust weights


Person A: |0.43 - 0| = 0.43, |0.26 - 1| = 0.74, total error 1.17
Person B: |0.73 - 1| = 0.27, |0.55 - 0| = 0.55, total error 0.82
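The error figures above can be reproduced with a few lines of Python, assuming the error is the absolute difference between each output and its target and that the first pair of outputs belongs to Person A (target 0-1). The function name is hypothetical.

```python
def output_errors(outputs, targets):
    """Absolute difference between each output and its target."""
    return [round(abs(o - t), 2) for o, t in zip(outputs, targets)]

# Untrained outputs and targets from the example (A: 0-1, B: 1-0).
errors_a = output_errors([0.43, 0.26], [0, 1])
errors_b = output_errors([0.73, 0.55], [1, 0])

# Total error per training pair, as used when adjusting the weights.
total_a = round(sum(errors_a), 2)
total_b = round(sum(errors_b), 2)
```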

Repeat process (sweep) for all training pairs

Present data
Calculate error
Backpropagate error
Adjust weights
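The four steps of one sweep can be sketched for a small feed-forward network with one hidden layer. This is a simplified illustration, not the original experiment: the 4-3-2 layer sizes, the two toy patterns, the sigmoid activation, the squared-error gradient and the learning rate are all assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hid, w_out):
    """Hidden and output activations; the last weight in each row is the bias."""
    h = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1]) for row in w_hid]
    y = [sigmoid(sum(w * v for w, v in zip(row[:-1], h)) + row[-1]) for row in w_out]
    return h, y

def total_error(pairs, w_hid, w_out):
    """Summed absolute output error over all training pairs."""
    return sum(abs(y - t) for x, target in pairs
               for y, t in zip(forward(x, w_hid, w_out)[1], target))

def train_sweep(pairs, w_hid, w_out, lr=0.5):
    for x, target in pairs:
        h, y = forward(x, w_hid, w_out)                              # present data
        d_out = [(t - o) * o * (1 - o) for o, t in zip(y, target)]   # calculate error
        d_hid = [hj * (1 - hj) * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
                 for j, hj in enumerate(h)]                          # backpropagate error
        for k, row in enumerate(w_out):                              # adjust weights
            for j, hj in enumerate(h):
                row[j] += lr * d_out[k] * hj
            row[-1] += lr * d_out[k]
        for j, row in enumerate(w_hid):
            for i, xi in enumerate(x):
                row[i] += lr * d_hid[j] * xi
            row[-1] += lr * d_hid[j]

rng = random.Random(0)
w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(3)]  # 4 inputs + bias
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]  # 3 hidden + bias

# Toy stand-ins for the two voices: targets 0-1 for A and 1-0 for B.
pairs = [([1, 0, 1, 0], [0, 1]), ([0, 1, 0, 1], [1, 0])]

error_before = total_error(pairs, w_hid, w_out)
for _ in range(2000):                     # repeat the sweep multiple times
    train_sweep(pairs, w_hid, w_out)
error_after = total_error(pairs, w_hid, w_out)
```

The weight updates follow the gradient of a squared error, while `total_error` reports the absolute error so that it is comparable with the worked figures in the example.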

Repeat process multiple times

Presenting the data (trained network)


Outputs for Person A: 0.01, 0.99
Outputs for Person B: 0.99, 0.01

Feed-forward nets
Information flow is unidirectional
Data is presented to the Input layer
Passed on to the Hidden layer
Passed on to the Output layer

Information is distributed

Information processing is parallel

Internal representation (interpretation) of data

Recurrent Networks
Feed forward networks:
Information only flows one way
One input pattern produces one output
No sense of time (or memory of previous state)

Recurrent networks:
Nodes connect back to other nodes or themselves
Information flow is multidirectional
Sense of time and memory of previous state(s)

Biological nervous systems show high levels of recurrency (but feed-forward structures exist too)

Elman Nets
Elman nets are feed forward networks with partial recurrency

Unlike feed forward nets, Elman nets have a memory or sense of time
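The memory in an Elman net comes from context units that hold a copy of the previous hidden state and feed it back in alongside the next input. The sketch below shows only that hidden-layer step (the output layer is omitted); the tanh activation, the fixed weights and all names are hypothetical.

```python
import math

def elman_step(x, context, w_in, w_ctx, bias):
    """New hidden state from the current input and the previous hidden state."""
    hidden = []
    for j in range(len(bias)):
        total = bias[j]
        total += sum(w * v for w, v in zip(w_in[j], x))         # current input
        total += sum(w * v for w, v in zip(w_ctx[j], context))  # context units
        hidden.append(math.tanh(total))
    return hidden

# Hypothetical fixed weights: 1 input, 2 hidden units, 2 context units.
w_in = [[0.5], [-0.3]]
w_ctx = [[0.1, 0.4], [-0.2, 0.3]]
bias = [0.0, 0.0]

def run(sequence):
    """Feed a sequence one value at a time, carrying the hidden state forward."""
    context = [0.0, 0.0]
    for x in sequence:
        context = elman_step([x], context, w_in, w_ctx, bias)
    return context

state_after_1_then_0 = run([1.0, 0.0])
state_after_0_then_0 = run([0.0, 0.0])
```

Both sequences end with the same input, yet the final hidden states differ because the earlier input is still represented in the context units.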

Hopfield Networks
Sub-type of recurrent neural nets
Fully recurrent
Weights are symmetric
Nodes can only be on or off
Random updating

Learning: Hebb rule (cells that fire together wire together)


Biological equivalent to LTP and LTD

Can recall a memory, if presented with a corrupt or incomplete version

auto-associative or content-addressable memory
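Recall from a corrupted pattern can be sketched with a very small Hopfield network. The code assumes the common +1 / -1 coding for on / off nodes, Hebbian weights with no self-connections, and updates that visit every node once per sweep in random order; the two stored patterns are invented for the example.

```python
import random

def train_hebb(patterns):
    """Hebb rule: strengthen the weight between nodes that are active together."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                 # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, sweeps=5, seed=0):
    """Update nodes one at a time, in random order, until the state settles."""
    rng = random.Random(seed)
    s = list(state)
    order = list(range(len(s)))
    for _ in range(sweeps):
        rng.shuffle(order)
        for i in order:
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

# Two hypothetical stored memories (on = +1, off = -1).
memory_1 = [1, 1, 1, 1, -1, -1, -1, -1]
memory_2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebb([memory_1, memory_2])

# Corrupt the first memory by flipping one node, then let the net settle.
corrupted = [-1] + memory_1[1:]
recalled = recall(weights, corrupted)
```

The corrupted input settles back onto the stored memory, which is the content-addressable behaviour described above.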

Applications

Electronic systems diagnosis
Stock market prediction
Sonar target recognition (oil exploration)
Medical test analysis
Optical character recognition (OCR)
Engine management
Artificial intelligence robots
Scanning luggage at airports for dangerous items
Speech synthesis
Face recognition

Summary Neural Networks

Components biological plausibility
Neurone / node
Synapse / weight

Feed forward networks

Unidirectional flow of information
Good at extracting patterns, generalisation and prediction
Distributed representation of data
Parallel processing of data
Training: Backpropagation
Not exact models, but good at demonstrating principles

Recurrent networks
Multidirectional flow of information
Memory / sense of time
Complex temporal dynamics (e.g. CPGs)
Various training methods (Hebbian, evolution)
Often better biological models than feed forward networks