
Gwalior Engineering College


Practical File
Neural Networks

Submitted to: Submitted by:

Mr. Rakesh Singh

(Lecturer in Dept. of Computer Science) Roll No:
Gwalior engineering college, Semester: VII.


01. What is a neuron? Explain the artificial neural network with

02. Explain the taxonomy of neural network architecture.

03. Write explanatory notes on the difference between ANN and the human brain.

04. Make a comparison between AI and ANN.

05. What do you understand by the word training? Differentiate between supervised and
unsupervised training.

06. Implement the error back propagation algorithm in C/C++.

07. Implement the perceptron learning algorithm in C/C++.

08. What is a perceptron? Also explain Rosenblatt's perceptron.

09. Give a brief overview of optical neural networks and explain their advantages and
disadvantages.

10. Single layer Hopfield Network with 4 neurons

11. Multi Layer Perceptron neural network in C++


Neurons (also known as neurones and nerve cells) are responsive cells in the nervous system
that process and transmit information by electrochemical signaling. They are the core components of
the brain, the vertebrate spinal cord, the invertebrate ventral nerve cord, and the peripheral nerves.
The brain is a collection of about 10 billion interconnected neurons. Each neuron is a cell that
uses biochemical reactions to receive, process and transmit information.

A neuron's dendritic tree is connected to a thousand neighboring neurons. When one of those
neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of
all the received charges are added together through the processes of spatial and temporal summation.
Spatial summation occurs when several weak signals are converted into a single large one, while
temporal summation converts a rapid series of weak pulses from one source into one large signal. The
aggregate input is then passed to the soma (cell body). The soma and the enclosed nucleus don't play
a significant role in the processing of incoming and outgoing data. Their primary function is to
perform the continuous maintenance required to keep the neuron functional. The part of the soma that
does concern itself with the signal is the axon hillock. If the aggregate input is greater than the axon
hillock's threshold value, then the neuron fires, and an output signal is transmitted down the axon.
The strength of the output is constant, regardless of whether the input was just above the threshold, or
a hundred times as great. The output strength is unaffected by the many divisions in the axon; it
reaches each terminal button with the same intensity it had at the axon hillock. This uniformity is
critical in an analogue device such as a brain where small errors can snowball, and where error
correction is more difficult than in a digital system.
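
The all-or-nothing behaviour described above can be illustrated with a minimal sketch. The function name neuronOutput and the fixed output strength of 1.0 are assumptions made for illustration; the sketch models only the summation and threshold test, not the timing of real signals.

```cpp
#include <vector>

// Hypothetical helper: adds up the incoming charges (positive = excitatory,
// negative = inhibitory) and compares the aggregate with the firing threshold.
// The output strength is constant (1.0) whenever the threshold is exceeded,
// no matter how far above the threshold the aggregate input is.
double neuronOutput(const std::vector<double>& charges, double threshold)
{
    double aggregate = 0.0;
    for (double c : charges) {
        aggregate += c;
    }
    return (aggregate > threshold) ? 1.0 : 0.0;
}
```

Calling neuronOutput({0.6, 0.5}, 1.0) and neuronOutput({60.0, 50.0}, 1.0) gives the same output, which mirrors the statement that the output does not grow with the input once the threshold is passed.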
Artificial Neural Network (ANN)

An artificial neural network (ANN), often just called a "neural network" (NN), is a
mathematical model or computational model based on biological neural networks. It consists of an
interconnected group of artificial neurons and processes information using a connectionist approach
to computation. In most cases an ANN is an adaptive system that changes its structure based on
external or internal information that flows through the network during the learning phase.

A neural network is an interconnected group of nodes, akin to the vast network of neurons in
the human brain.


Advantages:

- A neural network can perform tasks that a linear program cannot.
- When an element of the neural network fails, the network can continue without any problem
because of its parallel nature.
- A neural network learns and does not need to be reprogrammed.
- It can be applied in a wide range of applications.
- It can usually be implemented without special difficulty.

Disadvantages:

- The neural network needs training to operate.
- The architecture of a neural network is different from the architecture of microprocessors,
so it needs to be emulated.
- Large neural networks require a long processing time.

Example of Neural Network

In this document we develop an example of how the backpropagation algorithm works in a very
simple network. We use the Stuttgart Neural Network Simulator (SNNS), which is a very useful tool
for this purpose. First of all, load SNNS.

Once SNNS is running, you should see the snns-manager display, from which you can manage all
the windows of the simulator. First, load the ANN: click the "FILE" button and load the encoder
network, answering yes to the question about loading the configuration file. The display of the
network then appears.

Then select the pattern file for training the encoder network: press the PAT button and select
encoder. Now that the ANN and the patterns are loaded, we can train the network. To do so, open
the CONTROL display from the snns-manager by pressing the CONTROL button, and set the number of
cycles (sweeps) needed to train the network. To begin with, you can try these cases:
Learning-Function               Learning-Parameters    Cycles

Std.-Backpropagation            2.0                    750
Backpropagation with Momentum   0.8 0.6 0.1            300
Quickprop                       0.2 1.75 0.0001        75
Rprop                           0.2                    75

You can open the GRAPH window to see how the error goes down at each cycle (sweep). Before
training the network we have to establish the initial weights; to do this, press the INIT button.
Then press the ALL button to train on all the patterns in each sweep (block adaptive method).

You can select other examples supplied with the simulator to see networks other than the
backpropagation one. You can also repeat this process with different random initial weights. You
will see that the final error differs, simply because the training process started at a different
point of the error surface.

Feedforward neural network

In this network, the information moves in only one direction, forward, from the input nodes,
through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the
network.
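
As a small illustration of this one-way flow, the sketch below passes two binary inputs through two hidden units to one output unit. The weights and thresholds are hand-picked assumptions (they are not learned and do not come from the text); they are chosen so that the network computes XOR.

```cpp
// Step activation: 1 if the weighted sum exceeds the threshold, else 0.
int step(double sum, double threshold)
{
    return (sum > threshold) ? 1 : 0;
}

// 2 inputs -> 2 hidden units -> 1 output, evaluated strictly front to back.
// All weights are +1 or -1 and were picked by hand for illustration.
int xorNet(int x1, int x2)
{
    int h1 = step(x1 + x2, 0.5);   // hidden unit 1 behaves like OR
    int h2 = step(x1 + x2, 1.5);   // hidden unit 2 behaves like AND
    return step(h1 - h2, 0.5);     // output fires for OR but not AND
}
```

Each value is computed only from values in the preceding layer, so no unit ever feeds back into an earlier one.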

Radial basis function (RBF) network

Radial basis functions have been applied in the area of neural networks where they may be
used as a replacement for the sigmoidal hidden layer transfer characteristic in Multi-Layer
Perceptrons.
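
One common choice of radial basis function is the Gaussian, sketched below for a scalar input. The function name and the parameters c (centre) and sigma (width) are illustrative assumptions; the text does not say which basis function is meant.

```cpp
#include <cmath>

// Gaussian radial basis function: the response depends only on the
// distance between the input x and the unit's centre c. It equals 1 at
// the centre and decays towards 0 as x moves away; sigma sets the width.
double gaussianRbf(double x, double c, double sigma)
{
    double d = x - c;
    return std::exp(-(d * d) / (2.0 * sigma * sigma));
}
```

Unlike a sigmoid, which keeps growing towards 1 as its input increases, this response is local: it is large only near the centre.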

Kohonen self-organizing network

A set of artificial neurons learns to map points in an input space to coordinates in an output
space. The input space can have different dimensions and topology from the output space, and the
self-organizing map (SOM) will attempt to preserve these.

Recurrent network

Contrary to feed forward networks, recurrent neural networks (RNs) are models with
bi-directional data flow. While a feed forward network propagates data linearly from input to
output, RNs also propagate data from later processing stages to earlier stages.

Simple recurrent network

A simple recurrent network (SRN) is a variation on the Multi-Layer Perceptron, sometimes
called an "Elman network" due to its invention by Jeff Elman.

Hopfield network

The Hopfield network is a recurrent neural network in which all connections are symmetric. Invented
by John Hopfield in 1982, this network guarantees that its dynamics will converge.
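
A single-layer Hopfield network with 4 neurons can be sketched as follows. This is a minimal illustration that assumes bipolar states (+1/-1), Hebbian storage of a single pattern and asynchronous updates; the names storePattern and recall are hypothetical.

```cpp
#include <array>

const int N = 4;                                    // number of neurons
using Pattern = std::array<int, N>;                 // each entry is +1 or -1
using Weights = std::array<std::array<int, N>, N>;  // symmetric weight matrix

// Hebbian storage of one pattern: w[i][j] = p[i]*p[j], no self-connections.
// The resulting matrix is symmetric, as a Hopfield network requires.
Weights storePattern(const Pattern& p)
{
    Weights w{};
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            w[i][j] = (i == j) ? 0 : p[i] * p[j];
    return w;
}

// Asynchronous recall: update one neuron at a time until no state changes.
Pattern recall(const Weights& w, Pattern s)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < N; i++) {
            int field = 0;
            for (int j = 0; j < N; j++)
                field += w[i][j] * s[j];
            int next = (field >= 0) ? 1 : -1;
            if (next != s[i]) {
                s[i] = next;
                changed = true;
            }
        }
    }
    return s;
}
```

If the pattern {1, -1, 1, -1} is stored and the network is started from {1, 1, 1, -1} (one entry flipped), recall settles back on the stored pattern, which illustrates the convergence property mentioned above.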
Echo state network

The echo state network (ESN) is a recurrent neural network with a sparsely connected random
hidden layer. The weights of output neurons are the only part of the network that can change and be
learned. ESN are good to (re)produce temporal patterns.

Long short term memory network

Long short-term memory (LSTM) is an artificial neural network structure that, unlike traditional
RNNs, is designed to avoid the problem of vanishing gradients.

Stochastic neural networks

A stochastic neural network differs from a typical neural network because it introduces
random variations into the network.

Boltzmann machine

The Boltzmann machine can be thought of as a noisy Hopfield network. Invented by Geoff
Hinton and Terry Sejnowski in 1985, the Boltzmann machine is important because it is one of the
first neural networks to demonstrate learning of latent variables (hidden units).

Modular neural networks

Biological studies have shown that the human brain functions not as a single massive
network, but as a collection of small networks.

Associative neural network (ASNN)

ASNN represents a combination of an ensemble of feed-forward neural networks and the
k-nearest neighbor technique (kNN).

Other types of networks

These special networks do not fit in any of the previous categories.

Holographic associative memory

Holographic associative memory represents a family of analog, correlation-based, associative,
stimulus-response memories, where information is mapped onto the phase orientation of complex
numbers operating.

Instantaneously trained networks

Instantaneously trained neural networks (ITNNs) were inspired by the phenomenon of
short-term learning that seems to occur instantaneously. In these networks the weights of the hidden and
the output layers are mapped directly from the training vector data. Ordinarily, they work on binary
data, but versions for continuous data that require small additional processing are also available.

Spiking neural networks

Spiking neural networks (SNNs) are models which explicitly take into account the timing of
inputs. The network input and output are usually represented as series of spikes (delta function or
more complex shapes).

Dynamic neural networks

Dynamic neural networks not only deal with nonlinear multivariate behaviour, but also
include (learning of) time-dependent behaviour such as various transient phenomena and delay
effects.

Cascading neural networks

Cascade-Correlation is an architecture and supervised learning algorithm developed by Scott
Fahlman and Christian Lebiere.

Neuro-fuzzy networks

A neuro-fuzzy network is a fuzzy inference system in the body of an artificial neural network.

Compositional pattern-producing networks

Compositional pattern-producing networks (CPPNs) are a variation of ANNs which differ in
their set of activation functions and how they are applied. While typical ANNs often contain only
sigmoid functions (and sometimes Gaussian functions), CPPNs can include both types of functions
and many others.

One-shot associative memory

This type of network can add new patterns without the need for re-training. It is done by
creating a specific memory structure, which assigns each new pattern to an orthogonal plane using
adjacently connected hierarchical arrays.

Difference # 1: Brains are analogue; computers are digital

It's easy to think that neurons are essentially binary, given that they fire an action potential if
they reach a certain threshold, and otherwise do not fire. This superficial similarity to digital "1's and
0's" belies a wide variety of continuous and non-linear processes that directly influence neuronal
processing.

Failure to recognize these important subtleties may have contributed to Minsky & Papert's
infamous mischaracterization of perceptrons, a type of neural network without an intermediate layer
between input and output.

Difference # 2: The brain uses content-addressable memory

In computers, information in memory is accessed by polling its precise memory address. This
is known as byte-addressable memory. In contrast, the brain uses content-addressable memory, such
that information can be accessed in memory through "spreading activation" from closely related
concepts. For example, thinking of the word "fox" may automatically spread activation to memories
related to other clever animals, fox-hunting horseback riders, or attractive members of the opposite
sex.

Difference # 3: The brain is a massively parallel machine; computers are modular and serial

An unfortunate legacy of the brain-computer metaphor is the tendency for cognitive
psychologists to seek out modularity in the brain. For example, the idea that computers require
memory has led some to search for the "memory area," when in fact these distinctions are far
messier. One consequence of this over-simplification is that we are only now learning that "memory"
regions (such as the hippocampus) are also important for imagination, the representation of novel
goals, spatial navigation, and other diverse functions.

Difference # 4: Processing speed is not fixed in the brain; there is no system clock

The speed of neural information processing is subject to a variety of constraints, including the
time for electrochemical signals to traverse axons and dendrites, axonal myelination, the diffusion
time of neurotransmitters across the synaptic cleft, differences in synaptic efficacy, the coherence of
neural firing, the current availability of neurotransmitters, and the prior history of neuronal firing.
Although there are individual differences in something psychometricians call "processing speed," this
does not reflect a monolithic or unitary construct, and certainly nothing as concrete as the speed of a
microprocessor. Instead, psychometric "processing speed" probably indexes a heterogenous
combination of all the speed constraints mentioned above.

Difference # 5: Short-term memory is not like RAM

Although the apparent similarities between RAM and short-term or "working" memory
emboldened many early cognitive psychologists, a closer examination reveals strikingly important
differences. Although RAM and short-term memory both seem to require power (sustained neuronal
firing in the case of short-term memory, and electricity in the case of RAM), short-term memory
seems to hold only "pointers" to long term memory whereas RAM holds data that is isomorphic to
that being held on the hard disk. Unlike RAM, the capacity limit of short-term memory is not fixed;
the capacity of short-term memory seems to fluctuate with differences in "processing speed" (see
Difference #4) as well as with expertise and familiarity.

Difference # 6: No hardware/software distinction can be made with respect to the brain or mind

For years it was tempting to imagine that the brain was the hardware on which a "mind
program" or "mind software" is executing. This gave rise to a variety of abstract program-like models
of cognition, in which the details of how the brain actually executed those programs was considered
irrelevant, in the same way that a Java program can accomplish the same function as a C++ program.

Difference # 7: Synapses are far more complex than electrical logic gates

Another pernicious feature of the brain-computer metaphor is that it seems to suggest that
brains might also operate on the basis of electrical signals (action potentials) traveling along
individual logical gates. Unfortunately, this is only half true. The signals which are propagated along
axons are actually electrochemical in nature, meaning that they travel much more slowly than
electrical signals in a computer, and that they can be modulated in myriad ways. For example, signal
transmission is dependent not only on the putative "logical gates" of synaptic architecture but also on
the presence of a variety of chemicals in the synaptic cleft, the relative distance between synapse and
dendrites, and many other factors. This adds to the complexity of the processing taking place at each
synapse - and it is therefore profoundly wrong to think that neurons function merely as transistors.

Difference # 8: Unlike computers, processing and memory are performed by the same
components in the brain

Computers process information from memory using CPUs, and then write the results of that
processing back to memory. No such distinction exists in the brain. As neurons process information
they are also modifying their synapses - which are themselves the substrate of memory. As a result,
retrieval from memory always slightly alters those memories.

Difference # 9: The brain is a self-organizing system

This point follows naturally from the previous point - experience profoundly and directly
shapes the nature of neural information processing in a way that simply does not happen in traditional
microprocessors. For example, the brain is a self-repairing circuit - something known as
"trauma-induced plasticity" kicks in after injury. This can lead to a variety of interesting changes, including
some that seem to unlock unused potential in the brain (known as acquired savantism), and others
that can result in profound cognitive dysfunction (as is unfortunately far more typical in traumatic
brain injury and developmental disorders).

Difference # 10: Brains have bodies

This is not as trivial as it might seem: it turns out that the brain takes surprising advantage of
the fact that it has a body at its disposal. For example, despite your intuitive feeling that you could
close your eyes and know the locations of objects around you, a series of experiments in the field of
change blindness has shown that our visual memories are actually quite sparse. In this case, the brain
is "offloading" its memory requirements to the environment in which it exists: why bother
remembering the location of objects when a quick glance will suffice? A surprising set of
experiments by Jeremy Wolfe has shown that even after being asked hundreds of times which simple
geometrical shapes are displayed on a computer screen, human subjects continue to answer those
questions by gaze rather than rote memory. A wide variety of evidence from other domains suggests
that we are only beginning to understand the importance of embodiment in information processing.



S.No.  Artificial intelligence                     Artificial neural network

01.    Knowledge is represented at a higher        Knowledge is represented in numeric
       level, that is, as explicit or abstract     form, in terms of weights; there is no
       knowledge.                                  explicit relationship among the weights.

02.    It can explicitly correct errors by         It cannot explicitly correct errors. The
       modifying the facts and rules.              network itself modifies its weights to
                                                   produce the correct output.

03.    Intelligence is obtained by designing.      Intelligence is obtained by training.

04.    Comparatively inferior for real-time        Since processing is fast, comparatively
       systems.                                    good for real-time systems.

05.    Response time is consistent.                Response time is inconsistent.

06.    Symbolic representation is used.            Numeric representation is used.

07.    Sequential processing is used.              Distributed processing is used.

08.    Processing speed is low.                    Speed is fast due to its parallel
                                                   processing and dedicated hardware.

09.    It is not good for fault-tolerant           It is good for fault-tolerant systems.
       systems.

10.    There is a proper explanation for any       There is no explanation.
       response or output, that is, it is
       derived from the facts and inputs.


However interesting such functions may be in themselves, what has attracted the most interest
in neural networks is the possibility of learning, which in practice means the following:

Given a specific task to solve, and a class of functions F, learning means using a set of
observations in order to find f* ∈ F which solves the task in an optimal sense.

This entails defining a cost function C : F → R such that, for the optimal solution f*,
C(f*) ≤ C(f) for all f ∈ F (no solution has a cost less than the cost of the optimal solution).

The cost function C is an important concept in learning, as it is a measure of how far away we
are from an optimal solution to the problem that we want to solve. Learning algorithms search
through the solution space in order to find a function that has the smallest possible cost.

For applications where the solution is dependent on some data, the cost must necessarily be a
function of the observations, otherwise we would not be modelling anything related to the data. It is
frequently defined as a statistic to which only approximations can be made. As a simple example
consider the problem of finding the model f which minimizes C = E[(f(x) − y)^2], for data
pairs (x, y) drawn from some distribution D. In practical situations we would only have N samples
from D and thus, for the above example, we would only minimize
Ĉ = (1/N) Σ_{i=1..N} (f(x_i) − y_i)^2.
Thus, the cost is minimized over a sample of the data rather than the true data distribution.
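
The sample-based cost can be written as a short function. This is a sketch only: the names empiricalCost, predict and doubleIt are placeholders, and the model f(x) = 2x is an arbitrary example rather than anything from the text.

```cpp
#include <cstddef>
#include <vector>

// Sample estimate of the cost: (1/N) * sum over i of (f(x_i) - y_i)^2.
// `predict` stands in for the model f.
double empiricalCost(double (*predict)(double),
                     const std::vector<double>& x,
                     const std::vector<double>& y)
{
    double total = 0.0;
    for (std::size_t i = 0; i < x.size() && i < y.size(); i++) {
        double err = predict(x[i]) - y[i];
        total += err * err;
    }
    return x.empty() ? 0.0 : total / x.size();
}

// Example model used for illustration only: f(x) = 2x.
double doubleIt(double x)
{
    return 2.0 * x;
}
```

For the samples x = {1, 2, 3, 4} and y = {2, 4, 6, 10}, only the last pair is mispredicted (8 instead of 10), so the cost is (0 + 0 + 0 + 4) / 4 = 1.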

When N → ∞, some form of online learning must be used, where the cost is partially
minimized as each new example is seen. While online learning is often used when the data
distribution D is fixed, it is most useful in the case where the distribution changes slowly over
time. In neural network methods, some form of online learning is frequently also used for finite
datasets.

Choosing a cost function

While it is possible to arbitrarily define some ad hoc cost function, frequently a particular cost
will be used either because it has desirable properties (such as convexity) or because it arises
naturally from a particular formulation of the problem (e.g., in a probabilistic formulation the
posterior probability of the model can be used as an inverse cost). Ultimately, the cost function will
depend on the task we wish to perform. The three main categories of learning tasks are overviewed
below.

Learning paradigms

There are three major learning paradigms, each corresponding to a particular abstract learning
task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any
given type of network architecture can be employed in any of those tasks.

Supervised learning

In supervised learning, we are given a set of example pairs (x, y), x ∈ X, y ∈ Y, and the
aim is to find a function f : X → Y in the allowed class of functions that matches the examples. In
other words, we wish to infer the mapping implied by the data; the cost function is related to the
mismatch between our mapping and the data and it implicitly contains prior knowledge about the
problem domain.

A commonly used cost is the mean-squared error, which tries to minimize the average squared
error between the network's output, f(x), and the target value y over all the example pairs. When one
tries to minimise this cost using gradient descent for the class of neural networks called Multi-Layer
Perceptrons, one obtains the common and well-known backpropagation algorithm for training neural
networks.

Tasks that fall within the paradigm of supervised learning are pattern recognition (also known
as classification) and regression (also known as function approximation). The supervised learning
paradigm is also applicable to sequential data (e.g., for speech and gesture recognition). This can be
thought of as learning with a "teacher," in the form of a function that provides continuous feedback
on the quality of solutions obtained thus far.
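
To make the link between squared error and gradient descent concrete, the sketch below performs one gradient step for a single sigmoid neuron. It is a simplified illustration under stated assumptions: one input, one weight w and one bias b, with the constant factor 2 of the derivative absorbed into the learning rate. The function names are hypothetical.

```cpp
#include <cmath>

double sigmoid(double z)
{
    return 1.0 / (1.0 + std::exp(-z));
}

// Squared error of a single sigmoid neuron with weight w and bias b
// on one example pair (x, y).
double squaredError(double w, double b, double x, double y)
{
    double out = sigmoid(w * x + b);
    return (out - y) * (out - y);
}

// One gradient-descent step on that error. The term
// out*(1-out)*(y-out) combines the sigmoid derivative with the output
// error; backpropagation uses the same form for output-layer deltas.
void gradientStep(double& w, double& b, double x, double y, double rate)
{
    double out = sigmoid(w * x + b);
    double delta = out * (1.0 - out) * (y - out);
    w += rate * delta * x;   // move the weight against the error gradient
    b += rate * delta;       // the bias sees a constant input of 1
}
```

Starting from w = 0 and b = 0 with the pair (x, y) = (1, 1), a single step with rate 0.5 moves both parameters upward and lowers the squared error; repeating the step over all example pairs is the "teacher" feedback loop described above.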
Unsupervised learning

In unsupervised learning we are given some data x and a cost function to be minimized, which
can be any function of the data x and the network's output, f.

The cost function is dependent on the task (what we are trying to model) and our a priori
assumptions (the implicit properties of our model, its parameters and the observed variables).

As a trivial example, consider the model f(x) = a, where a is a constant, and the cost
C = E[(x − f(x))^2]. Minimizing this cost will give us a value of a that is equal to the mean of the
data. The cost function can be much more complicated. Its form depends on the application: for
example, in compression it could be related to the mutual information between x and f(x). In
statistical modelling, it could be related to the posterior probability of the model given the data.
(Note that in both of those examples those quantities would be maximized rather than minimised.)

Tasks that fall within the paradigm of unsupervised learning are in general estimation
problems; the applications include clustering, the estimation of statistical distributions, compression
and filtering.
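
The trivial constant-model example can be checked numerically. The sketch below minimizes the sample version of C = E[(x − a)^2] by gradient descent; the function name fitConstant, the step size 0.1 and the 200 iterations are illustrative assumptions.

```cpp
#include <vector>

// Fit the constant model f(x) = a by gradient descent on
// C(a) = (1/N) * sum over i of (x_i - a)^2.
// The derivative is dC/da = -(2/N) * sum over i of (x_i - a).
double fitConstant(const std::vector<double>& x, double rate = 0.1, int steps = 200)
{
    double a = 0.0;
    if (x.empty()) {
        return a;
    }
    for (int s = 0; s < steps; s++) {
        double grad = 0.0;
        for (double xi : x) {
            grad += -2.0 * (xi - a);
        }
        grad /= x.size();
        a -= rate * grad;   // step against the gradient
    }
    return a;
}
```

With rate 0.1 each step shrinks the gap between a and the sample mean by a factor of 0.8, so for the data {1, 2, 3, 6} the result settles on the mean, 3, to within rounding error.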

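// prepare XOR training data
double data[][4]={      // I XOR I XOR I = O
    0, 0, 0, 0,
    0, 0, 1, 1,
    0, 1, 0, 1,
    0, 1, 1, 0,
    1, 0, 0, 1,
    1, 0, 1, 0,
    1, 1, 0, 0,
    1, 1, 1, 1 };

// four layers: 3 inputs, two hidden layers of 3 neurons each, 1 output
int numLayers = 4, lSz[4] = {3,3,3,1};

// learning rate (beta), momentum (alpha) and stopping threshold for the mse
double beta = 0.2, alpha = 0.1, thresh = 0.00001;

// maximum number of training iterations
long num_iter = 500000;

// create the network
CBackProp *bp = new CBackProp(numLayers, lSz, beta, alpha);

// train on one pattern per iteration, cycling through the 8 rows
long i;
for (i=0; i < num_iter ; i++)
{
    bp->bpgt(data[i%8], &data[i%8][3]);

    if( bp->mse(&data[i%8][3]) < thresh)
        break;  // mse < threshold - we are done training
}

// prepare test data
double testData[][3]={  // I XOR I XOR I = ?
    0, 0, 0,
    0, 0, 1,
    0, 1, 0,
    0, 1, 1,
    1, 0, 0,
    1, 0, 1,
    1, 1, 0,
    1, 1, 1};

// feed each test row forward and print the network's prediction
for ( i = 0 ; i < 8 ; i++ )
{
    bp->ffwd(testData[i]);
    cout << testData[i][0]<< " "
         << testData[i][1]<< " "
         << testData[i][2]<< " "
         << bp->Out(0) << endl;
}
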
class CBackProp{

    // output of each neuron
    double **out;

    // delta error value for each neuron
    double **delta;

    // 3-D array to store weights for each neuron
    double ***weight;

    // no of layers in net including input layer
    int numl;

    // array of numl elements to store size of each layer
    int *lsize;

    // learning rate
    double beta;

    // momentum
    double alpha;

    // storage for weight-change made in previous epoch
    double ***prevDwt;

    // sigmoid function
    double sigmoid(double in);

public:

    // initializes and allocates memory
    CBackProp(int nl,int *sz,double b,double a);

    // backpropagates error for one set of input
    void bpgt(double *in,double *tgt);

    // feed forwards activations for one set of inputs
    void ffwd(double *in);

    // returns mean square error of the net
    double mse(double *tgt);

    // returns i'th output of the net
    double Out(int i) const;
};


#include <cstdlib>   // rand, srand
#include <ctime>     // time

// initializes and allocates memory
CBackProp::CBackProp(int nl,int *sz,double b,double a):beta(b),alpha(a)
{
    // Note that the following are unused,
    //   delta[0]
    //   weight[0]
    //   prevDwt[0]
    // I did this intentionally to maintain
    // consistency in numbering the layers.
    // Since for a net having n layers,
    // input layer is referred to as 0th layer,
    // first hidden layer as 1st layer
    // and the nth layer as output layer. And
    // first (0th) layer just stores the inputs
    // hence there is no delta or weight
    // values associated to it.

    int i;

    // set no of layers and their sizes
    numl=nl;
    lsize=new int[numl];
    for(i=0;i<numl;i++){
        lsize[i]=sz[i];
    }

    // allocate memory for output of each neuron
    out = new double*[numl];
    for(i=0;i<numl;i++){
        out[i]=new double[lsize[i]];
    }

    // allocate memory for delta
    delta = new double*[numl];
    for(i=1;i<numl;i++){
        delta[i]=new double[lsize[i]];
    }

    // allocate memory for weights (one extra slot per neuron for the bias)
    weight = new double**[numl];
    for(i=1;i<numl;i++){
        weight[i]=new double*[lsize[i]];
        for(int j=0;j<lsize[i];j++){
            weight[i][j]=new double[lsize[i-1]+1];
        }
    }

    // allocate memory for previous weights
    prevDwt = new double**[numl];
    for(i=1;i<numl;i++){
        prevDwt[i]=new double*[lsize[i]];
        for(int j=0;j<lsize[i];j++){
            prevDwt[i][j]=new double[lsize[i-1]+1];
        }
    }

    // seed and assign random weights, roughly in the range [-1, 1]
    srand((unsigned)(time(NULL)));
    for(i=1;i<numl;i++)
        for(int j=0;j<lsize[i];j++)
            for(int k=0;k<lsize[i-1]+1;k++)
                weight[i][j][k]=(double)(rand())/(RAND_MAX/2) - 1;

    // initialize previous weights to 0 for first iteration
    for(i=1;i<numl;i++)
        for(int j=0;j<lsize[i];j++)
            for(int k=0;k<lsize[i-1]+1;k++)
                prevDwt[i][j][k]=0.0;
}

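// feed forward one set of input
void CBackProp::ffwd(double *in)
{
    double sum;
    int i;

    // assign content to input layer
    for(i=0;i < lsize[0];i++)
        out[0][i]=in[i];

    // assign output (activation) value
    // to each neuron using sigmoid func
    for(i=1;i < numl;i++){                      // For each layer
        for(int j=0;j < lsize[i];j++){          // For each neuron in current layer
            sum=0.0;
            for(int k=0;k < lsize[i-1];k++){    // For input from each neuron in preceding layer
                sum+= out[i-1][k]*weight[i][j][k];  // Apply weight to inputs and add to sum
            }
            sum+=weight[i][j][lsize[i-1]];      // Apply bias
            out[i][j]=sigmoid(sum);             // Apply sigmoid function
        }
    }
}

// backpropagate errors from the output layer back to the first hidden layer
void CBackProp::bpgt(double *in,double *tgt)
{
    double sum;
    int i;

    // update output values for each neuron
    ffwd(in);

    // find delta for output layer
    for(i=0;i < lsize[numl-1];i++){
        delta[numl-1][i]=out[numl-1][i]*(1-out[numl-1][i])*(tgt[i]-out[numl-1][i]);
    }

    // find delta for hidden layers
    for(i=numl-2;i > 0;i--){
        for(int j=0;j < lsize[i];j++){
            sum=0.0;
            for(int k=0;k < lsize[i+1];k++){
                sum+=delta[i+1][k]*weight[i+1][k][j];
            }
            delta[i][j]=out[i][j]*(1-out[i][j])*sum;
        }
    }

    // apply momentum (does nothing if alpha=0)
    for(i=1;i < numl;i++){
        for(int j=0;j < lsize[i];j++){
            for(int k=0;k < lsize[i-1];k++){
                weight[i][j][k]+=alpha*prevDwt[i][j][k];
            }
            weight[i][j][lsize[i-1]]+=alpha*prevDwt[i][j][lsize[i-1]];
        }
    }

    // adjust weights using steepest descent
    for(i=1;i < numl;i++){
        for(int j=0;j < lsize[i];j++){
            for(int k=0;k < lsize[i-1];k++){
                prevDwt[i][j][k]=beta*delta[i][j]*out[i-1][k];
                weight[i][j][k]+=prevDwt[i][j][k];
            }
            prevDwt[i][j][lsize[i-1]]=beta*delta[i][j];
            weight[i][j][lsize[i-1]]+=prevDwt[i][j][lsize[i-1]];
        }
    }
}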

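//percept.h
// Perceptron model

#include <stdio.h>
#include <iostream>
#include <math.h>

using std::cin;
using std::cout;
using std::cerr;

// input-layer neuron: holds one weight and the activation it produces
class ineuron
{
protected:
    float weight;
    float activation;
    friend class oneuron;

public:
    ineuron() {};
    ineuron(float j) ;
    float act(float x);
};

// output neuron: sums the input activations and applies a threshold
class oneuron
{
protected:
    int output;
    float activation;
    friend class network;

public:
    oneuron() { };
    void actvtion(float x[4], ineuron *nrn);
    int outvalue(float j) ;
};

// network of four input neurons connected to one output neuron
class network
{
public:
    ineuron nrn[4];
    oneuron onrn;
    network(float,float,float,float);
};
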

//percept.cpp V. Rao, H. Rao
//Perceptron model

#include "percept.h"
#include <stdio.h>
#include <stdlib.h>

ineuron::ineuron(float j)
{
    weight= j;
}

// activation of an input neuron: the input scaled by its weight
float ineuron::act(float x)
{
    float a;
    a = x*weight;
    return a;
}

// activation of the output neuron: sum of the four weighted inputs
void oneuron::actvtion(float *inputv, ineuron *nrn)
{
    int i;
    activation = 0;

    for (i=0;i<4;i++)
    {
        cout<<"\nweight for neuron "<<i+1<<" is "<<nrn[i].weight;
        nrn[i].activation = nrn[i].act(inputv[i]);
        cout<<" activation is "<<nrn[i].activation;
        activation += nrn[i].activation;
    }
    cout<<"\n\nactivation is "<<activation<<"\n";
}

// compare the activation with threshold j and set the binary output
int oneuron::outvalue(float j)
{
    if (activation>=j)
    {
        cout<<"\nthe output neuron activation \
exceeds the threshold value of "<<j<<"\n";
        output = 1;
    }
    else
    {
        cout<<"\nthe output neuron activation \
is smaller than the threshold value of "<<j<<"\n";
        output = 0;
    }
    cout<<" output value is "<< output;
    return (output);
}

network::network(float a,float b,float c,float d)
{
    nrn[0] = ineuron(a) ;
    nrn[1] = ineuron(b) ;
    nrn[2] = ineuron(c) ;
    nrn[3] = ineuron(d) ;
    onrn = oneuron();
    onrn.activation = 0;
    onrn.output = 0;
}

int main (int argc, char * argv[])
{
    float inputv1[]= {1.95,0.27,0.69,1.25};
    float wtv1[]= {2,3,3,2}, wtv2[]= {3,0,6,2};
    FILE * wfile, * infile;
    int num=0, vecnum=0, i;
    float threshold = 7.0;

    // both file names are required, so argv[1] and argv[2] must exist
    if (argc < 3)
    {
        cerr << "Usage: percept Weightfile Inputfile";
        exit(1);
    }

    // open files
    wfile= fopen(argv[1], "r");
    infile= fopen(argv[2], "r");
    if ((wfile == NULL) || (infile == NULL))
    {
        cout << " Can't open a file\n";
        exit(1);
    }

    //create the network by calling its constructor.
    //the constructor calls neuron constructor as many times as the number of
    //neurons in input layer of the network.
    cout<<"please enter the number of weights/vectors \n";
    cin >> vecnum;

    for (i=1;i<=vecnum;i++)
    {
        fscanf(wfile,"%f %f %f %f\n",
               &wtv1[0],&wtv1[1],&wtv1[2],&wtv1[3]);
        network h1(wtv1[0],wtv1[1],wtv1[2],wtv1[3]);
        fscanf(infile,"%f %f %f %f \n",
               &inputv1[0],&inputv1[1],&inputv1[2],&inputv1[3]);
        cout<<"this is vector # " << i << "\n";
        cout << "please enter a threshold value, eg 7.0\n";
        cin >> threshold;
        h1.onrn.actvtion(inputv1, h1.nrn);
        h1.onrn.outvalue(threshold);
        cout<<"\n\n";
    }

    fclose(wfile);
    fclose(infile);
    return 0;
}


There are two data files used in this program. One is for setting up the weights, and the other for
setting up the input vectors. On the command line, you enter the program name followed by the weight file
name and the input file name. For this discussion, create a file called weight.dat, which contains the
following:

2.0 3.0 3.0 2.0

3.0 0.0 6.0 2.0

These are two weight vectors. Also create an input file called input.dat with the two data vectors:

1.95 0.27 0.69 1.25

0.30 1.05 0.75 0.19

During the execution of the program, you are first prompted for the number of vectors that are used (in
this case, 2), then for a threshold value for each input/weight vector (use 7.0 in both cases). You will then
see the following output. The lines containing only 2 or 7.0 are the user's input.

percept weight.dat input.dat

please enter the number of weights/vectors
2
this is vector # 1
please enter a threshold value, eg 7.0
7.0

weight for neuron 1 is 2 activation is 3.9
weight for neuron 2 is 3 activation is 0.81
weight for neuron 3 is 3 activation is 2.07
weight for neuron 4 is 2 activation is 2.5

activation is 9.28

the output neuron activation exceeds the threshold value of 7
 output value is 1

this is vector # 2
please enter a threshold value, eg 7.0
7.0

weight for neuron 1 is 3 activation is 0.9
weight for neuron 2 is 0 activation is 0
weight for neuron 3 is 6 activation is 4.5
weight for neuron 4 is 2 activation is 0.38

activation is 5.78

the output neuron activation is smaller than the threshold value of 7
 output value is 0


The perceptron is a type of artificial neural network invented in 1957 at the Cornell
Aeronautical Laboratory by Frank Rosenblatt. It can be seen as the simplest kind of feedforward
neural network: a linear classifier.

The perceptron is a binary classifier that maps its input x (a real-valued vector) to an output
value f(x) (a single binary value):

f(x) = 1 if w · x + b > 0, and f(x) = 0 otherwise,

where w is a vector of real-valued weights and w · x is the dot product (which computes a
weighted sum). b is the 'bias', a constant term that does not depend on any input value.

The value of f(x) (0 or 1) is used to classify x as either a positive or a negative instance, in the
case of a binary classification problem. The bias can be thought of as offsetting the activation
function, or giving the output neuron a "base" level of activity. If b is negative, then the weighted
combination of inputs must produce a positive value greater than − b in order to push the classifier
neuron over the 0 threshold. Spatially, the bias alters the position (though not the orientation) of the
decision boundary.

Since the inputs are fed directly to the output unit via the weighted connections, the
perceptron can be considered the simplest kind of feed-forward neural network.


A perceptron (X1, X2 input, X0*W0=b, TH=0.5) learns how to perform a NAND function:

Note: the initial weights of each iteration equal the final weights of the previous iteration. A
learning rate that is too high makes the perceptron periodically oscillate around the solution. A
possible enhancement is to use LR^n, starting with n=1 and incrementing n by 1 when a loop in
learning is found.
Frank Rosenblatt

Frank Rosenblatt (11 July 1928 – 11 July 1971) was a New York-born computer scientist who
completed the Perceptron, or Mark I, computer at Cornell University in 1960. This was the first
computer that could learn new skills by trial and error, using a type of neural network that simulates
human thought processes.

Rosenblatt’s perceptrons were initially simulated on an IBM 704 computer at Cornell
Aeronautical Laboratory in 1957. By the study of neural networks such as the Perceptron, Rosenblatt
hoped that "the fundamental laws of organization which are common to all information handling
systems, machines and men included, may eventually be understood."

A 1946 graduate of the Bronx High School of Science, Rosenblatt was a colorful character at
Cornell in the early 1960s. A handsome bachelor, he drove a classic MGA sports car and was often
seen with his cat named Tobermory. He enjoyed mixing with undergraduates, and for several years
taught an interdisciplinary undergraduate honors course entitled "Theory of Brain Mechanisms" that
drew students equally from Cornell's Engineering and Liberal Arts colleges.

This course was a melange of ideas drawn from a huge variety of sources: results from
experimental brain surgery on epileptic patients while conscious, experiments on measuring the
activity of individual neurons in the visual cortex of cats, studies of loss of particular kinds of mental
function as a result of trauma to specific areas of the brain, and various analog and digital electronic
circuits that modeled various details of neuronal behavior (i.e. the perceptron itself, as a machine).

There were also some breathtaking speculations, based on what was known about brain
behavior at this time (well before the CAT or PET scan was available), including one calculation that,
based on the number of neuronal connections in a human brain, the human cortex had enough storage
space to hold a complete "photographic" record of its perceptual inputs, stored at the 16 frames-per-
second rate of flicker fusion, for about two hundred years.

In 1962 Rosenblatt published much of the content of this honors course in the book
"Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms" (Spartan Books,
1962), which he used thereafter as a textbook for the course.

Research on similar devices was also being done in other places such as SRI, and many
researchers had high expectations of what they could do. The initial excitement was somewhat
dampened, though, when in 1969 Marvin Minsky and Seymour Papert published the book Perceptrons,
with mathematical proofs that elucidated some of the characteristics of three-layer feed-forward
perceptrons. On the one hand, they demonstrated some of the advantages of using them in certain
cases. On the other hand, they also presented some limitations. The most important one was the
impossibility of implementing general functions using only "local" neurons, which do not have all
inputs available. This was taken by many people as one of the most important characteristics of
perceptrons.

Rosenblatt died in a boating accident in 1971. He is buried at Quick Cemetery in
Brooktondale, New York. After research on neural networks returned to the mainstream in the 1980s,
new researchers started to study his work again. This new wave of study on neural networks is
interpreted by some researchers as being a contradiction of hypotheses presented in the book
Perceptrons, and a confirmation of Rosenblatt's expectations, but the extent of this is questioned by
some. In 2004 the IEEE established the Frank Rosenblatt Award, for "outstanding contributions to
the advancement of the design, practice, techniques or theory in biologically and linguistically
motivated computational paradigms including but not limited to neural networks, connectionist
systems, evolutionary computation, fuzzy systems, and hybrid intelligent systems in which these
paradigms are contained."

An optical neural network is a physical implementation of an artificial neural network with
optical components.

Some artificial neural networks that have been implemented as optical neural networks
include the Hopfield neural network and the Kohonen self-organizing map with liquid crystals.

Biological neural networks function on an electrochemical basis, while optical neural
networks use electromagnetic waves. Optical interfaces to biological neural networks can be created
with optogenetics, but such an interface is not the same as an optical neural network. In biological
neural networks there exist many different mechanisms for dynamically changing the state of the
neurons; these include short-term and long-term synaptic plasticity. Synaptic plasticity is among the
electrophysiological phenomena used to control the efficiency of synaptic transmission: long-term
plasticity for learning and memory, and short-term plasticity for brief transient changes in synaptic
transmission efficiency. Implementing this with optical components is difficult, and ideally requires
advanced photonic materials. Properties that might be desirable in photonic materials for optical
neural networks include the ability to change their efficiency of transmitting light based on the
intensity of the incoming light.

One model of an optical neural network, reported in 2007, is the Programmable Optical
Array/Analogic Computer (POAC). It was implemented in 2000 and is based on a modified Joint
Fourier Transform Correlator (JTC), with bacteriorhodopsin (BR) as a holographic optical memory.
Full parallelism, large array size and the speed of light are the three promises POAC offers for
implementing an optical CNN. These were investigated in the following years, together with their
practical limitations and considerations, yielding the design of the first portable POAC version.

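// hop.h
// Single layer Hopfield Network with 4 neurons

#include <stdio.h>
#include <iostream>
#include <math.h>

class neuron
{
protected:
    int activation;          // weighted sum computed by act()
    friend class network;

public:
    int weightv[4];          // weights on the connections into this neuron
    neuron() {}
    neuron(int *j);
    int act(int, int *);
};

class network
{
public:
    neuron nrn[4];
    int output[4];           // thresholded outputs of the four neurons
    int threshld(int);
    void activation(int j[4]);
    network(int *, int *, int *, int *);
};
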

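// Single layer Hopfield Network with 4 neurons

#include "hop.h"

using namespace std;

// Copy the four connection weights into the neuron.
neuron::neuron(int *j)
{
    int i;
    for (i = 0; i < 4; i++)
    {
        weightv[i] = *(j + i);
    }
}

// Weighted sum of the first m components of the pattern x.
int neuron::act(int m, int *x)
{
    int i;
    int a = 0;
    for (i = 0; i < m; i++)
    {
        a += x[i] * weightv[i];
    }
    return a;
}

// Threshold function: 1 for a non-negative activation, 0 otherwise.
int network::threshld(int k)
{
    if (k >= 0)
        return (1);
    else
        return (0);
}

network::network(int a[4], int b[4], int c[4], int d[4])
{
    nrn[0] = neuron(a);
    nrn[1] = neuron(b);
    nrn[2] = neuron(c);
    nrn[3] = neuron(d);
}

// Print the weights, then compute each neuron's activation and output.
void network::activation(int *patrn)
{
    int i, j;
    for (i = 0; i < 4; i++)
    {
        for (j = 0; j < 4; j++)
        {
            cout << "\n nrn[" << i << "].weightv[" << j << "] is "
                 << nrn[i].weightv[j];
        }
        nrn[i].activation = nrn[i].act(4, patrn);
        cout << "\nactivation is " << nrn[i].activation;
        output[i] = threshld(nrn[i].activation);
        cout << "\noutput value is " << output[i] << "\n";
    }
}
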

int main()
{
    int patrn1[] = {1, 0, 1, 0}, i;
    int wt1[] = {0, -3, 3, -3};
    int wt2[] = {-3, 0, -3, 3};
    int wt3[] = {3, -3, 0, -3};
    int wt4[] = {-3, 3, -3, 0};

    cout << "\nTHE NETWORK SHOULD RECALL THE";
    cout << "\nPATTERNS 1010 AND 0101 CORRECTLY.\n";

    // create the network by calling its constructor.
    // the constructor calls neuron constructor as many times as the number of
    // neurons in the network.
    network h1(wt1, wt2, wt3, wt4);

    // present a pattern to the network and get the activations of the neurons
    h1.activation(patrn1);

    // check if the pattern given is correctly recalled and give a message
    for (i = 0; i < 4; i++)
    {
        if (h1.output[i] == patrn1[i])
            cout << "\n pattern= " << patrn1[i] <<
                " output = " << h1.output[i] << " component matches";
        else
            cout << "\n pattern= " << patrn1[i] <<
                " output = " << h1.output[i] <<
                " discrepancy occurred";
    }
    cout << "\n\n";

    // repeat the check with the second stored pattern
    int patrn2[] = {0, 1, 0, 1};
    h1.activation(patrn2);
    for (i = 0; i < 4; i++)
    {
        if (h1.output[i] == patrn2[i])
            cout << "\n pattern= " << patrn2[i] <<
                " output = " << h1.output[i] << " component matches";
        else
            cout << "\n pattern= " << patrn2[i] <<
                " output = " << h1.output[i] <<
                " discrepancy occurred";
    }
    cout << "\n";
    return 0;
}




nrn[0].weightv[0] is 0

nrn[0].weightv[1] is -3

nrn[0].weightv[2] is 3

nrn[0].weightv[3] is -3

activation is 3

output value is 1

nrn[1].weightv[0] is -3

nrn[1].weightv[1] is 0

nrn[1].weightv[2] is -3

nrn[1].weightv[3] is 3

activation is -6

output value is 0

nrn[2].weightv[0] is 3

nrn[2].weightv[1] is -3

nrn[2].weightv[2] is 0

nrn[2].weightv[3] is -3

activation is 3

output value is 1

nrn[3].weightv[0] is -3

nrn[3].weightv[1] is 3

nrn[3].weightv[2] is -3
nrn[3].weightv[3] is 0

activation is -6

output value is 0

pattern= 1 output = 1 component matches

pattern= 0 output = 0 component matches

pattern= 1 output = 1 component matches

pattern= 0 output = 0 component matches

nrn[0].weightv[0] is 0

nrn[0].weightv[1] is -3

nrn[0].weightv[2] is 3

nrn[0].weightv[3] is -3

activation is -6

output value is 0

nrn[1].weightv[0] is -3

nrn[1].weightv[1] is 0

nrn[1].weightv[2] is -3

nrn[1].weightv[3] is 3

activation is 3

output value is 1

nrn[2].weightv[0] is 3

nrn[2].weightv[1] is -3

nrn[2].weightv[2] is 0

nrn[2].weightv[3] is -3

activation is -6

output value is 0
nrn[3].weightv[0] is -3

nrn[3].weightv[1] is 3

nrn[3].weightv[2] is -3

nrn[3].weightv[3] is 0

activation is 3

output value is 1

pattern= 0 output = 0 component matches

pattern= 1 output = 1 component matches

pattern= 0 output = 0 component matches

pattern= 1 output = 1 component matches

// Multi Layer Perceptron neural network in C++

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <math.h>

// Data-dependent settings
#define numInputs 3
#define numPatterns 4

// User-definable settings
#define numHidden 4
const int numEpochs = 500;
const double LR_IH = 0.7;    // learning rate, input-to-hidden weights
const double LR_HO = 0.07;   // learning rate, hidden-to-output weights

// Functions
void initWeights();
void initData();
void calcNet();
void WeightChangesHO();
void WeightChangesIH();
void calcOverallError();
void displayResults();
double getRand();

// Variables
int patNum = 0;
double errThisPat = 0.0;
double outPred = 0.0;
double RMSerror = 0.0;

// Outputs of the hidden neurons
double hiddenVal[numHidden];

// Weights
double weightsIH[numInputs][numHidden];
double weightsHO[numHidden];

// Training data
int trainInputs[numPatterns][numInputs];
int trainOutput[numPatterns];


// Forward pass for pattern patNum: tanh hidden neurons, linear output neuron.
void calcNet(void)
{
    int i = 0;
    for (i = 0; i < numHidden; i++)
    {
        hiddenVal[i] = 0.0;
        for (int j = 0; j < numInputs; j++)
        {
            hiddenVal[i] = hiddenVal[i] + (trainInputs[patNum][j]
                                           * weightsIH[j][i]);
        }
        hiddenVal[i] = tanh(hiddenVal[i]);
    }

    outPred = 0.0;
    for (i = 0; i < numHidden; i++)
    {
        outPred = outPred + hiddenVal[i] * weightsHO[i];
    }

    // error for this pattern
    errThisPat = outPred - trainOutput[patNum];
}

// Adjust the hidden-to-output weights.
void WeightChangesHO(void)
{
    for (int k = 0; k < numHidden; k++)
    {
        double weightChange = LR_HO * errThisPat * hiddenVal[k];
        weightsHO[k] = weightsHO[k] - weightChange;

        // keep the output weights within [-5, 5]
        if (weightsHO[k] < -5)
        {
            weightsHO[k] = -5;
        }
        else if (weightsHO[k] > 5)
        {
            weightsHO[k] = 5;
        }
    }
}

// Adjust the input-to-hidden weights.
void WeightChangesIH(void)
{
    for (int i = 0; i < numHidden; i++)
    {
        for (int k = 0; k < numInputs; k++)
        {
            // 1 - tanh^2 is the derivative of the tanh hidden neuron
            double x = 1 - (hiddenVal[i] * hiddenVal[i]);
            x = x * weightsHO[i] * errThisPat * LR_IH;
            x = x * trainInputs[patNum][k];
            double weightChange = x;
            weightsIH[k][i] = weightsIH[k][i] - weightChange;
        }
    }
}


// Random number in the range [0, 1].
double getRand(void)
{
    return ((double)rand()) / (double)RAND_MAX;
}

// Start from small random weights.
void initWeights(void)
{
    for (int j = 0; j < numHidden; j++)
    {
        weightsHO[j] = (getRand() - 0.5) / 2;
        for (int i = 0; i < numInputs; i++)
        {
            weightsIH[i][j] = (getRand() - 0.5) / 5;
            printf("Weight = %f\n", weightsIH[i][j]);
        }
    }
}

// Four training patterns with inputs and targets in {-1, 1}.
// The third input is always 1 and acts as the bias.
void initData(void)
{
    printf("initialising data\n");

    trainInputs[0][0] = 1;
    trainInputs[0][1] = -1;
    trainInputs[0][2] = 1; //bias
    trainOutput[0] = 1;

    trainInputs[1][0] = -1;
    trainInputs[1][1] = 1;
    trainInputs[1][2] = 1; //bias
    trainOutput[1] = 1;

    trainInputs[2][0] = 1;
    trainInputs[2][1] = 1;
    trainInputs[2][2] = 1; //bias
    trainOutput[2] = -1;

    trainInputs[3][0] = -1;
    trainInputs[3][1] = -1;
    trainInputs[3][2] = 1; //bias
    trainOutput[3] = -1;
}


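// Print the target and the network output for every pattern.
void displayResults(void)
{
    for (int i = 0; i < numPatterns; i++)
    {
        patNum = i;
        calcNet();
        printf("pat = %d actual = %d neural model = %f\n",
               patNum + 1, trainOutput[patNum], outPred);
    }
}

// Root-mean-square error over all patterns.
void calcOverallError(void)
{
    RMSerror = 0.0;
    for (int i = 0; i < numPatterns; i++)
    {
        patNum = i;
        calcNet();
        RMSerror = RMSerror + (errThisPat * errThisPat);
    }
    RMSerror = RMSerror / numPatterns;
    RMSerror = sqrt(RMSerror);
}

int main(void)
{
    // seed the random number generator
    srand((unsigned)time(NULL));

    initWeights();
    initData();

    // train the network
    for (int j = 0; j <= numEpochs; j++)
    {
        for (int i = 0; i < numPatterns; i++)
        {
            // select a pattern at random
            patNum = rand() % numPatterns;

            // calculate the network output and error for this pattern
            calcNet();

            // change the network weights
            WeightChangesHO();
            WeightChangesIH();
        }

        // display the overall network error after each epoch
        calcOverallError();
        printf("epoch = %d RMS Error = %f\n", j, RMSerror);
    }

    // training has finished; display the results
    displayResults();

    return 0;
}
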

epoch = 497 RMS Error = 0.000000

epoch = 498 RMS Error = 0.000000

epoch = 499 RMS Error = 0.000000

epoch = 500 RMS Error = 0.000000

pat = 1 actual = 1 neural model = 1.000000

pat = 2 actual = 1 neural model = 1.000000

pat = 3 actual = -1 neural model = -1.000000

pat = 4 actual = -1 neural model = -1.000000