
Marathwada Mitra Mandal’s

College of Engineering, Pune


Karvenagar, Pune-411 052

Accredited with 'A++' Grade by NAAC

Lab Manual

Software Laboratory II

Artificial Neural Network

(317533)

Prepared by

Ms. A. G. Sawant

Graduates will be able to

1. Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering
specialization to the solution of complex engineering problems. [Engineering knowledge]
2. Identify, formulate, research literature, and analyse complex engineering problems reaching
substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences. [Problem analysis]
3. Design solutions for complex engineering problems and design system components or
processes that meet the specified needs with appropriate consideration for the public health
and safety, and the cultural, societal, and environmental considerations.
[Design/development of solutions]
4. Use research-based knowledge and research methods including design of experiments,
analysis and interpretation of data, and synthesis of the information to provide valid
conclusions. [Conduct investigations of complex problems]
5. Create, select, and apply appropriate techniques, resources, and modern engineering and IT
tools including prediction and modelling to complex engineering activities with an
understanding of the limitations. [Modern tool usage]
6. Apply reasoning informed by the contextual knowledge to assess societal, health, safety,
legal and cultural issues and the consequent responsibilities relevant to the professional
engineering practice. [The engineer and society]
7. Understand the impact of the professional engineering solutions in societal and
environmental contexts, and demonstrate the knowledge of, and need for sustainable
development. [Environment and sustainability]
8. Apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice. [Ethics]
9. Function effectively as an individual, and as a member or leader in diverse teams, and in
multidisciplinary settings. [Individual and team work]
10. Communicate effectively on complex engineering activities with the engineering community
and with society at large, such as, being able to comprehend and write effective reports and
design documentation, make effective presentations, and give and receive clear instructions.
[Communication]
11. Demonstrate knowledge and understanding of the engineering and management principles
and apply these to one’s own work, as a member and leader in a team, to manage projects
and in multidisciplinary environments. [Project management and finance]
12. Recognize the need for, and have the preparation and ability to engage in independent and
life-long learning in the broadest context of technological change. [Life-long learning]

| Sr. No. | Title of Experiment | Equipment Required | CO | PO | PSO |
| --- | --- | --- | --- | --- | --- |
| 1 | Implement different concepts of probability using Python | PC with Python IDE | - | - | - |
| 2 | Write a Python program to plot a few activation functions that are being used in neural networks. | PC with Python IDE | 1,2 | 1,3 | 1,2 |
| 3 | Generate ANDNOT function using McCulloch-Pitts neural net by a Python program. | PC with Python IDE | 1,2 | 1,6 | 1,2 |
| 4 | Write a Python program using Perceptron Neural Network to recognize even and odd numbers. Given numbers are in ASCII form 0 to 9. | PC with Python IDE | 1,2 | 1,4 | 1,2 |
| 5 | With a suitable example demonstrate the perceptron learning law with its decision regions using Python. Give the output in graphical form. | PC with Python IDE | 1,2 | 4,5 | 1,2 |
| 6 | Write a Python program for Bidirectional Associative Memory with two pairs of vectors. | PC with Python IDE | 1,2 | 1,5 | 1,2 |
| 7 | Implement Artificial Neural Network training process in Python by using Forward Propagation, Back Propagation. | PC with Python IDE | 1,2 | 3,4 | 1,2 |
| 8 | Write a Python program to show Back Propagation Network for XOR function with binary input and output. | PC with Python IDE | 1,2 | 4,5 | 1,2 |
| 9 | Write a Python program to illustrate ART neural network. | PC with Python IDE, TensorFlow | 1,2 | 2,3 | 1,2 |
| 10 | Write a Python program for creating a Back Propagation feed-forward neural network. | PC with Python IDE | 1,2 | 3,5 | 1,2 |
| 11 | Write a Python program to design a Hopfield Network which stores 4 vectors. | PC with Python IDE | 1,2 | 2,5 | 1,2 |
| 12 | Train a Neural Network with TensorFlow / PyTorch and evaluate logistic regression using TensorFlow. | PC with Python IDE | 1,2 | 3,4 | 1,2 |
| 13 | TensorFlow / PyTorch implementation of CNN. | TensorFlow | 1,2 | 5,11 | 1,2 |
| 14 | MNIST Handwritten Character Detection using PyTorch, Keras and TensorFlow. | TensorFlow | 1,2 | 3,4 | 1,2 |

Content beyond Syllabus

Assignment No. : 1

Title
Implement different concepts of probability using Python
Objectives
To explore how statistics relates to probability
Outcomes
Students will be able to understand concepts of probability.
Software
Python 3.9.7
Theory

At the most basic level, probability seeks to answer the question, "What is the chance of an event
happening?" An event is some outcome of interest. To calculate the chance of an event
happening, we also need to consider all the other events that can occur. The quintessential
representation of probability is the humble coin toss. In a coin toss the only events that can
happen are:
1. Flipping a heads
2. Flipping a tails
These two events form the sample space, the set of all possible events that can happen. To
calculate the probability of an event occurring, we count how many times our event of interest
(say flipping heads) can occur and divide it by the total number of events in the sample space.
Thus, probability tells us that an ideal coin has a 1-in-2 chance of being heads or tails. By
looking at the events that can occur, probability gives us a framework for making predictions
about how often events will happen. However, even though it seems obvious, if we actually try
to toss some coins, we're likely to get abnormally high or low counts of heads every once in a
while. If we don't want to make the assumption that the coin is fair, what can we do? We can
gather data! We can use statistics to calculate probabilities based on observations from the real
world and check how they compare to the ideal.

The coin_trial function represents a simulation of 10 coin tosses. It uses the random() function
to generate a float between 0 and 1, and increments our heads count if the value falls within
half of that range. Then, simulate repeats these trials as many times as you'd like, returning the
average number of heads across all of the trials. The coin toss simulations give us some
interesting results.
First, the data confirm that our average number of heads does approach what probability
suggests it should be. Furthermore, this average improves with more trials. In 10 trials, there’s
some slight error, but this error almost disappears entirely with 1,000,000 trials. As we get more
trials, the deviation away from the average decreases. Sound familiar? Sure, we could have
flipped the coin ourselves, but Python saves us a lot of time by allowing us to model this process
in code. As we get more and more data, the real-world starts to resemble the ideal.
Thus, given enough data, statistics enables us to calculate probabilities using real-world
observations. Probability provides the theory, while statistics provides the tools to test that theory
using data. The descriptive statistics, specifically mean and standard deviation, become the
proxies for the theoretical. You may ask, "Why would I need a proxy if I can just calculate the
theoretical probability itself?" Coin tosses are a simple toy example, but the more interesting
probabilities are not so easily calculated.
What is the chance of someone developing a disease over time? What is the probability that a
critical car component will fail when you are driving? There are no easy ways to calculate these
probabilities, so we must fall back on using data and statistics to calculate them. Given more and
more data, we can become more confident that what we calculate represents the true probability
of these important events happening. That being said, remember from our previous statistics
post that you are a sommelier-in-training. You need to figure out which wines are better than
others before you start purchasing them. You have a lot of data on hand, so we’ll use our
statistics to guide our decision.
Before we can tackle the question of "which wine is better than average," we have to mind the
nature of our data. Intuitively, we'd like to use the scores of the wines to compare groups, but
there comes a problem: the scores usually fall in a range. How do we compare groups of scores
between types of wines and know with some degree of certainty that one is better than the other?
Enter the normal distribution. The normal distribution refers to a particularly important
phenomenon in the realm of probability and statistics. The normal distribution looks like this:

[Figure: the bell-shaped curve of the normal distribution]

The most important qualities to notice about the normal distribution are its symmetry and
its shape. We’ve been calling it a distribution, but what exactly is being distributed? It depends
on the context. In probability, the normal distribution is a particular distribution of the
probability across all of the events. The x-axis takes on the values of events we want to know the
probability of. The y-axis is the probability associated with each event, from 0 to 1.
We haven’t discussed probability distributions in-depth here, but know that the normal
distribution is a particularly important kind of probability distribution. In statistics, it is the
values of our data that are being distributed. Here, the x-axis is the values of our data, and the y-
axis is the count of each of these values. Here's the same picture of the normal distribution, but
labelled according to a probability and statistical context:

[Figure: the normal distribution labelled in both a probability context (events vs. probability) and a statistical context (values vs. frequency)]

In a probability context, the high point in a normal distribution represents the event with the
highest probability of occurring. As you get farther away from this event on either side, the
probability drops rapidly, forming that familiar bell-shape. The high point in a statistical context
actually represents the mean. As in probability, as you get farther from the mean, you rapidly
drop off in frequency. That is to say, extremely high and low deviations from the mean are
present but exceedingly rare.

Conclusions

Implemented different concepts of probability using Python.

Assignment No: 2

Title
Write a Python program to plot a few activation functions that are being used in neural
networks.
Objectives
To explore various activation function
Outcomes
Students will be able to understand various activation functions.
Software
Python 3.9.7
Theory

A sigmoidal activation function is S-shaped and among the most useful activation functions.
Sigmoidal functions are particularly beneficial in neural nets trained by back-propagation, because
the simple relationship between the value of the function at a point and the value of its derivative
at that point reduces the computational load during training. The logistic function, or binary
sigmoidal function, is a sigmoidal function with range between 0 and 1; it is used to train ANNs
whose target outputs lie between 0 and 1 or take the binary values 0 and 1. The logistic function
is given by Eq. (1) and shown in Fig. 2:

f(V) = 1 / (1 + e^(-pV))    (1)

where V is the input to the sigmoidal function (as shown in Fig. 1) and p is a positive quantity
which controls the slope of the curve. As p becomes very large (p → ∞) the function approaches
the threshold (step) function; as p approaches zero, it behaves approximately like a linear
activation function.

The bipolar sigmoidal activation function is closely related to the hyperbolic tangent (tanh)
function, and is frequently used when the target output lies between -1 and 1. The bipolar
sigmoidal activation function is given by Eq. (2), or equivalently Eq. (3), and is shown in Fig. 3:

f(V) = (1 - e^(-pV)) / (1 + e^(-pV))    (2)

f(V) = tanh(pV / 2)    (3)

TABLE 1
ACTIVATION FUNCTIONS

| Function | Equation |
| --- | --- |
| Threshold (unipolar) | f(V) = 1 if V ≥ 0, else 0 |
| Threshold (bipolar) | f(V) = 1 if V ≥ 0, else -1 |
| Linear (identity) | f(V) = V |
| Linear (piecewise) | f(V) = 1 if V ≥ 1/2; f(V) = V + 1/2 if -1/2 < V < 1/2; f(V) = 0 if V ≤ -1/2 |
| Nonlinear (sigmoid / logistic) | f(V) = 1 / (1 + e^(-pV)) |
| Nonlinear (hyperbolic tangent) | f(V) = (e^V - e^(-V)) / (e^V + e^(-V)) |

[Fig. 2. Logistic (binary) sigmoidal activation function: output f(Y) versus input V over roughly -20 to 20, plotted for slopes p = 1 and p = 0.5.]

[Fig. 3. Bipolar sigmoidal activation function: output f(Y) versus input V over roughly -20 to 20, ranging from -1 to 1, plotted for slopes p = 1 and p = 0.5.]


• Threshold Activation Function (a Python sketch of these steps is given after the list):

I. Load the input (X) and target output (Y_T).
II. Initialize the weights, the number of iterations, the learning rate (η) and the cost function.
III. Calculate the values of the new weights.
IV. Find the value of the output variable (V).
V. Calculate the mean square error (MSE).
VI. Repeat steps III, IV and V for the given number of iterations.
VII. If the value of V is less than the threshold, then the output (Y) is 0, else 1.
VIII. Plot the graph of iterations versus MSE.
IX. Stop.
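A minimal sketch of these steps, assuming a small linearly separable dataset (the AND truth table) as illustrative input; the dataset, learning rate and threshold value are assumptions, not part of the original assignment:

```python
import numpy as np
import matplotlib.pyplot as plt

# I. Load input (X) and target output (Y_T) -- an assumed toy dataset (AND gate)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y_T = np.array([0, 0, 0, 1], dtype=float)

# II. Initialize weights, iterations and learning rate
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=2)
b = 0.0
eta, iterations, threshold = 0.1, 50, 0.5
mse_history = []

for _ in range(iterations):
    V = X @ W + b                             # IV. output variable (net input)
    Y = np.where(V >= threshold, 1, 0)        # VII. threshold activation
    error = Y_T - Y
    mse_history.append(np.mean(error ** 2))   # V. mean square error
    W += eta * X.T @ error                    # III. new weights (perceptron-style update)
    b += eta * error.sum()

# VIII. Plot iterations versus MSE
plt.plot(mse_history)
plt.xlabel("Iteration")
plt.ylabel("MSE")
plt.show()
```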

Conclusions

Assignment No: 3

Title
Generate ANDNOT function using McCulloch-Pitts neural net by a Python program.
Objectives
To generate the ANDNOT function using a McCulloch-Pitts neural net by a Python program.
Outcomes
Students are able to generate the ANDNOT function using a McCulloch-Pitts neural net by a
Python program.
Software
Python 3.9.7
Theory

The first formal definition of an artificial neuron, based on a highly simplified view of the
biological neuron, was given by Warren McCulloch and Walter Pitts in 1943. The architecture
of the McCulloch-Pitts net is given in Fig. 3 and is called the McCulloch-Pitts (MP) model.
These networks use binary activations and permit only the binary states 0 and 1. Neurons are
connected by directed weighted paths with positive or negative weights. Connections with
positive weights are called excitatory connections; connections with negative weights are called
inhibitory connections.

[Fig. 2. Architecture of an artificial neuron: inputs X1 ... Xn plus a bias input X0 are multiplied by weights WK0 ... WKn, summed to produce the net input VK, and passed through an activation function to give the output YK = f(VK).]

[Fig. 3. Architecture of the McCulloch-Pitts neuron: excitatory inputs X1 ... Xn connect to the output unit Y with weight W, while inhibitory inputs Xn+1 ... Xn+m connect with weight -P.]


The model consists of four layers, i.e. an input layer, two hidden layers and one output layer.
This model is implemented using a combination of ANDNOT logic and OR logic, which is
given by Eq. (3):

Y_N = Y_N1 + Y_N2 = X1·X̄2 + X̄1·X2    (3)

where Y_N1 = X1·X̄2 and Y_N2 = X̄1·X2 are ANDNOT terms combined by OR logic. The
truth tables for Y_N1 and Y_N2 are given below. The weights are decided by trial and error
methods, and are finalized as 1 and -1. The truth table for Y_N is obtained by combining the
two tables by OR.
TRUTH TABLE FOR AND-NOT GATE (YN1)

| X1 | X2 | Y | W1X1 | W2X2 | YN1 = W1X1 + W2X2 |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 | -1 | 0 |
| 1 | 0 | 1 | 1 | 0 | 1 |
| 1 | 1 | 0 | 1 | -1 | 0 |

TRUTH TABLE FOR AND-NOT GATE (YN2)

| X1 | X2 | Y | W1X1 | W3X2 | YN2 = W1X1 + W3X2 |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 | -1 | 0 |
| 1 | 0 | 1 | 1 | 0 | 1 |
| 1 | 1 | 0 | 1 | -1 | 0 |
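A minimal sketch of the McCulloch-Pitts ANDNOT neuron in Python, using the trial-and-error weights from the tables above (w1 = 1, w2 = -1) and an assumed firing threshold of 1:

```python
def mp_andnot(x1, x2, w1=1, w2=-1, theta=1):
    """McCulloch-Pitts neuron computing X1 AND NOT X2."""
    net = w1 * x1 + w2 * x2          # weighted net input
    return 1 if net >= theta else 0  # binary threshold activation

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mp_andnot(x1, x2))  # fires only for (1, 0)
```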

Conclusions

Assignment No: 4

Title
Write a Python Program using Perceptron Neural Network to recognize even and odd numbers.
Given numbers are in ASCII form 0 to 9
Objectives
To write a Python Program using Perceptron Neural Network to recognize even and odd
numbers. Given numbers are in ASCII form 0 to 9
Outcomes
Students are able to write a Python Program using Perceptron Neural Network to recognize
even and odd numbers. Given numbers are in ASCII form 0 to 9
Software
Python 3.9.7
Theory

The ASCII code, or American Standard Code for Information Interchange, is a character set of
256 symbols that is divided into two parts: the standard ASCII code and the extended ASCII
code. The standard ASCII code is 7 bits long and ranges from 0 to 127, while the extended
ASCII code is 8 bits long and ranges from 128 to 255. This character set is made up of
uppercase and lowercase letters (a to z, A to Z), digits (0-9), special characters (!, @, #, $,
etc.), punctuation marks, and control characters. As a result, each character has a unique ASCII
value.

Suppose we input a string "Scaler Topics" in the computer, then the system does not directly
store the string we entered. Instead, the computer stores the strings in their equivalent ASCII
value, such as '083099097108101114032084111112108099115'. The ASCII value of S is 083, c
is 099, a is 097, " " is 032, and so on.

Input/Output

Let's take an example to understand to print the ASCII value of 7.

Input:- No need to input anything. Output:- ASCII value of 7 is 55. Explanation:- The ASCII
value of 7 is 55.

Program to Print ASCII Value of 1 to 9 Using the ASCII Table

Let's look at the ASCII table for the numbers from 1 to 9.

| Number | ASCII value |
| --- | --- |
| 1 | 49 |
| 2 | 50 |
| 3 | 51 |
| 4 | 52 |
| 5 | 53 |
| 6 | 54 |
| 7 | 55 |
| 8 | 56 |
| 9 | 57 |

It is clear from the above table that we need to add 48 to the number to get its ASCII value.
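A small sketch of this mapping in Python; the even/odd labelling shown here is an illustrative assumption about how the perceptron's targets are derived from the digits:

```python
# ord() gives a character's ASCII value; subtracting 48 recovers the digit.
for ch in "0123456789":
    value = ord(ch)                  # e.g. ord('7') == 55
    digit = value - 48
    parity = "even" if digit % 2 == 0 else "odd"
    print(ch, value, parity)
```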

Conclusions

Assignment No: 5

Title
With a suitable example demonstrate the perceptron learning law with its decision regions using Python. Give
the output in graphical form.
Objectives
To demonstrate, with a suitable example, the perceptron learning law with its decision regions using Python,
giving the output in graphical form.
Outcomes
Students are able to demonstrate the perceptron learning law with its decision regions using Python and give
the output in graphical form.
Software
Python 3.9.7
Theory

The perceptron is a machine learning algorithm which mimics how a neuron in the brain works. It is also
called a single-layer neural network, consisting of a single neuron. The output of this neural network is
decided based on the outcome of just one activation function associated with the single neuron. In a
perceptron, forward propagation of information happens. A deep neural network consists of one or more
perceptrons laid out in two or more layers; the input to the perceptrons in a particular layer is fed from the
previous layer by combining the outputs with different weights.
Let's first understand how a neuron works. The diagram below represents a neuron in the brain. The input
signals (x1, x2, ...) of different strengths (observed weights w1, w2, ...) are fed into the neuron cell as a
weighted sum via the dendrites. The weighted sum is termed the net input. The net input is processed by the
neuron, and an output signal (observed in the axon) is fired when appropriate. If the combined signal strength
is not sufficient according to the decision function within the neuron cell (the activation function), the neuron
does not fire any output signal.

[Fig. 1. Neuron in the human brain]

The following is another view of an artificial neuron, the perceptron, in relation to a biological neuron, from
the viewpoint of how input and output signals flow:

The perceptron, when represented as a line diagram, would look like the following, with mathematical
notations:

[Fig. 2. Perceptron – single-layer neural network]

Pay attention to the following in relation to what's shown in the above diagram representing a neuron:

• Step 1 – Input signals weighted and combined as net input: Weighted sums of the input signals reach
the neuron cell through the dendrites. The weights represent the fact that different input signals may
have different strengths, hence the weighted sum. This weighted sum can also be termed the net
input to the neuron cell.
• Step 2 – Net input fed into activation function: The weighted sum of inputs, the net input, is fed as
input to what is called the activation function. The activation function is generally a non-linear
function. Activation functions come in different types, such as the following:
• Unit step function
• Sigmoid function (a popular one, as it outputs a number between 0 and 1 and can thus be used to
represent a probability)
• Rectified linear unit (ReLU) function
• Hyperbolic tangent

The diagram below depicts different types of non-linear activation functions.

• Step 3A – Activation function outputs a binary signal appropriately: The activation function processes
the net input based on the unit step (Heaviside) function and outputs the binary signal appropriately as
either 1 or 0. The activation function of the perceptron is a unit step function. Recall that the unit step
function u(t) outputs the value 1 when t >= 0 and 0 otherwise. In the case of a shifted unit step
function, the function u(t - a) outputs the value 1 when t >= a and 0 otherwise.
• Step 3B – Learning input signal weights based on prediction vs. actuals: In parallel, the neuron sends
feedback to strengthen the input signal weights appropriately, so that the output signal matches the
actual value. The feedback is based on the outcome of the activation function, which is a unit step
function. Weights are updated based on the gradient descent learning algorithm. Here is the equation
based on which the weights get updated:

w_j = w_j + η (y(i) - ŷ(i)) x_j(i)

Fig 3. Weight update rule of the perceptron learning algorithm, where η is the learning rate, y(i) the actual
label and ŷ(i) the predicted label for training example i.

Here is another picture of the perceptron that represents the concept explained above.

[Fig. Perceptron – a single-layer neural network comprising a single neuron]
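A minimal sketch of the perceptron learning law on a small two-class dataset, with the decision regions plotted; the dataset, learning rate and epoch count are illustrative assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed toy dataset: two linearly separable classes labelled +1 and -1
X = np.array([[1, 1], [2, 2], [2, 0], [-1, -1], [-2, -1], [0, -2]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

w = np.zeros(2)
b = 0.0
eta = 0.1
for _ in range(20):                          # a few epochs suffice here
    for xi, ti in zip(X, y):
        pred = 1 if xi @ w + b >= 0 else -1
        w += eta * (ti - pred) * xi          # perceptron weight update rule
        b += eta * (ti - pred)

# Shade the two decision regions over a grid and overlay the data
xx, yy = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
grid = np.c_[xx.ravel(), yy.ravel()]
zz = np.where(grid @ w + b >= 0, 1, -1).reshape(xx.shape)
plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.title("Perceptron decision regions")
plt.show()
```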

Conclusions

Assignment No: 6

Title
Write a Python program for Bidirectional Associative Memory with two pairs of vectors.
Objectives
To write a Python program for Bidirectional Associative Memory with two pairs of vectors.
Outcomes
Students are able to write a Python program for Bidirectional Associative Memory with two
pairs of vectors.
Software
Python 3.9.7
Theory

Associative memory is understood as the storage and retrieval of information by association
with other information.
An information storage device is called an associative memory if it allows information to be
retrieved based on partial knowledge of its content, without knowing its storage location. It is
also sometimes called content-addressable memory.
Traditional computers do not use this kind of addressing; they are based on exact knowledge of
the memory address at which the information is located.
Why BAM is required?
The main objective to introduce such a network model is to store hetero-associative pattern pairs.
This is used to retrieve a pattern given a noisy or incomplete pattern.
The main point of BAM is to act as memory, where you can teach it to associate several patterns
together. This way, if you teach it to associate a pattern A with a pattern B, when you give it A
again, it will spit out B. Since it’s bidirectional, you can also give it pattern B and it will spit out
pattern A in response. You can teach it several pattern pairs to associate, and can even corrupt
some of the data, and it will give you what it thinks is the best match for the data you gave it.
Just like a human brain.

Creating a BAM Network

BAM Network Structure


We will see how to create a BAM network using two training images and two test images, one
previously edited with noise and the other a copy of the original.

For pattern A we use a simple binary image of a bear:

Pattern A

For pattern B we use a duck:

Pattern B

For testing we use the same pattern B, and for pattern A we use the same bear with noise added:

Pattern A for testing the BAM network

First, we need to load the two images from the img directory, apply a binary threshold operation
to each file using OpenCV, and resize each image to 100x50.

Thresholding is a technique in OpenCV in which pixel values are assigned in relation to a
provided threshold value. In thresholding, each pixel value is compared with the threshold
value. If the pixel value is smaller than the threshold, it is set to 0; otherwise, it is set to a
maximum value (generally 255). Thresholding is a very popular segmentation technique, used
for separating an object considered as foreground from its background. A threshold is a value
with two regions on either side of it, i.e. below the threshold or above the threshold.
In computer vision, thresholding is done on grayscale images, so the image first has to be
converted to the grayscale color space.
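A minimal sketch of this pipeline plus the core BAM computation; the file names are assumptions, and the weight matrix is built as the sum of outer products of bipolar pattern pairs (shown here for one pair, with a comment indicating where the second pair is added):

```python
import cv2
import numpy as np

def load_pattern(path):
    """Load an image, binarize it with a threshold, resize to 100x50,
    and return it as a bipolar {-1, +1} vector."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)      # thresholding needs grayscale
    img = cv2.resize(img, (100, 50))
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
    return np.where(binary.flatten() > 0, 1, -1)

a = load_pattern("img/bear.png")   # pattern A (assumed file name)
b = load_pattern("img/duck.png")   # pattern B (assumed file name)

W = np.outer(a, b)                 # add np.outer(a2, b2) here for the second pair

def recall_b(x):                   # noisy pattern A -> associated pattern B
    return np.where(x @ W >= 0, 1, -1)

def recall_a(y):                   # pattern B -> associated pattern A (bidirectional)
    return np.where(y @ W.T >= 0, 1, -1)
```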

Conclusions

Assignment No: 7

Title
Implement Artificial Neural Network training process in Python by using Forward Propagation,
Back Propagation.
Objectives
To implement the Artificial Neural Network training process in Python by using Forward
Propagation and Back Propagation.
Outcomes
Students are able to implement the Artificial Neural Network training process in Python by
using Forward Propagation and Back Propagation.
Software
Python 3.9.7
Theory

The backward propagation part of neural networks is quite complicated. In this article, I provide
an example of forward and backward propagation to (hopefully) answer some questions you
might have. Though it’s no substitute for reading papers on neural networks, I hope it clears up
some confusion.
In this post, I walk you through a simple neural network example and illustrate how forward and
backward propagation work. My neural network example predicts the outcome of the logical
conjunction.
The logical conjunction (AND operator) takes two inputs and returns one output. The function
returns true only if both of its inputs are true. Its truth table looks like this:

| p | q | p & q |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

Our neural network has 2 inputs, p and q, and one output, the prediction of p & q. Our training
set includes 2 examples out of the 4 possible examples:

For the first training example, the desired output of the neural network is 1, and for the second
training example, it is 0. Our neural network has 2 neurons in the input layer and 1 neuron on the
output layer. The structure looks like this:

Forward propagation:
In the forward propagation, we check what the neural network predicts for the first training
example with the initial weights and bias. First, we initialize the weights and bias randomly.

Then we calculate z, the weighted sum of the inputs and the bias:

z = w1·p + w2·q + b

After we have z, we can apply the activation function to it:

a = σ(z)

σ is the activation function. The most common activation functions are ReLU, sigmoid and tanh. In
this example, we are going to use tanh.

For the first training example, our neural network predicted the outcome 0.291. Our desired
outcome is 1. The neural network can improve with the learning process of backward
propagation. Before we continue with the backward propagation, let’s calculate the prediction
for the second training example.
Here are the results:

Backward propagation:
We can define a cost function that measures how well our neural network performs. For an
input, x, and desired output, y, we can calculate the cost of a specific training example as the
square of the difference between the network's output and the desired output, that is,

C_k = (a_k - y_k)^2

where k stands for the training example, a_k is the activation of the output neuron, and y_k is
the actual desired output. For our training examples, the costs are the following:

The total cost of a training set is the average of the individual cost functions of the data in the
training set:

C = (1/N) ∑(k = 1 to N) C_k

where N stands for the number of training examples. In our training set it looks like this:

We want to improve the performance of the neural network on the training examples, that is, we
want to change the weights and bias so as to lower the total cost. We want to know how much
each specific weight and the bias affect the total cost, so we need to calculate the partial
derivatives of the total cost with respect to the weights and bias. To do this, we can apply the
chain rule:

∂C_k/∂w_i = (∂C_k/∂a) · (∂a/∂z) · (∂z/∂w_i)

After simplification, with the tanh activation, the parts look like this:

∂C_k/∂a = 2(a - y_k),   ∂a/∂z = 1 - tanh²(z),   ∂z/∂w_i = x_i

The calculation of the partial derivatives for the first training example:

The partial derivatives for the second training example:

Now, we calculate the partial derivatives with respect to the total cost. Consider the first weight
(w1). The partial derivative of the total cost with respect to w1 is the average of all the partial
derivatives of the individual cost functions with respect to w1:

In our training example, it is:

We do the same calculation for the other weight and bias:

Then we update the weights and bias: we multiply the partial derivatives by some learning rate
and subtract the results from the weights and bias. Let's use the learning rate α = 0.6:

w_i = w_i - α · ∂C/∂w_i,   b = b - α · ∂C/∂b

Repeating this calculation with the other weight and the bias:

After updating the weights and bias, our neural network looks like this:

This is the end of the first iteration of the backward propagation. We could continue with the
forward propagation, calculate the cost, and then go back to the backward propagation again.
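A minimal sketch of this walkthrough in Python; the two training pairs and the initial weights are illustrative assumptions consistent with the description above:

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 0.0]])   # two assumed training examples (p, q)
Y = np.array([1.0, 0.0])                  # desired outputs

w = np.array([0.3, 0.1])                  # assumed random initialization
b = -0.1
alpha = 0.6

for _ in range(100):
    z = X @ w + b                         # forward: weighted sum plus bias
    a = np.tanh(z)                        # forward: tanh activation
    # backward: dC/dw averaged over the set, with C_k = (a_k - y_k)^2
    dC_da = 2 * (a - Y)
    da_dz = 1 - np.tanh(z) ** 2
    grad_w = (dC_da * da_dz) @ X / len(X)
    grad_b = np.mean(dC_da * da_dz)
    w -= alpha * grad_w                   # gradient-descent update
    b -= alpha * grad_b

print(np.tanh(X @ w + b))                 # predictions approach [1, 0]
```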

Conclusions

Assignment No: 8

Title
Write a Python program to show a Back Propagation Network for the XOR function with binary
input and output.
Objectives
To implement a Back Propagation Network for the XOR function with binary input and output.
Outcomes
Students are able to implement Back Propagation Network for XOR function with Binary Input
and Output.
Software
Python 3.9.7

Theory

Implementing logic gates using neural networks helps in understanding the mathematical
computation by which a neural network processes its inputs to arrive at a certain output. This
neural network will deal with the XOR logic problem. An XOR (exclusive OR) gate is a digital
logic gate that gives a true output only when its two inputs differ from each other. The truth
table for an XOR gate is shown below:

Truth Table for XOR

| X1 | X2 | Y |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

The goal of the neural network is to classify the input patterns according to the above truth table.
If the input patterns are plotted according to their outputs, it is seen that these points are not
linearly separable. Hence the neural network has to be modeled to separate these input patterns
using decision planes.

Error can simply be written as the difference between the predicted outcome and the actual
outcome. Mathematically:

E = t - y

where t is the targeted/expected output and y is the predicted output.

However, is it fair to assign different error values for the same amount of error? For example, the
absolute difference between -1 and 0 and between 1 and 0 is the same; however, the above
formula would sway things negatively for the outcome that predicted -1. To solve this problem,
we use the squared error loss. (Note that the modulus is not used, as it makes the function harder
to differentiate.) Further, this error is divided by 2 to make it easier to differentiate, as we'll see
in the following steps:

E = (1/2)(t - y)^2    (squared error loss)

Since there may be many weights contributing to this error, we take the partial derivative to find
the minimum error with respect to each weight at a time. The change in weights is different for
the output layer weights (W31 and W32) and for the hidden layer weights (W11, W12, W21,
W22). Let the outer layer weights be w_o and the hidden layer weights be w_h.

We'll first find ΔW for the outer layer weights. Since the outcome is a function of the activation,
and the activation in turn is a function of the weights, by the chain rule:

∂E/∂w_o = (∂E/∂y) · (∂y/∂a_o) · (∂a_o/∂w_o)

On solving, with a sigmoid activation, the standard result for the change in the outer layer
weights is

Δw_o = η (t - y) y (1 - y) x_o

Note that x_o is nothing but the output from the hidden layer nodes.
This output from the hidden layer nodes is again a function of the activation and, correspondingly,
a function of the weights. Hence, the chain rule expands for the hidden layer weights:

∂E/∂w_h = (∂E/∂y) · (∂y/∂a_o) · (∂a_o/∂x_o) · (∂x_o/∂a_h) · (∂a_h/∂w_h)

which comes to the change in the hidden layer weights:

Δw_h = η (t - y) y (1 - y) w_o · x_o (1 - x_o) · x_i

[Figure: XOR, graphically — the four XOR input points cannot be separated by a single straight line.]

Choosing the number of epochs and the value of the learning rate decides two things: how
accurate the model is, and how long the model takes to compute the final output. The concept
of hyperparameter tuning is a whole subject by itself. A sketch of the full network follows.
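A minimal sketch of a 2-2-1 sigmoid network trained by backpropagation on XOR with binary input and output; the hidden-layer size, learning rate and epoch count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W_h = rng.normal(size=(2, 2)); b_h = np.zeros(2)   # hidden layer (W11 .. W22)
W_o = rng.normal(size=(2, 1)); b_o = np.zeros(1)   # output layer (W31, W32)
sigmoid = lambda z: 1 / (1 + np.exp(-z))
eta = 0.5

for _ in range(10000):
    H = sigmoid(X @ W_h + b_h)                     # forward through hidden layer
    Y = sigmoid(H @ W_o + b_o)                     # forward through output layer
    delta_o = (Y - T) * Y * (1 - Y)                # output-layer error term
    delta_h = (delta_o @ W_o.T) * H * (1 - H)      # chain rule into hidden layer
    W_o -= eta * H.T @ delta_o; b_o -= eta * delta_o.sum(0)
    W_h -= eta * X.T @ delta_h; b_h -= eta * delta_h.sum(0)

print(Y.round(3))   # should approach [0, 1, 1, 0]
```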

Conclusions

Assignment No: 9

Title
Write a Python program to illustrate an ART neural network
Objectives
To implement an ART neural network
Outcomes
Students are able to illustrate an ART neural network
Software
Python 3.9.7, TensorFlow

Theory

TensorFlow

"TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive,
flexible ecosystem of tools, libraries and community resources that lets researchers push the state-
of-the-art in ML and developers easily build and deploy ML powered applications."

Neural Style Transfer

"Neural style transfer is an optimization technique used to take two images — a content image
and a style reference image (such as an artwork by a famous painter) — and blend them together
so the output image looks like the content image, but "painted" in the style of the style reference
image."

We load each image and decode it into three different channels; this is like going down to the
pixel level of things. Then we crop the center of the images. Lastly, we resize the images so that
our content and style images match in terms of size.

We are going to pick two images, and then import them into our program. We will use
the load_image function that we defined earlier to import the images.

The model that we will use is called Arbitrary Image Stylization. There are different ways of
loading a model: one can load it through the URL or download the model folder.
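A minimal sketch of this pipeline, assuming TensorFlow Hub is installed; the image file names are assumptions, and the model URL shown is the commonly published TF-Hub address for Arbitrary Image Stylization:

```python
import tensorflow as tf
import tensorflow_hub as hub

def load_image(path, size=(256, 256)):
    """Read an image, decode its three channels, scale to [0, 1] and resize."""
    img = tf.io.read_file(path)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, size)       # content and style sizes must match
    return img[tf.newaxis, ...]            # add a batch dimension

content = load_image("content.jpg")        # assumed file name
style = load_image("style.jpg")            # assumed file name

hub_model = hub.load(
    "https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2")
stylized = hub_model(tf.constant(content), tf.constant(style))[0]
```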

Conclusions

Assignment No: 10

Title
Write a Python program for creating a Back Propagation feed-forward neural network
Objectives
To implement a Back Propagation feed-forward neural network
Outcomes
Students are able to create a Back Propagation feed-forward neural network
Software
Python 3.9.7, TensorFlow

Theory

Creating a backpropagation feed-forward neural network involves several key steps. Here's a
theoretical overview of the process:
Data Preparation: Start by preparing your dataset, which typically includes input features and
corresponding target labels. Ensure the data is properly preprocessed, normalized, and split into
training and validation sets.
Network Architecture: Define the architecture of your neural network. Specify the number of
layers, the number of neurons in each layer, and the activation functions to be used. The most
common architecture is a feed-forward network, where information flows from the input layer
through hidden layers to the output layer.
Initialization: Initialize the weights and biases of your neural network. Random initialization is
often used, setting small random values for the weights and initializing biases to zero.
Forward Propagation: Perform forward propagation to compute the output of the network
given an input. Start by passing the input through the network's layers, applying the activation
function at each layer. Compute the output of the last layer, which will be the predicted values.
Loss Function: Define a suitable loss function that quantifies the difference between the
predicted output and the actual target values. Common loss functions for different tasks include
mean squared error (MSE) for regression and cross-entropy loss for classification.
Backpropagation: Perform backpropagation to calculate the gradients of the loss function with
respect to the weights and biases. Start by calculating the gradient of the loss function at the
output layer and propagate it backward through the network, applying the chain rule. Update the
weights and biases using an optimization algorithm such as stochastic gradient descent (SGD).

Main steps of the Backpropagation algorithm

Step 1: The input layer receives the input.
Step 2: The inputs are combined as a weighted sum, averaged over the weights.
Step 3: Each hidden layer processes the signal and produces an output. The "error" here is the
difference between the actual output and the desired output.
Step 4: In this step, the algorithm moves back through the hidden layers to optimize the weights
and reduce the error.

Advantages of Backpropagation in Python

It is a relatively fast and simple algorithm to implement. It is extensively used in the fields of
face recognition and speech recognition. Moreover, it is a flexible method, as no prior knowledge
of the neural network is needed.

Disadvantages of Backpropagation

The algorithm is disadvantageous for noisy and irregular data. The performance of
backpropagation depends highly on the input.

Backpropagation is a great way to improve the accuracy of a feed-forward neural network model.
It is quite an easy and flexible algorithm, but it does not work well with noisy data. It is a great
way to reduce the error and improve the accuracy of the model. It optimizes the weights by going
backwards, minimizing the loss function with the help of gradient descent.

Back propagation (BP) trains a feed-forward neural network by propagating the error in the
backward direction to update the weights of the hidden layers. The error is the difference between
the actual output and the target output, and the weight updates are computed on the basis of the
gradient descent method. The performance of the system is evaluated on the basis of recognition
rate.
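A minimal sketch of a back-propagation feed-forward network in Keras; the toy dataset, layer sizes and optimizer settings are illustrative assumptions (fit() runs the forward and backward passes internally):

```python
import numpy as np
import tensorflow as tf

# Assumed toy dataset: 4 features, binary target
X = np.random.rand(200, 4).astype("float32")
y = (X.sum(axis=1) > 2).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),  # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),                 # output layer
])
model.compile(optimizer="sgd",                 # gradient-descent weight updates
              loss="binary_crossentropy",      # loss minimized by backpropagation
              metrics=["accuracy"])
model.fit(X, y, epochs=20, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))
```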
Conclusions

Assignment No: 11

Title
Write a Python program to design a Hopfield Network which stores 4 vectors
Objectives
To implement a Hopfield Network which stores 4 vectors
Outcomes
Students are able to create a Hopfield Network which stores 4 vectors
Software
Python 3.9.7, TensorFlow
Theory

The Hopfield network is a type of recurrent artificial neural network that can be used for
associative memory and pattern recognition tasks. It consists of a set of interconnected neurons,
where each neuron can be in one of two states: "on" or "off."
If we want to design a Hopfield network that stores four vectors, we need to define the size of the
vectors and the number of neurons in the network. Let's assume that each vector has N elements,
and we will need N neurons in the network.
To store the four vectors, we can initialize the weights between neurons in such a way that the
network converges to the desired state when presented with one of the vectors. We can set the
weights according to the Hebbian learning rule, which states that "neurons that fire together wire
together." The weight between two neurons i and j is given by:

W(i, j) = ∑(k = 1 to 4) V_k(i) · V_k(j)

where V_k(i) represents the i-th element of the k-th stored vector V_k (and the self-connections
W(i, i) are set to 0).

To recall a stored vector, we can present a partial or noisy version of the vector to the network,
and the network will converge to the closest stored vector. This convergence process is achieved
through the iterative updating of neuron states based on the weighted inputs from other neurons.
The dynamics of the Hopfield network can be described using an energy function. The network
seeks to minimize this energy function, and when it reaches a stable state, the energy function
reaches a minimum. The energy function is defined as:

E = -0.5 ∑(i = 1 to N) ∑(j = 1 to N) W(i, j) · S(i) · S(j) - ∑(i = 1 to N) θ(i) · S(i)

where S(i) represents the state of neuron i (either -1 or 1), W(i, j) is the weight between neurons
i and j, and θ(i) is the threshold or bias value for neuron i.

By iteratively updating the states of the neurons based on the energy function, the network will
converge to a stable state that corresponds to one of the stored vectors.
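A minimal sketch of a Hopfield network storing 4 bipolar vectors with the Hebbian rule above and recalling one of them from a noisy probe; the six-element patterns are illustrative assumptions:

```python
import numpy as np

patterns = np.array([
    [ 1, -1,  1, -1,  1, -1],
    [ 1,  1, -1, -1,  1,  1],
    [-1, -1,  1,  1, -1, -1],
    [-1,  1, -1,  1, -1,  1],
])
N = patterns.shape[1]

# Hebbian rule: sum of outer products, with no self-connections
W = sum(np.outer(v, v) for v in patterns).astype(float)
np.fill_diagonal(W, 0)

def recall(state, steps=20):
    state = state.copy()
    for _ in range(steps):                 # asynchronous neuron updates
        for i in range(N):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

noisy = patterns[0].copy()
noisy[0] *= -1                             # flip one bit to corrupt the probe
print(recall(noisy))                       # converges back to patterns[0]
```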

The Hopfield network has some limitations, such as being prone to spurious states and having a
limited storage capacity as the number of stored vectors increases. Nonetheless, it serves as a
foundational model for associative memory and has been expanded upon with variations and
improvements over the years.

Conclusions

Assignment No: 12

Title
How to train a Neural Network with TensorFlow / PyTorch, and evaluation of logistic regression
using TensorFlow
Objectives
To implement a Neural Network with TensorFlow
Outcomes
Students are able to implement a Neural Network with TensorFlow
Software
TensorFlow

Theory

"TensorFlow is an open source software library for numerical computation using dataflow graphs.
Nodes in the graph represent mathematical operations, while graph edges represent multi-
dimensional data arrays (aka tensors) communicated between them. The flexible architecture
allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile
device with a single API."
The advantages of using TensorFlow are:

It has an intuitive construct, because, as the name suggests, it has a "flow of tensors". You can
easily visualize each and every part of the graph.

Easily train on CPU/GPU for distributed computing.

Platform flexibility: you can run the models wherever you want, whether on mobile, server or PC.

The typical TensorFlow (1.x) workflow is as follows:

• Build a computational graph; this can be any mathematical operation TensorFlow supports.
• Initialize variables, to compile the variables defined previously.
• Create a session; this is where the magic starts!
• Run the graph in the session; the compiled graph is passed to the session, which starts its execution.
• Close the session to shut it down.
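A minimal sketch of this graph-and-session workflow (this is the TensorFlow 1.x style; under TensorFlow 2.x it requires the tf.compat.v1 shim shown here):

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()            # restore graph/session semantics

a = tf.placeholder(tf.float32)          # build a computational graph
b = tf.placeholder(tf.float32)
c = a * b

with tf.Session() as sess:              # create and run the session
    result = sess.run(c, feed_dict={a: 3.0, b: 4.0})
    print(result)                       # 12.0
# the session is closed automatically on leaving the with-block
```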

We have 70,000 images of 28 pixels width and 28 pixels height in greyscale. Each image is
showing one of 10 possible clothing types. Here is one:

Here are some images from the dataset along with the clothing they are showing:

Here are all the different types of clothing:

| Label | Description |
| --- | --- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |

Now that we got familiar with the data we have let’s make it usable for our Neural Network.

Data Preprocessing

Let’s start with loading our data into memory:

Fortunately, TensorFlow has the dataset built-in, so we can easily obtain it.

Loading it gives us 4 things:

• x_train — image (pixel) data for 60,000 clothes. Used for building our model.
• y_train — classes (clothing type) for the clothing above. Used for building our model.
• x_val — image (pixel) data for 10,000 clothes. Used for testing/validating our model.
• y_val — classes (clothing type) for the clothing above. Used for testing/validating our model.
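A minimal sketch of this loading step, assuming the built-in Fashion-MNIST dataset in tf.keras:

```python
import tensorflow as tf

(x_train, y_train), (x_val, y_val) = tf.keras.datasets.fashion_mnist.load_data()
print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
print(x_val.shape, y_val.shape)       # (10000, 28, 28) (10000,)

# Pixel values are 0-255; scale them to [0, 1] for training
x_train, x_val = x_train / 255.0, x_val / 255.0
```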

Now, your Neural Network can’t really see images as you do. But it can understand numbers.
Each data point of each image in our dataset is pixel data — a number between 0 and 255.

Conclusions

Assignment No: 13

Title
TensorFlow / PyTorch implementation of CNN
Objectives
To implement a CNN with TensorFlow
Outcomes
Students are able to implement a CNN with TensorFlow
Software
TensorFlow

Theory

Convolutional Neural Networks


Convolutional Neural Networks are designed to process data through multiple layers of arrays.
This type of neural network is used in applications like image recognition or face recognition.
The primary difference between a CNN and any other ordinary neural network is that a CNN
takes its input as a two-dimensional array and operates directly on the images, rather than
relying on the separate feature extraction that other neural networks focus on.
The dominant approach of CNNs includes solutions for problems of recognition. Top companies
like Google and Facebook have invested in research and development towards recognition
projects to get activities done with greater speed.
A convolutional neural network uses three basic ideas:
• Local receptive fields
• Convolution
• Pooling

The CIFAR10 dataset contains 60,000 color images in 10 classes, with 6,000 images in each
class. The dataset is divided into 50,000 training images and 10,000 testing images. The classes
are mutually exclusive and there is no overlap between them.
Verify the data
To verify that the dataset looks correct, let's plot the first 25 images from the training set and
display the class name below each image:

As input, a CNN takes tensors of shape (image height, image width, color_channels), ignoring
the batch size. If you are new to these dimensions, color_channels refers to (R,G,B). In this
example, you will configure your CNN to process inputs of shape (32, 32, 3), which is the format
of CIFAR images.
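A minimal sketch of a small CNN for CIFAR-10 inputs of shape (32, 32, 3), in the spirit of the description above; the layer sizes and epoch count are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),          # pooling reduces the spatial size
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10),                     # one logit per CIFAR-10 class
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```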

We build the LeNet-5 architecture using the PyTorch framework. It has 2 convolutional layers,
each followed by an average pooling layer, 2 fully connected layers, and a final output classifier
layer with 10 classes, since the final output has 10 categories of items.
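A minimal sketch of that LeNet-5 architecture in PyTorch, assuming the classic 32x32 single-channel input (which yields the 16 x 5 x 5 flattened size):

```python
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),   # convolutional layer 1
            nn.AvgPool2d(2),                             # average pooling layer 1
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),  # convolutional layer 2
            nn.AvgPool2d(2),                             # average pooling layer 2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),       # fully connected layer 1
            nn.Linear(120, 84), nn.Tanh(),               # fully connected layer 2
            nn.Linear(84, num_classes),                  # output classifier (10 classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```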

Conclusions

Assignment No: 14

Title
MNIST Handwritten Character Detection using PyTorch, Keras and TensorFlow
Objectives
To implement MNIST Handwritten Character Detection using TensorFlow
Outcomes
Students are able to implement MNIST Handwritten Character Detection using TensorFlow
Software
TensorFlow
Theory

The MNIST (Modified National Institute of Standards and Technology) database is a large
database of handwritten numbers or digits that is used for training various image processing
systems. The dataset is also widely used for training and testing in the field of machine learning.

MNIST contains 70,000 images of handwritten digits: 60,000 for training and 10,000 for testing.
The images are grayscale, 28x28 pixels, and centered, to reduce preprocessing and get started
quicker.

Yann LeCun (Courant Institute, NYU) and Corinna Cortes (Google Labs, New York) hold the
copyright of MNIST dataset, which is a derivative work from original NIST datasets.

Keras is a high-level neural network API focused on user friendliness, fast prototyping,
modularity and extensibility.

Our pixel vector serves as the input. Then come two hidden 512-node layers, with enough model
complexity for recognizing digits. For the multi-class classification we add another densely-
connected (or fully-connected) layer for the 10 different output classes.
For this network architecture we can use the Keras Sequential model. We can stack layers using
the .add() method.

When adding the first layer in the Sequential model we need to specify the input shape so Keras
can create the appropriate matrices. For all remaining layers the shape is inferred automatically.
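A minimal sketch of that architecture with the Keras Sequential model; the activation choices and optimizer are illustrative assumptions (the 28x28 images are assumed to be flattened into 784-element pixel vectors):

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(512, activation="relu", input_shape=(784,)))  # first layer: input shape given
model.add(Dense(512, activation="relu"))                      # shape inferred from here on
model.add(Dense(10, activation="softmax"))                    # 10 output classes

model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])
```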

As expected, the pixel values range from 0 to 255: the background majority close to 0, and those
close to 255 representing the digit.

Normalizing the input data helps to speed up the training. Also, it reduces the chance of getting
stuck in local optima, since we're using stochastic gradient descent to find the optimal weights
for the network.

We used Keras with a TensorFlow backend on a GPU-enabled server to train a neural network to
recognize handwritten digits in under 20 seconds of training time, all without having to spin up
any compute instances.
Conclusions
