
INSTITUTE OF ENGINEERING AND TECHNOLOGY,

LUCKNOW

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SPECIAL LAB (KCS-751)

SUBMITTED BY-

HIMANI JAYAS
1805213023
IT 4TH YEAR
EXPERIMENT – 01

OBJECTIVE:
For a given network of cities, find an optimal path to reach from a given source
city to any other destination city using an admissible heuristic.

THEORY:
Heuristics: The heuristic function h(n) tells A* an estimate of the minimum cost
from any vertex n to the goal. It’s important to choose a good heuristic function.
The heuristic can be used to control A*’s behavior.
• At one extreme, if h(n) is 0, then only g(n) plays a role, and A* turns into
Dijkstra's Algorithm, which is guaranteed to find a shortest path.
• If h(n) is always lower than (or equal to) the cost of moving from n to the
goal, then A* is guaranteed to find a shortest path. The lower h(n) is, the
more nodes A* expands, making it slower.
• If h(n) is exactly equal to the cost of moving from n to the goal, then A* will
only follow the best path and never expand anything else, making it very
fast. Although you can't make this happen in all cases, you can make it
exact in some special cases. It's nice to know that given perfect
information, A* will behave perfectly.
• If h(n) is sometimes greater than the cost of moving from n to the goal,
then A* is not guaranteed to find a shortest path, but it can run faster.
• At the other extreme, if h(n) is very high relative to g(n), then only h(n)
plays a role, and A* turns into Greedy Best-First-Search.

So we have an interesting situation in that we can decide what we want to get
out of A*. With 100% accurate estimates, we'll get shortest paths really quickly. If
we're too low, then we'll continue to get shortest paths, but it'll slow down. If
we're too high, then we give up shortest paths, but A* will run faster.
PROCEDURE:
PROCEDURE:
1. Put the start node s on a list, called OPEN, of unexpanded nodes.
2. If OPEN is empty, exit with failure; no solution exists.
3. Remove from OPEN a node n at which f is minimum (break ties arbitrarily),
and place it on a list called CLOSED to be used for expanded nodes.
4. If n is a goal node, exit successfully with the solution obtained by tracing the
path along the pointers from the goal back to s.
5. Otherwise expand node n, generating all its successors with pointers back to n.
6. For every successor n' of n:
a. Calculate f(n').
b. If n' was neither on OPEN nor on CLOSED, add it to OPEN. Attach a pointer from
n' back to n. Assign the newly computed f(n') to node n'.
c. If n' already resided on OPEN or CLOSED, compare the newly computed
f(n') with the value previously assigned to n'.
If the old value is lower, discard the newly generated node. If the new value is
lower, substitute it for the old (n' now points back to n instead of to its previous
predecessor). If the matching node n' resides on CLOSED, move it back to OPEN.
7. Go to step 2.
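The following Python sketch implements this procedure for a small city network.
The graph, its edge costs and the straight-line-distance table h are hypothetical
values invented for illustration; h is admissible because it never overestimates
the true remaining cost.

import heapq, itertools

def a_star(graph, h, start, goal):
    # OPEN is a priority queue ordered by f = g + h; CLOSED holds expanded nodes.
    counter = itertools.count()   # tie-breaker so the heap never compares nodes
    open_heap = [(h[start], 0, next(counter), start, None)]
    parents, g_cost, closed = {}, {start: 0}, set()
    while open_heap:                       # step 2: OPEN empty means failure
        f, g, _, n, parent = heapq.heappop(open_heap)
        if n in closed:
            continue
        parents[n] = parent
        if n == goal:                      # step 4: trace pointers back to s
            path = [n]
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return list(reversed(path)), g
        closed.add(n)
        for succ, cost in graph[n].items():          # step 5: expand n
            g2 = g + cost
            if g2 < g_cost.get(succ, float('inf')):  # step 6c: keep the lower f
                g_cost[succ] = g2
                heapq.heappush(open_heap, (g2 + h[succ], g2, next(counter), succ, n))
    return None, float('inf')

# Hypothetical city network: edge costs are road distances; h holds straight-line
# distances to the goal city D.
graph = {'S': {'A': 4, 'B': 2}, 'A': {'D': 5}, 'B': {'A': 1, 'D': 9}, 'D': {}}
h = {'S': 6, 'A': 4, 'B': 5, 'D': 0}
print(a_star(graph, h, 'S', 'D'))          # -> (['S', 'B', 'A', 'D'], 8)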
CONCLUSION:
When h is consistent, the f values of nodes expanded by A* are never decreasing.
When A* selected n for expansion, it had already found the shortest path to it.
When h is consistent, every node is expanded at most once. Normally the
heuristics we encounter are consistent:
– the number of misplaced tiles
– Manhattan distance
– straight-line distance
EXPERIMENT – 02

OBJECTIVE:
Solve the weather problem to predict the possibility of a rain happening under
known parameters for e.g., temperature, humidity, wind flow, sunny or cloudy
etc. using Bayesian Learning.

THEORY:
The basic idea of Bayesian networks (BNs) is to reproduce the most
important dependencies and independencies among a set of variables in a
graphical form (a directed acyclic graph) which is easy to understand and
interpret. Let us consider the subset of climatic stations shown in the graph in
the Figure, where the variables (rainfall) are represented pictorially by a set of
nodes, one node for each variable (for clarity of exposition, the set of nodes is
denoted {y1, ..., yn}). These nodes are connected by arrows, which represent a
cause and effect relationship. That is, if there is an arrow from node yi to node yj,
we say that yi is a cause of yj, or equivalently, yj is an effect of yi. Another popular
terminology is to say that yi is a parent of yj or yj is a child of yi. For example, in
the Figure, the nodes Amieva and Proaza are children of Gijon and Rioseco (the
set of parents of a node yi is denoted by πi).

Directed graphs provide a simple definition of independence (d-separation) based
on the existence or not of certain paths between the variables.
The dependency/independency structure displayed by an acyclic directed graph
can also be expressed in terms of the Joint Probability Distribution (JPD),
factorized as a product of several conditional distributions as follows:

Pr(y1, y2, ..., yn) = ∏_{i=1}^{n} P(yi | πi)

Therefore, the independencies from the graph are easily translated to the
probabilistic model in a sound form. For instance, the JPD of a BN defined by the
graph given in the Figure requires the specification of 100 conditional probability
tables, one for each variable conditioned on its parents' set. Hereafter we shall
consider rainfall discretized into three different states (0 = "no rain", 1 = "weak
rain", 2 = "heavy rain"), associated with the thresholds 0, 2, and 10 mm,
respectively.
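As a concrete illustration of this factorization, the following Python sketch (a
hypothetical three-station chain y1 → y2 → y3 with the three rain states above;
all probability tables are invented for illustration) builds the JPD as the product
of the conditionals and answers a query by enumeration:

import numpy as np

p_y1 = np.array([0.7, 0.2, 0.1])           # Pr(y1) over states 0/1/2
p_y2_y1 = np.array([[0.8, 0.15, 0.05],     # Pr(y2 | y1); rows index y1
                    [0.3, 0.5, 0.2],
                    [0.1, 0.3, 0.6]])
p_y3_y2 = p_y2_y1                          # Pr(y3 | y2), reused for brevity

# JPD as the product of conditionals: Pr(y1,y2,y3) = Pr(y1) P(y2|y1) P(y3|y2)
jpd = p_y1[:, None, None] * p_y2_y1[:, :, None] * p_y3_y2[None, :, :]
assert np.isclose(jpd.sum(), 1.0)          # a valid joint distribution

# Inference by enumeration: Pr(y3 | y1 = 2), i.e. heavy rain observed at y1
evidence = jpd[2]                          # slice the joint on y1 = 2
print(evidence.sum(axis=0) / evidence.sum())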

PROCEDURE:

1) Learning Bayesian Networks from Data


In addition to the graph structure, a BN requires that we specify the conditional
probability of each node given its parents. However, in many practical problems,
we know neither the complete topology of the graph nor some of the required
probabilities. For this reason, several methods have recently been introduced for
learning the graphical structure (structure learning) and estimating probabilities
(parametric learning) from data. A learning algorithm consists of two parts:

1. A quality measure, which is used for computing the quality of the candidate
BNs. This is a global measure, since it measures both the quality of the graphical
structure and the quality of the estimated parameters.

2. A search algorithm, which is used to efficiently search the space of possible BNs
to find the one with the highest quality. Note that the number of all possible
networks is enormous, even for a small number of variables, and therefore the
search space is huge. Among the different quality measures proposed in the
literature, the basic idea of Bayesian quality measures is to assign to every BN a
quality value that is a function of the posterior probability distribution of the
available data D = {y^t_1, ..., y^t_100} (with the index t running daily from 1979
to 1993), given the BN (M, θ) with network structure M and the corresponding
estimated probabilities θ. The posterior probability distribution p(M, θ | D) is
calculated, by Bayes' theorem, as

p(M, θ | D) ∝ p(D | M, θ) p(θ | M) p(M)

Geiger and Heckerman consider multinomial networks and assume certain
hypotheses about the prior distributions of the parameters, leading to the quality
measure

Q(M, D) = log p(M) + Σ_{i=1}^{n} Σ_{k=1}^{s_i} [ log( Γ(η_ik) / Γ(η_ik + N_ik) )
          + Σ_{j=1}^{r_i} log( Γ(η_ijk + N_ijk) / Γ(η_ijk) ) ]

where n is the number of variables, ri is the cardinality of the i-th variable, si the
number of realizations of the parent set πi, ηijk are the "a priori" Dirichlet
hyper-parameters for the conditional distribution of node i (with η_ik = Σ_j η_ijk),
Nijk is the number of realizations in the database consistent with yi = j and
πi = k, Nik is the number of realizations in the database consistent with πi = k,
and Γ is the gamma function.

2) Inference-
Once a model describing the relationships among the set of variables has been
selected, it can then be used to answer queries when evidence becomes
available.

3) Validation of the Bayesian Network Forecast Model-


To check the quality of a BN in a simple case, we shall apply this methodology to a
nowcasting problem. In this case we are given a forecast at a given subset of
stations and we need to infer a prediction for the remaining stations in the
network. To this aim, consider that we are given predictions at the five stations of
the primary network. These predictions shall be plugged into the network as
evidence, obtaining the probabilities for the remaining stations in the secondary
network.

4) Connecting With Numerical Atmospheric Models-


Since we are interested in rainfall forecasts, we shall use the gridded forecasts of
total precipitation given by the operational ECMWF model (these values are
obtained by adding both the convective and the large-scale precipitation
outputs). The forecasts are obtained 24 hours ahead; therefore, they give a
numeric estimation of the future precipitation pattern (one day ahead) on a
coarse-grained resolution grid.

OUTPUT:

Bayesian network of precipitation grid points and local precipitation at the
network of local stations.

CONCLUSION:
We have used Bayesian network learning and shown its applicability to local
weather forecasting and downscaling. The preliminary results presented show
how such models can be built and how they can be used for performing inference.
EXPERIMENT – 03

OBJECTIVE:
Solve the problem of human recognition from their faces using machine learning
techniques.

THEORY:
Let us introduce a new benchmark data set of face images with variable makeup,
hairstyles and occlusions, named BookClub artistic makeup data, and then
examine the performance of the ANNs under different conditions. Makeup and
other occlusions can be used not only to disguise a person's identity from the
ANN algorithms, but also to spoof a wrong identification.

ANN Algorithm:
Artificial Neural Networks (ANNs) are capable of learning patterns of interest
from data in the presence of variations. An Artificial Neural Network is a model in
the field of artificial intelligence that attempts to mimic the network of neurons
that makes up the human brain, so that computers can understand things and
make decisions in a human-like manner.
An Artificial Neural Network primarily consists of three layers:
• Input Layer
• Hidden Layer
• Output Layer

PROCEDURE:
1. The images used in this experiment are kept coloured, downsized to a
dimension of 48x48 pixels, and compressed into JPEG format.
2. The downsizing is done due to computational restrictions, to keep processing
times reasonable. However, observations made on the small-size images are
extendable to larger sizes.
3. For the computational experiments, the 'Keras' library with the TensorFlow
back-end was used.
4. The ANN consists of four sequential groups of layers: Gaussian noise,
convolution with ReLU activation functions, normalization, pooling and dropout
layers.
5. It is topped with fully connected layers, the softmax activation function on the
last layer and a cross-entropy loss function. The "Adam" learning algorithm with a
0.001 learning-rate coefficient, mini-batch size 32 and 100 epochs is used.
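The following Keras sketch mirrors steps 4 and 5. The filter counts, noise level
and the number of subjects (num_classes) are assumptions made for illustration,
not the exact configuration of the experiment; loading of the BookClub images is
omitted.

# A minimal sketch of the ANN described above; filter counts and num_classes
# are illustrative assumptions, not the paper's exact configuration.
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 21  # hypothetical number of subjects

model = keras.Sequential([
    keras.Input(shape=(48, 48, 3)),
    layers.GaussianNoise(0.01),
    layers.Conv2D(32, 3, activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Dropout(0.25),
    layers.Conv2D(64, 3, activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes, activation='softmax'),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=32, epochs=100)  # data loading omitted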

OUTPUT:

CONCLUSION: Despite the small size the images were scaled to and the
not-very-deep ANN, the mean accuracy of face recognition for the model trained
on samples from all photo-sessions of all subjects is quite high at 92%, and higher
(up to 99.9%).
EXPERIMENT – 04

OBJECTIVE:
Classify the objects using deep learning techniques.

THEORY:
Image classification involves assigning a class label to an image, whereas object
localization involves drawing a bounding box around one or more objects in an
image. Object detection is more challenging: it combines these two tasks,
drawing a bounding box around each object of interest in the image and
assigning it a class label. Together, all of these problems are referred to as object
recognition.

• Image Classification: Predict the type or class of an object in an image.
o Input: An image with a single object, such as a photograph.
o Output: A class label (e.g. one or more integers that are mapped to class
labels).

• Object Localization: Locate the presence of objects in an image and indicate
their location with a bounding box.
o Input: An image with one or more objects, such as a photograph.
o Output: One or more bounding boxes (e.g. defined by a point, width, and
height).

• Object Detection: Locate the presence of objects with a bounding box and the
types or classes of the located objects in an image.
o Input: An image with one or more objects, such as a photograph.
o Output: One or more bounding boxes (e.g. defined by a point, width, and
height), and a class label for each bounding box.
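As an illustration of the detection task defined above, here is a short sketch using
a pretrained detector from torchvision (assuming torchvision 0.13 or later; the
model choice, the image file name and the score threshold are illustrative
assumptions, not part of the original experiment):

# A sketch of object detection with a pretrained torchvision model; the model
# and threshold are illustrative choices, not the experiment's prescribed setup.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

model = fasterrcnn_resnet50_fpn(weights='DEFAULT').eval()
img = convert_image_dtype(read_image('photo.jpg'), torch.float)  # hypothetical file
with torch.no_grad():
    pred = model([img])[0]            # boxes, labels, scores for one image
for box, label, score in zip(pred['boxes'], pred['labels'], pred['scores']):
    if score > 0.8:                   # keep confident detections only
        print(label.item(), score.item(), box.tolist())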
CONCLUSION:
Object detection can be used in many areas to reduce human effort and increase
the efficiency of processes in various fields. Object detection, as well as deep
learning, is an area that will bloom in the future and make its presence felt
across numerous fields. There is a lot of scope in these fields and also many
opportunities for improvement.
EXPERIMENT – 05

OBJECTIVE:
Validate the principles of transfer learning for solving any real-life
classification/recognition problem.

THEORY:
Humans have an inherent ability to transfer knowledge across tasks. What we
acquire as knowledge while learning about one task, we utilize in the same way to
solve related tasks. The more related the tasks, the easier it is for us to transfer,
or cross-utilize our knowledge. Some simple examples would be,
Know how to ride a motorbike ⮫ Learn how to ride a car
Transfer learning, as we have seen so far, is having the ability to utilize existing
knowledge from the source learner in the target task. During the process of
transfer learning, the following three important questions must be answered:

What to transfer: This is the first and the most important step in the whole
process. We try to seek answers about which part of the knowledge can be
transferred from the source to the target in order to improve the performance of
the target task. When trying to answer this question, we try to identify which
portion of knowledge is source-specific and what is common between the source
and the target.

When to transfer: There can be scenarios where transferring knowledge for the
sake of it may make matters worse than improving anything (also known as
negative transfer). We should aim at utilizing transfer learning to improve target
task performance/results and not degrade them. We need to be careful about
when to transfer and when not to.

How to transfer: Once what and when have been answered, we can proceed
towards identifying ways of actually transferring the knowledge across
domains/tasks. This involves changes to existing algorithms and different
techniques, which we will cover in later sections of this article. Also, specific case
studies are lined up in the end for a better understanding of how to transfer.
Image Classification with a Data Availability Constraint
The dataset that we will be using comes from the very popular Dog vs Cat
Challenge, where our primary objective is to build a deep learning model that can
successfully recognize and categorize images as either a cat or a dog.

Creating Datasets:
import glob
import numpy as np
import os
import shutil

np.random.seed(42)
files = glob.glob('train/*')
cat_files = [fn for fn in files if 'cat' in fn]
dog_files = [fn for fn in files if 'dog' in fn]
len(cat_files), len(dog_files)
cat_train = np.random.choice(cat_files, size=1500, replace=False)
dog_train = np.random.choice(dog_files, size=1500, replace=False)
cat_files = list(set(cat_files) - set(cat_train))
dog_files = list(set(dog_files) - set(dog_train))
cat_val = np.random.choice(cat_files, size=500, replace=False)
dog_val = np.random.choice(dog_files, size=500, replace=False)
cat_files = list(set(cat_files) - set(cat_val))
dog_files = list(set(dog_files) - set(dog_val))
cat_test = np.random.choice(cat_files, size=500, replace=False)
dog_test = np.random.choice(dog_files, size=500, replace=False)
print('Cat datasets:', cat_train.shape, cat_val.shape, cat_test.shape)

Writing on disk
train_dir = 'training_data'
val_dir = 'validation_data'
test_dir = 'test_data'
train_files = np.concatenate([cat_train, dog_train])
validate_files = np.concatenate([cat_val, dog_val])
test_files = np.concatenate([cat_test, dog_test])
os.mkdir(train_dir) if not os.path.isdir(train_dir) else None
os.mkdir(val_dir) if not os.path.isdir(val_dir) else None
os.mkdir(test_dir) if not os.path.isdir(test_dir) else None
for fn in train_files:
    shutil.copy(fn, train_dir)
for fn in validate_files:
    shutil.copy(fn, val_dir)
for fn in test_files:
    shutil.copy(fn, test_dir)

Testing on CNN Model -

import glob
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img
%matplotlib inline

IMG_DIM = (150, 150)
train_files = glob.glob('training_data/*')
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs)
train_labels = [fn.split('\\')[1].split('.')[0].strip() for fn in train_files]
validation_files = glob.glob('validation_data/*')
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs)
validation_labels = [fn.split('\\')[1].split('.')[0].strip() for fn in validation_files]
print('Train dataset shape:', train_imgs.shape)
print('Validation dataset shape:', validation_imgs.shape)

CONCLUSION
We can clearly see that we have 3000 training images and 1000 validation images.
Each image is of size 150 x 150 and has three channels for red, green, and blue
(RGB), hence giving each image the (150, 150, 3) dimensions. We will now scale
each image with pixel values between (0, 255) to values between (0, 1) because
deep learning models work really well with small input values.
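A minimal sketch of that scaling step, assuming the train_imgs and
validation_imgs arrays built in the previous snippet:

# Scale pixel values from (0, 255) to (0, 1); assumes the arrays created above.
train_imgs_scaled = train_imgs.astype('float32') / 255
validation_imgs_scaled = validation_imgs.astype('float32') / 255
print(train_imgs[0].max(), train_imgs_scaled[0].max())  # e.g. 255.0 1.0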
EXPERIMENT – 06

OBJECTIVE:
Write a Program using Cloudsim to create a datacentre having three hosts and
run five cloudlets on it. The cloudlets run in Virtual Machines with different
million instructions per second (MIPS) requirements. The cloudlets will take
different time to complete the execution depending on the requested VM
performance.

THEORY:
CloudSim is the most popular simulation tool used by researchers and developers
nowadays for cloud-related issues in the research field. This manual will ease
your learning by providing simple steps to follow for installing and understanding
this simulation tool. This manual is intentionally prepared to help the research
community working in the Cloud Computing domain.
package org.cloudbus.cloudsim.examples;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

/**
* A simple example showing how to create a datacenter with two hosts and run
* two cloudlets on it.
**/
public class CloudSimExample3 {

/** The cloudlet list. */


private static List<Cloudlet> cloudletList;

/** The vmlist. */


private static List<Vm> vmlist;
/** Creates main() to run this example **/
public static void main(String[] args) {
Log.printLine("Starting CloudSimExample3...");
try {
// First step: Initialize the CloudSim package. It should be called before creating any entities.
int num_user = 1; // number of cloud users
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false; // mean trace events
CloudSim.init(num_user, calendar, trace_flag);

// Second step: Create Datacenters


@SuppressWarnings("unused")
Datacenter datacenter0 = createDatacenter("Datacenter_0");

//Third step: Create Broker


DatacenterBroker broker = createBroker();
int brokerId = broker.getId();

//Fourth step: Create one virtual machine


vmlist = new ArrayList<Vm>();

//VM description
int vmid = 0;
int mips = 250;
long size = 10000; //image size (MB)
int ram = 2048; //vm memory (MB)
long bw = 1000;
int pesNumber = 1; //number of cpus
String vmm = "Xen"; //VMM name

//create two VMs


Vm vm1 = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm, new
CloudletSchedulerTimeShared());

//the second VM will have twice the priority of VM1 and so will receive twice CPU time
vmid++;
Vm vm2 = new Vm(vmid, brokerId, mips * 2, pesNumber, ram, bw, size, vmm,
new CloudletSchedulerTimeShared());

//add the VMs to the vmList


vmlist.add(vm1);
vmlist.add(vm2);

//submit vm list to the broker


broker.submitVmList(vmlist);

//Fifth step: Create two Cloudlets


cloudletList = new ArrayList<Cloudlet>();

//Cloudlet properties
int id = 0;
long length = 40000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();

Cloudlet cloudlet1 = new Cloudlet(id, length, pesNumber, fileSize, outputSize,


utilizationModel, utilizationModel, utilizationModel);

cloudlet1.setUserId(brokerId);
id++;
Cloudlet cloudlet2 = new Cloudlet(id, length, pesNumber, fileSize,
outputSize, utilizationModel, utilizationModel, utilizationModel);
cloudlet2.setUserId(brokerId);

//add the cloudlets to the list


cloudletList.add(cloudlet1);
cloudletList.add(cloudlet2);

//submit cloudlet list to the broker


broker.submitCloudletList(cloudletList);
//bind the cloudlets to the vms. This way, the broker
// will submit the bound cloudlets only to the specific VM
broker.bindCloudletToVm(cloudlet1.getCloudletId(),vm1.getId());
broker.bindCloudletToVm(cloudlet2.getCloudletId(),vm2.getId());

// Sixth step: Starts the simulation


CloudSim.startSimulation();

// Final step: Print results when simulation is over


List<Cloudlet> newList = broker.getCloudletReceivedList();
CloudSim.stopSimulation();

printCloudletList(newList);
Log.printLine("CloudSimExample3 finished!");
{
catch (Exception e) {
e.printStackTrace();
Log.printLine("The simulation has been terminated due to an
unexpected error");
}
}
private static Datacenter createDatacenter(String name){

// Here are the steps needed to create a PowerDatacenter:


// 1. We need to create a list to store
// our machine

List<Host> hostList = new ArrayList<Host>();

// 2. A Machine contains one or more PEs or CPUs/Cores.


// In this example, it will have only one core.
List<Pe> peList = new ArrayList<Pe>();
int mips = 1000;

// 3. Create PEs and add these into a list.


peList.add(new Pe(0, new PeProvisionerSimple(mips))); // need to store Pe id and MIPS Rating
//4. Create Hosts with its id and list of PEs and add them to the list of machines
int hostId=0;
int ram = 2048; //host memory (MB)
long storage = 1000000; //host storage
int bw = 10000;
hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList,
new VmSchedulerTimeShared(peList)
)
); // This is our first machine

//create another machine in the Data center


List<Pe> peList2 = new ArrayList<Pe>();
hostId++;
hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList2,
new VmSchedulerTimeShared(peList2)
)
); // This is our second machine
// 5. Create a DatacenterCharacteristics object that stores the
// properties of a data center: architecture, OS, list of
// Machines, allocation policy: time- or space-shared, time zone
// and its price (G$/Pe time unit).
String arch = "x86"; // system architecture
String os = "Linux"; // operating system
String vmm = "Xen";
double time_zone = 10.0; // time zone this resource located
double cost = 3.0; // the cost of using processing in this resource
double costPerMem = 0.05; // the cost of using memory in this resource
double costPerStorage = 0.001;// the cost of using storage in this resource
double costPerBw = 0.0; // the cost of using bw in this resource
LinkedList<Storage> storageList = new LinkedList<Storage>(); //we are not
adding SAN devices by now
DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
arch, os, vmm, hostList, time_zone, cost, costPerMem, costPerStorage,
costPerBw);

// 6. Finally, we need to create a PowerDatacenter object.


Datacenter datacenter = null;
try {
datacenter = new Datacenter(name, characteristics, new
VmAllocationPolicySimple(hostList), storageList, 0);
} catch (Exception e) {
e.printStackTrace();
}
return datacenter;
}

// We strongly encourage users to develop their own broker policies, to submit
// vms and cloudlets according to the specific rules of the simulated scenario
private static DatacenterBroker createBroker(){
DatacenterBroker broker = null;
try {
broker = new DatacenterBroker("Broker");
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}

/** Prints the Cloudlet objects **/


private static void printCloudletList(List<Cloudlet> list) {
int size = list.size();
Cloudlet cloudlet;

String indent = " ";


Log.printLine();
Log.printLine("========== OUTPUT ==========");
Log.printLine("Cloudlet ID" + indent + "STATUS" + indent + "Data center ID" +
indent + "VM ID" + indent + "Time" + indent + "Start Time" + indent + "Finish
Time");
DecimalFormat dft = new DecimalFormat("###.##");
for (int i = 0; i < size; i++) {
cloudlet = list.get(i);
Log.print(indent + cloudlet.getCloudletId() + indent + indent);
if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS){
Log.print("SUCCESS");
Log.printLine( indent + indent + cloudlet.getResourceId() + indent +
indent + indent + cloudlet.getVmId() +
indent + indent + dft.format(cloudlet.getActualCPUTime()) + indent +
indent + dft.format(cloudlet.getExecStartTime())+
indent + indent + dft.format(cloudlet.getFinishTime()));
}
}
}
}
EXPERIMENT – 07

OBJECTIVE:
Application of Multi-Layer Perceptron on classification Problem

THEORY:
Multi-layer perceptron (MLP) is a class of feed-forward neural network. It
consists of three types of layers: the input layer, the output layer, and the hidden
layers, as shown in the Fig. below.

The input layer receives the input signal to be processed. The required task such
as prediction and classification is performed by the output layer. An arbitrary
number of hidden layers that are placed in between the input and output layer
are the true computational engine of the MLP. As in a feed-forward network, in
an MLP the data flows in the forward direction from the input to the output
layer. The neurons in the MLP are trained with the backpropagation learning
algorithm.
MLPs are designed to approximate any continuous function and can solve
problems that are not linearly separable. The major use cases of MLP are pattern
classification, recognition, prediction, and approximation.

The computations taking place at every neuron in the output and hidden layer are
as follows,
o(x) = G(b(2) + W(2) h(x))        ...(1)
h(x) = Φ(x) = s(b(1) + W(1) x)    ...(2)
with bias vectors b(1), b(2); weight matrices W(1), W(2) and activation functions
G and s. The set of parameters to learn is the set θ = {W(1), b(1), W(2), b(2)}.
Typical choices for s include the tanh function, with
tanh(a) = (e^a − e^(−a))/(e^a + e^(−a)), or the logistic sigmoid function, with
sigmoid(a) = 1/(1 + e^(−a)).

PERCEPTRON FOR BINARY CLASSIFICATION


With this discrete output, controlled by the activation function, the perceptron
can be used as a binary classification model, defining a linear decision boundary.
It finds the separating hyperplane that minimizes the distance between
misclassified points and the decision boundary.

To minimize this distance, Perceptron uses Stochastic Gradient Descent as the
optimization function.
If the data is linearly separable, it is guaranteed that Stochastic Gradient Descent
will converge in a finite number of steps.
The last piece that the Perceptron needs is the activation function, the function
that determines if the neuron will fire or not. Initial Perceptron models used the
sigmoid function, and just by looking at its shape, it makes a lot of sense! The
sigmoid function maps any real input to a value between 0 and 1 and encodes a
non-linear function. The neuron can receive negative numbers as input, and it
will still be able to produce an output between 0 and 1.
A Multilayer Perceptron has input and output layers, and one or more hidden
layers with many neurons stacked together. And while in the Perceptron the
neuron must have an activation function that imposes a threshold, like ReLU or
sigmoid, neurons in a Multilayer Perceptron can use any arbitrary activation
function.
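As a concrete illustration (not part of the original write-up), here is a minimal
scikit-learn sketch applying an MLP to a standard classification problem; the
dataset and hyper-parameters are illustrative choices:

# A minimal MLP classification sketch using scikit-learn; dataset and
# hyper-parameters are illustrative, not prescribed by the experiment.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scaler = StandardScaler().fit(X_train)      # MLPs train better on scaled inputs
clf = MLPClassifier(hidden_layer_sizes=(16,), activation='tanh',
                    max_iter=2000, random_state=42)
clf.fit(scaler.transform(X_train), y_train)
print('Test accuracy:', clf.score(scaler.transform(X_test), y_test))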

CONCLUSION
Perceptron is a neural network with only one neuron and can only understand
linear relationships between the input and output data provided. However, with
the Multilayer Perceptron, horizons are expanded and the neural network can
have many layers of neurons.
EXPERIMENT – 08

OBJECTIVE:
Application of LSTM in Time Series Prediction / Speech Recognition / COVID-19
Forecasting.

THEORY:
LSTM (Long Short-Term Memory) is a Recurrent Neural Network (RNN) based
architecture that is widely used in natural language processing and time series
forecasting. The LSTM rectifies a huge issue that recurrent neural networks suffer
from: short memory. Using a series of 'gates', each with its own RNN, the LSTM
manages to keep, forget or ignore data points based on a probabilistic model.
LSTMs also help solve exploding and vanishing gradient problems. In simple
terms, these problems are a result of repeated weight adjustments as a neural
network trains. With repeated epochs, gradients become larger or smaller, and
with each adjustment, it becomes easier for the network's gradients to compound
in either direction. This compounding either makes the gradients way too large or
way too small. While exploding and vanishing gradients are huge downsides of
using traditional RNNs, LSTM architecture severely mitigates these issues.
After a prediction is made, it is fed back into the model to predict the next value
in the sequence. With each prediction, some error is introduced into the model.
To avoid exploding gradients, values are ‘squashed’ via (typically) sigmoid & tanh
activation functions prior to gate entrance & output. Below is a diagram of LSTM
architecture.
# Time Series
import numpy as np
import matplotlib.pyplot as plt

def create_series(df, xcol, datecol):
    # Extract one column as a date-indexed series and plot it
    features_considered = [xcol]
    features = df[features_considered]
    features.index = df[datecol]
    features.head()
    features.plot(subplots=True)
    return features

def stationarity_test(X, log_x="Y", return_p=False, print_res=True):
    # Log-transform the series if requested (positive values only)
    if log_x == "Y":
        X = np.log(X[X > 0])

    # Once we have the series as needed we can do the ADF test
    from statsmodels.tsa.stattools import adfuller
    dickey_fuller = adfuller(X)

    if print_res:
        print('ADF Stat is: {}.'.format(dickey_fuller[0]))
        print('P Val is: {}.'.format(dickey_fuller[1]))
        print('Critical Values (Significance Levels): ')
        for key, val in dickey_fuller[4].items():
            print(key, ":", round(val, 3))

    if return_p:
        return dickey_fuller[1]

def difference(X):
    # First-order differencing to help make the series stationary
    diff = X.diff()
    plt.plot(diff)
    plt.show()
    return diff

PROCEDURE:
Before building the model, we create a series and check for stationarity. While
stationarity is not an explicit assumption of LSTM, it does help immensely in
controlling error. A non-stationary series will introduce more errors in predictions
and force errors to compound faster.
We filter out one ‘sequence length’ of data points for later validation. In this case,
60 points.
The data format required for an LSTM is 3-dimensional, with a moving window.
• So the first data point will be the first 60 days of data.
• The second data point is the first 61 days of data but not including the first.
• The third data point is the first 62 days of data but not including the first
and second.
A sketch of this windowing appears below.

The last major step of prep is to scale the data. Here we use a simple min-max
scaler. Our sequence length is 60 days for this part of the code.
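A minimal sketch of the windowing and scaling just described. The series here is
placeholder data; in practice it would be the column produced by create_series
above:

# Build the 3-D (samples, window, features) input the LSTM expects, then scale.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

seq_len = 60
series = np.arange(300, dtype='float32')       # placeholder data for the sketch

scaler = MinMaxScaler()                        # min-max scale to (0, 1)
scaled = scaler.fit_transform(series.reshape(-1, 1))

X, y = [], []
for i in range(seq_len, len(scaled)):
    X.append(scaled[i - seq_len:i])            # window of the previous 60 days
    y.append(scaled[i])                        # value to predict
X, y = np.array(X), np.array(y)
print(X.shape, y.shape)                        # (240, 60, 1) (240, 1)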

CONCLUSION:
Since this article is mainly about building an LSTM, I didn’t discuss many
advantages/disadvantages of using an LSTM over classical methods. I’d like to
offer some guidelines in this conclusion:
Technical Considerations
1. ARIMA (and MA-based models in general) are designed for time series data
while RNN-based models are designed for sequence data. Because of this
distinction, it’s harder to build RNN-based models out of the box.
2. ARIMA models are highly parameterized and due to this, they don’t generalize
well. Using a parameterized ARIMA on a new dataset may not return accurate
results. RNN-based models are non-parametric and are more generalizable.
3. Depending on window size, data, and desired prediction time, LSTM models
can be very computationally expensive. Sometimes they’re not feasible without
powerful cloud computing.
4. It’s good practice to have a ‘no-skill’ model to compare results to. A good start
would be to compare the model results to a model predicting only the mean for
each time step over the period (horizontal line).
EXPERIMENT – 09

OBJECTIVE -
Application of Convolution Neural Network in disease detection such as
pneumonia/covid detection through Chest X-ray/ heart beat classification etc.

ABSTRACT :
CNNs are powerful image-processing, artificial intelligence (AI) models that use
deep learning to perform both generative and descriptive tasks, often using
machine vision that includes image and video recognition, along with
recommender systems and natural language processing (NLP).
A CNN uses a system much like a multilayer perceptron that has been designed
for reduced processing requirements. The layers of a CNN consist of an input
layer, an output layer and a hidden layer that includes multiple convolutional
layers, pooling layers, fully connected layers and normalization layers. The
removal of limitations and increase in efficiency for image processing results in a
system that is far more effective, simpler to train, and specialized for image
processing and natural language processing.

a) Application of Convolution Neural Network in pneumonia detection through
chest X-ray classification

Introduction:
Pneumonia is a lung parenchyma inflammation often caused by pathogenic
microorganisms, physical and chemical factors, immunologic injury and other
pharmaceuticals. There are several popular pneumonia classification methods:
(1) pneumonia is classified as infectious and non-infectious based on different
pathogeneses, in which infectious pneumonia is then classified into bacterial,
viral, mycoplasma, chlamydial pneumonia, and others, while non-infectious
pneumonia is classified as immune-associated pneumonia, aspiration
pneumonia caused by physical and chemical factors, and radiation pneumonia.
(2) Pneumonia is classified as CAP (community-acquired pneumonia), HAP
(hospital-acquired pneumonia) and VAP (ventilator-associated pneumonia)
based on different infection settings, among which CAP accounts for the larger
part. Because of the different range of pathogens, HAP more easily develops
resistance to various antibiotics, making treatment more difficult.
Related Work:
Several methods have been introduced to describe a brief process in
pneumonia detection using chest X-ray images in recent years, especially some
deep learning methods. Deep Learning has been successfully applied to
improve the performance of computer aided diagnosis technology (CAD),
especially in the field of medical imaging [5], image segmentation [6,7] and
image reconstruction [8,9]. In 2017, Rampura et al.

Background:
In the past few decades, machine learning (ML) algorithms have gradually
attracted researchers’ attention. This type of algorithm could take full
advantage of the giant computing power of calculators in images processing
through given algorithms or specified steps. However, traditional ML methods
in classification tasks need to manually design algorithms or manually set
feature extraction layers to classify images

Proposed CNN Model


Figure 4 illustrates the architecture of our proposed model that has been
applied to detect whether the input image shows pneumonia. Figure 5 displays
our model, which contains a total of six layers, where we employed 3 × 3 kernel
convolution layers whose strides are 1 × 1 and whose activation function is
ReLU. After each convolution layer, a 2 × 2 strides kernel operation was
employed as a max-pooling operation to retain the maximum of each sub-region
split according to the strides. Besides, we set several dropout layers to randomly
set weights to zero, aiming to improve the model performance. Then two
densely fully-connected layers followed by a Sigmoid function are utilized to
take full advantage of the features extracted through previous layers, outputting
the possibility of the patient suffering from pneumonia or not.
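A minimal Keras sketch of the architecture just described. The filter counts and
the 150 x 150 grayscale input size are assumptions made for illustration, not the
paper's exact configuration; the chest X-ray data pipeline is omitted:

# A sketch of the described pneumonia CNN; filter counts and input size are
# illustrative assumptions, not the paper's exact configuration.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(150, 150, 1)),
    layers.Conv2D(32, (3, 3), strides=(1, 1), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.2),
    layers.Conv2D(64, (3, 3), strides=(1, 1), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid'),   # probability of pneumonia
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()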
b) Application of Convolution Neural Network in covid detection through heart
beat classification: -
Introduction:-
CNN is used in pattern recognition with superior feature learning capabilities,
being a suitable model to deal with image data. Indeed, CNN is a dominant
architecture of DL for image classification and can rival human accuracies in
many tasks. CNN uses hierarchical layers of tiled convolutional filters to mimic
the effects of human receptive fields on feedforward processing in the early
visual cortex thereby exploiting the local spatial correlations present in images
while developing robustness to natural transformations such as changes of
viewpoint or scale. A CNN-based model generally requires a large set of
training samples to achieve good generalization capabilities. Its basic structure
is represented as a sequence of Convolutional—Pooling—Fully Connected
Layers possibly with other intermediary layers for normalization and/or
dropout.

Network architecture:-
1. Input layer
The input layer basically depends on the dimension of the images. In our
network, all images must have the same dimension presented as a grayscale
(single colour channel) image.
2. Batch Normalization layer.
Batch normalization converts the distribution of the inputs to a standard
normal distribution with mean 0 and variance 1, avoiding the problem of
gradient dispersion and accelerating the training process.
3. Convolutional layer.
Convolutions are the main building blocks of a CNN. Filter kernels are slid over
the image and for each position the dot product of the filter kernel and the
part of the image covered by the kernel is taken. All kernels used in this layer
are 3 × 3 pixels. The chosen activation function of convolutional layers is the
rectified linear unit (ReLU), which is easy to train due to its piecewise linear
and sparse characteristics.
4. Max pooling layer.
Max pooling is a sub-sampling procedure that uses the maximum value of a
window as the output. The size of such a window was chosen as 2 × 2 pixels.
5. Fire layer.
A fire module is comprised of a squeeze convolutional layer (which has only 1 ×
1 filters) feeding into an expand layer that has a mix of 1 × 1 and 3 × 3
convolution filters. The use of a fire layer could reduce training time while still
extracting data characteristics in comparison with dense layers with the same
number of parameters. The layer is represented in Fig 4 in which Input and
Output have the same dimensions.

Proposed model:-
Owing to their self-learning capacity and superior prediction performance, LWL
and SOM models achieve human-like precision in image description and
prediction issues. Our framework aims mainly at providing distinguishing visual
properties and a quick diagnostic system that can be used to classify new
COVID-19 X-rays. This technique can also be useful to clinicians as a treatment
plan that can be used depending on the type of infection and can provide
prompt decisions.
Related Work:-
Real-time reverse transcription-polymerase chain reaction (RT-PCR) is the
primary research technique currently in use for COVID-19 diagnosis. Chest
radiographic images, such as CT images and X-rays, are critical for the early
diagnosis and treatment of the condition. The low sensitivity of RT-PCR (60–
70%) means that symptoms can still be detected by analysing radiographic
images of patients even when negative RT-PCR findings are obtained.

CONCLUSION:-
Within this context, the literature suggests that the diagnosis may be assisted by
the use of data mining methods to classify pneumonia disease in chest X-rays.
However, the issue is much more difficult when we look at chest images of
patients suffering from pneumonia caused by multiple types of pathogens and
attempt to forecast a particular form of pneumonia (COVID-19).
EXPERIMENT – 10

OBJECTIVE:-
Designing new methods for DAG scheduling problem for cloud computing.

ABSTRACT:-
The DAG scheduler is a scheduling layer in Spark which implements
stage-oriented scheduling. It converts a logical execution plan to a physical
execution plan. When an action is called, Spark hands the job directly to the DAG
scheduler, which executes the tasks that are submitted to it.
The objective of DAG scheduling is to minimize the overall program finish-time by
proper allocation of the tasks to the processors and arrangement of the execution
sequencing of the tasks. Scheduling is done in such a manner that the precedence
constraints among the program tasks are preserved. The overall finish-time of a
parallel program is commonly called the schedule length or makespan. Some
variations to this goal have been suggested. For example, some researchers
proposed algorithms to minimize the mean flow-time or mean finish-time, which
is the average of the finish-times of all the program tasks [25], [110]. The
significance of the mean finish-time criterion is that minimizing it in the final
schedule leads to the reduction of the mean number of unfinished tasks at each
point in the schedule. Some other algorithms try to reduce the setup costs of the
parallel processors [159]. We focus on algorithms that minimize the schedule
length.

INTRODUCTION:-
The Cloud is a huge, interconnected system of powerful servers that provides
businesses and individuals with services [1]. The concept of cloud computing
refers to the ability of online users to share resources offered by a service
provider without needing to buy expensive hardware, leveraging the provider's
high-end capabilities [2]. The main goal of the cloud computing model is to allow
users to share resources and data through software as a service (SaaS), platform
as a service (PaaS), and infrastructure as a service (IaaS). As the number of cloud
users has grown in recent years, the number of tasks that must be managed has
increased proportionally, necessitating task scheduling [3]. The proposed
methodology is based on reinforcement learning.
RELATED WORK:-
The task scheduling algorithm's main goal is to ensure that tasks are completed as
efficiently as possible. List scheduling algorithms are used in the task scheduling
process. In list scheduling algorithms, there are two distinct phases: the first
phase entails determining the tasks' priority, and the second phase entails
assigning tasks to the processors in the order determined [3]. They are discussed
as follows. In 2017, Wei et al. [4] proposed a task scheduling algorithm based on
Q-learning and the mutual value function (QS).

Workflow model:-
A directed acyclic graph, G = (V, E), represents an application, with V representing
the set of v tasks and E representing the set of e edges between the tasks. Each
edge (i, j) ∈ E represents a precedence constraint, requiring task i to finish before
task j can begin.

Data is a v × v matrix of communication data, with Data(i, j) indicating the
amount of data to be transmitted from task i to task j. The DAG scheduling
objective: node tasks are assigned to resources in a way that satisfies the
precedence-order constraints and reduces the total time to completion.

Components of proposed algorithm:-


RL, MDP, and the Q-learning algorithm

Proposed Scheduling Algorithm:-


Input: DAG of all tasks.
Output: The makespan.

PROCEDURE:
1: Create DAG for all tasks.
2: Set the gamma parameter and environment rewards in matrix R.
3: Initialize matrix Q to zero.
4: Repeat for each episode.
5: Select an initial state.
6: While the goal state is not reached, do:
7: Select a possible action for the current state.
8: Go to the next state.
9: Get the maximum Q value with Eq. (6).
10: Set the next state as the current state.
11: Update Q(state, action) with Eq. (6).
12: Obtain the task order according to the updated Q-table.
13: Map each task to the processor which has the minimum execution time.
14: Calculate the makespan.
15: Until the makespan no longer changes.
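A minimal Python sketch of the Q-learning core of this procedure (steps 2 to 11).
The reward matrix, gamma and the update rule Q(s, a) = R(s, a) + gamma * max
Q(s', .) are illustrative assumptions standing in for Eq. (6); the DAG is a
hypothetical four-task chain:

# A sketch of the Q-learning core of the procedure; R, gamma and the update
# rule are illustrative assumptions (a 4-task chain DAG, goal state = 3).
import numpy as np

n_states = 4
R = np.full((n_states, n_states), -np.inf)  # -inf marks disallowed transitions
R[0, 1] = R[1, 2] = 0                       # precedence-respecting moves
R[2, 3] = 100                               # reaching the goal task is rewarded
gamma, goal = 0.8, 3

Q = np.zeros((n_states, n_states))
rng = np.random.default_rng(0)
for episode in range(200):
    state = 0
    while state != goal:
        actions = np.flatnonzero(np.isfinite(R[state]))  # possible actions
        action = rng.choice(actions)
        # Q-update: immediate reward plus discounted best future value
        Q[state, action] = R[state, action] + gamma * Q[action].max()
        state = action

order = [0]
while order[-1] != goal:                    # read the task order off the Q-table
    order.append(int(Q[order[-1]].argmax()))
print(order)                                # -> [0, 1, 2, 3]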

CONCLUSION –
Existing scheduling algorithms have focused on time. The main goal of these
schedulers is to reduce the overall makespan of the workflow. Gaps in
current workflow scheduling strategies in cloud environments were
studied in this thesis, and an effective scheduling method for workflow
management in the cloud setting was proposed based on the gap analysis.
It has been determined that the current scheme is effective enough to
make the best use of the available resources. There are two stages to the
algorithm design theory.
