LUCKNOW
SUBMITTED BY-
HIMANI JAYAS
1805213023
IT 4TH YEAR
EXPERIMENT - 01
OBJECTIVE:
For a given network of cities, find an optimal path to reach from a given source
city to any other destination city using an admissible heuristic.
THEORY:
Heuristics: The heuristic function h(n) tells A* an estimate of the minimum cost
from any vertex n to the goal. It’s important to choose a good heuristic function.
The heuristic can be used to control A*’s behavior.
At one extreme, if h(n) is 0, then only g(n) plays a role, and A* turns into
Dijkstra’s Algorithm, which is guaranteed to find a shortest path.
If h(n) is always lower than (or equal to) the cost of moving from n to the
goal, then A* is guaranteed to find a shortest path. The lower h(n) is, the
more nodes A* expands, making it slower.
If h(n) is exactly equal to the cost of moving from n to the goal, then A* will
only follow the best path and never expand anything else, making it very
fast. Although you can’t make this happen in all cases, you can make it
exact in some special cases. It’s nice to know that given perfect
information, A* will behave perfectly.
If h(n) is sometimes greater than the cost of moving from n to the goal,
then A* is not guaranteed to find a shortest path, but it can run faster.
At the other extreme, if h(n) is very high relative to g(n), then only h(n)
plays a role, and A* turns into Greedy Best-First-Search.
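The behaviour described above can be tried on a small example. The following is a minimal A* sketch in Python, not the program used in this experiment: the four cities A to D, their road distances, and the heuristic values are all hypothetical, and the heuristic is chosen so that it never overestimates the true remaining cost (admissible).

```python
import heapq

def a_star(graph, heuristic, start, goal):
    """Return (cost, path) of a cheapest route from start to goal, or (inf, [])."""
    # Frontier entries are (f = g + h, g, city, path so far).
    frontier = [(heuristic[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, city, path = heapq.heappop(frontier)
        if city == goal:
            return g, path
        if g > best_g.get(city, float("inf")):
            continue  # stale entry: a cheaper route to this city is already known
        for neighbour, distance in graph[city].items():
            new_g = g + distance
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic[neighbour], new_g, neighbour, path + [neighbour]),
                )
    return float("inf"), []

# Hypothetical undirected road network (distances are made up for illustration).
graph = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}
# Hypothetical admissible heuristic: each value is <= the true cost to D.
heuristic = {"A": 7, "B": 4, "C": 5, "D": 0}

cost, path = a_star(graph, heuristic, "A", "D")
print(cost, path)
```

Setting every heuristic value to 0 in this sketch gives the Dijkstra-like behaviour mentioned in the theory, at the price of expanding more cities.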
EXPERIMENT – 02
OBJECTIVE:
Solve the weather problem: predict the probability of rain under
known parameters, e.g. temperature, humidity, wind flow, and sunny or cloudy
conditions, using Bayesian Learning.
THEORY:
The basic idea of Bayesian networks (BNs) is to reproduce the most
important dependencies and independencies among a set of variables in a
graphical form (a directed acyclic graph) which is easy to understand and
interpret. Let us consider the subset of climatic stations shown in the graph in
Figure, where the variables (rainfall) are represented pictorially by a set of nodes;
one node for each variable (for clarity of exposition, the set of nodes is denoted
{y1,.....yn}). These nodes are connected by arrows, which represent a cause and
effect relationship. That is, if there is an arrow from node yi to node yj , we say
that yi is the cause of yj , or equivalently, yj is the effect of yi. Another popular
terminology of this is to say that yi is a parent of yj or yj is a child of yi. For
example, in Figure, the nodes Gijon and Amieva and Proaza are a child of Gijon
and Rioseco (the set of parents of a node yi is denoted by πi).
Therefore, the independencies from the graph are easily translated to the
probabilistic model in a sound form. For instance, the joint probability distribution (JPD) of a BN defined by the
graph given in Figure requires the specification of 100 conditional probability
tables, one for each variable conditioned on its parents’ set. Hereafter we shall
consider rainfall discretized into three different states (0=“no rain”, 1=“weak
rain”, 2=“heavy rain”), associated with the thresholds 0, 2, and 10 mm,
respectively.
PROCEDURE:
1. A quality measure, which is used for computing the quality of the candidate
BNs. This is a global measure, since it measures both the quality of the graphical
structure and the quality of the estimated parameters.
2. A search algorithm, which is used to efficiently search the space of possible BNs
to find the one with highest quality. Note that the number of all possible
networks is huge, even for a small number of variables, and therefore so is the
search space. Among the different quality measures proposed in the literature, the basic
idea of Bayesian quality measures is to assign to every BN a quality value that is a
function of the posterior probability distribution of the available data D = {yt1, …,
yt100} (with the index t running daily from 1979 to 1993), given the BN (M, θ)
with network structure M and the corresponding estimated probabilities θ. The
posterior probability distribution p(M, θ|D) is calculated as follows:
where n is the number of variables, ri is the cardinality of the i-th variable, si the
number of realizations of the parents’ set Πi, ηijk are the “a priori” Dirichlet
hyper-parameters for the conditional distribution of node i, Nijk is the number of
realizations in the database consistent with yi = j and πi = k, Nik is the number of
realizations in the database consistent with πi = k, and Γ is the gamma function.
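Assuming the standard Bayesian Dirichlet (BD) score, which uses exactly the symbols defined above, the posterior can be written in the following form. The structure prior p(M) and the shorthand η_ik (the sum of η_ijk over j) are notational assumptions and are not taken from this report.

```latex
p(M, \theta \mid D) \;\propto\; p(M)\,
\prod_{i=1}^{n} \prod_{k=1}^{s_i}
\frac{\Gamma(\eta_{ik})}{\Gamma(\eta_{ik} + N_{ik})}
\prod_{j=1}^{r_i}
\frac{\Gamma(\eta_{ijk} + N_{ijk})}{\Gamma(\eta_{ijk})},
\qquad \eta_{ik} = \sum_{j=1}^{r_i} \eta_{ijk}
```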
2) Inference-
Once a model describing the relationships among the set of variables has been
selected, it can then be used to answer queries when evidence becomes
available.
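To illustrate what answering a query with evidence means, here is a minimal sketch for a hypothetical two-station network in which rain at station "a" is the parent of rain at station "b". The station names and all probabilities are invented for illustration; only the three rainfall states come from the theory above.

```python
# States: 0 = no rain, 1 = weak rain, 2 = heavy rain.
# Hypothetical prior P(a) and conditional probability table P(b | a).
p_a = [0.6, 0.3, 0.1]
p_b_given_a = [
    [0.80, 0.15, 0.05],  # P(b | a = 0)
    [0.30, 0.50, 0.20],  # P(b | a = 1)
    [0.10, 0.30, 0.60],  # P(b | a = 2)
]

def posterior_a_given_b(b_observed):
    """Return P(a | b = b_observed) via Bayes' rule over the joint P(a, b)."""
    joint = [p_a[a] * p_b_given_a[a][b_observed] for a in range(3)]
    evidence = sum(joint)  # P(b = b_observed)
    return [value / evidence for value in joint]

# Evidence: heavy rain is observed at station b.
posterior = posterior_a_given_b(2)
print(posterior)
```

In this toy case, observing heavy rain at "b" raises the probability of heavy rain at "a" from the prior 0.1 to 0.4.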
OUTPUT:
CONCLUSION:
We have used Bayesian network learning and shown its applicability for local
weather forecasting and downscaling. The preliminary results show how
such models can be built and how they can be used for performing inference.
EXPERIMENT – 03
OBJECTIVE:
Solve the problem of human recognition from their faces using machine learning
techniques.
THEORY:
Let us introduce a new benchmark data set of face images with variable makeup,
hairstyles and occlusions, named BookClub artistic makeup data, and then
examine the performance of the ANNs under different conditions. Makeup and
other occlusions can be used not only to disguise a person's identity from the
ANN algorithms, but also to spoof a wrong identification.
ANN Algorithm:
Artificial Neural Networks (ANNs) are capable of learning patterns of interest from
data in the presence of variations. In the field of artificial intelligence, an
Artificial Neural Network attempts to mimic the network of neurons that makes
up a human brain, so that computers can understand things and
make decisions in a human-like manner.
Artificial Neural Network primarily consists of three layers:
Input Layer
Hidden Layer
Output Layer
PROCEDURE:
1. The images are kept in colour, downsized to 48x48 pixels, and compressed into
JPEG format.
2. The downsizing is done due to computational restrictions, to keep processing
times reasonable. However, observations made on the small images are
extendable to larger sizes.
3. For computational experiments, the ‘Keras’ library with a TensorFlow back-end was
used.
4. The ANN consists of four sequential groups of layers: Gaussian noise,
convolution with ReLU activation functions, normalization, pooling and dropout
layers.
5. It is topped with fully connected layers, a softmax activation function in
the last layer, and a cross-entropy loss function. The "Adam" learning algorithm is used with
a 0.001 coefficient, mini-batch size 32, and 100 epochs.
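Step 5 relies on a softmax output and a cross-entropy loss. The sketch below shows those two computations in plain Python for a single sample; the three logits are made-up values and the function names are illustrative, so this is not the Keras model used in the experiment.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])

# Hypothetical scores of the last fully connected layer for three identities.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(probs, true_class=0)
print(probs, loss)
```

The loss shrinks towards 0 as the probability of the correct identity approaches 1, which is what the optimizer drives during training.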
OUTPUT:
CONCLUSION: Despite the small size the images were scaled to and the not very deep
ANN, the mean accuracy of face recognition for the model trained on the samples
from all photo-sessions of all subjects is quite high at 92%, and higher (up to
99.9%)
EXPERIMENT – 04
OBJECTIVE:
Classify the objects using deep learning techniques.
THEORY:
Image classification involves assigning a class label to an image, whereas object
localization involves drawing a bounding box around one or more objects in an
image. Object detection is more challenging and combines these two tasks and
draws a bounding box around each object of interest in the image and assigns
them a class label. Together, all of these problems are referred to as object
recognition.
Object Detection: Locate the presence of objects with a bounding box and
types or classes of the located objects in an image.
o Input: An image with one or more objects, such as a photograph.
o Output: One or more bounding boxes (e.g. defined by a point, width, and
height), and a class label for each bounding box.
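The output format described above (a point, a width, a height, and a class label per box) can be represented with a small data structure. The sketch below is illustrative only: the `Detection` class and its field names are hypothetical, and intersection-over-union (IoU) is included as a commonly used overlap measure, not something this report defines.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    width: float
    height: float
    label: str     # class assigned to the object inside the box

    def corners(self):
        """Return the box as (x_min, y_min, x_max, y_max)."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)

def iou(a, b):
    """Intersection-over-union of two boxes: 1.0 for identical, 0.0 for disjoint."""
    ax1, ay1, ax2, ay2 = a.corners()
    bx1, by1, bx2, by2 = b.corners()
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0

predicted = Detection(0, 0, 2, 2, "cat")
reference = Detection(1, 1, 2, 2, "cat")
print(iou(predicted, reference))
```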
CONCLUSION:
Object detection can be used in many areas to reduce human effort and increase
the efficiency of processes in various fields. Object detection, as well as deep
learning, are areas that will keep growing in the future and make their presence felt
across numerous fields. There is a lot of scope in these fields and also many
opportunities for improvement.
EXPERIMENT – 05
OBJECTIVE:
Validate the principles of transfer learning for solving any real-life
classification/recognition problem.
THEORY:
Humans have an inherent ability to transfer knowledge across tasks. What we
acquire as knowledge while learning about one task, we utilize in the same way to
solve related tasks. The more related the tasks, the easier it is for us to transfer,
or cross-utilize our knowledge. Some simple examples would be,
Know how to ride a motorbike ⮫ Learn how to ride a car
Transfer learning, as we have seen so far, is having the ability to utilize existing
knowledge from the source learner in the target task. During the process of
transfer learning, the following three important questions must be answered:
What to transfer: This is the first and the most important step in the whole
process. We try to seek answers about which part of the knowledge can be
transferred from the source to the target in order to improve the performance of
the target task. When trying to answer this question, we try to identify which
portion of knowledge is source-specific and what is common between the source
and the target.
When to transfer: There can be scenarios where transferring knowledge for the
sake of it may make matters worse than improving anything (also known as
negative transfer). We should aim at utilizing transfer learning to improve target
task performance/results and not degrade them. We need to be careful about
when to transfer and when not to.
How to transfer: Once what and when have been answered, we can proceed
towards identifying ways of actually transferring the knowledge across
domains/tasks. This involves changes to existing algorithms and different
techniques, which we will cover in later sections of this article. Also, specific case
studies are lined up in the end for a better understanding of how to transfer.
Image Classification with a Data Availability Constraint
The dataset that we will be using comes from the very popular Dog vs Cat
Challenge, where our primary objective is to build a deep learning model that can
successfully recognize and categorize images into either a cat or a dog.
Creating Datasets:
import glob
import numpy as np
import os
import shutil

np.random.seed(42)  # fixed seed so the random split is reproducible

# Collect the image paths and separate them by class using the file name
files = glob.glob('train/*')
cat_files = [fn for fn in files if 'cat' in fn]
dog_files = [fn for fn in files if 'dog' in fn]
len(cat_files), len(dog_files)

# Training split: 1500 images per class, removed from the pool afterwards
cat_train = np.random.choice(cat_files, size=1500, replace=False)
dog_train = np.random.choice(dog_files, size=1500, replace=False)
cat_files = list(set(cat_files) - set(cat_train))
dog_files = list(set(dog_files) - set(dog_train))

# Validation split: 500 images per class from the remaining files
cat_val = np.random.choice(cat_files, size=500, replace=False)
dog_val = np.random.choice(dog_files, size=500, replace=False)
cat_files = list(set(cat_files) - set(cat_val))
dog_files = list(set(dog_files) - set(dog_val))

# Test split: 500 images per class from what is left
cat_test = np.random.choice(cat_files, size=500, replace=False)
dog_test = np.random.choice(dog_files, size=500, replace=False)
print('Cat datasets:', cat_train.shape, cat_val.shape, cat_test.shape)
Writing on disk
train_dir = 'training_data'
val_dir = 'validation_data'
test_dir = 'test_data'

train_files = np.concatenate([cat_train, dog_train])
validate_files = np.concatenate([cat_val, dog_val])
test_files = np.concatenate([cat_test, dog_test])

# Create the target directories if they do not exist yet
os.makedirs(train_dir, exist_ok=True)
os.makedirs(val_dir, exist_ok=True)
os.makedirs(test_dir, exist_ok=True)

# Copy each split into its own directory
for fn in train_files:
    shutil.copy(fn, train_dir)
for fn in validate_files:
    shutil.copy(fn, val_dir)
for fn in test_files:
    shutil.copy(fn, test_dir)
CONCLUSION
We can clearly see that we have 3000 training images and 1000 validation images.
Each image is of size 150 x 150 and has three channels for red, green, and blue
(RGB), hence giving each image the (150, 150, 3) dimensions. We will now scale
each image with pixel values between (0, 255) to values between (0, 1) because
deep learning models work really well with small input values.
EXPERIMENT – 06
OBJECTIVE:
Write a Program using Cloudsim to create a datacentre having three hosts and
run five cloudlets on it. The cloudlets run in Virtual Machines with different
million instructions per second (MIPS) requirements. The cloudlets will take
different time to complete the execution depending on the requested VM
performance.
THEORY:
The CloudSim simulation tool is one of the most popular simulators used by researchers and
developers for cloud-related issues in the research field. This
manual will ease your learning by providing simple steps to follow for
installing and understanding this simulation tool. This manual is intentionally
prepared to help the research community working in the Cloud Computing
domain.
package org.cloudbus.cloudsim.examples;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;
/**
* A simple example showing how to create
**/
public class CloudSimExample3 {
//VM description
int vmid = 0;
int mips = 250;
long size = 10000; //image size (MB)
int ram = 2048; //vm memory (MB)
long bw = 1000;
int pesNumber = 1; //number of cpus
String vmm = "Xen"; //VMM name
//the second VM will have twice the priority of VM1 and so will receive twice the CPU time
vmid++;
Vm vm2 = new Vm(vmid, brokerId, mips * 2, pesNumber, ram, bw, size, vmm,
new CloudletSchedulerTimeShared());
//Cloudlet properties
int id = 0;
long length = 40000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();
cloudlet1.setUserId(brokerId);
id++;
Cloudlet cloudlet2 = new Cloudlet(id, length, pesNumber, fileSize,
outputSize, utilizationModel, utilizationModel, utilizationModel);
cloudlet2.setUserId(brokerId);
printCloudletList(newList);
Log.printLine("CloudSimExample3 finished!");
}
catch (Exception e) {
e.printStackTrace();
Log.printLine("The simulation has been terminated due to an unexpected error");
}
}
private static Datacenter createDatacenter(String name){
//We strongly encourage users to develop their own broker policies, to submit vms and cloudlets according
//to the specific rules of the simulated scenario
private static DatacenterBroker createBroker(){
DatacenterBroker broker = null;
try {
broker = new DatacenterBroker("Broker");
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}
EXPERIMENT – 07
OBJECTIVE:
Application of Multi-Layer Perceptron to a classification problem
THEORY:
A multi-layer perceptron (MLP) is a type of feed-forward neural network. It
consists of three types of layers—the input layer, output layer, and hidden layer,
as shown in Fig. below.
The input layer receives the input signal to be processed. The required task such
as prediction and classification is performed by the output layer. An arbitrary
number of hidden layers that are placed in between the input and output layer
are the true computational engine of the MLP. As in any feed-forward network,
data in an MLP flows in the forward direction from the input to the output layer. The
neurons in the MLP are trained with the backpropagation learning algorithm.
MLPs are designed to approximate any continuous function and can solve
problems that are not linearly separable. The major use cases of MLP are pattern
classification, recognition, prediction, and approximation.
The computations taking place at every neuron in the output and hidden layer are
as follows,
$o(x) = G(b^{(2)} + W^{(2)} h(x))$ …(1)
$h(x) = \Phi(x) = s(b^{(1)} + W^{(1)} x)$ …(2)
with bias vectors $b^{(1)}$, $b^{(2)}$; weight matrices $W^{(1)}$, $W^{(2)}$ and activation functions G
and s. The set of parameters to learn is the set $\theta = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}\}$. Typical
choices for s include the tanh function, with $\tanh(a) = (e^{a} - e^{-a})/(e^{a} + e^{-a})$, or the
logistic sigmoid function, with $\mathrm{sigmoid}(a) = 1/(1 + e^{-a})$.
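Equations (1) and (2) translate almost line by line into code. The sketch below computes one forward pass in plain Python, assuming s = tanh for the hidden layer and G = softmax for the output layer (softmax is an assumption, since the report does not fix G). The weights, biases and input are made-up values.

```python
import math

def matvec(W, x):
    """Matrix-vector product W x for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def softmax(z):
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(x, W1, b1, W2, b2):
    h = [math.tanh(a) for a in add(b1, matvec(W1, x))]  # Eq. (2) with s = tanh
    o = softmax(add(b2, matvec(W2, h)))                  # Eq. (1) with G = softmax
    return h, o

# Hypothetical parameters for a 2-input, 2-hidden-unit, 2-class network.
W1, b1 = [[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
h, o = mlp_forward([1.0, 1.0], W1, b1, W2, b2)
print(h, o)
```

Training with backpropagation would adjust W1, b1, W2 and b2; only the forward computation is shown here.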
CONCLUSION
A perceptron is a neural network with only one neuron, and can only learn
linear relationships between the input and output data provided. With a
multilayer perceptron, however, horizons are expanded: the neural network can
have many layers of neurons.
EXPERIMENT – 08
OBJECTIVE :
Application of LSTM in Time Series Prediction/Speech recognition / covid -19
forecasting.
THEORY:
LSTM (Long Short-Term Memory) is a Recurrent Neural Network (RNN) based
architecture that is widely used in natural language processing and time series
forecasting. The LSTM rectifies a huge issue that recurrent neural networks suffer
from: short memory. Using a series of ‘gates,’ each with its own learned weights, the LSTM
manages to keep, forget or ignore data points based on a probabilistic model.
LSTMs also help solve exploding and vanishing gradient problems. In simple
terms, these problems are a result of repeated weight adjustments as a neural
network trains. With repeated epochs, gradients become larger or smaller, and
with each adjustment, it becomes easier for the network’s gradients to compound
in either direction. This compounding either makes the gradients way too large or
way too small. While exploding and vanishing gradients are huge downsides of
using traditional RNNs, the LSTM architecture substantially mitigates these issues.
After a prediction is made, it is fed back into the model to predict the next value
in the sequence. With each prediction, some error is introduced into the model.
To avoid exploding gradients, values are ‘squashed’ via (typically) sigmoid & tanh
activation functions prior to gate entrance & output. Below is a diagram of LSTM
architecture.
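# Time Series
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

# NOTE: the function header and the adfuller call are missing from this listing;
# the name and signature below are assumptions inferred from the variables used.
def check_stationarity(X, log_x="N", return_p=False, print_res=True):
    # Optionally log-transform the strictly positive values first
    if log_x == "Y":
        X = np.log(X[X > 0])
    # Augmented Dickey-Fuller test: index 0 is the statistic, 1 the p-value,
    # 4 the dictionary of critical values
    dickey_fuller = adfuller(X)
    if print_res:
        print('ADF Stat is: {}.'.format(dickey_fuller[0]))
        print('P Val is: {}.'.format(dickey_fuller[1]))
        print('Critical Values (Significance Levels): ')
        for key, val in dickey_fuller[4].items():
            print(key, ":", round(val, 3))
    if return_p:
        return dickey_fuller[1]

def difference(X):
    # First-order differencing; X is assumed to be a pandas Series
    diff = X.diff()
    plt.plot(diff)
    plt.show()
    return diff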
PROCEDURE:
Before building the model, we create a series and check for stationarity. While
stationarity is not an explicit assumption of LSTM, it does help immensely in
controlling error. A non-stationary series will introduce more errors in predictions
and force errors to compound faster.
We filter out one ‘sequence length’ of data points for later validation. In this case,
60 points.
The data format required for an LSTM is 3 dimensional, with a moving window.
So the first data point will be the first 60 days of data.
The second data point is the first 61 days of data but not including the first.
The third data point is the first 62 days of data but not including the first
and second.
The last major step of prep is to scale the data. Here we use a simple min-max
scaler. Our sequence length is 60 days for this part of the code.
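The moving window and the min-max scaling described above can be sketched in plain Python as follows. The helper names are hypothetical, and a window of 3 points on a toy series stands in for the 60-day window so the result is easy to read. Each sample is shaped as (time steps, 1 feature), which gives the three-dimensional layout once the samples are stacked.

```python
def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def make_windows(series, seq_len):
    """Build moving-window samples X and next-step targets y.

    X[i] holds seq_len consecutive points (each wrapped as a 1-feature list),
    and y[i] is the point that immediately follows that window.
    """
    count = len(series) - seq_len
    X = [[[v] for v in series[i:i + seq_len]] for i in range(count)]
    y = [series[i + seq_len] for i in range(count)]
    return X, y

series = min_max_scale([10, 12, 14, 16, 18, 20])
X, y = make_windows(series, seq_len=3)
print(len(X), X[0], y[0])
```

The second window drops the first point and adds the next one, exactly as in the 60-day and 61-day description above.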
CONCLUSION:
Since this article is mainly about building an LSTM, I didn’t discuss many
advantages/disadvantages of using an LSTM over classical methods. I’d like to
offer some guidelines in this conclusion:
Technical Considerations
1. ARIMA (and MA-based models in general) are designed for time series data
while RNN-based models are designed for sequence data. Because of this
distinction, it’s harder to build RNN-based models out of the box.
2. ARIMA models are highly parameterized and due to this, they don’t generalize
well. Using a parameterized ARIMA on a new dataset may not return accurate
results. RNN-based models are non-parametric and are more generalizable.
3. Depending on window size, data, and desired prediction time, LSTM models
can be very computationally expensive. Sometimes they’re not feasible without
powerful cloud computing.
4. It’s good practice to have a ‘no-skill’ model to compare results to. A good start
would be to compare the model results to a model predicting only the mean for
each time step over the period (horizontal line).
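The 'no-skill' comparison in point 4 takes only a few lines. The sketch below predicts the training mean for every future step and scores it with mean absolute error; the numbers are invented, and MAE is just one possible metric, chosen here as an assumption.

```python
def mean_baseline(train):
    """No-skill forecast: the mean of the training series."""
    return sum(train) / len(train)

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

train = [10.0, 12.0, 14.0]   # hypothetical history
test = [13.0, 15.0]          # hypothetical held-out values
baseline_forecast = [mean_baseline(train)] * len(test)  # horizontal line
baseline_mae = mae(test, baseline_forecast)
print(baseline_mae)
```

An LSTM forecast is only worth its cost if its error on the same held-out points is clearly below this baseline.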
EXPERIMENT – 09
OBJECTIVE -
Application of Convolution Neural Network in disease detection such as
pneumonia/covid detection through Chest X-ray/ heart beat classification etc.
ABSTRACT :
CNNs are powerful image-processing artificial intelligence (AI) models that use deep
learning to perform both generative and descriptive tasks, often using machine
vision that includes image and video recognition, along with recommender
systems and natural language processing (NLP).
A CNN uses a system much like a multilayer perceptron that has been designed
for reduced processing requirements. The layers of a CNN consist of an input
layer, an output layer and a hidden layer that includes multiple convolutional
layers, pooling layers, fully connected layers and normalization layers. The
removal of limitations and increase in efficiency for image processing results in a
system that is far more effective and simpler to train for image processing
and natural language processing.
Introduction:
Pneumonia is an inflammation of the lung parenchyma often caused by pathogenic
microorganisms, physical and chemical factors, immunologic injury and
pharmaceuticals. There are several popular pneumonia classification
methods: (1) pneumonia is classified as infectious or non-infectious based on
pathogenesis; infectious pneumonia is further classified as
bacterial, viral, mycoplasmal, chlamydial, and others, while
non-infectious pneumonia is classified as immune-associated pneumonia,
aspiration pneumonia caused by physical and chemical factors, and radiation
pneumonia. (2) Pneumonia is classified as CAP (community-acquired
pneumonia), HAP (hospital-acquired pneumonia) and VAP (ventilator-associated
pneumonia) based on different infections, among which CAP
accounts for a larger part. Because of the different range of pathogens, HAP
develops resistance to various antibiotics more easily, making treatment more
difficult.
Related Work:
Several methods for pneumonia detection
using chest X-ray images have been introduced in recent years, especially
deep learning methods. Deep learning has been successfully applied to
improve the performance of computer-aided diagnosis (CAD) technology,
especially in the field of medical imaging [5], image segmentation [6,7] and
image reconstruction [8,9]. In 2017, Rampura et al.
Background:
In the past few decades, machine learning (ML) algorithms have gradually
attracted researchers’ attention. This type of algorithm can take full
advantage of the large computing power of computers in image processing
through given algorithms or specified steps. However, traditional ML methods
for classification tasks require manually designed algorithms or manually set
feature extraction layers to classify images.
Network architecture:-
1. Input layer
The input layer basically depends on the dimension of the images. In our
network, all images must have the same dimension presented as a grayscale
(single colour channel) image.
2. Batch Normalization layer.
Batch normalization converts the distribution of the inputs to a standard
normal distribution with mean 0 and variance 1, avoiding the problem of
gradient dispersion and accelerating the training process.
3. Convolutional layer.
Convolutions are the main building blocks of a CNN. Filter kernels are slid over
the image and for each position the dot product of the filter kernel and the
part of the image covered by the kernel is taken. All kernels used in this layer
are 3 × 3 pixels. The chosen activation function of convolutional layers is the
rectified linear unit (ReLU), which is easy to train due to its piecewise linear
and sparse characteristics.
4. Max pooling layer.
Max pooling is a sub-sampling procedure that uses the maximum value of a
window as the output. The size of such a window was chosen as 2 × 2 pixels.
5. Fire layer.
A fire module is comprised of a squeeze convolutional layer (which has only 1 ×
1 filters) feeding into an expand layer that has a mix of 1 × 1 and 3 × 3
convolution filters. The use of a fire layer could reduce training time while still
extracting data characteristics in comparison with dense layers with the same
number of parameters. The layer is represented in Fig 4 in which Input and
Output have the same dimensions.
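The convolutional, ReLU and max pooling steps listed above can be illustrated on a tiny grayscale image in plain Python. This is a sketch of the operations only, with a made-up 4 × 4 image and a made-up 3 × 3 kernel; it is not the network used in this experiment and it leaves out batch normalization and the fire layer.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take the dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Rectified linear unit: negative responses become 0."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 window."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]

image = [[1, 1, 1, 1] for _ in range(4)]    # hypothetical 4 x 4 grayscale image
kernel = [[1, 1, 1] for _ in range(3)]      # hypothetical 3 x 3 filter
features = max_pool_2x2(relu(conv2d_valid(image, kernel)))
print(features)
```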
Proposed model:-
With their self-learning capacity and superior prediction performance, LWL
and SOM models achieve human-like precision in image description and
prediction issues. Our framework aims mainly at providing distinguishing visual
properties and a quick diagnostic system that can be used to classify new
COVID-19 X-rays. This technique can also be useful to clinicians as a treatment
plan that can be used depending on the type of infection and can provide
prompt decisions.
Related Work:-
Real-time reverse transcription-polymerase chain reaction (RT-PCR) is the
primary research technique currently in use for COVID-19 diagnosis. Chest
radiographic images, such as CT images and X-rays, are critical for the early
diagnosis and treatment of the condition. Because RT-PCR has low sensitivity
(60–70%), symptoms can still be detected by analysing radiographic images of
patients even when the RT-PCR result is negative.
CONCLUSION:-
Within this context, the literature suggests that the diagnosis may be assisted by
the use of data mining methods to classify pneumonia disease in chest X-rays.
However, the issue is much more difficult when we look at chest images of
patients suffering from pneumonia caused by multiple types of pathogens and
attempt to forecast a particular form of pneumonia (COVID-19).
EXPERIMENT – 10
OBJECTIVE:-
Designing new methods for DAG scheduling problem for cloud computing.
ABSTRACT:-
The DAG scheduler is the scheduling layer in Apache Spark that implements stage-oriented scheduling. It
converts a logical execution plan into a physical execution plan. When an action is
called, Spark hands the job straight to the DAG scheduler. It executes the tasks that are
submitted to the scheduler.
The objective of DAG scheduling is to minimize the overall program finish-time by
proper allocation of the tasks to the processors and arrangement of execution
sequencing of the tasks. Scheduling is done in such a manner that the precedence
constraints among the program tasks are preserved. The overall finish-time of a
parallel program is commonly called the schedule length or makespan. Some
variations to this goal have been suggested. For example, some researchers
proposed algorithms to minimize the mean flow-time or mean finish-time, which
is the average of the finish-times of all the program tasks [25], [110]. The
significance of the mean finish-time criterion is that minimizing it in the final
schedule leads to the reduction of the mean number of unfinished tasks at each
point in the schedule. Some other algorithms try to reduce the setup costs of the
parallel processors [159]. We focus on algorithms that minimize the schedule
length.
INTRODUCTION:-
The Cloud is a huge, interconnected system of powerful servers that provides
businesses and individuals with services [1]. The concept (Cloud Computing) refers
to the ability of online users to share resources offered by the service provider
and to leverage the provider's capabilities without needing to buy
expensive hardware [2]. The main goal of the cloud computing model is to allow
users to share resources and data through software as a service (SaaS), platform as a
service (PaaS), and infrastructure as a service (IaaS). As the number of cloud users
has grown in recent years, the number of tasks that must be managed
has increased proportionally, necessitating task scheduling [3]. The methodology is
based on reinforcement learning.
RELATED WORK:-
The task scheduling algorithm's main goal is to ensure that tasks are completed as
efficiently as possible. List scheduling algorithms are used in the task scheduling
process. In list scheduling algorithms, there are two distinct phases. The first
phase entails determining the tasks' priority, and the second phase entails
assigning tasks to the processor in the order determined [3]. They are
discussed as follows. In 2017, Wei et al. [4] proposed a task scheduling
algorithm based on Q-learning and the mutual value function (QS).
Workflow model:-
A directed acyclic graph, G=(V,E), represents an application, with V representing
the set of v tasks and E representing the set of e edges between the tasks. Each
edge (i, j) ∈ E represents a precedence constraint, requiring task i to finish before
task j can begin.
Data is a v×v matrix of communication data, with Data(i, j) indicating the amount of data to
be transmitted from task i to task j. DAG scheduling objective: node tasks are assigned
to resources while satisfying the chronological order constraint, so as to
reduce the total time to completion.
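A precedence-respecting task order for such a graph can be obtained with a topological sort. The sketch below uses Kahn's algorithm on a hypothetical four-task workflow; the task numbering and edges are invented for illustration, and communication costs are ignored.

```python
from collections import deque

def topological_order(num_tasks, edges):
    """Return a task order in which every edge (i, j) has i before j."""
    successors = {t: [] for t in range(num_tasks)}
    indegree = {t: 0 for t in range(num_tasks)}
    for i, j in edges:
        successors[i].append(j)
        indegree[j] += 1
    # Tasks with no unfinished predecessors are ready to run.
    ready = deque(t for t in range(num_tasks) if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != num_tasks:
        raise ValueError("the graph contains a cycle, so it is not a DAG")
    return order

# Hypothetical workflow: task 0 feeds tasks 1 and 2, which both feed task 3.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
order = topological_order(4, edges)
print(order)
```

Any scheduler for the workflow model has to pick among orders of this kind; the priority phase of list scheduling decides which valid order is used.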
PROCEDURE:
1: Create DAG for all tasks.
2: Set gamma parameter, environment rewards in matrix R.
3: Initialize matrix Q to zero.
4: Repeat for each episode.
5: Select an initial state.
6: While the goal state is not reached, do:
7: Select possible actions for the current state.
8: Go to the next state.
9: Get the maximum Q value with Eq. (6).
10: Set the next state as the current state.
11: Update Q(state, action) with Eq. (6).
12: Obtain the task order according to the updated Q-table.
13: Map each task to the processor which has the minimum execution time.
14: Calculate the makespan.
15: Until the makespan no longer changes.
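Steps 2 to 11 follow the familiar tabular Q-learning loop. Eq. (6) is not reproduced in this report, so the sketch below assumes the common deterministic update Q(s, a) = R(s, a) + γ · max Q(s', ·), where taking action a means moving to task a. The reward matrix, the value γ = 0.8, and the function name are hypothetical. Steps 12 to 15 (ordering, processor mapping and makespan) are not covered.

```python
import random

def q_learning(R, goal, gamma=0.8, episodes=500, seed=0):
    """Learn a Q-table from reward matrix R; R[s][a] < 0 marks a forbidden move."""
    rng = random.Random(seed)
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]                        # step 3
    starts = [s for s in range(n) if s != goal]
    for _ in range(episodes):                                # step 4
        state = rng.choice(starts)                           # step 5
        while state != goal:                                 # step 6
            actions = [a for a in range(n) if R[state][a] >= 0]   # step 7
            action = rng.choice(actions)                     # step 8: action = next state
            future = [Q[action][a] for a in range(n) if R[action][a] >= 0]
            best_next = max(future, default=0.0)             # step 9
            Q[state][action] = R[state][action] + gamma * best_next  # step 11
            state = action                                   # step 10
    return Q

# Hypothetical rewards on a 4-task DAG (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3); -1 = no edge.
R = [
    [-1, 0, 0, -1],
    [-1, -1, -1, 100],
    [-1, -1, -1, 50],
    [-1, -1, -1, -1],
]
Q = q_learning(R, goal=3)
print(Q[0])
```

With these made-up rewards the learned Q-values rank task 1 above task 2 as the successor of task 0, which is the kind of priority information step 12 reads from the table.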
CONCLUSION –
Existing scheduling algorithms focus on time. The main goal of these
schedulers is to reduce the overall makespan of the workflow. Gaps in
current workflow scheduling strategies in cloud environments were
studied in this thesis, and an effective scheduling method for workflow
management in the cloud setting was proposed based on the gap analysis.
It has been determined that the current scheme is effective enough to
make the best use of the available resources. There are two stages to the
algorithm design theory.