
Capstone Project: Weekly Progress Report Reporting Week: May 2021

Department of Computer Science & Engineering

Capstone Design Project Weekly Progress Report

Project Title DYNAMIC AUTO-SELECTION OF MACHINE LEARNING MODELS IN CLOUD NETWORKS

Team Members Enter name and USN

Reporting Week 24-05-2021 to 30-05-2021

Faculty Supervisor Enter the name of the faculty supervisor of the project

1. Tasks Outlined in Previous Weekly Progress Report (Provide detailed information on the
tasks to be completed in this week).

We completed the Business Understanding (BU) phase for the ML model auto-selection and auto-tuning work. Our target is to fulfil our requirements for cloud network storage: to find out how the dataset is compressed under the ML model, and how data uploading and downloading work against our online cloud dataset.

Analyzing supporting information: we listed the required resources, assumptions, and dependencies for auto-selection; analyzed the associated risks and prepared a mitigation plan; performed an ROI analysis; and selected which algorithm is suitable, along with the success criteria, dataset, etc.

This phase revolves around data gathering, exploration, and comprehension, more commonly known as exploratory data analysis (EDA). Data understanding comprises the following stages: data collection, data properties, data quality, and AWS setup.

Dept. of CSE, EPCET 2020-21 Page 1



2. Progress Made in Reporting Week (Provide detailed information on the progress that you
made in the reporting week. Limit your write-up to no more than two pages)

Model generalization on unseen/unknown data (over-fitting vs. under-fitting): model evaluation is done to verify or validate that the model developed is correct and conforms to reality. After the model is built, we need to check that it works well on actual data, not just on the data from which it was built. The model is evaluated against the business success criteria:

• Accuracy of model

• Performance and execution speed

• Computational cost
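The first two criteria above can be measured together per candidate model. A minimal sketch, assuming a model is a callable over samples; the "flag large transfers" rule is a hypothetical stand-in, not one of our actual models.

```python
import time

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def timed_predict(model_fn, samples):
    """Run a model and report both its predictions and wall-clock time,
    so execution speed can be weighed alongside accuracy."""
    start = time.perf_counter()
    preds = [model_fn(x) for x in samples]
    elapsed = time.perf_counter() - start
    return preds, elapsed

# Hypothetical trivial model: flag large transfers as attacks.
flag_large = lambda x: "DoS" if x > 1000 else "NO-ATTACK"
preds, secs = timed_predict(flag_large, [120, 5000, 80])
print(accuracy(["NO-ATTACK", "DoS", "NO-ATTACK"], preds))  # 1.0
```

Computational cost would additionally track memory and container resource usage, which this sketch omits.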

Maintenance and monitoring: a deployed model should be monitored and maintained in production for any performance changes. If there are major performance issues or outcome degradation, the model should be rebuilt from scratch.
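The monitoring rule above can be sketched as a rolling-accuracy check that flags a model for rebuild when it degrades. The window size and threshold here are illustrative assumptions, not values fixed by our setup.

```python
from collections import deque

class ModelMonitor:
    """Track a deployed model's rolling accuracy and flag it for a
    rebuild when performance drops below a threshold."""

    def __init__(self, window=100, threshold=0.8):  # assumed values
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, correct):
        """Record one prediction outcome (True if it matched reality)."""
        self.window.append(bool(correct))

    def needs_rebuild(self):
        """True when rolling accuracy over the window is below threshold."""
        if not self.window:
            return False
        return sum(self.window) / len(self.window) < self.threshold
```

In production this check would run alongside the deployed container and trigger the rebuild-from-scratch step described above.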

CIAF (Cloud Information Accountability Framework): the overall CIA framework, combining data, users, the logger, and the harmonizer, is sketched for the auto-selection dataset. Each user creates a public/private key pair using Identity-Based Encryption, based on their online ranking level. The user then creates a logger component, a JAR file, to store data items retrieved from cloud network storage (CNS). A major component is the double authentication used by the ML pipeline.

The JAR file includes a set of simple access-control rules specifying whether and how the cloud servers, and possibly other data stakeholders (users, companies), are authorized to access the content itself. We use OpenSSL-based certificates, wherein a trusted certificate authority certifies the CSP. Prediction accuracy for some labels, such as “NO-ATTACK”, “Fuzzers”, “Exploits”, and “DoS”, is still below par for most of the models.
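The access-control rules bundled with the JAR can be illustrated as a simple stakeholder-to-actions mapping. This is a sketch of the idea only, not the actual JAR logic; the stakeholder names and actions are hypothetical.

```python
# Illustrative access-control rules: each stakeholder maps to the set
# of actions it is authorized to perform on the logged content.
ACCESS_RULES = {
    "cloud_server": {"read"},
    "data_owner": {"read", "write", "delete"},
    "auditor": {"read"},
}

def is_authorized(stakeholder, action, rules=ACCESS_RULES):
    """Check whether a stakeholder may perform an action on the content."""
    return action in rules.get(stakeholder, set())

print(is_authorized("cloud_server", "write"))  # False: servers may only read
```

In the real framework these checks sit behind the OpenSSL-based certificate verification, so only a certified CSP ever reaches the rule lookup.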


In order to improve these accuracies, unsupervised algorithms are used. These algorithms find the inherent groupings of the data samples, called clusters, such that the samples have high intra-cluster similarity and low inter-cluster similarity.

 Worm
 Shell-code
 Reconnaissance
 NOATTACK
 Generic
 Fuzzers
 Exploits
 DoS
 Backdoor
 Analysis
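The clustering idea described above can be sketched with a toy one-dimensional k-means: points are grouped so intra-cluster similarity is high and inter-cluster similarity is low. Real runs would use a library implementation over the full feature set; the points and initial centers below are illustrative.

```python
def kmeans_1d(points, centers, iters=10):
    """Simple k-means on scalar features with fixed initial centers."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            # assign each point to its nearest center
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # move each center to the mean of its assigned points
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], [0, 9])
print(centers)  # [2.0, 11.0]
```

The same objective underlies the Gaussian Mixture, K-means, and Birch variants compared later in this report.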
We are researching these accuracy measures for the ML models, and we analyze these techniques to produce graphs for every ML model.

Multi-layer Perceptron (MLP) neural network and Support Vector Machine (SVM) models have been used and compared for the classification of whole-sky, ground-based images of clouds. The presence of such ML models in the auto-selection setup is shown as a critical path.

Time-out Rule: each ML model runs inside a Docker container, which might time out due to network congestion or to a completion-time limit on the ML algorithms. An automated tuning of the parameters of the support vector machine (SVM) classifier for pattern recognition has been presented.
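The time-out rule can be sketched by running each model call against a deadline and treating an overrun as a timeout. The limit values are illustrative; in the real setup the limit is enforced on the Docker container rather than on a thread.

```python
import concurrent.futures
import time

def run_with_timeout(model_fn, sample, limit=1.0):
    """Return the model's prediction, or None if it exceeds the limit."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(model_fn, sample)
        try:
            return future.result(timeout=limit)
        except concurrent.futures.TimeoutError:
            return None  # timed out: candidate for the OFF state

fast = lambda x: x * 2
slow = lambda x: (time.sleep(0.5), x)[1]
print(run_with_timeout(fast, 21))              # 42
print(run_with_timeout(slow, 21, limit=0.1))   # None (timed out)
```

A model that times out repeatedly would be handled like a failing model under the prediction-error rule below.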

We compared the accuracies of the Naive Bayes, Decision Tree, and Multi-layer Perceptron classifiers for the label “NOATTACK” with cluster numbers = {2, 3, 4, 5, 6}.

Prediction Error Rule: Any ML model that makes erroneous predictions for S consecutive
samples is put in the “OFF” state.
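The prediction-error rule above is straightforward to express as a consecutive-error counter. The choice S = 3 in the default is an illustrative assumption.

```python
class ErrorRule:
    """Switch a model to the "OFF" state after S consecutive
    erroneous predictions."""

    def __init__(self, s=3):  # S = 3 is an assumed default
        self.s = s
        self.consecutive = 0
        self.state = "ON"

    def observe(self, correct):
        """Update the error streak with one prediction outcome."""
        self.consecutive = 0 if correct else self.consecutive + 1
        if self.consecutive >= self.s:
            self.state = "OFF"
        return self.state
```

Note that a single correct prediction resets the streak, so only sustained failure turns a model off.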

The data owner could opt to reuse the same key pair for all JARs or create different key pairs for separate JARs. We first elaborate on the automated logging mechanism and then present techniques to guarantee dependability. We leverage the programmable capability of JARs to conduct automated logging.

We studied these supervised techniques:

 Regression
 Logistic Regression
 Classification
 Naive Bayes Classifiers
 K-NN (k nearest neighbors)
 Decision Trees
 Support Vector Machine
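One of the techniques listed, k-NN, is simple enough to sketch directly: classify a sample by majority vote among its k nearest training samples. The one-dimensional features and traffic labels below are illustrative, not drawn from our dataset.

```python
from collections import Counter

def knn_predict(train, x, k=3):
    """Classify x by majority vote among the k nearest training samples.
    `train` is a list of (feature, label) pairs."""
    nearest = sorted(train, key=lambda fl: abs(fl[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [(1, "NO-ATTACK"), (2, "NO-ATTACK"), (3, "NO-ATTACK"),
         (10, "DoS"), (11, "DoS"), (12, "DoS")]
print(knn_predict(train, 2.5))   # NO-ATTACK
print(knn_predict(train, 10.5))  # DoS
```

The other listed techniques (regression, Naive Bayes, decision trees, SVM) follow the same fit-then-predict pattern on the labelled dataset.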

3. Difficulties Encountered in Reporting Week (Provide detailed information on the difficulties


and issues that you encountered in the reporting week. Limit your write-up to no more than
one page)

It was a challenge for us to understand these selection models and to deploy them:

 Gaussian Mixture Model Clustering
 K-means Clustering
 Birch Clustering

Issues found while implementing unsupervised learning: there is no notion of the output during the learning process, so it does not allow estimating or mapping the result of a new sample, and results vary considerably in the presence of outliers. Issues found while implementing supervised learning: it only performs the classification task, it requires a labelled dataset, and it requires a training process.


Computation time is vast for supervised learning. Unwanted data reduces efficiency, and pre-processing of the data is no less than a big challenge.

Cost and time are involved in selecting training data for supervised training datasets.

It can take time to interpret the spectral classes for un-supervised datasets.

Supervised learning can be a complex method in comparison with the unsupervised method. The key reason is that you have to understand the inputs very well and label them in supervised learning. A lot of computation time is needed for training, and if you have dynamic, big, and growing data, you cannot be sure of the labels when predefining the rules; this can be a real challenge.

With unsupervised learning, you cannot get very specific about the definition of the data sorting and the output, because the data used is not labeled and not known in advance. It is the machine's job to label and group the raw data before determining the hidden patterns. The results are less accurate, again because the input data is not known and not labeled by people in advance, which means the machine needs to do this alone. The results of the analysis cannot be ascertained, since there is no prior knowledge in the unsupervised method of machine learning.

4. Tasks to Be Completed in Next Week (Outline the tasks to be completed in the following
week)

We are planning to complete these topics:

 Regression
 Logistic Regression
 Classification
 Naive Bayes Classifiers
 K-NN (k nearest neighbors)
 Decision Trees
 Support Vector Machine


We will also produce graphs for the algorithms above to differentiate between supervised and unsupervised training datasets, following the topics listed.

Individual and overall graphs will be generated for all types of datasets, based on data uploaded from the cloud network security setup.

Individual Contribution:

Name and USN Work done


1. Network cluster node and network cloud
storage from the shiven cloud, plus internal
traffic maintenance; CIA Framework
deployment done.
2. ML model workflow; maintenance of
supervised and unsupervised classification
datasets.
3. Cluster accuracy measurements on the
dataset, with various graphs.
4. Workflow deployment for the 7 algorithms
and creation of separate graphs based on the
online dataset from the online cloud storage.

Guide’s comment/suggestions:
