Machine Learning With Python Report

TRAINING REPORT
ON
MACHINE LEARNING WITH PYTHON
Session 2020-2021
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
CERTIFICATE
This is to certify that the seminar entitled “MACHINE LEARNING” has been
presented by JAYESH GUPTA under my guidance during the academic year
2020.
Submitted to
Ms. Anima Sharma
Head of the Department

Dr. Sanjay Gaur
ACKNOWLEDGEMENT
The training opportunity I had with [ED Apply Pvt. Ltd.] was a great chance
for learning and professional development. I therefore consider myself very
lucky to have been given the opportunity to be a part of it. I am also
grateful for having had the chance to meet so many wonderful people and
professionals who led me through this training period.
Bearing this in mind, I am using this opportunity to express my deepest
gratitude and special thanks to the CEO of [ED Apply Pvt. Ltd.] who, in spite
of being extraordinarily busy with her/his duties, took time out to hear, guide
and keep me on the correct path, allowing me to carry out
I perceive this opportunity as a big milestone in my career development. I
will strive to use the skills and knowledge gained in the best possible way,
and I will continue to work on improving them in order to attain my desired
career objectives. I hope to continue cooperating with all of you in the future.
(Name of student)
PAGE INDEX
SR. NO. TITLE
ABSTRACT
LEARNING OUTCOMES OF TRAINING
1. INTRODUCTION TO MACHINE LEARNING
1.1 WHAT IS MACHINE LEARNING?
1.2 FUTURE SCOPE OF MACHINE LEARNING
1.3 WHY PYTHON PROGRAMMING LANGUAGE?
2. TYPES OF MACHINE LEARNING
2.1 CONCEPTS OF LEARNING
2.2 SUPERVISED LEARNING
2.3 UNSUPERVISED LEARNING
2.4 SEMI-SUPERVISED LEARNING
2.5 REINFORCEMENT LEARNING
2.6 PURPOSE OF MACHINE LEARNING
3. TRAINING A MACHINE LEARNING MODEL
3.1 IMPORTING DATASET
3.2 PRE-PROCESSING DATA
3.3 EVALUATING ALGORITHMS
4. DATA VISUALIZATION
4.1 CONFUSION MATRIX
4.2 ACCURACY, PRECISION AND RECALL
5. PROJECTS
5.1 WINE QUALITY ANALYSIS
5.2 FACE RECOGNITION SYSTEM
6. APPLICATIONS OF MACHINE LEARNING
CONCLUSION
BIBLIOGRAPHY
FIGURE INDEX
FIGURE NO. FIGURE TITLE
1.1 SET REPRESENTATION OF AI
1.2 FEATURES OF PYTHON
2.1 TYPES OF MACHINE LEARNING
3.1 LABEL AND ONE HOT ENCODING
3.2 LINEAR REGRESSION
3.3 LOGISTIC REGRESSION
3.4 RANDOM FOREST
3.5 K-MEANS ALGORITHM
3.6 K-NEAREST NEIGHBOUR
5.1 PROJECT SCREENSHOT 1
5.2 PROJECT SCREENSHOT 2
5.3 PROJECT SCREENSHOT 2
5.4 PROJECT SCREENSHOT 2
5.5 PROJECT SCREENSHOT 2
5.6 PROJECT SCREENSHOT
ABSTRACT
The objective of this briefing is to present an overview of the machine
learning techniques currently in use or under consideration at statistical
agencies worldwide. Section I outlines the main reasons why statistical
agencies should start exploring the use of machine learning techniques,
what machine learning is, what its future scope is, and why Python is the
best-suited language for machine learning. Section II outlines what machine
learning is by comparing a well-known statistical technique (logistic
regression) with a (non-statistical) machine learning counterpart (support
vector machines). Sections III, IV and V discuss current research or
applications of machine learning techniques within the field of official
statistics in the areas of automatic coding, editing and imputation, and
record linkage, respectively. Section VI contains a list of machine learning
applications in official statistics outside of the three areas mentioned above.
LEARNING OUTCOMES OF TRAINING
On completion of this training, I am able to:
• Understand the fundamental issues and challenges of machine learning:
data, model selection, model complexity, etc.
• Analyze the strengths and weaknesses of many popular machine learning
approaches.
• Differentiate between various machine learning algorithms and the
paradigms of supervised and unsupervised learning.
• Design and implement various machine learning algorithms in a range of
real-world applications.
CHAPTER 1
INTRODUCTION TO MACHINE LEARNING
In this chapter, you will learn in detail about the concepts of Python in
machine learning.
1.1 What is Machine Learning?
• Data science, machine learning and artificial intelligence are some of
the top trending topics in the tech world today.
• Machine Learning is a subset of Artificial Intelligence.
Fig. 1.1 Set Representation of AI
• We will try to understand ML using two different definitions.
• Definition 1
Machine learning is a discipline that deals with programming systems so
that they automatically learn and improve with experience.
• Definition 2
A program is said to learn from experience E with respect to some class of
tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.
1.2 Future Scope of Machine Learning
1.2.1 Automotive Industry
The automotive industry is one of the areas where Machine Learning is
excelling by changing the definition of ‘safe’ driving.
1.2.2 Robotics
Robotics is one of the fields that always gains the interest of researchers
as well as the general public.
1.2.3 Quantum Computing
We are still at an infant stage in the field of Machine Learning. There are
a lot of advancements to achieve in this field. One of them that could take
Machine Learning to the next level is Quantum Computing.
1.2.4 Job Scope of ML
The scope of Machine Learning in India, as well as in other parts of the world,
is high in comparison to other career fields when it comes to job opportunities.
According to Gartner, there will be 2.3 million jobs in the field of Artificial
Intelligence and Machine Learning by 2022.
1.3 Why Python programming language?
Fig. 1.2 Features of Python
1.3.1 Simple and consistent
Python offers concise and readable code. While complex algorithms and
versatile workflows stand behind machine learning and AI, Python’s
simplicity allows developers to write reliable systems. Developers get to put
all their effort into solving an ML problem instead of focusing on the technical
nuances of the language.
Additionally, Python is appealing to many developers as it is easy to learn.
Python code is easy for humans to read, which makes it easier to build
models for machine learning.
1.3.2 Extensive selection of libraries and frameworks
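Implementing AI and ML algorithms can be tricky and requires a lot of
time. It’s vital to have a well-structured and well-tested environment to
enable developers to come up with the best coding solutions. Python’s large
collection of libraries and frameworks, such as NumPy, pandas and
scikit-learn, provides such an environment.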
1.3.3 Platform independence
Platform independence refers to a programming language or framework
allowing developers to implement things on one machine and use them on
another machine without any (or with only minimal) changes. One key to
Python’s popularity is that it’s a platform-independent language. Python is
supported by many platforms including Linux, Windows, and macOS. Python
code can be used to create standalone executable programs for most common
operating systems, which means that Python software can be easily distributed
and used on those operating systems without a separately installed Python
interpreter.
1.3.4 Great community and popularity
In the Developer Survey 2018 by Stack Overflow, Python was among the top
10 most popular programming languages, which means that it is relatively
easy to find and hire a development company with the necessary skill set to
build your AI-based project.
CHAPTER 2
TYPES OF MACHINE LEARNING
Fig 2.1 Types of Machine Learning
Machine Learning (ML) is automated learning with little or no human
intervention. It involves programming computers so that they learn from the
available inputs. The main purpose of machine learning is to explore and
construct algorithms that can learn from previous data and make
predictions on new input data.
2.1 Concepts of Learning
Learning is the process of converting experience into expertise or knowledge.
Learning can be broadly classified into three categories, as mentioned below,
based on the nature of the learning data and the interaction between the
learner and the environment.
• Supervised learning
• Unsupervised learning
• Semi-supervised learning
Similarly, there are four categories of machine learning algorithms, as shown
below:
• Supervised learning algorithm
• Unsupervised learning algorithm
• Semi-supervised learning algorithm
• Reinforcement learning algorithm
However, the most commonly used ones are supervised and unsupervised
learning.
2.2 Supervised Learning
Supervised learning is commonly used in real-world applications, such as face
and speech recognition, product or movie recommendations, and sales
forecasting.
Supervised learning can be further classified into two types:
Regression and Classification.
Regression trains on and predicts a continuous-valued response, for example
predicting real estate prices.
Classification attempts to find the appropriate class label, such as analyzing
positive/negative sentiment, male and female persons, benign and malignant
tumors, secured and unsecured loans, etc.
In supervised learning, learning data comes with descriptions, labels, targets
or desired outputs, and the objective is to find a general rule that maps
inputs to outputs. This kind of learning data is called labeled data. The
learned rule is then used to label new data with unknown outputs.
Supervised learning involves building a machine learning model that is based
on labeled samples. For example, if we build a system to estimate the price of
a plot of land or a house based on various features, such as size, location, and
so on, we first need to create a database and label it. We need to teach the
algorithm what features correspond to what prices. Based on this data, the
algorithm will learn how to calculate the price of real estate using the values
of the input features.
Supervised learning deals with learning a function from available training
data. Here, a learning algorithm analyzes the training data and produces a
derived function that can be used for mapping new examples. There are many
supervised learning algorithms such as Logistic Regression, Neural Networks,
Support Vector Machines (SVMs), and Naive Bayes classifiers.
Common examples of supervised learning include classifying e-mails into
spam and not-spam categories, labeling webpages based on their content, and
voice recognition.
2.3 Unsupervised Learning
Unsupervised learning is used to detect anomalies and outliers, such as fraud
or defective equipment, or to group customers with similar behaviors for a
sales campaign. It is the opposite of supervised learning: there is no labeled
data here.
When learning data contains only some indications without any description or
labels, it is up to the coder or the algorithm to find the structure of the
underlying data, to discover hidden patterns, or to determine how to describe
the data. This kind of learning data is called unlabeled data.
Suppose that we have a number of data points, and we want to divide them
into several groups. We may not know exactly what the grouping criteria
would be. So, an unsupervised learning algorithm tries to divide the given
dataset into a certain number of groups in an optimum way.
Unsupervised learning algorithms are extremely powerful tools for analyzing
data and for identifying patterns and trends. They are most commonly used for
clustering similar inputs into logical groups. Unsupervised learning algorithms
include K-means, hierarchical clustering and so on.
2.4 Semi-supervised Learning
If some learning samples are labeled, but some others are not, then it is
semi-supervised learning. It makes use of a large amount of unlabeled data
together with a small amount of labeled data for training. Semi-supervised
learning is applied in cases where it is expensive to acquire a fully labeled
dataset but practical to label a small subset. For example, it often
requires skilled experts to label certain remote sensing images, and lots of
field experiments to locate oil at a particular location, while acquiring
unlabeled data is relatively easy.
2.5 Reinforcement Learning
Here the learner receives feedback from its environment so that the system
adjusts to dynamic conditions in order to achieve a certain objective. The
system evaluates its performance based on the feedback responses and reacts
accordingly. The best-known instances include self-driving cars and AlphaGo,
the Go-playing program.
2.6 Purpose of Machine Learning
Machine learning can be seen as a branch of AI (Artificial Intelligence),
since the ability to change experience into expertise or to detect patterns in
complex data is a mark of human or animal intelligence.
As a field of science, machine learning shares common concepts with other
disciplines such as statistics, information theory, game theory, and
optimization.
As a subfield of information technology, its objective is to program machines
so that they will learn.
However, it should be noted that the purpose of machine learning is not to
build an automated duplication of intelligent behavior, but to use the power
of computers to complement and supplement human intelligence. For example,
machine learning programs can scan and process huge databases, detecting
patterns that are beyond the scope of human perception.
CHAPTER 3
TRAINING A MACHINE LEARNING MODEL
• A machine learning project involves the following steps:
  • Defining a Problem
  • Importing Dataset
  • Pre-processing Data
  • Evaluating Algorithms
  • Improving Results
  • Analyzing Results
3.1 Importing Dataset
• A dataset is a collection of data. In the case of tabular data, a dataset
corresponds to one or more database tables, where every column of a
table represents a particular variable, and each row corresponds to a
given record of the dataset in question.
• A dataset can be imported from various online portals like kaggle.com
or can be built by hand according to the problem statement.
• Generally, data is imported from online sources and then modified
according to the problem statement.
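As a rough illustration of importing tabular data, the sketch below parses a small CSV table using only Python's standard library. The column names and values are hypothetical placeholders standing in for a downloaded file; in practice a library such as pandas would more likely be used.

```python
import csv
import io

# Hypothetical CSV content standing in for a file downloaded from a portal.
RAW_CSV = """size_sqft,city,price
1200,Jaipur,55.0
850,Delhi,72.5
1500,Jaipur,80.0
"""

def load_dataset(text):
    """Parse CSV text into a list of dictionaries, one per record (row)."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)

rows = load_dataset(RAW_CSV)
columns = list(rows[0].keys())                 # each column is one variable
prices = [float(r["price"]) for r in rows]     # CSV values arrive as strings

print(columns)
print(len(rows), "records")
```

Each dictionary corresponds to one row of the table, and each key corresponds to one column.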
3.2 Data Pre-processing
• In the real world, we usually come across lots of raw data which is not
fit to be readily processed by machine learning algorithms.
• We need to preprocess data to convert:
Raw Data → Efficient Data
• Problems that can occur in our imported/made dataset:
1. Missing values
2. Categorical data
3. Unscaled features (feature scaling)
3.2.1 Missing Values
In order to correct missing values, there are two possible ways:
i. We can update the dataset by replacing each NaN value with the mean of
its column.
ii. We can drop every row in which a NaN value is present.
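The two options above can be sketched as follows, using only the standard library. The column values and the function names `fill_with_mean` and `drop_missing` are illustrative assumptions, not part of the training material.

```python
import math
from statistics import mean

def fill_with_mean(values):
    """Option i: replace each NaN with the mean of the non-missing values."""
    present = [v for v in values if not math.isnan(v)]
    col_mean = mean(present)
    return [col_mean if math.isnan(v) else v for v in values]

def drop_missing(rows):
    """Option ii: keep only the rows that contain no NaN value."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]

# Hypothetical column with two missing entries.
ages = [25.0, float("nan"), 31.0, 40.0, float("nan")]
filled = fill_with_mean(ages)

# Hypothetical two-column dataset with one incomplete row.
table = [(25.0, 1.0), (float("nan"), 2.0), (31.0, 3.0)]
kept = drop_missing(table)

print(filled)
print(kept)
```

Replacing with the mean keeps every row, while dropping rows is simpler but discards data.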
3.2.2 Categorical Data
• It means that text data is present in place of numerical data.
NOTE: Most ML algorithms work only on numerical data and not on
text/categorical data.
• So, in order to convert categorical data into numerical data, we have to
follow a two-step process: label encoding followed by one-hot encoding.
Fig 3.1 Label and One Hot Encoding
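A minimal sketch of label encoding followed by one-hot encoding is given below. The city names and the helper names `label_encode` and `one_hot_encode` are assumptions made for illustration; libraries such as scikit-learn provide ready-made encoders for the same purpose.

```python
def label_encode(values):
    """Step 1: map each distinct category to an integer (sorted order)."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values], categories

def one_hot_encode(labels, n_categories):
    """Step 2: turn each integer label into a 0/1 vector with a single 1."""
    return [[1 if i == lab else 0 for i in range(n_categories)]
            for lab in labels]

cities = ["Delhi", "Jaipur", "Delhi", "Mumbai"]   # hypothetical column
labels, categories = label_encode(cities)
one_hot = one_hot_encode(labels, len(categories))

print(labels)
print(one_hot)
```

One-hot encoding avoids implying an order between categories, which plain integer labels would otherwise suggest.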
3.2.3 Feature Scaling
• Sometimes two quantities in a dataset are not in the same range.
• Thus feature scaling is done in order to bring all quantities into the same
range.
• Feature scaling is a method used to normalize the range of independent
variables or features.
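Two common ways of scaling a feature are min-max scaling and standardization. The sketch below shows both with the standard library; the salary values and function names are hypothetical.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

salaries = [20000, 40000, 60000, 100000]   # hypothetical feature column
scaled = min_max_scale(salaries)
z_scores = standardize(salaries)

print(scaled)
print(z_scores)
```

After either transformation, a feature measured in tens of thousands no longer dominates a feature measured in single digits.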
3.3 Evaluating Algorithms
• Multiple machine learning algorithms working on different techniques
are available, and one is adopted based on its accuracy, scalability and
results.
• The following algorithms are commonly used; all of them are supervised
except K-means (3.3.4):
3.3.1 Linear regression
• Simple linear regression is a type of ML algorithm that models the
relationship between a single independent input variable (feature
variable) and a dependent output variable.
• Advantages: Quick to model. Simple to understand. Useful for smaller
datasets that aren’t overly complicated.
• Disadvantages: Difficult to apply to nonlinear data. Tends to be
ineffectual when working with highly complex data.
Fig 3.2 Linear Regression
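As an illustration, simple linear regression with one feature can be fitted with the ordinary least-squares formulas. The sketch below is a minimal standard-library version; the data points are made up so that they lie exactly on the line y = 2x + 1.

```python
from statistics import mean

def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]      # hypothetical data generated from y = 2x + 1
slope, intercept = fit_simple_linear_regression(xs, ys)

def predict(x):
    """Predict the output for a new input using the fitted line."""
    return slope * x + intercept

print(slope, intercept, predict(5.0))
```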
3.3.2 Logistic Regression
• An alternative machine learning algorithm is the logistic model. Despite
its name, this technique is designed for binary classification problems,
that is, problems with two possible outcomes that are affected by one or
more explanatory variables.
• Advantages: Easy to implement and interpret. Well suited to a linearly
separable dataset.
• Disadvantages: Logistic regression assumes a linear relationship between
the independent variables and the log-odds of the outcome.
Fig 3.3 Logistic Regression
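A minimal sketch of logistic regression with a single feature, trained by plain gradient descent, is shown below. The data (hours studied versus pass/fail), the learning rate and the number of epochs are assumptions chosen for illustration only.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical feature
passed = [0, 0, 0, 1, 1, 1]              # hypothetical binary outcome
w, b = fit_logistic(hours, passed)

def predict(x):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

print(predict(1.0), predict(6.0))
```

The model outputs a probability, and the 0.5 threshold turns that probability into one of the two classes.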
3.3.3 Random Forest
• A random forest machine learning algorithm is considered an ensemble
method because it is a collection of hundreds and sometimes thousands
of decision trees.
• The model increases predictive power by combining the decisions of
each decision tree to find an answer.
• The random forest technique is simple, highly accurate and widely used
by engineers.
• Advantages: Applicable to both regression and classification
problems. Efficient on large datasets. Works well with missing data.
• Disadvantages: Not easily interpretable. Can overfit if the data is
noisy. Slower than other models at creating predictions.
Fig 3.4 Random Forest
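The voting idea behind a random forest can be sketched with a heavily simplified stand-in: many one-split trees (decision stumps), each trained on a bootstrap sample of a single feature, combined by majority vote. A real random forest grows full decision trees and also samples features at each split; the data, seed and function names below are assumptions.

```python
import random
from collections import Counter

def fit_stump(points, labels):
    """Find the single threshold that misclassifies the fewest points."""
    best = None
    for t in sorted(set(points)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y
                         for x, y in zip(points, labels))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def fit_forest(points, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(points)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([points[i] for i in idx],
                               [labels[i] for i in idx]))
    return trees

def predict_forest(trees, x):
    """Combine the decisions of all trees by majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

xs = [1, 2, 3, 4, 6, 7, 8, 9]          # hypothetical single feature
ys = [0, 0, 0, 0, 1, 1, 1, 1]          # hypothetical class labels
trees = fit_forest(xs, ys)
print(predict_forest(trees, 1), predict_forest(trees, 9))
```

Individual stumps may be poor, but the vote across many of them is more stable than any single one.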
3.3.4 K-Means
It is a type of unsupervised algorithm which deals with clustering
problems. Its procedure follows a simple and easy way to partition a given
data set into a certain number of clusters (assume k clusters). Data
points inside a cluster are homogeneous and are heterogeneous to peer
groups.
How K-means Forms Clusters
K-means forms clusters in the steps given below:
1. K-means picks k points, one for each cluster, known as centroids.
2. Each data point forms a cluster with the closest centroid, giving k clusters.
3. It finds the centroid of each cluster based on existing cluster members.
Here we have new centroids.
4. As we have new centroids, repeat steps 2 and 3. Find the closest distance
for each data point from the new centroids and associate it with the new k
clusters. Repeat this process until convergence occurs, that is, until the
centroids do not change.
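The steps above can be sketched as follows, using only the standard library (Python 3.8 or later for `math.dist`). The points and the starting centroids are hypothetical; in practice the initial centroids are usually chosen at random.

```python
import math

def closest(point, centroids):
    """Return the index of the centroid nearest to the point."""
    dists = [math.dist(point, c) for c in centroids]
    return dists.index(min(dists))

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until nothing changes."""
    for _ in range(max_iter):
        # Step 2: assign every point to its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[closest(p, centroids)].append(p)
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])  # step 1
print(centroids)
```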
Determination of Value of K
In K-means, we have clusters and each cluster has its own centroid. The sum
of the squared differences between the centroid and the data points within a
cluster constitutes the within-cluster sum of squares for that cluster. When
the sum of squares values for all the clusters are added, the result is the
total within-cluster sum of squares for the cluster solution.
We know that as the number of clusters increases, this value keeps on
decreasing, but if you plot the result you may see that the sum of squared
distances decreases sharply up to some value of k, and then much more slowly
after that. This "elbow" point indicates the optimum number of clusters.
Fig 3.5 K-Means Algorithm
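To illustrate how the total within-cluster sum of squares changes with k, the sketch below runs a one-dimensional K-means for k = 1 to 4 on made-up data that contains three obvious groups. The data, the way starting centroids are picked and the function names are assumptions for illustration.

```python
def k_means_1d(values, centroids, max_iter=100):
    """One-dimensional K-means: assign, update, repeat until stable."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

def total_wcss(centroids, clusters):
    """Total within-cluster sum of squared distances to the centroids."""
    return sum((v - c) ** 2
               for c, cluster in zip(centroids, clusters)
               for v in cluster)

values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]
wcss_by_k = {}
for k in range(1, 5):
    # Hypothetical initialisation: k evenly spaced data points.
    start = [values[i * len(values) // k] for i in range(k)]
    centroids, clusters = k_means_1d(values, start)
    wcss_by_k[k] = total_wcss(centroids, clusters)

print(wcss_by_k)
```

On this data the value falls steeply up to k = 3 and only slightly afterwards, so the elbow suggests three clusters.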
3.3.5 KNN (K-Nearest Neighbours)
K-Nearest Neighbours, KNN for short, is a supervised learning algorithm
used mainly for classification. It is a simple algorithm that stores all
available cases and classifies new cases by a majority vote of its k
neighbours.
The case is assigned to the class that is most common among its K nearest
neighbours, measured by a distance function. These distance functions can be
Euclidean, Manhattan, Minkowski and Hamming distance. The first three
functions are used for continuous variables and the fourth one (Hamming) for
categorical variables.
If K = 1, then the case is simply assigned to the class of its nearest neighbour.
At times, choosing K turns out to be a challenge while performing KNN
modelling.
The algorithm compares the distance between the new case and the stored
cases using a distance function (usually Euclidean), then assigns the new
case to the group of the points closest to it.
You can use KNN for both classification and regression problems. However, it
is more widely used in classification problems in the industry. KNN can easily
be mapped to our real lives.
You will have to note the following points before selecting KNN:
• KNN is computationally expensive.
• Variables should be normalized, else higher-range variables can bias it.
• More work is needed at the pre-processing stage before going for KNN, such
as outlier and noise removal.
Fig 3.6 K-Nearest Neighbour
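A minimal sketch of KNN classification with Euclidean distance and a majority vote is given below (Python 3.8 or later for `math.dist`). The training points, labels and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify the query by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (6.0, 6.5)))
```

Every prediction scans all stored cases, which is why KNN becomes expensive on large datasets.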
CHAPTER 4
DATA VISUALIZATION
• Analyzing how well or badly your model has been trained is a very important
part of Machine Learning.
• Some common terms to be clear about are:
True positives (TP): Predicted positive and actually positive.
False positives (FP): Predicted positive and actually negative.
True negatives (TN): Predicted negative and actually negative.
False negatives (FN): Predicted negative and actually positive.
4.1 Confusion Matrix
Fig 4.1 Confusion Matrix
A confusion matrix is a performance measurement for machine learning
classification problems where the output can be two or more classes. For
binary classification, it is a table with 4 different combinations of
predicted and actual values.
It is extremely useful for measuring Recall, Precision, Specificity, Accuracy
and, most importantly, the AUC-ROC curve.
Let’s understand TP, FP, FN, TN in terms of a pregnancy analogy.
True Positive:
Interpretation: You predicted positive and it’s true. You predicted that a
woman is pregnant and she actually is.
True Negative:
Interpretation: You predicted negative and it’s true. You predicted that a man is
not pregnant and he actually is not.
False Positive: (Type 1 Error)
Interpretation: You predicted positive and it’s false. You predicted that a man
is pregnant but he actually is not.
False Negative: (Type 2 Error)
Interpretation: You predicted negative and it’s false. You predicted that a
woman is not pregnant but she actually is.
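The four cells of a binary confusion matrix can be counted directly from the actual and predicted labels, as in the sketch below. The label lists and the function name `confusion_counts` are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN and FN for a binary classification problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, actually positive
        elif p == positive:
            fp += 1          # predicted positive, actually negative
        elif a == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
print(counts)
```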
4.2 Accuracy, Precision and Recall
Consider a classification task in which a machine learning system observes
tumors and has to predict whether these tumors are benign or malignant.
Accuracy, or the fraction of instances that were classified correctly, is an
obvious measure of the program's performance. While accuracy does measure
the program's performance, it does not distinguish between malignant
tumors that were classified as being benign, and benign tumors that were
classified as being malignant. In some applications, the costs incurred on all
types of errors may be the same. In this problem, however, failing to identify
malignant tumors is a more serious error than classifying benign tumors as
being malignant by mistake.
We can measure each of the possible prediction outcomes to create different
snapshots of the classifier's performance. When the system correctly classifies
a tumor as being malignant, the prediction is called a true positive. When the
system incorrectly classifies a benign tumor as being malignant, the prediction
is a false positive. Similarly, a false negative is an incorrect prediction that
the tumor is benign, and a true negative is a correct prediction that a tumor is
benign. These four outcomes can be used to calculate several common
measures of classification performance, like accuracy, precision, recall and so
on.
Accuracy is calculated with the following formula:
ACC = (TP + TN)/(TP + TN + FP + FN)
Where,
TP is the number of true positives
TN is the number of true negatives
FP is the number of false positives
FN is the number of false negatives.
Precision is the fraction of the tumors that were predicted to be malignant that
are actually malignant. Precision is calculated with the following formula:
PREC = TP/(TP + FP)
Recall is the fraction of malignant tumors that the system identified.
Recall is calculated with the following formula:
R = TP/(TP + FN)
In this example, precision measures the fraction of tumors that were predicted
to be malignant that are actually malignant. Recall measures the fraction of
truly malignant tumors that were detected. The precision and recall measures
could reveal that a classifier with impressive accuracy actually fails to detect
most of the malignant tumors. If most tumors are benign, even a classifier that
never predicts malignancy could have high accuracy. A different classifier
with lower accuracy and higher recall might be better suited to the task, since
it will detect more of the malignant tumors. Many other performance measures
for classification can also be used.
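The formulas above can be applied to two made-up classifiers to see why accuracy alone can mislead. The counts below are hypothetical: 100 tumors, of which 95 are benign and 5 malignant.

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN)/(TP + TN + FP + FN)"""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """PREC = TP/(TP + FP); defined as 0.0 when nothing is predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """R = TP/(TP + FN); defined as 0.0 when there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Classifier A never predicts "malignant".
a = dict(tp=0, tn=95, fp=0, fn=5)
# Classifier B finds 4 of the 5 malignant tumors but raises 10 false alarms.
b = dict(tp=4, tn=85, fp=10, fn=1)

for name, c in (("A", a), ("B", b)):
    print(name,
          accuracy(c["tp"], c["tn"], c["fp"], c["fn"]),
          precision(c["tp"], c["fp"]),
          recall(c["tp"], c["fn"]))
```

Classifier A has the higher accuracy (0.95 against 0.89) yet detects no malignant tumors, while classifier B has a recall of 0.8.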
CHAPTER 5
PROJECTS
5.1 Wine Quality Analysis
Problem Statement:
Using the red wine quality dataset, train a model using a suitable ML
algorithm that can predict the quality of a wine as fine, good or great.
Fig 5.1 Project Screenshot 1
5.2 Face Recognition System
Problem Statement:
Develop a face recognition system which can detect human faces and can also
count the number of faces in the frame.
Fig 5.6 Project Screenshot
CHAPTER 6
APPLICATIONS OF MACHINE LEARNING
Artificial Intelligence (AI) and Machine Learning are everywhere. Chances
are that you are using them without even being aware of it. In Machine
Learning (ML), computers, software, and devices perform via cognition
similar to that of the human brain.
Typical successful applications of machine learning include programs that
decode handwritten text, face recognition, voice recognition, speech
recognition, pattern recognition, spam detection programs, weather
forecasting, stock market analysis and predictions, and so on. This chapter
discusses these applications in detail.
6.1 Virtual Personal Assistants
Siri, Google Now and Alexa are some of the common examples of virtual
personal assistants. These applications assist in finding information when
asked by voice. All that is needed is to activate them and ask questions
such as “What are my appointments for today?” or “What are the
flights from Delhi to New York?”.
6.2 Traffic Congestion Analysis and Predictions
GPS navigation services monitor users’ locations and velocities and use
them to build a map of current traffic. This helps in preventing traffic
congestion. Machine learning in such scenarios helps to estimate the regions
where congestion is likely to occur based on previous records.
6.3 Automated Video Surveillance
Video surveillance systems nowadays are powered by AI, and machine
learning is the technology behind this that makes it possible to detect and
prevent crimes before they occur. They track odd and suspicious behaviour of
people and send alerts to human attendants, who can ultimately help prevent
accidents and crimes.
6.4 Social Media
Facebook continuously monitors the friends that you connect with, your
interests, workplace, or a group that you share with someone etc. Based on
continuous learning, a list of Facebook users is given as friend suggestions.
6.5 Face Recognition
You upload a picture of you with a friend and Facebook instantly recognizes
that friend. Machine learning works at the core of Computer Vision, which is a
technique to extract useful information from images and videos. Pinterest uses
computer vision to identify objects or pins in the images and recommend
similar pins to its users.
6.6 Email Spam and Malware Filtering
Machine learning is being extensively used in spam detection and malware
filtering, and the databases of such spam and malware are continually
updated so that these are handled efficiently.
6.7 Online Customer Support
On many websites nowadays, there is an option to chat with a customer support
representative while users are navigating the site. In most cases, instead
of a real executive, you talk to a chatbot. These bots extract information from
the website and provide it to the customers to assist them. Over a period of
time, the chatbots learn to understand user queries better and serve them
with better answers, and this is made possible by machine learning algorithms.
6.8 Refinement of Search Engine Results
Google and similar search engines use machine learning to improve the
search results for their users. Every time a search is executed, the algorithms
at the backend watch how the users respond to the results. Depending
on the user responses, the algorithms working at the backend improve the
search results.
6.9 Product Recommendations
If a user purchases or searches for a product online, he/she keeps on receiving
emails with shopping suggestions and ads about that product. Product
recommendations are sent to the user based on previous behaviour on a
website/app, past purchases, items liked or added to cart, brand preferences,
etc.
6.10 Detection of Online frauds
Machine learning is used to track monetary fraud online. For example,
PayPal uses ML to prevent money laundering. The company uses a set
of tools that help it compare millions of transactions and distinguish
between legal and illegal transactions taking place between buyers and
sellers.
CONCLUSION
Machine Learning is a technique of training machines to perform the activities
a human brain can do, sometimes faster and better than an average human
being. Today we have seen that machines can beat human champions in
games such as Chess and Go, which are considered very complex. You
have seen that machines can be trained to perform human activities in several
areas and can aid humans in living better lives. Machine Learning can be
Supervised or Unsupervised.
If you have a smaller amount of clearly labelled data for training, opt for
Supervised Learning. Unsupervised Learning is generally better suited to
large data sets that are not labelled. If you have a huge data set easily
available, go for deep learning techniques. You have also learned about
Reinforcement Learning and Deep Reinforcement Learning. You now know
what Neural Networks are, their applications and limitations. Finally, when it
comes to the development of machine learning models of your own, you
looked at the choices of various development languages, IDEs and platforms.
The next thing that you need to do is start learning and practicing each
machine learning technique.
BIBLIOGRAPHY
[1] Tutorials Point e-book on Machine Learning
[2] Machine Learning For Absolute Beginners: A Plain English Introduction
(2nd Edition), book by Oliver Theobald
[3] Machine Learning (in Python and R) For Dummies (1st Edition)
[4] Machine Learning for Hackers, book by Drew Conway and John Myles White
[5] Introduction to Machine Learning with Python: A Guide for Data
Scientists, book by Andreas C. Müller and Sarah Guido
