Amitesh Sharma ML

Medicaps University, Indore
Practical File
Enrollment No. : EN20CS306036

Name of Student : Amitesh Sharma
Department : Computer Science & Engineering
Faculty of : Engineering
Class : B. Tech. CSBS
Year/Sem : III year/ V Sem(Odd)
Course Name : Machine Learning
Course Code : CB3EL01
Faculty Name : Mr. Binod Kumar Mishra
1|Page EN20CS306007 Amitesh Sharma

Table of Contents
S.No  Name of Experiment              Date      Remark

1.    Introduction to Python          22-08-22
2.    NumPy and Pandas                29-08-22
3.    Weka                            05-09-22
4.    R Programming                   12-09-22
5.    Linear Regression Model         19-09-22
6.    Support Vector Machine (SVM)    08-10-22
7.    PCA                             10-10-22
8.    Decision Tree                   17-10-22
PRACTICAL 1
Aim:- Introduction to Python
What is Python?
Python is a popular programming language. It was created by Guido van Rossum, and
released in 1991.
It is used for:
• web development (server-side),
• software development,
• mathematics,
• system scripting.
What can Python do?
• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software
development.
Why Python?
• Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.).
• Python has a simple syntax similar to the English language.
• Python has syntax that allows developers to write programs with fewer lines
than some other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as soon
as it is written. This means that prototyping can be very quick.
• Python can be treated in a procedural way, an object-oriented way or a
functional way.

Python Syntax
Python Variables
Python Lists
Python Tuples
Python Dictionaries
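The topics named above (variables, lists, tuples and dictionaries) can be illustrated with a short sketch. All values used here are made up for illustration and are not taken from the original practical.

```python
# Variables are created on first assignment; no type declaration is needed.
course = "Machine Learning"   # a string
semester = 5                  # an integer

# A list is an ordered, mutable sequence.
experiments = ["Python", "NumPy and Pandas", "Weka"]
experiments.append("R Programming")

# A tuple is an ordered, immutable sequence.
point = (3, 4)

# A dictionary maps keys to values.
student = {"department": "CSE", "semester": semester}
student["course"] = course

print(experiments[0], len(experiments))
print(point[0] + point[1])
print(student["course"])
```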

Python If ... Else
• Python Conditions and If statements
Python supports the usual logical conditions from mathematics:
• Equals: a == b
• Not equals: a != b
• Less than: a < b
• Less than or equal to: a <= b
• Greater than: a > b
• Greater than or equal to: a >= b
These conditions can be used in several ways, most commonly in "if statements" and
loops.
An "if statement" is written by using the if keyword.
• Elif
The elif keyword is Python's way of saying "if the previous conditions were not true,
then try this condition".
• Else
The else keyword catches anything which isn't caught by the preceding conditions.
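A minimal sketch of if, elif and else working together. The function name compare and the sample numbers are hypothetical and chosen only for illustration.

```python
def compare(a, b):
    """Return a short description of how a relates to b."""
    if a > b:
        return "a is greater than b"
    elif a == b:
        # Reached only when the first condition was false.
        return "a and b are equal"
    else:
        # Catches everything the conditions above did not.
        return "a is less than b"

print(compare(200, 33))
print(compare(33, 33))
print(compare(1, 33))
```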
Python While Loops
Python has two primitive loop commands:
• while loops
• for loops
The while Loop

With the while loop we can execute a set of statements as long as a condition is true.
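As a small illustration, the sketch below adds up the numbers 1 to 5 with a while loop. The variable names are arbitrary.

```python
i = 1
total = 0
while i <= 5:      # the body repeats as long as this condition is true
    total += i
    i += 1         # without this line the loop would never end
print(total)
```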
Python For Loops
A for loop is used for iterating over a sequence (that is either a list, a tuple, a
dictionary, a set, or a string).

This is less like the for keyword in other programming languages, and works more like
an iterator method as found in other object-orientated programming languages.
With the for loop we can execute a set of statements, once for each item in a list, tuple,
set etc.
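A short sketch of a for loop over three kinds of sequence: a list, a string and a dictionary. The sample data is made up.

```python
fruits = ["apple", "banana", "cherry"]

# Iterate over a list, one item per pass.
lengths = []
for fruit in fruits:
    lengths.append(len(fruit))

# Iterate over the characters of a string.
letters = []
for ch in "ML":
    letters.append(ch)

# Iterate over the key/value pairs of a dictionary.
stock = {"apple": 3, "banana": 0}
in_stock = []
for name, count in stock.items():
    if count > 0:
        in_stock.append(name)

print(lengths, letters, in_stock)
```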
Python Classes and Objects

Python is an object-oriented programming language.
Almost everything in Python is an object, with its properties and methods.
A class is like an object constructor, or a "blueprint" for creating objects.
Create a Class
To create a class, use the keyword class:
Create Object
Now we can use the class named MyClass to create objects:
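A minimal sketch of both steps, creating a class and then creating an object from it. The class name MyClass comes from the text above; the property x and the method doubled are hypothetical additions for illustration.

```python
class MyClass:
    """A minimal class with one property and one method."""

    def __init__(self, x):
        self.x = x  # property stored on each object

    def doubled(self):
        # A method can read the object's own properties through self.
        return self.x * 2

# Create an object (an instance) of MyClass.
p1 = MyClass(5)
print(p1.x)
print(p1.doubled())
```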

NumPy – Creation of a universal data structure helpful in analysis and exchange of
algorithms; advanced mathematical operations on huge data sets
Pandas – Data manipulation, data analysis, data alignment, data set restructuring, and
segmentation
Scikit-Learn – Data analysis, data mining, statistical modeling
TensorFlow – Build and train neural networks; pattern detection; numerical computing
PyTorch – Artificial intelligence, machine learning, and deep learning applications
These Python libraries make the implementation of AI and ML algorithms very easy.

This speeds up product development, since the developer can solve complex problems
without rewriting existing code. Python is a platform-independent programming
language, which means the same source code can run on a range of platforms and
software architectures without a separate compile step for each one. Python runs on
Windows, macOS, Linux, Solaris, other Unix systems, and more. It can also be
integrated with other languages and platforms such as Java, .NET, C/C++, Perl, PHP,
and R.
Machine learning deals with learning from experience and facts (data), and predictions
are made on the basis of the inputs provided. In general, the larger and more
representative the dataset, the better the machine learning model tends to be. The flow
of machine learning is:
• Cleaning the data
• Feeding the dataset
• Training the model
• Testing the dataset
• Implementing the model
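The five steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the data is made up, and the "model" is a deliberately simple one-number threshold rather than any algorithm named in this file.

```python
# Hypothetical toy data: (hours studied, passed?) pairs; None marks a missing value.
raw = [(1.0, 0), (2.0, 0), (None, 1), (3.0, 0), (6.0, 1),
       (7.0, 1), (8.0, 1), (2.5, 0), (6.5, 1)]

# 1. Cleaning the data: drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None]

# 2. Feeding the dataset: split it into a training part and a test part.
train, test = clean[:6], clean[6:]

def mean(values):
    return sum(values) / len(values)

# 3. Training the model: the threshold sits halfway between the class means.
mean0 = mean([x for x, y in train if y == 0])
mean1 = mean([x for x, y in train if y == 1])
threshold = (mean0 + mean1) / 2

def predict(x):
    return 1 if x > threshold else 0

# 4. Testing: measure accuracy on rows the model has not seen.
accuracy = mean([1 if predict(x) == y else 0 for x, y in test])

# 5. Implementing the model: use it on a new input.
print(threshold, accuracy, predict(5.0))
```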

PRACTICAL 2
Aim:- NumPy & Pandas
What is Pandas?
Pandas is an open-source library that provides high-performance data manipulation in
Python. It is built on top of the NumPy package, which means NumPy is required to
use Pandas. The name Pandas is derived from the term "panel data", an econometrics
term for multidimensional data. It is used for data analysis in Python and was
developed by Wes McKinney in 2008.
Before Pandas, Python was capable of data preparation, but it provided only limited
support for data analysis. Pandas enhanced those capabilities. It can perform the five
significant steps required for processing and analysing data, irrespective of the
origin of the data: load, manipulate, prepare, model, and analyze.
What is NumPy?
NumPy is an extension module for Python, written mostly in C. It is a Python package
used for numerical computation and for processing single-dimensional and
multidimensional arrays. Calculations on NumPy arrays are generally faster than the
same calculations on ordinary Python lists. The NumPy package was created by Travis
Oliphant in 2005 by incorporating the features of the Numarray module into its
ancestor module, Numeric. It is also capable of handling a vast amount of data and is
convenient for matrix multiplication and data reshaping.
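The reshaping and matrix multiplication mentioned above can be sketched as follows, assuming NumPy is installed. The array contents are arbitrary.

```python
import numpy as np

a = np.arange(6)        # one-dimensional array holding 0, 1, 2, 3, 4, 5
m = a.reshape(2, 3)     # same data viewed as 2 rows and 3 columns

# Matrix multiplication: (2x3) times (3x2) gives a 2x2 matrix.
product = m @ m.T

# Arithmetic applies element by element, with no explicit Python loop.
doubled = a * 2

print(m.shape)
print(product)
print(doubled)
```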
Both Pandas and NumPy can be seen as essential libraries for scientific computation,
including machine learning, due to their intuitive syntax and high-performance matrix
computation capabilities. These two libraries are also well suited to data science
applications.

Difference between Pandas and NumPy:
Some differences between Pandas and NumPy are listed below:

Importing library-
Defining Version-
Tuple/List-

Creating Series-
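The four headings above (importing the libraries, checking the version, building arrays from a tuple or list, and creating a Series) can be sketched as below, assuming NumPy and Pandas are installed. The values and index labels are made up for illustration.

```python
# Importing library: the conventional short aliases are np and pd.
import numpy as np
import pandas as pd

# Defining version: each library exposes its version as a string.
print(np.__version__, pd.__version__)

# Tuple/List: a NumPy array can be built from either.
from_list = np.array([1, 2, 3])
from_tuple = np.array((4, 5, 6))

# Creating Series: a one-dimensional labelled array.
marks = pd.Series([78, 85, 91], index=["a", "b", "c"])

print(from_list + from_tuple)
print(marks["b"], marks.max())
```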

PRACTICAL 3
Aim:- Learn about Weka
WEKA is open-source software that provides tools for data pre-processing,
implementations of several machine learning algorithms, and visualization tools, so
that you can develop machine learning techniques and apply them to real-world data
mining problems. What WEKA offers is summarized in the following diagram:
The name WEKA is also used by an unrelated commercial product, the WEKA data
platform, a storage and cloud computing environment purpose-built for machine
learning applications. It is separate from the open-source Weka workbench covered in
this practical. That platform is described as harnessing hardware-accelerated cloud
systems to drive advanced machine learning and neural network research, with the
following features:
• Streamlined and fast cloud file systems to combine multiple sources into a
single high-performance computing system
• Industry-best GPUDirect performance (113 Gbps for a single DGX-2 and
162 Gbps for a single DGX A100)
• In-flight and at-rest encryption for governance, risk, and compliance
requirements
• Agile access and management for edge, core, and cloud development
• Scalability up to exabytes of storage across billions of files
The WEKA file system also works with Amazon Web Services (AWS), Google Cloud
Platform (GCP), Microsoft Azure, and Oracle Cloud Infrastructure (OCI) cloud
infrastructures.
Weka Machine Learning Algorithms

Weka has a large number of machine learning algorithms, which is one of the main
benefits of using Weka as a platform for machine learning.
They are divided into a number of main groups:
• bayes: Algorithms that use Bayes' Theorem in some core way, like Naive Bayes.
• functions: Algorithms that estimate a function, like Linear Regression.
• lazy: Algorithms that use lazy learning, like k-Nearest Neighbours.
• meta: Algorithms that use or combine multiple algorithms, like Ensembles.
• misc: Implementations that do not neatly fit into the other groups, like running a
saved model.
• rules: Algorithms that use rules, like One Rule.
• trees: Algorithms that use decision trees, like Random Forest.
In the Weka Explorer, the tab for these algorithms is called “Classify” and the
algorithms are listed under an overarching group called “Classifiers”. Nevertheless,
Weka supports both classification (predict a category) and regression (predict a
numeric value) predictive modeling problems.

1. Linear Machine Learning Algorithms
Linear algorithms assume that the predicted attribute is a linear combination of the
input attributes.
• Linear Regression: functions.LinearRegression
• Logistic Regression: functions.Logistic
2. Nonlinear Machine Learning Algorithms
Nonlinear algorithms do not make strong assumptions about the relationship between
the input attributes and the output attribute being predicted.
• Naive Bayes: bayes.NaiveBayes
• Decision Tree (specifically the C4.5 variety): trees.J48
• k-Nearest Neighbors (also called KNN): lazy.IBk
• Support Vector Machines (also called SVM): functions.SMO
• Neural Network: functions.MultilayerPerceptron
3. Ensemble Machine Learning Algorithms
Ensemble methods combine the predictions from multiple models in order to make
more robust predictions.
• Random Forest: trees.RandomForest
• Bootstrap Aggregation (also called Bagging): meta.Bagging
• Stacked Generalization (also called Stacking or Blending): meta.Stacking
Weka has an extensive array of ensemble methods, perhaps one of the largest available
across all of the popular machine learning frameworks.
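The idea behind ensemble methods, combining the predictions of several models, can be sketched with a simple majority vote. This is plain Python rather than Weka's own Java API, and the three base models below are hypothetical one-line rules invented for illustration.

```python
from collections import Counter

# Three hypothetical base models, each a simple rule on one number.
def model_a(x):
    return "high" if x > 4 else "low"

def model_b(x):
    return "high" if x > 5 else "low"

def model_c(x):
    return "high" if x > 6 else "low"

def vote(models, x):
    """Return the label predicted by most of the models."""
    predictions = [model(x) for model in models]
    return Counter(predictions).most_common(1)[0][0]

models = [model_a, model_b, model_c]
print(vote(models, 5.5))   # two of the three models say "high"
print(vote(models, 4.5))   # two of the three models say "low"
```

With an odd number of models and two labels a tie cannot occur, which is why three models are used in this sketch.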

PRACTICAL 4
Aim:- Introduction to R Programming
The R language was developed by statisticians to help other statisticians and
developers work with data faster and more efficiently. Machine learning involves
working with large amounts of data and statistics as a part of data science, so R is
often recommended for it. R is therefore handy for those working with machine
learning, making tasks easier and faster. Here are some of the main advantages of
implementing machine learning algorithms in R.
Advantages of Implementing Machine Learning Using the R Language
• It provides good explanatory code. For example, if you are at the early stage
of working on a machine learning project and you need to explain the
work you do, this can be easier in R than in Python, because R provides
built-in statistical methods for working with data in few lines of code.
• R is well suited to data visualization and to prototyping machine learning
models.
• R has a rich set of tools and library packages for machine learning projects.
Developers can use these packages for the pre-modeling, modeling, and
post-modeling stages of a project. For statistical work in particular, R's
packages are extensive, which makes R a common choice for machine
learning projects.
Popular R Language Packages Used to Implement Machine Learning
• lattice: The lattice package supports the creation of graphs displaying
a variable, or the relation between multiple variables, with conditions.
• DataExplorer: This R package aims to automate data visualization and
data handling so that the user can pay attention to the data insights of the
project.
• DALEX (Descriptive Machine Learning Explanations): This package helps
to provide various explanations for the relation between the input variables
and the output. It helps to understand complex machine learning models.
• dplyr: This R package is used to summarize tabular data with rows and
columns. It applies the “split-apply-combine” approach.
• esquisse: This R package is used to explore the data quickly to get the
information it holds. It also allows plotting bar graphs, histograms, curves,
and scatter plots.
• caret: This R package attempts to streamline the process of creating
predictive models.
• janitor: This R package has functions for examining and cleaning dirty
data. It is built to be user-friendly for beginners and intermediate users.
• rpart: This R package helps to create classification and regression
models using two-stage procedures. The resulting models are represented as
binary trees.
Applications of R in Machine Learning

Many large companies, such as Google, Facebook, and Uber, use the R
language for machine learning applications. These applications include:
• Social network analytics
• Analyzing trends and patterns
• Getting insights into the behaviour of users
• Finding the relationships between users
• Developing analytical solutions
• Accessing charting components
• Embedding interactive visual graphics
Machine learning is a branch in computer science that studies the design of algorithms
that can learn. Typical machine learning tasks are concept learning, function learning
or “predictive modeling”, clustering and finding predictive patterns. These tasks are
learned through available data that were observed through experiences or instructions,
for example. Machine learning hopes that including the experience into its tasks will
eventually improve the learning. The ultimate goal is to improve the learning in such a
way that it becomes automatic, so that humans like ourselves don’t need to interfere
any more.

Using R For k-Nearest Neighbours (KNN)
Step One. Get your Data
Step Two. Know your Data
Step Three. Where to go Now?
Step Four. Prepare your Workspace
Step Five. Prepare your Data
Step Six. The Actual KNN Model
Step Seven. Evaluation of your Model
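The core of Step Six and Step Seven can be sketched as below. The sketch is written in Python rather than R so that all examples in this file share one language. The function name knn_predict and the toy data are hypothetical, and it assumes Python 3.8 or later for math.dist.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical toy data: (features, label) pairs forming two clear groups.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

# Evaluation on two held-out points whose true labels are known.
test = [((1.1, 1.0), "A"), ((5.0, 4.8), "B")]
correct = sum(1 for features, label in test if knn_predict(train, features) == label)
print(correct / len(test))
```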
Basic program of hello world!!

PRACTICAL 5
Aim:- Introduction about Linear Regression Model

Linear regression is one of the easiest and most popular Machine Learning algorithms.
It is a statistical method that is used for predictive analysis. Linear regression makes
predictions for continuous/real or numeric variables such as sales, salary, age,
product price, etc.
The linear regression algorithm shows a linear relationship between a dependent
variable (y) and one or more independent variables (x), hence the name linear
regression. Because the relationship is linear, the model describes how the value of
the dependent variable changes with the value of the independent variable.
The linear regression model provides a sloped straight line representing the
relationship between the variables. Consider the below image:
Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
The values of the x and y variables are the training data for the linear regression
model.
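As an illustration of the equation above, the sketch below estimates a0 and a1 with the standard least-squares formulas. The function name fit_line is hypothetical, and the data is made up so that it lies exactly on the line y = 1 + 2x, which means the error term ε is zero here.

```python
def fit_line(xs, ys):
    """Return (a0, a1) for the least-squares line y = a0 + a1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # a1 = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    a1 = sxy / sxx
    # The fitted line always passes through the point (x_mean, y_mean).
    a0 = y_mean - a1 * x_mean
    return a0, a1

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a0, a1 = fit_line(xs, ys)
print(a0, a1)
```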
Types of Linear Regression

Linear regression can be further divided into two types:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called Simple
Linear Regression.
o Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called Multiple
Linear Regression.
Linear Regression Line

A straight line showing the relationship between the dependent and independent
variables is called a regression line. A regression line can show two types of
relationship:
o Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable
increases on the X-axis, then such a relationship is termed a positive linear
relationship.
o Negative Linear Relationship:
If the dependent variable decreases on the Y-axis as the independent variable
increases on the X-axis, then such a relationship is called a negative linear
relationship.
Implementation: -
To implement the Simple Linear Regression model in machine learning
using Python, we need to follow the below steps:

Step 1: Data pre-processing
Step 2: Fitting the Simple Linear Regression to the training set
Step 3: Prediction of the test set result
Step 4: Visualizing the training set results
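The four steps can be sketched as follows, assuming NumPy is installed. This is not the original practical's code: the data is hypothetical, np.polyfit is used here as a stand-in for whichever regression library the practical used, and the plot of Step 4 is replaced by a printed summary so the sketch has no plotting dependency.

```python
import numpy as np

# Step 1: data pre-processing (hypothetical data: years of experience
# against salary in thousands), split into training and test sets.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([30, 35, 40, 45, 50, 55, 60, 65], dtype=float)
x_train, y_train = x[:6], y[:6]
x_test, y_test = x[6:], y[6:]

# Step 2: fit a straight line to the training set.
# polyfit with degree 1 returns the slope first, then the intercept.
a1, a0 = np.polyfit(x_train, y_train, 1)

# Step 3: predict the test set results.
y_pred = a0 + a1 * x_test

# Step 4: a scatter plot with the fitted line would normally go here;
# this sketch prints actual and predicted values side by side instead.
for xi, yi, pi in zip(x_test, y_test, y_pred):
    print(xi, yi, round(float(pi), 2))
```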

PRACTICAL 6
Aim:- Support Vector Machine
Support Vector Machine (SVM) is one of the most popular supervised learning
algorithms. It is used for classification as well as regression problems. However,
it is primarily used for classification problems in machine learning.
The goal of the SVM algorithm is to create the best line or decision boundary
that can segregate n-dimensional space into classes, so that new data points can
easily be put in the correct category in the future. This best decision boundary
is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane.
These extreme cases are called support vectors, and hence the algorithm is
termed Support Vector Machine.
The SVM algorithm can be used for face detection, image classification, text
categorization, etc.
Types of SVM
1. Linear SVM
2. Non-Linear SVM
o Linear SVM: Linear SVM is used for linearly separable data. If a dataset
can be classified into two classes by using a single straight line, the data
is termed linearly separable, and the classifier used is called a Linear SVM
classifier.
o Non-linear SVM: Non-linear SVM is used for data that is not linearly
separable. If a dataset cannot be classified by using a straight line, the
data is termed non-linear, and the classifier used is called a Non-linear
SVM classifier.
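A linear SVM can be sketched in plain Python as stochastic subgradient descent on the regularised hinge loss. This is a simplified illustration, not the method used by any particular library. The function names, the toy data, and the settings lam, lr and epochs are all assumptions chosen for this example.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Fit w and b by stochastic subgradient descent on the hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # The regularisation term shrinks w a little on every step.
            w = [w[0] - lr * lam * w[0], w[1] - lr * lam * w[1]]
            if margin < 1:
                # The point is misclassified or inside the margin,
                # so move the boundary away from it.
                w = [w[0] + lr * y * x1, w[1] + lr * y * x2]
                b += lr * y
    return w, b

def predict(w, b, point):
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1

# Hypothetical linearly separable data with labels +1 and -1.
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print([predict(w, b, p) for p in points])
print(predict(w, b, (4, 4)), predict(w, b, (-4, -4)))
```

Only points with a margin below 1 change the weights, which mirrors the idea above that the boundary is shaped by the extreme points rather than by every point equally.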