CS 601 ML Lab Manual

MACHINE LEARNING LABORATORY
CHAMELI DEVI GROUP OF INSTITUTIONS, INDORE
LABORATORY MANUAL
Machine Learning
CS - 601
VI SEM (CSE)
Department of Computer Science & Engineering
Vision
To foster innovative software engineers with refined technical approach and to excel in academics with ethics
to shoulder social responsibilities.
Mission
➢ To achieve, sustain and foster excellence in computer engineering.

➢ To be a center of excellence for innovative pedagogy and educational reforms.
➢ To develop an intellectual and inspiring environment for learning and research to make students
ethically and academically sound.
Program Educational Objectives (PEOs)
The graduates will be able to –

➢ PEO1: Pursue a successful career in engineering by applying imperative technical skills, professional
knowledge and principles to implement computer support systems.
➢ PEO2: Demonstrate interpersonal skills and leadership qualities with the sole intention of achieving
organizational goals through teamwork and to serve the society with ethics and integrity.
➢ PEO3: Inculcate in‐depth knowledge of technologies for providing effective and efficient solutions to the
existing and expected problems.
Program Specific Outcomes (PSOs)
The graduates will be able to –

➢ PSO1: Use their engineering skills in database and network design, project management and
knowledge engineering.
➢ PSO2: Acquire skills to provide solution using high level programming language.
➢ PSO3: Demonstrate proficiency to analyze, design and develop applications in different domains for
providing solutions through innovative ideas.
CHAMELI DEVI GROUP OF INSTITUTIONS

INDORE (M.P.)
DEPARTMENT OF
COMPUTER SCIENCE & ENGINEERING
CERTIFICATE
This is to certify that Mr./Ms……………………………………………………………… with RGTU
Enrollment No. 0832 ..…………………………..has satisfactorily completed the course of experiments in
…………………….……………………………………………...………laboratory, as prescribed by Rajiv
Gandhi Proudhyogiki Vishwavidyalaya, Bhopal for ……… Semester of the Computer Science & Engineering
Department during year 20….… − ....
Signature of
Faculty In-charge
INDEX
Sl. No. | Name of the Experiment | CO | Date of Conduction | Signature of Faculty-in-Charge
1 | WAP to print checkerboard pattern having NXN dimensions using Numpy. | CS601.1 | |
2 | WAP to print Transpose of matrix in single line in Python. | CS601.1 | |
3 | Perform data manipulation with Pandas (Create Data Frame). | CS601.1 | |
4 | Perform data manipulation with Pandas (Read csv file). | CS601.1 | |
5 | Create data preprocessing template. | CS601.1 | |
6 | Implement Linear Regression model. | CS601.1 | |
7 | Implement Polynomial Regression model. | CS601.1 | |
8 | Implement Logistic Regression model. | CS601.1 | |
9 | Implement K-Nearest Neighbors algorithm. | CS601.1 | |
10 | Implement SVM algorithm. | CS601.5 | |
EXPT. No. - 1. WAP to print checkerboard pattern having NXN dimensions using Numpy (Take input
for N=9).
Aim: To understand the concept of Numpy in Python.

Theory: Numpy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object and tools for working with these arrays. It is the fundamental package for
scientific computing in Python and can also serve as an efficient multi-dimensional container of generic data.
Code:
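One possible solution is sketched below. It assumes N is fixed at 9 instead of being read with input(), and the function name checkerboard is only illustrative. Slicing with a step of 2 sets every other cell to 1.

```python
import numpy as np


def checkerboard(n):
    """Return an n x n array of 0s and 1s in a checkerboard pattern."""
    board = np.zeros((n, n), dtype=int)
    board[1::2, ::2] = 1   # odd rows, even columns
    board[::2, 1::2] = 1   # even rows, odd columns
    return board


n = 9  # in the lab this could be n = int(input("Enter N: "))
for row in checkerboard(n):
    print(" ".join(str(value) for value in row))
```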
Output:
0 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0 1
0 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0 1
0 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0 1
0 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0 1
0 1 0 1 0 1 0 1 0
Viva Question:
1. What is Numpy?
2. Differentiate between a Numpy array and a Python list.
3. How is Numpy useful in Machine Learning?
4. How does slicing work in Numpy?
5. What is the default datatype in Numpy array?
EXPT. No. - 2. WAP to print Transpose of matrix in single line in Python.
Aim: To understand the concept of matrix in Numpy using Python.

Theory: The transpose of a matrix is usually computed with a nested loop, but Python offers some
interesting ways to do the same in a single line.
In Python, we can implement a matrix as a nested list (a list inside a list). Each inner list is treated as a row of
the matrix. For example, m = [[1, 2], [3, 4], [5, 6]] represents a matrix of 3 rows and 2 columns.
Code:
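A minimal sketch of one single-line approach, using the example matrix m from the theory. zip(*m) groups the i-th elements of every row together, which yields the columns; the variable name transposed is illustrative.

```python
m = [[1, 2], [3, 4], [5, 6]]

# Single-line transpose: unpack the rows into zip() to collect columns.
transposed = [list(column) for column in zip(*m)]

print("Entered Matrix-")
for row in m:
    print(row)

print("Transpose of Matrix-")
for row in transposed:
    print(row)
```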
Output:
Entered Matrix-
[1, 2]
[3, 4]
[5, 6]
Transpose of Matrix-
[1, 3, 5]
[2, 4, 6]
Viva Question:
1. What are the advantages of Numpy?
2. What is the range function?
3. Define module in Python.
4. What is byte swapping in Numpy?
5. What is the use of the arange function in Numpy?
EXPT. No. - 3. Perform data manipulation with Pandas (Create Data Frame).
Aim: To understand the concept of pandas to create Data Frame.

Theory: A DataFrame is a widely used pandas data structure: a two-dimensional array with labeled axes
(rows and columns). It is a standard way to store data and has two different indexes, i.e., a row index and a
column index.
Create Data Frame for Employee-
Ename | Age | Salary | Designation | Location
Ajay | 28 | 40000 | Project Engineer | Indore
Chetna | 24 | 35000 | HR | Indore
Karan | 26 | 39000 | Data Analyst | Pune
Richa | 25 | 34000 | HR Trainee | Pune
Perform Following Operations-

1. Create Data Frame
2. Select Ename
3. Add 5 more rows
4. Add two columns: Gender and Marital Status
5. Rename the column Designation to Profile
6. Find out the work locations
7. Create a separate data frame without Age
8. Chetna left the job after her marriage; delete her data
9. Add a column for experience
10. Find out the top 3 most experienced employees
11. Update the profile of Richa from HR Trainee to HR and revise her salary to 40000
12. All employees have shifted to the new work location, Pune
Code:
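The twelve operations can be sketched as follows. The first four rows come from the table above; the five extra employees and every Gender, Marital Status and Experience value are hypothetical placeholders, since the manual does not supply them.

```python
import pandas as pd

columns = ["Ename", "Age", "Salary", "Designation", "Location"]

# 1. Create the Data Frame from the table in the manual.
df = pd.DataFrame(
    [
        ["Ajay", 28, 40000, "Project Engineer", "Indore"],
        ["Chetna", 24, 35000, "HR", "Indore"],
        ["Karan", 26, 39000, "Data Analyst", "Pune"],
        ["Richa", 25, 34000, "HR Trainee", "Pune"],
    ],
    columns=columns,
)

# 2. Select Ename.
names = df["Ename"]

# 3. Add 5 more rows (hypothetical employees).
extra = pd.DataFrame(
    [
        ["Meera", 27, 42000, "Developer", "Bhopal"],
        ["Nikhil", 30, 50000, "Team Lead", "Indore"],
        ["Pooja", 23, 30000, "Intern", "Pune"],
        ["Rohit", 29, 45000, "Tester", "Bhopal"],
        ["Sneha", 26, 38000, "Developer", "Indore"],
    ],
    columns=columns,
)
df = pd.concat([df, extra], ignore_index=True)

# 4. Add two columns (hypothetical values, one per row).
df["Gender"] = ["M", "F", "M", "F", "F", "M", "F", "M", "F"]
df["Marital Status"] = ["Single", "Married", "Single", "Single", "Married",
                        "Married", "Single", "Single", "Single"]

# 5. Rename Designation to Profile.
df = df.rename(columns={"Designation": "Profile"})

# 6. Find out the work locations.
locations = df["Location"].unique()

# 7. Separate data frame without Age.
df_no_age = df.drop(columns=["Age"])

# 8. Delete Chetna's data.
df = df[df["Ename"] != "Chetna"].reset_index(drop=True)

# 9. Add a column for experience in years (hypothetical values).
df["Experience"] = [5, 3, 1, 4, 8, 0, 6, 2]

# 10. Top 3 most experienced employees.
top3 = df.nlargest(3, "Experience")

# 11. Update Richa's profile and salary.
is_richa = df["Ename"] == "Richa"
df.loc[is_richa, "Profile"] = "HR"
df.loc[is_richa, "Salary"] = 40000

# 12. Everyone moves to Pune.
df["Location"] = "Pune"

print(df)
```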
Output:
Viva Question:
1. Define “pandas” in Python.

2. What is “DataFrame” in “pandas”?
3. How will you create an empty “DataFrame” in pandas?
4. How will you add a column to a “DataFrame” using “pandas”?
5. What are the different ways to create a “DataFrame” in “pandas”?
EXPT. No. - 4. Perform data manipulation with Pandas (Read csv file).
Aim: To understand how to use pandas for data manipulation. (Data Set is attached)
Theory: pandas is a software library written for the Python programming language for data manipulation and
analysis. In particular, it offers data structures and operations for manipulating numerical tables.
Read the dataset (odi_new.csv), which is a csv file, from the local disk and perform the operations below-
a) View number of rows and columns
b) Type of data in each column
c) To view few columns and rows in data to understand it
d) To see the description of data
e) Show top 10 rows
f) Show bottom 10 rows
g) Show the statistics of all numeric columns
h) Verify all statistics by calculating each one in an individual cell (perform statistics)
i) Find out the details of the rows where versus is Canada
Code:
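A sketch of operations a) to i) follows. Because odi_new.csv is not reproduced in this manual, the code reads a tiny in-memory stand-in; the column names Player, Runs and Versus and all values are assumptions, so replace them with the real column names from the file.

```python
import io

import pandas as pd

# Stand-in for odi_new.csv (hypothetical columns and values).
sample_csv = io.StringIO(
    "Player,Runs,Versus\n"
    "A,10,Canada\n"
    "B,55,Kenya\n"
    "C,31,Canada\n"
)
df = pd.read_csv(sample_csv)  # in the lab: pd.read_csv("odi_new.csv")

print(df.shape)       # a) number of rows and columns
print(df.dtypes)      # b) type of data in each column
print(df.head())      # c) a few rows to get a feel for the data
df.info()             # d) description of the data
print(df.head(10))    # e) top 10 rows
print(df.tail(10))    # f) bottom 10 rows
print(df.describe())  # g) statistics of all numeric columns

# h) verify individual statistics one at a time
runs_mean = df["Runs"].mean()
runs_min = df["Runs"].min()
runs_max = df["Runs"].max()
print(runs_mean, runs_min, runs_max)

# i) rows where the opponent is Canada
canada = df[df["Versus"] == "Canada"]
print(canada)
```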
Output:
Viva Question:
1. What are the uses of “pandas” in Machine Learning?
2. What is the standard missing-data marker in “pandas”?
3. What is PEP8?
4. Explain categorical data in “pandas”.
5. What is the role of “unique()” function in “pandas”?
EXPT. No. - 5. Create data preprocessing template. (Read Data from csv file named Data)
Aim: To understand the concept and need of data preprocessing. (Data Set is attached)
Theory: Pre-processing refers to the transformations applied to data before it is fed to an algorithm.
Data pre-processing converts raw data into a clean data set. Data gathered from different sources arrives in a
raw format that is not suitable for analysis, and a Machine Learning model gives better results when the data
is in a proper format. Some models need their input in a specific form; for example, many Random Forest
implementations do not accept null values, so null values must be handled in the raw data set before the
algorithm is run.
Another aspect is that the data set should be formatted so that more than one Machine Learning or
Deep Learning algorithm can be run on the same data set, and the best of them chosen.
Code:
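A minimal preprocessing template is sketched below using scikit-learn. Since the Data csv file is not reproduced here, the code builds a small stand-in frame; the column names Country, Age, Salary and Purchased and all values are assumptions. The steps are: fill missing values, one-hot encode the categorical feature, encode the label, split, and scale.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Stand-in for pd.read_csv("Data.csv") (hypothetical columns and values).
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain",
                "Germany", "France", "Spain", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48],
    "Salary": [72000, 48000, 54000, 61000, np.nan, 58000, 52000, 79000],
    "Purchased": ["No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes"],
})

X = df[["Country", "Age", "Salary"]].copy()
y = df["Purchased"]

# Replace missing numeric values with the column mean.
num_cols = ["Age", "Salary"]
X[num_cols] = SimpleImputer(strategy="mean").fit_transform(X[num_cols])

# One-hot encode Country; pass Age and Salary through unchanged.
encoder = ColumnTransformer(
    [("country", OneHotEncoder(), ["Country"])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_enc = encoder.fit_transform(X)

# Encode the Yes/No label as 1/0.
y_enc = LabelEncoder().fit_transform(y)

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X_enc, y_enc, test_size=0.25, random_state=0
)

# Scale the two numeric columns (the last two); fit on training data only.
scaler = StandardScaler()
X_train[:, 3:] = scaler.fit_transform(X_train[:, 3:])
X_test[:, 3:] = scaler.transform(X_test[:, 3:])

print(X_train)
print(y_train)
```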
Output:
Viva Question:
1. What is data preprocessing?
2. What is a categorical feature?
3. What is “OneHotEncoder”?
4. What is the “fit” method?
5. What is “Transform” in Machine Learning?
EXPT. No. - 6. Implement Linear Regression model. (Read Data named Salary_data)
Aim: To understand the concept of Linear Regression. (Data Set is attached)

Theory: This is one of the most common and interesting types of regression technique. Here we predict a target
variable Y based on the input variable X. A linear relationship should exist between the target variable and
the predictor, hence the name Linear Regression.
Consider predicting the salary of an employee based on his/her age. We can easily see that there seems to
be a correlation between an employee’s age and salary (the higher the age, the higher the salary). The
hypothesis of linear regression is- Y = a + bX
Y represents salary, X is the employee’s age, and a and b are the coefficients of the equation. So in order to
predict Y (salary) given X (age), we need to know the values of a and b (the model’s coefficients).
Code:
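The following sketch fits Y = a + bX with scikit-learn. Because the Salary_data file is not reproduced here, it generates synthetic age and salary values from an assumed line (a = 10000, b = 1200) plus noise; swap in the real columns after reading the csv file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Salary_data (assumed relationship, not real data).
rng = np.random.default_rng(0)
age = rng.uniform(22, 60, size=50)
salary = 10000 + 1200 * age + rng.normal(0, 2000, size=50)

X = age.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = salary

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

a = model.intercept_  # estimate of a in Y = a + bX
b = model.coef_[0]    # estimate of b in Y = a + bX
r2 = model.score(X_test, y_test)  # R^2 on the held-out data

print("a =", a, "b =", b, "R^2 =", r2)
print("Predicted salary at age 35:", model.predict([[35]])[0])
```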
Output:
Viva Question:
1. What is Linear Regression?
2. What is a predictor variable in machine learning?
3. What is the best ratio for training and test data?
4. What is the use of “fit-transform” method?
5. What is the use of “labelEncoder”?
EXPT. No. - 7. Implement Polynomial Regression model. (Read Data named positions_salaries)
Aim: To understand the concept of Polynomial Regression. (Data Set is attached)

Theory: In polynomial regression, we transform the original features into polynomial features of a given degree
and then apply Linear Regression to them. The linear model Y = a + bX above is transformed to something
like – Y = a + bX + cX^2
It is still a linear model, because it is linear in the coefficients a, b and c, but the fitted curve is now quadratic
rather than a line. Scikit-Learn provides the PolynomialFeatures class to transform the features.
Code:
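A minimal sketch of degree-2 polynomial regression with PolynomialFeatures followed by LinearRegression. The positions_salaries file is not reproduced here, so the data is generated from an assumed quadratic (a = 5, b = 2, c = 3) purely to show that the fit recovers the coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in: levels 1..10 and a value that grows quadratically.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = 5.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] ** 2

# Expand X into the columns [X, X^2]; the intercept is left to the model.
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

model = LinearRegression().fit(X_poly, y)

a = model.intercept_
b, c = model.coef_
print("a =", a, "b =", b, "c =", c)

# Predict for a new level; the same transformation must be applied first.
prediction = model.predict(poly.transform([[11.0]]))[0]
print("Prediction at X = 11:", prediction)
```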
Output:
Viva Question:
1. What is a polynomial feature?
2. What is irreducible error?
3. Define bias in a learning algorithm.
4. Where can we use polynomial regression?
5. What is the use of a scatter plot?
EXPT. No. - 8. Implement Logistic Regression model. (Read Data named Social_network_ads)
Aim: To understand the concept of Logistic Regression.

Theory: Logistic regression is a fundamental classification technique. It belongs to the group of linear
classifiers and is somewhat similar to polynomial and linear regression. Logistic regression is fast and
relatively uncomplicated, and its results are easy to interpret. Although it is essentially a method
for binary classification, it can also be applied to multiclass problems.
Code:
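One way to approach this experiment is sketched below with scikit-learn's LogisticRegression. The Social_network_ads file is not reproduced here, so the code generates a synthetic stand-in with two assumed features (age and estimated salary) and a binary purchased label; the group means and spreads are made up.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: columns are [age, estimated salary] (assumed).
rng = np.random.default_rng(0)
not_purchased = rng.normal(loc=[30, 40000], scale=[5, 10000], size=(100, 2))
purchased = rng.normal(loc=[45, 90000], scale=[5, 10000], size=(100, 2))
X = np.vstack([not_purchased, purchased])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Put both features on the same scale; fit the scaler on training data only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = LogisticRegression(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print("Accuracy:", acc)
```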
Output:
Viva Question:
1. What is “overfitting”?
2. What is “underfitting”?
3. What is “best fit” in a learning algorithm?
4. Define bias-variance trade-off.
5. Define variance.
EXPT. No. - 9. Implement K-Nearest Neighbors algorithm. (Read Data named Social_network_ads)
Aim: To understand the concept of K-Nearest Neighbors. (Data Set is attached)

Theory: KNN can be used for both classification and regression predictive problems. However, it is more
widely used in classification problems in the industry. To evaluate any technique, we generally look at 3
important aspects:
1. Ease of interpreting the output
2. Calculation time
3. Predictive Power
KNN makes predictions using the training dataset directly.
Predictions are made for a new instance (x) by searching through the entire training set for the K most similar
instances (the neighbors) and summarizing the output variable for those K instances. For regression this might
be the mean output variable, in classification this might be the mode (or most common) class value.
To determine which of the K instances in the training dataset are most similar to a new input a distance measure
is used.
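The prediction rule described above can be sketched from scratch in a few lines. This is an illustration only: it assumes Euclidean distance as the distance measure and a majority vote for classification, and the function name knn_predict and the toy points are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training instance.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances (the neighbors).
    nearest = np.argsort(distances)[:k]
    # Mode (most common class) among the neighbors.
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy training set: two well-separated groups.
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                    [6.0, 6.0], [7.0, 7.5], [6.5, 8.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([1.2, 1.4])))  # near group 0
print(knn_predict(X_train, y_train, np.array([6.8, 7.0])))  # near group 1
```

For regression, the vote would be replaced by the mean of the neighbors' output values.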
Code:
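A sketch of the experiment using scikit-learn's KNeighborsClassifier follows. As the Social_network_ads file is not reproduced here, the data is a synthetic stand-in with two assumed features (age and estimated salary); K = 5 with Euclidean distance (minkowski with p = 2) is one common choice, not a prescribed setting.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: columns are [age, estimated salary] (assumed).
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[30, 40000], scale=[5, 10000], size=(100, 2))
class1 = rng.normal(loc=[45, 90000], scale=[5, 10000], size=(100, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# KNN is distance based, so features must be on comparable scales.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print("Accuracy:", acc)
```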
Output:
Viva Question:
1. What is the use of StandardScaler?
2. What are the parameters for KNeighborsClassifier?
3. What is a confusion matrix?
4. Define cross validation.
5. How can we balance bias and variance?
EXPT. No. - 10. Implement SVM algorithm. (Read Data named Social_network_ads)
Aim: To understand the concept of Support Vector Machine.

Theory: “Support Vector Machine” (SVM) is a supervised machine learning algorithm which can be used for
both classification and regression problems. However, it is mostly used in classification problems. In this
algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features)
with the value of each feature being the value of a particular coordinate. Then, we perform classification by
finding the hyperplane that best separates the two classes.
Code:
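A sketch using scikit-learn's SVC with a linear kernel follows. The Social_network_ads file is not reproduced here, so the data is a synthetic stand-in with two assumed features (age and estimated salary); with two features the separating hyperplane is a line, and the points that define it are available as support_vectors_.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: columns are [age, estimated salary] (assumed).
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[30, 40000], scale=[5, 10000], size=(100, 2))
class1 = rng.normal(loc=[45, 90000], scale=[5, 10000], size=(100, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Linear kernel: the decision boundary is a hyperplane in feature space.
svm = SVC(kernel="linear", random_state=0).fit(X_train, y_train)
y_pred = svm.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print("Accuracy:", acc)
print("Number of support vectors:", len(svm.support_vectors_))
```

Changing kernel to "rbf" is one way to try a non-linear boundary on the same data.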
Output:
Viva Question:
1. What is a hyperplane?
2. How do we choose the best hyperplane?
3. What is a kernel?
4. In which situations is a kernel used?
5. What is random state?
