
Index

S.No.  List of Experiments                                                   Experiment Date   Submission Date
1.     Write a program to read a data set from a CSV file.
2.     Write a program which demonstrates Linear Regression using Python.
3.     How to Plot a Line Chart in Python using Matplotlib.
4.     How to Plot a Histogram in Python using Matplotlib.
5.     Write a program to visualize data using a bar chart.
6.     Draw Matplotlib Pie Charts.
7.     Write a program to draw a scatter plot using Python.
8.     Write a program which demonstrates Machine Learning - Polynomial Regression.
EXPERIMENT-1

AIM:- Write a program to read a data set from a CSV file.

import csv

# Path to the input file (assumed name; replace it with your own file)
file1 = 'data.csv'

content_csv_space = []
content_csv_tab = []

# Read the file treating a single space as the field delimiter
with open(file1, newline='', encoding='utf-8') as csvfile:
    spamreader_space = csv.reader(csvfile, delimiter=' ')
    for row in spamreader_space:
        content_csv_space.append(row)

# Read the same file again treating a tab as the field delimiter
with open(file1, newline='', encoding='utf-8') as csvfile:
    spamreader_tab = csv.reader(csvfile, delimiter='\t')
    for row in spamreader_tab:
        content_csv_tab.append(row)
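
The program above depends on an input file that is not part of this record. As a self-contained sketch, the snippet below first writes a small comma-separated file to a temporary location and then reads it back with csv.reader. The sample contents, the column names and the helper name read_csv_rows are made up for illustration.

```python
import csv
import os
import tempfile


def read_csv_rows(path, delimiter=','):
    """Return every row of the file at `path` as a list of strings."""
    with open(path, newline='', encoding='utf-8') as csvfile:
        return [row for row in csv.reader(csvfile, delimiter=delimiter)]


# Write a tiny sample file (hypothetical data) so the example runs anywhere
sample = "name,age\nasha,21\nravi,23\n"
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 newline='', encoding='utf-8') as tmp:
    tmp.write(sample)
    sample_path = tmp.name

rows = read_csv_rows(sample_path)
os.remove(sample_path)  # clean up the temporary file

# The first row is the header, the remaining rows are the records
header, records = rows[0], rows[1:]
print(header)
print(records)
```

Every field is returned as a string, so numeric columns such as age have to be converted explicitly before they are used in calculations.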
EXPERIMENT-2

AIM: Write a program which demonstrates Linear Regression using Python.

Linear regression uses the relationship between the data points to draw a straight line that fits them as closely as possible.

This line can be used to predict future values.

Python has methods for finding a relationship between data points and for drawing a line of linear regression.
We will show you how to use these methods instead of going through the mathematical formula.

Example:

import matplotlib.pyplot as plt
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)


def myfunc(x):
  return slope * x + intercept

mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()

It is important to know how strongly the values of the x-axis and the values of the y-axis
are related; if there is no relationship, linear regression cannot be used to predict
anything.

This relationship is measured by the coefficient of correlation, called r.

The r value ranges from -1 to 1, where 0 means no linear relationship, and 1 (or -1) means a
perfect linear relationship.

Python and the SciPy module will compute this value for you; all you have to do is feed it
the x and y values.
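
As a short sketch of how r can be read and how the fitted line can be used for prediction, the snippet below reuses the x and y values from the example in this experiment. The helper name predict and the input value 10 are chosen only for illustration.

```python
from scipy import stats

# Same data as in the example of this experiment
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# linregress returns the slope and intercept of the fitted line
# together with the correlation coefficient r
slope, intercept, r, p, std_err = stats.linregress(x, y)


def predict(x_value):
    """Evaluate the fitted regression line at x_value."""
    return slope * x_value + intercept


predicted = predict(10)

print(round(r, 2))          # about -0.76
print(round(predicted, 2))  # about 85.59
```

An r of about -0.76 indicates a fairly strong negative relationship, so using the line for prediction is reasonable for this data.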
EXPERIMENT-3

AIM: How to Plot a Line Chart in Python using Matplotlib.


Matplotlib is a data visualization library in Python. pyplot, a submodule of Matplotlib, is a
collection of functions that help in creating a variety of charts. Line charts are used to represent the
relation between two variables, X and Y, plotted on different axes.

Simple line plots


First import the matplotlib.pyplot library for plotting functions. Also import the NumPy library as
required. Then define the data values x and y.

# importing the required libraries
import matplotlib.pyplot as plt
import numpy as np

# define data values
x = np.array([1, 2, 3, 4])  # X-axis points
y = x * 2                   # Y-axis points

plt.plot(x, y)  # Plot the chart
plt.show()      # display

The chart produced by the code above has no labels on the x-axis or the y-axis.
Labelling is necessary for understanding what the chart shows, so the following
example adds axis labels and a title to the chart.

import matplotlib.pyplot as plt
import numpy as np

# Define X and Y variable data
x = np.array([1, 2, 3, 4])
y = x * 2

plt.plot(x, y)
plt.xlabel("X-axis")            # add X-axis label
plt.ylabel("Y-axis")            # add Y-axis label
plt.title("Any suitable title") # add title
plt.show()
EXPERIMENT-4

AIM: How to Plot a Histogram in Python using Matplotlib.


A histogram is used to represent data grouped into ranges. It is an accurate method for the
graphical representation of the distribution of numerical data. It is a type of bar plot where the
X-axis represents the bin ranges while the Y-axis gives information about frequency.

To create a histogram, the first step is to create the bins: divide the whole range of the values
into a series of intervals, then count the values which fall into each of the intervals. Bins are
consecutive, non-overlapping intervals of a variable. The matplotlib.pyplot.hist() function is used
to compute and draw the histogram of x.
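
The binning and counting step described above can be sketched without any plotting by using numpy.histogram. The sample values and bin edges below are made up for illustration; pyplot.hist() performs the same counting before it draws the bars.

```python
import numpy as np

# Hypothetical sample values
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

# Consecutive, non-overlapping bins: [0, 2), [2, 4), [4, 6), [6, 8), [8, 10]
bin_edges = [0, 2, 4, 6, 8, 10]

# counts[i] is the number of values that fall into the i-th bin
counts, edges = np.histogram(values, bins=bin_edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left} to {right}: {count}")
```

Every value lands in exactly one bin, so the counts add up to the number of values.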

The code below creates a histogram of some random values and customizes its appearance:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter

# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20

# Creating distribution
x = np.random.randn(N_points)
y = .8 ** x + np.random.randn(10000) + 25
legend = ['distribution']

# Creating the figure and axes
fig, axs = plt.subplots(1, 1, figsize =(10, 7), tight_layout = True)

# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    axs.spines[s].set_visible(False)

# Remove x, y ticks
axs.xaxis.set_ticks_position('none')
axs.yaxis.set_ticks_position('none')

# Add padding between axes and labels
axs.xaxis.set_tick_params(pad = 5)
axs.yaxis.set_tick_params(pad = 10)

# Add x, y gridlines (newer Matplotlib releases no longer accept the b keyword)
axs.grid(True, color ='grey', linestyle ='-.', linewidth = 0.5, alpha = 0.6)

# Add Text watermark
fig.text(0.9, 0.15, 'Jeeteshgavande30',
         fontsize = 12,
         color ='red',
         ha ='right',
         va ='bottom',
         alpha = 0.7)

# Creating histogram
N, bins, patches = axs.hist(x, bins = n_bins)

# Setting color of each bar according to its height
fracs = ((N**(1 / 5)) / N.max())
norm = colors.Normalize(fracs.min(), fracs.max())

for thisfrac, thispatch in zip(fracs, patches):
    color = plt.cm.viridis(norm(thisfrac))
    thispatch.set_facecolor(color)

# Adding extra features
plt.xlabel("X-axis")
plt.ylabel("y-axis")
plt.legend(legend)
plt.title('Customized histogram')

# Show plot
plt.show()
EXPERIMENT-5

AIM: Write a program to visualize data using bar chart.


INTRODUCTION:

A bar graph is a graphical representation of data in which each category is shown as a
rectangular bar. The lengths or heights of the bars represent the values in the dataset. In a bar
chart, one axis represents the categories of a column in the dataset and the other axis represents
the values or counts associated with them. Bar charts can be plotted vertically or horizontally. A
vertical bar chart is often called a column chart. When the bars are arranged from the highest to
the lowest count, the chart is called a Pareto chart.

The matplotlib API in Python provides the bar() function, which can be used in MATLAB style or
through the object-oriented API. The syntax of the bar() function is as follows:-

plt.bar(x, height, width, bottom, align)

The function draws rectangular bars positioned and sized according to the given parameters. The
following example reads car names and prices from a CSV file and plots them as a horizontal bar
chart.

import pandas as pd
from matplotlib import pyplot as plt

# Read CSV into pandas (cars.csv is expected to have 'car' and 'price' columns)
data = pd.read_csv(r"cars.csv")
df = pd.DataFrame(data)

# Use the first 12 rows
name = df['car'].head(12)
price = df['price'].head(12)

# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))

# Horizontal Bar Plot
ax.barh(name, price)

# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)

# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')

# Add padding between axes and labels
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)

# Add x, y gridlines (newer Matplotlib releases no longer accept the b keyword)
ax.grid(True, color ='grey', linestyle ='-.', linewidth = 0.5, alpha = 0.2)

# Show top values first
ax.invert_yaxis()

# Add annotation to bars
for i in ax.patches:
    plt.text(i.get_width() + 0.2, i.get_y() + 0.5,
             str(round((i.get_width()), 2)),
             fontsize = 10, fontweight ='bold',
             color ='grey')

# Add Plot Title
ax.set_title('Sports cars and their price in crore',
             loc ='left')

# Add Text watermark
fig.text(0.9, 0.15, 'Jeeteshgavande30', fontsize = 12,
         color ='grey', ha ='right', va ='bottom',
         alpha = 0.7)

# Show Plot
plt.show()
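
As a smaller illustration of the bar() parameters listed in the syntax above, the sketch below draws a vertical bar chart from made-up data (the course names and student counts are hypothetical). It selects the non-interactive Agg backend so that it also runs where no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data: number of students enrolled in each course
courses = ["C", "C++", "Java", "Python"]
students = [20, 15, 30, 35]

fig, ax = plt.subplots(figsize=(8, 5))

# x = categories, height = values, width = bar thickness,
# bottom = baseline of the bars, align = position of each bar on its x value
bars = ax.bar(courses, students, width=0.4, bottom=0, align="center")

ax.set_xlabel("Courses offered")
ax.set_ylabel("Number of students enrolled")
ax.set_title("Students enrolled in different courses")

# Each bar is a rectangle whose height equals the plotted value
heights = [bar.get_height() for bar in bars]
print(heights)
```

To see the chart on screen, remove the matplotlib.use("Agg") line and call plt.show() at the end.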

EXPERIMENT-6
AIM: Draw Matplotlib Pie Charts.
INTRODUCTION:

A pie chart is a circular statistical plot that can display only one series of data. The whole circle
represents 100% of the given data, and the slices of the pie, called wedges, represent the parts of the
data. The area of a wedge is determined by the length of its arc and represents the relative percentage
of that part with respect to the whole data. Pie charts are commonly used in business presentations on
sales, operations, survey results, resources, etc., as they provide a quick summary.
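
The relation between the data values and the wedges can be sketched with plain arithmetic. The snippet below uses the same car names and values as the program in this experiment and computes the percentage and the arc angle of each wedge; the variable names are chosen for illustration.

```python
# Values used by the pie chart program in this experiment
cars = ['AUDI', 'BMW', 'FORD', 'TESLA', 'JAGUAR', 'MERCEDES']
data = [23, 17, 35, 29, 12, 41]

total = sum(data)

# Each wedge covers value / total of the full circle
percentages = [100 * value / total for value in data]
angles = [360 * value / total for value in data]

for name, pct, angle in zip(cars, percentages, angles):
    print(f"{name:<9}{pct:5.1f}%  {angle:6.1f} degrees")
```

The percentages always add up to 100 and the angles to 360, whatever the input values are.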

Program Code
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
cars = ['AUDI', 'BMW', 'FORD',
        'TESLA', 'JAGUAR', 'MERCEDES']

data = [23, 17, 35, 29, 12, 41]

# Creating explode data
explode = (0.1, 0.0, 0.2, 0.3, 0.0, 0.0)

# Creating color parameters
colors = ("orange", "cyan", "brown",
          "grey", "indigo", "beige")

# Wedge properties
wp = {'linewidth': 1, 'edgecolor': "green"}

# Creating autopct arguments
def func(pct, allvalues):
    # Round instead of truncating so that floating-point error
    # cannot turn a value such as 22.99 into 22
    absolute = int(np.round(pct / 100. * np.sum(allvalues)))
    return "{:.1f}%\n({:d} g)".format(pct, absolute)

# Creating plot
fig, ax = plt.subplots(figsize =(10, 7))
wedges, texts, autotexts = ax.pie(data,
                                  autopct = lambda pct: func(pct, data),
                                  explode = explode,
                                  labels = cars,
                                  shadow = True,
                                  colors = colors,
                                  startangle = 90,
                                  wedgeprops = wp,
                                  textprops = dict(color ="magenta"))

# Adding legend
ax.legend(wedges, cars,
          title ="Cars",
          loc ="center left",
          bbox_to_anchor =(1, 0, 0.5, 1))

plt.setp(autotexts, size = 8, weight ="bold")
ax.set_title("Customizing pie chart")

# show plot
plt.show()

EXPERIMENT-7
AIM: Write a program to draw a scatter plot using Python.
INTRODUCTION:

A scatter plot is a diagram where each value in the data set is represented by a dot.

The Matplotlib module has a method for drawing scatter plots. It needs two arrays of the same length, one
for the values of the x-axis and one for the values of the y-axis:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]

y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

The x array represents the age of each car.

The y array represents the speed of each car.

import matplotlib.pyplot as plt

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

plt.scatter(x, y)
plt.show()

The x-axis represents ages, and the y-axis represents speeds.

What we can read from the diagram is that the two fastest cars were both 2 years old,
and the slowest car was 12 years old.
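
The reading of the diagram can also be checked directly on the data. The sketch below pairs each speed with the matching age and sorts the pairs; the variable names are chosen for illustration.

```python
# Same data as in the scatter plot program: car ages (x) and speeds (y)
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# Pair every speed with its age and sort from fastest to slowest
cars_by_speed = sorted(zip(y, x), reverse=True)

two_fastest = cars_by_speed[:2]  # (speed, age) of the two fastest cars
slowest = cars_by_speed[-1]      # (speed, age) of the slowest car

print(two_fastest)  # [(111, 2), (103, 2)]
print(slowest)      # (77, 12)
```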

EXPERIMENT-8
AIM: Write a program which demonstrates Machine Learning - Polynomial
Regression.

INTRODUCTION:

Polynomial regression is a form of linear regression in which the relationship between the independent
variable x and the dependent variable y is modeled as an nth degree polynomial. Polynomial regression fits a
nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).

Why Polynomial Regression:

• There are some relationships that a researcher will hypothesize to be curvilinear. Such cases will
  include a polynomial term.
• Inspection of residuals. If we try to fit a linear model to curved data, a scatter plot of residuals
  (Y-axis) against the predictor (X-axis) will have patches of many positive residuals in the middle.
  Hence, in such a situation, a linear model is not appropriate.
• An assumption in the usual multiple linear regression analysis is that all the independent variables
  are independent of each other. In a polynomial regression model, this assumption is not satisfied.
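
The residual pattern mentioned in the list above can be reproduced with a small sketch. The data is made up (y = -x^2 on a symmetric grid) and the straight line is fitted with numpy.polyfit, which is used here only as a convenient least squares routine.

```python
import numpy as np

# Hypothetical curved data: y = -x^2 on a symmetric grid
x = np.linspace(-3, 3, 13)
y = -x**2

# Fit a straight line by least squares (degree-1 polynomial)
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

middle = residuals[np.abs(x) <= 1]   # points near the centre of the x range
ends = residuals[np.abs(x) >= 2.5]   # points near both ends of the x range

print(middle)
print(ends)
```

The residuals in the middle are all positive and those at the ends are all negative. Such a systematic pattern is the sign that a straight line does not suit the data.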
Uses of Polynomial Regression:
These are basically used to define or describe non-linear phenomena such as:

• The growth rate of tissues
• Progression of disease epidemics
• Distribution of carbon isotopes in lake sediments

The basic goal of regression analysis is to model the expected value of a dependent variable y in terms of the
value of an independent variable x. In simple regression, we used the following equation:
y = a + bx + e
Here y is the dependent variable, x is the independent variable, a is the y-intercept, b is the slope and e is
the error term.
In general, we can model it as an nth degree polynomial:
y = a + b1x + b2x^2 + ... + bnx^n + e
Since the regression function is linear in the unknown coefficients a, b1, ..., bn, these models are linear from
the point of view of estimation.
Hence the coefficients can be estimated with the least squares technique and then used to compute the response
value y.
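
To illustrate why a polynomial model is still a linear estimation problem, the sketch below builds the columns 1, x and x^2 explicitly and solves for the coefficients with numpy.linalg.lstsq. The data is made up (generated without noise from y = 1 + 2x + 3x^2), and plain NumPy is used here in place of the scikit-learn classes used in the steps that follow.

```python
import numpy as np

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 3 * x**2

# Design matrix with columns 1, x, x^2: the model is linear in a, b1, b2
A = np.vander(x, N=3, increasing=True)

# Least squares solution of A @ coeffs = y
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
a, b1, b2 = coeffs
print(np.round(coeffs, 6))

# Use the estimated coefficients to compute the response at a new x value
x_new = 4.0
y_new = a + b1 * x_new + b2 * x_new**2
print(round(float(y_new), 6))
```

Because the data contains no noise, the estimated coefficients match the ones used to generate it. With real measurements they would only be approximate.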

Polynomial Regression in Python: 


Step 1: Import libraries and dataset
Import the required libraries and the dataset we are using to perform Polynomial Regression.

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
datas = pd.read_csv('data.csv')
datas

Step 2: Dividing the dataset into 2 components

Divide the dataset into two components, X and y. X will contain the column at index 1 (the slice 1:2 keeps it
as a 2D array), and y will contain the column at index 2.

X = datas.iloc[:, 1:2].values
y = datas.iloc[:, 2].values
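
The difference between the two selections can be seen on a small stand-in table. The DataFrame below is hypothetical: its column names follow the axis labels used later in this experiment and its values are made up.

```python
import pandas as pd

# Hypothetical three-column table standing in for the dataset
datas = pd.DataFrame({
    'sno': [1, 2, 3],
    'Temperature': [0, 20, 40],
    'Pressure': [0.0002, 0.0012, 0.0060],
})

X = datas.iloc[:, 1:2].values  # slice 1:2 keeps the column axis -> 2D array
y = datas.iloc[:, 2].values    # single index drops the column axis -> 1D array

print(X.shape)  # (3, 1)
print(y.shape)  # (3,)
```

scikit-learn estimators expect the features X as a 2D array and the target y as a 1D array, which is why the two columns are selected differently.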

Step 3: Fitting Linear Regression to the dataset

Fit the Linear Regression model on the two components X and y.
# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
lin = LinearRegression()
lin.fit(X, y)

Step 4: Fitting Polynomial Regression to the dataset

Expand X into polynomial features up to degree 4, then fit a Linear Regression model on the expanded
features X_poly and y.
# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
X_poly = poly.fit_transform(X)
lin2 = LinearRegression()
lin2.fit(X_poly, y)

Step 5: Visualising the Linear Regression results using a scatter plot.

# Visualising the Linear Regression results
plt.scatter(X, y, color = 'blue')
plt.plot(X, lin.predict(X), color = 'red')
plt.title('Linear Regression')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.show()

Step 6: Visualising the Polynomial Regression results using a scatter plot.

# Visualising the Polynomial Regression results
plt.scatter(X, y, color = 'blue')
plt.plot(X, lin2.predict(poly.fit_transform(X)), color = 'red')
plt.title('Polynomial Regression')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.show()

Step 7: Predicting new results with both Linear and Polynomial Regression. Note that the input variable
must be in a NumPy 2D array.

# Predicting a new result with Linear Regression after converting the
# value to predict into a 2D array
pred = 110.0
predarray = np.array([[pred]])
lin.predict(predarray)

# Predicting a new result with Polynomial Regression after converting the
# value to predict into a 2D array
pred2 = 110.0
pred2array = np.array([[pred2]])
lin2.predict(poly.fit_transform(pred2array))
