
UNIVERSITY OF ENGINEERING AND TECHNOLOGY, TAXILA

FACULTY OF TELECOMMUNICATION AND INFORMATION ENGINEERING


COMPUTER ENGINEERING DEPARTMENT

Digital Image Processing

Lab Manual No 02

Computing using Python Modules

Digital Image Processing Lab Instructor:- Sheharyar Khan


Learning Outcomes:

• Introduction to the SciPy library
• Introduction to the Matplotlib library
• Introduction to the scikit-learn library

Introduction:

Python ships with a large standard library, often described as "batteries included", and many more
modules can be installed as third-party packages. These modules perform specialized operations such
as computation, database management and web serving. We limit our focus to the third-party modules
that support numerical computation: NumPy, SciPy, Matplotlib and scikit-learn. We discuss the
relevance of each of these modules and explain their use with examples.

SciPy Library:

SciPy, a scientific library for Python, is an open source, BSD-licensed library for mathematics,
science and engineering. The SciPy library depends on NumPy, which provides convenient and
fast N-dimensional array manipulation. SciPy is built to work with NumPy arrays, and it provides
many user-friendly and efficient numerical routines, such as those for numerical integration and
optimization.

The basic data structure used by SciPy is the multidimensional array provided by the NumPy
module. NumPy provides some functions for Linear Algebra, Fourier Transforms and Random
Number Generation, but not with the generality of the equivalent functions in SciPy. Older SciPy
releases also exposed the NumPy functions through the SciPy namespace, but that usage is
deprecated, so NumPy should be imported explicitly alongside SciPy.

NumPy Vector:

A vector can be created in multiple ways. Some of them are described below.

1. Converting Python array-like objects to NumPy

Let us consider the following example.

import numpy as np

values = [1, 2, 3, 4]    # avoid the name "list", which shadows the built-in type
arr = np.array(values)
print(arr)

The output of the above program will be as follows.

[1 2 3 4]

Intrinsic NumPy Array Creation:

NumPy has built-in functions for creating arrays from scratch. Some of these functions are
explained below.

1. Using zeros()

The zeros(shape) function creates an array of the specified shape filled with 0 values. The
default dtype is float64. Let us consider the following example.

import numpy as np

print(np.zeros((2, 3)))

The output of the above program will be as follows.

[[0. 0. 0.]
 [0. 0. 0.]]

2. Using ones()

The ones(shape) function creates an array filled with 1 values. It is identical to zeros() in all
other respects. Let us consider the following example.

import numpy as np

print(np.ones((2, 3)))

The output of the above program will be as follows.

[[1. 1. 1.]
 [1. 1. 1.]]

3. Using arange ( )

The arange() function will create arrays with regularly incrementing values. Let us consider the following
example.

import numpy as np

print(np.arange(7))

The above program will generate the following output.

[0 1 2 3 4 5 6]

4. Defining the data type of the values:

Let us consider the following example.

import numpy as np

# np.float has been removed from NumPy; use the built-in float (or np.float64)
arr = np.arange(2, 10, dtype=float)
print(arr)
print("Array Data Type :", arr.dtype)

The above program will generate the following output.

[2. 3. 4. 5. 6. 7. 8. 9.]
Array Data Type : float64

5. Using linspace()

The linspace() function will create arrays with a specified number of elements, which will be spaced
equally between the specified beginning and end values. Let us consider the following example.

import numpy as np

print(np.linspace(1., 4., 6))

The above program will generate the following output.

[1.  1.6 2.2 2.8 3.4 4. ]
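
The difference between arange() and linspace() is easiest to see side by side. The following is a minimal sketch; the variable names by_step and by_count are illustrative only and do not come from the manual.

```python
import numpy as np

# arange: fixed step of 0.5; the stop value 4.0 is NOT included
by_step = np.arange(1.0, 4.0, 0.5)

# linspace: fixed count of 7 points; the stop value 4.0 IS included
by_count = np.linspace(1.0, 4.0, 7)

print(by_step)    # 6 values, from 1.0 to 3.5
print(by_count)   # 7 values, from 1.0 to 4.0
```

Use arange() when the spacing matters and linspace() when the number of samples matters.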

Matrix:

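A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special
operators, such as * (matrix multiplication) and ** (matrix power). Note that the NumPy documentation
no longer recommends the matrix class for new code; regular 2-D arrays are preferred. Let us consider
the following example.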

import numpy as np

print(np.matrix('1 2; 3 4'))

The above program will generate the following output.

[[1 2]
 [3 4]]

1. Transpose of Matrix

The T attribute returns the transpose of the matrix. Let us consider the following example.

import numpy as np

mat = np.matrix('1 2; 3 4')
print(mat.T)

The above program will generate the following output.

[[1 3]
 [2 4]]
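
The same operations can be carried out on an ordinary 2-D array, which is the form most image-processing code uses. The following is a minimal sketch of the equivalent calls; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

product = a @ a                           # matrix multiplication
squared = np.linalg.matrix_power(a, 2)    # matrix power
elementwise = a * a                       # element-by-element product, NOT a matrix product
transposed = a.T                          # transpose

print(product)
print(elementwise)
print(transposed)
```

With plain arrays, * always means element-wise multiplication, so the @ operator is needed for a true matrix product.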

SciPy - Clustering:

K-means clustering is a method for finding clusters and cluster centers in a set of unlabeled data.
Intuitively, we might think of a cluster as a group of data points whose inter-point
distances are small compared with the distances to points outside of the cluster. Given an initial set of K
centers, the K-means algorithm iterates the following two steps.

1. For each center, the subset of training points (its cluster) that are closer to it than to any
other center is identified.
2. The mean of each feature over the data points in each cluster is computed, and this mean vector
becomes the new center for that cluster.

These two steps are iterated until the centers no longer move or the assignments no longer change. Then,
a new point x can be assigned to the cluster of the closest center. The SciPy library provides an
implementation of the K-means algorithm through the scipy.cluster package.
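
The two steps above can be tried out with the kmeans() and vq() functions from scipy.cluster.vq. This is a minimal sketch on made-up data: the points and the initial centers are hypothetical, and the usual scaling step (whiten) is left out to keep the numbers easy to read.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Hypothetical data: two small, well-separated groups of 2-D points
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
                   [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])

# Initial guess for the K = 2 centers; passing an array instead of an
# integer K makes the run deterministic
initial_centers = np.array([[0.0, 0.0], [6.0, 6.0]])

# Iterate the assignment and mean-update steps until the centers settle
centers, distortion = kmeans(points, initial_centers)

# Assign every point to its closest center
labels, distances = vq(points, centers)

print(centers)
print(labels)
```

Each final center is the mean of the points assigned to it, which is exactly the update described in step 2.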

SciPy – Constants:

SciPy constants package provides a wide range of constants, which are used in the general scientific area.

SciPy Constants Package

The scipy.constants package provides various constants. We import the required constants and use
them as needed. Two types of constants are available in SciPy: mathematical constants and
physical constants.
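
As a brief illustration, the sketch below reads one mathematical and a few physical constants. The particular constants chosen here are only examples.

```python
from scipy import constants

print(constants.pi)     # mathematical constant pi
print(constants.c)      # speed of light in vacuum, in m/s
print(constants.h)      # Planck constant, in J s

# physical_constants maps a name to a (value, unit, uncertainty) tuple
value, unit, uncertainty = constants.physical_constants["electron mass"]
print(value, unit)
```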

SciPy – FFTpack:

The Fourier transform is computed on a time-domain signal to examine its behavior in the frequency
domain. It finds application in disciplines such as signal and noise processing, image processing
and audio signal processing. SciPy offers the fftpack module, which lets the user compute fast
Fourier transforms. (fftpack is the legacy interface; newer SciPy releases also provide the
scipy.fft module.)

Example:

# Import NumPy and the fft function from scipy.fftpack
import numpy as np
from scipy.fftpack import fft

# Create an array of sample values
x = np.array([1.0, 2.0, 1.0, -1.0, 1.5])

# Apply the fft function
y = fft(x)
print(y)

The above program will generate the following output.


[ 4.50000000+0.j 2.08155948-1.65109876j -1.83155948+1.60822041j
-1.83155948-1.60822041j 2.08155948+1.65109876j]
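
The inverse transform recovers the original signal. The following is a minimal sketch using the same sample values; the name x_back is illustrative.

```python
import numpy as np
from scipy.fftpack import fft, ifft

x = np.array([1.0, 2.0, 1.0, -1.0, 1.5])

y = fft(x)          # forward transform (complex values)
x_back = ifft(y)    # inverse transform (complex, imaginary parts close to 0)

# The round trip reproduces the input up to floating-point error
print(np.allclose(x, x_back.real))
```

The first element of y is the sum of the samples (4.5 here), which is why it has no imaginary part.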

Discrete Cosine Transform

A Discrete Cosine Transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine
functions oscillating at different frequencies. SciPy provides a DCT with the function dct and a
corresponding IDCT with the function idct. Let us consider the following example.

import numpy as np
from scipy.fftpack import dct

print(dct(np.array([4., 3., 5., 10., 5., 3.])))

The above program will generate the following output.

[ 60.          -3.48476592 -13.85640646  11.3137085    6.
  -6.31319305]

SciPy – Integrate:

SciPy has a number of routines for performing numerical integration. Most of them are found in
the scipy.integrate module. The commonly used functions include:

1. quad       Single integration
2. dblquad    Double integration
3. tplquad    Triple integration
4. trapz      Trapezoidal rule (named trapezoid in recent SciPy releases)
5. romberg    Romberg integration (deprecated in recent SciPy releases)
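
As an illustration, the sketch below evaluates two single integrals and one double integral whose exact values are known. The integrands are chosen only for this example.

```python
import numpy as np
from scipy import integrate

# Integral of x^2 from 0 to 1 (exact value 1/3); quad also returns an error estimate
value, error_estimate = integrate.quad(lambda x: x**2, 0.0, 1.0)

# Integral of exp(-x^2) from 0 to infinity (exact value sqrt(pi)/2)
gauss, _ = integrate.quad(lambda x: np.exp(-x**2), 0.0, np.inf)

# Double integral of x*y for x in [0, 1] and y in [0, 2] (exact value 1).
# dblquad expects the integrand as f(y, x) and the y limits as functions of x.
area, _ = integrate.dblquad(lambda y, x: x * y, 0.0, 1.0,
                            lambda x: 0.0, lambda x: 2.0)

print(value, gauss, area)
```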

SciPy – Interpolate:

Interpolation is the process of finding a value between two points on a line or a curve. To help us
remember what it means, we can think of the first part of the word, 'inter', as meaning 'enter',
which reminds us to look 'inside' the data we originally had. Interpolation is useful not only
in statistics but also in science, business, or wherever there is a need to predict
values that fall between two existing data points.

Let us create some data and see how this interpolation can be done using the scipy.interpolate
package.

import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt

x = np.linspace(0, 4, 12)
y = np.cos(x**2/3+4)
print(x)
print(y)

The above program will generate the following output.

[0.         0.36363636 0.72727273 1.09090909 1.45454545 1.81818182
 2.18181818 2.54545455 2.90909091 3.27272727 3.63636364 4.        ]
[-0.65364362 -0.61966189 -0.51077021 -0.31047698 -0.00715476  0.37976236
  0.76715099  0.99239518  0.85886263  0.27994201 -0.52586509 -0.99582185]

Now we have two arrays. Treating them as the x and y coordinates of points in the plane, let us
plot them using the following program and see how they look.

plt.plot(x, y, 'o')

plt.show()

The above program will generate the following output.

1-D Interpolation

The interp1d class in scipy.interpolate is a convenient way to create a function from fixed data
points. The function can be evaluated anywhere within the domain defined by the given data; by
default it uses linear interpolation.

Using the above data, let us create interpolating functions that can be used to draw a new
interpolated graph.

f1 = interpolate.interp1d(x, y, kind='linear')
f2 = interpolate.interp1d(x, y, kind='cubic')

Using interp1d, we created two functions, f1 and f2. Each function returns an interpolated y for a
given input x. The third argument, kind, selects the interpolation technique: 'linear', 'nearest',
'zero', 'slinear', 'quadratic' and 'cubic' are a few of the available options (the names are written
in lowercase).
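
To see the interpolating functions at work, they can be evaluated on a finer grid. The following is a minimal sketch that repeats the data from above so that it runs on its own; the names x_new, y_linear and y_cubic are illustrative.

```python
import numpy as np
from scipy import interpolate

x = np.linspace(0, 4, 12)
y = np.cos(x**2 / 3 + 4)

f1 = interpolate.interp1d(x, y, kind='linear')
f2 = interpolate.interp1d(x, y, kind='cubic')

# New sample positions; they must stay inside the original range [0, 4]
x_new = np.linspace(0, 4, 30)
y_linear = f1(x_new)
y_cubic = f2(x_new)

# Both interpolants pass through the original data points
print(np.allclose(f1(x), y), np.allclose(f2(x), y))
```

Plotting x_new against y_linear and y_cubic with plt.plot() shows straight segments for the linear version and a smooth curve for the cubic one.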

SciPy - Input & Output:

The scipy.io (Input and Output) package provides a wide range of functions for reading and writing
different file formats. Some of these formats are −

1. MATLAB
2. IDL
3. Matrix Market
4. WAV
5. ARFF
6. NetCDF
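
As an illustration of the MATLAB format, the sketch below writes an array to a .mat file and reads it back. The file name example.mat and the key "vect" are made up for this example, and a temporary folder is used so that nothing is left on disk.

```python
import os
import tempfile

import numpy as np
from scipy import io

data = {"vect": np.arange(10)}    # variable name -> array

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.mat")    # hypothetical file name
    io.savemat(path, data)        # write a MATLAB-style .mat file
    loaded = io.loadmat(path)     # read it back as a dictionary

# MATLAB arrays are at least 2-D, so the vector comes back with shape (1, 10)
print(loaded["vect"].shape)
```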

SciPy – Ndimage:

The SciPy ndimage submodule is dedicated to image processing. Here, ndimage means an n-dimensional
image. Some of the most common tasks in image processing are as follows:

1. Input/Output, displaying images
2. Basic manipulations − Cropping, flipping, rotating, etc.
3. Image filtering − De-noising, sharpening, etc.
4. Image segmentation − Labeling pixels corresponding to different objects
5. Classification
6. Feature extraction
7. Registration
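
A few of these tasks can be sketched on a synthetic image, so that no image file is needed. The image below (two bright rectangles on a dark background) is made up for this example.

```python
import numpy as np
from scipy import ndimage

# Synthetic 64x64 grayscale image with two bright rectangles
image = np.zeros((64, 64))
image[8:24, 8:24] = 1.0
image[40:56, 30:60] = 1.0

cropped = image[0:32, 0:32]                           # cropping by slicing
flipped = np.flipud(image)                            # flip upside down
rotated = ndimage.rotate(image, 45, reshape=False)    # rotate by 45 degrees, same size
blurred = ndimage.gaussian_filter(image, sigma=2)     # smoothing filter

# Segmentation: label the connected bright regions
labels, count = ndimage.label(image > 0.5)
print(count)    # number of separate objects found
```

Any of these arrays can be displayed with plt.imshow(array, cmap='gray') from Matplotlib.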

Matplotlib Library:

Matplotlib is one of the most popular Python packages used for data visualization. It is a cross-platform
library for making 2D plots from data in arrays. It provides an object-oriented API that helps in
embedding plots in applications using Python GUI toolkits such as PyQt, wxPython or Tkinter. It can
also be used in Python and IPython shells, Jupyter notebooks and web application servers.

matplotlib.pyplot is a collection of command-style functions that make Matplotlib work like MATLAB.
Each pyplot function makes some change to a figure: for example, it creates a figure, creates a
plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels.
The types of plot that Matplotlib supports include:

1. bar          Make a bar plot.
2. barh         Make a horizontal bar plot.
3. boxplot      Make a box and whisker plot.
4. hist         Plot a histogram.
5. hist2d       Make a 2D histogram plot.
6. pie          Plot a pie chart.
7. plot         Plot lines and/or markers to the Axes.
8. polar        Make a polar plot.
9. scatter      Make a scatter plot of x vs y.
10. stackplot   Draw a stacked area plot.
11. stem        Create a stem plot.
12. step        Make a step plot.
13. quiver      Plot a 2-D field of arrows.

Some of the axis functions include:

1. axes       Add axes to the figure.
2. text       Add text to the axes.
3. title      Set a title of the current axes.
4. xlabel     Set the x-axis label of the current axes.
5. xlim       Get or set the x-limits of the current axes.
6. xscale     Set the scaling of the x-axis.
7. xticks     Get or set the current tick locations and labels of the x-axis.
8. ylabel     Set the y-axis label of the current axes.
9. ylim       Get or set the y-limits of the current axes.
10. yscale    Set the scaling of the y-axis.
11. yticks    Get or set the current tick locations and labels of the y-axis.

Some of the figure functions include:

1. figtext    Add text to figure.
2. figure     Create a new figure.
3. show       Display a figure.
4. savefig    Save the current figure.
5. close      Close a figure window.

Matplotlib - Simple Plot:

In this section, we will learn how to create a simple plot with Matplotlib. We shall now display a simple
line plot of angle in radians vs. its sine value in Matplotlib. To begin with, the Pyplot module from
Matplotlib package is imported, with an alias plt as a matter of convention.

import matplotlib.pyplot as plt

Next we need an array of numbers to plot. Various array functions are defined in the NumPy library
which is imported with the np alias.

import numpy as np

We now obtain the ndarray object of angles between 0 and 2π using the arange() function from the
NumPy library. The value of π is taken from the math module, which must be imported as well.

import math

x = np.arange(0, math.pi*2, 0.05)

The ndarray object serves as values on x axis of the graph. The corresponding sine values of angles in x to
be displayed on y axis are obtained by the following statement −

y = np.sin(x)

The values from two arrays are plotted using the plot() function.

plt.plot(x,y)

You can set the plot title, and labels for x and y axes.

plt.xlabel("angle")

plt.ylabel("sine")

plt.title('sine wave')

The Plot viewer window is invoked by the show() function −

plt.show()

The complete program is as follows −

from matplotlib import pyplot as plt

import numpy as np

import math #needed for definition of pi

x = np.arange(0, math.pi*2, 0.05)

y = np.sin(x)

plt.plot(x,y)

plt.xlabel("angle")

plt.ylabel("sine")

plt.title('sine wave')

plt.show()

When the above program is executed, a graph of the sine wave is displayed.

Matplotlib - Figure Class:

The matplotlib.figure module contains the Figure class. It is a top-level container for all plot elements.
The Figure object is instantiated by calling the figure() function from the pyplot module.

fig = plt.figure()

The following table shows the additional parameters:

1. figsize      (width, height) tuple in inches
2. dpi          Dots per inch
3. facecolor    Figure patch face color
4. edgecolor    Figure patch edge color
5. linewidth    Edge line width
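
The parameters listed above can be passed directly to figure(). The following is a minimal sketch with arbitrary example values; the "Agg" backend line is an assumption that lets the sketch run without a display and can be removed on a desktop.

```python
import matplotlib
matplotlib.use("Agg")    # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4),          # 6 inches wide, 4 inches high
                 dpi=100,                 # 100 dots per inch
                 facecolor="lightgray",   # background of the figure patch
                 edgecolor="black",       # border colour of the figure patch
                 linewidth=2)             # border width

print(fig.get_size_inches())    # width and height in inches
print(fig.get_dpi())            # resolution in dots per inch
```

With these values the rendered figure is 600 by 400 pixels (size in inches multiplied by dpi).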

Matplotlib - Axes Class:

An Axes object is the region of the figure that contains the data space. A given figure can contain
many Axes, but a given Axes object can only be in one Figure. The Axes contains two (or three in the
case of 3D) Axis objects. The Axes class and its member functions are the primary entry point to
working with the object-oriented (OO) interface.

An Axes object is added to a figure by calling the add_axes() method. It adds an axes at position
rect [left, bottom, width, height], where all quantities are fractions of the figure width and
height, and returns the new Axes object.
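
The rect argument can be illustrated by placing a small inset axes inside a larger one. This is a minimal sketch; the positions and the plotted curves are arbitrary examples, and the "Agg" backend line is an assumption that lets it run without a display.

```python
import matplotlib
matplotlib.use("Agg")    # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig = plt.figure()

# Main axes: starts 10% from the left and bottom edges, covers 80% of each dimension
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
main_ax.plot(x, np.sin(x))

# Inset axes: a quarter of the figure size, placed towards the upper right
inset_ax = fig.add_axes([0.6, 0.6, 0.25, 0.25])
inset_ax.plot(x, np.cos(x))

print(len(fig.axes))    # the figure now holds two Axes objects
```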

Legend:

The legend() method of the Axes class adds a legend to the plot. It takes three parameters: the line
handles, their labels and the legend location.

ax.legend(handles, labels, loc)

axes.plot()

This is the basic method of axes class that plots values of one array versus another as lines or markers.
The plot() method can have an optional format string argument to specify color, style and size of line and
marker.

Following example shows the advertisement expenses and sales figures of TV and smartphone in the
form of line plots. Line representing TV is a solid line with yellow colour and square markers whereas
smartphone line is a dashed line with green colour and circle marker.

import matplotlib.pyplot as plt

y = [1, 4, 9, 16, 25, 36, 49, 64]
x1 = [1, 16, 30, 42, 55, 68, 77, 88]
x2 = [1, 6, 12, 18, 28, 40, 52, 65]

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

l1 = ax.plot(x1, y, 'ys-')    # solid line with yellow colour and square marker
l2 = ax.plot(x2, y, 'go--')   # dashed line with green colour and circle marker

# legend placed at lower right
ax.legend(labels=('tv', 'Smartphone'), loc='lower right')
ax.set_title("Advertisement effect on sales")
ax.set_xlabel('medium')
ax.set_ylabel('sales')
plt.show()

Matplotlib – Multiplots:

In this section, we will learn how to create multiple subplots on the same canvas. The subplot()
function returns the Axes object at a given grid position. The call signature of this function is

plt.subplot(nrows, ncols, index)

In the current figure, the function creates and returns an Axes object at position index of a grid of
nrows by ncols axes. Indexes go from 1 to nrows * ncols, incrementing in row-major order. If nrows,
ncols and index are all less than 10, the indexes can also be given as a single, concatenated,
three-digit number.

For example, subplot(2, 3, 3) and subplot(233) both create an Axes at the top right corner of the current
figure, occupying half of the figure height and a third of the figure width.
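
The equivalence of the two call forms can be checked by comparing the positions of the resulting axes. This is a minimal sketch; two separate figures are used so that each call creates a fresh Axes, and the "Agg" backend line is an assumption that lets it run without a display.

```python
import matplotlib
matplotlib.use("Agg")    # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

fig_a = plt.figure()
ax_a = plt.subplot(2, 3, 3)    # three separate arguments

fig_b = plt.figure()
ax_b = plt.subplot(233)        # the same request as one three-digit number

# bounds is (left, bottom, width, height) in fractions of the figure size
bounds_a = ax_a.get_position().bounds
bounds_b = ax_b.get_position().bounds
print(bounds_a)
print(np.allclose(bounds_a, bounds_b))
```

Both left and bottom are greater than 0.5, which confirms that the axes sits in the top right corner of the figure.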

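In older Matplotlib releases, creating a subplot deletes any pre-existing subplot that overlaps with it
beyond sharing a boundary. Recent releases keep the existing Axes, so remove it explicitly if it is no
longer needed.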

import matplotlib.pyplot as plt

# plot a line, implicitly creating a subplot(111)
plt.plot([1, 2, 3])

# now create a subplot which represents the top plot of a grid with 2 rows and 1 column.
# Since this subplot overlaps the first, the plot (and its axes) created previously
# is removed (in older Matplotlib releases)
plt.subplot(211)
plt.plot(range(12))

plt.subplot(212, facecolor='y')   # creates 2nd subplot with yellow background
plt.plot(range(12))

plt.show()

The above program generates the following output −

Matplotlib - Subplots() Function:

Matplotlib's pyplot API has a convenience function called subplots(), which acts as a utility wrapper
and helps in creating common layouts of subplots, including the enclosing figure object, in a single
call.

plt.subplots(nrows, ncols)

The two integer arguments to this function specify the number of rows and columns of the subplot grid.
The function returns a Figure object and an array containing nrows*ncols Axes objects. Each Axes
object is accessible by its index. Here we create a grid of 2 rows by 2 columns and display a
different plot in each of the 4 subplots.

import matplotlib.pyplot as plt

fig,a = plt.subplots(2,2)

import numpy as np

x = np.arange(1,5)

a[0][0].plot(x,x*x)

a[0][0].set_title('square')

a[0][1].plot(x,np.sqrt(x))

a[0][1].set_title('square root')

a[1][0].plot(x,np.exp(x))

a[1][0].set_title('exp')

a[1][1].plot(x,np.log10(x))

a[1][1].set_title('log')

plt.show()

The above line of code generates the following output:

Scikit Learn Library:

Scikit-learn (sklearn) is a widely used and robust library for machine learning in Python. It provides
a selection of efficient tools for machine learning and statistical modeling, including classification,
regression, clustering and dimensionality reduction, via a consistent interface in Python. This library,
which is largely written in Python, is built upon NumPy, SciPy and Matplotlib.

Rather than focusing on loading, manipulating and summarising data, the scikit-learn library focuses on
modeling the data. Some of its most popular groups of models and features are as follows −

1. Supervised Learning algorithms − Almost all the popular supervised learning algorithms, such as
Linear Regression, Support Vector Machine (SVM) and Decision Tree, are part of scikit-learn.
2. Unsupervised Learning algorithms − It also has the popular unsupervised learning algorithms, from
clustering, factor analysis and PCA (Principal Component Analysis) to unsupervised neural networks.
3. Clustering − It is used for grouping unlabeled data.
4. Cross Validation − It is used to check the accuracy of supervised models on unseen data.
5. Dimensionality Reduction − It is used for reducing the number of attributes in data, which can be
further used for summarization, visualization and feature selection.
6. Ensemble methods − As the name suggests, they are used for combining the predictions of multiple
supervised models.
7. Feature extraction − It is used to extract features from data to define the attributes in image
and text data.
8. Feature selection − It is used to identify useful attributes to create supervised models.
9. Open Source − It is an open source library that is also commercially usable under the BSD license.

Scikit Learn - Modelling Process:

Dataset Loading

A collection of data is called a dataset. It has the following components −

1. Features − The variables of the data are called its features. They are also known as predictors,
inputs or attributes.
2. Feature matrix − It is the collection of features, in case there is more than one.
3. Feature Names − It is the list of all the names of the features.
4. Response − It is the output variable that depends upon the feature variables. It is also known as
the target, label or output.
5. Response Vector − It is used to represent the response column. Generally, we have just one response
column.
6. Target Names − They represent the possible values taken by a response vector.

Scikit-learn includes a few example datasets, such as iris and digits for classification and diabetes
for regression. (The Boston house prices dataset mentioned in older tutorials was removed in
scikit-learn 1.2.)

Example

Following is an example to load iris dataset −

from sklearn.datasets import load_iris

iris = load_iris()

X = iris.data

y = iris.target

feature_names = iris.feature_names

target_names = iris.target_names

print("Feature names:", feature_names)

print("Target names:", target_names)

print("\nFirst 10 rows of X:\n", X[:10])

Output

Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Target names: ['setosa' 'versicolor' 'virginica']

First 10 rows of X:
 [[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]
 [4.6 3.4 1.4 0.3]
 [5.  3.4 1.5 0.2]
 [4.4 2.9 1.4 0.2]
 [4.9 3.1 1.5 0.1]]

Splitting the dataset

To check the accuracy of our model, we can split the dataset into two pieces: a training set and a
testing set. We use the training set to train the model and the testing set to test it. After that,
we can evaluate how well our model did.

Example

The following example will split the data into 70:30 ratio, i.e. 70% data will be used as training data and
30% will be used as testing data. The dataset is iris dataset as in above example.

from sklearn.datasets import load_iris

iris = load_iris()

X = iris.data

y = iris.target

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

print(X_train.shape)

print(X_test.shape)

print(y_train.shape)

print(y_test.shape)

Output:

(105, 4)
(45, 4)
(105,)
(45,)

As seen in the example above, the train_test_split() function of scikit-learn is used to split the
dataset. This function has the following arguments:

1. X, y − Here, X is the feature matrix and y is the response vector, which need to be split.
2. test_size − This represents the ratio of test data to the total given data. In the above example,
we set test_size = 0.3 for 150 rows of X, which produces test data of 150*0.3 = 45 rows.

3. random_state − It seeds the random shuffling so that the split is always the same. This is useful
in situations where you want reproducible results.
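
The effect of random_state can be checked by splitting the same data twice. The following is a minimal sketch; the names split_a, split_b and same are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Two independent calls with the same random_state
split_a = train_test_split(X, y, test_size=0.3, random_state=1)
split_b = train_test_split(X, y, test_size=0.3, random_state=1)

# Each split is the list [X_train, X_test, y_train, y_test]
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)    # the two splits contain exactly the same rows
```

Leaving random_state out (or changing its value) gives a different shuffle, and therefore different training and testing rows, on each run.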

Train the Model

Next, we can use our dataset to train a prediction model. As discussed, scikit-learn has a wide range
of Machine Learning (ML) algorithms, which share a consistent interface for fitting, predicting and
computing metrics such as accuracy and recall.

Example

In the example below, we use the KNN (K nearest neighbors) classifier. The details of the KNN
algorithm are not needed here; the example is only meant to show the implementation steps.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)

classifier_knn = KNeighborsClassifier(n_neighbors=3)
classifier_knn.fit(X_train, y_train)
y_pred = classifier_knn.predict(X_test)

# Find the accuracy by comparing the actual response values (y_test)
# with the predicted response values (y_pred)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

# Provide sample data; the model makes predictions from it
sample = [[5, 5, 3, 2], [2, 4, 3, 5]]
preds = classifier_knn.predict(sample)

# str() keeps the printed names plain regardless of the NumPy version
pred_species = [str(iris.target_names[p]) for p in preds]
print("Predictions:", pred_species)

Output

Accuracy: 0.9833333333333333
Predictions: ['versicolor', 'virginica']

The scikit-learn library implements a wide range of machine learning algorithms, including linear
regression, logistic regression, Stochastic Gradient Descent, k-nearest neighbors, Support Vector
Machines, Naïve Bayes, decision trees, AdaBoost, k-Means and Hierarchical Clustering.

Lab Tasks:

1. Write a Python program to draw a straight line with a suitable label on the x axis and y axis, and
a title.
2. Write a Python program to draw a line using given axis values taken from a text file, with a
suitable label on the x axis and y axis, and a title.
Test Data:
test.txt
1 2
2 4
3 1

3. Write a Python program to plot two or more lines with legends, different widths and colors.
4. Write a Python program to plot two or more lines with different styles.
5. Write a Python program to display a bar chart and a horizontal bar chart of the popularity of
programming languages. Use a different color for each bar.
Sample data:
Programming languages: Java, Python, PHP, JavaScript, C#, C++
Popularity: 22.2, 17.6, 8.8, 8, 7.7, 6.7

6. Write a Python program to create a pie chart, with a title, of the popularity of programming
languages. Make multiple wedges of the pie. Use the data from task 5.
7. Write a Python program to draw a scatter plot comparing the marks of two subjects, Mathematics and
Science. Use the marks of 10 students.
Test Data:
math_marks = [88, 92, 80, 89, 100, 80, 60, 100, 80, 34]
science_marks = [35, 79, 79, 48, 100, 88, 32, 45, 20, 30]
marks_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

8. Write a Python program to convert a list of numeric values into a one-dimensional NumPy array.
9. Write a Python program to create a null vector of size 10 and update the sixth value to 11.
10. Write a Python program to create a 2D array with 1 on the border and 0 inside.
11. Write a Python program to concatenate two 2-dimensional arrays.
