
MATPLOTLIB

Matplotlib is one of the most popular graphing and data visualization modules for Python. It is
important for all professionals working with data: either they want to visualize the data they
are using to get a better grasp of it, or they want a better way to present that data to
others.

Once you have Matplotlib installed, open a terminal or a script and type:

import matplotlib

Once you can successfully import matplotlib, you are ready to continue. Let’s begin with one
of the simplest graphs, by importing pyplot:

In [1]: import matplotlib.pyplot as plt

In [3]: plt.plot([1,2,3],[4,5,1])
plt.show()

Creating LINE charts by passing variables

In practice, it is unlikely that you will type data directly into the plt.plot() function.
Instead you will, or at least you should, pass variables into it, as in plt.plot(x,y). So now
let us plot variables, and then add some descriptive labels and a good title.

In [4]: x = [5,8,10]
y = [12,16,6]

plt.plot(x,y)
plt.show()

But these charts are not useful without:

(a) Axis Labels

(b) Chart Title

Adding Axis-Labels/Chart Titles to the Charts

We can add the Axis-labels and the Chart Title using the following functions in Matplotlib.

plt.title('Title of the Chart')

plt.ylabel('Y axis Name')

plt.xlabel('X axis Name')

Using the above functions, let us name the axes and give the chart above a title:

In [5]: x = [5,8,10]
y = [12,16,6]
plt.plot(x,y)

plt.title('Epic Info') #-Where “Epic Info” is the title/name of the chart
plt.ylabel('Y Axis') #-Where “Y Axis” is the name of the y-axis
plt.xlabel('X-Axis') #-Where “X-Axis” is the name of the x-axis
plt.show()

Matplotlib STYLES:
Matplotlib styles are used to style graphs and charts. You can simply select a stylesheet and
use its preset features to customize your charts.

Let us begin by importing the style module:

In [6]: import matplotlib.style as msty

In [7]: #Now let’s add style to it:


msty.use('ggplot') #-Where ggplot is the name of the style
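
To see which style names can be passed to use(), the available list in the same module can be inspected. The sketch below assumes a standard Matplotlib install in which 'ggplot' is among the bundled styles. It also shows context(), which applies a style only inside a with block instead of changing it for the whole session.

```python
import matplotlib.pyplot as plt
import matplotlib.style as msty

# Names of all stylesheets known to this Matplotlib install
available_styles = sorted(msty.available)
print(available_styles)

# Apply a style temporarily; the previous settings return after the block
with msty.context('ggplot'):
    plt.plot([5, 8, 10], [12, 16, 6], linewidth=5)
    plt.title('Epic Info')
    plt.show()
```

Using context() is a reasonable choice when only one chart in a notebook should look different from the rest.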

In [8]: #and create a graph using the following:
x = [5,8,10]
y = [12,16,6]

x2 = [6,9,11]
y2 = [6,15,7]

# Plot both series, this time setting the line width explicitly:

plt.plot(x,y,linewidth=5) #-Where linewidth determines the width of the line in the chart
plt.plot(x2,y2,linewidth=5)
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()

Adding/Editing Colors/Legends to the Chart:

We can also edit the colors of the lines and add legends to the chart to make it more
readable and informative.

We can edit the color of the lines using the following code:

In [9]: plt.plot(x,y,'r') #-Where ‘r’ stands for color RED


plt.plot(x2,y2,'y') #-Where ‘y’ stands for color YELLOW
plt.show()

In [10]: #we can add a legend by giving each line a label and calling plt.legend():

plt.plot(x,y,'r', label = 'first data', linewidth=5)


plt.plot(x2,y2,'y', label='second data', linewidth=5)
plt.legend()
plt.show()

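We can also change the location of the legend using the loc argument:
plt.legend(loc=0)

0 is the default (automatic 'best' placement); 1, 2, 3 and 4 place the legend in the upper
right, upper left, lower left and lower right corner respectively.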

Let us use one of them for illustration purposes:

In [11]: plt.plot(x,y,'r', label = 'first data', linewidth=5)


plt.plot(x2,y2,'y', label='second data', linewidth=5)
plt.legend(loc=3)
plt.show()
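
The numeric location codes also have string names, which tend to be easier to read in code. A minimal sketch of the same chart, using 'lower left' as the string form of loc=3; the variable legend_labels is introduced here only to show what the legend contains.

```python
import matplotlib.pyplot as plt

x, y = [5, 8, 10], [12, 16, 6]
x2, y2 = [6, 9, 11], [6, 15, 7]

plt.plot(x, y, 'r', label='first data', linewidth=5)
plt.plot(x2, y2, 'y', label='second data', linewidth=5)

# 'lower left' is the string equivalent of loc=3
legend = plt.legend(loc='lower left')

# The legend entries come from the label= arguments, in plotting order
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
plt.show()
```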

In [12]: #We can add the grid in the chart by using:
plt.grid(True,color='k') #k is used for BLACK color

BAR CHARTS:
Bar charts, or bar graphs, are used to present grouped data with rectangular bars.

• The lengths of the bars are proportional to the values that they represent.

• The bars can be plotted vertically or horizontally. A vertical bar chart is sometimes called a
column chart.

• There are two axes in a bar plot:

• First Axis: It holds the categorical values (qualitative).

• Second Axis: It holds the discrete values (quantitative).

In [13]: x = [5,8,10]
y = [12,16,6]

x2 = [6,9,11]
y2 = [6,15,7]

plt.bar(x, y, align = 'center')


plt.bar(x2, y2, color = 'g', align = 'center')
plt.title('Bar graph')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()
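
The example above uses numbers on both axes. The bullets also mention a categorical axis and horizontal bars, which can be sketched as follows. The category names 'A', 'B' and 'C' are hypothetical and used only for illustration; the values are the y list from above.

```python
import matplotlib.pyplot as plt

# Hypothetical category names, for illustration only
categories = ['A', 'B', 'C']
values = [12, 16, 6]

# Vertical bars: categories on the x-axis, values on the y-axis
plt.figure()
vertical_bars = plt.bar(categories, values, color='g')
plt.title('Vertical bar graph')

# Horizontal bars: barh() puts the categories on the y-axis instead
plt.figure()
horizontal_bars = plt.barh(categories, values, color='b')
plt.title('Horizontal bar graph')

# The bar height (vertical) or width (horizontal) carries the value
bar_heights = [bar.get_height() for bar in vertical_bars]
bar_lengths = [bar.get_width() for bar in horizontal_bars]
plt.show()
```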

HISTOGRAMS
A histogram is used for density estimation.

• It looks similar to a bar plot, but instead of categories it groups continuous numeric data
into ranges.

• The ranges of continuous data are known as bins or intervals.

• All the bins must be adjacent to each other.

• Usually bins are of equal size, but they can vary in size.

• Example:

▪ Consider a histogram of x and y, each of which holds 500 Gaussian-distributed random
numbers. In the histogram, the distributions of x and y are represented with different colors.

We can create histograms using the following code:

In [14]: from matplotlib import pyplot as plt
import numpy as np

a = np.array([22,87,5,43,56,73,55,54,11,20,51,5,79,31,27])
plt.hist(a, bins = [0,20,40, 60,80,100]) #-Where Bins is the interval required on the x-axis
plt.title("histogram")
plt.show()
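
The Gaussian example described in the bullets above can be sketched as follows. The seed, the means and the standard deviations are arbitrary assumptions chosen for illustration; only the sample size of 500 comes from the text.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the figure is reproducible
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=500)
y = rng.normal(loc=2.0, scale=1.5, size=500)

# Shared, equally sized bins that span both samples
low = min(x.min(), y.min())
high = max(x.max(), y.max())
edges = np.linspace(low, high, 31)   # 31 edges give 30 adjacent bins

# alpha below 1 keeps both distributions visible where they overlap
counts_x, _, _ = plt.hist(x, bins=edges, alpha=0.5, color='b', label='x')
counts_y, _, _ = plt.hist(y, bins=edges, alpha=0.5, color='g', label='y')
plt.legend()
plt.title("histogram")
plt.show()
```

Passing the same bin edges to both calls keeps the bars of the two samples aligned, which makes the colors comparable.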

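NOTE:

Full signature of hist(), as documented for older Matplotlib releases (the normed and hold
arguments have since been removed from the library; use density instead of normed):

matplotlib.pyplot.hist(x, bins=None, range=None, density=None, weights=None,
cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None,
log=False, color=None, label=None, stacked=False, normed=None, hold=None, data=None,
**kwargs)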
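
A few of these parameters in use, as a minimal sketch on the array from the example above. hist() also returns the bin counts, the bin edges and the drawn bars, which is handy for checking what the chart shows; the choice of rwidth and color here is arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

# rwidth shrinks each bar to 90% of its bin so that the bins are easy to tell apart
counts, edges, patches = plt.hist(a, bins=[0, 20, 40, 60, 80, 100],
                                  histtype='bar', rwidth=0.9, color='g')

print(counts)  # how many values fall into each bin
print(edges)   # the bin boundaries that were passed in
plt.title("histogram")
plt.show()
```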

SCATTER DIAGRAM:

The pyplot submodule provides the scatter() function to generate scatter plots. The following
example produces the scatter plot of two sets of x and y arrays.

In [15]: x = [5,8,10]
y = [12,16,6]

x2 = [6,9,11]
y2 = [6,15,7]

plt.scatter(x, y)
plt.scatter(x2, y2, color = 'g')
plt.title('Scatter Chart')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

NOTE:

Full signature of scatter(), as documented for older Matplotlib releases (the verts and hold
arguments have since been removed from the library):

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None,
vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None,
hold=None, data=None, **kwargs)
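
Some of the parameters listed above, shown on the same data as a minimal sketch. The particular sizes, markers and transparency are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

x = [5, 8, 10]
y = [12, 16, 6]
x2 = [6, 9, 11]
y2 = [6, 15, 7]

# s is the marker area in points squared, marker the shape,
# alpha the transparency and edgecolors the outline color
first = plt.scatter(x, y, s=80, c='r', marker='o', alpha=0.7,
                    edgecolors='k', label='first data')
second = plt.scatter(x2, y2, s=80, c='g', marker='^', alpha=0.7,
                     edgecolors='k', label='second data')

plt.title('Scatter Chart')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.legend()
plt.show()
```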

SUBPLOTS
matplotlib.pyplot.subplots(nrows=1, ncols=1, sharex=False, sharey=False, squeeze=True,
subplot_kw=None, gridspec_kw=None, **fig_kw)

Subplots are a Matplotlib feature that lets you place multiple plots within one figure. They are
helpful when you want to show different presentations of data in a single view, for instance in
dashboards.

The Matplotlib subplot() function can be called to draw two or more plots in one figure. Matplotlib
supports all kinds of subplot layouts, including 2×1 (plots stacked in rows), 1×2 (plots side by
side in columns) or a 2×2 grid.

Horizontal subplot
In a horizontal subplot, the figure is divided by rows, so the plots are stacked one above the
other. The rows value in subplot(rows, columns, plot_number) is therefore always greater than 1.
Look at the following code:

In [16]: import numpy as np


import pandas as pd
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]

plt.subplot(2,1,1)
plt.title('subplot(2,1,1)')
plt.plot(t,s,'m')

plt.subplot(2,1,2)
plt.title('subplot(2,1,2)')

plt.plot(t,s,'r')
plt.show()

In [17]: #Vertical subplot

#In a vertical subplot, the figure is divided by columns, so the plots sit side by side.
#The columns value in subplot(rows, columns, plot_number) is therefore always greater than 1.
#Look at the following code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]

plt.subplot(1,2,1)
plt.title('subplot(1,2,1)')
plt.plot(t,s,'m')

plt.subplot(1,2,2)
plt.title('subplot(1,2,2)')

plt.plot(t,s,'r')
plt.show()

In [19]: #2X2 subplot Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]

plt.subplot(2,2,1)
plt.title('subplot(2,2,1)')
plt.plot(t,s,'m')

plt.subplot(2,2,2)
plt.title('subplot(2,2,2)')
plt.plot(t,s,'r')

plt.subplot(2,2,3)
plt.title('subplot(2,2,3)')

plt.plot(t,s,'g')

plt.subplot(2,2,4)
plt.title('subplot(2,2,4)')
plt.plot(t,s,'y')
plt.show()
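
The same 2×2 grid can also be built with plt.subplots(), the function whose signature is quoted at the start of this section. A minimal sketch: it creates the figure and all four axes in one call, and each plot is then drawn through its own axes object. The use of sharex, sharey and tight_layout() here is an optional choice, not something the examples above require.

```python
import matplotlib.pyplot as plt

t = range(0, 20)
s = list(range(1, 21))

# One call returns the figure and a 2x2 array of axes
fig, axes = plt.subplots(nrows=2, ncols=2, sharex=True, sharey=True)

colors = ['m', 'r', 'g', 'y']
# axes.flat walks the grid row by row, matching subplot numbers 1 to 4
for number, (ax, color) in enumerate(zip(axes.flat, colors), start=1):
    ax.plot(t, s, color)
    ax.set_title(f'subplot(2,2,{number})')

fig.tight_layout()  # keep titles and tick labels from overlapping
plt.show()
```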

In [20]: #EXAMPLE

x=pd.DataFrame(data=[1,2,3,4,5])
y=pd.DataFrame(data=[1,4,9,16,25])

plt.plot(x,y*2, label='Multiplier Data')


plt.plot(x*2,y, label='Square Data')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(loc=4)
plt.show()

STACKED BAR CHART EXAMPLE
A stacked bar plot is a bar plot in which the discrete values of several series for the same
category are stacked on top of each other in one bar.

• The different series are shown in different colors within one column.

• Example:

▪ If you are comparing the scores of India and Pakistan in the final 5 overs of their previous 5
matches, a stacked bar plot gives a better comparison.

▪ The two teams are represented with different colors.

▪ India is shown in blue and Pakistan in green.

In [21]: # Here we are showing india and pakistan scores in last 5 overs
india = (20, 35, 30, 35, 27)
pakistan = (25, 32, 34, 20, 25)
N = 5

ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence

# This creates bars of both teams, in one graph


p1 = plt.bar(ind, india, width, color='b')
p2 = plt.bar(ind, pakistan, width, color='g', bottom=india)

plt.ylabel('Scores')
plt.title('Scores of India and Pakistan')

# Puts one label under the center of each bar (bars are centered on ind
# by default in Matplotlib 2.0 and later, so no offset is needed)

plt.xticks(ind, ('L1', 'L2', 'L3', 'L4', 'L5'))
plt.yticks(np.arange(0, 81, 10))

# shows the box labeling of india and pakistan


plt.legend((p1[0], p2[0]), ('India', 'Pakistan'))
plt.show()
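
How the bottom argument produces the stacking can be checked directly on the bar objects. The sketch below reuses the scores from the example and additionally writes the combined total above each column; the text labels are an optional addition, not part of the example above.

```python
import numpy as np
import matplotlib.pyplot as plt

india = (20, 35, 30, 35, 27)
pakistan = (25, 32, 34, 20, 25)
ind = np.arange(len(india))
width = 0.35

p1 = plt.bar(ind, india, width, color='b', label='India')
# bottom=india makes each Pakistan bar start where the India bar ends
p2 = plt.bar(ind, pakistan, width, color='g', bottom=india, label='Pakistan')

# Full column height = sum of the two stacked segments
totals = [int(lower.get_height() + upper.get_height())
          for lower, upper in zip(p1, p2)]
for x_pos, total in zip(ind, totals):
    plt.text(x_pos, total + 1, str(total), ha='center')  # label above the column

plt.ylabel('Scores')
plt.legend()
plt.show()
```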

Using MTCARS to create charts


In [23]: #Load Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [28]: import os
os.getcwd()

Out[28]: 'C:\\Users\\Shivani\\Proschool'

In [33]: mtcars=mtcars.rename(index=str, columns={'Unnamed: 0': 'name'})


mtcars.head()

Out[33]:
                name   mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
0          Mazda RX4  21.0    6  160.0  110  3.90  2.620  16.46   0   1     4     4
1      Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
2         Datsun 710  22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
3     Hornet 4 Drive  21.4    6  258.0  110  3.08  3.215  19.44   1   0     3     1
4  Hornet Sportabout  18.7    8  360.0  175  3.15  3.440  17.02   0   0     3     2
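
The cell that reads the dataset into mtcars is not part of the listing above, and the cells that follow call the DataFrame dat. A minimal sketch of how both names could be set up, assuming the data comes from a CSV file whose first, unnamed column holds the car names. The inline text below stands in for such a file and contains only the five rows shown above; the real file name and location are not given here.

```python
import io
import pandas as pd

# Stand-in for the CSV file: header plus the five rows displayed above.
# pandas labels the unnamed first column 'Unnamed: 0'.
csv_text = """,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
Mazda RX4,21.0,6,160.0,110,3.90,2.620,16.46,0,1,4,4
Mazda RX4 Wag,21.0,6,160.0,110,3.90,2.875,17.02,0,1,4,4
Datsun 710,22.8,4,108.0,93,3.85,2.320,18.61,1,1,4,1
Hornet 4 Drive,21.4,6,258.0,110,3.08,3.215,19.44,1,0,3,1
Hornet Sportabout,18.7,8,360.0,175,3.15,3.440,17.02,0,0,3,2
"""

# With a real file, a path would be passed to read_csv instead of the buffer
mtcars = pd.read_csv(io.StringIO(csv_text))
mtcars = mtcars.rename(columns={'Unnamed: 0': 'name'})

# Assumption: 'dat' in the later cells is this same DataFrame
dat = mtcars
print(dat.head())
```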

Here we plot a bar chart of the categorical variable “gear” from the mtcars dataset (the cells
below refer to the DataFrame as dat).

pd.crosstab(index, columns) builds the table of counts to plot, and plot() draws it; since we
need a bar plot, we pass kind='bar'.

In the plot below, the x-axis shows the possible numbers of gears and the y-axis shows the count
of vehicles with each number of gears.

In [36]: pd.crosstab(dat['gear'],columns = 'Count').plot(kind='bar')

Out[36]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f6d5fd0>
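
To see what crosstab() hands to plot(), the table can be printed first. The sketch below uses only the gear values of the five cars previewed earlier, so its counts are smaller than those of the full dataset. The axis labels are added for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Gear values of the five cars shown in the preview of the dataset
dat = pd.DataFrame({'gear': [4, 4, 4, 3, 3]})

# One row per gear value, one column named 'Count'
counts = pd.crosstab(dat['gear'], columns='Count')
print(counts)

# value_counts() yields the same numbers as a Series
same_counts = dat['gear'].value_counts().sort_index()

ax = counts.plot(kind='bar')
ax.set_xlabel('gear')
ax.set_ylabel('number of vehicles')
plt.show()
```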

VERTICAL STACKED BAR:
The plot below visualizes two categorical variables at once, which is a good way to compare
them.

• In the plot below, the x-axis shows the possible numbers of gears, and each bar is split by the
number of cylinders (cyl) to show its distribution.

In [37]: pd.crosstab(dat['gear'],dat['cyl']).plot(kind='bar',stacked=True)

Out[37]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f7c2630>

HORIZONTAL STACKED BAR
We can also draw a horizontal bar plot for the same example by changing “kind” from “bar” to
“barh” (h for horizontal).

In [38]: pd.crosstab(dat['gear'],dat['cyl']).plot(kind='barh',stacked=True)

Out[38]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f88b160>

HISTOGRAM
A histogram is used for density estimation and is plotted for a numeric variable.

• Let us plot a histogram of ‘mpg’ (mileage) of the vehicles in the mtcars dataset.

In [39]: dat['mpg'].plot(kind='hist',alpha=0.5)

Out[39]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f8b8e80>

BOXPLOT
Box plot: We draw a box plot to show the distribution of a variable, optionally with respect to
categories.

• The box spans the quartiles, and the whiskers extend to show the rest of the distribution.

• Box plots are very helpful for outlier detection.

• Example: Let us plot a simple box plot of ‘mpg’ (mileage) using the mtcars dataset.

In [40]: dat['mpg'].plot(kind='box')

Out[40]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f9a8ba8>
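
The quantities that a box plot draws can be computed by hand. A minimal sketch, assuming Matplotlib's default whisker rule of 1.5 times the interquartile range, and using only the mpg values of the five cars previewed earlier (so the numbers differ from those of the full dataset):

```python
import numpy as np

# mpg values of the five cars shown in the preview of the dataset
mpg = np.array([21.0, 21.0, 22.8, 21.4, 18.7])

# Lower quartile, median and upper quartile define the box
q1, median, q3 = np.percentile(mpg, [25, 50, 75])
iqr = q3 - q1  # height of the box

# Whiskers reach at most 1.5 * IQR beyond the box; points outside
# these fences are drawn individually as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sorted(mpg[(mpg < lower_fence) | (mpg > upper_fence)].tolist())

print(q1, median, q3)
print(outliers)
```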

SCATTER PLOTS
In [41]: dat.plot(kind='scatter', x='mpg',y='hp')

Out[41]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40f9f55c0>

PIE_CHART
It depicts the numerical proportions of data. The proportions are shown using wedges.

• Simple Pie Chart: Let us plot a pie chart of gears. The plot shows the proportion of each gear
type.

In [42]: d=pd.crosstab(dat['gear'],columns="Count")['Count']
d

Out[42]: gear
3 15
4 12
5 5
Name: Count, dtype: int64

In [43]: d.plot(kind='pie')

Out[43]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40fa47630>
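
The size of each wedge is the share of that gear type in the total. A minimal sketch that computes these shares from the counts printed in Out[42] and asks the pie chart to print them with autopct; the Series is rebuilt by hand here so that the block runs on its own.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Gear counts as printed in Out[42]
d = pd.Series([15, 12, 5], index=[3, 4, 5], name='Count')
d.index.name = 'gear'

# Share of each gear type in percent; these are the wedge sizes
percent = d / d.sum() * 100
print(percent)

# autopct writes the percentage inside each wedge
ax = d.plot(kind='pie', autopct='%1.1f%%')
plt.show()
```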

Pie chart created by separating the wedges of the chart.
The following chart is an example of an exploded pie chart.

In [44]: labels=['3-Gear','4-Gear','5-Gear']
colors=['gold','yellowgreen','lightcoral']
explode=(0.1,0.1,0.1)
d.plot(kind='pie',explode=explode,labels=labels,colors=colors,
       autopct='%1.1f%%',shadow=True,startangle=140)

Out[44]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40fab5e48>

Pie chart created by separating a single wedge of the chart.
The following chart is an example of a pie chart with a single exploded wedge.

In [45]: explode = (0,0.1,0)


d.plot(kind='pie',explode=explode,labels=labels,colors=colors,
       autopct='%1.1f%%',shadow=True,startangle=140)

Out[45]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40fa1ca90>

Adding Secondary Axis


Secondary Axis: If we want to compare two line charts whose values are on different scales, we
can draw the second line against a secondary y-axis in the same chart.

• Example: If we need line charts of both ‘mpg’ and ‘disp’ in the same chart, we can put
‘disp’ on the secondary axis and get them both in one chart.

In [46]: dat['mpg'].plot(kind='line',style='r')
dat['disp'].plot(secondary_y=True,style='g')

Out[46]: <matplotlib.axes._subplots.AxesSubplot at 0x1f40fb9a160>
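
Outside pandas, the same effect is available in plain Matplotlib through twinx(), which adds a second y-axis sharing the x-axis. A minimal sketch, using only the mpg and disp values of the five cars previewed earlier; the axis labels are added for illustration.

```python
import matplotlib.pyplot as plt

# mpg and disp of the five cars shown in the preview of the dataset
mpg = [21.0, 21.0, 22.8, 21.4, 18.7]
disp = [160.0, 160.0, 108.0, 258.0, 360.0]

fig, ax_left = plt.subplots()
ax_left.plot(mpg, 'r')
ax_left.set_ylabel('mpg', color='r')

# twinx() creates a second y-axis on the right that shares the x-axis
ax_right = ax_left.twinx()
ax_right.plot(disp, 'g')
ax_right.set_ylabel('disp', color='g')

plt.show()
```

Each line keeps its own scale this way, so the small mpg values are not flattened by the much larger disp values.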

SEABORN
