7th LESSON (ANKUR - PROSCHOOL) - MATPLOTLIB - HTML
Matplotlib is the most popular graphing and data visualization module for Python. It matters to
every professional who works with data, whether they want a better way to visualize the data so
that they can understand it, or a better way to present that data to others.
Once you have Matplotlib installed, open a terminal or a script and type:
import matplotlib
If the import succeeds, you are ready to continue. Let’s begin with one of the simplest graphs,
which first requires importing pyplot:
In [3]: import matplotlib.pyplot as plt
plt.plot([1,2,3],[4,5,1])
plt.show()
Creating LINE charts by passing variables
In practice, you will rarely type data directly into the plt.plot() function. Instead you will, or
at least you should, pass variables into it, as in plt.plot(x,y). So let us now plot variables and
then add descriptive labels and a good title.
In [4]: x = [5,8,10]
y = [12,16,6]
plt.plot(x,y)
plt.show()
But these charts are not very useful without axis labels and a chart title.
We can add them with the Matplotlib functions plt.xlabel(), plt.ylabel() and plt.title().
Let us use these functions to name the axes and title the chart above:
In [5]: x = [5,8,10]
y = [12,16,6]
plt.plot(x,y)
plt.title('Epic Info') # 'Epic Info' is the title of the chart
plt.ylabel('Y Axis') # 'Y Axis' is the name of the y-axis
plt.xlabel('X-Axis') # 'X-Axis' is the name of the x-axis
plt.show()
Matplotlib STYLES:
Matplotlib styles are used to style graphs and charts. You can import a style sheet and use its
preset features to customize your charts.
In [8]: #and create a graph using the following:
x = [5,8,10]
y = [12,16,6]
x2 = [6,9,11]
y2 = [6,15,7]
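The cell above defines the data but stops before a style is applied or anything is drawn. A minimal sketch of how a style sheet might be used with this data follows; the choice of the built-in 'ggplot' style and the chart title are assumptions, not taken from the original notebook.

```python
import matplotlib.pyplot as plt
from matplotlib import style

# 'ggplot' is one of Matplotlib's built-in style sheets (assumed choice)
style.use('ggplot')

x = [5, 8, 10]
y = [12, 16, 6]
x2 = [6, 9, 11]
y2 = [6, 15, 7]

# Both lines take their colours, grid and background from the active style
plt.plot(x, y)
plt.plot(x2, y2)
plt.title('Styled line chart')  # hypothetical title
plt.show()
```

The names of all installed style sheets can be listed with print(style.available).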
We can also edit the colors of the lines and add a legend to the chart to make it more
informative.
We can set the color of a line with the color argument of plt.plot(), for example color='g' for green.
In [10]: #we can add the legend entries using the label argument of plt.plot():
We can also change the location of the legend using the loc argument:
plt.legend(loc=0)
0 is the default (automatic 'best' placement); 1, 2, 3 and 4 place the legend in the upper right,
upper left, lower left and lower right corners.
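The colour, label and legend options described above can be combined in one short example. This is a sketch only: the label texts 'line one' and 'line two' and the colour choices are hypothetical.

```python
import matplotlib.pyplot as plt

x = [5, 8, 10]
y = [12, 16, 6]
x2 = [6, 9, 11]
y2 = [6, 15, 7]

# color sets the line colour ('g' = green, 'c' = cyan);
# label is the text shown for that line in the legend
line_one, = plt.plot(x, y, color='g', label='line one')
line_two, = plt.plot(x2, y2, color='c', label='line two')

# loc=0 lets Matplotlib choose the position; loc=2 would force the upper left
legend = plt.legend(loc=0)
plt.show()
```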
In [12]: #We can add the grid in the chart by using:
plt.grid(True,color='k') #k is used for BLACK color
BAR CHARTS:
Bar charts, or bar graphs, are used to present grouped data with rectangular bars.
• The lengths of the bars are proportional to the values that they represent.
• The bars can be plotted vertically or horizontally. A vertical bar chart is sometimes called a
column chart.
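In [13]: x = [5,8,10]
y = [12,16,6]
x2 = [6,9,11]
y2 = [6,15,7]
plt.bar(x, y) # first group of bars
plt.bar(x2, y2, color='g') # second group of bars, drawn in green
plt.title('Bar Chart')
plt.show()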
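The bullets above note that bars can also run horizontally. A minimal sketch using plt.barh(), with the same x and y values; the title and axis labels are assumptions added for illustration.

```python
import matplotlib.pyplot as plt

x = [5, 8, 10]
y = [12, 16, 6]

# barh takes the bar positions first and the bar lengths second,
# so the values in y now run along the horizontal axis
bars = plt.barh(x, y, color='g')
plt.title('Horizontal bar chart')
plt.xlabel('Value')
plt.ylabel('Position')
plt.show()
```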
HISTOGRAMS
A histogram is used to estimate the distribution (density) of a numeric variable.
• It looks similar to a bar plot, but instead of categorical data it groups continuous data into
numeric ranges called bins.
• All the bins must be adjacent to each other.
• Usually the bins are of equal size, but they can vary in size.
• Example:
▪ Consider a histogram of x and y, each made of 500 Gaussian-distributed random numbers. In the
histogram, the distributions of x and y are shown in different colors.
In [14]: from matplotlib import pyplot as plt
import numpy as np
a = np.array([22,87,5,43,56,73,55,54,11,20,51,5,79,31,27])
plt.hist(a, bins = [0,20,40,60,80,100]) # bins gives the interval edges required on the x-axis
plt.title("histogram")
plt.show()
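The two-sample example described in the bullets (x and y as 500 Gaussian random numbers each, drawn in different colors) is not shown in the cell above. One way it might be sketched is given below; the means, the seed and the bin range are assumptions chosen only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so the random sample is reproducible (seed value is arbitrary)
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=500)  # assumed mean 0
y = rng.normal(loc=2.0, scale=1.0, size=500)  # assumed mean 2

# Shared, equal-width, adjacent bins so the two distributions are comparable
bins = np.linspace(-4, 6, 21)
counts_x, edges, _ = plt.hist(x, bins=bins, alpha=0.5, color='b', label='x')
counts_y, _, _ = plt.hist(y, bins=bins, alpha=0.5, color='r', label='y')
plt.legend()
plt.title('Histogram of x and y')
plt.show()
```

The alpha value makes the bars semi-transparent, so the overlap between the two distributions stays visible.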
SCATTER DIAGRAM:
The pyplot submodule provides the scatter() function to generate scatter plots. The following
example produces a scatter plot of two sets of x and y arrays.
In [15]: x = [5,8,10]
y = [12,16,6]
x2 = [6,9,11]
y2 = [6,15,7]
plt.scatter(x, y)
plt.scatter(x2, y2, color = 'g')
plt.title('Scatter Chart')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
NOTE:
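matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None,
vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None,
hold=None, data=None, **kwargs)
This signature comes from an older Matplotlib release; the verts and hold parameters are no
longer accepted by current versions.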
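A few of the parameters in the signature above can be tried on the same small data set. This is a sketch with arbitrarily chosen values for s, c, marker and alpha.

```python
import matplotlib.pyplot as plt

x = [5, 8, 10]
y = [12, 16, 6]

# s: marker area in points squared, c: marker colour,
# marker: marker shape ('^' = triangle), alpha: transparency from 0 to 1
points = plt.scatter(x, y, s=100, c='r', marker='^', alpha=0.6)
plt.title('Scatter Chart')
plt.show()
```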
SUBPLOTS
matplotlib.pyplot.subplots(nrows=1, ncols=1, sharex=False, sharey=False, squeeze=True,
subplot_kw=None, gridspec_kw=None, **fig_kw)
Subplots are a Matplotlib feature that lets you add multiple plots within one figure. They are
helpful when you want to show different presentations of data in a single view, for instance in
dashboards.
The Matplotlib subplot() function can be called to place two or more plots in one figure. Matplotlib
supports all kinds of subplot layouts, including 2×1 horizontal, 1×2 vertical or a 2×2 grid.
Horizontal subplot
In a horizontal subplot, the plots are arranged by rows. In that sense the rows value in
(rows, columns, plot_number) is always greater than 1. Look at the following code:
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
plt.subplot(2,1,1) # 2 rows, 1 column, first plot
plt.title('subplot(2,1,1)')
plt.plot(t,s,'m')
plt.subplot(2,1,2) # 2 rows, 1 column, second plot
plt.title('subplot(2,1,2)')
plt.plot(t,s,'r')
plt.show()
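Vertical subplot
In a vertical subplot, the plots are arranged by columns, so the columns value in
(rows, columns, plot_number) is greater than 1.
import numpy as np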
import pandas as pd
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
plt.subplot(1,2,1)
plt.title('subplot(1,2,1)')
plt.plot(t,s,'m')
plt.subplot(1,2,2)
plt.title('subplot(1,2,2)')
plt.plot(t,s,'r')
plt.show()
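Grid subplot
In a 2×2 grid, both the rows and the columns values are 2, and plot_number runs from 1 to 4.
import numpy as np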
import pandas as pd
import matplotlib.pyplot as plt
t = range(0, 20)
s = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
plt.subplot(2,2,1)
plt.title('subplot(2,2,1)')
plt.plot(t,s,'m')
plt.subplot(2,2,2)
plt.title('subplot(2,2,2)')
plt.plot(t,s,'r')
plt.subplot(2,2,3)
plt.title('subplot(2,2,3)')
plt.plot(t,s,'g')
plt.subplot(2,2,4)
plt.title('subplot(2,2,4)')
plt.plot(t,s,'y')
plt.show()
In [20]: #EXAMPLE
x=pd.DataFrame(data=[1,2,3,4,5])
y=pd.DataFrame(data=[1,4,9,16,25])
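The example cell above ends after defining x and y. One way this data might be drawn with the plt.subplots() function whose signature is listed at the start of this section is sketched below; the one-row, two-column layout and the panel titles are assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

x = pd.DataFrame(data=[1, 2, 3, 4, 5])
y = pd.DataFrame(data=[1, 4, 9, 16, 25])

# plt.subplots returns the figure and an array of axes (1 row, 2 columns here)
fig, axes = plt.subplots(nrows=1, ncols=2, sharex=True)

# A DataFrame built from a plain list has a single column named 0
axes[0].plot(x[0], y[0], 'm')
axes[0].set_title('line')
axes[1].scatter(x[0], y[0], color='g')
axes[1].set_title('scatter')
plt.show()
```

Unlike plt.subplot(), which selects one panel at a time, plt.subplots() creates all panels at once and each one is then drawn through its own axes object.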
STACKED BAR CHART EXAMPLE
A stacked bar plot is a bar plot that shows similar categories with their different discrete
values stacked together.
• Example:
▪ If you are comparing the scores in the final 5 overs of India and Pakistan in the previous 5
matches, a stacked bar plot gives a better comparison.
In [21]: # Here we are showing india and pakistan scores in last 5 overs
india = (20, 35, 30, 35, 27)
pakistan = (25, 32, 34, 20, 25)
N = 5
ind = np.arange(N) # the x locations for the groups
width = 0.35 # the width of the bars: can also be len(x) sequence
p1 = plt.bar(ind, india, width) # India's scores form the lower bars
p2 = plt.bar(ind, pakistan, width, bottom=india) # Pakistan's scores are stacked on top
plt.ylabel('Scores')
plt.title('Scores of India and Pakistan')
plt.legend((p1[0], p2[0]), ('India', 'Pakistan'))
plt.show()
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
In [28]: import os
os.getcwd()
Out[28]: 'C:\\Users\\Shivani\\Proschool'
Out[33]: (table rows not preserved; the column headers are listed below)
name mpg cyl disp hp drat wt qsec vs am gear carb
Here we plot a bar plot of the categorical variable “gear” from the mtcars dataset using plot();
since we need a bar plot, we pass kind='bar'.
In the plot below, the x-axis shows the possible numbers of gears and the y-axis shows the count
of vehicles with each number of gears.
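The cell that draws this bar plot is not present in this copy. A minimal sketch of how it might look is given below. Because the data file is not available here, the frame named dat is a stand-in that holds only a gear column, rebuilt from the counts printed in Out[42] further down (15, 12 and 5 vehicles); the use of value_counts() is an assumption about how the counts were obtained.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the notebook's `dat`: only the gear column, rebuilt from the
# counts printed in Out[42] (15 three-gear, 12 four-gear, 5 five-gear cars)
dat = pd.DataFrame({'gear': [3] * 15 + [4] * 12 + [5] * 5})

# value_counts() tallies each gear value; sort_index() orders the bars 3, 4, 5
gear_counts = dat['gear'].value_counts().sort_index()
ax = gear_counts.plot(kind='bar')
ax.set_xlabel('gear')
ax.set_ylabel('number of vehicles')
plt.show()
```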
VERTICAL STACKED BAR:
In the plot below we visualize two categorical variables together, which gives a good way to
compare them.
• In the plot below, the x-axis shows the possible numbers of gears and the y-axis shows the
distribution of the number of cylinders (cyl).
In [37]: pd.crosstab(dat['gear'],dat['cyl']).plot(kind='bar',stacked=True)
HORIZONTAL STACKED BAR
We can also draw a horizontal bar plot for the same example by changing “kind” from “bar” to
“barh” (h for horizontal).
In [38]: pd.crosstab(dat['gear'],dat['cyl']).plot(kind='barh',stacked=True)
HISTOGRAM
A histogram is used for density estimation; it is plotted for a numeric variable.
In [39]: dat['mpg'].plot(kind='hist',alpha=0.5)
BOXPLOT
Box plot: we draw a box plot to show the distribution of a variable, often with respect to categories.
• The box represents the quartiles, and the whiskers extend to show the rest of the distribution.
• Example: Let us plot a simple box plot of ‘mpg’ (mileage) using the mtcars dataset.
In [40]: dat['mpg'].plot(kind='box')
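The cell above draws a single box for ‘mpg’. To show the distribution with respect to categories, as the description mentions, one option is the by argument of DataFrame.boxplot(). The sketch below uses a small stand-in frame with made-up values, not the real mtcars rows.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up illustrative values, not the real mtcars rows
dat = pd.DataFrame({
    'gear': [3, 3, 3, 4, 4, 4, 5, 5, 5],
    'mpg': [15.0, 16.5, 18.0, 21.0, 24.0, 27.0, 16.0, 20.0, 26.0],
})

# One box per gear value: the `by` argument groups mpg by category
dat.boxplot(column='mpg', by='gear')
plt.ylabel('mpg')
plt.show()
```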
SCATTER PLOTS
In [41]: dat.plot(kind='scatter', x='mpg',y='hp')
PIE_CHART
It depicts the numerical proportions of the data. The proportions are shown using wedges.
• Simple Pie Chart: Let us plot a pie chart of gears. The plot shows the proportion of each gear
type.
In [42]: d=pd.crosstab(dat['gear'],columns="Count")['Count']
d
Out[42]: gear
3 15
4 12
5 5
Name: Count, dtype: int64
In [43]: d.plot(kind='pie')
Exploded pie chart: a pie chart whose wedges are separated from one another.
The following chart is an example of an exploded pie chart.
In [44]: labels=['3-Gear','4-Gear','5-Gear']
colors=['gold','yellowgreen','lightcoral']
explode=(0.1,0.1,0.1)
d.plot(kind='pie',explode=explode,labels=labels,colors=colors,
       autopct='%1.1f%%',shadow=True,startangle=140)
• Example: If we need line charts of both ‘mpg’ and ‘disp’ in the same chart, we can plot
‘disp’ on a secondary y-axis and show both in one chart.
In [46]: dat['mpg'].plot(kind='line',style='r')
dat['disp'].plot(secondary_y=True,style='g')
SEABORN