Pyplot
A line graph is typically used to show the relationship between two data
sets plotted on different axes, for instance X and Y.
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per= [96,98.5,98.5 ,100]
plt.plot(pass_per, year)
plt.show()
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per= [96,98.5,98.5 ,100]
plt.plot(pass_per, year)
plt.xlabel("Pass Percentage")
plt.ylabel("Year")
plt.title("Year Wise Pass Percentage")
plt.show()
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per= [96,98.5,98.5 ,100]
plt.plot(pass_per, year)
plt.xlabel("Pass Percentage",color='red')
plt.ylabel("Year",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.show()
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per= [96,98.5,98.5 ,100]
plt.plot(pass_per, year, marker='o', markeredgecolor='red',
         markerfacecolor='yellow', ls='dashdot')
plt.xlabel("Pass Percentage",color='red')
plt.ylabel("Year",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.show()
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per_cs= [96,98.5,98.5 ,100]
pass_per_ip=[97,98,99,99]
plt.plot(pass_per_cs, year, marker='o', markeredgecolor='red',
         markerfacecolor='yellow', ls='dashdot', color='magenta', label='CS')
plt.plot(pass_per_ip, year, marker='d', markeredgecolor='blue',
         markerfacecolor='green', ls='dotted', color='red', label='IP')
plt.legend(loc='upper center')
plt.xlabel("Pass Percentage",color='red')
plt.ylabel("Year",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.show()
import matplotlib.pyplot as plt
year = [2015,2016,2017,2018]
pass_per_cs= [96,98.5,98.5 ,100]
pass_per_ip=[97,98,99,99]
plt.plot(pass_per_cs, year, marker='o', markeredgecolor='red',
         markerfacecolor='yellow', ls='dashdot', color='magenta', label='CS')
plt.plot(pass_per_ip, year, marker='d', markeredgecolor='blue',
         markerfacecolor='green', ls='dotted', color='red', label='IP')
plt.legend(loc='upper center')
plt.xlabel("Pass Percentage",color='red')
plt.ylabel("Year",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.yticks(year)
plt.show()
Bar chart
import numpy as np
import matplotlib.pyplot as plt
year =np.array( [2015,2016,2017,2018])
pass_per_cs= [50,98.5,60 ,100]
pass_per_ip=[60,90,40,100]
plt.bar(year,pass_per_cs,ls='dashdot',width=0.5,color='magenta',edgecolor='black',label='CS')
plt.bar(year+0.2,pass_per_ip,width=0.5,ls='dotted',color='red',edgecolor='black',label='IP')
plt.legend(loc='upper center')
plt.xlabel("Year",color='red')
plt.ylabel("Pass Percentage",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.xticks(year)
plt.show()
Horizontal bar
import numpy as np
import matplotlib.pyplot as plt
# Show graphic
plt.show()
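A horizontal bar chart swaps the roles of the axes: plt.barh takes the bar positions on the Y axis first and the bar lengths second, and uses height where plt.bar uses width. The following is a minimal sketch, assuming the same year-wise pass percentages as the bar chart example. The data, offsets and styling are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed data: reused from the bar chart example for illustration
year = np.array([2015, 2016, 2017, 2018])
pass_per_cs = [50, 98.5, 60, 100]
pass_per_ip = [60, 90, 40, 100]

# barh(y, width, height=...): years sit on the Y axis, percentages on the X axis
bars_cs = plt.barh(year - 0.2, pass_per_cs, height=0.4, color='magenta',
                   edgecolor='black', label='CS')
bars_ip = plt.barh(year + 0.2, pass_per_ip, height=0.4, color='red',
                   edgecolor='black', label='IP')
plt.legend(loc='lower right')
plt.xlabel("Pass Percentage")
plt.ylabel("Year")
plt.title("Year Wise Pass Percentage")
plt.yticks(year)  # one tick per year instead of fractional values
plt.show()
```

Shifting the two series by -0.2 and +0.2 with a height of 0.4 places the bars for each year side by side without overlap.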
Saving graph
import numpy as np
import matplotlib.pyplot as plt
year =np.array( [2015,2016,2017,2018])
pass_per_cs= [96,98.5,98.5 ,100]
pass_per_ip=[97,98,99,99]
plt.bar(year,pass_per_cs,ls='dashdot',color='magenta',width=0.5,edgecolor='black',label='CS')
plt.bar(year+0.2,pass_per_ip,ls='dotted',color='red',width=0.5,edgecolor='black',label='IP')
plt.legend(loc='upper center')
plt.xlabel("Year",color='red')
plt.ylabel("Pass Percentage",color='green')
plt.title("Year Wise Pass Percentage",color='yellow')
plt.xticks([2015,2016,2017,2018])
plt.savefig("abc.png")
plt.show()
Pie Chart
import matplotlib.pyplot as plt
labels = 'Chemistry', 'Physics', 'Maths', 'CS'
reading_hours = [5,4,4,2]
colors = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue']
explode = (0.1, 0, 0, 0) # explode 1st slice
# Plot
plt.pie(reading_hours, explode=explode, labels=labels,
colors=colors,autopct='%1.1f%%', shadow=True, startangle=140)
plt.legend(loc="upper left")
plt.axis('equal')
plt.show()
Histograms
A histogram is a graphical representation of the distribution of
numerical data. It is an estimate of the probability distribution of a
continuous (quantitative) variable and was first introduced by Karl
Pearson. It is a kind of bar graph. To construct a histogram, the first
step is to “bin” the range of values, that is, divide the entire range of
values into a series of intervals, and then count how many values fall
into each interval. The bins are usually specified as consecutive,
non-overlapping intervals of a variable. The bins (intervals) must be
adjacent, and are often (but are not required to be) of equal size.
• Alternatively, you may derive the bins using the following formulas:
• n = number of observations
• Range = maximum value – minimum value
• # of intervals = √n
• Width of intervals = Range / (# of intervals)
• Let’s say that you have the following data about the age of 100
individuals:
Age
1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91
Determine number of bins
• Using our formulas:
• n = number of observations = 100
• Range = maximum value – minimum value = 91 – 1 = 90
• # of intervals = √n = √100 = 10
• Width of intervals = Range / (# of intervals) = 90/10 = 9
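The same arithmetic can be checked in a few lines of Python. This is a minimal sketch using only the standard library. The variable names are illustrative, and rounding √n to a whole number is an assumption for cases where n is not a perfect square.

```python
import math

# Ages of the 100 individuals listed above
ages = [1, 1, 2, 3, 3, 5, 7, 8, 9, 10,
        10, 11, 11, 13, 13, 15, 16, 17, 18, 18,
        18, 19, 20, 21, 21, 23, 24, 24, 25, 25,
        25, 25, 26, 26, 26, 27, 27, 27, 27, 27,
        29, 30, 30, 31, 33, 34, 34, 34, 35, 36,
        36, 37, 37, 38, 38, 39, 40, 41, 41, 42,
        43, 44, 45, 45, 46, 47, 48, 48, 49, 50,
        51, 52, 53, 54, 55, 55, 56, 57, 58, 60,
        61, 63, 64, 65, 66, 68, 70, 71, 72, 74,
        75, 77, 81, 83, 84, 87, 89, 90, 90, 91]

n = len(ages)                          # number of observations
value_range = max(ages) - min(ages)    # maximum value - minimum value
num_intervals = round(math.sqrt(n))    # square root of n, rounded (assumption)
width = value_range / num_intervals    # width of each interval

print(n, value_range, num_intervals, width)  # 100 90 10 9.0
```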
Based on this information, the frequency table would look like this:
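The counts per interval can also be tallied directly. The sketch below assumes the bin edges 0, 10, ..., 90, 99 used in the plt.hist calls in this section, and that every age lies within those edges. Each interval includes its left edge and excludes its right edge, except the last, which includes both, as plt.hist does.

```python
from bisect import bisect_right

ages = [1, 1, 2, 3, 3, 5, 7, 8, 9, 10,
        10, 11, 11, 13, 13, 15, 16, 17, 18, 18,
        18, 19, 20, 21, 21, 23, 24, 24, 25, 25,
        25, 25, 26, 26, 26, 27, 27, 27, 27, 27,
        29, 30, 30, 31, 33, 34, 34, 34, 35, 36,
        36, 37, 37, 38, 38, 39, 40, 41, 41, 42,
        43, 44, 45, 45, 46, 47, 48, 48, 49, 50,
        51, 52, 53, 54, 55, 55, 56, 57, 58, 60,
        61, 63, 64, 65, 66, 68, 70, 71, 72, 74,
        75, 77, 81, 83, 84, 87, 89, 90, 90, 91]

# Bin edges as passed to plt.hist in this section
edges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 99]

counts = [0] * (len(edges) - 1)
for age in ages:
    # Position of the interval [edges[i], edges[i+1]) that holds this age;
    # min() folds a value equal to the final edge into the last interval
    i = min(bisect_right(edges, age) - 1, len(counts) - 1)
    counts[i] += 1

# Print one row per interval: lower edge, upper edge, frequency
for low, high, count in zip(edges, edges[1:], counts):
    print(f"{low:>2} to {high:<2} : {count}")
```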
plt.hist(x, bins=[0,10,20,30,40,50,60,70,80,90,99])
import matplotlib.pyplot as plt
x = [1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91
]
plt.hist(x, bins=[0,10,20,30,40,50,60,70,80,90,99],width=5.0)
plt.show()
cumulative : bool, optional
If True, a histogram is computed where each bin gives the
counts in that bin plus all bins for smaller values. The last bin gives
the total number of data points. If density is also True, the
histogram is normalized so that the last bin equals 1.
If cumulative evaluates to less than 0 (e.g., -1), the direction of
accumulation is reversed. In this case, if density is also True, the
histogram is normalized so that the first bin equals 1.
(Older Matplotlib releases also accepted normed in place of density;
that parameter has since been removed.)
Default is False.
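To make the accumulation concrete, here is a small standard-library sketch of what the parameter does to the per-bin counts. The three counts are hypothetical values chosen for illustration, not data from this document.

```python
from itertools import accumulate

# Hypothetical per-bin counts, for illustration only
counts = [2, 5, 3]

# cumulative=True: each bin holds its own count plus all bins for smaller values
forward = list(accumulate(counts))                # [2, 7, 10]

# cumulative=-1: accumulation runs from the largest values downwards
backward = list(accumulate(counts[::-1]))[::-1]   # [10, 8, 3]

# With density=True as well, the running totals are scaled by the overall
# total, so the last bin of the forward version equals 1
total = sum(counts)
forward_density = [c / total for c in forward]    # [0.2, 0.7, 1.0]

print(forward, backward, forward_density)
```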
import matplotlib.pyplot as plt
x = [1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91
]
plt.hist(x, bins=[0,10,20,30,40,50,60,70,80,90,99], histtype='step')
plt.show()
import matplotlib.pyplot as plt
x = [1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91
]
plt.hist(x, bins=[0,10,20,30,40,50,60,70,80,90,99], histtype='step',
         cumulative=True)
plt.show()
BOXPLOT
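A boxplot gives a summary of one or more numeric variables. The
line that divides the box into two parts shows the median
of the data. The ends of the box show the upper and lower
quartiles. The extreme lines show the highest and lowest
values excluding outliers. Note that a boxplot hides the number
of values behind each variable.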
Plotting a boxplot
import matplotlib.pyplot as plt
value1 = [72,76,24,40,57,62,75,78,31,32]
plt.boxplot(value1, vert=True, patch_artist=True)
plt.show()
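The quantities a boxplot summarises can be computed directly. The sketch below uses the standard library's statistics.quantiles with method='inclusive', which interpolates linearly between data points. That this agrees with the quartiles Matplotlib draws is an assumption here. The 1.5 × IQR outlier limits correspond to the default whis=1.5 of plt.boxplot.

```python
import statistics

# Same sample values as in the plotting example
value1 = [72, 76, 24, 40, 57, 62, 75, 78, 31, 32]

# Lower quartile, median and upper quartile (the box and the line inside it)
q1, median, q3 = statistics.quantiles(value1, n=4, method='inclusive')
print(q1, median, q3)  # 34.0 59.5 74.25

# Points further than 1.5 * IQR from the box are treated as outliers
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [v for v in value1 if v < lower_limit or v > upper_limit]

# The whiskers end at the most extreme values that are not outliers
inside = [v for v in value1 if lower_limit <= v <= upper_limit]
print(min(inside), max(inside), outliers)  # 24 78 []
```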