
Data Visualization with Python

Prof. Rajiv Kumar


IIM Kashipur
Data Visualization with Python

 Data visualization is the practice of placing data in a visual context so that patterns, trends and
correlations become easier to see. Python offers several mature plotting libraries, each with a
different balance of control and convenience.
 A few popular plotting libraries:
• Matplotlib: low level, provides lots of freedom
• Pandas Visualization: easy to use interface, built on Matplotlib
• Seaborn: high-level interface, great default styles
• ggplot: based on R’s ggplot2, uses Grammar of Graphics
• Plotly: can create interactive plots
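To make the "built on Matplotlib" point concrete, here is a minimal sketch in which pandas draws the plot and Matplotlib methods are then used on the result. The small table and its column names x and y are made up for illustration only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values, for illustration only
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 1, 3]})

# pandas draws through Matplotlib and returns a Matplotlib Axes object
ax = df.plot(x="x", y="y", kind="line")

# Because ax is a Matplotlib Axes, the usual Matplotlib methods still apply
ax.set_title("Plotted with pandas, styled with Matplotlib")
ax.set_ylabel("y")
```

The same Axes methods (set_title, set_xlabel, set_ylabel) are used in the Matplotlib examples that follow.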
matplotlib

 matplotlib is the most popular Python library for producing plots and other 2D data visualizations.
 It is well-suited for creating plots suitable for publication.
 It integrates well with IPython, thus providing a comfortable interactive environment for plotting and
exploring data.
 The plots are also interactive; you can zoom in on a section of the plot and pan around the plot using
the toolbar in the plot window.
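As a rough illustration of the publication point, the sketch below writes a figure as a 300-dpi PNG with savefig. The plotted values, the figure size and the dpi are arbitrary choices, and an in-memory buffer stands in for a real file name.

```python
import io
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))  # figure size in inches (arbitrary)
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")

# Save at 300 dpi; pass a file name such as "figure.png" instead of the
# buffer to write the image to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300, bbox_inches="tight")
png_bytes = buffer.getvalue()
```

Changing format to "pdf" or "svg" produces vector output, which journals often prefer.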
Importing Datasets

import pandas as pd
df=pd.read_csv('irisData.csv')
df
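After loading a file it usually helps to check its size and column names before plotting. The sketch below uses a three-row stand-in for irisData.csv, since the file itself is not part of these slides. Only the column names Sepal.Length and Sepal.Width appear in the slides; the other column names are assumptions.

```python
import io
import pandas as pd

# Stand-in for irisData.csv: three illustrative rows.
# 'Petal.Length', 'Petal.Width' and 'Species' are assumed column names.
sample_csv = io.StringIO(
    "Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "4.7,3.2,1.3,0.2,setosa\n"
)
df = pd.read_csv(sample_csv)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names, needed for df['Sepal.Length'] later
print(df.head())         # first rows of the table
```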
Scatter Plot

#Data from File


import pandas as pd
import matplotlib.pyplot as plt

df=pd.read_csv('irisData.csv')
fig, ax=plt.subplots()
ax.scatter(df['Sepal.Length'], df['Sepal.Width'])
ax.set_title('Iris Dataset')
ax.set_xlabel('sepal_length')
ax.set_ylabel('sepal_width')
Line Chart (1 of 2)

In Matplotlib we can create a line chart using the plot method.

Economy Data
Year    Unemployment_Rate
1920    9.8
1930    12
1940    8
1950    7.2
1960    6.9
1970    7
1980    6.5
1990    6.2
2000    5.5
2010    6.3

#Data in List

import matplotlib.pyplot as plt

Year = [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010]
Unemployment_Rate = [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]

fig, ax=plt.subplots()
ax.plot(Year, Unemployment_Rate, color="blue", marker=".")
ax.set_title("Unemployment Data", fontsize=18)
ax.set_xlabel("Year", fontsize=14)
ax.set_ylabel("Unemployment Rate", fontsize=14)
Line Chart (2 of 2)

Unemployment Data
Year    Unemployment_Rate
1920    9.8
1930    12
1940    8
1950    7.2
1960    6.9
1970    7
1980    6.5
1990    6.2
2000    5.5
2010    6.3

#Data in DataFrame

import pandas as pd
import matplotlib.pyplot as plt

df=pd.DataFrame([[1920, 9.8], [1930, 12], [1940, 8],
                 [1950, 7.2], [1960, 6.9], [1970, 7], [1980, 6.5],
                 [1990, 6.2], [2000, 5.5], [2010, 6.3]])

fig, ax=plt.subplots()
ax.plot(df[0], df[1], color="blue", marker=".")
ax.set_title("Unemployment Data", fontsize=18)
ax.set_xlabel("Year", fontsize=14)
ax.set_ylabel("Unemployment Rate", fontsize=14)
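The DataFrame above has no column names, so its columns are addressed as 0 and 1. A possible variant, sketched here, builds the same data from a dictionary so the columns can be referred to by name. The names Year and Unemployment_Rate are taken from the table header.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same data as the table, but with named columns
df = pd.DataFrame({
    "Year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
    "Unemployment_Rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
})

fig, ax = plt.subplots()
ax.plot(df["Year"], df["Unemployment_Rate"], color="blue", marker=".")
ax.set_title("Unemployment Data", fontsize=18)
ax.set_xlabel("Year", fontsize=14)
ax.set_ylabel("Unemployment Rate", fontsize=14)
```

Named columns make the plotting call easier to read, and they also work with pandas' own df.plot interface.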
Bar Chart (1 of 2)

In Matplotlib we can create a bar chart using the bar method.

#Data in list

import matplotlib.pyplot as plt


x_data=[1,2,3,4,5]
y_data=[5,6,7,8,5]
fig, ax = plt.subplots()
ax.bar(x_data, y_data)
ax.set_title('Bar Plot')
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
Bar Chart (2 of 2)

# Data in DataFrame

import pandas as pd
import matplotlib.pyplot as plt
data_df=pd.DataFrame([[1,5], [2,6], [3,7], [4, 8], [5,5]])
fig, ax = plt.subplots()
ax.bar(data_df[0], data_df[1])
ax.set_title('Bar Plot')
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
Histogram

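In Matplotlib we can create a histogram using the hist method. It groups the values into bins
(10 by default) and draws one bar per bin, whose height is the number of values in that bin. If
we pass it discrete values, such as the points column from the wine-review dataset, the bars show
how often values in each range occur.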

import pandas as pd
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.hist([1, 1, 1, 3, 3, 4, 5, 5, 5, 5, 5])
ax.set_title('Histogram')
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
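To see the counts behind the bars, the sketch below reuses the same list but passes explicit bin edges, so that each whole number from 1 to 5 gets its own bin. The bin edges and the edgecolor setting are choices made for this illustration, not part of the original example.

```python
import matplotlib.pyplot as plt

values = [1, 1, 1, 3, 3, 4, 5, 5, 5, 5, 5]

# One bin per whole number: 0.5-1.5 holds the 1s, 1.5-2.5 the 2s, and so on
bin_edges = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]

fig, ax = plt.subplots()
# hist returns the bin counts, the bin edges and the drawn bar objects
counts, edges, patches = ax.hist(values, bins=bin_edges, edgecolor="black")
ax.set_title("Histogram")

print([int(c) for c in counts])  # [3, 0, 2, 1, 5]
```

The value 2 never occurs in the list, so its bin has a count of 0 and no visible bar.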
Pie Chart (1 of 2)

In Matplotlib we can create a pie chart using the pie method.

Fruits    Quantity
Apple     25
Banana    40
Cherry    15
Dates     10

#Data in List
import matplotlib.pyplot as plt
fruits = ["Apples", "Bananas", "Cherries", "Dates"]
weight=[25,40,15,10]
fig, ax=plt.subplots()
ax.pie(weight, labels=fruits,autopct="%0.2f%%", explode=[0.0,
0.2, 0.0, 0.0])
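The autopct format string controls the percentage printed on each wedge. The sketch below reproduces that arithmetic by hand for the fruit quantities, as an approximation of what pie does internally: each value is divided by the total and formatted with "%0.2f%%".

```python
fruits = ["Apples", "Bananas", "Cherries", "Dates"]
weight = [25, 40, 15, 10]

total = sum(weight)  # 90

# Same format string as autopct: two decimals followed by a literal % sign
percent_labels = ["%0.2f%%" % (100 * w / total) for w in weight]

for name, label in zip(fruits, percent_labels):
    print(name, label)
# Apples 27.78%
# Bananas 44.44%
# Cherries 16.67%
# Dates 11.11%
```

The explode list has one entry per wedge; the value 0.2 for Bananas pulls that wedge away from the centre by 0.2 of the radius.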
Pie Chart (2 of 2)

Fruits    Quantity
Apple     25
Banana    40
Cherry    15
Dates     10

#Data in DataFrame
import pandas as pd
import matplotlib.pyplot as plt
fig, ax=plt.subplots()
fruit_df=pd.DataFrame([['Apple', 25],
['Banana', 40],
['Cherry', 15],
['Dates', 10]])
ax.pie(fruit_df[1], labels=fruit_df[0], autopct="%0.2f%%",
explode=[0.0, 0.2, 0.0, 0.0])
Readings

For more details, see the link below:


https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.plot.html
