Advanced Visualization For Data Scientists With Matplotlib

This document discusses advanced data visualization techniques for data scientists using Matplotlib. It begins with an overview and then demonstrates basic Matplotlib plots like line plots and bar plots using a Vancouver property tax dataset. It plots number of properties built over time in a line plot and plots the same values in a bar plot. The document then discusses more advanced Matplotlib features like 3D plots and interactive widgets.

1/4/2019 Advanced Visualization for Data Scientists with Matplotlib

Advanced Visualization for Data Scientists with Matplotlib
Contents: Basic plots, 3D plots and widgets

Veekesh Dhununjoy

Mar 13 · 7 min read

A picture is worth a thousand words, but a good visualization is worth millions.

Visualization plays a fundamental role in communicating results in
many fields in today’s world. Without proper visualizations, it is very
hard to reveal findings, understand complex relationships among
variables and describe trends in the data.

In this blog post, we’ll start with basic Matplotlib plots and then
drill down into some very useful advanced visualization techniques,
such as the mplot3d toolkit (for generating 3D plots) and widgets.

The Vancouver property tax report dataset has been used to explore
different types of plots in the Matplotlib library. The dataset contains
information on properties from BC Assessment (BCA) and City sources
including Property ID, Year Built, Zone Category, Current Land Value, etc.

A link to the code is given at the bottom of this post.

. . .


Matplotlib Basic Plots

Frequently used commands in the given examples:

plt.figure(): To create a new figure
plt.plot(): Plot y versus x as lines and/or markers
plt.xlabel(): Set the label for the x-axis
plt.ylabel(): Set the label for the y-axis
plt.title(): Set a title for the axes
plt.grid(): Configure the grid lines
plt.legend(): Place a legend on the axes
plt.savefig(): To save the current figure on the disk
plt.show(): Display a figure
plt.clf(): Clear the current figure (useful to plot multiple figures in the same code)

1. Line Plot

A line plot is a basic chart that displays information as a series of data
points called markers connected by straight line segments.

# Line plot.

# Importing matplotlib to plot the graphs.
import matplotlib.pyplot as plt

# Importing pandas for using pandas dataframes.
import pandas as pd

# Reading the input file.
df = pd.read_csv("property_tax_report_2018.csv")

# Removing the null values in PROPERTY_POSTAL_CODE.
df = df[(df['PROPERTY_POSTAL_CODE'].notnull())]

# Grouping by YEAR_BUILT and aggregating based on PID to
df = df[['PID', 'YEAR_BUILT']].groupby('YEAR_BUILT', as_

# Filtering YEAR_BUILT and keeping only the values betwe
df = df[(df['YEAR_BUILT'] >= 1900) & (df['YEAR_BUILT'] <

# X-axis: YEAR_BUILT
x = df['YEAR_BUILT']

# Y-axis: Number of properties built.
y = df['No_of_properties_built']

# Change the size of the figure (in inches).
plt.figure(figsize=(17,6))

# Plotting the graph using x and y with 'dodgerblue' col
# Different labels can be given to different lines in th
# Linewidth determines the width of the line.
plt.plot(x, y, 'dodgerblue', label = 'Number of properti
# plt.plot(x2, y2, 'red', label = 'Line 2', linewidth =

# X-axis label.
plt.xlabel('YEAR', fontsize = 16)

The above code snippet can be used to create a line graph. Here, a pandas
DataFrame is used to perform basic data manipulations. After
reading and processing the input dataset, plt.plot() is used to plot the
line graph with the year on the x-axis and the number of properties built on
the y-axis.
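
The listing above is cut off at the right margin, so here is a minimal, self-contained sketch of the same steps (group by year, filter to 1900-2018, plot). The random DataFrame is only a stand-in for the property tax CSV, and the line width, title and y-axis label are assumptions rather than the original values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the property tax report (assumption, not the real data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "PID": np.arange(5000),
    "YEAR_BUILT": rng.integers(1890, 2019, size=5000),
})

# Count the properties built in each year.
counts = (
    df.groupby("YEAR_BUILT", as_index=False)["PID"]
      .count()
      .rename(columns={"PID": "No_of_properties_built"})
)

# Keep only the years between 1900 and 2018.
counts = counts[(counts["YEAR_BUILT"] >= 1900) & (counts["YEAR_BUILT"] <= 2018)]

fig = plt.figure(figsize=(17, 6))
(line,) = plt.plot(counts["YEAR_BUILT"], counts["No_of_properties_built"],
                   color="dodgerblue", label="Number of properties built",
                   linewidth=1)
plt.xlabel("YEAR", fontsize=16)
plt.ylabel("Number of properties built", fontsize=16)
plt.title("Number of properties built by year", fontsize=16)
plt.grid(True)
plt.legend()
# Call plt.savefig("line_plot.png") or plt.show() here to render the figure.
```

Swapping the synthetic frame for pd.read_csv on the real file should leave the rest of the sketch unchanged, provided the column names match.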

2. Bar Plot

A bar graph displays categorical data with rectangular bars of heights or
lengths proportional to the values which they represent.

# Bar plot.

# Importing matplotlib to plot the graphs.
import matplotlib.pyplot as plt

# Importing pandas for using pandas dataframes.
import pandas as pd

# Reading the input file.
df = pd.read_csv("property_tax_report_2018.csv")

# Removing the null values in PROPERTY_POSTAL_CODE.
df = df[(df['PROPERTY_POSTAL_CODE'].notnull())]

# Grouping by YEAR_BUILT and aggregating based on PID to
df = df[['PID', 'YEAR_BUILT']].groupby('YEAR_BUILT', as_

# Filtering YEAR_BUILT and keeping only the values betwe
df = df[(df['YEAR_BUILT'] >= 1900) & (df['YEAR_BUILT'] <

# X-axis: YEAR_BUILT
x = df['YEAR_BUILT']

# Y-axis: Number of properties built.
y = df['No_of_properties_built']

# Change the size of the figure (in inches).
plt.figure(figsize=(17,6))

# Plotting the graph using x and y with 'dodgerblue' col
# Different labels can be given to different bar plots i
# Width determines the width of the bars.
plt.bar(x, y, label = 'Number of properties built', colo
# plt.bar(x2, y2, label = 'Bar 2', color = 'red', width

# X-axis label.
plt.xlabel('YEAR', fontsize = 16)

The above code snippet can be used to create a Bar graph.
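
Since the listing is truncated, the following is a small runnable sketch of the plt.bar() call on its own. The per-year counts are randomly generated placeholders, and the bar width of 0.5 is an assumed value.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up yearly counts standing in for the aggregated dataset (assumption).
rng = np.random.default_rng(1)
years = np.arange(1900, 2019)                   # 1900..2018 inclusive
built = rng.integers(50, 500, size=years.size)  # one count per year

fig = plt.figure(figsize=(17, 6))

# One bar per year; width is given in x-axis units (years).
bars = plt.bar(years, built, label="Number of properties built",
               color="dodgerblue", width=0.5)

plt.xlabel("YEAR", fontsize=16)
plt.ylabel("Number of properties built", fontsize=16)
plt.legend()
```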

3. Histogram

A histogram is an accurate representation of the distribution of
numerical data. It is an estimate of the probability distribution of a
continuous variable.

# Histogram

# Importing matplotlib to plot the graphs.
import matplotlib.pyplot as plt

# Importing pandas for using pandas dataframes.
import pandas as pd

# Reading the input file.
df = pd.read_csv("property_tax_report_2018.csv")

# Removing the null values in PROPERTY_POSTAL_CODE.
df = df[(df['PROPERTY_POSTAL_CODE'].notnull())]

# Grouping by YEAR_BUILT and aggregating based on PID to
df = df[['PID', 'YEAR_BUILT']].groupby('YEAR_BUILT', as_

# Filtering YEAR_BUILT and keeping only the values betwe
df = df[(df['YEAR_BUILT'] >= 1900) & (df['YEAR_BUILT'] <

# Change the size of the figure (in inches).
plt.figure(figsize=(17,6))

# X-axis: Number of properties built from 1900 to 2018.
# Y-axis: Frequency.
plt.hist(df['No_of_properties_built'],
         bins = 50,
         histtype='bar',
         rwidth = 1.0,
         color = 'dodgerblue',
         alpha = 0.8
        )

# X-axis label.
plt.xlabel('Number of properties built from 1900 to 2018

The above code snippet can be used to create a Histogram.
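
As a hedged illustration of what plt.hist() does with these arguments, the sketch below bins a made-up array of yearly counts into 50 bins. The x-axis is the number of properties built in a year and the y-axis is how many years fall into each bin; the data and the axis labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up "properties built per year" values, one per year from 1900 to 2018.
rng = np.random.default_rng(2)
built = rng.integers(0, 3000, size=119)

fig = plt.figure(figsize=(17, 6))

# n: number of years in each bin; bins: the 51 bin edges; patches: the drawn bars.
n, bins, patches = plt.hist(built,
                            bins=50,
                            histtype="bar",
                            rwidth=1.0,
                            color="dodgerblue",
                            alpha=0.8)

plt.xlabel("Number of properties built in a year", fontsize=16)
plt.ylabel("Frequency", fontsize=16)
```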

4. Pie Chart

A pie chart is a circular statistical graphic which is divided into slices to
illustrate numerical proportions. In a pie chart, the arc length of each
slice is proportional to the quantity it represents.

# Pie-chart.

# Importing matplotlib to plot the graphs.
import matplotlib.pyplot as plt

# Importing pandas for using pandas dataframes.
import pandas as pd

# Reading the input file.
df = pd.read_csv("property_tax_report_2018.csv")

# Filtering out the null values in ZONE_CATEGORY
df = df[df['ZONE_CATEGORY'].notnull()]

# Grouping by ZONE_CATEGORY and aggregating based on PID
df_zone_properties = df.groupby('ZONE_CATEGORY', as_inde

# Counting the total number of properties.
total_properties = df_zone_properties['No_of_properties'

# Calculating the percentage share of each ZONE for the
df_zone_properties['percentage_of_properties'] = ((df_zo

# Finding the ZONES with the top-5 percentage share in t
df_top_10_zone_percentage = df_zone_properties.nlargest(

# Change the size of the figure (in inches).
plt.figure(figsize=(8,6))

# Slices: percentage_of_properties.
slices = df_top_10_zone_percentage['percentage_of_proper
# Categories: ZONE_CATEGORY.
categories = df_top_10_zone_percentage['ZONE_CATEGORY']
# For different colors: [Link]
cols = ['purple', 'red', 'green', 'orange', 'dodgerblue'

# Plotting the pie-chart.
plt.pie(slices,

The above code snippet can be used to create a Pie chart.
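
The listing stops at the plt.pie() call, so here is a minimal sketch of the whole flow under stated assumptions: the zone shares are randomly generated, "Zone F" and "Zone G" are hypothetical category names added to make the top-5 selection meaningful, and the startangle and autopct arguments are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset; "Zone F" and "Zone G" are hypothetical.
rng = np.random.default_rng(3)
zones = ["One Family Dwelling", "Multiple Family Dwelling", "Commercial",
         "Industrial", "Historic Area", "Zone F", "Zone G"]
df = pd.DataFrame({
    "PID": np.arange(2000),
    "ZONE_CATEGORY": rng.choice(zones, size=2000,
                                p=[0.4, 0.2, 0.15, 0.1, 0.08, 0.05, 0.02]),
})

# Number of properties per zone category.
zone_counts = (
    df.groupby("ZONE_CATEGORY", as_index=False)["PID"]
      .count()
      .rename(columns={"PID": "No_of_properties"})
)

# Percentage share of each zone category.
total = zone_counts["No_of_properties"].sum()
zone_counts["percentage_of_properties"] = zone_counts["No_of_properties"] / total * 100

# Keep the five categories with the largest share.
top5 = zone_counts.nlargest(5, "percentage_of_properties")

fig = plt.figure(figsize=(8, 6))
wedges, texts, autotexts = plt.pie(
    top5["percentage_of_properties"].tolist(),
    labels=top5["ZONE_CATEGORY"].tolist(),
    colors=["purple", "red", "green", "orange", "dodgerblue"],
    startangle=90,
    autopct="%1.1f%%",  # print each slice's share of the plotted total
)
plt.title("Top 5 zone categories by share of properties")
```

Note that plt.pie() normalizes the five slices to fill the circle, so the printed percentages are shares of the top five, not of all properties.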

5. Scatter Plot

# Scatter plot.

# Importing matplotlib to plot the graphs.
import matplotlib.pyplot as plt

# Importing pandas for using pandas dataframes.
import pandas as pd

# Reading the input file.
df = pd.read_csv("property_tax_report_2018.csv")

# Removing the null values in PROPERTY_POSTAL_CODE.
df = df[(df['PROPERTY_POSTAL_CODE'].notnull())]

# Grouping by YEAR_BUILT and aggregating based on PID to
df = df[['PID', 'YEAR_BUILT']].groupby('YEAR_BUILT', as_

# Filtering YEAR_BUILT and keeping only the values betwe
df = df[(df['YEAR_BUILT'] >= 1900) & (df['YEAR_BUILT'] <

# X-axis: YEAR_BUILT
x = df['YEAR_BUILT']

# Y-axis: Number of properties built.
y = df['No_of_properties_built']

# Change the size of the figure (in inches).
plt.figure(figsize=(17,6))

# Plotting the scatter plot.
# For different types of markers: [Link]
plt.scatter(x, y, label = 'Number of properties built',s
            alpha = 0.8, marker = '.', edgecolors='black

# X-axis label.
plt.xlabel('YEAR', fontsize = 16)

# Y-axis label

The above code snippet can be used to create a Scatter plot.
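
A short runnable sketch of the plt.scatter() call follows. The yearly counts are made up, and the marker size (s=200) and fill colour are assumed values, since those arguments are cut off in the listing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up yearly counts standing in for the aggregated dataset (assumption).
rng = np.random.default_rng(4)
years = np.arange(1900, 2019)
built = rng.integers(50, 500, size=years.size)

fig = plt.figure(figsize=(17, 6))

# s is the marker area in points squared; edgecolors outlines each marker.
points = plt.scatter(years, built, label="Number of properties built",
                     s=200, color="red", alpha=0.8, marker=".",
                     edgecolors="black")

plt.xlabel("YEAR", fontsize=16)
plt.ylabel("Number of properties built", fontsize=16)
plt.legend()
```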

6. Working with Images

Link to download the Lenna test image. (Source: Wikipedia)

# Reading, displaying and saving an image.

# Importing matplotlib pyplot and image.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Reading the image from the disk.
image = mpimg.imread('Lenna_test_image.png')

# Displaying the image.
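
The listing ends before the display and save steps. As a sketch of the full round trip (save, read, display), the example below uses a generated gradient instead of the test image, so the file name and the image contents are assumptions.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

# Synthetic RGB gradient standing in for the test image (assumption).
height, width = 64, 96
image = np.zeros((height, width, 3))
image[..., 0] = np.linspace(0, 1, width)            # red ramps left to right
image[..., 1] = np.linspace(0, 1, height)[:, None]  # green ramps top to bottom

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gradient.png")
    mpimg.imsave(path, image)    # saving the array as a PNG file
    loaded = mpimg.imread(path)  # reading it back as a NumPy array

# Displaying the image without axis ticks.
fig = plt.figure()
plt.imshow(loaded)
plt.axis("off")
```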

. . .

3D Plots using Matplotlib

3D plots play an important role in visualizing complex data in three or
more dimensions.

1. 3D Scatter Plot

'''
==============
3D scatterplot
==============

Demonstration of a basic scatterplot in 3D.
'''

# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Get the Property Tax Report dataset
# Dataset link: [Link]
data = pd.read_csv('property_tax_report_2018.csv')

# Extract the columns and do some transformations
yearWiseAgg = data[['PID','CURRENT_LAND_VALUE']].groupby
yearWiseAgg = yearWiseAgg.reset_index().dropna()

# Define colors as red, green, blue
colors = ['r', 'g', 'b']

# Get only records which have more than 2000 properties
morethan2k = yearWiseAgg.query('PID>2000')

# Get shape of dataframe
dflen = morethan2k.shape[0]

# Fetch land values from dataframe
lanvalues = (morethan2k['CURRENT_LAND_VALUE']/2e6).tolis

# Create a list of colors for each point corresponding t
c_list = []
for i,value in enumerate(lanvalues):
    if value>0 and value<1900:

3D scatter plots are used to plot data points on three axes in an attempt
to show the relationship between three variables. Each row in the data
table is represented by a marker whose position depends on its values
in the columns set on the X, Y, and Z axes.
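
To make the idea concrete, here is a small sketch in which each synthetic row becomes one marker positioned by three columns and coloured by a value band. The data, the axis labels and the band thresholds (1000 and 2000) are all illustrative assumptions, not the values from the original listing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic rows: one (year, property count, land value) triple per point.
rng = np.random.default_rng(5)
n = 60
years = rng.integers(1900, 2019, size=n)
counts = rng.integers(2000, 6000, size=n)
land_values = rng.uniform(0, 3000, size=n)


def band_colour(value):
    """Map a land value to red, green or blue (arbitrary thresholds)."""
    if value < 1000:
        return "r"
    if value < 2000:
        return "g"
    return "b"


# One colour per point, in the same order as the data.
c_list = [band_colour(v) for v in land_values]

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(years, counts, land_values, c=c_list, marker="o")
ax.set_xlabel("Year built")
ax.set_ylabel("Number of properties")
ax.set_zlabel("Land value")
```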

2. 3D Line Plot

'''
==============
3D lineplot
==============

Demonstration of a basic lineplot in 3D.
'''

# Import libraries
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt

# Set the legend font size to 10
mpl.rcParams['legend.fontsize'] = 10

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Create data point to plot

3D line plots are useful when one variable is constantly increasing or
decreasing. That variable can be placed on the Z-axis, while the change
in the other two variables can be observed on the X-axis and Y-axis with
respect to the Z-axis. For example, with time series data (such as
planetary motion), time can be placed on the Z-axis and the change in
the other two variables can be read from the visualization.
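
A minimal sketch of that idea: time runs along the Z-axis while a point moves around a circle in the X-Y plane, which draws a helix. The circular motion is just an assumed stand-in for real time series data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# "Time" is strictly increasing, so it goes on the Z-axis.
t = np.linspace(0, 4 * np.pi, 200)

# Position of a point moving on a unit circle (illustrative data).
x = np.cos(t)
y = np.sin(t)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
(curve,) = ax.plot(x, y, t, label="position over time")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("time")
ax.legend()
```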

3. 3D Plots as Subplots

'''
====================
3D plots as subplots
====================

Demonstrate including 3D plots as subplots.
'''

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D, get_test
from matplotlib import cm
import numpy as np


# set up a figure twice as wide as it is tall
fig = plt.figure(figsize=plt.figaspect(0.5))

#===============
# First subplot
#===============
# set up the axes for the first plot
ax = fig.add_subplot(1, 2, 1, projection='3d')

# plot a 3D surface like in the example mplot3d/surface3
# Get equally spaced numbers with interval of 0.25 from
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
# Convert it into meshgrid for plotting purpose using x
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

The above code snippet can be used to create multiple 3D plots as
subplots in the same figure. Both the plots can be analyzed
independently.
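
Because the listing breaks off after the first data set, here is a compact sketch of the two-subplot layout: a surface on the left and a wireframe on the right. The wireframe with Matplotlib's bundled test data as the second plot is an assumption about what the second subplot shows.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm
from mpl_toolkits.mplot3d.axes3d import get_test_data

# Figure twice as wide as it is tall.
fig = plt.figure(figsize=plt.figaspect(0.5))

# First subplot: a coloured surface of sin(sqrt(x^2 + y^2)).
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
Z = np.sin(np.sqrt(X**2 + Y**2))
surf = ax1.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
                        linewidth=0, antialiased=False)
ax1.set_zlim(-1.01, 1.01)

# Second subplot: a wireframe of Matplotlib's sample data.
ax2 = fig.add_subplot(1, 2, 2, projection="3d")
X2, Y2, Z2 = get_test_data(0.05)
wire = ax2.plot_wireframe(X2, Y2, Z2, rstride=10, cstride=10)
```

Each axes keeps its own view angle and limits, which is what allows the two plots to be inspected independently.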

4. Contour Plot

'''
==============
Contour Plots
==============
Plot a contour plot that shows intensity
'''

# Import libraries
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Get test data

The above code snippet can be used to create contour plots. Contour
plots can be used to represent a 3D surface in a 2D format. Given a
value for the Z-axis, lines are drawn connecting the (x, y) coordinates
where that particular z value occurs. Contour plots are generally used
for continuous variables rather than categorical data.
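
To illustrate how a contour line ties a z value to (x, y) coordinates, the sketch below draws contours of z = x² + y² at three chosen levels on an ordinary 2D axes. Each level z = c appears as a circle of radius √c. The function and the levels are assumptions picked to make the result easy to check by eye.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Sample z = x^2 + y^2 on a regular grid.
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = X**2 + Y**2

# Each level is one z value; its contour joins all (x, y) with that z.
levels = [1.0, 4.0, 6.25]  # circles of radius 1, 2 and 2.5

fig, ax = plt.subplots()
cs = ax.contour(X, Y, Z, levels=levels)
ax.clabel(cs, inline=True, fontsize=9)  # write the z value on each line
ax.set_aspect("equal")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Replacing ax.contour with ax.contourf would fill the regions between the levels instead of drawing only the lines.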

5. Contour Plot with Intensity

'''
==============
Contour Plots
==============
Plot a contour plot that shows intensity
'''
# Import libraries
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Get test data

The above code snippet can be used to create filled contour plots.

6. Surface Plot

"""
========================
Create 3d surface plots
========================
Plot a contoured surface plot
"""

# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFo
import numpy as np

# Create figures object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Make data.
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

The above code snippet can be used to create surface plots, which are
used for plotting 3D data. They show a functional relationship between
a designated dependent variable (Z in the code above) and two
independent variables (X and Y) rather than showing the individual data
points. A practical application for the above plot would be to visualize
how the Gradient Descent algorithm converges.
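
The listing ends right after the data is built. A hedged sketch of the remaining steps follows: it plots Z as a function of X and Y with plot_surface() and formats the z-axis. The colormap, tick formatting and colorbar settings are assumed, typical choices rather than the original ones.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm
from matplotlib.ticker import FormatStrFormatter, LinearLocator

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# Independent variables X and Y on a grid; dependent variable Z = sin(r).
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

# Draw the surface, coloured by height.
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm, linewidth=0,
                       antialiased=False)

# Ten evenly spaced z ticks with two decimals.
ax.set_zlim(-1.01, 1.01)
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter("%.02f"))

# Colour bar mapping colours back to Z values.
fig.colorbar(surf, shrink=0.5, aspect=5)
```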

7. Triangular Surface Plot

'''
======================
Triangular 3D surfaces
======================

Plot a 3D surface with a triangular mesh.
'''
# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np

# Create figures object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Set parameters
n_radii = 8
n_angles = 36

# Make radii and angles spaces (radius r=0 omitted to el
radii = np.linspace(0.125, 1.0, n_radii)
angles = np.linspace(0, 2*np.pi, n_angles, endpoint=Fals

The above code snippet can be used to create a triangular surface plot.

8. Polygon Plot

'''
==============
Polygon Plots
==============
Plot a polygon plot
'''
# Import libraries
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import PolyCollection
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import numpy as np

# Fixing random state for reproducibility
np.random.seed(19680801)

def cc(arg):
    '''
    Shorthand to convert 'named' colors to rgba format a
    '''
    return mcolors.to_rgba(arg, alpha=0.6)


def polygon_under_graph(xlist, ylist):
    '''
    Construct the vertex list which defines the polygon
    the (xlist, ylist) line graph.  Assumes the xs are i
    '''
    return [(xlist[0], 0.), *zip(xlist, ylist), (xlist[-

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Make verts a list, verts[i] will be a list of (x,y) pa
verts = []

The above code snippet can be used to create Polygon Plots.

9. Text Annotations in 3D

'''
======================
Text annotations in 3D
======================

Demonstrates the placement of text annotations on a 3D p

Functionality shown:
- Using the text function with three types of 'zdir' val
  an axis name (ex. 'x'), or a direction tuple (ex. (1,
- Using the text function with the color keyword.
- Using the text2D function to place text on a fixed pos
'''
# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Demo 1: zdir
zdirs = (None, 'x', 'y', 'z', (1, 1, 0), (1, 1, 1))
xs = (1, 4, 4, 9, 4, 1)
ys = (2, 5, 8, 10, 1, 2)
zs = (10, 3, 8, 9, 1, 8)

for zdir, x, y, z in zip(zdirs, xs, ys, zs):
    label = '(%d, %d, %d), dir=%s' % (x, y, z, zdir)
    ax.text(x, y, z, label, zdir)

The above code snippet can be used to create text annotations in 3D
plots. It is very useful when creating 3D plots as changing the angles of
the plot does not distort the readability of the text.
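
The three kinds of annotation named in the listing's docstring can be sketched in a few lines. The zdir loop mirrors the listing; the coordinates and strings passed to the colour and text2D calls, and the axis limits, are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# 1. Text anchored at a 3D point, oriented by zdir:
#    None, an axis name, or a direction tuple.
zdirs = (None, "x", "y", "z", (1, 1, 0), (1, 1, 1))
xs = (1, 4, 4, 9, 4, 1)
ys = (2, 5, 8, 10, 1, 2)
zs = (10, 3, 8, 9, 1, 8)
for zdir, x, y, z in zip(zdirs, xs, ys, zs):
    label = "(%d, %d, %d), dir=%s" % (x, y, z, zdir)
    ax.text(x, y, z, label, zdir)

# 2. Text with the color keyword.
ax.text(9, 0, 0, "red", color="red")

# 3. Text pinned to a fixed spot of the axes rectangle (axes coordinates),
#    so it stays put when the 3D view is rotated.
ax.text2D(0.05, 0.95, "2D Text", transform=ax.transAxes)

# Make sure every annotation falls inside the visible box.
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.set_zlim(0, 10)
```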

10. 2D Data in 3D Plot

"""
=======================
Plot 2D data on 3D plot
=======================

Demonstrates using ax.plot's zdir keyword to plot 2D dat
selective axes of a 3D plot.
"""

# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Get the Property Tax Report dataset
# Dataset link: [Link]
data = pd.read_csv('property_tax_report_2018.csv')

# Extract the columns and do some transformations
yearWiseAgg = data[['PID','CURRENT_LAND_VALUE']].groupby
yearWiseAgg = yearWiseAgg.reset_index().dropna()

# Where zs takes either an array of the same length as x
# and zdir takes ‘x’, ‘y’ or ‘z’ as direction to use as
ax.plot(yearWiseAgg['PID'],yearWiseAgg['YEAR_BUILT'], zs

# Define colors as red, green, blue
colors = ['r', 'g', 'b']

# Get only records which have more than 2000 properties
morethan2k = yearWiseAgg.query('PID>2000')

# Get shape of dataframe
dflen = morethan2k.shape[0]

# Fetch land values from dataframe
lanvalues = (morethan2k['CURRENT_LAND_VALUE']/2e6).tolis

# Create a list of colors for each point corresponding t
c_list = []
for i,value in enumerate(lanvalues):

The above code snippet can be used to plot 2D data in a 3D plot. This is
very useful because it lets you compare multiple 2D plots in a single 3D view.
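
As a sketch of how zs and zdir place 2D data inside a 3D axes, the example below draws a sine curve flat on the z = 0 plane and a set of points flat on the y = 0 plane. Both data sets are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# A 2D curve drawn in the x-y plane: zdir='z' with zs=0 fixes z at 0.
x = np.linspace(0, 1, 100)
y = np.sin(x * 2 * np.pi) / 2 + 0.5
ax.plot(x, y, zs=0, zdir="z", label="curve in (x, y)")

# 2D points drawn in the x-z plane: zdir='y' with zs=0 fixes y at 0,
# so the second coordinate passed in is shown along the z-axis.
rng = np.random.default_rng(6)
px = rng.random(30)
pz = rng.random(30)
ax.scatter(px, pz, zs=0, zdir="y", c="r", label="points in (x, z)")

ax.legend()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_zlim(0, 1)
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
```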

11. 2D Bar Plot in 3D

"""
========================================
Create 2D bar graphs in different planes
========================================

Demonstrates making a 3D plot which has 2D bar graphs pr
planes y=0, y=1, etc.
"""

# Import libraries
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

# Create figure object
fig = plt.figure()

# Create a 3D axes (recent Matplotlib no longer accepts projection= in fig.gca()).
ax = fig.add_subplot(projection='3d')

# Get the Property Tax Report dataset
# Dataset link: [Link]
data = pd.read_csv('property_tax_report_2018.csv')

# Groupby Zone category and Year built to separate for e
newdata = data.groupby(['YEAR_BUILT','ZONE_CATEGORY']).a

# Create list of years that are found in all zones that
years = [1995,2000,2005,2010,2015,2018]

# Create list of Zone categories that we want to plot
categories = ['One Family Dwelling', 'Multiple Family Dw

# Plot bar plot for each category

. . .

The above code snippet can be used to create multiple 2D bar plots in a
single 3D space to compare and analyze the differences.
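
The plotting loop itself is missing from the listing, so here is a minimal sketch of one way it could look: one 2D bar chart per category, each drawn in its own y = k plane. The counts are random placeholders, and the choice of three categories, the colours and the bar width are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

years = [1995, 2000, 2005, 2010, 2015, 2018]
categories = ["One Family Dwelling", "Commercial", "Industrial"]
colors = ["r", "g", "b"]

rng = np.random.default_rng(7)
for plane, (category, color) in enumerate(zip(categories, colors)):
    # Made-up number of properties per year for this category.
    counts = rng.integers(10, 100, size=len(years))
    # zdir='y' with zs=plane draws this bar chart in the plane y = plane.
    ax.bar(years, counts, zs=plane, zdir="y", color=color, alpha=0.8,
           width=2.5)

# Label each plane with its category name.
ax.set_yticks(range(len(categories)))
ax.set_yticklabels(categories)
ax.set_xlabel("Year built")
ax.set_zlabel("Number of properties")
```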

Widgets in Matplotlib

So far we have been dealing with static plots where the user can only
visualize the charts or graphs without any interaction. However,
widgets provide this level of interactivity to the user for better
visualizing, filtering and comparing data.

1. Checkbox widget

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import CheckButtons
import pandas as pd

df = pd.read_csv("property_tax_report_2018.csv")

# filter properties built on or after 1900
df_valid_year_built = df.loc[df['YEAR_BUILT'] >= 1900]
# retrieve PID, YEAR_BUILT and ZONE_CATEGORY only
df1 = df_valid_year_built[['PID', 'YEAR_BUILT','ZONE_CAT
# create 3 dataframes for 3 different zone categories
df_A = df1.loc[df1['ZONE_CATEGORY'] == 'Industrial']
df_B = df1.loc[df1['ZONE_CATEGORY'] == 'Commercial']
df_C = df1.loc[df1['ZONE_CATEGORY'] == 'Historic Area']
# retrieve the PID and YEAR_BUILT fields only
df_A = df_A[['PID','YEAR_BUILT']]
df_B = df_B[['PID','YEAR_BUILT']]
df_C = df_C[['PID','YEAR_BUILT']]
# Count the number of properties group by YEAR_BUILT
df2A = df_A.groupby(['YEAR_BUILT']).count()
df2B = df_B.groupby(['YEAR_BUILT']).count()
df2C = df_C.groupby(['YEAR_BUILT']).count()

# create line plots for each zone category
fig, ax = plt.subplots()
l0, = ax.plot(df2A, lw=2, color='k', label='Industrial')
l1, = ax.plot(df2B, lw=2, color='r', label='Commercial')
l2, = ax.plot(df2C, lw=2, color='g', label='Historic Are
# Adjusting the space around the figure
plt.subplots_adjust(left=0.2)
# Adding title and labels
plt.title('Count of properties built by year')
plt.xlabel('Year Built')
plt.ylabel('Count of Properties Built')

#create a list for each zone category line plot

As you can see from the above graph, Matplotlib allows the user to
customize which graph to show with the help of checkboxes. This can
be particularly useful when there are many different categories making
comparisons difficult. Hence, widgets make it easier to isolate and
compare distinct graphs and reduce clutter.
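
The part of the listing that wires up the checkboxes is cut off. The sketch below shows one way to do it with CheckButtons: each box toggles the visibility of the line with the same label. The yearly counts are synthetic and the widget position is an assumed layout.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use an interactive one to click the boxes
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.widgets import CheckButtons

# Made-up yearly counts for three zone categories.
rng = np.random.default_rng(8)
years = np.arange(1900, 2019)
series = {
    "Industrial": rng.poisson(5, size=years.size),
    "Commercial": rng.poisson(12, size=years.size),
    "Historic Area": rng.poisson(2, size=years.size),
}

fig, ax = plt.subplots()
lines = {}
for (label, counts), color in zip(series.items(), ["k", "r", "g"]):
    (lines[label],) = ax.plot(years, counts, lw=2, color=color, label=label)
ax.set_title("Count of properties built by year")
ax.set_xlabel("Year Built")
ax.set_ylabel("Count of Properties Built")

# Leave room on the left and put the checkbox panel there.
plt.subplots_adjust(left=0.3)
rax = plt.axes([0.02, 0.4, 0.2, 0.2])
labels = list(lines)
check = CheckButtons(rax, labels, [lines[name].get_visible() for name in labels])


def toggle(label):
    """Show or hide the line whose checkbox was clicked."""
    line = lines[label]
    line.set_visible(not line.get_visible())
    fig.canvas.draw_idle()


check.on_clicked(toggle)
```

Keep a reference to the CheckButtons object (here `check`) for as long as the figure is open; otherwise the widget may be garbage collected and stop responding.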

2. Slider widget to control the visual properties of plots

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider, Button, RadioButt

# configure subplot
fig, ax = plt.subplots()
plt.subplots_adjust(left=0.25, bottom=0.25)
t = np.arange(0.0, 1.0, 0.001)

#set initial values of frequency and amplification
a0 = 5
f0 = 3
delta_f = 5.0
s = a0*np.cos(2*np.pi*f0*t)
l, = plt.plot(t, s, lw=2, color='red')

# plot cosine curve
plt.axis([0, 1, -10, 10])

#configure axes
axcolor = 'lightgoldenrodyellow'
axfreq = plt.axes([0.25, 0.1, 0.65, 0.03], facecolor=axc
axamp = plt.axes([0.25, 0.15, 0.65, 0.03], facecolor=axc

# add slider for Frequency and Amplification
sfreq = Slider(axfreq, 'Freq', 0.1, 30.0, valinit=f0, va
samp = Slider(axamp, 'Amp', 0.1, 10.0, valinit=a0)

# function to update the graph when frequency or amplifi
def update(val):
    # get current amp value
    amp = samp.val
    # get current freq value
    freq = sfreq.val
    # plot cosine curve with updated values of amp and f
    l.set_ydata(amp*np.cos(2*np.pi*freq*t))
    # redraw the figure
    fig.canvas.draw_idle()
# update slider frequency
sfreq.on_changed(update)
# update amp frequency
samp.on_changed(update)

The Matplotlib slider is very useful for visualizing variations of parameters in
graphs or mathematical equations. As you can see, the slider enables the
user to change the values of the variables/parameters and view the
change instantly.
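
For reference, here is a compact, self-contained version of the slider pattern: two Slider widgets share one callback that recomputes the curve from their current values. It follows the comments in the listing in using a cosine curve, which is an assumption, and it leaves out the step size since that argument is cut off in the listing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use an interactive one to drag the sliders
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.widgets import Slider

fig, ax = plt.subplots()
plt.subplots_adjust(left=0.25, bottom=0.25)  # leave room for the sliders

# Initial amplitude and frequency.
a0, f0 = 5, 3
t = np.arange(0.0, 1.0, 0.001)
(line,) = ax.plot(t, a0 * np.cos(2 * np.pi * f0 * t), lw=2, color="red")
ax.axis([0, 1, -10, 10])

# One small axes per slider, placed below the plot.
axfreq = plt.axes([0.25, 0.10, 0.65, 0.03])
axamp = plt.axes([0.25, 0.15, 0.65, 0.03])
sfreq = Slider(axfreq, "Freq", 0.1, 30.0, valinit=f0)
samp = Slider(axamp, "Amp", 0.1, 10.0, valinit=a0)


def update(val):
    """Redraw the curve using the current slider values."""
    line.set_ydata(samp.val * np.cos(2 * np.pi * sfreq.val * t))
    fig.canvas.draw_idle()


# Both sliders trigger the same callback.
sfreq.on_changed(update)
samp.on_changed(update)
```

Calling sfreq.set_val(6.0) from code has the same effect on the curve as dragging the slider, which is handy for scripted checks.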

. . .

Where to go from here?

If you are interested in exploring more interactive plots with modern
design aesthetics, we recommend checking out Dash by Plotly.

This is it, folks. I hope you find this post useful. The full code (Jupyter
Notebook and Python files) can be found here. Due to the limitations of
Jupyter Notebook, the interactive plots (3D and widget) do not work
properly. Hence, the 2D plots are provided in a Jupyter Notebook and
the 3D and widget plots are provided as .py files.

Feel free to leave your comments below.

Cheers!

Contributors:

Gaurav Prachchhak, Tommy Betz, Veekesh Dhununjoy, Mihir Gajjar.
