
Data Science

Visualization
Topics

• Today's overview:

• Introduction / Definition
• History
• Examples
• Workflow / Pipeline
• Software overview
• Hands-on exercises
• Resources

“Sci vis” versus “Info vis”

• Visualization: converting raw data to a form that is viewable and understandable to humans.

• Scientific visualization: specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from a simulation mesh or scanner).

*Adapted from The ParaView Tutorial, Moreland

Introduction to Information Visualization - Fall 2013


Information visualization

• Information visualization: concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., “abstract data”).



Pre-history

• Selected figures
– William Playfair (1821) – line, bar charts, etc.
– Charles Joseph Minard (1869) – Napoleon’s march, etc.
– Jacques Bertin (1967) – “semiology of graphics”
– John Tukey (1977) – “exploratory data analysis”
– Edward Tufte (1983) – statistical graphics standards/practices
• 1985 NSF Workshop on Scientific Visualization
• 1999: S. K. Card, et al. Readings in Information Visualization: Using Vision to Think



Examples

⚫ Network visualization (Vizster)



Examples
⚫ Geo data mapping

⚫ Demo



Examples
• Treemap

• Demo



Examples
• Circle chart

• Demo



Examples
⚫ Population “Trendalyzer”

⚫ Demo



Additional Examples
• NY Times words, words, numbers
• Visual Complexity (from book by Manuel Lima)
• 50 examples (from June 2009, somewhat dated)
• D3 Gallery



Visualization components
• Color
• Size
• Texture
• Proximity
• Annotation
• Interactivity
– Selection / Filtering
– Zoom
– Animation



Info vis workflow / pipeline

• Acquire
• Parse
• Filter
• Mine
• Represent
• Refine
• Interact

* Adapted from Fry, Visualizing Data


https://www.oreilly.com/library/view/visualizing-data/9780596514556/ch01.html

Info vis workflow / pipeline
• Acquire
Obtain the data, whether from a file on a disk or a source over a network.

[p. 7, Fry, Visualizing Data]


Info vis workflow / pipeline

• Parse
Provide some structure for the data’s meaning and order it into categories.

[p. 8, Fry, Visualizing Data]


Info vis workflow / pipeline

• Filter
Remove all but the data of interest.
• Mine
Apply methods from statistics or data mining as a way to discern patterns or place the data in mathematical context.

[p. 10, Fry, Visualizing Data]


Info vis workflow / pipeline
• Represent
Choose a basic visual model, such as a bar graph, list, or tree.

[p. 10, Fry, Visualizing Data]


Info vis workflow / pipeline
• Refine
Improve the basic representation to make it clearer and more visually engaging.

[p. 12, Fry, Visualizing Data]



Info vis workflow / pipeline
• Interact
Add methods for manipulating the data or controlling what features are visible.

• Demo
[p. 12, Fry, Visualizing Data]
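The first stages of the pipeline can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the inline CSV text, the function names (parse, filter_rows, mine, represent), and the text bar chart are all made up for this sketch and do not come from Fry's book. The Refine and Interact stages are left out.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Acquire: a small inline CSV stands in for a file or a network source (made-up data).
RAW = """species,petal_length
setosa,1.4
setosa,1.5
versicolor,4.5
versicolor,
virginica,6.0
virginica,5.5
"""

def parse(text):
    """Parse: give the raw text structure (one dict per row)."""
    return list(csv.DictReader(io.StringIO(text)))

def filter_rows(rows):
    """Filter: keep only the rows that actually carry a measurement."""
    return [r for r in rows if r["petal_length"]]

def mine(rows):
    """Mine: summarize each group by its mean petal length."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["species"]].append(float(r["petal_length"]))
    return {name: mean(values) for name, values in groups.items()}

def represent(summary, scale=4):
    """Represent: a text bar chart, `scale` characters per unit of length."""
    lines = []
    for name, value in summary.items():
        lines.append(f"{name:<12}{'#' * round(value * scale)} {value:.2f}")
    return "\n".join(lines)

summary = mine(filter_rows(parse(RAW)))
print(represent(summary))
```

Each stage is a separate function, so any one of them can be swapped out (for example, a real plotting call in place of represent) without touching the others.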



Iris Sample Data Set

• Data visualization techniques can also be illustrated with the Iris plant data set (more later).
– Can be obtained from the UCI Machine Learning Repository
http://www.ics.uci.edu/~mlearn/MLRepository.html
– Three flower types (classes):
• Setosa
• Virginica
• Versicolour
– Four (non-class) attributes
• Sepal width and length
• Petal width and length

2.1, 2.5, 1.4, 5.0 → Setosa (S)
1.0, 1.2, 3.0, 4.0 → Virginica (V)
1.0, 2.4, 5.0, 1.0 → Versicolour (R)

Visualization software

• Host language (C/C++/Java/Python) plus OpenGL


• Stat/math package with graphics
– R
– MATLAB
• Special-purpose info viz software
– Earth mapping, biological network visualization, etc.
• Browser-enabled graphics/info viz packages
– Google Charts
– Processing / Processing.js
– Java + Flash (becoming rarer)



Hands-on using Python
• Line chart;
A line chart represents continuous data points. It is effective when we want to understand how one variable depends on another.



Hands-on using Python
• Line chart;

import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset: total sepal width for each sepal length
df = sns.load_dataset("iris")
df = df.groupby('sepal_length')['sepal_width'].sum().to_frame().reset_index()

# Creating the line chart
plt.plot(df['sepal_length'], df['sepal_width'])

# Adding the aesthetics
plt.title('Chart title')
plt.xlabel('X label')
plt.ylabel('Y label')

# Show the chart
plt.show()



Hands-on using Python
• Bar chart;
A bar chart is used when we want to compare metric values across
different subgroups of the data. If we have a greater number of groups,
a bar chart is preferred over a column chart.



Hands-on using Python
• Bar chart;
import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset: total fare paid by each passenger type
df = sns.load_dataset('titanic')
df = df.groupby('who')['fare'].sum().to_frame().reset_index()

# Creating the (horizontal) bar chart
plt.barh(df['who'], df['fare'], color=['#F0F8FF', '#E6E6FA', '#B0E0E6'])

# Adding the aesthetics
plt.title('Bar Chart Who vs. Fare')
plt.xlabel('Total Paid Fare')
plt.ylabel('Who (Man, Woman, Child)')

# Show the bar chart
plt.show()


Hands-on using Python
• Bar plot;
barplot is the seaborn function for drawing bar charts. It differs slightly from the bar functions in matplotlib.



Hands-on using Python
• Bar plot;
import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset: total fare paid by each passenger type
df = sns.load_dataset('titanic')
df = df.groupby('who')['fare'].sum().to_frame().reset_index()

# Creating the bar plot
sns.barplot(x='fare', y='who', data=df, palette="Blues")

# Adding the aesthetics
plt.title('Bar Chart Who vs. Fare')
plt.xlabel('Total Paid Fare')
plt.ylabel('Who (Man, Woman, Child)')

# Show the bar plot
plt.show()


Hands-on using Python
• Column chart;
Column charts are mostly used when we need to compare a single
category of data between individual sub-items, for example, when
comparing revenue between regions.



Hands-on using Python
• Column chart;
import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset: total fare paid by each passenger type
df = sns.load_dataset('titanic')
df = df.groupby('who')['fare'].sum().to_frame().reset_index()

# Creating the column chart (vertical bars)
plt.bar(df['who'], df['fare'], color=['#F0F8FF', '#E6E6FA', '#B0E0E6'])

# Adding the aesthetics: the categories are on the x-axis, the fares on the y-axis
plt.title('Chart Who vs. Fare')
plt.xlabel('Who (Man, Woman, Child)')
plt.ylabel('Total Paid Fare')

# Show the column chart
plt.show()
Hands-on using Python
• Column chart;
Here the column chart compares the total fare across the different passenger types.


Hands-on using Python
• Grouped Column chart;
A grouped column chart is used when we want to compare the values in
certain groups and sub-groups (e.g., passenger type, ticket class).



Hands-on using Python
• Grouped Column chart;
import pandas as pd  # for dataframes
import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset
df = sns.load_dataset('titanic')

# Creating a grouped column chart: mean fare per passenger type and ticket class
df_pivot = pd.pivot_table(df, values="fare", index="who", columns="class", aggfunc="mean")
ax = df_pivot.plot(kind="bar", alpha=0.5)

# Adding the aesthetics (the pivot table holds mean fares, not totals)
plt.title('Chart Who vs. Fare')
plt.xlabel('Who')
plt.ylabel('Mean Fare')

# Show the chart
plt.show()


Hands-on using Python
• Stacked column chart;
A stacked column chart is used when we want to compare the total
sizes across the available groups and the composition of the different
subgroups.



Hands-on using Python
• Stacked column chart;
import pandas as pd  # for dataframes
import matplotlib.pyplot as plt

# Creating the dataset: column A names the group, B to D are the subgroups
df = pd.DataFrame(columns=["A", "B", "C", "D"],
                  data=[["E", 0, 1, 1],
                        ["F", 1, 1, 0],
                        ["G", 0, 1, 0]])

# Creating the stacked column chart
df.plot.bar(x='A', y=["B", "C", "D"], stacked=True, width=0.4, alpha=0.5)

# Show the chart
plt.show()


Hands-on using Python
• Pie chart;
Pie charts can be used to identify proportions of the different
components as they are related to the whole.



Hands-on using Python
• Pie chart;

import matplotlib.pyplot as plt

# Creating the dataset
cars = ['AUDI', 'BMW', 'NISSAN', 'TESLA', 'HYUNDAI', 'HONDA']
data = [20, 15, 15, 14, 16, 30]

# Creating the pie chart
plt.pie(data, labels=cars,
        colors=['#F0F8FF', '#E6E6FA', '#B0E0E6', '#7B68EE', '#483D8B', '#0000FF'])

# Adding the aesthetics
plt.title('Chart title')

# Show the plot
plt.show()



Hands-on using Python
• Area chart;
Area charts are used to track changes over a common variable, typically time (e.g., days), for one or more groups. They are preferred over line charts when we want to capture the changes for more than one group.



Hands-on using Python
• Area chart;

import matplotlib.pyplot as plt

# Creating the dataset
x = range(1, 7)  # from 1 to columns+1
y = [[1, 4, 6, 8, 19, 2], [2, 2, 7, 10, 12, 1], [2, 8, 5, 10, 6, 3]]  # 3 rows x 6 columns

# Creating the area chart
ax = plt.gca()
ax.stackplot(x, y, labels=['A', 'B', 'C'], alpha=0.5)

# Adding the aesthetics
plt.legend(loc='upper left')
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')

# Show the plot
plt.show()



Hands-on using Python
• Column histogram;
Column histograms are used to observe the distribution (frequency of
occurrence) of a single variable within a data set.



Hands-on using Python
• Column histogram;

import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Reading the dataset
penguins = sns.load_dataset("penguins")

# Creating the column histogram (rows with a missing flipper length are dropped)
ax = plt.gca()
ax.hist(penguins['flipper_length_mm'].dropna(), color='green', alpha=0.5, bins=10)

# Adding the aesthetics (no legend: there is only one, unlabeled series)
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')

# Show the plot
plt.show()
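To make the binning behind a column histogram concrete, here is a minimal pure-Python sketch of counting values into equal-width bins. The function name histogram_counts and the sample values are made up for illustration, and the sketch assumes the values are not all identical (otherwise the bin width would be zero).

```python
def histogram_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes hi > lo
    counts = [0] * bins
    for v in values:
        # The last bin is closed on the right so that the maximum is counted.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# Made-up flipper lengths in mm, for illustration only.
sample = [172, 181, 186, 190, 191, 193, 195, 197, 198, 201, 210, 215, 222, 231]
counts, edges = histogram_counts(sample, bins=6)
print(counts)
```

The height of each column in the chart is one of these counts, and the bin edges mark where the columns start and end.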


Hands-on using Python
• Column histogram;
In this example, the histogram groups the flipper lengths into 10 bins (bins=10).


Hands-on using Python
• Line histogram;
Line histograms are used to observe the distribution (frequency of
occurrence) of two or more variables within a data set.

Kernel Density Estimate (KDE)
KDE is a way to estimate the probability density function (PDF) of a random variable. It smooths the observed data into a continuous curve.



Hands-on using Python
• Line histogram;
import pandas as pd  # for dataframes
import matplotlib.pyplot as plt
import numpy as np

# Creating the dataset: 1000 draws from a normal distribution
mean = 10
stdv = 2
dist = pd.DataFrame(np.random.normal(loc=mean, scale=stdv, size=(1000, 1)), columns=['rnd_data'])
print(dist.agg(['min', 'max', 'mean', 'std']).round(decimals=2))

fig, ax = plt.subplots()  # more details later

# Creating the line histogram: a KDE curve on top of a density histogram
# (the pandas KDE plot needs SciPy to be installed)
dist.plot.kde(ax=ax, legend=False, title='Histogram: rnd_data & KDE')
dist.plot.hist(density=True, ax=ax)
ax.set_ylabel('Probability density')  # Y-Label
ax.grid(axis='y')  # Y-Grid lines
plt.show()
Hands-on using Python
• Line histogram;
Example output of the print statement (the data is random, so the values differ from run to run):

      rnd_data
min       3.57
max      16.97
mean     10.00
std       1.97




Hands-on using Python
• Line histogram;

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Creating the dataset: 1000 draws from a standard normal distribution
df_1 = np.random.normal(0, 1, (1000, ))
density = stats.gaussian_kde(df_1)

# Creating the line histogram: a step histogram plus the KDE curve
n, x, _ = plt.hist(df_1, bins=np.linspace(-3, 3, 50), histtype='step', density=True)
plt.plot(x, density(x))

# Adding the aesthetics
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')

# Show the plot
plt.show()
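What a Gaussian KDE computes can be written out by hand: one Gaussian bump is placed on every sample and the bumps are averaged. The sketch below is a simplified illustration, not the SciPy implementation. The sample values are made up, and the bandwidth is fixed by hand here, whereas stats.gaussian_kde chooses one automatically.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density at x."""
    n = len(samples)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per sample, centred on that sample.
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return total / norm

    return density

samples = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]  # made-up observations
density = gaussian_kde(samples, bandwidth=0.5)

# Numerically integrate the estimate over [-6, 6]; a valid density integrates to about 1.
step = 0.01
area = sum(density(-6 + i * step) * step for i in range(1201))
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve that follows the individual samples; a larger one gives a smoother curve that may hide structure.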





Hands-on using Python
• Scatter plot;
Scatter plots can be used to identify relationships between two variables. They are effective when the dependent variable can have multiple values for a single value of the independent variable.



Hands-on using Python
• Scatter plot;

import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset
df = sns.load_dataset("tips")

# Creating the scatter plot
plt.scatter(df['total_bill'], df['tip'], alpha=0.5)

# Adding the aesthetics
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')

# Show the plot
plt.show()



Hands-on using Python
• Bubble chart;
Bubble charts are like scatter plots, but they show relationships among three variables. The third variable is represented by the bubble area.



Hands-on using Python
• Bubble chart;
import pandas as pd  # for dataframes
import matplotlib.pyplot as plt
import numpy as np

# Creating the dataset
np.random.seed(42)
N = 100
x = np.random.normal(170, 20, N)
y = x + np.random.normal(5, 25, N)
colors = np.random.rand(N)
area = (25 * np.random.rand(N))**2
df = pd.DataFrame({
    'X': x,
    'Y': y,
    'Colors': colors,
    "bubble_size": area})

# Creating the bubble chart: s sets the marker area
plt.scatter('X', 'Y', s='bubble_size', alpha=0.5, data=df)

# Adding the aesthetics
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')

# Show the plot
plt.show()
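Because the third variable is encoded by area, not radius, it helps to scale the values explicitly before passing them as marker sizes (in matplotlib, the s argument of scatter is an area in points squared). A minimal sketch follows; the helper names bubble_areas and bubble_radii and the max_area default are hypothetical choices for this illustration.

```python
import math

def bubble_areas(values, max_area=400.0):
    """Map non-negative values to marker areas so that area is proportional to value."""
    largest = max(values)
    return [max_area * v / largest for v in values]

def bubble_radii(areas):
    """Radius of a circle with the given area; it grows with the square root of the area."""
    return [math.sqrt(a / math.pi) for a in areas]

values = [1, 4, 9, 16]
areas = bubble_areas(values)
radii = bubble_radii(areas)
print(areas)
print([round(r, 2) for r in radii])
```

A value 16 times larger gets a bubble with 16 times the area but only 4 times the radius, which keeps large values from visually dominating the chart.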


Hands-on using Python
• Box chart;
The box chart is used to show the range of the distribution, its central value,
and its variability.



Hands-on using Python
• Box chart;
import matplotlib.pyplot as plt
import numpy as np

# Creating the dataset: two series, each with three groups
df_1 = [[1, 2, 5], [5, 7, 2, 2, 5], [7, 2, 5]]
df_2 = [[6, 4, 2], [1, 2, 5, 3, 2], [2, 3, 5, 1]]

# Creating the box plot: the two series sit side by side within each group
ticks = ['A', 'B', 'C']
plt.figure()
bpl = plt.boxplot(df_1, positions=np.array(range(len(df_1)))*2.0-0.4, sym='', widths=0.6)
bpr = plt.boxplot(df_2, positions=np.array(range(len(df_2)))*2.0+0.4, sym='', widths=0.6)

# Color each series so that it matches its legend entry
for element in ('boxes', 'whiskers', 'caps', 'medians'):
    plt.setp(bpl[element], color='#D7191C')
    plt.setp(bpr[element], color='#2C7BB6')

# Empty lines that only provide the legend entries
plt.plot([], c='#D7191C', label='Label 1')
plt.plot([], c='#2C7BB6', label='Label 2')

# Adding the aesthetics
plt.title('Chart title')
plt.xlabel('X axis title')
plt.ylabel('Y axis title')
plt.legend()
plt.xticks(range(0, len(ticks) * 2, 2), ticks)
plt.xlim(-2, len(ticks)*2)
plt.ylim(0, 8)
plt.tight_layout()

# Show the plot
plt.show()
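The numbers that a box summarizes can also be computed by hand. The sketch below uses Python's statistics module on group B of df_1. box_summary is a hypothetical helper, and the quartile rule chosen here (method="inclusive", i.e. linear interpolation between data points) is an assumption that may differ slightly from the rule a given plotting library applies.

```python
from statistics import quantiles

def box_summary(values):
    """Five-number summary: minimum, quartiles (Q1, median, Q3), and maximum."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

summary = box_summary([5, 7, 2, 2, 5])
print(summary)
```

The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.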


Hands-on using Python
• Venn Diagram;
Venn diagrams are used to see the relationships between two or three sets of items. They highlight the similarities and differences.



Hands-on using Python
• Venn Diagram;
import matplotlib.pyplot as plt
from matplotlib_venn import venn3  # third-party package: matplotlib-venn

# Making the Venn diagram: the seven numbers are the sizes of the seven regions
venn3(subsets=(10, 8, 22, 6, 9, 4, 2))
plt.show()
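In practice the seven region sizes are usually derived from actual sets, not typed in by hand. The sketch below computes them with plain Python set operations, assuming the (Abc, aBc, ABc, abC, AbC, aBC, ABC) ordering documented for venn3. The function name venn3_region_sizes and the example sets are made up for illustration.

```python
def venn3_region_sizes(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram.

    Order: only A, only B, A and B only, only C, A and C only, B and C only, all three.
    """
    return (
        len(a - b - c),
        len(b - a - c),
        len((a & b) - c),
        len(c - a - b),
        len((a & c) - b),
        len((b & c) - a),
        len(a & b & c),
    )

# Made-up example: which (hypothetical) students use which tool.
python_users = {"ana", "bo", "cy", "di"}
r_users = {"cy", "di", "ed"}
matlab_users = {"di", "ed", "fay"}

sizes = venn3_region_sizes(python_users, r_users, matlab_users)
print(sizes)
```

The seven regions do not overlap, so their sizes add up to the number of distinct items across all three sets.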



Hands-on using Python
• Tree Maps;
Tree Maps are primarily used to display data that is grouped and nested in a
hierarchical structure and observe the contribution of each component.



Hands-on using Python
• Tree Maps;
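import matplotlib.pyplot as plt
import squarify  # third-party package for treemaps

# Creating the dataset: one rectangle per value, with area proportional to the value
sizes = [40, 30, 5, 25, 10]

# Creating the treemap
squarify.plot(sizes)

# Adding the aesthetics (the axes carry no meaning in a treemap, so they are hidden)
plt.title('Chart title')
plt.axis('off')

# Show the plot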
plt.show()
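The idea behind a treemap, rectangle areas proportional to values, can be shown with a much simpler layout than the one squarify uses. The sketch below is a plain "slice" layout that cuts the canvas into vertical strips. It is a stand-in for illustration, not the squarified algorithm, and slice_layout is a hypothetical helper name.

```python
def slice_layout(sizes, width=100.0, height=100.0):
    """Split a width x height rectangle into vertical strips proportional to sizes.

    Returns one (x, y, w, h) tuple per size.
    """
    total = sum(sizes)
    rects, x = [], 0.0
    for size in sizes:
        w = width * size / total
        rects.append((x, 0.0, w, height))
        x += w
    return rects

sizes = [40, 30, 5, 25, 10]
rects = slice_layout(sizes)
areas = [w * h for _, _, w, h in rects]
print([round(a, 1) for a in areas])
```

Strips become thin and hard to compare when there are many small values, which is why treemap libraries prefer layouts that keep the rectangles closer to squares.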



Hands-on using Python
• Sub plots;
Subplots place several plots in one figure, which makes it easy to compare them.



Hands-on using Python
• Sub plots;
import matplotlib.pyplot as plt
import seaborn as sns  # for plotting graphs and sample data sets

# Creating the dataset: total sepal width for each sepal length
df = sns.load_dataset("iris")
df = df.groupby('sepal_length')['sepal_width'].sum().to_frame().reset_index()

# Creating the subplots: a 2 x 2 grid with the same line chart in every cell
fig, axes = plt.subplots(nrows=2, ncols=2)
for ax in axes.flat:
    df.plot('sepal_length', 'sepal_width', ax=ax, legend=False)

# Adding the aesthetics to the top-left plot
axes[0, 0].set_title('Chart title')
axes[0, 0].set_xlabel('X axis title')
axes[0, 0].set_ylabel('Y axis title')

# Show the plot
plt.show()


Resources

⚫ Books
– Visual Complexity: Mapping Patterns of Information, Manuel Lima
– The Visual Display of Quantitative Information, Edward Tufte
– Information Visualization: Beyond the Horizon, Chaomei Chen
– JavaScript: The Definitive Guide, David Flanagan
– Getting Started with D3, Mike Dewar
– Visualizing Data, Ben Fry
– Interactive Data Visualization for the Web, Scott Murray



Resources
⚫ Web sites
– http://processingjs.org/
– http://d3js.org/, https://github.com/mbostock/d3/wiki/API-Reference
– http://code.google.com/apis/ajax/playground/
– http://www.edwardtufte.com/tufte/
– http://www.visualcomplexity.com/
– http://www.webdesignerdepot.com/2009/06/50-great-examples-of-data-visualization/
– http://fellinlovewithdata.com/
– http://infosthetics.com/
– http://visual.ly/



Conclusion

• We have seen an array of visualization libraries. Using them to their full potential requires understanding the use case and its requirements. The syntax and semantics vary from package to package, so it is essential to understand the challenges and advantages of each library.

Assignment 5

• Download a data set


• Create the following plots from it:
-Pie chart
-Line histogram
-Column histogram
-Bubble chart
-Subplots
-Box plot, at least two combined
