
COMP6725 - Big Data Technologies

TOPIC 9
BIG DATA VISUALIZATION
LEARNING OUTCOMES

At the end of this session, students will be able to:


o LO2. Use big data analytics and visualizations
OUTCOMES

Students are able to use big data analytics and visualizations


OUTLINE

1. Data Visualizations
2. Frameworks & Libraries
3. Visualization Examples
DATA VISUALIZATIONS
DATA VISUALIZATION

o Data visualization is the process of presenting the results of data analysis visually
so that business users can interpret them effectively.
o Visualization completes the Big Data life cycle by helping end users gain insights
from the data.
o Visualization techniques use tables, diagrams, graphs, and images to represent data
to the users.
o Conventional data visualization techniques include line graphs, bar charts,
scatterplots, bubble plots, and pie charts.
VISUALIZATIONS

o The choice of visualization tools, serving databases, and web frameworks is driven by
the requirements of the application.
o Visualizations can be static, dynamic, or interactive.
• Static => Results are shown from data stored in a serving database.
• Dynamic => Results are updated regularly (visualizations use live widgets, plots, or
gauges).
• Interactive => Results are displayed according to inputs accepted from the user.
FRAMEWORKS & LIBRARIES
LIGHTNING

o Lightning is a framework for creating web-based interactive visualizations


o Lightning provides a REST API and client libraries for Python, Scala, R and
JavaScript programming languages
o Lightning can be deployed either in a server mode or in a local server-less
mode.
o The Lightning server can be installed and run using the following commands:

o The Lightning Python client can be installed with the following command:
PYGAL

o The Python Pygal library is an easy-to-use charting library which supports
charts of various types.
o The charts built with Pygal can be rendered in output formats such as SVG,
PNG or in the browser.
o The Pygal library can be installed with the following commands:
SEABORN

o Seaborn is a Python visualization library for plotting attractive statistical plots.


o Seaborn is built on top of matplotlib and uses data structures from Python
numpy and pandas libraries and statistical routines from scipy and statsmodels.
o Seaborn can be installed with the following commands:
VISUALIZATION EXAMPLES
LINE CHART

A line chart has vertical and horizontal axes where the numeric data points are
plotted and connected, which results in a simple and straightforward way to
visualize the data.

Pic 10.1. Line chart plotted with Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
LINE CHART (CONT)

Pic 10.2. Line chart plotted with Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
SCATTER PLOT

Scatter plots can be used to visualize two variables along the X and Y axes. They
are useful for identifying the relationship between two sets of data.

Pic 10.3. Scatter plot plotted with Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
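The relationship that a scatter plot shows visually can also be summarized as a number. The sketch below computes the Pearson correlation coefficient for two equally long lists; using this particular coefficient is an assumption made here for illustration, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Points that lie exactly on a rising line give a coefficient of 1.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# Points that lie exactly on a falling line give a coefficient of -1.
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

A value near 1 or -1 corresponds to points that line up closely on the plot, while a value near 0 corresponds to a cloud with no clear linear trend.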
SCATTER PLOT (CONT)

Pic 10.4. Scatter plot plotted with Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
BAR CHART

Bar charts are among the most commonly used data visualization techniques because
they reveal highs and lows at a glance. The data can be either discrete or
continuous.

Pic 10.5. Bar chart plotted with Pygal
Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
BAR CHART (CONT)

Pic 10.6. Bar plot plotted with Seaborn


Source : https://pythonbasics.org/seaborn-barplot/#barplottips
BOX PLOT

A boxplot is a standardized way of displaying the distribution of data based on a
five number summary (“minimum”, first quartile (Q1), median, third quartile (Q3),
and “maximum”). It can tell you about your outliers and what their values are.

Pic 10.7. Box plot plotted with Pygal
Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
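The five number summary behind a box plot can be computed directly. The sketch below is one possible implementation using only the Python standard library. It assumes the common 1.5 × IQR rule for deciding which points are outliers, a convention the slide does not state; the function name `box_plot_stats` and the sample data are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Return the five number summary plus any outliers for a list of numbers."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Quartiles, interpolated linearly between neighbouring data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Assumed convention: points more than 1.5 * IQR beyond the box are outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]

    return {
        "minimum": inside[0],    # lowest value that is not an outlier
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": inside[-1],   # highest value that is not an outlier
        "outliers": outliers,
    }


summary = box_plot_stats([2, 4, 4, 5, 6, 7, 8, 9, 30])
```

Under this convention the “minimum” and “maximum” are the whisker ends, so the value 30 in the sample data is reported as an outlier rather than as the maximum.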
BOX PLOT (CONT)

Pic 10.8. Box plot plotted with Seaborn


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
PIE CHART

Pie charts are well suited to part-to-whole comparison. A pie chart can also take a
donut shape, with either a design element or the total value in the center.

Pic 10.9. Pie chart plotted with Pygal
Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
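The part-to-whole idea can be made concrete by converting raw values into shares and slice angles. The following is a small sketch with a hypothetical function name, `pie_slices`, and made-up category values; a charting library such as Pygal performs an equivalent calculation internally.

```python
def pie_slices(values):
    """Map {label: value} to a list of (label, share, start_angle, end_angle)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")

    slices = []
    start = 0.0
    for label, value in values.items():
        share = value / total      # fraction of the whole
        sweep = share * 360.0      # slice angle in degrees
        slices.append((label, share, start, start + sweep))
        start += sweep
    return slices


# Hypothetical category totals.
slices = pie_slices({"A": 50, "B": 25, "C": 25})
```

Each slice starts where the previous one ends, so the angles always add up to a full circle of 360 degrees.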
DOT CHART

Dot charts are used to display different datasets where the size of the dots is
proportional to the values represented.

Pic 10.10. Dot chart plotted with Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
MAP CHART

Pic 10.11. Map plotted with Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
MAP CHART (CONT)

Pic 10.12. Map plotted with Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
GAUGE CHART

Pic 10.13. Gauge chart plotted using Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
RADAR CHART

A radar chart (also called a Kiviat diagram) is used to display multivariate data on
a two-dimensional chart with the zero point in the middle.

Pic 10.14. Radar chart plotted using Pygal


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
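To illustrate how multivariate data ends up on a two-dimensional radar chart, the sketch below places one axis per variable at equal angles around the center and converts each value to an (x, y) point. The function name `radar_points` and the choice to point the first axis straight up are assumptions made for this example.

```python
import math


def radar_points(values, max_value):
    """Return one (x, y) vertex per variable, scaled so max_value lies at radius 1."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")

    n = len(values)
    points = []
    for i, value in enumerate(values):
        angle = 2 * math.pi * i / n   # axes are spread evenly around the circle
        radius = value / max_value    # a value of zero sits at the center
        # The first axis points up; later axes follow clockwise.
        points.append((radius * math.sin(angle), radius * math.cos(angle)))
    return points


# Four variables that all reach the maximum form a diamond around the origin.
points = radar_points([1, 1, 1, 1], max_value=1)
```

Connecting the returned points in order, and closing the shape back to the first point, gives the polygon typically drawn on a radar chart.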
MATRIX CHART

A matrix chart can be used to display data in a grid format.

Pic 10.15. Matrix plotted with Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
CIRCLE CHART

Circle plots show groups of nodes as points around a circle with the connections
between the nodes represented by lines between the points.

Pic 10.16. Circle chart plotted using Lightning
Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
FORCE-DIRECTED GRAPH

Force-directed graphs are used to display data in an aesthetically pleasing graph.
The edges represent connections between the nodes and are of more or less equal
length. The layout of the nodes in the graph is such that there are as few crossing
edges as possible.

Pic 10.17. Force-directed graph plotted using Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
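The idea of edges settling at roughly equal length can be sketched with a simple spring-style simulation. The code below is a simplified layout in the spirit of the Fruchterman-Reingold algorithm, not the algorithm Lightning itself uses (the source does not say which one that is). The function name, the parameters `k`, `iterations`, and `start_step`, and the example graph are all assumptions made for illustration.

```python
import math
import random


def force_directed_layout(nodes, edges, k=1.0, iterations=300, start_step=0.1, seed=0):
    """Return {node: (x, y)} where connected nodes settle near distance k."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}

    for it in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels, which spreads the graph out.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                disp[a][0] += dx / dist * push
                disp[a][1] += dy / dist * push
                disp[b][0] -= dx / dist * push
                disp[b][1] -= dy / dist * push

        # Connected nodes attract, which pulls each edge toward length k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull

        # Move each node along its net force, capped by a shrinking step size.
        step = start_step * (1.0 - it / iterations)
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1])
            if length > 0:
                move = min(length, step)
                pos[n][0] += disp[n][0] / length * move
                pos[n][1] += disp[n][1] / length * move

    return {n: (p[0], p[1]) for n, p in pos.items()}


# A fully connected triangle: all three edges should end up close to length k.
layout = force_directed_layout(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
```

Repulsion and attraction balance when an edge has length `k`, which is why the edges in such layouts come out at more or less equal length. Minimizing edge crossings is not enforced explicitly in this sketch; it tends to emerge from the forces on small graphs.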
SPATIAL GRAPH

Spatial graphs can be used to display nodes with fixed spatial positions, and the links
between them.

Pic 10.18. Spatial graph plotted using Lightning


Source : Big Data Science & Analytics: A Hands-On Approach Basic Statistics.,
2016.
SUMMARY
o Data Visualization, the easiest way for the end users to interpret the business
analysis, is explained with various types of conventional data visualization
techniques, namely, line graphs, bar charts, pie charts, and scatterplots.
o The benefits of data visualization techniques are improved decision-making,
enabling the end users to interpret the results without the assistance of the
data analysts, increased profitability, better data analysis, and much more.
o Big data is mostly unstructured, and because of bandwidth limitations,
visualization should be moved closer to the data to extract meaningful
information efficiently.
REFERENCES

o Arshdeep Bahga & Vijay Madisetti. (2016). Big Data Science & Analytics: A Hands-On Approach. 1st ed.
VPT. India. ISBN: 9781949978001. Chapter 12
o Balamurugan Balusamy, Nandhini Abirami R., Seifedine Kadry, & Amir H. Gandomi. (2021). Big Data
Concepts, Technology, and Architecture. 1st ed. Wiley. ISBN 978-1-119-70182-8. Chapter 10
o https://www.youtube.com/watch?v=tse_8LLWtfY
o https://www.youtube.com/watch?v=uNGdpXCMrgM&list=PLwQ7o9tyw625ThzACVnUoO0XIMorcP52-
o https://www.youtube.com/watch?v=z7ZINBk8EUk&list=PL998lXKj66MpNd0_XkEXwzTGPxY2jYM2d
o https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51
o https://pythonbasics.org/seaborn-barplot/#barplottips
