
Southwest University of Science and Technology, “Data Visualization” Final Comprehensive Report

Southwest University of Science and Technology

“Data Visualization”

Project Report:

Data Visualization of ‘global hot websites’

using Tableau and Python 3.9

Project Link: Data Visualization Project 2021

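School: Computer Science and Technology

Major: Computer Science and Technology (undergraduate)

Student name: Pant Sushovan Nath (苏南)

Student ID: 4420180002

Instructor: Wang Song (王松)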

November, 2021

Acknowledgements
I am pleased to acknowledge Prof. Wang Song (王松) for his invaluable guidance
during the course of this project.

I am also grateful to the creators of ‘Tableau Public’, ‘PyCharm Community Edition’
and ‘Python’ for allowing me to use their products and services to complete the
visualization project.

Pant Sushovan Nath (苏南)


Contents
Acknowledgements
Introduction
    Overview
    Background of the project
    Objective
    Methodology
Data visualization: tools, types and techniques used
    Tableau public
    Python 3.9 and respective modules
    Types of visualization used
    Tableau elements and techniques used
        1. Storytelling
        2. Dashboard
        3. Sheet
        4. Parameter and functions
    Preprocessing using python
Informational insights from the data
    1. Tableau public
    2. Python 3.9
Project Summary
    Problems and solutions
    Personal feedback
References


Introduction

Overview

This report describes the final project of the Data Visualization course. The
task was to use Tableau, or any suitable programming language, to visualise the given
dataset of websites. Why visualization? Because it provides insights, answers
questions, supports wayfinding, tells stories, and communicates knowledge and
awareness by representing data in an interactive and understandable form.

Background of the project

Websites can help generate business, sales and leads, and can also increase brand
value. They increase a business's credibility with customers and help it showcase
its services to a target audience. In 2021, it is relatively uncommon to find a
business or company, big or small, that does not have its own website. The most
popular websites today include social media giants such as Facebook and Twitter,
video and other service providers such as YouTube, Amazon and Google, and search
engines such as ‘google.com’ and ‘baidu.com’. These are among the millions of
websites available on the internet today. Websites have become well integrated into
our daily lives. In a world with so much data available about websites, visualizing
such statistics can benefit the owners as well as the clients or users of those
websites.

Objective

The goals of this report are as follows:

1. To describe the tools and visualization techniques used in the project.

2. To discuss the informational insights produced from the data.

3. To discuss the difficulties in handling the dataset and the methods used to
overcome them.


Methodology

To meet the above goals, the following methodology was followed:

Tableau Public was used to implement the data visualization techniques, and Python
3 with the relevant modules was used to clean the data as well as visualize it in
various ways.

The link to the project is given below:

https://public.tableau.com/app/profile/sushovan6141/viz/DataVisualizationProject2021/Story1


Data visualization: tools, types and techniques used

Tableau public

Tableau Public is a free platform to publicly share and explore data visualizations
online. Anyone can explore and contribute to the millions of visualizations on Tableau
Public. Saving locally and refreshing data are limited. Visualizations published to
Tableau Public are available for anyone to see online. Tableau Public is a platform for
public (not private) data.

Tableau Public was used for the data visualization, and the workbook is publicly
available at the link provided on the front cover of this report.

Python 3.9 and respective modules

Python is one of the most widely used programming languages for data mining and
handling big data. I used it to preprocess the data by cleaning it and changing
certain data types. The modules used are listed below:

pandas, numpy, matplotlib, wordcloud, cufflinks, plotly, IPython, squarify

Jupyter Notebook was used as the development environment for visualization purposes.
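
The kind of cleaning and type conversion mentioned above might look roughly like the following. This is only a minimal sketch, assuming pandas and a small made-up table; the column names `website`, `country` and `country_rank` are hypothetical and are not taken from the project dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns and values may differ.
raw = pd.DataFrame({
    "website": ["google.com", "baidu.com", "baidu.com", None],
    "country": ["United States", "China", "China", "India"],
    "country_rank": ["1", "1", "1", "7"],  # ranks stored as strings
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a website, drop exact duplicates, convert ranks to integers."""
    out = df.dropna(subset=["website"]).drop_duplicates().copy()
    # Unparseable ranks become missing values instead of raising an error.
    out["country_rank"] = pd.to_numeric(out["country_rank"], errors="coerce").astype("Int64")
    return out.reset_index(drop=True)


cleaned = clean(raw)
```

The nullable `Int64` type is used here so that a rank that cannot be parsed stays missing rather than forcing the whole column to floating point.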

Types of visualization used

Charts          Geospatial                            Tables

Pie chart       Choropleth, isopleth and area maps    Highlight tables
Bar chart                                             Heatmaps
Treemaps
Bubble chart


Tableau elements and techniques used

1. Storytelling

In Tableau, a story is a sequence of visualizations that work together to convey
information.

2. Dashboard

A dashboard is a collection of several views, letting you compare a variety of data
simultaneously.

3. Sheet

Tableau uses a workbook and sheet file structure, much like Microsoft Excel. A
workbook contains sheets. A sheet can be a worksheet, a dashboard, or a story.

4. Parameter and functions

A parameter is a workbook variable such as a number, date, or string that can replace
a constant value in a calculation, filter, or reference line. Tableau also supports
many functions for use in calculations, such as string and logical functions.

Preprocessing using python


Informational insights from the data


1. Tableau public

The country and the website can each be selected from a drop-down menu. In this
way, we can see the top 10 websites by country rank for any country, as well as the
daily users of any website on the map.
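
The drop-down selection described above can be approximated outside Tableau. The following is a minimal pandas sketch, assuming hypothetical column names (`website`, `country`, `country_rank`) and made-up sample rows; it does not reproduce the workbook's actual logic.

```python
import pandas as pd

# Hypothetical rows; column names and values are assumptions for illustration.
sites = pd.DataFrame({
    "website": ["google.com", "youtube.com", "baidu.com", "qq.com", "amazon.com"],
    "country": ["United States", "United States", "China", "China", "United States"],
    "country_rank": [1, 2, 1, 2, 3],
})


def top_sites(df: pd.DataFrame, country: str, n: int = 10) -> pd.DataFrame:
    """Return the n best-ranked websites for one country (rank 1 is best)."""
    subset = df[df["country"] == country]
    return subset.nsmallest(n, "country_rank").reset_index(drop=True)


us_top = top_sites(sites, "United States", n=2)
```

Changing the `country` argument plays the role of the drop-down menu, and `n` defaults to 10 to match the top-10 view.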

The countries in which a website is available can be checked from the data.


The table on the right shows that the more highly a website is ranked in a country,
the more likely it is to have many social media mentions. The bubble chart on the
left shows the total social media mentions of websites and displays the ten websites
with the most mentions.

The highlight table shows the top ten registrars, and the word cloud shows the 15
countries in which the most websites are located.
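
Both views above come down to counting how often each value occurs and keeping the most frequent ones. A minimal sketch using only the Python standard library is shown below; the registrar values are made-up examples, not results from the project data.

```python
from collections import Counter

# Hypothetical registrar values, one entry per website row.
registrars = [
    "MarkMonitor", "GoDaddy", "MarkMonitor",
    "Alibaba Cloud", "MarkMonitor", "GoDaddy",
]


def top_counts(values, n):
    """Return the n most frequent values with their counts, most frequent first."""
    return Counter(values).most_common(n)


top_registrars = top_counts(registrars, 2)

# A plain dict of value -> count is a convenient input for plotting libraries.
frequencies = dict(top_registrars)
```

The same function with `n=15` applied to a list of countries would give the frequencies behind a word cloud; the `wordcloud` package can build one from such a dictionary with `WordCloud().generate_from_frequencies(frequencies)`.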


As the visualization hints, website domains worldwide can be searched according to
the number of countries in which they are available.

It can be seen that top websites often have excellent privacy and child-safety
features and are the most trustworthy.


2. Python 3.9


Project Summary

Problems and solutions

Problem: Uncleaned data was difficult to use in Tableau for accurate visualizations.
Solution: Used Python to preprocess the given data.

Problem: Lack of interactive visualizations in Python.
Solution: Used Tableau to produce dynamic and interactive visualizations.

Problem: Hard to use Tableau to create or generate new attributes or complex dimensions.
Solution: Used Python to implement such functionality.

Problem: Values with ‘k’ and ‘millions’ in the social media data were in string format.
Solution: Used Python to change them into a numeric data type.
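
The conversion of ‘k’ and ‘millions’ strings into numbers could be sketched as follows. The exact string format in the dataset is not shown in this report, so the accepted suffixes and the function name `parse_count` are assumptions made for illustration.

```python
def parse_count(text: str) -> float:
    """Convert strings such as '12k' or '1.5 millions' into a number.

    Assumed format: an optional thousands separator, a decimal number,
    and an optional suffix ('k', 'm', 'million' or 'millions').
    """
    cleaned = text.strip().lower().replace(",", "")
    # Longer suffixes are checked first so 'millions' is matched before 'm'.
    for suffix, factor in (("millions", 1e6), ("million", 1e6), ("m", 1e6), ("k", 1e3)):
        if cleaned.endswith(suffix):
            return float(cleaned[: -len(suffix)]) * factor
    return float(cleaned)


mentions = [parse_count(value) for value in ["12k", "1.5 millions", "250"]]
```

With pandas, the same function could be applied to a whole column, for example `df["mentions"].map(parse_count)`, where `mentions` is a hypothetical column name.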

Personal feedback

I have learnt a great deal during the course of this project. It has not only taught me
how to use a powerful data visualization tool like Tableau and various data
visualization techniques, but also shown me how Python and Tableau can complement
each other to produce really meaningful data visualizations.

Data visualization has tremendous scope in any field that deals with data, as it allows
human input in the decision-making process. Through this project, I have come to
realise how data visualization can represent huge, complex datasets as simple,
understandable and meaningful information. In accordance with human cognition,
presenting data in a meaningful way can vastly improve the effectiveness of any
system. This dataset was not clean or suitable for meaningful analysis as provided,
but a little preprocessing and visualization changed that completely.

Finally, I am very grateful to Prof. Wang Song for advising me and helping me in
every possible way during the course of this project. Even during the Covid-19 crisis,
the professor always kept lines of communication open for me, for which I am really
thankful. I learnt a lot through this project and will definitely continue using and
further developing my understanding of data visualization.

References
1. https://digital.lib.washington.edu/researchworks/handle/1773/46581

2. https://www.tableau.com/learn/articles/data-visualization
