Data Visualization
• Data visualization is the graphical representation of information and data.
• By using visual elements like charts, graphs and maps, data visualization tools
provide an accessible way to see and understand trends, outliers, and patterns in
data.
• Data visualization sits right in the middle of analysis and visual storytelling.
• The advantages and benefits of good data visualization:
o Our eyes are drawn to colors and patterns.
o We quickly see trends and outliers.
o It’s storytelling with a purpose
• Effective data visualization is a delicate balancing act between form and function.
• The plainest graph could be too boring to catch any notice, or it may make a
powerful point.
• The most stunning visualization could utterly fail at conveying the right message
or it could speak volumes.
• The data and the visuals need to work together, and there’s an art to combining
great analysis with great storytelling.
History of Tableau
• Tableau was founded by Pat Hanrahan, Christian Chabot, and Chris Stolte from
Stanford University in 2003.
• The main idea behind its creation is to make the database industry interactive and
comprehensive.
• It is a user-friendly tool that can help you create graphs, charts, maps, and reports.
Advantages of Tableau
1. Great visualizations:
• With Tableau, you can work with large amounts of unordered data and create a
variety of visualizations with the help of the built-in features offered by Tableau.
• Moreover, you will be able to achieve great context, several ways of drilling
the data and exploring the data within minutes.
2. Detailed Insights:
• Tableau helps in exploring data without any specific goals in mind.
• You will be able to explore visualizations and observe data from different
approaches.
• With hypothetical visualizations and a feature of adding components for
comparison and analysis, you can frame ‘what-if’ queries and work on the
data accordingly.
3. User-friendly Approach:
• User-friendliness is the major strength of Tableau.
• It allows an individual to work without any technical or coding knowledge.
• Since Tableau offers most of its features in a drag-and-drop form and each
visualization is built-in and self-depicting, any newbie can work without any
prior set of skills.
5. Adding a data set is easy: New data sets can be added easily and blended
automatically with the existing data in Tableau using common fields.
Architecture of Tableau
• Tableau Server is designed to connect many data tiers. It can connect clients from
Mobile, Web, and Desktop.
• Tableau Desktop is a powerful data visualization tool. It is very secure and highly
available.
1. Data server:
o The primary component of the Tableau architecture is the set of data sources it
can connect to.
o Tableau can connect with multiple data sources.
o It can blend the data from various data sources.
o It can connect to a spreadsheet, a database, and a web application at the same
time. It can also make relationships between different types of data sources.
2. Data connector:
o The Data Connectors provide an interface to connect external data sources
with the Tableau Data Server.
o Tableau has a built-in SQL/ODBC connector. This ODBC connector can connect
to any database without using its native connector.
o Tableau desktop has an option to select both extract and live data.
o The user can easily switch between live and extracted data.
o Real-time data or live connection: Tableau can be connected with real data by
linking to the external database directly.
o Extracted or in-memory data: Tableau has an option to extract the data from
external data sources.
3. Components of Tableau server: Different types of components of the Tableau
server are:
o Application Server: The application server is used to provide the authorizations
and authentications. It handles the permission and administration for mobile
and web interfaces.
o VizQL server: VizQL server is used to convert the queries from the data source
into visualizations.
o Data server: Data server is used to store and manage the data from external
data sources. It is a central data management system.
4. Gateway:
o The gateway directs the requests from users to Tableau components.
o When the client sends a request, it is forwarded to the external load balancer
for processing.
o The gateway works as a distributor of processes to different components.
o In case of absence of external load balancer, the gateway also works as a load
balancer.
o For single server configuration, one gateway or primary server manages all the
processes.
o For multiple server configurations, one physical system works as a primary
server, and others are used as worker servers.
o Only one machine is used as a primary server in Tableau Server environment.
5. Clients:
o The visualizations and dashboards in Tableau server can be edited and viewed
using clients like a web browser, mobile applications, and Tableau Desktop.
o Web Browser:
▪ Web browsers like Google Chrome, Safari, and Firefox support the
Tableau server.
▪ The visualization and contents in the dashboard can be edited by using
these web browsers.
o Mobile Application:
▪ The dashboard from the server can be interactively visualized using
mobile application and browser.
▪ It is used to edit and view the contents in the workbook.
o Tableau Desktop:
▪ Tableau desktop is a business analytics tool.
▪ It is used to view, create, and publish the dashboard in Tableau server.
▪ Users can access various data sources and build visualizations in
Tableau Desktop.
Tools of Tableau
1. Tableau Desktop
• Tableau Desktop is a desktop application that is the fundamental offering of
Tableau.
• You can use Tableau Desktop to create interactive data visualizations, perform
unlimited data exploration, create charts and dashboards from the data, etc.
• It also allows connections to the data on the cloud or in local memory, whether
the data is an SQL database, spreadsheet data, Big Data, or data on Google
Analytics, Salesforce, etc.
2. Tableau Public
• Tableau Public is free software provided by Tableau.
• It is used to create data visualizations and data charts that can be embedded
into blogs, web pages, etc. or transferred using social media.
• You can create visualizations in Tableau Public using the Tableau Desktop Public
Edition.
• However, the downside of Tableau Public is that the data visualizations created
here are accessible by everyone on the internet and cannot be saved privately.
• So, this product is best for students who are still learning Tableau, hobbyists,
journalists, bloggers, etc. who are fine with creating public data visualizations.
3. Tableau Online
• Tableau Online can be used to create data visualizations, storyboards, data
charts, etc. that are fully hosted on the cloud and not on local servers.
• You can create your visualizations and share them with other people online
using a web browser or the Tableau mobile app.
• There is no need to worry about software upgrades or hardware scaling or
server configuration while using Tableau Online as it is a totally online hosted
service.
• So, you can set up Tableau Online in minutes and start reaping the benefits as a
single user or company.
4. Tableau Server
• Tableau Server is a server product for your organization that needs to be
installed on a Windows or Linux server.
• This is widely used in the industrial domain with many IT companies installing
Tableau Server to create interactive data visualizations, perform unlimited
data exploration, create charts and dashboards from the data, etc. while being
sure that their data is safe and secure.
• Tableau Server can integrate with existing security protocols in companies such
as Kerberos, Active Directory, OAuth, etc. to ensure data security while also
providing data insights.
Aliases in Tableau
• You can create aliases (alternate names) for members in a dimension so that their
labels appear differently in the view.
• Aliases can be created for the members of discrete dimensions only. They cannot
be created for continuous dimensions, dates, or measures.
Steps to create an Alias:
1. In the Data Source page, select a dimension, say ‘Segment’
2. Click on the dropdown and select ‘Aliases’
3. A dialog box ‘Edit Aliases’ appears.
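Outside Tableau, the effect of an alias can be pictured as a lookup from a member's stored value to the label shown in the view. The Python sketch below is only an illustration of that idea, not how Tableau implements aliases; the member values, the alias names, and the helper `display_label` are all hypothetical.

```python
# Hypothetical members of a discrete dimension such as 'Segment'.
segment_members = ["Consumer", "Corporate", "Home Office"]

# Hypothetical aliases: stored value -> label shown in the view.
aliases = {"Consumer": "B2C", "Home Office": "SOHO"}


def display_label(member, alias_map):
    """Return the alias if one is defined, otherwise the original member name."""
    return alias_map.get(member, member)


labels = [display_label(m, aliases) for m in segment_members]
print(labels)  # ['B2C', 'Corporate', 'SOHO']
```

Note that the underlying member values are unchanged; only the displayed labels differ.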
Cards and Shelves
• Every worksheet in Tableau contains shelves and cards, such as Columns, Rows,
Marks, Filters, Pages, Legends, and more
• By placing fields on shelves or cards, you:
o Build the structure of your visualization.
o Increase the level of detail and control the number of marks in the view by
including or excluding data.
o Add context to the visualization by encoding marks with color, size, shape,
text, and detail
1. Marks Card
• The Marks card is a key element for visual analysis in Tableau.
• As you drag fields to different properties in the Marks card, you add context and
detail to the marks in the view.
• You can encode your data with color, size, shape, text, and detail.
• After you add a field to the Marks card, you can click the icon next to the field to
change the property it is using. Example:
i) Colour option: Before, all the circles were blue. After adding the ‘Category’ field to
Colour on the Marks card, the circles appear in different colours, one per category.
ii) Size option: Provides additional information (a size) for each mark.
iii) Detail property: Adding a dimension to Detail breaks down the data already in
your chart/table by that dimension as well.
iv) Label option: Used to add mark labels or text to the visualization.
2. Filters shelf
• The Filters shelf allows you to specify which data to include and exclude.
• You can filter data using measures, dimensions, or both at the same time.
Types of Filters in Tableau
• There are different types of filters in Tableau that can be used to organize data
based on predefined conditions and use it for data visualization:
1. Extract Filters
2. Data Source Filters
3. Context Filter
4. Dimension Filter
5. Measure Filters
6. Table Filters
• This ability to filter large data sets helps prepare data for analysis, including:
o removing irrelevant data records,
o reducing data sizes for faster processing, and more.
• The filters are required to highlight any underlying insights that can be derived
from the data upon visualizing in a readable, actionable format.
1. Extract Filters:
o Tableau allows you to have a data source, either live or an extracted or
copied one.
o A live data source means that any changes made in the data source are
reflected in Tableau almost immediately.
o Extract, on the other hand, is a copy of the original data and will not reflect
any changes made to the data source unless you decide to reload the data
once again after the changes to the original data are saved.
o When you’re loading in your data you can choose to extract it, i.e., saving a
snapshot of how it looks in your workbook and ultimately reducing the
number of times Tableau queries the data source.
o Steps to add an Extract Filter:
1. Drag and drop the table into the canvas.
2. Select an Extract Connection. Then click on Edit.
3. Now, click on Add in the Extract Data dialog box.
6. Select the filter. Then, a new dialog box opens to select the data to be
filtered. Select the data, then click OK.
7. Now, in the ‘Edit Data Source Filters’ dialog box, click OK.
3. Context Filter:
• Context Filters are used to improve performance of views, filters, and
queries run on a data source.
• All filters by default are applied independently of any other filters in
Tableau, which means each filter accesses the entire data set without regard
to any other filters.
• A context filter can be applied to change this behaviour.
• You could have a context filter that is run before any other filters (run first),
and the rest of the filters are applied on top of the data returned after
context filtering.
• Steps to create a Context Filter:
1. Drag and drop a table into the canvas in the Data Source page.
2. Create a new ‘Worksheet’ in the bottom-left corner.
3. Drag a field from the Data pane to the Filters shelf.
4. The Filter dialog box opens so you can define the filter.
5. To create a context filter, select Add to Context from the context menu
of an existing categorical filter.
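The difference between independent filters and a context filter can be sketched in plain Python. This is an illustration under stated assumptions, not Tableau's internal logic: the rows are made up, and a "Top 2 products by sales" filter is assumed as the second filter so that the order of evaluation visibly changes the result.

```python
# Hypothetical rows: (category, product, sales).
rows = [
    ("Furniture", "Chair", 300),
    ("Furniture", "Table", 500),
    ("Furniture", "Shelf", 200),
    ("Technology", "Phone", 900),
    ("Technology", "Laptop", 800),
]


def top_n_products(data, n):
    """Names of the n products with the highest sales in `data`."""
    ranked = sorted(data, key=lambda r: r[2], reverse=True)
    return {r[1] for r in ranked[:n]}


def is_furniture(row):
    return row[0] == "Furniture"


# Independent filters: Top 2 is computed over the entire data set,
# and a row is kept only if it passes both filters.
top_overall = top_n_products(rows, 2)
independent = [r[1] for r in rows if is_furniture(r) and r[1] in top_overall]

# Context filter: the category filter runs first, and Top 2 is
# computed only over the rows it returns.
context_rows = [r for r in rows if is_furniture(r)]
top_in_context = top_n_products(context_rows, 2)
with_context = [r[1] for r in context_rows if r[1] in top_in_context]

print(independent)   # []
print(with_context)  # ['Chair', 'Table']
```

With independent filters the two best-selling products overall are both Technology items, so nothing survives the Furniture filter; with the category filter as context, the Top 2 is taken within Furniture.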
4. Dimension filter:
o Dimensions in Tableau are fields (columns) that are independent, typically
any field that contains categorical or qualitative information.
o When a dimension is used to filter the data in a worksheet, it is called a
dimension filter. (Ex: applying Department as "CSE" filter on NHCE database)
o It is a non-aggregated filter (it does not support AVG, SUM, etc.) to which a
dimension, group, set, or bin can be added.
o The members present in a dimension can be included or excluded from the
list using this filter.
o Dimension filter can be shown in a sheet or dashboard to change the filter
condition dynamically.
o Steps to create a dimension filter: Follow steps 1-3 of the Context Filter procedure.
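As a rough picture of what a dimension filter does, the following Python sketch includes or excludes rows by the members of one dimension. It borrows the Department = "CSE" example from the text; the student rows and the helper `dimension_filter` are hypothetical and only meant to illustrate the include/exclude behaviour.

```python
# Hypothetical student rows: (name, department).
students = [
    ("Asha", "CSE"),
    ("Ravi", "ECE"),
    ("Meena", "CSE"),
    ("John", "MECH"),
]


def dimension_filter(rows, members, exclude=False):
    """Keep rows whose department is in `members` (or is not, when exclude=True).

    Rows are tested one by one; no aggregation (SUM, AVG, ...) is involved.
    """
    return [row for row in rows if (row[1] in members) != exclude]


included = dimension_filter(students, {"CSE"})
excluded = dimension_filter(students, {"CSE"}, exclude=True)

print(included)  # [('Asha', 'CSE'), ('Meena', 'CSE')]
print(excluded)  # [('Ravi', 'ECE'), ('John', 'MECH')]
```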
5. Measure Filters:
o Measures are typically fields/columns that contain quantitative data.
o In this filter, you can apply the various operations like SUM, AVG, Median,
Standard Deviation, and other aggregate functions.
o Every time you drag the data you want to filter, you choose a specific setting.
o Steps to create a measure filter:
1. When you drag a measure from the Data pane to the Filters shelf in
Tableau Desktop, the following dialog box appears:
2. Select how you want to aggregate the field, and then click Next.
3. In the next dialog box, we can create four types of quantitative filters:
Range of Values, At Least, At Most, Special. Select a filter and click OK.
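The two-step nature of a measure filter (aggregate first, then compare against bounds) can be sketched as follows. This is a simplified illustration with made-up rows; the helper `measure_filter` and its parameters are hypothetical, and the 'Special' filter type is not modelled.

```python
from collections import defaultdict

# Hypothetical rows: (category, sales).
rows = [
    ("Furniture", 300),
    ("Furniture", 500),
    ("Technology", 900),
    ("Office Supplies", 100),
    ("Office Supplies", 150),
]

# Step 1: aggregate the measure (SUM of sales) per category.
sum_sales = defaultdict(int)
for category, sales in rows:
    sum_sales[category] += sales


def measure_filter(aggregated, at_least=None, at_most=None):
    """Keep categories whose aggregated value lies within the given bounds.

    Only at_least set -> 'At Least'
    Only at_most set  -> 'At Most'
    Both set          -> 'Range of Values'
    """
    kept = {}
    for key, value in aggregated.items():
        if at_least is not None and value < at_least:
            continue
        if at_most is not None and value > at_most:
            continue
        kept[key] = value
    return kept


# Step 2: filter on the aggregated values.
at_least_300 = measure_filter(sum_sales, at_least=300)
at_most_800 = measure_filter(sum_sales, at_most=800)
in_range = measure_filter(sum_sales, at_least=300, at_most=850)

print(at_least_300)  # {'Furniture': 800, 'Technology': 900}
print(at_most_800)   # {'Furniture': 800, 'Office Supplies': 250}
print(in_range)      # {'Furniture': 800}
```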
6. Table Filters:
o A table filter (table calculation filter) is based on a table calculation, which is
executed once the data view has been rendered.
o With this filter, you can quickly look into the data without filtering out the
underlying (hidden) data.
Tableau Data Connections
• There are 2 types of data connections in Tableau.
o LIVE
o EXTRACT (IN-MEMORY)
• Live connection is for high volume data and sends logic to data.
• Extract connection brings data into memory, i.e., sends data to the logic.
• There are no standard rules to decide on which connection to choose.
• Depending on the situation and resources, we must choose the connection type
that helps to provide a responsive report.
• In general, high-volume data and frequently changing data may be eligible for
LIVE connection.
• Low volume data and less frequently changing data may be eligible for EXTRACT
connection. Again, it all depends on the available storage, memory, cache and
other resources.
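The contrast between "sending logic to the data" and "sending data to the logic" can be imitated with Python's built-in `sqlite3` module standing in for an external database. This is only a sketch of the idea, not of Tableau's connectors; the table name, the values, and the helper functions are assumptions made for illustration.

```python
import sqlite3

# A stand-in for an external database (hypothetical table and values).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("East", 100), ("West", 200)]
)


def live_total(conn):
    """Live connection: the aggregation is sent to the database on every call."""
    return conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


def take_extract(conn):
    """Extract: copy the rows into local memory once."""
    return conn.execute("SELECT region, amount FROM sales").fetchall()


def extract_total(snapshot):
    """The aggregation runs locally against the in-memory copy."""
    return sum(amount for _, amount in snapshot)


extract = take_extract(source)

# The source changes after the extract was taken.
source.execute("INSERT INTO sales VALUES ('North', 50)")

live_result = live_total(source)       # sees the new row
stale_result = extract_total(extract)  # does not, until refreshed

extract = take_extract(source)         # refresh the extract
refreshed_result = extract_total(extract)

print(live_result, stale_result, refreshed_result)  # 350 300 350
```

The live query reflects the change immediately at the cost of querying the source each time, while the extract answers from memory but stays stale until it is refreshed.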
Tableau Calculations
• Calculated fields allow you to create new data from data that already exists in
your data source.
• When you create a calculated field, you are essentially creating a new field (or
column) in your data source, the values or members of which are determined by a
calculation that you control.
• This new calculated field is saved to your data source in Tableau, and can be used
to create more robust visualizations.
• But don't worry: your original data remains untouched.
• You can use calculated fields for many, many reasons. Some examples might
include:
o To segment data
o To aggregate data
o To convert the data type of a field, such as converting a string to a date.
o To filter results
o To calculate ratios
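To make the idea of a calculated field concrete, the sketch below derives two new fields from existing ones in plain Python: a ratio and a segment label. It is an illustration only and does not use Tableau's calculation language; the field names, the sample rows, and the helper `with_calculated_fields` are hypothetical.

```python
# Hypothetical source rows; the originals are never modified.
orders = [
    {"order_id": 1, "sales": 200.0, "profit": 50.0},
    {"order_id": 2, "sales": 400.0, "profit": -20.0},
    {"order_id": 3, "sales": 100.0, "profit": 30.0},
]


def with_calculated_fields(rows):
    """Return new rows that carry two extra fields derived from existing ones."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the source row stays untouched
        # Ratio calculation: profit divided by sales.
        new_row["profit_ratio"] = row["profit"] / row["sales"]
        # Segmentation calculation: label each order by profitability.
        new_row["outcome"] = "Profitable" if row["profit"] > 0 else "Loss"
        result.append(new_row)
    return result


calculated = with_calculated_fields(orders)
for row in calculated:
    print(row["order_id"], row["profit_ratio"], row["outcome"])
```

The new fields exist only in the derived rows, which mirrors the point that the original data remains untouched.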
Types of Charts in Tableau
1. Bar Charts
o Bar charts are among the most common data visualizations across all
Business Intelligence (BI) platforms.
o You can quickly highlight differences between categories, show trends and
outliers, and reveal historical highs and lows at a glance.
o Bar charts are simple yet effective, especially when you have data that can
be split into many categories.
o Steps:
1. Connect to the data source like an Excel file
2. Drag the Order Date dimension to Columns and drag the Sales measure
to Rows.
3. Tableau uses Line as the automatic mark type because you added a date
dimension.
4. On the Marks card, select Bar from the drop-down list. The view changes to a
bar chart.
o If a dimension is in columns and a measure in rows, then we get vertical bar
graphs. If vice-versa, then horizontal bar graphs.
2. Line Chart
o The line chart, or line graph, is another familiar method for displaying data.
o It connects several distinct data points, presenting them as one continuous
evolution.
o The result is a simple, straightforward way to visualize changes in one value
relative to another.
3. Area Chart
o Area charts represent any quantitative data over various periods of time.
o It is basically a line graph where the area between line and axis is generally
filled with color.
o Steps:
1. Connect to a data-source
2. Navigate to a new worksheet.
3. From the Data pane, drag Order Date to the Columns shelf.
4. From the Data pane, drag Quantity to the Rows shelf.
5. From the Data pane, drag Ship Mode to Color on the Marks card.
6. On the Marks card, click the Mark Type drop-down and select Area.
4. Scatter Plots
o Scatter plots are an effective way to give you a sense of trends,
concentrations, and outliers that facilitate deeper investigations of your
data.
o A scatter plot presents lots of distinct data points on a single chart.
o The chart can then be enhanced with analytics like cluster analysis or trend
lines.
o Scatter plots are used to show if one variable is a good predictor of another,
or if they tend to change independently.
o This type of chart easily lends itself to many types of analysis.
o Steps:
1. Drag the Profit measure to Columns.
2. Drag the Sales measure to Rows.
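Whether one variable is a good predictor of another is usually judged with a trend line and a correlation coefficient. The Python sketch below computes a least-squares line and Pearson's r by hand for a handful of made-up points; it is not Tableau's trend-line feature, and the data and the helper `linear_trend` are hypothetical.

```python
import math

# Hypothetical (profit, sales) pairs, one per mark in the scatter plot.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0), (5.0, 11.0)]


def linear_trend(pairs):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = linear_trend(points)
print(slope, intercept, r)  # 2.0 1.0 1.0
```

An r close to 1 or -1 suggests that one variable predicts the other well; an r close to 0 suggests the two change independently.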
Tableau Joins
1. To create a join, connect to the relevant data source or sources.
2. Drag the first table to the canvas.
3. Select Open from the menu or double-click the first table to open the join canvas.
Types of joins:
o Inner: The result is a table that contains values that have matches in both
tables. When a value doesn't match across both tables, it is dropped entirely.
o Left: The result is a table that contains all values from the left table and
corresponding matches from the right table. When a value in the left table
doesn't have a corresponding match in the right table, you see a null value in
the data grid.
o Right: The result is a table that contains all values from the right table and
corresponding matches from the left table. When a value in the right table
doesn't have a corresponding match in the left table, you see a null value in
the data grid.
o Full outer: The result is a table that contains all values from both tables.
When a value from either table doesn't have a match with the other table, you
see a null value in the data grid.
Example: Join on Customer ID
Left table
Order ID     Customer ID
1            222
2            777
3            222
4            555
Right table
Customer ID  Customer Name
222          Rahul
777          Rohith
888          Aman
666          Shivanand
Inner Join
Order ID Customer ID Customer Name
1 222 Rahul
2 777 Rohith
3 222 Rahul
Left Join
Order ID Customer ID Customer Name
1 222 Rahul
2 777 Rohith
3 222 Rahul
4 555 null
Right Join
Order ID Customer ID Customer Name
1 222 Rahul
2 777 Rohith
3 222 Rahul
null 888 Aman
null 666 Shivanand
Full Outer Join
Order ID Customer ID Customer Name
1 222 Rahul
2 777 Rohith
3 222 Rahul
4 555 null
null 888 Aman
null 666 Shivanand
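The four results above can be reproduced with a short Python sketch. It is a minimal illustration of join semantics, not of how Tableau executes joins; the variable names `orders` and `customers` and the helper `join` are hypothetical, `None` stands for null, and Customer ID is assumed to be unique in the right table.

```python
# The two tables from the example: left = orders, right = customers.
orders = [(1, 222), (2, 777), (3, 222), (4, 555)]  # (Order ID, Customer ID)
customers = [  # (Customer ID, Customer Name)
    (222, "Rahul"),
    (777, "Rohith"),
    (888, "Aman"),
    (666, "Shivanand"),
]


def join(left, right, how="inner"):
    """Join left and right on Customer ID.

    Returns rows of (Order ID, Customer ID, Customer Name); None means null.
    `how` is one of 'inner', 'left', 'right', 'full'.
    """
    names = dict(right)  # assumes Customer ID is unique in the right table
    rows = []
    for order_id, customer_id in left:
        if customer_id in names:
            rows.append((order_id, customer_id, names[customer_id]))
        elif how in ("left", "full"):
            # Unmatched left row: keep it with a null customer name.
            rows.append((order_id, customer_id, None))
    if how in ("right", "full"):
        matched = {customer_id for _, customer_id in left}
        for customer_id, name in right:
            if customer_id not in matched:
                # Unmatched right row: keep it with a null order ID.
                rows.append((None, customer_id, name))
    return rows


for how in ("inner", "left", "right", "full"):
    print(how, join(orders, customers, how))
```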