VISUALIZATION
LECTURE NOTE
IDEAS AND DATA GLOBAL ACADEMY | KATSINA
Table of Contents
INTRODUCTION
How do we generate Data?
Data usage in life
The data types to know are:
Data analysis
What Is the Data Analysis Process?
Data Analysis Methods
How to Analyze Data? Top Data Analysis Techniques to Apply
Data visualization
Types of data visualizations
Importance of data visualization
Tools for data visualization
Microsoft Excel
PIVOT TABLE
WHAT ARE PIVOT TABLES USED FOR?
PIVOT CHART
Insert Pivot Chart
Power Query
How Do You Enable Power Query?
The Four Phases of Power Query
Power Pivot
How to Get the Excel Power Pivot Add-In
DAX in Excel
Data Analysis Toolpak
How to Load the Data Analysis Toolpak Add-in
Functions Available in Excel Data Analysis ToolPak
Power BI
Components of Power BI
Power BI Dashboards
Power BI Reports
Power BI DAX
How does it work?
Working of Power BI
Connecting Your Data
Uploading your File to Power BI
Importing your File into Power BI
Transforming Your Data
Modeling Your Data
Data Visualization
The following are some samples of data visualization:
Sharing the Generated Reports
SQL (Structured Query Language)
Why SQL?
SQL Commands
DDL - Data Definition Language
DML - Data Manipulation Language
DCL - Data Control Language
SQL Syntax
INTRODUCTION
Most companies are collecting loads of data all the time but, in its raw form, this
data doesn’t really mean anything. This is where data analytics comes in. Data
analytics is the process of analyzing raw data in order to draw out meaningful,
actionable insights, which are then used to inform and drive smart business
decisions.
Data is also descriptive information about people, animals, and things; colour,
habits, lifestyle, and preferences are examples of descriptive information.
The data types to know are:
• String (or str or text). Used for a combination of any characters that
appear on a keyboard, such as letters, numbers and symbols.
• Character (or char). Used for a single character, such as one letter.
• Integer (or int). Used for whole numbers.
• Float (or Real). Used for numbers that contain decimal points, or for
fractions.
• Boolean (or bool). Used where data is restricted to True/False or yes/no
options.
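As a minimal sketch, the data types listed above map roughly onto Python's
built-in types as shown here. Python has no separate character type, so a
one-character string stands in for it. The variable names and values are
illustrative assumptions, not part of the lecture material.

```python
# Illustrative values for each data type (names and values are hypothetical).
course = "Data 101!"     # String: any combination of keyboard characters
grade = "A"              # Character: Python uses a 1-character string
students = 42            # Integer: a whole number
average_score = 73.5     # Float: a number with a decimal point
has_passed = True        # Boolean: restricted to True/False

for value in (course, grade, students, average_score, has_passed):
    # type(value).__name__ is the name of the value's data type
    print(f"{value!r:>12} -> {type(value).__name__}")
```

Running the loop prints each value next to the name of its type, which is a
quick way to check what type a piece of data actually has.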
Data analysis
Data analysis is the process of cleaning, changing, and processing raw data and
extracting actionable, relevant information that helps businesses make informed
decisions.
What Is the Data Analysis Process?
Answering the question “what is data analysis” is only the first step. Now we will
look at how it is performed. The data analysis process, also called the data
analysis steps, involves gathering all the information, processing it, exploring the
data, and using it to find patterns and other insights. The process of data analysis
consists of:
1. Data Requirement Gathering
Ask yourself why you’re doing this analysis, what type of data you want to
use, and what data you plan to analyze.
2. Data Collection
Guided by your identified requirements, it’s time to collect the data from
your sources. Sources include case studies, surveys, interviews,
questionnaires, direct observation, and focus groups. Make sure to organize
the collected data for analysis.
3. Data Cleaning
Not all of the data you collect will be useful, so it’s time to clean it up. This
process is where you remove white spaces, duplicate records, and basic
errors. Data cleaning is mandatory before sending the information on for
analysis.
4. Data Analysis
Here is where you use data analysis software and other tools to help you
interpret and understand the data and arrive at conclusions. Data analysis
tools include Excel, Python, R, Looker, Rapid Miner, Chartio, Metabase,
Redash, and Microsoft Power BI.
5. Data Interpretation
Now that you have your results, you need to interpret them and come up
with the best courses of action based on your findings.
6. Data Visualization
Data visualization is a fancy way of saying, “graphically show your
information in a way that people can read and understand it.” You can use
charts, graphs, maps, bullet points, or a host of other methods. Visualization
helps you derive valuable insights by helping you compare datasets and
observe relationships.
7. Exploratory Data Analysis (EDA)
EDA focuses on exploring and understanding the data without preconceived
hypotheses. It involves visualizations, summary statistics, and data profiling
techniques to uncover patterns, relationships, and interesting features. It
helps generate hypotheses for further analysis.
8. Diagnostic Analysis
Diagnostic analysis aims to understand the cause-and-effect relationships
within the data. It investigates the factors or variables that contribute to
specific outcomes or behaviors. Techniques such as regression analysis,
ANOVA (Analysis of Variance), or correlation analysis are commonly used
in diagnostic analysis.
9. Predictive Analysis
Predictive analysis involves using historical data to make predictions or
forecasts about future outcomes. It utilizes statistical modeling techniques,
machine learning algorithms, and time series analysis to identify patterns
and build predictive models. It is often used for forecasting sales, predicting
customer behavior, or estimating risk.
10. Prescriptive Analysis
Prescriptive analysis goes beyond predictive analysis by recommending
actions or decisions based on the predictions. It combines historical data,
optimization algorithms, and business rules to provide actionable insights
and optimize outcomes. It helps in decision-making and resource allocation.
Next, we will look in depth at the data analysis methods.
Data Analysis Methods
Some professionals use the terms “data analysis methods” and “data analysis
techniques” interchangeably. To further complicate matters, sometimes people
throw the previously discussed “data analysis types” into the fray as well! Our
hope here is to establish a distinction between what kinds of data analysis exist,
and the various ways it’s used.
Although there are many data analysis methods available, they all fall into one of
two primary types: qualitative analysis and quantitative analysis.
The qualitative data analysis method derives data via words, symbols, pictures, and
observations. This method doesn’t use statistics. The most common qualitative
methods include:
Quantitative data analysis methods, also known as statistical data analysis
methods, collect raw data and process it into numerical data. Quantitative
analysis methods include:
• Sample Size Determination analyzes a small sample taken from a larger group
of people. The results gained are considered representative of the entire
body.
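The idea of treating a small sample as representative of a larger group can be
sketched as follows. This is an illustration of simple random sampling, not a
method for choosing the sample size; the population is synthetic and the sample
size of 200 is an arbitrary assumption.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic population: ages of 10,000 people (not real data).
population = [random.randint(18, 80) for _ in range(10_000)]

# Draw a simple random sample of 200 people without replacement.
sample = random.sample(population, k=200)

print(f"population mean = {mean(population):.1f}")
print(f"sample mean     = {mean(sample):.1f}")
```

The two means are usually close, which is why a well-drawn sample can stand in
for the whole group.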
To analyze data effectively, you can apply various data analysis techniques. Here
are some top techniques to consider:
1. Define Your Objectives
Clearly define the objectives of your data analysis. Understand the questions you
want to answer or the insights you want to gain from the data. This will guide your
analysis process.
2. Data Cleaning
Start by cleaning the data to ensure its quality and reliability. Remove duplicates,
handle missing values, and correct any errors or inconsistencies. Data cleaning is
crucial for accurate analysis.
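A minimal sketch of these cleaning steps in plain Python is shown below. The
records, field names, and the `clean` helper are hypothetical, and replacing a
missing value with `None` is only one of several reasonable ways to handle it.

```python
# Hypothetical raw records with white space, a duplicate, and a missing value.
raw = [
    {"name": " Amina ", "city": "Katsina", "age": "23"},
    {"name": "Amina", "city": "Katsina", "age": "23"},  # duplicate once trimmed
    {"name": "Bello", "city": " Kano", "age": ""},      # missing age
]

def clean(records):
    seen, cleaned = set(), []
    for record in records:
        # Remove leading and trailing white space from every field.
        row = {key: value.strip() for key, value in record.items()}
        # Handle missing values: store None instead of an empty string.
        row["age"] = int(row["age"]) if row["age"] else None
        # Remove duplicates: skip rows whose values were already seen.
        fingerprint = tuple(row.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned

for row in clean(raw):
    print(row)
```

Trimming before checking for duplicates matters here: the first two records
only become identical after the white space is removed.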
3. Descriptive Statistics
4. Data Visualization
Create visual representations of the data using charts, graphs, or plots.
Visualization helps spot patterns, trends, or outliers that may not be immediately
apparent in the raw data. Use appropriate visualizations based on the type of data
and the insights you want to convey.
5. Exploratory Data Analysis (EDA)
Apply EDA techniques to explore the data deeply. Use data profiling, summary
statistics, and visual exploration to identify patterns, relationships, or interesting
features within the data. EDA helps generate hypotheses and guides further
analysis.
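The summary statistics mentioned above can be computed with Python's standard
`statistics` module. The scores below are made up for illustration, and the set
of statistics chosen is an assumption about what a first look at the data might
include.

```python
from statistics import mean, median, stdev, quantiles

# Made-up exam scores, for illustration only.
scores = [52, 61, 61, 70, 74, 78, 83, 95]

summary = {
    "count": len(scores),
    "min": min(scores),
    "max": max(scores),
    "mean": mean(scores),
    "median": median(scores),
    "stdev": stdev(scores),                # sample standard deviation
    "quartiles": quantiles(scores, n=4),   # three cut points: Q1, Q2, Q3
}

for label, value in summary.items():
    print(f"{label:>9}: {value}")
```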
6. Inferential Statistics
Apply inferential statistics to draw conclusions about the larger population based on sample data.
Use techniques like hypothesis testing, confidence intervals, and regression
analysis to test relationships, make predictions, or assess the significance of
findings.
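As a hedged sketch of a confidence interval, the example below estimates a
population mean from a small sample. The data is made up, and the interval uses
a normal approximation for simplicity; for a sample this small, a
t-distribution would normally be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Made-up sample of 10 delivery times in minutes (illustration only).
sample = [31, 28, 35, 30, 29, 33, 32, 27, 34, 31]

n = len(sample)
sample_mean = mean(sample)
standard_error = stdev(sample) / sqrt(n)

# z-value for a 95% two-sided interval (about 1.96).
z = NormalDist().inv_cdf(0.975)

low = sample_mean - z * standard_error
high = sample_mean + z * standard_error
print(f"95% CI for the population mean: {low:.2f} to {high:.2f}")
```

The interval gives a range of plausible values for the population mean rather
than a single point estimate.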
If working with textual data, employ text mining and natural language processing
techniques. Analyze sentiment, extract topics, classify text, or conduct entity
recognition to derive insights from unstructured text data.
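A very small text-mining sketch is a word-frequency count, shown below with the
standard library only. The comments and the stop-word list are made up for
illustration; real sentiment analysis or topic extraction would need a
dedicated natural language processing library.

```python
import re
from collections import Counter

# Made-up customer comments (illustration only).
comments = [
    "The delivery was fast and the service was great",
    "Great product, fast delivery",
    "Service was slow",
]

# A tiny, hand-picked stop-word list; real projects use a fuller one.
stop_words = {"the", "was", "and"}

words = []
for comment in comments:
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", comment.lower())
    words.extend(token for token in tokens if token not in stop_words)

frequencies = Counter(words)
print(frequencies.most_common(3))
```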
Remember, the choice of techniques depends on your specific data, objectives, and
the insights you seek. It's essential to have a systematic and iterative approach,
using multiple techniques to gain a comprehensive understanding of your data.
Data visualization
Data visualization is the representation of data through the use of common graphics,
such as charts, plots, infographics, and even animations. These visual displays of
information communicate complex data relationships and data-driven insights in a
way that is easy to understand.
used within predictive analytics. Line graphs utilize lines to demonstrate
these changes while area charts connect data points with line segments,
stacking variables on top of one another and using color to distinguish
between variables.
4. Histograms: This graph plots a distribution of numbers using a bar chart
(with no spaces between the bars), representing the quantity of data that falls
within a particular range. This visual makes it easy for an end user to
identify outliers within a given dataset.
5. Scatter plots: These visuals are beneficial in revealing the relationship
between two variables, and they are commonly used within regression data
analysis. However, these can sometimes be confused with bubble charts,
which are used to visualize three variables via the x-axis, the y-axis, and the
size of the bubble.
6. Heat maps: These graphical displays are helpful in
visualizing behavioral data by location. This can be a location on a map, or
even a webpage.
7. Tree maps: These display hierarchical data as a set of nested shapes,
typically rectangles. Tree maps are great for comparing the proportions
between categories via their area size.
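To make the histogram idea from the list above concrete, the sketch below
groups numbers into ranges and draws a text-only bar for each range. The ages
and the bin width of 10 are assumptions chosen for illustration; a charting
tool would draw the same counts as touching bars.

```python
from collections import Counter

# Made-up ages (illustration only).
ages = [21, 22, 25, 29, 31, 33, 34, 38, 41, 47, 52, 79]
bin_width = 10

# Assign each value to the lower edge of its range, e.g. 25 -> 20.
bins = Counter((age // bin_width) * bin_width for age in ages)

# Print one bar per range, including empty ranges, so gaps stay visible.
for lower in range(min(bins), max(bins) + bin_width, bin_width):
    count = bins.get(lower, 0)
    print(f"{lower:>2}-{lower + bin_width - 1:<2} | {'#' * count}")
```

The empty 60-69 range followed by a single value in 70-79 is how an outlier
tends to show up in a histogram.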
The importance of data visualization is simple: it helps people see, interact with,
and better understand data. Whether simple or complex, the right visualization can
bring everyone onto the same page, regardless of their level of expertise.
Advantages
• Interactively explore opportunities.
• Visualize patterns and relationships.
Disadvantages
1. Power BI
Power BI, Microsoft's easy-to-use data visualization tool, is available for both on-
premise installation and deployment on the cloud infrastructure. Power BI is one of
the most complete data visualization tools that supports a myriad of backend
databases, including Teradata, Salesforce, PostgreSQL, Oracle, Google Analytics,
GitHub, Adobe Analytics, Azure, SQL Server, and Excel. The enterprise-level tool
creates stunning visualizations and delivers real-time insights for fast decision-
making.
2. Tableau
One of the most widely used data visualization tools, Tableau, offers interactive
visualization solutions to more than 57,000 companies.
3. Dundas BI
4. Jupyter
5. Google Charts
One of the major players in the data visualization market space, Google Charts,
coded with SVG and HTML5, is famed for its capability to produce graphical and
pictorial data visualizations. Google Charts offers zoom functionality, and it
provides users with unmatched cross-platform compatibility with iOS, Android,
and even the earlier versions of the Internet Explorer browser.
6. Visual.ly
Visual.ly is one of the data visualization tools on the market, renowned for its
impressive distribution network that illustrates project outcomes. Employing a
dedicated creative team for data visualization services, Visual.ly streamlines the
process of data import and outsource, even to third parties.
7. FusionCharts
FusionCharts is one of the most popular and widely adopted data visualization
tools. The JavaScript-based, top-of-the-line visualization tool offers ninety different
chart building packages that integrate with major frameworks and platforms,
offering users significant flexibility.
8. Highcharts
Microsoft Excel
Microsoft Excel is one of the most popular applications for data analysis. Equipped
with built-in pivot tables, it is without a doubt one of the most sought-after
analytic tools available. It is all-in-one data management software that allows you
to easily import, explore, clean, analyze, and visualize your data.
PIVOT TABLE
In simple words, a pivot table is a data analysis technique used for summarizing
large datasets and answering questions you may have about the data. It is available
in spreadsheet applications like Microsoft Excel, Google Sheets and Polymer
(interactive pivot tables). It's a very powerful way to organize your data.
Pivot tables are used for summarizing and rearranging large amounts of data into
an easier-to-understand table that allows us to draw important business
conclusions.
Pivot tables use SUM and AVERAGE, among other functions, to answer such
questions quickly.
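What a pivot table does can be sketched in plain Python: group the records by
a row label and a column label, then summarize each group. This is an
illustration of the idea, not how Excel implements it; the sales records and
field names are made up.

```python
from collections import defaultdict
from statistics import mean

# Made-up sales records (illustration only).
sales = [
    {"region": "North", "product": "Rice", "amount": 120},
    {"region": "North", "product": "Beans", "amount": 80},
    {"region": "South", "product": "Rice", "amount": 200},
    {"region": "North", "product": "Rice", "amount": 100},
]

# Group amounts by (row label, column label), as a pivot table does.
groups = defaultdict(list)
for record in sales:
    groups[(record["region"], record["product"])].append(record["amount"])

# Summarize each group with SUM and AVERAGE.
pivot = {
    key: {"sum": sum(amounts), "average": mean(amounts)}
    for key, amounts in groups.items()
}

for (region, product), stats in sorted(pivot.items()):
    print(f"{region:<6} {product:<6} sum={stats['sum']:>4} average={stats['average']}")
```

A question such as "how much rice was sold in the North?" is then answered by
looking up one cell of the summary instead of scanning every record.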
PIVOT CHART
A pivot chart is the visual representation of a pivot table in Excel. Pivot charts and
pivot tables are connected with each other.
Below you can find a two-dimensional pivot table.
The Insert Chart dialog box appears.
3. Click OK.
Power Query
Power Query is an application for transforming and preparing data. With Power
Query you can get data from sources using a graphical interface and apply
transformations using a Power Query Editor. Using Power Query, a business
intelligence tool offered by Microsoft Excel, you can import data from any number
of sources, clean it, transform it, then reshape it according to your needs. In this
way, you can set up a query once and reuse it later by simply refreshing it.
Power Query allows users to extract, transform, and load (ETL) data from various
sources into Excel or Power BI. The four phases of Power Query are:
1. Connect
In this phase, users connect to the data source(s) from which they want to extract
data. Power Query supports many data sources, including databases, files, web
pages, and more. Users can also specify any required authentication or
authorization details during this phase.
2. Transform
Once the data is loaded into Power Query, users can use various data
transformation tools to clean, reshape, and transform the data to meet their specific
needs. Common data transformation tasks include removing duplicates, filtering
data, merging data, splitting columns, and pivoting data.
3. Combine
Power Query also allows users to combine data from multiple sources using
various techniques. Users can merge tables, append, or join data using a common
key. This phase is beneficial for integrating data from different sources into a
single, unified view.
4. Load
Finally, in the Load phase, users specify where to load the transformed data. They
can load the data into an Excel worksheet or a Power BI report or create a
connection to the data source so that the data is automatically refreshed whenever
the source data changes.
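The four phases above can be sketched in Python to show the flow of an
extract, transform, and load pipeline. This is an analogy only: Power Query
itself is driven through its editor, not through Python, and the two in-memory
CSV sources, their column names, and their contents are assumptions made for
illustration.

```python
import csv
import io

# Connect: two in-memory CSV "sources" stand in for files or databases.
orders_csv = "order_id,customer_id,amount\n1,10, 250 \n2,11,90\n2,11,90\n"
customers_csv = "customer_id,name\n10,Amina\n11,Bello\n"
orders = list(csv.DictReader(io.StringIO(orders_csv)))
customers = list(csv.DictReader(io.StringIO(customers_csv)))

# Transform: trim white space, convert types, and drop duplicate rows.
seen, clean_orders = set(), []
for row in orders:
    key = (row["order_id"], row["customer_id"], row["amount"].strip())
    if key not in seen:
        seen.add(key)
        clean_orders.append(
            {"order_id": int(key[0]), "customer_id": key[1], "amount": float(key[2])}
        )

# Combine: merge the two tables on the common key, customer_id.
names = {customer["customer_id"]: customer["name"] for customer in customers}
merged = [{**order, "name": names[order["customer_id"]]} for order in clean_orders]

# Load: write the result to a destination (here, a CSV string).
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["order_id", "customer_id", "amount", "name"])
writer.writeheader()
writer.writerows(merged)
print(output.getvalue())
```

Because every step is written down once, rerunning the script on updated
sources repeats the same preparation, which mirrors refreshing a saved query.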
Power Pivot
Power Pivot is an Excel add-in you can use to perform powerful data analysis and
create sophisticated data models. With Power Pivot, you can mash up large
volumes of data from various sources, perform information analysis rapidly, and
share insights easily.
• Open Excel.
• Select Add-Ins.
• Select the Manage drop down menu, then select COM Add-ins.
• Select Go.
DAX in Excel
DAX stands for Data Analysis Expressions. DAX formulas are very similar to the
general, default functions made available by Excel. DAX functions and formulas
also start with an equals sign, just like the default functions.
options section, and then in the “Add-ins” section, we need to click on “Manage
Add-ins” and then check on Analysis ToolPak to use it in Excel.
• Click the File tab, click Options, and then click the Add-Ins category.
• On the Data tab, in the Analysis group, you can now click on Data
Analysis.
Functions Available in Excel Data Analysis ToolPak
Below is the list of available functions in the Analysis ToolPak Excel add-in:
Power BI
Power BI is a technology-driven business intelligence tool provided by Microsoft
for analyzing and visualizing raw data to present actionable information. It
combines business analytics, data visualization, and best practices that help an
organization to make data-driven decisions.
Components of Power BI
1. Power Query
Power Query is the data transformation and mash-up engine. It enables
you to discover, connect, combine, and refine data sources to meet your
analysis needs. It can be downloaded as an add-in for Excel or can be used as
part of Power BI Desktop.
2. Power Pivot
Power Pivot is a data modeling technique that lets you create data models,
establish relationships, and create calculations. It uses the Data Analysis
Expressions (DAX) language to model simple and complex data.
3. Power View
Power View is a technology that is available in Excel, SharePoint, SQL
Server, and Power BI. It lets you create interactive charts, graphs, maps, and
other visuals that bring your data to life. It can connect to data sources and
filter data for each data visualization element or the entire report.
4. Power Map
Microsoft's Power Map for Excel and Power BI is a 3-D data visualization
tool that lets you map your data and plot more than a million rows of data
visually on Bing maps in 3-D format from an Excel table or Data Model in
Excel. Power Map works with Bing maps to get the best visualization based
on latitude, longitude, or country, state, city, and street address information.
5. Power BI Desktop
Power BI Desktop is a development tool for Power Query, Power Pivot, and
Power View. With Power BI Desktop, you have everything under the same
solution, and it is easier to develop BI and data analysis experiences.
Power BI Dashboards
The Power BI Dashboard is basically a page consisting of visualizations to
represent a story or business insight. It is limited to only one page and can be
viewed and shared on mobile devices as well. The visualizations that are seen on
the dashboard are obtained from reports that are gathered from datasets.
Power BI Reports
Power BI allows you to create reports with multiple perspectives from one dataset,
later stored in the Power BI Report Server. These reports are known as Power BI
reports. You can create and edit the reports in Power BI Desktop and publish
them on the web portal. Post-publication, readers can view the reports in a web
browser or in the Power BI Mobile app on a mobile device. The reports in Power BI
can have one or more visual pages.
Power BI DAX
1. Syntax
The syntax is basically the Power BI formula, which consists of many components.
To write effective DAX syntax, it is suggested that you break the formula into
understandable language. Consider the simple syntax below:
2. Context
Row Context is, simply speaking, the current row, and it applies especially in the
case of calculated columns. Row context also applies to a formula where the
function uses filters to identify a single row in a table.
In simpler terms, Filter Context means that one or more filters are applied in a
calculation. It is a more difficult context to understand than the Row Context.
Usually, the Filter Context is not applied in place of the Row Context, but it is
possible for the Filter Context to be applied in addition to the Row Context.
3. Functions
Functions in Power BI refer to the predefined, ordered, and structured formulae to
carry out calculations using arguments. Some of the common DAX functions are
MIN, MAX, SUM, AVERAGE, MAXX, SUMX, and more.
‘Calculated columns’ extend a table by using DAX formulas to add new columns
while creating data models in Power BI Desktop. The content of each new column
is defined by a DAX expression.
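The difference between a calculated column, a SUM-style aggregation, and a
SUMX-style aggregation can be sketched with a Python analogy. This is not DAX
and does not reproduce the DAX engine; the table, column names, and values are
made up, and the comments only name the DAX concept each step resembles.

```python
# Made-up sales table (illustration only).
table = [
    {"product": "Rice", "quantity": 3, "unit_price": 50},
    {"product": "Beans", "quantity": 2, "unit_price": 40},
    {"product": "Rice", "quantity": 1, "unit_price": 50},
]

# Like a calculated column: an expression evaluated once per row.
for row in table:
    row["line_total"] = row["quantity"] * row["unit_price"]

# Like SUM: add up the values of an existing column.
total_quantity = sum(row["quantity"] for row in table)

# Like SUMX: evaluate an expression for each row, then add up the results.
total_sales = sum(row["quantity"] * row["unit_price"] for row in table)

# Like a filter context: the same total over only the rows that pass a filter.
rice_sales = sum(row["line_total"] for row in table if row["product"] == "Rice")

print(total_quantity, total_sales, rice_sales)
```

The calculated column stores one value per row, while the three totals each
collapse many rows into a single number.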
Working of Power BI
In Power BI, you first connect your data to the tool, then transform the data that
you have loaded, model the data as needed, visualize the data, and share the
generated results.
Connecting Your Data
You can use either Power BI or Power BI Desktop to connect to a variety of data
sources, such as SQL Server, MySQL, Oracle, etc. You can connect your data
to Power BI in two ways: either upload your file to Power BI or import the file
into Power BI.
Connect to the data in your workbook to create Power BI reports and dashboards
for it.
In order to view your file, you will have to fetch it in Power BI; you can interact
with the file just as you would do in the case of Excel Online.
Once your data is loaded, you can transform it as per your needs. You can do this
by using the Transform menu. It has a set of operations, including reverse rows,
count rows, rename, replace values and errors, pivot and unpivot columns, etc.
For data modeling, add the data sources through the new report option in Power
BI. Power BI lets you add functions, calculations, relationships, measures, etc.,
to your data for better visualization and analytics, so that the data can be used
to derive better business insights. This functionality of Power BI is referred to
as Data Modeling. By using Power BI Data Modeling, you can even write a query
against your files so that you can accomplish different tasks in a short span of
time.
Data Visualization
In Power BI, you can create reports, dashboards, etc. based on the modeled data
and depending on your organization’s requirements. Report creation can be done in
many ways; you select a field of your choice from your CSV or data file, and then
choose the tool that you want to apply to your data so as to generate the desired
report. You can use a variety of tools and even add a custom visual gallery.
To share a generated report, select the ‘Share’ option that appears in the top
navigation, complete the form, and share it with your team.
You can share the generated reports from Favorites, Recent, and My Workspace.
Why SQL?
• Allows users to define the data in a database and manipulate that data.
• Allows embedding within other languages using SQL modules, libraries, and
pre-compilers.
SQL Commands
The standard SQL commands to interact with relational databases are CREATE,
SELECT, INSERT, UPDATE, DELETE and DROP. These commands can be
classified into the following groups based on their nature:
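The standard commands named above can be tried with Python's built-in
`sqlite3` module. This is a minimal sketch under stated assumptions: it uses an
in-memory SQLite database, and the `students` table, its columns, and its rows
are hypothetical. SQLite has no user accounts, so access-control commands are
not shown.

```python
import sqlite3

# An in-memory SQLite database; table and column names are hypothetical.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# CREATE: define a new table.
cursor.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, score REAL);")

# INSERT: add rows.
cursor.execute("INSERT INTO students (name, score) VALUES ('Amina', 71.5);")
cursor.execute("INSERT INTO students (name, score) VALUES ('Bello', 64.0);")

# UPDATE: change existing rows.
cursor.execute("UPDATE students SET score = 68.0 WHERE name = 'Bello';")

# DELETE: remove rows.
cursor.execute("DELETE FROM students WHERE name = 'Amina';")

# SELECT: read rows back.
rows = cursor.execute("SELECT name, score FROM students;").fetchall()
print(rows)

# DROP: remove the table itself.
cursor.execute("DROP TABLE students;")
connection.close()
```

CREATE and DROP change the structure of the database, while INSERT, UPDATE,
DELETE, and SELECT work with the rows inside an existing table.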
DDL - Data Definition Language
Command Description
DML - Data Manipulation Language
Command Description
DCL - Data Control Language
Command Description
SQL Syntax
SQL follows a unique set of rules and guidelines called syntax. This section
gives you a quick start with SQL by listing the basic SQL syntax.
All the SQL statements start with any of the keywords like SELECT, INSERT,
UPDATE, DELETE, ALTER, DROP, CREATE, USE, SHOW and all the
statements end with a semicolon (;).
The most important point to be noted here is that SQL is case insensitive, which
means SELECT and select have the same meaning in SQL statements. MySQL,
however, distinguishes letter case in table names. So, if you are working with
MySQL, you need to write table names exactly as they exist in the database.
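The case insensitivity of SQL keywords can be checked with a short sketch that
uses Python's built-in `sqlite3` module. The `Courses` table and its single row
are hypothetical. The sketch only demonstrates keyword case: SQLite also
ignores letter case in table names, so it cannot show the MySQL behaviour
described above.

```python
import sqlite3

# An in-memory SQLite database with one hypothetical table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Courses (title TEXT);")
connection.execute("INSERT INTO Courses VALUES ('Visualization');")

# The keywords differ only in letter case; the table name is written the same way.
upper = connection.execute("SELECT title FROM Courses;").fetchall()
lower = connection.execute("select title from Courses;").fetchall()

print(upper == lower)
connection.close()
```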