

◦ Statistical Analysis
FLEX Course Material

BUSINESS ANALYTICS

College of Business and Accountancy


#3

Data Visualization and Measures of Central Tendency, Variation, Skewness, and Kurtosis
What is Data Visualization?

What are the steps in the problem-solving process?


Data visualization
is the graphical representation of information and data.

By using visual elements like pie charts, bar graphs, and line graphs, data visualization
tools provide an accessible way to see and understand trends, outliers, and
patterns in data. Additionally, it provides an excellent way for employees or
business owners to present data to non-technical audiences without confusion.
Common Data Visualization used in Business Analytics
1. Bar graph
is a pictorial representation of statistics that is used to compare data. It shows quantities or
numbers in the form of bars, which can be either horizontal or vertical.
2. Pie chart
is a pictorial representation of statistics that is used to show each category's percentage of a whole.
It displays data in an easy-to-read 'pie-slice' format, with the size of each slice telling you
what percentage of the total one data element makes up.
3. Line graph
is a pictorial representation of statistics that is used to show trends in data.
Trend analysis involves collecting information from multiple periods and
plotting the collected information along a horizontal time axis to find actionable patterns
in the given information.
Sample problems: Data Visualization in Business Analytics
Example 1
Construct the bar graph and the pie chart of the given data using Excel.
[Figures: bar graph and pie chart titled "PRODUCTS SOLD BY XYZ COMPANY"]
Example 2
Construct the line graph of the given data.

Sales of XYZ Company

Year    Period    Sales (1,000)
2011    1         3
2012    2         4.5
2013    3         4.8
2014    4         3.7
2015    5         4.6
2016    6         5
2017    7         4
2018    8         5
2019    9         6
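
The course builds this chart in Excel. As a rough stand-in for readers without a spreadsheet at hand, the sketch below prints the same sales figures as a plain-text chart so the year-to-year trend is visible. The function name `text_chart` and the `scale` parameter are assumptions made for this illustration only.

```python
# Sales of XYZ Company (in thousands), copied from the table above.
years = [2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
sales = [3, 4.5, 4.8, 3.7, 4.6, 5, 4, 5, 6]


def text_chart(labels, values, scale=5):
    """Return one text line per label, with a bar of '#' marks
    whose length is proportional to the value (hypothetical helper)."""
    lines = []
    for label, value in zip(labels, values):
        bar = "#" * round(value * scale)
        lines.append(f"{label} | {bar} {value}")
    return lines


chart = text_chart(years, sales)
print("\n".join(chart))
```

Longer bars mean higher sales, so the dip in 2014 and 2017 and the peak in 2019 stand out in the same way they would on a line graph.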
What is ungrouped data?
Ungrouped data
is the type of distribution in which the data is individually given in a raw form.
Example:
What is a Model?
Model:
An abstraction or representation of a real system, idea, or object
Captures the most important features
Can be a written or verbal description, a visual display, a mathematical formula, or
a spreadsheet representation
Statistical Modeling for Business Analytics

◦ Statistics
◦ A collection of mathematical techniques to characterize and interpret data
◦ Descriptive Statistics
◦ Describing the data (as it is)
◦ Inferential statistics
◦ Drawing inferences about the population based on sample data
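Descriptive Statistics
Measures of Central Tendency
◦ Arithmetic mean
◦ The sum of all observations divided by the number of observations: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
◦ Median
◦ The number in the middle of the sorted observations
◦ Mode
◦ The most frequent observation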
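
As a small illustration, the three measures can be computed for the XYZ Company sales figures from Example 2 with Python's standard `statistics` module. This is only a sketch; the variable names are chosen here for readability.

```python
import statistics

# Sales of XYZ Company (in thousands), 2011 to 2019, from Example 2.
sales = [3, 4.5, 4.8, 3.7, 4.6, 5, 4, 5, 6]

mean_sales = statistics.mean(sales)      # sum of the values divided by their count
median_sales = statistics.median(sales)  # middle value after sorting
mode_sales = statistics.mode(sales)      # most frequent value

print("mean:", round(mean_sales, 3))
print("median:", median_sales)
print("mode:", mode_sales)
```

Here the mean (about 4.51) sits slightly below the median (4.6), and the mode is 5 because it is the only value that appears twice.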
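Descriptive Statistics
Measures of Dispersion
◦ Dispersion
◦ Degree of variation in a given variable
◦ Range
◦ Max - Min
◦ Variance
◦ Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
◦ Standard Deviation
◦ Square root of the variance: $s = \sqrt{s^2}$
◦ Mean Absolute Deviation (MAD)
◦ Average absolute deviation from the mean: $\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}|x_i - \bar{x}|$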
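
The dispersion measures listed above can be sketched in Python for the Example 2 sales figures. One assumption to note: `statistics.variance` and `statistics.stdev` compute the sample versions (dividing by n - 1); if the population versions are wanted, `statistics.pvariance` and `statistics.pstdev` would be used instead.

```python
import statistics

# Sales of XYZ Company (in thousands), 2011 to 2019, from Example 2.
sales = [3, 4.5, 4.8, 3.7, 4.6, 5, 4, 5, 6]

data_range = max(sales) - min(sales)   # Max - Min
variance = statistics.variance(sales)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(sales)      # square root of the sample variance

# Mean absolute deviation: average distance of each value from the mean.
mean_sales = statistics.mean(sales)
mad = sum(abs(x - mean_sales) for x in sales) / len(sales)

print("range:", data_range)
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 4))
print("MAD:", round(mad, 4))
```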
Descriptive Statistics
Measures of Dispersion

◦ Box-and-Whiskers Plot
◦ a.k.a. box-plot
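
A box plot is drawn from five numbers: the minimum, the first quartile, the median, the third quartile, and the maximum. The sketch below computes them for the Example 2 sales figures. It assumes the default ("exclusive") quartile method of `statistics.quantiles`; other tools may use a different quartile convention and give slightly different values.

```python
import statistics

# Sales of XYZ Company (in thousands), 2011 to 2019, from Example 2.
sales = [3, 4.5, 4.8, 3.7, 4.6, 5, 4, 5, 6]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(sales, n=4)

five_number_summary = (min(sales), q1, q2, q3, max(sales))
print("min, Q1, median, Q3, max:", five_number_summary)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.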
◦ Descriptive Statistics
Shape of a Distribution
◦ Histogram – frequency chart
◦ Skewness
◦ Measure of asymmetry

◦ Kurtosis
◦ Peak/tall/skinny nature of the distribution
Relationship Between Dispersion and Shape Properties
Technology Insights 2.1 – Descriptive Statistics in Excel
Creating a box plot in Microsoft Excel
◦ Regression Modeling for Inferential Statistics
◦ Regression
◦ A part of inferential statistics
◦ The most widely known and used analytics technique in statistics
◦ Used to characterize the relationship between explanatory (input) and response
(output) variables
◦ It can be used for
◦ Hypothesis testing (explanation)
◦ Forecasting (prediction)
◦ Regression Modeling
◦ Correlation versus Regression
◦ What is the difference (or relationship)?
◦ Simple Regression versus Multiple Regression
◦ Based on the number of input variables
◦ How do we develop linear regression models?
◦ Scatter plots (visualization—for simple regression)
◦ Find a straight line that passes as closely as possible through the plotted dots.
◦ Ordinary least squares method
◦ A line that minimizes the sum of squared vertical distances between the dots and the line
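Regression Modeling
◦ x: input, y: output
◦ Simple Linear Regression
◦ $y = \beta_0 + \beta_1 x$
◦ Multiple Linear Regression
◦ $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$
◦ The meaning of Beta ($\beta$) coefficients
◦ $\beta_0$ is the intercept (the value of y when every x is 0); each remaining $\beta$ is the change in y for a one-unit change in its x, holding the other inputs fixed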


Process of Developing a Regression Model
How do we know if the model is good enough?
◦ R² (R-squared)
◦ p Values
◦ Error measures (for prediction problems)
◦ RMSE
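
Two of these checks, R-squared and RMSE, can be sketched for a line fitted to the Example 2 data (period as input, sales as output). The helper names below are assumptions for this illustration, and the RMSE here is measured on the same data used for fitting; for a real prediction problem it would normally be measured on data held out from fitting.

```python
import math

# Period (input x) and sales in thousands (output y), from Example 2.
periods = [1, 2, 3, 4, 5, 6, 7, 8, 9]
sales = [3, 4.5, 4.8, 3.7, 4.6, 5, 4, 5, 6]


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def r_squared_and_rmse(x, y, intercept, slope):
    """Return (R-squared, RMSE) of the line on the given data."""
    predictions = [intercept + slope * xi for xi in x]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))  # unexplained
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)                    # total
    r2 = 1 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(y))
    return r2, rmse


b0, b1 = fit_line(periods, sales)
r2, rmse = r_squared_and_rmse(periods, sales, b0, b1)
print("R-squared:", round(r2, 4))
print("RMSE:", round(rmse, 4))
```

An R-squared near 0.48 means the straight line explains roughly half of the variation in sales; the RMSE is in the same units as sales (thousands).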
Regression Modeling Assumptions

https://www.youtube.com/watch?v=zPG4NjIk
Time Series Forecasting

◦ Is it different from Simple Linear Regression? How?


Google Search Trends
◦ Business Reporting
Definitions and Concepts
◦ Report = Information → Decision
◦ Report?
Any communication artifact prepared to convey specific information
What is a Business Report?
◦ Business report: a written document that contains information regarding business matters.
◦ Purpose: to improve managerial decisions
◦ Source: data from inside and outside the organization (via the use of
ETL: Extract, Transform, Load)
◦ Format: text + tables + graphs/charts
◦ Distribution: in-print, email, portal/intranet
Business Reporting
[Figure: business functions generate data (transactional records and exception events, such as a machine failure), which are stored in data repositories and reach the decision maker as information (reporting), leading to action (decision).]
Types of Business Reports

Metric Management Reports
• Help manage business performance through metrics (SLAs for externals; KPIs for internals)

Dashboard-Type Reports
• Graphical presentation of several performance indicators in a single page using dials

Balanced Scorecard–Type Reports
• Present an integrated view of success in an organization. Include financial, customer, business process, and learning & growth indicators.
What are measures of dispersion?
Measures of dispersion
are summary statistics that represent the amount of spread in a set of numerical data.
Measures of skewness
Skewness
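is a measure of the asymmetry of a distribution. A symmetrical dataset (for example, normally distributed data) will
have a skewness equal to 0. Skewness essentially measures the relative size of the two tails: it is positive when the right tail is longer and negative when the left tail is longer.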
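
As an illustration, skewness can be computed as the third central moment divided by the second central moment raised to the power 1.5. This is one common definition, assumed here; spreadsheet functions such as Excel's SKEW apply a small-sample adjustment, so their values can differ somewhat. The three datasets below are made up for the sketch.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no small-sample adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical datasets chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]           # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]     # mirror image, long left tail

print("symmetric:", skewness(symmetric))
print("right-tailed:", round(skewness(right_tailed), 3))
print("left-tailed:", round(skewness(left_tailed), 3))
```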
Symmetrical Dataset with Skewness = 0
Dataset with Positive Skewness
Dataset with Negative Skewness
Kurtosis
Kurtosis
is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. The
value is often compared to the kurtosis of the normal distribution, which is equal to 3. If the kurtosis
is greater than 3, then the dataset has heavier tails than a normal distribution (more in the tails). If
the kurtosis is less than 3, then the dataset has lighter tails than a normal distribution (less in the
tails).
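
A minimal sketch of this comparison: kurtosis computed as the fourth central moment divided by the squared second central moment, then compared with 3. This moment-based definition is an assumption of the sketch; some tools (Excel's KURT, for example) report excess kurtosis, which is measured relative to 0 rather than 3. Both datasets are made up for illustration.

```python
def kurtosis(values):
    """Moment-based kurtosis: m4 / m2 ** 2 (a normal distribution gives 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2


# Hypothetical datasets: no extreme values versus two extreme values.
light_tails = [-1, 1] * 5
heavy_tails = [0] * 8 + [-10, 10]

for name, data in [("light tails", light_tails), ("heavy tails", heavy_tails)]:
    k = kurtosis(data)
    verdict = "heavier" if k > 3 else "lighter"
    print(f"{name}: kurtosis = {k}, {verdict} tails than a normal distribution")
```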
Dataset with Negative Excess Kurtosis (kurtosis less than 3)
Dataset with Positive Excess Kurtosis (kurtosis greater than 3)
KEEP SAFE EVERYONE
END
