
Advanced Business Data Analysis

Higher Diploma in Data Analytics

Theo Mendonca
tmendonca@ncirl.ie

Image source: Pros and cons of statistics


Slides courtesy of Dr. Eugene O'Loughlin
Lecture 02 - Introduction

In this section you will:

• Review the ABDA module aims
• Review the ABDA module Learning Outcomes
• Review the module recommended readings
• Review the BDA module content
Module Aim

• To enable learners to address statistical problems in data analytics on a practical level so that learners are in a position to conduct more advanced analyses independently and critically evaluate research papers.

• Topics covered include an elaboration on inferential statistics learned in the module Business Data Analysis, p-values revisited and confidence intervals, SPSS analyses of inferential tests, non-parametric tests, effect size, power and sample size, meta-analysis, cluster and factor analysis.

(Module Descriptor)
Module Learning Outcomes

On successful completion of this module, learners will be able to:

1. Evaluate and choose between different options for inferential statistics so that a motivated decision between two or more options can be made
2. Critically evaluate statistical applications in a particular discipline using advanced topics (power analysis, sample size calculation, cluster and factor analysis)
3. Conduct advanced statistical analyses using a statistical package
4. Interpret the results output of a statistical package
5. Work out and apply a strategy for a statistical analysis when presented with a real-world problem from business
Indicative Content
• Conducting Statistical Analyses (40%)
• Effect Size, Sample Size, Power (30%)
• Factor Analysis and Principal Component Analysis (30%)

…we fill some of the gaps from the BDA course and take a very practical view of analysis.
Recommended Texts
• Cortinhas, C. and Black, K. (2012), Statistics for Business and Economics, 1st European Edition, John Wiley & Sons [ISBN: 1119993660]
• McClave, J. and Sincich, T. (2011), Statistics, 12th Edition
• Salkind, N. (2014), Statistics for People Who (Think They) Hate Statistics
  Student Study Site: https://secure.uk.sagepub.com/salkind5e/study/default.htm
BDA: What you should know
Descriptive statistics: Central Tendency, Dispersion and Shape

[Figure: histogram of House Fly Wing Lengths (Sokal & Hunter, 1955); x-axis: Wing Length (mm), 35 to 55; y-axis: Frequency, 0 to 20]


BDA: What you should know
Common Symbols and Abbreviations:
BDA: What you should know
• Normal distribution:
  • Mean = Median = Mode
  • Kurtosis = 3; Excess Kurtosis = 0
  • Skewness = 0

• Standard Normal distribution: mean = 0, standard deviation = 1

BDA: What you should know

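Single Sample

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

$$t_{n-1} = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

Here σ is the known population standard deviation; when it is unknown, the sample standard deviation s is used instead and the statistic follows a t distribution with n − 1 degrees of freedom.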
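
As a rough illustration of the single-sample statistics, the sketch below computes Z (population standard deviation assumed known) and t (standard deviation estimated from the sample, n − 1 degrees of freedom). The data values, the hypothesised mean of 50 and the σ of 3 are invented for the example, and the function names are hypothetical.

```python
import math
import statistics

def z_statistic(sample, mu, sigma):
    """Single-sample Z: (sample mean - mu) / (sigma / sqrt(n)), sigma known."""
    n = len(sample)
    return (statistics.mean(sample) - mu) / (sigma / math.sqrt(n))

def t_statistic(sample, mu):
    """Single-sample t using the sample SD; returns (t, degrees of freedom)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t = (statistics.mean(sample) - mu) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements, invented for illustration only
sample = [52, 48, 55, 51, 49, 53, 50, 54]

z = z_statistic(sample, mu=50, sigma=3)
t, df = t_statistic(sample, mu=50)
print(f"Z = {z:.3f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic would then be compared against the standard normal or t distribution, which a statistical package does automatically.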
BDA: What you should know
t-test: Two Samples
• Unpaired
  • Equal variance
  • Unequal variance
• Paired
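
The three two-sample variants can be sketched as follows. This is a minimal illustration using the standard textbook formulas (pooled variance for equal variances, Welch's statistic for unequal variances, and a single-sample t on the differences for paired data), not a replacement for a statistical package. The function names and the data are hypothetical.

```python
import math
import statistics

def pooled_t(a, b):
    """Unpaired t, equal variances assumed; returns (t, df)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Unpaired t, unequal variances (Welch); df from Welch-Satterthwaite."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def paired_t(a, b):
    """Paired t: single-sample t on the pairwise differences, tested against 0."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n)), n - 1

# Hypothetical scores, invented for illustration only
after = [12, 14, 10, 13, 16]
before = [10, 12, 9, 11, 13]

t_pooled, df_pooled = pooled_t(after, before)
t_welch, df_welch = welch_t(after, before)
t_paired, df_paired = paired_t(after, before)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.1f}")
print(f"paired: t = {t_paired:.3f}, df = {df_paired}")
```

With the same numbers, the paired statistic is much larger than the unpaired ones because pairing removes the variation between subjects, which is why the choice of test has to follow the study design.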
BDA: What you should know
One way ANOVA

Three or More Samples (example groups: Brand A, Brand B, Brand C)
BDA: What you should know
Chi-squared Goodness of Fit

Observed vs Expected

Grade              1          2          3          4          5          6          Total
# students (O)     309        432        346        432        369        329        2217
Expected (E)       369.5      369.5      369.5      369.5      369.5      369.5
O - E              -60.5      62.5       -23.5      62.5       -0.5       -40.5
(O - E)^2          3660.25    3906.25    552.25     3906.25    0.25       1640.25
(O - E)^2 / E      9.905954   10.57172   1.494587   10.57172   0.000677   4.439107   36.98376

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
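
The goodness-of-fit calculation in the table can be reproduced in a few lines. This sketch uses the observed counts from the table and assumes, as the table does, that the expected count is the same for every grade; the variable names are chosen for the example.

```python
# Observed number of students per grade, taken from the table
observed = [309, 432, 346, 432, 369, 329]

total = sum(observed)
expected = total / len(observed)  # equal expected count in each grade

# Per-grade contribution (O - E)^2 / E, then the chi-squared statistic
contributions = [(o - expected) ** 2 / expected for o in observed]
chi_squared = sum(contributions)
df = len(observed) - 1  # categories minus one

print(f"Expected per grade: {expected}")
print(f"Chi-squared = {chi_squared:.5f} with {df} degrees of freedom")
```

The statistic is far above 11.07, the 5% critical value for 5 degrees of freedom, so the counts are unlikely to come from a uniform distribution across grades.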
BDA: What you should know
Time Series Analysis

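Predict

Moving average (n = 3):
$$F_t = \frac{A_{t-1} + A_{t-2} + A_{t-3}}{n}$$

Weighted moving average:
$$F_t = \frac{3A_{t-1} + 2A_{t-2} + 1A_{t-3}}{\text{Sum of Weights}}$$

Exponential smoothing:
$$F_t = \alpha A_{t-1} + (1 - \alpha) F_{t-1}$$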
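
The three forecasting formulas (simple moving average, weighted moving average and exponential smoothing) can be sketched as below, where A denotes actual values and F forecasts. The function names, the demand figures, the previous forecast of 110 and the α of 0.5 are all assumptions made up for the example.

```python
def moving_average(actuals, n=3):
    """Forecast = mean of the last n actual values."""
    return sum(actuals[-n:]) / n

def weighted_moving_average(actuals, weights=(3, 2, 1)):
    """Forecast = weighted mean; weights[0] applies to the most recent actual."""
    recent_first = actuals[::-1][:len(weights)]
    return sum(w * a for w, a in zip(weights, recent_first)) / sum(weights)

def exponential_smoothing(previous_actual, previous_forecast, alpha):
    """Forecast = alpha * A(t-1) + (1 - alpha) * F(t-1)."""
    return alpha * previous_actual + (1 - alpha) * previous_forecast

# Hypothetical actuals, oldest first: A(t-3), A(t-2), A(t-1)
actuals = [100, 110, 120]

f_ma = moving_average(actuals)
f_wma = weighted_moving_average(actuals)
f_es = exponential_smoothing(previous_actual=120, previous_forecast=110, alpha=0.5)
print(f"Moving average forecast:        {f_ma:.2f}")
print(f"Weighted moving average:        {f_wma:.2f}")
print(f"Exponential smoothing forecast: {f_es:.2f}")
```

The weighted and smoothed forecasts sit closer to the latest value than the simple average does, because both give more weight to recent observations.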
BDA: What you should know
Correlation

Correlation is not Causation!
BDA: What you should know
Simple Linear Regression
Predict Y

[Figure: plot of Y against X with the fitted line; x-axis 0.50 to 5.50, y-axis 0.00 to 4.00]

$$y = a + bx$$

$$a = \bar{y} - b\bar{x}, \qquad b = r \frac{s_y}{s_x}$$
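
The slope and intercept formulas can be checked numerically. The sketch below computes Pearson's r, then the slope b as r times the ratio of the sample standard deviations, then the intercept a from the two means. The data points and function names are invented for illustration.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares line y = a + b*x, using b = r * s_y / s_x."""
    r = pearson_r(x, y)
    b = r * statistics.stdev(y) / statistics.stdev(x)
    a = statistics.mean(y) - b * statistics.mean(x)
    return a, b, r

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.1, 3.9, 5.1]

a, b, r = fit_line(x, y)
print(f"r = {r:.4f}, a = {a:.2f}, b = {b:.2f}")
print(f"Predicted y at x = 6: {a + b * 6:.2f}")
```

A high r here only says the points lie close to a line; it does not by itself show that changes in x cause changes in y.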
ABDA – next steps
• What are the preconditions for the tests we have looked at?
• Using statistical packages to check some of these preconditions.
• Looking at more than one dimension / factor with the ANOVA and regression.
  • Two-way ANOVA…
  • Multiple linear regression
• Dimension reduction and classification.
  • Factor analysis…
• Power and effect.
• Some non-parametric alternatives to the tests we have looked at in BDA.
