
MMGT6012

Business Tools for Management

TOPIC 1: Descriptive Statistics


Dr. Matthew Beck
ITLS, Business School

The University of Sydney Page 1


1. Descriptive Statistics

Agenda

1) Orientation
2) Course Overview & Challenges
3) Introduction to Data Analysis / Fundamental Principles

It is envisioned that we work together as ‘partners’:

– You have made a choice to enhance your knowledge in this area.
– Let’s come with an open attitude right from the start to learn from each
other and strive to achieve the common goal.


ORIENTATION

All communication will be done via Canvas

About Me:
– Matthew Beck
– Rm 444; Merewether Building (H04)
– matthew.beck@sydney.edu.au
– 02 9114 1834


Course Overview

Learning Objectives

– Apply statistical methods to solve quantitative business problems
– Apply the logical processes of quantitative analysis to deconstruct complex
business problems
– Critically evaluate the application of statistical methods in solving business
problems
– Communicate the results of statistical analysis clearly, concisely and with
impact
– Be comfortable using a range of software, especially Excel


How to Succeed in MMGT6012


Why Study Quantitative Methods?

• Where to get materials?
• How costly?
• Performance Old/New?
• What features?
• Relative costs?
• What price?
• What is demand?
• Worthwhile addition?
• Machine performance?
• What colour package?
• Monitor production levels
• What method best to ship?
• What markets to ship to?


Why Study Quantitative Methods

“To increase the business value of information, we need data from various angles. In
addition to sales data, we need to know why sales increased. We need to know how
and where we are influential.”
CEO, German Pharmaceutical Company

“With thousands of customers, products, and contractual terms and conditions, pricing
and incentive models become very complex. Analytics is a key way to get this
complexity under control. But we are not yet good enough in this regard.”
CEO, Japanese Electronics Manufacturer

“High-performance businesses have a much more developed analytical orientation than
other organizations. They are five times more likely to view analytical capabilities as
core to the business.”
Why Predictive Analytics Is A Game-Changer, Forbes Magazine, 01/04/10.


Why Study Quantitative Methods

Information is Valuable:

– purchased for $1 billion.

– purchased for $956 million.

– purchased for $689 million.


Why Study Quantitative Methods


It's Not All Math

“On one NBA team, an analyst quickly gained the reputation as being the smartest
guy in the room but had virtually no impact on the decision-making process...while the
work he was producing was innovative, it was wasted because he did not have the
ability to communicate it in a manner that was understandable to the decision-makers.”

“It's not just about the data. It's what you do with the data.”
Mike Rhodin, Senior Vice President, IBM


It's Not All Math

You may never do the analysis yourself

But somewhere at some point you will:


– Need to read it.
– Need to understand it.
– Need to use it.


Course Overview

Statistical Data Analysis:


– Basic description
– Hypothesis testing (differences and relationships)
– Predictive analysis (regression modelling)

Optimisation Modelling:
– Linear Programming (optimal allocation of limited resources)

Simulation Modelling:
– Incorporating uncertainty into the modelling process (probabilities)


Course Overview

Cannot study every single method to solve all problems:


– A broad range of techniques to solve a broad range of problems
– Develop useful skills in software and problem definition / solving

Focus on the application:


– Making sense of data
– Approaching complex problems with a plan of attack
– Trying to use quantitative information to do something meaningful


What is Data?

Data is derived from study, experience, or instruction

Data is something that we can process that allows us to make sense of a situation

We take data and by analysing it we transform it into information: a set of facts with meaning


The Role of Data

Reduce the risk of decision making:


– Information on course of action
– Ideas of likely strategy direction
– Understand how something is performing
– Understand what people want
– Understand where and how we can have influence

– https://www.kaggle.com/c/flight


Class Activity


Two Branches of Statistics

Descriptive Statistics:
– Procedures that describe the data we are studying
– Results help us organize and understand the data
– The results cannot be generalized to any larger group

Descriptive Statistics include:


– Measures of central tendency (mean, median, and mode)
– Measures of spread (range, variance, standard deviation)
– Frequency distributions
– Graphs (pie charts, bar charts, etc.)


Two Branches of Statistics

Useful if you do not need to extend your results to a larger group:


– Most reports in social sciences seek to produce “universal truths” about the
behaviour of people


Two Branches of Statistics

Inferential Statistics:
– Trying to reach conclusions that extend past the immediate data
– How might the population behave based on our collected data
– The probability that our result is systematic not random chance


Two Branches of Statistics

Hypothesis Testing:
– Also called tests of significance
– Chi-square tests, t-tests, F-tests and regression models are examples
– Typically tests for differences or relationships between variables

Predictive Analysis:
– Regression modelling (and others)
– Determine patterns and predict future outcomes and trends


Types of Data

Also referred to as levels of measurement

CATEGORICAL
– Nominal (order has no meaning)
– Ordinal (can be put in order)

CONTINUOUS
– Interval (can measure between)
– Ratio (has a fixed zero)


Class Activity

1. Form small teams with the people sitting next to you

2. Come up with an example for each type of data

3. Give an example of how you describe the characteristics of that data (if you had a large number of data points)

4. Write this on a piece of paper and bring it up to the front


– Ratio Data: As per Interval data
– Interval Data: As per Ordinal data plus means, std dev.
– Ordinal Data: As per Nominal data plus medians and percentiles.
– Nominal Data: Frequencies, Mode



Nominal Data

From a production line, I pick 10 widgets randomly.

They may be coded as either being defective or not defective:
– How will I code this?
– What information is useful?
– What analysis is appropriate to perform?


Nominal Data – Frequency Distributions

What is a Frequency Distribution?

– A frequency distribution is a list or a table containing:
    – the categories or ranges within which the data fall
    – the corresponding frequencies within each category

A graph of a frequency distribution is called a frequency histogram:
– Allows for a quick visual interpretation of the data


Nominal Data – Frequency Distributions

Category         Frequency    Percentage
Defective        2            20%
Not Defective    8            80%
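
The table above can be reproduced with a few lines of Python. This is a minimal sketch: the 1/0 coding and the order of the observations are assumptions made for illustration; only the totals (2 defective, 8 not defective) come from the slide.

```python
from collections import Counter

# Hypothetical coding of the 10 sampled widgets: 1 = defective, 0 = not defective.
# The order of the observations is invented; only the totals match the slide.
widgets = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
labels = {1: "Defective", 0: "Not Defective"}

# Count how often each category occurs.
counts = Counter(labels[w] for w in widgets)
n = len(widgets)

# Frequency distribution: category -> (frequency, percentage of the sample).
frequency_table = {cat: (count, 100 * count / n) for cat, count in counts.items()}

for cat, (count, pct) in frequency_table.items():
    print(f"{cat:<15}{count:>3}{pct:>6.0f}%")
```

Coding the categories as numbers is only a convenience for storage; with nominal data the numbers themselves carry no order or magnitude.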


Mode:
– A measure of central tendency
– Value that occurs most often
– Not affected by extreme values
– Used for either numerical or categorical data
– There may be no mode
– There may be several modes

[Figure: two dot plots. Left, on an axis from 0 to 14: Mode = 9. Right, on an axis from 0 to 6: No Mode.]
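
The cases listed above (one mode, no mode, several modes) can be sketched in Python. The function name `modes` and the example data are hypothetical, and treating "every value occurs only once" as "no mode" is one common convention rather than the only one.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs only once, so no value stands out as the mode.
        return []
    # Keep every value that reaches the highest count (there may be several).
    return [value for value, count in counts.items() if count == top]

print(modes([2, 9, 9, 9, 5, 11, 13]))                    # one mode
print(modes([0, 1, 2, 3, 4, 5, 6]))                      # no mode
print(modes(["red", "blue", "red", "blue", "green"]))    # several modes
```

Because the function only counts occurrences, it works for categorical labels as well as numbers.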

Ordinal Data

Five people rank, in order of preference, 5 companies:
– (1 = Best to 5 = Worst)

– How will I code this?
– What information is useful?
– What analysis is appropriate to perform?


Ordinal Data – Frequency Distributions


Ordinal Data – Mode


Ordinal Data – Median

In an ordered array:
– The median is the “middle” number
– 50% of observations are above this point
– 50% of observations are below this point
– (n+1)/2 gives the position of the median

[Figure: two dot plots on an axis from 13 to 21. Left: Median = 19. Right: Median = 18.]
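
The position rule (n+1)/2 can be sketched as follows. The function name and example ranks are hypothetical. When n is even the position falls halfway between two observations; averaging them is the usual convention for numeric data and is assumed here.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position in the ordered array."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2                  # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]   # odd n: a single middle value
    # Even n: the position lies between two neighbours, so average them.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

print(median_by_position([3, 1, 2, 5, 2]))   # n = 5, position 3
print(median_by_position([1, 4, 2, 3]))      # n = 4, position 2.5
```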


Interval Data

5 people rate the delivery performance for 5 companies:
– (1 = very bad to 10 = very good)

– How will I code this?
– What information is useful?
– What analysis is appropriate to perform?


Interval Data – Arithmetic Mean

The mean/average is the most common measure of central tendency
– For a sample of size n:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$$

– $X_1, \ldots, X_n$ = observed values
– n = sample size
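
As a small worked example of the formula, the sketch below computes the mean of five ratings. The ratings are hypothetical values chosen for illustration, not data from the course.

```python
# Hypothetical delivery ratings from 5 people (scale 1 to 10).
ratings = [7, 8, 6, 9, 5]

n = len(ratings)           # sample size
total = sum(ratings)       # sum of the observed values
mean = total / n           # arithmetic mean

print(f"n = {n}, sum = {total}, mean = {mean}")
```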


Interval Data – Mean


Interval Data – Variance

Average (approximately) of the squared deviations of values from the mean:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

– $\bar{X}$ = arithmetic mean
– n = sample size
– $X_i$ = ith value of the variable X
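
The sample variance can be computed step by step as in the sketch below. The ratings are hypothetical; the result is cross-checked against `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import math
import statistics

# Hypothetical ratings, for illustration only.
ratings = [7, 8, 6, 9, 5]

n = len(ratings)
mean = sum(ratings) / n

# Squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in ratings]

# Dividing by n - 1 (not n) gives the sample variance.
sample_variance = sum(squared_deviations) / (n - 1)

print(squared_deviations)
print(sample_variance)
print(math.isclose(sample_variance, statistics.variance(ratings)))
```

Dividing by n - 1 rather than n is why the slide calls this "approximately" the average squared deviation.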


Interval Data – Standard Deviation

Square root of the variance
– The most commonly used measure of variation

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

– Has the same unit of measurement as the original data
– Provides a "standard" way of knowing what is “average” and what is
“extra large” or “extra small”
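
A minimal sketch of the standard deviation, using hypothetical ratings. Expressing each value as a number of standard deviations from the mean is one way to read "average" versus "extra large" or "extra small"; that reading is an illustration, not a rule stated in the slides.

```python
import math
import statistics

# Hypothetical ratings, for illustration only.
ratings = [7, 8, 6, 9, 5]

n = len(ratings)
mean = sum(ratings) / n
sample_variance = sum((x - mean) ** 2 for x in ratings) / (n - 1)
sample_std_dev = math.sqrt(sample_variance)   # same unit as the ratings

# Distance of each rating from the mean, measured in standard deviations.
distances = [(x - mean) / sample_std_dev for x in ratings]

print(round(sample_std_dev, 3))
for rating, distance in zip(ratings, distances):
    print(f"rating {rating}: {distance:+.2f} standard deviations from the mean")
```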


Interval Data – Picturing the Standard Deviation


[Figure: three dot plots, each on an axis from 11 to 21]
– Data A: Mean = 15.5, S = 3.338
– Data B: Mean = 15.5, S = 0.926
– Data C: Mean = 15.5, S = 4.570

Interval Data – Variance and Std Dev


Ratio Data

Burger Squares is a company that makes chip packets which come in 50g packs.
– Given this weight, government regulations mean
that to protect consumers, packs must weigh
between 45 and 55 grams.
– Too much variability in packet weight, i.e. weights outside
the prescribed range, results in heavy fines.


Ratio Data

20 packs are randomly chosen from an assembly line and weighed.
– How will I code this?
– What information is useful?
– What analysis is appropriate to perform?
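
One possible analysis is sketched below. The 20 weights are invented for illustration and are not real measurements; only the 45 to 55 gram limits come from the slides.

```python
import statistics

LOWER_LIMIT, UPPER_LIMIT = 45.0, 55.0   # grams, as prescribed in the example

# Hypothetical weights (grams) for the 20 sampled packs.
weights = [
    50.1, 49.8, 51.2, 48.9, 50.5, 52.3, 47.6, 50.0, 49.4, 53.1,
    50.7, 46.2, 55.4, 49.9, 50.3, 44.8, 51.0, 48.5, 50.6, 49.2,
]

# Because weight is ratio data, every descriptive measure is available.
summary = {
    "n": len(weights),
    "mean": statistics.mean(weights),
    "median": statistics.median(weights),
    "std_dev": statistics.stdev(weights),
    "min": min(weights),
    "max": max(weights),
}

# Packs that fall outside the prescribed range.
out_of_range = [w for w in weights if not LOWER_LIMIT <= w <= UPPER_LIMIT]

for name, value in summary.items():
    print(f"{name:<8}{value:.2f}")
print("out of range:", out_of_range)
```

The minimum and maximum already reveal whether any pack breaches the limits; the mean and standard deviation describe how close the process runs to the 50g target.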


Ratio Data – Descriptive Statistics

Note that by looking at all the data combined, we get a holistic understanding of what is occurring within the data.

How To Best Describe Data

We tend to describe categorical variables with frequencies and modes:
– Why do you not want to report a mean or a standard deviation?

We tend to describe continuous variables with means and standard deviations:
– Why do you not want to report frequencies or use a histogram?


Thinking About Measurement Accuracy (or Error)

[Figure: three panels labelled “Accurate and Precise”, “Accurate but Imprecise” and “Inaccurate and Precise” (Systematic Error)]


Measures of Central Tendency

Mode:
– Can really only be used when dealing with nominal data

Mean:
– When data is symmetric and continuous

Median:
– When the data is skewed (has extreme or outlying observations)
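
The effect of skew on the mean and median can be seen in a short sketch. The salary figures are hypothetical (in thousands of dollars) and include one deliberately extreme value.

```python
import statistics

# Hypothetical salaries in $000s; the last value is an extreme observation.
salaries = [40, 42, 45, 47, 50, 52, 300]
without_extreme = salaries[:-1]

mean_all = statistics.mean(salaries)
median_all = statistics.median(salaries)
mean_trimmed = statistics.mean(without_extreme)
median_trimmed = statistics.median(without_extreme)

print(f"with extreme value:    mean = {mean_all:.1f}, median = {median_all}")
print(f"without extreme value: mean = {mean_trimmed:.1f}, median = {median_trimmed}")
```

The single extreme value pulls the mean well above every typical salary, while the median barely moves.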


Outliers and Extreme Values

A value that is much smaller or larger than other values:
– A value that stands out or doesn’t belong
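
The slides do not prescribe a rule for deciding when a value "stands out". One widely used convention, assumed here purely for illustration, flags values more than 1.5 times the interquartile range beyond the quartiles. The data are hypothetical.

```python
import statistics

# Hypothetical pack weights in grams; 95 is a suspicious entry.
data = [48, 49, 50, 50, 51, 52, 95]

# Quartiles of the data (three cut points: Q1, median, Q3).
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: anything beyond 1.5 * IQR from the quartiles is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, fences = ({lower_fence}, {upper_fence})")
print("flagged:", outliers)
```

A flagged value is only a candidate for investigation; whether it is an error or a genuine extreme observation still needs judgement.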


The Role of Data Cleaning

Mistakes are made when collecting or coding data:
– All data should be cleaned before analysis!

Data cleaning is the process of detecting and correcting:
– Corrupt data
– Inaccurate data
– Incomplete data
– Irrelevant data
– Incorrect data


The Role of Data Cleaning

Errors and outliers can be detected by:
– Descriptives
– Frequencies
– Checking consistency with allowable values
– Common sense

Profiling & categorising:
– Procedural error – data entry or coding error
– Extraordinary event – uniqueness of observation
– Observations within range but unique in combination
– No explanation
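
Checking entries against allowable values is easy to automate. The sketch below assumes a 1 to 10 rating scale and uses invented raw entries; `None` stands in for a blank cell.

```python
# Allowable values for a rating question: whole numbers from 1 to 10 (assumed).
ALLOWED_RATINGS = range(1, 11)

# Hypothetical raw entries; None represents a blank cell.
responses = [7, 8, 77, 5, None, 10, 0, 6]

issues = []
for row, value in enumerate(responses, start=1):
    if value is None:
        issues.append((row, value, "missing"))
    elif value not in ALLOWED_RATINGS:
        issues.append((row, value, "outside allowable range"))

for row, value, problem in issues:
    print(f"row {row}: {value!r} is {problem}")
```

An entry such as 77 is most plausibly a procedural (data entry) error, but the check only locates the problem; categorising it remains a judgement call.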


The Role of Data Cleaning

Retention vs. Deletion:
– Depends on the philosophy of the analyst
– Deletion may improve multivariate analysis but limit generalisability

Case-wise deletion:
– Cases or respondents with any missing responses are discarded
– Useful if extent of missing data is small

Pair-wise deletion:
– Only the piece of information requiring cleaning is discarded
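
The difference between the two deletion strategies can be sketched with a tiny data set. The records and variable names are hypothetical; `None` marks a missing response.

```python
# Hypothetical survey records; None marks a missing response.
records = [
    {"age": 34, "income": 72, "rating": 8},
    {"age": 41, "income": None, "rating": 6},
    {"age": None, "income": 65, "rating": 7},
    {"age": 29, "income": 58, "rating": 9},
]
variables = ["age", "income", "rating"]

def mean(values):
    return sum(values) / len(values)

# Case-wise deletion: drop every respondent with any missing response.
complete_cases = [r for r in records if all(r[v] is not None for v in variables)]
casewise_means = {v: mean([r[v] for r in complete_cases]) for v in variables}

# Pair-wise deletion: for each variable, discard only the missing values.
pairwise_means = {
    v: mean([r[v] for r in records if r[v] is not None]) for v in variables
}

print("complete cases:", len(complete_cases))
print("case-wise means:", casewise_means)
print("pair-wise means:", pairwise_means)
```

Case-wise deletion halves this sample, which illustrates why it is described as useful only when the extent of missing data is small.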


The Role of Data Cleaning

Case Substitution
– Observations with missing data are replaced by choosing another non-sampled observation.

Mean Substitution
– Widely used; replaces missing values for a variable with the mean value of that variable
based on all valid responses.

Cold Deck Imputation
– Substituting a constant value derived from external sources or previous research for the
missing values.

Regression Imputation
– Regression analysis used to predict the missing values of a variable based on its relationship to
other variables in the data set.
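
Of the methods above, mean substitution is the simplest to sketch. The values are hypothetical, with `None` marking a missing response.

```python
import statistics

# Hypothetical responses for one variable; None marks a missing value.
observed = [12, None, 15, 9, None, 12]

# Mean of all valid responses.
valid = [v for v in observed if v is not None]
mean_of_valid = sum(valid) / len(valid)

# Replace each missing value with that mean.
imputed = [mean_of_valid if v is None else v for v in observed]

print("mean of valid responses:", mean_of_valid)
print("after substitution:", imputed)
print("variance before:", statistics.variance(valid))
print("variance after: ", statistics.variance(imputed))
```

The mean is unchanged by the substitution, but the variance shrinks because the filled-in values sit exactly at the mean. This is one reason to use the method with care.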


Class Activity

1. Form small teams with the people near you

2. Locate the “Class Data.xlsx” file

3. Clean this data file

4. Record issues and a description of what you did


Presenting Data

Data analysis is story-telling:
– It should be interesting and compelling

– Consider your audience
– Establish a setting
– Define the characters
– Establish the conflict
– Resolve the conflict
– Think about the future


Presenting Data

Present data in a way that provides substance & statistics:


– Communicate complex ideas with clarity, precision and efficiency
– Give the largest number of ideas in the most efficient manner
– Tell the truth about the data

http://www.edwardtufte.com/tufte/

http://kirkgoldsberry.com/


Presenting Data


Presenting Data – Chart Junk


Presenting Data – No Relative Basis for Comparison

[Figure: two bar charts titled “A’s Received by Students”, by year (1st Yr to 4th Yr). The left chart has a vertical axis from 0 to 300; the right chart has a vertical axis from 0% to 30%.]


Presenting Data – Distorting/Compressing Vertical Axis

[Figure: two charts titled “Quarterly Sales Figures” for Q1 to Q4. The left chart has a vertical axis from 0 to 200; the right chart has a vertical axis from 0 to 50.]


Presenting Data – No Zero Point on Vertical Axis


[Figure: charts titled “Monthly Sales Numbers” for January to June (J F M A M J). One version has a vertical axis from $36 to $45; the other has a vertical axis from $0 to $60.]

Using Different Simple Graphs

When presenting data you are normally trying to show a:


– Relationship (how things are connected)
– Comparison (how things are different)
– Composition (what things are made of)
– Distribution (how spread out things are)


Using Different Simple Graphs


Using Different Simple Graphs

Vertical Bar Charts:


– Best for comparing 2-7 means or percentages


Using Different Simple Graphs

Horizontal Bar Charts:


– Best for comparing 8 or more means or percentages


Using Different Simple Graphs

Pie Charts:
– Best for showing the composition of a single group of data


Using Different Simple Graphs

Line Charts:
– Illustrate trends or changes over time
– Movements in the same or different direction


Using Different Simple Graphs

Scatterplots:
– Illustrate distribution of data
– Help identify relationships between variables


Some Useful Resources

– https://infogr.am/

– http://articles.sysev.com/principles-practices-effective-presentation-communication/

– http://www.kaushik.net/avinash/data-presentation-tips-focus-think-simplify-visualize/

– http://www.forbes.com/sites/kateharrison/2015/01/20/a-good-presentation-is-about-data-and-story/


Class Activity

1. Go to Blackboard and locate the “Country Data.xlsx” file

2. Present the information in the file in an interesting way

3. Develop a story around the data; what is it telling you?

4. Prepare a short PowerPoint presentation to tell your story

5. Email to matthew.beck@sydney.edu.au


Descriptives in Excel

=mode(array)
=median(array)
=average(array)
=stdev(array)
=max(array)-min(array) (range; Excel has no built-in RANGE function)
=min(array)
=max(array)

Where the array is a row or column of data you are interested in analysing

Go to the Insert ribbon for graphs
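
For readers working outside Excel, the same descriptives can be sketched with Python's standard `statistics` module. The data are hypothetical; the range is computed as maximum minus minimum, and the standard deviation is the sample version (dividing by n - 1).

```python
import statistics

# Hypothetical row or column of values.
data = [3, 5, 5, 6, 8, 9]

summary = {
    "mode": statistics.mode(data),       # most frequent value
    "median": statistics.median(data),   # middle value of the ordered data
    "average": statistics.mean(data),    # arithmetic mean
    "stdev": statistics.stdev(data),     # sample standard deviation (n - 1)
    "range": max(data) - min(data),      # maximum minus minimum
    "min": min(data),
    "max": max(data),
}

for name, value in summary.items():
    print(f"{name:<8}{value}")
```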


Descriptives in SPSS

You are an intelligent Master’s level student

SPSS is pretty intuitive

See if you can figure it out:


– Hint: You are trying to analyse your data (where do you think you go?)
– Hint: You want to describe your data (where do you think you go?)


Class Activity

1. Locate the “Class Data - Clean.xlsx” file on Blackboard

2. Do some descriptive statistics on the data

3. Develop a story around the data; what is it telling you?

4. Prepare a short PowerPoint presentation to tell your story

5. Email to matthew.beck@sydney.edu.au
