
BUSINESS STATISTICS
Advanced Educational Program

Lecture 4
DESCRIPTIVE STATISTICS: Numerical summaries

Reading materials:
Chap 4 (Keller)


Outline

• Measures of center:
- Mean, median, mode
- Selection of measures of location
• Measures of dispersion (spread):
- Range, interquartile range, quartile deviation, variance, standard deviation
• Empirical rule (general case: Chebyshev's theorem)
• Coefficient of skewness
• Coefficient of variation

Measures of center and spread

Measures of center

• A measure of center (or location) shows where the center of the data is
• The three most useful measures of location:
- Arithmetic mean (average)
- Median
- Mode

Arithmetic mean from raw data

• Arithmetic mean from population: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
• Arithmetic mean from sample: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Where: X_i, x_i - the value of each item
       N, n - total number of items

Arithmetic mean from frequency table

• Apply this formula for the sample: $\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$

Where: x_i - the value of class i
       f_i - frequency of class i
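
The two formulas above can be sketched in Python as follows. This is only an illustration: the function names and the small data set are hypothetical and do not come from the lecture.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum of the items divided by their count."""
    if not values:
        raise ValueError("mean of empty data is undefined")
    return sum(values) / len(values)


def mean_from_frequency(table):
    """Mean from a frequency table given as (x_i, f_i) pairs."""
    total_freq = sum(f for _, f in table)
    if total_freq == 0:
        raise ValueError("total frequency must be positive")
    return sum(x * f for x, f in table) / total_freq


# Hypothetical data, for illustration only.
raw = [2, 2, 3, 5, 5, 5]
table = [(2, 2), (3, 1), (5, 3)]  # the same six items, grouped by value

print(mean_raw(raw))               # 22 / 6
print(mean_from_frequency(table))  # (2*2 + 3*1 + 5*3) / (2 + 1 + 3)
```

Both calls give the same result, because the frequency table is just a grouped view of the same items.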


Advantages and disadvantages of arithmetic mean

• Advantages:
– Easy to understand and calculate
– The value of every item is included => representative of the whole data set
• Disadvantages:
– Sensitive to outliers

Mean is sensitive to outliers

Sample: (43; 38; 37; ...; 27; 34) => $\bar{x} = 33.5$
Contaminated sample: (43; 38; 37; ...; 27; 1934) => $\bar{x} = 71.5$
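
The effect can be reproduced with a short Python sketch. The lecture's full sample is elided, so the six values below are a hypothetical stand-in, not the lecture data; only the idea of replacing 34 by 1934 is taken from the slide.

```python
from statistics import mean

# Hypothetical six-value sample (NOT the lecture's sample, which is elided).
clean = [43, 38, 37, 30, 27, 34]

# Same sample with the last value contaminated: 34 recorded as 1934.
contaminated = clean[:-1] + [1934]

clean_mean = mean(clean)
contaminated_mean = mean(contaminated)

print(clean_mean)          # about 34.8
print(contaminated_mean)   # 351.5
```

A single bad value moves the mean far away from every other observation in the sample.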


Median

• The median is the value of the observation located in the middle of the data set
• Steps to find the median:
1. Arrange the observations in order of size (normally ascending order)
2. Find the number of observations and hence the middle observation
3. The median is the value of the middle observation

Calculate median from raw data

• If the data has an odd number of observations:
– Middle observation: the $\frac{n+1}{2}$th
– $\text{Median} = x_{\left(\frac{n+1}{2}\right)}$
• If the data has an even number of observations:
– There are two observations located in the middle, and
  $\text{Median} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$

Example

• E.g. 1. Raw data: 11, 11, 13, 14, 17 => find the median
• E.g. 2. Raw data: 11, 11, 13, 14, 16, 17 => find the median

Advantages and disadvantages of median

• Advantages:
– Easy to understand and calculate
– Not affected by outlying values => can be used when the mean would be misleading
• Disadvantages:
– Based on the value of one observation => fails to reflect the whole data set
– Not easy to use in further analysis


Mode

• The mode is the value which occurs most frequently in the data set
• Steps to find the mode:
1. Draw a frequency table for the data
2. Identify the mode as the most frequent value

Example to calculate mode

X     Frequency
8     3
12    7
16    12
17    8
19    5
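
A minimal Python sketch of the two steps, using the frequency table from this example. Returning a list of modes is a design choice made here (not something the lecture prescribes) so that ties can be reported.

```python
# Frequency table from the example: value X -> frequency.
freq = {8: 3, 12: 7, 16: 12, 17: 8, 19: 5}

# Step 2: the mode is the value with the highest frequency.
highest = max(freq.values())
modes = [x for x, f in freq.items() if f == highest]

print(modes)   # [16]
```

If two or more values shared the highest frequency, the list would contain all of them.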


Mean, median and mode in normal and skewed distributions

Bimodal and multimodal data distributions

[Figure: two panels, "Bimodal (two modes)" and "Multimodal (several modes)"]


Which measure of centre is best?
• The mean is generally the most commonly used
• But it is sensitive to extreme values
• If the data are skewed or extreme values are present, the median is better, e.g. real estate prices
• The mode is generally best for categorical data, e.g. restaurant service quality (ordinal, below): the mode is "Very good"

Rating         # customers
Excellent      20
Very good      50
Good           30
Satisfactory   12
Poor           10
Very Poor      6
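
For category labels such as these ratings, counting is the only operation needed to find the mode. A small Python sketch using the table above (the variable names are hypothetical):

```python
# Service-quality ratings from the table: label -> number of customers.
ratings = {
    "Excellent": 20,
    "Very good": 50,
    "Good": 30,
    "Satisfactory": 12,
    "Poor": 10,
    "Very Poor": 6,
}

# The modal category is the label with the largest count.
modal_rating = max(ratings, key=ratings.get)
total_customers = sum(ratings.values())
modal_share = ratings[modal_rating] / total_customers

print(modal_rating)      # Very good
print(total_customers)   # 128
print(modal_share)       # 0.390625
```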
