EC2119

Descriptive Statistics

Measures of Central Tendency


Mean, Median and Mode

Descriptive Statistics
• Used to describe the basic features of the data in a study
• Provide simple summaries of the sample (graphical, tabular or summary statistics)
• Form the basis of virtually every quantitative analysis of data

• Distinguished from Inferential Statistics (IS)
• Descriptive Statistics (DS) simply describe what the data show
• IS => tries to draw conclusions about the population
• DS => presents quantitative data in a manageable form

Steps in Descriptive Statistics

• Collect Data
• Classify Data
• Summarise Data
• Present Data
• Proceed to inferential statistics (if there is enough data to draw a conclusion)

Descriptive Statistics

Measures of Central Tendency:
A single value that summarises a set of data by locating its central value.

Measures of Dispersion:
Measure the variation around the central value of a data set.

Measures of Shape:
Describe the shape of the distribution.

Summary Measures

Central Tendency: Mean, Median, Mode
Dispersion: Range, Mean Average Deviation, Variance, Standard Deviation

Grouped and Ungrouped Data
• Ungrouped data are data that have been collected but not yet ordered or organised into more readable groups
• Grouped data have been organised into a form that is easier for the researcher to analyse

Example of ungrouped data
• 30 people are randomly chosen in class and asked how many books they own:
2, 45, 27, 13, 43, 19, 32, 28, 23, 4,

7, 11, 19, 12, 56, 0, 13, 20, 21, 14,

9, 17, 36, 42, 55, 15, 19, 8, 6, 31.

Number of books owned by 30 students in class

Number of Books    Midpoint    Frequency
0-10               5           7
11-20              15.5        10
21-30              25.5        4
31-40              35.5        4
41-50              45.5        3
51-60              55.5        2

Total                          30
Source: Author's own
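
The Midpoint column can be reproduced from the class limits. The following is a minimal Python sketch, assuming the midpoint of a class is simply halfway between its stated lower and upper limits; the names `classes`, `frequencies` and `midpoint` are illustrative and not part of the slides.

```python
# Class limits and frequencies copied from the table above.
classes = [(0, 10), (11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]
frequencies = [7, 10, 4, 4, 3, 2]


def midpoint(lower, upper):
    """Midpoint of a class interval: halfway between its two limits."""
    return (lower + upper) / 2


midpoints = [midpoint(lower, upper) for lower, upper in classes]
total = sum(frequencies)  # should equal the number of students surveyed

for (lower, upper), x, f in zip(classes, midpoints, frequencies):
    print(f"{lower}-{upper}: midpoint {x}, frequency {f}")
print("Total =", total)
```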

Measures of Central Tendency
(Ungrouped Data)
Mean:
The arithmetic average of the data values: the sum of all values divided by the total number of values.

Population Mean
$$\mu = \frac{\sum X}{N}$$

Sample Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Some Terms
• N is the total number of observations in the population
• n is the total number of observations in the sample
• $\Sigma$, pronounced sigma, means 'sum of'
• $\sum_{i=1}^{N}$ means the sum of the observations from the first up to the Nth
Example:
Calculate the mean for the following ungrouped data:
(6,7,6,8,5,7,6,9,10,6)

$$\bar{X} = \frac{\sum X}{n} = \frac{6 + 7 + 6 + 8 + 5 + 7 + 6 + 9 + 10 + 6}{10} = \frac{70}{10} = 7$$

The mean in our Book Data = 647 / 30 = 21.57, but .57 of a book cannot be interpreted, so the mean is 22
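
The ten-value example can be checked with a few lines of Python. This is only a sketch of the definition (sum of the values divided by how many there are); the function name `mean` and the variable names are illustrative.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by the number of values."""
    return sum(values) / len(values)


data = [6, 7, 6, 8, 5, 7, 6, 9, 10, 6]
sample_mean = mean(data)
print(sum(data), len(data), sample_mean)
```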

Properties of the Mean
• Every set of interval and ratio level data has a mean
• All values are included when computing the mean
• The mean is unique
• The sum of the deviations of each value from the mean = 0
Example: 3, 8, 4; mean = 5
$\sum (X - \bar{X}) = 0$
(3-5) + (8-5) + (4-5) = -2 + 3 - 1 = 0
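
The zero-sum property of the deviations can be verified numerically. The sketch below uses the three values from the example; the variable names are illustrative. With non-integer data, floating-point rounding may leave a tiny residual instead of an exact 0, so a tolerance check such as `math.isclose` would be the safer comparison there.

```python
values = [3, 8, 4]
x_bar = sum(values) / len(values)          # the mean of the three values

# Deviation of each value from the mean.
deviations = [x - x_bar for x in values]
total_deviation = sum(deviations)

print(x_bar, deviations, total_deviation)
```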

Weakness of the mean:
Can only be computed for interval and ratio level data (why not nominal and ordinal?)
It is affected by extreme values or outliers.

• Example:
• 5 Sites, Average Price = €110,000
• €70,000, €275,000, €80,000, €60,000, €65,000
• Your Budget is €75,000. Should you look?
• Mean can be unrepresentative of the data

Median:
The median is the middle value of an ordered array.
If n is odd, the median is the middle number.
If n is even, the median is the average of the 2 middle numbers.
• Example:
• Remember our Book data
• Arrange the data in ascending order
• (0,2,4,6,7,8,9,11,12,13,13,14,15,17,19,19,19,20,21,23,27,28,31,32,36,42,43,45,55,56)
• n is 30, which is even, so the median is the average of the 2 middle numbers (the 15th and 16th values), 19 and 19, which is 19
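
The odd/even rule can be written as a short function. This is a sketch rather than a reference implementation; the function name `median` is illustrative, and the two inputs are small lists taken from other slides (3, 8, 4 and the ten-value mean example), not the Book data.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # n odd: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # n even: average the two middle values


odd_example = median([3, 8, 4])
even_example = median([6, 7, 6, 8, 5, 7, 6, 9, 10, 6])
print(odd_example, even_example)
```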
• Used instead of the mean when the data set contains extreme outliers
• Example: Site Prices again
• €60,000, €65,000, €70,000, €80,000, €275,000
• Mean = €110,000
• Median = €70,000
• Median is more representative of the data in this instance

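
The effect of the outlier on the site prices can be seen by computing both measures. The sketch below uses `mean` and `median` from Python's standard `statistics` module; the extra calculation without the €275,000 site is an illustration added here, not part of the slide.

```python
from statistics import mean, median

prices = [60_000, 65_000, 70_000, 80_000, 275_000]

mean_price = mean(prices)        # pulled upwards by the 275,000 site
median_price = median(prices)    # middle value of the ordered prices

# Illustration only: the mean of the four sites without the outlier.
mean_without_outlier = mean(prices[:-1])

print(mean_price, median_price, mean_without_outlier)
```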
Properties of the Median

• Like the mean, the median is unique
• It is not affected by extremely large or small values (i.e. outliers)
• Can be computed for ordinal, interval and ratio level data

Mode:
The value that occurs most often in a data set
Example:
• Calculate the mode for the following ungrouped data:
• (0,2,4,6,7,8,9,11,12,13,13,14,15,17,19,19,19,20,21,23,27,28,31,32,36,42,43,45,55,56)
• Since 19 is the value that occurs most often (three times), this is the mode.
Not affected by extreme values

Can be used for nominal and ordinal level data as well as ratio and interval level data (advantage over mean and median)

Example: Nominal Level data
Survey of 100 TV viewers on the TV show they prefer

Fair City    The Late Late Show    Prime Time    Nationwide    The Sunday Game
10           34                    18            25            13

• Mode = The Late Late Show

Disadvantages
There may not be a mode
There may be several modes (bi-modal, multi-modal)
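
Both the nominal example and the two disadvantages can be illustrated in Python. This is a sketch under one common convention, namely that a data set in which every value occurs exactly once has no mode; the function name `modes` and the short numeric lists are hypothetical, while the survey counts are those in the table above.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency.

    Assumed convention: if all values occur exactly once (and there is more
    than one distinct value), there is no mode and an empty list is returned.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == highest]


# Nominal data: the mode is the category with the largest count.
survey = {
    "Fair City": 10,
    "The Late Late Show": 34,
    "Prime Time": 18,
    "Nationwide": 25,
    "The Sunday Game": 13,
}
favourite = max(survey, key=survey.get)

no_mode = modes([1, 2, 3])           # every value occurs once
bi_modal = modes([1, 2, 2, 3, 3])    # two values tie for most frequent
print(favourite, no_mode, bi_modal)
```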

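Measures of Central Tendency
(Grouped Data)
Mean:
Population Mean
$$\mu = \frac{\sum_{i=1}^{N} f x_i}{N}$$
Sample Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} f x_i}{n}$$
where x_i is the midpoint of a class interval and f is its frequency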

Mean of our grouped Data

• Mean is
$$\bar{x} = \frac{\sum_{i=1}^{n} f x_i}{n}$$

No. of Books    Midpoint (X)    F     F(X)
0-10            5               7     35
11-20           15.5            10    155
21-30           25.5            4     102
31-40           35.5            4     142
41-50           45.5            3     136.5
51-60           55.5            2     111
Total                           30    681.5

• So the grouped mean = 681.5 / 30 = 22.72 => 23
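
The F(X) column and the grouped mean can be recomputed from the midpoints and frequencies. A minimal Python sketch, with illustrative variable names:

```python
# Midpoints (X) and frequencies (F) from the grouped Books table.
midpoints = [5, 15.5, 25.5, 35.5, 45.5, 55.5]
frequencies = [7, 10, 4, 4, 3, 2]

n = sum(frequencies)                                   # total number of observations
fx = [f * x for f, x in zip(frequencies, midpoints)]   # the F(X) column
sum_fx = sum(fx)
grouped_mean = sum_fx / n

print(fx, sum_fx, round(grouped_mean, 2))
```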

Median:
$$\text{Median} = L_{median} + \frac{\frac{n}{2} - \sum f_{prec}}{f_{median}} \times C_w$$
• Lmedian = lower limit of the class interval containing the median.
• ∑fprec = sum of the frequencies of the class intervals preceding (but not including) the one containing the median
• fmedian = frequency of the class interval containing the median
• Cw = difference between the lower limit of the class interval containing the median and the lower limit of the following class interval (always positive)


Median of grouped Books data example

Number of Books    Midpoint    Frequency
0-10               5           7
11-20              15.5        10
21-30              25.5        4
31-40              35.5        4
41-50              45.5        3
51-60              55.5        2
Total                          30

$$\text{Median} = L_{median} + \frac{\frac{n}{2} - \sum f_{prec}}{f_{median}} \times C_w$$

Median = 11 + [((30/2) - 7) / 10] * 10
= 11 + [0.8 * 10]
= 11 + 8 = 19
Median = 19
In an ordered array the 15th person owns 19 Books
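
The grouped median calculation can be sketched as a function that first locates the median class and then interpolates within it. The function name `grouped_median` is illustrative. One assumption is made so that every class has a width: the list of lower limits carries one extra entry, 61, standing for the lower limit of a class that would follow 51-60.

```python
def grouped_median(lower_limits, frequencies):
    """Interpolated median of grouped data.

    lower_limits must have one more entry than frequencies; the extra final
    entry is the lower limit of the class that would follow the last one.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    f_prec = 0  # sum of frequencies of the classes before the current one
    for i, f_median in enumerate(frequencies):
        if f_prec + f_median >= half:
            # The cumulative frequency reaches n/2 here: this is the median class.
            l_median = lower_limits[i]
            c_w = lower_limits[i + 1] - lower_limits[i]
            return l_median + (half - f_prec) / f_median * c_w
        f_prec += f_median


books_median = grouped_median([0, 11, 21, 31, 41, 51, 61], [7, 10, 4, 4, 3, 2])
print(books_median)
```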

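Mode:

Mode = Lmode + [d1 / (d1 + d2)] * Cw

Lmode = lower limit of the modal class (the class interval with the highest frequency)

d1 = Frequency of the modal class - frequency of the previous class.

d2 = Frequency of the modal class - frequency of the following class.

Standard deviation:
$$SD(s) = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}}$$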

Mode of grouped data example

Number of Books    Midpoint    Frequency
0-10               5           7
11-20              15.5        10
21-30              25.5        4
31-40              35.5        4
41-50              45.5        3
51-60              55.5        2
Total                          30

Mode = Lmode + [d1 / (d1 + d2)] * Cw
Modal class = 11-20 (highest frequency, 10), so Lmode = 11
= 11 + [(10-7) / ((10-7) + (10-4))] * 10
= 11 + [3 / (3+6)] * 10
= 11 + 3.33
= 14.33
≈ 14
The most common number of Books owned is 14
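
The grouped mode calculation can be sketched the same way. The function name `grouped_mode` is illustrative, and two assumptions are made that the slides do not state: a missing neighbouring class (before the first or after the last) is treated as having frequency 0, and the list of lower limits carries an extra final entry, 61, so that the last class has a width. If several classes tie for the highest frequency, the sketch uses the first.

```python
def grouped_mode(lower_limits, frequencies):
    """Interpolated mode of grouped data.

    lower_limits must have one more entry than frequencies (see the lead-in).
    """
    i = frequencies.index(max(frequencies))   # position of the modal class
    # Assumption: a neighbour that does not exist counts as frequency 0.
    f_previous = frequencies[i - 1] if i > 0 else 0
    f_following = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    d1 = frequencies[i] - f_previous
    d2 = frequencies[i] - f_following
    c_w = lower_limits[i + 1] - lower_limits[i]
    return lower_limits[i] + d1 / (d1 + d2) * c_w


books_mode = grouped_mode([0, 11, 21, 31, 41, 51, 61], [7, 10, 4, 4, 3, 2])
print(round(books_mode, 2))
```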

Pocket Money Example (MEAN)

$$\bar{x} = \frac{\sum_{i=1}^{n} f x_i}{n}$$

∑f(X) = 758 and n = 57

So grouped mean = 758 / 57 => 13.30

Pocket Money Example (MEDIAN)

$$\text{Median} = L_{median} + \frac{\frac{n}{2} - \sum f_{prec}}{f_{median}} \times C_w$$

Lmedian = 10
n/2 = 28.5
∑fprec = 6 + 9 = 15
fmedian = 19
Cw = 5

Median = 10 + {[(28.5 - 15) / 19] * 5}
= 10 + {[13.5 / 19] * 5}
= 10 + {0.71 * 5}
= 10 + 3.55
= 13.55 (INTERPRET)

Pocket Money Example (MODE)

Mode = Lmode + [d1 / (d1 + d2)] * Cw

Lmode = 10
d1 = 19 - 9 = 10
d2 = 19 - 16 = 3
Cw = 5

Mode = 10 + {[10 / 13] * 5}
= 10 + {0.77 * 5}
= 10 + 3.85
= 13.85 (INTERPRET)
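
The Pocket Money arithmetic can be re-checked from the quantities quoted on the slides. The sketch below uses only those figures (the full frequency table is not reproduced here), and the variable names are illustrative.

```python
# Quantities quoted on the Pocket Money slides.
n = 57
sum_fx = 758
lower_limit = 10      # lower limit of the median class, which is also the modal class here
c_w = 5
f_prec = 6 + 9        # frequencies of the classes before the median class
f_class = 19          # frequency of the median (and modal) class

grouped_mean = sum_fx / n
grouped_median = lower_limit + (n / 2 - f_prec) / f_class * c_w

d1 = f_class - 9      # modal class frequency minus previous class frequency
d2 = f_class - 16     # modal class frequency minus following class frequency
grouped_mode = lower_limit + d1 / (d1 + d2) * c_w

print(round(grouped_mean, 2), round(grouped_median, 2), round(grouped_mode, 2))
```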

In Class Work: Part time Wages

Calculate the mean, median and mode for the above data

$$\bar{x} = \frac{\sum_{i=1}^{n} f x_i}{n}$$

$$\text{Median} = L_{median} + \frac{\frac{n}{2} - \sum f_{prec}}{f_{median}} \times C_w$$

Mode = Lmode + [d1 / (d1 + d2)] * Cw

In Class Work

Calculate the mean, median and mode for the above data