Lec 4

Lecture 4: Central Tendency and Dispersion
Dr. A. Ramesh
Department of Management Studies
1
Lecture objectives
• Central tendency
• Measures of Dispersion
2
Measures of Central Tendency
• Measures of central tendency yield information about “particular places or

locations in a group of numbers.”
• A single number to describe the characteristics of a set of data
3
Summary statistics
• Central tendency or measures of • Dispersion

location – Skewness
– Arithmetic mean – Kurtosis
– Weighted mean – Range
– Median – Interquartile range
– Percentile – Variance
– Standard score
– Coefficient of variation
4
Arithmetic Mean
• Commonly called ‘the mean’
• It is the average of a group of numbers
• Applicable for interval and ratio data
• Not applicable for nominal or ordinal data
• Affected by each value in the data set, including extreme values
• Computed by summing all values in the data set and dividing the sum by
the number of values in the data set
5
Population Mean
 X  X 1
 X 2
 X 3
 ...  X N
N N
24  13  19  26  11

5
93

5
 18.6
6
Sample Mean
X 
X  X 1
 X 2
 X 3
 ...  X n
n n
57  86  42  38  90  66

6
379

6
 63.167
7
Mean of Grouped Data
• Weighted average of class midpoints
• Class frequencies are the weights
  fM
f

 fM
N
f 1M 1  f 2 M 2  f 3M 3    fiMi

f 1  f 2  f 3    fi
8
Calculation of Grouped Mean
Class Interval Frequency(f) Class Midpoint(M) fM
20-under 30 6 25 150
30-under 40 18 35 630
40-under 50 11 45 495
50-under 60 11 55 605
60-under 70 3 65 195
70-under 80 1 75 75
50 2150

fM 2150
  43.0
f 50
9
Weighted Average
• Sometimes we wish to average numbers, but we want to assign more

importance, or weight, to some of the numbers.
• The average you need is the weighted average.

Formula for Weighted Average
 xw
Weighted Average 
w
where x is a data value and w is
the weight assigned to that data
value. The sum is taken over all
data values.
Example
Suppose your midterm test score is 83 and your final exam score is 95.
Using weights of 40% for the midterm and 60% for the final exam, compute
the weighted average of your scores. If the minimum average for an A is
90, will you earn an A?
Weighted Average 
830.40 950.60
0.40  0.60
32  57
  90.2
1 You will earn an A!
Median
• Middle value in an ordered array of numbers
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
• Unaffected by extremely large and extremely small values
13
Median: Computational Procedure
• First Procedure
– Arrange the observations in an ordered array
– If there is an odd number of terms, the median is the middle term of the
ordered array
– If there is an even number of terms, the median is the average of the
middle two terms
• Second Procedure
– The median’s position in an ordered array is given by (n+1)/2.
14
Median: Example with an Odd Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
• There are 17 terms in the ordered array.
• Position of median = (n+1)/2 = (17+1)/2 = 9
• The median is the 9th term, 15.
• If the 22 is replaced by 100, the median is 15.
• If the 3 is replaced by -103, the median is 15.
15
Median: Example with an Even Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
• There are 16 terms in the ordered array
• Position of median = (n+1)/2 = (16+1)/2 = 8.5
• The median is between the 8th and 9th terms, 14.5
• If the 21 is replaced by 100, the median is 14.5
• If the 3 is replaced by -88, the median is 14.5
16
Median of Grouped Data
N
 cfp
Median  L  2 W 
fmed
Where :
L  the lower limit of the median class
cfp = cumulative frequency of class preceding the median class
fmed = frequency of the median class
W = width of the median class
N = total of frequencies
17
Median of Grouped Data -- Example
Cumulative
N
Class Interval Frequency Frequency  cfp
20-under 30 6 6 Md  L  2 W 
30-under 40 18 24 fmed
40-under 50 11 35 50
 24
50-under 60 11 46
60-under 70 3 49
 40  2 10 
11
70-under 80 1 50  40.909
N = 50
18
Mode
• The most frequently occurring value in a data set
• Applicable to all levels of data measurement (nominal, ordinal, interval,

and ratio)
• Bimodal -- Data sets that have two modes
• Multimodal -- Data sets that contain more than two modes
19
Mode -- Example
• The mode is 44
• There are more 44s 35 41 44 45
than any other value 37 41 44 46
37 43 44 46
39 43 44 46
40 43 44 46
40 43 45 48
20
Mode of Grouped Data
• Midpoint of the modal class
• Modal class has the greatest frequency
Class Interval Frequency

20-under 30 6  d1 
Mode  LMo   w 
30-under 40 18  d1  d 2 
40-under 50 11
 12 
50-under 60 11 30   10  36.31
60-under 70 3  12  7 
70-under 80 1
21
22
Percentiles
• Measures of central tendency that divide a group of data into 100 parts
• Example: 90th percentile indicates that at most 90% of the data lie
below it, and at least 10% of the data lie above it
• The median and the 50th percentile have the same value
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
23
Percentiles: Computational Procedure
• Organize the data into an ascending ordered array
• Calculate the p th percentile location:
P
i ( n)
100
• Determine the percentile’s location and its value.
• If i is a whole number, the percentile is the average of the values at the

i and (i+1) positions
• If i is not a whole number, the percentile is at the (i+1) position in the

ordered array
24
Percentiles: Example
• Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
• Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
• Location of 30th percentile:
30
i (8)  2.4
100
• The location index, i, is not a whole number; i+1 = 2.4+1=3.4;

the whole number portion is 3; the 30th percentile is at the 3rd
location of the array; the 30th percentile is 13.
25
Dispersion
• Measures of variability describe the spread or the dispersion of a set of

data
• Reliability of measure of central tendency
• To compare dispersion of various samples
26
Variability
No Variability in Cash Flow Mean
Variability in Cash Flow Mean
27
Measures of Variability or dispersion
Common Measures of Variability
• Range
• Inter-quartile range
• Mean Absolute Deviation
• Variance
• Standard Deviation
• Z scores
• Coefficient of Variation
28
Range – ungrouped data
• The difference between the largest and the smallest values in 35 41 44 45

a set of data
37 41 44 46
• Simple to compute
37 43 44 46
• Ignores all data points except the two extremes
• Example: 39 43 44 46
Range = Largest – Smallest = 48 - 35 = 13 40 43 44 46
40 43 45 48
29
Quartiles
• Measures of central tendency that divide a group of data into four subgroups
• Q1: 25% of the data set is below the first quartile
• Q2: 50% of the data set is below the second quartile
• Q3: 75% of the data set is below the third quartile
• Q1 is equal to the 25th percentile
• Q2 is located at 50th percentile and equals the median
• Q3 is equal to the 75th percentile
• Quartile values are not necessarily members of the data set
30
Quartiles
Q1 Q2 Q3
25% 25% 25% 25%
31
Quartiles: Example
• Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
• Q1 i
25
(8)  2 Q 
109  114
1  111.5
100 2
• Q2:
50 116  121
i (8)  4 Q2   118.5
100 2
• Q3:
75 122  125
i (8)  6 Q3   123.5
100 2
32
Interquartile Range
• Range of values between the first and third quartiles

• Range of the “middle half”
• Less influenced by extremes
Interquartile Range  Q3  Q1
33
Deviation from the Mean
• Data set: 5, 9, 16, 17, 18

• Mean:

 X  65  13
N 5
• Deviations from the mean: -8, -4, 3, 4, 5
-4 +5
-8 +4
+3
0 5 10 15 20

34
Mean Absolute Deviation
• Average of the absolute deviations from the mean
X X   X  
M . A.D. 
 X 
5 -8 +8 N
9 -4 +4 24

16 +3 +3 5
17 +4 +4  4.8
18 +5 +5
0 24
35
Population Variance
• Average of the squared deviations from the arithmetic mean
X X   X  
2
 X  
2
 
2
5 -8 64 N
130
9 -4 16 
5
16 +3 9  26.0
17 +4 16
18 +5 25
0 130
36
Population Standard Deviation
• Square root of the variance
X X   X  
2
 X  
2
 
2
N
5 -8 64 130
9 -4 16 
5
16 +3 9  26.0
17 +4 16   
2
18 +5 25  26.0
0 130  5.1
37
Sample Variance
• Average of the squared deviations from the arithmetic mean
X X  X X  X 
2
 X  X 
2
2,398 625 390,625


2
1,844 71 5,041 S n 1
1,539 -234 54,756 663,866
1,311 -462 213,444 
3
7,092 0 663,866
 221, 288.67
38
Sample Standard Deviation
• Square root of the sample variance
X X  X X  X 
2
 X  X 
2

2
2,398 625 390,625 S n 1

663,866
1,844 71 5,041 
3
1,539 -234 54,756  221, 288.67
1,311 -462 213,444 S  S
2
7,092 0 663,866  221, 288.67

 470.41
39
Uses of Standard Deviation
• Indicator of financial risk
• Quality Control
– construction of quality control charts
– process capability studies
• Comparing populations
– household incomes in two cities
– employee absenteeism at two plants
40
Standard Deviation as an Indicator of Financial Risk
Annualized Rate of Return

Financial  
Security
A 15% 3%
B 15% 7%
41

Lec 4

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Lec 4

Uploaded by

Copyright:

Available Formats

Lecture 4: Central Tendency and Dispersion

• Measures of central tendency yield information about “particular places or

• Central tendency or measures of • Dispersion

• Sometimes we wish to average numbers, but we want to assign more

• The average you need is the weighted average.

• Applicable for ordinal, interval, and ratio data

• Not applicable for nominal data

• Unaffected by extremely large and extremely small values

• There are 16 terms in the ordered array

• Position of median = (n+1)/2 = (16+1)/2 = 8.5

• The median is between the 8th and 9th terms, 14.5

• If the 21 is replaced by 100, the median is 14.5

• If the 3 is replaced by -88, the median is 14.5

• The most frequently occurring value in a data set

• Applicable to all levels of data measurement (nominal, ordinal, interval,

• Bimodal -- Data sets that have two modes

• Multimodal -- Data sets that contain more than two modes

than any other value 37 41 44 46

Class Interval Frequency

• Applicable for ordinal, interval, and ratio data

• Not applicable for nominal data

• If i is a whole number, the percentile is the average of the values at the

• If i is not a whole number, the percentile is at the (i+1) position in the

• The location index, i, is not a whole number; i+1 = 2.4+1=3.4;

• Measures of variability describe the spread or the dispersion of a set of

• Reliability of measure of central tendency

• To compare dispersion of various samples

No Variability in Cash Flow Mean

Variability in Cash Flow Mean

• The difference between the largest and the smallest values in 35 41 44 45

Range = Largest – Smallest = 48 - 35 = 13 40 43 44 46

• Q1: 25% of the data set is below the first quartile

• Q2: 50% of the data set is below the second quartile

• Q3: 75% of the data set is below the third quartile

• Q1 is equal to the 25th percentile

• Q2 is located at 50th percentile and equals the median

• Q3 is equal to the 75th percentile

• Quartile values are not necessarily members of the data set

25% 25% 25% 25%

• Range of values between the first and third quartiles

• Data set: 5, 9, 16, 17, 18

• Average of the absolute deviations from the mean

2,398 625 390,625

2,398 625 390,625 S n 1

7,092 0 663,866  221, 288.67

Annualized Rate of Return

You might also like