
CENTRAL TENDENCY AND

DISPERSION

M B D Neelakanthie
MBA, MSc in CS, BSc (Special) Hons.
QMS/ISMS/Auditor, Six Sigma(Green Belt)
Descriptive Statistics
• Central tendency
– Where is the data centered?
– Do different kinds of centers differ?

• Dispersion
– How are the data spread out?
• Widely? Narrowly?

• Measure of Central Tendency

The single value that best describes the performance of the group as a whole

Measures of Central Tendency
• Mode
– most frequently occurring value
• Median
– middle value when cases are ordered by value
• Mean
– arithmetic average

About Modes…
• The most frequently occurring score
• A set of measurements can have one mode
– But there can be no mode (every value occurs equally often)
– There can also be more than one (bimodal, trimodal, etc.)
• In a symmetrical, unimodal distribution:
mean = median = mode
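
The three cases above (one mode, no mode, more than one mode) can be illustrated with `statistics.multimode` from the Python standard library (Python 3.8 or later). The score lists are made up for illustration and do not come from the slides.

```python
from statistics import multimode

# Hypothetical score sets, invented for this illustration.
unimodal = [2, 3, 3, 4, 5]
bimodal = [1, 1, 2, 3, 3]
all_equal_counts = [1, 2, 3, 4]

print(multimode(unimodal))          # [3]
print(multimode(bimodal))           # [1, 3]
# When every value occurs equally often, multimode returns all of them,
# which corresponds to the "no mode" case.
print(multimode(all_equal_counts))  # [1, 2, 3, 4]
```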

About Medians…
• Proportional measure
– Score that falls in the middle of the scores when they’re
ranked according to magnitude
– 50th percentile
• If n is odd, there is a unique median
– Rank scores, find the middle one
– If there are 5 cases, the median is the (5+1)/2 = 3rd ranked score
• If n is even (e.g. n = 8)
– Rank the cases, find position n/2, and average that score with the next one: (score at n/2 + score at (n/2)+1) / 2
– Best if they’re the same, or if their average is meaningful
• Report the value, not the position
• Could also look at the cumulative frequency to locate it
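
The rank-and-pick procedure above can be sketched as a small function. The name `median_by_rank` and the sample lists are hypothetical, chosen only for this illustration.

```python
def median_by_rank(values):
    """Return the median by ranking the values and picking the middle."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the median sits at position (n + 1) / 2 (1-based).
        return ranked[(n + 1) // 2 - 1]
    # Even n: average the scores at positions n/2 and (n/2) + 1 (1-based).
    lower = ranked[n // 2 - 1]
    upper = ranked[n // 2]
    return (lower + upper) / 2


print(median_by_rank([5, 1, 3, 2, 4]))            # 3
print(median_by_rank([1, 2, 3, 4, 5, 6, 7, 8]))   # 4.5
```

The function returns the value at the middle position, not the position itself.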
MEDIAN

If we place all the data in ascending (or descending) order, i.e. rank them, then the middle of the ranked data is known as the median value or median.

Example 1
80 80 80 80 80 100 100 100 100 100

100 100 100 120 120 120 120 120 120 120

120 120 120 120 120 120 150 150 150 150

The median would be 120: with 30 values, the 15th and 16th ranked values are both 120.


Example 2
Assume we have tested a set of steel specimens to
discover the tensile strength of the material and
obtained the following:

Tensile strength (MPa)

89 84 87 81 89 86 91 90 78 89 87 99 83 89

Ranking

78 81 83 84 86 87 87 89 89 89 89 90 91 99

Median

Given 14 values, the median lies between the 7th and 8th ranked values, 87 and 89.

Splitting the difference, we get the median as 88 MPa.
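
As a quick check of Example 2, the ranking and the middle pair can be reproduced with the Python standard library. The variable names are illustrative; the data are the 14 tensile strengths listed above.

```python
from statistics import median

tensile_mpa = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

ranked = sorted(tensile_mpa)
n = len(ranked)
# For even n, the two middle values sit at 1-based positions n/2 and n/2 + 1.
middle_pair = (ranked[n // 2 - 1], ranked[n // 2])

print(ranked)
print(middle_pair)           # (87, 89)
print(median(tensile_mpa))   # 88.0
```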
Properties of Medians…
• generally not affected by outliers
• does not use all information from the data

MEAN
The most widely used and often useful measure of location is the mean, $\bar{x}$.

The arithmetic mean is obtained by

$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j = \frac{1}{n}\left(x_1 + x_2 + x_3 + \dots + x_n\right)$$

where $x_j$ are the data values

For the tensile specimen (example 2)

$$\bar{x} = \frac{1}{14}\left(89 + 84 + \dots + 89\right) = \frac{611}{7} \approx 87.3 \text{ MPa}$$
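
The arithmetic for the tensile mean can be verified with a short sketch. `Fraction` is used here only to show where the 611/7 comes from; the variable names are illustrative.

```python
from fractions import Fraction
from statistics import fmean

tensile_mpa = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

total = sum(tensile_mpa)                          # sum of all 14 values
exact_mean = Fraction(total, len(tensile_mpa))    # 1222/14 reduces to 611/7

print(total)                           # 1222
print(exact_mean)                      # 611/7
print(round(fmean(tensile_mpa), 1))    # 87.3
```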

For the saw blade (example 1)

$$\bar{x} = \frac{1}{30}\left(80 + 120 + 100 + \dots + 100\right) = \frac{3360}{30} = 112 \text{ mm}$$

The population mean (μ) is sometimes denoted by E(x).
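
Because Example 1 contains only four distinct values, its mean can also be computed from value counts. The sketch below assumes the counts read off the Example 1 listing (five 80s, eight 100s, thirteen 120s, four 150s); the dictionary name is illustrative.

```python
# Value -> number of occurrences, counted from the Example 1 listing.
example1_counts = {80: 5, 100: 8, 120: 13, 150: 4}

n = sum(example1_counts.values())
total = sum(value * count for value, count in example1_counts.items())
mean_mm = total / n

print(n)        # 30
print(total)    # 3360
print(mean_mm)  # 112.0
```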

Properties of Means…
• uses all data/scores
• affected by outliers (extreme values)
• best computed for bell-shaped distributions
– not as good if bimodal or heavily skewed

Means or Medians?
• If a distribution is symmetric, mean = median
• If a distribution is skewed, the mean is in the direction of the skew from the median
– If skewed to the right, the mean is higher than the median
– If skewed to the left, the mean is lower than the median
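
The relationship between skew, mean, and median can be seen on two small made-up samples. The data below are hypothetical and chosen only so that one sample has a long right tail and the other a long left tail.

```python
from statistics import mean, median

# Hypothetical samples: one extreme value pulls the tail right or left.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]

print(mean(right_skewed), median(right_skewed))   # 4.75 3.0
print(mean(left_skewed), median(left_skewed))     # 16.375 18.5
```

In both samples the median barely reacts to the extreme value, while the mean is pulled toward it.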
skewness

Examples
• What are the mode and median of:
{4, 6, 12, 7, 10, 3, 4}
• What’s the median of:
{4, 6, 12, 10, 3, 4}
• What’s the mean, median, and mode for
{2,2,7,8,3,2,11,13,9}
• Find the mean, median, and mode for
{7, 6, 10, 7, 5, 9, 3, 7, 5, 13}
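
One way to check hand-worked answers to these exercises is a short script using the Python standard library (`multimode` needs Python 3.8 or later). The helper name `summarize` is hypothetical.

```python
from statistics import fmean, median, multimode


def summarize(data):
    """Return (mean rounded to 2 d.p., median, list of modes)."""
    return round(fmean(data), 2), median(data), multimode(data)


exercises = [
    [4, 6, 12, 7, 10, 3, 4],
    [4, 6, 12, 10, 3, 4],
    [2, 2, 7, 8, 3, 2, 11, 13, 9],
    [7, 6, 10, 7, 5, 9, 3, 7, 5, 13],
]

for data in exercises:
    print(sorted(data), summarize(data))
```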

MEASURES OF DISPERSION
RANGE
This is simply the largest value minus the smallest
value of the variable. It gives a measure of the
overall spread of the data.

Range = largest value − smallest value

The range for the tensile test example is

99 − 78 = 21 MPa
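
The range calculation can be reproduced in a couple of lines; the variable names are illustrative.

```python
tensile_mpa = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

# Range = largest value minus smallest value.
data_range = max(tensile_mpa) - min(tensile_mpa)
print(data_range)   # 21
```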
VARIANCE

The most widely used and useful measures of sample spread are the variance and the standard deviation.

Variance is given by:

$$s^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2$$

where $\bar{x}$ = sample mean,
$x_j$ = individual variable value in the set, and
$s^2$ = variance
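
The variance formula translates directly into code. The sketch below is a minimal implementation of the definition with the n − 1 divisor, compared against `statistics.variance`; the function name and the sample data are hypothetical.

```python
from statistics import variance


def sample_variance(values):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data))   # 32 / 7
print(variance(data))          # standard-library result for comparison
```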

A more refined way of measuring spread is by use of the standard deviation, $s$:

$$s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2} = \sqrt{s^2}$$

Example
For the steel tensile test example:

$$s^2 = \frac{1}{13}\left[\left(89 - \frac{611}{7}\right)^2 + \left(84 - \frac{611}{7}\right)^2 + \dots + \left(89 - \frac{611}{7}\right)^2\right] = \frac{176}{7} \approx 25.14 \text{ MPa}^2$$

and $s = \sqrt{25.14} \approx 5.014 \text{ MPa}$
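
These figures can be checked with the standard library, which uses the same n − 1 divisor for `variance` and `stdev`. The variable names are illustrative.

```python
from statistics import stdev, variance

tensile_mpa = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

s_squared = variance(tensile_mpa)   # sample variance, in MPa^2
s = stdev(tensile_mpa)              # sample standard deviation, in MPa

print(round(s_squared, 2))   # 25.14
print(round(s, 3))           # 5.014
```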

What does this mean?

Example
The summary of test results of the weight of bottles collected during a period of one week is given below.
The specification for the weight of a bottle is 33.5 ± 1 g.

Class interval (g)   Frequency
32.05-32.25          6
32.25-32.45          13
32.45-32.65          14
32.65-32.85          16
32.85-33.05          12
33.05-33.25          8
33.25-33.45          1

1. Calculate the mean and standard deviation.

2. Comment on the process.
Class midpoint x   Frequency f   fx        fx²
32.15              6             192.9     6201.73
32.35              13            420.55    13604.79
32.55              14            455.7     14833.04
32.75              16            524       17161
32.95              12            395.4     13028.43
33.15              8             265.2     8791.38
33.35              1             33.35     1112.22
Total              70            2287.1    74732.6

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2287.1}{70} \approx 32.67 \text{ g}$$

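$$s^2 = \frac{\sum fx^2 - \frac{\left(\sum fx\right)^2}{n}}{n-1} = \frac{74732.6 - 74726.09}{70 - 1} \approx 0.09 \text{ g}^2$$

$$s = \sqrt{0.09} \approx 0.3 \text{ g}$$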
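
The grouped-data calculation can be reproduced with a short sketch that treats every bottle in a class as sitting at the class midpoint. The comparison with the lower specification limit at the end is an addition not worked in the slides and is offered only as a rough starting point for commenting on the process; all names are illustrative.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each class, in grams.
classes = [
    (32.05, 32.25, 6),
    (32.25, 32.45, 13),
    (32.45, 32.65, 14),
    (32.65, 32.85, 16),
    (32.85, 33.05, 12),
    (33.05, 33.25, 8),
    (33.25, 33.45, 1),
]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean_g = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = sqrt(variance)

print(n, round(sum_fx, 1), round(sum_fx2, 1))
print(round(mean_g, 2), round(variance, 2), round(std_dev, 2))

# Assumption for illustration: lower specification limit = 33.5 - 1 g.
lower_limit = 33.5 - 1
# Count bottles in classes that lie entirely below the lower limit.
below_lower_limit = sum(f for _, hi, f in classes if hi <= lower_limit)
print(below_lower_limit)
```

Carrying more decimal places than the slide gives a standard deviation of about 0.31 g, consistent with the rounded value of 0.3 g.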
