
How closely a set of data clusters around its centre

Measures of Spread or Dispersion:
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance

Measures of Position (Ranking Data):
1. Percentiles
2. Quartiles
3. Z-Scores

Measures of Position

Determine the position of a value, relative to other values, in a set of data.

Measures of Position (Ranking Data):
1. Percentiles
2. Quartiles
3. Z-Scores

Quartiles are required to determine interquartile ranges.
Data must be arranged in order to determine percentiles and quartiles.

1. Percentiles

Divide a set of ordered data into 100 intervals with equal numbers of values.
k percent of the data are less than or equal to the kth percentile, Pk.
(100 − k) percent of the data are greater than or equal to the kth percentile, Pk.
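The definition above can be checked from the other direction: given a value, count what percentage of the data lies at or below it. The sketch below is only an illustration. The function name `percent_at_or_below` and the `scores` list are hypothetical, and textbooks and software use several slightly different percentile conventions.

```python
def percent_at_or_below(data, value):
    """Return the percentage of values in `data` that are <= `value`."""
    ordered = sorted(data)  # percentiles are defined on ordered data
    count = sum(1 for x in ordered if x <= value)
    return 100 * count / len(ordered)


# Hypothetical test scores, not taken from the slides.
scores = [55, 60, 62, 68, 70, 71, 75, 80, 85, 90]

# 8 of the 10 scores are <= 80, so 80 sits at about the 80th percentile.
print(percent_at_or_below(scores, 80))  # 80.0
```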

2. Quartiles

Divide a set of ordered data into four groups with equal numbers of values.
Median = Second Quartile (Q2).
The median divides the data into two equally sized groups.

3. Z-Scores

= the number of standard deviations that a datum is from the mean
Divide the deviation of a datum from the mean by the standard deviation:

$$z = \frac{x - \mu}{\sigma}$$

Values below the mean have negative z-scores, values above the mean have positive z-scores, and values equal to the mean have a z-score of zero.
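A minimal sketch of the z-score calculation, assuming the data are treated as a whole population (so the population standard deviation is used). The function name `z_score` and the data set are hypothetical; the data were chosen so that the mean is 5 and the standard deviation is 2.

```python
import statistics


def z_score(value, data):
    """Number of standard deviations that `value` lies from the mean of `data`."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # assumption: population standard deviation
    return (value - mean) / sd


# Hypothetical data: mean = 5, population standard deviation = 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, data))  # 2.0  (above the mean: positive)
print(z_score(5, data))  # 0.0  (equal to the mean: zero)
print(z_score(2, data))  # -1.5 (below the mean: negative)
```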

Implications of Z-Scores

Z-scores can be used to rank any set of data, using the standard deviation as the unit of measure.
A z-score of −0.072 indicates a value 0.072 standard deviations (approximately 7% of a standard deviation) below the mean.
A z-score of 0.46 indicates a value 0.46 standard deviations (approximately half a standard deviation) above the mean.

Measures of Spread or Dispersion

While measures of central tendency are used to estimate "normal" values of a dataset, measures of dispersion describe the spread of the data, or its variation around a central value.

Measures of Spread or Dispersion:
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance

1. Range

Simply put:

Range = Max − Min

Not always the best measure, because it depends only on the two most extreme values.
A box & whisker plot shows it graphically.

Example
Data points include: 7, 9, 12, 13, 24, 29
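For these data points, the range can be computed directly from the definition (maximum minus minimum). The variable names below are illustrative only.

```python
# Data points from the example above.
data = [7, 9, 12, 13, 24, 29]

# Range = Max - Min
data_range = max(data) - min(data)
print(data_range)  # 22
```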

2. Interquartile Range (IQR)

= the 75th percentile (Q3) − the 25th percentile (Q1)
IQR is essentially the range of the middle 50% of the data.
Because it uses only the middle 50%, IQR is largely unaffected by outliers or extreme values.

To find IQR:
1. Find the median (Q2).
2. Find the medians of the upper and lower halves (Q3 & Q1).
3. IQR is the difference between Q3 & Q1 (the middle 50% of the data).

NOTE:
A box & whisker plot shows IQR graphically.
A smaller IQR means less spread (more consistent data).
Outliers have little impact on IQR values.

IQR Examples:

Calculate the median (Q2) and the IQR for
a) 10, 14, 17, 18, 21, 25, 27, 28
b) 40, 40, 44, 47, 48, 51, 52

Solution
a) Q2 = (18 + 21)/2 = 19.5
   Q1 = (14 + 17)/2 = 15.5
   Q3 = (25 + 27)/2 = 26
   IQR = Q3 − Q1 = 26 − 15.5 = 10.5
b) Q2 = 47
   Q1 = 40
   Q3 = 51
   IQR = 51 − 40 = 11
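The two solutions can be reproduced with a short sketch of the "median of each half" method used here. The function names `quartiles` and `iqr` are hypothetical. For an odd number of values the median itself is left out of both halves, which matches solution b). Library routines such as `statistics.quantiles` interpolate differently and may give other values for Q1 and Q3.

```python
import statistics


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


def iqr(data):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = quartiles(data)
    return q3 - q1


a = [10, 14, 17, 18, 21, 25, 27, 28]
b = [40, 40, 44, 47, 48, 51, 52]

print(quartiles(a), iqr(a))  # (15.5, 19.5, 26.0) 10.5
print(quartiles(b), iqr(b))  # (40, 47, 51) 11
```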

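3. Standard Deviation

Mathematicians' choice for measuring the spread of data
= the square root of the average of the squared differences between each data point and the mean

Population:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

Grouped Population:
$$\sigma = \sqrt{\frac{\sum_{i} f_i (x_i - \mu)^2}{n}}$$

Sample:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Grouped Sample:
$$s = \sqrt{\frac{\sum_{i} f_i (x_i - \bar{x})^2}{n - 1}}$$

Here n is the number of data values; for grouped data the sum runs over the intervals and n = Σ f_i.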

Represents the square root of the average squared distance of the data from the mean.
If data is clustered about the mean, there is little dispersion and a low standard deviation.
If data is spread out (widely scattered), the standard deviation is high.
Outliers have a larger impact on the standard deviation, since every piece of data is considered.
Use the mid-value/mid-point of each interval as x for grouped data.
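The population, sample, and grouped versions of the calculation can be sketched as follows. The function names `std_dev` and `grouped_std_dev`, and the midpoints and frequencies, are hypothetical; the ungrouped data are the six points from the range example. The only difference between the population and sample versions is the divisor (n versus n − 1).

```python
import math
import statistics


def std_dev(data, sample=False):
    """Standard deviation; divides by n (population) or n - 1 (sample)."""
    n = len(data)
    mean = sum(data) / n
    squared_diffs = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_diffs / (n - 1 if sample else n))


def grouped_std_dev(midpoints, freqs, sample=False):
    """Standard deviation for grouped data, using interval midpoints as x."""
    n = sum(freqs)  # total number of data values
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    squared_diffs = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return math.sqrt(squared_diffs / (n - 1 if sample else n))


data = [7, 9, 12, 13, 24, 29]
print(std_dev(data))               # population standard deviation
print(std_dev(data, sample=True))  # sample standard deviation (slightly larger)

# Hypothetical grouped data: midpoints 5, 15, 25 with frequencies 2, 3, 1.
print(grouped_std_dev([5, 15, 25], [2, 3, 1]))
```

The results for ungrouped data agree with `statistics.pstdev` and `statistics.stdev` from the Python standard library.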

4. Variance (σ²)

= another measure of dispersion/spread
Equal to the square of the standard deviation
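The relationship between variance and standard deviation can be checked numerically. This sketch uses the population versions from Python's `statistics` module and a hypothetical data set.

```python
import math
import statistics

# Hypothetical data: mean = 5, population standard deviation = 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
sd = statistics.pstdev(data)           # population standard deviation

# Variance equals the square of the standard deviation.
print(math.isclose(sd ** 2, variance))  # True
```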