
Measures of Variability

I. Range

The range for a set of data items is the difference between the largest and smallest values. Although the range is the easiest of the numerical measures of variability to compute, it is not widely used because it is based on only two of the items in the data set and thus is influenced too much by extreme data values.

II. Interquartile Range

A form of the range that avoids the dependence on extreme values in the data set is the interquartile range (IQR), or Q-spread. This descriptive measure of variability is simply the difference between the third quartile ($Q_3$), or 75%-tile data item, and the first quartile ($Q_1$), or 25%-tile data item. In effect, it shows the range for the middle 50% of the data and, as such, is not affected by the extreme values in the data set.

To calculate $Q_3$, let $i = \frac{3}{4}N$, where N is the number of data items. If i is not an integer, then the next integer greater than i denotes the position of the 75%-tile; if i is an integer, then the 75%-tile is the average of the data values in positions i and i + 1. Similarly, to calculate $Q_1$, let $i = \frac{1}{4}N$ and follow the same guidelines as above.

Example 1: Given the following data: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29. Find the IQR.

N = 10. $i = \frac{3}{4}(10) = 7.5$, so $Q_3$ is the 8th data item: $Q_3 = 19$. Next, $i = \frac{1}{4}(10) = 2.5$, so $Q_1$ is the 3rd data item: $Q_1 = 5$. Therefore, IQR = 19 - 5 = 14.

Example 2: Given the following data: 2, 3, 5, 7, 11, 13, 17, 19. Find the IQR.

N = 8. $i = \frac{3}{4}(8) = 6$, so $Q_3$ is the average of the data values in the 6th and 7th positions: $Q_3 = \frac{13 + 17}{2} = 15$. Next, $i = \frac{1}{4}(8) = 2$, so $Q_1$ is the average of the values in the 2nd and 3rd positions: $Q_1 = \frac{3 + 5}{2} = 4$. Therefore, IQR = 15 - 4 = 11.
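The quartile-position rule used in the two examples can be sketched in Python. This is only an illustration of the handout's rule: the function names `quartile` and `iqr` are made up for this sketch, the data is assumed to be sortable numbers, and statistical software often uses slightly different quartile conventions.

```python
def quartile(sorted_data, num, den):
    """Quartile by the position rule i = (num/den) * N, with 1-based positions.

    Hypothetical helper for illustration; expects data already sorted.
    """
    n = len(sorted_data)
    whole, remainder = divmod(num * n, den)
    if remainder:
        # i is not an integer: use the next position above i.
        # Position whole + 1 (1-based) is index `whole` (0-based).
        return sorted_data[whole]
    # i is an integer: average the items in positions i and i + 1.
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2


def iqr(data):
    """Interquartile range Q3 - Q1 using the rule above."""
    ordered = sorted(data)
    return quartile(ordered, 3, 4) - quartile(ordered, 1, 4)


example_1 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
example_2 = [2, 3, 5, 7, 11, 13, 17, 19]
print(iqr(example_1))  # 14
print(iqr(example_2))  # 11.0
```

Working with the integer quotient and remainder of 3N/4 and N/4 avoids any floating-point test for "is i an integer".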

III. Average Absolute Deviation from the Mean

Obviously, there are limitations in using range or interquartile range as measures of variability. It would seem reasonable that any useful measure of variability should measure the spread around the mean, since the mean is the "balance point" of a distribution. If you find the difference between each data item and the mean, you will get negative values for items that are less than the mean and positive values for items greater than the mean. If you then sum up all of these differences, you will get zero; this illustrates a special property of the mean. However, by taking the absolute value of each difference, you will get the distance of each item from the mean, and the sum of these distances would measure the total spread around the mean. If you were to include more data items, equally spread around the mean, you would increase the total of the distances even though the new distribution might be less variable. Therefore, it is important to divide the total absolute deviation by the number of data items; this will give an average absolute deviation from the mean.

$$\text{Average Absolute Deviation} = \frac{\sum |X - \bar{X}|}{N}$$
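As a small illustration of the two facts above (signed differences from the mean sum to zero, while absolute differences give a usable average distance), here is a Python sketch. The data set and the function name `average_absolute_deviation` are assumptions made for this example only.

```python
def average_absolute_deviation(data, center=None):
    """Average distance of the data items from `center` (the mean by default)."""
    n = len(data)
    if center is None:
        center = sum(data) / n
    return sum(abs(x - center) for x in data) / n


data = [2, 4, 6, 8, 10]          # illustrative data, mean = 6
mean = sum(data) / len(data)

# Signed differences cancel out: -4, -2, 0, 2, 4
signed_differences = [x - mean for x in data]
print(sum(signed_differences))            # 0.0

# Absolute differences do not cancel: (4 + 2 + 0 + 2 + 4) / 5
print(average_absolute_deviation(data))   # 2.4
```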

This average absolute deviation gives the average distance of any data item from the mean and thus is a good measure of spread.

IV. Standard Deviation

If you were to calculate the average absolute deviation of a distribution using a value other than the mean, you could possibly get a smaller average absolute deviation. This result is one of the reasons that the average absolute deviation is not the best measure of variability. Instead, calculate the average of the squared differences from the mean; this is the variance of a distribution. If you were to calculate the average of the squared differences of a distribution by using a value other than the mean, you would always get a larger value. The mean is the one number that minimizes the average of the squared differences in a distribution.

$$\text{Variance} = \sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$
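The two claims in this paragraph can be checked numerically. The sketch below uses a made-up data set and hypothetical helper names; it compares the mean against the median as the "center", then searches a grid of candidate centers for the one giving the smallest average squared difference.

```python
import statistics


def mean_absolute_difference(data, center):
    """Average of |X - center| over the data."""
    return sum(abs(x - center) for x in data) / len(data)


def mean_squared_difference(data, center):
    """Average of (X - center)^2 over the data."""
    return sum((x - center) ** 2 for x in data) / len(data)


data = [1, 2, 3, 4, 10]            # illustrative data: mean 4, median 3
mean = sum(data) / len(data)
median = statistics.median(data)

# A value other than the mean can give a smaller average absolute deviation.
print(mean_absolute_difference(data, mean))     # about 2.4
print(mean_absolute_difference(data, median))   # about 2.2

# For squared differences, moving away from the mean only makes things larger.
print(mean_squared_difference(data, mean))      # 10.0
print(mean_squared_difference(data, median))    # 11.0

# Grid search over candidate centers 0.0, 0.1, ..., 10.0
candidates = [c / 10 for c in range(0, 101)]
best_center = min(candidates, key=lambda c: mean_squared_difference(data, c))
print(best_center)                              # 4.0, the mean
```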

There are still two slight inconveniences in using variance as our measure of variability. First, variance does not give an estimate of the distance of a typical data item from the mean; it is too big. Second, if the data items have a unit of measurement associated with them, then the variance would not have the same unit of measurement; it would have square units. By taking the square root of variance, we get standard deviation, which is the measure of variability that we want.

$$\text{Standard Deviation} = \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

The standard deviation can be calculated in an alternative way.

$$\text{Standard Deviation} = \sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$$
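A quick numerical check that the two ways of calculating the standard deviation agree is sketched below. The data set and the function names `sd_from_definition` and `sd_alternative` are illustrative assumptions, not part of the handout.

```python
import math


def sd_from_definition(data):
    """Square root of the average squared difference from the mean."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / n)


def sd_alternative(data):
    """Square root of (average of the squares minus the square of the mean)."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum(x * x for x in data) / n - mean ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data, mean = 5
print(sd_from_definition(data))     # 2.0
print(sd_alternative(data))         # 2.0
```

With data that has a very large mean and a small spread, the alternative formula can lose precision in floating-point arithmetic because it subtracts two large, nearly equal numbers; the definition is the safer choice in that case.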

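Example: Given the following histogram, estimate the standard deviation.

[Histogram: horizontal axis "Number of cigarettes"; vertical axis "% per cigarette".]

Class interval    Area of block    Height (% per cigarette)
0 to 10           10%              1
10 to 20          30%              3
20 to 40          40%              2
40 to 80          20%              0.5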

Recall that the mean of a histogram can be determined by calculating a "weighted" average using the midpoints of the class intervals and the areas of the blocks. Thus,

$$\bar{X} = .1(5) + .3(15) + .4(30) + .2(60) = .5 + 4.5 + 12 + 12 = 29 \text{ cigarettes.}$$

The standard deviation of a histogram can also be calculated using the midpoints of the class intervals, the areas of the blocks, and the "weighted" average. Using the first formula, we get:

$$SD = \sigma = \sqrt{\frac{.1(5 - 29)^2 + .3(15 - 29)^2 + .4(30 - 29)^2 + .2(60 - 29)^2}{.1 + .3 + .4 + .2}} \approx 17.6 \text{ cig}$$

Using the second formula, we get:

$$SD = \sigma = \sqrt{\frac{.1(5^2) + .3(15^2) + .4(30^2) + .2(60^2)}{.1 + .3 + .4 + .2} - 29^2} \approx 17.6 \text{ cig}$$
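The histogram estimate can be reproduced with a few lines of Python. This is a sketch that takes the class midpoints (5, 15, 30, 60) and block areas (.1, .3, .4, .2) from the worked example as given; the variable names are chosen for this illustration.

```python
import math

midpoints = [5, 15, 30, 60]        # midpoints of the class intervals
areas = [0.1, 0.3, 0.4, 0.2]       # areas of the blocks, used as weights

total_area = sum(areas)

# Weighted average of the midpoints
mean = sum(a * m for a, m in zip(areas, midpoints)) / total_area

# First formula: weighted average of squared differences from the mean
variance_first = sum(a * (m - mean) ** 2 for a, m in zip(areas, midpoints)) / total_area

# Second formula: weighted average of the squares minus the square of the mean
variance_second = sum(a * m ** 2 for a, m in zip(areas, midpoints)) / total_area - mean ** 2

print(round(mean, 1))                         # 29.0
print(round(math.sqrt(variance_first), 1))    # 17.6
print(round(math.sqrt(variance_second), 1))   # 17.6
```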

Important Note:

Some textbooks will give the following formulas for variance and standard deviation:

$$\text{Variance} = s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{\sum X^2 - N\bar{X}^2}{N - 1}$$

$$\text{Standard Deviation} = s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}} = \sqrt{\frac{\sum X^2 - N\bar{X}^2}{N - 1}}$$

These formulas should be used when N data items are taken as a sample from a larger population in which the variance and standard deviation of that population are unknown. These formulas give good approximations of the variance and standard deviation of the population.
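The difference between dividing by N and dividing by N - 1 can be seen side by side in the sketch below. The data set and the `variance` helper are assumptions for illustration; the cross-check uses Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N - 1.

```python
import math
import statistics


def variance(data, sample=False):
    """Divide the sum of squared differences by N, or by N - 1 if sample=True."""
    n = len(data)
    mean = sum(data) / n
    sum_of_squares = sum((x - mean) ** 2 for x in data)
    return sum_of_squares / (n - 1 if sample else n)


data = [2, 4, 4, 4, 5, 5, 7, 9]    # illustrative data

population_sd = math.sqrt(variance(data))              # divides by N
sample_sd = math.sqrt(variance(data, sample=True))     # divides by N - 1

print(population_sd)   # 2.0
print(sample_sd)       # slightly larger, since the divisor is smaller

# Cross-check against the standard library
print(math.isclose(population_sd, statistics.pstdev(data)))   # True
print(math.isclose(sample_sd, statistics.stdev(data)))        # True
```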

Practice Sheet - Measures of Variability

I. The following are 25 final averages in a math class:

46 49 53 60 61 64 66 66 67 71 72 74 75 76 79 79 79 80 83 88 89 91 94 95 98

(1) What is the range?
(2) What is the interquartile range?

II. Given the following data: 5, 7, 11, 12, 13, 18.

(1) What is the mean?
(2) What is the average absolute deviation from the mean?
(3) What is the median?
(4) What is the average absolute deviation from the median?
(5) What is the standard deviation?
(6) Add 8 to each item. What is the new SD?
(7) Subtract 7 from each item. What is the new SD?
(8) Multiply each item by 7. What is the new SD?
(9) Divide each item by 5. What is the new SD?

III. In the histogram given below, the class intervals include the right endpoint, not the left:

[Histogram: horizontal axis "Income (in thousands of dollars)", marked at 0, 20, 40, 80, 100, 120; vertical axis "% per thousand dollars", marked from 0 to 1.25 in steps of 0.25.]

(1) What is the estimated mean?
(2) What is the estimated standard deviation?
(3) What is the estimated interquartile range?

IV.
               Class A    Class B
N              20         30
$\bar{X}$      70         80
$\sigma$       10         6

(1) Find $\sum X$ for class A.
(2) Find $\sum X$ for class B.
(3) Find $\sum X$ for the two classes combined.
(4) Find $\bar{X}$ for the combined classes.
(5) Find $\sum X^2$ for class A. [Hint: Use the alternative formula for SD.]
(6) Find $\sum X^2$ for class B.
(7) Find $\sum X^2$ for the combined classes.
(8) Find $\sigma$ for the combined classes.

Solution Key for Measures of Variability

I. (1) 98 - 46 = 52 (2) 83 - 66 = 17

II. (1) 11 (2) 3 1/3 (3) 11.5 (4) 3 1/3 (5) 4.2 (6) 4.2 (7) 4.2 (8) 29.4 (9) .84

III. (1) 56 (2) 26 (3) 76 - 35 = 41

IV. (1) 1400 (2) 2400 (3) 3800 (4) 76 (5) 100,000 (6) 193,080 (7) 293,080 (8) 9.25
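The part IV answers follow from rearranging the alternative SD formula: the sum of the items is N times the mean, and the sum of the squares is N times (SD squared plus mean squared). A minimal Python sketch of that reasoning is given below, assuming class A has N = 20, mean 70, SD 10 and class B has N = 30, mean 80, SD 6; the helper name `sums_from_summary` is made up for this illustration.

```python
import math


def sums_from_summary(n, mean, sd):
    """Recover the sum of X and the sum of X^2 from N, the mean, and the SD.

    Uses SD^2 = (sum of X^2) / N - mean^2, rearranged.
    """
    total = n * mean
    total_of_squares = n * (sd ** 2 + mean ** 2)
    return total, total_of_squares


sum_a, squares_a = sums_from_summary(20, 70, 10)   # class A
sum_b, squares_b = sums_from_summary(30, 80, 6)    # class B

n_combined = 20 + 30
mean_combined = (sum_a + sum_b) / n_combined
sd_combined = math.sqrt((squares_a + squares_b) / n_combined - mean_combined ** 2)

print(sum_a, sum_b, sum_a + sum_b)                       # 1400 2400 3800
print(mean_combined)                                     # 76.0
print(squares_a, squares_b, squares_a + squares_b)       # 100000 193080 293080
print(round(sd_combined, 2))                             # 9.25
```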
