Range and quartiles

• Quartiles
• Interquartile range
• Semi-quartile range

The range is very easy to calculate because it is simply the difference between the largest and the smallest observed values in a data set. Thus the range, including any outliers, is the actual spread of the data.

Range = difference between highest and lowest observed values

A great deal of information is ignored when computing the range, since only the largest and smallest data values are considered. The range value of a data set is greatly influenced by the presence of just one unusually large or small value (outlier).

The range can be expressed as an interval such as 4–10, where 4 is the lowest value and 10 is the highest. Often, it is expressed as an interval width. For example, the range of 4–10 can also be expressed as a range of 6. The latter convention will be used throughout this chapter.

The disadvantage of using the range is that it does not measure the spread of the majority of values in a data set; it only measures the spread between the highest and lowest values. As a result, other measures are required in order to give a better picture of the data spread. The range is an informative tool used as a supplement to other measures such as the standard deviation or semi-interquartile range, but it should rarely be used as the only measure of spread.
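The sensitivity of the range to a single outlier can be illustrated with a minimal Python sketch. The function name `value_range` and the sample values are assumptions chosen for illustration (the first list simply spans the 4–10 interval mentioned above); they do not come from the chapter.

```python
def value_range(values):
    """Return the range as an interval width: largest value minus smallest."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical data spanning the interval 4-10.
print(value_range([4, 5, 7, 10]))       # 6

# Adding one unusually large value changes the range drastically,
# even though the other values are unchanged.
print(value_range([4, 5, 7, 10, 100]))  # 96
```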

Quartiles

The median divides the data into two equal sets. For more information on the median, refer to the chapter on Measures of central tendency.

• The lower quartile is the value of the middle of the first set, where 25% of the values are smaller than Q1 and 75% are larger. This first quartile takes the notation Q1.
• The upper quartile is the value of the middle of the second set, where 75% of the values are smaller than Q3 and 25% are larger. This third quartile takes the notation Q3.

It should be noted that the median takes the notation Q2, the second quartile.

Example 1 – Upper and lower quartiles
• Data: 6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36
• Ordered data: 6, 7, 15, 36, 39, 41, 41, 43, 43, 47, 49
• Median: 41
• Upper quartile: 43
• Lower quartile: 15
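The quartiles in Example 1 can be reproduced with a short Python sketch. It assumes the "median of each half" convention this chapter uses, in which the median itself is left out of both halves when the number of observations is odd. The function name `quartiles` is an illustrative choice, not part of the chapter.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower = data[:half]      # values below the median position
    upper = data[n - half:]  # values above the median position
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles([6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36])
print(q1, q2, q3)  # 15 41 43
```

Library routines such as `statistics.quantiles` or `numpy.percentile` interpolate between observations by default, so they may return slightly different quartiles for the same data.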

Interquartile range

The interquartile range is another range used as a measure of the spread. The difference between the upper and lower quartiles (Q3 – Q1), which is called the interquartile range, also indicates the dispersion of a data set. The interquartile range spans 50% of a data set, and eliminates the influence of outliers because, in effect, the highest and lowest quarters are removed.

Interquartile range = difference between upper quartile (Q3) and lower quartile (Q1)

Example 2 – Range and quartiles

A year ago, Angela began working at a computer store. Her supervisor asked her to keep a record of the number of sales she made each month. The following data set is a list of her sales for the last 12 months:

34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37

Use Angela's sales records to find:
a. the median
b. the range
c. the upper and lower quartiles
d. the interquartile range

Answers

a. The values in ascending order are: 1, 11, 15, 19, 20, 24, 28, 34, 37, 47, 50, 57.
Position of the median = (12 + 1) ÷ 2 = 6.5th value
Median = (sixth + seventh observations) ÷ 2
= (24 + 28) ÷ 2
= 26

b. Range = difference between the highest and lowest values
= 57 – 1
= 56

c. Lower quartile = value of middle of first half of data
Q1 = the median of 1, 11, 15, 19, 20, 24
= (third + fourth observations) ÷ 2
= (15 + 19) ÷ 2
= 17

Upper quartile = value of middle of second half of data
Q3 = the median of 28, 34, 37, 47, 50, 57
= (third + fourth observations) ÷ 2
= (37 + 47) ÷ 2
= 42

d. Interquartile range = Q3 – Q1
= 42 – 17
= 25

These results can be summarized as follows: median (Q2) = 26, range = 56, Q1 = 17, Q3 = 42, interquartile range = 25.

Note: This example has an even number of observations. The median, Q2, lies between the centre of two observations (24 and 28), so the calculation of Q1 includes the observation 24 as it is below the value of Q2. Similarly, 28 is also included in the calculation of Q3 as it is above the value of Q2.

Consider an odd number of observations such as 1, 2, 3, 4, 5, 6, 7. Here the value of Q2 is 4. As the location of the median is right on the fourth observation, this value is not included in calculating Q1 and Q3, as we are interested only in the data above and below Q2. In this example, Q1 = 2 and Q3 = 6.

Semi-quartile range

The semi-quartile range is another measure of spread. It is calculated as one half the difference between the 75th percentile (often called Q3) and the 25th percentile (Q1). The formula for semi-quartile range is: (Q3 – Q1) ÷ 2.

Since half the values in a distribution lie between Q3 and Q1, the semi-quartile range is one-half the distance needed to cover half the values. In a symmetric distribution, an interval stretching from one semi-quartile range below the median to one semi-quartile range above the median will contain one-half of the values. However, this will not be true for a skewed distribution.

The semi-quartile range is hardly affected by higher values, so it is a good measure of spread to use for skewed distributions, but it is rarely used for data sets that have normal distributions. In the case of a data set with a normal distribution, the standard deviation is used instead.
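As a rough check of Example 2 and of the semi-quartile formula, the following Python sketch computes every measure discussed in this chapter for Angela's sales data. The function name `spread_summary` and the dictionary keys are assumptions made for illustration. The quartiles again follow the chapter's median-of-halves convention. The semi-quartile value for Angela's data is not stated in the example; it is derived here from the formula (Q3 – Q1) ÷ 2.

```python
from statistics import median


def spread_summary(values):
    """Return median, range, Q1, Q3, IQR and semi-quartile range."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    q1 = median(data[:half])      # median of the lower half
    q3 = median(data[n - half:])  # median of the upper half
    iqr = q3 - q1
    return {
        "median": median(data),
        "range": data[-1] - data[0],
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "semi_quartile_range": iqr / 2,
    }


# Angela's twelve monthly sales figures from Example 2.
sales = [1, 11, 15, 19, 20, 24, 28, 34, 37, 47, 50, 57]
summary = spread_summary(sales)
print(summary["median"], summary["range"])        # 26.0 56
print(summary["q1"], summary["q3"])               # 17.0 42.0
print(summary["iqr"], summary["semi_quartile_range"])  # 25.0 12.5
```

The interquartile range of 25 matches the worked answer, and the semi-quartile range is half of it, 12.5.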