CHAPTER 18
Statistics
In previous books in this series, we have looked at the measures of central
tendency, such as the mean and the median.
In this chapter, we discuss two measurements of spread – the interquartile range
and standard deviation. The representation of numerical data by boxplots is
also introduced.
In our study of statistics up to now, we have often associated one measurement
with an item. For example, the height of each person in a class, the number of
possessions obtained by a player in a football match or the number of marks
obtained by a student in a test.
In the last two sections of this chapter, we look at associating a pair of numbers
with an item, for example, the height and weight of a person or the age and
salary of an employee. This is called bivariate data.
When a measurement is collected or recorded at successive intervals of time, it
is referred to as time-series data. This type of bivariate data is also introduced in
this chapter.
ICE-EM Mathematics 10 3ed ISBN 978-1-108-40434-1 © The University of Melbourne / AMSI 2017 Cambridge University Press
Photocopying is restricted under law and this material must not be transferred to another party.
18A The median and the interquartile range
The median has been introduced and discussed in earlier books in this series. We review it here,
because it is the measure of central tendency used when working with the interquartile range as a
measurement of spread.
Median
We often see the median value being used to describe the housing market in a city. The median is the
‘middle value’ when all values are arranged in numerical order.
Here are 13 numbers in numerical order:
2, 2, 3, 3, 3, 4, 5, 11, 13, 18, 18, 19, 21
This data set has an odd number of values. The middle value is 5, since it has the same number of
values on either side of it. Hence, the median of this data set is 5.
Here is a set of 12 numbers, arranged in numerical order:
1, 3, 4, 4, 5, 7, 9, 11, 13, 13, 19, 21
This data set has an even number of values. The middle values are 7 and 9. We take the average of
7 and 9 to calculate the median.
$\text{Median} = \frac{7 + 9}{2} = 8$
Hence, the median of this data set is 8, even though this value does not occur in the data set.
Median
• When a data set has an odd number of values and they are arranged in numerical order,
the median is the middle value.
• When a data set has an even number of values and they are arranged in numerical order,
the median is the average of the two middle values.
• When a data set with n items is arranged in numerical order, the median lies in the $\left(\frac{n + 1}{2}\right)$th position.
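The position rule can be checked with a short Python sketch applied to the two data sets above. The function name `median_by_position` is illustrative and is not part of the text.

```python
def median_by_position(values):
    """Return the median using the (n + 1)/2 position rule."""
    ordered = sorted(values)             # arrange in numerical order
    n = len(ordered)
    position = (n + 1) / 2               # 1-based position of the median
    if position.is_integer():            # odd n: a single middle value
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]   # even n: the two middle values
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_set = [2, 2, 3, 3, 3, 4, 5, 11, 13, 18, 18, 19, 21]
even_set = [1, 3, 4, 4, 5, 7, 9, 11, 13, 13, 19, 21]

print(median_by_position(odd_set))   # 5
print(median_by_position(even_set))  # 8.0
```

For 13 values the position is 7, so the 7th value is returned. For 12 values the position is 6.5, so the 6th and 7th values are averaged.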
Example 1
Solution
a To locate the median, first put the values in numerical order. This gives:
29 33 35 39 43 45 53
Median = 39
Interquartile range = 25 − 14
= 11
Thus, the middle 50% of Olivia’s times have a spread of 11 minutes.
Notice that the interquartile range is unaffected by the lower quarter and the upper quarter of the
values. Hence, the large sizes of two of Olivia’s times, when she left the game running to eat dinner
and to sleep, do not affect the interquartile range.
The calculations begin slightly differently when there is an even number of results. For example,
suppose that Olivia played one more game, which she solved in 22 minutes.
There are now 12 results to arrange in ascending order:
8, 12, 14, 14, 16, 18, 19, 19, 22, 25, 78, 523
Step 1: Since there is an even number of results, we divide them into two equal groups of 6. (The median lies ‘between’ the 6th and 7th member of the ordered data set. That is, in the $\frac{12 + 1}{2}$ = 6.5th position.)
8, 12, 14, 14, 16, 18 and 19, 19, 22, 25, 78, 523
Step 2: The lower quartile is now $\frac{14 + 14}{2} = 14$.
Step 3: The upper quartile is now $\frac{22 + 25}{2} = 23\frac{1}{2}$.
Step 4: The interquartile range is now $23\frac{1}{2} − 14 = 9\frac{1}{2}$.
In this case, the middle 50% of Olivia’s times have a spread of $9\frac{1}{2}$ minutes.
The minimum, maximum, median and the two quartiles are sometimes called the five-number
summary. Sometimes the lower quartile is called the first quartile, because it marks the first quarter
of the ordered data. The median is then the second quartile, although this term is seldom used. The
upper quartile is called the third quartile.
We denote the lower quartile by Q1 and the upper quartile by Q3. We sometimes use the abbreviation
IQR for the interquartile range.
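The quartile steps can be sketched in Python as follows. This is only a sketch of the split-at-the-median method used in this section; the function name `quartiles` is illustrative, and the list of 11 times is inferred by removing the extra 22-minute game from the 12 results above. Calculators and software sometimes use slightly different quartile conventions, so their answers may differ.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) by splitting the ordered data at the median."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # values below the median position
    # For odd n the median itself belongs to neither group.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

times_11 = [8, 12, 14, 14, 16, 18, 19, 19, 25, 78, 523]
times_12 = times_11 + [22]

for times in (times_11, times_12):
    q1, med, q3 = quartiles(times)
    print(q1, med, q3, "IQR =", q3 - q1)
```

The first line of output gives an interquartile range of 11 and the second gives 9.5, matching the two calculations for Olivia's times.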
Example 2
Solution
There are 11 data values. The median is in the $\frac{11 + 1}{2}$ = 6th position. The 6th value is 23, so the median is 23.
The lower group contains 5 values. The 3rd value is 20. So the lower quartile is 20.
Similarly, the upper quartile is 25.
Thus, interquartile range = 25 − 20 = 5
That is, the middle 50% of data values have a spread of 5.
Example 3
Solution
There are 22 data values. First locate the median to divide the data into two equal groups.
The median lies in the $\frac{22 + 1}{2}$ = 11.5th position of the ordered set. The 11th value is 36 and the 12th value is 37, so the median is 36.5.
The lower group contains 11 values. The 6th value is 30. So the lower quartile is 30.
Similarly, the upper quartile is 47.
Measures of spread
• The range is the difference between the highest and lowest values in a data set.
• The interquartile range measures the spread of the middle 50% of the data in an ordered
data set.
• To calculate the interquartile range, find the difference between the upper quartile Q3
and the lower quartile Q1.
Exercise 18A
Example 2 1 Find the range and interquartile range of each data set.
a 7 5 15 10 13 3 20 7 15 b 8 5 1 7 5 7 8 10 5 7
c 40646794 d 3 13 8 11 1 18 5 13
Example 3 2 Locate the median and the quartiles for each of the following stem-and-leaf plots. State the
interquartile range for each data set.
a
2 | 0 1 2 4 4 7 7 9
3 | 1 1 1 2 2 4 6 6 7 8 9
4 | 0 1 2 2 4
3 | 2 means 32
b
5 | 4 4 6 7 7 9
6 | 1 4 4 4 6 7 8
7 | 1 5 7 8 9 9
8 | 0 1 1 2 3 4 6
9 | 1 3 4 5
6 | 1 means 61
3 Find the mean, the mode, the median and the interquartile range of this data set.
Value 0 1 2 3 4 5 6 7 8 9 10
Frequency 5 2 0 7 1 8 4 6 0 2 11
4 Complete the following table for the positions of the median and the quartiles for data sets
of 100 and 101 items. (Note: A position of 8.5 means it is between the eighth and ninth data
values).
7 The following figures are the amounts a family spent on food each week for 13 weeks.
$148 $143 $152 $149 $158
$155 $147 $152 $158 $139
$143 $150 $141
a Find the median, upper quartile and lower quartile.
b Find the interquartile range of the amounts spent.
8 Write down two sets of seven whole numbers with minimum data value 3, lower quartile 5,
median 10, upper quartile 12 and maximum data value 13.
9 The median is always between the two quartiles. Is the mean always between the two
quartiles? If not, give an example of seven whole numbers where the mean is above the
upper quartile and an example where the mean is below the lower quartile.
10 a For a data set, the minimum value is 8 and the range is 27. Find the maximum value.
b For a particular data set, the upper quartile is 25.6, and the interquartile range is 11.9. Find
the lower quartile.
18B Boxplots
A useful way of displaying the maximum value and the minimum value, the upper and lower
quartiles and the median of a data set (the five-number summary) is a boxplot.
[Figure: a boxplot drawn above a scale, with the minimum, lower quartile (Q1), median, upper quartile (Q3) and maximum labelled]
Example 4
The weights of 20 students are recorded here. The weights are given to the nearest kilogram.
48 52 54 54 55 58 58 61 62 63 63 64 65 66 66 67 69 70 72 79
a Find the median, upper quartile, lower quartile and interquartile range.
b Draw a boxplot for this data.
Solution
a There are 20 data values. Therefore, the median = $\frac{63 + 63}{2}$ = 63 kg
Divide the data into two equal groups of 10.
48 52 54 54 55 58 58 61 62 63 63 64 65 66 66 67 69 70 72 79
The lower quartile = $\frac{55 + 58}{2}$ = 56.5 kg
The upper quartile = $\frac{66 + 67}{2}$ = 66.5 kg
The interquartile range = 66.5 − 56.5 = 10 kg
b [Boxplot on a scale from 40 to 80: minimum 48 kg, lower quartile 56.5 kg, median 63 kg, upper quartile 66.5 kg, maximum 79 kg]
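The five values needed for the boxplot in Example 4 can also be found with a short Python sketch. The function name `five_number_summary` is an illustrative choice, and the quartiles are found by the split-at-the-median method of Section 18A.

```python
from statistics import median

weights = [48, 52, 54, 54, 55, 58, 58, 61, 62, 63,
           63, 64, 65, 66, 66, 67, 69, 70, 72, 79]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the median is left out of both groups.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

summary = five_number_summary(weights)
print(summary)  # (48, 56.5, 63.0, 66.5, 79)
```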
Exercise 18B
1 The boxplot below shows the price (in $) of 20 different brands of sports shirts.
[Boxplot of the prices on a scale from 10 to 50]
What is the cost of the most expensive and least expensive sports shirt?
2 The boxplot below gives information regarding the annual salaries (in thousands of dollars)
of employees in a large company.
[Boxplot of the salaries on a scale from 40 to 100]
8 In a boxplot, why is the median not always in the centre of the box?
9 Here are two boxplots drawn on the one scale.
[Boxplots of Data set A and Data set B on a scale from 10 to 50]
[Boxplot of Class B on a scale from 10 to 50]
[Boxplots of Channel A, Channel B and Channel C]
a Write down the approximate values of the median, quartiles and maximum and
minimum values for each channel.
b Which channel has the largest interquartile range?
c If the winning channel is the one with the highest rated program, which channel is the
winner? Which is second? Which is third?
d If the winning channel is the one with the largest median, rank the channels.
e Can you find a criterion that makes Channel C the winning channel?
18C Boxplots, histograms and outliers
It is common to use a form of the boxplot that is designed to illustrate any possible outliers in the
data. Outliers are unusual, or ‘freak’, values that differ greatly in magnitude from the majority of
data values.
[Figure: a boxplot with Q1, the median and Q3 labelled, and an outlier shown as a separate marker]
• Any point that is more than 1.5 IQRs away from the end of the box is classified as an outlier. That
is, if a data value is greater than Q3 + 1.5 × IQR or less than Q1 − 1.5 × IQR, it is considered to be
an outlier. An outlier is indicated by a marker, as shown in the diagram above.
• The whiskers end at the highest and lowest data values that lie within 1.5 IQRs from the ends of
the box.
The following examples look at representing data with histograms and boxplots.
Example 5
The house prices of 50 houses sold in a town over a period of two years are recorded. The
prices are in thousands of dollars.
110, 110, 120, 130, 140, 150, 150, 170, 170, 170, 180, 190, 200, 210, 210, 230, 270, 270,
290, 310, 340, 340, 340, 340, 350, 360, 360, 365, 365, 400, 400, 400, 400, 410, 430,
440, 450, 460, 460, 460, 460, 564, 678, 678, 750, 760, 904, 1320, 2350, 2350
a Find the quartiles, the median and the interquartile range.
b Calculate 1.5 × IQR.
c Name the outliers.
d Draw a histogram and boxplot of this information. The boxplot should show outliers.
e i Calculate the mean, including the outliers.
ii Calculate the mean, not including the outliers.
Solution
a The data has been given in ascending order. There are 50 data values. The median is the
mean of the 25th and 26th values.
Median = $355 000
Q1 is the median of the lower set of 25 values. This is the 13th value.
Q1 = $200 000
Q3 is the median of the upper set of 25 values.
Q3 = $460 000
IQR = $260 000
b 1.5 × IQR = 1.5 × (Q3 − Q1) = $390 000
Hence, a value is an outlier if it is greater than 460 000 + 390 000 = $850 000
or less than 200 000 − 390 000 = −$190 000. No house price is negative, so only values above $850 000 can be outliers.
c The outliers are $904 000, $1 320 000, $2 350 000 and $2 350 000.
d [Histogram and boxplot of the house prices; the horizontal axis is in thousands of dollars]
(Note: The right-hand whisker ends with the value $760 000.)
The classes are $100 000 to $199 000, $200 000 to $299 000 etc.
e i Mean with outliers = $449 300, to the nearest $100.
ii Mean without outliers = $337 800, to the nearest $100.
It could be said that the distribution has a positive skew. The left-hand whisker is short. Most
of the values lie in the interval from $100 000 to $500 000.
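Parts b, c and e of Example 5 can be checked with a short Python sketch. The prices are entered in thousands of dollars, the variable names are illustrative, and the quartiles are found by splitting the 50 ordered values into two groups of 25, as in the solution.

```python
from statistics import mean, median

prices = [110, 110, 120, 130, 140, 150, 150, 170, 170, 170,
          180, 190, 200, 210, 210, 230, 270, 270, 290, 310,
          340, 340, 340, 340, 350, 360, 360, 365, 365, 400,
          400, 400, 400, 410, 430, 440, 450, 460, 460, 460,
          460, 564, 678, 678, 750, 760, 904, 1320, 2350, 2350]

ordered = sorted(prices)
half = len(ordered) // 2          # 50 values, so two groups of 25
q1 = median(ordered[:half])
q3 = median(ordered[half:])
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr        # values below this are outliers
high_fence = q3 + 1.5 * iqr       # values above this are outliers

outliers = [p for p in ordered if p < low_fence or p > high_fence]
inside = [p for p in ordered if low_fence <= p <= high_fence]

print(q1, q3, iqr)                # 200 460 260
print(low_fence, high_fence)      # -190.0 850.0
print(outliers)                   # [904, 1320, 2350, 2350]
print(min(inside), max(inside))   # whisker ends: 110 760
print(round(mean(ordered), 1))    # 449.3
print(round(mean(inside), 1))     # 337.8
```

Removing the four outliers lowers the mean by more than $100 000, while the median and quartiles are unaffected by how large those four values are.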
Example 6
Solution
[Boxplot on a scale from 0 to 45]
Q3 + 1.5 × IQR = 34.5 + 1.5 × 12 = 52.5
Q1 − 1.5 × IQR = 22.5 − 1.5 × 12 = 4.5
Therefore, the values 0, 0 and 3 are considered to be outliers.
c [Histogram of waiting time in seconds, with classes 0–4, 5–9, 10–14, 15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49 and 50–54]
d There is a negative skew. The right-hand whisker is short. The left-hand whisker is longer,
indicating a tailing off of the data values. The values 0, 0 and 3 are outliers.
Example 7
Fifty-four lengths of wire are cut off by a machine. The resulting lengths measured in cm are
as shown:
103, 104, 105, 106, 106, 106, 107, 107, 107, 107, 107, 108, 108, 108, 108, 108, 108,
108, 108, 109, 109, 109, 109, 109, 109, 109, 109, 110, 110, 110, 110, 110, 110, 110,
110, 110, 111, 111, 111, 111, 111, 111, 111, 112, 112, 112, 112, 113, 113, 113, 113,
114, 115, 116
a Find Q1, the median, Q3 and the IQR.
b Draw a boxplot, showing outliers.
c Draw a histogram.
d Comment on the shape of the histogram and the boxplot.
Solution
c [Histogram of the lengths of the wires in cm, from 103 to 116]
d The histogram is symmetric. The whiskers on the boxplot are of equal length. The values
103 cm and 116 cm are outliers.
Exercise 18C
Example 5, 6 1 The heights, measured in centimetres, of 25 students in a class are:
170 175 133 153 164 189 143 133 167 145 150 164 169
159 177 186 173 164 177 168 142 155 153 167 166
a Find Q1, the median and Q3. b Find the interquartile range.
c Draw a boxplot, showing any outliers.
Example 7 2 The annual incomes of 30 people, given correct to the nearest $1000, are:
54 000 67 000 92 000 78 000 54 000 87 000 102 000 112 000
132 000 45 000 256 000 89 000 78 000 98 000 34 000 75 000
65 000 100 000 34 000 68 000 79 000 81 000 82 000 103 000
21 000 345 000 98 000 67 000 105 000 98 000
a Find Q1, the median and Q3. b Find the interquartile range.
c Draw a boxplot, showing any outliers.
3 Match each histogram a − c with its box plot i − iii and describe the shape of the data distribution.
[Figure: three histograms a, b and c, each with classes 50–59, 60–69, 70–79, 80–89, 90–99, 100–109 and 110–119, and three boxplots i, ii and iii, each on a scale from 50 to 120]
5 The lower and upper quartiles for a data set are 116 and 134. Which of the following data
values would be classified as an outlier?
a 190 b 60 c 150
8 The weight loss (in kilograms) of 20 randomly selected people undertaking a special diet
over three weeks is:
8 5 10 6 6 12 4 5 5 6
8 13 7 7 7 6 6 4 5 5
a Construct a dotplot of the data.
b Construct a boxplot of the data.
c Comment on the shape.
18D The mean and the standard deviation
Mean
The mean of a data set is a measure of its centre. The mean is calculated by adding together all the
data values and then dividing the resulting sum by the number of data values.
$\text{Mean} = \frac{\text{sum of values}}{\text{number of values}}$
A more common name for the mean is ‘average’. We use the symbol $\bar{x}$ to denote the mean.
For a set of data $x_1, x_2, x_3, \dots, x_n$,
$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
Example 8
Solution
$\bar{x} = \frac{43 + 35 + 41 + 29 + 33 + 39 + 42}{7} \approx 37.43$ (Correct to two decimal places.)
For larger sets of data, a frequency table can be prepared. Let $f_1$ be the frequency of the data
item $x_1$, let $f_2$ be the frequency of the data item $x_2$ and so on. In this case we can write:
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_s x_s}{f_1 + f_2 + \dots + f_s}$
The numerator is the sum of the data items and the denominator is the number of data items.
Example 9
The following information gives the number of children in each of 20 families. Calculate the
mean number of children per family.
Number of children xi Frequency fi
0 4
1 5
2 7
3 4
Solution
It is obviously impossible for a family to have 1.55 children. The mean is not necessarily a
member of the data set.
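The frequency-table formula for the mean can be applied to the table in Example 9 with a short Python sketch. The variable names are illustrative and do not come from the text.

```python
children = [0, 1, 2, 3]      # number of children, x_i
frequency = [4, 5, 7, 4]     # frequency, f_i

# Numerator: f1*x1 + f2*x2 + ... ; denominator: f1 + f2 + ...
total_children = sum(f * x for x, f in zip(children, frequency))
total_families = sum(frequency)
mean_children = total_children / total_families

print(total_children, total_families, mean_children)  # 31 20 1.55
```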
Standard deviation
The standard deviation of a set of data is a measure of how far the data values are spread out from
the mean. The difference between each data item and the mean is called the deviation of the data
value. The sum of the deviations is zero, which will be proved in question 10 of Exercise 18D.
The standard deviation is calculated from the squares of the deviations.
Here are the steps in finding the standard deviation:
• Calculate the mean.
• Square each of the deviations.
• Sum these squares.
• Divide the sum of the squares by the number of data values.
• Take the square root of the value obtained.
This is given by the formula:
$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}}$
where the $x_i$ are the data values, $\bar{x}$ is the mean and n is the number of data values.
We will use the Greek letter σ (sigma) to denote the standard deviation of a data set.
Example 10
Find the standard deviation, correct to two decimal places, for the data set.
5, 7, 11, 13, 14
Solution
$\bar{x} = \frac{5 + 7 + 11 + 13 + 14}{5} = 10$
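The remaining steps for this data set (find the deviations, square them, sum the squares, divide by the number of values and take the square root) can be sketched in Python. This is a sketch of the steps listed earlier in this section, with illustrative variable names, not a reproduction of the printed working.

```python
from math import sqrt

data = [5, 7, 11, 13, 14]

mean = sum(data) / len(data)                  # 10.0
deviations = [x - mean for x in data]         # -5, -3, 1, 3, 4
squares = [d ** 2 for d in deviations]        # 25, 9, 1, 9, 16
variance = sum(squares) / len(data)           # 60 / 5 = 12
sigma = sqrt(variance)

print(sum(deviations))   # 0.0
print(round(sigma, 2))   # 3.46
```

The deviations sum to zero, which is why they are squared before being averaged.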
When calculating the standard deviation from a frequency table, we can use the following formula:
$\sigma = \sqrt{\frac{f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + f_3 (x_3 - \bar{x})^2 + \dots + f_s (x_s - \bar{x})^2}{f_1 + f_2 + \dots + f_s}}$
When frequencies are taken into account, we can see that this is the same formula as above.
We can calculate the standard deviation with an extended frequency table with five columns. Fill in
the first three columns, then calculate $\bar{x}$. Fill in the other two columns and then calculate σ.
Example 11
Calculate the mean and standard deviation of the set of values, correct to two decimal places.
1, 3, 4, 5, 7, 3, 6, 9, 9, 4, 5, 2, 5, 7
Solution
$\bar{x} = \frac{70}{14} = 5$
$\sigma = \sqrt{\frac{76}{14}} \approx 2.33$ (Correct to two decimal places.)
Note: The sum of the deviations $x_i - \bar{x}$ is zero. Hence, the average of the deviations is not useful.
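An extended frequency table for Example 11 can be built with a short Python sketch. The five column headings used here (x, f, f × x, squared deviation, and f × squared deviation) are an assumption about the layout of the table, chosen to match the frequency-table formula for σ.

```python
from collections import Counter
from math import sqrt

values = [1, 3, 4, 5, 7, 3, 6, 9, 9, 4, 5, 2, 5, 7]
table = sorted(Counter(values).items())      # (x_i, f_i) pairs

n = sum(f for _, f in table)                 # 14 data values
mean = sum(f * x for x, f in table) / n      # 70 / 14 = 5

print("x  f  f*x  (x-mean)^2  f*(x-mean)^2")
sum_sq = 0
for x, f in table:
    sq = (x - mean) ** 2                     # squared deviation
    sum_sq += f * sq                         # weighted by frequency
    print(x, f, f * x, sq, f * sq)

sigma = sqrt(sum_sq / n)                     # sqrt(76 / 14)
print(mean, sum_sq, round(sigma, 2))         # 5.0 76.0 2.33
```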
$\sigma = \sqrt{\frac{f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + f_3 (x_3 - \bar{x})^2 + \dots + f_s (x_s - \bar{x})^2}{f_1 + f_2 + \dots + f_s}}$, when the data is in a
frequency table.
It is clear that the larger the standard deviation, the more spread out the data are about the mean.
For example, here is a bar chart of the data in Example 11, and also another set of 14 data items
where the data are not as spread out but have the same mean.
[Figure: two bar charts, each on a horizontal scale from 1 to 9]
In the following section we will see how the standard deviation may be used to make comparisons
between data sets.
Use of calculators
Many calculators and spreadsheets have a built-in facility for calculating the standard deviation of a
set of data.
To save time, we recommend using this facility for all but the simplest data sets. In particular, if $\bar{x}$ is
not an integer, then calculating σ is very tedious.
It should be noted that in this book we calculate the standard deviation by dividing the sum of
the squares of the deviations by n, the number of data items, and taking the square root. There is
also another type of standard deviation that is obtained by dividing the sum of the squares of the
deviations by n − 1, and taking the square root. Many calculators offer both versions. Sometimes
they are denoted by symbols such as $\sigma_n$ and $\sigma_{n-1}$. In this book, we only use $\sigma_n$.
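As one illustration of software offering both versions, Python's standard `statistics` module provides `pstdev`, which divides by n, and `stdev`, which divides by n − 1. The sketch below applies both to the data of Example 11; only the first matches the definition used in this book.

```python
from statistics import pstdev, stdev

values = [1, 3, 4, 5, 7, 3, 6, 9, 9, 4, 5, 2, 5, 7]   # data from Example 11

sigma_n = pstdev(values)          # divides by n (the version used in this book)
sigma_n_minus_1 = stdev(values)   # divides by n - 1

print(round(sigma_n, 2))          # 2.33
print(round(sigma_n_minus_1, 2))  # 2.42
```

The two results differ noticeably for a small data set, so it is worth checking which version a calculator or program is reporting.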
Exercise 18D
Give all answers correct to two decimal places unless otherwise specified.
Example 8 1 During a 13-week football season, the number of kicks obtained by a particular player each
week is:
18, 18, 20, 26, 10, 8, 21, 14, 16, 14, 12 and 16
Calculate the mean number of kicks obtained by the player.
2 The daily maximum temperature was recorded in two different cities for a week. The results
are shown below.
City A: 28, 31, 34, 32, 31, 29, 28
City B: 26, 32, 36, 38, 37, 29, 25
Which city had the greater mean daily maximum temperature?
3 The average of 5 masses is 67 kg. If a mass of 25 kg is added, what is the average of the
6 masses?
4 During a term, a student has an average of 46 marks after the first four tests and his average
for the next six tests is 38 marks. What is his average for the ten tests?
Example 10 5 a Calculate, correct to two decimal places, the mean and standard deviation for the
data sets.
i 2, 4, 8, 10, 2, 9, 3, 8, 2, 2 ii 3, 6, 4, 5, 6, 7, 3, 4, 6, 6
b Comment on the results from part a.
Example 11 6 Complete the following extended frequency table to calculate the mean and standard
deviation of the given data set.
7 Use a calculator to find, correct to two decimal places, the mean and standard deviation for
each data set.
a 3, 6, 7, 5, 8, 5, 10, 12, 13, 12, 6, 9, 12, 14, 15
b 8, 10, 12, 14, 16, 17, 19, 12, 11, 10, 14, 16, 18, 19
8 Twenty students sat a test and their results are given in the stem-and-leaf plot below.
1 | 2 2 8 9
2 | 2 4 5 6 8
3 | 0 2 6 8 8 9
4 | 0 1 2 3 6
1 | 2 means 12
Score 0 1 2 3 4 5 6 7 8 9 10
Number of people 0 2 0 1 1 2 4 6 0 2 2
18E Interpreting the standard deviation
0 | 8 8 9
1 | 0 0 2 2 3
2 | 4 4 4 4 4 4 8 8 8 8 8 8 8
3 | 1 1 1 2 2 4 4 4 4 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9
4 | 1 1 1 1 1 2 2 2 3 3 3 4 4 4 4 4 5 5 5 5 6 6 7 7 7 7 7 7 8 8 9 9 9 9 9 9 9 9 9
5 | 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 3 3 3 4 4 4 4 4 5 7 7
6 | 3 3 3 3 3 6 6 6 6 9 9 9 9
7 | 7 7 8 9 9
8 | 6 6 6
7 | 7 means $77 000
The mean is 45.1, the median is 45.5, and the standard deviation is 16.1.
We next consider intervals centred on the mean.
$\bar{x} + \sigma = 45.1 + 16.1 = 61.2$ and $\bar{x} - \sigma = 45.1 - 16.1 = 29.0$
We can observe from the plot above that there are 92 values between 29 and 61; hence, the percentage of values within one standard deviation of the mean is 68.7%.
[Histogram of the data, with the intervals $\bar{x} - \sigma$ to $\bar{x} + \sigma$ and $\bar{x} - 2\sigma$ to $\bar{x} + 2\sigma$ marked]
We have seen that about 69% of the data is within one standard deviation of the mean and about 90%
of the data is within two standard deviations of the mean.
Histograms similar to this one occur frequently. In most cases like these the median and the mean
are very close.
Example 12
David plays golf every Friday. He has recorded his score each Friday for five years, and has
found that his mean score for all his games is 85 and the standard deviation of his scores is 5.2.
Find the range of scores that lie within:
a one standard deviation of the mean b two standard deviations of the mean
Solution
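The two intervals are found by subtracting and adding multiples of the standard deviation to the mean. A minimal Python sketch, with illustrative variable names that are not from the text:

```python
mean_score = 85
sigma = 5.2

intervals = {}
for k in (1, 2):
    low = round(mean_score - k * sigma, 1)    # k standard deviations below
    high = round(mean_score + k * sigma, 1)   # k standard deviations above
    intervals[k] = (low, high)
    print(k, low, high)
```

This gives scores from 79.8 to 90.2 for part a and from 74.6 to 95.4 for part b.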
A remarkable result known as Chebyshev’s inequality states that, for any set of data, if we take an
interval between $\bar{x} - k\sigma$ and $\bar{x} + k\sigma$, then all values can lie outside this interval for $0 < k \le 1$, but
for $k > 1$, at most $\frac{1}{k^2}$ of the data can lie outside this interval.
So, for example, taking k = 2, not more than $\frac{1}{4}$ of the data can be outside this interval.
So at least 75% of the data must lie inside this interval.
[Number line marking $\bar{x} - 2\sigma$, $\bar{x}$ and $\bar{x} + 2\sigma$]
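Chebyshev's bound can be compared with an actual data set using a short Python sketch. The function names are illustrative, and the data of Example 11 is used only as a convenient test case.

```python
from statistics import mean, pstdev

def chebyshev_min_inside(k):
    """Least fraction of the data guaranteed within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

def fraction_inside(values, k):
    """Actual fraction of the values within k standard deviations of the mean."""
    centre, sigma = mean(values), pstdev(values)
    inside = [v for v in values if abs(v - centre) <= k * sigma]
    return len(inside) / len(values)

values = [1, 3, 4, 5, 7, 3, 6, 9, 9, 4, 5, 2, 5, 7]   # data from Example 11

for k in (1.5, 2, 3):
    print(k, chebyshev_min_inside(k), fraction_inside(values, k))
```

For every k > 1 the actual fraction is at least the guaranteed fraction. The guarantee is usually far from tight: here all 14 values lie within two standard deviations of the mean, although only 75% are guaranteed to.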
Example 13
Gus scored 14 in a maths test and 14 in an English test. The scores of each student in the
maths and English classes are listed below. In which test did Gus perform better, relative to
the class results?
Maths test: 10, 13, 18, 17, 12, 16, 9, 8, 7, 11, 10, 12
English test: 15, 17, 18, 19, 18, 17, 19, 16, 14, 15, 14, 12
Solution
Maths test: $\bar{x} = \frac{143}{12} \approx 11.92$, $\sigma \approx 3.38$
English test: $\bar{x} \approx 16.17$, $\sigma \approx 2.11$
It can be seen that in the maths test Gus scored about 0.6 of a standard deviation above the mean $\left(\frac{14 - 11.92}{3.38} \approx 0.6\right)$ and in the English test Gus scored about 1 standard deviation below the mean $\left(\frac{14 - 16.17}{2.11} \approx -1\right)$. So Gus has done better relative to the class in the maths test.
Exercise 18E
Example 12 2 The mean and standard deviation of each set of data is given. Find the range of values that
is within:
i one standard deviation of the mean ii two standard deviations of the mean
a $\bar{x} = 35$, σ = 2.5
b $\bar{x} = 40$, σ = 5
c $\bar{x} = 35$, σ = 8
Example 13 3 The mathematics and English marks for a class of 15 students are given below.
Mathematics: 12, 16, 14, 19, 17, 18, 15, 15, 19, 20, 14, 18, 19, 15, 11
English: 10, 13, 16, 19, 20, 19, 18, 16, 15, 14, 17, 11, 15, 18, 17
a Calculate, correct to two decimal places, the mean and standard deviation for each set
of marks.
b If a student scored 16 for the mathematics test and 14 for the English test, which is the
better mark relative to the class results?
4 The following table lists the marks of several students on different tests in English and
mathematics. Compare the English and mathematics marks of each student.
Student        Subject        Mark   Mean   Standard deviation
a David        English        15     17     2
               Mathematics    13     17     3
b Akira        English        42     30     6
               Mathematics    39     25     8
c Katherine    English        70     75     5
               Mathematics    65     70     10
d Daniel       English        70     55     9
               Mathematics    69     62     7
5 The bar charts of three sets of data are shown.
[Bar charts i and ii, each on a horizontal scale from 1 to 11]
18F Time-series data
Example 14
The mean daily maximum temperature was measured each month in a particular city.
Month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Mean daily max. temp (°C) 29.2 28.9 28.1 26.4 23.5 21.2 20.6 21.7 23.8 25.7 27.4 28.7
a Represent this information on a time-series plot.
b Briefly comment on the annual variation in daily maximum temperature.
Solution
a The horizontal axis shows the month and the vertical axis shows the mean daily maximum temperature.
The points are plotted and joined by lines.
The following time-series plot is obtained.
[Time-series plot of temperature (°C), on a scale from 20 to 27 and above, against month, J to D]
b There is a gradual decrease in the mean daily maximum temperature over the months January,
February and March. During April, May and June, the mean daily maximum temperature falls quite
quickly to a minimum during July. For the remainder of the year, there is a steady increase in
the mean daily maximum temperature each month.
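The description in part b can be supported by looking at the change from each month to the next. A minimal Python sketch using the data of Example 14, with illustrative variable names:

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temps = [29.2, 28.9, 28.1, 26.4, 23.5, 21.2,
         20.6, 21.7, 23.8, 25.7, 27.4, 28.7]

# Change from each month to the next, rounded to one decimal place.
changes = [round(b - a, 1) for a, b in zip(temps, temps[1:])]
for month, change in zip(months[1:], changes):
    print(month, change)

coolest = months[temps.index(min(temps))]
print(coolest)  # Jul
```

Every change up to July is negative and every change after July is positive, which matches a fall to a minimum in July followed by a steady rise.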
Exercise 18F
Example 14 1 a Construct a time-series plot for the average rainfall (in cm) in a particular city, which is
given in the table below.
Month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Rainfall (in cm) 16.2 17.5 14.2 9.1 9.6 7.1 6.2 4.1 3.3 9.3 9.6 12.6
b Use the time-series plot to write a brief description as to how the rainfall varies in this
particular city.
2 The table below gives the annual profit (in $ million) of a particular company over a 10-year
period. Construct a time-series plot of the information.
Year 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998
Profit ($ million) 1.2 1.8 2.4 2.2 2.6 3.1 3.2 3.4 3.6 4.0
3 The table below gives the number of births that occurred in a hospital each month for a year.
Month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Number of births 52 46 43 40 31 32 26 27 24 20 26 26
4 The table below gives the position of a particular football team in a competition of 12 teams
at the completion of each round throughout the season.
Round 1 2 3 4 5 6 7 8 9 10 11
Position 10 12 11 9 8 6 5 5 4 5 5
Round 12 13 14 15 16 17 18 19 20 21 22
Position 6 4 4 3 4 3 5 7 6 9 8
5 The data below shows the quarterly sales of a department store over a period of three years.
The quarters are labelled 1 to 12 in the corresponding time-series graph.
Sales quarter   Sales ($'000)
2009–1          45
2009–2          63
2009–3          67
2009–4          43
2010–1          51
2010–2          69
2010–3          75
2010–4          39
2011–1          55
2011–2          71
2011–3          79
2011–4          49
[Time-series graph: Sales ($'000) against Quarter, with quarters numbered 1 to 12]
a In which quarter of each year are the sales figures the worst?
b In which quarter of each year are the sales figures the best?
c Are the sales figures improving? Compare the sales figures for the first quarter of each
year and do the same for the other quarters.
6 The table below gives the quarterly sales figures for a car dealer for the period 2009–2011.
Number of sales Q1 Q2 Q3 Q4
2009 72 62 90 98
2010 87 78 112 111
2011 90 84 132 117
18G Bivariate data
We often want to know if there is a relationship between the items in two different data sets.
• Is there a relationship between children’s ages and their heights?
• Is there a relationship between people’s heights and weights?
• Is there a relationship between students’ marks in English and their marks in mathematics?
In each of the above, two pieces of information are to be collected from each person in the
investigation and then the two data sets are to be compared. When two pieces of information are
collected from each subject in an investigation, we are then concerned with bivariate data.
A scatter graph or scatter plot is a type of display that uses coordinates to display values for two
variables for a set of data. The data is displayed as a collection of points. For each point, the value
of one variable gives the horizontal coordinate and the value of the other variable gives the vertical
coordinate.
Example 15
The age (in years) and height (in cm) of a group of people were recorded. The data obtained is shown
in the table below. Present the information in the table on a scatter plot.
Person      Age (years)   Height (cm)
Alan        12            145
Brianna     14            140
Chiyo       15            160
Danielle    14            150
Ezra        10            130
Frankie     11            135
Solution
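As a rough sketch only, and not the book's own solution, each person in the table can be treated as a coordinate pair (age, height), with age on the horizontal axis and height on the vertical axis. The crude text grid below stands in for a drawn scatter plot; the names `people`, `points` and `rows` are illustrative assumptions.

```python
# (age in years, height in cm) for each person in Example 15.
people = {
    "Alan": (12, 145), "Brianna": (14, 140), "Chiyo": (15, 160),
    "Danielle": (14, 150), "Ezra": (10, 130), "Frankie": (11, 135),
}
points = sorted(people.values())
ages = [age for age, _ in points]
heights = [height for _, height in points]

# One text row per 5 cm, from the tallest height down to the shortest.
# A '*' marks a person; '.' marks an empty position.
rows = []
for h in range(max(heights), min(heights) - 1, -5):
    cells = ["*" if (a, h) in points else "."
             for a in range(min(ages), max(ages) + 1)]
    rows.append(f"{h:>3} " + " ".join(f"{cell:>2}" for cell in cells))

print("\n".join(rows))
# Horizontal axis: ages 10 to 15.
print("    " + " ".join(str(a) for a in range(min(ages), max(ages) + 1)))
```

Every height in this table is a multiple of 5, so a 5 cm row spacing places each person exactly on a row.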
Example 16
The second-hand price and age of a particular model of car are recorded in the table below,
and the points plotted on a scatter plot.
Age of car   Second-hand price
3            16 400
3            17 000
3            16 800
4            15 800
4            15 950
5            14 800
6            12 500
6            12 000
6            12 800
7            12 200
7            11 580
8            10 500
8            9200
8            8600
9            5700
10           4850
11           4500
[Scatter plot: second-hand price (0 to 25 000) against Age of car (years) (0 to 12)]
Solution
a The top-left of the scatter plot has points corresponding to relatively new second-hand
cars with higher prices.
b The bottom-right of the scatter plot has points corresponding to older second-hand cars
with lower prices.
c As the age of the car increases the value decreases.
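The trend described in part c can be checked with the rows of the table that are reproduced above. This is a minimal sketch under that assumption: it averages the prices at each age and tests whether the average falls as age increases. The names used are illustrative.

```python
from collections import defaultdict

# (age of car in years, second-hand price) for the rows shown in the table.
cars = [
    (3, 16400), (3, 17000), (3, 16800), (4, 15800), (4, 15950),
    (5, 14800), (6, 12500), (6, 12000), (6, 12800), (7, 12200),
    (7, 11580), (8, 10500), (8, 9200), (8, 8600), (9, 5700),
    (10, 4850), (11, 4500),
]

# Group the prices by age.
prices_by_age = defaultdict(list)
for age, price in cars:
    prices_by_age[age].append(price)

# Mean price at each age, listed in order of increasing age.
mean_price = {age: sum(prices) / len(prices)
              for age, prices in sorted(prices_by_age.items())}

# True if the mean price drops at every step from one age to the next.
ages = list(mean_price)
falls_every_year = all(mean_price[a] > mean_price[b]
                       for a, b in zip(ages, ages[1:]))

for age, price in mean_price.items():
    print(f"{age:>2} years: {price:8.2f}")
print("Mean price falls at every age step:", falls_every_year)
```

Individual cars do not always follow the pattern (one 7-year-old car costs more than one 6-year-old car), but the averages do, which is what the scatter plot shows as a downward trend.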
Exercise 18G
1 (Example 15) The table below gives the marks obtained by 10 students in a mathematics examination and
an English examination.
Mathematics mark 72 50 96 58 86 94 78 66 85 78
English mark 78 64 70 46 88 72 70 62 72 74
Represent this information on a scatter plot, using the horizontal axis to represent the
mathematics marks and the vertical axis to represent the English marks.
Break the axes so that the vertical axis starts near 40 and the horizontal axis starts near 50.
2 The table below gives the average monthly rainfall, in mm, and the average number of rainy
days per month for twelve different cities in Australia.
a Represent this information on a scatter plot. Use the horizontal axis to represent
average monthly rainfall and the vertical axis to represent the average number of rainy
days per month.
b Give a brief description of the relationship between rainy days and average rainfall.
3 The table below gives the amount of carbohydrates, in grams, and the amount of fat, in
grams, in 100 g of a number of breakfast cereals.
Carbohydrates (in g) 88.7 67.0 77.5 61.7 86.8 32.4 72.4 77.1 86.5
Fat (in g) 0.3 1.3 2.8 7.6 1.2 5.7 9.4 10.0 0.7
a Represent this information on a scatter plot. Use the x-axis to represent the amount of
carbohydrates and the y-axis to represent the amount of fat.
b Does there appear to be any relationship between the carbohydrate content and the fat
content?
4 (Example 16) The table below gives the IQ of a number of adults and the time, in seconds, for them to
complete a simple puzzle.
a Represent this information on a scatter plot. Use the x-axis to represent IQ and the
y-axis to represent the time taken to complete the puzzle.
b Is there any trend in the data?
5 The table below gives the number of kicks and the number of handballs obtained by each
player in an AFL team in a particular match.
Player 1 2 3 4 5 6 7 8 9 10 11
Number of kicks 3 20 7 19 7 6 2 9 7 26 3
Number of handballs 8 11 11 6 4 6 3 1 3 3 8
Player 12 13 14 15 16 17 18 19 20 21 22
Number of kicks 12 17 6 11 14 5 1 21 6 13 4
Number of handballs 4 5 0 3 8 3 0 11 0 17 11
a Represent this information on a scatter plot. Use the x-axis to represent the number of
kicks and the y-axis to represent the number of handballs.
b Does your scatter plot support the claim, ‘the more kicks a player obtains, the more
handballs he gives’? Explain your answer.
6 The table below gives the number of ‘goals for’ (scored by the team) and the number of
‘goals against’ (scored by the opposing team) for each team in a soccer competition.
Team A B C D E F G H I J K L
Goals for 36 45 22 26 20 59 24 41 23 43 32 41
Goals against 31 16 33 26 64 16 53 42 47 21 49 14
a Represent this information on a scatter plot. Use the x-axis to represent ‘goals for’ and
the y-axis to represent ‘goals against’.
b Use your scatter plot to answer the following questions.
i Which team is the best team in the competition? Why?
ii Which team is the worst team in the competition? Why?
iii Which of team J and team H is better? Why?
7 The scatter plot at the right gives information about the height and weight of a number of
people. Annabelle’s height and weight is
[Scatter plot: points labelled A and i to vii; axis label Weight (kg)]
[Scatter plot: Test 2 against Test 1; the point labels shown include i, vii and viii]
Which point represents each of the following students?
a Alex, who got the top mark in both tests
b Bao, who got the top mark in Test 1 but not in Test 2
c Charlene, who did better in Test 1 than John, but not as well on Test 2
d Drago, who did not do as well as Charlene on either test
e Eddie, who got the same mark as John for Test 2, but did not do as well as John on Test 1
f Francis, who got the same mark as John for Test 1, but did better than John on Test 2
g Georgina, who got the lowest mark for Test 1
h Harvir, who had the greatest discrepancy between his two marks
9 The test results of a group of 9 students are recorded in the table and plotted on a scatter plot.
A line has been drawn through the ‘middle of the points’.
Test 1   Test 2
53       54
70       67
53       55
81       81
85       82
51       51
52       53
76       78
75       77
[Scatter plot: Test 2 against Test 1, both axes from 40 to 100, with a line drawn through the middle of the points]
18H Line of best fit
Consider the four scatter plots below. A trend line or ‘line of best fit’ has been fitted to each ‘by
eye’. It is constructed by first noting the general trend, increasing or decreasing. A line (or curve) is
then drawn through the middle of the scatter plot following that upwards or downwards trend, with
roughly equal numbers of points above and below the line. The distances of the points from the line
must also be taken into account.
[Scatter plot I: vertical axis from 100 to 170; horizontal axis Heart mass (grams), 15 to 55]
[Scatter plot II: vertical axis from 5000 to 25 000; horizontal axis Age of car (years), 0 to 12]
[Scatter plot III: Performance level, 0.5 to 5, against Time spent preparing (hours), 0 to 10]
[Scatter plot IV: Time to complete (seconds), 5 to 30, against Age (years)]
Observations
• Graphs I and III show an increasing trend whilst graphs II and IV show a decreasing trend.
• Graph II shows a strong linear relationship between the variables and all points are in close
proximity to the line of best fit. However, graphs I and IV show moderately strong linear
relationships between the variables.
• Graph III shows a non-linear relationship between variables and a ‘curve’ of best fit is suggested.
The other graphs display a linear relationship.
In this section, only linear relationships will be studied. To determine the equation of the line of best
fit we draw on skills that were introduced in Chapter 4.
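As an illustration of the two ideas above, the sketch below finds the equation of a line from two points read off a hand-drawn line, then counts how many data points lie above and below it. All of the numbers are hypothetical and are not taken from the graphs in this section; the function name `line_through` is also an assumption.

```python
from fractions import Fraction


def line_through(p, q):
    """Return (gradient, y-intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - gradient * x1
    return gradient, intercept


# Hypothetical scatter plot data (x, y).
points = [(1, 6), (2, 6), (3, 10), (4, 10), (5, 14), (6, 14), (7, 18)]

# Two hypothetical points read off a line drawn by eye.
m, c = line_through((2, 7), (6, 15))

# Count the data points lying above and below the line y = m*x + c.
above = sum(1 for x, y in points if y > m * x + c)
below = sum(1 for x, y in points if y < m * x + c)

print(f"y = {m}x + {c}")
print("Points above:", above, "Points below:", below)
```

Here the line is y = 2x + 3, with four points above it and three below, so the counts are roughly equal, as the by-eye method requires.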
Example 17
Consider the scatter plot below showing the relationship between the number of ice-creams sold by a
vendor during the month of February and the maximum temperature for the day.
[Scatter plot: Number of ice-creams sold (0 to 90) against Maximum temperature (°C) (15 to 40)]
Solution
a [Scatter plot with a line of best fit drawn by eye: Number of ice-creams sold (0 to 90) against Maximum temperature (°C) (15 to 40)]
Note: Small variations in the placement of the line of best fit are expected when using this
technique.
d $58 = \frac{5}{3} \times (\text{maximum temperature °C}) + \frac{40}{3}$
$174 = 5 \times (\text{maximum temperature °C}) + 40$ (multiplying all terms by 3)
$\therefore \text{maximum temperature} = \frac{174 - 40}{5} = 26.8\,°\text{C}$
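The rearrangement in part d can be checked with exact fractions. This is a small sketch that assumes, as in the working for part d, that the line of best fit is number sold = 5/3 × maximum temperature + 40/3; the variable names are illustrative.

```python
from fractions import Fraction

# Line of best fit taken from the working in part d:
# number sold = gradient * temperature + intercept
gradient = Fraction(5, 3)
intercept = Fraction(40, 3)
sold = 58

# Rearranged for the temperature.
temperature = (sold - intercept) / gradient

print(f"Maximum temperature = {float(temperature)} °C")

# Substituting back should recover the number of ice-creams sold.
check = gradient * temperature + intercept
print("Check:", check)
```

Using `Fraction` avoids rounding in the thirds, so the check returns exactly 58.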
Exercise 18H
1 Copy these scatter plots and draw a line of best fit by eye through each.
[Scatter plots i to iv]
3 Data was collected on 100 adults comparing shoe size and height. Shoe sizes ranged from
6 to 13. An equation relating height (in cm) to shoe size was determined to be:
height = 127.18 + 4.84 × shoe size
Use this equation to predict (to the nearest cm) the height of a person whose shoe size is as
follows. Are you interpolating or extrapolating?
a size 7 b size 12 c size 14
4 A line of best fit for a scatter plot, relating the weight of a pumpkin (kg) to the number of
seeds it contains, was found to pass through the points (1, 300) and (7, 540). Assume weight
is on the x-axis.
a Find the equation of the line of best fit.
b Use your equation to estimate the number of seeds contained in a pumpkin that
weighs 5.2 kg.
c Use your equation to estimate the weight of a pumpkin containing 600 seeds.
5 (Example 17) A class of Year 10 PE students were asked to run a lap of the school’s oval. Their times were
recorded and compared against their fitness levels, which had been previously analysed and
placed on a scale of 1 to 10. The teacher then drew a line of best fit over the scatter plot as
shown.
[Scatter plot with line of best fit: Time (seconds), 50 to 80, against Fitness level, 0 to 10]
6 State the problems with making predictions using the lines of best fit in the following
scatter plots.
[Scatter plots a and b]
7 Consider the time series below, showing a company’s profit for consecutive financial years
over a 10-year period. ‘Year 1’ marks the financial year 1988–1989, ‘Year 2’ marks the
financial year 1989–1990, and so on. ‘Year 10’ marks the financial year 1997–1998.
[Time-series plot: Profit ($ million), 0.5 to 4.5, against Year number, 0 to 12]
Create a line of best fit on the time series and use it to predict the company’s profits, to the
nearest $100 000, in the financial year 1998–1999. (Predicting future values in a time series
based on previously observed values is called forecasting.) Is your answer an example of
interpolation or extrapolation?
Review exercise
1 The stem-and-leaf plot below gives the times taken by a class of 26 Year 10 students to run 100 m.
11 | 5
12 | 3 4 6 9
13 | 0 0 2 6 8
14 | 0 1 2 4 7 9 9
15 | 1 2 4 5 5 5
16 | 3 4
17 |
18 | 2
15 | 1 means 15.1 seconds
a What is the range of times to run 100 m in the class?
b What is the median time to run 100 m in the class?
c What is the interquartile range?
d Would the median time change if the fastest and slowest times were removed?
2 The ‘life’ of alkaline batteries is compared through continuous use in a standard product.
40 Grade A and 40 Grade B batteries are tested in this way. Their results are shown in
the two boxplots below.
[Boxplots of battery life: Grade B (upper) and Grade A (lower)]
a State the median battery life for the Grade A and Grade B batteries.
b State the range in battery life for the Grade A and Grade B batteries.
c State the interquartile range for the Grade A and Grade B batteries.
d Determine the number of Grade A and Grade B batteries lasting longer than 29 hours.
e Describe the shape of data distributions for the Grade A and Grade B battery life.
f Under what criterion is the Grade B battery ‘better’ than the Grade A battery in this test?
3 The following data are the speeds of 45 semi-trailers passing a given point on an
interstate highway. The speeds are measured in km/h.
88 90 93 94 95 96 98 100 100 100 100 100
101 102 102 102 103 103 103 104 105 106 106 107
109 109 110 110 110 112 113 114 116 117 118 120
120 121 128 130 130 139 141 144 150
a Construct a dotplot of the data.
b Construct a boxplot of the data.
c Comment on the shape.
4 The number of times 35 randomly chosen Year 10 students go online in the course of a
school day was recorded. The results are shown in the frequency table below.
Times online 0 1 2 3 4 5 6 7
Number of students 8 3 5 6 7 5 0 1
a Calculate the mean number of times students in this random sample go online.
b Find the standard deviation of the number of times students go online, correct to
two decimal places.
c Find the range of times online that lie within one standard deviation of the mean.
d If every student in this sample went online one more time than what was recorded,
determine the effect on the mean and standard deviation.
5 Kathryn scored 78% on both her history and mathematics tests. Both tests had a class
mean of 70%, but history had a standard deviation of 8% and mathematics had a
standard deviation of 12%. In which test did Kathryn perform better relative to the rest of
the class?
6 The table below gives the quarterly sales figures for a Melbourne swimwear shop in the
period 2014–2016.
[Scatter plot: vertical axis from 140 to 170; horizontal axis Tibia length (cm), 30 to 46]