Exploring data
Chapter assessment
1. George records the time he spends per day surfing the internet for the first three
weeks of May. The times, given to the nearest minute, are as follows:
0 26 13 5 18 12 35
24 61 16 10 26 15 0
0 73 21 17 16 42 32
(i) Illustrate the data using a sorted stem and leaf diagram with eight stems.
Comment briefly on the shape of the distribution. [3]
(ii) Find the mode, median and mean, commenting on their relative usefulness as
measures of central tendency for this data set. [5]
(iii) Calculate the standard deviation and hence find any outliers. [4]
(iv) George’s Dad claims that he is spending too much time on the internet. He
tells George to reduce his usage so that the mean daily time for May is 20%
less than the current mean.
What is the maximum total time George can spend surfing the internet for the
remaining 10 days of May? [3]
2. Over a period of time, a teacher recorded the number of times, x, each of the 20
students in the mathematics class was absent. The distribution was as follows.
Number of times absent, x   0   1   2   3   4   5   6   7   8   9  10  11 or more
Number of students, f       4   6   3   2   0   2   0   1   1   0   1   0
Σf = 20, Σfx = 53, Σfx² = 299
During this period of time, there were 30 mathematics lessons. The teacher needs
to analyse the distribution of the number of times each student was present during
the 30-lesson session.
(iv) Without creating a new frequency distribution, deduce values for the mean
and standard deviation of the numbers of times students were present.
Describe the shape of the new distribution. [3]
(i) Calculate estimates of the mean and standard deviation of the number of
chapters per book. [5]
(ii) Why is it not possible to obtain exact values for the mean and standard
deviation from the data in the table? [1]
In fact, the exact values of the mean and standard deviation of the number of
chapters per book are 14.7 and 6.1. Use these exact values for the remainder of the
question.
(iii) For each of the two statements below, state with reasons whether it is
definitely true, definitely false or possibly true.
4. The first paragraph of the children’s book Stig of the Dump contains 107 words.
The number of letters per word is summarised in the following table.
Σf = 107, Σfx = 443, Σfx² = 2183.
A passage of an adult fiction book was analysed in a similar way. The mean
number of letters per word was 5.07 and the standard deviation was 2.62.
(iv) Compare the word lengths in the two passages of writing, commenting briefly
on the differences. [3]
Total 60 marks
Exploring data
1. (i)
 0 | 0 0 0 5
10 | 0 2 3 5 6 6 7 8
20 | 1 4 6 6
30 | 2 5
40 | 2
50 |
60 | 1
70 | 3
Key: 20 | 6 means 26 minutes
(ii) Mode = 0
Median = 11th value = 17
Mean = 462 / 21 = 22
The mode is not very useful as it is not representative of the data. Most of
the values appear only once.
The mean is skewed by two unusually large values.
The median is the most representative of the data.
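The figures for question 1 can be checked with a short Python sketch. It is only an illustration: it assumes the sample standard deviation (divisor n − 1) and the rule that an outlier lies more than two standard deviations from the mean, and it reads "the mean daily time for May" in part (iv) as the mean over all 31 days. The variable names are not from the original.

```python
import statistics

# Daily times in minutes, copied from the question (21 days).
times = [0, 26, 13, 5, 18, 12, 35,
         24, 61, 16, 10, 26, 15, 0,
         0, 73, 21, 17, 16, 42, 32]

mode = statistics.mode(times)
median = statistics.median(times)
mean = sum(times) / len(times)

# Assumption: sample standard deviation, i.e. divisor n - 1.
sd = statistics.stdev(times)

# Assumption: outliers are more than two standard deviations from the mean.
low, high = mean - 2 * sd, mean + 2 * sd
outliers = sorted(t for t in times if t < low or t > high)

# Part (iv): the mean over all 31 days of May must be 80% of the current mean,
# so the allowed total for the remaining 10 days is the target total minus
# the time already used.
target_total = 0.8 * mean * 31
remaining = target_total - sum(times)

print(mode, median, mean)
print(round(sd, 2), outliers)
print(round(remaining, 1))
```

Under these assumptions the two outliers are the same two large values that pull the mean above the median.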
2. (i)
[Diagram: frequency (vertical axis, 0 to 6) against number of absences (horizontal axis, 0 to 10)]
(ii) Mode = 1
Median is halfway between 10th and 11th values which are 1 and 2
Median = 1.5
(iii) Mean = Σfx / n = 53 / 20 = 2.65 days
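(vi) For boys: 3 = √(Sxx / 11), so Sxx = 99
99 = Σfx² − 12 × 3², so Σfx² = 207
Total value of Σfx² = 299, so for the other 8 students Σfx² = 299 − 207 = 92
Sxx = Σfx² − n x̄² = 92 − 8 × 2.125² = 55.875
Standard deviation = √(Sxx / (n − 1)) = √(55.875 / 7) = 2.83 days (3 s.f.)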
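The subgroup working above can be retraced in Python as a rough check. This sketch assumes the reading of the working in which one group has 12 students with mean 3 and standard deviation 3, and the other group is whatever remains of the class totals. The helper `sample_sd` and the `_a` / `_b` suffixes are illustrative names, not part of the original.

```python
import math

# Frequency table from question 2: x = times absent, f = number of students.
freq = {0: 4, 1: 6, 2: 3, 3: 2, 4: 0, 5: 2, 6: 0, 7: 1, 8: 1, 9: 0, 10: 1}

n = sum(freq.values())
sum_fx = sum(x * f for x, f in freq.items())
sum_fx2 = sum(x * x * f for x, f in freq.items())


def sample_sd(n, sum_fx, sum_fx2):
    """Sample standard deviation from summary totals (divisor n - 1)."""
    mean = sum_fx / n
    s_xx = sum_fx2 - n * mean ** 2
    return math.sqrt(s_xx / (n - 1))


# Assumed first group: 12 students, mean 3, standard deviation 3.
n_a, mean_a, sd_a = 12, 3, 3
sum_fx_a = n_a * mean_a
# Sxx = sd^2 * (n - 1) = 99, and sum of fx^2 = Sxx + n * mean^2 = 207.
sum_fx2_a = sd_a ** 2 * (n_a - 1) + n_a * mean_a ** 2

# The second group takes whatever is left of the class totals.
n_b = n - n_a
sum_fx_b = sum_fx - sum_fx_a
sum_fx2_b = sum_fx2 - sum_fx2_a
mean_b = sum_fx_b / n_b
sd_b = sample_sd(n_b, sum_fx_b, sum_fx2_b)

print(n, sum_fx, sum_fx2)
print(n_b, mean_b, sum_fx2_b, round(sd_b, 2))
```

Working from the totals rather than the raw values mirrors the hand calculation: only n, Σfx and Σfx² are needed for each group.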
3. (i)
Number of chapters 3–5 6–8 9–11 12–16 17–21 22–30 Total
Mid-interval value x 4 7 10 14 19 26
x² 16 49 100 196 361 676
Number of books, f 4 6 12 14 14 10 60
fx 16 42 120 196 266 260 900
fx² 64 294 1200 2744 5054 6760 16116
Mean = Σfx / n = 900 / 60 = 15 chapters
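The table totals and the estimated mean can be reproduced with the sketch below, which also carries the calculation through to an estimated standard deviation. It assumes the divisor n − 1 and treats every book in a class as sitting at the mid-interval value. The variable names are illustrative.

```python
import math

# Class intervals (inclusive bounds) and frequencies from the table above.
classes = [((3, 5), 4), ((6, 8), 6), ((9, 11), 12),
           ((12, 16), 14), ((17, 21), 14), ((22, 30), 10)]

# Each class is represented by its mid-interval value.
midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))

mean = sum_fx / n
s_xx = sum_fx2 - n * mean ** 2
# Assumption: sample standard deviation, i.e. divisor n - 1.
sd = math.sqrt(s_xx / (n - 1))

print(n, sum_fx, sum_fx2)
print(mean, round(sd, 2))
```

These are estimates only, because the mid-interval values stand in for the unknown raw data.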
(ii) The raw data is not available, so each piece of data in a particular interval
is taken to be the mid-interval value.
(iii) Outliers are more than two standard deviations from the mean, i.e. below
2.5 or above 26.9.
Statement 1 is definitely false, since the lowest class interval is 3 – 5.
Statement 2 is possibly true. There may be one or more data items in the
22 – 30 class interval which are above 26.9, but it is not possible to be
certain, since all the items in this class could be less than 26.9.
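The reasoning in part (iii) amounts to comparing each class interval with the outlier bounds. A minimal sketch, using the exact mean and standard deviation quoted in the question; the function `outlier_status` and its return strings are hypothetical, introduced only for illustration.

```python
MEAN, SD = 14.7, 6.1

# Outlier bounds: two standard deviations either side of the mean.
low, high = MEAN - 2 * SD, MEAN + 2 * SD


def outlier_status(lo, hi):
    """Say whether a class interval [lo, hi] can contain outliers."""
    if hi < low or lo > high:
        return "all outliers"
    if lo >= low and hi <= high:
        return "no outliers"
    return "possibly some outliers"


print(round(low, 1), round(high, 1))
print(outlier_status(3, 5))
print(outlier_status(22, 30))
```

A class that lies wholly inside the bounds cannot contain an outlier, while a class that straddles a bound may or may not, which is why only "possibly true" can be concluded for the top class.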
4. (i)
[Diagram: frequency (vertical axis, 0 to 30) against number of letters (horizontal axis, 1 to 10)]
(ii) Mode = 3
Mid-range = (1 + 10) / 2 = 5.5
The distribution is positively skewed, which is why the mode and median
are smaller than the mid-range.
(iii) Mean = Σfx / n = 443 / 107 = 4.14 letters (3 s.f.)
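The mean, and a standard deviation to set beside the adult fiction figures, can be obtained directly from the summary totals given in the question. The sketch below assumes the divisor n − 1; the variable names are illustrative.

```python
import math

# Summary totals for the Stig of the Dump passage, from the question.
n, sum_fx, sum_fx2 = 107, 443, 2183

mean = sum_fx / n
s_xx = sum_fx2 - sum_fx ** 2 / n
# Assumption: sample standard deviation, i.e. divisor n - 1.
sd = math.sqrt(s_xx / (n - 1))

# Figures quoted in the question for the adult fiction passage.
adult_mean, adult_sd = 5.07, 2.62

print(round(mean, 2), round(sd, 2))
print(adult_mean > mean, adult_sd > sd)
```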
(iv) The adult fiction book has a greater mean word length and also a greater