
Chapter 10 Revision sheet [188 marks]

A college runs a mathematics course in the morning. Scores for a test from this
class are shown below.
25 33 51 62 63 63 70 74 79 79 81 88 90 90 98
For these data, the lower quartile is 62 and the upper quartile is 88.

1a. Show that the test score of 25 would not be considered an outlier. [3 marks]
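A quick numerical check of part 1a can be sketched as follows. It assumes the common 1.5 × IQR rule for outliers, which the sheet itself does not state, and uses only the quartiles given in the question.

```python
# Quartiles as given in the question.
q1, q3 = 62, 88
iqr = q3 - q1                      # interquartile range
lower_fence = q1 - 1.5 * iqr       # assumed 1.5 x IQR outlier rule

score = 25
is_outlier = score < lower_fence   # an outlier would lie below the fence
print(f"IQR = {iqr}, lower fence = {lower_fence}, outlier: {is_outlier}")
```

Because 25 lies above the lower fence, it would not be classed as an outlier under this rule.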
The box and whisker diagram showing these scores is given below.

[Box and whisker diagram: morning class test scores]
Another mathematics class is run by the college during the evening. A box and
whisker diagram showing the scores from this class for the same test is given
below.

[Box and whisker diagram: evening class test scores]
A researcher reviews the box and whisker diagrams and believes that the evening
class performed better than the morning class.

1b. With reference to the box and whisker diagrams, state one aspect that [2 marks]
may support the researcher’s opinion and one aspect that may counter
it.
At the end of a school day, the Headmaster conducted a survey asking students in
how many classes they had used the internet.
The data is shown in the following table.

2a. State whether the data is discrete or continuous. [1 mark]

The mean number of classes in which a student used the internet is 2.

2b. Find the value of k. [4 marks]


2c. It was not possible to ask every person in the school, so the Headmaster [1 mark]
arranged the student names in alphabetical order and then asked every
10th person on the list.
Identify the sampling technique used in the survey.

A pharmaceutical company has developed a new drug to decrease cholesterol.


The final stage of testing the new drug is to compare it to their current drug. They
have 150 volunteers, all recently diagnosed with high cholesterol, from which they
want to select a sample of size 18. They require as close as possible 20% of the
sample to be below the age of 30, 30% to be between the ages of 30 and 50 and
50% to be over the age of 50.

3a. State the name for this type of sampling technique. [1 mark]

3b. Calculate the number of volunteers in the sample under the age of 30. [3 marks]
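The arithmetic behind part 3b can be sketched as below. It assumes that "as close as possible" means rounding each exact share to the nearest whole volunteer; the group labels are illustrative names, not taken from the sheet.

```python
sample_size = 18
# Target proportions per age group, as stated in the question.
targets = {"under 30": 0.20, "30 to 50": 0.30, "over 50": 0.50}

# Round each exact share (3.6, 5.4, 9.0) to the nearest whole volunteer.
counts = {group: round(sample_size * share) for group, share in targets.items()}
print(counts)
print("total:", sum(counts.values()))
```

The rounded counts still add up to the required sample size of 18, which is a useful consistency check.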
Half of the 18 volunteers are given the current drug and half are given the new
drug. After six months each volunteer has their cholesterol level measured and
the decrease during the six months is shown in the table.

Calculate the mean decrease in cholesterol for

3c. The new drug. [1 mark]

3d. The current drug. [1 mark]

The company uses a t-test, at the 1% significance level, to determine if the new
drug is more effective at decreasing cholesterol.

3e. State an assumption that the company is making, in order to use a t-test. [1 mark]
3f. State the hypotheses for this t-test. [1 mark]

3g. Find the p-value for this t-test. [3 marks]

3h. State the conclusion of this test, in context, giving a reason. [2 marks]
Willow finds that she receives approximately 70 emails per working day.
She decides to model the number of emails received per working day using the
random variable X, where X follows a Poisson distribution with mean 70.

4a. Using this distribution model, find P(X < 60). [2 marks]

4b. Using this distribution model, find the standard deviation of X. [2 marks]
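In place of a graphic display calculator, parts 4a and 4b can be checked with a short sketch. The helper `poisson_cdf` is a hypothetical name introduced here; it simply sums the Poisson probabilities term by term.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Po(lam), summing the pmf term by term."""
    term = math.exp(-lam)          # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i            # P(X = i) obtained from P(X = i - 1)
        total += term
    return total

lam = 70
p_below_60 = poisson_cdf(59, lam)  # X is an integer, so X < 60 means X <= 59
sd = math.sqrt(lam)                # a Poisson variance equals its mean
print(f"P(X < 60) = {p_below_60:.3f}, sd = {sd:.2f}")
```

The key step in 4a is converting the strict inequality X < 60 into X ≤ 59 before using a cumulative function.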
In order to test her model, Willow records the number of emails she receives per
working day over a period of 6 months. The results are shown in the following
table.

From the table, calculate

4c. an estimate for the mean number of emails received per working day. [3 marks]
4d. an estimate for the standard deviation of the number of emails received [2 marks]
per working day.

4e. Give one piece of evidence that suggests Willow’s Poisson distribution [1 mark]
model is not a good fit.
Archie works for a different company and knows that he receives emails
according to a Poisson distribution, with a mean of λ emails per day.

4f. Suppose that the probability of Archie receiving more than 10 emails in [3 marks]
total on any one day is 0.99. Find the value of λ.
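Part 4f has no closed-form solution, so a calculator's numerical solver is normally used. The sketch below does the same job with a simple bisection; the search interval 1 to 50 and the helper names are assumptions made for illustration.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Po(lam)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def tail_above_10(lam: float) -> float:
    """Return P(X > 10) for X ~ Po(lam)."""
    return 1 - poisson_cdf(10, lam)

# P(X > 10) increases with lam, so bisect on lam until the tail equals 0.99.
low, high = 1.0, 50.0              # assumed bracket: tail is below 0.99 at 1, above at 50
for _ in range(60):
    mid = (low + high) / 2
    if tail_above_10(mid) < 0.99:
        low = mid
    else:
        high = mid

lam = (low + high) / 2
print(f"lambda = {lam:.1f}")
```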
4g. Now suppose that Archie received exactly 20 emails in total in a [5 marks]
consecutive two day period. Show that the probability that he received
exactly 10 of them on the first day is independent of λ.
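One possible line of argument for part 4g is outlined below. It assumes that the counts on the two days, written here as X_1 and X_2 (symbols introduced for this sketch), are independent Po(λ) variables, so that their sum follows Po(2λ).

```latex
\begin{aligned}
P(X_1 = 10 \mid X_1 + X_2 = 20)
  &= \frac{P(X_1 = 10)\,P(X_2 = 10)}{P(X_1 + X_2 = 20)} \\
  &= \frac{\left(\dfrac{e^{-\lambda}\lambda^{10}}{10!}\right)^{2}}
          {\dfrac{e^{-2\lambda}(2\lambda)^{20}}{20!}} \\
  &= \frac{20!}{(10!)^{2}\,2^{20}}
   = \binom{20}{10}\left(\frac{1}{2}\right)^{20} \approx 0.176
\end{aligned}
```

The factors involving λ cancel between numerator and denominator, so the result does not depend on λ.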

Each athlete on a running team recorded the distance (M miles) they ran in 30
minutes.
The median distance is 4 miles and the interquartile range is 1.1 miles.
This information is shown in the following box-and-whisker plot.

5a. Find the value of a. [2 marks]

The distance in miles, M, can be converted to the distance in kilometres, K,
using the formula K = (8/5) M.

5b. Write down the value of the median distance in kilometres (km). [1 mark]

The variance of the distances run by the athletes is 16/9 km².
The standard deviation of the distances is b miles.

5c. Find the value of b. [4 marks]
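The unit conversions in parts 5b and 5c can be sketched as follows. This reads the garbled formula as K = (8/5) M and the variance as 16/9 km²; both readings are assumptions based on the extracted text.

```python
import math

MILES_TO_KM = 8 / 5                # assumed conversion factor, K = (8/5) M

median_km = MILES_TO_KM * 4        # 5b: the median scales with the data
sd_km = math.sqrt(16 / 9)          # 5c: standard deviation in km is the root of the variance
b = sd_km / MILES_TO_KM            # the sd scales by the same factor, so divide to get miles
print(f"median = {median_km} km, b = {b:.3f} miles")
```

Measures of spread such as the standard deviation scale by the conversion factor, while the variance scales by its square, which is why the square root is taken before converting.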

A total of 600 athletes from different teams compete in a 5 km race. The times
the 600 athletes took to run the 5 km race are shown in the following cumulative
frequency graph.
There were 400 athletes who took between 22 and m minutes to complete the
5 km race.

5d. Find m. [3 marks]

5e. The first 150 athletes that completed the race won a prize. [5 marks]
Given that an athlete took between 22 and m minutes to complete the 5 km
race, calculate the probability that they won a prize.
The number of sick days taken by each employee in a company during a year was
recorded. The data was organized in a box and whisker diagram as shown below:

For this data, write down

6a. the minimum number of sick days taken during the year. [1 mark]

6b. the lower quartile. [1 mark]

6c. the median. [1 mark]


6d. Paul claims that this box and whisker diagram can be used to infer that [2 marks]
the percentage of employees who took fewer than six sick days is
smaller than the percentage of employees who took more than eleven sick days.
State whether Paul is correct. Justify your answer.
Mackenzie conducted an experiment on the reaction times of teenagers. The
results of the experiment are displayed in the following cumulative frequency
graph.

Use the graph to estimate the

7a. median reaction time. [1 mark]


7b. interquartile range of the reaction times. [3 marks]

7c. Find the estimated number of teenagers who have a reaction time [2 marks]
greater than 0.4 seconds.

7d. Determine the 90th percentile of the reaction times from the cumulative [2 marks]
frequency graph.

Mackenzie created the cumulative frequency graph using the following grouped
frequency table.

7e. Write down the value of a. [1 mark]

7f. Write down the value of b. [1 mark]


7g. Write down the modal class from the table. [1 mark]

7h. Use your graphic display calculator to find an estimate of the mean [2 marks]
reaction time.
Upon completion of the experiment, Mackenzie realized that some values were
grouped incorrectly in the frequency table. Some reaction times recorded in the
interval 0 < t ≤ 0.2 should have been recorded in the interval 0.2 < t ≤ 0.4.

7i. Suggest how, if at all, the estimated mean and estimated median [4 marks]
reaction times will change if the errors are corrected. Justify your
response.
A group of 800 students answered 40 questions on a category of their choice out
of History, Science and Literature.
For each student the category and the number of correct answers, N, was
recorded. The results obtained are represented in the following table.

8a. State whether N is a discrete or a continuous variable. [1 mark]

8b. Write down, for N, the modal class; [1 mark]

8c. Write down, for N, the mid-interval value of the modal class. [1 mark]
8d. Use your graphic display calculator to estimate the mean of N; [2 marks]

8e. Use your graphic display calculator to estimate the standard deviation of [1 mark]
N.

A χ² test at the 5% significance level is carried out on the results. The critical
value for this test is 12.592.

8f. Find the expected frequency of students choosing the Science category [2 marks]
and obtaining 31 to 40 correct answers.
8g. Write down the null hypothesis for this test; [1 mark]

8h. Write down the number of degrees of freedom. [1 mark]

8i. Write down the p-value for the test; [1 mark]

8j. Write down the χ² statistic. [2 marks]


8k. State the result of the test. Give a reason for your answer. [2 marks]

University students were surveyed and asked how many hours, h, they worked
each month. The results are shown in the following table.

Use the table to find the following values.

9a. p. [1 mark]
9b. q. [1 mark]

The first five class intervals, indicated in the table, have been used to draw part of
a cumulative frequency curve as shown.

9c. On the same grid, complete the cumulative frequency curve for these [2 marks]
data.
9d. Use the cumulative frequency curve to find an estimate for the number [2 marks]
of students who worked at most 35 hours per month.

Stephen was invited to perform a piano recital. In preparation for the event,
Stephen recorded the amount of time, in minutes, that he rehearsed each day for
the piano recital.
Stephen rehearsed for 32 days and data for all these days is displayed in the
following box-and-whisker diagram.

10a. Write down the median rehearsal time. [1 mark]

Stephen states that he rehearsed on each of the 32 days.

10b. State whether Stephen is correct. Give a reason for your answer. [2 marks]

10c. On k days, Stephen practiced exactly 24 minutes. [3 marks]
Find the possible values of k.
Fiona walks from her house to a bus stop where she gets a bus to school. Her
time, W minutes, to walk to the bus stop is normally distributed with
W ~ N(12, 3²).
Fiona always leaves her house at 07:15. The first bus that she can get departs at
07:30.

11a. Find the probability that it will take Fiona between 15 minutes and 30 [2 marks]
minutes to walk to the bus stop.
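Part 11a can be checked with Python's `statistics.NormalDist` in place of a graphic display calculator. The sketch assumes the walking time is W ~ N(12, 3²), that is, a mean of 12 and a standard deviation of 3.

```python
from statistics import NormalDist

walk = NormalDist(mu=12, sigma=3)            # walking time W in minutes

# P(15 < W < 30) as a difference of cumulative probabilities.
p_between = walk.cdf(30) - walk.cdf(15)
print(f"P(15 < W < 30) = {p_between:.3f}")
```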

The length of time, B minutes, of the bus journey to Fiona’s school is normally
distributed with B ~ N(50, σ²). The probability that the bus journey takes less than
60 minutes is 0.941.

11b. Find σ. [3 marks]

11c. Find the probability that the bus journey takes less than 45 minutes. [2 marks]
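Parts 11b and 11c can be sketched as below: standardise the given probability to recover σ, then reuse it. The sketch assumes B ~ N(50, σ²) with P(B < 60) = 0.941, as read from the text above.

```python
from statistics import NormalDist

# 11b: P(B < 60) = 0.941 means (60 - 50) / sigma equals the 0.941 quantile of Z.
z = NormalDist().inv_cdf(0.941)
sigma = (60 - 50) / z

# 11c: use the fitted distribution to find P(B < 45).
bus = NormalDist(mu=50, sigma=sigma)
p_under_45 = bus.cdf(45)
print(f"sigma = {sigma:.2f}, P(B < 45) = {p_under_45:.3f}")
```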

If Fiona misses the first bus, there is a second bus which departs at 07:45. She
must arrive at school by 08:30 to be on time. Fiona will not arrive on time if she
misses both buses. The variables W and B are independent.

11d. Find the probability that Fiona will arrive on time. [5 marks]

11e. This year, Fiona will go to school on 183 days. [2 marks]
Calculate the number of days Fiona is expected to arrive on time.
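A possible approach to parts 11d and 11e is to split on which bus Fiona catches. The sketch assumes she catches the 07:30 bus if W < 15 and the 07:45 bus if 15 < W < 30, and that arriving by 08:30 then requires B < 60 or B < 45 respectively.

```python
from statistics import NormalDist

walk = NormalDist(mu=12, sigma=3)
sigma = (60 - 50) / NormalDist().inv_cdf(0.941)   # from P(B < 60) = 0.941
bus = NormalDist(mu=50, sigma=sigma)

p_first_bus = walk.cdf(15)                        # reaches the stop before 07:30
p_second_bus = walk.cdf(30) - walk.cdf(15)        # misses 07:30, catches 07:45

# W and B are independent, so the probabilities multiply within each case.
p_on_time = p_first_bus * bus.cdf(60) + p_second_bus * bus.cdf(45)
expected_days = 183 * p_on_time
print(f"P(on time) = {p_on_time:.3f}, expected days = {expected_days:.0f}")
```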

A group of 120 students sat a history exam. The cumulative frequency graph
shows the scores obtained by the students.

12a. Find the median of the scores obtained. [1 mark]

The students were awarded a grade from 1 to 5, depending on the score obtained
in the exam. The number of students receiving each grade is shown in the
following table.

12b. Find an expression for a in terms of b. [2 marks]

The mean grade for these students is 3.65.

12c. Find the number of students who obtained a grade 5. [3 marks]

12d. Find the minimum score needed to obtain a grade 5. [2 marks]

On Paul’s farm, potatoes are packed in sacks labelled 50 kg. The weights of the
sacks of potatoes can be modelled by a normal distribution with mean weight
49.8 kg and standard deviation 0.9 kg.

13a. Find the probability that a sack is under its labelled weight. [2 marks]
13b. Find the lower quartile of the weights of the sacks of potatoes. [2 marks]

13c. The sacks of potatoes are transported in crates. There are 10 sacks in [3 marks]
each crate and the weights of the sacks of potatoes are independent of
each other.
Find the probability that the total weight of the sacks of potatoes in a crate
exceeds 500 kg.
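Question 13 can be checked with the sketch below, which assumes sack weights follow N(49.8, 0.9²). For part 13c it uses the fact that a sum of 10 independent normal variables is normal, with the means and the variances each added.

```python
import math
from statistics import NormalDist

sack = NormalDist(mu=49.8, sigma=0.9)

p_under_label = sack.cdf(50)                 # 13a: P(weight < 50)
lower_quartile = sack.inv_cdf(0.25)          # 13b: weight with 25% of sacks below it

# 13c: total of 10 independent sacks has mean 10 * 49.8 and variance 10 * 0.9**2.
crate = NormalDist(mu=10 * 49.8, sigma=0.9 * math.sqrt(10))
p_crate_over_500 = 1 - crate.cdf(500)

print(f"13a: {p_under_label:.3f}")
print(f"13b: {lower_quartile:.1f} kg")
print(f"13c: {p_crate_over_500:.3f}")
```

Note that the standard deviation of the total is 0.9 × √10, not 0.9 × 10, because variances add for independent variables rather than standard deviations.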
The following box-and-whisker plot shows the number of text messages sent by
students in a school on a particular day.

14a. Find the value of the interquartile range. [2 marks]

14b. One student sent k text messages, where k > 11. Given that k is an [4 marks]
outlier, find the least value of k.
In a high school, 160 students completed a questionnaire which asked for the
number of people they are following on a social media website. The results were
recorded in the following box-and-whisker diagram.

The following incomplete table shows the distribution of the responses from these
160 students.

15. Write down the mid-interval value for the 100 < x ≤ 150 group. [1 mark]
A transportation company owns 30 buses. The distance that each bus has
travelled since being purchased by the company is recorded. The cumulative
frequency curve for these data is shown.

16a. Find the number of buses that travelled a distance between 15000 and [2 marks]
20000 kilometres.
16b. Use the cumulative frequency curve to find the median distance. [2 marks]

16c. Use the cumulative frequency curve to find the lower quartile. [1 mark]

16d. Use the cumulative frequency curve to find the upper quartile. [1 mark]

16e. Hence write down the interquartile range. [1 mark]


16f. Write down the percentage of buses that travelled a distance greater [1 mark]
than the upper quartile.

16g. Find the number of buses that travelled a distance less than or equal to [1 mark]
12 000 km.

It is known that 8 buses travelled more than m kilometres.

16h. Find the value of m. [2 marks]

16i. The smallest distance travelled by one of the buses was 2500 km. [4 marks]
The longest distance travelled by one of the buses was 23 000 km.
On graph paper, draw a box-and-whisker diagram for these data. Use a scale of
2 cm to represent 5000 km.

Chicken eggs are classified by grade (4, 5, 6, 7 or 8), based on weight. A mixed
carton contains 12 eggs and could include eggs from any grade. As part of the
science project, Rocky buys 9 mixed cartons and sorts the eggs according to their
weight.

17a. State whether the weight of the eggs is a continuous or discrete variable. [1 mark]

17b. Write down the modal grade of the eggs. [1 mark]

17c. Use your graphic display calculator to find an estimate for the standard [2 marks]
deviation of the weight of the eggs.
17d. The mean weight of these eggs is 64.9 grams, correct to three [2 marks]
significant figures.
Use the table and your answer to part (c) to find the smallest possible number
of eggs that could be within one standard deviation of the mean.

A factory, producing plastic gifts for a fast food restaurant’s Jolly meals, claims
that just 1% of the toys produced are faulty.
A restaurant manager wants to test this claim. A box of 200 toys is delivered to
the restaurant. The manager checks all the toys in this box and four toys are
found to be faulty.

18a. Identify the type of sampling used by the restaurant manager. [1 mark]

The restaurant manager performs a one-tailed hypothesis test, at the 10%
significance level, to determine whether the factory’s claim is reasonable. It is
known that faults in the toys occur independently.

18b. Write down the null and alternative hypotheses. [2 marks]

18c. Find the p-value for the test. [2 marks]


18d. State the conclusion of the test. Give a reason for your answer. [2 marks]
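The test in parts 18c and 18d can be sketched as follows. It assumes the hypotheses are H0: p = 0.01 against H1: p > 0.01, so that under H0 the number of faulty toys in the box is X ~ B(200, 0.01) and the p-value is P(X ≥ 4). The helper `binom_pmf` is a hypothetical name introduced here.

```python
import math

n, p0, observed = 200, 0.01, 4               # box size, claimed fault rate, faults found

def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# One-tailed p-value: P(X >= 4) = 1 - P(X <= 3) under the null hypothesis.
p_value = 1 - sum(binom_pmf(k, n, p0) for k in range(observed))
reject_h0 = p_value < 0.10                   # compare with the 10% significance level
print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the p-value exceeds 0.10, so there would be insufficient evidence to reject the factory's claim.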

As part of his mathematics exploration about classic books, Jason investigated the
time taken by students in his school to read the book The Old Man and the Sea.
He collected his data by stopping and asking students in the school corridor, until
he reached his target of 10 students from each of the literature classes in his
school.

19a. State which of the two sampling methods, systematic or quota, Jason has [1 mark]
used.
Jason constructed the following box and whisker diagram to show the number of
hours students in the sample took to read this book.

19b. Write down the median time to read the book. [1 mark]

19c. Calculate the interquartile range. [2 marks]

Mackenzie, a member of the sample, took 25 hours to read the novel. Jason
believes Mackenzie’s time is not an outlier.

19d. Determine whether Jason is correct. Support your reasoning. [4 marks]

For each student interviewed, Jason recorded the time taken to read The Old Man
and the Sea (x), measured in hours, and paired this with their percentage score
on the final exam (y). These data are represented on the scatter diagram.

19e. Describe the correlation. [1 mark]


Jason correctly calculates the equation of the regression line y on x for these
students to be
y = −1.54x + 98.8.
He uses the equation to estimate the percentage score on the final exam for a
student who read the book in 1.5 hours.

19f. Find the percentage score calculated by Jason. [2 marks]
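Part 19f is a direct substitution into the regression line, sketched below with the coefficients read as y = −1.54x + 98.8. Whether the estimate is valid (part 19g) depends on the range of x values in the scatter diagram, which is not reproduced in this text.

```python
slope, intercept = -1.54, 98.8               # regression line y on x from the question
hours = 1.5                                  # reading time used for Jason's estimate

predicted_score = slope * hours + intercept  # substitute x = 1.5 into the line
print(f"predicted score = {predicted_score:.1f}%")
```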

19g. State whether it is valid to use the regression line y on x for Jason’s [2 marks]
estimate. Give a reason for your answer.

Jason found a website that rated the ‘top 50’ classic books. He randomly chose
eight of these classic books and recorded the number of pages. For example, Book
H is rated 44th and has 281 pages. These data are shown in the table.

Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.

19h. Copy and complete the information in the following table. [2 marks]

19i. Calculate the value of r_s. [2 marks]

19j. Interpret your result. [1 mark]

© International Baccalaureate Organization 2023


International Baccalaureate® - Baccalauréat International® - Bachillerato Internacional®

Printed for EIB Victor Hugo school
