Stat and Probability Test 2
x 6.3 4.1 5.6 9.2 7.8 8.2
y 9.2 4.9 8.9 10.3 8.9 9.8
(a) State null and alternative hypotheses which could be used to test whether there is a linear
correlation between X and Y. [2]
(c) State whether your result from part (b)(ii) indicates there is sufficient evidence to claim that,
at the 5% significance level, X and Y are not linearly correlated.
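The test described in parts (a) to (c) can be sketched as follows. This is a minimal illustration, not a mark scheme: part (b) did not survive in this copy, so the sketch assumes a two-tailed test of H0: ρ = 0 against H1: ρ ≠ 0 and compares the t statistic with the tabulated 5% critical value for 4 degrees of freedom. The names `pearson_r`, `t_stat` and `T_CRIT` are illustrative.

```python
import math

# Data from the table above.
x = [6.3, 4.1, 5.6, 9.2, 7.8, 8.2]
y = [9.2, 4.9, 8.9, 10.3, 8.9, 9.8]


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
n = len(x)

# Under H0 (rho = 0), this statistic follows a t distribution with n - 2 df.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

# Two-tailed 5% critical value of t with 4 degrees of freedom (standard tables).
T_CRIT = 2.776
reject_h0 = abs(t_stat) > T_CRIT

print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

If H0 is rejected, there is sufficient evidence at the 5% level of a linear correlation, so the data would not support a claim that X and Y are uncorrelated.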
A speed of $75.7\ \text{km h}^{-1}$ is two standard deviations from the mean.
(a) Find the standard deviation for the speed of the cars. [2]
(b) Show that the region of the normal distribution between p and q is not symmetrical about
the mean. [3]
3. [Maximum mark: 6] 23M.1.AHL.TZ1.16
The relationship between the intensity, I, of a light source and the distance, d, from the light source can be
modelled by $I = \dfrac{k}{d^2}$.
Pablo measures the intensity of a light source at different distances. The data collected is shown in the table.
d (m) 1 2 5
I (lm) 42 11 1.5
Pablo finds the sum of square residuals in the form $1.0641k^2 - 89.62k + c$.
(b) Hence find the least squares regression curve of the form $I = \dfrac{k}{d^2}$. [2]
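As a hedged check of Pablo's quadratic, the sketch below rebuilds the coefficients of the sum of squared residuals from the table and minimises it. Expanding $\sum (I_i - k/d_i^2)^2$ gives $a k^2 - b k + c$ with $a = \sum 1/d_i^4$, $b = 2\sum I_i/d_i^2$ and $c = \sum I_i^2$; the names `a`, `b`, `c` and `k_best` are illustrative only.

```python
# Data from the table above.
d = [1, 2, 5]
I = [42, 11, 1.5]

# Coefficients of SS(k) = a*k^2 - b*k + c, from expanding sum((I - k/d^2)^2).
a = sum(1 / di ** 4 for di in d)
b = 2 * sum(Ii / di ** 2 for Ii, di in zip(I, d))
c = sum(Ii ** 2 for Ii in I)

# A quadratic with a > 0 is minimised at its vertex, k = b / (2a).
k_best = b / (2 * a)

print(f"SS(k) = {a:.4f} k^2 - {b:.2f} k + {c:.2f}")
print(f"least squares k = {k_best:.2f}")
```

The first two coefficients reproduce the 1.0641 and 89.62 quoted in the question, which suggests the vertex of that quadratic is the intended route to part (b).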
4. [Maximum mark: 4] 23M.1.AHL.TZ2.2
A company that owns many restaurants wants to determine if there are differences in the quality of the food
cooked for three different meals: breakfast, lunch and dinner.
Their quality assurance team randomly selects 500 items of food to inspect. The quality of this food is
classified as perfect, satisfactory, or poor. The data is summarized in the following table.
A $\chi^2$ test at the 5% significance level is carried out to determine if there is significant evidence of a
difference in the quality of the food cooked for the three meals.
$H_0$: The quality of the food and the type of meal are independent.
$H_1$: The quality of the food and the type of meal are not independent.
(b) State, with justification, the conclusion for this test. [2]
5. [Maximum mark: 7] 23M.1.AHL.TZ2.3
The following Venn diagram shows two independent events, R and S. The values in the diagram represent
probabilities.
(a) Calculate the probability that the length of the seed is less than 3.7 cm. [2]
It is known that 30% of the seeds have a length greater than k cm.
The probability that he is successful in getting a seat on the bus for any single journey is 0.86.
(a) Determine the expected number of these 56 journeys for which Akar gets a seat on the bus. [1]
(b) Find the probability that Akar gets a seat on at least 50 journeys during these 28 days. [3]
The probability that Akar gets a seat on at most n journeys is at least 0.25.
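The binomial calculations in this question can be sketched with the standard library alone. The sketch assumes the journeys are independent, so the number of seats follows B(56, 0.86). The final part of the question is missing from this copy, so finding the smallest n that satisfies the last statement is an assumption about what was asked. The helper names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n_journeys = 56
p_seat = 0.86

# (a) Expected number of journeys with a seat: n * p.
expected_seats = n_journeys * p_seat

# (b) P(X >= 50) = 1 - P(X <= 49).
p_at_least_50 = 1 - binom_cdf(49, n_journeys, p_seat)

# Assumed final part: smallest n with P(X <= n) >= 0.25.
smallest_n = next(
    n for n in range(n_journeys + 1)
    if binom_cdf(n, n_journeys, p_seat) >= 0.25
)

print(f"E(X) = {expected_seats:.2f}")
print(f"P(X >= 50) = {p_at_least_50:.4f}")
print(f"smallest n = {smallest_n}")
```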
Athlete A B C D E F G H
Age (years) 13 17 22 18 19 25 11 36
Time (seconds) 13.4 14.6 13.4 12.9 12.0 11.8 17.0 13.1
Sung-Jin decides to calculate the Spearman’s rank correlation coefficient for his set of data.
Athlete A B C D E F G H
Age rank 3
Time rank 1
[2]
(d) Suggest a mathematical reason why Sung-Jin may have decided not to use Pearson’s
product-moment correlation coefficient with his data from the original table. [1]
(e.i) Find the coefficient of determination for the data from the original table. [2]
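A minimal sketch of the rank-based calculation follows. It assumes that rank 1 goes to the largest value and that tied values share the mean of the ranks they occupy. The direction of ranking does not change the coefficient as long as both variables are ranked the same way. Spearman's coefficient is then taken as Pearson's r applied to the ranks, and the coefficient of determination as r² for the original data. Function names are illustrative.

```python
import math

# Data from the original table (athletes A to H).
ages = [13, 17, 22, 18, 19, 25, 11, 36]
times = [13.4, 14.6, 13.4, 12.9, 12.0, 11.8, 17.0, 13.1]


def average_ranks(values):
    """Rank 1 = largest value; tied values receive the mean of their ranks."""
    order = sorted(values, reverse=True)
    # A value first appearing at 0-based index i with c copies occupies
    # ranks i+1 .. i+c, whose mean is i + (c + 1) / 2.
    return [order.index(v) + (order.count(v) + 1) / 2 for v in values]


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


age_ranks = average_ranks(ages)
time_ranks = average_ranks(times)

# Spearman's rank correlation: Pearson's r computed on the ranks.
r_s = pearson_r(age_ranks, time_ranks)

# Coefficient of determination for the original (unranked) data.
r_squared = pearson_r(ages, times) ** 2

print("age ranks: ", age_ranks)
print("time ranks:", time_ranks)
print(f"r_s = {r_s:.4f}, R^2 = {r_squared:.4f}")
```

Athletes A and C share a time of 13.4 seconds, so they each receive rank 3.5 under this convention.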
A $\chi^2$ goodness of fit test at the 5% significance level is carried out on the data.
(a) State the null and alternative hypotheses for this test. [2]
(b) Perform the test and give your conclusion in context. [4]
12. [Maximum mark: 6] 23M.1.AHL.TZ2.15
A random sample of eight packets of Apollo coffee granules are selected from a supermarket shelf.
The weights of the coffee granules present in each packet are as follows:
(a.i) Find an unbiased estimate for the mean weight of coffee granules in a packet of Apollo
coffee. [1]
(a.ii) Calculate a 95% confidence interval for the population mean. Give your answer to four
significant figures. [2]
(b) State one assumption you have made in order for your interval to be valid. [1]
(c) The label of each packet has a description which includes the phrase: “contains 226 g of
coffee granules”.
Using your answer to part (a)(ii), briefly comment on the claim on the label. [2]
13. [Maximum mark: 15] 23M.2.SL.TZ1.1
The mean annual temperatures for Earth, recorded at fifty-year intervals, are shown in the table.
Year °C (y) 8.73 9.22 9.10 9.12 9.13 9.45 9.76
Tami creates a linear model for this data by finding the equation of the straight line passing through the
points with coordinates (1708, 8.73) and (1958, 9.45).
(a) Calculate the gradient of the straight line that passes through these two points. [2]
(b.i) Interpret the meaning of the gradient in the context of the question. [1]
(c) Find the equation of this line giving your answer in the form y = mx + c. [2]
(d) Use Tami’s model to estimate the mean annual temperature in the year 2000. [2]
(e.ii) Find the value of r, the Pearson’s product-moment correlation coefficient. [1]
(f) Use Thandizo’s model to estimate the mean annual temperature in the year 2000. [2]
Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed
15 °C.
(g) State two reasons why Thandizo’s prediction may not be valid. [2]
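Tami's two-point model from parts (a), (c) and (d) can be sketched as below, using only the two coordinates given in the question. Thandizo's regression line is not sketched, because the row of years is missing from the table in this copy. The names `m`, `c` and `tami_model` are illustrative.

```python
# The two points Tami uses, as (year, mean annual temperature in °C).
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45

# (a) Gradient: change in temperature per year.
m = (y2 - y1) / (x2 - x1)

# (c) Intercept from y = m*x + c, using the first point.
c = y1 - m * x1


def tami_model(year):
    """Temperature predicted by the straight line through the two points."""
    return m * year + c


# (d) Estimate for the year 2000.
estimate_2000 = tami_model(2000)

print(f"m = {m:.5f} °C per year, c = {c:.3f}")
print(f"estimate for 2000: {estimate_2000:.2f} °C")
```

The gradient is the modelled average rise in mean annual temperature per year, which is the interpretation part (b.i) asks for.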
14. [Maximum mark: 17] 23M.2.SL.TZ2.4
It is claimed that a new remedy cures 82% of the patients with a particular medical problem.
This remedy is to be used by 115 patients, and it is assumed that the 82% claim is true.
(a) Find the probability that exactly 90 of these patients will be cured. [3]
(b) Find the probability that at least 95 of these patients will be cured. [2]
(c) Find the variance in the possible number of patients that will be cured. [2]
The probability that at least n patients will be cured is less than 30%.
A clinic is interested to see if the mean recovery time of their patients who tried the new remedy is less than
that of their patients who continued with an older remedy. The clinic randomly selects some of their patients
and records their recovery time in days. The results are shown in the table below.
The data is assumed to follow a normal distribution and the population variance is the same for the two
groups. A t-test is used to compare the means of the two groups at the 10% significance level.
(e) State the appropriate null and alternative hypotheses for this t-test. [2]
(g) State the conclusion for this test. Give a reason for your answer. [2]
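Parts (a) to (c) can be sketched under the assumption that patients are cured independently, so the number cured follows B(115, 0.82). Part (d) is missing from this copy; the sketch assumes it asks for the least n that satisfies the statement about "at least n patients". The t-test in parts (e) to (g) is not sketched because the recovery-time table did not survive. Helper names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


n_patients = 115
p_cure = 0.82

# (a) P(X = 90).
p_exactly_90 = binom_pmf(90, n_patients, p_cure)

# (b) P(X >= 95).
p_at_least_95 = binom_tail(95, n_patients, p_cure)

# (c) Var(X) = n * p * (1 - p).
variance = n_patients * p_cure * (1 - p_cure)

# Assumed part (d): least n with P(X >= n) < 0.30.
least_n = next(
    n for n in range(n_patients + 1)
    if binom_tail(n, n_patients, p_cure) < 0.30
)

print(f"P(X = 90)  = {p_exactly_90:.4f}")
print(f"P(X >= 95) = {p_at_least_95:.4f}")
print(f"Var(X)     = {variance:.3f}")
print(f"least n    = {least_n}")
```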
17. [Maximum mark: 17] 23M.2.AHL.TZ1.3
A large international sports tournament tests their athletes for banned substances.
They interpret a positive test result as meaning that the athlete uses banned substances.
A negative result means that they do not.
If an athlete uses banned substances, the probability that they will test positive is 0.71.
If an athlete does not use banned substances, the probability that they will test negative is 0.98.
(a) Using the information given, complete the following tree diagram.
[2]
(b.i) Determine the probability that a randomly selected athlete does not use banned substances
and tests negative. [2]
(b.ii) If two athletes are selected at random, calculate the probability that both athletes do not
use banned substances and both test negative. [2]
(c.i) Calculate the probability that a randomly selected athlete will receive an incorrect test
result. [3]
(c.ii) A random sample of 1300 athletes at the tournament are selected for testing. Calculate the
expected number of athletes in the sample that will receive an incorrect test result. [2]
Team X are competing in the tournament. There are 20 athletes in this team. It is known that none of the
athletes in Team X use banned substances.
(d) Calculate the probability that none of the athletes in Team X will test positive. [4]
(e) Determine the probability that more than 2 athletes in Team X will test positive. [2]
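Parts (d) and (e) depend only on the 0.98 figure, so they can be sketched without the tree diagram. The sketch assumes the 20 test results are independent, so the number of positives in Team X follows B(20, 0.02). Parts (b) and (c) are not sketched because the proportion of athletes who use banned substances is not stated in this copy. Names are illustrative.

```python
from math import comb

team_size = 20

# A clean athlete tests negative with probability 0.98,
# so a false positive has probability 1 - 0.98.
p_negative = 0.98
p_false_positive = 1 - p_negative


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# (d) All 20 clean athletes test negative.
p_none_positive = p_negative ** team_size

# (e) P(X > 2) = 1 - P(X <= 2).
p_more_than_2 = 1 - sum(
    binom_pmf(k, team_size, p_false_positive) for k in range(3)
)

print(f"P(no positives)       = {p_none_positive:.4f}")
print(f"P(more than 2 positive) = {p_more_than_2:.5f}")
```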
18. [Maximum mark: 15] 23M.2.AHL.TZ1.5
Goran is interested in the number of sightings of a particular bird each week in the 50 weeks following the
first day of September. He collects some data which is shown in the table.
Number of weeks 8 16 13 8 3 2 0
The sample mean number of sightings per week for this data is 1.76.
(a) Calculate the unbiased estimate of the population variance of sightings per week. [3]
(b) State why your answer to part (a) supports Goran’s belief. [1]
Goran decides to test at the 5% significance level to see if his belief is correct.
His null hypothesis is $X \sim \mathrm{Po}(m)$, where the random variable, X, is defined as the number of sightings per
week.
Goran estimates parameter m to be the mean of the sample, 1.76. He calculates the expected frequencies
for sightings per week in the 50 weeks after the first day of September. These are shown to two decimal
places in the following table.
Expected frequencies 8.60 15.14 13.32 7.82 j k
(c.i) j; [3]
(c.ii) k. [2]
(d) State a reason why Goran should combine groups to conduct his significance test. [1]
(e) Write down the degrees of freedom for the test. [1]
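The calculations in parts (a) and (c) can be sketched as follows. Two assumptions are needed because the table's top row is missing from this copy: the seven frequency columns are taken to be 0 to 6 sightings per week (which reproduces the stated mean of 1.76), and the last expected-frequency cell is taken to pool 5 or more sightings so that the expected frequencies total 50. The names are illustrative.

```python
import math

# ASSUMPTION: columns correspond to 0..6 sightings per week.
sightings = [0, 1, 2, 3, 4, 5, 6]
weeks = [8, 16, 13, 8, 3, 2, 0]

n = sum(weeks)
mean = sum(x * f for x, f in zip(sightings, weeks)) / n

# (a) Unbiased estimate of the population variance (divisor n - 1).
var_unbiased = sum(f * (x - mean) ** 2 for x, f in zip(sightings, weeks)) / (n - 1)


def poisson_pmf(k, m):
    """P(X = k) for X ~ Po(m)."""
    return math.exp(-m) * m ** k / math.factorial(k)


# (c) Expected frequencies for 0, 1, 2, 3 and 4 sightings under Po(mean).
expected = [n * poisson_pmf(k, mean) for k in range(5)]

# ASSUMPTION: the final cell pools 5 or more sightings.
expected_5_or_more = n - sum(expected)

print(f"n = {n}, mean = {mean:.2f}, unbiased variance = {var_unbiased:.4f}")
print("expected (0..4):", [round(e, 2) for e in expected])
print(f"expected (5 or more): {expected_5_or_more:.2f}")
```

For a Poisson distribution the mean and variance are equal, so a sample variance close to 1.76 is what part (b) relies on. Cells with small expected frequencies are the usual reason for combining groups in part (d).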