Revision Sheet Chapter 19 MS

Revision sheet 2- Chapter 19 [158 marks]
1. [Maximum mark: 24] 21M.3.AHL.TZ2.1

Juliet is a sociologist who wants to investigate if income affects happiness
amongst doctors. This question asks you to review Juliet’s methods and
conclusions.
Juliet obtained a list of email addresses of doctors who work in her city. She
contacted them and asked them to fill in an anonymous questionnaire.
Participants were asked to state their annual income and to respond to a set of
questions. The responses were used to determine a happiness score out of 100. Of the
415 doctors on the list, 11 replied.
Juliet’s results are summarized in the following table.
For the remaining ten responses in the table, Juliet calculates the mean
happiness score to be 52.5.
Juliet decides to carry out a hypothesis test on the correlation coefficient to
investigate whether increased annual income is associated with greater
happiness.
Juliet wants to create a model to predict how changing annual income might
affect happiness scores. To do this, she assumes that annual income in dollars, X,
is the independent variable and the happiness score, Y , is the dependent
variable.
She first considers a linear model of the form
Y = aX + b.
Juliet then considers a quadratic model of the form
Y = cX² + dX + e.
After presenting the results of her investigation, a colleague questions whether
Juliet’s sample is representative of all doctors in the city.
A report states that the mean annual income of doctors in the city is $80 000.
Juliet decides to carry out a test to determine whether her sample could
realistically be taken from a population with a mean of $80 000.
(a.i) Describe one way in which Juliet could improve the reliability
of her investigation. [1]
Markscheme
Any one from: R1
increase sample size / increase response rate / repeat process

check whether sample is representative
test-retest participants or do a parallel test
use a stratified sample
use a random sample
Note: Do not condone:

Ask different types of doctor
Ask for proof of income
Ask for proof of being a doctor
Remove anonymity
Remove response K.
[1 mark]
(a.ii) Describe one criticism that can be made about the validity of
Juliet’s investigation. [1]
Markscheme
Any one from: R1
non-random sampling means a subset of population might be responding

self-reported happiness is not the same as happiness
happiness is not a constant / cannot be quantified / is difficult to measure
income might include external sources
Juliet is only sampling doctors in her city
correlation does not imply causation
sample might be biased
Note: Do not condone the following common but vague responses unless
they make a clear link to validity:
Sample size is too small
Result is not generalizable
There may be other variables Juliet is ignoring
Sample might not be representative
[1 mark]
(b) Juliet classifies response K as an outlier and removes it from the
data. Suggest one possible justification for her decision to
remove it. [1]
Markscheme
because the income is very different / implausible / clearly contrived R1
Note: Answers must explicitly reference "income" to get credit.
[1 mark]
(c.i) Calculate the mean annual income for these remaining
responses. [2]
Markscheme
($) 90 200 (M1)A1
[2 marks]
(c.ii) Determine the value of r, Pearson’s product-moment
correlation coefficient, for these remaining responses. [2]
Markscheme
r = 0.558 (0.557723…) A2
[2 marks]
(d.i) State why the hypothesis test should be one-tailed. [1]
Markscheme
EITHER
only looking for change in one direction R1
OR
only looking for greater happiness with greater income R1
OR
only looking for evidence of positive correlation R1
[1 mark]
(d.ii) State the null and alternative hypotheses for this test. [2]
Markscheme
H0 : ρ = 0; H1 : ρ > 0 A1A1
Note: Award A1 for ρ seen (do not accept r), A1 for both correct hypotheses,
using their ρ or r. Accept an equivalent statement in words, however
reference to “correlation for the population” or “association for the
population” must be explicit for the first A1 to be awarded.
Watch out for a null hypothesis in words similar to “Annual income is not
associated with greater happiness”. This is effectively saying ρ ≤ 0 and
should not be condoned.
[2 marks]
(d.iii) The critical value for this test, at the 5% significance level, is
0.549. Juliet assumes that the population is bivariate normal.
Determine whether there is significant evidence of a positive
correlation between annual income and happiness. Justify your
answer.
[2]
Markscheme
METHOD 1 – using critical value of r
0.558 > 0.549 (0.557723… > 0.549) R1
(therefore significant evidence of) a positive correlation A1
Note: Do not award R0A1.
METHOD 2 – using p-value
0.0469 < 0.05 (0.0469463… < 0.05) A1
Note: Follow through from their r-value from part (c)(ii).
(therefore significant evidence of) a positive correlation A1
Note: Do not award A0A1.
[2 marks]
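As a cross-check on METHOD 2, the p-value can be reproduced from r and the sample size. The sketch below assumes the usual test statistic t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom, takes n = 10 from the ten remaining responses, and integrates the t density numerically instead of using a calculator. The function names are illustrative and not part of the markscheme.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def one_tailed_p_from_r(r, n):
    """One-tailed p-value for H1: rho > 0, given sample correlation r and sample size n."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return upper_tail(t, n - 2)


p_value = one_tailed_p_from_r(0.557723, 10)
print(round(p_value, 4))  # compare with 0.0469 in METHOD 2
```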
(e.i) Use Juliet’s data to find the value of a and of b. [1]
Markscheme
a = 0.000126 (0.000125842…), b = 41.1 (41.1490…) A1

[1 mark]
(e.ii) Interpret, referring to income and happiness, what the value of
a represents. [1]
Markscheme
EITHER
the amount the happiness score increases for every $1 increase in (annual)
income A1
OR
rate of change of happiness with respect to (annual) income A1
Note: Accept equivalent responses e.g. an increase of 1.26 in happiness for
every $10 000 increase in salary.
[1 mark]
(e.iii) Find the value of c, of d and of e. [1]
Markscheme
c = −2.06 × 10⁻⁹ (−2.06191… × 10⁻⁹),
d = 7.05 × 10⁻⁴ (7.05272… × 10⁻⁴),
e = 12.6 (12.5878…) A1
[1 mark]
(e.iv) Find the coefficient of determination for each of the two
models she considers. [2]
Markscheme
for quadratic model: R² = 0.659 (0.659145…) A1
for linear model: R² = 0.311 (0.311056…) A1
Note: Follow through from their r value from part (c)(ii).
[2 marks]
(e.v) Hence compare the two models. [1]
Markscheme
EITHER
quadratic model is a better fit to the data / more accurate A1
OR
quadratic model explains a higher proportion of the variance A1
[1 mark]
(e.vi) Juliet decides to use the coefficient of determination to choose
between these two models.
Comment on the validity of her decision. [1]
Markscheme
EITHER
not valid, R² not a useful measure to compare models with different
numbers of parameters A1
OR
not valid, quadratic model will always have a better fit than a linear model
A1
Note: Accept any other sensible critique of the validity of the method. Do
not accept any answers which focus on the conclusion rather than the
method of model selection.
[1 mark]
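One way to see the point made in (e.vi) is to penalise the extra parameter of the quadratic model. The sketch below uses adjusted R², which is not part of the markscheme and is only one possible remedy. It assumes n = 10 data points and takes the two R² values from part (e.iv); the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, predictors):
    """Adjusted R^2 for a model with the given number of predictor terms."""
    return 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)


n = 10  # assumed: the ten responses left after removing K
adj_linear = adjusted_r_squared(0.311056, n, predictors=1)     # Y = aX + b
adj_quadratic = adjusted_r_squared(0.659145, n, predictors=2)  # Y = cX^2 + dX + e
print(round(adj_linear, 3), round(adj_quadratic, 3))
```

Under these assumptions the quadratic model still scores higher, although the gap narrows slightly once the extra parameter is accounted for.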
(f.i) State the name of the test which Juliet should use. [1]
Markscheme
(single sample) t-test A1
[1 mark]
(f.ii) State the null and alternative hypotheses for this test. [1]
Markscheme
EITHER
H0 : μ = 80 000; H1 : μ ≠ 80 000 A1
OR
H0 : (sample is drawn from a population where) the population mean is
$80 000
H1 : the population mean is not $80 000 A1
Note: Do not allow FT from an incorrect test in part (f)(i) other than a z-test.
[1 mark]
(f.iii) Perform the test, using a 5% significance level, and state your
conclusion in context. [3]
Markscheme
p = 0.610 (0.610322…) A1
Note: For a z-test follow through from part (f)(i), either 0.578 (from biased
estimate of variance) or 0.598 (from unbiased estimate of variance).
0.610 > 0.05 R1
EITHER
no (significant) evidence that mean differs from $80 000 A1
OR
the sample could plausibly have been drawn from the quoted population
A1
Note: Allow R1FTA1FT from an incorrect p-value, but the final A1 must still be
in the context of the original research question.
[3 marks]
2. [Maximum mark: 6] 20N.1.SL.TZ0.T_10
On 90 journeys to his office, Isaac noted whether or not it rained. He also
recorded his journey time to the office, and classified each journey as short,
medium or long.
Of the 90 journeys to the office, there were 3 short journeys when it rained, 22
medium journeys when it rained, and 15 long journeys when it rained. There
were also 14 short journeys when it did not rain.
Isaac carried out a χ² test at the 5% level of significance on these data, looking
at the weather and the types of journeys.
(a) Write down H0, the null hypothesis for this test. [1]
Markscheme
* This question is from an exam for a previous syllabus, and may contain
minor differences in marking or structure.
type of journey and whether it rained are independent (A1) (C1)
Note: Accept “there is no association” or “not dependent”. Do not accept
“not related” or “not correlated”. Accept equivalent terms for ‘type of
journey’.
[1 mark]
(b) Find the expected number of short trips when it rained. [3]
Markscheme
17/90 × 40/90 × 90 OR (17 × 40)/90 (A1)(M1)
Note: Award (A1) for 17 or 40 seen. Award (M1) for 17/90 × 40/90 × 90 OR
(17 × 40)/90 seen.
7.56 (7.55555…, 68/9) (A1) (C3)
[3 marks]
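The expected-frequency working above follows the rule "row total × column total ÷ grand total". A minimal sketch of that rule, using exact fractions so the 68/9 form is visible; the function name is illustrative.

```python
from fractions import Fraction


def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count under independence: row total x column total / grand total."""
    return Fraction(row_total * column_total, grand_total)


# 40 journeys with rain (3 + 22 + 15), 17 short journeys (3 + 14), 90 journeys in total.
expected_short_rain = expected_frequency(40, 17, 90)
print(expected_short_rain, round(float(expected_short_rain), 2))
```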
(c) The p-value for this test is 0.0206.
State the conclusion to Isaac’s test. Justify your reasoning. [2]
Markscheme
reject (do not accept) H0 (A1)
OR
type of journey and whether it rained are not independent (A1)
Note: Follow through from part (a) for their phrasing of the null hypothesis.
0.0206 < 0.05 (R1) (C2)
Note: A comparison must be seen, either numerically or in words (e.g.
p-value < significance level). Do not award (R0)(A1).
[2 marks]
Don took part in a project investigating wind speed, x km h⁻¹, and the time, y
minutes, to fully charge a solar powered robot.
The investigation was carried out six times. The results are recorded in the table.
M is the point with coordinates (x̄, ȳ).
(a) On graph paper, draw a scatter diagram to show the results of
Don’s investigation. Use a scale of 1 cm to represent 2 units on
the x-axis, and 1 cm to represent 5 units on the y-axis. [4]
Markscheme
(A4)
Note: Award (A1) for correct scales and labels.
Award (A3) for all six points correctly plotted.
Award (A2) for four or five points correctly plotted.
Award (A1) for two or three points correctly plotted.
Award at most (A0)(A3) if axes reversed.
If graph paper is not used, award at most (A1)(A0)(A0)(A0).
[4 marks]
(b.i) Calculate x̄, the mean wind speed. [1]
Markscheme
19 (km h⁻¹) (A1)
[1 mark]
(b.ii) Calculate ȳ, the mean time to fully charge the robot. [1]
Markscheme
32 (minutes) (A1)
[1 mark]
(c) Plot and label the point M on your scatter diagram. [2]
Markscheme
point in correct position, labelled M (A1)(ft)(A1)
Note: Award (A1)(ft) for point plotted in correct position, (A1) for point
labelled M. Follow through from their part (b).
[2 marks]
(d.i) Calculate r, Pearson’s product–moment correlation coefficient. [2]
Markscheme
(r =) 0.944 (0.943733…) (G2)
Note: Award (G1) for 0.943 (incorrect rounding).
[2 marks]
(d.ii) Describe the correlation between the wind speed and the time
to fully charge the robot. [2]
Markscheme
(very) strong positive correlation (A1)(ft)(A1)(ft)
Note: Award (A1)(ft) for (very) strong. Award (A1)(ft) for positive. Follow through
from their part (d)(i). If there is no answer to part (d)(i), award at most (A0)(A1)
for a correct direction.
[2 marks]
(e.i) Write down the equation of the regression line y on x, in the
form y = mx + c. [2]
Markscheme
y = 0.465x + 23.2 (y = 0.465020…x + 23.1646…) (A1)(A1)(G2)
Note: Award (A1) for 0.465x. Award (A1) for 23.2. If the answer is not an
equation, award at most (A1)(A0).
[2 marks]
(e.ii) Draw this regression line on your scatter diagram. [2]
Markscheme
regression line through their M (A1)(ft)
regression line through their (0, 23.2) (A1)(ft)
Note: Award a maximum of (A1)(A0) if the line is not straight/ruler not used.
Award (A0)(A0) if the points are connected.
Follow through from their point M in part (b) and their y-intercept in part
(e)(i).
If M is not plotted or labelled, then follow through from part (b).
[2 marks]
(e.iii) Hence or otherwise estimate the charging time when the wind
speed is 27 km h⁻¹. [2]
Markscheme
(y =) 0.465020…(27) + 23.1646… (M1)
Note: Award (M1) for correct substitution into their regression equation.
35.7 (minutes) (35.7201…) (A1)(ft)(G2)
Note: Follow through from their equation in part (e)(i).
OR
an attempt to use their regression line to find the y value at x = 27
Note: Award (M1) for an indication of using their regression line. This must
be illustrated by vertical and horizontal lines or marks at the correct place(s)
on their scatter diagram.
35.7 (minutes) (A1)(ft)
Note: Follow through from part (e)(ii).
[2 marks]
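The substitution in the first method can be checked with a short sketch. It uses the truncated coefficients quoted in part (e.i), so the last digits may differ slightly from the calculator value; the function name is illustrative.

```python
def predict(gradient, intercept, x):
    """Value of the regression line y = gradient * x + intercept at x."""
    return gradient * x + intercept


# Coefficients as quoted (truncated) in part (e.i); wind speed 27 km/h.
charging_time = predict(0.465020, 23.1646, 27)
print(round(charging_time, 1))
```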
(f) Don concluded from his investigation: “There is no causation
between wind speed and the time to fully charge the robot”.
In the context of the question, briefly explain the meaning of
“no causation”. [1]
Markscheme
wind speed does not cause a change in the time to charge (the robot)
(A1)
Note: Award (A1) for a statement that communicates the meaning of a non-
causal relationship between the two variables.
[1 mark]
Casanova restaurant offers a set menu where a customer chooses one of the
following meals: pasta, fish or shrimp.
The manager surveyed 150 customers and recorded the customer’s age and
chosen meal. The data is shown in the following table.
A χ² test was performed at the 10% significance level. The critical value for this
test is 4.605.
Write down
A customer is selected at random.
(a) State H0, the null hypothesis for this test. [1]
Markscheme
(H0 :) choice of meal is independent of age (or equivalent) (A1)
Note: Accept "not associated" or "not dependent" instead of independent.
In lieu of "age", accept an equivalent alternative such as "being a child or
adult".
[1 mark]
(b) Write down the number of degrees of freedom. [1]

Markscheme
2 (A1)
[1 mark]
(c) Show that the expected number of children who chose shrimp
is 31, correct to two significant figures. [2]
Markscheme
69/150 × 67/150 × 150 OR (69 × 67)/150 (M1)
Note: Award (M1) for correct substitution into expected frequency formula.
30.82 (30.8) (A1)
31 (AG)
Note: Both an unrounded answer that rounds to the given answer and
rounded answer must be seen for the (A1) to be awarded.
[2 marks]
(d.i) the χ² statistic. [2]
Markscheme
(χ²calc =) 2.66 (2.657537…) (G2)
[2 marks]
(d.ii) the p-value. [1]
Markscheme
(p-value =) 0.265 (0.264803…) (G1)
Note: Award (G0)(G2) if the χ² statistic is missing or incorrect and the p-value
is correct.
[1 mark]
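With 2 degrees of freedom the χ² distribution has a closed-form upper tail, P(χ² > x) = e^(−x/2), so the p-value and the 4.605 critical value quoted in the question can be checked without tables. The sketch below is valid only for 2 degrees of freedom, and the function names are illustrative.

```python
import math


def chi2_upper_tail_df2(x):
    """P(chi-squared > x) when there are exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi2_critical_df2(significance):
    """Critical value for the given significance level, 2 degrees of freedom only."""
    return -2 * math.log(significance)


p_value = chi2_upper_tail_df2(2.657537)
critical_value = chi2_critical_df2(0.10)
print(round(p_value, 3), round(critical_value, 3))
```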
(e) State the conclusion for this test. Give a reason for your answer. [2]
Markscheme
0.265 > 0.10 OR 2.66 < 4.605 (R1)(ft)
the null hypothesis is not rejected (A1)(ft)
OR
the choice of meal is independent of age (or equivalent) (A1)(ft)
Note: Award (R1)(ft) for a correct comparison of either their χ² statistic to
the χ² critical value or their p-value to the significance level.
Condone “accept” in place of “not reject”.
Follow through from parts (a) and (d).
Do not award (A1)(ft)(R0).
[2 marks]
(f.i) Calculate the probability that the customer is an adult. [2]
Markscheme
81/150 (27/50, 0.54, 54%) (A1)(A1)(G2)
Note: Award (A1) for numerator, (A1) for denominator.

[2 marks]
(f.ii) Calculate the probability that the customer is an adult or that
the customer chose shrimp. [2]
Markscheme
116/150 (58/75, 0.773, 0.773333…, 77.3%) (A1)(A1)(G2)
[2 marks]
(f.iii) Given that the customer is a child, calculate the probability that
they chose pasta or fish. [2]
Markscheme
34/69 (0.493, 0.492753…, 49.3%) (A1)(A1)(G2)
[2 marks]
5. [Maximum mark: 6] 19M.2.SL.TZ2.S_1
A group of 7 adult men wanted to see if there was a relationship between their
Body Mass Index (BMI) and their waist size. Their waist sizes, in centimetres, were
recorded and their BMI calculated. The following table shows the results.
The relationship between x and y can be modelled by the regression equation
y = ax + b.
(a.i) Write down the value of a and of b. [3]
Markscheme
valid approach (M1)
eg correct value for a or b (or for correct r or r² = 0.955631 seen in (ii))
0.141120, 11.1424
a = 0.141, b = 11.1 A1A1 N3
[3 marks]
(a.ii) Find the correlation coefficient. [1]
Markscheme
0.977563
r = 0.978 A1 N1
[1 mark]
(b) Use the regression equation to estimate the BMI of an adult
man whose waist size is 95 cm. [2]
Markscheme
correct substitution into their regression equation (A1)
eg 0.141(95) + 11.1
24.5488
24.5 A1 N2
[2 marks]
6. [Maximum mark: 13] 19M.2.SL.TZ2.T_1
Sila High School has 110 students. They each take exactly one language class
from a choice of English, Spanish or Chinese. The following table shows the
number of female and male students in the three different language classes.
A χ² test was carried out at the 5 % significance level to analyse the relationship
between gender and student choice of language class.
Use your graphic display calculator to write down
The critical value at the 5 % significance level for this test is 5.99.
One student is chosen at random from this school.
Another student is chosen at random from this school.
(a) Write down the null hypothesis, H0 , for this test. [1]
Markscheme
(H0:) (choice of) language is independent of gender (A1)
Note: Accept “there is no association between language (choice) and
gender”. Accept “language (choice) is not dependent on gender”. Do not
accept “not related” or “not correlated” or “not influenced”.
[1 mark]
(b) State the number of degrees of freedom. [1]
Markscheme
2 (AG)
[1 mark]
(c.i) the expected frequency of female students who chose to take
the Chinese class. [1]
Markscheme
16.4 (16.4181…) (G1)
[1 mark]
(d) State whether or not H0 should be rejected. Justify your
statement. [2]
Markscheme
(we) reject the null hypothesis (A1)(ft)
8.68507… > 5.99 (R1)(ft)
Note: Follow through from part (c)(ii). Accept “do not accept” in place of
“reject.” Do not award (A1)(ft)(R0).
OR
(we) reject the null hypothesis (A1)
0.0130034 < 0.05 (R1)

Note: Accept “do not accept” in place of “reject.” Do not award (A1)(ft)(R0).
[2 marks]
(e.i) Find the probability that the student does not take the Spanish
class. [2]
Markscheme
88/110 (4/5, 0.8, 80%) (A1)(A1)(G2)
Note: Award (A1) for correct numerator, (A1) for correct denominator.
[2 marks]
(e.ii) Find the probability that neither of the two students take the
Spanish class. [3]
Markscheme
88/110 × 87/109 (M1)(M1)
Note: Award (M1) for multiplying two fractions. Award (M1) for multiplying
their correct fractions.
OR
(46/110)(45/109) + 2(46/110)(42/109) + (42/110)(41/109) (M1)(M1)
Note: Award (M1) for correct products; (M1) for adding 4 products.
0.639 (0.638532…, 348/545, 63.9%) (A1)(ft)(G2)
Note: Follow through from their answer to part (e)(i).

[3 marks]
(e.iii) Find the probability that at least one of the two students is
female. [3]
Markscheme
1 − 67/110 × 66/109 (M1)(M1)
Note: Award (M1) for multiplying two correct fractions. Award (M1) for
subtracting their product of two fractions from 1.
OR
43/110 × 42/109 + 43/110 × 67/109 + 67/110 × 43/109 (M1)(M1)
Note: Award (M1) for correct products; (M1) for adding three products.
0.631 (0.631192…, 63.1%, 344/545) (A1)(G2)
[3 marks]
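Parts (e.ii) and (e.iii) both involve choosing two students without replacement, so the second fraction has its numerator and denominator reduced by one. A minimal sketch with exact fractions, taking the counts 88 (not taking Spanish) and 67 (male) from the working above; the function name is illustrative.

```python
from fractions import Fraction


def both_from_group(group_size, total):
    """P(two students chosen without replacement both belong to the group)."""
    return Fraction(group_size, total) * Fraction(group_size - 1, total - 1)


neither_spanish = both_from_group(88, 110)         # part (e.ii)
at_least_one_female = 1 - both_from_group(67, 110)  # part (e.iii): complement of "both male"
print(neither_spanish, at_least_one_female)
```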
The marks obtained by nine Mathematical Studies SL students in their projects
(x) and their final IB examination scores (y) were recorded. These data were used
to determine whether the project mark is a good predictor of the examination
score. The results are shown in the table.
The equation of the regression line y on x is y = mx + c.
A tenth student, Jerome, obtained a project mark of 17.
(a.ii) Use your graphic display calculator to write down ȳ , the mean
examination score. [1]
Markscheme
54 (G1)
[1 mark]
(a.iii) Use your graphic display calculator to write down r, Pearson’s
product–moment correlation coefficient. [2]
Markscheme
0.5 (G2)
[2 marks]
(b.i) Find the exact value of m and of c for these data. [2]
Markscheme
m = 0.875, c = 41.75 (m = 7/8, c = 167/4) (A1)(A1)
Note: Award (A1) for 0.875 seen. Award (A1) for 41.75 seen. If 41.75 is rounded
to 41.8 do not award (A1).
[2 marks]
(c.i) Use the regression line y on x to estimate Jerome’s examination
score. [2]
Markscheme
y = 0.875(17) + 41.75 (M1)
Note: Award (M1) for correct substitution into their regression line.
= 56.6 (56.625) (A1)(ft)(G2)
Note: Follow through from part (b)(i).
[2 marks]
(c.ii) Justify whether it is valid to use the regression line y on x to
estimate Jerome’s examination score. [2]
Markscheme
the estimate is valid (A1)
since this is interpolation and the correlation coefficient is large enough (R1)
OR
the estimate is not valid (A1)
since the correlation coefficient is not large enough (R1)
Note: Do not award (A1)(R0). The (R1) may be awarded for reasoning based
on strength of correlation, but do not accept “correlation coefficient is not
strong enough” or “correlation is not large enough”.
Award (A0)(R0) for this method if no numerical answer to part (a)(iii) is seen.
[2 marks]
8. [Maximum mark: 21] 18N.3.AHL.TZ0.Hsp_3
Mr Sailor owns a fish farm and he claims that the weights of the fish in one of his
lakes have a mean of 550 grams and standard deviation of 8 grams.
Assume that the weights of the fish are normally distributed and that Mr Sailor’s
claim is true.
Kathy is suspicious of Mr Sailor’s claim about the mean and standard deviation of
the weights of the fish. She collects a random sample of fish from this lake whose
weights are shown in the following table.
Using these data, test at the 5% significance level the null hypothesis
H0 : μ = 550 against the alternative hypothesis H1 : μ < 550, where μ grams
is the population mean weight.
Kathy decides to use the same fish sample to test at the 5% significance level
whether or not there is a positive association between the weights and the
lengths of the fish in the lake. The following table shows the lengths of the fish in
the sample. The lengths of the fish can be assumed to be normally distributed.
(a.i) Find the probability that a fish from this lake will have a weight
of more than 560 grams. [2]
Markscheme
Note: Accept all answers that round to the correct 2sf answer in (a), (b) and
(c) but not in (d).
X ∼ N(550, 8²) (M1)
P(X > 560) = 0.10564… = 0.106 A1
[2 marks]
(a.ii) The maximum weight a hand net can hold is 6 kg. Find the
probability that a catch of 11 fish can be carried in the hand net. [4]
Markscheme
Xi ∼ N(550, 8²), i = 1, …, 11
let Y = ∑_{i=1}^{11} Xi
E(Y) = 11 × 550 (6050) A1
Var(Y) = 11 × 8² (704) (M1)A1
P(Y ⩽ 6000) = 0.02975… = 0.0298 A1
[4 marks]
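Both parts of (a) can be checked with the normal distribution in Python's standard library. The sketch assumes, as the working above does, that the 11 weights are independent, so the total has mean 11 × 550 and variance 11 × 8²; the 6 kg limit is converted to 6000 g. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# (a.i) one fish: X ~ N(550, 8^2)
single_fish = NormalDist(mu=550, sigma=8)
p_over_560 = 1 - single_fish.cdf(560)

# (a.ii) total weight of 11 independent fish: Y ~ N(11 * 550, 11 * 8^2)
catch = NormalDist(mu=11 * 550, sigma=math.sqrt(11 * 8 ** 2))
p_fits_in_net = catch.cdf(6000)

print(round(p_over_560, 3), round(p_fits_in_net, 4))
```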
(b.i) State the distribution of your test statistic, including the
parameter. [2]
Markscheme
t distribution with 7 degrees of freedom A1A1
[2 marks]
(b.ii) Find the p-value for the test. [2]
Markscheme
p = 0.25779…= 0.258 A2
[2 marks]
(b.iii) State the conclusion of the test, justifying your answer. [2]
Markscheme
p > 0.05 R1
therefore we conclude that there is no evidence to reject H0 A1
Note: FT their p-value.
Note: Only award A1 if R1 awarded.
[2 marks]
(c.i) State suitable hypotheses for the test. [1]
Markscheme
H0 : ρ = 0 , H1 : ρ > 0 A1
Note: Do not accept r in place of ρ.
[1 mark]
(c.ii) Find the product-moment correlation coefficient r. [2]
Markscheme
r = 0.782 A2
[2 marks]
(c.iii) State the p-value and interpret it in this context. [3]
Markscheme
0.01095… = 0.0110 A1
since 0.0110 < 0.05 R1
there is positive association between weight and length A1
Note: FT their p-value.
Note: Only award A1 if R1 awarded.
Note: Conclusion must be in context.
[3 marks]
(d) Use an appropriate regression line to estimate the weight of a
fish with length 360 mm. [3]
Markscheme
regression line of y (weight) on x (length) is (M1)
y = 0.8267…x + 255.96… (A1)
x = 360 gives y = 554 A1
Note: Award M1A0A0 for the wrong regression line, that is y = 0.7393…x –
51.62….
[3 marks]
A scientist measures the concentration of dissolved oxygen, in milligrams per
litre (y) , in a river. She takes 10 readings at different temperatures, measured in
degrees Celsius (x).
The results are shown in the table.
It is believed that the concentration of dissolved oxygen in the river varies
linearly with the temperature.
(a.i) For these data, find Pearson’s product-moment correlation
coefficient, r. [2]
Markscheme
−0.974 (−0.973745…) (A2)
Note: Award (A1) for an answer of 0.974 (minus sign omitted). Award (A1) for
an answer of −0.973 (incorrect rounding).
[2 marks]
(a.ii) For these data, find the equation of the regression line y on x. [2]
Markscheme
y = −0.365x + 17.9 (y = −0.365032…x + 17.9418…) (A1)(A1) (C4)
Note: Award (A1) for −0.365x, (A1) for 17.9. Award at most (A1)(A0) if not an
equation or if the values are reversed (eg y = 17.9x −0.365).
[2 marks]
(b) Using the equation of the regression line, estimate the
concentration of dissolved oxygen in the river when the
temperature is 18 °C. [2]
Markscheme
y = −0.365032… × 18 + 17.9418… (M1)
Note: Award (M1) for correctly substituting 18 into their part (a)(ii).
= 11.4 (11.3712…) (A1)(ft) (C2)
Note: Follow through from part (a)(ii).
[2 marks]
The following scatter diagram shows the scores obtained by seven students in
their mathematics test, m, and their physics test, p.
The mean point, M, for these data is (40, 16).
(a) Plot and label the point M(m̄, p̄ ) on the scatter diagram. [2]
Markscheme
(A1)(A1)
(C2)
Note: Award (A1) for mean point plotted and (A1) for labelled M.
[2 marks]
(b) Draw the line of best fit, by eye, on the scatter diagram. [2]
Markscheme
straight line through their mean point crossing the p-axis at 5 ± 2 (A1)(ft)(A1)(ft) (C2)
Note: Award (A1)(ft) for a straight line through their mean point. Award (A1)(ft)
for a correct p-intercept if line is extended.
[2 marks]
(c) Using your line of best fit, estimate the physics test score for a
student with a score of 20 in their mathematics test. [2]
Markscheme
point on line where m = 20 identified and an attempt to identify
y-coordinate (M1)
10.5 (A1)(ft) (C2)
Note: Follow through from their line in part (b).
[2 marks]
The following table shows values of ln x and ln y.
The relationship between ln x and ln y can be modelled by the regression
equation ln y = a ln x + b.
(a) Find the value of a and of b. [3]
Markscheme
valid approach (M1)
eg one correct value
−0.453620, 6.14210
a = −0.454, b = 6.14 A1A1 N3
[3 marks]
(b) Use the regression equation to estimate the value of y when x = 3.57. [3]
Markscheme
correct substitution (A1)
eg −0.454 ln 3.57 + 6.14
correct working (A1)
eg ln y = 5.56484
261.083 (260.409 from 3 sf)
y = 261, (y = 260 from 3 sf) A1 N3
Note: If no working shown, award N1 for 5.56484.

If no working shown, award N2 for ln y = 5.56484.
[3 marks]
(c) The relationship between x and y can be modelled using the
formula y = kx^n, where k ≠ 0, n ≠ 0, n ≠ 1.
By expressing ln y in terms of ln x, find the value of n and of k. [7]
Markscheme
METHOD 1
valid approach for expressing ln y in terms of ln x (M1)
eg ln y = ln(kx^n), ln(kx^n) = a ln x + b
correct application of addition rule for logs (A1)
eg ln k + ln(x^n)
correct application of exponent rule for logs A1
eg ln k + n ln x
comparing one term with regression equation (check FT) (M1)
eg n = a, b = ln k
correct working for k (A1)
eg ln k = 6.14210, k = e^6.14210
465.030
n = −0.454, k = 465 (464 from 3 sf) A1A1 N2N2
METHOD 2
valid approach (M1)
eg e^(ln y) = e^(a ln x + b)
correct use of exponent laws for e^(a ln x + b) (A1)
eg e^(a ln x) × e^b
correct application of exponent rule for a ln x (A1)
eg ln x^a
correct equation in y A1
eg y = x^a × e^b
comparing one term with equation of model (check FT) (M1)
eg k = e^b, n = a
465.030
n = −0.454, k = 465 (464 from 3 sf) A1A1 N2N2
METHOD 3
valid approach for expressing ln y in terms of ln x (seen anywhere) (M1)
eg ln y = ln(kx^n), ln(kx^n) = a ln x + b
correct application of exponent rule for logs (seen anywhere) (A1)
eg ln(x^a) + b
correct working for b (seen anywhere) (A1)
eg b = ln(e^b)
correct application of addition rule for logs A1
eg ln(e^b x^a)
comparing one term with equation of model (check FT) (M1)
eg k = e^b, n = a
465.030
n = −0.454, k = 465 (464 from 3 sf) A1A1 N2N2
[7 marks]
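All three methods arrive at n = a and k = e^b. The sketch below converts the regression coefficients from part (a) into the power model y = kx^n and re-evaluates the estimate from part (b). It uses the six-figure values quoted in the markscheme; the names are illustrative.

```python
import math

# Regression of ln y on ln x, from part (a): ln y = a ln x + b
a = -0.453620
b = 6.14210

# Comparing with ln y = ln k + n ln x gives n = a and k = e^b.
n = a
k = math.exp(b)


def power_model(x):
    """y = k * x^n, the power-law form of the log-log regression line."""
    return k * x ** n


y_estimate = power_model(3.57)
print(round(k, 3), round(y_estimate, 3))
```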
The following table shows the mean weight, y kg , of children who are x years old.
The relationship between the variables is modelled by the regression line with
equation y = ax + b.
(a.i) Find the value of a and of b. [3]
Markscheme
valid approach (M1)
eg correct value for a or b (or for r seen in (ii))
a = 1.91966, b = 7.97717
a = 1.92, b = 7.98 A1A1 N3
[3 marks]
(a.ii) Write down the correlation coefficient. [1]
Markscheme
0.984674
r = 0.985 A1 N1
[1 mark]
(b) Use your equation to estimate the mean weight of a child that is
1.95 years old. [2]
Markscheme
correct substitution into their equation (A1)

eg 1.92 × 1.95 + 7.98
11.7205
11.7 (kg) A1 N2
[2 marks]
13. [Maximum mark: 14] 17N.2.SL.TZ0.S_8
Adam is a beekeeper who collected data about monthly honey production in his
bee hives. The data for six of his hives is shown in the following table.
The relationship between the variables is modelled by the regression line with
equation P = aN + b.
Adam has 200 hives in total. He collects data on the monthly honey production
of all the hives. This data is shown in the following cumulative frequency graph.
Adam’s hives are labelled as low, regular or high production, as defined in the
following table.
Adam knows that 128 of his hives have a regular production.
(a) Write down the value of a and of b. [3]
Markscheme
evidence of setup (M1)
eg correct value for a or b
a = 6.96103, b = −454.805
a = 6.96, b = −455 (accept 6.96x − 455) A1A1 N3
[3 marks]
(b) Use this regression line to estimate the monthly honey

production from a hive that has 270 bees. [2]
Markscheme
substituting N = 270 into their equation (M1)
eg 6.96(270) − 455
1424.67
P = 1420 (g) A1 N2
[2 marks]
(c) Write down the number of low production hives. [1]
Markscheme
40 (hives) A1 N1
[1 mark]
(d.i) Find the value of k; [3]
Markscheme
valid approach (M1)
eg 128 + 40
168 hives have a production less than k (A1)
k = 1640 A1 N3
[3 marks]
(d.ii) Find the number of hives that have a high production. [2]
Markscheme
valid approach (M1)
eg 200 − 168
32 (hives) A1 N2
[2 marks]
(e) Adam decides to increase the number of bees in each low
production hive. Research suggests that there is a probability of
0.75 that a low production hive becomes a regular production
hive. Calculate the probability that 30 low production hives
become regular production hives. [3]
Markscheme
recognize binomial distribution (seen anywhere) (M1)
eg X ∼ B(n, p), (n choose r) p^r (1 − p)^(n−r)
correct values (A1)
eg n = 40 (check FT) and p = 0.75 and r = 30, (40 choose 30) 0.75^30 (1 − 0.75)^10
0.144364
0.144 A1 N2
[3 marks]
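The binomial probability in part (e) can be evaluated directly from the formula. A minimal sketch, assuming n = 40 low production hives from part (c) and treating each hive as independent; the function name is illustrative.

```python
import math


def binomial_pmf(n, r, p):
    """P(X = r) for X ~ B(n, p)."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


# 40 low production hives, each becoming regular with probability 0.75.
p_exactly_30 = binomial_pmf(40, 30, 0.75)
print(round(p_exactly_30, 3))
```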
© International Baccalaureate Organization, 2023
