You are on page 1of 10

ZNOTES.

ORG

UPDATED TO 2020-22 SYLLABUS

CAIE A2 LEVEL
MATHS
SUMMARIZED NOTES ON THE STATISTICS 2 SYLLABUS
CAIE A2 LEVEL MATHS

P (A ≥ 3 ) = 1 − P (A < 3 )
1. The Poisson Distribution 1.95 2 e−1.95 1.95 1 e−1.95 1.95 0 e−1.95
=1−( + + )
2! 1! 0!
​ ​ ​

The Poisson distribution is used as a model for the


number, X, of events in a given interval of space or = 1 − 0.690 = 0.310
times. It has the probability formula
Part (ii):
λx Write the distribution using the correct notation
P (X = x ) = e−λ ​ x = 0, 1, 2, …
x! (A + B ) ∼ P o (2 (0.65 + 0.45 )) = (A + B ) ∼ P o (2.2 )
Where λ is equal to the mean number of events in the given
interval Use the limits given in the question to find probability

(2.2 )3 (2.2 )2 (2.2 )1 (2.2 )0


P (A < 4 ) = e−2.2 ( )
A Poisson distribution with mean λ can be noted as
+ + +
3! 2! 1! 0!
​ ​ ​ ​

X ∼ P o(λ)
= 0.819
1.2. Suitability of a Poisson Distribution
1.5. Relationship of Inequalities
Occur randomly in space or time
Occur singly – events cannot occur simultaneously P (X < r) = P (X ≤ r − 1 )
Occur independently P (X = r) = P (X ≤ r ) − P (X ≤ r − 1 )
Occur at a constant rate – mean no. of events in given
P (X > r) = 1 − P (X ≤ r )
time interval proportional to size of interval
P (X ≥ r) = 1 − P (X ≤ r − 1 )

1.3. Expectation & Variance


1.6. Poisson Approximation of a
∼ P o (λ )
For a Poisson distribution X Binomial Distribution
Mean = μ = E (X ) = λ
Variance = σ 2 = Var (X ) = λ To approximate a binomial distribution given by:
The mean & variance of a Poisson distribution are equal
X ∼ B(n, p)

1.4. Addition of Poisson Distributions If n > 50 and np < 5


Then we can use a Poisson distribution given by:
If X and Y are independent Poisson random variables,
with parameters λ and μ respectively, then X + Y has a X ∼ P o(np)
Poisson distribution with parameter λ + μ
{IS Ex 8d} Question 8:
A randomly chosen doctor in general practice sees, on
{IS Ex 8d} Question 1: average, one case of a broken nose per year and each case is
The numbers of emissions per minute from two radioactive independent of the other similar cases.
objects A and B are independent Poisson variables with 1. Regarding a month as a twelfth part of a year,
mean 0.65 and 0.45 respectively. Find the probabilities that: 1. Show that the probability that, between them,
three such doctors see no cases of a broken
1. In a period of three minutes there are at least three
nose in a period of one month is 0.779
emissions from A.
2. Find the variance of the number of cases seen
2. In a period of two minutes there is a total of less than
by three such doctors in a period of six months
four emissions from A and B together.
2. Find the probability that, between them, three such
Solution: doctors see at least three cases in one year.
Part (i): 3. Find the probability that, of three such doctors, one
Write the distribution using the correct notation sees three cases and the other two see no cases in
one year.
A ∼ P o (0.65 × 3 ) = A ∼ P o (1.95 )
Solution:
Use the limits given in the question to find probability Part (i)(a):
Write down the information we know and need

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

1
1 doctor = 1 nose per year = 12 noses per month ​ If λ > 15
3 1 Then we can use a normal distribution given by:
3 doctors= 12 = 4 noses per month
​ ​

Write the distribution using the correct notation


X ∼ N (λ, λ)
X ∼ P o (0.25 )
Apply continuity correction to limits:
Use the limits given in the question to find probability Poisson Normal

0.25 0 e−0.25 𝑥=6 5.5≤𝑥≤6.5


P (X = 0 ) = = 0.779 𝑥>6 𝑥≥6.5
0!

𝑥≥6 𝑥≥5.5
Part (i)(b):
Use the rules of a Poisson distribution 𝑥<6 𝑥≤5.5
𝑥≤6 𝑥≤6.5
Var (X ) = μ = λ
{IS Ex 10h} Question 11:
Calculate λ in this scenario:
The no. of flaws in a length of cloth, lm long has a Poisson
distribution with mean 0.04l
λ = 6 × μ (in one month) = 6 × 0.25 = 1.5
1. Find the probability that a 10m length of cloth has
∴ Var (X ) = 1.5
fewer than 2 flaws.
Part (ii): 2. Find an approximate value for the probability that a
Calculate λ in this scenario: 1000m length of cloth has at least 46 flaws.
3. Given that the cost of rectifying X flaws in a 1000m
λ = 12 × μ (in one month) = 12 × 0.25 = 3 length of cloth is X 2 pence, find the expected cost.

Use the limits given in the question to find probability Solution:


Part (i):
P (X ≥ 3 ) = 1 − P (X ≤ 2 ) Form the parameters of Poisson distribution

32 31 30 l = 10 and λ = 0.04l
= 1 − e−3 ( + + ) = 1 − 0.423 = 0.577
2! 1! 0!
​ ​ ​

∴ λ = 0.4
Part (iii): Write down our distribution using correct notation

λ for one doctor in one year = 1 X ∼ P o (0.4 )


λ for other two doctors in one year = 2 × 1 = 2 Write the probability required by the question

For the first doctor: P (X < 2 )


3
1
P (X = 3 ) = e−1 ( ) From earlier equations:
3!

0.4 0 0.4 1
For the two other doctors: P (X < 2 ) = e−0.4 ( + ) = 0.938
0! 1!
​ ​

10
P (X = 0 ) = e−1 ( ) Part (ii):
0! Using question to form the parameters

l = 10 and λ = 0.04l
Considering that any of the three could be the first

13 10 ∴ λ = 40 > 15
P (X ) = e−1 ( ) × e−1 ( ) × 3 C 2 = 0.025
3! 0!
​ ​ ​ ​

Thus, we can use the normal approximation


Write down our distribution using correct notation
1.7. Normal Approximation of a Poisson
X ∼ P o (40 ) → Y ∼ N (40, 40 )
Distribution
Write the probability required by the question
To approximate a Poisson distribution given by:
P (X ≥ 46 )
X ∼ P (λ)
Apply continuity correction for the normal distribution

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

P (Y ≥ 45.5 ) Expectations of combinations of random variables:

Evaluate the probability E (aX + bY ) = aE (X ) + bE (Y )


45.5 − 40 Variance of combinations of independent random
P (Y ≥ 45.5 ) = 1 − Φ ( ) = 0.192
40 variables:

Part (iii): Var (aX + bY + c ) = a2 Var (X ) + b2 Var (Y )


Using the variance formula
Var (X ± Y ) = Var (X ) + Var(Y )
2
Var (X ) = E (X 2 ) − (E (X ))
Combinations of identically distributed random variables
For a Poisson distribution having mean μ and variance σ 2
E (X ) = Var (X ) = λ and λ = 40
Substitute into equation and solve for the unknown E (2X ) = 2μ and E (X1 ) + E (X2 ) = 2μ ​ ​

Var (2X ) = 4σ 2 and Var (X1 + X2 ) = 2σ 2 ​ ​

2 2
∴ 40 = E (X ) − 40 {IS Ex 6b} Question 3:
It is given that X1 and X2 are independent, and E (X1 )
=
E (X 2 ) = 1640 pence
​ ​ ​

E (X2 ) = μ, Var (X1 ) = Var (X2 ) = σ 2 . Find E(X )


​ ​ ​

E (X 2 ) = ₤16.40 and Var (X ) , where X = 12 (X1 + X2 ) ​ ​ ​

Solution:
Expected cost for rectifying cloth is ₤16.40 Split the variance into individual components

1 1 2 1 2
2. Linear Combinations of Var ( (X1 + X2 )) = ( ) Var (X1 ) + ( ) Var (X
2 2 2
​ ​ ​ ​ ​ ​

Random Variables Substitute given values, hence

1 1 1 1
Var ( (X1 + X2 )) = σ 2 + σ 2 = σ 2
2.1. Expectation & Variance of a 2 4

4 2
​ ​ ​ ​ ​

Function of X
2.3. Expectation & Variance of Sample
E (aX + b) = aE (X ) + b Mean
Var (aX + b) = a2 Var (X ) σ2
E (X ) = μ, Var (X ) = n ​

{IS Ex 6a} Question 12: {IS Ex 6c} Question 5:


The random variable T has mean 5 and variance 16. Find two The mean weight of a soldier may be taken to be 90kg, and
pairs of values for the constants c and d such that σ = 10 kg. 250 soldiers are on board an aircraft, find the
E (cT + d ) = 100 and Var (cT + d ) = 144 expectation and variance of their weight. Hence find the μ
Solution: and σ of the total weight of soldiers.
Expand expectation equation: Solution:
Let X be the average weight, therefore
E (cT + d ) = cE (T ) + d = 100
E (X ) = μ = 90
∴ 5c + d = 100
2 2
2
Var (X ) = σn = 10 250 = 0.4 kg
Expand variance equation:
​ ​

To find μ of total weight, you are calculating


Var (cT + d ) = c 2 Var (T ) = 144 E (X1 ) + E (X2 ) … + E (X250 ) = 250E (X ) =
​ ​ ​

22 500 kg
16c 2 = 144 To find σ , first find Var(X)
Var (X1 ) … + Var (X250 ) = 250Var (X ) = 2500 kg
c = ±3
​ ​

Var (X ) = σ 2 = 25000
Use first equation to find two pairs:
∴σ= 25000 = 158.1 kg
c = 3, d = 85c = −3, d = 115

2.2. Combinations of Random Variables

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

Part (i):
3. Continuous Random Total area must equal 1 hence

Variables ∫
5
kx (6 − x ) = [3kx 2 −
kx 3
] =1
5

3 2
​ ​ ​

2
3.1. Probability Density Functions (pdf) 125 8
= 75k − k − 12k + k = 24k = 1
3 3
​ ​

Function whose area under its graph represents


probability used for continuous random variables 1
∴k=
24

Represented by f(x)
Part (ii):
Mode is the value which has the greatest probability hence
we are looking for the max point on the pdf

d
[kx (6 − x )] = 6k − 2kx
dx

Finding max point hence stationary point

6k − 2kx = 0
1
6 ( 24 )
x= 1 =3
2 ( 24 )

∴ mode = 3
Part (iii):
P (X < m) can be interpreted as P (−∞ < X < m)
Conditions:
3 3
m
kx 3
Total area always = 1 ∫ kx (6 − x ) = ∫ kx (6 − x ) = [3kx − ] 2
3 2
​ ​ ​ ​

−∞ 2
d
∫ ​ f(x) dx = 1 =
1
(3(3 2 ) −
33 23
− 3(2 2 ) + ) =
13
c 24 3 3 36
​ ​ ​ ​

Cannot have negative probabilities ∴ graph cannot dip


below x -axis; f(x) ≥ 0 3.2. Mean & Variance
Probability that X lies between a and b is the area from a
to b To calculate mean/expectation

E (X ) = ∫
b
P (a < X < b) = ∫ f (x ) dx ​ xf (x ) dx
−∞

(x ) = 0 ; show on a sketch
Outside given interval f To calculate variance:
P (X = b) always equals 0 as there is no area First calculate E(X) then E(X 2 ) by
Notes: ∞
P (X < b) = P (X ≤ b) as no extra area added E (X 2 ) = ∫ ​ x 2 f (x ) dx
−∞
The mode of a pdf is its maximum (stationary point)
Substitute information and calculate using
{IS Ex 9a} Question 6:
Given that: 2
Var (X ) = E (X 2 ) − E (X )

f(x) = {
kx(6 − x) 2<x<5
3.3. The Median
0 otherwise
​ ​

The Cumulative Distribution Function (cdf)


1. Find the value of k
2. Find the mode, m Gives the probability that the value is less than b
3. Find P (X < m)
P (X < b) or P (X ≤ b)
Solution:

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

Represented by F (b) Part (a):


It is the integral of f (x ) Write down distribution

b X ∼ N (1, 0.25 2 )
F (b) = ∫ ​ f(x) dx
−∞ Write down the probability they want
Median: the value of b for which F (b) = 0.5 P (X > 1.25 ) = 1 − P (X < 1.25 )
Standardize and evaluate

1.25 − 1
1 − P (Z < ) = 0.1587
0.25

Part (b):
Write down initial distribution

X ∼ N (1, 0.25 2 )

For sample, mean remains equal but variance changes


Find new variance
2 2
Variance of sample = σn = 0.2510 ​ ​
= 0.00625
Write down distribution of sample

Y ∼ N (1, 0.00625 )
4. Sampling & Estimation Write down the probability they want

4.1. Sample & Population P (Y < 0.9 )


Standardize and evaluate
Population: collection of all items
Standardized probability is negative so do 1 minus
Sample: subset of population used as a representation of
the entire population 0.9 − 1 0.1
P (Z < ) = 1 − P (Z < ) = 0.
0.00625 0.00625
​ ​

​ ​

4.2. Central Limit Theorem


4.3. Point Estimate & Confidence
If (X1 , X2 , … , Xn ) is a random sample of size n
Interval
​ ​

drawn from any population with mean μ and variance σ 2


then the sample has:
A point estimate is a numerical value calculated from a
Expected mean, μ set of data (sample) which is used as an estimate of an
2 unknown parameter in a population
Expected variance, σn
Examples of point estimates are:

It forms a normal distribution:

σ2 Sample mean x $$$$ population mean μ


X ∼ N (μ, ​) Sample proportion nr $$$$ population proportion p
n ​

Sample variance s2 $$$$ population variance σ 2


{IS Ex 10f} Question 12:
The weights of the trout at a trout farm are normally
Point estimate close to population value but not exact
distributed with mean 1kg & standard deviation 0.25kg
We can determine a confidence interval where the
1. Find, to 4 decimal places, the probability that a trout population value is likely to lie in (x − δ, x + δ)
chosen at random weighs more than 1.25kg.
2. If Y kg represents mean weight of a sample of 10 4.4. The Variance
trout chosen at random, state the distribution of Y :
evaluate the mean and variance. Find the probability Variance can be calculated/given for either a sample or a
that the mean weight of a sample of 10 trout will be population and there is a difference between them
less than 0.9kg
Using the divisor n
Solution:
This is appropriate to use when

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

data is given for the whole population and you are z is the value corresponding to the confidence level
interested in the variance of the whole required and n is the sample size
data is given for the sample and you are interested in The confidence interval calculated is exact
the variance of just the sample
Large sample taken from an unknown population distribution
2 with known population variance
( Σx 2 − )
1 (Σx )
σ2 = ​ ​

n n
By the Central Limit Theorem, the distribution of X will be
approximately normal so same method as above
Using the divisor (n − 1)
(x − z )
σ σ
Appropriate to use when data is given for a sample and ​, x+z ​

you are estimating variance of the whole population n​ n ​

The quantity calculated s2 is known as the unbiased The confidence interval calculated is an approximate
estimate of the population variance
Large sample taken from an unknown population distribution
2
( Σx 2 − )
1 (Σx ) with unknown population variance
s2 =
n−1
​ ​

n
As the population variance is unknown, you must first
estimate the population variance, s, using sample data
4.5. Percentage Points for a Normal
(x − z )
s s
Distribution n


, x+z
n

The percentage points are determined by finding the z - The confidence interval calculated is an approximate
value of specific percentages
E.g. to find the z -value of a 95% confidence level, we can {W13-P71} Question 2:
see that the 5% would be removed equally from both Heights of a certain species of animal are normally
sides ( 2.5% ) so the z -value we would actually be finding distributed with σ = 0.17m. Obtain a 99% confidence interval
would be of 100% − 2.5% = 97.5% for the population mean, with total width less than 0.2m. Find
the smallest sample size required.
Solution:
For a 99% confidence interval, find z where
Φ (z ) = 0.995 (think of the 1% cut from both sides)

z = 2.576

Subtract the limits of the interval and equate to 0.2

(x + z ) − (x − z ) = 0.2
σ σ
​ ​

n ​ n ​

2 (z ) = 0.2
σ

n ​

Substitute information given and find n


Percentage Points Table
Confidence level 90% 95% 98% 99% 0.2
n= × 0.17
2 × 2.576
​ ​

z -value 1.645 1.960 2.326 2.576


n = 4126.53 ≈ 4130
4.6. Confidence Interval for a
Population Mean 4.7. Confidence Interval for a Population
Proportion
Sample taken from a normal population distribution with
known population variance Calculating the confidence interval from a random sample
of n observations from a population in which the
σ σ
(x − z , x+z

) ​
proportion of successes is p and the proportion of failures
n ​ n ​

is q
The observed proportion of success p is nr where r
​ ​

represents the number of successes

( )

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

The test statistic is calculated from the sample. Its value is


(p − z )
pq pq
, p+z
​ ​ ​ ​

n
​ ​ ​

n
​ ​

used to decide whether the null hypothesis should be


rejected
{S10-P71} Question: The rejection or critical region gives the values of the test
A random sample of n people were questioned about their statistic for which the null hypothesis is rejected
internet use. 87 of them had a high-speed internet The acceptance region gives the values of the test
connection. A confidence interval for the population statistic for which the null hypothesis is accepted
proportion having a high-speed internet connection is 0.1129 The critical values are the boundary values of the
rejection region
< p < 0.1771.
The significance level of a test gives the probability of the
1. Write down the mid-point of this confidence interval test statistic falling in the rejection region
and hence find the value of n.
To carry out a hypothesis test:
2. This interval is an α % confidence interval. Find α .
Define the null and alternative hypotheses
Solution:
Decide on a significance level
Part (i):
Determine the critical value(s)
Find the midpoint of the limits, finding p
Calculate the test statistic
0.1771 − 0.1129 Decide on the outcome of test depending on whether
0.1129 + = 0.145 value of test statistic lies in rejection/acceptance region
2

State the conclusion in words


The midpoint is equal to the proportion of people with high- The test statistic Z can be used to test a hypothesis about
speed internet use so a population
87
n ​ = 0.145 ∴ n = 600
Part (ii): x−μ
z=
Using the upper limit, this was calculate by:

σ2
n ​ ​

pq
0.1771 = 0.145 + z ​ ​ where μ is the population mean specified by H0 ​

n
The critical values for some commonly used rejection
Substituting values calculated ( q = 1 − p ), find z
regions:
87
600× 513
600 Significance level Two-tail One-tail
0.0321 = z ∴ z = 2.233
​ ​

600 μ = μ0 μ > μ0 μ < μ0


​ ​

​ ​ ​

Use normal tables and find corresponding probability 10 ±1.645 1.282 −1.282
5 ±1.960 1.645 −1.645
Φ (z ) = 0.9872
2 ±2.326 2.054 −2.054
Think of symmetry, the same area is chopped off from both 1 ±2.576 2.326 −2.326
sides of the graph so

1 − 2 (1 − 09872 ) = 0.9744 5.2. Testing Different Distributions


Hence the α % confidence is = 97.44% Test for mean, known variance, normal distribution or
large sample

5. Hypothesis Tests X ∼ N (μ, σ2


n ​ )

Use general procedure as outlined above


5.1. Null & Alternative Hypothesis
Test for mean, large sample, variance unknown
For a hypothesis test on the population mean μ, the null
X ∼ N (μ, sn )
2

hypothesis H ∗ 0 proposes a value μ ∗ 0 for μ


H0 : μ = μ0 Use the same procedure however must use unbiased


estimate of the population variance, s
​ ​

The alternative hypothesis H ∗ 1 suggests the way in


Test for large Poisson mean
which μ might differ from μ ∗ 0 . H1 can take three forms: ​

H1 : μ < μ0 , a one-tail test for a decrease


​ ​

X ∼ N (λ, λn ) ​

H1 : μ > μ0 , a one-tail test for an increase


​ ​

H1 ​: μ ≠ μ0​, a two-tail test for a difference

WWW.ZNOTES.ORG
CAIE A2 LEVEL MATHS

Use general procedure but must approximate normal


distribution using the mean given
Must apply continuity correction

Test for proportion, large sample (Binomial distribution)


pq
X ∼ N (p, )

n
Similar to Poisson approximation; using probability of
success and applying continuity correction

5.3. Type I and Type II Errors


Calculating P(Type II error):
A Type I error is made when a true null hypothesis is Firstly, calculate the acceptance region by leaving x as
rejected a variable and equating the test statistic to the
A Type II error is made when a false null hypothesis is significance level
accepted Next, calculate the conditional probability that μ is
P(Type I error) = significance level now μ′ and x is still in the acceptance region

P( x is in acceptance region | μ = μ′ )
Calculate this by substituting the limit of the acceptance
region as x (calculated previously) and the new, given μ′ into
the test statistic equation and find the probability

WWW.ZNOTES.ORG
CAIE A2 LEVEL
Maths

Copyright 2023 by ZNotes


These notes have been created by Rohana Sree Gullapalli for the 2020-22 syllabus
This website and its content is copyright of ZNotes Foundation - © ZNotes Foundation 2023. All rights reserved.
The document contains images and excerpts of text from educational resources available on the internet and
printed books. If you are the owner of such media, test or visual, utilized in this document and do not accept its
usage then we urge you to contact us and we would immediately replace said media.
No part of this document may be copied or re-uploaded to another website without the express, written
permission of the copyright owner. Under no conditions may this document be distributed under the name of
false author(s) or sold for financial gain; the document is solely meant for educational purposes and it is to remain
a property available to all at no cost. It is current freely available from the website www.znotes.org
This work is licensed under a Creative Commons Attribution-NonCommerical-ShareAlike 4.0 International License.

You might also like