Cie A2 Furthermaths 9231 Statistics1 v1 Znotes

TABLE OF CONTENTS
CHAPTER 1  Expectation Algebra
CHAPTER 2  Poisson Distribution
CHAPTER 3  Continuous Random Variable
CHAPTER 4  Geometric & Exponential Distribution
CHAPTER 5  Sampling & Central Limit Theorem
CHAPTER 6  Point & Interval Estimation
CHAPTER 7  Hypothesis Tests
CHAPTER 8  Goodness of Fit
CHAPTER 9  Regression and Correlation

CIE A-LEVEL FURTHER MATHEMATICS//9231
1. EXPECTATION ALGEBRA

1.1 Expectation & Variance of a Function of X
$E(aX + b) = aE(X) + b$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

(IS) Ex 6a: Question 12:
The random variable $T$ has mean 5 and variance 16. Find two pairs of values for the constants $c$ and $d$ such that $E(cT + d) = 100$ and $\mathrm{Var}(cT + d) = 144$.
Solution:
Expand the expectation equation:
$E(cT + d) = cE(T) + d = 100$
$\therefore 5c + d = 100$
Expand the variance equation:
$\mathrm{Var}(cT + d) = c^2\,\mathrm{Var}(T) = 144$
$16c^2 = 144$
$c = \pm 3$
Use the first equation to find the two pairs:
$c = 3,\ d = 85$ or $c = -3,\ d = 115$

1.2 Combinations of Random Variables
• Expectation of a combination of random variables:
$E(aX + bY) = aE(X) + bE(Y)$
• Variance of a combination of independent random variables:
$\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$
$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
• Combinations of independent, identically distributed random variables having mean $\mu$ and variance $\sigma^2$:
$E(2X) = 2\mu$ and $E(X_1) + E(X_2) = 2\mu$
$\mathrm{Var}(2X) = 4\sigma^2$ but $\mathrm{Var}(X_1 + X_2) = 2\sigma^2$

(IS) Ex 6b: Question 3:
It is given that $X_1$ and $X_2$ are independent, and
$E(X_1) = E(X_2) = \mu$, $\mathrm{Var}(X_1) = \mathrm{Var}(X_2) = \sigma^2$
Find $E(\bar{X})$ and $\mathrm{Var}(\bar{X})$, where $\bar{X} = \frac{1}{2}(X_1 + X_2)$
Solution:
Split the expectation into individual components:
$E\left(\frac{1}{2}(X_1 + X_2)\right) = \frac{1}{2}E(X_1) + \frac{1}{2}E(X_2)$
Substitute the given values, hence
$E\left(\frac{1}{2}(X_1 + X_2)\right) = \frac{1}{2}\mu + \frac{1}{2}\mu = \mu$
Split the variance into individual components:
$\mathrm{Var}\left(\frac{1}{2}(X_1 + X_2)\right) = \left(\frac{1}{2}\right)^2 \mathrm{Var}(X_1) + \left(\frac{1}{2}\right)^2 \mathrm{Var}(X_2)$
Substitute the given values, hence
$\mathrm{Var}\left(\frac{1}{2}(X_1 + X_2)\right) = \frac{1}{4}\sigma^2 + \frac{1}{4}\sigma^2 = \frac{1}{2}\sigma^2$

1.3 Expectation & Variance of Sample Mean
$E(\bar{X}) = \mu$
$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$

(IS) Ex 6c: Question 5:
The mean weight of a soldier may be taken to be 90 kg, and $\sigma = 10$ kg. 250 soldiers are on board an aircraft; find the expectation and variance of their average weight. Hence find the $\mu$ and $\sigma$ of the total weight of the soldiers.
Solution:
Let $\bar{X}$ be the average weight, therefore
$E(\bar{X}) = \mu = 90$ kg
$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{10^2}{250} = 0.4\ \text{kg}^2$
To find $\mu$ of the total weight, you are calculating
$E(X_1) + E(X_2) + \dots + E(X_{250}) = 250E(X) = 22\,500$ kg
To find $\sigma$ of the total weight, first find its variance:
$\mathrm{Var}(X_1) + \dots + \mathrm{Var}(X_{250}) = 250\,\mathrm{Var}(X) = 250 \times 10^2 = 25\,000\ \text{kg}^2$
$\therefore \sigma = \sqrt{25\,000} = 158.1$ kg

2. POISSON DISTRIBUTION
• The Poisson distribution is used as a model for the number, $X$, of events in a given interval of space or time. It has the probability formula
$P(X = x) = e^{-\lambda}\,\frac{\lambda^x}{x!}, \quad x = 0, 1, 2, \dots$
where $\lambda$ is equal to the mean number of events in the given interval.
• A Poisson distribution with mean $\lambda$ can be noted as
$X \sim \mathrm{Po}(\lambda)$

2.1 Suitability of a Poisson Distribution
For a Poisson model to be suitable, events must:
• Occur randomly in space or time
• Occur singly: events cannot occur simultaneously
• Occur independently
• Occur at a constant rate: the mean number of events in a given time interval is proportional to the size of the interval
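The Poisson probability formula given for this chapter can be sketched in Python as below. This is a minimal illustration rather than part of the original notes: the function name `poisson_pmf` and the value $\lambda = 2$ are assumptions chosen only for the example. It checks two properties of the formula, namely that the probabilities sum to 1 and that the mean equals $\lambda$.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam); the name poisson_pmf is illustrative."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 2.0  # assumed mean number of events per interval (hypothetical value)
probs = [poisson_pmf(x, lam) for x in range(50)]

# The probabilities over x = 0, 1, 2, ... should sum to (almost exactly) 1.
total = sum(probs)
# The mean, i.e. the sum of x * P(X = x), should come out equal to lam.
mean = sum(x * p for x, p in enumerate(probs))
print(total, mean)
```

Truncating the sum at $x = 49$ is harmless here because the remaining tail probability is negligible for such a small $\lambda$.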
2.2 Expectation & Variance
• For a Poisson distribution $X \sim \mathrm{Po}(\lambda)$:
• Mean $= \mu = E(X) = \lambda$
• Variance $= \sigma^2 = \mathrm{Var}(X) = \lambda$
• The mean and variance of a Poisson distribution are equal

2.3 Addition of Poisson Distributions
• If $X$ and $Y$ are independent Poisson random variables, with parameters $\lambda$ and $\mu$ respectively, then $X + Y$ has a Poisson distribution with parameter $\lambda + \mu$

(IS) Ex 8d: Question 1:
The numbers of emissions per minute from two radioactive objects $A$ and $B$ are independent Poisson variables with means 0.65 and 0.45 respectively.
Find the probabilities that:
i. In a period of three minutes there are at least three emissions from $A$.
ii. In a period of two minutes there is a total of less than four emissions from $A$ and $B$ together.
Solution:
Part (i):
Write the distribution using the correct notation:
$A \sim \mathrm{Po}(0.65 \times 3)$, i.e. $A \sim \mathrm{Po}(1.95)$
Use the limits given in the question to find the probability:
$P(A \ge 3) = 1 - P(A < 3)$
$= 1 - \left(\frac{1.95^2 e^{-1.95}}{2!} + \frac{1.95^1 e^{-1.95}}{1!} + \frac{1.95^0 e^{-1.95}}{0!}\right)$
$= 1 - 0.690 = 0.310$
Part (ii):
Write the distribution using the correct notation:
$(A + B) \sim \mathrm{Po}(2(0.65 + 0.45))$, i.e. $(A + B) \sim \mathrm{Po}(2.2)$
Use the limits given in the question to find the probability:
$P(A + B < 4) = e^{-2.2}\left(\frac{2.2^3}{3!} + \frac{2.2^2}{2!} + \frac{2.2^1}{1!} + \frac{2.2^0}{0!}\right) = 0.819$

2.4 Relationship of Inequalities
• $P(X < r) = P(X \le r - 1)$
• $P(X = r) = P(X \le r) - P(X \le r - 1)$
• $P(X > r) = 1 - P(X \le r)$
• $P(X \ge r) = 1 - P(X \le r - 1)$

2.5 Poisson Approximation of a Binomial Distribution
• To approximate a binomial distribution given by:
$X \sim B(n, p)$
• If $n > 50$ and $np < 5$
• Then we can use a Poisson distribution given by:
$X \sim \mathrm{Po}(np)$

(IS) Ex 8d: Question 8:
A randomly chosen doctor in general practice sees, on average, one case of a broken nose per year and each case is independent of the other similar cases.
i. Regarding a month as a twelfth part of a year,
a. Show that the probability that, between them, three such doctors see no cases of a broken nose in a period of one month is 0.779
b. Find the variance of the number of cases seen by three such doctors in a period of six months
ii. Find the probability that, between them, three such doctors see at least three cases in one year.
iii. Find the probability that, of three such doctors, one sees three cases and the other two see no cases in one year.
Solution:
Part (i)(a):
Write down the information we know and need:
1 doctor = 1 nose per year = $\frac{1}{12}$ noses per month
3 doctors = $\frac{3}{12} = \frac{1}{4}$ noses per month
Write the distribution using the correct notation:
$X \sim \mathrm{Po}(0.25)$
Use the limits given in the question to find the probability:
$P(X = 0) = \frac{0.25^0 e^{-0.25}}{0!} = 0.779$
Part (i)(b):
Use the rules of a Poisson distribution:
$\mathrm{Var}(X) = \mu = \lambda$
Calculate $\lambda$ in this scenario:
$\lambda = 6 \times \mu\ (\text{in one month}) = 6 \times 0.25 = 1.5$
$\therefore \mathrm{Var}(X) = 1.5$
Part (ii):
Calculate $\lambda$ in this scenario:
$\lambda = 12 \times \mu\ (\text{in one month}) = 12 \times 0.25 = 3$
Use the limits given in the question to find the probability:
$P(X \ge 3) = 1 - P(X \le 2)$
$= 1 - e^{-3}\left(\frac{3^2}{2!} + \frac{3^1}{1!} + \frac{3^0}{0!}\right) = 1 - 0.423 = 0.577$
Part (iii):
We will need two different $\lambda$s in this scenario:
$\lambda$ for one doctor in one year $= 1$
$\lambda$ for the other two doctors in one year $= 2 \times 1 = 2$
For the first doctor:
$P(X = 3) = e^{-1}\left(\frac{1^3}{3!}\right)$
For the two other doctors together:
$P(X = 0) = e^{-2}\left(\frac{2^0}{0!}\right)$
Considering that any of the three could be the first:
$P = e^{-1}\left(\frac{1^3}{3!}\right) \times e^{-2}\left(\frac{2^0}{0!}\right) \times {}^3C_2 = 0.025$

2.6 Normal Approximation of a Poisson Distribution
• To approximate a Poisson distribution given by:
$X \sim \mathrm{Po}(\lambda)$
• If $\lambda > 15$
• Then we can use a normal distribution given by:
$X \sim N(\lambda, \lambda)$

Apply continuity correction to limits:
Poisson       Normal
$x = 6$       $5.5 \le x \le 6.5$
$x > 6$       $x \ge 6.5$
$x \ge 6$     $x \ge 5.5$
$x < 6$       $x \le 5.5$
$x \le 6$     $x \le 6.5$

(IS) Ex 10h: Question 11:
The number of flaws in a length of cloth $l$ m long has a Poisson distribution with mean $0.04l$.
i. Find the probability that a 10 m length of cloth has fewer than 2 flaws.
ii. Find an approximate value for the probability that a 1000 m length of cloth has at least 46 flaws.
iii. Given that the cost of rectifying $X$ flaws in a 1000 m length of cloth is $X^2$ pence, find the expected cost.
Solution:
Part (i):
Form the parameters of the Poisson distribution:
$l = 10$ and $\lambda = 0.04l$
$\therefore \lambda = 0.4$
Write down our distribution using correct notation:
$X \sim \mathrm{Po}(0.4)$
Write the probability required by the question:
$P(X < 2)$
From earlier equations:
$P(X < 2) = e^{-0.4}\left(\frac{0.4^0}{0!} + \frac{0.4^1}{1!}\right) = 0.938$
Part (ii):
Using information from the question, form the parameters of the Poisson distribution:
$l = 1000$ and $\lambda = 0.04l$
$\therefore \lambda = 40 > 15$
Thus we can use the normal approximation.
Write down our distribution using correct notation:
$X \sim \mathrm{Po}(40) \rightarrow Y \sim N(40, 40)$
Write the probability required by the question:
$P(X \ge 46)$
Apply continuity correction for the normal distribution:
$P(Y \ge 45.5)$
Evaluate the probability:
$P(Y \ge 45.5) = 1 - \Phi\left(\frac{45.5 - 40}{\sqrt{40}}\right) = 0.192$
Part (iii):
Using the variance formula:
$\mathrm{Var}(X) = E(X^2) - (E(X))^2$
For a Poisson distribution:
$E(X) = \mathrm{Var}(X) = \lambda$ and $\lambda = 40$
Substitute into the equation and solve for the unknown:
$\therefore 40 = E(X^2) - 40^2$
$E(X^2) = 1640$ pence $= £16.40$
The expected cost of rectifying the cloth is £16.40

3. CONTINUOUS RANDOM VARIABLE

3.1 Probability Density Functions (pdf)
• A function used for continuous random variables, where the area under its graph represents probability
• Represented by $f(x)$
Conditions:
• Total area always = 1:
$\int_c^d f(x)\,dx = 1$
• Probabilities cannot be negative, so the graph cannot dip below the $x$-axis: $f(x) \ge 0$
• The probability that $X$ lies between $a$ and $b$ is the area from $a$ to $b$:
$P(a < X < b) = \int_a^b f(x)\,dx$
• Outside the given interval $f(x) = 0$; show this on a sketch
• $P(X = b)$ always equals 0 as there is no area
• Notes:
o $P(X < b) = P(X \le b)$ as no extra area is added
o The mode of a pdf is its maximum (stationary point)

(IS) Ex 9a: Question 6:
Given that:
$f(x) = \begin{cases} kx(6 - x) & 2 < x < 5 \\ 0 & \text{otherwise} \end{cases}$
i. Find the value of $k$
ii. Find the mode, $m$
iii. Find $P(X < m)$
Solution:
Part (i):
Total area must equal 1, hence
$\int_2^5 kx(6 - x)\,dx = \left[3kx^2 - \frac{kx^3}{3}\right]_2^5 = 1$
$75k - \frac{125}{3}k - 12k + \frac{8}{3}k = 24k = 1$
$\therefore k = \frac{1}{24}$
Part (ii):
The mode is the value with the greatest probability density, hence we are looking for the maximum point of the pdf:
$\frac{d}{dx}[kx(6 - x)] = 6k - 2kx$
The maximum is a stationary point, so
$6k - 2kx = 0$
$x = \frac{6\left(\frac{1}{24}\right)}{2\left(\frac{1}{24}\right)} = 3$
$\therefore$ mode $= 3$
Part (iii):
$P(X < m)$ can be interpreted as $P(-\infty < X < m)$:
$\int_{-\infty}^{3} f(x)\,dx = \int_2^3 kx(6 - x)\,dx = \left[3kx^2 - \frac{kx^3}{3}\right]_2^3$
$= \frac{1}{24}\left(3(3^2) - \frac{3^3}{3} - 3(2^2) + \frac{2^3}{3}\right) = \frac{13}{36}$

3.2 Cumulative Distribution Function (cdf)
• Gives the probability that the value is less than $x$:
$P(X < x)$ or $P(X \le x)$
• Represented by $F(x)$
• It is the integral of $f(x)$:
$F(b) = \int_{-\infty}^{b} f(x)\,dx$
• Median: the value of $x$ for which $F(x) = 0.5$ (the same idea applies to quartiles and percentiles)
• Notes:
o Since it is impossible to have a value of $X$ smaller than $-\infty$ or larger than $\infty$:
$F(-\infty) = 0 \quad F(\infty) = 1$
o As $x$ increases, $F(x)$ either increases or remains constant, but never decreases.
o $F$ is a continuous function even if $f$ is discontinuous
• Useful relations:
o $P(c < X < d) = F(d) - F(c)$
o $P(X > x) = 1 - F(x)$

(IS) Ex 9b: Question 9:
Given that:
$f(x) = \begin{cases} k & 0 < x < 1 \\ 4k & 1 < x < 3 \\ 0 & \text{otherwise} \end{cases}$
i. Find the value of $k$
ii. Find $F(x)$
iii. Find the difference between the median and the fifth percentile of $X$
Solution:
Part (i):
Total area must equal 1, hence
$\int_0^1 k\,dx + \int_1^3 4k\,dx = [kx]_0^1 + [4kx]_1^3 = 1$
$(k - 0) + (12k - 4k) = 9k = 1$
$\therefore k = \frac{1}{9}$
Part (ii):
Integrate each case separately from its lower limit to $x$.
For the first interval $0 \le x \le 1$:
$F(x) = \int_0^x \frac{1}{9}\,dt = \left[\frac{1}{9}t\right]_0^x = \frac{1}{9}x$
For the next interval $1 \le x \le 3$ we must split the probability as
$F(x) = P(X \le x) = P(X \le 1) + P(1 \le X \le x)$
and $P(X \le 1) = F(1) = \frac{1}{9}$
$\therefore F(x) = \frac{1}{9} + \int_1^x 4 \times \frac{1}{9}\,dt = \frac{1}{9} + \left[4 \times \frac{1}{9}t\right]_1^x = \frac{4}{9}x - \frac{3}{9}$
Writing in correct notation and fixing intervals (adding equal signs to the inequalities):
$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{1}{9}x & 0 \le x \le 1 \\ \frac{4}{9}x - \frac{3}{9} & 1 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$
Part (iii):
To find the median, you must check in which interval it lies. Do this by substituting the maximum value of $x$ in the first case:
$\frac{1}{9} \times 1 = \frac{1}{9} < \frac{1}{2}$
This means the median does not lie in this interval, so
$\frac{4}{9}x - \frac{3}{9} = 0.5$
$x = \frac{15}{8}$
The fifth percentile lies in the first interval as $\frac{1}{20} < \frac{1}{9}$, so
$\frac{1}{9}x = \frac{1}{20}$
$x = \frac{9}{20}$
Find the difference:
$\frac{15}{8} - \frac{9}{20} = \frac{57}{40}$

3.3 Expectation and Variance
• To calculate expectation:
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
• To calculate variance:
o First calculate $E(X)$ as above
o Then calculate $E(X^2)$ by
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$
o Substitute information and calculate using
$\mathrm{Var}(X) = E(X^2) - (E(X))^2$

3.4 Obtaining f(x) from F(x)
• As $F$ is obtained by integrating $f$, $f$ can be obtained by differentiating $F$

(IS) Ex 9d: Example 13:
The random variable $X$ has cdf given by
$F(x) = \begin{cases} 0 & x \le 1 \\ \frac{(x - 1)^3}{8} & 1 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$
Find the form of the pdf of $X$
Solution:
$F(x)$ is unchanging for $x < 1$ and for $x > 3$, therefore $f(x)$ is equal to 0 there. Hence we only need to differentiate in the interval $1 < x < 3$:
$f(x) = F'(x)$
$f(x) = \frac{d}{dx}\left(\frac{(x - 1)^3}{8}\right) = \frac{3}{8}(x - 1)^2$
Hence:
$f(x) = \begin{cases} \frac{3}{8}(x - 1)^2 & 1 < x < 3 \\ 0 & \text{otherwise} \end{cases}$

3.5 Distribution of a Function of a Random Variable
• We can deduce the distribution of a simple function of $X$ (either increasing or decreasing) with this procedure:
$f_X \rightarrow F_X \rightarrow F_Y \rightarrow f_Y$

(IS) Ex 9e: Example 15:
The random variable $X$ has pdf $f_X(x)$ given by
$f_X(x) = \begin{cases} 1 & 2 < x < 3 \\ 0 & \text{otherwise} \end{cases}$
The random variable $Y$ is given by $Y = 2X + 3$.
Determine the pdf and cdf of $Y$.
Solution:
The first step is to find $F_X(x)$, which gives:
$F_X(x) = P(X \le x) = \begin{cases} 0 & x \le 2 \\ x - 2 & 2 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$
Find the range for $Y$:
$(2 \times 2) + 3 \le y \le (3 \times 2) + 3$
$7 \le y \le 9$
Convert the cdf from $X$ to $Y$ using the relationship given:
$F_Y(y) = P(Y \le y) = P(2X + 3 \le y) = P\left(X \le \tfrac{1}{2}(y - 3)\right)$
Now substitute $\tfrac{1}{2}(y - 3)$ for $x$ in the cdf:
$\tfrac{1}{2}(y - 3) - 2 = \tfrac{1}{2}(y - 7)$
Express the cdf of $Y$ with the ranges worked out:
$F_Y(y) = P(Y \le y) = \begin{cases} 0 & y \le 7 \\ \tfrac{1}{2}(y - 7) & 7 \le y \le 9 \\ 1 & y \ge 9 \end{cases}$
Differentiate the function to find the pdf:
$f_Y(y) = \begin{cases} \tfrac{1}{2} & 7 < y < 9 \\ 0 & \text{otherwise} \end{cases}$
• This method can be used for both increasing and decreasing functions, as well as functions with powers (e.g. $W = X^2$)

4. GEOMETRIC & EXPONENTIAL DISTRIBUTION

4.1 Geometric Distribution
Conditions for a Geometric Distribution:
• Only two possible outcomes: success or failure
• Probability of success, $p$, is constant
• Each event is independent

• The geometric distribution is used to find the number of trials required to obtain the first success:
$P(X = n) = (1 - p)^{n-1} p, \quad n = 1, 2, 3, \dots$
where $p$ is the probability of success, $(1 - p)$ is the probability of failure and $n$ is the number of trials
• A geometric distribution with probability of success $p$ can be noted as
$X \sim \mathrm{Geo}(p)$
• The distribution is called geometric because successive probabilities $p,\ (1 - p)p,\ (1 - p)^2 p, \dots$ form a geometric progression with first term $p$ and common ratio $(1 - p)$

4.2 Cumulative Probabilities
• Calculating cumulative probabilities:
$P(X \le x) = 1 - (1 - p)^x \qquad P(X \ge x) = (1 - p)^{x-1}$
$P(X < x) = 1 - (1 - p)^{x-1} \qquad P(X > x) = (1 - p)^x$

Example:
In the village of Nanakuli, about 80% of the residents are of Hawaiian ancestry. Suppose you fly to Hawaii and visit Nanakuli.
i. What is the probability that the first Hawaiian you meet is the fifth villager?
ii. What is the probability that you do not meet a Hawaiian until the third villager?
Solution:
Part (i):
Using the formula:
$P(X = 5) = (1 - 0.80)^{5-1}(0.80) = 0.00128$
Part (ii):
Not meeting until the third means the probability
$P(X > 3)$
Using the relationships above:
$P(X > 3) = (1 - 0.80)^3 = 0.008$

4.3 Mean & Variance of a Geometric Distribution
• The expectation (mean) of a geometric distribution:
$E(X) = \frac{1}{p}$
• The variance of a geometric distribution:
$\mathrm{Var}(X) = \frac{1 - p}{p^2}$

4.4 Exponential Distribution
• Used for modelling the duration of (or waiting time between) events:
$P(X < x) = 1 - e^{-\lambda x}$
$P(X > x) = e^{-\lambda x}$
$P(a < X < b) = e^{-\lambda a} - e^{-\lambda b}$
where $\lambda$ is the average number of events in 1 unit of time and $x$ is the duration
• An exponential distribution with rate $\lambda$ can be noted:
$X \sim \mathrm{Exp}(\lambda)$
• The exponential distribution is memoryless:
$P[X > (a + b) \mid X > a] = P(X > b)$
o e.g. if a motor has been running for 3 hours and you are asked to calculate the probability of it running for more than 4 hours, you only need to find the probability of it running for the next hour, as the previous condition does not affect the probability
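The memoryless property in 4.4 can be checked numerically for the motor example. The sketch below is only an illustration: the notes do not give a rate for the motor, so $\lambda = 0.5$ per hour and the helper name `exp_survival` are assumptions.

```python
import math

def exp_survival(x: float, lam: float) -> float:
    """P(X > x) for X ~ Exp(lam); helper name is illustrative."""
    return math.exp(-lam * x)

lam = 0.5        # assumed failure rate per hour (hypothetical value)
a, b = 3.0, 1.0  # motor has already run 3 hours; we ask about 1 further hour

# P(X > a + b | X > a) = P(X > a + b) / P(X > a)
conditional = exp_survival(a + b, lam) / exp_survival(a, lam)
# Memorylessness says this should equal P(X > b).
unconditional = exp_survival(b, lam)
print(conditional, unconditional)
```

Both printed values agree (up to floating-point rounding) whatever rate is chosen, which is the content of the memoryless identity.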
4.5 Mean & Variance of an Exponential Distribution
• The expectation (mean) of an exponential distribution:
$E(X) = \frac{1}{\lambda}$
• The variance of an exponential distribution:
$\mathrm{Var}(X) = \frac{1}{\lambda^2}$

Example:
Calls arrive at an average rate of 12 per hour. Find the probability that a call will occur in the next 5 minutes given that you have already waited 10 minutes.
Solution:
Interpreting the information:
$\lambda = 12$ per hour $= 0.2$ per minute
We are being asked to calculate
$P(T \le 15 \mid T > 10)$
As the exponential distribution is memoryless, the previous condition does not affect it, hence we are simply being asked to find $P(T \le 5)$:
$P(T \le 5) = 1 - e^{-0.2 \times 5} = 0.632$

5. SAMPLING & CENTRAL LIMIT THEOREM

5.1 Central Limit Theorem
If $(X_1, X_2, \dots, X_n)$ is a random sample of size $n$ drawn from any population with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar{X}$ has:
Expected mean, $\mu$
Expected variance, $\frac{\sigma^2}{n}$
For large $n$, it is approximately normally distributed:
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

(IS) Ex 10f: Question 12:
The weights of the trout at a trout farm are normally distributed with mean 1 kg and standard deviation 0.25 kg.
a. Find, to 4 decimal places, the probability that a trout chosen at random weighs more than 1.25 kg.
b. If $\bar{Y}$ kg represents the mean weight of a sample of 10 trout chosen at random, state the distribution of $\bar{Y}$ and evaluate its mean and variance. Find the probability that the mean weight of a sample of 10 trout will be less than 0.9 kg.
Solution:
Part (a):
Write down the distribution:
$X \sim N(1, 0.25^2)$
Write down the probability they want:
$P(X > 1.25) = 1 - P(X < 1.25)$
Standardize and evaluate:
$1 - P\left(Z < \frac{1.25 - 1}{0.25}\right) = 0.1587$
Part (b):
Write down the initial distribution:
$X \sim N(1, 0.25^2)$
For the sample mean, the mean remains equal but the variance changes. Find the new variance:
Variance of sample mean $= \frac{\sigma^2}{n} = \frac{0.25^2}{10} = 0.00625$
Write down the distribution of the sample mean:
$\bar{Y} \sim N(1, 0.00625)$
Write down the probability they want:
$P(\bar{Y} < 0.9)$
Standardize and evaluate. The standardized value is negative, so use 1 minus the probability for the positive value:
$P\left(Z < \frac{0.9 - 1}{\sqrt{0.00625}}\right) = 1 - P\left(Z < \frac{0.1}{\sqrt{0.00625}}\right) = 0.103$

6. POINT AND INTERVAL ESTIMATION

6.1 The Variance
• The variance can be calculated for either a sample or a population, and there is a difference between them
Using the divisor $n$
• This is appropriate to use when
o data is given for the whole population and you are interested in the variance of the whole population
o data is given for the sample and you are interested in the variance of just the sample
$\sigma^2 = \frac{1}{n}\left(\Sigma x^2 - \frac{(\Sigma x)^2}{n}\right)$
Using the divisor $(n - 1)$
• This is appropriate to use when data is given for a sample and you are interested in estimating the variance of the whole population
• The quantity calculated, $s^2$, is known as the unbiased estimate of the population variance
$s^2 = \frac{1}{n - 1}\left(\Sigma x^2 - \frac{(\Sigma x)^2}{n}\right)$
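The two divisors in 6.1 can be compared side by side. The following is a minimal sketch using a made-up data set; the function names `variance_n` and `variance_unbiased` and the sample values are assumptions for illustration only. Python's standard `statistics.pvariance` and `statistics.variance` are used as a cross-check.

```python
import statistics

def variance_n(data):
    """Variance with divisor n: (1/n)(sum x^2 - (sum x)^2 / n)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x**2 / n) / n

def variance_unbiased(data):
    """Unbiased estimate s^2 with divisor n - 1."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x**2 / n) / (n - 1)

sample = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical observations

sigma2 = variance_n(sample)       # treats the data as the whole population
s2 = variance_unbiased(sample)    # estimates the population variance
print(sigma2, s2)
print(statistics.pvariance(sample), statistics.variance(sample))
```

The estimate with divisor $n - 1$ is always the larger of the two, and the gap shrinks as $n$ grows.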
6.2 Point Estimate & Confidence Interval
• A point estimate is a numerical value calculated from a set of data (sample) which is used as an estimate of an unknown parameter in a population
• Examples of point estimates are:
Sample mean $\bar{x}$ estimates population mean $\mu$
Sample proportion $\frac{r}{n}$ estimates population proportion $p$
Sample variance $s^2$ estimates population variance $\sigma^2$
• The point estimate will lie close to the population value but may not be exact
• We can determine a confidence interval in which the population value is likely to lie: $(\bar{x} - \delta,\ \bar{x} + \delta)$

6.3 Percentage Points for a Normal Distribution
• The percentage points are determined by finding the $z$-value of specific percentages.
• E.g. to find the $z$-value for a 95% confidence level, the remaining 5% is removed equally from both sides (2.5% each), so the $z$-value we actually find is that of $100\% - 2.5\% = 97.5\%$

Percentage Points Table
Confidence level   90%     95%     98%     99%
$z$-value          1.645   1.960   2.326   2.576

6.4 Confidence Interval for a Population Mean
Sample taken from a normal population distribution with known population variance
$\left(\bar{x} - z\frac{\sigma}{\sqrt{n}},\ \bar{x} + z\frac{\sigma}{\sqrt{n}}\right)$
• $z$ is the value corresponding to the confidence level required and $n$ is the sample size
• The confidence interval calculated is exact

Large sample taken from an unknown population distribution with known population variance
• By the Central Limit Theorem, the distribution of $\bar{X}$ will be approximately normal, so use the same method as above
$\left(\bar{x} - z\frac{\sigma}{\sqrt{n}},\ \bar{x} + z\frac{\sigma}{\sqrt{n}}\right)$
• The confidence interval calculated is approximate

Large sample taken from an unknown population distribution with unknown population variance
• As the population variance is unknown, you must first estimate the population variance, $s^2$, using sample data
$\left(\bar{x} - z\frac{s}{\sqrt{n}},\ \bar{x} + z\frac{s}{\sqrt{n}}\right)$
• The confidence interval calculated is approximate

{W13-P71}: Question 2:
Heights of a certain species of animal are normally distributed with $\sigma = 0.17$ m. Obtain a 99% confidence interval for the population mean, with total width less than 0.2 m. Find the smallest sample size required.
Solution:
For a 99% confidence interval, find $z$ where
$\Phi(z) = 0.995$ (think of the 1% cut from both sides)
$z = 2.576$
Subtract the limits of the interval and require the width to be less than 0.2:
$\left(\bar{x} + z\frac{\sigma}{\sqrt{n}}\right) - \left(\bar{x} - z\frac{\sigma}{\sqrt{n}}\right) < 0.2$
$2\left(z\frac{\sigma}{\sqrt{n}}\right) < 0.2$
Substitute the information given and find $n$:
$\sqrt{n} > \frac{2 \times 2.576 \times 0.17}{0.2} = 4.379$
$n > 19.18$, so the smallest sample size is $n = 20$

6.5 Confidence Interval for a Population Proportion
• Calculating the confidence interval from a random sample of $n$ observations from a population in which the proportion of successes is $p$ and the proportion of failures is $q$
• The observed proportion of successes is $\hat{p} = \frac{r}{n}$, where $r$ represents the number of successes, and $\hat{q} = 1 - \hat{p}$
$\left(\hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$
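The interval formula for a population proportion in 6.5 can be sketched as follows. This is an illustrative helper, not a prescribed method: the function name `proportion_ci` and the data ($r = 60$ successes out of $n = 200$) are assumptions, and the $z$-value is taken from Python's `statistics.NormalDist` rather than from printed tables.

```python
import math
from statistics import NormalDist

def proportion_ci(r: int, n: int, level: float):
    """Approximate confidence interval for a population proportion.

    r: observed successes, n: sample size, level: e.g. 0.95 for 95%.
    """
    p_hat = r / n
    q_hat = 1 - p_hat
    # Two-sided interval: cut (1 - level) / 2 from each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 60 successes in a sample of 200, 95% confidence.
low, high = proportion_ci(r=60, n=200, level=0.95)
print(round(low, 4), round(high, 4))
```

The midpoint of the returned interval is always $\hat{p}$, which is the fact used to recover $n$ from a quoted interval.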
{S10-P71}: Question 2:
A random sample of $n$ people were questioned about their internet use. 87 of them had a high-speed internet connection. A confidence interval for the population proportion having a high-speed internet connection is $0.1129 < p < 0.1771$.
i. Write down the mid-point of this confidence interval and hence find the value of $n$.
ii. This interval is an $\alpha\%$ confidence interval. Find $\alpha$.
Solution:
Part (i):
Find the midpoint of the limits, which gives $\hat{p}$:
$0.1129 + \frac{0.1771 - 0.1129}{2} = 0.145$
The midpoint is equal to the proportion of people with a high-speed internet connection, so
$\frac{87}{n} = 0.145 \quad \therefore n = 600$
Part (ii):
The upper limit was calculated by:
$0.1771 = 0.145 + z\sqrt{\frac{\hat{p}\hat{q}}{n}}$
Substituting the values calculated ($\hat{q} = 1 - \hat{p}$), find $z$:
$0.0321 = z\sqrt{\frac{\frac{87}{600} \times \frac{513}{600}}{600}} \quad \therefore z = 2.233$
Use normal tables and find the corresponding probability:
$\Phi(z) = 0.9872$
Think of symmetry: the same area is chopped off from both sides of the graph, so
$1 - 2(1 - 0.9872) = 0.9744$
Hence the interval is a 97.44% confidence interval, i.e. $\alpha = 97.44$

6.6 Percentage Points for a t-Distribution

6.7 Confidence Interval for a Population Mean with a Small Sample
Small sample ($n < 30$) taken from a normal population distribution with unknown population variance
$\left(\bar{x} - c\frac{s}{\sqrt{n}},\ \bar{x} + c\frac{s}{\sqrt{n}}\right)$
• As the sample is small, the normal distribution cannot be used and instead the $t$-distribution is used
• For a small sample of size $n$, its $t$-distribution is $t_{n-1}$ (degrees of freedom $v = n - 1$)
• Use the tables to find the percentage point, $c$
• As the population variance is unknown, you must estimate the population variance, $s^2$, using sample data
• The confidence interval calculated is exact

7. HYPOTHESIS TESTS

7.1 Null & Alternative Hypothesis
• For a hypothesis test on the population mean $\mu$, the null hypothesis $H_0$ proposes a value $\mu_0$ for $\mu$
$H_0: \mu = \mu_0$
• The alternative hypothesis $H_1$ suggests the way in which $\mu$ might differ from $\mu_0$. $H_1$ can take three forms:
$H_1: \mu < \mu_0$, a one-tail test for a decrease
$H_1: \mu > \mu_0$, a one-tail test for an increase
$H_1: \mu \ne \mu_0$, a two-tail test for a difference
• The test statistic is calculated from the sample. Its value is used to decide whether the null hypothesis should be rejected
• The rejection or critical region gives the values of the test statistic for which the null hypothesis is rejected
• The acceptance region gives the values of the test statistic for which the null hypothesis is accepted
• The critical values are the boundary values of the rejection region
• The significance level of a test gives the probability of the test statistic falling in the rejection region when the null hypothesis is true
To carry out a hypothesis test:
• Define the null and alternative hypotheses
• Decide on a significance level
• Determine the critical value(s)
• Calculate the test statistic
• Decide on the outcome of the test depending on whether the value of the test statistic lies in the rejection or acceptance region
• State the conclusion in words
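The steps listed for carrying out a hypothesis test can be sketched for the simplest case, a test on a mean with known variance. The sketch assumes the usual statistic $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$; the function name `z_test_mean`, the `tail` labels and the sample figures are all hypothetical, and critical values come from `statistics.NormalDist` instead of tables.

```python
import math
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha, tail):
    """One-sample z-test; tail is 'two', 'upper' or 'lower' (labels assumed)."""
    # Calculate the test statistic.
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    # Determine the critical value and decide on the outcome.
    if tail == "two":          # H1: mu != mu0
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "upper":      # H1: mu > mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    else:                      # H1: mu < mu0
        crit = -std_normal.inv_cdf(1 - alpha)
        reject = z < crit
    return z, crit, reject

# Hypothetical data: H0: mu = 50 against H1: mu != 50 at the 5% level.
z, crit, reject = z_test_mean(x_bar=52.0, mu0=50.0, sigma=6.0, n=36,
                              alpha=0.05, tail="two")
print(z, round(crit, 3), "reject H0" if reject else "accept H0")
```

Here the statistic is 2.0, which lies just beyond the two-tail 5% critical value of about 1.960, so $H_0$ would be rejected for this made-up sample.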
• The test statistic $Z$ can be used to test a hypothesis about a population mean:
$z = \frac{\bar{x} - \mu}{\sqrt{\frac{\sigma^2}{n}}}$
where $\mu$ is the population mean specified by the null hypothesis
• The critical values for some commonly used rejection regions:
Significance level   Two-tail ($\mu \ne \mu_0$)   One-tail ($\mu > \mu_0$)   One-tail ($\mu < \mu_0$)
10%                  ±1.645                       1.282                      −1.282
5%                   ±1.960                       1.645                      −1.645
2%                   ±2.326                       2.054                      −2.054
1%                   ±2.576                       2.326                      −2.326

7.2 Hypothesis Testing with Different Distributions
• Test for mean, known variance, normal distribution or large sample
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
o Use the general procedure as outlined above
• Test for mean, large sample, variance unknown
$\bar{X} \sim N\left(\mu, \frac{s^2}{n}\right)$
o Use the same procedure, but with the unbiased estimate of the population variance, $s^2$
• Test for large Poisson mean
$\bar{X} \sim N\left(\lambda, \frac{\lambda}{n}\right)$
o Use the general procedure, approximating with a normal distribution using the mean given
o Must apply continuity correction
• Test for proportion, large sample (Binomial distribution)
$\hat{p} \sim N\left(p, \frac{pq}{n}\right)$
o Similar to the Poisson approximation: use the probability of success and apply continuity correction
• Test for mean, small sample, variance unknown
$T = \frac{\bar{X} - \mu}{\sqrt{\frac{s^2}{n}}}$
o Firstly, you must estimate the variance by calculating $s^2$
o The distribution of the corresponding random variable, $T$, is $t_{n-1}$ (i.e. degrees of freedom one less than the sample size $n$)

7.3 Hypothesis Tests and Confidence Interval
• If a $c\%$ symmetric confidence interval excludes the population value of interest, then the null hypothesis that the population parameter takes this value will be rejected at the $(100 - c)\%$ level

7.4 Type I and Type II Errors
• A Type I error is made when a true null hypothesis is rejected
• A Type II error is made when a false null hypothesis is accepted
• P(Type I error) = significance level
• Calculating P(Type II error):
o Firstly, calculate the acceptance region by leaving $\bar{x}$ as a variable and equating the test statistic to the critical value for the significance level
o Next, calculate the conditional probability that $\bar{x}$ is still in the acceptance region when $\mu$ is now $\mu'$:
P($\bar{x}$ is in acceptance region | $\mu = \mu'$)
Calculate this by substituting the limit of the acceptance region as $\bar{x}$ (calculated previously) and the new, given $\mu'$ into the test statistic equation and finding the probability

7.5 Comparison of Two Means
• Used when testing the hypothesis that two populations have the same mean
• Two cases when comparing two population means:
o Population variances are known
o Although the population variances are unknown, they can be assumed to have the same value

Known population variance
• For two random variables $X$ and $Y$ with unknown means $\mu_x$ and $\mu_y$ and known variances $\sigma_x^2$ and $\sigma_y^2$:
o The null hypothesis is:
$H_0: \mu_x = \mu_y$
o The alternative hypothesis can be one- or two-tailed:
$H_1: \mu_x \ne \mu_y$ or $H_1: \mu_x > \mu_y$
• When calculating the $z$ value for the hypothesis test use the following formula:
$z = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$
• Carry out the hypothesis test as normal
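The two-sample statistic for known population variances in 7.5 can be evaluated directly. The sketch below uses invented summary figures, and the function name `two_sample_z` is an assumption; the result is compared with the two-tail 5% critical value of 1.960.

```python
import math

def two_sample_z(x_bar, y_bar, var_x, var_y, n_x, n_y):
    """z statistic for H0: mu_x = mu_y with known variances var_x, var_y."""
    return (x_bar - y_bar) / math.sqrt(var_x / n_x + var_y / n_y)

# Hypothetical summary data for two independent samples.
z = two_sample_z(x_bar=10.4, y_bar=10.0,
                 var_x=0.64, var_y=0.36,
                 n_x=64, n_y=36)

critical = 1.960  # two-tail test at the 5% significance level
print(round(z, 4), "reject H0" if abs(z) > critical else "accept H0")
```

With these made-up numbers each sample contributes 0.01 to the variance of the difference, giving $z \approx 2.83$, which falls in the rejection region.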
Common unknown population variance
• We are assuming that $\sigma_x^2 = \sigma_y^2 = \sigma^2$
• To find a common variance, we calculate the pooled estimate of the common variance, $s_p^2$, by:
$s_p^2 = \frac{\sum(x - \bar{x})^2 + \sum(y - \bar{y})^2}{n_x + n_y - 2}$
• The hypotheses are the same as above; however, as the variance is the same, the $z$ value is given by:
$z = \frac{\bar{x} - \bar{y}}{\sqrt{s_p^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)}}$
• For a small sample size, you cannot continue to use the normal distribution and instead must use the $t$-distribution with $n_x + n_y - 2$ degrees of freedom. The test statistic is calculated in the same way as above.

8. GOODNESS OF FIT

8.1 χ² Test
• Used to test whether a particular type of distribution is appropriate for the data given
• The test statistic involves squares, so we are only interested in upper-limit critical values
• The $\chi^2$ test can only be used to test two lists of frequencies: the observed frequencies and the expected frequencies calculated from the hypothesis.
$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
where $O_i$ and $E_i$ are the observed and expected frequencies
• When calculating, set up a table as follows:
Variable   Probability   $O_i$   $E_i$   $\frac{(O_i - E_i)^2}{E_i}$
⋮          ⋮             ⋮       ⋮       ⋮
Total
• If the expected frequency for a class is less than 5, then you must group this class with the next class (or two …)
• Hypotheses when testing:
o $H_0$: the … distribution is a suitable model
o $H_1$: the … distribution is not a suitable model

8.2 Comparing the χ² Value
• Once you have calculated the $\chi^2$ value of the data given, you must then compare it to the critical values of the $\chi^2$ distribution
• To test 5 classes at a 5% significance level, find the critical value of the $\chi^2$ distribution at 95% with 4 degrees of freedom
• If the distribution fits, the calculated value should be less than the critical value, and $H_0$ is accepted

8.3 Goodness of Fit to Prescribed Distribution Type
• This is the case where the null hypothesis states that the data has a 'particular named distribution' but does not specify all the parameters of the distribution
• You must then calculate the parameter(s) in order to carry out the test, e.g.
o Normal: mean and estimated sample variance
o Poisson: mean
o Binomial: probability of success
• For $k$ parameters calculated from the observed data, you must subtract $k$ from the degrees of freedom $v$
• Hence, with $m$ different outcomes,
$v = m - 1 - k$

8.4 Contingency Table
• This is a table which contains the frequencies for two or more variables.
• You may then assess whether the variables are associated or independent.
• Hypotheses when testing:
o $H_0$: the variables are independent
o $H_1$: the variables are associated
• For example:
        A         B         C
X                                     $\sum R_1$
Y                                     $\sum R_2$
Z                                     $\sum R_3$
        $\sum C_1$   $\sum C_2$   $\sum C_3$   $\sum$
• The expected frequency of each cell is calculated by
$\frac{\text{row total} \times \text{column total}}{\text{grand total}}$
• List each cell and set up a table as before
• The number of degrees of freedom for an $r$ by $c$ table is
$v = (r - 1)(c - 1)$
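The expected-frequency rule and degrees of freedom for a contingency table in 8.4 can be sketched in a few lines. The function name `contingency_chi2` and the $2 \times 2$ table of observed counts are assumptions made for illustration; the critical value 3.841 is the standard 5% point of $\chi^2$ with 1 degree of freedom.

```python
def contingency_chi2(table):
    """Return (expected frequencies, chi-squared statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    expected = []
    chi2 = 0.0
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            # Expected frequency = row total * column total / grand total.
            e = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, dof

observed = [[20, 30],
            [30, 20]]  # hypothetical observed frequencies

expected, chi2, dof = contingency_chi2(observed)
critical = 3.841  # chi-squared 5% critical value with 1 degree of freedom
print(expected, chi2, dof, "reject H0" if chi2 > critical else "accept H0")
```

For this made-up table every expected frequency is 25, the statistic is 4.0 and $v = 1$, so independence would be rejected at the 5% level.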
9. REGRESSION AND CORRELATION

9.1 Regression
• This is finding a linear relationship between two variables where one variable is dependent on the other, e.g. $y$ on $x$
• The regression line is the line summarizing the relation between $x$ and $y$
• The line must pass through the mean values, i.e. $\bar{x}$ and $\bar{y}$, hence the equation of the line satisfies
$\bar{y} = a + b\bar{x}$
where $b$ is the regression coefficient
• Rearranging the equation, the value of $a$ can be calculated:
$a = \bar{y} - b\bar{x}$
$a = \frac{1}{n}\left(\sum y - b\sum x\right)$

9.2 Calculating the Regression Coefficient
• The value of $b$ can be calculated using the method of least squares, where
$b = \frac{S_{xy}}{S_{xx}}$
• The quantities $S_{xy}$, $S_{xx}$ and $S_{yy}$ are given by
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \qquad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$

9.3 Two Regression Lines
• When both $X$ and $Y$ are random variables, there are two regression models:
$y$ on $x$: $y = a + bx$
$x$ on $y$: $x = c + dy$
• The two regression lines both pass through the point $(\bar{x}, \bar{y})$, which is therefore the point of intersection
• To predict a value of $x$ when, for the given data, the $x$ values are fixed (as opposed to being observations of a random variable), it is appropriate to use the regression line of $y$ on $x$ 'in reverse' rather than the regression line of $x$ on $y$

9.4 Correlation
• Used when both $X$ and $Y$ are random variables
• The correlation coefficient is a number between $-1$ and $+1$ calculated so as to represent the linear dependence of two variables or sets of data
• Positive correlation: correlation coefficient $> 0$; the regression lines of $Y$ on $X$ and $X$ on $Y$ have positive gradients
• Negative correlation: correlation coefficient $< 0$; the regression lines of $Y$ on $X$ and $X$ on $Y$ have negative gradients
• Zero correlation: no linear relationship; this does not mean $X$ and $Y$ are unrelated (e.g. they may have a parabolic relationship)

9.5 Product-Moment Correlation Coefficient
• A measure of scatter that lies between $-1$ and $1$:
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
• Correlation graphs: [figure]
• Relationship between $r$ and the regression coefficients:
$r^2 = b_1 b_2$
where $b_1$ and $b_2$ are the regression coefficients of the $y$ on $x$ and $x$ on $y$ lines
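The regression and correlation formulas in this chapter can be tied together with a small numerical sketch. The data points and the helper name `summary_stats` are assumptions for illustration; the code computes $S_{xy}$, $S_{xx}$ and $S_{yy}$, both regression coefficients, and $r$, then compares $r^2$ with the product of the two regression coefficients.

```python
import math

def summary_stats(xs, ys):
    """Return (S_xy, S_xx, S_yy) from raw data using the sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x**2 / n
    s_yy = sum(y * y for y in ys) - sum_y**2 / n
    return s_xy, s_xx, s_yy

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
n = len(xs)

s_xy, s_xx, s_yy = summary_stats(xs, ys)

b = s_xy / s_xx                  # gradient of the y on x line
a = (sum(ys) - b * sum(xs)) / n  # intercept of the y on x line
d = s_xy / s_yy                  # gradient of the x on y line
r = s_xy / math.sqrt(s_xx * s_yy)

print(a, b, d, r, r**2, b * d)
```

For these points $S_{xy} = 6$, $S_{xx} = 10$ and $S_{yy} = 6$, so $b = 0.6$, $d = 1$ and $r^2 = 0.6 = b \times d$.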