
26/02/2020

Lecture 5

Confidence interval for a mean and comparison of two means

Lecture 5 – Introduction

• What is a confidence interval?
• Confidence interval for large samples
• Difference between confidence interval and reference range
• Comparison of two means
  – Sampling distribution of the difference between two means

[See Kirkwood & Sterne: Chapter 6 and Chapter 7]

What is a confidence interval?

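• Population parameter $\mu$
• Sample estimate $\bar{x}$
• Standard error of the estimate $\mathrm{s.e.}(\bar{x})$
• Confidence interval

[Figure: number line marking the lower limit, the sample estimate $\bar{x}$ and the upper limit of the confidence interval, with the population parameter $\mu$ shown above it]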

Definition of a 95% confidence interval

• If we were to draw (e.g.) 20 independent random samples of equal size from the same population and calculate a 95% confidence interval for each of them,

• then, on average, 19 out of every 20 (95%) such confidence intervals would contain the true population mean, and 1 out of every 20 (5%) would not.
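
The repeated-sampling definition above can be checked by simulation. The following is a minimal sketch, not part of the lecture: it assumes a normally distributed population with a hypothetical mean of 7.0 and standard deviation of 3.0, draws many samples of 100, and counts how often the large-sample 95% interval contains the true mean. The function names `ci95` and `coverage` are illustrative.

```python
import random
from statistics import mean, stdev


def ci95(sample):
    """Large-sample 95% CI for the mean: x_bar +/- 1.96 * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / n ** 0.5
    return x_bar - 1.96 * se, x_bar + 1.96 * se


def coverage(true_mean, true_sd, n, repeats, seed=0):
    """Fraction of simulated 95% CIs that contain the true population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repeats):
        # Hypothetical population: normal with the given mean and SD.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = ci95(sample)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / repeats


# With many repeats the fraction should settle close to 0.95.
print(coverage(true_mean=7.0, true_sd=3.0, n=100, repeats=2000))
```

With only 20 samples, as on the slide, the number of intervals that miss the true mean will vary from run to run; the 19-in-20 figure is a long-run average.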


Example

[Figure: sample mean and 95% CI for ‘length of stay in hospital’ from 20 samples of 100 patients]

Confidence interval for a large sample

• The sample standard deviation ($s$) is a reliable estimate of the population standard deviation ($\sigma$)

• The distribution of the sample means is approximately normal when the sample is large (approximately n > 60)

• We can use the z-score of the standard normal distribution to calculate the 95% confidence interval for a large sample

CI for the population mean (large sample)

• Lower limit = $\bar{x} - (z \times \mathrm{s.e.}(\bar{x}))$

• Upper limit = $\bar{x} + (z \times \mathrm{s.e.}(\bar{x}))$


95% CI for the population mean (large sample)

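• Lower limit = $\bar{x} - (z \times \mathrm{s.e.}(\bar{x}))$

• Upper limit = $\bar{x} + (z \times \mathrm{s.e.}(\bar{x}))$

[Figure: standard normal curve with a central area of 0.95 lying between the z-scores $-z$ and $+z$]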

95% confidence interval for a large sample

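• Lower limit = $\bar{x} - \left(1.96 \times \frac{s}{\sqrt{n}}\right)$

• Upper limit = $\bar{x} + \left(1.96 \times \frac{s}{\sqrt{n}}\right)$

[Figure: number line marking the lower limit, $\bar{x}$ and the upper limit, with $\mu$ shown above it]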

90% confidence interval for a large sample

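• Lower limit = $\bar{x} - \left(1.645 \times \frac{s}{\sqrt{n}}\right)$

• Upper limit = $\bar{x} + \left(1.645 \times \frac{s}{\sqrt{n}}\right)$

[Figure: number line marking the lower limit, $\bar{x}$ and the upper limit, with $\mu$ shown above it]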
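
The multipliers 1.96 and 1.645 are the standard normal z-scores that leave 2.5% and 5% in each tail respectively. As a small sketch (the helper name `z_multiplier` is illustrative, not from the lecture), they can be recovered with Python's standard library:

```python
from statistics import NormalDist


def z_multiplier(confidence):
    """z-score that leaves (1 - confidence) / 2 in each tail of N(0, 1)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95):
    print(f"{level:.0%} CI multiplier: {z_multiplier(level):.3f}")
```

A higher confidence level needs a larger multiplier, so for the same sample the 95% interval is wider than the 90% interval.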


Example:- Insecticide required for malaria control in 10,000 houses

• From a sample of n = 100 houses, the sample mean sprayable area is $\bar{x}$ = 24.2 m² and the sample standard deviation is s = 5.9 m²

Estimate the standard error $\sigma/\sqrt{n}$ by $s/\sqrt{n}$ = 5.9/10 = 0.59 m²

Lower limit of 95% CI = 24.2 − (1.96 × 0.59) = 23.0 m²
Upper limit of 95% CI = 24.2 + (1.96 × 0.59) = 25.4 m²

[Figure: 95% CI for the population mean $\mu$, running from 23.0 to 25.4 with $\bar{x}$ = 24.2 at its centre]
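
The arithmetic in this example can be reproduced with a few lines of code. This is a sketch only; the function name `large_sample_ci` is an illustrative choice, and the inputs are the figures quoted on the slide.

```python
from math import sqrt


def large_sample_ci(x_bar, s, n, z=1.96):
    """Large-sample CI for a mean: x_bar +/- z * s / sqrt(n)."""
    se = s / sqrt(n)  # estimated standard error of the mean
    return x_bar - z * se, x_bar + z * se


# Sprayable area: n = 100 houses, mean 24.2 m^2, SD 5.9 m^2.
lower, upper = large_sample_ci(x_bar=24.2, s=5.9, n=100)
print(f"95% CI: {lower:.1f} to {upper:.1f} m^2")  # 95% CI: 23.0 to 25.4 m^2
```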


Example:- Insecticide required for malaria control in 10,000 houses

[Diagram: a SAMPLE of n = 100 houses is drawn from the POPULATION; the sample mean of 24.2 m² is used for INFERENCE about the population mean sprayable area $\mu$, giving a 95% CI of 23.0 to 25.4 m²]

Confidence interval for a small sample

• When the sample is small (approximately n < 60), the sample standard deviation ($s$) may not be a reliable estimate of the population standard deviation ($\sigma$) for calculating the standard error

• We cannot use the z-score of the standard normal distribution

• Instead we use the t distribution

Difference between a confidence interval and reference range

• From a sample of 100 babies in Melbourne, the sample mean birthweight was 3.627 kg and the sample standard deviation was 0.358 kg.

• If the maternal health nurse wanted to know whether your baby’s weight lies within the middle 95% of the population, would they use the reference range or the confidence interval?


Difference between a confidence interval and reference range

95% reference range:-

Lower limit = $\bar{x} - (1.96 \times s)$
Upper limit = $\bar{x} + (1.96 \times s)$

95% confidence interval:-

Lower limit = $\bar{x} - \left(1.96 \times \frac{s}{\sqrt{n}}\right)$
Upper limit = $\bar{x} + \left(1.96 \times \frac{s}{\sqrt{n}}\right)$

Difference between a confidence interval and reference range

95% reference range:-

Lower limit = 2.925 kg
Upper limit = 4.329 kg

95% confidence interval:-

Lower limit = 3.557 kg
Upper limit = 3.697 kg

[Figure: number lines showing the 95% reference range from 2.925 to 4.329 and the 95% confidence interval for $\mu$ from 3.557 to 3.697]
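
The two sets of limits come from the same sample statistics and differ only in whether the multiplier is applied to $s$ or to $s/\sqrt{n}$. A short sketch using the birthweight figures from the slides (the function names are illustrative):

```python
from math import sqrt


def reference_range(x_bar, s, z=1.96):
    """Range expected to hold the middle 95% of individual observations."""
    return x_bar - z * s, x_bar + z * s


def confidence_interval(x_bar, s, n, z=1.96):
    """Range of plausible values for the population mean (large sample)."""
    se = s / sqrt(n)
    return x_bar - z * se, x_bar + z * se


# Birthweights: n = 100 babies, mean 3.627 kg, SD 0.358 kg.
rr = reference_range(3.627, 0.358)
ci = confidence_interval(3.627, 0.358, 100)
print(f"95% reference range:     {rr[0]:.3f} to {rr[1]:.3f} kg")
print(f"95% confidence interval: {ci[0]:.3f} to {ci[1]:.3f} kg")
```

The nurse's question is about an individual baby, so the reference range is the relevant one; the confidence interval describes uncertainty about the population mean only.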

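Difference between a confidence interval and reference range

• For the same sample (n > 1) and the same percentage level, the confidence interval is always narrower than the reference range, because the standard error $s/\sqrt{n}$ is smaller than the standard deviation $s$

• The confidence interval gives a range of plausible values for the population mean

• The reference range gives the range within which about 95% of individual observations are expected to lie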


Comparison of two means

                                  Exposed          Unexposed
                                  (subscript 1)    (subscript 0)
Sample mean                       $\bar{x}_1$      $\bar{x}_0$
Sample standard deviation         $s_1$            $s_0$
Number of participants            $n_1$            $n_0$
Population mean                   $\mu_1$          $\mu_0$
Population standard deviation     $\sigma_1$       $\sigma_0$

We are interested in the population mean difference $\mu_1 - \mu_0$

Estimate the population mean difference using $\bar{x}_1 - \bar{x}_0$

How accurate is the estimate? Compute its standard error and confidence interval.

Comparison of two means: paired and unpaired samples

• Unpaired samples
  – Independent samples obtained from two populations
  – e.g., effect of a medical treatment
    • Enroll 1,000 subjects into a study.
    • Randomly assign 500 subjects to the treatment group and 500 subjects to the control group

• Paired samples
  – Same individuals (tested twice, before and after)
  – Each individual is their own “control”
  – Not covered in this subject

Comparison of two means

• Examples (unpaired or paired?)

  – Compare the mean systolic blood pressure 6 months post-treatment between patients receiving a new drug versus patients receiving standard therapy

  – Compare blood pressure measurements in a group of hypertensive men, before and after they received treatment


Comparison of two means

Unpaired samples

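Sampling distribution of the difference between two means (large samples)

• We are interested in the population mean difference ($\mu_1 - \mu_0$)

• Estimate the population mean difference using ($\bar{x}_1 - \bar{x}_0$)

• The standard error of the sample mean difference:

$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\mathrm{s.e.}(\bar{x}_1)^2 + \mathrm{s.e.}(\bar{x}_0)^2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_0^2}{n_0}}$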

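95% confidence interval of difference between population means (large samples)

• Lower limit = $(\bar{x}_1 - \bar{x}_0) - (1.96 \times \mathrm{s.e.}(\bar{x}_1 - \bar{x}_0))$

• Upper limit = $(\bar{x}_1 - \bar{x}_0) + (1.96 \times \mathrm{s.e.}(\bar{x}_1 - \bar{x}_0))$

where $\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\mathrm{s.e.}(\bar{x}_1)^2 + \mathrm{s.e.}(\bar{x}_0)^2}$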


Comparison of two means: large samples

Group                        n             Sample mean weight loss     Sample standard     Sample standard
                                           after 4 weeks (kg)          deviation           error
Atkins (Group 1)             $n_1$ = 60    $\bar{x}_1$ = 4.40          $s_1$ = 2.45        $\mathrm{s.e.}(\bar{x}_1)$ = 0.32
Weight Watchers (Group 0)    $n_0$ = 61    $\bar{x}_0$ = 2.86          $s_0$ = 2.23        $\mathrm{s.e.}(\bar{x}_0)$ = 0.29

We are interested in the difference in population mean weight loss after 4 weeks between the Atkins and Weight Watchers groups: $\mu_1 - \mu_0$
Truby H et al. BMJ 2007

95% confidence interval of difference between population means (large samples)

Estimate of the difference in population mean weight loss after 4 weeks between the Atkins and Weight Watchers groups:

$\bar{x}_1 - \bar{x}_0$ = 4.40 − 2.86 = 1.54 kg

$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{(0.32)^2 + (0.29)^2}$ = 0.43 kg

• 95% CI lower limit = 1.54 − (1.96 × 0.43) = 0.70 kg

• 95% CI upper limit = 1.54 + (1.96 × 0.43) = 2.38 kg
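
The same calculation can be sketched in code. The function name `diff_means_ci` is illustrative. One assumption differs from the slide: the standard errors are computed from the standard deviations and group sizes in the table rather than from the rounded values 0.32 and 0.29, which gives the same limits to two decimal places.

```python
from math import sqrt


def diff_means_ci(x1, s1, n1, x0, s0, n0, z=1.96):
    """Large-sample CI for the difference between two independent means."""
    diff = x1 - x0
    # Standard error of the difference: sqrt(se1^2 + se0^2).
    se_diff = sqrt(s1 ** 2 / n1 + s0 ** 2 / n0)
    return diff, se_diff, diff - z * se_diff, diff + z * se_diff


# Atkins (group 1) versus Weight Watchers (group 0), weight loss in kg.
diff, se_diff, lower, upper = diff_means_ci(4.40, 2.45, 60, 2.86, 2.23, 61)
print(f"difference = {diff:.2f} kg, s.e. = {se_diff:.2f} kg")
print(f"95% CI: {lower:.2f} to {upper:.2f} kg")
```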

Interpretation

• We found a difference of 1.54 kg in mean weight loss after 4 weeks between the Atkins and Weight Watchers diet groups.

[Figure: number line for $\mu_1 - \mu_0$ showing the 95% CI from 0.70 to 2.38, lying entirely above 0]

• We are 95% confident that the population mean difference in weight loss between the Atkins and Weight Watchers diets could be as large as 2.38 kg (much greater weight loss for the Atkins diet) or as small as 0.70 kg (marginally greater weight loss for the Atkins diet).


Alternative interpretations

• What if the 95% CI were [−1.3 kg, −0.1 kg]?

[Figure: number line for $\mu_1 - \mu_0$ showing the 95% CI from −1.3 to −0.1, lying entirely below 0]

• We are 95% confident that the population mean difference in weight loss (Atkins minus Weight Watchers) could be as large as −1.3 kg (greater weight loss for the Weight Watchers diet) or as small as −0.1 kg (slightly greater weight loss for the Weight Watchers diet).

Alternative interpretations

• What if the 95% CI were [−1.3 kg, 2.1 kg]?

[Figure: number line for $\mu_1 - \mu_0$ showing the 95% CI from −1.3 to 2.1, spanning 0]

• We are 95% confident that the population mean difference in weight loss (Atkins minus Weight Watchers) could be as low as −1.3 kg (greater weight loss for the Weight Watchers diet), 0 kg (no difference in weight loss between the two diets), or as high as 2.1 kg (greater weight loss for the Atkins diet).
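
The three interpretations differ only in where the interval sits relative to zero. A small sketch of that reading rule follows; the function name `describe_interval` and the wording of its return values are illustrative, and the difference is taken as group 1 minus group 0 as in the slides.

```python
def describe_interval(lower, upper):
    """Describe a CI for (mean of group 1 - mean of group 0) relative to 0."""
    if lower > 0:
        return "entirely above 0: group 1 mean is higher"
    if upper < 0:
        return "entirely below 0: group 0 mean is higher"
    return "includes 0: compatible with no difference"


# The three intervals discussed above (Atkins minus Weight Watchers, kg).
for lower, upper in [(0.70, 2.38), (-1.3, -0.1), (-1.3, 2.1)]:
    print(f"[{lower}, {upper}] -> {describe_interval(lower, upper)}")
```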

Summary

• Interpretation of the confidence interval
• Calculate confidence interval for large samples
• Understand the difference between a confidence interval and reference range
• Difference between unpaired & paired data
• Comparison of two means (unpaired)
  – Interpret the confidence interval of the difference between two means
  – Calculate the confidence interval for large samples
