
Introduction to sample size and power calculations

How much chance do we have to reject the null hypothesis when the alternative is in fact true?
(What's the probability of detecting a real effect?)

Can we quantify how much power we have for given sample sizes?

Study 1: 263 cases, 1241 controls

Null distribution: difference = 0.
Rejection region: any value >= 6.5 (0 + 3.3 × 1.96). For a 5% significance level, the one-tail area = 2.5% (Zα/2 = 1.96).
Clinically relevant alternative: difference = 10%.
Power = chance of being in the rejection region if the alternative is true = the area under the alternative distribution to the right of the critical value (shaded yellow in the figure).

Study 1: 263 cases, 1241 controls

Rejection region: any value >= 6.5 (0 + 3.3 × 1.96).
Power here: P(Z > (6.5 − 10)/3.3) = P(Z > −1.06) = 85%.
Power = chance of being in the rejection region if the alternative is true = the area to the right of the critical value (shaded yellow in the figure).
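This calculation is easy to check numerically. A minimal sketch in Python (standard library only), plugging in the slide's standard error of 3.3 and alternative difference of 10:

```python
from statistics import NormalDist

z_alpha = 1.96   # two-tailed 5% significance level
se = 3.3         # standard error of the difference (from the slide)
alt = 10         # clinically relevant alternative difference

critical = 0 + z_alpha * se       # rejection region starts here (~6.5)
z = (critical - alt) / se         # ~ -1.07 (the slide rounds to -1.06)
power = 1 - NormalDist().cdf(z)   # area to the right of z: ~0.86
print(round(critical, 2), round(power, 2))
```

The small discrepancy with the slide's 85% comes only from rounding the critical value to 6.5.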

Study 1: 50 cases, 50 controls

Critical value = 0 + 10 × 1.96 = 20 (Zα/2 = 1.96; one-tail area = 2.5%).
Power is closer to 15% now.

Study 2: 18 treated, 72 controls, std dev = 2

Critical value = 0 + 0.52 × 1.96 ≈ 1.
Clinically relevant alternative: difference = 4 points.
Power is nearly 100%!

Study 2: 18 treated, 72 controls, std dev = 10

Critical value = 0 + 2.64 × 1.96 ≈ 5.
Power is about 40%.

Study 2: 18 treated, 72 controls, effect size = 1.0

Critical value = 0 + 0.52 × 1.96 ≈ 1.
Clinically relevant alternative: difference = 1 point.
Power is about 50%.

Factors Affecting Power

1. Size of the effect
2. Standard deviation of the characteristic
3. Sample size
4. Significance level desired

1. Bigger difference from the null mean
(Figure: null and clinically relevant alternative distributions of average weight from samples of 100.)

2. Bigger standard deviation
(Figure: average weight from samples of 100.)

3. Bigger sample size
(Figure: average weight from samples of 100.)

4. Higher significance level
(Figure: rejection region for average weight from samples of 100.)

Sample size calculations

Based on these elements, you can write a formal mathematical equation that relates power, sample size, effect size, standard deviation, and significance level.

WE WILL DERIVE THESE FORMULAS FORMALLY SHORTLY.

Simple formula for difference in means

n = 2σ²(Zpower + Zα/2)² / difference²

where:
n = sample size in each group (assumes equal-sized groups)
σ = standard deviation of the outcome variable
difference = effect size (the difference in means)
Zpower represents the desired power (typically 0.84 for 80% power)
Zα/2 represents the desired level of statistical significance (typically 1.96)

Simple formula for difference in proportions

n = 2 p(1 − p)(Zpower + Zα/2)² / (p1 − p2)²

where:
n = sample size in each group (assumes equal-sized groups)
p(1 − p) = a measure of variability (similar to standard deviation)
p1 − p2 = effect size (the difference in proportions)
Zpower represents the desired power (typically 0.84 for 80% power)
Zα/2 represents the desired level of statistical significance (typically 1.96)
Derivation of sample size formula

Study 2: 18 treated, 72 controls, effect size = 1.0
Critical value = 0 + 0.52 × 1.96 ≈ 1.
Power close to 50%.

SAMPLE SIZE AND POWER FORMULAS

Critical value = 0 + standard error(difference) × Zα/2

Power = area to the right of Z, where

Z = (critical value − alternative difference) / standard error(diff)

e.g., here: Z = (1 − 1)/standard error(diff) = 0; power = 50%.

Power = area to the right of Z, where

Z = (critical value − alternative difference) / standard error(diff)
  = (Zα/2 × standard error(diff) − difference) / standard error(diff)
  = Zα/2 − difference/standard error(diff)

Power is the area to the right of this Z.
OR: power is the area to the left of difference/standard error(diff) − Zα/2.
Since normal charts give us the area to the left by convention, we need to use difference/standard error(diff) − Zα/2 to get the correct value.

Zpower = −Z

Most textbooks just call this Zβ; I'll use the term Zpower to avoid confusion.
The area to the left of Zpower = the area to the right of Z.

All-purpose power formula

Zpower = difference/standard error(difference) − Zα/2
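The formula above can be wrapped in a few lines of Python (standard library only; the helper name `power` is my own, not from the lecture):

```python
from statistics import NormalDist

def power(difference, se, alpha=0.05):
    """Power of a two-sided z-test: the area to the left of
    Z_power = difference/se - Z_{alpha/2}."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = .05
    z_power = difference / se - z_alpha
    return NormalDist().cdf(z_power)

# Study 1 (263 cases, 1241 controls): se = 3.3, difference = 10 -> ~85%
print(round(power(10, 3.3), 2))
# Shrunk to 50 cases, 50 controls: se = 10 -> power drops to ~15-17%
print(round(power(10, 10), 2))
```

The two calls reproduce the study 1 slides: power collapses when the standard error grows from 3.3 to 10.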

Derivation of a sample size formula

s.e.(diff) = σ √(1/n1 + 1/n2)

Sample size is embedded in the standard error.

If the ratio of group 2 to group 1 is r:
s.e.(diff) = σ √(1/n1 + 1/(r·n1)) = σ √((r + 1)/(r·n1))
Algebra

Zpower = difference / (σ √((r + 1)/(r·n1))) − Zα/2

Zpower + Zα/2 = difference / (σ √((r + 1)/(r·n1)))

(Zpower + Zα/2)² = difference² · r·n1 / (σ²(r + 1))

σ²(r + 1)(Zpower + Zα/2)² = r·n1 · difference²

n1 = (r + 1) σ²(Zpower + Zα/2)² / (r · difference²)

If r = 1 (equal groups), then n1 = 2σ²(Zpower + Zα/2)² / difference²

Sample size formula for difference in means

n1 = (r + 1)/r · σ²(Zpower + Zα/2)² / difference²

where:
n1 = size of smaller group
r = ratio of larger group to smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome
Zpower corresponds to power (0.84 for 80% power)
Zα/2 corresponds to the two-tailed significance level (1.96 for α = .05)
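As a sketch, this formula translates directly into Python (standard library only; `n_per_group` is my own name, and I round up to the next whole subject):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, difference, power=0.80, alpha=0.05, r=1.0):
    """Smaller-group size for comparing two means.
    r = ratio of larger group to smaller group (r = 1: equal groups)."""
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    n1 = (r + 1) / r * sd**2 * (z_power + z_alpha)**2 / difference**2
    return ceil(n1)

# sd = 10, clinically meaningful difference = 3, 80% power, equal groups:
print(n_per_group(10, 3))   # the raw value is ~174.4, rounded up to 175
```

This matches the IQ example worked later in the deck (which reports 174 per group by using Zpower = 0.84 and rounding down).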

Examples

Example 1: You want to calculate how much power you will have to see a difference of 3.0 IQ points between two groups: 30 male doctors and 30 female doctors. If you expect the standard deviation to be about 10 on an IQ test for both groups, then the standard error for the difference will be about:

√(10²/30 + 10²/30) = 2.58

Power formula

Zpower = d*/s.e.(d*) − Zα/2 = d*/(σ√(2/n)) − Zα/2 = d*√n/(σ√2) − Zα/2

e.g., here:

Zpower = 3/2.58 − 1.96 = −0.79, or equivalently Zpower = 3√30/(10√2) − 1.96 = −0.79

P(Z ≤ −0.79) = .21; only 21% power to see a difference of 3 IQ points.
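A quick numeric check of Example 1 (a standard-library Python sketch, not part of the original slides):

```python
from math import sqrt
from statistics import NormalDist

sd, n, d = 10, 30, 3                    # SD = 10, 30 per group, 3 IQ points
se = sqrt(sd**2 / n + sd**2 / n)        # ~2.58
z_power = d / se - 1.96                 # ~ -0.80 (the slide rounds to -0.79)
power_val = NormalDist().cdf(z_power)   # P(Z <= z_power) ~ 0.21
print(round(se, 2), round(power_val, 2))
```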

Example 2: How many people would you need to sample in each group to achieve power of 80% (corresponds to Zpower = 0.84)?

n = 2σ²(Zpower + Zα/2)² / (d*)² = 100(2)(0.84 + 1.96)² / (3)² ≈ 174

174/group; 348 altogether.

Sample size needed for comparing two proportions:
Example: I am going to run a case-control
study to determine if pancreatic cancer is
linked to drinking coffee. If I want 80%
power to detect a 10% difference in the
proportion of coffee drinkers among
cases vs. controls (if coffee drinking and
pancreatic cancer are linked, we would
expect that a higher proportion of cases
would be coffee drinkers than controls),
how many cases and controls should I
sample? About half the population drinks
coffee.

Derivation of a sample size formula:

The standard error of the difference of two proportions:

s.e.(diff) = √( p1(1 − p1)/n1 + p2(1 − p2)/n2 )

Derivation of a sample size formula:

Here, if we assume equal sample sizes and that, under the null hypothesis, the proportion of coffee drinkers is .5 in both cases and controls, then:

s.e.(diff) = √( .5(1 − .5)/n + .5(1 − .5)/n ) = √(.5/n)

Zpower = test statistic/s.e.(test statistic) − Zα/2

Zpower = .10/√(.5/n) − 1.96

For 80% power, Zpower = .84:

.84 = .10/√(.5/n) − 1.96
.84 + 1.96 = .10/√(.5/n)
(.84 + 1.96)²(.5) = (.10)² n
n = .5(.84 + 1.96)² / (.10)² = 392

There is 80% area to the left of a Z-score of .84 on a standard normal curve; therefore, there is 80% area to the right of −.84.

Would take 392 cases and 392 controls to have 80% power!
Total=784
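The same arithmetic in a few lines of Python (standard library; using the slide's rounded Z-values):

```python
p = 0.5                        # about half the population drinks coffee
diff = 0.10                    # detect a 10% difference in proportions
z_power, z_alpha = 0.84, 1.96  # 80% power, two-tailed alpha = .05

n = 2 * p * (1 - p) * (z_power + z_alpha)**2 / diff**2
print(round(n), 2 * round(n))  # 392 per group, 784 total
```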

Question 2:
How many total cases and controls would I
have to sample to get 80% power for the
same study, if I sample 2 controls for
every case?

Ask yourself, what changes here?


Zpower = test statistic/s.e.(test statistic) − Zα/2

With 2 controls per case:

s.e.(diff) = √( p1(1 − p1)/n + p2(1 − p2)/(2n) ) = √( .25/n + .25/(2n) ) = √( .5/(2n) + .25/(2n) ) = √( .75/(2n) )

Different-size groups:

.84 = .10/√(.75/(2n)) − 1.96
.84 + 1.96 = .10/√(.75/(2n))
(.84 + 1.96)²(.75) = (.10)²(2n)
n = .75(.84 + 1.96)² / (2 × .10²) = 294

Need: 294 cases and 2 × 294 = 588 controls. 882 total.


Note: you get the best power for the lowest sample size if you keep both groups equal (882 > 784).
You would only want to make groups unequal if there was an obvious difference in the cost or ease of
collecting data on one group. E.g., cases of pancreatic cancer are rare and take time to find.
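The r-adjusted formula reproduces both numbers in the comparison above (Python sketch, standard library only):

```python
p, diff = 0.5, 0.10            # coffee example, proportions version
z_power, z_alpha = 0.84, 1.96  # 80% power, two-tailed alpha = .05
r = 2                          # 2 controls sampled per case

n_cases = (r + 1) / r * p * (1 - p) * (z_power + z_alpha)**2 / diff**2
total = n_cases * (1 + r)
print(round(n_cases), round(total))  # 294 cases, 882 total (vs. 784 with r = 1)
```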

General sample size formula

s.e.(diff) = √( p(1 − p)/(rn) + p(1 − p)/n ) = √( p(1 − p)/(rn) + r·p(1 − p)/(rn) ) = √( (r + 1) p(1 − p)/(rn) )

n = (r + 1)/r · p(1 − p)(Zpower + Zα/2)² / (p1 − p2)²

General sample size needs when outcome is binary

n = (r + 1)/r · p(1 − p)(Zpower + Zα/2)² / (p1 − p2)²

where:
n = size of smaller group
r = ratio of larger group to smaller group
p(1 − p) = a measure of variability of the outcome
p1 − p2 = clinically meaningful difference in proportions of the outcome
Zpower corresponds to power (.84 for 80% power)
Zα/2 corresponds to the two-tailed significance level (1.96 for α = .05)

Compare with when outcome is continuous

n1 = (r + 1)/r · σ²(Zpower + Zα/2)² / difference²

where:
n1 = size of smaller group
r = ratio of larger group to smaller group
σ = standard deviation of the characteristic
difference = clinically meaningful difference in means of the outcome
Zpower corresponds to power (.84 for 80% power)
Zα/2 corresponds to the two-tailed significance level (1.96 for α = .05)

Question

How many subjects would we need to sample to have 80% power to detect an average increase in MCAT biology score of 1 point, if the average change without instruction (just due to chance) is plus or minus 3 points (= standard deviation of change)?

Standard error here = σchange/√n = 3/√n

Zpower = test statistic/s.e.(test statistic) − Zα/2 = D/(σ/√n) − Zα/2

where D = change from test 1 to test 2 (the difference).

√n · D = σ(Zpower + Zα/2)

n = σ²(Zpower + Zα/2)² / D²

Therefore, need: (9)(1.96 + .84)²/1² ≈ 70.6, so 71 people total.
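Checking the arithmetic (standard-library Python sketch; rounding up to whole subjects):

```python
from math import ceil

sd_change, d = 3, 1            # SD of the score change = 3; detect a 1-point gain
z_power, z_alpha = 0.84, 1.96  # 80% power, two-tailed alpha = .05

n = sd_change**2 * (z_power + z_alpha)**2 / d**2   # ~70.6
print(ceil(n))                 # 71 subjects total
```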

Sample size for paired data

n = σd²(Zpower + Zα/2)² / difference²

where:
n = sample size
σd = standard deviation of the within-pair difference
difference = clinically meaningful difference
Zpower corresponds to power (.84 for 80% power)
Zα/2 corresponds to the two-tailed significance level (1.96 for α = .05)

Paired data, difference in proportions: sample size

n = p(1 − p)(Zpower + Zα/2)² / (p1 − p2)²

where:
n = sample size for one group
p(1 − p) = a measure of variability
p1 − p2 = clinically meaningful difference in dependent proportions
Zpower corresponds to power (.84 for 80% power)
Zα/2 corresponds to the two-tailed significance level (1.96 for α = .05)