
Statistical Power And Sample Size Calculations

Minitab calculations
Manual calculations

1
When Do You Need Statistical Power
Calculations, And Why?

A prospective power analysis is used before collecting data, to consider design sensitivity.

2
When Do You Need Statistical Power
Calculations, And Why?

A retrospective power analysis is used to judge whether the studies you are interpreting were well enough designed.

3
When Do You Need Statistical Power
Calculations, And Why?

In Cohen’s (1962) seminal power analysis of the Journal of Abnormal and Social Psychology, he concluded that over half of the published studies were insufficiently powered to achieve statistical significance for the main hypothesis.

Cohen, J. (1962). “The statistical power of abnormal-social psychological research: A review.” Journal of Abnormal and Social Psychology, 65, 145-153.
4
What Is Statistical Power?
Essential concepts

• the null hypothesis, Ho
• the significance level, α
• Type I error
• Type II error

Crichton, N. (2000). “Information point: Type I and Type II errors.” Journal of Clinical Nursing, 9(2), 207.

5
What Is Statistical Power?
Essential concepts
Recall that a null hypothesis (Ho) states that the findings of the experiment are no different from those that would have been expected to occur by chance. Statistical hypothesis testing involves calculating the probability of achieving the observed results if the null hypothesis were true. If this probability is low (conventionally p < 0.05), the null hypothesis is rejected and the findings are said to be “statistically significant” (i.e., unlikely under the null hypothesis) at that accepted level.
6
Statistical Hypothesis Testing

When you perform a statistical hypothesis test, there are four possible outcomes, depending on two things:

7
Statistical Hypothesis Testing

• whether the null hypothesis (Ho) is true or false
• whether you decide either to reject, or else to retain, provisional belief in Ho

8
Statistical Hypothesis Testing
Decision     Ho is really true              Ho is really false
             (there really is no            (there really is an
             effect to find)                effect to be found)

Retain Ho    correct decision:              Type II error:
             prob = 1 - α                   prob = β

Reject Ho    Type I error:                  correct decision:
             prob = α                       prob = 1 - β
9
When Ho Is True And You Reject It,
You Make A Type I Error
• When there really is no effect, but the
statistical test comes out significant by
chance, you make a Type I error.

• When Ho is true, the probability of making a Type I error is called alpha (α). This probability is the significance level associated with your statistical test.

10
When Ho is False And You Fail To
Reject It, You Make A Type II Error
• When, in the population, there really is an
effect, but your statistical test comes out
non-significant, due to inadequate power
and/or bad luck with sampling error, you
make a Type II error.

• When Ho is false (so that there really is an effect there waiting to be found), the probability of making a Type II error is called beta (β).
11
The Definition Of
Statistical Power
• Statistical power is the probability of
not missing an effect, due to sampling
error, when there really is an effect
there to be found.

• Power is the probability (prob = 1 - β) of correctly rejecting Ho when it really is false.
12
Calculating Statistical Power
Depends On
1. the sample size
2. the level of statistical significance
required
3. the minimum size of effect that it is
reasonable to expect.

13
How Do We Measure Effect Size?

• Cohen's d
• Defined as the difference between
the means for the two groups, divided
by an estimate of the standard
deviation in the population.
• Often we use the average of the
standard deviations of the samples as
a rough guide for the latter.

14
Cohen's Rules Of Thumb For Effect Size

Effect size        Correlation coefficient    Difference between means
“Small effect”     r = 0.1                    d = 0.2 standard deviations
“Medium effect”    r = 0.3                    d = 0.5 standard deviations
“Large effect”     r = 0.5                    d = 0.8 standard deviations
15
Calculating Cohen’s d

d = (x̄1 - x̄2) / s_pooled

Notation:
d = Cohen’s d effect size
x̄ = mean
s_pooled = pooled standard deviation
Subscripts refer to the two conditions being compared.

Cohen, J. (1977). Statistical Power Analysis for the Behavioral Sciences. San Diego, CA: Academic Press.
Cohen, J. (1992). “A power primer.” Psychological Bulletin, 112, 155-159.
16
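As a quick illustration of the formula above (not part of the original notes), the following Python sketch computes Cohen’s d from two samples using the pooled standard deviation; the data values are made up purely for illustration.

import numpy as np

def cohens_d(x1, x2):
    """Cohen's d = (mean1 - mean2) / pooled standard deviation."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = np.var(x1, ddof=1), np.var(x2, ddof=1)   # sample variances
    s_pooled = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (np.mean(x1) - np.mean(x2)) / s_pooled

# Illustrative (made-up) data for the two conditions
group1 = [5.1, 6.2, 5.8, 6.5, 5.9]
group2 = [4.2, 4.8, 5.0, 4.5, 4.9]
print(round(cohens_d(group1, group2), 2))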
Calculating Cohen’s d

17
Calculating Cohen’s d from a t test

Interpreting Cohen's d effect size: an interactive visualization

18
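For reference only (not in the original slides): when an independent-samples t test used the pooled standard deviation, Cohen’s d can be recovered from the t statistic and the two group sizes as d = t × sqrt(1/n1 + 1/n2). A minimal Python sketch, with an illustrative t value:

import math

def d_from_t(t, n1, n2):
    """Cohen's d from an independent-samples (pooled-variance) t statistic."""
    return t * math.sqrt(1 / n1 + 1 / n2)

# Illustrative values: t = 2.5 with 13 observations per group
print(round(d_from_t(2.5, 13, 13), 2))   # about 0.98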
Conventions And Decisions About
Statistical Power
• Acceptable risk of a Type II error is often set at 1
in 5, i.e., a probability of 0.2 (β).
• The conventionally uncontroversial value for
“adequate” statistical power is therefore set at 1 -
0.2 = 0.8.
• People often regard the minimum acceptable
statistical power for a proposed study as being an
80% chance of an effect that really exists showing
up as a significant finding.
Understanding Statistical Power and Significance Testing — an Interactive Visualization

19
6 Steps To Determine An Appropriate Sample Size For My Study
1. Formulate the study. Here you detail your study design, choose the outcome summary, and specify the analysis method.

2. Specify analysis parameters. These include, for instance, the test significance level, whether the test is one- or two-sided, and exactly what you are looking for from your analysis.
20
6 Steps To Determine An Appropriate Sample Size For My Study
3. Specify effect size for test. This could be
the expected effect size (often a best
estimate), or one could use the effect size
that is deemed to be clinically meaningful.

4. Compute sample size or power. Once you have completed steps one through three, you are in a position to compute the sample size or the power for your study.
21
6 Steps To Determine An Appropriate Sample Size For My Study
5. Sensitivity analysis. Here you compute the sample size or power under multiple scenarios, to examine how the study parameters affect the power or the required sample size. Essentially this is a what-if analysis to assess how sensitive the power or required sample size is to the other factors.

22
6 Steps To Determine An Appropriate Sample Size For My Study
6. Choose an appropriate power or sample size, and
document this in your study design protocol.

However, other authors suggest 5 steps (a, b, c or d)!

Other options are also available!

23
A Couple Of Useful Links

For an article casting doubt on scientific precision and power, see The Economist, 19 Oct 2013: “I see a train wreck looming,” warned Daniel Kahneman. Also an interesting read in The Economist, 19 Oct 2013, on the reviewing process.

A collection of online power calculator web pages for specific kinds of tests.

Java applets for power and sample size; select the analysis.
24
Next Week

Statistical Power Analysis In Minitab

Note that G*Power 3.1 is installed on University machines. It is more complex to use than Minitab, but does provide a wider range of tests. For further information see the link, and look down the page for the desired software.

25
Statistical Power Analysis In Minitab
Minitab is available via RAS
Stat > Power and Sample Size >

26
Statistical Power Analysis In Minitab

Note that you might find web tools for other models.

The alternative normally involves solving some very complex equations.

Recall that a comparison of two proportions equates to analysing a 2×2 contingency table.
27
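As a hedged illustration of the two-proportions case mentioned above, the sketch below uses Python’s statsmodels (not Minitab) to estimate power for comparing two proportions; the proportions 0.6 and 0.4 and the group size of 50 are assumed values for illustration only.

from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Cohen's h effect size for two illustrative proportions
h = proportion_effectsize(0.6, 0.4)

# Approximate power for 50 observations per group at alpha = 0.05
power = NormalIndPower().power(effect_size=h, nobs1=50, alpha=0.05,
                               ratio=1.0, alternative='two-sided')
print(round(power, 3))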
Statistical Power Analysis In Minitab

Note that you might find web tools for other models, e.g. simple statistical correlation analysis online.

The alternative normally involves solving some very complex equations.

See Test 28 in the Handbook of Parametric and Nonparametric Statistical Procedures, Third Edition, by David J. Sheskin.

28
Factors That Influence Power

• the sample size
• alpha (α)
• the standard deviation

29
Using Minitab To Calculate Power
And Minimum Sample Size
• Suppose we have two samples, each
with n = 13, and we propose to use the
0.05 significance level
• Difference between means is 0.8
standard deviations (i.e., Cohen's
d = 0.8), so a t test
• All key strokes in printed notes

30
Using Minitab To Calculate Power
And Minimum Sample Size
Note that all parameters bar one are required.

Leave one field blank; this will be estimated.

31
Using Minitab To Calculate Power
And Minimum Sample Size
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05   Assumed standard deviation = 1

             Sample
Difference   Size     Power
0.8          13       0.499157

The sample size is for each group. (Power will be 0.4992.)
32
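If you want to cross-check this result outside Minitab, a rough sketch in Python (assuming the statsmodels package is available) gives essentially the same figure:

from statsmodels.stats.power import TTestIndPower

# Two-sided two-sample t test, d = 0.8, n = 13 per group, alpha = 0.05
power = TTestIndPower().power(effect_size=0.8, nobs1=13, alpha=0.05,
                              ratio=1.0, alternative='two-sided')
print(round(power, 4))   # roughly 0.50, in line with Minitab's 0.4992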
Using Minitab To Calculate Power
And Minimum Sample Size
If, in the population, there really is a difference of 0.8 standard deviations between the members of the two categories that would be sampled in the two groups, then using sample sizes of 13 in each group gives a 49.92% chance of getting a result that is significant at the 0.05 level.
33
Using Minitab To Calculate Power
And Minimum Sample Size
• Suppose the difference between the
means is 0.8 standard deviations (i.e.,
Cohen's d = 0.8)
• Suppose that we require a power of
0.8 (the conventional value)
• Suppose we intend doing a one-tailed
t test, with significance level 0.05.
• All key strokes in printed notes
34
Using Minitab To Calculate Power
And Minimum Sample Size

Select “Options” to set a one-tailed test
35
Using Minitab To Calculate Power
And Minimum Sample Size

36
Using Minitab To Calculate Power
And Minimum Sample Size
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus >)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05   Assumed standard deviation = 1

             Sample   Target
Difference   Size     Power    Actual Power
0.8          21       0.8      0.816788

The sample size is for each group. (Target power of at least 0.8.)
37
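A similar cross-check of the required sample size can be sketched in Python with statsmodels (an assumption, not part of the Minitab notes); 'larger' requests a one-sided test:

import math
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group sample size by leaving nobs1 unspecified
n = TTestIndPower().solve_power(effect_size=0.8, alpha=0.05, power=0.8,
                                ratio=1.0, alternative='larger')
print(math.ceil(n))   # about 21 per group, matching the Minitab output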
Using Minitab To Calculate Power
And Minimum Sample Size
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus >)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05   Assumed standard deviation = 1

             Sample   Target
Difference   Size     Power    Actual Power
0.8          21       0.8      0.816788

The sample size is for each group. (At least 21 cases in each group.)
38
Using Minitab To Calculate Power
And Minimum Sample Size
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus >)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05   Assumed standard deviation = 1

             Sample   Target
Difference   Size     Power    Actual Power
0.8          21       0.8      0.816788

The sample size is for each group. (Actual power 0.8168.)
39
Using Minitab To Calculate Power
And Minimum Sample Size

Suppose you are about to undertake an investigation to determine whether or not 4 treatments affect the yield of a product, using 5 observations per treatment. You know that the mean of the control group should be around 8, and you would like to find significant differences of +4. Thus, the maximum difference you are considering is 4 units. Previous research suggests the population σ is 1.64. So an ANOVA.
40
Using Minitab To Calculate Power
And Minimum Sample Size

41
Using Minitab To Calculate Power
And Minimum Sample Size

Power and Sample Size
One-way ANOVA
Alpha = 0.05   Assumed standard deviation = 1.64
Number of Levels = 4

SS      Sample             Maximum
Means   Size    Power      Difference
8       5       0.826860   4

The sample size is for each level. (Power ≈ 0.83.)
42
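A hedged reconstruction of this calculation in Python with statsmodels is sketched below. It assumes Minitab’s “maximum difference” corresponds to the least favourable configuration, with two means 4 units apart and the others at the centre, so SS(means) = 2 × (4/2)² = 8 and Cohen’s f = sqrt(SS(means)/k) / σ.

import math
from statsmodels.stats.power import FTestAnovaPower

k, n, sigma, max_diff = 4, 5, 1.64, 4.0
ss_means = 2 * (max_diff / 2) ** 2            # = 8, as in the Minitab output
f = math.sqrt(ss_means / k) / sigma           # Cohen's f, about 0.86

power = FTestAnovaPower().power(effect_size=f, nobs=k * n,
                                alpha=0.05, k_groups=k)
print(round(power, 3))   # roughly 0.83, in line with Minitab's 0.826860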
Using Minitab To Calculate Power
And Minimum Sample Size

To interpret the results: if you assign five observations to each treatment level, you have a power of 0.83 to detect a difference of 4 units or more between the treatment means.

Minitab can also display the power curve of all possible combinations of maximum difference in means detected and the corresponding power values for a one-way ANOVA with 5 samples per treatment.
43
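To mimic the power-curve idea outside Minitab, a small Python loop (same assumptions as the sketch above) can tabulate power against a range of maximum differences:

import math
from statsmodels.stats.power import FTestAnovaPower

k, n, sigma, alpha = 4, 5, 1.64, 0.05
for max_diff in (1, 2, 3, 4, 5):
    # Least-favourable configuration for each maximum difference
    f = math.sqrt(2 * (max_diff / 2) ** 2 / k) / sigma
    power = FTestAnovaPower().power(effect_size=f, nobs=k * n,
                                    alpha=alpha, k_groups=k)
    print(max_diff, round(power, 3))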
