
Hypothesis testing case study: glaucoma

Dr Alberto Corrias

Department of Biomedical Engineering. National University of Singapore

BN2102 Bioengineering Data Analysis Hypothesis testing case study: glaucoma 1 / 39


PollEverywhere game
1 The bi-weekly lucky draw and provisional standings will be next week.
2 We have new questions today. Navigate to PollEverywhere at https://pollev.com/albertocorrias. Enter your name and wait until the lecturer activates the first question.
3 When the lecturer activates the first question, you may be asked to "login with SSO". Please do so using your NUS email address, e.g., eXXXXXX@u.nus.edu

Drawing of man warming up by Videoplasty.com from Wikimedia Commons is licensed under CC-BY-SA 4.0



Anatomy of the eye

Youtube link: https://www.youtube.com/watch?v=4kpXKu5QKww

To note:
Retina
Optic disc and optic nerve



Introduction to glaucoma

Youtube link: https://www.youtube.com/watch?v=wD4bEFEpNao



The problem of glaucoma†

Prevalence of glaucoma worldwide: 3.54%

Prevalence of glaucoma in Singapore:
Chinese  3.2%
Indian   1.95%
Malay    3.4%


Sources:
Tham YC et al. Ophthalmology. 2014;121(11):2081-2090.
Singapore Malay Eye Study: Shen SY et al. Invest Ophthalmol Vis Sci. 2008; 49(9): 3846-51.
Singapore Chinese Eye Study: Baskaran M et al. JAMA Ophthalmol. 2015;133(8):874-80
Singapore Indian Eye Study: Narayanaswamy A et al. Invest Ophthalmol Vis Sci. 2013;54(7):4621-7.



Detail of the retina



OCT to measure RNFL thickness

Youtube link: https://www.youtube.com/watch?v=_xoJ_t_H09Y



Our first case study



The retinal nerve fibre layer (RNFL)

Golzan S, Morgan WH, Georgevsky D, Graham SL. PLoS One. 2015; 10(6): e0128433.



The data

Glaucoma group, RNFL thickness (µm)


Patient  Temporal superior  Temporal inferior  Nasal superior  Nasal inferior
1        141.50             83.58              112.32          61.37
2        134.36             76.50              86.52           65.46
3        105.66             86.91              63.45           78.36
4        144.20             51.33              137.09          56.22
...      ...                ...                ...             ...
25       87.15              87.17              120.10          87.27

Healthy group, RNFL thickness (µm)


Patient  Temporal superior  Temporal inferior  Nasal superior  Nasal inferior
1        110.01             90.86              84.22           117.72
2        134.10             73.35              105.12          106.71
3        105.90             104.22             136.60          89.97
4        113.75             84.12              120.02          102.85
...      ...                ...                ...             ...
35       112.14             114.69             121.67          96.02



Box plot to visualize data



Histograms to visualize data (Mean ± SD)



Histograms to visualize data (Mean ± SEM)!?



Mean ± SEM: a "dirty little trick"

The sample standard deviation is an indication of the variability in the population
$SD = s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

The SEM is an indication of how well the sample mean estimates the population mean
$SEM = \frac{s}{\sqrt{n}}$

NOTE that SEM < SD (for any n > 1)! Data "look" better with less variability, but showing mean ± SEM may deceive the reader into believing the variability is small! DO NOT DO THIS! Usage of mean ± SD is advised.
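A minimal Python sketch of the SD and SEM formulas. It uses only the five temporal superior values printed for the glaucoma group in the data table (patients 1 to 4 and 25), so the numbers are illustrative and will not match the full 25-patient sample. The variable names are assumptions made for this example.

```python
import math
import statistics

# Temporal superior RNFL thickness (µm) for the five glaucoma patients that
# are printed in the data table; the other 20 rows are not shown there.
rnfl_subset = [141.50, 134.36, 105.66, 144.20, 87.15]

n = len(rnfl_subset)
mean = statistics.mean(rnfl_subset)
sd = statistics.stdev(rnfl_subset)   # sample SD: divides by n - 1
sem = sd / math.sqrt(n)              # standard error of the mean

print(f"mean = {mean:.2f} µm, SD = {sd:.2f} µm, SEM = {sem:.2f} µm")
```

Because the SEM divides the SD by √n, it shrinks as more subjects are added even when the spread of the data stays the same, which is why it understates variability.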
The question

What we are interested in


Is there a difference in RNFL thickness between glaucoma patients
and healthy individuals?

The question in statistical terms


Is there a difference in the mean value of RNFL thickness between the population of glaucoma patients (µgla) and the population of healthy individuals (µhea)?

The characteristics of the populations of glaucoma patients and healthy individuals are unknown. We will use our sample to test the hypothesis:
H0: µgla − µhea = 0 (this is the NULL hypothesis)
H1: µgla − µhea ≠ 0 (this is the alternative hypothesis)
We will assume unequal variance for this problem.

The sum of two random variables

If, hypothetically, X ∼ N(3, 16) and Y ∼ N(2, 9) are independent, then
X + Y ∼ N(a, b²)
a = ?, b = ?
Answer: X + Y will have mean 3 + 2 = 5 and variance σ² = σX² + σY² = 16 + 9 = 25; hence a = 5, b² = 25 and so b = 5.
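The rule can be written as a short Python sketch. The function name is hypothetical, and the rule as coded assumes X and Y are independent. The same additivity of variances is what places a sum of two variance terms under the square root of the two-sample t statistic.

```python
import math

def sum_of_independent_normals(mean_x, var_x, mean_y, var_y):
    """Mean and variance of X + Y for independent normal X and Y."""
    return mean_x + mean_y, var_x + var_y

# X ~ N(3, 16) and Y ~ N(2, 9); the second parameter is the variance.
a, b_squared = sum_of_independent_normals(3, 16, 2, 9)
b = math.sqrt(b_squared)

print(f"a = {a}, b^2 = {b_squared}, b = {b}")
```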



Performing the t test
1 Compute the sample means
$\bar{X}_{gla} = \frac{\sum_{i=1}^{25} x_i}{25} = \frac{141.5 + 134.36 + \dots + 87.15}{25} = 98.3$
$\bar{X}_{hea} = \frac{\sum_{i=1}^{35} x_i}{35} = \frac{110 + 134.10 + \dots + 112.14}{35} = 117.2$

2 Compute the variances of the samples (and standard deviations)
$s_{gla} = \sqrt{\frac{\sum_{i=1}^{25} (x_i - \bar{X}_{gla})^2}{25 - 1}} = 28.42, \quad s_{hea} = \sqrt{\frac{\sum_{i=1}^{35} (x_i - \bar{X}_{hea})^2}{35 - 1}} = 12.72$

3 Compute the t statistic
$t = \frac{(\bar{X}_{gla} - \bar{X}_{hea}) - 0}{\sqrt{s_{gla}^2/25 + s_{hea}^2/35}} = -3.109$
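Step 3 can be reproduced from the summary statistics alone. The sketch below is a plain-Python illustration; the function name `welch_t` and its parameters are assumptions made for this example. Because it starts from the rounded means and standard deviations, the result can differ from −3.109 in the last digit.

```python
import math

def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2, null_difference=0.0):
    """Two-sample t statistic without assuming equal variances."""
    standard_error = math.sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return ((mean_1 - mean_2) - null_difference) / standard_error

# Glaucoma group: mean 98.3, SD 28.42, n = 25
# Healthy group:  mean 117.2, SD 12.72, n = 35
t_stat = welch_t(98.3, 28.42, 25, 117.2, 12.72, 35)
print(f"t = {t_stat:.3f}")
```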



Performing the t test

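[Figure: Student t distribution T(η). Rejection regions of area α/2 lie to the left of −tcrit and to the right of tcrit; the no-rejection region lies between −tcrit and tcrit.]

where η is given by the Satterthwaite approximation (see theory)
$\eta = \frac{(s_{gla}^2/n_{gla} + s_{hea}^2/n_{hea})^2}{\frac{(s_{gla}^2/n_{gla})^2}{n_{gla} - 1} + \frac{(s_{hea}^2/n_{hea})^2}{n_{hea} - 1}} = 30.9 \approx 31$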
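The Satterthwaite approximation is easy to get wrong by hand, so a short Python sketch may help as a check. The function name is hypothetical; the inputs are the sample standard deviations and sample sizes of the two groups.

```python
def satterthwaite_df(sd_1, n_1, sd_2, n_2):
    """Approximate degrees of freedom for the unequal-variance t test."""
    v_1 = sd_1**2 / n_1   # squared standard error of the first sample mean
    v_2 = sd_2**2 / n_2   # squared standard error of the second sample mean
    return (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))

# Glaucoma group: SD 28.42, n = 25; healthy group: SD 12.72, n = 35
eta = satterthwaite_df(28.42, 25, 12.72, 35)
print(f"eta = {eta:.1f}, rounded to {round(eta)}")
```

The result always lies between the smaller of the two (n − 1) values and n1 + n2 − 2, which gives a quick sanity check on a hand calculation.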



The Student t distribution

We have just determined that we will use a Student t distribution with 31 degrees of freedom. A friend of yours suggests using the normal distribution N(0, 1) instead. Do you think the result would change considerably? Why?
Answer: Using T(31) or N(0, 1) will not make a big difference. For sample sizes n > 30, the Student t distribution and the normal distribution N(0, 1) are very close to each other. In practice, you can use the t distribution, but it is useful to bear in mind that for n > 30 the difference from the normal distribution is small.
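To make "very close" concrete, the sketch below compares the two-sided 5% critical value of N(0, 1), computed with Python's standard-library `statistics.NormalDist`, against the t critical value of 2.042 that this case study reads from the t table. The constant names are assumptions made for this example.

```python
from statistics import NormalDist

ALPHA = 0.05
T_STAT = -3.109        # t statistic from the glaucoma case study
T_CRIT_TABLE = 2.042   # t critical value read from the table in this case study

# Two-sided critical value of the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

reject_with_t = abs(T_STAT) > T_CRIT_TABLE
reject_with_z = abs(T_STAT) > z_crit

print(f"z_crit = {z_crit:.3f}, t_crit = {T_CRIT_TABLE:.3f}")
print(f"reject with t: {reject_with_t}, reject with N(0, 1): {reject_with_z}")
```

The two critical values differ by less than 0.1, and the decision is the same with either distribution.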



Determination of tcrit using tables

We choose α = 0.05 to identify tcrit = 2.042



Performing the t test

[Figure: T(31) distribution. Rejection regions of area 0.025 lie to the left of −2.042 and to the right of 2.042. The observed statistic −3.109 and its mirror image 3.109 fall inside the rejection regions.]

|t| > tcrit: we reject the NULL hypothesis

The sum of the areas in purple (the two tails beyond −3.109 and 3.109) is the p-value.
Without further computations, we already know p < 0.05.
The exact p-value can be easily computed by any software or approximated using tables.
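As an illustration of what "any software" does, the sketch below computes the two-sided p-value with the Python standard library only, by integrating the Student t density with Simpson's rule. This is an assumption-laden teaching sketch with hypothetical function names, not a replacement for a statistics package.

```python
import math

def t_pdf(x, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """Area in both tails beyond -|t| and |t| (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total   # area between -|t| and |t|
    return max(0.0, 1.0 - central_area)

p_glaucoma = two_sided_p_value(-3.109, 31)
print(f"p = {p_glaucoma:.4f}")
```

Feeding the critical value back in, for example `two_sided_p_value(2.042, 30)`, returns a value very close to 0.05, which is a convenient check that the integration behaves as expected.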
Rejecting the NULL Hypothesis

Statistical interpretation
We rejected the NULL hypothesis. If, in fact, µgla = µhea, the probability of obtaining, simply by random sampling, a difference in RNFL measurements between the two groups at least as large as the one observed is very small, i.e., less than 5%.

How it is reported
We observed that the mean RNFL thickness was reduced in glaucoma patients as compared to healthy subjects in the temporal superior quadrant. The difference was statistically significant (p < 0.05)†.

We can use "reduced" because we observed that $\bar{X}_{gla} < \bar{X}_{hea}$.
Alternatively, the exact p value is reported.

The Student t distribution

We have just rejected the NULL hypothesis based on a 95% confidence level. If it turns out that, actually, µgla = µhea, what type of error would we have committed? The probability of making such an error is < MAX. MAX = ?
Answers:
Type I error.
MAX = 0.05, the significance level α. (The p value, which in this case is < 0.05, is the smallest significance level at which we would still reject H0.)



Rejecting the NULL Hypothesis
One possible unlikely scenario that we rejected as "almost impossible"



Rejecting the NULL Hypothesis
One possible scenario (more likely)



Two-sided tests versus one-sided tests
Two-sided
H0: µhea − µgla = 0
H1: µhea − µgla ≠ 0
[Figure: rejection regions of area 0.025 in each tail, to the left of −2.042 and to the right of 2.042]
tcrit (which we saw is 2.042) is the value that leaves half of the area α in each of the two tails.
The p-value is the sum of the areas in the two tails, to the left of −|t| and to the right of |t|, where t is my t statistic.

One-sided
H0: µhea − µgla = 0
H1: µhea − µgla > 0
[Figure: a single rejection region of area 0.05 in the right tail, to the right of the one-sided tcrit]
The one-sided tcrit (labelled OS) is the value that leaves the entire area α to its right.
The p-value is the area to the right of my t statistic.



Two-sided tests versus one-sided tests

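A two-sided hypothesis test found a p-value = 0.2. What's the p value of the corresponding one-sided test?
Answer: for the one-sided test, the p-value is half of 0.2, i.e., 0.1, provided the observed difference lies in the direction stated by H1. If it lies in the opposite direction, the one-sided p-value is 1 − 0.1 = 0.9.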
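The conversion can be written as a small Python helper. This is a sketch with a hypothetical function name; it relies on the test statistic having a symmetric distribution, as the Student t distribution does, and on the sign of the observed difference being known.

```python
def one_sided_p_value(two_sided_p, effect_in_h1_direction):
    """Convert a two-sided p-value to a one-sided one for a symmetric test statistic."""
    half = two_sided_p / 2
    # If the observed difference points away from H1, the one-sided tail
    # covers everything except the small half-tail.
    return half if effect_in_h1_direction else 1 - half

p_same_direction = one_sided_p_value(0.2, effect_in_h1_direction=True)
p_opposite_direction = one_sided_p_value(0.2, effect_in_h1_direction=False)
print(p_same_direction, p_opposite_direction)
```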



Two-sided or one-sided?

General rule
In most cases, you should use a 2-sided test, even if you see something like "I would like to know whether A > B" in the problem statement. This is because, before the experiments, regardless of what you would like to investigate, you do not know whether A > B or A < B. In most cases, A < B is also an interesting result!

Example: you are analysing data comparing a new cancer drug with the currently available one. You would like to know whether the concentration of cancer cells with the new drug (CND) is less than with the currently available one (CCA). Then
H0: CND = CCA
H1: CND ≠ CCA
because CND > CCA is still a very interesting result!



Disc Haemorrhage (DH) versus non-DH study

Lee EJ, Kim TW, Kim M, Girard MJ, Mari JM, Weinreb RN. Invest Ophthalmol Vis Sci. 2014 Apr 28;55(4):2805-15.



Second scenario: CALCULATION QUESTION (3 points)
RNFL thickness (µm)
DH group   non-DH group
70.00      47.00
62.00      51.00
80.00      85.00
81.00      88.00
           80.00

Questions:
1 Test the hypothesis that the RNFL thickness changes between the DH group and the non-DH group (NULL hypothesis is no change). Use a 95% confidence level. Provide the value of the t statistic. Assume equal variance.
2 What can you say about the p-value of your analysis?



Second scenario: CALCULATION QUESTION solution

$\bar{X}_{DH} = \frac{\sum_{i=1}^{4} X_{DH,i}}{4}, \quad \bar{X}_{nonDH} = \frac{\sum_{i=1}^{5} X_{nonDH,i}}{5}$

$s_{DH} = \sqrt{\frac{\sum_{i=1}^{4} (X_{DH,i} - \bar{X}_{DH})^2}{4 - 1}}, \quad s_{nonDH} = \sqrt{\frac{\sum_{i=1}^{5} (X_{nonDH,i} - \bar{X}_{nonDH})^2}{5 - 1}}$

Group          Sample mean   Sample standard deviation
DH group       73.25         9.00
non-DH group   70.20         19.61
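These summary statistics can be checked with Python's standard-library `statistics` module, whose `stdev` function uses the n − 1 denominator. The list names are assumptions made for this example; the values are the RNFL thicknesses given in the question.

```python
import statistics

# RNFL thickness (µm) from the calculation question
dh = [70.0, 62.0, 80.0, 81.0]
non_dh = [47.0, 51.0, 85.0, 88.0, 80.0]

for name, values in (("DH", dh), ("non-DH", non_dh)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD: divides by n - 1
    print(f"{name}: n = {len(values)}, mean = {mean:.2f}, SD = {sd:.2f}")
```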



Second scenario: CALCULATION QUESTION solution
Hypothesis testing
H0: µDH − µnonDH = 0, i.e., µDH = µnonDH
H1: µDH − µnonDH ≠ 0, i.e., µDH ≠ µnonDH
We know that, assuming equal variance, we can use
$t = \frac{(\bar{X}_{DH} - \bar{X}_{nonDH}) - 0}{s_p \sqrt{1/4 + 1/5}}$
where $s_p$ is the pooled standard deviation (the square root of the pooled variance), given by
$s_p = \sqrt{\frac{(4 - 1) s_{DH}^2 + (5 - 1) s_{nonDH}^2}{4 + 5 - 2}} = 15.95$
Our test statistic t will follow a Student t distribution with η = n1 + n2 − 2 = 7 degrees of freedom. In our case
$t = \frac{(73.25 - 70.20) - 0}{15.95 \sqrt{1/4 + 1/5}} = 0.285$
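The whole equal-variance calculation fits in a few lines of Python. The sketch below uses only the standard library; the function name `pooled_t` and its return layout are assumptions made for this example.

```python
import math
import statistics

def pooled_t(sample_1, sample_2):
    """Equal-variance two-sample t statistic, pooled SD and degrees of freedom."""
    n_1, n_2 = len(sample_1), len(sample_2)
    df = n_1 + n_2 - 2
    pooled_variance = ((n_1 - 1) * statistics.variance(sample_1)
                       + (n_2 - 1) * statistics.variance(sample_2)) / df
    s_p = math.sqrt(pooled_variance)
    mean_difference = statistics.mean(sample_1) - statistics.mean(sample_2)
    t_stat = mean_difference / (s_p * math.sqrt(1 / n_1 + 1 / n_2))
    return t_stat, s_p, df

dh = [70.0, 62.0, 80.0, 81.0]
non_dh = [47.0, 51.0, 85.0, 88.0, 80.0]

t_stat, s_p, df = pooled_t(dh, non_dh)
print(f"t = {t_stat:.3f}, s_p = {s_p:.2f}, df = {df}")
```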
Determination of tcrit using tables

We choose α = 0.05 (tail = 0.025) to identify tcrit = 2.365



Second scenario: CALCULATION QUESTION solution

[Figure: T(7) distribution. Rejection regions of area 0.025 lie to the left of −2.365 and to the right of 2.365. The observed statistic 0.285 and its mirror image −0.285 fall inside the no-rejection region.]

|t| < tcrit: we fail to reject the NULL hypothesis

Without further computations, we already know p > 0.05.
The exact p-value can be easily computed by any software or approximated using tables.

Second scenario: CALCULATION QUESTION solution

VERY COMMON MISTAKE


Please note that we are not allowed to conclude that there is no difference in RNFL thickness between the DH and non-DH groups. Failing to reject the NULL hypothesis does NOT mean H0 is true. It only means that our data did not provide enough evidence to reject H0, and therefore H0 is still plausible.

One possible interpretation


The collected data contain insufficient evidence to support a
difference in RNFL thickness between the DH and non-DH group.



Courtroom analogy

In a courtroom
H0: innocent, H1: guilty. Prosecution tries to reject H0. If evidence is not enough, the accused is not convicted. It does not mean he is innocent. Prosecution failed to prove he is guilty.

In hypothesis testing
Like the prosecutor, we try to reject H0. If we fail to do so, it does not mean H0 is true. It just means we failed to prove H0 is false.



Hypothesis testing: assumption check

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim T$

Requires the assumptions:

A truly random and independent sample.
  Any trend in the data?
  Any bias in selecting subjects?
Underlying population approximately normal (more important as samples get smaller)
  Check for extreme outliers!
  Look at your data for suspected non-normality



Hypothesis testing and confidence intervals
β = 100(1 − α)% confidence interval
$\bar{X} - t_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2} \frac{s}{\sqrt{n}}$

Hypothesis testing
$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$
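The two formulas are two views of the same calculation: a hypothesised mean is rejected by the two-sided test exactly when it falls outside the confidence interval. The Python sketch below illustrates this with the four DH-group values from the second scenario. The candidate means 60, 70 and 90 µm are hypothetical, and the critical value 3.182 (two-sided, α = 0.05, 3 degrees of freedom) is taken from a standard t table rather than from these slides.

```python
import math
import statistics

dh = [70.0, 62.0, 80.0, 81.0]   # DH group RNFL thickness (µm)
T_CRIT_DF3 = 3.182              # assumption: standard t table, alpha = 0.05, 3 df

n = len(dh)
mean = statistics.mean(dh)
sem = statistics.stdev(dh) / math.sqrt(n)

# 95% confidence interval for the population mean
ci_low = mean - T_CRIT_DF3 * sem
ci_high = mean + T_CRIT_DF3 * sem

def rejects(mu_0):
    """Two-sided one-sample t test of H0: mu = mu_0 at alpha = 0.05."""
    return abs((mean - mu_0) / sem) > T_CRIT_DF3

print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
for mu_0 in (60.0, 70.0, 90.0):   # hypothetical values, for illustration only
    inside = ci_low < mu_0 < ci_high
    print(f"mu_0 = {mu_0}: inside CI = {inside}, rejected = {rejects(mu_0)}")
```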



What's so special about 95% confidence?

"...if one in 20 does not seem high enough odds, we may, if we prefer it, draw the line at 1 in 50 (the 2% point) or 1 in 100 (the 1% point). Personally, the writer prefers to set a low standard of significance at the 5% point, and ignore entirely all results which fail to reach this level."

Ronald A Fisher (1890-1962)

Fisher RA. "The arrangement of field experiments". J Min Agr. 1926;33:503-513