
Hypothesis Testing

Module 3.4: Goodness-of-Fit Tests

© University of New South Wales


School of Risk and Actuarial Studies

Outline

Goodness-of-fit tests

Example: Goodness-of-Fit Tests


Goodness-of-fit

▶ We wish to test distributional assumptions on a given set of data.

▶ For example, are our data normally distributed? Do the count data follow a Poisson distribution?

▶ Goodness-of-fit tests considered here: Anderson-Darling (AD), Cramér-von Mises (CM) and Kolmogorov-Smirnov (KS). All are based on comparing the empirical c.d.f. with the hypothesized c.d.f.


Anderson-Darling & Cramér-von Mises test


Test H0 : X ∼ F(x; θ) vs. H1 : X ≁ F(x; θ). Test statistic:

    T = n · ∫_{−∞}^{∞} (Fn(x) − F(x))² · w(x) dF(x),

with weight function

    w(x) = 1,                       if T = CM (Cramér-von Mises test);
    w(x) = (F(x) · (1 − F(x)))⁻¹,   if T = AD (Anderson-Darling test).

For a discrete sample with order statistics x(1) ≤ … ≤ x(n), the statistics are computed as

    AD = Σ_{i=1}^{n} (1 − 2i)/n · [log F(x(i)) + log(1 − F(x(n+1−i)))] − n,

    CM = 1/(12n) + Σ_{i=1}^{n} (F(x(i)) − (i − 0.5)/n)².

Here Fn(x) is the empirical cumulative distribution function and F is the hypothesized c.d.f.
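A minimal Python sketch of these discrete-sample formulas, assuming the hypothesized c.d.f. is supplied as a callable; the function name is illustrative, and the two observations 0.4 and 1.6 are read off the figure that follows (the points at which the Exp(1) c.d.f. equals 0.3297 and 0.7981):

    import numpy as np

    def cm_ad_statistics(sample, cdf):
        """Compute the Cramér-von Mises and Anderson-Darling statistics
        for a sample against a hypothesized c.d.f. `cdf` (a callable)."""
        x = np.sort(np.asarray(sample, dtype=float))   # order statistics x_(1) <= ... <= x_(n)
        n = len(x)
        u = cdf(x)                                     # F(x_(i)), hypothesized c.d.f. at the order statistics
        i = np.arange(1, n + 1)

        # CM = 1/(12n) + sum_i (F(x_(i)) - (i - 0.5)/n)^2
        cm = 1.0 / (12 * n) + np.sum((u - (i - 0.5) / n) ** 2)

        # AD = sum_i (1 - 2i)/n * [log F(x_(i)) + log(1 - F(x_(n+1-i)))] - n
        ad = np.sum((1 - 2 * i) / n * (np.log(u) + np.log(1 - u[::-1]))) - n
        return cm, ad

    # Example: the two-point sample from the figure below against the Exp(1) c.d.f.
    cm, ad = cm_ad_statistics([0.4, 1.6], lambda t: 1 - np.exp(-t))
    print(round(cm, 3), round(ad, 3))   # approximately 0.050 and 0.293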



[Figure: Anderson-Darling & Cramér-von Mises test. Empirical c.d.f. of the two-observation sample (jumps at x(1) = 0.4 and x(2) = 1.6) plotted against the Exp(1) c.d.f.; F(0.4) = 0.3297 and F(1.6) = 0.7981.]

Knowledge of F(x)

To draw a conclusion from the test, we compare the computed statistic against a table of critical values.

Consider the following cases:

▶ Case 0: F(x) is completely specified.
▶ Case 1: F(x) is the exponential distribution with unknown parameter λ.
▶ Case 2: F(x) is the normal distribution with unknown µ and σ².
Remark: if F(x) is completely known, you need to be more stringent (the critical values are larger) than when only partial knowledge of F(x) is available.
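For Cases 1 and 2, the unknown parameters are first replaced by estimates computed from the same data (for example, maximum likelihood estimates) before the statistic is evaluated. A small sketch under that assumption; the helper name is mine, not from the slides:

    import numpy as np
    from scipy.stats import norm

    def fitted_cdf(sample, case):
        """Return the hypothesized c.d.f. with unknown parameters replaced by
        maximum likelihood estimates (illustrative helper, not from the slides)."""
        x = np.asarray(sample, dtype=float)
        if case == "exponential":                  # Case 1: rate lambda estimated by 1/mean
            lam = 1.0 / x.mean()
            return lambda t: 1.0 - np.exp(-lam * np.asarray(t))
        if case == "normal":                       # Case 2: mu and sigma estimated by MLE
            mu, sigma = x.mean(), x.std(ddof=0)
            return lambda t: norm.cdf(t, loc=mu, scale=sigma)
        raise ValueError("case must be 'exponential' or 'normal'")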


Critical values: Anderson-Darling & Cramér-von Mises tests


H0           Statistic                         α = 0.1   0.05    0.025   0.01
F(x)         CM                                0.347     0.461   0.581   0.743
EXP(λ)       (1 + 0.16/n) · CM                 0.177     0.224   0.273   0.337
N(µ, σ²)     (1 + 0.5/n) · CM                  0.104     0.126   0.148   0.178
F(x)         AD                                1.933     2.492   3.070   3.857
EXP(λ)       (1 + 0.6/n) · AD                  1.078     1.341   1.606   1.957
N(µ, σ²)     (1 + 0.75/n − 2.25/n²) · AD       0.631     0.752   0.873   1.035

▶ Calculate the test statistic and compare it to the critical value¹ at a given level of significance α.
▶ Reject the null hypothesis if the test statistic is larger than the critical value.
¹ Critical values from Anderson and Darling (1954) and Stephens (1974).
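As an illustration of the decision rule, a short sketch for the estimated-exponential row of the table; the 5% critical values are hard-coded from the table, the function name is my own, and the input values 0.050 and 0.293 are the statistics computed in the worked examples later in this module:

    def cm_ad_decision_exp(cm, ad, n):
        """Apply the EXP(lambda) modifications from the table and compare the
        modified statistics with the 5% critical values."""
        crit = {"CM": 0.224, "AD": 1.341}                                # alpha = 0.05, EXP(lambda) row
        modified = {"CM": (1 + 0.16 / n) * cm, "AD": (1 + 0.6 / n) * ad}
        return {k: ("reject H0" if modified[k] > crit[k] else "do not reject H0")
                for k in modified}

    print(cm_ad_decision_exp(cm=0.050, ad=0.293, n=2))
    # {'CM': 'do not reject H0', 'AD': 'do not reject H0'}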

Kolmogorov-Smirnov / Kuiper test
The KS test statistic is based on the maximum difference between the empirical c.d.f. and the hypothesized c.d.f., testing
H0 : X ∼ FX(x) vs. H1 : X ≁ FX(x).

Test statistics:

    D⁺ = max_i { i/n − F(x(i)) }
    D⁻ = max_i { F(x(i)) − (i − 1)/n }

    D = max(D⁺, D⁻)    (KS statistic),
    V = D⁺ + D⁻        (Kuiper statistic).
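A short Python sketch of these four statistics, assuming the hypothesized c.d.f. is supplied as a callable; the function name is illustrative:

    import numpy as np

    def ks_kuiper_statistics(sample, cdf):
        """Compute D+, D-, the KS statistic D and the Kuiper statistic V
        for a sample against a hypothesized c.d.f. `cdf` (a callable)."""
        x = np.sort(np.asarray(sample, dtype=float))
        n = len(x)
        u = cdf(x)                                  # F(x_(i)) at the order statistics
        i = np.arange(1, n + 1)
        d_plus = np.max(i / n - u)                  # D+ = max_i { i/n - F(x_(i)) }
        d_minus = np.max(u - (i - 1) / n)           # D- = max_i { F(x_(i)) - (i-1)/n }
        return d_plus, d_minus, max(d_plus, d_minus), d_plus + d_minus

    # Two-observation example from the slides, against the Exp(1) c.d.f.
    print(ks_kuiper_statistics([0.4, 1.6], lambda t: 1 - np.exp(-t)))
    # approximately (0.2019, 0.3297, 0.3297, 0.5316)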


[Figure: Kolmogorov-Smirnov / Kuiper test. Empirical c.d.f. of a sample plotted against the Exp(1/3.7) c.d.f., with D⁺ = 0.072 and D⁻ = 0.117 marked.]

Critical values: Kolmogorov-Smirnov / Kuiper tests


H0           Statistic                                  α = 0.1   0.05    0.025   0.01
F(x)         (√n + 0.12 + 0.11/√n) · D                  1.224     1.358   1.480   1.628
EXP(λ)       (√n + 0.26 + 0.5/√n) · (D − 0.2/n)         0.995     1.094   1.184   1.298
N(µ, σ²)     (√n − 0.01 + 0.85/√n) · D                  0.819     0.895   0.955   1.035
F(x)         (√n + 0.155 + 0.24/√n) · V                 1.620     1.747   1.862   2.001
EXP(λ)       (√n + 0.24 + 0.35/√n) · (V − 0.2/n)        1.527     1.655   1.774   1.910
N(µ, σ²)     (√n + 0.05 + 0.82/√n) · V                  1.386     1.489   1.585   1.693

▶ Calculate the test statistic and compare it to the critical value² at a given level of significance α.
▶ Reject the null hypothesis if the test statistic is larger than the critical value.
² Critical values from Anderson and Darling (1954) and Stephens (1974).

Comparison
1. Plot the e.c.d.f. and the hypothesized c.d.f. in one graph: is it a good fit?
2. Compare the characteristics of the distributions (mean, variance, skewness, kurtosis).
3. Draw a QQ-plot: is it a good fit?
1-3 Graphical methods give an idea of the fit (good/bad, in the tails/around the median, etc.); interpret them in your setting!
▶ Chi-squared test: easy to adjust for the number of estimated parameters, but only an approximate test.
▶ KS/Kuiper tests: good power, harder to work with; more sensitive near the centre of the distribution.
▶ AD/CM tests: good power, harder to work with; place higher weight on the tails.
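In practice these statistics are rarely coded by hand. A brief sketch using SciPy's built-in tests, assuming a recent SciPy version; here kstest and cramervonmises test against a fully specified c.d.f. (Case 0), while anderson handles the estimated-parameter case and reports its own critical values:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.exponential(scale=3.7, size=50)          # illustrative data, Exp(1/3.7)

    # KS and Cramér-von Mises against a fully specified Exp(1/3.7) c.d.f. (Case 0)
    print(stats.kstest(x, stats.expon(scale=3.7).cdf))
    print(stats.cramervonmises(x, stats.expon(scale=3.7).cdf))

    # Anderson-Darling with the rate estimated from the data (Case 1);
    # returns the statistic together with a table of critical values
    print(stats.anderson(x, dist='expon'))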

Example: Goodness-of-Fit Tests

[Figure (repeated from earlier): empirical c.d.f. of the two-observation sample, with jumps at x(1) = 0.4 and x(2) = 1.6, plotted against the Exp(1) c.d.f.; F(0.4) = 0.3297 and F(1.6) = 0.7981.]


Exercise: CM
Consider the two-observation sample from the figure above. Perform the following tests at a 5% level of significance.

Question: Perform the CM test if the parameter was not estimated.
Solution: Statistic CM = 1/24 + (0.3297 − 0.25)² + (0.7981 − 0.75)² = 0.050; critical value: 0.461. Thus do not reject H0.

Question: Perform the CM test if the parameter was estimated.
Solution: CM = 1/24 + (0.3297 − 0.25)² + (0.7981 − 0.75)² = 0.050; modified statistic (1 + 0.16/2) · 0.050 = 0.054; critical value: 0.224. Thus do not reject H0.
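A quick numerical check of this calculation, reproducing the arithmetic above:

    F1, F2 = 0.3297, 0.7981                           # Exp(1) c.d.f. at the two observations
    cm = 1/24 + (F1 - 0.25)**2 + (F2 - 0.75)**2       # 1/(12n) with n = 2
    print(round(cm, 3), round((1 + 0.16/2) * cm, 3))  # 0.05 0.054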

Exercise: AD
Consider the two-observation sample from the figure above. Perform the following tests at a 5% level of significance.

Question: Perform the AD test if the parameter was not estimated.
Solution: AD = −1/2 · (log(0.3297) + log(1 − 0.7981)) − 3/2 · (log(0.7981) + log(1 − 0.3297)) − 2 = 0.293; critical value: 2.492. Thus do not reject H0.

Question: Perform the AD test if the parameter was estimated.
Solution: AD = 0.293 as above; modified statistic (1 + 0.6/2) · 0.293 = 0.381; critical value: 1.341. Thus do not reject H0.
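A quick numerical check of this calculation, reproducing the arithmetic above:

    import math

    F1, F2 = 0.3297, 0.7981                           # Exp(1) c.d.f. at the two observations
    ad = (-1/2) * (math.log(F1) + math.log(1 - F2)) \
         + (-3/2) * (math.log(F2) + math.log(1 - F1)) - 2
    print(round(ad, 3), round((1 + 0.6/2) * ad, 3))   # 0.293 0.381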

Exercise: KS/Kuiper
Consider the two-observation sample from the figure above. Perform the KS and Kuiper tests at a 5% level of significance. We have:

i    F(x(i))   i/n    (i − 1)/n    i/n − F(x(i))    F(x(i)) − (i − 1)/n
1    0.3297    1/2    0/2          0.1703           0.3297
2    0.7981    2/2    1/2          0.2019           0.2981

so D⁺ = 0.2019 and D⁻ = 0.3297.

Question: Perform the KS test if the parameter was not estimated.
Solution: D = max(D⁺, D⁻) = 0.3297; statistic (√2 + 0.12 + 0.11/√2) · 0.3297 = 0.531; critical value: 1.358. Thus do not reject H0.
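A quick check of the KS calculation, together with the Kuiper part of the exercise; the slide shows only the KS solution, so the Kuiper figures below are my own completion using the formula and the 5% critical value (1.747) from the earlier table:

    import math

    F1, F2 = 0.3297, 0.7981                            # Exp(1) c.d.f. at the two observations
    n = 2
    d_plus  = max(1/2 - F1, 2/2 - F2)                  # 0.2019
    d_minus = max(F1 - 0/2, F2 - 1/2)                  # 0.3297
    D, V = max(d_plus, d_minus), d_plus + d_minus      # KS and Kuiper statistics
    ks  = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * D     # about 0.531 < 1.358
    kui = (math.sqrt(n) + 0.155 + 0.24 / math.sqrt(n)) * V    # about 0.924 < 1.747
    print(round(ks, 3), round(kui, 3))                 # neither test rejects H0 at the 5% level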
