Goodness of Fit
When a theoretical distribution has been assumed (e.g.
based on the general shape of the histogram, or the physical
nature of the problem), the validity of the assumed distribution
may be verified or disproved statistically by goodness-of-fit tests:
- Chi-square method (we will cover this)
- Kolmogorov-Smirnov test
- Anderson-Darling test (not covered here)
Hypothesis testing
During 400 5-min intervals the air traffic control of an airport
received 0, 1, 2, …, or 13 radio messages with respective frequencies
of 3, 15, 47, 76, 68, 74, 46, 39, 15, 9, 5, 2, 0, and 1. We want to
check whether these data substantiate the claim that the number of
radio messages received during a 5-min interval may be regarded as a
random variable having a Poisson distribution with λ = 4.6.
General rule of thumb: the expected frequencies should be > 5.
We can achieve this by combining some of the data.
(But in the exam, don't worry about this; we will tell you how to
combine if needed.)
Note: observed frequencies can be < 5.
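As a sketch of the combining step (a hypothetical helper, not from the slides): pool bins at the low and high ends of the range until every expected frequency reaches the threshold, which is exactly what happens below when x = 0, 1 and x ≥ 10 are merged.

```python
# Hypothetical helper (not from the slides): pool bins at the low and high
# ends of the range until every expected frequency is at least `min_expected`.
def combine_bins(observed, expected, min_expected=5.0):
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[0] < min_expected:   # pool leading bins
        obs[1] += obs[0]
        exp[1] += exp[0]
        del obs[0], exp[0]
    while len(exp) > 1 and exp[-1] < min_expected:  # pool trailing bins
        obs[-2] += obs[-1]
        exp[-2] += exp[-1]
        del obs[-1], exp[-1]
    return obs, exp

# Toy check: the first expected frequency (4.0) is below 5, so bin 0
# is merged into bin 1.
obs, exp = combine_bins([3, 15, 47], [4.0, 18.5, 42.5])
print(obs, exp)  # -> [18, 47] [22.5, 42.5]
```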
Poisson Distribution
P(X = x in t) = e^(−λt) (λt)^x / x!,   x = 0, 1, 2, …
Here λt = 4.6.
Poisson Probability
For x = 0, P = 0.010
For x = 1, P = 0.046
For x = 2, P = 0.107
…
For x = 13, P should be calculated as the tail probability:
P(X ≥ 13) = 1 − P(X ≤ 12) = 1 − P(X = 0) − P(X = 1) − … = 0.001
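These probabilities can be reproduced directly from the formula (a minimal sketch; the function name is ours):

```python
import math

lam = 4.6  # lambda * t from the slide

def poisson_pmf(x, lam):
    # P(X = x) = e^(-lam) * lam^x / x!
    return math.exp(-lam) * lam ** x / math.factorial(x)

print(round(poisson_pmf(0, lam), 3))  # ~0.010
print(round(poisson_pmf(1, lam), 3))  # ~0.046
print(round(poisson_pmf(2, lam), 3))  # ~0.106 (the slide's table shows 0.107)
# Tail bin: P(X >= 13) = 1 - P(X <= 12)
tail = 1.0 - sum(poisson_pmf(x, lam) for x in range(13))
print(round(tail, 3))                 # ~0.001
```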
Statistic for Testing Goodness of Fit
χ² = Σ_{i=1}^{k} (o_i − e_i)² / e_i
where o_i is the observed frequency and e_i the expected (theoretical) frequency in class i.
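The statistic translates into a one-line function (a minimal sketch; names and the toy counts are ours, not the slide's data):

```python
# chi2 = sum over bins of (o_i - e_i)^2 / e_i
def chi_square_stat(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Toy check with made-up counts:
print(round(chi_square_stat([18, 20], [20.0, 18.0]), 3))  # -> 0.422
```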
Example (cont'd)
Test at the 0.01 level of significance whether the data can be viewed
as values of a random variable having the Poisson distribution
with λ = 4.6.
1. H0: the data come from a Poisson distribution with λ = 4.6
   H1: the data do not come from a Poisson distribution with λ = 4.6
2. α = 0.01
3. Criterion: Reject H0 if χ² ≥ χ²_0.01 = 21.666,
   where χ² = Σ_{i=1}^{k} (o_i − e_i)² / e_i.
   Here the parameter is given (λ = 4.6), so the number of parameters
   estimated from the data is m = 0, and (with k = 10 classes after
   combining)
   d.o.f. = k − 1 − m = 10 − 1 − 0 = 9
4. Calculation:
   χ² = (18 − 22.4)²/22.4 + … + (8 − 8)²/8 = 6.749
5. Decision: χ² = 6.749 < χ²_0.01 = 21.666, so H0 cannot be rejected.
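The whole test can be reproduced as a sketch (variable and function names are ours). With exact Poisson probabilities the statistic comes out near 6.98 rather than the slide's 6.749, because the slide works from probabilities rounded to three decimals; the conclusion is unchanged either way.

```python
import math

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

freqs = [3, 15, 47, 76, 68, 74, 46, 39, 15, 9, 5, 2, 0, 1]  # x = 0..13
n, lam = sum(freqs), 4.6

# Combine x = 0,1 into one bin and x >= 10 into another (as on the slide),
# leaving k = 10 classes.
observed = [freqs[0] + freqs[1]] + freqs[2:10] + [sum(freqs[10:])]
probs = [poisson_pmf(x, lam) for x in range(10)]
expected = [n * (probs[0] + probs[1])] + [n * p for p in probs[2:10]]
expected.append(n * (1.0 - sum(probs)))  # tail bin P(X >= 10)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # ~6.98 with exact probabilities; well below 21.666
```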
Example (cont’d)
However, if we wanted to test whether the data could have
arisen from a Poisson distribution, without specifying λ,
and we estimate λ from the data, then
m = 1 and d.o.f. = k − 1 − m = 10 − 1 − 1 = 8.
λ̂ = (0×3 + 1×15 + 2×47 + … + 13×1) / 400 = 4.535,
which is close, but not exactly equal, to 4.6.
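The estimate is just the sample mean of the grouped data, as a quick sketch confirms:

```python
# Sample mean of the grouped data as the estimate of lambda.
freqs = [3, 15, 47, 76, 68, 74, 46, 39, 15, 9, 5, 2, 0, 1]  # counts for x = 0..13
lam_hat = sum(x * f for x, f in enumerate(freqs)) / sum(freqs)
print(lam_hat)  # -> 4.535
```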
Example
Normal distribution
χ² = Σ_{i=1}^{k} (o_i − e_i)² / e_i = 10.73 < 11.07 → OK, but marginal
Lognormal distribution
χ² = Σ_{i=1}^{k} (o_i − e_i)² / e_i = 7.97 < 11.07 → clearly OK
Moreover, the comparison shows that the lognormal fits the data better
than the normal distribution.
Appendix
Normal distribution
Parameters: μ = 7.5, σ = 0.53 (estimated from raw data, which
is not shown in this example)
Bin 1: P(X ≤ 6.75) = Φ((6.75 − 7.5)/0.53) = Φ(−1.42)
       = 1 − 0.9222 = 0.0778
Theoretical frequency, e1 = 0.0778 × 143 = 11.1
Bin 2: P(6.75 < X ≤ 7.00) = Φ((7.00 − 7.5)/0.53) − Φ((6.75 − 7.5)/0.53)
       = 0.1736 − 0.0778 = 0.0958
Theoretical frequency, e2 = 0.0958 × 143 = 13.7
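These table lookups can be reproduced with the error function (a sketch; the small differences from the slide's numbers come from the slide rounding z to two decimals before using the normal table):

```python
import math

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 7.5, 0.53, 143
p1 = phi((6.75 - mu) / sigma)           # P(X <= 6.75)
p2 = phi((7.00 - mu) / sigma) - p1      # P(6.75 < X <= 7.00)
print(round(n * p1, 1))                 # e1, close to the slide's 11.1
print(round(n * p2, 1))                 # e2, close to the slide's 13.7
```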
Lognormal distribution
Bin 1: P(X ≤ 6.75) = Φ((ln 6.75 − 2.012)/0.0706) = Φ(−1.45)
       = 1 − 0.9265 = 0.0735
Bin 2: P(6.75 < X ≤ 7.00) = Φ((ln 7.00 − 2.012)/0.0706) − Φ((ln 6.75 − 2.012)/0.0706)
       = 0.1736 − 0.0735 = 0.100
Theoretical frequency, e2 = 0.100 × 143 = 14.3
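The lognormal bins work the same way, only on the log scale (a sketch; again, tiny differences from the slide reflect two-decimal table rounding):

```python
import math

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

lam_ln, zeta, n = 2.012, 0.0706, 143   # slide's parameters of ln(X)
p1 = phi((math.log(6.75) - lam_ln) / zeta)        # P(X <= 6.75)
p2 = phi((math.log(7.00) - lam_ln) / zeta) - p1   # P(6.75 < X <= 7.00)
print(round(p1, 4))                    # close to the slide's 0.0735
print(round(n * p2, 1))                # e2, close to the slide's 14.3
```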