R Codes of Q1
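#Q1 Poisson
library(fitdistrplus)
health <- read.csv("Health2022C.csv", stringsAsFactors = TRUE)
# Fit a Poisson distribution to the claim counts by maximum likelihood
fpois <- fitdist(health$claims, "pois")
# Chi-squared goodness-of-fit test with cells 0, 1, 2 and > 2
result1 <- gofstat(fpois, chisqbreaks = 0:2,
                   discrete = TRUE, fitnames = c("Poisson"))
summary(fpois)
result1
# P(X = 1) + P(X = 2) under the fitted Poisson model
ansd <- dpois(1, fpois$estimate) + dpois(2, fpois$estimate)
ansd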
R Output of Q1
> #Q1 Poisson
> library(fitdistrplus)
> health <- read.csv("Health2022C.csv", stringsAsFactors = TRUE)
> fpois <- fitdist(health$claims, "pois")
> result1 <- gofstat(fpois, chisqbreaks = (0:2),
+ discrete=TRUE, fitnames=c("Poisson"))
> summary(fpois)
Fitting of the distribution ' pois ' by maximum likelihood
Parameters :
       estimate  Std. Error
lambda   0.3158 0.007947247
Loglikelihood: -3560.259 AIC: 7122.517 BIC: 7129.034
> result1
Chi-squared statistic: 0.8184031
Degree of freedom of the Chi-squared distribution: 2
Chi-squared p-value: 0.6641803
Chi-squared table:
      obscounts theocounts
<= 0 3638.00000 3646.02638
<= 1 1169.00000 1151.41514
<= 2  172.00000  181.80845
> 2    21.00000   20.75004
Goodness-of-fit criteria
Poisson
Akaike's Information Criterion 7122.517
Bayesian Information Criterion 7129.034
> ansd <- dpois(1, fpois$estimate)+dpois(2, fpois$estimate)
> ansd
[1] 0.2666447
Term 2, 2021/2022
Answer of Q1
a. The p-value is 0.66418 > 0.05, so at the 5% significance level we cannot
reject the null hypothesis that the claim counts follow a Poisson distribution.
The data do not provide sufficient evidence against the Poisson model.
b. P(X=1)+P(X=2)=0.2666
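Both answers can be cross-checked from the numbers printed in the R output, without refitting the model. The sketch below is an illustration only: the variable names are hypothetical, and lambda is the rounded estimate shown by summary(fpois), so the last digits may differ slightly from ansd.

```r
# Values copied from the R output of Q1 (lambda is rounded to 4 decimals there)
lambda_hat <- 0.3158
chisq_stat <- 0.8184031
df <- 2  # 4 cells - 1 - 1 estimated parameter

# Upper-tail probability of the chi-squared statistic (part a)
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

# P(X = 1) + P(X = 2) under the fitted Poisson model (part b)
prob_1_or_2 <- sum(dpois(1:2, lambda_hat))

print(p_value)
print(prob_1_or_2)
```

The degrees of freedom follow the usual rule for a chi-squared test with estimated parameters: number of cells minus one, minus the number of fitted parameters.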
R Codes of Q2
#Q2 modeling continuous distribution
bulbs <- read.csv("Bulb2022C.csv", stringsAsFactors = TRUE)
dat <- bulbs$lifetime
# Fit three candidate distributions by maximum likelihood
fnorm <- fitdist(dat, "norm")
flnorm <- fitdist(dat, "lnorm")
fexp <- fitdist(dat, "exp")
# Compare log-likelihood, AIC and BIC across the three fits
summary(fnorm)
summary(flnorm)
summary(fexp)
# normal is the best model (smallest AIC and BIC)
plot(fnorm)
fnorm$estimate
R Output of Q2
> #Q2 modeling continuous distribution
> bulbs <- read.csv("Bulb2022C.csv", stringsAsFactors = TRUE)
> dat <- bulbs$lifetime
>
> fnorm <- fitdist(dat, "norm")
> flnorm <- fitdist(dat, "lnorm")
> fexp <- fitdist(dat, "exp")
> summary(fnorm)
Fitting of the distribution ' norm ' by maximum likelihood
Parameters :
      estimate Std. Error
mean 243.54611  1.3207689
sd    17.71997  0.9339246
Loglikelihood: -772.8536 AIC: 1549.707 BIC: 1556.093
Correlation matrix:
     mean sd
mean    1  0
sd      0  1
> summary(flnorm)
Fitting of the distribution ' lnorm ' by maximum likelihood
Parameters :
          estimate  Std. Error
meanlog 5.49264758 0.005444284
sdlog   0.07304274 0.003846444
Loglikelihood: -773.0776 AIC: 1550.155 BIC: 1556.541
Correlation matrix:
             meanlog        sdlog
meanlog 1.000000e+00 2.975914e-13
sdlog   2.975914e-13 1.000000e+00
> summary(fexp)
Fitting of the distribution ' exp ' by maximum likelihood
Parameters :
        estimate   Std. Error
rate 0.004105999 0.0002864443
Loglikelihood: -1169.155 AIC: 2340.31 BIC: 2343.503
> # normal is the best model
> plot(fnorm)
> fnorm$estimate
     mean        sd
243.54611  17.71997
Answer of Q2
(a) Normal (AIC=1549.707, BIC=1556.093), lognormal (AIC=1550.155,
BIC=1556.541), and exponential (AIC=2340.31, BIC=2343.503). Since the
normal model has the smallest AIC and the smallest BIC, it is the
best-fitting distribution of the three.
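The AIC comparison can be reproduced from the log-likelihoods in the R output using AIC = 2k - 2 logLik, where k is the number of fitted parameters. This is a minimal sketch with hypothetical variable names, using only the values printed above.

```r
# Log-likelihoods copied from the summary() output of Q2
loglik <- c(normal = -772.8536, lognormal = -773.0776, exponential = -1169.155)

# Number of fitted parameters: (mean, sd), (meanlog, sdlog), (rate)
n_par <- c(normal = 2, lognormal = 2, exponential = 1)

# AIC = 2k - 2 * log-likelihood; smaller is better
aic <- 2 * n_par - 2 * loglik

# Name of the model with the smallest AIC
best <- names(which.min(aic))

print(round(aic, 3))
print(best)
```

The normal and lognormal fits differ by less than half an AIC unit, so the ranking between those two is close, while the exponential fit is far worse.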
(b)