HW 3-1
Use the ball-bearing life-test data in Table 1.1 and file LZBearing.csv to do the following:
# Calculate the ECDF and estimate the population fraction failing by 75 million cycles
F_hat <- ecdf(LZbearing$Millions.of.Cycles)
F_hat(75)
## [1] 0.6521739
Exercise 3.1(b) Use the conservative method in Section 3.4.1 to compute a conservative 90%
confidence interval for the population fraction failing by 75 million cycles.
Answer
We can now use the formula for the pointwise conservative confidence interval to compute the lower and upper
bounds of the interval. We will set 𝛼 = 0.1 for a 90% confidence interval.
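As a minimal sketch of this calculation, the interval can be obtained from beta quantiles, assuming that the conservative method of Section 3.4.1 is the exact binomial (Clopper-Pearson) interval. The counts n = 23 and x = 15 are read off from the ECDF value 15/23 ≈ 0.652 above. The names `n`, `x`, and `alpha` are illustrative, while `q1_conservative` and `q2_conservative` match the names used in the comparison plot.

```r
# Conservative (exact binomial) 90% interval for F(75); assumes the Clopper-Pearson form
n <- 23        # number of bearings tested
x <- 15        # failures by 75 million cycles (F_hat(75) = 15/23)
alpha <- 0.10  # 90% confidence

# Lower and upper limits from beta quantiles
q1_conservative <- qbeta(alpha / 2, x, n - x + 1)
q2_conservative <- qbeta(1 - alpha / 2, x + 1, n - x)

c(lower = q1_conservative, upper = q2_conservative)
```

The same limits should be returned by `binom.test(x, n, conf.level = 0.90)$conf.int`, which is a convenient cross-check.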
Exercise 3.1(c) Repeat part (b) using the Jeffreys method in Section 3.4.2.
Answer
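A minimal sketch of the Jeffreys interval, assuming it is taken as the equal-tailed quantiles of a Beta(x + 0.5, n − x + 0.5) distribution (the posterior under the Jeffreys prior). The counts n = 23 and x = 15 come from the ECDF value 15/23 above; `n`, `x`, and `alpha` are illustrative names, and `q1_Jeffreys` and `q2_Jeffreys` match the comparison plot.

```r
# Jeffreys 90% interval for F(75); assumes the Beta(x + 1/2, n - x + 1/2) form
n <- 23        # number of bearings tested
x <- 15        # failures by 75 million cycles
alpha <- 0.10  # 90% confidence

q1_Jeffreys <- qbeta(alpha / 2, x + 0.5, n - x + 0.5)
q2_Jeffreys <- qbeta(1 - alpha / 2, x + 0.5, n - x + 0.5)

c(lower = q1_Jeffreys, upper = q2_Jeffreys)
```

For any x strictly between 0 and n, this interval lies inside the conservative interval, which is why it is the shorter of the two.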
Exercise 3.1(d) Repeat part (b) using the Wald method in Section 3.4.3.
Answer
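A minimal sketch of the Wald interval, assuming the untransformed normal-approximation form, estimate ± z × standard error. The counts n = 23 and x = 15 come from the ECDF value 15/23 above; `p_hat`, `se_p_hat`, and `z` are illustrative names, and `q1_Wald` and `q2_Wald` match the comparison plot.

```r
# Wald 90% interval for F(75); assumes the untransformed normal approximation
n <- 23        # number of bearings tested
x <- 15        # failures by 75 million cycles
alpha <- 0.10  # 90% confidence

p_hat <- x / n                              # point estimate of F(75)
se_p_hat <- sqrt(p_hat * (1 - p_hat) / n)   # binomial standard error
z <- qnorm(1 - alpha / 2)                   # standard normal quantile

q1_Wald <- p_hat - z * se_p_hat
q2_Wald <- p_hat + z * se_p_hat

c(lower = q1_Wald, upper = q2_Wald)
```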
Exercise 3.1(e) Comparing the intervals from parts (b), (c), and (d), what do you conclude
about the adequacy of the Wald method for these data?
Answer
# set up plot
plot(c(0.4,0.85), c(0, 4), type = "n",ylab = "", xlab = "", main = "Comparison Interval Plot")
segments(q1_conservative, 1, q2_conservative, 1, lwd = 2, col = "blue")
text(mean(c(q1_conservative, q2_conservative)), 1.2, "Conservative", col = "blue")
segments(q1_Jeffreys, 2, q2_Jeffreys, 2, lwd = 2, col = "red")
text(mean(c(q1_Jeffreys, q2_Jeffreys)), 2.2, "Jeffreys", col = "red")
segments(q1_Wald, 3, q2_Wald, 3, lwd = 2, col = "green")
text(mean(c(q1_Wald, q2_Wald)), 3.2, "Wald", col = "green")
[Figure: Comparison Interval Plot. Horizontal segments show the Conservative (blue), Jeffreys (red), and Wald (green) intervals at heights 1, 2, and 3.]
Comparing the intervals from the conservative method, Jeffreys method, and Wald method, we can see that the
three intervals are very similar to each other. However, the Wald method relies on a normal approximation to
the sampling distribution of the sample proportion. This approximation is adequate when the sample size is large
and the population proportion is not too close to 0 or 1. In this case, the sample size is small (23 ball
bearings, of which only 8 survived past 75 million cycles) and the estimated proportion is 0.65. Therefore, the
normal approximation (Wald method) may not be accurate for these data.
In contrast, the conservative method and the Jeffreys method do not rely on a normal approximation. The
conservative method is based on the exact binomial distribution and produces intervals whose coverage
probability is guaranteed to be at least the nominal level, while the Jeffreys method is based on a
non-informative prior distribution and has coverage probability close to the nominal level. Therefore, the
conservative method and the Jeffreys method may be more appropriate for these data than the Wald method.
HW 3-2
Weis et al. (1986) report on the results of a life test on silicon photodiode detectors in which 28 detectors were
tested at 85 °C and 40 volts reverse bias. These conditions, which were more stressful than normal use
conditions, were used in order to get failures quickly. Specified electrical tests were made at 0, 10, 25, 75, 100,
500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 3600, 3700, and 3800 hours to determine if the detectors were still
performing properly. Failures were found after the inspections at 2500 (1 failure), 3000 (1 failure), 3500 (2
failures), 3600 (1 failure), 3700 (1 failure), and 3800 (1 failure). The other 21 detectors had not failed after
3800 hours of operation.
Use these data to estimate the failure-time cdf of such photodiode detectors running at the test conditions.
Exercise 3.12(a) From the description given above, the data would be useful for making
inferences about what particular population or process? Explain your reasoning.
Answer
The data from Weis et al. (1986) are useful for making inferences about the failure-time distribution of silicon
photodiode detectors of the type tested, operating at the test conditions of 85 °C and 40 volts reverse bias. The
test was run under accelerated stress in order to obtain failures quickly, and the detectors were inspected at a
fixed set of times to determine whether they had failed. Because all 28 units were tested at these conditions, the
data directly support conclusions about reliability at these stress conditions only. Inferences about detectors at
normal use conditions would require additional assumptions or information relating life at the test conditions to
life at use conditions.
Exercise 3.12(b) Compute and plot a nonparametric estimate of the cdf for time to failure at
the test conditions.
Answer
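A minimal sketch of the estimate, built directly from the counts in the problem statement. Because every detector follows the same inspection schedule and the only censoring is at the final inspection, the nonparametric estimate at each inspection time reduces to the cumulative fraction failed. The data-frame layout and the column names `upper` and `d_i` are assumptions made for illustration; only inspection times at which failures were found are listed, in thousands of hours.

```r
# Inspection times with failures (thousands of hours) and failure counts
PhotoDetector <- data.frame(
  upper = c(2.5, 3.0, 3.5, 3.6, 3.7, 3.8),  # upper endpoint of each failure interval
  d_i   = c(1, 1, 2, 1, 1, 1)               # failures found at that inspection
)
n <- 28  # detectors on test

# Number at risk entering each interval
n_i <- n - c(0, head(cumsum(PhotoDetector$d_i), -1))

# Product-limit estimate of S(t) and the corresponding cdf estimate
s_ti_hat <- cumprod(1 - PhotoDetector$d_i / n_i)
F_ti_hat <- 1 - s_ti_hat

round(F_ti_hat, 4)

# Step plot of the estimate at the inspection times
plot(PhotoDetector$upper, F_ti_hat, type = "s", ylim = c(0, 0.25),
     xlab = "Time (thousands of hours)", ylab = "CDF",
     main = "Nonparametric CDF estimate")
```

The estimate is zero at all inspections up to 2000 hours, and it is defined only at the inspection times; the step plot simply joins those values.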
Exercise 3.12(c) Compute standard errors for the nonparametric estimate in part (b).
Answer
\[
\widehat{\mathrm{SE}}\left[\hat{S}(t_i)\right]
= \sqrt{\widehat{\mathrm{Var}}\left[\hat{S}(t_i)\right]}
= \hat{S}(t_i)\,\sqrt{\sum_{j=1}^{i} \frac{d_j}{n_j\,(n_j - d_j)}}
\]
s_ti_hat=s_ti_hat
d_j=d_i
n_j=n_i
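A minimal sketch of this standard-error calculation for the photodiode data. The helper name `se_S` is hypothetical (the original function name is not shown); its argument names follow the fragment above. Since F(t) = 1 − S(t), the same standard error applies to the cdf estimate.

```r
# Failure counts at the inspections where failures were found
d_i <- c(1, 1, 2, 1, 1, 1)
n <- 28
n_i <- n - c(0, head(cumsum(d_i), -1))  # number at risk entering each interval
s_ti_hat <- cumprod(1 - d_i / n_i)      # estimate of S(t_i)

# Hypothetical helper implementing Greenwood's formula
se_S <- function(s_ti_hat, d_j, n_j) {
  s_ti_hat * sqrt(cumsum(d_j / (n_j * (n_j - d_j))))
}

se_F_ti_hat <- se_S(s_ti_hat = s_ti_hat, d_j = d_i, n_j = n_i)
round(se_F_ti_hat, 4)
```

With no censoring before the last inspection, Greenwood's formula reduces to the binomial standard error sqrt(F(1 − F)/n), which gives a quick check on the result.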
Exercise 3.12(d) Compute pointwise approximate 95% confidence intervals for 𝐹 (𝑡) and add
these to your plot.
Answer
\[
\widehat{\mathrm{Var}}\left[\hat{F}(t)\right]
= \left[1 - \hat{F}(t)\right]^2 \sum_{t_j \le t} \frac{d_j}{n_j\,(n_j - d_j)}
\]
# based on z_f_hat
plot(PhotoDetector$upper,F_ti_hat,
ylim = c(0, 0.25),xlab = "Time (thousands of hours)", type = "s",
ylab = "CDF",main = "Nonparametric CDF estimate with 95% confidence intervals")
logit_F_ti_hat <- log(F_ti_hat/(1-F_ti_hat))
se_logit_F_ti_hat <- se_F_ti_hat/
(F_ti_hat*(1-F_ti_hat))
lower_ci_invlogit <- exp(logit_F_ti_hat - 1.96*se_logit_F_ti_hat)/
(1+exp(logit_F_ti_hat - 1.96*se_logit_F_ti_hat))
upper_ci_invlogit <- exp(logit_F_ti_hat + 1.96*se_logit_F_ti_hat)/
(1+exp(logit_F_ti_hat + 1.96*se_logit_F_ti_hat))
[Figure: Nonparametric CDF estimate with 95% confidence intervals. y-axis: CDF (0.00 to 0.20); legend: 95% CI, 95% CI (inverse logit transformation).]
## 95% CI
## lower bound : -0.033 -0.024 0.013 0.037 0.062 0.09 0.09
## upper bound : 0.104 0.167 0.272 0.32 0.366 0.41 0.41
## 95% CI (inverse logit transformation)
## lower bound : 0.005 0.018 0.055 0.076 0.1 0.124 0.124
## upper bound : 0.214 0.245 0.324 0.364 0.402 0.439 0.439
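The printed bounds can be reproduced with a short self-contained sketch. It assumes the cdf estimates are the cumulative fractions failed at the six inspections with failures, and it uses the binomial standard error, which coincides with Greenwood's formula here because there is no censoring before the last inspection. `qlogis` and `plogis` are the base R logit and inverse logit functions.

```r
n <- 28
F_ti_hat <- c(1, 2, 4, 5, 6, 7) / n                  # cdf estimates at 2.5, 3.0, ..., 3.8
se_F_ti_hat <- sqrt(F_ti_hat * (1 - F_ti_hat) / n)   # standard errors
z <- 1.96                                            # normal quantile for 95%

# Untransformed (Wald) limits; these can fall below 0
lower_ci <- F_ti_hat - z * se_F_ti_hat
upper_ci <- F_ti_hat + z * se_F_ti_hat

# Limits computed on the logit scale and transformed back; always inside (0, 1)
logit_F_ti_hat <- qlogis(F_ti_hat)
se_logit_F_ti_hat <- se_F_ti_hat / (F_ti_hat * (1 - F_ti_hat))  # delta method
lower_ci_invlogit <- plogis(logit_F_ti_hat - z * se_logit_F_ti_hat)
upper_ci_invlogit <- plogis(logit_F_ti_hat + z * se_logit_F_ti_hat)

round(rbind(lower_ci, upper_ci, lower_ci_invlogit, upper_ci_invlogit), 3)
```

The negative lower limits from the untransformed interval are outside the parameter space, which is the practical argument for the logit-based version.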
Exercise 3.12(e) Compute nonparametric simultaneous approximate 95% confidence bands
for 𝐹 (𝑡) over the complete range of observation.
Answer
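\[
\left[\underset{\sim}{F}(t),\ \tilde{F}(t)\right]
= \hat{F}(t) \mp e_{(a,b,1-\alpha/2)}\,\mathrm{se}_{\hat{F}}(t)
\quad \text{for all } t \in [t_L(a),\, t_U(b)]
\]
\[
Z_{\max \hat{F}} = \max_{t \in [t_L(a),\, t_U(b)]}
\left[ \frac{\hat{F}(t) - F(t)}{\mathrm{se}_{\hat{F}}(t)} \right]
\]
It is generally better to compute the simultaneous confidence bands based on the logit transformation of $\hat{F}$.
This gives
\[
\left[\underset{\sim}{F}(t),\ \tilde{F}(t)\right]
= \left[ \frac{\hat{F}(t)}{\hat{F}(t) + [1 - \hat{F}(t)] \times w},\
\frac{\hat{F}(t)}{\hat{F}(t) + [1 - \hat{F}(t)] / w} \right]
\]
where
\[
w = \exp\left\{ \frac{e_{(a,b,1-\alpha/2)}\,\mathrm{se}_{\hat{F}}}{\hat{F}\,(1 - \hat{F})} \right\}
\]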
Unfortunately, the formula for calculating 𝑒(𝑎,𝑏,1−𝛼/2) was not provided in the lecture notes, and it is unclear
what the values of 𝑎 and 𝑏 represent in the table. As a rough approximation, the factor was taken to be the
average of the tabulated 𝑒(𝑎,𝑏,1−𝛼/2) values for the 95% confidence level in the table on page 3-45.
ep_factor= mean(c(3.41,3.39,3.34,3.41,3.36,3.34,3.28,3.39,3.34,3.31,
3.25,3.34,3.28,3.25,3.16,3.31,3.25,3.21,3.11))
# Z_max
lower_ep_ci <- F_ti_hat-ep_factor*se_F_ti_hat
upper_ep_ci <- F_ti_hat+ep_factor*se_F_ti_hat
# Z_max_logit
w=exp(ep_factor*se_F_ti_hat/
(F_ti_hat*(1-F_ti_hat)))
,"lower bound :",round(lower_ep_logit_ci,3),"\n",
"upper bound :",round(upper_ep_logit_ci,3),"\n")
plot(PhotoDetector$upper,F_ti_hat,
ylim = c(-0.1, 0.7), xlab = "Time (thousands of hours)",
ylab = "CDF",main = "Nonparametric CDF estimate with 95% confidence intervals",
type = "s")
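A minimal self-contained sketch of both simultaneous bands, following the two formulas in this answer. The factor 3.30 is an assumed stand-in, close to the averaged table value used above rather than the exact factor for a particular (𝑎, 𝑏). The names `lower_ep_logit_ci` and `upper_ep_logit_ci` match the printing fragment above, but their definitions here are a reconstruction, not the original code.

```r
n <- 28
F_ti_hat <- c(1, 2, 4, 5, 6, 7) / n                  # cdf estimates at inspections with failures
se_F_ti_hat <- sqrt(F_ti_hat * (1 - F_ti_hat) / n)   # standard errors
ep_factor <- 3.30                                    # assumed value of e_(a,b,1-alpha/2)

# Band on the untransformed scale (Z_max); limits may fall outside [0, 1]
lower_ep_ci <- F_ti_hat - ep_factor * se_F_ti_hat
upper_ep_ci <- F_ti_hat + ep_factor * se_F_ti_hat

# Band based on the logit transformation (Z_max_logit); limits stay inside (0, 1)
w <- exp(ep_factor * se_F_ti_hat / (F_ti_hat * (1 - F_ti_hat)))
lower_ep_logit_ci <- F_ti_hat / (F_ti_hat + (1 - F_ti_hat) * w)
upper_ep_logit_ci <- F_ti_hat / (F_ti_hat + (1 - F_ti_hat) / w)

round(rbind(lower_ep_ci, upper_ep_ci, lower_ep_logit_ci, upper_ep_logit_ci), 3)
```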
[Figure: Nonparametric CDF estimate with 95% confidence intervals. y-axis: CDF (0.0 to 0.6); legend: 95% CI (Z_max), 95% CI (Z_max_logit).]
Exercise 3.12(f) Provide a careful explanation of the differences in interpretation and
application of the nonparametric pointwise confidence intervals and the nonparametric
simultaneous confidence bands.
Answer
Nonparametric pointwise confidence intervals and nonparametric simultaneous confidence bands both quantify the
uncertainty in a nonparametric estimate of 𝐹(𝑡), but they answer different questions. A pointwise 95% interval is
a statement about 𝐹(𝑡) at one time 𝑡 chosen in advance: the procedure covers the true 𝐹(𝑡) at that single time
with probability of approximately 0.95. A simultaneous 95% band is a statement about the whole function: with
probability of approximately 0.95, the band contains 𝐹(𝑡) for every 𝑡 in the range [𝑡𝐿(𝑎), 𝑡𝑈(𝑏)] at once.
Pointwise intervals are therefore appropriate for inferences about specific points, while simultaneous bands are
appropriate for statements about the overall behavior of the function across the range of observation.
Because they must account for the uncertainty at all times in the range simultaneously, simultaneous confidence
bands are wider than pointwise intervals at the same confidence level. The extra width is what makes the overall
probability that the true function lies within the band equal, at least approximately, to the specified
confidence level.
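The distinction can be illustrated with a small simulation. This is a sketch on hypothetical data (exponential lifetimes with no censoring, three arbitrary evaluation times), not on the photodiode data: it applies pointwise Wald intervals at each time and compares the coverage at each single time with the probability that all three intervals cover at once.

```r
set.seed(1)
n <- 28                      # sample size, matching the number of detectors
n_sim <- 2000                # number of simulated data sets
t_grid <- c(0.5, 1.0, 1.5)   # hypothetical evaluation times
F_true <- pexp(t_grid)       # true cdf of the simulated (exponential) lifetimes

cover <- matrix(NA, nrow = n_sim, ncol = length(t_grid))
for (s in seq_len(n_sim)) {
  x <- rexp(n)
  F_hat <- sapply(t_grid, function(t) mean(x <= t))  # ECDF at each time
  se <- sqrt(F_hat * (1 - F_hat) / n)
  cover[s, ] <- abs(F_hat - F_true) <= 1.96 * se     # does each interval cover?
}

pointwise_cov <- colMeans(cover)               # coverage at each single time
simultaneous_cov <- mean(apply(cover, 1, all)) # all intervals cover together

pointwise_cov
simultaneous_cov
```

The joint coverage can never exceed the smallest pointwise coverage, so pointwise intervals used as a band fall short of the nominal level; a simultaneous band corrects this with a larger factor than 1.96.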
HW 3-3
Read the Example 1.4 in the Suess-Trumbo book and draw Figure 1.5. Show the code and
the figure. Give the probabilities that the confidence intervals from the first two students
contain the parameter (the probability that a die shows a six).
Answer
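The plotting fragment in this answer uses `m`, `LCL`, `UCL`, and `cover`, whose definitions did not survive. A minimal sketch of one way to set them up follows. It assumes 20 students who each roll a fair die n = 30 times and compute the usual 95% interval for the proportion of sixes; the value of n, the seed, and the plot limits are assumptions rather than values taken from the original.

```r
set.seed(1212)   # arbitrary seed, for reproducibility only
m <- 20          # number of students (the figure's x-axis runs from 0 to 20)
n <- 30          # rolls per student (assumed)

x <- rbinom(m, n, 1/6)                   # number of sixes seen by each student
sp <- x / n                              # sample proportions
m.err <- 1.96 * sqrt(sp * (1 - sp) / n)  # margins of error
LCL <- sp - m.err                        # lower confidence limits
UCL <- sp + m.err                        # upper confidence limits
cover <- (LCL <= 1/6) & (1/6 <= UCL)     # TRUE if the interval contains pi = 1/6

# Empty plotting region; the intervals are drawn on top of it
plot(c(0, m), c(0, 0.5), type = "n", xlab = "Student", ylab = "")
```

Once a student's interval has been computed, it either contains 𝜋 = 1/6 or it does not, so the probability asked for is 1 or 0 for each student, as indicated by `cover[1]` and `cover[2]`.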
# pi=1/6
abline(h=1/6, col="forestgreen")
for(i in 1:m) {
bar <- "solid"
if (!cover[i]) {
bar <- "dashed"
}
lines(c(i,i), c(UCL[i], LCL[i]), lty=bar, lwd=1)
points(i, UCL[i], pch=19,cex=0.3)
points(i, LCL[i], pch=19,cex=0.3)
}
[Figure 1.5: confidence intervals for the 20 students. x-axis: Student (0 to 20); y-axis: 0.0 to 0.5, with a horizontal line at π = 1/6.]
P(the confidence interval for student 1 contains 𝜋 = 1/6) = 1
HW 3-4
Read the Example 1.5 in the Suess-Trumbo book and draw Figure 1.6. Show the code and
the figure.
Answer
The code first computes the lower and upper confidence limits for each possible value of 𝑋, and then creates
a sequence of values for the population proportion 𝜋. For each 𝜋 value, it computes the coverage probability
as the sum of the binomial probabilities of the 𝑋 values whose confidence interval covers 𝜋. Finally, it plots
the coverage probabilities as a function of 𝜋 and adds a horizontal line at the nominal coverage probability of
0.95 (the green line).
library(latex2exp) # provides TeX() for the axis label
n = 30 # number of trials
x = 0:n; sp = x/n # n+1 possible outcomes
m.err = 1.96*sqrt(sp*(1-sp)/n) # n+1 Margins of error
lcl = sp - m.err # n+1 Lower conf. limits
ucl = sp + m.err # n+1 Upper conf. limits
pp = seq(0, 1, by = 0.001) # range of population proportions
cov.prob = sapply(pp, function(p) {
sum(dbinom(x[lcl<=p & p<=ucl], n, p))
}) # coverage probability for each pp value
plot(pp, cov.prob, type = "l", ylim = c(0.8, 1), xlim=c(0.04,0.96),
xlab = TeX("$\\pi$=P(Success)"), ylab = "Coverage Probability")
abline(h = 0.95, col = "forestgreen")
[Figure 1.6: coverage probability of the 95% confidence interval versus π = P(Success). y-axis: Coverage Probability (0.80 to 1.00).]