
HW3_Reliability Analysis

R26104047, Institute of Statistics, second-year master's student, 張晏壬

HW 3-1

Use the ball-bearing life-test data in Table 1.1 and file LZBearing.csv to do the following:

Exercise 3.1(a) Compute a nonparametric estimate of the population fraction failing by 75 million cycles.

LZbearing <- read.csv("/Users/ianchang/Desktop/⶙ⶏ⛪➶㣨/Data/LZbearing.csv")

# Calculate the ECDF and estimate the population fraction failing by 75 million cycles
F_hat <- ecdf(LZbearing$Millions.of.Cycles)
F_hat(75)

## [1] 0.6521739
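
As a small illustration of what `ecdf()` returns, the sketch below uses a made-up vector (the values in `cycles` are hypothetical and are not the bearing data). It shows that the ECDF evaluated at a point is simply the fraction of observations at or below that point, which is why the estimate above equals 15/23 = 0.6521739 for the 23 bearings.

```r
# Hypothetical failure times, for illustration only (not the LZbearing data)
cycles <- c(12, 30, 41, 58, 75, 90, 120, 160)

F_toy <- ecdf(cycles)        # step function: fraction of observations <= t
F_toy(75)                    # ECDF evaluated at t = 75
mean(cycles <= 75)           # the same quantity computed directly
```

Both calls return the same value because five of the eight toy observations are at or below 75.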

Exercise 3.1(b) Use the conservative method in Section 3.4.1 to compute a conservative 90%
confidence interval for the population fraction failing by 75 million cycles.

Answer

We can now use the formula for the pointwise conservative confidence interval to compute the lower and upper
bounds of the interval. We will set 𝛼 = 0.1 for a 90% confidence interval.

alpha <- 0.1
n <- length(LZbearing$Millions.of.Cycles)
q1_conservative <- qbeta(alpha/2, n*F_hat(75), n-n*F_hat(75)+1) # The lower bound
q2_conservative <- qbeta(1-alpha/2, n*F_hat(75)+1, n-n*F_hat(75)) # The upper bound
cat("The conservative 90% confidence interval :","(",q1_conservative,",",q2_conservative,")")

## The conservative 90% confidence interval : ( 0.4595441 , 0.8136562 )
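
The conservative interval built from beta quantiles is the Clopper-Pearson interval, which base R also reports through `binom.test()`. The sketch below is a cross-check under the assumption that 15 of the 23 bearings failed by 75 million cycles (15/23 matches the estimate 0.6521739 above); the names `x_fail` and `n_units` are introduced here for illustration.

```r
# Assumed counts: 15 failures by 75 million cycles out of 23 bearings
x_fail  <- 15
n_units <- 23

# Clopper-Pearson (conservative) 90% interval from binom.test()
ci_exact <- binom.test(x_fail, n_units, conf.level = 0.90)$conf.int

# The same interval from the beta-quantile formula used above
ci_beta <- c(qbeta(0.05, x_fail, n_units - x_fail + 1),
             qbeta(0.95, x_fail + 1, n_units - x_fail))

round(as.numeric(ci_exact), 7)
round(ci_beta, 7)
```

The two results should agree with each other and with the interval printed above.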

Exercise 3.1(c) Repeat part (b) using the Jeffreys method in Section 3.4.2.

Answer

q1_Jeffreys <- qbeta(alpha/2, n*F_hat(75)+0.5, n-n*F_hat(75)+0.5)   # The lower bound
q2_Jeffreys <- qbeta(1-alpha/2, n*F_hat(75)+0.5, n-n*F_hat(75)+0.5) # The upper bound
cat("The pointwise Jeffreys 90% confidence interval :","(",q1_Jeffreys,",",q2_Jeffreys,")")

## The pointwise Jeffreys 90% confidence interval : ( 0.4813831 , 0.7961805 )

Exercise 3.1(d) Repeat part (b) using the Wald method in Section 3.4.3.

Answer

z_alpha <- qnorm(1-alpha/2)                   # standard normal quantile
se_F_hat <- sqrt(F_hat(75)*(1-F_hat(75))/n)   # binomial standard error
q1_Wald <- F_hat(75) - z_alpha*se_F_hat
q2_Wald <- F_hat(75) + z_alpha*se_F_hat
cat("The pointwise Wald 90% confidence interval:","(",q1_Wald,",",q2_Wald,")")

## The pointwise Wald 90% confidence interval: ( 0.4888213 , 0.8155265 )

Exercise 3.1(e) Comparing the intervals from parts (b), (c), and (d), what do you conclude
about the adequacy of the Wald method for these data?

Answer

# set up plot
plot(c(0.4,0.85), c(0, 4), type = "n",ylab = "", xlab = "", main = "Comparison Interval Plot")
segments(q1_conservative, 1, q2_conservative, 1, lwd = 2, col = "blue")
text(mean(c(q1_conservative, q2_conservative)), 1.2, "Conservative", col = "blue")
segments(q1_Jeffreys, 2, q2_Jeffreys, 2, lwd = 2, col = "red")
text(mean(c(q1_Jeffreys, q2_Jeffreys)), 2.2, "Jeffreys", col = "red")
segments(q1_Wald, 3, q2_Wald, 3, lwd = 2, col = "green")
text(mean(c(q1_Wald, q2_Wald)), 3.2, "Wald", col = "green")

[Figure: Comparison Interval Plot. Horizontal segments showing the Conservative, Jeffreys, and Wald intervals; horizontal axis from 0.4 to 0.8.]

Comparing the intervals from the conservative method, Jeffreys method, and Wald method, we can see that the
three intervals are broadly similar but do not agree exactly. The Wald interval (0.489, 0.816) is symmetric about
the estimate 0.652 by construction, so its lower bound is the highest of the three and its upper bound slightly
exceeds even the conservative upper bound (0.814). The Wald method relies on a normal approximation to
the sampling distribution of the sample proportion. This approximation is adequate when the sample size is large
and the population proportion is not too close to 0 or 1. In this case, the sample size is small (23 ball
bearings) and the estimated population proportion (0.65) is away from 0.5. Therefore, the normal approximation
(Wald method) may not be accurate for these data.

In contrast, the conservative method and the Jeffreys method do not rely on a normal approximation and are based
on different statistical principles. The conservative method produces intervals that have guaranteed coverage
probabilities, while the Jeffreys method produces intervals that are based on a non-informative prior distribution
and have desirable properties in terms of coverage probability. Therefore, the conservative method and the
Jeffreys method may be more appropriate for these data than the Wald method.
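
One way to make this comparison concrete is to compute the exact coverage probability of each method for n = 23. The sketch below is an illustration, not part of the original solution: it enumerates all 24 possible outcomes, builds the 90% interval each method would give, and sums the binomial probabilities of the outcomes whose interval contains a chosen true value `p_true` (set to 0.65 here as an assumption; any value in (0, 1) can be tried).

```r
n     <- 23
alpha <- 0.10
x     <- 0:n
p_hat <- x / n
z     <- qnorm(1 - alpha / 2)

# Conservative (Clopper-Pearson) bounds, with the boundary cases set explicitly
lo_cons <- qbeta(alpha / 2, x, n - x + 1);     lo_cons[x == 0] <- 0
hi_cons <- qbeta(1 - alpha / 2, x + 1, n - x); hi_cons[x == n] <- 1

bounds <- list(
  conservative = cbind(lo_cons, hi_cons),
  jeffreys     = cbind(qbeta(alpha / 2, x + 0.5, n - x + 0.5),
                       qbeta(1 - alpha / 2, x + 0.5, n - x + 0.5)),
  wald         = cbind(p_hat - z * sqrt(p_hat * (1 - p_hat) / n),
                       p_hat + z * sqrt(p_hat * (1 - p_hat) / n))
)

# Exact coverage: total probability of outcomes whose interval contains p
coverage <- function(p, b) sum(dbinom(x[b[, 1] <= p & p <= b[, 2]], n, p))

p_true   <- 0.65   # assumed true proportion, for illustration
cov_prob <- sapply(bounds, function(b) coverage(p_true, b))
cov_prob
```

The conservative method is guaranteed to have coverage of at least 0.90 at every `p_true`; the other two methods can fall above or below the nominal level depending on the value chosen.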

HW 3-2

Weis et al. (1986) report on the results of a life test on silicon photodiode detectors in which 28 detectors were
tested at 85 °C and 40 volts reverse bias. These conditions, which were more stressful than normal use conditions,
were used in order to get failures quickly. Specified electrical tests were made at 0, 10, 25, 75, 100, 500, 750,
1000, 1500, 2000, 2500, 3000, 3500, 3600, 3700, and 3800 hours to determine if the detectors were still performing
properly. Failures were found after the inspections at 2500 (1 failure), 3000 (1 failure), 3500 (2 failures), 3600 (1
failure), 3700 (1 failure), and 3800 (1 failure). The other 21 detectors had not failed after 3800 hours of operation.
Use these data to estimate the failure-time cdf of such photodiode detectors running at the test conditions.

Exercise 3.12(a) From the description given above, the data would be useful for making inferences
about what particular population or process? Explain your reasoning.

Answer

The data of Weis et al. (1986) are useful for making inferences about the reliability of silicon photodiode
detectors of the tested type operating under the test conditions (85 °C and 40 volts reverse bias). The study was
an accelerated life test: the detectors were run under conditions more stressful than normal use so that failures
would occur quickly, and they were inspected at fixed times to determine whether each was still performing
properly. The resulting data therefore describe the failure-time distribution at those stress conditions.
Inferences about detectors at normal use conditions would require additional assumptions relating life at the
test conditions to life at use conditions, so the data apply directly only to the population of such detectors
subject to similar stress conditions.

Exercise 3.12(b) Compute and plot a nonparametric estimate of the cdf for time to failure at
the test conditions.

Answer

PhotoDetector <- read.csv("/Users/ianchang/Desktop/⶙ⶏ⛪➶㣨/Data/PhotoDetector.csv")


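colnames(PhotoDetector) <- c("lower","upper","type","Count")
# Number at risk entering each interval and number failing in it;
# the last entry corresponds to the 21 detectors that had not failed
n_i <- c(28,27,26,24,23,22,21)
d_i <- c(1,1,2,1,1,1,0)
p_i_hat <- d_i/n_i                       # conditional failure probabilities
one_minus_p_i_hat <- 1-p_i_hat
s_ti_hat <- cumprod(one_minus_p_i_hat)   # survival function estimate
F_ti_hat <- 1 - s_ti_hat                 # cdf estimate
plot(PhotoDetector$upper,F_ti_hat,ylim=c(0,0.25),
     xlab = "Time (thousands of hours)", ylab = "CDF",
     main = "Nonparametric CDF estimate",type = 's')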

[Figure: Nonparametric CDF estimate. Step plot of CDF (0.00 to 0.20) against Time (thousands of hours), 2.6 to 3.8.]
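
Because no detector was censored before the last inspection, the product of conditional survival probabilities telescopes, and the estimate reduces to the cumulative number of failures divided by 28. The sketch below checks this with the same `n_i` and `d_i` values used above; `F_product` and `F_simple` are names introduced here for illustration.

```r
n_i <- c(28, 27, 26, 24, 23, 22, 21)   # number at risk
d_i <- c(1, 1, 2, 1, 1, 1, 0)          # number failing

# Product-limit form, as in the solution above
F_product <- 1 - cumprod(1 - d_i / n_i)

# Simple form: cumulative failures over the initial sample size
F_simple <- cumsum(d_i) / n_i[1]

all.equal(F_product, F_simple)
round(F_product, 4)
```

The two forms would no longer agree if some units had been removed from test between inspections, which is when the product-limit form is needed.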

Exercise 3.12(c) Compute standard errors for the nonparametric estimate in part (b).

Answer

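We use the following variance formula, which is known as Greenwood's formula.

$$
\widehat{\mathrm{Var}}\left[\hat{F}(t_i)\right] = \widehat{\mathrm{Var}}\left[\hat{S}(t_i)\right] = \hat{S}^2(t_i) \sum_{j=1}^{i} \frac{d_j}{n_j (n_j - d_j)}
$$

$$
\mathrm{SE}\left[\hat{S}(t_i)\right] = \sqrt{\widehat{\mathrm{Var}}\left[\hat{S}(t_i)\right]} = \hat{S}(t_i) \sqrt{\sum_{j=1}^{i} \frac{d_j}{n_j (n_j - d_j)}}
$$
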
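# Use the j-indexed names from the formula
d_j <- d_i
n_j <- n_i

# Greenwood standard errors (the same for S_hat and F_hat = 1 - S_hat)
se_F_ti_hat <- s_ti_hat*sqrt(cumsum(d_j/(n_j*(n_j-d_j))))
se_F_ti_hat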

## [1] 0.03507073 0.04867037 0.06613001 0.07237888 0.07754431 0.08183171 0.08183171
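
As a sanity check on these values: with no censoring before the last inspection, Greenwood's formula simplifies algebraically to the familiar binomial standard error sqrt(F(1 − F)/n) with n = 28. The sketch below compares the two; the variable names are introduced here for illustration.

```r
n_i <- c(28, 27, 26, 24, 23, 22, 21)
d_i <- c(1, 1, 2, 1, 1, 1, 0)

S_hat <- cumprod(1 - d_i / n_i)
F_hat_t <- 1 - S_hat

# Greenwood standard error
se_greenwood <- S_hat * sqrt(cumsum(d_i / (n_i * (n_i - d_i))))

# Binomial standard error with the initial sample size
se_binomial <- sqrt(F_hat_t * (1 - F_hat_t) / n_i[1])

all.equal(se_greenwood, se_binomial)
round(se_greenwood, 8)
```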

Exercise 3.12(d) Compute pointwise approximate 95% confidence intervals for 𝐹 (𝑡) and add
these to your plot.

Answer

$$
\widehat{\mathrm{Var}}\left[\hat{F}(t)\right] = \left[1 - \hat{F}(t)\right]^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}
$$

# Compute pointwise 95% confidence intervals (no transformation)
lower_ci <- F_ti_hat - 1.96*se_F_ti_hat
upper_ci <- F_ti_hat + 1.96*se_F_ti_hat

# Plot the estimate and add the untransformed intervals
plot(PhotoDetector$upper,F_ti_hat,
     ylim = c(0, 0.25),xlab = "Time (thousands of hours)", type = "s",
     ylab = "CDF",main = "Nonparametric CDF estimate with 95% confidence intervals")
lines(PhotoDetector$upper, lower_ci, lty = 2, col = "blue",type = 's')
lines(PhotoDetector$upper, upper_ci, lty = 2, col = "blue",type = 's')

# Compute pointwise approximate 95% C.I. based on the inverse logit transformation
logit_F_ti_hat <- log(F_ti_hat/(1-F_ti_hat))
se_logit_F_ti_hat <- se_F_ti_hat/(F_ti_hat*(1-F_ti_hat))
lower_ci_invlogit <- exp(logit_F_ti_hat - 1.96*se_logit_F_ti_hat)/
  (1+exp(logit_F_ti_hat - 1.96*se_logit_F_ti_hat))
upper_ci_invlogit <- exp(logit_F_ti_hat + 1.96*se_logit_F_ti_hat)/
  (1+exp(logit_F_ti_hat + 1.96*se_logit_F_ti_hat))

# Add the intervals based on the inverse logit transformation to the plot
lines(PhotoDetector$upper, lower_ci_invlogit, lty = 1, col = "red",type = 's')
lines(PhotoDetector$upper, upper_ci_invlogit, lty = 1, col = "red",type = 's')
library(latex2exp) # provides TeX() for the legend label
legend("topleft",lty = c(2, 1),cex = 0.7,col = c("blue", "red"),
       legend = c("95% CI", TeX("95% CI ( $\\logit^{-1}$ trans)")))

[Figure: Nonparametric CDF estimate with 95% confidence intervals. Step plot of CDF (0.00 to 0.25) against Time (thousands of hours), 2.6 to 3.8; legend: 95% CI, 95% CI (inverse logit transformation).]

cat("95% CI","\n","lower bound :",round(lower_ci,3),"\n",
    "upper bound :",round(upper_ci,3),"\n",
    paste("95% CI (inverse logit transformation)", "\n"),
    "lower bound :",round(lower_ci_invlogit,3),"\n",
    "upper bound :",round(upper_ci_invlogit,3))

## 95% CI
## lower bound : -0.033 -0.024 0.013 0.037 0.062 0.09 0.09
## upper bound : 0.104 0.167 0.272 0.32 0.366 0.41 0.41
## 95% CI (inverse logit transformation)
## lower bound : 0.005 0.018 0.055 0.076 0.1 0.124 0.124
## upper bound : 0.214 0.245 0.324 0.364 0.402 0.439 0.439
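
The untransformed interval has negative lower bounds at the first two time points, which is impossible for a probability, whereas the logit-based interval always stays inside (0, 1). The sketch below reworks the first time point using base R's `qlogis()` (logit) and `plogis()` (inverse logit) in place of the explicit `exp()` expressions; the estimate 1/28 and standard error 0.03507073 are taken from the output above, and the other names are introduced here for illustration.

```r
F1  <- 1 / 28        # cdf estimate at the first failure inspection
se1 <- 0.03507073    # its standard error, from the output above

# Untransformed interval: can fall outside [0, 1]
wald_ci <- F1 + c(-1, 1) * 1.96 * se1

# Logit-based interval: build on the logit scale, then transform back
se_logit <- se1 / (F1 * (1 - F1))                       # delta method
logit_ci <- plogis(qlogis(F1) + c(-1, 1) * 1.96 * se_logit)

round(wald_ci, 3)
round(logit_ci, 3)
```

The rounded results should match the first entries of the bounds printed above.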

Exercise 3.12(e) Compute nonparametric simultaneous approximate 95% confidence bands
for 𝐹 (𝑡) over the complete range of observation.

Answer

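The approximate 100(1 − 𝛼)% simultaneous confidence bands

$$
\left[\underset{\sim}{F}(t),\ \tilde{F}(t)\right] = \hat{F}(t) \mp e_{(a,b,1-\alpha/2)}\, \mathrm{se}_{\hat{F}}(t) \quad \text{for all } t \in [t_L(a), t_U(b)]
$$

are based on the approximate distribution of

$$
Z_{\max \hat{F}} = \max_{t \in [t_L(a),\, t_U(b)]} \left[ \frac{\hat{F}(t) - F(t)}{\mathrm{se}_{\hat{F}}(t)} \right]
$$

It is generally better to compute the simultaneous confidence bands based on the logit transformation of $\hat{F}$.
This gives

$$
\left[\underset{\sim}{F}(t),\ \tilde{F}(t)\right] = \left[ \frac{\hat{F}(t)}{\hat{F}(t) + [1 - \hat{F}(t)] \times w},\ \frac{\hat{F}(t)}{\hat{F}(t) + [1 - \hat{F}(t)] / w} \right]
$$

where

$$
w = \exp\left\{ e_{(a,b,1-\alpha/2)}\, \mathrm{se}_{\hat{F}} / [\hat{F}(1 - \hat{F})] \right\}
$$

These are based on the approximate distribution of

$$
Z_{\max \mathrm{logit}(\hat{F})} = \max_{t \in [t_L(a),\, t_U(b)]} \left[ \frac{\mathrm{logit}[\hat{F}(t)] - \mathrm{logit}[F(t)]}{\mathrm{se}_{\mathrm{logit}[\hat{F}(t)]}} \right]
$$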

Unfortunately, the formula for calculating 𝑒(𝑎,𝑏,1−𝛼/2) was not provided in the lecture notes, and it is unclear
what the values of 𝑎 and 𝑏 represent in the table. As a rough approximation, the factor used here is the
average of the tabulated 𝑒(𝑎,𝑏,1−𝛼/2) values for the 95% confidence level in the table on page 3-45.

ep_factor= mean(c(3.41,3.39,3.34,3.41,3.36,3.34,3.28,3.39,3.34,3.31,
3.25,3.34,3.28,3.25,3.16,3.31,3.25,3.21,3.11))
# Z_max
lower_ep_ci <- F_ti_hat-ep_factor*se_F_ti_hat
upper_ep_ci <- F_ti_hat+ep_factor*se_F_ti_hat

# Z_max_logit
w=exp(ep_factor*se_F_ti_hat/
(F_ti_hat*(1-F_ti_hat)))

lower_ep_logit_ci <- F_ti_hat/(F_ti_hat+(1-F_ti_hat)*w)


upper_ep_logit_ci <- F_ti_hat/(F_ti_hat+(1-F_ti_hat)/w)

cat("95% CI (simultaneous confidence bands for Z_max)","\n",
    "lower bound :",round(lower_ep_ci,3),"\n",
    "upper bound :",round(upper_ep_ci,3),"\n","\n",
    "95% CI(simultaneous confidence bands for Z_max_logit)","\n",
    "lower bound :",round(lower_ep_logit_ci,3),"\n",
    "upper bound :",round(upper_ep_logit_ci,3),"\n")

## 95% CI (simultaneous confidence bands for Z_max)
## lower bound : -0.08 -0.089 -0.075 -0.06 -0.042 -0.02 -0.02
## upper bound : 0.152 0.232 0.361 0.418 0.47 0.52 0.52
##
## 95% CI(simultaneous confidence bands for Z_max_logit)
## lower bound : 0.001 0.007 0.027 0.041 0.056 0.073 0.073
## upper bound : 0.517 0.464 0.498 0.526 0.555 0.585 0.585
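
The band formula written in terms of 𝑤 is the same interval one obtains by adding and subtracting the multiplier times the logit-scale standard error and transforming back. The sketch below checks that identity numerically. It assumes a multiplier of 3.3 (the rounded average of the tabulated values used above, so the results differ slightly from the printed output); `e_factor` and the other names are introduced here for illustration.

```r
n_i <- c(28, 27, 26, 24, 23, 22, 21)
d_i <- c(1, 1, 2, 1, 1, 1, 0)
F_t  <- 1 - cumprod(1 - d_i / n_i)
se_t <- (1 - F_t) * sqrt(cumsum(d_i / (n_i * (n_i - d_i))))

e_factor <- 3.3   # assumed multiplier (rounded average of the table values)

# Form 1: the w-based formula
w <- exp(e_factor * se_t / (F_t * (1 - F_t)))
lower_w <- F_t / (F_t + (1 - F_t) * w)
upper_w <- F_t / (F_t + (1 - F_t) / w)

# Form 2: interval on the logit scale, transformed back with plogis()
se_logit <- se_t / (F_t * (1 - F_t))
lower_logit <- plogis(qlogis(F_t) - e_factor * se_logit)
upper_logit <- plogis(qlogis(F_t) + e_factor * se_logit)

all.equal(lower_w, lower_logit)
all.equal(upper_w, upper_logit)
```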

plot(PhotoDetector$upper,F_ti_hat,
     ylim = c(-0.1, 0.7), xlab = "Time (thousands of hours)",
     ylab = "CDF",main = "Nonparametric CDF estimate with 95% confidence intervals",
     type = "s")
lines(PhotoDetector$upper, lower_ep_ci, lty = 2, col = "orange",type = 's')
lines(PhotoDetector$upper, upper_ep_ci, lty = 2, col = "orange",type = 's')
lines(PhotoDetector$upper, lower_ep_logit_ci, lty = 2, col = "forestgreen",type = 's')
lines(PhotoDetector$upper, upper_ep_logit_ci, lty = 2, col = "forestgreen",type = 's')
# Both bands are drawn dashed, so the legend uses lty = 2 for both entries
legend("topleft",lty = c(2, 2),cex = 0.7,col = c("orange", "forestgreen"),
       legend = c("95% CI (Z_max)", "95% CI (Z_max_logit)"))

[Figure: Nonparametric CDF estimate with 95% confidence intervals. Step plot of CDF (0.0 to 0.6) against Time (thousands of hours), 2.6 to 3.8; legend: 95% CI (Z_max), 95% CI (Z_max_logit).]

Exercise 3.12(f) Provide a careful explanation of the differences in interpretation and application
of the nonparametric pointwise confidence intervals and the nonparametric simultaneous
confidence bands.

Answer

Nonparametric pointwise confidence intervals and nonparametric simultaneous confidence bands both quantify
the uncertainty in the nonparametric estimate of 𝐹(𝑡), but they answer different questions. A pointwise 95%
confidence interval refers to a single, prespecified time 𝑡: the procedure covers the true 𝐹(𝑡) at that one time
with probability approximately 0.95. It is the appropriate tool for a statement about one point, such as the
fraction failing by 3000 hours. A simultaneous 95% confidence band refers to the whole curve: with probability
approximately 0.95, the true 𝐹(𝑡) lies inside the band for all 𝑡 in the range [𝑡𝐿(𝑎), 𝑡𝑈(𝑏)] at once. It is the
appropriate tool when several times are examined together or when the overall behavior of the function is of interest.

Simultaneous confidence bands are wider than pointwise confidence intervals because they must account for the
uncertainty at all points in the range simultaneously. In this exercise the pointwise intervals use the multiplier
1.96, whereas the simultaneous bands use a multiplier of about 3.3. The wider band is necessary so that the
overall probability that the true function lies within the band everywhere is approximately equal to the
specified level of confidence.
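
The difference can be seen in a small simulation. The sketch below is an illustration under assumed settings (uncensored exponential lifetimes, sample size 200, four evaluation times, 2000 replications; none of these come from the exercise). It records how often a pointwise 95% interval covers the true cdf at each time separately and how often the intervals at all four times cover at once.

```r
set.seed(1)
n_sim  <- 200                              # sample size per replication (assumed)
B      <- 2000                             # number of replications (assumed)
t_grid <- qexp(c(0.2, 0.4, 0.6, 0.8))      # evaluation times
F_true <- pexp(t_grid)                     # true cdf at those times

covered <- matrix(FALSE, nrow = B, ncol = length(t_grid))
for (b in seq_len(B)) {
  x      <- rexp(n_sim)
  F_est  <- ecdf(x)(t_grid)
  se_est <- sqrt(F_est * (1 - F_est) / n_sim)
  covered[b, ] <- abs(F_est - F_true) <= 1.96 * se_est
}

pointwise_cov    <- colMeans(covered)            # coverage at each time separately
simultaneous_cov <- mean(apply(covered, 1, all)) # all four times covered at once

round(pointwise_cov, 3)
round(simultaneous_cov, 3)
```

The simultaneous coverage can never exceed any of the pointwise coverages, because covering all times at once is a stricter event. This is why a larger multiplier is needed to restore the nominal level for the whole band.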

HW 3-3

Read the Example 1.4 in the Suess-Trumbo book and draw Figure 1.5. Show the code and
the figure. Give the probabilities that the confidence intervals from the first two students
contain the parameter (the probability that a die shows a six).

Answer

n <- 30 # number of rolls


m <- 20 # number of students

# observed number of 6s for each student


x <- c(3, 1, 3, 3, 5, 3, 4, 4, 6, 10, 3, 1, 8, 4, 5, 5, 4, 9, 4, 5)

# calculate sample proportion and confidence interval for each student


p_hat <- x/n
se <- sqrt(p_hat*(1-p_hat)/n)
z_star <- 1.96 # for 95% confidence interval
LCL <- p_hat - z_star*se
UCL <- p_hat + z_star*se
cover <- LCL < 1/6 & UCL > 1/6

# plot the confidence intervals and coverage
plot(c(0,m+1), c(1,0), col="white", ylab=TeX("\\pi"), xlab="Student",
     main="", xaxs="i",ylim=c(-0.05,0.52))
# pi=1/6
abline(h=1/6, col="forestgreen")
for(i in 1:m) {
  # solid line if the interval covers 1/6, dashed otherwise
  bar <- "solid"
  if (!cover[i]) {
    bar <- "dashed"
  }
  lines(c(i,i), c(UCL[i], LCL[i]), lty=bar, lwd=1)
  points(i, UCL[i], pch=19,cex=0.3)
  points(i, LCL[i], pch=19,cex=0.3)
}
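[Figure 1.5: 95% confidence intervals for 𝜋 (vertical axis, 0.0 to 0.5) for each of the 20 students (horizontal axis, Student), with a horizontal line at 𝜋 = 1/6.]
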
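Once the rolls have been observed, each student's interval is fixed, so it either contains 𝜋 = 1/6 or it does not;
the probability is therefore 1 or 0.

P(the confidence interval for student 1 contains 𝜋 = 1/6) = 1, since the interval (−0.007, 0.207) contains 1/6.

P(the confidence interval for student 2 contains 𝜋 = 1/6) = 0, since the interval (−0.031, 0.098) lies entirely below 1/6.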
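
By contrast, before any die is rolled the interval is random, and the probability that it will contain 𝜋 = 1/6 is the coverage probability of the procedure. The sketch below computes that probability exactly for 30 rolls by summing binomial probabilities over the outcomes whose interval covers 1/6; the names `n_rolls`, `x_all`, and `covers` are introduced here for illustration.

```r
n_rolls <- 30
pi_true <- 1 / 6
x_all   <- 0:n_rolls                       # every possible number of sixes

p_all  <- x_all / n_rolls
se_all <- sqrt(p_all * (1 - p_all) / n_rolls)
covers <- (p_all - 1.96 * se_all < pi_true) & (p_all + 1.96 * se_all > pi_true)

x_all[covers]                              # outcomes whose interval covers 1/6

# Pre-data probability that a student's interval will cover 1/6
coverage_prob <- sum(dbinom(x_all[covers], n_rolls, pi_true))
coverage_prob
```

Only counts from 3 to 10 sixes give an interval that covers 1/6, so the pre-data coverage probability falls short of the nominal 0.95.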

HW 3-4

Read the Example 1.5 in the Suess-Trumbo book and draw Figure 1.6. Show the code and
the figure.

Answer

The code first computes the lower and upper confidence limits for each possible value of 𝑋, and then creates
a sequence of values for the population proportion 𝜋. For each 𝜋 value, it computes the coverage probability
as the sum of the probabilities of the 𝑋 values whose confidence interval covers 𝜋. Finally, it plots the coverage
probabilities as a function of 𝜋, and adds a horizontal line at the nominal coverage probability of 0.95 (the green
line).

n = 30 # number of trials
x = 0:n; sp = x/n # n+1 possible outcomes
m.err = 1.96*sqrt(sp*(1-sp)/n) # n+1 Margins of error
lcl = sp - m.err # n+1 Lower conf. limits
ucl = sp + m.err # n+1 Upper conf. limits
pp = seq(0, 1, by = 0.001) # range of population proportions
cov.prob = sapply(pp, function(p) {
sum(dbinom(x[lcl<=p & p<=ucl], n, p))
}) # coverage probability for each pp value
plot(pp, cov.prob, type = "l", ylim = c(0.8, 1), xlim=c(0.04,0.96),
xlab = TeX("$\\pi$=P(Success)"), ylab = "Coverage Probability")
abline(h = 0.95, col = "forestgreen")
[Figure 1.6: Coverage Probability (vertical axis, 0.80 to 1.00) against 𝜋 = P(Success) (horizontal axis), with a horizontal line at 0.95.]
