
CEE 3040 - UNCERTAINTY ANALYSIS IN ENGINEERING

Second Prelim in class on FRIDAY Nov. 15, 2013
open book, open notes.

Topics for Prelim exam:
Continuous random variables including LN/Gumbel/Weibull (§ 4.1-4.5),
Multivariate Random Variables (§ 5.1-5.2, § 5.3-5.5),
Estimators (Ch. 6: § 6.1-6.2),
Confidence intervals (§ 7.1-7.3 but not CI for proportions or σ in 7.4), and
Simple Hypothesis Testing (§ 8.1-8.2, 8.4-8.5 - only one-sample tests.)
___________________________________________________________________________
Homework #9 Due: Monday Nov. 4, 2013
Read: Devore 7.1-7.3, 8.1-8.2 (We neglect discussions of proportions.)
Goal: We have discussed various estimators. This assignment addresses the meaning
of confidence intervals and how they are computed. A vehicle for developing this
understanding is the Monte Carlo simulation capability of R and other statistical
packages. That capability allows you to experiment with your own random samples,
and to observe the sampling properties of the sample mean and sample variance. The
last two problems provide an introduction to hypothesis testing, the focus of this
week’s lectures. All of this material is on the second prelim.
Assignment
These assignments are written to use R, free software available on the Internet. Please see
the attached handout for more details and tutorial information.
Start early on the R problems; they should be fun.
Please write your answers for the computer assignments on a separate sheet of paper
(or cut and paste the results) so that we can grade your work easily. In order to
minimize the amount of paper generated by this assignment, the text indicates where
graphs need to be submitted with the homework.
Problems 1 – 5 use the random number generation capability of R to illustrate
empirically the properties of the sample mean and sample variance, and of confidence
intervals. Attached R instructions supply information you need.
1) Suppose that contaminant concentrations are normally distributed with
mean µ = 100 and standard deviation σ = 30. Consider the properties of
small samples of 25 independent observations drawn from this distribution.
Use the R command rnorm to obtain your own unique sets of random numbers. For
example, to get one sample of N(100, 30^2), you could type (note σ is passed to R):
> x <- rnorm (25, 100, 30)
To generate ten different sets of 25 independent normal random numbers using R,
and to store your samples for subsequent computations, enter the R commands:
> x <-rnorm(25,100,30)
> for(i in 2:10){x<-matrix(c(x,rnorm(25,100,30)),nrow=25,ncol=i)}
OR
> x <- replicate(10, rnorm(25,100,30))
Both commands store the random data sets in matrix x[1:25,1:10]
What do you get with the command: mean(x[1:25,2])?
Have R calculate the ten sample means and ten variances of the ten samples using the
R commands mean() and var(). Here is an R solution if done all at once:
> xbar<-vector(mode = "numeric", length = 10)
> s2<-vector(mode = "numeric", length = 10)
> for(i in 1:10) cat("mean =", xbar[i]<-mean(x[1:25,i]), "var =",
s2[i]<-var(x[1:25,i]),"\n")
where cat() concatenates its arguments and prints them.
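The same ten means and variances can also be computed column by column without a loop. This is a minimal sketch: the call to set.seed() and the seed value are assumptions added here only so that a run can be repeated, and your own numbers will differ.

```r
# Assumed seed, only for reproducibility; omit it to get your own unique samples
set.seed(3040)

# Ten samples of 25 observations from N(100, 30^2), one sample per column
x <- replicate(10, rnorm(25, 100, 30))

# Column-wise sample means and sample variances
xbar <- colMeans(x)
s2 <- apply(x, 2, var)

# One row per sample
print(data.frame(sample = 1:10, mean = xbar, var = s2))
```

Here xbar[2] equals mean(x[1:25,2]), so the vectors can be used in the later problems exactly as the loop versions are.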
Now generate a dotplot for each of the ten samples using dotPlot in the
supplemental R package BHH2. {OPTIONAL: You can also try hist and compare
with the dotPlot results.}
To make the package “BHH2” available in R:
(i) In the R environment, RGui, select the menu item
Packages → Install package(s) (Mac: Package Installer) → choose CRAN mirror
→ choose BHH2 from the package list
(ii) Then load the package (may need to repeat each time you run R): AGAIN select the menu
Packages → Load package (Mac: Package Manager) → choose BHH2 (Mac: check load)
The command to plot the second sample is (you can execute the command 10 times)
> dotPlot(x[1:25,2], xlim=c(0,200), xlab="random number")
where: cat() concatenates and prints; "\n" starts a new line; xlim sets the limits of the x-range.
For this problem, submit a summary of the ten averages and variances computed for
each sample, as well as the dotPlot for at least one sample. (Note: The command
window can be saved as a .txt file showing all commands used and the non-graphic
output.)
2) (a) Consider now all ten sample averages. If you haven’t already created an xbar
vector with all ten sample averages, do so as follows:
> xbar<-vector(length=10)
Put all of the sample averages in a single vector using
> for(i in 1:10){xbar[i]<-mean(x[1:25,i])}

Make a DOTPLOT of the ten calculated sample averages and turn it in with your
assignment.
> dotPlot(xbar, xlab="sample average")
(b) Suppose that, before going to the computer, you had imagined the average A
computed from a sample of 25 normal random variables as a new derived random
variable, where
A = X̄ = (1/25) [X_1 + X_2 + . . . + X_25]
What are the population mean of A, µ_A = E[A], and its variance σ_A^2 = Var[A]?
(c) You have 10 sample average values (realizations of A). Please compute the
sample average Ā and sample variance S_A^2 of those ten values of A. (In other
words, compute the average and variance of the sample averages computed in
Problem 1.) (Use R commands.)
> mean(xbar)
> var(xbar)
(d) Why is it that the expectation of the averages, µ_A = E[A], is different from the
sample average Ā of the ten samples that you generated?
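The difference between population quantities and their sample estimates can be explored by simulation. The following is a rough sketch, assuming an arbitrary seed and 10,000 replicates (both are choices made here, not part of the assignment). With many replicates the average and variance of the sample means settle near µ and σ^2/n, while a set of only ten sample means scatters around those values.

```r
set.seed(1)  # assumed seed, for reproducibility only
n <- 25; mu <- 100; sigma <- 30

# Many realizations of A, the mean of n independent normal observations
A_many <- replicate(10000, mean(rnorm(n, mu, sigma)))
cat("mean of A:", mean(A_many), " theory:", mu, "\n")
cat("var of A: ", var(A_many), " theory:", sigma^2 / n, "\n")

# Only ten realizations: their average differs from mu by chance
A_ten <- replicate(10, mean(rnorm(n, mu, sigma)))
cat("average of ten sample means:", mean(A_ten), "\n")
```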
3) (a) If before you went to the computer you imagined the sample variance S^2 as a
random variable, what are its population (theoretical) mean E[S^2] and variance
Var[S^2]? [Recall the formula Var[S^2] = 2σ^4/(n-1) for normal data.]
(b) Make a DOTPLOT or histogram of the ten sample variances S^2 and turn it in
with your assignment.
> dotPlot(s2, xlab="sample variance")
> hist(s2)
(c) What are the sample average and sample variance of the ten values of S^2 that
you generated?
> mean(s2)
> var(s2)
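The formula recalled in part (a) can be checked empirically. This is a sketch only, assuming an arbitrary seed and 10,000 replicates (neither is part of the assignment): it compares the simulated mean and variance of S^2 with σ^2 and 2σ^4/(n-1).

```r
set.seed(2)  # assumed seed, for reproducibility only
n <- 25; sigma <- 30

# Many realizations of the sample variance S^2 from samples of size n
S2_many <- replicate(10000, var(rnorm(n, 100, sigma)))

cat("mean of S^2:", mean(S2_many), " theory:", sigma^2, "\n")
cat("var of S^2: ", var(S2_many), " theory:", 2 * sigma^4 / (n - 1), "\n")
```

With only ten values of S^2, as in this problem, expect much larger departures from the theoretical values.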
4) (a) With each of your samples, construct an 80% confidence interval for the
true mean E[X] (which we happen to know is 100). R will do the work if you use
the command
> for(i in 1:10) cat(t.test(x[1:25,i], mu=100, conf.level=0.8)$conf.int,
"\n")
You can use R to calculate your intervals, but do at least one by hand in order to
show the needed calculations. (Please assume σ is unknown.)
(NOTE: If you do not specify a percentage then R generates a 95% CI.)
> t.test(x[1:25,3], mu=100, conf.level=0.8)
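The hand calculation can be mirrored in R to check your arithmetic. A minimal sketch, assuming σ is unknown so that the Student t interval x̄ ± t(0.10, n-1)·s/√n applies; the seed and the variable names x1, ci_hand and ci_r are illustrative choices, not part of the assignment.

```r
set.seed(3)  # assumed seed, for reproducibility only
x1 <- rnorm(25, 100, 30)   # one sample of n = 25

n <- length(x1)
xbar1 <- mean(x1)
s1 <- sd(x1)

# 80% two-sided interval: 10% in each tail, n - 1 degrees of freedom
tcrit <- qt(0.90, df = n - 1)
ci_hand <- c(xbar1 - tcrit * s1 / sqrt(n), xbar1 + tcrit * s1 / sqrt(n))

# The same interval from t.test
ci_r <- t.test(x1, mu = 100, conf.level = 0.8)$conf.int

print(ci_hand)
print(as.numeric(ci_r))
```

The two printed intervals should agree, which confirms that t.test uses the same formula you apply by hand.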
(b) How many of the 80% confidence intervals actually contain the true mean?
(c) If confidence intervals are generated randomly, as we have done:
What is the probability that an interval so generated will contain the true mean µ?
Of ten such intervals, how many on average will contain the true mean µ?
Of ten such intervals, what is the probability that exactly 8 will contain µ?
What is the variance of the number of intervals that will actually contain µ?
[Think Binomial distribution when answering the last three questions.
These are simple probability problems.]
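The binomial hint can be turned into a short calculation. The sketch below assumes the ten intervals are independent and that each covers µ with probability 0.80, so the number that cover µ is Binomial(10, 0.8); the variable names are illustrative.

```r
p <- 0.80   # assumed coverage probability of each 80% interval
m <- 10     # number of independent intervals

expected <- m * p               # mean number of intervals containing mu
variance <- m * p * (1 - p)     # variance of that number
p_exactly8 <- dbinom(8, size = m, prob = p)   # P(exactly 8 contain mu)

cat("expected:", expected, " variance:", variance,
    " P(exactly 8):", p_exactly8, "\n")
```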
5) The dotplots created with each of the ten samples with n = 25 may not look like a
normal density function. Use R to generate one sample of 100 independent normal
random variables with µ = 100 and σ = 30. Generate a Histogram or DOTPLOT and turn
it in with homework. Does the graph look better for n = 100 than for n = 25?
> x<-rnorm(100,100,30)
> dotPlot(x,xlab="random number")
> hist(x)
6) Section 7.2, D8 p. 284 [D7 p. 269], using the data in problem 18, construct a 90% CI for
the true mean strength of anchor bolts. (Large sample confidence interval; show
computation.)
7) Section 7.3, D8 p. 293 [D7 p. 277], # 37a
8) Section 8.1, D8 pp. 308-309 [D7 pp. 293-94], #3, 4, 10abcd (simple hypothesis tests)
9) Section 8.2, D8 pp. 322-23, # 34, 36 [D7 p. 306, #32, 34 ] (Student t and hypothesis tests)
(c) For #34 (#32 in D7) graph β, the Type II error, for n = 15 as a function of the
true mean concentration µ, over 90 pCi/L < µ < 110 pCi/L
assuming a standard deviation σ of 7.5 pCi/L.
(See Figure 8.5, D8 p. 319 [Fig. 8.4 D7, p. 303])
(d) For D8 #34 (D7 #32), did you use a one- or two-sided test? Justify your choice.
Learning objective: Students should know how (i) to select the hypotheses, (ii)
to decide what tests are appropriate for different situations (large/small sample;
one/two sided), (iii) to compute rejection regions for a test for a given type I error
α, (iv) to compute the type II error β, and (v) to determine the required sample
size n to achieve a specified α and β.
__________________________________________________________________________
ANSWERS not in book.
2) E[A] = µ; Var[A] = σ^2/n, where n = 25.
3) E(S^2) = σ^2 (unbiased);
for normal observations: Var(S^2) = 2σ^4/(n-1), where σ^2 is the variance of X_i.
5) It should look better!
6) CI = 4.01 to ???
8) #10a. H_0: µ = 1300 versus H_a: µ > 1300 [why?]
#10b. Sample average is unbiased with standard error of 13.4. Type I error is 1%.
#10c. β(1350) = Pr(Z < -1.40) = 8.08%.
#10d. Reject if X̄ ≥ 1322, so now β(1350) = 1.88%; α got larger but β became smaller.
<Not assigned> #10e. 1% corresponding to a critical z of 2.33
9) #34 (D7 #32). (a) t = -0.92, accept H_0. (b) n = 30
#36 (D7 #34). look at equation for $.