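\[ \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1), \qquad \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{n-1}, \qquad \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}, \qquad \frac{\hat P - p}{\sqrt{pq/n}} \overset{\text{approx}}{\sim} N(0,1) \]
(each result is only valid under certain requirements)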
Binomial distribution:
\[ X \sim \mathrm{BIN}(n, p): \quad f_X(x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \ldots, n \]
\[ E(X) = np, \qquad \mathrm{Var}(X) = np(1-p) \]
Multinomial distribution:
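\[ (X_1, \ldots, X_k) \sim \mathrm{MULT}(n, p_1, \ldots, p_k) \]
\[ f_X(x_1, \ldots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!\, x_{k+1}!}\; p_1^{x_1} p_2^{x_2} \cdots p_{k+1}^{x_{k+1}} \quad \text{for } x_i = 0, 1, \ldots, n \]
where \( p_{k+1} = 1 - \sum_{i=1}^{k} p_i \), \( p_i > 0 \) \((i = 1, \ldots, k+1)\) and \( x_{k+1} = n - \sum_{i=1}^{k} x_i \).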
A – questions 1 to 4
A1 [7] The independent discrete random variables X and Y have the following pdf’s:
x 0 1 2 y 0 1 2
Determine \( P(X = 1 \mid XY = 2) \).
A2 The continuous random variables X and Y have the following joint pdf:
\[ f_{X,Y}(x, y) = \begin{cases} x + \tfrac12 y - xy & \text{for } 0 < x < 1,\ 0 < y < 2 \\ 0 & \text{elsewhere} \end{cases} \]
A3a [4] Describe in words the model assumptions underlying the multinomial distribution. (Example of such a model
assumption for the Poisson process: the probability of one event in a very short (time) interval is proportional
to the length of the interval).
Assume for the rest of this question that (X,Y ) ~ MULT( n, p1, p2 ).
b [3] Argue why it follows that X ~ BIN(n, p1) and Y ~ BIN(n, p2).
(Here you are asked for a logical argument, not for a mathematical proof)
c [4] Prove that \( Y \mid X = x \sim \mathrm{BIN}\!\left(n - x, \frac{p_2}{1 - p_1}\right) \).
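The statements in parts b and c can be checked numerically for a single case before attempting the argument or the proof. The sketch below is only an illustration, not a proof: the values n = 6, p1 = 0.2, p2 = 0.5 and x = 2, as well as the helper names, are assumptions chosen for this example.

```python
from math import comb, factorial, isclose

def trinomial_pmf(x, y, n, p1, p2):
    """P(X = x, Y = y) for (X, Y) ~ MULT(n, p1, p2); the third cell gets the remainder."""
    z = n - x - y
    if z < 0:
        return 0.0
    coef = factorial(n) // (factorial(x) * factorial(y) * factorial(z))
    return coef * p1**x * p2**y * (1 - p1 - p2)**z

def binom_pmf(k, n, p):
    """P(K = k) for K ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative (assumed) parameter values.
n, p1, p2, x = 6, 0.2, 0.5, 2

# Marginal P(X = x), obtained by summing the joint pmf over y (part b).
p_x = sum(trinomial_pmf(x, y, n, p1, p2) for y in range(n - x + 1))
marginal_ok = isclose(p_x, binom_pmf(x, n, p1), rel_tol=1e-9)

# Conditional pmf of Y given X = x, compared with BIN(n - x, p2 / (1 - p1)) (part c).
conditional = [trinomial_pmf(x, y, n, p1, p2) / p_x for y in range(n - x + 1)]
expected = [binom_pmf(y, n - x, p2 / (1 - p1)) for y in range(n - x + 1)]
conditional_ok = all(isclose(a, b, rel_tol=1e-9) for a, b in zip(conditional, expected))

print(marginal_ok, conditional_ok)
```

Both flags agree with the claimed distributions for this choice of parameters; other values can be substituted to repeat the check.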
A4 [9] Let \( X_1, X_2, \ldots, X_n \) represent a random sample from a normal distribution with unknown mean \( \mu \) and a known
variance \( \sigma^2 = 12.2 \). How large should the sample size be, such that the margin of error associated with a
99% confidence interval for \( \mu \) will be at most 0.1?
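A minimal sketch of the sample-size calculation, assuming the margin of error is taken as the half-width \( z_{\alpha/2}\,\sigma/\sqrt{n} \) of the two-sided interval with known variance. The variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma = sqrt(12.2)     # known standard deviation, from sigma^2 = 12.2
max_margin = 0.1       # required upper bound on the margin of error
confidence = 0.99

# Two-sided critical value z_{alpha/2} of the standard normal distribution.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# z * sigma / sqrt(n) <= max_margin  is equivalent to  n >= (z * sigma / max_margin)^2.
n_required = ceil((z * sigma / max_margin) ** 2)
print(n_required)
```

Under these assumptions the bound evaluates to about 8094.6, so the smallest admissible sample size is 8095.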
B – questions 5 to 7
B5 [8] Explain as clearly as possible what is meant by the ‘power of a test’. Pay attention to the definition of this
concept, to the relation of the power with the sample size and the sample outcomes, and to the relations with
the ‘error of type I’ and ‘error of type II’.
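B6a [4] Show that the following equation always holds:
\[ (n-1)S^2 + n(\bar X - \mu)^2 = \sum_{i=1}^{n} (X_i - \mu)^2 \]
where \( S^2 \) denotes the sample variance.
1) \[ \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n} = \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \]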
(follows from part a)
2) \( \bar X \) (the sample mean) and \( S^2 \) are independent random variables when sampling is done from normal
distributions
3) the moment generating function of a chi-square distribution with n degrees of freedom is given by
\[ M(t) = (1 - 2t)^{-n/2} \]
4) the square of a standard normal random variable has a chi-square distribution with 1 degree of
freedom
Hint: use moment generating functions.
B7 An important indicator for the level of customer satisfaction at a certain fast-food restaurant is the so-called
service time. Both the variation and the mean of the service time contribute to the overall customer satisfaction.
The service times should not differ too much from customer to customer, and the service times should not
exceed the norm (240 sec.) too often. The table below shows some data which were gathered during two days,
by taking random samples of 41 and 61 customers respectively. For each customer, the service time is
recorded in seconds. The table also shows the proportion of customers whose service times exceeded the
norm.
Day     Statistic      Service time    Proportion exceeding norm
1       mean           168.196         0.121952
        sample size    41              41
        variance       4102.980        0.109756
2       mean           138.004         0.163934
        sample size    61              61
        variance       10669.770       0.139344
total   mean           150.140         0.147059
        sample size    102             102
        variance       8184.714        0.137254
a [8] Test with a level of significance of 5% (and using the complete sample of 102 customers) whether the mean
service time is less than 175 seconds.
b [8] Determine the p-value associated with a test which is aimed at establishing whether sufficient proof exists (at
α = 0.1) to state that the standard deviation of the service times (combined for both days) is less than 150
seconds.
c [7] Find the 95% confidence interval for the difference (between the two days) of the proportions of service times
which exceeded the norm. Then use this confidence interval to draw a conclusion about the question whether
these two proportions are different from each other.
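One possible way to set up the computation for part c is sketched below. It assumes the large-sample normal approximation for \( \hat P \) with an unpooled standard error; the variable names are illustrative and the sketch is not presented as the official solution.

```python
from math import sqrt
from statistics import NormalDist

# Sample proportions and sample sizes from the table (day 1 and day 2).
p1, n1 = 0.121952, 41
p2, n2 = 0.163934, 61

z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value

diff = p1 - p2
# Unpooled standard error of the difference of two independent sample proportions.
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

lower, upper = diff - z * se, diff + z * se
contains_zero = lower < 0 < upper
print(round(lower, 4), round(upper, 4), contains_zero)
```

With these inputs the interval is roughly (-0.179, 0.095). Because it contains 0, this approach gives no evidence at the 5% level that the two proportions differ.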
End
Solutions
1. \[ P(X = 1 \mid XY = 2) = \frac{P(X = 1 \cap XY = 2)}{P(XY = 2)} = \frac{P(X = 1 \cap Y = 2)}{P(X = 1 \cap Y = 2) + P(X = 2 \cap Y = 1)} = \frac{0.09}{0.09 + 0.12} = \frac{3}{7} \]
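2a. \[ f_Y(y) = \int_0^1 \left( x + \tfrac12 y - xy \right) dx = \left[ \tfrac12 x^2 + \tfrac12 xy - \tfrac12 x^2 y \right]_{x=0}^{x=1} = \tfrac12 \quad \text{for } 0 < y < 2 \]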
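2b. \[ F_{X,Y}(x, y) = P(0 \le X \le x \cap 0 \le Y \le y) =
\begin{cases}
0 & \text{for } x < 0 \text{ or } y < 0 \\
\tfrac12 xy \left( x + \tfrac12 y - \tfrac12 xy \right) & \text{for } 0 \le x < 1 \text{ and } 0 \le y < 2 \\
\tfrac12 y & \text{for } x \ge 1 \text{ and } 0 \le y < 2 \\
x & \text{for } 0 \le x < 1 \text{ and } y \ge 2 \\
1 & \text{for } x \ge 1 \text{ and } y \ge 2
\end{cases} \]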
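The closed-form expression for the joint CDF on the interior region, \( \tfrac12 xy(x + \tfrac12 y - \tfrac12 xy) \), can be compared with a direct numerical integration of the joint pdf \( x + \tfrac12 y - xy \). The sketch below uses a simple midpoint rule; the function names, the grid size and the test point (0.5, 1.0) are assumptions for illustration.

```python
def pdf(x, y):
    """Joint pdf from question A2; zero outside the support."""
    if 0 < x < 1 and 0 < y < 2:
        return x + 0.5 * y - x * y
    return 0.0

def cdf_closed_form(x, y):
    """Closed form of the joint CDF, valid for 0 <= x < 1 and 0 <= y < 2."""
    return 0.5 * x * y * (x + 0.5 * y - 0.5 * x * y)

def cdf_numeric(x, y, steps=400):
    """Midpoint-rule approximation of the integral of the pdf over [0, x] x [0, y]."""
    hx, hy = x / steps, y / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * hx
        for j in range(steps):
            yj = (j + 0.5) * hy
            total += pdf(xi, yj)
    return total * hx * hy

x0, y0 = 0.5, 1.0                 # arbitrary interior point
closed = cdf_closed_form(x0, y0)
numeric = cdf_numeric(x0, y0)
print(closed, numeric)
```

Both values agree up to rounding. The midpoint rule happens to be exact here because the pdf is linear in x for fixed y and linear in y for fixed x.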
2c. Draw a figure (see below). We have to integrate over the grey area. That area has to be split into two parts, for
example as shown below (light and dark grey):
\[ F_W(w) = P(XY \le w) = \int_0^{w/2} \int_0^{2} \left( x + \tfrac12 y - xy \right) dy\, dx + \int_{w/2}^{1} \int_0^{w/x} \left( x + \tfrac12 y - xy \right) dy\, dx \]
Or, alternatively with other integration order:
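\[ F_W(w) = P(XY \le w) = \int_0^{w} \int_0^{1} \left( x + \tfrac12 y - xy \right) dx\, dy + \int_{w}^{2} \int_0^{w/y} \left( x + \tfrac12 y - xy \right) dx\, dy \]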
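2d. To be able to apply the transformation method, we first need to define another random variable in such a way that
we obtain a one-to-one transformation. For example, choose \( V = X \), \( W = X \cdot Y \implies X = V \), \( Y = W/V \).
The Jacobian is:
\[ \begin{vmatrix} 1 & 0 \\ -\frac{w}{v^2} & \frac{1}{v} \end{vmatrix} = \frac{1}{v} \]
And thus
\[ f_{W,V}(w, v) = f_{X,Y}\!\left(v, \frac{w}{v}\right) \cdot \left| \begin{vmatrix} 1 & 0 \\ -\frac{w}{v^2} & \frac{1}{v} \end{vmatrix} \right| = \left( v + \frac{w}{2v} - w \right) \cdot \frac{1}{v} = 1 + \frac{w}{2v^2} - \frac{w}{v} \]
To find the support for (W, V ) we substitute x = v and y = w/v into the support for (X, Y ): 0 < v < 1 and
0 < w/v < 2. This can be rewritten as: 0 < v < 1 and 0 < w < 2v. (or 0 < w < 2v < 2, or 0 < w/2 < v < 1)
\[ f_W(w) = \int_{w/2}^{1} \left( 1 + \frac{w}{2v^2} - \frac{w}{v} \right) dv = \left. \left( v - \frac{w}{2v} - w \ln v \right) \right|_{v = w/2}^{v = 1} = 2 - w + w \ln\frac{w}{2} \quad \text{for } 0 < w < 2 \]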
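As a sanity check on the result of 2d, the density \( f_W(w) = 2 - w + w \ln(w/2) \) on 0 < w < 2 should be non-negative and integrate to 1. The sketch below verifies this numerically with a midpoint rule; the step count is an arbitrary choice.

```python
from math import log

def f_w(w):
    """Density of W = XY as derived in 2d, valid for 0 < w < 2."""
    return 2 - w + w * log(w / 2)

steps = 200_000
h = 2 / steps

# The midpoint rule never evaluates the logarithm at w = 0.
total = sum(f_w((k + 0.5) * h) for k in range(steps)) * h

# Spot-check non-negativity on a coarser grid of midpoints.
non_negative = all(f_w((k + 0.5) * h) >= 0 for k in range(0, steps, 1000))

print(total, non_negative)
```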
5 The power is the probability that the null hypothesis will be rejected when the alternative hypothesis is true.
It is always equal to 1 minus the probability of a type II error.
When the probability of a type I error increases (that is, the probability that the null hypothesis will be rejected
when it is true), the rejection region becomes larger, and so does the probability that the null
hypothesis will be rejected when the alternative hypothesis is true (i.e. the power).
The power of a test does not depend on the observed value of the test statistic in a specific
sample. It is a characteristic of the test, not of a sample.
When the sample size increases, the power becomes larger as well (when the level of significance remains
the same).
If the difference between the value of the unknown population parameter as stated in the null hypothesis and
its true value becomes larger, then the power will increase as well.
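These relations can be illustrated with a concrete test. The sketch below assumes a right-sided z-test of H0: μ = μ0 against H1: μ > μ0 with known σ, for which the power at a true mean μ1 is \( 1 - \Phi\!\left(z_{1-\alpha} - (\mu_1 - \mu_0)\sqrt{n}/\sigma\right) \). The function name and all numerical values are chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def power_right_sided_z(mu0, mu1, sigma, n, alpha):
    """Power of the z-test of H0: mu = mu0 against H1: mu > mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when Z >= z_crit
    shift = (mu1 - mu0) * sqrt(n) / sigma       # mean of Z under the true mean mu1
    return 1 - std_normal.cdf(z_crit - shift)

base = power_right_sided_z(mu0=0, mu1=0.5, sigma=2, n=25, alpha=0.05)
larger_alpha = power_right_sided_z(mu0=0, mu1=0.5, sigma=2, n=25, alpha=0.10)
larger_n = power_right_sided_z(mu0=0, mu1=0.5, sigma=2, n=100, alpha=0.05)
larger_effect = power_right_sided_z(mu0=0, mu1=1.0, sigma=2, n=25, alpha=0.05)
at_null = power_right_sided_z(mu0=0, mu1=0, sigma=2, n=25, alpha=0.05)

print(base, larger_alpha, larger_n, larger_effect, at_null)
```

Each of the three modified settings yields a higher power than the base case, and at μ1 = μ0 the rejection probability equals α, the probability of a type I error.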
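6a. \[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2 \implies (n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar X)^2 \]
\[ \sum_{i=1}^{n} X_i = n\bar X \quad \text{and} \quad \sum_{i=1}^{n} \bar X^2 = n\bar X^2 \]
\begin{align*}
(n-1)S^2 + n(\bar X - \mu)^2
&= \sum_{i=1}^{n} \left( X_i^2 - 2X_i\bar X + \bar X^2 \right) + n\left( \bar X^2 - 2\bar X\mu + \mu^2 \right) \\
&= \sum_{i=1}^{n} X_i^2 - 2\sum_{i=1}^{n} X_i\bar X + \sum_{i=1}^{n} \bar X^2 + n\bar X^2 - 2n\bar X\mu + n\mu^2 \\
&= \sum_{i=1}^{n} X_i^2 - 2n\bar X^2 + n\bar X^2 + n\bar X^2 - 2\mu\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} \mu^2 \\
&= \sum_{i=1}^{n} X_i^2 - 2\mu\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} \mu^2 \\
&= \sum_{i=1}^{n} \left( X_i^2 - 2\mu X_i + \mu^2 \right) \\
&= \sum_{i=1}^{n} (X_i - \mu)^2
\end{align*}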
(Or see the first part of the proof of Theorem 6.14 in the reader)
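Since the identity \( (n-1)S^2 + n(\bar X - \mu)^2 = \sum (X_i - \mu)^2 \) is purely algebraic, it holds for any numbers and any value of μ. A quick numerical check on simulated data, with illustrative values for n, μ and the data-generating distribution, could look like this:

```python
import random
from statistics import mean, variance

random.seed(0)

n, mu = 8, 3.0                                    # mu need not be the true mean of the data
xs = [random.gauss(5.0, 2.0) for _ in range(n)]   # arbitrary illustrative data

x_bar = mean(xs)
s2 = variance(xs)                                 # sample variance with divisor n - 1

lhs = (n - 1) * s2 + n * (x_bar - mu) ** 2
rhs = sum((x - mu) ** 2 for x in xs)

print(lhs, rhs)
```

The two sides coincide up to floating-point rounding.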
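6b. We focus first on the r.h.s. of the equation
\[ \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n} = \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} : \]
\[ X_i \sim N(\mu, \sigma^2) \implies \frac{X_i - \mu}{\sigma} \sim N(0, 1) \overset{4)}{\implies} \frac{(X_i - \mu)^2}{\sigma^2} \sim \chi^2(1) \overset{3)}{\implies} M\!\left( \frac{(X_i - \mu)^2}{\sigma^2}; t \right) = (1 - 2t)^{-1/2} \]
Because the \( X_i \) are independent:
\[ M\!\left( \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}; t \right) = \prod_{i=1}^{n} M\!\left( \frac{(X_i - \mu)^2}{\sigma^2}; t \right) = \prod_{i=1}^{n} (1 - 2t)^{-1/2} = (1 - 2t)^{-n/2} \]
Now we focus on the l.h.s. of the equation:
\[ \bar X \sim N\!\left( \mu, \frac{\sigma^2}{n} \right) \implies \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \overset{4)}{\implies} \frac{(\bar X - \mu)^2}{\sigma^2/n} \sim \chi^2(1) \overset{3)}{\implies} M\!\left( \frac{(\bar X - \mu)^2}{\sigma^2/n}; t \right) = (1 - 2t)^{-1/2} \]
Together:
\[ \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n} = \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2} \]
\[ \implies M\!\left( \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n}; t \right) = M\!\left( \sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}; t \right) \]
Using 2), the m.g.f. of the sum on the left factorizes:
\[ M\!\left( \frac{(n-1)S^2}{\sigma^2}; t \right) \cdot (1 - 2t)^{-1/2} = (1 - 2t)^{-n/2} \]
\[ \implies M\!\left( \frac{(n-1)S^2}{\sigma^2}; t \right) = (1 - 2t)^{-(n-1)/2} \]
\[ \implies \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1) \]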
(Or see the second part of the proof of Theorem 6.14 in the reader)
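The result can also be made plausible by simulation. A chi-square distribution with n - 1 degrees of freedom has mean n - 1 and variance 2(n - 1), so the simulated values of \( (n-1)S^2/\sigma^2 \) should show approximately these moments. The sketch below uses illustrative values n = 5, μ = 3, σ = 2 and a fixed seed; it is a plausibility check, not a proof.

```python
import random
from statistics import mean, variance

random.seed(1)

n, mu, sigma = 5, 3.0, 2.0     # illustrative (assumed) values
reps = 20_000                  # number of simulated samples

q = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    # (n - 1) S^2 / sigma^2 for this sample; variance() uses divisor n - 1.
    q.append((n - 1) * variance(sample) / sigma**2)

q_mean = mean(q)               # should be close to n - 1 = 4
q_var = variance(q)            # should be close to 2 (n - 1) = 8
print(q_mean, q_var)
```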
7a. Requirements for the use of the hypothesis test below: random sample, “service times” in the population are
normally distributed.
Test statistic: \[ T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t[\nu = 101] \]
Observed value for test statistic: \[ T = \frac{150.140 - 175}{\sqrt{8184.714}/\sqrt{102}} = -2.77523 \]
This value falls within the rejection region (\( T \le -t_{0.05;\,101} \approx -1.660 \)), so \( H_0 \) will be rejected. There is sufficient proof (at a level of
significance of 0.05) to state that the mean service time on the two days is below 175 seconds.
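The observed value of the test statistic can be reproduced from the table entries for the combined sample. In the sketch below the critical value -1.660 is an assumed table value for the left-sided t-test with 101 degrees of freedom at α = 0.05; the variable names are illustrative.

```python
from math import sqrt

# Combined sample ("total" rows of the table).
x_bar, s2, n = 150.140, 8184.714, 102
mu0 = 175                      # value of the mean under H0

# T = (x_bar - mu0) / (s / sqrt(n))
t_obs = (x_bar - mu0) / (sqrt(s2) / sqrt(n))

t_crit = -1.660                # assumed table value of -t_{0.05; 101}
reject_h0 = t_obs <= t_crit

print(round(t_obs, 5), reject_h0)
```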