Ex. 1 Suppose that we visit a timber merchant. Last year the average length of
his boards was 3000 mm, with a standard deviation of 400 mm. He now has a new
supplier and wonders whether the average length has changed. To examine this he
draws a random sample of 16 boards and measures them carefully. The average
length of these 16 boards is 3140 mm. Suppose that the length of a board is
normally distributed. Can we conclude from this sample that the average length
of the boards has changed?
Step 3: The test statistic is $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Step 4: From the random sample we calculated $\bar{x} = 3140$, so
$z = \dfrac{3140 - 3000}{400 / \sqrt{16}} = 1.40$
Step 5: H0 cannot be rejected. This examination does not support the statement
that there is a change of the average length.
The p-value is the smallest significance level at which we would draw the
conclusion that there has been a change.
A 95 % confidence interval for µ is
$\bar{x} \pm 1.96 \cdot \dfrac{\sigma}{\sqrt{n}} \Rightarrow 3140 \pm 1.96 \cdot \dfrac{400}{\sqrt{16}}$, i.e. $3140 \pm 196$
The end points of the interval are 2944 and 3336. The value 3000 is covered by
the interval, so we cannot exclude that the true average is still 3000.
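The two-sided test and the interval in Ex. 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_sided_z_test` is our own choice for illustration, and the 5 % level is assumed because the interval above uses 1.96.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 when sigma is known."""
    std_error = sigma / sqrt(n)
    z = (x_bar - mu0) / std_error
    # Critical value for a two-sided test: alpha/2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided p-value: probability of a value at least as extreme as |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    half_width = z_crit * std_error
    interval = (x_bar - half_width, x_bar + half_width)
    return z, z_crit, p_value, interval


# Data from Ex. 1: 16 boards, sample mean 3140 mm, sigma = 400 mm.
z, z_crit, p_value, interval = two_sided_z_test(3140, 3000, 400, 16)
reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print(f"interval = ({interval[0]:.0f}, {interval[1]:.0f}), reject H0: {reject_h0}")
```

Under these assumptions the p-value comes out at roughly 0.16, well above 0.05, which agrees with the conclusion that H0 cannot be rejected.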
Upper tail test of the expected value, µ
Ex. 2 A coffee bar sells on average 320 cups of coffee per day, with a known
standard deviation of 40 cups. After an advertising campaign the cashier finds
that during the week (7 days) after the campaign she has sold 2450 cups of
coffee. The owner of the coffee bar wants to assess whether the campaign has
had any effect on the number of cups sold. He therefore compares the number of
cups sold before the campaign with the number sold after it. Suppose that the
number of cups of coffee sold per day is normally distributed.
Step 2: The significance level is α = 0.05, which gives the critical value 1.645.
Step 3: Choose the test statistic $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Step 4: From the sample we obtain $\bar{x} = \dfrac{2450}{7} = 350$, so
$z = \dfrac{350 - 320}{40 / \sqrt{7}} = 1.98$
Step 5: H0 can be rejected, i.e. the examination indicates that there has been an
increase.
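The upper tail test in Ex. 2 can be sketched in the same way. This is an illustrative sketch only; the function name `upper_tail_z_test` is an assumption of ours, and the one-sided p-value is an addition that the example itself does not report.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Upper tail z-test of H0: mu <= mu0 against H1: mu > mu0, sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # All of alpha sits in the upper tail.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    p_value = 1 - NormalDist().cdf(z)
    return z, z_crit, p_value


# Data from Ex. 2: 2450 cups sold in 7 days, mu0 = 320, sigma = 40.
x_bar = 2450 / 7
z, z_crit, p_value = upper_tail_z_test(x_bar, 320, 40, 7)
reject_h0 = z > z_crit
print(f"x_bar = {x_bar:.0f}, z = {z:.2f}, critical value = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Because the whole significance level is placed in one tail, the critical value is 1.645 rather than 1.96, which is why z = 1.98 leads to rejection here.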
Lower tail test of the expected value, µ
Step 1: H0: µ ≥ 40
H1: µ < 40
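Step 2: α = 0.001. H0 is rejected if z falls below the critical value $z_C$ in the lower tail.
Step 3: The test statistic is $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Step 4: $z = \dfrac{39.8 - 40}{2.5 / \sqrt{16}} = -0.32$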
Step 5: H0 cannot be rejected, i.e. this examination does not contradict the
statement that the material can be used.
The Power of the Test
- A type II error means that we fail to detect a change even though it has
  occurred. The probability of a type II error is denoted β.
- If the significance level is large, H0 will often be rejected even though it
  is true.
- If the significance level is small, it can be difficult to reject H0 even if
  there has been a change and H1 is true.
             H0 is correct    H1 is correct
Accept H0    OK               Type II error
Reject H0    Type I error     OK
P(Accept H0 | H0 is correct) = 1 – α
[Figure: sampling distributions centred at µ0 and µ1, with the areas β and 1 – β marked.]
Ex. 4 Suppose that ξ is N(µ, 20). Take a random sample of n = 100 and examine
whether µ > 60. Suppose that the sample average is $\bar{x}$ = 63.40. Carry out the
hypothesis test and then calculate the power of the test for different values of
µ in the alternative hypothesis.
Step 1: H0: µ ≤ 60
H1: µ > 60
Step 2: α = 0.05
Step 3: The test statistic is $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
Step 4: $z = \dfrac{63.40 - 60}{20 / \sqrt{100}} = 1.7$
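The critical value $\bar{x}_C$ of the sample average follows from
$z = 1.645 \Rightarrow \dfrac{\bar{x}_C - 60}{20 / \sqrt{100}} = 1.645 \Rightarrow \bar{x}_C = 1.645 \cdot \dfrac{20}{\sqrt{100}} + 60 = 63.29$
$P(\bar{\xi} > 63.29 \mid \mu = \mu_1) = 1 - \beta$
1) $\mu_1 = 61 \Rightarrow P\left(Z \le \dfrac{63.29 - 61}{20 / \sqrt{100}}\right) = P(Z \le 1.145) = 0.8739$, i.e. β = 0.8739 ⇒
   The power = 1 – β = 0.1261.
2) $\mu_1 = 65 \Rightarrow P\left(Z \le \dfrac{63.29 - 65}{20 / \sqrt{100}}\right) = P(Z \le -0.855) = 0.1963$, i.e. β = 0.1963 ⇒
   The power = 1 – β = 0.8037.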
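The power calculation in Ex. 4 can be repeated for any µ1 with a few lines of Python. This is a minimal sketch; the helper name `upper_tail_power` is hypothetical, and it assumes an upper tail test with known σ as in the example.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_power(mu0, mu1, sigma, n, alpha=0.05):
    """Critical sample mean, beta and power of an upper tail z-test at mu = mu1."""
    std_error = sigma / sqrt(n)
    # Smallest sample mean that leads to rejection of H0.
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # beta = P(sample mean <= x_crit) when the true mean is mu1.
    beta = NormalDist(mu1, std_error).cdf(x_crit)
    return x_crit, beta, 1 - beta


# Ex. 4: sigma = 20, n = 100, H0: mu <= 60.
results = {mu1: upper_tail_power(60, mu1, 20, 100) for mu1 in (61, 65)}
for mu1, (x_crit, beta, power) in results.items():
    print(f"mu1 = {mu1}: x_crit = {x_crit:.2f}, beta = {beta:.4f}, power = {power:.4f}")
```

The values agree with the hand calculation up to rounding of the critical value. Evaluating the function over a grid of µ1 values gives the power curve of the test.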
[Figure: the power of the test (vertical axis, 0 to 1.2) plotted against µ1 (horizontal axis, 60 to 70).]
The relation between α, β and the sample size, n
Ex. 5 Suppose that ξ is N(µ, 30). Test H0: µ = 300 against H1: µ = 310.
Determine the sample size so that α = 0.05 and β = 0.036.
[Figure: sampling distributions centred at µ0 = 300 and µ1 = 310, with the areas α = 5 % and β = 3.6 % marked.]
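Condition 1: using µ0 and α: $\bar{x}_C = \mu_0 + z_0 \cdot \dfrac{\sigma}{\sqrt{n}} = 300 + 1.645 \cdot \dfrac{30}{\sqrt{n}}$
Condition 2: using µ1 and β: $\bar{x}_C = \mu_1 + z_1 \cdot \dfrac{\sigma}{\sqrt{n}} = 310 - 1.80 \cdot \dfrac{30}{\sqrt{n}}$
Setting the two expressions equal gives
$300 + 1.645 \cdot \dfrac{30}{\sqrt{n}} = 310 - 1.80 \cdot \dfrac{30}{\sqrt{n}} \Rightarrow (1.80 + 1.645) \cdot \dfrac{30}{\sqrt{n}} = 310 - 300 \Rightarrow$
$\Rightarrow \sqrt{n} = \dfrac{3.445 \cdot 30}{10} = 10.335 \Rightarrow n = 106.8 \approx 107$
The critical value: $\bar{x}_C = \mu_0 + z_0 \cdot \dfrac{\sigma}{\sqrt{n}} = 300 + 1.645 \cdot \dfrac{30}{10.335} = 304.78$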
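Solving the two conditions for n gives n = ((z0 + z1)·σ / (µ1 – µ0))², where z0 and z1 are the positive normal quantiles for α and β. A minimal sketch of that formula follows; the function name `required_sample_size` is our own, and it assumes µ1 > µ0 with an upper tail test.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(mu0, mu1, sigma, alpha, beta):
    """Sample size for an upper tail z-test with given alpha and beta (mu1 > mu0)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(1 - beta)    # about 1.80 for beta = 0.036
    n_exact = ((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2
    # Round up so that neither error probability exceeds its target.
    return n_exact, ceil(n_exact)


# Ex. 5: sigma = 30, H0: mu = 300 against H1: mu = 310.
n_exact, n_required = required_sample_size(300, 310, 30, alpha=0.05, beta=0.036)
print(f"n = {n_exact:.1f}, rounded up to {n_required}")
```

The sample size is rounded up rather than to the nearest integer, since a smaller n would make β larger than required.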
The t-test of the expected value µ
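Step 2: α = 0.05. H0 is rejected if t exceeds the critical value $t_C$.
Step 3: The test statistic is $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$.
Step 4: $t = \dfrac{107 - 102.5}{6.2 / \sqrt{12}} = 2.51$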
Step 5: H0 is rejected. The examination suggests that µ > 102.5, i.e. the new
method seems to give an increase in the average length of life of the
batteries.
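The t statistic above is easy to verify. The sketch below uses only the standard library, which has no t distribution, so the critical value 1.796 (one-sided, 5 %, 11 degrees of freedom) is taken from a standard t table. That value and the one-sided reading of the test are assumptions on our part, since the notes only show $t_C$.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic when sigma is estimated by the sample value s."""
    return (x_bar - mu0) / (s / sqrt(n))


# Numbers from the calculation above: mean 107, mu0 = 102.5, s = 6.2, n = 12.
t = t_statistic(107, 102.5, 6.2, 12)

# Assumption: tabulated upper 5 % point of the t distribution with 11 df.
T_CRIT_11_DF = 1.796
reject_h0 = t > T_CRIT_11_DF
print(f"t = {t:.2f}, critical value = {T_CRIT_11_DF}, reject H0: {reject_h0}")
```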
Test of a population proportion
Ex. 7 In a random sample of 200 persons who bought a particular product there
were 87 women. Can we conclude from this investigation that the product is
bought to the same extent by men and women? Test the statement at the 1 %
significance level and calculate the p-value.
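Step 2: α = 0.01, which gives the critical values –2.575 and 2.575.
Step 3: The test statistic is $z = \dfrac{\hat{p} - p}{\sqrt{p(1 - p) / n}}$.
Step 4: $\hat{p}_{female} = \dfrac{87}{200} = 0.435$, so
$z = \dfrac{0.435 - 0.5}{\sqrt{0.5 \cdot 0.5 / 200}} = -1.84$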
Step 5: H0 cannot be rejected. The investigation does not suggest that the product
is bought more frequently by one of the sexes.
The p-value = 2P(Z < –1.84) =2 (1 – P(Z < 1.84)) = 2(1 – 0.9671) = 0.0658
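The proportion test in Ex. 7 can be sketched as follows. The function name `proportion_z_test` is our own choice, and the sketch assumes the normal approximation that the example itself uses.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    # The standard error is computed under H0, i.e. with p0 rather than p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_hat, z, p_value


# Ex. 7: 87 women among 200 buyers, H0: p = 0.5, alpha = 0.01.
p_hat, z, p_value = proportion_z_test(87, 200, 0.5)
reject_h0 = p_value < 0.01
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value computed this way is about 0.066. It differs slightly from 0.0658 because z is not rounded to two decimals before the normal probability is looked up.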
Comparison between two expected values
Two samples
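$E(\eta) = E(\bar{\xi}_1 - \bar{\xi}_2) = E(\bar{\xi}_1) - E(\bar{\xi}_2) = \mu_1 - \mu_2$
$Var(\eta) = Var(\bar{\xi}_1 - \bar{\xi}_2) = Var(\bar{\xi}_1) + Var(\bar{\xi}_2) = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$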
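The table shows the two different cases when we have large samples.
Distribution    Conditions                            Parameter    Expression
normal          for all n1 and n2                     µ1 – µ2      $z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
normal          σ1 and σ2 unknown, estimated by       µ1 – µ2      $z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
                s1 and s2; n1 ≥ 30, n2 ≥ 30
$s^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$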
If the variables are not normally distributed then we must distinguish the following
situations.
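Distribution    Conditions               Parameter    Expression
not normal      n1 ≥ 30, n2 ≥ 30         µ1 – µ2      $z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
not normal      n1 ≥ 100, n2 ≥ 100       µ1 – µ2      $z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$Var(\eta) = Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) = \dfrac{p_1 (1 - p_1)}{n_1} + \dfrac{p_2 (1 - p_2)}{n_2}$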
$\eta = \dfrac{(n - 1) S^2}{\sigma^2}$, which is χ² distributed with (n – 1) df.
Step 3: The test statistic is $\chi^2 = \dfrac{(n - 1) s^2}{\sigma^2}$.
Step 4: $\chi^2 = \dfrac{(15 - 1) \cdot 0.8^2}{1.2^2} = 6.22 < 6.57$
Step 5: H0 is rejected. It seems that the standard deviation is lower than 1.2.
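The χ² statistic for the variance test can be checked with a short sketch. The standard library has no χ² distribution, so the critical value 6.57 is simply the tabulated value quoted in the notes; the function name `variance_chi2_statistic` is our own.

```python
def variance_chi2_statistic(s, sigma0, n):
    """Chi-square statistic (n - 1) s^2 / sigma0^2 for a test of the variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2


# Numbers from the calculation above: n = 15, s = 0.8, hypothesised sigma = 1.2.
chi2 = variance_chi2_statistic(0.8, 1.2, 15)

# Critical value quoted in the notes (lower tail, 14 df).
CHI2_CRIT = 6.57
reject_h0 = chi2 < CHI2_CRIT
print(f"chi2 = {chi2:.2f}, critical value = {CHI2_CRIT}, reject H0: {reject_h0}")
```

Note that this is a lower tail test: H0 is rejected when the statistic is smaller than the critical value, because a small sample variance speaks for a standard deviation below 1.2.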
Then we use the test statistic $F = \dfrac{s_1^2}{\sigma_1^2} \cdot \dfrac{\sigma_2^2}{s_2^2}$,
which is F-distributed with (n1 – 1, n2 – 1) df.
ANALYSIS OF VARIANCE
[Figure: two panels showing observations from three samples scattered around their sample means $\bar{x}_1$, $\bar{x}_2$ and $\bar{x}_3$.]
2. The observations are normally distributed with the same standard deviation,
but possibly with different expected values.
µ1 – µ2 = 0, µ2 – µ3 = 0, µ1 – µ3 = 0
$1 - 0.95^3 \approx 0.14$
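The value of about 0.14 can be read as the risk of at least one false rejection when the three pairwise comparisons are each tested at the 5 % level. A minimal sketch, assuming the comparisons are independent (an assumption that holds only approximately for pairwise tests on the same data):

```python
def familywise_error_rate(alpha, comparisons):
    """Probability of at least one type I error in several independent tests."""
    return 1 - (1 - alpha) ** comparisons


# Three pairwise comparisons, each at the 5 % level.
risk = familywise_error_rate(0.05, 3)
print(f"risk of at least one false rejection = {risk:.3f}")
```

This growth of the overall type I risk is the usual motivation for testing all expected values at once with analysis of variance.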
$F = \dfrac{MST}{MSE}$ is F-distributed,
where
MST is the mean square of treatments and
MSE is the mean square of errors.
ANOVA table
Source of variation    SS       df       MS = SS/df
Between treatments     B − C    c − 1    (B − C) / (c − 1)
Error                  A − B    n − c    (A − B) / (n − c)
Total                  A − C    n − 1
$A = \sum_{j=1}^{c} \sum_{i=1}^{n_j} x_{ij}^2$
$B = \sum_{j=1}^{c} \dfrac{\left( \sum_{i=1}^{n_j} x_{ij} \right)^2}{n_j}$
$C = \dfrac{\left( \sum_{j=1}^{c} \sum_{i=1}^{n_j} x_{ij} \right)^2}{n}$
Ex 1 A factory has 3 different lines that produce the same type of electric bulb. A
random sample of 3 bulbs from each line is taken to examine whether the average
length of life differs depending on which line produced the bulbs. The length of
life was measured. This is the result:
Line 1    1000    1100    1200
Line 2    1000     900    1000
Line 3    1300    1000    1100
Step 1: H0: µ1 = µ2 = µ3
H1: not all of the expected values are equal
Step 2: α = 0.05. With (2, 6) df the critical value is 5.14.
Step 3: The test statistic is $F = \dfrac{MS_{lines}}{MS_{error}}$.
$A = \sum_{j=1}^{c} \sum_{i=1}^{n_j} x_{ij}^2 = 1000^2 + 1100^2 + 1200^2 + 1000^2 + 900^2 + 1000^2 + 1300^2 + 1000^2 + 1100^2 = 10\,360\,000$
$B = \sum_{j=1}^{c} \dfrac{\left( \sum_{i=1}^{n_j} x_{ij} \right)^2}{n_j} = \tfrac{1}{3} (1000 + 1100 + 1200)^2 + \tfrac{1}{3} (1000 + 900 + 1000)^2 + \tfrac{1}{3} (1300 + 1000 + 1100)^2 = 10\,286\,666.67$
$C = \dfrac{\left( \sum_{j=1}^{c} \sum_{i=1}^{n_j} x_{ij} \right)^2}{n} = \tfrac{1}{9} (1000 + 1100 + 1200 + 1000 + 900 + 1000 + 1300 + 1000 + 1100)^2 = 10\,240\,000$
ANOVA table
Source of variation    SS                     df    MS = SS/df
Between lines          B − C = 46 666.67      2     23 333.335
Error                  A − B = 73 333.33      6     12 222.22167
Total                  A − C = 120 000        8
$F = \dfrac{23\,333.335}{12\,222.22167} \approx 1.91$
Step 5: H0 cannot be rejected. This investigation does not give support to the
statement that the averages of the length of life differ between the lines.
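The one-way calculation with the sums A, B and C can be sketched in Python. The function name `one_way_anova` is our own, and the grouping of the nine observations by line is read off from the calculation of B above.

```python
def one_way_anova(groups):
    """One-way ANOVA via the sums A, B and C used in the notes."""
    n = sum(len(group) for group in groups)
    c = len(groups)
    a = sum(x * x for group in groups for x in group)           # A: sum of squares
    b = sum(sum(group) ** 2 / len(group) for group in groups)   # B: squared group totals
    corr = sum(sum(group) for group in groups) ** 2 / n         # C: correction term
    ss_between, ss_error = b - corr, a - b
    ms_between = ss_between / (c - 1)
    ms_error = ss_error / (n - c)
    return ss_between, ss_error, ms_between / ms_error


# Length of life for three bulbs from each of the three lines.
lines = [[1000, 1100, 1200], [1000, 900, 1000], [1300, 1000, 1100]]
ss_between, ss_error, f_value = one_way_anova(lines)
print(f"SS between = {ss_between:.2f}, SS error = {ss_error:.2f}, F = {f_value:.2f}")
```

The resulting F of about 1.91 is then compared with the tabulated critical value 5.14 for (2, 6) df.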
Two way analysis of variance
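Source of variation    SS               df                 MS = SS/df
Between rows           B − D            r − 1              (B − D) / (r − 1)
Between columns        C − D            c − 1              (C − D) / (c − 1)
Error                  A − B − C + D    (r − 1)(c − 1)     (A − B − C + D) / ((r − 1)(c − 1))
Total                  A − D            n − 1
Here r is the number of rows, c is the number of columns and n = rc.
$A = \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}^2$
$B = \sum_{i=1}^{r} \dfrac{\left( \sum_{j=1}^{c} x_{ij} \right)^2}{c}$
$C = \sum_{j=1}^{c} \dfrac{\left( \sum_{i=1}^{r} x_{ij} \right)^2}{r}$
$D = \dfrac{\left( \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij} \right)^2}{n}$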
Ex 2 A test of three different types of colour was carried out on three different
types of cloth. One metre of each type of cloth was cut into three equally sized
pieces. Each piece was randomly assigned to one of the three types of colour.
The quality of a colour was measured as the wear of the fibre threads. The result
is shown in the following table. High values mean heavy wear and poorer quality.
           Colour 1    Colour 2    Colour 3
Cloth 1    36          40          35
Cloth 2    43          48          44
Cloth 3    22          26          22
Step 1: H0: µ1 = µ2 = µ3
H1: not all of the expected values are equal
Step 2: α = 0.05. With (2, 4) df the critical value is 6.94.
Step 3: The test statistic is $F = \dfrac{MS_{colours}}{MS_{error}}$.
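$A = \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}^2 = 36^2 + 43^2 + 22^2 + 40^2 + 48^2 + 26^2 + 35^2 + 44^2 + 22^2 = 11\,854$
$B = \sum_{i=1}^{r} \dfrac{\left( \sum_{j=1}^{c} x_{ij} \right)^2}{c} = \tfrac{1}{3} (36 + 40 + 35)^2 + \tfrac{1}{3} (43 + 48 + 44)^2 + \tfrac{1}{3} (22 + 26 + 22)^2 = 11\,815.333$
$C = \sum_{j=1}^{c} \dfrac{\left( \sum_{i=1}^{r} x_{ij} \right)^2}{r} = \tfrac{1}{3} (36 + 43 + 22)^2 + \tfrac{1}{3} (40 + 48 + 26)^2 + \tfrac{1}{3} (35 + 44 + 22)^2 = 11\,132.667$
$D = \dfrac{\left( \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij} \right)^2}{n} = \tfrac{1}{9} (36 + 40 + 35 + 43 + 48 + 44 + 22 + 26 + 22)^2 = 11\,095.111$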
ANOVA table
Source of variation    SS                             df    MS = SS/df
Between cloths         B − D = 720.222                2     360.111
Between colours        C − D = 37.556                 2     18.778
Error                  A − B − C + D = 1.111          4     0.27775
Total                  A − D = 758.889                8
Colours: $F = \dfrac{18.778}{0.27775} \approx 67.61 > 6.94$
Cloths: $F = \dfrac{360.111}{0.27775} \approx 1296.53 > 6.94$
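The two-way calculation with one observation per cell can be sketched as follows. The function name `two_way_anova` is our own, and the layout of the data (rows are cloths, columns are colours) is read off from the sums B and C above.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication via the sums A, B, C and D."""
    r, c = len(table), len(table[0])
    n = r * c
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    a = sum(x * x for row in table for x in row)     # A: sum of squares
    b = sum(sum(row) ** 2 for row in table) / c      # B: squared row totals
    col_term = sum(t ** 2 for t in col_totals) / r   # C: squared column totals
    d = sum(col_totals) ** 2 / n                     # D: correction term
    ms_error = (a - b - col_term + d) / ((r - 1) * (c - 1))
    f_rows = (b - d) / (r - 1) / ms_error
    f_cols = (col_term - d) / (c - 1) / ms_error
    return f_rows, f_cols, ms_error


# Wear of the fibre threads: rows are cloths 1-3, columns are colours 1-3.
wear = [[36, 40, 35], [43, 48, 44], [22, 26, 22]]
f_cloths, f_colours, ms_error = two_way_anova(wear)
print(f"F cloths = {f_cloths:.1f}, F colours = {f_colours:.1f}, MS error = {ms_error:.4f}")
```

Without intermediate rounding the mean square of the error is 0.2778, so the F values come out at about 1296.4 and 67.6. Both remain far above the critical value 6.94.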
Ex 3 Three different types of nutritive substance were given to 24 rats, 12 male
and 12 female. The result, measured as weight gain (in grams), is shown in the
following table.
             Substance
Sex       A     B     C
Male      5     7    21
          5     7    14
          9     9    17
          7     6    12
Female    7    10    16
          6     8    14
          9     7    14
          8     6    10
ANOVA table
Source of variation    SS        df
Sex                    0.67      1
Substance              301.00    2
Interaction            14.33     2
Error                  94.50     18
Step 2: α = 0.05. With (2, 18) df the critical value is 3.55.
Step 3: The test statistic is $F = \dfrac{MS_{interaction}}{MS_{error}}$.
Step 4: $F = \dfrac{14.33 / 2}{94.50 / 18} = \dfrac{7.165}{5.25} \approx 1.365 < 3.55$
Step 5: H0 cannot be rejected. This investigation does not support the statement
that there is an interaction.
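The sums of squares in the ANOVA table of Ex 3 can be recomputed from the raw data. The sketch below assumes an equal number of observations in every cell; the function name `two_way_anova_with_replication` and the nested-list layout are our own choices for illustration.

```python
def two_way_anova_with_replication(cells):
    """Two-way ANOVA with interaction; cells[i][j] lists the observations
    for row level i and column level j (equal cell sizes assumed)."""
    r, c, m = len(cells), len(cells[0]), len(cells[0][0])
    n = r * c * m
    observations = [x for row in cells for cell in row for x in cell]
    correction = sum(observations) ** 2 / n
    row_totals = [sum(sum(cell) for cell in row) for row in cells]
    col_totals = [sum(sum(cells[i][j]) for i in range(r)) for j in range(c)]
    ss_total = sum(x * x for x in observations) - correction
    ss_rows = sum(t ** 2 for t in row_totals) / (c * m) - correction
    ss_cols = sum(t ** 2 for t in col_totals) / (r * m) - correction
    ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / m - correction
    ss_interaction = ss_cells - ss_rows - ss_cols
    ss_error = ss_total - ss_cells
    ms_interaction = ss_interaction / ((r - 1) * (c - 1))
    ms_error = ss_error / (r * c * (m - 1))
    return ss_rows, ss_cols, ss_interaction, ss_error, ms_interaction / ms_error


# Weight gain in grams: rows are sex (male, female), columns are substances A, B, C.
gain = [
    [[5, 5, 9, 7], [7, 7, 9, 6], [21, 14, 17, 12]],
    [[7, 6, 9, 8], [10, 8, 7, 6], [16, 14, 14, 10]],
]
ss_sex, ss_substance, ss_interaction, ss_error, f_interaction = (
    two_way_anova_with_replication(gain)
)
print(f"SS: sex = {ss_sex:.2f}, substance = {ss_substance:.2f}, "
      f"interaction = {ss_interaction:.2f}, error = {ss_error:.2f}")
print(f"F interaction = {f_interaction:.3f}")
```

The sums of squares reproduce the ANOVA table. With 2 and 18 degrees of freedom the interaction ratio computed this way is about 1.37, which is below the critical value 3.55, so the conclusion that no interaction is indicated stands.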