Presented by
Simon Cheung
Email: kingchaucheung@cuhk.edu.hk
Example. A six-sided die is tossed 1000 times, and 200 fours are observed. Is there
evidence to conclude that the die is biased?
Let $p$ be the probability of getting a 4. If the die is unbiased, $p = \frac{1}{6}$. This is our initial
assumption. It is called the null hypothesis, and we write $H_0: p = \frac{1}{6}$. The next step is to
collect evidence (data). We know that the die was tossed $n = 1000$ times, and $y = 200$
fours were observed. Hence, an estimate of $p$ is $\hat{p} = \frac{200}{1000} = 0.20$. To make a judgement
about whether or not to reject the null hypothesis, we note that, by the central
limit theorem, the large sample distribution of $\hat{P}$ is $N\left(p, \frac{p(1-p)}{n}\right)$. Based on our initial
assumption (under $H_0$), the large sample distribution of $\hat{P}$ is $N\left(\frac{1}{6}, \frac{5}{36000}\right)$.
Hence, under $H_0$,
$$P\left(\hat{P} \ge 0.20 \mid p = \tfrac{1}{6}\right) = P\left(Z \ge \frac{0.20 - \frac{1}{6}}{\sqrt{\frac{5}{36000}}}\right) = P(Z \ge 2.82843) = 0.00234.$$
With our initial assumption that $p = \frac{1}{6}$, we only have a 0.23% chance of observing 200 or more
fours in 1000 tosses of the die. Can we say that the evidence provided by the data is
more extreme than we expect?
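The calculation above can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the helper name `upper_tail_normal` is an illustrative choice, not part of the original slides.

```python
import math


def upper_tail_normal(z):
    """Return P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


n = 1000          # number of tosses
y = 200           # number of fours observed
p0 = 1.0 / 6.0    # probability of a four under H0

p_hat = y / n                                  # sample proportion, 0.20
std_error = math.sqrt(p0 * (1.0 - p0) / n)     # sqrt(5/36000) under H0
z = (p_hat - p0) / std_error                   # standardized test statistic
p_value = upper_tail_normal(z)                 # one-sided p-value P(Z >= z)

print(f"z = {z:.5f}, p-value = {p_value:.5f}")
```

The printed values should agree with the 2.82843 and 0.00234 quoted in the example.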
Example. A poll released on October 13, 2010 found that 47% of 1000 adults surveyed classified
themselves as "very happy" when given the choices of (A) "very happy", (B) "fairly happy", and (C)
"not too happy". Suppose that a journalist who is a pessimist took advantage of this poll to write a
headline titled "Poll finds that adults who are very happy are in the minority." Is the pessimistic
journalist's headline warranted?
Answer. Let $p$ be the proportion of adults who are "very happy". The hypotheses are $H_0: p = 0.5$ vs
$H_A: p < 0.5$. It is reasonable to reject $H_0$ if $\hat{p} - 0.5 < C$, for some constant $C$. To determine $C$, we
set the probability of committing a type I error at 5%, that is, $\alpha = 0.05$. Under $H_0$, the variance of
$\hat{P}$ is $\frac{0.5 \times 0.5}{1000} = \frac{1}{4000}$, so
$$0.05 = P(\hat{p} - 0.5 < C \mid p = 0.5) = P\left(Z < \frac{C}{\sqrt{\frac{1}{4000}}}\right) \implies \frac{C}{\sqrt{\frac{1}{4000}}} = -1.645 \implies C = -0.026.$$
Since $\hat{p} = 0.47 < 0.5 - 0.026 = 0.474$, we reject $H_0$ at the $\alpha = 5\%$ significance level.
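The critical value in this answer can be checked numerically. The following is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 1000        # adults surveyed
p0 = 0.5        # proportion under H0
alpha = 0.05    # probability of a type I error
p_hat = 0.47    # observed proportion of "very happy" adults

std_error = math.sqrt(p0 * (1.0 - p0) / n)   # sqrt(1/4000) under H0
z_alpha = NormalDist().inv_cdf(alpha)        # lower 5% point of N(0, 1), about -1.645
C = z_alpha * std_error                      # reject H0 when p_hat - 0.5 < C
reject = (p_hat - p0) < C

print(f"C = {C:.3f}, reject H0: {reject}")
```

With these inputs the rejection rule should give the same decision as the example: the observed proportion falls below the cut-off of about 0.474.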
Example. The following table is the result of a telephone poll of 800 adults. The question
posed of the people who were surveyed was "Should the tax on cigarettes be raised to pay
for more health care?".
Non-Smokers           Smokers
$n_1 = 605$           $n_2 = 195$
$\hat{p}_1 = 0.58$    $\hat{p}_2 = 0.21$
Paired T-test.
Suppose that we have a random sample of subjects of size $n$. For the $i$th subject in the
sample, we take observation $(x_i, y_i)$. We are interested in the null hypothesis $H_0: \mu_X = \mu_Y$.
For $i = 1, 2, \ldots, n$, define $D_i = X_i - Y_i$. The null hypothesis becomes $H_0: \mu_D = 0$.
By the central limit theorem, the large sample distribution of $\bar{D}$ is $N\left(\mu_D, \frac{\sigma_D^2}{n}\right)$, where $\sigma_D^2$ is
the variance of the differences. Since $\sigma_D^2$ is unknown, we replace it by the sample variance $S_D^2$, and
the large sample distribution of $T = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$ is $t_{n-1}$. If our null hypothesis is
$H_0: \mu_D = 0$, then an approximate distribution of $T = \frac{\bar{D}}{S_D / \sqrt{n}}$ is $t_{n-1}$.
We have three cases.
• $H_0: \mu_D = 0$ vs $H_A: \mu_D > 0$
The p-value of the test is $P(\bar{D} > \bar{d} \mid \mu_D = 0) = P\left(T > \sqrt{n}\,\frac{\bar{d}}{s_D}\right)$.
• $H_0: \mu_D = 0$ vs $H_A: \mu_D < 0$
The p-value of the test is $P(\bar{D} < \bar{d} \mid \mu_D = 0) = P\left(T < \sqrt{n}\,\frac{\bar{d}}{s_D}\right)$.
• $H_0: \mu_D = 0$ vs $H_A: \mu_D \ne 0$
The p-value of the test is $P(|\bar{D}| > |\bar{d}| \mid \mu_D = 0) = 2P\left(T > \sqrt{n}\,\frac{|\bar{d}|}{s_D}\right)$.
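The three cases can be sketched in code as follows. This is an illustrative, standard-library-only implementation: the function names `t_upper_tail` and `paired_t_test` are hypothetical, the tail probability is obtained by numerically integrating the $t$ density with Simpson's rule (a swapped-in technique for self-containment, not the slides' method), and the data at the bottom are made up purely for demonstration. In practice a statistics package such as SciPy (`scipy.stats.ttest_rel`) would normally be used instead.

```python
import math


def t_upper_tail(t, df, steps=20000):
    """Approximate P(T > t) for Student's t with df degrees of freedom.

    Integrates the density over [0, |t|] with Simpson's rule (steps must be even)
    and uses the symmetry of the t distribution. math.gamma overflows for very
    large df, so this sketch is only meant for small to moderate samples.
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1.0 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3.0      # integral of the density over [0, |t|]
    upper = 0.5 - area          # P(T > |t|)
    return upper if t >= 0 else 1.0 - upper


def paired_t_test(x, y, alternative="two-sided"):
    """Return (t statistic, p-value) for H0: mu_D = 0, where D = X - Y."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    t_stat = math.sqrt(n) * d_bar / s_d   # undefined if all differences are equal
    df = n - 1
    if alternative == "greater":          # H_A: mu_D > 0
        p_value = t_upper_tail(t_stat, df)
    elif alternative == "less":           # H_A: mu_D < 0, P(T < t) = P(T > -t)
        p_value = t_upper_tail(-t_stat, df)
    elif alternative == "two-sided":      # H_A: mu_D != 0
        p_value = 2.0 * t_upper_tail(abs(t_stat), df)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return t_stat, p_value


# Hypothetical paired measurements, for illustration only.
before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
after = [11.6, 11.5, 12.2, 12.1, 11.8, 11.9]
for alt in ("greater", "less", "two-sided"):
    t_stat, p_value = paired_t_test(before, after, alt)
    print(f"{alt:>9}: t = {t_stat:.3f}, p-value = {p_value:.4f}")
```

The one-sided p-values for "greater" and "less" always sum to 1, and the two-sided p-value is twice the smaller of the two.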
Example. Is there a statistically significant difference at the $\alpha = 0.01$ level, say, in the
(population) mean cholesterol levels reported by Lab 1 and Lab 2?
Answer. The hypotheses are $H_0: \mu_D = 0$ vs $H_A: \mu_D \ne 0$. Note that $\bar{d} = 12.8$ and $s_D = 4.238$.
The p-value of the test is
$$2P\left(T > \sqrt{10}\,\frac{\bar{d}}{s_D}\right) \approx 0.$$
Since the p-value is less than 0.01, we reject the null hypothesis at the $\alpha = 0.01$ significance
level. We conclude that the mean cholesterol levels reported by Lab 1 and Lab 2 differ.
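For reference, the test statistic behind this p-value can be evaluated by hand. The sketch below assumes $n = 10$ paired measurements, the sample size implied by the factor in the p-value expression, so that $T$ has 9 degrees of freedom. The value 3.250 is the upper 0.5% point of $t_9$ as found in standard $t$ tables.

```latex
t = \sqrt{n}\,\frac{\bar{d}}{s_D}
  = \sqrt{10} \times \frac{12.8}{4.238}
  \approx 3.1623 \times 3.0203
  \approx 9.55,
\qquad
t_{9,\,0.005} \approx 3.250 .
```

Because 9.55 is far larger than 3.250, the two-sided p-value is well below 0.01, which is consistent with the conclusion above.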
Is there sufficient evidence at the $\alpha = 0.05$ significance level to conclude that the mean
fastest speed driven by men differs from that driven by women?
It appears that the box plots for Brand1 and Brand5 have very little, if any, overlap at all. The same
can be said for Brand3 and Brand5.
Since the p-value of the F-test is small, we reject the null hypothesis and conclude that at
least one of the mean exam scores is different.
Since the p-value of the F-test is less than 0.05, we reject the null hypothesis at the 5%
significance level and conclude that at least one of the mean exam scores is different.
Since the p-value of the F-test is large, we do not have enough evidence to reject the null
hypothesis.
Remarks:
• The result of the F-test indicates the significance of fitting the group means model to
the data, that is, to assume that $X_{ij} = \mu_i + \varepsilon_{ij}$, for $j = 1, 2, \ldots, n_i$ and $i = 1, 2, \ldots, k$,
where the $\varepsilon_{ij}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$. The
mle of $\mu_i$ is $\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$.
• Since $\frac{1}{\sigma^2} SSE \sim \chi^2_{n-k}$, $E\left(\frac{SSE}{n-k}\right) = \sigma^2$. That is, $MSE = \frac{SSE}{n-k}$ is an unbiased estimator of $\sigma^2$.
This result is true irrespective of whether the null hypothesis is true or not.
• If the group means model fits the data, the between-group sum of squares is large and $SSE$ is small. In this case, the
p-value of the F-test is small and the null hypothesis is rejected. We say that the model
is significant and fits the data well.
• The F-test assumes that the random samples are independent, normally distributed
and the error variances are equal. However, the F-test works quite well even if the
underlying measurements are not normally distributed, unless the data are highly
skewed or the variances are markedly different.
• If the data are highly skewed, or if there is evidence that the variances differ greatly,
we have two analysis options at our disposal. We could attempt to transform the
observations (take the natural log of each value, for example) to make the data more
symmetric with more similar variances. Alternatively, we could use nonparametric
methods.
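The remark that $MSE = \frac{SSE}{n-k}$ is unbiased for $\sigma^2$ whether or not the null hypothesis holds can be checked with a small simulation. This is only a sketch: the group means, $\sigma = 2$, the seed and the number of replications are arbitrary assumptions chosen for illustration, and the function name `mse_one_sample` is hypothetical.

```python
import random


def mse_one_sample(group_means, n_per_group, sigma, rng):
    """Simulate one data set from the group means model and return MSE = SSE / (n - k)."""
    sse = 0.0
    for mu in group_means:
        xs = [rng.gauss(mu, sigma) for _ in range(n_per_group)]
        x_bar = sum(xs) / n_per_group
        sse += sum((x - x_bar) ** 2 for x in xs)   # within-group squared deviations
    n = n_per_group * len(group_means)
    k = len(group_means)
    return sse / (n - k)


rng = random.Random(2024)        # fixed seed so the run is repeatable
sigma = 2.0                      # assumed error standard deviation, sigma^2 = 4
reps = 20000                     # number of simulated data sets

null_true = [10.0, 10.0, 10.0]   # all group means equal
null_false = [5.0, 10.0, 20.0]   # group means clearly different

avg_null_true = sum(mse_one_sample(null_true, 5, sigma, rng) for _ in range(reps)) / reps
avg_null_false = sum(mse_one_sample(null_false, 5, sigma, rng) for _ in range(reps)) / reps

print(f"average MSE, H0 true:  {avg_null_true:.3f}")
print(f"average MSE, H0 false: {avg_null_false:.3f}")
```

Both averages should come out close to $\sigma^2 = 4$, because the within-group deviations do not depend on the values of the group means.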
Example. A large body of evidence shows that soy has health benefits for most people.
Some of these benefits originate largely from isoflavones, plant compounds that have
estrogen-like properties. A consumer group purchased various soy products and ran
laboratory tests to determine the amount of isoflavones in each product. There were
three major sources of soy products: (1) cereals and snacks, (2) energy bars and (3) veggie
burgers. Five different products from each of the three categories were selected and the
amount of isoflavones (in mg) was determined for an adult serving. Our objective is to
determine if the average amount of isoflavones was different for the three sources of soy
products.
Example. The data are given in the following table. Use these data to test the hypothesis
of a difference in the mean isoflavones level for the three categories.
Source of Soy    Isoflavones Content (mg)    Sample Size    Sample Mean    Sample Standard Deviation
1                5   17  12  10  4           5              9.60           4.7582
2                19  10  9   7   5           5              10.00          4.8166
3                25  15  12  9   8           5              13.80          6.1123
Total                                        15             11.133
Example. The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3$ versus $H_1$: at least one of the
three population means is different from the rest.
The between-group sum of squares is
$$\sum_{i=1}^{3} n_i (\bar{x}_i - \bar{x})^2 = 5(9.60 - 11.133)^2 + 5(10.0 - 11.133)^2 + 5(13.80 - 11.133)^2 = 53.733.$$
The ANOVA table is given as
Source df SS MS F p-value
Method 2 53.733 26.867 0.775 0.4824
Error 12 416 34.667
Total 14 469.733
Since the p-value of the F-test is large, we do not have enough evidence to reject the null
hypothesis.
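The entries of the ANOVA table can be recomputed from the raw isoflavone data. The sketch below uses only the Python standard library, and the variable names are illustrative. One assumption to note: the p-value line uses the closed form $P(F_{2,m} > f) = (1 + 2f/m)^{-m/2}$, which holds only when the numerator has 2 degrees of freedom, as it does here with three groups. For other designs an F-distribution routine from a statistics package would be needed.

```python
# Isoflavone content (mg) for the three sources of soy, taken from the data table.
groups = {
    1: [5, 17, 12, 10, 4],
    2: [19, 10, 9, 7, 5],
    3: [25, 15, 12, 9, 8],
}


def average(xs):
    return sum(xs) / len(xs)


all_values = [x for xs in groups.values() for x in xs]
n = len(all_values)               # 15 observations in total
k = len(groups)                   # 3 groups
grand_mean = average(all_values)  # about 11.133

# Between-group ("Method") and within-group ("Error") sums of squares.
ss_method = sum(len(xs) * (average(xs) - grand_mean) ** 2 for xs in groups.values())
ss_error = sum((x - average(xs)) ** 2 for xs in groups.values() for x in xs)
ss_total = ss_method + ss_error

df_method, df_error = k - 1, n - k
ms_method = ss_method / df_method
ms_error = ss_error / df_error
F = ms_method / ms_error

# Closed-form upper tail of F(2, m); valid ONLY because df_method == 2 here.
assert df_method == 2
p_value = (1.0 + df_method * F / df_error) ** (-df_error / 2.0)

print(f"Method: df={df_method}, SS={ss_method:.3f}, MS={ms_method:.3f}, F={F:.3f}, p={p_value:.4f}")
print(f"Error:  df={df_error}, SS={ss_error:.3f}, MS={ms_error:.3f}")
print(f"Total:  df={n - 1}, SS={ss_total:.3f}")
```

The results should match the table: sums of squares of about 53.733 and 416, an F statistic of about 0.775 and a p-value of about 0.4824.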