Stat2 2023 Lecture Slides 2A PDF
Stat2
2A one population: mean
(Statistics 1 for Economics)
$H_1: \mu > 8$, rejection region: $T \ge t_{crit}$
[Figure: a random sample of size $n = 20$ is drawn from the population]
sample values:
6 4 14 6
6 11 8 7
14 15 12 11
3 10 5 6 7
5 7 19
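$H_0: \sigma^2 = 4$, sample variance $s^2 = 8.98$

question            parameter   statistic
how much…           μ           x̄           mean
how different…      σ²          s²          variance
what percentage…    p           p̂           proportion

$\chi^2 = \dfrac{(n-1)\,s^2}{\sigma^2} \sim \chi^2[\text{df} = n-1]$ if the population is normal

$F = \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \dfrac{s_1^2}{s_2^2}$ (the second equality holds under $H_0: \sigma_1^2 = \sigma_2^2$)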
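As a small worked illustration of the $\chi^2$ statistic for one variance, the sketch below plugs in $s^2 = 8.98$ and $\sigma^2 = 4$ from the slide. The sample size $n = 20$ is an assumption taken from the sample figure, and the function name `chi_square_statistic` is hypothetical. The critical value still has to be looked up in a $\chi^2$ table with $n - 1$ degrees of freedom.

```python
def chi_square_statistic(n, s2, sigma0_sq):
    """Return ((n - 1) * s^2 / sigma0^2, degrees of freedom)."""
    if n < 2 or sigma0_sq <= 0:
        raise ValueError("need n >= 2 and a positive hypothesized variance")
    df = n - 1
    return df * s2 / sigma0_sq, df


# s2 and sigma0_sq come from the slide; n = 20 is an assumption
chi2, df = chi_square_statistic(n=20, s2=8.98, sigma0_sq=4)
print(f"chi-square = {chi2:.3f} with df = {df}")
```

The statistic is only approximately $\chi^2$ distributed if the population is normal, as stated on the slide.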
2A one population: proportion
(Statistics 1 for Economics)
$H_0: p = 40\%$
$H_1: p \ne 40\%$, rejection region: $Z \ge z_{crit}$ or $Z \le -z_{crit}$
[Figure: a random sample of size 20 is drawn from the population]
sample values:
no no yes no
no no no no
no yes no no
no no no yes no
yes no no
$\hat{p} = \frac{4}{20} = 20\%$
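The test for this sample can be sketched with the usual one-proportion statistic $z = (\hat{p} - p)/\sqrt{p(1-p)/n}$ and the conditions $n \cdot p \ge 5$ and $n \cdot (1-p) \ge 5$. The significance level $\alpha = 5\%$ is an assumption (the slide does not state one), and the function name `one_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """z statistic for H0: p = p0, based on x successes in n trials."""
    # Conditions for the normal approximation: n*p0 >= 5 and n*(1 - p0) >= 5
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("conditions for the normal approximation are not met")
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


z = one_proportion_z(x=4, n=20, p0=0.40)     # 4 "yes" answers out of 20
alpha = 0.05                                 # assumption, not given on the slide
z_crit = NormalDist().inv_cdf(1 - alpha / 2) # two-sided critical value
reject = abs(z) >= z_crit                    # rejection region: |Z| >= z_crit
print(f"z = {z:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject}")
```

The two-sided rejection region is checked with `abs(z)`, which is equivalent to $Z \ge z_{crit}$ or $Z \le -z_{crit}$.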
Nominal and ordinal data can be coded with numbers, but they are still not quantitative:
• smokes: yes = 1, no = 0
• employed: full-time = 1, part-time = 2, none = 3
2A Statistics 2 for Economics
2A inference about two proportions
2A non-parametric: Wilcoxon Rank Sum
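… at least 10% larger than the proportion of smokers among students (population 1)?

[Figure: a random sample is drawn from each population; population 1 has proportion $p_1$ (sample 1: $\hat{p}_1 = x_1/n_1$), population 2 has proportion $p_2$ (sample 2: $\hat{p}_2 = x_2/n_2$)]

one population:
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} \sim N(0, 1)$ if $n \cdot p \ge 5$ and $n \cdot (1-p) \ge 5$

two populations: parameter $p_1 - p_2$, statistic $\hat{p}_1 - \hat{p}_2$
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}} \sim N(0, 1)$
if $n_1\hat{p}_1 \ge 5$, $n_1(1-\hat{p}_1) \ge 5$, $n_2\hat{p}_2 \ge 5$, $n_2(1-\hat{p}_2) \ge 5$

rewriting the question as hypotheses:
$p_2 > p_1 + 0.10$
$p_1 + 0.10 < p_2$
$p_1 < p_2 - 0.10$
$H_1: p_1 - p_2 < -0.10$
$H_0: p_1 - p_2 = -0.10$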
2A two populations: proportion
hypothesis test with $\alpha = 5\%$
2A example Aspirin
Can Aspirin help to prevent heart attacks?
22 000 participants:
half of the participants take Aspirin daily: $n_1 = 11\,000$
the other half takes a placebo: $n_2 = 11\,000$
Hypothesis test with significance level $\alpha = 0.01$
2A example Aspirin
hypothesis test with $\alpha = 1\%$
2A case Calculators
A factory has the choice between two methods of producing
calculators (methods 1 and 2). In a random sample of 80 calculators
made by method 1, 16 were defective, while in a random sample of
60 calculators made by method 2, 21 were defective.
a. Test, using $\alpha = 5\%$, whether there is a difference between the
two methods with respect to the percentage of defective
calculators.
b. Because method 1 is more expensive than method 2,
management wants to know whether the percentage of
defectives for method 2 is more than 3 percentage points higher than that
for method 1. Test this using $\alpha = 5\%$.
c. Determine a 90% confidence interval for the difference in
percentages of defectives for the two methods.
(answers: see Canvas)
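The calculations for this case can be sketched as follows. This is only one possible setup: part a uses a pooled proportion under $H_0: p_1 - p_2 = 0$, which is a common textbook convention but an assumption here, while parts b and c use the unpooled standard error $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$. The official answers on Canvas may follow a different convention.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 16, 80   # method 1: defectives, sample size
x2, n2 = 21, 60   # method 2: defectives, sample size
p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat
norm = NormalDist()

# Conditions: each sample needs at least 5 defectives and 5 non-defectives
assert min(x1, n1 - x1, x2, n2 - x2) >= 5

# (a) H0: p1 - p2 = 0 versus H1: p1 - p2 != 0, alpha = 5%
# Assumption: pooled proportion, because H0 states p1 = p2
p_pool = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_a = diff / se_pooled
reject_a = abs(z_a) >= norm.inv_cdf(0.975)

# (b) H0: p1 - p2 = -0.03 versus H1: p1 - p2 < -0.03, alpha = 5%
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_b = (diff - (-0.03)) / se_unpooled
reject_b = z_b <= -norm.inv_cdf(0.95)

# (c) 90% confidence interval for p1 - p2
margin = norm.inv_cdf(0.95) * se_unpooled
ci = (diff - margin, diff + margin)

print(f"a: z = {z_a:.3f}, reject H0: {reject_a}")
print(f"b: z = {z_b:.3f}, reject H0: {reject_b}")
print(f"c: 90% CI for p1 - p2: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The hypotheses in part b follow from rewriting $p_2 > p_1 + 0.03$ as $p_1 - p_2 < -0.03$.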
2A Statistics 2 for Economics
2A inference about two proportions
2A non-parametric: Wilcoxon Rank Sum
[Figure: population with parameters μ, σ², p; sample with statistics x̄, s², p̂]
2A Wilcoxon Rank Sum Test
The Wilcoxon Rank Sum Test tests the difference in location of two populations.
Hypotheses:
$H_0$: population 1 and population 2 have the same location
and one of
$H_1$: the location of population 1 differs from the location of population 2 (two-sided)
$H_1$: the location of population 1 is to the left of the location of population 2 (one-sided)
$H_1$: the location of population 1 is to the right of the location of population 2 (one-sided)
[Figure: distributions of populations 1 and 2]
The Wilcoxon Rank Sum Test only assumes that the populations are identical
with respect to the spread and the form of their distributions.
2A Wilcoxon Rank Sum Test
1. Conditions and assumptions
two independent random samples
ordinal data or quantitative data from non-normal distribution
2. Hypotheses
$H_0$: population 1 and population 2 have the same location
$H_1$: location of pop. 1 differs from location of pop. 2 (two-sided)
3. Test statistic
$T$ = rank sum for sample 1 = $T_1$
2A rank sum and location
Rank the values of the two samples:
ranks       1    3    2    4    $T_1 = 10$
sample 1    8   17   13   22
sample 2   37   42   36   31
ranks       7    8    6    5    $T_2 = 26$
You can imagine that this ranking
1 2 3 4 | 5 6 7 8   (sample 1 | sample 2)
is likely to come from the situation where population 1 lies to the left of population 2:
[Figure: distribution of population 1 to the left of the distribution of population 2]
The rank sum of the first sample is low: $T_1 = 1 + 2 + 3 + 4 = 10 < T_2 = 26$
2A rank sum and location
Another example:
ranks       1    7    6    4.5   $T_1 = 18.5$
sample 1   11   37   35   28
sample 2   40   16   12   28
ranks       8    3    2    4.5   $T_2 = 17.5$
This ranking
1 2 3 4.5 4.5 6 7 8
might very well be from the situation where population 1 and population 2 have
(almost) the same location:
[Figure: distributions of populations 1 and 2 nearly overlapping]
The rank sums are almost equal: $T_1 \approx T_2$ (notice that $n_1 = n_2$)
2A rank sum and location
Recipe:
Rank each observation, from 1 (lowest) to $n_1 + n_2$ (highest)
In case of equal scores (ties), take the mean of the ranks
$T_1$ is the sum of the ranks for sample 1, and $T_2$ is the sum of the ranks for sample 2.
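The recipe above can be sketched in a few lines of Python. The function name `rank_sums` is hypothetical, and this is a minimal illustration rather than a full test procedure: it only computes $T_1$ and $T_2$, using the mean rank for tied values.

```python
def rank_sums(sample1, sample2):
    """Return (T1, T2): the rank sums of two samples, with mean ranks for ties."""
    combined = sorted(sample1 + sample2)
    rank_of = {}
    i = 0
    while i < len(combined):
        # Find the block of equal values starting at position i
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values get the mean of those ranks
        rank_of[combined[i]] = (i + 1 + j) / 2
        i = j
    t1 = sum(rank_of[value] for value in sample1)
    t2 = sum(rank_of[value] for value in sample2)
    return t1, t2


# The two examples from the slides
print(rank_sums([8, 17, 13, 22], [37, 42, 36, 31]))
print(rank_sums([11, 37, 35, 28], [40, 16, 12, 28]))
```

A quick sanity check on any result: $T_1 + T_2$ always equals $N(N+1)/2$ with $N = n_1 + n_2$, which is 36 for both examples.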
2A rejection region
4. Rejection region
$T \le T_{crit}^{L} \;\lor\; T \ge T_{crit}^{U}$ (lower and upper)
single blade   10    6    3    7   13   15    5    7
double blade    8   18   10   11   16   10    6   12