Dr Florian Reiche
Determining the Sample Size
The Research Cycle
• If we sample, we need to accept that we have to deal with uncertainty
• We can quantify and control this uncertainty
• The key to this is the standard error, which we defined as
$$ se = \frac{s}{\sqrt{n}} \tag{1} $$
where s is the standard deviation of the sample and n the sample size.
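As a minimal sketch of this definition, the standard error can be computed from a sample as follows. The data values and the function name `standard_error` are hypothetical and serve only as an illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 in the denominator)
    n = len(sample)                # sample size
    return s / math.sqrt(n)

# Hypothetical data, for illustration only
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
se = standard_error(sample)
print(round(se, 3))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.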
Recall the equation for calculating the standard error:
$$ se = \frac{s}{\sqrt{n}} \tag{2} $$
If we solve it for n, we obtain:
$$ n = \frac{s^2}{se^2} \tag{3} $$
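The intermediate steps of this rearrangement can be spelled out as follows, using only the symbols already defined: multiply both sides by the square root of n, divide by se, and then square both sides.

$$
\begin{aligned}
se &= \frac{s}{\sqrt{n}} \\
\sqrt{n} &= \frac{s}{se} \\
n &= \frac{s^2}{se^2}
\end{aligned}
$$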
Example [1]
Assume we have a population of 10,000, $s^2 = 0.2$, and a desired standard error of $se = 0.016$. Substituting these values into our equation:
$$ n = \frac{s^2}{se^2} \tag{4} $$
we obtain:
$$ n = \frac{0.20}{0.000256} = 781.25 \tag{5} $$
[1] Taken from Walliman, N. (2011)
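The same calculation can be sketched in a few lines of Python. The function name `required_sample_size` is a hypothetical helper; rounding up to the next whole respondent is an assumption added here, not a step from the example.

```python
import math

def required_sample_size(variance, target_se):
    """Sample size n = s^2 / se^2 for a desired standard error."""
    return variance / target_se ** 2

n = required_sample_size(variance=0.2, target_se=0.016)
print(round(n, 2))    # raw result of the formula
print(math.ceil(n))   # assumption: round up, since we cannot sample a fraction of a unit
```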
Large Samples
If the sample size is large relative to the population, we need to apply a correction by calculating the optimal sample size n':
$$ n' = \frac{n}{1 + \frac{n}{N}} \tag{6} $$
where:
• N: population size
• n: sample size
• n': optimal sample size
Example (contd.)
In our example:
$$ n' = \frac{781.25}{1 + \frac{781.25}{10{,}000}} \approx 725 \tag{7} $$
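A short sketch of this correction, using the numbers from the example. The function name `corrected_sample_size` is a hypothetical helper.

```python
def corrected_sample_size(n, population):
    """Corrected sample size n' = n / (1 + n / N)."""
    return n / (1 + n / population)

n_prime = corrected_sample_size(n=781.25, population=10_000)
print(round(n_prime))
```

The correction matters less as the population grows: with N = 1,000,000 the same n is reduced by less than one unit.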
Considerations
Any Questions?
Significance Tests for a Mean
The Research Cycle
Significance Tests [2]
Significance Test
A significance test uses data to summarise the evidence about a hypothesis. It
compares point estimates of parameters to the values predicted by the hypothesis.
[2] Based on Agresti and Finlay (2013)
Example
5 Steps of a Significance Test
1. Assumptions
2. Hypotheses
3. Test statistic
4. p-value
5. Conclusion
1. Assumptions
• Randomisation
• Population Distribution (here: normal)
• (Type of data)
• Sample Size
2. Hypotheses
• In empirical social science research, we try to find out whether the data agree
with certain predictions
• These predictions result from theories we want to test
• The predictions are called hypotheses
Hypothesis
"In statistics, a hypothesis is a statement about a population. It is usually a
prediction that a parameter describing some characteristic of a variable takes a
particular numerical value or falls in a certain range of values." (Agresti and Finlay,
2014, p. 143)
2. Hypotheses (contd.)
Examples:
• “All unicorns are pink.”
• “Countries are democracies if their per capita GDP exceeds $10,000.”
• “A person is either an immigrant or not.”
2. Hypotheses (contd.)
• Each significance test has TWO hypotheses about the value of a parameter
• Null hypothesis (H0): states that the parameter takes a particular
value, usually one that indicates no effect
• Alternative hypothesis (Ha): states that the parameter falls into some
alternative range of values, representing an effect of some type
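For a test about a population mean, the two hypotheses might be written as follows. This is a sketch that assumes a two-sided alternative; here $\mu$ denotes the population mean and $\mu_0$ the particular value stated in the null hypothesis.

$$ H_0: \mu = \mu_0 \qquad H_a: \mu \neq \mu_0 $$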
3. Test Statistic
Test Statistic
"The parameter to which the hypotheses refer has a point estimate. The test
statistic summarizes how far that estimate falls from the parameter value in H0.
Often this is expressed by the number of standard errors between the estimate and
the H0 value." (Agresti and Finlay, 2014, p. 145)
t-Test Statistic
t-Test Statistic (contd.)
$$ t = \frac{\bar{y} - \mu_0}{se}, \quad \text{where } se = \frac{s}{\sqrt{n}} $$
Calculating the t-value
$$ t = \frac{\bar{y} - \mu_0}{se}, \quad \text{where } se = \frac{s}{\sqrt{n}} $$
• In our example:
$$ t = \frac{-3.007 - 0}{1.357} = -2.22, \quad \text{where } se = \frac{7.309}{\sqrt{29}} $$
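The calculation can be reproduced with a short sketch using the values from the example (sample mean −3.007, H0 value 0, standard deviation 7.309, n = 29). The function name `t_statistic` is a hypothetical helper.

```python
import math

def t_statistic(y_bar, mu_0, s, n):
    """Return (t, se): the distance of y_bar from mu_0 measured in standard errors."""
    se = s / math.sqrt(n)          # standard error of the mean
    t = (y_bar - mu_0) / se        # number of standard errors between estimate and H0 value
    return t, se

t, se = t_statistic(y_bar=-3.007, mu_0=0.0, s=7.309, n=29)
print(round(se, 3), round(t, 2))
```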
4. The p-value
4. The p-value (contd.)
p-value
"The p-value is the probability that the test statistic equals the observed value or a
value even more extreme in the direction predicted by Ha. It is calculated
presuming that H0 is true. The p-value is denoted by p." (Agresti and Finlay, 2014,
p. 145)
Determining the p-value
• We have calculated the t-statistic, and know that our observed value of ȳ lies
2.22 standard errors away from the value stated in H0 (in our case zero).
• We now use this value to determine what share of the area under the
t-distribution lies within this distance of the H0 value
Determining the p-value (contd.)
• The remaining area beyond the t-value is the p-value (for a two-sided test, you
need to add up the areas in both tails)
• This is the blue area in the graph below
• Stata will tell you the exact value automatically
[Figure: density of the t-distribution; horizontal axis marked at −2.22, 0 and 2.22]
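To see where such a number comes from, the tail areas can be approximated by hand. The sketch below is not how Stata computes the value: it integrates the density of the t-distribution numerically with Simpson's rule, using only the Python standard library. It assumes n − 1 = 28 degrees of freedom, the usual choice for a one-sample t-test with n = 29, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: area in both tails beyond |t| (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    # Simpson's rule for the area between 0 and |t|
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    one_tail = 0.5 - area_zero_to_t   # the distribution is symmetric around zero
    return 2 * one_tail               # add up both tails

p = two_sided_p(-2.22, df=28)         # assumption: df = n - 1 = 28
print(round(p, 3))
```

For a one-sided test in the direction predicted by Ha, only one tail is used, which halves this value.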
5. Conclusion
Any Questions?
Type I and Type II Errors
Why not go for p = 0?
The Relationship between Type I and Type II Errors
Why does it matter?
• Court Trial
• H0: Defendant is innocent
• Ha: Defendant is guilty
• Type I error: We send an innocent person to jail
• Type II error: We let a guilty person go free
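The trade-off between the two errors can be illustrated numerically. The sketch below is a simplified illustration under loudly stated assumptions: it uses a two-sided z-test with a known standard deviation instead of the t-test, and the true mean (0.3), standard deviation (1.0) and sample size (30) are hypothetical values. `alpha` is a name introduced here for the Type I error probability we are willing to accept.

```python
import math
from statistics import NormalDist

def type_two_error(alpha, true_mean, sd, n):
    """Probability of not rejecting H0: mean = 0 when the mean is really true_mean.

    Assumption: two-sided z-test with known standard deviation sd.
    """
    z = NormalDist()                            # standard normal distribution
    crit = z.inv_cdf(1 - alpha / 2)             # rejection threshold for |z|
    shift = true_mean / (sd / math.sqrt(n))     # true mean measured in standard errors
    # H0 is retained whenever the test statistic falls between -crit and +crit
    return z.cdf(crit - shift) - z.cdf(-crit - shift)

for alpha in (0.10, 0.05, 0.01):
    beta = type_two_error(alpha, true_mean=0.3, sd=1.0, n=30)
    print(f"Type I error probability {alpha:.2f} -> Type II error probability {beta:.3f}")
```

Under these assumptions, demanding a smaller Type I error probability raises the Type II error probability: the stricter we are about convicting the innocent, the more guilty defendants go free.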
Any Questions?
Congratulations, you have survived QS104!
Goodbye, QS104
Academic Year
2019/20 2020/21
Combinations
Academic Year
2019/20 2020/21