
GV207 – Political Analysis, Week 05

Department of Government, University of Essex

Confidence and Significance
Inferential Statistics¹


We are typically not interested in the characteristics of our sample per se, but rather in what we
can learn from our sample about the characteristics of the population.
Thus, we try to infer from the observed statistics in our sample (e.g. the sample mean) to the
unknown parameters of the population (e.g. the population mean).
We can do this in two (closely related) ways: performing a significance test or constructing a
confidence interval.

Significance Testing

Significance testing involves calculating the probability of observing our sample mean given our
assumption about the population mean.
Let’s look at an example to see how this works. Suppose we assume that 40% of the voting-age
population intend to vote for the Labour party. To test this assumption, we conduct a survey
based on a random sample of 1000 individuals drawn from this population. We find that only
37% of the respondents intend to vote for the Labour party.
How likely is it that we get a sample mean of 0.37 (just because of sampling variance) given that
the population mean is 0.40? (Note that if we treat those that intend to vote for Labour as 1’s and
the others as 0’s, then the sample mean of the dummy variable will be 0.37 and its standard
deviation will be 0.48.)
To answer this question we can exploit the fact that, if we draw an infinite number of random
samples, the sampling distribution of sample means will be (approximately) normally distributed
around the population mean. For sufficiently large samples, this holds even if the underlying
variable is not normally distributed.²
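The claim about the sampling distribution can be checked with a small simulation. The sketch below is only an illustration: it assumes a population in which exactly 40% intend to vote Labour, and the number of simulated surveys (2000) and the random seed are arbitrary choices that do not come from the lecture.

```python
import random
import statistics

random.seed(207)  # arbitrary seed so the run is reproducible

P_LABOUR = 0.40      # assumed population share (from the example)
SAMPLE_SIZE = 1000   # respondents per survey (from the example)
N_SAMPLES = 2000     # number of simulated surveys (illustrative choice)


def draw_sample_mean(p, n):
    """Share of 1's in one random sample of n yes/no (1/0) responses."""
    return sum(1 for _ in range(n) if random.random() < p) / n


sample_means = [draw_sample_mean(P_LABOUR, SAMPLE_SIZE) for _ in range(N_SAMPLES)]

centre = statistics.mean(sample_means)    # should sit close to 0.40
spread = statistics.stdev(sample_means)   # SD of the simulated sample means
within_2 = sum(abs(m - centre) <= 2 * spread for m in sample_means) / N_SAMPLES

print(f"mean of sample means: {centre:.3f}")
print(f"SD of sample means:   {spread:.4f}")
print(f"share within 2 SDs:   {within_2:.3f}")
```

Although each individual response is a 0 or a 1 (clearly not normally distributed), the simulated sample means cluster symmetrically around the population mean, and roughly 95% of them fall within two standard deviations of it.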
The standard deviation of the sampling distribution is defined as

$$SE = \frac{s}{\sqrt{n}}$$

where s is the standard deviation of the variable in our sample and n is the sample size. The
standard deviation of the sampling distribution of sample means SE is typically called the
standard error of the sample mean or simply the standard error.
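As a minimal sketch of this definition, the standard error can be computed directly from s and n. The function name `standard_error` and the sample sizes other than 1000 are illustrative assumptions.

```python
import math


def standard_error(s, n):
    """Standard error of the sample mean: sample SD divided by sqrt(n)."""
    return s / math.sqrt(n)


# Same sample standard deviation (0.48), different sample sizes.
for n in (100, 1000, 4000):
    print(f"n = {n:>4}: SE = {standard_error(0.48, n):.4f}")
```

Because n enters through its square root, quadrupling the sample size only halves the standard error.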
Remember that we can calculate the area under the curve of the normal distribution by
calculating z-scores and looking up the corresponding area in a table:

Part of the curve        Proportion of area under the curve
Mean ± 1 SD              About 68%
Mean ± 2 SD              About 95%³
Mean ± 3 SD              About 99.7%
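Instead of looking the areas up in a printed table, they can be computed. The sketch below uses the error function from Python's standard library. The helper name `area_within` is an assumption made for illustration.

```python
import math


def area_within(k):
    """Area under the standard normal curve between -k and +k SDs."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
```

This also shows why 2 is used as a convenient approximation for 1.96: the two areas differ by less than half a percentage point.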

¹ These notes draw on the lecture slides and Kellstedt and Whitten (2009).
² The following website illustrates this: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
³ In order to get an area of exactly 95%, we need to move 1.96 (and not 2) standard deviations below and above
the mean. However, we often use 2 as an approximation.


Since sample means are normally distributed around the population mean, we know how likely it
is to get a certain sample mean given our assumption about the population mean. If our sample
mean is very far away from our assumed population mean, it is more likely that our assumption
about the value of the population mean is wrong.
Remember that we calculated the z-score of an observation’s deviation from the sample mean in
the following way:

$$z = \frac{x_i - \bar{x}}{s}$$

Similarly, we can calculate the z-score, which measures the sample mean’s deviation from the
assumed population mean:⁴

$$z = \frac{\bar{x} - \mu}{SE}$$

In this case, $\bar{x}$ is our sample mean, μ is the assumed population mean, and SE is the standard error
of the sample mean.
In our example of the electoral support for the Labour party, we know that the sample mean is
0.37, the sample standard deviation is 0.48, the sample size is 1000, and the assumed population
mean is 0.40. Using this information, we only need to figure out the standard error of the sample
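mean to calculate the z-score.

$$SE = \frac{s}{\sqrt{n}} = \frac{0.48}{\sqrt{1000}} \approx 0.0152$$

Knowing the standard error of the sample mean, we can now calculate the z-score:

$$z = \frac{\bar{x} - \mu}{SE} = \frac{0.37 - 0.40}{0.0152} \approx -1.98$$
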
The result is roughly -2. Thus, if we look at the sampling distribution around the assumed
population mean of 0.40, our sample mean of 0.37 lies about 2 standard errors below the mean.
Knowing this, we can conclude that only about 5% of the samples drawn from a population with
mean 0.40 should have a sample mean that is that far away (in either direction) from the
population mean.
Using statistical jargon, we would conclude that the difference between our sample mean and the
assumed population mean is significant at the 5% level. Thus, we would reject the assumption
that the population mean is equal to 0.40 at the 5% significance level.
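The whole test can be reproduced in a few lines. This is a sketch that treats the test statistic as a z-score under the normal approximation (a simplifying assumption, reasonable for a sample of 1000), and the variable names are illustrative.

```python
import math

# Figures from the Labour example
sample_mean = 0.37
assumed_mean = 0.40
sample_sd = 0.48
n = 1000

se = sample_sd / math.sqrt(n)            # standard error of the sample mean
z = (sample_mean - assumed_mean) / se    # distance from assumed mean, in SEs

# Probability of a sample mean at least this far from 0.40 in either
# direction, using the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

reject_at_5 = abs(z) > 1.96              # 1.96 is the exact 5% cut-off

print(f"SE = {se:.4f}, z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
print("Reject at the 5% level:", reject_at_5)
```

The sample mean lies just beyond 1.96 standard errors from the assumed mean, which is why the assumption is rejected at the 5% level, although only narrowly.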

Question: Which factors influence by how many standard errors our sample mean will deviate from
the assumed population mean?

⁴ Strictly speaking, we do not calculate a z-score, but something called a t-score, since we don’t know the
population variance in our example. However, I have used the z-score notation to emphasise that this is similar
to what we did last week.

Confidence Intervals


In many cases, we do not have a strong expectation about the population mean. Thus, we may
want to use our sample mean as an estimate of the unknown population mean.
However, drawing inferences from the observed sample mean to the unknown population mean
involves a degree of uncertainty. In fact, our sample mean is very unlikely to exactly match the
true population mean, but is likely to be a bit smaller or larger. In order to acknowledge the
uncertainty associated with our estimate, we can construct a so-called confidence interval.
In order to construct a confidence interval, we first have to choose a confidence level. Typically,
social scientists use a 90%, 95% or 99% confidence level.
Since we know that our sampling distribution is normal, we can start from the sample mean and
move 2 standard errors in each direction in order to construct a 95% confidence interval.
(Remember that this is the area under the normal curve listed in the table above.) For example, to
produce a 95% confidence interval around our sample mean of 0.37, we simply calculate:

$$\bar{x} \pm 2 \times SE = 0.37 \pm 2 \times \frac{0.48}{\sqrt{1000}} \approx 0.37 \pm 0.03$$

The 95% confidence interval ranges from 0.34 to 0.40. This means that we can be 95% confident
that the population mean of Labour support is between 34% and 40%.⁵ Note that this is how the
“plus-minus” or “margin of error” figures that we see in public opinion polls are calculated.
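The interval for the Labour example can be computed as follows. This is a sketch using the normal approximation. The function name `confidence_interval` and the table of critical values (1.645, 1.96 and 2.576 for the 90%, 95% and 99% levels) are additions for illustration, not part of the lecture.

```python
import math

# Standard normal critical values for common confidence levels
Z_CRIT = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}


def confidence_interval(mean, sd, n, level=0.95):
    """Return (lower, upper) bounds of a normal-approximation interval."""
    se = sd / math.sqrt(n)
    margin = Z_CRIT[level] * se   # the "plus-minus" figure
    return mean - margin, mean + margin


for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(0.37, 0.48, 1000, level)
    print(f"{level:.0%} interval: {low:.3f} to {high:.3f}")
```

A higher confidence level requires a wider interval: being more confident of capturing the population mean comes at the price of a less precise statement.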

Question: Calculating a confidence interval acknowledges our uncertainty about the true population
mean. Which factors influence how certain or uncertain we can be about our estimate of the
population mean?
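The strict, repeated-sampling reading of a 95% confidence interval can be illustrated by simulation: draw many samples from a population whose mean is known, build an interval from each, and count how often the interval contains the true mean. The sketch below assumes a true Labour share of 0.40. The number of simulated surveys and the seed are arbitrary choices.

```python
import math
import random

random.seed(5)  # arbitrary seed so the run is reproducible

TRUE_P = 0.40      # population mean, known only because we simulate
N = 1000           # respondents per survey
N_SURVEYS = 2000   # number of simulated surveys (illustrative choice)


def survey_interval(p, n, z_crit=1.96):
    """Simulate one survey and return its 95% confidence interval."""
    mean = sum(1 for _ in range(n) if random.random() < p) / n
    # For a 0/1 variable the sample variance equals mean*(1-mean)*n/(n-1).
    sd = math.sqrt(mean * (1 - mean) * n / (n - 1))
    se = sd / math.sqrt(n)
    return mean - z_crit * se, mean + z_crit * se


hits = 0
for _ in range(N_SURVEYS):
    low, high = survey_interval(TRUE_P, N)
    if low <= TRUE_P <= high:
        hits += 1

coverage = hits / N_SURVEYS
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

Each single interval either contains 0.40 or it does not. The 95% refers to the share of intervals, across repeated samples, that do.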

Significance Testing and Relationships between Variables


So far we have discussed what an observed sample mean can tell us about the unknown
population mean. However, most of the time we are not interested in means but in the
relationships between at least two variables X and Y.
Fortunately, we can apply exactly the same principles in this case. We can use the observed
relationship in the sample to infer to the unobserved relationship in the population and we can
calculate the probability of getting our sample relationship given our assumption about the
population relationship. Alternatively, we can calculate confidence intervals.
The first approach (i.e. calculating the probability of observing our sample relationship given our
assumption about the population relationship) is commonly referred to as significance testing.
To test whether an observed sample relationship is statistically significant, we proceed in three
steps (which are very similar to what we already did above):
1. We formulate a hypothesis about the relationship in the population. (This is similar to
making an assumption about the value of the population mean as we did above.)
2. We choose a significance level (e.g. 10%, 5% or 1%).
3. We decide whether the observed results should lead us to reject our hypothesis.

⁵ This is the loose interpretation of a confidence interval which is also used by Kellstedt and Whitten (2009:
129). Strictly speaking, this interpretation is incorrect. A 95% confidence interval does not tell us that there is a
95% probability that the true population mean lies within the interval. In fact, the probability that a given
confidence interval includes the population mean is either 0 or 1. Rather, we only know that, if we were to draw
an infinite number of samples, the confidence intervals of 95% of these samples would include the true
population mean.



Step 1 requires us to come up with a hypothesis about the relationship in the population. In
statistics, we typically employ the so-called null hypothesis (H0), which posits that there is no
relationship in the population.
Step 2 requires us to choose a significance level. In the social sciences, the 10%, 5% and 1%
levels are typically used.
The final step requires us to measure the relationship between X and Y in the sample and to
calculate the probability of finding such a relationship given that there is no relationship in the
population (i.e. the H0 is true). For example, if we find that the probability of finding such a
relationship just due to sampling variance is less than 0.05, we would conclude that the
relationship is statistically significant at the 5% level.
In statistical terms, we would say that p < 0.05. This so-called p-value indicates the
probability that we would find a relationship at least as strong as the one observed in our
sample if there were in fact no relationship in the population. This means that if the p-value
becomes smaller (i.e. the relationship becomes more significant), then we can be more confident
that there really is a relationship between X and Y in the population.

Question: What does a p-value of 0.01 indicate? What does a p-value of 0.10 indicate? What does a
p-value of 0.50 indicate? In which case are we most (least) likely to reject the H0?


How do we find out the p-value? We will see how this works for different statistical measures of
association during the next few weeks.
In general, we follow the same steps as above when we calculated the probability of getting a
particular sample mean given our assumption about the population mean. First, we calculate a
so-called test statistic (like the z-score). Second, we look at a table for that test statistic to find out
the probability with which we would observe such a test statistic given that the H0 is true.
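For a test statistic that follows the standard normal distribution, the table lookup can be replaced by a short calculation. The sketch below is an illustration under that assumption. Other test statistics have their own distributions and tables, and the function names are hypothetical.

```python
import math

LEVELS = (0.10, 0.05, 0.01)  # significance levels commonly used


def two_sided_p_value(z):
    """Probability of a z-score at least this extreme if H0 is true."""
    return math.erfc(abs(z) / math.sqrt(2))


def significant_at(z):
    """Return the conventional levels at which H0 would be rejected."""
    p = two_sided_p_value(z)
    return [alpha for alpha in LEVELS if p < alpha]


for z in (1.0, 2.0, 3.0):
    print(f"z = {z}: p = {two_sided_p_value(z):.4f}, "
          f"significant at {significant_at(z)}")
```

The further the test statistic lies from zero, the smaller the p-value and the stricter the significance level at which the H0 can be rejected.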

Question: Which factors determine the level of statistical significance (i.e. the p-value)?

Finally, note that if we conclude that a relationship between X and Y is statistically significant,
this does not mean that the relationship is important or strong. It simply means that we can be
relatively certain that there is a relationship between the two variables. Thus, statistical
significance is a measure of certainty, not of the importance or strength of a relationship.
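The difference between significance and strength can be made concrete with a hypothetical example: the same small difference of one percentage point between a sample mean and an assumed population mean, tested once with 1,000 and once with 100,000 respondents. All figures below are invented for illustration, and the test again uses the normal approximation.

```python
import math


def two_sided_p(diff, sd, n):
    """Two-sided p-value for a difference between sample and assumed mean."""
    z = diff / (sd / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


DIFF = 0.01   # hypothetical difference: one percentage point
SD = 0.48     # sample standard deviation, as in the Labour example

p_small_sample = two_sided_p(DIFF, SD, 1_000)
p_large_sample = two_sided_p(DIFF, SD, 100_000)

print(f"n =   1,000: p = {p_small_sample:.3f}")
print(f"n = 100,000: p = {p_large_sample:.2e}")
```

The difference is equally small in both cases, yet it is far from significant in the smaller sample and highly significant in the larger one. The p-value reflects how certain we are that a difference exists, not how large it is.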

References
Kellstedt, Paul M., and Guy D. Whitten. 2009. The Fundamentals of Political Science Research. Cambridge: Cambridge University Press, pp. 120–145.
