
Department of Government, University of Essex

**Confidence and Significance**

Inferential Statistics [1]

We are typically not interested in the characteristics of our sample per se, but rather in what we can learn from our sample about the characteristics of the population.

Thus, we try to infer from the observed statistics in our sample (e.g. the sample mean) to the unknown parameters of the population (e.g. the population mean).

We can do this in two (closely related) ways: performing a significance test or constructing a confidence interval.

Significance Testing

Significance testing involves calculating the probability of observing our sample mean given our assumption about the population mean.

Let’s look at an example to see how this works. Suppose we assume that 40% of the voting-age population intends to vote for the Labour party. To test this assumption, we conduct a survey based on a random sample of 1000 individuals drawn from this population. We find that only 37% of the respondents intend to vote for the Labour party.

How likely is it that we get a sample mean of 0.37 (just because of sampling variance) given that the population mean is 0.40? (Note that if we treat those who intend to vote for Labour as 1’s and the others as 0’s, then the sample mean of the dummy variable will be 0.37 and its standard deviation will be 0.48.)
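The figures in the note above can be checked with a short Python sketch. The list `responses` is a hypothetical recoding of the survey (370 ones and 630 zeros), not the actual data set.

```python
from statistics import mean, pstdev

# Hypothetical recoding of the survey: 370 respondents coded 1 (Labour),
# 630 respondents coded 0 (everyone else).
responses = [1] * 370 + [0] * 630

sample_mean = mean(responses)
# pstdev divides by n; with n = 1000 the n - 1 version is almost identical.
sample_sd = pstdev(responses)

print(round(sample_mean, 2), round(sample_sd, 2))  # 0.37 0.48
```

For a 0/1 variable the standard deviation is simply the square root of p(1 - p), here the square root of 0.37 × 0.63.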

To answer this question, we can exploit the fact that, if we were to draw a very large number of random samples of the same size, the sample means would be approximately normally distributed around the population mean. Provided the samples are reasonably large, this holds even if the underlying variable is not normally distributed. [2]
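A small simulation can make this concrete. The sketch below assumes a population in which 40% intend to vote Labour and repeatedly draws samples of 1000; the seed, the number of draws and all names are arbitrary choices made for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(1)  # arbitrary seed so the sketch is reproducible

POP_MEAN = 0.40  # assumed population share
N = 1000         # respondents per sample
DRAWS = 2000     # number of simulated samples

def draw_sample_mean():
    """Draw one random sample of 0/1 answers and return its mean."""
    return sum(random.random() < POP_MEAN for _ in range(N)) / N

sample_means = [draw_sample_mean() for _ in range(DRAWS)]

centre = mean(sample_means)    # should be close to POP_MEAN
spread = pstdev(sample_means)  # standard deviation of the sample means
# Share of sample means within 2 standard deviations of the centre.
within_2 = sum(abs(m - centre) <= 2 * spread for m in sample_means) / DRAWS

print(round(centre, 3), round(spread, 4), round(within_2, 3))
```

Although each individual answer is either 0 or 1, the simulated sample means cluster symmetrically around 0.40, and roughly 95% of them fall within two standard deviations of the centre.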

The standard deviation of the sampling distribution is defined as

$$SE = \frac{s}{\sqrt{n}}$$

where s is the standard deviation of the variable in our sample and n is the sample size. The standard deviation of the sampling distribution of sample means, SE, is typically called the standard error of the sample mean or simply the standard error.
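As a minimal sketch, the standard error can be written as a one-line Python function. The function name is my own, and the sample sizes other than 1000 are hypothetical values added to show how the standard error shrinks as n grows.

```python
from math import sqrt

def standard_error(s, n):
    """Standard error of the sample mean: s divided by the square root of n."""
    return s / sqrt(n)

# s = 0.48 is the sample standard deviation from the Labour example.
for n in (100, 1000, 10000):
    print(n, round(standard_error(0.48, n), 4))

se = standard_error(0.48, 1000)  # roughly 0.015
```

Because n enters under a square root, a sample must be four times as large to halve the standard error.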

Remember that we can calculate the area under the curve of the normal distribution by calculating z-scores and looking up the corresponding area in a table:

| Part of the curve | Proportion of area under the curve |
| --- | --- |
| Mean ± 1 SD | About 68% |
| Mean ± 2 SD | About 95% [3] |
| Mean ± 3 SD | About 99.7% |
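Instead of a printed table, the same areas can be computed with `NormalDist` from Python's standard library. This is only a sketch; the helper name `area_within` is my own.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3, 1.96):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
# 1.96 0.95
```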

[1] These notes draw on the lecture slides and Kellstedt and Whitten (2009).

[2] The following website illustrates this: http://onlinestatbook.com/stat_sim/sampling_dist/index.html

[3] In order to get an area of exactly 95%, we need to move 1.96 (and not 2) standard deviations below and above the mean. However, we often use 2 as an approximation.

GV207 – Political Analysis, Week 05


Since sample means are normally distributed around the population mean, we know how likely it is to get a certain sample mean given our assumption about the population mean. If our sample mean is very far away from our assumed population mean, it is more likely that our assumption about the value of the population mean is wrong.

Remember that we calculated the z-score of an observation’s deviation from the sample mean in the following way:

$$z = \frac{x_i - \bar{x}}{s}$$

Similarly, we can calculate the z-score, which measures the sample mean’s deviation from the assumed population mean: [4]

$$z = \frac{\bar{x} - \mu}{SE}$$

In this case, $\bar{x}$ is our sample mean, μ is the assumed population mean, and SE is the standard error of the sample mean.

In our example of the electoral support for the Labour party, we know that the sample mean is 0.37, the sample standard deviation is 0.48, the sample size is 1000, and the assumed population mean is 0.40. Using this information, we only need to figure out the standard error of the sample mean to calculate the z-score.

$$SE = \frac{s}{\sqrt{n}} = \frac{0.48}{\sqrt{1000}} \approx 0.015$$

Knowing the standard error of the sample mean, we can now calculate the z-score:

$$z = \frac{\bar{x} - \mu}{SE} = \frac{0.37 - 0.40}{0.015} \approx -2$$

The result is roughly -2. Thus, if we look at the sampling distribution around the assumed population mean of 0.40, our sample mean of 0.37 lies about 2 standard errors below the mean.

Knowing this, we can conclude that only about 5% of the samples drawn from a population with mean 0.40 should have a sample mean that is that far away (in either direction) from the population mean.
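The worked example can be reproduced in a few lines of Python. This is a sketch that uses the normal approximation; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, assumed_mean = 0.37, 0.40
s, n = 0.48, 1000

se = s / sqrt(n)                       # standard error, about 0.015
z = (sample_mean - assumed_mean) / se  # about -1.98, i.e. roughly -2

# Two-sided probability: share of samples at least this far from the
# assumed population mean in either direction.
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(round(se, 3), round(z, 2), round(p_two_sided, 3))
```

Without rounding the standard error, the z-score is about -1.98 and the two-sided probability is just under 5%, which matches the conclusion in the text.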

Using statistical jargon, we would conclude that the difference between our sample mean and the assumed population mean is significant at the 5% level. Thus, we would reject the assumption that the population mean is equal to 0.40 at the 5% significance level.

Question: Which factors influence by how many standard errors our sample mean will deviate from the assumed population mean?

[4] Strictly speaking, we do not calculate a z-score, but something called a t-score, since we don’t know the population variance in our example. However, I have used the z-score notation to emphasise that this is similar to what we did last week.


Confidence Intervals

In many cases, we do not have a strong expectation about the population mean. Thus, we may want to use our sample mean as an estimate of the unknown population mean.

However, drawing inferences from the observed sample mean to the unknown population mean involves a degree of uncertainty. In fact, our sample mean is very unlikely to exactly match the true population mean, but is likely to be a bit smaller or larger. In order to acknowledge the uncertainty associated with our estimate, we can construct a so-called confidence interval.

In order to construct a confidence interval, we first have to choose a confidence level. Typically, social scientists use a 90%, 95% or 99% confidence level.

Since we know that our sampling distribution is normal, we can start from the sample mean and move 2 standard errors in each direction in order to construct a 95% confidence interval. (Remember that this is the area under the normal curve listed in the table above.) For example, to produce a 95% confidence interval around our sample mean of 0.37, we simply calculate:

$$\bar{x} \pm 2 \times SE = 0.37 \pm 2 \times 0.015 = [0.34,\ 0.40]$$

The 95% confidence interval ranges from 0.34 to 0.40. This means that we can be 95% confident that the population mean of Labour support is between 34% and 40%. [5] Note that this is how the “plus-minus” or “margin of error” figures that we see in public opinion polls are calculated.
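The interval can be computed directly, as in the following sketch. The function name and the `multiplier` parameter are illustrative; a multiplier of 2 follows the approximation used in these notes, while 1.96 gives the more exact 95% interval.

```python
from math import sqrt

sample_mean, s, n = 0.37, 0.48, 1000
se = s / sqrt(n)

def confidence_interval(mean, se, multiplier=1.96):
    """Return (lower, upper) bounds: mean plus/minus multiplier * se."""
    return mean - multiplier * se, mean + multiplier * se

lower, upper = confidence_interval(sample_mean, se, multiplier=2)
print(round(lower, 2), round(upper, 2))  # 0.34 0.4
```

The half-width of the interval, here about 0.03, is the margin of error that pollsters report as "plus or minus 3 percentage points".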

Question: Calculating a confidence interval acknowledges our uncertainty about the true population mean. Which factors influence how certain or uncertain we can be about our estimate of the population mean?
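One way to explore this question is to vary the inputs and watch the margin of error change. The sketch below is an illustration under the normal approximation; `margin_of_error` is a hypothetical helper, and the alternative sample size is made up.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(s, n, level=0.95):
    """Half-width of a confidence interval at the given confidence level."""
    # Number of standard errors needed to cover the central `level` share
    # of the normal curve (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * s / sqrt(n)

for level in (0.90, 0.95, 0.99):
    print(level, round(margin_of_error(0.48, 1000, level), 3))

for n in (1000, 4000):
    print(n, round(margin_of_error(0.48, n), 3))
```

A higher confidence level widens the interval, while a larger sample narrows it.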

Significance Testing and Relationships between Variables

So far we have discussed what an observed sample mean can tell us about the unknown population mean. However, most of the time we are not interested in means but in the relationships between at least two variables X and Y.

Fortunately, we can apply exactly the same principles in this case. We can use the observed relationship in the sample to infer to the unobserved relationship in the population and we can calculate the probability of getting our sample relationship given our assumption about the population relationship. Alternatively, we can calculate confidence intervals.

The first approach (i.e. calculating the probability of observing our sample relationship given our assumption about the population relationship) is commonly referred to as significance testing.

To test whether an observed sample relationship is statistically significant, we proceed in three steps (which are very similar to what we already did above):

1. We formulate a hypothesis about the relationship in the population. (This is similar to making an assumption about the value of the population mean as we did above.)

2. We choose a significance level (e.g. 10%, 5% or 1%).

3. We decide whether the observed results should lead us to reject our hypothesis.

[5] This is the loose interpretation of a confidence interval which is also used by Kellstedt and Whitten (2009: 129). Strictly speaking, this interpretation is incorrect. A 95% confidence interval does not tell us that there is a 95% probability that the true population mean lies within the interval. In fact, the probability that a given confidence interval includes the population mean is either 0 or 1. Rather, we only know that, if we were to draw an infinite number of samples, the confidence intervals of 95% of these samples would include the true population mean.
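The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a true population mean of 0.40, draws many samples of 1000, and counts how often the 95% interval built from each sample contains the true value; the seed and the number of draws are arbitrary.

```python
import random
from math import sqrt

random.seed(7)  # arbitrary seed so the sketch is reproducible

POP_MEAN, N, DRAWS = 0.40, 1000, 2000

def interval_covers_truth():
    """Draw one sample and report whether its 95% interval contains POP_MEAN."""
    m = sum(random.random() < POP_MEAN for _ in range(N)) / N
    # For 0/1 data the sample variance (n - 1 version) is m(1 - m) * N / (N - 1),
    # so the standard error simplifies to the expression below.
    se = sqrt(m * (1 - m) / (N - 1))
    return m - 1.96 * se <= POP_MEAN <= m + 1.96 * se

coverage = sum(interval_covers_truth() for _ in range(DRAWS)) / DRAWS
print(round(coverage, 3))
```

Each individual interval either contains 0.40 or it does not; it is the procedure that captures the true mean in about 95% of the samples.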


Step 1 requires us to come up with a hypothesis about the relationship in the population. In statistics, we typically employ the so-called null hypothesis (H0), which posits that there is no relationship in the population.

Step 2 requires us to choose a significance level. In the social sciences, the 10%, 5% and 1% levels are typically used.

The final step requires us to measure the relationship between X and Y in the sample and to calculate the probability of finding such a relationship given that there is no relationship in the population (i.e. the H0 is true). For example, if we find that the probability of finding such a relationship just due to sampling variance is less than 0.05, we would conclude that the relationship is statistically significant at the 5% level.

In statistical terms, we would say that p < 0.05. This so-called p-value indicates the probability that we would find a relationship at least as strong as the one observed in our sample if there were in fact no relationship in the population. Thus, the smaller the p-value (i.e. the more significant the relationship), the more confident we can be that there really is a relationship between X and Y in the population.
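The logic of a p-value can be illustrated with a permutation test, which is a simulation-based technique and not the table-based procedure used in this module. The data below are synthetic and purely illustrative: Y equals 1 for 62 of 100 cases with X = 1 and for 45 of 100 cases with X = 0. Shuffling the Y values across the two groups mimics a world in which the H0 is true.

```python
import random

random.seed(3)  # arbitrary seed so the sketch is reproducible

# Synthetic data, invented for illustration only.
y_group_1 = [1] * 62 + [0] * 38  # cases with X = 1
y_group_0 = [1] * 45 + [0] * 55  # cases with X = 0

def diff_in_shares(a, b):
    """Difference between the share of 1's in a and the share of 1's in b."""
    return sum(a) / len(a) - sum(b) / len(b)

observed = diff_in_shares(y_group_1, y_group_0)

pooled = y_group_1 + y_group_0
n1 = len(y_group_1)
SHUFFLES = 5000
as_extreme = 0
for _ in range(SHUFFLES):
    random.shuffle(pooled)  # break any link between X and Y
    simulated = diff_in_shares(pooled[:n1], pooled[n1:])
    # Small tolerance guards against floating-point rounding in the comparison.
    if abs(simulated) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / SHUFFLES
print(round(observed, 2), round(p_value, 3))
```

The resulting share of shuffles that produce a difference at least as large as the observed one is an estimate of the p-value.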

Question: What does a p-value of 0.01 indicate? What does a p-value of 0.10 indicate? What does a p-value of 0.50 indicate? In which case are we most (least) likely to reject the H0?

How do we find out the p-value? We will see how this works for different statistical measures of association during the next few weeks.

In general, we follow the same steps as above when we calculated the probability of getting a particular sample mean given our assumption about the population mean. First, we calculate a so-called test statistic (like the z-score). Second, we look at a table for that test statistic to find out the probability with which we would observe such a test statistic given that the H0 is true.
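For a z-score, the table lookup can be replaced by a short computation. The sketch below derives the two-sided critical values that correspond to the usual significance levels; the function name is my own, and other test statistics (such as the t-score) have their own distributions and tables.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Two-sided critical z-value for significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(critical_value(alpha), 3))
# 0.1 1.645
# 0.05 1.96
# 0.01 2.576
```

A test statistic whose absolute value exceeds the critical value is significant at the corresponding level.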

Question: Which factors determine the level of statistical significance (i.e. the p-value)?

Finally, note that if we conclude that a relationship between X and Y is statistically significant, this does not mean that the relationship is important or strong. It simply means that we can be relatively certain that there is a relationship between the two variables. Thus, statistical significance is a measure of certainty, not of the importance or strength of a relationship.
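The difference between significance and strength can be seen in a sketch with two hypothetical scenarios, returning to the simpler case of a sample mean. All numbers below are invented for illustration: a tiny deviation from the assumed mean in a huge sample, and a large deviation in a small sample.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(sample_mean, assumed_mean, s, n):
    """Two-sided p-value for a sample mean under the normal approximation."""
    z = (sample_mean - assumed_mean) / (s / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))

# Hypothetical scenario 1: a gap of 0.1 percentage points, n = 2,000,000.
small_gap_huge_n = two_sided_p(0.399, 0.40, 0.49, 2_000_000)
# Hypothetical scenario 2: a gap of 10 percentage points, n = 50.
large_gap_small_n = two_sided_p(0.30, 0.40, 0.46, 50)

print(round(small_gap_huge_n, 4), round(large_gap_small_n, 3))
```

The negligible gap is significant at the 1% level, while the much larger gap is not even significant at the 10% level, because the sample size drives the standard error.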

References

Kellstedt, Paul M., and Guy D. Whitten. 2009. The Fundamentals of Political Science Research. Cambridge: Cambridge University Press, pp. 120–145.
