
Unit 6: Introduction to Inference

Chapter 6 in IPS
Unit 6 Outline: Introduction to Inference

• Confidence Intervals

• Hypothesis Tests

• Power and Sample Size

Confidence Intervals - General Idea
We want to make a statement (inference) about a population parameter
(e.g. μ or p; an unknown value) using information from observed sample
data (a statistic; an estimate such as x̄ or p̂)

The general form of a CI is: Estimate ± margin of error

Estimate ± (z-value)×(SD of estimate)

Usually written as: (Lower confidence limit, Upper confidence limit)

(Est. – z*×(SD of est.), Est. + z*×(SD of est.))

Example: σ is known to be 5, and we sample x to be 18:

18 ± 1.96×(5) = (8.2, 27.8)

Example: Confidence interval for population
mean rent in Cambridge
• Suppose we wish to make a statement about rent
of one-bedroom apartments in Cambridge
• We take a random sample of properties; perhaps
from the property listings at Cambridge Property
Tax Office, perhaps from realty listings (not
ideal)
– keeping in mind good sampling principles of Stat
S100 as we collect the apartment rents in the sample

Regardless of the population distribution from which the individual random
variables are drawn, the sample mean is approximately normally distributed
with mean μ and standard dev. (σ/√n). Approximation improves with increasing n.

Population of rents: mean = μ, sd = σ … a bit right-skewed

Apt           Random Variable   Pop. Mean   Pop. SD   Observed Values (rents)
1             X1                µ           σ         x1 = $920
2             X2                µ           σ         x2 = $800
…             …                 …           …         …
n             Xn                µ           σ         xn = $1500
Sample Mean   X̄                 µ           σ/√n      x̄

X̄ has a N(μ, σ/√n) distribution
Logic behind confidence intervals
• Draw a sample of 100 rents X1,…X100 from 1-BR apartments, and
calculate the sample mean (say $1250).
• Assume the standard deviation across individual apartment rents in the
population is σ = 300. What is the standard deviation of the sample
mean?
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{100}} = \frac{300}{10} = 30 \]
• Thus with 95% probability, x̄ will lie within plus or minus 1.96(σ/√n) =
1.96(30) = 58.8 dollars of the true population mean price, μ.
• So we say with 95% confidence, the unknown mean, μ, will lie within
plus or minus 58.8 dollars of the sample mean (x̄)
• We express this by saying that the 95% confidence interval for the true
population mean rent (1-BR Cambridge Apts) is:
(1250 – 58.8, 1250 + 58.8) or ($1,191.20, $1,308.80)
• So why a confidence interval and not a probability interval? Because
only random things have probability. And even though we are
estimating the unknown μ, it is a fixed number. What was random is
the sampling procedure we used to calculate x̄.
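
The rent interval above can be reproduced with a few lines of Python. This is only a sketch: the function name `z_interval` is made up for illustration, and it assumes σ is known, as on the slide.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, level=0.95):
    """Return (lower, upper) for a z confidence interval with known sigma."""
    # z* leaves (1 - level) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Values from the Cambridge rent example: x-bar = 1250, sigma = 300, n = 100.
lower, upper = z_interval(xbar=1250, sigma=300, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Rounded to cents, the limits agree with the ($1,191.20, $1,308.80) interval on the slide.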
More generally, the 95% confidence interval for the population mean is:

\[ \left( \bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}} \right) \]

IPS Figure 6.3 shows the confidence intervals based on 25 random samples.
Note that all but 1 contain the true mean μ
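
The idea behind that figure can be imitated with a small simulation. The sketch below is not the IPS example: the true mean of 1200, the normal population, the seed, and the function name `coverage` are all assumptions chosen for illustration. It draws many samples, builds a 95% interval from each, and counts how often the interval contains μ.

```python
import random
from math import sqrt


def coverage(mu, sigma, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    half_width = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One sample of size n from a normal population (an assumption).
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage(mu=1200, sigma=300, n=100, trials=2000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The fraction should land close to 0.95, in the same way that roughly 24 of the 25 intervals in the figure contain μ.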

Varying the level of confidence
Suppose we know σ, the population standard deviation, but not the mean μ,
and have drawn a random sample of size n from the population to estimate μ.
The 95% confidence interval is:

\[ \left( \bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}} \right) \]

Can we be more confident? Suppose z* and C are related as in Figure 6.4
from IPS (the area under the standard normal curve between −z* and z* is C).
We can say that the sample mean will be within z*σ/√n of the population
mean with 100·C% confidence.

A confidence interval for the mean with confidence level C will have the form:

\[ \left( \bar{x} - z^{*}\,\frac{\sigma}{\sqrt{n}},\; \bar{x} + z^{*}\,\frac{\sigma}{\sqrt{n}} \right) \]
A few values of z* and C
• These are the most common confidence
coefficients and z* values.

z*    1.645    1.960    2.576
C     90%      95%      99%
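
These z* values come from the standard normal quantile function. A minimal sketch using Python's standard library (the helper name `z_star` is illustrative, not from IPS):

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical value z* with central area `confidence` under N(0, 1)."""
    # The upper tail beyond z* has area (1 - confidence) / 2.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```

The same helper gives z* for any other confidence level, for example 0.80 or 0.999.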


Some alternative notation on the next slide

Alternative common notation
In many texts, the region of area C is instead labeled with area
1 − α. The two regions in the tails then each have area α/2. The
z point to the right (labeled z* so far) is then denoted by z(1 − α/2).

The formula for the confidence interval becomes

\[ \left( \bar{x} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \right) \]

In this case, 1 − α is the confidence coefficient, and α is the error rate.
Controlling the margin of error
The margin of error m in a confidence interval for the mean is

\[ m = z^{*}\,\frac{\sigma}{\sqrt{n}} \]

In the rents example, m = 58.8

If we want to choose a sample size n to get a given margin of error, we
solve for n:

\[ n = \left( \frac{z^{*}\sigma}{m} \right)^{2} \]

How many apartments should have been sampled to have a margin of error of ± $20?

\[ n = \left( \frac{1.96 \times 300}{20} \right)^{2} = 864.36 \;\rightarrow\; 865 \]
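
The sample-size formula is easy to wrap in a helper. This is a sketch under the slide's assumptions (σ known, z* = 1.96 for 95% confidence); the name `sample_size` is illustrative.

```python
from math import ceil


def sample_size(sigma, margin, z_star=1.96):
    """Smallest n for which z* * sigma / sqrt(n) is at most `margin`."""
    # Always round up: rounding down would give a margin slightly too large.
    return ceil((z_star * sigma / margin) ** 2)


# Rent example: sigma = 300, desired margin of error = $20.
print(sample_size(sigma=300, margin=20))
```

Because n enters through √n, halving the margin of error roughly quadruples the required sample size.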

Looking Ahead: Confidence Intervals for
a Population Mean, unknown SD
• So what are we doing with these confidence interval
calculations?
– We are trying to determine where the true unknown mean, μ,
is based on a sample mean, x̄
– But this calculation (so far) assumes we know the true population
standard deviation: σ
– This doesn’t often happen in real life. If we are trying to estimate
μ, we will also probably have to estimate σ .
– What’s our sample-based estimate of σ ?
– s
– This throws off everything. The calculation is no longer based on
a normal distribution, but a t-distribution. More on this after the
midterm…

Statistical Decision Making
(aka Hypothesis Testing)
• A discussion of this material can be found in IPS, 6.2
• Also called Significance Testing or Testing Statistical Hypotheses
• We look at it from two aspects:
– Some general principles (motivated by examples)
– Specific forms of tests in commonly arising situations
(comparing two groups, cross-classified data, regression
considered more formally)
• Today’s Example: Cambridge officials claim that rents have not
changed since the year 2000 when the true population average rent
was reported to be $1200.
– Does the data support the claim?

Testing a hypothesis about
a population mean
1. Formulate a Null Hypothesis and an Alternative Hypothesis
• The null hypothesis (H0) assumes a distribution for the population
that reflects no change from the past, or that nothing interesting is
going on. If the null hypothesis is true, any discrepancy between the
observed data and the hypothesized distribution is due only to
chance variation (Cambridge 1-BR Rent H0: μ = 1200)
• The alternative hypothesis (HA) states that there is a real difference
between the distribution of the observed data and the null-
hypothesized distribution. An observed discrepancy between the
observed data and the null-hypothesis distribution is not due to
chance variation. (Cambridge 1-BR Rent HA: μ ≠ 1200)
• We set things up this way because it is easier to disprove
something than prove it (we are usually hoping to disprove H0)

Testing a hypothesis…(cont.)
2. Calculate the value of the test statistic on which the test will be
based.
• The test statistic measures the difference between the observed
data and what is expected given the null hypothesis is true.
• The test statistic answers the question, “How many standard
deviations from the hypothesized value is the observed sample
value?”
• Almost always of the form
\[ \frac{(\text{observed statistic}) - (\text{its expected value})}{(\text{standard deviation of statistic})} \]
• When will this value be large (in magnitude)? When will it be
close to zero?

Testing a hypothesis…(cont.)

3. Find the probability of getting this test statistic or a more
extreme one if the null hypothesis were true.
• This is called the p-value (stands for probability-value)
• The smaller the p-value, the stronger the evidence against
the null hypothesis.

4. Come to a conclusion about your hypotheses.
• If the p-value is as small as or smaller than the pre-specified
level of the test, alpha (α), usually 0.05, we say the result
is statistically significant at level α.

Testing a hypothesis about μ
(when σ is known)
• Collect a simple random sample from a population with
– Unknown mean μ
– Known standard deviation σ (rarely happens, but good place to start)
• If the population is normal or the sample size is large enough then
– The sampling distribution of the sample mean (x̄) is:
• Approximately Normal
• With mean μ_x̄ = μ
• Standard deviation σ_x̄ = σ/√n
This is equivalent to saying that the standardized sample mean has a sampling
distribution that is approximately standard normal. This will be the test
statistic. It will be computed under the assumption that the null hypothesis is
true (so μ is set to the null value μ0). The p-value will be calculated under
this null sampling distribution.
Mechanics of testing
(1-sided vs. 2-sided alternatives)
σ known
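Reject H0: µ = µ0 in favor of the one-sided
alternative HA: µ > µ0 whenever
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z^{*} \]

Reject H0: µ = µ0 in favor of the one-sided
alternative HA: µ < µ0 whenever
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z^{*} \]

Reject H0: µ = µ0 in favor of the two-sided
alternative HA: µ ≠ µ0 whenever
\[ |Z| = \left| \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right| > z^{*} \]

Here z* is the critical value for the chosen level α: the upper α point of the
standard normal for a one-sided test, the upper α/2 point for the two-sided test.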
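
The three rejection rules can be written as one small function. This is a sketch, not a library routine: the name `z_test_reject` and the labels "greater", "less", and "two-sided" are illustrative choices, and σ is assumed known.

```python
from math import sqrt
from statistics import NormalDist


def z_test_reject(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for a z test of H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # HA: mu > mu0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # HA: mu < mu0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":    # HA: mu != mu0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject


# Rent example values: x-bar = 1250, mu0 = 1200, sigma = 300, n = 100.
for alt in ("greater", "less", "two-sided"):
    z, reject = z_test_reject(1250, 1200, 300, 100, alternative=alt)
    print(f"{alt:>9}: z = {z:.2f}, reject H0: {reject}")
```

Note that the same z can lead to different decisions, because a one-sided test puts all of α in one tail while the two-sided test splits it.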
Connection between Hypothesis Tests
and Confidence Intervals
• There is a close connection between confidence intervals and
two-sided tests of hypotheses
– A level α two-sided significance test rejects a hypothesis
H0: μ = μ0 exactly when the value μ0 falls outside a level
(1 − α) confidence interval for μ
– Rationale (use .95 (95%) confidence level, α = 0.05 level
test to be concrete): Suppose a value μ0 falls outside a
95% confidence interval. Then we are 95% confident that μ0 is
not consistent with the data. This is equivalent to saying
that the data are not consistent with the hypothesis
H0: μ = μ0 at the 5% significance level.
– Note: α and the confidence level should add to one
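
This equivalence can be checked numerically. The sketch below uses the rent example's summary values and a handful of candidate null values μ0 picked purely for illustration; the helper names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n, alpha = 1250, 300, 100, 0.05  # rent example values
z_star = NormalDist().inv_cdf(1 - alpha / 2)
se = sigma / sqrt(n)
lower, upper = xbar - z_star * se, xbar + z_star * se


def rejects(mu0):
    """Two-sided level-alpha z test of H0: mu = mu0."""
    return abs((xbar - mu0) / se) > z_star


def outside(mu0):
    """True when mu0 falls outside the (1 - alpha) confidence interval."""
    return not (lower <= mu0 <= upper)


candidates = [1150, 1190, 1200, 1300, 1310]  # illustrative null values
agree = all(rejects(m) == outside(m) for m in candidates)
print(f"test and interval agree for every candidate: {agree}")
```

The two checks are algebraically the same inequality, so they agree for any μ0, not only the ones listed.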

Cambridge 1-BR Rents Example as
a Hypothesis Test
The true population average rent in Cambridge in 2000
was $1200. Let’s use the data that we collected to test
whether there is evidence of a change in the average rent
since 2000.

Follow these 4 easy steps:


1. Set up your hypotheses (and α-level)
2. Gather your data and calculate the test statistic
3. Calculate the p-value (based on the appropriate sampling
distribution)
4. Come to a conclusion about your hypotheses

Cambridge Rent Data

[Figure: histogram of rent; x-axis: rent (500 to 2500), y-axis: Percent (0 to 15)]

. summarize rent

    Variable |   Obs     Mean   Std. Dev.    Min     Max
-------------+-------------------------------------------
        rent |   100     1250     316.522    730    2400
Solution…
As a 2-sided test:
1. H0: μ = 1200
   HA: μ ≠ 1200
   α = 0.05
2. \[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{1250 - 1200}{300/\sqrt{100}} = 1.67 \]
3. p-value = P(|Z| > 1.67) = 2P(Z > 1.67) = 2(1 − 0.9525) = 2(0.0475) = 0.0950
4. We cannot reject the null hypothesis. There is not enough evidence to
   suggest that 1-BR rents have changed since 2000.

As a 1-sided test:
1. H0: μ = 1200
   HA: μ > 1200
   α = 0.05
2. \[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{1250 - 1200}{300/\sqrt{100}} = 1.67 \]
3. p-value = P(Z > 1.67) = 1 − 0.9525 = 0.0475
4. We can reject the null hypothesis. It appears that the average rent for
   1-BR apartments has increased since 2000.

Most of this can be done in Stata…
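
For readers without Stata, the same p-values can be sketched in Python with the standard library. The variable names are illustrative. The code keeps z unrounded, so its p-values come out a shade larger than the table-based 0.0475 and 0.0950, which use z rounded to 1.67.

```python
from math import sqrt
from statistics import NormalDist

# Cambridge rent example: x-bar = 1250, mu0 = 1200, sigma = 300, n = 100.
xbar, mu0, sigma, n = 1250, 1200, 300, 100

z = (xbar - mu0) / (sigma / sqrt(n))
upper_tail = 1 - NormalDist().cdf(z)   # P(Z > z) under H0

p_one_sided = upper_tail               # HA: mu > 1200
p_two_sided = 2 * upper_tail           # HA: mu != 1200

print(f"z = {z:.3f}")
print(f"one-sided p-value = {p_one_sided:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
```

The conclusions match the slide: the one-sided p-value is below 0.05 and the two-sided p-value is above it.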

Lecture Outline
• Confidence Intervals

• Hypothesis Tests

• Power and Sample Size

Error rates in hypothesis testing
• In the Cambridge Rent example, the null hypothesis to be tested
was H0: μ = $1200, with the two-sided alternative HA: μ ≠ $1200.
• Think of this as strictly a decision problem with only two
possibilities for the decision maker, H0 or HA. Because of the lack
of complete predictability (i.e., the presence of randomness in the
sample), all four of the following branches are possible (before
actually doing the study):

                 H0 true             HA true
Conclude H0      Correct Decision    Error
Conclude HA      Error               Correct Decision

Terminology on types of errors

Error rates in hypothesis testing….

• P(type I error) = P(conclude HA when H0 is true) is labeled α
…this is what we set to 0.05, typically
• We have been using this implicitly already.
– Suppose we reject a null hypothesis whenever the p-value is
less than 0.05.
– The p-value is calculated assuming the null hypothesis is
true, so if the null hypothesis really is true, we will reject it
less than 5% of the time.
• The approach we use controls P(type I error), but we have
been silent about the probability of a type II error. More on
that later.

Error rates in hypothesis testing,
Statistical Power
• P(type II error) = P(Conclude H0 when HA is true) is
typically labeled β
• In the decision-theoretic approach to hypothesis testing,
– the type I error rate α is fixed (usually 0.05)
– 1 − β = P(Conclude HA | HA is true)
= P(correct decision when HA is true)
1 − β is called the statistical power of the test. Computing
this probability can be subtle, and depends closely on
the particular problem

Note: Power is always between 0 and 1…it’s a probability

Statistical Power
• Picture to the left is from IPS, ex.
6.17 (don’t worry about #’s)
• Statistical power is something that
is calculated before gathering data.
–Analogous to calculating sample
size to obtain desired margin of
error in confidence interval.

3 things to think about:


• What happens to power as the mean
under the alternative is further from
the null hypothesis?
• What happens to power as standard
deviation increases?
• What happens to power as sample
size increases?
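
One way to explore these three questions is to compute power directly for a two-sided z test. The sketch below is not taken from IPS: the function name `power_two_sided` is hypothetical, and the baseline numbers (null mean 1200, alternative 1250, σ = 300, n = 100) are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of the two-sided z test of H0: mu = mu0 when the true mean is mu_alt."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is N(shift, 1).
    shift = (mu_alt - mu0) / (sigma / sqrt(n))
    upper = 1 - std_normal.cdf(z_star - shift)   # P(Z > z*)
    lower = std_normal.cdf(-z_star - shift)      # P(Z < -z*)
    return upper + lower


base = power_two_sided(mu0=1200, mu_alt=1250, sigma=300, n=100)
farther = power_two_sided(mu0=1200, mu_alt=1300, sigma=300, n=100)
noisier = power_two_sided(mu0=1200, mu_alt=1250, sigma=600, n=100)
bigger_n = power_two_sided(mu0=1200, mu_alt=1250, sigma=300, n=400)

print(f"baseline:            {base:.3f}")
print(f"alternative farther: {farther:.3f}")
print(f"larger sigma:        {noisier:.3f}")
print(f"larger n:            {bigger_n:.3f}")
```

In this sketch, power rises when the alternative mean moves away from the null or when n grows, and falls when σ grows.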

Has the cost of living changed in Maine?
• The US Department of Housing and Urban Development publishes a
table of Fair Market Rents (FMR) by state to set amounts paid in
voucher programs, Section 8 housing, and other programs
• The data used here oversimplify the technical details in the
definition of FMR, but are a good illustration.
• In 1997, the FMR for a two-bedroom apartment in Maine was $590.
• Suppose we wish to do a study to examine how rents in Maine
compare to FMR. We’ll collect a random sample of rents of two-bedroom
apartments.
• More specifically, we would like to 'decide' (in 1997), based on data,
whether or not rents in Maine are different from the fair market rent
• We’ll eventually use a data set that is a random sample of rents for 32
two-bedroom apartments in Maine in 1997
• Start with some pre-data calculations…

Before data are collected
• Suppose we 'know' (perhaps from previous studies) that the
standard deviation for the rents of all two-bedroom apartments in
Maine is σ = $72
• Suppose we wish to design our study so that the margin of error
in a 95% confidence interval for the mean is $25, that is, the
confidence interval will be (approximately) of the form: sample
mean ± $25
– How large should our sample of rents be?
• Recall the formula for margin of error (m) in a confidence
interval for a mean:
\[ m = z^{*}\,\frac{\sigma}{\sqrt{n}} \]
• From this we can get (just solve for n):
\[ n = \left( \frac{z^{*}\sigma}{m} \right)^{2} = \left( \frac{1.96(72)}{25} \right)^{2} \approx 31.9 \;\rightarrow\; 32 \]
• So what is the Power for this test?

Calculating Power

• Calculating power is a 2-step process:
– First: determine the rejection region assuming the null
hypothesis is true (what values of x̄ lead us to reject H0).
– Second: calculate the probability of finding a sample statistic
(in this case, the sample mean) that falls in the rejection
region if the alternative hypothesis is actually true
• What are our hypotheses here?
– H0: μ = 590
– HA: μ ≠ 590
• Let’s do the calculation…

Calculating Power
• First: determine the rejection region assuming the Null
hypothesis is true (always helps to draw the picture):
– We will reject when |z-stat| > 1.96 (far out in tails)
– What does this mean in terms of the sample mean?
• x̄ > μ0 + (1.96)*(σ/√n) = 590 + 1.96*72/√32 ≈ 615
• x̄ < μ0 − (1.96)*(σ/√n) = 590 − 1.96*72/√32 ≈ 565
• Second: calculate the probability of finding a sample mean
that falls in the rejection region if HA is actually true…so we
have to pick a specific μ within HA…let’s use μA = 625 here:
\[ P(\bar{x} > 615 \mid \mu_A = 625) + P(\bar{x} < 565 \mid \mu_A = 625) \approx P(\bar{x} > 615 \mid \mu_A = 625) + 0 \]
\[ = P\left( \frac{\bar{x} - \mu_A}{\sigma/\sqrt{n}} > \frac{615 - 625}{72/\sqrt{32}} \right) = P(Z > -0.786) \approx 0.7840 \]

• A picture can be worth a thousand words…
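
The two steps can also be followed in Python. This sketch uses the Maine design values (μ0 = 590, μA = 625, σ = 72, n = 32); the variable names are illustrative. It keeps the cutoffs unrounded, so the power comes out near 0.785 rather than exactly the 0.7840 obtained with the rounded cutoffs 615 and 565.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_alt, sigma, n = 590, 625, 72, 32
se = sigma / sqrt(n)  # standard deviation of the sample mean

# Step 1: rejection region for the sample mean, computed under H0.
z_star = 1.96
lower_cut = mu0 - z_star * se
upper_cut = mu0 + z_star * se

# Step 2: probability that the sample mean lands in that region
# when the true mean is mu_alt.
xbar_under_alt = NormalDist(mu=mu_alt, sigma=se)
power = xbar_under_alt.cdf(lower_cut) + (1 - xbar_under_alt.cdf(upper_cut))

print(f"reject when x-bar < {lower_cut:.2f} or x-bar > {upper_cut:.2f}")
print(f"power at mu = {mu_alt}: {power:.4f}")
```

The lower-tail term is negligible here, which is why the hand calculation treats it as 0.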


Maine rents
• Next step: take a sample of 32 rents and see what happens
• Summary of data below…apparently bad luck on the standard
deviation…we will return to this

. summarize rent, detail

                            rent
-------------------------------------------------------------
      Percentiles      Smallest
 1%          250            250
 5%          510            510
10%          530            510       Obs                  32
25%          550            530       Sum of Wgt.          32

50%          590                      Mean            603.125
                        Largest       Std. Dev.      97.69728
75%          655            710
90%          710            710       Variance       9544.758
95%          770            770       Skewness      -1.059636
99%          800            800       Kurtosis       6.791641
Maine rents…
• We can go ahead with the confidence interval, using the normal distribution
for the multiplier (z*) in the formula:
\[ \bar{x} \pm z^{*}\,\frac{\sigma}{\sqrt{n}} = 603.13 \pm 1.96\left( \frac{72}{\sqrt{32}} \right) = (578.18,\ 628.08) \]
• An apparent inconsistency: we used a putative σ = 72 to choose a sample
size, but this may have been wishful thinking since s = 97.7

From the testing perspective
• Suppose a policy maker in Maine wants to use these data to
decide if the average rent in Maine is different from the
FMR. If so, Maine will implement an expensive rent
subsidy program.
• The policy maker wishes to test the null hypothesis H0: mean rent = μ0 =
$590 vs. the alternative hypothesis HA: μ ≠ $590.
• Want to conduct a decision-theoretic type test with type I
error α = 0.05
• Note that this is a two-sided hypothesis and we will
construct a two-tailed p-value.
• Would this null hypothesis be rejected?
• Let’s check in Stata…
