
7/22/2012

Lecture Outline

Unit 7: Inference for Means (Chapter 7 in IPS)
• The t-distribution
• One-Sample Inference for a Mean (μ)
• Two-Sample Inference for Means (μ1 − μ2)
• Matched Pairs Inference (μDiff = μ1 − μ2)

Remainder of the Course: Inference

• The rest of the course will be focusing on Inference
  – What is inference? It's the art of using results from data to infer
    back on the entire population…it's all based on good
    sampling/randomization procedures
• Two main inferential procedures
  – Confidence Intervals for estimating values of population parameters
  – Hypothesis Testing for deciding whether the population supports a
    specific idea/model/hypothesis
• We will see these repeated in MANY contexts: for means, proportions,
  contingency tables, regression, multiple regression, analysis of
  variance (ANOVA), etc…

Confidence Intervals for a Population Mean, unknown SD

• So what are we doing with these confidence interval calculations?
  – We are trying to determine where the true unknown mean, μ, is based
    on a sample mean, x̄. It's a range of plausible values for μ.
  – But this calculation (so far) assumes we know the true population
    standard deviation: σ
  – This doesn't often happen in real life. If we are trying to
    estimate μ, we will also probably have to estimate σ.
  – What's our sample-based estimate of the standard deviation? s
  – This throws off everything. The calculation is no longer based on a
    normal distribution, but a t-distribution.


The t-distribution (sometimes called Student's t)

Derived by William Sealy Gosset in 1908. He discovered the
t-distribution while working at Guinness Brewery, but was forced to
publish under a pseudonym to protect `trade secrets' of the brewery
(he named himself `Student').

[Plot: t-distribution with 5 degrees of freedom overlaid on the normal
curve.] Notice the larger spread of the t-distribution. This is what
happens when you don't know the real σ and you're forced to use
s…less reliability.

The t-distribution

When the true standard deviation σ is not known we need to use s
instead. The usual formula:

    ( x̄ − z*·σ/√n , x̄ + z*·σ/√n )

is replaced by:

    ( x̄ − t*df·s/√n , x̄ + t*df·s/√n )

where the multiplier z* is replaced by a value from the t-distribution
with df = (n − 1) `degrees of freedom'. As we've seen, the
t-distribution is really a family of distributions that look like the
normal distribution, but is spread out further (fatter tails).
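The interval calculation above can be sketched in a few lines of Python (a minimal illustration using scipy to look up t*; the inputs in the example are hypothetical):

```python
import math
from scipy import stats

def t_interval(xbar, s, n, conf=0.95):
    """CI for mu when sigma is unknown: xbar +/- t* s/sqrt(n), df = n - 1."""
    tstar = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)  # t* multiplier
    se = s / math.sqrt(n)                              # estimated SE of x-bar
    return (xbar - tstar * se, xbar + tstar * se)

# Hypothetical sample: x-bar = 7.0, s = 1.1, n = 25
lo, hi = t_interval(7.0, 1.1, 25)
```

Note that for n = 25 the multiplier t* (df = 24) is about 2.064, larger than the z* = 1.96 it replaces, so the t-based interval is slightly wider.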

One sample t-significance test (when σ is unknown)

To test the hypothesis H0: μ = μ0 against a one- or two-sided
alternative hypothesis, compute the one-sample t-statistic:

    t = (x̄ − μ0) / (s/√n)

p-values are computed by comparing the statistic with a t-distribution
with df = n − 1. If using the table, you can only get approximate
p-values (essentially just bounds).

Note: the t-distribution critical values get closer and closer to the
normal distribution (z) critical values as degrees of freedom increase.
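A sketch of the test-statistic and p-value computation (Python/scipy gives exact p-values rather than table bounds; the example numbers are hypothetical):

```python
import math
from scipy import stats

def one_sample_t(xbar, s, n, mu0, alternative="two-sided"):
    """One-sample t-statistic and p-value for H0: mu = mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))  # test statistic
    df = n - 1
    if alternative == "two-sided":
        p = 2 * stats.t.sf(abs(t), df)     # HA: mu != mu0
    elif alternative == "less":
        p = stats.t.cdf(t, df)             # HA: mu < mu0
    else:
        p = stats.t.sf(t, df)              # HA: mu > mu0
    return t, p

# Hypothetical example: x-bar = 10.5, s = 2.0, n = 16, H0: mu = 10
t, p = one_sample_t(10.5, 2.0, 16, 10)
```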


Practice guidelines for t-procedures (p. 432, section 7.1)

• Whenever possible, graph the data to explore its features
• Sample sizes less than 15: Use t-procedures if the data are close to
  normal. If the data are clearly non-normal or if outliers are
  present, do not use t.
• Sample size at least 15, less than 30: The t-procedures can be used
  except in the presence of severe outliers or strong skewness.
• Large samples: The t-procedures can be used even for clearly skewed
  distributions when the sample size is large, roughly n ≥ 30.
• If these conditions do not hold for a data set, the t-test falls
  apart (the p-value is incorrect), and the correct approach to
  analyze the data is not always clear.

Quick clarification on some terminology

• The population standard deviation (typically denoted by σ) is the
  theoretical standard deviation of a probability distribution.
• A sample standard deviation (denoted by `s') is the usual standard
  deviation of a set of numbers.
• The standard deviation of a statistic is the theoretical standard
  deviation of the sampling distribution for a statistic (Ex: σ/√n)
• The standard error of a statistic is the estimated (from data)
  standard deviation for a sampling distribution, after any unknown
  parameters have been replaced by their estimates. (Ex: s/√n)
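A tiny numeric illustration of the last two bullets (the σ, s, and n values are hypothetical):

```python
import math

n = 25
sigma = 2.0  # hypothetical known population SD
s = 1.9      # hypothetical sample SD computed from the data

sd_xbar = sigma / math.sqrt(n)  # standard deviation of the statistic x-bar
se_xbar = s / math.sqrt(n)      # standard error: sigma replaced by s
```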

Confidence Intervals for a Population Mean, unknown SD

• The National Sleep Foundation recommends that adults get about
  8 hours of sleep per night, on average.
• We want to know if summer school students meet this recommendation
  or not
• We'll assume this class represents a random sample from the target
  population

Sleep Example (cont.)

• Data from the first day survey
• Here are the data (self-reported "typical" number of hours slept per
  night) from n = 47 surveys:

    7, 6, 7, 6, 6.5, 6.5, 6, 6, 8, 7, 7, 6.5, 7, 8, 7,
    7.5, 6, 7, 5, 9, 7, 7, 7, 7.5, 8, 6, 8, 7, 5, 9, 7,
    7, 9, 6, 7, 8, 6, 7, 7, 6, 9, 6, 5.5, 10, 8, 7, 7

[Histogram of the sleep data: Frequency (0–20) vs. hours slept (4–10).]

. summarize sleep

    Variable | Obs      Mean  Std. Dev.  Min  Max
  -----------+--------------------------------------------------
       sleep |  47  7.021277   1.068102    5   10

What can we say about the average number of hours in the `population'
of all summer school students (assuming this is a good random sample)?

Let's calculate the 95% confidence interval for the mean number of
hours slept in the population of all summer school students.


Solution…

• What are the known pieces of info (sample statistics) from the data?
  x̄ = 7.021 ; s = 1.068 ; n = 47 (so df = n − 1 = 46).
• What's the t-distribution's critical value?
  t* = 2.021 (round down to df = 40 from the table)
• The confidence interval will have the form:

    ( x̄ − t*df·s/√n , x̄ + t*df·s/√n )
    = ( 7.021 − 2.021·1.068/√47 , 7.021 + 2.021·1.068/√47 )
    = ( 6.706 , 7.336 )

Sleep Example as a Hypothesis Test

Let's test the claim that summer school students get an average of at
least 8 hrs of sleep per night.

Solution:
    H0: μ = 8
    HA: μ < 8
    α = 0.05

    t = (x̄ − μ0)/(s/√n) = (7.021 − 8)/(1.068/√47) = −6.28

    p-value = P(t(df=46) < −6.28) < .0005

Since our p-value < 0.05, we can reject the null hypothesis. It looks
like summer school students truly get less than the recommended amount
of sleep (assuming this is a good random sample).
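The sleep-example test can be reproduced from the raw survey data (a Python/scipy sketch; `alternative='less'` gives the one-sided p-value):

```python
from scipy import stats

# Self-reported hours of sleep from the n = 47 first-day surveys
sleep = [7, 6, 7, 6, 6.5, 6.5, 6, 6, 8, 7, 7, 6.5, 7, 8, 7,
         7.5, 6, 7, 5, 9, 7, 7, 7, 7.5, 8, 6, 8, 7, 5, 9, 7,
         7, 9, 6, 7, 8, 6, 7, 7, 6, 9, 6, 5.5, 10, 8, 7, 7]

# One-sided test of H0: mu = 8 vs HA: mu < 8
t, p = stats.ttest_1samp(sleep, popmean=8, alternative='less')
```

This reproduces t ≈ −6.28 with a p-value far below .0005, matching the hand calculation.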

Some caveats…

Two assumptions for using the t-distribution here:
1) Data represent a random sample from the population
2) Normal distribution of observations (or near normal)

• If the sample is biased in some way, even the correct formulas are
  useless
• The use of the normal curve (or t-distribution) can be justified
  either by the Central Limit Theorem or by the original population
  being normally distributed to begin with. Sometimes these conditions
  do not hold, but we will not worry too much about that in this
  course (we'll just make note of it when it is the case). In
  particular, outliers can cause problems.

Lecture Outline

• The t-distribution
• One-Sample Inference for a Mean (μ)
• Two-Sample Inference for Means (μ1 − μ2)
• Matched Pairs Inference (μDiff = μ1 − μ2)


Maternal Smoking and Infant Health

• Surgeon General's warning on cigarette packages:
  `Smoking by pregnant women may result in fetal injury, premature
  birth, and low birth weight.'
• We will use data that is a subset of the Child Health and
  Development Studies (CHDS) that examined the association between
  smoking status of pregnant women and birthweight.
  – Study conducted between 1960 and 1967 by Kaiser Foundation Health
    Plan, Oakland
  – Data here on smoking status and birthweight (in ounces) of infant
    for a random sample of 1236 babies in the study: baby boys born
    during one year of the study, survived at least 28 days, and were
    single births
• Results on next slide…

Coding:
  smoke = 0, nonsmoking
  smoke = 1, smoking
  smoke = 9, refused or unknown

[Boxplots of bwt (roughly 50–200 ounces) by smoke group (0, 1, 9).]

. by smoke, sort : summarize bwt
-------------------------------------------------------
-> smoke = 0
    Variable | Obs    Mean  Std.Dev.  Min  Max
  -----------+-------------------------------------------
         bwt | 742  123.05    17.399   55  176
-------------------------------------------------------
-> smoke = 1
    Variable | Obs    Mean  Std.Dev.  Min  Max
  -----------+-------------------------------------------
         bwt | 484  114.11    18.099   58  163
-------------------------------------------------------
-> smoke = 9
    Variable | Obs    Mean  Std.Dev.  Min  Max
  -----------+-------------------------------------------
         bwt |  10   126.7    21.813   90  158

Maternal Smoking and Infant Health

• Let:
  – X1 = birthweight of a baby born from a smoking mother
  – X2 = birthweight of a baby born from a non-smoking mother
• What would be an appropriate null hypothesis of no difference
  between the smoking group and non-smoking group? What about the
  alternative hypothesis?
    H0: μ1 − μ2 = 0        HA: μ1 − μ2 ≠ 0
• So what from the data would be a good measurement for these
  hypotheses? x̄1 − x̄2
• What is the standard deviation of this measurement?
    √(σ1²/n1 + σ2²/n2), which can be estimated by √(s1²/n1 + s2²/n2)
• More formally…

Two sample t-significance test (assuming unequal standard deviations)

• To test the hypothesis H0: μ1 − μ2 = 0 against a one- or two-sided
  alternative, compute the two-sample t-statistic:

    t = [ (x̄1 − x̄2) − (μ1 − μ2 | H0) ] / √(s1²/n1 + s2²/n2)

• Use p-values or critical values for the tdf distribution, where the
  value of degrees of freedom (df) is:
  – By hand: the smaller of n1 − 1 and n2 − 1
  – By computer: it uses a complicated formula for degrees of freedom
    for this test (on page 460 of the book)…no need to worry about the
    formula, but use the df Stata spits out.
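scipy can run this unequal-variance test directly from summary statistics; with `equal_var=False` it uses a Satterthwaite-style df approximation, which I assume corresponds to the "computer formula" mentioned above. A sketch using the CHDS birthweight summaries from the earlier slide:

```python
from scipy import stats

# Unequal-variance two-sample t-test from summary statistics
res = stats.ttest_ind_from_stats(
    mean1=114.11, std1=18.099, nobs1=484,  # smokers
    mean2=123.05, std2=17.399, nobs2=742,  # nonsmokers
    equal_var=False)                       # unequal SDs: do not pool
```

Setting `equal_var=True` instead gives the pooled-variance test described on the next slide.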


Two sample t-significance test (assuming equal standard deviations)

To test the hypothesis H0: μ1 − μ2 = 0 against a one- or two-sided
alternative, compute the two-sample pooled variance t-statistic:

    t = [ (x̄1 − x̄2) − (μ1 − μ2 | H0) ] / [ sp·√(1/n1 + 1/n2) ]

where:

    sp² = [ (n1 − 1)s1² + (n2 − 1)s2² ] / (n1 + n2 − 2)

p-values are computed by comparing the statistic with a t-distribution
with df = n1 + n2 − 2.

When should we pool?

• Great, 2 formulas…so when should we use each and why?
  – Rule of thumb: first check to see if the ratio s1/s2 (larger SD
    over smaller) is ≥ 1.5
  – if it's larger than or equal to 1.5, then the SDs are too far
    apart, so we shouldn't assume they are equal, so no pooling in the
    test stat or CI calc (holds for both CIs and hypothesis tests).
• Advantage of pooling
  – Has slightly more power to detect a difference
• Disadvantage of pooling
  – If the assumption of equal variances is actually incorrect, then
    the theory that the formula is based on just falls apart.
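The pooled formulas translate directly into code (a sketch; the function name is mine, not from the book):

```python
import math
from scipy import stats

def pooled_t_test(x1bar, s1, n1, x2bar, s2, n2):
    """Pooled two-sample t-statistic and two-sided p-value, df = n1 + n2 - 2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t = (x1bar - x2bar) / se                  # H0 difference is 0
    p = 2 * stats.t.sf(abs(t), df=n1 + n2 - 2)
    return t, p
```

Plugging in the birthweight summaries reproduces the t statistic of about −8.65 computed on the next slide.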

Solution to test if mom's smoking is associated with different birthweight

    H0: μ1 − μ2 = 0
    HA: μ1 − μ2 ≠ 0
    α = 0.05

Check the ratio of SDs:

    s1/s2 = 18.10/17.40 = 1.040 < 1.5  →  we should pool
    (assume equal variances)

    sp² = [ (484 − 1)(18.10)² + (742 − 1)(17.40)² ] / (484 + 742 − 2)
        = 312.57
    sp = √312.57 = 17.68

    t = [ (x̄1 − x̄2) − (μ1 − μ2) ] / [ sp·√(1/n1 + 1/n2) ]
      = [ (114.11 − 123.05) − 0 ] / [ 17.68·√(1/484 + 1/742) ]
      = −8.65

    p-value = P(|t| > |−8.65|) = 2·P(t > 8.65) < 2·(0.0005) = 0.001

Since our p-value < 0.05, we can reject the null hypothesis. It looks
like smoking is associated with birthweight; in fact, it is associated
with lower birthweights.

What confounding factors could be present in this observational study?


Confidence Intervals for the Difference of Two Population Means

• If assuming unequal variances, the two sample level C (confidence
  coefficient) confidence interval has the form:

    (x̄1 − x̄2) ± t*·√(s1²/n1 + s2²/n2)

• If assuming equal variances:

    (x̄1 − x̄2) ± t*·sp·√(1/n1 + 1/n2)

• In these formulas, t* is the value from the tdf distribution (df
  varies for the two settings) with area C between −t* and t*.

Lecture Outline

• The t-distribution
• One-Sample Inference for a Mean (μ)
• Two-Sample Inference for Means (μ1 − μ2)
• Matched Pairs Inference (μDiff = μ1 − μ2)
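A sketch of the unequal-variance interval, using a Satterthwaite-style software df approximation rather than the conservative by-hand rule (function name is mine; the example plugs in the birthweight summaries):

```python
import math
from scipy import stats

def welch_ci(x1bar, s1, n1, x2bar, s2, n2, conf=0.95):
    """Level-C CI for mu1 - mu2 assuming unequal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    # software-style approximate degrees of freedom
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    tstar = stats.t.ppf(1 - (1 - conf) / 2, df)
    d = x1bar - x2bar
    return (d - tstar * se, d + tstar * se)

# Birthweight example: 95% CI for (smokers - nonsmokers)
lo, hi = welch_ci(114.11, 18.099, 484, 123.05, 17.399, 742)
```

The interval lies entirely below zero, agreeing with the two-sided test's rejection of H0.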

Tests with paired measurements

• How strong is the evidence against the null hypothesis of no full
  moon effect? What from the data supports this?
• The change within each individual of the average number of
  disruptive behaviors is the best measure of effect.
  – Look at the change within each individual, then combine these
    differences.
• When using only the differences, we now have one sample of
  observations….

Matched pairs…

• This is called the paired t-test
• This applies in the setting when there are two observations taken
  for each individual (like the shoe example from before)
• It's essentially a one-sample test on the differences
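The equivalence between the paired t-test and a one-sample test on the differences can be sketched with made-up before/after data (all values hypothetical):

```python
from scipy import stats

# Hypothetical before/after measurements on the same 6 individuals
before = [10.2, 9.8, 11.1, 10.5, 9.9, 10.8]
after  = [10.9, 10.1, 11.5, 10.4, 10.6, 11.2]

# Paired t-test...
t_paired, p_paired = stats.ttest_rel(after, before)

# ...is the same as a one-sample t-test on the within-pair differences
diffs = [a - b for a, b in zip(after, before)]
t_diff, p_diff = stats.ttest_1samp(diffs, popmean=0)
```

Both calls return identical statistics and p-values, since the paired test is just a one-sample test on the differences.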


Matched Pairs: The full moon effect

• Suppose we assume that the difference scores are a sample from a
  normally distributed population, with mean μ and sd σ (i.e., N(μ,σ)).
• Then the null hypothesis of no full moon effect is
  – H0: μ = 0
• If we have no prior information to suggest whether full moons
  increase or decrease aggressive behavior, the alternative hypothesis
  is two-sided
  – HA: μ ≠ 0
• We already know how to compute the test: it is a one sample t-test
  for a mean…let's use Stata!

Main Points

• We use confidence intervals to estimate where the true population
  mean (μ) is based on a sample…it's a range of plausible values.
• We can formally test whether a hypothesized value of μ is reasonable
  or not via a hypothesis test
  – Confidence Intervals and Hypothesis tests are closely
    related…sometimes called the inverse of one another.
• If σ is known then everything is z-based. If it's not known, then we
  need to use s from the data to estimate it, and all is t-based.
• Big choice to determine if methods should be one-sided or two-sided
  (for hypothesis tests only)
• Power is difficult to calculate…but will be done before collecting
  data in a real-life situation
• If data come in pairs, then to test for a difference in the 2
  settings, it's essentially a one-sample test on the differences
  within pairs.

Midterm Results (distance students not included yet)

    Variable | Obs   Mean  Std.Dev.  Min  Max
  -----------+----------------------------------------
     midterm | 120  83.83     15.56   12  100

[Histogram of midterm scores: Frequency (0–60) vs. score (0–100).]

Where do you stand?
  In very good shape:    90 – 100  (49%)
  In decent shape:       75 – 89   (28%)
  Should boost slightly: 60 – 74   (18%)
  Need to buckle down:   0 – 59    (5%)

As stated in the syllabus, this is worth 25% of final grade
(HW = 35%, Final = 40%)
