Inference About a Mean: t Tests Explained

Uploaded by

galyafei1

Chapter 11:

Inference About a Mean

12/31/24 1
In Chapter 11:
11.1 Estimated Standard Error of the Mean
11.2 Student’s t Distribution
11.3 One-Sample t Test
11.4 Confidence Interval for μ
11.5 Paired Samples
11.6 Conditions for Inference

σ not known
• Prior chapter: σ was known before collecting data → z procedures were used to help infer µ
• When σ is NOT known, calculate the sample standard deviation s and use it to calculate this standard error:
$SE_{\bar{x}} = s / \sqrt{n}$

Additional Uncertainty
$SE_{\bar{x}} = s / \sqrt{n}$
• Using “s” instead of σ adds uncertainty to inferences → can NOT use z procedures (the Normal distribution doesn’t fit well)
• Instead, rely on Student’s t procedures
William Sealy Gosset (1876–1937)

T-score vs. z-score: When to use a t score?

- The sample size is below 30
- The population standard deviation is unknown (estimated from your sample data)

Student’s t distributions
• Probability distributions are identified by degrees of freedom (df)
• t distribution is similar to “Z”, but with broader tails
• As df increases → tails get skinnier → t becomes like z
A t distribution with infinite degrees of freedom is a Standard Normal Z distribution

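The "broader tails" claim can be checked numerically. The sketch below is illustrative only: the helper name `t_pdf` and the comparison point t = 3 are assumptions, not part of the slides. It compares the t density with the Standard Normal density far out in the tail.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

# Standard Normal density at 3, used as the reference value
z_density = NormalDist().pdf(3.0)

for df in (1, 5, 30, 1000):
    ratio = t_pdf(3.0, df) / z_density
    print(f"df={df:>4}: t density at 3 is {ratio:.2f} times the Normal density")
```

The ratio shrinks toward 1 as df grows, which is the "t becomes like z" behaviour described above.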
t - Test
• t measures how far the observed difference in Means lies from the value expected under H0, in standard-error units; it is used to judge whether the difference is Statistically Significant
• As with all Test Statistics, we compare t to its Critical Value. The value of t is calculated from Sample data
• The value of t-critical is determined by the value selected for Alpha, the Significance Level, and the appropriate t-Distribution
t - Test
• The larger the absolute value of t, the more likely it is to exceed t-critical, and so the more likely it is that there is a Statistically Significant difference in the Means
t - Test
These three different types of the second
Mean correspond to three different t-tests
t - Test
• Degree of Freedom
– For a single sample t-test, df = n – 1, where
n is the Sample Size
– In the 2-Sample t-test, df = n1 + n2 − 2
Table C (t table)

Rows → df
Columns → probabilities
Entries → t values
Notation: $t_{\text{cum prob},\,df}$ = t value
Example: $t_{.975,\,9} = 2.262$

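A table entry such as the one for cumulative probability .975 with 9 df can also be approximated without a table. The following is a minimal sketch, assuming numerical integration of the t density (Simpson's rule) and a bisection search over the bracket [-50, 50]. The function names are hypothetical, and the bracket is only adequate for moderate probabilities and df.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(cum_prob, df, lo=-50.0, hi=50.0):
    """Bisection search for the t value with the given cumulative probability."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < cum_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_quantile(0.975, 9), 3))   # compare with the Table C entry for 9 df
```

The printed value should agree with the Table C example to three decimals.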
One-Sample t Test

Objective: test a claim about population mean µ


Conditions:
• Simple Random Sample
• Normal population or “large sample”

Hypothesis Statements
• Null hypothesis
H0: µ = µ0, where µ0 represents the pop. mean expected by the null hypothesis

• Alternative hypotheses
Ha: µ < µ0 (one-sided, left)
Ha: µ > µ0 (one-sided, right)
Ha: µ ≠ µ0 (two-sided)

Example 1
• Do SIDS babies have lower average birth weights than a general population mean µ of 3300 grams?
• H0: µ = 3300
• Ha: µ < 3300 (one-sided) or Ha: µ ≠ 3300 (two-sided)

One-Sample t Test Statistic
$t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}}$
where
$\bar{x}$ ≡ the sample mean
$\mu_0$ ≡ expected population mean under $H_0$
$SE_{\bar{x}} = s / \sqrt{n}$
This t statistic has n – 1 degrees of freedom

Example (Data)
SRS n = 10 birth weights (grams) of SIDS cases:
2998, 3740, 2031, 2804, 2454, 2780, 2203, 3803, 3948, 2144
$\bar{x} = 2890.5$ grams
$s = 720$ grams
$n = 10$
$SE_{\bar{x}} = s / \sqrt{n} = 720 / \sqrt{10} = 227.7$ grams

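The summary statistics for the ten SIDS birth weights can be reproduced with Python's standard library. This is a minimal sketch. The variable names are illustrative, and the slide rounds s to 720 before computing the standard error, so the last digit may differ slightly.

```python
import math
import statistics

# Birth weights (grams) of the n = 10 SIDS cases listed on the slide
weights = [2998, 3740, 2031, 2804, 2454, 2780, 2203, 3803, 3948, 2144]

n = len(weights)
x_bar = statistics.mean(weights)   # sample mean
s = statistics.stdev(weights)      # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)              # estimated standard error of the mean

print(f"n = {n}, mean = {x_bar:.1f} g, s = {s:.1f} g, SE = {se:.1f} g")
```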
Example
Testing H0: µ = 3300
$t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{2890.5 - 3300}{227.7} = -1.80$
This statistic has df = n − 1 = 10 − 1 = 9

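The same calculation can be wrapped in a small function. This is a sketch only: the function name `one_sample_t` is hypothetical, and the inputs are the rounded summary values from the slide.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    return (x_bar - mu0) / se, n - 1

t_stat, df = one_sample_t(x_bar=2890.5, mu0=3300, s=720, n=10)
print(f"t = {t_stat:.2f} with {df} df")
```

The sign of the statistic is negative because the sample mean falls below the value stated in H0.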
P-value via Table C
• Bracket |tstat| between t critical values
• For |tstat| = 1.80 with 9 df:

Table C. |tstat| = 1.80
Upper-tail P   0.25    0.20    0.15    0.10    0.05    0.025
df = 9         0.703   0.883   1.100   1.383   1.833   2.262

Thus → One-tailed: 0.05 < P < 0.10
       Two-tailed: 0.10 < P < 0.20

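The bracketing step can be written as a short lookup. The sketch below hard-codes only the df = 9 row shown in the Table C excerpt. The function name `bracket_p` is an assumption made for illustration.

```python
# Upper-tail probabilities and the matching df = 9 critical values from Table C
UPPER_TAIL_P = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025]
T_CRIT_DF9 = [0.703, 0.883, 1.100, 1.383, 1.833, 2.262]

def bracket_p(t_stat, probs, crits):
    """Return (lower, upper) bounds on the one-tailed P-value of |t_stat|."""
    t = abs(t_stat)
    if t < crits[0]:
        return probs[0], 0.5       # smaller than every tabled critical value
    if t >= crits[-1]:
        return 0.0, probs[-1]      # larger than every tabled critical value
    for i in range(len(crits) - 1):
        if crits[i] <= t < crits[i + 1]:
            return probs[i + 1], probs[i]

low, high = bracket_p(-1.80, UPPER_TAIL_P, T_CRIT_DF9)
print(f"One-tailed: {low} < P < {high}")
print(f"Two-tailed: {2 * low} < P < {2 * high}")
```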
For a more precise P-value use a computer utility
[Figure: output from the free utility StaTable, with the P-value shown graphically]

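As an alternative to a dedicated utility, the tail area can be approximated directly. This is a minimal sketch, assuming numerical integration of the t density with Simpson's rule. It is not the method used by StaTable, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_test_p_values(t_stat, df):
    """Return (one-tailed, two-tailed) P-values for the observed |t_stat|."""
    upper_tail = 1 - t_cdf(abs(t_stat), df)
    return upper_tail, 2 * upper_tail

one_tailed, two_tailed = t_test_p_values(-1.80, 9)
print(f"one-tailed P = {one_tailed:.4f}, two-tailed P = {two_tailed:.4f}")
```

Both values should fall inside the brackets obtained from Table C.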
Interpretation
• Testing H0: µ = 3300 grams
• Two-tailed P > .10
• Conclude: weak evidence against H0
• The sample mean (2890.5) is NOT significantly different from 3300

Confidence level (Interval)
• The confidence level measures the reliability of an interval estimate.
• A confidence level of 95 per cent (0.95) means that, over repeated sampling, about 95 per cent of the intervals built this way would contain the true population value

Same Data
$\bar{x} = 2890.5$, $s = 720$, $n = 10$
$SE_{\bar{x}} = s / \sqrt{n} = 720 / \sqrt{10} = 227.68$
For 95% confidence use: $t_{10-1,\,1-.05/2} = t_{9,\,.975} = 2.262$
95% CI for µ = $\bar{x} \pm t_{9,\,.975} \cdot SE_{\bar{x}}$ = 2890.5 ± (2.262)(227.68)
= 2890.5 ± 515.0
= (2375 to 3406) grams
Interpretation: Population mean µ is between 2375 and 3406 grams with 95% confidence

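The interval arithmetic can be sketched as follows. The function name is hypothetical, and the critical value 2.262 is taken from Table C rather than computed.

```python
import math

def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) limits of x_bar ± t_crit * SE."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    margin = t_crit * se           # margin of error
    return x_bar - margin, x_bar + margin

lower, upper = t_confidence_interval(x_bar=2890.5, s=720, n=10, t_crit=2.262)
print(f"95% CI for mu: ({lower:.0f} to {upper:.0f}) grams")
```

The same function covers other confidence levels if a different Table C critical value is passed in.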
The Normality Condition
• t procedures require a Normal population or large samples
• How do we assess this condition?
• Guidelines. Use t procedures when:
– Population Normal
– Population symmetrical and n ≥ 10
– Population skewed and n ≥ ~45 (depends on severity of skew)

Sample Size and Power
Methods:
(1) n required to achieve margin of error m when estimating µ
(2) n required to test H0 with 1−β power
(3) Power of a given test of H0

Power
$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\Delta|\sqrt{n}}{\sigma}\right)$

• α ≡ alpha (two-sided)
• Δ ≡ “difference worth detecting” = µa – µ0
• n ≡ sample size
• σ ≡ standard deviation
• Φ(z) ≡ cumulative probability of Standard Normal z score

Power: SIDS Example
• Let α = .05 and $z_{1-.05/2} = 1.96$
• Test: H0: μ = 3300 vs. Ha: μ = 3000. Thus: Δ ≡ µa – µ0 = 3000 – 3300 = −300, so |Δ| = 300
• n = 10 and σ ≡ 720 (see prior SIDS example)
$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\Delta|\sqrt{n}}{\sigma}\right) = \Phi\left(-1.96 + \dfrac{|-300|\sqrt{10}}{720}\right) = \Phi(-0.64)$
Use Table B to look up cum prob → Φ(−0.64) = .2611, so the power is about 26%

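The power formula translates directly into code. This is a sketch with a hypothetical function name. It uses the Standard Normal distribution from Python's standard library in place of Table B, so the result differs slightly from the table lookup at the rounded z value of −0.64.

```python
import math
from statistics import NormalDist

def power_two_sided(delta, sigma, n, alpha=0.05):
    """Approximate power 1 - beta = Phi(-z_{1-alpha/2} + |delta| * sqrt(n) / sigma)."""
    z = NormalDist()                      # Standard Normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    return z.cdf(-z_crit + abs(delta) * math.sqrt(n) / sigma)

# SIDS example: difference worth detecting of 300 g, sigma = 720 g, n = 10
power = power_two_sided(delta=3000 - 3300, sigma=720, n=10)
print(f"power = {power:.2f}")
```

Increasing n in the call shows how quickly power improves for the same difference and standard deviation.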
Power: Illustrative Example

Example 2
Using a suitable commercial kit and 5 g of fresh meat as starting material, we extract an average of 5 µg of DNA per sample. To increase extraction efficiency, a scientist adds a grinding step before extracting DNA from 10 fresh meat samples. Assuming that the variable is Normally distributed and the sample is randomly selected, does grinding improve DNA yield?
Sample data set (DNA quantities in µg):
10, 8, 8, 7, 6, 4, 5, 9, 12, 4

Hypotheses

H0: µ = µ0, where µ0 = 5 µg

Ha: µ > µ0 (one-sided, right)

Statistics
$\bar{x} = 7.3$, $s = 2.627$, $n = 10$
$SE_{\bar{x}} = s / \sqrt{n} = 2.627 / \sqrt{10} = 0.83$
$t_{\text{stat}} = \dfrac{7.3 - 5}{0.83} = 2.77$
df = 9

Decision:
Calculated t (2.77) is greater than the critical t (1.83) at α = 0.05, so H0 is rejected

Conclusion and interpretation:
The grinding step significantly improves extracted DNA quantities

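The whole of Example 2 can be reproduced from the raw data. This is a minimal sketch. The variable names are illustrative, and the one-sided critical value 1.833 (df = 9, upper-tail probability 0.05) is copied from Table C rather than computed.

```python
import math
import statistics

dna_ug = [10, 8, 8, 7, 6, 4, 5, 9, 12, 4]   # DNA yields (µg) after grinding
mu0 = 5.0                                   # mean yield without grinding (µg)
t_crit = 1.833                              # Table C: df = 9, upper-tail P = 0.05

n = len(dna_ug)
x_bar = statistics.mean(dna_ug)             # sample mean
s = statistics.stdev(dna_ug)                # sample standard deviation
se = s / math.sqrt(n)                       # estimated standard error
t_stat = (x_bar - mu0) / se                 # one-sample t statistic

reject_h0 = t_stat > t_crit                 # right-sided test of Ha: mu > mu0
print(f"mean = {x_bar:.1f}, s = {s:.3f}, SE = {se:.2f}, t = {t_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```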
§11.5 Paired Samples
• Two samples
• Each data point in one sample is uniquely matched to a data point in the other sample
• Examples of paired samples
– “Pre-test/post-test”
– Cross-over trials
– Pair-matching

Example
• Does oat bran reduce LDL cholesterol?
• Start half of subjects on CORNFLK diet
• Start other half on OATBRAN
• Two weeks → LDL cholesterol
• Washout period
• Cross-over to other diet
• Two weeks → LDL cholesterol

Oat bran data: LDL cholesterol (mmol)
Subject CORNFLK OATBRAN
------- ------- -------
   1     4.61    3.84
   2     6.42    5.57
   3     5.40    5.85
   4     4.54    4.80
   5     3.98    3.68
   6     3.82    2.96
   7     5.01    4.41
   8     4.34    3.72
   9     3.80    3.49
  10     4.56    3.84
  11     5.35    5.26
  12     3.89    3.73

Within-pair difference “DELTA”
• Let DELTA = CORNFLK - OATBRAN
• First three observations in OATBRAN data:
ID CORNFLK OATBRAN DELTA
---- ------- ------- -----
1 4.61 3.84 0.77
2 6.42 5.57 0.85
3 5.40 5.85 -0.45
etc.
All procedures are now directed toward difference variable DELTA

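The within-pair differences for all 12 subjects can be computed from the two columns of the oat bran table. The sketch below uses illustrative variable names and rounds each difference to two decimals to match the precision of the data.

```python
import statistics

# LDL cholesterol (mmol) for subjects 1-12 on each diet
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# DELTA = CORNFLK - OATBRAN, one value per subject
deltas = [round(c - o, 2) for c, o in zip(cornflk, oatbran)]

n = len(deltas)
mean_d = statistics.mean(deltas)    # mean within-pair difference
sd_d = statistics.stdev(deltas)     # standard deviation of the differences

print(deltas)
print(f"n = {n}, mean difference = {mean_d:.4f}, sd = {sd_d:.4f}")
```

From this point on, the paired analysis is a one-sample analysis of `deltas`.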
Exploratory and descriptive stats
DELTA: 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

Stemplot
|-0f|4
|-0*|2
|+0*|01
|+0t|33
|+0f|
|+0s|6677
|+0.|88
×1 LDL (mmol)

$n = 12$
$\bar{x}_d = 0.3808$
$s_d = 0.4335$
(subscript d denotes “difference”)

95% Confidence Interval (CI)
$SE_{\bar{x}_d} = s_d / \sqrt{n} = 0.4335 / \sqrt{12} = .1251$
For 95% confidence (α = .05) use $t_{12-1,\,1-.05/2} = t_{11,\,.975} = 2.201$ (Table C)
95% CI for $\mu_d$ = $\bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}_d}$
= 0.3808 ± (2.201)(0.1251)
= 0.3808 ± 0.2754 = (0.105 to 0.656)
→ 95% confident population mean difference $\mu_d$ is between 0.105 and 0.656 mmol/L

Hypothesis Test
• Claim: oat bran diet is associated with a decline (one-sided) or change (two-sided) in LDL cholesterol.
• Test H0: µd = µ0 where µ0 = 0
Ha: µd > µ0 (one-sided)
Ha: µd ≠ µ0 (two-sided)
$t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}}$ with df = n − 1

Paired t statistic
Current data: $n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$
Test H0: µd = 0
$t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} = \dfrac{0.38083 - 0}{.4335 / \sqrt{12}} = 3.043$
df = n − 1 = 12 − 1 = 11

P-value via Table C

Table C. |tstat| = 3.043
Upper-tail P   .01     .005    .0025
df = 11        2.718   3.106   3.497

Thus → One-tailed: .005 < P < .01
       Two-tailed: .01 < P < .02

P-value via Computer

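The paired t statistic and its P-value can also be approximated with a few lines of code. This is a sketch that starts from the summary statistics on the slides and integrates the t density numerically with Simpson's rule. It is not the SPSS procedure, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics of the within-pair differences (from the slides)
n, mean_d, sd_d, mu0 = 12, 0.38083, 0.4335, 0.0

t_stat = (mean_d - mu0) / (sd_d / math.sqrt(n))   # paired t statistic
df = n - 1
p_one = 1 - t_cdf(abs(t_stat), df)                # upper-tail area
p_two = 2 * p_one                                 # two-tailed P-value

print(f"t = {t_stat:.3f}, df = {df}, one-tailed P = {p_one:.4f}, two-tailed P = {p_two:.4f}")
```

The two-tailed value should land inside the Table C bracket of .01 to .02.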
SPSS Output: “Oat Bran”

Interpretation
• Testing H0: µd = 0
• Two-tailed P = 0.011
→ Good reason to doubt H0
• (Optional) The difference is “significant” at α = .05 but not at α = .01
[Cartoon caption: “My P value is smaller than yours!”]

