Estimation

5: Introduction to estimation
(A) Intro to statistical inference

(B) Sampling distribution of the mean
(C) Confidence intervals (σ known)
(D) Student’s t distributions
(E) Confidence intervals (σ not known)
(F) Sample size requirements
04/15/24 5: Intro to estimation 1

Statistical inference
Statistical inference ≡ generalizing from a sample to a population with a calculated degree of certainty
Two forms of statistical inference
• Estimation → introduced this chapter
• Hypothesis testing → next chapter

Parameters and estimates
Parameter ≡ a numerical characteristic of a population
Statistic ≡ a value calculated in a sample
Estimate ≡ a statistic that “guesstimates” a parameter
Example: the sample mean “x-bar” is the estimator of the population mean µ
Parameters and estimates are related but are not the same

Parameters and statistics
                   Parameters      Statistics
Source             Population      Sample
Notation           Greek (μ, σ)    Roman (x-bar, s)
Random variable?   No              Yes
Calculated?        No              Yes

Sampling distribution of the mean
x-bar takes on different values with
repeated (different) samples
µ remains constant
Even though x-bar is variable, its “behavior” is predictable
The behavior of x-bar is predicted by its
sampling distribution, the Sampling
Distribution of the Mean (SDM)

Simulation experiment
Distribution of AGE in population.sav (Fig. right)
• N = 600
• µ = 29.5 (center)
• σ = 13.6 (spread)
• Not Normal (shape)
Conduct three sampling simulations
For each experiment
• Take multiple samples of size n
• Calculate means
• Plot means → simulated SDMs
Experiment A: each sample n = 1
Experiment B: each sample n = 10
Experiment C: each sample n = 30
[Figure: histogram of AGE; horizontal axis AGE from 0.0 to 65.0, vertical axis frequency from 0 to 200]

Results of simulation experiment
Findings:
(1) SDMs are centered on about 29 (µ)
(2) SDMs become tighter as n increases
(3) SDMs become more Normal as n increases
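
The experiment can be imitated with a short Python sketch. The file population.sav is not reproduced here, so the sketch assumes a hypothetical right-skewed population of 600 ages drawn from a gamma distribution with roughly the same center and spread; the names `population` and `simulate_sdm` are illustrative only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for AGE in population.sav: 600 right-skewed values
# (gamma with mean near 29.6 and SD near 13.7; an assumption, not the real data).
population = [random.gammavariate(4.7, 6.3) for _ in range(600)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

def simulate_sdm(n, reps=1000):
    """Draw `reps` samples of size n (without replacement); return their means."""
    return [statistics.mean(random.sample(population, n)) for _ in range(reps)]

print(f"population: mu = {mu:.1f}, sigma = {sigma:.1f}")
for n in (1, 10, 30):
    means = simulate_sdm(n)
    print(f"n = {n:2d}: mean of means = {statistics.mean(means):.1f}, "
          f"SD of means = {statistics.stdev(means):.1f}, "
          f"sigma/sqrt(n) = {sigma / n ** 0.5:.1f}")
```

The mean of the sample means should stay close to µ in all three experiments, while the SD of the means should shrink roughly like σ/√n.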

95% Confidence Interval for µ
Formula for a 95% confidence interval for μ when σ is known:
$\bar{x} \pm (1.96)(SEM)$
where $SEM = \dfrac{\sigma}{\sqrt{n}}$

Illustrative example
Example
• Population with σ = 13.586 (known ahead of time)
• SRS → {21, 42, 5, 11, 30, 50, 28, 27, 24, 52}
• n = 10, x-bar = 29.0
$SEM = \sigma / \sqrt{n} = 13.586 / \sqrt{10} = 4.30$
95% CI for µ
= xbar ± (1.96)(SEM)
= 29.0 ± (1.96)(4.30)
= 29.0 ± 8.4 (margin of error)
= (20.6, 37.4)
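
The arithmetic in this example can be checked with a minimal Python sketch. It works from the summary values on the slide (x-bar = 29.0, σ = 13.586, n = 10); the function name `z_confidence_interval` is an illustrative choice, not part of the original.

```python
import math

def z_confidence_interval(xbar, sigma, n, z=1.96):
    """Return (lower, upper) limits of xbar ± z * sigma / sqrt(n)."""
    sem = sigma / math.sqrt(n)   # standard error of the mean
    margin = z * sem             # margin of error
    return xbar - margin, xbar + margin

lcl, ucl = z_confidence_interval(xbar=29.0, sigma=13.586, n=10)
print(f"95% CI for mu: ({lcl:.1f}, {ucl:.1f})")  # prints (20.6, 37.4)
```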

Margin of error
Margin of error ≡ d = half the confidence interval length
Surround x-bar with the margin of error
95% CI for µ
= xbar ± (1.96)(SEM)
= 29.0 ± (1.96)(4.30)
= 29.0 ± 8.4
where 29.0 is the point estimate and 8.4 is the margin of error

Interpretation of a 95% CI
We are 95% confident the parameter will be captured by the interval.

Other levels of confidence
Let α ≡ the probability the confidence interval will not capture the parameter
1 – α ≡ the confidence level
Confidence level (1 – α)   Alpha level (α)   z_{1–α/2}
.90                        .10               1.645
.95                        .05               1.96
.99                        .01               2.58
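
The z values in this table can be reproduced with the standard Normal inverse CDF. A minimal sketch using Python's `statistics.NormalDist`; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z_{1 - alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.2f}  {1 - level:.2f}  {z_critical(level):.3f}")
```

To three decimals the 99% value is 2.576, which the table rounds to 2.58.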

(1 – α)100% confidence for μ
Formula for a (1-α)100% confidence interval for μ when σ is known:
$\bar{x} \pm z_{1-\alpha/2} \cdot SEM$

Example: 99% CI, same data
Same data as before
99% confidence interval for µ
= x-bar ± (z_{1–.01/2})(SEM)
= x-bar ± (z_{.995})(SEM)
= 29.0 ± (2.58)(4.30)
= 29.0 ± 11.1
= (17.9, 40.1)

Confidence level and CI length
p. 5.9 demonstrates the effect of raising your confidence level → CI length increases → more likely to capture µ
Confidence level   CI for illustrative data   CI length*
90%                (21.9, 36.1)               14.2
95%                (20.6, 37.4)               16.8
99%                (17.9, 40.1)               22.2
* CI length = UCL – LCL
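
A short Python sketch of how CI length grows with the confidence level, using the summary values of the illustrative data (x-bar = 29.0, σ = 13.586, n = 10). The name `ci_for_level` is hypothetical. Because the sketch uses unrounded z values and unrounded limits, the lengths can differ from the table in the last digit.

```python
import math
from statistics import NormalDist

XBAR, SIGMA, N = 29.0, 13.586, 10   # summary values from the illustrative example
SEM = SIGMA / math.sqrt(N)

def ci_for_level(confidence):
    """Return (lcl, ucl, length) of the z interval at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lcl, ucl = XBAR - z * SEM, XBAR + z * SEM
    return lcl, ucl, ucl - lcl

for level in (0.90, 0.95, 0.99):
    lcl, ucl, length = ci_for_level(level)
    print(f"{level:.0%}  ({lcl:.1f}, {ucl:.1f})  length = {length:.1f}")
```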

Beware
Prior CI formula applies only to
 SRS
 Normal SDMs
 σ known ahead of time
It does not account for:
 GIGO
 Poor quality samples (e.g., due to non-
response)

When σ is Not Known
In practice we rarely know σ
Instead, we calculate s and use this as an
estimate of σ
This adds another element of uncertainty to
the inference
A modification of z procedures called
Student’s t distribution is needed to
account for this additional uncertainty

Student’s t distributions
Brilliant!
William Sealy Gosset
(1876-1937) worked for
the Guinness brewing
company and was not
allowed to publish
In 1908, writing under
the pseudonym
“Student,” he described
a distribution that
accounted for the extra
variability introduced by
using s as an estimate
of σ

t Distributions
Student’s t distributions
are like a Standard
Normal distribution but
have broader tails
There is more than one
t distribution (a family)
Each t distribution has its own
degrees of freedom (df)
As df increases, t
becomes increasingly
like z

t table
Each row is for a particular df
Columns contain cumulative
probabilities or tail regions
Table contains t percentiles (like z
scores)
Notation: t_{df,p}
Example: t_{9,.975} = 2.26
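
When no t table is at hand, a t percentile can be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and inverts the result by bisection. The helper names `t_pdf`, `t_cdf`, and `t_percentile` are hypothetical, and in practice a statistics package would normally supply these values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x): 0.5 plus the Simpson's-rule integral of the density from 0 to x."""
    if x == 0:
        return 0.5
    h = x / steps                      # negative h handles x < 0
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_percentile(df, p, lo=-50.0, hi=50.0):
    """Approximate t_{df,p} by bisection on the cumulative probability."""
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (9, 17, 1000):
    print(f"t_{{{df},.975}} = {t_percentile(df, 0.975):.3f}")
```

The values for df = 9 and df = 17 should land near 2.26 and 2.11, and the value for a large df should approach the z value of 1.96.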

95% CI for µ, σ not known
Formula for a (1-α)100% confidence interval for μ when σ is NOT known:
$\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot \text{sem}$
where $\text{sem} = \dfrac{s}{\sqrt{n}}$
Same as the z formula, except replace z_{1–α/2} with t_{n–1,1–α/2} and SEM with sem

Illustrative example: diabetic weight
To what extent are diabetics overweight?
Measure “% of ideal body weight” = (actual body weight) ÷ (ideal body weight) × 100%
Data (n = 18):
{107, 119, 99, 114, 120, 104, 88, 114, 124, 116, 101, 121, 152, 100, 125, 114, 95, 117}
$\bar{x} = 112.778$
$s = 14.424$
$\text{sem} = s / \sqrt{n} = 14.424 / \sqrt{18} = 3.400$
$t_{n-1,\,1-\alpha/2} = t_{18-1,\,1-.05/2} = t_{17,.975} = 2.110$ (from t table)
95% CI for µ
= $\bar{x} \pm (t_{n-1,\,1-\alpha/2})(\text{sem})$
= 112.778 ± (2.110)(3.400)
= 112.778 ± 7.17
= (105.6, 120.0)
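
The calculation for this example can be checked from the 18 data values with a minimal Python sketch. The critical value 2.110 is taken from the t table as on the slide rather than computed; the variable names are illustrative.

```python
import math
import statistics

# "% of ideal body weight" for the n = 18 diabetics in this example
data = [107, 119, 99, 114, 120, 104, 88, 114, 124,
        116, 101, 121, 152, 100, 125, 114, 95, 117]

n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)      # sample SD (n - 1 in the denominator)
sem = s / math.sqrt(n)          # estimated standard error of the mean
t_crit = 2.110                  # t_{17,.975}, read from a t table

margin = t_crit * sem
lcl, ucl = xbar - margin, xbar + margin
print(f"x-bar = {xbar:.3f}, s = {s:.3f}, sem = {sem:.3f}")
print(f"95% CI for mu: ({lcl:.1f}, {ucl:.1f})")
```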

Interpretation of 95% CI for µ
Remember that the CI seeks to capture µ,
NOT x-bar
95% confidence means that 95% of similar
intervals would capture µ (and 5% would not)
For the diabetic body weight illustration, we
can be 95% confident that the population
mean is between 105.6 and 120.0

Sample size requirements
Assume: SRS, Normality, valid data
Let d ≡ the margin of error (half the confidence interval length)
To get a CI with margin of error ±d, use:
$n = \dfrac{4\sigma^2}{d^2}$

Sample size requirements, illustration
Suppose we have a variable with σ = 15
For d = 5, use $n = \dfrac{4 \cdot 15^2}{5^2} = 36$
For d = 2.5, use $n = \dfrac{4 \cdot 15^2}{2.5^2} = 144$
For d = 1, use $n = \dfrac{4 \cdot 15^2}{1^2} = 900$
Smaller margins of error require larger sample sizes
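
The sample size formula can be wrapped in a small Python function. The factor 4 comes from rounding the 95% z value of 1.96 up to 2 and squaring it. Rounding n up to the next whole number is an assumption added here for cases where the formula does not give an integer; the name `sample_size` is illustrative.

```python
import math

def sample_size(sigma, d):
    """Smallest whole n with n >= 4 * sigma**2 / d**2 (95% confidence, z rounded to 2)."""
    return math.ceil(4 * sigma ** 2 / d ** 2)

for d in (5, 2.5, 1):
    print(f"d = {d}: n = {sample_size(15, d)}")  # 36, 144, 900
```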

Acronyms
SRS ≡ simple random sample
SDM ≡ sampling distribution of the mean
SEM ≡ standard error of the mean
CI ≡ confidence interval
LCL ≡ lower confidence limit
UCL ≡ upper confidence limit
