
Statistical Inference → Statistical methods that are used to make decisions and draw conclusions about populations.

§ Populations are described by their probability distributions and parameters.
§ For quantitative populations, the location and shape are described by µ and σ.
§ If the values of parameters are unknown, we make inferences about them using sample information.

Estimation:
→ Estimating or predicting the value of the parameter
e.g. “What is (are) the most likely values of 𝜇 or 𝑝?”

Hypothesis Testing:
→ Deciding about the value of a parameter based on some preconceived idea
e.g. “Did the sample come from a population with 𝜇 = 5 or 𝑝 = 0.2?”

Examples:
§ A consumer wants to estimate the average price of similar homes in her city before putting her home on the market.
Estimation: Estimate 𝜇, the average home price.
§ A manufacturer wants to know if a new type of steel is more resistant to high temperatures than the old type was.
Hypothesis test: Is the new steel’s average resistance, μ_N, equal to the old steel’s average resistance, μ_O?

Whether you are estimating parameters or testing hypotheses, statistical methods are important because they provide:
§ Methods for making the inference
§ A numerical measure of the goodness or reliability of the inference

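[Diagram: a random sample is drawn from the population; a statistic (estimator) computed from the sample is used to estimate the population parameter.]
In a random sample, every member of the population has the same probability of being selected.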
Population
§ A population is the set of all elements of interest for a particular study.
§ Quantities such as the population mean (µ) are known as population parameters.
§ You can’t normally get information on every element in the population.

Sample
§ A sample is a subset of the population.
§ Quantities such as the sample mean (X̄), also known as sample statistics, are the estimators of population parameters.
§ Inferences need to be made from a sample.
§ It’s vital that any sample is representative of the population.

[Figure: a random sample with mean X̄ = 50 is drawn from a population whose mean µ is unknown, leading to the statement “I believe µ is between 40 and 60.”]

In estimating a population parameter, we want the estimate to be accurate and precise (a short interval with a high degree of confidence).

Accuracy shows how close the sample statistic is to the population parameter (the true value).

Precision shows whether the sample variance is large or small: the smaller the variance, the more precise the estimate.


The objective of estimation is to determine the approximate
value of a population parameter on the basis of a sample
statistic.
There are two types of estimators:

§ Point Estimator

§ Interval Estimator

§ A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.

§ We saw earlier that point probabilities in continuous distributions were virtually zero. Likewise, we’d expect that the point estimator gets closer to the parameter value with an increased sample size, but point estimators don’t reflect the effects of larger sample sizes. Hence we will employ the interval estimator to estimate population parameters…

§ An interval estimator draws inferences about a population
by estimating the value of an unknown parameter using an
interval.

§ That is, we say (with some ___% certainty) that the population parameter of interest is between some lower and upper bounds.

For example, suppose we want to estimate the mean summer income of a class of business students. For 𝑛 = 25 students,
§ X̄ is calculated to be 400 $/week. (point estimate)

An alternative statement is:
§ The mean income is between 380 and 420 $/week. (interval estimate)
Parameter | Measure | Statistic
𝜇 | Mean of a single population | X̄
𝜎² | Variance of a single population | 𝑠²
𝜎 | Standard deviation of a single population | 𝑠
𝑝 | Proportion of a single population | P̂
𝜇₁ − 𝜇₂ | Difference in means of two populations | X̄₁ − X̄₂
𝑝₁ − 𝑝₂ | Difference in proportions of two populations | P̂₁ − P̂₂

§ There may be more than one choice for the point estimator of a parameter.
§ To estimate the mean of a population, we could choose the:
§ Sample mean.
§ Sample median.
§ Average of the largest & smallest observations in the sample.
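
As a rough illustration of the three candidate estimators listed above, the sketch below computes each one for a small sample. The data values are hypothetical and chosen only for demonstration; the variable names are not from the slides.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
sample = [12.1, 9.8, 10.4, 11.0, 10.2, 9.5, 13.6]

# Candidate 1: the sample mean.
sample_mean = statistics.mean(sample)

# Candidate 2: the sample median (middle value of the sorted data).
sample_median = statistics.median(sample)

# Candidate 3: the average of the largest and smallest observations.
midrange = (max(sample) + min(sample)) / 2

print(f"mean     = {sample_mean:.3f}")
print(f"median   = {sample_median:.3f}")
print(f"midrange = {midrange:.3f}")
```

All three are point estimates of the same population mean, yet they generally give different values for the same sample.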
For {x1, x2, …, xn} a random sample from a population with unknown mean 𝜇 and variance 𝜎²:
§ Point estimation for parameter 𝜇 is the sample mean X̄:
\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} \]
§ Point estimation for parameter 𝜎² is the sample variance S²:
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1} \]
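
The two point estimators above can be written directly from their formulas. This is a minimal sketch with hypothetical data; the function names `sample_mean` and `sample_variance` are illustrative choices, and the result is compared against Python's standard `statistics` module.

```python
import statistics

def sample_mean(xs):
    """Sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical data, for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0]

print(sample_mean(data))       # point estimate of mu
print(sample_variance(data))   # point estimate of sigma^2

# statistics.variance uses the same n - 1 divisor.
print(statistics.variance(data))
```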

For {x1, x2, …, xn} a random sample from a population whose parameter 𝜃 is unknown:
§ If P(B ≤ 𝜃 ≤ A) = 1 − 𝛼, then:
§ The interval [B, A] is a (1 − 𝛼) confidence interval for the parameter 𝜃
§ (1 − 𝛼) is the confidence level

§ Confidence Level
Interval Estimation for μ (mean of population)
For {x1, x2, …, xn} a random sample from a population with μ unknown and variance 𝜎².
According to the Central Limit Theorem, we can expect the sampling distribution of X̄ to be approximately normally distributed with:
mean \( \mu_{\bar{X}} = \mu \)
and standard deviation \( \sigma_{\bar{X}} = \sigma / \sqrt{n} \)
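
A small simulation can make the statement about the sampling distribution concrete. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ = 1), a sample size of 36, and 5000 replications; all of these are illustrative choices, not values from the slides.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

n = 36               # sample size (assumed for illustration)
replications = 5000  # number of simulated samples
mu, sigma = 1.0, 1.0 # exponential(rate=1) has mean 1 and sd 1

# Draw many samples from a clearly non-normal population
# and record the mean of each sample.
means = []
for _ in range(replications):
    sample = [random.expovariate(1.0) for _ in range(n)]
    means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print(f"mean of sample means: {mean_of_means:.4f} (theory: {mu})")
print(f"sd of sample means:   {sd_of_means:.4f} "
      f"(theory: {sigma / math.sqrt(n):.4f})")
```

The simulated mean and standard deviation of X̄ should land close to μ and σ/√n, even though the population itself is skewed.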

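Writing \(z_{\alpha/2}\) for the z-value above which we find an area of 𝛼/2 under the normal curve, we can see from the figure below that:
\[ P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha, \quad \text{where } Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}. \]
Hence,
\[ P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha. \]
We obtain:
\[ P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]

[Figure: standard normal curve illustrating \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha\)]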
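
The probability statement above says that, over repeated sampling, about 100(1 − α)% of the intervals X̄ ± z·σ/√n contain μ. The following sketch checks this by simulation. The population values (μ = 50, σ = 10), the sample size, and the number of trials are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed for reproducibility

mu, sigma, n = 50.0, 10.0, 25   # hypothetical normal population and sample size
alpha = 0.05

# z-value leaving an area of alpha/2 to the right.
z = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * sigma / math.sqrt(n)

trials = 2000
covered = 0
for _ in range(trials):
    xbar = fmean([random.gauss(mu, sigma) for _ in range(n)])
    # Does this sample's interval contain the true mean?
    if xbar - half_width < mu < xbar + half_width:
        covered += 1

coverage = covered / trials
print(f"z = {z:.3f}, empirical coverage = {coverage:.3f}")
```

The empirical coverage should be near 0.95, with some run-to-run variation if the seed is changed.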


Confidence Interval on μ, 𝜎² Known

If x̄ is the mean of a random sample of size 𝑛 from a population with known variance 𝜎², a 100(1 − 𝛼)% confidence interval for 𝜇 is given by:
\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
where \(z_{\alpha/2}\) is the z-value leaving an area of 𝛼/2 to the right.

The average zinc concentration recovered from a sample of
measurements taken in 36 different locations in a river is
found to be 2.6 grams per milliliter. Find the 95% and 99%
confidence intervals for the mean zinc concentration in the
river. Assume that the population standard deviation is 0.3
gram per milliliter.
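
One way to work this example is sketched below, applying the known-σ interval x̄ ± z·σ/√n with the numbers given in the problem. The helper name `z_interval` is a hypothetical choice; the z-values come from the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Two-sided confidence interval for mu when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # area alpha/2 to the right
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the problem statement: xbar = 2.6, sigma = 0.3, n = 36.
ci95 = z_interval(2.6, 0.3, 36, 0.95)
ci99 = z_interval(2.6, 0.3, 36, 0.99)

print(f"95% CI: {ci95[0]:.2f} < mu < {ci95[1]:.2f}")
print(f"99% CI: {ci99[0]:.2f} < mu < {ci99[1]:.2f}")
```

The 99% interval is wider than the 95% interval: a higher confidence level requires a longer interval for the same data.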

A homeowner randomly samples 64 homes similar to her
own and finds that the average selling price is $252,000 with
a standard deviation of $15,000. Estimate the average selling
price for all similar homes in the city.
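
A possible approach to this exercise is sketched below. The problem does not state a confidence level, so the 95% level used for the margin of error is an assumption, as is treating the sample standard deviation as a stand-in for σ because the sample is large (n = 64).

```python
import math
from statistics import NormalDist

xbar = 252_000   # sample mean selling price (from the problem)
s = 15_000       # sample standard deviation (from the problem)
n = 64           # sample size (from the problem)

# Assumption: 95% confidence, with s used in place of sigma (large sample).
z = NormalDist().inv_cdf(0.975)
margin = z * s / math.sqrt(n)

print(f"point estimate of mu: ${xbar:,.0f}")
print(f"95% margin of error:  ${margin:,.0f}")
print(f"interval: ${xbar - margin:,.0f} to ${xbar + margin:,.0f}")
```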

If x̄ is used as an estimate of 𝜇, we can be 100(1 − 𝛼)% confident that the error will not exceed a specified amount 𝑒 when the sample size is:
\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2 \]

How large a sample is required if we want to be 95%
confident that our estimate of 𝜇 in Example 1 is off by less
than 0.05?
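
The sample-size formula can be applied as in the sketch below. It assumes that "Example 1" refers to the zinc concentration example, so σ = 0.3 is taken from there; the result is rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist

sigma = 0.3        # assumed: population sd from the zinc example
e = 0.05           # maximum allowed error
confidence = 0.95

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)

# n = (z * sigma / e)^2, rounded up to the next whole observation.
n_exact = (z * sigma / e) ** 2
n_required = math.ceil(n_exact)

print(f"(z * sigma / e)^2 = {n_exact:.1f}")
print(f"required sample size: {n_required}")
```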

One-Sided Confidence Bounds on μ, 𝜎² Known

If x̄ is the mean of a random sample of size 𝑛 from a population with known variance 𝜎², the one-sided 100(1 − 𝛼)% confidence bounds for 𝜇 are given by:
upper one-sided bound: \( \bar{x} + z_{\alpha}\,\sigma/\sqrt{n} \)
lower one-sided bound: \( \bar{x} - z_{\alpha}\,\sigma/\sqrt{n} \)

In a psychological testing experiment, 25 subjects are
selected randomly and their reaction time, in seconds, to a
particular stimulus is measured. Past experience suggests
that the variance in reaction times to these types of stimuli is
4 sec² and that the distribution of reaction times is
approximately normal. The average time for the subjects is
6.2 seconds. Give an upper 95% bound for the mean reaction
time.
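
A minimal sketch of the calculation follows, using the upper one-sided bound x̄ + z_α·σ/√n with the values stated in the problem (variance 4, so σ = 2). Note that the one-sided bound uses z_α rather than z_{α/2}.

```python
import math
from statistics import NormalDist

xbar = 6.2                 # sample mean reaction time, in seconds
sigma = math.sqrt(4.0)     # known variance is 4 sec^2, so sigma = 2
n = 25
alpha = 0.05               # upper 95% bound

# One-sided bound: the whole area alpha sits in the right tail.
z_alpha = NormalDist().inv_cdf(1 - alpha)
upper_bound = xbar + z_alpha * sigma / math.sqrt(n)

print(f"z_alpha = {z_alpha:.3f}")
print(f"upper 95% bound for mu: {upper_bound:.3f} seconds")
```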

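𝜎 unknown
→ Use the Student’s t-distribution with 𝑛 − 1 degrees of freedom if the sample is small (𝑛 < 30):
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
Here 𝑆 is the sample standard deviation.

[Figure: t-distribution curve with −t_{α/2} and t_{α/2} marked on the t axis]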
Confidence Interval on μ, 𝜎² Unknown
If x̄ and 𝑠 are the mean and standard deviation of a random sample from a normal population with unknown variance 𝜎², a 100(1 − 𝛼)% confidence interval for 𝜇 is
\[ \bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}} \]
where \(t_{\alpha/2}\) is the t-value with 𝑣 = 𝑛 − 1 degrees of freedom, leaving an area of 𝛼/2 to the right.

Computed one-sided confidence bounds for 𝜇 with 𝜎 unknown are as the reader would expect, namely
\[ \bar{x} + t_{\alpha}\frac{s}{\sqrt{n}} \quad \text{and} \quad \bar{x} - t_{\alpha}\frac{s}{\sqrt{n}} \]
They are the upper and lower 100(1 − 𝛼)% bounds, respectively. Here \(t_{\alpha}\) is the t-value having an area of 𝛼 to the right.

The contents of seven similar containers of sulfuric acid are
9.8, 10.2, 10.4, 9.8, 10.0, 10.2, and 9.6 liters. Find a 95%
confidence interval for the mean contents of all such
containers, assuming an approximately normal distribution.
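
The example can be worked as in the sketch below, using the t-based interval x̄ ± t_{α/2}·s/√n. Python's standard library has no t quantile function, so the critical value 2.447 (t with 6 degrees of freedom, area 0.025 to the right) is entered by hand from a standard t-table; that hard-coded constant is the one assumption here.

```python
import math
import statistics

# Contents of the seven containers, in liters (from the problem).
contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]

n = len(contents)
xbar = statistics.mean(contents)
s = statistics.stdev(contents)   # sample sd, divisor n - 1

# Assumption: t-table value for 0.025 in the right tail, v = n - 1 = 6.
t_crit = 2.447

half_width = t_crit * s / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"xbar = {xbar:.2f}, s = {s:.3f}")
print(f"95% CI: {lower:.2f} < mu < {upper:.2f}")
```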

Often statisticians recommend that even when normality cannot be assumed, 𝜎 is unknown, and 𝑛 ≥ 30, 𝑠 can replace 𝜎 and the following confidence interval may be used:
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]

Scholastic Aptitude Test (SAT) mathematics scores of a
random sample of 500 high school seniors in the state of
Texas are collected, and the sample mean and standard
deviation are found to be 501 and 112, respectively. Find a
99% confidence interval on the mean SAT mathematics score
for seniors in the state of Texas.
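
With n = 500 the large-sample interval x̄ ± z_{α/2}·s/√n applies, with s standing in for σ. The sketch below uses only the values given in the problem; the variable names are illustrative.

```python
import math
from statistics import NormalDist

xbar = 501      # sample mean SAT mathematics score
s = 112         # sample standard deviation
n = 500         # sample size, large enough for s to replace sigma
confidence = 0.99

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * s / math.sqrt(n)

lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.3f}")
print(f"99% CI: {lower:.1f} < mu < {upper:.1f}")
```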
