
Estimation of Parameters
Point Estimate for Population μ
Point Estimate
• A single value estimate for a population parameter
• The most unbiased point estimate of the population mean μ is the sample mean x̄.

Estimate Population Parameter…    with Sample Statistic
Mean: μ                           x̄
Example: Point Estimate for Population μ
Market researchers use the number of sentences per advertisement as a
measure of readability for magazine advertisements. The following
represents a random sample of the number of sentences found in 50
advertisements. Find a point estimate of the
population mean, μ. (Source: Journal of Advertising Research)

9 20 18 16 9 9 11 13 22 16 5 18 6 6 5 12 25
17 23 7 10 9 10 10 5 11 18 18 9 9 17 13 11 7
14 6 11 12 11 6 12 14 11 9 18 12 12 17 11 20
Solution: Point Estimate for Population μ
The sample mean of the data is

$$\bar{x} = \frac{\sum x}{n} = \frac{620}{50} = 12.4$$

Your point estimate for the mean length of all magazine advertisements is 12.4 sentences.
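The same calculation can be checked with a short Python sketch. The data are the 50 values listed in the example; the function name `point_estimate_mean` is only an illustrative choice.

```python
# Number of sentences in each of the 50 sampled advertisements (from the example).
sentences = [
    9, 20, 18, 16, 9, 9, 11, 13, 22, 16, 5, 18, 6, 6, 5, 12, 25,
    17, 23, 7, 10, 9, 10, 10, 5, 11, 18, 18, 9, 9, 17, 13, 11, 7,
    14, 6, 11, 12, 11, 6, 12, 14, 11, 9, 18, 12, 12, 17, 11, 20,
]


def point_estimate_mean(sample):
    """Return the sample mean, the point estimate of the population mean."""
    return sum(sample) / len(sample)


n = len(sentences)
x_bar = point_estimate_mean(sentences)
print(n, sum(sentences), x_bar)
```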
Interval Estimate
Interval estimate
• An interval, or range of values, used to estimate a population parameter.

[Figure: number line with the point estimate 12.4 marked as a dot inside a bracketed interval estimate.]

How confident do we want to be that the interval estimate contains the population mean μ?
Level of Confidence
Level of confidence c
• The probability that the interval estimate contains the population
parameter.
• c is the area under the standard normal curve between the critical values –zc and zc.
• The remaining area in the tails is 1 – c, so each tail has area ½(1 – c).
• Use the Standard Normal Table to find the corresponding z-scores.

[Figure: standard normal curve centered at z = 0, with area c between –zc and zc and area ½(1 – c) in each tail.]
Level of Confidence
• If the level of confidence is 90%, this means that we are 90% confident
that the interval contains the population mean μ.

[Figure: standard normal curve with c = 0.90 between the critical values and ½(1 – c) = 0.05 in each tail.]

The corresponding z-scores are ±1.645, that is, –zc = –1.645 and zc = 1.645.
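As an alternative to the Standard Normal Table, the critical value can be computed from the inverse of the standard normal cumulative distribution. This is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(c):
    """Critical value z_c such that the area between -z_c and z_c is c."""
    tail = (1 - c) / 2                      # area in each tail, (1/2)(1 - c)
    return NormalDist().inv_cdf(1 - tail)   # z-score with area 1 - tail to its left


for c in (0.90, 0.95, 0.99):
    print(c, round(z_critical(c), 3))
```

For c = 0.90 this gives the value 1.645 used above.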
Sampling Error
Sampling error
• The difference between the point estimate and the actual population parameter value.
• For μ:
 - the sampling error is the difference x̄ – μ
 - μ is generally unknown
 - x̄ varies from sample to sample
Margin of Error
Margin of error
• The greatest possible distance between the point estimate and the value of the parameter it is estimating for a given level of confidence, c.
• Denoted by E.

$$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$$

When n ≥ 30, the sample standard deviation, s, can be used for σ.
• Sometimes called the maximum error of estimate or error tolerance.
Example: Finding the Margin of Error
Use the magazine advertisement data and a 95%
confidence level to find the margin of error for the
mean number of sentences in all magazine
advertisements. Assume the sample standard deviation
is about 5.0.

Solution: Finding the Margin of Error
• First find the critical values.

[Figure: standard normal curve with area 0.95 between –zc = –1.96 and zc = 1.96, and area 0.025 in each tail.]

95% of the area under the standard normal curve falls within 1.96 standard deviations of the mean. (You can approximate the distribution of the sample means with a normal curve by the Central Limit Theorem, because n ≥ 30.)
You don’t know σ, but since n ≥ 30, you can use s in place of σ:

$$E = z_c \frac{\sigma}{\sqrt{n}} \approx z_c \frac{s}{\sqrt{n}} = 1.96 \cdot \frac{5.0}{\sqrt{50}} \approx 1.4$$

You are 95% confident that the margin of error for the population mean is about 1.4 sentences.
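The margin of error arithmetic can be reproduced with a few lines of Python. This is only a sketch; `margin_of_error` is a hypothetical helper name, and s is passed in place of σ as the solution allows for n ≥ 30.

```python
import math


def margin_of_error(z_c, sigma, n):
    """E = z_c * sigma / sqrt(n); pass s for sigma when n >= 30 and sigma is unknown."""
    return z_c * sigma / math.sqrt(n)


E = margin_of_error(1.96, 5.0, 50)
print(round(E, 1))
```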
Confidence Intervals for the Population Mean

A c-confidence interval for the population mean μ

• $\bar{x} - E < \mu < \bar{x} + E$, where $E = z_c \frac{\sigma}{\sqrt{n}}$
• The probability that the confidence interval contains μ is c.
Constructing Confidence Intervals for μ
Finding a Confidence Interval for a Population Mean
(n ≥ 30 or σ known with a normally distributed population)

1. Find the sample statistics n and x̄.
   In symbols: $\bar{x} = \frac{\sum x}{n}$
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ.
   In symbols: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
3. Find the critical value zc that corresponds to the given level of confidence.
   Use the Standard Normal Table.
4. Find the margin of error E.
   In symbols: $E = z_c \frac{\sigma}{\sqrt{n}}$
5. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint x̄ – E, right endpoint x̄ + E, interval x̄ – E < μ < x̄ + E
Example: Constructing a Confidence
Interval
Construct a 95% confidence interval for the mean number of sentences in
all magazine advertisements.

Solution: Recall x̄ = 12.4 and E = 1.4.

Left endpoint: x̄ – E = 12.4 – 1.4 = 11.0
Right endpoint: x̄ + E = 12.4 + 1.4 = 13.8

11.0 < μ < 13.8

[Figure: number line showing the interval from 11.0 to 13.8 with the point estimate 12.4 at its center.]

With 95% confidence, you can say that the population mean number of sentences is between 11.0 and 13.8.
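The steps for a z-based interval can be combined into one small function. The sketch below assumes the conditions stated for this procedure hold (n ≥ 30, or σ known with a normally distributed population); the name `z_confidence_interval` is an illustrative choice, and the critical value is computed rather than read from the table.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, c):
    """Return (left, right) endpoints of a c-confidence interval for the mean."""
    z_c = NormalDist().inv_cdf(1 - (1 - c) / 2)  # critical value for confidence c
    E = z_c * sigma / math.sqrt(n)               # margin of error
    return x_bar - E, x_bar + E


# Magazine advertisement data: x_bar = 12.4, s = 5.0 used for sigma, n = 50.
left, right = z_confidence_interval(12.4, 5.0, 50, 0.95)
print(round(left, 1), round(right, 1))
```

Rounded to one decimal place, the endpoints agree with the interval 11.0 < μ < 13.8.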
Example: Constructing a Confidence
Interval σ Known
A college admissions director wishes to estimate the mean
age of all students currently enrolled. In a random sample of
20 students, the mean age is found to be 22.9 years. From
past studies, the standard deviation is known to be 1.5
years, and the population is normally distributed. Construct
a 90% confidence interval for the population mean age.
Solution: Constructing a Confidence
Interval σ Known
• First find the critical values.

[Figure: standard normal curve with c = 0.90 between –zc = –1.645 and zc = 1.645, and ½(1 – c) = 0.05 in each tail.]

zc = 1.645
• Margin of error:

$$E = z_c \frac{\sigma}{\sqrt{n}} = 1.645 \cdot \frac{1.5}{\sqrt{20}} \approx 0.6$$

• Confidence interval:

Left endpoint: x̄ – E = 22.9 – 0.6 = 22.3
Right endpoint: x̄ + E = 22.9 + 0.6 = 23.5

22.3 < μ < 23.5

[Figure: number line showing the interval from x̄ – E = 22.3 to x̄ + E = 23.5 with the point estimate x̄ = 22.9 at its center.]

With 90% confidence, you can say that the mean age of all the students is between 22.3 and 23.5 years.

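• Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate the population mean μ is

$$n = \left(\frac{z_c \sigma}{E}\right)^2$$

• The formula comes from solving the margin of error for n:

$$E = z_c \frac{\sigma}{\sqrt{n}} \;\Rightarrow\; E^2 = \frac{z_c^2 \sigma^2}{n} \;\Rightarrow\; n = \frac{z_c^2 \sigma^2}{E^2} = \left(\frac{z_c \sigma}{E}\right)^2$$

• If σ is unknown, you can estimate it using s provided you have a preliminary sample with at least 30 members.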
Example: Sample Size
You want to estimate the mean number of sentences in a
magazine advertisement. How many magazine advertisements
must be included in the sample if you want to be 95% confident
that the sample mean is within one sentence of the population
mean? Assume the sample standard deviation is about 5.0.

Solution: Sample Size
• First find the critical values.

[Figure: standard normal curve with area 0.95 between –zc = –1.96 and zc = 1.96, and area 0.025 in each tail.]

zc = 1.96, σ ≈ s = 5.0, E = 1

$$n = \left(\frac{z_c \sigma}{E}\right)^2 \approx \left(\frac{1.96 \cdot 5.0}{1}\right)^2 = 96.04$$

When necessary, round up to obtain a whole number.

You should include at least 97 magazine advertisements in your sample.
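The round-up step is easy to get wrong by hand, so here is a minimal Python sketch of the sample size calculation. The function name `min_sample_size_mean` is hypothetical; `math.ceil` performs the rounding up.

```python
import math


def min_sample_size_mean(z_c, sigma, E):
    """Smallest whole n with z_c * sigma / sqrt(n) <= E, i.e. n >= (z_c * sigma / E)^2."""
    return math.ceil((z_c * sigma / E) ** 2)


# z_c = 1.96 (95% confidence), s = 5.0 used for sigma, E = 1 sentence.
print(min_sample_size_mean(1.96, 5.0, 1))
```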
The t-Distribution
• When the population standard deviation is unknown, the sample size is less than 30, and the random variable x is approximately normally distributed, the statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

follows a t-distribution.
• Critical values of t are denoted by tc.
In statistics, the t-distribution was first derived in 1876 by Helmert
and Lüroth. In the English-language literature it takes its name
from William Sealy Gosset's 1908 paper in Biometrika under the
pseudonym "Student", published while he worked at the Guinness
Brewery in Dublin, Ireland. One version of the origin of the
pseudonym is that Gosset's employer forbade members of its staff
from publishing scientific papers, so he had to hide his identity.
Another version is that Guinness did not want their competition to
know that they were using the t-test to test the quality of raw
material. The t-test and the associated theory became well-known
through the work of Ronald A. Fisher, who called the distribution
"Student's distribution".
T-distribution’s mathematical formulas:
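For reference, and as an assumption about which formula this heading refers to, the probability density of Student's t-distribution with ν degrees of freedom is commonly written as:

```latex
f(t) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}
            {\sqrt{\nu \pi}\; \Gamma\!\left(\frac{\nu}{2}\right)}
       \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}},
\qquad -\infty < t < \infty
```

Here Γ is the gamma function and ν corresponds to the degrees of freedom, d.f.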
Properties of the t-Distribution

1. The t-distribution is bell shaped and symmetric about the mean.


2. The t-distribution is a family of curves, each determined by a parameter
called the degrees of freedom. The degrees of freedom are the number
of free choices left after a sample statistic such as x̄ is calculated.
When you use a t-distribution to estimate a population mean, the
degrees of freedom are equal to one less than the sample size.
 - d.f. = n – 1 (degrees of freedom)
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are equal to zero.
5. As the degrees of freedom increase, the t-distribution approaches the normal
distribution. After 30 d.f., the t-distribution is very close to the standard
normal z-distribution.

The tails in the t-distribution are “thicker” than those in the standard normal distribution.

[Figure: t-curves for d.f. = 2 and d.f. = 5 plotted with the standard normal curve, all centered at t = 0.]
Example: Critical Values of t
Find the critical value tc for a 95% confidence level when the sample size is 15.
Solution: d.f. = n – 1 = 15 – 1 = 14
Using Table 5: t-Distribution with d.f. = 14 and c = 0.95,

tc = 2.145

95% of the area under the t-distribution curve with 14 degrees of freedom lies between t = ±2.145.

[Figure: t-curve with area c = 0.95 between –tc = –2.145 and tc = 2.145.]
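If a t-table is not at hand, the critical value can be approximated numerically. The sketch below is not the table lookup used in the example. It integrates the standard t density with Simpson's rule and searches for tc by bisection, using only Python's standard library. All function names are illustrative assumptions.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|] and symmetry about 0."""
    b = abs(t)
    h = b / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the curve between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical(c, df):
    """Critical value t_c with area c between -t_c and t_c."""
    target = 1 - (1 - c) / 2           # area to the left of t_c
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:      # widen the bracket until it contains t_c
        hi *= 2
    for _ in range(60):                # bisection search
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 14), 3))
```

For c = 0.95 and d.f. = 14 the result rounds to the table value 2.145.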
Confidence Intervals for the Population Mean

A c-confidence interval for the population mean μ

• $\bar{x} - E < \mu < \bar{x} + E$, where $E = t_c \frac{s}{\sqrt{n}}$
• The probability that the confidence interval contains μ is c.
1. Identify the sample statistics n, x̄, and s.
   In symbols: $\bar{x} = \frac{\sum x}{n}$, $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
2. Identify the degrees of freedom, the level of confidence c, and the critical value tc.
   In symbols: d.f. = n – 1
3. Find the margin of error E.
   In symbols: $E = t_c \frac{s}{\sqrt{n}}$
4. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint x̄ – E, right endpoint x̄ + E, interval x̄ – E < μ < x̄ + E
Example: Constructing a Confidence
Interval
You randomly select 16 coffee shops and measure the temperature of the
coffee sold at each. The sample mean temperature is 162.0ºF with a sample
standard deviation of 10.0ºF. Find the 95% confidence interval for the
mean temperature. Assume the temperatures are approximately normally
distributed.

Solution:
Use the t-distribution (n < 30, σ is unknown,
temperatures are approximately normally distributed.)

Solution: Constructing a Confidence Interval
• Margin of error (with d.f. = 16 – 1 = 15 and c = 0.95, tc = 2.131):

$$E = t_c \frac{s}{\sqrt{n}} = 2.131 \cdot \frac{10}{\sqrt{16}} \approx 5.3$$

• Confidence interval:

Left endpoint: x̄ – E = 162 – 5.3 = 156.7
Right endpoint: x̄ + E = 162 + 5.3 = 167.3

156.7 < μ < 167.3

[Figure: number line showing the interval from x̄ – E = 156.7 to x̄ + E = 167.3 with the point estimate x̄ = 162.0 at its center.]

With 95% confidence, you can say that the mean temperature of coffee sold is between 156.7ºF and 167.3ºF.
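The coffee example can be reproduced with a short sketch in which the critical value tc is supplied by the caller, as it would be after a table lookup. The function name `t_confidence_interval` is an assumption for illustration.

```python
import math


def t_confidence_interval(x_bar, s, n, t_c):
    """Return (left, right, E) for a t-based confidence interval for the mean."""
    E = t_c * s / math.sqrt(n)   # margin of error with the sample standard deviation
    return x_bar - E, x_bar + E, E


# 16 coffee shops: x_bar = 162.0, s = 10.0, t_c = 2.131 (d.f. = 15, c = 0.95).
left, right, E = t_confidence_interval(162.0, 10.0, 16, 2.131)
print(round(left, 1), round(right, 1), round(E, 1))
```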
1. Is n ≥ 30?
   Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$. If σ is unknown, use s instead.
   No: Go to question 2.
2. Is the population normally, or approximately normally, distributed?
   No: Cannot use the normal distribution or the t-distribution.
   Yes: Go to question 3.
3. Is σ known?
   Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$.
   No: Use the t-distribution with $E = t_c \frac{s}{\sqrt{n}}$ and n – 1 degrees of freedom.
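The decision flow for choosing a distribution can be written as a small function. This is a sketch of the three questions in the flowchart; the function name, argument names, and return values are assumptions chosen for illustration.

```python
def choose_distribution(n, sigma_known, population_normal):
    """Return "normal", "t", or None (neither applies) for estimating a mean."""
    if n >= 30:
        return "normal"          # use z_c; substitute s if sigma is unknown
    if not population_normal:
        return None              # small sample from a non-normal population
    return "normal" if sigma_known else "t"   # t uses n - 1 degrees of freedom


print(choose_distribution(50, False, False))  # magazine advertisements
print(choose_distribution(20, True, True))    # student ages, sigma known
print(choose_distribution(16, False, True))   # coffee temperatures
```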
Point Estimate for Population p

Population Proportion
• The probability of success in a single trial of a binomial experiment.
• Denoted by p
Point Estimate for p
• The proportion of successes in a sample.
• Denoted by p̂, read as “p hat”:

$$\hat{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{number in sample}}$$
Estimate Population Parameter…    with Sample Statistic
Proportion: p                     p̂

Point Estimate for q, the proportion of failures
• Denoted by q̂ = 1 – p̂
• Read as “q hat”
Example: Point Estimate for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch
is football. Find a point estimate for the population proportion of U.S.
adults who say their favorite sport to watch is football. (Adapted from
The Harris Poll)

Solution: n = 1219 and x = 354

$$\hat{p} = \frac{x}{n} = \frac{354}{1219} \approx 0.290402 \approx 29.0\%$$
Think of the sample proportion as

$$\hat{p} = \frac{x}{n}, \quad \text{where } x \text{ is binomial.}$$

Remember: the mean of the binomial distribution is np and its variance is npq. Also, when we divide the data points by a number, the new mean gets divided by that number and the variance gets divided by the square of that number.

So the mean of p̂ is the mean of x divided by n, and the standard deviation of p̂ is the standard deviation of x divided by n:

$$\mu_{\hat{p}} = \frac{np}{n} = p$$

$$\sigma_{\hat{p}}^2 = \frac{npq}{n^2} = \frac{pq}{n}$$

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$
Constructing Confidence Intervals for p

1. Identify the sample statistics n and x.
2. Find the point estimate p̂.
   In symbols: $\hat{p} = \frac{x}{n}$
3. Verify that the sampling distribution of p̂ can be approximated by the normal distribution.
   In symbols: $n\hat{p} \ge 5$, $n\hat{q} \ge 5$
4. Find the critical value zc that corresponds to the given level of confidence c.
   Use the Standard Normal Table.
5. Find the margin of error E.
   In symbols: $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$
6. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint p̂ – E, right endpoint p̂ + E, interval p̂ – E < p < p̂ + E
Example: Confidence Interval for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch
is football. Construct a 95% confidence interval for the proportion of
adults in the United States who say that their favorite sport to watch is
football.

Solution: Recall p̂ ≈ 0.290402

q̂ = 1 – p̂ = 1 – 0.290402 = 0.709598
Solution: Confidence Interval for p
• Verify that the sampling distribution of p̂ can be approximated by the normal distribution:

np̂ = 1219 · 0.290402 ≈ 354 > 5
nq̂ = 1219 · 0.709598 ≈ 865 > 5

• Margin of error:

$$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} \approx 1.96 \sqrt{\frac{(0.290402)(0.709598)}{1219}} \approx 0.025$$

• Confidence interval:

Left endpoint: p̂ – E ≈ 0.29 – 0.025 = 0.265
Right endpoint: p̂ + E ≈ 0.29 + 0.025 = 0.315

0.265 < p < 0.315

[Figure: number line showing the interval from p̂ – E = 0.265 to p̂ + E = 0.315 with the point estimate p̂ = 0.29 at its center.]

With 95% confidence, you can say that the proportion of adults who say football is their favorite sport is between 26.5% and 31.5%.
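The procedure for a proportion, including the check that np̂ and nq̂ are at least 5, can be sketched in Python as follows. The function name `proportion_confidence_interval` is hypothetical, and zc is passed in as read from the Standard Normal Table.

```python
import math


def proportion_confidence_interval(x, n, z_c):
    """Return (left, right, E) for a confidence interval for a proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # The normal approximation is used only when both counts are at least 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    E = z_c * math.sqrt(p_hat * q_hat / n)
    return p_hat - E, p_hat + E, E


# Survey: 354 of 1219 adults, 95% confidence (z_c = 1.96).
left, right, E = proportion_confidence_interval(354, 1219, 1.96)
print(round(left, 4), round(right, 4), round(E, 3))
```

Without intermediate rounding the endpoints are about 0.2649 and 0.3159. The worked solution rounds p̂ to 0.29 and E to 0.025 before adding, which is why it reports 0.315 for the right endpoint.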
Sample Size
• Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate p is

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2$$

• This formula assumes you have an estimate for p̂ and q̂.
• If not, use p̂ = 0.5 and q̂ = 0.5.
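Sample Size:

$$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} \;\Rightarrow\; E^2 = \frac{z_c^2\, \hat{p}\hat{q}}{n} \;\Rightarrow\; n = \frac{z_c^2\, \hat{p}\hat{q}}{E^2}$$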
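p̂ = 0.5 and q̂ = 0.5

The product pq is largest when p and q are both 0.5. Since p + q = 1,

$$y = pq = p(1 - p) = p - p^2$$

This is a downward-opening parabola in p. Its slope $\frac{dy}{dp} = 1 - 2p$ is zero at p = 0.5, so the maximum, y = 0.25, occurs at p = 0.5, q = 0.5.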
Example: Sample Size
You are running a political campaign and wish to estimate, with 95% confidence, the proportion of registered voters who will vote for your candidate. Your estimate must be accurate within 3% of the true population proportion. Find the minimum sample size needed if
1. no preliminary estimate is available.
Solution:
Because you do not have a preliminary estimate for p̂, use p̂ = 0.5 and q̂ = 0.5.
Solution: Sample Size
• c = 0.95, zc = 1.96, E = 0.03

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2 = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 \approx 1067.11$$

Round up to the nearest whole number.

With no preliminary estimate, the minimum sample size should be at least 1068 voters.
Example: Sample Size
You are running a political campaign and wish to estimate, with 95% confidence, the proportion of registered voters who will vote for your candidate. Your estimate must be accurate within 3% of the true population proportion. Find the minimum sample size needed if
2. a preliminary estimate gives p̂ = 0.31.

Solution:
Use the preliminary estimate p̂ = 0.31.
q̂ = 1 – p̂ = 1 – 0.31 = 0.69
Solution: Sample Size
• c = 0.95, zc = 1.96, E = 0.03

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2 = (0.31)(0.69)\left(\frac{1.96}{0.03}\right)^2 \approx 913.02$$

Round up to the nearest whole number.

With a preliminary estimate of p̂ = 0.31, the minimum sample size should be at least 914 voters. A larger sample size is needed if no preliminary estimate is available.
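Both campaign calculations can be checked with one function in which the preliminary estimate defaults to 0.5. This is a minimal sketch; the name `min_sample_size_proportion` and the default argument are illustrative choices.

```python
import math


def min_sample_size_proportion(z_c, E, p_hat=0.5):
    """Smallest whole n satisfying n >= p_hat * q_hat * (z_c / E)^2."""
    q_hat = 1 - p_hat
    return math.ceil(p_hat * q_hat * (z_c / E) ** 2)


print(min_sample_size_proportion(1.96, 0.03))        # no preliminary estimate
print(min_sample_size_proportion(1.96, 0.03, 0.31))  # preliminary estimate 0.31
```

Using 0.5 as the default is the conservative choice because it maximizes p̂q̂, so the resulting n is large enough for any true proportion.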
Sample Problems

1. A simple random sample of 36 employees in a large manufacturing company yields an average length of service of 8 years with a standard deviation of 5 years. Determine (a) a 95% confidence interval for μ. (b) a 99% confidence interval for μ.

3. The standard deviation of a random sample of 36, taken from a large population, is 18.2. How large a sample is required if we want to be 95% confident that our estimate of μ will not be off by more than 4?

4. The time required to finish an assembly job is believed to be normally distributed with a standard deviation of 16 minutes. How large a sample is required if we want to have a probability of .90 that the sample mean will differ from the true mean by at most 2.2 minutes?

5. A random sample of size 25, taken from a normal population, has a mean of 80 and a standard deviation of 5. Construct a 95% confidence interval for the mean μ of the population.

6. Ten test runs were conducted in order to estimate the average time required to assemble a mechanical device. The results (rounded off to the nearest minute) are
22, 24, 28, 30, 26, 32, 35, 20, 24, 25
Construct a 99% confidence interval for the true mean.