
2023

Point and Interval Estimate

By
Prof. Vishal Singh Patyal

Outline

• Point Estimate Properties
• Confidence Intervals for the Population Mean
• Confidence Intervals for the Population Proportion
• Determining the Required Sample Size for Population Mean and Proportion
• Excel Exercise


Point Estimates
• A point estimate is a single number.
– For the population mean (and population standard deviation), a point estimate is the sample mean (and sample standard deviation).
• A confidence interval provides additional information about variability.

[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]

Point Estimators and Their Properties


• Properties of Point Estimators (UEC)
• Unbiased
• An estimator is unbiased if its expected value equals the
unknown population parameter being estimated.
• Efficient
• An unbiased estimator is efficient if its standard error is
lower than that of other unbiased estimators.
• Consistent
• An estimator is consistent if it approaches the unknown
population parameter being estimated as the sample size
grows larger.
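
The unbiased and consistent properties can be checked exactly on a small case. The sketch below is only an illustration: the population (the six faces of a fair die) and the function names are assumptions, not part of the slides. It enumerates every equally likely sample drawn with replacement, so each average is an exact expected value rather than a simulation.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, variance

# Hypothetical population used only for illustration: faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = mean(population)            # population mean, 3.5
sigma2 = pvariance(population)   # population variance, 35/12

def expected_values(n):
    """Average each estimator over every equally likely sample of size n
    (drawn with replacement), which gives its exact expected value."""
    samples = list(product(population, repeat=n))
    e_mean = mean(mean(s) for s in samples)
    e_var_unbiased = mean(variance(s) for s in samples)  # divides by n - 1
    e_var_biased = mean(pvariance(s) for s in samples)   # divides by n
    return e_mean, e_var_unbiased, e_var_biased

def sampling_sd_of_mean(n):
    """Standard deviation of the sample mean across all samples of size n."""
    return pstdev([mean(s) for s in product(population, repeat=n)])

e_mean, e_s2, e_biased = expected_values(3)
print(f"E[sample mean] = {e_mean:.4f} (mu = {mu})")
print(f"E[s^2, n-1 divisor] = {e_s2:.4f} (sigma^2 = {sigma2:.4f})")
print(f"E[variance, n divisor] = {e_biased:.4f} (biased low)")

# Consistency: the spread of the sample mean shrinks as n grows.
for n in (1, 2, 4):
    print(f"n = {n}: sd of sample mean = {sampling_sd_of_mean(n):.4f}")
```

The sample mean and the n − 1 variance estimator average out to the population values, while the n-divisor version falls short by the factor (n − 1)/n. Efficiency is not shown here; it would require comparing two unbiased estimators of the same parameter.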


Point Estimators and Their Properties


• Properties of Point Estimators Illustrated:
Unbiased Estimators
• The distributions of unbiased (U1) and biased (U2)
estimators.

Point Estimators and Their Properties


• Properties of Point Estimators Illustrated:
Efficient Estimators
• The distributions of efficient (V1) and less efficient
(V2) estimators.


Point Estimators and Their Properties


• Properties of Point Estimators Illustrated:
Consistent Estimator
• The distribution of a consistent estimator $\bar{X}$ for various sample sizes.

Confidence Interval of the Population Mean When σ Is Known

• Confidence interval: provides a range of values that, with a certain level of confidence, contains the population parameter of interest.
– Also referred to as an interval estimate.
• Construct a confidence interval as:
Point estimate ± Margin of error.
– Margin of error accounts for the variability of the estimator and the desired confidence level of the interval.


Confidence Interval Estimates


General formula for all confidence intervals is:

Point Estimate ± (Critical Value) (Standard Error)




Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• Consider a standard normal random variable Z.
• $P(-1.96 \le Z \le 1.96) = 0.95$.

[Figure: standard normal curve with 0.95 of the area between −1.96 and 1.96.]


Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• Since $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$,
• we get $P\left(-1.96 \le \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$,
• which, after algebraic manipulation, is equal to $P\left(\mu - 1.96\,\sigma/\sqrt{n} \le \bar{X} \le \mu + 1.96\,\sigma/\sqrt{n}\right) = 0.95$.


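Intervals and Level of Confidence

[Figure: sampling distribution of the mean, centered at $\mu_{\bar{x}} = \mu$, with area 1 − α in the middle and α/2 in each tail; below it, confidence intervals built around individual sample means.]

• Intervals extend from $\bar{X} - Z\,\sigma/\sqrt{n}$ to $\bar{X} + Z\,\sigma/\sqrt{n}$.
• 100(1 − α)% of intervals constructed contain μ; 100α% do not.
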

Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• Since we do not know μ, we cannot determine if a particular $\bar{x}$ falls within the interval or not.
• However, we do know that $\bar{x}$ will fall within the interval $\mu \pm 1.96\,\sigma/\sqrt{n}$ if and only if μ falls within the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.
• This will happen 95% of the time given the interval construction. Thus, this is a 95% confidence interval for the population mean.


Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• Level of significance (i.e., probability of error) = α.
• Confidence coefficient = (1 − α), so α = 1 − confidence coefficient.
• A 100(1 − α)% confidence interval of the population mean μ when the standard deviation σ is known is computed as $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ or, equivalently, $\left[\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right]$.


Finding the Critical Value, Z

Consider a 95% confidence interval: 1 − α = 0.95, so α/2 = 0.025 in each tail.

[Figure: standard normal curve. Z units: Z = −1.96, 0, Z = 1.96. X units: lower confidence limit, point estimate, upper confidence limit.]


Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• $z_{\alpha/2}$ is the z value associated with the probability of α/2 in the upper tail of the interval $\left[\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right]$.
• Confidence Intervals:
• 90%: α = 0.10, α/2 = 0.05, $z_{\alpha/2} = z_{0.05} = 1.645$.
• 95%: α = 0.05, α/2 = 0.025, $z_{\alpha/2} = z_{0.025} = 1.96$.
• 99%: α = 0.01, α/2 = 0.005, $z_{\alpha/2} = z_{0.005} = 2.575$.
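
The critical values listed above can be reproduced without a z table. This is a minimal sketch using Python's standard library; the function name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the z value with alpha/2 probability in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")

# Check the starting point of the derivation: P(-1.96 <= Z <= 1.96).
central_area = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)
print(f"P(-1.96 <= Z <= 1.96) = {central_area:.4f}")
```

The computed 99% value is about 2.576; the slides use 2.575, a common value read from printed tables, and the difference is negligible in the examples that follow.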


Confidence Level

• Confidence Level
• The degree of confidence that the interval will contain the unknown population parameter
• Expressed as a percentage (less than 100%)

Confidence Interval for μ (σ Known)

• Assumptions
√ Population standard deviation σ is known
√ Population is normally distributed
√ If population is not normal, use large sample
• Confidence interval estimate: $\bar{X} \pm Z\,\dfrac{\sigma}{\sqrt{n}}$
(where Z is the standardized normal distribution critical value for a probability of α/2 in each tail)

Finding the Critical Value, Z

Commonly used confidence levels are 90%, 95%, and 99%.

Confidence   Confidence            Significance
Level        Coefficient (1 − α)   Level (α)      Z value
80%          .80                   .20            1.28
90%          .90                   .10            1.645
95%          .95                   .05            1.96
98%          .98                   .02            2.33
99%          .99                   .01            2.58
99.8%        .998                  .002           3.09
99.9%        .999                  .001           3.29


Confidence Interval of the Population Mean When σ Is Known

• Example: Constructing a Confidence Interval for μ When σ Is Known
• A sample of 25 cereal boxes of Granola Crunch, a generic brand of cereal, yields a mean weight of 1.02 pounds of cereal per box.
• Construct a 95% confidence interval of the mean weight of all cereal boxes.
• Assume that the weight is normally distributed with a population standard deviation of 0.03 pounds.


Confidence Interval of the Population Mean When σ Is Known

• Constructing a Confidence Interval for μ When σ Is Known
• This is what we know: n = 25, $\bar{x}$ = 1.02 pounds, α = 1 − 0.95 = 0.05, $z_{\alpha/2}$ = 1.96, σ = 0.03.
• Substituting these values, we get
$\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.03}{\sqrt{25}} = 1.02 \pm 0.012$
or, with 95% confidence, the mean weight of all cereal boxes falls between 1.008 and 1.032 pounds.
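
The cereal-box calculation can be sketched in a few lines. The function name `z_interval` is an assumption for illustration; the inputs are the values given on the slide.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)   # margin of error
    return x_bar - margin, x_bar + margin

lower, upper = z_interval(x_bar=1.02, sigma=0.03, n=25)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals this matches the interval of 1.008 to 1.032 pounds.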


Confidence Interval of the Population Mean When σ Is Known

• Interpreting a Confidence Interval
• Interpreting a confidence interval requires care.
• Incorrect: The probability that μ falls in the interval is 0.95.
• Correct: If numerous samples of size n are drawn from a given population, then 95% of the intervals formed by the formula $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ will contain μ.
• Since there are many possible samples, we will be right 95% of the time, thus giving us 95% confidence.
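
The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" population values are borrowed from the cereal example, and the seed and number of trials are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical "true" population, borrowed from the cereal example.
MU, SIGMA, N = 1.02, 0.03, 25
Z = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

def interval_covers_mu():
    """Draw one sample, build its 95% interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    margin = Z * SIGMA / sqrt(N)
    return x_bar - margin <= MU <= x_bar + margin

trials = 2000
coverage = sum(interval_covers_mu() for _ in range(trials)) / trials
print(f"Share of intervals containing mu: {coverage:.3f}")
```

Each individual interval either contains μ or it does not; the 95% describes the long-run share of intervals that do, which is what the simulated coverage approximates.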


Confidence Interval of the Population Mean When σ Is Known

• The Width of a Confidence Interval
• Margin of error: $z_{\alpha/2}\,\sigma/\sqrt{n}$
• Confidence interval width: $2\,z_{\alpha/2}\,\sigma/\sqrt{n}$
• The width of the confidence interval is influenced by the:
• Standard deviation σ.
• Sample size n.
• Confidence level 100(1 − α)%: how much you can trust your confidence-interval calculation.


• C.I. Width = 2 × Margin of Error = 2 × (0.012) = 0.024


Confidence Interval of the Population Mean When σ Is Known

• The Width of a Confidence Interval is influenced by:
• For a given confidence level 100(1 − α)% and sample size n, the greater the population standard deviation σ, the wider the interval.
• Example: Let the standard deviation of the population of cereal boxes of Granola Crunch be 0.05 instead of 0.03. Compute a 95% confidence interval based on the same sample information.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.05}{\sqrt{25}} = 1.02 \pm 0.020$
• The confidence interval width has increased from 0.024 to 2(0.020) = 0.040.


Confidence Interval of the Population Mean When σ Is Known

• The Width of a Confidence Interval is influenced by:
• For a given confidence level 100(1 − α)% and population standard deviation σ, the smaller the sample size n, the wider the interval.
• Example: Instead of 25 observations, let the sample be based on 16 cereal boxes of Granola Crunch. Compute a 95% confidence interval using a sample mean of 1.02 pounds and a population standard deviation of 0.03.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 1.96\,\dfrac{0.03}{\sqrt{16}} = 1.02 \pm 0.015$
• The confidence interval width has increased from 0.024 to 2(0.015) = 0.030.


Confidence Interval of the Population Mean When σ Is Known

• The Width of a Confidence Interval is influenced by:
• For a given sample size n and population standard deviation σ, the greater the confidence level 100(1 − α)%, the wider the interval.
• Example: Instead of a 95% confidence interval, compute a 99% confidence interval based on the information from the sample of Granola Crunch cereal boxes.
$\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = 1.02 \pm 2.575\,\dfrac{0.03}{\sqrt{25}} = 1.02 \pm 0.015$
• The confidence interval width has increased from 0.024 to 2(0.015) = 0.030.
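
The three width comparisons can be placed side by side. The helper name `ci_width` is an assumption; the inputs are the values from the Granola Crunch examples, and the widths are shown before the rounding used on the slides.

```python
from math import sqrt

def ci_width(z, sigma, n):
    """Width of a z-based confidence interval: twice the margin of error."""
    return 2 * z * sigma / sqrt(n)

baseline = ci_width(1.96, 0.03, 25)       # original example, 95%
larger_sigma = ci_width(1.96, 0.05, 25)   # sigma raised to 0.05
smaller_n = ci_width(1.96, 0.03, 16)      # n reduced to 16
higher_conf = ci_width(2.575, 0.03, 25)   # confidence raised to 99%

for label, width in [("baseline", baseline), ("larger sigma", larger_sigma),
                     ("smaller n", smaller_n), ("99% confidence", higher_conf)]:
    print(f"{label:>15}: width = {width:.4f}")
```

Every change widens the interval relative to the baseline of about 0.024, in line with the three bullet points.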


Example: Discount Sounds


• Discount Sounds has 260 retail outlets
throughout the United States.
• The firm is evaluating a potential location for a
new outlet, based in part, on the mean annual
income of the individuals in the marketing area of
the new location.
• A sample of size n = 36 was taken; the sample
mean income is $41,100.
• The population is not believed to be highly
skewed. The population standard deviation is
estimated to be $4,500, and the confidence
coefficient to be used in the interval estimate is
0.95.


Margin of error = 1.96 × (4,500/√36) = 1,470.

Thus, at 95% confidence, the margin of error is $1,470.

Interval estimate: $41,100 ± $1,470, or $39,630 to $42,570.
We are 95% confident that the interval contains the population mean.

For a higher degree of confidence, the margin of error and thus the width of the confidence interval must be larger.


Sample Size: Rule of Thumb

• In the majority of applications, a sample size greater than 30 is adequate.
• If the population distribution is highly skewed or contains outliers, a sample size of 50 or more is recommended.
• If the population is not normally distributed but is roughly symmetric, a sample size as small as 15 will suffice.
• If the population is believed to be at least approximately normal, a sample size of less than 15 can be used.


Confidence Interval for μ (σ Known) Example

• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.


Confidence Interval for μ (σ Known) Example

$\bar{X} \pm Z\,\dfrac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068$, that is, (1.9932, 2.4068).
• We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean.

Confidence Interval of the Population Mean When σ Is Unknown

• The t Distribution
– If repeated samples of size n are taken from a normal population with a finite variance, then the statistic $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ follows the t distribution with (n − 1) degrees of freedom, df.
• Degrees of freedom
– Determine the extent of the broadness of the tails of the distribution.
– The fewer the degrees of freedom, the broader the tails.


Student’s t Distribution

[Figure: t distributions with df = 13 and df = 5, both centered at 0.]


Confidence Interval for μ (σ Unknown)

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S.
• This introduces extra uncertainty, since S is variable from sample to sample.
• So we use the t distribution instead of the Z distribution.


Confidence Interval for μ (σ Unknown)

• Assumptions
√ Population standard deviation is unknown
√ Population is normally distributed
√ If population is not normal, use large sample (CLT)
• Use Student’s t Distribution
• Confidence interval estimate: $\bar{X} \pm t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}}$
(where t is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail)


Student’s t Distribution

• The t value depends on degrees of freedom (d.f.)
√ Number of observations that are free to vary after sample mean has been calculated
d.f. = n − 1


Degrees of Freedom

Idea: Number of observations that are free to vary after sample mean has been calculated.

Example: Suppose the mean of 3 numbers is 8.0.
Let X1 = 7
Let X2 = 8
What is X3?
If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary).

Here, n = 3, so degrees of freedom = n − 1 = 3 − 1 = 2.
(2 values can be any numbers, but the third is not free to vary for a given mean)

Confidence Interval of the Population Mean When σ Is Unknown

• Summary of the $t_{df}$ Distribution
• Bell-shaped and symmetric around 0 with asymptotic tails (the tails get closer and closer to the horizontal axis, but never touch it).
• Has slightly broader tails than the z distribution.
• Consists of a family of distributions where the actual shape of each one depends on the df.
• As df increases, the $t_{df}$ distribution becomes similar to the z distribution.
• It is identical to the z distribution when df approaches infinity.


Confidence Interval of the Population Mean When σ Is Unknown

• The $t_{df}$ Distribution with Various Degrees of Freedom


Student’s t Distribution

Note: t → Z as n increases.

[Figure: the standard normal curve (t with df = ∞) plotted with t (df = 13) and t (df = 5), all centered at 0.]

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.


t Distribution

Degrees of   Area in upper tail
Freedom      .20      .10      .05      .025     .01      .005
50           .849     1.299    1.676    2.009    2.403    2.678
60           .848     1.296    1.671    2.000    2.390    2.660
80           .846     1.292    1.664    1.990    2.374    2.639
100          .845     1.290    1.660    1.984    2.364    2.626
Infinite     .842     1.282    1.645    1.960    2.326    2.576


Student’s t Table

       Upper Tail Area
df     .25      .10      .05
1      1.000    3.078    6.314
2      0.817    1.886    2.920
3      0.765    1.638    2.353

Let n = 3, so df = n − 1 = 2. With α = .10, α/2 = .05, and the table gives t = 2.920.

[Figure: t distribution with an area of α/2 = .05 in the upper tail beyond t = 2.920.]

The body of the table contains t values, not probabilities.


Confidence Interval of the Population Mean When σ Is Unknown

• Example: Compute $t_{\alpha,df}$ for α = 0.025 using 2, 5, and 50 degrees of freedom.
• Solution: Turning to the Student’s t Distribution table we find that
• For df = 2, $t_{0.025,2}$ = 4.303.
• For df = 5, $t_{0.025,5}$ = 2.571.
• For df = 50, $t_{0.025,50}$ = 2.009.
• Note that the $t_{df}$ values change with the degrees of freedom.
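
The table lookups can be cross-checked numerically. Python's standard library has no t distribution, so the sketch below integrates the t density with Simpson's rule and finds the critical value by bisection. All function names, the step count, and the search bounds are assumptions chosen for illustration; a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Find t with P(T > t) = alpha by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

critical = {df: t_critical(0.025, df) for df in (2, 5, 50)}
for df, value in critical.items():
    print(f"df = {df:>2}: t = {value:.3f}")
```

The results agree with the table values of 4.303, 2.571, and 2.009 to three decimals, and they shrink toward the z value of 1.96 as df grows.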


Confidence Interval of the Population Mean When σ Is Unknown

• Constructing a Confidence Interval for μ When σ Is Unknown
– A 100(1 − α)% confidence interval of the population mean μ when the population standard deviation σ is not known is computed as $\bar{x} \pm t_{\alpha/2,df}\,\dfrac{s}{\sqrt{n}}$ or, equivalently, $\left[\bar{x} - t_{\alpha/2,df}\,\dfrac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,df}\,\dfrac{s}{\sqrt{n}}\right]$, where s is the sample standard deviation.


Confidence Interval for μ (σ Unknown) Example

A random sample of n = 25 has $\bar{X}$ = 50 and S = 8. Form a 95% confidence interval for μ.
√ d.f. = n − 1 = 24, so $t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.0639$
√ The confidence interval is
$\bar{X} \pm t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}} = 50 \pm (2.0639)\,\dfrac{8}{\sqrt{25}}$
(46.698, 53.302)
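
The arithmetic in this example can be sketched as follows. The function name `t_interval` is an assumption; the critical value is passed in as the table value quoted on the slide rather than computed.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.
    t_crit is the table value t_{alpha/2, n-1} supplied by the caller."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

T_025_24 = 2.0639   # t_{0.025, 24}, as quoted on the slide
lower, upper = t_interval(x_bar=50, s=8, n=25, t_crit=T_025_24)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Passing the critical value as an argument keeps the table lookup and the interval arithmetic as separate steps, mirroring how the slide works through the problem.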


Example: Apartment Rents

• A reporter for a student newspaper is writing an article on the cost of off-campus housing.
• A sample of 16 one-bedroom apartments within a half-mile of campus resulted in a sample mean of $750 per month and a sample standard deviation of $55.
• Let us provide a 95% confidence interval estimate of the mean rent per month for the population of one-bedroom apartments within a half-mile of campus.
• We will assume this population to be normally distributed.


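Interval Estimate

With df = 16 − 1 = 15, the critical value is t(0.025, 15) = 2.131, so the interval is 750 ± 2.131 × (55/√16) = 750 ± 29.30.

We are 95% confident that the mean rent per month for the population of one-bedroom apartments within a half-mile of campus is between $720.70 and $779.30.
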

Summary of Interval Estimation Procedures for a Population Mean


Confidence Interval of the Population Proportion

• Let the parameter p represent the proportion of successes in the population, where success is defined by a particular outcome.
• $\bar{P}$ is the point estimator of the population proportion p.
• By the central limit theorem, $\bar{P}$ can be approximated by a normal distribution for large samples (i.e., np > 5 and n(1 − p) > 5).


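Interval Estimate of a Population Proportion

• The interval estimate is $\bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$, where 1 − α is the confidence coefficient, $z_{\alpha/2}$ is the z value with an area of α/2 in the upper tail, and $\bar{p}$ is the sample proportion.
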

Example: Political Science, Inc.

• Political Science Inc. (PSI) specializes in voter polls and surveys designed to keep political office seekers informed of their position in a race.
• Using telephone surveys, PSI interviewers ask registered voters who they would vote for if the election were held that day.
• In a current election campaign, PSI has just found that 220 registered voters, out of 500 contacted, favor a particular candidate.
• PSI wants to develop a 95% confidence interval estimate for the proportion of the population of registered voters that favor the candidate.


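With $\bar{p} = 220/500 = 0.44$, the interval is $0.44 \pm 1.96\sqrt{0.44(1 - 0.44)/500} = 0.44 \pm 0.0435$.

PSI is 95% confident that the proportion of all voters that favor the candidate is between 0.3965 and 0.4835.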
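
The PSI interval can be reproduced with a short sketch. The function name `proportion_interval` is an assumption; the large-sample check applies the np > 5 and n(1 − p) > 5 rule from the slides to the sample proportion, since p itself is unknown.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_bar = successes / n
    # Rule from the slides, evaluated at the sample proportion.
    if n * p_bar <= 5 or n * (1 - p_bar) <= 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

lower, upper = proportion_interval(successes=220, n=500)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

With 220 successes out of 500, both conditions hold comfortably (220 and 280), and the result matches the interval of 0.3965 to 0.4835.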


Determining Sample Size


• The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 − α).
• The margin of error is also called sampling error:
√ the amount of imprecision in the estimate of the population parameter
√ the amount added and subtracted to the point estimate to form the confidence interval


Determining Sample Size

• To determine the required sample size for the mean, you must know:
√ The desired level of confidence (1 − α), which determines the critical Z value
√ The acceptable sampling error (margin of error), e
√ The standard deviation, σ

$e = Z\,\dfrac{\sigma}{\sqrt{n}}$. Now solve for n to get $n = \dfrac{Z^2\sigma^2}{e^2}$.

Example
• Recall that Discount Sounds is evaluating a
potential location for a new retail outlet, based in
part, on the mean annual income of the
individuals in the marketing area of the new
location.
• Suppose that Discount Sounds’ management
team wants an estimate of the population mean
such that there is a 0.95 probability that the
sampling error is $500 or less.
• How large a sample size is needed to meet the
required precision?
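
A minimal sketch of this calculation follows. It assumes the planning value σ = $4,500 carried over from the earlier Discount Sounds example and the 95% critical value 1.96; the function name is hypothetical.

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """Smallest n with margin of error z * sigma / sqrt(n) no larger than e."""
    return ceil((z * sigma / e) ** 2)

# Assumption: sigma = 4500 is reused from the earlier Discount Sounds example.
n_required = sample_size_for_mean(z=1.96, sigma=4500, e=500)
print(f"Required sample size: {n_required}")
```

The raw value is about 311.17; it is rounded up, not to the nearest integer, so that the margin of error does not exceed $500.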


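Sample Size for an Interval Estimate of a Population Proportion

• Margin of error: $e = z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$
• Solving for n, the necessary sample size is $n = \dfrac{z_{\alpha/2}^{2}\,p^{*}(1-p^{*})}{e^{2}}$, where $p^{*}$ is a planning value for the proportion.
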

Sample Size for an Interval Estimate of a Population Proportion

Necessary Sample Size
The planning value $p^{*}$ can be chosen by:
1. Using the sample proportion from a previous sample of the same or similar units.
2. Selecting a preliminary sample and using the sample proportion from that sample.
3. Using judgment or a “best guess” for the $p^{*}$ value.
4. Otherwise, using $p^{*} = 0.5$.

Example: Political Science, Inc.

• Suppose that PSI would like a 0.99 probability that the sample proportion is within 0.03 of the population proportion.
• How large a sample size is needed to meet the required precision? (A previous sample of similar units yielded 0.44 for the sample proportion.)
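
One way to sketch this calculation is shown below. The function name is an assumption, and the critical value is computed from the normal distribution (about 2.576) rather than read from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_star, e, confidence):
    """Smallest n whose margin of error is at most e, using planning value p_star."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_star * (1 - p_star) / e ** 2)

n_required = sample_size_for_proportion(p_star=0.44, e=0.03, confidence=0.99)
print(f"Required sample size: {n_required}")
```

The planning value enters only through the product p*(1 − p*), which is largest at p* = 0.5, so a planning value of 0.44 asks for slightly fewer observations than the most conservative choice.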


Note
• We used 0.44 as the best estimate of p.
• If no information is available about p, then 0.5 is often
used because it provides the greatest possible sample
size.
• If we had used 𝑝∗ = 0.5, the recommended n would
have been 1843.

Applications in Auditing
• Six advantages of statistical sampling in auditing
• Sample result is objective and defensible
– Based on demonstrable statistical principles
• Provides sample size estimation in advance on an objective basis
• Provides an estimate of the sampling error
• Can provide more accurate conclusions on the population
– Examination of the population can be time consuming and subject to more non-sampling error
• Samples can be combined and evaluated by different auditors
– Samples are based on scientific approach
– Samples can be treated as if they have been done by a single auditor
• Objective evaluation of the results is possible
– Based on known sampling error


Excel Exercise


Fuel Usage of “Ultra-Green” Cars

• A car manufacturer advertises that its new “ultra-green” car obtains an average of 100 mpg and, based on its fuel emissions, has earned an A+ rating from the Environmental Protection Agency.
• Pinnacle Research, an independent consumer advocacy firm, obtains a sample of 25 cars for testing purposes.
• Each car is driven the same distance in identical conditions in order to obtain the car’s mpg.


Fuel Usage of “Ultra-Green” Cars

• The mpg for each “Ultra-Green” car is given below.

Jared would like to use the data in this sample to:
– Estimate with 90% confidence
• The mean mpg of all ultra-green cars.
• The proportion of all ultra-green cars that obtain over 100 mpg.
– Determine the sample size needed to achieve a specified level of precision in the mean and proportion estimates.


Confidence Interval of the Population Mean When σ Is Unknown

• Example: Recall that Jared Beane wants to estimate the mean mpg of all ultra-green cars. Use the sample information to construct a 90% confidence interval of the population mean. Assume that mpg follows a normal distribution.
– Solution: Since the population standard deviation is not known, the sample standard deviation has to be computed from the sample. As a result, the 90% confidence interval is
$\bar{x} \pm t_{\alpha/2,df}\,\dfrac{s}{\sqrt{n}} = 96.52 \pm 1.711\,\dfrac{10.70}{\sqrt{25}} = 96.52 \pm 3.66$


Confidence Interval of the Population Mean When σ Is Unknown

• Using Excel to construct confidence intervals. The easiest way to estimate the mean when the population standard deviation is unknown is as follows:
– Open the MPG data file.
– From the menu choose Data > Data Analysis > Descriptive Statistics > OK.
– Specify the values as shown here and click OK.
– Scroll down through the output until you see the Confidence Interval.


Confidence Interval of the Population Proportion

• Example: Recall that Jared Beane wants to estimate the proportion of all ultra-green cars that obtain over 100 mpg. Use the sample information (sample proportion is 0.28, sample size is 25) to construct a 90% confidence interval of the population proportion.
– Solution: Note that $\bar{p} = 7/25 = 0.28$. In addition, the normality assumption is met since np > 5 and n(1 − p) > 5. Thus,
$\bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.28 \pm 1.645\sqrt{\dfrac{0.28(1-0.28)}{25}} = 0.28 \pm 0.148$


Selecting a Useful Sample Size

• Example: Recall that Jared Beane wants to construct a 90% confidence interval of the mean mpg of all ultra-green cars.
• Suppose Jared would like to constrain the margin of error to within 2 mpg. Further, the lowest mpg in the population is 76 mpg and the highest is 118 mpg.
• How large a sample does Jared need to compute the 90% confidence interval of the population mean?
• With D = 2 and the standard deviation estimated as $\hat{\sigma}$ = range/4 = (118 − 76)/4 = 10.50,
$n = \left(\dfrac{z_{\alpha/2}\,\hat{\sigma}}{D}\right)^{2} = \left(\dfrac{1.645 \times 10.50}{2}\right)^{2} = 74.58$, or 75.
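
The steps of this example can be sketched as below. The helper name `sigma_from_range` is an assumption; it encodes the range/4 approximation implied by the 10.50 used in the slide's calculation.

```python
from math import ceil

def sigma_from_range(low, high):
    """Rough planning estimate of the standard deviation: range / 4."""
    return (high - low) / 4

sigma_hat = sigma_from_range(76, 118)   # (118 - 76) / 4 = 10.5
z_90 = 1.645                            # z_{0.05}, from the slides
D = 2                                   # desired margin of error in mpg

n_raw = (z_90 * sigma_hat / D) ** 2
n_required = ceil(n_raw)                # always round up
print(f"sigma_hat = {sigma_hat}, n = {n_raw:.2f}, rounded up to {n_required}")
```

Rounding up guarantees the margin of error does not exceed D; rounding 74.58 down to 74 would leave it slightly above 2 mpg.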


Selecting a Useful Sample Size

• Selecting n to Estimate p
• Consider a confidence interval for p and let D denote the desired margin of error.
• Since $D = z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$, where $\bar{p}$ is the sample proportion, we may rearrange to get $n = \left(\dfrac{z_{\alpha/2}}{D}\right)^{2}\bar{p}(1-\bar{p})$.
• Since $\bar{p}$ comes from a sample, we must use a reasonable estimate of p, that is, $\hat{p}$.


Selecting a Useful Sample Size

• Selecting n to Estimate p
• For a desired margin of error D, the minimum sample size n required to estimate a 100(1 − α)% confidence interval of the population proportion p is
$n = \left(\dfrac{z_{\alpha/2}}{D}\right)^{2}\hat{p}(1-\hat{p})$
• where $\hat{p}$ is a reasonable estimate of p in the planning stage.


Selecting a Useful Sample Size

• Example: Recall that Jared Beane wants to construct a 90% confidence interval of the proportion of all ultra-green cars that obtain over 100 mpg.
• Jared does not want the margin of error to be more than 0.10.
• How large a sample does Jared need for his analysis of the population proportion?
$n = \left(\dfrac{z_{\alpha/2}}{D}\right)^{2}\hat{p}(1-\hat{p}) = \left(\dfrac{1.645}{0.10}\right)^{2}(0.50)(1-0.50) = 67.65$, or 68.

Thank You
