
STATISTICAL INFERENCE

Confidence Intervals for Population Proportions
Presented by:
MOHAMMAD FAROOQ
ACADEMIC COORDINATOR
MATHEMATICS
LEARNING OBJECTIVES
• Distinguish between a point estimate and a confidence interval estimate
• Form and interpret a confidence interval estimate for a single population proportion p (z distribution)
• Determine the required sample size (n) to estimate a single population proportion
• The Expected Value and Standard Deviation of the Sample Proportion
• Confidence Intervals Comparing Two Proportions
The Sampling Distribution of the Sample Proportion

• Estimator
  - The sample proportion P̄ is used to estimate the population parameter p.
• Estimate
  - A particular value p̄ of the estimator P̄ (written p̂ in the slides that follow).
Point Estimate for Population p
The probability of success in a single trial of a binomial experiment is p.
This probability is a population proportion.

The point estimate for p, the population proportion of successes, is given by the proportion of successes in a sample and is denoted by

$$\hat{p} = \frac{x}{n}$$

where x is the number of successes in the sample and n is the sample size.
The point estimate for the proportion of failures is q̂ = 1 – p̂. The symbols p̂ and q̂ are read as “p hat” and “q hat.”
Example 1: Point Estimate for p
In a survey of 1250 US adults, 450 of them said that their favorite sport to watch is baseball. Find a point estimate for the population proportion of US adults who say their favorite sport to watch is baseball.

n = 1250, x = 450

$$\hat{p} = \frac{x}{n} = \frac{450}{1250} = 0.36$$

The point estimate for the proportion of US adults who say baseball is their favorite sport to watch is 0.36, or 36%.
Confidence Intervals for p
A c-confidence interval for the population proportion p is

$$\hat{p} - E < p < \hat{p} + E$$

where

$$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}.$$

The probability that the confidence interval contains p is c.
Example 1: Continued
Construct a 90% confidence interval for the proportion of US adults who say baseball is their favorite sport to watch.

n = 1250, x = 450, p̂ = 0.36
Example 1: Solution
n = 1250, x = 450, p̂ = 0.36, q̂ = 0.64

$$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.645 \sqrt{\frac{(0.36)(0.64)}{1250}} \approx 0.022$$

Left endpoint: p̂ − E ≈ 0.36 − 0.022 = 0.338
Right endpoint: p̂ + E ≈ 0.36 + 0.022 = 0.382

With 90% confidence we can say that the proportion of all US adults who say baseball is their favorite sport to watch is between 33.8% and 38.2%.
Finding Confidence Intervals for p
Constructing a Confidence Interval for a Population Proportion
Each step is given in words, followed by the same step in symbols.
1. Identify the sample statistics n and x.
2. Find the point estimate p̂. In symbols: $\hat{p} = \frac{x}{n}$
3. Verify that the sampling distribution can be approximated by the normal distribution. In symbols: $n\hat{p} \ge 5$, $n\hat{q} \ge 5$
4. Find the critical value $z_c$ that corresponds to the given level of confidence. Use the Standard Normal Table.
5. Find the margin of error E. In symbols: $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$
6. Find the left and right endpoints and form the confidence interval. Left endpoint: $\hat{p} - E$. Right endpoint: $\hat{p} + E$. Interval: $\hat{p} - E < p < \hat{p} + E$
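The six steps above can be sketched in Python. This is a minimal illustration and not part of the original slides: the function name `proportion_ci` is hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` instead of a printed Standard Normal Table, so it is slightly more precise than 1.645.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, c):
    """Return (p_hat, E, left, right) for a c-confidence interval for p."""
    p_hat = x / n                    # step 2: point estimate
    q_hat = 1 - p_hat
    # Step 3: check that the normal approximation is reasonable.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation not justified (need n*p_hat, n*q_hat >= 5)")
    # Step 4: critical value with area c between -z_c and z_c.
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # Step 5: margin of error.
    E = z_c * sqrt(p_hat * q_hat / n)
    # Step 6: endpoints of the interval.
    return p_hat, E, p_hat - E, p_hat + E


# Data from Example 1: 450 of 1250 adults, 90% confidence.
p_hat, E, left, right = proportion_ci(x=450, n=1250, c=0.90)
print(f"p_hat = {p_hat:.2f}, E = {E:.3f}, interval = ({left:.3f}, {right:.3f})")
```

With the Example 1 data the sketch gives an interval that rounds to (0.338, 0.382), matching the hand calculation.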
Sample Size Proportion p
Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate p is

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2.$$

This formula assumes you have an estimate for p̂ and q̂. If not, use p̂ = 0.5 and q̂ = 0.5.

Example 2:
You wish to estimate, with 95% confidence and within 2% of the true population proportion, the proportion of US adults who say that baseball is their favorite sport to watch. Find the minimum sample size needed, using the preliminary estimate p̂ = 0.36 from Example 1.
Example 2: Solution
n = 1250, x = 450, p̂ = 0.36 (from the survey in Example 1), so q̂ = 0.64.

$$n = \hat{p}\hat{q}\left(\frac{z_c}{E}\right)^2 = (0.36)(0.64)\left(\frac{1.96}{0.02}\right)^2 \approx 2212.8 \quad \text{(Always round up.)}$$

You should sample at least 2213 adults to be 95% confident that the estimate is within 2% of the true proportion.
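The sample-size formula can be checked with a short Python sketch. The function name `min_sample_size` and its default of p̂ = 0.5 (for the case where no preliminary estimate exists) are illustrative choices, not something defined in the slides.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(c, E, p_hat=0.5):
    """Minimum n to estimate p within margin E at confidence level c.

    p_hat defaults to 0.5, the value to use when no prior estimate exists.
    """
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    n = p_hat * (1 - p_hat) * (z_c / E) ** 2
    return ceil(n)  # always round up


# Example 2: 95% confidence, within 2%, preliminary estimate 0.36.
n_with_estimate = min_sample_size(c=0.95, E=0.02, p_hat=0.36)
# Same requirement with no preliminary estimate.
n_without_estimate = min_sample_size(c=0.95, E=0.02)
print(n_with_estimate, n_without_estimate)
```

Using p̂ = 0.5 gives the largest possible sample size for a given c and E, which is why it is the safe choice when nothing is known about p.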


The Expected Value and Standard Deviation of the Sample Proportion

• Expected Value
  - The expected value of P̄: $E(\bar{P}) = p$
  - The standard deviation of P̄: $SD(\bar{P}) = \sqrt{\frac{p(1-p)}{n}}$
The Central Limit Theorem for the Sample Proportion
• For any population proportion p, the sampling distribution of P̄ is approximately normal if the sample size n is sufficiently large.
• As a general guideline, the normal distribution approximation is justified when np > 5 and n(1 − p) > 5.


Standard Normal Z for the Sample p

• The Central Limit Theorem for the Sample Proportion
  - If P̄ is normal, we can transform it into the standard normal random variable as

$$Z = \frac{\bar{P} - E(\bar{P})}{SD(\bar{P})} = \frac{\bar{P} - p}{\sqrt{\frac{p(1-p)}{n}}}$$

  - Therefore any value p̄ on P̄ has a corresponding value z on Z given by

$$z = \frac{\bar{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$
Graph of the Sample Proportion
• The Central Limit Theorem for the Sample Proportion

[Figure: two panels showing the sampling distribution of P̄, one when the population proportion is p = 0.10 and one when it is p = 0.30.]
Example 3:
• From the introductory case, Anne wants to determine if the marketing campaign has had a lingering effect on the proportion of customers who are women and teenage girls.
• Before the campaign, p = 0.43 for women and p = 0.21 for teenage girls. Based on 50 customers sampled after the campaign, p̄ = 0.46 and p̄ = 0.34, respectively.
• Let’s find P(P̄ ≥ 0.46) for women. With n = 50 and p = 0.43, np = 21.5 and n(1 − p) = 28.5 are both greater than 5, so the central limit theorem states that P̄ is approximately normal.
Example 3: Solution

$$P(\bar{P} \ge 0.46) = P\left(Z \ge \frac{\bar{p} - p}{\sqrt{\frac{p(1-p)}{n}}}\right) = P\left(Z \ge \frac{0.46 - 0.43}{\sqrt{\frac{0.43(1 - 0.43)}{50}}}\right)$$

$$= P(Z \ge 0.43) = 1 - 0.6664 = 0.3336$$
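The Example 3 calculation can be reproduced with the following sketch. The helper names `sample_proportion_sd` and `prob_at_least` are hypothetical, and the normal CDF comes from `statistics.NormalDist` instead of a printed z table.

```python
from math import sqrt
from statistics import NormalDist


def sample_proportion_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return sqrt(p * (1 - p) / n)


def prob_at_least(p_bar, p, n):
    """Return (z, P(P_bar >= p_bar)) under the normal approximation."""
    z = (p_bar - p) / sample_proportion_sd(p, n)
    return z, 1 - NormalDist().cdf(z)


# Women customers: p = 0.43 before the campaign, p_bar = 0.46 from n = 50.
z, prob = prob_at_least(p_bar=0.46, p=0.43, n=50)
print(f"z = {z:.2f}, P(P_bar >= 0.46) = {prob:.4f}")
```

The sketch does not round z to two decimals before looking up the probability, so its result (about 0.334) differs slightly in the fourth decimal from the table-based value 0.3336.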
Comparisons of two independent population proportions: Confidence Interval

$$\text{c.i.} = (P_2 - P_1) \pm z \sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$$
EXAMPLE 5: A study of teenage suicide included a sample of 96 boys and 123 girls between the ages of 12 and 16 years, selected scientifically from admissions records to a private psychiatric hospital. Suicide attempts were reported by 18 of the boys and 60 of the girls. We assume that the girls constitute a simple random sample from a population of similar girls, and likewise for the boys. Construct a 99 percent confidence interval for the difference between the two proportions.
Example 5: Solution

Given: n1 = 123, n2 = 96

P1 = 60/123 = 0.4878 (girls), P2 = 18/96 = 0.1875 (boys)
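A minimal Python sketch of the 99% interval for these data follows. The function name `two_proportion_ci` is hypothetical, and the difference is taken as girls minus boys, which is an assumption about the direction; the interval for the opposite order is the same pair of numbers with the signs flipped and the endpoints swapped.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x_a, n_a, x_b, n_b, c):
    """c-confidence interval for p_a - p_b from two independent samples."""
    p_a, p_b = x_a / n_a, x_b / n_b
    z = NormalDist().inv_cdf((1 + c) / 2)
    # Standard error of the difference between two sample proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * se, diff + z * se


# Girls: 60 of 123; boys: 18 of 96; 99% confidence.
low, high = two_proportion_ci(x_a=60, n_a=123, x_b=18, n_b=96, c=0.99)
print(f"({low:.3f}, {high:.3f})")
```

Under these assumptions the interval is roughly 0.145 to 0.455; it lies entirely above zero, so the sample suggests a higher proportion among the girls.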


1) Confidence Intervals for Variance and Standard Deviation
2) Confidence Interval for Ratio of Variances
The Chi-Square Distribution
The point estimate for σ² is s², and the point estimate for σ is s. s² is the most unbiased estimate for σ².

You can use the chi-square distribution to construct a confidence interval for the variance and standard deviation.
If the random variable x has a normal distribution, then the distribution of

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

forms a chi-square distribution for samples of any size n > 1.
Properties of the chi-square (χ²) distribution are as follows:
1. All chi-square values χ² are greater than or equal to zero.
2. The chi-square distribution is a family of curves, each determined by the degrees of freedom. To form a confidence interval for σ², use the χ²-distribution with degrees of freedom equal to one less than the sample size.
3. The area under each curve of the chi-square distribution equals one.
4. Chi-square distributions are positively skewed.
Confidence Intervals for σ² and σ
A c-confidence interval for a population variance and standard deviation is as follows.

Confidence Interval for σ²:

$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$

Confidence Interval for σ:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$

The probability that the confidence intervals contain σ² or σ is c.


Critical Values for χ²

There are two critical values for each level of confidence. The value χ²_R represents the right-tail critical value and χ²_L represents the left-tail critical value.

[Figure: chi-square curve with area c between χ²_L and χ²_R and area (1 − c)/2 in each tail.]

The area between the left and right critical values is c.


Example 1: Critical Values for χ²

Find the critical values χ²_R and χ²_L for an 80% confidence interval when the sample size is 18.
Because the sample size is 18, there are d.f. = n – 1 = 18 – 1 = 17 degrees of freedom.

Area to the right of χ²_R = (1 − c)/2 = (1 − 0.8)/2 = 0.1

Area to the right of χ²_L = (1 + c)/2 = (1 + 0.8)/2 = 0.9

Use the chi-square distribution table to find the critical values.
Critical Values for χ²
Example continued:

Appendix B: Table 6: χ²-Distribution (α is the area to the right of the critical value)

Degrees of                              α
freedom     0.995    0.99     0.975    0.95     0.90     0.10     0.05
1           -        -        0.001    0.004    0.016    2.706    3.841
2           0.010    0.020    0.051    0.103    0.211    4.605    5.991
3           0.072    0.115    0.216    0.352    0.584    6.251    7.815
...
16          5.142    5.812    6.908    7.962    9.312    23.542   26.296
17          5.697    6.408    7.564    8.672    10.085   24.769   27.587
18          6.265    7.015    8.231    9.390    10.865   25.989   28.869

χ²_R = 24.769 and χ²_L = 10.085
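The table lookup can be sketched in Python using only the rows for 16 to 18 degrees of freedom shown in the table excerpt. The dictionary `CHI2_TABLE` and the function `chi2_critical_values` are illustrative names, and a real application would need the full table or a statistics library.

```python
# Excerpt of the chi-square table: CHI2_TABLE[df][area to the right].
ALPHAS = (0.995, 0.99, 0.975, 0.95, 0.90, 0.10, 0.05)
CHI2_TABLE = {
    16: dict(zip(ALPHAS, (5.142, 5.812, 6.908, 7.962, 9.312, 23.542, 26.296))),
    17: dict(zip(ALPHAS, (5.697, 6.408, 7.564, 8.672, 10.085, 24.769, 27.587))),
    18: dict(zip(ALPHAS, (6.265, 7.015, 8.231, 9.390, 10.865, 25.989, 28.869))),
}


def chi2_critical_values(n, c, table=CHI2_TABLE):
    """Return (chi2_L, chi2_R) for sample size n and confidence level c."""
    df = n - 1
    # Round to avoid floating-point noise such as 0.09999999999999998.
    area_right_of_R = round((1 - c) / 2, 3)
    area_right_of_L = round((1 + c) / 2, 3)
    return table[df][area_right_of_L], table[df][area_right_of_R]


# Sample size 18, 80% confidence.
chi2_L, chi2_R = chi2_critical_values(n=18, c=0.80)
print(chi2_L, chi2_R)
```

The rounding step matters because the areas are used as dictionary keys; without it the lookup for 0.10 would fail.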
Constructing a Confidence Interval
Example 2:
You randomly select and weigh 41 samples of 16-ounce bags of potato chips. The sample standard deviation is 0.05 ounce. Assuming the weights are normally distributed, construct a 90% confidence interval for the population standard deviation.

d.f. = n – 1 = 41 – 1 = 40 degrees of freedom

Area to the right of χ²_R = (1 − c)/2 = (1 − 0.9)/2 = 0.05

Area to the right of χ²_L = (1 + c)/2 = (1 + 0.9)/2 = 0.95

The critical values are χ²_R = 55.758 and χ²_L = 26.509.

[Figure: chi-square curve with area c = 0.9 between χ²_L = 26.509 and χ²_R = 55.758; the area to the right of χ²_L is (1 + 0.9)/2 = 0.95 and the area to the right of χ²_R is (1 − 0.9)/2 = 0.05.]

Left endpoint:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} = \sqrt{\frac{(41-1)(0.05)^2}{55.758}} \approx 0.04$$

Right endpoint:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_L}} = \sqrt{\frac{(41-1)(0.05)^2}{26.509}} \approx 0.06$$

0.04 < σ < 0.06
With 90% confidence we can say that the population standard deviation is between 0.04 and 0.06 ounces.
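The potato-chip interval can be checked with a short sketch. The function name `variance_ci` is hypothetical, and the critical values are passed in as read from the chi-square table, since the Python standard library has no chi-square quantile function.

```python
from math import sqrt


def variance_ci(n, s, chi2_R, chi2_L):
    """Confidence interval for the population variance, given table critical values."""
    numerator = (n - 1) * s ** 2
    # Dividing by the larger (right-tail) value gives the lower endpoint.
    return numerator / chi2_R, numerator / chi2_L


# 41 bags, s = 0.05 ounce, 90% confidence (d.f. = 40).
var_low, var_high = variance_ci(n=41, s=0.05, chi2_R=55.758, chi2_L=26.509)
sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(f"{var_low:.4f} < variance < {var_high:.4f}")
print(f"{sd_low:.2f} < sigma < {sd_high:.2f}")
```

Taking square roots of the variance endpoints gives the interval for σ, which rounds to 0.04 and 0.06 as in the worked solution.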
Tests for Comparing Two Population Variances

Objective: Test for the equality of variances (homogeneity assumption).

The ratio

$$\frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$$

has a probability distribution in repeated sampling which follows the F distribution.

The F distribution shape is defined by two parameters denoted the numerator degrees of freedom (ndf or df1) and the denominator degrees of freedom (ddf or df2).

[Figure: density curves of F(2,5) and F(5,5); horizontal axis from 0 to 10, vertical axis from 0.0 to 1.0.]
Inferences Regarding 2 Population Variances

• Goal: Compare variances between 2 populations
• Parameter: $\frac{\sigma_1^2}{\sigma_2^2}$ (Ratio is 1 when variances are equal)
• Estimator: $\frac{s_1^2}{s_2^2}$ (Ratio of sample variances)
• Distribution of (multiple of) estimator (Normal Data):

$$\frac{s_1^2 / s_2^2}{\sigma_1^2 / \sigma_2^2} = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} \sim F \quad \text{with } df_1 = n_1 - 1 \text{ and } df_2 = n_2 - 1$$

F-distribution with parameters df1 = n1 − 1 and df2 = n2 − 1


Properties of F-Distributions
• Take on positive density over the range (0, ∞)
• Cannot take on negative values
• Non-symmetric (skewed right)
• Indexed by two degrees of freedom: df1 (numerator df) and df2 (denominator df)
• Parameters of F-distribution:

$$\mu = \frac{df_2}{df_2 - 2} \quad (df_2 > 2) \qquad \sigma^2 = \frac{2\,df_2^2\,(df_1 + df_2 - 2)}{df_1\,(df_2 - 2)^2\,(df_2 - 4)} \quad (df_2 > 4)$$
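The mean and variance of an F-distribution can be evaluated directly from the two degrees of freedom. The function names `f_mean` and `f_variance` below are illustrative, and the example degrees of freedom (5 and 10) are chosen only for demonstration.

```python
def f_mean(df2):
    """Mean of an F-distribution; depends only on the denominator df."""
    if df2 <= 2:
        raise ValueError("the mean is defined only for df2 > 2")
    return df2 / (df2 - 2)


def f_variance(df1, df2):
    """Variance of an F-distribution with df1 and df2 degrees of freedom."""
    if df2 <= 4:
        raise ValueError("the variance is defined only for df2 > 4")
    return 2 * df2 ** 2 * (df1 + df2 - 2) / (df1 * (df2 - 2) ** 2 * (df2 - 4))


# Illustrative degrees of freedom, not taken from the slides.
mean_5_10 = f_mean(10)
var_5_10 = f_variance(5, 10)
print(mean_5_10, round(var_5_10, 4))
```

The mean is always greater than 1 and approaches 1 as df2 grows, which is consistent with the ratio of sample variances being centered near 1 when the population variances are equal.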
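Confidence Interval for Ratio of Variances

$$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{df_1, df_2, \alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2} \cdot F_{df_2, df_1, \alpha/2}$$

Here $F_{a, b, \alpha/2}$ is the critical value with area α/2 to its right.

Note: degrees of freedom have been swapped in the upper limit.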


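Confidence Interval for Ratio of Variances
Example
Background Samples: n1 = 7, ȳ1 = 2.48, s1 = 1.13
Study Site Samples: n2 = 7, ȳ2 = 4.82, s2 = 0.89

Example (95% CI): $F_{6, 6, 0.025} = 5.82$

$$\frac{1.13^2}{0.89^2} \cdot \frac{1}{F_{6, 6, 0.025}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1.13^2}{0.89^2} \cdot F_{6, 6, 0.025}$$

$$0.277 \le \frac{\sigma_1^2}{\sigma_2^2} \le 9.38$$

Note: the interval is not symmetric about the point estimate 1.13²/0.89² ≈ 1.61, so this is not a ± argument!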
Table 8 (F Table)
Note this table has three things to specify in order to get the critical value: the numerator df (df1), the denominator df (df2), and the probability level.
Highlighted table entries: 4.28 and 5.82. The 95% CI example uses $F_{6, 6, 0.025} = 5.82$.
Example
Among 11 patients in a certain study, the
standard deviation of the property of interest
was 5.8.  In another group of 4 patients, the
standard deviation was 3.4.  We wish to
construct a 95 percent confidence interval for
the ratio of the variances of these two
populations.

Solution
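Given
n1 = 11, s1² = (5.8)² = 33.64, α = .05
n2 = 4, s2² = (3.4)² = 11.56

Upper critical value with 10 and 3 degrees of freedom: $F_{10, 3, 0.025} = 14.42$

Lower critical value: $1 / F_{3, 10, 0.025} = 1/4.83 = .20704$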

Solution continued
Calculation of the 95% confidence interval for σ1²/σ2²
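The calculation can be sketched in Python from the given values. The function name `variance_ratio_ci` is hypothetical, and the F critical values are passed in exactly as quoted from the F table (5.82, 14.42 and 4.83), because the standard library has no F quantile function. The same helper is applied to the earlier background versus study-site example for comparison.

```python
def variance_ratio_ci(s1_sq, s2_sq, f_df1_df2, f_df2_df1):
    """Confidence interval for sigma1^2 / sigma2^2 from table F values.

    f_df1_df2: upper alpha/2 critical value with (df1, df2) degrees of freedom.
    f_df2_df1: upper alpha/2 critical value with the degrees of freedom swapped.
    """
    ratio = s1_sq / s2_sq
    return ratio / f_df1_df2, ratio * f_df2_df1


# Patient example: n1 = 11, s1 = 5.8; n2 = 4, s2 = 3.4; 95% confidence.
low_b, high_b = variance_ratio_ci(5.8 ** 2, 3.4 ** 2, f_df1_df2=14.42, f_df2_df1=4.83)

# Background vs. study-site example: n1 = n2 = 7, F(6, 6, 0.025) = 5.82.
low_a, high_a = variance_ratio_ci(1.13 ** 2, 0.89 ** 2, f_df1_df2=5.82, f_df2_df1=5.82)

print(f"patients: ({low_b:.4f}, {high_b:.4f})")
print(f"sites:    ({low_a:.3f}, {high_a:.3f})")
```

Both intervals contain 1, so under these assumptions neither sample gives evidence at the 5% level that the two population variances differ.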

       
