
Statistics and Probability – Grade 11

Alternative Delivery Mode


Quarter 3 – Module 4: Estimation of Parameters
First Edition, 2019

Republic Act 8293, section 176 states that: No copyright shall subsist in any
work of the Government of the Philippines. However, prior approval of the government
agency or office wherein the work is created shall be necessary for exploitation of such
work for profit. Such agency or office may, among other things, impose as a condition
the payment of royalties.
Borrowed materials (i.e., songs, poems, pictures, photos, brand names,
trademarks, etc.) included in this book are owned by their respective copyright holders.
Every effort has been exerted to locate and seek permission to use these materials
from their respective copyright owners. The publisher and authors do not represent
nor claim ownership over them.

Published by the Department of Education – Region X – Northern Mindanao


Regional Director: Dr. Arturo B. Bayocot, CESO III

Development Team of the Module

Author: Roxanne J. Montojo
Reviewers: Evangeline M. Pailmao
Emily A. Tabamo
Rufe A. Felicilda
Illustrator: Jay Michael A. Calipusan

Management Team
Chairperson: Dr. Arturo B. Bayocot, CESO III
Regional Director
Co-Chairperson: Dr. Victor G. De Gracia Jr., CESO V
Asst. Regional Director
Members: Mala Epra B. Magnaong, Chief ES, CLMD
Dr. Bienvenido U. Tagolimot, Jr., EPS-ADM, Regional ADM Coordinator
Neil A. Improgo, EPS-LRMS
Joel D. Potane, SEPS/LRMS Manager
Marino O. Dal, EPS, Math
Himaya B. Sinatao, EPS-LRMS
Printed in the Philippines by: Department of Education – Regional Office 10
Office Address: Zone 1, Upper Balulang, Cagayan de Oro City 9000
Telefax: (088) 880-7071, (088) 880-7072
E-mail Address: region10@deped.gov.ph

Statistics and Probability 11
Module 4: Estimation of Parameters

This instructional material was collaboratively developed and reviewed
by educators from public and private schools, colleges, and/or universities. We
encourage teachers and other education stakeholders to email their feedback,
comments, and recommendations to the Department of Education at
action@deped.gov.ph.

We value your feedback and recommendations.

Department of Education • Republic of the Philippines

Table of Contents

What I Need To Know
    Module Content
    Module Objectives
    General Instructions
What I Know
Lessons
    Lesson 1 – Random Sampling of the Median and the Mean
        What I Can Do
    Lesson 2 – Confidence Interval and the Central Limit Theorem
        What I Can Do
    Lesson 3 – Z-Distribution and T-Distribution
        What I Can Do
    Lesson 4 – Population Proportion
        What I Can Do
What I Have Learned
Assessment
References

What I Need To Know

In any statistical inference, the use of estimates to approximate the value of an
unknown population parameter is an important aspect.
Take the case of the mercury contamination of rivers and the water system as a
whole in Compostela Valley. To trace its extent, you need to estimate the
average mercury content found in the mining silt in a river. Suppose that a random
sample of 10 such sites resulted in a sample average of 90 mg of mercury
per liter of silt in the river. We may use this finding as an estimate of the average
mercury content for all such sites in the mining areas of Compostela Valley. This type of
estimate can help us analyze the risks that people face should they decide to get
water from the river, or even faucet water, which may also be contaminated with
mercury.

Module Content

This module contains examples and solutions, activities, and exercises
that can help you learn the basics of estimating parameters.
This module has four lessons:
 Lesson 1 Random Sampling of the Median and the Mean
 Lesson 2 Confidence Interval and the Central Limit Theorem
 Lesson 3 Z-Distribution & T-Distribution
 Lesson 4 Population Proportion

Module Objectives

Once you are done with this module, you should be able to:
 (M11/12SP-IIIf-2) illustrates point and interval estimations;
 (M11/12SP-IIIf-3) distinguishes between point and interval estimations;
 (M11/12SP-IIIf-4) identifies point estimator for the population mean;
 (M11/12SP-IIIf-5) computes for the point estimate of the population mean;
 (M11/12SP-IIIg-1) identifies the appropriate form of the confidence interval estimator
for the population mean when: (a) the population variance is known, (b) the
population variance is unknown, and (c) the Central Limit Theorem is to be used;
 (M11/12SP-IIIg-2) illustrates the t-distribution;
 (M11/12SP-IIIg-3) constructs a t-distribution;

 (M11/12SP-IIIg-4) identifies regions under the t-distribution corresponding to different
t-values;
 (M11/12SP-IIIg-5) identifies percentiles using the t-table;
 (M11/12SP-IIIh-1) computes for the confidence interval estimate based on the
appropriate form of the estimator for the population mean;
 (M11/12SP-IIIh-2) solves problems involving confidence interval estimation of the
population mean;
 (M11/12SP-IIIh-3) draws conclusion about the population mean based on its
confidence interval estimate;
 (M11/12SP-IIIi-1) identifies point estimator for the population proportion;
 (M11/12SP-IIIi-2) computes for the point estimate of the population proportion;
 (M11/12SP-IIIi-3) identifies the appropriate form of the confidence interval estimator
for the population proportion based on the Central Limit Theorem;
 (M11/12SP-IIIi-4) computes for the confidence interval estimate of the population
proportion;
 (M11/12SP-IIIi-5) solves problems involving confidence interval estimation of the
population proportion;
 (M11/12SP-IIIi-6) draws conclusion about the population proportion based on its
confidence interval estimate;
 (M11/12SP-IIIj-1) identifies the length of a confidence interval;
 (M11/12SP-IIIj-2) computes the length of a confidence interval;
 (M11/12SP-IIIj-3) computes for an appropriate sample size using the length of the
interval; and
 (M11/12SP-IIIj-4) solves problems involving sample size determination.

General Instructions

To achieve the objectives of this module, do the following:


 Take your time to read the lesson explanations carefully.
 Solve the sample problems given in each topic on your own as guided by the
given solution.
 Answer all the given exercises and activities.
 Familiarize yourself with the given terms on the definition box at the beginning
of each topic.

What I Know

I. Identification.
______________1. The process of making inferences about a population
based on information obtained from a sample.
______________2. It states that the sample mean x̄ approximately follows
the normal distribution with mean μ and standard
deviation $\sigma/\sqrt{n}$.
______________3. A range of values used to estimate the parameter. It
can be calculated using two numbers or values
which may or may not contain the value of the
parameter being estimated.
______________4. This refers to the number of independent observations
in the set of data, or the number of variables that
are free to vary.
______________5. This distribution is ideally used when n < 30 and the
standard deviation or variance of the entire
population is unknown, or that the only standard
deviation given is from the sample.
______________6. It represents a part of a whole and can be expressed
as a percentage, decimal or fraction.
______________7. A single value used to approximate a population
parameter.
______________8. The interval defined within the true population where
members of the sample are expected to be found.
______________9. It quantifies the probability that a member of the
sample would fall within a known interval of the true
population. If α is the allowable sampling error, the
confidence level is equal to 1 – α.
_____________10. This distribution is ideally used when n ≥ 30 and the
standard deviation or the variance of the entire
population is given.
II. Determine the standard error of the mean, the margin of error, and
the confidence interval. Assume that all data are normally
distributed.
1. In a survey, male and female student respondents are asked if they
prefer to go to college or not. Find the 99% confidence interval of the
difference in the two proportions as shown in the table.

Student     Will go to college     Will not go to college
Male        100                    150
Female      125                    75

Key to answer on page 36

Lesson 1 – Random Sampling of the Median and the Mean

Learning Concepts

The learner demonstrates understanding of key concepts of estimation of
population mean and population proportion and is able to estimate the population
mean and population proportion to make sound inferences in real-life problems in
different disciplines.

What is it

DEFINITION 4.1

Parameter Estimation – the process of making inferences about a population based
on the information or value obtained from a sample describing a characteristic of the
population. For example, consider the following set of data representing the number
of errors made by a secretary on 10 different pages of a document: 1, 0, 1, 2, 3, 1, 1,
4, 0, and 2. Let us assume that the document contains exactly 10 pages so that the
data constitute a small finite population. A quick study of this population leads to a
number of conclusions. The population mean of the typing errors mentioned above is
μ = 15/10 = 1.5. Note that a parameter is a constant value describing a
population.

Point Estimate – the numerical value which gives an estimate of a parameter. For
example, the sample mean x̄ is a point estimate of the population mean μ.

Interval Estimator – is a formula that tells us how to use sample data to calculate
an interval that estimates a population parameter.

Example 1
Consider the table below (sample means are rounded to the nearest whole number):

Random Sample (n = 3)     Sample Mean, x̄     Sample Median
87  88  90                88                 88
87  88  92                89                 88
87  88  95                90                 88
87  90  92                90                 90
87  90  95                91                 90
87  92  95                91                 92
88  90  92                90                 90
88  90  95                91                 90
88  92  95                92                 92
90  92  95                92                 92

Looking at column 2, the sample means 88 and 89 each appeared only once. Thus their
probabilities are each 1/10 or 0.10. The mean 92 appeared twice, so its probability is
2/10 or 0.20, while the means 90 and 91 each appeared three times, so their
probabilities are each 3/10 or 0.30. Hence, we obtain the following values:

Random Sampling of the Sample Mean

Sample Mean (x̄)     P(x̄)
88                  0.1
89                  0.1
90                  0.3
91                  0.3
92                  0.2

[Figure: Probability Histogram of the Sample Mean. Horizontal axis: sample mean (88 to 92); vertical axis: probability.]
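The tally of sample means and medians can be checked by listing every sample of size 3 and counting how often each value occurs. The sketch below assumes the population is the five values that appear in the samples of Example 1 (87, 88, 90, 92, 95) and that each statistic is rounded to the nearest whole number; the name `sampling_distribution` is only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from statistics import mean, median

# Assumed population: the five distinct values that appear in the samples of Example 1.
POPULATION = [87, 88, 90, 92, 95]


def sampling_distribution(statistic, population=POPULATION, n=3):
    """Return {value: probability} for a statistic over all samples of size n."""
    samples = list(combinations(population, n))
    # Round each statistic to the nearest whole number, as the table does.
    counts = Counter(round(statistic(sample)) for sample in samples)
    return {value: Fraction(count, len(samples)) for value, count in sorted(counts.items())}


mean_dist = sampling_distribution(mean)
median_dist = sampling_distribution(median)

for value, prob in mean_dist.items():
    print(f"mean {value}: {float(prob):.1f}")
for value, prob in median_dist.items():
    print(f"median {value}: {float(prob):.1f}")
```

Exact fractions are used so that the probabilities can be checked to add up to 1 without rounding error.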

Example 2
Random Sampling of the Sample Median
Looking at the third column for the sample median, we see that both 88 and 92
appeared thrice while 90 appeared four times. Thus their probabilities are P(88) = 0.3,
P(92) = 0.3 and P(90) = 0.4. We then obtain the following table:

Sample Median (x)     P(x)
88                    0.3
90                    0.4
92                    0.3

[Figure: Probability Histogram of the Sample Median. Horizontal axis: sample median (88, 90, 92); vertical axis: probability.]

Example 3
Estimate the mean consumption of 8 families in one month if their expenses are
Php13,300; Php14,800; Php18,800; Php17,900; Php23,500; Php24,700; Php22,000;
and Php29,000.
Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{13{,}300 + 14{,}800 + 18{,}800 + 17{,}900 + 23{,}500 + 24{,}700 + 22{,}000 + 29{,}000}{8} = \frac{164{,}000}{8} = 20{,}500$$
x̄ = Php20,500 is the point estimate of the mean μ.
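As a quick check of the arithmetic in Example 3, the point estimate can be computed with Python's standard library. This is only a sketch; the variable names are illustrative.

```python
from statistics import mean

# Monthly expenses (in pesos) of the 8 families in Example 3.
expenses = [13_300, 14_800, 18_800, 17_900, 23_500, 24_700, 22_000, 29_000]

# The sample mean serves as the point estimate of the population mean.
point_estimate = mean(expenses)
print(f"Point estimate: Php{point_estimate:,.0f}")
```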

DEFINITION 4.2
Interval Estimation gives us a range of values which is likely to contain the population
parameter. It can be determined by two values.

Example 4
The following are examples of interval estimation:
1. The average family expense in Region X is Php250 to Php400 a day.
2. The average life span of stage 4 breast cancer patients is 3 ± 5 years.
3. The average score of students in the General Mathematics exam is 75 < μ < 84.

What I Can Do

1. Estimate the mean weight of 6 students randomly chosen in a university if
student 1 weighs 53 kg; student 2 weighs 64 kg; student 3 weighs 49 kg;
student 4 weighs 59 kg; student 5 weighs 62 kg and student 6 weighs
55 kg.

Key to answer on page 36

Lesson 2 – Confidence Interval and the Central Limit Theorem

Learning Concept

The learner demonstrates understanding of key concepts of estimation of
population mean and population proportion and is able to estimate the population
mean and population proportion to make sound inferences in real-life problems in
different disciplines.

What’s New
DEFINITION 4.3

Confidence level – expressed as a percent, it tells how often the estimation method
would capture the true population value if the sampling were repeated many times.
Example: An article on a certain survey group reported that 38% of likely Philippine
voters are hopeful that their health insurance coverage will change because of the
Universal Health Bill. Further down the article, we see that the margin of sampling
error is ±2.9 percentage points with a 95% level of confidence. It is impractical to
survey 100 million Filipinos, so we cannot know exactly how many people would
actually respond “yes, my health insurance coverage will change.” Instead, we take a
sample of 2,000 Filipinos and, using good statistical techniques like simple random
sampling, take our “best guess” at what the actual figure is. A 95% confidence level
says that if the poll or survey were repeated over and over again, the resulting
intervals would contain the actual population figure about 95% of the time.

Confidence Interval – the range obtained by applying the margin of error, here ±2.9
percentage points (stated as plus or minus 2.9), to the estimate. When the
interval and confidence level are put together, you get a spread of percentages. In this
case, you would expect the results to be between 35.1% (38% – 2.9) and 40.9% (38% + 2.9),
95% of the time.

Illustration 1

The graph shows that 95% of the distribution is contained in the
confidence interval. The confidence level is 1 – α with α = 5%, which means
that there is a probability of 95% that the result is reliable. Each tail of the curve
has an area of 2.50%, and the areas on each side of the middle are 47.5% each.

From the table of the normal curve, the value of $z_{\alpha/2}$ at A = 0.475 is ±1.96.

Illustration 2

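Using the same method to derive the z-score at a confidence level of 99%, we
get that $z_{\alpha/2} = \pm 2.576$.

[Figure: normal curve for α = 1% (99% confidence level). The areas on each side of the mean are 0.495 (49.5%) each, the tails are 0.50% each, and the cut-offs are z = -2.576 and z = 2.576.]
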
Summary of z-scores for Commonly Used Confidence Levels

Allowable Error (α)     Confidence Level (1 – α)     z-value
10%                     90%                          ±1.645
5%                      95%                          ±1.960
1%                      99%                          ±2.576
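The z-values in this summary can be reproduced from the standard normal distribution instead of being read from a printed table. The sketch below uses `statistics.NormalDist` from Python's standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Return the positive z-value that leaves alpha/2 in each tail."""
    alpha = 1 - confidence_level
    # inv_cdf gives the z-score below which the stated area lies.
    return NormalDist().inv_cdf(1 - alpha / 2)


z_values = {level: round(z_critical(level), 3) for level in (0.90, 0.95, 0.99)}
for level, z in z_values.items():
    print(f"{level:.0%} confidence level: z = ±{z}")
```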

CENTRAL LIMIT THEOREM

 The value of z has been derived using the Central Limit Theorem:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \quad \text{or} \quad z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
 The Central Limit Theorem states that the sample mean x̄
approximately follows the normal distribution with mean μ and standard
deviation $\sigma/\sqrt{n}$.
 If the population follows a normal distribution, then the sample size n
can be either small or large.
 If the population from where the sample is taken follows a non-normal
distribution, then the sample size n has to be large (usually n ≥ 30).
 The Central Limit Theorem also states that even if a population
distribution is strongly non-normal, its sampling distribution of means
will be approximately normal for large sample sizes (over 30). The
theorem makes it possible to use probabilities associated with the
normal curve to answer questions about the means of sufficiently large
samples.
 The (1 – α)100% confidence interval for the population mean derived
from the Central Limit Theorem is as stated below:
$$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

The formula is explained in detail below:

The confidence interval of the population mean with a given confidence
level of (1 – α)100% and when the population variance is unknown is:
$$\left(\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)\right)$$
where x̄ = sample mean
σ = sample standard deviation or square root of the
sample variance
$\sigma/\sqrt{n}$ = standard error of the mean
$z_{\alpha/2}\left(\sigma/\sqrt{n}\right)$ = margin of error

NOTE: The margin of error depends on the confidence level.

Example 1:
An operations manager plans to select 300 female employees from a group of
tenured workers (where height was considered due to the nature of their task). The
selected group has an average height of 170 cm and a sample standard deviation of
25 cm. What is the 95% confidence interval for the mean height of all the employees?

Solution:
Given:
Sample size, n = 300
Sample mean, x̄ = 170 cm
Sample standard deviation, σ = 25 cm
Confidence level = 95%; α = 5% and $z_{\alpha/2} = 1.96$

We substitute the given values into the formula for the confidence interval.

$$\left(\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)\right)$$
$$= \left(170 - 1.96\left(\frac{25}{\sqrt{300}}\right),\ 170 + 1.96\left(\frac{25}{\sqrt{300}}\right)\right)$$
= (167.17, 172.83)
≈ (167, 173)
Hence the operations manager is 95% confident that the employees have a
mean height of 167 to 173 cm.

Example 2:

A random sample of 40 residents of Quezon City has an average electrical
consumption of 29 kWh/mo. with a sample standard deviation of 8 kWh. Give the
90% confidence interval for the mean usage of electricity per month.

Solution:

Given: Sample size, n = 40
Sample mean, x̄ = 29
Sample standard deviation, σ = 8
Confidence level = 90%; α = 10% and $z_{\alpha/2} = 1.645$

Substituting the given values above, we get

$$\left(\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)\right)$$
$$= \left(29 - 1.645\left(\frac{8}{\sqrt{40}}\right),\ 29 + 1.645\left(\frac{8}{\sqrt{40}}\right)\right)$$
= (26.92, 31.08)
≈ (26.9, 31.1)

Thus, the confidence interval is 26.9 to 31.1 kWh per month.

Example 3. A sample of size n = 100 produced the sample mean x̄ = 16.
Assuming the population standard deviation σ = 3, compute a 95% confidence
interval for the population mean μ.
Solution:

A 95% confidence interval for μ is
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where x̄ = sample mean
σ = population standard deviation
$\sigma/\sqrt{n}$ = standard error of the mean
$z_{\alpha/2}\left(\sigma/\sqrt{n}\right)$ = margin of error

$$16 \pm (1.96)\frac{3}{\sqrt{100}} = 16 \pm 0.588 = [15.412,\ 16.588]$$
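The three worked examples all follow the same steps: compute the standard error, multiply by the z-value to get the margin of error, then add and subtract it from the sample mean. A minimal sketch of those steps is shown below. The function name `z_confidence_interval` is illustrative, and the z-values are taken from the summary table of z-scores.

```python
from math import sqrt


def z_confidence_interval(sample_mean, sd, n, z):
    """Return (lower, upper) limits of the confidence interval for the mean."""
    standard_error = sd / sqrt(n)   # sigma / sqrt(n)
    margin = z * standard_error     # margin of error
    return sample_mean - margin, sample_mean + margin


# Example 1: n = 300, mean 170 cm, sd 25 cm, 95% confidence level
example_1 = z_confidence_interval(170, 25, 300, z=1.96)
# Example 2: n = 40, mean 29 kWh, sd 8 kWh, 90% confidence level
example_2 = z_confidence_interval(29, 8, 40, z=1.645)
# Example 3: n = 100, mean 16, sd 3, 95% confidence level
example_3 = z_confidence_interval(16, 3, 100, z=1.96)

for name, (lower, upper) in [("Example 1", example_1),
                             ("Example 2", example_2),
                             ("Example 3", example_3)]:
    print(f"{name}: ({lower:.3f}, {upper:.3f})")
```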

What I Can Do

I. Compute the following.

1. A random sample is drawn from a population of known standard deviation
11.3. Construct a 90% confidence interval for the population mean based on
the information given:
a. n = 36, x̄ = 105.2
b. n = 100, x̄ = 105.2
2. A random sample is drawn from a population of unknown standard deviation.
Construct a 99% confidence interval for the population mean based on the
information given:
a. n = 49, x̄ = 17.1, σ = 2.1
b. n = 169, x̄ = 17.1, σ = 2.1
3. A random sample of size 144 is drawn from a population whose distribution,
mean, and standard deviation are all unknown. The summary statistics are
x̄ = 58.2 and σ = 2.6. Construct a 90% confidence interval for the population
mean μ.

II. Solve these problems.


1. A government agency was charged by the legislature with estimating the
length of time it takes citizens to fill out various forms. Two hundred
randomly selected adults were timed as they filled out a particular form.
The times required had mean 12.8 minutes with standard deviation 1.7
minutes. Construct a 90% confidence interval for the mean time taken for
all adults to fill out this form.

2. A sample of 250 workers aged 16 and older produced an average length
of time with the current employer of 4.4 years with standard deviation of
3.8 years. Construct a 99.9% confidence interval for mean job tenure of all
workers aged 16 or older.

3. A corporation that owns apartment complexes wishes to estimate the
average length of time residents remain in the same apartment before
moving out. A sample of 150 rental contracts gave a mean length of
occupancy of 3.7 years with a standard deviation of 1.2 years. Construct a
95% confidence interval for the mean length of occupancy of apartments
owned by this corporation.

4. In order to estimate the mean amount of damage sustained by vehicles
when a cow is struck, an insurance company examined the records of 50
such occurrences, and obtained a sample mean of Php2,785 with
standard deviation of Php221. Construct a 95% confidence interval for the
mean amount of damage in all such accidents.

Key to answer on page 36

Lesson 3 – Z-Distribution and T-Distribution

Learning Concept

The learner demonstrates understanding of key concepts of estimation of
population mean and population proportion and be able to estimate the population
mean and population proportion to make sound inferences in real-life problems in
different disciplines.

What’s In

The T-distribution (also called Student’s T Distribution) is a family of
distributions that look almost identical to the normal distribution curve, only a bit shorter
and fatter. The T-distribution is used instead of the normal distribution when you have
small samples. The larger the sample size, the more the t-distribution looks like the
normal distribution. In this lesson, we will study another form of distribution that can
be used if situational problems do not allow us to use the standard normal distribution.

DEFINITION 4.4

The t-distribution – is the probability distribution that estimates the population
parameters when the sample size is small and the population standard deviation is
unknown.

Degree of freedom – refers to the number of independent observations in the set of
data, or the number of variables that are free to vary.

The formula for the t-value is $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$. A similar formula is
used for the z-score, except that here the sample size is less than 30 and the sample
standard deviation s takes the place of σ.

Properties of T-Distribution

1. The shape of the curve is bell-shaped and symmetrical with mean zero.
2. The t-distribution ranges from −∞ to ∞ (infinity).
3. The shape of the distribution changes with the change in the degrees of
freedom.
4. The variance is always greater than one and can be defined only when the
degrees of freedom v ≥ 3 and is given as: Var(t) = v / (v – 2).
5. It is less peaked at the center and has heavier tails than the standard normal
curve.
6. The t-distribution has a greater dispersion than the standard normal
distribution. As the sample size n increases, it approaches the normal
distribution. Here the sample size is said to be large when n ≥ 30.

Comparison Between the t-distribution and the z-distribution
(or normal distribution)

T-distribution
1. Ideally used when n < 30 and the standard deviation or the variance of the
entire population is unknown, or that the only standard deviation given is from
the sample.
2. The distribution has a graph that is bell shaped and symmetrical about the
mean. It is more variable since t-values depend on the fluctuations of the mean
and standard deviation.
3. The degree of freedom df is equal to (n – 1) if the mean and standard deviation
are computed from samples of size n. The values of t are said to belong to a
t-distribution with df = n – 1. The bell curve of the t-distribution approaches the
standard normal curve as n becomes bigger.

Z-distribution
1. Ideally used when n ≥ 30 and the standard deviation or the variance of the
entire population is given.
2. The distribution has a graph that is bell shaped and symmetrical about the
mean. Z-values only depend on the fluctuation of the mean from sample to
sample.

Both can be used for determining the confidence interval of the population mean
and the confidence interval of the difference between two means.

Values of the t-distribution (two-tailed)
(A = confidence level; P = α, the total area in both tails)

DF A 0.80 0.90 0.95 0.98 0.99 0.995 0.998 0.999
P 0.20 0.10 0.05 0.02 0.01 0.005 0.002 0.001
1 3.078 6.314 12.706 31.820 63.657 127.321 318.309 636.619
2 1.886 2.920 4.303 6.965 9.925 14.089 22.327 31.599
3 1.638 2.353 3.182 4.541 5.841 7.453 10.215 12.924
4 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 1.397 1.860 2.306 2.897 3.355 3.833 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 1.345 1.761 2.145 2.625 2.977 3.326 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 1.337 1.746 2.120 2.584 2.921 3.252 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.090 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646

31 1.309 1.695 2.040 2.453 2.744 3.022 3.375 3.633
32 1.309 1.694 2.037 2.449 2.738 3.015 3.365 3.622
33 1.308 1.692 2.035 2.445 2.733 3.008 3.356 3.611
34 1.307 1.691 2.032 2.441 2.728 3.002 3.348 3.601
35 1.306 1.690 2.030 2.438 2.724 2.996 3.340 3.591
36 1.306 1.688 2.028 2.434 2.719 2.991 3.333 3.582
37 1.305 1.687 2.026 2.431 2.715 2.985 3.326 3.574
38 1.304 1.686 2.024 2.429 2.712 2.980 3.319 3.566
39 1.304 1.685 2.023 2.426 2.708 2.976 3.313 3.558
40 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
42 1.302 1.682 2.018 2.418 2.698 2.963 3.296 3.538
44 1.301 1.680 2.015 2.414 2.692 2.956 3.286 3.526
46 1.300 1.679 2.013 2.410 2.687 2.949 3.277 3.515
48 1.299 1.677 2.011 2.407 2.682 2.943 3.269 3.505
50 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
70 1.294 1.667 1.994 2.381 2.648 2.899 3.211 3.435
80 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
90 1.291 1.662 1.987 2.369 2.632 2.878 3.183 3.402
100 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.391
120 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
150 1.287 1.655 1.976 2.351 2.609 2.849 3.145 3.357
200 1.286 1.652 1.972 2.345 2.601 2.839 3.131 3.340
300 1.284 1.650 1.968 2.339 2.592 2.828 3.118 3.323
500 1.283 1.648 1.965 2.334 2.586 2.820 3.107 3.310
∞ 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291

How to Use the T-distribution Table of Values

1. Determine alpha (α), the probability that the population parameter is not in
the confidence interval.
2. Identify which of the two tests must be used (two-tailed or one-tailed).

[Figure: graphs of a two-tailed and a one-tailed t-distribution, each at a confidence level of 90% (α = 10%) and centered at 0. Two-tailed: α/2 = 5% in each tail. One-tailed: 10% in a single tail.]

3. Compute the degrees of freedom, df = n – 1.
4. Using the t-table, determine the t-value using the row of the desired df and
the column of the allowed error.

Example 1:
What is the t-value of a one-tailed t-distribution with α = 5% and df = 15?
Solution:
From the given table, we find that the t-value is 1.753. Since the table above is
two-tailed, a one-tailed α of 0.05 corresponds to the column P = 0.10.

α for one-tailed test     0.05      0.025     0.01
df
:                         :         :         :
15                        1.753     :         :
:                         :         :         :
Example 2:

What is the 95% confidence interval of the scores of the Philippine Rugby Team
(nicknamed Tamaraws) players if the mean of the 12 randomly selected
players is 9.5 points per game with a standard deviation of 3.5 points?

Solution:

We will use the t-distribution since the sample size n is less than 30, where
n = 12, the sample mean is x̄ = 9.5, the sample standard deviation is s = 3.5, and
the confidence level is 95%.

From the t-table, look for df = 12 – 1 = 11. Find the value under the two-tailed
test since the deviations in the values of the data could possibly come from both
ends of the distribution. The confidence level is 95%, giving us α = 5%.

DF A 0.90 0.95
P 0.10 0.05
1 6.314 12.706
2 2.920 4.303
3 2.353 3.182
4 2.132 2.776
5 2.015 2.571
6 1.943 2.447
7 1.895 2.365
8 1.860 2.306
9 1.833 2.262
10 1.812 2.228
11 1.796 2.201

Hence, $t_{\alpha/2} = 2.201$. With the sample standard deviation s = 3.5:

$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = 9.5 \pm 2.201\left(\frac{3.5}{\sqrt{12}}\right)$$
= 9.5 ± 2.201(1.01)
= 9.5 ± 2.22
Therefore the confidence interval is (7.28, 11.72).
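The same computation can be sketched in code. Python's standard library has no t-distribution, so the sketch below simply stores a few two-tailed 95% critical values copied from the t-table of this lesson; the names `T_95_TWO_TAILED` and `t_confidence_interval` are illustrative assumptions. With the inputs of Example 2, the standard error 3.5/√12 is about 1.01 and the margin of error is about 2.22.

```python
from math import sqrt

# Two-tailed critical values at the 95% confidence level (P = 0.05),
# copied from the t-table in this lesson and keyed by degrees of freedom.
T_95_TWO_TAILED = {9: 2.262, 10: 2.228, 11: 2.201, 12: 2.179}


def t_confidence_interval(sample_mean, s, n, t_table=T_95_TWO_TAILED):
    """Return (lower, upper) limits using the t-value for df = n - 1."""
    t_value = t_table[n - 1]        # look up the row df = n - 1
    margin = t_value * s / sqrt(n)  # t times the standard error
    return sample_mean - margin, sample_mean + margin


lower, upper = t_confidence_interval(9.5, 3.5, 12)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```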

Example 3:
The mean sales and standard deviation of procured samples of two TV
brands in a certain appliance store are summarized below:

TV brand     Mean sales (in thousands)     Standard deviation (in thousands)     Number of selling days
A            x̄₁ = 10.4                     σ₁ = 2.5                              n₁ = 6
B            x̄₂ = 11.8                     σ₂ = 3.6                              n₂ = 6

1. Determine the 90% confidence interval for the difference between the two
mean sales of the two TV brands?
2. Is it safe to conclude that there is no significant difference in the mean
sales of the two TV brands?

Solution:

1. To determine the 90% confidence interval, we substitute the given
values x̄₁ = 10.4; x̄₂ = 11.8; σ₁ = 2.5; σ₂ = 3.6; n₁ = 6; n₂ = 6; and the
90% confidence level (where α = 10% or α/2 = 5%) into the formula.

Note: Use df = 10 since df = n₁ + n₂ – 2

Formula:
$$|\bar{x}_1 - \bar{x}_2| \pm t_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$= |10.4 - 11.8| \pm 1.812\sqrt{\frac{2.5^2}{6} + \frac{3.6^2}{6}}$$
= 1.4 ± 1.812(1.79)
= 1.4 ± 3.24

DF A 0.80 0.90
P 0.20 0.10
1 3.078 6.314
2 1.886 2.920
3 1.638 2.353
4 1.533 2.132
5 1.476 2.015
6 1.440 1.943
7 1.415 1.895
8 1.397 1.860
9 1.383 1.833
10 1.372 1.812

Thus the confidence interval of the difference between the mean sales of the
two TV brands is (-1.84, 4.64), or, since the sales are given in thousands, the
confidence interval is (-1,840, 4,640).

Since the confidence interval ranges from a negative to a positive value
and zero lies in between, it is safe to conclude that there is no
significant difference in the mean sales of the two TV brands.
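The computation in Example 3 can be sketched as follows. The function name `difference_interval` is illustrative, the t-value 1.812 is the table entry for df = 10 at a 90% confidence level, and the interval is built around the absolute difference of the means, as in the formula of the solution.

```python
from math import sqrt


def difference_interval(mean_1, sd_1, n_1, mean_2, sd_2, n_2, t_value):
    """Confidence interval for the difference between two sample means."""
    standard_error = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    margin = t_value * standard_error
    difference = abs(mean_1 - mean_2)
    return difference - margin, difference + margin


# TV brand A: mean 10.4, sd 2.5, n = 6; TV brand B: mean 11.8, sd 3.6, n = 6
lower, upper = difference_interval(10.4, 2.5, 6, 11.8, 3.6, 6, t_value=1.812)

# If zero lies inside the interval, no significant difference is concluded.
contains_zero = lower < 0 < upper
print(f"({lower:.2f}, {upper:.2f}), contains zero: {contains_zero}")
```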

Example 4:

What percentage of the t-distribution lies between the mean and the t-value of the
13.6 million peso average production profit of the 25 lines carried by a certain
product brand with a standard deviation of 1.56 million pesos, if the average
production profit of all the brand’s lines is 12.9 million pesos? Assume that the data
follow a t-distribution.

Solution:

From the problem, the given are
x̄ = 13.6, s = 1.56, n = 25, and μ = 12.9.
From the Central Limit Theorem:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{13.6 - 12.9}{1.56 / \sqrt{25}} = \frac{0.7}{0.312} = 2.244$$

The degree of freedom is df = n – 1 = 25 – 1 = 24.

From the t-distribution table at df = 24, the value t = 2.244 is between 2.064
and 2.492.

DF A 0.95 0.98
P 0.05 0.02
10 2.228 2.764

11 2.201 2.718
12 2.179 2.681
13 2.160 2.650
14 2.145 2.625
15 2.131 2.602
16 2.120 2.584
17 2.110 2.567
18 2.101 2.552
19 2.093 2.539
20 2.086 2.528
21 2.080 2.518
22 2.074 2.508
23 2.069 2.500
24 2.064 2.492

By Interpolation:
(Here α is the area in one tail, which is half of the two-tailed P in the table.)

α          t-value
0.025      2.064
α_x        2.244
0.01       2.492

$$\frac{\alpha_x - 0.025}{0.01 - 0.025} = \frac{2.244 - 2.064}{2.492 - 2.064}$$
$$\alpha_x - 0.025 = \frac{0.18}{0.428}(-0.015)$$
$$\alpha_x = 0.025 - 0.006308$$
$$\alpha_x = 0.0187$$
To determine the area from the mean, subtract α_x from 0.5. That is
Area = 0.5 – α_x
= 0.5 – 0.0187
= 0.4813 or 48.13%
Hence, the area from t = 0 to t = 2.244 is 48.13%.
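The interpolation step can be sketched as a small function. It performs plain linear interpolation between the two neighboring table entries (one-tail areas 0.025 and 0.01 at df = 24), so the result is an approximation of the true tail area; the function name is illustrative. With these inputs the one-tail area comes out to about 0.0187, which gives an area from the mean of about 0.4813.

```python
def interpolate_tail_area(t, t_low, area_low, t_high, area_high):
    """Linearly interpolate the one-tail area for a t-value between two table entries."""
    fraction = (t - t_low) / (t_high - t_low)
    return area_low + fraction * (area_high - area_low)


# Table entries at df = 24: t = 2.064 (one-tail 0.025) and t = 2.492 (one-tail 0.01)
alpha_x = interpolate_tail_area(2.244, 2.064, 0.025, 2.492, 0.01)

# Area between the mean (t = 0) and t = 2.244
area_from_mean = 0.5 - alpha_x
print(f"alpha_x = {alpha_x:.4f}, area = {area_from_mean:.4f}")
```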

What I Can Do

Solve the following problems:

1. Find the t-value to the left of the mean for α = 1% and n = 11.
2. Construct a 99% confidence interval for a population mean if we have a
sample mean of 67.5, a sample standard deviation of 8.3, and n = 7.
3. What is the t-value of the t-distribution with α = 1% and n = 18 under the
one-tailed test?
4. What is the area under the t-distribution from t = -1.771 to t = 2.160 at df =
13?
5. A study is conducted to compare the performance of students with more than one personal electronic gadget and those with only one. A number of them were taken as subjects for the study. The mean grades of these students and their standard deviations are given below:

Students                   Mean                 Standard deviation   Sample size
w/ one gadget              $\bar{x}_1 = 83$     $s_1 = 12$           $n_1 = 7$
w/ more than one gadget    $\bar{x}_2 = 79$     $s_2 = 13$           $n_2 = 5$

Is it possible to conclude that there is no significant difference in the mean grades of the two types of students at 95% confidence level?

Hint: Use the confidence interval formula $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where $df = n_1 + n_2 - 2$.

Key to answer on page 36

Lesson 4
Population Proportion

Learning Concepts

The learner demonstrates understanding of key concepts of estimation of population mean and population proportion and is able to estimate the population mean and population proportion to make sound inferences in real-life problems in different disciplines.

What is it

We have learned that the z-distribution is used when n ≥ 30 (or if we have a large sample size) and when the population standard deviation or population variance is given. The t-distribution is used when n < 30 and the only standard deviation given is from a sample. In this lesson, you will learn how to estimate the proportion of a population, which may be expressed as a percentage, decimal, or fraction, and how to determine the sample size n that limits the margin of error and gives more accurate results.

Point Estimate – is a single value used to approximate a population parameter. The sample proportion, denoted by $\hat{p}$, is the best point estimate of the population proportion (p).

There are research experiments which need an estimation of the proportion of the parameter or a confidence interval for the population proportion. Below are some examples:

1. Proportion of customers who are satisfied with the services rendered by a restaurant.
2. Proportion of Fil-Am players in the Philippine Rugby Team.
3. Proportion of registered voters who will likely vote in favor of a female candidate.
4. Proportion of college scholars who get a job related to their field of discipline.

In this lesson, we denote p as the population proportion, q as the proportion of “not p”, $\hat{p}$ as the sample proportion (the estimate of p), and $\hat{q}$ as the estimated proportion of “not $\hat{p}$”.

The formulas for $\hat{p}$ and $\hat{q}$ are as follows:

$$\hat{p} = \frac{x}{n}$$

$$\hat{q} = 1 - \hat{p} = 1 - \frac{x}{n} = \frac{n - x}{n}$$

where x is the number of successes in n trials.

Remember the following in the sampling distribution of $\hat{p}$:

• $\hat{p}$ is the sample proportion with x successes in n trials.
• $\hat{p}$ is the best point estimate of p.
• If np and nq are both greater than or equal to 5, then $\hat{p}$ will have an approximately normal distribution. From the Central Limit Theorem, $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
• In the normal approximation to a binomial distribution, $\mu = np$ and $\sigma = \sqrt{npq}$.
• If a sample is not representative of the population, then $\hat{p}$ will not be a useful estimate of p. Instead, use the sampling techniques discussed in the previous lessons.

Example 1
If 350 students from a batch of graduates were surveyed and 30 of them answered that they finished BS Industrial Engineering (BS IE), what is the estimated proportion of those who took up BS IE out of the whole batch?

Solution:
Let $\hat{p}$ = sample proportion of BS IE graduates
x = 30 (number of BS IE graduates)
n = 350 (total number of surveyed graduates)

$$\hat{p} = \frac{x}{n} = \frac{30}{350} = 0.086 \text{ or } 8.6\%$$

Example 2
From the example given above, what is the estimated proportion of graduates
who didn’t take up BS IE?

Solution:
Let $\hat{q}$ = sample proportion of non-BS IE graduates.

$$\hat{q} = 1 - \hat{p} = 1 - 0.086 = 0.914 \text{ or } 91.4\%$$
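As a quick check of Examples 1 and 2, the two point estimates can be computed with a short Python sketch. The function name `sample_proportions` is a hypothetical helper introduced only for illustration; the inputs x = 30 and n = 350 come from the solution above.

```python
def sample_proportions(x, n):
    """Return (p_hat, q_hat) for x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n
    q_hat = 1 - p_hat
    return p_hat, q_hat

# 30 BS IE graduates out of 350 surveyed graduates
p_hat, q_hat = sample_proportions(30, 350)

print(f"p_hat = {p_hat:.3f} ({p_hat:.1%})")
print(f"q_hat = {q_hat:.3f} ({q_hat:.1%})")
```

Because $\hat{q}$ is defined as $1 - \hat{p}$, the two values always add up to 1.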

The following are formulas involving the concept of point estimation:

• The confidence interval of the population proportion is given by:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \quad \text{or} \quad \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

• The confidence interval of the difference of two proportions is given by:

$$|\hat{p}_1 - \hat{p}_2| - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < (p_1 - p_2) < |\hat{p}_1 - \hat{p}_2| + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

or

$$|\hat{p}_1 - \hat{p}_2| \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

• The standard error SE of the estimate is $SE = \sqrt{\frac{\hat{p}\hat{q}}{n}}$
• The margin of error ME of the estimate is $ME = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
• Conversion of $\hat{p}$ value to z-value

Recall that $\mu = np$ and $\sigma = \sqrt{npq}$. Since $z = \frac{x - \mu}{\sigma}$ from the Central Limit Theorem, $z = \frac{x - np}{\sqrt{npq}}$. Thus,

$$z = \frac{\frac{x - np}{n}}{\frac{\sqrt{npq}}{n}} = \frac{\frac{x}{n} - p}{\sqrt{\frac{pq}{n}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$$

• Formula for estimating a sample size n of a population proportion:

$$n = \frac{\hat{p}\hat{q}}{(ME)^2}\left(z_{\alpha/2}\right)^2$$

• If $\hat{p}$ or $\hat{q}$ is unknown, you may use a conservative estimate of $\hat{p} = 0.5$ and $\hat{q} = 0.5$; then $\hat{p}\hat{q} = 0.25$. Thus we have

$$n = \frac{0.25}{ME^2}\left(z_{\alpha/2}\right)^2$$

• Formula for sample size $n_i$ in estimating the difference in two proportions:

$$n_i = \frac{\hat{p}_1\hat{q}_1 + \hat{p}_2\hat{q}_2}{ME^2}\left(z_{\alpha/2}\right)^2$$
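The single-proportion sample size formulas above can be tried out with the following Python sketch. Several things here are assumptions for illustration only: the function names `z_critical` and `sample_size` are hypothetical, the critical value is taken from Python's `statistics.NormalDist` rather than a printed z-table (so it may differ from table values in the third decimal place), the result is rounded up to the next whole respondent, and the inputs (a 4% margin of error at 95% confidence) are made-up values.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Return z_(alpha/2) for a two-sided confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def sample_size(margin_of_error, confidence, p_hat=0.5):
    """Sample size n = p_hat * q_hat * z^2 / ME^2, rounded up.

    With no prior estimate, p_hat defaults to the conservative 0.5,
    which gives p_hat * q_hat = 0.25.
    """
    z = z_critical(confidence)
    q_hat = 1 - p_hat
    n = p_hat * q_hat * z ** 2 / margin_of_error ** 2
    return math.ceil(n)  # assumption: always round up to a whole unit

# Hypothetical inputs: 4% margin of error at 95% confidence
n_conservative = sample_size(0.04, 0.95)           # no prior estimate
n_with_prior = sample_size(0.04, 0.95, p_hat=0.3)  # prior estimate of 30%

print(n_conservative, n_with_prior)
```

The conservative choice $\hat{p} = 0.5$ never gives a smaller sample than any other value of $\hat{p}$, since $\hat{p}\hat{q}$ is largest at 0.5.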

Example 3

A random sample of size 75 is selected from a binomial population with $\hat{p} = 0.13$. Is it appropriate to use the normal distribution to approximate the sampling distribution of the sample proportion?

Solution:
$\hat{p} = 0.13$ and $\hat{q} = 1 - \hat{p} = 1 - 0.13 = 0.87$
$n\hat{p} = 75(0.13) = 9.75$ and $n\hat{q} = 75(0.87) = 65.25$ (both are greater than 5)
Since both $n\hat{p}$ and $n\hat{q}$ are greater than 5, we can use the normal distribution to approximate the sampling distribution of the sample proportion.
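The check in Example 3 can be expressed as a small Python function. This is a minimal sketch: the name `normal_approximation_ok` and the `threshold` parameter are hypothetical, and the rule of thumb coded here is the one stated in this lesson (both products at least 5).

```python
def normal_approximation_ok(n, p, threshold=5):
    """Return True when both n*p and n*q reach the threshold,
    where q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

# Example 3: n = 75 and p_hat = 0.13 give 9.75 and 65.25
example_3 = normal_approximation_ok(75, 0.13)

# A made-up contrasting case: n = 20 and p = 0.10 give n*p = 2
small_sample = normal_approximation_ok(20, 0.10)

print(example_3, small_sample)
```

In the second, made-up case the product np falls below 5, so the normal approximation would not be considered appropriate under this rule.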

Example 4

Two hundred randomly selected graduates were asked whether they believed that
the country’s employment status will improve under the new president. One hundred
twenty of them said yes. Construct a 90% confidence interval for the proportion of
graduates who believe that the employment status will improve.

Solution:
Given: x = 120 and n = 200, so $\hat{p} = \frac{120}{200} = 0.6$. Thus $\hat{q} = 1 - \hat{p} = 0.4$.

Using the formula for the confidence interval, we have:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Recall that for a confidence level of 90%, $\alpha = 10\%$ and $z_{\alpha/2} = 1.645$. Substituting the given values, we get:

$$0.6 - 1.645\sqrt{\frac{0.6(0.4)}{200}} < p < 0.6 + 1.645\sqrt{\frac{0.6(0.4)}{200}}$$

Thus, the confidence interval is from 0.543 to 0.657, or 54.3% to 65.7%.

Interpretation: We are 90% confident that about 54.3% to 65.7% of the graduates believe that the country’s employment status will improve under the new president.
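The interval in Example 4 can be reproduced with a short Python sketch. The function name `proportion_confidence_interval` is a hypothetical helper, and the critical value is computed with `statistics.NormalDist` instead of being read from a z-table, so it is 1.6449 rather than the rounded 1.645; the difference does not affect the result to three decimal places.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence):
    """Return (lower, upper) limits of p_hat +/- z * sqrt(p_hat*q_hat/n)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_(alpha/2)
    margin = z * math.sqrt(p_hat * q_hat / n)    # margin of error ME
    return p_hat - margin, p_hat + margin

# Example 4: 120 of 200 graduates said yes, 90% confidence level
low, high = proportion_confidence_interval(120, 200, 0.90)

print(f"{low:.3f} to {high:.3f}")
```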

What I Can Do

1. A political campaign manager wishes to survey a number of voters to estimate the proportion of those who are in favor of his candidate. If a previous survey shows that 55% of registered voters plan to vote for his candidate, what is the minimum sample size required to make his surveys accurate with a 95% confidence level and a margin of error of 2.5%?

2. A quality controller wants to estimate the proportion of high quality goods out of a batch of products with a 90% confidence level and a margin of error of 5%. How many products must he test?

3. A school administrator wishes to assess the quality of graduates from their school within 5 school years. A randomly selected group of graduates from two areas of discipline were interviewed as to whether they landed a job related to their field. The data gathered is as follows:

Area of discipline    Sample size    No. of students with job related to field of study
BS Criminology        50             35
BS in Education       45             27

Given the previous data, how many sample respondents from each area must be taken for a deeper assessment if the school administrator wants a 95% confidence level and a margin of error of 3%?

Key to answer on page 36

What I Have Learned

Parameter Estimation – the process of making inferences about a population based on the information/value obtained from a sample describing a characteristic of the population.

Point Estimate – the numerical value which gives an estimate of a parameter. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$.

Interval Estimator – is a formula that tells us how to use sample data to calculate
an interval that estimates a population parameter.

Point Estimator – $\mu = \frac{\sum X}{N} = \bar{x}$

Interval Estimation gives us a range of values which is likely to contain the population parameter. It can be determined by two values.

Confidence level – expressed as a percent, it is the long-run proportion of intervals, computed in the same way from repeated samples, that contain the true population parameter.

Confidence Interval – the range of values around the estimate, often stated as plus or minus a margin (for example, ±2.9%). When the interval and confidence level are put together, you get a spread of percentages.

The value of z has been derived using the Central Limit Theorem:

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \quad \text{or} \quad z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

The Central Limit Theorem states that the sample mean $\bar{x}$ approximately follows the normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.

The $(1 - \alpha)100\%$ confidence interval for the population mean derived from the Central Limit Theorem is as stated below:

$$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

The confidence interval of the population mean with a given confidence level of $(1 - \alpha)100\%$ and when the population variance is known is:

$$\left\{\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)\right\}$$

$\frac{\sigma}{\sqrt{n}}$ = standard error of the mean

$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ = margin of error
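A minimal Python sketch of the standard error, margin of error, and confidence interval for a mean is given below. The function name `mean_confidence_interval` is hypothetical, and the inputs ($\bar{x} = 50$, $\sigma = 8$, n = 64, 95% confidence) are made-up values chosen so the arithmetic is easy to follow; they do not come from the module.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence):
    """Return (standard_error, margin_of_error, lower, upper) for a
    z-based interval with known population standard deviation sigma."""
    standard_error = sigma / math.sqrt(n)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_(alpha/2)
    margin = z * standard_error
    return standard_error, margin, x_bar - margin, x_bar + margin

# Hypothetical inputs: x_bar = 50, sigma = 8, n = 64, 95% confidence
se, me, lower, upper = mean_confidence_interval(50, 8, 64, 0.95)

print(f"SE = {se:.2f}, ME = {me:.2f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

Here the standard error is 8/√64 = 1, so the margin of error equals the critical value itself.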

The confidence interval for the difference between two population means is:

$$|\bar{x}_1 - \bar{x}_2| \pm z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

where
$n_1$ and $n_2$ = number of observations sampled from each population
$S_1$ and $S_2$ = sample standard deviations
$\bar{x}_1$ and $\bar{x}_2$ = sample means

The t-distribution – is the probability distribution that estimates the population parameters when the sample size is small and the population standard deviation is unknown.

Degree of freedom – refers to the number of independent observations in the set of data, or the number of variables that are free to vary.

The formula for the t-value is $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, where n is less than 30.

The z-distribution is used when n ≥ 30 and the standard deviation or variance of the entire population is given.

The t-distribution is used when n < 30 and the standard deviation or variance of the entire population is unknown, or when the only standard deviation given is from the sample.

The degree of freedom df in a t-test is equal to (n-1) if the mean and standard
deviation are computed from samples of size n. The values of t are said to belong to
a t-distribution with df = n-1.

Point Estimate – is a single value used to approximate a population parameter. The sample proportion, denoted by $\hat{p}$, is the best point estimate of the population proportion (p).

We denote p as the population proportion, q as the proportion of “not p”, $\hat{p}$ as the sample proportion (the estimate of p), and $\hat{q}$ as the estimated proportion of “not $\hat{p}$”.

The formulas for $\hat{p}$ and $\hat{q}$ are as follows:

$$\hat{p} = \frac{x}{n}$$

$$\hat{q} = 1 - \hat{p} = 1 - \frac{x}{n} = \frac{n - x}{n}$$

where x is the number of successes in n trials.

The confidence interval of the population proportion:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

or

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

The confidence interval of the difference of two proportions is given by:

$$|\hat{p}_1 - \hat{p}_2| - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < (p_1 - p_2) < |\hat{p}_1 - \hat{p}_2| + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

The standard error SE of the estimate is $SE = \sqrt{\frac{\hat{p}\hat{q}}{n}}$

The margin of error ME of the estimate is $ME = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
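The confidence interval for the difference of two proportions can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `two_proportion_interval` is hypothetical, the absolute value of the difference is used because that is the form given in this module, and the counts (60 of 100 and 45 of 100 at 95% confidence) are made-up data.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Return (lower, upper) for |p1_hat - p2_hat| +/- z * SE, where
    SE = sqrt(p1_hat*q1_hat/n1 + p2_hat*q2_hat/n2)."""
    p1, p2 = x1 / n1, x2 / n2
    q1, q2 = 1 - p1, 1 - p2
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_(alpha/2)
    standard_error = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    margin = z * standard_error
    difference = abs(p1 - p2)                        # module's |p1 - p2| form
    return difference - margin, difference + margin

# Hypothetical data: 60 of 100 in group 1 and 45 of 100 in group 2
lower, upper = two_proportion_interval(60, 100, 45, 100, 0.95)

print(f"{lower:.4f} to {upper:.4f}")
```

If zero lies inside the resulting interval, the data do not show a significant difference between the two proportions at the chosen confidence level.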

Conversion of $\hat{p}$ value to z-value

Recall that $\mu = np$ and $\sigma = \sqrt{npq}$. Since $z = \frac{x - \mu}{\sigma}$ from the Central Limit Theorem, $z = \frac{x - np}{\sqrt{npq}}$. Thus,

$$z = \frac{\frac{x - np}{n}}{\frac{\sqrt{npq}}{n}} = \frac{\frac{x}{n} - p}{\sqrt{\frac{pq}{n}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$$

Formula for estimating a sample size n of a population proportion:

$$n = \frac{\hat{p}\hat{q}}{(ME)^2}\left(z_{\alpha/2}\right)^2$$

If $\hat{p}$ or $\hat{q}$ is unknown, you may use a conservative estimate of $\hat{p} = 0.5$ and $\hat{q} = 0.5$; then $\hat{p}\hat{q} = 0.25$. Thus we have

$$n = \frac{0.25}{ME^2}\left(z_{\alpha/2}\right)^2$$

Formula for sample size $n_i$ in estimating the difference in two proportions:

$$n_i = \frac{\hat{p}_1\hat{q}_1 + \hat{p}_2\hat{q}_2}{ME^2}\left(z_{\alpha/2}\right)^2$$

Assessment

Directions: Read and analyze the statements below. Encircle the letter of the correct
answer.

1. It represents part of a whole. Similar to probability, it can be expressed as a percentage, decimal, or fraction.
a. Point estimate b. proportion c. degree of freedom
2. This refers to the number of independent observations in the set of data, or the number of variables that are free to vary.
a. T-distribution b. degree of freedom c. z-distribution
3. The interval defined within the true population where members of the sample are expected to be found.
a. Confidence interval b. confidence level c. margin of error
4. It is the process of making inferences about a population based on information obtained from a sample.
a. Population proportion b. estimate c. Central Limit Theorem

5. The standard error of the estimate is given by the formula:
a. $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ b. $z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ c. $\sqrt{\frac{\hat{p}\hat{q}}{n}}$

For numbers 6-8.

A group of students in their research would like to determine the EQ of students at Mindanao Science State University. They followed the instructions given by their research adviser. Through simple random sampling, they got 150 students from a population of 3,000 students. Among sampled students, the average EQ score is 115 with a standard deviation of 10.
6. What is the sample mean?
a. 10 b. 3,000 c. 115
7. To solve for the standard deviation of the population, compute the standard error.
a. 0.82 b. 0.995 c. 0.01
8. What is the 99% confidence interval for the students’ EQ score?
a. 114 ± 3.1 b. 112.9 to 117.1 c. 111.1 to 115.6
For numbers 9-10.

Before the BOL (Bangsamoro Organic Law) election, a poll was conducted. Out of 1,285 randomly selected voters interviewed, 599 said they would vote for Candidate X and 676 for Candidate Y.
9. Construct a 98% confidence interval for the proportion p of voters who would vote for Candidate X.
a. 0.0433 to 0.4985 b. 0.0324 to 0.4661 c. 0.4871 to 0.5651
10. Construct a 98% confidence interval for the proportion p of voters who would vote for Candidate Y.
a. 0.0433 to 0.4985 b. 0.0324 to 0.4661 c. 0.4871 to 0.5651

Key to answer on page 36

Key to Answers

Pretest
I.
1. Estimation
2. Central Limit Theorem
3. Interval Estimate
4. Degree of Freedom
5. T-distribution
6. Population Proportion
7. Point estimate
8. Confidence Interval
9. Confidence level
10. Z-distribution
II.

Page 6. Exercise.
$$\mu = \frac{\sum X}{N} = \frac{53 + 64 + 49 + 59 + 62 + 55}{6} = 57$$

$\bar{x} = \mu = 57$ is the point estimate.
Page 13. Exercise
1. A. 105.2 ±3.10
B. 105.2 ± 1.86
2. A. 17.1 ±3.10
B. 17.1 ± 0.42
3. 58.2 ±0.36
Application
1. 12.8 ±0.20
2. 4.4 ± 0.79
3. 3.7 ± 0.19
4. Php2,785 ±61

Chapter test.

1. b

2. b

3. a

4. b

5. c

6. c

7. a

8. b

9. a

10. c

