
Statistics and Probability – Grade 11

Alternative Delivery Mode


Quarter 3 – Module 4: Estimation of Parameters
First Edition, 2019

Republic Act 8293, section 176 states that: No copyright shall subsist in any
work of the Government of the Philippines. However, prior approval of the
government agency or office wherein the work is created shall be necessary for
exploitation of such work for profit. Such agency or office may, among other things,
impose as a condition the payment of royalties.
Borrowed materials (i.e., songs, poems, pictures, photos, brand names,
trademarks, etc.) included in this book are owned by their respective copyright
holders. Every effort has been exerted to locate and seek permission to use these
materials from their respective copyright owners. The publisher and authors do not
represent nor claim ownership over them.

Development Team of the Module:

Authors: Roxanne J. Montojo
Reviewers: Evangeline M. Pailmao
           Emily A. Tabamo
           Rufe A. Felicilda
Illustrator: Jay Michael A. Calipusan
Management Team:
Chairperson: Dr. Arturo B. Bayocot, CESO III
             Regional Director
Co-Chairpersons: Dr. Victor G. De Gracia Jr., CESO V
                 Asst. Regional Director
                 Mala Epra B. Magnaong, CES, CLMD
Members: Dr. Bienvenido U. Tagolimot, Jr.
         Regional ADM Coordinator
         Marino O. Dal
         EPS, Math

Printed in the Philippines by: Department of Education – Regional Office 10


Office Address: Zone 1, Upper Balulang Cagayan de Oro City 9000
Telefax: (088) 880-7071, (088) 880-7072
E-mail Address: region10@deped.gov.ph

This instructional material was collaboratively developed and reviewed by educators from public

We value your feedback and recommendations.

Department of Education • Republic of the Philippines

Table of Contents

What I Need To Know
    Module Content
    Module Objectives
    General Instructions
What I Know
Lessons
    Lesson 1 – Random Sampling of the Mean and the Median
        What I Can Do
    Lesson 2 – Confidence Interval and the Central Limit Theorem
        What I Can Do
    Lesson 3 – Z-Distribution and T-Distribution
        What I Can Do
    Lesson 4 – Population Proportion
        What I Can Do
What I Have Learned
Assessment
References

What I Need To Know

In statistical inference, using estimates to approximate the value of an unknown population parameter is an important task.
Consider the mercury contamination of rivers and the water system as a whole in Compostela Valley. To trace its extent, you need to estimate the average mercury content of the mining silt in a river. Suppose that a random sample of 10 such sites gave a sample average of 90 mg of mercury per liter of silt in the river. We may use this finding as an estimate of the average mercury content for all of the setting areas of mining sites in Compostela Valley. This type of estimate can help us analyze the risks that people face should they decide to get water from the river, or even faucet water, that may be contaminated with mercury.

Module Content

This module contains some examples and solutions, activities and exercises
that can help you know the basic estimation of parameters.
This module has four lessons:
• Lesson 1 Random Sampling of the Median and the Mean
• Lesson 2 Confidence Interval and the Central Limit Theorem
• Lesson 3 Z-Distribution & T-Distribution
• Lesson 4 Population Proportion

Module Objectives

Once you are done with this module, you should be able to:
• (M11/12SP-IIIf-2) illustrates point and interval estimations;
• (M11/12SP-IIIf-3) distinguishes between point and interval estimations;
• (M11/12SP-IIIf-4) identifies point estimator for the population mean;
• (M11/12SP-IIIf-5) computes for the point estimate of the population mean;
• (M11/12SP-IIIg-1) identifies the appropriate form of the confidence interval estimator for the population mean when: (a) the population variance is known, (b) the population variance is unknown, and (c) the Central Limit Theorem is to be used;
• (M11/12SP-IIIg-2) illustrates the t-distribution;
• (M11/12SP-IIIg-3) constructs a t-distribution;
• (M11/12SP-IIIg-4) identifies regions under the t-distribution corresponding to different t-values;
• (M11/12SP-IIIg-5) identifies percentiles using the t-table;
• (M11/12SP-IIIh-1) computes for the confidence interval estimate based on the appropriate form of the estimator for the population mean;
• (M11/12SP-IIIh-2) solves problems involving confidence interval estimation of the population mean;
• (M11/12SP-IIIh-3) draws conclusion about the population mean based on its confidence interval estimate;
• (M11/12SP-IIIi-1) identifies point estimator for the population proportion;
• (M11/12SP-IIIi-2) computes for the point estimate of the population proportion;
• (M11/12SP-IIIi-3) identifies the appropriate form of the confidence interval estimator for the population proportion based on the Central Limit Theorem;
• (M11/12SP-IIIi-4) computes for the confidence interval estimate of the population proportion;
• (M11/12SP-IIIi-5) solves problems involving confidence interval estimation of the population proportion;
• (M11/12SP-IIIi-6) draws conclusion about the population proportion based on its confidence interval estimate;
• (M11/12SP-IIIj-1) identifies the length of a confidence interval;
• (M11/12SP-IIIj-2) computes the length of a confidence interval;
• (M11/12SP-IIIj-3) computes for an appropriate sample size using the length of the interval; and
• (M11/12SP-IIIj-4) solves problems involving sample size determination.

General Instructions

To achieve the objectives of this module, do the following:

• Take your time to read the lesson explanations carefully.
• Solve the sample problems given in each topic on your own, as guided by the given solution.
• Answer all the given exercises and activities.
• Familiarize yourself with the given terms in the definition box at the beginning of each topic.

What I Know

I. Identification.
1. The process of making inferences about a population based on information obtained from a sample.
2. It states that the sample mean $\bar{x}$ approximately follows the normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
3. A range of values used to estimate the parameter. It can be calculated using two numbers or values which may or may not contain the value of the parameter being estimated.
4. This refers to the number of independent observations in the set of data, or the number of variables that are free to vary.
5. This distribution is ideally used when n ≤ 30 and the standard deviation or variance of the entire population is unknown, or when the only standard deviation given is from the sample.
6. It represents a part of a whole and can be expressed as a percentage, decimal or fraction.
7. A single value used to approximate a population parameter.
8. The interval defined within the true population where members of the sample are expected to be found.
9. It quantifies the probability that a member of the sample would fall within a known interval of the true population. If $\alpha$ is the allowable sampling error, the confidence level is equal to $1 - \alpha$.
10. This distribution is ideally used when n ≥ 30 and the standard deviation or the variance of the entire population is given.
II. Determine the standard error of the mean, the margin of error, and the confidence interval. Assume that all data are normally distributed.
1. In a survey, male and female student respondents are asked whether or not they prefer to go to college. Find the 99% confidence interval of the difference in the two proportions as shown in the table.

Student    Will go to college    Will not go to college
Male       100                   150
Female     125                   75

Key to answer on page 36

Lesson 1 Random Sampling of the Median and the Mean

Learning Concepts

The learner demonstrates understanding of key concepts of estimation of population mean and population proportion and is able to estimate the population mean and population proportion to make sound inferences in real-life problems in different disciplines.

What is it

DEFINITION 4.1

Parameter Estimation – the process of making inferences about a population based on the information or value obtained from a sample describing a characteristic of the population. For example, consider the following set of data representing the number of errors made by a secretary on 10 different pages of a document: 1, 0, 1, 2, 3, 1, 1, 4, 0, and 2. Let us assume that the document contains exactly 10 pages so that the data constitute a small finite population. A quick study of this population leads to a number of conclusions. For instance, the population mean of the typing errors mentioned above is $\mu = 1.5$. Note that a parameter is a constant value describing a population.

Point Estimate – a single numerical value that gives an estimate of a parameter. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$.

Interval Estimator – is a formula that tells us how to use sample data to calculate
an interval that estimates a population parameter.

Example 1
Consider the table below:

Random Sample (n = 3)    Sample Mean, $\bar{x}$    Sample Median
87 88 90                 88                        88
87 88 92                 89                        88
87 88 95                 90                        88
87 90 92                 90                        90
87 90 95                 91                        90
87 92 95                 91                        92
88 90 92                 90                        90
88 90 95                 91                        90
88 92 95                 92                        92
90 92 95                 92                        92

Looking at column 2 (sample means rounded to the nearest whole number), the sample means 88 and 89 each appeared only once, so their probabilities are both 1/10 or 0.10. The mean 92 appeared twice, so its probability is 2/10 or 0.20, while the means 90 and 91 each appeared three times, so their probabilities are both 3/10 or 0.30. Hence, we obtain the following values:

Random Sampling of the Sample Mean

Sample Mean ($\bar{x}$)    $P(\bar{x})$
88                         0.1
89                         0.1
90                         0.3
91                         0.3
92                         0.2
Probability Histogram of the Sample Mean
[Figure: histogram with the sample mean (88 to 92) on the horizontal axis and the probability on the vertical axis.]

Example 2
Random Sampling of the Sample Median
Looking at the third column for the sample median, we see that both 88 and 92
appeared thrice while 90 appeared four times. Thus their probabilities P(88) = 0.3,
P(92) = 0.3 and P(90) = 0.4. We then obtain the following table:
Sample Median (x)    P(x)
88                   0.3
90                   0.4
92                   0.3

Probability Histogram of the Sample Median
[Figure: histogram with the sample median (88, 90, 92) on the horizontal axis and the probability on the vertical axis.]
Example 3
Estimate the mean consumption of 8 families in one month if their expenses are
Php13,300; Php14,800; Php18,800; Php17,900; Php23,500; Php24,700; Php22,000
and Php29,000
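Solution:

$\bar{x} = \frac{\sum X}{n} = \frac{13{,}300 + 14{,}800 + 18{,}800 + 17{,}900 + 23{,}500 + 24{,}700 + 22{,}000 + 29{,}000}{8} = \frac{164{,}000}{8} = 20{,}500$

The sample mean $\bar{x}$ = Php20,500 is the point estimate of the population mean $\mu$.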
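
As a quick check of the arithmetic in Example 3, the same point estimate can be computed in a few lines. This is a minimal sketch; the variable names are chosen only for illustration.

```python
from statistics import mean

# Monthly expenses (in pesos) of the 8 families in Example 3
expenses = [13300, 14800, 18800, 17900, 23500, 24700, 22000, 29000]

# The sample mean serves as the point estimate of the population mean.
point_estimate = mean(expenses)
print(f"Point estimate: Php{point_estimate:,.0f}")
```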

DEFINITION 4.2
Interval Estimation gives us a range of values which is likely to contain the
population parameter. It can be determined by two values.

Example 4
The following are examples of interval estimation:
1. The average family expense in Region X is Php250 to Php400 a day.
2. The average life span of stage 4 breast cancer patients is 3 ± 5 years.
3. The average score of students in the General Mathematics exam is 75 < $\mu$ < 84.

What I Can Do

1. Estimate the mean weight of 6 students randomly chosen in a university if student 1 weighs 53 kg; student 2 weighs 64 kg; student 3 weighs 49 kg; student 4 weighs 59 kg; student 5 weighs 62 kg; and student 6 weighs 55 kg.

Key to answer on page 36

Lesson 2 Confidence Interval and the Central Limit Theorem

Learning Concept

The learner demonstrates understanding of key concepts of estimation of population mean and population proportion and is able to estimate the population mean and population proportion to make sound inferences in real-life problems in different disciplines.

What’s New
DEFINITION 4.3

Confidence level – expressed as a percent, it sets the portion of the samples expected to fall within a known range of the true population value. Example: An article on a certain survey group reported that 38% of likely Philippine voters are hopeful that their health insurance coverage will change because of the Universal Health Bill. Further down the article, we see that the margin of sampling error is ±2.9 percentage points with a 95% level of confidence. It is impractical to survey 100 million Filipinos, so it is impossible to know exactly how many people would actually respond “yes, my health insurance coverage will change.” Instead, we take a sample of 2,000 Filipinos and, using good statistical techniques like simple random sampling, take our “best guess” at what that actual figure is. What a 95% confidence level says is that if the poll or survey were repeated over and over again, the interval obtained would contain the actual population value about 95% of the time.

Confidence Interval – the range given by the estimate plus or minus the margin of error, here ±2.9 percentage points. When the interval and the confidence level are put together, you get a spread of percentages. In this case, you would expect the results to be between 35.1% (38% − 2.9) and 40.9% (38% + 2.9), 95% of the time.

Illustration 1

The graph shows that 95% of the population distribution is contained in the confidence interval. The confidence level is $1 - \alpha$ with $\alpha = 5\%$, which means that there is a probability of at least 95% that the result is reliable. Each tail of the curve has an area of 2.50%, and the two areas on either side of the middle are 47.5% each.

From the table of the normal curve, the value of $z_{\alpha/2}$ at $A = 0.475$ is $\pm 1.96$.

Illustration 2

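Using the same method to derive the z-score at a confidence level of 99%, we get $z_{\alpha/2} = \pm 2.576$.

[Figure: normal curve for $\alpha = 1\%$ (99% confidence level). Each half of the middle region has area 0.495, each tail has area 0.50%, and the boundaries are at z = −2.576 and z = 2.576.]
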
Summary of z-scores for Commonly Used Confidence Intervals

Allowable Error ($\alpha$)    Confidence Level ($1 - \alpha$)    z-value
10%                           90%                                ±1.645
5%                            95%                                ±1.960
1%                            99%                                ±2.576
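
The z-values in the summary table can be reproduced without a printed table. The sketch below is an illustration that uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical` is made up for this example.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Return the positive z-value that leaves alpha/2 in each tail."""
    alpha = 1 - confidence_level
    # inv_cdf gives the z-score whose cumulative area equals 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z = ±{z_critical(level):.3f}")
```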

CENTRAL LIMIT THEOREM

• The value of z has been derived using the Central Limit Theorem:
$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ or $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• The Central Limit Theorem states that the sample mean $\bar{x}$ approximately follows the normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
• If the population follows a normal distribution, then the sample size n can be either small or large.
• If the population from where the sample is taken follows a non-normal distribution, then the sample size n has to be large (usually n ≥ 30).
• The Central Limit Theorem also states that even if a population distribution is strongly non-normal, its sampling distribution of means will be approximately normal for large sample sizes (over 30). The theorem makes it possible to use probabilities associated with the normal curve to answer questions about the means of sufficiently large samples.
• The $(1 - \alpha)100\%$ confidence interval for the population mean derived from the Central Limit Theorem is as stated below:
$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$

The formula is explained in detail below:

The confidence interval of the population mean with a given confidence level of $(1 - \alpha)100\%$ and when the population variance is unknown is:

$\left( \bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \right)$

Where $\bar{x}$ = sample mean
$\sigma$ = sample standard deviation or square root of the sample variance
$\frac{\sigma}{\sqrt{n}}$ = standard error of the mean
$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ = margin of error

NOTE: The margin of error depends on the confidence level.
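
The pieces of the formula (standard error, margin of error, and the two endpoints) can be put together as a short function. This is a minimal sketch, assuming the z-value is obtained from `statistics.NormalDist` instead of a printed table. The function name `z_confidence_interval` and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, std_dev, n, confidence_level):
    """Return (lower, upper) for the population mean using a z critical value."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    standard_error = std_dev / sqrt(n)        # sigma / sqrt(n)
    margin_of_error = z * standard_error      # z_{alpha/2} * sigma / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Hypothetical figures chosen only for illustration:
# mean 50, standard deviation 10, sample size 100, 95% confidence level.
lower, upper = z_confidence_interval(50.0, 10.0, 100, 0.95)
print(f"({lower:.2f}, {upper:.2f})")
```

Raising the confidence level widens the interval, while a larger sample size narrows it.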

Example 1:
An operations manager plans to select 300 female employees from a group of tenured workers (where height was considered due to the nature of their task). The selected group has an average height of 170 cm and a sample standard deviation of 25 cm. What is the 95% confidence interval for the mean height of all the employees?

Solution:
Given:
Sample size, n = 300
Sample mean, $\bar{x}$ = 170 cm
Sample standard deviation, $\sigma$ = 25 cm
Confidence level = 95%; $\alpha = 5\%$ and $z_{\alpha/2} = 1.96$

We substitute the given values into the formula for the confidence interval.

$\left( \bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \right)$
$= \left( 170 - 1.96\left(\frac{25}{\sqrt{300}}\right),\ 170 + 1.96\left(\frac{25}{\sqrt{300}}\right) \right)$
$= (167.17,\ 172.83)$
$\approx (167,\ 173)$

Hence the operations manager is 95% confident that the employees have a mean height of 167 to 173 cm.

Example 2:

A random sample of 40 residents of Quezon City has an average electrical consumption of 29 kWh per month with a sample standard deviation of 8 kWh. Give the 90% confidence interval for the mean usage of electricity per month.

Solution:

Given: Sample size, n = 40
Sample mean, $\bar{x}$ = 29
Sample standard deviation, $\sigma$ = 8
Confidence level = 90%; $\alpha = 10\%$ and $z_{\alpha/2} = 1.645$

Substituting the given values above, we get

$\left( \bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \right)$
$= \left( 29 - 1.645\left(\frac{8}{\sqrt{40}}\right),\ 29 + 1.645\left(\frac{8}{\sqrt{40}}\right) \right)$
$= (26.92,\ 31.08)$
$\approx (26.9,\ 31.1)$

Thus, the confidence interval is 26.9 to 31.1 kWh per month.

Example 3. A sample of size n = 100 produced the sample mean $\bar{x} = 16$. Assuming the population standard deviation $\sigma = 3$, compute a 95% confidence interval for the population mean $\mu$.
Solution:

A 95% confidence interval for $\mu$ is

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

where $\bar{x}$ = sample mean
$\sigma$ = population standard deviation
$\frac{\sigma}{\sqrt{n}}$ = standard error of the mean
$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ = margin of error

$16 \pm (1.96)\frac{3}{\sqrt{100}} = 16 \pm 0.588 = [15.412,\ 16.588]$
What I Can Do

I. Compute the following.

1. A random sample is drawn from a population of known standard deviation 11.3. Construct a 90% confidence interval for the population mean based on the information given:
a. n = 36, $\bar{x}$ = 105.2
b. n = 100, $\bar{x}$ = 105.2
2. A random sample is drawn from a population of unknown standard deviation. Construct a 99% confidence interval for the population mean based on the information given:
a. n = 49, $\bar{x}$ = 17.1, $\sigma$ = 2.1
b. n = 169, $\bar{x}$ = 17.1, $\sigma$ = 2.1
3. A random sample of size 144 is drawn from a population whose distribution, mean, and standard deviation are all unknown. The summary statistics are $\bar{x}$ = 58.2 and $\sigma$ = 2.6. Construct a 90% confidence interval for the population mean $\mu$.

II. Solve these problems.


1. A government agency was charged by the legislature with estimating the length of time it takes citizens to fill out various forms. Two hundred randomly selected adults were timed as they filled out a particular form. The times required had a mean of 12.8 minutes with a standard deviation of 1.7 minutes. Construct a 90% confidence interval for the mean time taken for all adults to fill out this form.

2. A sample of 250 workers aged 16 and older produced an average length of time with the current employer of 4.4 years with a standard deviation of 3.8 years. Construct a 99.9% confidence interval for the mean job tenure of all workers aged 16 or older.

3. A corporation that owns apartment complexes wishes to estimate the average length of time residents remain in the same apartment before moving out. A sample of 150 rental contracts gave a mean length of occupancy of 3.7 years with a standard deviation of 1.2 years. Construct a 95% confidence interval for the mean length of occupancy of apartments owned by this corporation.

4. In order to estimate the mean amount of damage sustained by vehicles when a cow is struck, an insurance company examined the records of 50 such occurrences, and obtained a sample mean of Php2,785 with a standard deviation of Php221. Construct a 95% confidence interval for the mean amount of damage in all such accidents.

Key to answer on page 36

Lesson 3 Z-Distribution and T-Distribution

Learning Concept

The learner demonstrates understanding of key concepts of estimation of population mean and population proportion and is able to estimate the population mean and population proportion to make sound inferences in real-life problems in different disciplines.

What’s In

The T-distribution (also called Student’s T Distribution) is a family of distributions that look almost identical to the normal distribution curve, only a bit shorter and fatter. The T-distribution is used instead of the normal distribution when you have small samples. The larger the sample size, the more the t-distribution looks like the normal distribution. In this lesson, we will study another form of distribution that can be used if situational problems do not allow us to use the standard normal distribution.
DEFINITION 4.4

The t-distribution – is the probability distribution that estimates the population parameters when th
Degree of freedom – refers to the number of independent observations in the set of data, or the number of variables that are free to vary.

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$. A similar formula is also used for the z-score, except that here our sample size is less than 30.

Properties of T-Distribution

1. The shape of the curve is bell-shaped and symmetrical with mean zero.
2. The t-distribution ranges from $-\infty$ to $\infty$ (infinity).
3. The shape of the distribution changes with the change in the degrees of freedom.
4. The variance is always greater than one and can be defined only when the degrees of freedom v ≥ 3. It is given as $\mathrm{Var}(t) = \frac{v}{v - 2}$.
5. It is less peaked at the center and higher in the tails than the standard normal curve.
6. The t-distribution has a greater dispersion than the standard normal distribution. As the sample size n increases, it approaches the normal distribution. Here the sample size is said to be large when n ≥ 30.
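
Property 4 can be tried out numerically to see how the variance shrinks toward 1, the variance of the standard normal distribution, as the degrees of freedom grow. The sketch below is illustrative only; the function name `t_variance` is made up for this example.

```python
def t_variance(df):
    """Variance of the t-distribution, v / (v - 2), for v >= 3."""
    if df < 3:
        raise ValueError("the variance is defined only for df >= 3")
    return df / (df - 2)


for df in (3, 5, 10, 30, 100):
    print(f"df = {df:>3}: Var(t) = {t_variance(df):.3f}")
```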

Comparison Between the t-distribution and the z-distribution (or normal distribution)

T-distribution:
• Ideally used when n ≤ 30 and the standard deviation or the variance of the entire population is unknown, or when the only standard deviation given is from the sample.
• The distribution has a graph that is bell shaped and symmetrical about the mean. It is more variable since t-values depend on the fluctuations of the mean and standard deviation.
• The degree of freedom df is equal to (n − 1) if the mean and standard deviation are computed from samples of size n. The values of t are said to belong to a t-distribution with df = n − 1. The bell curve of the t-distribution approaches the standard normal curve as n becomes bigger.

Z-distribution:
• Ideally used when n ≥ 30 and the standard deviation or the variance of the entire population is given.
• The distribution has a graph that is bell shaped and symmetrical about the mean. Z-values only depend on the fluctuation of the mean from sample to sample.

Both can be used for determining the confidence interval of the population mean and the confidence interval of the difference between two means.

Values of the t-distribution (two-tailed)

DF A 0.80 0.90 0.95 0.98 0.99 0.995 0.998 0.999


P 0.20 0.10 0.05 0.02 0.01 0.005 0.002 0.001
1 3.078 6.314 12.706 31.820 63.657 127.321 318.309 636.619
2 1.886 2.920 4.303 6.965 9.925 14.089 22.327 31.599
3 1.638 2.353 3.182 4.541 5.841 7.453 10.215 12.924
4 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 1.397 1.860 2.306 2.897 3.355 3.833 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 1.345 1.761 2.145 2.625 2.977 3.326 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 1.337 1.746 2.120 2.584 2.921 3.252 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.090 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
31 1.309 1.695 2.040 2.453 2.744 3.022 3.375 3.633

32 1.309 1.694 2.037 2.449 2.738 3.015 3.365 3.622
33 1.308 1.692 2.035 2.445 2.733 3.008 3.356 3.611
34 1.307 1.691 2.032 2.441 2.728 3.002 3.348 3.601
35 1.306 1.690 2.030 2.438 2.724 2.996 3.340 3.591
36 1.306 1.688 2.028 2.434 2.719 2.991 3.333 3.582
37 1.305 1.687 2.026 2.431 2.715 2.985 3.326 3.574
38 1.304 1.686 2.024 2.429 2.712 2.980 3.319 3.566
39 1.304 1.685 2.023 2.426 2.708 2.976 3.313 3.558
40 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
42 1.302 1.682 2.018 2.418 2.698 2.963 3.296 3.538
44 1.301 1.680 2.015 2.414 2.692 2.956 3.286 3.526
46 1.300 1.679 2.013 2.410 2.687 2.949 3.277 3.515
48 1.299 1.677 2.011 2.407 2.682 2.943 3.269 3.505
50 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
70 1.294 1.667 1.994 2.381 2.648 2.899 3.211 3.435
80 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
90 1.291 1.662 1.987 2.369 2.632 2.878 3.183 3.402
100 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.391
120 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
150 1.287 1.655 1.976 2.351 2.609 2.849 3.145 3.357
200 1.286 1.652 1.972 2.345 2.601 2.839 3.131 3.340
300 1.284 1.650 1.968 2.339 2.592 2.828 3.118 3.323
500 1.283 1.648 1.965 2.334 2.586 2.820 3.107 3.310
∞ 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291

How to Use the T-distribution Table of Values

1. Determine alpha ($\alpha$), the probability that the population parameter is not in the confidence interval.
2. Identify which of the two tests must be used (two-tailed or one-tailed).

[Figure: two t-distribution curves centered at 0, both for a confidence level of 90% ($\alpha = 10\%$). Graph of a two-tailed t-distribution: $\alpha/2 = 5\%$ in each tail. Graph of a one-tailed t-distribution: the whole 10% in a single tail.]

3. Compute the degrees of freedom, df = n – 1.
4. Using the t-table, determine the t-value using the row of the desired df and the column of the allowed error.
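
The four steps above can be sketched as a small table lookup. The dictionary below copies only a few rows and columns of the two-tailed table in this lesson, keyed by df and by the two-tailed P in percent. The names `T_TABLE` and `t_value` are hypothetical, and the sketch assumes that a one-tailed $\alpha$ is read from the two-tailed column for $2\alpha$.

```python
# Excerpt of the two-tailed t-table: {df: {two-tailed P in percent: t-value}}
T_TABLE = {
    10: {10: 1.812, 5: 2.228, 2: 2.764, 1: 3.169},
    11: {10: 1.796, 5: 2.201, 2: 2.718, 1: 3.106},
    15: {10: 1.753, 5: 2.131, 2: 2.602, 1: 2.947},
    24: {10: 1.711, 5: 2.064, 2: 2.492, 1: 2.797},
}


def t_value(alpha_percent, n, tails=2, table=T_TABLE):
    """Look up the t-value for a given alpha (in percent) and sample size n."""
    df = n - 1                                   # step 3: degrees of freedom
    # The columns are two-tailed, so a one-tailed alpha uses the 2*alpha column.
    column = alpha_percent * 2 if tails == 1 else alpha_percent
    return table[df][column]                     # step 4: row df, column alpha


print(t_value(5, 16, tails=1))   # one-tailed, alpha = 5%, df = 15
print(t_value(5, 12, tails=2))   # two-tailed, alpha = 5%, df = 11
```

Only df and $\alpha$ values present in the excerpt can be looked up; any other combination raises a `KeyError`.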

Example 1:
What is the t-value of a one-tailed t-distribution with $\alpha = 5\%$ and df = 15?
Solution:
From the given table, we find that the t-value is 1.753. A one-tailed $\alpha$ of 0.05 corresponds to the two-tailed column P = 0.10 in the table above.

$\alpha$ for one-tailed test    0.05     0.025    0.01
df = 15                         1.753    2.131    2.602

Example 2:

What is the 95% confidence interval of the scores of the Philippine Rugby
(nicknamed Tamaraws) Team players if the mean of the 12 randomly selected
players is 9.5 points per game with a standard deviation of 3.5 points?

Solution:

We will use the t-distribution since the sample size n is less than 30, where n = 12, the sample mean is $\bar{x} = 9.5$, the sample standard deviation is $s = 3.5$, and the confidence level is 95%.

From the t-table, look for df = 12 − 1 = 11. Find the value under the two-tailed test since the deviations in the values of the data could possibly come from both ends of the distribution. The confidence level is 95%, giving us $\alpha = 5\%$.

DF A 0.90 0.95
P 0.10 0.05
1 6.314 12.706
2 2.920 4.303
3 2.353 3.182
4 2.132 2.776
5 2.015 2.571
6 1.943 2.447
7 1.895 2.365
8 1.860 2.306
9 1.833 2.262
10 1.812 2.228
11 1.796 2.201

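Hence, $t_{\alpha/2} = 2.201$. With the sample standard deviation $s = 3.5$ and $n = 12$:

$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = \bar{x} \pm 2.201\left(\frac{s}{\sqrt{n}}\right)$
$= 9.5 \pm 2.201\left(\frac{3.5}{\sqrt{12}}\right)$
$= 9.5 \pm 2.201(1.01)$
$= 9.5 \pm 2.22$
Therefore the confidence interval is (7.28, 11.72).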
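
The computation in Example 2 multiplies the t-value once by the standard error $s/\sqrt{n}$ to get the margin of error. The sketch below repeats that arithmetic so each step can be checked. The t-value 2.201 is read from the table (df = 11, two-tailed, $\alpha = 5\%$) and passed in by hand; the function name `t_confidence_interval` is made up for this example.

```python
from math import sqrt


def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) using a t-value read from the t-table."""
    standard_error = sample_sd / sqrt(n)     # s / sqrt(n)
    margin = t_critical * standard_error     # t_{alpha/2} * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Example 2: mean 9.5, s = 3.5, n = 12, t = 2.201 from the table
lower, upper = t_confidence_interval(9.5, 3.5, 12, t_critical=2.201)
print(f"({lower:.2f}, {upper:.2f})")
```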

Example 3:
The mean sales and standard deviations of samples of two TV brands in a certain appliance store are summarized below:

TV brand    Mean sales (in thousands)    Standard deviation (in thousands)    Number of selling days
A           $\bar{x}_1 = 10.4$           $\sigma_{\bar{x}_1} = 2.5$           $n_1 = 6$
B           $\bar{x}_2 = 11.8$           $\sigma_{\bar{x}_2} = 3.6$           $n_2 = 6$

1. Determine the 90% confidence interval for the difference between the
two mean sales of the two TV brands?
2. Is it safe to conclude that there is no significant difference in the
mean sales of the two TV brands?

Solution:

1. To determine the 90% confidence interval, we substitute the given values $\bar{x}_1 = 10.4$; $\bar{x}_2 = 11.8$; $\sigma_{\bar{x}_1} = 2.5$; $\sigma_{\bar{x}_2} = 3.6$; $n_1 = 6$; $n_2 = 6$; and the 90% confidence level (where $\alpha = 10\%$ or $\alpha/2 = 5\%$) into the formula.

Note: Use df = 10 since df = $n_1 + n_2 - 2$.

Formula:
$|\bar{x}_1 - \bar{x}_2| \pm t_{\alpha/2}\sqrt{\frac{\sigma_{\bar{x}_1}^2}{n_1} + \frac{\sigma_{\bar{x}_2}^2}{n_2}}$
$= |10.4 - 11.8| \pm 1.812\sqrt{\frac{2.5^2}{6} + \frac{3.6^2}{6}}$
$= 1.4 \pm 1.812(1.79)$
$= 1.4 \pm 3.24$

DF A 0.80 0.90
P 0.20 0.10
1 3.078 6.314
2 1.886 2.920
3 1.638 2.353
4 1.533 2.132
5 1.476 2.015
6 1.440 1.943
7 1.415 1.895
8 1.397 1.860
9 1.383 1.833
10 1.372 1.812

Thus the confidence interval of the difference between the mean sales of the two TV brands is (−1.84, 4.64), or in terms of thousands, the confidence interval is (−1,840, 4,640).

2. Since the confidence interval ranges from a negative to a positive value and zero lies in between, it is safe to conclude that there is no significant difference in the mean sales of the two TV brands.
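
The same steps can be written as a short function that also checks whether zero lies inside the interval. This is a sketch under the formula used in Example 3; the t-value 1.812 (df = 10, two-tailed, $\alpha = 10\%$) is read from the table and passed in by hand, and the function name `diff_of_means_interval` is hypothetical.

```python
from math import sqrt


def diff_of_means_interval(mean1, sd1, n1, mean2, sd2, n2, t_critical):
    """Return (lower, upper) for |mean1 - mean2| +/- t * sqrt(sd1^2/n1 + sd2^2/n2)."""
    difference = abs(mean1 - mean2)
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = t_critical * standard_error
    return difference - margin, difference + margin


# Example 3: TV brands A and B, sales in thousands
lower, upper = diff_of_means_interval(10.4, 2.5, 6, 11.8, 3.6, 6, t_critical=1.812)
contains_zero = lower <= 0 <= upper

print(f"({lower:.2f}, {upper:.2f})")
print("Zero inside the interval:", contains_zero)
```

When zero falls inside the interval, the data do not show a significant difference between the two means at the chosen confidence level.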

Example 4:

What is the percentage distribution of the 13.6 million peso average production profit of the 25 lines carried by a certain product brand with a standard deviation of 1.56 million pesos, if the average production profit of all the brand’s lines is 12.9 million pesos? Assume that the data follow a t-distribution.

Solution:

From the problem, the given are $\bar{x} = 13.6$, $s = 1.56$, $n = 25$, and $\mu = 12.9$.

From the Central Limit Theorem:
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{13.6 - 12.9}{1.56 / \sqrt{25}} = \frac{0.7}{0.312} \rightarrow t = 2.244$

The degree of freedom is df = n − 1 = 25 − 1 = 24.

From the t-distribution table at df = 24, the value t = 2.244 is between 2.064 and 2.492.

DF A 0.95 0.98
P 0.05 0.02
10 2.228 2.764

11 2.201 2.718
12 2.179 2.681
13 2.160 2.650
14 2.145 2.625
15 2.131 2.602
16 2.120 2.584
17 2.110 2.567
18 2.101 2.552
19 2.093 2.539
20 2.086 2.528
21 2.080 2.518
22 2.074 2.508
23 2.069 2.500
24 2.064 2.492

By Interpolation (using the one-tailed $\alpha$, which is half of the two-tailed P in the table):

$\alpha$      t-value
0.025         2.064
$\alpha_x$    2.244
0.01          2.492

$\frac{\alpha_x - 0.025}{0.01 - 0.025} = \frac{2.244 - 2.064}{2.492 - 2.064}$
$\alpha_x - 0.025 = \frac{0.18}{0.428}(-0.015)$
$\alpha_x = 0.025 - 0.0063$
$\alpha_x = 0.0187$

To determine the area from the mean, subtract $\alpha_x$ from 0.5. That is
Area = 0.5 − $\alpha_x$
= 0.5 − 0.0187
= 0.4813 or 48.13%
Hence, the area from t = 0 to t = 2.244 is 48.13%.
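
Linear interpolation between two table entries is easy to get wrong by hand, so the sketch below repeats the computation for df = 24, where t = 2.064 goes with a one-tailed $\alpha$ of 0.025 and t = 2.492 goes with 0.01. The function name `interpolate_alpha` is made up for this example, and the result is only a straight-line approximation between the two table entries.

```python
def interpolate_alpha(t, t_low, alpha_low, t_high, alpha_high):
    """Linearly interpolate the tail area alpha for a t-value between two table entries."""
    fraction = (t - t_low) / (t_high - t_low)
    return alpha_low + fraction * (alpha_high - alpha_low)


# Example 4: t = 2.244 lies between 2.064 (alpha 0.025) and 2.492 (alpha 0.01)
alpha_x = interpolate_alpha(2.244, 2.064, 0.025, 2.492, 0.01)
area_from_mean = 0.5 - alpha_x

print(f"alpha_x = {alpha_x:.4f}")
print(f"area from t = 0 to t = 2.244: {area_from_mean:.4f}")
```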

What I Can Do

Solve the following problems:

1. Find the t-value to the left of the mean for $\alpha = 1\%$ and n = 11.
2. Construct a 99% confidence interval for a population mean if we have a sample mean of 67.5, a sample standard deviation of 8.3, and n = 7.
3. What is the t-value of the t-distribution with $\alpha = 1\%$ and n = 18 under the one-tailed test?
4. What is the area under the t-distribution from t = −1.771 to t = 2.160 at df = 13?
5. A study is conducted to compare the performance of students with more than one personal electronic gadget and those with only one. A number of them were taken as subjects for the study. The mean grades of these students and their standard deviations are given below:

Students                   Mean                Standard deviation    Sample size
w/ one gadget              $\bar{x}_1 = 83$    $s_1 = 12$            $n_1 = 7$
w/ more than one gadget    $\bar{x}_2 = 79$    $s_2 = 13$            $n_2 = 5$

Is it possible to conclude that there is no significant difference in the mean grades of the two types of students at 95% confidence level?

Hint: Use the formula $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where $df = n_1 + n_2 - 2$

Key to answer on page 36

Lesson 4 Population Proportion

Learning Concepts

The learner demonstrates understanding of key concepts of estimation of population mean and population proportion and is able to estimate the population mean and population proportion to make sound inferences in real-life problems in different disciplines.

What is it

We have learned that the z-distribution is used when n > 30 (that is, when we have a large sample size) and when the population standard deviation or population variance is given. The t-distribution is used when n ≤ 30 and the only standard deviation given is from a sample. In this lesson, you will learn how to estimate the proportion of a population, which can be expressed as a percentage, decimal, or fraction, and how to choose the sample size n that limits the margin of error and gives higher accuracy.

Point Estimate – is a single value used to approximate a population parameter. The sample proportion, denoted by $\hat{p}$, is the best point estimate of the population proportion (p).

There are research experiments which need an estimate of a population proportion or a confidence interval for the population proportion. Below are some examples:

1. Proportion of customers who are satisfied with the services rendered by a restaurant.
2. Proportion of Fil-Am players in the Philippine Rugby Team.
3. Proportion of registered voters who will likely vote in favor of a female candidate.
4. Proportion of college scholars who get a job related to their field of discipline.

In this lesson, we denote p as the population proportion, q as the proportion of “not p”, $\hat{p}$ as the sample proportion (the estimate of p), and $\hat{q}$ as the estimated proportion of “not $\hat{p}$”.

The formulas for $\hat{p}$ and $\hat{q}$ are as follows:

$\hat{p} = \frac{x}{n}$

$\hat{q} = 1 - \hat{p} = 1 - \frac{x}{n} = \frac{n - x}{n}$

where x is the number of successes in n trials.

Remember the following in the sampling distribution of $\hat{p}$:

• $\hat{p}$ is the sample proportion of x successes in n trials.
• $\hat{p}$ is the best point estimate of p.
• If np and nq are both greater than or equal to 5, then $\hat{p}$ will have an approximately normal distribution. From the Central Limit Theorem, $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
• In the normal approximation to a binomial distribution, $\mu = np$ and $\sigma = \sqrt{npq}$.
• If a sample is not representative of the population, then $\hat{p}$ will not be a useful estimate of p. Instead, use the sampling techniques discussed in the previous lessons to obtain a representative sample.

Example 1
If 350 students from a batch of graduates were surveyed and 30 of them answered that they finished BS Industrial Engineering (BS IE), what is the estimated proportion of those who took up BS IE out of the whole batch?

Solution:
Let $\hat{p}$ = sample proportion of BS IE graduates
x = 30 (number of BS IE graduates)
n = 350 (total number of surveyed graduates)

$\hat{p} = \frac{x}{n} = \frac{30}{350} = 0.086$ or 8.6%

Example 2
From the example given above, what is the estimated proportion of graduates
who didn’t take up BS IE?

Solution:
Let $\hat{q}$ = sample proportion of non-BS IE graduates.
$\hat{q} = 1 - \hat{p} = 1 - 0.086 = 0.914$ or 91.4%
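
The arithmetic in Examples 1 and 2 can be checked with a short script. This is only a minimal sketch: the function name `sample_proportions` is made up for illustration and is not part of the module.

```python
def sample_proportions(successes, n):
    """Return (p_hat, q_hat) for x successes in n trials.

    Hypothetical helper for illustration only.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n   # p_hat = x / n
    q_hat = 1 - p_hat       # q_hat = 1 - p_hat = (n - x) / n
    return p_hat, q_hat


# Values from Examples 1 and 2: x = 30 BS IE graduates out of n = 350 surveyed.
p_hat, q_hat = sample_proportions(30, 350)
print(f"p_hat = {p_hat:.3f}, q_hat = {q_hat:.3f}")
```

Rounded to three decimal places, the two values agree with the 0.086 and 0.914 obtained by hand.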

The following are formulas involving the concept of point estimation:

• The confidence interval of the population proportion is given by:

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

or

$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

• The confidence interval of the difference of two proportions is given by:

$|\hat{p}_1 - \hat{p}_2| - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < (p_1 - p_2) < |\hat{p}_1 - \hat{p}_2| + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

or

$|\hat{p}_1 - \hat{p}_2| \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

• The standard error SE of the estimate is $SE = \sqrt{\frac{\hat{p}\hat{q}}{n}}$

• The margin of error ME of the estimate is $ME = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

• Conversion of $\hat{p}$ value to z-value:

Recall that $\mu = np$ and $\sigma = \sqrt{npq}$. Since $z = \frac{x - \mu}{\sigma}$ from the Central Limit Theorem, $z = \frac{x - np}{\sqrt{npq}}$. Thus,

$z = \frac{x - np}{\sqrt{npq}} = \frac{\frac{x}{n} - p}{\sqrt{\frac{pq}{n}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$

• Formula for estimating a sample size n of a population proportion:

$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{ME}\right)^2$

• If $\hat{p}$ or $\hat{q}$ is unknown, you may use a conservative estimate of $\hat{p} = 0.5$ and $\hat{q} = 0.5$; then $\hat{p}\hat{q} = 0.25$. Thus we have

$n = \frac{0.25\,(z_{\alpha/2})^2}{ME^2}$

• Formula for sample size $n_i$ in estimating the difference in two proportions:

$n_i = \frac{(\hat{p}_1\hat{q}_1 + \hat{p}_2\hat{q}_2)(z_{\alpha/2})^2}{ME^2}$
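
The sample-size formula for a single proportion can be sketched in Python as follows. The function name, the 95% confidence level, the 5% margin of error, and the prior estimate of 0.3 are all assumptions chosen for illustration. The z-value comes from the standard library's `statistics.NormalDist` rather than a printed table, and the result is rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence, margin_of_error, p_hat=0.5):
    """Smallest whole n with z_(alpha/2) * sqrt(p_hat * q_hat / n) <= ME.

    Hypothetical helper; p_hat defaults to the conservative value 0.5.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_(alpha/2)
    n = p_hat * (1 - p_hat) * (z / margin_of_error) ** 2
    return math.ceil(n)                              # always round up


# Assumed inputs: 95% confidence level and a 5% margin of error.
n_conservative = sample_size_for_proportion(0.95, 0.05)        # p_hat unknown
n_prior = sample_size_for_proportion(0.95, 0.05, p_hat=0.3)    # assumed prior estimate
print(n_conservative, n_prior)
```

The conservative choice of 0.5 gives the largest required sample, since the product of the two proportions is at its maximum of 0.25 there.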

Example 3

A random sample of size 75 is selected from a binomial distribution with $\hat{p} = 0.13$. Is it appropriate to use the normal distribution to approximate the sampling distribution of the sample proportion?

Solution:
$\hat{p} = 0.13$ and $\hat{q} = 1 - \hat{p} = 1 - 0.13 = 0.87$
$n\hat{p} = 0.13(75) = 9.75$ and $n\hat{q} = 0.87(75) = 65.25$ (both are greater than 5)
Since both $n\hat{p}$ and $n\hat{q}$ are greater than 5, we can use the normal distribution to approximate the sampling distribution of the sample proportion.
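
The check in Example 3 can be written as a small function. This is a sketch under the rule stated earlier (both products at least 5). The function name is invented for illustration, and the second call uses a made-up sample size of 20 to show a case that fails the rule.

```python
def normal_approximation_ok(n, p, threshold=5):
    """True when both n*p and n*q reach the threshold (5 in this module)."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


# Example 3: n = 75 and proportion 0.13, so n*p = 9.75 and n*q = 65.25.
print(normal_approximation_ok(75, 0.13))

# Hypothetical smaller sample: n = 20 gives n*p = 2.6, which is below 5.
print(normal_approximation_ok(20, 0.13))
```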

Example 4

Two hundred randomly selected graduates were asked whether they believed that
the country’s employment status will improve under the new president. One hundred
twenty of them said yes. Construct a 90% confidence interval for the proportion of
graduates who believe that the employment status will improve.

Solution:
Given: x = 120 and n = 200, so $\hat{p} = \frac{120}{200} = 0.6$. Thus $\hat{q} = 1 - \hat{p} = 0.4$.

Using the formula for the confidence interval, we have:

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

Recall that for a confidence level of 90%, $\alpha = 10\%$ and $z_{\alpha/2} = 1.645$. Substituting the given values, we get:

$0.6 - 1.645\sqrt{\frac{0.6(0.4)}{200}} < p < 0.6 + 1.645\sqrt{\frac{0.6(0.4)}{200}}$

Thus, the confidence interval is from 0.543 to 0.657 or 54.3% to 65.7%.

Interpretation: We are 90% confident that about 54.3% to 65.7% of the graduates believe that the country’s employment status will improve under the new president.
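
The computation in Example 4 can be reproduced with the sketch below. The function name is an assumption made for illustration, and the z-value is taken from `statistics.NormalDist` in the Python standard library instead of a printed z-table, so it carries more decimal places than 1.645.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Return (lower, upper) for p_hat +/- z_(alpha/2) * sqrt(p_hat*q_hat/n)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_(alpha/2)
    margin = z * math.sqrt(p_hat * q_hat / n)            # margin of error
    return p_hat - margin, p_hat + margin


# Example 4: x = 120 "yes" answers out of n = 200, 90% confidence level.
low, high = proportion_confidence_interval(120, 200, 0.90)
print(f"{low:.3f} to {high:.3f}")
```

Rounded to three decimal places, the bounds match the interval from 0.543 to 0.657 found by hand.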

What I Can Do

1. A political campaign manager wishes to survey a number of voters to estimate the proportion of those who are in favor of his candidate. If a previous survey shows that 55% of registered voters plan to vote for his candidate, what is the minimum sample size required to make his surveys accurate with a 95% confidence level and a margin of error of 2.5%?

2. A quality controller wants to estimate the proportion of high quality goods out of a batch of products with a 90% confidence level and a margin of error of 5%. How many products must he test?

3. A school administrator wishes to assess the quality of graduates from their school within 5 school years. A randomly selected group of graduates from two areas of discipline were interviewed as to whether they landed a job related to their field. The data gathered are as follows:

Area of discipline    Sample size    No. of students with job related to field of study
BS Criminology        50             35
BS in Education       45             27

Given the previous data, how many sample respondents from each area must be taken for a
deeper assessment if the school administrator wants a 95% confidence level and a margin
of error of 3%?

Key to answer on page 36

What I Have Learned

Parameter Estimation – the process of making inferences about a population based on the information or value obtained from a sample describing a characteristic of the population.

Point Estimate – the numerical value which gives an estimate of a parameter. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$.

Interval Estimator – is a formula that tells us how to use sample data to calculate an interval that estimates a population parameter.

Point Estimator – $\mu = \frac{\sum X}{N} = \bar{x}$

Interval Estimation gives us a range of values which is likely to contain the population parameter. It can be determined by two values.

Confidence level – expressed as a percent, it sets a portion of the sample to be included within a known range of the true population.

Confidence Interval – the range around the estimate, stated as a plus-or-minus figure; for example, a width of ±2.9% is stated as plus or minus 2.9. When the interval and confidence level are put together, you get a spread of percentages.

The value of z has been derived using the Central Limit Theorem:

$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ or $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

The Central Limit Theorem states that the sample mean $\bar{x}$ approximately follows the normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.

The $(1 - \alpha)100\%$ confidence interval for the population mean derived from the Central Limit Theorem is as stated below:

$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$

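The confidence interval of the population mean with a given confidence level of $(1 - \alpha)100\%$ and when the population variance is unknown is:

$\left\{\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right),\ \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)\right\}$

$\frac{\sigma}{\sqrt{n}}$ = standard error of the mean

$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ = margin of error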
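
As a rough illustration of how the standard error, the margin of error, and the interval for the mean fit together, here is a minimal Python sketch. The function name and the input values (mean 50, standard deviation 8, sample size 64, 95% confidence level) are invented for this example and do not come from the module.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper, standard_error, margin_of_error) for the mean."""
    standard_error = sigma / math.sqrt(n)                 # sigma / sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # z_(alpha/2)
    margin_of_error = z * standard_error
    return (x_bar - margin_of_error, x_bar + margin_of_error,
            standard_error, margin_of_error)


# Hypothetical data: sample mean 50, standard deviation 8, n = 64, 95% level.
low, high, se, me = mean_confidence_interval(50, 8, 64, 0.95)
print(f"SE = {se:.2f}, ME = {me:.2f}, interval = {low:.2f} to {high:.2f}")
```

With these assumed inputs the standard error is exactly 1, so the margin of error is simply the z-value for the 95% level.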

The confidence interval for the difference between two population means is:

$|\bar{x}_1 - \bar{x}_2| \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

where $n_1$ and $n_2$ = sizes of the samples from the two populations

$s_1$ and $s_2$ = sample standard deviations

$\bar{x}_1$ and $\bar{x}_2$ = sample means

The t-distribution – is the probability distribution used to estimate population parameters when the sample size is small and the population standard deviation is unknown.

Degree of freedom – refers to the number of independent observations in the set of data, or the number of variables that are free to vary.

The formula for the t-value is $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

where n is less than 30.

The z-distribution is used when n ≥ 30 and the standard deviation or variance of the entire population is given.

The t-distribution is used when n < 30 and the standard deviation or variance of the entire population is unknown, or when the only standard deviation given is from the sample.

The degree of freedom df in a t-test is equal to (n − 1) if the mean and standard deviation are computed from samples of size n. The values of t are said to belong to a t-distribution with df = n − 1.

Point Estimate – is a single value used to approximate a population parameter. The sample proportion, denoted by $\hat{p}$, is the best point estimate of the population proportion (p).

We denote p as the population proportion, q as the proportion of “not p”, $\hat{p}$ as the sample proportion (the estimate of p), and $\hat{q}$ as the estimated proportion of “not $\hat{p}$”.

The formulas for $\hat{p}$ and $\hat{q}$ are as follows:

$\hat{p} = \frac{x}{n}$

$\hat{q} = 1 - \hat{p} = 1 - \frac{x}{n} = \frac{n - x}{n}$

where x is the number of successes in n trials.

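The confidence interval of the population proportion:

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

or

$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

The confidence interval of the difference of two proportions is given by:

$|\hat{p}_1 - \hat{p}_2| - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < (p_1 - p_2) < |\hat{p}_1 - \hat{p}_2| + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

The standard error SE of the estimate is $SE = \sqrt{\frac{\hat{p}\hat{q}}{n}}$

The margin of error ME of the estimate is $ME = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$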
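
The interval for the difference of two proportions can be sketched the same way. This follows the form used in this module, centered on the absolute difference of the two sample proportions. The function name and the counts (60 of 100 and 45 of 100, at a 95% confidence level) are made up for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Interval |p1_hat - p2_hat| +/- z * sqrt(p1*q1/n1 + p2*q2/n2)."""
    p1, p2 = x1 / n1, x2 / n2
    q1, q2 = 1 - p1, 1 - p2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)      # z_(alpha/2)
    margin = z * math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)     # margin of error
    center = abs(p1 - p2)
    return center - margin, center + margin


# Hypothetical data: 60 of 100 in group 1, 45 of 100 in group 2, 95% level.
low, high = two_proportion_interval(60, 100, 45, 100, 0.95)
print(f"{low:.4f} to {high:.4f}")
```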

Conversion of $\hat{p}$ value to z-value

Recall that $\mu = np$ and $\sigma = \sqrt{npq}$. Since $z = \frac{x - \mu}{\sigma}$ from the Central Limit Theorem, $z = \frac{x - np}{\sqrt{npq}}$. Thus,

$z = \frac{x - np}{\sqrt{npq}} = \frac{\frac{x}{n} - p}{\sqrt{\frac{pq}{n}}} = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
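
The two forms of the z conversion (one based on the count x, one based on the sample proportion) should give the same value. The sketch below checks this numerically. The function names and the inputs (60 successes in 100 trials with p = 0.5) are assumptions for illustration only.

```python
import math


def z_from_count(x, n, p):
    """z = (x - n*p) / sqrt(n*p*q), using the number of successes x."""
    q = 1 - p
    return (x - n * p) / math.sqrt(n * p * q)


def z_from_sample_proportion(x, n, p):
    """z = (p_hat - p) / sqrt(p*q/n), using the sample proportion x / n."""
    p_hat = x / n
    q = 1 - p
    return (p_hat - p) / math.sqrt(p * q / n)


# Hypothetical data: x = 60 successes in n = 100 trials, with p = 0.5.
z1 = z_from_count(60, 100, 0.5)
z2 = z_from_sample_proportion(60, 100, 0.5)
print(round(z1, 4), round(z2, 4))
```

Both calls describe the same standardized distance, because dividing the numerator and the denominator of the first form by n gives the second.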

Formula for estimating a sample size n of a population proportion:

$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{ME}\right)^2$

If $\hat{p}$ or $\hat{q}$ is unknown, you may use a conservative estimate of $\hat{p} = 0.5$ and $\hat{q} = 0.5$; then $\hat{p}\hat{q} = 0.25$. Thus we have

$n = \frac{0.25\,(z_{\alpha/2})^2}{ME^2}$

Formula for sample size $n_i$ in estimating the difference in two proportions:

$n_i = \frac{(\hat{p}_1\hat{q}_1 + \hat{p}_2\hat{q}_2)(z_{\alpha/2})^2}{ME^2}$

Assessment

Directions: Read and analyze the statements below. Encircle the letter of the correct
answer.
1. It represents part of a whole. Similar to probability, it can be expressed as
a percentage, decimal or fraction.
a. Point estimate b. proportion c. degree of freedom
2. This refers to the number of independent observations in the set of data,
or the number of variables that are free to vary.
a. T-distribution b. degree of freedom c. z-distribution
3. The interval defined within the true population where members of the
sample are expected to be found.
a. Confidence interval b. confidence level c. margin of error
4. It is the process of making inferences about a population based on
information obtained from a sample.
a. Population proportion b. estimate c. Central Limit Theorem

5. The standard error of the estimate is given by the formula:

a. $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ b. $z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ c. $\sqrt{\frac{\hat{p}\hat{q}}{n}}$

For numbers 6-8.

A group of students in their research would like to determine the EQ of the students of Mindanao Science State University. They followed the instructions given by their research adviser. Through simple random sampling, they got 150 students from a population of 3,000 students. Among the sampled students, the average EQ score is 115 with a standard deviation of 10.
6. What is the sample mean?
a. 10 b. 3,000 c. 115
7. To solve for the standard deviation of the population, compute the
standard error.
a. 0.82 b. 0.995 c.0.01
8. What is the 99% confidence interval for the students’ EQ score?
a. 114± 3.1 b. 112.9 to 117.1 c. 111.1 to 115.6
For numbers 9-10
Before the BOL (Bangsamoro Organic Law) election, a poll was conducted. Out of 1,285 randomly selected voters interviewed, 599 said they would vote for Candidate X and 676 for Candidate Y.
9. Construct a 98% confidence interval for the proportion p of voters who would vote for Candidate X.
a. 0.4337 to 0.4985 b. 0.0324 to 0.4661 c. 0.4871 to 0.5651
10. Construct a 98% confidence interval for the proportion p of voters who would vote for Candidate Y.
a. 0.4337 to 0.4985 b. 0.0324 to 0.4661 c. 0.4871 to 0.5651

Key to answer on page 36

Key to Answers

Pretest
I.
1. Estimation
2. Central Limit Theorem
3. Interval Estimate
4. Degree of Freedom
5. T-distribution
6. Population Proportion
7. Point estimate
8. Confidence Interval
9. Confidence level
10. Z-distribution
II.

Page 6. Exercise.

$\mu = \frac{\sum X}{N} = \frac{53 + 64 + 49 + 59 + 62 + 55}{6} = 57$
$\bar{x} = \mu = 57$ is the point estimate.
Page 13. Exercise
1. A. 105.2 ±3.10
B. 105.2 ± 1.86
2. A. 17.1 ±3.10
B. 17.1 ± 0.42
3. 58.2 ±0.36
Application
1. 12.8 ±0.20
2. 4.4 ± 0.79
3. 3.7 ± 0.19
4. Php2,785 ±61

Chapter test.
1. b
2. b
3. a
4. b
5. c
6. c
7. a
8. b
9. a
10. c

References

De Guzman, Danilo B. Statistics and Probability. Quezon: C & E Publishing Inc., 2017.

Calaca, Ninia I., Chin Uy, Nestor M. Noble, and Ronaldo A. Manalo. Statistics and Probability. Quezon: VIBAL Group Inc., 2016.
