
Senior High School

Statistics and
Probability
Module 4
Estimation of Parameters

Department of Education ● Republic of the Philippines


Statistics and Probability – Grade 11
Alternative Delivery Mode
Quarter 3 – Module 4: Estimation of Parameters
First Edition, 2020

Republic Act 8293, section 176 states that: No copyright shall subsist in any
work of the Government of the Philippines. However, prior approval of the
government agency or office wherein the work is created shall be necessary for the
exploitation of such work for a profit. Such agency or office may, among other things,
impose as a condition the payment of royalty.

Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand
names, trademarks, etc.) included in this book are owned by their respective
copyright holders. Every effort has been exerted to locate and seek permission to
use these materials from their respective copyright owners. The publisher and
authors do not represent nor claim ownership over them.
Published by the Department of Education – Division of Cagayan de Oro
Schools Division Superintendent: Dr. Cherry Mae L. Limbaco, CESO V

Development Team of the Module

Author/s: Irish Anne A. Ubalde, Rufe A. Felicilda, Maria Hazelle A. Abdala,
Clemencia E. Masiba, Alex M. Acedera, Cheryl Cabiara
Reviewers: Emily A. Tabamo, Evangeline Pailmao, Rufe A. Felicilda

Layout and Design: Arian M. Edullantes

Management Team
Chairperson: Cherry Mae L. Limbaco, PhD, CESO V
Schools Division Superintendent

Co-Chairperson: Rowena H. Para-on, PhD
Assistant Schools Division Superintendent
Members
Lorebina C. Carrasco, OIC – CID Chief
Marlon Francis C. Serina, School Principal
Norma B. Delima, School Principal
Joel D. Potane, SEPS/LRMS Manager
Lanie O. Signo, Librarian II
Gemma Pajayon, PDO II
Printed in the Philippines by
Department of Education – Division of Cagayan de Oro City
Office Address: Fr. William F. Masterson Ave Upper Balulang Cagayan de Oro
Telefax: (08822) 855-0048
E-mail Address: cagayandeoro.city@deped.gov.ph

This instructional material was collaboratively developed and
reviewed by educators from public and private schools, colleges,
and/or universities. We encourage teachers and other education
stakeholders to email their feedback, comments, and
recommendations to the Department of Education at
action@deped.gov.ph

We value your feedback and recommendations.



Table of Contents

What This Module Is About
What I Need To Know
How To Learn from this Module
Icons for this Module
What I Know

Lesson 1 – The t Distribution
What’s In
What’s New
What Is It
What’s More
What I Can Do
What I Have Learned
Assessment
References
Key To Answers

What This Module Is About
Quantitative results must be reported carefully because important decisions may be
based on them. Statisticians use random samples for this purpose. They select
random samples from a target population, describe the characteristics of the
samples, and then make inferences about population characteristics based on the
characteristics of the samples. The process of drawing conclusions about parameter
values based on sample information is called inferential statistics. Inferential
statistics has two areas: estimation and hypothesis testing.

This module covers the basics of estimation. Estimation is the process of
determining the values of parameters. An estimate is a value that approximates a
parameter and is usually based on statistics computed from sample data.

Specifically, this module discusses the t-distribution and sample size
determination.

What I Need to Know


In the previous lesson, you learned about sampling distributions, the
Central Limit Theorem, and the sampling distribution when the sample and
population variances are either given or unknown.

In this module, we will discuss the t-distribution, confidence intervals, and
sample size determination. Specifically, when you are done with this module, you
are expected to be familiar with, if not to have mastered, the following competencies:

✓ (M11/12SP-IIIg-2) illustrate the t-distribution;
✓ (M11/12SP-IIIg-5) identify percentiles using the t-table;
✓ (M11/12SP-IIIj-1) identify the length of a confidence interval;
✓ (M11/12SP-IIIj-2) compute the length of a confidence interval;
✓ (M11/12SP-IIIj-3) compute for an appropriate sample size using the length of
the interval; and
✓ (M11/12SP-IIIj-4) solve problems involving sample size determination.

How to learn from this module
To achieve the objectives of this module, do the following:

• Take your time to read the lesson explanations carefully.
• Solve the sample problems given in each topic on your own, as guided
by the given solution.
• Answer all the given exercises and activities.
• Familiarize yourself with the given terms in the definition box at the
beginning of each topic.

Icons of this Module

What I Need to Know: This part contains learning objectives that are set for you
to learn as you go along the module.

What I Know: This is an assessment of your level of knowledge of the subject
matter at hand, meant specifically to gauge prior related knowledge.

What’s In: This part connects the previous lesson with the current one.

What’s New: This part is an introduction to the new lesson through various
activities, before it is presented to you.

What Is It: This section provides a brief discussion of the lesson as a way to
deepen your discovery and understanding of the concept.

What’s More: This portion provides follow-up activities that are intended for you
to practice further to master the competencies.

What I Have Learned: This part includes activities designed to process what you
have learned from the lesson.

What I Can Do: This section provides tasks and activities that are designed to
showcase the skills and knowledge you have gained and to apply them to real-life
concerns and situations.

Assessment: This is a task which aims to evaluate your level of mastery in
achieving the learning competency.

Answer Key: This contains answers to all activities in the module.

References: This is a list of all sources used in developing this module.

What I Know
Choose the option that corresponds to the correct answer. Write the
letter of your choice on a separate sheet of paper.

1. Decreasing the sample size, while holding the confidence level the same, will
do what to the length of the confidence interval?
A. make it bigger C. it will stay the same
B. make it smaller D. cannot be determined from the given
information

2. Given: 𝑛 ≥ 30 and 𝜎 is known, what is the appropriate distribution?
A. z B. t C. p D. r

3. Which of the following refers to a range of values used to estimate the
parameter which can be calculated using two numbers or values which may or
may not contain the value of the parameter being estimated?
A. Confidence level C. margin of error
B. Interval estimate D. degree of freedom

4. Which of the following quantifies the probabilities in which a member of the
sample would fall within a known interval of the true population, 1-𝛼, if 𝛼 is the
allowable sampling error?
A. Degree of freedom C. confidence interval
B. Margin of error D. confidence level

5. The interval defined within the true population where the members of the
sample are expected to be found.
A. Confidence level C. interval estimate
B. Confidence interval D. degree of freedom

6. Which of the following is the formula for standard error estimate for t?
A. $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
B. $z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
C. $\sqrt{\frac{\hat{p}\hat{q}}{n}}$
D. none of these

7. This distribution is ideally used when n ≤ 30 and the standard deviation or
variance of the entire population is unknown, or when the only standard deviation
given is from the sample.
A. Normal distribution C. z-distribution
B. Sampling distribution D. t-distribution

8. This refers to the number of independent observations in the set of data or the
number of variables that are free to vary.
A. Margin of error C. standard error
B. Degree of freedom D. parameter

9. This distribution is ideally used when 𝑛 ≥ 30 and the standard deviation or the
variance of the entire population is given.
A. t-distribution C. z-distribution
B. sampling distribution D. normal distribution

For numbers 10-12.

A group of students in their research would like to determine the average EQ of
students at Mindanao Science State University. They followed the instructions given
by their research adviser. Through simple random sampling, they got 150 students
from a population of 3,000 students. Among sampled students, the average EQ score
is 115 with a standard deviation of 10.

10. What is the sample mean?


A. 10 B. 3,000 C. 115 D. none of these

11. To solve for the standard deviation of the population, compute the standard
error.
A. 0.82 B. 0.995 C. 0.01 D. 0.001

12. What is the 99% confidence interval for the students’ EQ score?
A. 114 ± 3.1 B. 112.90 to 117.10 C. 111.1 to 115.6 D. 111.06 to 115.1

13. Suppose that we wanted to estimate the true average number of eggs a queen
bee lays with 95% confidence. The margin of error we are willing to accept is
0.5. Suppose we also know that s is about 10. What sample size should we
use?
A. 15.36 B. 15.37 C. 26.53 D. 26.50

For numbers 14-15

The Principal wants to know the mean age of all entering trainees in a boot camp.
The mean age of a random sample of 25 trainees is 18 years and the standard deviation
is 1.3 years. The sample comes from a normally distributed population. Use 𝛼 = 0.01
to find what is asked.

14. What is the error E?
A. 0.83 B. 0.73 C. 0.63 D. 0.53

15. What is the interval estimate of the population mean?
A. 19.27 to 20.71 B. 18.37 to 19.73 C. 17.27 to 18.73 D. 17.07 to 18.03

Lesson 1 – The t Distribution

When the sample size is not large enough for the Central Limit
Theorem to be used and the normal curve concepts cannot be applied, or when the
population standard deviation is not known, there is still another way of estimating
the population mean. The situation calls for another kind of distribution, provided
its assumptions are met.

What’s In
Before starting the lesson, let us see if you can still remember some of
these terms.

COLUMN A
1. Population
2. Sample
3. Population mean
4. Population standard deviation
5. Error
6. Parameter

COLUMN B
A. Values that belong to a population
B. Set of data one wishes to investigate
C. Subset of a set of data
D. Denoted by 𝜇
E. Denoted by 𝜎
F. Difference between a value and the mean
G. Mathematical model for decision making

What’s New

Activity 1: Word Search for Statistical Terms. Find the statistical terms hidden in
the grid of letters. Copy the grid and encircle the terms you find in diagonal,
vertical, and horizontal positions. (Hint: There are 12 statistical terms necessary
to learn the lesson in this module.)

How many terms do you think you can find? Let us try to define these terms later.

C I N T E R V A L L E N G T H U
R O C R T T E L M A U E S T M E
C I N T E R V A L Q S E R E A V
R E S F T R C M B C E S T C E C
I M T D I S T R I B U T I O N A
T D I S T D O O R H U I B V P I
I P E A L E E X T K T M E A O I
C A S M V V S N V L S A M P L E
A R T P T X T X C L R T X W P R
L A I L R C I S I E A I C E E C
V M M E Y I M L A A L E R X R Z
A E A S T A N D A R D E R R O R
L T T I E X S T M R M W V C S X
U E I Z E S T I M A T T Z E E E
E R O E G R E E S O F F R E L L
E N N L E N R T H I N T E R V L
I M A R G I N O F E R R O R E I
D E G R E E S O F F R E E D O M

We have learned that the sampling distribution follows a normal distribution for
large sample sizes, provided the population standard deviation is given. However,
there are problems in which the normal distribution is not appropriate, particularly
when the sample size is small and the population variance is unknown. In this lesson,
we will study another distribution that can be used when the situation does not
allow us to use the standard normal distribution. This distribution is called the
Student’s t-distribution, or simply the t-distribution.

The t-distribution is a family of distributions that look almost identical to the
normal distribution curve, only a bit shorter and fatter. The t-distribution is used
instead of the normal distribution when you have small samples. As the sample size
increases, the t-distribution looks more and more like the normal distribution.

What Is It

DEFINITION 4.4

The t-distribution – is the probability distribution that is used to estimate
population parameters when the sample size is small and the population standard
deviation is unknown.

Degree(s) of freedom – refers to the number of independent observations in the set
of data, or the number of variables that are free to vary. The formula for the degrees
of freedom is df = n – 1, where n is the number of observations.

Confidence level – usually expressed in percent, it sets a portion of the sample to
be included within a known range of the true population. It also quantifies the
probability in which a member of the sample would fall within a known interval of the
true population. If 𝛼 (alpha) is the allowable sampling error, the confidence level is
equal to 1 – 𝛼.

Confidence interval – also called interval estimate, is a range of values that is used
to estimate a parameter. This estimate may or may not contain the true parameter
value.

Here are several properties of the t – distribution:

1. The mean, median, and mode of the t-distribution are equal to 0.
2. The t-distribution is bell-shaped and symmetric about the mean.
3. The total area under the t-distribution curve is equal to 1.
4. The tails in the t-distribution are “thicker” than those in the standard normal
distribution.
5. The standard deviation of the t-distribution varies with the sample size, but it is
greater than 1.
6. The t-distribution is a family of curves, each determined by a parameter called
the degrees of freedom. The degrees of freedom (sometimes abbreviated as
df) are the number of free choices left after a sample statistic such as 𝑥̅ is
calculated. When you use a t-distribution to estimate a population mean, the
degrees of freedom are equal to one less than the sample size.
df = n – 1

7. As the degrees of freedom increase, the t-distribution approaches the standard
normal distribution, as shown in the figure. After df = 30, the t-distribution is
close to the standard normal distribution.

[Figure: t-distribution curves for different degrees of freedom]
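
As an illustration of property 7, the short Python sketch below evaluates the height of the t-distribution curve at its center (t = 0) for several degrees of freedom and compares it with the height of the standard normal curve. It relies on the standard density formula of the t-distribution, which is not stated in this module, and the function names t_pdf and normal_pdf are chosen only for illustration.

```python
import math


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom at the point t."""
    # lgamma (log of the gamma function) is used so that large df values
    # do not overflow.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + t * t / df))


def normal_pdf(z):
    """Density of the standard normal distribution at the point z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    for df in (1, 5, 30, 1000):
        gap = normal_pdf(0) - t_pdf(0, df)
        print(f"df = {df:>4}: height at 0 = {t_pdf(0, df):.4f}, "
              f"gap to normal = {gap:.4f}")
```

The gap between the two curves shrinks as df grows, which is the "shorter and fatter" shape gradually turning into the normal curve.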

Comparison Between the t-distribution and the z-distribution (or Normal Distribution)

t-distribution
• Ideally used when n ≤ 30 and the standard deviation or the variance of the entire
population is unknown, or when the only standard deviation given is from the sample.
• The distribution has a graph that is bell-shaped and symmetrical about the mean.
It is more variable since t-values depend on the fluctuations of the mean and
standard deviation.
• The degree(s) of freedom df is equal to (n-1) if the mean and standard deviation
are computed from samples of size n. The values of t are said to belong to a
t-distribution with df = n-1.
• The bell curve of the t-distribution approaches the standard normal curve as n
becomes bigger.
• Interval estimate formula for the t-distribution:
$\bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
• $\frac{s}{\sqrt{n}}$ = standard error of the mean (SE)
• $t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$ = margin of error (E)

z-distribution
• Ideally used when n ≥ 30 and the standard deviation or the variance of the entire
population is given.
• The distribution has a graph that is bell-shaped and symmetrical about the mean.
Z-values only depend on the fluctuation of the mean from sample to sample.
• Interval estimate formula for the z-distribution:
$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
• $\frac{\sigma}{\sqrt{n}}$ = standard error of the mean (SE)
• $z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ = margin of error (E)

Both distributions
• Both can be used for determining the confidence interval of the population mean
and the confidence interval of the difference between two means.

The confidence coefficients for t that are used in computing interval estimates
of 𝜇 (mu) are found in the t table. The formula for computing the confidence interval
is given above.

The 𝛼 values in the header of the reproduced t table below are the proportions of
the area in the tails of the t curve. The t values in the body of the table are the
critical values for the t distribution and are used like the z critical values. Like
the z values, they are also called confidence coefficients.

Observe that in the table, t values are based not on the sample size n, but on
the degrees of freedom (df), n-1.

Table 1. The t Distribution

Confidence level   80%     90%     95%     98%     99%
One tail, α        0.10    0.05    0.025   0.01    0.005
Two tails, α       0.20    0.10    0.05    0.02    0.01
Degrees of freedom (df)
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
30 1.310 1.697 2.042 2.457 2.750
40 1.303 1.684 2.021 2.423 2.704
50 1.299 1.676 2.009 2.403 2.678
60 1.296 1.671 2.000 2.390 2.660
70 1.294 1.667 1.994 2.381 2.648
80 1.292 1.664 1.990 2.374 2.639
90 1.291 1.662 1.987 2.368 2.632
100 1.290 1.660 1.984 2.364 2.626
500 1.283 1.648 1.965 2.334 2.586
1000 1.282 1.646 1.962 2.330 2.581

How to Use the t - distribution Table of Values

1. Determine alpha (𝛼), the probability that the population parameter is not in the
confidence interval.
2. Identify which of the two tests must be used (two-tailed or one-tailed).

[Figure: Graph of a two-tailed t-distribution at a confidence level of 90%, where
𝛼 = 10% and 𝛼/2 = 5% lies in each tail, and graph of a one-tailed t-distribution at
a confidence level of 90%, where 𝛼 = 10% lies in one tail. Both curves are centered
at 0.]

Note: When the problem does not specify whether the distribution is one-tailed or
two-tailed, use the two-tailed distribution.
3. Compute for the degrees of freedom, df = n – 1
4. Using the t-table, determine the t-value using the row of the desired df, and
the column of the allowed error.
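
The four steps above can be sketched in Python as a simple table lookup. This is only a minimal sketch: it assumes a handful of rows copied from Table 1, and the names T_TABLE, TWO_TAIL_ALPHAS, and t_critical are hypothetical names chosen for illustration.

```python
import math

# A few rows copied from Table 1: df -> critical values for the
# 80%, 90%, 95%, 98%, and 99% columns. Other rows are omitted here.
T_TABLE = {
    10: (1.372, 1.812, 2.228, 2.764, 3.169),
    11: (1.363, 1.796, 2.201, 2.718, 3.106),
    15: (1.341, 1.753, 2.131, 2.602, 2.947),
    24: (1.318, 1.711, 2.064, 2.492, 2.797),
    25: (1.316, 1.708, 2.060, 2.485, 2.787),
}
# Column headers of Table 1 expressed as two-tailed alpha values.
TWO_TAIL_ALPHAS = (0.20, 0.10, 0.05, 0.02, 0.01)


def t_critical(n, alpha, tails=2):
    """Look up the t-value for sample size n and error alpha (steps 1 to 4)."""
    df = n - 1  # step 3: degrees of freedom
    if df not in T_TABLE:
        raise ValueError(f"df = {df} is not among the rows copied into this sketch")
    # A one-tailed alpha sits in the same column as twice that alpha, two-tailed.
    two_tail_alpha = alpha if tails == 2 else 2 * alpha
    for column, header in enumerate(TWO_TAIL_ALPHAS):
        if math.isclose(header, two_tail_alpha):
            return T_TABLE[df][column]  # step 4: row of df, column of alpha
    raise ValueError(f"alpha = {alpha} is not a column of Table 1")


if __name__ == "__main__":
    # One-tailed, alpha = 5%, df = 15 (that is, n = 16).
    print(t_critical(16, 0.05, tails=1))
```

Storing the column headers as two-tailed values keeps one lookup path for both kinds of test, mirroring the way the one-tail and two-tail header rows of Table 1 share the same columns.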

Example 1:

What is the t-value of a one-tailed t-distribution with 𝛼 = 5% and df = 15?


Solution:
From the given table, we find that the t-value is 1.753.

Table 1. The t Distribution (excerpt)
Confidence level   80%     90%     95%     98%     99%
One tail, α        0.10    0.05    0.025   0.01    0.005
Two tails, α       0.20    0.10    0.05    0.02    0.01
d.f.
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
15 1.341 1.753 2.131 2.602 2.947
⋮ ⋮ ⋮ ⋮ ⋮ ⋮

Example 2:

What is the 95% confidence interval of the scores of the Philippine Rugby
(nicknamed Tamaraws) Team players if the mean of the 12 randomly selected
players is 9.5 points per game with a standard deviation of 3.5 points?

Solution:

We will use the t-distribution since the sample size n = 12 is less than 30. The
sample mean is 𝑥̅ = 9.5, the sample standard deviation is s = 3.5, and the
confidence level is 95%.

From the t-table, look for df = 12-1 = 11. Find the value under the two-tailed
test since the deviations in the values of the data could come from both ends of the
distribution. The confidence level is 95%, giving us 𝛼 = 5%

Table 1. The t Distribution (excerpt)
Confidence level   80%     90%     95%     98%     99%
One tail, α        0.10    0.05    0.025   0.01    0.005
Two tails, α       0.20    0.10    0.05    0.02    0.01
d.f.
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
11 1.363 1.796 2.201 2.718 3.106
⋮ ⋮ ⋮ ⋮ ⋮ ⋮

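Hence, $t_{\alpha/2} = 2.201$.

To find the confidence interval when 𝑛 < 30, use the formula below, where s is the
sample standard deviation:

$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = \bar{x} \pm 2.201\left(\frac{s}{\sqrt{n}}\right)$

$= 9.5 \pm 2.201\left(\frac{3.5}{\sqrt{12}}\right)$

$= 9.5 \pm 2.201(1.01)$

$= 9.5 \pm 2.22$

Therefore, the confidence interval is (7.28, 11.72).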

Since the population standard deviation and the standard deviation of the
sampling distribution of means 𝜎𝑥̅ are rarely known, the procedure involving t is
typically used in setting confidence intervals.
In determining the interval estimate for the population mean when 𝜎 is
unknown, let us follow the suggested steps.
Step 1. Describe the population parameter of interest.
Step 2. Check the specifics of the confidence criteria (normality, and test statistic – t
statistic, in this case)
Step 3. Identify the confidence level and critical values as well as the given values.
Also, find the point estimate.
Step 4. Compute the Error E and find the interval estimate. Interpret the results.

Example 3:

A trainer wants to know the mean age of all entering trainees in a camp. The mean
age of a random sample of 25 trainees is 18 years and the standard deviation is
1.3 years. The sample comes from a normally distributed population. Use 𝛼 = 0.01 to
find the following:

a. the point estimate;
b. the error (E); and
c. the interval estimate of the population mean.

Solution:

Step 1. The parameter of interest is the mean 𝜇 of the population where the sample
comes from.

Step 2. The sample comes from a parent population that is normally distributed. The
sample information consists of n = 25 and s = 1.3, and the t distribution will be used.

Step 3. The confidence level is 99%, or 𝛼 = 0.01; df = 25 – 1 = 24; the critical
values are ±2.797; and the point estimate of the population mean 𝜇 is 18.

Step 4. To solve for the error E, use the formula $E = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$:

$E = 2.797\left(\frac{1.3}{\sqrt{25}}\right)$
$= 2.797(0.26)$
$E = 0.73$

Table 1. The t Distribution (excerpt)
Confidence level   80%     90%     95%     98%     99%
One tail, α        0.10    0.05    0.025   0.01    0.005
Two tails, α       0.20    0.10    0.05    0.02    0.01
d.f.
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
24 1.318 1.711 2.064 2.492 2.797

To solve for the interval estimate, use the formula:

$\bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$

18 – 0.73 < 𝜇 < 18 + 0.73
17.27 < 𝜇 < 18.73

The lower boundary is 17.27, while the upper boundary is 18.73.
Confidence interval: (17.27, 18.73)

Summary of Results:
1. The point estimate of the population mean 𝜇 is 18.
2. The error E is 0.73
3. We can say with 99% confidence that the interval between 17.27 and 18.73
contains the population mean age of trainees, based on a sample size of 25.
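
The arithmetic in Step 4 can be checked with a few lines of Python. This is a minimal sketch, not part of the original module: the function name t_confidence_interval is hypothetical, and the critical value 2.797 is taken from Table 1 rather than computed.

```python
import math


def t_confidence_interval(sample_mean, s, n, t_crit):
    """Return the margin of error E and the interval (lower, upper)."""
    standard_error = s / math.sqrt(n)   # s / sqrt(n)
    margin = t_crit * standard_error    # E = t * (s / sqrt(n))
    return margin, (sample_mean - margin, sample_mean + margin)


if __name__ == "__main__":
    # Values from Example 3: mean 18, s = 1.3, n = 25, t = 2.797 (df = 24, 99%).
    margin, (lower, upper) = t_confidence_interval(18, 1.3, 25, 2.797)
    print(f"E = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimal places, the sketch gives E = 0.73 and the interval (17.27, 18.73), matching the summary of results.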

Finding Areas and Percentiles

The process of solving problems on areas and percentiles in the t distribution is
similar to that in the z distribution. We just refer to the t table to find the
critical values.

Example 4

a. What is the 95th percentile of a one-tailed t-distribution when df = 10?

Solution:

Looking at the t table when df = 10, it shows that when 𝛼 = 0.05, one-tailed, the t
value is 1.812.

Thus, the 95th percentile is 1.812.

b. What is the 90th percentile of a two-tailed t distribution when df = 25?

Answer: The t value is 1.708.

To find the area under the t-distribution for a certain range of t-values, first identify
whether a two-tailed or one-tailed t-test was done. The area always corresponds to
1-𝛼.

Example 5.

a. Find the area under the t-distribution from t = -2.447 to t = 2.447 with df = 6.

Solution:

Based on the range of t, the distribution is a two-tailed curve. From Table 1, the
only 𝛼 which will give us a t value of 2.447 is 𝛼 = 0.05 or 5% for a two-tailed test.
Thus, the area under the curve from t = -2.447 to t = 2.447 must be
1 − 𝛼 = 1 − 0.05 = 0.95 or 95%.

(Figure source: https://mathcracker.com/t-distribution-graph-generator#results)

b. Find the area under the t-distribution to the right of t = -1.761 when df = 14.

Solution:

Since the t-distribution is symmetric, the area to the right of t = -1.761 is just
equal to the area to the left of t = 1.761. Thus, since 𝛼 = 0.05 or 5% for a
one-tailed test when t = 1.761 and df = 14, the area should be
1 − 𝛼 = 1 − 0.05 = 0.95 or 95%.

(Figure source: https://mathcracker.com/t-distribution-graph-generator#results)
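
The table-based answer in part b can be cross-checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard density formula of the t-distribution (not given in this module) and Simpson's rule for the area, and the names t_pdf and area_between are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom at the point t."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + t * t / df))


def area_between(a, b, df, steps=2000):
    """Approximate the area under the t curve from a to b with Simpson's rule.

    steps must be an even number.
    """
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(a + i * h, df)
    return total * h / 3


if __name__ == "__main__":
    # Area to the right of t = -1.761 with df = 14. By symmetry, this is the
    # right half of the curve (0.5) plus the area from -1.761 to 0.
    area = 0.5 + area_between(-1.761, 0, df=14)
    print(f"{area:.3f}")
```

The result agrees with the table-based answer of about 0.95 to within rounding of the tabulated t-value.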

Sample Size Determination for the Mean

In estimating the population mean 𝜇 by the sample mean 𝑥̅, the error will not exceed
the margin of error E with a confidence level of (1 − 𝛼)100% when the sample size is
at least equal to

$n = \left(z_{\alpha/2} \cdot \frac{\sigma}{E}\right)^2$ or $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$.

This is the formula for determining the minimum sample size needed when estimating
the population mean. Since the value of 𝜎 is usually unknown, it can be estimated by
the standard deviation s from a prior sample. Alternatively, we may approximate the
range R of observations in the population and make a conservative estimate of
$\sigma \approx \frac{R}{4}$. In any case, round up the result to ensure that the
sample size will be sufficient to achieve the specified reliability.

Example 6

You will be conducting a study to estimate the average daily food expenditure of
students. You want to be 95% confident that the sample mean will be within ₱20 of
the true mean. If you can approximate the population standard deviation by ₱100 and
assume an approximate normal distribution, how large a sample should you get?

Solution:

Using the sample size determination formula, you get

$n = \left(z_{\alpha/2} \cdot \frac{\sigma}{E}\right)^2 = \left(1.96\left(\frac{100}{20}\right)\right)^2 = 96.04$, which rounds up to 97.

The minimum sample size is rounded up since a sample consisting of 96.04 students
is not possible. Hence, the sample size needed to be 95% confident that the estimate
of the daily food expenditure will differ from the true mean by no more than ₱20 is
97 persons.

Example 7

Suppose you want to replicate a study where the lowest observed value is 12.4 while
the highest is 12.8. You want to estimate the population mean 𝜇 within an error of
0.025 of its true value. Using 99% confidence level, find the sample size n that you
need.

Solution

Recall that for a 90% confidence interval, $z_{\alpha/2} = 1.65$; for a 95% confidence
interval, $z_{\alpha/2} = 1.96$; and for a 99% confidence interval, $z_{\alpha/2} = 2.58$.

The confidence level is 99%. So, 𝛼 = 0.01. Hence, $z_{\alpha/2} = 2.58$.

The desired error E is 0.025. Since the range R = 12.8 – 12.4 = 0.4, then
$\sigma \approx \frac{R}{4} = 0.1$.

Substituting these values in the equation $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
for getting the sample size, we have:

$n = \left(\frac{2.58 \times 0.1}{0.025}\right)^2 = \left(\frac{0.258}{0.025}\right)^2 = (10.32)^2 = 106.5$

Remember: When determining sample size, we always round up the resulting value to the
next whole number.
Rounding 106.5 up gives 107. So the required sample size is 107.
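
The sample size computations in Examples 6 and 7 can be reproduced with the short Python sketch below. It is an illustration only: the function names sample_size_for_mean and sigma_from_range are hypothetical, and the z values are the ones recalled in the text rather than computed.

```python
import math


def sample_size_for_mean(z, sigma, margin_of_error):
    """Minimum n = (z * sigma / E)^2, rounded up to the next whole number."""
    return math.ceil((z * sigma / margin_of_error) ** 2)


def sigma_from_range(lowest, highest):
    """Conservative estimate of sigma as one fourth of the range, R / 4."""
    return (highest - lowest) / 4


if __name__ == "__main__":
    # Example 6: 95% confidence (z = 1.96), sigma about 100, E = 20.
    print(sample_size_for_mean(1.96, 100, 20))
    # Example 7: 99% confidence (z = 2.58), sigma estimated from the range, E = 0.025.
    print(sample_size_for_mean(2.58, sigma_from_range(12.4, 12.8), 0.025))
```

Using math.ceil instead of ordinary rounding reflects the rule that the computed value is always rounded up, so that the sample is never smaller than required.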

Sample Size Determination for the Proportion


Determining the sample size at the start of a study ensures that the error in
estimating the population proportion p will not exceed the margin of error (maximum
error) $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{q} = 1 - \hat{p}$,
with a level of confidence of (1 − 𝛼)100%.

In estimating the population proportion 𝑝 using the sample proportion 𝑝̂, the
error will not exceed the margin of error E with a confidence level of (1 − 𝛼)100%
when the sample size is at least equal to

$n = \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$.

If there are no prior or similar given values (𝑝̂ is unknown), you can use

$n = \frac{(z_{\alpha/2})^2}{4E^2}$

to determine the sample size that will not exceed the maximum error E with at least
a (1 − 𝛼)100% confidence level. This is sometimes referred to as the conservative
approach.
Example 8
A market research survey will be conducted on consumer preference for
laundry soap brands. How large a sample is needed if the maximum error must not
exceed 3% at a 99% confidence level and the sample proportion calculated from a
previous study on preferred laundry brand is 0.46?

Solution:

$n = \frac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2} = \frac{(z_{0.005})^2(0.46)(0.54)}{(0.03)^2} = \frac{(2.575)^2(0.46)(0.54)}{(0.03)^2} = 1830.05$

Therefore, the minimum sample size required in the survey should be at least 1,831.
Example 9
How large a sample do you need to obtain if you will be conducting a new research
survey with no prior study, with a 2% margin of error and a 95% level of confidence?
Solution:
The minimum sample size is given by

$n = \frac{(z_{\alpha/2})^2}{4E^2} = \frac{(z_{0.05/2})^2}{4(0.02)^2} = \frac{(1.96)^2}{4(0.02)^2} = 2401$

Hence, the sample size should be at least 2,401 respondents.
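
Both proportion formulas can be combined in one small Python sketch. This is an illustration under stated assumptions: the function name sample_size_for_proportion and its p_hat parameter are hypothetical, and the z values are taken from the worked examples rather than computed.

```python
import math


def sample_size_for_proportion(z, margin_of_error, p_hat=None):
    """Minimum sample size for estimating a proportion.

    With a prior estimate p_hat: n = z^2 * p_hat * (1 - p_hat) / E^2.
    Without one (conservative approach): n = z^2 / (4 * E^2).
    """
    if p_hat is None:
        n = z ** 2 / (4 * margin_of_error ** 2)
    else:
        n = z ** 2 * p_hat * (1 - p_hat) / margin_of_error ** 2
    # Round away tiny floating-point noise first, so that a result that is
    # exactly a whole number on paper is not pushed up by one.
    return math.ceil(round(n, 8))


if __name__ == "__main__":
    # Example 8: 99% confidence (z = 2.575), E = 0.03, prior proportion 0.46.
    print(sample_size_for_proportion(2.575, 0.03, p_hat=0.46))
    # Example 9: 95% confidence (z = 1.96), E = 0.02, no prior study.
    print(sample_size_for_proportion(1.96, 0.02))
```

The conservative formula corresponds to setting the prior proportion to 0.5, the value at which the product of the proportion and its complement is largest.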

What’s More

Let us find out if you can answer the following activities for practice.

Activity 2

1. Using the t-table, give the confidence coefficients (t-value) for each of the
following:
a. n =12, 99% confidence
b. n = 23, 95% confidence

2. Assuming that the samples come from normal distributions, find the margin of
error E given the following:
a. n =18, 𝑥̅ = 78.3, s = 2.5, 95% confidence
b. n = 28, 𝑥̅ = 90.8, s = 2.8, 99% confidence
Solve the following problems.
3. A random sample of 12 students in a certain dormitory has an average weekly
expense of Php400 for snacks, with a standard deviation of Php12.50.
Construct a 90% confidence interval for the amount spent on snacks,
assuming the expenses are normally distributed.
4. A quality controller wants to estimate the proportion of high-quality goods out
of a batch of products with a 90% confidence level and a margin of error of
5%. How many products must he test?
5. Given a sample size n = 12, sample mean of 120 ml and sample standard
deviation of 6. The parent population is normally distributed. Find the error E
and the interval estimate of the population mean 𝜇.

What I Can Do

Tasks:

• Study the hypothetical situation about an effective teaching strategy.
• Compute the parameter estimates to answer the questions that follow.

Adapted from: Rene Belecina, et al., Statistics and Probability

Suppose you want to know if cooperative grouping is an effective strategy in
improving the mathematics performance of Grade 11 students. Twenty students were
included in the experimental group while another 20 students were included in the
control group. The mean achievement score of the students in the experimental
group was 82.5 with a standard deviation of 3 while the mean of the students in the
control group was 80 with a standard deviation of 6. The two groups come from
normally distributed populations. The confidence level adopted was 95%.

1. What is the estimate of the population mean where the experimental group
comes from?
2. What is the estimate of the population mean where the control group comes
from?
3. Express your confidence as percentage.

What I Have Learned


Let us see if you can identify some of these terms or identify what is
required in each item.
A. Identification:
______1. The process of making inferences about a population based on
information obtained from a sample.
______2. A range of values used to estimate the parameter. It can be
calculated using two numbers or values which may or may not
contain the value of the parameter being estimated.
______3. This refers to the number of independent observations in the set
of data or the number of variables that are free to vary.
______4. This distribution is ideally used when n ≤ 30 and the standard
deviation or variance of the entire population is unknown, or that
the only standard deviation given is from the sample.
______5. A single value used to approximate a population parameter.
______6. The interval defined within the true population where members of
the sample are expected to be found.
______7. It quantifies the probabilities in which, a member of the sample
would fall within a known interval of the true population. If 𝛼 is the
allowable sampling error, the confidence level is equal to 1 – 𝛼.
______8. This distribution is ideally used when 𝑛 ≥ 30 and the standard
deviation or the variance of the entire population is given.
______9. In a t-distribution, what is the critical t-value for a two-tailed test,
𝛼 = 0.05 and df = 12?
______10. In a t-distribution, the critical values are based on ___.

Assessment: (Post-Test)
Write the letter of the correct answer on your answer sheet.
1. The interval defined within the true population where the members of the
sample are expected to be found.
A. Confidence level C. interval estimate
B. Confidence interval D. degree of freedom
2. Which of the following refers to a range of values used to estimate the
parameter which can be calculated using two numbers or values which may or
may not contain the value of the parameter being estimated?
A. Confidence level C. margin of error
B. Interval estimate D. degree of freedom
3. Decreasing the confidence level, while holding the sample size the same, will
do what to the length of the confidence interval?
A. make it bigger C. it will stay the same
B. make it smaller D. cannot be determined from the given
information
4. Which of the following quantifies the probability that a member of the
sample would fall within a known interval of the true population, and is equal
to 1 – 𝛼 if 𝛼 is the allowable sampling error?
A. Degree of freedom C. confidence interval
B. Margin of error D. confidence level
5. This distribution is ideally used when n ≤ 30 and the standard deviation or
variance of the entire population is unknown, or that the only standard deviation
given is from the sample.
A. Normal distribution C. z-distribution
B. Sampling distribution D. t-distribution
6. This refers to the number of independent observations in the set of data or the
number of variables that are free to vary.
A. Margin of error C. standard error
B. Degree of freedom D. parameter

For numbers 7-9.

A group of students in their research would like to determine the EQ of the
students of Mindanao Science State University. They followed the instructions
given by their research adviser. Through simple random sampling, they got 150
students from a population of 3,000 students. Among the sampled students, the
average EQ score is 115 with a standard deviation of 10.

7. What is the sample mean?
A. 10 B. 3,000 C. 115 D. none of these
8. To solve for the standard deviation of the population, compute the standard
error.
A. 0.82 B. 0.995 C. 0.01 D. 0.001
9. What is the 99% confidence interval for the students’ EQ score?
A. 114 ± 3.1 B. 112.90 to 117.10 C. 111.1 to 115.6 D. 111.06 to 115.1

For numbers 10-11

The Principal wants to know the mean age of all entering trainees in a boot camp. The
mean age of a random sample of 25 trainees is 18 years and the standard deviation
is 1.3 years. The sample comes from a normally distributed population. Use 𝛼 = 0.01
to find what is asked.

10. What is the error E?
A. 0.83 B. 0.73 C. 0.63 D. 0.53
11. What is the interval estimate of the population mean?
A. 19.27 to 20.71 B. 18.37 to 19.73 C. 17.27 to 18.73 D. 17.07 to 18.03

12. Suppose that we wanted to estimate the true average number of eggs a queen
bee lays with 95% confidence. The margin of error we are willing to accept is
0.5. Suppose we also know that s is about 10. What sample size should we
use?
A. 15.37 B. 16 C. 26.53 D. 26.50
13. Which of the following is the 90th percentile of a two-tailed t-distribution when
the degrees of freedom is 15?
A. 1.753 B. 1.761 C. 1.341 D. none of these
14. What proportion of the t-distribution with df = 18 falls below -2.10?
A. 0.5 B. 0.25 C. 0.025 D. 0.1
15. If you increase the sample size and confidence level at the same time, what will
happen to the length of your confidence interval?
A. Make it bigger C. It will stay the same
B. Make it smaller D. cannot be determined from the given information
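
After answering, the arithmetic behind items 7 to 9 can be checked with a short Python sketch. It assumes a z-based interval, which is reasonable here because n = 150 is large, and it ignores any finite-population correction for the 3,000 students. The function name `z_interval` is illustrative and does not come from the module.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sample_sd, n, confidence):
    """Return (standard error, margin of error, lower, upper) for a z-interval.

    Assumes the sample standard deviation is a good stand-in for the
    population value, which is a common large-sample simplification.
    """
    standard_error = sample_sd / math.sqrt(n)
    alpha = 1 - confidence
    # Two-tailed critical value, e.g. about 2.576 for 99% confidence.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_critical * standard_error
    return standard_error, margin, sample_mean - margin, sample_mean + margin


# EQ problem: 150 sampled students, mean 115, standard deviation 10.
se, margin, lower, upper = z_interval(115, 10, 150, 0.99)
print(f"standard error = {se:.2f}")
print(f"99% interval = {lower:.2f} to {upper:.2f}")
```

Calling the same function with a different confidence level or sample size is a quick way to explore how the length of the interval responds to each, as asked in items 3 and 15.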


Key to Answers

What I Know
1. A 2. A 3. B 4. D 5. B 6. D 7. D 8. B 9. C 10. C 11. A 12. B 13. B 14. B 15. C

What’s In
1. B 2. C 3. D 4. E 5. F 6. A

What’s New
1. confidence interval
2. interval length
3. interval
4. critical value
5. degree of freedom
6. margin of error
7. standard error
8. t distribution
9. sample size
10. parameter
11. sample size
12. estimation

What’s More
1. a) 3.108 b) 2.074
2. a) 3.24 b) 1.47
3. (393.52, 406.48)
4. 271 products
5. E = 1.40 and interval estimate ranges from 81.1 to 83.9

What I Have Learned
1. Estimation
2. Interval estimate
3. Degree of freedom
4. T-distribution
5. Point estimate
6. Confidence interval
7. Confidence level
8. z-distribution
9. 2.179
10. Degrees of freedom
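
The critical value 2.179 listed in the key (What I Have Learned, item 9) is normally read from a t-table. For readers who want to see where such a table entry comes from, the sketch below approximates it numerically using only the Python standard library. It assumes the item intends a two-tailed 𝛼 = 0.05 with df = 12, which is the table entry that matches the key. The helper names `t_pdf`, `t_cdf`, and `t_critical` are illustrative, and the approach (Simpson's rule plus bisection) is one simple option, not the method used by the module.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule (steps must be even)."""
    if x < 0:
        # The t-distribution is symmetric about zero.
        return 1 - t_cdf(-x, df, steps)
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df, two_tailed=True):
    """Find the positive critical value by bisection on the CDF."""
    target = 1 - (alpha / 2 if two_tailed else alpha)
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"{t_critical(0.05, 12):.3f}")
```

The printed value should round to the 2.179 given in the key. In practice a printed t-table or a statistics package serves the same purpose; the sketch only shows that the table entries are areas under the t curve.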
For inquiries and feedback, please write or call:

Department of Education – Bureau of Learning Resources (DepEd-BLR)

DepEd Division of Cagayan de Oro City


Fr. William F. Masterson Ave Upper Balulang Cagayan de Oro
Telefax: (08822) 855-0048
E-mail Address: cagayandeoro.city@deped.gov.ph
