11

STATISTICS and
PROBABILITY
Fourth Quarter

LEARNING ACTIVITY SHEET

Republic of the Philippines
Department of Education

COPYRIGHT PAGE
Learning Activity Sheet in Statistics and Probability
Grade 11
Copyright © 2020
DEPARTMENT OF EDUCATION
Regional Office No. 02 (Cagayan Valley)
Regional Government Center, Carig Sur, Tuguegarao City, 3500

“No copyright shall subsist in any work of the Government of the Philippines. However, prior
approval of the government agency or office wherein the work is created shall be necessary for exploitation
of such work for profit.”

This material has been developed for the implementation of K to 12 Curriculum through the Curriculum and
Learning Management Division (CLMD). It can be reproduced for educational purposes and the source
must be acknowledged. Derivatives of the work including creating an edited version and enhancement of
supplementary work are permitted provided all original works are acknowledged and the copyright is
attributed. No work may be derived from the material for commercial purpose and profit.

Consultants:
Regional Director : BENJAMIN D. PARAGAS, PhD, CESO IV
Assistant Regional Director : JESSIE L. AMIN, EdD, CESO V
Schools Division Superintendent : MADELYN L. MACALLING, PhD, CESO VI
Assistant Schools Division Superintendents
: DANTE MARCELO, PhD, CESO VI
: EDNA P. ABUAN, PhD
Chief Education Supervisor, CLMD : OCTAVIO V. CABASAG, PhD
Chief Education Supervisor, CID : RODRIGO V. PASCUA, EdD

Development Team
Writers : JAYBEL B. CALUMPIT, REGIONAL SCIENCE HS- ISABELA
: ANGELICA M. BATTUNG, ROXAS STAND ALONE SHS
: CAYSELYN GUITERING, ALFREDA ALBANO NATIONAL HS-ISABELA
: ENGR. RONALD MORALES, BARUCBOC NATIONAL HS
: ARNOLD HABAN, QUEZON NATIONAL HS
: JAYLORD R. MENOR, CAGASAT NATIONAL HS
: CINDY L. AQUINO, LUNA GENERAL COMPREHENSIVE HS

Content Editors : ALJON S. BUCU, PhD


: MAI RANI ZIPAGAN, PhD
: LEONOR BALICAO

Focal Persons : INOCENCIO T. BALAG, EPS MATHEMATICS


: MA. CRISTINA ACOSTA, EPS LRMDS, SDO ISABELA
: ISAGANI DURUIN, REGIONAL EPS MATH
: RIZALINO CARONAN, REGIONAL EPS LRMDS

Printed in DepEd Regional Office No. 02


Regional Government Center, Carig Sur, Tuguegarao City
Table of Contents
Competencies
Illustrates a null hypothesis, alternative hypothesis, level of significance, rejection region, and types of errors in hypothesis testing
Identifies the parameter to be tested given a real-life problem
Formulates the appropriate null and alternative hypotheses on a population mean
Identifies the appropriate form of the test statistic when: (a) the population variance is assumed to be known; (b) the population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used
Identifies the appropriate rejection region for a given level of significance when: (a) the population variance is assumed to be known; (b) the population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used
Computes the test-statistic value (population mean)
Draws a conclusion about the population mean based on the test-statistic value and the rejection region
Solves problems involving test of hypothesis on the population mean
Formulates the appropriate null and alternative hypotheses on a population proportion
Identifies the appropriate form of the test statistic on a population proportion when the Central Limit Theorem is to be used
Solves problems involving test of hypothesis on the population proportion
Illustrates the nature of bivariate data
Constructs a scatter plot
Describes the shape (form), trend (direction), and variation (strength) based on a scatter plot
Calculates the Pearson’s sample correlation coefficient
Solves problems involving correlation analysis
Predicts the value of the dependent variable given the value of the independent variable
Solves problems involving regression analysis

STATISTICS & PROBABILITY

Name: ________________________________ Grade Level: _______


Date: _________________________________ Score: _____________

LEARNING ACTIVITY SHEET


UNDERSTANDING HYPOTHESIS TESTING

Background Information for Learners

Hypothesis testing is a decision-making process for evaluating claims about a population
based on the characteristics of a sample purportedly coming from that population. The
decision is whether the characteristic is acceptable or not.
The null hypothesis, denoted by H0, is a statement that there is no difference between a
parameter and a specific value, or that there is no difference between two parameters.
The alternative hypothesis, denoted by H1, is a statement that there is a difference between
a parameter and a specific value, or that there is a difference between two parameters.
The significance level, also denoted as alpha or α, is the probability of rejecting the null
hypothesis when it is true.
Under the normal curve, the rejection region refers to the region where the value of the test
statistic lies for which we will reject the null hypothesis.
A Type I error is also known as a false positive and occurs when a researcher incorrectly
rejects a true null hypothesis. This means that you report that your findings are significant
when in fact they have occurred by chance.
A Type II error is also known as a false negative and occurs when a researcher fails to reject
a null hypothesis that is actually false. Here a researcher concludes that there is no significant
effect when there actually is one.

Learning Competency with code


The learner is able to illustrate a null hypothesis, alternative hypothesis, level of significance,
rejection region and types of errors in hypothesis testing (M11/12SP-IVa- and identifies the
parameter to be tested given a real life problem (M11/12SP-IVa-

Note: Practice Personal Hygiene protocols at all times.


Exercise A
Directions: Write TRUE if the statement is correct; otherwise, write FALSE.
1. The null hypothesis always indicates an exact hypothesized value of the parameter.
2. If the null hypothesis is true but is rejected, the decision is correct.
3. A Type I error is made when the null hypothesis is rejected when it is true.
4. The risk of Type II error does not depend on the risk of Type I error.
5. If we assume α to be 5%, this means the probability of rejecting a true null
hypothesis is 5 out of 100.
6. The probability of committing a type I error is the significance level of the test.
7. Type I error occurs when we convict a person who, in reality, did not commit the crime.
8. Type II error could be acquitting a person who, in reality, committed the crime.
9. The higher the level of significance, the higher the probability of rejecting the null
hypothesis when it is true.
10. No two things can be and cannot be at the same time.

Exercise B
Directions: For each pair of null and alternative hypotheses, determine whether the set is a valid
set of hypotheses, write Y for yes and N for no.

Question   Null Hypothesis          Alternative Hypothesis   Valid? (Y/N)
1          H0: μ = 36               H1: μ ≠ 36               ____
2          H0: π = .45              H1: π ≠ .45              ____
3          [not legible in source]  [not legible in source]  ____
4          H0: π ≥ [value lost]     H1: π ≤ [value lost]     ____
5          H0: μ > 47               H1: μ ≤ 47               ____
6          H0: p = .70              H1: p ≠ .7               ____
7          H0: μ ≥ 98               H1: μ < 98               ____
8          H0: p ≤ .44              H1: p > .44              ____
9          [not legible in source]  [not legible in source]  ____
10         H0: π ≤ .8               H1: π > .8               ____

Exercise C
Directions: Write the letter of the correct answer.

1. What type of error occurs if you fail to reject H0 when, in fact, it is not true?
a. Type II
b. Type I
c. either Type I or Type II, depending on the level of significance
d. either Type I or Type II, depending on whether the test is one tail or two tail

2. What do we call an assumption that is made about the value of a population parameter?
a. Hypothesis
b. Conclusion
c. Confidence
d. Significance

3. What is the probability of committing a Type I error when the null hypothesis is true?
a. the confidence level
b. the hypothesized mean
c. greater than 1
d. the Level of Significance

4. Which of the following is true about hypothesis testing?
a. the smaller the Type I error, the smaller the Type II error will be
b. the smaller the Type I error, the larger the Type II error will be
c. Type II error will not be affected by Type I error
d. the sum of Type I and Type II errors must equal 1

5. The null and alternative hypotheses divide all possibilities into:
a. two sets that overlap
b. two non-overlapping sets
c. two sets that may or may not overlap
d. as many sets as necessary to cover all possibilities

6. Which of the following is true of the null and alternative hypotheses?
a. Exactly one hypothesis must be true
b. Both hypotheses must be true
c. It is possible for both hypotheses to be true
d. It is possible for neither hypothesis to be true

7. When does a Type II error occur?
a. the null hypothesis is incorrectly accepted when it is false
b. the null hypothesis is incorrectly rejected when it is true
c. the sample mean differs from the population mean
d. the test is biased

8. A two-tailed test is one where:
a. results in only one direction can lead to rejection of the null hypothesis
b. negative sample means lead to rejection of the null hypothesis
c. results in either of two directions can lead to rejection of the null hypothesis
d. no results lead to the rejection of the null hypothesis

9. Which of the following does the null hypothesis usually represent?
a. the theory the researcher would like to prove
b. the preconceived ideas of the researcher
c. the perceptions of the sample population
d. the status quo

10. Which of the following values is not typically used for α?
a. 0.01
b. 0.05
c. 0.10
d. 0.25

Reflection

Complete this statement:


What I learned in this activity
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________


Answer Key
Exercise A
1. TRUE
2. FALSE
3. TRUE
4. FALSE
5. TRUE
6. TRUE
7. TRUE
8. TRUE
9. TRUE
10. TRUE

Exercise B

Question   Null Hypothesis          Alternative Hypothesis   Valid? (Y/N)
1          H0: μ = 36               H1: μ ≠ 36               Y
2          H0: π = .45              H1: π ≠ .45              Y
3          [not legible in source]  [not legible in source]  Y
4          H0: π ≥ [value lost]     H1: π ≤ [value lost]     N
5          H0: μ > 47               H1: μ ≤ 47               N
6          H0: p = .70              H1: p ≠ .7               Y
7          H0: μ ≥ 98               H1: μ < 98               Y
8          H0: p ≤ .44              H1: p > .44              Y
9          [not legible in source]  [not legible in source]  N
10         H0: π ≤ .8               H1: π > .8               Y

Exercise C
1. a
2. a
3. d
4. b
5. b
6. a
7. a
8. c
9. d
10. d

Prepared by:

JAYBEL B. CALUMPIT
Regional Science High School for Region 02
STATISTICS AND PROBABILITY 11
Name of Learner: _______________________________ Grade Level: ___________________
Section: _______________________________________ Date: _________________________

LEARNING ACTIVITY SHEET


FORMULATING THE APPROPRIATE NULL AND ALTERNATIVE HYPOTHESES ON A
POPULATION MEAN

In our daily lives, we make different observations about what is happening around us. For example, we form
tentative explanations about COVID-19 by guessing. These guesses, deduced from observations,
are called hypotheses.
Background Information for Learners

A hypothesis is a tentative statement or explanation of a phenomenon. It is an assertion
about a parameter.
A null hypothesis (𝐻0 , read as “H zero”) is a statement that there is no difference between a
parameter and a specific value.
An alternative hypothesis (𝐻1 , read as “H one”) is a statement that there exists a difference
between a parameter and a specific value.

In formulating the null and alternative hypotheses, we examine the claim or conjecture about the
population parameter. The following examples show how to formulate null and alternative
hypotheses for a given conjecture or claim.
Example 1.
Claim: The average number of daily confirmed cases of COVID-19 in the Philippines is 659 (per
million population).
𝐻0 : The average number of daily confirmed cases of COVID-19 in the Philippines is 659 (per
million population) (𝜇 = 659).
𝐻1 : The average number of daily confirmed cases of COVID-19 in the Philippines is not equal to
659 (per million population) (𝜇 ≠ 659).
Observe that the “equal” symbol is used to express the null hypothesis, while the “not equal”
symbol is used to express the alternative hypothesis because the claim does not
specify any direction.
Example 2.
Claim: The average number of students per class in the new normal is less than 20.
𝐻0 : The average number of students per class in the new normal is
equal to 20 (𝜇 = 20).
𝐻1 : The average number of students per class in the new normal is less than 20
(𝜇 < 20).
Notice that the claim uses the phrase “less than”, thus the alternative hypothesis is expressed with
the < symbol.
Example 3.
Claim: The average number of hours that a Filipino internet user spends each day
during the ECQ is greater than 10.03 hours.
𝐻0 : The average number of hours that a Filipino internet user spends each day
during the ECQ is equal to 10.03 hours (𝜇 = 10.03).
𝐻1 : The average number of hours that a Filipino internet user spends each day
during the ECQ is greater than 10.03 hours (𝜇 > 10.03).
Notice that the claim uses the phrase “greater than”, thus the alternative hypothesis is expressed
with the > symbol.
Example 4.
A new drink in the market is claimed by its manufacturer to increase height by 2 inches per month
with a standard deviation of 0.42 inch. Fifteen teens chosen at random have reported an average
increase of 1.67 inches within a month. Do these data support the claim of the manufacturer at
0.05 level of significance?
Claim: The average increase in height per month using a new drink is equal to 2
inches.
𝐻0 : The average increase in height per month using a new drink is equal to 2
inches (𝜇 = 2).
𝐻1 : The average increase in height per month using a new drink is not equal
to 2 inches (𝜇 ≠ 2).
Observe that the claim in the first statement of the problem does not specify any direction, thus
the alternative hypothesis is expressed with the ≠ symbol.
In all of the examples presented, the “equal” symbol is used to express the null
hypothesis, which always states that the parameter is equal to a specific value. On the other hand, the
symbols ≠, <, and > are used to express the alternative hypothesis, depending on the claim.
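
The pattern in Examples 1 to 4 can be sketched in Python. This is only an illustration and not part of the original activity sheet: the function name `formulate_hypotheses` and the direction labels "not equal", "less", and "greater" are assumptions introduced here.

```python
def formulate_hypotheses(value, direction):
    """Return (H0, H1) as strings for a claim about a population mean.

    `direction` is a hypothetical label describing the claim:
    "not equal", "less", or "greater".
    """
    symbols = {"not equal": "≠", "less": "<", "greater": ">"}
    if direction not in symbols:
        raise ValueError("direction must be 'not equal', 'less', or 'greater'")
    # The null hypothesis always uses the equal symbol.
    h0 = f"H0: μ = {value}"
    # The alternative hypothesis takes its symbol from the claim.
    h1 = f"H1: μ {symbols[direction]} {value}"
    return h0, h1


# Example 2: the claim is that the mean class size is less than 20.
print(formulate_hypotheses(20, "less"))
```

A claim that states no direction, as in Examples 1 and 4, would be passed as "not equal".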

Learning Competency
Formulates the appropriate null and alternative hypotheses on a population mean (Quarter 4,
Week 2, M11/12SP-IVb-1)
EXERCISE 1

Directions: Identify whether the following is a null or an alternative hypothesis. (1 point each)
1. The mean height of Filipino women is 149.6 cm.
2. The average daily allowance of grade eleven students is less than Php 150.
3. The mean content of sugar in a bottle of soft drinks is greater than 52 g.
4. The average weekly consumption of ordinary rice by Filipino families is 8.9 kg.
5. The average number of hours it takes to travel from Isabela to Manila by bus is less than 12
hours.

EXERCISE 2
Directions: State the null (𝐻0 ) and the alternative (𝐻1 ) hypotheses for each of the following claims.
(2 points each)
1. The average number of years spent by Filipino workers before retiring is 31 years.
2. The mean tuition fee in private school is greater than Php 100 000 annually.
3. The average time it takes a grade eleven student to learn a certain topic in
Mathematics is less than 45 minutes.
4. The mean weight of grade eleven students is 54.4 kg.
5. The average salary of private school teachers is less than Php 20 000 monthly.

EXERCISE 3
Directions: Identify the claim on the following problem. Then, state the null (𝐻0 ) and the
alternative (𝐻1 ) hypotheses for each claim. (3 points each)
1. A teacher saw a news report claiming that the drop-out rate in primary education is 21.2%. He wants
to know if it is true in the town where he teaches. He randomly selected 250 respondents and finds
that the drop-out rate of the respondents is 23.7% with a standard deviation of 1.02%. What
can the teacher conclude about the accuracy of the news at 0.01 level of significance?
2. A newly established restaurant in the city claims that the waiting time for customers is less than
10 minutes. Fifty randomly selected customers have reported an average waiting time of 13
minutes with a standard deviation of 2.5 minutes. At 0.05 level of significance, what can you
conclude about the restaurant’s claim?
3. A researcher believes that it costs more than Php 150 000 per year to send a college student to a
private school. The researcher takes a random sample of 50 families who had sent their child to
private universities to see if his claim is true. It reveals that the mean expenses of these
families are Php 160 000 with a standard deviation of Php 5 000. Can it be concluded that the
researcher is correct in his claim at 0.05 level of significance?
Reflection:

What is the most important thing you’ve learned from this topic? Why?
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________


Answer key:
EXERCISE 1
1. Null hypothesis
2. Alternative hypothesis
3. Alternative hypothesis
4. Null hypothesis
5. Alternative hypothesis

EXERCISE 2
1. 𝐻0 : The average number of years spent by Filipino workers before retiring is 31 years
(𝜇 = 31).
𝐻1 : The average number of years spent by Filipino workers before retiring is not
equal to 31 years (𝜇 ≠ 31).
2. 𝐻0 : The mean tuition fee in private school is equal to Php 100 000 annually
(𝜇 = 100 000).
𝐻1 : The mean tuition fee in private school is greater than Php 100 000 annually
(𝜇 > 100 000).
3. 𝐻0 : The average time it takes a grade eleven student to learn a certain topic
in Mathematics is equal to 45 minutes (𝜇 = 45).
𝐻1 : The average time it takes a grade eleven student to learn a certain topic
in Mathematics is less than 45 minutes (𝜇 < 45).
4. 𝐻0 : The mean weight of grade eleven students is 54.4 kg (𝜇 = 54.4).
𝐻1 : The mean weight of grade eleven students is not equal to 54.4 kg (𝜇 ≠ 54.4).

5. 𝐻0 : The average salary of private school teachers is equal to Php 20 000 monthly
(𝜇 = 20 000).
𝐻1 : The average salary of private school teachers is less than Php 20 000 monthly
(𝜇 < 20 000).

EXERCISE 3
1. Claim: The average drop-out rate in primary education is equal to 21.2%.
𝐻0 : The average drop-out rate in primary education is equal to 21.2% (𝜇 = 0.212).
𝐻1 : The average drop-out rate in primary education is not equal to 21.2%
(𝜇 ≠ 0.212).
2. Claim: The average waiting time for customers in a newly established restaurant is less
than 10 minutes.
𝐻0 : The average waiting time for customers in a newly established restaurant is equal to
10 minutes (𝜇 = 10).
𝐻1 : The average waiting time for customers in a newly established restaurant is less than
10 minutes (𝜇 < 10).
3. Claim: The average cost to send a college student in private school per year is more than
Php 150 000.
𝐻0 : The average cost to send a college student in private school per year is equal to
Php 150 000 (𝜇 = 150 000).
𝐻1 : The average cost to send a college student in private school per year is more than
Php 150 000 (𝜇 > 150 000).

Prepared by:
ANGELICA M. BATTUNG

STATISTICS AND PROBABILITY 11
Name of Learner: _______________________________ Grade Level: ___________________
Section: _______________________________________ Date: _________________________

LEARNING ACTIVITY SHEET


IDENTIFYING THE APPROPRIATE FORM OF THE TEST-STATISTIC

In practice, hypotheses are tested under different conditions: the sample may be large or small,
and the population standard deviation may or may not be known. Depending on the situation, a
different test statistic is used to test a hypothesis. You will learn about these in this activity sheet.
Background Information for Learners

Z-test of one-sample mean
It is used to test whether the sample mean 𝑋̅ differs significantly from the population mean 𝜇. There are
two cases in which we can use the z-test.
1. The population standard deviation 𝜎 is known.
2. The population standard deviation 𝜎 is unknown but 𝑛 ≥ 30, so the Central Limit Theorem
(CLT) applies. In this case, the sample standard deviation 𝑠 can replace the population standard
deviation 𝜎.
The Central Limit Theorem
The central limit theorem (CLT) states that the sampling distribution of the mean approaches the
normal distribution as the sample size gets larger. Sample sizes greater than or equal to 30 are
considered sufficient for the CLT to hold.

Example 1. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 1.86, 𝜇 = 2, 𝜎 = 0.35, 𝑛 = 50
Solution: Since the population standard deviation 𝜎 is given, the first condition is
satisfied, so we can use the z-test in this case.

Example 2. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 92, 𝜇 = 90, 𝑠 = 4, 𝑛 = 120
Solution: Notice that the population standard deviation 𝜎 is not given, but the sample size is
greater than 30; thus, by the CLT, we can replace the population standard deviation
𝜎 with the sample standard deviation 𝑠. In this case, we can still use the z-test.

Example 3. A teacher saw a news report claiming that the drop-out rate in primary education is 21.2%.
He wants to know if it is true in the town where he teaches. He randomly selected
150 respondents and finds that the drop-out rate is 23.7% with a standard
deviation of 2.02%. What can the teacher conclude about the accuracy of the news
at 0.01 level of significance?
Solution: The data in the problem satisfy the second condition (𝜎 is unknown but 𝑛 = 150 ≥ 30),
thus the appropriate test statistic for this kind of problem is the z-test.

Generally, we can use z-test when the population standard deviation 𝜎 is known. However, if the
population standard deviation 𝜎 is unknown, z-test can still be used provided that 𝑛 ≥ 30, large
enough for the CLT to hold. What if 𝜎 is unknown and 𝑛 < 30? The appropriate test statistic for
this case is the t-test.

T-test of one sample mean


It is used to compare the population mean 𝜇 and the sample mean 𝑋̅, whenever 𝜎 is unknown
and 𝑛 < 30.

Example 4. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 20, 𝜇 = 17, 𝑠 = 4, 𝑛 = 10


Solution: Obviously, the population standard deviation 𝜎 is unknown and 𝑛 < 30, thus we
shall use the t-test for this example.
Example 5.
An ICT teacher in a certain school believes that his grade 11 students can type more than 50 words
in a minute. Fifteen randomly selected students reveal an average of 53.2 words per minute with
a standard deviation of 6.7 words per minute in an encoding performance task. What can you
conclude about the teacher’s claim at 0.05 level of significance?
Solution: Since 𝑛 = 15 and 𝜎 is unknown, we shall use the t-test to test the teacher’s claim.
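
The decision rule used in Examples 1 to 5 can be summarized in a short Python sketch. It is only an illustration of the rule stated above; the function name `choose_test_statistic` and its parameters are assumptions introduced here and do not appear in the original sheet.

```python
def choose_test_statistic(sigma_known, n):
    """Pick the test statistic for a one-sample test of the mean.

    sigma_known: True if the population standard deviation is given.
    n: sample size.
    """
    if n < 1:
        raise ValueError("sample size must be at least 1")
    if sigma_known:
        # Case 1: population standard deviation is known.
        return "z-test"
    if n >= 30:
        # Case 2: sigma unknown, but the sample is large enough for the CLT.
        return "z-test"
    # Remaining case: sigma unknown and n < 30.
    return "t-test"


print(choose_test_statistic(sigma_known=True, n=50))    # Example 1
print(choose_test_statistic(sigma_known=False, n=120))  # Example 2
print(choose_test_statistic(sigma_known=False, n=15))   # Example 5
```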

Learning Competency
Identifies the appropriate form of the test-statistic when: (a) the population variance is assumed to
be known (b) the population variance is assumed to be unknown; and (c) the Central Limit
Theorem is to be used. (Quarter 4, Week 2, M11/12SP-IVb-2)

EXERCISE 1
Directions: Write TRUE if the statement is correct and FALSE if it is not. (1 point each)
1. Z-tests assume that 𝜎 is known, while t-tests assume that 𝜎 is unknown.

2. When 𝜎 is unknown and 𝑛 < 30, the z-test is the appropriate statistical tool.
3. A sample size of 30 is considered enough for the CLT to be applied.
4. For a sample of fifty in which only the sample standard deviation is known, we shall use the t-test.
5. The CLT is always applicable as long as 𝜎 is known.

EXERCISE 2
Directions: Identify the appropriate test statistic for each of the following. (1 point each)
1. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 19, 𝜇 = 22, 𝜎 = 2.0, 𝑛 = 30
2. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 77.9, 𝜇 = 80, 𝑠 = 1.5, 𝑛 = 18
3. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 118, 𝜇 = 120, 𝑠 = 6, 𝑛 = 23
4. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 27.2, 𝜇 = 23.8, 𝑠 = 4.5, 𝑛 = 70
5. 𝐺𝑖𝑣𝑒𝑛: 𝑋̅ = 5.05, 𝜇 = 5.55, 𝜎 = 0.90, 𝑛 = 10

EXERCISE 3
Directions: Identify the appropriate test statistic for each of the following, then justify your
answer. (2 points each)
1. It is believed that the average monthly salary of a blogger is at least Php 100 000. A random
sample of ten bloggers has shown an average monthly salary of Php 112 000 with a standard
deviation of Php 15 000. At 0.01 level of significance, is the hypothesized mean true?
2. A newly established restaurant in the city claims that the waiting time for customers is less than
15 minutes with a standard deviation of 2.5 minutes. Fifty randomly selected customers have
reported an average waiting time of 17 minutes. At 0.05 level of significance, what can you
conclude about the restaurant’s claim?
3. A psychologist claims that the attention span of Grade 11 students is 50 minutes. Thirty
randomly selected students reported to have a mean of 46 minutes attention span. If the population
standard deviation can be assumed to be 12 minutes, should the psychologist stick to his belief at
0.01 level of significance?
4. The mean weight of 20 packs of brand X detergent powder is 62.3 g with a standard deviation
of 5g. However, the manufacturer claims that it contains an average of 65 g. Use 0.01 level of
significance to validate the manufacturer’s claim.
5. The owner of a café wants to know whether the true average number of customers that visit the
store per day is 25. It is revealed that the average number of customers per day is 27 with a standard
deviation of seven customers, in a random sample of 42 days. Is there enough evidence to reject
the null at 0.05 level?

Reflection:
What have you learned from this topic?
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________


Answer key:
EXERCISE 1
1. TRUE
2. FALSE
3. TRUE
4. FALSE
5. FALSE

EXERCISE 2
1. Z-test
2. T-test
3. T-test
4. Z-test
5. Z-test

EXERCISE 3
1. T-test, since the population standard deviation 𝜎 is unknown and 𝑛 < 30.
2. Z-test, since the population standard deviation 𝜎 is known.
3. Z-test, since the population standard deviation 𝜎 is known.
4. T-test, since the population standard deviation 𝜎 is unknown and 𝑛 < 30.

5. Z-test; although the population standard deviation 𝜎 is unknown, the sample is large
enough for the CLT to hold.

Prepared by:
ANGELICA M. BATTUNG

STATISTICS AND PROBABILITY 11

Name of Learner:___________________________________ Grade Level:________


Section:_________________________________________ ___ Score:______________

LEARNING ACTIVITY SHEET


Determining the Rejection Region
Background Information for Learners
Hypothesis testing involves a process of decision-making in which there is always a possibility
of committing an error, either by rejecting a true hypothesis or by accepting a false one. Thus, the type of test,
the level of significance, the critical or rejection region, and the critical values must be defined first.

In the previous activity sheets, you have learned how to compute the confidence interval
for a population mean focusing on three different cases. Now, you will determine the appropriate
rejection regions based on the critical value for a given level of significance for the same cases.

Case   Description                                               Test Statistic
1      A test concerning the mean of a normal population         z-test
       with a known variance
2      A large-sample test concerning the mean of a normal       z-test
       population (using the central limit theorem)
3      A small-sample test concerning the mean of a              t-test
       population with unknown variance

The z-test is used to test a claim about the population mean when the population variance (σ²) is known,
or even when it is unknown, provided that the sample size is large enough for the Central Limit
Theorem (CLT) to apply, i.e., n ≥ 30.

Recall that the critical values are the z-values in the z distribution table associated with the
probabilities at the tails of the normal curves.

Critical Values of z
                 Level of Significance
Type of Test     α = 0.01    α = 0.05    α = 0.10
One-Tailed       ±2.326      ±1.645      ±1.282
Two-Tailed       ±2.575      ±1.960      ±1.645
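
As a cross-check on the table, the critical values of z can be computed with Python's standard library. This is a minimal sketch; the helper name `z_critical` is an assumption introduced here. Note that the computed two-tailed value at α = 0.01 rounds to 2.576, while the table above lists the commonly tabulated 2.575.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=False):
    """Return the positive critical z value for a given significance level."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf gives the z value with the stated area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.01, 0.05, 0.10):
    one = round(z_critical(alpha), 3)
    two = round(z_critical(alpha, two_tailed=True), 3)
    print(f"alpha = {alpha}: one-tailed ±{one}, two-tailed ±{two}")
```

For a left-tailed test the critical value is the negative of the one-tailed value; for a right-tailed test it is the positive value.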

Rejection Regions for the Type of Tests
One-Tailed Test (Left Tail)    Two-Tailed Test      One-Tailed Test (Right Tail)
𝐻𝑜 : 𝜇𝑥 = 𝜇0                   𝐻𝑜 : 𝜇𝑥 = 𝜇0         𝐻𝑜 : 𝜇𝑥 = 𝜇0
𝐻1 : 𝜇𝑥 < 𝜇0                   𝐻1 : 𝜇𝑥 ≠ 𝜇0         𝐻1 : 𝜇𝑥 > 𝜇0

https://www.sciencedirect.com/topics/mathematics/rejection-region

In the critical value approach, the computed statistic is compared to the critical value of
the test statistic. When the absolute value of the computed statistic is greater than the absolute
critical value, the decision is to reject 𝐻𝑜 .

Example 1.
A new food supplement is claimed by its manufacturer to increase the weight of women
by 1.5 kilograms per month with a standard deviation of 0.65 kg. Thirty-five women chosen at random
have reported an average weight gain of 1.65 kilograms within a month. Do these data support
the claim of the manufacturer at 0.05 level of significance?
Solution.
a. 𝐻𝑜 : μ = 1.5
𝐻1 : μ ≠ 1.5
b. Type of test: two-tailed test
Test Statistic: z-test
Level of significance: α = 0.05
Critical values: ±1.960
c. Given: 𝑋̅ = 1.65, μ = 1.5, n = 35, σ = 0.65
$$z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} = \frac{(1.65 - 1.5)\sqrt{35}}{0.65} = 1.365$$

The test value or computed value is z = 1.365.
[Figure: normal curve with rejection regions below -1.960 and above 1.960]

This means that the null hypothesis will be rejected when 𝑧𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑑 ≥ 1.960 or when
𝑧𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑑 ≤ -1.960.

d. Since -1.960 < 𝑧𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑑 = 1.365 < 1.960, or |1.365| < |±1.960|, the computed value falls within the
acceptance region. Therefore, the null hypothesis is accepted.

e. There is no significant difference between the sample mean and the population mean. Thus,
the manufacturer is correct in claiming that the new food supplement can increase the
weight of women by 1.5 kg per month.
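
The computation in Example 1 can be reproduced with a few lines of Python. This is a sketch under the assumptions of the example (two-tailed z-test, critical value 1.960); the function names `z_statistic` and `two_tailed_decision` are introduced here for illustration only.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Compute z = (sample mean - hypothesized mean) * sqrt(n) / sigma."""
    return (sample_mean - mu0) * sqrt(n) / sigma


def two_tailed_decision(z, critical):
    """Reject H0 when z falls in either tail beyond the critical value."""
    return "reject H0" if abs(z) >= critical else "do not reject H0"


# Values from Example 1: sample mean 1.65, mu = 1.5, sigma = 0.65, n = 35.
z = z_statistic(1.65, 1.5, 0.65, 35)
print(round(z, 3), two_tailed_decision(z, 1.960))
```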

When the population variance (σ²) is unknown and the sample size is small, i.e., n < 30,
the t-test is the appropriate test statistic. The t-distribution is then used to find the
critical values. Different sample sizes have different t-distributions, determined by the degrees of
freedom (df). The degrees of freedom are 1 less than the sample size, that is, df = n - 1.
t distribution: Critical t values

Finding the critical value of t using the table involves finding the intersection of the degrees
of freedom and the α value. For example, take α = 0.05 and a sample size of 25, so that df = 24.
one-tailed test: 𝑡0.05,24 = -1.711 (left tail) or 𝑡0.05,24 = 1.711 (right tail)
two-tailed test: 𝑡0.025,24 = -2.064 and 𝑡0.025,24 = 2.064

Example 2.
A sample of 8 measurements, randomly selected from an approximately normally
distributed population, resulted in the summary statistics: 𝑋̅=5.4, s= 1.3. Test the null hypothesis
that the mean of the population is 6 against the alternative hypothesis μ<6. Use α=0.05

Solution.

a. 𝐻𝑜 : μ=6
𝐻𝑎 : μ<6
b. Type of test: one-tailed test (left-tailed)
Test Statistic: t -test
Level of significance: α=0.05
df= n-1; n=8
df= 7
Critical value: -1.895

c. Given: 𝑋̅ = 5.4, μ=6, n=8, s= 1.3

t = (𝑋̅ − μ)√n / s = (5.4 − 6)√8 / 1.3 = -1.305

The test value or computed value is t = -1.305.

[Figure: t curve with the rejection region to the left of the critical value -1.895]

d. Since 𝑡𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑑 = -1.305 > -1.895, or |−1.305| < |−1.895|, the computed value falls within the
acceptance region. Therefore, the null hypothesis is accepted.
e. The sample does not provide enough evidence to reject the null hypothesis. Thus, there is
no significant difference between the means.
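
The comparison of a computed statistic with a critical value can be sketched as follows. The helper names `t_statistic` and `in_rejection_region` are hypothetical, and the critical value 1.895 is assumed to be read from a t table at df = 7 for a one-tailed α = 0.05; the script does not compute it.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Return t = (sample mean - hypothesized mean) * sqrt(n) / s, with df = n - 1."""
    return (sample_mean - mu0) * sqrt(n) / s


def in_rejection_region(stat, critical, tail):
    """Compare a computed statistic with the positive table value `critical`."""
    if tail == "left":
        return stat <= -critical
    if tail == "right":
        return stat >= critical
    if tail == "two":
        return abs(stat) >= critical
    raise ValueError("tail must be 'left', 'right', or 'two'")


t = t_statistic(5.4, 6, 1.3, 8)

# 1.895 is an assumed table lookup: df = 7, one-tailed alpha = 0.05.
reject = in_rejection_region(t, 1.895, "left")

print(round(t, 3), reject)
```

The same helper can be reused for right-tailed and two-tailed tests by changing the `tail` argument.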

Learning Competency
The learner identifies the appropriate rejection region for a given level of significance
when: (a) the population variance is assumed to be known; (b) the population variance is assumed
to be unknown; and (c) the Central Limit Theorem is to be used. M11/12SP-IVc-1

Exercise 1.
Determine the appropriate rejection regions for the following given the type of test, sample size
and the significance level. [10 points]
1. Z-test, Right-tailed with n=89 at α=0.05
2. Z-test, Two-tailed with n=45 and α= 0.01
3. Z-test, Left-tailed with n=65 at α=0.10
4. Z-test, Two-tailed with n=55 at α=0.05
5. Z-test, Right-tailed with n=87 at α=0.10
6. T-test, Left-tailed with n= 15 at α=0.01
7. T-test, Two-tailed with n=22 at α=0.05
8. T-test, Right-tailed with n=15 at α=0.05
9. T-test, Two-tailed with n=10 at α=0.10
10. T-test, Right-tailed with n= 29 at α=0.05

Exercise 2.
Decide whether the null hypothesis is to be rejected or accepted, given the test value and the critical
value of test statistic. Draw the rejection region. [20 points]

Hypotheses                      Rejection Region          Decision

1. 𝐻𝑜 : 𝜇 = 50
   𝐻𝑎 : 𝜇 ≠ 50
   Critical Value: ±2.093
   Computed t value: 1.89

2. 𝐻𝑜 : 𝜇 = 110
   𝐻𝑎 : 𝜇 > 110
   Critical Value: 2.326
   Computed z value: 2.350

3. 𝐻𝑜 : 𝜇 = 75
   𝐻𝑎 : 𝜇 ≠ 75
   Critical Value: ±1.960
   Computed z value: -1.85

4. 𝐻𝑜 : 𝜇 = 2.8
   𝐻𝑎 : 𝜇 < 2.8
   Critical Value: -2.467
   Computed t value: 1.04

5. 𝐻𝑜 : 𝜇 = 43
   𝐻𝑎 : 𝜇 < 43
   Critical Value: -1.282
   Computed z value: 0.815

Exercise 3.
Solve the following problems.
1. A mathematics teacher wants to study if the modular approach of learning affects the
performance of the students in an examination. From the previous examination, it was noted
that the population mean for an examination is 44 with a standard deviation of 4. The teacher
then applied the modular approach of learning to a sample of 20 students and after the
examination, a sample mean of 46 is calculated. Can the teacher claim that the modular
approach is effective in improving the performance of the students in an examination? Use
α=0.05.

2. In a certain barangay, a researcher wishes to determine whether the average expense of the
families is P 10,000 a month. Using a sample of 15 families, he found a mean expense of
P 8,500 with standard deviation of P1,500. At α=0.01, can the researcher conclude that the
average monthly expense of the families is P10,000?

3. A recent survey stated that adults spend an average of 8 hours a day playing mobile games. A
random sample of 50 adults is selected from a normally distributed population of adults and



noted an average of 6 hours playing mobile games a day with a standard deviation of 3 hours.
Using the 0.05 level of significance, would you conclude that the statement given in the survey
is correct?

Reflection
What have you learned?
________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________

Reference for Learners


Canlapan, R. B., et al. (2016). Statistics and Probability. DIWA Learning System Inc. Makati
City, Philippines.
Belecina, R. R., et al. (2016). Statistics and Probability. First Edition. Rex Book Store, Inc.
Sampaloc, Manila.
Ocampo Jr., J. M., et al. (2016). Statistics and Probability. Brilliant Creations Publishing, Inc.
Novaliches, Quezon City.



Answer Key

Exercise 1.

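1. Reject Ho if z ≥ 1.645 (right-tailed)
2. Reject Ho if z ≤ -2.575 or z ≥ 2.575 (two-tailed)
3. Reject Ho if z ≤ -1.282 (left-tailed)
4. Reject Ho if z ≤ -1.960 or z ≥ 1.960 (two-tailed)
5. Reject Ho if z ≥ 1.282 (right-tailed)
6. Reject Ho if t ≤ -2.624 (left-tailed, df = 14)
7. Reject Ho if t ≤ -2.080 or t ≥ 2.080 (two-tailed, df = 21)
8. Reject Ho if t ≥ 1.761 (right-tailed, df = 14)
9. Reject Ho if t ≤ -1.833 or t ≥ 1.833 (two-tailed, df = 9)
10. Reject Ho if t ≥ 1.701 (right-tailed, df = 28)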
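
The z critical values above can be reproduced, up to rounding, with the standard library. This is a sketch only; `z_critical` is a hypothetical helper, and the t critical values still have to be read from a t table because the standard library has no t-distribution. Note that the sample size does not affect a z critical value.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the z critical value(s) for a given alpha and tail type."""
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        return (std.inv_cdf(1 - alpha),)
    if tail == "left":
        return (std.inv_cdf(alpha),)
    if tail == "two":
        c = std.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for alpha, tail in [(0.05, "right"), (0.01, "two"), (0.10, "left"),
                    (0.05, "two"), (0.10, "right")]:
    print(alpha, tail, [round(c, 3) for c in z_critical(alpha, tail)])
```

For α = 0.01 two-tailed, the computed value rounds to 2.576, while many printed tables list 2.575; the difference comes from table interpolation.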

Exercise 2.
Hypotheses                      Rejection Region               Decision

1. 𝐻𝑜 : 𝜇 = 50                  t ≤ -2.093 or t ≥ 2.093        Accept Ho
   𝐻𝑎 : 𝜇 ≠ 50
   Critical Value: ±2.093
   Computed t value: 1.89

2. 𝐻𝑜 : 𝜇 = 110                 z ≥ 2.326                      Reject Ho
   𝐻𝑎 : 𝜇 > 110
   Critical Value: 2.326
   Computed z value: 2.350

3. 𝐻𝑜 : 𝜇 = 75                  z ≤ -1.960 or z ≥ 1.960        Accept Ho
   𝐻𝑎 : 𝜇 ≠ 75
   Critical Value: ±1.960
   Computed z value: -1.85

4. 𝐻𝑜 : 𝜇 = 2.8                 t ≤ -2.467                     Accept Ho
   𝐻𝑎 : 𝜇 < 2.8
   Critical Value: -2.467
   Computed t value: 1.04

5. 𝐻𝑜 : 𝜇 = 43                  z ≤ -1.282                     Accept Ho
   𝐻𝑎 : 𝜇 < 43
   Critical Value: -1.282
   Computed z value: 0.815



Exercise 3.
1. a. 𝐻𝑜 : μ=44
𝐻𝑎 : μ≠44
b. Type of test: two-tailed test
Test Statistic: z -test
Level of significance: α=0.05
Critical values: ±1.960
c. Given: 𝑋̅ = 46, μ=44, n=20, σ= 4

z = (𝑋̅ − μ)√n / σ = (46 − 44)√20 / 4 = 2.236

[Figure: normal curve with critical values -1.960 and 1.960]

The test value or z computed value is 2.236.
d. Since |2.236| > 1.960, the computed value falls within the rejection region. Therefore, the
null hypothesis is rejected.

e. There is a significant difference between the sample mean and the population mean.
Thus, the mathematics teacher is correct in claiming that the modular approach of
learning is effective in improving the performance of students in an examination.

2. a. 𝐻𝑜 : μ=10,000
𝐻𝑎 : μ≠10,000
b. Type of test: two-tailed test
Test Statistic: t -test
Level of significance: α=0.01
df: 14
Critical value: ±2.977
c. Given: 𝑋̅ = 8,500, μ=10,000, n=15, s= 1,500

t = (𝑋̅ − μ)√n / s = (8,500 − 10,000)√15 / 1,500 = -3.873

[Figure: t curve with critical values -2.977 and 2.977]

The test value or t computed value is -3.873.
d. Since -3.873 < -2.977, the computed value falls within the rejection region. Therefore, the null
hypothesis is rejected.

e. There is a significant difference between the sample mean and the population mean.
Thus, the average monthly expense of the families is not P10,000.

3. a. 𝐻𝑜 : μ=8
𝐻𝑎 : μ ≠ 8
b. Type of test: two-tailed test
Test Statistic: z -test
Level of significance: α=0.05
Critical value: ±1.960
c. Given: 𝑋̅ = 6, μ=8, σ=3, n=50

z = (𝑋̅ − μ)√n / σ = (6 − 8)√50 / 3 = -4.714

[Figure: normal curve with critical values -1.960 and 1.960]

The test value or z computed value is -4.714.
d. Since -4.714 < -1.960, the computed value falls within the rejection region. Therefore, the null
hypothesis is rejected.

e. There is a significant difference between the sample mean and the population mean.
Thus, the statement given by the survey on the mean number of hours adults play
mobile games is incorrect.

Prepared by:

CAYSELYN GUITERING-MANSIBANG
Alfreda Albano National High School-Magassi



STATISTICS AND PROBABILITY
Name of Learner: Grade Level:
Section: Date:

LEARNING ACTIVITY SHEET


COMPUTING FOR THE TEST STATISTIC VALUE (POPULATION MEAN)

Background Information for Learners


Calculation of the test statistic is an essential process that we must undertake in
hypothesis testing. The test statistic value measures how far the sample result lies from the
value stated in the null hypothesis; thus, it serves as the index that is compared with the
critical value in order to come up with a decision.

For this activity sheet, we will be focusing on the one-population test, or significance
test for a single mean. A one-population test is used on one sample that came from a
population with a given mean µ. But before performing the test, we must make sure that:
1. the sample size is large (n≥30) so that we can apply the Central Limit Theorem
(CLT),
2. when the population standard deviation is not given, the sample standard
deviation s may be used as an estimate of the population standard deviation.
With the given premise, we will be considering 2 cases when we compute the test statistic
value for a one-population mean: the first is when the population standard deviation is
given, and the second is when it is not. The two cases will be tackled separately. It is also
worth noting that the test statistic that will be used is the z-test.

Case 1. The population mean µ and the population standard deviation σ are given:

Equation 1
Test Statistic: z = (X̅ − µ) / σx̅
Where: σx̅ = σ / √n

Example 1: Compute for the test statistic value z:

Given: n = 100, X̅ = 92, µ = 90, and σ = 7


Find the value of z

Solution: Since the population mean µ and the population standard deviation σ are given, we
will make use of Equation 1 and find the value of z by following these
steps.
Note: Practice Personal Hygiene Protocols at All Times
Step 1. Let us write our working formula:
z = (X̅ − µ) / σx̅
σx̅ = σ / √n

Step 2. Substitute the given values into our working formula. Observe that we must solve
for the value of σx̅ before we can find the value of z.
σx̅ = σ / √n = 7 / √100
σx̅ = 0.7

z = (X̅ − µ) / σx̅ = (92 − 90) / 0.7
z = 2.857
The computed test statistic value is z = 2.857.

Case 2. The population mean µ is given and the population standard deviation σ is unknown:

Since the population standard deviation σ is not known, the sample standard deviation
s will be used as an estimate of it; thus Equation 1 becomes:

Equation 2
Test Statistic: z = (X̅ − µ) / σx̅
Where: σx̅ = s / √n

Example 2: Compute for the test statistic value z:

Given: n = 90, X̅ = 60 , µ = 57, and s = 5


Find the value of z

Solution: Since the population mean µ and the sample standard deviation s are given, we will
make use of Equation 2 and find the value of z by following these steps.

Step 1. Let us write our working formula:
z = (X̅ − µ) / σx̅
σx̅ = s / √n

Step 2. Substitute the given values into our working formula. Observe that we must solve
for the value of σx̅ before we can find the value of z.
σx̅ = s / √n = 5 / √90
σx̅ = 0.527

z = (X̅ − µ) / σx̅ = (60 − 57) / 0.527
z = 5.693

The computed test statistic value is z = 5.693.

Example 3. In a certain study conducted at Barucboc National High School, it was found that
the average weight of grade 11 students is 48 Kg with a standard deviation of 4 Kg. To
validate the result of the said study, a sample of 55 students was randomly selected, and
it was found that the average weight of the sample is 50 Kg with a standard deviation
of 3 Kg. Calculate the test statistic value for the weight of grade 11 students.

Solution:

Step 1. Since the problem did not specify the given values, we must write them down to
simplify and avoid confusion in our problem solving.
Given: n = 55, X̅ = 50, µ = 48, σ = 4, and s = 3

Step 2. Since the population standard deviation σ is given, we will use Equation 1. Let
us write our working formula:
z = (X̅ − µ) / σx̅
σx̅ = σ / √n

Step 3. Substitute the given values into our working formula. Observe that we
must solve for the value of σx̅ before we can find the value of z.
σx̅ = σ / √n = 4 / √55
σx̅ = 0.539

z = (X̅ − µ) / σx̅ = (50 − 48) / 0.539
z = 3.711
The computed test statistic value is z = 3.711.

Example 4. A locally produced bottled water brand claims that every bottle it produces contains
330 mL of water. Grade 11 students of Tumauini National High School wanted to
test the claim and gathered a sample of 120 bottles to be measured. The students found
that the average volume of each bottle is 322 mL with a standard deviation of
15 mL. Calculate the test statistic value.

Solution:

Step 1. Since the problem did not specify the given values, we must write them down to
simplify and avoid confusion in our problem solving.
Given: n = 120, X̅ = 322, µ = 330, and s = 15

Step 2. Since the population standard deviation σ is not given, we will use Equation 2.
Let us write our working formula:
z = (X̅ − µ) / σx̅
σx̅ = s / √n

Step 3. Substitute the given values into our working formula. Observe that we
must solve for the value of σx̅ before we can find the value of z.
σx̅ = s / √n = 15 / √120
σx̅ = 1.369

z = (X̅ − µ) / σx̅ = (322 − 330) / 1.369
z = -5.844
The computed test statistic value is z = -5.844.
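
The two-step computation used in Examples 1 to 4 can be sketched in a few lines. The function names `standard_error` and `z_value` are made up for this illustration. Because the script keeps full precision for σx̅ instead of rounding it to three decimals first, the third decimal place of z may differ slightly from the hand computations.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n), where sd is sigma or s."""
    return sd / sqrt(n)


def z_value(x_bar, mu, sd, n):
    """Test statistic z = (x_bar - mu) / standard error."""
    return (x_bar - mu) / standard_error(sd, n)


# (sample mean, population mean, standard deviation used, sample size)
examples = {
    "Example 1": (92, 90, 7, 100),    # sigma given (Equation 1)
    "Example 2": (60, 57, 5, 90),     # s used in place of sigma (Equation 2)
    "Example 3": (50, 48, 4, 55),     # sigma given, so s = 3 is not used
    "Example 4": (322, 330, 15, 120), # s used in place of sigma
}

results = {name: round(z_value(*args), 3) for name, args in examples.items()}
for name, z in results.items():
    print(name, z)
```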

LEARNING COMPETENCY
Computes for the Test-Statistic Value (Population Mean). (Quarter 4, Week 4,
M11/12SP-IVd-1)
EXERCISE 1
Directions: Solve for the statistical value z for each of the following (2 points each)

1. X̅ = 18, σ = 2, µ = 16, n = 58
2. X̅ = 27.4, σ = 4.8, µ = 28.1, n = 127

3. X̅ = 889, σ = 14.4, µ =904 , n = 145


4. X̅ = 13.07, s = 1.2, µ = 12.95, n = 45

5. X̅ = 1505, s = 55, µ = 1513, n = 220

EXERCISE 2
Directions: Determine the given in each problem and solve for the statistical value z. (4 points
each)

1. A study found that most teens sleep for about 7.25 hours each day
(NationwideChildrens.org). To verify this, a survey was conducted with a total of 87
participants aged 16-18 years old. The survey found an average
of 6.8 hours with a standard deviation of 0.5 hours.

2. The average birth weight of naturally born Filipinos is 3000 grams with a standard
deviation of 200 grams. A survey of 200 newborn babies resulted with an average of
2750 grams with a sample standard deviation of 300 grams.



EXERCISE 3

Directions: Solve for the test statistic z for the given problem.
The canteen manager claims that the average weight of a platter of spaghetti that they
serve is 350 grams. A student wanted to verify this claim and gathered a total of 30
samples with the following results (in grams):

350 345 360 350 345 350
345 360 340 355 355 360
337 350 355 360 340 345
345 340 340 350 348 355
350 345 345 340 355 345

a. If the population standard deviation is assumed to be 5 grams, calculate the test
statistic value z. (5 points)

b. If the population standard deviation is not given, compute the sample
standard deviation and find the test statistic value. (10 points)

REFERENCES:
Belecina, Rene R., Baccay, Elisa S., Mateo, Efren B. (2016). Statistics and Probability (First
Edition). Rex Bookstore

Nationwide Children's. Sleep in Adolescents. Retrieved from
https://www.nationwidechildrens.org/specialties/sleep-disorder-center/sleep-in-adolescents#:~:text=Sleep%20in%20Adolescents-,What%20to%20expect,9%20%C2%BC%20hours%20of%20sleep).

REFLECTION :
Briefly discuss the key points you have learned from this topic
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________



ANSWER KEY

Exercise 1

1. σx̅ = σ / √n = 2 / √58 = 0.263
   z = (X̅ − µ) / σx̅ = (18 − 16) / 0.263 = 7.605

2. σx̅ = σ / √n = 4.8 / √127 = 0.426
   z = (X̅ − µ) / σx̅ = (27.4 − 28.1) / 0.426 = -1.643

3. σx̅ = σ / √n = 14.4 / √145 = 1.196
   z = (X̅ − µ) / σx̅ = (889 − 904) / 1.196 = -12.542

4. σx̅ = s / √n = 1.2 / √45 = 0.179
   z = (X̅ − µ) / σx̅ = (13.07 − 12.95) / 0.179 = 0.670

5. σx̅ = s / √n = 55 / √220 = 3.708
   z = (X̅ − µ) / σx̅ = (1505 − 1513) / 3.708 = -2.157

Exercise 2

1. Given: n = 87, X̅ = 6.8, µ = 7.25, and s = 0.5

   σx̅ = s / √n = 0.5 / √87 = 0.054
   z = (X̅ − µ) / σx̅ = (6.8 − 7.25) / 0.054 = -8.333

2. Given: n = 200, X̅ = 2750, µ = 3000, σ = 200, and s = 300

   σx̅ = σ / √n = 200 / √200 = 14.142
   z = (X̅ − µ) / σx̅ = (2750 − 3000) / 14.142 = -17.678


Exercise 3

a. Given: n = 30, X̅ = ?, µ = 350, and σ = 5

Since the sample mean X̅ is not given, we have to solve for it by getting the average of the
given sample.

X       f      X*f
337     1      337
340     5      1700
345     8      2760
348     1      348
350     6      2100
355     5      1775
360     4      1440
               Σx*f = 10460

From the table, we can solve for X̅:
X̅ = Σx*f / n = 10460 / 30 = 348.667

Finally:
σx̅ = σ / √n = 5 / √30 = 0.913
z = (X̅ − µ) / σx̅ = (348.667 − 350) / 0.913 = -1.46

b. The population standard deviation is not given, so we have to compute the sample
standard deviation.

X       f/30     X*(f/30)     X²*(f/30)
337     1/30     11.233       3785.633
340     5/30     56.667       19266.667
345     8/30     92.000       31740.000
348     1/30     11.600       4036.800
350     6/30     70.000       24500.000
355     5/30     59.167       21004.167
360     4/30     48.000       17280.000
                 X̅ = 348.667  Σ X²*(f/30) = 121613.26

s = √(Σ X²*(f/30) − X̅²)
s = √(121613.26 − 348.667²)
s = √(121613.26 − 121568.68)
s = √44.583
s = 6.677

σx̅ = s / √n = 6.677 / √30 = 1.219
z = (X̅ − µ) / σx̅ = (348.667 − 350) / 1.219 = -1.094
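
The mean and standard deviation for the spaghetti data can be checked with Python's `statistics` module. This is a sketch under two stated assumptions: the variable names are made up, and both standard deviation formulas are shown because the hand computation above divides by n, while the usual sample standard deviation divides by n − 1. At full precision the divide-by-n value is about 6.695; the hand result of 6.677 differs because the mean was rounded before squaring.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Frequency table from the answer key: weight in grams -> frequency
freq = {337: 1, 340: 5, 345: 8, 348: 1, 350: 6, 355: 5, 360: 4}

# Expand the table back into the 30 individual observations.
data = [x for x, f in freq.items() for _ in range(f)]

n = len(data)
x_bar = mean(data)
sd_div_n = pstdev(data)           # divides by n, as in the hand computation
sd_div_n_minus_1 = stdev(data)    # divides by n - 1 (usual sample formula)

# Part a: population standard deviation assumed to be 5 grams.
z_known_sigma = (x_bar - 350) / (5 / sqrt(n))

# Part b: sample standard deviation used in place of sigma.
z_with_s = (x_bar - 350) / (sd_div_n_minus_1 / sqrt(n))

print(n, round(x_bar, 3), round(sd_div_n, 3), round(sd_div_n_minus_1, 3))
print(round(z_known_sigma, 3), round(z_with_s, 3))
```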

PREPARED BY:

ENGR. RONALD L. MORALES



STATISTICS AND PROBABILITY
Name of Learner: Grade Level:
Section: Date:

LEARNING ACTIVITY SHEET

Draws Conclusion About the Population Mean Based on the Test-Statistic Value
and the Rejection Region.

Background Information for Learners


Critical values serve as boundaries that delineate one region from another. They serve as markers
that make it easy for observers to tell whether a boundary has been crossed. As an analogy,
when one is travelling from one town to another, municipal boundary markers inform
motorists that they have passed from one area of jurisdiction into another. Thus, we can
say that the rejection region is the area beyond the critical values. The test
statistic value indicates whether the sample result has crossed the critical value or
has stayed within the acceptance region. Comparing the two helps the researcher decide
with confidence whether to accept or reject the null hypothesis.

[Figure 1 – Normal Distribution Curve, with the critical value and the rejection area labeled]

Figure 1 shows the normal distribution curve highlighting the critical value and the
rejection areas under the normal curve.

Example 1. In a certain study conducted at Barucboc National High School, it was found that
the average weight of grade 11 students is 48 Kg with a standard deviation of 4 Kg.
To validate the result of the said study, a sample of 55 students was randomly
selected, and it was found that the average weight of the sample is 50 Kg. Use a 95%
confidence level (α = 0.05).


STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean weight µ of the population of grade 11 students.

2. Formulate the hypotheses:
   H0 : µ = 48
   Ha : µ ≠ 48

3. Test statistic to be used:
   Since n = 55 (n ≥ 30), the CLT applies, so the sampling distribution of the sample mean
   is approximately normal.
   Use the z-test.

4. Determine critical values and establish rejection regions:
   The test is two-tailed.
   z critical values: ±1.96
   [Figure: normal curve with rejection regions below -1.96 and above 1.96]

5. Calculate the test statistic value:
   σx̅ = σ / √n = 4 / √55 = 0.539
   z = (X̅ − µ) / σx̅ = (50 − 48) / 0.539 = 3.71

6. State the decision rule:
   Accept H0 if -1.96 < z < 1.96; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   3.71 > 1.96
   Since the test statistic value is greater than the critical value, the null hypothesis H0 is
   rejected. We can conclude that there is a significant difference between the sample mean
   and the population mean.

[Figure 1.1: normal curve with the computed value 3.71 marked in the right rejection region]

As can be seen in Figure 1.1, the computed test statistic value lies in the rejection region,
which is why the null hypothesis H0 is rejected.


Example 2. A locally produced bottled water brand claims that every bottle it produces contains
330 mL of water. Grade 11 students of Tumauini National High School wanted to
test the claim and gathered a sample of 120 bottles to be measured. The students
found that the average volume of each bottle is 327 mL with a standard
deviation of 22 mL. Calculate the test statistic value and find out if the
manufacturer’s claim is correct using a 95% confidence level (α = 0.05).

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean volume µ of water per bottle.

2. Formulate the hypotheses:
   H0 : µ = 330
   Ha : µ ≠ 330

3. Test statistic to be used:
   Since n = 120 (n ≥ 30), the CLT applies, so the sampling distribution of the sample mean
   is approximately normal.
   Use the z-test.

4. Determine critical values and establish rejection regions:
   The test is two-tailed.
   z critical values: ±1.96
   [Figure: normal curve with rejection regions below -1.96 and above 1.96]

5. Calculate the test statistic value:
   σx̅ = s / √n = 22 / √120 = 2.008
   z = (X̅ − µ) / σx̅ = (327 − 330) / 2.008 = -1.494

6. State the decision rule:
   Accept H0 if -1.96 < z < 1.96; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   -1.494 > -1.96
   Since the test statistic value is within the acceptance region, the null hypothesis H0
   is accepted. We can conclude that there is no significant difference between the sample
   mean and the population mean.

[Figure 1.2: normal curve with the computed value -1.494 marked inside the acceptance region]

In Figure 1.2, the computed test statistic value (-1.494) is greater than the lower critical value
(-1.96) and lies in the acceptance region, so the null hypothesis H0 is accepted.
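
Steps 4 to 7 of the procedure can be sketched as one small function. The name `z_test` and its return format are assumptions made for this illustration, and the critical value comes from the standard library's `NormalDist` instead of a printed table. The two calls at the end use the weight example (sample mean 50, µ = 48, σ = 4, n = 55) and the bottled water example (sample mean 327, µ = 330, s = 22, n = 120).

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sd, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject?) for a one-sample z-test."""
    z = (x_bar - mu0) / (sd / sqrt(n))
    std = NormalDist()
    if tail == "two":
        crit = std.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= crit
    elif tail == "right":
        crit = std.inv_cdf(1 - alpha)
        reject = z >= crit
    elif tail == "left":
        crit = std.inv_cdf(alpha)
        reject = z <= crit
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, crit, reject


weights = z_test(50, 48, 4, 55)      # grade 11 weights, sigma known
bottles = z_test(327, 330, 22, 120)  # bottled water, s used for sigma

for label, (z, crit, reject) in [("weights", weights), ("bottles", bottles)]:
    decision = "reject H0" if reject else "accept H0"
    print(label, round(z, 3), round(crit, 3), decision)
```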

Example 3. A locally produced bottled water brand claims that every bottle it produces contains
330 mL of water. Grade 11 students of Tumauini National High School wanted to
test the claim and gathered a sample of 20 bottles to be measured. The students found
that the average volume of each bottle is 327 mL with a standard deviation of
22 mL. Calculate the test statistic value and find out if the manufacturer’s claim is
correct using a 95% confidence level (α = 0.05).
Solution:
Note that Example 3 is the same problem as Example 2, but instead of a sample
size of 120, the researchers used only 20 samples.
STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean volume µ of water per bottle.

2. Formulate the hypotheses:
   H0 : µ = 330
   Ha : µ ≠ 330

3. Test statistic to be used:
   Since n = 20, the CLT cannot be applied, but it is assumed that the samples came from a
   normally distributed population.
   Use the t-test.

4. Determine critical values and establish rejection regions:
   The test is two-tailed, with df = n − 1 = 19.
   t critical values: ±2.093
   [Figure: t curve with rejection regions below -2.093 and above 2.093]

5. Calculate the test statistic value:
   σx̅ = s / √n = 22 / √20 = 4.919
   t = (X̅ − µ) / σx̅ = (327 − 330) / 4.919 = -0.610

6. State the decision rule:
   Accept H0 if -2.093 < t < 2.093; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   -0.610 > -2.093
   Since the test statistic value is within the acceptance region, the null hypothesis H0
   is accepted. We can conclude that there is no significant difference between the sample
   mean and the population mean.
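
The effect of the sample size on the test statistic can be seen by computing it for both sample sizes of the bottled water problem. This is a sketch only: `t_value` is a hypothetical helper, and the critical value 2.093 is assumed to be read from a t table at df = 19 for a two-tailed α = 0.05.

```python
from math import sqrt


def t_value(x_bar, mu0, s, n):
    """Test statistic (x_bar - mu0) / (s / sqrt(n)); df = n - 1 for the t-test."""
    return (x_bar - mu0) / (s / sqrt(n))


# Same sample mean and standard deviation, two different sample sizes.
t_by_n = {n: t_value(327, 330, 22, n) for n in (120, 20)}

for n, t in t_by_n.items():
    print(n, round(22 / sqrt(n), 3), round(t, 3))

# Assumed table lookup: df = 19, two-tailed alpha = 0.05.
t_critical = 2.093
reject_small_sample = abs(t_by_n[20]) >= t_critical
print(reject_small_sample)
```

A smaller sample gives a larger standard error, so the same 3 mL difference produces a test statistic closer to zero.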

LEARNING COMPETENCY
Draws Conclusion About the Population Mean Based on the Test-Statistic Value
and the Rejection Region. (Quarter 4, Week 4, M11/12SP-IVd-2)

EXERCISE 1
Directions: Determine the critical value and solve for the test statistic value for each of the
following. (1 point for the critical value and 2 points for the test statistic value)

1. X̅ = 23, σ = 5, µ = 20, n = 99, 90% confidence level (α = 0.10), single tailed (right side)

2. X̅ = 102, σ = 10, µ = 99, n = 17, 99% confidence level (α = 0.01), two tailed

3. X̅ = 2075, σ = 40, µ = 2084, n = 64, 95% confidence level (α = 0.05), single tailed (left side)

4. X̅ = 69, s = 14, µ = 79, n = 22, 90% confidence level (α = 0.10), single tailed (right side)

5. X̅ = 136, s = 15, µ = 134, n = 100, 99% confidence level (α = 0.01), two tailed

EXERCISE 2
Directions: Study the given problem. Write the null and alternative hypothesis and draw a
conclusion based on the comparison of the computed statistical value and the critical
values. (5 points each)

1. A principal at a certain school claims that the students have high aptitude in
mathematics. She claims that the population average is above 96. To test the claim, 30
randomly selected students were given the exam and the result showed that the average
is 98 with a standard deviation of 3. With 90% level of confidence, check that the
sample supports the claim of the principal.



2. A principal of a certain school claims that the IQ of the students in her school is above
110. To test her claim, she administered an IQ test to 25 of her students. The average
from the randomly selected students is 114 with a standard deviation of 5. Based on
the result, did the result of the administered exam with the sample students support her
claim? Assume a 5% level of significance.

3. A recent survey result showed that teens spend at least 22 hours a week on their
cellphones with a standard deviation of 1.5 hours. Forty-five students of a certain school were
surveyed, and the results showed that they spend an average of 24 hours on their cellphones
each week. Verify at the 99% confidence level whether the sample supports the result of the survey.
EXERCISE 3

Directions: State whether to accept or reject the null hypothesis and draw a conclusion based on
the computed statistical value and the critical value using the given seven steps.
The canteen manager claims that there are at least 38 bilo-bilo balls in every bowl that
they sell. A survey was conducted, and the result is shown below.

35 38 43 40 35
37 35 37 42 39
40 40 39 36 36
38 37 39 41 40
37 36 38 37 36

a. Use 95% as confidence level. (10 points)

b. Instead of stating that at least 38, the manager changed her claim and said that there
is an average of 38 bilo-bilo balls in each bowl. Will the change in the statement
affect the problem? If so, prove by showing your solution. Use 95% confidence
level. (10 points)

REFERENCES:
Belecina, Rene R., Baccay, Elisa S., Mateo, Efren B. (2016). Statistics and Probability (First
Edition). Rex Bookstore

REFLECTION :
How can you apply the lesson in real life? Briefly discuss using your own experience.
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________



ANSWER KEY

Exercise 1

1. σx̅ = σ / √n = 5 / √99 = 0.503
   z = (X̅ − µ) / σx̅ = (23 − 20) / 0.503 = 5.964
   zcritical = 1.282

2. σx̅ = σ / √n = 10 / √17 = 2.425
   t = (X̅ − µ) / σx̅ = (102 − 99) / 2.425 = 1.237
   tcritical = ±2.921 (df = 16)

3. σx̅ = σ / √n = 40 / √64 = 5
   z = (X̅ − µ) / σx̅ = (2075 − 2084) / 5 = -1.8
   zcritical = -1.645

4. σx̅ = s / √n = 14 / √22 = 2.985
   t = (X̅ − µ) / σx̅ = (69 − 79) / 2.985 = -3.35
   tcritical = 1.323 (df = 21)

5. σx̅ = s / √n = 15 / √100 = 1.5
   z = (X̅ − µ) / σx̅ = (136 − 134) / 1.5 = 1.333
   zcritical = ±2.575

Exercise 2

1.

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean mathematics aptitude score µ of the students
   in the school.

2. Formulate the hypotheses:
   H0 : µ = 96
   Ha : µ > 96

3. Test statistic to be used:
   Since n = 30 (n ≥ 30), the CLT applies, so the sampling distribution of the sample mean
   is approximately normal.
   Use the z-test.

4. Determine critical values and establish rejection regions:
   The test is single-tailed, directed to the right.
   z critical value: 1.282
   [Figure: normal curve with the rejection region above 1.282]

5. Calculate the test statistic value:
   σx̅ = s / √n = 3 / √30 = 0.548
   z = (X̅ − µ) / σx̅ = (98 − 96) / 0.548 = 3.65

6. State the decision rule:
   Accept H0 if z < 1.282; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   3.65 > 1.282
   Since the test statistic value is greater than the critical value, the null hypothesis H0 is
   rejected. We can conclude that there is a significant difference between the sample mean
   and the population mean.

2.

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean IQ µ of the students in the school.

2. Formulate the hypotheses:
   H0 : µ = 110
   Ha : µ > 110

3. Test statistic to be used:
   Since n = 25, the CLT cannot be applied, but it is assumed that the samples came from a
   normally distributed population.
   Use the t-test.

4. Determine critical values and establish rejection regions:
   The test is single-tailed, directed to the right, with df = 24.
   t critical value: 1.711
   [Figure: t curve with the rejection region above 1.711]

5. Calculate the test statistic value:
   σx̅ = s / √n = 5 / √25 = 1
   t = (X̅ − µ) / σx̅ = (114 − 110) / 1 = 4

6. State the decision rule:
   Accept H0 if t < 1.711; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   4 > 1.711
   Since the test statistic value is greater than the critical value, the null hypothesis H0 is
   rejected. We can conclude that there is a significant difference between the sample mean
   and the population mean.

3.

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean number of hours µ that teens spend on their
   cellphones each week.

2. Formulate the hypotheses:
   H0 : µ = 22
   Ha : µ > 22

3. Test statistic to be used:
   Since n = 45 (n ≥ 30), the CLT applies, so the sampling distribution of the sample mean
   is approximately normal.
   Use the z-test.

4. Determine critical values and establish rejection regions:
   The test is single-tailed, directed to the right.
   z critical value: 2.33
   [Figure: normal curve with the rejection region above 2.33]

5. Calculate the test statistic value:
   σx̅ = σ / √n = 1.5 / √45 = 0.224
   z = (X̅ − µ) / σx̅ = (24 − 22) / 0.224 = 8.929

6. State the decision rule:
   Accept H0 if z < 2.33; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   8.929 > 2.33
   Since the test statistic value is greater than the critical value, the null hypothesis H0 is
   rejected. We can conclude that there is a significant difference between the sample mean
   and the population mean.

Exercise 3

a. The population standard deviation is not given, so we need to compute the sample mean X̅
and the sample standard deviation s.

X      f     X*(f/25)     X²*(f/25)
35     3     4.2          147
36     4     5.76         207.36
37     5     7.4          273.8
38     3     4.56         173.28
39     3     4.68         182.52
40     4     6.4          256
41     1     1.64         67.24
42     1     1.68         70.56
43     1     1.72         73.96
             X̅ = 38.04    Σ X²*(f/25) = 1451.72

s = √(Σ X²*(f/25) − X̅²)
s = √(1451.72 − 38.04²)
s = √(1451.72 − 1447.04)
s = √4.678
s = 2.163

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean number µ of bilo-bilo balls in each bowl.

2. Formulate the hypotheses:
   H0 : µ = 38
   Ha : µ > 38

3. Test statistic to be used:
   Since n = 25, the CLT cannot be applied, but it is assumed that the samples came from a
   normally distributed population.
   Use the t-test.

4. Determine critical values and establish rejection regions:
   The test is single-tailed, directed to the right, with df = 24.
   t critical value: 1.711
   [Figure: t curve with the rejection region above 1.711]

5. Calculate the test statistic value:
   σx̅ = s / √n = 2.163 / √25 = 0.4326
   t = (X̅ − µ) / σx̅ = (38.04 − 38) / 0.4326 = 0.092

6. State the decision rule:
   Accept H0 if t < 1.711; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   0.092 < 1.711
   Since the test statistic value is within the acceptance region, the null hypothesis H0 is
   accepted. We can conclude that there is no significant difference between the sample mean
   and the population mean.

b. Even if the statement was changed, the survey values remain the same, so X̅ and s
will also remain the same. Therefore, we will make use of the same values. The only
difference is that the test will become two-tailed.

STEP: SOLUTION/ANSWER

1. Describe the population parameter of interest:
   The parameter of interest is the mean number µ of bilo-bilo balls in each bowl.

2. Formulate the hypotheses:
   H0 : µ = 38
   Ha : µ ≠ 38

3. Test statistic to be used:
   Since n = 25, the CLT cannot be applied, but it is assumed that the samples came from a
   normally distributed population.
   Use the t-test.

4. Determine critical values and establish rejection regions:
   The test is two-tailed, with df = 24.
   t critical values: ±2.064
   [Figure: t curve with rejection regions below -2.064 and above 2.064]

5. Calculate the test statistic value:
   σx̅ = s / √n = 2.163 / √25 = 0.4326
   t = (X̅ − µ) / σx̅ = (38.04 − 38) / 0.4326 = 0.092

6. State the decision rule:
   Accept H0 if -2.064 < t < 2.064; otherwise, reject H0.

7. Compare the test statistic value and the critical value and draw a conclusion:
   0.092 < 2.064
   Since the test statistic value is within the acceptance region, the null hypothesis H0 is
   accepted. We can conclude that there is no significant difference between the sample mean
   and the population mean.

PREPARED BY:

ENGR. RONALD L. MORALES

STATISTICS AND PROBABILITY

Name of Learner: _____________________ Grade Level: __________________


Section: _____________________________ Date: ________________________

LEARNING ACTIVITY SHEET


SOLVING PROBLEMS INVOLVING TEST OF HYPOTHESIS ON THE POPULATION
MEAN

Background Information for Learners:

In the previous lessons, we learned the steps in testing the null hypothesis: we computed the test statistic value using the z-test or t-test and drew conclusions based on that value and the rejection region. In this lesson, we shall learn to solve problems involving tests of hypothesis on the population mean. Before going through this, let us recall the steps in hypothesis testing:
(1) Identify the claim and formulate the null (Ho) and the alternative (Ha) hypotheses.
(2) Set the level of significance, determine whether the test is one-tailed or two-tailed by looking at how the alternative hypothesis is expressed, and draw the rejection region.
(3) Determine the appropriate test statistic and calculate the test value.
(4) Decide whether to reject the null hypothesis. If the computed value falls in the rejection region, reject the null hypothesis; otherwise, do not reject (accept) the null hypothesis.
(5) Formulate the conclusion.

The z-test is a statistical test for the population mean. It is used when the population is normal, the population standard deviation σ is known, and the sample size n ≥ 30.
The formula is

$$z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$

Note: If the population standard deviation is not known, the z-test can still be used by replacing σ with s (the sample standard deviation), provided that n ≥ 30.
The t-test is another statistical test for the population mean. It is used when the population is normal, the population standard deviation σ is unknown, and the sample size n < 30. The formula is

$$t = \frac{(\bar{x} - \mu)\sqrt{n}}{s}$$

Where:
n = sample size
σ = population standard deviation
s = sample standard deviation
μ = population mean
x̄ = sample mean

df = n - 1 (degrees of freedom for the t-test)

Example 1:
A manufacturer claims that its new medicine reduces an overweight person's weight by 4.65 kg per month, with a standard deviation of 0.95 kg. Forty-five people were chosen to take the medicine for a month and reported losing an average of 4.05 kg. Do these data support the claim of the manufacturer at the 0.05 level of significance?
Solution:
Step 1  Ho: The average weight loss per month is equal to 4.65 kg. (μ = 4.65)
        Ha: The average weight loss per month is not equal to 4.65 kg. (μ ≠ 4.65)

Step 2  Two-tailed or nondirectional test
        α = 0.05
        Critical value: ±1.96
        Rejection region: z < -1.96 or z > +1.96

Step 3  z-test
        x̄ = 4.05 kg
        μ = 4.65 kg
        n = 45
        σ = 0.95 kg

        $$z = \frac{(4.05 - 4.65)\sqrt{45}}{0.95} = -4.24$$

Step 4  Since the computed value (z = -4.24) falls within the rejection region, we reject the null hypothesis.
Step 5  Conclusion: The average weight loss per month is not equal to 4.65 kg. Thus, the data do not support the manufacturer's claim that the new medicine reduces the weight of overweight people by 4.65 kg per month.
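The z-test computation in Example 1 can be reproduced in a few lines. The sketch below is illustrative only: the function name `z_statistic` is hypothetical, and the critical value is taken from Python's standard-library `statistics.NormalDist` rather than from a printed z-table.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, sigma, n):
    """Test value z = (sample mean - population mean) * sqrt(n) / sigma."""
    return (sample_mean - pop_mean) * sqrt(n) / sigma

alpha = 0.05
z = z_statistic(4.05, 4.65, 0.95, 45)

# Two-tailed test: alpha is split between the two tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject Ho when the test value lies beyond either critical value
reject = abs(z) > z_crit

print(round(z, 2), round(z_crit, 2), reject)
```

For a one-tailed test, the whole of α goes into one tail, so the critical value would come from `inv_cdf(1 - alpha)` (right tail) or `inv_cdf(alpha)` (left tail).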
Example 2:
A researcher believes that it costs more than 95,000 pesos to raise a child from birth to age one, with a standard deviation of 4,500 pesos. A random sample of 50 babies is selected to test if the claim is correct. The sample reveals mean expenses of 98,000 pesos. Based on the collected data, can it be concluded that the claim is correct at the 0.01 level of significance?

Solution:
Step 1  Ho: The average cost to raise a child from birth to age one is equal to 95,000 pesos. (μ = 95,000)
        Ha: The average cost to raise a child from birth to age one is greater than 95,000 pesos. (μ > 95,000)

Step 2  One-tailed or directional test (right-tailed)
        α = 0.01
        Critical value: +2.33
        Rejection region: z > +2.33

Step 3  z-test
        x̄ = 98,000 pesos
        μ = 95,000 pesos
        n = 50
        σ = 4,500 pesos

        $$z = \frac{(98000 - 95000)\sqrt{50}}{4500} = 4.71$$

Step 4  Since the computed value (z = 4.71) falls within the rejection region, we reject the null hypothesis.
Step 5  Conclusion: The average cost to raise a child from birth to age one is greater than 95,000 pesos. Thus, the researcher is correct in claiming that the average cost to raise a child from birth to age one is greater than 95,000 pesos.

Example 3:
A certain feeds manufacturer is verifying a complaint from tilapia breeders that feeds are being sold short-weight in a certain town. An agent of the manufacturer took a random sample of 20 sacks from the "25-kilo" sacks of feeds in a large shipment and found that the mean weight was 24.85 kg with a standard deviation of 0.32 kg. Is this evidence of short-weighing at the 0.01 level of significance?
Solution:
Step 1  Ho: The average weight of tilapia feeds is 25 kg. (μ = 25)
        Ha: The average weight of tilapia feeds is less than 25 kg. (μ < 25)

Step 2  One-tailed or directional test (left-tailed)
        α = 0.01
        df = 20 - 1 = 19
        Critical value: -2.539
        Rejection region: t < -2.539

Step 3  t-test
        x̄ = 24.85 kg
        μ = 25 kg
        n = 20
        s = 0.32 kg

        $$t = \frac{(24.85 - 25)\sqrt{20}}{0.32} = -2.10$$

Step 4  Since the computed value (t = -2.10) does not fall within the rejection region, we decide not to reject the null hypothesis.
Step 5  Conclusion: There is not enough evidence to reject the claim that the mean weight of tilapia feeds is 25 kg. Thus, the sample does not show short-weighing at the 0.01 level of significance.

Example 4:
A recent study showed that high school students received an average of 50 telephone calls per month. To test the claim, the Supreme Student Government president surveyed 29 students and found that the average number of calls was 47.6 with a standard deviation of 7. Is there a significant difference between the population mean and the sample mean at the 0.05 level of significance?

Solution:
Step 1  Ho: There is no significant difference between the population mean and the sample mean. (μ = 50)
        Ha: There is a significant difference between the population mean and the sample mean. (μ ≠ 50)

Step 2  Two-tailed or non-directional test
        α = 0.05
        df = 29 - 1 = 28
        Critical value: ±2.048
        Rejection region: t < -2.048 or t > +2.048

Step 3  t-test
        x̄ = 47.6
        μ = 50
        n = 29
        s = 7

        $$t = \frac{(47.6 - 50)\sqrt{29}}{7} = -1.85$$

Step 4  Since the computed value (t = -1.85) falls within the acceptance region, we decide not to reject the null hypothesis.
Step 5  Conclusion: There is no significant difference between the population mean and the sample mean. Thus, there is not enough evidence to reject the claim that high school students received an average of 50 telephone calls per month.
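The two t-test examples follow the same pattern and can be checked together. This is a sketch under stated assumptions: `t_statistic` is a hypothetical helper name, and the critical values are copied from the examples (that is, from a t-table) because the Python standard library has no t-distribution.

```python
from math import sqrt

def t_statistic(sample_mean, pop_mean, s, n):
    """Test value t = (sample mean - population mean) * sqrt(n) / s."""
    return (sample_mean - pop_mean) * sqrt(n) / s

# Example 3: left-tailed test, df = 19, alpha = 0.01, table value -2.539
t3 = t_statistic(24.85, 25, 0.32, 20)
reject3 = t3 < -2.539

# Example 4: two-tailed test, df = 28, alpha = 0.05, table value 2.048
t4 = t_statistic(47.6, 50, 7, 29)
reject4 = abs(t4) > 2.048

print(round(t3, 2), reject3)
print(round(t4, 2), reject4)
```

In both cases the test value stays outside the rejection region, which matches the decisions in Step 4 of each example.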

Learning Competency

Solves problems involving test of hypothesis on the population mean.


(M11/12SP-IV-e-1)

Problem Set
1. Yna Celestine believes that the average amount of time spent by her classmates in studying their self-learning module in Math per week is less than 300 minutes, with a standard deviation of 45 minutes. She took a random sample of 35 students in their class and found that the average time spent studying was 285 minutes. Test the claim at the 0.05 level of significance.
2. Don, a canteen owner, claims that the average meal cost of his usual customers is 190 pesos. To test his claim, Don took a random sample of 25 customers and found that the average meal cost is 210 pesos with a standard deviation of 30 pesos. Test the hypothesis at the 0.01 level of significance.
3. A coffee vending machine is designed to dispense 180 ml of coffee, but its owner suspects that it is dispensing more than what it is designed for. He took a random sample of 40 and found that the mean is 192 ml with a standard deviation of 4 ml. Do you think the owner is right about his suspicion? Test at the 0.05 level of significance.
Exercise 1

Direction: Formulate the null and alternative hypotheses of each problem in the set.

Problem   Null hypothesis (Ho)   Alternative hypothesis (Ha)
          (1 point each)         (1 point each)
1
2
3

Exercise 2

Directions: Determine the type of test (two-tailed or one-tailed), level of significance, the
test statistic to be used, the critical value and the degree of freedom (if possible) of each
problem in the set. (1 point each)

Problem   Type of test   α   Test statistic   Critical value   df (if possible)
1
2
3
Exercise 3

Directions: Compute the test value using the test statistic and draw the rejection region
of each problem in the set.

Problem   Computed or test value   Rejection region
          (2 points each)          (1 point each)
1
2
3

Exercise 4

Directions: Make a decision whether to accept or reject the null hypothesis and
formulate the conclusion of each problem in the set.

Problem   Decision         Conclusion
          (1 point each)   (2 points each)
1
2
3

References:

Lim, Y. F., et.al. (2016). Statistics and Probability. Sibs Publishing House, Inc. Quezon
City, Philippines
Belecina, R. R., et.al. (2016). Statistics and Probability. Rex Bookstore, Inc. Sampaloc,
Manila
Ocampo, Jr. J. M., et.al. (2016). Math and Beyond Statistics and Probability. Brilliant
Creations Publishing, Inc. Quezon City, Philippines
Reflection:

What have you learned from this topic?


______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
Answer key:

Exercise 1

Problem   Null hypothesis (Ho) and alternative hypothesis (Ha)

1   Ho: The average amount of time spent in studying the self-learning module in Math per week is 300 minutes. (μ = 300)
    Ha: The average amount of time spent in studying the self-learning module in Math per week is less than 300 minutes. (μ < 300)
2   Ho: The average meal cost of usual customers is 190 pesos. (μ = 190)
    Ha: The average meal cost of usual customers is not 190 pesos. (μ ≠ 190)
3   Ho: The average volume of coffee that the vending machine dispenses is 180 ml. (μ = 180)
    Ha: The average volume of coffee that the vending machine dispenses is more than 180 ml. (μ > 180)

Exercise 2

Problem Type of test α Test statistic Critical df


value (if possible)
1 One-tailed 0.05 z-test -1.645 none
(Left)
2 Two-tailed 0.01 t-test ±2.797 24
3 One-tailed 0.05 z-test +1.645 none
(right)

Exercise 3

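Problem   Computed or test value                                Rejection region

1   $z = \frac{(285 - 300)\sqrt{35}}{45} = -1.97$      z < -1.645 (the computed z = -1.97 lies in this region)
2   $t = \frac{(210 - 190)\sqrt{25}}{30} = 3.33$       t < -2.797 or t > +2.797 (the computed t = 3.33 lies in this region)
3   $z = \frac{(192 - 180)\sqrt{40}}{4} = 18.97$       z > +1.645 (the computed z = 18.97 lies in this region)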

Exercise 4

Problem   Decision and conclusion

1   Decision: Reject the null hypothesis.
    Conclusion: The average amount of time spent in studying the self-learning module in Math per week is less than 300 minutes. Thus, Yna's claim is correct.
2   Decision: Reject the null hypothesis.
    Conclusion: There is a significant difference between the population mean and the sample mean. Thus, the average meal cost of usual customers is not 190 pesos.
3   Decision: Reject the null hypothesis.
    Conclusion: The average volume of coffee that the vending machine dispenses is more than 180 ml. Thus, the owner's suspicion is correct.

Prepared by:

ARNOLD L. HABAN
STATISTICS AND PROBABILITY

Name of Learner: _____________________ Grade Level: __________________


Section: _____________________________ Date: ________________________

LEARNING ACTIVITY SHEET

FORMULATING THE APPROPRIATE NULL AND ALTERNATIVE HYPOTHESES ON


A POPULATION PROPORTION

Background Information for Learners:

A statistical hypothesis is a statement about the numerical value of a population parameter. In the previous lessons, we learned the two kinds of hypotheses: the null and the alternative. Formulating the null and alternative hypotheses is one of the major steps in hypothesis testing, because incorrect hypotheses lead to incorrect decisions and conclusions. In this lesson, we will learn to formulate the appropriate null and alternative hypotheses on a population proportion. Before going further, let us briefly review how to formulate the null and alternative hypotheses on a population mean.

Example 1
Identify whether each hypothesis is null or alternative.
a. The average daily allowance of senior high school students is 150 pesos.
   Answer: Null
b. The average COVID 19 cases in the Philippines per day is more than 1,500.
   Answer: Alternative
c. There is a significant difference between the average weights of students before and after participating in the Zumba exercise.
   Answer: Alternative
d. There is no significant difference between the average deaths of pigs caused by ASF virus in Isabela and Cagayan provinces.
   Answer: Null

Example 2
Formulate the null and alternative hypotheses of each statement and classify the test as two-tailed or one-tailed.

a. Statement: A barangay official claims that the daily average number of persons who violate curfew hours is 15, but some residents believe that this is not true.
   Answer:
   • Two-tailed test
   • Ho: The daily average number of persons who violate curfew hours is 15. (μ = 15)
   • Ha: The daily average number of persons who violate curfew hours is not 15. (μ ≠ 15)

b. Statement: A farmer in Region 2 believes that using organic fertilizers on his plants will yield greater income. His average income in the past was 300,000 pesos per year.
   Answer:
   • One-tailed test (right-directional)
   • Ho: The average income using the organic fertilizer is 300,000 pesos. (μ = 300,000)
   • Ha: The average income using the organic fertilizer is greater than 300,000 pesos. (μ > 300,000)

c. Statement: An electric company says that the average consumption of residents in a certain town is 350 kWh per month, but the town's mayor says their residents consume less.
   Answer:
   • One-tailed test (left-directional)
   • Ho: The average consumption of residents in a certain town is 350 kWh per month. (μ = 350)
   • Ha: The average consumption of residents in a certain town is less than 350 kWh per month. (μ < 350)

Now let us consider formulating null and alternative hypotheses that involve a proportion
(p) from a given population.

Example 4

You are a supervisor of XM Mall, which has 6 branches and more than 5,000 employees. According to one of the managers, 60% of the employees of the 6 branches do not want to wear a uniform on Wednesdays and Fridays.
a. Formulate the null and alternative hypotheses using a two-tailed statistical test.
b. Formulate the null and alternative hypotheses using a one-tailed statistical test.

Solution:
a. Ho: The proportion of employees who do not want to wear a uniform on Wednesdays and Fridays is 60%. (p = 0.60)
   Ha: The proportion of employees who do not want to wear a uniform on Wednesdays and Fridays is not 60%. (p ≠ 0.60)

b. Ho: The proportion of employees who do not want to wear a uniform on Wednesdays and Fridays is 60%. (p = 0.60)
   Ha (left-directional test): The proportion of employees who do not want to wear a uniform on Wednesdays and Fridays is less than 60%. (p < 0.60)
   Ha (right-directional test): The proportion of employees who do not want to wear a uniform on Wednesdays and Fridays is greater than 60%. (p > 0.60)
Example 5
It has been claimed that less than 30% of students in a certain school dislike Mathematics.
A researcher conducted a survey and it showed that 153 out of 600 students dislike
Mathematics. Test the claim at .05 level of significance.
a. If you were the researcher in the situation, what statistical test would you apply?
b. What are the null and alternative hypotheses?

Solution:
a. Directional or one-tailed test will be used in the situation since it uses the inequality
(<) and the rejection region lies entirely in one tail of the sampling distribution.
b. Ho: The proportion of students who dislike Mathematics is 30%. (p=0.30)
Ha: The proportion of students who dislike Mathematics is less than 30%. (p<0.30)

Example 6

According to a popcorn company, 90% of their kernels will pop when microwaved.
a. What statistical test should be applied?
b. Formulate the null and alternative hypotheses.

Solution:
a. Non-directional or two-tailed test, because the alternative hypothesis uses the inequality sign (≠).
b. Ho: The proportion of kernels that will pop when microwaved is 90%. (p = 0.90)
Ha: The proportion of kernels that will pop when microwaved is not 90%. (p ≠ 0.90)

Learning Competency

Formulate the appropriate null and alternative hypotheses on a population proportion.


(M11/12SP-IV-e-2)

Exercise 1

Direction: State whether each hypothesis is null or alternative. Write your answer on the
space provided before each number. (1 point each)
1. Students using brand A ballpens write better than those using brand B ballpens.
2. There is no significant difference between the average consumption of gasoline
in car A and the average consumption of gasoline in car B.
3. The proportion of senior high school graduates in a certain school is 10% of the
entire population.
4. The average income of families per month in a certain town is 38,000 pesos.
5. The proportion of husbands using motorcycles in region 2 is more than 50%.
6. The proportion of patients with lung cancer is higher among smokers than among
nonsmokers.
7. The proportion of senior high school students who enrolled in HUMSS strand is
not equal to 80%.
8. An airline claims that less than 12% of its entire lost luggage is never found.
9. The proportion of COVID 19 patients who have asymptomatic cases is less than
25%.
10. The proportion of students who will use distance learning in the entire population
is more than 65%.

Exercise 2

Direction: Determine if the given problem requires two-tailed or one-tailed statistical test.
Write A if two- tailed and B if it is one-tailed. Write your answer on the space provided
before each number. (1 point each)

1. A researcher believes that 85% of Facebook users post their photos each day.
2. A doctor claims that only 10% of COVID 19 patients are senior citizens.
3. A certain magazine stated that 20% of men said that they used biking to reduce stress.
4. It was found that more than 95% of students who used the Learning Activity Sheets passed the quarterly test prepared by the DepEd Division Office.
5. A study shows that more than 98% of high school students have a Facebook account.
6. A medicine manufacturer claims that its pain reliever capsule is 80% effective.
7. A cellphone brand claims that more than two-thirds of residents in a certain city use its brand.
8. In a certain municipality, farming is the source of living of more than 75% of residents.

Exercise 3

Directions: Formulate the null and alternative hypotheses in each item of exercise 2.
(2 points each)
No.   Null hypothesis   Alternative hypothesis
1
2
3
4
5
6
7
8

References:

Lim, Y. F., et.al. (2016). Statistics and Probability. Sibs Publishing House, Inc. Quezon
City, Philippines
Belecina, R. R., et.al. (2016). Statistics and Probability. Rex Bookstore, Inc. Sampaloc,
Manila
Ocampo, Jr. J. M., et.al. (2016). Math and Beyond Statistics and Probability. Brilliant
Creations Publishing, Inc. Quezon City, Philippines

Reflection:

How do you formulate the appropriate null and alternative hypotheses on a population proportion?
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
Answer key:
Exercise 1
1. Alternative 6. Alternative
2. Null 7. Alternative
3. Null 8. Alternative
4. Null 9. Alternative
5. Alternative 10. Alternative

Exercise 2
1. Two-tailed 5. One-tailed
2. Two-tailed 6. Two-tailed
3. Two-tailed 7. One-tailed
4. One-tailed 8. One-tailed

Exercise 3
No.   Null hypothesis (Ho) and alternative hypothesis (Ha)
1   Ho: The proportion of Facebook users who post their photos each day is 85%. (p = 0.85)
    Ha: The proportion of Facebook users who post their photos each day is not equal to 85%. (p ≠ 0.85)
2   Ho: The proportion of COVID 19 patients who are senior citizens is 10%. (p = 0.10)
    Ha: The proportion of COVID 19 patients who are senior citizens is not equal to 10%. (p ≠ 0.10)
3   Ho: The proportion of men who used biking to reduce stress is 20%. (p = 0.20)
    Ha: The proportion of men who used biking to reduce stress is not equal to 20%. (p ≠ 0.20)
4   Ho: The proportion of students who used the Learning Activity Sheets and passed the quarterly test prepared by the DepEd Division Office is 95%. (p = 0.95)
    Ha: The proportion of students who used the Learning Activity Sheets and passed the quarterly test prepared by the DepEd Division Office is more than 95%. (p > 0.95)
5   Ho: The proportion of high school students who have a Facebook account is 98%. (p = 0.98)
    Ha: The proportion of high school students who have a Facebook account is more than 98%. (p > 0.98)
6   Ho: The proportion of cases in which the pain reliever capsule is effective is 80%. (p = 0.80)
    Ha: The proportion of cases in which the pain reliever capsule is effective is not equal to 80%. (p ≠ 0.80)
7   Ho: The proportion of residents in a certain city who use the cellphone brand is two-thirds, or about 67%. (p = 0.67)
    Ha: The proportion of residents in a certain city who use the cellphone brand is more than 67%. (p > 0.67)
8   Ho: The proportion of residents whose source of living is farming is 75%. (p = 0.75)
    Ha: The proportion of residents whose source of living is farming is more than 75%. (p > 0.75)

Prepared by:

ARNOLD L. HABAN
STATISTICS AND PROBABILITY

Name of Learner: _____________________ Grade Level: __________________


Section: _____________________________ Date: ________________________

LEARNING ACTIVITY SHEET


IDENTIFYING THE APPROPRIATE FORM OF THE TEST STATISTIC WHEN THE
CENTRAL LIMIT THEOREM IS TO BE USED

Background Information for Learners:

In the previous lesson, we learned about the sampling distribution of the sample mean using the Central Limit Theorem. Remember that the Central Limit Theorem allows us to use the standard normal distribution for sample means provided that the sample size is large (n ≥ 30). In testing a hypothesis about a population proportion, we need another test statistic to decide whether or not to reject the null hypothesis, which is the basis for formulating the conclusion. For the last lesson this week, we will identify the appropriate form of the test statistic for a population proportion when the Central Limit Theorem is to be used.

To compare a sample proportion with a population proportion, we use the z-test for a one-sample proportion. The formula is

$$z = \frac{\hat{p} - p_o}{\sqrt{\dfrac{p_o(1 - p_o)}{n}}}$$

where
p̂ = sample proportion
po = population proportion
n = size of the sample
x = number of successes

$$\hat{p} = \frac{x}{n}$$

Example 1:

Convert the following percent to decimals.


a. 65% b. 2.5% c. 36%

Solution:
a. 65 ÷ 100 = 0.65 b. 2.5 ÷ 100 = 0.025 c. 36 ÷ 100 = 0.36

Example 2:

Convert the following fractions to decimals.


a. 4/5 b. 9/20 c. 3/8
Solution:
a. 4 ÷ 5 = 0.80 b. 9 ÷ 20 = 0.45 c. 3 ÷ 8 = 0.375

Example 3:

It has been claimed that 30% of students in a certain school have difficulty waking up early due to playing online games. A researcher would like to verify the claim by surveying a sample of 700 students. Out of the 700 students, 240 said that they had difficulty waking up early due to playing online games.
a. What type of statistical test should be applied?
b. What are the null and alternative hypotheses?
c. What are the corresponding values of the variables in the z-test formula?
d. What is the computed test value?

Solution:
a. Two-tailed test (non-directional test)
b. Ho: The proportion of students in a certain school who have difficulty waking up early due to playing online games is 30%. (po = 0.30)
   Ha: The proportion of students in a certain school who have difficulty waking up early due to playing online games is not equal to 30%. (po ≠ 0.30)
c. po = 0.30 (convert 30% to decimal)
   n = 700
   x = 240
   p̂ = 240 ÷ 700 = 0.34
d. $$z = \frac{0.34 - 0.30}{\sqrt{\dfrac{0.30(1 - 0.30)}{700}}} = \frac{0.04}{0.0173} = 2.31$$

Example 4:

Don Fast Food Restaurant believes that more than 90% of their customers are satisfied with the quality of service that they offer. A total of 150 customers were surveyed, and it was found that only 130 customers were satisfied.
a. What type of statistical test should be applied?
b. What are the null and alternative hypotheses?
c. What are the corresponding values of the variables in the z-test formula?
d. What is the computed test value?

Solution:
a. One-tailed test (right-directional test)
b. Ho: The proportion of customers who are satisfied with the quality of service that Don Fast Food Restaurant offers is 90%. (po = 0.90)
   Ha: The proportion of customers who are satisfied with the quality of service that Don Fast Food Restaurant offers is more than 90%. (po > 0.90)
c. po = 0.90 (convert 90% to decimal)
   n = 150
   x = 130
   p̂ = 130 ÷ 150 = 0.87
d. $$z = \frac{0.87 - 0.90}{\sqrt{\dfrac{0.90(1 - 0.90)}{150}}} = \frac{-0.03}{0.0245} = -1.22$$
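The test values in Examples 3 and 4 can be reproduced with a short function. This is a minimal sketch: the name `proportion_z` is hypothetical, and the sample proportion is rounded to two decimal places before substitution because that is what the worked solutions do. The last two lines show the unrounded result for comparison.

```python
from math import sqrt

def proportion_z(p_hat, p0, n):
    """One-sample z for a proportion: (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Example 3: x = 240 successes out of n = 700, claimed proportion 0.30
p_hat3 = round(240 / 700, 2)
z3 = proportion_z(p_hat3, 0.30, 700)

# Example 4: x = 130 successes out of n = 150, claimed proportion 0.90
p_hat4 = round(130 / 150, 2)
z4 = proportion_z(p_hat4, 0.90, 150)

# Without rounding p_hat first, the test values shift noticeably
z3_unrounded = proportion_z(240 / 700, 0.30, 700)
z4_unrounded = proportion_z(130 / 150, 0.90, 150)

print(round(z3, 2), round(z4, 2))
print(round(z3_unrounded, 2), round(z4_unrounded, 2))
```

Rounding p̂ early is convenient for hand computation, but carrying more decimal places gives a more accurate test value, which can matter when z is close to the critical value.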

Learning Competency

Identifies the appropriate form of the test-statistic in population proportion when the
Central Limit Theorem is to be used (M11/12SP-IV-e-3)

Problem Set

1. A certain school claims that less than 20% of their students prefer online learning
in the new normal education. After conducting a survey on 500 randomly chosen
students, they found out that 87 of them preferred online learning.

2. It has been claimed that 30% of students in a certain school dislike Mathematics.
A researcher conducted a survey and it showed that 153 out of 600 students dislike
Mathematics.

3. A certain magazine stated that more than 20% of men said that they used biking
to reduce stress. A survey was conducted to test the claim. They surveyed 1,300
randomly selected bikers in a certain region and found out that only 280 of them
said that they used biking to reduce stress.

4. Kat’s Drug store claims that 8 out of 10 doctors recommend Brand A drug to
combat body pain. To test the claim, 400 doctors were randomly chosen as
sample. It was found out that only 325 of them recommended Brand A drug.

Exercise 1

Direction: Formulate the null and alternative hypotheses of each problem in the set.

Problem   Null hypothesis (Ho)   Alternative hypothesis (Ha)
          (1 point each)         (1 point each)
1
2
3
4

Exercise 2

Directions: Determine the corresponding values of variables in the z-test formula for
population proportion. (1 point each)

Problem   p̂   po   x   n
1
2
3
4
Exercise 3

Directions: Compute the test value using the z – test formula for population proportion.
(2 points each)

Problem   Computed test value
1
2
3
4

References:

Lim, Y. F., et.al. (2016). Statistics and Probability. Sibs Publishing House, Inc. Quezon
City, Philippines
Belecina, R. R., et.al. (2016). Statistics and Probability. Rex Bookstore, Inc. Sampaloc,
Manila
Ocampo, Jr. J. M., et.al. (2016). Math and Beyond Statistics and Probability. Brilliant
Creations Publishing, Inc. Quezon City, Philippines
Reflection:

What have you learned from this topic?


______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
Answer key:

Exercise 1

Problem   Null hypothesis (Ho) and alternative hypothesis (Ha)

1   Ho: The proportion of students who prefer online learning in the new normal education is 20%. (po = 0.20)
    Ha: The proportion of students who prefer online learning in the new normal education is less than 20%. (po < 0.20)
2   Ho: The proportion of students in a certain school who dislike Mathematics is 30%. (po = 0.30)
    Ha: The proportion of students in a certain school who dislike Mathematics is not equal to 30%. (po ≠ 0.30)
3   Ho: The proportion of men who said that they used biking to reduce stress is 20%. (po = 0.20)
    Ha: The proportion of men who said that they used biking to reduce stress is more than 20%. (po > 0.20)
4   Ho: The proportion of doctors who recommend Brand A drug to combat body pain is 80%. (po = 0.80)
    Ha: The proportion of doctors who recommend Brand A drug to combat body pain is not equal to 80%. (po ≠ 0.80)

Exercise 2

Problem p̂ po x n
1 0.17 0.20 87 500
2 0.26 0.30 153 600
3 0.22 0.20 280 1,300
4 0.81 0.80 325 400

Exercise 3
Problem   Computed test value
1   z = -1.68
2   z = -2.14
3   z = 1.80
4   z = 0.50 (using p̂ = 0.81)

Prepared by:

ARNOLD L. HABAN
STATISTICS AND PROBABILITY
Name: _________________________________________ Grade Level: ____________________
Section: ________________________________________ Date: ___________________________

LEARNING ACTIVITY SHEET

SOLVING PROBLEM INVOLVING TEST OF HYPOTHESIS ON THE


POPULATION PROPORTION

Background Information for Learners


In the previous lesson, you learned how to compute the test statistic value and draw a conclusion about the population proportion based on that value. In this learning activity sheet, you will learn how to solve problems on tests of hypothesis about a population proportion.
To test a claim about a population proportion, use the z-test for population proportion. The formula below is used:

$$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$

where:
p = claimed or hypothesized proportion
p̂ = sample proportion
q = 1 - p
n = sample size

In testing a hypothesis, the five-step hypothesis testing procedure below can be used:
PROCEDURE
STEPS IN TESTING HYPOTHESIS
(Critical Value Method)
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha). A hypothesis that uses > or < is called one-tailed, while a hypothesis that uses ≠ is called two-tailed.
2. Identify the statistical test to be used, the value of α, and the critical value of the test statistic.
3. Computation. Get the absolute value of the computed z when comparing it to the critical value of z if the hypothesis is two-tailed.
4. Decision Rule and the Decision (reject or do not reject Ho)
   • For a one-tailed test, reject Ho if z ≥ z critical (Ha: p > po). Also reject Ho if z ≤ -z critical for Ha: p < po.
   • For a two-tailed test, reject Ho if |z| ≥ z critical.
5. Conclusion (in non-technical terms)
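The decision rule in step 4 can be written as a small function. This is only a sketch: the function name `decide` and the tail labels are hypothetical, the critical value is passed in as a positive number, and the use of "greater than or equal to" at the boundary is an assumption, since the comparison symbols are not legible in the procedure above.

```python
def decide(z, z_critical, tail):
    """Apply the critical value method.

    z          -- computed test value
    z_critical -- positive critical value for the chosen alpha
    tail       -- "left", "right", or "two"
    """
    if tail == "two":
        reject = abs(z) >= z_critical      # compare |z| for two-tailed tests
    elif tail == "right":
        reject = z >= z_critical           # Ha uses ">"
    elif tail == "left":
        reject = z <= -z_critical          # Ha uses "<"
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "reject Ho" if reject else "do not reject Ho"

print(decide(1.19, 1.96, "two"))
print(decide(-1.30, 1.28, "left"))
```

Passing the critical value as a positive number keeps the three cases symmetric; the sign is applied inside the function for a left-tailed test.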

To further understand how to conduct hypothesis testing using these steps, let us study
the succeeding examples:
Example 1.
Using the 0.05 level of significance, run a z-test given the following: the sample size n; p̂ = 0.41; p = 0.35
Solution:
We follow the five-step hypothesis testing procedure in showing our solution.
Step 1: Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
Since the assumed population proportion is 0.35, the null hypothesis Ho is p = 0.35. The alternative hypothesis therefore is p ≠ 0.35.

Ho: p = 0.35
Ha: p ≠ 0.35

Step 2: Identify the statistical test to be used, the value of α, and the critical value of the test statistic.

Statistical Test: z-test for proportion (two-tailed)
α = 0.05
z critical = ±1.96

Step 3: Computation.
Substitute the given values in the formula. To get the value of q, just subtract the value of p from 1 (q = 1 - p). In our example, q = 1 - 0.35, that is, 0.65.

$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.41 - 0.35}{\sqrt{(0.35)(0.65)/n}} = 1.19$$

Step 4: Decision Rule and Decision (reject or do not reject Ho)

Decision Rule: Reject Ho if |z| ≥ 1.96. (This is for a two-tailed test.)
The decision part is where the heart of hypothesis testing lies. Always consider the decision rule in deciding whether or not to reject Ho.
Since the computed value of z, 1.19, is less than the critical value of z, 1.96, the decision rule tells us not to reject Ho.
Step 5: Making the conclusion.
In making the conclusion, avoid being too verbose and using technical terms. In our example, we can simply say, "There is not sufficient evidence to reject the claim that the population proportion is 0.35."
Example 2:
A medical doctor claims that less than 5% of those who recovered from COVID-19 are at
risk of reinfection. To verify the claim, 200 recovered patients were retested, and 6 of them
were found positive for the disease. Is the doctor’s claim true? Test at the 0.10 level of
significance.
Solution:
We employ the usual five-step hypothesis testing in order to test the claim.
STEPS AND ANSWERS
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
   Ho: p = 0.05
   Ha: p < 0.05
   (The hypothesized population proportion is less than 5% or 0.05. This is a one-tailed hypothesis.)
2. Identify the statistical test to be used, value of α and the critical value of the test statistic.
   Statistical test = z-test for population proportion
   α = 0.10
   z critical = −1.28
3. Computation
   p̂ = 6/200 = 0.03 (6 out of 200 when converted to decimal is 0.03)
   z = (p̂ − p) / √(pq/n)
   z = (0.03 − 0.05) / √((0.05)(0.95)/200)
   z = −1.30
   (Do not get the absolute value since the hypothesis is one-tailed.)
4. Decision Rule and Decision
   Decision Rule: Reject Ho if computed z < −(critical z)
   Since the computed z of −1.30 is less than the critical value of −1.28, we reject Ho.
   (Note: The doctor’s claim of p < 0.05 is the Ha.)
5. Making conclusion
   The doctor’s claim that less than 5% of COVID-19 recovered patients are at risk of
   reinfection is supported. There is enough evidence to support the claim.
Example 3:
A school guidance counselor believes that 20% of the Junior High School completers of
the school want to transfer to a private school for Senior High School. Out of 250 interviewed
completers, 60 want to transfer to a private school. Test the guidance counselor’s claim at 0.05
level of significance.
Solution:
STEPS AND ANSWERS
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
   Ho: p = 0.20
   Ha: p ≠ 0.20
   (The hypothesized population proportion is 20% or 0.20. This is a two-tailed hypothesis.)
2. Identify the statistical test to be used, value of α and the critical value of the test statistic.
   Statistical test = z-test for population proportion
   α = 0.05
   z critical = 1.96
3. Computation
   p̂ = 60/250 = 0.24 (60 out of 250 when converted to decimal is 0.24)
   z = (p̂ − p) / √(pq/n)
   z = (0.24 − 0.20) / √((0.20)(0.80)/250)
   z = 1.58
4. Decision Rule and Decision
   Decision Rule: Reject Ho if |computed z| > critical z
   Since the computed z of 1.58 is less than the critical value of 1.96, we cannot reject Ho.
5. Making conclusion
   We cannot reject the guidance counselor’s claim that 20% of the Junior High School
   completers of the school want to transfer to a private school. There is not enough
   evidence against the claim.
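
The arithmetic in Step 3 of Example 3 can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are assumptions, and the numbers are the ones given in the problem.

```python
from math import sqrt

n = 250          # number of completers interviewed
successes = 60   # completers who want to transfer to a private school
p = 0.20         # hypothesized proportion under Ho

p_hat = successes / n              # sample proportion
q = 1 - p                          # complement of p
standard_error = sqrt(p * q / n)   # denominator of the z formula
z = (p_hat - p) / standard_error   # computed test statistic

print(f"p-hat = {p_hat:.2f}")
print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.2f}")
```

Carrying full precision gives z of about 1.58. Rounding the standard error too early can shift the second decimal place, although the decision here stays the same because the value is well below 1.96.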

Learning Competency
Solve problems involving test of hypothesis on the population proportion (Quarter 4, Week 7,
M11/12SP-IVg-1)
ACTIVITY 1: “Decide Now!”
Directions: Complete the table by providing the computed and critical value of z and decide
whether to reject or not to reject Ho. (3 points each)
No.  n    α, type of test                  p̂     p     Computed z   Critical z   Decision
1    100  0.05, one-tailed (Ha: p > 20%)    23%   20%
2    350  0.01, two-tailed                  0.27  0.3
3    95   0.10, two-tailed                  …     0.35
4    120  0.10, one-tailed (Ha: p > 0.5)    …     0.5
5    230  0.05, two-tailed                  64%   60%

ACTIVITY 2: “Complete Me!”


Directions: Each item below is an incomplete test of hypothesis. Supply the missing solution to
the item. (2 points each blank)
1. Using the 0.10 level of significance, conduct a test of hypothesis if p=0.17 given the
following:
n = 150; p̂ = 30 out of 150
STEPS AND ANSWERS
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
   Ho: p = 0.17
   Ha: _______________
2. Identify the statistical test to be used, value of α and the critical value of the test statistic.
   Statistical test = z-test for population proportion
   α = __________
3. Computation
   z = (p̂ − p) / √(pq/n) = __________
4. Decision Rule and Decision
   Decision Rule: ____________________________
   Since the computed z is ______ which is ___________ the critical value of _______,
   __________________ Ho.
5. Making conclusion
   ________________________________________________________

2. A television network claims that 75% of Filipinos are in favor of their franchise renewal.
A survey of 1,200 randomly selected Filipinos shows that 850 said they want the
network’s franchise to be renewed. Is there enough evidence to support the network’s
claim? Use α = 0.05.
STEPS AND ANSWERS
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
   Ho: _______________
   Ha: _______________
2. Identify the statistical test to be used, value of α and the critical value of the test statistic.
   Statistical test = z-test for population proportion
   α = 0.05
3. Computation
   z = (p̂ − p) / √(pq/n) = __________
4. Decision
   Since the computed z is ______ which is ___________ the critical value of _______,
   __________________ Ho.
5. Making conclusion
   ________________________________________________________

3. A non-government organization (NGO) claims that 50% of Pinoys consider themselves
poor. A survey of 1,500 Filipinos reveals that 54% said they are poor. Is there enough
evidence that supports the NGO’s claim? Test at α = 0.01.
STEPS AND ANSWERS
1. Determine the null hypothesis (Ho) and alternative hypothesis (Ha).
   Ho: _______________
   Ha: _______________
2. Identify the statistical test to be used, value of α and the critical value of the test statistic.
   Statistical test = z-test for population proportion
   α = 0.01
3. Computation
   z = (p̂ − p) / √(pq/n) = __________
4. Decision
   Since the computed z is ______ which is ___________ the critical value of _______,
   __________________ Ho.
5. Making conclusion
   ________________________________________________________

ACTIVITY 3: “It’s Your Turn!”


Directions: Conduct a five-step hypothesis testing on population proportion on the following
problems.
1. A local political leader claims that 95% of the families in his area of responsibility were
given “ayuda” during a one-month lockdown. Of a random sample of 200 families, 187
said they received relief or “ayuda”. Is this enough to affirm the leader’s claim? Use
α = 0.05.
2. A politician claims that he will get at least 70% of the votes. Out of 300 randomly
sampled registered voters, 200 said they will vote for the said politician. Test the claim
using 0.10 level of significance.
3. A medical expert claims that 80% of recovered COVID-19 patients have produced
antibodies against the virus. In order to verify this, 1,000 recovered patients were tested
and found that 823 of them have antibodies for the corona virus. Is this enough evidence
to support the claim? Use α = 0.01.

References:
Banigon, R.B. Jr., et.al (2016). Statistics and Probability for Senior High School. Cubao. Quezon
City
Belecina, R.R., et.al (2016). Statistics and Probability. Sta. Mesa Heights, Quezon City

Reflection:
Complete the following sentences.
1. In this lesson, I have learned how to __________________________________________
________________________________________________________________________
_______________________________________________________________________.
2. I am feeling _____________________ about the lesson because ____________________
________________________________________________________________________
_______________________________________________________________________.
3. I am excited and hoping for _________________________________________________
_______________________________________________________________________.
ANSWER KEY
Activity 1: “Decide Now!”
No.  Computed z  Critical z  Decision
1 0.75 1.64 Do not reject Ho
2 -1.22 2.58 Do not Reject Ho
3 -0.70 1.64 Do not Reject Ho
4 0.91 1.28 Do not Reject Ho
5 1.24 1.96 Do not Reject Ho

Activity 2: “Complete Me!”


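1. Step 1: Ha: p ≠ 0.17
Step 2: α = 0.10; z critical = 1.64

Step 3: p̂ = 30/150 = 0.20; z = (0.20 − 0.17) / √((0.17)(0.83)/150) = 0.98
Step 4: Decision Rule: Reject Ho if |computed z| > critical z
0.98, less than, 1.64, do not reject
Step 5: There is not enough evidence to reject the claim that the population proportion is
0.17.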

2. Step 1: Ho: p = 0.75
Ha: p ≠ 0.75
Step 2: z critical: 1.96
Step 3: p̂ = 850/1200 ≈ 0.7083; z = (0.7083 − 0.75) / √((0.75)(0.25)/1200) = −3.34, but
since it is two-tailed, the negative sign will be disregarded and so |z| = 3.34.
Step 4: Decision Rule: Reject Ho if |computed z| > critical z
3.34, greater than, 1.96, reject
Step 5: There is enough evidence to reject the television network’s claim that 75%
of Filipinos are in favor of their franchise renewal.

3. Step 1: Ho: p = 0.50
Ha: p ≠ 0.50
Step 2: α = 0.01; z critical = 2.58
Step 3: z = (0.54 − 0.50) / √((0.50)(0.50)/1500) = 3.10
Step 4: Decision Rule: Reject Ho if |computed z| > critical z
3.10, greater than, 2.58, reject
Step 5: There is enough evidence to reject the NGO’s claim that 50% of
Filipinos consider themselves poor.

Activity 3: “It’s Your Turn!”


1. Step 1: Ho: p = 0.95
Ha: p ≠ 0.95
Step 2: Statistical test: z-test for population proportion
α = 0.05
z critical = 1.96
Step 3:
p̂ = 187/200 = 0.935
z = (0.935 − 0.95) / √((0.95)(0.05)/200) = −0.97, so |z| = 0.97
(Get the absolute value since the hypothesis is two-tailed)

Step 4: Decision Rule: Reject Ho if |computed z| > critical z
Since the computed z of 0.97 is less than the critical value of 1.96, do not reject
Ho.
Step 5: There is not enough evidence to reject the claim that 95% of the families received
“ayuda”.

2. Step 1: Ho: p ≥ 0.7
Ha: p < 0.7
Step 2: Statistical test: z-test for population proportion
α = 0.10
z critical: −1.28 (one-tailed)

Step 3:
p̂ = 200/300 ≈ 0.67
z = (0.67 − 0.70) / √((0.70)(0.30)/300) = −1.13

Step 4: Since the computed z of −1.13 is greater than the critical value of −1.28, do not
reject Ho.
Step 5: There is not enough evidence to reject the politician’s claim of getting at least
70% of the votes.

3. Step 1: Ho: p = 0.8
Ha: p ≠ 0.8
Step 2: Statistical test: z-test for population proportion
α = 0.01
z critical = 2.58
Step 3:
p̂ = 823/1000 = 0.823
z = (0.823 − 0.80) / √((0.80)(0.20)/1000) = 1.82

Step 4: Decision Rule: Reject Ho if |computed z| > critical z
Since the computed value of z of 1.82 is less than the critical value of 2.58, do not
reject Ho.
Step 5: There is not enough evidence to reject the claim that 80% of the recovered COVID-19
patients have developed antibodies.

Prepared by:

Armando G. Balucas Jr.


San Mateo National High School
STATISTICS AND PROBABILITY
Name: _________________________________________ Grade Level: ____________________
Section: ________________________________________ Date: ___________________________

LEARNING ACTIVITY SHEET

NATURE OF BIVARIATE DATA

Background Information for Learners


In our previous lessons, we dealt with data involving a single variable. These are called
univariate data, which are treated independently of other variables. On the other hand,
bivariate data are data that involve two variables which are paired together, mostly to find
associations or relationships.
For example, the reading comprehension level of Grade 11 students is a single variable and
thus it is considered univariate data. However, if you pair it with the students’ scores in solving
word problems and look for a relationship, the two variables represent bivariate data. If one
variable influences or affects the other variable, then you have bivariate data with an
independent and a dependent variable. An independent variable is a variable that can be
changed or controlled. On the other hand, a dependent variable is a variable that is influenced by
the independent variable.
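
One way to picture the difference is to look at how the data would be stored. The short Python sketch below uses made-up values (the levels and scores are hypothetical and are not taken from any class record). A univariate data set is a single list, while a bivariate data set keeps the two values of each student together as a pair.

```python
# Hypothetical values for five students, for illustration only.
reading_levels = [3, 5, 2, 4, 5]        # independent variable (x)
problem_scores = [12, 18, 9, 15, 20]    # dependent variable (y)

# Univariate: one variable is examined on its own.
average_level = sum(reading_levels) / len(reading_levels)

# Bivariate: each student's two values stay paired as (x, y).
paired = list(zip(reading_levels, problem_scores))

print(average_level)
print(paired)
```

Keeping the values paired matters because the relationship can only be studied when each x stays matched with its own y.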

Example 1: Reading comprehension level and problem solving scores


This is bivariate data composed of two variables. Reading comprehension
levels are related to scores in problem solving. In most cases, students with a high reading
comprehension level tend to understand the problems better and thus score higher. Scores in
problem solving depend on reading comprehension level, so this is the dependent variable.
Reading comprehension level, being the determinant of problem solving scores, is the
independent variable.

Example 2: Temperature of the day and sales of Halo-halo


The temperature of the day determines the sales of Halo-halo. In other words, sales of
Halo-halo depend upon the temperature of the day. Hot days will most probably generate higher
sales of Halo-halo compared to cold days. In this case, temperature of the day is the independent
variable while sales is the dependent variable.

Example 3: Age of a car and its resale value


Second-hand items such as gadgets and cars are bought and sold at prices that depend on
their age. More often than not, a car’s resale value depends on its age. Resale value is the
dependent variable and age is the independent variable.

There are many examples of bivariate data. Their relationships are used in
research, product development and decision making.
Learning Competency
Illustrate the nature of bivariate data (Quarter 4, Week 7, M11/12SP-IVg-2)

Activity 1
Directions: Determine whether each research topic below is univariate or bivariate.
1. Socio-economic status of SHS students
2. IQ level and career preferences
3. Mathematics vocabulary level and math grades
4. Ages of COVID-19 patients
5. Ages of COVID-19 patients and days of recovery
6. Annual net income of a television network
7. Amount spent in an advertisement and gross sales
8. Learning modality preferences of SHS students
9. Time spent in reviewing and test scores
10. Monthly electricity consumption

Activity 2
Directions: Identify the independent and dependent variable in each of the following bivariate
data.
1. Students’ age and height
2. Number of days present and final grade
3. Internet speed and distance from the tower
4. Daily allowance and monthly income of parents
5. Radius of a circle and its area
6. Altitude of place and its temperature
7. Side of a cube and its volume
8. Time spent in social media and social issue awareness
9. Height and weight
10. Number of shares and amount of dividend

Activity 3
Directions: List down at least five (5) examples of bivariate data.
1. ___________________________________________________

2. ___________________________________________________

3. ___________________________________________________

4. ___________________________________________________

5. ___________________________________________________
References:
Belecina, R.R., et.al (2016). Statistics and Probability. Sta. Mesa Heights, Quezon City
Stephanie Glen. “Bivariate Analysis Definition and Example”. Retrieved from
StatisticsHowTo.com
Jackson, Cathryn. “What is Bivariate Data?”. Retrieved from
study.com/academy/lesson/what-is-bivariate-data-definition-examples.html

Reflection:
Complete the following sentences.
1. In this lesson, I have learned how to __________________________________________
________________________________________________________________________
_______________________________________________________________________.
2. I am feeling _____________________ about the lesson because ____________________
________________________________________________________________________
_______________________________________________________________________.
3. I am excited and hoping for _________________________________________________
_______________________________________________________________________.

ANSWER KEY
Activity 1
1. Univariate
2. Bivariate
3. Bivariate
4. Univariate
5. Bivariate
6. Univariate
7. Bivariate
8. Univariate
9. Bivariate
10. Univariate

Activity 2
1. Dependent Variable: height
Independent Variable: age
2. Dependent Variable: final grade
Independent Variable: number of days present
3. Dependent Variable: internet speed
Independent Variable: distance from the tower
4. Dependent Variable: daily allowance
Independent Variable: monthly income
5. Dependent Variable: area
Independent Variable: radius
6. Dependent Variable: temperature
Independent Variable: altitude
7. Dependent Variable: volume
Independent Variable: side
8. Dependent Variable: social issue awareness
Independent Variable: time spent in social media
9. Dependent Variable: weight
Independent Variable: height
10. Dependent Variable: amount of dividend
Independent Variable: number of shares

Prepared by:

Armando G. Balucas Jr.


San Mateo National High School
STATISTICS AND PROBABILITY
Name: _________________________________________ Grade Level: ____________________
Section: ________________________________________ Date: ___________________________

LEARNING ACTIVITY SHEET

CONSTRUCTING A SCATTER PLOT

Background Information for Learners


In the previous lesson, you learned what bivariate data is and how it differs
from univariate data. You also learned how to identify independent and dependent variables.
Bivariate data are used to investigate the relationship between two variables. One way to
visualize the relationship is to plot the points in the Cartesian plane.
We call this a scatter plot, sometimes also known as a scatter diagram.
A scatter plot uses points to represent the values of two variables in the Cartesian
plane, with one variable on each axis. The purpose of plotting the points is to look for a
relationship between the variables.

Example 1: The table below shows the age and weight of 10 students. Make a scatter plot for
this bivariate data.
Age in years   13  17  14  15  16  17  18  20  16  19
Weight in kg   37  42  40  38  45  50  55  52  45  50

Solution: In order to make a scatter plot, plot the points (13,37), (17,42), (14,40), (15,38),
(16,45), (17,50), (18,55), (20,52), (16,45) and (19,50) in the Cartesian plane. The graph should
look like this:

Figure: Scatter plot that represents the bivariate data above (age on the horizontal axis, weight on the vertical axis)
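
The pairing and plotting in Example 1 can also be sketched in plain Python. The block below is only an illustration and not the way the original graph was made. The function name text_scatter and the choice to group weights in bands of 5 kg are assumptions, and the picture it prints is rough compared with a graph drawn on paper.

```python
ages = [13, 17, 14, 15, 16, 17, 18, 20, 16, 19]
weights = [37, 42, 40, 38, 45, 50, 55, 52, 45, 50]

# Each student becomes one ordered pair (age, weight).
points = list(zip(ages, weights))


def text_scatter(points, y_step=5):
    """Draw a rough scatter plot with text: '*' marks at least one point.

    Assumes whole-number x values; y values are grouped in bands of y_step.
    """
    x_values = range(min(x for x, _ in points), max(x for x, _ in points) + 1)
    y_low = min(y for _, y in points) // y_step * y_step
    y_high = max(y for _, y in points) // y_step * y_step
    rows = []
    for band in range(y_high, y_low - 1, -y_step):  # top row = largest y
        cells = ""
        for x in x_values:
            hit = any(px == x and band <= py < band + y_step
                      for px, py in points)
            cells += "  *" if hit else "  ."
        rows.append(f"{band:>3} |{cells}")
    rows.append("    +" + "---" * len(x_values))
    rows.append("     " + "".join(f"{x:>3}" for x in x_values))
    return "\n".join(rows)


plot = text_scatter(points)
print(plot)
```

Two students share the pair (16, 45), so they appear as a single mark. The marks rise from the lower left to the upper right, which is the same pattern a hand-drawn plot of these points would show.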
Example 2: A teacher interviewed his students on the number of hours they spent
reviewing for their final examination in Statistics and Probability. The teacher then compared
the data to the number of incorrect answers in the exam. The table shows the data.

Number of hours spent reviewing   0.5  1  3  1.5  2  1  4  3  0.25  3.5
Number of incorrect answers       7    5  3  3    5  8  2  7  6     3

Solution: Treat each pair of values as an ordered pair and plot the points in the Cartesian
plane. The graph should be similar to the one below.

Example 3: Make a scatter plot based on the bivariate data below.

Distance from the school (km)   0.2  2   1   1.5  5    3   0.5  6   2
Daily allowance in peso         30   50  60  50   100  50  20   70  100

Solution: The graph on the right represents the scatter plot of the bivariate data above.
Learning Competency
Construct a scatter plot (Quarter 4, Week 7, M11/12SP-IVg-3)

Activity 1: “Throwback”
Directions: Plot the following points in the Cartesian plane.
1. (3, 5)
2. (-6, 10)
3. (4, -7)
4. (-5, -11)
5. (8, 4)
6. (9, -5)
7. (12, -1)
8. (-10, -6)
9. (0, 8)
10. (-4, 0)

Activity 2: “Better in Scatter”


Directions: Construct a scatter plot for each pair of variables.
1. Scores in Math     Scores in English
10 11
8 15
6 7
5 10
9 12
13 8
16 17
15 14

2. Number of workers     Number of days to finish the job
10 20
5 40
11 18
6 33
20 10
13 15
8 25
9 22
3. IQ     Entrance Exam Score
100 75
110 80
121 91
90 70
95 68
115 83
130 95
117 80
100 65
105 78

Activity 3: “You Can Do It!”


Directions: Interview 10 of your classmates about their weekly allowance and how much they
spend on mobile load in a week. Tabulate the results and construct a scatter plot for the data.

Average Weekly Allowance     Amount Spent for Mobile Load in a Week
References:
Belecina, R.R., et.al (2016). Statistics and Probability. Sta. Mesa Heights, Quezon City
Mike Yi (2019). A Complete Guide to Scatter Plots. Retrieved from
https://chartio.com/learn/charts/what-is-a-scatter-plot/

Reflection:
Complete the following sentences.
1. In this lesson, I have learned how to __________________________________________
________________________________________________________________________
_______________________________________________________________________.
2. I am feeling _____________________ about the lesson because ____________________
________________________________________________________________________
_______________________________________________________________________.
3. I am excited and hoping for _________________________________________________
_______________________________________________________________________.

ANSWER KEY
Activity 1: “Throwback”

Activity 2: “Better in Scatter”


1.
2.

3.

Prepared by:

Armando G. Balucas Jr.


San Mateo National High School
STATISTICS AND PROBABILITY
Name: _________________________________________ Grade Level: ____________________
Section: ________________________________________ Date: ___________________________

LEARNING ACTIVITY SHEET

FORM, DIRECTION AND STRENGTH OF A SCATTER PLOT

Background Information for Learners


In the previous lessons, you learned what a scatter plot is and how to construct one
using the values of bivariate data. Constructing one is not enough. For a better understanding
of the relationship or association between the two variables, one must also describe the plot’s
shape, trend and variation.
The form or shape of a scatter plot can be linear or non-linear. Figure 1 below represents
a linear scatter plot since the points tend to fall along a straight line.

Figure 1: Linear Scatter Plot Figure 2:Non-linear Scatter Plot

The direction or trend of scatter plot answers the question “Is the association positive or
negative?”. The direction may be positive, negative or zero correlation. A positive correlation
means that an increase in one of the variables is associated with an increase in the other. Figure 3
is an example of a scatter plot with positive correlation. A negative correlation on the other hand
means that an increase in one of the variables is associated with a decrease in the other just like
in Figure 4. Not all scatter plots can be classified as either positive or negative. There are
instances where the variables have no association at all. We label this kind of scatter plot as no
correlation at all or zero correlation. Figure 5 shows an example.
Figure 3: Positive Correlation Figure 4: Negative Correlation Figure 5: Zero Correlation

The correlation or association of the variables in a scatter plot can also be described in
terms of its strength. This describes the closeness of the points. It may be strong, moderate or
weak. The figures below show the difference among the three.

Figure 6: Strong Correlation Figure 7: Moderate Correlation Figure 8: Weak Correlation

Figure 6 shows a strong correlation since the points are close to each other. Figure 7
shows a moderate correlation: the points are fairly close to each other, but not as close as in
Figure 6. Figure 8 shows a weak correlation since the points are far from each other.

Learning Competency
Describes the shape (form), trend (direction), and variation (strength) based on a scatter plot
(Quarter 4, Week 7, M11/12SP-IVg-4)
Activity 1: “Describe Me”
Directions: Describe the following scatter plots in terms of their form, trend and variation.

1. 2.

3. 4.

5.
Activity 2: “Matchy-matchy”
Directions: Each item below is a description of a scatter plot based on its form, direction and
strength. Choose the scatter plot that matches the description.

1. “There is a strong, negative, linear relationship between the two variables.”

A. B. C.

2. “There is a weak, zero, non-linear relationship between the two variables.”

A. B. C.

3. “There is a moderate, negative, linear relationship between the two variables.”

A. B. C.
Activity 3: “Draw Me!”
Directions: Each item below is a description of the form, direction and strength of a scatter plot.
Draw a scatter plot that represents the description.
1. There is a strong, positive, linear relationship between the variables.
2. There is a moderate, negative, non-linear relationship between the variables.
3. There is a strong, positive, non-linear relationship between the variables.
4. There is a weak, negative, linear relationship between the variables.
5. There is a weak, positive, linear relationship between the variables.

References:
Belecina, R.R., et.al (2016). Statistics and Probability. Sta. Mesa Heights, Quezon City
Khan Academy (2018). Describing Scatter Plots (form, direction, strength and outliers).
Retrieved from https://www.khanacademy.org/math/ap-statistics/bivariate-data-ap/scatterplots-
correlation/a/describing-scatterplots-form-direction-strength-outliers

Reflection:
Complete the following sentences.
1. In this lesson, I have learned how to __________________________________________
________________________________________________________________________
_______________________________________________________________________.
2. I am feeling _____________________ about the lesson because ____________________
________________________________________________________________________
_______________________________________________________________________.
3. I am excited and hoping for _________________________________________________
_______________________________________________________________________.

ANSWER KEY
Activity 1: “Describe Me”
1. Linear, positive, moderate
2. Linear, negative, strong
3. Non-Linear, positive, moderate
4. Non-linear, zero, strong
5. Non-linear, zero, weak

Activity 2: “Matchy-matchy”
1. C
2. B
3. C
Activity 3: “Draw Me”

1. 2.

3. 4.

5.

Prepared by:

Armando G. Balucas Jr.


San Mateo National High School
STATISTICS AND PROBABILITY
Name of Learner: Grade Level:
Section: Date: _

LEARNING ACTIVITY SHEET


CALCULATING THE PEARSON’S SAMPLE
CORRELATION COEFFICIENT

Background Information for Learners


You have learned in the previous lesson how to estimate the strength of association between
two variables based on a scatter plot. Now, you will learn how to measure the strength and
direction of the relationship between two variables using Pearson’s Correlation Coefficient.

Correlation is used to describe and test the significance of relationships between two
quantitative and continuous variables.
Pearson’s Correlation Coefficient (r), also referred to as Pearson’s r, measures the linear
correlation between two variables.
To compute r, we use the formula,
r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
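
The formula can be written as a small Python function. This is a minimal sketch, not part of the original activity sheet; the function name pearson_r_from_sums and its parameter names are assumptions. It takes the five column totals and n directly, which mirrors the way the formula is used in the examples of this sheet.

```python
from math import sqrt


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from the column totals used in the formula above."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Totals from Example 1 of this activity sheet (n = 10 students).
r = pearson_r_from_sums(10, 138, 146, 2218, 2496, 2340)
print(round(r, 2))  # 0.96
```

The sketch does not guard against a zero denominator, which happens when all x values or all y values are the same. In that case r is undefined.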

The table below shows the verbal description of the strength of the computed r.
Pearson r Linear Relationship

0 – 0.19/-0.19 – 0 Very Weak Positive (Negative) Correlation


0.20 – 0.39/ -0.39 – (-0.20) Weak Positive (Negative) Correlation
0.40 – 0.59/ -0.59 – (-0.40) Moderate Positive (Negative) Correlation
0.60 – 0.79/ -0.79 – (-0.60) Strong Positive (Negative) Correlation
0.80 – 1.00/ -1 – (-0.80) Very Strong Positive (Negative) Correlation
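
The table can be turned into a lookup function. The sketch below is an illustration only; the function name describe_r is an assumption. It also assumes that r is first rounded to two decimal places (the table leaves small gaps such as the one between 0.19 and 0.20) and that r = 0 is reported as zero correlation.

```python
def describe_r(r):
    """Return the verbal description of r based on the table above.

    Assumptions: r is rounded to two decimal places first, and r = 0 is
    reported as zero correlation.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    r = round(r, 2)
    if r == 0:
        return "Zero Correlation"
    size = abs(r)
    if size <= 0.19:
        strength = "Very Weak"
    elif size <= 0.39:
        strength = "Weak"
    elif size <= 0.59:
        strength = "Moderate"
    elif size <= 0.79:
        strength = "Strong"
    else:
        strength = "Very Strong"
    direction = "Positive" if r > 0 else "Negative"
    return f"{strength} {direction} Correlation"


print(describe_r(0.45))   # Moderate Positive Correlation
print(describe_r(-0.85))  # Very Strong Negative Correlation
```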

Note: Practice Personal Hygiene protocols at all times


EXAMPLE 1.
The table shows the scores of ten Grade 11 students in Statistics and Practical Research.
Determine if there is a relationship between the scores in the two subjects by computing the
correlation coefficient of these two variables.

Students Score in Statistics (x) Score in Practical Research (y)


1 13 15
2 9 10
3 8 7
4 17 16
5 23 25
6 11 12
7 15 14
8 18 17
9 4 6
10 20 24

SOLUTION:
To solve for r, follow the following steps.
Step 1: Compute x², y² and xy. Present the data in tabular form.
Students x y x² y² xy
1 13 15 169 225 195
2 9 10 81 100 90
3 8 7 64 49 56
4 17 16 289 256 272
5 23 25 529 625 575
6 11 12 121 144 132
7 15 14 225 196 210
8 18 17 324 289 306
9 4 6 16 36 24
10 20 24 400 576 480

Step 2: Find the sum of all the entries in each column.


Students x y x² y² xy
1 13 15 169 225 195
2 9 10 81 100 90
3 8 7 64 49 56
4 17 16 289 256 272
5 23 25 529 625 575
6 11 12 121 144 132
7 15 14 225 196 210
8 18 17 324 289 306
9 4 6 16 36 24
10 20 24 400 576 480
Σ = 138  Σ = 146  Σ = 2218  Σ = 2496  Σ = 2340

Step 3: Calculate the Pearson’s sample correlation coefficient by substituting the
values obtained from Step 2 in the formula.

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
r = [(10)(2340) − (138)(146)] / √{[10(2218) − (138)²][10(2496) − (146)²]}
r = 3252 / √[(3136)(3644)]
r = 0.96

The computed value of r falls within the range of 0.80 – 1.00, therefore, the scores of
the students in Statistics and Practical Research have a very strong positive correlation. This
means that if a student got a high score in Statistics, it can be expected that the student will also
get a high score in Practical Research.
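
Steps 1 and 2 of Example 1 can be reproduced with a short Python sketch. It is only an illustration; the variable names are assumptions, and the scores are the ones listed in the table of Example 1.

```python
statistics_scores = [13, 9, 8, 17, 23, 11, 15, 18, 4, 20]   # x
research_scores = [15, 10, 7, 16, 25, 12, 14, 17, 6, 24]    # y

# Step 1: one row per student with x, y, x^2, y^2 and xy.
rows = [(x, y, x * x, y * y, x * y)
        for x, y in zip(statistics_scores, research_scores)]

# Step 2: add up each column.
totals = [sum(column) for column in zip(*rows)]

print(f"{'x':>5}{'y':>5}{'x^2':>6}{'y^2':>6}{'xy':>6}")
for row in rows:
    print(f"{row[0]:>5}{row[1]:>5}{row[2]:>6}{row[3]:>6}{row[4]:>6}")
print("Totals:", totals)
```

The totals come out as 138, 146, 2218, 2496 and 2340, the same values that are substituted in Step 3.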

EXAMPLE 2.
The table below shows the number of absences of 5 Grade 11 students in their
Mathematics subject and their final grade. Compute the correlation coefficient and
interpret the result.

Student   Number of Absences (x)   Final Grade (y)
1 2 80
2 0 95
3 3 60
4 1 85
5 5 50

SOLUTION:

Step 1: Compute x², y² and xy. Present the data in tabular form.

Students x y x² y² xy
1 2 80 4 6400 160
2 0 95 0 9025 0
3 3 60 9 3600 180
4 1 85 1 7225 85
5 5 50 25 2500 250

Step 2: Find the sum of all the entries in each column.

Students x y x² y² xy
1 2 80 4 6400 160
2 0 95 0 9025 0
3 3 60 9 3600 180
4 1 85 1 7225 85
5 5 50 25 2500 250
Σ = 11  Σ = 370  Σ = 39  Σ = 28750  Σ = 675

Step 3: Calculate the Pearson’s sample correlation coefficient by substituting the values
obtained from Step 2 in the formula.

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
r = [(5)(675) − (11)(370)] / √{[5(39) − (11)²][5(28750) − (370)²]}
r = −695 / √[(74)(6850)]
r = −0.98

There is a very strong negative correlation between the number of absences and the final
grade of the Grade 11 students since the value of r, which is −0.98, falls within the range of
−1 to −0.80. This means that students with more absences tend to have lower final grades.

EXAMPLE 3.

A researcher wants to know if there is a negative linear relationship between the number
of hours spent playing online games and the final grade of Grade 11 students. Do you think that
the number of hours spent playing online games has a negative linear relationship with their
final grade?

Students   No. of Hours in Playing Online Games (x)   Final Grade (y)
1 2 90
2 5 83
3 3 85
4 4 80
5 1 94
6 6 75
7 5 78
8 7 75
9 3 91
10 1 92

SOLUTION:

Step 1: Compute x², y² and xy. Present the data in tabular form.
Students x y x² y² xy
1 2 90 4 8100 180
2 5 83 25 6889 415
3 3 85 9 7225 255
4 4 80 16 6400 320
5 1 94 1 8836 94
6 6 75 36 5625 450
7 5 78 25 6084 390
8 7 75 49 5625 525
9 3 91 9 8281 273
10 1 92 1 8464 92

Step 2: Find the sum of all the entries in each column.


Students x y x² y² xy
1 2 90 4 8100 180
2 5 83 25 6889 415
3 3 85 9 7225 255
4 4 80 16 6400 320
5 1 94 1 8836 94
6 6 75 36 5625 450
7 5 78 25 6084 390
8 7 75 49 5625 525
9 3 91 9 8281 273
10 1 92 1 8464 92
Σ = 37  Σ = 843  Σ = 175  Σ = 71529  Σ = 2994

Step 3: Calculate the Pearson’s sample correlation coefficient by substituting the values
obtained from Step 2 in the formula.

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
r = [(10)(2994) − (37)(843)] / √{[10(175) − (37)²][10(71529) − (843)²]}
r = −1251 / √[(381)(4641)]
r = −0.94

Based on the obtained value of r, which is −0.94, we can conclude that there is a
very strong negative linear relationship between the number of hours spent playing online games
and the final grade of the Grade 11 students. Students who spend more hours playing online
games tend to have lower final grades.

Learning Competency

Calculates the Pearson’s sample correlation coefficient (Quarter 4, Week 8, M11/12SP-IVh-2)

ACTIVITY 1

Directions: Compute r for each of the following:


[5 points each]

1. 𝜮𝒙 = 𝟐𝟎𝟎 2. 𝜮𝒙 = 𝟏𝟎 3. 𝜮𝒙 = 𝟏𝟓𝟎
𝜮𝒚 = 𝟐𝟓 𝜮𝒚 = 𝟏𝟓 𝜮𝒚 = 40
𝜮𝒙𝟐 = 8775 𝜮𝒙𝟐 = 𝟑𝟓 𝜮𝒙𝟐 = 𝟏𝟎𝟐𝟐𝟓
𝜮𝒚𝟐 = 𝟏𝟕𝟓 𝜮𝒚𝟐 = 𝟒𝟗 𝜮𝒚𝟐 = 𝟐𝟏𝟓
𝜮𝒙𝒚 = 𝟕𝟐𝟓 𝜮𝒙𝒚 = 𝟑𝟔 𝜮𝒙𝒚 = 𝟔𝟎𝟎
𝒏=𝟖 𝒏=𝟓 𝒏 = 𝟏𝟎

4. 𝜮𝒙 = 𝟑𝟕 5. 𝜮𝒙 = 𝟏𝟖
𝜮𝒚 = 𝟏𝟑𝟗 𝜮𝒚 = 𝟓𝟔𝟒
𝜮𝒙𝟐 = 𝟑𝟕𝟓 𝜮𝒙𝟐 = 𝟕𝟑
𝜮𝒚𝟐 = 𝟒𝟏𝟑𝟓 𝜮𝒚𝟐 = 𝟒6770
𝜮𝒙𝒚 = 𝟏𝟏𝟖𝟗 𝜮𝒙𝒚 = 𝟏𝟑𝟕𝟓
𝒏=𝟓 𝒏=𝟕

ACTIVITY 2

Directions: Complete the tables below and compute the correlation coefficient.

1.
𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
6 35
7 37
8 32
9 39
10 43
11 48
12 48
13 50
14 47
15 51
𝜮= 𝜮= 𝜮= 𝜮= 𝜮=

2.
𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
10 95
20 85
30 80
40 55
50 75
60 45
70 25
80 25
90 15
100 10
𝜮= 𝜮= 𝜮= 𝜮= 𝜮=

Activity 3

Directions: For each of the following sets of data, compute the correlation coefficient
and interpret the result. Show your complete solution.

1. The table shows the average number of hours students spent watching television
and their General Weighted Average.

Hours spent watching T. V 7 1 0 5 8 9 10 3


General Weighted Average 85 92 91 85 75 79 70 88

2. The table below shows the age of a car (in years) and the distance it travels (km/L).

Age of a car 1 2 3 4 5 6 7 8 9 10
Distance travelled 20 18 16 13 15 13 12 10 11 7

3. The table shows the number of study hours and the number of sleeping hours of seven
students.

Number of Study Hours 2 3 5 7 8 9 10


Number of Sleeping Hours 10 9 7 7 6 5 4

Rubric for Scoring

Activity 2
• On the table
✓ All data were entered correctly into the table. (15 points)
✓ Almost all data were entered correctly into the table. (10 points)
✓ Few data were entered correctly into the table. (5 points)
✓ No data were entered correctly into the table. (No points)

• On the computation
✓ With solution and correct answer (5 points)
✓ With solution but wrong answer (1 point)
✓ Without solution but with correct answer (2 points)
Activity 3
• On the table
✓ All data were entered correctly into the table. (15 points)
✓ Almost all data were entered correctly into the table. (10 points)
✓ Few data were entered correctly into the table. (5 points)
✓ No data were entered correctly into the table. (No points)
• On the computation
✓ With solution and correct answer (5 points)
✓ With solution but wrong answer (1 point)
✓ Without solution but with correct answer (2 points)
• On the interpretation
✓ Correct interpretation (5 points)

REFERENCES

Belecina, R. R., et al. (2016). Statistics and Probability, pp. 293–301.

Lim, Yvette, et al. (2016). Math for Engaged Learning: Statistics and Probability.

REFLECTION

What is the implication of the topic in your life?

___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________

ANSWER KEY
Activity 1

1. r = 0.17
2. r = 0.77
3. r = 0
4. r = 0.97
5. r = -0.40

Activity 2
1.
x      y      x²      y²       xy
6      35     36      1225     210
7      37     49      1369     259
8      32     64      1024     256
9      39     81      1521     351
10     43     100     1849     430
11     48     121     2304     528
12     48     144     2304     576
13     50     169     2500     650
14     47     196     2209     658
15     51     225     2601     765
Σ = 105   Σ = 430   Σ = 1185   Σ = 18906   Σ = 4683
r = 0.91

2.
𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
10 95 100 9025 950
20 85 400 7225 1700
30 80 900 6400 2400
40 55 1600 3025 2200
50 75 2500 5625 3750
60 45 3600 2025 2700
70 25 4900 625 1750
80 25 6400 625 2000
90 15 8100 225 1350
100 10 10000 100 1000
𝜮 = 𝟓𝟓𝟎 𝜮 = 𝟓𝟏𝟎 𝜮 = 38500 𝜮 =34900 𝜮 = 19800
r = -0.96

Activity 3
1.
𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
7 85 49 7225 595
1 92 1 8464 92
0 91 0 8281 0
5 85 25 7225 425
8 75 64 5625 600
9 79 81 6241 711
10 70 100 4900 700
3 88 9 7744 264
𝜮 = 43 𝜮 = 𝟔𝟔𝟓 𝜮 = 329 𝜮 = 55705 𝜮 = 3387
r = -0.92
Interpretation: There is a very strong negative correlation between the average
number of hours spent and the general weighted average of the student, since r (-0.92) falls
within the range of -1.00 to -0.80. This means that as the number of hours in watching T.V
increases, the GWA tends to decrease.

2.
x      y      x²      y²      xy
1      20     1       400     20
2      18     4       324     36
3      16     9       256     48
4      13     16      169     52
5      15     25      225     75
6      13     36      169     78
7      12     49      144     84
8      10     64      100     80
9      11     81      121     99
10     7      100     49      70
Σ = 55   Σ = 135   Σ = 385   Σ = 1957   Σ = 642
r = -0.95
Interpretation: There is a very strong negative correlation between the age of a car
and the distance it travels.

3.
𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
2 10 4 100 20
3 9 9 81 27
5 7 25 49 35
7 7 49 49 49
8 6 64 36 48
9 5 81 25 45
10 4 100 16 40
𝜮 = 𝟒𝟒 𝜮 = 48 𝜮 = 332 𝜮 = 𝟑𝟓𝟔 𝜮 = 𝟐𝟔𝟒
r = -0.98
Interpretation: There is a very strong negative correlation between the number of
study hours and the number of sleeping hours of the students.

Prepared by:

JAYLORD R. MENOR

STATISTICS AND PROBABILITY
Name: ____________________________________ Grade Level: ___________________
Section: __________________________________ Date: _________________________

LEARNING ACTIVITY SHEET


SOLVING PROBLEMS INVOLVING CORRELATION
ANALYSIS

Background Information for Learners


The correlation coefficient measures the direction and strength of the linear association
between two variables. For the sample correlation coefficient, we use the symbol r. The value
of r ranges from -1 to 1, and the strength of the correlation is judged from the computed value.
If the computed correlation coefficient is close to 1, the two variables have a strong
positive correlation. If the computed r is close to -1, the two variables have a strong
negative correlation. If r is 0, the two variables have no linear correlation.
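The verbal descriptions of r used in this activity sheet can be sketched as a small Python function. The cut-off values 0.20, 0.40, 0.60, and 0.80 are an assumption made for illustration (different textbooks use different ranges), and the function name describe_correlation is hypothetical.

```python
def describe_correlation(r):
    """Return a verbal description of a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    # Assumed cut-offs for illustration only; check the scale your class uses.
    if size < 0.20:
        strength = "very weak"
    elif size < 0.40:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    elif size < 0.80:
        strength = "strong"
    else:
        strength = "very strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


print(describe_correlation(-0.92))  # very strong negative correlation
print(describe_correlation(0.30))   # weak positive correlation
```

The sign of r gives the direction and the absolute value gives the strength, which is why the function works with abs(r).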

To compute r, we use the formula

$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}}$$

where:
n = number of pairs of values
Σx = sum of the x values
Σy = sum of the y values
Σx² = sum of the squared values of x
Σy² = sum of the squared values of y
Σxy = sum of the products of x and y
r = Pearson's correlation coefficient
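As a minimal sketch, the formula can be written as a Python function that takes the six quantities defined above. The function name pearson_r_from_sums and the three data pairs (1, 2), (2, 4), (3, 7) are hypothetical and are used only to show the substitution.

```python
from math import sqrt


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r computed from the column totals of the x, y, x^2, y^2, xy table."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: (1, 2), (2, 4), (3, 7)
# n = 3, sum x = 6, sum y = 13, sum x^2 = 14, sum y^2 = 69, sum xy = 31
r = pearson_r_from_sums(n=3, sum_x=6, sum_y=13, sum_x2=14, sum_y2=69, sum_xy=31)
print(round(r, 2))  # 0.99
```

The sketch does not handle the case where all x values (or all y values) are equal; the denominator is then zero and r is undefined.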

EXAMPLE 1.
A Mathematics teacher wants to determine the relationship between the time his students
spent studying for their Final Exam in Statistics and Probability and their Final Exam scores.
Ten students were randomly selected from his class. The number of hours they spent studying and
the scores they obtained are shown in the table below. Compute Pearson's sample correlation
coefficient.
Students Hours spent in studying (x) Final Exam Score (y)
1 1 17
2 0 11
3 2 29
4 3 30
5 5 47
6 4 35
7 2 24
8 6 49
9 5 45
10 1 15

SOLUTION:

Students 𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
1 1 17 1 289 17
2 0 11 0 121 0
3 2 29 4 841 58
4 3 30 9 900 90
5 5 47 25 2209 235
6 4 35 16 1225 140
7 2 24 4 576 48
8 6 49 36 2401 294
9 5 45 25 2025 225
10 1 15 1 225 15
𝜮 =29 𝜮 = 𝟑𝟎𝟐 𝜮 = 121 𝜮 = 10812 𝜮 = 𝟏𝟏𝟐𝟐

$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}}$$

$$r = \frac{(10)(1122) - (29)(302)}{\sqrt{[10(121) - (29)^2][10(10812) - (302)^2]}}$$

$$r = 0.99$$

There is a very strong positive correlation between the number of hours the students
spent studying for their final exam and their final exam scores.
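The solution to Example 1 can be checked with a short Python sketch that builds the column totals directly from the data and then applies the formula. The variable names are illustrative choices, not part of the original solution.

```python
from math import sqrt

# Data from Example 1
hours = [1, 0, 2, 3, 5, 4, 2, 6, 5, 1]             # x
scores = [17, 11, 29, 30, 47, 35, 24, 49, 45, 15]  # y

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))

# Pearson's sample correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 29 302 121 10812 1122
print(round(r, 2))                           # 0.99
```

Printing the totals first makes it easy to compare them with the last row of the table before checking r itself.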

EXAMPLE 2.
A researcher wants to determine if there is a relationship between the age and recovery
time (in days) of COVID-19 patients. The table below shows the age and the recovery time of
randomly selected patients. Calculate the correlation coefficient and interpret the result.

Age (x) 16 24 27 30 38 46 62 69 93
Recovery Time (y, in days) 14 17 15 23 31 21 27 30 3

SOLUTION:

Patients 𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
1 16 14 256 196 224
2 24 17 576 289 408
3 27 15 729 225 405
4 30 23 900 529 690
5 38 31 1444 961 1178
6 46 21 2116 441 966
7 62 27 3844 729 1674
8 69 30 4761 900 2070
9 93 3 8649 9 279
𝜮 = 𝟒𝟎𝟓 𝜮 = 𝟏𝟖𝟏 𝜮 = 𝟐𝟑𝟐𝟕𝟓 𝜮 = 4279 𝜮 = 𝟕𝟖𝟗𝟒

$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}}$$

$$r = \frac{(9)(7894) - (405)(181)}{\sqrt{[9(23275) - (405)^2][9(4279) - (181)^2]}}$$

$$r = -0.14$$

The computed correlation coefficient r = -0.14 shows that the two variables have a very
weak negative linear relationship.
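For readers who want to follow the arithmetic of Example 2, the intermediate values can be written out as below. This is only an expanded version of the substitution; the square root is rounded to one decimal place.

$$r = \frac{71046 - 73305}{\sqrt{(209475 - 164025)(38511 - 32761)}} = \frac{-2259}{\sqrt{(45450)(5750)}} \approx \frac{-2259}{16165.9} \approx -0.14$$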

EXAMPLE 3.
Find the correlation coefficient of the data below showing the weight (in kg) and the
pulse rate (in bpm) of seven randomly selected individuals.

Weight (x)        40   60   75    35   53    53   86
Pulse rate (y)    55   65   105   55   100   70   88

SOLUTION:

𝒙 𝒚 𝒙𝟐 𝒚𝟐 𝒙𝒚
40 55 1600 3025 2200
60 65 3600 4225 3900
75 105 5625 11025 7875
35 55 1225 3025 1925
53 100 2809 10000 5300
53 70 2809 4900 3710
86 88 7396 7744 7568
𝜮 = 𝟒𝟎𝟐 𝜮 = 𝟓𝟑𝟖 𝜮 = 𝟐𝟓𝟎𝟔𝟒 𝜮 = 43944 𝜮 = 𝟑𝟐𝟒𝟕𝟖

$$r = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{[n\Sigma x^2 - (\Sigma x)^2][n\Sigma y^2 - (\Sigma y)^2]}}$$

$$r = \frac{(7)(32478) - (402)(538)}{\sqrt{[7(25064) - (402)^2][7(43944) - (538)^2]}}$$

$$r = 0.70$$

Based on the computed value of r, there is a strong positive linear relationship
between weight and pulse rate. Note, however, that even if two variables are linearly
related, it does not necessarily mean that one variable causes the change in the other,
because there can be other factors not accounted for by the relationship. For example,
in this problem the two variables have a linear relationship, but the pulse rate may
also depend on the age or health condition of the respondents.

Learning Competency

Solves problems involving correlation analysis (Quarter 2, Week 8, M11/12SP-IVh3)

Activity 1

Directions: Categorize the following r values. [1 point each]
1. r = -0.81
2. r = 0.30
3. r = 0.01
4. r = -0.51
5. r = 0.90
6. r = -0.92
7. r = 1.00
8. r = 0
9. r = -0.46
10. r = 0.23

Activity 2

Directions: Determine whether each statement is TRUE or FALSE. Write POSITIVE if
the statement is true. If it is false, underline the word/phrase that makes it wrong and
change it to make the statement correct.
[2 points each]

1. Correlation is positive when the values increase together.
2. If the correlation coefficient of two variables is 0.03, it means that they have a strong
positive correlation.
3. The sign of the correlation indicates the strength of the association.
4. A perfect positive correlation means that the variables tend to move in the same
direction.
5. The coefficient can take any value from 0 – 1.

Activity 3

Directions: Calculate the correlation coefficient of each data below and determine if
the relationship is strong or weak, positive or negative.
[5 points each]

1.
x 50 55 60 65 70 75 80 85 90 95 45 40
y 40 45 45 50 63 61 70 30 51 60 25 30

2.
x 19 17 18 23 15 8 31 27 21
y 24 22 23 28 20 13 36 32 26
3.
x 3 6 9 12 15
y 0 3 6 9 12

Activity 4

Directions: Interchange the values of x and y in Activity 3 then compute for r. Compare
the results and make a conclusion.
[ 10 points]

Activity 5

Directions: At home, select five members of your family to be the respondents and
gather the following data: height (in cm), arm span (in cm), age, and number of glasses
of water drunk in a day. Using the data you have gathered, compute the correlation
coefficient of each pair of variables below. Interpret the result.

a. Height and length of arm span
b. Age and number of glasses of water drunk in a day

RUBRIC FOR SCORING

Activity 4

• On the table
✓ All data were entered correctly into the table. (15 points)
✓ Almost all data were entered correctly into the table. (10 points)
✓ Few data were entered correctly into the table. (5 points)
✓ No data were entered correctly into the table. (No points)
• On the computation
✓ With solution and correct answer (5 points)
✓ With solution but wrong answer (1 point)
✓ Without solution but with correct answer (2 points)
• On the interpretation
✓ Correct interpretation (5 points)

REFERENCES

Belecina, R. R., et al. (2016). Statistics and Probability, pp. 293–301.

Lim, Yvette, et al. (2016). Math for Engaged Learning: Statistics and Probability.

REFLECTION

What have you learned from the topic?

___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________

ANSWER KEY

Activity 1

1. Very strong negative correlation
2. Weak positive correlation
3. Very weak positive correlation
4. Moderate negative correlation
5. Very strong positive correlation
6. Very strong negative correlation
7. Very strong positive correlation
8. No correlation / Very weak correlation
9. Moderate negative correlation
10. Weak positive correlation

Activity 2

1. POSITIVE
2. strong ------> very weak
3. strength ------> direction
4. POSITIVE
5. 0 – 1 --------> -1 to 1

Activity 3

1. r = 0.59 (moderate, positive)
2. r = 1 (strong, positive)
3. r = 1 (strong, positive)

Activity 4

1. r = 0.59
2. r = 1
3. r = 1

Conclusion: The Pearson’s correlation coefficient r is still the same even if the
values of x and y are interchanged.
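The conclusion follows from the formula itself: swapping x and y leaves the numerator unchanged and only reverses the order of the two factors under the square root. A minimal Python sketch, using hypothetical data and a hypothetical helper named pearson_r, illustrates this.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equally long lists of numbers."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )


# Hypothetical data, used only for illustration
x = [2, 4, 5, 7, 9]
y = [10, 9, 7, 6, 3]

r_xy = pearson_r(x, y)  # x and y in the original order
r_yx = pearson_r(y, x)  # x and y interchanged
print(round(r_xy, 4) == round(r_yx, 4))  # True
```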

Prepared by:

JAYLORD R. MENOR

STATISTICS AND PROBABILITY
Name of Learner: ___________________________ Grade Level: __________
Section:___________________________________ Date: ________________

LEARNING ACTIVITY SHEET


REGRESSION ANALYSIS
Background Information for Learners
Your knowledge of calculating the slope and y-intercept of a regression line will help you
obtain the regression equation in the form y′ = b0 x + b1, where b0 is the slope of the regression
line, b1 is the y-intercept of the regression line, x is the value of the independent variable, and
y′ is the predicted value.
For example, if the computed slope and y-intercept of a regression line are
24.31 and 3.25, respectively, then the regression equation is y′ = 24.31x + 3.25.
Regression Analysis- the process of predicting the value of one variable in terms of the other
variable.
Regression equation- the algebraic expression of the regression line.
We can predict the value of one variable in terms of the other variable as long as the
correlation between the two variables is statistically significant.
Example: The data below show the number of absences and the number of missed quizzes of 6
students. If there is a significant relationship between the two variables, predict the number of
quizzes missed by a student who was absent for 7 days.
Number of Absences Number of Missed Quizzes
1 1
1 2
2 2
2 3
3 3
4 5
Step 1. Identify the dependent and independent variables.
- In the given data, the independent variable is the number of absences while the
dependent variable is the number of missed quizzes.

Step 2. Compute the correlation coefficient r using the formula:

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$

- You need to find the values of ΣX, ΣY, ΣX², ΣY², and ΣXY and substitute them in
the formula.
X      Y      X²     Y²     XY
1      1      1      1      1
1      2      1      4      2
2      2      4      4      4
2      3      4      9      6
3      3      9      9      9
4      5      16     25     20
ΣX = 13   ΣY = 16   ΣX² = 35   ΣY² = 52   ΣXY = 42

$$r = \frac{6(42) - (13)(16)}{\sqrt{[(6)(35) - (13)^2][(6)(52) - (16)^2]}}$$

$$r = 0.9183$$

- The computed r is 0.9183, indicating a very high positive correlation.

Step 3. Test the significance of r using the formula:

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

- In the given data, n = 6 and r = 0.9183.

$$t = 0.9183\sqrt{\frac{6-2}{1-0.9183^2}}$$

$$t = 4.64$$

Step 4. Compare the computed t-value to the critical value.
- Using df = n - 2 = 6 - 2 = 4 and α = 0.05 in a two-tailed test, you can find in the t-table that
the critical value of t is 2.77645. Since the computed t-value (4.64) is greater than the
critical value, the correlation is significant.

Step 5. Make a decision and summarize the result.
- There is enough evidence to conclude that there is a significant relationship between
number of absences and number of missed quizzes. Therefore, you can proceed to
regression analysis.
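Steps 3 to 5 can be sketched in Python as follows. The function name t_statistic is hypothetical, and the critical value is simply copied from Step 4 (read from a t-table) rather than computed by the code.

```python
from math import sqrt


def t_statistic(r, n):
    """Test statistic for the significance of r, with n pairs of values."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = 0.9183            # computed in Step 2
n = 6                 # number of students
critical_t = 2.77645  # from the t-table: df = 4, alpha = 0.05, two-tailed

t = t_statistic(r, n)
is_significant = abs(t) > critical_t

print(round(t, 2))     # 4.64
print(is_significant)  # True
```

The absolute value of t is compared with the critical value because the test is two-tailed, so a large negative t would also indicate a significant relationship.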

Step 6. Compute the values of b0 and b1 in the regression equation y′ = b0 x + b1 using the
following formulas.

$$b_0 = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \qquad b_1 = \frac{\sum y - b_0 \sum x}{n}$$

- Using the values obtained in Step 2, we have the following:

$$b_0 = \frac{6(42) - (13)(16)}{6(35) - (13)^2} \qquad b_1 = \frac{16 - (1.0732)(13)}{6}$$

$$b_0 = 1.0732 \qquad b_1 = 0.3414$$
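The two formulas in Step 6 can be sketched as a Python function. The name regression_coefficients is hypothetical, and the function follows this sheet's notation, in which b0 is the slope and b1 is the y-intercept.

```python
def regression_coefficients(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (b0, b1): the slope and y-intercept of the regression line."""
    b0 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    b1 = (sum_y - b0 * sum_x) / n                                  # y-intercept
    return b0, b1


# Totals from the table in Step 2
b0, b1 = regression_coefficients(n=6, sum_x=13, sum_y=16, sum_x2=35, sum_xy=42)
print(round(b0, 4))  # 1.0732
```

Because the code keeps all the decimal places of b0 when computing b1, its b1 can differ in the last decimal place from a hand computation that uses the rounded slope 1.0732.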

Step 7. Form the regression equation.
- Substitute the computed values of the slope and y-intercept in the regression equation.
y′ = b0 x + b1
y′ = 1.0732x + 0.3414

Step 8. Predict the number of missed quizzes by a student who was absent for 7 days.
- Find the value of y′ when x is 7 in the regression equation.
y′ = 1.0732x + 0.3414
y′ = 1.0732(7) + 0.3414
y′ = 7.8538
- Therefore, the predicted number of missed quizzes of a student who was absent for
7 days is approximately 8 quizzes. Remember that this is just a predicted value
based on the given data.
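The prediction step is a single substitution, sketched below in Python. The function name predict is hypothetical; the slope and intercept are the rounded values from Step 6.

```python
def predict(x, slope, intercept):
    """Predicted value y' = slope * x + intercept."""
    return slope * x + intercept


missed_quizzes = predict(7, slope=1.0732, intercept=0.3414)
print(round(missed_quizzes, 4))  # 7.8538
print(round(missed_quizzes))     # 8
```

Note that x = 7 lies outside the observed range of absences in the data (1 to 4 days), so this prediction extends the line beyond the data and should be read with extra caution.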

Learning Competency with code


The learner is able to predict the value of the dependent variable given the value of the
independent variable. M11/12SP-IVj-1

Exercise 1: Directions: Formulate the regression equation given the following set of slope and
y-intercept of a regression line. [1 point each item]
1. b0 = 12.145 ; b1 = 2.235
2. b0 = 36.57 ; b1 = 8.34
3. b0 = 24.63 ; b1 = 8.22
4. b0 = 56.2 ; b1 = 9.23
5. b0 = 74.82 ; b1 = 15.04

Exercise 2: Directions: Given the regression line equation, find the value of the other variable
that is being asked. [2 points each item]

1. y = 1.32𝑥 + 3.5
a. What is the value of y if x = 4? b. What is the value of y if x = 5?

2. y = 4.11𝑥 + 5.62
a. What is the value of y if x = 10? b. What is the value of y if x = 16?

3. y = 8.03𝑥 + 2.14
a. What is the value of y if x = 6? b. What is the value of y if x = 9?

4. y = 51.65𝑥 + 13.9
a. What is the value of y if x = 12? b. What is the value of y if x = 18?

Exercise 3: Directions: Based on the scatter plot below, predict the value of variable y for
each of the following values of the independent variable x. [1 point each item]

[Scatter plot with regression line: horizontal axis X marked from 1 to 12, vertical axis Y]

1. x = 3.5
2. x = 5.2
3. x = 10
4. x = 6.8
5. x = 15

Exercise 4. Directions: Read and analyze each problem. Your task is to answer the questions
posed after each item. [6 points each item]
1. A sorbetes vendor observed that whenever the temperature was high, his sales also
increased. He recorded the data so that he could predict his future sales. These data are
shown in the table.
Temperature in ℃ (x) Sorbetes Sales in Php (y)
33 1300
36 1540
38 1890
40 2300
35 1600
30 1150
28 980
a. Predict the sales of the sorbetes vendor on a day when the temperature is 37℃.
b. What sales can he expect if the temperature drops to 26℃?
c. Estimate the amount of sales if the temperature is 37℃.

2. The table below shows the titles, number of chapters and the total number of pages of
Harry Potter books.
Title of the Book Number of Number of
Chapters Pages
Harry Potter and the Philosopher’s Stone 17 223
Harry Potter and the Chamber of Secrets 18 251
Harry Potter and the Prisoner of Azkaban 22 317
Harry Potter and the Goblet of Fire 37 636
Harry Potter and the Order of the Phoenix 38 766
Harry Potter and the Half-Blood Prince 30 607
Harry Potter and the Deathly Hallows 37 607
a. Use this data to predict how many chapters the author could make if the book had
345 pages.
b. Based on the data, estimate the number of pages of a Harry Potter book with 33
chapters.
c. Predict how many pages are there in a 23-chapter book.

Closure
3-2-1 Check
Write three things you have learned in the activity.
• __________________________________________________________________
• __________________________________________________________________
• __________________________________________________________________
Write two questions you have about regression.
• __________________________________________________________________
• __________________________________________________________________
Rate your understanding of predicting the value of one variable in terms of another
variable on a scale of 1 to 10, where 10 is the highest and 1 is the lowest.
• __________________________________________________________________

References
Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and probability. Manila, Philippines:
Rex Printing Company, Inc.
Ocampo, J. & Marquez, W. (2016). Conceptual math and beyond statistics and probability.
Manila, Philippines: Brilliant Creations Publishing, Inc.
https://numberbender.com/lessons/view/752/6.3-Calculate-the-Slope-and-the-Y-intercept-of-the-Regression-Line
https://www.academia.edu/29721020/TEACHING_GUIDE_FOR_SENIOR_HIGH_SCHOOL_Statistics_and_Probability_CORE_SUBJECT_Commission_on_Higher_Education_in_collaboration_with_the_Philippine_Normal_University

Answer Key
Exercise 1
1. y′ = 12.145x+ 2.235
2. y′ = 36.57x+ 8.34
3. y′ = 24.63x+ 8.22
4. y′ = 56.2x+ 9.23
5. y′ = 74.82x+ 15.04
Exercise 2
1. a. y′ = 8.78
b. y′ = 10.1
2. a. y′ = 46.72
b. y′ = 71.38
3. a. y′ = 50.32
b. y′ = 74.41
4. a. y′ = 633.7
b. y′ = 943.6

Exercise 3
1. y′ = 3.7306
2. y′ = 4.6656
3. y′ = 7.3056
4. y′ = 5.5456
5. y′ = 10.0556
Exercise 4
1. a. Php 1814.5953
b. Php 690.1828
c. Php 1814.5953
2. a. 23 chapters
b. 590 pages
c. 365 pages
Prepared by:
CINDY L. AQUINO
Luna General Comprehensive High School

STATISTICS AND PROBABILITY
Name of Learner: ___________________________ Grade Level: __________
Section:___________________________________ Date: ________________

LEARNING ACTIVITY SHEET


REGRESSION ANALYSIS
Background Information for Learners
Regression Analysis- the process of taking a set of data and using it to predict an
outcome.
Regression equation- the algebraic expression of the regression line.
Linear Regression- regression in which the regression curve is a straight line.
Regression Line- the line that best fits the data.
There are different kinds of regression analysis, but you will focus only on linear
regression, where you deal with one independent variable and one dependent variable.
The main goal of regression analysis is to determine the regression line that will be used in
prediction. The easiest way to draw the regression line is by using Microsoft Excel or other
software for regression analysis, but you can also do it manually by using a ruler to draw a line
through the area where approximately half of the points are on each side of the line.
Examples:
[Figures: a best-fitting regression line and a regression line that is not the best fit]

Whenever there is a significant relationship between two variables, you can
proceed to regression analysis. In many cases, regression analysis is used in businesses to
predict future sales using previous records of sales and production. Manufacturers also
make predictions of their income based on production costs. School administrators can
estimate the future number of enrollees based on student enrollment data from a number of
previous years.
For further understanding of regression analysis, let us take a look at the given
example.

Example: It is believed that there is a significant correlation between age and height. Let us
consider the data below, based on the BMI data of 10 students in Luna General Comprehensive
High School.
Student Age (x) Height in inches (y)
A 12 48
B 13 52
C 13 50
D 14 51
E 14 54
F 14 53
G 15 60
H 16 57
I 18 64
J 19 67
a. Formulate the regression equation that will predict the height of a student in terms of age.
b. Draw the scatter plot and regression line of the data.
c. Predict the height of a 17-year-old student.

Step 1. Identify the dependent and independent variables.
- In the given data, the independent variable is age and the dependent variable is
height.
Step 2. Compute the correlation coefficient r.

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$

- We need the values of ΣX, ΣY, ΣX², ΣY², and ΣXY.


Age (X) Height in inches (Y) 𝑿𝟐 𝒀𝟐 𝑿𝒀
12 48 144 2304 576
13 52 169 2704 676
13 50 169 2500 650
14 51 196 2601 714
14 54 196 2916 756
14 53 196 2809 742
15 60 225 3600 900
16 57 256 3249 912
18 64 324 4096 1152
19 67 361 4489 1273
∑ 𝑋 =148 ∑ 𝑌 = 556 ∑ 𝑋 2 = 2236 ∑ 𝑌 2 = 31,268 ∑ 𝑋𝑌 = 8351

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$

$$r = \frac{10(8351) - (148)(556)}{\sqrt{[10(2236) - (148)^2][10(31268) - (556)^2]}}$$

$$r = 0.9613$$

- The computed value of r is 0.9613, indicating a very high positive correlation.

Step 3. Test the significance of r using the formula

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

- In the given data, n = 10 and r = 0.9613.

$$t = 0.9613\sqrt{\frac{10-2}{1-0.9613^2}}$$

$$t = 9.86907$$

Step 4. Compare the computed t-value to the critical value.
- Using df = n - 2 = 10 - 2 = 8 and α = 0.05 in a two-tailed test, you can find in the t-table that the
critical value of t is 2.306. Since the computed t-value (9.86907) is greater than the critical
value, the correlation is significant.

Step 5. Make a decision and summarize the result.
- There is enough evidence to conclude that there is a significant relationship between
age and height. Therefore, you can proceed to regression analysis.

Step 6. Compute the values of slope (b0) and y-intercept (b1) and formulate the regression
equation.

$$b_0 = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \qquad b_1 = \frac{\sum y - b_0 \sum x}{n}$$

$$b_0 = \frac{10(8351) - (148)(556)}{10(2236) - (148)^2} \qquad b_1 = \frac{556 - (2.67982)(148)}{10}$$

$$b_0 = 2.67982 \qquad b_1 = 15.9386$$

- Therefore, the regression equation is y′ = 2.67982x + 15.9386

Step 7. Use the regression equation to predict the height of a 17-year-old student.
y′ = 2.67982x + 15.9386 Regression equation
y′ = 2.67982(17) + 15.9386 Substitute 17 for x.
y′ = 45.55694 + 15.9386 Compute the product of 2.67982 and 17.
y′ = 61.49554 Add 45.55694 and 15.9386.

- Therefore, the predicted height of a 17-year-old student is 61.49554 inches.
Remember that this is just a predicted value based on the given data.
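The whole procedure for the age and height example (Steps 2 to 7) can be sketched in one short Python script. The variable names are illustrative, and the critical value 2.306 is taken from Step 4 rather than computed. Because the script does not round intermediate values, its results may differ slightly in the last decimal places from the hand computation.

```python
from math import sqrt

# Data from the example
ages = [12, 13, 13, 14, 14, 14, 15, 16, 18, 19]     # x (independent)
heights = [48, 52, 50, 51, 54, 53, 60, 57, 64, 67]  # y (dependent), in inches

n = len(ages)
sum_x, sum_y = sum(ages), sum(heights)
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in heights)
sum_xy = sum(x * y for x, y in zip(ages, heights))

# Step 2: correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Steps 3 to 5: significance test (critical value from the t-table, df = 8)
t = r * sqrt((n - 2) / (1 - r ** 2))
is_significant = abs(t) > 2.306

# Step 6: slope (b0) and y-intercept (b1), in this sheet's notation
b0 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b1 = (sum_y - b0 * sum_x) / n

# Step 7: predicted height of a 17-year-old student
predicted_height = b0 * 17 + b1

print(round(r, 4), is_significant)   # 0.9613 True
print(round(b0, 5), round(b1, 4))    # 2.67982 15.9386
print(round(predicted_height, 1))    # 61.5
```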

Step 8. Let's go back to the questions asked earlier and answer them completely.


a. Formulate the regression equation that will predict the height of a student in terms of
age.
Answer: The regression equation is y′ = 2.67982x + 15.9386
b. Draw the scatter plot and regression line of the data.
Answer:
[Figure: scatter plot of age (x) and height (y) with the regression line]

c. Predict the height of a 17-year-old student.
Answer: The predicted height of a 17-year-old student is 61.49554 inches.

Learning Competency with code


The learner is able to solve problems involving regression analysis. M11/12SP-IVj-2

Exercise 1. Put a check mark (/) on the box if the scatter plot shows the best fitting regression
line and cross mark (x) if not. [1 point each item]

[Six scatter plots with fitted lines; images not reproduced. Sources:]
1. Source: https://www.tes.com/lessons/OrtWlCqUIqsGDw/line-of-best-fit-examples
2. Source: https://ammar-alyousfi.com/2018/machine-learning-linear-regression-simply-explained
3. Source: http://people.sabanciuniv.edu/yuki/ns101_lab/calc_tutorial_Fall2014.html
4. Source: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Correlation-Regression/BS704_Correlation-Regression_print.html
5. Source: https://www.expii.com/t/identify-trend-lines-on-graphs-4395
6. Source: https://highschoolmathteachers.com/wp-content/uploads/2015/09/Line-of-Best-Fit-Vocabulary-Notes-PDF.pdf

Exercise 2. Directions: Formulate the regression equation in predicting the value of y in terms
of x. [1 point each item]
1. X 4 5 7 9 10 11
Y 15 16 18 22 21 23

2. X 1 2 3 4 5 6
Y 77 75 78 80 83 82

3. X 5 4 6 6 7 2
Y 85 103 70 66 72 169

4. X 36 48 51 54 57 60
Y 86 90 91 93 94 95

Exercise 3. Directions: Analyze each problem and answer completely.


1. Filipinos love to take vacations every summer. People often plan ahead, especially in
deciding where to go, how many days to spend, and how much money to budget.
A survey of 10 families who took vacations in Baguio City was conducted,
and the data gathered are shown below. [4 points]
Number of Days Amount of Money
Spent in Vacation Spent
2 6000
2 7500
3 9000
3 10000
3 8000
3 11500
4 12000
4 11000
5 14000
6 17000
a. Draw a scatter plot of the data.
b. Formulate the regression equation that will predict the amount of money spent in
terms of the number of days spent on vacation.
c. Graph the regression line on the same coordinate system where you drew the
scatter plot.
d. Based on the data, how much money will be spent if they stay in Baguio City
for one week?

2. The data show the population of Luna, Isabela based on the conducted census.
[4 points]
Year Population
1990 12335
1995 13255
2000 14581
2007 15884
2010 18091
2015 19326
a. Draw a scatter plot of the data.

b. Find the regression equation that will predict the population of Luna, Isabela in
terms of year.
c. Graph the regression line on the same coordinate system where you drew the
scatter plot.
d. Predict the population of Luna, Isabela in 2025.

3. Because of the pandemic that is taking place in the world today, everyone is being
advised to stay in their homes. One of the effects of staying at home that people are now
complaining about is high power consumption. According to electricity experts, the cost
of electricity is rising because people spend more time using appliances such as
refrigerators, electric fans, air conditioners, televisions, and radios, and even gadgets such as
cellphones, laptops, and tablets or iPods. A survey of 7 households was conducted to
gather data on the number of appliances available at home and the monthly electricity
bill. The data are shown in the table. [5 points]
Number of Appliances Monthly Electricity Bill in
Available at Home Philippine Peso
4 560
6 765
7 990
8 932
10 1,432
12 1,904
14 2,470
a. Draw a scatter plot of the data.
b. Find the regression equation that will predict the monthly electricity bill in terms of
the number of appliances available at home.
c. Graph the regression line on the same coordinate system where you drew the
scatter plot.
d. Predict the monthly electric bill of a household with 17 appliances.
e. Estimate the monthly electric bill of a household with 11 appliances.

Exercise 4. Directions: In this activity, you can select one among the options for your
performance-based output.
a. Create a vlog explaining how to perform regression analysis.
b. Make a video clip showing your appreciation of the beauty of regression analysis in
business, decision-making, economics, etc. You can visit the website
https://study.com/academy/lesson/using-regression-analysis-in-business.html for a
sample video clip.
c. Write a poem about regression analysis with a minimum of five stanzas.
d. Compose a song about regression analysis. The recorded audio of your composition
will be submitted to your subject teacher.
e. Think of two variables in the real world and perform regression analysis on them. Be
sure that your chosen variables are measurable so that you can collect the needed data.
As much as possible, please avoid face-to-face contact when gathering data. You are required to
collect at least 10 pairs of data for this output. Use the format given below.

Name:_____________________________ Grade level & Section:________
Title of Task: Regression Analysis_______ Date:______________________

Independent Variable:_________________________________________
Dependent Variable:__________________________________________

DATA COLLECTION SHEET


Independent Variable X Dependent Variable Y

1. Is there a significant relationship between your data? Is it strong, moderate,
weak, or perfect? Describe why you think this correlation exists.

2. What is the computed value of the correlation coefficient r in your data? (Show
your solution)

3. Write the regression equation of your data.

4. Draw a scatter plot and regression line that represents your data.

Modified from https://www.npsd.k12.nj.us/cms/lib04/NJ01001216/Centricity/Domain/113/Statistics%20-%20End%20MP%201%20Performance%20Task.pdf

Rubric for Scoring


Mastery of the Topic
• 5 points: Complete mastery of topic
• 4 points: A lot of mastery of topic
• 3 points: Good mastery of topic
• 2 points: Little mastery of topic
• 1 point: No mastery of topic
Explanation
• 5 points: Explanation is detailed and clear
• 4 points: Explanation is clear
• 3 points: Explanation is a little difficult to understand
• 2 points: Explanation is difficult to understand and missed several details
• 1 point: Explanation is irrelevant
Creativity
• 5 points: Unique
• 4 points: Very creative
• 3 points: Creative
• 2 points: Somewhat creative
• 1 point: Needs to be more creative
Organization
• 5 points: Excellent organization
• 4 points: Very good organization
• 3 points: Good organization
• 2 points: Organization could be better
• 1 point: Organization needs improvement
Timeliness
• 5 points: Submitted on or before the deadline
• 4 points: Missed the due date by no more than one day
• 3 points: Missed the due date by three or more days
• 2 points: Missed the due date by five days
• 1 point: Missed the due date by more than a week
Modified from http://s3-us-west-1.amazonaws.com/powget/oral-presentation-rubric-read-think-write.html

Closure
What is your most essential learning about regression analysis?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________

References for Learners


Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and probability. Manila, Philippines:
Rex Printing Company, Inc.
Ocampo, J. & Marquez, W. (2016). Conceptual math and beyond statistics and probability.
Manila, Philippines: Brilliant Creations Publishing, Inc.
https://numberbender.com/subjects/view/philippines:%20%20statistics%20and%20probability:%20grade%2011%20or%20grade%2012/all
https://www.academia.edu/29721020/TEACHING_GUIDE_FOR_SENIOR_HIGH_SCHOOL_Statistics_and_Probability_CORE_SUBJECT_Commission_on_Higher_Education_in_collaboration_with_the_Philippine_Normal_University
https://study.com/academy/lesson/using-regression-analysis-in-business.html

Answer Key
Exercise 1.

[Answer figures not reproduced. Sources of the figures:]
Source: https://www.tes.com/lessons/OrtWlCqUIqsGDw/line-of-best-fit-examples
Source: https://ammar-alyousfi.com/2018/machine-learning-linear-regression-simply-explained
Source: http://people.sabanciuniv.edu/yuki/ns101_lab/calc_tutorial_Fall2014.html
Source: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Correlation-Regression/BS704_Correlation-Regression_print.html
Source: https://www.expii.com/t/identify-trend-lines-on-graphs-4395
Source: https://highschoolmathteachers.com/wp-content/uploads/2015/09/Line-of-Best-Fit-Vocabulary-Notes-PDF.pdf

Exercise 2.
1. ŷ = 1.15254X + 10.33051
2. ŷ = 1.45714X + 74.06667
3. ŷ = -20.625X + 197.29167
4. ŷ = 0.38333X + 71.95

Exercise 3
1. a & c.

b. ŷ = 2448.27586X + 2031.03448
d. Php 19168.9655

2. a & c

b. ŷ = 282.86654X - 550955.87542
d. The predicted population in 2025 is 21849.



3. a & c

b. ŷ = 191.66342X - 376.92412
d. Php 2881.3541
e. Php 1731.3735
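
As a check on answer 2d above, a prediction is obtained by substituting the given X into the regression equation. The sketch below substitutes X = 2025 into the equation from 2b; the helper name predict is hypothetical and is used only for illustration.

```python
def predict(slope, intercept, x):
    """Evaluate the regression line y-hat = slope * x + intercept."""
    return slope * x + intercept


# Equation from answer 2b: y-hat = 282.86654X - 550955.87542
population_2025 = predict(282.86654, -550955.87542, 2025)

# Round to the nearest whole person, as in answer 2d
print(round(population_2025))
```

Rounded to the nearest whole number, the result agrees with the predicted population of 21849 given in 2d.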

Exercise 4. Answers may vary.

Prepared by:

CINDY L. AQUINO
Luna General Comprehensive High School

