
SENIOR HIGH SCHOOL

STATISTICS &
PROBABILITY
QUARTER 4
MODULE 1

DO_Q4_STATISTICS & PROBABILITY_GRADE 11_MODULE1


Statistics & Probability
Alternative Delivery Mode
Quarter 4 – Weeks 1-10
Second Edition, 2022

Republic Act 8293, section 176 states that: No copyright shall subsist in any work of
the Government of the Philippines. However, prior approval of the government agency or office
wherein the work is created shall be necessary for the exploitation of such work for a profit.
Such agency or office may, among other things, impose as a condition the payment of
royalties.

Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names,
trademarks, etc.) included in this module are owned by their respective copyright holders.
Every effort has been exerted to locate and seek permission to use these materials from their
respective copyright owners. The publisher and authors do not represent nor claim ownership
over them.

Published by the Department of Education

Development Team of the Module

Writers: MARY ANN B. ORIO, Dalandanan National SHS, SDO-Valenzuela


JHERALD Q. GABICA, Bignay National SHS, SDO-Valenzuela
VIRGILIO G. VENTURA, Caruhatan National SHS, SDO-Valenzuela
DAISY LYN F. MARIANO, Parada National SHS, SDO-Valenzuela
Reviewers: REBECCA M. BIÑAS, Malinta National SHS, SDO- Valenzuela
Editors: HELEN P. ADVENCOLA, Valenzuela City School of Mathematics and Science
Layout Artist: OLIVER G. MARIANO, Sitero Francisco Memorial NHS, SDO-Valenzuela
RAPHAEL A. LOPEZ
Management Team:
MELITON P. ZURBANO, Assistant Schools Division Superintendent (OIC-SDS)
FILMORE A. CABALLERO, CID Chief
JEAN A. TROPEL, Division EPS In-Charge of LRMS
EDNA LLANERA, Division SHS Focal Person
MARILYN B. SORIANO, Division Mathematics Coordinator

Printed in the Philippines by ________________________

Department of Education – National Capital Region – SDO VALENZUELA

Office Address: Pio Valenzuela St., Marulas, Valenzuela City


Telefax: (02) 292 – 3247
E-mail Address: sdovalenzuela@deped.gov.ph
Targets:
1. Illustrate: (a) null hypothesis, (b) alternative hypothesis, (c) level of significance,
(d) rejection region, and (e) types of errors in hypothesis testing. M11/12SP-IVa-1
2. Identify the parameters to be tested given a real-life problem. M11/12SP-IVa-3

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.

1. It is also known as non-directional test.


A. One-tailed test C. Two-tailed test
B. Tailed test D. Three-tailed test
2. It refers to a statement that there is no difference between a parameter and a
specific value.
A. Tailed-test C. Null hypothesis
B. Alternative hypothesis D. Significant difference
3. It is the decision to reject or not to reject the null hypothesis in a given situation.
A. Conclusion C. Hypothesis
B. Directional test D. Significance
4. It refers to a statement that there is a difference between a parameter and a
specific value.
A. Alternative hypothesis C. Significant difference
B. Tailed test D. Null hypothesis
5. It is a classification of error wherein the decision to reject the null hypothesis
could be wrong.
A. Correct decision C. Type II error
B. Type I error D. Type III error

Lesson 1
Understanding Hypothesis Testing
In this lesson, the learners will understand the concepts of tests of hypotheses
on the population mean and population proportion.

In Statistics, decision making starts with a concern about a population
regarding its characteristics denoted by parameter values. We might be interested in
a population parameter like the mean or the proportion. For example, a
fisherman looks into several factors before deciding to go out to sea to catch fish.
In the same manner, a farmer’s decision on when to plant his crops and a politician’s
decision to approve a community agenda on environmental awareness are
examples of decisions that can be addressed by a statistical procedure called hypothesis
testing.

Hypothesis Testing is another area in Inferential Statistics. It is a
decision-making process for evaluating claims about a population based on the characteristics
of a sample purportedly coming from that population. The decision is whether the
characteristic is acceptable or not.
There are two types of hypotheses: the null hypothesis and alternative
hypothesis.
Null hypothesis is the hypothesis to be tested. It states an exact value about
the parameter. When the null hypothesis is rejected, this leads to another option,
which is the alternative hypothesis that allows for the possibility of many values.

Null hypothesis
• It is denoted by Ho.
• It is a statement which states that there is no significant difference between a
parameter and a specific value, or that there is no significant difference between
two parameters.
• It is a statement that asserts the value to which the population parameter is equal
and is presumed to be true.
• It is a statement of equality (=) or one which involves equality (≤ and ≥).

Alternative hypothesis
• It is denoted by Ha.
• It is a statement that there is a significant difference between a parameter and a
specific value, or that there is a significant difference between two parameters.
• It is a statement of inequality such as, ≠, < and >.

Example 1: The mean number of studying hours of a Grade 11 student is 6 hours.


Ho: The mean number of studying hours of a Grade 11 student is 6 hours.
In symbols: Ho: µ = 6.
Ha: The mean number of studying hours of a Grade 11 student is not equal to
6 hours.
In symbols: Ha: µ ≠ 6

Example 2: The mean height of a Grade 12 student is at least 150 cm.


Ho: The mean height of a Grade 12 student is at least 150 cm.
In symbols: Ho: µ ≥ 150
Ha: The mean height of a Grade 12 student is less than 150 cm.
In symbols: Ha: µ < 150

Directional versus Non-directional Test


In example 1 above, we can write the alternative hypothesis as:
a. The mean number of studying hours of a Grade 11 student is not equal to 6
hours. (In symbols, Ha : µ ≠ 6)

b. The mean number of studying hours of a Grade 11 student is greater than 6
hours. (In symbols, Ha : µ > 6)
c. The mean number of studying hours of a Grade 11 student is less than 6 hours.
(In symbols, Ha : µ < 6)
The appropriate choice among “not equal to”, “greater than”, and “less than”
in the alternative hypothesis depends on the design of the hypothesis test.
The design of a hypothesis test can be: (a) one-tailed test (also known as directional
test); or (b) two-tailed test (also known as non-directional test).
The two-tailed test (non-directional test) is the standard test used in much
research, and it compares the population parameter in both directions (left and right)
of the bell curve. On the other hand, a one-tailed test (directional test) is a test that
determines the relationship in only one direction, either the
left or the right tail of the curve.
LEVEL OF SIGNIFICANCE
The next step in hypothesis testing after the statement of the hypotheses is
the setting of the standard or criterion on which the decision will be based.
There are only two possible decisions to make in the process of
hypothesis testing: either “reject Ho” (Ha is supported) or “do not reject Ho” (Ha is not
supported). The decision to reject the null hypothesis is called significance, and it should
be based on a criterion of judgment called the level of significance, denoted by the
lowercase Greek letter alpha, α.
● Significance is reached when the p-value of the statistic is less than the level
of significance.
● In general, statisticians conventionally set the level of
significance at 1%, 5%, or 10%.
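
The decision rule in the first bullet can be sketched in a few lines of Python. This is only an illustration: the function name `decide` is hypothetical, and the p-value of 0.15 with α = 0.05 is borrowed from Example 4 of this lesson.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: significance is reached when p-value < alpha."""
    if p_value < alpha:
        return "reject Ho"
    return "do not reject Ho"

# Values from Example 4: p-value = 0.15 at alpha = 0.05
print(decide(0.15, alpha=0.05))  # do not reject Ho
# A hypothetical smaller p-value leads to rejection at the same alpha
print(decide(0.03, alpha=0.05))  # reject Ho
```

Changing `alpha` to 0.01 or 0.10 shows how the same p-value can lead to a different decision under a stricter or looser criterion.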

THE REJECTION REGION


To clarify the rejection or retention of the null hypothesis, a critical region or
rejection region must be defined.
After the level of significance for the hypothesis test is set, the researcher now
computes the test statistic. When the computed test statistic falls within a specific
range of values allowable for the test, the null hypothesis is rejected. This range of
values for the sample statistic that indicates when the null hypothesis should be
rejected is called the rejection region. Figures 1,2, and 3 show the rejection region
for both directional and non-directional tests.

Figure 1 Figure 2 Figure 3


The critical region is based on a value called the critical value, which is usually
determined using an appropriate distribution table based on the test statistic.

DECISION ERRORS IN HYPOTHESIS TESTING

The last step in hypothesis testing is the decision to reject or not to reject the
null hypothesis. Since not all members of the population are considered in the
process of verifying the null hypothesis, it is always a possibility that the decision to
reject or not to reject the null hypothesis is wrong.

Classification of Decision Errors: (a) Type I error. It is committed when the null
hypothesis is rejected although it is actually true; and (b) Type II error. It is committed
when the null hypothesis is not rejected although it is actually false.

Ideally, you reject the null hypothesis only when it is false and you fail to
reject it only when it is true. Doing otherwise leads to
a decision error. Table 1 below summarizes the four possible outcomes when a
decision is made in hypothesis testing.

Table 1. Four Possible Outcomes of the Decision in Hypothesis Testing

Reality                      Decision: Fail to Reject Ho    Decision: Reject Ho
Null hypothesis is true.     Correct decision               Type I error
Null hypothesis is false.    Type II error                  Correct decision
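
Table 1 can also be read as a small lookup. The sketch below is a minimal illustration, and the function name `classify_outcome` is hypothetical. It takes the reality (whether Ho is true) and the decision (whether Ho was rejected) and returns the matching cell of the table.

```python
def classify_outcome(ho_is_true, ho_rejected):
    """Return the cell of Table 1 for a given reality and decision."""
    if ho_is_true and ho_rejected:
        return "Type I error"       # rejected a true null hypothesis
    if not ho_is_true and not ho_rejected:
        return "Type II error"      # failed to reject a false null hypothesis
    return "Correct decision"

# Walk through all four combinations in Table 1
for ho_is_true in (True, False):
    for ho_rejected in (False, True):
        print(ho_is_true, ho_rejected, classify_outcome(ho_is_true, ho_rejected))
```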

Example 3:
Maria insists that she is 30 years old, when in fact, she is 32 years old. What
error is Maria committing?
Answer: Maria is rejecting the truth. She is committing a Type I error.
Example 4:
It has been established that a particular teaching strategy improves math
performance. However, the p-value taken from your experiment was 0.15, which is greater
than the alpha-value of 0.05. Thus, you did not reject the null hypothesis and concluded that
there is no significant relationship between the strategy and math performance. What type of
decision is illustrated in this example?
Answer: This illustrates a Type II error because there really is a significant relationship in
the population between the teaching strategy and math performance, but you did not
find it in your sample.

PROBABILITY OF COMMITTING A TYPE I AND TYPE II ERRORS


In everyday decisions, we form conclusions and these conclusions are the
bases of our actions. In Statistics, however, we cannot be certain that a conclusion is
correct because we make decisions based on sample information. The best we can do is to
control the probability with which an error occurs.
The probability of committing a Type I error is denoted by the Greek letter α (alpha),
while the probability of committing a Type II error is denoted by β (beta).
The following table shows the probability with which decisions occur.
Table 2. Types of Errors

Error in Decision    Type   Probability   Correct Decision     Type   Probability
Reject a true Ho     I      α             Accept a true Ho     A      1-α
Accept a false Ho    II     β             Reject a false Ho    B      1-β
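
To see that α really is the probability of rejecting a true Ho, a small simulation can help. The sketch below is only illustrative and rests on assumptions that are not in the module: the population is normal with µ = 6 (the value in Example 1), σ = 2, samples of size n = 30, and a two-tailed test with critical values ±1.96 (α = 0.05). The function name `type_i_error_rate` is hypothetical.

```python
import math
import random

def type_i_error_rate(mu0=6.0, sigma=2.0, n=30, z_crit=1.96, trials=2000, seed=1):
    """Estimate how often a TRUE Ho (mu = mu0) is rejected by a two-tailed test."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where Ho is actually true
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:            # Ho is rejected although it is true
            rejections += 1
    return rejections / trials

print(type_i_error_rate())  # should be close to alpha = 0.05
```

Because Ho is true in every simulated sample, each rejection is a Type I error, so the printed proportion estimates α.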
Parameter
Parameter is defined as any numerical quantity that characterizes a given
population or some of its aspects. It means that the parameter tells us something
about the whole population.
However, the numerical measure that is calculated from a sample is called a
statistic. A statistic is a known number, but it is also a variable because its value
depends on which portion of the population is sampled.
A parameter denotes the true value that would be obtained if a census
rather than a sample was undertaken.
Examples of parameters are the measures of central tendency. These tell us
how the data behave on an average basis. For example, mean, median, and mode are
measures of central tendency that give us an idea about where the data concentrate.
Meanwhile, standard deviation tells us how the data are spread from the central
tendency, i.e. whether the distribution is wide or narrow. Such parameters are often
very useful in analysis.

Identifying parameters to be used:


1. The television habits of children were observed, and the standard deviation was
found to be 10.2 hours per week.
Parameter to be tested: the standard deviation of the number of hours per week that
children watch television
Parameter: standard deviation; in symbols: 𝜎 = 10.2
2. A study claims that the mean number of quarantine days for a certain person is 14 days.
Parameter to be tested: the mean number of quarantine days for a certain person
Parameter: mean; in symbols: 𝜇 = 14
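
The difference between a parameter and a statistic can be illustrated with simulated data. Everything in the sketch below is hypothetical: the population of 1,000 weekly television hours is generated by the computer (centered at an assumed 20 hours with a spread of 10.2) and is not real survey data. The parameters are computed from the whole population, while the statistics are computed from a sample of 50.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: weekly TV hours of 1,000 children (simulated, not real data)
population = [rng.gauss(20, 10.2) for _ in range(1000)]
sample = rng.sample(population, 50)

mu = statistics.mean(population)       # parameter: population mean
sigma = statistics.pstdev(population)  # parameter: population standard deviation
x_bar = statistics.mean(sample)        # statistic: sample mean
s = statistics.stdev(sample)           # statistic: sample standard deviation

print("parameters:", round(mu, 2), round(sigma, 2))
print("statistics:", round(x_bar, 2), round(s, 2))
```

Drawing a different sample changes `x_bar` and `s` but leaves `mu` and `sigma` unchanged, which is why a statistic is described as a variable.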

Directions: State the null and the alternative hypotheses of the following
statements. Use another answer sheet.

1. A medical trial is conducted to test whether or not a newly released medicine
reduces uric acid by 40%.
Ho: ____________________________________________________
Ha: ____________________________________________________
2. Suppose we want to test whether the general average of students in Math is
different from 82%.
Ho: ____________________________________________________
Ha: ____________________________________________________
3. We want to test whether the mean height of Grade 7 students is 56 inches.
Ho: ____________________________________________________
Ha: ____________________________________________________
4. We want to test if BNHS students take more than four years to graduate from
high school, on the average.
Ho: ____________________________________________________
Ha: ____________________________________________________
5. We want to test if it takes less than 45 minutes to answer the summative test in
Mathematics.
Ho: ____________________________________________________
Ha: ____________________________________________________

Directions: Determine if one-tailed test or two-tailed test fits the given alternative
hypothesis.

1. The enrolment in Junior High Schools is not the same as the enrolment in the
Senior High Schools.
2. The standard deviation of their height is not equal to 7 inches.
3. The average number of internet users this year is significantly higher
than last year.
4. Male Grade 8 and Grade 11 students differ in height on average.
5. Miya’s present grade is higher compared to her previous grade.

Multiple Choice. Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.

1. What kind of parameter is applied in the given situation? “The mean height of all
Grade 10 students is 170 cm.”
A. Standard Deviation B. Mean C. Proportion D. Variance
2. A licensed teacher claims that more than 40 % of all education graduates passed
the licensure examination for teachers. What kind of parameter is used in this
claim?
A. Standard Deviation B. Mean C. Proportion D. Variance
3. It is a classification of error wherein the decision to reject the null hypothesis
could be wrong.
A. Correct decision B. Type I error C. Type II error D. Type III error
4. It is also known as non-directional test.
A. One-tailed test B. Three- tailed test C. Two-tailed test D. Tailed test
5. Which of the following is not a parameter?
A. Mean B. Standard Deviation C. Summation D. Mode
6. The decision to reject or fail to reject the null hypothesis is called ____________.
A. Conclusion B. Directional test C. Hypothesis D. Significance
7. It refers to any numerical quantity that characterizes a given population or some
of its aspects.
A. Parameter B. Hypothesis C. Median D. Mode

For numbers 8-10, refer to this:


Anna wants to estimate the average shower time of teenagers. From the
sample of 50 teenagers, she found out that it takes 5 minutes for teenagers to shower.
8. What parameter is to be tested? ____________
9. What parameter is to be used? _______________
10. How are you going to write it in symbols? _______________

Targets:
1. Formulate the appropriate null and alternative hypotheses on a population mean.
M11/12SP-IVb-1; and
2. Identify the appropriate form of the test statistic when: (a) the population variance
is assumed to be known; (b) the population variance is assumed to be unknown;
and (c) the central limit theorem is to be used.
M11/12SP-IVb-2 - M11/12SP-IVc-1.

Directions: Identify the population standard deviation or sample standard deviation
and the sample size in each problem. Write your answers on
another sheet of paper.
1. A sample of 150 people has a mean age of 25 with a population standard deviation
(σ) of 5. Test the hypothesis that the population mean is 24.7 at α=0.05.
2. An electric lamps manufacturer is testing a new production method that will be
considered acceptable if the lamps produced by this method result in a normal
population with an average life of 1,250 hours and a standard deviation equal to
110. A sample of 100 lamps produced by this method has an average life of 1,150
hours.
3. The cholesterol levels in a certain population have mean of 200 and standard
deviation 20. The cholesterol levels for a random sample of 9 individuals are
measured and the sample mean x̄ is determined. What is the z-score for a sample
mean x̄ = 180?
4. Mapagmahal Elementary School has 1,000 students. The principal of the school
thinks that the average IQ of students at Mapagmahal is at least 110. To prove
her point, she administers an IQ test to 20 randomly selected students. Among
the sampled students, the average IQ is 108 with a standard deviation of 10.
5. A new energy-efficient lawn mower engine was developed by a well-known
inventor. He claims that the engine will run continuously for 5 hours on a single
gallon of regular gasoline. From his stock of 1,000 engines, the inventor selects a
simple random sample of 50 engines for testing. The engines run for an average
of 290 minutes with a standard deviation of 20 minutes.

Lesson 2
Formulating the Appropriate Null and Alternative Hypotheses
on a Population Mean
In statistics, hypothesis testing is a way for you to test the results of a survey
or experiment to see if you have meaningful results. You are basically testing whether
your results are valid by figuring out the odds that your results have happened by
chance. In addition, it allows you to collect samples and make decisions based on
facts, not on how you feel or what you think is right. To prove your assumptions, you
must state first the null and alternative hypotheses.
This lesson starts by recalling your knowledge of the equality and inequality
symbols. This concept will help you understand how to formulate hypotheses.

“The Importance of Sunlight in Plants”


Directions: Examine the pictures below then answer the guide questions that follow.

Guide Questions:
1. What have you observed between the two figures?
2. Do you think sunlight has an effect on the plant?
3. What do you think are the variables shown in the pictures?
4. Is there any relationship among the variables in Figure 1 and Figure 2?
5. How do these pictures relate to a hypothesis?

Statistical hypothesis: an assertion or conjecture concerning one or more
populations; an assumption or statement which may or may not be
true concerning one or more populations.

TWO TYPES OF STATISTICAL HYPOTHESIS


a) Null Hypothesis, H0
● It states that there is no difference between population parameters and the
hypothesized value.
● It is a hypothesis that the population mean equals a specific value.
● It contains the “=”, “≥”, and “≤” signs.
b) Alternative hypothesis, Ha or H1.
● It is a claim about the population that is contradictory to H0 and what we
conclude when we reject H0.

● The alternative hypothesis says the population mean is “greater than” or “less
than” or “not equal to” the value we assume is true in the null hypothesis.
● It contains the “>”, “<”, and “≠” signs.

One-tailed test                              Two-tailed test
- Alternative hypothesis contains the        - Alternative hypothesis contains the
  greater than (>) or less than (<) symbol.    inequality symbol (≠).
- It is directional: either right-tailed     - It has no direction.
  or left-tailed.

Hypothesis
H0: The exposure to sunlight does not affect the growth of the plant.
Ha: The exposure to sunlight does affect the growth of the plant.

To state the null and alternative hypotheses correctly:


1. Identify the parameter in a given problem.
2. Identify the claim to be tested that may show up in null or alternative hypothesis.
3. Translate the claim into mathematical symbols/notations.
4. Formulate first the null hypothesis (H0) then the alternative hypothesis (Ha) based on
the three different ways of writing the null hypothesis: “H0: µ =”, “H0: µ ≤”, and
“H0: µ ≥”.
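
The steps above can be sketched as a small helper that pairs each form of the null hypothesis with its complementary alternative and the resulting type of test. The names `COMPLEMENT` and `formulate` are hypothetical. The two calls reuse the values from Examples 1 and 2 of Lesson 1 (6 study hours and a height of 150 cm).

```python
# Each null-hypothesis symbol maps to the alternative symbol and the type of test
COMPLEMENT = {
    "=": ("≠", "two-tailed"),
    "≥": ("<", "one-tailed (left)"),
    "≤": (">", "one-tailed (right)"),
}

def formulate(null_symbol, value, parameter="µ"):
    """Return (H0, Ha, type of test) for a claim about one parameter."""
    alt_symbol, tail = COMPLEMENT[null_symbol]
    h0 = f"H0: {parameter} {null_symbol} {value}"
    ha = f"Ha: {parameter} {alt_symbol} {value}"
    return h0, ha, tail

print(formulate("=", 6))    # ('H0: µ = 6', 'Ha: µ ≠ 6', 'two-tailed')
print(formulate("≥", 150))  # ('H0: µ ≥ 150', 'Ha: µ < 150', 'one-tailed (left)')
```

The helper only handles the symbols; deciding whether the claim belongs in H0 or Ha (step 2) still depends on reading the problem.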
A test statistic is used to calculate the p-value of your results, helping you decide
whether to reject your null hypothesis. The larger the absolute value of the test
statistic, the smaller the p-value and the more
likely you are to reject the null hypothesis.
The table below shows the appropriate test statistics to be used when:

Example:
1. A study was conducted to look at the average time students spend exercising. A
researcher claimed that, on average, students exercise less than 12 hours per
month. In a random sample of size n = 110, it was found that the mean time students
exercise is x̄ = 11.3 hours per month with s = 6.40 hours per month.
Since n = 110, the sample size is large even though the population variance is
unknown. Hence, by the Central Limit Theorem, the z-test is the appropriate tool.

2. An English teacher wanted to test whether the mean reading speed of students is
540 words per minute. A sample of 10 students revealed a sample mean of 520
words per minute with a standard deviation of 5 words per minute. At 0.05
significance level, is the reading speed different from 540 words per minute?
The sample size (n) is 10, which is less than 30, and only the sample standard
deviation (5 words per minute) is given. Therefore, the appropriate test is
the t-test.
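
The reasoning in the two examples can be sketched as follows. The selection rule (z-test when the population standard deviation is known or when n is at least 30, t-test otherwise) follows the cutoff used in the examples above. The statistic itself uses the standard form (sample mean − hypothesized mean) ÷ (standard deviation ÷ √n), which is stated here as an assumption since the module's own table is not shown. The function names are hypothetical.

```python
import math

def choose_test(n, sigma_known):
    """Pick the test following the rule used in the two examples."""
    if sigma_known or n >= 30:
        return "z-test"   # known variance, or large sample (Central Limit Theorem)
    return "t-test"       # small sample with unknown population variance

def compute_statistic(sample_mean, mu0, sd, n):
    """Standard form: (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))

# Example 1: n = 110, x-bar = 11.3, s = 6.40, hypothesized mean 12
print(choose_test(110, sigma_known=False), round(compute_statistic(11.3, 12, 6.40, 110), 3))
# Example 2: n = 10, x-bar = 520, s = 5, hypothesized mean 540
print(choose_test(10, sigma_known=False), round(compute_statistic(520, 540, 5, 10), 3))
```

With the values given, the first statistic is about -1.147 and the second about -12.649.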

Directions: Write the null and alternative hypotheses of the following and determine
if it is one-tailed or two-tailed. Use another answer sheet.
1. Mrs. Queliste claims that her students scored an average of 90 in their
Mathematics quiz. The master teacher wants to know whether the teacher’s claim
is acceptable or not.
2. A manufacturer of soft drinks claims that all labeled 1.5-liter bottles contain an
average of 1.48 liters of soft drinks. A retailer wishes to test whether the mean
amount of soft drinks in labeled 1.5-liter bottle is less than 1.48 liters.
3. A car manufacturer claims that the mean selling price of all cars manufactured
is only ₱160,000. A consumer agency wants to test whether the mean selling price
of all the cars manufactured exceeds ₱160,000.
4. The average power consumption of air conditioner is at most 2,500 watts as
claimed by the owner. A survey made by an electric power company found out
that the mean consumption is 3,500 with standard deviation of 225.
5. A bus company in Manila claims that the mean waiting time for a bus during
rush hour is less than 12 minutes. A random sample of 30 waiting times has a
mean of 15 minutes with a standard deviation of 4.8 minutes.

Directions: Identify the appropriate test statistic to be used in each problem. Write
z-test or t-test on a separate sheet of paper.
___________1. A sample of n=20 is selected from a normal population, mean = 53
and s= 10.
___________2. Based on the report of the school nurse, the average height of Grade
11 students has increased. Five years ago, the average height of Grade 11 students
was 168 cm with standard deviation of 36 cm. She took a random sample of 150
students and derived the average height of 159 cm.
___________3. Knowing from a previous study that the average of athletes is 60, an
athletic adviser asked how his soccer players are academically doing as compared to
other student athletes. After an initiative to help improve the average of student
athletes, the adviser randomly selected 15 soccer players and found 80 as the
average with standard deviation of 1.20.
___________4. The CEO of a battery manufacturing company claimed that their
batteries would last an average of 280 hours under normal use. A researcher
randomly selected 15 batteries from the production line and tested them. The tested
batteries had a mean life span of 270 hours with a standard deviation of 40 hours.
Do we have enough evidence to suggest that the claim of an average of 280 hours is
false?

___________5. It was known that the number of tickets purchased by students at the
ticket window for the volleyball match of two popular universities followed a
distribution that has mean of 500 and standard deviation of 8.7. Suppose that a few
hours before the start of one of these matches, there are 100 eager students standing
in line to purchase tickets. If there are 250 tickets remaining, what is the probability
that all 100 students will be able to purchase the tickets they want?

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
1. This hypothesis states that there is no difference between population parameters
and the hypothesized value.
A. Hypothesis C. Alternative hypothesis
B. Null hypothesis D. Two-tailed hypothesis
2. When the value of parameter has significant difference with the hypothesized
value, then it is called ________________.
A. One-tailed test C. Null hypothesis
B. Two-tailed test D. Alternative hypothesis
3. What kind of hypothesis is illustrated below? The mean score of all Grade 12
students is higher than 75.
A. One-tailed test C. Null hypothesis
B. Two-tailed test D. Alternative hypothesis
4. The sign of the alternative hypothesis in a left-tailed test is always_________.
A. Equal B. Less than C. Not equal D. Greater than
5. “A modern approach in advertisement will not increase the demand for a
product.” This is an example of _______________ hypothesis.
A. null B. alternative C. Mean D. right-tailed

Target:
1. Identify the appropriate rejection region for a given level of significance when: (a)
the population variance is assumed to be known; (b) the population variance is
assumed to be unknown; and (c) the Central Limit Theorem is to be used
(M11/12SP-IVc-1).

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
***Refer to the statements below to answer item numbers 1 and 2.
I. A boy whose height is 5’2” insists that his height is just 5’6”.
II. A police officer accepts unsolicited gifts even if it is wrong to do so.
III. Danilo still insists on working illegally despite knowing its risks.
IV. Julia says that her hair color is not black and instead says that she is
dark-skinned.

1. Which of the statement/s above illustrates a Type I error?
A. I only B. I and IV C. III only D. II and III
2. Which of the statement/s above illustrates a Type II error?
A. I only B. I and IV C. III only D. II and III
***For item numbers 3 and 4, determine the critical values that match the given
condition.
3. Population standard deviation is known, and the confidence level is 90% for a
two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
4. Sample size is 150 but the population standard deviation is unknown, and the
confidence level is 90% for a two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
***Refer to the situation below to answer item numbers 5 to 7.
A sample of 250 bulbs was taken to verify the claim of its manufacturer that
the average lifespan of their bulbs is 2.5 years.
5. Which distribution must be considered to identify the appropriate rejection region?
A. t-distribution C. f-distribution
B. z-distribution D. insufficient data

Lesson 3
The Rejection Region and Critical Values
In this lesson, the learners will understand the concept of rejection region and
critical values. Ideas about types of error will also be presented.

When performing hypothesis testing, we come up with a decision of either
rejecting the null hypothesis or not. The decision is made based on how our
computed test statistic relates to the corresponding critical value set at a given
confidence level.
This goes to show that critical values play an important role in establishing
the region/s under the curve where the hypothesis being tested may be rejected or
not. The region in which the hypothesis must be rejected is called the rejection
region.
Aside from establishing the hypothesis and identifying the appropriate test
statistic, there are other elements of hypothesis testing relevant to decision making.
As we can naturally commit mistakes, one of these relevant elements is the concept
of error. It was mentioned above that there are two possible decisions. Also, the null
hypothesis may either be true or false. Hence, there are 4 possible combinations of
decisions and truth values of the null hypothesis.

Interestingly, only two of these four outcomes are correct. The other two are
errors. These errors are named Type I error and Type II error. Study the diagram
given at the right.

We can note from the diagram that a Type I error is committed when a true
hypothesis is rejected while a Type II error is committed when you fail to reject a false
hypothesis.
How do these errors relate in real life? Let us see the illustrations below.

Illustration 1: A man insists that he stands 5’10” when, in fact, his height is only
5’8”. In this situation, the man is said to commit a Type I error since he is rejecting
the true idea that he stands just 5’8”.

Illustration 2: A student allows his classmates to copy his answers
during a test. In this scenario, the student is said to commit a Type II error since he
is allowing the act of cheating despite knowing that it is wrong.

The question now is, how likely are we going to commit these errors as we do
decision making in the hypothesis testing process? We denote the probability of
committing a Type I error as 𝛼 while the probability of committing a Type II error as
𝛽. With the goal of minimizing these errors, we set their values to be relatively small.
For instance, we usually assign an alpha value of 0.05 or 0.01, depending on the
implications of the errors. Of course, the more serious the implications, the less likely
we would like to commit the error. From here we can further say that the probability
of making a correct decision with respect to Type I error is 1 − 𝛼.

The probabilities introduced above may be seen graphically in a normal curve.

The figures on the left show the rejection region
under the normal curve for a directional (one-tailed)
test. Notice that the entire area under the
curve is divided into two parts by the critical
value. The shaded region is the rejection region.

The figure on the left shows the rejection regions under the
normal curve for a non-directional (two-tailed) test. This time,
notice that the entire area under the curve is divided into three
parts by the critical values. The rejection regions are seen on
both tails, which means that α has been divided equally between them.

Remember that these regions serve as our guide in decision making. If the
computed test statistic falls in the rejection region, then we must reject Ho. If the test
statistic falls outside the rejection region, then we do not reject Ho.
As a remark, the curve to be used is based on whether the population variance
is known or not.
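
The decision rule above can be sketched as a single check. The function name `in_rejection_region` and the tail labels are hypothetical, and the sketch assumes the rejection region lies strictly beyond the critical value. The sample calls use illustrative numbers with the familiar critical value 1.96.

```python
def in_rejection_region(statistic, critical, tail):
    """Return True when the test statistic falls in the rejection region.

    tail is 'two', 'right', or 'left'; critical is the positive critical value.
    """
    c = abs(critical)
    if tail == "two":
        return abs(statistic) > c   # beyond either critical value
    if tail == "right":
        return statistic > c        # beyond the right critical value
    if tail == "left":
        return statistic < -c       # beyond the left critical value
    raise ValueError("tail must be 'two', 'right', or 'left'")

print(in_rejection_region(2.10, 1.96, "two"))    # True: reject Ho
print(in_rejection_region(-0.50, 1.96, "two"))   # False: do not reject Ho
```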

Sample Problem:
1. Assuming that the population standard deviation is known, sketch the rejection
region for a two-tailed test with 95% confidence. Does z = 1.68 fall in the rejection
region?

Solution:
Since the population standard deviation is assumed to be known, we will use
the z-distribution (normal distribution). Also, the 95% confidence level implies that
𝛼 = 0.05. Further, the two-tailed test implies that we must consider 𝛼/2 = 0.025
since the probability is distributed on both tails. Thus, the critical value is the
corresponding z-value for 1 – 0.025 = 0.975. Using the z-table, we find that the
critical values are -1.96 and 1.96. The sketch of the rejection region is shown
below.

Since z = 1.68 is found between -1.96 and 1.96, it does NOT lie in
the rejection region.
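As a cross-check of the table lookup, the same chain of reasoning (95% confidence, then 𝛼 = 0.05, then an area of 0.975 to the left of the upper critical value) can be traced with Python's standard library. This is a sketch, not part of the module's procedure; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

confidence = 0.95
alpha = 1 - confidence                        # 0.05
upper = NormalDist().inv_cdf(1 - alpha / 2)   # z-value with area 0.975 to its left
lower = -upper                                # the normal curve is symmetric

print(round(lower, 2), round(upper, 2))       # -1.96 1.96

z = 1.68
print(lower < z < upper)                      # True: z is outside the rejection region
```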

2. Assuming that the population standard deviation is unknown for a randomly
selected sample whose size is 11, sketch the rejection region for a one-tailed test
with 99% confidence. Does t = 2.86 fall in the rejection region?
Solution:
Since the population standard deviation is assumed to be unknown and the
sample size is small, we will use the t-distribution. Also, the 99% confidence level
implies that 𝛼 = 0.01. Further, the one-tailed test implies that the rejection region
is found on one side of the curve only.
Given that the sample size is 11, it follows that 𝑑𝑓 = 𝑛 − 1 = 11 − 1 = 10. Using
the t-table, we find that the critical value is 2.764. The sketch of the rejection
region is shown below.

It can be seen that t = 2.86 is found to the right of the right-tailed
t-critical value of 2.764. Thus, t = 2.86 lies in the rejection region.

3. A random sample of 250 bottles of juice drink was taken and was found to have
an average content that is less than the company’s claim that each bottle
contains 500 mL of juice drink. Suppose that an appropriate test statistic
revealed a value of -1.75 at 95% confidence. Sketch the rejection region and
locate the test statistic value.
Solution:
It is seen from the problem that the population standard deviation is unknown
but with the sample size of 250, which is large enough, we can make use of the
Central Limit Theorem and consider the z-distribution. With 95% confidence, it
shows that 𝛼 = 0.05. Thus, 1 − 𝛼 = 1 − 0.05 = 0.95.

Further, the phrase ‘less than’ indicates that we have a one-tailed (left-tailed)
test, whose critical value at 𝛼 = 0.05 is about -1.645. Thus, we now verify whether
z = -1.75 lies in the rejection region. The sketch is shown below.

As shown in the figure, the test statistic z = -1.75 lies in the rejection region,
since -1.75 < -1.645.

Directions: Answer the given problem.


Assuming that the population standard deviation is unknown for a randomly
selected sample whose size is 11, sketch the rejection region for a one-tailed test with
90% confidence. Does t = 2.86 fall in the rejection region?

Directions: Decide whether each statement is TRUE or FALSE. Write T for True
and F for False on a separate sheet of paper.
1. We use t-distribution when the population standard deviation is known.
2. In a one-tailed test, the rejection region is found on both tails of a distribution.
3. The critical values divide the curve into rejection and non-rejection regions.
4. Type II error is committed when a false hypothesis is not rejected.
5. The probability of not committing a Type I error is 1 − 𝛼.

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
1. What type of error is committed when you fail to reject a false null hypothesis?
A. Type I B. Type II C. Type A D. Type B
***Refer to the statements below to answer item numbers 2 and 3.
I. A man who weighs 80 kilograms argues that his weight is just 75
kilograms.
II. A judge accepts bribe even if it is wrong to do so.
III. Carla still insists on working illegally despite knowing its risks.
IV. A woman says that her skin color is not black, instead tells that she is
dark-skinned.
2. Which of the statement/s above illustrates a Type I error?
A. I only B. I and IV C. III only D. II and III
3. Which of the statement/s above illustrates a Type II error?
A. I only B. I and IV C. III only D. II and III
***For item numbers 4 to 5, determine the critical values that match the given
condition.

4. Population standard deviation is unknown; the confidence level is 95% for a two-
tailed test and the sample size is 24.
A. ±1.711 B. ±1.714 C. ±2.064 D. ±2.069
5. Sample size is 101 but the population standard deviation is unknown, and the
confidence level is 95% for a two-tailed test.
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58

Targets:
1. Compute for the test-statistic value (population mean) (M11/12SP-IVd-1); and
2. Draw conclusion about the population mean based on the test-statistic value and
the rejection region (M11/12SP-IVd-2).

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
***Refer to the situation below to answer item numbers 1 to 5.
A pool of researchers claims that the average age of schooling among children
in a certain district is 4.8 years with a standard deviation of 0.21. A pre-school
teacher attempted to verify this claim by taking the ages of 360 first-time
schoolers in the said district and found out that the average age is 4.12 years.
1. What test statistic must be computed to test the claim of the researchers?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
2. What is the correct value of the test statistic?
A. -61.44 B. -58.42 C. -0.48 D. -0.02
3. What must be the appropriate critical value if a 99% confidence level was used?
A. ±1.65 B. ±1.96 C. ±2.33 D. ±2.58
4. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. Insufficient data
5. Which conclusion can be possibly drawn?
A. The claim of the researchers is true.
B. The average age of schooling is 4.12 years.
C. There is not enough evidence to support the claim of the researchers.
D. The sample selected by the pre-school teacher does not correctly represent
the population.

Lesson 4
The Test Statistic
In this lesson, the learners will be equipped with the skill of computing for the
appropriate test statistic needed in performing hypothesis testing later.

In the previous lesson, we learned about critical values and their role in
establishing the rejection region. To complete the picture and arrive at a
decision on whether to reject the null hypothesis or not, one must be able to
compute the test statistic accurately.
The calculation of the test statistic mainly depends on whether or not the
population standard deviation is known as well as on the size of the sample. In this
lesson, we introduce the formula and the procedures for computing an appropriate
test statistic which will be used to arrive at a correct decision that may eventually
lead to sound conclusions.
Generally, to compute a test statistic, we subtract the expected value from
the observed value and divide the result by the standard error. There are two main
statistical tests performed concerning the population mean: the z-test and the
t-test.
On one hand, we use the z-test when 𝑛 ≥ 30 or when the population is normally
distributed and 𝜎 is known. The formula for the z-test statistic is given by

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where $\bar{X}$ is the sample mean,
$\mu$ is the hypothesized population mean,
$\sigma$ is the population standard deviation, and
$n$ is the sample size.
On the other hand, we use the t-test when the population is normal or
approximately normal and 𝜎 is unknown. The formula for the t-test statistic is given
by

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

where $\bar{X}$ is the sample mean,
$\mu$ is the hypothesized population mean,
$s$ is the sample standard deviation, and
$n$ is the sample size.
Notice that the formulas presented above are very similar. They differ only in
the standard error of the mean: in the t-test, the population standard deviation
is unknown and is replaced by the sample standard deviation.
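Because the two formulas share the same shape, a single helper can compute either statistic. The sketch below is illustrative only: the function name `mean_test_statistic` and the numbers in the usage line are hypothetical, not taken from the module.

```python
from math import sqrt


def mean_test_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """(observed - expected) / standard error.

    Pass the population standard deviation (sigma) for a z-test,
    or the sample standard deviation (s) for a t-test.
    """
    standard_error = std_dev / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical values: sample mean 52, hypothesized mean 50,
# standard deviation 8, sample size 64 (standard error = 8 / 8 = 1).
print(mean_test_statistic(52, 50, 8, 64))  # 2.0
```

Whether the result is read as z or t depends only on which standard deviation was supplied and, for the critical value, on which table is consulted.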

After computing the appropriate test statistic, a decision must be made. To
come up with a correct decision, we follow this rule: If the absolute value of the
computed test statistic is greater than or equal to the absolute value of the critical
value, the null hypothesis is rejected; if the absolute value of the computed test
statistic is less than that of the critical value, we do not reject the null hypothesis.
After deciding, we proceed to writing sound conclusions. These are inferences
that we can draw from the context of the situation based on whether we have rejected
the null hypothesis or not. To illustrate how we compute test statistics and write
conclusions thereafter, study the examples below.

Sample Problem:
1) Previous records revealed that the mean salary of the high school teachers in a
municipality is Php 16,250 with a standard deviation of Php 1,400. A sample of
50 teachers was taken and was reported to have a mean salary of Php 18,000. At
95% confidence level, do we have enough evidence to believe what the records
revealed?
Solution:
Since the sample size is 50, which is greater than or equal to 30 and that the
population standard deviation is known, we compute for the z-test statistic.
Substituting the known values to our formula for z-test, we obtain the following:
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{18\,000 - 16\,250}{1\,400 / \sqrt{50}} \approx \frac{1\,750}{197.99} \approx 8.839$$
At the 95% confidence level, the critical value for z is 1.96. Clearly, the
computed z-test statistic, 8.839, is greater than the z-critical value of 1.96.
Thus, we reject the null hypothesis stating that the population mean salary and
the sample mean salary are statistically equal.
Therefore, the sample mean is statistically different from the population
mean. This implies that the selected high school teachers have a
significantly different salary compared to the population, and so there is not
enough evidence to believe what the records have revealed.

2) A medical report claims that the number of infections per week at a certain
hospital in a province is 12.7. A random sample of 9 weeks had a mean number
of 11.4 infections with a standard deviation of 0.6. Is there enough evidence to
support the claim at 95% confidence level?
Solution:
Given that the sample size is 9, which is less than 30 and that the
population standard deviation is unknown, we compute for the t-test statistic.
Applying the formula for this test value, we have the following:
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{11.4 - 12.7}{0.6 / \sqrt{9}} = \frac{-1.3}{0.2} = -6.5$$

At the 95% confidence level, 𝛼 = 0.05. Also, when 𝑛 = 9, 𝑑𝑓 = 9 − 1 = 8. Using
the t-table, these lead to the t-critical value of ±2.306. Comparing the absolute
values, we can say that the computed test statistic (6.5) is greater than the critical
value (2.306). Hence, we reject the null hypothesis stating that the population mean
and the sample mean are statistically equal.
Therefore, the sample mean is statistically different from the population
mean. This implies that the selected weeks have a significantly different
number of infections compared to the population, and so there is not enough
evidence to support what the medical report claims.

Directions: Compute for the appropriate test statistic in each of the following:
a. 𝑛 = 17; 𝑋 = 8; 𝜇 = 7.4; 𝑠 = 0.5
b. 𝑛 = 40; 𝑋 = 6.24; 𝜇 = 5.11; 𝜎 = 3.6
c. 𝑛 = 100; 𝑋 = 11; 𝜇 = 9; 𝜎 = 1.2
d. 𝑛 = 10; 𝑋 = 5.9; 𝜇 = 6.3; 𝑠 = 0.125
e. 𝑛 = 48; 𝑋 = 10.27; 𝜇 = 9.4; 𝑠 = 2.04

Directions: Decide whether each statement is TRUE or FALSE. Write T for True and
F for False on a separate sheet of paper.
1. We use z-test statistic when the population standard deviation is unknown.
2. The inferences that we can draw out of a decision are called conclusions.
3. If the critical value is greater than the computed test statistic, we do not reject
the null hypothesis.
4. We use t-test statistic when the population standard deviation is known.
5. The distribution must be normal or approximately normal so that one can proceed
with computing for either z- or t-test statistic.

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
***Refer to the situation below to answer item numbers 1 to 5.
A pool of researchers claims that the average age of schooling among children
in a certain district is 4.8 years with a standard deviation of 0.21. A pre-school
teacher attempted to verify this claim by taking the ages of 20 first-time schoolers
in the said district and found out that the average age is 4.12 years.
1. In testing the claim of the researchers, what test statistic must be determined?
A. t-test statistic C. either A or B
B. z-test statistic D. insufficient data
2. Which of the following is the correct value of the test statistic?
A. 14.48 B. 10.48 C. -0.03 D. -14.48
3. Suppose that 95% confidence level was used, what must be the appropriate
critical value?
A. ±1.65 B. ±1.96 C. ±2.093 D. ±2.064
4. How do the absolute values of the test statistic (TS) and the critical value (CV)
compare?
A. TS = CV C. TS < CV
B. TS > CV D. Insufficient data
5. Which conclusion can be possibly drawn?
A. The claim of the researchers is true.
B. The average age of schooling is 4.12 years.
C. There is not enough evidence to support the claim of the researchers.
D. The sample selected by the pre-school teacher does not correctly represent
the population.

Targets:
1. Solve problems involving test of hypothesis on the population mean (M11/12SP-
IVe-1);
2. Formulate the appropriate null and alternative hypotheses on a population
proportion (M11/12SP-IVe-2); and
3. Identify the appropriate form of the test-statistic when the Central Limit Theorem
is to be used (M11/12SP-IVe-3).

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
*****Refer to the problem below to answer item numbers 1-5.
A table manufacturing company reported that at the end of 2021, the mean
number of delivered tables daily is 245.2. If a random sample of 16 manufacturing
days revealed that the mean number of delivered tables is 250.1 with a standard
deviation of 3.6, test the claim of the company in its report at 0.05 level of
significance.
1. Which of the following shows the correct alternative hypothesis?
A. µ = 245.2 B. µ ≠ 245.2 C. µ > 245.2 D. µ < 245.2
2. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these
3. What is the corresponding critical value?
A. ± 2.131 B. ±1.96 C. ± 1.68 D. ± 1.42
4. Which of the following is the correct test statistic?
A. 4.31 B. 5.05 C. 5.44 D. 6.87
5. Which of the following can be concluded from the results of the hypothesis
testing?
A. There is not enough evidence to reject the claim of the company in its report.
B. There is enough evidence to reject the claim of the company in its report.
C. On the average, the company delivers more than 245.2 tables.
D. No valid conclusion can be drawn.

Lesson 5
Hypothesis Testing
In this lesson, the learners will be acquainted with the entire procedure for
performing hypothesis testing concerning both population mean and population
proportion.

In the past lessons, you learned how to formulate null and alternative hypotheses
for population mean, identify critical values, compute test statistic, decide, and state
conclusions based on the results. The test of hypothesis concerning population mean
may be described as a decision-making process about a certain claim concerning a
population.
The previous lessons have allowed you to experience how this entire
process is performed piece by piece. In this lesson, we will look at all
these procedures as one whole picture. The test of hypothesis may be conducted
in three ways: the traditional method, the p-value method, and the confidence
interval method. In this lesson, we will tackle only the traditional method, whose
steps are summarized below.
Steps in Hypothesis Testing using the Traditional Method
1. State the hypothesis and identify the claim.
2. Find the critical value.
3. Compute for the appropriate test statistic.
4. Make a decision.
5. State the conclusion.

To solve problems involving test of hypothesis concerning population mean, we
follow the steps presented above. As an illustration, let us take the problem below.
Sample Problem 1:
According to the latest data published by the World Health Organization (WHO)
in 2018, the life expectancy of male Filipinos is 66.2 years. A random sample of 50
recorded deaths among male Filipinos was taken and was found to have a mean of
64.6 years. Assuming that the population standard deviation is 7.2 years, does this
seem to indicate that the mean life span of male Filipinos is less than 66.2 years?
Use 0.05 level of significance.
Steps and Solution
Step 1. State the hypothesis and identify the claim.
   The following are the hypotheses:
   Ho: 𝜇 = 66.2
   H1: 𝜇 < 66.2 (claim)
Step 2. Find the critical value.
   Since the population standard deviation is known, the appropriate
   distribution is z. Also, the hypotheses are directional, which implies a
   one-tailed test with 𝛼 = 0.05. Using the z-table, the critical z-value is -1.65.
Step 3. Compute for the appropriate test statistic.
   As mentioned in Step 2, we use the z-distribution. So, the appropriate test
   statistic is z and is computed as follows:

   $$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{64.6 - 66.2}{7.2 / \sqrt{50}} \approx -1.571$$

Step 4. Make a decision.
   Comparing the absolute values of the computed z-test statistic and the
   z-critical value, it can be seen that |z computed| < |z critical|
   (1.571 < 1.65). Thus, at 0.05 level of significance, the null hypothesis is
   not rejected.
Step 5. State the conclusion.
   Based on the decision, there is not enough evidence to support the claim
   that the life span of male Filipinos is less than 66.2 years.
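Steps 2 to 4 of Sample Problem 1 can be traced in code. The sketch below is one possible arrangement, not the module's prescribed procedure: the function name `z_test_mean` and the tail labels are assumptions, and the critical value comes from Python's `statistics.NormalDist` rather than a printed z-table, so it appears as 1.645 instead of the rounded table value 1.65.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, critical, reject) for a z-test on a population mean.

    tail is "left", "right", or "two" (labels assumed for this sketch);
    critical is returned as a positive number.
    """
    if tail not in ("left", "right", "two"):
        raise ValueError("tail must be 'left', 'right', or 'two'")

    # Step 2: find the critical value.
    area = 1 - alpha / 2 if tail == "two" else 1 - alpha
    critical = NormalDist().inv_cdf(area)

    # Step 3: compute the test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))

    # Step 4: make a decision.
    if tail == "left":
        reject = z <= -critical
    elif tail == "right":
        reject = z >= critical
    else:
        reject = abs(z) >= critical
    return z, critical, reject


z, critical, reject = z_test_mean(64.6, 66.2, 7.2, 50, 0.05, "left")
print(round(z, 3), round(critical, 3), reject)  # -1.571 1.645 False
```

Steps 1 and 5 (stating the hypotheses and wording the conclusion) remain a matter of reading the problem, so they are left out of the function.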
Sample Problem 2:
In a report by Sanchez (2020), the average household electric consumption per
capita in our country during 2016 was 248.1 kilowatt hours. If a random sample
of 16 households included in a planned study indicated that their mean consumption
is 254.3 kilowatt hours with a standard deviation of 3.6 kilowatt hours, test the
claim of Sanchez in his report at 0.05 level of significance.
Steps and Solution
Step 1. State the hypothesis and identify the claim.
   The following are the hypotheses:
   Ho: 𝜇 = 248.1 (claim)
   H1: 𝜇 ≠ 248.1
Step 2. Find the critical value.
   Since the population standard deviation is unknown and the sample size is
   less than 30, the appropriate distribution is t. Also, the hypotheses are
   non-directional, which implies a two-tailed test with 𝛼 = 0.05. Using the
   t-table, with 𝑑𝑓 = 16 − 1 = 15, the critical t-value is ±2.131.
Step 3. Compute for the appropriate test statistic.
   As mentioned in Step 2, we use the t-distribution. So, the appropriate test
   statistic is t and is computed as follows:

   $$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{254.3 - 248.1}{3.6 / \sqrt{16}} \approx 6.889$$

Step 4. Make a decision.
   Comparing the absolute values of the computed t-test statistic and the
   t-critical value, it is clear that |t computed| > |t critical|
   (6.889 > 2.131). Thus, at 0.05 level of significance, the null hypothesis is
   rejected.
Step 5. State the conclusion.
   Based on the decision, there is enough evidence to reject the claim of
   Sanchez in his report that the average electric consumption in Filipino
   households is 248.1 kilowatt hours.
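For the t-test, the critical value depends on the degrees of freedom, so a sketch in code needs a table to look it up. The short dictionary below is only an excerpt of a standard t-table (two-tailed, 𝛼 = 0.05), and the names `T_CRITICAL_TWO_TAILED_05` and `t_test_mean_two_tailed` are assumptions made for this illustration.

```python
from math import sqrt

# Excerpt of a standard t-table: two-tailed critical values at alpha = 0.05,
# keyed by degrees of freedom. Only a few rows are included in this sketch.
T_CRITICAL_TWO_TAILED_05 = {8: 2.306, 10: 2.228, 15: 2.131, 19: 2.093, 24: 2.064}


def t_test_mean_two_tailed(sample_mean, mu0, s, n):
    """Return (t, critical, reject) for a two-tailed t-test at alpha = 0.05."""
    df = n - 1
    critical = T_CRITICAL_TWO_TAILED_05[df]   # KeyError if df is not in the excerpt
    t = (sample_mean - mu0) / (s / sqrt(n))
    return t, critical, abs(t) >= critical


t, critical, reject = t_test_mean_two_tailed(254.3, 248.1, 3.6, 16)
print(round(t, 3), critical, reject)  # 6.889 2.131 True
```

For degrees of freedom outside the excerpt, the value would still have to be read from a full t-table.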

At this point, it is timely for you to know that hypothesis testing applies not only
to problems that concern the population mean but also to those that concern a
population proportion. To effectively carry out hypothesis testing in these cases, one
must be able to correctly state hypotheses for a population proportion.

Instead of actual population mean values, our hypotheses concern a part or a
percentage of a population. Consider the following hypothetical statements:
a. 72% of public elementary school teachers have their own internet connection
at home;
b. 12% of diabetes patients are at risk of having kidney failure; and
c. 16% of female college students graduate with honors.

These statements are examples of claims that involve population proportion,
and thus may be subjected to hypothesis testing. The hypotheses in these cases
are established in the same way as those that concern the population mean. We will
use 𝑝 to denote the population proportion.

To illustrate how hypotheses concerning population proportion are
formulated, let us look at the illustrations below.

State the null and alternative hypotheses for each of the following.
a. A recent report indicated that 67% of teenagers spend more than 5 hours
doing social media activities. To verify this claim, a researcher took 90 teenagers
and found that 54 of them affirm the report.
Solution:
In the above problem, we formulate the following hypotheses:
Ho: 𝑝 = 0.67 (claim) H1: 𝑝 ≠ 0.67

b. A survey revealed that more than 46% of working professionals dine in at fast
food stores daily. A pool of researchers tested this survey result by taking 75
working professionals, with 52 of them agreeing with the result.
Solution:
In the above problem, we formulate the following hypotheses:
Ho: 𝑝 = 0.46 H1: 𝑝 > 0.46 (claim)
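The pattern in the two illustrations can be summarized in a small helper: a claim of equality sits in Ho, while a claim of "more than" or "less than" sits in H1. The function name `state_hypotheses` and the use of "=", "<", and ">" as inputs are assumptions for this sketch.

```python
def state_hypotheses(p0, claim):
    """Return (Ho, H1) as text for a claim about a population proportion.

    claim is the relation asserted by the claim: "=", "<", or ">".
    """
    if claim == "=":
        return f"Ho: p = {p0} (claim)", f"H1: p ≠ {p0}"
    if claim in ("<", ">"):
        return f"Ho: p = {p0}", f"H1: p {claim} {p0} (claim)"
    raise ValueError("claim must be '=', '<', or '>'")


print(state_hypotheses(0.67, "="))  # ('Ho: p = 0.67 (claim)', 'H1: p ≠ 0.67')
print(state_hypotheses(0.46, ">"))  # ('Ho: p = 0.46', 'H1: p > 0.46 (claim)')
```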

A recent survey of 200 people revealed that the mean time teenagers spend watching
television is 4.2 hours. Previous national records say that the mean time
was 3.8 hours with a standard deviation of 0.3 hours. Do the survey results
significantly differ from previous records at 0.05 level of significance?

A. Directions: State the null and alternative hypotheses for each of the following.
1. A recent report indicated that 72% of teachers spend more than 8 hours
doing schoolwork in this current work-from-home arrangement. To verify this
claim, a researcher took 80 teachers and found that 63 of them affirm the
report.
2. A survey revealed that less than 36% of children ages 8 to 10 years old are
exposed to computer games daily. A pool of researchers tested this survey
result by taking 105 children in the given age bracket, with 41 of them
agreeing with the result.

B. Directions: Read and analyze the problem below then solve.


A random sample of 150 bottles of juice drink was taken and was found to
have an average content of 318 mL with a standard deviation of 2.2 mL. This
average content is less than the company’s claim that each bottle contains 330
mL of juice drink. At 0.05 significance level, do the contents of the juice drink of
the sample significantly differ from that of the population?

Directions: Read and analyze each item carefully, then write the letter of your
answer on a separate sheet of paper.
***** Refer to the problem below to answer item numbers 1-5.
A chocolate manufacturing company reported that at the end of 2021, the mean
number of boxes of chocolates delivered daily is 514.3. If a random sample of 25
manufacturing days revealed that the mean number of delivered boxes of chocolates
is 510.5 with a standard deviation of 2.5, test the claim of the company in its report
at 0.05 level of significance.
1. Which of the following shows the correct null hypothesis?
A. µ = 514.3 B. µ ≠ 514.3 C. µ > 514.3 D. µ < 514.3
2. Which distribution is appropriate for the desired hypothesis testing?
A. t-distribution B. z-distribution C. f-distribution D. none of these
3. What is the corresponding critical value?
A. ± 2.064 B. ±2.060 C. ± 1.96 D. ± 1.65
4. Which of the following is the correct test statistic?
A. 7.6 B. 5.05 C. -6.5 D. -7.6
5. Which of the following can be concluded from the results of the hypothesis
testing?
A. There is not enough evidence to reject the claim of the company in its report.
B. There is enough evidence to reject the claim of the company in its report.
C. On the average, the company delivers more than 514.3 boxes of chocolates.
D. No valid conclusion can be drawn.

Targets:
1. Identify the appropriate rejection region for a given level of significance.
M11/12SP-IVe-4;
2. Compute for the test statistic value (Z-test for Proportion). M11/12SP-IVf-1;
3. Draw conclusion about the population proportion based on the whole hypothesis
testing procedure. M11/12SP-IVf-2; and
4. Solve problems using Z-test for proportions. M11/12SP-IVf-g-1.

Do this Pre-Test: Check the appropriate box (TRUE or FALSE).


TRUE FALSE
1. If n = 20, the Central Limit Theorem applies.
2. If the confidence level is 90%, then α/2 is .05.
3. The area under the curve represents the probability,
proportion, or percentage.
4. When H0 is rejected, it means that a significant difference
does not exist.
5. A sample is small when n<30.

Lesson 6
Hypothesis Testing About a Population Proportion
In this lesson, the learners will understand the concept of hypothesis testing
about a population proportion. Also, they will learn the procedure and the steps in
doing so.

We will now illustrate how the rejection region is determined when α = .05.

The REJECTION REGION is a range of values such that if the test statistic (Z, t, or p) falls
into that range, we decide to reject the H0 in favor of the H1. (Keller/Warrack, p.324)

For a Left-Tailed Test (One-Tailed Directional Test), the rejection region will
have an area equal to 0.05 at the extreme left side of the Normal Curve. See the figure
below.

The line that divides the Rejection Region and the Non-rejection Region (some
books call it Acceptance Region) corresponds to a Z value that we call Critical Value.
This critical value may be found in our Z-table. Since the rejection region (red region)
has an area equal to 0.05, we have to look at our Z-table for the corresponding Z
value. Were you able to locate it? It’s exactly between -1.64 and -1.65. That means
the Z value we are looking for is -1.645.

REMEMBER: We use this diagram if our H1 uses the symbol <.

Using the same argument and also because our Normal Curve is symmetric,
our critical value for a Right-Tailed Test (One-Tailed Directional Test) when α = .05
must be 1.645.

REMEMBER: We use this diagram if our H1 uses the symbol >.


The diagram is a little bit different if we are using a Two-Tailed Test (Non-Directional Test).
The α = .05 will be divided equally between the two tails of the Normal Curve, so each tail
has an area of 0.025 and the critical values are -1.96 and 1.96.

REMEMBER: We use the Two-Tailed Test if we are using the symbol ≠ in the H1.
Do you understand? The preceding discussion is based only on the α = .05 level of
significance.
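The three cases discussed above (left-tailed, right-tailed, and two-tailed) differ only in which area is looked up. The sketch below reproduces the α = .05 critical values with Python's standard library instead of a printed Z-table; the function name `z_critical` and the tail labels are assumptions for this illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the critical value(s) of Z as a tuple for the given tail."""
    z = NormalDist()                       # standard normal: mean 0, sd 1
    if tail == "left":
        return (z.inv_cdf(alpha),)         # area alpha on the extreme left
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)     # area alpha on the extreme right
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)   # area alpha/2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    print(tail, [round(v, 3) for v in z_critical(0.05, tail)])
# left [-1.645]
# right [1.645]
# two [-1.96, 1.96]
```

Other levels of significance can be explored by changing the first argument.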

In the beginning of this module, we talked about brands of coffee and the
people’s preference. When data are nominal, the only thing we can do to describe
the population or sample is to count the number of occurrences for each category.
From the counts we determine the proportions. (Keller/Warrack, p. 373)

In the following discussion, we will perform a hypothesis test by comparing a
sample proportion with a hypothesized proportion. The procedure here is almost
similar to what you did when you dealt with sample mean and population mean.

The sampling distribution of a proportion approximately follows a
standardized normal distribution. (Levine, p.356)
The test statistic for a hypothesized proportion p0 is given as follows:

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \qquad \text{where } \hat{p} = \frac{X}{n}$$

𝑝̂ = sample proportion
X = number of successes
n = sample size
p0 = hypothesized population proportion
q0 = 1 – p0

NOTE: The sampling distribution of 𝑝̂ is approximately normal for np0 > 5 and nq0 > 5.
(Keller/Warrack, p. 374)
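The formula and the condition in the NOTE can be combined into one small function. This is a sketch under the stated condition only; the name `proportion_z` is an assumption, and the two usage lines take their counts from the worked solutions in this lesson (275 out of 1,000 and 95 out of 500).

```python
from math import sqrt


def proportion_z(x, n, p0):
    """Z statistic for a sample proportion x/n against a hypothesized p0."""
    q0 = 1 - p0
    # The normal approximation is used only when n*p0 > 5 and n*q0 > 5.
    if n * p0 <= 5 or n * q0 <= 5:
        raise ValueError("need n*p0 > 5 and n*q0 > 5 for the normal approximation")
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * q0 / n)


print(round(proportion_z(275, 1000, 0.25), 2))  # 1.83
print(round(proportion_z(95, 500, 0.20), 2))    # -0.56
```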

EXAMPLE 1: A local government official claims that only up to 25% of all public
school students in the city own an electronic gadget that can be used for distance
learning, like a cellphone, tablet, or laptop. To test the claim, a group of Grade 11
Statistics students made a survey and found out that out of 1,000 randomly selected
students, 275 indicated that they are ready for online learning. Can we infer from
the data that the local official is true to his claim? Use α = .05.
SOLUTION:
(1) H0: The proportion of students who own an electronic gadget is at most 25%,
p ≤ 0.25
H1: The proportion of students who own an electronic gadget is more than 25%,
p > 0.25
NOTE: We are using the symbol > in our H1 because we hope to show that the obtained
sample count of 275 is significantly greater than 250, which is 25% of 1,000.
(2) One-Tailed Test, α = .05
(3) Is np0 > 5? YES! Is nq0 > 5? YES!
np0 = (1000)(0.25) = 250
nq0 = (1000)(0.75) = 750
(4) USE Z TEST
$$\hat{p} = \frac{X}{n} = \frac{275}{1000} = 0.275, \qquad q_0 = 1 - p_0 = 1 - 0.25 = 0.75$$

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.275 - 0.25}{\sqrt{(0.25)(0.75)/1000}} \approx 1.83 \quad \text{(computed value)}$$

(5) Set up the Rejection Region and the Critical Values

(6) DECISION RULE: FOR RIGHT-TAILED TEST
If the computed value is greater than or equal to the critical value,
REJECT H0. Otherwise, do NOT reject the Null Hypothesis.

(7) COMPARE the computed value with the critical value: Since 1.83 > 1.645,
DECISION: REJECT H0.

(8) CONCLUSION: There is enough evidence to reject the claim of the government
official. There is a Significant Difference between the sample proportion and the
hypothesized population proportion. It is safe to say that the proportion of students
who own an electronic gadget in this city is more than 25%.

Let us now solve the problem presented in the beginning of this module – the
ABM coffee.
SOLUTION:
(1) H0: The proportion of residents in Pinalagad that prefer the ABM coffee is 20%
or more, p ≥ 0.20
H1: The proportion of residents in Pinalagad that prefer the ABM coffee is
less than 20%, p < 0.20
NOTE: We are using the symbol < in our H1 because we hope to prove that the obtained
sample value 95 is significantly less than 100, which is 20% of 500.
(2) One-Tailed Test, α = .05
(3) Is np0 > 5? YES! Is nq0 > 5? YES! np0 = (500)(0.20)= 100 nq0 =(500)(0.80)=400
(4) USE Z TEST
$$\hat{p} = \frac{X}{n} = \frac{95}{500} = 0.19, \qquad q_0 = 1 - p_0 = 1 - 0.20 = 0.80$$

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.19 - 0.20}{\sqrt{(0.20)(0.80)/500}} \approx -0.56 \quad \text{(computed value)}$$

(5) Set up the Rejection Region and the Critical Value

(6) DECISION RULE: FOR LEFT-TAILED TEST
If the computed value is less than or equal to the critical value, REJECT H0.
Otherwise, do NOT reject the Null Hypothesis.

(7) COMPARE the computed value with the critical value: Since -0.56 > -1.645,
DECISION: Do NOT reject H0.
(8) CONCLUSION: There is NOT enough evidence to reject the claim of ABM Coffee
Company. There is NO Significant Difference between the sample proportion and the
hypothesized population proportion. The data are consistent with the claim that at
least 20% of Pinalagad residents prefer ABM coffee.

(a) Set up the Rejection Region and the Critical Values for Left-Tailed, Right-Tailed,
and Two-Tailed tests. Use α = .01.
(b) Set up the Rejection Region and the Critical Values for Left-Tailed, Right-Tailed,
and Two-Tailed tests. Use α = .10.

Targets:
1. Illustrate the nature of bivariate data. M11/12SP-IVg-2;
2. Construct a Scatter Diagram. M11/12SP-IVg-3; and
3. Describe the shape (FORM), trend (DIRECTION), and variation
(STRENGTH) based on the scatter diagram. M11/12SP-IVg-4

Do this Pre-Test: Check the appropriate box (TRUE or FALSE).


TRUE FALSE
1. Data which involve a single variable are called univariate
data.
2. Brand of vaccine is an example of categorical variable.
3. The dependent variable is usually positioned in the X axis.
4. When using the p-value approach, we reject the null
hypothesis if the computed p-value is greater than α, the
level of significance.
5. The p-value is equal to the probability of committing Type I
error or α.

Lesson 7
Constructing and Analyzing a Scatter Diagram
In this lesson, the learners will learn the nature of bivariate data. They will
also learn how to construct and analyze a scatter diagram.

To see graphically the possible relationship between two variables, the
Scatter Diagram is used. The Scatter Diagram (or Scatter Plot) is a technique used
to describe the relationship between two numerical variables. (Keller/Warrack, p.58)
When drawing the scatter diagram, we need the raw data from our two
variables. Each pair of observations from the two variables is represented by a dot.
This is similar to what you did in your Gen Math class when you plotted points on
the XY plane.
In most cases, one variable seems to be dependent on the other variable. To
cite some examples: an individual’s income somewhat depends on the number
of years of education (the higher your educational attainment, the higher you expect
your salary to become); a company’s sales depend on the amount spent on advertising
(this is the reason why companies spend a lot of money to advertise their products);
and a student’s score in a major exam may depend on the number of hours spent
studying (we sincerely hope you prepare really well for your exams).

HOW TO DRAW A SCATTER DIAGRAM (UST Worktext, p. 149)


1. Draw the X and Y axes.
2. Position the independent variable on the X axis. Use an appropriate scale.
3. Position the dependent variable on the Y axis. Use also an appropriate scale.
4. Plot each ordered pair, (x, y) from the raw data.
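
The four steps above can be sketched in code. The example below is a minimal illustration and not part of the cited worktext: the (hours, score) pairs are made-up data, text_scatter is a hypothetical helper, and the diagram is drawn with text characters so that it runs with plain Python.

```python
def text_scatter(pairs, width=20, height=10):
    """Return text rows with '*' marking each (x, y) ordered pair."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_min, y_min = min(xs), min(ys)
    # Scale of each axis; fall back to 1 if all values are equal.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in pairs:
        # Independent variable goes left to right (X axis).
        col = round((x - x_min) / x_span * (width - 1))
        # Dependent variable goes bottom to top (Y axis).
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]


# Hypothetical raw data: (hours spent studying, exam score).
pairs = [(1, 50), (2, 55), (3, 65), (4, 70), (5, 85)]
rows = text_scatter(pairs)
for line in rows:
    print("|" + line)
print("+" + "-" * 20)
```

With these made-up values the dots climb from the lower left to the upper right of the grid.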

Example
It is unfortunate that this generation has experienced a dreaded pandemic.
The following are the actual numbers of COVID-19 cases in the Philippines starting
January 30, 2020, when the first case was detected. The data cover a period of 20
weeks from January 30 to June 30, 2020. (en.m.wikipedia.org)

What do you notice about the general direction of the dots in our example?
Take a look at the example again. You can clearly see that as time passes, the
number of COVID-19 cases also increases. As the independent variable increases, the
dependent variable also increases. When this happens, we say that there is a
positive linear relationship between the two variables.

Sometimes the general direction of the dots is downward. As the independent
variable increases, the dependent variable decreases. When this happens, we say
that there is a negative linear relationship between the two variables.
Now, let’s try a crude technique of determining the strength of a linear
relationship. Please, get a pencil and a ruler, and try to draw a straight line through
the middle of the scatter diagram. Position the straight line in such a way that you
can “pierce” as many dots as possible. If most of the dots fall close to the line, we say
that there is a strong linear relationship. (Keller/Warrack, p. 61) If you are having
a hard time positioning your straight line and you can’t “pierce” even a few dots, we
say that there is a weak or no linear relationship at all.

The following are examples of scatter diagrams that show strength.


In a certain senior high school in Valenzuela City where Gen Math (X) and
Statistics (Y) are offered as core subjects, a sample of 15 students was drawn. The
midterm grades for both subjects were recorded for each student. The data are
listed below. (Keller/Warrack, p. 67)

a. Draw a scatter diagram of the data.


b. What does the graph tell you about the relationship between the grades in Gen
Math and Statistics?
SOLUTION: (a)

(b) There is a positive linear relationship between the two core subjects, and the
relationship is of moderate strength.

In a certain senior high school in Valenzuela City, a random sample of 10
students in Statistics was asked about the number of hours they spent
studying (X) and the scores (Y) they received in the recently concluded Final
Exam. The data are given below. (Acelejado, et al., p. 185)

a. Draw a scatter diagram for the given data.


b. Describe the relationship between the two variables with respect to direction and
strength.

Targets:
1. Calculates the Pearson’s sample correlation coefficient. M11/12SP-IVh-2; and
2. Solves problems involving correlation analysis. M11/12SP-IVh-3

Do this Pre-Test: Write True if the statement is correct, False otherwise. Write your
answer in your notebook.
____________1.) The value 1.001 can be a correlation coefficient r.
____________2.) Negative relationship means direct relationship.
____________3.) The first step in computing Pearson’s sample correlation coefficient r
is to get the sum of all entries in all columns.
____________4.) If the coefficient of correlation falls between 0.51 and 0.74, there is a
high negative correlation.
____________5.) In the Pearson r, n represents the sum of the x-values.

Lesson 8: Calculating the Pearson’s Sample Correlation Coefficient

The correlation coefficient, r, between two sets of data is a measure of how
well they are related. It is a measure of the strength of the relationship between or
among variables.
The strength of correlation is indicated by the coefficient of correlation. There
are several coefficients of correlation. The one most commonly used in linear
correlation is the Pearson product-moment correlation coefficient.
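
For reference, the computational form of the Pearson product-moment correlation coefficient that is commonly used with a table of column sums is shown below. The notation is an assumption: n is the number of (x, y) pairs and each sum runs over all pairs.

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```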

Note: Round the value of the correlation coefficient r to three decimal places.

The type of relationship is represented by the correlation coefficient:


r = +1          perfect positive correlation
0 < r < +1      positive relationship
r = 0           no relationship
–1 < r < 0      negative relationship
r = –1          perfect negative correlation
The correlation coefficient is bounded by –1 and +1. The closer the coefficient is to
–1 or +1, the stronger the correlation.
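
The computation and the reading of the sign can be sketched in Python. This is a minimal illustration under stated assumptions: the data are made up, the function names pearson_r and describe are hypothetical, and the formula used is the standard computational form based on column sums.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson sample correlation coefficient, rounded to 3 decimals."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return round(numerator / denominator, 3)


def describe(r):
    """Map r to the type of relationship it represents."""
    if r == 1:
        return "perfect positive correlation"
    if r == -1:
        return "perfect negative correlation"
    if r > 0:
        return "positive relationship"
    if r < 0:
        return "negative relationship"
    return "no relationship"


# Hypothetical data for five paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(r, describe(r))  # 0.775 positive relationship
```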

The given example will guide you on how to compute the Pearson’s sample
correlation coefficient r.

Example 1: Find the coefficient of correlation and interpret the relationship between
the two sets of test scores in Algebra and Geometry of ten (10) students as shown
below:

Solution:

Since the table is already completed, proceed to substituting the values required
by the formula:

Interpretation: Since r is closer to +1, there is a strong positive correlation between
the test scores in Algebra and Geometry.

Example 2. Correlation Coefficient


Below are the data for six participants giving their number of years in college
(X) and their subsequent monthly income (Y). Which one of the following best
describes the correlation between X and Y?

Solution:
Step 1: Complete the table.

Step 2: Substitute the values obtained through summations.

Step 3: Interpret.
Since r is closer to +1, there is a strong positive correlation between number of
years of college and the monthly income.

Complete the table below. Fill in the blanks in the formula to arrive at the computed
Pearson r. Then interpret the result.

Directions: Solve the given problem.
The time x in years that an employee spent at a company and the employee’s
hourly pay, y, for 5 employees are listed in the table below. Calculate and interpret
the correlation coefficient r. Refer to the table below:

(4) Calculate the correlation coefficient r.


(5) Interpret the result.

Targets:
1. Calculates the slope and y-intercept of the regression line. M11/12SP-IVi-3;
2. Interprets the calculated slope and y-intercept of the regression line. M11/12SP-
IVi-4;
3. Predicts the value of the dependent variable given the value of the independent
variable. M11/12SP-IVj-1; and
4. Solves problems involving regression analysis. M11/12SP-IVj-2.

Do this Pre-Test: Write True if the statement is correct, False otherwise. Write
your answer in your notebook.
____________1.) The y-intercept is the value of y when x=0.
____________2.) Correlation is used to determine the existence, strength, and direction
of the relationship between bivariate data.
____________3.) In regression analysis, a response variable is also known as the
dependent variable.
____________4.) The equation for the straight line that is used to estimate y based on
x is referred to as linear equation.
____________5.) Independent Variable is also known as output variable.

Lessons 9-10: Dependent and Independent Variables & Regression Analyses

Let us review dependent and independent variables. A dependent variable is the
variable whose value depends on, or is predicted from, the predictor. It is sometimes
called the outcome or response variable. The examples below illustrate this:
● How you will perform in a race depends on your training.
● How much you weigh depends on your diet.
● How much you earn depends upon the number of hours you work.
Independent variables, on the other hand, are the variables that researchers
manipulate or change and whose effects are measured and compared. They are
sometimes called predictors or inputs.
In the equation 𝑦 = 3𝑥, can you tell which is the independent and which is the
dependent variable? Yes, y is the dependent variable while x represents the
independent variable.
The technique used to develop the equation for a straight line and make
predictions about the relationship between two variables is called Regression
Analysis. The equation for the straight line that is used to estimate y based on x is
referred to as the regression equation.
The equation of the regression line is written as 𝑦 = 𝑎 + 𝑏𝑥, where 𝑎 is the
y-intercept and 𝑏 is the slope.
The formulas used to generate the Regression Equation (least squares method) are:
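
In standard least-squares notation, consistent with 𝑦 = 𝑎 + 𝑏𝑥, the slope and y-intercept can be written as follows. The symbols are assumptions: n is the number of (x, y) pairs, each sum runs over all pairs, and the bars denote sample means.

```latex
b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {n\sum x^{2} - \left(\sum x\right)^{2}},
\qquad
a = \frac{\sum y - b\sum x}{n} = \bar{y} - b\,\bar{x}
```

The slope b is computed first because the y-intercept a depends on it.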

The following example will help you identify the independent and dependent variables
in some research questions:
Example 1:
1. To what extent does remote working increase job satisfaction?
   Independent variable: remote working
   Dependent variable: job satisfaction
2. What is the effect of intermittent fasting on blood sugar levels?
   Independent variable: intermittent fasting
   Dependent variable: blood sugar levels
3. Do stressful experiences increase the likelihood of headaches?
   Independent variable: stressful experiences
   Dependent variable: likelihood of headaches
4. How does time of day affect someone’s alertness?
   Independent variable: time of day
   Dependent variable: someone’s alertness
5. Is it true that women are more attracted to men without earrings than to men
   with earrings?
   Independent variable: wearing of earrings among men
   Dependent variable: women’s attraction

Example 2. Regression Analysis:

Regression analysis can be used to further describe the relationship between the
dependent and independent variables. See the example below:
The Chief of the Admission Office of a certain university wanted to determine if
the Exit Exam Rating (EER) is a good indicator of the Grade Point Average (GPA) of
the 16 academic scholars selected at random from the graduating class. Their GPA
and EER are shown below:

Describe the relationship of EER to GPA. What is the point estimate of the
graduating GPA when the EER is 85?
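
A point estimate of this kind is obtained by fitting the regression line and then substituting the given x. The sketch below shows the mechanics only: it uses made-up (x, y) pairs rather than the EER and GPA values, the function names fit_line and predict are hypothetical, and the slope and intercept use the standard least-squares formulas.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope first, because the intercept depends on it.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def predict(a, b, x):
    """Point estimate of y for a given x on the fitted line."""
    return a + b * x


# Hypothetical data for five paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
estimate = predict(a, b, 6)
print(round(a, 3), round(b, 3), round(estimate, 3))  # 2.2 0.6 5.8
```

Here the slope of 0.6 means y rises by about 0.6 for each one-unit increase in x, and the y-intercept of 2.2 is the estimated y when x = 0.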

Solve for this:
Connect the dependent and independent variables to form a correct sentence
structure. The first one is given as an example.

1. Independent variable: number of hours spent in practice
   Dependent variable: winning the contest
   Correct sentence: Winning the contest depends on the number of hours spent in
   practice.
2. Independent variable: cubic meters used in a household
   Dependent variable: water bill
   Correct sentence: The water bill depends on the cubic meters used in a household.
3. Independent variable: screen time spent daily
   Dependent variable: eye health status of a person
   Correct sentence: The eye health status of a person depends on the screen time
   spent daily.
4. Independent variable: participation of learners
   Dependent variable: academic performance
   Correct sentence: The academic performance depends on the participation of
   learners.

Directions: Solve the following problem:

The time x in years that an employee spent at a company and the employee’s
hourly pay, y, for 5 employees are listed in the table below:

(1) What is the independent variable?
(2) What is the dependent variable?
(3) Describe the relationship between the variables.
(4) Find the equation of the regression line.
(5) How much could an employee’s hourly pay be after 20 years in the
company? Round off your answer to the nearest whole number.

Almeda, Capistrano, Ferry Sarte. (2010). Elementary Statistics. Quezon City: University of the
Philippines Press.

Belecina, R., Baccay, E., & Mateo, E. (2016). Statistics and Probability. Manila, Philippines:
REX Book Store Inc.

Bluman, A. (2018). Elementary Statistics: A Step by Step Approach 10th edition. McGraw
Hill. New York, USA.

Canlapan, R. (2016). Statistics and Probability. Makati, Philippines: Diwa Learning System
Inc.

Keller & Warrack. (2003). Statistics for Management and Economics. California, USA:
Thomson Learning, Inc.

Levine et al. (2005). Statistics: A Handbook for Managers. New Jersey: Prentice Hall.

PERCDC Learnhub

Walpole, R., Myers, R., Myers, S., and Ye, K., (2012). Probability and Statistics for Engineers
and Scientists 9th edition. Pearson Education Inc. Massachusetts, USA.

For inquiries or feedback, please write or call:


Department of Education – SDO Valenzuela
Office Address: Pio Valenzuela Street, Marulas, Valenzuela City
Telefax: (02) 8292-4340
Email Address: sdovalenzuela@deped.gov.ph
