STATISTICS AND
PROBABILITY
Learning Activity Sheets
Quarter 4: Week 1-8
Name of learner: ___________________________ Section: ___________________
BACKGROUND INFORMATION:
Hypothesis Testing
1. State the hypothesis to be tested. The first step in hypothesis testing involves the
statement of the claim that you want to test. For instance, you want to validate the claim
that the mean number of hours of sleep of college students is 5 hours.
2. Set the standard that describes whether the claim is true or not. For example, in order to
validate the claim that the mean number of hours of sleep of college students is 5, then
the majority of the chosen samples should have a mean number of hours of sleep that is
equal to or close to 5. This is the criterion.
3. Compute the test statistic. After the sample is selected, compute the test statistic, which is
the sample mean in this example, but basically depends on the statistic being tested. For
instance, choose 20 college students at random and compute the mean number of hours
they sleep every day.
4. Make the decision. This step involves comparing the mean of the chosen sample of 20
students to the expected mean number of hours of sleep of all college students and
testing, with an existing statistical test, whether the difference is significant. The hypothesis is
refuted if the difference in the values is statistically significant.
There are two types of statistical hypotheses: the null hypothesis and the alternative
hypothesis. The null hypothesis, denoted by HO, is the hypothesis to be tested. It has a statement
of equality, such as ≥, ≤, or =. On the other hand, the alternative hypothesis, denoted by HA, is
the hypothesis that has no statement of equality, such as >, <, or ≠. For instance, in the same
example of testing the claim that the mean number of hours of sleep of college students is 5, the
null and alternative hypotheses, respectively, are the following:
Ho: The mean of the number of hours of sleep of college students is equal to 5.
Ha: The mean of the number of hours of sleep of college students is not equal to 5.
Example 1: Tell whether the following statement is a null or an alternative hypothesis, “The
mean general weighted average (GWA) of the college students in City College
of Angeles City is 84.8”.
Solution: The statement “The mean general weighted average (GWA) of the college
students in City College of Angeles City is 84.8” is an example of a null
hypothesis because it describes a value that is equal to the population parameter.
The related alternative hypothesis can be stated as any of the following:
a. The mean GWA of the college students in City College of Angeles City is not
equal to 84.8.
b. The mean GWA of the college students in City College of Angeles City is greater
than 84.8.
c. The mean GWA of the college students in City College of Angeles City is less
than 84.8.
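There are two basic types of hypothesis testing procedures depending on the alternative
hypothesis.
1. The directional test of hypothesis, more commonly referred to as one-tailed test,
makes use of only one side or tail of the statistical model or distribution. It is either of
the following: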
a. Right-tailed test: It is used when an assertion is made that the difference falls within
the positive end of the distribution. The alternative hypothesis uses comparatives such
as greater than, higher than, better than, superior to, exceeds, above, increased, etc.
b. Left-tailed test: It is used when an assertion is made that the difference falls within the
negative end of the distribution. The alternative hypothesis uses comparatives such
as less than, smaller than, inferior to, lower than, below, decreased, etc.
2. The non-directional test of hypothesis, more commonly referred to as two-tailed test,
makes use of two opposite sides or tails of the statistical model or distribution. It is used
when no assertion is made as to whether the difference falls within the positive or the
negative end of the distribution. The alternative hypothesis uses comparatives such as not
equal to, different from, not the same as, etc.
1. right-tailed test
HO: The daily average time spent by Filipinos on social media is at most 4 hours a day.
(HO:μ≤4)
HA: The daily average time spent by Filipinos on social media is more than 4 hours a day.
(HA:μ>4)
2. left-tailed test
HO: The daily average time spent by Filipinos on social media is at least 4 hours a day.
(HO:μ≥4)
HA: The daily average time spent by Filipinos on social media is less than 4 hours a day.
(HA:μ<4)
3. two-tailed test
HO: The daily average time spent by Filipinos on social media is 4 hours a day. (HO:μ=4)
HA: The daily average time spent by Filipinos on social media is not equal to 4 hours a
day. (HA:μ≠4)
Level of Significance
Rejection Region
The rejection region pertains to the set of all values for which the null hypothesis will be
rejected.
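The idea of a rejection region can be sketched in a few lines of Python. This is only an illustration: the function name `in_rejection_region` and the convention of passing the critical value as a positive number are assumptions made for this sketch, not part of the activity sheet.

```python
def in_rejection_region(z, critical_value, tail):
    """Return True when the test statistic z falls in the rejection region.

    tail is "right", "left", or "two" (hypothetical labels for this sketch);
    critical_value is given as a positive number, e.g. 1.645 or 1.96.
    """
    c = abs(critical_value)
    if tail == "right":
        return z > c          # reject for large positive z
    if tail == "left":
        return z < -c         # reject for large negative z
    if tail == "two":
        return abs(z) > c     # reject in either tail
    raise ValueError("tail must be 'right', 'left', or 'two'")


print(in_rejection_region(2.0, 1.96, "two"))     # True
print(in_rejection_region(1.0, 1.645, "right"))  # False
```

A value that lands in the rejection region leads to rejecting the null hypothesis; any other value leads to not rejecting it.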
The following table shows the types of hypothesis test and their corresponding rejection
regions.
The decisions we make about the population are based on the samples and the result of
the test on HO; the result may be negative or positive relative to the null hypothesis. If the result
corresponds with reality, then a correct decision has been made. Otherwise, an error must have
been committed. There are two types of possible errors in hypothesis testing, the type I and type II
errors.
A type I error occurs when the null hypothesis is rejected when it is true. This means that
a true hypothesis is incorrectly rejected. On the other hand, a type II error occurs when the null
hypothesis is not rejected when it is false.
There are four possible outcomes in the situation. Philip either does or does not have
dengue fever, and either he will receive treatment or not. We have the following hypotheses:
Based on the blood test results, the doctors decide on Philip’s treatment.
If Philip receives treatment but he does not have dengue fever, then a type I error has been
committed (see Quadrant A). On the other hand, if Philip receives treatment and he has dengue
fever, then a correct decision has been made (see Quadrant B).
If Philip does not receive treatment and he does not have dengue fever, a correct decision
has been made (see Quadrant C). However, if Philip does not receive treatment and he has dengue
fever, then a type II error has been committed (see Quadrant D).
In reality, the null hypothesis may or may not be true, and a decision is made to reject or
not reject it on the basis of the data obtained from a sample. In every hypothesis testing situation,
there are two possibilities for a correct decision and two possibilities for an incorrect decision. This
is illustrated by the following table:
The probability of a type I error is called the level of significance and is usually set by the
researcher. For example, if α=0.05, then there is a 5% probability that we will reject the null
hypothesis when it is true.
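As a rough check of what α means as an area, the sketch below computes the tail area of the standard normal curve beyond a cut-off, using Python's standard `statistics.NormalDist`. The function names are hypothetical, and the cut-offs 1.645 and 1.96 are the usual table values for α = 0.05.

```python
from statistics import NormalDist


def right_tail_area(z):
    """Area under the standard normal curve to the right of z."""
    return 1.0 - NormalDist().cdf(z)


def two_tail_area(z):
    """Combined area in both tails beyond -|z| and +|z|."""
    return 2.0 * right_tail_area(abs(z))


print(round(right_tail_area(1.645), 3))  # 0.05
print(round(two_tail_area(1.96), 3))     # 0.05
```

In both cases the shaded area is about 0.05, which is the probability of rejecting a true null hypothesis at that cut-off.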
The z−score is used to find the corresponding tail area of the probability of committing type I and
type II errors.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where
x̄ = sample mean;
𝜇 = population mean;
𝜎 = population standard deviation; and
n = sample size
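A minimal sketch of this formula in Python follows. The function name `z_score` is an assumption for illustration; the numbers (μ = 15, x̄ = 17, σ = 8, n = 36) are the ones that appear in this sheet's worked computation.

```python
import math


def z_score(sample_mean, population_mean, population_sd, n):
    """z = (x̄ − μ) / (σ / √n)."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


print(round(z_score(17, 15, 8, 36), 2))  # 1.5
```

Here the standard error is 8/√36 ≈ 1.33, so a sample mean 2 units above μ lies 1.5 standard errors away.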
𝛼 = 𝑃(𝑇𝑦𝑝𝑒 𝐼 𝑒𝑟𝑟𝑜𝑟)
𝜇 = 15, x̄ = 17
Since x̄ = 17 corresponds to a z-score of $\frac{17 - 15}{8 / \sqrt{36}} = 1.5$, then
GENERAL INSTRUCTION: Write your answer on a separate paper
ACTIVITY-1: Answer the following question on a sheet of paper.
1. Differentiate a null hypothesis from an alternative hypothesis.
2. Differentiate the non-directional test from the directional test.
3. Given the null hypothesis, “The mean age of the patients in a hospital is equal to 26”, state
the alternative hypothesis if the test is
a. right-tailed;
b. left-tailed
c. two-tailed
ACTIVITY-2:
1. The level of significance of a certain test is 0.05. What does this mean?
A. The degree of certainty required to reject the alternative hypothesis in favor of the null
hypothesis is 0.05.
B. The degree of certainty required to accept the null hypothesis over the alternative
hypothesis is 0.05.
C. The degree of certainty required to reject the null hypothesis in favor of the alternative
hypothesis is 0.05.
D. The degree of certainty required to reject both of the null and alternative hypotheses is
0.05.
2. In a certain hypothesis testing procedure the rejection region is 𝑧 < −1.96. What does this
mean?
A. The rejection region is composed of values found on the right side of −1.96 in a normal
distribution.
B. The rejection region is composed of values found on the left side of −1.96 in a normal
distribution.
C. The rejection region is the value −1.96.
D. The rejection region is composed of values greater than −1.96.
3. The average weight of the whole chickens sold at Pampang Market is known to be 1.26 kg.
A random sample of 40 chickens shows that the mean weight is 1.34 kg. What is the
alternative hypothesis of the problem?
A. 𝜇 = 1.26 C. 𝜇 ≠ 1.26
B. 𝜇 ≥ 1.26 D. 𝜇 < 1.26
4. What is a type of error wherein the null hypothesis is rejected when in fact it is true?
5. Given that 𝐻𝑜 : 𝜇 = 100 and 𝐻𝑎 : 𝜇 ≠ 100, a researcher rejected the null hypothesis when
𝜇 = 100. What type of error did the researcher commit?
B. Type II error D. No error
6. Given that 𝐻𝑜 : 𝜇 = 100 and 𝐻𝑎 : 𝜇 ≠ 100, a researcher accepted the null hypothesis when
𝜇 = 100. What type of error did the researcher commit?
7. A school administrator claims that the mean IQ of all the students in their school is 126. A
supervisor wants to test this claim. In the hypothesis testing made by the supervisor, she
rejected the null hypothesis when the true population mean is 126. What type of error did
the supervisor commit?
8. What is a type of error wherein the null hypothesis is accepted when in fact it is false?
9. A shop owner claims that his shop earns an average of P10 000 a day with a standard
deviation of P850. To test this claim, a random sample of 40 operating days was tested
and found that the mean is P10 450. What is the z-score of the sample mean P 10 450?
A. 1.15 C. 2.48
B. 1.83 D. 3.35
10. A nutritionist wants to estimate the mean amount of junk food that is consumed by teenagers
aged 11 to 14 years in a week. From a random sample of 50 teenagers, the mean amount
of junk food consumption per week is 250 g. What is the parameter in the problem?
ANSWER KEY:
Activity 1 Activity 2
Answers may vary. 1. C 6. D
2. B 7. A
3. A C 8. B
4. A 9. D
5. A 10. D
Prepared by:
LLOYD D. DUQUE
Writer
WEEK 2: TEST OF HYPOTHESIS
BACKGROUND INFORMATION:
Hypothesis testing is a statistical method that is used in making statistical decisions using
experimental data. Basically, it is a process of gathering evidence to either accept or reject a
claim, a guess, or an assumption, known as a hypothesis.
In real life, we do hypothesis testing every time we need to make decisions on
something that affects our lives. As students, you need to weigh both the positive
and negative sides of a problem before making any decision. Unknowingly,
your decision to enroll in the Open Senior High School went through a series of hypothesis tests. You
were confronted with a lot of “what ifs” until finally you decided to be here, one of the pioneers of
the Open Senior High School Program.
A statistical hypothesis is a claim or a conjecture that may either be true or false. The
claim is usually expressed in terms of the value of a parameter or the distribution of the population
values.
There are two kinds of statistical hypotheses: the null and the alternative hypothesis. The
definition is written inside the box below to remind you that these are very important concepts and
should be remembered as you go on with the module.
In formulating the hypotheses (plural form of hypothesis), we can use the following
guidelines.
1. First, identify the claim. Does it denote “absence” or does it state equality to a certain
value?
2. Identify the parameter used in the claim. Does it talk about population average or a
proportion of the population?
3. Represent the parameter by a symbol. For population mean (average), we use µ and
for population proportion we use p.
4. Always remember that the null and alternative hypotheses are complementary and
must not overlap. The usual pairs are as follows:
Now, let’s apply the guidelines above by formulating the null and alternative hypothesis using the
following situations.
Situation 1: A manufacturer of IT gadgets recently announced they had developed a new battery
for a tablet and claimed that it has an average life of at least 24 hours. Would you buy
this battery?
Step 3: Representation
Symbol: The symbol to be used for parameter is µ
Step 4: Null and alternative hypotheses complementary pair
The claim states “at least 24 hours”. This claim means that the battery life will not
go lower than 24 hours; it will be equal to 24 hours or more than 24 hours. Thus, we will
be using the complementary pair:
Ho: Parameter ≥ Value versus Ha: Parameter < Value
Answer: The null and alternative hypotheses stated in:
(a) Words: Ho : The average life of a newly developed battery for tablet is at
least 24 hours.
Ha : The average life of a newly developed battery for tablet is
less than 24 hours.
(b) Symbols
Ho : µ ≥ 24
Ha : µ < 24
Situation 2: A student researcher wants to test his assumption that 75% of the senior high school
students who enrolled in the academic track wanted to become a teacher. He
collected samples randomly and found out that 25 out of 130 students are planning to
become a teacher. State the null and alternative hypotheses.
There are two possible actions that a person can do with a statement. Either he accepts the
statement or rejects it. The decision of accepting or rejecting a statement depends on the person’s
assessment whether it is true or false. Consider a statement or a claim about the average number
of text messages that an Open Senior High School student sends in a day. The following could be
one way of stating the claim:
“The average number of text messages that an Open Senior High School student sends daily is
equal to 75.”
As stated earlier, this claim could either be true or false so it can be accepted or rejected. The
validity of the statement can be assessed through a series of steps known as test of hypothesis. A
test of hypothesis is a procedure based on a random sample of observations with a given
level of probability of committing an error in making the decision, whether the hypothesis
is true or false.
Outcome 1: If the null hypothesis is true and is not rejected (accepted), the decision is
correct. No error is committed.
Outcome 2: If the null hypothesis is true and rejected, the decision is incorrect. A Type I error
is committed.
Outcome 3: If the null hypothesis is false and rejected, the decision is correct. No error is
committed.
Outcome 4: If the null hypothesis is false and accepted, the decision is incorrect
and a Type II error is committed.
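The four outcomes can be written as a small decision function. The name `classify_decision` and its two boolean parameters are assumptions made for this sketch.

```python
def classify_decision(null_is_true, null_rejected):
    """Label the outcome of a test given the truth about Ho and the decision made."""
    if null_is_true and null_rejected:
        return "Type I error"        # Outcome 2: true Ho rejected
    if not null_is_true and not null_rejected:
        return "Type II error"       # Outcome 4: false Ho not rejected
    return "Correct decision"        # Outcomes 1 and 3


for truth in (True, False):
    for rejected in (True, False):
        print(truth, rejected, classify_decision(truth, rejected))
```

Running the loop prints all four combinations, which mirrors the list of outcomes above.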
Every action that one takes is coupled with consequences. When an error is committed in
decision making, consequences follow too. These consequences might be acceptable or
terrible enough to claim lives. In statistics, the chance of committing an error is measured,
and this measurement serves as the basis in making a decision.
Now, let us examine some examples of errors in decision making.
1. A manufacturer of IT gadgets recently announced they had developed a new battery for a tablet
and claimed that it has an average life of at least 24 hours. Would you buy this battery?
Explanation
A type I error is committed if you decide not to buy the battery even though its average
life really is at least 24 hours. A possible consequence is that you lose the opportunity to
have a battery that lasts at least 24 hours.
A type II error is committed when you buy the battery and find out that the
battery’s life is less than 24 hours. A possible consequence is that you waste your
money on the battery.
2. A teenager who wants to lose weight is contemplating a diet she read about on social media.
She wants to adopt it but, unfortunately, following the diet requires buying nutritious,
low-calorie yet expensive food. Help her decide.
Explanation
A type I error is committed when the teenager does not follow the diet even though it
works. A possible consequence of this error is that she loses the opportunity to attain her
goal of weight reduction.
A type II error is committed when the teenager follows the diet even though it does not
work. A possible consequence is that she spends unnecessarily on a diet that does not help
her reduce weight.
ACTIVITY 1. Read each situation carefully and fill in the space provided with appropriate
information. Happy hypothesizing.
1. A student researcher claims that fewer than 8% of the Junior High School completers will
enroll in private Senior High Schools. To test this claim, he collected sufficient samples
randomly and found out that 85 out of 380 Junior High School completers are planning to
enroll in private Senior High Schools.
Claim: ________________________________________________________
Parameter: ____________________________________________________
Symbol for parameter: ___________________________________________
Ho and Ha complementary pair: _____________________________________
Hypotheses in words:
Ho: __________________________________________________________
Ha: __________________________________________________________
Hypotheses in symbols:
Ho: ____________________
Ha: ____________________
2. A telecommunications company claims that senior high school students spend an average
of 20 Php a day for their cellphone loads. Do you agree with the claim?
Claim: ________________________________________________________
Parameter: ____________________________________________________
Symbol for parameter: ___________________________________________
Ho and Ha complementary pair: _____________________________________
Hypotheses in words:
Ho: __________________________________________________________
Ha: __________________________________________________________
Hypotheses in symbols:
Ho: ____________________
Ha: ____________________
3. The Senior High School researchers claim that more than 20% of Senior High School male
students have tried smoking cigarette. After collecting 150 random samples, they found that
60 of them have tried smoking cigarette.
Claim: ________________________________________________________
Parameter: ____________________________________________________
Symbol for parameter: ___________________________________________
Ho and Ha complementary pair: _____________________________________
Hypotheses in words:
Ho: __________________________________________________________
Ha: __________________________________________________________
Hypotheses in symbols:
Ho: ____________________
Ha: ____________________
4. In a certain town, a school principal hypothesized that students enroll in schools within 5
km from their homes. To check this claim, you ask 38 students from the said town. You
found out that the average distance between the students’ home and their schools is 5.6
km.
Claim: ________________________________________________________
Parameter: ____________________________________________________
Symbol for parameter: ___________________________________________
Ho and Ha complementary pair: _____________________________________
Hypotheses in words:
Ho: __________________________________________________________
Ha: __________________________________________________________
Hypotheses in symbols:
Ho: ____________________
Ha: ____________________
5. A teacher wants to test his assumption that less than 30% of the Senior High School
students liked research class. After randomly collecting 150 samples, he found out that only
40 students like their research class.
Claim: ________________________________________________________
Parameter: ____________________________________________________
Symbol for parameter: ___________________________________________
Ho and Ha complementary pair: _____________________________________
Hypotheses in words:
Ho: __________________________________________________________
Ha: __________________________________________________________
Hypotheses in symbols:
Ho: ____________________
Ha: ____________________
Activity 2. Directions: In each situation below state when the error will be committed and give its
possible consequences.
1. After studying in the Open Senior High School, Mary is thinking whether or not to pursue a
degree in college. She was told that if she graduates with a degree
in college, a life of fulfilment and happiness awaits her. Assist Mary in making her
decision.
2. An airline company does regular quality control checks on airplanes. One of them is
tire inspection because tires are sensitive to the heat produced when the airplane
runs through the runway. Since its operation, the company uses a particular type of
tire which is guaranteed to perform even at a maximum surface temperature of
107ºC. However, the tires cannot be used and need to be replaced when surface
temperature exceeds a mean of 107ºC. Help the company decide whether or not to
do a complete tire replacement.
3. Alden is exclusively dating Maine. He remembers that on their first date, Maine told
him that her birthday was this month. However, he forgot the exact date. Ashamed
to admit that he did not remember, he decides to use the hypothesis testing to make
an educated guess that today is Maine’s birthday. Help Alden do it.
ANSWER KEY:
Activity 1
1. a) Claim: Fewer than 8% of Junior High School completers will enroll in private
Senior High Schools.
b) Parameter : Population proportion
c) Symbol for parameter: p
d) Ho and Ha complementary pair:
Ho: Parameter ≥ Value versus Ha: Parameter < Value
Hypotheses in words:
Ho: The proportion of Junior High School completers who will enroll in private
Senior High Schools is greater than or equal to 8%.
Ha: The proportion of Junior High School completers who will enroll in private
Senior High Schools is less than 8%. (Claim)
Hypotheses in symbols:
Ho : p ≥ 0.08
Ha : p < 0.08
2. a) Claim: Senior high school students spend an average of 20 Php a day for their
cellphone loads.
b) Parameter : Population mean
c) Symbol for parameter: µ
d) Ho and Ha complementary pair:
Ho: Parameter = Value versus Ha: Parameter ≠ Value
Hypotheses in words:
Ho: The average amount of money spent by senior high school students on
their cellphone load a day is equal to Php 20.
Ha: The average amount of money spent by senior high school students on
their cellphone load a day is not equal to Php 20.
Hypotheses in symbols:
Ho : µ = 20
Ha : µ ≠ 20
3. a) Claim: More than 20% of Senior High School male students have tried
smoking cigarette.
b) Parameter : Population proportion
c) Symbol for parameter: p
d) Ho and Ha complementary pair:
Ho: Parameter ≤ Value versus Ha: Parameter > Value
Hypotheses in words:
Ho: The proportion of Senior High School male students who have tried smoking
cigarettes is less than or equal to 20%.
Ha: The proportion of Senior High School male students who have tried smoking
cigarettes is more than 20%. (Claim)
Hypotheses in symbols:
Ho : p ≤ 0.20
Ha : p > 0.20
4. a) Claim: Students enroll in schools within 5 km from their homes.
b) Parameter : Population mean
c) Symbol for parameter: µ
d) Ho and Ha complementary pair:
Ho: Parameter ≤ Value versus Ha: Parameter > Value
Hypotheses in words:
Ho: The average distance between the students’ homes and their schools is less than
or equal to 5 km.
Ha: The average distance between the students’ homes and their schools is
more than 5 km.
Hypotheses in symbols:
Ho : µ ≤ 5
Ha : µ > 5
5. a) Claim: Less than 30% of Senior High School students like research class.
b) Parameter : Population proportion
c) Symbol for parameter: p
d) Ho and Ha complementary pair:
Ho: Parameter ≥ Value versus Ha: Parameter < Value
Hypotheses in words:
Ho: The proportion of Senior High School students who like research class is greater
than or equal to 30%.
Ha: The proportion of Senior High School students who like research class is
less than 30%. (Claim)
Hypotheses in symbols:
Ho : p ≥ 0.30
Ha : p < 0.30
Activity 2
1. A type I error is committed if Mary decides not to pursue a degree in college, and a
possible consequence is that she loses the opportunity to have a happy and fulfilled life. A type
II error is committed when Mary pursues a degree in college and ends up with an
unhappy and less fulfilled life.
2. A type I error is committed when the company decides not to change the tire brand, and the
possible consequence is spending more if the surface temperature exceeds 107ºC.
A type II error is committed when the company decides to change the tire brand and
ends up spending more if the surface temperature of the runway does exceed 107ºC.
3. A type I error is committed when Alden guesses that Maine’s birthday is not today,
and a possible consequence is that he fails to greet Maine or give her a birthday gift.
A type II error is committed when Alden guesses that today is Maine’s birthday, and a
possible consequence is that he makes the mistake of greeting Maine a happy birthday on
the wrong day.
Prepared by:
Almaflor David
Rafael L. Lazatin Memorial High School
WEEK 3: REJECTION REGION FOR A GIVEN LEVEL OF SIGNIFICANCE
Background Information:
The normal curve evolved from the probability distribution. With the area under the curve equal
to 1, it has become a mathematical model in hypothesis testing. The areas are probability values
that we need for decision-making. In hypothesis testing, we determine the probability of obtaining
the sample results if the null is true. Thus, the calculations can be graphically represented by using
the normal curve. The greater than (>) the mean direction can be shown at the right tail of the curve
just as less than (<) the mean direction can be shown at the left tail.
A one-population test is a test conducted on one sample purportedly coming from a population
with a given mean. It is sometimes called a significance test for a single mean. There are two cases to consider
for testing the mean of a single population:
1. The sample is large (n ≥ 30). Thus, we can apply the Central Limit Theorem (CLT)
and we use the normal curve as a model.
2. When CLT is applied, the sample standard deviation may be used as an estimate of the
population standard deviation when it is unknown.
When the sample is large, that is n ≥ 30, the test statistic to use is z. The z statistic measures
the number of standard deviations of the sampling distribution between the observed sample
mean X̄ and the null hypothesized value.
One of the most common tests for population mean is called the z-test which uses the
properties of z-distribution or normal distribution when the population standard deviation or variance
is known. This is also used when the sample size is greater than or equal to 30 (n≥30) by virtue of
central limit theorem.
Formula: $$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
where
𝜇 = population mean
𝜎 = population standard deviation
𝑋̅ = sample mean
n = sample size
If the sample size is small (n<30) and if the population standard deviation or variance is
unknown, the z-test cannot be used. For a special case where the population from which the
samples are taken is known to be normally distributed, the t-test can be used to test a claim or
hypothesis about the population mean.
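The conditions described above can be summarized as a short decision rule. The sketch below is only one reading of this sheet's criteria; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n, sigma_known, population_normal=False):
    """Pick the test for a single population mean using this sheet's criteria.

    n                 -- sample size
    sigma_known       -- True if the population standard deviation is known
    population_normal -- True if the population is known to be normal
    """
    if sigma_known or n >= 30:
        return "z-test"   # sigma known, or large sample by the Central Limit Theorem
    if population_normal:
        return "t-test"   # small sample, sigma unknown, normal population
    return "neither (conditions in this sheet are not met)"


print(choose_test(n=68, sigma_known=True))                          # z-test
print(choose_test(n=12, sigma_known=False, population_normal=True)) # t-test
```

The last branch covers small samples from a population that is not known to be normal, a case this sheet does not address.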
Formula: $$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
where
𝜇 = population mean
𝑠 = sample standard deviation
𝑋̅ = sample mean
n = sample size
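A minimal Python sketch of the t formula follows. The function name `t_statistic` is an assumption; the sample values (𝜇 = 12, X̄ = 15, s = 4, n = 12) are taken from the first item of Activity 2 in this sheet.

```python
import math


def t_statistic(sample_mean, population_mean, sample_sd, n):
    """t = (X̄ − μ) / (s / √n), using the sample standard deviation s."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


t = t_statistic(15, 12, 4, 12)
print(round(t, 2))  # 2.6
```

The computation has the same shape as the z formula; the only change is that s replaces σ.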
Identify the appropriate rejection region for a given level of significance when: (a) the
population variance is assumed to be known; (b) the population variance is assumed to be
unknown; and (c) the Central Limit Theorem is to be used (M11/12SP-IVc-1)
GENERAL INSTRUCTION: Write your answer on a separate paper
Activity 1:
Suppose that the z is the test statistic for hypothesis testing, calculate the value of z for each of
the following
1. 𝜇 = 10, 𝜎 = 3, n = 68, and 𝑋̅ = 9.2
Activity 2:
Calculate the t-statistic for each of the following
1. 𝜇 = 12, 𝑋̅ = 15, s = 4, n = 12
3. 𝜇 = 5.3, 𝑋̅ = 5, s = 0.14, n = 8
5. 𝜇 = 7, 𝑋̅ = 5, s = 3, n = 10
Activity 3:
For each of the following, sketch the normal curve and shade the area of the rejection region.
1. n = 12, 𝛼 = 0.05, left-tailed
3. n = 8, 𝛼 = 0.05, left-tailed
Activity 4:
For each of the following, sketch the normal curve and shade the area of the rejection region.
1. z > 1.96
2. z >1.645
3. z < 2.58
Activity 5:
For each of the given, sketch the normal curve and shade the area of the rejection region. Identify
if the z-value is in the rejection region or not
1. Z = 2, 95% confidence, two-tailed
2. Z = 2.68, 95% confidence, two-tailed
3. Z = 1, 95% confidence, right-tailed
4. Z = 1.33, 99% confidence, two-tailed
5. Z = -4, 99% confidence, two-tailed
Answer Key
Activity 1. Activity 2.
1. z = -2.20 1. t = 2.60
2. z = 4.57 2. t = -1.36
3. z = -5.24 3. t = -6.06
4. z = -4.34 4. t = -4.09
5. z = -7.60 5. t = -2.11
Activity 3 (critical values bounding the shaded rejection regions)
1. -1.796
2. 1.833
3. -1.895
4. 1.761
5. -2.262 and 2.262
Activity 4 (critical values bounding the shaded rejection regions)
1. 1.96
2. 1.645
3. 2.58
4. -1.645 and 1.645
5. -2.58 and 2.58
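Activity 5
1. Z = 2, inside the rejection region (critical values -1.96 and 1.96)
2. Critical values: -1.96 and 1.96
3. Critical value: 1.645
4. Critical values: -2.58 and 2.58
5. Critical values: -2.58 and 2.58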
Prepared:
JOSEL L. DIZON
Senior High School Teacher III
Angeles City Senior High School
WEEK 4: HYPOTHESIS TESTING – TEST ON POPULATION MEAN
Background Information
In the previous lessons, you have learned the basic concepts in hypothesis testing. In this
lesson you will apply those concepts and do a test of hypothesis concerning the population mean.
You will solve real-life problems by following the steps in hypothesis testing, as shown
below:
STEP 1: State the null and alternative hypothesis.
STEP 2: Find the critical value/s and rejection region.
STEP 3: Compute the test statistic value.
STEP 4: Make a decision.
STEP 5: State your conclusion.
The critical value(s) in a z-test is obtained from the Z table (normal distribution table)
while the critical value(s) in a t-test is obtained from the t Distribution table. (See
Appendix A and B)
The critical value(s) separates the critical from the non-critical region.
EXAMPLE 1:
Historical data suggests that the average salary of assistant professors in Angeles City is
at most ₱42,000. A researcher claims that the average salary of assistant professors in Angeles
City is now more than ₱42,000. A sample of 30 assistant professors has a mean salary of ₱43,260.
At 𝛼 = 0.05, test the claim that assistant professors earn more than ₱42,000 a year. The standard
deviation of the population is ₱5,230.
Solution: This is a z-test since the population standard deviation is known and n=30.
STEP 1: State the null and alternative hypothesis.
𝐻𝑜 : The average salary of assistant professors in Angeles City is less than or equal to ₱42,000. In
symbols, 𝐻𝑜 : 𝜇 ≤ ₱42,000
𝐻1 : The average salary of assistant professors in Angeles City is more than ₱42,000.
In symbols, 𝐻1 : 𝜇 > ₱42,000 (Claim)
With the level of significance at 𝛼 = 0.05 and 𝑛 = 30, this is a right-tailed test. Referring to your
z-table (area from 0 to z) to determine the critical value, the area in question is 0.5 − 0.05 = 0.4500.
Locate this area on the table and take note of the value in the first column of the row
where the area is located. Add to this the decimal number at the top of the corresponding column.
In this case, the table does not have an exact area of 0.4500, but it falls between
the areas 0.4495 and 0.4505, which correspond to the z values 1.64 and 1.65. By
interpolation, the critical value corresponding to the area 0.4500 is
(1.64 + 1.65)/2 = +1.645. (See Appendix A)
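As a cross-check on the table lookup, the same critical values can be obtained from the inverse of the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` and its `tail` labels are assumptions made for illustration.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z value for a given significance level and tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        # positive critical value; the negative one is its mirror image
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'right', 'left' or 'two'")

print(round(z_critical(0.05, "right"), 3))  # right-tailed test at alpha = 0.05
print(round(z_critical(0.01, "left"), 2))   # left-tailed test at alpha = 0.01
print(round(z_critical(0.01, "two"), 2))    # two-tailed test at alpha = 0.01
```

Rounded as shown, these agree with the table values 1.645, −2.33 and ±2.58 used in the examples of this lesson.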
Rejection region
If the test value is greater than 1.645, which is in the critical region, then reject the null
hypothesis.
$z_{computed} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{43{,}260 - 42{,}000}{5{,}230 / \sqrt{30}} = +1.32$
Since the test value +1.32 is less than the critical value +1.645, it is not in the critical region.
Do not reject the null hypothesis.
STEP 5: Conclusion.
There is not enough evidence to support the claim that assistant professors on average earn
more than ₱42,000.
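The computation and decision in Example 1 can be reproduced with a short script. This is a minimal sketch; the function name `z_statistic` and the variable names are assumptions for illustration, and the critical value 1.645 is the table value found above.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Return z = (sample_mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Example 1: sample mean 43,260; hypothesized mean 42,000; sigma 5,230; n = 30
z = z_statistic(43_260, 42_000, 5_230, 30)
z_crit = 1.645          # right-tailed critical value at alpha = 0.05
reject = z > z_crit     # right-tailed test: reject Ho only if z exceeds z_crit

print(round(z, 2), "reject Ho" if reject else "do not reject Ho")
```

The script arrives at the same decision as the worked solution: the test value does not reach the critical region.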
EXAMPLE 2:
The father of a senior high school student lists down the expenses he will incur when he
sends his daughter to the university. At the university where he wants his daughter to study, he
hears that the average tuition fee is at least ₱20,000 per semester. He wants to do a test of
hypothesis. From a simple random sample of 16 students, a sample mean of ₱19,750 was
obtained. Further, the variable of interest, which is the tuition fee in the university, is said to be
normally distributed with an assumed population variance equal to ₱160,000. The level of
significance is set at α=0.05.
Solution: Even though n=16, the population standard deviation is known (which is the square root
of the variance). Hence, this is a z-test.
STEP 1: State the null and alternative hypothesis.
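𝐻𝑜 : The average tuition fee in the targeted university is at least ₱20,000.
In symbols, 𝐻𝑜 : 𝜇 ≥ ₱20,000
𝐻1 : The average tuition fee in the targeted university is less than ₱20,000.
In symbols, 𝐻1 : 𝜇 < ₱20,000
At α=0.05, this is a one-tailed z-test (left-tail) with critical value 𝑧 = −1.645. The rejection region is illustrated as follows: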
Rejection region
$z_{computed} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{19{,}750 - 20{,}000}{400 / \sqrt{16}} = -2.50$
The computed value −2.50 is found in the rejection region. Therefore, the null hypothesis is
rejected.
STEP 5: Conclusion.
There is a significant difference in the sample mean and population mean. The father can say
that the average tuition fee in the university where he wants his daughter to study is less than
₱20,000.
EXAMPLE 3:
A national magazine claims that the average college student spends less time on television now
than the general public. The national average is at least 29.4 hours per week, with a standard
deviation of 2 hours. A sample of 30 college students has a mean of 27 hours. Is there enough
evidence to support the claim at 𝛼 = 0.01?
Solution: Population standard deviation is known and n=30. Hence, this is a z-test.
The claim that the average college student watches less television than the general public serves
as the alternative hypothesis, 𝐻1 .
𝐻𝑜 : The average time college students spend watching television is greater than or equal to 29.4
hours. In symbols, 𝐻𝑜 : 𝜇 ≥ 29.4
𝐻1 : The average time college students spend watching television is less than 29.4 hours. In
symbols, 𝐻1 : 𝜇 < 29.4
With 𝛼 = 0.01 and n=30, this is a one-tailed test (left-tail). Based on your right-tailed z table
(from 0 to z), the area under the normal curve in question is
0.50 − 0.01 = 0.4900, which corresponds to the critical value 2.33. Take the negative of this
value, since the left side of the curve is a mirror image of the right side.
The critical region is illustrated below.
Rejection region
𝑧 = −2.33
The decision will be “Do not reject the null hypothesis” if the test value is greater than −2.33.
$z_{computed} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{27 - 29.4}{2 / \sqrt{30}} = -6.57$
Since the test value falls in the critical region, the decision is to reject the null hypothesis.
STEP 5: Conclusion.
There is enough evidence to support the claim that college students spend less time on
television now than the general public.
EXAMPLE 4:
The average cost of child delivery in Quezon City is ₱24,750. To see if the average cost of
child delivery is different at a large hospital, a researcher selected a random sample of 36
deliveries and found that the average cost is ₱25,468. The standard deviation of the population is
₱3,250. At 𝛼 = 0.01, can it be concluded that the average at a large hospital is different from
₱24,750?
Solution: The population standard deviation is known and n=36. Hence, this is a z-test.
STEP 1: State the null and alternative hypothesis.
The claim that the average cost of child delivery is different from ₱24,750 serves as the
alternative hypothesis, 𝐻1 .
𝐻𝑜 : The average cost of child delivery at a large hospital is equal to ₱24,750. In symbols, 𝐻𝑜 : 𝜇 =
₱24,750
𝐻1 : The average cost of child delivery at a large hospital is different from ₱24,750. In symbols,
𝐻1 : 𝜇 ≠ ₱24,750 (Claim)
Since 𝛼 = 0.01 and the test is a two-tailed test, the critical values are 𝑧 = +2.58 and 𝑧 = −2.58.
The rejection region is illustrated below.
[Figure: two-tailed rejection regions, with an area of 0.005 in each tail, to the left of 𝑧 = −2.58 and to the right of 𝑧 = +2.58]
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{25{,}468 - 24{,}750}{3{,}250 / \sqrt{36}} = 1.33$
Since the test value falls in the non-critical region, the decision is do not reject the null hypothesis.
STEP 5: Conclusion.
There is not enough evidence to support the claim that the average cost of child delivery at a
large hospital is different from ₱24,750.
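The decision rule used in Examples 2 to 4 depends only on the computed value, the critical value, and the direction of the test. A minimal sketch of that rule follows; the function name `in_rejection_region` and the tail labels are assumptions made for illustration, and the numbers are the computed and critical values from the examples.

```python
def in_rejection_region(z, z_crit, tail):
    """Return True when the computed z falls in the rejection region.

    z_crit is given as a positive table value; the sign is applied here.
    """
    z_crit = abs(z_crit)
    if tail == "left":
        return z < -z_crit
    if tail == "right":
        return z > z_crit
    if tail == "two":
        return abs(z) > z_crit
    raise ValueError("tail must be 'left', 'right' or 'two'")

print(in_rejection_region(-2.50, 1.645, "left"))  # Example 2
print(in_rejection_region(-6.57, 2.33, "left"))   # Example 3
print(in_rejection_region(1.33, 2.58, "two"))     # Example 4
```

Examples 2 and 3 fall in the rejection region, so Ho is rejected; Example 4 does not, so Ho is not rejected.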
EXAMPLE 5:
The average tuition fee of the university where he wants to send his daughter is known to
be equal to ₱20,000. A father of a senior high school student hypothesizes that this is no longer
true. The variable of interest follows the normal distribution, but the population mean and variance
are unknown. The father asks, at random, 25 students of the university about their tuition fee per
semester. He is able to get an average of ₱20,050 with a standard deviation of 500. At 𝛼 = 0.05,
is there enough evidence to support the father's claim?
Solution: Since the population standard deviation is unknown and n=25, the appropriate test for
this is the t-test.
𝐻𝑜 : The average tuition fee in the targeted university is equal to ₱20,000.
In symbols, 𝐻𝑜 : 𝜇 = ₱20,000
𝐻1 : The average tuition fee in the targeted university is no longer equal to ₱20,000.
In symbols, 𝐻1 : 𝜇 ≠ ₱20,000 (Claim)
Since this is a two-tailed test, the level of significance is split into two areas at both ends of the
curve: (0.05)/2=0.025.
On the t Distribution table: 𝑑𝑓 = 25 − 1 = 24, and under the column for two-tailed test at 𝛼 = 0.05,
the critical values are 𝑡 = +2.064 and 𝑡 = −2.064
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{20{,}050 - 20{,}000}{500 / \sqrt{25}} = +0.50$
STEP 4: Make a decision.
Since the test value is in the non-rejection region, do not reject the null hypothesis.
STEP 5: Conclusion.
There is not enough evidence to support the father's claim that the average tuition fee at the
university where he wants his daughter to study is no longer equal to ₱20,000.
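Example 5 can be checked the same way. The Python standard library has no t distribution, so this sketch takes the critical value 2.064 (df = 24, two-tailed, α = 0.05) from the t table as an input rather than computing it; the variable names are assumptions for illustration.

```python
import math

# Example 5: sample mean 20,050; hypothesized mean 20,000; s = 500; n = 25
x_bar, mu0, s, n = 20_050, 20_000, 500, 25

df = n - 1                                  # degrees of freedom
t = (x_bar - mu0) / (s / math.sqrt(n))      # test statistic
t_crit = 2.064                              # from the t table, df = 24
reject = abs(t) > t_crit                    # two-tailed decision rule

print(df, round(t, 2), "reject Ho" if reject else "do not reject Ho")
```

The test value lies well inside the non-rejection region, which matches the decision in STEP 4.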
Learning Competencies with Code
Compute the test statistic value (population mean). M11/12SP-IVd-1
Draw conclusion about the population mean based on the test-statistic value and rejection
region. M11/12SP-IVd-2
ACTIVITY 1:
Anna Garcia, the manager of EG Manufacturing Company, believes that the average daily
wage of the employees is below the currently known rate of at least ₱300. She wants to prove
this by getting a sample of 22 employees, which resulted in a mean daily wage of ₱285. The
standard deviation of all the salaries is ₱35. Assume the variable is normally distributed. At 𝛼 =
0.01, is there enough evidence to support the manager’s claim?
ACTIVITY 2:
Cris Elevators Inc. claims that the average cost of elevator installation and repair is still
₱228,760. A sample of 60 repairs has an average of 227,880. The standard deviation of the
sample is 3000. At 𝛼 = 0.05, is there enough evidence to reject the company’s claim?
ACTIVITY 3:
A CHED commissioner claims that the average cost of one year’s tuition for all private
colleges in Metro Manila is ₱32,800. A sample of 15 colleges is selected, and the average tuition
is ₱31,080. The standard deviation of the sample is ₱4,000. At 𝛼 = 0.01, is there enough
evidence to reject the claim that the average cost of tuition is equal to ₱32,800?
RUBRIC:
Criteria: Computation, Correctness, Understanding
4 points
Computation: Includes correct computation. Shows multiple computational approaches.
Correctness: All the answers provided are correct and accurate.
Understanding: Excellent understanding of the problem. Shows and explains more than the problem asks.
3 points
Computation: Includes correct computation.
Correctness: Most of the answers provided are correct and accurate.
Understanding: Understands the problem.
2 points
Computation: Includes basic computation, some incorrect.
Correctness: Some of the answers provided are correct and accurate.
Understanding: Minimal understanding of the problem.
1 point
Computation: Computation shows no evidence of understanding.
Correctness: No answers are provided.
Understanding: Attempts but demonstrates no understanding of the problem.
Score
TOTAL ____/12
References:
Hernandez, Rogelio M., Santos, Analiza SM., Raphael, Johann C., Villanueva, Cherry G. Edited by
Cabero, Jonathan B. Basic Statistics. Booklore Publishing Corporation. Sta. Cruz, Manila, 2009.
Answer Key
Activity 1:
STEP 1: State the null and alternative hypothesis.
With 𝛼 = 0.01 and n=22, the test is a left-tailed z test. Based on your right-tailed z table (See
Appendix A), the area in question is 0.50 − 0.01 = 0.4900, which corresponds to the value 2.33. Since the
test is left-tailed, the critical value is negative, 𝑧 = −2.33. The critical region is illustrated below.
Since the test value is in the non-critical region (computed z is greater than the critical value), the
decision is do not reject the null hypothesis.
There is not enough evidence to support the manager’s claim that the average daily wage of the
employees is less than ₱300.
Activity 2:
The population standard deviation is not known but n=60. Hence, this is a z test.
$z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{227{,}880 - 228{,}760}{3000 / \sqrt{60}} = -2.27$
Since the test value is in the critical or rejection region, reject the null hypothesis.
There is enough evidence to reject the company’s claim that the average cost of elevator installation and
repair is still ₱228,760.
Activity 3:
The population standard deviation is not known and n=15. Hence, this requires a t test.
𝐻𝑜 : The average cost of one year’s tuition for all private colleges in Metro Manila is equal to ₱32,800. In
symbols, 𝐻𝑜 : 𝜇 = ₱32,800 (Claim)
𝐻1 : The average cost of one year’s tuition for all private colleges in Metro Manila is not equal to ₱32,800.
In symbols, 𝐻1 : 𝜇 ≠ ₱32,800
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{31{,}080 - 32{,}800}{4{,}000 / \sqrt{15}} = -1.67$
Since the test value is in the non-critical or non-rejection region, do not reject the null hypothesis.
STEP 5: Conclusion.
There is not enough evidence to reject the claim that the average cost of one year’s tuition for all private
colleges in Metro Manila is equal to ₱32,800.
Prepared by:
Roxan O. Condes
SST-II
Amsic Integrated School
Appendix A:
Appendix B
WEEK 5 : CONDUCTING HYPOTHESIS TESTING
Background Information
“A theory, a theorem and a hypothesis walk into a bar, but leave as soon as the
bartender asks them for proofs.” (Rajesh)
The following are examples of problems that deal with hypothesis testing on the population
mean.
Example 1: A researcher administered a developed problem-solving test to 50 randomly selected Grade 6
pupils. In this sample, X̄ = 80 and s = 10. The mean µ and the standard deviation σ of the population
used in the standardization of the test were 75 and 15, respectively. Use the 95% confidence level.
Steps and Answers
1. Formulate the null and alternative hypothesis.
Ho: µ = 75
Ha: µ ≠ 75 (two-tailed)
2. Collect data and decide on appropriate statistical testing procedure. Specify the level of significance α.
Given: µ = 75, X̄ = 80, σ = 15, s = 10, n = 50, α = 0.05, z95 = ±1.96
3. Compute the test statistic.
a. Solve for the standard error:
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{15}{\sqrt{50}} = 2.12$
b. Since n > 30, use z:
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \dfrac{80 - 75}{2.12} = \dfrac{5}{2.12} = 2.36$
4. Determine the rejection region, then make a decision.
z computed vs. z critical value: 2.36 > 1.96
Decision: The null hypothesis is rejected.
a. Ha: µ < 6
Steps and Answers
1. Formulate the null and alternative hypothesis.
Ho: µ ≥ 6
Ha: µ < 6 (one-tailed)
2. Collect data and decide on appropriate statistical testing procedure. Specify the level of significance α.
Given: µ = 6, X̄ = 4.6, s = 1.5, n = 5, d.f. = 4, α = 0.10, t90 = −1.533 (negative, since Ha uses <)
3. Compute the test statistic.
a. Solve for the standard error:
$s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{1.5}{\sqrt{5}} = 0.67$
b. Since n < 30, use t:
$t = \dfrac{\bar{X} - \mu}{s_{\bar{x}}} = \dfrac{4.6 - 6}{0.67} = \dfrac{-1.4}{0.67} = -2.09$
4. Determine the rejection region, then make a decision.
t computed vs. t critical value: −2.09 < −1.533
The null hypothesis is rejected.
5. Make a conclusion about the hypotheses.
There is a significant decrease in the population mean.
b. µ ≠ 6.
Steps and Answers
1. Formulate the null and alternative hypothesis.
Ho: µ = 6
Ha: µ ≠ 6 (two-tailed)
2. Collect data and decide on appropriate statistical testing procedure. Specify the level of significance α.
Given: µ = 6, X̄ = 4.6, σ = 15, s = 1.5, n = 5, α = 0.10, d.f. = 4, t90 = ±2.132
3. Compute the test statistic.
a. Solve for the standard error:
$s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{1.5}{\sqrt{5}} = 0.67$
b. Since n < 30, use t:
$t = \dfrac{\bar{X} - \mu}{s_{\bar{x}}} = \dfrac{4.6 - 6}{0.67} = \dfrac{-1.4}{0.67} = -2.09$
4. Determine the rejection region, then make a decision.
t computed vs. t critical value: −2.09 > −2.132
The null hypothesis is not rejected.
5. Make a conclusion about the hypotheses.
There is no significant difference between the sample mean and the population mean.
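Parts a and b use the same sample and therefore the same test statistic; only the alternative hypothesis, and with it the critical value, changes. The sketch below makes that contrast explicit. The critical values 1.533 and 2.132 are the table values quoted in the two solutions, and the variable names are assumptions for illustration.

```python
import math

# Same sample in both parts: sample mean 4.6, hypothesized mean 6, s = 1.5, n = 5
x_bar, mu0, s, n = 4.6, 6, 1.5, 5
t = (x_bar - mu0) / (s / math.sqrt(n))

# Part a: Ha: mu < 6 (left-tailed), table critical value -1.533
reject_one_tailed = t < -1.533

# Part b: Ha: mu != 6 (two-tailed), table critical values -2.132 and +2.132
reject_two_tailed = abs(t) > 2.132

print(round(t, 2), reject_one_tailed, reject_two_tailed)
```

The one-tailed test rejects Ho while the two-tailed test does not, which is why the direction of the alternative hypothesis has to be fixed before the data are examined.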
The previous examples conducted hypothesis testing involving means. The next
examples are hypothesis testing involving population proportions.
Example: What are the appropriate null and alternative hypotheses when the parameter of interest is a
population proportion?
1. Mr. Sy asserts that less than 5% of the bulbs that he sells are defective, against a claim that it
is more. Suppose 300 bulbs are randomly selected, each is tested, and 10 defective bulbs are
found.
Answer: Ho: p ≥ 0.05
Ha: p < 0.05 (one-tailed) hint: fewer
with n =300, the Central Limit Theorem applies.
2. A survey is conducted to determine the opinions of people on global warming. In a
random sample of 150 people, 108 think that global warming is a serious world problem.
Is there sufficient evidence that the proportion of people who regard global warming as a
serious problem is significantly higher than the claim of at most 60%?
Answer: Ho: p ≤ 0.60
Ha: p > 0.60 (one-tailed) hint: higher than
with n =150, the Central Limit Theorem applies.
GENERAL INSTRUCTION: Write your answer on a separate paper
Practice A. Complete the table. Read the given problem then solve using the steps of hypothesis
testing using population means by filling the blanks.
The owner of a factory that sells a particular bottled fruit juice claims that the average
capacity of their product is at least 250 ml. To test the claim, a consumer group gets a sample of
100 such bottles, calculates the capacity of each bottle, and then finds the mean capacity to be 248
ml. The standard deviation s is 5 ml. Is the claim true? Write your answer on a separate sheet of
paper. (10 points)
Steps Answer
Practice B. Complete the table. Read the given problem then solve using the steps of hypothesis
testing using population means.
In a plant nursery, the owner thinks that the seedlings in a box sprayed with a
new kind of fertilizer have an average height of 26 cm after three days and a standard deviation of
10 cm. One researcher randomly selected 80 such seedlings and calculated the mean height to be 20
cm and the standard deviation was 10 cm. Will you conduct a one-tailed test or a two-tailed test
using α = 0.05?
Steps Answer
Reference
Belecina, Rene R., Baccay, Elisa S., Mateo, Efren B. Statistics and Probability (Quezon City:
Rex Bookstore, Inc., 2016), 233 - 281
Answer Key
Practice A
1. µ < 250
2. one-tailed, left
3. -1.65
4. $\dfrac{5}{\sqrt{100}}$
Practice B
Step 1: Ho: µ = 26
Ha: µ ≠ 26 (two-tailed)
Step 2: µ = 26, X̄ = 20, σ = 10, s = 10, α = 0.05, n = 80
Prepared by:
WEEK 6: TEST INVOLVING POPULATION PROPORTION
Background Information
The principal of a school believes that this year there would be more students from the
school who would pass the National Achievement Test (NAT), so that the proportion of students
who passed the NAT is greater than the proportion obtained in previous year, which is 0.75. What
will be the appropriate null and alternative hypotheses to test this belief?
In this problem, the parameter of interest is the proportion of students of the school who will
pass the NAT this year. To determine if the null hypothesis is rejected or not, we need to use a test
statistic. A test statistic is a sample statistic computed from sample data. The value of the test
statistic is used in determining whether or not we may reject the null hypothesis.
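To test the null hypothesis with a sample large enough for the Central Limit Theorem to apply, the
appropriate test statistic, denoted as ZC, is computed as
$Z_C = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$
Where:
$\hat{p}$ = sample proportion, $p_0$ = hypothesized population proportion, n = sample size
Hypotheses and decision rules:
Ho: P ≥ P0 and Ha: P < P0: Reject Ho if Zc < −Zα. Otherwise, fail to reject Ho.
Ho: P ≤ P0 and Ha: P > P0: Reject Ho if Zc > Zα. Otherwise, fail to reject Ho.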
Example 1: Previous evidence shows that at most, half of the student population are happy and
contented with the university’s policies. This year, a random sample of 100 students was drawn.
They were asked if they were happy and contented with the university’s policies. Out of 100
students, 65 said so. What conclusions could be made at 10% level of significance?
Step 1: Formulate the appropriate null and alternative hypotheses.
Ho: At most, half of the student population are happy and contented with the university’s
policies.
Ha: Majority of the student population are happy and contented with the university’s
policies.
In symbols,
Ho : p ≤ 0.50
Ha : p > 0.50
Step 2: Identify the test statistic to use. With the given level of significance and the distribution of
the test statistics, state the decision rule and specify the rejection region.
Having the variable of interest defined as the number of students, out of n students, who are
happy and contented with the university policies, the appropriate test statistic is
$Z_C = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$
With 10% level of significance, the decision rule is “Reject the null hypothesis (Ho) if ZC >
Z0.10 = 1.28. Otherwise, we fail to reject Ho.” The rejection region is found on the right tail
of the standard normal distribution, to the right of Z = 1.28.
Step 3: Using a simple random sample of observations, compute for the value of the test statistic.
First, compute the sample proportion: $\hat{p} = \dfrac{x}{n} = \dfrac{65}{100} = 0.65$
Second, compute the value of the test statistic with the given Po = 0.50, n = 100:
$Z_C = \dfrac{0.65 - 0.50}{\sqrt{\dfrac{0.50(1 - 0.50)}{100}}} = \dfrac{0.15}{\sqrt{\dfrac{0.50(0.50)}{100}}} = \dfrac{0.15}{\sqrt{\dfrac{0.25}{100}}} = \dfrac{0.15}{0.05} = 3.0$
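The two computations in Step 3 can be sketched as follows. The function name `proportion_z` and the variable names are assumptions for illustration; the critical value 1.28 is the one stated in the decision rule of Step 2.

```python
import math

def proportion_z(x, n, p0):
    """Return the sample proportion and the z statistic for Ho: p = p0."""
    p_hat = x / n                               # sample proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return p_hat, (p_hat - p0) / standard_error

# Example 1: 65 of 100 students said yes; hypothesized proportion 0.50
p_hat, z = proportion_z(65, 100, 0.50)
reject = z > 1.28   # right-tailed decision rule at the 10% level

print(p_hat, round(z, 2), reject)
```

Since the computed value is larger than 1.28, it falls in the rejection region described in Step 2.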
Example 2: Globally the long-term proportion of newborns who are male is 51.46%. A researcher
believes that the proportion of boys at birth changes under severe economic conditions. To test
this belief randomly selected birth records of 5,000 babies born during a period of economic
recession were examined. It was found in the sample that 52.55% of the newborns were boys.
Determine whether there is sufficient evidence, at the 10% level of significance, to support the
researcher’s belief.
Step 1. Formulate the appropriate null and alternative hypotheses.
Let p be the true proportion of boys among all newborns during the recession period. The
burden of proof is to show that severe economic conditions change it from the historic long-term
value of 0.5146 rather than to show that it stays the same, so the hypothesis test is
Ho : p = 0.5146
Ha : p ≠ 0.5146
Step 2: Identify the test statistic to use. With the given level of significance and the distribution of
the test statistics, state the decision rule and specify the rejection region.
The appropriate test statistic is
$Z_C = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$
With 10% level of significance, the decision rule is “Reject the null hypothesis (Ho) if |ZC| > Zα/2
(Zα/2 = Z0.10/2 = Z0.05 = 1.645). Otherwise, we fail to reject Ho.” The rejection region is found
on both tails of the standard normal distribution, to the left of −1.645 and to the right of 1.645.
Step 3: Using a simple random sample of observations, compute for the value of the test statistic.
In the problem, it is stated that the sample proportion is 52.55% of the newborns were boys.
So, the sample proportion = 0.5255
Compute for the value of the test statistic with the given Po = 0.5146, n = 5000
$Z_C = \dfrac{0.5255 - 0.5146}{\sqrt{\dfrac{0.5146(1 - 0.5146)}{5000}}} = \dfrac{0.0109}{\sqrt{\dfrac{0.5146(0.4854)}{5000}}} = \dfrac{0.0109}{\sqrt{\dfrac{0.2498}{5000}}} = \dfrac{0.0109}{0.0071} = 1.54$
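A fuller sketch of the procedure combines the test statistic with the critical value and the decision rule. It is an illustration under stated assumptions: the function name `one_proportion_z_test` and its tail labels are invented for this sketch, and the critical value is computed with `statistics.NormalDist` from the Python standard library instead of being read from the table.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(p_hat, p0, n, alpha, tail):
    """Return (z, critical value, reject?) for a large-sample proportion test."""
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    elif tail == "right":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z > z_crit
    elif tail == "left":
        z_crit = std_normal.inv_cdf(alpha)
        reject = z < z_crit
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, z_crit, reject

# Example 2: sample proportion 0.5255, Po = 0.5146, n = 5000, alpha = 0.10
z, z_crit, reject = one_proportion_z_test(0.5255, 0.5146, 5000, 0.10, "two")
print(round(z, 2), round(z_crit, 3), reject)
```

The computed value is smaller in absolute value than 1.645, so under the decision rule of Step 2 the null hypothesis is not rejected.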
Learning Competencies with Code:
Identify the appropriate rejection region for a given level of significance when the Central
Limit Theorem is to be used. M11/12SP-IVe-6
Compute for the test-statistic value (population proportion). M11/12SP-IVf-1
Draw conclusion about the population proportion based on the test-statistic value and the
rejection region. M11/12SP-IVf-2
Solve problems involving test of hypothesis on the population proportion. M11/12SP-IVf-g-1
Activity 1: Determine if each of the following hypothesis is a one-tailed or two-tailed test and
draw the rejection region.
1. Ho : p ≤ 0.40
Ha : p > 0.40 ________________ ____________________
2. Ho : p ≥ 0.30
Ha : p < 0.30 ________________ ____________________
3. Ho : p ≥ 0.50
Ha : p < 0.50 ________________ ____________________
4. Ho : p = 0.70
Ha : p ≠ 0.70 ________________ ____________________
5. Ho : p ≤ 0.80
Ha : p > 0.80 _______________ ____________________
Activity 2: Determine the sample proportion of successes (p̂) for each of the following problems.
1. The government reports that the literacy rate is 52%. A non-governmental organization
believes it to be less. The organization takes a random sample of 600 inhabitants and
obtains a literacy rate of 42%. Perform the relevant test at the 0.5% (one-half of 1%) level
of significance. _______________.
2. In the previous year the proportion of deposits in checking accounts at a certain bank that
were made electronically was 45%. The bank wishes to determine if the proportion is higher
this year. It examined 20,000 deposit records and found that 9,217 were electronic.
Determine, at the 1% level of significance, whether the data provide sufficient evidence to
conclude that more than 45% of all deposits to checking accounts are now being made
electronically. _____________.
3. Suppose a new treatment for a certain disease is given to a sample of 150 patients. The
treatment was successful for 91 of the patients. Assume that these patients are
representative of the population of individuals who have this disease. _______________.
Activity 3:
A. Compute the value of the test statistic for each test using the information given.
1. Testing H0: p = 0.50 vs. Ha: p > 0.50, n = 360, p̂ = 0.56. ________________
2. Testing H0: p = 0.24 vs. Ha: p ≠ 0.24, n = 40, p̂ = 0.2304. ________________
3. Testing H0: p = 0.37 vs. Ha: p < 0.37, n = 1200, p̂ = 0.35. _________________
B. For each part of Activity 3 A construct the rejection region for the test for α= 0.05
and make the decision based on your answer to that part of the exercise.
2.
3.
Activity 4: Read and understand carefully the given problem. Provide the needed
information for item #1-10.
A soft drink maker claims that a majority of adults prefer its leading beverage over
that of its main competitor’s. To test this claim 500 randomly selected people were given
the two beverages in random order to taste and found that 270 preferred the soft drink
maker’s brand, 211 preferred the competitor’s brand, and 19 could not make up their
minds. Determine whether there is sufficient evidence, at the 5% level of significance, to
support the soft drink maker’s claim against the default that the population is evenly split
in its preference.
Activity 5: Apply the 5 steps in testing the hypothesis of the given problem.
An insurance industry report indicated that 30% of those persons involved in minor traffic
accidents this year have been involved in at least one other traffic accident in the last five years.
An advisory group decided to investigate this claim, believing it was too large. A sample of 200
traffic accidents this year showed that 56 persons were also involved in another accident within
the last five years. Use α = 0.10.
References:
Department of Education. Most Essential Learning Competencies in Statistics and Probability: pp. 67- 68.
Department of Education. Statistics and Probability: Teaching Guide: pp. 385 – 389.
Chan Shio, Christian Paul O., Reyes, Maria Angeli T. Statistics and Probability for Senior High
School. (Quezon City: C & E Publishing Inc., 2017).
Introductory Statistics :https://saylordotorg.github.io/text_introductory-statistics/s12-05-large-
sample-tests-for-a-popul.html, accessed on October 14, 2020.
Introductory Statistics https://stats.libretexts.org/Bookshelves/Introductory_ Statistics /Book%3A
_Introductory_Statistics_(Shafer_and_Zhang)/08%3A_Testing_Hypotheses/8.05%3A_Large_
Sample_Tests_for_a_Population_Proportion accessed on October 14, 2020.
Hypothesis Test Worksheet for One Population Proportion: ourses.wccnet.edu/~palay/math
160r/wrksht10hypoprop.htm accessed on October 14, 2020.
Answer Key
Activity 1
1. one-tailed
2. one-tailed
3. one-tailed
4. two-tailed
5. one-tailed
Activity 2
1. 42% or 0.42
2. 0.46
3. 0.61
Activity 3.A.
1. 2.277
2. -0.14
3. -1.435
Prepared by:
Vilma B. Panela
Master Teacher 1
WEEK 7: CORRELATION ANALYSIS
Background Information
Positive association: as x goes up, y tends to go up. The same positive association is
also exhibited when x goes down, y tends to go down as well.
Example: The profit increases when capital increases. Weight goes down when food intake
goes down
Negative association: as x goes up, y tends to go down. The same negative association
is also exhibited when x goes down, y tends to go up
Example: Savings increase when expenditures decrease. When price decreases,
sales increase.
The statistical procedure that is used to determine or to describe the relationship between
two variables is called correlation analysis.
To explore the relationship between two variables, we can either do it graphically using the
Scatter Plot or numerically using the Pearson Product-Moment Correlation or Pearson r.
2. Graph the points of the bivariate data.
3. Identify the variables and describe how the points are scattered.
A. Scatter Plot
A scatter plot, also called a scatter graph or scatter diagram, is a graphical
representation of the relationship between two variables. The data are plotted as points on the
Cartesian plane, and the points are not joined. The relationship or correlation between two
variables may be described in terms of direction and strength. The line closest to the points is
called the trend line; it indicates the direction.
A negative correlation exists when high values in one variable correspond to low values
in the other variable or low values in one variable correspond to high values in the other
variable.
A zero correlation exists when high values in one variable correspond to either high or low values
in the other variable.
The strength of correlation may be perfect, very high, moderately high, moderately low,
very low and zero.
The following are examples of scatterplot with interpretation of strength and directions.
No Correlation
Sometimes a scatterplot does not evidently show that a correlation exists between the two
variables. This is in the case of very weak correlation where it would be very difficult to identify the
trend line. Thus, we use the Pearson Product-Moment Correlation.
The sign of r denotes the nature of the association while the value of r denotes the strength of
association.
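The formula for r did not survive in this copy of the activity sheet. As a reference, the widely used computational form of the Pearson product-moment correlation coefficient is given below; that the sheet used this particular form is an assumption, although it matches the sums (ƩX, ƩY, ƩX², ƩY², ƩXY, n) that the answer key lists.

```latex
r = \frac{n\sum XY - \sum X \sum Y}
         {\sqrt{\left[\,n\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[\,n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```

Here n is the number of paired observations, and r always lies between −1 and +1.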
Example: The following data shows the scores of five students in Statistics and Physics. Determine
if there is a relationship between the scores in Physics and Statistics using scatterplot and Pearson
r. Interpret the results.
Student            Score in Statistics (X)    Score in Physics (Y)
Hyun Bin           3                          5
Ji Chang Wook      9                          8
Lee Dong Wook      10                         10
Park Bo Gum        12                         9
Lee Min Ho         7                          8
From the graph, we can interpret that there is a strong positive correlation between the score in
Statistics and the score in Physics.
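The graphical reading can be checked numerically. The sketch below computes Pearson r for the five pairs of scores using the usual sums-based computational formula; the function name `pearson_r` and the variable names are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

statistics_scores = [3, 9, 10, 12, 7]   # X
physics_scores = [5, 8, 10, 9, 8]       # Y

r = pearson_r(statistics_scores, physics_scores)
print(round(r, 2))
```

The value of r is positive and close to 1, which agrees with the strong positive correlation read from the scatter plot.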
Learning Competencies with Code:
Illustrate the nature of bivariate data. M11/12SP-IVg-2
Construct a scatter plot. M11/12SP-IVg-3
Describe shape(form), trend(direction), and variation(strength) based on a scatter plot.
M11/12SP-IVg-4
Calculates the Pearson’s sample correlation coefficient M11/12SP-IVh-2
Solves problems involving correlation analysis. M11/12SP-IVh-3
1. Annual income of the family and floor area of the residence house.
2. Age and price of a car
3. Gross national product and level of technology of a country.
4. Age and reaction time of persons over 18 years of age.
5. Yearly income and number of years of schooling of company owners.
Practice B. For each case, determine the two variables and tell whether the relationship is positive
or negative.
1. The more time is spent in studying his lessons, the higher is the average grade of Nelson.
2. If the population of foxes in the forest increases, the number of deer decreases.
3. The more students enroll in a school, the more teachers are needed.
4. As a person ages, his memory decreases.
5. The more workers are hired to paint the whole school, the sooner the job is done.
References
Belecina, Rene R., Baccay, Elisa S., Mateo, Efren B. Statistics and Probability (Quezon City:
Rex Bookstore, Inc., 2016), 282 – 301.
” Graphs of Positive Strong Correlation”. Accessed on October 21, 2020
https://www.google.com/search?q=graphs+of+strong+positive+correlation&rlz=1C1SQJL_enPH8
60PH860&tbm=isch&source=iu&ictx=1&fir=LIwS1APakpGuEM%252CrJ5QMzYjG9bQWM%252
C_&vet=1&usg=AI4_-kQXHx-
4kQ0CRpyd31UCLBR0BM3VZQ&sa=X&ved=2ahUKEwjI3YyrlsfsAhWXBIgKHebkAoQQ9QF6BA
gLEEc#imgrc=ZOGGiVoePb6EyM
Answer Key
Practice A
1. Positive, strong
2. Negative, strong
3. Positive, strong
4. No correlation
5. No correlation
Practice B
1. Positive
2. Negative
3. Positive
4. Negative
5. Negative
Practice C
Moderately negative correlation
Practice D
1. r = -0.63, moderately high negative correlation
2. r = 0.10, very low positive correlation
Practice E
ΣX = 34.5, ΣY = 115, ΣX² = 162.75, ΣY² = 1375, ΣXY = 361.50, n = 10
r = -0.74, moderately high negative correlation
Prepared by:
MAUREEN RHEA T. ATCHICO
Master Teacher I
Background Information
Regression analysis is used in statistics to find trends in data. For example, you might
guess that there’s a connection between your age and how much you weigh; regression analysis
can help you quantify that. Moreover, regression analysis is used to estimate or predict possible
values of the dependent variable given the value of the independent variable.
Dependent and Independent Variables
The dependent variable is the condition that you measure in an experiment. It is also called the
"responding variable" or "outcome variable". The dependent variable is denoted by "Y".
The independent variable is also called the "predictor variable" because it predicts or forecasts the
values of the dependent variable in the model. The independent variable is denoted by "X".
Example: Identify the dependent variable (Y) and the independent variable (X) in each of the
following situations.
1. A scientist conducts an experiment to test the theory that a vitamin could extend a person's
life expectancy.
Dependent variable (Y) = life span
Independent variable (X) = Amount of vitamin
2. You want to figure out which brand of microwave popcorn pops the most kernels so you can
get the most value for your money. You test different brands of popcorn to see which bag pops
the most popcorn kernels.
Dependent variable (Y) = Number of kernels popped
Independent variable (X) = Brand of popcorn bag
3. You want to determine whether how long a student sleeps affects his test scores.
Dependent variable (Y) = Test score
Independent variable (X) = length of time spent sleeping
Regression analysis is a set of statistical methods used to estimate the relationship
between a dependent variable and one or more independent variables. For example, it can predict
how much you will weigh in ten years if you continue to put on weight at the same rate.
Simple linear regression is a linear approach to modeling the relationship between one
dependent variable and one independent variable.
A regression line is the single line that best fits the data, in the sense of having the smallest overall
distance from the line to the points. It is also called the "line of best fit" or the
least-squares regression line.
Y = a + bX
where:
a is the y-intercept,
b is the slope of the line,
Y is the dependent variable, and
X is the independent variable.
The slope (b) of a line is the change in Y over the change in X.
The y-intercept (a) is the value of Y at which the line crosses the y-axis.
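To find the y-intercept (a) of the regression line, we use the formula
$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2}$$
To find the slope (b) of the regression line, we use the formula
$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$
where n is the number of paired observations.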
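To illustrate what "best fits" means, the sketch below assumes the standard least-squares formulas for the slope and intercept and measures the "overall distance" as the sum of squared vertical distances between the points and the line. The five data points and the function names fit_line and sum_squared_errors are made up for illustration and are not part of the activity sheet.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n  # equivalent to a = mean(y) - b * mean(x)
    return a, b

def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, chosen only to keep the arithmetic small.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
shifted = sum_squared_errors(a + 0.5, b, xs, ys)  # same slope, line moved up
tilted = sum_squared_errors(a, b + 0.2, xs, ys)   # same intercept, steeper line

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"fitted: {best:.2f}, shifted: {shifted:.2f}, tilted: {tilted:.2f}")
```

Moving the line up or making it steeper increases the sum of squared errors, which is the sense in which the fitted line is the line of best fit.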
Example 1. The table below shows the total number of hours that each of ten Grade 11 students spent
in a week reviewing Statistics and Probability, together with their exam scores. Find the equation
of the regression line.
Student                  1   2   3   4   5   6   7   8   9   10
Number of Review Hours   7   9   12  10  6   13  14  18  7   5
Scores in the Exam       33  36  41  53  42  78  82  88  29  28
Solution:
Step 1: Make a chart of data, filling in the columns in the same way as you would fill in the chart if
you were finding the Pearson’s Correlation Coefficient.
Student   x (review hours)   y (exam score)   xy     x²     y²
1         7                  33               231    49     1089
2         9                  36               324    81     1296
3         12                 41               492    144    1681
4         10                 53               530    100    2809
5         6                  42               252    36     1764
6         13                 78               1014   169    6084
7         14                 82               1148   196    6724
8         18                 88               1584   324    7744
9         7                  29               203    49     841
10        5                  28               140    25     784
Sum       Σx = 101           Σy = 510         Σxy = 5918   Σx² = 1173   Σy² = 30816
From the above table, Σx = 101, Σy = 510, Σxy = 5918, Σx² = 1173, Σy² = 30816, and n = 10 (sample
size).
Step 2. To find the slope (b) of the regression line, we use the formula
$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{(10)(5918) - (101)(510)}{(10)(1173) - (101)^2} = \frac{7670}{1529}$$
b ≈ 5.02
For every additional hour spent reviewing Statistics and Probability, the average score in the exam
increases by about 5.02 points.
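Step 3. To find the y-intercept (a) of the regression line, we use the formula
$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2} = \frac{(510)(1173) - (101)(5918)}{(10)(1173) - (101)^2} = \frac{512}{1529}$$
a ≈ 0.33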
Step 4. To find the equation of the regression line, substitute the computed values of the
slope (b ≈ 5.02) and the y-intercept (a ≈ 0.33) into
y = a + bx
y = 0.33 + 5.02x (equation of the regression line)
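The four steps can be checked with a short script. This is a minimal sketch that rebuilds the chart columns (xy, x², y²) from the Example 1 data and applies the sum-based formulas for b and a; the variable names are illustrative only.

```python
# Data from Example 1: weekly review hours (x) and exam scores (y).
review_hours = [7, 9, 12, 10, 6, 13, 14, 18, 7, 5]
exam_scores = [33, 36, 41, 53, 42, 78, 82, 88, 29, 28]

n = len(review_hours)

# Step 1: one chart row per student: x, y, xy, x^2, y^2.
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(review_hours, exam_scores)]
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(column) for column in zip(*rows))

# Steps 2 and 3: slope and y-intercept from the column sums.
denominator = n * sum_x2 - sum_x ** 2
b = (n * sum_xy - sum_x * sum_y) / denominator
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator

for student, row in enumerate(rows, start=1):
    print(student, *row)
print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)

# Step 4: equation of the regression line, rounded to two decimal places.
print(f"y = {a:.2f} + {b:.2f}x")  # y = 0.33 + 5.02x
```

The unrounded slope is 7670/1529 ≈ 5.0164, so the rounded value depends on how many decimal places are kept.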
[Figure: scatter plot "Review Hrs vs Exam Score", with Review Hours on the horizontal axis and Exam Score on the vertical axis, showing the data points and the regression line.]
Predicting the value of the dependent variable given the value of the independent variable
To predict the value of the dependent variable given the value of the independent variable,
we use the equation of the regression line.
y = a + bx
Let us use Example 1 above, where a = 0.33 and b = 5.02 (the slope 7670/1529 rounded to two
decimal places). What is the predicted exam score of an 11th student who spends 15 hours
reviewing Statistics and Probability?
Solution: Using the equation of the regression line, substitute the value of each variable.
y = a + bx
y = 0.33 + 5.02(15)
y = 0.33 + 75.30
y = 75.63 ≈ 76 (the score must be a whole number)
Therefore, the predicted score in the Statistics and Probability exam of the 11th student is 76 if
he reviews for 15 hours in a week.
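The prediction can also be computed without rounding the coefficients first. The sketch below assumes the column sums of Example 1 (Σx = 101, Σy = 510, Σxy = 5918, Σx² = 1173, n = 10); the helper name predict is hypothetical. Because it keeps the unrounded slope and intercept, its result can differ slightly from a hand computation that uses rounded coefficients.

```python
def predict(a, b, x):
    """Predicted y on the regression line y = a + b*x."""
    return a + b * x

# Column sums taken from the Example 1 chart.
n, sum_x, sum_y, sum_xy, sum_x2 = 10, 101, 510, 5918, 1173

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # unrounded slope
a = (sum_y - b * sum_x) / n                                   # unrounded intercept

predicted = predict(a, b, 15)
print(f"predicted score for 15 review hours: {predicted:.2f}")  # 75.58
```

Rounding only at the last step keeps rounding errors in a and b from being multiplied by x.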
Example 2. Given the following data: a = 12, b = 3, and x (age of the father in years) = 58. What is the
predicted height (in cm) of the father?
Solution:
y = a + bx
y = 12 + 3 ( 58)
y = 12 + 174
y = 186
Therefore, the predicted height of the father is 186 cm when he is 58 years old.
Activity 1. Identify the dependent variable (Y) and the independent variable (X) in each of the
following situations.
1. You want to compare brands of paper towels, to see which holds the most liquid.
Y = _____________________________________________
X = _____________________________________________
2. You are interested in whether a higher minimum wage impacts employment rate.
Y = _____________________________________________
X = _____________________________________________
3. A real-estate agent may want to predict the selling price of a house (in pesos) based on the
floor area (in m²) of the house.
Y = _____________________________________________
X = _____________________________________________
4. A study is done to determine if the weekly grocery bill changes based on the number of
family members.
Y = _____________________________________________
X = _____________________________________________
Activity 2.
Compute the slope (b) and the y-intercept (a) given the following information.
1. Σx = 194, Σy = 58, Σxy = 1969, Σx² = 8464, Σy² = 766, n = 5 ________ _______
2. Σx = 92, Σy = 507, Σxy = 6273, Σx² = 1262, Σy² = 32467, n = 9 ________ _______
3. Σx = 854, Σy = 816, Σxy = 69941, Σx² = 73346, Σy² = 66872, n = 10 ________ _______
Activity 3.
Find the predicted value of y for the given value of x.
1. y = 6.828x + 90.463, when x = 5 ____________________
2. y = 0.736x + 47.788, when x = 7 _____________________
3. y = 57.82x + 6.88, when x = 9.81 _____________________
4. y = 93x + 29.21, when x = 23.38 _____________________
Activity 4.
The table below shows the heights and weights of 10 randomly selected Grade 11 students in Angeles City.
Student   Height (meters)   Weight (kg)   XY      X²      Y²
          X                 Y
1         1.64              40
2         1.52              49
3         1.52              50
4         1.65              45
5         1.42              42
6         1.48              46
7         1.50              36
8         1.54              50
9         1.67              63
10        1.72              55
SUM       Σx =              Σy =          Σxy =   Σx² =   Σy² =
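1. Complete the table.
2. Find the slope (b) of the regression line.
3. Find the y-intercept (a) of the regression line.
4. Find the equation of the regression line.
5. What is the predicted weight of the 11th student if the height is 1.60 meters?
6. Graph the regression line.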
References:
Chan Shio,Christian Paul O, Reyes Maria Angeli T. Statistics and Probabilty for Senior High
School.(Quezon City: C & E Publishing Inc., 2017).
Department of Education. Most Essential Learning Competencies in Statistics and Probability: pp.
67- 68.
Department of Education. Statistics and Probability: Teaching Guide: pp. 385 – 389.
Dependent and Independent Variable. https://www.thoughtco.com/independent-and-dependent-
variable-examples-606828. Accessed on October 18,2020.
Interpreting the slope of the line.https://math.libretexts.org/Courses/De_Anza_College/Pre-
Interpreting Statistics/2%3A_Graphing_Points_and_Lines_in_Two_ Dimensions/2.6%3A_
Interpreting_the_Slope_of_a_Line Accessed on October 18,2020.
How to calculate regression line. https://www.dummies.com/education/math/statistics/how-to-
calculate-a-regression-line/. Accessed on October 18,2020.
Regression Analysis: Step by Step Articles, Videos, Simple Definitions.https://www.statistics
howto.com/probability-and-statistics/regression-analysis/ Accessed on October 18,2020.
Simple linear regression line. https://online.stat.psu.edu/stat462/node/101/ Accessed on October
18,2020.
Answer Key
Activity 1
1. Y = Amount of liquid absorbed by the paper towel.
   X = Brand of paper towel.
2. Y = Employment rate.
   X = Minimum wage.
3. Y = Selling price of a house (in pesos).
   X = Floor area of the house (in m²).
4. Y = Grocery bill.
   X = Number of family members.
Activity 2
1. b = -0.30, a = 23.25
2. b = 3.39, a = 21.67
3. b = 0.61, a = 29.13
Activity 3
1. 124.603
2. 52.94
3. 574.09
4. 2,203.55
Activity 4
1. [Completed table]
2. b = 39.84
3. a = -14.80
4. y = -14.80 + 39.84x
5. y = 48.94
6. [Figure: graph "Height vs Weight" showing the regression line; recovered axis label: "Height Mtr".]
Prepared by:
Vilma B. Panela
Master Teacher 1