
Statistics and Probability CM1

Introduction to Statistics: Modern statistics originates from two different interests and fields: political states and games of chance. Governments keep records and data on measures of land, population, number of workers, and even crop production. In games of chance, players use concepts from statistics to increase their ability to win.

Statistics is a process of making decisions based on results drawn from a collection of data. As a science, it refers to a group of methods used to collect, present, analyze, and interpret data to help in making decisions and conclusions.

Basic Elements of Statistics
∙ Data Collection – done through interviews, questionnaires, experiments, existing data, etc.
∙ Presentation – done through reports, graphs, figures, or tables
∙ Analysis – done using statistical tests and hypothesis testing
∙ Interpretation – explaining the findings and results to arrive at a conclusion
Statistical methods refer to the processes used in the collection, presentation, analysis, and interpretation of data.

Types of Statistics
∙ Descriptive statistics deals with methods of recording and collecting data, with the properties of various kinds of measures, and computing the said data. It gives a summarized description of the data gathered.
∙ Inferential statistics deals with inferences, forecasts, or conclusions about a whole set of data, drawn from analyzing and testing a subset of it.

Examples:
The following cases use descriptive statistics.
a. The average salary per month of employees in ABC Company is Php 23,200.
b. The teacher will announce the names of the students who got the top 5 scores in the Midterm Examination in Statistics and Probability.
c. 5 out of 8 housewives prefer detergent A to detergent B.
d. 95% of all the patients were cured by the newly created medicine for headache.
The following cases use inferential statistics.
a. Based on the data gathered over 5 years, the company's income will increase by 3% this coming year.
b. The driver wants to determine the average life span of a particular brand of tire.
c. Based on the number of typhoons that entered the Philippines in the past 10 years, PAGASA anticipates that there will be at least 30 typhoons this year.

Population – the collection of all elements that are being studied.
Sample – a portion or subset of the population that is being studied.
Example: The researcher wants to study the preferences of NU FV SHS students in terms of fast-food chains. To determine this, he asked 20 students from each class section. The population of the study is all NU FV SHS students, while the subset of 20 students per class section is the sample.
The number of elements in a population (sample) is called the population size (sample size).
Parameter – a description of a characteristic of a population.
Statistic – a description of a characteristic of a sample.
An element or member of a population (or sample) is a specific object or subject about which the data is taken, e.g. a person, place, category, country, etc.
A constant is a characteristic of an element that is fixed.
A variable is a characteristic of the element under investigation or study which can vary. A variable can be independent or dependent.
THE BOOK LOUNGE PH I 1


Independent variables are characteristics that do not affect each other, while dependent variables are characteristics such that when one changes, the other also changes.
Examples:
Independent Variables
- Basic Calculus grades and Filipino 1 grades
- Temperature outside and number of TikTok followers
Dependent Variables
- Contest ranking and cash prize
- Price of gasoline and price of basic needs in the market

A variable can be categorized as qualitative or quantitative. Quantitative variables are variables that can be measured numerically, while qualitative variables are variables that cannot be expressed using numbers.
Examples:
Quantitative Variables
- Room temperature
- Monthly salary
- Quiz scores
- Distance between two places
Qualitative Variables
- Civil status
- Nationality
- Skin color
- ML Rank

A quantitative variable can be classified as discrete or continuous. Discrete variables are variables whose values can be counted, usually in integer form. Continuous variables are variables that can take any value within an interval between two integral values.
Examples:
Discrete Variables
- Number of cars that travelled using NLEX
- Number of students who passed the National University Admission Test
- Number of people who have recovered from Covid-19
Continuous Variables
- Amount of water consumed by students in a day
- Weight of rice in kilograms eaten by a Filipino family in a month
- Temperature at sea level each day during the summer season

Levels of Measurement
The value of a variable or characteristic of an element is called an observation or measurement. There are four scales of measurement: nominal, ordinal, interval, and ratio.

Nominal Scale – observed data are classified into categories in which no ordering is implied.
Examples:
∙ Gender (Male/Female)
∙ Religious affiliation (Catholic, Muslim, Buddhist, etc.)
∙ Eye color (black, blue, green, etc.)

Ordinal Scale – observed data are classified into categories with ordering, in which the direction of the rank is specified but not the magnitude of differences in rank.
Examples:
∙ College Year Level (Freshmen, Sophomores, Juniors, and Seniors)
∙ Beauty Contest ranking (1st runner up, 2nd runner up, etc.)
∙ Rank of pieces in Game of the Generals (General, Private, etc.)

Interval Scale – observed data are classified into categories with ordering, direction of rank, and magnitude between ranks. Negative numbers are also considered.
Examples:
∙ Measurement of latitudes
∙ Height of mercury level in a barometer

Ratio Scale – an interval scale with an absolute zero, meaning a value of zero means absence of the characteristic being measured.
Examples:
∙ Speed or acceleration
∙ Salary

Statistics and Probability CM2
PROBABILITY DISTRIBUTIONS
Probability refers to the chance that a particular event will occur. Usually, whether something will happen is hard to predict, but by using probability concepts we can improve our chance of knowing. Probability is a very useful tool for properly understanding inferential statistics.
A researcher performs an experiment that generates a set of data. The set of all possible outcomes in a statistical experiment is called the sample space, denoted by S. A subset of a sample space is called an event.
Examples:
a. Tossing a die once. The sample space of the experiment is 𝑆 = {1, 2, 3, 4, 5, 6}
In this experiment, we have six possible outcomes when we toss a die once.
b. Flip a coin twice. The sample space of the experiment is 𝑆 = {(𝐻, 𝐻), (𝐻, 𝑇), (𝑇, 𝐻), (𝑇, 𝑇)}
Flipping a coin twice gives four possible outcomes.
Let A be an event from an experiment. We denote by 𝑃(𝐴) the probability that the event A will happen.

Classical Definition of Probability
Assume that a given experiment has N possible, equally likely outcomes in its sample space. If an event A being studied can occur in n out of the N different ways, then 𝑃(𝐴) = 𝑛/𝑁
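As a rough illustration of the classical definition, the sketch below counts favourable outcomes in the two sample spaces listed above. The function name `classical_probability` and the two events chosen (a die result below 3, at least one head in two flips) are assumptions made only for this example, and the approach applies only when all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """P(A) = n / N, assuming every outcome in the sample space is equally likely."""
    n = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(n, len(sample_space))


# Sample space for tossing a die once
die = [1, 2, 3, 4, 5, 6]
# Sample space for flipping a coin twice: (H,H), (H,T), (T,H), (T,T)
two_flips = list(product("HT", repeat=2))

# Hypothetical events, chosen only for illustration
below_three = {1, 2}
at_least_one_head = {o for o in two_flips if "H" in o}

p_below_three = classical_probability(below_three, die)
p_head = classical_probability(at_least_one_head, two_flips)
print(p_below_three, p_head)
```

Using `Fraction` keeps the result in the same n/N form as the definition instead of a rounded decimal.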
Properties of Probabilities
∙ 0 ≤ 𝑃(𝐴) ≤ 1 for any event A
∙ The complement of A, denoted by 𝐴c, is the event that A will not occur. Then, 𝑃(𝐴) + 𝑃(𝐴c) = 1
Examples: a. Rolling a fair die once, what is the probability of getting the following:
i. an event A that you will get a 4?
ii. an event B of getting an even number?
iii. an event C of rolling a number higher than 2?

Solutions: Rolling a fair die once gives six possible results, so 𝑁 = 6. The sample space of the experiment is 𝑆 = {1, 2, 3, 4, 5, 6}.
i. 𝐴 = {4}, so 𝑛 = 1 and 𝑃(𝐴) = 1/6
ii. 𝐵 = {2, 4, 6}, so 𝑛 = 3 and 𝑃(𝐵) = 3/6 = 1/2
iii. 𝐶 = {3, 4, 5, 6}, so 𝑛 = 4 and 𝑃(𝐶) = 4/6 = 2/3

Random Variables
- A random variable is a variable whose possible values are numerical outcomes of a random experiment.
- A function X which assigns a real number x to each possible outcome in the sample space is called a random variable.
- A random variable can be discrete or continuous.

Discrete Random Variable - a variable that takes a finite (countable) number of distinct values, usually expressed as whole numbers.
Continuous Random Variable - a variable that can assume an infinite number of values in an interval between two specific values. It is often the result of a measurement.

Examples:
1. A coin is tossed three times. Let the random variable 𝑋 represent the number of heads.
S1: Perform the experiment.
S2: Write the possible outcomes.
S3: Write the random variable. 𝑿 = {𝟎, 𝟏, 𝟐, 𝟑}

2. Two balls are drawn in succession without replacement from an urn containing 5 red balls and 6 blue balls. Let the random variable 𝑌 represent the number of blue balls. Find the possible values of the random variable 𝑌.
S1: Perform the experiment.
S2: Write the possible outcomes.
S3: Write the random variable. 𝒀 = {𝟎, 𝟏, 𝟐}

3. Two marbles are drawn at random in succession without replacement from a bag containing 2 red marbles, 1 blue marble, and 3 yellow marbles. Determine the random variable 𝑍 representing the number of yellow marbles drawn.
S1: Perform the experiment.
S2: Write the possible outcomes.
S3: Write the random variable. 𝒁 = {𝟎, 𝟏, 𝟐}

4. Two fair dice are thrown together. Let the random variable 𝐴 be the sum of the outcomes in throwing two fair dice together.
S1: Perform the experiment.
S2: Write the possible outcomes.
S3: Write the random variable. 𝑨 = {𝟐, 𝟑, 𝟒, 𝟓, 𝟔, 𝟕, 𝟖, 𝟗, 𝟏𝟎, 𝟏𝟏, 𝟏𝟐}

DISCRETE PROBABILITY DISTRIBUTION
A discrete probability distribution (DPD) is a distribution that lists all the possible values of a discrete random variable and their corresponding probabilities.

Examples: For each of the four experiments above, the fourth step constructs the DPD of the random variable.
1. A coin is tossed three times. 𝑋 is the number of heads.
S4: Construct the DPD.
2. Two balls are drawn in succession without replacement from an urn containing 5 red balls and 6 blue balls. 𝑌 is the number of blue balls.
S4: Construct the DPD.
3. Two marbles are drawn at random in succession without replacement from a bag containing 2 red marbles, 1 blue marble, and 3 yellow marbles. 𝑍 is the number of yellow marbles drawn.
S4: Construct the DPD.
4. Two fair dice are thrown together. 𝐴 is the sum of the outcomes.
S4: Construct the DPD.
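The four steps (perform the experiment, list the outcomes, write the random variable, construct the DPD) can be sketched in code for three of the experiments above. This is a minimal sketch that assumes every listed outcome is equally likely; the helper name `dpd` and the marble labels such as "Y1" are hypothetical names introduced for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def dpd(outcomes, random_variable):
    """Map each value of the random variable to its probability,
    assuming all outcomes in the list are equally likely."""
    counts = Counter(random_variable(o) for o in outcomes)
    total = len(outcomes)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


# 1. A coin tossed three times; X = number of heads
tosses = list(product("HT", repeat=3))
dpd_x = dpd(tosses, lambda o: o.count("H"))

# 3. Two marbles drawn without replacement; Z = number of yellow marbles
bag = ["R1", "R2", "B1", "Y1", "Y2", "Y3"]  # hypothetical labels
draws = list(permutations(bag, 2))          # ordered draws, no replacement
dpd_z = dpd(draws, lambda o: sum(m.startswith("Y") for m in o))

# 4. Two fair dice; A = sum of the two faces
rolls = list(product(range(1, 7), repeat=2))
dpd_a = dpd(rolls, sum)

for name, table in (("X", dpd_x), ("Z", dpd_z), ("A", dpd_a)):
    print(name, {value: str(p) for value, p in table.items()})
```

The keys of each dictionary are the values of the random variable found in S3, and the probabilities in each table add up to 1.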
PROPERTIES OF DPD
1. The probability of each value of the random variable must be between 0 and 1, inclusive.
𝟎 ≤ 𝑷(𝒙) ≤ 𝟏
2. The sum of the probabilities of all values of a random variable must be equal to 1.
∑ 𝑷(𝒙) = 𝟏

Examples:
∙ It is a probability distribution since all probabilities are between 0 and 1. Also, the sum of all probabilities is 1.
∙ It is not a probability distribution since the sum of all probabilities is not equal to 1.
1/3 + 1/3 + 1/3 + 1/3 = 4/3 ≠ 1
∙ It is not a probability distribution since there exists a probability that is not between 0 and 1.
𝑃(4) = 1.2 > 1

Statistics and Probability CM3
MEAN, VARIANCE, AND STANDARD DEVIATION OF A DISCRETE RANDOM VARIABLE

Expected Value or Mean
A probability distribution shows all the possible outcomes and their respective probabilities. From it, we can determine the value we expect on average over many repetitions of the experiment, with each possible outcome weighted by its probability. It is called the expected value or mean of x. It is denoted as 𝑬(𝒙) or 𝝁.

Given the values 𝑥𝑖 of a discrete random variable and their respective probabilities 𝑃(𝑥 = 𝑥𝑖), or simply 𝑃(𝑥𝑖), the expected value or mean of 𝑥 is given by
𝐸(𝑥) = 𝜇 = ∑ [𝑥𝑖 · 𝑃(𝑥𝑖)]

Examples:
I. Compute the expected value of the random variable x.
a.
b.
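The two properties of a DPD and the expected-value formula can be checked with a short sketch. The function names are assumptions for this example. The first table is the three-coin-toss distribution, and the other two mirror the failing cases quoted above, with x-values and the remaining probabilities chosen hypothetically because the original tables are not reproduced here.

```python
import math
from fractions import Fraction


def is_probability_distribution(dist):
    """Check 0 <= P(x) <= 1 for every value and that the probabilities sum to 1."""
    probs = list(dist.values())
    in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = math.isclose(float(sum(probs)), 1.0)
    return in_range and sums_to_one


def expected_value(dist):
    """E(x) = sum of x_i * P(x_i)."""
    return sum(x * p for x, p in dist.items())


# Number of heads in three coin tosses
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
# Four probabilities of 1/3 each (x-values are hypothetical): sum is 4/3
thirds = {x: Fraction(1, 3) for x in (1, 2, 3, 4)}
# P(4) = 1.2 as in the text; the other probabilities are hypothetical
too_big = {1: 0.1, 2: 0.1, 3: 0.1, 4: 1.2}

print(is_probability_distribution(heads), expected_value(heads))
print(is_probability_distribution(thirds), is_probability_distribution(too_big))
```

An expected value such as 3/2 heads need not be a value the random variable can actually take; it is a long-run average.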



II. Solve the following problems.
a. Two marbles are drawn at random in succession without replacement from a bag containing 2 red marbles, 1 blue marble, and 3 yellow marbles. What is the expected number of yellow marbles in one draw?
Solution: The probability distribution of the number of yellow marbles in a draw was constructed in Course Material 2. With 𝑃(0) = 1/5, 𝑃(1) = 3/5, and 𝑃(2) = 1/5,
𝐸(𝑥) = 0(1/5) + 1(3/5) + 2(1/5) = 1
Hence, the expected value of the number of yellow marbles in a draw is one.

b. Tom and Jerry are playing a game. Tom will throw two fair dice together. Then, he asks Jerry for the sum of the two resulting numbers on the dice. If Jerry gets it correctly, Tom will treat him to a burger. If Jerry has to guess the sum, what answer will give him the highest chance of winning?
Solution: The probability distribution is the DPD of the sum of two fair dice constructed in Course Material 2.

The expected value or mean will not necessarily give you the exact value that will result in a given case or situation. It is the average result over many repetitions, which makes it a reasonable choice when you need to choose only one result. For a symmetric distribution such as this one, it is also the result with the highest probability.

*The computed expected value is 6.52. Since the possible outcomes are counting numbers from 2 to 12 only, we round off the result, giving us 7. (With the exact probabilities, expressed in thirty-sixths, the expected value is exactly 7, which is also the most likely sum.)

Variance and Standard Deviation
The expected value or mean provides a measure of central tendency, meaning it shows the central location of all possible outcomes.
Other than the mean, we also want to measure the variability of the outcomes, or the dispersion of the points. This is measured by the variance of the probability distribution, denoted as 𝜎². The standard deviation is the square root of the variance, denoted as 𝜎.
The formulas for the variance and standard deviation are given by the following:
𝑉(𝑥) = 𝜎² = ∑ [(𝑥𝑖 − 𝜇)² · 𝑃(𝑥𝑖)]
𝑆𝐷(𝑥) = 𝜎 = √𝜎²

Examples: Determine the variance and standard deviation of the following discrete random variable x.
a.
Note that the expected value of item a is already known from the previous example. 𝐸(𝑥) = 𝜇 = 2
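The variance and standard deviation formulas can be sketched as follows. The function names are assumptions for this example. The first table is the yellow-marble distribution from problem a. The second is an inferred table: the values 2, 4, 6, 8, 10 with probabilities 0.10, 0.20, 0.20, 0.25, 0.25 reproduce the mean 6.7 and the summed terms quoted for item b, but the original table is not shown, so treat it as a reconstruction rather than the source data.

```python
import math


def mean(dist):
    """E(x) = sum of x_i * P(x_i)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(x) = sum of (x_i - mu)^2 * P(x_i)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def standard_deviation(dist):
    """SD(x) = square root of the variance."""
    return math.sqrt(variance(dist))


# Number of yellow marbles in two draws (problem a)
yellow = {0: 0.2, 1: 0.6, 2: 0.2}
# Inferred table for item b (assumption, see the note above)
item_b = {2: 0.10, 4: 0.20, 6: 0.20, 8: 0.25, 10: 0.25}

for name, dist in (("yellow", yellow), ("item b", item_b)):
    print(name, round(mean(dist), 3), round(variance(dist), 3),
          round(standard_deviation(dist), 3))
```

Summing unrounded terms gives a variance of about 6.910 for the inferred table; the 6.911 in the text comes from rounding each term to three decimals before adding.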



b.
The expected value of item b is given by 𝐸(𝑥) = 𝜇 = 6.7
𝑉(𝑥) = 2.209 + 1.458 + 0.098 + 0.423 + 2.723
𝑉(𝑥) = 𝜎² = 6.911
Then, 𝑆𝐷(𝑥) = 𝜎 = 2.629

Application of Mean, Variance and Standard Deviation
Solve the following problems.
a. The following are the probability distributions of teaching satisfaction scores of teachers in Senior High School and College levels. The scores range from 1 (very dissatisfied) to 5 (very satisfied).
i. What is the expected value of a teaching satisfaction score for Senior High School teachers?
ii. What is the expected value of a teaching satisfaction score for College teachers?
iii. Compute the variance of teaching satisfaction scores for Senior High School and College teachers.
iv. Compute the standard deviation of both probability distributions.
Solutions: To solve for the mean, variance, and standard deviation of the teaching satisfaction scores of Senior High School and College teachers, we can use the tabular solution.
For teaching satisfaction scores of Senior High School teachers:
For teaching satisfaction scores of College teachers:
Summary of Results

II. Problem Solving
1. A group of researchers interviewed a random sample of 1000 SHS students. They were asked how many cups of coffee they drink on an average day. The table below shows the result of the survey.
a. Construct a probability distribution for the number of cups of coffee per day. (Hint: the variable 𝑥 is the number of cups, and the probability is the number of students who drink that number of cups over the total number of students interviewed.)
b. Compute the expected value of 𝑥.
c. Compute the variance and the standard deviation of 𝑥.
2. The Character Formation Coordinator is conducting online consultations for students experiencing problems involving social and mental health. Some students consult once every week, some twice, others thrice or more every week. The table shows the probability of how many times a student will consult in a given week.



a. What is the probability that a student consults more than once?
b. What is the expected number of times a student consults the Character Formation Coordinator?
c. Compute the variance and standard deviation of the given data.

Statistics and Probability CM4
NORMAL DISTRIBUTION AND NORMAL CURVE

Normal Distribution
The probability distributions discussed in Course Material 3 are discrete probability distributions. In this course material, our focus is on a continuous probability distribution. There are many ways of describing a continuous probability distribution, but the most common and most important of them is the normal probability distribution, or simply normal distribution.
The mathematical equation of a normal distribution depends on the mean 𝜇 and the standard deviation 𝜎.
𝑓(𝑥) = [1 / (𝜎√(2𝜋))] · 𝑒^(−(𝑥 − 𝜇)² / (2𝜎²))

Properties of a Normal Distribution
a. The graph of a normal distribution is a bell-shaped curve that is asymptotic to the horizontal axis and extends indefinitely in both directions. It is called the normal curve.
b. The highest point on the normal curve is located at the mean of the distribution.
c. The normal curve is symmetric with respect to the vertical line passing through the mean.
d. The mean, median and mode of a normal distribution are equal.
e. The area under the normal curve and above the horizontal axis is equal to 1.
f. The standard deviation determines how flat and wide the normal curve is. The higher the standard deviation, the higher the variability of the data, which gives a wider and flatter normal curve.
g. The horizontal axis represents a real-number line where the middle value is the mean, numbers lower than the mean are on the left of the mean, and the higher numbers are on the right. This is usually denoted as the 𝑥-scale.

Standard Normal Distribution
A normal distribution where 𝜇 = 0 and 𝜎 = 1 is called a standard normal distribution. Any normal distribution can be represented by a standard normal distribution by changing the units of measurement used in the distribution. This converts the original 𝑥-scale into 𝑧-scores using the formula below:
𝑧 = (𝑥 − 𝜇) / 𝜎

Examples: For a normally distributed population with mean equal to 200 and standard deviation equal to 20, find the standardized z-value of each of the following.
a. 𝑥 = 190: 𝑧 = (190 − 200)/20 = −0.5
b. 𝑥 = 240: 𝑧 = (240 − 200)/20 = 2
c. 𝑥 = 200: 𝑧 = (200 − 200)/20 = 0
d. 𝑥 = 225: 𝑧 = (225 − 200)/20 = 1.25

The scores on the Statistics exam of a class were normally distributed, and the z-scores for some students are: Xia 0.45, Riel 1.5, Hanna 1.25, Sam −1.3, Rachel −0.65, and Benedict 0.
a. Which of the students scored above the mean score?
b. Which of the students scored below the mean score?
c. Which of the students scored on the mean?
d. If the mean score was 𝜇 = 85, with a standard deviation of 𝜎 = 5, what was the final score of each student?

Solutions: Note that in a standard normal distribution, the mean is equal to 0. So, their actual scores depend on the position of their respective z-scores with respect to 0.
a. Students who got a z-score greater than zero (a positive z-score) are those who scored above the mean score. They are Xia (0.45), Riel (1.5), and Hanna (1.25).
b. Students who got a negative z-score are those who got a score less than the mean score. They are Sam (−1.3) and Rachel (−0.65).
c. A student who got a z-score equal to 0 got a score equal to the mean score. That student is Benedict.
d. To solve for the original score of each student, or the x-value, we first need to derive the formula to convert z-scores to x-values.
𝑥 = 𝜇 + 𝑧𝜎
Convert the following z-scores to their original scores (x-values), using 𝜇 = 85 and 𝜎 = 5: Xia 87.25, Riel 92.5, Hanna 91.25, Sam 78.5, Rachel 81.75, and Benedict 85.

Area under the Normal Curve
The expression 𝑃(𝑥1 < 𝑋 < 𝑥2) represents the area under the normal curve, above the horizontal axis, and between the vertical lines through the points 𝑥1 and 𝑥2. To solve for the area, we are going to use a table of values included in the Appendix of this Course Material. This table gives the area under the normal curve in terms of z-scores. So, in solving for the area under the normal curve, we need to convert x-values to z-scores first. Then 𝑃(𝑥1 < 𝑋 < 𝑥2) becomes 𝑃(𝑧1 < 𝑍 < 𝑧2), where 𝑧1 and 𝑧2 are the z-score equivalents of 𝑥1 and 𝑥2, respectively.

How to use the table for Areas under the Normal Curve
The area given in the table is the area represented by 𝑃(0 < 𝑍 < 𝑧), written here simply as 𝑃(𝑍 = 𝑧). It is the area under the normal curve, above the horizontal axis, and between the vertical lines through 0 and z.
Examples: Find the following areas under the normal curve.
a. 𝑃(𝑍 = 1.25) The area represented by 𝑃(𝑍 = 1.25) is the area under the normal curve, above the horizontal axis, and between the vertical lines through 0 and 1.25, i.e. 𝑃(0 < 𝑍 < 1.25). The leftmost part of the table indicates the ones and tenths digits of the z-score under consideration. In the example, it is 1.2. The uppermost part of the table indicates the hundredths digit of the z-score. In the example, it is 0.05. Trace a horizontal line through 1.2 and a vertical line through 0.05. The value at the intersection gives you the area represented by 𝑃(𝑍 = 1.25).
Hence, 𝑃(𝑍 = 1.25) = 0.3944.
b. 𝑃(𝑍 = 0.07) The z-score under consideration is 𝑧 = 0.07. It is the area represented by 𝑃(0 < 𝑍 < 0.07). Look for 0.0 on the leftmost part and 0.07 on the uppermost part.
Hence, 𝑃(𝑍 = 0.07) = 0.0279.

Properties of Areas under the Normal Curve
∙ The total area under the normal curve and above the horizontal axis is 1.
∙ The area on the left side of the vertical line passing through the mean is 0.5, i.e. 𝑃(𝑍 < 0) = 0.5.
∙ The area on the right side of the vertical line passing through the mean is 0.5, i.e. 𝑃(𝑍 > 0) = 0.5.
Examples: Sketch and compute the area of the following.
a. 𝑷(𝒁 = −𝟐.𝟒𝟐)
Solution: The area represented by 𝑃(𝑍 = −2.42) is the area 𝑃(−2.42 < 𝑍 < 0).
Since the normal curve is symmetric about the vertical line passing through the mean, the area between the mean and z on the right side is equal to the area between the mean and −𝑧 on the left side: 𝑃(𝑍 = 𝑧) = 𝑃(𝑍 = −𝑧). So, we have 𝑃(𝑍 = −2.42) = 𝑃(𝑍 = 2.42) = 0.4922.
b. 𝑷(𝒁 < 𝟏.𝟑𝟏)
This is the whole left half plus the area from 0 to 1.31: 0.5 + 0.4049 = 0.9049.
c. 𝑷(𝒁 > −𝟏.𝟎𝟕)
This is the area from −1.07 to 0 plus the whole right half: 0.3577 + 0.5 = 0.8577.

Applications of Areas under the Normal Curve
Examples: I. Given a normal distribution with 𝜇 = 150 and 𝜎 = 25, find the following:
a. the area above 𝑥 = 125
b. the area below 𝑥 = 187
c. the area between 𝑥 = 110 and 𝑥 = 215
d. the percentage of population with score below 167
Solutions:
a. the area above 𝒙 = 𝟏𝟐𝟓
First, convert the x-value to a z-score: 𝑧 = (125 − 150)/25 = −1. The area above it is 0.3413 + 0.5 = 0.8413.
b. the area below 𝒙 = 𝟏𝟖𝟕
Convert 𝑥 = 187 into a z-score: 𝑧 = (187 − 150)/25 = 1.48. The area below it is 0.5 + 0.4306 = 0.9306.
c. the area between 𝒙 = 𝟏𝟏𝟎 and 𝒙 = 𝟐𝟏𝟓
Converting the respective x-values, we have 𝑧1 = (110 − 150)/25 = −1.6 and 𝑧2 = (215 − 150)/25 = 2.6. The area between them is 0.4452 + 0.4953 = 0.9405.
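The table lookups in these examples can be cross-checked without the printed table. The sketch below uses the standard identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal cumulative area; the function names are assumptions made for this example, and the results agree with the table only up to its four-decimal rounding.

```python
import math


def z_score(x, mu, sigma):
    """Convert an x-value to a z-score: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def area_below(z):
    """P(Z < z) for the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def area_zero_to(z):
    """P(0 < Z < |z|), the quantity the text's table lists for each z."""
    return area_below(abs(z)) - 0.5


mu, sigma = 150, 25
above_125 = 1 - area_below(z_score(125, mu, sigma))
below_187 = area_below(z_score(187, mu, sigma))
between_110_215 = area_below(z_score(215, mu, sigma)) - area_below(z_score(110, mu, sigma))

print(round(area_zero_to(1.25), 4), round(area_zero_to(-2.42), 4))
print(round(above_125, 4), round(below_187, 4), round(between_110_215, 4))
```

Working with the cumulative area `area_below` avoids having to decide case by case whether to add or subtract 0.5, which is where most table-lookup mistakes occur.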



d. the percentage of population with score below 167
Convert the x-value to a z-score: 𝑧 = (167 − 150)/25 = 0.68. The area below it is 0.5 + 0.2517 = 0.7517.
The area under the normal curve represents the percentage of the whole population. The total area under the normal curve is equal to 1, which is equivalent to 100%. So, if the area under consideration is 0.7517, then it is equivalent to 75.17%. Hence, 75.17% of the population scored below 167.

Statistics and Probability CM5
HYPOTHESIS TESTING - INTRODUCTION
Hypothesis testing was introduced by Sir Ronald Fisher, Jerzy Neyman, Karl Pearson, and Egon Pearson, who is the son of Karl Pearson. Hypothesis testing is a statistical method that is applied to experimental data to make statistical decisions.

The first step in hypothesis testing is to have a statistical hypothesis. A statistical hypothesis is an assumption, guess, or prediction made by the researcher about the population parameter. This conjecture is based on the sample statistic gathered. The goal of hypothesis testing is to check whether the statistical hypothesis is true or not. Is there enough evidence to support the conjecture made by the researcher?

Two Types of Statistical Hypothesis
1. Null Hypothesis - a statistical hypothesis that assumes that no significant difference or effect exists between two or more population parameters. It is denoted as 𝐻0. It is a claim denoting "absence" of relationship, difference, or effects.
2. Alternative Hypothesis - a statistical hypothesis opposite to the null hypothesis. It is the hypothesis that the researcher wants to prove. It is denoted as 𝐻1. In stating the alternative hypothesis, the sign used is based on the conjecture of the researcher. It is a claim denoting "presence" of relationship, difference, or effects.

Level of Significance
The level of significance refers to the degree of significance at which we accept or reject the null hypothesis. In hypothesis testing, 100% accuracy is not possible for accepting or rejecting the null hypothesis. We therefore select a maximum probability of committing an error (rejecting a null hypothesis that is actually true). This is the level of significance, which is usually 1% or 5%. This probability is symbolized by 𝛼.
The level of significance represents the area under the normal curve called the critical region (or rejection region). Critical regions are regions at the end parts of the normal curve. This is the range of values of the test value that indicates that there is a significant difference and that the null hypothesis should be rejected. However, if the test value falls in the non-critical region (non-rejection region), then the null hypothesis should not be rejected.
The level of significance is also used to select a critical value from the table of areas under the normal curve. The critical value separates the critical region from the non-critical region.



Relationship between the Alternative Hypothesis and the Tails of the Hypothesis Test
In selecting what test to use, look at 𝐻1, with k representing a specific value.
∙ 𝐻1: 𝜇 ≠ 𝑘 – two-tailed test
∙ 𝐻1: 𝜇 > 𝑘 – right-tailed test
∙ 𝐻1: 𝜇 < 𝑘 – left-tailed test

Common Phrases in Hypothesis Testing
The table below shows common phrases used in hypothesis testing and the corresponding sign to be used for the alternative hypothesis.

Example: The level of significance 𝛼 represents the area of the critical region. If we have a two-tailed test with 𝛼 = 0.05, then the critical region is split between the two tails of the normal curve, with an area of 0.025 in each tail.
The total area of the critical region is equal to the level of significance used in the hypothesis testing.

Two types of error
A Type 1 error is committed when we reject the null hypothesis even though it is true. On the other hand, a Type 2 error is committed when we do not reject the null hypothesis (accept 𝐻0) even though it is false.

Steps in Conducting a Hypothesis Testing
1. State the null and alternative hypothesis.
2. Determine if the given problem is a two-tailed test or one-tailed test (right-tailed or left-tailed).
3. Choose the level of significance 𝛼.
4. Determine the critical value.
5. Solve for the computed value (test value).
6. Compare the critical value and the computed value.
7. Make a statistical decision.
8. State the conclusion.
The critical value is based on the level of significance and the type of test to be used.

One Sample Z-test - the one sample z-test is a statistical test for the mean. It is used when the sample size 𝑛 ≥ 30, or when the population is normally distributed and the population standard deviation is known.
Formula for the computed value:
𝑧 = (x̄ − 𝜇) / (𝜎 / √𝑛)
where x̄ is the sample mean, 𝜇 is the hypothesized population mean, 𝜎 is the population standard deviation, and 𝑛 is the sample size.
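Steps 4 to 7 of a one-sample z-test can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed procedure: the function names are invented for the example, the critical value is taken from the standard normal inverse CDF in Python's `statistics.NormalDist` instead of a printed table, and the numbers plugged in are those of the College Deans salary example in this course material (x̄ = 65,700, 𝜇 = 63,000, 𝜎 = 5,250, 𝑛 = 35, 𝛼 = 0.01, right-tailed).

```python
import math
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical value for a 'right', 'left', or 'two' tailed z-test.
    For a two-tailed test the positive value is returned (use it as +/-)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'right', 'left', or 'two'")


def one_sample_z(sample_mean, mu0, sigma, n):
    """Computed value: z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))


def reject_null(z, alpha, tail):
    """True when the computed value falls in the critical region."""
    crit = z_critical(alpha, tail)
    if tail == "right":
        return z >= crit
    if tail == "left":
        return z <= crit
    return abs(z) >= crit


z_deans = one_sample_z(65700, 63000, 5250, 35)
crit_deans = z_critical(0.01, "right")
decision_deans = reject_null(z_deans, 0.01, "right")
print(round(z_deans, 2), round(crit_deans, 3), decision_deans)
```

Comparing against the critical value from `inv_cdf` gives the same decision as the table as long as the table value was read for the same 𝛼 and the same number of tails.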



Assumptions in One Sample z-test
1. Subjects are randomly selected.
2. Population is normally distributed.
3. Cases of the samples should be independent of each other.
4. Sample size should be greater than or equal to 30.

The table below gives the critical values (𝑧𝑐𝑟𝑖𝑡) for the one sample z-test.

Compare the critical value and the computed value to determine the statistical decision.
DECISION FOR ONE SAMPLE Z TEST
LAGER: LESS THAN – ACCEPT; GREATER THAN OR EQUAL – REJECT
That is, if the absolute value of the computed value is less than the absolute value of the critical value, do not reject (accept) the null hypothesis. If it is greater than or equal, reject the null hypothesis.

Examples:
1. A researcher reports that the average salary of College Deans is more than Php 63,000 a month. A sample of 35 College Deans has a mean salary of Php 65,700. At 𝛼 = 0.01, test the claim that the College Deans earn more than Php 63,000 a month. The standard deviation of the population is Php 5,250.
Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher. 𝐻0: 𝜇 = 63,000 and 𝐻1: 𝜇 > 63,000 (claim)
Step 2: Determine the test to be used. It is a right-tailed test.
Step 3: Choose the level of significance. 𝛼 = 0.01
Step 4: Determine the critical value. 𝑧𝑐𝑟𝑖𝑡 = 2.326
Step 5: Solve for the computed value. 𝑧 = (65,700 − 63,000) / (5,250 / √35) ≈ 3.04
Step 6: Compare the critical value and the computed value. Since 3.04 > 2.326, the computed value falls in the critical region.
Step 7: Make a statistical decision. Reject the null hypothesis.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, there is enough evidence to support the claim that the College Deans earn more than Php 63,000 a month.
The graphical representation of the critical values and the computed values in a normal curve is another way to know the statistical decision of the given problem.

2. A local newspaper reported that SHS students watch less television than the general public in terms of the mean number of hours per week. The national mean is 29.4 hours per week with a standard deviation of 2 hours, while the average of the 30 SHS students is 27 hours. Is there enough evidence to support the report of the local newspaper at 𝛼 = 0.05?
Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher. 𝐻0: 𝜇 = 29.4 and 𝐻1: 𝜇 < 29.4 (claim)
Step 2: Determine the test to be used. It is a left-tailed test.
Step 3: Choose the level of significance. 𝛼 = 0.05
Step 4: Determine the critical value. 𝑧𝑐𝑟𝑖𝑡 = −1.645
Step 5: Solve for the computed value. 𝑧 = (27 − 29.4) / (2 / √30) ≈ −6.57



Step 6: Compare the critical value and the computed value. Since −6.57 < −1.645, the computed value falls in the critical region.
Step 7: Make a statistical decision. Reject the null hypothesis.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, there is enough evidence to support the report of the local newspaper that SHS students watch less television than the general public in terms of the mean number of hours per week.

3. The average social gathering in the National Capital Region includes 50 guests. A researcher surveys a random sample of 32 social gatherings during the past year with a mean of 53 guests and a standard deviation of 10. Is there sufficient evidence at the 0.05 level of significance that the average number of guests differs from the national average?
Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher. 𝐻0: 𝜇 = 50 and 𝐻1: 𝜇 ≠ 50 (claim)
Step 2: Determine the test to be used. It is a two-tailed test.
Step 3: Choose the level of significance. 𝛼 = 0.05
Step 4: Determine the critical value. 𝑧𝑐𝑟𝑖𝑡 = ±1.960
Step 5: Solve for the computed value. 𝑧 = (53 − 50) / (10 / √32) ≈ 1.70
Step 6: Compare the critical value and the computed value. Since −1.960 < 1.70 < 1.960, the computed value falls in the non-critical region.
Step 7: Make a statistical decision. Do not reject the null hypothesis.
Step 8: State the conclusion. Since we do not reject the null hypothesis, we cannot accept the alternative hypothesis. Hence, there is not enough evidence to support the claim of the researcher.

Statistics and Probability CM6
HYPOTHESIS TESTING: ONE-SAMPLE T-TEST

The one sample z-test is a statistical tool used to test whether there is a significant difference between the sample mean and the hypothesized population mean. The sample size should be 30 or larger. If the sample size is less than 30, what test should we use to test the significance of the mean? In that case we use the one sample t-test.
We collect a random sample from the population and then compare the sample mean with the hypothesized population mean to make a statistical decision as to whether the two are significantly different or not.

Assumptions in One Sample t-test
1. Subjects are randomly selected.
2. Population is normally distributed.
3. Cases of the samples should be independent of each other.
4. Sample size should be less than 30.

Steps in Conducting a One Sample t-test
1. State the null and alternative hypothesis.
2. Determine if the given problem is a two-tailed test or one-tailed test (right-tailed or left-tailed).
3. Set the level of significance 𝛼. The common levels of significance are 0.05, 0.10, and 0.01.
4. Compute the degrees of freedom, 𝑑𝑓 = 𝑛 − 1, where 𝑛 is the sample size. Then, determine the critical value.
5. Solve for the computed value (test value). The formula for the computed value is
𝑡 = (x̄ − 𝜇) / (𝑠 / √𝑛)
where x̄ is the sample mean, 𝜇 is the hypothesized population mean, 𝑠 is the sample standard deviation, and 𝑛 is the sample size.



6. Compare the critical value and the computed value to determine the statistical decision.
7. Make a statistical decision.
8. State the conclusion.

Examples:
a. A survey by the Office of the Mayor of Quezon City finds that the average commute time from Quezon City Memorial Circle to SM Fairview is 35 minutes. A group of employees of SM Fairview thinks that the commute time is greater and wants to prove it. They randomly selected 28 employees and found that the average commute time is 43 minutes with a standard deviation of 5 minutes. At 𝛼 = 0.05, are they correct?
Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher. 𝐻0: 𝜇 = 35 and 𝐻1: 𝜇 > 35 (claim)
Step 2: Determine the test to be used. It is a right-tailed test since 𝐻1 uses the symbol ">".
Step 3: Set the level of significance. 𝛼 = 0.05
Step 4: Compute the degrees of freedom. 𝑑𝑓 = 28 − 1 = 27. Then, determine the critical value using the table. 𝑡𝑐𝑟𝑖𝑡 = 1.703
Step 5: Solve for the computed value. 𝑡 = (43 − 35) / (5 / √28) ≈ 8.47
Step 6: Compare the critical value and the computed value. Since 8.47 > 1.703, the computed value falls in the critical region.
Step 7: Make a statistical decision. Reject the null hypothesis.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, the researchers are correct that the average commute time is greater than 35 minutes.

b. The average cost of tuition fee for Pre-School was Php 12,500. A group of 25 schools had a mean of Php 13,100 and a standard deviation of Php 3,550. Is there sufficient evidence to say that the cost of tuition fee is different from the national mean? Use 𝛼 = 0.01.
Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher. 𝐻0: 𝜇 = 12,500 and 𝐻1: 𝜇 ≠ 12,500 (claim)
Step 2: Determine the test to be used. It is a two-tailed test since 𝐻1 uses the symbol "≠".
Step 3: Set the level of significance. 𝛼 = 0.01
Step 4: Compute the degrees of freedom. 𝑑𝑓 = 25 − 1 = 24. Then, determine the critical value using the table. 𝑡𝑐𝑟𝑖𝑡 = ±2.797
Step 5: Solve for the computed value. 𝑡 = (13,100 − 12,500) / (3,550 / √25) ≈ 0.85
Step 6: Compare the critical value and the computed value. Since −2.797 < 0.85 < 2.797, the computed value falls in the non-critical region.
Step 7: Make a statistical decision. Do not reject the null hypothesis.
Step 8: State the conclusion. Since we do not reject the null hypothesis, we cannot accept the alternative hypothesis. Hence, there is not enough evidence to support the claim.

One Sample t-test (the sample mean and sample


Step 6: Compare the critical value and the computed deviation is unknown)
There are some problems that the sample mean and
value. sample standard deviation is unknown and need to be
Step 7: Make a statistical decision. Reject the null computed. To solve the sample, mean and sample
hypothesis. standard deviation, we use the following formulas:

Step 8: State the conclusion.


Since we reject the null hypothesis, we accept the
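The one-sample t computations in this section can be checked with a short script. This is only a sketch for illustration, not part of the module: the function names `t_statistic` and `t_from_data` are made up for this example. The summary figures come from the commute-time and tuition-fee examples, and the raw data are the eight family sizes from the family-size example in this section. Critical values are still read from the t-table.

```python
import math
import statistics


def t_statistic(sample_mean, mu, s, n):
    """One-sample test value: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - mu) / (s / math.sqrt(n))


def t_from_data(data, mu):
    """Compute the sample mean, sample standard deviation, and t from raw data."""
    n = len(data)
    mean = statistics.mean(data)   # sum of x divided by n
    s = statistics.stdev(data)     # divides by n - 1 (sample standard deviation)
    return mean, s, t_statistic(mean, mu, s, n)


# Commute-time example: n = 28, mean = 43, s = 5, hypothesized mean = 35
t_commute = t_statistic(43, 35, 5, 28)

# Tuition-fee example: n = 25, mean = 13,100, s = 3,550, hypothesized mean = 12,500
t_tuition = t_statistic(13100, 12500, 3550, 25)

# Family-size example: raw data, hypothesized mean = 4.5
family_sizes = [4, 6, 8, 3, 5, 7, 8, 4]
mean, s, t_family = t_from_data(family_sizes, 4.5)

print(f"commute: t = {t_commute:.3f}")
print(f"tuition: t = {t_tuition:.3f}")
print(f"family:  mean = {mean:.3f}, s = {s:.3f}, t = {t_family:.3f}")
```

The printed t values are then compared with the table values in each example (1.703, ±2.797, and ±3.499) to make the statistical decision.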
Examples: a. The average family size of a Filipino family was reported as 4.5. A researcher wants to prove that it is true. He conducts a survey of 8 randomly chosen families and the result is given by the following numbers: 4, 6, 8, 3, 5, 7, 8, and 4. Is there enough evidence to prove that the researcher is correct? Use 𝛼 = 0.01 as a level of significance.
Given: 𝜇 = 4.5, 𝑛 = 8

Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher.
𝐻0: 𝜇 = 4.5 (claim)
𝐻1: 𝜇 ≠ 4.5
Step 2: Determine the test to be used. It is a two-tailed test since 𝐻1 uses the symbol “≠”.
Step 3: Set the level of significance. 𝛼 = 0.01
Step 4: Compute for the degrees of freedom. 𝑑𝑓 = 8 − 1 = 7. Then, determine the critical value using the table. 𝑡𝑐𝑟𝑖𝑡 = ±3.499
Step 5: First, compute for the sample mean and sample standard deviation. We will use a table as part of the computation.
𝑋̅ = 45 / 8 = 5.625
Σ(𝑥 − 𝑋̅)² = 25.875
𝑠 = √(25.875 / 7) ≈ 1.923
Solve for the computed value.
𝑡𝑐𝑜𝑚𝑝 = (5.625 − 4.5) / (1.923 / √8) ≈ 1.655
Step 6: Compare the critical value and the computed value. The computed value 1.655 lies between −3.499 and 3.499.
Step 7: Make a statistical decision. Do not reject the null hypothesis.
Step 8: State the conclusion. Since we do not reject the null hypothesis, and the claim is the null hypothesis, there is not enough evidence to reject the claim that the average family size is 4.5.

b. For the past few years, it is said that the average temperature during December is 24.5°C. A group of researchers said that for this year, the temperature in December is higher. The temperature of 14 randomly chosen days in December is listed in the table below.

Test if the researchers have enough evidence for the claim at 𝛼 = 0.05.

Step 1: State the null and alternative hypothesis. Also, identify the claim of the researcher.
𝐻0: 𝜇 = 24.5
𝐻1: 𝜇 > 24.5 (claim)
Step 2: Determine the test to be used. It is a right-tailed test since 𝐻1 uses the symbol “>”.
Step 3: Set the level of significance. 𝛼 = 0.05
Step 4: Compute for the degrees of freedom. 𝑑𝑓 = 14 − 1 = 13. Then, determine the critical value using the table. 𝑡𝑐𝑟𝑖𝑡 = 1.771
Step 5: First, compute for the sample mean and sample standard deviation. We will use a table as part of the computation.
Solve for the computed value.

Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision. Reject the null hypothesis.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, there is enough evidence to support the claim.

Statistics and Probability CM7
HYPOTHESIS TESTING: TWO POPULATIONS

To compare two populations, we need two samples. It can be done on independent and dependent samples. The test also differs if you have large independent samples, small independent samples, or two dependent samples.

Testing Two Large Independent Samples

- We will compare the means of two populations, say 𝜇1 and 𝜇2, by considering their difference.
- The inference about 𝜇1 − 𝜇2 will be based on the observed difference of the sample means, 𝑋̅1 − 𝑋̅2.
- If we assume that there is no significant difference between the population means, then 𝜇1 − 𝜇2 = 0, and it follows that 𝜇1 = 𝜇2.

Assumptions in z-test for Independent Populations
1. Subjects are randomly chosen.
2. The two populations are independent of each other.
3. Populations are normally distributed.
4. Population standard deviations are known.
5. Each sample has a size of 30 or above.

Steps for Two Sample Z-test
Step 1: Formulate the null and alternative hypothesis.
The null hypothesis states that there is no significant difference between the means of the two populations, while the alternative hypothesis states that there is a significant difference between the two means: one is said to be higher or lower than the other, or they are simply different from each other.
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test.
Step 3: Set the level of significance.
Step 4: Determine the critical value 𝑧𝑐𝑟𝑖𝑡.
Step 5: Solve for the computed value 𝑧𝑐𝑜𝑚𝑝.
𝑧𝑐𝑜𝑚𝑝 = ((𝑋̅1 − 𝑋̅2) − (𝜇1 − 𝜇2)) / √(𝜎1²/𝑛1 + 𝜎2²/𝑛2)
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision.
Step 8: State the conclusion.

Examples: a. A businessman wants to buy a new property for his next business. He compares the price of townhouses in Quezon City and in Makati City and claims that there is a difference in prices. The gathered



data is shown below. Is there enough evidence to say that there is a difference between the price of townhouses in Quezon City and in Makati City? Use 𝛼 = 0.05.

Step 1: Formulate the null and alternative hypothesis.
𝐻0: 𝜇1 = 𝜇2
𝐻1: 𝜇1 ≠ 𝜇2 (claim)
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test. Since the symbol in the alternative hypothesis is “≠”, we use a two-tailed test.
Step 3: Set the level of significance. 𝛼 = 0.05
Step 4: Determine the critical value 𝑧𝑐𝑟𝑖𝑡. 𝑧𝑐𝑟𝑖𝑡 = ±1.960
Step 5: Solve for the computed value 𝑧𝑐𝑜𝑚𝑝.
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision. Reject the null hypothesis.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, there is a significant difference between the prices of townhouses in Quezon City and Makati City.

b. The Human Resource Department wants to know whether the male or the female employees are more productive. The male employees want to prove that they are more productive. The data below shows the average evaluation of productiveness of 35 male employees and 42 female employees.

Is there enough evidence that the male employees are more productive than the female employees? Use 𝛼 = 0.01.

Step 1: Formulate the null and alternative hypothesis.
𝐻0: 𝜇1 = 𝜇2
𝐻1: 𝜇1 > 𝜇2 (claim), where 𝜇1 is the mean for the male employees and 𝜇2 is the mean for the female employees
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test. Since the symbol in the alternative hypothesis is “>”, we use a right-tailed test.
Step 3: Set the level of significance. 𝛼 = 0.01
Step 4: Determine the critical value 𝑧𝑐𝑟𝑖𝑡. 𝑧𝑐𝑟𝑖𝑡 = 2.326
Step 5: Solve for the computed value 𝑧𝑐𝑜𝑚𝑝.
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision. Do not reject the null hypothesis.
Step 8: State the conclusion. Since we do not reject the null hypothesis, we cannot accept the alternative hypothesis. Hence, there is not enough evidence for the claim that the male employees are more productive than the female employees.
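The two-sample z-test steps can be sketched in code. This is a minimal illustration under stated assumptions: it uses the usual statistic z = (𝑋̅1 − 𝑋̅2) / √(𝜎1²/𝑛1 + 𝜎2²/𝑛2) with 𝜇1 − 𝜇2 = 0 under the null hypothesis, the function name `two_sample_z` is made up for this example, and the means and standard deviations are hypothetical numbers, not the data of the examples. Only the sample sizes 35 and 42 and 𝛼 = 0.01 are taken from the Human Resource example.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """z = (mean1 - mean2) / sqrt(sigma1^2/n1 + sigma2^2/n2), assuming mu1 - mu2 = 0."""
    standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2) / standard_error


alpha = 0.01
# Right-tailed critical value: the z with area alpha to its right
z_crit = NormalDist().inv_cdf(1 - alpha)

# HYPOTHETICAL summary data (not from the source): evaluation scores
z_comp = two_sample_z(mean1=8.2, mean2=7.9, sigma1=0.9, sigma2=1.1, n1=35, n2=42)

# Right-tailed decision rule: reject H0 only if z_comp exceeds z_crit
decision = "Reject H0" if z_comp > z_crit else "Do not reject H0"
print(f"z_crit = {z_crit:.3f}, z_comp = {z_comp:.3f}: {decision}")
```

For a two-tailed test, the critical value would instead come from `NormalDist().inv_cdf(1 - alpha / 2)`, and the absolute value of the computed z is compared with it.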
Testing Two Small Populations
The hypothesis test used for small samples is called the t-test. It is used for samples with a size less than 30.

Testing the Means between Two Independent Samples

Assumptions in t-test for Independent Samples
1. Subjects are randomly selected.
2. Groups are independent of each other.
3. Population variances are homogeneous.
4. Populations are normally distributed.

Steps for Two Independent Sample t-test
Step 1: Formulate the null and alternative hypothesis.
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test.
Step 3: Set the level of significance.
Step 4: Calculate the degrees of freedom (𝑑𝑓 = 𝑛1 + 𝑛2 − 2). Determine the critical value 𝑡𝑐𝑟𝑖𝑡.
Step 5: Solve for the computed value 𝑡𝑐𝑜𝑚𝑝.
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision.
Step 8: State the conclusion.

Examples: a. A researcher wants to determine whether the monthly salary of SHS teachers in private school and public school differs. She randomly selects samples of SHS teachers. From each group, she calculated the mean and standard deviation. The data is shown below. At 𝛼 = 0.05, can she claim that the monthly salary of SHS public school teachers is higher than that of SHS private school teachers?

Step 1: Formulate the null and alternative hypothesis.
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test. It is a right-tailed test.
Step 3: Set the level of significance. 𝛼 = 0.05
Step 4: Calculate the degrees of freedom. 𝑑𝑓 = 𝑛1 + 𝑛2 − 2 = 12 + 15 − 2 = 25. Determine the critical value 𝑡𝑐𝑟𝑖𝑡. 𝑡𝑐𝑟𝑖𝑡 = 1.708
Step 5: Solve for the computed value 𝑡𝑐𝑜𝑚𝑝. We are assuming that there is no significant difference between the population means. So, 𝜇1 = 𝜇2 and 𝜇1 − 𝜇2 = 0.
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision. Reject 𝐻0.
Step 8: State the conclusion. Since we reject the null hypothesis, we accept the alternative hypothesis. Hence, we have enough evidence to support the claim that the monthly salary of SHS public school teachers is higher than that of SHS private school teachers.
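The computed value for two small independent samples can be sketched as follows. This is an assumption-laden illustration: it uses the pooled-variance form of the t statistic, which is the usual choice when the population variances are assumed homogeneous and 𝑑𝑓 = 𝑛1 + 𝑛2 − 2. The function name `pooled_t` is made up for this example, and the salary means and standard deviations are hypothetical. Only 𝑛1 = 12, 𝑛2 = 15, and 𝑡𝑐𝑟𝑖𝑡 = 1.708 come from the teacher salary example.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Two-sample t with pooled variance, assuming mu1 - mu2 = 0 under H0."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# HYPOTHETICAL monthly salaries in pesos (not the data of the example)
t_comp, df = pooled_t(mean1=30000, mean2=27000, s1=3000, s2=3500, n1=12, n2=15)
t_crit = 1.708  # right-tailed, alpha = 0.05, df = 25, read from the t-table

# Right-tailed decision rule: reject H0 only if t_comp exceeds t_crit
decision = "Reject H0" if t_comp > t_crit else "Do not reject H0"
print(f"df = {df}, t_comp = {t_comp:.3f}: {decision}")
```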



Testing the Means between Two Dependent Samples

Assumptions in t-test for Dependent Samples
1. Subjects are randomly selected.
2. Groups are dependent on each other; in other words, each component is paired to each other.
3. Population variances are homogeneous.
4. Populations are normally distributed.

Steps for Two Dependent (Paired) Sample t-test
Step 1: Formulate the null and alternative hypothesis.
Step 2: Determine if the given problem is a two-tailed test or a one-tailed test.
Step 3: Set the level of significance.
Step 4: Calculate the degrees of freedom (𝑑𝑓 = 𝑛 − 1). Determine the critical value 𝑡𝑐𝑟𝑖𝑡.
Step 5: Calculate the paired sample t-test.
Step 6: Compare the critical value and the computed value.
Step 7: Make a statistical decision.
Step 8: State the conclusion.
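The paired-sample computation can be sketched in code. This is a hedged illustration, assuming the standard paired t statistic: the mean of the paired differences divided by (𝑠𝑑 / √𝑛), where 𝑠𝑑 is the sample standard deviation of the differences and 𝑑𝑓 = 𝑛 − 1. The function name `paired_t` and the before/after scores are hypothetical and are not taken from the module.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t: mean of the differences divided by (s_d / sqrt(n)); df = n - 1."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_d = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # sample standard deviation of the differences
    return mean_d / (s_d / math.sqrt(n)), n - 1


# HYPOTHETICAL scores of the same 6 students before and after a review session
before = [70, 65, 80, 75, 60, 72]
after = [74, 70, 79, 81, 66, 75]

t_comp, df = paired_t(before, after)
print(f"df = {df}, t_comp = {t_comp:.3f}")
```

The computed value is then compared with the critical value from the t-table at the chosen level of significance and 𝑑𝑓 = 𝑛 − 1, as in Steps 6 to 8.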
