Lesson Preview
This lesson is designed to equip you with the knowledge and skills for gathering, organizing, and
presenting data in tables and graphs, and for interpreting data. You will be asked to choose a relevant issue
to research as your final output.
What is Statistics?
Statistics is a field of mathematics that deals with the Collection, Organization, Analysis, and
Interpretation of quantitative data.
Collection of data is the process of gathering relevant information from the population.
Organization of data is the systematic arrangement of data into tables, graphs, or charts so that
logical and statistical conclusions can easily be derived from the collected information.
Analysis of data refers to the process of deducing relevant information from the given data so that
a numerical description can be formulated.
MABALACAT CITY COLLEGE MATH 101 | MATHEMATICS IN THE MODERN WORLD
HANDOUT – TG 4 Week No. 7 - 8
Interpretation of data is all about deriving conclusions from the data that have been analyzed. It also
involves making predictions and forecasts about large groups based on data gathered from small
groups.
2. Inferential Statistics is another area of Statistics concerned with drawing conclusions about a large
group of data, called the population, based on selected elements of that population, known as
a sample.
Here, the statistician tries to make inferences from samples to population. This area also
makes use of the concept of probability.
2. Sample (n) is the set of measurements that is collected in the course of investigation. It is the subset
of objects/subjects drawn from the population.
Using the same example, if N = 3600, a sample drawn from this population might have
n = 2500.
3. Variable is a particular characteristic of an object or individual. It varies from object to object.
A variable in any study may be quantitative or qualitative in nature.
A quantitative variable has a value or numerical measurement to which arithmetic operations can be applied.
Examples: age, height, weight
A qualitative variable describes an object or individual by placing the object into a category or group.
Examples: gender, nationality, color.
There are many approaches to determining the sample size. These include:
1. using a census for small populations,
2. using the sample size of similar studies,
3. using published tables by well-established authors such as the sample size table using
2. Confidence level (in %) tells the researcher how sure s/he can be that the response of the sample
represents that of the population.
For example: A 95% confidence level with a 3 percent margin of error (e = ±3%) means that our
statistic will be within ±3 percentage points of the real population value 95% of the time.
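As a minimal sketch of this interpretation, the snippet below turns a reported sample percentage and a margin of error into the range in which the population value is expected to fall. The function name `interval_from_margin` and the 60% figure are illustrative assumptions, not values taken from any survey in this handout.

```python
def interval_from_margin(sample_percent, margin_percent):
    """Return the (low, high) range implied by a margin of error, in percent."""
    # Clamp to the valid 0-100 range for percentages.
    low = max(0.0, sample_percent - margin_percent)
    high = min(100.0, sample_percent + margin_percent)
    return low, high


# Hypothetical result: 60% of respondents agree, with e = ±3%.
low, high = interval_from_margin(60.0, 3.0)
print(f"The population value is expected to lie between {low}% and {high}%")
```

At a 95% confidence level, a range built this way would be expected to contain the real population value in about 95 out of 100 repeated surveys.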
To illustrate: Let us use the results of the Jobstreet.com survey, which says that Filipinos are the
happiest employees in Southeast Asia.
Jon Carlos Rodriguez, ABS-CBN News: Posted on August 31, 2016 12:12 PM / Updated
as of Sep 01, 2016 09:18 PM
MANILA (UPDATE) Filipino employees are the happiest in Southeast Asia and their
positive attitude is likely to boost the economy, results of a Jobstreet.com survey
released August 31, 2016 showed.
Let us assume that the researcher used a 5% margin of error; the results are then interpreted in the
following manner.
Sampling Techniques
The methods of selecting samples from a given population.
Simple Random Sampling: This is the most basic sampling technique, in which samples are selected from
a population entirely by chance, and each member of the population has an equal or known chance of
being included in the sample.
Example: Lottery sampling, or the use of random numbers
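The lottery idea can be sketched with Python's standard `random` module. The population of 50 numbered students, the sample size of 10, and the function name `simple_random_sample` are assumptions made only for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical population: students numbered 1 to 50.
students = list(range(1, 51))
chosen = simple_random_sample(students, 10, seed=2023)
print(sorted(chosen))
```

Sampling is done without replacement, so no student can be picked twice, just as a ticket drawn in a lottery is not returned to the box.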
Stratified Random Sampling: Stratified random sampling is a sampling method that subdivides the
population into subgroups (strata) based on shared attributes or characteristics. A sample from each stratum, proportional to its size when compared to
the population, is pooled to form a random sample.
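A minimal sketch of proportional allocation follows. The three year-level strata, their sizes, and the function name `stratified_sample` are hypothetical; rounding may make the pooled total differ slightly from the requested size when the proportions do not divide evenly.

```python
import random


def stratified_sample(strata, n, seed=None):
    """Sample from each stratum in proportion to its share of the population.

    strata maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)  # this stratum's share of n
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population of 1000 students in three strata.
strata = {
    "first_year": [f"F{i}" for i in range(500)],
    "second_year": [f"S{i}" for i in range(300)],
    "third_year": [f"T{i}" for i in range(200)],
}
sample = stratified_sample(strata, 100, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
# {'first_year': 50, 'second_year': 30, 'third_year': 20}
```

Because each stratum contributes in proportion to its size, the pooled sample keeps the same mix of groups as the population.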
Qualitative research seeks to give an in-depth picture of why and how people behave, or why a
phenomenon occurred, by collecting data in words from interviews, observations, focus group
discussions, open-ended questions, etc., in order to draw conclusions and make inferences.
Lesson Overview
In this lesson you will recognize that correlation and regression analysis can be used
in making decisions.
Correlation Analysis
Scatter Plot
A scatter plot is drawn so we can see whether two variables are related. If a correlation is
found, it can be either positive or negative, depending on the numerical values measured.
A scatter plot is a graph of ordered pairs (x, y) consisting of data from two data sets.
Positive correlation exists if one variable increases simultaneously with the other, i.e. the high numerical
values of one variable relate to the high numerical values of the other.
Negative correlation exists if one variable decreases when the other increases, i.e. the high numerical values
of one variable relate to the low numerical values of the other.
Example 1. Draw a scatter plot for the scores shown. Is there a relationship between the sets of scores?
[Figure: scatter plot for Example 1, with Geometry Scores (y) on the vertical axis]
We can see from the graph that there appears to be a positive correlation between the Geometry and
Algebra scores, since the points progress from lower left to upper right, which means that as the scores in Geometry
increased (decreased), the scores in Algebra also increased (decreased).
Example 2. Suppose the scores of the students in those subjects are as follows. Plot the scatter diagram. Is
there a relationship between the two sets of scores?
We can see from the graph that there appears to be a negative correlation between the Geometry and
Algebra scores, since the points progress from upper left to lower right, which means that as the scores in Geometry
increased (decreased), the scores in Algebra decreased (increased).
Deciding whether or not the two data sets are related by simply looking at a scatter plot is a pretty
subjective process, so it would be nice to have a way to quantify how strongly connected data sets are.
The correlation coefficient is a number that describes how strong the relationship between two data
sets is. Correlation coefficients range from -1 (perfect negative correlation) to 1 (perfect positive correlation).
A correlation coefficient close to zero indicates that the data sets are most likely not linearly correlated (see
Figure 1.0).
Figure 1.0
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$
where:
n = the number of data pairs
∑x = the sum of the x values
∑y = the sum of the y values
∑xy = the sum of the products of the x and y values for each pair
∑x² = the sum of the squares of the x values
∑y² = the sum of the squares of the y values
Obviously, this is a pretty complicated formula, so arranging the information in an orderly table is a big help.
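The same bookkeeping can be sketched in Python: compute the five sums, then substitute them into the formula. The function name `correlation_coefficient` and the five data pairs are hypothetical and chosen only to keep the arithmetic small.

```python
import math


def correlation_coefficient(xs, ys):
    """Compute r from the five sums used in the formula above."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical data pairs (not taken from the handout's examples).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_coefficient(xs, ys)
print(round(r, 2))  # 0.77
```

For these pairs the sums are ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55 and ∑y² = 86, so r = 30 / √1500, a fairly strong positive correlation.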
To interpret its value, refer to Table 1.0 and find which of the listed values your correlation coefficient r is closest to.
Example 3. Is there a significant relationship between the two sets of test scores in
Algebra and Geometry of ten students? Find the correlation coefficient for
the data and discuss what you think it indicates.
Solution: n = 10
Use a table to organize your data
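Substitute all the sums of the data into the formula, then compute r:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$
$$r = \frac{10(2{,}045) - (137)(146)}{\sqrt{[10(1{,}933) - (137)^2][10(2{,}186) - (146)^2]}}$$
$$r = \frac{20{,}450 - 20{,}002}{\sqrt{[19{,}330 - 18{,}769][21{,}860 - 21{,}316]}}$$
$$r = \frac{448}{\sqrt{[561][544]}}$$
$$r = \frac{448}{\sqrt{305{,}184}} = \frac{448}{552.43} \approx 0.81$$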
Algebra Scores (x)     9   3   4   7   6   1   2   5  10   2
Geometry Scores (y)    3   6   7   4   2   9   8   4   2  10

Solution: n = 10

  x     y     xy     x²     y²
  9     3     27     81      9
  3     6     18      9     36
  4     7     28     16     49
  7     4     28     49     16
  6     2     12     36      4
  1     9      9      1     81
  2     8     16      4     64
  5     4     20     25     16
 10     2     20    100      4
  2    10     20      4    100
Sum:
 49    55    198    325    379
Substitute all the sums of the data into the formula, then compute r:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$
$$r = \frac{10(198) - (49)(55)}{\sqrt{[10(325) - (49)^2][10(379) - (55)^2]}}$$
$$r = \frac{1980 - 2695}{\sqrt{[3250 - 2401][3790 - 3025]}}$$
$$r = \frac{-715}{\sqrt{[849][765]}}$$
$$r = \frac{-715}{\sqrt{649{,}485}} = \frac{-715}{805.90632}$$
r = −0.89
Interpretation: From Table 1.0, a coefficient of −0.89 indicates a high negative
correlation and a very dependable relationship: when scores in Algebra
increased (decreased), scores in Geometry decreased (increased).
Regression Analysis
Once we have concluded that there is a significant relationship between the two variables the next step
is to find the equation of the regression line through the data points.
If you look back at the scatter plot of Example 1, you can see a general trend among the points from
lower left to upper right. You could probably put a straightedge down and draw what seems like the closest
fit to the points. The regression line is the line for which the sum of the squared vertical
distances from each point to the line is a minimum. For this reason the regression line is also called the line of
best fit.
Recall from algebra that the equation of a line in slope-intercept form is y = mx + b, where m is the
slope and b is the y-intercept. In statistics, the equation of the regression line is written as y = a + bx, where a
is the y-intercept and b is the slope. This is the equation that will be used here. In order to find the values of
a and b, we need two formulas.
Formulas for Finding the Values of a and b for the Equation of the Regression Line
Slope (b)
$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$
y-intercept (a)
$$a = \frac{\sum y - b(\sum x)}{n}$$
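As a rough sketch, the two quantities can be computed in sequence: the slope b first, because the y-intercept a depends on it. The slope expression used here is the standard least-squares form, which shares its numerator with the correlation formula. The function name `regression_line` and the four data pairs are hypothetical.

```python
def regression_line(xs, ys):
    """Return (a, b) for the regression line y = a + bx."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: b = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # y-intercept: a = (sum(y) - b*sum(x)) / n
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical points that lie exactly on the line y = 1 + 2x.
a, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {a} + {b}x")  # y = 1.0 + 2.0x
```

Because the sample points fall exactly on a line, the function recovers that line; with real scores the result is the line of best fit rather than an exact match.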
Example 5. Find the equation of the regression line for the data in Example 3.
Solution
We already calculated the values needed for each formula when we found the correlation coefficient in
Example 3. Substitute into the first formula to find the value of the slope.
$$b = \frac{20{,}450 - 20{,}002}{19{,}330 - 18{,}769} = \frac{448}{561} = 0.799 \approx 0.80$$
Substitute into the second formula to find the value of a (y-intercept) when b = 0.80.
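One way this substitution could be worked out, using the sums from Example 3 (∑x = 137, ∑y = 146, n = 10) and assuming the slope is rounded to b = 0.80 before substituting:

$$a = \frac{\sum y - b(\sum x)}{n} = \frac{146 - 0.80(137)}{10} = \frac{146 - 109.6}{10} = 3.64$$

Under that rounding, the regression line would be y = 3.64 + 0.80x. Carrying more decimal places in b shifts the intercept slightly, to about 3.66.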
Example 6. Find the equation of the regression line for the data in Example 4.
Substitute into the second formula to find the value of a (y-intercept) when b = -0.84.
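A sketch of how both substitutions could go, using the sums from Example 4 (n = 10, ∑x = 49, ∑y = 55, ∑xy = 198, ∑x² = 325) and assuming the slope is rounded to two decimal places. The slope comes out negative, consistent with the negative correlation found in Example 4:

$$b = \frac{10(198) - (49)(55)}{10(325) - (49)^2} = \frac{1980 - 2695}{3250 - 2401} = \frac{-715}{849} \approx -0.84$$

$$a = \frac{55 - (-0.84)(49)}{10} = \frac{55 + 41.16}{10} \approx 9.62$$

Under these values, the regression line would be y = 9.62 − 0.84x.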
A hypothesis is a speculation or theory based on insufficient evidence that lends itself to further testing
and experimentation. With further testing, a hypothesis can usually be proven true or false.
2. An alternative hypothesis (Ha) is one that states there is a statistically significant relationship
between two variables.
Example: There is a significant relationship between the test scores in Algebra and
Geometry
Step 6. Conclusion
For this problem we have computed the correlation coefficient, r = 0.81;
you will use this coefficient in testing the hypothesis.
Solution:
Step 1. State the Null and alternative hypotheses.
Ho: There is no significant relationship between the scores in Algebra and
Geometry.
Ha: There is a significant relationship between the scores in Algebra and Geometry.
Step 3.
t_comp = 3.906 (Compare this value with the critical t value from
the t-table at the 0.05 level of significance to see whether it is greater or smaller.)
Step 5. Decision
From the t-table of values, at the 0.05 level of significance, t_critical = 2.306.
Since t_computed = 3.906 > t_critical = 2.306, reject Ho and accept Ha.
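The decision can be reproduced with a short sketch. It assumes the usual test statistic for a correlation coefficient, t = r√((n − 2)/(1 − r²)) with n − 2 degrees of freedom, and uses 2.306, the standard two-tailed critical value for 8 degrees of freedom at the 0.05 level. The function name `t_statistic` is hypothetical.

```python
import math


def t_statistic(r, n):
    """Test statistic for a correlation coefficient: t = r * sqrt((n - 2) / (1 - r^2))."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 0.81          # correlation coefficient for the Algebra and Geometry scores
n = 10            # number of data pairs
t_computed = t_statistic(r, n)
t_critical = 2.306  # two-tailed, 0.05 level, df = n - 2 = 8 (from a t-table)

reject_null = abs(t_computed) > t_critical
print(f"t computed = {t_computed:.2f}, reject Ho: {reject_null}")
```

The computed value is about 3.91, close to the 3.906 reported above (the small difference comes from rounding), and it exceeds the critical value, so the null hypothesis is rejected.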
Step 6. Conclusion
We can conclude that there is a highly significant correlation between
Algebra and Geometry scores. Hence, when the scores in Algebra
increase (or decrease), the scores in Geometry also increase (or
decrease).
Activity
1. Accomplish Worksheet 4.1 Correlation and Regression Analysis and Worksheet 4.2 Hypothesis
Testing. Due Date: May 2, 2023
2. Long Quiz (Nature of Mathematics to Statistical Tools). Due Date: May 2, 2023
3. Group Output: Project Proposal for a Quantitative Study. (Refer to the General Instructions and Scoring
Rubric). Due Date: May 30, 2023
It is not the intention of the author/s nor the publisher of this guide to have monetary gain in using the textual
information, images, and other references used in its production. This guide is only for the exclusive use of a bona fide
student of Mabalacat City College.
In addition, neither this guide nor any part of it may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying, and/or otherwise, without the prior permission of
Mabalacat City College.
April Ann L. Galang (Clerk, IAS)
GRACIA T. CANLAS, LPT, MAED (MATH 101, Instructor)
MARILYN S. ARCILLA, RN, LPT, MAN (Dean, IAS)
MICHELLE AGUILAR-ONG, DPA (VPAA)