
INTENDED LEARNING OUTCOMES

After going through this topic, you are expected to:


1. Recall the different terms and basic concepts in Statistics
2. Compute and interpret the mean, median and mode of ungrouped and
grouped data
3. Solve and interpret the different measures of position
4. Calculate and interpret the range, variance and standard deviation
5. Employ the different statistical tools in solving real life problems
INTRODUCTION
Statistics is used almost every day in our life. Statistics is the branch
of mathematics which we use to analyse what is happening in the world
around us. Below are some examples of the application of Statistics in real
life:
 Medical Study
Physicians use statistics to examine the effectiveness of treatments.
 Weather Forecasts/Emergency Preparedness
Weather forecast models are built using statistics that compare prior
weather conditions with current weather to forecast future weather
conditions. Forecasts can also warn of natural disasters that may happen
soon, which helps us prepare for an emergency and helps rescue teams get
ready to save the lives of people who are in danger.
 Quality Testing
A company makes thousands of products every day and must make sure that
it sells only the best-quality items. It is not possible for a company to test
each product, so the company performs quality testing with the help of statistics.
 Consumer Goods
Retailers use statistics to keep track of everything they sell and to know
their stock. Leading retailers worldwide use statistics to calculate which
products to ship to each store.
 Economics
Economics is about allocating limited resources among unlimited ends in
the most optimal manner. Statistics offers information to answer some
basic questions in economics –
o What to produce?
o How to produce?
o For whom to produce?

MODULE 4
STATISTICS
Statistics is the field of mathematics that deals with the Collection,
Organization, Analysis and Interpretation of quantitative data.
When we say Collection of Data, we mean the process of gathering
relevant information from the population. When we talk about Organization
of Data, we refer to the systematic arrangement of data into tables, graphs,
or charts so that logical and statistical conclusions can easily be derived from
the collected information. Analysis of Data refers to the process of deducing
relevant information from the given data so that a numerical description can
be formulated. Interpretation of Data is all about deriving conclusions from
the data that have been analyzed. It also involves making predictions or
forecasts about large groups based on data gathered from small groups.
TWO FIELDS OF STATISTICS
Statistics may be subdivided into two fields:
1. Descriptive Statistics - consists of the collection, organization,
summarization, and presentation of data.
 Here, the statistician tries to describe a given situation
2. Inferential Statistics – is another area of Statistics concerned with
drawing conclusions about large groups of data, called the population,
based on selected elements of that population, known as the sample.
 Here, the statistician tries to make inferences from samples to
population. This area also makes use of the concept of probability.
Lesson 4.1 BASIC CONCEPTS IN STATISTICS
Data types are an important concept in statistics. They need to
be understood in order to apply statistical measurements correctly to your data
and therefore to draw correct conclusions about it.
As shown in the figure, data types can either be categorical or numerical.
CATEGORICAL DATA/QUALITATIVE DATA
Categorical/qualitative data represent characteristics, such as a
person’s gender or language. Categorical data can
also take on numerical values (Example: 1 for female and 0 for male). Note
that those numbers don’t have mathematical meaning. Categorical data can
either be nominal or ordinal.
SCALES OF MEASUREMENT
1. NOMINAL SCALE
A nominal scale is the 1st level of measurement scale in which the
numbers serve as “tags” or “labels” to classify or identify the objects. A
nominal scale usually deals with the non-numeric variables or the
numbers that do not have any value.
Nominal data can either be real nominal or artificial nominal:
o Real nominal data are classified under naturally occurring
characteristics like gender, nationality, color of the eyes, etc.
o Artificial nominal data are classified based on man-made
characteristics following certain rules, like passed or failed based on
scores in a test.
Other examples of nominal data:
o Breeds of Cattle
o Types of beans
o Brand of fertilizer

2. ORDINAL SCALE
The ordinal scale is the 2nd level of measurement that reports the ordering
and ranking of data without establishing the degree of variation between
them. Ordinal represents the “order.” Ordinal data is known as qualitative
data or categorical data. It can be grouped, named and also ranked.
Ordinal values represent discrete and ordered units. Ordinal data is therefore
nearly the same as nominal data, except that its ordering matters. You
can see an example below:

In business, you may want to determine the satisfaction of your clients.
A questionnaire can be used to determine their level of
satisfaction.
How satisfied are you with our service?
o 1 – Very unsatisfied
o 2 – Somewhat unsatisfied
o 3 – Neutral
o 4 – Somewhat satisfied
o 5 – Very satisfied
Here are other examples of ordinal data
o Academic Rank (Instructor, Assistant Professor, Associate Professor,
Professor)
o Employment Status (Permanent, temporary, Contractual)
o Position (President, Vice President, Secretary, Treasurer)
o Sickness Stage (Stage 1, Stage 2, Stage 3)
 Ranking of school students – 1st, 2nd, 3rd, etc.
 Evaluating the frequency of occurrences
 Very often
 Often
 Not often
 Not at all
 Assessing the degree of agreement
 Totally agree
 Agree
 Neutral
 Disagree
 Totally disagree

3. INTERVAL SCALE
The interval scale is the 3rd level of measurement scale. It is defined as a
quantitative measurement scale in which the difference between two
values is meaningful. In other words, the variables are measured in an
exact manner, not in a relative way, and the position of zero is
arbitrary.
Interval scales hold no true zero and can represent values below zero. For
example, you can measure temperatures below 0 degrees Celsius, such as
-10 degrees.
The interval level is a numerical level of measurement which, like the
ordinal scale, places variables in order. Unlike the ordinal scale, however,
the interval scale has a known and equal distance between each value on
the scale.
Interval data can include numerical data that does not use zero as a
reference, such as age ranges, income ranges, temperature ranges and
similar range-based data.

4. RATIO SCALE
The ratio scale is the 4th level of measurement scale, which is quantitative.
It is a type of variable measurement scale. It allows researchers to
compare the differences or intervals. The ratio scale has a unique feature:
it possesses a true origin, or zero point.
Characteristics of Ratio Scale:
 Ratio scale has a feature of absolute zero
 It doesn’t have negative numbers, because of its zero-point feature
 It affords unique opportunities for statistical analysis. The variables
can be orderly added, subtracted, multiplied, divided. Mean, median,
and mode can be calculated using the ratio scale.
 Ratio scale has unique and useful properties. One such feature is
that it allows unit conversions like kilogram – calories, gram –
calories, etc.

MEASURES OF CENTRAL TENDENCY OF UNGROUPED DATA


 MEAN
The “mean” is the usual average, where you add up all the numbers and
divide by the number of values.
Examples:
1. The ages of five contestants in a statistics Quiz Bee are the following:
18, 17, 18, 19, and 18. Find their average age.
Solution:
Add all the values, then divide the sum by 5:
$\bar{x} = \frac{18 + 17 + 18 + 19 + 18}{5} = \frac{90}{5} = 18$
Thus, the mean age of the contestants is 18.
2. The number of tankers fabricated by E. Pagtalunan Steel Works
Fabrication in a year is shown below. Determine the mean.
243   160    98   360
157   104   139   230
348   211   249   111
$\bar{x} = \frac{243 + 157 + 348 + 160 + 104 + 211 + 98 + 139 + 249 + 360 + 230 + 111}{12} = \frac{2410}{12}$
$\bar{x} = 200.83$
This shows that the average number of tankers fabricated by E.
Pagtalunan Steel Works Fabrication is 200.83.
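The two mean computations above can be checked with a short Python sketch. The variable names are illustrative only; `statistics.mean` from the standard library performs the same add-then-divide computation.

```python
from statistics import mean

# Ages of the five Quiz Bee contestants (Example 1)
ages = [18, 17, 18, 19, 18]

# The 12 tanker counts from Example 2
tankers = [243, 160, 98, 360, 157, 104, 139, 230, 348, 211, 249, 111]

# Add all the values, then divide by how many values there are
mean_age = sum(ages) / len(ages)

# statistics.mean does the same computation
mean_tankers = mean(tankers)

print(mean_age)                # 18.0
print(round(mean_tankers, 2))  # 200.83
```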

Lesson 4.2 MEASURES OF CENTRAL TENDENCY


The weighted mean is a mean calculated by giving values in a data set more
influence according to some attribute of the data. It is an average in which
each quantity to be averaged is assigned a weight, and these weightings
determine the relative importance of each quantity on the average.
A weighting is equivalent to having that many like terms with the same
value involved in the average.
The formula for the weighted mean is
$WM = \frac{\sum wx}{\sum w}$
Where: w is the weight of each value
x is the matching value

Example 1
Carla bought different fruits for New Year. She bought 3 apples at
P10 each, 5 ponkans at P5 each, 3 pears at P15 each, and 4 pieces of
chico at P25 each. What is the average price of each fruit that
Carla bought?
Solution:
$WM = \frac{\sum wx}{\sum w}$
$WM = \frac{3 \cdot 10 + 5 \cdot 5 + 3 \cdot 15 + 4 \cdot 25}{3 + 5 + 3 + 4}$
$WM = \frac{30 + 25 + 45 + 100}{15} = \frac{200}{15} = 13.33$
Thus, the average price of each fruit bought by Carla is P13.33.
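As a minimal sketch, the weighted mean formula can be written as a small Python function. The function name `weighted_mean` and the variable names are hypothetical, chosen only for this illustration; the quantities bought act as the weights and the prices as the values.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

prices = [10, 5, 15, 25]    # x: price per piece, in pesos
quantities = [3, 5, 3, 4]   # w: number of pieces bought

wm = weighted_mean(prices, quantities)
print(round(wm, 2))  # 13.33
```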

MEDIAN
The “median” is the “middle” value in the list of numbers. To find the
median, the numbers have to be listed in numerical order, so you may
have to rewrite your list first. If there is an even number of
data/numbers such that there is no single middle value, the median is the
mean of the two middle values. If the number of data is odd, the middle
data serves as the median.

Example 2
The following data are the numbers of computers purchased by a certain
company in 13 years. Determine the median.
24, 25, 85, 38, 77, 63, 49, 71, 80, 55, 40, 42, 81
Solution: Arrange the given data in either ascending or descending
order.
Ascending order:
24, 25, 38, 40, 42, 49, 55, 63, 71, 77, 80, 81, 85
Find the middle data
Since the number of data is odd (13 years), then 55 is the middle
data and is the median of these observations

Example 3
The following are the numbers of refrigerators manufactured by Hans
Company in 10 years. Determine the median.
214, 175, 128, 155, 238, 205, 200, 195, 180, 305
Solution: Arrange the given data in either ascending or descending
order.
Descending order:
305, 238, 214, 205, 200, 195, 180, 175, 155, 128
Since two observations occur as the middle data, add these
values and divide by two, as shown below:
Median = $\frac{200 + 195}{2} = 197.5$
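The odd and even cases of the median can be sketched in Python as follows. The function is written out by hand to mirror the steps in the text (sort, then take the middle value or the mean of the two middle values); the names are illustrative.

```python
def median(data):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

computers = [24, 25, 85, 38, 77, 63, 49, 71, 80, 55, 40, 42, 81]
refrigerators = [214, 175, 128, 155, 238, 205, 200, 195, 180, 305]

print(median(computers))      # 55
print(median(refrigerators))  # 197.5
```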

MODE
The “mode” is the value that occurs most often. If no number is
repeated, then there is no mode for the list.

Example 4
Find the mode of the following set of data
18, 19, 24, 37, 37, 37, 48, 50
Answer: The mode is 37, since it is the most frequent score in the
distribution. The distribution is unimodal

Example 5
Find the mode of the following set of data
43, 47, 50, 50, 50, 50, 65, 70, 70, 70, 83
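Answer: The mode is 50 because it occurs four times, more often than any
other value (70 occurs only three times). The distribution is therefore
unimodal; it would be bimodal only if 50 and 70 occurred equally often.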
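A hedged sketch of finding the mode by counting occurrences is shown below. The helper name `modes` is hypothetical; it returns an empty list when no value repeats, following the definition given above. Counting the second data set shows that 50 occurs four times and 70 occurs three times.

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value is repeated, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)

example_4 = [18, 19, 24, 37, 37, 37, 48, 50]
example_5 = [43, 47, 50, 50, 50, 50, 65, 70, 70, 70, 83]

print(modes(example_4))                    # [37]
print(Counter(example_5).most_common(2))   # [(50, 4), (70, 3)]
print(modes(example_5))                    # [50]
```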

Measures of relative position are used to locate the relative position of an
observation in a set of data, and they are said to be the natural extension of the
median. The common measures of relative position are the quartiles, deciles, and
percentiles.
In this module, we shall discuss data analysis by dividing the data into four, ten, and
one hundred parts of equal size. The corresponding partition values are called
quartiles, deciles, and percentiles.

Lesson 4.3 MEASURES OF RELATIVE POSITION


1. Quartiles - the values that divide the data set into four equal parts.
FORMULA FOR FINDING THE QUARTILE of ungrouped data
 Steps to solve the quartile of the given data
1. Arrange the data from the lowest value to the highest value.
2. Find the N or the total number of elements presented in the data.
3. Find the least value of the data and the greatest value of the data.
4. Solve for the unknown. The position of $Q_k$ in the ordered data is $\frac{k(n + 1)}{4}$.

Example 1:
The owner of a coffee shop recorded the number of customers who came
into his café each hour in a day. The results were 14, 10, 12, 9, 17, 5, 8, 9, 14, 10,
and 11.
1. Find $Q_1$.
Solution:
Arrange the given data in ascending order:
{5, 8, 9, 9, 10, 10, 11, 12, 14, 14, 17}
n = 11
Substitute into the formula:
Position of $Q_1 = \frac{1(n + 1)}{4} = \frac{1(11 + 1)}{4} = 3$
Therefore, $Q_1$ is the 3rd element in the data, which is 9.
2. Find $Q_3$.
Solution:
Substitute into the formula:
Position of $Q_3 = \frac{3(n + 1)}{4} = \frac{3(11 + 1)}{4} = 9$
Therefore, $Q_3$ is the 9th element in the data, which is 14. This
implies that for about 75% of the recorded hours, the number of customers
who came into the coffee shop was 14 or fewer.
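The answers in this example (the 3rd element for the first quartile and the 9th element for the third) are consistent with the position rule k(n + 1)/4. The sketch below assumes that rule, and it also assumes that a fractional position is rounded to the nearest whole position; both are assumptions made for illustration, and the function name `quartile` is hypothetical.

```python
def quartile(data, k):
    """k-th quartile using the assumed position rule k(n + 1)/4."""
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 4           # 1-based position in the ordered data
    index = int(position + 0.5)          # assumed: round to the nearest whole position
    index = min(max(index, 1), n)        # keep the position inside the data
    return ordered[index - 1]

customers = [14, 10, 12, 9, 17, 5, 8, 9, 14, 10, 11]

print(quartile(customers, 1))  # 9
print(quartile(customers, 3))  # 14
```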

2. Deciles - the values that divide the data set into ten equal parts.
The first decile ($D_1$) is the value of the variable below which lie 10% of the
cases, and so on.

Example 2:
Find the 7th decile (D7), given the scores of 11 students in their
mathematics activity.
{ 1, 27, 16, 7, 31, 7, 30, 31, 3, 4, 21 }
Solution:
Arrange the data in ascending order: 1, 3, 4, 7, 7, 16, 21, 27, 30,
31, 31
k = 7, n = 11
Substitute into the formula:
Position of $D_7 = \frac{7(n + 1)}{10} = \frac{7(11 + 1)}{10} = 8.4 \approx 8$
$D_7$ is the 8th element; therefore, $D_7 = 27$. This indicates that about 70% of the
students who took the mathematics activity got scores below 27 and the other
30% got scores higher than 27.
3. Percentiles – values that divide the distribution into 100 equal parts.
They are used to characterize values according to the percentage below
them.
Example 3:
Find the 58th percentile (P58), given the scores of 10 students in their
mathematics activity using linear interpolation
{ 1, 27, 16, 7, 31, 7, 30, 3, 4, 21 }
Arrange the scores in ascending order: 1, 3, 4, 7, 7, 16, 21,
27, 30, 31
k = 58
n = 10
Substitute into the formula:
Position of $P_{58} = \frac{58(n + 1)}{100} = \frac{58(10 + 1)}{100} = 6.38 \approx 6$
$P_{58}$ is the 6th element; therefore, $P_{58} = 16$. This implies that about 58% of the
students got scores below 16 and the remaining 42% got scores above 16.
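The problem statement mentions linear interpolation, while the answer above takes the 6th element directly. As a hedged sketch only, assuming the fractional position k(n + 1)/100 = 6.38, interpolating between the 6th and 7th scores would give a slightly different value. The function name `percentile_interpolated` is hypothetical, and this is one of several interpolation conventions in use, not necessarily the one the module intends.

```python
def percentile_interpolated(data, k):
    """k-th percentile, interpolating linearly at the assumed position k(n + 1)/100."""
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 100          # 1-based, may be fractional
    position = min(max(position, 1), n)   # keep the position inside the data
    lower = int(position)                 # whole part of the position
    fraction = position - lower           # fractional part of the position
    if lower == n:
        return float(ordered[-1])
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

scores = [1, 27, 16, 7, 31, 7, 30, 3, 4, 21]

p58 = percentile_interpolated(scores, 58)
print(round(p58, 2))  # 17.9
```

With rounding of the position instead of interpolation, the same data give 16, which is the value reported in the example.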
Example 4:
The monthly salaries in pesos of 18 salesmen are as follows: Php15,693, Php13,066,
Php12,685, Php11,128, Php13,760, Php13,657, Php11,995, Php11,372, Php11,313,
Php9,656, Php9,518, Php9,116, Php10,503, Php10,264, Php10,466, Php10,896,
Php12,192, Php14,718.
Find the 3rd quartile, 6th decile and 18th percentile.
Solution:
First, arrange the observations in an array:
Php9,116, Php9,518, Php9,656, Php10,264, Php10,466, Php10,503, Php10,896,
Php11,128, Php11,313, Php11,372, Php11,995, Php12,192, Php12,685,
Php13,066, Php13,657, Php13,760, Php14,718, Php15,693
3rd quartile, n = 18
Position of $Q_3 = \frac{3(18 + 1)}{4} = 14.25 \approx 14$, so $Q_3$ = Php13,066.
This means that 75% of the salesmen have salaries that are below or lower
than Php13,066.
6th decile, n = 18
Position of $D_6 = \frac{6(18 + 1)}{10} = 11.4 \approx 11$, so $D_6$ = Php11,995.
This means that 60% of the salesmen have salaries that are below or lower
than Php11,995.
18th percentile, n = 18
Position of $P_{18} = \frac{18(18 + 1)}{100} = 3.42 \approx 3$, so $P_{18}$ = Php9,656.
This means that 18% of the salesmen have salaries that are below or lower
than Php9,656.
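Quartiles, deciles, and percentiles differ only in the number of equal parts (4, 10, or 100), so one function can cover all three. The sketch below assumes the position rule k(n + 1)/parts with the position rounded to the nearest whole number, which reproduces the three answers in this example; the rule and the function name `fractile` are assumptions for illustration.

```python
def fractile(data, k, parts):
    """k-th partition value when the data are divided into `parts` equal parts."""
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / parts                 # assumed position rule
    index = min(max(int(position + 0.5), 1), n)    # nearest whole position, kept in range
    return ordered[index - 1]

salaries = [15693, 13066, 12685, 11128, 13760, 13657, 11995, 11372, 11313,
            9656, 9518, 9116, 10503, 10264, 10466, 10896, 12192, 14718]

print(fractile(salaries, 3, 4))     # 13066  (3rd quartile)
print(fractile(salaries, 6, 10))    # 11995  (6th decile)
print(fractile(salaries, 18, 100))  # 9656   (18th percentile)
```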

ACTIVITY 4.3
Find of each of the following data sets. Interpret the
results.
1. 20, 27, 23, 28, 23, 25
2. 45, 59, 52, 46, 41, 26, 36, 34, 38, 41, 39, 38, 30, 49, 46, 51
3. 157, 133, 232, 267, 289, 274, 321, 348, 188, 432
VARIABILITY refers to how spread apart the values/observations of the
distribution are, or how much the values/observations vary from each
other.

FOUR MEASURES OF VARIABILITY


1. Range (R) – is the difference between the highest and the lowest
value. It is the simplest measure of variability to calculate. The
formula is as follows:
Range = highest – lowest
Example:
Find the range of the following group of numbers: 10, 12, 5, 16, 7, 13, 4
 The highest number is 16 and the lowest number is 4
Range = 16 – 4 = 12
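The range computation can be sketched in Python with the built-in `max` and `min` functions; the variable names are illustrative.

```python
numbers = [10, 12, 5, 16, 7, 13, 4]

# Range = highest value minus lowest value
data_range = max(numbers) - min(numbers)

print(max(numbers), min(numbers))  # 16 4
print(data_range)                  # 12
```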

2. Mean Absolute Deviation (MAD) – is the average distance between
each observation and the mean. It gives us an idea about the
variability in a data set.
For ungrouped data:
$MAD = \frac{\sum |x - \bar{x}|}{N}$
Where:
MAD = mean absolute deviation
x = raw score
$\bar{x}$ = mean
N = number of observations

Lesson 4.4 MEASURES OF VARIABILITY

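Example:
A group of mountaineers went hiking on Mt. Pulag, Philippines to
study the different species of plants existing in the area. The ages of
the mountaineers are 34, 35, 45, 46, 49, 32. What is the MAD of their
ages?
Solution:
$\bar{x} = \frac{34 + 35 + 45 + 46 + 49 + 32}{6} = \frac{241}{6} \approx 40.17$
$MAD = \frac{\sum |x - \bar{x}|}{N} = \frac{39}{6} = 6.5$
Therefore, the mean absolute deviation is 6.5, or about 7 years.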
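As a minimal sketch, the mean absolute deviation can be computed by averaging the absolute distances from the mean. The function name is hypothetical. Without rounding, the result for the mountaineers' ages is 6.5, which rounds to the 7 years reported above.

```python
def mean_absolute_deviation(data):
    """Average absolute distance between each observation and the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

ages = [34, 35, 45, 46, 49, 32]

mad = mean_absolute_deviation(ages)
print(round(mad, 2))  # 6.5
```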
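3. Variance – is the average of the squared deviations of the set of
observations from the mean. It measures how far a data set is spread
out.
For ungrouped data
Population Variance: (when your data is the whole population)
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Where:
$\sigma^2$ = population variance
x = raw score
$\mu$ = population mean
N = number of observations in the population
Sample Variance: (when your data is a sample taken from the population)
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Where:
$s^2$ = sample variance
x = raw score
$\bar{x}$ = sample mean
n = number of observations in the sample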
Example:
A group of mountaineers went on hiking to Mt. Pulag, Philippines to
study the different species of plants existing in the area. The ages of
the mountaineers are 34, 35, 45, 46, 49, 32. What is the variance of
their ages?
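The worked solution is not shown here, and the text does not say whether the six mountaineers are treated as a population or as a sample. As a hedged sketch, both variances can be computed with the standard library: `statistics.pvariance` divides by N and `statistics.variance` divides by n - 1.

```python
from statistics import pvariance, variance

ages = [34, 35, 45, 46, 49, 32]

# Population variance: sum of squared deviations divided by N
population_var = pvariance(ages)

# Sample variance: sum of squared deviations divided by n - 1
sample_var = variance(ages)

print(round(population_var, 2))  # 44.47
print(round(sample_var, 2))      # 53.37
```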
4. Standard Deviation – is a measure of the dispersion of a set of
data from its mean. It is determined by calculating the positive square root
of the variance. A large standard deviation indicates that the data points
are far from the mean (heterogeneous) and a small standard deviation
indicates that they are clustered closely around the mean
(homogeneous).
Population Standard Deviation:
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample Standard Deviation:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Example:
A group of mountaineers went on hiking to Mt. Pulag, Philippines to
study the different species of plants existing in the area. The ages of
the mountaineers are 34, 35, 45, 46, 49, 32. What is the standard deviation of
their ages?
Solution: (based on our previous example in variance)

 The data is homogeneous
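Since the standard deviation is the positive square root of the variance, it can be sketched with `statistics.pstdev` (population) and `statistics.stdev` (sample). Which of the two the module's lost solution used is not known, so both are shown as an illustration.

```python
from statistics import pstdev, stdev

ages = [34, 35, 45, 46, 49, 32]

# Square root of the population variance (divides by N)
population_sd = pstdev(ages)

# Square root of the sample variance (divides by n - 1)
sample_sd = stdev(ages)

print(round(population_sd, 2))  # 6.67
print(round(sample_sd, 2))      # 7.31
```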

Standard deviation calculated using a frequency table


The formulas for variance and standard deviation change slightly if
observations are grouped into a frequency table. Squared deviations are
multiplied by each frequency's value, and then the total of these results
is calculated.
In a frequency table, the variance for a discrete variable is defined as
The standard deviation for a discrete variable is defined as
EXAMPLE:
Thirty farmers were asked how many farm workers they hire during a typical
harvest season. Their responses were:
4, 5, 6, 5, 3, 2, 8, 0, 4, 6, 7, 8, 4, 5, 7, 9, 8, 6, 7, 5, 5, 4, 2, 1, 9, 3, 3, 4, 6, 4
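The frequency-table approach can be sketched for the farmers' responses as follows. Each squared deviation is multiplied by its frequency and the products are totalled, as described above. The sketch divides by the number of observations n (the population form); that choice of denominator is an assumption, since the formulas themselves are not shown here.

```python
from collections import Counter
from math import sqrt

responses = [4, 5, 6, 5, 3, 2, 8, 0, 4, 6, 7, 8, 4, 5, 7,
             9, 8, 6, 7, 5, 5, 4, 2, 1, 9, 3, 3, 4, 6, 4]

# Frequency table: value -> number of farmers giving that response
freq = Counter(responses)

n = sum(freq.values())                                       # total observations
mean = sum(x * f for x, f in freq.items()) / n               # sum(f * x) / n
squared = sum(f * (x - mean) ** 2 for x, f in freq.items())  # sum(f * (x - mean)^2)

variance_pop = squared / n      # assumed denominator: n
sd_pop = sqrt(variance_pop)

for x in sorted(freq):
    print(x, freq[x])           # one row of the frequency table per value

print(n, mean, squared)         # 30 5.0 152.0
print(round(sd_pop, 2))         # 2.25
```

Dividing by n - 1 instead would give the sample form of the same calculation.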
ACTIVITY 4.4
1. Solve for the range, MAD, variance and standard deviation of the following
data sets:
a. 20, 27, 23, 39, 45
b. 12, 24, 33, 32, 17, 18, 21, 25
2. The following sales (in thousand pesos) for agricultural products were
recorded during the 30-day period. Calculate the standard deviation using
a frequency table
15, 12, 18, 12, 14, 20, 17, 16, 15, 10, 11, 18, 17, 12, 11, 10, 19, 18, 13, 14,
15, 18, 19, 14, 16, 14, 19, 11, 10, 12
