
MODULE 3

STATISTICS

What is Statistics?

Statistics is a mathematical science including methods of collecting, organizing and analyzing data in such a way
that meaningful conclusions can be drawn from them. In general, its investigations and analyses fall into two broad
categories called descriptive and inferential statistics.
Descriptive statistics deals with the processing of data without attempting to draw any inferences from it. The data are
presented in the form of tables and graphs. The characteristics of the data are described in simple terms. Events that are
dealt with include everyday happenings such as accidents, prices of goods, business, incomes, epidemics, sports data,
population data.
Inferential statistics is a scientific discipline that uses mathematical tools to make forecasts and projections by analyzing
the given data. This is of use to people employed in such fields as engineering, economics, biology, the social sciences,
business, agriculture and communications.

I. Objectives: At the end of this module, you will be able to:

a. construct a frequency distribution using Rule 1 and Rule 2.


b. recognize the importance of statistical analyses in making decisions.

II. Pre-test

III. Guiding Questions:

1. How do you construct a categorical frequency distribution?


2. How do you determine the class interval of a frequency distribution?

TEACHING FRAME 1
INTRODUCTION TO DATA MANAGEMENT

A. ORGANIZATION OF DATA

When conducting statistical research, an investigation, or a study, the researcher must gather data for the particular
variable under investigation. To describe situations, make conclusions, and draw inferences about events, the researcher
must organize the data gathered in some meaningful way. The easiest and most widely used way of organizing data is to
construct a frequency distribution. A frequency distribution is a grouping of the data into non-overlapping classes,
showing the number of observations in each class.
After organizing the data, the researcher's next step is to present them so that they can be understood easily by
those who will benefit from reading the study. The most useful method of presenting data is by constructing graphs and
charts.
Before constructing a frequency distribution, we must define some terms that are essential to
understanding the nature of the data displayed in a frequency distribution.
• Raw data are the data collected in their original form.
• Range is the difference between the highest value and the lowest value in a distribution.
• Frequency distribution is the organization of data in tabular form, using mutually exclusive classes and showing the
number of observations in each.
• Class Limits are the highest and lowest values describing a class.
• Class Boundaries are the upper and lower values of a class in a grouped frequency distribution; they have one more
decimal place than the class limits and end in the digit 5.
• Interval is the distance between the lower class boundary and the upper class boundary; it is denoted by the
symbol i.
• Frequency (f) is the number of values in a specific class of a frequency distribution.
• Percentage is obtained by multiplying the relative frequency (the frequency divided by the total frequency) by 100%.
• Cumulative Frequency (cf) is the sum of the frequencies accumulated up to the upper boundary of a class in a
frequency distribution.

Mathematics in the Modern World Rowena C. Foronda, LPT, MAEd.


• Midpoint is the point halfway between the class limits of each class and is representative of the data within that
class.

CATEGORICAL FREQUENCY DISTRIBUTION

The categorical frequency distribution is used to organize nominal-level or ordinal-level type of data.

Example 1. Twenty applicants were given a performance evaluation appraisal. The data set is

High High High Low Average


Average Low Average Average Average
Low Average Average High High
Low Low Average High High

Construct a frequency distribution for the data.

Solution:

Class Tally Frequency Percentage

High lllll – ll 7 35
Average lllll – lll 8 40
Low lllll 5 25

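• The percentage is computed using the formula:

\[ \text{Percentage} = \frac{f}{n} \times 100\% \]

where f is the frequency of the class and n is the total number of observations.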
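
The tally and percentage columns of Example 1 can be reproduced with a short script. This is only a sketch: the variable names (`ratings`, `table`) are illustrative and not part of the module, and the percentage is taken as the class frequency divided by the number of observations, times 100.

```python
from collections import Counter

# The twenty performance ratings from Example 1, row by row.
ratings = (
    "High High High Low Average "
    "Average Low Average Average Average "
    "Low Average Average High High "
    "Low Low Average High High"
).split()

n = len(ratings)              # total number of observations
counts = Counter(ratings)     # frequency of each category

# class -> (frequency, percentage)
table = {label: (counts[label], counts[label] * 100 / n)
         for label in ("High", "Average", "Low")}

for label, (f, pct) in table.items():
    print(f"{label:<8} {f:>2} {pct:>4.0f}")
```

The printed rows match the frequency and percentage columns of the table above (7 and 35, 8 and 40, 5 and 25).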

DETERMINING THE CLASS INTERVAL

Generally, the number of classes for a frequency distribution table varies from 5 to 20, depending primarily on the
number of observations in the data set. It is preferred to have more classes as the size of the data set increases. The
decision about the number of classes depends on the method used by the researcher.

1. Rule 1. To determine the number of classes, use the smallest positive integer k such that $2^k > n$, where n
is the total number of observations.

\[ \text{Suggested Class Interval} = i = \frac{HV - LV}{k} \]

where HV = highest value in the data set
LV = lowest value in the data set
k = number of classes
i = suggested class interval

2. Rule 2. Another way to determine the class interval is by applying the formula below, where n is the total number
of observations.

\[ \text{Suggested Class Interval} = \frac{HV - LV}{1 + 3.322 \log n} \]
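
The two rules can be sketched in code as follows. The function names are hypothetical. Rule 1 is read here as the "2 to the k" rule described in Example 2 (smallest k with 2^k greater than n). Rule 2 is assumed to be Sturges' formula with a base-10 logarithm; that assumption is made because it reproduces the interval of 9 used in Example 3.

```python
import math

def classes_by_rule_1(n):
    """Smallest positive integer k such that 2**k > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def interval_by_rule_1(hv, lv, n):
    """Range divided by the number of classes from Rule 1."""
    return (hv - lv) / classes_by_rule_1(n)

def interval_by_rule_2(hv, lv, n):
    """ASSUMPTION: Rule 2 is Sturges' formula, range / (1 + 3.322 log10 n)."""
    return (hv - lv) / (1 + 3.322 * math.log10(n))

# Example 2: 80 salaries between 14,000 and 33,500.
print(classes_by_rule_1(80))                              # 7
print(round(interval_by_rule_1(33_500, 14_000, 80), 2))   # 2785.71

# Example 3: 50 ages between 18 and 77.
print(round(interval_by_rule_2(77, 18, 50), 2))           # 8.88
```

Both results are then rounded up before use: Example 2 works with an interval of 2,800 and Example 3 with an interval of 9.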


GROUPED FREQUENCY DISTRIBUTION

Example 2. Suppose a researcher wishes to study the monthly salaries of young professionals at selected
companies in Cauayan City. The researcher would first have to collect the data by asking each young professional about
his or her monthly salary. The data collected in original form are called raw data.
17,400 29,500 23,500 32,400 27,300 20,200 24,600 21,300 22,750 26,200
14,000 27,500 22,900 30,500 26,500 17,950 23,700 20,250 21,750 24,750
15,500 27,800 23,000 30,700 26,800 18,400 23,850 20,400 21,900 25,000
17,300 29,300 23,400 32,100 27,000 20,000 24,500 21,000 22,600 26,100
15,700 27,900 23,200 30,700 26,900 18,700 24,100 20,500 21,900 25,150
14,300 27,600 22,900 30,650 26,500 18,350 23,700 20,300 21,800 25,000
17,000 27,900 23,400 30,750 27,000 18,800 24,300 20,800 22,000 26,000
17,800 30,400 23,700 33,500 27,400 20,250 24,700 21,600 22,800 26,300
Construct a frequency distribution using Rule 1 and determine the following:

a. Range
b. Interval
c. Class limits
d. Relative frequencies
e. Percentages
f. Cumulative frequencies

Solution:

Step 1. Arrange the raw data in ascending or descending order. In this example we arrange the raw data in
ascending order, which makes the data easier to tally.

14,000 17,950 20,250 21,750 22,900 23,700 24,750 26,500 27,500 30,500
14,300 18,350 20,300 21,800 22,900 23,700 25,000 26,500 27,600 30,650
15,500 18,400 20,400 21,900 23,000 23,850 25,000 26,800 27,800 30,700
15,700 18,700 20,500 21,900 23,200 24,100 25,150 26,900 27,900 30,700
17,000 18,800 20,800 22,000 23,400 24,300 26,000 27,000 27,900 30,750
17,300 20,000 21,000 22,600 23,400 24,500 26,100 27,000 29,300 32,100
17,400 20,200 21,300 22,750 23,500 24,600 26,200 27,300 29,500 32,400
17,800 20,250 21,600 22,800 23,700 24,700 26,300 27,400 30,400 33,500

Step 2. Determine the classes

• Find the highest and lowest values.

Highest Value (HV) = 33,500 and Lowest Value (LV) = 14,000

• Find the range.

Range = Highest Value (HV) – Lowest Value (LV) = 33,500 – 14,000 = 19,500

• Determine the number of classes.

The objective is to use just enough classes. We can determine the number of classes (k) using the "2 to the k rule":
select the smallest number k such that $2^k$ (2 raised to the power of k) is greater than the number of
observations (n). In our example, there are 80 young professionals (n = 80). If we try k = 6, which means we would
use 6 classes, then $2^6 = 64$, which is less than 80. Thus, 6 is not enough classes. If we try k = 7, then
$2^7 = 128$, which is greater than 80. Therefore, the recommended number of classes is 7.

• Determine the class interval (or width).

Generally, the class interval (or width) should be equal for all classes. The classes must cover all the values in the
raw data (that is, from lowest to highest). The class interval is found using the formula:

\[ \text{Suggested Class Interval} = \frac{HV - LV}{k} = \frac{33{,}500 - 14{,}000}{7} = \frac{19{,}500}{7} \approx 2{,}785.71 \]

Note: Round the value of the interval up to the nearest whole number if there is a remainder. In this example the
interval is rounded up further to the convenient value i = 2,800.

• Select a starting point for the lowest class limit.

The starting point can be the smallest data value or any convenient number less than the smallest data value. In our
case, 14,000 is used.

• Set the individual class limits.

Add the interval (or width) to the starting point to obtain the lower limit of the next class, and keep adding until
there are 7 classes. The lower limits are 14,000; 16,800; 19,600; 22,400; 25,200; 28,000; and 30,800.

To obtain the upper class limits, add the interval to the lower limit of each class. For the first class,
14,000 + 2,800 = 16,800. Repeating this for each lower limit gives all the upper limits.
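
The repeated addition described above can be sketched as a small helper. The function name `class_limits` is hypothetical; the starting point 14,000, the width 2,800, and the 7 classes are the values used in this example.

```python
def class_limits(start, width, k):
    """Return k (lower, upper) pairs, each `width` wide, beginning at `start`."""
    return [(start + i * width, start + (i + 1) * width) for i in range(k)]

limits = class_limits(14_000, 2_800, 7)
for lower, upper in limits:
    # Each class holds values from `lower` up to, but not including, `upper`.
    print(f"{lower:,} < {upper:,}")
```

The upper limit of each class equals the lower limit of the next, which is why the classes are written with "<": a value equal to an upper limit is counted in the following class.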

Class Limits
14,000 < 16,800
16,800 < 19,600
19,600 < 22,400
22,400 < 25,200
25,200 < 28,000
28,000 < 30,800
30,800 < 33,600

Step 3. Tally the raw data.

Class Limits Tally

14,000 < 16,800 llll
16,800 < 19,600 lllll – llll
19,600 < 22,400 lllll – lllll – lllll – l
22,400 < 25,200 lllll – lllll – lllll – lllll – lll
25,200 < 28,000 lllll – lllll – lllll – ll
28,000 < 30,800 lllll – lll
30,800 < 33,600 lll

Step 4. Convert the tallied data into numerical frequencies.

Class Limits Tally Frequency

14,000 < 16,800 llll 4
16,800 < 19,600 lllll – llll 9
19,600 < 22,400 lllll – lllll – lllll – l 16
22,400 < 25,200 lllll – lllll – lllll – lllll – lll 23
25,200 < 28,000 lllll – lllll – lllll – ll 17
28,000 < 30,800 lllll – lll 8
30,800 < 33,600 lll 3
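
Steps 3 and 4 can be checked with a short script that assigns each salary to its class. This is a sketch under stated assumptions: the values are copied from the sorted listing in Step 1, and the class index is computed with integer division, which works only because every class has the same width.

```python
# The 80 salaries, copied row by row from the sorted listing in Step 1.
salaries = [
    14000, 17950, 20250, 21750, 22900, 23700, 24750, 26500, 27500, 30500,
    14300, 18350, 20300, 21800, 22900, 23700, 25000, 26500, 27600, 30650,
    15500, 18400, 20400, 21900, 23000, 23850, 25000, 26800, 27800, 30700,
    15700, 18700, 20500, 21900, 23200, 24100, 25150, 26900, 27900, 30700,
    17000, 18800, 20800, 22000, 23400, 24300, 26000, 27000, 27900, 30750,
    17300, 20000, 21000, 22600, 23400, 24500, 26100, 27000, 29300, 32100,
    17400, 20200, 21300, 22750, 23500, 24600, 26200, 27300, 29500, 32400,
    17800, 20250, 21600, 22800, 23700, 24700, 26300, 27400, 30400, 33500,
]

start, width, k = 14_000, 2_800, 7   # starting point, interval, number of classes

frequencies = [0] * k
for value in salaries:
    # Class i covers start + i*width <= value < start + (i+1)*width.
    frequencies[(value - start) // width] += 1

print(frequencies)
```

The resulting counts agree with the Frequency column above and add up to the 80 observations.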

Step 5. Determine the relative frequency. It can be found by dividing each frequency by the total frequency.

Class Limits Frequency Relative Frequency Found by


14,000 < 16,800 4 0.05 4 ÷ 80
16,800 < 19,600 9 0.11 9 ÷ 80
19,600 < 22,400 16 0.20 16 ÷ 80
22,400 < 25,200 23 0.29 23 ÷ 80
25,200 < 28,000 17 0.21 17 ÷ 80
28,000 < 30,800 8 0.10 8 ÷ 80
30,800 < 33,600 3 0.04 3 ÷ 80

Step 6. Determine the percentage. It can be found by multiplying each relative frequency by 100%.

Class Limits Frequency Percentage Found by


14,000 < 16,800 4 5 (4 ÷ 80) x 100
16,800 < 19,600 9 11 (9 ÷ 80) x 100
19,600 < 22,400 16 20 (16 ÷ 80) x 100
22,400 < 25,200 23 29 (23 ÷ 80) x 100
25,200 < 28,000 17 21 (17 ÷ 80) x 100
28,000 < 30,800 8 10 (8 ÷ 80) x 100
30,800 < 33,600 3 4 (3 ÷ 80) x 100
Total 80 100
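
Steps 5 and 6 amount to two divisions per class. The sketch below starts from the frequencies found in Step 4; rounding to two decimal places for relative frequencies and to whole numbers for percentages is an assumption chosen to match the tables.

```python
frequencies = [4, 9, 16, 23, 17, 8, 3]   # from Step 4
n = sum(frequencies)                      # total frequency, 80

# Relative frequency: each frequency divided by the total frequency.
relative = [round(f / n, 2) for f in frequencies]

# Percentage: the same ratio multiplied by 100.
percentages = [round(100 * f / n) for f in frequencies]

print(relative)
print(percentages)
```

In this example the rounded percentages still add up to 100; with other data, rounding can leave the total slightly above or below 100.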

Step 7. Determine the cumulative frequencies. The cumulative frequency can be found by adding the frequency in each
class to the total frequencies of the classes preceding that class.

Class Limits Frequency Cumulative Frequency Found by


14,000 < 16,800 4 4 4
16,800 < 19,600 9 13 4+9
19,600 < 22,400 16 29 4 + 9 + 16
22,400 < 25,200 23 52 4 + 9 + 16 + 23
25,200 < 28,000 17 69 4 + 9 + 16 + 23 + 17
28,000 < 30,800 8 77 4 + 9 + 16 + 23 + 17 + 8
30,800 < 33,600 3 80 4 + 9 + 16 + 23 + 17 + 8 + 3
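
The running total in Step 7 can be sketched with `itertools.accumulate` from the Python standard library, again starting from the Step 4 frequencies.

```python
from itertools import accumulate

frequencies = [4, 9, 16, 23, 17, 8, 3]   # from Step 4

# Each entry is the sum of that class's frequency and all preceding ones.
cumulative = list(accumulate(frequencies))

print(cumulative)   # [4, 13, 29, 52, 69, 77, 80]
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table.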

Step 8. Determine the midpoints. The midpoint can be found by getting the average of the upper limit and lower limit in
each class.

Class Limits Frequency Midpoint Found by

14,000 < 16,800 4 15,400 (14,000 + 16,800) ÷ 2
16,800 < 19,600 9 18,200 (16,800 + 19,600) ÷ 2
19,600 < 22,400 16 21,000 (19,600 + 22,400) ÷ 2
22,400 < 25,200 23 23,800 (22,400 + 25,200) ÷ 2
25,200 < 28,000 17 26,600 (25,200 + 28,000) ÷ 2
28,000 < 30,800 8 29,400 (28,000 + 30,800) ÷ 2
30,800 < 33,600 3 32,200 (30,800 + 33,600) ÷ 2
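
Averaging the lower and upper limit of each class, as Step 8 describes, can be sketched as follows. The class limits are rebuilt from the starting point 14,000 and the interval 2,800 used in this example; the variable names are illustrative.

```python
start, width, k = 14_000, 2_800, 7

# (lower, upper) limits of the seven classes.
limits = [(start + i * width, start + (i + 1) * width) for i in range(k)]

# Midpoint: average of the lower and upper limit of each class.
midpoints = [(lower + upper) / 2 for lower, upper in limits]

print(midpoints)
```

Because the classes are equally wide, consecutive midpoints also differ by exactly one interval (2,800).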

Example 3. RAF Travel Agency, a nationwide local travel agency, offers special rates during the summer period. The owner
wants additional information on the ages of the people taking travel tours. A random sample of 50 customers who took
travel tours last summer revealed these ages.

18 29 42 57 61 67 37 49 53 47
24 34 45 58 63 70 39 51 54 48
28 36 46 60 66 77 40 52 56 49
19 31 44 58 62 68 38 50 54 48
27 36 46 59 64 74 39 51 55 48

Construct a frequency distribution using Rule 2.

Solution:

Step 1. Arrange the raw data in ascending order.

18 29 37 42 47 49 53 57 61 67
19 31 38 44 48 50 54 58 62 68
24 34 39 45 48 51 54 58 63 70
27 36 39 46 48 51 55 59 64 74
28 36 40 46 49 52 56 60 66 77

Step 2. Determine the classes.

• Find the highest and lowest values.

Highest Value (HV) = 77 and Lowest Value (LV) = 18

• Find the range.

Range = Highest Value (HV) – Lowest Value (LV) = 77 – 18 = 59

• Determine the class interval (or width).


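The class interval is found using the formula below, where n = 50 is the number of observations:

\[ \text{Suggested Class Interval} = \frac{HV - LV}{1 + 3.322 \log n} = \frac{59}{1 + 3.322 \log 50} \approx \frac{59}{6.644} \approx 8.88 \approx 9 \]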

• Select a starting point for the lowest class limit. The lowest value in the data set is 18; this will also serve as our
starting point.

• Set the individual class limits. Add 9 to each lower class limit until the required number of classes is reached
(18, 27, 36, 45, 54, 63, and 72). To obtain the upper limit of the first class, add 9 to its lower limit. Then add the
interval (or width) to each upper limit to obtain all the upper limits (27, 36, 45, 54, 63, 72, and 81).

Class Limits
18 < 27
27 < 36
36 < 45
45 < 54
54 < 63
63 < 72
72 < 81

Step 3. Tally the raw data.

Class Limits Tally


18 < 27 lll
27 < 36 lllll
36 < 45 lllll – llll
45 < 54 lllll – lllll – llll
54 < 63 lllll – lllll – l
63 < 72 lllll – l
72 < 81 ll

Step 4. Convert the tallied data into numerical frequencies.

Class Limits Tally Frequency


18 < 27 lll 3
27 < 36 lllll 5
36 < 45 lllll – llll 9
45 < 54 lllll – lllll – llll 14
54 < 63 lllll – lllll – l 11
63 < 72 lllll – l 6
72 < 81 ll 2

Step 5. Determine the percentage.

Class Limits Frequency Percentage Found by


18 < 27 3 6 (3 ÷ 50) x 100
27 < 36 5 10 (5 ÷ 50) x 100
36 < 45 9 18 (9 ÷ 50) x 100
45 < 54 14 28 (14 ÷ 50) x 100
54 < 63 11 22 (11 ÷ 50) x 100
63 < 72 6 12 (6 ÷ 50) x 100
72 < 81 2 4 (2 ÷ 50) x 100
Total 50 100
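
The whole of Example 3, from raw ages to percentages, can be sketched in a few lines. The starting point 18, the interval 9, and the 7 classes are taken from the solution above; the variable names are illustrative, and integer division is used to find each age's class because all classes are equally wide.

```python
# The 50 ages from Example 3, row by row.
ages = [
    18, 29, 42, 57, 61, 67, 37, 49, 53, 47,
    24, 34, 45, 58, 63, 70, 39, 51, 54, 48,
    28, 36, 46, 60, 66, 77, 40, 52, 56, 49,
    19, 31, 44, 58, 62, 68, 38, 50, 54, 48,
    27, 36, 46, 59, 64, 74, 39, 51, 55, 48,
]

start, width, k = 18, 9, 7
n = len(ages)

frequencies = [0] * k
for age in ages:
    # Class i covers start + i*width <= age < start + (i+1)*width.
    frequencies[(age - start) // width] += 1

percentages = [round(100 * f / n) for f in frequencies]

for i, (f, pct) in enumerate(zip(frequencies, percentages)):
    lower = start + i * width
    print(f"{lower} < {lower + width}  {f:>2}  {pct:>3}")
```

The printed rows reproduce the Frequency and Percentage columns of Step 5.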
