Quantitative Classification of Data

* Use quantitative classification if the observed values of the
data result from either counting or measurement.

* Organize this type of data in tabular form as
a frequency distribution table.

A frequency distribution is a summarized table in which the
classes are either distinct values or intervals, each with a frequency
count.

Statistical Research and Training Center Training Course on Basic Statistical Analysis Using MS Excel 2007
March 28 – April 1, 2011

Forms of the Frequency Distribution

Single value grouping

* is a frequency count of observed values wherein the classes
are distinct values
* is suitable when the range of values is short and many of the
unique values occur more than once

Grouping by class intervals

* is a frequency count of observed values wherein the
classes are intervals.


Data for Single Value Grouping

Suppose we have data on the number of children of 50
currently married women using any modern contraceptive
method. Construct a summary table for the data set below.

0 0 1 2 2 2 3 3 4 4
0 0 1 2 2 3 3 3 4 4
0 1 1 2 2 3 3 3 4 4
0 1 1 2 2 3 3 3 4 5
0 1 1 2 2 3 3 3 4 5

Example of Single Value Grouping

Distribution of Currently Married Women Using Any Modern
Contraceptive Method by Number of Children:

No. of Children    Frequency (Married Women)      %
       0                      7                  14
       1                      8                  16
       2                     11                  22
       3                     14                  28
       4                      8                  16
       5                      2                   4
     TOTAL                   50                 100
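
The single value grouping above can be reproduced outside a spreadsheet. The following is a minimal Python sketch, not part of the original course material; the variable names are illustrative. It tallies the 50 observations and derives the percentage column.

```python
from collections import Counter

# Number of children of the 50 women, copied from the data slide.
children = [
    0, 0, 1, 2, 2, 2, 3, 3, 4, 4,
    0, 0, 1, 2, 2, 3, 3, 3, 4, 4,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 4,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 5,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 5,
]

counts = Counter(children)   # frequency of each distinct value
n = len(children)            # total number of observations

# One row per distinct value: (value, frequency, percent of total).
table = [(value, counts[value], 100 * counts[value] / n)
         for value in sorted(counts)]

for value, freq, pct in table:
    print(f"{value:>8} {freq:>9} {pct:>6.0f}")
print(f"{'TOTAL':>8} {n:>9} {100:>6}")
```

Each class here is a distinct value, so no class limits are needed; the frequencies must add up to the number of observations.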

Dot Plot
A dot plot is a method of summarizing data to
illustrate the major features of the distribution
of the data in a convenient form where each
observation is represented by a dot.
• A horizontal axis shows the range of data values,
and each data value is represented by a dot placed
above the axis.

Dot Plot
Given: 8, 4, 2, 12, 8, 2, 4, 12, 6, 8, 6, 8,
10, 10, 12, 16
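
The plotted figure is not reproduced here. As a rough stand-in, the sketch below draws a text dot plot of the given values. It is an illustration only: the function name `dot_plot`, the use of `*` as the dot, and the axis step of 2 (chosen because every value in this data set is even) are assumptions, not part of the original slide.

```python
from collections import Counter

values = [8, 4, 2, 12, 8, 2, 4, 12, 6, 8, 6, 8, 10, 10, 12, 16]

def dot_plot(values, step):
    """Return the lines of a text dot plot, one '*' per observation."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1, step)
    rows = []
    # Build the plot from the tallest stack of dots down to the axis.
    for level in range(max(counts.values()), 0, -1):
        cells = ["*" if counts[x] >= level else " " for x in axis]
        rows.append("".join(cell.rjust(4) for cell in cells))
    # Last line: the horizontal axis with the data values.
    rows.append("".join(str(x).rjust(4) for x in axis))
    return rows

rows = dot_plot(values, step=2)
print("\n".join(rows))
```

The tallest stack sits above 8, which occurs four times, and the empty position above 14 shows a gap in the data.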

Definition of Terms Used in a Frequency
Distribution Table
Class interval contains the numbers defining a class.
Class frequency is the number of observations falling under a
class interval.
Class limits are the end numbers of a class interval.
* The lower class limit (LCL) is the lower end of the class
interval and the upper class limit (UCL) is the upper
end of the class interval.
* The number of digits of the class limits should be the
same as the number of digits of the raw data.
Open class interval is a class interval that has either no lower class
limit or no upper class limit.

Class boundaries are the true class limits.

* There are no gaps between the class boundaries of successive classes.
* The number of decimal places is one more than the
number of decimal places of the class limits.
* The lower class boundary (LCB) is the average of the
lower class limit of the class interval and the upper
class limit of the preceding class interval.
* The upper class boundary (UCB) is the average of
the upper class limit of the class interval and the
lower class limit of the next class interval.


Class size is the size of the class interval.

* It is the difference between two successive lower
class limits, or two successive upper class limits, or
two successive lower class boundaries, or two
successive upper class boundaries.

Class mark is the midpoint of a class interval.

* It is the average of the lower class limit and the upper
class limit or the average of the lower class boundary
and upper class boundary of a class interval.

Modal class is the class interval having the highest frequency.
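
The definitions of class boundaries, class size, and class mark can be checked with a short calculation. The sketch below is illustrative only: the function name `describe_classes` is hypothetical, the class limits are taken from the poor-population example used later in this material, and the gap between successive classes is assumed to be the same throughout.

```python
# Class limits (LCL, UCL) of the first three classes of the example.
limits = [(2_500, 192_499), (192_500, 382_499), (382_500, 572_499)]

def describe_classes(limits):
    """Return boundaries, class size and class mark for each class interval."""
    # Gap between one class's UCL and the next class's LCL (1 here);
    # assumed to be the same for all classes.
    gap = limits[1][0] - limits[0][1]
    rows = []
    for lcl, ucl in limits:
        lcb = lcl - gap / 2   # average of LCL and the preceding UCL
        ucb = ucl + gap / 2   # average of UCL and the next LCL
        rows.append({
            "LCB": lcb,
            "UCB": ucb,
            "size": ucb - lcb,        # same as the difference of successive LCLs
            "mark": (lcl + ucl) / 2,  # midpoint of the class interval
        })
    return rows

rows = describe_classes(limits)
for row in rows:
    print(row)
```

The upper boundary of each class equals the lower boundary of the next, so there are no gaps, and the boundaries carry one more decimal place than the limits.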

Steps in Constructing a Frequency
Distribution Table
1. Determine an adequate number of classes (K).
* The number of classes should be neither too many nor
too few.
* Usually, the number of classes is between 5 and 20.
* The class intervals should be non-overlapping.
2. Determine the range (R). Range = Maximum – Minimum

3. Calculate the approximate class size (C’).
C’ = R/K
4. Determine the class size (C) by rounding off C’ to a number
that is easy to work with. We recommend class sizes that are
multiples of 5 (5, 10, 15, 20, etc.).

5. List the required number (K) of class intervals.
* Start with the lower class limit of the lowest class
interval.
* Its value should be less than or equal to the minimum value
of the data set.
* Add the class size (C) to the lower class limit to get
the next lower class limit.
* The last class interval should include the maximum
value.
6. Tally the frequency for each class interval.
7. Sum the frequency column and check against the total number
of observations.
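
The seven steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's own procedure: the function name `frequency_table` and its parameters are hypothetical, the data are the sorted values of Table 4, K = 7 and a starting lower class limit of 2,500 are choices made here, and the class size is rounded up to the next multiple of 10,000.

```python
import math

# Sorted poor-population counts for the 82 provinces (Table 4).
data = [
    2_535, 25_237, 28_770, 37_838, 41_017, 49_997, 58_135, 59_555, 59_985, 68_659,
    76_137, 82_895, 83_844, 110_937, 113_553, 113_719, 115_116, 116_738, 116_866, 120_663,
    122_762, 123_825, 139_449, 147_812, 160_562, 170_917, 172_627, 177_823, 186_813, 190_297,
    193_962, 202_680, 206_387, 207_184, 208_169, 223_279, 225_640, 228_004, 229_301, 232_065,
    240_228, 244_712, 251_222, 253_382, 259_475, 260_764, 292_611, 301_147, 319_952, 328_635,
    331_739, 340_690, 344_172, 348_054, 353_825, 360_109, 397_119, 404_002, 412_442, 424_580,
    424_819, 427_509, 432_307, 433_091, 440_603, 449_647, 469_874, 483_651, 509_463, 532_961,
    534_628, 553_629, 590_926, 637_298, 667_385, 680_536, 690_639, 765_373, 821_793, 888_844,
    973_490, 1_312_961,
]

def frequency_table(data, k, round_to, start):
    """Build k class intervals of equal size and tally the frequencies."""
    r = max(data) - min(data)                      # step 2: range
    c_approx = r / k                               # step 3: approximate class size
    c = math.ceil(c_approx / round_to) * round_to  # step 4: convenient class size (rounded up)
    assert start <= min(data)                      # step 5: first LCL must not exceed the minimum
    table = []
    for i in range(k):
        lcl = start + i * c                        # next LCL = previous LCL + class size
        ucl = lcl + c - 1                          # whole-number data, so UCL = next LCL - 1
        f = sum(lcl <= x <= ucl for x in data)     # step 6: tally
        table.append((lcl, ucl, f))
    assert sum(f for _, _, f in table) == len(data)  # step 7: check the total
    return c, table

c, table = frequency_table(data, k=7, round_to=10_000, start=2_500)
for lcl, ucl, f in table:
    print(f"{lcl:>10,} - {ucl:>10,} {f:>4}")
```

With these choices the range is 1,310,426, the approximate class size is about 187,204, and the rounded class size is 190,000. The final assertion fails if the last class does not reach the maximum value, which signals that K or the class size must be adjusted.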
TABLE 3. Magnitude of Poor Population in the Philippines: 2000

Region / Province                        Poor Population

NCR (National Capital Region) (note 1)           848,962
    1st District                                 120,663
    2nd District                                 229,301
    3rd District                                 292,611
    4th District                                 206,387
CAR (Cordillera Administrative Region)           536,169
    Abra                                         110,937
    Apayao                                        28,770
    Benguet                                      122,762
    Ifugao                                       113,719
    Kalinga                                       83,844
    Mt. Province                                  76,137
Region 1 (Ilocos Region)                       1,447,638
    Ilocos Norte                                 115,116
    Ilocos Sur                                   190,297
    La Union                                     253,382
    Pangasinan                                   888,844
Region 2 (Cagayan Valley)                        820,786
    Batanes                                        2,535
    Cagayan                                      251,222
    Isabela                                      424,580
    Nueva Vizcaya                                 82,895
    Quirino                                       59,555
Region 3 (Central Luzon)                       1,695,227
    Aurora                                        59,985
    Bataan                                        68,659
    Bulacan                                      147,812
    Nueva Ecija                                  532,961
    Pampanga                                     331,739
    Tarlac                                       360,109
    Zambales                                     193,962
Region 4a (CALABARZON)                         1,699,333
    Batangas                                     440,603
    Cavite                                       244,712
    Laguna                                       207,184
    Quezon                                       667,385
    Rizal                                        139,449
Region 4b (MIMAROPA)                           1,030,987
    Marinduque                                   113,553
    Occidental Mindoro                           177,823
    Oriental Mindoro                             340,690
Region 5 (Bicol Region)                        2,540,618
    Albay                                        553,629
    Camarines Norte                              301,147
    Camarines Sur                                765,373
    Catanduanes                                  116,866
    Masbate                                      483,651
    Sorsogon                                     319,952
Region 6 (Western Visayas)                     2,765,055
    Aklan                                        186,813
    Antique                                      208,169
    Capiz                                        328,635
    Guimaras                                      37,838
    Iloilo                                       690,639
    Negros Occidental                          1,312,961
Region 7 (Central Visayas)                     2,017,162
    Bohol                                        590,926
    Cebu                                         973,490
    Negros Oriental                              427,509
    Siquijor                                      25,237
Region 8 (Eastern Visayas)                     1,646,371
    Biliran                                       58,135
    Eastern Samar                                202,680
    Leyte                                        680,536
    Northern Samar                               240,228
    Southern Leyte                               116,738
    Western Samar                                348,054
Region 9 (Zamboanga Peninsula)                 1,254,884
    Zamboanga del Norte                          433,091
    Zamboanga del Sur                            821,793
    Zamboanga Sibugay                           (note 2)
    Isabela City                                (note 3)
Region 10 (Northern Mindanao)                  1,580,249
    Bukidnon                                     449,647
    Camiguin                                      41,017
    Lanao del Norte                              424,819
    Misamis Occidental                           260,764
    Misamis Oriental                             404,002
Region 11 (Davao Region)                       1,222,367
    Davao del Norte                              637,298
    Davao del Sur                                412,442
    Davao Oriental                               172,627
    Compostela Valley                           (note 4)
Region 12 (SOCCSKSARGEN)                       1,596,785
    North Cotabato                               509,463
    Saranggani                                   223,279
    South Cotabato                               469,874
    Sultan Kudarat                               344,172
    Cotabato City                                 49,997
Region 13 (Caraga)                             1,071,005
    Agusan del Norte                             259,475
    Agusan del Sur                               353,825
    Surigao del Norte                            232,065
ARMM (Autonomous Region in Muslim Mindanao)    1,648,441
    Basilan                                      123,825
    Lanao del Sur                                432,307
    Maguindanao                                  534,628
    Sulu                                         397,119
    Tawi-tawi                                    160,562

1 Districts of NCR cover the following: 1st District – Manila; 2nd District –
Mandaluyong, Marikina, Pasig, Quezon City and San Juan; 3rd District –
Valenzuela, Kaloocan City, Malabon and Navotas; and 4th District – Las Pinas,
Makati, Muntinlupa, Paranaque, Pasay City, Pateros, and Taguig.
2 Zamboanga Sibugay was part of Zamboanga del Sur in 2000. Thus, the 2000
estimates for Zamboanga del Sur include Zamboanga Sibugay.
3 Isabela City was part of Basilan in 2000. Thus, the 2000 estimates for Basilan
still include Isabela City.
4 Davao del Norte estimates for 2000 include Compostela Valley.
Source: National Statistical Coordination Board
TABLE 4. Sorted Data (Array) of Magnitude of Poor
Population for the 82 Provinces of the Philippines: 2000
(values are in ascending order, reading down each column)

 2,535   76,137  122,762  193,962  240,228  331,739  424,819  534,628    973,490
25,237   82,895  123,825  202,680  244,712  340,690  427,509  553,629  1,312,961
28,770   83,844  139,449  206,387  251,222  344,172  432,307  590,926
37,838  110,937  147,812  207,184  253,382  348,054  433,091  637,298
41,017  113,553  160,562  208,169  259,475  353,825  440,603  667,385
49,997  113,719  170,917  223,279  260,764  360,109  449,647  680,536
58,135  115,116  172,627  225,640  292,611  397,119  469,874  690,639
59,555  116,738  177,823  228,004  301,147  404,002  483,651  765,373
59,985  116,866  186,813  229,301  319,952  412,442  509,463  821,793
68,659  120,663  190,297  232,065  328,635  424,580  532,961  888,844

TABLE 5. Frequency Distribution Table on Magnitude of Poor
Population for the 82 Provinces of the Philippines: 2000

TABLE 5a
CLASS LIMITS
      LCL         UCL     f
    2,500     152,499    24
  152,500     302,499    24
  302,500     452,499    18
  452,500     602,499     7
  602,500     752,499     4
  752,500     902,499     3
  902,500   1,052,499     1
1,052,500   1,202,499     0
1,202,500   1,352,499     1
                Total    82

TABLE 5b
CLASS LIMITS
      LCL         UCL     f
    2,500     202,499    31
  202,500     402,499    26
  402,500     602,499    16
  602,500     802,499     5
  802,500   1,002,499     3
1,002,500   1,202,499     0
1,202,500   1,402,499     1
                Total    82

TABLE 5c
CLASS LIMITS
LCL UCL f
2,500 192,499 30
192,500 382,499 26
382,500 572,499 16
572,500 762,499 5
762,500 952,499 3
952,500 1,142,499 1
1,142,500 1,332,499 1
82

Example: This illustrates the use of appropriate column
labels in a frequency distribution table.

TABLE 6. Frequency Distribution Table of the
Magnitude of Poor Population in the Philippines: 2000

Magnitude of Poor Population    No. of Provinces
      2,500 -   192,499                30
    192,500 -   382,499                26
    382,500 -   572,499                16
    572,500 -   762,499                 5
    762,500 -   952,499                 3
    952,500 - 1,142,499                 1
  1,142,500 - 1,332,499                 1
  Total                                82
TABLE 7. Frequency Distribution Table with
Class Boundaries and Class Marks

     Class Limits             Class Boundaries         Class Mark
    LCL         UCL           LCB           UCB         (rounded)     f
    2,500 -   192,499       2,499.5 -   192,499.5         97,500     30
  192,500 -   382,499     192,499.5 -   382,499.5        287,500     26
  382,500 -   572,499     382,499.5 -   572,499.5        477,500     16
  572,500 -   762,499     572,499.5 -   762,499.5        667,500      5
  762,500 -   952,499     762,499.5 -   952,499.5        857,500      3
  952,500 - 1,142,499     952,499.5 - 1,142,499.5      1,047,500      1
1,142,500 - 1,332,499   1,142,499.5 - 1,332,499.5      1,237,500      1
                                                          Total      82


Relative Frequency and Relative Frequency
Percentage

Relative frequency
* divide the class frequency of a class interval by the total number of
observations
* the sum of the relative frequency column is one

Relative frequency percentage
* multiply the relative frequency by 100
* the sum of the relative frequency percentage column is one
hundred percent.
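
As a quick check of these two definitions, the sketch below computes both columns from the class frequencies of the poor-population example (30, 26, 16, 5, 3, 1, 1). It is an illustration only; the variable names are not from the course material.

```python
frequencies = [30, 26, 16, 5, 3, 1, 1]   # class frequencies for the 82 provinces
n = sum(frequencies)                     # total number of observations

relative = [f / n for f in frequencies]   # relative frequency
percent = [100 * rf for rf in relative]   # relative frequency percentage

for f, rf, pct in zip(frequencies, relative, percent):
    print(f"{f:>4} {rf:>7.3f} {pct:>6.1f}")
print(f"{n:>4} {sum(relative):>7.3f} {sum(percent):>6.1f}")
```

Rounded column entries do not always add up to exactly 1.000 or 100.0, so it is safer to compute the totals from the unrounded values.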

TABLE 8. Frequency Distribution Table
with Relative Frequency and Relative
Frequency Percentage

     Class Limits                   Relative     Relative Frequency
    LCL         UCL         f      Frequency         Percentage
    2,500 -   192,499      30        0.366              36.6
  192,500 -   382,499      26        0.317              31.7
  382,500 -   572,499      16        0.195              19.5
  572,500 -   762,499       5        0.061               6.1
  762,500 -   952,499       3        0.037               3.7
  952,500 - 1,142,499       1        0.012               1.2
1,142,500 - 1,332,499       1        0.012               1.2
  Total                    82        1.000             100.0

TABLE 9. Frequency Distribution Table
with Less than Cumulative Frequency and Greater than
Cumulative Frequency Distributions

     Class Limits                   Less than        Greater than
    LCL         UCL         f      Cumulative         Cumulative
                                    Frequency          Frequency
    2,500 -   192,499      30          30                 82
  192,500 -   382,499      26          56                 52
  382,500 -   572,499      16          72                 26
  572,500 -   762,499       5          77                 10
  762,500 -   952,499       3          80                  5
  952,500 - 1,142,499       1          81                  2
1,142,500 - 1,332,499       1          82                  1
  Total                    82
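
Both cumulative columns are running totals of the class frequencies: the "less than" column accumulates from the first class downward, and the "greater than" column accumulates from the last class upward. A minimal Python sketch, with illustrative variable names:

```python
from itertools import accumulate

frequencies = [30, 26, 16, 5, 3, 1, 1]   # class frequencies, lowest class first

# Less than CF: number of observations up to and including each class.
less_than_cf = list(accumulate(frequencies))

# Greater than CF: number of observations in each class and all higher classes.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

for f, lt, gt in zip(frequencies, less_than_cf, greater_than_cf):
    print(f"{f:>4} {lt:>4} {gt:>4}")
```

The last "less than" entry and the first "greater than" entry both equal the total number of observations.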

Graphical Representation of the
Frequency Distribution

• Frequency Histogram
- use the class frequency on the vertical axis and
the class boundaries on the horizontal axis

• Frequency Polygon
- use the class frequency on the vertical axis and
the class mark on the horizontal axis

Frequency Histogram

• represents the shape of the distribution of the data set
• place the class boundaries on the horizontal axis and the
class frequencies on the vertical axis
• the height of the bar represents the frequency for each
class interval
• plot the sides of the bars at the class boundaries
• the area under the frequency histogram corresponds to
the number of observations
• the tallest vertical bar represents the frequency of the
modal class, the class interval with the largest class
frequency.


Illustration of a Frequency Histogram

FIGURE 15. Frequency Histogram of Magnitude of
Poor People in the Philippines: 2000
[Figure not reproduced. Vertical axis: Number of Provinces (0 to 35).
Horizontal axis: Magnitude of Poor in Hundred Thousands.]


Frequency Polygon
• a graphical representation of the frequency distribution table that
shows the shape of the distribution of the data set

• place the frequencies on the vertical axis and the class marks
on the horizontal axis

• connect the points with straight lines

• close the chart by adding a class interval with zero frequency at each
end of the horizontal axis and bringing the line down to the horizontal
axis at the class marks (midpoints) of these additional class intervals

• two or more frequency distributions may be drawn on the same chart
and compared
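
The points of a closed frequency polygon can be listed before any chart is drawn. The sketch below is illustrative: the function name `polygon_points` is hypothetical, the class marks and frequencies are those of the poor-population example, and equal class sizes are assumed so that the two additional class marks lie one class size beyond each end.

```python
class_marks = [97_500, 287_500, 477_500, 667_500, 857_500, 1_047_500, 1_237_500]
frequencies = [30, 26, 16, 5, 3, 1, 1]

def polygon_points(marks, freqs):
    """Return (class mark, frequency) points, closed with zero-frequency ends."""
    size = marks[1] - marks[0]   # class size, assumed equal for all classes
    xs = [marks[0] - size] + list(marks) + [marks[-1] + size]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))

points = polygon_points(class_marks, frequencies)
for x, y in points:
    print(f"{x:>10,} {y:>3}")
```

Connecting these points in order with straight lines gives the polygon. The first additional class mark is negative for this data set, which is acceptable because it only anchors the line to the horizontal axis.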

Illustration of a Frequency Polygon

FIGURE 16. Frequency Polygon on Magnitude of Poor People
in the Philippines: 2000
[Figure not reproduced. Vertical axis: number of provinces (0 to 35).
Horizontal axis: magnitude of poor people in hundred thousands, labeled at the
class marks 0.975, 2.875, 4.775, 6.675, 8.575, 10.475, and 12.375.]

Source: National Statistical Coordination Board

Use of Microsoft Excel

Creation of Single Value Grouping

1. Enter the data set in one column or one row.
2. List in a column all possible values of the variable.
3. Click the cell immediately to the right of the first cell of that column.
4. To tally the frequencies, click the function wizard (fx), choose the
statistical functions, and click FREQUENCY. Highlight the data cells as
the data array and the column of all possible values as the bins array.
Click OK. Highlight the whole column beside the column of all possible
values. Place the cursor in the formula bar. Press Ctrl-Shift-Enter.
5. Label the first column with the name of the variable and the column to
its right "Frequency". To get the sum of the frequency column,
highlight the blank cell after the last value of the frequency
column and click the Σ (AutoSum) button.

Creation of a Frequency Distribution Table
1. Enter the data set in one column or one row.
2. List in a column all upper class limits of the variable.
3. Click the cell immediately to the right of the first cell of that column.
4. To tally the frequencies, click the function wizard (fx), choose the
statistical functions, and click FREQUENCY. Highlight the data cells as
the data array and the column of upper class limits as the
bins array.
5. Click OK. Highlight the whole column beside the column of upper
class limits. Place the cursor in the formula bar. Press
Ctrl-Shift-Enter.
6. Label the first column with the name of the variable and the column
to its right "Frequency". To get the sum of the frequency
column, highlight the blank cell after the last value of the
frequency column and click the Σ (AutoSum) button.
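
The counting rule behind these steps can be imitated in a few lines. Excel's FREQUENCY function counts, for each bin, the values that are greater than the previous bin and less than or equal to the current one, and returns one extra count for values above the last bin. The sketch below is an imitation of that rule, not Excel itself; the function name `frequency` and the use of `bisect_left` are choices made for this illustration.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Tally data into bins given as sorted upper limits (FREQUENCY-style)."""
    counts = [0] * (len(bins) + 1)   # extra slot for values above the last bin
    for x in data:
        # Index of the first bin whose upper limit is >= x.
        counts[bisect_left(bins, x)] += 1
    return counts

# Number of children of the 50 women from the single value grouping example.
children = [
    0, 0, 1, 2, 2, 2, 3, 3, 4, 4,
    0, 0, 1, 2, 2, 3, 3, 3, 4, 4,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 4,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 5,
    0, 1, 1, 2, 2, 3, 3, 3, 4, 5,
]

counts = frequency(children, bins=[0, 1, 2, 3, 4, 5])
print(counts)
```

Using the upper class limits as bins works for grouped data for the same reason: every value up to and including an upper class limit falls in that class or an earlier one.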

Constructing a Frequency Histogram

1. Click the chart wizard. Choose the bar graph and highlight the
frequency column, including the label but excluding the sum,
making sure that you choose the series option.
2. Click series, and for the category (X) axis, highlight the column
with the class boundaries, excluding the label.
3. Click next, and improve the graph by typing the titles and
labels for the axes. Remove the legend.
4. Click next. You have a choice of where to put your graph: in the
same sheet as the data or in a different sheet. Choose your
preference and click finish.

Constructing a Frequency Histogram Using
Data Analysis

1. Extend the existing intervals of the frequency
distribution at both ends.

2. Add 0s to the cells in the frequency column
corresponding to these appended intervals.

3. Add a column of class marks (mean of the upper and
lower class limits), leaving blank those corresponding to
the appended intervals.

4. Click TOOLS\DATA ANALYSIS\HISTOGRAM and
choose as the input range the column of the original
data, and as the bin range the column of class marks.


6. Improve the histogram by
a. deleting MORE, the last row in the resulting
frequency distribution.
b. pointing to the top of any column in the
histogram until “frequency point” appears, then
right-clicking, choosing format axis, and setting
options\gap width equal to 0.
c. highlighting chart area and
d. replacing the x-axis labels with the class marks
by clicking chart wizard, and in series, choosing
as category (x) axis the class marks including the
blank cells corresponding to the appended
intervals.
Constructing a Frequency Polygon

1. Use Chart Wizard\Line.

2. Data range: frequency column (with the 0s corresponding to
the appended intervals).

3. Category (x) axis: class marks column with the blank cells at
both ends.

4. We can improve the frequency polygon further by entering
into the blank cells in the class marks column the values of
the class marks of the appended class intervals.
