
INTRODUCTION TO STATISTICS

AND
DATA PRESENTATION

1.1 Meaning of Statistics


Statistics has been defined in several ways because the field has
developed so fast that it now covers many areas of endeavor. The amount of data
collected on taxes, population, births and deaths has grown enormously, and
the study of statistics has grown with it. As these data are collected,
processed and disseminated, the quantitative approach is increasingly employed
in all the sciences and in business, which in one way or another affects our
lives.
Several textbooks have been written on business statistics, educational
statistics, medical statistics and others, but no agreement has been reached on
how statistics should be defined. Some say that statistics is a science of
handling data and others say that it is an art of handling data. However, there are some
who define statistics as a body of methodology for the collection,
presentation, analysis and interpretation of qualitative or quantitative
data.
Statistics can also be used in making sound decisions in times of
uncertainty. One may apply the different statistical methods, together with
appropriate critical judgment, so as to arrive at a correct result.

1.2 Origin
The origin of statistics can be traced to two fields of interest: games of
chance, otherwise known as gambling, and political science. Even in the early
periods, there were incomplete estimates of the population in the Philippines.
Population estimates were based on church records, births, deaths and
marriages. Another source of information about the population was the number
of residence certificates issued every year. The residence certificate is a
form of tax, compulsory for all citizens between 18 and 60 years old. But during
the American period, a more systematic way of collecting data was established.
Different statistical units were created. The Bureau of Customs collects,
tabulates and disseminates statistics on imports and exports. The Bureau of
Agriculture keeps records on the number of farms and on cultivated and
irrigated areas. The Bureau of Labor provides the government with the number
of employed and unemployed citizens as well as the different problems inherent
in the work. The newest of all the units is the National Statistics Office,
which undertakes the census of population and housing. A housing inventory was
also taken to evaluate the housing condition of the population. It is noted
that the collection of data on economic subjects such as population,
production, and domestic and foreign trade was done for political purposes.
The second origin of statistics was games of chance such as dice,
playing cards and coin tossing. The early gamblers suspected that the
occurrence of events in various games of chance followed certain laws, but
being unschooled, they could not deduce those laws. It was the famous gambler
Chevalier de Mere who proposed the well-known problem of points to the great
mathematician Blaise Pascal. Pascal found the problem challenging and worked
it out with Fermat, another mathematician. The methods they used to solve this
and many other problems became the origin of the mathematics of probability,
upon which the theory of modern statistics is founded.

1.3 Uses of Statistics


Statistics is essential in education, government, business,
psychology, economics, medicine, sociology, sports, banking and others.
The youth are most familiar with the use of statistics in sports such as
basketball. At the end of every quarter, the newscaster reports the scores of
each team and which players are doing well, which later becomes a basis for
their pay.

Statistics is also vital and important in the field of education. Statistical
tools are used to get information on enrollment, physical facilities and finance,
which are important for an intelligent administration and management.

Statistical tools are needed in the government to provide pertinent data
for the effective management of the affairs of the state. Good records of
population, cost of living, taxes, wages and other data are necessary for
intelligent decisions and policy making.
Psychologists understand human individuals better if they can
systematize, analyze and interpret data on intelligence, personality traits and
others.
In sociology, statistics is very important in the study of the society in
which man lives. Observations are properly analyzed and interpreted to have
better effects for the improvement and development of society.
In business and economics, statistics plays an important role in
business forecasting, opening a new business, market research and quality
control. In forecasting, statistics is needed so that one can plan ahead
correctly and formulate policies in line with existing conditions. Forecasting
is a main function of management, but forecasting based on incorrect and
unreliable data can cause the collapse of the business enterprise.
Statistics is also badly needed in opening a new business, since there are
several factors to consider before any business can start. First, the market
must be considered so that the demand for the product can be determined.
Second, the availability of the materials to be used must be taken into
consideration so as to produce products of good quality. The third factor
any business should consider is the capital to be used. Will there be enough
capital to produce the quantity of product being demanded? The fourth
factor is labor, which here includes the manpower, equipment and building.
Last but not least is the competition. A product should be able to
compete in terms of quality and price.

Statistics is also used in market research. Businessmen should be able
to explore new markets for their products because the findings serve as a
guide to market expansion. The buying habits of consumers, their capacity to
buy goods and their income must be considered and carefully studied.
The quality of a product must be maintained so that it can command
a better price as well as a steady market. The manufactured products or items
need to be carefully inspected to determine the acceptability of the product
to the consumer. This method is known as statistical quality control.
Statistics is also important in banking institutions because they maintain
research departments, which gather and analyze statistical information
concerning their operations.
Statistics is vital in personnel relations. Business organizations that
employ a large number of employees, where the executives have little
chance to know their people, can delegate that responsibility to a personnel
director or personnel manager. The personnel director or manager utilizes
all available data or information taken from the employees. For purposes of
analysis, employees can be classified and evaluated so that there is a basis
for promotion in rank or salary as well as for other benefits available in the
organization.

1.4 Types of Statistics


Statistics is divided into two main areas.
1. Descriptive statistics is the process of collecting, presenting, and
organizing data in a manner that describes the data easily and quickly.
For example, the National Statistics Office conducts surveys to
determine the average age, income, and other characteristics of the
Filipino nation.
2. Inferential statistics uses sample data to make inferences about a
population. It consists of generalizing from samples to populations,
performing hypothesis testing, determining relationships among
variables, and making predictions. This kind of statistics uses the
concept of probability, the chance that an event will happen.

In statistics, we commonly use the terms population and sample. A
population is the complete and entire collection of elements to be studied.
Sometimes a population is very large. To save time and money,
statisticians may study only a part of the population. This is called a sample. A
sample is a subset of a population.
Closely related to the concepts of a population and a sample are the
concepts of parameter and statistic. A parameter is a numerical measurement
describing some characteristic of a population, while a statistic is a
numerical measurement describing some characteristic of a sample.
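
The difference between a measurement taken on a whole population and one taken on a sample can be made concrete with a short sketch. The ages below are invented for illustration and the names `population`, `population_mean` and `sample_mean` are hypothetical; the point is only that the first mean is computed from every element while the second is computed from a subset.

```python
import random
from statistics import mean

# Hypothetical population: ages of 1,000 people (invented data for illustration).
rng = random.Random(42)
population = [rng.randint(18, 60) for _ in range(1000)]

# A measurement computed from the whole population (a parameter).
population_mean = mean(population)

# A measurement computed from a sample of 50 people drawn without
# replacement (a statistic).
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two printed values are usually close but not equal, which is why inferential statistics relies on probability to judge how far a sample value may be from the population value.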

1.5 Levels of Measurement


Aside from being classified as qualitative or quantitative, variables can
also be classified according to how they are categorized, counted or measured.

1. Nominal Level
This is characterized by data that consist of names, labels, or categories
only. The data cannot be arranged in an ordering scheme. There is no
criterion as to which values can be identified as greater than or less than
other values. For example, in classifying the instructors in a university as
male or female, no ranking can be placed on the data. Another example
is classifying residents according to their area codes. Although numbers
are assigned as area codes, there is no meaningful order.

In addition, numbers may serve as labels to identify items, such as the
numbers worn on the backs of athletes. Similarly, a nominal scale
may be used to classify a sample of people being studied
according to their religious preference, such as Protestant,
Catholic and others. People may also be classified on the basis of sex, eye
color or organization membership. Simple statistics such as proportions
and percentages are used with nominal data.

2. Ordinal Level
This involves data that may be arranged in some order, but differences
between data values either cannot be determined or are meaningless.
An example is a grading system involving letters (A, B, C, D, E).

An ordinal scale produces a distinct ordering or arrangement of
data in which the observations may be ranked based on some criterion,
such as good, better and best. With ordinal data, numbers can be used
as ranks. For example, a student asked to rank his professor in
statistics with respect to knowledge of the subject matter (from 1 to 10)
can give a rank of 8, but the arithmetic differences between the ranks
are meaningless.

3. Interval Level
This is the same as the ordinal level, with the additional property that we
can determine meaningful amounts of difference between data values. Data at
this level may lack an inherent zero starting point. For example,
temperature in degrees is an interval measurement. There is a meaningful
difference of one degree between values such as 80 and 81 degrees.
But a temperature of zero degrees does not mean that there is no heat.

Variables on an interval scale are measured numerically and, like
ordinal data, carry an inherent ranking or ordering. But unlike
ordinal data, the differences between the values are meaningful.

4. Ratio Level
This is the interval level modified to include an inherent zero starting
point. Both differences and ratios of data are meaningful. This is also the
highest level of measurement. Examples are measures of
height, weight, or area. Differences between values are meaningful, and a
true zero exists.

Of the four levels of measurement, only the ratio scale has a zero that is
meaningful. Arithmetic operations such as multiplication and division
therefore have a meaningful interpretation. The ratio scale is used to
measure several types of data found in business such as cost, profit and
inventory. These variables are expressed in ratio measures.
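
The practical difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the temperatures in the text are in degrees Fahrenheit, and the helper names `fahrenheit_to_celsius` and `pounds_to_kilograms` are hypothetical. A ratio is meaningful only if it stays the same when the unit of measurement changes.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert a weight in pounds to kilograms."""
    return lb * 0.45359237


# Interval level: the zero point is arbitrary, so the ratio of two
# temperatures changes when the unit changes.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Ratio level: a true zero exists, so the ratio of two weights is the
# same in pounds and in kilograms.
ratio_lb = 80 / 40
ratio_kg = pounds_to_kilograms(80) / pounds_to_kilograms(40)

print("temperature ratio in F:", ratio_f, " in C:", round(ratio_c, 2))
print("weight ratio in lb:", ratio_lb, " in kg:", round(ratio_kg, 2))
```

Because the temperature ratio differs between the two units, a statement such as "80 degrees is twice as hot as 40 degrees" has no fixed meaning, whereas "80 pounds is twice as heavy as 40 pounds" holds in any unit.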

1.6 Preliminary Steps in Statistical Study


A. Define the problem.
B. Determine the population/subject of the study.
C. Devise the set of questions.
D. Determine the sampling design.
E. Prepare a manual of instruction.
F. Organize and train personnel.

The researcher should know what he aims to discover or establish. The
researcher should state in simple language what he wants to investigate. It
can be in the form of a question.

Guidelines in the Selection of a Research Problem or Topic


There are guidelines or criteria for the selection of the research problem
that make the research work more interesting and enjoyable for the researcher
and help ensure that the study will be completed.

The following are the guidelines, which will help in the choice of the problem:

1. The research problem must be chosen by the researcher himself so that
he will not make excuses for all the obstacles he will encounter.
2. The problem must be within the interest of the researcher so that he will
give all the time and effort in the research work.
3. The problem must be within the specialization of the researcher. This
makes the work easier because he is familiar with the area, and it helps
him improve his specialization, skill and competence in his own area.
4. The research problem must be within the competence of the researcher.
The researcher must know the procedures in making research and how
to apply them. He must have a workable understanding of his study.
5. The researcher must have the ability and capacity to finance the
research problem to make sure that the study will be completed on the
target time.
6. The research problem must be manageable. The data must be available
or it must be within the capacity of the researcher to gather. The data
must be accurate, objective and not biased. The data should help the
researcher answer the question being investigated.
7. The research problem must be completed within the period set by the
researcher.
8. The research problem must be significant, important and relevant to the
present time as well as to the future. This means that the research
problem must have an impact on the situation and the people it is intended
for. The study must contribute to human knowledge. The facts and
knowledge must be a product of research.
9. The results of the study must be practical and implementable.

B. Population
Before a researcher starts collecting the data, the population or subjects to
be considered in the study must be defined. The population to be considered
should agree with the objective of the study and be properly identified.

C. Devise the Set of Questions


A researcher who collects data by interview or by questionnaire
should prepare a list of questions to be asked. A questionnaire
may consist of one or two question sheets accompanied by a
cover letter. Good questions elicit accurate responses.

D. Sampling Design
The extent of the population will depend on the nature of the problem.
A census survey covers every individual in the population under consideration,
while a sample survey covers only a representative few of the population.

In determining the sample size, the following formula can be applied:

$$ n \geq \frac{N}{1 + N e^{2}} $$

where:
n = sample size
N = population size
e = desired margin of error
(percent allowance for non-precision because a sample is used)

Example 1

A researcher wants to make use of a student population of 3,000 for his
study in the mathematical achievement test. If he allows a 5% margin of
error, how many students must he take for his sample?

Solution:
The formula given can be used:

$$
\begin{aligned}
n &\geq \frac{N}{1 + N e^{2}} \\
n &\geq \frac{3000}{1 + 3000(0.05)^{2}} \\
n &\geq \frac{3000}{1 + 3000(0.0025)} \\
n &\geq \frac{3000}{1 + 7.5} \\
n &\geq \frac{3000}{8.5} \\
n &\geq 352.94 \ \text{or} \ 353
\end{aligned}
$$
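
The same computation can be sketched in a few lines of code. This is a minimal illustration, not part of the original text: the function name `sample_size` is hypothetical, and rounding up to the next whole person is assumed because n must be at least the computed value. The formula itself is often referred to as Slovin's formula.

```python
import math


def sample_size(population_size, margin_of_error):
    """Smallest whole n satisfying n >= N / (1 + N * e**2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a fraction between 0 and 1")
    exact = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(exact)  # round up: n must not fall below the bound


# Example 1: N = 3,000 students, e = 5%
print(sample_size(3000, 0.05))  # 353
```

Note that the margin of error is entered as a fraction (0.05), not as a percentage (5).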

The questions to be asked of the respondents should be evaluated, and
they can be pretested on 5 or 10 percent of the desired sample size.

Sampling Techniques
It is not necessary for the researcher to examine every member of the
population to get the necessary information and data needed in the study. Cost
as well as time constraints will prevent the researcher from studying the whole
population. All the researcher needs is to draw sample units systematically or
at random.

Various sampling techniques or sample designs can be used by the
researcher. The choice of technique will depend on the nature
of the problem at hand, the kind of population, and the population to which
the sample results will be applied.
The techniques can be grouped according to how the selection of items is
made: probability sampling and non-probability sampling.

1. Probability Sampling
In probability sampling, the sample is a proportion of the population and
is selected from the population in a systematic way such that every element
of the population has a chance of being included in the sample.

Types of Probability Sampling


a) Pure Random Sampling or Simple Random Sampling
This type of sampling is one in which everyone in the population
of the study has an equal chance of being selected for the
sample. This is also called lottery sampling, which may be used if the
population has no differentiated levels, sections or classes. It is
carried out by assigning a number to every member of the population.
Usually this is done by taking a certain percentage of the
population for the study. Suppose there are 150 persons
in the population and 25% will be included in the study. Since 25% of 150
is 37.5, approximately 38 persons must be taken from the population. Each
member of the population is assigned a number and all these
numbers are placed in a container. Thirty-eight (38) numbers are
drawn from it and the corresponding persons are included in the sample.
Drawing prizes through a raffle follows the principle of random sampling.
This technique is easy to understand and easy to apply. It can also be
carried out with the help of a table of random numbers.

Pure random sampling is also called unrestricted random
sampling which means that every individual in the population has an
equal chance of being chosen to be included in the sample.
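
The lottery procedure with 150 persons and a 25% sample can be sketched as follows. This is only an illustration under stated assumptions: the function name `simple_random_sample` is hypothetical, the fractional size 37.5 is rounded up to 38 as in the text, and Python's pseudo-random generator stands in for the container of numbered slips.

```python
import math
import random


def simple_random_sample(members, fraction, seed=None):
    """Draw a fraction of the members without replacement, each equally likely."""
    rng = random.Random(seed)
    n = math.ceil(len(members) * fraction)  # 25% of 150 is 37.5, rounded up to 38
    return rng.sample(members, n)


# Each of the 150 persons is assigned a number from 1 to 150.
population = list(range(1, 151))
sample = simple_random_sample(population, 0.25, seed=1)

print(len(sample))          # 38
print(sorted(sample)[:5])   # the five lowest numbers drawn
```

Passing a `seed` makes the draw repeatable for checking; leaving it out gives a fresh draw each time, as a real lottery would.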

b) Systematic Sampling
This is a technique of sampling in which every nth name in a list
is selected for inclusion in the sample. It is used when the
respondents in the study are arranged in some systematic or logical
manner, such as alphabetical arrangement, residential or house
arrays, geographical placement and so on.

The process is done as follows:


Suppose 20% of the population is the sample size. If 100% is
divided by 20%, the result is 5. So every 5th name will be taken from
the population, but there must be a random start. Suppose the random
start is 11. This is the first selection. The second respondent will be
11 + 5 = 16, the next will be 16 + 5 = 21, then 21 + 5 = 26, and so on.
This technique of sampling is more convenient, faster and more
economical than pure random sampling. The main disadvantage is that it
becomes biased if the persons on the list belong to a single class
while the investigation requires that all sectors of the population be
represented.
Systematic random sampling is a restricted random
sampling because certain restrictions are imposed upon it.
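
The selection rule above (random start 11, then every 5th name) can be sketched in code. The list of 100 names and the function name `systematic_sample` are hypothetical and serve only to illustrate the steps.

```python
def systematic_sample(members, interval, start):
    """Select every `interval`-th member, beginning at 1-based position `start`."""
    if interval < 1 or start < 1:
        raise ValueError("interval and start must be at least 1")
    return members[start - 1::interval]


# Hypothetical alphabetical list of 100 respondents.
names = [f"person_{i}" for i in range(1, 101)]

interval = round(100 / 20)  # 100% divided by 20% gives 5
picked = systematic_sample(names, interval, start=11)

print(picked[:4])   # ['person_11', 'person_16', 'person_21', 'person_26']
print(len(picked))  # 18
```

With a start of 11 the first ten names can never be chosen, so a list of 100 yields 18 names rather than the full 20. In practice the random start is often drawn from 1 up to the interval (here 1 to 5) so that the whole list is covered.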

c) Stratified Random Sampling


It is a more efficient sampling procedure wherein the population is
grouped into more or less homogeneous classes or strata in order to
avoid the possibility of drawing a sample whose members all come from one
stratum. This method is used when the population of study has class
stratifications or groupings, either horizontal or vertical. Examples of
horizontal classification are courses in the same year in a university, such
as first year BSM, BSChem, BSA, BSCoE, BSIE and others, or students
in the same grade grouped as male and female. Examples of vertical
classification are levels in the high school, such as first year, second year,
third year and fourth year, or the ages of the students, such as 10, 11, 12,
13, and so on.
In stratified sampling, the distribution of sampling units will depend
on the total number of units in each stratum. The bigger the stratum,
the more sample units are drawn from it, and the smaller the stratum, the
fewer sample units are taken.
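
One common way to make the number of sample units depend on the size of each stratum is proportional allocation, sketched below. The stratum sizes (400, 300, 200 and 100 students) are invented for illustration, and the function names and the largest-remainder rounding rule are assumptions, not something the text prescribes.

```python
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * v / population for k, v in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}
    # Hand out units lost to rounding down, largest fractional part first.
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def stratified_sample(strata, total_sample, seed=None):
    """Draw a simple random sample inside each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    alloc = proportional_allocation(sizes, total_sample)
    return {name: rng.sample(strata[name], alloc[name]) for name in strata}


# Hypothetical high school with four year levels (vertical classification).
sizes = {"first year": 400, "second year": 300, "third year": 200, "fourth year": 100}
print(proportional_allocation(sizes, 100))
# {'first year': 40, 'second year': 30, 'third year': 20, 'fourth year': 10}
```

Because every stratum receives a share, no sample can come entirely from one year level, which is the situation stratification is meant to avoid.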

d) Cluster Sampling
It is sometimes called area sampling because it is applied on a
geographical basis. On this basis, districts or blocks of a municipality or
city are selected. These districts or blocks comprise the clusters. This
method is useful when the areas of a community are occupied by
heterogeneous groups.
Cluster sampling gives its most precise results when each cluster
contains a varied mixture and when one cluster is nearly like another.
This is the reverse of stratified sampling, where each stratum is
internally as homogeneous as possible and the strata differ from one
another as much as possible. In general, a cluster sample gives a less
precise estimate than a simple random sample of the same size. The use of
cluster sampling will depend on cost and administrative
considerations.
The cluster sample is advantageous because of its efficiency, but
it is disadvantageous because of its reduced accuracy or
representativeness, since every stage of selection introduces a sampling
error.
Area sampling requires a larger sample of elementary units than
would be required in simple random sampling. Even so, the cost advantage of
having the elementary units clustered in a number of locations is so great
that the area sample is preferred when the study covers an extensive
area.
The use of one of the techniques described does not automatically
exclude the use of the others at the same time. Combinations of two or
more sample designs can be used in a single survey.
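
Cluster (area) sampling as described above can be sketched in code. The city of six blocks with twenty households each is invented, and the function name `cluster_sample` is hypothetical. The sketch assumes the simplest one-stage design, in which whole blocks are drawn at random and every household in a chosen block is surveyed.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical city: 6 blocks (clusters) of 20 households each.
blocks = {
    f"block_{b}": [f"block_{b}_household_{h}" for h in range(1, 21)]
    for b in range(1, 7)
}

chosen, households = cluster_sample(blocks, 2, seed=7)
print(chosen)           # the two blocks drawn
print(len(households))  # 40
```

Only the blocks are selected at random, so interviewers visit two locations instead of households scattered over the whole city, which is the cost advantage the text refers to.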

2. Non-Probability Sampling
In non-probability sampling, the sample is not a proportion of the
population and there is no system in selecting the sample. The selection
depends on the situation.

Types of Non-Probability Sampling


The four types of non-probability sampling are accidental
sampling, quota sampling, convenience sampling and purposive
sampling.

a) Accidental Sampling
In this type of sampling, there is no system of selection; only
those whom the researcher or interviewer meets by chance are included
in the sample. This type of sampling lacks representativeness, and the
sample may be biased. If the interviewer goes to a business district,
most of the people interviewed are likely to be from business and
probably well-to-do. But if the interviewer stays in a slum area, then the
respondents are likely to be poor people. In research, every section of the
population must be fairly represented in the sample. This method is
used only when there is no alternative.

b) Quota Sampling
In this type of sampling, a specified number of persons of certain
types are included in the sample. Suppose the reactions of the people to
a particular issue, such as the effects of drug addiction in a certain
locality, are to be gauged. The sample might consist of 10 doctors, 9
lawmakers, 15 parents and 20 drug addicts.
In quota sampling, many sectors of the population are
represented. However, the representation is doubtful because there is
no proportional representation and there are no guidelines for the
selection of the respondents. Anyone of the required type who is available
to participate will do. Quota sampling may be used only when none of the
more desirable types of sampling will do.
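
The quota example (10 doctors, 9 lawmakers, 15 parents and 20 drug addicts) can be sketched as a simple filling procedure. The stream of arriving respondents is invented and the function name `quota_sample` is hypothetical. The sketch assumes respondents are accepted in whatever order they happen to be met, which is what makes the method non-random.

```python
def quota_sample(arrivals, quotas):
    """Accept respondents in arrival order until each type's quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for person_type, name in arrivals:
        if remaining.get(person_type, 0) > 0:
            accepted.append((person_type, name))
            remaining[person_type] -= 1
        if not any(remaining.values()):
            break  # every quota is filled
    return accepted


quotas = {"doctor": 10, "lawmaker": 9, "parent": 15, "drug addict": 20}

# Hypothetical stream of people the interviewer happens to meet.
arrivals = [(t, f"{t}_{i}") for i in range(1, 31) for t in quotas]

sample = quota_sample(arrivals, quotas)
print(len(sample))  # 54
```

Nothing in the procedure ties the quotas to each group's share of the population or controls who within a group is picked, which is why the text calls the representation doubtful.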

c) Convenience Sampling
Convenience sampling is a process of picking out people in the
most convenient and fastest way to get reactions immediately. This
method can be carried out by telephone to get the immediate
reactions of a certain group to a certain issue. This kind of
method is biased and not representative. It is quite different from
gathering data by an interview that happens to be conducted by
telephone: in the interview method, the people who are interviewed
by telephone are properly selected for inclusion in the sample.

d) Purposive Sampling
It is based on certain criteria laid down by the researcher. People
who satisfy the criteria are interviewed. Purposive sampling
determines the target population from which respondents will be taken
for the study. The respondents are chosen on the basis of their knowledge
of the information desired. If the research is on the methods and
techniques of teaching Mathematics, then teachers of Math must be
chosen. If the research is on the history of a particular place, then
the people of the place must be considered. If a certain circular of the
Central Bank is the subject of the study, then executives of some big
banks in the country may be considered. Of course, the
answers obtained through this procedure are not representative of the
entire population. However, the actual selection of respondents is done
by either pure random sampling or systematic random sampling.

Advantages of Sampling
Listed below are reasons for using sampling rather than complete
enumeration or census:

1. More economical.
Expenses will be less if data are obtained from only a part of an
aggregate than if a census is conducted. If the interview method is
used in collecting data from the respondents, then fewer interviewers
need to be hired and trained for a sample survey than for a complete
enumeration.

2. Accomplished faster.
In sampling, only a small part of the population is considered; hence,
data can be collected and presented more quickly.

3. Wider scope.
Scope refers to the amount of information and data that will be
obtained from the respondents. In a census survey, because of the great
number of respondents, a researcher will not have the time and
money to ask many questions within the specified period. In sampling,
using the same time and money, the researcher can ask more questions
and get more information from the respondents; thus, the scope is
broadened.

4. More accurate.
In a sample survey, the information gathered is often more
accurate because, with fewer respondents to be asked and the same
resources, the researcher can hire a few well-qualified personnel.
Mistakes and errors can be minimized
because there can be better control and supervision in a sample
survey.

5. Sampling makes possible the study of a large, heterogeneous population.
This is so because only a small portion needs to be involved, enabling
the researcher to reach the whole population through this small portion.

E. Preparation of Manual of Instruction


An exact and complete manual of instruction contains the instructions
and directions to be followed by the personnel in getting
the necessary information and by the respondents in answering the
questions. The manual serves as a guide for the field personnel as well as
the respondents.

F. Organize and Train Field Personnel


The field personnel will include data gatherers or enumerators as well as
supervisors. The group of enumerators should work as a team. The work
should be coordinated and supervised by the area supervisor. The survey
workers should undergo rigorous training before they are sent out to the field.

1.7 Main Steps in Statistical Study


A. Collection of Data
B. Presentation of Data
C. Analysis of Data
D. Interpretation of Data

A. Collection of Data
The first step in a statistical study is the collection of data. Data are the
values that the variables can assume. Variables whose values are determined
by chance are called random variables. These data can be used in different
ways. There are two types of variables – qualitative and quantitative. Qualitative
variables are words or codes that represent a class or category. On the other
hand, quantitative variables are numbers that represent an amount or a count.
The data collected must be correct and reliable because,
when wrong data are used, no statistical technique can correct the result, and
the study becomes a waste of time and effort. Collection of data is considered
to be the most expensive (in terms of money and time) phase of the statistical
study.
There are two sources of statistical data: the original or direct
source and the secondary source. Data taken from the original or
direct source are called primary data. Primary data are considered relevant
because the researchers are directly involved in the process. Data
taken from a secondary source are called secondary data.
Primary data most often give detailed definitions of the terms
used in the statistical survey, while secondary data most often contain little
or no explanation. Primary data also include a copy of the procedures used in
the collection of data and are classified into small subgroups.

Methods in the Collection of Data
1. Direct or Interview Method
The direct method is an effective method of collecting data because
the interviewer and the interviewee meet in person.
The interview method gives consistent and reliable data. Questions can
be modified so that the respondents can understand them
better. There is a high proportion of responses because respondents
prefer to respond to surveys. The interviewer can clarify any
misinterpretation by the respondents and can observe the
respondents’ reactions, which are pertinent as supplementary
information. The interview method is considered the most expensive way
of collecting data because it needs more time and money to conduct.
The interviewer may also influence the interviewees in their
responses.

2. The Indirect or Questionnaire Method


The indirect method of collecting data is the most widely used because it
is considered the cheapest method and it can cover a wider area in a short
span of time. A questionnaire is a set of questions about the problem,
intended to be answered by the respondents. It is usually
accompanied by clear and concise directions and is sent to the
respondent by mail or hand carried. This method is relatively simple and
inexpensive, for it requires only a small staff to handle it. A standard set
of questions can be prepared, and the respondents may feel a greater
sense of freedom to express views and opinions because their identities
are not known. The possibility of influence by the researcher on the
respondent’s replies can be avoided. The respondents can answer the
questionnaire in private at their own convenience. Confidential
questions can be asked without affecting the respondents.

The questionnaire method has some limitations. It cannot readily be
answered by respondents who are illiterate, which reduces the proportion of
responses. Some questions are not easily understood, so these might go
unanswered, or the respondents may give incorrect
information that cannot be corrected at once.

3. The Registration Method


This method of collecting data is commonly enforced by certain laws,
ordinances or standard practices. It is a very practical and
inexpensive method of gathering data. Examples of data that can be
secured through registration are registrations of births, deaths, motor
vehicles, marriages and licenses. In this method, information is kept
systematized and available to all because of the requirement of the law.

Exercise 1.1

I. Indicate whether each of the following statements is an example of
descriptive or inferential statistics.

______________1. Last school year, the ages of students at Philippine
Science High School ranged from 11 to 16 years old.
______________2. Based on the survey conducted by the National
Statistics Office, it is estimated that 29% of unemployed
people are men.
______________3. A survey says that 5 out of 100 Filipinos are members of
a fitness center.
______________4. Cigarettes were associated with 35% of the 5,200
civilian fire deaths in 2008.
______________5. A recent study showed that eating garlic can lower
blood pressure.

II. Indicate which of the following examples refer to population or sample.


______________1. a group of 35 BS Math students selected to test a new
teaching technique
______________2. a total machines produced by a factory in 1 month
______________3. the monthly expenditures on food for 25 families
______________4. the ages of employees of all companies in Southern
Luzon
______________5. the number of Globe subscribers

III. Classify each variable as quantitative or qualitative.


______________1. the height of students in UST
______________2. the religious affiliation of the people in the USA
______________3. favorite dance

______________4. the weekly intake of chocolates
______________5. gender
______________6. the days absent from office
______________7. civil status
______________8. the number of cars owned
______________9. the monthly electric bill
______________10. the number of students who passed

IV. Classify each as nominal, ordinal, interval or ratio-level data.


______________1. student identification number
______________2. the total annual income for a sample of families
______________3. number of enrollees in BA Statistics
______________4. the ranking of chess players
______________5. the salaries of employees

V. Classify each sample as random, stratified, systematic, or cluster.


______________1. Every 4th customer entering SM shopping mall is asked
to select his or her favorite store.
______________2. In PUP College of Science, all professors are
interviewed to determine whether they believe the
students have higher grades now than in previous
years.
______________3. Company supervisors are selected using random
numbers in order to determine annual salaries.
______________4. A teacher writes the name of each student on a card,
shuffles the cards, and then draws ten (10) names.
______________5. A head nurse selects 15 patients from each floor of a
hospital.

VI. True or False
______________1. Probability is the foundation of statistics.
______________2. Pure random sampling is sometimes called the lottery
method.
______________3. Statistics plays an important role in business forecasting
and opening new business.
______________4. Sampling does not allow the study of a large
heterogeneous group.
______________5. The research problem must be chosen by anybody
even if they are not involved in the research.
______________6. Questions should call for one answer only.
______________7. In quota sampling, a specified number of subjects are
considered in the sample.
______________8. Census survey considers only a part of the whole
population (subject of the study) under consideration in
the study.
______________9. The interview method is the most convenient and
cheapest way of collecting data.
______________10. Probability sampling is a type of sampling
technique in which the sample is a proportion of the population
and a system is used in selecting the sample.

VII. Enumeration

1 – 6. Give the preliminary steps in making a statistical study/research.


1.
2.
3.
4.

5.
6.

7 – 10 Methods of Collecting Data


7.
8.
9.
10.

11 – 14 Types of Probability Sampling Techniques


11.
12.
13.
14.

15 – 23 Guidelines in choosing a research problem.


15.
16.
17.
18.
19.
20.
21.
22.
23.

24 – 27 Main steps in making a statistical research


24.
25.

26.
27.

28 – 30 Three types of questions commonly used


28.
29.
30.

B. Presentation of Data

Collected data must be accurate, but most often they are not as accurate
as they appear. There are many sources from which data are gathered, and
the accuracy required depends on how the data will be used.
In management or administrative decisions, for example, data of
high precision are not required. It is enough that the order of magnitude be
known so that management can draw relationships and make useful
comparisons. In other types of work such as accounting, data should be carried
to the last centavo or other last unit so that the computational
and clerical operations can be reviewed.
Statistical data result from either counting or measuring. When distinct
objects exist such as persons, vehicles or things and a physical count can be
conducted, the result is a whole number and the variable is called a discrete
variable. For instance, if the number of students in a first-year Math class
is 43, that 43 is an exact or discrete quantity because it is a result of
counting. The result of measurement, on the other hand, is called a continuous
variable. A continuous variable is a magnitude which can take any value within
a specific interval. Its exact value cannot be obtained because the measuring
device used is accurate only to a certain degree. Furthermore, the users of
measuring devices or instruments have some limitations in their ability to see,
hear, detect and even discriminate. For instance, take a bag of rice which is
said to contain 5 kilograms. If the weight was measured to the nearest kilogram,
the correct weight of that particular bag of rice is between 4.5 kilograms
and 5.5 kilograms. If the device can measure to the nearest tenth of a kilogram
and reads 5.0 kilograms, then the weight is between 4.95 kilograms and 5.05
kilograms. As another example, two people who have different watches may read
slightly different times. It is clear that the measure taken is
always a rounded number which indicates a range of values rather than an
exact value. No matter how accurate the instrument used to obtain the
measure is, the range of values cannot be reduced to zero. Thus, the results of
measurements are always rounded numbers. For convenience in presentation
as well as in analysis of most business and economic data, results of counting
are also often given in rounded form, especially if the data lack complete
accuracy.
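
The range of values implied by a rounded measurement can be worked out directly: a reading rounded to the nearest unit lies within half a unit on either side of the stated value. The short Python sketch below illustrates this for the bag of rice. The function name `rounding_interval` is a hypothetical helper introduced only for this illustration, and exact decimal arithmetic is assumed so that the printed limits are not affected by floating-point error.

```python
from decimal import Decimal


def rounding_interval(reading: str, unit: str) -> tuple[Decimal, Decimal]:
    """Return the range of true values implied by a reading rounded to the nearest `unit`.

    Hypothetical helper for illustration: the true value lies within
    half a unit below and half a unit above the stated reading.
    """
    value = Decimal(reading)
    half_unit = Decimal(unit) / 2
    return value - half_unit, value + half_unit


# A bag of rice stated as 5 kilograms, measured to the nearest kilogram.
nearest_kilogram = rounding_interval("5", "1")

# The same bag read as 5.0 kilograms on a device accurate to a tenth of a kilogram.
nearest_tenth = rounding_interval("5.0", "0.1")

print(nearest_kilogram)
print(nearest_tenth)
```

A finer measuring unit narrows the interval but never reduces it to a single point, which is why a measured value is always treated as a rounded number.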

Data can be presented in textual, tabular or graphical form.

1. Textual Presentation
The textual presentation combines text and figures in a statistical
report. It is commonly seen in news items on business, finance, economics
or industry which are published in the business, trade or finance sections
of local periodicals. In the presentation of the text, the researcher or
writer can emphasize the importance of some figures.
This method of presentation of data is not particularly effective
since it makes for dull reading and may not give a good grasp of the
meaning of the quantitative relationships indicated in any particular report.
Data are presented in the textual method so as to direct the
reader’s attention to some data which need particular emphasis as well
as to some important comparisons, and to supplement the narrative
account with a table or chart.

2. Tabular Presentation
Statistical tables present numerical data in a systematic way.
Tabulation is the process of condensing classified data and arranging
them in a table. Tables are constructed to facilitate the analysis of
relationships. Tables are made possible by the orderly arrangement of
numerical facts in columns and rows. Each class or subclass is given
a certain column or row. Through this process, data can be readily
understood and comparisons are made easy.
However, data have to be classified before they can be tabulated and
interpreted.

Classification is the process of putting together similar items. Each
item of information should fall in one class only. The classification should be
designed such that the number of cases falling under each category is
small, or such that the classes account for every case. Classification should
adopt all classes which are necessary for the study or investigation.

Coding is the process of representing the observations or information with
symbols which are entered in the schedule so as to facilitate tabulation. This is
a very important process for statistical surveys in which there is the
possibility of many varied observations. Coding should be done with care so
that the codes are distinct and clear.
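
Classification, coding and tabulation can be seen together in a small sketch. The Python example below assigns a one-letter code to each class and then counts how many responses fall under each code. The coding scheme and the responses are hypothetical and are not data from this text; they only illustrate how distinct codes make tabulation straightforward.

```python
from collections import Counter

# Hypothetical coding scheme: each civil-status class gets one distinct symbol.
CODES = {"single": "S", "married": "M", "widowed": "W"}

# Illustrative responses only; not data from the text.
responses = ["single", "married", "married", "widowed", "single", "married"]

# Coding: replace every response with its symbol.
coded = [CODES[answer] for answer in responses]

# Tabulation: count the cases that fall in each class.
tally = Counter(coded)

print(f"{'Code':<6}{'Count':>5}")
for code, count in sorted(tally.items()):
    print(f"{code:<6}{count:>5}")
```

Because every response maps to exactly one code, each case is counted in one class only and the class counts add up to the number of responses.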

Advantages of Tabular over Textual Presentation


1. Statistical tables are concise and convenient because data are
systematically arranged.
2. They are brief and reduce explanatory matter to the minimum.
3. Tables give the whole information without combining text with figures.
Tables are constructed such that the ideas are easily understood even
without reading the textual presentation.
4. Data are easily read and readily understood because of the systematic
and logical arrangement into columns and rows.

5. The arrangement of data into columns and rows makes comparison
easier. It makes the reader easily understand and interpret the data
accurately and see the relationship of data at once.
The use of tables can facilitate the study and interpretation of data, as well
as the making of inferences about the relationships among statistical data.

3. Graphical Presentation
A graph is the most effective way to present results in a study since it
shows the statistical values and relationships in a pictorial or diagrammatic
form. When data are shown in terms of visual representations, the reader
sees essential facts and relationships and grasps significant proportions,
differences, similarities or even trends. The purpose of the graph is to
present variations and relationships of data in an effective and convincing
way.
Most people find visual representations to be useful in highlighting
information obtained from sample observations. The information presented
in a frequency distribution table can be more easily grasped if it is presented
in a graphical format. Absolute, relative or cumulative frequencies can be
represented in the graphs, depending on the particular objectives for
creating the graph. There are various graphical means to visualize a
frequency distribution: bar charts, histograms, pie charts and frequency
polygons/ogives are among the most popular.
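
The three kinds of frequencies that a graph may show can be computed from the same frequency distribution. The following Python sketch derives relative and cumulative frequencies from a set of absolute frequencies. The class intervals and counts are illustrative assumptions, not data from this text.

```python
from itertools import accumulate

# Illustrative grouped data (class interval and absolute frequency); not from the text.
classes = ["10-19", "20-29", "30-39", "40-49", "50-59"]
absolute = [5, 12, 18, 10, 5]

total = sum(absolute)

# Relative frequency: the share of all observations that fall in each class.
relative = [f / total for f in absolute]

# Cumulative frequency: running total of the absolute frequencies.
cumulative = list(accumulate(absolute))

print(f"{'Class':<8}{'f':>4}{'rel f':>8}{'cum f':>7}")
for label, f, rf, cf in zip(classes, absolute, relative, cumulative):
    print(f"{label:<8}{f:>4}{rf:>8.2f}{cf:>7}")
```

The relative frequencies add up to 1 and the last cumulative frequency equals the total number of observations, which are two quick checks on any frequency table.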

Advantages of the Graphical Method


The graphs have the following advantages:
1. Graphs attract attention more effectively than tables and are less likely
to be overlooked. Readers may skip tables but pause to look at graphs or
charts.
2. The use of colors and pictorial diagrams makes a list of figures in
business reports more meaningful.

3. A graph gives a comprehensive view of quantitative data. The wandering of a
line exerts a more powerful effect on the reader’s mind than the
tabulated data. It shows what is happening and what is likely to take
place.
4. Graphs enable the busy executives of a business concern to grasp the
essential facts quickly and without much trouble. Any relation not readily
seen from the figures themselves is easily discovered from the graph.
Illustrations including attractive charts and graphs are now considered by
most businessmen as indispensable accompaniments to good business
reports.
5. Their general usefulness lies in the simplicity they add to the
presentation of numerical data. (Bacani, 1968)

Disadvantages in the use of Graphic Method


Some of the disadvantages are the following:
1. Graphs do not show as much information at a time as do tables.
2. Graphs do not show data as accurately as the tables do.
3. Charts require more skill, time and cost to prepare than tables.
4. Graphs cannot be quoted in the same way as tabulated data.
5. Graphs can be made only after the data have been tabulated.

Types of Graphs
Graphs are designed to meet the needs for which they are constructed.
The following are the different kinds of graphs:

1. Bar Graphs
The bar graph is the most commonly used graphic presentation. It
is used for comparing magnitudes. Each bar is drawn to a height (length)
proportional to the quantity it represents.
a) Single Bar Graph
b) Grouped (Multiple) Bar Graph
c) Duo-Directional Bar Graph
d) Subdivided Bar Graph
e) Histogram

A bar chart graphs the frequency distribution of the data on an
x-y coordinate system. The class intervals are plotted on the x-axis, the
absolute (or relative) frequencies on the y-axis. Each interval is
represented by a rectangle whose base corresponds to a class interval
and whose height is equal to the frequency associated with the class
interval.
[Figure: Bar Graph. Vertical axis: Frequency (0 to 30); horizontal axis: Class Boundary (10 to 60); legend: Age]

A histogram is similar to a bar chart but the base of the rectangle has
a length exactly equal to the class width of the corresponding interval. As the
rectangle is centered on the average of the lower and upper class limits, the
rectangles of a class interval are adjacent to the rectangles of adjoining class
intervals – there are no spaces between rectangles.

[Figure: Histogram. Vertical axis: Frequency (0 to 30); horizontal axis: Class Boundary (10 to 60); legend: Age]
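
The way a histogram is built from class boundaries can be sketched in a few lines of Python. Each bar below is described by its left edge, its right edge and its height, and the bars are printed as rows of `#` characters rather than drawn. The boundaries and frequencies are illustrative assumptions, not the data behind the figure in this text.

```python
# Illustrative class boundaries and frequencies; not data from the text.
boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [5, 12, 18, 10, 5]

# Each bar spans exactly one class: (left edge, right edge, height).
bars = [
    (left, right, freq)
    for left, right, freq in zip(boundaries, boundaries[1:], frequencies)
]

for left, right, freq in bars:
    midpoint = (left + right) / 2  # the bar is centered on this value
    print(f"{left:>3}-{right:<3} mid={midpoint:<5} {'#' * freq}")

# In a histogram every bar starts where the previous one ends,
# so there are no spaces between rectangles.
adjacent = all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))
print("Bars adjacent:", adjacent)
```

Since the base of every bar equals the class width, the right edge of one bar is the left edge of the next, which is what distinguishes the histogram from a bar chart with gaps.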

2. Linear Graph
The linear graph is a practical and effective device to show changes in
values over successive periods of time. Variations in the data are indicated by
the changes and differences in the movement of the linear curves. The linear
graph shows data as a continuous line, thus its effect is continuous. At a
glance, the reader can tell the trend shown in the linear graph. It is also
easier to prepare and thus requires less time and skill.
a) Time Series chart
b) Frequency polygon
c) Composite line chart
d) Ogive

A frequency polygon plots (x,y) pairs on the x-y coordinate system,
where x is the average of the lower and upper class limits of the interval and y
is the absolute (or relative) frequency associated with class intervals. These
(x,y) points are connected by straight lines; the polygon is then formed by
plotting points corresponding to two additional class intervals, with frequency 0,
at each end of the frequency distribution. As these two additional points will be
plotted on the x-axis, they will form a polygon with the continuous set of line
segments joining the (x,y) pairs.

[Figure: Frequency polygon. Vertical axis: Frequency (0 to 35); horizontal axis: Class Boundary (10 to 60); legend: Age]
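
The points of a frequency polygon can be listed before any drawing is done. The Python sketch below computes the class midpoints, then adds one empty class at each end so that the polygon closes on the x-axis. The class limits and frequencies are illustrative assumptions, and equal class widths are assumed.

```python
# Illustrative class limits and frequencies; not data from the text.
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
frequencies = [5, 12, 18, 10, 5]

# x is the average of the lower and upper class limits of each interval.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Distance between successive midpoints (assumes equal class widths).
step = midpoints[1] - midpoints[0]

# Add one class with frequency 0 at each end of the distribution.
xs = [midpoints[0] - step] + midpoints + [midpoints[-1] + step]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
for x, y in points:
    print(f"({x}, {y})")
```

Joining these points in order with straight lines gives the polygon; the first and last points lie on the x-axis because their frequency is 0.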

A frequency ogive is a frequency polygon that uses the cumulative
frequencies of the frequency distribution (as opposed to absolute or relative
frequencies) as the values plotted on the y-axis. Since the cumulative frequency
is a non-decreasing function, i.e., the value of the cumulative frequency
increases (or remains the same) as the upper class limit of the interval
increases, the graph of the frequency ogive will also be non-decreasing as the
value of the x-coordinate increases.

[Figure: Ogive. Vertical axis: Frequency (0 to 60); horizontal axis values: 3, 8, 13, 18, 23, 28; legend: cum. f <, cum. f >]
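
The values plotted in an ogive are running totals of the class frequencies. The Python sketch below builds the cumulative frequencies against the upper class boundaries and checks that they never decrease. The boundaries and frequencies are illustrative assumptions, not the data behind the figure in this text.

```python
from itertools import accumulate

# Illustrative upper class boundaries and frequencies; not data from the text.
upper_boundaries = [8, 13, 18, 23, 28]
frequencies = [4, 9, 15, 12, 10]

# Cumulative frequency: number of observations at or below each upper boundary.
less_than = list(accumulate(frequencies))

# Each ogive point pairs an upper boundary with its cumulative frequency.
ogive_points = list(zip(upper_boundaries, less_than))
print(ogive_points)

# The cumulative frequency increases or stays the same from class to class.
non_decreasing = all(a <= b for a, b in zip(less_than, less_than[1:]))
print("Non-decreasing:", non_decreasing)
```

The last cumulative frequency equals the total number of observations, so the ogive always ends at the size of the data set.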

3. Hundred Percent charts
One hundred percent charts are used to show the relative sizes of
the component parts that make up the whole. This is useful when the
component parts are compared among themselves.

a) Subdivided Bar or Rectangular chart


b) Pie chart

A pie chart displays the absolute (or relative) frequencies of the
class intervals as sectors of a circle. Each sector in a pie chart
corresponds to a class interval; the ratio of the area of the sector to the
area of the circle (i.e., the ratio of the measure of the sector’s central
angle to 360 degrees) is equal to the relative frequency of the class interval.

[Figure: Pie chart with sectors of 3%, 13%, 31%, 26%, 11% and 16%]

The pie chart is used to show the percent distribution of a whole into
its component parts. This is effective in presenting financial data.
Every 1% corresponds to 3.6 degrees of the 360 degrees of the circle.
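
The central angle of each sector follows directly from its percent share, since the full circle of 360 degrees stands for 100%. The Python sketch below converts the six percentage labels that accompany the pie chart in this section into central angles. What each share represents is not stated in the text, so they are treated here as plain numbers.

```python
# Percentage labels of the pie chart in this section; they add up to 100.
percentages = [3, 13, 31, 26, 11, 16]

# The full circle (360 degrees) represents 100%.
DEGREES_PER_PERCENT = 360 / 100

# Central angle of each sector, in degrees.
angles = [share * DEGREES_PER_PERCENT for share in percentages]

for share, angle in zip(percentages, angles):
    print(f"{share:>3}% -> {angle:6.1f} degrees")

print(f"Total: {sum(angles):.1f} degrees")
```

A useful check before drawing is that the shares add up to 100 and the angles add up to 360 degrees.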

4. Statistical Maps
Statistical maps are used to show the geographical distribution of
magnitudes, in which shades, bars or dots may be used to indicate
variation in magnitude in different areas. Shading or cross-hatching
indicates the varying relative magnitude in different areas. The darkest
shade indicates the highest magnitude, while the shade becomes lighter
and lighter as the magnitude decreases.

5. Pictograms
Pictograms are also called pictographs. These are effective
devices for showing data by pictures or symbols. A pictogram does not
attempt to show details, but it facilitates comparison of approximate
quantities. It can easily attract the reader’s attention and lets the reader
see important relationships better and faster than any other type of graph.

[Figure: Pictogram. Vertical scale: 0 to 35; horizontal scale: 10 to 60]

The symbols or pictures should suggest the nature of the data being
presented. The symbols should be self-explanatory. Larger quantities
should be presented by more symbols or pictures of the same size and not
by bigger symbols.
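
The rule that larger quantities get more symbols of the same size, rather than bigger symbols, can be shown with a text pictogram. In the Python sketch below each `*` stands for a fixed number of units. The enrollment figures and the unit of 50 students per symbol are illustrative assumptions, not data from this text.

```python
# Illustrative enrollment figures; not data from the text.
enrollment = {"2019": 150, "2020": 200, "2021": 350}

# Assumption for this sketch: one symbol stands for 50 students.
UNIT = 50

# Repeat the same symbol; the symbol itself never changes size.
rows = {year: "*" * round(count / UNIT) for year, count in enrollment.items()}

for year, symbols in rows.items():
    print(f"{year}  {symbols}")
print(f"(* = {UNIT} students)")
```

Because every symbol has the same size, the reader compares quantities by counting symbols, which gives approximate rather than detailed values.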

6. Ratio Charts
Ratio charts are widely used in the analysis of data. They are also used
in comparing relative changes. If the study deals with the absolute
magnitude of changes, an arithmetic-scaled graph should be used.
Ratio charts cannot show zero or negative values since their scale is
logarithmic.
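
The difference between absolute and relative change can be checked numerically. The Python sketch below assumes that the ratio chart uses a logarithmic vertical scale, which is the usual construction, and compares the spacing of a series on an arithmetic scale and on a logarithmic scale. The sales figures are illustrative assumptions, not data from this text.

```python
import math

# Illustrative sales figures that double every year; not data from the text.
sales = [100, 200, 400, 800]

# Arithmetic scale: the vertical step is the absolute change.
arithmetic_steps = [b - a for a, b in zip(sales, sales[1:])]

# Ratio (logarithmic) scale: the vertical step depends only on the ratio b / a.
ratio_steps = [math.log10(b) - math.log10(a) for a, b in zip(sales, sales[1:])]

print("Arithmetic steps:", arithmetic_steps)
print("Ratio steps:", [round(step, 4) for step in ratio_steps])

# A value of zero has no logarithm, so it has no position on a ratio chart.
try:
    math.log10(0)
    zero_plottable = True
except ValueError:
    zero_plottable = False
print("Zero can be plotted:", zero_plottable)
```

Equal percentage changes appear as equal vertical distances on the ratio scale even though the absolute changes keep growing, which is why the ratio chart suits comparisons of relative change.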

Exercise 1.2
I. Fill in the blanks
______________1. is a part of the table which explains the meaning of
entries in the table which are not fully understood.
______________2. is the process of putting similar items/ideas together.
______________3. numerals are used to number the tables.
______________4. is a special type of the hundred percent graph that is
used in financial budgets and reports.
______________5. is a dull way of presenting data because it combines
text with figures.
______________6. is a kind of variable which resulted from measurements.
______________7. is a part of the table which simplifies or explains the title
of the table.
______________8. is a method of presenting data that can not show data
as accurately as tables do.
______________9. is a method of presenting data that makes use of column
and row arrangement.
______________10. is a part of the table that shows the origin of the
table.

II. Enumeration

1 – 5 Give the importance of the source note in a table.


1.
2.
3.
4.
5.

6 – 10 Give 5 advantages of the tabular presentation over the textual
presentation of data.
6.
7.
8.
9.
10.

11 – 15 Give the disadvantages of the graphical presentation of data.


11.
12.
13.
14.
15.

16 – 20 Give at least 5 kinds of graph used in presenting data.


16.
17.
18.
19.
20.

III. Classify each variable as discrete or continuous.


______________1. the number of metals manufactured each month
______________2. the air temperature in Baguio City today
______________3. the monthly income of couples living in Baguio City
______________4. the weights of newborn infants
______________5. the capacity (in gallons) of water in an overhead tank
