Directions:
Topic 1: Data gathering and Organizing data, representing data using
graphs and charts, interpreting organized data
Time Allotment:
Learning Objectives
Presentation of Content
Statistics
Data are measurements or observations that are gathered for an event under study.
Statistics is the branch of mathematics that involves collecting, organizing, summarizing, and
presenting data and drawing general conclusions from that data.
B. Inferential Statistics refers to drawing conclusions about a population based on
the study of a sample.
Data are called discrete if the variable takes only countable values, such as whole numbers,
and continuous if the variable can take any value within an interval. The number of trees is
discrete, since the possible values of the variable are 1, 2, 3, … or any whole number. Weight
is continuous: it can be recorded as 45.5 kg, 46.7 kg, 67.1 kg, …, meaning that values may fall
between two whole numbers as decimals or fractions.
Quantitative data are data gathered by measurement (e.g., height, weight,
temperature), while qualitative data describe attributes of the respondents or subjects (e.g.,
color, taste, palatability, skin complexion, ethnicity).
Variables in Statistics are characteristics that all the respondents/subjects share but on which
they differ from one another. These can be classified, measured, or labeled in different ways.
An example is height. Suppose the population under study is the first-year students. All
first-year students have a height, but they do not all have the same height.
1. Nominal Variables. Variables which can be classified into two or more categories with no
inherent order are called nominal variables. Examples are sex, civil status, religion, ethnicity, etc.
a. Real nominal, when the variable is classified by a naturally occurring attribute.
An example is sex. At birth, a baby is recorded as male or female, whether we like it or not.
Another is nationality. If you are born in the Philippines, then your nationality is
Filipino.
2. Ordinal Variables. Ordinal variables are those grouped according to rank or order of
categories. The terms less than and greater than have meaning for this type. For instance,
the first-prize winner in a declamation contest ranks higher than the second-prize winner.
Military rank, the ranking of winners in a pageant, and the honor roll in class are examples of this type.
3. Interval Variables. Interval data are data wherein not only ordering or ranking of the
observations is possible but also arithmetic differences between them are meaningful.
In this type of variable, addition and subtraction have meaning. The zero point of the
interval scale is arbitrary and does not reflect an absence of the attribute. Suppose a
student got zero in a test in English. Does it mean that the student has absolutely no
knowledge of English? Or that he/she does not know anything in English? It is doubtful
whether such an explanation is acceptable.
4. Ratio Variables. These are variables for which equality of ratio or proportion has
meaning. In this type of scale, the zero point is not arbitrary but indicates total absence
of the property measured. One example of a ratio variable is force. One can speak of a force
that is twice another, or of the absence of force on the object being measured. Zero
kelvin is likewise a meaningful concept. Many measurements in the natural sciences are ratio
variables, while in the social sciences most variables are classified as nominal, ordinal, or interval.
It is very important to understand the different types of data so that researchers can
interpret data properly and choose the appropriate statistics.
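As a rough illustration of why the type of data matters, the sketch below pairs each level of measurement with a summary that is commonly considered meaningful for it. All values are hypothetical and not taken from this module, and the pairing (mode for nominal, median for ordinal, mean for interval, ratios for ratio data) is a common rule of thumb rather than a strict requirement.

```python
import statistics

# Hypothetical data, one list per level of measurement.
civil_status = ["single", "married", "single", "widowed", "single"]  # nominal
pageant_rank = [1, 2, 3, 4, 5]                                       # ordinal
temperature_c = [30.5, 31.0, 29.5, 32.0]                             # interval
weight_kg = [45.5, 46.7, 67.1]                                       # ratio

# Nominal: categories can only be counted, so report the most frequent one.
nominal_summary = statistics.mode(civil_status)

# Ordinal: order is meaningful, so the middle rank (median) makes sense.
ordinal_summary = statistics.median(pageant_rank)

# Interval: differences are meaningful, so an arithmetic mean makes sense.
interval_summary = statistics.mean(temperature_c)

# Ratio: a true zero exists, so "how many times heavier" is meaningful.
ratio_summary = max(weight_kg) / min(weight_kg)

print("Most common civil status:", nominal_summary)
print("Median rank:", ordinal_summary)
print("Mean temperature (deg C):", interval_summary)
print("Heaviest / lightest weight:", round(ratio_summary, 2))
```

Note that dividing one Celsius temperature by another would not be meaningful, because zero on the Celsius scale is arbitrary.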
4. When an existing group of subjects that represents the population is used as a sample, it is
called a cluster sample.
SOLUTION
(a) Since he is choosing all students in a particular place at a particular time, he has chosen a
cluster sample.
(b) The sample is unlikely to be representative. Since he’s polling people early in the
morning, those that tend to stay up very late studying are less likely to be included in the
sample.
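A cluster sample can be sketched in a few lines of code. The example below assumes a hypothetical population of students already grouped into class sections (the names and sections are invented for illustration). Unlike the survey in the solution above, it picks the clusters at random and then keeps every member of each chosen cluster.

```python
import random

# Hypothetical population: students grouped into existing class sections.
sections = {
    "Section A": ["Ana", "Ben", "Carl"],
    "Section B": ["Dina", "Eli", "Faye"],
    "Section C": ["Gio", "Hana", "Ivan"],
    "Section D": ["Jun", "Kay", "Lito"],
}

def cluster_sample(clusters, k, seed=None):
    """Pick k whole clusters at random and return all of their members."""
    rng = random.Random(seed)                 # seeded for a repeatable draw
    chosen = rng.sample(sorted(clusters), k)  # choose k cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

chosen, sample = cluster_sample(sections, k=2, seed=1)
print("Chosen clusters:", chosen)
print("Sample:", sample)
```

Every student in a chosen section is included, which is what distinguishes a cluster sample from a simple random sample of individual students.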
1. A census is a survey of a whole population. An example is the U.S. Census. Censuses can
be very expensive and time-consuming if the population is large.
2. A sample survey takes a fraction or part of the population. Sample surveys are cheaper
than censuses, but are not as accurate. Bias can also be an issue.
3. An experiment is a controlled study of a group. Experiments are very common in the
medical fields. The researcher controls how members are placed in study groups and which
treatment each group receives. Bias can be a major issue with experiments.
4. An observational study is about the same as an experiment. However, the researcher
does not use control groups or assign treatments.
Note that
Respondents are those with common observable characteristics who can give the
answer or information directly to the researcher (i.e. students, fishers, farmers, gays, etc.).
If the source of the data cannot respond to your question verbally, it is called a subject (i.e.
fish, trees, plants, etc.). To explain this further, suppose you want to ask, "What is your
height?" A farmer can answer you immediately, so the farmer is considered a respondent. But if
you ask the same question of a fish, the fish cannot answer you. Instead, you
observe the fish or use a measuring instrument to get its height. The fish is therefore
considered a subject.
In general, the following data collection methods work for qualitative research:
Document review.
In depth interviews.
Observation methods.
Quantitative research data collection methods, which tend to rely on random samples,
include:
Hypothesis testing eliminates assumptions while making a proposition from the basis of
reason.
For collectors of data, there is a range of outcomes for which the data is collected. But the
key purpose for which data is collected is to put a researcher in a vantage position to make
predictions about future probabilities and trends.
The core forms in which data can be collected are primary and secondary data. While the
former is collected by a researcher through first-hand sources, the latter is collected by an
individual other than the user.
Before broaching the subject of the various types of data collection, it is pertinent to note that
data collection itself falls under two broad categories: primary data collection and
secondary data collection.
Primary data collection is, by definition, the gathering of raw data at the source. It is
the process by which a researcher collects original data for a specific research
purpose. It can be further divided into two segments: qualitative and quantitative
data collection methods.
Quantitative Method
Secondary data collection, on the other hand, is the gathering of second-hand
data collected by an individual who is not the original user. It is the process of collecting data
that already exist, whether in published books, journals, online portals, radio,
television, or the records of a particular office. A researcher who wants to gather students'
grades can refer to the Registrar's office; for legal registrations or information required by the
government, the Philippine Statistics Authority, the Professional Regulation Commission, or the
Land Transportation Office can be a possible source of data. Secondary data are much less
expensive and easier to collect.
Your choice between primary data collection and secondary data collection depends on the
nature, scope, and area of your research as well as its aims and objectives.
There are several underlying reasons for collecting data, especially for a researcher.
Here are a few of them:
A key reason for collecting data, be it through quantitative or qualitative methods is to ensure
that the integrity of the research question is indeed maintained.
The correct use of appropriate data collection methods reduces the likelihood of errors
in the results.
Decision Making
To minimize the risk of errors in decision making, it is important that accurate data is
collected so that the researcher doesn't make uninformed decisions.
Data collection saves the researcher time and funds that would otherwise be misspent without
a deeper understanding of the topic or subject matter.
To prove the need for a change in the norm or the introduction of new information that will
be widely accepted, it is important to collect data as evidence to support these claims.
Data collection tools refer to the devices/instruments used to collect data, such as a paper
questionnaire or computer-assisted interviewing system. Case studies, checklists, interviews,
observation, and surveys or questionnaires are all tools used to collect data.
It is important to decide the tools for data collection because research is carried out in
different ways and for different purposes. The objective behind data collection is to capture
quality evidence that allows analysis to lead to the formulation of convincing and credible
answers to the questions that have been posed.
Here are 7 top data collection methods and tools for Academic, Opinion or Product
Research
The following are the top 7 data collection methods for academic, opinion-based, or product
research. The nature, pros, and cons of each one are also discussed. At the end of this
segment, you will be better informed about which method best suits your research.
A. INTERVIEW
An interview is a face-to-face conversation between two individuals with the sole purpose of
collecting relevant information to satisfy a research purpose. Interviews are of different types,
namely structured, semi-structured, and unstructured, each varying slightly
from the others.
Pros
In-depth information
Freedom of flexibility
Accurate data.
When the questions used to gather data are in English, the interviewer/researcher can
rephrase a question or translate it from English into a dialect that the
interviewee/respondent understands.
Cons
Time-consuming
Expensive to collect.
For collecting data through interviews, here are a few tools you can use to easily collect data.
Audio Recorder
An audio recorder is used for recording sound on disc, tape, or film. Audio information can
meet the needs of a wide range of people, as well as provide alternatives to print data
collection tools.
Digital Camera
An advantage of a digital camera is that the images it captures can be transmitted to a
monitor screen when the need arises.
Camcorder
A camcorder is used for collecting data through interviews. It provides a combination of both
an audio recorder and a video camera. The data provided is qualitative in nature and allows
the respondents to answer questions asked exhaustively. If you need to collect sensitive
information during an interview, a camcorder might not work for you as you would need to
maintain your subject’s privacy.
B. QUESTIONNAIRES
This is the process of collecting data through an instrument consisting of a series of questions
and prompts to receive a response from individuals it is administered to. Questionnaires are
designed to collect data from a group.
For clarity, it is important to note that a questionnaire isn't a survey, rather it forms a part of
it. A survey is a process of data gathering involving a variety of data collection methods,
including a questionnaire.
On a questionnaire, three kinds of questions are used: fixed-alternative, scale,
and open-ended. Each question is tailored to the nature and scope of the research.
Pros
Cons
Paper Questionnaire
C. REPORTING
By definition, data reporting is the process of gathering and submitting data to be further
subjected to analysis. The key aspect of data reporting is reporting accurate data, because
inaccurate data reporting leads to uninformed decision making.
Pros
Cons
Reporting tools enable you to extract and present data in charts, tables, and other
visualizations so users can find useful information. You could source data for reporting from
Non-Governmental Organizations (NGO) reports, newspapers, website articles, hospital
records.
NGO Reports
An NGO report contains an in-depth and comprehensive account of the activities carried
out by the NGO, covering areas such as business and human rights. The information
contained in these reports is research-specific and forms an acceptable academic base
for collecting data. NGOs often focus on development projects which are organized to
promote particular causes.
Newspapers
Newspaper data are relatively easy to collect and are sometimes the only continuously
available source of event data. Even though there is a problem of bias in newspaper data, it is
still a valid tool in collecting data for Reporting.
Website Articles
Gathering and using data contained in website articles is another tool for data collection.
Collecting data from web articles is a quicker and less expensive form of data collection. Two
major disadvantages of using this data reporting method are biases inherent in the data collection
process and possible security/confidentiality concerns.
Health care involves a diverse set of public and private data collection systems, including
health surveys, administrative enrollment and billing records, and medical records, used by
various entities, including hospitals, community health centers (CHCs), physicians, and health
plans. The data provided are generally clear and accurate, but they must be obtained through
legal means, as medical data are kept under the strictest regulations.
D. EXISTING DATA
This is the introduction of new investigative questions in addition to/other than the ones
originally used when the data was initially gathered. It involves adding measurement to a
study or research. An example would be sourcing data from an archive.
Pros
Cons
What are the Best Data Collection Tools for Existing Data?
The concept of Existing data means that data is collected from existing sources to investigate
research questions other than those for which the data were originally gathered. Tools to
collect existing data include:
D. OBSERVATION
Pros
Easy to administer.
Results tend to be more accurate.
It is a universally accepted practice.
It works around respondents' unwillingness to provide a report themselves.
It is appropriate for certain situations.
Cons
Checklists - state specific criteria, allow users to gather information and make
judgments about what they should know in relation to the outcomes. They offer
systematic ways of collecting data about specific behaviors, knowledge, and skills.
Direct observation - This is an observational study method of collecting evaluative
information. The evaluator watches the subject in his or her usual environment
without altering that environment.
E. FOCUS GROUPS
Unlike quantitative research, which involves numerical data, this data
collection method focuses on qualitative research. It falls under the primary category,
with data based on the feelings and opinions of the respondents. This research involves asking
open-ended questions to a group of individuals, usually 6 to 10 people, to gather
feedback.
Pros
Cons
What are the best Data Collection Tools for Focus Groups?
A focus group is a data collection method that is tightly facilitated and structured around a set
of questions. The purpose of the meeting is to extract detailed responses to these questions
from the participants. The best tools for tackling focus groups are:
Two-Way - One group watches another group answer the questions posed by the
moderator. After listening to what the other group has to offer, the group that listens
is able to facilitate more discussion and could potentially draw different conclusions.
Dueling-Moderator - There are two moderators who play the devil’s advocate. The
main positive of the dueling-moderator focus group is to facilitate new ideas by
introducing new ways of thinking and varying viewpoints.
F. COMBINATION RESEARCH
This method of data collection encompasses the use of innovative methods to enhance
the participation of both individuals and groups. Also under the primary category, it is a
combination of interviews and focus groups for collecting qualitative data. This method is
key when addressing sensitive subjects.
Pros
Cons
What are the best Data Collection Tools for Combination Research?
The Combination Research method involves two or more data collection methods, for
instance, interviews as well as questionnaires or a combination of semi-structured telephone
interviews and focus groups. The best tools for combination research are:
Online Survey - The two tools combined here are online interviews and the use of
questionnaires. This is a questionnaire that the target audience can complete over the
Internet. It is timely, effective, and efficient, especially when the data to be collected are
quantitative in nature.
Dual-Moderator - The two tools combined here are focus groups and structured
questionnaires. The structured questionnaires give a direction as to where the research is
headed while two moderators take charge of proceedings. While one ensures the
focus group session progresses smoothly, the other makes sure that the topics in
question are all covered. Dual-moderator focus groups typically result in a more
productive session and lead to better collection of data.
WHAT IS DATA ORGANIZATION?
The process of organizing collected factual material commonly accepted in the scientific
community as necessary to validate research findings.
“Research data is data that is collected, observed, or created, for purposes of analysis to
produce original research results” (Boston University Libraries, n.d.a).
Construct a table with three columns. The first column shows what is being arranged in
ascending order (i.e., the scores). The lowest mark is 4, so start from 4 in the first column as
shown below. The second column is the tally and the third is the frequency.
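The three-column table described here can be sketched in code. The scores below are hypothetical (the module's own data set is not reproduced in this text); they were chosen only so that the lowest mark is 4, as in the description. `frequency_table` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical quiz scores (not from the module); the lowest mark is 4.
scores = [4, 7, 5, 9, 7, 6, 4, 8, 7, 5, 6, 7, 9, 10, 6]

def frequency_table(values):
    """Return (score, tally, frequency) rows in ascending order of score."""
    counts = Counter(values)
    rows = []
    # Walk from the lowest to the highest score so that no score is skipped.
    for value in range(min(values), max(values) + 1):
        freq = counts.get(value, 0)          # 0 if the score never occurs
        rows.append((value, "|" * freq, freq))
    return rows

table = frequency_table(scores)
print(f"{'Score':<6}{'Tally':<8}{'Frequency'}")
for value, tally, freq in table:
    print(f"{value:<6}{tally:<8}{freq}")
```

The frequencies should add up to the number of scores, which is a quick check that no observation was missed while tallying.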
B. GROUP - It refers to data being organized into groups known as classes.
GUIDELINES
1. Use between 5 – 20 classes
2. Classes are mutually exclusive
3. Include all classes even if the frequency is zero
4. Use the same width for all classes
5. Use convenient numbers for the class limit
6. The sum of the frequency must total the data set
7. Have enough classes for all the data
8. Remember to use 0 if the class has no data, don’t leave it blank.
The following data represents
1. Determine the highest and lowest value and then compute the Range:
Range = Highest value- Lowest value,
Range = 46 - 18 = 28.
2. Decide how many classes you want to have.
Example: 5 classes
3. Compute the class width or class interval:
i = Class Interval = Range / Number of Classes
= 28/5 = 5.6, rounded up to 6
4. Identify the lower class limit (smallest number in each class) and the upper class limit
(largest number in each class).
Example:
LCL = 18 (the smallest number in the data), 24, 30, 36, 42
UCL = 23, 29, 35, 41, 47
5. Class boundaries are the numbers that separate the classes from one another. Subtract
0.5 from the lower limit and add 0.5 to the upper limit of each class.
Example:
(LL) 18 - 0.5 = 17.5 (lower class boundary) and
(UL) 23 + 0.5 = 23.5 (upper class boundary)
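The arithmetic in steps 1 to 5 can be reproduced with a short script. This is a minimal sketch that uses only the figures given in the example (lowest value 18, highest value 46, 5 classes) and assumes whole-number data, so each upper limit is one less than the next lower limit.

```python
import math

lowest, highest = 18, 46   # lowest and highest values from the example
num_classes = 5            # chosen number of classes

data_range = highest - lowest                 # 46 - 18 = 28
width = math.ceil(data_range / num_classes)   # 28 / 5 = 5.6, rounded up to 6

classes = []
for k in range(num_classes):
    lower = lowest + k * width    # lower class limit
    upper = lower + width - 1     # upper class limit (whole-number data)
    # Class boundaries sit half a unit outside the class limits.
    classes.append((lower, upper, lower - 0.5, upper + 0.5))

print("Range:", data_range, " Class width:", width)
for lower, upper, lower_b, upper_b in classes:
    print(f"{lower}-{upper}  boundaries: {lower_b}-{upper_b}")
```

The class width is rounded up rather than to the nearest whole number so that the last class is guaranteed to reach the highest value.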
we proceed as follows:
The table above is a frequency distribution written in a downward manner: the first class
(18-23) is listed in the first row.
The data can also be written in an upward manner. That is, the lowest class (18-23) is
written in the last row instead of the first, and the classes increase going up the table.
2. STEM AND LEAF DIAGRAM - A method used to organize statistical data that helps
us see values according to their size, so we can order them accordingly.
In a stem-and-leaf diagram, each data value is split into a stem and a leaf.
The leaf is the last digit to the right.
The stem is the remaining digits to the left.
For the number 243, the stem is 24 and the leaf is 3.
Example: The following data represent the science test scores for the third grading period
(out of 100):
97 92 77 82 96 75 68 80 79 96
21 34 55 84 87 68 87 88 97 81
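One way to build the stem-and-leaf diagram for these 20 scores is sketched below. It follows the rule stated earlier (the leaf is the last digit and the stem is the remaining digits); `stem_and_leaf` is an illustrative helper name, not part of any library.

```python
from collections import defaultdict

# Science test scores from the example above.
scores = [97, 92, 77, 82, 96, 75, 68, 80, 79, 96,
          21, 34, 55, 84, 87, 68, 87, 88, 97, 81]

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its remaining digits (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):         # sorting keeps the leaves in order
        stem, leaf = divmod(value, 10)   # e.g. 97 -> stem 9, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

In this sketch a stem with no scores (here, 4) is simply omitted. Some textbooks list it with an empty row so that gaps in the data stay visible.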
TYPES OF CHART
a. Bar Chart
b. Pie Chart
c. Line Chart
d. Histogram
A. Bar chart is composed of discrete bars that represent different categories of data. The
length or height of each bar is proportional to the quantity within that category of data. Bar
graphs are best used to compare values across categories. Example: The following data represent
Peter's grades in Science for the 1st to 4th quarters.
B. Pie chart is a circular chart used to compare parts of the whole. It is divided into sectors
whose sizes are proportional to the quantities represented. Example: The following data represent
the monthly household expenses of the Rich family.
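Because each sector of a pie chart is proportional to its share of the whole, the sector angle is the share multiplied by 360 degrees. The sketch below shows this calculation with hypothetical expense figures; the module's actual table for this example is not reproduced in this text, so the categories and amounts are assumptions.

```python
# Hypothetical monthly expenses (illustrative values, not from the module).
expenses = {
    "Food": 12000,
    "Rent": 8000,
    "Utilities": 4000,
    "Transport": 3000,
    "Savings": 3000,
}

total = sum(expenses.values())

# For each category, store (percent of total, sector angle in degrees).
sectors = {}
for item, amount in expenses.items():
    share = amount / total
    sectors[item] = (share * 100, share * 360)

for item, (percent, angle) in sectors.items():
    print(f"{item:<10} {percent:5.1f}%  {angle:6.1f} degrees")
```

The angles of all sectors should add up to 360 degrees, which is a useful check before drawing the chart by hand.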
C. Line chart displays the relationship between two types of information, such as number of
school personnel trained by year. Line charts are useful in illustrating trends over time. Example:
The following data show the daily temperature in Luna, La Union, recorded for 5 days in
degrees Celsius.
D. Histogram has connected bars that display the frequency or proportion of cases that fall
within defined intervals or columns. The bars on the histogram can be of varying width
and typically display continuous data. Example: The following data represents the number
of respondents aged 8-55 who are disabled.
Note that…
A bar chart is different from a histogram. A bar chart has equal spaces between bars, while
a histogram has no space between bars. In terms of the data presented, a bar chart represents
discrete data while a histogram represents continuous data.
Example:
The local ice cream shop keeps track of how much ice cream it sells versus the noon
temperature on that day. Here are its figures for the last 12 days.
The data above can be presented in a scatterplot like the one shown below.
Note that…
Line graphs are like scatter plots in that they record individual data values as marks on the
graph. Both represent the correlation/association between variables. The difference is that
in a line graph, a line is drawn connecting the data points. In this way, the local change from
point to point can be seen.
F. Cartogram is a map in which some thematic mapping variable, such as travel time,
population, or election results, is substituted for land area or distance. The geometry or space of
the map is distorted, sometimes extremely, in order to convey the information of these
alternate variables.
An example of a cartogram is shown below, presenting the distribution of the global population.
Each of the 15,266 pixels represents the home country of 500,000 people (cartogram made
by Max Roser for Our World in Data).
G. Pictograph is the presentation of data using images. Pictographs represent the frequency
of data using symbols or images that are relevant to the data. This is one of the
simplest ways to represent statistical data, and a pictograph is also very easy to
read.
GUIDELINES FOR FORMATTING CHARTS
1. Keep it simple and avoid flashy special effects. Present only essential information.
Avoid using gratuitous options in graphical software programs, such as three-
dimensional bars, that confuse the reader. If the graph or chart is too complex, it will
not clearly communicate the important points.
2. Title your graph or chart clearly to convey the purpose. The title provides the reader
with the overall message you are conveying.
3. Specify the units of measurement on the x- and y-axes. Years, number of participants
trained, and type of school personnel are examples of labels for units of measurement.
4. Label each part of the chart or graph. You may need a legend if there is too much
information to label each part of the chart or graph. Use different colors or variations in
patterns to help the reader distinguish categories and understand your graph or chart.