DATA COLLECTION,
ORGANIZATION AND
PRESENTATION
with software application
LEARNING MODULE
Rina A. Abner, DBA, CPA,
This learning material has adopted various resources offline and online. Some discussions were lifted verbatim. The sources
are properly recognized in the references section. This material does not intend to infringe any copyrights, and is for
educational purposes alone.
AE9: Statistical Analysis with Software Application
I. OBJECTIVES
II. LESSON
Data
Primary data are data that were gathered by the data collectors themselves
for a specific purpose. Secondary data are those that were documented or gathered
by other sources and made available to other researchers. Pulse Asia
typically conducts surveys on people's opinions regarding current issues; that
is an example of gathering primary data. Another example is an online survey
administered by the researcher himself. The data gathered from
the online survey are primary data. On the other hand, secondary
data are data previously gathered by other sources that may be
used by researchers in their study. If you visit the website of the Philippine
Stock Exchange, you will see the uploaded data and information of all the publicly
listed companies (PLCs) in the country. You may use such data in your study, and
that is an example of secondary data. Another example is the data about the number
of farmers in a certain municipality gathered by the Local Government
Unit (LGU). If your study needs data about the number of farmers and you go to
the LGU to gain access to the data, your data are secondary data. However,
if you gather the data about the farmers yourself, they would be considered
primary.
Both types of data have their advantages and disadvantages. Take a look at
the following table, which was taken from Sarstedt and Mooi (2019).
Also, look at the coded data in Appendix A. Now we import the data into Gretl
and first click the variable whose frequency distribution we want to know.
Then, click “Frequency Distribution” under “Variable”. Select “Show data
only”, untick “show plot”, and click “Ok”. The following will pop up for
Gender, Address, and Religion:
Note that if we have qualitative data and we want to import them into any
statistical software, they have to be coded. You may decide what values you are
going to use to turn them into quantitative data. That way, the software can read
the data easily.
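The coding step described above can be sketched in Python. This is only an illustration: the codebook below (1 for Male, 2 for Female) and the sample responses are assumptions made for the example, not the codes used in Appendix A.

```python
# Hypothetical codebook: the numeric values are arbitrary labels chosen by
# the researcher, not the codes from Appendix A.
GENDER_CODES = {"Male": 1, "Female": 2}


def encode(responses, codebook):
    """Replace each text response with its numeric code."""
    return [codebook[response] for response in responses]


# Made-up responses, for illustration only.
raw_gender = ["Female", "Male", "Female", "Female"]
coded_gender = encode(raw_gender, GENDER_CODES)
print(coded_gender)
```

Keeping the codebook in one place makes it easy to report which number stands for which category when the results are presented.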
On the other hand, there are qualitative data that the researcher does not want to
express in quantitative terms; instead, they will be analyzed as qualitative data.
That is another field that this course does not cover.
Data Collection
Now that you already know the types of data that you may gather, let’s talk
about data collection. Recall that data are collected from a sample that is
representative of the whole population. There are ways and techniques for
getting the sample, which will be covered in detail in the Week 7 Module under the topic
Sampling and Sampling Distributions. Following are some of the data collection
strategies that a researcher may use, depending on the research design:
4. Observational Data. Here, the researcher does not design and administer
any questionnaire or conduct an experiment. Instead, data are collected
through observation. Suppose that you want to study the practices
of a certain tribe in Partido, Camarines Sur. You may stay in their
community for a certain period and observe what they do and how they
conduct their activities and culture. You may choose to participate with
them in their activities (participant observer) or just sit under a tree and
observe and take notes of what’s happening (non-participant observer).
Observational data may also be gathered online. For example, there is a
7. Secondary data collection. Here, we utilize the data that were collected
by someone else. For example, in addition to what was previously
mentioned in this module, we have databases such as COMPUSTAT and
OSIRIS, where we can get various data about publicly listed companies
around the world (subscriptions to them aren’t free, by the way). Collecting
data from the database of the Philippine Statistics Authority (PSA), the
Philippine Stock Exchange, the Securities and Exchange Commission,
company websites, government reports, and other sources on the
internet are also examples.
Data Organization
After we have collected our data, it’s now time to organize them. Commonly,
the researcher would first prepare the frequency distribution (we have actually done
this in the previous section when we coded the qualitative data). A frequency
distribution contains the number of observations that fall in each category
of the variables that a study covers.
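As a rough illustration of what a frequency distribution contains, the counting can be sketched in Python with the standard library. The responses below are made up for the example and are not the Appendix data; the function name is likewise only illustrative.

```python
from collections import Counter


def frequency_distribution(values):
    """Return {category: (count, percent)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in sorted(counts.items())
    }


# Made-up responses, for illustration only.
gender = ["Female", "Male", "Female", "Female",
          "Male", "Female", "Female", "Female"]
table = frequency_distribution(gender)

for category, (count, percent) in table.items():
    print(f"{category:<8} {count:>3} {percent:>6}%")
```

Each row pairs a category with the number of observations in it and its share of the total, which is the same information a frequency distribution table reports.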
For this sub-topic, let us use the data in Appendix B (data in Excel file format
will be provided), which contains the ratings of the CPALE takers and their
academic performance (grades) in their major accounting subjects. Suppose that we
want to know whether academic performance is a predictor of CPALE ratings.
Assume the same codes for the profile, except for “Award”, for which 0 means no
award and 1 means with Latin honors.
Using the figures generated from Gretl, we may have the following frequency
distribution table:
Note that gender, address, religious affiliation, and award are nominal variables,
and the frequency distribution table is the one commonly used to present them.
The same applies to ordinal and interval variables. Ratio
variables, like the CPALE ratings and grades (last two columns of our data), may
also be presented in a frequency distribution, but the software will first group
the values into intervals. Take a look at the following:
But when it comes to ratio variables, it would also be good if we present the
summary statistics, like the following:
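Outside Gretl, the same kind of summary can be sketched in Python with the standard statistics module. The ratings below are made-up values used only to show the computation; they are not the Appendix B data, and the function name is an assumption for this example.

```python
import statistics

# Made-up CPALE-style ratings, for illustration only.
ratings = [78.5, 81.0, 74.25, 88.0, 79.5, 83.75]


def summarize(values):
    """Collect common summary statistics for a ratio variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
    }


summary = summarize(ratings)
for name, value in summary.items():
    print(f"{name:<8} {value:.2f}")
```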
Most of the time, data are presented in tables; however, the researcher may
also present them graphically if that would serve better.
Graphical presentations of data
Graphical presentation can take the form of charts and
graphs, and with the help of statistical software, it is easy. It’s also easy if you use
MS Excel.
Let us prepare the pie chart first, using the same data on CPALE takers.
Step 2. Click the range of the values and label that you want to present as a pie
chart, then click the Insert option, then Insert Pie Chart. The following will appear:
Step 3. You may format the pie chart according to your specifications; you may also
add a chart title.
Here’s now your pie graph for the gender of the respondents.
[Pie chart: Gender, with slices labeled 27 and 53; legend: Male, Female]
Now, for the ratio variables, it would not be so practical to present them in pie charts.
Instead, we may prepare a histogram. This is “a graph showing the differences in
frequencies or percentages among the categories of an interval-ratio variable. The
categories are displayed as contiguous bars, with width proportional to the width of
the category and height proportional to the frequency or percentage of that category”
(Leon-Guerrero & Nachmias, 2017, p. 45).
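The grouping behind a histogram can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the ratings, the starting value of 70, and the interval width of 5 are all made up, and Gretl chooses its own intervals, which may differ.

```python
def grouped_frequency(values, start, width, n_bins):
    """Count values in n_bins equal-width intervals beginning at start.

    Each interval includes its lower limit and excludes its upper limit,
    except the last one, which also includes its upper limit.
    """
    counts = [0] * n_bins
    upper_edge = start + width * n_bins
    for value in values:
        if value == upper_edge:
            index = n_bins - 1  # put the top edge in the last interval
        else:
            index = int((value - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_bins)
    ]


# Made-up ratings, for illustration only.
ratings = [72, 75, 78, 80, 81, 84, 85, 88, 90]
bins = grouped_frequency(ratings, start=70, width=5, n_bins=4)

# Text version of the histogram: one bar of asterisks per interval.
for lower, upper, count in bins:
    print(f"{lower}-{upper}: {'*' * count}")
```

Because the intervals have equal width and share their limits, the bars sit side by side with no gaps, which is what distinguishes a histogram from a bar chart of separate categories.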
This time, we graph the CPALE and academic performance using Gretl.
Click “Variable”, then “Frequency Distribution”, make sure that the “show plot” is
ticked, then click “Ok”. The following will appear.
Take note that you may save the graph as an image file. Just right-click the graph
and choose “Save as PNG” (you may also choose other formats). You may also edit
the graph according to your specifications.
Aside from the histogram, there are other graphs in the Gretl software. Using the
same graph above, let’s edit it to turn it into other forms of graphs. Click the
“Menu” (indicated by three lines), and “Edit”.
If the data involve time series, you may do a line graph to show the trend.
An example is the following (a different dataset was used; the time period involved is
2013 to 2017).
The graphical presentation in this module is limited. There are charts and
graphs other than the ones presented here. Just explore the Gretl software and MS Excel
if you are interested.
III. ACTIVITIES
IV. ASSESSMENT
Instruction: Answer the following questions. If you have your laptop or computer,
you may use MS Word for your answers. If you can’t do it in MS Word, you may just
write your answer on a clean sheet of paper and take photo/s of it (please make
sure that your images are clear).
In naming the file for your output, kindly put your name, subject code and module
number (e.g., ABNER, RINA_AE9_Module 2). Please turn in your output through
our Moodle.
A. Theory (50%)
1. In your own words, how do you differentiate primary data from secondary data?
Provide a concrete example to expound your answer.
2. Give concrete examples that distinguish quantitative data from qualitative data.
3. If I want to present the socio-demographic profile of my respondents, what
graphic presentation would you likely recommend?
4. What if I want to use graphical presentation for the financial performance of the
SMEs in the Philippines, what would you most likely recommend?
5. For the following, identify the data type (primary or secondary), and the collection
method (choose from the methods discussed under “Data Collection”) that would
best fit the situations:
B. Application (50%)
Take screenshots of your outputs in Gretl, and in the same file for your answers
in Assessment A (Questions) above, paste your screenshots. Please add at least one
selfie photo (with Gretl) while working on it.
V. REFERENCES
Appendix B
Organization and Presentation of Data