Data which have been gathered, processed and organized in data matrices are, when left
unanalyzed, as good as information not gathered at all. Without analyzing research data, whether
presented in tabular form or not, there is no way the researcher will ever attain his research
objectives. And, without interpreting the results of data analysis, he is deprived of the meaning
and implications of the knowledge conveyed by the data.
Data Analysis
Data analysis is the examination of gathered and organized information in terms of the
characteristics, patterns, trends, differences or similarities, and relationships so as to answer
research questions or meet study objectives. Analysis of data involves decisions regarding the unit
and level of analysis to use and the procedures and statistical tools to employ.
In deciding what unit and level of analysis to use, an often-asked question has to do with the
basis or bases of making the right decision. The answer: go back and examine the objectives and
hypotheses of the study because they will tell what unit and level of analysis to use. It should be
remembered that the research objectives are the very reasons for the conduct of the study. Unless
the study objectives are met, unless the hypotheses have been validated or rejected, the issues
raised by the study will remain unsettled and the study will stay unfinished.
Unit of analysis refers to the element of the study population or the specific variable
being used in data analysis. As an element of the study population, the unit of analysis may be
individuals, as in the study of construction workers in the Province of Iloilo by Ardales and
David (1985). In this study, the analysis is focused on the individual construction workers with
the goal of obtaining a general idea of a typical Iloilo construction worker. But the unit of
analysis need not be individuals. It could be an aggregate of individuals or of groups,
organizations or communities. An example of this is Porio’s (1990) study of the impact of the
Local Resource Management (LRM) project of the National Economic and Development
Authority (NEDA). This study focused on the poverty groups, namely, sustenance fishermen,
upland farmers and landless workers in different geographical and socio-economic settings.
Studies may, therefore, utilize either the individual or a group of individuals as unit of analysis.
There are studies, however, which use both the individual and the group as their units of analysis.
With regard to the variable as unit of analysis, the researcher may use any specific variable of
his study such as sex, civil status, educational attainment, residence, knowledge, attitude and
so on. These variables are specified in the objectives and hypotheses of the study.
On the basis of the unit of analysis, a research may be either macro or micro in category.
Macro researches are large-scale studies such as those which compare large aggregates of persons
like continents, countries or states. Micro researches are small-scale studies; those which have
individuals as their unit of analysis will be considered micro, but there is no agreement on the
borderline between macro and micro.
According to Bailey (1987), “It is probably not necessary to set an exact dividing line. What is
important is that a researcher… make a decision concerning the unit of analysis of a project.”
Level of analysis is not easy to define. But it can be understood when one thinks of
layers, as in the structure of society where layers consist of individuals and groups of individuals,
from the family to the community, town, province, region and nation. Each societal layer is
equivalent to a level which can be utilized in the analysis of research data.
The study of young adult fertility and sexuality by the Population Institute of the
University of the Philippines (1994) used various levels of analysis. Some researchers involved
in the study analyzed the provincial level data, others focused on the regional level data, while
others analyzed the data on the national level.
It should be pointed out that the unit and level of analysis as well as the different
analytical levels are interrelated in a manner described by Selltiz and others (1967), as follows:
As one moves from lower to higher levels of analysis, the units shift from individuals to
small groups to larger groups to entire societies to supranational regional aggregations.
The major relation among these “levels of analysis” is that lower levels are included
within higher levels. Individuals, for instance, are included within small groups, small groups are
included within larger groups, and larger groups are included within nations.
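The nesting of lower levels within higher levels can be sketched in a few lines of Python. This is only an illustration: the records, the field names and the helper mean_age_by are hypothetical and are not taken from any of the studies cited above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level records (the lowest level of analysis).
respondents = [
    {"province": "Iloilo", "region": "Western Visayas", "nation": "Philippines", "age": 18},
    {"province": "Iloilo", "region": "Western Visayas", "nation": "Philippines", "age": 20},
    {"province": "Capiz", "region": "Western Visayas", "nation": "Philippines", "age": 22},
    {"province": "Cebu", "region": "Central Visayas", "nation": "Philippines", "age": 24},
]

def mean_age_by(records, level):
    """Group individual records by the given level and average age within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[level]].append(record["age"])
    return {name: mean(ages) for name, ages in groups.items()}

# The same individuals analyzed at three successively higher levels.
provincial = mean_age_by(respondents, "province")
regional = mean_age_by(respondents, "region")
national = mean_age_by(respondents, "nation")
```

The same four individuals feed every level; only the grouping key changes, which is what it means for lower levels to be included within higher ones.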
In the analysis of research data, particularly quantitative data, the use of statistics is
inevitable. Without statistics as a tool of measurement the researcher will have difficulty in
examining the data he gathered and in interpreting their implications or meaning. This is
particularly true for studies which involve a sizable population or sample and a large number of
variables, and which require comparative and relational analyses. It will be advantageous for students
taking up research to also enroll in statistics in the same semester so as to understand and
appreciate the interrelations of the two fields.
The word statistics has been understood from two viewpoints. From one viewpoint it
refers to a body of mathematical processes or techniques for gathering, organizing, analyzing,
and interpreting numerical data. It is a basic and indispensable tool for measurement, evaluation
and research which usually yield quantitative data. The other point of view is that it refers to
obtained numerical data. Statistical data describe the characteristics or behavior of the group
abstracted from a number of individual observations that are combined to make generalization
possible.
According to Best and Kahn (1989), proper application of statistics involves answering
the following questions:
1. What facts need to be gathered to provide the information necessary to answer
the question or to test the hypothesis?
2. How are these data to be selected, gathered, organized, and analyzed?
3. What assumptions underlie the statistical methodology to be employed?
4. What conclusions can be validly drawn from the analysis of the data?
The choice of a statistical test or tool is dictated by the questions for which the research is
designed and the level, distribution and dispersion of the data. The following are the most
commonly used statistical tests in analyzing research data:
7. Two-way Analysis of Variance - To determine the main and interaction effects of two
independent variables on a dependent variable of the interval type. Example: Did the
method of teaching interact with students’ learning styles? What is the interaction
effect of the method of teaching and medium of instruction on the achievement of
students in Statistics?
15. Regression Analysis - To predict the value of one dependent variable from the
corresponding value of the independent variable of the interval type. (Multiple
regression if a number of independent variables are involved.) Example: How much
increase do you expect for every year of increase in work experience?
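The regression question above can be illustrated with a minimal sketch of simple (one-predictor) least-squares regression in Python. The data are hypothetical, and the choice of monthly pay as the dependent variable is an assumption made only for illustration; the function fit_line is likewise an illustrative helper.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: years of work experience and monthly pay (in thousands).
years = [1, 2, 3, 4, 5]
pay = [12, 14, 16, 18, 20]

slope, intercept = fit_line(years, pay)

# The slope answers "how much increase for every additional year of experience".
predicted_at_6 = intercept + slope * 6
```

With real data the points will not fall exactly on a line, and the slope is then read as the average expected increase per year.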
Qualitative data can be analyzed by way of classifying them into categories. Observations
or responses which are essentially similar are put together into a class or category, making possible
the quantification of the data and expediting the task of analysis. For example, on the question
why the subjects of a study were not able to pursue college studies, various responses were given
which, when analyzed, can be grouped into various categories. Most answers may be basically
financial in nature. Other answers may be classified under the personal and the parental decision
categories, while still others may fall under categories which have to do with such circumstances
as health, safety and peace and order problems.
Not all individual qualitative data need to be grouped into categories of similar essence.
Some individual responses or observations are more meaningful to the study when left
unclassified. Therefore, the researcher should not feel compelled to classify all individual
responses or observations into classes or categories.
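The classification of responses into categories can be sketched as follows. The coding scheme, its keywords and the sample responses are all hypothetical; in practice the researcher usually reads and codes each response by hand, and keyword matching is used here only to keep the illustration short.

```python
from collections import Counter

# Hypothetical coding scheme: category -> keywords that signal it.
CODING_SCHEME = {
    "financial": ["money", "tuition", "afford"],
    "personal decision": ["not interested", "chose to work"],
    "parental decision": ["parents", "father", "mother"],
    "health": ["sick", "illness"],
}

def code_response(response):
    """Return the first category whose keyword appears in the response, else 'unclassified'."""
    text = response.lower()
    for category, keywords in CODING_SCHEME.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "unclassified"

# Hypothetical answers to "Why were you not able to pursue college studies?"
responses = [
    "We had no money for tuition",
    "My parents wanted me to help on the farm",
    "I could not afford the fees",
    "I was sick for most of that year",
    "The school was too far from our barrio",
]

# Once coded, the qualitative answers can be counted like any other variable.
category_counts = Counter(code_response(r) for r in responses)
```

The last response matches no category and stays unclassified, in line with the point that not every response has to be forced into a class.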
One question often asked when making plans for analysis has to do with the number of categories.
One has to be reminded that the reason for assigning numerous responses or observations of
wide range to a few classes or categories is to facilitate analysis. The more categories there are,
the more difficult the analysis will be. For marginal analysis, fewer than seven
categories would be easier to handle than analyzing each answer or observation, the
number of which may approximate the number of subjects, say 100 or more, under investigation.
When one has to analyze cross-tabulated data, the fewer the categories of each
tabulated variable, the better it is for the researcher. Having many categories, say more than
three for each variable involved, will make the analysis of the data difficult for the researcher.
1. The set of categories should be exhaustive, that is, it should be possible to place every
response in one of the categories of the set;
2. The categories within the set should be mutually exclusive, meaning, one should not
place a given response in more than one category within the set;
4. Categories in one level should not be combined with those in another level.
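The first two rules above (exhaustive and mutually exclusive categories) can be checked mechanically. The sketch below assumes categories defined as numeric ranges; the age categories, the sample values and the helper names are hypothetical.

```python
def matching_categories(value, categories):
    """Return the labels of every category whose (low, high) range contains value."""
    return [label for label, (low, high) in categories.items() if low <= value <= high]

def check_categories(values, categories):
    """Return (unplaced, ambiguous): values fitting no category, and values fitting several."""
    unplaced = [v for v in values if len(matching_categories(v, categories)) == 0]
    ambiguous = [v for v in values if len(matching_categories(v, categories)) > 1]
    return unplaced, ambiguous

# Hypothetical age categories. The overlap at 20 and the gap above 29 are deliberate flaws.
flawed = {"10-20": (10, 20), "20-29": (20, 29)}
fixed = {"10-19": (10, 19), "20-29": (20, 29), "30 and over": (30, 200)}

ages = [12, 20, 25, 34]
flawed_result = check_categories(ages, flawed)  # 34 is unplaced, 20 is ambiguous
fixed_result = check_categories(ages, fixed)    # nothing unplaced, nothing ambiguous
```

A set of categories passes both rules when the two returned lists are empty.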
Basically, quantitative analysis is used when the researcher deals with numerical data,
that is, information which was assigned numerical values that facilitate counting,
summarization, comparison and generalization. This analysis requires the use of statistical
tools, the appropriateness of which is determined by the type of data and the type of
measurement scale used.
Types of data. In the analysis of quantitative data using statistical treatments, two types of
data are recognized, namely, parametric data and non-parametric data. Parametric data are
measured data, as with ratio and interval data. Parametric statistical tests assume that the data are
normally or nearly normally distributed. Parametric tests are applied to both interval- and
ratio-scaled data.
Non-parametric data are those which are counted or ranked, such as nominal and ordinal
data. Non-parametric tests, sometimes known as distribution-free tests, do not rest upon the more
stringent assumption of normally distributed populations.
Scales of measurement. There are four kinds of measurement scales: nominal, ordinal,
interval, and ratio. Their properties and implications are described below.
2. Ordinal scales. Ordinal scales are of a higher level than nominal scales because they
involve quantitative distinctions. They have the property of magnitude, which allows
the ordering of members of a group of people or objects into ranks. In using this
scale, one can say that one member of a group is not only different or similar (a
nominal characteristic) but is also greater or less than the others on the criterion used.
If, for example, a teacher rates his students according to cooperativeness, the result
would be the following ordinal scale:
Very cooperative
Cooperative
Somewhat cooperative
Somewhat uncooperative
Uncooperative
Very uncooperative
The ranking is such that “very cooperative” is the best and ideal and is given the
highest rank, and “very uncooperative” is assigned the lowest rank. In between these two ranks
are scale points which differ in terms of degree. No particular value is attached to the difference
between the scale points. The implication is that one cannot say that the difference between “very
cooperative” and “cooperative” is always the same or that it is equal to the difference between
the “very uncooperative” and “uncooperative” ranks.
Data measured on ordinal scales are called ordered data.
3. Interval scales. In an interval scale, the difference or distance between any two
categories is known, equal and constant. An interval scale has all the characteristics
of nominal and ordinal scales plus the added characteristic of a constant unit of
measurement between categories that are equally spaced. For example, a child
assigned to category 5 is not only different (nominal) from one placed in category 7
but also younger (ordinal) and two years younger (interval). Age, time, height and weight
can be measured using an interval scale, provided the distance between categories is
equal and constant, as in the case of the following age ranges:
10 – 14 years
15 – 19 years
20 – 24 years
25 – 29 years
30 – 34 years
35 – 39 years
An interval scale does not have a “true” zero point that would indicate an absence of
the thing being measured, such as temperature or attitude. For convenience, a zero
point may be assigned arbitrarily, but this will not allow one to say that an object whose
temperature is 40 degrees Celsius is twice as hot as an object whose temperature is 20 degrees Celsius.
4. Ratio scales. Ratio scales not only possess the characteristics of the three
measurement scales given above (identity, magnitude, equal intervals) but their level
of measurement is based on a natural origin, an absolute zero point that indicates
the absence of the variable being measured. Comparing two objects using the ratio
scale, one can say that one object is 5 times as long as, or 10 times as heavy as, the other.
The basic difference between a ratio and an interval scale is that ratio measurements are made
from a true zero point while interval measurements are made from an arbitrary zero
point. For this reason, in a ratio variable, ratios may be formed directly from the levels or values
themselves. An interval variable may be converted into a ratio variable using the differences
between levels or values. The difference or change is a ratio variable. The process of subtraction
eliminates or cancels the effects of an arbitrary origin.
As in the case of interval scales, data measured by ratio scales are called score data.
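The effect of an arbitrary zero point can be seen in a short worked example using temperature. The two readings are hypothetical; the Celsius scale has an arbitrary zero, while the Kelvin scale starts from absolute zero.

```python
def celsius_to_kelvin(celsius):
    """Convert from the Celsius scale (arbitrary zero) to Kelvin (absolute zero)."""
    return celsius + 273.15

# Hypothetical readings taken in the morning and at noon.
morning_c, noon_c = 10.0, 20.0

# Ratio of the raw Celsius readings: it depends on where zero was placed,
# so "twice as hot" is not a valid statement.
naive_ratio = noon_c / morning_c

# Ratio of the same two temperatures on a scale with a true zero.
kelvin_ratio = celsius_to_kelvin(noon_c) / celsius_to_kelvin(morning_c)

# Subtraction cancels the arbitrary origin: the change is the same on both scales.
diff_c = noon_c - morning_c
diff_k = celsius_to_kelvin(noon_c) - celsius_to_kelvin(morning_c)
```

The raw Celsius ratio is 2, yet on the Kelvin scale the noon reading is only about 3.5 percent higher. The difference of 10 degrees is the same on both scales, which is why differences of interval measurements may be treated as ratio variables.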
*These two scales are sometimes referred to collectively as metric measurement or numerical
measurement.
Source: Roger E. Kirk, Statistics: An Introduction, 3rd edition, Chicago: Holt, Rinehart and
Winston, Inc., 1990, p. 18.
Table 9.1 presents a graphic summary of the levels of quantitative description and the types of
statistical analysis for each level.
Analytical goals. In data analysis, the goal of the researcher may be directed toward
finding out from the data one or more of the following attributes or characteristics of the group
being studied.
2. Variance in the group. The interest of the researcher may not be confined to the
average characteristics of the group; he may want to find out also how individual
members of a group vary from the average characteristics of the group. For example,
if the average age of a group of students is 18, how do the ages of students vary or
differ from the average age of 18? To determine variation in the group, measures such
as deviation from the mean, the variance, and the standard deviation may be used.
The deviation from the mean is computed by subtracting the mean from an individual
measure such as a grade or score, the result of which is called a deviation score. The
variance is the sum of the squared deviations from the mean, divided by the
number of individuals. The standard deviation is the square root of the variance, which
indicates the average distance of individual measurements from the group mean.
3. Difference within the group. The researcher may want to find out whether or not
subgroups of the group being studied are different or similar on certain traits
investigated. This is true in experimental studies where the score of the experimental
group is compared with that of the control group. The question to be answered in the
analysis is: Is the difference between the scores of the groups statistically significant
or not? To determine this, the statistical tools commonly used are the chi-square (χ²)
and the t or z test for the difference between means (for the formulas and
computational procedures of these tools, please refer to statistics books).
4. Relationship within the group. There are many studies, particularly in graduate studies
programs, the primary aim of which is to find out the relationship between certain variables
covered in the study. Research questions that ask how one variable is related to another
imply the use of relational analysis.
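The measures of variation described under “Variance in the group” above (the deviation score, the variance and the standard deviation) can be sketched as follows. The ages are hypothetical and were chosen so that their mean is 18, as in the example in the text; the function names are illustrative.

```python
from math import sqrt

def deviation_scores(values):
    """Subtract the group mean from each individual measure."""
    group_mean = sum(values) / len(values)
    return [v - group_mean for v in values]

def variance(values):
    """Sum of the squared deviations from the mean, divided by the number of individuals."""
    return sum(d ** 2 for d in deviation_scores(values)) / len(values)

def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))

# Hypothetical ages of five students; the average age is 18.
ages = [16, 17, 18, 19, 20]

deviations = deviation_scores(ages)
age_variance = variance(ages)
age_sd = standard_deviation(ages)
```

The variance here divides by the number of individuals, as defined in the text. Many statistics books divide by one less than the number of individuals when the group is a sample; Python's standard library offers both forms as statistics.pvariance and statistics.variance.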
Types of analysis. With the use of statistics, data analysis can be grouped into two broad
categories, namely, descriptive analysis and inferential analysis.
Inferential analysis is the application of the findings or conclusions from a small group to
a large group from which the former group was drawn. Statistically, the small group is known as
the sample, and the large group is the population. To justify the description of the population on
the basis of the findings on the sample (statistics), the latter should be adequate in size and
should approximate the former. A statistic is a measure derived from observation of the
characteristics of the sample. It is used to estimate a parameter, which is the corresponding value
in the population from which the sample is drawn.
In analyzing quantitative data, the researcher may use any or a combination of the
following specific types of analysis:
1. Univariate analysis. The term univariate suggests that this type of analysis is
restricted to the examination of single variables. Sample cases may be examined in
terms of a single variable to find out whether the sample approximates the population
from which it was drawn. Using this type of analysis in a sample will prove useful
and economical in finding out the composition of the larger population with regard to
variables important to the study for which such information is not otherwise
available. In some cases, univariate analysis is adequate to meet the objectives of the
study. For example, suppose it is the purpose of the study to find out the knowledge, attitude
and practice of a particular group of people regarding, say, traditional medicine. Univariate
analysis of responses to specific questions would be sufficient to meet the research
objectives. Even if there are comparative and relational analyses to be made, still
there is a need for univariate analysis to find out the characteristics of the people
being investigated before proceeding to the more complex analysis of data intended to meet
the major goal of the study.
The simplest statistical tool for univariate analysis is the frequency count. But the
distribution of frequency counts alone may not be interesting and meaningful, so statistical
measures like the percentage, the mean, the median, or the mode may be added.
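The univariate tools just named (frequency count, percentage, mean, median and mode) can be sketched with Python's standard library. The civil status and age data are hypothetical, and frequency_table is an illustrative helper.

```python
from collections import Counter
from statistics import mean, median, mode

def frequency_table(values):
    """Return {category: (count, percent)} for a single variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}

# Hypothetical responses for one nominal variable and one numeric variable.
civil_status = ["single", "married", "married", "single", "married",
                "widowed", "married", "single", "married", "married"]
ages = [21, 25, 25, 30, 34]

# Frequency count with percentages for the nominal variable.
status_table = frequency_table(civil_status)

# Measures of central tendency for the numeric variable.
average_age = mean(ages)
middle_age = median(ages)
most_common_age = mode(ages)
```

Frequency counts and percentages suit nominal variables such as civil status, while the mean and median require numeric data such as age.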
2. Bivariate relationships. There are studies which call for determining the
relationships between pairs of variables, such as intervention and effects, or inputs and
outputs. The appropriate analytic technique for examining bivariate relationships is
defined by the type of variables, whether they are nominal, ordinal or interval.
When one is to examine relationships between two nominal variables, the chi-square
test can be applied to the cross-tabulated variables so as to find out whether or not a
significant relationship exists between the variables. If a relationship does exist and
the researcher wishes to know the strength of the relationship, then a “measure of
association” is to be used. For nominal variables, one of the best measures of
association is Cramer’s V, which is derived by using the chi-square value.
There are various measures of association which can be used in examining the
relationship between cross-tabulated ordinal variables. One of the simplest to
calculate is gamma. Some researchers resort to using the chi-square test to find out
whether the relationship between variables is significant or not. It should be
remembered, however, that this test is not sensitive to the ordinality of the data and,
therefore, does not provide a true test of the significance of gamma.
It should be noted that all the measures of association discussed above, except for the
regression coefficient and Cramer’s V, range in value from -1.00 (a perfect negative
relationship) to +1.00 (a perfect positive relationship). When there is no relationship,
the coefficient is 0.00. Finally, it should be remembered that none of the measures of
association indicates whether a relationship is causal or not.
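For two cross-tabulated nominal variables, the chi-square statistic and Cramer’s V can be computed as in the sketch below. The counts are hypothetical, the function names are illustrative, and no continuity correction is applied.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square statistic for a cross-tabulation given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramer's V: square root of chi-square / (n * (k - 1)), k = smaller table dimension."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return sqrt(chi_square(table) / (n * (k - 1)))

# Hypothetical counts: rows are residence (rural, urban),
# columns are feeding practice (breast, bottle).
observed = [[30, 20],
            [20, 30]]

chi2 = chi_square(observed)
strength = cramers_v(observed)
```

For a table with two rows and two columns there is one degree of freedom, and the critical chi-square value at the .05 level is 3.841. The statistic answers whether a relationship exists; Cramer’s V, which ranges from 0 to 1, describes how strong it is.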
5. Time series analysis. Some studies generate time series data which allow the
investigator to know certain changes over time. Changes can be measured in terms of
days, weeks, months or years, before and after an intervention has been introduced.
Patterns and trends can be easily discerned when the occurrences of the phenomenon
investigated over time (be it monthly, quarterly or yearly) are plotted on a graph.
The trend may be smoothly increasing or decreasing, or erratic (increasing,
fluctuating, increasing and so on). Since time series data are gathered before and after
implementation of an intervention, the researcher will be able to compare the trends
before and after the intervention, enabling him to find out whether or not the
intervention caused change in the occurrence of the phenomenon studied. Simple
measures can be used in time series analysis. Most commonly used are
frequencies, percentages and measures of central tendency. Measures of the significance of the
difference should be used when comparing the trends before and after the
intervention.
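A before-and-after comparison of time series data can be sketched with simple measures. The monthly counts and the timing of the intervention are hypothetical.

```python
from statistics import mean

# Hypothetical monthly counts of cases; the intervention starts after month 6.
monthly_cases = [40, 42, 41, 43, 44, 42, 35, 33, 30, 31, 29, 28]
intervention_month = 6

before = monthly_cases[:intervention_month]
after = monthly_cases[intervention_month:]

def month_to_month_changes(series):
    """Change from each period to the next; the signs show whether the trend rises or falls."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Central tendency before and after the intervention.
mean_before = mean(before)
mean_after = mean(after)

# Direction of change within each period.
changes_before = month_to_month_changes(before)
changes_after = month_to_month_changes(after)
```

Comparing the two means only describes the change. Whether the difference is statistically significant still has to be tested, as the text points out.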
The specific types of analyses involving statistical measures presented above are
not exhaustive. The researcher, particularly the beginner, has to refer to other books on
research and statistics for the type of analysis and statistical tools which he needs for his
particular study.
Example. In the analysis of statistical data, the analysis or text is given first,
followed by the statistical table, the latter being the support or backup which corroborates
or bears out the former. The following illustration will sharpen one’s understanding and skill
in data analysis.
Among rural mothers, the biggest proportion of mothers was breast feeding
(45%), followed by those who were bottle feeding their infants (35%), and the least
proportion (20%) was constituted by mothers who were mixed feeding their babies.
Among urban mothers, the biggest proportion (52%) of mothers was bottle feeding their
infants. Breast feeding mothers were the next biggest group (33%), while the smallest
group (15%) was constituted by mothers who were mixed feeding their infants.
Comparatively, breast feeding was predominant among rural mothers while bottle
feeding was predominant among urban mothers. In both categories of residence, rural and urban,
mixed feeding was practiced by the least number of mothers.
Table 9.2 Percent Distribution of Mothers by Infant-feeding Practice, Education and Residence

Feeding          Rural (n=75)                  Urban (n=60)                  All Res. (n=135)
Practice         Elem.    HS  Col.  Sub-total  Elem.    HS  Col.  Sub-total  Elem.    HS  Col.  Total
Breast feeding      75    55     5         45     65    25    10         33     70    40     7     39
Bottle feeding      15    35    55         35     25    55    75         52     20    45    65     43
Mixed feeding       10    10    40         20     10    20    15         15     10    15    28     18
Total              100   100   100        100    100   100   100        100    100   100   100    100
First, read with understanding the table title and column headings. Identify the variables
and their corresponding categories, the statistical tools and the population or sample sizes by
variable and category.
Second, determine what type of analysis should be made. For this the researcher should
refer to his specific objectives.
Third, the researcher should come up with a concluding statement based on the analysis
made.
In the text or analysis, the researcher should use the past tense in referring to the
characteristics of the subjects or to what they were doing at the time the data were gathered.
However, when he refers to the table or the data in the table, the present tense should be used.
For example,
In analyzing data, priority and emphasis should be given to the extremes: highest/ biggest/
predominant, lowest/ smallest/ least. But the researcher should always be guided by what he
wants to achieve, which he explicitly stated in his specific objectives. The researcher should
avoid enumerating all the variables and categories with their respective statistics. This is a case
of table reading, not data analysis.
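Picking out the extremes can be illustrated with the sub-total percentages of Table 9.2. The data structure and the helper extremes are illustrative only; the figures are those given in the table for rural and urban mothers.

```python
# Sub-total percentages taken from Table 9.2.
subtotals = {
    "rural": {"breast feeding": 45, "bottle feeding": 35, "mixed feeding": 20},
    "urban": {"breast feeding": 33, "bottle feeding": 52, "mixed feeding": 15},
}

def extremes(distribution):
    """Return the (most common, least common) categories of a percent distribution."""
    highest = max(distribution, key=distribution.get)
    lowest = min(distribution, key=distribution.get)
    return highest, lowest

# One pair of extremes per residence category.
summary = {residence: extremes(dist) for residence, dist in subtotals.items()}
```

The result matches the written analysis: breast feeding predominates among rural mothers, bottle feeding among urban mothers, and mixed feeding is least common in both groups.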
It is in the data analysis section or chapter where the researcher rejects or does not reject his
null hypotheses. While some researchers use “deny” and “accept” in relation to null hypotheses, it
is the position of the author that null hypotheses can only be rejected or not rejected, in
keeping with books in statistics.
Data Interpretation
Having done the analysis of data, the next task of the researcher is to interpret the results of
the analysis. The purpose of data interpretation is to “search for the broader meaning of research
finding” (Selltiz and others, 1976). This goal is not achieved if the researcher just explains the
meaning of the data and fails to explore the probable meaning or meanings beyond what the data
imply or convey. From the point of view of Kerlinger (1986), in making interpretation of the
data, the researcher “takes the results of analysis, makes inferences pertinent to the research
relations studied, and draws conclusions about these relations.”
The search for broader meaning in data interpretation has two major aspects. One aspect
is the attempt to establish continuity in research undertaking. This is done by linking the results of
one’s study with those of other studies. In most cases, studies related to his own are presented
or discussed in the review of related literature section. In linking up, the researcher points out
whether or not the results of his own study support or contradict the findings of other studies similar
or related to his own. In so doing, knowledge on the issues tackled is made available to other
researchers who may want to investigate the research issues further.
The other aspect of data interpretation is that doing it leads to the establishment of
explanatory concepts. Certain attributes of the population under investigation or the kind of
relationship that exists between study variables can be understood best when the researcher
explains them by citing supporting statistics in the same study or those of other studies. The
researcher should not just describe, for example, the children who live with their mothers
performing better in class than their counterparts whose mothers live and work abroad for a long
time. He should explain the factors, other than or due to the presence or absence of the mother,
which affect the academic performance of the children. Using the data at hand as anchor or basis,
he should make deductions and inferences, thereby exploring possible meanings which other
researchers may use as hypotheses for their own studies on the issue.
Illustrations:
A. Establishment of continuity in research undertaking.
The result of the study corroborates the findings of Polido that elementary educated
mothers are likely to breastfeed their infants while mothers with high school and
college education commonly practice bottle feeding.