When all the data needed to answer the study objectives have been
collected, the data should be processed in preparation for analysis.
Why is data processing necessary?
• In studies involving collection and analysis of quantitative
information, a great amount of information needs to be processed in
preparation for collation and analysis.
Editing
Coding
Encoding
Creation of data files
Tabulation
Step One: Editing
B. Educational Background
1. What undergraduate degree did you finish?
   1 AB/BA
   2 Education
   3 Commerce
   4 Nursing
   5 Engineering
   6 Agriculture
   7 Others
• A coding sheet is a form that contains columns and rows where the code
symbols are entered.
• The columns represent variables or question items, while the rows represent the
data sources, subjects, respondents, or instruments being coded.
Sample coding sheet
R No. Sex Age Civ Work Emp Occ. Income Educ Course Like Why Why
Stat Stat grad No Yes
1 1 18 1 1 1 2 10,000 1 1 1 2 1
2 2 17 1 2 1 1 12,300 1 1 1 2 2
3 2 16 1 1 1 2 10,500 2 2 2 3 3
4 1 17 1 2 0 0 NAP 2 2 1 3 2
A Coding Manual
• Is a form that defines variables and gives the codes for the categories
of responses to the questions or items in the research instrument.
• For example, sex is a nominal variable. Its categories, “male” and “female,” do
not have mathematical value. If the number “1” is used to represent “male” and
“2” is used to represent “female”, it does not mean that the “female” category
has a higher value than the “male” category. Numbers are assigned to
categories only to facilitate processing.
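A coding manual can be sketched as a lookup table. The example below is a minimal illustration, assuming the sex codes from the nominal example (1 = male, 2 = female) and the degree codes from the sample questionnaire item; the names `CODING_MANUAL`, `encode`, and `decode` are hypothetical and not part of the source.

```python
# Hypothetical coding manual: variable name -> {category label: numeric code}.
# The codes are labels only; they carry no mathematical value.
CODING_MANUAL = {
    "sex": {"male": 1, "female": 2},
    "degree": {
        "AB/BA": 1, "Education": 2, "Commerce": 3, "Nursing": 4,
        "Engineering": 5, "Agriculture": 6, "Others": 7,
    },
}


def encode(variable, response):
    """Return the numeric code for a response (KeyError if not in the manual)."""
    return CODING_MANUAL[variable][response]


def decode(variable, code):
    """Return the category label that a numeric code stands for."""
    for label, value in CODING_MANUAL[variable].items():
        if value == code:
            return label
    raise KeyError(f"no category with code {code!r} for {variable!r}")


# One respondent's answers turned into a row of codes for the coding sheet.
coded_row = [encode("sex", "female"), encode("degree", "Nursing")]
print(coded_row, decode("degree", 5))
```

Keeping the manual in one place means the same table serves both encoding the answers and reading the coded sheet back.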
Ordinal Scale
• The distance between the first and the second ranks, however, is not
necessarily the same as the distance between the second and the third ranks or the
distance between the third and fourth ranks.
• For example, three high school students who got the first, second and third
honors in their class obtained a general average of 94, 89 and 88, respectively.
Take note that while their ranks are consecutive, the differences in grades
between ranks are not equal.
The rank in class is an ordinal variable.
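The honors example can be checked with a few lines of arithmetic. This is only a sketch using the three grades quoted above; the variable names are illustrative.

```python
# General averages of the first, second, and third honors (from the example).
grades = [94, 89, 88]
ranks = [1, 2, 3]

# Gap between each pair of consecutive ranks, and between their grades.
rank_gaps = [later - earlier for earlier, later in zip(ranks, ranks[1:])]
grade_gaps = [higher - lower for higher, lower in zip(grades, grades[1:])]

print(rank_gaps)   # steps between ranks
print(grade_gaps)  # steps between the underlying grades
```

The rank steps are all equal, while the grade steps are not, which is why ranks should not be treated as equally spaced measurements.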
Interval Scale
• Income values have equal distances between each other. For instance,
the distance between Php 1,000 and Php 3,000 is Php 2,000. Similarly, the distance
between Php 5,000 and Php 10,000 is the same as the distance
between Php 15,000 and Php 20,000, which is Php 5,000.
Descriptions and Examples of the Four Scales of Measurements
• Descriptive Analysis
• Is used to describe the nature and characteristics of an event or a
population under investigation. It is used to describe the
characteristics of a variable or a set of data and/or the variance
within the data.
Inferential Analysis
• Is a method of analysis used in testing hypotheses. It is used to test the
significance of observed differences or relationships between or among
variables. This method is used in analytical studies.
Frequency distribution
The frequency distribution indicates the number and percentage of
responses for each category.
• The distribution is a useful measure for analysing nominal and ordinal data.
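A frequency distribution can be sketched with a simple counter. The responses below are hypothetical (they are not data from the source) and the function name `frequency_distribution` is illustrative.

```python
from collections import Counter

# Hypothetical nutritional-status responses, one per respondent (not source data).
responses = [
    "normal", "1st degree", "normal", "2nd degree", "normal",
    "1st degree", "3rd degree", "normal", "1st degree", "normal",
]


def frequency_distribution(values):
    """Return {category: (count, percentage)} for a list of responses."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.items()
    }


table = frequency_distribution(responses)
for category, (count, percent) in table.items():
    print(f"{category:<12} {count:>3} {percent:>6.1f}%")
```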
Figure 1. Distribution of Respondents by Nutritional Status (bar chart; categories: Normal, 1st degree malnourished, 2nd degree malnourished, 3rd degree malnourished)
Measure of Central Tendency
• Mean
The mean is the average of all values. It is useful in analysing interval and ratio data.
It is derived by adding all the values and dividing the sum by the total
number of cases.
Example: Achievement can be measured by a score on a 100-item test.
In the illustration below, the mean, 90.27, is the average of the scores
obtained by 15 students.
Scores of 15 students in an achievement test
82 83 85 87 87 88 90 91 93 93 94 95 95 95 96
Mean = (82 + 83 + 85 + ... + 96) / 15 = 1354 / 15 = 90.27
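The calculation can be reproduced directly from the 15 listed scores. This is a minimal sketch; only the variable names are assumptions.

```python
# The 15 achievement-test scores listed above.
scores = [82, 83, 85, 87, 87, 88, 90, 91, 93, 93, 94, 95, 95, 95, 96]

total = sum(scores)            # add all the values
average = total / len(scores)  # divide by the number of cases

print(total, round(average, 2))
```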
• Median
The median is the midpoint of a group of interval measures arranged from
highest to lowest.
Example: In the 15 scores below, which are arranged from lowest to highest,
the midpoint is the 8th score counting from either the lowest (82) or the
highest (96). That score, 91, is the median.
Scores: 82 83 85 87 87 88 90 91 93 93 94 95 95 95 96
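A short sketch of locating the midpoint. Picking the middle position by index assumes an odd number of scores, as in this example; the standard-library `statistics.median` also handles an even count by averaging the two middle values.

```python
from statistics import median

scores = [82, 83, 85, 87, 87, 88, 90, 91, 93, 93, 94, 95, 95, 95, 96]

ordered = sorted(scores)
middle_index = len(ordered) // 2      # index 7, i.e. the 8th of 15 scores
middle_score = ordered[middle_index]

print(middle_score, median(scores))
```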
Mode
The mode is the most frequently occurring figure in a set of figures. In actual
situations, modal scores are usually near the middle of the continuum of scores.
Example: In the 15 scores below, 90 is the mode because it occurs three times.
Scores 82 83 85 87 87 88 90 90 90 91 93 93 96 97 97
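The mode can be found by counting how often each score occurs. A minimal sketch using the scores above:

```python
from collections import Counter

scores = [82, 83, 85, 87, 87, 88, 90, 90, 90, 91, 93, 93, 96, 97, 97]

counts = Counter(scores)
# most_common(1) returns the single most frequent value with its count.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```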
• B. Describing the Variance in the Data (Univariate)
The two commonly used measures of variation are the range and the
standard deviation.
Range
The range is a simple measure of variation calculated as the highest value
in a distribution, minus the lowest value plus 1.
Range = Highest value – Lowest Value + 1
Example:
In the sample data below, the highest score is 97, while the lowest is 82. The
range = 97 (the highest score) minus 82 (the lowest score) plus 1 = 15 + 1 = 16.
Scores: 82 83 85 87 87 88 90 90 90 91 93 93 96 97 97
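The range as defined here (highest minus lowest, plus 1) can be sketched as a one-line function. The name `inclusive_range` is illustrative.

```python
def inclusive_range(values):
    """Highest value minus lowest value, plus 1, as defined in the text."""
    return max(values) - min(values) + 1


scores = [82, 83, 85, 87, 87, 88, 90, 90, 90, 91, 93, 93, 96, 97, 97]
score_range = inclusive_range(scores)
print(score_range)
```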
• Standard Deviation (SD)
The standard deviation (SD) measures the average distance of individual
observations from the group mean. It is the square root of the average squared
deviation of each case from the mean.
The steps involved in calculating the standard deviation (SD) are:
1. Calculate the mean of the distribution ($\bar{X}$).
2. Subtract the mean from each score ($X - \bar{X}$).
3. Square each of these deviations, $(X - \bar{X})^2$.
4. Add the squared deviations and divide the sum by the number of scores ($n$). The result is
called the variance.
5. Take the square root of the variance. The result is the SD.
$$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$
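The five steps can be followed one by one in code. This sketch divides by n, as the steps do (the population form of the SD), and applies it to the 15 scores from the range example as an illustrative choice of data; the function name `population_sd` is an assumption.

```python
import math


def population_sd(values):
    """Standard deviation computed by the five steps, dividing by n."""
    n = len(values)
    mean = sum(values) / n                              # step 1
    deviations = [x - mean for x in values]             # step 2
    squared = [d ** 2 for d in deviations]              # step 3
    variance = sum(squared) / n                         # step 4
    return math.sqrt(variance)                          # step 5


scores = [82, 83, 85, 87, 87, 88, 90, 90, 90, 91, 93, 93, 96, 97, 97]
sd = population_sd(scores)
print(round(sd, 2))
```

Python's standard library offers the same calculation as `statistics.pstdev`; `statistics.stdev` divides by n - 1 instead and gives a slightly larger value.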
• B. Analyzing Differences Within The Data
A researcher may want to know whether the difference between two groups
is statistically significant or may have occurred by chance.
1. Difference in proportions
For example, in a survey on the smoking practices of high school students, it
was found that 63 percent of the male students smoke, while only 20 percent of the
female students do. To determine whether the difference between the
proportion of male smokers and the proportion of female smokers is statistically
significant, the Z-test for difference in proportions can be applied. The result of the
analysis, shown in Table 3, indicates a significant difference between the proportions
(Z = 5.87, p = .01).
Table 3. Distribution of Students by Smoking Practices and Sex
____________________________________________________________
Smoking Practices       Male            Female          Z-test
                      No.     %       No.     %       Value    P
____________________________________________________________
Smoking                47    63.0      15    20.0     5.87    .01
Not Smoking            28    37.0      60    80.0     7.12    .01
Total                  75   100.0      75   100.0
____________________________________________________________
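A Z-test for the difference between two proportions can be sketched as follows, using the counts in Table 3. The source does not say which standard-error formula produced its Z value, so this is an assumption: the function offers both the pooled form (a common textbook choice) and the unpooled form, and neither is guaranteed to reproduce the tabled figure exactly. The function name and the `pooled` parameter are hypothetical.

```python
import math


def z_test_proportions(x1, n1, x2, n2, pooled=True):
    """Return (z, two_sided_p) for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        # Standard error under the null hypothesis of one common proportion.
        p = (x1 + x2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        # Standard error using each group's own proportion.
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Smokers: 47 of 75 male students and 15 of 75 female students (Table 3).
z, p_value = z_test_proportions(47, 75, 15, 75)
z_unpooled, _ = z_test_proportions(47, 75, 15, 75, pooled=False)
print(round(z, 2), round(z_unpooled, 2), p_value < 0.01)
```

Under either form the difference is significant at the .01 level, which agrees with the conclusion drawn from Table 3.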
2. Difference between means
A study comparing the performance of male and female college students revealed
that the 129 sample male students obtained a mean grade of 82.34 (SD=1.69),
while the 178 sample female students obtained a mean grade of 81.85 (SD=1.34).
To determine whether the mean grade of the male students significantly differs from
that of the female students, the Z-test for the difference between means can be applied.
The result of the analysis, shown in Table 4, indicates that there is no significant
difference between the two means (Z=1.74, p=.350). This means that the male and
the female students do not significantly differ in terms of their mean college grades.
(For a sample of 30 or less, the t-test is used.)
To analyze the differences among three or more means, the Analysis of Variance
(ANOVA) is used.
Table 4. Comparison of Mean College Performance of Students
_________________________________________________
Sex       No.    Mean Grade    SD      Z-test   Sig.
_________________________________________________
Male      129    82.34         1.69
Female    178    81.85         1.34    1.74     3.54
Total     307    82.05         1.49
_________________________________________________
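A Z-test for the difference between two independent means can be sketched from summary statistics alone. The formula below (standard error from the two group variances, two-sided normal p-value) is an assumption, since the source does not show how the tabled Z was computed; the value this sketch returns may therefore differ from the figure reported in Table 4. The function name is hypothetical.

```python
import math


def z_test_means(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two_sided_p) for the difference between two independent means."""
    # Standard error of the difference, built from each group's variance.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Summary statistics for the male and female students in Table 4.
z, p_value = z_test_means(82.34, 1.69, 129, 81.85, 1.34, 178)
print(round(z, 2), round(p_value, 4))
```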
C. Describing Relationships Between Variables (Bivariate Analysis)
Many studies focus on determining association or relationship between two
variables. The simplest way of finding out whether there is an association
between two variables is by using crosstabulation.
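A cross-tabulation counts how many cases fall into each combination of two variables. The sketch below rebuilds individual records from the counts in Table 3 purely for illustration; the `crosstab` function and the record layout are assumptions.

```python
from collections import Counter

# One (sex, smoking practice) pair per student, rebuilt from the Table 3 counts.
records = (
    [("male", "smoking")] * 47
    + [("male", "not smoking")] * 28
    + [("female", "smoking")] * 15
    + [("female", "not smoking")] * 60
)


def crosstab(pairs):
    """Return {row_category: {column_category: count}} for (row, column) pairs."""
    cells = Counter(pairs)
    rows = sorted({row for row, _ in cells})
    columns = sorted({column for _, column in cells})
    return {row: {column: cells[(row, column)] for column in columns} for row in rows}


table = crosstab(records)
for sex, row in table.items():
    print(sex, row)
```

Reading across a row shows how one group is distributed over the categories of the other variable, which is the first step in judging whether the two variables are associated.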