Statistics in Research
Taguig City University
COLLEGE OF INFORMATION AND COMMUNICATION TECHNOLOGY
Learning Objectives
Data collected are useless and meaningless unless they are properly presented for analysis and
interpretation. All statistical procedures help to describe data.
In this lesson, you will learn the different ways of presenting data, either in tabular or graphical form. These
methods of presentation show important characteristics of the data in a more direct manner
than is possible with statistical analysis alone.
A. Data Collection
Methods of Data Collection
Direct Method. Also referred to as the interview method, which may be structured or unstructured. It is a
person-to-person exchange of ideas between the one soliciting information (interviewer) and the one
supplying the information (interviewee). It is mainly used for small sample sizes.
Indirect Method. Popularly known as the paper-and-pencil method or the questionnaire method. The researcher
prepares questions relevant to the subject of the study.
Registration Method. Also referred to as documentary analysis, where the researcher makes use of data,
facts, or information on file. These documents are records kept as required by a certain law or policy.
They include birth, death, license, and other records.
Experimental Method. This method examines the cause and effect of certain phenomena. Data
are obtained here through a series of experiments that require laboratory results.
B. Sampling Techniques
Probability Sampling.
It is a sampling procedure wherein every element of the population is given a non-zero chance of being
selected as a sample. This means that everyone in the population has a chance of being included
in the sample. It is also known as Random Sampling.
Simple Random Sampling. Selection is done fairly and without bias. The researcher sets no
criteria and remains objective in the selection of samples. Examples: drawing a winning stub from the
tambiolo (raffle drum); selecting numbers from a table of random numbers.
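As a rough illustration of simple random sampling, the sketch below draws 10 elements from a hypothetical population of 100 ID numbers. The population, the sample size, and the seed are assumptions made for the example and are not part of the lesson.

```python
import random

# Hypothetical population: ID numbers 1 to 100 (an assumption for illustration).
population = list(range(1, 101))

rng = random.Random(42)               # fixed seed so the draw can be repeated
sample = rng.sample(population, 10)   # 10 distinct elements, each equally likely

print(sorted(sample))
```

Like a draw from the tambiolo, `random.sample` picks without replacement, so no element can be chosen twice.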
Systematic Sampling. The researcher obtains samples by selecting every nth element from a chosen start, or simply
by following a pattern. The starting point can also be chosen through random selection.
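One common way to carry out systematic sampling is to pick a random start and then take every k-th element. The sketch below assumes a hypothetical population of 100 ID numbers and a sample size no larger than the population. The function name `systematic_sample` is made up for this example.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Take every k-th element after a random start within the first k elements."""
    k = len(population) // sample_size    # sampling interval (the "nth" step)
    start = rng.randrange(k)              # random start: index 0 to k - 1
    return population[start::k][:sample_size]

# Hypothetical population: ID numbers 1 to 100 (an assumption for illustration).
population = list(range(1, 101))
sample = systematic_sample(population, 10, random.Random(7))

print(sample)
```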
Stratified Sampling. The population is divided into strata (subgroups), and samples are selected from
each stratum in equal or proportional numbers. This technique is commonly used when there are several sources of
data.
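A minimal sketch of proportional stratified sampling follows. The strata (students grouped by year level), their sizes, and the function name `proportional_stratified_sample` are hypothetical and chosen only to make the proportions easy to check.

```python
import random

def proportional_stratified_sample(strata, total_size, rng):
    """Draw from each stratum in proportion to its share of the population."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounded share; rounding can make the overall total differ slightly.
        n = round(total_size * len(members) / population_size)
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical strata (an assumption for illustration): 50, 30, and 20 students.
strata = {
    "first_year": [f"A{i}" for i in range(50)],
    "second_year": [f"B{i}" for i in range(30)],
    "third_year": [f"C{i}" for i in range(20)],
}
sample = proportional_stratified_sample(strata, 20, random.Random(1))

print({name: len(chosen) for name, chosen in sample.items()})
# {'first_year': 10, 'second_year': 6, 'third_year': 4}
```

For equal allocation instead of proportional allocation, each stratum would simply receive the same number of samples.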
Cluster Sampling. This technique selects samples in groups. The groups (clusters) are
chosen at random. When a group is chosen, all of its members, regardless of who is in the group, are
considered as samples.
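Cluster sampling can be sketched as choosing whole groups at random and keeping every member of each chosen group. The clusters and names below are hypothetical.

```python
import random

# Hypothetical clusters (an assumption for illustration): class sections and members.
clusters = {
    "section_1": ["Ana", "Ben", "Carla"],
    "section_2": ["Dino", "Ella"],
    "section_3": ["Fe", "Gio", "Hana", "Ivan"],
    "section_4": ["Jun", "Kai"],
}

rng = random.Random(3)
chosen = rng.sample(sorted(clusters), 2)   # randomly pick 2 whole clusters

# Everyone in a chosen cluster is included, regardless of who they are.
sample = [person for name in chosen for person in clusters[name]]

print(chosen, sample)
```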
Non-Probability Sampling.
It is a sampling technique wherein not every element of the population is given a chance of being
selected as a sample. The researcher's own preference influences which samples are chosen. It is otherwise known as
non-random sampling.
Purposive Sampling. It is a non-random sampling technique in which the researcher chooses samples
according to criteria and rules that he has defined.
Quota Sampling. The researcher or investigator limits the number of samples to the number required for
the subject of his study.
Convenience Sampling. The researcher chooses his most preferred location or venue for
conducting the study. The researcher specifies the place and time where he can collect his data.
C. Data Presentation
Textual Presentation
Collected data are presented in paragraph form if they are purely qualitative or when very few numbers are
involved. This method is commonly adopted by researchers conducting qualitative research.
Tabular Presentation
A more effective way of presenting data is by means of a table, which arranges the data in rows and
columns. Data presented in tabular form can be easily used for comparison and emphasis. One can
easily draw relationships from the presented table.
A statistical table has four components: table heading, body, stubs, and box heads.
Statistics often uses graphs for better analysis of variables. Two basic types of graphs for
analyzing variables are:
- Histogram (bar chart)
- Pie Chart
Histogram is a standard graph where the variants of the variable are represented on one axis and
their frequencies on the other axis. Individual frequency values are then displayed as bars
(boxes, vectors, logs, cones, etc.).
Pie Chart represents the relative frequencies of the individual variants of a variable. Each frequency is presented
as a proportional sector of a circle.
Bar Graph is used to represent discrete data, so instead of being joined as in the histogram, the bars
are separated. The length of each bar represents the frequency within the given class. The width of the bars
is arbitrary, but all bars must have the same width, as in the histogram.
Frequency Polygon is a line chart. The frequency is placed along the vertical axis and the
individual variants are placed along the horizontal axis. The plotted points are connected by a line.
Ogive is a graphical presentation of cumulative frequencies or relative cumulative frequencies. The
vertical axis is the cumulative frequency or relative cumulative frequency. The horizontal axis
represents the variants. The graph always starts at zero at the lowest variant and ends at
the total frequency.
Pareto Graph is a bar chart for a qualitative variable with the bars arranged by frequency. The
variants are on the horizontal axis and are sorted from the highest frequency (importance) to the lowest.
Stem-and-Leaf Plot is a device for presenting quantitative data in a graphical format, similar to a
histogram, to assist in visualizing the shape of a distribution.
To construct a stem-and-leaf display, the observations must first be sorted in ascending order. When
working by hand, this is most easily done by constructing a draft of the stem-and-leaf display with the
leaves unsorted, then sorting the leaves to produce the final stem-and-leaf display.
Here is the sorted set of data values that will be used in the following example:
46 47 49 63 64 66 68 68 72
72 75 76 81 84 88 106
In this example, the leaf represents the ones place, and the stem represents the rest of the
number. The stem-and-leaf display is drawn with two columns separated by a vertical line. The stems
are listed to the left of the vertical line. It is important that each stem is listed only once and that no
numbers are skipped, even if it means that some stems have no leaves.
The leaves are listed in increasing order in a row to the right of each stem.
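The construction described above can be sketched in a few lines of Python using the example data set. The function name `stem_and_leaf` is made up for this illustration, and the sketch assumes non-negative whole numbers with the ones digit as the leaf.

```python
from collections import defaultdict

data = [46, 47, 49, 63, 64, 66, 68, 68, 72,
        72, 75, 76, 81, 84, 88, 106]

def stem_and_leaf(values):
    """Return one display line per stem; the leaf is the ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # List every stem from lowest to highest, even those with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return lines

for line in stem_and_leaf(data):
    print(line)
#  4 | 6 7 9
#  5 |
#  6 | 3 4 6 8 8
#  7 | 2 2 5 6
#  8 | 1 4 8
#  9 |
# 10 | 6
```

Stems 5 and 9 appear with no leaves because no stem may be skipped.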
D. Frequency Distribution
Frequency Distribution is an arrangement of data showing the frequency of occurrence of the different
values of the variable.
Frequency Distribution Table is the tabular arrangement of data by classes or categories together with
their corresponding frequencies.
Constructing Frequency Distribution Table
Suppose we have collected the raw data shown below:
Given: 70 83 87 76 80 87 75 84 85
76 81 82 89 77 84 86 71 80
80 79 84 86 93 83 85 88 72
84 84 92
1. Find the Range (R) of values. Get the difference of the highest value (HV) and the lowest value (LV).
R = HV - LV
R = 93 - 70
R = 23
2. Determine the desired Class Interval (CI). The ideal number of class intervals is somewhere
between 5 and 15, preferably an odd number of class intervals. But the more scientific way is applying the pattern:
CI = 3.33 + log n
= 3.33 + log 30
= 3.33 + 1.4771
= 4.81 or 5
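Steps 1 and 2 can be checked with a short computation on the given data. The sketch uses the lesson's formula exactly as written and assumes that "log" means the base-10 logarithm, which matches log 30 = 1.4771. Note that Sturges' rule, as commonly stated (1 + 3.322 log n), is a different formula and would suggest about 6 classes for n = 30.

```python
import math

data = [70, 83, 87, 76, 80, 87, 75, 84, 85,
        76, 81, 82, 89, 77, 84, 86, 71, 80,
        80, 79, 84, 86, 93, 83, 85, 88, 72,
        84, 84, 92]

n = len(data)
value_range = max(data) - min(data)   # R = HV - LV = 93 - 70
raw_ci = 3.33 + math.log10(n)         # lesson's formula, log taken as base 10
ci = round(raw_ci)                    # 4.81 is rounded to 5

print(n, value_range, round(raw_ci, 2), ci)
# 30 23 4.81 5
```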
3. Compute for Class Size (i). Divide the computed range (R) by the desired computed class interval (CI).
i = R / CI
= 23 / 5 = 4.6 or 5
4. Construct a frequency table by making class intervals, starting with the lowest value as the lower
limit of the first class interval, then add the computed class size (i) to obtain the lower limit of the next class
interval. Continue adding the class size to the lower limits until you reach the desired number of class intervals (CI).
Get the upper limit of each class interval by subtracting one from the lower limit of the next class
interval.
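Steps 3 and 4 can be sketched as follows, using the values from the worked example. Rounding the class size up with `math.ceil` is an assumption made here so that the intervals are sure to cover the whole range. The lesson only shows 4.6 becoming 5.

```python
import math

lowest, value_range, ci = 70, 23, 5   # LV, R, and CI from the worked example

class_size = math.ceil(value_range / ci)   # 23 / 5 = 4.6, taken as 5

# Lower limits: start at the lowest value and keep adding the class size.
lower_limits = [lowest + k * class_size for k in range(ci)]
# Upper limit = lower limit of the next class minus one (whole-number data assumed).
upper_limits = [ll + class_size - 1 for ll in lower_limits]

print(list(zip(lower_limits, upper_limits)))
# [(70, 74), (75, 79), (80, 84), (85, 89), (90, 94)]
```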
5. Determine the number of data (frequency) for every class interval by tallying the raw data.
6. Write the obtained frequency (f) of each class interval by counting the tally marks.
7. Determine the Class mark (X) of each class interval. Add the lower limit (LL) and the upper limit (UL),
then divide the sum by 2 to get the midpoint.
8. Determine the class boundaries (CB), or true class limits, by subtracting 0.5 from every lower limit and adding
0.5 to every upper limit.
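Steps 5 to 8 can be sketched for the example data as follows. The class intervals are the ones obtained in step 4, and the tuple layout of each row is only a convenience for this illustration.

```python
data = [70, 83, 87, 76, 80, 87, 75, 84, 85,
        76, 81, 82, 89, 77, 84, 86, 71, 80,
        80, 79, 84, 86, 93, 83, 85, 88, 72,
        84, 84, 92]
intervals = [(70, 74), (75, 79), (80, 84), (85, 89), (90, 94)]

rows = []
for ll, ul in intervals:
    f = sum(ll <= x <= ul for x in data)   # tally: count values inside the class
    mark = (ll + ul) / 2                   # class mark X = (LL + UL) / 2
    boundaries = (ll - 0.5, ul + 0.5)      # class boundaries (true limits)
    rows.append((ll, ul, f, mark, boundaries))

for row in rows:
    print(row)
# (70, 74, 3, 72.0, (69.5, 74.5))
# (75, 79, 5, 77.0, (74.5, 79.5))
# (80, 84, 12, 82.0, (79.5, 84.5))
# (85, 89, 8, 87.0, (84.5, 89.5))
# (90, 94, 2, 92.0, (89.5, 94.5))
```

The frequencies add up to 30, the total number of observations.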
9. Determine the less than cumulative frequency (< F) and the greater than cumulative frequency (> F). To
determine the less than cumulative frequencies, write the first class frequency (f) under the column < F,
then add the frequency of the next class interval to obtain the 2nd < F. To this cumulative sum, add the third
class frequency to obtain the 3rd < F. Continue the process until you reach the last class interval.
To determine the greater than cumulative frequencies, write the total number of data collected (n) under the
column > F as the 1st > F. Subtract the first class frequency to determine the 2nd > F, then subtract the
second class frequency to determine the 3rd > F. Continue performing the operation
until the last class interval is reached.
10. Obtain the relative frequencies (RF) to determine the percentage distribution of frequencies. Divide the
class frequency (f) of each class interval by the total number of data (n), then multiply by 100.
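Steps 9 and 10 can be sketched as follows. The frequencies are the tallies of the 30 example values in the classes 70-74 through 90-94, and the variable names are chosen only for this illustration.

```python
from itertools import accumulate

frequencies = [3, 5, 12, 8, 2]   # tallies for classes 70-74, 75-79, 80-84, 85-89, 90-94
n = sum(frequencies)             # total number of data

# < F: running total starting from the first class.
less_than_cf = list(accumulate(frequencies))
# > F: start at n, then subtract each preceding class frequency in turn.
greater_than_cf = [n - c + f for c, f in zip(less_than_cf, frequencies)]
# RF: class frequency divided by n, times 100.
relative_freq = [round(100 * f / n, 2) for f in frequencies]

print(less_than_cf)      # [3, 8, 20, 28, 30]
print(greater_than_cf)   # [30, 27, 22, 10, 2]
print(relative_freq)     # [10.0, 16.67, 40.0, 26.67, 6.67]
```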
Based on the table above, notice that 70, 75, 80, 85, and 90 are called lower limits (LL) and
74, 79, 84, 89, and 94 are called upper limits (UL).
Definition of Terms
Class Interval (CI) – a grouping or category defined by a lower limit and an upper limit.
Class Size (i) – refers to the quotient of the computed range and the desired number of class intervals.
Class Frequency (f) – refers to the number of observations belonging to a class interval, or the number of items within a category.
Class Boundaries (CB) – the true limits, situated halfway between the upper limit of one interval and the lower limit of the next interval.
These are more precise expressions of the class limits, extending them by 0.5 on each side.
Class Mark (X) – refers to the midpoint of a class interval. It is obtained by adding the lower and upper limits and dividing the sum by 2.
Cumulative Frequency – the total number of observations that have values less than or equal to a specified amount.
Relative Frequency (RF) – the percentage of the total number of observations that falls in each class interval.
Exercise 2
1. For each of the following data sets, construct a complete frequency distribution table.
35 58 43 80 48 85 42 39 63 44 35
54 38 63 62 65 37 76 46 34 34 45
36 44 42 47 51 40 31 80 54 50 50
34 50

84 81 74 92 80 88 98 79
82 85 97 82 89 84 86 91
85 87 95 90 90 84 93 92
88 85 86 90 86 89 88 91
88 98 96 94 83 92 95 87