LESSON 2 Data Collection Presentation

Taguig City University
COLLEGE OF INFORMATION AND COMMUNICATION TECHNOLOGY
DS 103 – STATISTICS with SPSS
Statistics in Research

Lesson 2: DATA COLLECTION AND PRESENTATION
Learning Objectives
• State the different methods of collecting and presenting data.
• Differentiate probability from non-probability sampling.
• Construct a frequency distribution table.
• Enumerate the different graphical presentations.
Collected data are useless and meaningless unless they are properly presented for analysis and
interpretation. All statistical procedures help to describe data.
In this lesson, you will learn the different ways of presenting data, either tabular or graphical. These
methods of presentation reveal important characteristics of the data in a more direct manner
than is possible using any statistical analysis.
A. Data Collection

Methods of Data Collection

Direct Method. Referred to as the interview, which may be structured or unstructured. This is a method
where there is a person-to-person exchange of ideas between the one soliciting information (the
interviewer) and the one supplying the information (the interviewee). It is mainly used for a small
sample size.
Indirect Method. Popularly known as the paper-and-pencil method or the questionnaire method. The
researcher has to prepare questions relevant to the subject of the study.

Registration Method. Referred to as documentary analysis, where the researcher makes use of the data,
facts, or information on file. These documents are records whose keeping is enforced by a certain law or
policy, such as birth and death records, licenses, and other records.
Observation Method. Data pertaining to the behavior of an individual or a group of individuals at the
time of occurrence of a given situation are best obtained by direct observation. Subjects may be observed
individually or collectively, depending on the target of the investigator. This method is also used if the
objects of the study can neither talk nor write, like plants and animals.
Experimental Method. This method examines the cause and effect of certain phenomena. Data
are obtained through a series of experiments that produce laboratory results.
B. Sampling Techniques

Probability Sampling

It is a sampling procedure wherein every element of the population is given a non-zero chance of being
selected as a sample. This means that everyone in the population has a chance to be included
in the sample. It is also known as random sampling.

• Simple Random Sampling. Selection is done fairly and without bias. The researcher sets no
criteria and remains objective in the selection of samples. Examples: drawing a winning stub from a
tambiolo (raffle drum); selecting numbers from a table of random numbers.
• Systematic Sampling. The researcher obtains samples by taking every nth element or by simply
following a set pattern; the starting point can also be chosen through random selection.
• Stratified Sampling. Selection of samples in this sampling technique can be done by equal or
proportional strata. This technique is commonly used, particularly if there are several sources of
data.
• Cluster Sampling. This technique is done by choosing samples in groups. Selection is done randomly
by cluster. When a group is chosen, everyone in the group, regardless of who they are, is
considered part of the sample.
• Multistage Sampling. This technique refers to the selection of samples in several stages of
sampling.
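The single-stage techniques above can be sketched in a few lines of Python. This is only an illustrative
sketch and not part of the lesson: the population of 100 numbered students, the two strata, the five
sections, and all function names are assumptions made up for the example. Only the standard random
module is used.

```python
import random

def simple_random(population, n, rng):
    """Every element has the same chance of being drawn."""
    return rng.sample(population, n)

def systematic(population, n, rng):
    """Random start, then every k-th element."""
    k = len(population) // n              # sampling interval
    start = rng.randrange(k)              # random start inside the first interval
    return [population[start + j * k] for j in range(n)]

def stratified_proportional(strata, n, rng):
    """Draw from each stratum in proportion to its size.
    Rounding can make the total differ from n when strata are uneven."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample

def cluster(clusters, n_clusters, rng):
    """Pick whole groups at random; everyone in a chosen group is sampled."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(103)                  # fixed seed so the run is repeatable
students = list(range(1, 101))            # hypothetical population: IDs 1 to 100
strata = {"first_year": students[:60], "second_year": students[60:]}
sections = {f"S{j}": students[j * 20:(j + 1) * 20] for j in range(5)}

srs = simple_random(students, 10, rng)
sys_sample = systematic(students, 10, rng)
strat = stratified_proportional(strata, 10, rng)
clus = cluster(sections, 2, rng)
print(srs, sys_sample, strat, clus, sep="\n")
```

Multistage sampling would chain these steps, for example choosing sections with cluster() and then
drawing students within the chosen sections with simple_random().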

Non-Probability Sampling

It is a sampling technique wherein not every element of the population is given a chance of being
selected as a sample. The researcher's own preference for certain samples influences the selection. It is
otherwise known as non-random sampling.
• Purposive Sampling. It is a non-random sampling technique of choosing samples where the researcher
defines his own criteria and rules.
• Quota Sampling. The researcher or investigator limits the number of samples to the number required for
the subject of his study.
• Convenience Sampling. The researcher chooses his most preferred location or venue where he can
conduct his study. The researcher specifies the place and time where he can collect his data.
C. Data Presentation
Textual Presentation
Data collected are presented in paragraph form if they are purely qualitative or when very few numbers
are involved. This method is commonly adopted by researchers conducting qualitative research.
Tabular Presentation
A more effective way of presenting data is by means of a table, which arranges the data in rows and
columns. Data presented in tabular form can be easily used for comparison and emphasis. One can
easily draw relationships from the presented table.
A statistical table has four components: table heading, body, stubs, and box heads.
Graphical Presentation of Data
Statistics often uses graphs for better analysis of variables. Two basic types of graphs for
analyzing variables are:
- Histogram
- Pie Chart
The bar graph, frequency polygon, ogive, Pareto graph, and stem-and-leaf plot are also widely used.
• Histogram. A standard graph where the variants of the variable are represented on one axis and
their frequencies on the other axis. The individual frequency values are then displayed as bars
(boxes, vectors, logs, cones, etc.).
• Pie Chart. Represents the relative frequencies of the individual variants of a variable. Frequencies
are presented as proportional sectors of a circle.
• Bar Graph. Used to represent discrete data, so instead of being joined, as in the histogram, the bars
are separated. The length of each bar represents the frequency within the given class. The width of the
bars is arbitrary; however, all bars must be of the same width, much as in the histogram.
• Frequency Polygon. A line chart. The frequency is placed along the vertical axis and the
individual variants are placed along the horizontal axis. The plotted values are connected by a line.
• Ogive. A graphical presentation of cumulative frequencies or relative cumulative frequencies. The
vertical axis is the cumulative frequency or relative cumulative frequency. The horizontal axis
represents the variants. The graph always starts at zero at the lowest variant and ends at
the total frequency.
• Pareto Graph. A bar chart for a qualitative variable with the bars arranged by frequency. The
variants are on the horizontal axis and are sorted from the highest importance to the lowest.
• Stem-and-Leaf Plot. A device for presenting quantitative data in graphical format, similar to a
histogram, to assist in visualizing the shape of a distribution.
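Two of the graphs described above rest on simple arithmetic that can be checked before any drawing is
done. The sketch below is illustrative only: the three categories and their tallies are hypothetical and
do not come from the lesson. It computes the sector angle of each pie chart slice (relative frequency
times 360 degrees) and the bar order of a Pareto graph (highest frequency first).

```python
# Hypothetical tallies for a qualitative variable with three variants.
counts = {"A": 4, "B": 10, "C": 6}
total = sum(counts.values())

# Pie chart: each sector's angle is its relative frequency times 360 degrees.
angles = {variant: 360 * f / total for variant, f in counts.items()}

# Pareto graph: bars are arranged from the highest frequency to the lowest.
pareto_order = sorted(counts, key=counts.get, reverse=True)

print(angles)
print(pareto_order)
```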
To construct a stem-and-leaf display, the observations must first be sorted in ascending order. When
working by hand, this is most easily done by constructing a draft of the stem-and-leaf display with the
leaves unsorted, then sorting the leaves to produce the final stem-and-leaf display.

Here is the sorted set of data values that will be used in the following example:

46 47 49 63 64 66 68 68 72
72 75 76 81 84 88 106

In this example, the leaf represents the ones place, and the stem represents the rest of the
number. The stem-and-leaf display is drawn with two columns separated by a vertical line. The stems
are listed to the left of the vertical line. It is important that each stem is listed only once and that no
numbers are skipped, even if it means that some stems have no leaves.
The leaves are listed in increasing order in a row to the right of each stem.
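The construction described above can be sketched in Python using the sixteen data values from this
example. The function name stem_and_leaf is an assumption made for illustration; the rule it applies
(leaf = ones place, stem = the remaining digits, every stem listed even when it has no leaves) is the one
stated in the text.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the display as a list of text rows, one row per stem."""
    leaves = defaultdict(list)
    for v in sorted(values):                 # sorting first keeps leaves in order
        stem, leaf = divmod(v, 10)           # leaf = ones place, stem = the rest
        leaves[stem].append(leaf)
    low, high = min(leaves), max(leaves)
    width = len(str(high))
    rows = []
    for stem in range(low, high + 1):        # no stem is skipped, even if empty
        digits = " ".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>{width}} | {digits}".rstrip())
    return rows

data = [46, 47, 49, 63, 64, 66, 68, 68, 72,
        72, 75, 76, 81, 84, 88, 106]
rows = stem_and_leaf(data)
print("\n".join(rows))
# Output:
#  4 | 6 7 9
#  5 |
#  6 | 3 4 6 8 8
#  7 | 2 2 5 6
#  8 | 1 4 8
#  9 |
# 10 | 6
```

Stems 5 and 9 have no leaves but are still listed, and 106 is split into stem 10 and leaf 6.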
D. Frequency Distribution

Frequency Distribution is an arrangement of data showing the frequency of occurrence of the different
values of the variable.
Frequency Distribution Table is the tabular arrangement of data by classes or categories together with
their corresponding frequencies.
Constructing Frequency Distribution Table
Suppose we have collected raw data as shown below:
Given: 70 83 87 76 80 87 75 84 85

76 81 82 89 77 84 86 71 80
80 79 84 86 93 83 85 88 72
84 84 92

Steps
1. Find the Range (R) of values. Get the difference of the highest value (HV) and the lowest value (LV).
R = HV - LV
R = 93 - 70
R = 23
2. Determine the desired number of class intervals (CI). The ideal number of class intervals is somewhere
between 5 and 15, preferably an odd number. But the more scientific way is applying the pattern
(log is the base-10 logarithm and n is the number of data values):
CI = 3.33 + log n
= 3.33 + log 30
= 3.33 + 1.4771
= 4.81 or 5
Note that this pattern differs from the commonly cited Sturges' rule, k = 1 + 3.322 log n, which would
give about 5.9 for n = 30. This lesson uses 5 class intervals throughout.
3. Compute the Class Size (i). Divide the computed range (R) by the desired number of class
intervals (CI).
i = R / CI
= 23 / 5 = 4.6 or 5
4. Construct a frequency table by making class intervals, starting with the lowest value as the lower
limit of the first class interval, then add the computed class size (i) to obtain the lower limit of the next
class interval. Continue adding the class size to the lower limits until you reach the desired number of
class intervals (CI). Get the upper limit of each class interval by subtracting one from the lower limit of
the next class interval.
5. Determine the number of data (frequency) for every class interval by tallying the raw data.
6. Write the obtained frequency (f) for each class interval by counting the tally marks.
7. Determine the Class Mark (X) of each class interval. Add the lower limit (LL) and the upper limit (UL),
then divide the sum by 2 to get the midpoint.
8. Determine the class boundaries (CB), or true class limits, by subtracting 0.5 from every lower limit and
adding 0.5 to every upper limit.
9. Determine the less than cumulative frequency (< F) and the greater than cumulative frequency (> F). To
determine the less than cumulative frequencies, write the first class frequency (f) under the column < F,
then add the frequency of the second class interval to obtain the 2nd < F. To this cumulative sum, add the
third class frequency to obtain the 3rd < F. Continue the process until you reach the last class interval.
To determine the greater than cumulative frequencies, write the total number of data collected (n) under the
column > F as the 1st > F. Subtract the first class frequency from it to obtain the 2nd > F, then subtract the
second class frequency from the 2nd > F to obtain the 3rd > F. Continue the operation until the last class
interval is reached.
10. Obtain the relative frequencies (RF) to determine the percentage distribution of frequencies. Divide the
class frequency (f) of each class interval by the total number of data (n), then multiply by 100.
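The ten steps above can be traced with a short Python sketch applied to the 30 raw values given
earlier. This is an illustration under stated assumptions, not the lesson's own procedure in code form:
the variable names follow the symbols used in the steps, the class size is rounded up so that the classes
cover the whole range (the lesson rounds 4.6 to 5), and only the standard math module is used.

```python
import math

data = [70, 83, 87, 76, 80, 87, 75, 84, 85,
        76, 81, 82, 89, 77, 84, 86, 71, 80,
        80, 79, 84, 86, 93, 83, 85, 88, 72,
        84, 84, 92]

n = len(data)
lowest, highest = min(data), max(data)
R = highest - lowest                         # step 1: R = HV - LV
CI = round(3.33 + math.log10(n))             # step 2: the lesson's pattern, rounded
i = math.ceil(R / CI)                        # step 3: class size, rounded up (assumption)

table = []
less_than = 0
for j in range(CI):
    LL = lowest + j * i                      # step 4: lower limit of this class
    UL = LL + i - 1                          # one less than the next lower limit
    f = sum(LL <= x <= UL for x in data)     # steps 5-6: tally and count
    greater_than = n - less_than             # step 9: > F covers this class and above
    less_than += f                           # step 9: < F covers this class and below
    table.append({
        "class": f"{LL}-{UL}",
        "f": f,
        "X": (LL + UL) / 2,                  # step 7: class mark
        "CB": f"{LL - 0.5}-{UL + 0.5}",      # step 8: class boundaries
        "<F": less_than,
        ">F": greater_than,
        "RF": round(100 * f / n, 2),         # step 10: percent of n
    })

print(*table[0].keys(), sep="\t")
for row in table:
    print(*row.values(), sep="\t")
```

For the lesson's data this gives R = 23, CI = 5, and i = 5, with class frequencies of 3, 5, 12, 8, and 2
for the classes 70-74, 75-79, 80-84, 85-89, and 90-94.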


Based on the table above, notice that 70, 75, 80, 85, and 90 are called lower limits (LL) and
74, 79, 84, 89, and 94 are called upper limits (UL).
Try to answer the following:
1. Which class has the greatest frequency?
2. Which class has the least frequency?
3. What limits does the 85 - 89 class interval have?
4. How many respondents got 80 and above?
5. How many respondents got 89 and below?
6. About what percent belongs to 75 - 79?
7. What is the midpoint of 80 - 84?
Definition of Terms
• Range (R) – the difference between the highest and lowest values.
• Class Interval (CI) – a grouping or category defined by a lower limit and an upper limit.
• Class Size (i) – the quotient of the computed range and the desired number of class intervals.
• Class Frequency (f) – the number of observations belonging to a class interval, or the number of items within a category.
• Class Boundaries (CB) – the true limits, situated between the upper limit of one interval and the lower limit of the next interval.
These are more precise expressions of the class limits, extending them by 0.5 on each side.
• Class Mark (X) – the midpoint of a class interval. It is obtained by adding the lower and upper limits and dividing the sum by 2.
• Cumulative Frequency – the total number of observations that have values less than or equal to a specified amount.
• Relative Frequency (RF) – the percentage of the total frequency that falls in each class interval.
Exercise 2
1. Construct a complete frequency distribution table for the following data.
35 58 43 80 48 85 42 39 63 44 35
54 38 63 62 65 37 76 46 34 34 45
36 44 42 47 51 40 31 80 54 50 50
34 50

Find the following

1. Class size
2. Number of classes
3. Class mark of the 3rd class
4. Lower limit of the 4th class
5. Upper class boundary of the third class
6. Total frequency
7. Highest frequency
8. Class that comprises 30% of the distribution
9. Class with the highest frequency
10. Class boundary of the class with lowest frequency.
2. The grades given to you are the following:
84 81 74 92 80 88 98 79
82 85 97 82 89 84 86 91
85 87 95 90 90 84 93 92
88 85 86 90 86 89 88 91
88 98 96 94 83 92 95 87
From this data, prepare the following:

1. Stem-and-leaf display
2. Complete frequency distribution table using 5 class intervals
3. Histogram
