
Introduction to Quantitative Research

A. Data and Variables


B. The Nature of Quantitative Research (Done)
C. Descriptive and Inferential Statistics
Data and Variables
Research data is any information that has been
collected, observed, generated or created to
validate original research findings.
Type of Research Data
1. documents, spreadsheets
2. laboratory notebooks, field notebooks,
diaries
3. questionnaires, transcripts, codebooks
4. audiotapes, videotapes
5. photographs, films
6. test responses
7. slides, artefacts, specimens, samples
8. collections of digital outputs
9. data files
10. database contents (video, audio, text,
images)
11. models, algorithms, scripts
12. contents of an application (input,
output, logfiles for analysis software,
simulation software, schemas)
13. methodologies and workflows
14. standard operating procedures and
protocols
15. Non-digital data
Sources of Research Data
Research data can be generated for different purposes
and through different processes.
• Observational data is captured in real time and is usually irreplaceable. Examples are sensor data, survey data, sample data, and neuro-images.
• Experimental data is captured from lab equipment. It is often reproducible, but reproducing it can be expensive. Examples are gene sequences, chromatograms, and toroid magnetic field data.
• Simulation data is generated from test models, where the model and metadata are more important than the output data. Examples are climate models and economic models.
• Derived or compiled data has been transformed from pre-existing data points. It is reproducible if lost, but reproducing it would be expensive. Examples are data mining, compiled databases, and 3D models.
• Reference or canonical data is a static or organic conglomeration or collection of smaller (peer-reviewed) datasets, most probably published and curated. Examples are gene sequence databanks, chemical structures, and spatial data portals.
Primary Sources are immediate, first-hand
accounts of a topic, from people who had a direct
connection with it.
Primary sources can include:
•Texts of laws and other original documents.
•Newspaper reports, by reporters who witnessed an
event or who quote people who did.
•Speeches, diaries, letters and interviews - what the
people involved said or wrote.
•Original research.
•Datasets, survey data, such as census or economic
statistics.
•Photographs, video, or audio that capture an event.
Secondary Sources are one step removed from primary sources, though they often quote or otherwise use primary sources. They can cover the same topic, but add a layer of interpretation and analysis.
Secondary sources can include:
•Most books about a topic.
•Analysis or interpretation of data.
•Scholarly or other articles about a topic,
especially by people not directly involved.
•Documentaries (though they often include
photos or video portions that can be
considered primary sources).
References
Research Data Management explained. Retrieved from https://library.leeds.ac.uk/info/14062/research_data_management/61/research_data_management_explained
Primary Sources: A Research Guide. Retrieved from https://umb.libguides.com/PrimarySources/secondary
Variables
A variable is defined as an attribute of an object of study.
• You need to know which types of variables
you are working with in order to choose
appropriate statistical tests and interpret
the results of your study.
Types of Variables
Data is a specific measurement of a variable –
it is the value you record in your data sheet.
Data is generally divided into two categories:

1. Quantitative data represents amounts.


2. Categorical data represents groupings.
A variable that contains quantitative data is a
quantitative variable; a variable that contains
categorical data is a categorical variable.
Each of these types of variable can be broken
down into further types.
Quantitative variables
When you collect quantitative data, the numbers you
record represent real amounts that can be added,
subtracted, divided, etc. There are two types of
quantitative variables: discrete and continuous.
Discrete quantitative variables (also called integer variables) are counts of individual items or values.
• Number of students in a class
• Number of different tree species in a forest
Continuous quantitative variables (also called ratio variables) are measurements of continuous or non-finite values.
• Distance
• Volume
• Age
Categorical variables represent groupings of some
kind. They are sometimes recorded as numbers, but the
numbers represent categories rather than actual
amounts of things.
Types of categorical variables:
1. Binary variables: yes/no outcomes. Examples: heads/tails in a coin flip; win/lose in a football game; true/false.
2. Nominal variables: groups with no rank or order between them. Examples: gender, colors, brands.
3. Ordinal variables: groups that are ranked in a specific order. Examples: level of performance, finishing place in a race, rating scale responses in a survey.*
* An ordinal variable can also be used as a quantitative variable if the scale is numeric and doesn’t need to be kept as discrete integers. For example, star ratings on product reviews are ordinal (1 to 5 stars), but the average star rating is quantitative.
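The star-rating remark can be illustrated with a short sketch. The ratings below are hypothetical values made up for this illustration; they do not come from the source. Counting how many reviews fall in each star level treats the ratings as ordered categories, while averaging them treats the same numbers as quantities.

```python
from collections import Counter
from statistics import mean

# Hypothetical star ratings (1 to 5 stars), invented for illustration only.
ratings = [5, 4, 4, 3, 5, 1, 4]

# Ordinal view: how many reviews fall in each ranked category.
counts_per_star = Counter(ratings)

# Quantitative view: the average rating is a real amount, not a category.
average_rating = mean(ratings)

for stars in sorted(counts_per_star):
    print(f"{stars} star(s): {counts_per_star[stars]} review(s)")
print(f"average rating: {average_rating:.2f}")
```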
Independent vs Dependent vs Control Variables
Independent variables are the variables you manipulate in order to affect the outcome of an experiment.
Dependent variables are the variables that represent the outcome of the experiment.
Control variables are the variables that are held constant throughout the experiment.
You manipulate the independent variable (the one you
think might be the cause) and then measure the
dependent variable (the one you think might be the
effect) to find out what this effect might be.
You will probably also have variables that you hold
constant (control variables) in order to focus on your
experimental treatment.
When you do correlational research, the terms
“dependent” and “independent” don’t apply, because
you are not trying to establish a cause and effect
relationship.
References
Bevans, Rebecca. (2021). Understanding types of variables. Retrieved from https://www.scribbr.com/methodology/types-of-variables/
Statistical analysis
Descriptive statistics is a method concerned with collecting, describing, and analyzing a set of data without drawing conclusions about a larger group.
Inferential statistics is a method concerned
with the analysis of a subset of data
leading to predictions or inferences about
the entire set of data or population.
Inferential Statistics
[Figure: a population (N) and a sample (n), connected by inferences and generalizations.]
Source: Pilot Training Course on Teaching Basic Statistics by Statistical Research and Training Center, Philippine Statistical Association, Inc.
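The relationship between a population (N) and a sample (n) can be sketched in a few lines. Everything in this block is an assumption made for illustration: the population is simulated, the sizes N = 1000 and n = 30 are arbitrary, and the random seed is fixed only so the run is repeatable. The sample mean is used as an estimate of the population mean, which is the kind of inference the definition above describes.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 scores between 50 and 100.
population = [random.randint(50, 100) for _ in range(1000)]

# Draw a simple random sample of n = 30 without replacement.
sample = random.sample(population, 30)

population_mean = mean(population)  # usually unknown in practice
sample_mean = mean(sample)          # what the researcher actually computes

print(f"N = {len(population)}, n = {len(sample)}")
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

In real research the population mean is usually not available, so the sample mean is reported together with a statement of how uncertain it is.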
Descriptive Statistics
Measure of Central Tendency
A single value that is used to identify the
“center” of the data
1. Mean
2. Median
3. Mode
Mean is the most common measure of the center. It is also known as the arithmetic average.
Population Mean
$\mu = \frac{\sum x}{N}$
Sample Mean
$\bar{x} = \frac{\sum x}{n}$
Example. The owner of XYZ restaurant recorded the number of students who dine in: 10, 14, 9, 17, 12, 18, 25, 20, 22. Find the mean value of the given data. Interpret the answer.
x: 9, 10, 12, 14, 17, 18, 20, 22, 25
Ʃx = 147
n = 9
x̄ = Ʃx / n = 147/9 = 16.33
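The same arithmetic can be checked with a short Python sketch. The variable names are chosen here for illustration; the data are the nine counts from the example.

```python
# Counts of students who dined in, copied from the example above.
students = [10, 14, 9, 17, 12, 18, 25, 20, 22]

total = sum(students)   # sum of all scores
n = len(students)       # number of observations
mean = total / n        # sample mean = total / n

print(f"sum = {total}, n = {n}, mean = {mean:.2f}")
# prints: sum = 147, n = 9, mean = 16.33
```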
Uses/Benefits
• best measure for symmetrical distributions;
• influenced by all data values;
• most reliable;
• good for interval and ratio data
Limitation
• sensitive to outliers, so it works best when there are none
Median divides the observations
into two equal parts.
• If n is odd, the median is the
middle number.
• If n is even, the median is the
average of the 2 middle
numbers.
Uses/Benefits
1. good for asymmetrical data
2. works for ordinal, interval and ratio data
Limitations
1. does not account for extreme scores
2. not algebraically defined
3. not appropriate for nominal data
Example. The owner of XYZ restaurant recorded the number of students who dine in: 10, 14, 9, 17, 12, 18, 25, 20, 22. Find the median value of the given data. Interpret the answer.
First, arrange the scores in numerical order, either from lowest to highest or from highest to lowest. If possible, number the scores in the arrangement.
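No.  x
1.   9
2.   10
3.   12
4.   14
5.   17  (Median)
6.   18
7.   20
8.   22
9.   25
$\tilde{x} = \left(\frac{n + 1}{2}\right)^{\text{th}}\ \text{score}$
$\tilde{x} = \left(\frac{9 + 1}{2}\right)^{\text{th}}\ \text{score}$
$\tilde{x} = \left(\frac{10}{2}\right)^{\text{th}}\ \text{score}$
$\tilde{x} = 5^{\text{th}}\ \text{score}$
The fifth score in the distribution is 17. Therefore, the value of the median is 17.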
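The steps of the median example can be sketched in Python as follows. The position rule (n + 1)/2 is applied directly, which works as written only when n is odd, as it is here; the standard-library `statistics.median` is used as a cross-check. The variable names are illustrative.

```python
import statistics

# Scores from the example above.
students = [10, 14, 9, 17, 12, 18, 25, 20, 22]

ordered = sorted(students)   # step 1: arrange from lowest to highest
n = len(ordered)
position = (n + 1) // 2      # step 2: 1-based position of the middle score (odd n)
median_by_position = ordered[position - 1]

# Cross-check with the standard library, which also handles even n.
median_by_library = statistics.median(students)

print(ordered)
print(f"position = {position}, median = {median_by_position}")
# prints: position = 5, median = 17
```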
The mode is the score or scores that occur most frequently.
[Figure: two dot plots, one labeled "No Mode" and one labeled "Mode = 9, 12".]
Types of Mode
1. Unimodal is a score
distribution that consists
of only one mode.
Example:
25, 26, 26, 28, 30, 30, 30,
45, 45, 50, 55
Mode = 30, because 30
occurred three times
2. Bimodal is a score
distribution that consists of
only two modes.
Example: 25, 25, 26, 26, 26, 26, 28, 30, 30, 30, 30, 45, 45, 45, 50, 55
Mode = 26 and 30, because 26 occurred four times and 30 also occurred four times.
3. Trimodal is a score distribution that consists of only three modes.
Example: 25, 25, 26, 28, 30, 30, 44, 45, 45, 50, 53, 55
Three different scores occurred twice each: 25, 30, and 45. Therefore, the modes are 25, 30, and 45.
A trimodal distribution is also described as multimodal, a term that covers distributions with three or more modes.
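The three example distributions can be checked with Python's `statistics.multimode`, which returns every value that ties for the highest frequency. This is a small sketch using the example scores; the dictionary keys are labels chosen here for illustration.

```python
from statistics import multimode

# Score distributions copied from the three examples.
examples = {
    "unimodal": [25, 26, 26, 28, 30, 30, 30, 45, 45, 50, 55],
    "bimodal": [25, 25, 26, 26, 26, 26, 28, 30, 30, 30, 30, 45, 45, 45, 50, 55],
    "trimodal": [25, 25, 26, 28, 30, 30, 44, 45, 45, 50, 53, 55],
}

# multimode lists all of the most frequent scores, in order of first appearance.
modes = {name: multimode(scores) for name, scores in examples.items()}

for name, found in modes.items():
    print(f"{name}: {found}")
# prints:
# unimodal: [30]
# bimodal: [26, 30]
# trimodal: [25, 30, 45]
```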
Uses/benefits
• the only measure of central tendency appropriate for nominal data
• there can be several modes
Limitations
• cannot be used for further calculation
• unstable
Descriptive Statistics
Measures of Variation
A measure of variation is a
single value that is used to
describe the spread of the
distribution.
Two Types of Measures of Variation
Absolute Measures of Variation:
• Range
• Inter-quartile Range
• Mean Deviation
• Variance
• Standard Deviation
Relative Measure of Variation:
• Coefficient of Variation
Range (R) is the difference
between the highest and lowest
scores in the data.
R = HS – LS
• The larger the value of the
range, the more dispersed the
observations are.
• It is quick and easy to
understand.
• A rough measure of dispersion.
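As a quick illustration, the range formula R = HS – LS can be applied to the restaurant data used in the earlier examples. Reusing that data set here is a choice made for this sketch, not something the source does.

```python
# Restaurant data from the earlier mean and median examples.
students = [10, 14, 9, 17, 12, 18, 25, 20, 22]

highest = max(students)          # HS: highest score
lowest = min(students)           # LS: lowest score
score_range = highest - lowest   # R = HS - LS

print(f"R = {highest} - {lowest} = {score_range}")
# prints: R = 25 - 9 = 16
```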
Inter-Quartile Range (IQR)
The difference between the
third quartile and first
quartile, i.e.

IQR = Q3 – Q1
A quartile is a score point that divides the scores in the distribution into 4 equal parts.
The first quartile, denoted by Q1, is a value such that 25% of the scores in the distribution are smaller than or equal to it. If Q1 = 35, this means that 25% of the scores in the distribution are lower than or equal to 35.
The second quartile, denoted by Q2, is a value such that 50% of the scores in the distribution are smaller than or equal to it. For example, if Q2 = 48, this means that 50% of the scores in the distribution are lower than or equal to 48.
The third quartile, denoted by Q3, is a value such that 75% of the scores in the distribution are smaller than or equal to it. If Q3 = 59, this means that 75% of the scores in the distribution are lower than or equal to 59.
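The quartiles and the inter-quartile range can be computed with Python's `statistics.quantiles`. Two assumptions are made here: the restaurant data from the earlier examples is reused, and the "exclusive" method (positions based on n + 1) is chosen because the source does not say which quartile convention it uses. Other conventions can give slightly different values for small data sets.

```python
from statistics import quantiles

# Restaurant data from the earlier examples.
students = [10, 14, 9, 17, 12, 18, 25, 20, 22]

# n=4 splits the data into four parts and returns the three cut points.
# method="exclusive" is an assumed convention, not one stated in the source.
q1, q2, q3 = quantiles(students, n=4, method="exclusive")
iqr = q3 - q1  # IQR = Q3 - Q1

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
# prints: Q1 = 11.0, Q2 = 17.0, Q3 = 21.0, IQR = 10.0
```

Note that Q2 equals the median (17) found in the earlier example.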
Variance
• important measure of variation
• shows variation about the mean
Population variance
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation (s)
• is the average degree to which a set of scores deviates from the mean value
• most important measure of variation
• square root of the variance
• the most stable measure of variation
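A worked sketch of the variance and standard deviation, again reusing the restaurant data from the earlier examples (a choice made for this illustration). It assumes the usual definitions: the sample variance divides the sum of squared deviations by n - 1, and the population variance divides by N. The standard-library functions are used only as a cross-check.

```python
from math import sqrt
from statistics import pvariance, stdev, variance

# Restaurant data from the earlier examples.
students = [10, 14, 9, 17, 12, 18, 25, 20, 22]

n = len(students)
mean = sum(students) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in students)

sample_variance = sum_sq_dev / (n - 1)   # divide by n - 1 for a sample
population_variance = sum_sq_dev / n     # divide by N for a population
sample_sd = sqrt(sample_variance)        # SD is the square root of the variance

print(f"sample variance     = {sample_variance:.2f}")
print(f"sample SD (s)       = {sample_sd:.2f}")
print(f"population variance = {population_variance:.2f}")
# prints 30.25, 5.50 and 26.89 respectively

# Cross-check against the standard library.
library_values = (variance(students), stdev(students), pvariance(students))
```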
Remarks on Standard Deviation
• If there is a large amount of variation, then on average, the data values will be far from the mean. Hence, the SD will be large.
• If there is only a small amount of variation, then on average, the data values will be close to the mean. Hence, the SD will be small.
A look at dispersion…
Section A: Mean = 15.5, s = 3.338
Section B: Mean = 15.5, s = 0.9258
Section C: Mean = 15.5, s = 4.57
[Figure: dot plots of the three sections, each on an axis from 11 to 21.]
Source: Pilot Training Course on Teaching Basic Statistics by Statistical Research and Training Center, Philippine Statistical Association, Inc.
Coefficient of Variation (CV)
• measure of relative variation
• usually expressed in percent
• shows variation relative to the mean
• used to compare 2 or more groups
• Formula: $CV = \left(\frac{SD}{\text{Mean}}\right) \times 100\%$
Comparing Coefficient of Variation
Section A: Mean Score = 90, s = 2.25, CV = 2.50%
Section B: Mean Score = 95, s = 2.30, CV = 2.42%
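The two CV values can be reproduced with a small helper. The function name `coefficient_of_variation` is chosen here for illustration; the means and standard deviations are the ones given for Sections A and B.

```python
def coefficient_of_variation(sd, mean):
    """Return the CV in percent: (SD / mean) * 100."""
    return sd / mean * 100

cv_a = coefficient_of_variation(2.25, 90)   # Section A
cv_b = coefficient_of_variation(2.30, 95)   # Section B

print(f"Section A: CV = {cv_a:.2f}%")
print(f"Section B: CV = {cv_b:.2f}%")
# prints: Section A: CV = 2.50%
#         Section B: CV = 2.42%
```

Section B has the larger standard deviation but the smaller CV, so relative to its mean its scores vary slightly less than those of Section A.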
Correlation
• refers to the extent to which two distributions are related or associated.
• the extent of correlation is indicated numerically by the coefficient of correlation.
• the coefficient of correlation ranges from -1 to +1.
Types of Correlation
1. Positive Correlation
a) High scores in distribution A are
associated with high scores in
distribution B.
b) Low scores in distribution A are
associated with low scores in
distribution B.
2. Negative Correlation
a) High scores in distribution A are
associated with low scores in
distribution B.
b) Low scores in distribution A are
associated with high scores in
distribution B.
3. Zero Correlation
No association between
distribution A and
distribution B. No
discernable pattern.
[Figure: scatter plot of Science Scores (vertical axis, 0 to 30) against English Scores (horizontal axis, 0 to 25).]
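A coefficient of correlation can be computed as in the sketch below. Two things are assumptions: the source does not say which coefficient it uses, so Pearson's r is used here as one common choice, and the English and Science scores are hypothetical values invented to show a positive and a negative correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and the variation of each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, made up for illustration only.
english = [5, 10, 15, 20, 25]
science_rising = [6, 11, 14, 21, 24]    # high English goes with high Science
science_falling = [24, 21, 14, 11, 6]   # high English goes with low Science

r_positive = pearson_r(english, science_rising)
r_negative = pearson_r(english, science_falling)

print(f"positive correlation: r = {r_positive:.2f}")
print(f"negative correlation: r = {r_negative:.2f}")
# prints: r = 0.99 and r = -0.99
```

Both values fall inside the range from -1 to +1, and the sign matches the direction of the association.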