
Dannalyn D. Ibañez, RRT, MAEM, PhD

1. What is Data Analysis?
2. Statistics
3. Types of Analysis
4. Levels of Measurement
5. Measures of Central Tendency
6. Major Areas of Statistics: Descriptive and Inferential Statistics
7. Population and Sample
8. Methods of Data Collection
9. Probability Sampling Collection
10. Nonprobability Sampling Collection
11. Hypothesis
12. Level of Significance
13. Errors in Hypothesis Testing
14. Reliability and Validity
Data analysis is the process of
systematically applying
statistical and/or logical
techniques to describe and
illustrate, condense and recap,
and evaluate data.

Dr. D. Ibanez
Data: any recorded information derived from counts, measurements,
observations, interviews, experiments, and other techniques. The data
originally measured are referred to as raw data.

Statistics: a branch of mathematics dealing with the collection, analysis,
interpretation, and presentation of masses of numerical data.

A. As a body of knowledge or science
 The study of data
 The study of populations
 The study of variation
 The study of distributions

B. As a mass of data

STATISTICS

(s.) a branch of knowledge (a science)
• collection: interview, questionnaire, observation, records
• presentation: textual, tabular, graphical
• analysis: univariate, bivariate, multivariate
• interpretation of data: narrow, broad

(pl.) data
• nominal
• ordinal
• interval
• ratio

• Univariate analysis: technique referring to the analysis of
single-variable distributions. Examples: measures of central location
such as the mean, mode, and median; frequency distributions, graphs, tables, etc.
• Bivariate analysis: technique referring to the analysis of two
variables. Examples: t-test or test of difference, test of relationship, etc.
• Multivariate analysis: technique referring to the analysis of more
than two variables. Examples: 3-way ANOVA, MANCOVA, multiple regression analysis.

Levels of measurement refer to the amount of information implied by the
numbers that represent the categories of a variable. There are four
levels of measurement, namely:
1. Nominal
2. Ordinal
3. Interval (scale)
4. Ratio (scale)

Nominal
• Basic level of measurement
• Also known as categorical or qualitative
• There is no sense of order
• Categories can be given a numeric code, but the code is only a
description or label and does not imply order
• For example: sex, color, preferred type of chocolate, blood type,
race, eye color
• To summarize nominal data, we use frequency and percentage
• We cannot calculate a mean or average for this type of data
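
The frequency and percentage summary described for nominal data can be sketched as follows. The blood-type responses are hypothetical values made up for illustration; they are not from the lecture.

```python
from collections import Counter

# Hypothetical blood-type responses (nominal data): labels with no order.
blood_types = ["O", "A", "B", "O", "AB", "O", "A", "O", "B", "A"]

# Frequency: how many times each category occurs.
frequency = Counter(blood_types)

# Percentage: each category's share of all responses.
total = len(blood_types)
percentage = {category: 100 * count / total
              for category, count in frequency.items()}

# Report the categories from most to least frequent.
for category, count in frequency.most_common():
    print(f"{category}: n = {count}, {percentage[category]:.1f}%")
```

Note that no mean is computed: averaging the labels themselves would have no meaning.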

Ordinal
• The order of the data has meaning, but the intervals between values
may not be equal
• For example: rank, socio-economic status, educational level,
satisfaction rating, income level
• To summarize ordinal data, we use frequency, percentage, and
sometimes the mean

Interval and Ratio
• Also known as SCALE
• The most precise level of measurement
• The values can be measured rather than merely classified or ordered
• For example: number of customers, weight, age, size, length,
temperature, grades
• Can be discrete (whole numbers, e.g. 5 customers, 5 points) or
continuous (e.g. 4.2 miles, 32 degrees, 2.5 minutes)

Data measured at a higher level can be converted to a lower level,
but not the reverse:
- Interval/ratio to nominal
- Interval/ratio to ordinal
- Ordinal to nominal
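
A minimal sketch of these conversions, assuming hypothetical ages and cut-offs chosen only for illustration (the names `age_group` and `work_status` are not from the lecture). Each step discards information, which is why the conversion cannot be reversed.

```python
# Hypothetical ages in years (ratio level).
ages = [15, 22, 37, 41, 68, 73]

def age_group(age):
    """Ratio -> ordinal: ordered categories; the exact age is lost."""
    if age < 18:
        return "minor"
    if age < 60:
        return "adult"
    return "senior"

def work_status(group):
    """Ordinal -> nominal: a plain label; the ordering is lost."""
    return "working age" if group == "adult" else "not working age"

groups = [age_group(age) for age in ages]
labels = [work_status(group) for group in groups]

for age, group, label in zip(ages, groups, labels):
    print(age, group, label)
```

Given only the label "not working age", there is no way to recover whether the person was a minor or a senior, let alone the exact age.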

 Mean
 Median
 Mode

Mean is the sum of the values divided by the number of values.
Example: War on Drugs
The number of illegal drug (ID) suspects killed per week in incidents
that the Philippine National Police (PNP) responded to is shown for a
sample of 17 weeks. Find the mean.

Solution:

X̄ = (2 + 6 + 16 + 20 + 19 + 61 + 32 + 90 + 120 + 139 + 136 + 157 + 159 + 129 + 119 + 102 + 58) / 17
X̄ = 1365 / 17 ≈ 80.29

Hence, the mean number of ID suspects killed per week to which the police
responded is about 80.29.

• Standard deviation is a statistic that measures the dispersion of a
dataset relative to its mean; it is calculated as the square root of
the variance.
• The farther the data points are from the mean, the more spread out
the data and the higher the standard deviation.
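
The mean and standard deviation of the weekly counts above can be checked with Python's standard `statistics` module. This is a sketch under one assumption: because the 17 weeks are described as a sample, it uses the sample standard deviation (n - 1 in the denominator), which the slides do not specify.

```python
import statistics

# Weekly counts taken from the worked solution above (17 weeks).
weekly_counts = [2, 6, 16, 20, 19, 61, 32, 90, 120,
                 139, 136, 157, 159, 129, 119, 102, 58]

# Mean: sum of the values divided by the number of values (1365 / 17).
mean = statistics.mean(weekly_counts)

# Sample variance and its square root, the sample standard deviation.
# Assumption: sample formulas (n - 1) rather than population formulas.
variance = statistics.variance(weekly_counts)
std_dev = statistics.stdev(weekly_counts)

print(f"sum = {sum(weekly_counts)}, n = {len(weekly_counts)}")
print(f"mean = {mean:.2f}")
print(f"sample standard deviation = {std_dev:.2f}")
```

For the population formulas, `statistics.pvariance` and `statistics.pstdev` can be used instead.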

• Median is the midpoint of the data array once the values are
arranged in order.

Example: Police Officers Killed
The number of police officers killed in the line of duty over the last
11 years is shown. Find the median.

177 153 122 141 189 155 162 165 149 157 240
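
One way to work this example is to sort the values and take the middle one, sketched here with the 11 figures listed above. With an odd number of values the median is the single middle element of the ordered array.

```python
import statistics

# The 11 yearly figures from the example above.
officers_killed = [177, 153, 122, 141, 189, 155, 162, 165, 149, 157, 240]

# Step 1: arrange the data in order.
ordered = sorted(officers_killed)

# Step 2: with 11 values, the midpoint is the 6th value (index 5).
middle_index = len(ordered) // 2
median = ordered[middle_index]

# Cross-check against the standard library.
assert median == statistics.median(officers_killed)

print(ordered)
print("median =", median)
```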

• Mode is the value that occurs most often in the data set.

• Example: Find the mode of the signing bonuses of 8 PBA players for a
specific year. The bonuses, in millions of pesos, are

2.8, 2.0, 3.4, 4.0, 5.3, 4.0, 4.5, 4.0
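
The mode of these bonuses can be found by counting how often each value occurs; a short sketch using `collections.Counter`:

```python
from collections import Counter

# Signing bonuses in millions of pesos, from the example above.
bonuses = [2.8, 2.0, 3.4, 4.0, 5.3, 4.0, 4.5, 4.0]

# Count the occurrences of each value and take the most frequent one.
counts = Counter(bonuses)
mode, occurrences = counts.most_common(1)[0]

print(f"mode = {mode} (occurs {occurrences} times)")
```

Here 4.0 appears three times and every other value appears once, so the mode is 4.0 million pesos.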

Descriptive Statistics
 concerned with the methods for
collecting, organizing and describing
a set of data so as to yield meaningful
information.
 For example: frequency distribution,
measures of central tendency,
measures of variation, normality test,
identification of outliers

Inferential Statistics
• concerned with analyzing and interpreting sample data in order to
draw conclusions about the population
• For example: test of difference, test of relationship, and test of
association

• Parametric tests are tests that require a normal distribution and
data measured at the interval or ratio level. These tests use
parameters.
• Nonparametric tests are tests that do not require a normal
distribution and can be used with nominal and ordinal data. These
tests do not need parameters.

Number of Groups/Variables | Parametric tests | Non-parametric tests
2 independent groups | t-test for independent samples | Mann-Whitney U; Wilcoxon rank-sum test
Correlated sample / one-sample group | Paired t-test | Wilcoxon signed-rank test; Fisher sign test; McNemar's test for correlated proportions
3 or more independent groups | ANOVA (F-test) | Kruskal-Wallis test; Friedman test (used when the groups are related)

Number of Groups/Variables | Parametric tests | Non-parametric tests
Relationship: one dependent and one independent variable | Pearson Product Moment Coefficient of Correlation | Chi-square test of independence; Chi-square test of homogeneity; Spearman Rank-Order Coefficient of Correlation
Association: one dependent and one independent variable | Simple linear regression | Kendall's Coefficient of Concordance W
Association: one dependent and 2 or more independent variables | Multiple linear regression | Kendall's Coefficient of Concordance W
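
As an illustration of the first parametric entry in the table, the Pearson product-moment correlation coefficient can be computed directly from its definition. This is a sketch only: the function name `pearson_r` and the two small data series are hypothetical and are not from the lecture.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical interval/ratio data: hours studied and exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}")
```

The coefficient ranges from -1 to 1. Testing whether r differs significantly from zero requires a further step, such as a t-test on r, which this sketch does not cover.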

1. What is the demographic profile in terms of: (DESCRIPTIVE)
1.1 Age
1.2 Sex
1.3 Monthly Family Income
2. What is the level of burnout in terms of the following: (DESCRIPTIVE)
2.1 personal burnout;
2.2 work-related burnout;
3. What is the level of safety outcome measures in terms of event reporting? (DESCRIPTIVE)
4. Is there a significant relationship between the level of burnout and the level of safety outcome measures? (INFERENTIAL)
5. Does burnout significantly predict the safety outcome measures? (INFERENTIAL)

• A population is the entire group that you want to draw conclusions
about.
• A sample is the specific group that you will collect data from.
• The size of the sample is always less than the total size of the
population.
• In research, a population doesn't always refer to people. It can
mean a group containing elements of anything you want to study, such
as objects, events, organizations, countries, species, organisms, etc.

• Variable: any trait or attribute that varies from person to person
or case to case.
• Measurement: the assignment of numbers to attributes of persons or
objects based on an assigned rule.

 Administration of Tests, Scales, and
Questionnaires
 Interviews
 Focus Group Discussions
 Observations
 Records

• Sampling: the process of selecting the sample, or the study units,
from a previously defined population.

The ways of selecting a part of the population
to enable researchers to make reliable
inferences about the nature of the
population.
 The list of units from which we draw the
sample in any sampling procedure is called
the sampling frame.

 Simple random sampling
 Systematic random Sampling
 Stratified random sampling
 Cluster random Sampling

• Simple random sampling draws a subset of a statistical population
in which each member has an equal probability of being chosen. An
example of simple random sampling would be the names of 25 employees
being drawn out of a hat from a company of 250 employees.
• Systematic random sampling is a statistical method involving the
selection of elements from an ordered sampling frame at a regular
interval, starting from a randomly chosen element.
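
Both methods can be sketched with Python's `random` module, reusing the 25-out-of-250 employees example. The employee identifiers and the fixed seed are assumptions made for illustration; the systematic version takes a random start within the first interval and then every 10th name.

```python
import random

# Seeded generator so the illustration is repeatable (an assumption;
# a real study would not fix the seed).
rng = random.Random(42)

# Hypothetical sampling frame: an ordered list of 250 employees.
employees = [f"employee_{i:03d}" for i in range(1, 251)]
sample_size = 25

# Simple random sampling: every employee has an equal chance of
# selection, like drawing 25 names out of a hat.
simple_sample = rng.sample(employees, k=sample_size)

# Systematic random sampling: pick a random start within the first
# interval, then take every k-th element of the ordered frame.
interval = len(employees) // sample_size      # 250 // 25 = 10
start = rng.randrange(interval)               # random start, 0 to 9
systematic_sample = employees[start::interval]

print(len(simple_sample), len(systematic_sample))
```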

• Stratified random sampling is a form of random sampling in which
members of the population are first divided into strata and then
randomly selected from each stratum to be part of the sample.
• Cluster random sampling is when the researcher divides the
population into separate groups called clusters. A simple random
sample of clusters is then selected from the population, and the
researcher conducts the analysis on the sampled clusters.
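
The difference between the two methods is easier to see side by side. In this sketch the population, the department and ward fields, and the helper names `stratified_sample` and `cluster_sample` are all hypothetical: stratified sampling draws some members from every stratum, while cluster sampling keeps all members of a few randomly chosen clusters.

```python
import random

rng = random.Random(7)  # fixed seed for a repeatable illustration

# Hypothetical population: 120 staff, 3 departments (strata), 12 wards (clusters).
departments = ["nursing", "radiology", "laboratory"]
population = [
    {"id": i, "department": departments[i % 3], "ward": i % 12}
    for i in range(120)
]

def stratified_sample(units, key, fraction, rng):
    """Divide units into strata by `key`, then sample randomly within each."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k=round(len(members) * fraction)))
    return sample

def cluster_sample(units, key, n_clusters, rng):
    """Randomly select whole clusters and keep every unit inside them."""
    cluster_ids = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(cluster_ids, k=n_clusters))
    return [unit for unit in units if unit[key] in chosen]

stratified = stratified_sample(population, "department", 0.25, rng)
clustered = cluster_sample(population, "ward", 3, rng)

print(len(stratified), "units from every department")
print(len(clustered), "units from", len({u["ward"] for u in clustered}), "wards")
```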

 Convenience Sampling
 Voluntary Response Sampling
 Quota sampling
 Purposive or Judgmental Sampling
 Snowball Sampling

• Convenience sampling, or accidental sampling, simply includes the
individuals who happen to be most accessible to the researcher.
• This is an easy and inexpensive way to gather initial data, but
there is no way to tell if the sample is representative of the
population, so it can't produce generalizable results.

 Voluntary Response Sampling - similar to a
convenience sample, a voluntary response sample is
mainly based on ease of access.
 Instead of the researcher choosing participants and
directly contacting them, people volunteer themselves
(e.g. by responding to a public online survey).
 Voluntary response samples are always at least
somewhat biased, as some people will inherently be
more likely to volunteer than others.

 Quota Sampling - is a type of non-probability
sampling where researchers will form a sample of
individuals who are representative of a larger population.
 Researchers will assign quotas to a group of people in
order to create subgroups of individuals that represent
characteristics of the target population as a whole.
• Some examples of these characteristics are gender, age,
sex, residency, education level, or income. Once the
subgroups are formed, the researchers will use their own
judgment to select the subjects from each segment to
produce the final sample.

 Purposive or Judgmental Sampling - is a form of
non-probability sampling in which the researcher
uses their own judgment about which respondents to
choose, and picks those who best meet the
purposes of the study.
 It is often used in qualitative research, where the
researcher wants to gain detailed knowledge about
a specific phenomenon rather than make statistical
inferences, or where the population is very small
and specific. An effective purposive sample must
have clear criteria and rationale for inclusion.

• Snowball sampling is also called chain-referral or referral
sampling.
• If the population is hard to access, snowball sampling can be used
to recruit participants via other participants. The number of people
you have access to "snowballs" as you get in contact with more people.

• Hypothesis testing: a method used in inferential statistics to
arrive at conclusions about a population under study through the use
of samples and parameters.

• Null hypothesis: serves to deny what is explicitly indicated in a
given research hypothesis

• Research hypothesis (alternative hypothesis): the hypothesis derived
from the researcher's theory about some social phenomenon

• Level of significance: a decision rule used to support or refute
the hypothesis; it ensures objectivity in the interpretation of
observations (p-value)

• Statistical significance at the .05 level means that the probability
of the outcome occurring by chance alone is less than 5 percent. This
suggests that something other than chance has affected the outcome.
• The value α = .05 is the significance level, the maximum level of
risk that we are willing to accept in making an inference about a
population based on the generated sample.

 Generate the p-value
of a certain test
 Compare the p-value
to the level of
significance (usually
0.05)

 To say that a result is
statistically significant at the
alpha level just means
that the p-value is less than
alpha. For instance, for a
value of alpha = 0.05, if the p-
value is greater than 0.05,
then we fail to reject the null
hypothesis.

• Decide on H0
◦ If the p-value is less than or equal to 0.05, the result of the
test is significant; reject H0
◦ Otherwise, do not reject H0
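
The decision step can be written as a small function. This is a sketch, assuming the comparison is "less than or equal to" the level of significance and that the p-value has already been produced by some statistical test; the name `decide` is hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the level of significance and decide on H0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= alpha:
        # Significant result: the data are unlikely under H0.
        return "reject H0"
    # Not significant: there is not enough evidence against H0.
    return "fail to reject H0"

# Example p-values (made up for illustration).
for p in (0.03, 0.05, 0.20):
    print(f"p = {p:.2f}: {decide(p)}")
```

The wording "fail to reject" is deliberate: a large p-value does not show that H0 is true, only that the sample gives no sufficient evidence against it.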

• Type I error: we reject the null hypothesis when it is true and
should not be rejected
• The lower we set the level of significance, the lower the
likelihood of a Type I error and the higher the likelihood of a
Type II error.

• Type II error: we fail to reject the null hypothesis when it is
actually false
• The higher we set the level of significance, the higher the
likelihood of a Type I error and the lower the likelihood of a
Type II error.
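
The trade-off between the two errors can be illustrated numerically. This sketch assumes a one-sided z-test with known standard deviation, which is a simplification not taken from the lecture; the effect size of 0.5 standard deviations and the sample size of 20 are hypothetical. In this setting the Type I error rate equals the level of significance, and the Type II error rate follows from the normal distribution.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def type_ii_error(alpha, effect_size, n):
    """Probability of failing to reject H0 when H0 is false.

    Assumes a one-sided z-test: H0 is rejected when the test statistic
    exceeds the critical value for the chosen level of significance.
    """
    critical_z = STANDARD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the statistic is centered at effect_size * sqrt(n).
    return STANDARD_NORMAL.cdf(critical_z - effect_size * sqrt(n))

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect_size=0.5, n=20)
    print(f"alpha (Type I) = {alpha:.2f}  ->  beta (Type II) = {beta:.3f}")
```

Lowering the level of significance from 0.10 to 0.01 reduces the chance of a Type I error but raises the chance of a Type II error, as the slides state.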

QUANTITATIVE
• Experimental designs
• Non-experimental designs such as surveys

QUALITATIVE
• Narrative research
• Phenomenology
• Grounded theory
• Ethnographies
• Case study

MIXED METHODS
• Convergent
• Explanatory sequential
• Exploratory sequential
• Transformative, embedded, or multiphase

 Diagnose the problem
 Specify statistical test
 Retrieve analysis results
 Interpret analysis results

 What are the questions to be
answered?
 What’s the analysis required by
the question?
 What is the nature of the data?
 For comparative analysis, how
many means are to be
compared?

THANK YOU
