
STATISTICS

STATISTICS:
A Research Tool
Descriptive and Inferential Statistics
Parametric and Non-parametric tests
Commonly used tests
Advantages and Limitations
Analysis and corresponding test
Hypothesis Testing and Sample Problems
Two main statistical methods used in
data analysis:

DESCRIPTIVE Statistics
“What have you found in your sample?”

INFERENTIAL Statistics
“To what extent is what you have found
in your sample likely to be an accurate
depiction of the larger group
(population) from which the sample was
drawn?”
Descriptive Statistics
Summarizes or describes data from a sample using indexes or descriptive measures such as the mean or standard deviation, without attempting to infer anything that goes beyond the data.

Inferential Statistics
Draws conclusions or inferences about population parameters using sample data that are subject to random variation (e.g., observational errors, sampling variations).
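The contrast between the two methods can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, and the critical value 2.262 is the standard two-sided 95% t value for 9 degrees of freedom, which fits this sample size only.

```python
import math
import statistics

# Hypothetical sample: exam scores of 10 students (made-up data).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 82, 74]

# Descriptive statistics: describe the sample itself.
sample_mean = statistics.mean(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (n - 1)

# Inferential statistics: estimate the population mean from the sample.
# 2.262 is the two-sided 95% critical t value for n - 1 = 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262
standard_error = sample_sd / math.sqrt(len(scores))
ci_low = sample_mean - T_CRIT_95_DF9 * standard_error
ci_high = sample_mean + T_CRIT_95_DF9 * standard_error

print(f"Descriptive: mean = {sample_mean:.1f}, sd = {sample_sd:.2f}")
print(f"Inferential: 95% CI for the population mean = ({ci_low:.1f}, {ci_high:.1f})")
```

The first print answers "What have you found in your sample?"; the second gives a range of plausible values for the population the sample was drawn from.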
Where is statistics used?
Automotive
Banking and Finance
Brand
Food
Government
Media
Medical
Telecommunication
Real Estate
Uses of Data in Different Fields
Parametric and
Non-Parametric Statistics
Parametric Tests
A parametric statistical test is a test
whose model specifies certain
conditions about the parameters of
the population from which the
research sample was drawn.

Robson (1994)
The Conditions
The observations must be independent

The observations must be drawn from normally distributed populations

These populations must have the same variances

Variables involved must have been measured in at least an interval scale
Commonly Used Parametric Tests
Pearson Product-Moment Correlation Coefficient
Student's t-Test
The z-Test
Analysis of Variance (ANOVA)


Pearson’s Correlation Coefficient

The correlation coefficient (r) is a value between -1 and +1 that tells us how well 2 continuous variables measured on the same subject correlate linearly with each other. The important thing to remember is that this is only an association and does not imply a cause-and-effect relationship.
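As a rough sketch of how r is obtained, the function below divides the co-variation of the two variables by the product of their individual variations. The function name and the data (hours studied versus quiz score) are hypothetical and used only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Made-up data: hours studied and quiz score for 5 students.
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]

r = pearson_r(hours, quiz)
print(f"r = {r:.3f}")
```

A value close to +1 or -1 indicates a strong linear association; it says nothing about which variable, if either, causes the other.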
Student t-Test

The Student t-test is probably the most widely used parametric test. It was developed by William Sealy Gosset, a statistician working at the Guinness brewery, and is called the Student t-test because he published it under the pseudonym "Student" to protect the brewery's proprietary interests. A single-sample t-test is used to determine whether the mean of a sample is different from a known average. A 2-sample t-test is used to establish whether a difference occurs between the means of 2 similar data sets.
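The two variants can be sketched as follows. The function names are hypothetical, the data are made up, and the 2-sample version assumes equal variances (the pooled form), in line with the parametric conditions listed earlier. The resulting t value would still have to be compared with a critical value from a t table.

```python
import math
import statistics

def one_sample_t(sample, known_mean):
    """t statistic for testing a sample mean against a known average."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - known_mean) / standard_error

def two_sample_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Made-up data for illustration only.
t_one = one_sample_t([2, 4, 6], known_mean=2)
t_two = two_sample_t([1, 2, 3], [4, 5, 6])
print(f"one-sample t = {t_one:.3f}, df = 2")
print(f"two-sample t = {t_two:.3f}, df = 4")
```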
The z-test

This test is very similar to the Student t-test. However, with the z-test, the known variance of the standard (reference) population, rather than the standard deviation estimated from the study groups, is used to obtain the z-test statistic.
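A minimal sketch of a one-sample z-test, assuming the population standard deviation is known in advance. The function names and numbers are hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist, mean

def one_sample_z(sample, population_mean, population_sd):
    """z statistic using the known population standard deviation."""
    standard_error = population_sd / math.sqrt(len(sample))
    return (mean(sample) - population_mean) / standard_error

def two_sided_p_value(z):
    """Probability of a z at least this extreme under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Made-up data: 36 observations that all equal 52, so the sample mean is 52.
sample = [52] * 36
z = one_sample_z(sample, population_mean=50, population_sd=6)
p = two_sided_p_value(z)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The only difference from the one-sample t statistic is the denominator: here it uses the population standard deviation rather than the one computed from the sample.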
Type of Test depending on the Sample Size
- t-test: used when the sample size is small (n < 30)
- z-test: used when the sample size is large (n ≥ 30)
3/20/2017 Sample Size Table

 
Professional researchers typically set a sample size level of about 500 to optimally estimate a single population parameter (e.g., the proportion of likely voters who will vote for a particular candidate). This will construct a 95% confidence interval with a Margin of Error of about ±4.4% (for large populations).
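The ±4.4% figure can be checked with the usual margin-of-error formula for a proportion. The sketch below assumes a simple random sample from a large population and the worst-case proportion p = 0.5; the function name is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion; z = 1.96 gives a 95% confidence level."""
    return z * math.sqrt(p * (1 - p) / n)

moe_500 = margin_of_error(500)
print(f"n = 500: margin of error = ±{moe_500 * 100:.1f}%")

# Larger samples shrink the margin, but with diminishing returns.
for n in (100, 500, 1000, 2000):
    print(n, f"±{margin_of_error(n) * 100:.1f}%")
```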
ANOVA
Analysis of variance (ANOVA) is a test that incorporates means and variances to determine the test statistic. The test statistic is then used to determine whether the means of the groups of data are the same or different.
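A minimal sketch of the one-way ANOVA test statistic (F), which is the variation between group means divided by the variation within groups. The function name and data are made up, and the result would still need to be compared with a critical F value.

```python
def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA on two or more groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group variation: how far each value is from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up data: three groups of three observations each.
f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f} with df = (2, 6)")
```

A large F means the group means differ more than the spread within the groups would suggest.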
Non-parametric Tests
A non-parametric statistical test is a
test whose model does NOT specify
conditions about the parameters of
the population from which the sample
was drawn.

(Robson, 1994)
The Conditions
Also known as "distribution-free statistics"

Variables must be nominal or ordinal

Used when the assumptions of parametric tests have not been met
Commonly Used Non-Parametric Tests
Chi-squared Test
Spearman Rank Coefficient
Mann-Whitney U Test (Wilcoxon Rank Test)
Kruskal-Wallis Test
Chi-Squared Test

The chi-squared test is usually used to compare multiple groups where the input variable and the output variable are binary. The chi-squared test helps to decide whether a frequency distribution could be the result of a definite cause or just chance. It does this by comparing the actual distribution with the distribution which could be expected if chance were the only factor operating.
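The comparison of actual and expected distributions can be sketched as below. The function name and the 2 x 2 counts are hypothetical, and no continuity correction is applied.

```python
def chi_squared_statistic(observed):
    """Chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if chance were the only factor.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    return statistic

# Made-up 2 x 2 table: rows are two groups, columns are outcome yes / no.
table = [[20, 30],
         [30, 20]]
chi2 = chi_squared_statistic(table)
print(f"chi-squared = {chi2:.2f} with df = 1")
```

For 1 degree of freedom the 5% critical value is 3.841, so a statistic of 4.00 would be judged unlikely to arise by chance alone at that level.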
Spearman Rank Coefficient

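Like the Pearson product-moment correlation coefficient, the Spearman rank coefficient is calculated to determine how well 2 variables for individual data points can predict each other. The difference is that the relationship between the variables need not be linear, only monotonic, because the coefficient is computed on ranks rather than on the raw values.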
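One way to sketch the Spearman coefficient is to replace each value by its rank and then apply the Pearson formula to the ranks. The helper names and data below are hypothetical; tied values receive the average of their ranks.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman rank coefficient: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: y grows with x, but along a curve rather than a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the ranks of x and y agree perfectly, the Spearman coefficient is 1 even though the Pearson coefficient on the raw values is below 1.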
Mann-Whitney U Test

This test, sometimes referred to as the Wilcoxon rank-sum test, uses ranks just as the previous test did. It is analogous to the t-test for continuous variables but can be used for ordinal data. This test compares 2 independent populations to determine whether they are different.
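A minimal sketch of the U statistic: both samples are ranked together, the ranks of the first sample are summed, and U is derived from that sum. The function names and data are hypothetical, and the U value would still be compared with a table of critical values.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def mann_whitney_u(a, b):
    """Return the smaller of the two U statistics for independent samples a, b."""
    n_a, n_b = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))  # rank the pooled data
    rank_sum_a = sum(ranks[:n_a])             # ranks belonging to sample a
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Made-up ordinal scores for two independent groups.
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_mixed = mann_whitney_u([1, 4, 5], [2, 3, 6])
print(f"completely separated groups: U = {u_separated}")
print(f"interleaved groups:          U = {u_mixed}")
```

A small U (down to 0) means the two groups barely overlap in rank; a U near half of n_a x n_b means the groups are thoroughly mixed.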
Kruskal-Wallis Test

This test uses ranks of ordinal data to perform an analysis of variance to determine whether multiple groups are similar to each other. This test, like the previous example, ranks all data from the groups into 1 rank order and then sums the ranks separately for each group.
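The ranking-and-summing procedure can be sketched as follows. The function names and data are hypothetical, and this version applies no correction for ties.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)  # one rank order across all groups
    n_total = len(pooled)

    weighted_sum = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # ranks of this group only
        weighted_sum += rank_sum ** 2 / len(g)
        start += len(g)

    return 12 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)

# Made-up data: three groups of three observations each.
h_stat = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(f"H = {h_stat:.2f} with df = 2")
```

H is usually compared with a chi-squared critical value for (number of groups - 1) degrees of freedom, which is 5.991 at the 5% level for df = 2; for groups this small, exact tables are generally preferred.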
Differences between Parametric and
Non-Parametric Tests
Parametric Tests
- Make numerous or stringent assumptions about parameters.
- Focus on the mean difference.

Non-parametric Tests
- Do not make numerous or stringent assumptions about parameters.
- Focus on the difference between medians.
- Focus on order & ranking.
- Data are changed from scores to ranks or signs.
Parametric Tests
- Populations must have the same variances.
- Rely on assumptions about the shape of the distribution (i.e., assume a normal distribution) in the underlying population and about the form or parameters (i.e., means and standard deviations) of the assumed distribution.

Non-parametric Tests
- Variable under study has underlying continuity.
- Rely on no or few assumptions about the shape of the distribution or parameters of the population from which the sample was drawn.
Analysis Type and Corresponding Test
Analysis Type: Compare means between two distinct/independent groups
Example: Is the mean annual temperature of extreme southern Philippines different from the mean annual temperature of extreme northern Philippines?
Parametric Procedure: Two-sample t-test; Two-sample z-test
Non-parametric Procedure: Wilcoxon rank-sum test

Source: Adapted and modified from Tanya Hoskin (Updated)


Analysis Type: Compare two quantitative measurements taken from the same individual
Example: Was there a significant change in soil fertility between a soil to which inorganic fertilizer was applied and the same soil to which organic manure was applied for one year?
Parametric Procedure: Paired t-test (t-test for 2 dependent samples)
Non-parametric Procedure: Wilcoxon signed-rank test



Analysis Type: Compare means between three or more distinct/independent groups
Example: If our experiment had three rock types (e.g., igneous, sedimentary and metamorphic), we might want to know whether the mean mineral content at baseline differed among the three groups.
Parametric Procedure: Analysis of Variance (ANOVA)
Non-parametric Procedure: Kruskal-Wallis Test



Analysis Type: Estimate the degree of association between two quantitative variables
Example: Is excessive deforestation associated with soil erosion?
Parametric Procedure: Pearson's coefficient of correlation
Non-parametric Procedure: Spearman's rank correlation



Advantages of Parametric Tests
More powerful than non-parametric tests (Power efficiency is higher)

More sensitive at detecting differences between samples

More sensitive at detecting an effect of the independent variable on the dependent variable
Limitations of Parametric Tests
If the data deviate strongly from the assumptions of a parametric procedure, using the parametric procedure could lead to incorrect conclusions.

One must be aware of the assumptions associated with a parametric procedure and should learn methods to evaluate the validity of those assumptions.

The parametric assumption of normality is particularly worrisome for small sample sizes (n < 30)
Advantages of Non-Parametric Tests
Can be used when data are not normally distributed

Non-parametric tests accommodate very small sample sizes (as small as N = 6)

Advantages of Non-Parametric Tests


Can treat samples made up of observations from several different populations

Can treat data which are inherently in ranks

Can treat data which are classificatory

Relatively easier to learn and apply than parametric tests


Limitations of Non-Parametric Tests
Non-parametric tests lead to a loss of precision and are wasteful of data

They are "less powerful" and can give a false sense of security


Limitations of Non-Parametric Tests
They lack software for quick and large scale analysis

Their results are often less easy to interpret


Hypothesis testing will be discussed later… see you in the Computer Laboratory for the actual hands-on computation using SPSS.

Thank You!
