
INFERENTIAL STATISTICS
Inferential Statistics
- allows you to make predictions (inferences) about a population from your gathered sample data

 Parametric Test
- assumes an underlying statistical distribution in the data
- has more statistical power when its assumptions are met

 Non-Parametric Test
- does not rely on any particular distribution
- can be applied even if the parametric conditions of validity are not met
Assumptions of Parametric Tests

 Independence
- Data values for the different groups should be independent of each other.

 Homogeneity of variance
- assumes that different samples have the same variance, even if they came from different populations
- Levene's test is used to test for homogeneity of variance
- the null hypothesis of Levene's test states that the variances are equal
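
A minimal sketch of Levene's test with SciPy; the sample values below are made up for illustration:

```python
from scipy import stats

# Three illustrative samples (hypothetical data)
group_a = [12.1, 11.8, 12.5, 13.0, 12.2]
group_b = [11.9, 12.4, 12.8, 12.0, 12.6]
group_c = [12.3, 12.7, 11.5, 12.9, 12.1]

# H0: the groups have equal variances
stat, p = stats.levene(group_a, group_b, group_c)
print(f"Levene W = {stat:.3f}, p = {p:.3f}")
# p > 0.05 -> fail to reject H0: no evidence against equal variances
```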

 Normal distribution of data
- the most important assumption
- can be checked visually (histograms & Q-Q plots) or numerically (normality tests)
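
For the visual check, a minimal sketch using matplotlib and SciPy's probplot; the simulated sample and plot layout are illustrative:

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=5, size=100)  # hypothetical sample

# Histogram: roughly bell-shaped if the data are normal
plt.subplot(1, 2, 1)
plt.hist(data, bins=15)
plt.title("Histogram")

# Q-Q plot: points close to the line suggest normality
plt.subplot(1, 2, 2)
stats.probplot(data, dist="norm", plot=plt)

plt.tight_layout()
plt.show()
```
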
Normality
Types of Distribution
 DISCRETE DISTRIBUTION

 The Poisson distribution
- Usually a sample of time or space is taken and the number of events recorded.
- used to test for randomness or independence in either space or time
- for a true Poisson distribution, variance = mean, so departures are informative:
- Variance > Mean = population is more clumped
- Variance < Mean = population is more ordered or uniform
- Example: number of fish-lice on a fish or number of influenza cases reported in a week.
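
A minimal sketch of the variance-to-mean check with NumPy; the counts below are hypothetical:

```python
import numpy as np

# Hypothetical counts of fish-lice per fish
counts = np.array([0, 2, 1, 0, 7, 0, 0, 5, 1, 0, 6, 0])

mean = counts.mean()
var = counts.var(ddof=1)  # sample variance
ratio = var / mean        # variance-to-mean ratio (index of dispersion)

print(f"mean = {mean:.2f}, variance = {var:.2f}, ratio = {ratio:.2f}")
# ratio > 1 -> clumped; ratio < 1 -> uniform; ratio near 1 -> consistent with Poisson
```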

 The binomial distribution
- a discrete distribution of the number of events when there are two possible outcomes for each trial and the probability of each outcome is constant
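
A minimal sketch with SciPy's binom; the trial count and success probability are illustrative:

```python
from scipy import stats

# Probability of exactly 3 successes in 10 trials, p = 0.5 per trial
n, p = 10, 0.5
print(stats.binom.pmf(3, n, p))   # P(X = 3)
print(stats.binom.cdf(3, n, p))   # P(X <= 3)
print(stats.binom.mean(n, p), stats.binom.var(n, p))  # mean = np, var = np(1-p)
```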

 The negative binomial distribution
- a discrete distribution that can be used to describe clumped data (i.e. when there are more very crowded and more sparse observations than a Poisson distribution with the same mean)
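
A minimal sketch contrasting a negative binomial with a Poisson of the same mean; the nbinom parameters are illustrative:

```python
from scipy import stats

n, p = 2, 0.25                       # illustrative nbinom parameters
nb_mean = stats.nbinom.mean(n, p)    # n(1-p)/p = 6.0
nb_var = stats.nbinom.var(n, p)      # n(1-p)/p**2 = 24.0

# A Poisson with the same mean has variance equal to its mean
pois_var = stats.poisson.var(nb_mean)  # 6.0

print(nb_mean, nb_var, pois_var)  # the extra variance reflects clumping
```
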
 The hypergeometric distribution
- used to describe events where individuals are removed from a population and not replaced (i.e. sampling without replacement)
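
A minimal sketch with SciPy's hypergeom; the population and sample sizes are hypothetical:

```python
from scipy import stats

# Population of M = 50, of which n = 10 are marked;
# draw N = 5 individuals without replacement
M, n, N = 50, 10, 5

# Probability of getting exactly 2 marked individuals in the sample
print(stats.hypergeom.pmf(2, M, n, N))
```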

 CONTINUOUS DISTRIBUTION

 The rectangular distribution
- describes any distribution where all values are equally likely to occur (also known as the uniform distribution)

 The normal distribution
- the most important distribution in statistics; it is often assumed that data are distributed in this way
- described by two parameters: the mean and the standard deviation
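
A minimal sketch with SciPy's norm; the mean and standard deviation are illustrative:

```python
from scipy import stats

mu, sigma = 100, 15  # the two parameters: mean and standard deviation

dist = stats.norm(loc=mu, scale=sigma)
print(dist.pdf(100))                  # density at the mean
print(dist.cdf(115) - dist.cdf(85))   # ~68% of values within one sd of the mean
samples = dist.rvs(size=1000, random_state=0)  # simulated data
```
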
Types of Normality Test
Chi-square goodness of fit
- used to test if a sample of data came from a population with a specific distribution
- can only be used on nominal (categorical) scale data, that is, frequency data

Hypothesis:
H0: The data follow a specified distribution
Ha: The data do not follow the specified distribution.

Formula:
χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ
where Oᵢ is the observed frequency and Eᵢ is the expected frequency for category i
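
A minimal sketch of the goodness-of-fit test with scipy.stats.chisquare; the observed and expected counts are hypothetical:

```python
from scipy import stats

# Observed counts in 4 categories vs. frequencies expected under H0
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]  # must sum to the same total as observed

stat, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
# small p -> reject H0: the data do not follow the specified distribution
```
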
Kolmogorov-Smirnov Test
- compares your data with a known distribution and lets you know if they have the same distribution
- distribution free, in the sense that the critical values do not depend on the specific distribution being tested
- There are no restrictions on sample size; small samples are acceptable.
- can’t be used for discrete distributions
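
A minimal sketch with scipy.stats.kstest against a fully specified normal distribution; the data are simulated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=0, scale=1, size=50)  # hypothetical sample

# H0: the data follow N(0, 1); the reference distribution is fully specified
stat, p = stats.kstest(data, "norm", args=(0, 1))
print(f"D = {stat:.3f}, p = {p:.3f}")
```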

Lilliefors Test
- an improvement on the Kolmogorov-Smirnov (K-S) test, correcting the K-S for small values at the tails of probability distributions
- can be used when you don’t know the population mean or standard deviation (they are estimated from the sample)
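
SciPy does not provide a Lilliefors test directly; a minimal sketch using the statsmodels implementation, assuming statsmodels is available (data simulated):

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(2)
data = rng.normal(loc=10, scale=2, size=40)  # hypothetical sample

# Mean and sd are estimated from the sample, unlike the plain K-S test
stat, p = lilliefors(data, dist="norm")
print(f"D = {stat:.3f}, p = {p:.3f}")
```
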
Anderson-Darling Test
- used to test whether a set of continuous data is likely to have come from a normal distribution
- more powerful than the K-S test, especially since all the data values are considered
- has low power to reject H0 when the sample is small (n < 20) and may be overly sensitive (i.e., rejects H0 too often) when the sample is large (n > 1000)
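
A minimal sketch with scipy.stats.anderson (data simulated); note that it reports critical values rather than a p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.normal(loc=0, scale=1, size=100)  # hypothetical sample

result = stats.anderson(data, dist="norm")
print(f"A2 = {result.statistic:.3f}")
for crit, sig in zip(result.critical_values, result.significance_level):
    verdict = "reject H0" if result.statistic > crit else "fail to reject H0"
    print(f"  at {sig}% significance: {verdict}")
```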

Shapiro-Wilk Test
- tells if a random sample comes from a normal distribution
- has limitations, most importantly a bias by sample size: the larger the sample, the more likely you’ll get a statistically significant result
- more appropriate for small sample sizes
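
A minimal sketch with scipy.stats.shapiro; the data are simulated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
data = rng.normal(loc=5, scale=1.5, size=30)  # small hypothetical sample

stat, p = stats.shapiro(data)
print(f"W = {stat:.3f}, p = {p:.3f}")
# p > 0.05 -> no evidence against normality
```
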
Examples of Parametric Statistics

t-Test
- most common statistical procedure for determining the level of significance
- used to compare means between two distinct/independent groups
- assumes equal variability in both data sets
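
A minimal sketch with scipy.stats.ttest_ind; the two groups below are hypothetical:

```python
from scipy import stats

group_1 = [5.1, 4.9, 5.6, 5.2, 5.0, 5.3]
group_2 = [4.5, 4.8, 4.3, 4.7, 4.6, 4.4]

# equal_var=True matches the equal-variability assumption above;
# use equal_var=False (Welch's t-test) if the variances differ
stat, p = stats.ttest_ind(group_1, group_2, equal_var=True)
print(f"t = {stat:.3f}, p = {p:.3f}")
```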

Paired t-Test
- used to compare the means of two measurements taken from the same individual, object, or related units
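
A minimal sketch with scipy.stats.ttest_rel; the before/after measurements are hypothetical:

```python
from scipy import stats

before = [72, 75, 68, 80, 74, 69]
after  = [70, 73, 66, 77, 71, 68]  # same subjects, second measurement

stat, p = stats.ttest_rel(before, after)
print(f"t = {stat:.3f}, p = {p:.3f}")
```
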
Analysis of Variance
- Similar to a t-test, but used when there are more than two groups being
compared
- used to analyze the difference between the means of more than two groups.

One-Way ANOVA
- used when you have one independent variable affecting a dependent variable
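
A minimal sketch with scipy.stats.f_oneway; the three treatment groups are hypothetical:

```python
from scipy import stats

treatment_a = [23, 25, 21, 24, 26]
treatment_b = [28, 30, 27, 29, 31]
treatment_c = [22, 20, 24, 23, 21]

# H0: all group means are equal
stat, p = stats.f_oneway(treatment_a, treatment_b, treatment_c)
print(f"F = {stat:.3f}, p = {p:.3f}")
```
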
Factorial ANOVA
- an Analysis of Variance test with more than one independent variable or factor

Two-Way ANOVA
- used when you have one measurement variable (i.e. a quantitative variable) and two nominal variables
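
A minimal sketch using statsmodels' formula API, assuming statsmodels and pandas are available; the column names (y, factor_a, factor_b) and the data are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: one measurement (y), two nominal factors
df = pd.DataFrame({
    "y":        [4.1, 4.5, 5.0, 5.2, 6.1, 6.3, 5.8, 6.0],
    "factor_a": ["low", "low", "high", "high", "low", "low", "high", "high"],
    "factor_b": ["x", "x", "x", "x", "y", "y", "y", "y"],
})

# Two-way ANOVA with interaction between the two factors
model = smf.ols("y ~ C(factor_a) * C(factor_b)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```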

 Three-way ANOVA
- has three factors (independent variables) and one dependent variable.
MANOVA

- an ANOVA with several dependent variables.

Repeated Measures ANOVA
- almost the same as one-way ANOVA
- used when the same group of participants is being measured over and over again
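
A minimal sketch with statsmodels' AnovaRM, assuming statsmodels and pandas are available; the subjects, conditions, and scores are hypothetical:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Each subject is measured once under each condition
df = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "condition": ["t1", "t2", "t3"] * 4,
    "score":     [5.0, 6.2, 7.1, 4.8, 5.9, 6.8, 5.2, 6.0, 7.4, 4.9, 6.1, 7.0],
})

result = AnovaRM(df, depvar="score", subject="subject", within=["condition"]).fit()
print(result)
```
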
Assumptions for Two Way ANOVA

1. The population must be close to a normal distribution

2. Samples must be independent.

3. Population variances must be equal.

4. Groups must have equal sample sizes


Measure of Association
- used to quantify a relationship between two or more variables

 Correlation analysis
- used to measure how strong a relationship is between two variables
- does not imply that there is any cause-and-effect relationship

Regression analysis
- used to fit a relationship between two variables such that one can be predicted from the other
- does imply a cause-and-effect relationship, or at least an implication that one of the variables is a ‘response’ in some way
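
A minimal sketch with scipy.stats.linregress; the x and y values are hypothetical:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]               # predictor
y = [2.1, 4.3, 5.9, 8.2, 9.8, 12.1]  # response

res = stats.linregress(x, y)
print(f"slope = {res.slope:.3f}, intercept = {res.intercept:.3f}, r = {res.rvalue:.3f}")

# Predict the response at a new x value
x_new = 7
print(res.intercept + res.slope * x_new)
```
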
Type of Correlation Analysis

Pearson Correlation
- the most common measure of correlation in stats
- measures the strength of the linear relationship between two variables on a continuous scale
- Disadvantage: unable to tell the difference between dependent variables and independent variables
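
A minimal sketch with scipy.stats.pearsonr on hypothetical paired data:

```python
from scipy import stats

x = [2, 4, 6, 8, 10, 12]
y = [1.9, 4.4, 5.8, 8.3, 9.7, 12.2]

r, p = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p:.3f}")  # r near +1 or -1 = strong linear relationship
```
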
Non-Parametric Test
- a class of statistical tests that make much weaker assumptions
- Advantage: can be employed with a much wider range of forms of data than their parametric cousins

Examples of Non-Parametric Tests


Wilcoxon rank-sum test / Mann-Whitney test
- alternative to the t-test

Wilcoxon signed-rank test
- alternative to the paired t-test

Kruskal-Wallis Test
- alternative to ANOVA

Spearman’s rank correlation
- alternative to the Pearson coefficient of correlation
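
A minimal sketch of all four alternatives in SciPy; the sample data are hypothetical:

```python
from scipy import stats

group_1 = [5.1, 4.9, 5.6, 5.2, 5.0, 5.3]
group_2 = [4.5, 4.8, 4.3, 4.7, 4.6, 4.4]
group_3 = [5.5, 5.8, 5.2, 5.9, 5.4, 5.7]

# Mann-Whitney / Wilcoxon rank-sum: alternative to the t-test
print(stats.mannwhitneyu(group_1, group_2))

# Wilcoxon signed-rank: alternative to the paired t-test (paired samples)
print(stats.wilcoxon(group_1, group_2))

# Kruskal-Wallis: alternative to one-way ANOVA (three or more groups)
print(stats.kruskal(group_1, group_2, group_3))

# Spearman's rank correlation: alternative to Pearson's r
print(stats.spearmanr(group_1, group_2))
```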

Chi-Square Test
- a standard measure of association between two categorical variables
- unlike Pearson’s correlation coefficient or Spearman’s rho, the chi-square test is a measure of the significance of the association rather than a measure of the strength of the association
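
A minimal sketch with scipy.stats.chi2_contingency on a hypothetical 2x2 table:

```python
from scipy import stats

# Rows: treatment vs. control; columns: improved vs. not improved (hypothetical counts)
table = [[30, 10],
         [20, 25]]

stat, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {stat:.3f}, p = {p:.3f}, dof = {dof}")
# small p -> significant association between the two categorical variables
```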