STATISTICS
Inferential Statistics
- allows you to make predictions (inferences) about a population from the sample data you gathered
Parametric Test
- assumes an underlying statistical distribution in the data (for example, a normal distribution)
- has more statistical power than a nonparametric test when its assumptions are met
Independence
- The observations in different groups should be independent of each other.
Homogeneity of variance
- assumes that different samples have the same variance, even if they came
from different populations.
- Levene's test is used to test for homogeneity of variance.
- Its null hypothesis states that the group variances are equal.
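As a rough illustration of how Levene's test works, the sketch below computes the W statistic by hand using only the Python standard library. It assumes the mean-centered version of the test (absolute deviations from each group mean); the function name and the sample data are hypothetical. W is compared against an F distribution with k - 1 and N - k degrees of freedom, and a library routine such as scipy.stats.levene would also return the p-value.

```python
from statistics import mean


def levene_statistic(*groups):
    """Return Levene's W statistic (mean-centered version) for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each observation from its own group mean.
    deviations = []
    for g in groups:
        center = mean(g)
        deviations.append([abs(x - center) for x in g])

    group_means = [mean(d) for d in deviations]
    grand_mean = mean([z for d in deviations for z in d])

    # W is a one-way ANOVA F ratio computed on the absolute deviations.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data: the second group is twice as spread out as the first.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(levene_statistic(group_a, group_b))
```

A large W (relative to the F critical value) is evidence against the null hypothesis of equal variances. The sketch does not guard against the degenerate case where every deviation within a group is identical.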
CONTINUOUS DISTRIBUTION
Hypothesis:
H0: The data follow a specified distribution
Ha: The data do not follow the specified distribution.
Formula:
Kolmogorov-Smirnov Test
- compares your data with a known distribution and lets you know whether they have the same distribution
- The test is distribution-free in the sense that the critical values do not depend on the specific distribution being tested.
- There are no restrictions on sample size; small samples are acceptable.
- cannot be used for discrete distributions
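The K-S statistic D is the largest vertical distance between the empirical distribution function of the sample and the cumulative distribution function (CDF) of the reference distribution. The sketch below computes D with only the standard library; the function name and the sample values are hypothetical, and the reference distribution is assumed to be fully specified in advance. A library routine such as scipy.stats.kstest would also return a p-value.

```python
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Return the one-sample K-S statistic D for a fully specified continuous CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so both sides of the step have to be checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical sample compared against a standard normal distribution.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
print(ks_statistic(sample, NormalDist(mu=0.0, sigma=1.0).cdf))
```

H0 is rejected when D exceeds the critical value for the chosen significance level and sample size.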
Lilliefors Test
- is an improvement on the Kolmogorov-Smirnov (K-S) test, correcting the K-S for small values at the tails of probability distributions
- can be used when you don’t know the population mean or standard deviation
Anderson-Darling Test
- used to test whether a set of continuous data is likely to have come from a
normal distribution.
- generally more powerful than the K-S test, especially since all the data values are considered and the tails are given more weight
- has low power to reject H0 when the sample is small (n < 20) and may be
overly sensitive (i.e., rejects H0 too often) when the sample is large (n >
1000).
Shapiro-Wilk Test
- tells whether a random sample comes from a normal distribution
- The test has limitations, most importantly a bias by sample size: the larger the sample, the more likely you are to get a statistically significant result.
- more appropriate for small sample sizes
Examples of Parametric Statistics
t-Test
- one of the most common statistical procedures for determining whether a difference is statistically significant
- Used to compare the means of two distinct/independent groups
- Assumes equal variability (variance) in both data sets
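A minimal sketch of the independent two-sample t statistic, assuming equal variances so that the two sample variances can be pooled. The function name and data are hypothetical; a library routine such as scipy.stats.ttest_ind would also return the p-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for two independent samples with pooled variance."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances (equal-variance assumption).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical scores from two independent groups.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t, df = pooled_t_statistic(group_a, group_b)
print(t, df)
```

The resulting t is compared against a t distribution with n1 + n2 - 2 degrees of freedom.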
Paired t-Test
- Used to compare the means of two measurements taken from the same individual, object, or related units
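The paired t-test reduces to a one-sample t-test on the differences within each pair. The sketch below shows that calculation with the standard library; the function name and the before/after values are hypothetical. A library routine such as scipy.stats.ttest_rel would also return the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Work on the within-pair differences only.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical measurements taken on the same four subjects.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
print(paired_t_statistic(before, after))
```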
Analysis of Variance
- Similar to a t-test, but used to analyze the difference between the means when more than two groups are being compared
One-Way ANOVA
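A one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F ratio by hand using the standard library; the function name and the data are hypothetical. A library routine such as scipy.stats.f_oneway would also return the p-value.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical measurements from three groups.
print(one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5]))
```

A large F relative to the F distribution with (k - 1, N - k) degrees of freedom suggests that at least one group mean differs from the others.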
Two-Way ANOVA
- Use a two-way ANOVA when you have one measurement variable (i.e., a quantitative variable) and two nominal variables.
Three-Way ANOVA
- has three factors (independent variables) and one dependent variable
MANOVA
2. Samples must be independent.
Correlation analysis
- used to measure how strong a relationship is between two variables
- It does not imply that there is any cause-and-effect relationship
Regression analysis
- Regression is used to fit a relationship between two variables such that one
can be predicted from the other.
- Unlike correlation, regression treats one of the variables as a ‘response’ to the other. This suggests a cause-and-effect direction, but a fitted regression does not by itself prove causation.
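As an illustration of fitting a relationship so that one variable can be predicted from the other, the sketch below fits a straight line by ordinary least squares using the standard library. The function name and the data are hypothetical, and a straight-line relationship is an assumption of this example.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical predictor (x) and response (y) values.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
slope, intercept = fit_line(x, y)
# Predict the response for a new predictor value.
print(intercept + slope * 5)
```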
Types of Correlation Analysis
Pearson Correlation
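A minimal sketch of the Pearson correlation coefficient r, which measures the strength of the linear relationship between two variables on a scale from -1 to 1. The function name and data are hypothetical; a library routine such as scipy.stats.pearsonr would also return a p-value.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations.
print(pearson_r([1, 2, 3], [1, 3, 2]))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.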
Chi-Square Test