Stats - Test Specs

Always mention removing outliers so they don’t skew the dataset

Levene’s Test (Homogeneity of Variance)


- is an inferential statistic used to assess the equality of variances for a variable calculated
for two or more groups.
- If the resulting p-value of Levene's test is less than some significance level
(typically 0.05), the obtained differences in sample variances are unlikely to have
occurred based on random sampling from a population with equal variances. Thus, the
null hypothesis of equal variances is rejected and it is concluded that there is a difference
between the variances in the population.
- After running Levene’s test, we will assume that it came out not significant thereby
implying homogeneity of variance.
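
A minimal sketch of the Levene's W statistic described above, using only the Python standard library. It assumes the classic version of the test (absolute deviations from each group mean); the function name and the example scores are made up for illustration. It returns W and its degrees of freedom only, so the p-value still has to come from an F table or a stats package.

```python
from statistics import mean

def levene_w(*groups):
    """Return (W, df1, df2) for Levene's test, using deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = absolute deviation of each score from its own group mean
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    w = (n_total - k) / (k - 1) * between / within
    return w, k - 1, n_total - k

# Hypothetical scores for two groups (made up for illustration)
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
w, df1, df2 = levene_w(group_a, group_b)
print(f"W = {w:.3f}, df = ({df1}, {df2})")  # W = 2.057, df = (1, 8)
```

W is compared against the F distribution with (k - 1, N - k) degrees of freedom. Here the critical F(1, 8) at the .05 level is about 5.32, so W = 2.057 is not significant and homogeneity of variance would be assumed.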

Effect Size (G*Power, 2019)


- is the magnitude of difference between groups or sometimes called the magnitude of the
experimental effect/phenomenon
- suggests practical significance: whether the effect is large enough to warrant action in the
real world
- Regression:
o we are looking for effect size (f squared: small .02, medium .15, large .35; the difference
between your null hypothesis and the alternative hypothesis that you hope to detect),
alpha of .05, expected power (.8)
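
A small sketch of how f squared relates to R squared. The conversion f2 = R2 / (1 - R2) is Cohen's standard definition and is not spelled out in these notes; the function names are hypothetical. The labels use the .02 / .15 / .35 benchmarks listed above.

```python
def cohens_f2(r_squared):
    """Cohen's f squared from R squared: f2 = R2 / (1 - R2)."""
    if not 0 <= r_squared < 1:
        raise ValueError("R squared must be in [0, 1)")
    return r_squared / (1 - r_squared)

def label_f2(f2):
    """Label f2 using the benchmarks from the notes (.02, .15, .35)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "below small"

# Hypothetical R squared value, for illustration only
f2 = cohens_f2(0.26)
print(round(f2, 3), label_f2(f2))
```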

Statistical Power
- Ranges from 0 to 1
- is the probability that a statistical test will detect differences when they truly exist
- If statistical power is high, the probability of making a Type II error, or concluding there
is no effect when, in fact, there is one, goes down.

For a given statistical test, the sample size is calculated from statistical power, effect size, and
significance level.
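
To make that relationship concrete, here is a rough sketch for one specific case: a two-tailed comparison of two independent group means with effect size d. It assumes the normal approximation n = 2 * ((z_alpha/2 + z_power) / d)^2 per group, which is a textbook shortcut and not necessarily what G*Power computes; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-independent-group comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical z for the significance level
    z_power = NormalDist().inv_cdf(power)          # z corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

With alpha = .05 and power = .8 this gives 393, 63 and 25 per group for small, medium and large d. Exact t-based calculations (for example in G*Power) typically give a slightly larger n.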

Type I error: false positive (rejecting a true null); Type II error: false negative (failing to
reject a false null)

One-Sample T-test
- Overview: A one-sample t-test (two-tailed) compares the sample mean with the known
population mean to determine whether the difference between the two means is
statistically significant. The null hypothesis is that the sample mean equals the population
mean; the alternative hypothesis is that the two are significantly different.
- Assumptions
o Independent random sampling
o Normal distribution
o SD of sampled population equals that of the comparison population

T-test for two independent sample means

- Requirements
- Dichotomous (two-group) independent variable
- Continuous dependent variable:
- DV measured on interval/ratio scale (although many treat ordinal scales
like interval scales as well)
- Generally, you should use the t-test if your two groups are clearly distinct (group 1
vs group 2). Otherwise, a correlation design might be better if your IV is a
continuous variable because less information is lost.
- Assumptions:
- Independent random sampling
- Normal distribution
- Central Limit Theorem can be generalized to imply that even when two
populations are not normally distributed, the distribution of sample means
will approach the normal distribution as the sample size increases
- If the data for one or both of your groups are not
approximately normally distributed and group sizes differ greatly, you
have two options:
- transform your data so that the data becomes normally distributed
- run the Mann-Whitney U test which is a non-parametric test
- Homogeneity of Variance: HOV (Levene’s test); unequal variances (a significant
Levene’s test) will affect the Type 1 error rate
- Generally not a problem if:
- Large sample sizes (at least 100 subjects in each)
- Both samples are the same size
- One sample variance is no more than twice as large as the other
- How to interpret results
- T value should be significant at 0.05 level
- Effect size, d: 0.2 - small, 0.5 - medium, 0.8 - large (Cohen, 1988)
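
A sketch of the independent-samples t statistic and Cohen's d, assuming the pooled-variance (equal variances) form of the test. The function name and the scores are made up for illustration; the p-value would come from a t table or a stats package.

```python
from math import sqrt
from statistics import mean, variance

def independent_t_and_d(group1, group2):
    """Return (t, Cohen's d, df) for two independent groups, pooled-variance form."""
    n1, n2 = len(group1), len(group2)
    # Pooling the variances assumes homogeneity of variance (check with Levene's test)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    diff = mean(group1) - mean(group2)
    t = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))
    d = diff / sqrt(pooled_var)
    return t, d, n1 + n2 - 2

# Hypothetical scores, for illustration only
t, d, df = independent_t_and_d([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f"t({df}) = {t:.3f}, d = {d:.2f}")  # t(8) = 1.897, d = 1.20
```

With only five scores per group, t(8) = 1.897 falls short of the two-tailed .05 critical value of about 2.306 even though d = 1.2 is large, which is why effect size is reported alongside significance.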

T-test for two dependent/matched groups/samples


- Requirements
- Dichotomous (two-condition) independent variable
- Continuous dependent variable
- DV measured on interval/ratio scale (although many treat ordinal scales
like interval scales as well)
- Assumptions
- Independent Random Sampling
- Normal distribution
- How to interpret results
- T value should be significant at the 0.05 level
- Effect size, Cohen’s d (for dependent measures) - essentially same as for
independent samples (but best to specify that you’re doing it for
matched/dependent samples)
- 0.2 - small, 0.5 - medium, 0.8 - large (Cohen, 1988)
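
A sketch of the matched-samples version, assuming Cohen's d is computed on the difference scores (mean difference divided by the SD of the differences, sometimes written d_z). That choice, the function name and the scores are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_d(before, after):
    """Return (t, Cohen's d on difference scores, df) for matched samples."""
    if len(before) != len(after):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    sd = stdev(diffs)                 # SD of the difference scores
    t = mean(diffs) / (sd / sqrt(n))  # mean difference over its standard error
    d = mean(diffs) / sd
    return t, d, n - 1

# Hypothetical pre/post scores for the same five participants
t, d, df = paired_t_and_d([10, 12, 9, 11, 13], [12, 15, 10, 14, 14])
print(f"t({df}) = {t:.3f}, d = {d:.2f}")  # t(4) = 4.472, d = 2.00
```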

Chi-square
- Overview: measures whether there is a relationship between two categorical variables;
whether the distributions of categorical variables differ from each other.
- statistical independence means that the frequency distribution of a variable is the
same for all levels of some other variable.
- Expected frequencies are the frequencies we expect in our sample if the null
hypothesis holds.
- Observed frequencies are the frequencies actually counted in our sample.
- One-way Chi Square/Goodness of fit test
- Only one IV (one variable, like a one sample t-test)
- determine whether or not the relative frequencies in the observed
categories are similar to, or statistically different from, the hypothesized
relative frequencies within those same categories
- Alternative - there is a difference between each level of the IV
- Assumptions
- Nominal/ordinal (categorical) data for DV
- Independence of observations
- Groups of the categorical IV should be mutually exclusive (e.g. a male
employee will be counted only under the male level, and cannot be
counted under the female level)
- Expected frequency of at least 5 in each group of your categorical IV
- How to interpret results
- Chi square value should be significant at the 0.05 level - implying that
each level of the IV is significantly different from the other levels
- Effect size - Cramer’s phi
- 0.1 - small, 0.3 - medium, 0.5 - large
- Two-way Chi Square/Test of Independence
- 2x2 design with two variables, similar to an interaction effect of an ANOVA
- Null - the two IVs are independent of each other
- Is the outcome in one variable related to the outcome in some other
variable
- Assumptions
- Two IVs should be measured using categorical data (nominal/ordinal DV)
- Each IV should have at least two levels that are independent of each
other/mutually exclusive
- How to interpret results
- Pearson Chi Square should be statistically significant at the 0.05 level -
implying that there is a statistically significant association between the two
IVs (they seem to be interacting)
- Effect size - Cramer’s phi
- 0.1 - small, 0.3 - medium, 0.5 - large
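
A sketch of the test of independence: expected frequencies from the row and column totals, the Pearson chi-square statistic, and the effect size computed as sqrt(chi2 / (N * min(rows - 1, cols - 1))), which reduces to phi for a 2x2 table. No continuity correction is applied; the function name and the counts are made up for illustration.

```python
from math import sqrt

def chi_square_independence(table):
    """Return (chi2, df, effect size, expected frequencies) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency under the null: row total * column total / N
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    effect = sqrt(chi2 / (n * min(len(row_totals) - 1, len(col_totals) - 1)))
    return chi2, df, effect, expected

# Hypothetical 2x2 counts, for illustration only
chi2, df, effect, expected = chi_square_independence([[30, 10], [10, 30]])
print(chi2, df, effect)  # 20.0 1 0.5
```

Every expected frequency here is 20, so the "at least 5" assumption holds, and chi2 = 20 with df = 1 is well above the .05 critical value of about 3.84.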

One-way ANOVA
- Overview: compare the means of # sample groups and determine whether any of those
means are statistically significantly different from each other.
- One IV multiple levels
- Independent
- Assumptions
- HOV (Levene’s test for HOV)
- Independent Random Sampling
- Normal Distribution
- How to interpret results
- F ratio should be significant at the 0.05 level - only tells you that a
statistically significant difference exists but not where it exists
- Use post-hoc tests
- if IV has only two levels - use an independent samples t-
test
- If IV has three levels - use Fisher’s LSD (because only
three pairs of comparisons)
- If IV has more than three levels - use Tukey’s HSD
Note: Can use a Bonferroni adjusted alpha level for post-hoc tests
- Effect size - ETA SQUARED
- Small: .01
- Medium: .06
- Large: .14
- http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize
- Repeated Measures
- Assumptions
- Sphericity (Mauchly’s W) - basically means that the variances of the differences
between every pair of levels of the IV are equal (the amount of interaction
between any two levels of the IV is equally large)
- Independent Random Sampling
- Normal Distribution
- How to interpret results
- F ratio should be significant at the 0.05 level - only tells you that a
statistically significant difference exists but not where it exists
- Use post-hoc tests
- if IV has only two levels - use a dependent samples t-test
- If IV has three levels - use Fisher’s LSD (because only
three pairs of comparisons)
- If IV has more than three levels - use Tukey’s HSD
Note: Can use a Bonferroni adjusted alpha level for post-hoc tests
- Effect size - ETA SQUARED
- Small: .01
- Medium: .06
- Large: .14
- http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/effectSize
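
A sketch for the independent (between-subjects) one-way design only: the F ratio, eta squared as SS between / SS total, and a Bonferroni adjusted alpha for the post-hoc comparisons. The repeated measures design partitions the variance differently and is not covered here. Function names and scores are made up for illustration.

```python
from statistics import mean

def one_way_anova(*groups):
    """Return (F, df_between, df_within, eta squared) for independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after a Bonferroni adjustment."""
    return alpha / n_comparisons

# Hypothetical scores for three levels of one IV
f_ratio, df_b, df_w, eta2 = one_way_anova([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}, eta squared = {eta2:.4f}")
print(round(bonferroni_alpha(0.05, 3), 4))  # three pairwise comparisons
```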

Two-way ANOVA
- Overview: understand if there is an interaction between the two independent variables on
the dependent variable
- Multiple IVs with multiple levels
Note: for independent/repeated measures designs, follow the same protocol but look at the
assumptions for the respective one-way designs.
Mixed design:
- Null: the means of IV1 conditions are equal; the means of IV2 conditions are
equal; no interaction between IV1 & IV2
- Assumptions
- HOV
- Homogeneity of Covariance across groups (Box’s M)
- Sphericity
- Independent Random Sampling
- Normal Distribution
- How to interpret results
- You will have two main effects and one interaction effect (3 F ratios)
- If main effects significant, use post-hoc tests for follow up (same as One-
Way ANOVAs)
- If interaction effect significant, pay more attention to it than to your
significant main effects
- If IV(s) has two levels - post-hoc will be a t-test
- If IV(s) has more than two levels - simple main effect - keeping
one level of IV1 constant, and doing a one-way ANOVA for the
other IV
- Again, this will tell you that a difference exists, but you
don’t know where, so maybe follow up with a Fisher’s
LSD/Tukey’s HSD
- Ordinal and disordinal (check)
- Effect size
- Eta squared for main effects of each of the two IVs
- 0.01 - small, 0.06 - medium, 0.14 - large
- Partial eta squared for the interaction effect between the two IVs
- 0.01 - small, 0.09 - medium, 0.25 - large
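
A short sketch of the difference between the two effect sizes named above, assuming the usual definitions: eta squared = SS effect / SS total, and partial eta squared = SS effect / (SS effect + SS error). The sums of squares are hypothetical values of the kind read off an ANOVA summary table.

```python
def eta_squared(ss_effect, ss_total):
    """Share of total variance attributed to one effect."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_error):
    """Share of variance attributed to one effect after excluding the other effects."""
    return ss_effect / (ss_effect + ss_error)

# Hypothetical sums of squares from a two-way ANOVA table
ss = {"IV1": 20.0, "IV2": 10.0, "IV1xIV2": 10.0, "error": 60.0}
ss_total = sum(ss.values())

print(eta_squared(ss["IV1"], ss_total))                            # main effect of IV1
print(eta_squared(ss["IV2"], ss_total))                            # main effect of IV2
print(round(partial_eta_squared(ss["IV1xIV2"], ss["error"]), 3))   # interaction
```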

Correlation
- Requirements:
- Each variable should be continuous
- Each observation/participant should have a pair of values (related variables)
- Assumptions:
- Linearity
- Independent random sampling
- Normal distribution
- Absence of influential outliers
- Homoscedasticity (look at scatterplot, distance from data points to straight line
should be roughly equal)
- Used for:
- Test-retest reliability
- Internal consistency (split-half)
- Interrater reliability
- Criterion validity
- Strength of correlation, Pearson’s r:
- .1 - small, .3 - medium, .5 - large (Cohen, 1988)
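
A minimal sketch of Pearson's r computed from paired values, using only the standard library. The function name and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of paired values."""
    if len(x) != len(y):
        raise ValueError("each observation needs a pair of values")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-variation
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired scores for five participants
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))  # 0.775
```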

Linear Regression
- Assumptions:
- Linearity
- Independent random sampling
- Normal distribution
- Absence of influential outliers
- Homoscedasticity (look at scatterplot, distance from data points to straight line
should be roughly equal)
- How to interpret results
- R squared - The proportion of variance in the DV accounted for by the IV
- Small - 0.01, Medium - 0.09, Large - 0.25
- Check the beta weight for per unit change in DV resulting from per unit change in
IV - sign of beta weight will explain direction of relationship
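
A sketch of a one-predictor least-squares fit showing where the slope and R squared come from. Note the slope here is the unstandardized coefficient (change in DV per unit change in IV); with a single predictor the standardized beta equals Pearson's r. Function name and data are made up for illustration.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Return (intercept, slope, R squared) for a one-predictor least-squares fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    # R squared = 1 - (unexplained variation / total variation in the DV)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical IV and DV scores
intercept, slope, r2 = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(intercept, 2), round(slope, 2), round(r2, 2))  # 2.2 0.6 0.6
```

The positive slope indicates a positive relationship: each one-unit increase in the IV goes with a 0.6-unit increase in the predicted DV.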

Logistic Regression
- Overview:
- predicting for every unit increase in IV, what is the likelihood of a dichotomous
outcome
- model the probability of an event occurring depending on the values of the
independent variables, which can be categorical or numerical
- estimate the probability that an event occurs for a randomly selected observation
versus the probability that the event does not occur
- predict the effect of a series of variables on a binary response
- classify observations by estimating the probability that an observation is in a
particular category
- Assumptions (http://www.statisticssolutions.com/assumptions-of-logistic-regression/)
- NOT REQUIRED: First, logistic regression does not require a linear relationship
between the dependent and independent variables. Second, the error terms
(residuals) do not need to be normally distributed. Third, homoscedasticity is not
required. Finally, the dependent variable in logistic regression is not measured on
an interval or ratio scale. However, some other assumptions still apply.
- Dependent variable to be binary
- Observations to be independent of each other - in other words, the observations
should not come from repeated measurements or matched data.
- Little or no multicollinearity among the independent variables - this means
that the independent variables should not be too highly correlated with each other
- Linearity of independent variables and log odds. Although this analysis does
not require the dependent and independent variables to be related linearly, it
requires that the independent variables are linearly related to the log odds.
- Logistic regression typically requires a large sample size. A general guideline
is that you need a minimum of 10 cases with the least frequent outcome for each
independent variable in your model. For example, if you have 5 independent
variables and the expected probability of your least frequent outcome is .10, then
you would need a minimum sample size of 500 (10*5 / .10).
- How to interpret results
- Nagelkerke R squared - amount of variance accounted for by predictors in the
regression model in predicting the DV
- Odds-ratio - change in odds of predicting DV with one unit change in one IV,
holding all other variables constant
- Example - if position grade was found significant, then it would suggest
that a change in one unit of grade will affect (increase/decrease) the
likelihood of return by (odds ratio) times - this is with reference to
expatriate training
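
Two small helpers for the points above, as a sketch: the sample-size guideline (10 cases of the least frequent outcome per IV) and the conversion from a logistic regression coefficient to an odds ratio, which is exp(b). Function names and the coefficient value are hypothetical.

```python
from math import ceil, exp

def min_sample_size(n_predictors, p_least_frequent, cases_per_predictor=10):
    """Minimum N under the '10 cases per IV for the least frequent outcome' guideline."""
    raw = cases_per_predictor * n_predictors / p_least_frequent
    # Round first so floating-point noise cannot push the ceiling up by one
    return ceil(round(raw, 6))

def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in an IV, holding the other IVs constant."""
    return exp(coefficient)

print(min_sample_size(5, 0.10))      # the example from the notes: 500
print(round(odds_ratio(0.693), 2))   # hypothetical coefficient, odds roughly double
```

An odds ratio above 1 means the odds of the outcome increase with each unit of the IV; below 1 means they decrease.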

Multiple Regression
- Assumptions of Multiple Regression (COHEN, 2013)
- Minimal sample size - 41 + number of predictors (accounts for
- Independent Random Sampling – individual cases should be selected
independently of each other
- Normal Distributions – all variables involved in the multiple regression are
normally distributed
- Homoscedasticity – errors from the regression surface (e.g. line, plane, etc.) have
the same variance in all locations
- Multivariate Outliers – combinations of values on three or more variables that are
unusual and may indicate measurement errors or psychological phenomena
- Measuring Leverage, Residuals and Influence – we probably want to do an outlier
analysis to ensure that these are in check
- Leverage – outliers that can easily rotate the regression line
- Residuals – a point’s observed value minus its predicted value (its value on the regression line)
- Influence – outliers that have leverage and large residual values
- Dichotomous Predictors – all categorical variables have been coded into numeric
values
- Categorical variables with more than two levels have been coded into
dichotomous categorical variables – these are IVs, therefore multiple
logistic regression not done
- Problems with Multiple Regression that will be addressed prior/post-test
- Multicollinearity – no two variables are perfectly or highly correlated, or
predicted by a combination of other variables
- Shrinkage – when regression model based on one sample but to be applied to
another
- Cross-validation to address shrinkage – take one half of sample and create
regression model, use beta weights to apply to the other half of sample and
see if it works
- HAVING TOO MANY PREDICTORS - USE A BONFERRONI ADJUSTED
ALPHA FOR YOUR REGRESSION MODEL + MINIMAL SAMPLE SIZE =
41 + # OF PREDICTORS - CITE THE STATS TEXTBOOK - COHEN (2013)
- How to interpret results
- Check R squared value for amount of variance explained by predictors in the
regression model
- Check beta weights to see what the per unit change in DV resulting from per unit
change in IV will be (for each IV) - the sign of the beta weight will indicate
positive/negative relationship
NOTE: We will only use predictors that have statistically significant correlations with the DV in
the regression model - relationship can be positive/negative or weak/moderate/strong
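
A tiny planning sketch for the two rules of thumb stated in this section: the minimal sample size of 41 + number of predictors, and a Bonferroni adjusted alpha when there are many predictors. Dividing alpha by the number of predictors is an assumption about how the adjustment is applied; function names are hypothetical.

```python
def minimal_sample_size(n_predictors):
    """Rule of thumb from the notes: 41 + number of predictors."""
    return 41 + n_predictors

def bonferroni_alpha(n_predictors, alpha=0.05):
    """Assumed adjustment: overall alpha divided by the number of predictors."""
    return alpha / n_predictors

# Hypothetical model with 5 predictors
print(minimal_sample_size(5))            # 46
print(round(bonferroni_alpha(5), 4))     # 0.01
```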

EFFECT SIZE CHART (WE DON’T KNOW ABOUT ETA SQUARED - TBD)
Effect size   ‘r’ (correlation) /               R2 (R squared) /        f       Cohen’s d
              Cramer’s phi (for chi square)     Partial eta squared             (t-tests)
Low           0.1                               0.01                    0.1     0.2
Medium        0.3                               0.09                    0.25    0.5
High          0.5                               0.25                    0.4     0.8

Effect Size                 Use                                Small   Medium   Large
Correlation inc Phi                                            0.1     0.3      0.5
Difference in arcsines      Comparing two proportions          0.2     0.5      0.8
η2                          Anova                              0.01    0.06     0.14
omega-squared               Anova; See Field (2013)            0.01    0.06     0.14
Multivariate eta-squared    one-way MANOVA                     0.01    0.06     0.14
Cohen's f                   one-way an(c)ova (regression)      0.1     0.25     0.4
η2                          Multiple regression                0.02    0.13     0.26
κ2                          Mediation analysis                 0.01    0.09     0.25
Cohen's f                   Multiple Regression                0.14    0.39     0.59
Cohen's d                   t-tests                            0.2     0.5      0.8
Cohen's ω                   chi-square                         0.1     0.3      0.5
Odds Ratios                 2 by 2 tables                      1.5     3.5      9
Average Spearman rho        Friedman test                      0.1     0.3      0.5