Midterm II Review

Remember: include lots of examples, be concise but give lots of information.

1. Differentiate between and within subjects design
o A between subjects design is when you have separate groups that undergo different treatments (i.e. group 1 goes through condition A, group 2 through condition B). A within subjects design is when there is only one group of participants and they undergo ALL of the experimental procedures (i.e. the group receives conditions A, B, and C).

Know assumptions of ANOVA
o Normality: the data are normally distributed. This is hard to test, especially with n < 30; it is a theoretical assumption. The Kolmogorov-Smirnov test tests for normality, but is only valid for large n.
o Homogeneity of variance: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$

We can test this using tests like Levene's and Bartlett's, and we can correct for violations of homogeneity of variance. This assumption is ROBUST with large n, even for non-normal data. Rule of thumb: the largest variance should be no more than 4 times the smallest.
Checking homoscedasticity (another word for homogeneity of variance):
  - Bartlett's test: tests whether the group variances are equal. Output is a P value. If n < 30, the test may not work; with n < 15, you may get false results. If P is significant, your data are not homogeneous.
  - Levene's test: gives you a P value. If it is significant, your data are not homogeneous.
  - Brown-Forsythe test: good for non-normal data; uses the median and not the mean. Good for things like test scores.
o Sphericity (Repeated Measures ANOVA only)

Other ANOVA info
o An ANOVA is a test of the null hypothesis that the group means are equal. We test it by analyzing the variance both within the samples and between the sample means.

Source    SS                                        dF      MS                    F                    P
Between   $\sum n(\bar{x} - \bar{x}_{GM})^2$        K - 1   SSbetween/dFbetween   MSbetween/MSwithin
Within    $\sum (x - \bar{x})^2$                    N - K   SSwithin/dFwithin
Total     $\sum (x - \bar{x}_{GM})^2$ = SSb + SSw   N - 1

POSTHOC comparisons
Finding out where the differences are after running an omnibus test. The more comparisons you make, the more likely you are to make a type I (alpha) error: you may conclude that there is a significant difference where there is not. This would be bad because then your sample is not representative. You can protect against this by having large sample sizes and replicable data.
  - The Bonferroni correction: use t-tests and divide alpha by the number of comparisons you are making to set your threshold for significance. The tolerance for significance is now much lower: p < (alpha / # of comparisons) = significant.
  - Tukey's Procedure (HSD test): $q_s = (Y_A - Y_B) / SE$, where $Y_A$ is the larger of the 2 means and $Y_B$ is the smaller. Very stringent test, designed to control inflated alpha error rates. Makes it harder to find significance. Gives you P values for each comparison.
  - Fisher's Least Significant Difference (LSD) test: don't know the formula; the estimate of pooled variance (in the t-test) is replaced with MSe. Really bad test: very easy to find significance.

2. Know variance estimates (pie chart) and how these lead to calculating the F statistic. Know how to calculate MS and F

[Figure: pie chart "Partitioned Variance in ANOVA" with two slices, Explained Error and Unexplained Error]

o MSb = SSbetween/dFbetween. Reflects variability among group means.
o MSw = SSwithin/dFwithin. Reflects variance within a group, aka variability among scores within a sample. This is bad, and we want to control for it. Increased sample size leads to a decrease in variance within a group. An increase in the within mean squared error could mask a significant effect. Also equal to the squared standard error.
o F = MSbetween/MSwithin. F is the ratio of the mean square between to the mean square within. A large F means most of the variability is due to differences between the groups (good), while a low F means most of the variability is due to differences within the groups (variability among subjects), which is bad. In a simple comparison (two groups), $F = t^2$.
o If the null hypothesis is true, the within group estimate will be about the same as the between group estimate (ratio near 1:1). If the alternative hypothesis is true (we reject the null), the between group variance estimate is significantly larger than the within group variance estimate. This means the variance is caused by differences between our treatment groups (good).
What each variance estimate reflects (X = contributes to the estimate):

                                          Variation within   Variation between
                                          populations        populations
Null is true
  Within group estimate of variance       X
  Between group estimate of variance      X
Null is false
  Within group estimate of variance       X
  Between group estimate of variance      X                  X
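The MS and F calculations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_way_anova` and the three small groups are made up for the example and are not course data, and only the standard library is used.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute SS, dF, MS and F for a one-way between-subjects ANOVA.

    `groups` is a list of lists of scores, one inner list per group.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of groups (K)
    n_total = len(all_scores)  # total number of scores (N)

    # SSbetween: n * (group mean - grand mean)^2, summed over groups
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSwithin: (score - own group mean)^2, summed over every score
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Made-up example: three groups of three scores each
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["ss_between"], result["ss_within"], result["F"])
```

With these toy scores the group means are 2, 3 and 6, so most of the variability sits between the groups and F comes out well above 1.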

Why use F?
We use F for hypothesis testing: we calculate the statistic and compare it to the sampling distribution for that statistic (the F distribution), which has 2 different degrees of freedom (dFb and dFw).

Know 3 effect sizes
o Effect size reflects the proportion of variance accounted for by the effect. A larger $\eta^2$ is good because it means most of the variance is due to differences between the groups. With one factor, eta squared and partial eta squared are the same. With more complex designs (2 factors or more) the estimates will differ, and partial is better (more accurate). Effect size is just the squared correlation coefficient ($R^2$).
o Eta squared ($\eta^2$): $R^2 = SS_{between} / SS_{total}$
o Partial eta squared (partial $\eta^2$): $R^2 = SS_{between} / (SS_{between} + SS_{error(within)})$. More typically reported than eta squared.
o Generalized eta squared (generalized $\eta^2$): standardized measure used for Repeated Measures ANOVAs.

3.

Define Factorial ANOVA
An ANOVA performed with 2 or more independent variables (factors).

Factorial ANOVA table:

Source        SS                                        dF                                   MS
Rows          SSrows                                    Nrows - 1                            SSrows/dFrows
Columns       SScolumns                                 Ncolumns - 1                         SScolumns/dFcolumns
Interaction   SSinteraction                             Ncells - dFrows - dFcolumns - 1      SSinteraction/dFinteraction
Within        $\sum (x - \bar{x})^2$                    dF1 + dF2 + dF3 + ... + dFn          SSwithin/dFwithin
Total         $\sum (x - \bar{x}_{GM})^2$ = SSb + SSw   N - 1

(For dFwithin you need to know the # of subjects in each group.)

Source        R2                              F
Rows          SSR / (SStot - SSC - SSint)     MSrows / MSwithin
Columns       SSC / (SStot - SSR - SSint)     MScolumns / MSwithin
Interaction   SSint / (SStot - SSR - SSC)     MSinteraction / MSwithin
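As a quick illustration of the R2 column in the factorial table, the sketch below plugs sums of squares into the three formulas. The function name `factorial_effect_sizes` and the SS values (20, 10, 5, 65) are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def factorial_effect_sizes(ss_rows, ss_cols, ss_int, ss_within):
    """Return R2 for rows, columns and interaction in a two-way design.

    Each R2 removes the other two effects from SStotal, so the
    denominator is that effect's SS plus SSwithin.
    """
    ss_total = ss_rows + ss_cols + ss_int + ss_within
    return {
        "rows": ss_rows / (ss_total - ss_cols - ss_int),
        "columns": ss_cols / (ss_total - ss_rows - ss_int),
        "interaction": ss_int / (ss_total - ss_rows - ss_cols),
    }


# Hypothetical sums of squares (SStotal = 100)
effect_sizes = factorial_effect_sizes(ss_rows=20, ss_cols=10, ss_int=5, ss_within=65)
for source, r2 in effect_sizes.items():
    print(f"{source}: {r2:.3f}")
```

Note that with these numbers the plain ratio SSrows/SStotal would be 0.20, while the partial version for rows is 20/85, which shows why the two estimates differ once there is more than one factor.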

Know Main Effect (ME) and Interactions

o Main effect is the effect of an independent variable on a dependent variable, averaging across the levels of any other independent variables. "a" shows main effect of group, "b" shows the main effect of the group. The lines are parallel.
o An interaction is a relationship among three or more variables: it describes a situation in which the simultaneous influence of two variables on a third is not additive. The lines are not parallel.

Recognizing main effects and interactions
o When you run an analysis you are coming up with a model/theory that explains your data. E.g. the correlation coefficient tells you how well your line fits; 1 is perfect. ANOVA is essentially just this, and effect size is just your squared correlation coefficient ($R^2$).
o Be careful when judging a main effect or interaction: outliers may throw you off. For example, there may appear to be a ME, but that ME may be driven by the interaction.

Variance estimates in factorial design
o Within group variance estimate: as before, the within group variance estimate reflects the average of the population variance estimates made from the scores for each cell.
  $s_w^2 = (s_1^2 + s_2^2 + s_3^2 + \dots + s_K^2) / K$ = MSwithin = SSwithin/dFwithin, where K is the number of cells.
  [Figure: group x condition grid of cell variance estimates]
o Main effect variance estimate: as in a single level design, the main effect between variance estimate is based on the variation between the column / row means. This is like doing 2 one way ANOVAs, one for the condition effect and one for the group effect. You get a P value for both.
o Interaction variance estimate: the interaction variance estimate is based on the variation between the other possible cell groupings. You receive a P value for this as well.

[Figure: pie chart "Variance Estimates in Factorial ANOVA" with four slices: Unexplained variance, ME 1, ME 2, Interaction]
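The four slices above (two main effects, the interaction, and unexplained variance) can be computed directly from the cell scores. The following is a minimal sketch that assumes a balanced design (the same number of scores in every cell); the function name `two_way_anova` and the 2 x 2 data are invented for illustration.

```python
from statistics import mean


def two_way_anova(cells):
    """Partition variance for a balanced two-way factorial design.

    cells[r][c] is the list of scores in row r, column c.
    Assumes every cell has the same number of scores.
    """
    n_rows, n_cols = len(cells), len(cells[0])
    n = len(cells[0][0])  # scores per cell (assumed equal)
    scores = [x for row in cells for cell in row for x in cell]
    gm = mean(scores)

    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[c]]) for c in range(n_cols)]

    ss_total = sum((x - gm) ** 2 for x in scores)
    ss_rows = n * n_cols * sum((m - gm) ** 2 for m in row_means)
    ss_cols = n * n_rows * sum((m - gm) ** 2 for m in col_means)
    # Unexplained error: deviation of each score from its own cell mean
    ss_within = sum((x - mean(cell)) ** 2 for row in cells for cell in row for x in cell)
    # Interaction is what remains after both main effects and error
    ss_int = ss_total - ss_rows - ss_cols - ss_within

    df_rows, df_cols = n_rows - 1, n_cols - 1
    df_int = n_rows * n_cols - df_rows - df_cols - 1
    df_within = len(scores) - n_rows * n_cols  # sum of (n - 1) over cells
    ms_within = ss_within / df_within

    return {
        "ss": {"rows": ss_rows, "columns": ss_cols,
               "interaction": ss_int, "within": ss_within, "total": ss_total},
        "df": {"rows": df_rows, "columns": df_cols,
               "interaction": df_int, "within": df_within},
        "F": {"rows": (ss_rows / df_rows) / ms_within,
              "columns": (ss_cols / df_cols) / ms_within,
              "interaction": (ss_int / df_int) / ms_within},
    }


# Made-up 2 x 2 design with two scores per cell
table = two_way_anova([[[1, 3], [5, 7]],
                       [[2, 4], [10, 12]]])
print(table["ss"])
print(table["F"])
```

Each F here would then be compared against the F distribution with that effect's dF and dFwithin to get the three P values mentioned above.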

Unexplained error (error within) is the deviation of your scores from the mean of their group, i.e. the variance within the group. In a perfect world, there would be no error within, and the only variance would be between the group means (MSbetween). We need ANOVA to detect whether there is significant variance between groups. Small samples are bad. Outliers increase MSw (bad). In a factorial ANOVA, MSw is calculated from the scores within each cell.

$R^2$: proportion of explained variance. If SSe is big, all the between $R^2$ values will be small. A big $R^2$ (e.g. for rows) means the variance was mostly driven by that effect. If the effect size is small, it means it was a bad experiment (black t-shirt example). It literally explains how the variance is accounted for.

POSTHOC analysis of factorial ANOVA
Tukey's output would create a large number of tests: the more tests done, the more likely you are to make a type I error. You can correct for this using the Bonferroni correction. POSTHOC with a complex design is generally hard. Consider:
o Plotting data
o Simple effects analysis (t-tests) (perform these after plotting data, make smart comparisons)

4. Repeated Measures ANOVA
o All subjects are exposed to all conditions, OR the same test is repeated across time with the same individuals.

Know how its variance differs from other ANOVA
o Variance is partitioned differently: subject variance is subtracted from the unexplained error.

[Figure: pie chart "Repeated Measures ANOVA" with three slices: Explained Error, Unexplained Error, Subject error]

SStotal = SSconditions + SSsubject + SSerror
o SSconditions = deviation of condition means from grand mean. Same as SSbetween in regular ANOVA.
o SSsubject (subject error) = deviation of subject means from grand mean.
o SSerror = deviation of subject scores from group mean, after subject error is removed.

Explain Sphericity
o Sphericity is when the variances of the difference scores (between each pair of conditions) are equal. When Compound Symmetry is met, Sphericity is met. It is essential to have sphericity or the ANOVA does not work.
o Dealing with violations of sphericity:
  - Mauchly's Test. Use if n > 30. If Mauchly's test is significant, run Greenhouse-Geisser.
  - Use the Greenhouse-Geisser correction whenever n < 30.
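The repeated measures partition (SStotal = SSconditions + SSsubject + SSerror) and the idea behind sphericity can both be sketched with a small table of scores. Everything named here is an assumption for illustration: the function names `rm_anova` and `difference_score_variances`, the 3 subjects x 3 conditions data, and the error dF of (subjects - 1) x (conditions - 1), which is the standard value but is not stated in these notes. The second function only lists the variances of the difference scores; it is not Mauchly's test.

```python
from itertools import combinations
from statistics import mean, variance


def rm_anova(data):
    """Partition variance for a one-way repeated measures design.

    data[s][c] is the score of subject s in condition c.
    """
    n_subj, n_cond = len(data), len(data[0])
    scores = [x for row in data for x in row]
    gm = mean(scores)
    cond_means = [mean([row[c] for row in data]) for c in range(n_cond)]
    subj_means = [mean(row) for row in data]

    ss_total = sum((x - gm) ** 2 for x in scores)
    ss_cond = n_subj * sum((m - gm) ** 2 for m in cond_means)
    ss_subj = n_cond * sum((m - gm) ** 2 for m in subj_means)
    # Unexplained error with subject variability subtracted out
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = n_cond - 1
    df_error = (n_subj - 1) * (n_cond - 1)
    return {
        "ss_total": ss_total,
        "ss_conditions": ss_cond,
        "ss_subject": ss_subj,
        "ss_error": ss_error,
        "F": (ss_cond / df_cond) / (ss_error / df_error),
    }


def difference_score_variances(data):
    """Variance of the difference scores for every pair of conditions."""
    n_cond = len(data[0])
    return {
        (a, b): variance([row[a] - row[b] for row in data])
        for a, b in combinations(range(n_cond), 2)
    }


# Made-up scores: 3 subjects (rows) x 3 conditions (columns)
scores_by_subject = [[1, 2, 3],
                     [3, 5, 7],
                     [5, 8, 11]]
rm = rm_anova(scores_by_subject)
diff_vars = difference_score_variances(scores_by_subject)
print(rm)
print(diff_vars)
```

In this toy data most of the total variability is subject error, so removing it leaves a small SSerror and a large F. The difference-score variances are not all equal, which is the kind of pattern that sphericity rules out.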

POST-HOC: use paired-samples t-tests. If you have lots of columns (and so lots of comparisons), you can apply the Bonferroni correction.
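The Bonferroni rule (p < alpha / # of comparisons) is simple enough to show directly. The sketch below is illustrative only: the function name `bonferroni` and the three p values are hypothetical, standing in for the output of three paired-samples t-tests.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and a significance flag per comparison."""
    threshold = alpha / len(p_values)  # alpha divided by the number of comparisons
    return threshold, [p < threshold for p in p_values]


# Hypothetical p values from three pairwise comparisons
threshold, significant = bonferroni([0.001, 0.02, 0.04], alpha=0.05)
print(threshold)
print(significant)
```

Here two of the three comparisons would pass an uncorrected alpha of 0.05 but fail the corrected threshold, which is the stricter tolerance the correction is meant to impose.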

Tutorials / Discoveries.