Probably the most popular analysis in psychology. Why?
› Ease of implementation (in the past anyway; now one can do much more complex and computationally extensive analyses with the same amount of effort)
› Allows for analysis of several groups at once
› Allows analysis of interactions of multiple grouping variables
It is also the most abused/misused analysis:
› Arbitrary categorization of variables leads to lower correlations and, in general, missed/inaccurate effects
› Nonexperimental designs can make it difficult to interpret
› In nonexperimental designs it is typically an oversimplification of the research problem
As before, if the assumptions of the test are not met we may have problems in its interpretation. The usual suspects:
› Independence of observations
› Normally distributed variable of measure
› Homogeneity of variance
H0: µ1 = µ2 = … = µk
H1: not H0
ANOVA will tell us that the means are different in some way.
› As the assumptions specify that the shape and dispersion should be equal, the only way left to differ is in terms of means.
However, we will have to do multiple comparisons to give us the specifics.
ANOVA allows for an easy way to look at interactions, and it sidesteps a problem with running many separate tests: the probability of a Type I error goes up with multiple tests.
› For c independent contrasts, the probability of not making a Type I error on any of them is (1 − α)^c.
› So the probability of at least one Type I error = 1 − (1 − α)^c.
› Example with three tests at α = .05: .95 × .95 × .95 = .857 probability of making no Type I error, leaving 1 − .857 = .143 probability of making at least one.
Note: some note that each analysis could be taken as separate and independent of all others. A quick check of these numbers is sketched below.
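A minimal sketch of the familywise error arithmetic above (the α value and test counts are just illustrative):

```python
# Familywise Type I error rate for c independent tests at alpha = .05
alpha = 0.05
for c in (1, 3, 5, 10):
    familywise = 1 - (1 - alpha) ** c
    print(f"c = {c:2d}: P(at least one Type I error) = {familywise:.3f}")
# c = 3 reproduces the slide's numbers: .95**3 = .857, so 1 - .857 = .143
```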
With an independent samples t-test we looked to see if two groups had different means:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

With this formula we found the difference between the groups and divided by the variability within the groups, by calculating their respective variances, which we then pooled together.
In this sense, with our t-statistic we have a ratio of the difference between groups to the variability within the groups (individual scores from group means). A similar approach is taken for ANOVA as well. Total variance comes from:
› Differences between groups (variance due to group membership)
› Differences within groups (variance due to non-systematic factors)
t = (difference between sample means) / (difference due to individual differences/chance)
F = (variance between sample means) / (variance due to individual differences/chance)
Note, as we have before, that the t-test is just a special case (2-group) of the Analysis of Variance, i.e. t² = F. A quick numerical check of this relationship is sketched below.
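A minimal check of t² = F using simulated two-group data and SciPy (the data and variable names here are illustrative only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1 = rng.normal(0.0, 1.0, 30)   # simulated group 1
g2 = rng.normal(0.5, 1.0, 30)   # simulated group 2

t, p_t = stats.ttest_ind(g1, g2)   # pooled-variance independent-samples t-test
F, p_F = stats.f_oneway(g1, g2)    # one-way ANOVA on the same two groups

print(t**2, F)    # equal up to floating-point error
print(p_t, p_F)   # identical p-values
```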
The F-statistic is essentially this ratio, after allowance has been made for n (number of respondents) and the number of groups. The greater the effect of the group differences, the larger F will be, all else being equal. The p-value is the probability that the F-statistic obtained (or one more extreme) occurred due to sampling error, assuming the null hypothesis is true.
› As always, p(D|H0)
Sums of squares:
› Treatment/Model/Regression/Between Groups/Effect etc.
› Error/Residual/Within Groups etc.
› Total (Treatment + Error)
SSTotal = sum of squared deviations of scores about the grand mean
SSGroup = sum of squared deviations of the means of each group from the grand mean (with consideration of group N)
SSerror = sum of squared deviations of the scores about their group means
Regression by another name: think of our means as predicted values, and one can see again that we are still in 'regression land'.

$$SS_{\text{total}} = \sum\sum (X_{ij} - \bar{X}_{..})^2 \quad \text{(observed minus grand mean: variance in the outcome)}$$

$$SS_{\text{b/t groups}} = \sum n_j(\bar{X}_j - \bar{X}_{..})^2 \quad \text{(predicted value minus the mean)}$$

$$SS_{\text{w/in groups}} = \sum\sum (X_{ij} - \bar{X}_j)^2 \quad \text{(errors in prediction)}$$

which is SSTotal = SSBetween Groups + SSWithin Groups.
The more the sample means differ, the larger will be the between-samples variation.
The graph here helps us to actually see the F statistic as a ratio of variances. The example looks at the length of sentence given the defendant by the mock-juror subject, for defendants rated as physically attractive, average, or unattractive.
› Red triangles = group means
› Green dot = grand mean
› Black = individual data points (with some jitter)
› Line = a plot of the means against their difference from the grand mean (bottom axis)
› Blue box = MSw/in
› Red box = MSb/t
Example: ratings for a reality TV show involving former WWF stars, people randomly abducted from the street, and a couple of orangutans, by age group: 1) 18–25, 2) 25–45, 3) 45+.

Group 1 (18–25): 7 4 6 8 6 6 2 9   Mean = 6, SD = 2.2
Group 2 (25–45): 2 5 5 3 4 4 7 2   Mean = 4, SD = 1.7
Group 3 (45+):   2 3 2 1 2 1 3 2   Mean = 2, SD = .76

SSGroup = 8(6 − 4)² + 8(4 − 4)² + 8(2 − 4)² = 64
SSTotal = (7 − 4)² + (4 − 4)² + (6 − 4)² + … + (3 − 4)² + (2 − 4)² = 122
SSerror = SSTotal − SSGroup = 58

For SSerror we could have also added the variances (2.2² + 1.7² + .76²) and multiplied by n − 1 (7).
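A sketch of the same sums-of-squares arithmetic in NumPy, using the data above (the variable names are mine):

```python
import numpy as np

groups = [np.array([7, 4, 6, 8, 6, 6, 2, 9]),   # 18-25
          np.array([2, 5, 5, 3, 4, 4, 7, 2]),   # 25-45
          np.array([2, 3, 2, 1, 2, 1, 3, 2])]   # 45+

scores = np.concatenate(groups)
grand_mean = scores.mean()                                   # 4.0

ss_total = np.sum((scores - grand_mean) ** 2)                # 122
ss_group = sum(len(g) * (g.mean() - grand_mean) ** 2
               for g in groups)                              # 64
ss_error = sum(np.sum((g - g.mean()) ** 2) for g in groups)  # 58
print(ss_total, ss_group, ss_error)
```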
Well, now we just need df and we're set to go.
dfGroup = k − 1, where k is the number of groups (each group mean deviating from the grand mean)
dferror = N − k (loss of a degree of freedom for each group mean)
dftotal = N − 1 (loss of a degree of freedom from using the grand mean in the calculation), or just add the other two.
Construct an ANOVA table (calling this ANOVA rather than regression doesn't change how the table is constructed):

Source   df   SS    MS   F   p
Group     2   64    ?    ?   ?
Error    21   58
Total    23   122

MS refers to the Mean Squares, which are found by dividing the SS by their respective df. Since both of the SS values are summed values, they are influenced by the number of scores that were summed (for example, SStreat used the sum of only 3 different values, the group means, compared to SSerror, which used the sum of 24 different values). To eliminate this bias we can calculate the average sum of squares (known as the mean squares, MS). As we have noted before, our F ratio (or F statistic) is the ratio of the two MS values.
Source   df   SS    MS     F       p
Group     2   64    32     11.59   .0004
Error    21   58    2.76
Total    23   122
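The F and p entries can be reproduced from the SS and df using SciPy's F distribution (a minimal sketch with the table's values hard-coded):

```python
from scipy import stats

ms_group = 64 / 2     # SS_group / df_group = 32
ms_error = 58 / 21    # SS_error / df_error ≈ 2.76
F = ms_group / ms_error        # ≈ 11.59
p = stats.f.sf(F, 2, 21)       # upper-tail F probability ≈ .0004
print(F, p)
```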
There is some statistically significant difference among the group means. F is a measure of the ratio of the variation explained by the model to the variation not explained. If F < 1 then it must represent a non-significant effect.
› The reason why: if the F-ratio is less than 1, it means that MSGroup, i.e. our model variance, is just another estimate of the residual variance.
› This is why you will sometimes see just 'F < 1' reported.
So they are different in some fashion; what else do we know? Nada. All that can be said is that there is some difference among the means of some kind.
› This is the limit of the Analysis of Variance at this point.
The details require further analyses, which will be covered later.
Want equal ns if at all possible. If not, we will have to adjust our formula for SStreat, though as it was presented before (and below) it holds for both scenarios:

$$SS_{\text{treat}} = \sum n_j(\bar{X}_j - \bar{X}_{..})^2$$

Minor differences (you know what those are, right?) are not going to be a big deal. The more discrepant the ns are, the more we may have trouble generalizing the results, especially if there are violations of our assumptions.
When we violate our assumption of homogeneity of variance, other options become available.
› Especially be concerned with unequal n
Levene's is the standard test of this for ANOVA as well, though there are others.
› Levene's is considered to be conservative, and so even if the result is close to p = .05 you should probably go with a corrective measure. A quick check with Levene's test is sketched below.
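A minimal sketch with SciPy, reusing the worked-example data (note that SciPy's default `center='median'` is technically the Brown-Forsythe variant of the variance test; pass `center='mean'` for the original Levene's):

```python
import numpy as np
from scipy import stats

groups = [np.array([7, 4, 6, 8, 6, 6, 2, 9]),
          np.array([2, 5, 5, 3, 4, 4, 7, 2]),
          np.array([2, 3, 2, 1, 2, 1, 3, 2])]

W, p = stats.levene(*groups)   # default center='median'
print(W, p)
if p < 0.05:
    print("HoV looks violated; consider Welch or Brown-Forsythe")
```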
Options (see e.g. Tomarken & Serlin, 1986, for a comparison of these measures that can be used when the HoV assumption is violated):
› Kruskal-Wallis
› Welch procedure
› Brown-Forsythe
It depends on the situation, but Welch's tends to perform better generally.
› Welch's is probably more powerful in most heterogeneity of variance situations
› When you report the df for Welch's F and B-F, round up and add a squiggly (~)
[Figure: behavior of the regular F under violation of HoV]
When normality is a concern, we can use nonparametric techniques (e.g. bootstrapped estimates). The Kruskal-Wallis is a non-parametric one-way ANOVA performed on the ranked values of the dependent variable.
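A one-line sketch of the Kruskal-Wallis test in SciPy, again with the worked-example data:

```python
import numpy as np
from scipy import stats

groups = [np.array([7, 4, 6, 8, 6, 6, 2, 9]),
          np.array([2, 5, 5, 3, 4, 4, 7, 2]),
          np.array([2, 3, 2, 1, 2, 1, 3, 2])]

H, p = stats.kruskal(*groups)   # Kruskal-Wallis one-way ANOVA on ranks
print(H, p)
```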
g.Approach One-way ANOVA much as you would a t-test › Same assumptions and interpretation taken to 3 or With ANOVA one must run planned comparisons or post hoc analyses (next time) to get to the good stuff as far as interpretation Turn to robust options in the face of yucky data and/or violations of assumptions › E. using trimmed means more groups › One would report similar info: Effect sizes. . confidence intervals. graphs of means etc.
The Welch procedure:

$$w_k = \frac{n_k}{s_k^2} \qquad \bar{X}' = \frac{\sum_k w_k \bar{X}_k}{\sum_k w_k}$$

$$F'' = \frac{\sum_k w_k (\bar{X}_k - \bar{X}')^2 \big/ (k-1)}{1 + \dfrac{2(k-2)}{k^2 - 1} \sum_k \left(\dfrac{1}{n_k - 1}\right)\left(1 - \dfrac{w_k}{\sum_k w_k}\right)^2}$$

$$df' = \frac{k^2 - 1}{3 \sum_k \left(\dfrac{1}{n_k - 1}\right)\left(1 - \dfrac{w_k}{\sum_k w_k}\right)^2}$$
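A sketch translating the Welch formulas above directly into NumPy (my own translation of the slide's equations, not a library routine; SciPy is used only for the p-value):

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's F'' and its df, per the formulas above."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([g.mean() for g in groups])
    s2 = np.array([g.var(ddof=1) for g in groups])

    w = n / s2                                   # w_k = n_k / s_k^2
    xbar = np.sum(w * means) / w.sum()           # weighted grand mean, X-bar'
    a = np.sum((1 - w / w.sum()) ** 2 / (n - 1))

    num = np.sum(w * (means - xbar) ** 2) / (k - 1)
    den = 1 + (2 * (k - 2) / (k ** 2 - 1)) * a
    F, df1 = num / den, k - 1
    df2 = (k ** 2 - 1) / (3 * a)                 # df'
    return F, df1, df2, stats.f.sf(F, df1, df2)

# usage, e.g. with the worked-example arrays: welch_anova(*groups)
```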
The Brown-Forsythe procedure:

$$F^* = \frac{\sum_k n_k (\bar{X}_k - \bar{X}_{..})^2}{\sum_k (1 - n_k/N) s_k^2}$$

$$df = (k - 1),\ f \qquad \frac{1}{f} = \sum_k \frac{c_k^2}{n_k - 1} \qquad c_k = \frac{(1 - n_k/N) s_k^2}{\sum_k (1 - n_k/N) s_k^2}$$
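And a matching sketch of the Brown-Forsythe F* (again my own translation of the formulas above, not a library routine):

```python
import numpy as np
from scipy import stats

def brown_forsythe(*groups):
    """Brown-Forsythe F* and its df, per the formulas above."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    N = n.sum()
    means = np.array([g.mean() for g in groups])
    s2 = np.array([g.var(ddof=1) for g in groups])
    grand = np.concatenate(groups).mean()        # X-bar..

    denom_terms = (1 - n / N) * s2               # (1 - n_k/N) s_k^2
    F_star = np.sum(n * (means - grand) ** 2) / denom_terms.sum()

    c = denom_terms / denom_terms.sum()          # c_k
    df1 = k - 1
    df2 = 1 / np.sum(c ** 2 / (n - 1))           # 1/f = sum c_k^2 / (n_k - 1)
    return F_star, df1, df2, stats.f.sf(F_star, df1, df2)

# usage, e.g. with the worked-example arrays: brown_forsythe(*groups)
```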