# Application of ANOVA

What is ANOVA?

An ANOVA is an analysis of the variation present in an experiment. It tests for the existence of differences among several population means.

Application of ANOVA 

ANOVA is designed to detect differences among means from populations subject to different treatments.

ANOVA is a joint test: the equality of several population means is tested simultaneously, or jointly.

ANOVA tests for the equality of several population means by looking at two estimators of the population variance (hence, analysis of variance).

The Hypothesis Test of Analysis of Variance

In an analysis of variance, we have r independent random samples, each one corresponding to a population subject to a different treatment. We have:

$n = n_1 + n_2 + n_3 + \cdots + n_r$ total observations.

r sample means: $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_r$. These r sample means can be used to calculate an estimator of the population variance. If the population means are equal, we expect the variance among the sample means to be small.

r sample variances: $s_1^2, s_2^2, s_3^2, \ldots, s_r^2$. These sample variances can be used to find a pooled estimator of the population variance.

The Hypothesis Test of Analysis of Variance (continued): Assumptions
We assume independent random sampling from each of the r populations.

We assume that the r populations under study are normally distributed, with means $\mu_i$ that may or may not be equal, but with equal variances, $\sigma_i^2$.
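[Figure: three normal distributions, Population 1, Population 2, and Population 3, with means $\mu_1$, $\mu_2$, $\mu_3$ and a common standard deviation $\sigma$.]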

Testing Hypothesis
The hypothesis test of analysis of variance:

$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \cdots = \mu_r$

$H_1$: Not all $\mu_i$ (i = 1, ..., r) are equal

The test statistic of analysis of variance:

$$F_{(r-1,\; n-r)} = \frac{\text{Estimate of variance based on means from r samples}}{\text{Estimate of variance based on all sample observations}}$$

That is, the test statistic in an analysis of variance is based on the ratio of two estimators of a population variance, and is therefore based on the F distribution, with (r - 1) degrees of freedom in the numerator and (n - r) degrees of freedom in the denominator.
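The ratio above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `one_way_anova` and the sample data are hypothetical, and the between-samples and within-samples mean squares stand in for the two variance estimators described in the text.

```python
from statistics import mean

def one_way_anova(samples):
    """Return (F, df_numerator, df_denominator) for r independent samples."""
    r = len(samples)                      # number of treatments
    n = sum(len(s) for s in samples)      # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n

    # Estimator based on the spread of the r sample means.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Pooled estimator based on the spread inside each sample.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    ms_between = ss_between / (r - 1)
    ms_within = ss_within / (n - r)
    return ms_between / ms_within, r - 1, n - r

# Hypothetical data: three treatments, three observations each.
samples = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_num, df_den = one_way_anova(samples)
print(f_stat, df_num, df_den)
```

The resulting F value would then be compared with the critical value of the F distribution with (r - 1, n - r) degrees of freedom.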

Extension of ANOVA to Three Factors
| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio |
|---|---|---|---|---|
| Factor A | SSA | a - 1 | MSA = SSA / (a - 1) | F = MSA / MSE |
| Factor B | SSB | b - 1 | MSB = SSB / (b - 1) | F = MSB / MSE |
| Factor C | SSC | c - 1 | MSC = SSC / (c - 1) | F = MSC / MSE |
| Interaction (AB) | SS(AB) | (a - 1)(b - 1) | MS(AB) = SS(AB) / [(a - 1)(b - 1)] | F = MS(AB) / MSE |
| Interaction (AC) | SS(AC) | (a - 1)(c - 1) | MS(AC) = SS(AC) / [(a - 1)(c - 1)] | F = MS(AC) / MSE |
| Interaction (BC) | SS(BC) | (b - 1)(c - 1) | MS(BC) = SS(BC) / [(b - 1)(c - 1)] | F = MS(BC) / MSE |
| Interaction (ABC) | SS(ABC) | (a - 1)(b - 1)(c - 1) | MS(ABC) = SS(ABC) / [(a - 1)(b - 1)(c - 1)] | F = MS(ABC) / MSE |
| Error | SSE | abc(n - 1) | MSE = SSE / [abc(n - 1)] | |
| Total | SST | abcn - 1 | | |
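As a quick check on the degrees-of-freedom column, the following sketch computes each entry for a three-factor design and confirms that the parts add up to the total. The function name `three_factor_df` and the design sizes are illustrative assumptions; a, b, and c are the numbers of levels of the three factors and n is the number of observations per cell.

```python
def three_factor_df(a, b, c, n):
    """Degrees of freedom for each source in a three-factor ANOVA table."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (c - 1),
        "BC": (b - 1) * (c - 1),
        "ABC": (a - 1) * (b - 1) * (c - 1),
        "Error": a * b * c * (n - 1),
    }
    df["Total"] = a * b * c * n - 1
    return df

# Hypothetical design: 2 x 3 x 4 levels with 5 observations per cell.
df = three_factor_df(2, 3, 4, 5)
parts = sum(v for k, v in df.items() if k != "Total")
print(df, parts == df["Total"])
```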

Application of ANOVA in Marketing
Product testing, ad copy testing, and concept testing are some common applications, though ANOVA Analysis Surveys can be used in retail environments or simulated lab-type environments.

We can manipulate certain variables (like promotion, ad copy, or display at the point of purchase) and observe changes in other variables (like sales, or consumer preferences, behavior, or attitude). The application areas for experiments are wide.


Whenever a marketing-mix variable (independent variable) is changed, we can determine its effect. Such variables include price, a specific promotion or type of distribution, or specific elements like shelf space, color of packaging etc.

An experiment can be done with just one independent variable (factor) or with multiple independent variables.

The key to success in an ANOVA Analysis Survey is the degree of control over the various independent variables (factors) that are being manipulated during the experiment.

N-way Analysis of Variance
In marketing research, one is often concerned with the effect of more than one factor simultaneously. For example:

How do advertising levels (high, medium, and low) interact with price levels (high, medium, and low) to influence a brand's sales?

What is the effect of consumers' familiarity with a department store (high, medium, and low) and store image (positive, neutral, and negative) on preference for the store?

Consider the simple case of two factors, $X_1$ and $X_2$, having $c_1$ and $c_2$ categories, respectively. The total variation in this case is partitioned as follows:

$SS_{total}$ = SS due to $X_1$ + SS due to $X_2$ + SS due to interaction of $X_1$ and $X_2$ + $SS_{within}$

or

$$SS_y = SS_{x_1} + SS_{x_2} + SS_{x_1 x_2} + SS_{error}$$

The strength of the joint effect of two factors, called the overall effect, or multiple $\eta^2$, is measured as follows:

$$\text{multiple } \eta^2 = \frac{SS_{x_1} + SS_{x_2} + SS_{x_1 x_2}}{SS_y}$$
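The partition and multiple $\eta^2$ can be illustrated with a small sketch. It assumes a balanced design (the same number of observations in every cell), which the text does not state explicitly; the function name `two_way_partition`, the dictionary keys, and the data are hypothetical.

```python
from statistics import mean

def two_way_partition(cells):
    """cells[i][j] holds the observations for level i of X1 and level j of X2."""
    c1, c2 = len(cells), len(cells[0])
    m = len(cells[0][0])  # observations per cell, assumed equal in all cells
    all_obs = [y for row in cells for cell in row for y in cell]
    grand = mean(all_obs)

    row_means = [mean([y for cell in row for y in cell]) for row in cells]
    col_means = [mean([y for row in cells for y in row[j]]) for j in range(c2)]

    ss_y = sum((y - grand) ** 2 for y in all_obs)
    ss_x1 = c2 * m * sum((rm - grand) ** 2 for rm in row_means)
    ss_x2 = c1 * m * sum((cm - grand) ** 2 for cm in col_means)
    # Variation among the cell means, then remove the two main effects.
    ss_cells = m * sum((mean(cell) - grand) ** 2 for row in cells for cell in row)
    ss_x1x2 = ss_cells - ss_x1 - ss_x2
    ss_error = sum((y - mean(cell)) ** 2 for row in cells for cell in row for y in cell)

    eta_sq = (ss_x1 + ss_x2 + ss_x1x2) / ss_y
    return {"ss_y": ss_y, "ss_x1": ss_x1, "ss_x2": ss_x2,
            "ss_x1x2": ss_x1x2, "ss_error": ss_error, "multiple_eta_sq": eta_sq}

# Hypothetical 2 x 2 design with two observations per cell.
cells = [[[1, 3], [5, 7]],
         [[4, 6], [12, 14]]]
parts = two_way_partition(cells)
print(parts)
```

In this sketch the four components are computed separately, so their sum can be compared with `ss_y` as a check on the partition.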

The significance of the overall effect may be tested by an F test, as follows:

$$F = \frac{(SS_{x_1} + SS_{x_2} + SS_{x_1 x_2})/df_n}{SS_{error}/df_d} = \frac{SS_{x_1, x_2, x_1 x_2}/df_n}{SS_{error}/df_d} = \frac{MS_{x_1, x_2, x_1 x_2}}{MS_{error}}$$

where

$df_n$ = degrees of freedom for the numerator = $(c_1 - 1) + (c_2 - 1) + (c_1 - 1)(c_2 - 1) = c_1 c_2 - 1$

$df_d$ = degrees of freedom for the denominator = $N - c_1 c_2$

MS = mean square

If the overall effect is significant, the next step is to examine the significance of the interaction effect. Under the null hypothesis of no interaction, the appropriate F test is:

$$F = \frac{SS_{x_1 x_2}/df_n}{SS_{error}/df_d} = \frac{MS_{x_1 x_2}}{MS_{error}}$$

where

$df_n = (c_1 - 1)(c_2 - 1)$

$df_d = N - c_1 c_2$

The significance of the main effect of each factor may be tested as follows for $X_1$:

$$F = \frac{MS_{x_1}}{MS_{error}} = \frac{SS_{x_1}/df_n}{SS_{error}/df_d}$$

where

$df_n = c_1 - 1$

$df_d = N - c_1 c_2$
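The three F tests (overall effect, interaction, and main effect) differ only in the numerator sum of squares and its degrees of freedom. The sketch below computes them from sums of squares that are assumed to be already available; the function name `two_way_f_tests` and the numeric inputs are hypothetical, and the main effect of $X_2$ is included by analogy with $X_1$, using $c_2 - 1$ numerator degrees of freedom.

```python
def two_way_f_tests(ss_x1, ss_x2, ss_x1x2, ss_error, c1, c2, n_total):
    """F ratios for a two-factor ANOVA, given the sums of squares."""
    dfd = n_total - c1 * c2               # denominator df for every test
    ms_error = ss_error / dfd

    def f_ratio(ss, dfn):
        return (ss / dfn) / ms_error

    return {
        "overall": f_ratio(ss_x1 + ss_x2 + ss_x1x2, c1 * c2 - 1),
        "interaction": f_ratio(ss_x1x2, (c1 - 1) * (c2 - 1)),
        "main_x1": f_ratio(ss_x1, c1 - 1),
        "main_x2": f_ratio(ss_x2, c2 - 1),  # by analogy with X1
        "dfd": dfd,
    }

# Hypothetical sums of squares for a 2 x 2 design with N = 8 observations.
tests = two_way_f_tests(ss_x1=50, ss_x2=72, ss_x1x2=8, ss_error=8,
                        c1=2, c2=2, n_total=8)
print(tests)
```

Each ratio would be compared with the F distribution using its own numerator degrees of freedom and the shared denominator degrees of freedom, $N - c_1 c_2$.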

Issues in Interpretation
The most commonly used measure in ANOVA is omega squared, $\omega^2$. This measure indicates what proportion of the variation in the dependent variable is related to a particular independent variable or factor. The relative contribution of a factor X is calculated as follows:

$$\omega_x^2 = \frac{SS_x - (df_x \times MS_{error})}{SS_{total} + MS_{error}}$$

Normally, $\omega^2$ is interpreted only for statistically significant effects.
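The omega squared formula translates directly into code. The following is a minimal sketch with a hypothetical function name and made-up input values, intended only to show how the terms fit together.

```python
def omega_squared(ss_x, df_x, ms_error, ss_total):
    """Relative contribution of factor X: (SS_x - df_x * MS_error) / (SS_total + MS_error)."""
    return (ss_x - df_x * ms_error) / (ss_total + ms_error)

# Hypothetical values: SS_x = 50, df_x = 1, MS_error = 2, SS_total = 138.
w2 = omega_squared(ss_x=50, df_x=1, ms_error=2, ss_total=138)
print(round(w2, 3))
```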

Repeated Measures ANOVA
In the case of a single factor with repeated measures, the total variation, with nc - 1 degrees of freedom, may be split into between-people variation and within-people variation.

$$SS_{total} = SS_{between\ people} + SS_{within\ people}$$

The between-people variation, which is related to the differences between the means of people, has n - 1 degrees of freedom. The within-people variation has n(c - 1) degrees of freedom.

The within-people variation may, in turn, be divided into two different sources of variation. One source is related to the differences between treatment means, and the second consists of residual or error variation. The degrees of freedom corresponding to the treatment variation are c - 1, and those corresponding to residual variation are (c - 1)(n - 1).

Thus,

$$SS_{within\ people} = SS_x + SS_{error}$$

A test of the null hypothesis of equal means may now be constructed in the usual way:

$$F = \frac{SS_x/(c - 1)}{SS_{error}/[(n - 1)(c - 1)]} = \frac{MS_x}{MS_{error}}$$
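The repeated measures partition can be sketched as follows, assuming every one of the n people is measured once under each of the c treatments. The function name `repeated_measures_anova` and the data are hypothetical.

```python
from statistics import mean

def repeated_measures_anova(data):
    """data[i][j] is the score of person i under treatment j."""
    n, c = len(data), len(data[0])
    grand = mean([y for person in data for y in person])

    person_means = [mean(person) for person in data]
    treatment_means = [mean([person[j] for person in data]) for j in range(c)]

    ss_total = sum((y - grand) ** 2 for person in data for y in person)
    ss_between_people = c * sum((pm - grand) ** 2 for pm in person_means)
    ss_within_people = ss_total - ss_between_people
    ss_x = n * sum((tm - grand) ** 2 for tm in treatment_means)
    ss_error = ss_within_people - ss_x

    f_stat = (ss_x / (c - 1)) / (ss_error / ((n - 1) * (c - 1)))
    return {"ss_total": ss_total, "ss_between_people": ss_between_people,
            "ss_within_people": ss_within_people, "ss_x": ss_x,
            "ss_error": ss_error, "F": f_stat}

# Hypothetical scores: 3 people, each measured under 3 treatments.
data = [[1, 2, 3],
        [2, 3, 7],
        [3, 4, 2]]
rm = repeated_measures_anova(data)
print(rm)
```

The F value is referred to the F distribution with c - 1 and (n - 1)(c - 1) degrees of freedom.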

So far we have assumed that the dependent variable is measured on an interval or ratio scale. If the dependent variable is nonmetric, however, a different procedure should be used.

Nonmetric Analysis of Variance
Nonmetric analysis of variance examines the difference in the central tendencies of more than two groups when the dependent variable is measured on an ordinal scale.

One such procedure is the k-sample median test. As its name implies, this is an extension of the median test for two groups, which was considered in Chapter 15.

A more powerful test is the Kruskal-Wallis one-way analysis of variance. This is an extension of the Mann-Whitney test. This test also examines the difference in medians. All cases from the k groups are ordered in a single ranking. If the k populations are the same, the groups should be similar in terms of ranks within each group. The rank sum is calculated for each group. From these, the Kruskal-Wallis H statistic, which has approximately a chi-square distribution with k - 1 degrees of freedom, is computed.

The Kruskal-Wallis test is more powerful than the k-sample median test as it uses the rank value of each case, not merely its location relative to the median. However, if there are a large number of tied rankings in the data, the k-sample median test may be a better choice.
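The ranking procedure described above can be sketched as follows. This is an illustrative version that uses the standard form of the statistic, $H = \frac{12}{N(N+1)} \sum_j \frac{R_j^2}{n_j} - 3(N+1)$, where $R_j$ is the rank sum of group j; tied values receive average ranks, and no further tie correction is applied (a simplifying assumption). The function names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H without tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical ordinal scores for k = 3 groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(round(h, 3))
```

With many ties, an uncorrected H understates the evidence against the null hypothesis, which is one reason statistical packages typically apply a tie correction.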