
Introduction to Analysis of Variance

Throughout tests of hypothesis, we have investigated how we can study and relate different
variables. However, one fact remained across different types of tests. All of these underwent
some type of experimental design to gather data for the necessary analysis.
An experimental design is a plan and a structure to test hypotheses in which the researcher
controls or manipulates one or more variables. You can imagine this as the methodology in the
scientific method.
The variables being studied were discussed previously: independent and dependent
variables. Independent variables have two sub-classifications.
1. Treatment Variable – the one that the experimenter controls or modifies in the
experiment
2. Classification Variable – a characteristic of the experimental subjects that was present
prior to the experiment, and is not a result of the experimenter’s manipulations or
control
Imagine the following example. Say I will perform an experiment on two of my classes. I want to
experiment on two different teaching styles – asynchronous with modules, or synchronous
classes. To do this, I will select two classes – A and B. Class A will be taught via asynchronous
with modules and Class B will be taught via synchronous classes. Here, the treatment variable is
the one that I am controlling or modifying – teaching style. On the other hand, the classification
variable is based on the classes. Regardless, both of these variables are considered to be
independent variables.
Each independent variable has two or more levels or classifications. Levels or classifications
are the subcategories of the independent variable used by the researcher in the experimental
design. When there are only two levels, the analysis is exactly the two-sample test of
hypothesis. When there are three or more levels, a different procedure, Analysis of Variance,
applies.
These independent variables affect the results of our experiment. These results are what we
call dependent variables. These variables are the responses to the different levels of the
independent variables, or simply the quantities we measure in the experiment and eventually
analyze. In the example above, the dependent variable might be test scores. Normally,
there is only one dependent variable experimented on at a time. If we are going to investigate a
different dependent variable, we will need a different experiment for that.
Example
A product development engineer investigates the tensile strength of a new synthetic fiber.
From experience, she knows that

• The strength is affected by the weight percent of cotton used


• Cotton content should range from 10% to 40%
The engineer suspects that increasing the cotton content may increase the strength and she
decides to

• Test specimens at five levels of cotton weight percent: 15%, 20%, 25%, 30%, and 35%
• Test five specimens at each level of cotton content

This experimental design clearly wants to investigate the effect of a particular independent
variable – amount of cotton weight. By changing the percent cotton weight, the tensile strength
should vary – this means that tensile strength is the dependent variable as it changes as we
manipulate the independent variable.
If we look at the independent variable – percent cotton weight, we can see that there are five
levels. This is essentially performing an experiment with five different samples. Also, the
experimental design indicates that there will be five specimens at each level of cotton content.
This simply implies that in each group, there will be five data points. Simply, it means
$n_1 = n_2 = n_3 = n_4 = n_5 = 5$. If this were a simple two-sample test of hypothesis, we
would need an independent samples t-test. However, we have five different groups here.

[Figure: the five cotton weight levels (15%, 20%, 25%, 30%, 35%), with a line connecting each pair of levels]
If we really force an analysis via independent samples t-test, we can see that we have the
corresponding pairs as shown above. In total, there will be 10 pairs for the 5 groups.
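The count of 10 pairs can be checked by listing every combination of two levels out of five. The short Python sketch below is only an illustration; the variable names are my own and are not part of the experiment.

```python
from itertools import combinations

# The five cotton weight levels from the example.
levels = ["15%", "20%", "25%", "30%", "35%"]

# Every unordered pair of levels, i.e. every two-sample comparison
# we would need if we insisted on pairwise t-tests.
pairs = list(combinations(levels, 2))

for first, second in pairs:
    print(first, "vs", second)

print("Number of pairs:", len(pairs))
```

In general, k groups give k(k − 1)/2 pairs, so the number of pairwise tests grows quickly as groups are added.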
You might be wondering, why can’t I just apply the independent samples t-test 10 times? Isn’t
that easier to do? Well, easier maybe. However, it does not give us good results. Why?
Remember that in EACH test of hypothesis, we set a level of significance, α. Recall that α is
the probability of committing a Type I error, that is, rejecting a null hypothesis that is
actually true. Hence, every time we perform a test of hypothesis, there is a probability α of
making a Type I error and a probability (1 − α) of avoiding one. Let us assume α = 0.05. We
have a 5% chance of committing a Type I error in a single test, so we have a 95% chance of
avoiding one. If we perform this kind of test 10 times in a row, and treat the tests as
independent, our chance of avoiding a Type I error in all 10 pairs will lower to
$(0.95)^{10} \approx 0.60$
We will only have about a 60% chance of avoiding a Type I error in all of the 10 pairs.
Or we can say that the chance of committing at least one Type I error is as follows.
$P(\text{at least one Type I error}) = 1 - 0.60 = 0.40 \approx 40\%$
As you can see, we have about a 40% chance of committing at least one Type I error when we
perform 10 pairs of independent samples t-tests. Therefore, we should not do this. There is a
different procedure that must be used to compare three or more groups which we will discuss in
the next module.
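To see how quickly this error probability grows, the calculation can be repeated for different numbers of tests. The sketch below assumes the tests are independent, which is only an approximation here because the pairwise t-tests share groups. The function name `familywise_error_rate` is my own label, not a library function.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one Type I error across num_tests tests.

    Assumes every null hypothesis is true and the tests are independent,
    each performed at significance level alpha.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# One test, three tests (3 groups), and ten tests (5 groups) at alpha = 0.05.
for num_tests in (1, 3, 10):
    rate = familywise_error_rate(0.05, num_tests)
    print(num_tests, "tests:", round(rate, 3))
```

With 10 tests the result is about 0.40, matching the hand calculation above.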

Say we are performing the experimental design shown above. If we try to set up our
experiment, we need 5 groups with 5 replicates each. This is summarized in the table below.
How should you go about performing the experiment?

Cotton weight %    Replicate 1    Replicate 2    Replicate 3    Replicate 4    Replicate 5
15%
20%
25%
30%
35%

Honestly speaking, I am sure that if you are tasked to perform the experiment above, most of
you will suggest performing the experiments per group. That is, we start with 15% and perform
5 replicates before proceeding to 20% and the rest. This seems logical and straightforward,
correct? However, THIS IS NOT CORRECT! Why? This will increase the bias of our results. If
anything changes over the course of the runs, its effect gets mixed up with the effect of the
cotton weight percent. Remember that in data gathering methods, we employ a random system of
selecting our data. This is done to minimize the biases of our data. It also applies to this topic.
This randomization in experimental design is called Completely Randomized Design. This is
used when there is one independent variable (IV) with 3 or more treatment levels. When
applied, it determines whether the mean dependent variable (DV) measurements for the
treatment levels are all equal or not. An important requirement is to do more than one
replicate (at least 3). Finally, to minimize biases and errors, runs should be performed in
random order.
Thus, instead of performing the runs group by group, perform a random number generation to
determine the run order.

Simply, we can use an online tool to help randomize the runs. In this particular example (note
that a different randomization will generally give a different order), the first run should be number 8
corresponding to 25% cotton weight. The second run is number 18 which corresponds to 30%
cotton weight and so on.
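If you prefer not to rely on an online tool, the same randomization can be sketched in a few lines of Python. This is only one possible way to do it. The seed value is arbitrary and is there just so the order can be reproduced. The order it produces will not match the one in the example above.

```python
import random

# Five cotton weight levels (percent) and five replicates per level.
levels = [15, 20, 25, 30, 35]
replicates_per_level = 5

# List all 25 runs as (cotton weight percent, replicate number).
runs = [
    (level, replicate)
    for level in levels
    for replicate in range(1, replicates_per_level + 1)
]

# Shuffle a copy so the original listing stays intact.
# The seed 42 is an arbitrary choice for reproducibility.
rng = random.Random(42)
run_order = runs.copy()
rng.shuffle(run_order)

for sequence, (level, replicate) in enumerate(run_order, start=1):
    print(f"Test {sequence}: {level}% cotton, replicate {replicate}")
```

Every run still appears exactly once. Only the order in which the runs are performed changes.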
Okay. Now, we have completed the experiment in a Completely Randomized Design. We have
measured the dependent variable for each run and replicate. The following are the results of
our experimentation.

If we analyze the table above, we can see that there are 5 observations per cotton weight
percentage. For 15%, we can see that there are 5 results corresponding to 7, 7, 15, 11, 9 psi.
Closely looking at the results, we might wonder why there are differences despite all of them
being 15%. This is to be expected when performing experiments properly. We have no
knowledge of the possible outcomes. We simply measure them in the best way we can. Taking
the average, we can see that 15% cotton weight gives an average of 9.8 psi. Applying a similar
observation to the other cotton weight percentages, we can see that 20% averages 15.4 psi, 25%
averages 17.6 psi, 30% averages 21.6 psi, and 35% averages 10.8 psi.
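The averaging step can be checked with a short Python sketch. Only the five observations for 15% are quoted in the text, so only that average is recomputed here. The other averages are simply copied from the paragraph above, and the variable names are my own.

```python
from statistics import mean

# The five tensile strength observations (psi) quoted for 15% cotton weight.
strength_15 = [7, 7, 15, 11, 9]
mean_15 = mean(strength_15)
print("Average at 15%:", mean_15)

# Averages (psi) as reported in the text for each cotton weight percent.
reported_means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}

# Cotton weight percent with the highest reported average strength.
best_level = max(reported_means, key=reported_means.get)
print("Highest average:", best_level, "% at", reported_means[best_level], "psi")
```

The highest average alone does not tell us whether 30% is really better than 25%, which is why a formal test is still needed.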
What can you infer from these results? Well, you can say that increasing the cotton weight
percentage increases the tensile strength. However, this only works until 30% cotton weight as
the tensile strength drastically lowers at 35%. The next question is, which one is the best? Well,
you can probably say 30% since it gives us 21.6 psi on average. However, say we will consider
the economics of preparing said synthetic fiber. What if higher cotton weight results in a more
expensive product? Are we still going to select 30%? What if 25% is just as good? Recall what I
said before that statistical tools are developed to remove the subjectivity in our decision
making.
Well, the results strongly suggest that cotton weight percent DOES affect the tensile strength.
As we increase the cotton weight percentage up to 30%, the tensile strength increases, and it
drops at 35%. Hence, the independent variable (cotton weight percentage) affects the dependent
variable (tensile strength). This is what we can determine by performing this experiment.
So, what now? I said that to compare these 5 groups, it is not good to perform 10 pairs of
independent samples t-tests as it increases our Type I error probability. We can actually
compare all 5 groups simultaneously using a different statistical tool. In our next module,
we will talk about Analysis of Variance or ANOVA. This will enable us to compare three or more
groups simultaneously.
