So, in this article, we will discuss the statistical tests used for hypothesis testing, including both parametric and non-parametric tests.
Table of Contents
T-test
Z-test
F-test
ANOVA
Chi-square
Mann-Whitney U-test
Kruskal-Wallis H-test
Parametric Tests
The basic principle behind the parametric tests is that we have a fixed set of parameters that are
used to determine a probabilistic model that may be used in Machine Learning as well.
Parametric tests are those tests for which we have prior knowledge of the population distribution (i.e., normal), or, if not, for which we can easily approximate it to a normal distribution. The parameters of such a distribution are:
Mean
Standard Deviation
Whether a test counts as parametric ultimately depends on these population assumptions. There are many parametric tests available, some of which are as follows:
To find the confidence interval for the population mean with the help of a known standard deviation.
To determine the confidence interval for the population mean along with an unknown standard deviation.
To find the confidence interval for the population variance.
To find the confidence interval for the difference of two means, with an unknown value of the standard deviation.
Non-parametric Tests
In non-parametric tests, we don't make any assumptions about the parameters of the population we are studying. In fact, these tests don't depend on the population distribution. Hence, there is no fixed set of parameters available, and there is also no distribution (normal or otherwise) that we need to assume. This is also the reason that non-parametric tests are referred to as distribution-free tests.
These days, non-parametric tests are gaining popularity and influence, for a few reasons:
The main reason is that, unlike parametric tests, they do not require the data to meet strict conditions.
The second reason is that we do not need to make assumptions about the population on which we are doing the analysis.
Most of the non-parametric tests available are also very easy to apply and to understand, i.e., their complexity is very low.
T-Test
2. Essentially, it tests the significance of the difference of mean values when the sample size is small (i.e., less than 30) and the population standard deviation is not available.
3. Assumptions of this test:
A T-test can be a:
One Sample T-test: To compare a sample mean with the population mean.

t = (x̄ − μ) / (s / √n)

where x̄ is the sample mean, μ is the population mean, s is the sample standard deviation, and n is the sample size.
Conclusion:
If the value of the test statistic is greater than the table value -> Reject the null
hypothesis.
If the value of the test statistic is less than the table value -> Do not reject the null
hypothesis.
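As a sketch, the one-sample t-test above can be run with SciPy's `ttest_1samp`; the sample data and the hypothesised mean of 12.0 below are invented for illustration:

```python
from scipy import stats

# Hypothetical sample, n = 10 (< 30), population sigma unknown
sample = [12.1, 11.4, 13.2, 12.8, 11.9, 12.5, 13.0, 12.2, 11.7, 12.6]

# One-sample t-test: H0 says the population mean is 12.0
t_stat, p_value = stats.ttest_1samp(sample, popmean=12.0)

# Reject H0 when p_value is below the chosen alpha (e.g. 0.05)
reject_null = p_value < 0.05
```

Comparing the p-value with alpha is equivalent to the table-lookup rule above: a test statistic beyond the critical (table) value corresponds to a p-value below alpha.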
Z-Test
2. It is used to determine whether the means are different when the population variance is known and the sample size is large (i.e., greater than 30).
One Sample Z-test: To compare a sample mean with the population mean.

z = (x̄ − μ) / (σ / √n)

where x̄ is the sample mean, μ is the population mean, σ is the known population standard deviation, and n is the sample size.
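SciPy has no dedicated one-sample z-test function, but the statistic is easy to compute by hand from the formula above. The sample mean, population values, and sample size below are hypothetical:

```python
import math
from scipy.stats import norm

# Hypothetical setup: population sigma is known, sample is large (n > 30)
sample_mean = 52.0
pop_mean = 50.0     # mu under H0
pop_sigma = 10.0    # known population standard deviation
n = 100             # sample size

# z = (x̄ - μ) / (σ / √n)
z = (sample_mean - pop_mean) / (pop_sigma / math.sqrt(n))

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - norm.cdf(abs(z)))
```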
F-Test
2. It is a test for the null hypothesis that two normal populations have the same variance.
F = s1² / s2²
6. By changing the variances in the ratio, the F-test becomes a very flexible test.
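A minimal sketch of the variance-ratio F-test using SciPy's F distribution; the two samples below are invented, and are assumed to come from normal populations:

```python
from statistics import variance
from scipy import stats

# Two hypothetical samples assumed drawn from normal populations
s1 = [14.2, 15.1, 13.8, 14.9, 15.4, 14.0, 13.6, 15.2]
s2 = [13.9, 14.1, 14.0, 14.3, 13.8, 14.2, 14.1, 13.9]

# F = s1^2 / s2^2, the ratio of the two (unbiased) sample variances
F = variance(s1) / variance(s2)

# Two-tailed p-value from the F distribution with (n1 - 1, n2 - 1) df
df1, df2 = len(s1) - 1, len(s2) - 1
p_value = 2 * min(stats.f.cdf(F, df1, df2), stats.f.sf(F, df1, df2))
```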
ANOVA
3. It is used to test the significance of the differences in the mean values among more than two
sample groups.
4. It uses the F-test to statistically test the equality of means and the relative variance between them.
5. Assumptions of this test:
Chi-Square Test
3. It helps in assessing the goodness of fit between a set of observed values and those expected
theoretically.
4. It makes a comparison between the expected frequencies and the observed frequencies.
6. If there is no difference between the expected and observed frequencies, then the value of chi-square is equal to zero.
11. Chi-square as a parametric test is used as a test for population variance based on sample
variance.
12. If we take each one of a collection of sample variances, divide them by the known population variance and multiply these quotients by (n-1), where n means the number of items in the sample, we obtain a chi-square distribution with (n-1) degrees of freedom.
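The relationship in point 12 can be used directly as a test for the population variance. A sketch in Python, with a made-up sample and a hypothesised population variance:

```python
from statistics import variance
from scipy.stats import chi2

# Hypothetical sample; H0: the population variance equals sigma0_sq
sample = [4.8, 5.2, 5.6, 4.9, 5.1, 5.3, 4.7, 5.4, 5.0, 5.2]
sigma0_sq = 0.04  # hypothesised population variance

n = len(sample)
s_sq = variance(sample)  # unbiased sample variance

# (n - 1) * s^2 / sigma0^2 follows a chi-square distribution with n - 1 df
chi2_stat = (n - 1) * s_sq / sigma0_sq

# Two-tailed p-value
p_value = 2 * min(chi2.cdf(chi2_stat, n - 1), chi2.sf(chi2_stat, n - 1))
```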
Mann-Whitney U-Test
2. This test is used to investigate whether two independent samples were selected from populations having the same distribution.
3. It is a true non-parametric counterpart of the T-test and gives the most accurate estimates of
significance especially when sample sizes are small and the population is not normally
distributed.
4. It is based on the comparison of every observation in the first sample with every observation in the second sample. The test statistic is

U1 = R1 − n1(n1 + 1)/2

where n1 is the sample size for sample 1, and R1 is the sum of ranks in Sample 1.
When consulting the significance tables, the smaller of U1 and U2 is used. Knowing that R1+R2 = N(N+1)/2 and N=n1+n2, and doing some algebra, we find that the sum is U1 + U2 = n1·n2.
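In practice, the ranking and the U statistic are handled by `scipy.stats.mannwhitneyu`. A sketch with two hypothetical groups:

```python
from scipy.stats import mannwhitneyu

# Hypothetical scores from two independent groups
group_a = [19, 22, 16, 29, 24, 17, 21]
group_b = [20, 11, 17, 12, 15, 14, 13]

# H0: the two samples come from the same distribution
u_stat, p_value = mannwhitneyu(group_a, group_b, alternative='two-sided')

# The reported U can never exceed n1 * n2, since U1 + U2 = n1 * n2
n1, n2 = len(group_a), len(group_b)
```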
Kruskal-Wallis H-test
2. This test is used for comparing two or more independent samples of equal or different sample
sizes.
3. It extends the Mann-Whitney U-Test, which is used for comparing only two groups.
4. One-Way ANOVA is the parametric equivalent of this test, which is why it is also known as the one-way ANOVA on ranks.
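A sketch of the Kruskal-Wallis H-test with `scipy.stats.kruskal`; the three samples below are hypothetical:

```python
from scipy.stats import kruskal

# Hypothetical measurements from three independent groups
g1 = [27, 2, 4, 18, 7, 9]
g2 = [20, 8, 14, 36, 21, 22]
g3 = [34, 31, 3, 23, 30, 6]

# H0: all groups come from populations with the same distribution
h_stat, p_value = kruskal(g1, g2, g3)
```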
What is ANOVA?
Developed by Ronald Fisher, ANOVA stands for Analysis of Variance. One-Way Analysis of
Variance tells you if there are any statistical differences between the means of three or more
independent groups.
You might use Analysis of Variance (ANOVA) as a marketer, when you want to test a particular
hypothesis. You would use ANOVA to help you understand how your different groups respond,
with a null hypothesis for the test that the means of the different groups are equal. If there is a statistically significant result, it means that at least two of the group means differ.
The one-way ANOVA can help you know whether or not there are significant differences
between the means of your independent variables (such as the first example: age, sex, income).
When you understand how each independent variable’s mean is different from the others, you
can begin to understand which of them has a connection to your dependent variable (landing
page clicks), and begin to learn what is driving that behavior.
You may want to use ANOVA to help you answer questions like this:
Do age, sex, or income have an effect on whether someone clicks on a landing page?
Do location, employment status, or education have an effect on NPS score?
One-way ANOVA can help you know whether or not there are significant differences between
the groups of your independent variables (such as USA vs Canada vs Mexico when testing a
Location variable). You may want to test multiple independent variables (such as Location,
employment status or education). When you understand how the groups within the independent
variable differ (such as USA vs Canada vs Mexico, not location, employment status, or
education), you can begin to understand which of them has a connection to your dependent
variable (NPS score).
“Do all your locations have the same average NPS score?”
You should note, though, that ANOVA will only tell you whether the average NPS scores across all locations are the same or not; it does not tell you which location has a significantly higher or lower average NPS score.
This is defined by how many independent variables are included in the ANOVA test. One-way
means the analysis of variance has one independent variable. Two-way means the test has two
independent variables. An example of this may be an independent variable of brand of drink (one-way), or independent variables of brand of drink plus how many calories it has, or whether it's original or diet (two-way).
Like other types of statistical tests, ANOVA compares the means of different groups and shows
you if there are any statistical differences between the means. ANOVA is classified as an
omnibus test statistic. This means that it can’t tell you which specific groups were statistically
significantly different from each other, only that at least two of the groups were.
It’s important to remember that the main ANOVA research question is whether the sample
means are from different populations. There are two assumptions upon which ANOVA rests:
First: Whatever the technique of data collection, the observations within each sampled population are normally distributed. Second: the sampled populations have a common (equal) variance.
The one-way ANOVA tests for an overall relationship between the two variables, and the
pairwise tests test each possible pair of groups to see if one group tends to have higher values
than the other.
The Overall Stat Test of Averages acts as an Analysis of Variance (ANOVA). An ANOVA tests
the relationship between a categorical and a numeric variable by testing the differences between
two or more means. This test produces a p-value to determine whether the relationship is
significant or not.
Ensure your “banner” (column) variable has 3+ groups and your “stub” (rows) variable
has numbers (like Age) or numeric recodes (like “Very Satisfied” = 7)
Click “Overall stat test of averages”
You’ll see a basic ANOVA p-value
A one way ANOVA will allow you to distinguish that at least two groups were different from
each other. Once you begin to understand the difference between the independent variables you
will then be able to see how each behaves with your dependent variable. (See landing page
example above)
One-Way ANOVA
One-way ANOVA is generally the most used method of performing the ANOVA
test. It is also referred to as one-factor ANOVA, between-subjects ANOVA, and
an independent factor ANOVA. It is used to compare the means of two or more independent groups using the F-distribution.
To carry out the one-way ANOVA test, you should have only one independent variable with at least two levels. One-way ANOVA does not differ much from a t-test.
Example where one-way ANOVA is used
Suppose a teacher wants to know how effective his teaching has been. He can split the students of the class into different groups and assign them different projects related to the topics taught.
He can use one-way ANOVA to compare the average score of each group. He
can get a rough understanding of topics to teach again. However, he won’t be
able to identify the student who could not understand the topic.
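The teacher's comparison could be sketched with `scipy.stats.f_oneway`; the project scores below are invented for illustration:

```python
from scipy.stats import f_oneway

# Hypothetical average project scores for three groups of students
group1 = [85, 86, 88, 75, 78, 94, 98]
group2 = [91, 92, 93, 85, 87, 84, 82]
group3 = [79, 78, 88, 94, 92, 85, 83]

# H0: all group means are equal; the test uses the F-distribution
f_stat, p_value = f_oneway(group1, group2, group3)
```

A small p-value would suggest at least two group means differ, but, as noted above, not which ones.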
Two-way ANOVA
Two-way ANOVA is carried out when you have two independent variables. It is
an extension of one-way ANOVA. You can use the two-way ANOVA test when
your experiment has a quantitative outcome and there are two independent
variables.
Two-way ANOVA is performed in two ways:
1. Two-way ANOVA with replication: It is performed when there are two
groups and the members of these groups are doing more than one
thing. Our example in the beginning can be a good example of two-way
ANOVA with replication.
2. Two-way ANOVA without replication: This is used when you have only
one group but you are double-testing that group. For example, a patient
is being observed before and after medication.
Assumptions for Two-way ANOVA
The population must be close to a normal distribution.
Samples must be independent.
Population variances must be equal.
Groups must have equal sample sizes.
What is a Chi-square test?
A Chi-square test is a hypothesis testing method. Two common Chi-square tests involve
checking if observed frequencies in one or more categories match expected frequencies.
There are two commonly used Chi-square tests: the Chi-square goodness of fit test and the Chi-
square test of independence. Both tests involve variables that divide your data into categories. As
a result, people can be confused about which test to use. The table below compares the two tests.
Visit the individual pages for each type of Chi-square test to see examples along with details on
assumptions and calculations.
Theoretical distribution used in the test: Chi-square for both tests.

Degrees of freedom:

Goodness of fit test: number of categories minus 1. In our example, number of flavors of candy minus 1.
Test of independence: number of categories for first variable minus 1, multiplied by number of categories for second variable minus 1. In our example, number of movie categories minus 1, multiplied by 1 (because snack purchase is a Yes/No variable and 2-1 = 1).
For both the Chi-square goodness of fit test and the Chi-square test of independence, you
perform the same analysis steps, listed below. Visit the pages for each type of test to see these
steps in action.
1. Define your null and alternative hypotheses before collecting your data.
2. Decide on the alpha value. This involves deciding the risk you are willing to take of
drawing the wrong conclusion. For example, suppose you set α=0.05 when testing for
independence. Here, you have decided on a 5% risk of concluding the two variables are
independent when in reality they are not.
3. Check the data for errors.
4. Check the assumptions for the test. (Visit the pages for each test type for more detail on
assumptions.)
5. Perform the test and draw your conclusion.
Both Chi-square tests in the table above involve calculating a test statistic. The basic idea behind
the tests is that you compare the actual data values with what would be expected if the null
hypothesis is true. The test statistic involves finding the squared difference between actual and
expected data values, and dividing that difference by the expected data values. You do this for
each data point and add up the values.
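The steps for the statistic (square the difference between observed and expected, divide by the expected value, sum over categories) can be sketched as follows; the observed counts are hypothetical:

```python
from scipy.stats import chisquare

# Hypothetical observed counts for four candy flavors, expected uniform
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

# Manual statistic: sum of (O - E)^2 / E over all categories
manual_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Same statistic via SciPy's goodness-of-fit test
chi2_stat, p_value = chisquare(observed, f_exp=expected)
```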
Independence
When considering student sex and course choice, a χ2 test for independence could be used. To do
this test, the researcher would collect data on the two chosen variables (sex and courses picked)
and then compare the frequencies at which male and female students select among the offered
classes using the formula given above and a χ2 statistical table.
If there is no relationship between sex and course selection (that is, if they are independent), then
the actual frequencies at which male and female students select each offered course should be
expected to be approximately equal, or conversely, the proportion of male and female students in
any selected course should be approximately equal to the proportion of male and female students
in the sample. A χ2 test for independence can tell us how likely it is that random chance can
explain any observed difference between the actual frequencies in the data and these theoretical
expectations.
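A sketch of the test for independence using `scipy.stats.chi2_contingency` on a hypothetical sex-by-course contingency table:

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = student sex, columns = course choice
#          Math  Biology  History
table = [[  40,      30,      30],   # male
         [  35,      45,      20]]   # female

# H0: sex and course choice are independent
chi2_stat, p_value, dof, expected = chi2_contingency(table)
```

With 2 rows and 3 columns, the degrees of freedom are (2-1) x (3-1) = 2, matching the rule described earlier.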
Goodness-of-Fit
χ2 provides a way to test how well a sample of data matches the (known or assumed)
characteristics of the larger population that the sample is intended to represent. If the sample data
do not fit the expected properties of the population that we are interested in, then we would not
want to use this sample to draw conclusions about the larger population.
For example, consider an imaginary coin with exactly a 50/50 chance of landing heads or tails, and a real coin that you toss 100 times. If this real coin is fair, then it will also have an equal
probability of landing on either side, and the expected result of tossing the coin 100 times is that
heads will come up 50 times and tails will come up 50 times. In this case, χ2 can tell us how well
the actual results of 100 coin flips compare to the theoretical model that a fair coin will give
50/50 results. The actual toss could come up 50/50, or 60/40, or even 90/10. The farther away the actual results of the 100 tosses are from 50/50, the worse the fit of this set of tosses to the theoretical expectation of 50/50, and the more likely we might conclude that this coin is not actually a fair coin.
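The coin example can be sketched directly with `scipy.stats.chisquare`, using a hypothetical 60/40 outcome:

```python
from scipy.stats import chisquare

# Hypothetical result of 100 tosses: 60 heads, 40 tails
observed = [60, 40]
expected = [50, 50]   # fair-coin expectation

# chi2_stat = (60-50)^2/50 + (40-50)^2/50 = 4.0
chi2_stat, p_value = chisquare(observed, f_exp=expected)
```

Here the p-value falls below 0.05, so at that alpha we would doubt that the coin is fair, even though a 60/40 split on its own might not look suspicious.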